The Pennsylvania State University

The Graduate School

ESSAYS ON THE ECONOMICS OF THE MOTION PICTURE INDUSTRY

A Dissertation in Economics by Naibin Chen

© 2020 Naibin Chen

Submitted in Partial Fulfillment of the Requirements for the Degree of

Doctor of Philosophy

August 2020

The dissertation of Naibin Chen was reviewed and approved by the following:

Peter Newberry, Assistant Professor of Economics, Dissertation Co-Advisor, Co-Chair of Committee

Mark J. Roberts, Professor of Economics, Dissertation Co-Advisor, Co-Chair of Committee

Paul L. E. Grieco, Associate Professor of Economics

Christopher Parker, Assistant Professor of Information Technology & Analytics

Charles Murry, Assistant Professor of Economics, Special Member

Marc Henry, Professor of Economics, Graduate Program Director

Abstract

Chapter 1 examines how critic reviews and audience reviews affect a movie's box office performance differently. I use a Bayesian learning model in which consumers form beliefs about a movie's quality based on their prior, critic reviews, and audience reviews, and then make a discrete choice from the set of movies available in a given week. Using a dataset of 665 movies released in 2011-2015, I estimate the model and show that critic reviews and audience reviews have significant and heterogeneous effects across movies produced by major versus minor studios, movies reviewed by critics before release (regular) versus not (cold-opened), and movies with different advertising expenditures. My counterfactual analyses show that forcing movies to be screened before release harms 71% of the cold-opened movies, and that consumers benefit from the availability of audience reviews by watching more good movies and fewer bad movies, with quality measured from the perspective of consumers.

Chapter 2 examines how the availability of reviews may change the way consumers respond to advertising and, in turn, affect firms' advertising strategies. Using data from the motion picture industry, I estimate the effect of advertising on consumer choices depending on whether or not consumers have access to expert reviews. Given the demand estimates, I evaluate a studio's optimal advertising choice depending on the availability of expert reviews. My results show that advertising and expert reviews both have significant impacts on demand. For an average movie, when expert reviews are absent, advertising is about 2.25 times as effective at increasing opening week revenue as it is when expert reviews are available. Expert reviews appear to help consumers make better choices: the percentage of consumers who choose a movie they would not watch if better informed drops by 0.74% on average. Additionally, with expert reviews present, the median studio saves 76.60% of its advertising expenditure, and studios' profits increase by $2.80 million per movie on average.

Chapter 3 investigates the cultural effect of changing a product characteristic in the motion picture industry. In response to a growing Chinese market, Hollywood companies have been adding Chinese features to their movies. Such a change may improve a movie's performance in the Chinese market but hurt box office revenues in other markets. Using a dataset of 501 movies released in 2011-2014, I estimate the effect of adding Chinese features on box office revenues in the domestic market, the Chinese market, and the rest of the world. I find that adding Chinese features significantly improves revenue in China, but has a negative and insignificant effect in the domestic and international markets. If Chinese features were added to all 168 movies imported to China, movies would suffer an average net loss of $4.58 million. But as the Chinese market grows, adding Chinese features could yield an average net gain of $12.10 million if the market size doubles, with 63% of movies benefiting.

Table of Contents

List of Figures

List of Tables

Acknowledgments

Chapter 1  Audience v Critics: the Effect of Reviews on Box Office Revenue
    1.1 Introduction
    1.2 Data
        1.2.1 Data Description
        1.2.2 Summary Statistics
        1.2.3 Preliminary Results
    1.3 Model and Estimation
        1.3.1 Learning from Critic & Audience Reviews
        1.3.2 Estimation
        1.3.3 Identification
    1.4 Results
        1.4.1 Preference Parameters
        1.4.2 Learning Parameters
        1.4.3 Robustness
    1.5 Counterfactuals
        1.5.1 Effect of Cold Opening
        1.5.2 Benefit from Audience Reviews
    1.6 Conclusion

Chapter 2  Advertising under Learning from Expert Reviews
    2.1 Introduction
    2.2 Data
        2.2.1 Industry Background
        2.2.2 Descriptive Statistics
            2.2.2.1 Advertising Expenditures
            2.2.2.2 Critic and Audience Reviews
            2.2.2.3 Box Office Performance
            2.2.2.4 Movie Characteristics
        2.2.3 Evidence of Consumer Learning
    2.3 Model
        2.3.1 Timing
        2.3.2 Demand
        2.3.3 Supply
        2.3.4 Consumer Learning
            2.3.4.1 Opening Week
            2.3.4.2 Post-release Period
        2.3.5 Studio Learning
    2.4 Estimation
        2.4.1 Parameterization
            2.4.1.1 Weight on Signals
            2.4.1.2 Rescaling Reviews
        2.4.2 Demand Estimation
            2.4.2.1 Identification
            2.4.2.2 Instrumental Variables
            2.4.2.3 Estimation Strategy
        2.4.3 Advertising Policy Function
    2.5 Results
        2.5.1 Preference Parameters
        2.5.2 Learning Parameters
        2.5.3 Advertising Policy Function
        2.5.4 Robustness
    2.6 Counterfactuals
        2.6.1 Effect of Reviews on Consumer Choice
        2.6.2 Effect of Reviews on Studio Choice
            2.6.2.1 Mandatory Screening By Critics
            2.6.2.2 Removing Review Aggregators
    2.7 Conclusion

Chapter 3  Hollywood's Response to the Growing Chinese Movie Market
    3.1 Introduction
    3.2 Data
        3.2.1 Chinese Features
        3.2.2 Box Office Revenues
        3.2.3 Movie Characteristics
    3.3 Estimation
        3.3.1 Domestic Market
        3.3.2 Chinese Market
        3.3.3 International Market
        3.3.4 Identification
    3.4 Results
        3.4.1 Effect of Chinese Features
        3.4.2 Other Movie Characteristics
        3.4.3 Robustness Tests
    3.5 Counterfactuals
        3.5.1 Removing Chinese Features
        3.5.2 Adding Chinese Features
        3.5.3 Varying Market Size
    3.6 Conclusion

Appendix A  Data Collection
    A.1 Dataset for Reviews and Advertising
        A.1.1 Box Office Performance
        A.1.2 Collecting Online Reviews
        A.1.3 Movie Characteristics
        A.1.4 Advertising Expenditures
    A.2 Dataset for Chinese Features
        A.2.1 Movie Selection
        A.2.2 Movie Characteristics
        A.2.3 Chinese Features
        A.2.4 Box Office Revenues

Appendix B  Proofs
    B.1 Learning Process
    B.2 Combining Prior and Private Signal

Appendix C  Robustness Tests
    C.1 Chapter 1 Robustness Tests
    C.2 Chapter 2 Robustness Tests
        C.2.1 Linear Model and Estimates
        C.2.2 Alternative IVs for Advertising
        C.2.3 Cold-opening Redefined
        C.2.4 Homogeneous Weight
        C.2.5 Different Mapping from Advertising
        C.2.6 Infinite Support of Scores

Appendix D  Extension
    D.1 Issues with Private Scores
    D.2 Supply Estimation

Appendix E  Additional Tables and Figures
    E.1 Data Patterns
    E.2 Estimation Results

Bibliography

List of Figures

2.1  Box Office Revenue & Advertising, Reviews
2.2  Advertising & Reviews
2.3  Players, Timing, and Information Flow
2.4  Instrument for Cold-opening Choice
2.5  Weights on Signals
2.6  Marginal Effect of Advertising and Reviews on Opening Revenue
2.7  Optimal Advertising for Cold-opened Movies
2.8  Optimal Advertising for Regular Movies
2.9  Market Share or Expected Quality under Different Information Sets
2.10 Change in Advertising & Profit after Mandatory Screening
2.11 Change in Advertising & Profit after Removing Reviews

3.1  Yearly Box Office Revenue in China
3.2  Net Change in Revenue for Movies without Chinese Features

A.1  Webpage of Batman v Superman: Dawn of Justice on Metacritic
A.2  Review Sections of Batman v Superman: Dawn of Justice on Metacritic
A.3  Webpage of Batman v Superman: Dawn of Justice on Rotten Tomatoes
A.4  Critic Reviews of Batman v Superman: Dawn of Justice on Rotten Tomatoes
A.5  Example of Google Search Results Panel

D.1  Movies with Private Score Solved or Unsolved

E.1  Weekly Advertising Expenditure
E.2  Advertising & Production Budget

List of Tables

1.1  Summary Statistics of Movie Characteristics
1.2  Weekly Box Office Revenue ($mln)
1.3  Critic Scores, User Scores, and the Number of Reviews on Metacritic
1.4  Weekly Competition and Market Revenue
1.5  Parameters on Reviews from FE Model
1.6  Preference Parameters
1.7  Learning Parameters
1.8  Hypothesis Tests from Learning Parameters
1.9  Change in Revenue After Removing Cold Opening ($mln)
1.10 Change in Revenue After Removing Audience Reviews ($mln)

2.1  Advertising Expenditures
2.2  Days between the First Critic Review on Metacritic and Wide Release
2.3  Summary Statistics of Movie Characteristics
2.4  Preference Parameters
2.5  Learning Parameters
2.6  Difference in Opening Week Share (%)
2.7  Change in Advertising & Profit after Mandatory Screening
2.8  Change in Advertising & Profit after Removing Reviews

3.1  Summary of Movies with Chinese Features
3.2  Summary of Import Types
3.3  Box Office Revenues in Different Markets
3.4  Summary Statistics of Movie Characteristics
3.5  Effect of Adding Chinese Features
3.6  Effect of Interaction Terms
3.7  Effect of Adding Chinese Features in SUR
3.8  Revenue Changes in Different Markets ($mln)
3.9  Adding Chinese Features after Changing Market Size

A.1  Major Studio Parents and Their Studios
A.2  U.S. Holiday Weeks in 2011-2015
A.3  Chinese Holiday Weeks in 2011-2015

C.1  Preference Parameters for Robustness - Chapter 1
C.2  Preference Parameters for Robustness - Chapter 1 (Cont.)
C.3  Learning Parameters for Robustness - Chapter 1
C.4  Learning Parameters under Linear Model
C.5  Preference Parameters under Linear Model
C.6  Preference Parameters under Alternative IVs/Proxy
C.7  Learning Parameters under Alternative IVs/Proxy
C.8  Preference Parameters under Redefined Cold-opening
C.9  Learning Parameters under Redefined Cold-opening
C.10 Estimation Results under Homogeneous Weight
C.11 Preference Parameters under Square Root of Advertising
C.12 Learning Parameters under Square Root of Advertising
C.13 Preference Parameters under Infinite Support
C.14 Learning Parameters under Infinite Support

D.1  Cost Parameters

E.1  Number of Theaters for Movies in Each Week of Life
E.2  Weekly Box Office Revenue
E.3  Number of Movies with Positive Advertising Expenditure
E.4  Production and Distribution by Major Studios
E.5  Critic Reviews on Metacritic and Rotten Tomatoes
E.6  Audience Reviews on IMDb and Metacritic
E.7  Critic Scores on Metacritic
E.8  Number of Critic Reviews on Metacritic
E.9  User Scores on Metacritic
E.10 Number of User Reviews on Metacritic
E.11 Other Preference Parameters
E.12 Effect of Interaction Terms for International Regions
E.13 Effect of Movie Characteristics
E.14 Effect of Movie Characteristics (Cont.)

Acknowledgments

I would like to express my deepest gratitude to my committee. Without their guidance, encouragement, and support, I would not have completed this dissertation. Specifically, I am forever indebted to my advisor, Dr. Peter Newberry, for lending me his knowledge of and passion for industrial organization and the media industry even before I arrived at Penn State, for being patient and supportive throughout our weekly meetings, and for teaching me how to motivate my research and keep myself motivated. My co-advisor, Dr. Mark Roberts, has encouraged me to look at problems from different perspectives and try out different ideas, and has always been available for another meeting to discuss my work. Dr. Paul Grieco and Dr. Charles Murry have both taught me many empirical skills, not only through their lectures but also through my research assistantships with them. They have also asked hard questions that challenged me to think deeper, and generously spent time helping me improve. Dr. Christopher Parker has also provided me with insightful comments and suggestions from his field.

I would like to extend my gratitude to Dr. Karl Schurter and Dr. Daniel Grodzicki, for the valuable discussions I had with them and their constructive advice on my research; Dr. S. Nageeb Ali, for the discussions that led to a few research topics in this dissertation; and Dr. Kala Krishna, Dr. Marc Henry, Dr. Vijay Krishna, and other professors in the Department of Economics at Penn State. Special thanks to Dr. John Riew, without whose unwavering recommendation I would not have made it to Penn State. Throughout my study in the program, he constantly checked in with me to make sure I was making good research progress, developing my language skills, and preparing well for the job market. I am also grateful to Dr. Colin Knapp, for whom I worked as a teaching assistant for 4 years, and Ana Enriquez at the Penn State University Libraries, for whom I worked as a research assistant, for their support and guidance on my teaching, research, and communication skills.

I also wish to thank my professors in the School of Economics at Peking University. My undergraduate advisor, Dr. Yaguang Zhang, has always supported me as a mentor and a friend. Dr. Yi Chen, Dr. Guitian Huang, Dr. Wenxin Liu, Dr. Jingyi Ye, and Dr. Cheng Yuan also provided me with advice and help on my studies and my preparation for the Ph.D. program. I am forever grateful for their teaching and encouragement that led me to where I am.

I very much appreciate the help and company of my fellow graduate students, especially Zhiyuan Chen, Paul Ko, Rong Luo, Guoxuan Ma, Jiwoong Moon, Farhod Olimov, Wenjing Ruan, Jinwen Wang, Jia Xiang, Chung Han Yang, Xiaolu Zhou, and Feng Zhu. Being in a diverse program with many brilliant peers, I benefited a lot from the discussions, presentations, and the time we shared just for fun.

Last but not least, I thank my parents for their unconditional support. I am lucky to have a family that always encourages me to achieve as much as I can, and tries their best to provide whatever I need along the way.

CHAPTER 1

Audience v Critics: the Effect of Reviews on Box Office Revenue

1.1 Introduction

In the motion picture industry, box office revenue can be heavily affected by reviews. There are two types of reviews for a movie: expert reviews from critics and peer reviews from the audience. Before a movie is released, consumers can observe some of its characteristics, such as the cast, the director, and the production companies. They may also read critic reviews online or in a magazine. When they choose whether to see the movie, they can use these pieces of information to predict its quality. After the release, consumers who have already watched the movie can spread the word about its quality. Several websites provide user ratings of movies, i.e., audience reviews, and consumers who have not yet seen the movie can make their choices based on all the information available, including both critic reviews and audience reviews.

In the pre-release period, production companies can choose to send their movies to critics for screening. Once they send out a movie, they cannot control how critics review it. If the feedback is positive, it can be used in promotional materials, such as trailers and posters, to attract the audience. If the feedback from critics is mostly negative, however, the production companies face a difficult situation. For example, before the wide release of Batman v Superman: Dawn of Justice (2016), critics on the film review website Rotten Tomatoes posted reviews that aggregated to a "Tomatometer" score of around 40%. The low rating sparked heated debate among fans, and the producers and actors were questioned by the media concerning the quality of their movie. Since critic reviews are, apart from the observable characteristics, the only information consumers can get in the pre-release period, they may strongly affect the opening box office revenue. Thus, studios may intentionally prevent their movies from being reviewed by critics before release, a practice known in the motion picture industry as a "cold opening."

After the wide release of a movie, consumers who have watched it can go to several well-known websites, like IMDb and Rotten Tomatoes, to give their opinions on its quality. As these ratings are publicly observable, they may affect the decisions of potential consumers and thus the box office revenue. How do consumers evaluate the information in critic reviews or in a cold opening? How are they affected by the reviews from critics and from their peers? These are the questions I examine in this chapter.

Intuitively, cold openings will be chosen by production companies that judge their movies to be of low quality. Consumers with rational expectations can infer the low quality from a cold opening, so whether a movie is reviewed by critics should not affect box office revenue. Movies of good quality, on the other hand, have no incentive to be hidden from critics: production companies will send them for review and expect a higher box office revenue than movies of bad quality. However, Brown et al. (2012) examine 1,303 widely released movies and find that cold opening is associated with a 10-30% increase in domestic revenue, despite cold-opened movies receiving lower average ratings than movies that are reviewed. It is therefore not clear how consumers treat critic reviews and cold openings in the pre-release period.

Moreover, disagreement between critics and the audience over movie quality may change the way reviews affect box office revenue. Among the roughly 370 movies released in 2012-2014 for which I collect reviews from Rotten Tomatoes, 55.4% show a difference of over 10% between critic and audience scores, and 28.3% show a difference of over 20%. Movies with a low score from critics may still get good feedback from the audience: one movie released in 2016, for example, has only a 27% Tomatometer (the percentage of positive critic reviews on Rotten Tomatoes) but a 69% user rating on Rotten Tomatoes, and grossed $64 million on its opening day. Knowing more about how consumers evaluate reviews from critics and the audience may thus help studios choose better strategies for attracting consumers. It may also help policymakers improve consumer welfare by providing more information about movie quality.

To better understand the questions addressed above, I collect a unique dataset covering 665 movies released in 2011-2015 from various sources. It includes weekly box office performance, critic reviews and user reviews in chronological order, movie characteristics, and advertising expenditures. These movies are widely released and rank among the top 200 each year in terms of total box office revenue. Observing critic reviews and audience reviews in chronological order allows me to check whether a movie is cold-opened,1 as well as to calculate the average critic score and audience score that consumers observed in each week. Therefore, I can analyze the

1. I define a movie to be cold-opened if there is no critic review available 3 days prior to its release.

effect of both types of reviews on a movie's weekly box office revenue from its release onward. My preliminary analysis uses a fixed effects model to control for unobserved movie fixed effects, and shows that both critic reviews and audience reviews positively affect box office revenue, though the effect of critic reviews is not significant unless the number of critic reviews is taken into account. Thus, it is reasonable to treat critic reviews and audience reviews as having effects on box office revenue that are not driven purely by unobserved movie quality. I then use a structural model of the demand side that incorporates consumer learning about quality, following Newberry and Zhou (2016). It allows consumers to learn from movie characteristics, critic reviews, and audience reviews, where available, forming a belief about movie quality using Bayes' rule. Consumers then choose among the movies in theaters in the same week, based on their expected utility from watching each movie. The parameters of the learning process are parameterized as functions of movie characteristics: major or minor studio production, cold opening, and advertising expenditure. Thus, I can examine whether the learning process differs across types of movies. Specifically, I model the learning process as having three periods. In the pre-release period, consumers observe movie characteristics and form a prior, then update it if critic reviews (given by critics who have screened the movie, based on a signal of experience drawn from a normal distribution centered around the true quality) are available. In the opening week, consumers choose among the available movies; audience reviews (given by consumers who have already seen the movie, based on a signal of experience drawn from a normal distribution centered around the true quality) are then posted, and potential consumers update their priors with critic reviews and audience reviews. In week 2 and after, consumers make their choices, post reviews, and update beliefs in the same way as in the opening week. I estimate the preference parameters on movie characteristics and week effects together with the parameters of the learning process using nonlinear least squares. The preference parameters are mostly as expected, with significant coefficients on variables like star power, production budget, sequels, and week dummies. The learning parameters show that the prior mean is not significantly different between major and minor studio productions, and that cold-opened movies have a higher prior mean than regular movies. Consumers put more weight on audience reviews than on critic reviews in general.2 Compared with major studio productions, consumers put more weight on both types of reviews for minor studio movies, indicating that consumers may be more uncertain about the quality of minor studio productions. For movies that are cold-opened, as well as movies with higher advertising expenditures, consumers put more weight on critic reviews and less weight on audience reviews. After obtaining the estimates, I conduct two counterfactual exercises to evaluate the effect

2. Although I aggregate all critic reviews into one review by assumption, the weight on critic reviews is still lower than that on audience reviews. In the robustness test, when I take the number of critic reviews into account, the weight on critic reviews is even lower.

of forcing movies to be screened before release and the benefit from the availability of audience reviews. The first exercise shows that cold-opened movies mostly suffer from having their critic reviews disclosed before release, while most regular movies benefit from pre-release reviews; most of the cold-opened movies therefore do benefit from withholding their movies from critics. The second exercise focuses on "good" and "bad" movies from a consumer's perspective. After removing the availability of audience reviews, 76% of the good movies see a decrease in revenue, with an average drop of $22.72 million, while 88% of the bad movies gain from the removal of audience scores. Of the 119 cold-opened bad movies, 115 see an increase in revenue, with an average gain of $13.59 million. This shows the value of audience reviews to consumer welfare.

Several strands of literature cover aspects related to this chapter. First, a few papers have applied social learning models to the motion picture industry. Moretti (2011) analyzes peer effects, in which the audience learns from information provided by their peers: if a movie performs better than expected in the opening week, potential consumers raise their expectations of its quality, further increasing revenue in week 2. This effect is also stronger for audiences with larger networks. Liu (2016) focuses on the pre-release advertising strategy of the studios and compares the effects of advertising and social learning on a movie's success. Joo et al. (2009) also incorporate a learning model to analyze optimal advertising. Outside the motion picture industry, Newberry and Zhou (2016) look at user ratings on an online selling platform and analyze how consumers learn from the ratings to update the expected quality of a product. This chapter follows their approach to modeling the learning process, but adds another source of ratings in order to examine the difference between the two types of reviews. Compared to the papers on the motion picture industry mentioned above, this chapter focuses directly on comparing the effects of critic reviews and audience reviews, and does not model consumer learning from box office performance or advertising expenditure. Combining these information sources would be an interesting extension, and may provide a better understanding of the motion picture industry.

This chapter also discusses the effect of cold opening. In the literature, Brown et al. (2012), as mentioned before, show that cold-opened movies have lower quality from the perspective of the audience, yet generate higher box office revenue than movies with pre-release critic reviews. Brown et al. (2007) use the concept of limited strategic thinking to model consumers forming beliefs about movie quality and deciding whether to watch the movie. Reinstein and Snyder (2005) focus on the effect of expert reviews on opening weekend revenue, and use a difference-in-differences approach to find a smaller but detectable effect, even after accounting for endogeneity.

There are also papers in the marketing literature that discuss the effect of online word-of-mouth (WOM) on a movie's box office performance. Chakravarty et al. (2010) specifically discuss the effects of online WOM and critic reviews on the evaluation of movies in the pre-release period. Their main findings are that online WOM (especially when negative) has a

stronger persuasive effect on infrequent consumers; negative online WOM has an enduring effect on infrequent consumers even when critic reviews are positive; and infrequent consumers are more influenced by online WOM, while frequent consumers are more influenced by critic reviews. Hennig-Thurau et al. (2006) use a structural model to estimate early box office revenue and long-term outcomes. Their main finding is that studio actions (including production budget, advertising, star power, and release screens) primarily affect early box office revenue, while movie quality affects both early and long-term box office revenue. Other papers include Duan et al. (2008) and Liu (2006).

The rest of the chapter is organized as follows. Section 1.2 introduces the dataset I collect, presents descriptive statistics, and discusses the preliminary regressions and their results. Section 1.3 sets up the demand-side model with a learning process, then discusses the estimation strategy and identification. Section 1.4 provides results from the main estimation and robustness tests. In Section 1.5, I conduct two counterfactual exercises. Section 1.6 concludes the chapter.

1.2 Data

This section describes the dataset I use in this chapter, and provides the summary statistics. I use a comprehensive dataset that combines various sources on widely released movies that ranked among the top 200 highest-grossing movies in each year of 2011-2015.

1.2.1 Data Description

This dataset includes weekly box office performance, critic reviews and user reviews, and movie characteristics including advertising expenses. Most of the data are collected from websites widely used in research on the motion picture industry, including the Internet Movie Database (IMDb) and the two critic review websites, Metacritic and Rotten Tomatoes. The advertising data are collected from AdSpender, a Kantar Media database that monitors multimedia advertising expenditures. I describe the sources in four categories below.3

Box Office Performance

Data on box office performance are collected from the website Box Office Mojo, a commonly used data source in this literature. It provides various box office records, including release details and daily, weekly, and weekend box office performance. I collect the domestic box office performance of the top 200 highest-grossing movies each year; "domestic" means the United States and Canada, which are usually treated as one market. For each movie,

3. A detailed explanation of how I combine the data sources, adjust values, and clean the data can be found in Appendix A.

I collect its release date, total box office revenue, and weekly box office revenue and number of theaters showing the movie for 25 weeks after release. The sample includes both blockbusters and much smaller movies, as total revenue ranges from $1.3 million to over $900 million. My focus is on widely released movies playing at 600 or more theaters. However, a movie may open in fewer than 600 theaters (considered a limited release) and reach more theaters later; in my dataset, I redefine the release date and opening week to reflect the wide release. I also drop movies that stayed too long in limited release, as the audience will certainly have been learning from their peers before the wide release.

In the domestic market, most movies are released on Friday, and the weekly revenue on Box Office Mojo is likewise measured from Friday to Thursday. Several movies are released on other weekdays, for reasons such as a holiday release. While this may not affect the weekly revenue in later weeks, the opening week revenue is affected by the mismatch, so I adjust the opening week revenue accordingly.

The advantage of using weekly data is that I can observe the exact set of movies available to consumers in a given week. This allows me to look at consumers' choices within choice sets that vary by week, which is more realistic than treating all movies, or all movies in a year, as one choice set. I also collect the weekly total revenue in the market, adjusted whenever a movie has an opening week adjustment; it is used to calculate market shares.

Critic Reviews and Audience Reviews

Consumers can get critic reviews in various ways: from magazines, newspapers, or even a movie's advertisements. The most widely used sources are two websites, Metacritic and Rotten Tomatoes. On Metacritic, critics from multiple media outlets write reviews and give scores from 0 to 100; the site takes the average of these scores and reports it as the "Metascore" for each movie, shown at the top of the movie's webpage, right below the title. Rotten Tomatoes provides two sets of critic reviews, top critics and all critics, with the former similar to Metacritic's. Each critic's review is categorized as "Fresh" or "Rotten", and the website reports the percentage of "Fresh" reviews as the "Tomatometer" in a panel of the movie's webpage. Both the "Metascore" and the "Tomatometer" are commonly cited by the media. It is therefore reasonable to assume that consumers have access to the critic reviews on these two websites, and that this information may affect their choices.

Consumers can also get information from peers who have already seen the movie. While there are several ways to get this information, like hearing from a neighbor or reading a friend's tweets, I need a source of audience reviews that is comparable across movies. Online reviews provided by websites like IMDb, Metacritic, and Rotten Tomatoes have this comparable feature, though it is impossible for any of them to capture all the opinions of the audience. If a consumer searches for a movie on Google, the right panel of the results page provides the average user rating on IMDb, the "Metascore" on Metacritic, and the "Tomatometer" of all critics on Rotten Tomatoes.4 I focus on these reviews in this chapter. For each site, I obtain every review of a movie in chronological order, including the identity of the reviewer, the score or rating, and the time the review was posted. I then construct the average critic score and the average audience score by taking the average of the corresponding reviews within a period.
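To make this construction concrete, below is a minimal sketch of how the cumulative average score available in each week of a movie's life can be built from time-stamped reviews. The file and column names (movie_id, score, posted_date, wide_release_date) are hypothetical, and pandas is assumed; this is an illustration, not the exact pipeline used for the dataset.

```python
import pandas as pd

# Hypothetical inputs: one row per review with its posted date, plus each
# movie's wide-release date.
reviews = pd.read_csv("metacritic_user_reviews.csv", parse_dates=["posted_date"])
releases = pd.read_csv("releases.csv", parse_dates=["wide_release_date"])

df = reviews.merge(releases, on="movie_id")
# Week of the movie's life in which each review was posted (week 1 = opening week).
df["week"] = (df["posted_date"] - df["wide_release_date"]).dt.days // 7 + 1

# Cumulative count and cumulative average score through each week; consumers
# choosing in week t observe the values accumulated through week t - 1.
weekly = (df.groupby(["movie_id", "week"])["score"]
            .agg(n="count", total="sum")
            .groupby(level="movie_id").cumsum()
            .reset_index())
weekly["cum_avg_score"] = weekly["total"] / weekly["n"]
```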

Movie Characteristics

I collect movie characteristics from IMDb, IMDbPro (the paid service of IMDb), and Box Office Mojo. These characteristics include production companies, country, MPAA rating, director(s) and stars (with their STARmeter rankings on IMDbPro), genre, format, production budget, and whether the movie is a sequel. Some characteristics are provided by more than one source, and the details may differ; for example, IMDb usually lists 3 genres for a movie, while Box Office Mojo lists fewer, and they may not exactly match. The final characteristics in this dataset combine these sources.

I obtain advertising expenditures from the AdSpender database, which provides weekly advertising data on multiple media, including network TV, cable TV, magazines, national newspapers, etc. The weekly data, calculated in broadcast weeks beginning on Mondays, are available from Jan. 31, 2011. I collect the weekly advertising expenses for 6 weeks: 1) the opening week containing the wide-release date; 2) the 4 weeks before the opening week; 3) the week after the opening week.5

To capture differences in the learning process for movies produced by major studios, I label the movies produced by studios belonging to the six media conglomerates: Disney, Warner Bros., Fox, Universal, Paramount, and Sony. These movies are considered major studio productions. For the effect of production budget and advertising expense on the learning process, I use the budget information from Box Office Mojo and IMDb, and sum the advertising expense over the opening week and the 4 weeks before it.

1.2.2 Summary Statistics

After matching all the information from the resources above, I further clean the data by dropping movies without enough advertising information or budget information, as well as the movies that are released near the beginning or the end of my sample period (since their major competitors may be missing in my dataset). In my final dataset, there are 665 movies with multiple weeks on the market, and the total number of movie-week observations is 8211. The

4. Screenshots from Metacritic, Rotten Tomatoes, and Google are provided in Appendix A.1.2.
5. As shown in Figure E.1, advertising expenditures are highest in these weeks during a movie's life.

dataset covers 249 weeks from Apr. 1, 2011 to Jan. 7, 2016.

Table 1.1 shows the summary statistics for movie characteristics. In my dataset, movies produced by the six major studios make up 47.52%, with each studio producing 5-10%. 66.6% of the movies are produced in the U.S., while 29% are co-produced by the U.S. and other countries. PG-13 movies are the most common at 44.4%, followed by R-rated movies at 37.4% and PG movies at 16.8%; only 1.4% are rated G. Comedy, drama, and action are the most common genres, each accounting for over 20% of the movies; horror and animation movies are relatively rare. About 21.7% of the movies are released in 3D format, and 15.6% are released in both conventional and IMAX theaters. 13.7% of the movies are sequels. Director power varies from 2.89 to 14.80, with an average of 7.833; star power ranges from 0.768 to 10.66, with an average of 5.558. Both are the averages of the STARmeter rankings of the director(s) and actors, respectively, on IMDbPro, in natural logarithm, so a lower power index means a higher average ranking. The highest production budget is $250 million, while the lowest is only $100,000; the average is $55.13 million, with a high standard deviation. Advertising expenditure (summed over 5 weeks) ranges from zero to $50.92 million, with an average of $17.01 million.

Table 1.1: Summary Statistics of Movie Characteristics

Variable    Mean   S.D.   Min  Max    Variable   Mean   S.D.   Median  Min    Max
Disney      0.059  0.235  0    1      Action     0.221  0.415          0      1
Fox         0.093  0.291  0    1      Animation  0.078  0.269          0      1
Paramount   0.060  0.238  0    1      Comedy     0.278  0.448          0      1
Sony        0.096  0.295  0    1      Drama      0.242  0.429          0      1
Universal   0.096  0.295  0    1      Horror     0.090  0.287          0      1
Warner      0.075  0.264  0    1      Sci-fi     0.141  0.349          0      1
US prod.    0.666  0.472  0    1      Thriller   0.120  0.326          0      1
Co-prod.    0.290  0.454  0    1      Cold       0.382  0.486          0      1
R           0.374  0.484  0    1      Director   7.833  1.379  8.174   2.89   14.8
PG-13       0.444  0.497  0    1      Star       5.558  1.414  5.512   0.768  10.66
PG          0.168  0.375  0    1      Budget     55.13  55.04  35      0.1    250
3D          0.217  0.412  0    1      Adspend    17.01  9.311  16.56   0      50.92
IMAX        0.156  0.363  0    1      Life       12.40  4.893  12      1      20
Sequel      0.137  0.344  0    1
Observations: 665

Notes: Director and star power in the natural logarithm of STARmeter; budget and advertising in $ million; life in weeks.

Instead of defining cold-opened movies as those without critic reviews on the day before wide release, I call a movie cold-opened if there is no critic review 3 days before wide release; that is, for a Friday opening, the movie still has no critic review on Monday. This leaves consumers some time to obtain review information in the pre-release period. Under this definition, 38.2% of the 665 movies are cold-opened.

In the dataset, the longest life of a movie is 20 weeks, since I trim box office performance at 20 weeks. The shortest life is 1 week, belonging to The Hateful Eight (2015), which was released at the end of 2015: it stayed on the market for 18 weeks but has only 1 week in the dataset. Excluding it, the shortest life is 2 weeks. The average life is 12.4 weeks.

Next, I describe box office performance and review information in each of the 20 weeks of a movie's life. Table 1.2 reports weekly box office revenue from the opening week to week 20.6 The average, maximum, and minimum weekly revenues all drop sharply after release, but the revenue dispersion remains large: a movie can earn as much as $390.9 million and as little as $0.583 million in its opening week. From week 8 on, the average weekly revenue drops below $1 million, and after week 10 the highest weekly revenue rarely exceeds $10 million. The number of observations also shows that movies gradually drop out of the market; only 85 movies make it to their 20th week in my dataset. Another piece of box office information is the number of theaters in which a movie is shown, presented in Table E.1. In the opening week, the widest release reaches 4,404 theaters, and the average is about 2,773 theaters. The average number of theaters drops sharply from week 2 on, and after week 10 movies are shown in fewer than 300 theaters on average. Since theaters make strategic choices about which movies to show based on demand, I do not treat the number of theaters as a capacity constraint in this analysis.

Table 1.2: Weekly Box Office Revenue ($mln)

          week1   week2   week3   week4   week5   week6   week7   week8   week9   week10
Mean      33.58   16.94   9.322   5.433   3.273   2.001   1.223   0.834   0.612   0.457
Std.Dev.  41.64   20.41   11.01   6.581   4.627   3.022   2.039   1.559   1.093   0.842
Median    19.77   10.78   5.972   3.081   1.469   0.707   0.397   0.315   0.259   0.240
Min       0.583   0.195   0.018   0.003   0.003   0.003   0.002   0.0009  0.0008  0.0003
Max       390.9   261.1   118.4   55.78   57.59   25.47   18.04   18.15   11.92   11.06
Obs       653     654     648     641     633     611     586     541     499     460

          week11  week12  week13  week14  week15  week16  week17  week18  week19  week20
Mean      0.396   0.363   0.310   0.277   0.259   0.243   0.221   0.242   0.198   0.172
Std.Dev.  0.668   0.703   0.516   0.452   0.455   0.417   0.392   0.535   0.420   0.360
Median    0.203   0.189   0.164   0.154   0.127   0.119   0.102   0.079   0.057   0.054
Min       0.002   0.0003  0.0001  0.0009  0.003   0.0005  0.0008  0.001   0.002   0.001
Max       8.427   9.513   5.421   4.915   4.199   2.773   2.891   3.429   3.163   2.73
Obs       419     370     327     276     236     187     152     125     108     85

I primarily look at the critic reviews and user reviews on Metacritic. Table 1.3 presents the accumulated average score and the accumulated number of reviews over the first 3 weeks in which each type of review is available.7 In the pre-release period, the largest number of critic

6. Note that not all 665 movies have their first week included in the dataset; 12 movies released in early 2011 have been trimmed.
7. Complete tables can be found in Appendix E.1.

reviews is 45, and there are about 6 reviews on average, with an average score of 62.66. After release, the average number of critic reviews rises to about 32; most critic reviews are posted in the pre-release period and the first 2 weeks post-release. Starting in week 2, users begin to post reviews; the number of user reviews averages about 26 in week 2, with a maximum of over 700.8 The average user score is around 6.2. As discussed before, for each movie I treat the average critic score from multiple critics as the single score that consumers observe, since consumers usually do not observe the number of critic reviews. For audience reviews, however, I take the number of reviews into account. Some movies on Metacritic do not have a large number of user reviews, so the scores from Metacritic may not perfectly represent the opinion of the audience.

Table 1.3: Critic Scores, User Scores, and the Number of Reviews on Metacritic

          Critic Score            No. of Critics          User Score              No. of Users
          week1   week2   week3   week1   week2   week3   week2   week3   week4   week2   week3   week4
Mean      62.66   55.8    55.91   6.361   32.23   32.69   6.297   6.239   6.228   25.61   34.73   38.85
Std.Dev.  15.64   14.88   14.83   10.01   9.19    9.033   1.773   1.603   1.552   46.7    63.25   57
Median    63.77   55.71   55.79   3       33      34      6.57    6.44    6.43    13      17      20
Min       0       16.88   16.88   0       0       2       0       0       0       1       1       1
Max       100     95.41   95.41   45      51      52      10      10      10      728     1036    558
Obs       407     653     648     653     654     648     643     644     639     643     644     639

In each of the 249 weeks in my dataset, movies compete with the other movies shown in theaters. Table 1.4 describes the competition intensity and weekly market revenues. On average, there are about 120 movies on the market per week, of which about 33 are in my dataset. Movies that are on the market but not included in my dataset are treated as part of the outside option. Comparing the total box office revenue of all movies with that of the top 12 movies shows that the top 12 account for 90.25% of overall revenue. Since my dataset includes most of the widely released movies, I expect it to cover a large part of the market.

Table 1.4: Weekly Competition and Market Revenue

                          Mean    Std.Dev.  Median  Min    Max
No. of Movies in Dataset  32.98   4.421     33      15     45
No. of Total Movies       120.2   13.44     120     80     154
Top 12 Revenue            188.8   74.33     175.6   67.87  500.8
Overall Revenue           209.3   76.9      193.7   88.02  528.5
Observations: 249
Notes: Top 12 revenue and overall revenue in $ million.

8. For week 3, the maximum number of reviews reaches 1036, belonging to Star Wars: The Force Awakens (2015). It was released at the end of 2015, so it has only 2 weeks in my dataset.

1.2.3 Preliminary Results

To check whether reviews indeed affect box office revenue, I present some preliminary results before turning to the structural model. A major concern is the potential endogeneity of critic reviews and audience reviews, since unobserved quality may shift both the scores and the box office revenue. Although I have collected most of the movie characteristics that will be used as controls, unobserved factors may remain in the error term. I estimate the weekly revenue of movies using movie characteristics and reviews from critics and audiences; that is, I assume that critic reviews and audience reviews shift revenue directly, and I ignore the actual competition between movies within a week. Since critic reviews and audience reviews may not be available for some movies in some periods, I use two dummies, cold1 and cold2, to capture the effect of having no reviews. Specifically, for movie j in week t, the weekly revenue is:

$$Rev_{jt} = \beta X_j + \gamma D_t + \theta_{c0}\,cold1_{jt} + \theta_c\,Critic_{jt} + \theta_{a0}\,cold2_{jt} + \theta_a\,User_{jt} + a_j + u_{jt},$$

where $X_j$ includes several movie characteristics; $D_t$ labels the week in movie j's life; $a_j$ is the movie fixed effect; and $u_{jt}$ is the error term. If no critic review is available, as in the opening week of cold-opened movies, $cold1_{jt}$ equals 1 and $Critic_{jt}$ equals 0. Once critic reviews are available, $cold1_{jt}$ equals 0 and $Critic_{jt}$ takes its actual value. For audience reviews, $cold2_{jt}$ and $User_{jt}$ work the same way.
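As an illustration of how the "no review yet" dummies and the score variables interact, here is a minimal sketch under the assumption of a movie-week panel stored in a CSV with hypothetical column names (movie_id, rev, critic_score, user_score, with scores set to NaN before any review exists). The within transformation at the end demeans by movie, as the fixed effects estimator does; week dummies are omitted for brevity.

```python
import numpy as np
import pandas as pd

panel = pd.read_csv("movie_week_panel.csv")  # hypothetical file name

# cold1 = 1 and Critic = 0 while no critic review is available; once reviews
# appear, cold1 = 0 and Critic carries the cumulative average critic score.
panel["cold1"] = panel["critic_score"].isna().astype(int)
panel["Critic"] = panel["critic_score"].fillna(0.0)

# cold2 and User work the same way for audience reviews.
panel["cold2"] = panel["user_score"].isna().astype(int)
panel["User"] = panel["user_score"].fillna(0.0)

# Within (fixed-effect) transformation: demean by movie to remove a_j, then
# regress demeaned revenue on the demeaned regressors.
cols = ["rev", "cold1", "Critic", "cold2", "User"]
within = panel[cols] - panel.groupby("movie_id")[cols].transform("mean")
theta, *_ = np.linalg.lstsq(within[["cold1", "Critic", "cold2", "User"]].to_numpy(),
                            within["rev"].to_numpy(), rcond=None)
```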

$a_j$ is the movie-specific unobserved component that may shift both the revenue and the reviews. To remove it, I estimate the equation with a fixed effects model on the unbalanced panel. As the characteristics in $X_j$ do not vary across periods, the estimation delivers the week effect parameters, $\gamma$, and the review parameters, $\theta_c$ and $\theta_a$. For $Critic_{jt}$ and $User_{jt}$, I consider specifications with the score alone, the score interacted with the number of reviews, and these terms interacted with major and minor studio dummies.

The results are presented in Table 1.5. All the week dummies have significant coefficients with the expected signs; for simplicity, I only present the parameters on reviews. Columns (1) and (2) show that, after controlling for the movie fixed effect, critic reviews do not have a significant effect on weekly revenue, and the sign is not as expected. User reviews affect weekly revenue positively for both major and minor studio productions. Columns (3) and (4) introduce the interaction of user reviews with the number of user reviews. While critic reviews still have an insignificant coefficient, both user reviews and the interaction term have positive effects on revenue; the interaction term is not significant for minor studio productions. Columns (5)-(8) replace critic reviews with the interaction of critic reviews and the number of critic reviews. The effects of user reviews and the corresponding interaction term are similar to those in columns (1)-(4), and the interaction of critic reviews with the number of reviews has a positive effect on weekly revenue for major studio productions. When no critic review is available, the average effect is about 0.3; with no user reviews available, the average effect is about 0.5.

Table 1.5: Parameters on Reviews from FE Model

             (1)        (2)        (3)        (4)        (5)        (6)        (7)        (8)
cr           -0.00029              0.00005
             (0.00348)             (0.00347)
  × major               -0.00063              0.00008
                        (0.00353)             (0.00352)
  × minor               -0.00112              -0.00098
                        (0.00390)             (0.00389)
cr × nc                                                  0.00013*              0.00005
                                                         (0.00005)             (0.00006)
  × major                                                           0.00023**             0.00016
                                                                    (0.00008)             (0.00008)
  × minor                                                           -0.000002             -0.00006
                                                                    (0.00007)             (0.00008)
ur           0.121***              0.102***              0.109***              0.0980***
             (0.0192)              (0.0194)              (0.0194)              (0.0194)
  × major               0.133***              0.111***              0.0846***             0.0775**
                        (0.0198)              (0.0201)              (0.0247)              (0.0249)
  × minor               0.106***              0.0909***             0.119***              0.104***
                        (0.0206)              (0.0214)              (0.0219)              (0.0219)
ur × na                            0.00030**                        0.00028**
                                   (0.00009)                        (0.00010)
  × major                                     0.00026**                                   0.00022*
                                              (0.00010)                                   (0.00010)
  × minor                                     0.00040                                     0.00043
                                              (0.00026)                                   (0.00028)
cold1        0.314      0.293      0.285      0.268      0.306***   0.318***   0.277***   0.288***
             (0.185)    (0.185)    (0.185)    (0.185)    (0.0661)   (0.0667)   (0.0662)   (0.0667)
cold2        0.517*     0.490*     0.443      0.429      0.535*     0.563*     0.456*     0.482*
             (0.234)    (0.232)    (0.230)    (0.230)    (0.226)    (0.232)    (0.226)    (0.232)
Const.       16.56***   16.61***   16.62***   16.66***   16.46***   16.44***   16.58***   16.56***
             (0.301)    (0.299)    (0.297)    (0.296)    (0.230)    (0.236)    (0.231)    (0.238)
N            8211       8211       8211       8211       8211       8211       8211       8211
adj. R-sq    0.913      0.913      0.913      0.913      0.913      0.913      0.913      0.913

Notes: Standard errors allow for correlation within cluster (movie); ∗p < 0.05, ∗∗p < 0.01, ∗∗∗p < 0.001.

Therefore, we have some evidence that critic reviews and audience reviews significantly affect box office revenue after controlling for unobserved movie fixed effects, though the effect of critic reviews is quite small. This may reflect the fact that there are relatively few observations with which to identify the effect of critic reviews, since there are only 247 cold-opened movies. After interacting critic reviews with the number of critic reviews, the effect becomes more significant, since there is variation in the number of critic reviews between the opening week and week 2. Next, I construct a learning model to explain the effects of critic reviews and audience reviews in a structural way.

1.3 Model and Estimation

In this section, I model the demand side to account for the effects of critic reviews and audience reviews in the observational learning process. This model is based on the learning model in Newberry and Zhou (2016).

In each period t, $N_t$ new consumers arrive, choose one movie to watch from a set of movies, and then exit the market. The period index t carries two pieces of information: 1) it labels the calendar week; 2) it indicates the week in each movie's life. Among the movies available in period t, some are in their opening week, while others were released one or more weeks before. For individual i in period t, the utility of watching movie j is:

$$U_{ijt} = \beta X_j + \gamma D_t + \xi_j + \epsilon_{jt} + \varepsilon_{ijt},$$

where $X_j$ contains movie characteristics including production company, star power, MPAA rating, genre, etc.; $D_t$ is a dummy variable corresponding to the week in the movie's life, representing the effect of the length of stay on the market. The true quality of the movie, $\xi_j$, is unobserved until the individual watches the movie; since the quality of a movie is fixed once production is finished, it is safe to assume that $\xi_j$ does not vary over time. $\epsilon_{jt}$ is an aggregate demand shock for movie j in period t, which could, for example, reflect favorable media coverage not captured by other controls. $\varepsilon_{ijt}$ is an idiosyncratic taste shock, i.i.d. following a type I extreme value distribution. The outside option includes watching movies that are shown in theaters but not included in the dataset, or choosing entertainment other than watching a movie. The utility of the outside option is assumed to be:

$$U_{i0t} = \varepsilon_{i0t}.$$

Since consumers cannot observe the true quality $\xi_j$ when making their choices, they act on the expected utility of watching each movie, formed using the movie characteristics they observe, critic reviews, and audience reviews.
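Because the taste shocks $\varepsilon_{ijt}$ are i.i.d. type I extreme value, choice probabilities take the standard multinomial logit form. The sketch below computes one week's shares from the movies' mean utilities; the numeric values are made up, and this is an illustration of the model's choice structure rather than estimation code.

```python
import numpy as np

def logit_shares(delta):
    """Multinomial logit choice probabilities for one week's choice set.

    delta: mean utilities (beta*X + gamma*D + expected quality + demand shock)
    of the J movies in theaters that week; the outside option is normalized
    to mean utility 0. Follows from i.i.d. type I extreme value taste shocks.
    """
    m = max(delta.max(), 0.0)             # stabilize the exponentials
    v = np.exp(delta - m)
    return v / (np.exp(-m) + v.sum())     # exp(-m) is the outside option's term

# Illustrative mean utilities for three movies.
print(logit_shares(np.array([1.2, 0.4, -0.3])))
```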

1.3.1 Learning from Critic & Audience Reviews

The information on movie quality available to consumers varies across the periods of a movie's life. Before a movie's release, consumers can observe some movie characteristics from advertisements and reports. If the movie is sent to critics for screening before release, consumers can see critic scores before making choices. After the movie is released, consumers who have watched it can rate it online, and this information can be observed by new consumers. Moreover, for cold-opened movies that have no critic review available before release, critics can freely review them after release. I therefore model the learning process as having three periods: 1) pre-release; 2) opening week; 3) week 2 and after.

Pre-release Period

In the pre-release period, some characteristics of the movie, Zj, are revealed to the consumers. Assume the prior belief of movie j’s quality is distributed according to a normal distribution:

$$\xi_j \sim N(\mu_j, \sigma_j^2).$$

Notice that the prior mean and variance, $\mu_j$ and $\sigma_j^2$, are subscripted by j, since I allow them to depend on the observed characteristics $Z_j$. Some of these characteristics are also included in $X_j$; that is, I allow some movie characteristics to affect consumers' preferences directly, as well as to change the expected quality of the movie through the learning process. Movie j may or may not be reviewed by the critics, depending on the studio's choice. If it is cold-opened, no critic review is available in the pre-release period, and consumers form expectations of the movie's quality using their prior only. If movie j is sent to critics for screening in the pre-release period, I assume that critic i's viewing experience arrives as a random signal $x_{cij}$. While multiple critic reviews will be available, both Metacritic and Rotten Tomatoes display only the average critic score in the most prominent position, and when other media refer to the critic scores on these sites, they also report the average without mentioning the number of reviews. The audience therefore has little access to the number of critics who reviewed the movie, and may care more about the average critic score. In practice, I ignore the number of critic reviews and treat the group of critics as one critic, or as a committee that gives a final review. Assume this signal of experience, $x_{cj}$, is normally distributed around the true quality:

$$x_{cj} \sim N(\xi_j, \sigma_{cj}^2).$$

The critic receives this signal of experience (measured in utility) and gives a score based on this signal. I assume that this score is a linear transformation of the signal of experience.

$$cr_j = \alpha_c x_{cj} + y_c,$$

where $cr_j$ is the critic score before release, $\alpha_c$ is a scaling parameter, and $y_c$ is a constant.

Opening Week

When movie j is released, the audience form their beliefs based on the available information.

14 For cold-opened movies, since there is no critic score that consumers can learn from, consumers’ expected quality of movie j is:

$$E[\xi_j \mid Z_j] = \mu_j. \qquad (1.1)$$

If critic scores are available, the audience can update their beliefs according to Bayes’ Rule using the critic score crj. The posterior is:

$$\xi_j \mid x_{cj} \sim N(\mu_j', \sigma_j'^2),$$

where:

$$\mu_j' = \frac{\sigma_{cj}^2}{\sigma_j^2 + \sigma_{cj}^2}\,\mu_j + \frac{\sigma_j^2}{\sigma_j^2 + \sigma_{cj}^2}\,x_{cj}; \qquad \sigma_j'^2 = \frac{\sigma_j^2\,\sigma_{cj}^2}{\sigma_j^2 + \sigma_{cj}^2}.$$

Thus, the expected quality of movie j is:

$$E[\xi_j \mid Z_j, cr_j] = \frac{\sigma_{cj}^2}{\sigma_j^2 + \sigma_{cj}^2}\,\mu_j + \frac{\sigma_j^2}{\sigma_j^2 + \sigma_{cj}^2}\cdot\frac{cr_j - y_c}{\alpha_c}. \qquad (1.2)$$
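Numerically, this update is the standard normal-normal conjugate formula. A minimal sketch follows, with placeholder values for the scaling parameters (in practice they are estimated, not assumed):

```python
import numpy as np

def update_with_critic(mu, sigma2, cr, sigma2_c, alpha_c, y_c):
    """Posterior mean and variance of quality after one critic score (Eq. 1.2).

    mu, sigma2 : prior mean and variance of xi_j
    cr         : observed critic score; implied signal x_c = (cr - y_c) / alpha_c
    sigma2_c   : variance of the critic's experience signal around true quality
    alpha_c,
    y_c        : scaling parameters mapping signals to scores (placeholders here)
    """
    x_c = (cr - y_c) / alpha_c
    w_prior = sigma2_c / (sigma2 + sigma2_c)        # weight on the prior mean
    post_mean = w_prior * mu + (1.0 - w_prior) * x_c
    post_var = sigma2 * sigma2_c / (sigma2 + sigma2_c)
    return post_mean, post_var

# Illustrative values only.
print(update_with_critic(mu=0.0, sigma2=1.0, cr=70.0, sigma2_c=0.25,
                         alpha_c=10.0, y_c=0.0))
```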

Now we need to make assumptions about how audience reviews are generated. When a consumer posts a review after watching a movie, she is more likely to reveal her viewing experience than her belief about the movie's quality. I therefore assume that her viewing experience arrives as a random signal, $x_{aij}$, normally distributed around the true quality $\xi_j$:

$$x_{aij} \sim N(\xi_j, \sigma_{aj}^2).$$

Therefore, the distribution of audience reviews depends only on the true quality of a movie, not on the signal from critics or the signals received by other audience members, so all audience viewing experiences can be treated as i.i.d. To map this signal of experience into a score, I use the same approach as for the critic score. Assume the audience score given by individual i is a linear transformation of the signal of experience i receives:

$$ur_{ij} = \alpha_a x_{aij} + y_a,$$

where $ur_{ij}$ is an audience score in the opening week, $\alpha_a$ is a scaling parameter, and $y_a$ is a constant. It is important to distinguish the two scaling parameters for the critic score and the audience score, since in my dataset they have different scales: on Metacritic, the critic score ranges from 0 to 100, but the user score ranges from 0 to 10.

After the movie is released, critics are able to watch and review the movie. For cold-opened movies, suppose the signal of the critic viewing experience is generated following the same process as in the pre-release period; then $cr_j$ is the critic score given in the opening week, and this information is available to consumers making choices in week 2. In practice, regular movies that have critic reviews before release may also receive new reviews right after release, in which case I use the new average critic score as $cr_j$.

Week 2 and After

Starting from week 2, the audience can use both critic reviews and audience reviews to update their priors. As the distribution of audience reviews is independent of previous audience reviews, I do not need to track the order of reviews, though I observe each user review in chronological order in my dataset.

After observing a critic score $cr_j$ (or a signal of experience $x_{cj}$ from the critics) and $n$ audience scores $ur_{ij}$ (or signals of experience $x_{aij}$ from the audience) from the previous period, consumers can update their beliefs using Bayes' Rule, resulting in the posterior9

$$\xi_j \mid x_{cj}, x_{a1j}, \ldots, x_{anj} \sim N(\mu_j'', \sigma_j''^2),$$ where:

$$\mu_j'' = \frac{\sigma_{cj}^2\sigma_{aj}^2}{A}\,\mu_j + \frac{\sigma_j^2\sigma_{aj}^2}{A}\,x_{cj} + \frac{n\sigma_j^2\sigma_{cj}^2}{A}\,\bar{x}_{aj}; \qquad \sigma_j''^2 = \frac{\sigma_j^2\sigma_{cj}^2\sigma_{aj}^2}{A}; \qquad A = n\sigma_j^2\sigma_{cj}^2 + \sigma_j^2\sigma_{aj}^2 + \sigma_{cj}^2\sigma_{aj}^2.$$

$\bar{x}_{aj}$ is the average of the signals of experience $x_{aij}$ from the audience. Then, the corresponding average audience score $ur_j$ is:

$$ur_j = \frac{1}{n}\sum_{i=1}^{n} ur_{ij} = \frac{1}{n}\sum_{i=1}^{n}(\alpha_a x_{aij} + y_a) = \alpha_a \bar{x}_{aj} + y_a.$$

Thus, the expected quality of movie j is:

$$E[\xi_j|Z_j, cr_j, ur_j, n] = \frac{\sigma_{cj}^2\sigma_{aj}^2}{A}\cdot\mu_j + \frac{\sigma_j^2\sigma_{aj}^2}{A}\cdot\frac{cr_j - y_c}{\alpha_c} + \frac{n\sigma_j^2\sigma_{cj}^2}{A}\cdot\frac{ur_j - y_a}{\alpha_a}, \qquad (1.3)$$

where $A = n\sigma_j^2\sigma_{cj}^2 + \sigma_j^2\sigma_{aj}^2 + \sigma_{cj}^2\sigma_{aj}^2$ as above.

I use the posted date of each user review in my dataset to calculate the cumulative average user score and the cumulative number of user reviews, and use them as the

9I prove this in Appendix B.1.

average audience score $ur_j$ and the number of audience reviews $n$. The learning model above shows how consumers use three pieces of information (prior, critic reviews, and audience reviews) to form the expected quality of a movie. Equations (1.1), (1.2), and (1.3) show that the expected quality is a weighted sum of these three pieces of information. My goal is to see what affects the prior, and how much weight consumers put on critic reviews and audience reviews. Thus, I focus on the prior mean $\mu_j$ and the ratios of the signal variances $\sigma_{cj}^2$ and $\sigma_{aj}^2$ to the prior variance $\sigma_j^2$; these contain enough information about the weights on the two types of reviews. I denote the cumulative average critic score, the cumulative average audience score, and the cumulative number of audience reviews for movie $j$ prior to period $t$ as $cr_{jt}$, $ur_{jt}$, and $n_{jt}$, respectively. Then, we can summarize the expected quality of movie $j$ for each of the three cases that movie $j$ may fall into in period $t$:

• Opening Week, Cold-opened:

$$E[\xi_j|Z_j] = \mu_j; \qquad (1.4)$$

• Opening Week, Regular:

$$E[\xi_j|Z_j, cr_{jt}] = \frac{\tilde{\sigma}_{cj}^2}{1 + \tilde{\sigma}_{cj}^2}\cdot\mu_j + \frac{1}{1 + \tilde{\sigma}_{cj}^2}\cdot\frac{cr_{jt} - y_c}{\alpha_c}; \qquad (1.5)$$

• Week 2 and After:

$$E[\xi_j|Z_j, cr_{jt}, ur_{jt}, n_{jt}] = \frac{\tilde{\sigma}_{cj}^2\tilde{\sigma}_{aj}^2}{n_{jt}\tilde{\sigma}_{cj}^2 + \tilde{\sigma}_{aj}^2 + \tilde{\sigma}_{cj}^2\tilde{\sigma}_{aj}^2}\cdot\mu_j + \frac{\tilde{\sigma}_{aj}^2}{n_{jt}\tilde{\sigma}_{cj}^2 + \tilde{\sigma}_{aj}^2 + \tilde{\sigma}_{cj}^2\tilde{\sigma}_{aj}^2}\cdot\frac{cr_{jt} - y_c}{\alpha_c} + \frac{n_{jt}\tilde{\sigma}_{cj}^2}{n_{jt}\tilde{\sigma}_{cj}^2 + \tilde{\sigma}_{aj}^2 + \tilde{\sigma}_{cj}^2\tilde{\sigma}_{aj}^2}\cdot\frac{ur_{jt} - y_a}{\alpha_a}; \qquad (1.6)$$

where $\tilde{\sigma}_{cj}^2$ and $\tilde{\sigma}_{aj}^2$ are the ratios of the signal variances to the prior variance. It is reasonable to think that, for different types of movies, consumers have different prior means, $\mu_j$, and ratios of variances, $\tilde{\sigma}_{cj}^2$ and $\tilde{\sigma}_{aj}^2$. Thus, I allow for heterogeneity in the learning parameters across movies with different characteristics $Z_j$. Specifically, I include three characteristics in $Z_j$: 1) produced by a major or a minor studio; 2) cold-opened or regular; 3) advertising expenditure. Intuitively, these characteristics are likely to affect consumers' prior about a movie, as well as how they treat the reviews from critics and audiences. I parameterize $\mu_j$, $\tilde{\sigma}_{cj}^2$, and $\tilde{\sigma}_{aj}^2$ as functions of these characteristics:

$$\mu_j = \mu Z_j; \qquad \frac{\sigma_{cj}^2}{\sigma_j^2} = \tilde{\sigma}_{cj}^2 = \exp(\sigma_c Z_j); \qquad \frac{\sigma_{aj}^2}{\sigma_j^2} = \tilde{\sigma}_{aj}^2 = \exp(\sigma_a Z_j).$$

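The three cases in Equations (1.4)-(1.6), combined with this parameterization, can be summarized in a single function. The sketch below is illustrative only; the parameter vectors stand in for estimates, and the names simply mirror the notation in the text.

    import numpy as np

    # Expected quality under the three information cases, Equations (1.4)-(1.6),
    # with the variance ratios parameterized as exp(sigma_c . Z_j) and exp(sigma_a . Z_j).
    def expected_quality(Z, mu, sigma_c, sigma_a, alpha_c, y_c, alpha_a, y_a,
                         cr=None, ur=None, n=0):
        mu_j = mu @ Z                  # prior mean, linear in characteristics
        s2c = np.exp(sigma_c @ Z)      # critic-signal variance over prior variance
        s2a = np.exp(sigma_a @ Z)      # audience-signal variance over prior variance
        if cr is None:                 # (1.4): cold-opened movie, opening week
            return mu_j
        if n == 0:                     # (1.5): regular movie, opening week
            return (s2c * mu_j + (cr - y_c) / alpha_c) / (1 + s2c)
        A = n * s2c + s2a + s2c * s2a  # (1.6): week 2 and after
        return (s2c * s2a * mu_j + s2a * (cr - y_c) / alpha_c
                + n * s2c * (ur - y_a) / alpha_a) / A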
When consumers take the expectation of the utility of watching movie $j$:

$$E[U_{ijt}|I_t] = \beta X_j + \gamma D_t + E[\xi_j|I_t] + \eta_{jt} + \varepsilon_{ijt},$$

where $\eta_{jt}$ is a mean-zero random shock on beliefs, the information set $I_t$, depending on the movie type and the period, is one of $\{Z_j\}$, $\{Z_j, cr_{jt}\}$, or $\{Z_j, cr_{jt}, ur_{jt}, n_{jt}\}$. Then, $E[\xi_j|I_t]$ takes the form of Equations (1.4)-(1.6), based on the information set $I_t$. After forming the expected utility of watching movie $j$ in period $t$, consumers choose among all the movies in their choice set. While the choices of cold opening and advertising expenditure may be endogenous, as they are studios' strategic choices, I assume them to be exogenous and do not model the supply side at this stage. In reality, critic scores and audience scores may be affected by unobserved characteristics that also shift preferences; I do not account for this potential endogeneity issue in this model.

1.3.2 Estimation

I am interested in the parameters that affect consumers’ preferences (β and γ), as well as the parameters in the learning process (µ, σc, σa, αc, αa, yc, and ya). For simplicity, I denote the learning parameters as (µ, σ, α, y).

Recall that I assume the idiosyncratic taste shock $\varepsilon_{ijt}$ is i.i.d. type I extreme value distributed. Thus, in period $t$, the probability of consumer $i$ choosing movie $j$ among the movies available in that period is:

$$P_{ijt} = \frac{\exp(\beta X_j + \gamma D_t + E[\xi_j|I_t] + \eta_{jt})}{1 + \sum_{k} \exp(\beta X_k + \gamma D_t + E[\xi_k|I_t] + \eta_{kt})}.$$

Since there is no taste heterogeneity across consumers, the market share of movie $j$ in period $t$ is:

$$s_{jt} = \frac{\exp(\beta X_j + \gamma D_t + E[\xi_j|I_t] + \eta_{jt})}{1 + \sum_{k} \exp(\beta X_k + \gamma D_t + E[\xi_k|I_t] + \eta_{kt})}.$$

Then we have a linear estimating equation, using the difference between the log market shares of movie $j$ and the outside option as data:

$$\log(s_{jt}) - \log(s_{0t}) = \beta X_j + \gamma D_t + E[\xi_j|I_t] + \eta_{jt}. \qquad (1.7)$$

After parameterization, I can write down the expected quality E[ξj|It] as a function of the learning parameters and the available information of movie characteristics, critic and audience reviews:

E[ξj|It] = ξjt(µ, σ, α, y),

where $\xi_{jt}(\mu, \sigma, \alpha, y)$ takes the functional forms of Equations (1.4)-(1.6), depending on the information available. We can plug the functional form of $\xi_{jt}(\mu, \sigma, \alpha, y)$ into Equation (1.7) and get:

$$\log(s_{jt}) - \log(s_{0t}) = \beta X_j + \gamma D_t + \xi_{jt}(\mu, \sigma, \alpha, y) + \eta_{jt}.$$

Then I denote:

∆jt = log(sjt) − log(s0t);

$$\Delta_{jt}(\beta, \gamma, \mu, \sigma, \alpha, y) = \beta X_j + \gamma D_t + \xi_{jt}(\mu, \sigma, \alpha, y);$$

and I can estimate all the parameters using nonlinear least squares, finding the parameters that satisfy:

$$(\hat{\beta}, \hat{\gamma}, \hat{\mu}, \hat{\sigma}, \hat{\alpha}, \hat{y}) = \arg\min_{\beta,\gamma,\mu,\sigma,\alpha,y} \sum_{j,t} \left(\Delta_{jt} - \Delta_{jt}(\beta, \gamma, \mu, \sigma, \alpha, y)\right)^2.$$
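To illustrate this estimation step, the sketch below simulates a toy version of the estimating equation and recovers its parameters with scipy.optimize.least_squares. Everything here is invented for illustration: there is a single characteristic, only the cold-opened and regular opening-week cases are simulated, and the scaling parameters are held fixed so the toy model is identified without audience reviews.

    import numpy as np
    from scipy.optimize import least_squares

    rng = np.random.default_rng(0)

    # Toy estimating equation: delta_jt = beta * x_j + xi_jt + eta_jt, where xi_jt
    # is Equation (1.4) for cold-opened observations and Equation (1.5) otherwise.
    n = 600
    x = rng.normal(size=n)                  # one movie characteristic
    cold = rng.random(n) < 0.4              # 40% cold-opened
    cr = rng.uniform(30, 90, size=n)        # critic scores on the 0-100 scale
    alpha_c, y_c = 4.0, 20.0                # scaling parameters, treated as known
    beta0, mu0, log_s2c0 = 0.5, 1.0, 0.7    # true parameter values

    def xi(mu, log_s2c):
        s2c = np.exp(log_s2c)
        reg = (s2c * mu + (cr - y_c) / alpha_c) / (1 + s2c)  # Equation (1.5)
        return np.where(cold, mu, reg)                       # Equation (1.4) if cold

    delta = beta0 * x + xi(mu0, log_s2c0) + rng.normal(0, 0.1, size=n)

    def resid(theta):
        beta, mu, log_s2c = theta
        return delta - beta * x - xi(mu, log_s2c)

    fit = least_squares(resid, x0=np.zeros(3))
    print(fit.x)   # roughly recovers (0.5, 1.0, 0.7)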

1.3.3 Identification

For the preference parameters, $\beta$ on movie characteristics can be identified by the variation in market shares and the variation in characteristics across movies. $\gamma$ on the time dummies can be identified by the change in market shares across the weeks of a movie's life, since I assume $\gamma$ does not vary across movies and most movies stay in the market for multiple weeks.

The scaling parameters, $\alpha_c$ and $\alpha_a$, as well as the constant shifts, $y_c$ and $y_a$, can be identified by mapping the distribution of critic and audience scores to the distribution of the signals of viewing experience that produce the expected quality. To separately identify the prior mean, $\mu_j$, and the constant shifts, we need observations with no critic scores and no audience scores (so that the expected quality is just the prior mean). The cold-opened movies in my dataset can be used for this purpose. After parameterizing $\mu_j$, the coefficients are identified from the variation in the characteristics $Z_j$. The ratios of variances, $\tilde{\sigma}_{cj}^2$ and $\tilde{\sigma}_{aj}^2$, can be identified by the variation in market shares resulting from a change in the critic score, or from changes in the critic score, the audience score, and the number of audience reviews. Intuitively, they are the weights on the three pieces of information, and identification comes from choosing the weights that generate the expected quality that best fits the data. Again, I parameterize the ratios of variances, and the coefficients are identified by the variation in the characteristics $Z_j$. To estimate all the parameters using nonlinear least squares, the key assumption is that the mean-zero random shock on beliefs, $\eta_{jt}$, is uncorrelated with the reviews and independent of all the movie characteristics. Although I have controlled for a large set of movie characteristics, there may still be concerns about endogeneity. There may also be serial correlation in the shocks for the same movie across periods. I perform a robustness test to look at this issue in Section 1.4.3. Another issue concerns separately identifying the

prior variance and the variances of the experience signals. To separately identify the prior variance, we need to use the distribution of true quality, which comes from variation in the observations without critic and audience reviews. Although we have the opening week performance of cold-opened movies, there are only 246 of them even under my loose definition of cold opening. As the variation in the data may be limited, I do not pursue separately identifying the variances.

1.4 Results

Using the estimation strategy proposed in Section 1.3.2, I estimate all the preference parameters and the learning parameters together using nonlinear least squares. I compare two regressions: (1) excluding the six major studio dummies and advertising expenditure from $X_j$; (2) including them in $X_j$. Compared to the first regression, the second distinguishes the effect of major studios and advertising expenditure on both consumer preferences and the learning process. I first present the estimation results for the preference parameters and the learning parameters, and then show some results from robustness tests.

1.4.1 Preference Parameters

In Table 1.6, Column (1) shows the results without controlling for major studios and advertising expenditure, and Column (2) shows the results of the controlled case. Estimates of the preference parameters, $\beta$ and $\gamma$, are very similar between the two regressions. I focus on the second one, which controls for the six major studios and advertising expenditure. Movies produced solely in the U.S. do not have a significant effect on preferences, but consumers have a weaker preference for co-produced movies. R-rated movies have a significant and negative effect on preferences. Among the genres, animations and comedies are more preferred by consumers, while action movies are less preferred. Stronger star power leads to stronger preferences, with a significant and negative coefficient on the average ranking of stars in logarithm (since a lower number means a higher ranking). However, director power has an unexpected positive sign, which indicates that consumers may not have a strong preference for popular directors. Movies released with a 3D version or in IMAX theaters do not have a significant effect on preferences. As a previous movie in a series must have performed well enough to get a sequel made, consumers may have stronger preferences for sequels; indeed, sequels have a significant positive sign in both regressions. As expected, a higher production budget leads to stronger preferences from consumers. All of the week dummies have a significant negative sign. This fits the revenue pattern of a high peak in the opening week followed by a sharp drop. From the coefficients of the week dummies, we can observe the sharp decrease over the first several weeks. It can be interpreted as consumer impatience: the utility of watching a movie decreases the longer they wait.

All six major studios have coefficients with a positive sign, with Disney, Fox, Sony, and Universal having significant positive effects. This shows the effect of major studios on preferences. Advertising expenditure also has a positive sign, but the effect is not significant. Therefore, its effect on utility needs to be explained by the learning process.

Table 1.6: Preference Parameters

Variable      (1)                  (2)
US prod.      0.0964 (0.0688)      0.0683 (0.0692)
Co-prod.     -0.104 (0.0712)      -0.140* (0.0711)
R            -0.755*** (0.120)    -0.791*** (0.123)
PG13         -0.221 (0.118)       -0.260* (0.120)
PG            0.232* (0.110)       0.178 (0.111)
Director      0.0647*** (0.0121)   0.0616*** (0.0123)
Star         -0.0421** (0.0129)   -0.0406** (0.0130)
Sequel        0.281*** (0.0401)    0.314*** (0.0403)
Action       -0.0876* (0.0412)    -0.0950* (0.0412)
Animation     0.299*** (0.0740)    0.273*** (0.0742)
Comedy        0.381*** (0.0411)    0.346*** (0.0413)
Drama         0.0128 (0.0414)      0.0399 (0.0416)
Horror        0.108 (0.0643)       0.105 (0.0641)
Sci-fi       -0.0892 (0.0485)     -0.0850 (0.0487)
Thriller      0.0212 (0.0520)      0.0163 (0.0522)
ln(Budget)    0.0948*** (0.0221)   0.0733*** (0.0222)
Disney        .                    0.386*** (0.117)
Fox           .                    0.388*** (0.114)
Paramount     .                    0.199 (0.113)
Sony          .                    0.626*** (0.112)
Universal     .                    0.445*** (0.115)
Warner        .                    0.0589 (0.111)
ln(Adspend)   .                    0.0532 (0.0552)
Week          Y                    Y
Format        Y                    Y

Notes: Standard errors in parentheses; ∗p < 0.05, ∗∗p < 0.01, ∗∗∗p < 0.001.

1.4.2 Learning Parameters

I am interested in the parameters governing the prior mean and the weights on critic and audience reviews. Since the critic scores on Metacritic range from 0 to 100 while the audience scores range from 0 to 10, I am also interested in the scaling parameters $\alpha_c$ and $\alpha_a$. Table 1.7 shows the results from the two regressions. Again, I focus on interpreting the results from the second regression. Table 1.8 provides hypothesis test results for some of the learning parameters. For the estimates of the prior mean, major studio productions have a coefficient of -1.188, versus -1.108 for minor studio productions. The coefficients indicate that major studio productions are expected to have lower quality ex-ante, but the difference is not significant. While we would assume that cold opening sends a signal of bad quality, cold-opened movies have higher expected quality ex-ante, with a coefficient of about 0.611. This can help explain the result found in Brown et al. (2012), that cold opening has a positive effect on box office revenue. Higher advertising expenditure leads to significantly higher expected quality ex-ante when advertising is not controlled for in preferences, but the significance goes away once it is.

Table 1.7: Learning Parameters

Parameter       (1)                  (2)
µ - Major      -0.861*** (0.250)    -1.188*** (0.268)
µ - Minor      -1.205*** (0.235)    -1.108*** (0.236)
µ - Cold        0.608*** (0.104)     0.611*** (0.102)
µ - Adspend     0.121*** (0.0337)    0.0829 (0.0587)
σc - Major      5.866*** (0.564)     6.222*** (0.541)
σc - Minor      5.507*** (0.558)     6.024*** (0.540)
σc - Cold      -0.383* (0.160)      -0.324* (0.145)
σc - Adspend   -1.150*** (0.210)    -1.333*** (0.231)
σa - Major      2.975*** (0.123)     3.162*** (0.166)
σa - Minor      3.129*** (0.0924)    2.958*** (0.134)
σa - Cold       0.563*** (0.103)     0.558*** (0.102)
σa - Adspend    0.102** (0.0337)     0.119* (0.0495)
yc             21.29*** (4.137)     18.61*** (4.407)
ya             -4.529*** (1.247)    -3.177** (1.115)
αc              4.004 (3.028)        4.477 (3.106)
αa              3.196*** (0.278)     3.012*** (0.266)
N               8211                 8211
adj. R-sq       0.892                0.892

Notes: Standard errors in parentheses; ∗p < 0.05, ∗∗p < 0.01, ∗∗∗p < 0.001.

Table 1.8: Hypothesis Tests from Learning Parameters

Hypothesis Test                                  p-value
H0: µ × 1{Minor} - µ × 1{Major} ≤ 0              0.2651
H0: σc × 1{Major} - σa × 1{Major} ≤ 0            0.0000
H0: σc × 1{Minor} - σa × 1{Minor} ≤ 0            0.0000
H0: σa × 1{Cold} - σc × 1{Cold} ≤ 0              0.0000
H0: σa × 1{Adspend} - σc × 1{Adspend} ≤ 0        0.0000
H0: σc × 1{Major} - σc × 1{Minor} ≤ 0            0.0376
H0: σa × 1{Major} - σa × 1{Minor} ≤ 0            0.0321

Parameters of the ratios of variances, $\sigma_c$ and $\sigma_a$, indicate the weights on critic reviews and audience reviews. Since I parameterize the ratios of variances as exponential functions, a larger parameter means a larger ratio of the signal variance to the prior variance, and thus a smaller weight on the signals. Between major and minor studio movies, consumers rely more on both critic reviews and audience reviews for minor studio movies, while they put a larger weight on the prior for major studio movies. If a movie is cold-opened, consumers put less weight on audience reviews and more weight on critic reviews. Therefore, if studios do not send their movies to be screened by critics in the pre-release period, the box office performance of these movies will depend more on how the critics rate the movie after release. Advertising expenditure works similarly: the more heavily a movie is advertised, the more weight is put on

critic reviews, and the less weight is put on audience reviews. For example, for a movie produced by a major studio, if the movie is cold-opened, consumers put more weight on critic reviews than on audience reviews when the advertising expenditure is higher than $4.48 million; if the movie has been reviewed by the critics before release, consumers put more weight on critic reviews when the advertising expenditure is higher than $8.23 million. For movies produced by minor studios, the cutoffs are about $4.50 million for cold-opened movies and $8.26 million for regular movies. The scaling parameter for the audience score is about 3.012 and is significant. For the critic score, it is about 4.477 if we keep the original scale of critic scores, though it is not significant. This shows the necessity of adding scaling parameters to this model.
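These cutoffs follow directly from the Column (2) estimates in Table 1.7: the two variance ratios are equal where the exponential indices coincide, so solving $\sigma_c Z_j = \sigma_a Z_j$ for ln(Adspend) gives the dollar cutoff. A quick sketch reproducing the major-studio numbers (the function name is mine; the coefficients are copied from Table 1.7):

    import numpy as np

    # Advertising level at which the critic and audience variance ratios are equal,
    # i.e., exp(sigma_c . Z_j) = exp(sigma_a . Z_j), from Table 1.7, Column (2).
    def ad_cutoff(const_c, const_a, cold_c, cold_a, ad_c, ad_a, cold):
        num = (const_c + cold * cold_c) - (const_a + cold * cold_a)
        return np.exp(num / (ad_a - ad_c))    # cutoff in $ million

    # Major studio movies: about 4.48 (cold-opened) and 8.23 (regular).
    print(ad_cutoff(6.222, 3.162, -0.324, 0.558, -1.333, 0.119, cold=1))
    print(ad_cutoff(6.222, 3.162, -0.324, 0.558, -1.333, 0.119, cold=0))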

1.4.3 Robustness

I conduct three groups of robustness tests to address some potential issues in my model and estimation process. The results from these tests can be found in Table C.1 - Table C.3.

Firstly, I consider removing the scaling parameters (thus setting $\alpha_c = \alpha_a = 1$). That is, I leave the scaling effect to be absorbed by the weights on the different pieces of information. I expect all the preference parameters to be qualitatively similar, and the learning parameters to differ from the previous estimation results. Columns (1) and (2) show the results of this specification, without and with controlling for the major studios and advertising expenditure in $X_j$. Most of the preference parameters have the same sign as in the specification with scaling parameters. R-rated movies, star power, sequels, production budget, and some major studios have significant effects that are also quantitatively similar. The learning parameters also have the same signs as the previous results, except for the coefficient of cold opening on the ratio of the variance of the audience signals. The constant shifts $y_c$ and $y_a$ both increase, and the coefficients of major and minor studio production in the ratios of variances also increase.

Secondly, there may be worries about the mean-zero random shock on beliefs, $\eta_{jt}$. Since this shock is at the movie-week level, there may be serial correlation in the shocks for the same movie. Thus, I report standard errors that allow for intragroup correlation (vce(cluster) in Stata), treating the observations of each movie as a cluster. Columns (3) and (4) show a decrease in the significance of some parameters. After allowing for serial correlation, the coefficient of cold opening on the prior mean, the major and minor studio effects on the ratios of variances, the effect of advertising expenditure on the ratio of the variance of the critic signal, and the scaling parameter of the audience experience remain significant at the 1% level. Lastly, I utilize the full information of critic reviews in my dataset by looking at the number of critic reviews. This introduces the number of critic reviews into Equations (1.2) and (1.3).

I expect it to change the learning parameters, especially $\sigma_c$ and $\alpha_c$, which affect the weights. Columns (5) and (6) report the results of this specification. The preference parameters are mostly

close to those in the previous results, both qualitatively and quantitatively. All the learning parameters also share the same signs as in the previous specification. While the parameters for the ratio of the variance of the audience signals do not change much, the coefficients of the major and minor studio effects on the ratio of the variance of the critic signal increase by about 5.

The scaling parameter αc also increases to about 11. Overall, under this specification, I still obtain qualitatively similar results.

1.5 Counterfactuals

Based on the estimates I obtain, I can quantify the effects of cold opening, critic reviews, and audience reviews. Here I provide two counterfactuals that may be informative for studios or have policy implications. I use the estimates from the specification that controls for major studios and advertising expenditure in $X_j$ and includes the scaling parameters (Column (2) in Table 1.6 and Table 1.7).

1.5.1 Effect of Cold Opening

In this counterfactual exercise, I consider the case where studios cannot choose cold opening for a movie. From the learning parameters, we see that cold-opened movies have a higher prior mean ex-ante. Thus, if we force all studios to screen their movies before release, cold-opened movies may see a decrease in revenue. However, screening also brings in critic reviews that consumers can use to form their beliefs, which may increase revenue for some movies. I use the estimates to perform an exercise on the cold-opened movies that assumes critic scores are observed by consumers before release. That is, I use the week 2 average critic score as the critic score seen by consumers in the pre-release period.10 The dummy variable for cold-opened movies is also changed to all zeros. Then we can calculate the expected quality using the estimated parameters:

• Without audience reviews:

$$E[\hat{\xi}_j|Z_j, cr_{jt}] = \frac{\hat{\tilde{\sigma}}_{cj}^2}{1 + \hat{\tilde{\sigma}}_{cj}^2}\cdot\hat{\mu}_j + \frac{1}{1 + \hat{\tilde{\sigma}}_{cj}^2}\cdot\frac{cr_{jt} - \hat{y}_c}{\hat{\alpha}_c};$$

• With audience reviews:

$$E[\hat{\xi}_j|Z_j, cr_{jt}, ur_{jt}, n_{jt}] = \frac{\hat{\tilde{\sigma}}_{cj}^2\hat{\tilde{\sigma}}_{aj}^2}{n_{jt}\hat{\tilde{\sigma}}_{cj}^2 + \hat{\tilde{\sigma}}_{aj}^2 + \hat{\tilde{\sigma}}_{cj}^2\hat{\tilde{\sigma}}_{aj}^2}\cdot\hat{\mu}_j + \frac{\hat{\tilde{\sigma}}_{aj}^2}{n_{jt}\hat{\tilde{\sigma}}_{cj}^2 + \hat{\tilde{\sigma}}_{aj}^2 + \hat{\tilde{\sigma}}_{cj}^2\hat{\tilde{\sigma}}_{aj}^2}\cdot\frac{cr_{jt} - \hat{y}_c}{\hat{\alpha}_c} + \frac{n_{jt}\hat{\tilde{\sigma}}_{cj}^2}{n_{jt}\hat{\tilde{\sigma}}_{cj}^2 + \hat{\tilde{\sigma}}_{aj}^2 + \hat{\tilde{\sigma}}_{cj}^2\hat{\tilde{\sigma}}_{aj}^2}\cdot\frac{ur_{jt} - \hat{y}_a}{\hat{\alpha}_a}.$$

10There is one movie, Chronicle (2012), that has no critic review on Metacritic until its second week on the market. I use its week 3 average critic score for both its opening week and week 2.

Then the new market share can be calculated by:

$$\hat{s}_{jt} = \frac{\exp(\hat{\beta} X_j + \hat{\gamma} D_t + E[\hat{\xi}_j|I_t] + \hat{\eta}_{jt})}{1 + \sum_{k} \exp(\hat{\beta} X_k + \hat{\gamma} D_t + E[\hat{\xi}_k|I_t] + \hat{\eta}_{kt})}. \qquad (1.8)$$

Notice that I preserve the shock on expected quality, $\hat{\eta}_{jt}$, from the estimation. Thus, the change in weekly revenue will come entirely from the effect of removing cold opening. Then I sum up the weekly revenue change for each movie as:

$$\Delta_j = \sum_t \left(\widehat{Rev}_{jt} - Rev_{jt}\right), \qquad (1.9)$$

where $Rev_{jt}$ is the actual weekly revenue of movie $j$ in week $t$, and $\widehat{Rev}_{jt}$ is the counterfactual weekly revenue after removing cold opening. To capture the effect on most of the weeks in a movie's life, I exclude movies released before the starting week of the dataset (before Apr. 1, 2011) and movies released near its end (after Oct. 29, 2015). This leaves 625 movies, and the results are summarized in Table 1.9.
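In code, this counterfactual amounts to re-evaluating the logit shares with the new expected quality and the preserved shocks, then differencing revenues. The sketch below assumes the counterfactual utility indices and the observed weekly revenues have been arranged as movie-by-week arrays and that the weekly market size in dollars is a known constant; it also simplifies by using a common choice set across weeks.

    import numpy as np

    # Equations (1.8)-(1.9): counterfactual shares and per-movie revenue changes.
    # v[j, t] collects the counterfactual utility index for movie j in week t;
    # rev[j, t] holds the observed weekly revenue; market_size is hypothetical.
    def revenue_change(v, rev, market_size):
        ev = np.exp(v)                                        # J x T
        share = ev / (1.0 + ev.sum(axis=0, keepdims=True))    # Equation (1.8)
        rev_cf = share * market_size                          # counterfactual revenue
        return (rev_cf - rev).sum(axis=1)                     # Equation (1.9)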

Table 1.9: Change in Revenue After Removing Cold Opening ($mln)

              Mean     Std. Dev.   Median    Min      Max     Obs.
Cold         -1.266    11.83       -2.911    -34.33   48.64   238
 - Major     -2.065    13.60       -3.617    -34.33   48.64   116
 - Minor     -0.506     9.859      -2.446    -17.1    30.43   122
Regular       0.787     4.144       0.692    -26.39   21.57   387
 - Major      0.736     5.715       1.055    -26.39   21.57   178
 - Minor      0.830     2.017       0.505     -8.874  12.06   209
All           0.0052    8.049       0.261    -34.33   48.64   625

After removing cold opening, the 238 cold-opened movies see a decrease of about $1.27 million on average, and 71% of them have decreased box office revenue. The largest decrease in revenue for a cold-opened movie, Alvin and the Chipmunks: Chipwrecked (2011), is about $34.33 million. There are also cold-opened movies whose revenue increases after removing cold opening: Jurassic World (2015) would have earned another $48.64 million if it had not been cold-opened in my counterfactual. The standard deviation is very high for the cold-opened movies, and it is even higher for cold-opened movies made by major studios. For cold-opened movies produced by minor studios, the change in revenue is still negative on average, but the variation of the changes is smaller. Regular movies on average have their revenues increased by about $0.8 million, though 26% of the 387 regular movies still have decreased revenue. Despicable Me 2 (2013) benefits the most from removing cold opening, with an increase of $21.57 million. Pirates of the Caribbean:

On Stranger Tides (2011), on the contrary, sees the largest decrease for a regular movie, about $26.39 million. The standard deviation of revenue changes is smaller for regular movies, especially those produced by minor studios. Overall, the average change across all movies is only about $5,200, which suggests that the movies in the dataset are not stealing much business from those not included. This counterfactual result meets our expectation that most cold-opened movies earn more when they are cold-opened. Both the decrease in the prior and the scores from critics may contribute to this revenue decrease.

1.5.2 Benefit from Audience Reviews

This counterfactual exercise examines whether audience reviews help consumers distinguish good movies from bad ones when they make choices. Consumer surplus is usually used in welfare analysis, but in the motion picture industry, price does not affect consumer choices among movies, as there is no price variation across movies within a theater. Thus, I take a simple approach and look at total revenue changes for good and bad movies. It is also hard to define the "true quality" of a movie, since there is no objective standard. Since I am looking at consumer benefits, I use audience scores as the standard to categorize "good" and "bad" movies. For simplicity, I check the 25th percentile (5.48) and the 75th percentile (7.23) of the week 20 average user score on Metacritic across all movies.11 I then define a movie to be "good" if its average user score at week 20 is higher than or equal to 7.23, and "bad" if its week 20 average user score is lower than or equal to 5.48. Therefore, I only look at the highest 25% and the lowest 25% of movies in terms of average user scores; this discards the middle 50% that are more likely to attract divided opinions. Then I use the estimates to compute the revenue without any information from audience reviews. Consumers thus use their prior as the expected quality if the movie is cold-opened; otherwise, they update their beliefs using the critic score: this applies to regular movies in their opening week and to all movies in week 2 and after. The expected quality is:

• Cold-opened movies in the opening week:

$$E[\hat{\xi}_j|Z_j] = \hat{\mu}_j;$$

• Regular movies in the opening week & all movies in week 2 and after:

$$E[\hat{\xi}_j|Z_j, cr_{jt}] = \frac{\hat{\tilde{\sigma}}_{cj}^2}{1 + \hat{\tilde{\sigma}}_{cj}^2}\cdot\hat{\mu}_j + \frac{1}{1 + \hat{\tilde{\sigma}}_{cj}^2}\cdot\frac{cr_{jt} - \hat{y}_c}{\hat{\alpha}_c}.$$

Then I follow the same procedure as in the first counterfactual exercise, and use Equations (1.8) and (1.9) to calculate the new market shares and the change in box office revenue. I present the results after dropping the movies released before Apr. 1, 2011, and those released after Oct. 29, 2015. The results are in Table 1.10.

11I pick week 20 since the average user score should be stable after a movie has been on the market for 20 weeks.
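The good/bad categorization above is a simple percentile rule. A minimal sketch, assuming scores_w20 is an array of week 20 average user scores (the name is illustrative):

    import numpy as np

    def classify_movies(scores_w20):
        lo, hi = np.percentile(scores_w20, [25, 75])  # 5.48 and 7.23 in my data
        good = scores_w20 >= hi                       # top quartile of user scores
        bad = scores_w20 <= lo                        # bottom quartile
        return good, bad                              # middle 50% is left out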

Table 1.10: Change in Revenue After Removing Audience Reviews ($mln)

                Mean     Std. Dev.   Median    Min      Max     Obs.
Good (≥ 7.23)  -22.72    40.60       -8.793    -297     26.15   129
 - Cold          3.876   12.42        4.549    -18.52   26.15    17
 - Regular     -26.76    41.88      -11.20     -297     16.16   112
Bad (≤ 5.48)     9.726   14.12        7.247    -43.73   63.26   187
 - Cold         13.59    13.60        8.585    -34.40   63.26   119
 - Regular       2.967   12.45        3.062    -43.73   27.36    68
All             -3.259   29.78        1.996    -297     77.76   624

The results show that good movies are mostly harmed by removing audience reviews, with a loss of $22.72 million on average; 76% of them see a decrease in revenue. The movie affected the most is The Avengers (2012), which loses $297 million, nearly half of its total gross in the domestic market. 2 (2011) instead benefits from removing audience reviews as a good movie, with an increase of $26.15 million. However, if we look at good movies that are cold-opened, their revenues on average increase after removing audience reviews. This is a little surprising, but since the sample is small (only 17 movies), it may not be a significant result. For bad movies in the lowest 25%, removing audience reviews unsurprisingly increases their revenue, by about $9.73 million on average. 88% of the bad movies have their revenue increased after removing audience reviews, and The Twilight Saga: Breaking Dawn Part 2 (2012) benefits the most, with a $63.26 million increase, about one-fifth of its domestic gross. (At week 20, its average user score on Metacritic is 5.05.) Among the bad movies, cold-opened ones see an even larger increase in revenue, with an average of $13.59 million; 115 of the 119 cold-opened bad movies have a positive change. Notice that in the main estimation, the results show that consumers increase the weight on critic reviews and decrease the weight on audience reviews for cold-opened movies. This counterfactual exercise tells us that negative responses from the audience still heavily hurt cold-opened bad movies.12 Thus, consumers benefit from the availability of audience reviews, since over 70% of the good movies have higher revenue and nearly 90% of the bad movies have lower revenue when consumers see the audience reviews. Across all 624 movies, revenue falls by about $3.26 million on average, which shows that the outside option (all other movies on the market that are not in my dataset) steals market share from the movies in the dataset when audience reviews are removed, indicating that expected quality for the movies in the dataset drops overall.

12The bad movie that still has the largest loss after removing audience reviews is Minions (2015), with a week 20 average user score of 5.31 on Metacritic, and a $43.73 million loss.

1.6 Conclusion

I inspect the effects of critic reviews and audience reviews on a movie's box office revenue using a demand-side model that incorporates a learning process. Specifically, I focus on how the effects differ between major and minor studio productions, cold-opened and regular movies, and movies with different advertising expenditures. The results show that movies produced by minor studios are more heavily affected by both critic reviews and audience reviews than major studio productions. For cold-opened movies and movies with higher advertising expenditures, consumers put more weight on critic reviews and less weight on audience reviews. If we force cold-opened movies to be screened before release, 71% of them will suffer a loss in box office revenue. Audience reviews are important for consumer welfare, as removing them harms most of the good movies and benefits most of the bad movies. Without the availability of audience reviews, the top 25% of movies (by average user score on Metacritic) lose $22.72 million in box office revenue on average, while movies in the lowest 25% by user score see an average increase of $9.73 million in their box office revenues. As an extension of this chapter, we can incorporate consumer learning from other signals, such as advertising expenditures and box office revenues in previous periods. If we focus on the effect of advertising as a quality signal, it is important to analyze a studio's optimal advertising choice, given that consumers can also learn about movie quality from reviews. The next chapter discusses the effect of observing reviews on the advertising choices of studios.

CHAPTER 2

Advertising under Learning from Expert Reviews

2.1 Introduction

Firms use advertising to inform consumers about the quality of their products, as discussed in Bagwell (2007). Recently, with the growth of the Internet, online platforms like Yelp and TripAdvisor have provided an additional channel through which consumers can learn about new products. Such websites collect and aggregate product reviews from experts and other consumers, granting consumers access to enhanced information about product quality. The availability of these reviews may change the impact that advertising has on consumers, which, in turn, might also affect the advertising strategies of firms. In this chapter, I study the effectiveness of advertising, as well as how firms respond by adjusting the intensity of their advertising, in the presence of product reviews. I focus on the motion picture industry in particular, as this industry offers several advantages for studying how advertising and product reviews interact. First, advertising is a substantial expenditure for movie studios. The motion picture industry ranks among the top U.S. industries with respect to advertising-to-sales ratio.1 Second, websites like Metacritic and Rotten Tomatoes, which aggregate and display reviews by experts (i.e., critics), have grown in popularity, as such websites give consumers more information about the quality of a movie before they choose to

1A 2010 report on AdAge shows that the motion picture and videotape industry ranks 3rd (19.4%), only behind transportation services (22.2%) and perfume and cosmetics (20.1%); data from Schonfeld & Associates: https://adage.com/article/datacenter-advertising-spending/advertising-sales-ratios-2010/144639.

see it.2 Third, the fact that some movies are reviewed by critics before release while others are not (the latter being referred to as "cold-opened" movies) allows me to identify the relative impact of advertising with and without critic reviews. Specifically, as Metacritic and Rotten Tomatoes can only display critic scores once the critics have seen the movies, any movies that are cold-opened will have no critic score prior to opening, and thus consumers will have to rely more exclusively on advertising to make their early consumption decisions. In this chapter, I ask the following questions: Does the presence of critic reviews change the effectiveness of a studio's advertising? If so, do studios advertise more or less when critic reviews are available? And does the additional information provided by critic reviews help consumers make their choices, or does the availability of reviews simply replace some of the advertising in which studios might otherwise invest? To answer these questions, I specify a model of consumers making a binary choice (watching a movie or not) based on their preference for movie characteristics and their expectations of movie quality. Consumers form expectations for a movie through a Bayesian Learning process, after observing any or all of three pieces of information: the studio's advertising choice, critic reviews, and consumer (i.e., audience) reviews. The availability of these pieces of information depends on whether the movie has just been released and whether the movie has been cold-opened. Specifically, critic reviews are only available after critics have seen the movie (whether in a pre-release screening or, as in cold-opening cases, during its opening week), and audience reviews are only available after the opening week. Given how demand is affected by advertising and reviews, a studio decides on a movie's total advertising expenditure before the movie's release. To calculate the need for advertising, a studio observes private information about the quality of its movie through test screenings with small audiences and through critic reviews (if available). I estimate the demand model using a dataset of 587 movies widely released in the U.S. and Canada from 2011 to 2015. The key object to estimate is how consumers respond to advertising and critic reviews. One challenge to doing this is that advertising and screening by critics are both decisions made by studios, a fact that makes both choices endogenous. Therefore, I instrument for the advertising decision using the production budget of a movie. A movie's production budget is fixed once the production of the movie is finished (and this happens long before the release of the movie), making it plausibly exogenous. On the other hand, the production budget is correlated with advertising through industry tradition. To instrument for the screening choice, I use the foreign release schedule of a movie. The correlation between this and the screening choice is that, if a movie has been released in an overseas market before, consumers will have access to the reviews from that market, which gives the studio less incentive to make the movie cold-opened. However, a foreign release is planned ahead of time and

2While these platforms also collect and display reviews by consumers, I mainly focus in this chapter on expert reviews.

depends on factors such as movie types. I find that both advertising and critic reviews have significant effects on box office revenue. If the value of critic reviews increases from 51.88 (40th percentile) to 59.49 (60th percentile), an average movie will enjoy a 14.31% higher opening week revenue. Further, the same effect can also be achieved by a 28.56% increase in advertising expenditure. When critic reviews are not available, advertising is about 2.25 times as effective as it is when a movie is reviewed. Consumers benefit from observing critic reviews by making better choices. With the availability of critic reviews, 0.74% fewer consumers will choose a movie that they would not watch had more information been available. Using the demand estimates, I solve for a studio's optimal advertising choice as a function of the signal it receives about movie quality from its private information. I find that the optimal advertising level increases as the signal that the studio receives and the critic reviews (if available) get better (i.e., indicate a higher movie quality). Moreover, there is a positive interaction effect. As a movie gets better critic reviews, the marginal impact of the studio's signal is more substantial, inducing a higher advertising expenditure. With this function, I back out a studio's signal from its advertising expenditure in the data. Next, I use this signal to conduct counterfactual experiments, where I calculate the studio's advertising expenditure and profit under different information structures. More specifically, I quantify the impact of reviews for movies that are cold-opened by finding the change in advertising and profit that would be expected if these same movies were reviewed before release. To do this, I give studios the realized critic reviews when they make their advertising choices. I find that studios would save $12.63 million per movie in advertising and gain $0.98 million per movie in profit on average. Likewise, for movies that have been screened by critics prior to release, I quantify the advertising and profit change by removing their critic reviews and calculating the advertising expenditure based on the signals received by the studios alone. The removal of critic reviews leads to an average increase of $70.26 million per movie in advertising and an average drop of $3.81 million in profit. Overall, I find that the presence of critic reviews saves studios' advertising expenditures and improves their profits. This chapter is related to the strand of literature that examines the effects of advertising and firms' optimal advertising choices. Nelson (1970) and Nelson (1974) distinguish between search goods and experience goods, and discuss the direct (i.e., characteristics) and indirect (i.e., quality) information in advertising. Kihlstrom and Riordan (1984) and Milgrom and Roberts (1986) formalize the signaling role of advertising, and multiple empirical papers have tested this effect, such as Thomas et al. (1998) and Erdem and Keane (1996). Ackerberg (2003) allows advertising to affect utility indirectly through consumer learning, while Liu (2016) applies this discussion to the motion picture industry and shows that consumer learning from word-of-mouth enhances advertising's signaling effect. In the marketing literature, Hollenbeck et al. (2019) study the relationship between online ratings of hotels and advertising spending; Basuroy et al.

(2006) provide empirical evidence that critic reviews and cumulative word-of-mouth after release both mitigate the positive effect of advertising on box office revenues. I build upon all of this literature by empirically examining the impact of advertising with or without the existence of reviews, and quantifying how consumers' access to reviews ultimately affects the benefit a firm derives from its advertising choices and its profits. The empirical literature on Bayesian Learning models has studied the learning process of consumers with respect to different sources of information. For example, Erdem and Keane (1996) and Crawford and Shum (2005) look at consumers' learning from direct experience, while Chevalier and Mayzlin (2006) and Newberry and Zhou (2016) consider the effect of online reviews on consumer choices. Within the scope of the motion picture industry, in particular, Santugini-Repiquet (2007) considers learning from a movie's market share; Moretti (2011) focuses on consumers' learning from peer reviews; and the marketing literature studies the effect of online word-of-mouth, such as in Chakravarty et al. (2010) and Hennig-Thurau et al. (2006). This chapter differs from these papers in that I include the impact of advertising and reviews, and simultaneously distinguish the effects of expert reviews and peer reviews. The literature on firm choices in the motion picture industry has inspected other non-price choices, including the cold-opening choice and the timing of a movie's release. Brown et al. (2007) and Brown et al. (2012) study the positive effect of cold-opening on a movie's box office performance and provide a rationale for this effect. Reinstein and Snyder (2005) quantify the impact of expert reviews on box office performance after controlling for the endogeneity of cold-opening. Einav (2010) investigates the choice of release dates by movie studios. Although I do not model the cold-opening choice in this chapter, I nonetheless add to this literature by linking the advertising and cold-opening choices and by quantifying the former conditional on the latter. The rest of this chapter is organized as follows. Section 2.2 introduces the industry background and data. Section 2.3 describes the structural model of demand and supply. In Sections 2.4 and 2.5, I explain the estimation strategy and discuss the results. Section 2.6 presents the counterfactuals, and Section 2.7 concludes.

2.2 Data

In this section, I introduce the background of the motion picture industry, describe the dataset, and provide summary statistics. I use a comprehensive dataset that combines various sources on 587 movies that were widely released3 during 2011-2015.

3Box Office Mojo considers a movie in “wide release” if the movie is playing at 600 or more theaters.

2.2.1 Industry Background

In 2018, the motion picture industry generated $11.9 billion in domestic box office revenue in the U.S. and Canada, and accumulated worldwide box office revenue of $41.1 billion; 75% of the U.S. and Canadian population went to a theater at least once in 2018, with 1.30 billion tickets sold.4 While a movie also generates revenue through other channels, such as home video sales and rentals, my focus is on box office performance, as this makes up the largest part of a movie's total revenue.5 A movie goes through three stages before reaching consumers: production, distribution, and exhibition. After production is completed, a movie is released to the market by its distributor(s), and the exhibitors (i.e., theaters) show the movie to the audience. Production and distribution can be vertically integrated.6 Exhibition, meanwhile, is rarely integrated with production or distribution. Hence, in this chapter, I do not distinguish between the producer and the distributor. Instead, I focus on the advertising choice of a movie studio with a movie that has already been produced. I also ignore exhibitors' potential influence on total reach and profits by assuming that a movie's exhibition schedule always meets consumer demand.7 Before movies are released, studios advertise to promote their movies, and their advertising choices depend on the information that they gather. By running test screenings with a small audience, for instance, studios can obtain some private information about the quality of a movie. Studios can also choose to send a movie to be screened by critics. While this latter process incurs little cost, once studios send out a movie, they cannot control how the critics will review it. Positive feedback can be used in promotional materials, such as trailers and posters, to attract an audience. However, if critics' feedback is mostly negative, such reviews can generate bad publicity. Since the feedback within critic reviews cannot be controlled by studios, and since consumers will observe critic reviews, studios may strategically choose whether to send their movies to critics at all or, instead, to make them cold-opened. Traditionally, critic reviews could only be found in magazines and newspapers, with excerpts appearing in trailers or on movie posters. However, since the late 1990s, two websites, Metacritic (launched in January 2001) and Rotten Tomatoes (August 1998), have been collecting and aggregating critic reviews, and they are the review sources that both the industry and consumers turn to most frequently. For movies that have been reviewed, each website provides an average score based on all critic reviews that have been collected, and the score is easily accessible for consumers.8

4Motion Picture Association of America (MPAA) THEME Report 2018: https://www.mpaa.org/wp-content/uploads/2019/03/MPAA-THEME-Report-2018.pdf
5For instance, Avengers: Endgame (2019) has thus far generated $96.4 million in Blu-ray sales and $15.1 million in DVD sales in the U.S. (November 2019), while its domestic box office revenue is $858.4 million.
6For example, Walt Disney Studios owns several production companies, including Pixar, Marvel Studios, and Lucasfilm, as well as a distribution company, Walt Disney Studios Motion Pictures.
7That is, under this assumption, the number of theaters showing the movie is ignored, rather than being treated as a constraint.
8As an example of how influential these websites are, note that if one generically searches for a movie on Google, the results page usually provides the critic scores on Metacritic and Rotten Tomatoes. Screenshot A.5 shows the example of Batman v Superman: Dawn of Justice (2016).

If a movie is cold-opened, however, no critic score will be available on these websites before the movie is released. Therefore, when consumers enter the market in the opening week, they face a movie that belongs to one of two types: regular or cold-opened, depending, respectively, on whether this movie has been reviewed by critics prior to its release or not. Consumers also observe movie characteristics, such as the cast, genre, and production studio(s), as well as studio choices, such as advertising. After a movie is released, consumers that have watched it can go to several online platforms, such as IMDb, to give their opinions on the movie's quality. At this point, the cold-opened movies will also have been reviewed by critics. Thus, all consumers entering the market after opening week can observe both types of reviews.

2.2.2 Descriptive Statistics

My dataset is based on movies that were widely released from 2011 to 2015 and generated box office revenue among the top 200 in their year of release. After dropping the movies without production budget information or sufficient weekly advertising information, the dataset ultimately contains 587 movies, released from July 22, 2011 to October 30, 2015.9 For each movie, I collect four types of information: 1) advertising expenditures, 2) critic reviews and audience reviews, 3) box office performance, and 4) movie characteristics.

2.2.2.1 Advertising Expenditures

Data on advertising expenditures are collected from Kantar AdSpender. It provides weekly total expenditure, as well as expenditures on different media types, such as cable TV and network TV. For each movie, I collect weekly total advertising expenditure for 35 weeks: 25 weeks pre-release (including the opening week) and 10 weeks post-release. Since this 35-week period covers 6 months pre-release and 2.5 months post-release, I consider it the complete advertising period.10 On average, 93.03% of total advertising expenditure is spent in the pre-release period, with 87.66% heavily concentrated in the single month prior to a movie’s wide release, as shown in Figure E.1. Therefore, I limit my focus to the advertising choices in the pre-release period, and refer henceforth to the aggregation of advertising expenditures in the 25 weeks before a movie’s wide release as its advertising expenditure.

9Four movies have enough information to be in the dataset, but spent $0 on advertising. Three of these are religious movies: God's Not Dead (2014), Do You Believe? (2015), and Woodlawn (2015). The fourth movie, The Cold Light of Day (2012), bombed in overseas markets long before its U.S. release, so its studio decided not to spend money on domestic marketing.
10Some movies may advertise beyond this period, but the majority of advertising expenditures is included in this period. Table E.3 shows that only 4 movies advertised as far ahead as 6 months pre-release, and only 25 movies continued to advertise as late as the 10th week post-release.

Table 2.1: Advertising Expenditures

                  Mean    S.D.    Min     25%     50%     75%     Max
Adspend ($mln)    17.63   10.06   0.232   10.23   16.69   24.32   53.24
Adspend/Budget    0.991   6.680   0.049   0.228   0.410   0.736   155.2
Adspend/Revenue   0.475   0.494   0.019   0.192   0.332   0.581   5.327

Table 2.1 shows studios' advertising choices. On average, studios spend $17.63 million on advertising, with a minimum expenditure of $0.23 million and a maximum of $53.24 million. In the motion picture industry, the rule of thumb is that advertising budgets are roughly half the size of production budgets. While my sample's median ratio of advertising over production budget (41.0%) is close to this 50% rule of thumb, the actual ratio has an average of about 100%, with a high variance.11 As for the ratio of advertising over total box office revenue, this figure is, on average, 47.5%, while nearly 10% of movies fail to recover their advertising expenditures at the box office. Therefore, it is especially beneficial to understand studios' advertising choices and look for possible ways to save on such expenditures.

2.2.2.2 Critic and Audience Reviews

Metacritic and Rotten Tomatoes collect reviews from multiple critics, assign scores to their reviews, and then aggregate them. For each movie in my dataset, I collect every critic review from each website, together with the date, author, and source-media of the review, the score posted, and the review contents. In this chapter, I focus on the data obtained from Metacritic, in particular. Knowing the date that a review was posted allows me to check whether a movie was screened by critics before its wide release. Table 2.2 shows how early a movie was reviewed by critics: 52.12% of the movies were screened within a week before release, and 4.77% have no critic reviews available prior to release. In this chapter, I define a cold-opened movie as a movie with no critic reviews by the 3rd day prior to its wide release, due to the assumption that opening week consumers cannot learn from reviews that are posted right before release.12 Under this definition, then, there are 220 cold-opened movies and 367 regular movies in my dataset.13

Table 2.2: Days between the First Critic Review on Metacritic and Wide Release

Days      >36    29-35   22-28   15-21   8-14   1-7    ≤0
Movies    83     21      19      40      90     306    28

Min = -7, Max = 738, Mean = 22.03 (days)

11Figure E.2 plots the relationship between advertising and production budget. 12In Section 2.5.4, I test the change in my results under a different assumption. 13Metacritic states that it applies a weighted average when accumulating review scores, and shows the average “Metascore” when at least four reviews have been collected. Since their weight system is unavailable, when calculating the average score of a movie, I apply the raw average of all observed scores, even when there are fewer than four reviews available.
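Operationally, the cold-opening flag can be constructed from the review dates. A minimal sketch, assuming pandas DataFrames reviews (with columns movie_id and review_date) and movies (with movie_id and release_date) holding datetime values; these names are illustrative, not the actual data files:

    import pandas as pd

    def flag_cold_opened(movies: pd.DataFrame, reviews: pd.DataFrame) -> pd.Series:
        first = reviews.groupby("movie_id")["review_date"].min()
        release = movies.set_index("movie_id")["release_date"]
        days_before = (release - first).dt.days            # NaN if never reviewed
        # regular: at least one critic review by the 3rd day before wide release
        regular = (days_before >= 3).reindex(release.index, fill_value=False)
        return ~regular                                    # True = cold-opened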

Critic scores on Metacritic have a scale of 0 to 100. For movies that were screened by the 3rd day before release, the average score is 62.58. By opening day, 560 of the 587 movies have been reviewed, and that average score drops to 57.11. After opening week, all movies have critic scores available, with an average score of 55.74. Since critic reviews are usually posted pre-release or within the opening week, the average critic score of a movie remains stable after its opening week.14 Multiple websites, including Metacritic and Rotten Tomatoes, allow users to review movies. In this chapter, I use user reviews on IMDb15, where users can rate a movie on a scale from 1 star to 10 stars. For each user review on IMDb of each movie in my dataset, I collect its date and the username of the reviewer, the score posted, and the content of the review. On average, the audience score on IMDb drops from 6.62 after the opening week (Week 1), to 6.56 (Week 2) and 6.51 (Week 3).16

2.2.2.3 Box Office Performance

Data on weekly box office performance are collected from Box Office Mojo, a website commonly used in research on the motion picture industry. For each movie, I collect weekly box office revenue and the number of theaters showing the movie for 25 weeks post-release, as well as its total lifetime revenue. On Box Office Mojo, a week is defined as Friday through Thursday, as Friday releases are the most common in the industry. Nearly 90% of movies in my dataset were released on a Friday (some with a Thursday evening preview); opening week revenue for the other 10% has been adjusted to cover the first 7 days after wide release. 11.41% of movies have generated some box office revenue before their wide release, accounting for 13.89% of their total box office revenue on average. This part of the revenue has been dropped.17 For movies that are widely released, opening week usually accounts for the largest fraction of the total revenue. Opening week garners 47.74% of total revenue on average, with this ratio varying from 14.70% to 81.15%. On average, a movie is carried by about 2,754 theaters in its opening week, with the widest release being 4,404 theaters. Afterward, weekly box office revenue drops quickly, and only about 36% of movies can survive on the market past week 15.18 In this chapter, I treat the opening week separately and accumulate revenues in all the weeks afterward as the post-release revenue. The number of theaters that a movie is playing at will not be used in this chapter but can be used for future work that introduces the interaction of

14Table E.5 provides more details on the average score and the number of critic reviews, as well as the comparison between Metacritic and Rotten Tomatoes. 15As shown in Screenshot A.5, Google provides the average user rating on IMDb in a movie’s general search results too. 16For comparison, I also collect the user reviews on Metacritic with scores ranging from 0 to 10. The number of reviews on Metacritic is generally smaller than that on IMDb. Table E.6 provides more details on audience scores and the comparison between IMDb and Metacritic. 17Details on adjusting box office revenues can be found in Appendix A.1.1. 18Table E.2 provides more information on the change in weekly box office revenue.

studios and theaters.

2.2.2.4 Movie Characteristics

Movie characteristics are collected from Box Office Mojo, IMDb, and IMDbPro (a paid service of IMDb). For each movie, I collect the following information regarding characteristics: production companies, distributor, country, MPAA rating, director power, star power, genre, format, production budget, and whether the movie is a sequel. I also collect the movie’s wide-release date and check if it has been released during a holiday week, as well as how many new releases enter the market at the same time as the movie in question.

Table 2.3: Summary Statistics of Movie Characteristics

Variable       Mean    S.D.    Min     Max        Variable       Mean    S.D.    Min     Max
Disney         0.053   0.224   0       1          Action         0.225   0.418   0       1
Fox            0.089   0.284   0       1          Animation      0.073   0.261   0       1
Paramount      0.060   0.237   0       1          Comedy         0.269   0.444   0       1
Sony           0.101   0.301   0       1          Drama          0.244   0.430   0       1
Universal      0.094   0.292   0       1          Horror         0.095   0.294   0       1
Warner         0.072   0.258   0       1          Sci-fi         0.138   0.345   0       1
US prod.       0.658   0.475   0       1          Thriller       0.126   0.332   0       1
Co-prod.       0.293   0.456   0       1          Sequel         0.133   0.340   0       1
R              0.400   0.490   0       1          Director       7.816   1.333   2.890   12.20
PG-13          0.431   0.496   0       1          Star           5.581   1.407   0.768   10.66
3D             0.218   0.413   0       1          Budget         53.89   53.85   0.1     250
IMAX           0.155   0.362   0       1          Revenue        72.53   84.02   1.412   652.3
MLK            0.026   0.158   0       1          Labor          0.022   0.147   0       1
President's    0.022   0.147   0       1          Halloween      0.026   0.158   0       1
Easter         0.012   0.109   0       1          Thanksgiving   0.022   0.147   0       1
Memorial       0.017   0.129   0       1          Christmas      0.051   0.220   0       1
Independence   0.022   0.147   0       1          # of Movies    3.186   1.087   1       6

Notes: Director and star power in the logarithm of STARmeter; budget and revenue in $ million.

Using dummies, I label the six major production companies (Disney, Fox, Paramount, Sony, Universal, and Warner Bros.) that produced 46.17% of the movies in my dataset. These production companies also have their own distribution divisions, and most of the movies distributed by a major studio are produced within the same company. Thus, I do not control for the effect of distributors.19 I also categorize the movies into 8 genres: action, animation, comedy, drama, horror, sci-fi (or fantasy), thriller, and other.20 Country of production has been

19Table E.4 provides the number of movies distributed by these six major studios. Some movies produced by minor studios can be distributed by a major studio. Three movies in the dataset were co-produced by two major studios: The Adventures of Tintin (2011), The Monuments Men (2014), and Interstellar (2014). 20Genres are not mutually exclusive, except for the “other” category; 140 movies in the sample have been categorized into two genres.

37 categorized into U.S.-produced (65.76%), co-produced by a U.S. studio with another country (29.30%), and produced by another country. MPAA ratings include R, PG-13, and Other (PG and G) and are also coded as dummy variables. I control for the format of a movie by labeling movies with 3D format and IMAX theater release, each accounting for about 15-20% of the total sample. I also measure the effect of the fame of a movie’s directors and actors, based on their popularity rankings on IMDbPro. This feature on IMDbPro is called the STARmeter. For each movie, I define director power as the logarithm of its director’s STARmeter (or the directors’ average, if there is more than one director). Star power, meanwhile, is the average of the logarithm of the top three cast members’ STARmeter. Note that since a lower value on the list means a higher ranking in reality, a movie with “lower” director power and star power has more popular director(s) and actors. Production budget varies from $0.1 million to $250 million, with an average of $53.89 million, and I categorize movies into three groups according to their production scale: low production budget (below $35 million, 52.98%), medium production budget ($35 - $100 million, 30.49%), and high production budget (above $100 million, 16.52%). Total box office revenue shows the inequality across movies, with a minimum of $1.41 million and a maximum of $652.3 million. I further control the effect of release date by controlling the month and year of a movie’s wide release, and whether the release benefits from one of the following 9 holidays: Martin Luther King Jr. Day, President’s Day, Easter, Memorial Day, Independence Day, Labor Day, Halloween, Thanksgiving, and Christmas.21 Overall, about 1.5-3% of movies in the sample were released in each holiday week, with about 5% released during the two-week annual Christmas period. The average number of new releases per week is 3.19.

2.2.3 Evidence of Consumer Learning

Using data on box office revenue, I provide evidence that consumers’ choice of attending a movie is affected by the studio’s advertising expenditure, as well as critic reviews and audience reviews. I also show, using data on advertising expenditure, that a studio’s advertising choice is affected by critic reviews. Figure 2.1(a) shows the relationship between the opening week revenue and the studio’s advertising expenditure conditional on the critic score in the pre-release period: movies with a low critic score, movies with a high critic score, and cold-opened movies with no critic score are color-coded.22 Opening week revenue is positively correlated with advertising, and this pattern is consistent across the three groups. Moreover, when the critic score is high, an increase in advertising is associated with a larger increase in the opening week revenue.

21 Each holiday, in my data, is defined as one entire week, except for Christmas, which includes 2 weeks. Specifically, holiday weeks are defined as the week(s) covering the corresponding holiday weekend, as defined on Box Office Mojo. See Appendix A.1.3 for details.
22 The categorization is based on their Metacritic scores. A movie has a low score if its Metacritic score is lower than 64.

Next, I look at the relationship between total revenue and advertising. Using the average user score collected on IMDb after a movie has been released for at least half a year (i.e., the long-run user score), I categorize a movie as "High" if its score is 6.5 or higher, and as "Low" otherwise. As shown in Figure 2.1(b), the pattern is very similar to that in Figure 2.1(a).

Figure 2.1: Box Office Revenue & Advertising
[(a) Opening Week Revenue; (b) Total Revenue. Log revenue plotted against advertising ($ million), with fitted lines for cold-opened movies and for regular movies with high and low scores.]

Figure 2.2 shows the relationship between a studio's advertising choice and the reviews of its movie. More specifically, Figure 2.2(a) shows that advertising is positively correlated with a movie's long-run audience score, whether the movie is cold-opened or not. If we assume that the long-run user score is a good indicator of movie quality, this positive relationship suggests that the studio's advertising choice is positively correlated with its movie quality. This shows the signaling role of advertising. In Figure 2.2(b), I plot advertising expenditures against the ex-post average critic scores. For cold-opened movies, the studio's advertising choice is positively correlated with its ex-post critic score, which again signals movie quality. However, for movies that have been reviewed by critics, this positive relationship goes away, as the advertising choice is now based on the mixture of the private information and the critic score. Figure 2.2(c) plots advertising against the average user score in the opening week. Since the studio does not observe this score when making its advertising decisions, and this score may not be a good predictor of the studio's private information, we see two flat fitted lines for both cold-opened and regular movies.

Figure 2.2: Advertising & Reviews
[(a) Long-run User Score; (b) Ex-post Critic Score; (c) Week 1 User Score. Advertising ($ million) plotted against each score, with fitted lines for cold-opened and regular movies.]

2.3 Model

I consider a model of movie release with 3 periods. There is a single studio with one movie to promote, and the studio makes an advertising decision in the Pre-release Period (Period 0). In each of the Opening Week (Period 1) and the Post-release Period (Period 2), consumers make a binary choice of whether to watch the movie or not, based on their expected utility of the movie.

2.3.1 Timing

Figure 2.3: Players, Timing, and Information Flow
[Timeline over Periods 0-2 (Pre-release, Opening, Post-release): the studio receives its private signal and chooses advertising in Period 0; the critic score arrives in Period 0 for regular movies and after the opening week for cold-opened movies; the audience score is revealed after the opening week; consumers act in Periods 1 and 2.]

Period 0 (Pre-release Period): The studio has a new movie $j$ with observable characteristics $X_j$ and unobserved quality $\xi_j$. The studio does not know $\xi_j$. Instead, it receives a private and noisy quality signal $\xi_j^{s*}$. If movie $j$ is screened by the critics (a regular movie), both the studio and consumers observe the critic score, which comes as a noisy signal $\xi_j^c$. Then, the studio chooses an advertising expenditure $ad_j$ based on the information it observes.

Period 1 (Opening Week): The movie $j$ is released to the market. A group of $N$ consumers enters and learns about movie quality from the advertising expenditure $ad_j$, and from the critic score $\xi_j^c$ if the movie has been screened. Then, consumers make a binary choice of whether to watch movie $j$, and exit the market afterward. After that, the audience score is revealed as a noisy signal $\xi_j^a$. If the movie has not been screened by the critics in Period 0 (a cold-opened movie), the critic score $\xi_j^c$ will also be revealed.

Period 2 (Post-release Period): Another group of $N$ consumers enters the market, observes the advertising expenditure $ad_j$, the critic score $\xi_j^c$, and the audience score $\xi_j^a$, and makes a watching decision based on all the information available. The game ends after this period.

2.3.2 Demand

For individual i in period t, the utility of watching movie j is:

$$U_{ijt} = \beta X_j + \gamma D_t + \xi_j + \tau_{jt} + \varepsilon_{ijt},$$

where $X_j$ are movie characteristics and $D_t$ is a dummy variable that indicates whether movie $j$ is in its post-release period. The true quality of the movie, $\xi_j$, which is fixed after a movie is produced, is unobserved until the individual actually watches the movie. $\tau_{jt}$ is the realized aggregate demand shock, which has a zero mean. $\varepsilon_{ijt}$ is an idiosyncratic taste shock, which is i.i.d. following a Type I extreme value distribution. The outside option includes types of entertainment other than watching this movie. The utility of the outside option is assumed to be:

$$U_{i0t} = \varepsilon_{i0t}.$$

Since consumers cannot observe the true quality $\xi_j$ when making their choices, they need to take expectations of $\xi_j$, using whichever of the studio's advertising choice $ad_j$, the critic score $\xi_j^c$, and the audience score $\xi_j^a$ they observe. For cold-opened movies and regular movies in different periods, consumers have different information sets:

$$I_t^a \in \big\{ \{ad_j\},\ \{ad_j, \xi_j^c\},\ \{ad_j, \xi_j^c, \xi_j^a\} \big\}.$$

The three sets correspond to the information that consumers have for a cold-opened movie in the opening week, a regular movie in the opening week, and a movie in the post-release period, respectively. Given the information available, consumers can calculate the expected utility of watching movie j in week t:

$$E[U_{ijt}|I_t^a] = \beta X_j + \gamma D_t + E[\xi_j|I_t^a] + \tau_{jt} + \varepsilon_{ijt}.$$

Then, consumers will choose whether to watch movie j based on this expected quality.

Under the assumption that $\varepsilon_{ijt}$ is i.i.d. Type I extreme value distributed, the probability that consumer $i$ chooses to watch movie $j$ in period $t$ is:

$$Pr_{jt} = \frac{\exp(\beta X_j + \gamma D_t + E[\xi_j|I_t^a] + \tau_{jt})}{1 + \exp(\beta X_j + \gamma D_t + E[\xi_j|I_t^a] + \tau_{jt})}. \qquad (2.1)$$

This choice probability $Pr_{jt}$ of movie $j$ in period $t$ equals the share $s_{jt}$ of the total population that chooses to watch the movie. There are different ways to model the learning process through which consumers form their expected quality $E[\xi_j|I_t^a]$. In this chapter, I focus on a Bayesian Learning model and formulate this learning process in Section 2.3.4. An alternative way of modeling will be discussed as a robustness test in Section 2.5.4.
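For concreteness, Equation (2.1) is an ordinary binary logit share and can be evaluated directly. The sketch below is mine (the function name and the illustrative numbers are not from the chapter); it simply maps the utility components into a watching probability.

```python
import numpy as np

def watch_probability(x_beta, gamma_d, exp_quality, tau):
    """Binary logit probability of watching movie j in period t, as in Eq. (2.1).

    x_beta      -- beta * X_j, utility from observed movie characteristics
    gamma_d     -- gamma * D_t, the post-release period effect
    exp_quality -- E[xi_j | I_t^a], expected quality given the information set
    tau         -- tau_jt, the aggregate demand shock
    """
    v = x_beta + gamma_d + exp_quality + tau
    return np.exp(v) / (1.0 + np.exp(v))

# Illustrative only: a movie with mean utility -5.95 and expected quality 0.5
# in its opening week draws roughly 0.4% of the population.
share = watch_probability(-5.95, 0.0, 0.5, 0.0)
```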

2.3.3 Supply

In the pre-release period, I assume that the studio makes a single advertising decision for the opening week, based on the information it observes. This advertising expenditure $ad_j$ maximizes the studio's expected total profit in the opening week and the post-release period:

$$\max_{ad_j}\; E[s_{j1}(ad_j)|I^s]\,Np\eta - ad_j + E[s_{j2}(ad_j)|I^s]\,Np\eta,$$

where $s_{j1}(\cdot)$ and $s_{j2}(\cdot)$ are the market shares of the opening week and the post-release period, respectively; $N$ denotes the total population of the U.S.,23 $p$ is the yearly average ticket price,24 and $\eta$ is the ratio of box office revenue that goes to the studio, which is set at 50%.25 The first two terms represent the expected profit in the opening week. The last term is the expected profit in the post-release period.

The studio's expected market shares depend on its information set $I^s$. For cold-opened movies, the only signal that the studio observes is its private signal ($I_c^s = \{\xi_j^{s*}\}$); for regular movies, both the private signal and the critic score are observed ($I_r^s = \{\xi_j^{s*}, \xi_j^c\}$). More specifically, I can write the expected market shares of cold-opened and regular movies as follows:

Cold-opened Movies

$$E[s_{j1}(ad_j)|\xi_j^{s*}] = E\left[\frac{\exp(\beta X_j + E[\xi_j|ad_j] + \tau_{j1})}{1 + \exp(\beta X_j + E[\xi_j|ad_j] + \tau_{j1})}\,\Big|\,\xi_j^{s*}\right];$$

$$E[s_{j2}(ad_j)|\xi_j^{s*}] = E\left[\frac{\exp(\beta X_j + \gamma + E[\xi_j|ad_j, \xi_j^c, \xi_j^a] + \tau_{j2})}{1 + \exp(\beta X_j + \gamma + E[\xi_j|ad_j, \xi_j^c, \xi_j^a] + \tau_{j2})}\,\Big|\,\xi_j^{s*}\right].$$

23 The actual domestic market includes Canada, which is about 10% of the size of the U.S. population. Here I do not include the Canadian population, and only consider the U.S. population older than 13 years, since 82.91% of the 587 movies in my dataset are rated PG-13 or R.
24 For the motion picture industry, the ticket price varies across theaters and formats (e.g., 3D and IMAX) but not across movies. Thus, the ticket price is treated as exogenous.
25 The revenue-sharing rule varies across movies, and the ballpark of this ratio is 50-55%.

For a cold-opened movie, the studio must account for the information that consumers observe in each period. Consumers in the post-release period learn from the critic score and the audience score, neither of which is available when the studio makes its advertising choice, so the studio needs to take expectations of both using the private signal it receives. Consumers in the opening week only observe the studio's advertising level, which the studio itself chooses.

Regular Movies

$$E[s_{j1}(ad_j)|\xi_j^{s*}, \xi_j^c] = E\left[\frac{\exp(\beta X_j + E[\xi_j|ad_j, \xi_j^c] + \tau_{j1})}{1 + \exp(\beta X_j + E[\xi_j|ad_j, \xi_j^c] + \tau_{j1})}\,\Big|\,\xi_j^{s*}, \xi_j^c\right];$$

$$E[s_{j2}(ad_j)|\xi_j^{s*}, \xi_j^c] = E\left[\frac{\exp(\beta X_j + \gamma + E[\xi_j|ad_j, \xi_j^c, \xi_j^a] + \tau_{j2})}{1 + \exp(\beta X_j + \gamma + E[\xi_j|ad_j, \xi_j^c, \xi_j^a] + \tau_{j2})}\,\Big|\,\xi_j^{s*}, \xi_j^c\right].$$

For a regular movie, similarly, the studio needs to take expectations of the audience score, which is not observed until after release. In addition to the private signal, the studio also learns from the critic score to predict the audience score. As for the opening week consumers, they learn from advertising and the critic score, both of which the studio observes.

The expected market shares give us the intuition for how the studio's private signal affects the advertising expenditure of a movie. Whether a movie is cold-opened or regular, the studio's expected opening week market share can be freely affected by the studio's advertising choice. However, the studio's expected post-release market share will be determined by the studio's private signal through the expected scores that the studio has not observed.

From the studio's maximization problem, the optimal advertising $ad_j^*$ must satisfy the following condition:

$$\frac{\partial\left(E[s_{j1}(ad_j)|I^s]\,Np\eta + E[s_{j2}(ad_j)|I^s]\,Np\eta\right)}{\partial ad_j}\Bigg|_{ad_j = ad_j^*} = 1. \qquad (2.2)$$

I assume that the studio is aware of the aggregate demand shocks $\tau_{j1}$ and $\tau_{j2}$ when it decides how much to advertise. Then, Equation (2.2) gives us the optimal advertising level $ad_j^*$ corresponding to a certain private signal $\xi_j^{s*}$. Similar to the demand side, I also model the studio's learning process using a Bayesian Learning model. I will further discuss it in Section 2.3.5.
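To make Equation (2.2) concrete, the sketch below solves the F.O.C. numerically for a single movie. The expected total share is a toy stand-in (the real object integrates over the score distributions, as in Sections 2.3.5 and 2.4.3), and the utility numbers are purely illustrative; only $N$, $p$, and $\eta$ follow the values used later in the chapter.

```python
import numpy as np
from scipy.optimize import brentq

N, p, eta = 260e6, 8.0, 0.5  # U.S. market size, average ticket price, studio's revenue share

def expected_total_share(ad_mln):
    """Toy stand-in for E[s_j1|I^s] + E[s_j2|I^s] as a function of advertising
    (in $ million). The real expectation integrates over score distributions."""
    v = -5.95 + 0.78 * np.log(ad_mln)           # mean utility increasing in log(ad)
    return 2.0 * np.exp(v) / (1.0 + np.exp(v))  # two consumer groups of equal size

def foc_residual(ad_mln):
    """Marginal studio revenue (in $ million per $ million of advertising) minus 1,
    i.e., the left-hand side of Eq. (2.2) minus its right-hand side."""
    h = 1e-4
    d_share = (expected_total_share(ad_mln + h) - expected_total_share(ad_mln - h)) / (2 * h)
    return d_share * N * p * eta / 1e6 - 1.0

ad_star = brentq(foc_residual, 0.1, 500.0)      # optimal advertising in $ million
```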

2.3.4 Consumer Learning

I model the process by which consumers form expectations of the unobserved quality as a Bayesian Learning process. The three quality signals (advertising expenditure, the critic score, and the audience score) differ in when they become available.

2.3.4.1 Opening Week

For consumers in the opening week, their information comes from the studio's advertising $ad_j$, and from the critic score $\xi_j^c$ if the movie has been reviewed. Assume that the studio and consumers share a common prior: the true quality of movie $j$ is a random draw from a normal distribution,

$$\xi_j \sim N(\mu, \sigma^2).$$

The studio’s private information from its test screenings comes as a private signal about its movie quality, which follows a normal distribution around the true quality:

$$\xi_j^{s*} \sim N(\xi_j, \sigma_{s*}^2).$$

Consumers are aware of this distribution but do not observe the studio's private signal. Instead, I assume that they infer this private signal as a one-to-one mapping from the studio's advertising choice $ad_j$. For simplicity, I will continue using $\xi_j^{s*}$ to denote the private signal that consumers infer, though it may not be the actual private signal that the studio receives.

Cold-opened Movies

Using the private signal $\xi_j^{s*}$ inferred from advertising, consumers can update their beliefs about the movie quality according to Bayes' Rule. Their posterior after learning from advertising will be:26

$$\xi_j|\xi_j^{s*} \sim N(\mu_{j1,c}^a, \sigma_{j1,c}^{a2}),$$

where:

$$\mu_{j1,c}^a = \xi_j^s = \frac{\sigma_{s*}^2}{\sigma_{s*}^2 + \sigma^2}\,\mu + \frac{\sigma^2}{\sigma_{s*}^2 + \sigma^2}\,\xi_j^{s*}; \qquad (2.3)$$

$$\sigma_{j1,c}^{a2} = \sigma_s^2 = \frac{\sigma^2\,\sigma_{s*}^2}{\sigma^2 + \sigma_{s*}^2}. \qquad (2.4)$$

Equation (2.3) shows that the posterior mean after learning from advertising is a weighted average of the prior mean and the private signal that the studio receives. Equation (2.4) shows that the posterior variance combines the prior variance and the variance of the private signal. Due to a limitation in identification, I do not separately identify the prior distribution and the distribution of the private signal. Instead, I focus on the posterior: I denote the posterior mean as $\xi_j^s$ and refer to it as the private score, and denote the posterior variance as $\sigma_s^2$, as shown in Equations (2.3)-(2.4).

Given that consumers are aware of the prior and the distribution of the private signal, inferring the private signal $\xi_j^{s*}$ from advertising is equivalent to inferring the posterior mean $\xi_j^s$

26 Proof of a general learning process from multiple signals can be found in Appendix B.1.

from advertising. I assume that consumers infer $\xi_j^s$ as a function of the advertising level $ad_j$:

$$\xi_j^s = \alpha_0 + \alpha_1 \ln(ad_j). \qquad (2.5)$$

This assumption means that consumers are unsophisticated, as they do not infer the studio's private score conditional on movie characteristics. The actual relationship between the studio's optimal advertising level and the studio's private signal is determined by the F.O.C. in Equation (2.2), which depends on other information such as movie characteristics. Thus, the private score that consumers infer under this assumption may not be the same as the actual signal that the studio receives.

The posterior variance $\sigma_s^2$ is increasing in the variance of the private signal, $\sigma_{s*}^2$. Thus, the posterior variance reflects the accuracy of the studio's private signal.

Regular Movies

For regular movies, consumers can further observe the critic score.27 Assume that the critic score arrives as a random signal $\xi_j^c$, normally distributed around the true quality:

$$\xi_j^c \sim N(\xi_j, \sigma_c^2).$$

Under this assumption, the critic score has infinite support, while in the data critic scores have a finite range. I provide estimation results after transforming the finite range in the data to an infinite support in Section 2.5.4. According to Bayes' Rule, the posterior in the opening week for regular movies will be:

$$\xi_j|\xi_j^{s*}, \xi_j^c \sim N(\mu_{j1,r}^a, \sigma_{j1,r}^{a2}),$$

where:

$$\mu_{j1,r}^a = \frac{\sigma_c^2}{\sigma_s^2 + \sigma_c^2}\,\xi_j^s + \frac{\sigma_s^2}{\sigma_s^2 + \sigma_c^2}\,\xi_j^c; \qquad (2.6)$$

$$\sigma_{j1,r}^{a2} = \frac{\sigma_s^2\,\sigma_c^2}{\sigma_s^2 + \sigma_c^2}. \qquad (2.7)$$

Notice that in Equation (2.6), the studio's private signal $\xi_j^{s*}$ has been replaced with its private score $\xi_j^s$ using Equations (2.3)-(2.4).28 This equation gives us the expected quality after consumers learn from advertising and the critic score. It is a weighted average of the private score and the critic score. The weight on these two terms depends on the relative accuracy of the signals. As the variance of the critic score increases, the critic score is less accurate, and

27 In my dataset, multiple critic reviews are available. While I could incorporate the number of reviews in the learning model, I only use the average critic score as the signal and do not consider the number of reviews.
28 Proof of the equivalence of writing the prior mean and the private signal separately, and combining them as the private score, can be found in Appendix B.2.

consumers will put less weight on it.

2.3.4.2 Post-release Period

Consumers in the post-release period observe the critic score, regardless of whether the movie was reviewed before release. Moreover, they have an extra piece of information: the audience score.29 Assume that the audience score comes as a noisy signal $\xi_j^a$, normally distributed around the true quality:30

$$\xi_j^a \sim N(\xi_j, \sigma_a^2).$$

Given all the information available to the consumers, the posterior of consumers in the post-release period will be:

$$\xi_j|\xi_j^s, \xi_j^c, \xi_j^a \sim N(\mu_{j2}^a, \sigma_{j2}^{a2}),$$

where:

$$\mu_{j2}^a = \frac{\sigma_c^2\sigma_a^2}{\sigma_s^2\sigma_c^2 + \sigma_s^2\sigma_a^2 + \sigma_c^2\sigma_a^2}\,\xi_j^s + \frac{\sigma_s^2\sigma_a^2}{\sigma_s^2\sigma_c^2 + \sigma_s^2\sigma_a^2 + \sigma_c^2\sigma_a^2}\,\xi_j^c + \frac{\sigma_s^2\sigma_c^2}{\sigma_s^2\sigma_c^2 + \sigma_s^2\sigma_a^2 + \sigma_c^2\sigma_a^2}\,\xi_j^a; \qquad (2.8)$$

$$\sigma_{j2}^{a2} = \frac{\sigma_s^2\,\sigma_c^2\,\sigma_a^2}{\sigma_s^2\sigma_c^2 + \sigma_s^2\sigma_a^2 + \sigma_c^2\sigma_a^2}. \qquad (2.9)$$

The posterior mean in Equation (2.8) is a weighted average of the three signals. As before, a higher variance means a less accurate signal, and consumers put less weight on it. Therefore, consumers' expected quality of a movie is a weighted average of the observed signals: the private score $\xi_j^s$ (which will be replaced by the advertising expenditure $ad_j$ using Equation (2.5)), the critic score $\xi_j^c$, and the audience score $\xi_j^a$, whichever are available.
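All of the posteriors in Equations (2.3)-(2.9) are instances of one rule: with a normal prior and independent normal signals, each component is weighted by its precision (inverse variance). A minimal sketch of that rule (my code, with illustrative numbers, not estimates from this chapter):

```python
import numpy as np

def posterior_from_signals(prior_mean, prior_var, signals, signal_vars):
    """Bayesian update of a normal prior with independent normal signals.

    Each component is weighted by its precision (1 / variance), so noisier
    signals receive less weight, exactly as in Eqs. (2.3)-(2.9)."""
    means = np.concatenate(([prior_mean], np.asarray(signals, dtype=float)))
    precisions = 1.0 / np.concatenate(([prior_var], np.asarray(signal_vars, dtype=float)))
    post_var = 1.0 / precisions.sum()
    post_mean = post_var * (precisions * means).sum()
    return post_mean, post_var

# Treating the private score as the current belief and adding the critic and
# audience scores reproduces Eqs. (2.8)-(2.9); the numbers are illustrative.
mean_post, var_post = posterior_from_signals(0.8, 0.44, [0.5, 0.3], [0.9, 1.2])
```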

2.3.5 Studio Learning

I assume that the studio is aware of the distributions of the critic score and the audience score, as well as the mapping from its advertising expenditure $ad_j$ to the private score that consumers infer, as in Equation (2.5). As discussed in Section 2.3.3, the studio's expected market shares depend on the learning process of consumers.

Under the assumption that the studio observes the aggregate demand shocks $\tau_{j1}$ and $\tau_{j2}$, the studio knows its market share in the opening week. However, for the post-release period, the studio does not observe the audience score $\xi_j^a$, nor the critic score $\xi_j^c$ if the movie is cold-opened. The expected market shares are as follows:

29 Similar to the critic reviews, I do not consider the number of audience reviews in the learning process, despite observing it in the data. Including the number of audience reviews, possibly depending on the movie's total number of viewers in the opening week, could be a useful extension of this model.
30 Following the discussion of the critic score, here I also allow an infinite support for the audience score.

Cold-opened Movies

$$E[s_{j1}(ad_j)|\xi_j^{s*}] = \frac{\exp[\beta X_j + (\alpha_0 + \alpha_1 \ln(ad_j)) + \tau_{j1}]}{1 + \exp[\beta X_j + (\alpha_0 + \alpha_1 \ln(ad_j)) + \tau_{j1}]}; \qquad (2.10)$$

$$E[s_{j2}(ad_j, \xi_j^c, \xi_j^a)|\xi_j^{s*}] = \iiint s_{j2}(ad_j, \xi_j^c, \xi_j^a)\, dF(\xi_j^c|\xi_j)\, dF(\xi_j^a|\xi_j)\, dF(\xi_j|\xi_j^{s*}), \qquad (2.11)$$

where

$$s_{j2}(ad_j, \xi_j^c, \xi_j^a) = \frac{\exp[\beta X_j + \gamma + a_1(\alpha_0 + \alpha_1 \ln(ad_j)) + b_1\xi_j^c + c_1\xi_j^a + \tau_{j2}]}{1 + \exp[\beta X_j + \gamma + a_1(\alpha_0 + \alpha_1 \ln(ad_j)) + b_1\xi_j^c + c_1\xi_j^a + \tau_{j2}]},$$

and $(a_1, b_1, c_1)$ corresponds to the weights in Equation (2.8).

Equation (2.11) requires that the studio integrate over the distribution of the critic score $\xi_j^c$ and the distribution of the audience score $\xi_j^a$, both normal with the true quality $\xi_j$ as their mean, and over the posterior:

$$\xi_j|\xi_j^{s*} \sim N(\mu_{j,c}^s, \sigma_{j,c}^{s2}).$$

Given the same information structure, this posterior can be characterized by Equations (2.3) and (2.4), with $\xi_j^{s*}$ now being the actual private signal received by the studio. As discussed before, I do not distinguish the prior distribution from the distribution of the studio's private signal. I again focus on the actual private score $\xi_j^s$ of the studio and the posterior variance $\sigma_s^2$.

Regular Movies

$$E[s_{j1}(ad_j)|\xi_j^{s*}, \xi_j^c] = \frac{\exp[\beta X_j + a_0(\alpha_0 + \alpha_1 \ln(ad_j)) + b_0\xi_j^c + \tau_{j1}]}{1 + \exp[\beta X_j + a_0(\alpha_0 + \alpha_1 \ln(ad_j)) + b_0\xi_j^c + \tau_{j1}]}; \qquad (2.12)$$

$$E[s_{j2}(ad_j)|\xi_j^{s*}, \xi_j^c] = \iint s_{j2}(ad_j, \xi_j^a)\, dF(\xi_j^a|\xi_j)\, dF(\xi_j|\xi_j^{s*}, \xi_j^c), \qquad (2.13)$$

where

$$s_{j2}(ad_j, \xi_j^a) = \frac{\exp(\beta X_j + \gamma + a_1(\alpha_0 + \alpha_1 \ln(ad_j)) + b_1\xi_j^c + c_1\xi_j^a + \tau_{j2})}{1 + \exp(\beta X_j + \gamma + a_1(\alpha_0 + \alpha_1 \ln(ad_j)) + b_1\xi_j^c + c_1\xi_j^a + \tau_{j2})},$$

and $(a_0, b_0)$ and $(a_1, b_1, c_1)$ correspond to the weights in Equations (2.6) and (2.8), respectively. Equation (2.13) requires that the studio integrate over the distribution of the audience score $\xi_j^a$ and over the posterior:

$$\xi_j|\xi_j^{s*}, \xi_j^c \sim N(\mu_{j,r}^s, \sigma_{j,r}^{s2}).$$

This posterior can be characterized by Equations (2.6) and (2.7), with $\xi_j^s$ being the actual private score that the studio receives.

2.4 Estimation

I estimate the demand side using GMM with instrumental variables. Then, based on the demand estimates, I use the F.O.C. of the studio's maximization problem to solve for the studio's private score from its advertising choice.

2.4.1 Parameterization

2.4.1.1 Weight on Signals

The weight that consumers put on different signals depends on the variances of the signals: $\sigma_s^2$, $\sigma_c^2$, and $\sigma_a^2$. I am particularly interested in the accuracy of the critic score and the audience score, and I normalize their variances as follows:

$$\tilde{\sigma}_k^2 = \sigma_k^2 / \sigma_s^2, \qquad k = c, a.$$

I further allow the normalized variances, $\tilde{\sigma}_c^2$ and $\tilde{\sigma}_a^2$, to vary across movies with different characteristics $H_j$: 1) produced by a major studio or a minor studio; 2) medium production budget; 3) high production budget. $H_j$ also includes a constant. I parameterize the normalized variances as below:

$$\tilde{\sigma}_{kj}^2 = \exp(\sigma_{k0} + \sigma_{k1}\,major_j + \sigma_{k2}\,med_j + \sigma_{k3}\,high_j), \qquad k = c, a.$$

2.4.1.2 Rescaling Reviews

In the model, the critic score and the audience score both follow normal distributions with the movie's true quality as the mean, and the true quality follows a normal distribution with prior mean $\mu$. In the actual data, critic reviews and audience reviews both have a fixed scale: on Metacritic, critic scores fall in [0, 100], and on IMDb, audience scores fall in [1, 10]. I rescale the scores before using them as the signals that enter utility.

First, I combine the prior mean $\mu$ with the constant in $\beta$, since they cannot be separately identified, to form $\beta_0$ as the new constant. This gives us a prior mean of 0, with the signals becoming $\xi_j^s - \mu$, $\xi_j^c - \mu$, and $\xi_j^a - \mu$, since the weights always add up to 1. To put the critic score and the audience score on this scale, I subtract the mean of the critic scores and of the audience scores of all the movies in the dataset, respectively. Then, I divide the critic scores and the audience scores by their standard deviations to bring them to a comparable level. That is:

$$\xi_j^c = \tilde{cr}_j = \frac{cr_j - \overline{cr}}{\sigma(cr)};$$

$$\xi_j^a = \tilde{ur}_j = \frac{ur_j - \overline{ur}}{\sigma(ur)}.$$

Here, $\overline{cr}$ and $\overline{ur}$ are the means of long-run critic scores on Metacritic and long-run audience scores on IMDb, and $\sigma(cr)$ and $\sigma(ur)$ are the standard deviations of these long-run scores.
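The rescaling is a plain z-score. A one-function sketch (mine, not the author's code):

```python
import numpy as np

def standardize(scores):
    """Center a score vector at its sample mean and scale by its sample standard
    deviation, as done for the Metacritic critic scores and IMDb audience scores
    before they enter the learning model."""
    scores = np.asarray(scores, dtype=float)
    return (scores - scores.mean()) / scores.std()
```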

2.4.2 Demand Estimation

Given the market share in Equation (2.1), I rewrite it as the following equation:

$$\ln(s_{jt}) - \ln(s_{0t}) = \beta X_j + \gamma D_t + E[\xi_j|I_t^a] + \tau_{jt}. \qquad (2.14)$$

The left-hand side is the difference between the logarithm of the probability of watching movie $j$ and the logarithm of the probability of not watching. On the right-hand side, I control for the characteristics of the movie, the time dummy, the expected movie quality, and an aggregate demand shock. The controls $X_j$ include 1) movie characteristics: production company, origin, MPAA rating, director power, star power, genre, format, sequel, and movie scale; and 2) time of release: release year and month dummies, as well as the 9 holiday weeks.

The expected quality term, $E[\xi_j|I_t^a]$, will be replaced using Equation (2.3), (2.6), or (2.8), depending on the period the movie is in and the type of the movie. I estimate all the observations together, and label each group using 2 dummies: the cold-opening dummy $NoCr_{jt}$, which indicates that no critic score is available, and the time dummy $D_t$. $NoCr_{jt} = 1$ if and only if movie $j$ is cold-opened and is in the opening week ($D_t = 0$). The audience score is available if and only if $D_t = 1$, when the movie is in its post-release period. More specifically, the expected quality term takes the following form:

$$E[\xi_j|I_t^a] = NoCr_{jt}\cdot\big[\alpha_0 + \alpha_1\ln(ad_j)\big] + (1 - NoCr_{jt})(1 - D_t)\cdot\big[a_0(\alpha_0 + \alpha_1\ln(ad_j)) + b_0\xi_j^c\big] + (1 - NoCr_{jt})\,D_t\cdot\big[a_1(\alpha_0 + \alpha_1\ln(ad_j)) + b_1\xi_j^c + c_1\xi_j^a\big].$$

To estimate Equation (2.14), I first discuss the identification of the parameters, especially those with endogeneity concerns, then explain the estimation strategy.

2.4.2.1 Identification

For the preference parameters, the coefficients on movie characteristics, $\beta$, are identified by the variation in shares across movie characteristics and release times; the coefficient on the period dummy, $\gamma$, is identified by the variation in shares for the same movie across the two periods. The variance parameters, $\sigma$, are identified by the variation in shares driven by the variation in the signals. More specifically, the variation in the shares of the three groups of movies whose consumers hold different information sets (advertising only; advertising and the critic score; advertising and both scores) pins down $\sigma$. This variation across movies produced by major or minor studios with different production budget levels identifies the $\sigma$ parameters on these characteristics. The coefficient $\alpha_1$ on the advertising expenditure is identified by the variation in shares across movies with different advertising expenditures. The constant $\alpha_0$ is identified by the difference in shares between cold-opened movies and regular movies.

However, the studio's advertising choice and cold-opening choice are both subject to endogeneity issues. I use two instrumental variables to deal with them, respectively.

2.4.2.2 Instrumental Variables

First, the advertising expenditure $ad_j$ is endogenous due to the assumption that the studio observes the aggregate demand shocks $\tau_{j1}$ and $\tau_{j2}$ when making its advertising choice. I use the logarithm of a movie's production budget, $\ln(budget_j)$, as an instrument. In my demand equation, I assume that the scale of a movie affects consumers' utility, since consumers may react differently to a high-budget blockbuster and a small-budget arthouse movie, though they do not observe the actual production budget. Conditional on the movie scale, I utilize the variation in the actual production budget to identify the effect of advertising. As discussed previously, the advertising expenditure and the production budget are highly correlated under the industry's 50% rule of thumb. While studios can adjust advertising expenditures based on unobserved shocks, the production budget cannot be adjusted once the production of the movie is completed.31 Regressing $\ln(ad_j)$ on this instrument after controlling for movie characteristics, the critic score, and the audience score shows that $\ln(budget_j)$ is a strong instrument, with a significant and positive coefficient of 0.380 and a t-value of 5.30.

Figure 2.4: Instrument for Cold-opening Choice
[Diagram: a movie's foreign release exposes foreign consumers to its quality, which reaches domestic consumers through word-of-mouth and thereby affects the studio's cold-opening choice, independently of the demand shock $\tau$.]

Second, while I do not model the studio's choice of whether to send the movie to the critics before release, this choice may be affected by unobserved shocks in $\tau_{jt}$ that I cannot account for. Thus, the cold-opening dummy $NoCr_{jt}$ is also endogenous. To deal with this issue, I consider the foreign release schedule of a movie. I check whether a movie has been released in an overseas market 3 days before its domestic release, and label it with a dummy variable, $frgn_{jt}$. $frgn_{jt} = 1$ if and only if the movie has not been released in a foreign market and is in its opening week. The intuition for the instrument's validity is illustrated in Figure 2.4. If a movie has already been released in an overseas market, consumers in that market will watch the movie, and information about its quality can spread to the domestic market via word-of-mouth, informing domestic consumers. The studio will then have more incentive to not let the critics review its movie. Therefore, a movie's foreign release indirectly affects the studio's screening choice in the domestic market. On the other hand, the foreign release schedule is less flexible and depends more on observed movie characteristics $X_j$ than on the demand shock $\tau_{jt}$.

Since the critic score is interacted with the cold-opening choice, I separately regress $NoCr_{jt}$ and its interaction with the critic score on $frgn_{jt}$ and its interaction with the critic score, respectively, conditional on movie characteristics, advertising, and the audience score. $frgn_{jt}$ is positive and significant at 0.311 with a t-value of 8.68 when $NoCr_{jt}$ is the dependent variable. $frgn_{jt}$ interacted with the critic score is also a strong instrument, with a coefficient of 0.859 and a t-value of 38.25, when $NoCr_{jt}$ interacted with the critic score is the regressand.32

31 Most of the factors that affect a movie's production budget, such as director power, star power, genre, and format (3D, IMAX), have already been controlled for.

2.4.2.3 Estimation Strategy

Given the preference parameters $(\beta, \gamma)$ and the learning parameters $(\sigma, \alpha)$, $\tau_{jt}$ can be calculated as:

$$\tau_{jt}(\beta, \gamma, \sigma, \alpha) = \big[\ln(s_{jt}) - \ln(s_{0t})\big] - \big[\beta X_j + \gamma D_t + \xi_{jt}(\sigma, \alpha)\big].$$

The moment conditions are based on the orthogonality between the aggregate demand shock $\tau_{jt}$ and the instrumental variables $Z_{jt}$:

$$E[\tau_{jt}(\beta, \gamma, \sigma, \alpha)|Z_{jt}] = 0. \qquad (2.15)$$

The instruments include the movie characteristics $X_j$, the time dummy $D_t$, the production budget $\ln(budget_j)$, the foreign release dummy $frgn_{jt}$, and the interactions of the two scores (the critic score $(1 - frgn_{jt})\xi_j^c$ and the audience score $(1 - frgn_{jt})D_t\xi_j^a$) with the characteristics $H_j$ in the parameterization. Then, I use GMM to estimate the parameters.

32 I have further tested the advertising and cold-opening instruments in the first stage of a linear regression used to estimate the demand side. The Cragg-Donald Wald F statistics also show the strength of these instruments.
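A schematic of the resulting GMM criterion follows. The names are hypothetical and the sketch omits the two-step weighting-matrix update; it only shows how the demand shock implied by a parameter guess is interacted with the instruments to form the objective behind Equation (2.15).

```python
import numpy as np

def gmm_objective(theta, XD, Z, log_share_diff, quality_term, W):
    """GMM criterion for the moment conditions in Eq. (2.15).

    XD             -- matrix of movie characteristics X_j stacked with D_t
    Z              -- instrument matrix (X_j, D_t, ln budget, frgn, interactions)
    log_share_diff -- ln(s_jt) - ln(s_0t) for each observation
    quality_term   -- callable mapping (sigma, alpha) to xi_jt for each observation
    W              -- weighting matrix, e.g. inv(Z'Z) in a first step
    """
    k = XD.shape[1]
    tau = log_share_diff - XD @ theta[:k] - quality_term(theta[k:])
    g = Z.T @ tau / len(tau)   # sample analogue of E[Z' * tau]
    return g @ W @ g
```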

2.4.3 Advertising Policy Function

Given the demand estimates, the F.O.C of the studio’s optimization problem in Equation (2.2) gives us the advertising policy function:

$$ad_j^* = f(\xi_j^s;\, \theta, X_j, I),$$

where $\theta = (\beta, \gamma, \sigma, \alpha)$; $X_j$ are all the movie characteristics; $I = \emptyset$ if the movie is cold-opened, and $I = \{\xi_j^c\}$ if the movie is regular. The private score $\xi_j^s$ is the posterior mean after the studio updates its belief with the private signal it receives.

Cold-opened Movies

If movie $j$ is cold-opened, the studio sticks to its posterior and forms beliefs about the critic score and the audience score. This amounts to integrating over the distributions of the true quality, the critic score, and the audience score. Empirically, I generate $n$ Gaussian quadrature nodes $\xi_{j,k}$ ($k = 1, \cdots, n$) from the studio's posterior, based on the private score $\xi_j^s$ and the posterior variance $\sigma_s^2$. At each node $\xi_{j,k}$, I generate $n$ nodes $\xi_{j,kl}^c$ ($l = 1, \cdots, n$) from the distribution of the critic score, with mean $\xi_{j,k}$ and variance $\sigma_c^2$, and $n$ nodes $\xi_{j,km}^a$ ($m = 1, \cdots, n$) from the distribution of the audience score, with mean $\xi_{j,k}$ and variance $\sigma_a^2$. Then, I calculate the F.O.C. and solve for the $ad_j^*$ that makes the equation hold.

Regular Movies

For a regular movie, the studio further updates its belief using the observed critic score $\xi_j^c$; the new posterior has mean $\frac{\sigma_c^2}{\sigma_s^2+\sigma_c^2}\xi_j^s + \frac{\sigma_s^2}{\sigma_s^2+\sigma_c^2}\xi_j^c$ and variance $\frac{\sigma_s^2\sigma_c^2}{\sigma_s^2+\sigma_c^2}$, as defined in Equations (2.6) and (2.7). The studio still has to take expectations of the unobserved audience score. Similarly, I generate $n$ nodes $\xi_{j,k}$ ($k = 1, \cdots, n$) from the studio's posterior. At each node $\xi_{j,k}$, I generate $n$ nodes $\xi_{j,km}^a$ ($m = 1, \cdots, n$) from the distribution of the audience score, with mean $\xi_{j,k}$ and variance $\sigma_a^2$. Then I solve for the $ad_j^*$ that satisfies the F.O.C.

Notice that calculating the F.O.C. requires knowing the value of $\sigma_s^2$. However, as discussed in Section 2.4.1.1, I cannot separately identify $\sigma_s^2$ from $\sigma_c^2$ and $\sigma_a^2$ in my demand estimation. To approximate $\sigma_s^2$, I use the variance of $\alpha_0 + \alpha_1\ln(ad_j)$, which is the private score inferred by the consumers. This gives us $\sigma_s^2 = 0.4398$; $\sigma_c^2$ and $\sigma_a^2$ can then be calculated for movies produced by major or minor studios at different production budget levels.33

33 I can further allow $\sigma_s^2$ to vary across movie types by using the variance of $\alpha_0 + \alpha_1\ln(ad_j)$ within each movie type. Here I only focus on the same $\sigma_s^2$ for all movies.
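The quadrature step can be written compactly. The sketch below (my code; the parameter names follow the text, and the values would come from the estimation) builds Gauss-Hermite nodes for a normal distribution and nests them to approximate the post-release expectation in Equation (2.11) for a cold-opened movie; the regular-movie case in Equation (2.13) drops the critic-score loop.

```python
import numpy as np

def gh_nodes(mean, var, n=9):
    """Gauss-Hermite nodes and normalized weights for E[f(X)] with X ~ N(mean, var)."""
    x, w = np.polynomial.hermite_e.hermegauss(n)  # weight function exp(-x^2/2)
    return mean + np.sqrt(var) * x, w / w.sum()

def expected_post_share_cold(ad, xi_s, var_s, var_c, var_a, mean_util, gamma,
                             a1, b1, c1, alpha0, alpha1, tau2=0.0, n=9):
    """E[s_j2 | xi_j^s] for a cold-opened movie, Eq. (2.11): outer nodes over the
    quality posterior, inner nodes over the critic and audience score distributions."""
    q_nodes, q_w = gh_nodes(xi_s, var_s, n)
    total = 0.0
    for xi_k, wk in zip(q_nodes, q_w):
        c_nodes, c_w = gh_nodes(xi_k, var_c, n)
        a_nodes, a_w = gh_nodes(xi_k, var_a, n)
        for xc, wc in zip(c_nodes, c_w):
            for xa, wa in zip(a_nodes, a_w):
                v = (mean_util + gamma + a1 * (alpha0 + alpha1 * np.log(ad))
                     + b1 * xc + c1 * xa + tau2)
                total += wk * wc * wa * np.exp(v) / (1.0 + np.exp(v))
    return total
```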

2.5 Results

In this section, I first show the regression results and discuss the effects of the different signals. Then, I illustrate the policy function and discuss its implications. The last part of this section presents the results from robustness tests.



2.5.1 Preference Parameters

Table 2.4: Preference Parameters

Variable     (1) No IV   (2) IV        Variable      (1) No IV   (2) IV
Disney       0.320**     0.356*        # of Movies   -0.108***   -0.103***
             (0.125)     (0.193)                     (0.0301)    (0.0311)
Fox          0.329***    0.336**       Med           0.253***    0.426*
             (0.0829)    (0.131)                     (0.0897)    (0.242)
Paramount    0.162       0.160         High          0.568***    0.836***
             (0.113)     (0.159)                     (0.145)     (0.320)
Sony         0.324***    0.331**       Director      -0.0624**   -0.0514*
             (0.102)     (0.139)                     (0.0271)    (0.0298)
Universal    0.128       0.170         Star          -0.0263     -0.0367
             (0.108)     (0.163)                     (0.0293)    (0.0341)
Warner       -0.150      -0.0540       Sequel        0.501***    0.523***
             (0.111)     (0.204)                     (0.0711)    (0.0733)

Notes: Standard errors in parentheses; N = 1174; ∗p < 0.1, ∗∗p < 0.05, ∗∗∗p < 0.01.

While multiple movie characteristics and the timing of release (holiday weekends, month, and year effects) have been controlled for, I focus here on the effects of the selected characteristics in Table 2.4. The results show that most of the 6 major studios yield higher utility than movies produced by minor studios, with Disney, Fox, and Sony yielding significantly more. The number of newly released movies in a movie's opening week has a negative effect, partly capturing competition. Movies with a higher production budget level generate higher utility for consumers. Director power and star power both have positive effects.34 Sequels give consumers higher utility, as a movie has to belong to a popular franchise for a sequel to be made. Other characteristics related to genre (animation, horror), holiday weekends (MLK, Halloween, and Christmas), and months (February, March, June, July, August) also have significant effects.35 The period dummy, with coefficient $\gamma$, has a significant and positive effect, as the opening week accounts for less than half (about 47.74%) of the total revenue. After using the actual production budget as an instrument for advertising, the medium-budget and high-budget dummies still have positive and significant effects, showing that consumers do react to higher budget levels.

34 Since a higher ranking means a smaller value, a negative coefficient shows a positive effect.
35 For estimates of the other preference parameters, see Table E.11.

2.5.2 Learning Parameters

The learning parameters include 8 parameters that determine the weight that consumers put on different signals, and 2 parameters that map the advertising expenditure to the private score. Table 2.5 shows the results for the learning parameters. First, advertising has a positive and significant effect on utility. If advertising is positively correlated with an unobserved shock, its coefficient will be biased upward. After using the instruments, $\alpha_1$ decreases from 0.808 to 0.781, in line with this expected change. Second, the constant $\alpha_0$ in the mapping from advertising to the private score affects cold-opened movies in their opening week the most. If the cold-opening choice is negatively correlated with an unobserved shock, we would expect this constant to increase after using instruments. The change from -1.877 to -0.610 shows this increase.

Table 2.5: Learning Parameters

                (1) No IV   (2) IV                     (1) No IV   (2) IV
σc × Const.     1.937***    1.480***    σa × Const.    1.938***    2.260***
                (0.437)     (0.421)                    (0.388)     (0.648)
   × Major      0.0924      -0.102         × Major     -0.190      -0.267
                (0.548)     (0.608)                    (0.562)     (0.860)
   × Medium     -0.836      -0.729         × Medium    -0.543      -0.650
                (0.581)     (0.586)                    (0.599)     (0.863)
   × High       -0.845      -0.903         × High      -0.964*     -0.899
                (0.681)     (0.718)                    (0.515)     (0.856)
α0              -1.877***   -0.610      α1             0.808***    0.781***
                (0.418)     (1.399)                    (0.0725)    (0.205)

Notes: Standard errors in parentheses; N = 1174; ∗p < 0.1, ∗∗p < 0.05, ∗∗∗p < 0.01.

Moreover, the coefficients on the constants for the variance of the critic score and for the variance of the audience score are both significant and positive, while those on the movie-type interactions are insignificant. From the estimates, we can tell that the variance of the critic score is lower than the variance of the audience score but higher than the variance of the private score (since $\sigma_c > 0$ for all types of movies). For movies produced by a major studio, and for movies with a higher production budget level, the variance of the critic score and the variance of the audience score are both smaller.

A more intuitive way to interpret the result is to plot the weights in a graph. Figure 2.5 shows the variation in the weights on the different quality signals across major studio (left panel) and minor studio (right panel) productions, and across movies with different production budget levels (low, medium, and high). Each bar represents a total weight of 100% divided into 3 parts: the private score inferred from advertising, the critic score, and the audience score. When all three signals are available, consumers put the most weight on the private score inferred from advertising, and the least weight on the audience score. As the production budget level goes from low to high, consumers increase the weight on the critic score and the audience score. This pattern is consistent across movies produced by major and minor studios, but major studio movies are less affected by the private score inferred from advertising.

Figure 2.5: Weights on Signals
[Stacked bars of the weights on the private score, the critic score, and the audience score, by production budget level (low, medium, high); left panel: major studio productions, right panel: minor studio productions.]

The changes in the weights give us some insight into the effectiveness of advertising and reviews for a movie's box office performance. To further illustrate the impact, I compare the marginal effects of three changes on opening week box office revenue: 1) a 10% increase in advertising when the movie is cold-opened; 2) a 10% increase in advertising when the movie has been screened; and 3) a 5% increase in the critic score. Figure 2.6 shows the marginal effects on opening revenue across major and minor studio productions with different production budget levels. The marginal effects are calculated for a movie with a mean utility level of -5.9539, a critic score of 55.86, and advertising expenditures ranging from $0.1 million to $60 million.36

Figure 2.6: Marginal Effect of Advertising and Reviews on Opening Revenue
[(a) Low Production Budget; (b) Medium Production Budget; (c) High Production Budget. Marginal effect on opening revenue ($ million) against advertising ($ million), for: Ad (cold), Ad (major, reg), Cr (major, reg), Ad (minor, reg), and Cr (minor, reg).]

When a movie is cold-opened, a 10% increase in advertising (red dash-dot line) is more effective than the same increase in advertising when the movie has been screened (blue lines, with the dashed blue line corresponding to a movie produced by a minor studio). The marginal effect of advertising for a minor studio movie is slightly larger than that for a major studio movie, which matches the finding in Figure 2.5 that consumers put more weight on advertising for minor studio movies. As the production budget level increases from low to high, for both major studio and minor studio movies, the marginal effect of a 5% increase in the critic score becomes larger, while the effectiveness of advertising decreases. This also matches the changes in Figure 2.5.

To put these effects into perspective, I calculate the change in opening revenue if all movies had their critic scores increased from 51.88 (the 40th percentile) to 59.49 (the 60th percentile). On average, this increase in the critic score raises opening week revenue by 14.31%. To achieve the same effect by increasing advertising expenditures, studios would need to advertise 28.56% more on average. Moreover, when critic scores are unavailable, a 10% increase in advertising boosts opening revenue by $3.20 million per movie on average, while the same increase when critic scores are available improves opening revenue by only $1.42 million per movie.

36 -5.9539 is the mean of $\beta X_j$ over all 587 movies in my dataset, with holiday, month, and year dummies included. 55.86 is the mean of critic scores in my sample.

2.5.3 Advertising Policy Function

Using the demand estimates, I calculate the studio's optimal advertising strategies for cold-opened movies and regular movies. The policy functions in this section are based on a mean utility level of -5.9539. The market size $N$ is set at 260 million, with the average price set at $8. $\tau_{j1}$ and $\tau_{j2}$ are both set to 0 for simplicity. For each score, I generate $n = 9$ nodes to integrate over its distribution.37 I plot the private score within the range of $[-10, 10]$. Critic scores are fixed at 10, 30, 50, 70, and 90 as examples.

Cold-opened Movies

Figure 2.7: Optimal Advertising for Cold-opened Movies
[Optimal advertising ($ million) against the private score, for low, medium, and high production budget levels; left panel: major studio, right panel: minor studio.]

As shown in Figure 2.7, the optimal advertising expenditure for cold-opened movies strictly increases in the private score, except for very high private scores. As the production budget level increases (from yellow to red), the highest optimal advertising level decreases, and the turning point arrives earlier, at a lower private score. This pattern persists across major and minor studio productions.

37 For my counterfactual analysis, I also generate 9 nodes for each score.

Regular Movies

In Figure 2.8, curves with colors changing from blue to green represent optimal advertising levels as the critic score rises from low to high (from 10 to 90). The remaining curve is the optimal advertising level when the same movie is cold-opened, taken from Figure 2.7. The upper panel includes 3 graphs for major studio productions. The optimal advertising expenditure is strictly increasing in the private score and strictly increasing in the critic score. Comparing the advertising level after screening to the cold-opened case, the cold-opened advertising level is higher than in the regular case, except for movies with a low private score. This pattern persists for minor studio productions in the lower panel. The gap between the optimal advertising expenditures of a cold-opened movie and a regular movie shows the potential for saving advertising expenditure after observing reviews. I conduct a counterfactual study in Section 2.6 to quantify this saving.

Figure 2.8: Optimal Advertising for Regular Movies
[Optimal advertising ($ million) against the private score at critic scores of 10, 30, 50, 70, and 90, with the cold-opened policy for comparison. Upper panels: major studio with low, medium, and high budgets; lower panels: minor studio with low, medium, and high budgets.]

2.5.4 Robustness

I conduct multiple tests to evaluate the robustness of my demand estimates.

First, I test a linear model that removes the Bayesian Learning structure and changes the regressors of Equation (2.14) to linear terms. The regressors include advertising, the cold-opening dummy, the critic score, and the audience score. I parameterize their effects by interacting these variables with the characteristics in $H_j$, while controlling for the movie characteristics $X_j$ and the period dummy. This linear model provides estimates that capture a similar data pattern. More details can be found in Appendix C.2.1.

Second, I change the advertising instrument to include part (25% or 50%) of the production budgets of movies released a month before or after by the same major studio. This reflects the fact that major studios tend to purchase airtime in advance during the upfront season and allocate it to movies later. I also estimate my demand model using the standardized long-run audience score as a quality proxy. The results are similar, though the effect of advertising becomes smaller when I include a higher percentage of the production budgets of nearby movies. Appendix C.2.2 provides more discussion of this.

Third, I change the definition of a cold-opened movie to one with no critic score on the Monday before its opening week, which is about 10 days before release for movies released on a Friday. This increases the number of cold-opened movies (which now make up nearly 70% of my dataset), and gives a studio more time to adjust its advertising expenditure. The parameters estimated, as discussed in Appendix C.2.3, are very close to those in my main results.

I further revise some model assumptions and parameterizations. 1) I remove the heterogeneity in the variances of the signals, which decreases the number of instruments by eliminating the interaction terms. 2) I map the advertising expenditure to the private score using the square root rather than the logarithm, which changes the curvature of this mapping. 3) I transform the critic score and the audience score by $\ln(\frac{z}{1-z})$, with $z$ being the rescaled scores on $(0, 1)$. This provides an infinite support for the signals. I obtain similar results, as shown in Appendices C.2.4-C.2.6.

2.6 Counterfactuals

Based on the demand estimates, I first show the benefit of observing reviews. Then, I solve for the studio's private score $\xi_j^s$ from its actual advertising expenditure $Ad_j$ in the data, calculate the studio's choice under a counterfactual scenario, and compare the differences in advertising and profit.

2.6.1 Effect of Reviews on Consumer Choice

To quantify the benefit that consumers derive from observing reviews, ideally we would use the true quality of a movie to calculate utility. As it is unobservable, I use two measures of movie quality as the baseline.

First, I treat the expected quality after consumers learn from the most information (advertising, the critic score, and the audience score) as the baseline (Ad+Cr+Ur). Then, I compare it with the expected quality after learning from advertising only (Ad), and after learning from advertising and the critic score (Ad+Cr). The change in the market share reflects the change in the expected quality. Since the post-release market share is already based on the most information, I consider the change in the opening week market share under the different information sets.

Table 2.6: Difference in Opening Week Share (%)

                        Mean   S.D.   Min     Median   Max
(Ad) - (Ad+Cr+Ur)       0.87   1.33   -0.56   0.37     10.98
(Ad+Cr) - (Ad+Cr+Ur)    0.14   0.38   -1.72   0.05     3.16

Consumers benefit from observing critic and audience scores because they form more accurate expectations of quality and make better choices. Without observing the critic score and the audience score, the probability that consumers choose to watch a movie is 0.87% higher on average, with a standard deviation of 1.33%. After learning from the critic score, the opening week market share is only 0.14% higher than the baseline, and the variance also becomes smaller. Figure 2.9(a) shows the distribution of the differences in the market share.

To translate this change into the errors that consumers make, I simulate 100 markets with 1 million consumers each, and calculate the probability that: 1) consumers watch a movie that they would not watch if all information were available; or 2) consumers skip a movie they would watch if all information were available. On average, if both types of reviews are absent, 0.92% of consumers falsely choose to watch a movie, and 0.01% falsely skip a movie. After observing critic scores, these two percentages change to 0.18% and 0.03%, respectively.
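The error-rate simulation can be reproduced with a few lines. The sketch below is mine: it holds each consumer's taste shocks fixed across information sets, so the only difference between the two choices is the expected quality term.

```python
import numpy as np

rng = np.random.default_rng(0)

def error_rates(v_full, v_limited, n_consumers=1_000_000):
    """Share of consumers who watch a movie they would skip under full information
    ('false watch'), and who skip one they would watch ('false skip').

    v_full and v_limited are the movie's mean utilities under full and limited
    information; taste shocks are Type I extreme value, as in the model."""
    eps1 = rng.gumbel(size=n_consumers)  # shock on watching the movie
    eps0 = rng.gumbel(size=n_consumers)  # shock on the outside option
    watch_full = v_full + eps1 > eps0
    watch_lim = v_limited + eps1 > eps0
    false_watch = np.mean(watch_lim & ~watch_full)
    false_skip = np.mean(~watch_lim & watch_full)
    return false_watch, false_skip
```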

Figure 2.9: Market Share or Expected Quality under Different Information Sets
[(a) Baseline: Ad+Cr+Ur. Histogram of the differences in opening week share, (Ad) - (Ad+Cr+Ur) and (Ad+Cr) - (Ad+Cr+Ur). (b) Baseline: Long-run Audience Score. Expected quality of the four quality groups under each information set.]

Second, I use the long-run audience score to approximate movie quality from the consumers' perspective. I categorize movies into 4 quality groups (Low, Mid Low, Mid High, High), corresponding to the lowest 25%, the 25th percentile to the median, the median to the 75th percentile, and the highest 25% of the long-run audience score. Then, I plot the expected quality for these 4 groups under different information sets. In Figure 2.9(b), when advertising is the only signal observed, the expected quality of the 4 groups looks very similar. After the critic score becomes available, the difference between the Low group and the Mid Low group, and the difference between the Mid High group and the High group, become clearer. Once the audience score is revealed, the expected quality of the 4 groups aligns better with the long-run audience score.

2.6.2 Effect of Reviews on Studio Choice

To quantify the effect of critic reviews on the advertising choice of studios, I look at cold-opened movies and regular movies separately, and calculate the changes in their advertising expenditures and profits.

2.6.2.1 Mandatory Screening By Critics

I consider a policy that mandates critic screening before a movie's release. For a cold-opened movie, I solve for its private score $\xi_j^s$ given its advertising level $Ad_j$. Supposing it had been screened and received its ex-post critic score $\xi_{j,1}^c$, I calculate two optimal advertising expenditures and the corresponding profits:

(1) The studio observes the critic score $\xi_{j,1}^c$ before advertising, but consumers do not observe it;

(2) Both the studio and consumers observe the critic score $\xi_{j,1}^c$.

I refer to the first scenario as "private", since the critic score is now the studio's private information, and denote the optimal advertising level as $Ad_j^{pri}$ and the profit under this advertising level as $\pi_j^{pri}$. The second scenario corresponds to the case in which the movie is regular, so I denote the advertising level and the profit as $Ad_j^{reg}$ and $\pi_j^{reg}$. Here, the profit is calculated as the total box office revenue times the ratio that goes to the studio (50%) minus the advertising expenditure. Then, I decompose the total effect ($\Delta Ad_j$, $\Delta\pi_j$) of mandatory screening into two parts:

$$\Delta Ad_j = (Ad_j^{reg} - Ad_j^{pri}) + (Ad_j^{pri} - Ad_j) = \Delta Ad_j^a + \Delta Ad_j^s;$$
$$\Delta \pi_j = (\pi_j^{reg} - \pi_j^{pri}) + (\pi_j^{pri} - \pi_j) = \Delta \pi_j^a + \Delta \pi_j^s.$$

From cold-opening to the private scenario, the change in advertising and profit ($\Delta Ad_j^s$, $\Delta\pi_j^s$) reflects the studio's benefit from observing the critic score and updating its belief before making the advertising choice. This is the value of studio learning. From the private scenario to the regular movie case, the change in advertising and profit ($\Delta Ad_j^a$, $\Delta\pi_j^a$) shows the effect of consumers learning from the critic score. The results are presented in Table 2.7 and Figure 2.10.

Table 2.7: Change in Advertising & Profit after Mandatory Screening

         Mean     S.D.    Min      25%      50%      75%      Max      Obs.
ΔAd      -12.63   8.092   -41.26   -16.67   -10.85   -7.598   3.790    181
ΔAd^s    -5.581   12.87   -41.12   -12.33   -7.129   -0.196   37.38    181
ΔAd^a    -7.051   8.692   -36.86   -10.95   -3.640   -0.862   0.0231   181
Δπ       0.982    7.887   -31.99   -3.013   0.619    4.745    30.70    181
Δπ^s     4.422    5.188   -0.0968  0.840    2.618    6.441    30.63    181
Δπ^a     -3.441   6.039   -37.62   -4.029   -0.861   -0.0232  0.515    181

Notes: Changes in $ million.

Figure 2.10: Change in Advertising & Profit after Mandatory Screening
[(a) Advertising Expenditure; (b) Total Profit. Histograms (percent) of the total change and its studio learning and consumer learning components, over the range -40 to 40 ($ million).]

For the 181 cold-opened movies for which I can solve for $\xi_j^s$, requiring them to be screened before release saves advertising for 97.79% of them, with an average saving of $12.63 million per movie. The largest saving would be $41.26 million for Transcendence (2014), which received a critic score of 46.73 and spent $40.22 million on advertising. The movie that would save the least is Chronicle (2012), which received a critic score of 70.48 and spent $14.09 million on advertising; if it were screened, it would spend $3.79 million more on advertising. About $5.58 million of the average saving comes from the studio learning from the critic score. However, the actual saving varies a lot across movies, with 24.31% of cold-opened movies increasing advertising expenditure after privately observing the critic score. Another $7.05 million of the average saving comes from consumer learning. If consumers can also learn from the critic score, 96.13% of cold-opened movies will advertise less than under the private scenario.

If cold-opened movies were screened before release, 53.59% would see an increase in profit, with an average increase of $0.98 million per movie. Transcendence (2014) would benefit the most from being screened, with an increase of $30.70 million in profit, after enjoying the largest saving in advertising expenditure. Grown Ups 2 (2013), with a critic score of 24.78, would lose the most if it were screened, with a decrease of $31.99 million in profit. 97.24% of cold-opened movies would gain if their studios could learn from the critic score while consumers could not, with an average gain of $4.42 million per movie in profit. However, if consumers can learn from the critic score as well, profit drops by an average of $3.44 million per movie, and over 75% of cold-opened movies would lose profit due to consumer learning.

While this may seem to indicate that most of the studios made the wrong choice in not screening their movies, note that this result is based on the ex-post critic scores, which the studios could not observe when making their screening choice. In fact, compared to the expected profit under the regular case based on the studio's private score alone,38 half of the cold-opened movies made the correct choice. If I further allow the studios an error margin of $5 million, meaning that the cold-opened profit can be no more than $5 million lower than the expected profit under the regular case, 75.1% of the cold-opened movies were rational.

2.6.2.2 Removing Review Aggregators

Now I consider the case where no critic score is available for a regular movie, which would happen if review aggregators like Metacritic and Rotten Tomatoes did not exist. Similarly, I solve for the movie's private score ξ_j^s and calculate the optimal advertising expenditures and the corresponding profits for two scenarios:

(1) Consumers cannot observe the critic score ξ_j^c, but the studio still observes it;

(2) Neither consumers nor the studio observes the critic score ξ_j^c.

The first scenario is the private case where the studio learns from the critic score but consumers do not, due to the lack of review platforms like Metacritic and Rotten Tomatoes. I again denote the optimal advertising level and the corresponding profit as Ad_j^pri and π_j^pri. The second scenario corresponds to the cold-opened case with the optimal advertising Ad_j^cold and total profit π_j^cold. I decompose the total effect of removing review aggregators into two parts:

∆Ad_j' = Ad_j^cold − Ad_j = (Ad_j^cold − Ad_j^pri) + (Ad_j^pri − Ad_j) = ∆Ad_j^s' + ∆Ad_j^a'
∆π_j' = π_j^cold − π_j = (π_j^cold − π_j^pri) + (π_j^pri − π_j) = ∆π_j^s' + ∆π_j^a'

Then, the change in advertising and profit (∆Ad_j^a', ∆π_j^a') from the regular case to the private case exhibits the effect of shutting down consumer learning, while the change (∆Ad_j^s', ∆π_j^s') from the private case to cold-opening shows the effect of shutting down studio learning. Table 2.8 and Figure 2.11 below show the total effect and its decomposition.

38That is, the studio takes expectations of the critic score and the audience score, and assumes that consumers in the opening week observe this expected critic score.

Table 2.8: Change in Advertising & Profit after Removing Reviews

          Mean     S. D.    Min      25%      50%      75%      Max      Obs.
∆Ad'      70.26   68.27    -29.94    3.182   59.10   128.9    253.0     328
∆Ad^a'    34.63   45.27    -1.542    3.320   12.57    52.38   186.1     328
∆Ad^s'    35.63   66.02   -101.9   -13.19     6.624   88.87   199.1     328
∆π'       -3.805  60.01   -139.5   -34.28    -2.990   14.51   260.7     328
∆π^a'     20.01   42.14    -5.241   -0.674    1.339   17.13   262.4     328
∆π^s'    -23.81   33.47   -139.5   -38.52    -8.919   -1.497   23.50    328

Figure 2.11: Change in Advertising & Profit after Removing Reviews. (a) Advertising Expenditure; (b) Total Profit. Each panel shows histograms of the total change and its studio-learning and consumer-learning components.

For the 328 regular movies that have a private score ξ_j^s solved for, 79.57% would advertise more if they did not observe the critic score. On average, these movies would increase advertising by $70.26 million per movie. The maximum increase is $252.97 million for Pain and Gain (2013), with a critic score of 49.03 and $18.67 million in advertising expenditure. The maximum saving is $29.94 million for Pan (2015), with a critic score of 43.49 and $38.80 million in advertising. Nearly half of the increase ($34.63 million) is due to consumer learning being shut down. The effect of shutting down studio learning on the advertising choice varies, with 53.96% of regular movies spending more on advertising if the studio can no longer learn from the critic score. 60.67% of these movies would lose profit if critic reviews were not observed. On average, regular movies would lose $3.81 million per movie if they were cold-opened. The Diary of a Teenage Girl (2015) would see a decrease of $139.51 million in profit, mainly driven by its sharp increase of $153.79 million in advertising. MIB 3 (2012) would benefit the most from cold-opening, with an increase of $260.68 million in profit: with its critic score of 40 not revealed, it would have advertised $171.80 million more and significantly improved its opening week revenue. More specifically, shutting down consumer learning from the critic score benefits studios by an increase of $20.01 million per movie in profit. However, if the studios can no longer learn from the critic score, the average loss in profit is $23.81 million per movie.

Pooling cold-opened movies and regular movies together, I calculate the change in advertising after observing critic reviews as a percentage of the optimal advertising expenditure when a movie is cold-opened. Compared to the advertising expenditures when movies are all cold-opened, the median saving in advertising after observing critic reviews is 76.60%. On average, profit increases by $2.80 million per movie, which is 15.18% of the average profit in the data ($18.45 million). This shows that, after observing critic reviews, studios benefit from saving advertising expenditures and improving their profits.

2.7 Conclusion

In this chapter, I studied the impact of product reviews on the effectiveness of advertising and the advertising strategy of firms. I set up a structural model that includes consumers learning from advertising and reviews on the demand side, and firms choosing the optimal advertising level to maximize profit on the supply side. Using data on the motion picture industry, I utilized the variation in the availability of information to quantify the effect of advertising and reviews on consumer choice. Based on the demand estimates, I evaluated the policy function of the studios and solved for their private information from their advertising choices. I found that, without critic reviews, advertising is about 2.25 times as effective at increasing opening week revenue as it is when critic reviews are available to consumers. Moreover, without critic reviews, 0.74% of consumers would make the wrong choice of watching a movie that they would not watch if all information were available. The availability of critic reviews saves advertising expenditures for studios, as the median change in advertising is a decrease of 76.60% when movies are reviewed. Further, studios improve profit by $2.80 million per movie on average after being reviewed by critics.

There are several aspects of the current model that can be extended. First, my supply side has parameters that all come from demand estimation. Without flexibility in the first-order condition, there are about 13% of movies that my model cannot rationalize (their private scores cannot be solved). The distribution of the solved private scores has a large variance, and the private scores can fall out of range when predicting critic scores and audience scores. Since the private score captures all the variation in advertising conditional on the observables, it may reflect the effect of other supply-side variables and shocks. To alleviate this issue, I need to allow for more flexibility on the supply side.39 Second, I have not modeled the screening choice of the studios in this chapter. Modeling this choice will help us further understand the studio's decision when a high or low critic score is expected. Last but not least, it would be an interesting extension to think about the studio's dynamic advertising choices in the pre-release period. Using weekly advertising expenditures, I can compare the choices before and after

39In Appendix D, I provide a detailed discussion of this issue, including potential reasons that contribute to it, and a supply estimation strategy to allow for heterogeneous cost.

receiving the critic score for the same movie. This will give us extra information on the studio's advertising strategy. The model and estimation method in this chapter can be further applied to analyze other industries of experience goods, such as music and books, where the signaling role of advertising interacts with the availability of product reviews.

CHAPTER 3

Hollywood’s Response to the Growing Chinese Movie Market

3.1 Introduction

Since China became a member of the WTO in December 2001, yearly box office revenue in the Chinese movie market increased from 0.92 billion Chinese Yuan (CNY) in 2002 to 64.27 billion CNY in 2019, with an average growth rate of nearly 30% across this period. As the second largest movie market in the world, the Chinese movie market (about $9.30 billion in total revenue in 2019) is catching up with the domestic market (the U.S. and Canada, $11.32 billion in total revenue in 2019) and has become an ideal place for foreign movies to gross high revenues. Yearly box office revenue of imported foreign movies increased from less than 0.5 billion CNY in 2002 to 23.09 billion CNY (about $3.34 billion) in 2019.1 Many of the Hollywood movies that were imported have generated high revenues in the Chinese movie market. Some of them, such as Transformers: Age of Extinction (2014) and Need for Speed (2014), earned even more in China than in their domestic market. Therefore, more and more Hollywood companies are seeking ways to enter the Chinese market.

To enter the Chinese market, a foreign movie can go through one of two channels of import. First, it can be imported as a revenue-shared (RS) movie. This type of import allows film companies to share box office revenue. Before 2012, the share that went to film

1See Figure 3.1 for the complete trend of total revenue and the revenue of imported movies. Data collected from various websites and news reports, including National Data (http://data.stats.gov.cn) and the Chinese government website (www.gov.cn).

companies was about 13%-17.5%. After 2012, 25% of the box office revenue goes to the film companies. Second, it can be imported as a buyout (BO) movie. Buyout movies are purchased by a Chinese local distributor at a fixed price. Most of the buyout movies are paid at the purchase. The rest can have revenue shared after passing a threshold of box office revenue.

However, the Chinese government has set quotas for these two types of imports. For revenue-shared movies, the quota used to be 10 per year in 1994-2001. Since entering the WTO, the quota rose to 20 per year. After 2012, the new quota on this type has been 34 per year, with 14 of them restricted to 3D or IMAX movies. Hollywood movies take up a large part of the revenue-shared category, with over 25 per year in 2005-2012. For buyout movies, the quota is 30-40 movies per year, and only a small portion of this category used to be Hollywood movies. Since 2012, co-produced movies may try to enter using the quota of the origin of a non-U.S. co-production company. The State Administration of Press, Publication, Radio, Film, and Television (SARFT) is the department in charge of imports. It will conduct an inspection of the prospective movie and censor the movie, if necessary, before its release to the Chinese market.

To get a better chance of being imported, Hollywood companies can co-produce with some Chinese production companies, add Chinese actors to the movie, film the movie at some Chinese locations, or add contents that may pass the inspection more easily. By doing so, the movie companies may expect to get their movies imported more easily and attract more Chinese audience. In recent years, we can indeed observe a trend of adding Chinese features in Hollywood movies. Taking the Chinese market into account has been increasingly important in Hollywood companies' strategy.

Adding these Chinese features may have boosted the box office performance in China. Kwak and Zhang (2011) provide some evidence of the positive correlation between adding Chinese features and box office performance, from Kung Fu Panda (2008) to Transformers: Age of Extinction (2014). However, due to the effect of changing a movie's plot or production on the movie quality and the cultural differences between markets, adding Chinese features may also affect a movie's box office revenues in the domestic market and other markets worldwide.

Figure 3.1: Yearly Box Office Revenue in China, 2002-2019 (total and imported, in billion CNY)

This impact, potentially negative, may turn out to decrease the total profit of a movie, if the revenues in all markets are considered. In this chapter, I answer the following questions: Do the Chinese features added to a movie have heterogeneous effects on the box office revenues in different movie markets? Considering the total revenue worldwide, is it beneficial for studios to add Chinese features? Would it be profitable, or become more profitable, if the Chinese market continues to grow?

To answer these questions, I use a pool of 501 movies released in the domestic market and collect box office revenues in the domestic market (the U.S. and Canada), the Chinese market (Mainland China), and the international market (excluding the domestic and Chinese markets). I define a movie with at least one of the four characteristics — co-produced with a Chinese company, having Chinese cast member(s), filmed at a Chinese filming location, having Chinese culture in the plot — as a movie with Chinese features. Then, I estimate the effect of adding Chinese features on box office revenues in different markets. With the estimates obtained, I conduct counterfactual analyses to quantify the net effect of adding Chinese features on the total revenue, and evaluate the changes in this net effect when varying the size of the Chinese market (in terms of box office revenue).

I find that adding Chinese features increases the box office revenue in China by 51.74%, a positive and significant effect. In the domestic and international markets, the effect is negative but insignificant. As the actual Chinese market size is relatively small compared to the domestic and international markets, this change in characteristics would, on average, decrease the total revenue by $4.58 million per movie, with only 31.55% of the movies imported to China benefiting from the change. However, as the Chinese market continues to grow, adding Chinese features would benefit studios more: had the Chinese revenues doubled (if the Chinese market doubled its size) for these imported movies, adding Chinese features would lead to a $12.10 million average increase, and 63.10% of movies could benefit from the change.

This chapter contributes to the literature on cultural effects in the motion picture industry. Cultural differences affect the recognition of a movie, such as critic reviews and audience response, and thus affect box office revenues. For movies in the U.S. market, Fowdur et al. (2012) investigate racial bias in newspaper movie reviews and find that movies with a black leading and white supporting cast had 6% lower ratings and lost revenues of about $2.57 million on average. Several papers look at the performance of movies in international markets. Lee (2006) finds the existence of a cross-culture discount in the revenue of Hollywood movies released in Hong Kong. More related to this chapter, Kwak and Zhang (2011) inspect foreign movies released in China and provide evidence that China-related content and Chinese crew participation in production are related to foreign movies' box office performance.
In this chapter, I specifically focus on a change in the cultural aspect of a movie and test both the cultural preference in the target market (the Chinese market) and the cultural discount in non-target

markets (the domestic and international markets). This chapter also belongs to the literature on estimating box office revenues in the movie industry. In this literature, multiple papers look for movie characteristics that affect box office revenues. For example, Terry et al. (2011) find that characteristics like budget and MPAA rating, as well as quality indicators like critic reviews, nominations, and sequels, are all primary determinants of box office performance. Nelson and Glotfelty (2012) measure the effect of star power using the popularity of a movie's cast and directors. Another group of papers focuses on the dynamic pattern of box office revenue and estimates the effect of online word-of-mouth, such as Duan et al. (2008) (using user reviews) and Asur et al. (2010) (using tweets from Twitter). I follow the previous literature in controlling for available movie characteristics, including the popularity of cast and directors, and user ratings. I further introduce the dummy variable of having Chinese features in the estimation and evaluate its effect.

The rest of this chapter is organized as follows: Section 3.2 describes the dataset used and provides summary statistics. Section 3.3 discusses the estimation in different markets and identification. Section 3.4 presents the estimation results, as well as the results from robustness tests. Section 3.5 provides counterfactual analyses of removing or adding Chinese features and further evaluates the effect by varying the Chinese market size. Section 3.6 concludes.

3.2 Data

The dataset used in this chapter is hand-collected and covers 501 movies that were released to the domestic market (the U.S. and Canada) from 2011 to 2014. For each movie, I collect data on 1) whether it has Chinese features, 2) its box office revenues in the U.S., China, and other parts of the world, and 3) its observable characteristics.

3.2.1 Chinese Features

Studios can try different ways to bring in Chinese elements to attract Chinese audiences to watch their movies. While many features could be hard to directly observe, I consider the following 4 categories of characteristics as Chinese features:

(1) Co-produced with a Chinese company. That is, in the list of production companies that produced the movie, at least one company has China as its origin. Movies in this dataset were co-produced with at most one Chinese company, and no movie was solely produced by Chinese production companies.

(2) Having Chinese cast members. I only consider Chinese actors that are mostly active in the production of Chinese movies and less known by the U.S. audience. All of these actors played a supporting role or a minor role in the movies they were cast in, except for Jay Chou in The Green Hornet (2011).

(3) Filmed at a Chinese location. This information is based on the filming locations documented on IMDb. Hong Kong and Shanghai are both popular filming locations.

(4) Including Chinese culture in the plot. This is to identify the movies that do not explicitly have other Chinese features but include content that is related to China. More specifically, I consider movies with Kung Fu or related content to fall under this category.

As long as a movie meets at least one of the 4 categories, I consider it a movie with Chinese features. In total, there are 20 movies (3.99%) that have Chinese features. Table 3.1 summarizes the number of movies that fall under each category, as well as their origins.
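For illustration, once the four category indicators are coded as 0/1 columns, this dummy is a one-step operation. The following is a minimal Python sketch; the data frame and column names are hypothetical stand-ins, not the names used in my data files.

    import pandas as pd

    # Hypothetical 0/1 indicator columns, one per Chinese-feature category.
    CATEGORIES = ["cn_coproduced", "cn_cast", "cn_location", "cn_culture"]

    def flag_chinese_features(df: pd.DataFrame) -> pd.Series:
        # A movie has Chinese features if it meets at least one category.
        return df[CATEGORIES].any(axis=1).astype(int)

    movies = pd.DataFrame({
        "cn_coproduced": [1, 0, 0],
        "cn_cast":       [0, 0, 1],
        "cn_location":   [0, 0, 0],
        "cn_culture":    [0, 0, 0],
    })
    movies["china"] = flag_chinese_features(movies)  # -> 1, 0, 1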

Table 3.1: Summary of Movies with Chinese Features

                       Hollywood   Non-Hollywood   Total
Chinese Co-produced        8             0            8
Chinese Cast              10             1           11
Chinese Location           9             1           10
Chinese Culture            3             0            3
Total                     18             2           20

Here, Hollywood movies are defined as movies that are produced or co-produced by U.S. production companies. In the dataset, 168 movies were imported to the Chinese market, including 150 Hollywood movies. I also keep track of the import type, buyout or revenue-shared, of each imported movie. Since there is no official record of these movies and their import types, I utilize various data sources to collect the information to the best of my knowledge, and the process is documented in Appendix A.2.3. Table 3.2 summarizes these movies with their origins and Chinese features.

Table 3.2: Summary of Import Types

                        Hollywood         Chinese Features
                        Yes     No        Yes     No
Revenue-shared (RS)     109      3         11    101
Buyout (BO)              41     15          6     50
Total                   150     18         17    151

3.2.2 Box Office Revenues

To capture the cultural effect of adding Chinese features in a movie, I collect a movie’s box office revenues in different markets of the world. More specifically, I consider three markets: the domestic market (the U.S. and Canada), the Chinese market, and the rest of the world (referred to as the international market in this chapter).

Domestic Market

Data on box office performance in the domestic market are collected from Box Office Mojo. They include the release date of each movie, the number of opening theaters, opening revenue, total theaters, domestic box office revenue, and the ranking of revenue in the year that the movie was released.2 Some movies, especially those attending film festivals or imported after being released in another market, may have premieres or a limited release first, which are not available to the general audience. Therefore, I use the wide release date on IMDb as a movie's release date.

Chinese Market

The box office of the Chinese market was not well documented for the period covered by this dataset. I initially collected Chinese box office revenues that could be found on Box Office Mojo3 or in Appendix 2 of McCutchan (2013). For the remaining movies with missing box office revenues, all released in 2011, I utilize other data sources from a few Chinese websites and convert the amounts to U.S. dollars using the average exchange rate in 2011. More details on the data collection process can be found in Appendix A.2.4. Release dates in China are collected from Douban4. From 2011 to 2014, most imported movies were released after their releases in the U.S., due to the screening by SARFT, dubbing or adding subtitles to the movie, etc.5

International Market

Box Office Mojo provides the total revenue of a movie in all markets except the domestic market as the "international" revenue. Further, it lists individual market revenues of a country or a region under 3 geographical areas: 1) Europe, Middle East, and Africa; 2) Latin America; 3) Asia Pacific. The domestic market and China are listed separately. I collect the "international" revenue and sum up the individual market revenues under each of the 3 areas (referred to as EMA, LA, and AP) as the revenue of each area. Ideally, the sum of the revenues in the 4 areas (EMA, LA, AP, and China) should match the "international" revenue. However, due to measurement errors or missing data in some markets, over 70% of movies do not have matching revenues. If the sum of these 4 areas exceeds the listed revenue, I use the sum of the 3 areas (EMA, LA, and AP) as the international revenue. Otherwise, the international revenue is the listed "international" revenue minus the Chinese market revenue.6 As the release dates vary across different markets, I do not collect international release

2Since the initial collecting process started in November 2015 and movies can stay on the market for over half a year, only movies released before 2015 were considered. Domestic box office revenues have been updated (with very few changes) during the second round of data collection in April 2020. 3During the collection in April 2020, some of the previously collected revenues could no longer be found, while some have been updated, though I do not update them in this dataset. 4Douban is a website where users share interests in reading, music, and movies. For movies, it has a database of movie characteristics and user reviews, similar to IMDb. The movies page of Douban: movie.douban.com 5The import can be delayed for more than a year. For example, The November Man (2014) and Left Behind (2014) were both imported in 2016. 6Appendix A.2.4 provides detailed explanations of this process.

dates.
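The reconciliation rule above can be summarized in a short helper function. The sketch below implements the rule as described; the function and argument names are mine, and missing regional revenues are treated as zero for the comparison (an assumption of this sketch).

    import math

    def international_revenue(listed_intl, ema, la, ap, china):
        # Reconcile the listed "international" revenue with the regional
        # sums, following the rule described in the text.
        clean = lambda x: 0.0 if math.isnan(x) else x
        regions = clean(ema) + clean(la) + clean(ap)
        if regions + clean(china) > listed_intl:
            # The 4-area sum exceeds the listed figure: fall back to
            # the sum of EMA, LA, and AP.
            return regions
        # Otherwise, use the listed figure net of the Chinese revenue.
        return listed_intl - clean(china)

    # Example: the 4-area sum 120 exceeds the listed 100, so EMA+LA+AP is used.
    print(international_revenue(100.0, 60.0, 20.0, 30.0, 10.0))  # 110.0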

Table 3.3: Box Office Revenues in Different Markets

                 Mean    Std. Dev.   Min      Median    Max      Obs.
Domestic         79.73     79.58    10.13     53.26     623.4     501
Chinese          34.59     40.29     0.4      19.96     320       168
International   120.7     164.4      0.011    56.32     960.5     500
EMA              51.07     57.73     0.011    27.14     500.6     499
LA               18.66     26.96     0.001     7.706    188.9     471
AP               33.50     49.15     0.017    14.89     397.6     484

Table 3.3 summarizes the box office revenues (in $ million) in the markets mentioned above. Movies with no revenue in a market are treated as not available in that market.

3.2.3 Movie Characteristics

Movie characteristics are hand-collected from Box Office Mojo, which documents box office performance for all the movies released to the domestic market, and IMDb, the Internet Movie Database that provides information from the production to the release of a movie. For each movie, I collect up to 3 main production companies, including their names and origins. Then, the movie is labeled with one of the six major studios — Disney, Fox, Paramount, Sony, Universal, and Warner Bros. (Warner) — or as produced by "Other". As defined previously, based on the origin of the production companies, each movie is categorized as a Hollywood movie, one that is produced or co-produced by a U.S. company, or as a non-Hollywood one. Each movie is also categorized into one of 8 genres: action, animation, comedy, drama, horror, sci-fi/fantasy (sci-fi), thriller, and other. The "other" group includes genres like musical and documentary. According to the MPAA ratings, I control for the effect of movies being rated PG-13 or R, with the baseline being a PG or G movie. To capture the effect of having famous directors and cast members in a movie, I document the list of directors and the 5 leading actors of a movie. Most movies have only 1 director but some have up to 3 directors. The 5 leading actors are ranked according to the default order on IMDb, amended with the lists on the movie poster and Wikipedia page. The popularity of each director and actor is measured by the STARmeter provided by IMDbPro, the paid service of IMDb. STARmeter is a ranking based on the searching activities of users on IMDb; I use the ranking at the Sunday midnight before a movie's wide release. The director power and star power of a movie are then the averages of the logarithm of the STARmeters of its director(s) and actors, respectively. The estimated budget is collected to control for the production size of a movie, since consumers may have different preferences for movies with different production budgets. This information is

not available for all the movies, and 36 movies are dropped due to the lack of budget information. Using the average user scores from Rotten Tomatoes, I control for the quality of each movie in the analysis. The quality is measured by the percentage of "Fresh" reviews among all user reviews on the website.7 The average scores were collected more than half a year after a movie's release; therefore, they can be considered long-run scores that reflect the quality of a movie from the consumers' perspective. Holiday releases are controlled for as well. For U.S. holidays, I use a dummy to label movies released during the holiday weeks of Martin Luther King Day, President's Day, Easter, Memorial Day, Independence Day, Labor Day, Halloween, and Thanksgiving. For Chinese holidays, I consider the holiday weeks of Labor Day (May 1) and National Day (Oct 1).8
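The director power and star power measures described above reduce to a simple average of log STARmeter ranks. A minimal sketch, with made-up rank numbers for illustration only (lower STARmeter ranks correspond to more popular names):

    import numpy as np

    def power(starmeter_ranks):
        # Average of the logarithm of STARmeter ranks for a movie's
        # director(s) or its 5 leading actors.
        return float(np.mean(np.log(starmeter_ranks)))

    director_power = power([1200])                   # one director
    star_power = power([85, 340, 910, 2200, 5400])   # five leading actors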

Table 3.4: Summary Statistics of Movie Characteristics

Dummy variables:
              Domestic          Chinese
              Mean     S.D.     Mean     S.D.
Disney        0.070    0.255    0.095    0.294
Fox           0.084    0.277    0.077    0.268
Paramount     0.068    0.252    0.083    0.277
Sony          0.106    0.308    0.113    0.318
Universal     0.092    0.289    0.065    0.248
Warner        0.064    0.245    0.077    0.268
Hollywood     0.934    0.248    0.893    0.310
PG-13         0.461    0.499    0.613    0.488
R             0.351    0.478    0.167    0.374
Format        0.317    0.466    0.637    0.482
Action        0.174    0.379    0.357    0.481
Animation     0.094    0.292    0.155    0.363
Comedy        0.240    0.427    0.042    0.200
Drama         0.192    0.394    0.089    0.286
Horror        0.066    0.248    0.018    0.133
Sci-Fi        0.114    0.318    0.238    0.427
Thriller      0.098    0.297    0.077    0.268
Sequel        0.178    0.383    0.280    0.450
Re-release    0.014    0.117    0.012    0.109
Holiday†      0.176    0.381    0.024    0.153

Continuous variables:
              Domestic                            Chinese
              Mean     S.D.     Min     Max      Mean     S.D.     Min      Max
Director      7.725    1.406    2.89    14.80    7.332    1.431    2.944    10.13
Star          5.917    1.188    1.839    9.888   5.435    1.154    1.839     8.598
Budget        59.93    55.89    1       264      106.7    63.40    8        264
User Score    61.10    17.09    21       94      63.61    16.95    30        92

Notes: † Holiday corresponds to U.S. holidays and Chinese holidays, respectively; variables in the upper panel are all dummies; budget in million dollars.

Besides the characteristics mentioned above, I further construct the following dummy variables: 1) the format of a movie, indicating whether it is released in 3D format or in IMAX theaters; 2) whether a movie is a sequel; and 3) whether a movie is a re-release. Table 3.4 summarizes the movie characteristics for the 501 movies released to the domestic market, as well as the subset of 168 movies released in China. The international market saw the same movies as the domestic market, except for one movie, Beyond the Lights (2014), which

7The current explanation of this score by Rotten Tomatoes is "the percentage of users who rated this 3.5 stars or higher". 8For a list of the dates categorized as holidays, see Appendix A.2.

was only released to the domestic market. Over 93% of movies released to the domestic market are Hollywood movies, and the six major studios produced almost half of the 501 movies. The percentage of Hollywood movies among the movies imported to China is lower, however. China imported more PG-13 movies and fewer R movies (only 17%, compared to 35% in the domestic market), which can be attributed to the screening of SARFT. While about 30% of movies in the domestic market are in 3D format or exhibited in IMAX theaters, 63.7% of movies imported to China are in these special formats. This is due to the rule that 14 of the 34 revenue-shared quota slots are restricted to 3D or IMAX movies, which encourages movies in special formats to compete for the quota. Moreover, China imported more action, animation, and sci-fi/fantasy movies, and fewer comedy, drama, and horror movies. 28% of the movies imported to China are sequels, while about 18% of those in the domestic market are sequels. Further, the average budget of the movies imported to China is $106.7 million, much higher than the $60 million average in the domestic market. These imports also have a slightly higher average user score, a stronger director power, and a stronger star power.

3.3 Estimation

To illustrate the effect of adding Chinese features, I estimate the demand for movies in the domestic market, the Chinese market, and the international market. Each market has a different sample size, with the biggest pool being the 501 movies released to the domestic market. I regress the logarithm of the total box office revenue in a market on movie characteristics, market-specific characteristics, and the dummy of Chinese features. The focus is to compare the performance of movies with Chinese features against the performance of those without.

3.3.1 Domestic Market

To estimate the effect of adding Chinese features on the box office revenue of movies in the domestic market, consider the following estimation equation:

ln(Domestic_j) = βX_j + γUS_j + δChina_j + ε_j    (3.1)

Domestic_j is the box office revenue in the domestic market (in $ million). X_j are movie characteristics that include production company, Hollywood movie or not, director power, star power, genre, MPAA rating, format, sequel, production budget, and the average user rating on

Rotten Tomatoes (standardized). US_j are the characteristics that are specific to the domestic market, including whether the movie is released during a U.S. holiday, and the year and the month of release.

China_j is the dummy indicating whether the movie has Chinese features. To explore the effect of adding Chinese features in movies with specific characteristics, I further interact

China_j with the production budget, the Hollywood dummy, format, sequel, and two genres (action and sci-fi/fantasy).
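As a concrete sketch, equation (3.1) can be estimated by OLS with the formula interface of statsmodels. The variable and file names below are hypothetical stand-ins for the controls listed above; year and month of release enter as categorical dummies.

    import pandas as pd
    import statsmodels.formula.api as smf

    df = pd.read_csv("movies.csv")  # hypothetical input: one row per movie

    # ln_domestic: log of domestic revenue ($ million); other names are mine.
    formula = (
        "ln_domestic ~ china + hollywood + director_power + star_power"
        " + ln_budget + user_score_std + pg13 + r_rated + format_3d"
        " + sequel + us_holiday + C(genre) + C(studio)"
        " + C(release_year) + C(release_month)"
    )
    base_fit = smf.ols(formula, data=df).fit()

    # Interaction specification, e.g. Chinese features x action genre:
    int_fit = smf.ols(formula + " + china:action", data=df).fit()

The Chinese-market equation (3.2) follows the same pattern, swapping in the import-type dummy and the Chinese holiday and release-time controls in place of the U.S.-specific ones.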

3.3.2 Chinese Market

Focusing on the 168 movies that were imported to the Chinese market, I estimate the effect of adding Chinese features on the Chinese box office revenue. Similarly, consider the following estimation equation:

ln(Chinese_j) = βX_j + γCH_j + δChina_j + ε_j    (3.2)

Chinese_j is the box office revenue in the Chinese market (in $ million). X_j are the same movie characteristics controlled for in the estimation of domestic demand. China_j is the same dummy as before, which will also be interacted with the characteristics mentioned in the domestic demand estimation.

CH_j are the characteristics that are specific to the Chinese market, including whether the movie is released during a holiday week (either Labor Day week or National Day week), the type of import (revenue-shared or buyout), the year9, and the month of the release in China. Since movies that are imported as revenue-shared (RS) are usually expected to be more attractive to the audience and to generate higher demand, the dummy labeling revenue-shared movies could capture some unobservable characteristics these movies share.

3.3.3 International Market

I estimate the effect of adding Chinese features on the international market that excludes the domestic and Chinese markets. I also inspect the effect on the 3 regions: Europe, Middle East, and Africa (EMA); Latin America (LA); and Asia Pacific (AP). Consider the following estimation equation:

ln(Market_j) = βX_j + γT_j + δChina_j + ε_j    (3.3)

where X_j and China_j are defined and treated the same way as in the previous two estimations. Since there are no market-specific characteristics, I only control for the year and the month of release, the same as those used in the domestic market estimation (collected in T_j). Market_j is the box office revenue in the international market or one of the 3 regions, measured in $ million.

9Only two movies were imported in 2016, thus they have been categorized together with movies imported in 2015.

3.3.4 Identification

In each estimation equation, the parameters are identified by the variation in the box office revenue of the corresponding market based on the movie characteristics, market-specific characteristics, and whether the movie includes Chinese features (and the interaction with other characteristics). However, the identification of the coefficients of some characteristics is subject to the possibility of endogeneity.

The dummy of having Chinese features, China_j, should be exogenous in the estimations of domestic demand and international demand, as the choice of adding Chinese features would most likely not be correlated with unobserved demand shocks in these two markets. For the Chinese market, this variable could be endogenous if there is an unobserved demand shock that is observed by studios and affects the choice of adding Chinese features. I argue that this could not be the case for two reasons. First, the choice of adding Chinese features would happen at the beginning of a movie's production, long before the movie's release to the Chinese market. Second, the main reason to add Chinese features could be to improve the probability of a movie passing the screening of SARFT and being imported to China, and the realization would depend on the choice of SARFT. As the control for a movie's production size, I consider the estimated budget as exogenous, since the production budget is decided within the production period and long before a movie's release. The long-run user score on Rotten Tomatoes is also treated as exogenous, as the score includes the ratings posted by users long after the movie's release.

3.4 Results

In this section, I will compare the effects of adding Chinese features in different markets, as well as highlight some effects of other movie characteristics that vary across markets. Then, I perform robustness tests using Seemingly Unrelated Regressions (SUR) to allow for correlation between the error terms of different markets.

3.4.1 Effect of Chinese Features

Table 3.5: Effect of Adding Chinese Features

             (1)        (2)       (3)       (4)       (5)       (6)
           Domestic   Chinese    Intl.     EMA       LA        AP
China       -0.0967    0.417*   -0.0446   -0.187     0.214     0.129
            (0.139)   (0.213)   (0.262)   (0.268)   (0.290)   (0.233)
N              501       168       500       499       471       484
adj. R-sq    0.578     0.663     0.529     0.496     0.475     0.622

Notes: Standard errors in parentheses; ∗p < 0.1, ∗∗p < 0.05, ∗∗∗p < 0.01.

Table 3.5 shows that adding Chinese features increases the Chinese box office revenue by 51.74%, and the effect is significant at the 90% confidence level. This matches the evidence found in Kwak and Zhang (2011). However, adding Chinese features has negative but insignificant effects in the domestic market and the international market. If we further inspect the effect on the 3 regional markets, adding Chinese features has a negative effect in Europe, Middle East, and Africa (EMA), and positive effects in Latin America (LA) and Asia Pacific (AP), though none of these effects is statistically significant. Since the market size, indicated by the average box office revenue, differs across these 3 regional markets, the negative impact on the international market would be mainly driven by the effect in EMA. Because missing data and measurement errors occur more often when aggregating revenues for these regional markets, my focus will be on the domestic market, the Chinese market, and the international market.
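The 51.74% figure follows from the usual transformation of a dummy coefficient in a log-linear model. With the column (2) estimate δ = 0.417,

    exp(δ) − 1 = exp(0.417) − 1 ≈ 1.5174 − 1 = 0.5174,

so movies with Chinese features earn roughly 51.74% more in China, holding the other controls fixed.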

Table 3.6: Effect of Interaction Terms

                 (1)          (2)         (3)        (4)       (5)       (6)
              ln(Budget)   Hollywood    Format     Action    Sci-Fi    Sequel
Domestic
China          -0.0836      0.608      -0.0971    -0.0809   -0.148    -0.108
               (0.805)     (0.428)     (0.195)    (0.194)   (0.167)   (0.175)
×              -0.00293    -0.783*      0.000807  -0.0326    0.167     0.0305
               (0.178)     (0.450)     (0.270)    (0.279)   (0.304)   (0.279)
Chinese
China          -0.354       0.575       0.435      0.743**   0.282     0.242
               (1.740)     (0.666)     (0.331)    (0.310)   (0.269)   (0.283)
×               0.165      -0.173      -0.0311    -0.624     0.367     0.404
               (0.369)     (0.693)     (0.450)    (0.434)   (0.445)   (0.429)
International
China          -0.626      -0.162       0.179     -0.173    -0.0680   -0.0532
               (1.523)     (0.813)     (0.369)    (0.366)   (0.315)   (0.330)
×               0.131       0.130      -0.439      0.264     0.0771    0.0226
               (0.338)     (0.853)     (0.510)    (0.527)   (0.574)   (0.528)

Notes: Standard errors in parentheses; ∗p < 0.1, ∗∗p < 0.05, ∗∗∗p < 0.01.

Table 3.6 shows the estimates of the dummy of Chinese features and its interaction terms. Columns (1)-(6) correspond to the estimations where this dummy interacts with the production budget, the Hollywood dummy, format, action, sci-fi/fantasy, and sequel, respectively. Interacting the dummy of adding Chinese features with other characteristics yields insignificant estimates, except for Hollywood movies in the domestic market: adding Chinese features to Hollywood movies has a significant and negative impact on the domestic box office, compared to non-Hollywood movies. Notice that, while 18 Hollywood movies added Chinese features, only 2 non-Hollywood movies did so, and these two movies may not be representative.

For the regional markets, the estimates are also insignificant, though qualitatively, the estimates for AP look similar to those for the Chinese market, and the estimates for EMA are closer to those for the international market.10

3.4.2 Other Movie Characteristics

As expected, production budget, sequels, and user scores on Rotten Tomatoes all have positive and significant effects on box office revenues, regardless of the market. A 1% increase in production budget leads to a 0.74% increase in the Chinese box office revenue and a 0.68% increase in the international market, much higher than the 0.28% increase in the domestic market. The effect of being a sequel is also larger in the Chinese market and the international market, while the effect of user scores is strongest in the domestic market. Star power is positive and significant in the domestic market and the international market. Additionally, the international market sees a positive and significant effect of director power. Neither of these two powers has a significant effect in the Chinese market. While the audience in the domestic market has a preference for major studio movies and Hollywood movies, only movies produced by Fox perform significantly better in the Chinese market and the international market. Since a non-Hollywood movie needs to perform well internationally to succeed in the domestic market, Hollywood movies achieve lower box office revenues in the international market. Horror is the only genre that yields a higher revenue in all three markets than the baseline group (Other). Due to the weak performance of the 4 movies categorized as "Other" in the Chinese market, all other genres have significantly higher revenues there. Revenue-shared movies do not perform better in the Chinese market, nor do 3D or IMAX movies. The dummy variables that label the U.S. and Chinese holiday weeks have no significant impact on demand. Appendix E.2 provides the estimates of movie characteristics.

3.4.3 Robustness Tests

Estimating demand in each market separately may not be as efficient as estimating demand in all markets together, if the error terms are correlated across markets. For a movie that is released to different markets, there could be some unobservable features of the movie that shift demand in all markets. Therefore, as a robustness test, I use Seemingly Unrelated Regressions (SUR) to jointly estimate demand in all markets and allow for correlation between the error terms. However, not all movies in my dataset were released to every market. When estimating demand in the domestic market, the Chinese market, and the international market together, only 168 movies can be included in the estimation. If I further include the three regional markets (EMA,

10See Table E.12 for the estimates.

LA, and AP), the number of movies drops to 166. Dropping movies that were not released to every market may introduce sample selection issues.

Table 3.7: Effect of Adding Chinese Features in SUR

             (1)        (2)       (3)       (4)       (5)       (6)
           Domestic   Chinese    Intl.     EMA       LA        AP
3 Markets   -0.0993    0.455**    0.117
            (0.137)   (0.185)   (0.139)
6 Markets   -0.120     0.424**    0.0983   -0.0267    0.229     0.247*
            (0.137)   (0.186)   (0.138)   (0.150)   (0.192)   (0.149)

Notes: Standard errors in parentheses; ∗p < 0.1, ∗∗p < 0.05, ∗∗∗p < 0.01.

Table 3.7 summarizes the estimation results for the dummy of adding Chinese features when estimating 3 markets or 6 markets jointly. Regardless of the number of markets included, each estimate has the same sign as in the separate estimations, except for the international market, where there is now a positive but insignificant effect. Stata reports a p-value of 0.0000 for the Breusch-Pagan test of independence in each of the 3-market and the 6-market estimations; hence, the error terms of different markets are indeed correlated. With the gains in efficiency, the coefficient for the Chinese market is positive and significant at the 95% confidence level, and the coefficient for AP becomes positive and significant at the 90% confidence level, with a higher magnitude (0.247, compared to 0.129 when estimated separately). Given the cultural similarities between China and several Asian countries and regions, a positive effect of adding Chinese features in Asia Pacific is expected. Further, restricting the sample to movies released in all markets (with the Chinese market being the strongest restriction) yields an insignificant but smaller effect on the revenues in EMA (-0.0267) compared to the coefficient estimated using EMA alone (-0.187). Together with the change in AP, we can no longer observe the negative impact on the international market.
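The Breusch-Pagan test of independence reported by Stata can be reproduced from the residual correlations of the equation-by-equation OLS fits. Below is a minimal numpy sketch, under the assumption that resid is an N × M array of stacked residuals (N movies, M market equations); it illustrates the test statistic only, not the estimation routine I used.

    import numpy as np
    from scipy import stats

    def breusch_pagan_independence(resid: np.ndarray):
        # resid: (N, M) residuals from M separately estimated equations.
        n, m = resid.shape
        corr = np.corrcoef(resid, rowvar=False)
        # LM = N * sum of squared cross-equation correlations (i > j).
        lm = n * float(np.sum(np.tril(corr, k=-1) ** 2))
        dof = m * (m - 1) // 2
        return lm, dof, stats.chi2.sf(lm, dof)

A small p-value, like the 0.0000 reported above, rejects the independence of the error terms and motivates the joint SUR fit.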

3.5 Counterfactuals

Based on the estimation results from the main regressions, adding Chinese features has a significant and positive effect in the Chinese market, and negative but insignificant effects in the domestic market and the international market. As shown in Table 3.3, these three markets have different scales, with the Chinese market being the smallest. Therefore, to analyze the overall effect of adding Chinese features, I compare the changes in box office revenues in all three markets directly and conduct three counterfactual analyses on the 168 movies that were imported to China: 1) removing Chinese features from movies with Chinese features; 2) adding Chinese features to movies without Chinese features; and 3) varying the Chinese market size to

test the effects again.

3.5.1 Removing Chinese Features

17 out of the 20 movies with Chinese features have been imported to China. On average, removing Chinese features (China_j changes from 1 to 0) would decrease the revenue in China by $23.64 million, and increase revenues in the domestic market ($12.59 million) and the international market ($11.21 million). The average net effect is a small increase of $0.16 million.

Table 3.8: Revenue Changes in Different Markets ($mln)

year  title                             DOM      ∆      CHN      ∆       INT      ∆      net     ratio
2014  Transformers: Age of Extinction  245.4   24.91   320.0  -109.2    538.6   24.55   -59.73  28.98%
2013  Pacific Rim                      101.8   10.33   111.9   -38.20   197.3    8.99   -18.87  27.24%
2014  The Expendables 3                 39.32   3.99    72.87  -24.86   102.5    4.67   -16.20  33.95%
2012  Cloud Atlas                       27.11   2.75    27.71   -9.45    75.66   3.45    -3.25  21.24%
2012  Battleship                        65.42   6.64    50.20  -17.13   187.4    8.54    -1.95  16.57%
2014  Transcendence                     23.02   2.34    20.28   -6.92    59.74   2.72    -1.86  19.68%
2012  The Expendables 2                 85.03   8.63    53.60  -18.29   176.4    8.04    -1.62  17.02%
2014  X-Men: Days of Future Past       233.9   23.74   116.5   -39.75   395.6   18.04     2.03  15.61%
2012  Looper                            66.49   6.75    20.80   -7.10    89.22   4.07     3.72  11.78%
2011  Kung Fu Panda 2                  165.3   16.77    92.17  -31.45   408.3   18.61     3.93  13.85%
2012  Resident Evil: Retribution        42.35   4.30    17.76   -6.06   180.1    8.21     6.45   7.40%
2014  Fury                              85.82   8.71    19.18   -6.54   106.8    4.87     7.03   9.05%
2011  The Green Hornet                  98.78  10.02    19.80   -6.76   109.2    4.98     8.25   8.69%
2012  This Means War                    54.76   5.56     2.70   -0.92    99.03   4.51     9.15   1.73%
2014  Lucy                             126.7   12.85    44.78  -15.28   287.4   13.10    10.68   9.76%
2013  Fast & Furious 6                 238.7   24.22    66.49  -22.69   483.5   22.04    23.58   8.43%
2013  Iron Man 3                       409.0   41.51   121.2   -41.35   684.6   31.21    31.36   9.98%

Notes: DOM is the domestic market revenue; CHN is the Chinese market revenue; INT is the international market revenue; ratio is the percentage of Chinese market revenue in total revenue.

As listed in Table 3.8, the net change from removing Chinese features is highly correlated with the movie's performance in China, measured by the ratio of Chinese box office revenue to total revenue. Transformers: Age of Extinction (2014) generated nearly 30% of its revenue in China, and would expect a big decrease of nearly $110 million in the Chinese market. For Iron Man 3 (2013), with less than 10% of total revenue generated in China, adding Chinese features caused a loss in the domestic market larger than the gain in China. Overall, 10 out of 17 movies would have gained from not adding Chinese features. While the benefit in China is significant, this change of characteristics can be costly when all markets are considered.

3.5.2 Adding Chinese Features

Now I show the change in revenues in different markets for the 151 movies that were imported to China but do not have Chinese features, had they included Chinese features instead. On average, adding Chinese features to these movies would improve their box office revenues in China by $15.89 million, but generate a loss of $11.75 million in the domestic market and a loss of $9.22 million in the international market. The net change would be a decrease of $5.07 million worldwide.

Figure 3.2: Net Change in Revenue for Movies without Chinese Features (histograms of the net and domestic revenue changes, in $ million)

Figure 3.2 shows the distribution of box office revenue changes. The net changes are, again, highly correlated with the ratio of Chinese box office revenue to total revenue. Titanic 3D (2012), the re-release of the classic movie in 3D format, made nearly 45% of its total revenue in China; though not feasible, adding Chinese features would lead to a net gain of nearly $70 million for it. In contrast, one 2013 release that made huge gains domestically and internationally generated only 4% of its revenue in the Chinese market; adding Chinese features would cost the movie nearly $50 million in the worldwide box office. Overall, 105 out of these 151 movies would have made less total revenue, had they included Chinese features.

3.5.3 Varying Market Size

As discussed previously, the net change from adding Chinese features to a movie depends on how much of the movie's revenue is generated in the Chinese market. The fast growth of the Chinese market can make it more profitable to add Chinese features. From 29.64 billion CNY in 2014 to 64.27 billion CNY in 2019, the total revenue of the Chinese box office increased by 116.8%. Although the increase in total revenue may not lead to the same change in the revenue of each imported movie, these imported movies could expect higher revenues with the

expansion of the market. If we take this change in the market size into account, should more movies add Chinese features? To answer this, I increase the Chinese box office revenue of each movie released in China by 25%, 50%, 75%, and 100%, apply the same effect of adding Chinese features (0.417), and then compare the net changes worldwide, as well as how many movies would benefit from adding Chinese features; a minimal sketch of this calculation follows Table 3.9 below.

Table 3.9: Adding Chinese Features after Changing Market Size

                 Original   +25%     +50%     +75%     +100%
∆Chinese           16.68    20.85    25.02    29.19    33.36
∆Net               -4.577   -0.407    3.762    7.932   12.10
Positive ∆Net         53       72       81       98      106
Positive %         31.55%   42.86%   48.21%   58.33%   63.10%

Notes: Revenue changes in $ million.
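The exercise behind Table 3.9 is mechanical given the point estimates reported earlier (0.417 for China, -0.0967 for the domestic market, -0.0446 for the international market). The sketch below applies them to hypothetical revenue arrays for movies without Chinese features, scaling the Chinese revenue by the assumed market growth; movies that already have Chinese features would use the reverse (removal) transformation, omitted here.

    import numpy as np

    # Point estimates from the separate market regressions (Table 3.5).
    D_CHN, D_DOM, D_INT = 0.417, -0.0967, -0.0446

    def net_change(dom, chn, intl, growth=0.0):
        # Net worldwide revenue change from adding Chinese features,
        # after scaling the Chinese market revenue by (1 + growth).
        d_chn = (np.exp(D_CHN) - 1.0) * chn * (1.0 + growth)
        d_dom = (np.exp(D_DOM) - 1.0) * dom
        d_int = (np.exp(D_INT) - 1.0) * intl
        return d_chn + d_dom + d_int

    # Hypothetical revenues ($ million) for three imported movies.
    dom = np.array([80.0, 240.0, 40.0])
    chn = np.array([20.0, 65.0, 70.0])
    intl = np.array([120.0, 480.0, 100.0])

    for g in (0.0, 0.25, 0.50, 0.75, 1.00):
        delta = net_change(dom, chn, intl, growth=g)
        print(g, round(float(delta.mean()), 2), int((delta > 0).sum()))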

As illustrated in Table 3.9, as the Chinese market size increases, the benefit of adding Chinese features in the Chinese market would increase and gradually make up for the loss in the domestic market and the international market for more and more movies. The average effect would increase from a net loss of $4.58 million (original market size) to a net gain of $12.10 million (if the market size doubles). While only 32% of these 168 movies could benefit from adding Chinese features under the original market size, over 63% would make more profit if the market size doubles. Therefore, as the Chinese movie market continues to grow, we would expect more movies to include Chinese features, which is also observed in reality, with more movies co-produced with Chinese companies and more Chinese actors invited to act in international movies.

3.6 Conclusion

In this chapter, I examined the effect of a change in product characteristics on the demand for a product, using the case of movie studios adding Chinese features to their movies in response to the fast growth of the Chinese movie market. I collected data on movie characteristics and box office revenues for 501 movies released in the domestic market in 2011-2014, with the revenues in the domestic market, the Chinese market, and the international market collected. I categorized a movie with at least one of the following characteristics as a movie with Chinese features: 1) co-produced with a Chinese company, 2) having at least one Chinese cast member, 3) filmed at one or more Chinese locations, and 4) having Chinese culture in its plot. Then, I estimated the effect of this dummy variable of having Chinese features on the box office revenues in different markets, while controlling for multiple movie characteristics and market-specific characteristics.

I found that adding Chinese features significantly increases a movie's box office revenue in China (a 51.74% boost), and has a negative but insignificant effect on the domestic market and the international market. If the worldwide box office revenue is considered, only about 32% of the 168 movies that were imported to China would benefit from adding Chinese features, and the average net change across all markets is a loss of $4.58 million. While it is not beneficial for most of these movies to add Chinese features, this change could become more attractive as the Chinese market continues to grow. If these movies could double their revenues in China, 63% of movies would gain after adding Chinese features, with an average net change of $12.10 million. This could explain why more Hollywood movies have added Chinese features nowadays: since 2014, there has been an increase in the number of movies that include Chinese features.

With a more comprehensive dataset that includes more years and more markets, for instance, at the country level, this chapter can be extended to further inspect the cultural effect in markets with different demographics. With more detailed data on individual markets, such as advertising expenditures or competition intensity in China, the demand estimation can be improved. Another extension is to consider the competition between movie studios to get their movies imported by China. This could be related to a studio's choice of adding Chinese features and may address the sample selection concern of imported movies.

APPENDIX A

Data Collection

A.1 Dataset for Reviews and Advertising

This dataset was initially collected between September 2016 and March 2017, and is used in Chapter 1 and Chapter 2.

A.1.1 Box Office Performance

I first collect the weekly box office performance (week from date to date, weekly revenue, the number of theaters, accumulated revenue) for the top 200 highest-grossing movies in each year of 2011-2015. Among these 1000 movies, I first drop 16 movies that are re-releases, then drop 251 movies that were limited-release throughout their lifetime.

Wide-release Date

For movies that are released in fewer than 600 theaters in their first week, I check the number of theaters in the weeks after their original release. The wide release date is based on the release date on IMDb, which lists the wide release date if the movie is widely released. I add another condition to further restrict the date, requiring that the number of theaters at the wide release pass 600¹, and that no later week passes twice the number of theaters of the wide release. For example, August: Osage County (2013) has its wide release date as Jan. 10, 2014 on IMDb, with 905 theaters showing the movie. However, the following week had 2051 theaters showing the movie. In my dataset, the wide release date is Jan. 17, 2014

1The Impossible (2013) was released in 572 theaters on its wide release date listed on IMDb. I also treat it as a wide release in my dataset.

instead.2 This is to limit the effect of release scale on the revenue pattern, since such a large increase in the number of theaters can strongly increase the revenue in the second week.

Further Cleaning

Since I am looking at the effect of critic reviews and audience reviews before and after release, I need to deal with movies that have a long limited-release period. The longer a movie stays on the market, the more consumers may have seen it; thus, the audience may have been affected by audience reviews before the wide release. I drop the movies that are wide-released more than 4 weeks after their original release; 22 movies are dropped in this process. To limit the effect of release scale, I further drop 4 movies that doubled the number of theaters in a week compared to the previous week, since this type of release scale change can again strongly affect the revenue. Another movie, Dragon Ball Z: Resurrection 'F' (2015), is dropped because the number of theaters fluctuates within the two weeks of its life. After these cleaning processes, there are 706 movies left in the dataset. Note that for some movies originally released in December, the wide release may be in the next January.

Adjusting Opening Week Revenue

Movies usually have high revenue on their wide-release date. As the weekly box office revenue is calculated from Friday to Thursday, movies released before Friday will have an opening week revenue with fewer than 7 days counted. Therefore, I adjust the opening week revenue for all the movies that are not released on a Friday. Since I can obtain daily revenue from Box Office Mojo, the opening week (week 1) revenue in my dataset is the sum of the revenues of the first 7 days. This makes the opening week revenue comparable across all movies. For the second week revenue, I stick to the weekly revenue accumulated from Friday, but use different weeks for 2 types of movies: 1) if a movie is released on Tuesday, Wednesday, or Thursday, the second week revenue is the actual third week revenue from Box Office Mojo (with revenue on the second Tuesday, Wednesday, or Thursday not used); 2) if a movie is released on Saturday, Sunday, or Monday, the second week revenue is the actual second week revenue from Box Office Mojo (with revenue on the first Friday, Saturday, or Sunday used twice). Besides, if a movie has generated some revenue on the day before its opening date, that revenue is added to the opening week revenue. Some movies may have box office revenues for a short period of time, then have 0 theaters showing the movie until their release. I do not consider the week containing early revenue as the opening week, nor do I include early revenue in the opening week revenue. In total, fewer than 20% of movies have adjusted opening week revenues. Most of them are released in the summer, around Thanksgiving Day, or around Christmas.
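A minimal sketch of the adjustment, assuming a list of daily revenues indexed from the wide-release date, the Friday-anchored weekly totals, and the release weekday; the function names are mine.

    def opening_week_revenue(daily, day_before=0.0):
        # Week 1: the sum of the first 7 days from the release date,
        # plus any revenue generated on the day before the opening date.
        return sum(daily[:7]) + day_before

    def second_week_revenue(weekly, release_weekday):
        # weekly: Friday-to-Thursday totals, weekly[0] being the
        # (possibly partial) calendar week containing the release.
        if release_weekday in ("Tue", "Wed", "Thu"):
            # Use the actual third Friday-anchored week as week 2.
            return weekly[2]
        # Friday, Saturday, Sunday, and Monday releases use the actual
        # second Friday-anchored week.
        return weekly[1]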

2This movie is not in my final dataset.

Figure A.1: Webpage of Batman v Superman: Dawn of Justice on Metacritic

Figure A.2: Review Sections of Batman v Superman: Dawn of Justice on Metacritic

A.1.2 Collecting Online Reviews

I collect critic reviews and user reviews from Metacritic; top critic, all critic, and user reviews from Rotten Tomatoes; and user reviews from IMDb.

Metacritic On Metacritic, the "Metascore" from critics is highlighted on a movie's webpage under the movie title, as in Figure A.1. As viewers scroll down the page, the sections of critic reviews and user reviews show up, as in Figure A.2; the Metascore remains at the top right of the webpage when scrolling down. For each critic review, I collect the critic's name, the media outlet he/she is from, the posted time, the score (on a scale from 0 to 100), and the content. For each user review, I collect the username, the posted time, the score (on a scale from 0 to 10), and the content. While I can obtain all the reviews from critics that provide scores, for user reviews I can only access those with actual review content. Thus, I can only observe about 1,300 user reviews for Batman v Superman: Dawn of Justice, though 3,535 users gave it a positive score. For both types of reviews, I document every review in chronological order.

Figure A.3: Webpage of Batman v Superman: Dawn of Justice on Rotten Tomatoes

Figure A.4: Critic Reviews of Batman v Superman: Dawn of Justice on Rotten Tomatoes

Rotten Tomatoes On Rotten Tomatoes, viewers need to scroll down the webpage to see the "Tomatometer" and "Audience Score" section (as shown in Figure A.3). The default Tomatometer is calculated from all critics' reviews, while viewers can select "Top Critics" to see the other. To see each critic review, viewers need to scroll down further to find the critic reviews section, and click on either "All Critics" or "Top Critics". Figure A.4 provides an example of the format of the reviews available. Again, I collect the critic's name, media outlet, posted time, and rating. On Rotten Tomatoes, critic scores are standardized to "Rotten" or "Fresh", and the aggregate rating is the percentage of Fresh reviews. I do the same for both "All Critics" and "Top Critics", and document the reviews in chronological order. Potentially, I could also collect the audience score on Rotten Tomatoes, which generally has a larger number of reviews than Metacritic. However, Rotten Tomatoes shows the reviews in chronological order from the latest to the oldest, and only shows the first 50 pages (with 20 reviews on each page). Therefore, it is impossible to access the early reviews posted right after a movie's release.

Figure A.5: Example of Google Search Results Panel

Taking Average of Reviews As shown in Google search results (Figure A.5), I use the Metascore on Metacritic and the Tomatometer from all critics on Rotten Tomatoes to represent critic reviews. For audience reviews, I use user ratings on Metacritic and IMDb. Since I want to track the actual information consumers can access, I utilize the posted time of each review and aggregate the reviews by time: I take the average of all the reviews posted before a given week and use that average as the signal for the week. For example, if a movie was released on Jan. 7, 2011 (a Friday), the signal for the opening week (Jan. 7-13, 2011) is the average of all the reviews posted before Jan. 7, 2011. For the second week (Jan. 14-20, 2011), the signal is the average of all the reviews posted before Jan. 14, 2011. Thus, as newer reviews come in, the signal is updated.
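This time aggregation can be sketched as follows (an illustrative helper, assuming reviews are stored as (posted_date, score) pairs):

```python
# The signal for week w is the mean of all scores posted strictly before
# the start of that week, so the signal updates as new reviews arrive.
from datetime import date, timedelta

def weekly_signals(reviews, release_date, n_weeks):
    """reviews: list of (posted_date, score); release_date: a datetime.date."""
    signals = []
    for w in range(n_weeks):
        cutoff = release_date + timedelta(weeks=w)
        scores = [s for d, s in reviews if d < cutoff]
        signals.append(sum(scores) / len(scores) if scores else None)
    return signals

# Week 1 uses reviews posted before the release date; week 2 uses reviews
# posted before release_date + 7 days; and so on.
```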

For critic reviews, the average score barely changes after the opening week, as most critic reviews are posted before release (except for cold-opened movies) and are rarely posted after the opening week. For audience reviews, the average score changes almost every week, as more people watch and review the movie.

A.1.3 Movie Characteristics

Movie characteristics are directly collected from a movie’s webpage on IMDb and Box Office Mojo. I use IMDbPro, the paid service of IMDb, to collect the origin of a production company, as well as the rankings of directors and actors.

Production Companies For each movie, there are usually multiple production companies that co-produce it. Since my goal is to check the effect of major studios, I consider the 6 major studio parents (NBCUniversal, Walt Disney Studios, Warner Bros. Entertainment, Fox Filmed Entertainment, Sony Pictures Motion Picture Group, and Paramount Motion Pictures Group) and label the movies produced or co-produced by any of them. Notice that these 6 major studio parents not only have a major film studio unit, but also own several smaller studios.3 I also consider movies produced or co-produced by these smaller studios as produced by the corresponding major studio parents (see Table A.1 for the list of studios owned by the 6 major studio parents).

Table A.1: Major Studio Parents and Their Studios

Studio Parent    Major Unit              Art-house / Indie           Genre / B movie    Animation
NBCUniversal     Universal Pictures      Focus Features              Working Title      Universal Animation Studios, Illumination Entertainment
Walt Disney      Walt Disney Pictures                                Lucasfilm          Lucasfilm Animation, Pixar, Walt Disney Animation Studios
Warner Bros.     Warner Bros. Pictures
Fox              20th Century Fox        Fox Searchlight Pictures    Fox Faith          20th Century Fox Animation
Sony Pictures    Columbia Pictures       Sony Pictures Classics      Affirm Films
Paramount        Paramount Pictures      Paramount Vantage

Notes: I do not include other divisions and brands in which studio parents hold shares. Fox 2000 is a division of 20th Century Fox and is active in production; although it is not listed on Wikipedia, I still include it as a studio of Fox.

3Wikipedia: https://en.wikipedia.org/wiki/Major_film_studio

On IMDb, a movie's webpage lists 3 production companies. I collect the names of these 3 companies and use IMDbPro to get their origins (country of location). This list of 3 production companies usually contains the main companies producing the movie; thus, major studios and the smaller studios they own are much more likely to show up on this list. However, I double-check the full list of production companies on IMDbPro, and update the list if a company owned by a major studio is missing from the original 3-company list. I construct 6 dummy variables for the 6 major studios. A movie can be labeled with multiple major studios if it is a co-production; in my dataset, a movie is labeled with at most 2 major studios. I also construct a dummy variable to label a major studio production, with its value equal to 1 if the movie is produced by any of the 6 major studios.
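A stylized sketch of the dummy construction (the mapping shown is truncated and hypothetical; the full sets come from Table A.1):

```python
# Label a movie with major-studio dummies based on its production companies.
MAJOR_STUDIOS = {
    "Disney": {"Walt Disney Pictures", "Pixar", "Lucasfilm"},
    "Warner": {"Warner Bros. Pictures"},
    # ... the remaining parents and their studios, as listed in Table A.1
}

def studio_dummies(production_companies):
    companies = set(production_companies)
    dummies = {parent: int(bool(companies & studios))
               for parent, studios in MAJOR_STUDIOS.items()}
    dummies["Major"] = int(any(dummies.values()))  # any major studio involved
    return dummies
```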

Country To control for the effect of the countries that produce a movie, I use the "Country" information on IMDb. A movie can be produced by several countries. I use three categories: 1) produced by the U.S. (USA is the only country listed); 2) co-produced by the U.S. (multiple countries are listed and USA is one of them); 3) others.

MPAA Rating The MPAA rating has 4 categories: G, PG, PG-13, and R. The information is the same on Box Office Mojo and IMDb. I construct three dummy variables for the MPAA rating.

Directors and Actors Movie stars may have a strong effect on box office performance. I collect all the directors (at most 3 in the dataset4) and 3 actors for each movie. For simplicity, the list of 3 actors is taken directly from the "Stars" list in the section right under the movie title on a movie's IMDb webpage. While it may not include the top 3 stars for all movies, using it avoids the selection issues that would arise if I tried to choose the top 3 stars myself. IMDbPro provides "STARmeter", "MOVIEmeter", and "COMPANYmeter", which rank people, titles, and companies on IMDb.5 These rankings show popularity among IMDb users: a higher ranking (a lower meter value) means more popularity. IMDb updates these meters every week. I use STARmeter (which covers both directors and actors) to indicate star power. By checking the page source, I can observe the STARmeter for each person on every Sunday at midnight. To avoid the effect that directors or actors become famous due to the popularity of their movies after wide release, I document the STARmeter of each director and actor at the nearest Sunday midnight before their movie's wide release. To aggregate star power, I take the average of the natural logarithm of the STARmeter over the directors and over the actors, constructing two variables that reflect director power and actor power.
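The aggregation step can be written compactly (a sketch; the STARmeter values are assumed to be already matched to the pre-release Sunday):

```python
import math

def power_index(starmeters):
    """Average natural log of STARmeter ranks for a movie's directors (or,
    separately, its actors); a lower STARmeter means more popularity."""
    return sum(math.log(m) for m in starmeters) / len(starmeters)
```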

4Movie 43 (2013) contains several short stories, each having a director. I only document the first 3.
5IMDb Help Center: http://www.imdb.com/help/show_leaf?prowhatisstarmeter

Genre Box Office Mojo and IMDb both categorize movies into genres, but the two sources do not perfectly match for most movies. I categorize the movies into 8 general genres: Action, Animation, Comedy, Drama, Horror, Sci-Fi / Fantasy, Thriller, and Other, using the information from Box Office Mojo and IMDb. I apply specific rules, sketched in code after the list below, and a movie can be labeled with at most 2 genres. The basic genre is the genre listed on Box Office Mojo.

• Action: 1) if the basic genre contains “Action”; 2) if the basic genre is “Adventure” and the IMDb genres include “Action”.

• Animation: if the basic genre is “Animation”.

• Comedy: 1) if the basic genre contains “Comedy”; 2) if the basic genre is “Adventure” and the IMDb genres include “Comedy”; 3) if the basic genre starts with “Family” and the IMDb genres include “Comedy”.

• Drama: 1) if the basic genre contains “Drama”; 2) if the basic genre is “Adventure” and the IMDb genres include “Drama” or “Crime”; 3) if the basic genre starts with “Crime”; 4) if the basic genre starts with “Family” and the IMDb genres include “Drama”; 5) if the basic genre is “Romance” or “Action / Crime”.

• Horror: if the basic genre contains “Horror”.

• Sci-Fi / Fantasy: 1) if the basic genre contains “Sci-Fi” or “Fantasy”; 2) if the basic genre is “Action / Adventure” or “Adventure”, and the IMDb genres include “Sci-Fi” or “Fantasy”.

• Thriller: if the basic genre contains “Thriller”.

• Other: if the movie is not labeled under the rules above.
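These rules translate into a short sketch (a hypothetical helper; the tie-breaking when more than 2 genres match, keeping the first two in the order above, is my assumption, since the text only caps the number of labels at 2):

```python
def classify_genre(basic, imdb):
    """basic: Box Office Mojo genre string; imdb: set of IMDb genre strings."""
    labels = []
    if "Action" in basic or (basic == "Adventure" and "Action" in imdb):
        labels.append("Action")
    if basic == "Animation":
        labels.append("Animation")
    if ("Comedy" in basic
            or (basic == "Adventure" and "Comedy" in imdb)
            or (basic.startswith("Family") and "Comedy" in imdb)):
        labels.append("Comedy")
    if ("Drama" in basic
            or (basic == "Adventure" and imdb & {"Drama", "Crime"})
            or basic.startswith("Crime")
            or (basic.startswith("Family") and "Drama" in imdb)
            or basic in {"Romance", "Action / Crime"}):
        labels.append("Drama")
    if "Horror" in basic:
        labels.append("Horror")
    if ("Sci-Fi" in basic or "Fantasy" in basic
            or (basic in {"Action / Adventure", "Adventure"}
                and imdb & {"Sci-Fi", "Fantasy"})):
        labels.append("Sci-Fi / Fantasy")
    if "Thriller" in basic:
        labels.append("Thriller")
    return labels[:2] or ["Other"]  # at most 2 labels; "Other" if none match
```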

Format I use two dummies to label movies that are released in 3D format or in IMAX theaters. Most of these movies also provide a 2D version. These two variables capture the effect that 3D or IMAX movies are priced higher, and that the format may also enter consumers' utility. On Box Office Mojo, these movies have a ranking under "3D" / "IMAX (Feature-Length)" in the "Genres" section.

Production Budget Data on the production budget are a combination of "Production Budget" on Box Office Mojo and "Budget" on IMDb. Data may be available on either site, and when available on both sites, the numbers are the same. A small number of movies have their budgets listed in a foreign

currency on IMDb. For these movies, I convert the budget to U.S. dollars using the average exchange rate of the year when the movie was released.6 Note that the production budget refers to the cost of production, which does not include marketing or other expenditures. It is also an estimated number, as stated by IMDb.

Sequel Sequels usually give the audience less uncertainty before watching the movie. They may also give a higher utility to the audience, since the earlier works in a series need to be successful for a sequel to be made. To evaluate the effect, I label each movie that is in a series but is not the first movie in the series. The easiest indicator is the movie title, such as Scary Movie 5 (2013), Insidious: Chapter 2 (2013), and 22 Jump Street (2014). In addition, I check the "Franchises" section on Box Office Mojo. To be labeled as a sequel, the movie must be included in a "Series" or under a "Brand" with similar movie titles. Remakes and re-openings are considered the start of a new series. For example, Man of Steel (2013) is considered the start of a new Superman series, since the previous one, Superman Returns (2006), was released 7 years earlier and the main cast was different.

Holidays On Box Office Mojo, weekend records for several holiday weekends are listed under the "All Time Box Office" tab.7 Following the list, I label movies that are released close to the following holidays: Martin Luther King Day, President's Day, Easter, Memorial Day, Independence Day, Labor Day, Halloween, Thanksgiving, and Christmas.8

Table A.2: U.S. Holiday Weeks in 2011-2015

Holiday              2011             2012             2013             2014             2015
Martin Luther King   Jan 10 - 16      Jan 9 - 15       Jan 14 - 20      Jan 13 - 19      Jan 12 - 18
President's Day      Feb 14 - 20      Feb 13 - 19      Feb 11 - 17      Feb 10 - 16      Feb 9 - 15
Easter               Apr 18 - 24      Apr 2 - 8        Mar 25 - 31      Apr 14 - 20      Mar 30 - Apr 5
Memorial Day         May 23 - 29      May 21 - 27      May 20 - 26      May 19 - 25      May 18 - 24
Independence Day     Jun 27 - Jul 3   Jul 2 - 8        Jul 1 - 7        Jun 30 - Jul 6   Jun 29 - Jul 5
Labor Day            Aug 29 - Sep 4   Aug 27 - Sep 2   Aug 26 - Sep 1   Aug 25 - 31      Aug 31 - Sep 6
Halloween            Oct 24 - 30      Oct 29 - Nov 4   Oct 28 - Nov 3   Oct 27 - Nov 2   Oct 26 - Nov 1
Thanksgiving         Nov 21 - 27      Nov 19 - 25      Nov 25 - Dec 1   Nov 24 - 30      Nov 23 - 29
Christmas            Dec 19 - Jan 1   Dec 17 - 30      Dec 16 - 29      Dec 15 - 28      Dec 14 - 27

Easter, Memorial Day, Independence Day, Halloween, and Thanksgiving are all treated as holidays with a 3-day weekend. If a movie is released during the week (Monday to Sunday)

6For yearly average exchange rates, I use https://www.irs.gov/individuals/international-taxpayers/yearly-average-currency-exchange-rates
7All Time Box Office on Box Office Mojo: https://www.boxofficemojo.com/alltime/
8Although Christmas is not listed on Box Office Mojo, there are usually more movies released around Christmas Day than in a normal week. In my dataset, there were 4-9 movies per year from 2011 to 2014 that were released on or within 5 days before Christmas Day.

that covers one of these holiday weekends, I label it as released in the corresponding holiday week. Martin Luther King Day, President's Day, and Labor Day are holidays with a 4-day weekend (the 3-day weekend plus Monday); I label a movie with the corresponding holiday week dummy if it is released during the week that covers one of these 4-day weekends. For Christmas, I label movies that are released during the week that covers Christmas Day, and the week before. More specifically, Table A.2 shows the holiday weeks defined for 2011-2015.
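A small sketch of this labeling (the data structure and names are hypothetical; the dates come from Table A.2):

```python
# Label a movie's release with holiday-week dummies, following Table A.2.
from datetime import date

# (holiday, year) -> (first day, last day) of the defined holiday week.
HOLIDAY_WEEKS = {
    ("Thanksgiving", 2013): (date(2013, 11, 25), date(2013, 12, 1)),
    ("Christmas", 2013): (date(2013, 12, 16), date(2013, 12, 29)),
    # ... remaining (holiday, year) pairs from Table A.2
}

def holiday_dummies(release_date):
    dummies = {}
    for (holiday, _year), (start, end) in HOLIDAY_WEEKS.items():
        hit = int(start <= release_date <= end)
        dummies[holiday] = max(dummies.get(holiday, 0), hit)
    return dummies
```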

A.1.4 Advertising Expenditures

AdSpender is a database of advertising expenditures provided by Kantar Media Intelligence. It covers advertising expenditures across multiple media and is widely used in research. I use weekly advertising expenditures to reflect a movie's advertising intensity. In AdSpender, I have access to advertising expenditures at the broadcast week level, where a broadcast week is defined as Monday through Sunday. The earliest week available to me starts on Jan. 31, 2011. For each broadcast week from Jan. 31, 2011 to Jan. 4, 2016, I restrict the category to "Motion Picture" and generate a report that contains advertising expenditures for all the movies advertised during that week. The focus is on total advertising expenditure, aggregated over Network TV, Cable TV, Syndication, Spot TV, Magazines, Sunday Magazines, National Newspapers, Newspapers, Network Radio, National Spot Radio, and Outdoor.

For each movie, I collect 35 weeks of advertising expenditures: 1) the broadcast week that includes the wide-release date, considered as the opening week (week 0); 2) the 24 weeks before the opening week (weeks -24 to -1); and 3) the 10 weeks after the opening week (weeks 1 to 10). I match the total advertising expenditures in each report to the advertising expenditure of a movie in a specific week. Normally a movie has individual advertisements, but some movies may also be advertised as a bundle, or be advertised together with a product in another industry (like insurance). Since the individual advertising expenditures are usually higher, I do not account for combined advertisements. Two movies changed names in the database (Atlas Shrugged: Part II (2012) and Peeples (2013)); I combine the advertising expenditures under their different names.

Because broadcast weeks begin on Mondays, the advertising expenditure in the opening week, defined as the broadcast week that includes the release date, may include post-release days. Since about 90% of the movies are released on Thursday or Friday, for simplicity, I do not adjust opening-week advertising when accumulating advertising expenditures.
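The week indexing can be illustrated as follows (the function names and data layout are assumptions, not the actual collection code):

```python
# Sketch of the 35-week advertising window: week 0 is the broadcast week
# (Monday through Sunday) containing the wide-release date.
from datetime import date, timedelta

def broadcast_week_start(d):
    """Monday of the broadcast week containing date d."""
    return d - timedelta(days=d.weekday())

def ad_week_index(report_week_start, release_date):
    """Broadcast-week index relative to the opening week (week 0)."""
    return (report_week_start - broadcast_week_start(release_date)).days // 7

# A weekly report is kept for a movie only if -24 <= index <= 10:
idx = ad_week_index(date(2013, 6, 10), date(2013, 6, 14))
print(idx)  # 0: the June 10-16 broadcast week contains the June 14 release
```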

A.2 Dataset for Chinese Features

This dataset was initially collected between November 2015 and April 2016, with additional data on box office revenues collected in April 2020. It is used in Chapter 3.

A.2.1 Movie Selection

From 2011 to 2014, there were approximately 730-850 movies released in the domestic market each year. For each year, I first drop all the movies that grossed less than $10 million in the domestic market; that is, I focus on the "bigger" movies that attracted a larger audience. This leaves me with about 140-150 movies per year. Among them, I drop the movies with no available estimated budget, since the budget is an important variable that I need to control for. There are 36 such movies, and most of them are relatively small (with lower domestic box office revenue rankings). As for production companies, genres, and MPAA ratings, I do not observe obvious patterns among the dropped movies, so I assume there is no selection issue in this cleaning process. Since I control for star power in my analysis, I have to drop 2 documentaries that have too few actors to be accounted for; they both ranked around 100 or lower in their release year. The final step takes care of the movies that were released earlier but had a wide release later. These are mainly movies that attended film festivals or had limited premieres. In order to capture availability to most of the audience, I use only the date of wide release as the release date.9 Release dates are obtained initially from Box Office Mojo, and I use IMDb to check release details. Using wide-release dates potentially affects the movies that were released in December: some of them are considered as released in the next calendar year in my dataset. For example, American Sniper (2014) was initially released on Nov. 11, 2014 (AFI Fest), with a limited release on Dec. 25, 2014, but its wide release was on Jan. 16, 2015. Therefore, I treat it as a 2015 movie (which does not enter my dataset), though IMDb and Box Office Mojo both list it as a 2014 movie.

A.2.2 Movie Characteristics

Most of the movie characteristics are collected in the same way as in Appendix A.1.3. This section will only provide details about the characteristics that are collected differently.

Production Companies For production companies, I focus on the “Company Credits” section on IMDb, and only look at the companies listed as a “production company”. IMDb lists production companies with different notes. For some movies, all production companies are listed with the same note,

9In my dataset, most of the movies have wide releases. Very few movies have only limited releases over their entire run, such as The Intouchables (2012) and Belle (2014); for these, the first date of release is used.

as "production company". Some movies have production companies noted with "(presents)", "(in association with)", etc. According to various sources online, a company that "presents" the movie is usually the main distributor, and may also be the main financial resource. (The major studios in the parent groups above are usually listed in this way.) Although it may not be the company that actually produces the movie, I assume it has the main effect on movie performance. Moreover, the audience is also more familiar with these companies, which show up early in the opening credits. Therefore, I only document companies labeled with "(presents)" for each movie. If all companies are listed as "production company" without other notes, or with notes but no "(presents)", I check the movie's page on Wikipedia to document the main production companies. I list up to 3 main companies that produce the movie. Using Table A.1, if a single studio is the main production company of a movie, the movie is labeled with the corresponding studio parent. If two or more studio parents co-produced a movie, I pick the main financial resource using Wikipedia.10

Hollywood Movies I define Hollywood movies using the general notion that the movie is produced mainly by U.S. companies. From IMDb, I can observe the location of each production company, which gives its origin. For each movie, if at least half of the production companies that "present" the movie are U.S. companies, I consider it a Hollywood movie. For movies that have no extra notes other than "production company", I consider a movie a Hollywood movie as long as more than half of its production companies are U.S. companies.

Directors and Actors Director power is constructed using the same method as in Appendix A.1.3. However, the star power of actors is measured differently. I first rank the cast to get the top 5 actors for each movie. The rankings are based on the default order listed on IMDb. However, some movies have a default order that follows the order of appearance. Therefore, I check the posters of each movie. Usually, a movie poster lists the actors playing the main characters in a large font. I compare the ranking on the poster with the default order on IMDb, and modify the order using the following rules. (1) If the default order is the order of appearance, I rank the actors according to the poster order (even if the poster order is alphabetical). If there are more than 5 actors listed, I only list the first five, unless there is one that is specifically mentioned (such as "Johnny Depp as the Wolf" on the poster of Into the Woods (2014), who is listed sixth but ranked fifth in my dataset). (2) If the default order is slightly different from the poster order, I only fix the order if a higher-ranked actor in the default order is not specifically mentioned. (Usually, posters will

10This differs from the categorization of production companies in Appendix A.1.3, where each movie is allowed to have more than one production company.

intentionally list some actors at the end, using "and" or "with" to mention them.) If the poster does not list actors, or lists fewer than 5 actors, in a large font, I turn to the list of actors and filmmakers (usually on the lower part of a poster, in a small font). Similar rules are applied if I need to use this ranking. After checking these two parts, I may still be missing some actors in the ranking (because fewer than 5 actors are listed even in small fonts, or no actors are listed on the posters). In that case, I check the rankings on Wikipedia. The procedures above aim to reflect star power using the popularity of the most important cast members. Finally, I collect the STARmeter of these 5 cast members at the nearest Sunday midnight before their movie's wide release.

Format One dummy variable for format is used, with 1 indicating that the movie has been released in 3D format, in IMAX theaters, or both.

Holidays U.S. holidays are defined in the same way as in Appendix A.1.3, with the exception that Christmas is not considered; the December dummy captures the effect of Christmas releases. There are 4 major release periods in the Chinese market: the New Year season, which can last from late November to early March; Labor Day week, the first week of May; the Summer season; and National Day week, the first week of October. Since the two seasons can be captured by month dummies, I only label movies released during the two holiday weeks in 2011-2015.11

Table A.3: Chinese Holiday Weeks in 2011-2015

Holiday        2011             2012              2013             2014             2015
Labor Day      Apr 26 - May 2   Apr 25 - May 1    Apr 25 - May 1   Apr 27 - May 3   Apr 27 - May 3
National Day   Oct 1 - 7        Sep 30 - Oct 7†   Oct 1 - 7        Oct 1 - 7        Oct 1 - 7

Notes: † The 2012 National Day holiday week was combined with the Mid-Autumn Festival.

A.2.3 Chinese Features

Since it is infeasible to quantitatively measure the cultural content in a movie, I construct a dummy variable to reflect the inclusion of Chinese features in a movie. For movies that were imported to China, I collect the import type, that is, buyout or revenue-shared.

Types of Chinese Features A movie has to meet at least one of the following 4 criteria to qualify as a movie with Chinese features.

11Neither of the 2 movies imported in 2016 was released during the Labor Day or National Day holiday weeks.

1. Chinese Co-Production I check whether there is a Chinese production company, labeled with "[CN]", in the list of production companies of a movie. Movies in my dataset have at most one Chinese production company listed.

2. Chinese Actors There are roughly two types of Chinese actors in a movie. One type consists of those who were born or grew up in the U.S. and have a career in Hollywood. I focus only on Chinese actors who come from Mainland China, Hong Kong, Taiwan, etc. This restriction serves the purpose of investigating Hollywood companies' strategy of adding Chinese actors: these actors may be more attractive to the Chinese audience and have a stronger cultural influence on the Chinese market, while being much less known to the U.S. audience. Under this restriction, actors like Jackie Chan will be excluded. I collect information on the top 3 Chinese actors in a movie (under the same ranking rule stated previously).

3. Chinese Filming Locations IMDb lists filming locations for most of the movies, except for some animated movies. Although there is no guarantee that the list of locations is complete, this is the only reliable source that I can use. This criterion is met as long as at least one Chinese location shows up on the list.

4. Chinese Culture I consider a movie to be related to Chinese culture if it directly mentions Kung Fu, such as Kung Fu Panda 2 (2011), or contains similar content, such as Pacific Rim (2013), which has three characters who come from Hong Kong and represent China in the fight.

Buyout and Revenue-Shared Movies There is no official record available of the imported movies and their import types. Similar research, such as McCutchan (2013), uses information from online forums. To use as much information from available sources as possible, I obtain this information through the following procedure. First, I check the full list of released movies in the Chinese market. The lists for 2013 and 2014 are obtained from Box Office Mojo; the lists for 2011 and 2012 are from a Chinese website, Douban.12 Notice that this step is only to obtain the full set of released movies. Second, I drop all the movies that are produced by Chinese production companies, or co-produced by Chinese companies with Korean, Japanese, and other Asian companies. After this step, only imported movies from the U.S., Britain, Australia, and some other countries remain. Third, I search for online sources of import types. Since no single source covers all years from 2011 to 2014, I combine several sources. For 2014, I use a report from the

12The lists I use are managed by a user who constantly follows Chinese box office revenues.

Hollywood Reporter,13 which contains a list of 34 revenue-shared movies in that year. For 2013, I use a report from Tencent Entertainment,14 which lists all the revenue-shared movies, as well as some of the buyout movies that year. For 2012, I use Appendix 2 of McCutchan (2013) to label all the buyout and revenue-shared movies that year. For 2011, I use a blog on Mtime15 for a complete list of revenue-shared and buyout movies.

A.2.4 Box Office Revenues

Box office revenues in multiple markets have been collected for the 501 movies in the dataset, with the markets categorized following the standard of Box Office Mojo.

Domestic Market The domestic market refers to the U.S. and Canada, with the revenue in this market directly provided by Box Office Mojo. Domestic revenues were first collected in late 2015, and were updated in April 2020 to reflect the revenue changes of a few movies on Box Office Mojo. The Intouchables (2012) and Spring Breakers (2013) have "Canada" listed alongside "Domestic", with revenues between $1 and $2 million. It is unclear why this amount is listed separately, and I do not add it to the domestic revenue.

Chinese Market Box office revenues in China were first collected in late 2015 for 2012-2014 movies and a few movies from 2011. The data sources include Box Office Mojo and McCutchan (2013), with the latter listing the revenues of all imports in 2012. A second round of collection was carried out in April 2020, with the following updates:

• 4 movies were updated as imports, all being buyout movies. Their revenues in China are the revenues listed on Box Office Mojo as of April 2020. The Family (2013) was imported in 2014; Sabotage (2014) was imported in 2015; The November Man (2014) and Left Behind (2014) were imported in 2016.

• 25 movies have a different revenue listed on Box Office Mojo than the revenue collected in 2015, some significantly lower. A brief check of the revenues documented by Chinese websites shows that the previously collected revenue may be a better match. Therefore, I use the box office revenues collected in 2015.

13The Hollywood Reporter, China Box Office: Foreign Quota Movies Saw Revenue Rise 60 Percent in 2014: http://www.hollywoodreporter.com/news/china-box-office-foreign-quota-758767
14Tencent provides one of the largest web portals in China. This report from Tencent Entertainment is about the decreasing number of buyout movies in the Chinese market: http://ent.qq.com/zt2014/dianyingshuo/vol72.htm
15Mtime is one of the main movie & TV databases in China, very similar to IMDb. The blog I use is written by a user of the website, but his similar blogs in later years have been used by Mtime as reports. Thus, I consider it a relatively reliable source of information: http://i.mtime.com/541891/blog/7303473/

• 24 movies with revenues collected in 2015 no longer have a valid revenue listed. The revenues collected previously are kept.

• 17 movies, all released in 2011, did not have their revenues collected in 2015. None of these movies has a revenue in China listed on Box Office Mojo. Therefore, I have to turn to a few Chinese websites to collect the revenues in Chinese Yuan, and convert them to U.S. dollars using the average exchange rate of 2011 (1 USD = 6.46 CNY). Revenues for 9 movies can be found on Mtime.16 Another 7 movies have revenues listed on two blogs managed by a user on Douban, which provide annual rankings of movies in China in 201117 and 2012.18 The user claims to have used box office data from newspapers as the data source. However, the period of each ranking runs from early December of the previous year to early December of that year; therefore, movies released in late November are heavily affected. Priest (2011) was released on November 29, 2011, and is listed in both rankings, with 20.6 million CNY generated in 2011 and 17.0 million CNY in 2012. I put 38 million CNY ($5.94 million) as its revenue, which is also the value listed in another Douban list.

International Market International box office revenues were collected in April 2020. Box Office Mojo documents a movie's "worldwide" revenue as the sum of two parts, "domestic" and "international", with the latter including all markets outside of the U.S. and Canada. Moreover, available revenue data for individual markets are provided on the same page, categorized into at most 5 sections: 1) Domestic; 2) Europe, Middle East, and Africa (referred to as EMA); 3) Latin America (referred to as LA); 4) Asia Pacific (referred to as AP); and 5) China.19 For each movie, I collect the revenue listed as "international", and sum up the individual markets under each section (if available). Hence, the "international" revenue can be decomposed into at most 4 regions: EMA, LA, AP, and China. However, since there could be measurement errors and missing data for some markets, for over 70% of the movies the revenues in these 4 parts do not add up to the "international" revenue. I keep the revenue of each region (EMA, LA, and AP) as the revenue for that market, and define the revenue in the international market in the following ways (a sketch in code follows the list):

• If the sum of the revenues in 4 regions (EMA, LA, AP, and China) is lower than the revenue listed as “international”, the international revenue is the “international” revenue minus the revenue in China (if any).

16http://news.mtime.com/2012/01/12/1479790.html
17douban.com/note/190794027/
18https://www.douban.com/note/255327548/
19As an example, see the page of Captain America: The Winter Soldier (2014): https://www.boxofficemojo.com/title/tt1843866/?ref_=bo_se_r_1

• If the sum of the revenues in the 4 regions (EMA, LA, AP, and China) exceeds the "international" revenue, I use the sum of the revenues in EMA, LA, and AP as the international revenue.
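A minimal sketch of this decomposition rule (not the dissertation's actual code; variable names are illustrative, and missing section sums are assumed to be recorded as 0):

```python
def international_revenue(intl, ema, la, ap, china):
    """Apply the two-case rule above to define the international revenue.

    intl: revenue listed as "international" on Box Office Mojo
    ema, la, ap, china: sums of the individual markets in each section
    """
    if ema + la + ap + china < intl:
        # Case 1: regional sums fall short; keep the listed total, net of China.
        return intl - china
    # Case 2: regional sums exceed the listed total; use EMA + LA + AP.
    return ema + la + ap
```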

APPENDIX B

Proofs

B.1 Learning Process

LEMMA. Assume $x_1 \mid \mu \sim N(\mu, \sigma_1^2)$, $x_{2j} \mid \mu \sim N(\mu, \sigma_2^2)$ i.i.d., that $x_1$ and all $x_{2j}$ are independent, and that $\mu \sim N(\mu_0, \sigma_0^2)$. Then

$$\mu \mid x_1, x_{21}, \cdots, x_{2n} \sim N(\mu', \sigma'^2),$$

where

$$\mu' = \frac{\sigma_1^2 \sigma_2^2 \mu_0 + \sigma_0^2 \sigma_2^2 x_1 + n \sigma_0^2 \sigma_1^2 \bar{x}_2}{\sigma_1^2 \sigma_2^2 + \sigma_0^2 \sigma_2^2 + n \sigma_0^2 \sigma_1^2}; \qquad \sigma'^2 = \frac{\sigma_0^2 \sigma_1^2 \sigma_2^2}{\sigma_1^2 \sigma_2^2 + \sigma_0^2 \sigma_2^2 + n \sigma_0^2 \sigma_1^2}.$$

Proof.1 If $\sigma_1^2, \sigma_2^2$ are constants, the likelihood of observing $(x_1, x_{21}, \cdots, x_{2n})$ is

$$p(x_1, x_{21}, \cdots, x_{2n} \mid \mu, \sigma_1^2, \sigma_2^2) = p(x_1 \mid \mu, \sigma_1^2) \cdot \prod_{i=1}^{n} p(x_{2i} \mid \mu, \sigma_2^2) \propto \exp\left[ -\frac{1}{2\sigma_1^2}(x_1 - \mu)^2 \right] \exp\left[ -\frac{1}{2\sigma_2^2} \sum_{i=1}^{n} (x_{2i} - \mu)^2 \right].$$

1I use the method provided by Kevin P. Murphy in "Conjugate Bayesian analysis of the Gaussian distribution" (2007).

Since the prior is $\mu \sim N(\mu_0, \sigma_0^2)$, we have

$$p(\mu \mid \mu_0, \sigma_0^2) \propto \exp\left[ -\frac{1}{2\sigma_0^2}(\mu - \mu_0)^2 \right].$$

Hence the posterior is given by

$$\begin{aligned}
p(\mu \mid x_1, x_{21}, \cdots, x_{2n}) &\propto \exp\left[ -\frac{1}{2\sigma_1^2}(x_1 - \mu)^2 \right] \exp\left[ -\frac{1}{2\sigma_2^2} \sum_{i=1}^{n} (x_{2i} - \mu)^2 \right] \exp\left[ -\frac{1}{2\sigma_0^2}(\mu - \mu_0)^2 \right] \\
&= \exp\left[ -\frac{x_1^2 - 2x_1\mu + \mu^2}{2\sigma_1^2} - \frac{\sum_i x_{2i}^2 - 2\mu \sum_i x_{2i} + n\mu^2}{2\sigma_2^2} - \frac{\mu^2 - 2\mu_0\mu + \mu_0^2}{2\sigma_0^2} \right] \\
&= \exp\left[ \left( -\frac{1}{2\sigma_1^2} - \frac{n}{2\sigma_2^2} - \frac{1}{2\sigma_0^2} \right)\mu^2 + \left( \frac{x_1}{\sigma_1^2} + \frac{\sum_i x_{2i}}{\sigma_2^2} + \frac{\mu_0}{\sigma_0^2} \right)\mu + C \right],
\end{aligned}$$

where $C = -\frac{x_1^2}{2\sigma_1^2} - \frac{\sum_i x_{2i}^2}{2\sigma_2^2} - \frac{\mu_0^2}{2\sigma_0^2}$.

The product of two normal densities is proportional to another normal density. Using this fact twice, we can rewrite the posterior as

$$p(\mu \mid x_1, x_{21}, \cdots, x_{2n}) \propto \exp\left[ -\frac{1}{2\sigma'^2}(\mu - \mu')^2 \right] = \exp\left[ -\frac{1}{2\sigma'^2}\mu^2 + \frac{\mu'}{\sigma'^2}\mu - \frac{\mu'^2}{2\sigma'^2} \right].$$

Matching the coefficients on $\mu^2$, we have

$$-\frac{1}{2\sigma'^2} = -\frac{1}{2\sigma_1^2} - \frac{n}{2\sigma_2^2} - \frac{1}{2\sigma_0^2} \;\Rightarrow\; \frac{1}{\sigma'^2} = \frac{1}{\sigma_0^2} + \frac{1}{\sigma_1^2} + \frac{n}{\sigma_2^2} \;\Rightarrow\; \sigma'^2 = \frac{\sigma_0^2 \sigma_1^2 \sigma_2^2}{\sigma_1^2 \sigma_2^2 + \sigma_0^2 \sigma_2^2 + n \sigma_0^2 \sigma_1^2}.$$

Matching the coefficients on $\mu$, we have

$$\frac{\mu'}{\sigma'^2} = \frac{x_1}{\sigma_1^2} + \frac{\sum_i x_{2i}}{\sigma_2^2} + \frac{\mu_0}{\sigma_0^2} \;\Rightarrow\; \mu' = \left( \frac{\mu_0}{\sigma_0^2} + \frac{x_1}{\sigma_1^2} + \frac{n\bar{x}_2}{\sigma_2^2} \right) \left( \frac{1}{\sigma_0^2} + \frac{1}{\sigma_1^2} + \frac{n}{\sigma_2^2} \right)^{-1} = \frac{\sigma_1^2 \sigma_2^2 \mu_0 + \sigma_0^2 \sigma_2^2 x_1 + n \sigma_0^2 \sigma_1^2 \bar{x}_2}{\sigma_1^2 \sigma_2^2 + \sigma_0^2 \sigma_2^2 + n \sigma_0^2 \sigma_1^2}. \qquad \blacksquare$$



The precision of a normal distribution is $\tau = 1/\sigma^2$: low variance means high precision, and high variance means low precision. We can use precision to rewrite the result above:

$$\mu' = \frac{\tau_0 \mu_0 + \tau_1 x_1 + n\tau_2 \bar{x}_2}{\tau_0 + \tau_1 + n\tau_2} = \frac{\tau_0}{\tau_0 + \tau_1 + n\tau_2}\mu_0 + \frac{\tau_1}{\tau_0 + \tau_1 + n\tau_2}x_1 + \frac{n\tau_2}{\tau_0 + \tau_1 + n\tau_2}\bar{x}_2; \qquad \tau' = \tau_0 + \tau_1 + n\tau_2.$$

From the equation for $\mu'$, we can see that the mean of the posterior distribution is a weighted average of the prior mean $\mu_0$, the signal $x_1$, and the average $\bar{x}_2$ of the $n$ signals $x_{21}, \cdots, x_{2n}$. The weights are based on the precisions and the number of signals: if the prior distribution has higher precision, its weight is higher, and the same holds for the signal $x_1$; for the signals $x_{2i}$, the weight increases as their precision gets higher and as the number of signals gets larger.

The precision of the posterior is the sum of the precision of the prior, the precision of the signal $x_1$, and $n$ times the precision of the signals $x_{2i}$. Thus, the precision of the posterior increases when any of these precisions increases, or when the number of signals increases.
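As a numerical sanity check (not part of the original proof), the product-of-variances form in the lemma can be verified against the precision-weighted form; all parameter values below are illustrative:

```python
# Check that the two algebraically equivalent posterior formulas agree.
import random

random.seed(0)
mu0, s0, s1, s2, n = 1.0, 2.0, 1.5, 3.0, 5
mu = random.gauss(mu0, s0)                      # true mean drawn from the prior
x1 = random.gauss(mu, s1)                       # signal with variance s1^2
x2 = [random.gauss(mu, s2) for _ in range(n)]   # n i.i.d. signals, variance s2^2
x2bar = sum(x2) / n

# Precision-weighted form.
t0, t1, t2 = 1 / s0**2, 1 / s1**2, 1 / s2**2
post_mean = (t0 * mu0 + t1 * x1 + n * t2 * x2bar) / (t0 + t1 + n * t2)
post_var = 1 / (t0 + t1 + n * t2)

# Product-of-variances form from the lemma.
den = s1**2 * s2**2 + s0**2 * s2**2 + n * s0**2 * s1**2
num = s1**2 * s2**2 * mu0 + s0**2 * s2**2 * x1 + n * s0**2 * s1**2 * x2bar
assert abs(post_mean - num / den) < 1e-10
assert abs(post_var - s0**2 * s1**2 * s2**2 / den) < 1e-10
```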

B.2 Combining Prior and Private Signal

Proof.

$$\begin{aligned}
\mu_{j1,r}^{a} &= \frac{\sigma_{s^*}^2\sigma_c^2}{\sigma_{s^*}^2\sigma_c^2 + \sigma^2\sigma_c^2 + \sigma^2\sigma_{s^*}^2}\,\mu + \frac{\sigma^2\sigma_c^2}{\sigma_{s^*}^2\sigma_c^2 + \sigma^2\sigma_c^2 + \sigma^2\sigma_{s^*}^2}\,\xi_j^{s^*} + \frac{\sigma^2\sigma_{s^*}^2}{\sigma_{s^*}^2\sigma_c^2 + \sigma^2\sigma_c^2 + \sigma^2\sigma_{s^*}^2}\,\xi_j^{c} \\
&= \frac{\sigma_c^2}{(\sigma_{s^*}^2 + \sigma^2)\sigma_c^2 + \sigma^2\sigma_{s^*}^2}\cdot\sigma_{s^*}^2\mu + \frac{\sigma_c^2}{(\sigma_{s^*}^2 + \sigma^2)\sigma_c^2 + \sigma^2\sigma_{s^*}^2}\cdot\sigma^2\xi_j^{s^*} + \frac{\sigma^2\sigma_{s^*}^2}{(\sigma_{s^*}^2 + \sigma^2)\sigma_c^2 + \sigma^2\sigma_{s^*}^2}\,\xi_j^{c} \\
&= \frac{\sigma_c^2}{\frac{\sigma^2\sigma_{s^*}^2}{\sigma^2+\sigma_{s^*}^2} + \sigma_c^2}\left(\frac{\sigma_{s^*}^2}{\sigma_{s^*}^2+\sigma^2}\mu + \frac{\sigma^2}{\sigma_{s^*}^2+\sigma^2}\xi_j^{s^*}\right) + \frac{\frac{\sigma^2\sigma_{s^*}^2}{\sigma^2+\sigma_{s^*}^2}}{\frac{\sigma^2\sigma_{s^*}^2}{\sigma^2+\sigma_{s^*}^2} + \sigma_c^2}\,\xi_j^{c}.
\end{aligned}$$

Using Equations (2.3) and (2.4), we can replace

$$\frac{\sigma_{s^*}^2}{\sigma_{s^*}^2+\sigma^2}\mu + \frac{\sigma^2}{\sigma_{s^*}^2+\sigma^2}\xi_j^{s^*} = \xi_j^{s}, \qquad \frac{\sigma^2\sigma_{s^*}^2}{\sigma^2+\sigma_{s^*}^2} = \sigma_s^2,$$

which gives us

$$\mu_{j1,r}^{a} = \frac{\sigma_c^2}{\sigma_s^2+\sigma_c^2}\,\xi_j^{s} + \frac{\sigma_s^2}{\sigma_s^2+\sigma_c^2}\,\xi_j^{c}.$$

Similarly,

$$\sigma_{j1,r}^{a2} = \frac{\sigma^2\sigma_{s^*}^2\sigma_c^2}{\sigma_{s^*}^2\sigma_c^2 + \sigma^2\sigma_c^2 + \sigma^2\sigma_{s^*}^2} = \frac{\sigma^2\sigma_{s^*}^2\cdot\sigma_c^2}{(\sigma_{s^*}^2 + \sigma^2)\sigma_c^2 + \sigma^2\sigma_{s^*}^2} = \frac{\frac{\sigma^2\sigma_{s^*}^2}{\sigma^2+\sigma_{s^*}^2}\cdot\sigma_c^2}{\frac{\sigma^2\sigma_{s^*}^2}{\sigma^2+\sigma_{s^*}^2} + \sigma_c^2} = \frac{\sigma_s^2\sigma_c^2}{\sigma_s^2+\sigma_c^2}. \qquad \blacksquare$$



APPENDIX C

Robustness Tests

C.1 Chapter 1 Robustness Tests

Table C.1: Preference Parameters for Robustness - Chapter 1

              αc = αa = 1              Clustered                Add No. of Critics
              (1)          (2)         (3)          (4)         (5)          (6)
US prod.      0.0424       -0.0184     0.0964       0.0683      0.102        0.0652
              (0.0700)     (0.0700)    (0.164)      (0.166)     (0.0687)     (0.0689)
Co-prod.      -0.161*      -0.218**    -0.104       -0.140      -0.0974      -0.140*
              (0.0725)     (0.0719)    (0.176)      (0.176)     (0.0711)     (0.0709)
R             -0.762***    -0.793***   -0.755*      -0.791*     -0.803***    -0.866***
              (0.122)      (0.124)     (0.340)      (0.360)     (0.120)      (0.122)
PG13          -0.305*      -0.326**    -0.221       -0.260      -0.273*      -0.333**
              (0.120)      (0.122)     (0.332)      (0.351)     (0.118)      (0.120)
PG            0.108        0.0352      0.232        0.178       0.187        0.110
              (0.112)      (0.112)     (0.309)      (0.322)     (0.110)      (0.111)
Director      0.0462***    0.0333**    0.0647*      0.0616      0.0656***    0.0595***
              (0.0124)     (0.0126)    (0.0329)     (0.0329)    (0.0121)     (0.0123)
Star          -0.0390**    -0.0422**   -0.0421      -0.0406     -0.0460***   -0.0466***
              (0.0132)     (0.0132)    (0.0315)     (0.0320)    (0.0130)     (0.0131)
3D            0.000354     0.0363      0.0517       0.0534      0.0627       0.0738
              (0.0475)     (0.0474)    (0.107)      (0.107)     (0.0465)     (0.0464)
IMAX          -0.0229      0.0353      0.00796      0.0424      0.00492      0.0510
              (0.0465)     (0.0479)    (0.100)      (0.102)     (0.0456)     (0.0469)

Notes: Standard errors in parentheses; ∗p < 0.05, ∗∗p < 0.01, ∗∗∗p < 0.001.

Table C.2: Preference Parameters for Robustness - Chapter 1 (Cont.)

              αc = αa = 1              Clustered                Add No. of Critics
              (1)          (2)         (3)          (4)         (5)          (6)
Action        -0.0490      -0.0718     -0.0876      -0.0950     -0.0588      -0.0555
              (0.0418)     (0.0416)    (0.0947)     (0.0946)    (0.0413)     (0.0411)
Animation     0.307***     0.259***    0.299        0.273       0.296***     0.268***
              (0.0754)     (0.0755)    (0.196)      (0.195)     (0.0738)     (0.0739)
Comedy        0.316***     0.280***    0.381***     0.346**     0.371***     0.332***
              (0.0417)     (0.0416)    (0.103)      (0.105)     (0.0410)     (0.0411)
Drama         0.00945      0.0404      0.0128       0.0399      -0.0117      0.0181
              (0.0423)     (0.0422)    (0.115)      (0.113)     (0.0414)     (0.0415)
Horror        0.205**      0.172**     0.108        0.105       0.136*       0.133*
              (0.0648)     (0.0644)    (0.160)      (0.152)     (0.0643)     (0.0641)
Sci-fi        -0.125*      -0.108*     -0.0892      -0.0850     -0.0817      -0.0726
              (0.0496)     (0.0498)    (0.108)      (0.110)     (0.0484)     (0.0487)
Thriller      -0.000643    -0.00270    0.0212       0.0163      0.00971      0.0134
              (0.0530)     (0.0530)    (0.141)      (0.141)     (0.0521)     (0.0522)
Sequel        0.306***     0.335***    0.281***     0.314***    0.306***     0.339***
              (0.0409)     (0.0408)    (0.0828)     (0.0838)    (0.0403)     (0.0403)
ln(Budget)    0.116***     0.0863***   0.0948       0.0733      0.101***     0.0745***
              (0.0219)     (0.0224)    (0.0546)     (0.0552)    (0.0217)     (0.0220)
Week          Y            Y           Y            Y           Y            Y
Disney        0.340*                   0.386                    0.336**
              (0.163)                  (0.292)                  (0.118)
Fox           0.342*                   0.388                    0.353**
              (0.156)                  (0.299)                  (0.115)
Paramount     0.244                    0.199                    0.199
              (0.153)                  (0.302)                  (0.115)
Sony          0.607***                 0.626*                   0.616***
              (0.152)                  (0.306)                  (0.113)
Universal     0.432**                  0.445                    0.405***
              (0.158)                  (0.302)                  (0.116)
Warner        -0.355*                  0.0589                   -0.139
              (0.155)                  (0.271)                  (0.109)
ln(Adspend)   -0.0834                  0.0532                   0.0523
              (0.0449)                 (0.136)                  (0.0428)

Notes: Standard errors in parentheses; ∗p < 0.05, ∗∗p < 0.01, ∗∗∗p < 0.001.

Table C.3: Learning Parameters for Robustness - Chapter 1

              αc = αa = 1              Clustered                Add No. of Critics
              (1)          (2)         (3)          (4)         (5)          (6)
μ
  -Major      -0.702**     -0.832**    -0.861       -1.188      -0.564*      -0.785**
              (0.235)      (0.282)     (0.634)      (0.696)     (0.238)      (0.257)
  -Minor      -1.254***    -0.922***   -1.205*      -1.108      -0.879***    -0.749**
              (0.228)      (0.226)     (0.609)      (0.613)     (0.227)      (0.228)
  -Cold       0.0540       0.114       0.608**      0.611**     0.270**      0.231**
              (0.0631)     (0.0691)    (0.229)      (0.230)     (0.0856)     (0.0816)
  -Adspend    0.347***     0.396***    0.121        0.0829      0.200***     0.198***
              (0.0358)     (0.0516)    (0.0694)     (0.141)     (0.0339)     (0.0499)
σ1
  -Major      7.204***     7.822***    5.866***     6.222***    10.34***     11.44***
              (0.504)      (0.462)     (1.309)      (1.226)     (0.749)      (0.802)
  -Minor      6.810***     7.921***    5.507***     6.024***    10.11***     11.37***
              (0.467)      (0.450)     (1.305)      (1.243)     (0.719)      (0.787)
  -Cold       -0.345**     -0.157      -0.383       -0.324      -0.711***    -0.605***
              (0.126)      (0.0973)    (0.360)      (0.334)     (0.155)      (0.135)
  -Adspend    -1.038***    -1.313***   -1.150*      -1.333**    -1.917***    -2.270***
              (0.145)      (0.136)     (0.461)      (0.486)     (0.245)      (0.269)
σ2
  -Major      4.273***     4.593***    2.975***     3.162***    3.004***     3.241***
              (0.113)      (0.125)     (0.292)      (0.358)     (0.125)      (0.149)
  -Minor      3.982***     3.948***    3.129***     2.958***    3.114***     2.944***
              (0.0846)     (0.0863)    (0.250)      (0.308)     (0.0909)     (0.113)
  -Cold       -0.129       -0.130      0.563*       0.558*      0.261**      0.217*
              (0.0967)     (0.0940)    (0.278)      (0.273)     (0.0905)     (0.0917)
  -Adspend    0.321***     0.258***    0.102        0.119       0.163***     0.205***
              (0.0206)     (0.0259)    (0.0745)     (0.101)     (0.0270)     (0.0353)
y1            31.75***     19.96***    21.29*       18.61       17.72***     17.24***
              (3.622)      (4.455)     (8.837)      (10.27)     (4.593)      (4.428)
y2            2.893***     2.647***    -4.529       -3.177      -4.280***    -2.592*
              (0.263)      (0.271)     (3.056)      (2.751)     (1.229)      (1.014)
α1                                     4.004        4.477       11.31***     10.77***
                                       (6.933)      (7.503)     (1.595)      (1.486)
α2                                     3.196***     3.012***    2.948***     2.630***
                                       (0.706)      (0.694)     (0.268)      (0.239)
N             8211         8211        8211         8211        8211         8211
adj. R-sq     0.888        0.890       0.892        0.893       0.892        0.893
RSS           12214.75     11992.33    11790.52     11684.92    11757.35     11618.94

Notes: Standard errors in parentheses; ∗p < 0.05, ∗∗p < 0.01, ∗∗∗p < 0.001.

C.2 Chapter 2 Robustness Tests

C.2.1 Linear Model and Estimates

Consider the expected utility of consumer i from watching movie j, which takes the following form:

$$E[U_{ijt}] = \beta X_j + \gamma D_t + \rho_0 \ln(ad_j) + \rho_1 NoSc_{jt} + \rho_2 (1 - NoSc_{jt}) cr_{jt} + \rho_3 D_t\, ur_j + \tau_{jt} + \varepsilon_{ijt},$$

where $X_j$, $D_t$, $\tau_{jt}$, and $\varepsilon_{ijt}$ are defined the same way as in the learning model; $\ln(ad_j)$ now directly shifts utility; $NoSc_{jt}$ is a dummy that labels observations with no critic score and no audience score; and $cr_{jt}$ and $ur_j$ are the original critic and audience scores, without standardization.1 Under this form of expected utility, consumers have the same information structure in each case discussed in the learning model:

(1) For a cold-opened movie in its opening week ($D_t = 0$), $NoSc_{jt} = 1$ as there is no score available. Therefore, consumers have $\rho_0\ln(ad_j) + \rho_1$ in their utility. This corresponds to the case when consumers only observe $ad_j$ and put all the weight on the inferred private score.

(2) For a regular movie in its opening week ($D_t = 0$), since $NoSc_{jt} = 0$ in this case, consumers have $\rho_0\ln(ad_j) + \rho_2 cr_{jt}$ in their utility. This corresponds to the case when they take the weighted average of the inferred private score and the critic score.

(3) For a movie in its post-release period ($D_t = 1$), since consumers observe both the critic score and the audience score, $NoSc_{jt} = 0$, and consumers have $\rho_0\ln(ad_j) + \rho_2 cr_{jt} + \rho_3 ur_j$ in their utility. This corresponds to the case when they take the weighted average of the 3 available scores.

Here, the effect of advertising is held constant across the three cases. Moreover, when there is no score available, expected utility is shifted by ρ1, instead of consumers shifting the weight on the critic score and the audience score to advertising.

I further allow the coefficients ρ to depend on studio type majorj, medium budget medj, and high budget highj. As in the learning model estimation, I again use the actual production budget ln(budgetj) and the foreign release schedule frgnj as instruments. By matching model-predicted shares to the data, I obtain the following results. Table C.4 shows the estimates on advertising, the no-score dummy, critic scores, and audience scores; these parameters correspond to the learning parameters in my learning model. Focusing on the results with two instruments, we can observe that advertising has a bigger impact on demand than critic scores and audience scores, and that the effect of critic scores is slightly larger than that of audience scores. As the production budget level gets higher, the effect of critic scores increases. These patterns are similar to the results from the learning model. Moreover,

1Critic scores have been rescaled to [0, 10] to be on a similar scale with audience scores.

when we compare the coefficients on advertising and the no-score dummy with those from the estimation without instruments, we observe a decrease in the coefficient on advertising and an increase in the coefficient on the no-score dummy. This is also consistent with our findings under the learning model.

Table C.4: Learning Parameters under Linear Model

              (1) No IV    (2) IV                    (1) No IV    (2) IV
ln(Adspend)   0.620***     0.593***    No Score      0.576***     1.118***
              (0.0563)     (0.138)                   (0.182)      (0.346)
×Major        0.190*       0.125       ×Major        -0.0363      0.0105
              (0.115)      (0.141)                   (0.214)      (0.453)
×Medium       0.223*       1.023       ×Medium       0.132        -0.634
              (0.127)      (0.714)                   (0.240)      (0.680)
×High         -0.0823      0.181       ×High         0.448        1.085
              (0.272)      (1.332)                   (0.323)      (0.683)
Critic        0.0694**     0.108**     Audience      0.132***     0.0995**
              (0.0326)     (0.0422)                  (0.0334)     (0.0387)
×Major        -0.00369     -0.0101     ×Major        0.00511      0.0163
              (0.0370)     (0.0656)                  (0.0114)     (0.0269)
×Medium       0.0746*      0.0698      ×Medium       0.0110       -0.0387
              (0.0434)     (0.0676)                  (0.0114)     (0.0401)
×High         0.110**      0.152**     ×High         0.00791      0.00315
              (0.0537)     (0.0713)                  (0.0113)     (0.0248)

Notes: Standard errors in parentheses; N = 1174; ∗p < 0.1, ∗∗p < 0.05, ∗∗∗p < 0.01.

Coefficients on other movie characteristics are also similar to those under the learning model, except for the six major studios and the two production budget levels, as these have been interacted with advertising and the scores.

Table C.5: Preference Parameters under Linear Model

              (1) No IV    (2) IV                    (1) No IV    (2) IV
Disney        -0.225       -0.0858     # of Movies   -0.111***    -0.105***
              (0.348)      (0.493)                   (0.0311)     (0.0323)
Fox           -0.159       -0.0281     Med           -0.857**     -2.849
              (0.295)      (0.424)                   (0.436)      (1.881)
Paramount     -0.306       -0.158      High          0.124        -0.927
              (0.313)      (0.442)                   (0.956)      (4.359)
Sony          -0.148       -0.0224     Director      -0.0600**    -0.0426
              (0.321)      (0.440)                   (0.0282)     (0.0334)
Universal     -0.391       -0.252      Star          -0.0303      -0.0276
              (0.323)      (0.464)                   (0.0293)     (0.0370)
Warner        -0.814**     -0.859      Sequel        0.488***     0.500***
              (0.337)      (0.531)                   (0.0752)     (0.0861)

Notes: Standard errors in parentheses; N = 1174; ∗p < 0.1, ∗∗p < 0.05, ∗∗∗p < 0.01.

C.2.2 Alternative IVs for Advertising

As discussed previously, the industry rule of thumb relates the advertising expenditure to a movie's production budget. In reality, studios can participate in the upfront season to purchase advertising airtime beforehand, and allocate the airtime to movies later. Therefore, when studios adjust the advertising expenditure for a movie, their ability to do so may be related to other movies released nearby that are distributed by the same studio. To test how this feature changes my estimation results, I construct two alternative instruments for advertising: (1) the sum of movie j's production budget and 50% of the production budgets of other movies released by the same major studio within a month before or after movie j's release; (2) the sum in (1) with 50% changed to 25%, which gives studios less ability to move advertising expenditures around. Notice that only movies produced by the 6 major studios are considered here; for movies produced by minor studios, the instrument is the movie's own production budget. I further compare the results with (3) using the standardized long-run audience score as a quality proxy.

Table C.6: Preference Parameters under Alternative IVs/Proxy

            (1) 50%    (2) 25%    (3) Proxy                  (1) 50%     (2) 25%     (3) Proxy
Disney      0.483      0.440*     0.308**     # of Movies    -0.111***   -0.108***   -0.102***
            (0.297)    (0.259)    (0.132)                    (0.0334)    (0.0324)    (0.0300)
Fox         0.384*     0.366**    0.318***    Med            0.666**     0.583**     0.322***
            (0.208)    (0.180)    (0.0930)                   (0.337)     (0.292)     (0.123)
Paramount   0.216      0.196      0.175       High           1.036***    0.973***    0.629***
            (0.228)    (0.202)    (0.121)                    (0.368)     (0.345)     (0.182)
Sony        0.376*     0.359**    0.331***    Director       -0.0582*    -0.0561*    -0.0610**
            (0.198)    (0.175)    (0.107)                    (0.0307)    (0.0302)    (0.0270)
Universal   0.257      0.227      0.135       Star           -0.0682*    -0.0580     -0.0269
            (0.239)    (0.209)    (0.118)                    (0.0403)    (0.0372)    (0.0290)
Warner      0.103      0.0500     -0.143      Sequel         0.532***    0.529***    0.512***
            (0.309)    (0.269)    (0.130)                    (0.0770)    (0.0754)    (0.0708)

Notes: Standard errors in parentheses; N = 1174; ∗p < 0.1, ∗∗p < 0.05, ∗∗∗p < 0.01.

Across the three specifications above, the estimated preference parameters are similar to my main results, with the effects of medium and high production budgets being higher after adding 50% or 25% of the production budgets of nearby movies by the same major studio, but lower under the quality proxy. For the learning parameters, all parameters in the variances of the critic score and the audience score have the same sign and significance level, but the constant part of the critic score variance is much higher under the quality proxy, at 2.554. The mapping from advertising to the private score looks quite similar to my main results under the quality proxy, with α0 at -0.963 and α1 at 0.734. However, when the alternative instruments are used, the effect of advertising, α1, decreases to 0.603 after adding 25% of the budget and 0.519 after adding 50% of the budget, while the estimate of α0 increases.

Table C.7: Learning Parameters under Alternative IVs/Proxy

            (1) 50%    (2) 25%    (3) Proxy                  (1) 50%     (2) 25%     (3) Proxy
σc ×Const.  1.558***   1.535***   2.554***    σa ×Const.     2.680**     2.522***    2.360***
            (0.460)    (0.445)    (0.774)                    (1.060)     (0.870)     (0.493)
   ×Major   -0.122     -0.116     -0.0451        ×Major      -0.556      -0.446      -0.0834
            (0.667)    (0.648)    (0.752)                    (1.065)     (0.968)     (0.789)
   ×Medium  -0.846     -0.808     -1.109         ×Medium     -1.016      -0.883      -0.661
            (0.596)    (0.589)    (0.842)                    (1.144)     (1.018)     (0.787)
   ×High    -0.869     -0.881     -0.863         ×High       -0.811      -0.840      -1.174
            (0.792)    (0.767)    (0.998)                    (1.249)     (1.085)     (0.720)
α0          0.832      0.366      -0.963      α1             0.519**     0.603***    0.734***
            (1.673)    (1.521)    (0.770)                    (0.244)     (0.217)     (0.0683)

Notes: Standard errors in parentheses; N = 1174; ∗p < 0.1, ∗∗p < 0.05, ∗∗∗p < 0.01.

C.2.3 Cold-opening Redefined

In this chapter, I define a cold-opened movie as a movie with no critic score available 3 days before its release. Since a studio's advertising choice is assumed to be based on the critic score when the movie is regular, this definition implies that a studio can freely change its advertising level even if it observes the critic score just 3 days before release. While one may be concerned that a studio cannot change its advertising expenditure in the final days before release, in the digital age of advertising it is flexible to add or pull advertisements, especially when the studio has made a purchase during the upfront season. Moreover, out of 367 regular movies in my dataset, 253 movies (68.94%) were screened at least 7 days before release.

Table C.8: Preference Parameters under Redefined Cold-opening

              (1) No IV    (2) IV                    (1) No IV    (2) IV
Disney        0.358***     0.367**     # of Movies   -0.111***    -0.111***
              (0.133)      (0.161)                   (0.0303)     (0.0308)
Fox           0.349***     0.359***    Med           0.311***     0.344**
              (0.0929)     (0.108)                   (0.102)      (0.173)
Paramount     0.179        0.182       High          0.651***     0.728***
              (0.121)      (0.133)                   (0.169)      (0.238)
Sony          0.345***     0.360***    Director      -0.0724***   -0.0640**
              (0.109)      (0.121)                   (0.0269)     (0.0293)
Universal     0.142        0.172       Star          -0.0262      -0.0334
              (0.119)      (0.139)                   (0.0296)     (0.0330)
Warner        -0.167       -0.119      Sequel        0.496***     0.499***
              (0.130)      (0.162)                   (0.0713)     (0.0721)

Notes: Standard errors in parentheses; N = 1174; ∗p < 0.1, ∗∗p < 0.05, ∗∗∗p < 0.01.

To check how my estimation results change when the definition of cold-opening is changed, I redefine a cold-opened movie as one without a critic score on the Monday before its release week.2 This gives consumers and studios more than a week to learn from the observed critic score. Under this new definition, 66.78% of the movies in my dataset are cold-opened.

Table C.9: Learning Parameters under Redefined Cold-opening

              (1) No IV    (2) IV                    (1) No IV    (2) IV
σc ×Const.    1.610***     1.443***    σa ×Const.    2.148***     2.298***
              (0.395)      (0.410)                   (0.509)      (0.650)
   ×Major     -0.0559      -0.211         ×Major     -0.215       -0.200
              (0.554)      (0.615)                   (0.726)      (0.847)
   ×Medium    -0.683       -0.662         ×Medium    -0.669       -0.769
              (0.552)      (0.593)                   (0.750)      (0.866)
   ×High      -0.631       -0.958         ×High      -1.186*      -1.149
              (0.673)      (0.754)                   (0.706)      (0.833)
α0            -1.011***    -0.887      α1            0.805***     0.759***
              (0.389)      (1.090)                   (0.0715)     (0.193)

Notes: Standard errors in parentheses; N = 1174; ∗p < 0.1, ∗∗p < 0.05, ∗∗∗p < 0.01.

The preference parameters and learning parameters estimated are very similar to those in Tables 2.4 and 2.5. The effect of advertising, α1, is positive and significant at 0.759, compared to 0.781 obtained under the original definition. The constants in the variances of the critic score and the audience score, 1.443 and 2.298, are both significant and close to the 1.480 and 2.260 obtained previously. The constant in the mapping from advertising to the private score, α0, and the other terms in the variances of the critic score and the audience score all share the same signs as those from the estimation under the original definition. The effects of the two instruments are also similar, though the direction of the change in α0 differs when we only account for the endogeneity of the cold-opening choice.

C.2.4 Homogeneous Weight

Allowing for variation in the weight that consumers put on signals requires more instruments (specifically, the interaction terms of the instruments with movie characteristics) to be included in the GMM estimation. To alleviate the concern that my estimation results may be affected by parameterizing the weight and increasing the number of parameters to be estimated, I provide estimation results from a learning model with homogeneous weight. Most of the preference parameters are similar to the heterogeneous case, with the coefficients on the medium and high budget levels slightly lower at 0.235 and 0.558, respectively, compared to 0.426 and 0.836 under the heterogeneous case. σc for the critic score is 1.125, and

2Notice that my weekly advertising data from Kantar AdSpender are measured in broadcast weeks starting from Monday.

σa for the audience score is 1.875, giving a similar weight pattern: consumers put the highest weight on advertising, then the critic score, with the audience score receiving the lowest weight. After using instruments for advertising and the cold-opening choice, α1 decreases from

0.852 to 0.839, and α0 increases from -1.885 to -0.770, both matching my findings under the heterogeneous case.

Table C.10: Estimation Results under Homogeneous Weight

              (1) No IV    (2) IV                    (1) No IV    (2) IV
Disney        0.369***     0.374***    # of Movies   -0.116***    -0.112***
              (0.125)      (0.138)                   (0.0298)     (0.0307)
Fox           0.333***     0.312***    Med           0.208***     0.235**
              (0.0834)     (0.0865)                  (0.0775)     (0.105)
Paramount     0.174        0.155       High          0.505***     0.558***
              (0.111)      (0.115)                   (0.127)      (0.160)
Sony          0.337***     0.319***    Director      -0.0673**    -0.0587*
              (0.104)      (0.109)                   (0.0272)     (0.0302)
Universal     0.121        0.128       Star          -0.0230      -0.0315
              (0.107)      (0.116)                   (0.0296)     (0.0341)
Warner        -0.163       -0.127      Sequel        0.513***     0.525***
              (0.110)      (0.139)                   (0.0710)     (0.0730)
α0            -1.885***    -0.770      σc            1.560***     1.125***
              (0.433)      (1.392)                   (0.251)      (0.281)
α1            0.852***     0.839***    σa            1.617***     1.875***
              (0.0746)     (0.215)                   (0.236)      (0.340)

Notes: Standard errors in parentheses; N = 1174; ∗p < 0.1, ∗∗p < 0.05, ∗∗∗p < 0.01.

C.2.5 Different Mapping from Advertising

Table C.11: Preference Parameters under Square Root of Advertising

              (1) No IV    (2) IV                    (1) No IV    (2) IV
Disney        0.331***     0.364*      # of Movies   -0.115***    -0.108***
              (0.125)      (0.190)                   (0.0302)     (0.0312)
Fox           0.368***     0.378***    Med           0.136        0.276
              (0.0847)     (0.124)                   (0.0903)     (0.248)
Paramount     0.246**      0.240       High          0.341**      0.602*
              (0.110)      (0.151)                   (0.155)      (0.343)
Sony          0.379***     0.387***    Director      -0.0558**    -0.0440
              (0.101)      (0.136)                   (0.0268)     (0.0297)
Universal     0.107        0.154       Star          -0.0288      -0.0368
              (0.106)      (0.157)                   (0.0291)     (0.0339)
Warner        -0.375***    -0.255      Sequel        0.522***     0.541***
              (0.119)      (0.232)                   (0.0732)     (0.0750)

Notes: Standard errors in parentheses; N = 1174; ∗p < 0.1, ∗∗p < 0.05, ∗∗∗p < 0.01.

Table C.12: Learning Parameters under Square Root of Advertising

              (1) No IV    (2) IV                    (1) No IV    (2) IV
σc ×Const.    1.776***     1.392***    σa ×Const.    2.106***     2.435***
              (0.387)      (0.394)                   (0.452)      (0.743)
   ×Major     0.0576       -0.128         ×Major     -0.310       -0.410
              (0.518)      (0.566)                   (0.605)      (0.925)
   ×Medium    -0.611       -0.615         ×Medium    -0.523       -0.586
              (0.531)      (0.550)                   (0.664)      (0.981)
   ×High      -0.644       -0.833         ×High      -1.126**     -1.077
              (0.647)      (0.675)                   (0.561)      (0.885)
α0            -2.090***    -1.047      α1            0.617***     0.611***
              (0.412)      (1.396)                   (0.0507)     (0.156)

Notes: Standard errors in parentheses; N = 1174; ∗p < 0.1, ∗∗p < 0.05, ∗∗∗p < 0.01.

By taking the logarithm of the advertising expenditure, I assume a decreasing marginal return. As an alternative to the natural logarithm, I test the estimation results using the square root:
\[
\xi_j^{s} = \alpha_0 + \alpha_1 \sqrt{ad_j} .
\]

Under this different mapping from advertising to the private score, the preference parameters look similar to my main results, while the effects of the medium budget level, the high budget level, and director power all lose some significance. The signal-variance parameters all retain the same signs and significance levels. The effect of advertising, α1, is positive and significant at 0.611.

After using instruments, the changes in α0 and α1 also match those in my main results.
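To see how the two concave mappings differ, the sketch below compares their implied marginal returns to advertising at a few spending levels, using the point estimates α1 = 0.781 for the log specification and α1 = 0.611 for the square root; the grid of spending levels is arbitrary and purely illustrative.

```python
import numpy as np

# Marginal effect of an extra unit of advertising on the private score
# under the two mappings. Coefficients are point estimates from the text;
# the spending grid (in $ millions) is an arbitrary illustration.
ad = np.array([1.0, 5.0, 10.0, 20.0, 40.0])

d_log = 0.781 / ad                      # d/d(ad) of a0 + a1 * ln(ad)
d_sqrt = 0.611 / (2.0 * np.sqrt(ad))    # d/d(ad) of a0 + a1 * sqrt(ad)

for a, dl, ds in zip(ad, d_log, d_sqrt):
    print(f"ad = {a:5.1f}m   log: {dl:.4f}   sqrt: {ds:.4f}")
# Both decline in ad (decreasing marginal returns), but the log mapping's
# marginal return falls faster at high spending levels.
```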

C.2.6 Infinite Support of Scores

Under the assumption that the critic score and the audience score follow normal distributions, both scores should have infinite support. However, since the critic score on Metacritic is on a scale of 0 to 100, and the user score on IMDb lies between 1 and 10, both scores have a finite range. To transform the scores to obtain infinite support, I proceed in the following steps:

(1) Rescale both scores to (0, 1), and denote the rescaled scores as z_c and z_a;

(2) Transform the rescaled scores by z′ = ln(z / (1 − z)), which achieves an infinite support;

(3) Demean the transformed scores z′_c and z′_a.

Then, I use the transformed and demeaned scores as the critic score and the audience score that enter consumers' utility (a small sketch of the transformation follows the footnote below).

Footnote 3: For scores that are on the boundary, I modify them so that they fall into this range. There are 3 movies with a critic score of 0, which are modified to 1; one movie has a critic score of 100, which is modified to 99; and one movie has an audience score of 10, which is changed to 9.9.
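A minimal sketch of this three-step transformation, with the boundary adjustment of footnote 3 folded in as a clipping rule; the function name and the eps_frac parameterization are illustrative, not from the text:

```python
import numpy as np

def transform_scores(raw, lo, hi, eps_frac=0.01):
    """Map bounded review scores onto an unbounded support.

    Steps, as in Appendix C.2.6: (1) rescale to (0, 1); (2) apply the
    log-odds transform z' = ln(z / (1 - z)); (3) demean. Boundary scores
    are first nudged into the open interval, mirroring footnote 3
    (e.g., a critic score of 0 on the 0-100 scale is moved to 1).
    """
    raw = np.asarray(raw, dtype=float)
    span = hi - lo
    raw = np.clip(raw, lo + eps_frac * span, hi - eps_frac * span)
    z = (raw - lo) / span                # step (1): rescale to (0, 1)
    z_prime = np.log(z / (1.0 - z))      # step (2): infinite support
    return z_prime - z_prime.mean()      # step (3): demean

# Example: Metacritic critic scores (0-100) and IMDb user scores (1-10).
critic = transform_scores([0, 45, 63, 88, 100], lo=0, hi=100)
audience = transform_scores([3.2, 6.6, 8.1, 10.0], lo=1, hi=10)
```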

The preference parameters are similar to my main results. While the learning parameters have the same signs, several differences are worth pointing out. First, the constant terms in the variances of the critic score and the audience score are much smaller, which comes from the different scaling of the scores. Second, after using instruments for advertising and cold-opening, α0 increases as in my main results, but the effect of advertising, α1, also increases. However, in another regression that uses only ln(budget_j) as the instrument for advertising, α1 drops to 0.845, which matches our expectation. The increase in α1 after using instruments therefore comes from instrumenting the cold-opening choice.

Table C.13: Preference Parameters under Infinite Support

              (1) No IV            (2) IV                             (1) No IV             (2) IV
Disney         0.366*** (0.130)     0.368* (0.199)      # of Movies   -0.109*** (0.0301)    -0.0943*** (0.0319)
Fox            0.348*** (0.0862)    0.354*** (0.116)    Med            0.281*** (0.101)      0.413* (0.241)
Paramount      0.183 (0.117)        0.135 (0.168)       High           0.644*** (0.167)      0.911*** (0.301)
Sony           0.333*** (0.105)     0.335** (0.134)     Director      -0.0687** (0.0273)    -0.0335 (0.0351)
Universal      0.149 (0.111)        0.215 (0.154)       Star          -0.0269 (0.0295)      -0.0363 (0.0358)
Warner        -0.0930 (0.125)       0.0663 (0.199)      Sequel         0.504*** (0.0712)     0.536*** (0.0750)

Notes: Standard errors in parentheses; N = 1174; ∗p < 0.1, ∗∗p < 0.05, ∗∗∗p < 0.01.

Table C.14: Learning Parameters under Infinite Support

                  (1) No IV            (2) IV                               (1) No IV            (2) IV
σc × Const.        1.440*** (0.445)     0.860* (0.490)     σa × Const.       1.056** (0.454)      1.220* (0.642)
   × Major         0.257 (0.740)       -0.337 (0.769)         × Major       -0.543 (0.712)       -0.590 (0.946)
   × Medium       -0.909 (0.751)       -1.059 (0.766)         × Medium      -0.745 (0.760)       -0.700 (1.047)
   × High         -0.333 (1.248)       -1.648 (1.198)         × High        -1.678* (0.910)      -1.407 (1.376)
α0                -2.203*** (0.421)    -1.829 (1.295)      α1                0.926*** (0.0967)    0.962*** (0.296)

Notes: Standard errors in parentheses; N = 1174; ∗p < 0.1, ∗∗p < 0.05, ∗∗∗p < 0.01.

APPENDIX D

Extension

Two issues remain in the counterfactual analyses above: 1) some movies' private scores cannot be solved for; and 2) some of the solved private scores fall outside the range needed to predict critic and audience scores. In this appendix, I discuss these issues in detail and propose several explanations. I then estimate a supply side with heterogeneous costs as a way to extend my model.

D.1 Issues with Private Scores

Based on my demand estimates, I can solve for the private scores ξ_j^s of 509 movies, 86.71% of the 587 movies in my dataset. Figure D.1(a) plots the mean utility βX_j and the critic score for the 78 unsolved movies: they are either cold-opened movies (plotted with a critic score of 0) or regular movies with relatively high critic scores. Their actual advertising expenditures are so low that even with a private score of −∞, their optimal advertising level would still exceed the actual spending. This can be seen in any panel of Figure 2.8: as the private score falls, the optimal advertising curve flattens, a pattern that holds for both cold-opened and regular movies. This asymptotic bound prevents these movies' advertising expenditures from being rationalized.

Even when a movie's private score can be solved, the solution may still be problematic. Figure D.1(b) shows the distribution of the solved private scores for the 509 movies. The private scores look normally distributed, with a positive mean (3.84) and a large standard deviation (17.07). My model assumes that a studio uses this private score, the posterior mean after learning from its private signal, to predict critic and audience scores. Thus, the private score should not be too far from the range of critic and audience scores that we observe in reality. After standardization, the range of the critic score is [−3.74, 2.96], and the range of the audience score is [−5.70, 3.78]. Using the audience score's range, only 37.33% of movies have private scores that fall within it.
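To make the solvability problem concrete, the sketch below inverts a stylized optimal-advertising curve that is increasing in the private score and flattens toward a positive floor, mimicking the asymptotic bound just described. The functional form and parameters are hypothetical, not the model's:

```python
import numpy as np
from scipy.optimize import brentq

def optimal_ad(xi_s, floor=0.5):
    """Stylized optimal-advertising curve: increasing in the private score
    xi_s and flattening toward a positive floor as xi_s -> -infinity,
    mimicking the asymptotic bound described in the text."""
    return floor + np.exp(0.05 * xi_s)

def solve_private_score(ad_obs, lo=-200.0, hi=200.0):
    """Invert optimal_ad at the observed spending level; return None when
    the observed level lies below the asymptotic bound (an 'unsolved' movie)."""
    if ad_obs <= optimal_ad(lo):           # below the flat part: no root exists
        return None
    return brentq(lambda x: optimal_ad(x) - ad_obs, lo, hi)

print(solve_private_score(2.0))   # solvable: returns the implied private score
print(solve_private_score(0.4))   # below the bound: None, cannot be rationalized
```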

[Figure D.1: Movies with Private Score Solved or Unsolved. Panel (a), Characteristics of Unsolved Movies: mean utility, critic score, and advertising expenditure of the 78 unsolved movies. Panel (b), Private Scores of Solved Movies: frequency distribution of the solved private scores.]

Why are there movies whose advertising expenditures cannot be rationalized, and why is the distribution of private scores so wide? Several reasons may explain these patterns.

• Heterogeneous Costs: My supply side assumes that $1 of advertising expenditure costs studios exactly $1. However, the economic cost of advertising may not be one-for-one, and may vary across movie types. For example, big studios tend to buy "upfront" advertising space and allocate that space across the movies they promote. Given a fixed amount of advertising space, increasing the advertising expenditure for one movie may be more costly than its face value, as the promotion of other movies would be crowded out.

• Advertising Shock: A studio's actual advertising expenditure may not be its optimal advertising level. Whether due to measurement error or a sudden change made by the studio, we should allow for an additional shock on the observed advertising level. The Cold Light of Day (2012) is a movie with a very low critic score. Before its domestic release, its studio Summit merged with Lionsgate, affecting the movie's release schedule and marketing. This movie is one of the very few with zero advertising expenditure and suggests the existence of advertising shocks.

• Aggregate Demand Shock: I assume that studios observe the aggregate demand shocks, τ1 in the opening week and τ2 in the post-release period. In reality, a studio most likely observes only part of them. Some movies whose advertising levels I cannot rationalize turn out to have large demand shocks. For example, War Room (2015) had a $3 million production budget and $1.31 million in advertising expenditure. To generate its box office revenue of $67.79 million, however, requires τ1 = 1.81 and τ2 = 2.63. These shocks significantly shift up consumers' expected utility and make the movie's advertising level impossible to rationalize. In fact, if I set τ1 = τ2 = 0 when solving for ξ_j^s, an additional 33 movies can have their private scores solved.

• Unmeasured Promotion: There may be other promotional expenditures that work like advertising but are not measured as advertising expenditures. For example, to promote blockbusters, a studio usually sends the director(s) and cast members of the movie on promotional tours and interviews. These activities can be costly, but they are not measured as advertising. Movies like Jurassic World (2015) and The Hunger Games (2012), with a high budget but a medium advertising expenditure, may have spent more on such other promotional activities.

While the explanations above suggest possible directions for extending my supply side, certain limitations prevent me from incorporating all of these features. As an example, I provide a supply-side estimation in the following section to estimate cost parameters, and discuss its limitations.

D.2 Supply Estimation

I revise a studio’s maximization problem to allow for heterogeneous costs of advertising:

\[
\max_{ad_j} \; \mathrm{E}\left[ s_{j1}(ad_j) \mid \xi_j^{s} \right] N p \eta \;-\; \lambda_j \, ad_j \;+\; \mathrm{E}\left[ s_{j2}(ad_j) \mid \xi_j^{s} \right] N p \eta .
\]

Here, λj is the economic cost of advertising, which depends on some movie characteristics Zj:

\[
\lambda_j = \lambda Z_j .
\]

I include a constant, major studio production, medium and high budget levels, and 7 genres in Z_j. I further allow the observed advertising level to equal the optimal advertising level plus a shock ν_j:
\[
Ad_j = ad_j^{*} + \nu_j .
\]

While the cost parameters λ can be identified by variation in the optimal advertising expenditures across movie characteristics Z_j, we cannot separately identify the two movie-specific shocks: the private score ξ_j^s and the advertising shock ν_j. Here, I consider approximating the unobserved private score by the average of the observed critic and audience scores in the data:
\[
\tilde{\xi}_j^{s} = \frac{\xi_j^{c} + \xi_j^{a}}{2} .
\]

Then, we can calculate the optimal advertising level ad_j^*(ξ̃_j^s; λ) for each movie and use GMM to search for the cost parameters that satisfy:
\[
\hat{\lambda} = \arg\min_{\lambda} \sum_{j,t} \left[ Ad_j - ad_j^{*}\!\left( \tilde{\xi}_j^{s}; \lambda \right) \right]^{2} .
\]
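A minimal sketch of this least-squares search, under stated assumptions: optimal_ad below is a toy placeholder for the model's F.O.C. solver, and the data are synthetic stand-ins for the real sample.

```python
import numpy as np
from scipy.optimize import minimize

def optimal_ad(xi_tilde, lam_j):
    # Toy placeholder for the model's F.O.C. solver: optimal spending rises
    # with the approximated private score and falls with the movie-specific
    # advertising cost. Not the dissertation's actual functional form.
    return np.exp(0.1 * xi_tilde) / lam_j

def objective(lam_params, Z, Ad, xi_tilde):
    lam_j = np.maximum(Z @ lam_params, 1e-6)   # lambda_j = lambda * Z_j, kept positive
    resid = Ad - optimal_ad(xi_tilde, lam_j)   # advertising shock nu_j as the residual
    return np.sum(resid ** 2)                  # the least-squares / GMM criterion

# Synthetic stand-in data: a constant plus 11 binary cost shifters
# (major, budget levels, genres), mimicking the Z_j described in the text.
rng = np.random.default_rng(0)
n, k = 509, 12
Z = np.column_stack([np.ones(n), rng.integers(0, 2, size=(n, k - 1))])
xi_tilde = rng.normal(size=n)                  # (critic + audience) / 2 proxy
true_lam = np.full(k, 0.1)
Ad = optimal_ad(xi_tilde, Z @ true_lam) + 0.05 * rng.normal(size=n)

res = minimize(objective, x0=np.full(k, 0.1), args=(Z, Ad, xi_tilde),
               method="Nelder-Mead", options={"maxiter": 50000, "maxfev": 50000})
print(res.x)   # recovered cost parameters lambda-hat
```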

Cost parameters are estimated as follows:

Table D.1: Cost Parameters

           Estimate   Std. Err.               Estimate   Std. Err.
Const.      1.042      -          Animation    0.184      -
Major       0.0921     -          Comedy      -0.205      -
Medium     -0.316      -          Drama        0.0677     -
High       -0.204      -          Horror       0.183      -
Sequel      0.639      -          Sci-Fi       0.168      -
Action      0.246      -          Thriller    -0.179      -

Major studio productions have a significantly higher advertising cost, with a coefficient of 0.0921, while medium and high budget movies cost less. Sequels cost significantly more to promote. As for genres, comedies and thrillers have a lower cost, while the other 5 genres cost more, compared with the baseline group "other".

The estimation results above are obtained by treating the advertising shock ν_j as the error term that makes the F.O.C. hold. I have also considered leaving out ν_j and treating the private score ξ_j^s as the error term, as in the counterfactuals, to estimate the cost parameters. However, since ξ_j^s enters the F.O.C. nonlinearly and must be solved from it, some movies cannot have ξ_j^s solved, and the set of such movies changes as the cost parameters change. Adding a suitable penalty for movies whose advertising expenditures cannot be rationalized, and estimating the cost parameters with ξ_j^s as the error term, is a possible direction.

APPENDIX E

Additional Tables and Figures

E.1 Data Patterns

[Figure E.1: Weekly Advertising Expenditure. Weekly advertising spending, adspend ($), against week relative to release (−25 to 10), shown separately for regular and cold-opened movies.]

[Figure E.2: Advertising & Production Budget. Advertising expenditure (mln) plotted against production budget (mln).]

Table E.1: Number of Theaters for Movies in Each Week of Life

             week1   week2    week3   week4   week5   week6   week7   week8   week9   week10
Mean         2773    2772     2268    1699    1217    851.8   583.1   425.0   335.4   272.7
Std.Dev.     902.9   912.7    1094    1131    1015    849.8   654.3   494.7   375.6   301.8
Median       2937    2963.5   2412    1651    928     483     305     245     227.5   202.5
Min          572     159      16      7       6       8       5       2       3       1
Max          4404    4404     4276    3918    3752    3572    3239    2979    2757    2754
Obs          653     654      648     641     633     611     586     541     498     460

             week11  week12   week13  week14  week15  week16  week17  week18  week19  week20
Mean         252.7   230.6    206.7   190.2   171.1   160.3   162.2   162.7   138.4   118.9
Std.Dev.     287.4   272.6    240.5   198.9   176.7   171.2   184.3   238.3   184.3   131.5
Median       185     185      166     144     131     126     123     108     79.5    70
Min          5       3        2       2       6       3       2       2       2       3
Max          2460    2967     2161    1746    1660    1466    1260    1705    1209    802
Obs          419     370      327     276     236     187     152     125     108     85

Notes: Based on the dataset used in Chapter 1 with 665 movies.

Table E.2: Weekly Box Office Revenue ($ millions)

             Opening   Week2    Week3    Week4    Week5    Week8    Week10   Week15   Week20
Mean         32.32     16.33    8.993    5.357    3.244    0.848    0.458    0.268    0.176
Std. Dev.    38.42     18.14    10.26    6.670    4.728    1.607    0.871    0.476    0.365
Median       19.73     10.68    5.811    3.033    1.449    0.300    0.224    0.129    0.054
Min          0.583     0.195    0.029    0.0033   0.0066   0.0009   0.0003   0.0015   0.0014
Max          296.2     149.6    81.46    55.78    57.59    18.15    11.06    4.199    2.730
Obs          587       587     584      577      567      488      416      213      82

Notes: Based on the dataset used in Chapter 2 with 587 movies.

Table E.3: Number of Movies with Positive Advertising Expenditure

w-24   w-23   w-22   w-21   w-20   w-19   w-18   w-17   w-16   w-15   w-14   w-13
4      5      5      7      11     11     14     13     14     21     23     34
w-12   w-11   w-10   w-9    w-8    w-7    w-6    w-5    w-4    w-3    w-2    w-1
23     35     57     86     116    178    232    342    520    566    578    586
w0     w1     w2     w3     w4     w5     w6     w7     w8     w9     w10    Total
587    522    301    199    132    84     58     43     35     25     25     587

Notes: Based on the dataset used in Chapter 2 with 587 movies.

Table E.4: Production and Distribution by Major Studios

             Disney   Fox   Paramount   Sony   Universal   Warner
Same Major   30       49    35          52     54          40
Other        12       25    7           16     32          22

Notes: 6 columns correspond to distribution by 6 major studios, with the first row counting the movies produced by the same major studio, and the second row counting those produced by other studios; based on the dataset used in Chapter 2 with 587 movies.

Table E.5: Critic Reviews on Metacritic and Rotten Tomatoes

                      Metacritic                                Rotten Tomatoes
             -3days   Open    Week1   Week2   No.      -3days   Open    Week1   Week2   No.
Mean         62.64    57.16   56.12   55.86   32.47    0.643    0.495   0.477   0.474   33.96
S.D.         15.75    14.39   14.77   14.94   8.992    0.347    0.291   0.288   0.285   10.83
Median       63.33    56.91   56      55.65   33       0.73     0.5     0.47    0.47    35
Min          0        20      20.04   16.88   0        0        0       0       0       0
Max          100      95.41   95.41   95.41   49       1        1       1       1       56
Obs          367      559     584     586     587      347      556     584     586     587

Notes: For each website, five columns correspond to the average score three days before release, the opening date, the beginning of week 1, the beginning of week 2, and the number of reviews at the beginning of week 2; based on the dataset used in Chapter 2 with 587 movies.

Table E.6: Audience Reviews on IMDb and Metacritic

                      IMDb                                             Metacritic
             Average Score           No. of Reviews          Average Score           No. of Reviews
             Week2   Week3   Week4   Week2   Week3   Week4   Week2   Week3   Week4   Week2   Week3   Week4
Mean         6.622   6.553   6.508   78.88   104.8   121.2   6.254   6.178   6.165   24.43   33.24   38.66
S.D.         1.177   1.132   1.137   124.6   157.2   176.6   1.749   1.594   1.545   38.32   50.41   58.02
Median       6.7     6.57    6.52    39      54      62      6.51    6.36    6.36    12      17      20
Min          1.67    2       1.5     1       1       2       0       0       0       0       0       0
Max          10      10      10      1182    1488    1642    10      9.7     9.73    385     493     558
Obs          587     587     587     587     587     587     580     585     585     586     586     586

Notes: Week2, Week3, and Week4 correspond to the beginning of week 2, 3, and 4, respectively; based on the dataset used in Chapter 2 with 587 movies.

Table E.7: Critic Scores on Metacritic

             week1   week2   week3   week4   week5   week6   week7   week8   week9   week10
Mean         62.66   55.80   55.91   55.96   56.18   56.56   56.78   57.45   57.80   58.24
Std.Dev.     15.64   14.88   14.83   14.80   14.76   14.80   14.86   14.73   14.78   14.90
Median       63.77   55.71   55.79   55.74   56.05   56.7    56.95   58      58.27   58.44
Min          0       16.88   16.88   16.88   16.88   16.88   16.88   16.88   16.88   16.88
Max          100     95.41   95.41   95.41   95.41   95.41   95.41   95.41   95.41   95.09
Obs          407     653     648     641     633     611     586     541     499     460

             week11  week12  week13  week14  week15  week16  week17  week18  week19  week20
Mean         58.72   59.70   60.86   61.44   61.44   62.94   64.43   65.70   66.50   66.24
Std.Dev.     14.83   14.63   14.41   14.55   14.53   14.03   14.18   14.31   14.06   14.70
Median       58.63   60      61.62   61.91   62.13   63.62   65.27   66.66   66.54   66.42
Min          16.88   16.88   16.88   16.88   16.88   22.8    22.8    22.8    22.8    22.8
Max          94.77   94.77   94.77   94.77   94.77   94.77   94.77   94.77   94.77   94.77
Obs          419     370     327     276     236     187     152     125     108     85

Notes: Based on the dataset used in Chapter 1 with 665 movies.

Table E.8: Number of Critic Reviews on Metacritic

             week1   week2   week3   week4   week5   week6   week7   week8   week9   week10
Mean         6.361   32.23   32.69   32.90   33.17   33.43   33.73   34.21   34.69   34.83
Std.Dev.     10.01   9.190   9.033   8.972   8.870   8.909   8.843   8.628   8.573   8.651
Median       3       33      34      34      34      35      35      35      36      36
Min          0       0       2       2       2       2       4       4       4       4
Max          45      51      52      49      49      49      49      49      49      49
Obs          653     654     648     641     633     611     586     541     499     460

             week11  week12  week13  week14  week15  week16  week17  week18  week19  week20
Mean         35.20   35.59   36.28   36.43   36.25   36.26   36.76   37.06   37.74   37.53
Std.Dev.     8.596   8.609   8.190   8.377   8.181   8.074   8.295   8.172   8.271   8.819
Median       36      37      38      38      37      37      38      38      39      39
Min          4       4       5       5       5       5       5       5       5       5
Max          49      49      49      49      49      49      49      49      49      49
Obs          419     370     327     276     236     187     152     125     108     85

Notes: Based on the dataset used in Chapter 1 with 665 movies.

Table E.9: User Scores on Metacritic

             week1   week2   week3   week4   week5   week6   week7   week8   week9   week10
Mean         –       6.297   6.239   6.228   6.229   6.250   6.257   6.290   6.325   6.366
Std.Dev.     –       1.773   1.603   1.552   1.497   1.469   1.459   1.458   1.466   1.446
Median       –       6.57    6.44    6.43    6.43    6.465   6.45    6.48    6.53    6.565
Min          –       0       0       0       0       0       0       0       0       0
Max          –       10      10      10      10      10      10      10      10      10
Obs          0       643     644     639     631     610     586     541     499     460

             week11  week12  week13  week14  week15  week16  week17  week18  week19  week20
Mean         6.414   6.524   6.633   6.680   6.672   6.791   6.933   7.009   7.015   7.002
Std.Dev.     1.418   1.351   1.246   1.260   1.266   1.213   1.148   1.084   1.077   1.086
Median       6.6     6.71    6.8     6.87    6.88    7.02    7.2     7.22    7.195   7.13
Min          0       0       1.33    1.33    2.75    3.07    3.07    3.41    3.47    3.57
Max          10      10      9.47    9.47    9.47    8.94    8.69    8.7     8.7     8.72
Obs          419     370     327     276     236     187     152     125     108     85

Notes: Based on the dataset used in Chapter 1 with 665 movies.

Table E.10: Number of User Reviews on Metacritic

             week1   week2   week3   week4   week5   week6   week7   week8   week9   week10
Mean         –       25.61   34.73   38.85   42.97   46.92   49.90   54.40   59.10   62.33
Std.Dev.     –       46.70   63.25   57.00   61.87   65.89   69.63   73.92   77.66   81.38
Median       –       13      17      20      22      24.5    27      29      32      33
Min          –       1       1       1       1       1       1       1       1       1
Max          –       728     1036    558     596     618     633     646     652     661
Obs          0       643     644     639     631     610     586     541     499     460

             week11  week12  week13  week14  week15  week16  week17  week18  week19  week20
Mean         66.89   71.73   77.97   85.37   85.36   87.75   94.99   102.9   112.2   102.7
Std.Dev.     84.91   89.66   94.40   101.6   99.52   104.7   113.7   121.1   128.1   113.5
Median       36      40      46      48.5    49      49      48.5    57      67.5    58
Min          1       2       3       3       3       4       4       5       7       7
Max          666     674     677     680     569     573     576     580     585     588
Obs          419     370     327     276     236     187     152     125     108     85

Notes: Based on the dataset used in Chapter 1 with 665 movies.

E.2 Estimation Results

Table E.11: Other Preference Parameters

                 (1) No IV             (2) IV
Action            0.0132 (0.0743)      0.0287 (0.0776)
Animation         0.448*** (0.165)     0.442** (0.178)
Comedy           -0.0235 (0.0780)     -0.0156 (0.0823)
Drama             0.0161 (0.0840)      0.00563 (0.0875)
Horror            0.451*** (0.110)     0.405*** (0.120)
Sci-Fi            0.124 (0.0902)       0.0888 (0.0931)
Thriller         -0.0371 (0.104)      -0.0490 (0.105)
USA               0.0608 (0.138)       0.0658 (0.147)
USA_co           -0.138 (0.140)       -0.129 (0.148)
R                -0.149 (0.127)       -0.182 (0.130)
PG-13             0.0728 (0.118)       0.0555 (0.120)
3D               -0.0870 (0.0768)     -0.0978 (0.0793)
IMAX              0.111 (0.0966)       0.140 (0.100)
MLK               0.453* (0.240)       0.429* (0.250)
President's      -0.0987 (0.239)      -0.116 (0.241)
Easter            0.0903 (0.301)      -0.0138 (0.311)
Memorial         -0.0319 (0.187)       0.00939 (0.203)
Independence      0.117 (0.230)        0.117 (0.228)
Labor            -0.0230 (0.190)       0.00794 (0.200)
Halloween        -0.313* (0.172)      -0.303* (0.173)
Thanksgiving     -0.229 (0.176)       -0.194 (0.191)
Christmas         0.429** (0.198)      0.453** (0.195)
2012              0.163 (0.102)        0.170 (0.104)
2013              0.0852 (0.106)       0.0885 (0.109)
2014              0.0261 (0.109)       0.0543 (0.112)
2015             -0.118 (0.120)       -0.108 (0.122)
Feb               0.290 (0.178)        0.308* (0.180)
Mar               0.250* (0.150)       0.254* (0.154)
Apr               0.172 (0.138)        0.196 (0.145)
May               0.198 (0.181)        0.188 (0.186)
Jun               0.260* (0.148)       0.268* (0.151)
Jul               0.297** (0.151)      0.296* (0.153)
Aug               0.315** (0.135)      0.291** (0.142)
Sep               0.226 (0.138)        0.228 (0.145)
Oct              -0.129 (0.149)       -0.103 (0.153)
Nov               0.259 (0.163)        0.254 (0.161)
Dec               0.0699 (0.207)       0.0920 (0.207)
Period            0.141** (0.0646)     0.379** (0.161)
Const.           -4.599*** (0.447)    -5.752*** (0.863)

Notes: Estimates from the main estimation of Chapter 2; standard errors in parentheses; N = 1174; ∗p < 0.1, ∗∗p < 0.05, ∗∗∗p < 0.01.

Table E.12: Effect of Interaction Terms for International Regions

                   (1)          (2)          (3)        (4)        (5)        (6)
                   ln(Budget)   Hollywood    Format     Action     Sci-Fi     Sequel
Europe, Middle East, and Africa
  China            -0.345       -0.615        0.0312    -0.295     -0.249     -0.0826
                   (1.560)      (0.832)      (0.378)    (0.375)    (0.322)    (0.338)
  ×                 0.0356       0.475       -0.428      0.221      0.205     -0.274
                   (0.346)      (0.873)      (0.522)    (0.540)    (0.588)    (0.540)
Latin America
  China            -1.142        0.647        0.470      0.172      0.158     -0.0254
                   (1.688)      (0.903)      (0.409)    (0.407)    (0.349)    (0.366)
  ×                 0.305       -0.480       -0.502      0.0875     0.186      0.628
                   (0.374)      (0.947)      (0.565)    (0.585)    (0.637)    (0.585)
Asia Pacific
  China             0.102        0.782        0.363      0.217      0.0627     0.143
                   (1.360)      (0.725)      (0.329)    (0.327)    (0.281)    (0.295)
  ×                 0.00604     -0.724       -0.459     -0.181      0.218     -0.0378
                   (0.301)      (0.761)      (0.455)    (0.471)    (0.512)    (0.472)

Notes: Estimates for 3 regional markets in Chapter 3; standard errors in parentheses; ∗p < 0.1, ∗∗p < 0.05, ∗∗∗p < 0.01.

Table E.13: Effect of Movie Characteristics

             (1) Domestic         (2) Chinese        (3) Intl.           (4) EMA             (5) LA             (6) AP
Disney        0.315** (0.122)      0.188 (0.255)      0.0392 (0.231)     -0.0885 (0.237)      0.0194 (0.259)     0.124 (0.209)
Fox           0.226** (0.102)      0.439* (0.256)     0.551*** (0.192)    0.466** (0.197)     0.569** (0.222)    0.641*** (0.175)
Paramount     0.414*** (0.111)     0.281 (0.261)      0.188 (0.210)       0.0827 (0.215)      0.131 (0.237)      0.207 (0.188)
Sony          0.283*** (0.0915)    0.0127 (0.217)    -0.184 (0.173)      -0.194 (0.177)       0.155 (0.204)      0.161 (0.164)
Universal     0.268*** (0.0952)    0.0748 (0.265)     0.188 (0.180)       0.191 (0.185)      -0.218 (0.207)      0.195 (0.163)
Warner        0.244** (0.113)     -0.0125 (0.262)    -0.0897 (0.214)     -0.297 (0.219)       0.0242 (0.242)    -0.00968 (0.192)
Hollywood     0.207* (0.109)      -0.0886 (0.231)    -0.781*** (0.205)   -0.792*** (0.210)   -0.235 (0.236)     -0.520*** (0.184)

Notes: Estimates from the main estimations in Chapter 3; standard errors in parentheses; ∗p < 0.1, ∗∗p < 0.05, ∗∗∗p < 0.01.

Table E.14: Effect of Movie Characteristics (Cont.)

             (1) Domestic         (2) Chinese        (3) Intl.           (4) EMA             (5) LA             (6) AP
Director     -0.0259 (0.0231)     -0.0336 (0.0717)   -0.0837* (0.0438)   -0.117*** (0.0448)  -0.0479 (0.0495)   -0.0466 (0.0395)
Star         -0.0832*** (0.0287)   0.108 (0.0726)    -0.138** (0.0542)   -0.174*** (0.0555)   0.0130 (0.0623)   -0.0612 (0.0495)
Sequel        0.342*** (0.0734)    0.602*** (0.160)   0.451*** (0.139)    0.551*** (0.142)    0.338** (0.159)    0.529*** (0.128)
ln(Budget)    0.275*** (0.0399)    0.744*** (0.159)   0.681*** (0.0757)   0.619*** (0.0776)   0.646*** (0.0856)  0.763*** (0.0685)
Re-release   -0.878*** (0.240)     1.217** (0.603)   -1.146** (0.454)    -1.111** (0.465)    -0.872* (0.505)    -1.312*** (0.406)
User Score    0.363*** (0.0298)    0.185** (0.0790)   0.252*** (0.0560)   0.257*** (0.0576)   0.133** (0.0641)   0.275*** (0.0506)
Format       -0.0352 (0.0796)      0.165 (0.185)      0.338** (0.151)     0.240 (0.154)       0.404** (0.170)    0.497*** (0.137)
PG-13         0.0262 (0.105)       0.0630 (0.297)     0.345* (0.199)      0.327 (0.204)      -0.145 (0.235)      0.551*** (0.183)
R            -0.135 (0.112)        0.305 (0.335)      0.566*** (0.212)    0.684*** (0.217)   -0.417* (0.248)     0.683*** (0.194)
Action        0.0463 (0.187)       1.747*** (0.447)  -0.0907 (0.354)     -0.148 (0.362)      0.692* (0.394)     -0.107 (0.317)
Animation     0.242 (0.203)        1.207** (0.517)    0.730* (0.383)      0.901** (0.394)     0.976** (0.427)    0.288 (0.343)
Comedy        0.181 (0.180)        1.666*** (0.532)  -0.160 (0.341)      -0.135 (0.349)      0.541 (0.380)      -0.154 (0.305)
Drama        -0.220 (0.182)        1.059** (0.474)   -0.189 (0.344)      -0.143 (0.353)      0.183 (0.384)      -0.381 (0.308)
Horror        0.617*** (0.209)     1.526** (0.666)    1.051*** (0.394)    0.872** (0.404)     2.061*** (0.439)   0.691* (0.353)
Sci-Fi        0.144 (0.190)        1.945*** (0.454)   0.230 (0.359)       0.232 (0.368)      0.679* (0.400)      0.161 (0.323)
Thriller      0.0414 (0.193)       1.953*** (0.487)  -0.116 (0.365)      -0.0935 (0.374)     0.573 (0.408)      -0.302 (0.327)
US Holiday   -0.0166 (0.0731)
CN Holiday    0.135 (0.432)
RS            0.209 (0.199)

Notes: Estimates from the main estimations in Chapter 3; standard errors in parentheses; ∗p < 0.1, ∗∗p < 0.05, ∗∗∗p < 0.01.


Vita

Naibin Chen

Education

Ph.D. Economics (2020), The Pennsylvania State University
M.A. Economics (2014), Peking University
B.A. Economics (2012), Peking University

Teaching Experience

Teaching Assistant, Department of Economics, The Pennsylvania State University
  Intermediate Microeconomic Analysis (2014-2018)
Teaching Assistant, School of Economics, Peking University
  Theory of Industrial Organization (2014)
  Advanced Microeconomics I (2013)

Research Experience

Research Assistant, Penn State University Libraries
  Scholarly Communications and Copyright Office (2019-2020)
Research Assistant, Department of Economics, The Pennsylvania State University
  for Paul Grieco (2018-2019)
  for Peter Newberry (Summer 2017, Summer 2018)
  for Charles Murry (Summer 2016, Summer 2017)
Research Assistant, China City Development Academy (2012)
Research Assistant, Ministry of Environmental Protection of China (2011-2012)

Conferences and Presentations

2020  Bates White, Washington, DC
2019  International Industrial Organization Conference (Rising Stars Session), Boston, MA; IO Brown Bag, The Pennsylvania State University
2018  International Industrial Organization Conference, Indianapolis, IN; China Meeting of the Econometric Society, Shanghai, China; IO Brown Bag, The Pennsylvania State University
2017  IO Brown Bag, The Pennsylvania State University