TWO ESSAYS IN BUSINESS FORECASTING AND DECISION-MAKING
Carmina Caringal Clarke
Doctor of Philosophy, 2007
Australian Graduate School of Management
University of New South Wales
THE UNIVERSITY OF NEW SOUTH WALES Thesis/Dissertation Sheet
Surname or Family name: CLARKE First name: CARMINA CARINGAL Other name/s: Abbreviation for degree as given in the University calendar: PH.D. School: AUSTRALIAN GRADUATE SCHOOL OF MANAGEMENT Faculty: AUSTRALIAN SCHOOL OF BUSINESS Title: TWO ESSAYS IN BUSINESS FORECASTING AND DECISION-MAKING
Abstract 350 words maximum: (PLEASE TYPE)
This dissertation comprises two essays in business decision-making. The first essay is motivated by recent field evidence suggesting significant reliance on conventional techniques (e.g. NPV and DCF) without assessment of the decision profile - its degree of uncertainty, ambiguity and knowledge distribution. However, without knowing the decision profile, the chosen method might not be appropriate given the decision situation. Therefore, essay 1 develops a multi-faceted conceptualization of the decision profile and provides a prescriptive model for choosing appraisal methods based on this profile. Specifically, it prescribes limiting the use of conventional methods to situations of low ambiguity and low uncertainty, and using decision trees, real options, scenario planning and case-based methods as the level of uncertainty increases. In high ambiguity situations, however, the only viable approaches are case-based methods, which do not rely on the perfect information assumption that conventional and alternative methods do.
Case-based methods have been supported theoretically in the case-based decisions and case-based reasoning literatures, but their use in business decision-making lags behind. Possible reasons for this include a lack of concrete applications and limited development of major concepts such as the case memory, similarity and prediction functions. Therefore, essay 2 proposes a model of case-based decisions called similarity-based forecasting (SBF) and applies it to a situation of high uncertainty and ambiguity – namely forecasting movie success. In doing so, it outlines operational definitions of the memory, similarity and prediction functions and, based on data from the entertainment industry, provides empirical support for the hypothesis that case-based methods can be more accurate than regression forecasting; the SBF and combined SBF-regression models were able to predict movie gross revenues with 40% and 50% greater accuracy than regression respectively. This essay concludes with a discussion of some possible directions for future research, including applications using data from other domains and settings, testing the boundary conditions under which the SBF approach should be applied, experiments using SBF under uncertainty and complexity manipulations, and ‘time-stamped’ comparisons with predictions made using information markets (e.g. Hollywood Stock Exchange).
Declaration relating to disposition of project thesis/dissertation
I hereby grant to the University of New South Wales or its agents the right to archive and to make available my thesis or dissertation in whole or in part in the University libraries in all forms of media, now or here after known, subject to the provisions of the Copyright Act 1968. I retain all property rights, such as patent rights. I also retain the right to use in future works (such as articles or books) all or part of this thesis or dissertation.
I also authorise University Microfilms to use the 350 word abstract of my thesis in Dissertation Abstracts International (this is applicable to doctoral theses only).
The University recognises that there may be exceptional circumstances requiring restrictions on copying or conditions on use. Requests for restriction for a period of up to 2 years must be made in writing to the Registrar. Requests for a longer period of restriction may be considered in exceptional circumstances if accompanied by a letter of support from the Supervisor or Head of School. Such requests must be submitted with the thesis/dissertation.
ORIGINALITY STATEMENT ‘I hereby declare that this submission is my own work and to the best of my knowledge it contains no materials previously published or written by another person, or substantial proportions of material which have been accepted for the award of any other degree or diploma at UNSW or any other educational institution, except where due acknowledgement is made in the thesis. Any contribution made to the research by others, with whom I have worked at UNSW or elsewhere, is explicitly acknowledged in the thesis. I also declare that the intellectual content of this thesis is the product of my own work, except to the extent that assistance from others in the project's design and conception or in style, presentation and linguistic expression is acknowledged.’ Signed ...... Date ......
COPYRIGHT STATEMENT ‘I hereby grant the University of New South Wales or its agents the right to archive and to make available my thesis or dissertation in whole or part in the University libraries in all forms of media, now or here after known, subject to the provisions of the Copyright Act 1968. I retain all proprietary rights, such as patent rights. I also retain the right to use in future works (such as articles or books) all or part of this thesis or dissertation. I also authorise University Microfilms to use the 350 word abstract of my thesis in Dissertation Abstract International (this is applicable to doctoral theses only). I have either used no substantial portions of copyright material in my thesis or I have obtained permission to use copyright material; where permission has not been granted I have applied/will apply for a partial restriction of the digital copy of my thesis or dissertation.’ Signed ...... Date ......
AUTHENTICITY STATEMENT ‘I certify that the Library deposit digital copy is a direct equivalent of the final officially approved version of my thesis. No emendation of content has occurred and if there are any minor variations in formatting, they are the result of the conversion to digital format.’ Signed ...... Date ......
ACKNOWLEDGEMENTS
I have many people to thank, for without them this thesis would never have been. I would like to start by thanking my thesis adviser and mentor, Professor Dan Lovallo. Dan’s support was, and continues to be, above and beyond what was expected of an adviser and I am grateful for his kindness, patience and guidance throughout my candidature. I thank my “academic uncle” Professor Colin Camerer at the California Institute of Technology, for taking me under his wing during my time at Cal Tech. I am honoured to have been given the opportunity to study and learn from the best. I also extend my thanks to my thesis committee of Professor Tom Powell, Dr Anna Gunnthorsdottir and Professor Simon Sheather for their insights in the early stages of my thesis.
I thank my family and extended family for continuing to support my every endeavour throughout the last 28 years. Twenty-eight times over I thank my Mum, Dad, my sister Christine, my aunts and uncles for believing in me. Mum and Dad, I will forever be grateful for the sacrifices you made to see my dreams through. I have not forgotten the countless late nights you stayed up with me, the many times you went out of your way to take me to school, to uni, to classes and exams, your generosity in putting me through tutoring and making sure that I had everything I needed (and more) to succeed. Finally, to my wonderful husband Shane, for loving me and being a constant reminder of what is most important in my life.
This thesis is in memory of my beloved grandparents Lola Imang, Lolo Piroy and Lola Goria and to my grandfather Lolo Carling, whom I adore. This thesis was to be his birthday present last year, which sadly he did not get to see. I miss you dearly.
AMBIGUITY, UNCERTAINTY AND KNOWLEDGE DISTRIBUTION:
NEW CONSIDERATIONS FOR DECISION-MAKING AND
INVESTMENT APPRAISAL METHODOLOGY
Essay 1
ABSTRACT
When selecting an appraisal method to evaluate new investment options, the decision’s profile – its degree of uncertainty or ambiguity and the distribution of knowledge – is often relegated to a secondary concern or ignored altogether. The problem, however, is that without knowing the decision profile, it is not known whether the chosen method is in fact appropriate for the decision situation.
This essay develops a prescriptive approach to the evaluation of investment appraisal methods where the central proposition is that the choice of investment appraisal method is endogenously determined by the decision profile. It contributes to the existing body of literature in two ways. Firstly, it develops a multi-faceted decision profile using the degree of ambiguity and uncertainty and its knowledge distribution. Secondly, it establishes a broad outline for a prescriptive model for appraisal methods based on this profile.
The prescribed model has implications for the use of several existing investment appraisal methods. Specifically, it prescribes limiting the use of conventional capital budgeting methods (such as net present values and discounted cash flows) to situations of low ambiguity and low uncertainty, and using decision trees and real options, scenario planning and case-based methods as the level of uncertainty increases. In addition, for decisions under ambiguity, the case-based approach to decision-making is the recommended approach.
The essay concludes by discussing future research avenues, including investigating potential cognitive biases in applying the prescribed approach and alternative methods for investment appraisal under high uncertainty and ambiguity decisions.
TABLE OF CONTENTS
1 Introduction
2 A Conceptual Framework for Decision Profiling
  2.1 Decision Ambiguity
    2.1.1 Ambiguity as Functional Form Uncertainty
    2.1.2 Ambiguity as Incompleteness in Determinant set
    2.1.3 Importance of Ambiguity
  2.2 Decision Uncertainty
    2.2.1 Uncertainty as knowledge of probabilities
    2.2.2 Uncertainty as the Size of the State Space
    2.2.3 Importance of Uncertainty
  2.3 Distribution of Knowledge
    2.3.1 Conceptualizing Knowledge Distribution
    2.3.2 Theoretical Foundations to Heterogeneous Distribution
    2.3.3 Importance of Knowledge Distributions
  2.4 Summary
3 A Prescriptive Model for Appraisal Methods
  3.1 Which Method When? A Discussion of the Proposed Approach
    3.1.1 Conventional Appraisal Methods
    3.1.2 Quantitative Multiple Scenarios
    3.1.3 Qualitative Scenario Analysis and Case-based Decisions
    3.1.4 High Ambiguity Decisions
  3.2 Distribution of Knowledge and Prediction Markets
  3.3 Prescriptive Model for Decision-Making
4 Discussion and Conclusion
References
Appendix
  Note 1: Conventional Approaches
  Note 2: Decision Trees
  Note 3: Real Options
  Note 4: Scenario Analysis
LIST OF FIGURES
Figure 1: Aspects of Knowledge Distribution
Figure 2: Decision Profile Composition
Figure 3: Conventional methods
Figure 4: Quantitative multiple scenarios
Figure 5: Qualitative scenarios and Case-based methods
Figure 6: Ambiguous Decisions and Case-based methods
Figure 7: A Prescriptive model for Appraisal Methods
Figure A1: An overview of Conventional Investment Appraisal Methods
Figure A2: Investment Appraisal example using conventional methods
Figure A3: Calculations with the conventional approach
Figure A4a: Investment Appraisal using Decision Trees & Expected Values
Figure A4b: Cashflow Assumptions by Demand Scenario
Figure A5: Investment Appraisal using Scenario Analysis [O’Brien, 2004]
1 Introduction
Empirical research has found that, regardless of the profile of the investment decision, many businesses are still heavily reliant on traditional budgeting techniques (such as payback and accounting rates of return) and net present value approaches [Graham & Harvey, 2001; Ryan & Ryan, 2002; Brounen, De Jong and Koedijk, 2004]. While this toolkit works well with some types of decisions, it is of limited use in highly ambiguous and uncertain decisions such as investing in a new (and untried) technology, entering a foreign market for the first time, or acquiring a new company.
In theory and in practice, the problem is certainly not a lack of decision tools and processes. There is an ever-expanding toolkit available to business decision-makers, from decision trees, real options and scenario analysis to information markets and case-based decisions. Rather, the problem appears to be a lack of guidance on which of these appraisal methods should be used and when.
One may argue that this is, first and foremost, a result of the existing literature’s lack of attention to the profile of the decision. At best, theoretical research considers only one aspect of the decision profile at a time, with discussions focusing largely on the level of uncertainty in a decision. In instances where the decision profile is considered, it is rarely in the context of deciding on the appropriate appraisal tool to use.
This is reflective of the fact that the research to date has yet to establish any clear links between the decision profile and which investment appraisal tool(s) to use.
Why should this topic be of any interest to a researcher or practitioner? Why is getting the decision process right of any importance? Simply put, without knowing the decision profile (such as the level of ambiguity, uncertainty or knowledge distribution), the decision-maker does not know whether the current decision meets the assumptions or requirements (regarding states, outcomes, probabilities, etc.) that underlie the chosen appraisal method. To the extent that the quality of the decision depends on the quality of the decision-making process, researchers and practitioners interested in making better decisions must first improve the way they make decisions, starting with the appropriate matching of decisions with tools. To be clear, it is not being argued that the decision-maker should always apply the most sophisticated tools, but that he or she should understand that decision approaches need to be matched to the type of decision. It is argued that by systematically addressing the different facets of a decision, decision-makers can then select the appropriate approach, thereby improving their chances of making sound decisions.
Therefore, the objective of this essay is to (i) develop a unified decision profile which accounts for its degree of ambiguity (is the causal model known?), uncertainty (how many possible states and outcomes are there?) and its knowledge distribution (who has information?) and to (ii) provide a broad outline for a prescriptive model for appraisal methods based on this profile. Thus, the aim is to investigate the question of “which investment appraisal method should be used when?” with the key contributions being to:
• introduce a unified, multi-faceted “decision profile” based on its degree of ambiguity and uncertainty, and the distribution of relevant knowledge,
• elaborate on the major facets of ambiguity, uncertainty and knowledge distribution that describe a decision,
• transform the conceptualization of appraisal methods into an endogenous choice determined by the decision profile, and
• develop a broad outline of a prescriptive approach that assesses the decision before determining the appropriate (and viable) appraisal method(s).
The central thesis of this essay is that the choice of appraisal method rests on the decision profile. Specifically, if the decision is unambiguous (where the causal model is known) with low uncertainty (knowledge of all states with one clear outcome) then conventional techniques like net present values (NPV) and discounted cash flows (DCF) can be used to appraise the decision. However, if an unambiguous decision has moderate uncertainty (knowledge of all states with multiple possible outcomes), the decision should be evaluated using quantitative multiple scenario methods such as decision trees and real options analyses, which can handle a limited state space. Similarly, an unambiguous decision under structural ignorance (the state space or the possible outcomes are unknown) should be evaluated using a combination of qualitative scenario analysis and case-based decisions, as these methods do not require complete knowledge of the state space.
For ambiguous decisions (where the causal model is unknown) there would be very few viable approaches. One possible choice would be the case-based decisions approach, which does not have the same requirements regarding decision-specific information, or knowledge of all available alternatives, all relevant attributes and/or the causal model, which are, all or in part, required by conventional techniques, decision trees, real options and scenario analyses.
In addition to all of the above, if the relevant knowledge regarding a decision is heterogeneously dispersed across a market of decision-makers, then prediction markets could serve as a mechanism for aggregating the relevant information. This method, however, only applies to decisions where the state space is known and therefore cannot be used for decisions under structural ignorance.
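The mapping from decision profile to appraisal method described above can be summarised as a small lookup routine. The following is only an illustrative sketch of that reading, not the essay's formal model (which is developed in section 3): the class, field and function names are hypothetical, and the treatment of prediction markets as an add-on whenever knowledge is dispersed and the state space is known is an assumption drawn from the preceding paragraph.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum


class Uncertainty(Enum):
    LOW = "low"                                     # all states known, one clear outcome
    MODERATE = "moderate"                           # all states known, multiple possible outcomes
    STRUCTURAL_IGNORANCE = "structural ignorance"   # state space or outcomes unknown


@dataclass(frozen=True)
class DecisionProfile:
    """Hypothetical container for the three facets of a decision profile."""
    ambiguous: bool             # True if the causal model is unknown
    uncertainty: Uncertainty
    knowledge_dispersed: bool   # True if relevant knowledge is spread across a market


def recommend_methods(profile: DecisionProfile) -> list[str]:
    """Return candidate appraisal methods for a profile (illustrative only)."""
    if profile.ambiguous:
        # Few approaches remain viable when the causal model is unknown.
        methods = ["case-based decisions"]
    elif profile.uncertainty is Uncertainty.LOW:
        methods = ["NPV", "DCF"]
    elif profile.uncertainty is Uncertainty.MODERATE:
        methods = ["decision trees", "real options"]
    else:
        methods = ["qualitative scenario analysis", "case-based decisions"]

    # Assumption: prediction markets supplement the above when knowledge is
    # dispersed, but only if the state space is known.
    state_space_known = profile.uncertainty is not Uncertainty.STRUCTURAL_IGNORANCE
    if profile.knowledge_dispersed and state_space_known:
        methods.append("prediction markets")
    return methods


if __name__ == "__main__":
    profile = DecisionProfile(ambiguous=False,
                              uncertainty=Uncertainty.MODERATE,
                              knowledge_dispersed=True)
    print(recommend_methods(profile))
```

Encoding the profile as an explicit input makes the central proposition concrete: the method is an output of the profile rather than a fixed default.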
The approach of this essay is to begin by developing a conceptual framework to profile a decision situation. Understanding which appraisal methods should be used, and when, requires an understanding of what type of decision situation one is facing.
Therefore, section 2 reviews the existing economics and behavioral decisions literature and develops a decision profile based on its ambiguity, uncertainty and knowledge distribution. Building on this framework, the essay then discusses the applicability of several appraisal methods given differing decision profiles. This discussion is limited to several examples chosen to demonstrate how the proposed framework can help decision-makers choose the appropriate appraisal methods. This is taken up in section 3, which considers the advantages and pitfalls of using certain appraisal methods under varying degrees of ambiguity, uncertainty and knowledge distribution and derives a set of propositions which shape the prescriptive model (shown in Figure 7). In doing so, this essay will also place into context the second essay, which is an application of case-based decision theory to a real decision situation.
This essay closes with a discussion of the potential agenda for decision-making research, which includes further investigation of how other tools and processes fit within this framework, how this framework may be improved with the inclusion of qualitative factors of the decision (such as safety, community value, etc.), and how this framework may operate in practice in light of research findings on cognitive biases.
2 A Conceptual Framework for Decision Profiling
The prescriptive model begins with the development of a unified conceptual framework for profiling a capital investment decision. Before the decision-maker can begin to search through all the possible appraisal tools and processes, he or she must first diagnose the decision they are facing; specifically, the decision’s level of uncertainty, ambiguity and knowledge distribution. While there are other important aspects of the decision, for example, the impact of the decision on the business’s performance, the size of the investment itself and the objective function of the decision-maker, it is argued that these three determinants are crucial in selecting the appraisal method and thus are the chosen focus of this study.
In the existing behavioral decisions and economics literatures, ambiguity, uncertainty and knowledge distribution are generally considered in a piecemeal manner. Studies of decision-making under uncertainty do not jointly consider the ambiguity or the distribution of relevant information or knowledge. Similarly, studies of knowledge distribution have, by and large, sidestepped issues of ambiguity and have only begun dealing with issues of uncertainty.
What is meant by ambiguity and uncertainty? And what does the distribution of knowledge refer to? In this section, the existing literature is surveyed in order to develop a decision profile that would be useful in selecting an appraisal method.
2.1 Decision Ambiguity
Ambiguity refers to two aspects of a decision: the level of uncertainty in describing the relationship between an outcome and its determinants (independent variables), and the uncertainty in defining what the determinants are.
In the existing literature, decision ambiguity primarily refers to causal ambiguity - the uncertainties in describing the relationship between an outcome and its determinants¹. In the resource-based view literature, causal ambiguity has received much attention for its role in creating barriers to imitation (from rival firms) and thus becoming a source of competitive advantage [Lippman and Rumelt, 1982; Rumelt, 1984; Reed and DeFillippi, 1990]. The paradox of this argument is that the presence of causal ambiguity also prevents the focal firm from leveraging its competitive advantage; just as rival firms have no (or partial) knowledge of the systems that create the advantage, neither does the focal firm [King and Zeithaml, 2001].
A recent paper by Powell, Lovallo and Caringal [2006] introduced some structure to the causal ambiguity debate by, among other things, proposing a framework that distinguishes four types of causal ambiguity: characteristic (ambiguity in one or more determinants), measurement (ambiguity in the performance measure), functional form (ambiguity in the nature of the relationship) and incompleteness (ambiguity in the complete set of determinants). Setting aside for the moment concerns regarding measurement of variables, I discuss the last two types of causal ambiguity – functional form and incompleteness – as two facets of a decision profile that should be considered when selecting an appraisal method.
2.1.1 Ambiguity as Functional Form Uncertainty
Decision ambiguity arises when the functional form describing the causal process in which determinants act to produce an outcome is uncertain. That is, a decision is ambiguous because the decision-maker does not know whether the functional form is additive or multiplicative, a linear, power or quadratic function and does not know how the determinants interact with each other (if at all) [Lippman & Rumelt, 1982; Powell et al, 2006].
¹ In the economics stream of research “decisions under uncertainty”, ambiguity has been used to describe uncertainties regarding the probability distribution of outcomes. This point will be taken up in section 2.2 on uncertainty.
Movie studio executives, for example, face an ambiguous production decision if they are uncertain about the way the production budget, star power or critical reviews enter into the performance of a movie at the box office. Do these variables have a direct influence or are they mediating variables? Do critical reviews have a quadratic or cubic relationship with box office receipts? Does the production budget interact with star power? Does star power enter with lags? These types of uncertainties relating to the functional form of the causal model increase the ambiguity of a decision.
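To see why functional form matters for the decision itself, consider a toy comparison of an additive and a multiplicative form applied to the same determinants. This is a minimal sketch only: the two projects, their scores and the unit coefficients are entirely hypothetical and are not drawn from any movie data.

```python
def additive_form(budget, star_power, reviews):
    # Determinants contribute independently to the outcome.
    return budget + star_power + reviews


def multiplicative_form(budget, star_power, reviews):
    # Determinants interact: a weak score on one dampens the others.
    return budget * star_power * reviews


# Hypothetical (budget, star power, reviews) scores for two projects.
PROJECT_A = (9, 1, 1)   # large budget, weak cast and reviews
PROJECT_B = (3, 3, 3)   # balanced on all three determinants


def preferred_project(model):
    """Return the project with the higher predicted outcome under `model`."""
    return "A" if model(*PROJECT_A) > model(*PROJECT_B) else "B"


if __name__ == "__main__":
    print("additive:", preferred_project(additive_form))
    print("multiplicative:", preferred_project(multiplicative_form))
```

Under the additive form project A scores 11 against 9 for project B, whereas under the multiplicative form A scores 9 against 27 for B, so the preferred project reverses. A decision-maker who does not know which form holds cannot rely on a method that presumes one of them.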
2.1.2 Ambiguity as Incompleteness in Determinant set
The second type of ambiguity arises when the knowledge of all determinants is incomplete [Lippman and Rumelt, 1982; King and Zeithaml, 2001]. That is, a decision is ambiguous if the decision-maker does not know what the relevant determinants that influence an outcome are. Even if the movie studio executive knows that star power, critical reviews and production budget are determinants of box office receipts, the decision they face is an ambiguous one if they do not know what other variables might influence success or even how many there may be.
2.1.3 Importance of Ambiguity
The degree of ambiguity of a decision is an important consideration because it implies that methods which require complete knowledge of the causal model – its set of determinants and its functional form – cannot be applied in ambiguous decisions. Ignoring the ambiguity of a decision potentially results in the application of an appraisal method to a situation that does not satisfy its information requirements. Therefore decision-makers should consider the degree of ambiguity of a decision where ambiguity pertains to the lack of knowledge of a complete set of relevant determinants and of the functional form of the causal model.
The vast majority of investment decisions are made with some level of ambiguity in the causal relationship and/or its determinants, which makes assessing the ambiguity of a decision an important consideration. However, according to current research, the most popular investment appraisal tools are those that assume low (or no) ambiguity, such as traditional capital budgeting techniques (NPV and DCF). This heightens the importance of understanding the ambiguity surrounding a decision when choosing which decision-making tool to use.
2.2 Decision Uncertainty
2.2.1 Uncertainty as knowledge of probabilities
The term uncertainty can be traced back to the schema developed by Knight [1921] which drew a distinction between situations of risk and uncertainty². In this framework, risk described a situation where the states of the world and the probabilities of their occurring were known to the decision-maker. Uncertainty, on the other hand, described decisions where the states of the world were naturally defined and known but their probabilities of occurring were unknown.
² Knight’s original schema described three different types of probability situations, which have been simplified to the risk-uncertainty dichotomy mentioned above. Risk covers two of these: naturally equiprobable instances (tossing a fair coin), and instances that are not naturally equiprobable but for which there is a high degree of empirical confidence in the probability distribution. In both situations, the a priori probability distributions are known and thus they are classified as instances of risk.
A fair gamble on the popular coin game “two-up”, where participants wager on the result of a toss of two coins, is a risky decision because the states (HH, TT, H&T) and the probabilities of their occurrence (¼, ¼, ½) are known. Some investment decisions are also risky ventures if there exists good, reliable information for states and prior probability distributions. Many business decisions, however, are made under uncertainty because no reliable prior probability distribution function can be established.
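The two-up probabilities quoted above follow from enumerating the four equally likely tosses of two fair coins and grouping them into the three states. A short sketch of that arithmetic (the function name is illustrative):

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def two_up_distribution():
    """Enumerate two fair coin tosses and group them into two-up states."""
    tosses = list(product("HT", repeat=2))   # HH, HT, TH, TT: equally likely
    counts = Counter()
    for first, second in tosses:
        state = first + second if first == second else "H&T"
        counts[state] += 1
    return {state: Fraction(n, len(tosses)) for state, n in counts.items()}


if __name__ == "__main__":
    for state, probability in two_up_distribution().items():
        print(state, probability)
```

Both the state space and the probabilities can be written down in advance here, which is what makes the gamble a case of risk rather than uncertainty in Knight's sense.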
In the decisions under uncertainty literature and more recently asset pricing theory research, “Knightian uncertainty” appears under the guise of “ambiguity” or “vagueness” in the probability distribution function [see for example Einhorn and Hogarth, 1985; Camerer and Weber, 1992; Epstein and Wang, 1994; Chen and Epstein, 2002; Ghirardato and Marinacci, 2002; Ghirardato, Maccheroni and Marinacci, 2004]. This is not to be confused with the terminology of the resource-based view literature which refers to the lack of knowledge of the causal model that defines outcomes. For the purposes of this study, I align the conceptualization of uncertainty with Knight’s original risk-uncertainty schema, and reserve the term ambiguity to describe gaps in the knowledge of the causal model.
2.2.2 Uncertainty as the Size of the State Space
Knight’s risk-uncertainty dichotomy assumed that the states of the world were naturally given or known to the decision-maker. Therefore a third definition of uncertainty might be where the states and their probabilities are both unknown (or not naturally given). Gilboa and Schmeidler [2001] called this situation structural ignorance. Examples of structural ignorance abound in high-technology industries where the pace and direction of technological change make it difficult for decision-makers to identify a complete set of states. Decisions made under conditions of regulatory change could also face structural ignorance if changes introduce the possibility of an undefined set of new states or future worlds.
This leads to the second and perhaps more well-known conceptualization of uncertainty, namely the range or variance in the outcomes over the set of states of the world³. A decision has low uncertainty if future states of the world are known with sufficient certainty. Investment decisions made in a stable or mature competitive environment face low uncertainty regarding what the future may look like. A decision with a higher degree of uncertainty may instead be facing not one, but possibly multiple (known) futures. Here the state space is finite and bounded, either as discrete scenarios or a range of scenarios. One example would be decisions which are contingent on the entry of a new competitor, or on the response of the industry’s incumbent firm. The success of a retail bank’s new branch, for instance, is contingent on whether the prospective entrant enters the market and, if it does, whether it sets up a competing branch in the same locality, whether it offers the same products, etc. Here, the complete state space is known and finite, but which future state will occur is uncertain, and therefore so too is the performance of the new branch⁴.
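The retail bank example describes a state space that is finite and can be listed in full, even though the state that will materialise is unknown. A minimal sketch of such an enumeration follows; it is restricted, as a simplifying assumption, to the three contingencies named in the text, and the labels are illustrative.

```python
from itertools import product


def branch_state_space():
    """List the future states for the new branch under the three contingencies."""
    # If the prospective entrant stays out, locality and products are moot.
    states = [("no entry", None, None)]
    localities = ("same locality", "different locality")
    offerings = ("same products", "different products")
    for locality, offering in product(localities, offerings):
        states.append(("entry", locality, offering))
    return states


if __name__ == "__main__":
    for state in branch_state_space():
        print(state)
```

The enumeration yields five states: one without entry and four with entry. Quantitative multiple scenario methods presuppose that a list of this kind can be written down; under structural ignorance it cannot.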
2.2.3 Importance of Uncertainty
While decision uncertainty is generally considered when evaluating a decision, it is not considered in the selection of the method of appraisal. Understanding the uncertainty of a decision is important in choosing the appraisal method because methods that cannot adequately model uncertainty, such as NPV, DCF and other point-estimation methods, cannot be applied to decisions with high uncertainty or structural ignorance.
3 This variance in outcome is not to be confused with the variance arising from ‘residual errors’ from estimating the causal model. Variance in outcome due to estimation errors relates to causal ambiguities and not to the uncertainties regarding the states of the world.
4 This conceptualization of uncertainty aligns with the definitions offered by Courtney, Kirkland and Viguerie [1999].
One possible negative consequence of ignoring uncertainty when choosing a method of appraisal is the potential for predictable surprises or disasters [Bazerman and Watkins, 2004]. For example, applying finite scenario methods such as subjective expected utility (SEU) to a decision with structural ignorance requires the infinite (and only partially known) state space to be reduced to a small, discrete subset, thereby necessarily and subjectively ‘deleting’ possible future states in order to process the decision. Applying such methods to structural ignorance, which does not satisfy the requirement of a finite and known state space, leaves open the possibility that a ‘deleted’ future will occur, resulting in disasters such as the Enron collapse and September 11 [Bazerman and Watkins, 2004]. These events belonged to the universe of all possible futures but, whether due to their complexity or their perceived implausibility, they were ignored. Therefore the level of uncertainty of a decision should be assessed before choosing the appropriate method of appraisal.
In the context of investment decisions, prior probabilities are generally not known ex-ante. Thus, the focus will be on situations where the probability distribution function over the state space is not known and our conceptualization of uncertainty rests solely on the size and knowledge of the complete state space.
2.3 Distribution of Knowledge
The third aspect of the decision profile is knowledge distribution, specifically, how the information regarding the decision is dispersed across a collective of people. However, this seemingly simple definition masks the fact that knowledge distribution has many facets; it is not just how information is dispersed (Who knows it?) but its completeness (Is the information perfect or partial?) and its uniqueness (Identical or unique in content?). Knowing how information is distributed is important because we have imperfect, heterogeneous and context-specific knowledge that can benefit from aggregation in many cases.
Compared to ambiguity and uncertainty, knowledge distribution has received the least attention in decision sciences. However, knowledge distribution matters because most decision-making tools assume that the decision-maker has all the relevant information regarding the decision or, if the information is dispersed, that the decision-maker knows how to aggregate it and can do so easily. This section discusses the different aspects of knowledge distribution, thus addressing this gap in the literature.
One point that should be clarified is that the conceptualization of knowledge distribution does not make any assumptions regarding the type of decision-makers or economic ‘agents’ assumed, other than that they are making decisions and they act as independent individuals. For example, these agents can be analysts, executives or line managers charged with making investment decisions, they can be stakeholders making decisions on whether to approve a proposed project or they can be retail investors (“mum-and-dad” investors) making decisions on which stocks to buy and sell.
Furthermore, it is not in the scope of this essay to outline who and what type of agents they ought to be; specifically, it is not intended that the term decision-maker be interpreted as only agents with a high level of expertise, such as experts, academics or consultants. On the contrary, advice from the current literature is that a diverse collective with independent individuals of varying levels of expertise should be sought [Surowiecki, 2004].
2.3.1 Conceptualizing Knowledge Distribution
The distribution of knowledge can be conceptualized as one of four stylized cases.
Figure 1 depicts the variations in knowledge distribution for the example of a collective of five individual decision-makers.
Case 1 depicts the neoclassical model of information, where all decision-makers’ stocks of knowledge are both perfect (complete) and homogenous (identical). That is, every decision-maker has access to all information and to the same content of information [Luthje, Lettl and Herstatt, 2003]. An example of perfect and homogenous knowledge distribution is the decision to invest in a stock in an efficient market. Here, the decision-maker (and everyone else participating in the market) has all the relevant information in the current stock price to inform the decision to invest or not to invest. Similar information can be found in the price of a stock option, where the current exercise price would convey to everyone in the market all the information relevant to the decision to invest or not.
Figure 1: Aspects of Knowledge Distribution
[Figure: four panels, each plotting the stock of knowledge (as a percentage of complete knowledge, up to 100%) held by agents A to E. Case 1: Neoclassical Model (Perfect, Homogenous Distribution). Case 2: Imperfect, Homogenous Distribution (imperfect information at the same degree, X%). Case 3: Imperfect, Heterogeneous Distribution (imperfect information at varying degrees). Case 4: Imperfect, Heterogeneous Distribution and Content (imperfect information at varying degrees; content categories: Domestic Phones, PDAs, Cellular Phones, Facsimiles).]
Case 2 depicts a distribution where decision-makers do not have perfect information (e.g. they do not know the complete state space) but still share the same information; that is, the knowledge is imperfect but homogenous amongst the collective. An example of this would be a group of decision-makers who were a team of telecommunication analysts, all with the same, albeit partial, knowledge of communication products (phones, pagers, personal digital assistants (PDAs), etc.).
Now consider the situation where this collective of decision-makers range in their years of industry experience and intelligence and thus differ in their stock of knowledge. For example, decision-makers B and D have more years of experience and have a greater stock of knowledge compared to ‘apprentices’ A and C. Here the knowledge distribution would look like that shown in case 3, where the knowledge regarding a decision is imperfect and heterogeneous across the collective of decision-makers.
Finally, suppose that on closer examination it appears that the decision-makers vary not only in the level of their information on communication products but also in its content, i.e. what they know. Case 4 illustrates this situation of imperfect and heterogeneous knowledge distribution with the extra complication of the content of knowledge also varying across decision-makers. For example, decision-maker B’s knowledge comprises domestic phones and PDAs while decision-maker C’s knowledge is only on cellular phone products.
In addition, the proportion of decision-makers who possess a particular type of information also describes the distribution of knowledge in a market. Suppose the information relevant to a decision relates to domestic phones. Then the distribution of knowledge regarding domestic phones can be described as being “dispersed”, or shared across 4 out of the 5 in the collective. If, however, the required knowledge relates to PDA products, then the distribution of knowledge would be described as “concentrated” in just 1 of the collective. Sunder [1995] uses the term “thin” markets to describe a concentrated distribution of knowledge, and “thick” markets where knowledge is relatively dispersed.
Therefore the distribution of knowledge is determined by assessing:
• The completeness of knowledge (Perfect or Partial information?)
• The symmetry of knowledge (Homogenous or Heterogeneous?)
• The uniqueness of the content of knowledge (Identical or unique content?)
• The rate of dispersion (What proportion share the relevant information?)
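The four facets listed above can be made concrete with a minimal Python sketch of the five-agent telecommunications example. The knowledge held by agents B and C follows the text; the sets assigned to A, D and E, the treatment of the four product categories as the complete stock of knowledge, and all function names are assumptions made purely for illustration.

```python
# Complete stock of relevant knowledge (the four content categories in Figure 1).
UNIVERSE = {"domestic phones", "PDAs", "cellular phones", "facsimiles"}

# What each agent knows. B and C follow the text; A, D and E are assumed so that
# domestic phones are known by 4 of 5 agents and PDAs by only 1.
knowledge = {
    "A": {"domestic phones"},
    "B": {"domestic phones", "PDAs"},
    "C": {"cellular phones"},
    "D": {"domestic phones", "facsimiles"},
    "E": {"domestic phones"},
}

def completeness(agent):
    """Share of the universe an agent knows (1.0 would be perfect information)."""
    return len(knowledge[agent] & UNIVERSE) / len(UNIVERSE)

def is_symmetric():
    """True when every agent holds the same amount of knowledge."""
    return len({len(items) for items in knowledge.values()}) == 1

def has_identical_content():
    """True when every agent holds exactly the same items of knowledge."""
    return len({frozenset(items) for items in knowledge.values()}) == 1

def dispersion(item):
    """Proportion of agents who hold a given item of knowledge."""
    holders = sum(item in items for items in knowledge.values())
    return holders / len(knowledge)

print("Completeness of B:", completeness("B"))
print("Symmetric:", is_symmetric())
print("Identical content:", has_identical_content())
print("Dispersion of domestic phones:", dispersion("domestic phones"))
print("Dispersion of PDAs:", dispersion("PDAs"))
```

Under these assumed sets the collective corresponds to case 4: knowledge is partial, asymmetric and unique in content, with domestic phone knowledge dispersed (4 of 5 agents) and PDA knowledge concentrated (1 of 5).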
2.3.2 Theoretical Foundations to Heterogeneous Distribution
Luthje et al [2003] provided a summary of theories which have been put forward to explain how the distribution of knowledge might differ from that described by the neoclassical model of perfect and homogenous information. For example, according to the bounded-rationality model attributable to Simon [1957] and Newell and Simon [1972], imperfect and heterogeneous knowledge is the product of varying degrees of limited processing capacity of decision-makers. The key assumption underlying this argument is that decision-makers are boundedly rational, so instead of being able to gather all accessible information, they perceive only a limited but unique fraction of all information. Therefore the distribution of knowledge is heterogeneous because decision-makers possess imperfect and asymmetric information due to limited mental processing capacities.
A second model which the authors described was the contextual development model, which argues that a decision-maker’s knowledge is not only partial but unique because it evolves from a specific context [Fleck, 1997]. The main assumption here is that knowledge development is embedded into the specific domain, situation and milieux (or immediate local environment) that the decision-maker operates in. This, coupled with bounded rationality, explains why the distribution of knowledge is heterogeneous, partial and unique.
Empirically, the argument that knowledge is heterogeneously distributed has been supported by studies such as Rulke and Galaskiewicz [2000] and Luthje et al [2003]. In the latter study the authors investigated the knowledge among three groups of experts in the domain of surgical infection control and hygiene products. The authors found a high variation in domain knowledge between and also within expert groups, which supports both the bounded-rationality and contextual development theories of knowledge distribution.
2.3.3 Importance of Knowledge Distributions
How is the distribution of knowledge and relevant information important to decision-making and the choice of appraisal method? Firstly, the distribution of knowledge indicates how many decision-makers are required to obtain the relevant information. If knowledge distribution exists as described by the neoclassical model (case 1), then only one (or very few) decision-makers need to be surveyed because they all possess the same stock of knowledge. However, if knowledge distribution is imperfect and heterogeneous (case 4), then information aggregation requires a collective to be surveyed, as every decision-maker possesses partial and unique information based on their processing capacity and environment.
Secondly, the distribution of knowledge has implications for the selection of appraisal methods because the effectiveness of an appraisal method depends on whether the actual knowledge distribution of a collective satisfies the distribution the method assumes. For example, the quality of a forecast obtained from information markets largely depends on the knowledge distribution across the market of traders. Therefore knowing the completeness, symmetry, uniqueness and dispersion of information is critical to the effectiveness of information markets.
2.4 Summary
To ensure the appropriate appraisal method is selected, the profile of the decision must first be assessed. This section provided a broad outline of the facets of a decision that bear on the applicability of appraisal methods, namely the degree of ambiguity, the degree of uncertainty and the distribution of knowledge, as summarized in Figure 2. While there are other important aspects of the decision, such as the impact of the decision on the business’s performance, the size of the investment itself and the objective function of the decision-maker, it is argued that these three determinants are crucial in selecting the appraisal method.
Figure 2: Decision Profile Composition
[Figure: the decision profile is composed of three elements, each shown with its determinants and their signs. Degree of Ambiguity: knowledge of the causal model’s functional form (-); knowledge of all model determinants (-). Degree of Uncertainty: knowledge of state space (-); size of state space (+). Distribution of Knowledge: completeness (-); symmetry (-); uniqueness of content (+); rate of dispersion (+).]
3 A Prescriptive Model for Appraisal Methods
Having assessed the decision’s profile – its degree of ambiguity (is the relationship and the determinants of the model known?), uncertainty (how many possible outcomes are there and what is their probability of occurring?) and its knowledge distribution (who has information?), what tools should the decision-maker apply?
This section develops the arguments for an endogenous selection strategy of appraisal methods with the key outcomes being the development of a prescriptive framework and a set of propositions linking the decision profile to the appraisal method. In doing so, it addresses the gap in the current literature where the decision profile is yet to be linked to the choice of decision tools and processes.
Throughout this discussion, examples of appraisal methods will be used to demonstrate the key propositions of the model. It is acknowledged that there is a universe of tools and approaches available to the decision-maker; however, as previously noted, the aim is not to categorize every decision-making tool but rather to use some examples to illustrate how the prescriptive framework could be applied 5. Furthermore, the scope of this discussion is limited to decisions where the probability distribution function is not known ex ante, thereby excluding risk techniques (where probabilities are known) from our analysis.
3.1 Which Method When? A Discussion of the Proposed Approach
Consider the example of PrintDepot, a small stationery company faced with the decision to buy a new self-service photo printing kiosk which will allow its customers to print digital photos instantly without the assistance of store attendants. The investment would expand the existing business model, currently focused on office and occasional stationery, to digital imaging products and services. The cost of procuring the photo kiosk is $100,000 and preliminary forecasts suggest a potential to increase revenues by $25,000 a year. How does PrintDepot decide on this investment?
5 Notes 1 to 4 in the appendix provide outlines of the appraisal methods considered here.
3.1.1 Conventional Appraisal Methods
By far the most common approach in practice is to assess the value of the investment by using the conventional toolkit comprising payback, accounting rate of return, discounted cash flow and net present value 6 [Graham and Harvey, 2001; Ryan and Ryan, 2002; Sandahl and Sjogren, 2003; Brounen, De Jong and Koedijk, 2004]. In general these techniques are bottom-up, deductive methods that begin with estimates of input variables which are then used to construct a forecast value. The investment is appraised by comparing the value of the criterion (payback period in months, discounted free cash flow, etc.) to a corresponding benchmark value.
Figures A2 and A3 in the appendix provide an illustration and the basic calculations for adopting the conventional appraisal approach. Assuming the kiosk has a 6-year life, the payback period is forecast to be 4 years, with a net present value (NPV) of $8,880 and a rate of return on the initial investment (ROI) of 8.9%.
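The point-estimate calculations behind these figures can be sketched in a few lines of Python. The cost, annual revenue and 6-year life come from the text; the 10% discount rate is an assumption (it is not stated in this section, but it yields an NPV close to the figure quoted), as is the treatment of ROI as NPV divided by the initial outlay. The appendix calculations may differ in detail.

```python
def payback_period(cost, annual_cash_flow):
    """Years needed for constant annual cash flows to recover the outlay."""
    return cost / annual_cash_flow

def npv(cost, annual_cash_flow, years, rate):
    """Net present value of constant end-of-year cash flows less the outlay."""
    present_value = sum(
        annual_cash_flow / (1 + rate) ** t for t in range(1, years + 1)
    )
    return present_value - cost

COST = 100_000           # from the text
ANNUAL_REVENUE = 25_000  # from the text
LIFE_YEARS = 6           # from the text
DISCOUNT_RATE = 0.10     # assumption: not stated in this section

kiosk_payback = payback_period(COST, ANNUAL_REVENUE)
kiosk_npv = npv(COST, ANNUAL_REVENUE, LIFE_YEARS, DISCOUNT_RATE)
kiosk_roi = kiosk_npv / COST  # assumption: ROI taken as NPV over initial outlay

print(f"Payback: {kiosk_payback:.1f} years")
print(f"NPV: ${kiosk_npv:,.0f}")
print(f"ROI: {kiosk_roi:.1%}")
```

Under these assumptions the sketch gives a payback of 4 years, an NPV of roughly $8,880 and an ROI of about 8.9%. Each input enters as a single number, which is what makes the approach a point estimate.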
Suppose that PrintDepot has identified one source of uncertainty: the level of customer demand for the new instant photo service, where the profitability of the photo kiosk is contingent on this uncertain demand. Can PrintDepot use the conventional toolkit to assess the now uncertain decision?
6 Refer to Note 1 for a summary of these methods.
The short answer is no. Conventional appraisal methods rely on single point estimates and therefore, on their own, cannot sufficiently describe the range of scenarios resulting from uncertainty. In addition to the assumption that a single, clear future state will result (i.e. a ‘certain’ increase of $25,000 in revenue per annum), conventional methods assume that the increase in demand will translate perfectly into the said increase in revenue. If, however, the relationship between the level of demand and the increase in revenue is ambiguous (i.e. the causal model is unknown), then the actual increase in revenue cannot be easily determined. Therefore, outside the conditions of unambiguous, low uncertainty (single-state) situations, conventional methods do not provide decision-makers with a viable approach to investment appraisal.
Proposition 1: Conventional methods are viable under the conditions of unambiguous and low uncertainty (single-state) decisions.
Figure 3: Conventional methods
[Figure: decision chart. Degree of Ambiguity (full model known?): Yes. Degree of Uncertainty, probabilities unknown (states known?): Yes, single state. Recommended method(s): (1) Conventional Capital Budgeting.]
3.1.2 Quantitative Multiple Scenarios
Assuming for the moment that the decision involves no ambiguity, PrintDepot might consider a quantitative multiple scenarios approach to assess its decision now with uncertainty in the level of customer demand.
A scenario is a story about how the future might turn out [O’Brien, 2004]. Presented as a set, multiple scenarios analysis acknowledges the inherent uncertainty of the future by attempting to capture the range of uncertainty thought to be present [O’Brien, 2004]. An example of a quantitative multiple scenarios approach is the decision trees technique. Decision trees analyze the variance of some quantitative measures (such as payback, DCF or NPV) under different states of the world.
For PrintDepot, the states of the world describe the different levels of customer demand (e.g. demand is high, moderate or low) which determine the attractiveness of the photo kiosk investment. The quantity is aggregated through some probability-weighted function such as an expected value or expected utility. Figure A4 in the appendix presents a decision tree and an expected values analysis for this decision. Depending on the level of demand, NPV could be as high as a profit of $57,300 or as low as a loss of $59,200. Assuming risk neutrality, the expected value of the NPV of the photo kiosk is $4,950.
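The probability-weighted aggregation can be sketched as follows. The high and low NPVs are those quoted above; the moderate-demand NPV and all three probabilities are hypothetical placeholders, since the values used in Figure A4 are not reproduced in this section. The result therefore illustrates the mechanics only and is not expected to match the $4,950 reported.

```python
# Each demand state maps to (probability, NPV in dollars).
# High and low NPVs are from the text; the moderate NPV and every
# probability are hypothetical values chosen for illustration.
scenarios = {
    "high demand": (0.3, 57_300),
    "moderate demand": (0.4, 8_880),
    "low demand": (0.3, -59_200),
}

def expected_npv(tree):
    """Probability-weighted NPV across a finite, fully specified state space."""
    total_probability = sum(p for p, _ in tree.values())
    if abs(total_probability - 1.0) > 1e-9:
        raise ValueError("state probabilities must sum to 1")
    return sum(p * value for p, value in tree.values())

print(f"Expected NPV: ${expected_npv(scenarios):,.0f}")
```

The check that probabilities sum to one makes the method's key requirement explicit: every state of the world must be enumerated before the expectation can be taken.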
Another approach to assessing decisions under uncertainty is the real options approach. The real options approach to investment appraisal is emerging as a useful way to evaluate business decisions under high uncertainty that also require a degree of flexibility [Smith and McCardle, 1999; Miller and Folta, 2002; Janney and Dess, 2004; Coldrick, Longhurst, Ivey and Hannis, 2005; Fichman, Keil and Tiwana, 2005; Kauffman and Li, 2005]. A real options approach allows the possibility of partial investment which then creates the right but not the obligation to pursue a subsequent decision [Janney and Dess, 2004, p60].
Real options analysis is all about gathering information and reducing the uncertainty about a new asset’s future value. Typically, this requires time; perhaps the outcome of an investment depends on some legislative decision, a move by a competitor or research of the target market. Either way, the first part of the real option buys such time without risking a large full-scale commitment or sacrificing the competitive position of a company. Once the uncertainty is reduced or resolved, the company makes subsequent decisions regarding the investment. Therefore, real options create value for the decision-maker facing high uncertainty by offering an approach that is flexible enough to determine the appropriate timing of the investment prior to any full investment [Fichman et al, 2005].
A real options approach would allow PrintDepot to make a partial investment in the photo kiosk technology and wait until the uncertainty in the demand has been reduced or resolved before committing to the new technology. For example, PrintDepot’s strategy might be to launch a small-scale pilot of the photo kiosk and assess the level of customer demand before deciding on a full commitment. The cost of the pilot is similar to the price of a call option, or the right to buy, on the full photo kiosk product. This approach allows PrintDepot to exercise the call option if early market experience supports high customer demand. More importantly, if demand is found to be low, the option allows PrintDepot to abandon the project without having made a full outlay of the investment cost.
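The value of this flexibility can be illustrated with a deliberately simple sketch. It is not an option-pricing model: it assumes the pilot fully reveals demand, ignores the time value of the delay, and assumes the full project's NPV is unchanged after the pilot. The pilot cost, the moderate-demand NPV and the probabilities are hypothetical; only the high and low NPVs come from the text.

```python
PILOT_COST = 5_000  # hypothetical cost of the pilot (the 'option premium')

# (probability, NPV of full investment) per demand state.
# Probabilities and the moderate NPV are hypothetical.
demand_states = [(0.3, 57_300), (0.4, 8_880), (0.3, -59_200)]

def value_commit_now(states):
    """Expected NPV if the full investment is made before demand is known."""
    return sum(p * npv for p, npv in states)

def value_with_pilot(states, pilot_cost):
    """Expected NPV if the pilot reveals demand first.

    The option is exercised only in states with positive NPV; otherwise the
    project is abandoned (payoff 0), so the loss is capped at the pilot cost.
    """
    return sum(p * max(npv, 0) for p, npv in states) - pilot_cost

commit_now = value_commit_now(demand_states)
with_pilot = value_with_pilot(demand_states, PILOT_COST)

print(f"Commit now:  ${commit_now:,.0f}")
print(f"Pilot first: ${with_pilot:,.0f}")
print(f"Value of flexibility: ${with_pilot - commit_now:,.0f}")
```

The difference between the two values arises entirely from truncating the low-demand loss, which is the abandonment right described above.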
Other ways that PrintDepot might adopt a real options approach could be by investing in the software technology of the photo kiosk or forming relationships with paper and ink suppliers for the kiosk. At the same time, it could also conduct consumer market research to reduce (where possible) the level of uncertainty regarding future demand. Full investment into the photo kiosk would eventuate only when demand uncertainty has been reduced (e.g. market research suggests that demand will be moderate to low) or resolved. Again, the real options approach defers full investment in order to gather information regarding the uncertain quantity, while the small, partial investments already made mean that PrintDepot does not sacrifice any advantages by being a ‘late mover’.
Both examples of quantitative multiple scenario methods work well for uncertainties characterized by a small, discrete state space or a continuous but bounded range of states. Now suppose that PrintDepot actually faces structural ignorance and does not know all the possible states of the world. This could be due to multiple sources of uncertainty, such as incumbent competitors’ reactions and an increase in the price of photo supplies, which interact to multiply the number of possible states. If PrintDepot cannot identify all possible states of the world, can it still use these methods?
In short, the answer is no. Underlying both quantitative multiple scenario methods is the assumption of complete knowledge of all possible states of the world (that resolve all uncertainty) and, at a minimum, awareness of the extreme states (e.g. high and low demand). In the absence of these conditions, such methods would give only a partial representation of the decision’s degree of uncertainty. Like conventional appraisal methods, quantitative scenarios also assume that the causal model translating the level of demand to increases in revenue is known. Therefore:
Proposition 2: Quantitative multiple scenario methods (e.g. decision tree analysis and real options) are viable methods for unambiguous decisions under uncertainty (states are known) but not structural ignorance (states are not known).
Figure 4: Quantitative multiple scenarios
[Figure: decision chart. Degree of Ambiguity (full model known?): Yes. Degree of Uncertainty, probabilities unknown (states known?): (1) Yes, single state; (2) Yes, few or limited range of states. Recommended methods: (1) Conventional Capital Budgeting; (2) Quantitative Multiple Scenarios.]
While it is clear that real options should be considered for large capital investments, especially in conjunction with a highly uncertain environment, it should also be noted that there are situations which are less conducive to this approach. For example, real options may erode first-mover advantages if a pilot version of a product or process is launched and then subsequently imitated by competitors. Therefore it can be argued that the imitability of a product or process may limit the use of real options in some situations.
Another consideration is that the creation of the real option itself may carry significant costs. For example, creating an option to change the scale of a project (expand or contract scale) can carry additional costs to allow this flexibility [Fichman et al, 2005]. A business may opt to increase the scale of a project/system (and thus the range of potential benefits) if the circumstances are favorable, or reduce the scale (and thus potential losses) if the circumstances are unfavorable. In doing so, however, the business may increase its project costs [Fichman et al, 2005, p.80]. Because of the increased costs of creating real options, it can be argued that such an approach is best limited to projects that carry significant benefits or payoffs, and is not recommended for smaller-budget, low-impact projects.
Finally, one point of clarification that should be made here is that decision tree analysis and real options are only two examples of quantitative multiple scenario approaches. It is not being suggested that they are the only two ways to approach decisions under uncertainty. This discussion is also not suggesting a level of similarity or difference between the two.
3.1.3 Qualitative Scenario Analysis and Case-based Decisions
If PrintDepot faces structural ignorance and therefore cannot identify all possible states of the world, then what approach can it use? Other examples of decisions under structural ignorance are businesses making early R&D investments, new product developments, and merger and acquisition (M&A) decisions. For these types of decisions, where the universe of possible outcomes is not known (nor naturally bounded), the decision-maker has two possible alternatives, namely qualitative scenario analysis and case-based decisions. Neither approach requires the decision-maker to have information regarding all the choices, states and outcomes, as assumed by conventional and quantitative scenario methods. So what are these approaches and how do they deal with high uncertainty decisions?
Qualitative scenarios operate like the quantitative scenario approach with one key difference. Recall that a scenario is a description of a future history, and is built from realistic combinations of key driver values, with information about the dependent variables, certain events and the interactions between many scenario elements [Bunn and Salo, 1993]. The key difference between quantitative and qualitative scenario methods is that the latter provides a more qualitative and contextual description of how the present will evolve into the future, while the former seeks a numerical analysis of some quantity. In other words, the qualitative scenarios method offers a forecast of the future in terms of a narrative [Schnaars, 1987]. The advantage of this technique is that it permits the development of flexible descriptions which do not force the decision-maker to quantify their understanding of qualitative factors [Bunn and Salo, 1993; Schnaars, 1987].
There are various ways to take a qualitative approach to scenario analysis; however, they all share common steps. An example from O’Brien [2004] is given in Figure A5 in the appendix. The first step to creating narrative scenarios is to create a background or context for the decision. This involves distilling the key issues pertaining to the industry and the organization such as “What is the business model of PrintDepot?”, “How attractive is the self-serve digital photo printing market?” and “What is the general trend in self-service kiosks?” These questions provide a rationale for the investment decision.
The second step creates a list of external factors (and a value range for them). External factors are those that cannot be controlled by the company but influence the attractiveness of the investment. Examples of external factors include the threat of new substitutes to the photo kiosk market (such as lab-quality photo printers for the home) or a decrease in supply of a critical input (such as photo paper) thereby driving up production costs. These external factors and their possible values dictate what scenarios the decision-maker needs to consider.
Therefore the next step is to construct narratives of multiple scenarios regarding the states of the world and the possible outcomes. How profitable will the investment be if the price of photo paper increases? How will demand for the photo kiosk service be affected by an increase in demand for home photo printers? Finally, using these scenarios and the possible states and outcomes described by them, the decision-maker constructs a strategy that is robust across the scenarios analyzed.
This qualitative approach is viable for decisions under structural ignorance because it does not attempt to describe all possible states of the world and does not depend on knowledge of quantities and probabilities.
Another approach that is useful in decisions under high uncertainty is the case-based decisions approach. Case-based decisions differ from quantitative scenarios in several important ways. Firstly, to apply quantitative methods such as decision trees and expected value (or utility) analysis, the decision-maker necessarily engages in hypothetical thinking in order to derive all possible states, all possible choices, and all possible outcomes of each choice for each given state of the world. Secondly, if using expected utilities, the decision-maker is also required to derive the desirability (or utility) of each scenario and must also obtain priors over the probability of it occurring. These requirements make applying quantitative scenarios to decisions under structural ignorance an enormous task.
What is the case-based decisions approach and how can it help the decision-maker facing structural ignorance? In a nutshell, case-based decisions (or CBD) uses analogies to similar past experiences and examples to assess a decision. The basic CBD model, as proposed by Gilboa and Schmeidler [1999], describes decision-making based on the similarity-weighted utility of previous actions. Specifically, CBD selects the choice or action a that maximizes the similarity-weighted desirability u of the results r obtained from previous cases q held in memory M, where s(p,q) denotes the similarity between decision problems p and q, that is:
$$\max_{a} \; U_{p,M}(a) = \sum_{(q,a,r) \in M} s(p,q)\, u(r)$$
Because CBD does not require the decision-maker to construct discrete future histories of the world nor have full information regarding them, it presents a viable and flexible approach to assessing a decision where all possible states and outcomes are generally not known to the decision-maker 7.
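A minimal sketch of this similarity-weighted rule is given below. The structure follows the formula above (a memory of cases, a similarity function and a utility for each result), but every specific is an assumption made for illustration: the two numeric features describing each problem, the past cases and their utilities, and the choice of an exponentially decaying similarity function, which is only one of many admissible forms.

```python
import math
from collections import defaultdict

# Memory M: past cases as (problem q, action a, utility of result u(r)).
# Problems are described by two hypothetical features scaled to [0, 1]
# (e.g. store foot traffic and local camera ownership); utilities are in
# hypothetical units. None of these values come from the source.
memory = [
    ((0.80, 0.90), "buy kiosk", 40.0),
    ((0.70, 0.40), "buy kiosk", -10.0),
    ((0.20, 0.30), "buy kiosk", -30.0),
    ((0.75, 0.80), "do not buy", 0.0),
]

def similarity(p, q):
    """Assumed similarity s(p, q): 1 for identical problems, decaying with distance."""
    return math.exp(-math.dist(p, q))

def cbd_scores(p, cases):
    """U_{p,M}(a): sum of s(p, q) * u(r) over remembered cases in which a was chosen."""
    scores = defaultdict(float)
    for q, action, utility in cases:
        scores[action] += similarity(p, q) * utility
    return dict(scores)

def cbd_choice(p, cases):
    """Action with the highest similarity-weighted utility."""
    scores = cbd_scores(p, cases)
    return max(scores, key=scores.get)

current_problem = (0.80, 0.85)
print(cbd_scores(current_problem, memory))
print(cbd_choice(current_problem, memory))
```

Note that the sketch never enumerates states of the world or assigns probabilities to them; the only inputs are remembered cases and a judgment of how similar they are to the problem at hand.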
In sum, qualitative scenario analysis and CBD are useful approaches to decisions under uncertainty because they do away with constructing discrete future histories of the world and instead focus on analogous previous examples and experiences to guide the decision [Gilboa and Schmeidler, 1995, 2001]. Thus, the scenario planner and the case-based decision-maker need not know all the possible courses of action, states of the world or possible outcomes, which makes these two approaches useful for situations with structural ignorance. Therefore:
Proposition 3: Assuming unambiguous decisions, qualitative scenario analysis and case-based decisions are viable methods for situations under structural ignorance (states are not known).
7 An example for PrintDepot is provided in the next section.
Figure 5: Qualitative scenarios and Case-based methods
[Figure: decision chart. Degree of Ambiguity (full model known?): Yes. Degree of Uncertainty, probabilities unknown (states known?): (1) Yes, single state; (2) Yes, few or limited range of states; (3) No. Recommended methods: (1) Conventional Capital Budgeting; (2) Quantitative Multiple Scenarios; (3) Qualitative Scenarios + Case-Based Decisions.]
3.1.4 High Ambiguity Decisions
The discussion so far has assumed that PrintDepot is aware of the causal model that relates the attractiveness of the investment to its set of determinants; that is, it faces a low degree of ambiguity because it knows what determines the financial attractiveness of the photo kiosk.
Suppose however that PrintDepot does not know the exact nature of this relationship (its functional form) or which and how many determinants there are. If PrintDepot knows that the level of demand is a determinant but cannot rule out the possibility of other factors, such as the cost of photo paper supplies, the price of color ink, the expected ease-of-use for customers and whether this might interact with the convenience of the store’s location, then it faces a decision under ambiguity. If PrintDepot does not know the causal model for the profitability or financial attractiveness of the kiosk, what are its options? For such situations, PrintDepot’s options narrow significantly. There are very few approaches which do not require knowledge of the causal relationship between the determinants and the outcome.
One viable approach is CBD, because it does not require the complete knowledge of the state space and the causal model that conventional techniques and quantitative scenarios assume. And while the qualitative approach of scenario analysis was useful in situations of structural ignorance, it also requires detailed and decision-specific information (background to the photo kiosk and photo printing markets, the context of the problem, relevant themes and factors, etc.) in order to arrive at projections of financial attractiveness. Under high ambiguity, using any of these methods would be a partial analysis at best.
The key to CBD in situations of high ambiguity is that it does not require complete knowledge of the current situation that is required by all other approaches [Gilboa and
Schmeidler, 1995, 2001]. Instead, CBD considers evidence from previous similar cases to formulate a decision.
To take the CBD approach, PrintDepot would look at similar cases of entry into the self-service photo printing market or into other markets also characterized by convenience- and service-sensitive demand. It would then assess the similarity between past cases and the one it currently faces, and obtain a similarity-weighted average of financial attractiveness (e.g. profitability, increase in revenue, etc.) to formulate a decision. Using this approach, PrintDepot can arrive at an investment decision under high ambiguity where other methods cannot. Therefore:
Proposition 4: Decisions with ambiguity (causal model not known) should be assessed using the case-based decision approach.
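The similarity-weighted average described above can be sketched as follows. The similarity scores and returns are invented for illustration and are not drawn from any PrintDepot data; the function name is likewise hypothetical.

```python
def similarity_weighted_average(similarities, outcomes):
    """Average past outcomes, weighting each case by its similarity to the current one."""
    total = sum(similarities)
    if total == 0:
        raise ValueError("no similar cases available")
    return sum(s * y for s, y in zip(similarities, outcomes)) / total

# Hypothetical past market entries: similarity to the kiosk decision (0 to 1)
# and the return on investment (%) each one achieved.
similarities = [0.9, 0.6, 0.3]
roi_percent = [12.0, 5.0, -4.0]

forecast = similarity_weighted_average(similarities, roi_percent)
print(round(forecast, 2))
```

The more similar a past case is judged to be, the more its outcome pulls the forecast; no causal model of kiosk profitability is specified anywhere in the calculation.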
Figure 6: Ambiguous Decisions and Case-based methods
[Decision tree. Degree of ambiguity: full model known? No: (4) Case-Based Decisions. Yes: degree of uncertainty (Pr unknown), states known? Yes, single state: (1) Conventional Capital Budgeting. Yes, few or limited range of states: (2) Quantitative Multiple Scenarios. No: (3) Qualitative Scenarios + Case-Based Decisions.]
In the discussions so far, it has been assumed that ambiguity resides only in the causal model, that is, ambiguity in the determinants and how they interact to bring about an outcome. What if there was also ambiguity in the similarity function; that is, the determinants of similarity were unknown?
Even in such instances, CBD can still provide the decision-maker with a viable approach. This viability is achieved through the flexibility of how the similarity function is defined; CBD allows similarity assessments to be either rule-based or judged 8. This discussion is presented in detail in the second essay of this thesis. In short, the main argument is that where the key attributes of similarity are known (that is, the similarity function is unambiguous), a rule-based algorithm can be used. Rule-based similarity requires that the decision-maker knows, and has complete information on, the relevant features of the decision (and thus the similarity function) and their relative importance. Examples of rule-based similarity
8 Refer to Section 3.2 – Similarity Assessments
functions are Euclidean distance, nearest-neighbor and city-block measures [see Liao et al, 1998] and the feature-matching similarity model [Tversky, 1977].
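As a rough illustration of what rule-based similarity looks like in practice, the sketch below implements Euclidean and city-block distance and a simple form of Tversky's feature-matching (contrast) model. The distance-to-similarity conversion, the use of set size as the salience measure, the default weights and the feature labels are all assumptions made for this example.

```python
import math

def euclidean_distance(x, y):
    """Straight-line distance between two numeric feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def city_block_distance(x, y):
    """Sum of absolute differences across features."""
    return sum(abs(a - b) for a, b in zip(x, y))

def distance_to_similarity(d):
    """Assumed conversion: similarity is 1 at distance 0 and decays toward 0."""
    return 1.0 / (1.0 + d)

def tversky_similarity(a, b, theta=1.0, alpha=0.5, beta=0.5):
    """Contrast model: reward common features, penalize distinctive ones.

    Set size stands in for the salience function (an assumption).
    """
    a, b = set(a), set(b)
    return theta * len(a & b) - alpha * len(a - b) - beta * len(b - a)

print(euclidean_distance((0, 0), (3, 4)))   # 5.0
print(city_block_distance((0, 0), (3, 4)))  # 7
print(tversky_similarity({"kiosk", "mall", "self-service"},
                         {"kiosk", "airport", "self-service"}))  # 1.0
```

Every one of these rules presupposes that the relevant features and their weights are already known, which is exactly the information requirement discussed above.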
If however, there is ambiguity in the similarity function, then rule-based algorithms would only offer a partial representation of the similarity between two or more cases.
For these situations, judged similarity assessments are prescribed. Judged similarity assessments are obtained using ratings from a collective of individuals (e.g. analysts, investors, voters, etc). They do not explicitly require knowledge of the relevant features or their salience and also have the advantage of being able to aggregate knowledge of similarity that may be dispersed across a collective. Therefore regardless of whether ambiguity resides in the decision or in the similarity function, CBD offers a viable approach to decisions under ambiguity.
3.2 Distribution of Knowledge and Prediction Markets
One aspect of the decision profile not yet considered is whether and how information relevant to the decision may be dispersed. An underlying assumption in the discussion so far is that all the relevant knowledge regarding the investment decision resides with the decision-makers in PrintDepot; that is the distribution of knowledge is
‘concentrated’.
Consider now the situation where the information regarding the financial attractiveness of the photo kiosk is dispersed across a diverse group of people (e.g. in marketing, sales, R&D, accounting, retail or with the end-users of the product) who all possess a partial yet unique knowledge base relevant to the decision. Is there a method that can aggregate such information for PrintDepot?
Theoretical and empirical evidence regarding prediction markets suggests the answer is yes. Prediction markets, also known as “information markets”, “futures markets” or “virtual markets”, are collectives of people who trade contracts based on their private information regarding the underlying asset.
Examples of online prediction markets abound: the Iowa Poll, the Hollywood Stock Exchange, Betfair, Tradesports, NewsFutures, Foresight Exchange and the Chicago Mercantile Exchange.
More recently, the use of prediction markets has gained commercial interest; Hewlett-Packard has pioneered applications in sales forecasting and now uses prediction markets in several business units. Initial results have been encouraging, with sales forecasts from the prediction markets found to be more accurate than official company forecasts [Chen and Plott, 2002].
The theoretical argument for prediction markets rests on the efficient market hypothesis which states that the market price is the best predictor of an event occurring.
That is, the price reflects the total aggregate information of all participants in the market [Fama 1970; Grossman, 1976].
How can prediction markets help PrintDepot with its investment decision? One way would be to adopt the approach used by Hewlett-Packard [see Chen and Plott,
2002] and run an internal prediction market consisting of a diverse cross-section of its staff (sales, accounts, operations, design, etc.). Over a defined period of time, this collective would buy and sell shares of a certain contract on a virtual market where the value of the contract is tied to, for example, the future demand for the photo kiosk.
The market could have three contracts, where contract A is “Demand is between 0-1000 photos/day (low)”, contract B is “Demand is between 1000-10,000 photos/day (moderate)”, and contract C is “Demand is greater than 10,000 photos/day (high)”. Assuming the market is ‘efficient’ 9, the prices for these contracts will reflect the total information in the market regarding the demand for the photo kiosk.
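One way to read such prices is sketched below. It assumes, purely for illustration, that each contract pays one unit if its demand band occurs and nothing otherwise, so that normalized prices can be interpreted as the market's implied probabilities. The closing prices are hypothetical.

```python
def implied_probabilities(prices):
    """Normalize contract prices so they sum to 1 and can be read as probabilities."""
    total = sum(prices.values())
    if total <= 0:
        raise ValueError("prices must sum to a positive number")
    return {contract: price / total for contract, price in prices.items()}

# Hypothetical closing prices for the three demand contracts.
closing_prices = {"A": 0.20, "B": 0.55, "C": 0.30}

probs = implied_probabilities(closing_prices)
most_likely = max(probs, key=probs.get)
print(most_likely, probs)
```

Normalizing handles the common case where prices do not sum exactly to one. The contract with the highest implied probability is the market's consensus demand band.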
Sunder [1995] discussed the moderating influence of the state space on the efficiency of information market aggregation. Specifically, Sunder posited that an increase in the state space would result in the delay and dilution of the convergence to the rational expectations equilibrium [p.454]. In other words, the effectiveness of the aggregating mechanism of information markets decreases as the size of the state space increases.
Therefore, assuming the conditions of diversity, independence and decentralization, information markets can efficiently aggregate relevant information for decisions characterized by uncertainty (but not structural ignorance) and a dispersed knowledge distribution.
Proposition 5: Assuming knowledge of all states of the world, information markets are a viable approach for decisions where the distribution of knowledge is dispersed across a market of diverse, independent and decentralized decision-makers with an efficient aggregating mechanism.
CBD also offers a method of aggregating information when knowledge is dispersed.
Instead of a market mechanism, CBD can aggregate dispersed information either by asking decision-makers to recall previous decisions or to assess similarities of previous examples to the new one. Therefore, for ambiguous decisions where knowledge is dispersed, CBD should also be considered for aggregating information.
3.3 Prescriptive Model for Decision-Making
Figure 7 presents the prescriptive model that describes the steps to matching the decision tools and processes to the decision itself. The central proposition here is that
9 An ‘efficient’ collective is diverse, independent, decentralized and can aggregate dispersed information across a market [Sunder, 1995].
the choice of investment appraisal method is endogenously determined by the degree of ambiguity, uncertainty and knowledge distribution of the decision. Specifically, the prescriptive model proposes limiting the use of conventional techniques to unambiguous, low uncertainty decisions. Outside these conditions, the information requirements of conventional models would not allow the decision-maker to formulate a decision without making simplifying assumptions regarding the functional form and determinant set of the causal model and the degree of certainty of the future state. Given that the majority of business investment decisions are characterized by some form of uncertainty and ambiguity, this proposition challenges the widespread, unconditional reliance on conventional methods reported by recent field surveys [Graham and
Harvey, 2001; Ryan and Ryan, 2002; Brounen et al, 2004].
Assuming unambiguous situations, for decisions under Knightian uncertainty (all states are known), quantitative multiple scenario methods such as decision trees, expected values or utility theory and real options should be used instead of conventional single-point estimates. For structural ignorance (states unknown), the only viable methods are qualitative scenario analysis and CBD. The available methods narrow further when decisions are ambiguous. Here, where decision-makers do not know the causal model that brings about an outcome, CBD is the suggested approach. Finally, the model prescribes the use of prediction markets to aggregate information when it is dispersed throughout a collective of individuals.
Figure 7: A Prescriptive model for Appraisal Methods
[Decision tree with three levels. Degree of ambiguity: full model known? (Yes / No). Degree of uncertainty (Pr unknown): states known? (Yes, single state / Yes, few or limited range of states / No). Knowledge distribution: knowledge distributed? (Yes / No). Recommended methods: (1) Conventional Capital Budgeting; (2) Quantitative Multiple Scenarios; (3) Qualitative Scenarios + Case-Based Decisions; (4) Case-Based Decisions; (5) Prediction Markets, combined with the method otherwise recommended.]
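The prescriptive model can be read as a small lookup from the decision profile to a set of recommended methods. The sketch below is one interpretation of Propositions 3 to 5 and the summary above, not a transcription of the figure: the function name, the three-way coding of the state space and the rule that prediction markets are added only when states are known are assumptions of this example.

```python
def recommend_methods(model_known, states, knowledge_dispersed):
    """Map a decision profile to recommended appraisal methods.

    states: "single" (one known state), "range" (few or a limited range of
    known states) or "unknown" (structural ignorance).
    """
    if states not in {"single", "range", "unknown"}:
        raise ValueError("states must be 'single', 'range' or 'unknown'")

    if not model_known:
        # Ambiguous decision: the causal model is not known.
        methods = ["case-based decisions"]
    elif states == "single":
        methods = ["conventional capital budgeting"]
    elif states == "range":
        methods = ["quantitative multiple scenarios"]
    else:
        methods = ["qualitative scenarios", "case-based decisions"]

    # Prediction markets are added only when states are known (assumed reading).
    if knowledge_dispersed and states != "unknown":
        methods.append("prediction markets")
    return methods

print(recommend_methods(model_known=True, states="range", knowledge_dispersed=True))
```

Because the function returns a list rather than a single method, it leaves room for the decision-maker to use other tools alongside the recommended ones.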
The prescriptive framework presented here can also accommodate ‘hybrid’ techniques; for example, some forms of expected utility are a hybrid of NPV and decision tree analyses, while CBD approaches can be combined with prediction markets that obtain the relevant information regarding the similarity between two examples. The critical message of this prescriptive model is that, whatever tools the decision-maker uses to appraise a decision, the method prescribed here should be among them. Therefore, a decision-maker facing uncertainty can apply any of the conventional budgeting techniques of NPV, payback and DCF, as long as the methodology also includes some quantitative scenario methods such as decision trees or real options. Similarly, if faced with ambiguity, the decision-maker can choose to adopt a decision tree analysis as long as they also look at the decision from a case-based decisions approach.
Therefore, the prescriptive framework presented here does not attempt to limit the decision-maker’s toolkit; rather, its aim is to remind and inform decision-makers that the recommended approach should be amongst the tools applied to the decision.
4 Discussion and Conclusion
The intended contributions of this study were two-fold. First, this study added to the existing literature by developing a multi-faceted conceptualization of a decision. In previous studies, decision ambiguity, uncertainty and distribution of knowledge have been considered, but never in unison. This study provides a launch pad for unifying the current piecemeal approach to assessing decisions.
The second contribution of this essay was the re-definition of investment appraisal methods as an endogenous choice variable. Apart from Courtney, Kirkland and Viguerie [1999], who highlighted similar contingencies but for the degree of uncertainty alone, there have been no other holistic discussions that link the decision itself to the appropriate choice of appraisal method.
In theory and in practice, the problem is certainly not the lack of decision tools and processes. There is an ever-expanding toolkit available, although the uptake of these tools in practice is not necessarily in line with the most recent developments in research.
Despite the increase in new approaches, empirical evidence still suggests a heavy reliance on traditional budgeting techniques (such as payback, accounting rates of return), and net present value approaches [Graham & Harvey, 2001; Ryan & Ryan,
2002; Brounen, et al, 2004].
Other approaches are beginning to rival the popularity of such conventional methods. Evidence from Petry and Sprow [1993] and Ryan and Ryan [2002] shows a significant proportion of companies using scenario analysis. The survey by Petry and Sprow [1993] showed that between 35% and 50% of companies (depending on the industry) used the scenario analysis approach to capital investments. Ryan and Ryan [2002] also found that 42% of surveyed companies use scenario analysis often (>=75% of the time), with a total of 68% using it for at least 50% of their capital decisions. This evidence suggests a reasonable acceptance of scenario analysis in budgeting practice.
In contrast, there is relatively weak evidence for the use of real options as a capital budgeting tool. Ryan and Ryan [2002] found that a small proportion of surveyed companies (2%) used real options more than 75% of the time, while 11.8% used them more than 50% of the time. This evidence suggests that real options have received less corporate acceptance than scenario analysis and conventional techniques.
Even less empirical information exists on the uptake of newer approaches such as
CBD and prediction markets. While there are many examples of online prediction markets such as the Iowa Poll, the Hollywood Stock Exchange and NewsFutures, there is only one reported case of their commercial use: the Hewlett-Packard application (jointly with the California Institute of Technology), which used prediction markets to predict quarterly sales volumes.
For CBD, the only applications to date are four stylized applications and an empirical application in essay 2 of this thesis. The four stylized applications are Gilboa and Schmeidler’s [1997] application to consumer choice for repeated decisions, Blonski [1999] on social learning patterns, Gayer, Gilboa and Lieberman [2004] on apartment rental prices, and Jahnke, Chwolka and Simons [2005] on the co-ordination of demand and supply for industrial services with service-sensitive customers (e.g. restaurants, beauty salons, etc). No corporate or commercial applications have been documented to date.
One important direction for future research is to ascertain the reason for the disconnect between theory and practice for investment appraisal. There is empirical evidence to support a significant gap between what is available and what is being practiced, but not enough in terms of providing explanations for the existence of the gap. In this essay, we posit that the reason there is no matching between the decision profile and the choice of decision tool is that the literature has approached the use of decision tools without considering the decision profile, and that a clear link between the two has yet to be established. Because the contributions of this essay hinge on this assumption, research into the disconnect between theory and practice is an important next step.
There are many more directions in which future research could be directed. Firstly, the present framework considered uncertainty, ambiguity and knowledge distribution as the three facets of a decision profile. However one may argue that there are other equally important factors to selecting the appropriate decision process or tool that have not been considered. Perhaps the size and/or the impact of the investment on the business may play a role in determining the appropriate decision approach. For example, would it be appropriate to construct a prediction market for smaller projects that are expected to have low incremental value to the business simply because there is believed to be information dispersed across many decision-makers? Conversely, would it be appropriate to rely on conventional capital budgeting techniques alone if the investment were a very large one that could make or break the business?
Apart from size, future research might also consider qualitative values or aspects of an investment as another criterion for determining the appropriate decision making tool. Quantitative methods such as NPV and DCFs are inadequate when assessing
decisions which have non-quantitative aspects such as safety, national pride, environmental effects, and societal and community development. Future research can also help establish what the appropriate decision processes and tools are for these types of decisions.
Another future research direction would be to seek out other methods for evaluating decisions under ambiguity and/or extreme uncertainty situations. The prescriptive model presented here recommended CBD as the approach for decisions under these circumstances; however it was not the intention to suggest that this be the only method available to the decision maker. Therefore, future research could pay greater attention to compiling approaches and tools for the decision-maker facing ambiguity. For example, Courtney and Lovallo [2004] suggested a belief-based backward induction method for assessing decisions under structural ignorance. Given the high degree of difficulty of decision-making under these conditions, future research should be directed to developing improved methods specifically targeting decisions with high ambiguity and structural ignorance.
Another suggestion for future development would be to expand the research on knowledge distribution. Knowledge distribution is not a well-understood concept and is a new addition to the assessment of appraisal methods. Any future agenda for decision-making research should include further developments to the arguments that tie the use of information markets to the distribution of knowledge.
One interesting avenue would be to seek out other ways to aggregate information which is dispersed across a market. Can forums such as conferences, working groups and meetings be used as aggregating mechanisms for information? What conditions
would need to be satisfied to ensure that these forums do aggregate the relevant information regarding a decision?
A large part of this agenda on knowledge distribution could also focus on the question of who to ask in order to collect the relevant intelligence on a given subject matter. The question of identifying which decision-makers possess what type of information was not discussed in the present paper, or anywhere in the existing literature on information markets, but would be useful in ensuring that the market has the relevant information to aggregate. Almost certainly, developments here will contribute to the literature on experimental asset markets, the efficient market hypothesis and information aggregation mechanisms.
One weakness of the present framework is that it assumes that the decision-maker does not suffer from cognitive biases of over-confidence or over-optimism when assessing the ambiguity, uncertainty and knowledge distribution components of the decision. Therefore, a suggested improvement to the framework would be to develop a more precise set of questions which can help decision-makers more accurately assess the decision they are facing. For example, the question “is the functional form known?” used to determine ambiguity may be refined as “can you rule out all functional forms but a linear one?”
Finally, a potential improvement to this framework includes extensions to other appraisal methods not discussed here; for example, regression methods, Monte Carlo simulations, expert methods, etc. This paper is by no means a comprehensive list of all available appraisal methods. On the contrary, a small and manageable list of the more popular methods with a couple of emerging methods was discussed to develop and demonstrate the prescriptive framework. Therefore, future research could investigate
the applicability of other methods, for instance regression models. Underpinning classical linear regression models is the assumption that the model is correctly specified, that is, that all the independent variables and the functional form of the model are known [Gujarati, 1995, p.66]. Therefore, regression would only be a viable decision support tool for decisions where the model’s functional form (linear, quadratic, power, etc) and the complete list of determinants are both known and sufficient data on both independent and dependent variables exist. This implies that regression should be limited to unambiguous decisions where there exists an economic theory explaining the relationship between the variables. While this may be the case for the classical linear regression model, further investigation would be needed on the requirements of more sophisticated econometric techniques.
References
Bazerman, M.H. and M.D. Watkins, Predictable Surprises: The disasters you should have seen
coming and how to prevent them, (Boston, MA: Harvard Business School Press,
2004).
Beinhocker, E.D., “Robust Adaptive Strategies,” Sloan Management Review , Spring (1999),
95-106
Berg, J. and T. Rietz, “Prediction Markets as Decision Support Systems,” Information
Systems Frontiers, 5 (2003), 79-93.
Blonski, M., “Social Learning with case-based decisions,” Journal of Economic Behavior and
Organization , 38 (1999), 59-77.
Brounen, D., A. de Jong, and K. Koedijk, “Corporate finance in Europe: confronting
theory with practice,” Financial Management, 33 (2004), 71-102.
Bunn, D.W. and A.A. Salo, “Forecasting with Scenarios,” European Journal of Operational
Research, 68 (1993), 291-303.
Chen, K and C.R. Plott, Information aggregation mechanisms: Concept, design and implementation
for a sales forecasting problem , Social Science Working Paper 1131 (Pasadena, CA:
California Institute of Technology, 2002)
Camerer, C.F. and M.W. Weber, “Recent Developments in modeling preferences:
Uncertainty and Ambiguity,” Journal of Risk and Uncertainty , 5 (1992), 325-70.
Chen, Z. and L.G. Epstein, “Ambiguity, Risk and Asset Returns in continuous Time,”
Econometrica , 70 (2002), 1403-43.
Coldrick, S., P. Longhurst, P. Ivey and J. Hannis, “An R&D options selection model
for investment decisions,” Technovation , 25 (2005), 185-193.
Courtney, H., J. Kirkland and P. Viguerie, “Strategy Under Uncertainty,” in Harvard
Business Review on Managing Uncertainty, (Boston, MA: Harvard Business School
Press, 1999).
Courtney, H. and D. Lovallo, “Predicting the ‘Unpredictable’: Bringing Rigor and
Reality to Early-Stage R&D Decisions,” Research Technology Management,
September-October (2004), 40-45.
Einhorn, H.J. and R.M. Hogarth, “Ambiguity and Uncertainty in Probabilistic
Inference,” Psychological Review , 92 (1985), 433-61.
Epstein, L.G. and T. Wang, “Intertemporal Asset Pricing under Knightian
Uncertainty,” Econometrica , 62 (1994), 283-322.
Fama, E.F., “Efficient capital markets: A review of theory and empirical work,” Journal
of Finance, 25 (1970), 383-417.
Fleck, J., “Contingent knowledge and technology development,” Technology Analysis and
Strategic Management , 9 (1997), 383-97.
Fichman, R.G., M. Keil and A. Tiwana, “Beyond Valuation: ‘Options Thinking’ in IT
Project Management,” California Management Review , 47 (2005), 74-96.
Gayer, G., I. Gilboa and O. Lieberman, Rule-Based and Case-Based Reasoning in Housing
Prices , Cowles Foundation Discussion Paper No. 1493 (New Haven, CT: Yale
University, 2004)
Ghirardato, P., F. Maccheroni and M. Marinacci, “Differentiating ambiguity and
ambiguity attitude,” Journal of Economic Theory , 118 (2004), 133-73.
Ghirardato, P. and M. Marinacci, “Ambiguity Made Precise: A Comparative
Foundation,” Journal of Economic Theory , 102 (2002), 251-89.
Gilboa, I. and D. Schmeidler, “Case-Based Decision Theory,” The Quarterly Journal of
Economics , 110 (1995), 605-39.
__, and __, “Cumulative Utility and Consumer Theory,” International Economic Review ,
38 (1997), 737-61.
__, and __, A Theory of Case-Based Decisions, (Cambridge, UK: Cambridge University
Press, 2001)
Graham, J.R. and C.R. Harvey, “The theory and practice of corporate finance:
evidence from the field,” Journal of Financial Economics , 60 (2001), 187-243.
Grossman, S., “On the efficiency of competitive stock markets where traders have
diverse information,” Journal of Finance, 31 (1976), 573-85.
Gujarati, D.N., Basic Econometrics, 3rd Edition, (New York, NY: McGraw-Hill, 1995)
Jahnke, H., A. Chwolka and D. Simons, “Coordinating Service-Sensitive Demand and
Capacity by Adaptive Decision Making: An Application of Case-Based
Decision Theory,” Decision Sciences, 36 (2005), 1-32.
Janney, J.J. and G.G. Dess, “Can real-options analysis improve decision-making?
Promises and pitfalls,” Academy of Management Executive , 18 (2004), 60-75.
Kauffman, R.J. and X. Li, “Technology Competition and Optimal Investment Timing:
A Real Options Perspective,” IEEE Transactions on Engineering Management , 52
(2005), 15-29.
King, A. W. & C. Zeithaml, “Competencies and firm performance: Examining the
causal ambiguity paradox,” Strategic Management Journal , 22 (2001), 75-99.
Knight, F.H., Risk, Uncertainty and Profit , (Hart, Schaffner and Marx; Houghton Mifflin,
1921), Library of Economics and Liberty (23 May 2006)
Liao, T.W., Z. Zhang and C. Mount, “Similarity Measures for Retrieval in Case-Based
Reasoning Systems,” Artificial Intelligence, 7 (1998), 267-88.
Lippman, S. A. & R. Rumelt, “Uncertain imitability: An analysis of interfirm
differences in efficiency under competition,” Bell Journal of Economics , 13 (1982),
418-38.
Luthje, C., C. Lettl and C. Herstatt, “Knowledge distribution among market experts: a
closer look into the efficiency of information gathering for innovation
projects,” International Journal of Technology Management , 26 (2003), 561-77.
Miller, K.D. and T.B. Folta, “Option value and entry timing,” Strategic Management
Journal , 23 (2002), 655-665.
Newell, A. and H.A. Simon, Human problem solving (Englewood Cliffs, NJ: Prentice-Hall,
1972).
O’Brien, F.A., “Scenario planning – lessons for practice from teaching and learning,”
European Journal of Operational Research , 152 (2004), 709-22.
Pennock, D., S. Lawrence, F.A. Nielsen and C.L. Giles, “Extracting collective
probabilistic forecasts from web games,” in Proceedings of the seventh ACM
SIGKDD International conference on Knowledge Discovery and Data Mining, (New York,
NY: ACM Press, 2001).
Petry G.H. and J. Sprow, “The Theory and Practice of Finance in the 1990s,” The
Quarterly Review of Economics and Finance , 33 (1993), 359-381.
Plott, C.R., “Markets as Information Gathering Tools,” Southern Economic Journal , 67
(2000), 1-15.
Plott, C.R. and S. Sunder, “Efficiency of Experimental Security Markets with Insider
Information,” Journal of Political Economy , 90 (1982), 663-98.
Plott, C.R. and S. Sunder, “Rational Expectations and the Aggregation of Diverse
Information in Laboratory Security Markets,” Econometrica , 56 (1988), 1085-118.
Powell, T.C., D. Lovallo and C. Caringal, “Causal Ambiguity, Management Perception
and Firm Performance,” Academy of Management Review, 31 (2006), 175-96.
Reed, R. & R. DeFillippi, “Causal ambiguity, barriers to imitation, and sustainable
competitive advantage,” Academy of Management Review, 15 (1990), 88-102.
Ryan, P.A. and G.P. Ryan, “Capital Budgeting Practices of the Fortune 1000: How
Have Things Changed?” Journal of Business and Management , 8 (2002), 355-64.
Rulke, D.L. and J. Galaskiewicz, “Distribution of Knowledge, Group Network
Structure, and Group Performance,” Management Science, 46 (2000), 612-25.
Rumelt, R., “Towards a strategic theory of the firm,” In Lamb, R. (Ed.), Competitive
strategic management , (Englewood Cliffs, NJ: Prentice-Hall, 1984).
Schnaars, S.P., “How to Develop and Use Scenarios,” Long Range Planning , 20 (1987),
105-114.
Simon, H.A., Models of Man (New York, NY: Wiley, 1957).
Sandahl, G. and Sjogren S., “Capital budgeting methods among Sweden’s largest
groups of companies. The state of the art and a comparison with earlier
studies,” International Journal of Production Economics , 84 (2003), 51-69.
Servan-Schreiber, E., J. Wolfers, D. Pennock and B. Galebach, “Prediction Markets:
Does Money Matter?” Electronic Markets , 14 (2004), 243-51.
Smith, J.E, and K.F. McCardle, “Options in the Real World: Lessons in Evaluating Oil
and Gas Investments”, Operations Research 47 (1999), 1-15.
Sunder, S., “Experimental Asset Markets: A Survey,” in Kagel, J.H. and A.E. Roth
(Eds), Handbook of Experimental Economics, (Princeton, NJ: Princeton University
Press, 1995), 445-500.
Surowiecki, J., The Wisdom of Crowds. Why the Many Are Smarter than the Few and How
Collective Wisdom Shapes Business, Economies, Societies and Nations , (New York, NY:
Doubleday, 2004), 174-83.
Tversky, A., “Features of Similarity”, Psychological Review, 84 (1977), 327-52.
Watson, I. and F. Marir, “Case-Based Reasoning: A Review,” The Knowledge Engineering
Review, 9 (1994), 355-81.
Wolfers, J. and E. Zitzewitz, Prediction Markets , National Bureau of Economic Research
Working Paper 10504 (Cambridge, MA: National Bureau of Economic
Research, 2004)
Wolfers, J. and E. Zitzewitz, Prediction Markets in Theory and Practice , Institute for the
Study of Labor (IZA) Discussion Paper 1991 (Bonn, Germany: IZA, 2006)
Appendix
Note 1: Conventional Approaches
Figure A1: An overview of Conventional Investment Appraisal Methods
Payback Criteria: The payback method calculates the term in which a particular investment would pay itself back – the payback period. In practice, decisions using this method compare the payback period to a benchmark, and options which have payback periods longer than the benchmark are not considered.

Accounting Rate of Return (e.g. ROI, ROA, etc.): Accounting rate of return is a broad term used to describe a criterion based on rates of return of an investment such as “return on investment” (ROI) and “return on assets” (ROA). As with payback, the term return can take the form of cost savings, incremental profit or value-add (appreciation).

Discounting Techniques (e.g. DCF, NPV, IRR): One of the major limitations of both payback and accounting rates of return is that neither method discounts returns forecasted in the future. Discounting techniques refer to methods which take into account the time value of money when considering alternative investments. Key concepts include present value (PV), net present value (NPV) and discounted cash flow (DCF). A fourth key concept is the internal rate of return (IRR), the discount rate at which NPV equals zero.
PV = Future Value / 1( + d)t
57 DCF = PV(Cashflow 1) + PV(Cashflow 2) + … + PV(Cashflow n)
n ( ) = ∑ PV Cashflow t t=1
n n ( ) ( ) NPV= ∑ PV Cashflow t - ∑ PV Investment t t=1 t=1
The underlying concept of the discounting method is that the value of money changes over time, and so too must the valuation of cash flows forecasted into the future. That is, a $25,000 cost saving in the future is worth less than a $25,000 cost saving today. The present value of a quantity reflects the value of a future quantity discounted at the rate d over the number of periods t into the future. The further into the future a return, cost saving or increase in sales is, the greater the discount.

Discounted cash flow is simply the sum of the present values of all future cash flows resulting from an investment. Cash flows further into the future are more heavily discounted, by a factor of (1 + d)^t, reflecting the time value of money.

Net present value is the discounted cash flow net of the present value of the total investment outlaid. A positive NPV indicates that the present value of all future cash flows of an investment is greater than the investment outlaid.
In the case of PrintDepot, the payback period of the $100,000 investment, with an increase in revenue of $25,000 per annum, is 4 years. Figures A2 and A3 provide a graphical representation and the calculations of the payback period. However, an increase in revenue in the future is not worth the same as an increase in revenue now. Therefore we apply a discount factor to calculate the value of the revenue increase in today’s dollars (i.e. the present value or PV of future cash flows). As shown in the table below, assuming a discount rate of 10%, a $25,000 increase in revenue is actually worth only $22,730 in period 1, $20,660 in period 2 and so forth.
Figure A2: Investment Appraisal example using conventional methods
[Chart: Net Cash Flow and NPV in $(‘000) plotted over periods 0 to 6; annotated with Payback = 4 and NPV = $(20.75).]
Figure A3: Calculations with the conventional approach

Investment Outlay: $(100)
Discount Factor: 1.10

Period (years)                 1         2         3         4         5         6
Cash flow                    $25       $25       $25       $25       $25       $25
Aggregate Cash flow          $25       $50       $75      $100      $125      $150
PV (Cash flow)            $22.73    $20.66    $18.78    $17.08    $15.52    $14.11
Discounted Cash Flow      $22.73    $43.39    $62.17    $79.25    $94.77   $108.88
Rate of Return (ROI)        -77%      -57%      -38%      -21%       -5%        9%
Net Present Value       $(77.27)  $(56.61)  $(37.83)  $(20.75)   $(5.23)     $8.88

Payback Period: 4
Dollar values in (‘000)
The sum of all these discounted future cash flows is called the discounted cash flow or DCF, which is equivalent to $108,880 over 6 years. The net present value or NPV is simply this discounted cash flow less the initial investment. Therefore the NPV is $8,880.
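The PrintDepot calculations can be reproduced with a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, and it assumes that the whole $100,000 outlay occurs at period 0 (so it is not discounted) and that cash flows arrive at the end of each period.

```python
# Illustrative sketch of the conventional appraisal measures in Note 1.
# Assumption: the entire outlay occurs at period 0 and is not discounted.

def present_value(future_value, rate, period):
    """PV = Future Value / (1 + d)^t."""
    return future_value / (1 + rate) ** period


def discounted_cash_flow(cash_flows, rate):
    """Sum of the present values of cash flows received in periods 1..n."""
    return sum(present_value(cf, rate, t)
               for t, cf in enumerate(cash_flows, start=1))


def net_present_value(outlay, cash_flows, rate):
    """Discounted cash flow net of the (undiscounted, period-0) outlay."""
    return discounted_cash_flow(cash_flows, rate) - outlay


def payback_period(outlay, cash_flows):
    """First period in which cumulative undiscounted cash flow covers the outlay."""
    cumulative = 0.0
    for t, cf in enumerate(cash_flows, start=1):
        cumulative += cf
        if cumulative >= outlay:
            return t
    return None  # the investment never pays back within the horizon


# PrintDepot figures from the text, in $'000.
outlay = 100.0
cash_flows = [25.0] * 6
rate = 0.10

payback = payback_period(outlay, cash_flows)
dcf = discounted_cash_flow(cash_flows, rate)
npv = net_present_value(outlay, cash_flows, rate)

print(f"Payback period: {payback} years")
print(f"DCF: ${dcf:.2f}k, NPV: ${npv:.2f}k")
```

With these inputs the sketch gives a payback period of 4 years, a DCF of about $108.88 thousand and an NPV of about $8.88 thousand, in line with Figure A3.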
Note 2: Decision Trees
The decision tree method analyzes the likely impact of external factors on the attractiveness of an investment. The most popular framework used to assess outcomes under uncertainty is the SEU (subjective expected value or utility) framework, which calculates the expected value (return or outcome) of an investment by weighting the different outcomes by their probability of occurring.
Figure A4a illustrates a decision tree for PrintDepot, with scenario assumptions in Figure A4b. Depending on the level of demand, NPV could be as high as $57,300 or as low as -$59,200. Using the probabilities in Figure A4a, the expected value for the NPV of the technology investment is calculated to be $4,950.
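The expected value quoted above is a probability-weighted sum of the scenario NPVs. The following minimal sketch reproduces that arithmetic using the probabilities and NPVs stated for the three demand scenarios; the dictionary layout and function name are illustrative assumptions only.

```python
# Illustrative sketch: expected NPV across demand scenarios.
# Probabilities and NPVs are those stated for the PrintDepot decision tree.

scenarios = {
    "High": (0.20, 57_300),
    "Moderate": (0.60, 8_880),
    "Low": (0.20, -59_200),
}


def expected_value(scenarios):
    """Weight each scenario outcome by its probability and sum the results."""
    total_probability = sum(p for p, _ in scenarios.values())
    if abs(total_probability - 1.0) > 1e-9:
        raise ValueError("Scenario probabilities must sum to 1")
    return sum(p * outcome for p, outcome in scenarios.values())


ev = expected_value(scenarios)
print(f"Expected NPV: ${ev:,.0f}")
```

The weighted sum comes to $4,948, which the text reports rounded to $4,950.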
Figure A4a: Investment Appraisal using Decision Trees & Expected Values

Invest in photo kiosks?
    No:  Status Quo
    Yes: Level of Demand
             High      Pr(H)=20%    NPV = $57,300
             Moderate  Pr(M)=60%    NPV = $8,880
             Low       Pr(L)=20%    NPV = -$59,200
         Expected Value = $4,950
Figure A4b: Cashflow Assumptions by Demand Scenario

              Period (years)
Scenario         1      2      3      4      5      6
High           $25    $25    $25    $25    $25    $25
Moderate       $10    $15    $20    $20    $25    $25
Low             $5     $5    $10    $10    $15    $15

Dollar values in (‘000)
Note 3: Real Options
A real option approach allows the possibility of partial investment, which then creates the right but not the obligation to pursue a subsequent decision [Janney and Dess, 2004, p60]. Real options analysis is all about gathering information and reducing the uncertainty about a new asset’s future value. Typically, this requires time; perhaps the outcome of an investment depends on some legislative decision, a move by a competitor or research of the target market. Either way, the first stage of any real option approach ‘buys in’, and so buys such time, without risking a large full-scale commitment or sacrificing the competitive position of a company. Once the uncertainty is reduced or resolved, the company makes subsequent decisions regarding the investment (which may mean another partial or full investment).
The most commonly used investment option is the delayed entry option. An example of this would be toehold investments such as Yankee 24’s deferred entry into the electronic point of sales (POS) systems market in the late 1980s [Fichman, Keil and Tiwana, 2005]. Yankee 24 management was concerned not only about the uncertain uptake by retailers and consumers but also about potential banking regulations that could slow the rate of POS system adoption. The “wait and see” strategy allowed Yankee 24 management to gather information to reduce the uncertainty regarding banking regulations and consumer and retailer uptake before deciding on a full commitment. Faced with a high uncertainty decision, Yankee 24 management was able to avoid the large irreversible commitments associated with full immediate implementation of POS, without being “locked out” of the market entirely.
A second type of real option is one that allows managers immediate entry and therefore access to the benefits of early involvement. Facing uncertainty regarding the direction of technological change and the uptake of several competing operating systems, Microsoft crafted a strategy that would allow it to participate in each market by developing technologies for the Windows and DOS operating systems, for OS/2 computers, and office suite applications (Word, Excel, etc.) for Macintosh [Beinhocker, 1999]. By developing both the Windows and DOS operating systems, Microsoft positioned itself for immediate entry with whichever operating system customers chose. By developing OS/2 with IBM and software suites for Macintosh, Microsoft positioned itself as a viable competitor regardless of whether IBM or Macintosh won out in the computer market.
Note 4: Scenario Analysis
Figure A5: Investment Appraisal using Scenario Analysis [O’Brien, 2004]

1. Create a Backdrop or context for decision.
   Brainstorm all relevant issues relating to the organization (e.g. strategic direction) and the industry.
   Questions: What is the business model of the company? How attractive is the self-serve photo printing market? What is the general trend in self-service kiosks?

2. List External Factors and value range.
   External factors cannot be controlled by the company but influence the attractiveness of an investment.
   Questions: Are there threats of substitution for this product/service? Could a decrease in supply of instant photo paper drive up costs?

3. Develop Scenarios and outcomes.
   Combinations of external factor outcomes define scenarios. These are extrapolated and investment outcomes developed.
   Questions: What are the 4 most likely scenarios facing the company? What is the best case and worst cases?

4. Construct a Decision or Strategy.
   Develop a decision or investment strategy based on the outcomes of the scenario analysis. Analyze consistency and robustness of strategy.
   Questions: How does the strategy fare under certain scenarios? Is the positive outcome robust over all states of the world?
SIMILARITY-BASED FORECASTING OF EARLY MOTION PICTURE
BOX OFFICE REVENUE
Essay 2
ABSTRACT
This essay develops an operational model of the case-based approach, called similarity-based forecasting (SBF), and applies it to forecasting motion picture success. It contributes to the existing case-based literature in several ways. Firstly, it provides an operational schema for implementing one version of the case-based approach to forecasting. Key innovations of the SBF model include:
• A memory database where cases are objectively (and unbiasedly) obtained
and not dependent on subjective human recall,
• Predictions which were based on individual case similarity not pooled
similarity thereby allowing high-similarity cases greater influence than
under the original grouped similarity method,
• A model that includes both rule-based similarity and judged similarity
ratings where the choice depends on the conceptualization of the case, and
• Replacement of pre-determined ad-hoc rules with an endogenous selection
method for cases entering into the prediction.
Secondly, it provides empirical support for the hypothesis that case-based methods can be more accurate than regression. Using data from the entertainment industry and similarity ratings obtained from a survey of movie-goers, SBF was able to predict movie gross revenues with over 50% greater accuracy than the benchmark of hedonic linear regression.
Thirdly, this study also developed a combined SBF-regression forecast which was more accurate in its predictions than regression or SBF alone. While the combined forecast was over 55% more accurate than regression, the marginal gain in accuracy over SBF was not statistically significant (t=0.95, p>0.30).
In all, the findings suggest that a case-based approach to forecasting offers significant improvements in predictive accuracy over regression. This essay discusses some possible avenues for future research including applications using data from other domains and settings, testing the boundary conditions for which the SBF approach should be applied, experiments using SBF under uncertainty and complexity manipulations, and ‘time-stamped’ comparisons with predictions made using information markets (e.g. Hollywood Stock Exchange).
TABLE OF CONTENTS
1 Introduction
2 Review of Related Literature
  2.1 Case-Based Decision Theory (CBDT)
    2.1.1 The Basic CBDT Model
    2.1.2 Case-based prediction- An extension
    2.1.3 Empirical Applications
    2.1.4 Cognitive Concerns
  2.2 Case-Based Reasoning
    2.2.1 Comparisons with CBDT
    2.2.2 Empirical applications
    2.2.3 Similarity and Prediction Functions
3 Similarity-Based Forecasting Model
  3.1 Cases and Memory Database
  3.2 Similarity Assessment
    3.2.1 Rule-Based Similarity
    3.2.2 Judged Similarity
    3.2.3 Summary of Similarity measures
    3.2.4 Aggregating Function for Judged Similarity
  3.3 Formulating a Prediction
    3.3.1 Prediction Function
    3.3.2 Endogenous case limits
  3.4 Summary
4 Methodology for Motion Picture Application
  4.1 Background to the Motion Picture Problem
    4.1.1 Objectives
    4.1.2 Empirical Research
    4.1.3 Why SBF for forecasting movies?
  4.2 Data
    4.2.1 Target Cases
    4.2.2 Previous Cases
    4.2.3 Case Selection Algorithm
  4.3 Methodology for Similarity Assessment
    4.3.1 Respondent Pool
    4.3.2 Survey Instrument
    4.3.3 Measuring Familiarity
    4.3.4 Measuring Similarity
    4.3.5 Aggregating Function for Similarity Scores
  4.4 Formulating a Prediction
    4.4.1 Preliminary SBF predictions and Cluster Analysis
    4.4.2 Case Adaptation
  4.5 Comparative Analysis
    4.5.1 Hedonic Linear Regression
    4.5.2 Combined SBF-Regression Model
    4.5.3 Performance Hypotheses
5 Results & Analysis
  5.1 SBF and Hedonic Regression
  5.2 Combined SBF-Regression Predictions
  5.3 Prediction of Small-grossing Movies
6 Discussion and Conclusion
  6.1 Contributions to the Literature
  6.2 Limitations and Future Research Directions
  6.3 Suggested applications of SBF
References
Appendix
  Note 1: Summary of Blonski (1999)
  Note 2: Review of Motion Picture Forecasting Literature
  Note 3: Target Cases
  Note 3: Memory Database – Variables and Descriptive Statistics
  Note 4: Case Selection Algorithm
  Note 5: Case Selection Algorithm – Movie Attributes
  Note 6: Measuring Familiarity
  Note 7: SBF prediction model – SPSS Summary Statistics
  Note 8: Hedonic Regression – SPSS Summary Statistics
  Note 9: Combined SBF-Regression Model – SPSS Summary Statistics
  Note 10: Correlations – Marketing Budget and Opening Wk Theaters
  Note 11: Regression with Interaction Terms
Technical Appendix
  Note T1: Robust Similarity Functions
  Note T2: Hierarchical Cluster Analysis using SPSS
LIST OF TABLES
Table 1: SBF Similarity Functions
Table 2: Over-sampling Rule for Revenue Deciles
Table 3: Cluster Membership Schedule (Abbreviated) - “The Island”
Table 4: Top Cluster and SBF raw prediction
Table 5: Movies rated as most similar to Charlie and the Chocolate Factory
Table 6: Predictions using Regression, SBF and Combined models
Table 7: Paired Sample Test Results
Table A1: Summary of Movie Success Determinants
Table A2: Movies schedule for future release (US Summer, 2005)
Table A3: Inflation Adjustment Factor
Table A4: Descriptive Statistics and Frequencies for Memory Data
Table A5: SBF Model Fit Summary
Table A6: SBF Model ANOVA Summary
Table A7: SBF Model Coefficients
Table A8: SBF Model Residual Statistics Summary
Table A9: Hedonic Regression Model Fit Summary
Table A10: Hedonic Regression Model ANOVA Summary
Table A11: Hedonic Regression Model Coefficients
Table A12: Hedonic Regression Model Residual Statistics Summary
Table A13: Combined Model Fit Summary
Table A14: Combined Model ANOVA Summary
Table A15: Combined Model Coefficients
Table A16: Combined Model Residual Statistics Summary
Table A17: Pearson’s Correlations – Marketing and Opening Wk Theaters
Table A18: Non-parametric Correlations – Marketing and Opening Wk Theaters
Table A19: Regression with Interactions Model Fit Summary
Table A20: Regression with Interactions Model ANOVA Summary
Table A21: Regression with Interactions Model Coefficients Summary
Table A22: Regression with Interactions Model Residual Statistics
Figure A6: Residual Scatterplot – Regression with Interactions Model
Table A23: Predictions using Regression (with interactions), Original Regression and Combined SBF models
Table A24: Paired Samples Test – Original Regression and Regression with interactions
LIST OF FIGURES
Figure 1: Similarity-Based Forecasting Model
Figure 2: Similarity function and Movie clusters
Figure 3: Similarity-Based Forecasting for Movie Success
Figure 4: Measuring Degree of Familiarity with Past Movies
Figure 5: Target Movie Description example – “The Island”
Figure 6: Similarity Item example – “The Island”
Figure 7a: Similarity Histogram example for “The Island”
Figure 7b: Similarity Histogram for “The Matrix”
Figure 8: SBF and Regression Revenue Predictions versus Actual
Figure 9: MARE for SBF and Regression predictions
Figure 10: Regression, SBF and Combined Model predictions
Figure 11: MARE for Regression and Combined Model predictions
Figure A1: A two-stage Case Selection Algorithm for Movies
Figure A2: Survey Results for Consumer Movie Criteria
Figure A3: Residual Scatterplot – SBF Model
Figure A4: Residual Scatterplot – Hedonic Regression Model
Figure A5: Residual Scatterplot – Combined Model
1 Introduction
The idea that risky decisions are evaluated by combining subjective probabilities and utilities of their possible outcomes (subjective expected utility, SEU) is so familiar in economics that it is difficult to imagine any other approach. Gilboa and Schmeidler [1995, 2001] have done so and, based on evidence from cognitive science, offer an alternative approach called case-based decision theory (CBDT). In CBDT, agents judge the similarity of current cases (or acts within a case) to previous cases, and evaluate a current act by a similarity-weighted average of the successes or failures of that act in previous cases. In essence, the central role of probabilistic judgments of future scenarios in SEU is played instead by judgments of similarity to the past.
CBDT has some intuitive appeal, and may also be useful as a part of decision analysis. Unlike SEU approaches, CBDT only requires good historical data and judgments of similarity between pairs of cases (in this essay, pairs of movies), which might be easier for people (whether novice or expert) than forecasting probabilities of success. Unlike regression, CBDT can also be robust to missing data and model misspecification. The crux of CBDT is arguing by analogies between a current case and past cases, to decide what to do now. Discussions of this type are probably at least as common in boardrooms, faculty hiring meetings, and the Pentagon as formal decision analyses which elicit and combine probability p(a) and utility u(x(a)). Just as decision analysis brings some rigor and coherence to judging p(a) and u(x(a)), CBDT could bring some rigor and coherence to discussions which include analogical reasoning.
One possible reason CBDT has not caught on in economics and decision sciences is that there are few concrete applications to date. Apart from Gayer, Gilboa and Lieberman [2004], applications of CBDT have been limited to stylized examples of coordinating demand [Jahnke, Chwolka and Simons, 2005], social learning [Blonski, 1999] and consumer choice [Gilboa and Schmeidler, 1997]. In addition, implementing the theory to create forecasts requires development of several moving parts: a ‘memory’ or database of past cases (which and how many cases are appropriate?); a similarity function (how are similarities obtained?); and a functional form of the prediction function (how do we construct predictions using past similar cases?).
Therefore, the objective of this paper is two-fold. The first half of the present paper develops a prescriptive operational model called similarity-based forecasting (SBF), which is based on CBDT’s prediction model and includes an outline of an implementable schema that attends to outstanding questions regarding the memory, similarity and prediction functions. Unlike previous applications of case-based decision theory, this paper does not develop a descriptive account of decision making, but rather a prescriptive tool for forecasting. SBF proposes the use of global judged similarity functions as an alternative to the rule-based similarity functions used in Gayer et al [2004].
In sum, the key innovations of the SBF model include:
• A memory database where cases are unbiasedly obtained and not
dependent on subjective human recall,
• Predictions which are based on individual case similarity not pooled
similarity thereby allowing high-similarity cases greater influence than
under the original grouped similarity method,
• A model that includes both rule-based similarity and judged similarity
ratings where the choice depends on the conceptualization of the case, and
• Replacement of pre-determined ad-hoc rules with an endogenous selection
method for cases entering into the prediction.
The second half of this study provides an empirical application of SBF using data from the entertainment industry, namely motion picture box office results. The hypothesis under investigation is that SBF can provide more accurate predictions than hedonic linear regression. Following the recommendation by Gayer et al [2004], a combined SBF-Regression forecast is also obtained to determine whether a combined forecast is more accurate than either method separately.
The basic result is that SBF predicts relatively well compared to hedonic linear regression. A sample of 19 new movies was selected and forecasts of their gross box office revenue were obtained before they were released in US cinemas. Overall, SBF (MARE=30%) was 50%¹ more accurate than hedonic regression (MARE=63%). On a case-by-case basis, SBF predicted 75% of large-grossing movies (revenue greater than $100 million) and 80% of small-grossing movies (less than $50 million) more accurately than did regression. The combined SBF-Regression predictions were obtained by including the SBF prediction as an independent variable in the regression model. The combined forecasts (MARE=28%) were on average 55% more accurate than regression (t=2.205, p<0.05), and 7% more accurate than SBF alone; the latter difference in errors, however, was not statistically significant (t=0.95, p>0.30).
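The accuracy comparisons above can be checked with a short sketch. It assumes that MARE denotes the mean absolute relative error, i.e. the average of |prediction - actual| / actual, which is not defined in this section, and it computes relative improvement as the ratio of the difference in MARE to the benchmark MARE, as described in footnote 1. The function names and the two-movie toy data are hypothetical.

```python
# Illustrative sketch; MARE is ASSUMED to be the mean absolute relative error.

def mare(predictions, actuals):
    """Mean of |prediction - actual| / actual over all cases."""
    errors = [abs(p - a) / a for p, a in zip(predictions, actuals)]
    return sum(errors) / len(errors)


def relative_improvement(mare_model, mare_benchmark):
    """Ratio of the difference in MARE to the benchmark MARE."""
    return (mare_benchmark - mare_model) / mare_benchmark


# Hypothetical two-movie example ($ million) showing how MARE is computed.
toy_mare = mare(predictions=[120.0, 40.0], actuals=[100.0, 50.0])

# Reported MARE values: SBF 30%, hedonic regression 63%, combined 28%.
sbf_vs_regression = relative_improvement(0.30, 0.63)
combined_vs_regression = relative_improvement(0.28, 0.63)
combined_vs_sbf = relative_improvement(0.28, 0.30)

print(f"Toy MARE: {toy_mare:.0%}")
print(f"SBF vs regression: {sbf_vs_regression:.0%}")
print(f"Combined vs regression: {combined_vs_regression:.0%}")
print(f"Combined vs SBF: {combined_vs_sbf:.0%}")
```

Under this reading, the reported MARE values imply improvements of roughly 52%, 56% and 7% respectively, consistent with the approximate figures quoted in the text.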
1 Obtained using the ratio of the difference in MARE to the MARE for regression.
The remainder of this paper is organized as follows. Following a review of the literature in the next section, Section 3 presents the SBF model with discussions relating to the conceptualization of the memory, similarity and prediction functions. Section 4 outlines the empirical application and methodology for SBF prediction of motion picture revenues. Section 5 presents a summary of these results including predictions from hedonic regression and the combined SBF-Regression model. The paper closes with a discussion of the findings, their implications for the current literature and possible directions for future research.
The next section reviews the literature relating to decision making and forecasting using similarities. In this section previous attempts to incorporate similarities through case-based approaches are evaluated and some key lessons are derived.
2 Review of Related Literature
In recent years, the use of similarity methods, primarily through the case-based approach to forecasting, has increased in popularity in economics and artificial intelligence. The main appeal of the case-based approach is the way in which it parallels the natural approach that agents use to formulate solutions for new problems [Gilboa and Schmeidler, 1995, 2001]. Specifically, the case-based approach utilizes analogies to similar past experiences and examples to formulate a solution to the current problem. It has the advantage of learning from past successes as well as mistakes, without having to know the exact relationship between determinants and the outcome. It would also not require a studio executive to know the complete set of relevant determinants or their relative importance.
Unlike traditional approaches to analogical reasoning, the case-based approach formalizes the translation of analogies into decisions and forecasts. Researchers like
Green and Armstrong [2004] and Gavetti, Levinthal and Rivkin [2005] have previously argued that analogies fail to be incorporated into decision-making because of the lack of structure in using and translating the information into a solution. Therefore, formalized methods ensure that analogies to similar past cases are used effectively.
Two established methods, CBDT and case-based reasoning are reviewed next.
2.1 Case-Based Decision Theory (CBDT)
CBDT was first proposed by Gilboa and Schmeidler [1995, 2001], in part to formalize the role of analogical reasoning in decision making but more so because of their dissatisfaction with classical decision theories such as SEU. Their central thesis was that in situations where the language of SEU is far too rigid to describe or process forecasts and decisions, the case-based decision framework should be used instead.
To apply SEU, the decision maker necessarily engages in hypothetical thinking in order to derive all possible states of the world, all possible choices that she can take, and all possible outcomes of each choice for each given state of the world. She is also required to derive the desirability (or utility) of each scenario and obtain priors over the probability of it occurring. In some instances, the decision problem is simple enough to be able to meet such hypothetical reasoning and information requirements, but in other examples, the problem is far too complex to be expressed in the framework of
SEU. Early R&D investments, new product developments, M&A and diversification decisions are good examples of situations not suited to the SEU framework. In fact, in any situation involving a large state space, a large choice set and high degrees of uncertainty regarding probabilities and outcomes, applying SEU would be a fairly laborious task. In such situations, the prescribed approach is the case-based approach which does away with such data requirements and instead utilizes the information from analogous events and experiences.
2.1.1 The Basic CBDT Model
The basic CBDT model describes decision making based on the similarity-weighted utility of previous actions. Cases are conceptualized as instances of past decisions. The elements of a case are (i) the problem, p, what needs to be decided on or resolved, (ii) the set of acts, A, the set of choices, options or alternative actions that resolve the problem, and (iii) the set of results, R, which are the outcomes of the chosen act(s) from cases in memory M. The basic CBDT model chooses the act a ∈ A that maximizes the similarity-weighted desirability u of the results r obtained from previous cases q ∈ M, i.e.,

\[ \max_{a \in A} U_{p,M}(a) = \sum_{(q,a,r) \in M} s(p,q)\, u(r) \]

where s(p,q) denotes the similarity between decision problems p and q.
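The maximization above can be sketched in a few lines of Python. This is a toy illustration rather than an implementation from the literature: problems are represented as single numbers, the similarity function 1 / (1 + |p - q|) and the identity utility are arbitrary assumptions chosen for the example, and the memory contents are hypothetical.

```python
# Illustrative sketch of the basic CBDT choice rule.
# Memory holds (problem q, act a, result r) triples; all values are hypothetical.

def similarity(p, q):
    """ASSUMED similarity function: decays with the distance between problems."""
    return 1.0 / (1.0 + abs(p - q))


def utility(r):
    """ASSUMED utility function: the result itself."""
    return r


def similarity_weighted_utility(problem, act, memory):
    """U_{p,M}(a): sum of s(p, q) * u(r) over past cases in which act a was chosen."""
    return sum(similarity(problem, q) * utility(r)
               for q, a, r in memory if a == act)


def cbdt_choose(problem, acts, memory):
    """Pick the act with the highest similarity-weighted utility."""
    return max(acts, key=lambda a: similarity_weighted_utility(problem, a, memory))


memory = [
    (1.0, "invest", 10.0),   # very similar past problem, good result
    (5.0, "invest", -20.0),  # dissimilar past problem, bad result
    (1.2, "wait", 2.0),      # similar past problem, modest result
]

choice = cbdt_choose(problem=1.0, acts=["invest", "wait"], memory=memory)
print(choice)
```

In this toy memory the bad outcome of investing carries little weight because it arose in a dissimilar problem, so the rule selects "invest".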
2.1.2 Case-based prediction- An extension
Gilboa and Schmeidler [2001] extended the basic model to one where the output is a prediction as opposed to a decision. Under the case-based prediction model (henceforth CB prediction), the case definition suppresses the act a and focuses on the problem and eventuality, i.e. c = (p, r). Under this specification, cases are ranked according to the aggregate similarity across all cases with eventuality r, i.e.

\[ \sum_{(q,r) \in M} s(p,q). \]

CB predictions are a function of both the similarity between two cases p and q and the relative frequency of the eventuality r. This method of ranking may result in a case with a higher similarity but low frequency being given a lower ranking than another case with lower similarity but higher frequency. In another conceptualization of similarity, Gilboa and Schmeidler [2001] specify an average similarity ranking,

\[ \frac{s(p,q)}{\sum_{(q,r) \in M} s(p,q)}, \]

where cases are ranked on the average similarity across cases where eventuality r transpired, thereby eliminating the frequency effect.
The potential problem, however, is that both approaches mask cases with high similarities if they are grouped with cases with low similarities. In other words, if the most similar case, ĉ, with similarity ŝ(p,q), has the same outcome r̂ as 10 other cases with much lower similarity, then it is possible that case ĉ is overlooked if the summed or averaged similarity rating is lower than for other outcomes r ≠ r̂.
One solution to this would be to allow cases to enter individually as opposed to being grouped according to outcome r. This would also allow the most similar cases to have greater influence on the prediction. Therefore a second possible improvement would be to use a case’s individual similarities as opposed to pooled similarities.
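The masking effect and the proposed remedy can be illustrated with a small sketch. The memory below is hypothetical: one highly similar case shares the outcome "hit" with ten barely similar cases, while three moderately similar cases share the outcome "flop". The pooled rankings are computed as the sum and as the simple mean of similarities within each outcome group, which is one reading of the summed and averaged rankings discussed above; the function names are illustrative assumptions.

```python
from collections import defaultdict

# Illustrative sketch: pooled (grouped) versus individual case similarity.
# Each case is a (similarity to the target, outcome) pair; values are hypothetical.
memory = [(0.90, "hit")] + [(0.01, "hit")] * 10 + [(0.50, "flop")] * 3


def top_outcome_pooled(memory, how="sum"):
    """Rank outcomes by the summed or mean similarity of the cases sharing them."""
    groups = defaultdict(list)
    for sim, outcome in memory:
        groups[outcome].append(sim)
    if how == "sum":
        scores = {r: sum(sims) for r, sims in groups.items()}
    elif how == "mean":
        scores = {r: sum(sims) / len(sims) for r, sims in groups.items()}
    else:
        raise ValueError("how must be 'sum' or 'mean'")
    return max(scores, key=scores.get)


def outcome_of_most_similar_case(memory):
    """Let cases enter individually: take the outcome of the single most similar case."""
    return max(memory, key=lambda case: case[0])[1]


print(top_outcome_pooled(memory, "sum"))      # pooled by sum
print(top_outcome_pooled(memory, "mean"))     # pooled by mean
print(outcome_of_most_similar_case(memory))   # individual similarity
```

Both pooled rankings favor "flop" here, even though the single most similar case was a "hit"; letting cases enter individually restores the influence of that case.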
2.1.3 Empirical Applications
Since its development, there have only been four empirical applications of CBDT by
Gilboa and Schmeidler [1997], Blonski [1999], Gayer et al [2004] and Jahnke et al
[2005].
Gilboa and Schmeidler [1997] followed up their theoretical developments of CBDT with a stylized example applied to consumer choice, which they called cumulative utility consumer theory (or CUCT). CUCT is a version of CBDT, where past cases directly influence consumer preferences over small, repeated choices (meals, books, movies, trips, transportation, etc.). The authors maintained that the SEU model would predict consumer behavior only when it was cognitively plausible; that is, the consumer has perfect information regarding states, outcomes and utilities and faces a constrained budget. However, for the more affluent consumers who were not budget-constrained, they argued that CUCT was a better description of how these consumers actually made decisions.
In another application, Blonski [1999] used a dynamic version of CBDT to explain social learning patterns over time. Blonski used CBDT to describe a process in which a society of agents learned the superior decision given complete and partial information conditions². In contrast to previous studies of social learning, Blonski did not use the SEU or neoclassical models to derive behavior rules for aggregate learning. Instead, his case-based social learning model described learning patterns as the result of agents’ actions at time t, based on their stock of knowledge (or information) at time t. Implied in this model was the concept that knowledge of an outcome’s result was based on an agent’s stock of knowledge at time t.
A recent study by Jahnke et al [2005] used CBDT to model the co-ordination of demand and supply for industrial services with service-sensitive customers and stochastic demand for service. Examples of such markets include restaurants, health care centers, beauty salons, automotive repair centers and hotels, where demand is sensitive to price and the perceived level of service offered. As in previous studies, the authors argued that SEU was not the appropriate framework, primarily because CBDT, unlike SEU, did not require prior knowledge of model parameters (e.g. price) but instead utilized experiences from past cases to evaluate the alternatives for the decision at hand.
All three studies demonstrated the usefulness of the case-based approach in modeling consumer choice, social learning and market co-ordination. However, unlike Gayer et al [2004], these three applications were based on highly stylized examples with no real data. The common limitation among these studies was that they did not address questions of operationalization of the memory, similarity and prediction functions in practice. The present study contributes to the current literature by providing an application with real data from the entertainment industry.
2 A short summary of Blonski’s study appears in the Appendix under Note 1.
2.1.4 Cognitive Concerns
A limitation of CBDT and CB prediction is that case retrieval (or recall) is a predominantly subjective process. In addition, case retrieval is also a function of the decision maker’s awareness of past cases. This does not necessarily limit cases to personal experience, as it can also include experiences of others that the decision maker is aware of. However, if the decision maker is not aware of a past experience, then he cannot recall it. Therefore, the CB definition of cases omits potentially relevant cases which the decision maker is not aware of.
Therefore case retrieval under CBDT and CB prediction is based only on the set of conceivable cases that have occurred and that the decision maker is aware of [Gilboa & Schmeidler, 2001, p.37]. While this definition reduces the cognitive burden on the decision maker in the recollection of relevant examples, it results in a dependency on unaided subjective human recall. In an extension of the model, hypothetical cases (those that might transpire) are also included in the case memory. However, even with hypothetical cases, case retrieval would still be dependent on the degree of awareness of past cases.
This in itself is not the problem. The real issue lies in the availability bias that arises from unaided subjective recall. That is, more recent, vivid and evocative examples are more easily recalled and therefore are more likely to be selected regardless of their true relative frequency [Russo and Shoemaker, 1989; Tversky and Kahneman, 1973].
Suppose we ask the average movie-goer to name romantic comedies or an IT analyst to recall recent innovations in e-commerce. The probability of recalling high-grossing movies which the movie-goer has actually seen or heard of is far greater than that of recalling small-grossing movies which the movie-goer has not seen or is not aware of. Similarly, the IT analyst is most likely to recall more successful technologies than little known advances.
When recall systematically favors certain classes or subsets of cases over others (i.e. more successful films or IT initiatives as opposed to those that failed), the outcomes drawn from the sample (forecasts and decisions) are also biased.
Kahneman and Tversky [1979] addressed such concerns by introducing the concept of a reference class, which is an unbiased, representative population of similar and relevant past cases useful for making unbiased predictions of the future. In their paper, the authors developed a framework known as reference-class forecasting which required decision makers to obtain a reference class of past cases when making decisions. Their main argument was that a reference class introduced distributional information which ‘corrected’ over-optimism and over-confidence in intuitive predictions. For example, reference-class forecasts of movie revenue are less biased than intuitive forecasts based on subjective recall because the decision maker is forced to consider the entire distribution of movie revenues (both large- and small-gross revenue movies) as opposed to the subset which is most easily recalled (typically high revenue movies). Thus the idea behind reference classes is to bring about an objective and unbiased way of selecting the past cases on which to base forecasts³. Therefore, a possible improvement would be to make the case retrieval process independent of human recall, which has been shown to be biased by the availability of cases.
3 SBF differs from Kahneman and Tversky’s [1979] reference-class forecasting model in one important way: the use of a similarity function that weighs previous cases by their likeness to the new case. This point will be taken up in the discussion and conclusions.
2.2 Case-Based Reasoning
CBDT was inspired by another well-developed approach called case-based reasoning
(CBR). CBR was introduced as a reasoning technique in situations that proved far too complex for rule-based systems or probabilistic approaches [Riesbeck and Schank,
1989; Watson and Marir, 1994; Liao, Zhang and Mount, 1996].
Like CBDT, the CBR approach aims to solve new problems by adapting solutions used in previous similar problems. When confronted with a new (and often unfamiliar) problem, where the solution and eventual outcome are unknown, decision makers tend to search out similar problems that occurred in the past in order to guide their decisions. CBR models assess the similarity of the new case with each past case in its memory, or case library. It retrieves the case or a series of cases, which are most similar to the case at hand, and using the solutions to these cases, it adapts a solution to the new one.
As with all case-based approaches, the major advantage of CBR is that it does not require domain-specific knowledge, hypothetical reasoning, or knowledge of causal relationships and functional form, making the implementation of CBR much simpler than rule-based and SEU approaches [Liao et al, 1996].
2.2.1 Comparisons with CBDT
Gilboa and Schmeidler [2000] argue that CBR and CBDT are two distinct frameworks with key differences in emphasis, motivation and analysis. The emphasis of CBDT is on decision making, in which reasoning forms one part of the entire process. In support of this claim, Pomerol [2001] argues that unlike CBDT, CBR does not model preferences, which are important in making decisions.
A second difference is that the focus of CBDT has been on learning and adaptive decision making while CBR focuses on deriving the solution itself. Finally, the two differ in motivation; for CBDT it was a general dissatisfaction with the dominant paradigm for making decisions (namely SEU), whereas CBR was introduced by Riesbeck and Schank [1989] as a simpler and more natural alternative to large algorithm-based reasoning systems.
2.2.2 Empirical applications
Various CBR programs are being used in domains as diverse as criminal sentencing (JUDGE), diagnosing heart failure (CASEY), and hospitality and the creation of new recipes (CHEF program) [Riesbeck and Schank, 1989]. For example, JUDGE is a case-based reasoning program that determines the likely sentence for a crime by comparing it to a case library of previous crimes, based on the charge, the events that occurred and the legal statutes (fines, bail, imprisonment and parole conditions) regarding crimes of this nature [Riesbeck and Schank, 1989].
There are also applications in information technology such as estimating software development effort using CBR [Shepperd and Schofield, 1997], designing better information systems [Krampe and Lusti, 1997], and improving software quality control
[Shimazu and Takashima, 1997] which use the CBR approach to solving new problems.
There have also been successful applications of CBR to construction procurement planning [Luu, Ng and Chen, 2003], experimental medical studies [Seitz, Uhrmacher and Damm, 1999], predicting property appraisal valuations [Gonzalez and Laureano-Ortiz, 1992] and planning vacations [Stewart and Vogt, 1999]. Underlying all of the CBR studies is the belief that analogies to similar past experiences, situations and examples can be useful in solving new problems.
2.2.3 Similarity and Prediction Functions
The study by Gayer et al [2004] presented an application of the CB prediction model using apartment sale and rental data. Comparing their CB predictions against a rule-based reasoning method (hedonic linear regression), they found the CB approach was more accurate in predicting apartment rental prices than regression, whereas the reverse was found for apartment sale prices. In addition, the number of cases in the database also appeared to be of critical importance, with the CB approach outperforming regression for larger databases. The authors posited that such an observation could be explained by decision makers switching between rule-based and case-based approaches depending on the amount of information. In other words, decision makers used rule-based reasoning with small amounts of data, but switched to a case-based approach when the database became too large to apply the rule-based approach.
From the CBDT literature’s perspective, this study broke new ground by developing operational similarity and prediction functions. The similarity function, \( s(y_i, y_k) \), was determined by the aggregate of the rule-based similarity scores based on a set of apartment attributes \( y = \{y_1, \ldots, y_m\} \), such as rooms, size, floors, parking, air-conditioning, balcony, etc. Specifically, the rules derived similarity scores as the inverse of the Euclidean distance between attribute values such that:
\[ s(y_i, y_k) = \frac{1}{1 + d(y_i, y_k)}. \]
Letting \( p_j \) denote the price of apartment \( j \), \( \forall j = 1, \ldots, n \), the prediction function for a new apartment \( Y^* \) was the similarity-weighted average price according to:
\[ p^* = \frac{\sum_{j=1}^{n} S(Y^*, Y_j)\, p_j}{\sum_{j=1}^{n} S(Y^*, Y_j)}. \]
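As a rough illustration of the similarity and prediction functions just described, the following minimal Python sketch computes an inverse-distance similarity and a similarity-weighted average price. The function names, the two-attribute apartment description and all numbers are hypothetical and are not taken from Gayer et al [2004]; attributes are left unscaled, which a real application would probably need to address.

```python
import math


def euclidean_distance(a, b):
    """Euclidean distance between two equal-length attribute vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def similarity(a, b):
    """s = 1 / (1 + d): identical cases score 1, distant cases approach 0."""
    return 1.0 / (1.0 + euclidean_distance(a, b))


def predict_price(new_case, past_cases, past_prices):
    """Similarity-weighted average of the prices of all past cases."""
    weights = [similarity(new_case, case) for case in past_cases]
    return sum(w * p for w, p in zip(weights, past_prices)) / sum(weights)


# Hypothetical apartments described by (rooms, floor); prices are made up.
past_cases = [(3, 2), (3, 3), (6, 6)]
past_prices = [100.0, 110.0, 300.0]
print(predict_price((3, 2), past_cases, past_prices))
```

In this toy example the identical apartment receives weight 1, the near neighbor 0.5 and the distant one 1/6, so the prediction stays close to the prices of the two most similar apartments.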
The rule-based approach is by far the most popular method of case similarity assessment in the CBR literature. In order to construct similarity using these algorithms, the decision maker is required to know the relevant features to construct similarity (e.g. bedrooms, baths, size, parking, etc.), whether some features were more important than others (is the number of bedrooms more important than parking?) and complete data on these features. In the absence of this information, rule-based similarity would only be a partial representation of the likeness between two cases based on the known attributes. An alternative method derives similarity through judged ratings, which would be useful when the complete set of attributes is not known. Therefore, another improvement would be to explore the use of judged similarities where similarity is a global estimate rather than one explicitly defined by attributes.
Another area for potential development is the prediction function; specifically in determining which and how many cases are used to construct a prediction. In the typical CBR application, the number of cases, j, is given. For example, Gonzalez et al [1992] used the top 10 most similar cases to the new one, thus implying j=10 [p.236]. Applying different ad-hoc limits to the number of included cases potentially results in different predictions due to the introduction or omission of certain cases. Therefore an improvement to the prediction function would be to explore the use of an endogenous case limit where the number of cases, j, is not pre-determined in an ad-hoc manner.
3 Similarity-Based Forecasting Model
The common thread in this diverse collection of studies is that analogical and similarity-based reasoning are important and useful parts of the decision making process. Underlying this statement is the need to find improved ways to introduce and encourage similarity-based methods in decision making, especially in uncertain and ambiguous situations where decision-making using traditional methods such as linear regression might prove too difficult.
The objective of this paper is to develop an operational model of the case-based approach that offers the improvements highlighted by our review of the current literature namely:
• A memory database in which cases are obtained without bias,
• Predictions based on individual case similarity, not pooled similarity,
• A model that includes both global judged similarity ratings and rule-based similarity, and
• An endogenous selection method for case inclusion.
The remainder of this section is a discussion of the major components of the SBF model, namely the memory database, similarity and prediction functions as summarized in Figure 1.
Figure 1: Similarity-Based Forecasting Model
[Flow diagram: the new case and the relevant past cases retrieved from memory enter the similarity assessment; endogenous case selection then determines the x most similar past cases used in the prediction.]
3.1 Cases and Memory Database
A case is defined as the decision problem, \( p \), and eventuality, \( r \), i.e. \( c(p, r) \) or \( c_k \). The memory is a collection of relevant past cases and is defined as \( c_1, c_2, \ldots, c_m \in C \), \( \forall k = 1, 2, \ldots, m \), for a memory with \( m \) cases. Under the basic CB model, memory \( C \) is limited to the subset of all past cases that have occurred that the agent recalls. That is, letting \( D \) represent all cases that have occurred, CB defines memory as \( C \subseteq D \). For example, a case can be defined as a decision to acquire a new company, where ‘memory’ is a collection of previous examples of acquisitions or other past examples (such as mergers, divestments, etc.) that may be considered a similar decision.
Under the SBF model, memory C is defined as the unbiased database of past cases, which is more inclusive and has a more balanced distribution of cases than that of CB. Recall our previous arguments that subjective human recall makes case retrieval dependent on the case’s ‘availability’; that is, the more recent, vivid, evocative and memorable cases are more likely to be recalled, regardless of their actual frequency of occurrence. Good examples of this include events such as natural disasters (earthquakes and floods), which are very easily recalled despite their rare occurrence. SBF addresses these availability bias concerns by replacing subjective recall with an objective, unbiased retrieval process. For example, it would be easy to obtain data on the performance of romantic movies from information suppliers such as Nielsen EDI or publicly available web-based directories such as The Internet Movie Database [http://www.imdb.com]. Granted, information might not always be so convenient to obtain; however, obtaining a wider scope of cases, even if partial, would reduce the susceptibility of basing forecasts and decisions on a biased sub-sample.
At this point, it is also noted that the SBF memory database is not necessarily a complete set of all cases that have occurred. While this is ideal and should be obtained where available, greater importance is placed on achieving a selection of cases that is representative of the distribution of all outcomes. Therefore, our memory C is still essentially a subset of D, but a larger and more balanced subset than one obtained by subjective and unaided recall.
3.2 Similarity Assessment
The SBF model allows both rule-based and judged similarity assessments. In this section, we describe both types of similarity assessments and give broad advice as to where each should be used. Specifically, it is argued that in instances where the attributes and their salience (relative importance) are known, rule-based similarity can be used. However, in the absence of such information, judged similarities offer a viable alternative to similarity assessment, provided that the collective making the similarity assessments is an ‘efficient’⁴ one. For example, one can apply a rule-based similarity to compare automobiles because the attributes and their importance can be easily defined (e.g., manufacturer, model, shape, automatic or manual transmission, age, kilometers traveled, fuel efficiency, number of doors, etc). However, for other examples, such as motion pictures, these attributes and their relative importance are not so easily defined. Does the presence of a Hollywood actor increase the probability of success? How important is the timing of release? For examples like these, judged similarities would be the prescribed approach.
4 An ‘efficient’ collective is diverse, independent, decentralized and can aggregate dispersed information across a market [Sunder, 1995]. This is discussed in section 3.2.2 and is elaborated in the discussions pertaining to knowledge distribution in essay 1.
3.2.1 Rule-Based Similarity
Let \( s(c_k, c^*) \) denote the similarity between the new case \( c^* \) and each past case \( c_k \). In previous CBDT applications such as Gayer et al [2004] and Jahnke et al [2005] similarity was rule-based. Rule-based similarities are either geometric or feature-based.
Geometric Similarity. A common approach in CBR describes cases as points in a co-ordinate space and similarity between two cases as a geometric distance measure such as Euclidean distance, Nearest-Neighbor or City-Block [Liao et al, 1998]. One example that fits the geometric similarity framework is the similarity between two lotteries offering different probabilities and prizes, e.g. comparing lottery {win $a with Pr(b), lose $d with 1 − Pr(b)} to lottery {win $g with Pr(j), lose $h with 1 − Pr(j)}.
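To make the geometric view concrete, here is a small Python sketch that treats each lottery as a point (prize, win probability, loss) and compares two such points with Euclidean and City-Block distances. The co-ordinate layout and the numbers are assumptions chosen for illustration; because dollars and probabilities sit on different scales, a practical application would likely rescale the co-ordinates first.

```python
import math


def euclidean(a, b):
    """Straight-line distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def city_block(a, b):
    """Sum of absolute co-ordinate differences (Manhattan distance)."""
    return sum(abs(x - y) for x, y in zip(a, b))


# Hypothetical lotteries as points: (prize, probability of winning, loss).
lottery_1 = (100.0, 0.5, 20.0)
lottery_2 = (103.0, 0.5, 24.0)

print(euclidean(lottery_1, lottery_2))   # smaller distance = more similar
print(city_block(lottery_1, lottery_2))
```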
Feature-based Similarity. Other rule-based approaches, such as the one used by Gayer et al [2004], describe cases as collections of features and similarity as the ‘match’ between common and distinctive features. This is considered the appropriate approach for describing the similarity of cases which are more like a (finite) collection of features than two-dimensional points, for example when comparing apartments for rent.
The basic feature matching model proposed by Tversky [1977] defined similarity as the weighted difference (or contrast) of the measures of their common and their distinctive features. Letting \( a, b, c, \ldots \) represent cases and \( A, B, C, \ldots \) represent the sets of features belonging to cases \( a, b, c, \ldots \) respectively, similarity between cases \( a \) and \( b \) was defined as:
\[ S(a,b) = \theta f(A \cap B) - \alpha f(A - B) - \beta f(B - A) \]
for some \( \theta, \alpha, \beta \geq 0 \), where \( \theta, \alpha, \beta \) represented the salience of the common and distinctive features, \( A \cap B \) was the set of common features, \( A - B \) was the set of features distinctive to case \( a \), and \( B - A \) was the set of features distinctive to case \( b \).
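The contrast model can be sketched in a few lines of Python using sets of features. This is an illustration only: the measure f is assumed here to be a simple feature count, the default weights are arbitrary, and the apartment features are invented.

```python
def contrast_similarity(features_a, features_b,
                        theta=1.0, alpha=0.5, beta=0.5, f=len):
    """Tversky-style contrast: reward common features, penalize distinctive ones.

    f is the feature-set measure; a plain count (len) is an assumption.
    """
    common = features_a & features_b   # A ∩ B
    only_a = features_a - features_b   # A − B
    only_b = features_b - features_a   # B − A
    return theta * f(common) - alpha * f(only_a) - beta * f(only_b)


# Hypothetical feature sets for two apartments.
apartment_a = {"balcony", "parking", "air-conditioning"}
apartment_b = {"balcony", "parking", "elevator", "garden"}

print(contrast_similarity(apartment_a, apartment_b))
```

With unequal alpha and beta the measure becomes asymmetric, so the similarity of a to b need not equal the similarity of b to a.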
Rule-based similarity works well when the decision maker knows what the relevant features (A, B, C,…) of the decision are and their relative importance (i.e., θ ,α, β ).
Sometimes this is easy to identify, but other times this is not clear. For the latter, we suggest using judged similarity to assess the likeness between two cases.
3.2.2 Judged Similarity
Judged similarities are similarity assessments obtained from a collective of individuals (e.g. experts). These methods do not explicitly require knowledge of the relevant features or their salience and also have the advantage of being able to aggregate knowledge of similarity that may be dispersed across agents in a market.
Judged similarity assumes that the market providing the information (in this case, similarity ratings) is an efficient one. Here, ‘efficient’ does not necessarily refer to deep expert knowledge in a given domain. Rather it is used to describe a collective that is diverse (including experts and non-experts varying in the degree and content of information), independent and decentralized (made by agents on an individual basis) and has the ability to aggregate dispersed information [Sunder, 1995]. A stream of research exists which asserts that an efficient market can aggregate dispersed information and can be surprisingly accurate in its predictions [Fama, 1960; Grossman, 1976; Plott and Sunder, 1988; Plott, 2000; Chen and Plott, 2002; Gruca, Berg and Cipriano, 2003; Surowiecki, 2004; Wolfers and Zitzewitz, 2004] and that such information can be used to aid decision making [Berg & Rietz, 2003]. Examples of wise market mechanisms include the Iowa Electronic Markets, Tradesports, NewsFutures and Hollywood Stock Exchange.
In sum, the wisdom of the market collective comes from its successful aggregation and consolidation of various (and partial) information from dispersed sources. To the extent that efficient markets do make accurate and reliable predictions from the aggregation and consolidation of private and public information, judged similarity ratings can be useful in instances where rule-based measures are unsuitable. Indeed, using public prediction markets as forecasts will often be useful for a firm. However, in some cases companies would want some privacy in their forecasting process (e.g., so competitors do not know what is being planned), in which case a method like SBF is preferable to a prediction market that competitors can see.
3.2.3 Summary of Similarity measures
The choice between a rule-based and a judged similarity assessment hinges on the conceptualization of the case itself. If the case’s attributes and their relative salience or importance can be established by the decision maker, then rule-based measures, either geometric distance-based or feature-based ones, can be used. In the absence of such information, judged similarities offer a viable similarity measure provided that the collective from which judgments are obtained is an efficient one. Table 1 provides a summary of the similarity function contingent on the case conceptualization.
Table 1: SBF Similarity Functions

| If the Case … | Then the Similarity Function is… |
| --- | --- |
| Can be conceptualized as points on a co-ordinate scale (e.g. simple lotteries) | a distance-based algorithm such as Euclidean distance, Nearest-Neighbor distance and City-Block distance. |
| Can be conceptualized as a collection of attributes or features for which these and their relative weighting are known (e.g. apartments, automobiles) | a feature-matching algorithm such as Tversky’s contrast model. |
| Cannot be conceptualized as either of the above because it has a large set of attributes or where the set of attributes is not completely known | a judged similarity rating aggregated from a wise collective that is diverse, independent and decentralized. |
3.2.4 Aggregating Function for Judged Similarity
Using judged similarity assessments requires an aggregating mechanism over individual responses. Using \( s_n(c_k, c^*) \) as the individual level similarity rating, the standard mean similarity aggregated across \( y \) individuals is:
\[ S(c_k, c^*) = \frac{\sum_{n=1}^{y} s_n(c_k, c^*)}{y} \]
\( \forall k = 1, \ldots, m \) cases in the memory.
An underlying assumption of using the mean as a measure of central tendency is that the distribution of similarity ratings across people is approximately symmetric. For distributions that are skewed or have outliers, a robust mean would be a more appropriate aggregating function. Rather than weighting each observation by 1/y, a robust mean weights an individual similarity rating \( s_n(c_k, c^*) \) more heavily if it is close to the robust mean \( S'(c_k, c^*) \) and weights the rating less heavily if it is far from the robust mean. Therefore the robust-mean definition is:
\[ S'(c_k, c^*) = \frac{\sum_{n=1}^{y} s_n(c_k, c^*) \cdot \dfrac{1}{1 + \left| s_n(c_k, c^*) - S'(c_k, c^*) \right|}}{\sum_{n=1}^{y} \dfrac{1}{1 + \left| s_n(c_k, c^*) - S'(c_k, c^*) \right|}} \]
\( \forall k = 1, \ldots, m \) cases in the memory. In this definition, the weights are defined as the inverse of one plus the absolute distance between the individual similarity rating and the robust mean itself; the larger the distance, the smaller the weight. Because \( S'(c_k, c^*) \) appears on both sides of the equation, the aggregating function requires recursive estimation.
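One way to carry out the recursive estimation is a fixed-point iteration that starts from the ordinary mean and re-weights the ratings until the estimate stops changing. The Python sketch below assumes this iteration scheme, the stopping tolerance and the sample ratings; none of these are specified in the text.

```python
def mean_similarity(ratings):
    """Standard mean: every rating is weighted by 1/y."""
    return sum(ratings) / len(ratings)


def robust_mean_similarity(ratings, tol=1e-10, max_iter=1000):
    """Robust mean with weights 1 / (1 + |s_n - S'|), solved by iteration."""
    estimate = mean_similarity(ratings)  # starting value (an assumption)
    for _ in range(max_iter):
        weights = [1.0 / (1.0 + abs(s - estimate)) for s in ratings]
        updated = sum(w * s for w, s in zip(weights, ratings)) / sum(weights)
        if abs(updated - estimate) < tol:
            return updated
        estimate = updated
    return estimate


# Hypothetical ratings from four respondents; one is an outlier.
ratings = [0.8, 0.8, 0.8, 0.1]
print(mean_similarity(ratings))
print(robust_mean_similarity(ratings))
```

In this example the robust mean sits closer to the three agreeing ratings than the ordinary mean does, because the outlying rating receives a smaller weight.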
3.3 Formulating a Prediction
3.3.1 Prediction Function
Once similarities have been assessed, the final step uses these similarities to construct a forecast. For example, if a group of motorists are surveyed to obtain their similarity ratings on a selection of automobiles, how can this information be used to construct a prediction of price for a new automobile?
There are several ways of constructing predictions using similarities. Recall that under the original CB prediction model, cases were ranked by the sum of similarity values where eventuality \( r \) occurred, that is:
\[ r^* = \hat{r} \quad \text{for} \quad \sum_{k=1}^{m} S\{c_k(\hat{r}), c^*\} > \sum_{k=1}^{m} S\{c_k(r), c^*\} \]
\( \forall r \neq \hat{r} \) and \( \forall k = 1, 2, \ldots, m \).
The drawback of using either method is that both may overlook cases which individually have high similarities to the new case. Using our automobile example, 2 luxury sports cars which are rated as highly similar may have less influence on the price of a new sports car than 20 sedans which were rated as quite dissimilar, simply due to the higher frequency of the sedans.
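The frequency effect in the automobile example can be reproduced with a short Python sketch of the pooled rule, in which the predicted outcome is the one with the largest summed similarity. The outcome labels and similarity values are invented for illustration.

```python
from collections import defaultdict


def pooled_prediction(cases):
    """Return the outcome whose cases have the largest summed similarity.

    cases is a list of (outcome, similarity_to_new_case) pairs.
    """
    totals = defaultdict(float)
    for outcome, sim in cases:
        totals[outcome] += sim
    return max(totals, key=totals.get)


# Hypothetical memory: 2 very similar sports cars, 20 dissimilar sedans.
cases = [("high price", 0.9)] * 2 + [("low price", 0.2)] * 20

print(pooled_prediction(cases))            # driven by the 20 sedans
print(max(cases, key=lambda c: c[1])[0])   # outcome of the most similar case
```

The pooled sum for the sedans (20 × 0.2) exceeds that of the sports cars (2 × 0.9), so the pooled rule points to the sedans' outcome even though each sedan is individually a poor analogy.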
Under the SBF model, cases are allowed to enter individually, thereby allowing cases which are strongly similar to the current one to have the strongest influence on the prediction. Thus, the SBF model will allow the 2 examples of sports cars to have a stronger influence on the price than the 20 sedans. Another benefit of this approach compared to the original CB model is that it does not require artificial partitioning of continuous variables. Because the original CB model pools cases by outcome r, if outcome r is continuous (e.g. revenue) then categorizing revenues into discrete bins (e.g. $0-10, $10-20,…) would be required. Because the SBF approach does not pool cases by outcome, case selection works for both discrete and continuous variables.
Using the new conceptualization, one way to define a prediction would be to use the outcome from the case with the highest similarity. If we define the most-similar past case \( \hat{c} \) implicitly according to
\[ s(\hat{c}, c^*) > s(c_k, c^*), \quad \forall c_k \neq \hat{c}, \]
then the forecast is \( r^* = \hat{r} \), where \( \hat{r} \) is the eventuality of \( \hat{c} \); this is the “single-analogy forecast” based on the most similar previous case. Thus, the forecast for our new sports car will be based only on the most similar car (e.g. another sports car), and not on any other. Single-analogy forecasts should be limited to instances where there exists only one highly similar case (with all other cases rated only moderately similar). Outside of such instances, a single-analogy forecast potentially ignores cases which may also be highly similar and thus informative to the current decision.
Where there is potentially more than one highly similar case, the “multi-analogy forecast”, based on the similarity-weighted average of the x most similar cases, should be applied. This allows the prediction to be based on more than one similar case. For example, the price of the sports car would be based on the most similar vehicles to it.
The multi-analogy forecast is defined as:
\[ r_m^* = \frac{\sum_{k=1}^{x} S'(c_k, c^*)\, r_k}{\sum_{k=1}^{x} S'(c_k, c^*)}. \]
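The single- and multi-analogy forecasts can be sketched in Python as follows. This is a minimal illustration with assumed names and made-up prices; the case limit x is passed in by hand here rather than determined endogenously.

```python
def single_analogy_forecast(cases):
    """Outcome of the single most similar past case.

    cases is a list of (similarity_to_new_case, outcome) pairs.
    """
    return max(cases, key=lambda c: c[0])[1]


def multi_analogy_forecast(cases, x):
    """Similarity-weighted average outcome of the x most similar cases."""
    top = sorted(cases, key=lambda c: c[0], reverse=True)[:x]
    total_similarity = sum(sim for sim, _ in top)
    return sum(sim * outcome for sim, outcome in top) / total_similarity


# Hypothetical past vehicles: (similarity to the new car, price).
cases = [(0.9, 200.0), (0.6, 100.0), (0.1, 20.0)]

print(single_analogy_forecast(cases))
print(multi_analogy_forecast(cases, x=2))
```

With x=2 the weakly similar third vehicle is excluded, and the forecast lies between the two retained prices, closer to that of the more similar vehicle.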
3.3.2 Endogenous case limits
One of the key limitations of CBR and CBDT applications is that the number of cases entering into the prediction function was pre-determined by an ad-hoc limit. Under the SBF model, the number of cases, x, is endogenous and therefore is not determined by any ad-hoc rules (e.g. the top 5 cases) or similarity thresholds, T (e.g. only include cases where \( s(c_k, c^*) > T \)).
Hierarchical cluster analysis provides one possible method of determining the value for x. Cluster analysis determines the boundaries or breakpoints between clusters of cases by maximizing the between-cluster variances while minimizing the within-cluster variances. Cluster analysis defines x as the number of cases in the top cluster when cases are ordered by similarity. For example, the similarity function of comparative movies (in descending similarity order) to The Island (2005) and its clusters are depicted in Figure 2⁵.
Casual observation of the function identifies several possible breakpoints such as at case 6, 39 and 56. Cluster analysis provides the statistical rigor to support or determine these apparent clusters. Therefore, under the SBF model the number of cases that enter into the prediction model is endogenous and solved using methods such as hierarchical clustering.
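As a rough sketch of how the size of the top cluster might be recovered, the Python function below cuts the ordered similarity scores at their largest gaps. This is a deliberate simplification and an assumption: it corresponds to single-linkage hierarchical clustering on one-dimensional scores, not to the variance-based criterion described above, and the number of clusters is a hypothetical input rather than something the sketch estimates.

```python
def top_cluster_size(similarities, n_clusters):
    """Number of cases in the most-similar cluster (a stand-in for x).

    Assumes single linkage: for one-dimensional scores, cutting the
    hierarchy into n_clusters groups is equivalent to splitting the
    ordered scores at the (n_clusters - 1) largest gaps.
    """
    ordered = sorted(similarities, reverse=True)
    if n_clusters <= 1 or len(ordered) < 2:
        return len(ordered)
    # Gap after position i, paired with i so the cut point can be recovered.
    gaps = [(ordered[i] - ordered[i + 1], i) for i in range(len(ordered) - 1)]
    largest = sorted(gaps, reverse=True)[: n_clusters - 1]
    first_cut = min(i for _, i in largest)
    return first_cut + 1


# Hypothetical similarity scores of eight past cases to a new case.
scores = [0.95, 0.93, 0.90, 0.60, 0.58, 0.55, 0.20, 0.18]
x = top_cluster_size(scores, n_clusters=3)
print(x)
```

The returned value could then serve as the case limit in a similarity-weighted forecast, so that only the cases in the top cluster enter the prediction.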
5 To see the complete list of movies in this example, refer to the Technical Appendix under Note T1
Figure 2: Similarity function and Movie clusters
[Plot for the new movie The Island: robust similarity (vertical axis, 0.00 to 1.00) against movie ID (horizontal axis), with cases in descending order of similarity. Cluster 1 (k=1…6), Cluster 2 (k=7…39), Cluster 3 (k=40…56), Cluster 4 (k=57…63). Cluster 1 movies: The Matrix, Gattaca, The Truman Show, The Island of Dr. Moreau, No Escape, Fortress.]
3.4 Summary
SBF builds on the lessons from the case-based literature and offers an operational model with improvements to the memory, similarity and prediction functions. Two key innovations embedded in the new operational schema are (i) allowing both a rule-based and judged similarity function and (ii) allowing the number of cases that enter the prediction function to be endogenously determined.
4 Methodology for Motion Picture Application
4.1 Background to the Motion Picture Problem
4.1.1 Objectives
In many instances, the case-based approach parallels the natural way people organize knowledge of cases for decision-making [Gilboa and Schmeidler, 2001, p. 101]. Instead of requiring people to think about the probability distribution of occurrences, they are instead asked to formulate similarities over a range of previous occurrences.
In this section the SBF approach is applied to the problem faced by studio executives when making decisions on which motion pictures (or movies) to produce.
The movie business is notoriously risky; only 3 or 4 out of every 10 movies made break even or earn a profit [Vogel, 2001]. Production decisions require studios to forecast the likely success of the movie. However, forecasting a movie’s commercial success⁶ before its theatrical release has challenged researchers for almost three decades. This study contributes to research efforts concentrating on developing forecasting models for gross box office revenues well in advance of the movie’s release.
4.1.2 Empirical Research
This difficulty in accurately forecasting movie success is largely due to the ambiguity regarding the relevant determinants and their relationship with success. For example, the role of critics’ reviews of a movie before and during its opening week is a heavily debated topic, with some studies like Basuroy, Chatterjee and Ravid [2003], Elberse and Eliashberg [2003] and Reinstein and Snyder [2005] providing evidence of a positive association with revenues while others contend that critics act as predictors but not influencers [Eliashberg and Shugan, 1997]. The interested reader is referred to Hennig-Thurau, Houston and Walsh [2003], which provides a detailed survey of the determinants of movie success⁷.
6 For the majority of studies in the literature ‘movie success’ is given by gross box office revenue or total gross receipts at the cinemas.
The lack of consensus may also be due to the various methods of evaluation, which are dominated by statistical estimation methods. Early studies relied on ordinary least squares regression [Litman, 1983; Smith and Smith, 1986; Prag and Casavant, 1994; Sochay, 1994; Ravid, 1999], an approach that fell out of favor when it was discovered that the revenue function was too complex to be captured by linear regression [De Vany and Walls, 1999]. In response to this, researchers developed more complex models in an attempt to represent the apparent complexities in the revenue function, including interaction models [Elberse et al, 2003; Hennig-Thurau et al, 2003; Desai and Basuroy, 2005], logit and probit models [De Vany et al, 1999; Collins, Hand and Snell, 2002] and Bayesian estimation methods [Neelamegham and Chintagunta, 1999].
However, if either the complete set of relevant variables or the true underlying relationship with movie success is unknown, then these statistical methods are at best, partial representations of movie success.
Some of the above models predicted movie success shortly after its release [e.g.
Sawhney and Eliashberg 1996; Eliashberg and Shugan 1997; Neelamegham and
Chintagunta 1999]. By this stage, production and marketing budgets have already been spent. According to Shugan and Swait [2000], there exists a need for predictions prior to the release of the movie [p.2]. In their study Shugan and Swait designed a two-stage model based on consumers’ intent-to-view and information available at the pre-release stage and used it to predict the box office revenue for one movie, Batman and Robin (1997). Their model predicted the actual revenue with 19% absolute relative error. While ground-breaking, this study only examined the model’s predictive performance for a single movie. Therefore, there is a need for empirical evidence of the predictive accuracy of models based on a larger sample of movies.
7 Note 2 in the Appendix provides a summary review of the movie success literature.
In a second study, Simonoff and Sparrow [2000] juxtaposed predictions from two models: the first based only on pre-release data (genre, MPAA rating, star power, production budget, sequel, holiday release, opening week screens) and the second incorporating opening week revenues as a predictor of gross revenue. Based on a small sample of 9 movies released in 1999, the mean absolute relative error (MARE) for the predictive model based on pre-release data was 72% compared to 47% for predictions including opening week revenues⁸; that is, using their pre-release forecasting model, the authors predicted box office revenue with a large error rate⁹. Therefore forecasting models that predict box office revenue prior to release with greater accuracy are needed.
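For readers unfamiliar with the error measure quoted above, the following Python sketch shows one common way of computing it. It assumes that MARE denotes the mean absolute relative error and that each error is scaled by the actual revenue; the revenue figures are made up and are not taken from any of the cited studies.

```python
def absolute_relative_error(predicted, actual):
    """|predicted - actual| expressed as a fraction of the actual value."""
    return abs(predicted - actual) / actual


def mean_absolute_relative_error(predictions, actuals):
    """Average absolute relative error across a sample of forecasts."""
    errors = [absolute_relative_error(p, a)
              for p, a in zip(predictions, actuals)]
    return sum(errors) / len(errors)


# Hypothetical predicted and actual gross revenues (same units).
predictions = [50.0, 120.0]
actuals = [100.0, 100.0]

print(mean_absolute_relative_error(predictions, actuals))
```

Under this definition a single forecast that overshoots by many multiples of the actual revenue can dominate the average, which may be why outliers are sometimes excluded before the measure is reported.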
4.1.3 Why SBF for forecasting movies?
Absent in these attempts is a case-based approach to forecasting movie success. The lack of consensus regarding the determinants and their relationship with movie success makes movie forecasting a good candidate for the SBF approach. In contrast to previously tried methods, SBF would not attempt to model the complex relations and interactions between the relevant variables, nor would it require the decision maker to know what those variables were, thus making it a viable approach to forecasting movie
8 These results were based on the exclusion of two outliers where the error rates were more than 9 times the actual revenue.
9 This is a large error rate compared to those of our predictive models which were 28% and 30% for SBF and combined SBF-Regression models respectively.
success. The only requirements for an SBF prediction are good data on previous cases (movies) and similarity assessments between these and the new movie(s).
The following section discusses the methodology for implementing the SBF approach to forecasting new movie success as illustrated in Figure 3. Each component of the model is discussed in detail. In particular this paper intends to contribute to the current movie forecasting literature by providing,
• a forecasting model for predicting revenues prior to the movie’s theatrical
release (not after),
• empirical evidence of predictive accuracy based on a larger sample of
movies (greater than the 9 in Simonoff and Sparrow [2000]), and
• an approach which does not require complete knowledge of all
determinants and their relationship with movie success.
Figure 3: Similarity-Based Forecasting for Movie Success
[Figure 3 components: New Case (p,a,r) (4.2.1: 19 target cases); Case Memory of Past Cases (q,b,t) (4.2.2: past cases sourced from FilmSource); Case Selection (4.2.3: subsets of past cases are selected); Similarity Assessment (4.3.1-4.3.5: similarity obtained via survey of movie-goers, aggregated using robust similarity); Endogenous Case Selection (4.4.2: cluster analysis is used to select cases that enter into the prediction); Prediction (4.4.3: prediction is constructed as the similarity-weighted average of the x most similar cases).]
4.2 Data
4.2.1 Target Cases
Target cases are the new cases which SBF will construct forecasts for. Nineteen movies scheduled for a future United States summer release were used as the sample of target movies.
Because part of this exercise requires similarity ratings from respondents, using previously released movies can introduce confounding effects with respondents’ knowledge of the box office outcomes for the movies. To address this concern, new unreleased movies are used to ensure that similarity assessments are not based on similarities between successes at the box office. Table A2 in the appendix lists the chosen target movies and their release dates.
4.2.2 Previous Cases
Under SBF, human recall used for case retrieval is replaced by an unbiased and objectively obtained source. Past movies were collated from FilmSource [Nielsen EDI, 2005] and formed our ‘memory database’ of previous cases. Unlike previous studies which were based on Variety’s top movie listings [e.g. Litman, 1983; Smith and Smith, 1986; Sochay, 1994; Neelamegham & Chintagunta, 1999; Basuroy et al, 2003], the FilmSource database includes all movies and not just examples among the largest grossing movies, and therefore is arguably more representative than analyses using Variety data.
For movie success, we utilize gross domestic (U.S.) box office revenue (hereafter “revenue”) provided by FilmSource. Adjunct datasets from Box Office Mojo [http://www.boxofficemojo.com, 2005] and Quigley Publishing [2006] are used to obtain production budget data and top money-making actor lists respectively. Note 3 in the appendix details the data transformations and the construction of the “star indicator”, and provides descriptive statistics for the memory database.
The memory database comprises 1,751 movies released in the United States between 1993 and 2004 that were released in 600 or more cinemas. Table A4 in the appendix lists the variables collected and descriptive statistics. Revenue for the dataset was $51.3 million at the mean and $31.0 million at the median. Revenue was highly positively skewed (Skewness=3.21, Std Error=0.06) and fat-tailed (leptokurtic; Kurtosis=16.06, Std Error=0.12). Because of the proprietary nature of financial data, production budget was missing in 16% of the sample. Among the cases with available budget figures, the average estimated production budget was $42.9 million. The production budget distribution was also heavily positively skewed (Skewness=1.54, Std Error=0.06) and fat-tailed (Kurtosis=3.23, Std Error=0.13).
4.2.3 Case Selection Algorithm
With over 1,700 past cases to compare with, it would be a fairly laborious (if not impossible) task to ask movie-goers to rate the similarity of each of these movies to the 19 target movies. Therefore, a two-stage case selection algorithm was used to obtain a manageable subset of previously released movies, $\tilde{C} \subseteq C$. Figure A1 in the appendix summarizes the approach for selecting a subset $\tilde{C}$ from the population memory $C$.
The first stage of the algorithm extracts movies from the memory database that share the same (i) movie genre, (ii) storyline description or (iii) featured actors 10 as the
10 A ‘featured actor’ is either the lead or supporting actor of the movie.
new movie. This algorithm was based on results from a focus study regarding movie attributes 11. Therefore our subset $C_{s1} \subseteq C$ is now defined as:
$$C_{s1} = \text{Genre} \cup \text{Storyline} \cup \text{StarActors}$$
Using this subset, the second stage of the algorithm randomly extracts cases one at a time, until the subset size has been achieved. Literature regarding the appropriate size for creating a representative subset has argued that the subset needs to be broad enough to be statistically meaningful but narrow enough to be comparable to the new case [Lovallo and Kahneman, 2003]. For this study, a class size of 40 cases is deemed large enough to derive meaningful comparisons and small enough to be manageable in a survey setting.
The 40 cases were selected using over-sampling rules regarding featured actors and revenue deciles. Firstly, due to the low representation of movies in $C_{s1}$ with matching featured actors, the algorithm first selected at most 10 movies that had matching featured actors to the target case. Secondly, in anticipation of low familiarity with low-decile (low revenue) movies compared to high-decile (high revenue) blockbusters, the algorithm over-sampled on low-decile movies according to the rule in Table 2 below. Therefore our memory subset is $\tilde{C} = \{c_1, c_2, \ldots, c_n\}$ where $n = 40$ and $\tilde{C} \subseteq C_{s1}$.
Table 2: Over-sampling Rule for Revenue Deciles
Decile 1 2 3 4 5 6 7 8 9 10 Total
# Cases 7 6 5 4 3 3 3 3 3 3 40
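The two-stage selection described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the procedure actually implemented for the thesis: the field names (`genre`, `storyline`, `actors`, `decile`) are hypothetical, revenue deciles are assumed to be precomputed, and the additional rule of first selecting at most 10 movies with matching featured actors is omitted because the text does not say how it interacts with the decile quotas.

```python
import random

# Decile quotas from Table 2 (decile 1 = lowest revenue); they sum to 40.
DECILE_QUOTAS = {1: 7, 2: 6, 3: 5, 4: 4, 5: 3, 6: 3, 7: 3, 8: 3, 9: 3, 10: 3}


def stage_one(memory, target):
    """Keep past movies that share a genre, storyline or featured actor with the target."""
    return [
        m for m in memory
        if m["genre"] == target["genre"]
        or m["storyline"] == target["storyline"]
        or set(m["actors"]) & set(target["actors"])
    ]


def stage_two(candidates, quotas=DECILE_QUOTAS, seed=0):
    """Draw candidates in random order, keeping each one whose decile quota is not yet full."""
    rng = random.Random(seed)  # fixed seed (an assumption) so the draw is repeatable
    pool = list(candidates)
    rng.shuffle(pool)
    remaining = dict(quotas)
    subset = []
    for movie in pool:
        if remaining.get(movie["decile"], 0) > 0:
            subset.append(movie)
            remaining[movie["decile"]] -= 1
    return subset
```

If a decile has fewer candidates than its quota, this sketch simply returns fewer than 40 cases; how such shortfalls were handled in the study is not stated.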
11 Refer to Note 5 in the appendix for detailed results from the focus study.
Case selection is only required when similarity assessment is via human raters. The objective of the case selection is to obtain a manageable subset of past cases that (apart from over-sampling due to low familiarity) is representative of the total memory database. This step in the SBF procedure is not required when similarities are rule-based.
4.3 Methodology for Similarity Assessment
4.3.1 Respondent Pool
Similarity assessments were obtained using a survey of movie-goers. Movie-goers were recruited at two universities (in Los Angeles and Sydney) and one unnamed private company, and through advertisements placed on Craigslist.
A total of 358 respondents took our survey; among them were students (39%), a small group of entertainment industry employees (8%) and general movie-goers (53%). Gender was roughly balanced (54% females). More than half of our respondents (54%) were aged between 21 and 35 years old, and a total of 82% were aged between 13 and 35 years old. Two out of three respondents went to the movies once a month or more, indicating that the majority of our respondents were adequately familiar with motion pictures.
4.3.2 Survey Instrument
The instrument implemented was a web-based self-administered questionnaire. The advantages of a web-based questionnaire were that it allowed respondents from multiple locations to participate and to complete the survey at their own convenience.
The survey was configured with response validation and logic which guided and interacted with the respondent’s choices in the survey.
The ordering of target movies was counterbalanced in case respondents became bored or fatigued while completing the survey. The survey was replicated in four versions, each representing one of four different (partially overlapping) subsets $\tilde{C} = \{\tilde{C}_1, \tilde{C}_2, \tilde{C}_3, \tilde{C}_4\}$, thus increasing the number of comparable past movies without increasing the cognitive burden on individual respondents. Each survey contained a section for each target movie, $c_1^*, c_2^*, \ldots, c_{19}^*$.
4.3.3 Measuring Familiarity
In each section, respondents were presented with a list $\tilde{C}_i$, $i = 1 \ldots 4$, of past movies $c_1, c_2, \ldots, c_{40}$ and were asked to indicate their degree of familiarity with each movie on a 3-point scale. In addition, respondents were told that to be “sufficiently familiar” with a movie they needed to know the genre, main storyline and featured actors of the movie 12. An example of the familiarity item for a subset $\tilde{C}_i$ relating to the target movie The Island is shown in Figure 4. The wording of the item is shown in Note 6 in the appendix. Movies with which the respondent was not sufficiently familiar were excluded from the rest of the questionnaire. If the individual respondent was familiar with the first j movies, then the list would be $\tilde{C} = \{c_1, c_2, \ldots, c_j\}$ where $j \le 40$.
12 The survey also provided respondents with a movie library of all past movies in the questionnaire. This library allowed respondents to look up the movie’s genre, main actors and short synopsis of the storyline in order to assist respondents in determining their level of familiarity.
Figure 4: Measuring Degree of Familiarity with Past Movies
Response options for each movie: "I have seen this movie" / "I haven't seen it, but I am sufficiently familiar with this movie" / "I am not familiar with it"
6TH DAY, THE (Arnold Schwarzenegger, Tony Goldwyn)
FORTRESS (Christopher Lambert, Kurtwood Smith)
DOUBLE TEAM (Jean-Claude Van Damme, Dennis Rodman)
ASTRONAUT'S WIFE, THE (Johnny Depp, Charlize Theron)
EXIT TO EDEN (Dana Delany, Paul Mercurio)
DIABOLIQUE (Sharon Stone, Isabelle Adjani)
TREASURE PLANET (Joseph Gordon-Levitt (Vocal), Brian Murray (Vocal))
EYE OF THE BEHOLDER (Ewan McGregor, Ashley Judd)
BEACH, THE (Leonardo DiCaprio, Tilda Swinton)
DANGEROUS GROUND (Ice Cube, Elizabeth Hurley)
GUILTY AS SIN (Rebecca DeMornay, Don Johnson)
MATRIX, THE (Keanu Reeves, Laurence Fishburne)
ISLAND OF DR. MOREAU, THE (Marlon Brando, Val Kilmer)
LIFE LESS ORDINARY, A (Ewan McGregor, Cameron Diaz)
VILLAGE, THE (Bryce Dallas Howard, Joaquin Phoenix)
BLINK (Madeleine Stowe, Aidan Quinn)
BIG FISH (Ewan McGregor, Albert Finney)
JADE (David Caruso, Linda Fiorentino)
MASTERMINDS (Patrick Stewart, Vincent Kartheiser)
EXTREME OPS (Devon Sawa, Bridgette Wilson-Sampras)
CHRONICLES OF RIDDICK, THE (Vin Diesel, Colm Feore)
STAR WARS: EPISODE I - THE PHANTOM MENACE (Liam Neeson, Ewan McGregor)
SPECIES 2 (Michael Madsen, Natasha Henstridge)
MOULIN ROUGE! (Nicole Kidman, Ewan McGregor)
DARK CITY (Rufus Sewell, Kiefer Sutherland)
TITAN A.E. (Matt Damon (Vocal), Drew Barrymore (Vocal))
STARGATE (Kurt Russell, James Spader)
TRUMAN SHOW, THE (Jim Carrey, Laura Linney)
DOWN WITH LOVE (Renee Zellweger, Ewan McGregor)
4.3.4 Measuring Similarity
Following this, respondents were presented with a short synopsis of the target movie, $c^*$. They were given information about the main actors and the general story plot. They were also shown an image associated with the movie, typically the theatrical poster 13. An example of the target movie description is shown in Figure 5.
13 Unless stated otherwise, the target movie description was obtained from Box Office Mojo website.
Figure 5: Target Movie Description example – “The Island”
THE ISLAND Sci-Fi Fantasy starring Ewan McGregor and Scarlett Johansson. Lincoln Six-Echo (McGregor) is a resident of a seemingly utopian but contained facility in the mid-21st century. Like all of the inhabitants of this carefully controlled environment, Lincoln hopes to be chosen to go to the “The Island”—reportedly the last uncontaminated spot on the planet. But Lincoln soon discovers that everything about his existence is a lie. He and all of the other inhabitants of the facility are actually human clones whose only purpose is to provide “spare parts” for their original human counterparts. Realizing it is only a matter of time before he is “harvested,” Lincoln makes a daring escape with a beautiful fellow resident named Jordan Two-Delta (Johansson). Relentlessly pursued by the forces of the sinister institute that once housed them, Lincoln and Jordan engage in a race for their lives to literally meet their makers.
Respondents were then instructed to rate the similarity of the target case $c^*$ to each movie in their list $\tilde{C}$. To make the task simpler, the survey introduced an intermediate step that asked respondents to identify the 10 most similar movies (in the cases when $j \ge 10$) and provide ratings for these. Therefore we obtained similarity ratings for cases in the set $\tilde{C} = \{c_1, c_2, \ldots, c_v\}$ where $v = \min(10, j)$.
The similarity ratings were provided along a 7-point scale where a rating of 1 indicated highest dissimilarity and 7 indicated highest similarity to the target movie.
The similarity function at the individual respondent level is defined as $s(c_k, c^*)$, $\forall k = 1 \ldots v$, with $1 \le s(c_k, c^*) \le 7$. An example is shown in Figure 6.
Respondents repeated this procedure for all 19 target movies.
Figure 6: Similarity Item example – “The Island”
Using your list of movies, we want to know how you would rate each movie's similarity to The Island . Higher numbers indicate higher degrees of similarity (1 is the lowest possible similarity rating; 7 the highest). You may consider the vast majority of the movies dissimilar or the vast majority very similar or any combination in between. Remember, there are no right or wrong answers - just YOUR opinions that matter.
Very Dissimilar Similar Similar 1 2 3 4 5 6 7
6TH DAY, THE (Arnold Schwarzenegger, Tony Goldwyn)
BEACH, THE (Leonardo DiCaprio, Tilda Swinton)
MATRIX, THE (Keanu Reeves, Laurence Fishburne)
VILLAGE, THE (Bryce Dallas Howard, Joaquin Phoenix)
BIG FISH (Ewan McGregor, Albert Finney)
MASTERMINDS (Patrick Stewart, Vincent Kartheiser)
CHRONICLES OF RIDDICK, THE (Vin Diesel, Colm Feore)
TRUMAN SHOW, THE (Jim Carrey, Laura Linney)
STARGATE (Kurt Russell, James Spader)
DOWN WITH LOVE (Renee Zellweger, Ewan McGregor)
4.3.5 Aggregating Function for Similarity Scores
The individual respondent similarities $s(c_k, c^*)$ were aggregated using a robust mean similarity function $S'(c_k, c^*)$ defined in section 3.2.3. A robust mean has the advantage of automatically reducing the influence of outliers (by giving them lower weight) and also adjusting for skewness (since observations skewed away from the median are given less weight). Figures 7a and 7b illustrate this using an actual example from the survey. Figure 7a shows the differences in the standard and robust means for four example movies compared to The Island. For example, similarity ratings between The Matrix and The Island were skewed towards higher similarity ratings, thus resulting in a robust mean that was greater than the standard mean. On the other hand, for the last two examples (A Life Less Ordinary and Black Hawk Down) the skewness was towards lower ratings, therefore resulting in a robust mean that was lower than the standard mean.
Figure 7a: Similarity Histogram example for “The Island”

Frequency Distribution
Past Case               1   2   3   4   5   6   7   STD MEAN   ROBUST MEAN
The Matrix              2   7  13  26  35  39  37   5.10       5.24
Gattaca                 0   5   2   3   8  12   5   5.00       5.24
A Life Less Ordinary    1   5   4   1   2   0   1   4.25       2.90
Black Hawk Down        25  16   6   5   3   1   0   2.05       1.86
Figure 7b: Similarity Histogram for “The Matrix”
[Histogram of Frequency (0 to 45) against Similarity Score (1 to 7), annotated with S = 5.10, S' = 5.24 and the note "Rating weights decrease away from the mean".]
The robust similarity functions for all 19 target movies were constructed and are reported in the Technical appendix under Note T1. Across all movie pairings, we found a high correlation between standard and robust means: an average correlation coefficient of ρ = 0.981 (Kendall’s τ = 0.900, p<0.01). This result implies that in most instances the robust transformation did not alter the similarity ratings at the means, only that it shifted weights from equally distributed to relatively concentrated at the mean.
4.4 Formulating a Prediction
4.4.1 Preliminary SBF predictions and Cluster Analysis
SBF predictions were constructed using the similarity-weighted revenue function:
$$r_m^* = \frac{\sum_{k=1}^{x} S'(c_k, c^*) \, r_k}{\sum_{k=1}^{x} S'(c_k, c^*)}$$

$\forall k = 1 \ldots x$. Here x=1 refers to the “single-analogy” forecast while x=m refers to the “multi-analogy” forecast using all cases in the memory. We found that using all cases in the memory tended to produce forecasts which did not vary much across movies. Because most movies have some components of similarity to many other movies, there are often a lot of previous cases with similarity ratings of 4-5 out of 7, which weigh relatively heavily compared to highly similar cases with the maximum rating of 7. On the other hand, using only a single analogy (i.e. x=1) produces forecasts which are too variable.
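As a rough illustration of the similarity-weighted revenue function above, the sketch below computes the forecast from the x most similar cases. The function name and the input data are hypothetical and chosen only to show the contrast between the single-analogy and multi-analogy forecasts; they are not taken from the survey.

```python
def sbf_prediction(cases, x):
    """Similarity-weighted average revenue of the x most similar past cases.

    cases: list of (similarity, revenue) pairs, where similarity stands for
    the aggregated rating S'(c_k, c*) on the 1-7 scale.
    """
    if not 1 <= x <= len(cases):
        raise ValueError("x must be between 1 and the number of cases")
    # Rank past cases by similarity to the target and keep the top x.
    top = sorted(cases, key=lambda c: c[0], reverse=True)[:x]
    total_similarity = sum(s for s, _ in top)
    return sum(s * r for s, r in top) / total_similarity


# Hypothetical (similarity, revenue in US$ millions) pairs, for illustration only.
cases = [(6.0, 100.0), (5.0, 20.0), (4.0, 60.0), (2.0, 300.0)]
single = sbf_prediction(cases, x=1)           # single-analogy forecast
multi = sbf_prediction(cases, x=len(cases))   # multi-analogy forecast
```

With x=1 the forecast is simply the revenue of the most similar case; as x grows, moderately similar cases pull the forecast towards a common average, which is the behaviour the paragraph above describes.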
Therefore, instead of choosing an ad-hoc rule, we allowed the number of cases to be endogenously determined, in this case using hierarchical cluster analysis in SPSS. From this analysis cluster memberships were retrieved for movies in $\tilde{C}$. Cluster memberships indicate the cluster to which a movie exclusively belongs.
To determine whether a cluster was robust, the cluster analysis was repeated for 2 to 10 clusters. This analysis would identify the most robust clusters and therefore the number of cases x to include in the prediction. Cluster membership schedules for all 19 target movies are presented in the Technical appendix under Note T2. Table 3 is an example using “The Island” which shows that the top cluster contains 6 cases, which is robust over 7 cluster iterations (4 to 10 clusters).
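One possible way to read a cluster membership schedule programmatically is sketched below. The text does not give a formal robustness criterion, so this is only an assumed operationalisation: the top cluster is taken to be the one labelled 1, and robustness is taken to mean the longest run of consecutive cluster counts over which its membership is unchanged. The schedule data are hypothetical.

```python
def top_cluster_stability(schedule, top_label=1):
    """Find the longest run of consecutive cluster counts with an unchanged top cluster.

    schedule: dict mapping number of clusters -> {movie: cluster label}.
    Returns (members, cluster_counts) for that run.
    """
    best_members, best_run = frozenset(), []
    current_members, current_run = None, []
    for k in sorted(schedule):
        members = frozenset(m for m, label in schedule[k].items() if label == top_label)
        if members == current_members and current_run and k == current_run[-1] + 1:
            current_run.append(k)  # same membership as the previous solution
        else:
            current_members, current_run = members, [k]  # membership changed
        if len(current_run) > len(best_run):
            best_members, best_run = current_members, list(current_run)
    return best_members, best_run


# Hypothetical schedule for four past movies A-D, for illustration only.
schedule = {
    2: {"A": 1, "B": 1, "C": 1, "D": 1},
    3: {"A": 1, "B": 1, "C": 1, "D": 2},
    4: {"A": 1, "B": 1, "C": 2, "D": 3},
    5: {"A": 1, "B": 1, "C": 2, "D": 3},
    6: {"A": 1, "B": 1, "C": 2, "D": 3},
}
members, counts = top_cluster_stability(schedule)
```

Under this reading, the size of `members` would play the role of x and the length of `counts` the role of the cluster iteration range.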
Table 4 summarizes the SBF predictions, the number of cases in the top cluster and the cluster iteration range. In some instances, the cluster comprised one stand-out movie based on similarity (e.g. War of the Worlds), while in others, the top cluster contained more than 10 similar movies (e.g. Fantastic Four). On average the top cluster contained 5 past movies (Table 4).
Table 3: Cluster Membership Schedule (Abbreviated) - “The Island”

                                  Number of Clusters
ID                                10   9   8   7   6   5   4   3   2
The Matrix                         1   1   1   1   1   1   1   1   1
Gattaca                            1   1   1   1   1   1   1   1   1
The Truman Show                    1   1   1   1   1   1   1   1   1
The Island of Dr. Moreau           1   1   1   1   1   1   1   1   1
No Escape                          1   1   1   1   1   1   1   1   1
Fortress                           2   2   1   1   1   1   1   1   1
Species 2                          3   3   2   2   2   2   2   1   1
The 6th Day                        3   3   2   2   2   2   2   1   1
AI: Artificial Intelligence        3   3   2   2   2   2   2   1   1
The Ninth Gate                     3   3   2   2   2   2   2   1   1
Strange Days                       3   3   2   2   2   2   2   1   1
The Village                        3   3   2   2   2   2   2   1   1
The Truth about Charlie            3   3   2   2   2   2   2   1   1
Dark City                          3   3   2   2   2   2   2   1   1
Paycheck                           3   3   2   2   2   2   2   1   1
Godsend                            3   3   2   2   2   2   2   1   1
Universal Soldier: The Return      3   3   2   2   2   2   2   1   1
Planet of the Apes                 3   3   2   2   2   2   2   1   1
Pleasantville                      3   3   2   2   2   2   2   1   1
Antitrust                          4   4   3   3   3   3   2   1   1
The Thirteenth Floor               4   4   3   3   3   3   2   1   1
Pitch Black                        4   4   3   3   3   3   2   1   1
Demolition Man                     4   4   3   3   3   3   2   1   1
Frailty                            4   4   3   3   3   3   2   1   1
Table 4: Top Cluster and SBF raw prediction

Target Movies                     Number of cases      Cluster iteration   SBF raw prediction
                                  in top cluster (x)   range               using robust cluster
War of the Worlds                  1                   10                  $ 358,294,057
Charlie & the Chocolate Factory    6                    8                  $ 41,391,270
The Wedding Crashers              15                    4                  $ 44,529,473
Fantastic Four                    12                    6                  $ 121,213,662
The Dukes of Hazzard               4                    8                  $ 50,816,103
Four Brothers                     10                    4                  $ 47,080,828
Bewitched                          1                    6                  $ 52,949,951
Sky High                           2                    5                  $ 196,069,962
Red Eye                            3                    4                  $ 58,418,817
Skeleton Key                       6                    5                  $ 44,933,002
Must Love Dogs                     5                    8                  $ 28,761,896
The Brothers Grimm                 1                    7                  $ 9,907,064
The Island                         6                    8                  $ 8,493,751
Bad News Bears                     1                    5                  $ 21,130,041
Stealth                            4                    5                  $ 11,418,401
Dark Water                         4                    5                  $ 63,736,572
Deuce Bigalow: European Gigolo     1                   10                  $ 73,205,095
Valiant                            8                    5                  $ 17,225,593
Rebound                            4                    9                  $ 24,226,004
Average                            5                    6
Median                             4                    6
4.4.2 Case Adaptation
An implicit assumption in using judged similarity ratings was that the respondent pool was a wise crowd with the ability to consolidate diverse information accounting for the relevant determinants of revenue. This assumption was tested by regressing actual revenue (ln(Revenue)) on the SBF forecasted revenue (ln(SBF)) along with production budget (ln(Prod)), movie genres (Genre), the sequel dummy (Sequel) and the star actor dummy (Actor) variables. In addition, to proxy for marketing and advertising efforts, Shugan’s [2000] approach of using the opening week number of theaters (OpTheaters) was also included in this model [p.8]. While the widest point of release (maximum number of theaters) is a supply-side variable, the initial number of theaters (at opening week) can be thought of as a studio-determined variable just like production and marketing costs. Note 10 in the appendix presents correlation matrices for marketing costs and opening week theaters 14; the variables were significantly correlated (ρ = 0.618, p<0.01). Using the target movie sample, the following regression equation was estimated:
$$\ln(\text{Revenue}_k) = \beta_0 + \beta_1 \ln(\text{SBF}_k) + \beta_2 \ln(\text{Prod}_k) + \beta_3 (\text{OpTheaters}_k) + \beta_4 D(\text{Sequel}_k) + \sum_{j=1}^{z} \gamma_j D(\text{Actor}_j)_k + \sum_{i=1}^{10} \delta_i D(\text{Genre}_i)_k$$

$\forall k = 1 \ldots 19$, where $k$ denotes the new movie, $i$ represents the twelve genre categories (Drama, Comedy, Action, Adventure, Sci-fi, Horror, Romantic, Animated, Fantasy, Musical, Suspense, and Western), $(\text{Actor}_j)_k$ is the vector of dummy variables for actor $j$ in movie $k$ and $(\text{Genre}_i)_k$ is the vector of dummy variables for genre $i$ in movie $k$.
14 Due to the proprietary nature of marketing and advertising costs, it was difficult to obtain complete and reliable data for cases in our memory database. Only a small fraction (33%) of these movies had marketing figures.
Both actor and genre are represented by a vector of dummy variables, allowing a movie to have multiple featured actors and be described by multiple genres. For example, the movie City Slickers (featuring Billy Crystal and Jack Palance), which has both comedy and western genre aspects, would have $(\text{Genre}_i) = 1$ where $i$ = Western, Comedy 15.
This model was estimated using backward stepwise regression in SPSS. The estimated SBF model was as follows:
$$\ln(\text{Revenue}_k) = 8.309 + 0.419 \ln(\text{SBF}_k) + 0.001 (\text{OpTheaters}_k) - 1.129 \, D(\text{Sequel}_k)$$
The model has a fit of $R^2 = 0.55$ with the sample, and SBF forecasted revenue, opening week theaters and the sequel dummy were all significant at the 3% significance level 16. Note 7 in the appendix presents detailed statistics on the estimated model. The results from the stepwise regression suggested that the SBF forecasted revenue, opening week theaters and the sequel indicator were the three variables with a statistically significant relationship to actual revenue. Therefore, the estimated SBF model was used to obtain revised SBF predictions.
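To make the case adaptation step concrete, the sketch below applies the reported coefficients (8.309, 0.419, 0.001 and -1.129) to a raw SBF forecast. It assumes revenue is measured in US dollars, which this section does not state explicitly, and the function name and example inputs are hypothetical. Because the published coefficients are rounded, the output should not be expected to reproduce the revised predictions in Table 6 exactly.

```python
import math


def revised_sbf_prediction(raw_sbf, opening_theaters, is_sequel):
    """Apply the reported fitted equation to a raw SBF forecast (assumed to be in dollars)."""
    log_revenue = (
        8.309
        + 0.419 * math.log(raw_sbf)            # raw SBF forecast, in logs
        + 0.001 * opening_theaters             # opening week number of theaters
        - 1.129 * (1 if is_sequel else 0)      # sequel dummy
    )
    return math.exp(log_revenue)


# Hypothetical inputs, for illustration only.
original = revised_sbf_prediction(50_000_000, 3000, is_sequel=False)
sequel = revised_sbf_prediction(50_000_000, 3000, is_sequel=True)
```

Holding the other inputs fixed, the sequel dummy scales the revised forecast by exp(-1.129), roughly one third of the non-sequel value.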
4.5 Comparative Analysis
4.5.1 Hedonic Linear Regression
Also of interest would be a comparison of SBF predictions with those from an hedonic linear regression with production budget (ln(Prod)), movie genres (Genre), sequel dummy (Sequel), opening week theaters (OpTheaters), the presence of top actors (Actors), and control factors such as the year of release (Year) and a dummy for
15 I owe this example to the comments of an anonymous examiner.
16 One-tailed significance level.
whether the movie was released during a holiday (Holiday) as predictor attributes.
Using the memory database of 1,751 movies the following model was estimated:
$$\ln(\text{Revenue}_k) = \beta_0 + \beta_1 \ln(\text{Prod}_k) + \beta_2 D(\text{Sequel}_k) + \beta_3 \text{OpTheaters}_k + \beta_4 \text{Year}_k + \psi_1 \text{Holiday}_k + \sum_{j=1}^{z} \gamma_j (\text{Actor}_j)_k + \sum_{i=1}^{10} \delta_i (\text{Genre}_i)_k$$

$\forall k = 1 \ldots 19$, where $k$ denotes the new movie, $i$ represents the twelve genre categories (Drama, Comedy, Action, Adventure, Sci-fi, Horror, Romantic, Animated, Fantasy, Musical, Suspense, and Western), $(\text{Actor}_j)_k$ is the vector of dummy variables for actor $j$ in movie $k$ and $(\text{Genre}_i)_k$ is the vector of dummy variables for genre $i$ in movie $k$. Note 8 provides a summary of the estimated regression model. The model had a fit measure of $R^2 = 0.35$ and the model’s residuals appeared normally distributed. Thus, the following equation was used to derive regression predictions of movie revenues:

$$\begin{aligned}
\ln(\text{Revenue}_k) = {} & -9.074 + 0.449 \ln(\text{Prod}_k) + 0.235 \, D(\text{Sequel}_k) + 0.0003 \, \text{OpTheaters}_k \\
& + 0.009 \, \text{Year}_k - 0.006 \, \text{Holiday}_k + 0.275 \, D(\text{Actor}_j)_k + 0.228 \, (\text{Drama}) \\
& - 0.030 \, (\text{Comedy}) - 0.076 \, (\text{Action}) - 0.237 \, (\text{Adventure}) - 0.152 \, (\text{Scifi}) \\
& + 0.010 \, (\text{Horror}) + 0.097 \, (\text{Romantic}) + 0.233 \, (\text{Animated}) + 0.072 \, (\text{Fantasy}) \\
& - 0.011 \, (\text{Musical}) + 0.018 \, (\text{Suspense}) - 0.37 \, (\text{Western})
\end{aligned}$$
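For readers who want to evaluate the fitted hedonic equation above, a minimal sketch follows. It assumes the production budget is in US dollars and that Year is the calendar year of release; neither unit is stated in this section, so both are assumptions. The function name, argument names and example inputs are hypothetical, and the rounded coefficients mean the output will only approximate the regression predictions reported later.

```python
import math

# Genre coefficients as reported in the fitted hedonic equation.
GENRE_COEFFICIENTS = {
    "Drama": 0.228, "Comedy": -0.030, "Action": -0.076, "Adventure": -0.237,
    "Scifi": -0.152, "Horror": 0.010, "Romantic": 0.097, "Animated": 0.233,
    "Fantasy": 0.072, "Musical": -0.011, "Suspense": 0.018, "Western": -0.37,
}


def hedonic_prediction(budget, opening_theaters, year, genres=(),
                       sequel=False, holiday=False, top_actor=False):
    """Predicted revenue from the fitted hedonic equation (units assumed to be dollars)."""
    log_revenue = (
        -9.074
        + 0.449 * math.log(budget)
        + 0.235 * (1 if sequel else 0)
        + 0.0003 * opening_theaters
        + 0.009 * year
        - 0.006 * (1 if holiday else 0)
        + 0.275 * (1 if top_actor else 0)
        + sum(GENRE_COEFFICIENTS[g] for g in genres)  # a movie may carry several genres
    )
    return math.exp(log_revenue)


# Hypothetical movie, for illustration only.
base = hedonic_prediction(40_000_000, 3000, 2005)
western = hedonic_prediction(40_000_000, 3000, 2005, genres=("Western",))
```

Because the model is log-linear, each dummy acts multiplicatively: adding the Western genre scales the predicted revenue by exp(-0.37).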
4.5.2 Combined SBF-Regression Model
Gayer et al [2004] suggested that a combined case-based and regression forecast should also be considered. The combined SBF-regression forecast was constructed by including the SBF predicted revenue (ln(SBF)) in the hedonic equation in section 4.5.1. For our sample of target movies released in summer 2005, the variables (Year) and (Holiday) were constant and thus omitted from the model. Note 9 in the appendix provides some statistics for the estimated combined model:

$$\begin{aligned}
\ln(\text{Revenue}_k) = {} & -12.181 + 0.865 \ln(\text{SBF}_k) + 0.927 \ln(\text{Prod}_k) - 1.99 \, D(\text{Sequel}_k) \\
& - 0.129 \, D(\text{Actor}_j)_k - 2.942 \, (\text{Drama}) - 5.045 \, (\text{Comedy}) - 1.782 \, (\text{Action}) \\
& + 0.499 \, (\text{Adventure}) - 4.955 \, (\text{Scifi}) - 3.619 \, (\text{Horror}) - 0.732 \, (\text{Romantic}) \\
& + 0.855 \, (\text{Animated}) - 1.730 \, (\text{Fantasy}) - 2.210 \, (\text{Suspense})
\end{aligned}$$
4.5.3 Performance Hypotheses
Based on the reviewed arguments from the existing movie success literature the following hypothesis was examined:
Hypothesis 1: SBF predictions of movie revenues are more accurate than
those obtained from hedonic regression.
A second hypothesis, concerning the combined SBF-regression forecast, was also examined. A stream of research in the forecasting literature argues for combining forecasts from several methods to formulate a more accurate forecast. Armstrong [2001] provides a detailed review of studies that support this claim.
Hypothesis 2a: Combined SBF-Regression predictions of movie revenues
are more accurate than those obtained from hedonic regression.
Hypothesis 2b: Combined SBF-Regression predictions of movie revenues
are more accurate than those obtained from SBF alone.
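In both hypotheses, (the lack of) accuracy was measured by the mean absolute relative error (or “MARE”) averaged over the n cases k being forecast, i.e.,
$$\mathrm{MARE} = \frac{1}{n} \sum_{k=1}^{n} \left| \frac{r_k^* - r_k}{r_k} \right|$$
where $r_k^*$ is the predicted revenue and $r_k$ the actual revenue of case $k$.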
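The MARE calculation can be checked with a few lines of code. The sketch below uses the actual and SBF-predicted revenues of the five target movies in the $50-$100m tier of Table 6 (The Dukes of Hazzard, Four Brothers, Sky High, Bewitched and Red Eye); the function name is hypothetical.

```python
def mare(predicted, actual):
    """Mean absolute relative error of predictions against actual values."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - a) / a for p, a in zip(predicted, actual)) / len(actual)


# Revenues in US$ millions, taken from the $50-$100m tier of Table 6.
actual = [80.27, 74.49, 63.95, 63.31, 57.89]
sbf = [91.55, 37.69, 88.38, 61.35, 59.92]
print(f"{mare(sbf, actual):.2%}")  # 21.65%
```

The result agrees with the 21.65% SBF MARE reported for that tier in Table 6.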
5 Results & Analysis
5.1 SBF and Hedonic Regression
The “Actual Revenue” column in Table 6 presents the box office revenue for each target movie. The mean revenue was $76.81 million, which was approximately 50% greater than the historical mean of $51.3 million. Despite being buoyed by the favorable summer timing of release, half of our new movie sample earned less than the historical average of $51.3 million. In fact just 20% of the sample earned more than $100 million in revenue. Figure 8 compares SBF and regression revenue predictions against actual revenue for each target case.
Table 6 also compares the MARE for SBF and regression forecasts. The MARE for SBF forecasts was 30.09% compared to 62.42% for regression forecasts, thus providing support for hypothesis 1 (t=2.21, p<0.04). On average, SBF was 51.78% more accurate than regression in predicting revenue, with the largest gain in accuracy
(of 62.65% over regression) for small grossing movies (less than $50 million revenue).
On a case-by-case basis, SBF provided a more accurate forecast in 15 out of 19 (or
79%) of movies than did regression. Figure 9 compares the error rates associated with the SBF and regression models. The solid points in the graph indicate the cases where
SBF was more accurate than regression, while the white points indicate the reverse.
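The text does not spell out how "more accurate" is quantified, but the reported percentages are consistent with the relative reduction in MARE. Under that assumed reading, the overall gain can be recomputed from the rounded MARE figures as follows (the function name is hypothetical):

```python
def accuracy_gain(mare_benchmark, mare_model):
    """Relative reduction in MARE achieved by a model over a benchmark."""
    return (mare_benchmark - mare_model) / mare_benchmark


# Overall MAREs reported for regression (62.42%) and SBF (30.09%).
gain = accuracy_gain(0.6242, 0.3009)
print(f"{gain:.1%}")  # 51.8%
```

This agrees with the reported 51.78% up to rounding of the inputs.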
Two examples of where SBF offered greater accuracy over regression were the movies The Island and Stealth, where SBF was 76% and 64% more accurate than the regression prediction, respectively. In the case of The Island, large revenue-earning blockbuster movies such as The Matrix ($193 million) and The Truman Show ($144 million) were rated as highly similar, but so too were several small-revenue earners such as No Escape ($18 million), Gattaca ($14 million) and Fortress ($8 million). Similarly, for Stealth, the hit movies I, Robot ($147 million) and Terminator 3: Rise of the Machines ($155 million) were among the most similar movies, but so were smaller-grossing movies such as Imposter ($7 million) and Mimic ($29 million).
These examples in particular demonstrate the benefits of having an objectively obtained memory database. For example, in approving the production of The Island , the analogies which spring to mind most easily are likely to be higher-grossing films.
People aren’t as likely to think about the low-grossing analogies because nobody saw them (literally) and certainly no one trying to sell a movie uses them. The SBF method, however, uses all analogies from past similar cases, regardless of their ease of recall. To the extent that high revenue-earning movies are more ‘available’ in the decision maker’s memory, frameworks such as the basic CBDT that rely on the recall of the decision maker are susceptible to over-estimating actual box office performance.
In 4 out of the 19 cases, the regression prediction was more accurate than SBF. In one example, SBF under-predicted the movie Charlie and the Chocolate Factory by 77%; the largest absolute error across all 19 target movies. The under-prediction could perhaps be explained by the fact that the cluster of the six most similar movies included one large-grossing movie but five movies grossing less than $40 million as shown in Table 5.
Table 5: Movies rated as most similar to Charlie and the Chocolate Factory

Past Movie                                          Gross Revenue
Lemony Snicket's A Series of Unfortunate Events     $ 119,457,073
James and the Giant Peach                           $ 33,979,766
The Pagemaster                                      $ 16,525,612
Tall Tale                                           $ 9,907,064
Matilda                                             $ 39,155,825
The Borrowers                                       $ 25,881,561
60 Table 6: Predictions using Regression, SBF and Combined models
Combined Regression SBF Model Actual Regression SBF Combined Error Error Error >$100m War of the Worlds $234.28 $160.39 $225.79 $224.62 -31.54% -3.62% -4.13% The Wedding Crashers $209.26 $45.59 $85.73 $67.33 -78.21% -59.03% -67.82% Charlie & the Chocolate Factory $206.46 $156.35 $46.68 $100.47 -24.27% -77.39% -51.34% Fantastic Four $154.70 $71.38 $116.31 $166.00 -53.86% -24.81% 7.31%
Mean Absolute Relative Error (MARE) 46.97% 41.21% 32.65% SBF's % Accuracy over Regression 12.25% 30.49%
Between $50- $100m The Dukes of Hazzard $80.27 $49.67 $91.55 $78.58 -38.12% 14.05% -2.10% Four Brothers $74.49 $50.38 $37.69 $65.31 -32.37% -49.40% -12.33% Sky High $63.95 $36.16 $88.38 $60.96 -43.46% 38.21% -4.67% Bewitched $63.31 $108.55 $61.35 $70.44 71.45% -3.09% 11.25% Red Eye $57.89 $50.43 $59.92 $62.68 -12.89% 3.50% 8.28%
Mean Absolute Relative Error (MARE) 39.66% 21.65% 7.72% SBF's % Accuracy over Regression 45.41% 80.52%
Bold figures indicate lowest error achieved on a case-by-case basis.
(Table 6 continued)

Movie                             Actual   Regression  SBF      Combined  Regression  SBF       Combined
                                  ($m)     ($m)        ($m)     ($m)      Error       Error     Model Error
Less than $50m
Skeleton Key                      $47.91   $44.76      $43.49   $27.13    -6.58%      -9.22%    -43.38%
Must Love Dogs                    $43.89   $38.15      $30.08   $42.61    -13.08%     -31.48%   -2.94%
The Brothers Grimm                $37.92   $73.80      $28.64   $8.34     94.64%      -24.48%   -78.00%
The Island                        $35.82   $71.02      $27.50   $7.22     98.29%      -23.24%   -79.84%
Bad News Bears                    $32.87   $46.96      $42.00   $43.65    42.88%      27.79%    32.81%
Stealth                           $32.12   $54.27      $40.16   $30.86    68.99%      25.05%    -3.90%
Dark Water                        $25.47   $49.96      $46.58   $43.10    96.11%      82.86%    69.21%
Deuce Bigalow: European Gigolo    $22.40   $52.10      $22.02   $22.02    132.57%     -1.71%    -1.71%
Valiant                           $19.48   $39.52      $17.35   $18.86    102.90%     -10.94%   -3.20%
Rebound                           $16.81   $40.96      $27.22   $24.39    143.70%     61.92%    45.07%
Mean Absolute Relative Error (MARE)                                       79.97%      29.87%    36.01%
SBF's % Accuracy over Regression                                                      62.65%    54.98%

Mean Absolute Relative Error All Cases (MARE)                             62.42%      30.09%    27.86%
SBF's % Accuracy over Regression                                                      51.78%    55.37%
Bold figures indicate lowest error achieved on a case-by-case basis.
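The summary rows of Table 6 can be checked with a short calculation. The sketch below assumes that MARE is the mean of |predicted - actual| / actual and that the "% Accuracy over Regression" row is the relative reduction in MARE; both assumptions reproduce the published figures for the ">$100m" group to within rounding. The function names are illustrative only.

```python
def mare(actual, predicted):
    """Mean absolute relative error: average of |predicted - actual| / actual."""
    errors = [abs(p - a) / a for a, p in zip(actual, predicted)]
    return sum(errors) / len(errors)


def accuracy_gain(benchmark_mare, model_mare):
    """Relative reduction in MARE versus the benchmark (here, regression)."""
    return (benchmark_mare - model_mare) / benchmark_mare


# Table 6, ">$100m" group, in US$ millions.
actual = [234.28, 209.26, 206.46, 154.70]
regression = [160.39, 45.59, 156.35, 71.38]
sbf = [225.79, 85.73, 46.68, 116.31]
combined = [224.62, 67.33, 100.47, 166.00]

mare_reg = mare(actual, regression)
mare_sbf = mare(actual, sbf)
mare_comb = mare(actual, combined)

print(f"MARE regression: {mare_reg:.2%}")
print(f"MARE SBF:        {mare_sbf:.2%}")
print(f"MARE combined:   {mare_comb:.2%}")
print(f"SBF gain over regression:      {accuracy_gain(mare_reg, mare_sbf):.2%}")
print(f"Combined gain over regression: {accuracy_gain(mare_reg, mare_comb):.2%}")
```

Small differences in the last decimal place are to be expected because the table inputs are themselves rounded.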
Figure 8: SBF and Regression Revenue Predictions versus Actual
[Figure: revenue (US$ millions, $0 to $250) by movie ID (1 to 19) for the Actual, Regression Model and SBF Model series. Movie IDs: 1 War of the Worlds, 2 The Wedding Crashers, 3 Charlie & the Chocolate Factory, 4 Fantastic Four, 5 The Dukes of Hazzard, 6 Four Brothers, 7 Sky High, 8 Bewitched, 9 Red Eye, 10 Skeleton Key, 11 Must Love Dogs, 12 The Brothers Grimm, 13 The Island, 14 Bad News Bears, 15 Stealth, 16 Dark Water, 17 Deuce Bigalow: European Gigolo, 18 Valiant, 19 Rebound.]
Figure 9: MARE for SBF and Regression predictions
[Figure: scatter plot of SBF-14 Prediction Error against Regression Prediction Error, both axes running from -200% to 200%.]
Shaded points indicate a smaller MARE under SBF compared to Regression model.
Unshaded points indicate a smaller MARE under Regression compared to SBF model.
Square point (□) indicates a large grossing movie (>$100m)
Triangular point (∆) indicates an average grossing movie (between $50-$100m)
Diamond point (◊) indicates a small grossing movie (less than $50m)
5.2 Combined SBF-Regression Predictions
Figure 10 compares the combined model’s predictions with those obtained from regression and SBF. This graph suggests that combining regression and SBF does not necessarily result in an “average” of the two predictions; in some cases the combined forecast is greater than both predictions (e.g. Fantastic Four) and in others it is less than both predictions (e.g. Rebound and Valiant).
On a case-by-case basis, the combined SBF-regression model outperformed the regression and SBF predictions in more than half the cases (10 out of 19). In a further 7 movies, SBF produced the most accurate forecast. In total, 17 out of 19 target movies showed a gain in predictive accuracy from introducing a case-based approach implemented using SBF.
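The case-by-case comparison can be tallied directly from the error columns of Table 6. The sketch below takes the absolute value of each reported error and picks the model with the smallest one per movie. The tie-breaking rule (a tie between SBF and the combined model is credited to the combined model) is an assumption made here for illustration, not something stated in the text.

```python
from collections import Counter

# Absolute relative errors (%) from Table 6: (regression, SBF, combined).
errors = {
    "War of the Worlds": (31.54, 3.62, 4.13),
    "The Wedding Crashers": (78.21, 59.03, 67.82),
    "Charlie & the Chocolate Factory": (24.27, 77.39, 51.34),
    "Fantastic Four": (53.86, 24.81, 7.31),
    "The Dukes of Hazzard": (38.12, 14.05, 2.10),
    "Four Brothers": (32.37, 49.40, 12.33),
    "Sky High": (43.46, 38.21, 4.67),
    "Bewitched": (71.45, 3.09, 11.25),
    "Red Eye": (12.89, 3.50, 8.28),
    "Skeleton Key": (6.58, 9.22, 43.38),
    "Must Love Dogs": (13.08, 31.48, 2.94),
    "The Brothers Grimm": (94.64, 24.48, 78.00),
    "The Island": (98.29, 23.24, 79.84),
    "Bad News Bears": (42.88, 27.79, 32.81),
    "Stealth": (68.99, 25.05, 3.90),
    "Dark Water": (96.11, 82.86, 69.21),
    "Deuce Bigalow: European Gigolo": (132.57, 1.71, 1.71),
    "Valiant": (102.90, 10.94, 3.20),
    "Rebound": (143.70, 61.92, 45.07),
}


def best_model(reg, sbf, comb):
    """Model with the lowest absolute error; ties go to the lower priority number."""
    candidates = [(comb, 0, "combined"), (sbf, 1, "sbf"), (reg, 2, "regression")]
    return min(candidates)[2]


tally = Counter(best_model(*e) for e in errors.values())
# Movies where at least one case-based forecast beats regression.
case_based_wins = sum(1 for reg, sbf, comb in errors.values() if min(sbf, comb) < reg)

print(tally)
print(case_based_wins)
```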
Table 6 above compares the combined model predictions with both SBF and regression. On average, the combined model predictions were 55% more accurate than regression predictions and 7% more accurate than SBF predictions. A paired t-test was conducted to determine whether these gains in accuracy due to combining forecasting methods were statistically significant. The results in Table 7 showed a significant gain in accuracy over regression but not over SBF, thus finding support for hypothesis 2a but not hypothesis 2b.
Figure 10: Regression, SBF and Combined Model predictions
[Figure: revenue (US$ millions, $0 to $250) by movie ID (1 to 19) for the Regression Model, SBF Model, Combined Model and Actual series. Movie IDs: 1 War of the Worlds, 2 The Wedding Crashers, 3 Charlie & the Chocolate Factory, 4 Fantastic Four, 5 The Dukes of Hazzard, 6 Four Brothers, 7 Sky High, 8 Bewitched, 9 Red Eye, 10 Skeleton Key, 11 Must Love Dogs, 12 The Brothers Grimm, 13 The Island, 14 Bad News Bears, 15 Stealth, 16 Dark Water, 17 Deuce Bigalow: European Gigolo, 18 Valiant, 19 Rebound.]
Figure 11: MARE for Regression and Combined Model predictions
[Figure: scatter plot of Combined SBF-Regression Prediction Error against Regression Prediction Error, both axes running from -200% to 200%.]
Shaded points indicate a smaller MARE under the Combined model compared to Regression model.
Unshaded points indicate a smaller MARE under Regression compared to the Combined model.
Square point (□) indicates a large grossing movie (>$100m)
Triangular point (∆) indicates an average grossing movie (between $50-$100m)
Diamond point (◊) indicates a small grossing movie (less than $50m)
Table 7: Paired Sample Test Results

                                 Paired Differences
Hypothesis                       Mean      Std. Dev   T       Sig (2 tail)
1    Regression - SBF            0.306     0.632      2.115   0.048
2a   Combined - Regression       -0.367    0.727      2.205   0.041
2b   Combined - SBF              -0.061    0.281      0.950   0.355

D.f.=18, Bold indicates significance at the 5% level.
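As a consistency check on Table 7, the paired t statistic can be recovered from the reported mean and standard deviation of the paired differences using t = mean / (sd / sqrt(n)), with n = 19 pairs (d.f. = 18). This is a minimal sketch using only the rounded values printed in the table, so the results agree with the reported T column only to about two decimal places; the table appears to report the magnitude of t without its sign.

```python
import math


def paired_t(mean_diff, sd_diff, n):
    """t statistic for a paired-sample test from summary statistics."""
    return mean_diff / (sd_diff / math.sqrt(n))


n_pairs = 19  # d.f. = 18 in Table 7

# (mean, standard deviation) of the paired differences, as printed in Table 7.
table7 = {
    "1  Regression - SBF": (0.306, 0.632),
    "2a Combined - Regression": (-0.367, 0.727),
    "2b Combined - SBF": (-0.061, 0.281),
}

t_stats = {label: paired_t(m, s, n_pairs) for label, (m, s) in table7.items()}

for label, t in t_stats.items():
    print(f"{label}: |t| = {abs(t):.3f}")
```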
5.3 Prediction of Small-grossing Movies
From a studio executive’s viewpoint, the ability to discriminate or ‘flag’ small revenue-earning movies is a priority when making movie production decisions. “Small revenue earners” describes those which achieve gross box office revenue below the historical average of $51.3 million. Of the total sample, 10 movies were small revenue-earners.
The regression forecasting model predicted this group of movies with an average error of 79.97%. In 8 out of 10 movies, regression over-predicted revenue by a margin of 40% - 150%. In comparison, SBF forecasts (MARE = 29.87%) were on average 62.65% more accurate, while the combined SBF-regression forecasts (MARE = 36.01%) were roughly 55% more accurate than regression.
In terms of its ability to flag small-grossing movies (i.e. predict revenue to be less than $51.3 million), regression correctly identified 6 movies as small-grossing movies, compared to the SBF and combined SBF-regression models, which correctly identified all 10 movies as small-grossing movies. To the extent that SBF and the combined model can correctly identify small-grossing movies, either method would allow a studio to avoid potential losses at the early pre-production stage of the decision.
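The flagging counts above follow from applying the $51.3 million threshold to the predictions in the "Less than $50m" block of Table 6. A minimal sketch of that check (variable names are illustrative):

```python
THRESHOLD = 51.3  # historical average gross revenue, US$ millions

# Predictions (US$ millions) for the 10 small revenue-earners in Table 6:
# (regression, SBF, combined).
predictions = {
    "Skeleton Key": (44.76, 43.49, 27.13),
    "Must Love Dogs": (38.15, 30.08, 42.61),
    "The Brothers Grimm": (73.80, 28.64, 8.34),
    "The Island": (71.02, 27.50, 7.22),
    "Bad News Bears": (46.96, 42.00, 43.65),
    "Stealth": (54.27, 40.16, 30.86),
    "Dark Water": (49.96, 46.58, 43.10),
    "Deuce Bigalow: European Gigolo": (52.10, 22.02, 22.02),
    "Valiant": (39.52, 17.35, 18.86),
    "Rebound": (40.96, 27.22, 24.39),
}

models = ("regression", "sbf", "combined")

# A movie is flagged as small-grossing when its predicted revenue is below the threshold.
flagged = {
    model: [title for title, preds in predictions.items() if preds[i] < THRESHOLD]
    for i, model in enumerate(models)
}

for model in models:
    print(model, len(flagged[model]))
```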
6 Discussion and Conclusion
6.1 Contributions to the Literature
This paper developed a prescriptive operational case-based approach called similarity-based forecasting and applied it to the problem of forecasting movie box office success.
This study contributed to the existing business forecasting and decision-making literature in three important ways. Firstly, it provided the case-based decisions literature with an implementable operational schema. Previously, only one study, by Gayer et al [2004], had provided an operationalization of case-based prediction, for forecasting apartment rental and sale prices. The present study highlighted innovations to the case-based approach, which include:
• A memory database where cases were objectively obtained,
• Predictions based on individual case similarity not pooled similarity,
• A model that included both global judged similarity ratings and rule-based similarity, and
• An endogenous selection method for case inclusion.
Secondly, it demonstrated the implementation of the case-based approach using actual data from the ‘historically difficult to predict’ entertainment industry. In doing so, this study addressed key issues relating to the operational definition of several moving parts, such as the memory database and the similarity and prediction functions, and also provided empirical results on the relative performance of SBF compared to the hedonic regression benchmark. Using past movie data and similarity ratings obtained from a survey of movie-goers, the SBF model was able to predict movie gross revenues with greater accuracy than hedonic linear regression, thus supporting hypothesis 1.
Thirdly, the study presented a combined SBF-regression forecasting model. Gayer et al [2004] suggested that a combined regression and case-based approach should be considered in future studies. This paper constructed combined model forecasts and compared them with those from SBF and regression. Support was found for the hypothesis that a combined SBF-regression forecast would be more accurate than regression (hypothesis 2a). The combined forecast was also more accurate than SBF forecasts but the marginal gain in accuracy was not statistically significant, thus finding no support for hypothesis 2b. Overall, both case-based approaches offered significantly greater accuracy than forecasting using hedonic linear regression. In addition, both SBF and combined SBF predictions were constructed using less than 10% of the dataset used for the regression forecasting. Therefore the case-based approach offered greater predictive accuracy using significantly less data than regression.
This study also contributed to the current efforts of researchers involved in forecasting movie success. To date, no study had explored the use of the case-based approach, let alone similarities or analogies, in formulating box office forecasts. This empirical investigation filled this gap by applying the SBF and combined SBF-regression models to predicting movie success. Because SBF only used information from previous cases and their similarity to the current one, it required neither knowledge of the relevant determinants of performance nor explication of causal relationships, making it an ideal forecasting technique given the uncertainty and ambiguity surrounding the determination of motion picture success.
By far the most important contribution that SBF offered the existing motion picture forecasting literature was the ability to provide accurate forecasts of box office revenue as early as the pre-production, or “green lighting”, stage, when movies are pitched to studios for production. The literature has identified the importance of pre-release forecasting of revenue [Shugan, 2000; Shugan and Swait, 2000]. Empirically, only a few studies have attempted to forecast movies without early box office results (see for example Shugan and Swait [2000] and Simonoff and Sparrow [2000]) but with too small a dataset and with limited success.
The SBF approach to decision-making could assist movie studio executives in their selection of movies even before a cent has been spent on producing the movie. This is because the only information required by SBF to produce forecasts is that available even at the early movie pitching stage (i.e. the genre, a general storyline, potential star actors and the initial number of screens). Given that the average cost of producing a movie is $40 million and can be as large as $200 million for the movie Titanic [Nielsen EDI, 2005] (footnote 17), the SBF model could save a movie studio millions of dollars by avoiding movies which SBF predicts to be unprofitable (footnote 18).
6.2 Limitations and Future Research Directions
Being the first empirical application using the SBF approach, the model is not without its limitations. Firstly, due to time constraints, the present study was based on a detailed analysis of a small group of 19 movies. While this offers a larger dataset than those previously used in the literature, further studies could examine the performance of SBF on an even larger movie dataset. It would also be of interest to see what, if any, boundary conditions exist around the data requirements. Gayer et al [2004] noted that the size of the dataset influenced the performance of their case-based model predictions. Therefore another possible avenue for research would be to further explore boundary conditions by comparing the accuracy of SBF when the availability of data changes.

17 Refer to Table A4 in the appendix for statistics on production budgets for movies released in the US between 1993-2004.
18 It is assumed that the movie studio can obtain estimates of the production and marketing costs of the movie, and therefore only needs to estimate potential earnings at the box office to determine profitability.
Another possible criticism is based on the omission of marketing budget figures for the 19 target movies. Instead, the opening week’s number of theaters was used as a proxy. Marketing budget, also known as “prints and advertising costs” and “media budgets”, has been cited as a determinant of movie success [see Hennig-Thurau, Walsh and Wruck, 2001; Hennig-Thurau, Houston and Walsh, 2003]. The proprietary nature of marketing and advertising costs meant that they were very difficult to obtain; they were not made available on FilmSource or on the other movie database sources considered in this study (footnote 19). Future research could apply the SBF and regression models with marketing figures (as opposed to an opening week theaters proxy) and determine their impact (if any) on the results presented here.
Because quantitative figures are difficult to obtain, perhaps qualitative measures should be considered, such as the media intensity indicator used by Eliashberg, Jonker, Sawhney and Wierenga [2000]. In Eliashberg et al [2000], marketing was proxied by a media intensity dummy variable (low, average, high), which was measured by the number of ‘showings’ per week for each given media source (e.g. movie trailers, television, newspaper, radio, etc.). The weakness, however, is that while it can be related to changes in revenue, it cannot be related to changes in profitability – for this, the costs of advertising through the listed media are required [Eliashberg et al, 2000, p.239].
19 In this study, The Internet Movie Database, Box Office Mojo, and The Numbers were searched. While Box Office Mojo appears to have some estimates for Marketing Costs, none were provided for the 19 target movies used.
One might argue that the hedonic regression model used could also be improved and, with it, the accuracy of its predictions. For instance, how might the accuracy of regression be improved if interactions between the star indicator and the production budget had been included? Theoretically, it can be argued that the presence of a star actor and the size of the production budget interact to increase the box office revenue of a movie. The regression models presented here were re-estimated with the inclusion of the interaction term between the star indicator and the log of the production budget such that:
\[
\ln(\mathrm{Revenue})_k = \beta_0 + \beta_1 \ln(\mathrm{Prod})_k + \beta_2 (D\,\mathrm{Sequel})_k + \beta_3 \mathrm{OpTheaters}_k + \beta_4 \mathrm{Year}_k + \psi_1 \mathrm{Holiday}_k + \sum_{j=1}^{z} \gamma_j \ln(\mathrm{Prod})_k \ast (\mathrm{Actor}_j)_k + \sum_{i=1}^{10} \delta_i (\mathrm{Genre}_i)_k
\]

\(\forall k = 1 \ldots 19\), where \(k\) denotes the new movie, \(i\) represents the twelve genre categories, \((\mathrm{Actor}_j)_k\) is the vector of dummy variables for actor \(j\) in movie \(k\) and \((\mathrm{Genre}_i)_k\) is the vector of dummy variables for genre \(i\) in movie \(k\). The estimated model and the resultant predictions of revenue are presented in the appendix under Note 11. The addition of the interaction term did improve the accuracy of the regression predictions, albeit by a small and statistically insignificant reduction in the MARE from 62.42% to 62.13% (t=0.108, p>0.10). Nevertheless, future empirical studies could look to improve the regression model to test the robustness of the results found here.
The operational model presented here used an endogenous case-selection method to determine the number of cases to enter into the prediction function. Instead of choosing an ad-hoc rule, as is common in previous case-based approaches, SBF uses hierarchical cluster analysis to determine the top cluster of movies (that is, the movies most similar to the target movie) and therefore the number of cases to include in the prediction. By doing this, we allowed the number of cases to be flexible and determined by the similarity function itself, as opposed to an ad-hoc rule such as the top 10 movies. Future empirical applications could explore other methods for selecting the cases to be included in the prediction. Related to this would be studies which examine the robustness of the results to alternative ways of selecting cases.
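To make the idea of endogenous case selection concrete, the sketch below clusters past cases on their similarity rating and keeps the cluster with the highest average rating, so that the number of cases is decided by the ratings rather than by a fixed "top N" rule. It is an illustration under stated assumptions, not the procedure estimated in this study: the centroid-linkage rule, the choice of two clusters, the similarity-weighted average used as the prediction function, and the similarity ratings themselves are all hypothetical. Only the past-movie revenues are taken from the table at the start of this section.

```python
def top_cluster(similarity, n_clusters=2):
    """Agglomerative clustering on 1-D similarity ratings (centroid linkage).

    Returns the cluster of case names with the highest mean rating.
    """
    clusters = [[name] for name in similarity]

    def centroid(cluster):
        return sum(similarity[name] for name in cluster) / len(cluster)

    while len(clusters) > n_clusters:
        pairs = [(a, b) for a in range(len(clusters)) for b in range(a + 1, len(clusters))]
        # Merge the two clusters whose mean ratings are closest.
        a, b = min(pairs, key=lambda p: abs(centroid(clusters[p[0]]) - centroid(clusters[p[1]])))
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return max(clusters, key=centroid)


def similarity_weighted_forecast(cases, similarity, revenue):
    """Similarity-weighted average revenue of the selected cases (assumed form)."""
    total_weight = sum(similarity[c] for c in cases)
    return sum(similarity[c] * revenue[c] for c in cases) / total_weight


# Past-movie gross revenues (US$) from the table at the start of this section.
revenue = {
    "Lemony Snicket's A Series of Unfortunate Events": 119_457_073,
    "James and the Giant Peach": 33_979_766,
    "The Pagemaster": 16_525_612,
    "Tall Tale": 9_907_064,
    "Matilda": 39_155_825,
    "The Borrowers": 25_881_561,
}

# HYPOTHETICAL judged-similarity ratings to some target movie (not from the study).
similarity = {
    "Lemony Snicket's A Series of Unfortunate Events": 6.2,
    "James and the Giant Peach": 5.8,
    "The Pagemaster": 3.2,
    "Tall Tale": 2.4,
    "Matilda": 5.5,
    "The Borrowers": 3.0,
}

selected = top_cluster(similarity)
forecast = similarity_weighted_forecast(selected, similarity, revenue)
print(selected)
print(f"${forecast:,.0f}")
```

With these invented ratings the three highest-rated movies form the top cluster; a different set of ratings could just as easily select two cases or four, which is the flexibility the endogenous rule is meant to provide.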
This paper also makes some progress in the way we conceptualize and operationalize similarity. Specifically, it is argued that the choice between using a rule-based algorithm or a judged (human rated) similarity depends on the conceptualization of the case. For the movie revenue application, the rationale for using judged similarity assessments was based on the belief that movies were a large set of attributes that could not adequately be described as a point on a co-ordinate scale nor abbreviated to a small set of features, and therefore required the use of judged similarity. Therefore, there exist two possibilities for further research in this area. Firstly, future research could use a rule-based algorithm to describe similarities between movies and compare these predictions with those using the SBF model described here. A second avenue would be to replicate this study in a different context and compare the performance of the rule-based algorithm with judged similarity assessments. Such empirical evidence would contribute to the theoretical arguments made here regarding the dependency of the similarity function on the conceptualization of the case itself.
Another avenue for research would be to compare the accuracy of SBF predictions to those obtained from other forecasting techniques such as neural networks and prediction markets. Such research would further explore the strengths and weaknesses of case-based prediction and the boundary conditions of this approach. A comparison of SBF and prediction markets is feasible, as an information market for predicting opening week box office revenues already exists at the Hollywood Stock Exchange or HSX [http://www.hsx.com]. The key to this investigation, however, would be the timing of predictions, i.e. when are they made? Evidence presented by previous studies of virtual information markets like HSX suggests that accurate predictions of a movie's opening week revenues can be obtained at the close of trading the day prior to its theatrical release. This might not be true, however, for forecasts made weeks prior to release or at the pre-production stage, when SBF predictions can be formulated. Therefore an interesting future research direction would be to compare the accuracy of information markets and SBF, taking into consideration the timing of the prediction as well. With the growing interest in prediction markets, whether and when SBF can offer benefits over the prediction market approach should also form part of the research agenda in applications of the case-based approach.
6.3 Suggested applications of SBF
Forecasting a new movie’s box office revenue is not unlike investment problems faced in other industries. In fact, predicting revenue of uncertain investments is one of the most important strategic objectives. Therefore, parallels can be drawn between the decision studios face when deciding on which movies to produce and a range of other investment decision settings.
The question of which types of decision settings SBF might be useful for was considered in essay 1’s discussion of the applicability of case-based approaches.
Consider first one application of interest in the business domain, namely predicting the success of companies post-merger or after acquisition. Predicting the performance of a merger or acquisition ex ante is made difficult by the lack of knowledge regarding qualitative aspects such as corporate fit, work cultures and political aspects, and their influence on success (Are they relevant considerations to M&A success? Are they direct, moderating or mediating variables to success? Do they interact with other determinants of success?). These questions heighten the ambiguity of the decision and make the M&A problem difficult to assess using conventional capital budgeting techniques (NPV, IRR), expected utility maximization (or SEU), regression and quantitative multiple scenario methods (decision trees, real options).
Instead, as argued in essay 1, the decision-maker is advised to take a case-based approach, like SBF, to assess the decision. Because SBF does not require the complete knowledge of alternatives, the state space, probability distribution, outcomes and causal models that is required in other more popular decision appraisal methods, it offers a viable approach for forecasting and decision-making where there is ambiguity regarding the causal model or structural ignorance (where the complete state space is not or cannot be known).
In fact, SBF can be applied to any decision (regardless of the degree of uncertainty or ambiguity) provided that there exists good historical data on the outcomes of past decisions and there is a method of assessing the similarity between past cases and the new ones. Therefore another direction for future research would be to apply the SBF approach and compare its predictive accuracy against other methods (e.g. hedonic regression, information markets, etc.) for decisions that are characterized by a lack of knowledge of all relevant determinants, all possible states and outcomes and the exact specification of the causal model, but that have good historical data and allow similarity assessments between previous examples and the new one.
Apart from the M&A example given, other possible applications include early R&D research funding, pioneer process plants and new product development, all of which are characterized by uncertainty or structural ignorance and ambiguity. If the results found in this study’s motion picture application also hold true for other investment decision situations, then SBF has the potential to offer decision makers the ability to discriminate between profitable and loss-making ventures in situations of high uncertainty and ambiguity. Given that the majority of business investment decisions in practice are characterized by uncertainty and ambiguity, SBF, and case-based approaches in general, should be included as part of the research agenda for improving business forecasting and decision-making.
Just as important as understanding where SBF might work well is understanding where it may be expected to perform poorly. How will SBF perform for one-off decisions (choosing to marry) compared to repeated decisions (choosing a restaurant)? How accurate will SBF predictions be in instances where all past cases are highly dissimilar to the new case? Can SBF be as accurate where good historical data does not exist? It has been argued that SBF relies on the availability of good historical data to make predictions of future events, so we can expect that where this requirement is satisfied SBF would work well – for example in the prediction of credit riskiness, outcomes of criminal cases, success of government projects, etc. However, outside these examples, it is possible that SBF is less effective. Part of the answer may lie in the way the reference class is formed. That is, for situations where there is a lack of good historical data, a broader reference class may be required to gain accuracy. For example, if there are only a few past cases of pharmaceutical innovations to compare with, one could expand the reference class to include information from innovations from other domains. Clearly, efforts are required in this area of research to establish the boundary conditions for SBF and case-based approaches in general.
References
Allison, P.D., Missing Data, Sage University Paper Series on Quantitative Applications in the Social Sciences, 07-136 (Thousand Oaks, CA: Sage Publications, 2001).
Armstrong, J.S., “Combining Forecasts” in J.S. Armstrong (Ed.) Principles of Forecasting: A Handbook for Researchers and Practitioners, International Series in Operations Research & Management Science, (Norwell, MA: Kluwer Academic Publishers, 2001), 417-39.
Basuroy, S., S. Chatterjee and S.A. Ravid, “How Critical Are Critical Reviews? The Box Office Effects of Film Critics, Star Power, and Budgets,” Journal of Marketing, 67 (2003), 103–17.
Berg, J. and T. Rietz, “Prediction Markets as Decision Support Systems”, Information Systems Frontiers, 5 (2003), 79-93.
Blonski, M., “Social Learning with case-based decisions,” Journal of Economic Behavior and Organization, 38 (1999), 59-77.
Box Office Mojo. 2005.
Chen, K. and C.R. Plott, Information aggregation mechanisms: Concept, design and implementation for a sales forecasting problem, Social Science Working Paper 1131 (Pasadena, CA: California Institute of Technology, 2002).
Collins, A., C. Hand and M. Snell, “What Makes A Blockbuster? Economic Analysis of Film Success in the United Kingdom,” Managerial and Decision Economics, 23 (2002), 343-54.
De Vany, A., Hollywood Economics: How extreme uncertainty shapes the film industry, (London, UK: Routledge, 2004).
De Vany, A. and C. Lee, “Quality signals in information cascades and the dynamics of the distribution of motion picture box office revenues,” Journal of Economic Dynamics & Control, 25 (2001), 593-614.
__ and __, “Uncertainty in the movie industry: Does star power reduce the terror of the box office?” Journal of Cultural Economics, 23 (1999), 285–318.
Desai, K.K. and S. Basuroy, “Interactive influences of genre familiarity, star power and critics’ review in the cultural goods industry: The case of motion pictures,” Psychology and Marketing, 22 (2005), 203-73.
Elberse, A. and J. Eliashberg, “Demand and Supply Dynamics for Sequentially Released Products in International Markets: The Case of Motion Pictures,” Marketing Science, 22 (2003), 329-54.
Eliashberg, J., J. Jonker, M.S. Sawhney and B. Wierenga, “MOVIEMOD: An Implementable Decision-Support System for Prerelease Market Evaluation of Motion Pictures,” Marketing Science, 19 (2000), 226-43.
Eliashberg, J. and S. Shugan, “Film Critics: Influencers or Predictors?” Journal of Marketing, 61 (1997), 68-78.
Gavetti, G., D. Levinthal and J.W. Rivkin, “Strategy-making in Novel and Complex Worlds: The Power of Analogy,” Strategic Management Journal, 26 (2005), 691-712.
Gayer, G., I. Gilboa and O. Lieberman, Rule-Based and Case-Based Reasoning in Housing Prices, Cowles Foundation Discussion Paper No. 1493 (New Haven, CT: Yale University, 2004).
Gilboa, I. and D. Schmeidler, “Case-Based Decision Theory,” The Quarterly Journal of Economics, 110 (1995), 605-39.
__, and __, “Cumulative Utility and Consumer Theory,” International Economic Review, 38 (1997), 737-61.
__, and __, A Theory of Case-Based Decisions, (Cambridge, UK: Cambridge University Press, 2001).
__, and __, “Case-Based Knowledge and Induction,” IEEE Transactions on Systems, Man, and Cybernetics – Part A: Systems and Humans, 30 (2000), 85-95.
Green, K. and J.S. Armstrong, Structured Analogies for Forecasting, Monash Econometrics and Business Statistics Working Paper No 17/04 (Melbourne, VIC: Monash University, 2004).
Gonzalez, A.J. and R. Laureano-Ortiz, “A Case-Based Reasoning Approach to Real Estate Property Appraisal,” Expert Systems With Applications, 4 (1992), 229-46.
Gruca, T., J. Berg and M. Cipriano, “The Effect of Electronic Markets on Forecasts of New Product Success,” Information Systems Frontiers, 5 (2003), 95-105.
Hennig-Thurau, T., M. Houston and G. Walsh, “Determinants of Motion Picture Box Office and Profitability: An Interrelationship Approach,” Center for Research on Motion Picture Success, Working Paper 4 (2003), 1-36.
Hennig-Thurau, T., G. Walsh and O. Wruck, “An Investigation into the Factors Determining the Success of Service Innovations: The Case of Motion Pictures,” Academy of Marketing Science Review, 9 (2001), 1-23.
Hollywood Stock Exchange, The, 2005
Jahnke, H., A. Chwolka and D. Simons, “Coordinating Service-Sensitive Demand and Capacity by Adaptive Decision Making: An Application of Case-Based Decision Theory,” Decision Sciences, 26 (2005), 1-32.
Kahneman, D. and A. Tversky, “Intuitive Predictions: Biases and corrective procedures,” Management Science, 12 (1979), 313-27.
Krampe, D. and M. Lusti, “Case-Based Reasoning for Information System Design”, in Leake, D.B. and E. Plaza, Case based reasoning research and development, Second international conference, (Providence, RI: Springer-Verlag, 1997).
Krider, R. and C. Weinberg, “Competitive Dynamics and the Introduction of New Products: The Motion Picture Timing Game,” Journal of Marketing Research, 35 (1998), 1-15.
Lampel, J. and J. Shamsie, “Capabilities in Motion: New Organizational Forms and the Reshaping of the Hollywood Movie Industry,” Journal of Management Studies, 40 (2003), 2189-210.
Lehmann, D. and C. Weinberg, “Sales Via Sequential Distribution Channels: An application to Movie Audiences,” Journal of Marketing, 64 (2000), 18-33.
Liao, T.W., Z. Zhang and C. Mount, “Similarity Measures for Retrieval in Case-Based Reasoning Systems,” Artificial Intelligence, 7 (1998), 267-88.
Litman, B., “Predicting success of theatrical movies: An empirical study,” Journal of Popular Culture, 16 (1983), 159–75.
Litman, B. and L. Kohl, “Predicting financial success of motion pictures: The ’80s experience,” Journal of Media Economics, 2 (1989), 35–50.
Luu, D.T., S.T. Ng, and S.E. Chen, “A case-based procurement advisory system for construction”, Advances in Engineering Software, 34 (2003), 429-38.
Lovallo, D. and D. Kahneman, “Delusions of Success: How Optimism Undermines Executives’ Decisions,” Harvard Business Review, July (2003).
Neelamegham, R. and P. Chintagunta, “A Bayesian Model to Forecast New Product Performance in Domestic and International Markets,” Marketing Science, 18 (1999), 115-36.
Nielsen EDI. 2005. FilmSource,
Plott, C.R., “Markets as Information Gathering Tools,” Southern Economic Journal, 67 (2000), 1-15.
__, and S. Sunder, “Rational Expectations and the Aggregation of Diverse Information in Laboratory Security Markets,” Econometrica, 56 (1988), 1085-118.
Pomerol, J.C., “Scenario Development and practical decision making under uncertainty,” Decision Support Systems, 31 (2001), 197-204.
Prag, J. and J. Casavant, “An empirical study of the determinants of revenues and marketing expenditures in the motion picture industry,” Journal of Cultural Economics, 18 (1994), 217–35.
Quigley Publishing Co. 2006. Top Ten Money-Making Actors,
Ravid, S.A., “Information, blockbusters, and stars: A study of the film industry,” Journal of Business, 72 (1999), 463–92.
Reinstein, D.A. and C.M. Snyder, “The Influence of Expert Reviews on Consumer Demand for Experience Goods: A Case Study of Movie Critics,” The Journal of Industrial Economics, 53 (2005), 27-51.
Riesbeck, C. and R. Schank, Inside case-based reasoning, (Hillsdale, NJ: Lawrence Erlbaum Associates, 1989).
Russo, J.E. and P.J.H. Schoemaker, Decision Traps, (New York, NY: Doubleday, 1989).
Sawhney, M. and J. Eliashberg, “A parsimonious model for forecasting gross box-office revenues of motion pictures,” Marketing Science, 15 (1996), 113–31.
Seitz, A., A.M. Uhrmacher, and D. Damm, “Case-based prediction in experimental medical studies”, Artificial Intelligence in Medicine, 15 (1999), 255-73.
Shepperd, M. and C. Schofield, “Estimating Software Project Effort Using Analogies,” IEEE Transactions on Software Engineering, 23 (1997), 736-43.
Shimazu, H. and Y. Takashima, “Lessons Learned from Deployed CBR Systems and Design Decisions Made in Building a Commercial CBR Tool”, in Leake, D.B. and E. Plaza, Case based reasoning research and development, Second international conference, (Providence, RI: Springer-Verlag, 1997).
Shugan, S.M., “Recent Research on the Motion Picture Industry”, in Eliashberg, J. and B. Mallen (Eds.), Proceedings of the Inaugural Business and Economics Scholars Workshop in Motion Picture Industry Studies, (2000), 65-86.
Shugan, S.M. and J. Swait, “Enabling Movie Design and Cumulative Box Office Predictions Using Historical Data and Consumer Intent-to-View”, Advertising Research Foundation (ARC) Conference Proceedings, (2000).
Simonoff, J. and I. Sparrow, “Predicting movie grosses: Winners and losers, blockbusters and sleepers,” Chance, 13 (2000), 15-24.
Smith, S. and V.K. Smith, “Successful movies: A preliminary empirical analysis,” Applied Economics, 18 (1986), 501-7.
Sochay, S., “Predicting the performance of motion pictures,” Journal of Media Economics, 7 (1994), 1–20.
Stewart, S.I. and C.A. Vogt, “A Case-Based Approach to Understanding Vacation Planning,” Leisure Sciences, 21 (1999), 79-95.
Surowiecki, J., The Wisdom of Crowds. Why the Many Are Smarter than the Few and How Collective Wisdom Shapes Business, Economies, Societies and Nations, (New York, NY: Doubleday, 2004).
Tversky, A., “Features of Similarity”, Psychological Review, 84 (1977), 327-52.
Tversky, A. and D. Kahneman, “Availability: A heuristic for judging frequency and probability,” Cognitive Psychology, 5 (1973), 207-32.
U.S. Department of Commerce: Bureau of Economic Analysis, Gross Domestic Product: Implicit Price Deflator, May 2005,
Vogel, H., Entertainment Industry Economics. A guide for financial analysis, 6th edition, (Cambridge, UK: Cambridge University Press, 2001).
Watson, I. and F. Marir, “Case-Based Reasoning: A Review,” The Knowledge Engineering Review, 9 (1994), 355-81.
Wolfers, J. and E. Zitzewitz, Prediction Markets, National Bureau of Economic Research Working Paper 10504 (Cambridge, MA: National Bureau of Economic Research, 2004).
Zufryden, F., “Linking advertising to box office performance of new film releases: A marketing planning model,” Journal of Advertising Research, 36 (1996), 29–41.
Appendix
Note 1: Summary of Blonski (1999)
Blonski [1999] used a dynamic version of the case-based decision framework to model the process by which a community of agents (society) learns the superior decision under various information (communication) conditions. Blonski used the concept of aspiration levels in CBDT to characterize an agent’s indifference threshold between repeating a familiar action with a known outcome (satisficing) and trying a new action with an uncertain result (experimentation). Under CBDT, an agent is satisfied with an outcome r if it is above the agent’s aspiration level h, and tends to repeat the chosen action until learning of one that results in a better outcome. If, however, the outcome falls below the aspiration level, the agent experiments with new actions with uncertain outcomes.
Blonski’s study was able to explain social learning behavior under four communication (or information) conditions and his findings are summarized here:
• In the complete information model everybody observes everybody and all
agents are perfectly informed about their society. Assuming the case of two
decisions, society learns the superior decision only if it has an outcome
above the aspiration level. In addition, the initial proportion of agents
starting with the inferior decision (assuming it is also satisficing) determines
the long run direction of social learning.
• In the star communication model, agents only have information on their own
actions and the actions of the “star” agent who has the power to influence
the direction of social learning. Here, a long run stable steady state of
aggregated decisions always evolves, in which the position of the steady
state is dependent on the star agent’s direction of choice.
• In the ∆-neighborhood communication model, each agent only observes a small
environment ∆. Here, social learning was found to grow at a constant rate
and approached the desired direction of learning only if the initial region
size of agents exceeded the size of the ∆ environment.
• Finally, in the communicating subpopulations model, Blonski assumes two
subpopulations, where agents have complete information regarding their own
subpopulation but only some lower level of information µ ≤ 1 for the
other subpopulation (e.g. a hospital with two wards where communication
is very high (complete) within a ward, but low across the two wards). Here,
the size of the subpopulation influences the direction of social learning,
with larger subpopulations dominating the direction of smaller ones. Also,
as information exchange increases, the results approximate those under
complete information.
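The aspiration-level rule summarized at the start of this note can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Blonski’s model: the function name `next_action`, its arguments, and the tie-breaking choices (picking the best observed alternative, drawing an untried action uniformly at random) are hypothetical and introduced here only for illustration.

```python
import random


def next_action(current, r, h, known_outcomes, untried, rng):
    """Pick the next action for one agent under an aspiration-level rule.

    current        -- the action the agent took last
    r              -- the outcome the agent obtained from `current`
    h              -- the agent's aspiration level
    known_outcomes -- outcomes of other actions the agent has observed (assumed input)
    untried        -- actions whose outcomes are still unknown to the agent
    rng            -- a random.Random instance used for experimentation
    """
    if r > h:
        # Satisficing: keep the familiar action unless a better one has been observed.
        better = {a: o for a, o in known_outcomes.items() if o > r}
        if better:
            return max(better, key=better.get)
        return current
    # Outcome below aspiration: experiment with an action of unknown outcome.
    if untried:
        return rng.choice(sorted(untried))
    return current


rng = random.Random(0)
print(next_action("A", 5.0, 3.0, {}, {"B"}, rng))          # satisfied, repeats A
print(next_action("A", 5.0, 3.0, {"B": 7.0}, set(), rng))  # learns of better B
print(next_action("A", 2.0, 3.0, {}, {"B"}, rng))          # dissatisfied, tries B
```

The four communication conditions listed above differ in what each agent gets to observe, which in this sketch would correspond to how `known_outcomes` is filled in for each agent.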
Note 2: Review of Motion Picture Forecasting Literature
Levels of Analysis
The research in the motion picture literature has been conducted at two levels of analysis. In the first, studies have focused on consumers’ movie-going decision making [Sawhney and Eliashberg, 1996; Zufryden, 1996; Eliashberg, Jonker, Sawhney and Wierenga, 2000], while in the second, the focus was on the determinants that influenced the aggregate commercial performance of motion pictures at the box office [Litman and Kohl, 1989; Sochay, 1994; Neelamegham and Chintagunta, 1999; Ravid, 1999; Simonoff and Sparrow, 2000; Hennig-Thurau, Walsh and Wruck, 2001; Collins, Hand and Snell, 2002; Elberse and Eliashberg, 2003; Hennig-Thurau, Houston and Walsh, 2003; Lampel and Shamsie, 2003; Desai and Basuroy, 2005]. This study belongs to the latter subset of research, concentrating on aggregate motion picture forecasting.
Determinants of Motion Picture Success
In his book entitled Hollywood Economics, De Vany [2004] concluded that “nobody knows anything”, referring to the resemblance of motion picture performance to chaos. While there might be a lack of consensus on a definitive list of relevant determinants and their exact relationship to commercial success, the efforts in the literature cannot be said to have amounted to nothing. Early research was predominantly exploratory in nature, testing various sets of variables which were theoretically linked to box office takings, the typical proxy for the commercial success of a motion picture. The culmination of such research efforts was the Hennig-Thurau, Houston and Walsh [2003] framework for the determinants of motion picture success. In their framework, the determinants of success were classified into three groups: (i) movie attributes or characteristics, (ii) studio-controlled factors and (iii) external or non-studio-controlled factors. These are summarized in Table A1 below.
Table A1: Summary of Movie Success Determinants
Determinant Group    Examples of determinants (relationship if known)
Movie Attributes     • star power or star quality rating
                     • director power or director quality rating
                     • genre or storyline
                     • symbolicity or connections to sequels, series, adaptations, TV shows, etc.
                     • MPAA classification
Studio Factors       • production budget (+)
                     • advertising expenditure (+)
                     • holiday timing of release (+)
                     • number of screens (endogenous?)
External Factors     • critics’ reviews (heavily debated)
                     • awards
                     • consumers’ perceived movie quality
                     • early box-office information (+)
Some of these determinants have gained majority support from existing studies. For example, the majority of prior studies found a positive correlation between the movie’s production budget and box office revenue [Litman, 1983; Litman and Kohl, 1983; Zufryden, 2000]. Prior research also found a positive association between the level of advertising and box office success [Prag and Casavant, 1994; Zufryden, 1996; Lehmann and Weinberg, 2000; Elberse et al, 2003]. In terms of the timing of release, there was a consensus that timing did matter and that holiday releases, especially summer, Christmas and 4th of July movies, were associated with higher box office revenues [Sochay, 1994; Krider and Weinberg, 1998; Litman, 1983]. There was also strong evidence that early box office performance has a positive impact on longer-term box office performance through revealed behavior and information cascades [De Vany and Lee, 2001].
For release pattern and the number of screens on which a movie was shown, earlier studies appeared to provide conflicting evidence [Sawhney et al, 1996; Litman and Kohl, 1989; Sochay, 1994; Neelamegham and Chintagunta, 1999]. Later studies have attempted to resolve the debate by re-defining release pattern as an endogenous, studio-determined variable. That is, studios determined the release pattern of a movie (wide, limited, exclusive, platform, saturation, etc.) based on their predictions of success for the movie. This in turn determined the “supply” at theaters, which influenced box office revenues. Elberse et al [2003] used simultaneous equations to capture this interdependence of screens and revenues.
Other determinants have yet to be empirically investigated. For example, movie symbolicity – a movie’s potential to be categorized into existing cognitive categories by consumers – was hypothesized to reduce consumer uncertainty regarding new movies. The elements of symbolicity include connections to previous work such as an installment in a series (Die Another Day), a sequel or prequel to prior movies (Star Wars Episode III), remakes (Bad News Bears), adaptations of television shows (Bewitched), books (The Lord of the Rings trilogy), musicals (The Phantom of the Opera) and comics (Batman), or relationships to historical events (Titanic) and people (Ray). Hennig-Thurau et al [2001] proposed an inverted U-shaped relationship between symbolicity and box office revenue, which implied that symbolicity positively influenced box office revenue up to some threshold, beyond which the positive impact was weakened by the lack of novelty, surprise, creativity and/or innovativeness. No empirical testing of this claim has been conducted.
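One simple functional form that produces an inverted U-shape is a quadratic in symbolicity. The specification below is an illustrative assumption only, not the form proposed by Hennig-Thurau et al [2001]; the symbols $S_j$ (symbolicity of movie $j$), $R_j$ (box office revenue), the coefficients $\beta_0, \beta_1, \beta_2$ and the error term $\varepsilon_j$ are introduced here purely for illustration.

```latex
\ln R_j = \beta_0 + \beta_1 S_j - \beta_2 S_j^{2} + \varepsilon_j,
\qquad \beta_1 > 0,\ \beta_2 > 0,
\qquad
\frac{\partial \ln R_j}{\partial S_j} = \beta_1 - 2\beta_2 S_j = 0
\;\Longrightarrow\;
S^{*} = \frac{\beta_1}{2\beta_2}
```

Under this assumed form, revenue rises with symbolicity up to the threshold $S^{*}$ and declines beyond it, so a test of the hypothesis would amount to checking the signs of $\beta_1$ and $\beta_2$.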
Finally, there are several determinants which are still heavily debated. For example, one point of contention was the role of critics’ and experts’ reviews of movies before and during the first weeks of their release. Elberse et al [2003] provided strong evidence that critical acclaim was positively related to box office revenues. Eliashberg et al [1997] contend that this correlation suggests that reviews function as predictors (and not influencers).
Note 3: Target Cases
Table A2: Movies scheduled for future release (US Summer, 2005)
Movie Release Date
War of the Worlds 29-Jun-05
Charlie & the Chocolate Factory 15-Jul-05
The Wedding Crashers 15-Jul-05
Fantastic Four 8-Jul-05
The Dukes of Hazzard 5-Aug-05
Four Brothers 12-Aug-05
Bewitched 24-Jun-05
Sky High 29-Jul-05
Red Eye 19-Aug-05
Skeleton Key 12-Aug-05
Must Love Dogs 29-Jul-05
The Brothers Grimm 26-Aug-05
The Island 22-Jul-05
Bad News Bears 22-Jul-05
Stealth 29-Jul-05
Dark Water 8-Jul-05
Deuce Bigalow: European Gigolo 12-Aug-05
Valiant 19-Aug-05
Rebound 1-Jul-05
Note 3: Memory Database – Variables and Descriptive Statistics
Transformations to Dollar Values
All dollar figures were adjusted to 2005 values using the GDP inflation adjuster [U.S.
Department of Commerce: Bureau of Economic Analysis, May 2005].
Source: http://research.stlouisfed.org/fred2/data/GDPDEF.txt
Title: Gross Domestic Product: Implicit Price Deflator
Series ID: GDPDEF
Source: U.S. Department of Commerce: Bureau of Economic Analysis
Release: Not Applicable
Seasonal Adjustment: Seasonally Adjusted
Frequency: Quarterly
Units: Index 2000=100
Date Range: 1947-01-01 to 2005-01-01
Last Updated: 2005-05-26 11:36 AM CT
Notes: A Guide to the National Income and Product Accounts of the United
States (NIPA) - (http://www.bea.doc.gov/bea/an/nipaguid.pdf)
Table A3: Inflation Adjustment Factor
Month-Year   Current Day Value   Inflation Adjustment Factor
Oct-92        87.029             1.263
Jan-93        87.707             1.254
Apr-93        88.190             1.247
Jul-93        88.570             1.241
Oct-93        89.038             1.235
Jan-94        89.578             1.227
Apr-94        89.954             1.222
Jul-94        90.530             1.214
Oct-94        90.952             1.209
Jan-95        91.530             1.201
Apr-95        91.859             1.197
Jul-95        92.289             1.191
Oct-95        92.733             1.186
Jan-96        93.328             1.178
Apr-96        93.659             1.174
Jul-96        93.951             1.170
Oct-96        94.450             1.164
Jan-97        95.054             1.157
Apr-97        95.206             1.155
Jul-97        95.534             1.151
Oct-97        95.846             1.147
Jan-98        96.089             1.144
Apr-98        96.249             1.142
Jul-98        96.600             1.138
Oct-98        96.934             1.134
Jan-99        97.328             1.130
Apr-99        97.674             1.126
Jul-99        98.013             1.122
Oct-99        98.432             1.117
Jan-00        99.317             1.107
Apr-00        99.745             1.102
Jul-00       100.259             1.097
Oct-00       100.666             1.092
Jan-01       101.478             1.083
Apr-01       102.252             1.075
Jul-01       102.675             1.071
Oct-01       103.191             1.065
Jan-02       103.450             1.063
Apr-02       103.911             1.058
Jul-02       104.243             1.055
Oct-02       104.752             1.050
Jan-03       105.500             1.042
Apr-03       105.799             1.039
Jul-03       106.148             1.036
Oct-03       106.523             1.032
Jan-04       107.246             1.025
Apr-04       108.093             1.017
Jul-04       108.482             1.013
Oct-04       109.100             1.008
Jan-05       109.946             1.000
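The factors in Table A3 are consistent with dividing the Jan-05 deflator value by the deflator value of the quarter in question. The short Python sketch below illustrates that arithmetic for three quarters copied from the table; the names `DEFLATOR`, `adjustment_factor` and `to_2005_dollars` are hypothetical helpers introduced here for illustration.

```python
# GDP implicit price deflator (index, 2000 = 100) for three quarters from Table A3.
DEFLATOR = {"Oct-92": 87.029, "Jul-00": 100.259, "Jan-05": 109.946}
BASE_PERIOD = "Jan-05"  # all dollar figures are expressed in 2005 values


def adjustment_factor(period, deflator=DEFLATOR, base=BASE_PERIOD):
    """Ratio of the base-period deflator to the deflator of `period`."""
    return deflator[base] / deflator[period]


def to_2005_dollars(amount, period):
    """Convert a dollar amount observed in `period` into 2005 dollars."""
    return amount * adjustment_factor(period)


for period in DEFLATOR:
    print(period, round(adjustment_factor(period), 3))
```

For these three quarters the computed factors round to 1.263, 1.097 and 1.0, matching the corresponding rows of the table.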
Production Budget
Nielsen EDI’s FilmSource database did not provide complete information on movie production budgets. Therefore, production budgets were obtained from Box Office Mojo [http://www.boxofficemojo.com]. Even so, about 16% of the production budget values remained missing. Using NORM software, we applied the common approach of multiple imputation to estimate the missing values. Allison [2001] provides a good review of imputation techniques for missing data. Table A4 provides descriptive statistics of production budget in its original form and with imputed values.
Star Actor Indicator
Nielsen EDI’s FilmSource database provided a list of the featured (leading or supporting) actors. In conjunction with Quigley Publishing’s annual “Top Ten Money-Making Stars” list [Quigley Publishing, 2006], we created a star indicator dummy for each of the past movie cases. A movie was defined as having a “Star” if either of its two lead actors was listed in any of the previous three years. Let $\mathrm{Actor}_{i,t} = 1$ if actor $i$ is listed in the “Top Ten Money-Making Stars” list at time $t$, and $\mathrm{Actor}_{i,t} = 0$ if actor $i$ is not listed at time $t$. Then, for a movie released at time $t$ with lead actor $i$:
$$
\text{Star Indicator} =
\begin{cases}
1 & \text{if } \mathrm{Actor}_{i,t-1} = 1 \text{ or } \mathrm{Actor}_{i,t-2} = 1 \text{ or } \mathrm{Actor}_{i,t-3} = 1, \\
0 & \text{otherwise.}
\end{cases}
$$
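The star indicator rule can be sketched as a small Python function. This is an illustrative sketch only: the function name `star_indicator`, the dictionary `top_ten_by_year` and the placeholder actor names are hypothetical, and the actual Quigley lists are not reproduced here.

```python
def star_indicator(lead_actors, release_year, top_ten_by_year, lookback=3):
    """Return 1 if any lead actor appears in a top-ten list from one of the
    `lookback` years before `release_year`, and 0 otherwise."""
    for offset in range(1, lookback + 1):
        listed = top_ten_by_year.get(release_year - offset, set())
        if any(actor in listed for actor in lead_actors):
            return 1
    return 0


# Placeholder data: year -> set of listed actors (not the real lists).
top_ten_by_year = {
    2001: {"Actor A", "Actor B"},
    2002: {"Actor B"},
    2003: {"Actor C"},
}

print(star_indicator(["Actor A", "Actor X"], 2004, top_ten_by_year))  # listed in 2001
print(star_indicator(["Actor A", "Actor X"], 2005, top_ten_by_year))  # 2001 is out of range
```

The first call returns 1 because Actor A was listed three years before release; the second returns 0 because the 2001 listing falls outside the three-year window.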
Table A4: Descriptive Statistics and Frequencies for Memory Data
                           Gross Revenue    Production Budget   Imputed Production    Opening Week    Max. # Theaters   Opening
                           (US$Million)     (US$Million)        Budget (US$Million)   # Theaters      (Widest Point)    Year
Number of Cases (Valid)    1751             1472                1751                  1751            1751              1751
% Missing                                   15.9%
Mean                       $ 51.29          $ 42.85             $ 40.24               1884            2051              1999
Median                     $ 31.00          $ 33.61             $ 30.36               1983            2051              1998
Mode                       $ 17.00          $ 23.56             $ 23.56               6               1200              1996
Minimum                    $ 0.24           $ 0.30              $ 0.30                1               600               1993
Maximum                    $ 690.00         $ 229.42            $ 229.42              4163            4223              2004
Std. Deviation             $ 61.30          $ 33.44             $ 31.77               892.53          740.81            3.46
Skewness (SE)              3.21 (0.06)      1.54 (0.06)         1.71 (0.06)           -0.40 (0.06)    0.04 (0.06)       0.01 (0.06)
Kurtosis (SE)              16.06 (0.12)     3.23 (0.12)         4.02 (0.12)           -0.31 (0.12)    -0.70 (0.12)      -1.22 (0.12)
Skewness Ratio*            54.93            24.19               29.21                 -6.87           0.76              0.16
Kurtosis Ratio*            137.36           25.33               34.35                 -2.64           -6.03             -10.48
*Bold figures indicate significance (-2

(Table A4 continued)
                         Frequency   Percent
Sequels                  171         9.77%
Non-Sequels              1580        90.23%
Movies with Stars        248         14.16%
Movies without Stars     1503        85.84%
Holiday Release          1002        57.22%
Non-Holiday Release      749         42.78%

Genre Category           Frequency   Percent
Action                   221         12.6%
Adventure                51          2.9%
Animated                 81          4.6%
Black Comedy             23          1.3%
Comedy                   491         28.0%
Documentary              6           0.3%
Drama                    374         21.4%
Fantasy                  35          2.0%
Horror                   93          5.3%
Musical                  12          0.7%
Romantic Comedy          132         7.5%
Sci-Fi                   71          4.1%
Suspense                 142         8.1%
Western                  19          1.1%

Story Description        Frequency   Percent
Crime                    83          4.7%
Animal                   74          4.2%
Romance                  73          4.2%
Sports                   67          3.8%
Period Piece             66          3.8%
Family                   61          3.5%
African American         60          3.4%
School                   52          3.0%
Supernatural             51          2.9%
Martial Arts             33          1.9%
Cops & Robbers           31          1.8%
Spy                      31          1.8%
Biography                30          1.7%
Youth                    30          1.7%
Murder Mystery           29          1.7%
Comic Strip              28          1.6%
Political                22          1.3%
Slasher                  22          1.3%
Futuristic               20          1.1%
Historical               20          1.1%
War                      20          1.1%
Show Business            18          1.0%
Courtroom                15          0.9%
Caper                    14          0.8%
Children                 14          0.8%
Prison                   14          0.8%
Disaster                 13          0.7%
Military                 13          0.7%
Christmas                12          0.7%
Gangster                 12          0.7%
Kidnap                   12          0.7%
Monster                  12          0.7%
Vampire                  11          0.6%
Fairy Tale               10          0.6%
Time Travel              10          0.6%
Love Story               9           0.5%
Dance                    8           0.5%
Erotic                   8           0.5%
Revenge                  8           0.5%
Gay/Lesbian              7           0.4%
Divorce                  6           0.3%
Rock N Roll              6           0.3%
Satire                   6           0.3%
True Story               6           0.3%
Wilderness               6           0.3%
Ghost                    5           0.3%
Spoof                    5           0.3%
Car Stunts               3           0.2%
Concert                  3           0.2%
Contemporary             3           0.2%
Werewolf                 3           0.2%

Note 4: Case Selection Algorithm
Figure A1: A two-stage Case Selection Algorithm for Movies
[Figure: flow diagram with the following elements.]
All Movies
Stage 1: $C_{s1} = \text{Genre} \cup \text{Storyline} \cup \text{StarActors}$. Extract: all cases with matching genre, story or actors.
Stage 2: 40 Memory Subsets; Decile Schema (deciles 1 to 10). Extract: at most 10 cases with matching actors. Extract: randomly, the remaining cases by revenue decile.

Note 5: Case Selection Algorithm – Movie Attributes
To determine the criteria that movie-goers use in making decisions regarding movies, a short focus study was conducted with fifty-one individuals who were asked the following open-ended question: "What are your criteria for choosing to watch a movie at the cinemas? In other words, what are the things you consider when deciding whether to watch a movie or not?"
Forty-six (90%) of the 51 individuals responded to the question above. Figure A2 summarizes the percentage of the 46 respondents who mentioned each of the 15 criteria. Genre preference, a good storyline and favorite or well-known actors were the top 3 criteria respondents look for when deciding whether to watch a movie. In fact, 42 (91%) of respondents mentioned at least one of these three factors. Almost a third of respondents mentioned situational and personal factors. These range from the time and place of the movie showing, who they are going with (friends or date), the group’s choice of movie, their budget (can I afford it), their time (do I have time to watch it), what they are in the mood for, the weather outside (if it’s raining or not), etc. “Other factors” included whether it was a cultural experience, a foreign film, or a series or sequel to a movie that they had previously seen.
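The two-stage selection in Figure A1, which uses the three top criteria from the focus study (genre, storyline and actors), can be sketched roughly as follows. This is a minimal illustration under stated assumptions rather than the algorithm as implemented in the study: the case fields (`id`, `genre`, `story`, `actors`, `revenue`), the function names, the number of cases drawn per decile (`per_decile`) and the rank-based decile split are all hypothetical choices, since the figure does not specify them.

```python
import random


def shares_actor(target, case):
    """True if the target movie and the past case have a lead actor in common."""
    return bool(set(target["actors"]) & set(case["actors"]))


def matches_stage1(target, case):
    """Stage 1 filter: matching genre, storyline or star actors (a union)."""
    return (case["genre"] == target["genre"]
            or case["story"] == target["story"]
            or shares_actor(target, case))


def select_cases(target, memory, max_actor_cases=10, per_decile=1, seed=0):
    """Two-stage case selection sketch.

    Stage 1 keeps every past case that matches on genre, story or actors.
    Stage 2 keeps at most `max_actor_cases` actor matches, then draws
    `per_decile` cases at random from each revenue decile of the rest.
    """
    rng = random.Random(seed)
    stage1 = [c for c in memory if matches_stage1(target, c)]

    actor_cases = [c for c in stage1 if shares_actor(target, c)][:max_actor_cases]
    chosen_ids = {c["id"] for c in actor_cases}

    # Rank the remaining cases by revenue and cut the ranking into ten slices.
    rest = sorted((c for c in stage1 if c["id"] not in chosen_ids),
                  key=lambda c: c["revenue"])
    selected = list(actor_cases)
    n = len(rest)
    for d in range(10):
        bucket = rest[d * n // 10:(d + 1) * n // 10]
        selected.extend(rng.sample(bucket, min(per_decile, len(bucket))))
    return selected
```

Sampling across revenue deciles, rather than taking the closest matches only, keeps both low and high grossing analogues in the subset that is passed on for similarity assessment.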
Figure A2: Survey Results for Consumer Movie Criteria
Respondents' criteria for watching a movie (bar chart; values read from the chart labels)
Genre Preference                        67.4%
Good Storyline                          65.2%
Actors (Favourite or well-known)        60.9%
Critics' & general reviews of movie     45.7%
Situational & Personal factors          28.3%
Strong Advertising & Trailers           21.7%
Peer reviews                            19.6%
Own perceived quality of movie          15.2%
Other factors                           13.0%
Novelty of the movie                    10.9%
Movie Blockbuster                       6.5%
Director (Favourite or well-known)      4.3%
Studios and Distributors                2.2%
Awards Won by Movie                     2.2%
Good Producers                          0.0%

Note 6: Measuring Familiarity
The following is the wording given to respondents relating to the familiarity question:
“Below is a list of movies that have been previously released in U.S. cinemas. Please indicate which movies you have already seen, you have not seen but are sufficiently familiar with, or are not familiar with at all. To be 'sufficiently familiar' with a movie you must at least be able to identify the movie's genre, general story concepts and/or the main actors. For each movie in the list below, we have provided a short description which you can use to help determine your level of familiarity. You should only use these movie descriptions to help you in recalling movies you might have already seen or are sufficiently familiar with. To jump to this library of movie descriptions, simply click on the movie title.”

Note 7: SBF prediction model – SPSS Summary Statistics
Table A5: SBF Model Fit Summary
Model Summary (b)
Model   R          R Square   Adjusted R Square   Std. Error of the Estimate   R Square Change   F Change   df1   df2   Sig. F Change
1       .788 (a)   .621       .545                .53126                       .621              8.198      3     15    .002
a. Predictors: (Constant), SEQUEL, OPWK_THT, LNSBF
b. Dependent Variable: LNACTUAL

Table A6: SBF Model ANOVA Summary
ANOVA (b)
Model 1      Sum of Squares   df   Mean Square   F       Sig.
Regression   6.941            3    2.314         8.198   .002 (a)
Residual     4.234            15   .282
Total        11.175           18
a. Predictors: (Constant), SEQUEL, OPWK_THT, LNSBF
b. Dependent Variable: LNACTUAL

Table A7: SBF Model Coefficients
Coefficients (a)
Model 1      B           Std. Error   Beta    t        Sig.   95% CI Lower Bound   95% CI Upper Bound
(Constant)   8.309       2.892                2.873    .012   2.144                14.474
LNSBF        .419        .181         .424    2.320    .035   .034                 .804
OPWK_THT     6.832E-04   .000         .438    2.410    .029   .000                 .001
SEQUEL       -1.129      .549         -.329   -2.056   .058   -2.298               .041
B and Std. Error are unstandardized coefficients; Beta is the standardized coefficient; the confidence interval is for B.
a. Dependent Variable: LNACTUAL

Table A8: SBF Model Residual Statistics Summary
Residuals Statistics (a)
                                    Minimum   Maximum   Mean      Std. Deviation   N
Predicted Value                     16.6690   19.2351   17.7891   .62098           19
Std. Predicted Value                -1.804    2.329     .000      1.000            19
Standard Error of Predicted Value   .12572    .53126    .22100    .10566           19
Adjusted Predicted Value            16.6329   19.2387   17.8533   .60474           18
Residual                            -.6241    1.3058    .0000     .48497           19
Std. Residual                       -1.175    2.458     .000      .913             19
Stud. Residual                      -1.253    2.534     -.012     .999             18
Deleted Residual                    -.7098    1.3883    -.0151    .56724           18
Stud. Deleted Residual              -1.279    3.238     .030      1.114            18
Mahal. Distance                     .061      17.053    2.842     4.112            19
Cook's Distance                     .000      .101      .034      .036             18
Centered Leverage Value             .003      .947      .158      .228             19
a. Dependent Variable: LNACTUAL

Figure A3: Residual Scatterplot – SBF Model
[Scatterplot. Dependent Variable: LNACTUAL. Axes: Regression Standardized Predicted Value; Regression Studentized Residual.]

Note 8: Hedonic Regression – SPSS Summary Statistics
Table A9: Hedonic Regression Model Fit Summary
Model Summary (b)
Model   R          R Square   Adjusted R Square   Std. Error of the Estimate   R Square Change   F Change   df1   df2    Sig. F Change
1       .594 (a)   .353       .347                .88094                       .353              52.584     18    1732   .000
a. Predictors: (Constant), SEQUEL, COMEDY, HOLIDAY, STARIND, MUSICAL, YEAR, WESTERN, FANTASY, ADVENTUR, SCIFI, ANIMATED, ROMANTIC, HORROR, SUSPENSE, LNPROD, ACTION, OPWK_THT, DRAMA
b. Dependent Variable: LNREV

Table A10: Hedonic Regression Model ANOVA Summary
ANOVA (b)
Model 1      Sum of Squares   df     Mean Square   F        Sig.
Regression   734.549          18     40.808        52.584   .000 (a)
Residual     1344.124         1732   .776
Total        2078.673         1750
a. Predictors: (Constant), SEQUEL, COMEDY, HOLIDAY, STARIND, MUSICAL, YEAR, WESTERN, FANTASY, ADVENTUR, SCIFI, ANIMATED, ROMANTIC, HORROR, SUSPENSE, LNPROD, ACTION, OPWK_THT, DRAMA
b. Dependent Variable: LNREV

Table A11: Hedonic Regression Model Coefficients
Coefficients (unstandardized)
             B        Std. Error   t       Sig.
(Constant)   -9.074   13.447       -0.67   0.50
LNPROD       0.449    0.034        13.22   0.00
OPWK_THT     0.000    0.000        10.41   0.00
YEAR         0.009    0.007        1.33    0.18
HOLIDAY      -0.006   0.043        -0.14   0.89
DRAMA        0.228    0.172        1.33    0.18
COMEDY       -0.030   0.170        -0.18   0.86
ACTION       -0.076   0.180        -0.42   0.67
ADVENTUR     -0.237   0.209        -1.13   0.26
SCIFI        -0.152   0.201        -0.76   0.45
HORROR       -0.010   0.190        -0.05   0.96
ROMANTIC     0.097    0.182        0.53    0.60
ANIMATED     0.233    0.195        1.20    0.23
FANTASY      0.072    0.226        0.32    0.75
MUSICAL      -0.011   0.304        -0.04   0.97
SUSPENSE     -0.018   0.182        -0.10   0.92
WESTERN      -0.370   0.264        -1.40   0.16
STARIND      0.275    0.065        4.22    0.00
SEQUEL       0.235    0.075        3.13    0.00
Dependent Variable: LNREVENUE

Table A12: Hedonic Regression Model Residual Statistics Summary
Residuals Statistics (a)
                                    Minimum   Maximum   Mean      Std. Deviation   N
Predicted Value                     14.4580   19.0930   17.2090   .64788           1751
Std. Predicted Value                -4.246    2.908     .000      1.000            1751
Standard Error of Predicted Value   .04520    .26477    .08554    .03322           1751
Adjusted Predicted Value            14.3631   19.0722   17.2085   .64833           1751
Residual                            -3.9122   3.8756    .0000     .87640           1751
Std. Residual                       -4.441    4.399     .000      .995             1751
Stud. Residual                      -4.492    4.446     .000      1.001            1751
Deleted Residual                    -4.0025   3.9578    .0005     .88742           1751
Stud. Deleted Residual              -4.517    4.470     .000      1.002            1751
Mahal. Distance                     3.607     157.078   17.990    18.430           1751
Cook's Distance                     .000      .038      .001      .002             1751
Centered Leverage Value             .002      .090      .010      .011             1751
a. Dependent Variable: LNREV

Figure A4: Residual Scatterplot – Hedonic Regression Model
[Scatterplot. Dependent Variable: LNREV. Axes: Regression Standardized Predicted Value; Regression Studentized Deleted (Press) Residual.]

Note 9: Combined SBF-Regression Model – SPSS Summary Statistics
Table A13: Combined Model Fit Summary
Model Summary (b)
Model   R          R Square   Adjusted R Square   Std. Error of the Estimate   R Square Change   F Change   df1   df2   Sig. F Change
1       .854 (a)   .730       -.215               .86838                       .730              .773       14    4     .681
a. Predictors: (Constant), SEQUEL, ANIMATED, HORROR, STARIND, DRAMA, SCIFI, ACTION, FANTASY, ROMANTIC, ADVENTUR, SUSPENSE, LNSBF, LNPROD, COMEDY
b. Dependent Variable: LNACTUAL

Table A14: Combined Model ANOVA Summary
ANOVA (b)
Model 1      Sum of Squares   df   Mean Square   F      Sig.
Regression   8.158            14   .583          .773   .681 (a)
Residual     3.016            4    .754
Total        11.175           18
a. Predictors: (Constant), SEQUEL, ANIMATED, HORROR, STARIND, DRAMA, SCIFI, ACTION, FANTASY, ROMANTIC, ADVENTUR, SUSPENSE, LNSBF, LNPROD, COMEDY
b. Dependent Variable: LNACTUAL

Table A15: Combined Model Coefficients
Coefficients (unstandardized)
             B         Std. Error   t        Sig.
(Constant)   -12.181   48.353       -0.252   0.817
LNSBF        0.865     0.715        1.210    0.313
LNPROD       0.927     2.379        0.390    0.723
OPWK_THT     0.001     0.001        0.925    0.423
DRAMA        -2.942    3.715        -0.792   0.486
COMEDY       -5.045    6.789        -0.743   0.511
ACTION       -1.782    3.434        -0.519   0.640
ADVENTUR     0.499     1.788        0.279    0.798
SCIFI        -4.955    6.789        -0.730   0.518
HORROR       -3.619    4.683        -0.773   0.496
ROMANTIC     0.732     1.756        0.417    0.705
ANIMATED     0.855     2.019        0.423    0.701
FANTASY      -1.730    2.466        -0.701   0.534
SUSPENSE     -2.210    3.010        -0.734   0.516
STARIND      -0.129    2.229        -0.058   0.958
SEQUEL       -1.988    1.594        -1.247   0.301
Dependent Variable: LNREVENUE

Table A16: Combined Model Residual Statistics Summary
Residuals Statistics (a)
                                    Minimum   Maximum   Mean      Std. Deviation   N
Predicted Value                     16.7523   19.2299   17.7891   .70140           19
Std. Predicted Value                -1.478    2.054     .000      1.000            19
Standard Error of Predicted Value   .51857    .87930    .80143    .09641           19
Adjusted Predicted Value            15.2656   20.2591   17.7502   1.59889          15
Residual                            -.7225    .9702     .0000     .35897           19
Std. Residual                       -.822     1.103     .000      .408             19
Stud. Residual                      -1.302    1.366     -.010     .771             15
Deleted Residual                    -2.3970   2.3970    .0448     1.56537          15
Stud. Deleted Residual              -1.611    1.815     .012      .900             15
Mahal. Distance                     5.313     17.053    14.211    3.241            19
Cook's Distance                     .035      .359      .150      .135             15
Centered Leverage Value             .295      .947      .789      .180             19
a. Dependent Variable: LNACTUAL

Figure A5: Residual Scatterplot – Combined Model
[Scatterplot. Dependent Variable: LNACTUAL. Axes: Regression Standardized Predicted Value; Regression Studentized Residual.]

Note 10: Correlations – Marketing Budget and Opening Wk Theaters
Table A17: Pearson’s Correlations – Marketing and Opening Wk Theaters
Correlations
                                                            Marketing Budget     Opening Week   Marketing Budget
                                                            Inflation-Adjusted   # Theaters     Imputed Adj 05
Marketing Budget Inflation-Adjusted   Pearson Correlation   1                    .618**         1.000**
                                      Sig. (2-tailed)       .                    .000           .000
                                      N                     432                  432            432
Opening Week # Theaters               Pearson Correlation   .618**               1              .688**
                                      Sig. (2-tailed)       .000                 .              .000
                                      N                     432                  1751           1751
Marketing Budget Imputed Adj 05       Pearson Correlation   1.000**              .688**         1
                                      Sig. (2-tailed)       .000                 .000           .
                                      N                     432                  1751           1751
**. Correlation is significant at the 0.01 level (2-tailed).

Table A18: Non-parametric Correlations – Marketing and Opening Wk Theaters
Correlations
                                                                Marketing Budget     Opening Week   Marketing Budget
                                                                Inflation-Adjusted   # Theaters     Imputed Adj 05
Kendall's tau_b
Marketing Budget Inflation-Adjusted   Correlation Coefficient   1.000                .508**         1.000**
                                      Sig. (2-tailed)           .                    .000           .000
                                      N                         432                  432            432
Opening Week # Theaters               Correlation Coefficient   .508**               1.000          .588**
                                      Sig. (2-tailed)           .000                 .              .000
                                      N                         432                  1751           1751
Marketing Budget Imputed Adj 05       Correlation Coefficient   1.000**              .588**         1.000
                                      Sig. (2-tailed)           .000                 .000           .
                                      N                         432                  1751           1751
Spearman's rho
Marketing Budget Inflation-Adjusted   Correlation Coefficient   1.000                .674**         1.000**
                                      Sig. (2-tailed)           .                    .000           .
                                      N                         432                  432            432
Opening Week # Theaters               Correlation Coefficient   .674**               1.000          .759**
                                      Sig. (2-tailed)           .000                 .              .000
                                      N                         432                  1751           1751
Marketing Budget Imputed Adj 05       Correlation Coefficient   1.000**              .759**         1.000
                                      Sig. (2-tailed)           .                    .000           .
                                      N                         432                  1751           1751
**. Correlation is significant at the .01 level (2-tailed).

Note 11: Regression with Interaction Terms
Table A19: Regression with Interactions Model Fit Summary
Model Summary (b)
Model   R          R Square   Adjusted R Square   Std. Error of the Estimate   R Square Change   F Change   df1   df2    Sig. F Change
1       .595 (a)   .353       .347                .88086                       .353              52.611     18    1732   .000
a. Predictors: (Constant), STARPROD, FANTASY, WESTERN, HOLIDAY, MUSICAL, SCIFI, ADVENTUR, YEAR, ROMANTIC, ANIMATED, HORROR, SUSPENSE, SEQUEL, ACTION, LNPROD, DRAMA, OPWK_THT, COMEDY
b. Dependent Variable: LNREV

Table A20: Regression with Interactions Model ANOVA Summary
ANOVA (b)
Model 1      Sum of Squares   df     Mean Square   F        Sig.
Regression   734.793          18     40.822        52.611   .000 (a)
Residual     1343.880         1732   .776
Total        2078.673         1750
a. Predictors: (Constant), STARPROD, FANTASY, WESTERN, HOLIDAY, MUSICAL, SCIFI, ADVENTUR, YEAR, ROMANTIC, ANIMATED, HORROR, SUSPENSE, SEQUEL, ACTION, LNPROD, DRAMA, OPWK_THT, COMEDY
b. Dependent Variable: LNREV

Table A21: Regression with Interactions Model Coefficients Summary
Coefficients (a)
Model 1      B           Std. Error   Beta    t        Sig.   CI Lower   CI Upper   Zero-order   Partial   Part    Tolerance   VIF
(Constant)   -9.067      13.445               -.674    .500   -35.438    17.304
LNPROD       .447        .034         .341    13.165   .000   .381       .514       .523         .302      .254    .557        1.795
OPWK_THT     3.463E-04   .000         .284    10.396   .000   .000       .000       .479         .242      .201    .502        1.993
YEAR         8.925E-03   .007         .028    1.332    .183   -.004      .022       .176         .032      .026    .823        1.215
HOLIDAY      -5.97E-03   .043         -.003   -.139    .889   -.090      .078       -.008        -.003     -.003   .985        1.015
DRAMA        .228        .172         .086    1.321    .187   -.110      .565       .002         .032      .026    .089        11.246
COMEDY       -3.10E-02   .170         -.013   -.182    .856   -.365      .303       -.098        -.004     -.004   .076        13.196
ACTION       -7.67E-02   .180         -.023   -.427    .669   -.429      .276       .111         -.010     -.008   .125        8.032
ADVENTUR     -.238       .209         -.037   -1.136   .256   -.648      .173       -.009        -.027     -.022   .358        2.795
SCIFI        -.153       .200         -.028   -.761    .447   -.546      .241       .062         -.018     -.015   .283        3.529
HORROR       -1.07E-02   .190         -.002   -.056    .955   -.384      .363       -.048        -.001     -.001   .243        4.113
ROMANTIC     9.575E-02   .182         .023    .526     .599   -.261      .453       -.017        .013      .010    .192        5.211
ANIMATED     .233        .195         .045    1.195    .232   -.149      .615       .085         .029      .023    .265        3.780
FANTASY      7.087E-02   .225         .009    .314     .753   -.371      .513       .046         .008      .006    .445        2.248
MUSICAL      -1.07E-02   .304         -.001   -.035    .972   -.608      .586       -.026        -.001     -.001   .703        1.423
SUSPENSE     -1.88E-02   .182         -.005   -.103    .918   -.376      .338       .003         -.002     -.002   .180        5.571
WESTERN      -.372       .264         -.035   -1.407   .159   -.889      .146       -.013        -.034     -.027   .593        1.688
SEQUEL       .235        .075         .064    3.127    .002   .088       .382       .143         .075      .060    .892        1.121
STARPROD     1.554E-02   .004         .089    4.259    .000   .008       .023       .265         .102      .082    .856        1.168
B and Std. Error are unstandardized coefficients; Beta is the standardized coefficient; CI Lower and CI Upper are the bounds of the 95% confidence interval for B; Zero-order, Partial and Part are correlations; Tolerance and VIF are collinearity statistics.
a. Dependent Variable: LNREV

Table A22: Regression with Interactions Model Residual Statistics
Residuals Statistics (a)
                                    Minimum   Maximum   Mean      Std. Deviation   N
Predicted Value                     14.4637   19.0974   17.2090   .64798           1751
Std. Predicted Value                -4.237    2.914     .000      1.000            1751
Standard Error of Predicted Value   .04518    .26411    .08554    .03322           1751
Adjusted Predicted Value            14.3688   19.0766   17.2085   .64843           1751
Residual                            -3.9127   3.8716    .0000     .87632           1751
Std. Residual                       -4.442    4.395     .000      .995             1751
Stud. Residual                      -4.493    4.442     .000      1.001            1751
Deleted Residual                    -4.0029   3.9539    .0005     .88733           1751
Stud. Deleted Residual              -4.518    4.466     .000      1.002            1751
Mahal. Distance                     3.605     156.323   17.990    18.429           1751
Cook's Distance                     .000      .038      .001      .002             1751
Centered Leverage Value             .002      .089      .010      .011             1751
a. Dependent Variable: LNREV

Figure A6: Residual Scatterplot – Regression with Interactions Model
[Scatterplot. Dependent Variable: LNREV. Axes: Regression Standardized Predicted Value; Regression Studentized Deleted (Press) Residual.]

Table A23: Predictions using Regression (with interactions), Original Regression and Combined SBF models (dollar figures in US$ million).
                                               Actual    Regression (with   Original     Combined   Regression   Original Regression   Combined SBF
                                                         Interactions)      Regression   SBF        Error        Error                 Model Error
>$100m
War of the Worlds                              $234.28   $162.33            $160.39      $224.62    -30.71%      -31.54%               -4.13%
The Wedding Crashers                           $209.26   $50.11             $45.59       $67.33     -76.05%      -78.21%               -67.82%
Charlie & the Chocolate Factory                $206.46   $158.52            $156.35      $100.47    -23.22%      -24.27%               -51.34%
Fantastic Four                                 $154.70   $71.13             $71.38       $166.00    -54.02%      -53.86%               7.31%
Mean Absolute Relative Error (MARE)                                                                 46.00%       46.97%                32.65%
SBF's % Accuracy over Regression                                                                    29.02%       30.49%
Between $50-$100m
The Dukes of Hazzard                           $80.27    $49.49             $49.67       $78.58     -38.35%      -38.12%               -2.10%
Four Brothers                                  $74.49    $50.27             $50.38       $65.31     -32.52%      -32.37%               -12.33%
Sky High                                       $63.95    $36.06             $36.16       $60.96     -43.62%      -43.46%               -4.67%
Bewitched                                      $63.31    $109.11            $108.55      $70.44     72.33%       71.45%                11.25%
Red Eye                                        $57.89    $50.34             $50.43       $62.68     -13.04%      -12.89%               8.28%
Mean Absolute Relative Error (MARE)                                                                 39.97%       39.66%                7.72%
SBF's % Accuracy over Regression                                                                    80.68%       80.52%
Less than $50m
Skeleton Key                                   $47.91    $44.66             $44.76       $27.13     -6.78%       -6.58%                -43.38%
Must Love Dogs                                 $43.89    $38.10             $38.15       $42.61     -13.21%      -13.08%               -2.94%
The Brothers Grimm                             $37.92    $73.57             $73.80       $8.34      94.03%       94.64%                -78.00%
The Island                                     $35.82    $70.79             $71.02       $7.22      97.62%       98.29%                -79.84%
Bad News Bears                                 $32.87    $46.90             $46.96       $43.65     42.71%       42.88%                32.81%
Stealth                                        $32.12    $54.05             $54.27       $30.86     68.28%       68.99%                -3.90%
Dark Water                                     $25.47    $49.83             $49.96       $43.10     95.61%       96.11%                69.21%
Deuce Bigalow: European Gigolo                 $22.40    $52.02             $52.10       $22.02     132.21%      132.57%               -1.71%
Valiant                                        $19.48    $39.50             $39.52       $18.86     102.78%      102.90%               -3.20%
Rebound                                        $16.81    $40.92             $40.96       $24.39     143.43%      143.70%               45.07%
Mean Absolute Relative Error (MARE)                                                                 79.67%       79.97%                36.01%
SBF's % Accuracy over Regression                                                                    54.80%       54.98%
All Cases
Mean Absolute Relative Error (MARE)                                                                 62.13%       62.42%                27.86%
SBF's % Accuracy over Regression                                                                    55.17%       55.37%

Table A24: Paired Samples Test – Original Regression and Regression with interactions.
Paired Samples Test (Paired Differences)
                               Mean    Std. Deviation   Std. Error Mean   95% CI Lower   95% CI Upper   t       df   Sig. (2-tailed)
Pair 1  REG2MARE - REG1MARE    .0002   .00717           .00164            -.0033         .0036          .108    18   .915
Pair 2  REG2MARE - COMBMARE    .3678   .72425           .16616            .0188          .7169          2.214   18   .040
The 95% confidence interval is for the difference.

Technical Appendix
Note T1: Robust Similarity Functions
[Figure panels, one per target movie, each plotting the Robust Similarity Fn (vertical axis, 0.00 to 1.00) against MOVIE ID (horizontal axis): BAD NEWS BEARS; BEWITCHED; BROTHERS GRIMM; CHARLIE & CHOCOLATE FACTORY; DARK WATER; DEUCE BIGALOW; DUKES OF HAZZARD; FANTASTIC FOUR; FOUR BROTHERS; THE ISLAND; MUST LOVE DOGS; REBOUND; RED EYE; SKELETON KEY]
13 15 17 19 21 23 25 27 29 31 33 35 37 39 41 43 45 47 49 51 53 55 57 59 61 63 65 67 MOVIE ID 127 SKY HIGH 1.00 0.90 0.80 0.70 0.60 0.50 0.40 Robust Similarity Fn Similarity Robust 0.30 0.20 0.10 0.00 1 3 5 7 9 11 13 15 17 19 21 23 25 27 29 31 33 35 37 39 41 43 45 47 49 51 53 55 57 59 61 63 65 67 70 MOVIE ID STEALTH 1.00 0.90 0.80 0.70 0.60 0.50 0.40 Robust Similarity Fn Similarity Robust 0.30 0.20 0.10 0.00 1 3 5 7 9 11 13 15 17 19 21 23 25 27 29 31 33 35 37 39 41 43 45 47 49 51 53 55 57 59 61 63 MOVIE ID 128 VALIANT 1.00 0.90 0.80 0.70 0.60 0.50 0.40 Robust Similarity Fn Similarity Robust 0.30 0.20 0.10 0.00 1 3 5 7 9 11 13 15 17 19 21 23 25 27 29 31 33 35 37 39 41 43 45 47 MOVIE ID WAR OF THE WORLDS 1.00 0.90 0.80 0.70 0.60 0.50 0.40 Robust Similarity Fn 0.30 0.20 0.10 0.00 1 3 5 7 9 11 13 15 17 19 21 23 25 27 29 31 33 35 37 39 41 43 45 47 49 51 53 55 57 59 61 63 65 67 MOVIE ID 129 WEDDING CRASHERS 1.00 0.90 0.80 0.70 0.60 0.50 0.40 Robust Similarity Fn 0.30 0.20 0.10 0.00 1 3 5 7 9 11 13 15 17 19 21 23 25 27 29 31 33 35 37 39 41 43 45 47 49 51 53 55 57 59 61 63 65 67 69 71 73 75 MOVIE ID Movie ID Lists accompanying Robust Similarity Function Charts A past case movie was excluded if less than 3 respondents were familiar with the movie. Some of the lists below have excluded movies based on low familiarity; these are identified by missing movie “ID” numbers. See for example, “Must Love Dogs” where the first two movies were excluded due to low familiarity. 
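The exclusion rule in the note above can be illustrated with a minimal Python sketch. Only the threshold of three respondents comes from the note; the function name `retain_familiar_cases`, the placeholder titles and the respondent counts are hypothetical and are not taken from the survey data. The sketch assumes that IDs are assigned from the original list order before filtering, which would explain why excluded cases appear as gaps in the numbering.

```python
from typing import List, Tuple

# Threshold stated in the note; everything else here is illustrative.
MIN_FAMILIAR_RESPONDENTS = 3


def retain_familiar_cases(
    ranked_cases: List[Tuple[str, int]],
    min_familiar: int = MIN_FAMILIAR_RESPONDENTS,
) -> List[Tuple[int, str]]:
    """Keep past cases that at least `min_familiar` respondents knew.

    `ranked_cases` holds (title, number_of_familiar_respondents) pairs in
    their original list order. IDs are assigned from that order, starting
    at 1, before filtering, so an excluded case leaves a gap in the IDs.
    """
    retained = []
    for movie_id, (title, n_familiar) in enumerate(ranked_cases, start=1):
        if n_familiar >= min_familiar:
            retained.append((movie_id, title))
    return retained


# Hypothetical familiarity counts for five past cases (not real survey data).
example_cases = [
    ("Past case A", 1),
    ("Past case B", 2),
    ("Past case C", 5),
    ("Past case D", 3),
    ("Past case E", 4),
]

retained_cases = retain_familiar_cases(example_cases)
retained_ids = [movie_id for movie_id, _ in retained_cases]
print(retained_ids)  # IDs 1 and 2 are absent because too few respondents knew them
```

With these made-up counts the first two cases drop out and the retained list starts at ID 3, the same pattern the note describes for “Must Love Dogs”.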
TARGET MOVIE: BAD NEWS BEARS
ID Movie Title
1 BIG GREEN, THE (Steve Guttenberg, Olivia D'Abo)
2 LITTLE GIANTS (Rick Moranis, Ed O'Neill)
3 D2: THE MIGHTY DUCKS (Emilio Estevez, Kathryn Erbe)
4 D3: THE MIGHTY DUCKS (Emilio Estevez, Jeffrey Nordling)
5 LITTLE BIG LEAGUE (Luke Edwards, Timothy Busfield)
6 MAJOR LEAGUE 3 (Scott Bakula, Corbin Bernsen)
7 MAJOR LEAGUE 2 (Charlie Sheen, Tom Berenger)
8 SANDLOT, THE (Mike Vitar, Tom Guiry)
9 HARDBALL (Keanu Reeves, Diane Lane)
10 CELTIC PRIDE (Damon Wayans, Daniel Stern)
11 DODGEBALL: A TRUE UNDERDOG STORY (Vince Vaughn, Christine Taylor)
12 JUWANNA MANN (Miguel A. Nunez Jr., Vivica A. Fox)
13 REPLACEMENTS, THE (Keanu Reeves, Gene Hackman)
14 SCHOOL OF ROCK (Jack Black, Joan Cusack)
15 MYSTERY, ALASKA (Russell Crowe, Hank Azaria)
16 FRIDAY NIGHT LIGHTS (Billy Bob Thornton, Derek Luke)
17 HAPPY GILMORE (Adam Sandler, Christopher McDonald)
18 DRUMLINE (Nick Cannon, Zoe Saldana)
19 COOL RUNNINGS (Leon, Doug E. Doug)
20 SUPERBABIES: BABY GENIUSES 2 (Jon Voight, Scott Baio)
21 AIR BUD (Michael Jeter, Kevin Zegers)
22 READY TO RUMBLE (David Arquette, Oliver Platt)
23 BEND IT LIKE BECKHAM (Parminder Nagra, Keira Knightley)
24 DADDY DAY CARE (Eddie Murphy, Jeff Garlin)
25 MAJOR PAYNE (Damon Wayans, Karyn Parsons)
26 BASEKETBALL (Trey Parker, Matt Stone)
27 RACE THE SUN (Halle Berry, James Belushi)
28 SCOUT, THE (Albert Brooks, Brendan Fraser)
29 AIRBORNE (Shane McDermott, Seth Green)
31 CARPOOL (Tom Arnold, David Paymer)
32 AGAINST THE ROPES (Meg Ryan, Omar Epps)
33 ROLLERBALL (Chris Klein, Jean Reno)
34 ED (Matt LeBlanc, Jayne Brook)
35 SUMMER CATCH (Freddie Prinze Jr., Jessica Biel)
36 MY BABY'S DADDY (Eddie Griffin, Anthony Anderson)
37 SISTER ACT 2 (Whoopi Goldberg, Kathy Najimy)
38 CHEAPER BY THE DOZEN (Steve Martin, Bonnie Hunt)
39 3 STRIKES (Brian Hooks, N'Bushe Wright)
40 AROUND THE WORLD IN 80 DAYS (Jackie Chan, Steve Coogan)
41 RUGRATS MOVIE, THE (E. G. Daily (Vocal), Christine Cavanaugh (Vocal))
42 BAD SANTA (Billy Bob Thornton, Tony Cox)
43 BABY-SITTERS CLUB, THE (Schuyler Fisk, Bre Blair)
44 WEEKEND AT BERNIE'S 2 (Andrew McCarthy, Jonathan Silverman)
45 BIG BOUNCE, THE (Owen Wilson, Morgan Freeman)
46 DOCTOR DOLITTLE (Eddie Murphy, Ossie Davis)
47 DENNIS THE MENACE (Walter Matthau, Mason Gamble)
48 SLING BLADE (Billy Bob Thornton, Dwight Yoakam)
49 FREAKY FRIDAY (Jamie Lee Curtis, Lindsay Lohan)
50 THAT DARN CAT (Christina Ricci, Doug E. Doug)
51 JACKASS: THE MOVIE (Johnny Knoxville (Himself), Chris Pontius (Himself))
52 ALAMO, THE (Dennis Quaid, Billy Bob Thornton)
53 DUDLEY DO-RIGHT (Brendan Fraser, Sarah Jessica Parker)
54 MUPPETS FROM SPACE (Jeffrey Tambor, F. Murray Abraham)
55 OCEAN'S ELEVEN (George Clooney, Matt Damon)
56 SIMPLE PLAN, A (Bill Paxton, Billy Bob Thornton)
57 BANDITS (Bruce Willis, Billy Bob Thornton)
58 ASSOCIATE, THE (Whoopi Goldberg, Dianne Wiest)
59 EVOLUTION (David Duchovny, Orlando Jones)
60 TRUE LIES (Arnold Schwarzenegger, Jamie Lee Curtis)
61 JUNGLE BOOK (LIVE ACTION), THE (Jason Scott Lee, Cary Elwes)
62 BLUES BROTHERS 2000 (Dan Aykroyd, John Goodman)
63 PAYBACK (Mel Gibson, Gregg Henry)
64 TAXI (Queen Latifah, Jimmy Fallon)
65 ALFIE (Jude Law, Marisa Tomei)
66 PUSHING TIN (John Cusack, Billy Bob Thornton)
67 WHITE CHICKS (Shawn Wayans, Marlon Wayans)
68 GONE IN 60 SECONDS (Nicolas Cage, Angelina Jolie)
69 DR SEUSS' THE CAT IN THE HAT (Mike Myers, Alec Baldwin)
70 ARMAGEDDON (Bruce Willis, Billy Bob Thornton)
71 MONSTER'S BALL (Billy Bob Thornton, Heath Ledger)
72 MEET JOE BLACK (Brad Pitt, Anthony Hopkins)
73 BLACK BEAUTY (Sean Bean, David Thewlis)
74 NINE MONTHS (Hugh Grant, Julianne Moore)

TARGET MOVIE: BEWITCHED
ID Movie Title
1 PRACTICAL MAGIC (Sandra Bullock, Nicole Kidman)
2 MY FAVORITE MARTIAN (Christopher Lloyd, Jeff Daniels)
3 VERY BRADY SEQUEL, A (Shelley Long, Gary Cole)
4 MUSE, THE (Albert Brooks, Sharon Stone)
5 DUDLEY DO-RIGHT (Brendan Fraser, Sarah Jessica Parker)
6 POKEMON 3 THE MOVIE (Veronica Taylor (Vocal), Rachael Lillis (Vocal))
7 RINGMASTER (Jerry Springer, Jaime Pressly)
8 FREAKY FRIDAY (Jamie Lee Curtis, Lindsay Lohan)
9 STEPFORD WIVES, THE (Nicole Kidman, Matthew Broderick)
10 ELF (Will Ferrell, James Caan)
11 MCHALE'S NAVY (Tom Arnold, Tim Curry)
12 ANCHORMAN (Will Ferrell, Christina Applegate)
13 ADVENTURES OF ELMO IN GROUCHLAND, THE (Kevin Clash (Vocal), Fran Brill (Vocal))
14 HARRY POTTER AND THE CHAMBER OF SECRETS (Daniel Radcliffe, Rupert Grint)
15 BEVERLY HILLBILLIES, THE (Diedrich Bader, Dabney Coleman)
16 FRIGHTENERS, THE (Michael J. Fox, Trini Alvarado)
17 SIMONE (Al Pacino, Catherine Keener)
18 MARCI X (Lisa Kudrow, Damon Wayans)
19 MEET THE PARENTS (Robert De Niro, Ben Stiller)
20 JOSIE AND THE PUSSYCATS (Rachael Leigh Cook, Tara Reid)
21 LIZZIE MCGUIRE MOVIE, THE (Hilary Duff, Adam Lamberg)
22 POOTIE TANG (Lance Crouther, Jennifer Coolidge)
23 LITTLE NICKY (Adam Sandler, Patricia Arquette)
24 PINOCCHIO (Roberto Benigni, Nicoletta Braschi)
25 GOOD BURGER (Kel Mitchell, Kenan Thompson)
26 SUPERSTAR (Molly Shannon, Will Ferrell)
27 OFFICE SPACE (Ron Livingston, Jennifer Aniston)
28 DICKIE ROBERTS: FORMER CHILD STAR (David Spade, Mary McCormack)
29 ED WOOD (Johnny Depp, Martin Landau)
30 LADIES MAN, THE (Tim Meadows, Karyn Parsons)
31 GOOD GIRL, THE (Jennifer Aniston, Jake Gyllenhaal)
32 MY GIANT (Billy Crystal, Kathleen Quinlan)
33 CROCODILE HUNTER: COLLISION COURSE. (Steve Irwin (Himself), Terri Irwin (Herself))
34 CONNIE AND CARLA (Nia Vardalos, Toni Collette)
35 LASSIE (Tom Guiry, Helen Slater)
36 CHARLIE'S ANGELS (Cameron Diaz, Drew Barrymore)
37 NIGHT AT THE ROXBURY, A (Will Ferrell, Chris Kattan)
38 ADVENTURES OF ROCKY AND BULLWINKLE, THE (Rene Russo, Jason Alexander)
39 GIFT, THE (Cate Blanchett, Giovanni Ribisi)
40 BEAUTIFUL GIRLS (Matt Dillon, Noah Emmerich)
41 OLD SCHOOL (Luke Wilson, Will Ferrell)
42 RECESS: SCHOOL'S OUT (Rickey D'Shon Collins (Vocal), Jason Davis (Vocal))
43 BIRTHDAY GIRL (Nicole Kidman, Ben Chaplin)
44 I SPY (Eddie Murphy, Owen Wilson)
45 TO DIE FOR (Nicole Kidman, Matt Dillon)
46 GRUMPIER OLD MEN (Jack Lemmon, Walter Matthau)
47 ACE VENTURA: PET DETECTIVE (Jim Carrey, Courteney Cox Arquette)
48 CAR 54, WHERE ARE YOU? (David Johansen, John C. McGinley)
49 AUSTIN POWERS IN GOLDMEMBER (Mike Myers, Beyonce Knowles)
50 PUNCH-DRUNK LOVE (Adam Sandler, Emily Watson)
51 DUNGEONS & DRAGONS (Justin Whalin, Marlon Wayans)
52 VILLAGE, THE (Bryce Dallas Howard, Joaquin Phoenix)
53 MALIBU'S MOST WANTED (Jamie Kennedy, Taye Diggs)
54 RUGRATS GO WILD (Bruce Willis (Vocal), Chrissie Hynde (Vocal))
55 THUNDERBIRDS (Bill Paxton, Anthony Edwards)
56 MIGHTY WIND, A (Bob Balaban, Christopher Guest)
57 SNOW DOGS (Cuba Gooding Jr., James Coburn)
58 MALICE (Alec Baldwin, Nicole Kidman)
59 DETROIT ROCK CITY (Edward Furlong, Giuseppe Andrews)
60 PEACEMAKER, THE (George Clooney, Nicole Kidman)
61 WHAT'S EATING GILBERT GRAPE? (Johnny Depp, Juliette Lewis)
62 OCEAN'S ELEVEN (George Clooney, Matt Damon)
63 ARMY OF DARKNESS (Bruce Campbell, Embeth Davidtz)

TARGET MOVIE: BROTHERS GRIMM, THE
ID Movie Title
1 TALL TALE (Patrick Swayze, Oliver Platt)
2 BIG FISH (Ewan McGregor, Albert Finney)
3 LEMONY SNICKET'S A SERIES OF UNFORTUNATE EVENTS (Jim Carrey, Liam Aiken)
4 JUMANJI (Robin Williams, Bonnie Hunt)
5 TUCK EVERLASTING (Alexis Bledel, William Hurt)
6 THOMAS AND THE MAGIC RAILROAD (Peter Fonda, Mara Wilson)
7 PAGEMASTER, THE (Macaulay Culkin, Christopher Lloyd)
8 PETER PAN (Jason Isaacs, Jeremy Sumpter)
9 HARRY POTTER AND THE PRISONER AZKABAN (Daniel Radcliffe, Emma Watson)
10 KID IN KING ARTHUR'S COURT, A (Thomas Ian Nicholas, Joss Ackland)
11 ELLA ENCHANTED (Anne Hathaway, Hugh Dancy)
12 DUNGEONS & DRAGONS (Justin Whalin, Marlon Wayans)
13 INDIAN IN THE CUPBOARD, THE (Hal Scardino, Litefoot)
14 JAMES AND THE GIANT PEACH (Simon Callow (Vocal), Richard Dreyfuss (Vocal))
15 THUMBELINA (Jodi Benson (Vocal), Carol Channing (Vocal))
16 LORD OF THE RINGS: THE FELLOWSHIP OF THE RING, THE (Elijah Wood, Ian McKellen)
17 DRAGONHEART (Dennis Quaid, David Thewlis)
18 HARRY POTTER AND THE SORCERER'S STONE (HARRY POTTER AND THE PHILOSOPHER'S STONE) (Daniel Radcliffe, Rupert Grint)
19 EVER AFTER: A CINDERELLA STORY (Drew Barrymore, Anjelica Huston)
20 HOCUS POCUS (Bette Midler, Sarah Jessica Parker)
21 SIMPLE WISH, A (Martin Short, Mara Wilson)
22 HAPPILY EVER AFTER (1993) (Irene Cara (Vocal), Edward Asner (Vocal))
23 YU-GI-OH! (Konrad Bösherz, Till C. Hagen)
24 KNIGHT'S TALE, A (Heath Ledger, Mark Addy)
25 CASPER (Christina Ricci, Bill Pullman)
26 PINOCCHIO (Roberto Benigni, Nicoletta Braschi)
27 SHREK (Mike Myers (Vocal), Eddie Murphy (Vocal))
28 SANTA CLAUSE, THE (Tim Allen, Judge Reinhold)
29 SLEEPY HOLLOW (Johnny Depp, Christina Ricci)
30 REIGN OF FIRE (Christian Bale, Matthew McConaughey)
32 DOGMA (Matt Damon, Chris Rock)
33 WHAT DREAMS MAY COME (Robin Williams, Cuba Gooding Jr.)
34 HEART AND SOULS (Robert Downey Jr., Charles Grodin)
35 PRINCESS AND THE GOBLIN, THE (Joss Ackland (Vocal), Claire Bloom (Vocal))
36 DR SEUSS' HOW THE GRINCH STOLE CHRISTMAS (Jim Carrey, Jeffrey Tambor)
37 MATILDA (Mara Wilson, Danny DeVito)
38 ORDER, THE (Heath Ledger, Shannyn Sossamon)
39 BOGUS (Whoopi Goldberg, Gerard Depardieu)
40 KAZAAM (Shaquille O'Neal, Francis Capra)
41 VILLAGE, THE (Bryce Dallas Howard, Joaquin Phoenix)
42 CINDERELLA STORY, A (Hilary Duff, Jennifer Coolidge)
43 HIGHLANDER: ENDGAME (Adrian Paul, Christopher Lambert)
44 MASK, THE (Jim Carrey, Cameron Diaz)
45 SPY KIDS 2: THE ISLAND OF LOST DREAMS (Antonio Banderas, Carla Gugino)
46 LEGEND OF BAGGER VANCE, THE (Will Smith, Matt Damon)
47 JACK FROST (Michael Keaton, Kelly Preston)
48 POKEMON 2000 THE MOVIE (Veronica Taylor (Vocal), Rachael Lillis (Vocal))
49 DR SEUSS' THE CAT IN THE HAT (Mike Myers, Alec Baldwin)
50 MEET JOE BLACK (Brad Pitt, Anthony Hopkins)
51 FOUR FEATHERS, THE (Heath Ledger, Wes Bentley)
52 SWAN PRINCESS, THE (Jack Palance (Vocal), Howard McGillin (Vocal))
53 BOURNE SUPREMACY, THE (Matt Damon, Franka Potente)
54 JUST VISITING (Jean Reno, Christina Applegate)
55 TALENTED MR. RIPLEY, THE (Matt Damon, Gwyneth Paltrow)
56 ROUNDERS (Matt Damon, Edward Norton)
57 10 THINGS I HATE ABOUT YOU (Heath Ledger, Julia Stiles)
58 LIFE AQUATIC WITH STEVE ZISSOU (Bill Murray, Owen Wilson)
59 CROW, THE (Brandon Lee, Ernie Hudson)
60 ALL THE PRETTY HORSES (Matt Damon, Henry Thomas)
61 OCEAN'S ELEVEN (George Clooney, Matt Damon)
62 JOHN GRISHAM'S THE RAINMAKER (Matt Damon, Claire Danes)

TARGET MOVIE: CHARLIE AND THE CHOCOLATE FACTORY
ID Movie Title
1 LEMONY SNICKET'S A SERIES OF UNFORTUNATE EVENTS (Jim Carrey, Liam Aiken)
2 JAMES AND THE GIANT PEACH (Simon Callow (Vocal), Richard Dreyfuss (Vocal))
3 PAGEMASTER, THE (Macaulay Culkin, Christopher Lloyd)
4 TALL TALE (Patrick Swayze, Oliver Platt)
5 MATILDA (Mara Wilson, Danny DeVito)
6 BORROWERS, THE (John Goodman, Jim Broadbent)
7 HARRY POTTER AND THE CHAMBER OF SECRETS (Daniel Radcliffe, Rupert Grint)
8 DR SEUSS' THE CAT IN THE HAT (Mike Myers, Alec Baldwin)
9 FINDING NEVERLAND (Johnny Depp, Kate Winslet)
10 BIG FISH (Ewan McGregor, Albert Finney)
11 JUMANJI (Robin Williams, Bonnie Hunt)
12 ED WOOD (Johnny Depp, Martin Landau)
13 DISNEY'S TEACHER'S PET (Nathan Lane (Vocal), Kelsey Grammer (Vocal))
14 JACK FROST (Michael Keaton, Kelly Preston)
15 SECRET GARDEN, THE (Kate Maberly, Heydon Prowse)
16 PINOCCHIO (Roberto Benigni, Nicoletta Braschi)
17 KAZAAM (Shaquille O'Neal, Francis Capra)
18 SPY KIDS 2: THE ISLAND OF LOST DREAMS (Antonio Banderas, Carla Gugino)
19 SHREK 2 (Mike Myers, Cameron Diaz)
20 FREAKY FRIDAY (Jamie Lee Curtis, Lindsay Lohan)
21 INDIAN IN THE CUPBOARD, THE (Hal Scardino, Litefoot)
22 LIKE MIKE (Bow Wow, Morris Chestnut)
23 TUCK EVERLASTING (Alexis Bledel, William Hurt)
24 HOLES (Sigourney Weaver, Jon Voight)
25 SIMPLE WISH, A (Martin Short, Mara Wilson)
26 DR SEUSS' HOW THE GRINCH STOLE CHRISTMAS (Jim Carrey, Jeffrey Tambor)
27 SLEEPY HOLLOW (Johnny Depp, Christina Ricci)
28 THREE WISHES (Patrick Swayze, Mary Elizabeth Mastrantonio)
29 PIRATES OF THE CARIBBEAN: THE CURSE OF THE BLACK PEARL (Johnny Depp, Geoffrey Rush)
30 MUPPETS FROM SPACE (Jeffrey Tambor, F. Murray Abraham)
31 LORD OF THE RINGS: THE FELLOWSHIP OF THE RING, THE (Elijah Wood, Ian McKellen)
32 BOGUS (Whoopi Goldberg, Gerard Depardieu)
33 DUNGEONS & DRAGONS (Justin Whalin, Marlon Wayans)
34 THOMAS AND THE MAGIC RAILROAD (Peter Fonda, Mara Wilson)
35 SUPERBABIES: BABY GENIUSES 2 (Jon Voight, Scott Baio)
36 WHAT DREAMS MAY COME (Robin Williams, Cuba Gooding Jr.)
37 SECRET WINDOW (Johnny Depp, John Turturro)
38 DOCTOR DOLITTLE (Eddie Murphy, Ossie Davis)
39 POKEMON 3 THE MOVIE (Veronica Taylor (Vocal), Rachael Lillis (Vocal))
40 DRAGONHEART (Dennis Quaid, David Thewlis)
41 BENNY & JOON (Johnny Depp, Mary Stuart Masterson)
42 WHAT'S EATING GILBERT GRAPE? (Johnny Depp, Juliette Lewis)
43 PREACHER'S WIFE, THE (Denzel Washington, Whitney Houston)
44 MEET JOE BLACK (Brad Pitt, Anthony Hopkins)
45 LORD OF THE RINGS: TWO TOWERS, THE (Elijah Wood, Ian McKellen)
46 ASTRONAUT'S WIFE, THE (Johnny Depp, Charlize Theron)
47 CROW: CITY OF ANGELS, THE (Vincent Perez, Mia Kirshner)
48 NINTH GATE, THE (Johnny Depp, Frank Langella)
49 DONNIE BRASCO (Al Pacino, Johnny Depp)
50 HIGHLANDER: ENDGAME (Adrian Paul, Christopher Lambert)

TARGET MOVIE: DARK WATER
ID Movie Title
1 GRUDGE, THE (Sarah Michelle Gellar, Jason Behr)
2 IN DREAMS (Annette Bening, Aidan Quinn)
3 BOOK OF SHADOWS: BLAIR WITCH 2 (Kim Director, Jeffrey Donovan)
4 OTHERS, THE (Nicole Kidman, Christopher Eccleston)
5 HAUNTING, THE (Liam Neeson, Catherine Zeta-Jones)
6 DARKNESS FALLS (Chaney Kley, Emma Caulfield)
7 HOUSE ON HAUNTED HILL (Geoffrey Rush, Famke Janssen)
8 HOUSE OF THE DEAD (Jonathan Cherry, Tyron Leitso)
9 RING, THE (Naomi Watts, Martin Henderson)
10 SIXTH SENSE, THE (Bruce Willis, Toni Collette)
11 RAGE: CARRIE 2, THE (Emily Bergl, Jason London)
12 DARKNESS (Anna Paquin, Lena Olin)
13 DRAGONFLY (Kevin Costner, Joe Morton)
14 GIFT, THE (Cate Blanchett, Giovanni Ribisi)
15 EVENT HORIZON (Laurence Fishburne, Sam Neill)
16 GOTHIKA (Halle Berry, Robert Downey Jr.)
17 GHOST SHIP (Julianna Margulies, Ron Eldard)
18 13 GHOSTS (Tony Shalhoub, Embeth Davidtz)
19 VILLAGE OF THE DAMNED (Christopher Reeve, Kirstie Alley)
20 FALLEN (Denzel Washington, John Goodman)
21 BLAIR WITCH PROJECT, THE (Heather Donahue, Michael Williams)
22 WALKING DEAD, THE (Allen Payne, Eddie Griffin)
23 WES CRAVEN PRESENTS: THEY (Laura Regan, Marc Blucas)
24 FLESH AND BONE (Dennis Quaid, Meg Ryan)
25 STIR OF ECHOES (Kevin Bacon, Kathryn Erbe)
26 FRIGHTENERS, THE (Michael J. Fox, Trini Alvarado)
27 FINAL DESTINATION 2 (Ali Larter, A. J. Cook)
28 CANDYMAN 2 (Tony Todd, Kelly Rowan)
29 FORSAKEN, THE (Kerr Smith, Brendan Fehr)
30 WILLARD (Crispin Glover, R. Lee Ermey)
31 RED DRAGON (Anthony Hopkins, Edward Norton)
32 PSYCHO (Vince Vaughn, Anne Heche)
33 EXORCIST: THE BEGINNING (Stellan Skarsgard, James D'Arcy)
34 CRAFT, THE (Robin Tunney, Fairuza Balk)
35 JEEPERS CREEPERS 2 (Ray Wise, Jonathan Breck)
36 WES CRAVEN PRESENTS DRACULA (Christopher Plummer, Gerard Butler)
37 NINTH GATE, THE (Johnny Depp, Frank Langella)
38 PRACTICAL MAGIC (Sandra Bullock, Nicole Kidman)
39 STIGMATA (Patricia Arquette, Gabriel Byrne)
40 SEED OF CHUCKY (Jennifer Tilly (Herself), Brad Dourif (Vocal))
41 SOLARIS (George Clooney, Natascha McElhone)
42 U-TURN (Sean Penn, Nick Nolte)
43 12 MONKEYS (Bruce Willis, Madeleine Stowe)
44 JASON GOES TO HELL-FINAL FRIDAY (Jon D. LeMay, Kari Keegan)
45 ISLAND OF DR. MOREAU, THE (Marlon Brando, Val Kilmer)
46 DEVIL'S ADVOCATE, THE (Keanu Reeves, Al Pacino)
47 PIRATES OF THE CARIBBEAN: THE CURSE OF THE BLACK PEARL (Johnny Depp, Geoffrey Rush)
48 UNFAITHFUL (Richard Gere, Diane Lane)
49 STEPFORD WIVES, THE (Nicole Kidman, Matthew Broderick)
50 UNBREAKABLE (Bruce Willis, Samuel L. Jackson)
51 TALENTED MR. RIPLEY, THE (Matt Damon, Gwyneth Paltrow)
52 DRACULA: DEAD AND LOVING IT (Leslie Nielsen, Peter MacNicol)
53 AMERICAN PSYCHO (Christian Bale, Willem Dafoe)
54 CITY OF ANGELS (Nicolas Cage, Meg Ryan)
55 USUAL SUSPECTS, THE (Stephen Baldwin, Gabriel Byrne)
56 BONES (Snoop Dogg, Pam Grier)
57 TRUTH ABOUT CHARLIE, THE (Mark Wahlberg, Thandie Newton)
58 ORIGINAL SIN (Antonio Banderas, Angelina Jolie)
59 HELLRAISER 4: BLOODLINE (Bruce Ramsay, Valentina Vargas)
60 HULK (Eric Bana, Jennifer Connelly)
61 BLACK BEAUTY (Sean Bean, David Thewlis)
62 GET CARTER (Sylvester Stallone, Miranda Richardson)
63 TEACHING MRS. TINGLE (Helen Mirren, Katie Holmes)
64 WHAT'S EATING GILBERT GRAPE? (Johnny Depp, Juliette Lewis)
65 LASSIE (Tom Guiry, Helen Slater)
66 ANGELA'S ASHES (Emily Watson, Robert Carlyle)
67 THIN RED LINE, THE (Sean Penn, Adrien Brody)

TARGET MOVIE: DEUCE BIGALOW- EUROPEAN GIGOLO
ID Movie Title
1 DEUCE BIGALOW: MALE GIGOLO (Rob Schneider, William Forsythe)
3 POOTIE TANG (Lance Crouther, Jennifer Coolidge)
4 MIXED NUTS (Steve Martin, Madeline Kahn)
5 WRONGFULLY ACCUSED (Leslie Nielsen, Richard Crenna)
6 ANIMAL, THE (Rob Schneider, Colleen Haskell)
7 HOT CHICK, THE (Rob Schneider, Anna Faris)
8 AUSTIN POWERS: INTERNATIONAL MAN OF MYSTERY (Mike Myers, Elizabeth Hurley)
9 DOUBLE TAKE (Orlando Jones, Eddie Griffin)
10 ZOOLANDER (Ben Stiller, Owen Wilson)
11 BIG MOMMA'S HOUSE (Martin Lawrence, Nia Long)
12 UNDERCOVER BROTHER (Eddie Griffin, Chris Kattan)
13 HOW HIGH (Method Man, Redman)
14 DUMB AND DUMBERER: WHEN HARRY MET LLOYD (Eric Christian Olsen, Derek Richardson)
15 VEGAS VACATION (Chevy Chase, Beverly D'Angelo)
16 SHALLOW HAL (Gwyneth Paltrow, Jack Black)
17 DUMB AND DUMBER (Jim Carrey, Jeff Daniels)
19 STARSKY & HUTCH (Ben Stiller, Owen Wilson)
20 BEAUTIFUL (Minnie Driver, Joey Lauren Adams)
21 WATERBOY, THE (Adam Sandler, Kathy Bates)
22 WEEKEND AT BERNIE'S 2 (Andrew McCarthy, Jonathan Silverman)
23 MY BABY'S DADDY (Eddie Griffin, Anthony Anderson)
24 ERNEST RIDES AGAIN (Jim Varney, Ron K. James)
25 NUTTY PROFESSOR II: THE KLUMPS, THE (Eddie Murphy, Janet Jackson)
26 SCARY MOVIE 2 (Shawn Wayans, Marlon Wayans)
27 AIR BUD: GOLDEN RECEIVER (Kevin Zegers, Cynthia Stevenson)
28 BIG DADDY (Adam Sandler, Joey Lauren Adams)
29 WHOLE TEN YARDS, THE (Bruce Willis, Matthew Perry)
30 D2: THE MIGHTY DUCKS (Emilio Estevez, Kathryn Erbe)
31 SON OF THE PINK PANTHER (Roberto Benigni, Herbert Lom)
32 AMERICAN WEREWOLF IN PARIS, AN (Tom Everett Scott, Julie Delpy)
33 BLUES BROTHERS 2000 (Dan Aykroyd, John Goodman)
34 CROCODILE DUNDEE IN L.A. (Paul Hogan, Linda Kozlowski)
35 BAD BOYS II (Martin Lawrence, Will Smith)
36 MAJOR LEAGUE 3 (Scott Bakula, Corbin Bernsen)
37 LIFE AQUATIC WITH STEVE ZISSOU (Bill Murray, Owen Wilson)
38 BEVERLY HILLS COP 3 (Eddie Murphy, Judge Reinhold)
39 CHARLIE'S ANGELS: FULL THROTTLE (Cameron Diaz, Drew Barrymore)
40 GRUMPIER OLD MEN (Jack Lemmon, Walter Matthau)
41 KNOCK OFF (Jean-Claude Van Damme, Rob Schneider)
42 FATAL INSTINCT (Armand Assante, Sherilyn Fenn)
43 SANTA CLAUSE 2, THE (Tim Allen, Elizabeth Mitchell)
44 IT RUNS IN THE FAMILY (Michael Douglas, Kirk Douglas)
45 RUSH HOUR 2 (Jackie Chan, Chris Tucker)
46 ENVY (Ben Stiller, Jack Black)
47 RUGRATS IN PARIS: THE MOVIE (E. G. Daily (Vocal), Christine Cavanaugh (Vocal))
48 EXIT TO EDEN (Dana Delany, Paul Mercurio)
49 SURF NINJAS (Ernie Reyes Jr., Rob Schneider)
50 D3: THE MIGHTY DUCKS (Emilio Estevez, Jeffrey Nordling)
51 GOSFORD PARK (Eileen Atkins, Bob Balaban)
52 ARMY OF DARKNESS (Bruce Campbell, Embeth Davidtz)
53 MEN IN BLACK II (Tommy Lee Jones, Will Smith)
54 NEIL SIMON'S THE ODD COUPLE 2 (Jack Lemmon, Walter Matthau)
55 MY GIRL 2 (Dan Aykroyd, Jamie Lee Curtis)
56 AVENGERS, THE (Ralph Fiennes, Uma Thurman)
57 LETHAL WEAPON 4 (Mel Gibson, Danny Glover)
58 BEND IT LIKE BECKHAM (Parminder Nagra, Keira Knightley)
59 HOLY MAN (Eddie Murphy, Jeff Goldblum)
60 TOY STORY 2 (Tom Hanks (Vocal), Tim Allen (Vocal))
61 MUPPETS FROM SPACE (Jeffrey Tambor, F. Murray Abraham)
62 DESPERADO (Antonio Banderas, Salma Hayek)

TARGET MOVIE: DUKES OF HAZZARD, THE
ID Movie Title
1 STARSKY & HUTCH (Ben Stiller, Owen Wilson)
2 NATIONAL LAMPOON'S SENIOR TRIP (Matt Frewer, Valerie Mahaffey)
3 BEVERLY HILLBILLIES, THE (Diedrich Bader, Dabney Coleman)
4 MALIBU'S MOST WANTED (Jamie Kennedy, Taye Diggs)
5 LEAVE IT TO BEAVER (Christopher McDonald, Janine Turner)
6 ROAD TRIP (Breckin Meyer, Seann William Scott)
7 FLED (Laurence Fishburne, Stephen Baldwin)
8 BALLISTIC: ECKS VS. SEVER (Antonio Banderas, Lucy Liu)
9 DUDLEY DO-RIGHT (Brendan Fraser, Sarah Jessica Parker)
10 RUNDOWN, THE (Dwayne Johnson, Seann William Scott)
11 BEAVIS & BUTT-HEAD DO AMERICA (Mike Judge (Vocal), Cloris Leachman (Vocal))
12 GOOD BURGER (Kel Mitchell, Kenan Thompson)
13 DUDE, WHERE'S MY CAR? (Ashton Kutcher, Seann William Scott)
14 DUNGEONS & DRAGONS (Justin Whalin, Marlon Wayans)
15 ACE VENTURA: WHEN NATURE CALLS (Jim Carrey, Ian McNeice)
16 AIRBORNE (Shane McDermott, Seth Green)
17 NEW YORK MINUTE (Ashley Olsen, Mary-Kate Olsen)
18 SUBSTITUTE, THE (Tom Berenger, Ernie Hudson)
19 EUROTRIP (Scott Mechlowicz, Michelle Trachtenberg)
20 DON'T BE A MENACE TO SOUTH CENTRAL WHILE DRINKING YOUR JUICE IN THE HOOD (Shawn Wayans, Marlon Wayans)
21 WILD AMERICA (Jonathan Taylor Thomas, Devon Sawa)
22 MCHALE'S NAVY (Tom Arnold, Tim Curry)
23 CHARLIE'S ANGELS: FULL THROTTLE (Cameron Diaz, Drew Barrymore)
24 BLACK DOG (Patrick Swayze, Meat Loaf Aday)
25 ADVENTURES OF ROCKY AND BULLWINKLE, THE (Rene Russo, Jason Alexander)
26 SLACKERS (Devon Sawa, Jason Schwartzman)
27 GUNMEN (Christopher Lambert, Mario Van Peebles)
28 CHARLIE'S ANGELS (Cameron Diaz, Drew Barrymore)
29 SCOOBY-DOO (Freddie Prinze Jr., Sarah Michelle Gellar)
30 INSPECTOR GADGET (Matthew Broderick, Rupert Everett)
31 SUPERSTAR (Molly Shannon, Will Ferrell)
32 DICK (Kirsten Dunst, Michelle Williams)
33 FLINTSTONES, THE (John Goodman, Elizabeth Perkins)
34 TURBO: A POWER RANGERS MOVIE (Jason David Frank, Steve Cardenas)
35 PERFECT SCORE, THE (Erika Christensen, Chris Evans)
36 AMERICAN WEDDING (Jason Biggs, Alyson Hannigan)
37 DIGIMON: THE MOVIE (Lara Jill Miller (Vocal), Joshua Seth (Vocal))
38 NEW GUY, THE (DJ Qualls, Eliza Dushku)
39 FLINTSTONES, THE (John Goodman, Elizabeth Perkins)
40 POWERPUFF GIRLS MOVIE, THE (E. G. Daily (Vocal), Cathy Cavadini (Vocal))
41 RUGRATS IN PARIS: THE MOVIE (E. G. Daily (Vocal), Christine Cavanaugh (Vocal))
42 GEORGE OF THE JUNGLE (Brendan Fraser, Leslie Mann)
43 MUPPET TREASURE ISLAND (Tim Curry, Kevin Bishop)
44 BULLETPROOF MONK (Chow Yun-Fat, Seann William Scott)
45 LETHAL WEAPON 4 (Mel Gibson, Danny Glover)
46 THUNDERBIRDS (Bill Paxton, Anthony Edwards)
47 LADIES MAN, THE (Tim Meadows, Karyn Parsons)
48 AIR BUD: GOLDEN RECEIVER (Kevin Zegers, Cynthia Stevenson)
49 JOSIE AND THE PUSSYCATS (Rachael Leigh Cook, Tara Reid)
50 I SPY (Eddie Murphy, Owen Wilson)
51 MR. MAGOO (Leslie Nielsen, Kelly Lynch)
52 FLIPPER (Elijah Wood, Paul Hogan)
53 WELCOME TO MOOSEPORT (Gene Hackman, Ray Romano)
54 ATLANTIS: THE LOST EMPIRE (Michael J. Fox (Vocal), James Garner (Vocal))
55 RECESS: SCHOOL'S OUT (Rickey D'Shon Collins (Vocal), Jason Davis (Vocal))
56 LOST IN SPACE (William Hurt, Mimi Rogers)
57 HOT CHICK, THE (Rob Schneider, Anna Faris)
58 FUGITIVE, THE (Harrison Ford, Tommy Lee Jones)
59 HOLES (Sigourney Weaver, Jon Voight)
60 MUPPETS FROM SPACE (Jeffrey Tambor, F. Murray Abraham)
61 WHATEVER IT TAKES (Shane West, Marla Sokoloff)
62 POOTIE TANG (Lance Crouther, Jennifer Coolidge)
63 MASK, THE (Jim Carrey, Cameron Diaz)
64 ED WOOD (Johnny Depp, Martin Landau)
65 MISSION: IMPOSSIBLE II (Tom Cruise, Dougray Scott)
66 FORREST GUMP (Tom Hanks, Robin Wright Penn)
67 STAR WARS: EPISODE I - THE PHANTOM MENACE (Liam Neeson, Ewan McGregor)
68 HARRY POTTER AND THE SORCERER'S STONE (HARRY POTTER AND THE PHILOSOPHER'S STONE) (Daniel Radcliffe, Rupert Grint)
69 HIGHLANDER 3: FINAL DIMENSION (Christopher Lambert, Mario Van Peebles)
70 SOUTH PARK - BIGGER, LONGER AND UNCUT (Trey Parker (Vocal), Matt Stone (Vocal))
71 BLADE 2 (Wesley Snipes, Kris Kristofferson)
72 HARRY POTTER AND THE PRISONER AZKABAN (Daniel Radcliffe, Emma Watson)
73 STAR TREK: THE INSURRECTION (Patrick Stewart, Jonathan Frakes)
74 MUMMY RETURNS, THE (Brendan Fraser, Rachel Weisz)

TARGET MOVIE: FANTASTIC FOUR
ID Movie Title
1 HULK (Eric Bana, Jennifer Connelly)
2 BATMAN FOREVER (Val Kilmer, Tommy Lee Jones)
3 X-MEN (Hugh Jackman, Patrick Stewart)
4 I SPY (Eddie Murphy, Owen Wilson)
5 SPIDER-MAN (Tobey Maguire, Willem Dafoe)
6 INCREDIBLES, THE (Craig T. Nelson, Holly Hunter)
7 TIMECOP (Jean-Claude Van Damme, Mia Sara)
8 TERMINAL VELOCITY (Charlie Sheen, Nastassja Kinski)
9 BARB WIRE (Pamela Anderson Lee, Temuera Morrison)
10 JOSIE AND THE PUSSYCATS (Rachael Leigh Cook, Tara Reid)
11 CATCH THAT KID (Kristen Stewart, Corbin Bleu)
12 LEAGUE OF EXTRAORDINARY GENTLEMEN (Sean Connery, Shane West)
13 DAREDEVIL (Ben Affleck, Jennifer Garner)
14 TITAN A.E. (Matt Damon (Vocal), Drew Barrymore (Vocal))
15 TURBO: A POWER RANGERS MOVIE (Jason David Frank, Steve Cardenas)
16 PHANTOM, THE (Billy Zane, Kristy Swanson)
17 HELLBOY (Ron Perlman, John Hurt)
18 VIRUS (Jamie Lee Curtis, William Baldwin)
19 PUNISHER, THE (Tom Jane, John Travolta)
20 TEENAGE MUTANT NINJA TURTLES 3 (Paige Turco, Elias Koteas)
21 GODSEND (Greg Kinnear, Rebecca Romijn)
22 28 DAYS LATER (Cillian Murphy, Noah Huntley)
23 WHITE SQUALL (Jeff Bridges, Caroline Goodall)
24 CATWOMAN (Halle Berry, Benjamin Bratt)
25 STAR TREK: GENERATIONS (Patrick Stewart, William Shatner)
26 SUDDEN DEATH (Jean-Claude Van Damme, Powers Boothe)
27 BLADE 2 (Wesley Snipes, Kris Kristofferson)
28 EXCESS BAGGAGE (Alicia Silverstone, Benicio Del Toro)
29 BLACK DOG (Patrick Swayze, Meat Loaf Aday)
30 BLADE (Wesley Snipes, Stephen Dorff)
31 CHARLIE'S ANGELS: FULL THROTTLE (Cameron Diaz, Drew Barrymore)
32 CHRONICLES OF RIDDICK, THE (Vin Diesel, Colm Feore)
33 SOLDIER (Kurt Russell, Jason Scott Lee)
34 BULLETPROOF MONK (Chow Yun-Fat, Seann William Scott)
35 VAN HELSING (Hugh Jackman, Kate Beckinsale)
36 MATRIX REVOLUTIONS, THE (Keanu Reeves, Laurence Fishburne)
37 MOST WANTED (Keenen Ivory Wayans, Jon Voight)
38 BALLISTIC: ECKS VS. SEVER (Antonio Banderas, Lucy Liu)
39 DARK CITY (Rufus Sewell, Kiefer Sutherland)
40 STEEL (Shaquille O'Neal, Annabeth Gish)
41 MOD SQUAD, THE (Claire Danes, Giovanni Ribisi)
42 CROW: CITY OF ANGELS, THE (Vincent Perez, Mia Kirshner)
43 BATMAN: MASK OF THE PHANTASM (Kevin Conroy (Vocal), Dana Delany (Vocal))
44 WAY OF THE GUN, THE (Ryan Phillippe, Benicio Del Toro)
45 POSTMAN, THE (Kevin Costner, Will Patton)
46 TANK GIRL (Lori Petty, Malcolm McDowell)
47 SKY CAPTAIN AND THE WORLD OF TOMORROW (Gwyneth Paltrow, Jude Law)
48 MYSTERY MEN (Hank Azaria, Janeane Garofalo)
49 A.I. ARTIFICIAL INTELLIGENCE (Haley Joel Osment, Jude Law)
50 STAR TREK: NEMESIS (Patrick Stewart, Jonathan Frakes)
51 THUNDERBIRDS (Bill Paxton, Anthony Edwards)
52 HONEY (Jessica Alba, Mekhi Phifer)
53 SHADOW, THE (Alec Baldwin, John Lone)
54 ROCKET MAN (Harland Williams, Jessica Lundy)
55 PROFESSIONAL, THE (Jean Reno, Gary Oldman)
56 FIRESTORM (Howie Long, Scott Glenn)
57 ADVENTURES OF PLUTO NASH, THE (Eddie Murphy, Randy Quaid)
58 JUDGE DREDD (Sylvester Stallone, Armand Assante)
59 QUEST (1996), THE (Jean-Claude Van Damme, Roger Moore)
60 DREAMCATCHER (Morgan Freeman, Tom Jane)
61 HOUSE OF FLYING DAGGERS (Takeshi Kaneshiro, Andy Lau)
62 PAYCHECK (Ben Affleck, Aaron Eckhart)
63 GATTACA (Ethan Hawke, Uma Thurman)
64 FLIGHT OF THE PHOENIX (Dennis Quaid, Giovanni Ribisi)
65 ESCAPE FROM L.A. (Kurt Russell, Stacy Keach)
66 CHILL FACTOR (Cuba Gooding Jr., Skeet Ulrich)
67 ALIEN VS. PREDATOR (Sanaa Lathan, Raoul Bova)
68 BAD BOYS II (Martin Lawrence, Will Smith)
69 GARFIELD (Breckin Meyer, Bill Murray (vocal))
70 TORQUE (Martin Henderson, Ice Cube)
71 ERASER (Arnold Schwarzenegger, James Caan)

TARGET MOVIE: FOUR BROTHERS
ID Movie Title
1 SLEEPERS (Kevin Bacon, Robert De Niro)
2 LAST MAN STANDING (Bruce Willis, Christopher Walken)
3 SUBSTITUTE, THE (Tom Berenger, Ernie Hudson)
4 PUNISHER, THE (Tom Jane, John Travolta)
5 ALL ABOUT THE BENJAMINS (Ice Cube, Mike Epps)
6 MOD SQUAD, THE (Claire Danes, Giovanni Ribisi)
7 PAYBACK (Mel Gibson, Gregg Henry)
8 KILL BILL VOL. 1 (Uma Thurman, Lucy Liu)
9 WALKING TALL (Dwayne Johnson, Johnny Knoxville)
10 MYSTIC RIVER (Sean Penn, Tim Robbins)
11 BOILING POINT (Wesley Snipes, Dennis Hopper)
12 U-TURN (Sean Penn, Nick Nolte)
13 KNOCKAROUND GUYS (Barry Pepper, Vin Diesel)
14 SIMPLE PLAN, A (Bill Paxton, Billy Bob Thornton)
15 RONIN (Robert De Niro, Jean Reno)
16 GANGS OF NEW YORK (Leonardo DiCaprio, Daniel Day-Lewis)
17 LONG KISS GOODNIGHT, THE (Geena Davis, Samuel L. Jackson)
18 ITALIAN JOB, THE (Mark Wahlberg, Charlize Theron)
19 TRUTH ABOUT CHARLIE, THE (Mark Wahlberg, Thandie Newton)
20 CORRUPTOR, THE (Chow Yun-Fat, Mark Wahlberg)
21 IRON MONKEY (Yu Rong Guang, Donnie Yen)
22 HIGH CRIMES (Ashley Judd, Morgan Freeman)
23 REPLACEMENT KILLERS, THE (Chow Yun-Fat, Mira Sorvino)
24 PEACEMAKER, THE (George Clooney, Nicole Kidman)
25 PROOF OF LIFE (Meg Ryan, Russell Crowe)
26 FEAR (Mark Wahlberg, Reese Witherspoon)
27 MERCURY RISING (Bruce Willis, Alec Baldwin)
28 PATRIOT, THE (Mel Gibson, Heath Ledger)
29 MONSTER'S BALL (Billy Bob Thornton, Heath Ledger)
30 BATMAN: MASK OF THE PHANTASM (Kevin Conroy (Vocal), Dana Delany (Vocal))
31 BIG HIT, THE (Mark Wahlberg, Lou Diamond Phillips)
32 DOUBLE TEAM (Jean-Claude Van Damme, Dennis Rodman)
33 CHANGING LANES (Ben Affleck, Samuel L. Jackson)
34 SOLDIER (Kurt Russell, Jason Scott Lee)
35 2 FAST 2 FURIOUS (Paul Walker, Tyrese Gibson)
36 COLLATERAL DAMAGE (Arnold Schwarzenegger, Elias Koteas)
37 PAYCHECK (Ben Affleck, Aaron Eckhart)
38 MURDER IN THE FIRST (Christian Slater, Kevin Bacon)
39 MISSION: IMPOSSIBLE II (Tom Cruise, Dougray Scott)
40 HUNTED (1995), THE (Christopher Lambert, John Lone)
41 BARB WIRE (Pamela Anderson Lee, Temuera Morrison)
42 JOHN GRISHAM'S THE RAINMAKER (Matt Damon, Claire Danes)
43 SPIDER-MAN (Tobey Maguire, Willem Dafoe)
44 INSTINCT (Anthony Hopkins, Cuba Gooding Jr.)
45 SHAWSHANK REDEMPTION, THE (Tim Robbins, Morgan Freeman)
46 WAR, THE (Elijah Wood, Kevin Costner)
47 CASINO (Robert De Niro, Sharon Stone)
48 EXTREME MEASURES (Hugh Grant, Gene Hackman)
49 THREE KINGS (George Clooney, Mark Wahlberg)
50 BALLISTIC: ECKS VS.
SEVER (Antonio Banderas, Lucy Liu) 51 13TH WARRIOR, THE (Antonio Banderas, Diane Venora) 52 POETIC JUSTICE (Janet Jackson, Tupac Shakur) 53 MULHOLLAND FALLS (Nick Nolte, Melanie Griffith) 54 TWIN DRAGONS (Jackie Chan, Maggie Cheung) 55 BROKEDOWN PALACE (Claire Danes, Kate Beckinsale) 56 HARD TARGET (Jean-Claude Van Damme, Lance Henriksen) 57 CAPTAIN CORELLI'S MANDOLIN (Nicolas Cage, Penelope Cruz) 58 SNOW FALLING ON CEDARS (Ethan Hawke, James Cromwell) 59 ANGELA'S ASHES (Emily Watson, Robert Carlyle) 60 ALL THE PRETTY HORSES (Matt Damon, Henry Thomas) 61 METRO (Eddie Murphy, Michael Rapaport) 62 UP CLOSE AND PERSONAL (Robert Redford, Michelle Pfeiffer) 63 ARMAGEDDON (Bruce Willis, Billy Bob Thornton) 64 ROCK STAR (Mark Wahlberg, Jennifer Aniston) 65 SCARLET LETTER, THE (Demi Moore, Gary Oldman) 66 BOOGIE NIGHTS (Mark Wahlberg, Burt Reynolds) 67 PLANET OF THE APES (Mark Wahlberg, Tim Roth) 68 CRAZY/BEAUTIFUL (Kirsten Dunst, Jay Hernandez) 69 LASSIE (Tom Guiry, Helen Slater) 70 BELLY (Nas, DMX) 71 LOSING ISAIAH (Jessica Lange, Halle Berry) 72 WING COMMANDER (Freddie Prinze Jr., Saffron Burrows) 73 ABOUT A BOY (Hugh Grant, Toni Collette) 74 FREE WILLY 3: THE RESCUE (Jason James Richter, August Schellenberg) 75 SENSE AND SENSIBILITY (James Fleet, Tom Wilkinson) 76 SHOWGIRLS (Elizabeth Berkley, Kyle MacLachlan) 142 TARGET MOVIE: ISLAND, THE ID Movie Title 2 MATRIX, THE (Keanu Reeves, Laurence Fishburne) 3 GATTACA (Ethan Hawke, Uma Thurman) 4 TRUMAN SHOW, THE (Jim Carrey, Laura Linney) 5 ISLAND OF DR. MOREAU, THE (Marlon Brando, Val Kilmer) 6 NO ESCAPE (Ray Liotta, Lance Henriksen) 8 FORTRESS (Christopher Lambert, Kurtwood Smith) 9 SPECIES 2 (Michael Madsen, Natasha Henstridge) 10 6TH DAY, THE (Arnold Schwarzenegger, Tony Goldwyn) 11 A.I. 
ARTIFICIAL INTELLIGENCE (Haley Joel Osment, Jude Law) 12 NINTH GATE, THE (Johnny Depp, Frank Langella) 13 STRANGE DAYS (Ralph Fiennes, Angela Bassett) 14 VILLAGE, THE (Bryce Dallas Howard, Joaquin Phoenix) 15 TRUTH ABOUT CHARLIE, THE (Mark Wahlberg, Thandie Newton) 16 DARK CITY (Rufus Sewell, Kiefer Sutherland) 17 PAYCHECK (Ben Affleck, Aaron Eckhart) 18 GODSEND (Greg Kinnear, Rebecca Romijn) UNIVERSAL SOLDIER: THE RETURN (Jean-Claude Van Damme, Michael Jai 19 White) 20 PLANET OF THE APES (Mark Wahlberg, Tim Roth) 21 PLEASANTVILLE (Tobey Maguire, Jeff Daniels) 22 ANTITRUST (Ryan Phillippe, Rachael Leigh Cook) 23 THIRTEENTH FLOOR, THE (Craig Bierko, Armin Mueller-Stahl) 24 PITCH BLACK (Radha Mitchell, Vin Diesel) 25 DEMOLITION MAN (Sylvester Stallone, Wesley Snipes) 26 FRAILTY (Bill Paxton, Matthew McConaughey) 27 STIR OF ECHOES (Kevin Bacon, Kathryn Erbe) 28 MIMIC (Mira Sorvino, Jeremy Northam) 29 TERMINATOR 3: RISE OF THE MACHINES (Arnold Schwarzenegger, Nick Stahl) ANACONDAS: THE HUNT FOR THE BLOOD ORCHID (Johnny Messner, KaDee 30 Strickland) 31 STARGATE (Kurt Russell, James Spader) 32 BEACH, THE (Leonardo DiCaprio, Tilda Swinton) 33 SKY CAPTAIN AND THE WORLD OF TOMORROW (Gwyneth Paltrow, Jude Law) 34 CHRONICLES OF RIDDICK, THE (Vin Diesel, Colm Feore) 35 FEMME FATALE (Rebecca Romijn, Antonio Banderas) 36 GLASS HOUSE, THE (Leelee Sobieski, Diane Lane) 37 GIFT, THE (Cate Blanchett, Giovanni Ribisi) 38 EYE OF THE BEHOLDER (Ewan McGregor, Ashley Judd) 39 GHOST SHIP (Julianna Margulies, Ron Eldard) 40 LAKE PLACID (Bill Pullman, Bridget Fonda) 41 STAR TREK: NEMESIS (Patrick Stewart, Jonathan Frakes) 42 TITAN A.E. (Matt Damon (Vocal), Drew Barrymore (Vocal)) 43 MISSING, THE (Tommy Lee Jones, Cate Blanchett) 44 GOTHIKA (Halle Berry, Robert Downey Jr.) 
45 PSYCHO (Vince Vaughn, Anne Heche) 46 LAWNMOWER MAN 2 (Patrick Bergin, Matt Frewer) 47 EXTREME OPS (Devon Sawa, Bridgette Wilson-Sampras) 48 PLAYING GOD (David Duchovny, Timothy Hutton) STAR WARS: EPISODE I - THE PHANTOM MENACE (Liam Neeson, Ewan 49 McGregor) 143 50 LIFE LESS ORDINARY, A (Ewan McGregor, Cameron Diaz) 51 FEAR (Mark Wahlberg, Reese Witherspoon) 52 ROCKET MAN (Harland Williams, Jessica Lundy) 53 BIG FISH (Ewan McGregor, Albert Finney) 54 WING COMMANDER (Freddie Prinze Jr., Saffron Burrows) 55 TREASURE PLANET (Joseph Gordon-Levitt (Vocal), Brian Murray (Vocal)) 56 HACKERS (Jonny Lee Miller, Angelina Jolie) 57 THUNDERBIRDS (Bill Paxton, Anthony Edwards) 58 JASON X (Lexa Doig, Lisa Ryder) 59 LOST IN TRANSLATION (Bill Murray, Scarlett Johansson) 60 GODZILLA 2000 (Takehiro Murata, Shiro Sano) 61 BLACK HAWK DOWN (Josh Hartnett, Ewan McGregor) 62 FALLEN (Denzel Washington, John Goodman) 63 MOULIN ROUGE! (Nicole Kidman, Ewan McGregor) 64 FARGO (Frances McDormand, Steve Buscemi) 65 DOWN WITH LOVE (Renee Zellweger, Ewan McGregor) TARGET MOVIE: MUST LOVE DOGS ID Movie Title 3 MR. WONDERFUL (Matt Dillon, Annabella Sciorra) 4 MATCHMAKER, THE (Janeane Garofalo, David O'Hara) 5 IF LUCY FELL (Sarah Jessica Parker, Eric Schaeffer) 6 PALLBEARER, THE (David Schwimmer, Gwyneth Paltrow) 7 YOU'VE GOT MAIL (Tom Hanks, Meg Ryan) 8 SOMEONE LIKE YOU (Ashley Judd, Greg Kinnear) 9 ONLY YOU (Marisa Tomei, Robert Downey Jr.) 
10 LIFE OR SOMETHING LIKE IT (Angelina Jolie, Edward Burns) 11 DOWN TO YOU (Freddie Prinze Jr., Julia Stiles) 12 ALEX & EMMA (Luke Wilson, Kate Hudson) 13 SERENDIPITY (John Cusack, Kate Beckinsale) 14 WIN A DATE WITH TAD HAMILTON (Kate Bosworth, Topher Grace) 15 ALONG CAME POLLY (Ben Stiller, Jennifer Aniston) 16 UNDER THE TUSCAN SUN (Diane Lane, Sandra Oh) 17 HOW TO LOSE A GUY IN 10 DAYS (Kate Hudson, Matthew McConaughey) 18 MAID IN MANHATTAN (Jennifer Lopez, Ralph Fiennes) 19 THREE TO TANGO (Matthew Perry, Neve Campbell) 20 FORCES OF NATURE (Sandra Bullock, Ben Affleck) 21 HEAD OVER HEELS (Monica Potter, Freddie Prinze Jr.) 22 BRIDGET JONES: EDGE OF REASON (Renee Zellweger, Hugh Grant) 23 AS GOOD AS IT GETS (Jack Nicholson, Helen Hunt) 24 LOVE ACTUALLY (Alan Rickman, Bill Nighy) 25 JERSEY GIRL (Ben Affleck, Liv Tyler) 26 50 FIRST DATES (Adam Sandler, Drew Barrymore) 27 KISSING A FOOL (David Schwimmer, Jason Lee) 28 LAWS OF ATTRACTION (Pierce Brosnan, Julianne Moore) 29 MILK MONEY (Melanie Griffith, Ed Harris) 30 LOVE LETTER, THE (Kate Capshaw, Blythe Danner) 31 HIGH FIDELITY (John Cusack, Iben Hjejle) 32 WEDDING SINGER, THE (Adam Sandler, Drew Barrymore) 33 SIMPLY IRRESISTIBLE (Sarah Michelle Gellar, Sean Patrick Flanery) 144 34 SHE'S ALL THAT (Freddie Prinze Jr., Rachael Leigh Cook) 35 TWO WEEKS NOTICE (Sandra Bullock, Hugh Grant) 36 WIMBLEDON (Kirsten Dunst, Paul Bettany) 37 THERE'S SOMETHING ABOUT MARY (Cameron Diaz, Matt Dillon) 38 CHASING LIBERTY (Mandy Moore, Matthew Goode) 39 MY BEST FRIEND'S WEDDING (Julia Roberts, Dermot Mulroney) 40 CHOCOLAT (Juliette Binoche, Lena Olin) 41 GIGLI (Ben Affleck, Jennifer Lopez) 42 13 GOING ON 30 (SUDDENLY 30) (Jennifer Garner, Mark Ruffalo) 43 ALFIE (Jude Law, Marisa Tomei) 44 LIFE LESS ORDINARY, A (Ewan McGregor, Cameron Diaz) 45 PRINCESS DIARIES 2:ROYAL ENGAGEMENT (Anne Hathaway, Julie Andrews) 46 FOR LOVE OR MONEY (Michael J. Fox, Gabrielle Anwar) 47 TOWN & COUNTRY (Warren Beatty, Diane Keaton) 48 DR. 
T & THE WOMEN (Richard Gere, Helen Hunt) 49 FIRST DAUGHTER (Katie Holmes, Marc Blucas) 51 SIX DAYS, SEVEN NIGHTS (Harrison Ford, Anne Heche) 52 GRUMPY OLD MEN (Jack Lemmon, Walter Matthau) 53 FROM JUSTIN TO KELLY (Kelly Clarkson, Justin Guarini) 54 MIDSUMMER NIGHT'S DREAM, A (Kevin Kline, Michelle Pfeiffer) 55 PUSHING TIN (John Cusack, Billy Bob Thornton) 57 BIRTHDAY GIRL (Nicole Kidman, Ben Chaplin) 58 TWO IF BY SEA (Denis Leary, Sandra Bullock) 59 SIDEWAYS (Paul Giamatti, Thomas Haden Church) 60 MRS. WINTERBOURNE (Shirley MacLaine, Ricki Lake) 61 BEAUTICIAN AND THE BEAST, THE (Fran Drescher, Timothy Dalton) 62 HOW TO DEAL (Mandy Moore, Allison Janney) 63 MAD DOG AND GLORY (Robert De Niro, Uma Thurman) 64 MY DOG SKIP (Frankie Muniz, Diane Lane) 65 JACK (Robin Williams, Diane Lane) 66 GLASS HOUSE, THE (Leelee Sobieski, Diane Lane) 67 MURDER AT 1600 (Wesley Snipes, Diane Lane) 68 RUNAWAY JURY (John Cusack, Gene Hackman) MIDNIGHT IN THE GARDEN OF GOOD AND EVIL (Kevin Spacey, John 69 Cusack) TARGET MOVIE: REBOUND ID Movie Title 1 BIG GREEN, THE (Steve Guttenberg, Olivia D'Abo) 2 D3: THE MIGHTY DUCKS (Emilio Estevez, Jeffrey Nordling) 3 AIR UP THERE, THE (Kevin Bacon, Charles Gitonga Maina) 4 LITTLE GIANTS (Rick Moranis, Ed O'Neill) 5 LITTLE BIG LEAGUE (Luke Edwards, Timothy Busfield) 6 MAJOR LEAGUE 3 (Scott Bakula, Corbin Bernsen) 7 DODGEBALL: A TRUE UNDERDOG STORY (Vince Vaughn, Christine Taylor) 8 LIKE MIKE (Bow Wow, Morris Chestnut) 9 CELTIC PRIDE (Damon Wayans, Daniel Stern) 10 BASEKETBALL (Trey Parker, Matt Stone) 11 COOL RUNNINGS (Leon, Doug E. Doug) 12 SANDLOT, THE (Mike Vitar, Tom Guiry) 13 6TH MAN, THE (Marlon Wayans, Kadeem Hardison) 145 14 BEND IT LIKE BECKHAM (Parminder Nagra, Keira Knightley) 15 WATERBOY, THE (Adam Sandler, Kathy Bates) 16 MYSTERY, ALASKA (Russell Crowe, Hank Azaria) 17 ANGUS (Charlie Talbert, George C. Scott) 18 ROOKIE, THE (Dennis Quaid, Rachel Griffiths) 19 JUWANNA MANN (Miguel A. Nunez Jr., Vivica A. 
Fox) 20 WHAT'S THE WORST THAT COULD HAPPEN? (Martin Lawrence, Danny DeVito) 21 AIR BUD: GOLDEN RECEIVER (Kevin Zegers, Cynthia Stevenson) 22 EDDIE (Whoopi Goldberg, Frank Langella) 23 SCOUT, THE (Albert Brooks, Brendan Fraser) 24 AIRBORNE (Shane McDermott, Seth Green) 25 HIGH SCHOOL HIGH (Jon Lovitz, Tia Carrere) 26 GREAT WHITE HYPE, THE (Samuel L. Jackson, Jeff Goldblum) 27 BLUE STREAK (Martin Lawrence, Luke Wilson) 28 MR. NANNY (Hulk Hogan, Sherman Hemsley) 29 NOTHING TO LOSE (Martin Lawrence, Tim Robbins) 30 SLACKERS (Devon Sawa, Jason Schwartzman) 31 PERFECT SCORE, THE (Erika Christensen, Chris Evans) 32 NATIONAL SECURITY (Martin Lawrence, Steve Zahn) 33 BLACK KNIGHT (Martin Lawrence, Marsha Thomason) 34 LIFE (Eddie Murphy, Martin Lawrence) 35 SORORITY BOYS (Barry Watson, Michael Rosenbaum) 36 RENAISSANCE MAN (Danny DeVito, Gregory Hines) 37 MR. NICE GUY (Jackie Chan, Richard Norton) 38 BAD BOYS (Martin Lawrence, Will Smith) 39 BIG MOMMA'S HOUSE (Martin Lawrence, Nia Long) 40 DROP DEAD GORGEOUS (Kirsten Dunst, Ellen Barkin) DUMB AND DUMBERER: WHEN HARRY MET LLOYD (Eric Christian Olsen, 41 Derek Richardson) 42 ENVY (Ben Stiller, Jack Black) 43 GOOD BURGER (Kel Mitchell, Kenan Thompson) 44 NATIONAL LAMPOON'S VAN WILDER (Ryan Reynolds, Tara Reid) 45 EUROTRIP (Scott Mechlowicz, Michelle Trachtenberg) 46 SIMON BIRCH (Ian Michael Smith, Joseph Mazzello) 47 AMERICAN PIE 2 (Jason Biggs, Shannon Elizabeth) 48 ED (Matt LeBlanc, Jayne Brook) 49 BAD BOYS (Martin Lawrence, Will Smith) 50 GIRL NEXT DOOR, THE (Emile Hirsch, Elisha Cuthbert) 51 DRIVE ME CRAZY (Melissa Joan Hart, Adrian Grenier) 52 TEACHING MRS. 
TINGLE (Helen Mirren, Katie Holmes) 53 RUSHMORE (Jason Schwartzman, Bill Murray) 54 CONNIE AND CARLA (Nia Vardalos, Toni Collette) 55 HARRIET THE SPY (Michelle Trachtenberg, Rosie O'Donnell) 56 I SPY (Eddie Murphy, Owen Wilson) 57 NATIONAL LAMPOON'S GOLD DIGGERS (Chris Owen, Will Friedle) 58 IDLE HANDS (Devon Sawa, Seth Green) 59 EVOLUTION (David Duchovny, Orlando Jones) 60 JAWBREAKER (Rose McGowan, Rebecca Gayheart) 61 MICKEY BLUE EYES (Hugh Grant, James Caan) 62 MUPPETS FROM SPACE (Jeffrey Tambor, F. Murray Abraham) 63 GOOD GIRL, THE (Jennifer Aniston, Jake Gyllenhaal) 64 DUNSTON CHECKS IN (Jason Alexander, Faye Dunaway) 65 BUBBLE BOY (Jake Gyllenhaal, Swoosie Kurtz) 146 66 PAPER, THE (Michael Keaton, Robert Duvall) TARGET MOVIE: RED EYE ID Movie Title 3 TURBULENCE (Ray Liotta, Lauren Holly) 4 TRAPPED (Charlize Theron, Courtney Love) 6 RANSOM (Mel Gibson, Rene Russo) 7 MERCURY RISING (Bruce Willis, Alec Baldwin) 8 JUROR, THE (Demi Moore, Alec Baldwin) 9 DON'T SAY A WORD (Michael Douglas, Sean Bean) 10 NEGOTIATOR, THE (Samuel L. Jackson, Kevin Spacey) 11 FEAR (Mark Wahlberg, Reese Witherspoon) 12 SPEED (Keanu Reeves, Dennis Hopper) 14 CON AIR (Nicolas Cage, John Cusack) 15 TAKING LIVES (Angelina Jolie, Ethan Hawke) 16 FEARLESS (Jeff Bridges, Isabella Rossellini) 17 BLOWN AWAY (Jeff Bridges, Tommy Lee Jones) 18 STEPHEN KING'S THINNER (Robert John Burke, Joe Mantegna) 19 ORDER, THE (Heath Ledger, Shannyn Sossamon) 20 PAPARAZZI (Cole Hauser, Robin Tunney) 21 DESPERATE MEASURES (Michael Keaton, Andy Garcia) 22 MURDER AT 1600 (Wesley Snipes, Diane Lane) 23 SIEGE, THE (Denzel Washington, Annette Bening) 24 ANTITRUST (Ryan Phillippe, Rachael Leigh Cook) 25 FAN, THE (Robert De Niro, Wesley Snipes) 26 TWISTED (Ashley Judd, Samuel L. Jackson) 27 LONG KISS GOODNIGHT, THE (Geena Davis, Samuel L. 
Jackson) 28 COLLATERAL DAMAGE (Arnold Schwarzenegger, Elias Koteas) 29 BLOOD WORK (Clint Eastwood, Jeff Daniels) 30 NEVER DIE ALONE (DMX, David Arquette) 31 PROFESSIONAL, THE (Jean Reno, Gary Oldman) 32 SUSPECT ZERO (Aaron Eckhart, Ben Kingsley) 33 DOUBLE TEAM (Jean-Claude Van Damme, Dennis Rodman) 34 NARC (Ray Liotta, Jason Patric) 35 PSYCHO (Vince Vaughn, Anne Heche) 36 CELL, THE (Jennifer Lopez, Vince Vaughn) 37 USUAL SUSPECTS, THE (Stephen Baldwin, Gabriel Byrne) 38 DAYLIGHT (Sylvester Stallone, Amy Brenneman) 39 WATCHER, THE (James Spader, Marisa Tomei) 40 EXTREME MEASURES (Hugh Grant, Gene Hackman) 41 FAMILY THING, A (Robert Duvall, James Earl Jones) 42 FEMME FATALE (Rebecca Romijn, Antonio Banderas) 43 SPARTAN (Val Kilmer, Derek Luke) 44 CONTENDER, THE (Joan Allen, Gary Oldman) 45 SCREAM (Neve Campbell, David Arquette) 46 NOWHERE TO RUN (Jean-Claude Van Damme, Rosanna Arquette) 47 SLIVER (Sharon Stone, William Baldwin) 48 O (Mekhi Phifer, Josh Hartnett) 49 PUSHING TIN (John Cusack, Billy Bob Thornton) 50 FALLEN (Denzel Washington, John Goodman) 51 BODY OF EVIDENCE (Madonna, Willem Dafoe) 147 52 RANDOM HEARTS (Harrison Ford, Kristin Scott Thomas) 53 FARGO (Frances McDormand, Steve Buscemi) 54 12 MONKEYS (Bruce Willis, Madeleine Stowe) 55 SIMPLE PLAN, A (Bill Paxton, Billy Bob Thornton) 56 28 DAYS LATER (Cillian Murphy, Noah Huntley) 57 CITY BY THE SEA (Robert De Niro, Frances McDormand) 58 OUTBREAK (Dustin Hoffman, Rene Russo) 59 GIFT, THE (Cate Blanchett, Giovanni Ribisi) 60 NIXON (Anthony Hopkins, Joan Allen) 61 ASTRONAUT'S WIFE, THE (Johnny Depp, Charlize Theron) 62 DANTE'S PEAK (Pierce Brosnan, Linda Hamilton) 63 VANISHING, THE (Jeff Bridges, Kiefer Sutherland) 64 TEACHING MRS. 
TINGLE (Helen Mirren, Katie Holmes) 65 U-TURN (Sean Penn, Nick Nolte) 66 GODZILLA 2000 (Takehiro Murata, Shiro Sano) 67 NINTH GATE, THE (Johnny Depp, Frank Langella) 68 ORIGINAL SIN (Antonio Banderas, Angelina Jolie) 69 MOTHMAN PROPHECIES, THE (Richard Gere, Laura Linney) 70 GIGLI (Ben Affleck, Jennifer Lopez) 71 STIR OF ECHOES (Kevin Bacon, Kathryn Erbe) 72 DAWN OF THE DEAD (Sarah Polley, Ving Rhames) 73 THIRTEENTH FLOOR, THE (Craig Bierko, Armin Mueller-Stahl) 74 PERFECT STORM, THE (George Clooney, Mark Wahlberg) 75 SHAUN OF THE DEAD (Simon Pegg, Kate Ashfield) 77 TWISTER (Helen Hunt, Bill Paxton) 78 VOLCANO (Tommy Lee Jones, Anne Heche) 79 MARY REILLY (Julia Roberts, John Malkovich) 80 POETIC JUSTICE (Janet Jackson, Tupac Shakur) 81 MOD SQUAD, THE (Claire Danes, Giovanni Ribisi) 82 MEAN GIRLS (Lindsay Lohan, Rachel McAdams) 83 TRUTH ABOUT CHARLIE, THE (Mark Wahlberg, Thandie Newton) 84 DIABOLIQUE (Sharon Stone, Isabelle Adjani) 85 NOTEBOOK, THE (Ryan Gosling, Rachel McAdams) 86 QUIZ SHOW (John Turturro, Rob Morrow) 87 GLITTER (Mariah Carey, Max Beesley) 88 BLACK BEAUTY (Sean Bean, David Thewlis) 89 WITH HONORS (Joe Pesci, Brendan Fraser) TARGET MOVIE: SKELETON KEY, THE ID Movie Title 1 DARKNESS (Anna Paquin, Lena Olin) 2 OTHERS, THE (Nicole Kidman, Christopher Eccleston) 3 STIR OF ECHOES (Kevin Bacon, Kathryn Erbe) 4 GHOST SHIP (Julianna Margulies, Ron Eldard) 5 FORGOTTEN, THE (Julianne Moore, Dominic West) 6 LOST SOULS (Winona Ryder, Ben Chaplin) 7 FEARDOTCOM (Stephen Dorff, Natascha McElhone) 8 HAUNTING, THE (Liam Neeson, Catherine Zeta-Jones) 9 VILLAGE, THE (Bryce Dallas Howard, Joaquin Phoenix) 10 FLESH AND BONE (Dennis Quaid, Meg Ryan) 11 CHILDREN OF THE CORN 2 (Terence Knox, Paul Scherrer) 148 12 GOTHIKA (Halle Berry, Robert Downey Jr.) 
13 DARKNESS FALLS (Chaney Kley, Emma Caulfield) 14 EXTREME MEASURES (Hugh Grant, Gene Hackman) 15 DRAGONFLY (Kevin Costner, Joe Morton) 16 ALONG CAME A SPIDER (Morgan Freeman, Monica Potter) 17 PHANTOMS (Peter O'Toole, Rose McGowan) 18 EXORCIST: THE BEGINNING (Stellan Skarsgard, James D'Arcy) 19 BONES (Snoop Dogg, Pam Grier) 20 FRIGHTENERS, THE (Michael J. Fox, Trini Alvarado) 21 FALLEN (Denzel Washington, John Goodman) 22 MARY REILLY (Julia Roberts, John Malkovich) 23 ORDER, THE (Heath Ledger, Shannyn Sossamon) 24 HOUSE OF THE DEAD (Jonathan Cherry, Tyron Leitso) 25 STIGMATA (Patricia Arquette, Gabriel Byrne) 26 SOLARIS (George Clooney, Natascha McElhone) 27 FINAL DESTINATION (Devon Sawa, Ali Larter) 28 NEEDFUL THINGS (Max Von Sydow, Ed Harris) 29 WISHMASTER (Tammy Lauren, Andrew Divoff) 30 JEEPERS CREEPERS (Gina Philips, Justin Long) 31 IN DREAMS (Annette Bening, Aidan Quinn) 32 WICKER PARK (Josh Hartnett, Rose Byrne) 33 FORSAKEN, THE (Kerr Smith, Brendan Fehr) 34 FREDDY VS. JASON (Robert Englund, Ken Kirzinger) 35 BROKEDOWN PALACE (Claire Danes, Kate Beckinsale) 36 SEED OF CHUCKY (Jennifer Tilly (Herself), Brad Dourif (Vocal)) 37 BEACH, THE (Leonardo DiCaprio, Tilda Swinton) 38 CRAFT, THE (Robin Tunney, Fairuza Balk) 39 BLESS THE CHILD (Kim Basinger, Jimmy Smits) 40 POWDER (Mary Steenburgen, Sean Patrick Flanery) 41 CANDYMAN 2 (Tony Todd, Kelly Rowan) 42 BRIDE OF CHUCKY (Jennifer Tilly, Katherine Heigl) 43 DEVIL'S ADVOCATE, THE (Keanu Reeves, Al Pacino) 44 WOLF (Jack Nicholson, Michelle Pfeiffer) 45 OPEN WATER (Blanchard Ryan, Daniel Travis) 46 RAGE: CARRIE 2, THE (Emily Bergl, Jason London) 47 ARMY OF DARKNESS (Bruce Campbell, Embeth Davidtz) 48 TEACHING MRS. TINGLE (Helen Mirren, Katie Holmes) 49 13TH WARRIOR, THE (Antonio Banderas, Diane Venora) 50 JASON GOES TO HELL-FINAL FRIDAY (Jon D. 
LeMay, Kari Keegan) 51 PRACTICAL MAGIC (Sandra Bullock, Nicole Kidman) 52 END OF DAYS (Arnold Schwarzenegger, Gabriel Byrne) 53 CRUEL INTENTIONS (Sarah Michelle Gellar, Ryan Phillippe) 54 THOMAS CROWN AFFAIR, THE (Pierce Brosnan, Rene Russo) 55 SELENA (Jennifer Lopez, Edward James Olmos) 56 LE DIVORCE (Kate Hudson, Naomi Watts) 57 CORRINA, CORRINA (Whoopi Goldberg, Ray Liotta) 58 LOSING ISAIAH (Jessica Lange, Halle Berry) 59 RAISING HELEN (Kate Hudson, John Corbett) 60 LOVE AND BASKETBALL (Sanaa Lathan, Omar Epps) 61 WHAT'S LOVE GOT TO DO WITH IT (Angela Bassett, Laurence Fishburne) 62 ALEX & EMMA (Luke Wilson, Kate Hudson) 63 HOW TO LOSE A GUY IN 10 DAYS (Kate Hudson, Matthew McConaughey) 64 PAULIE (Gena Rowlands, Tony Shalhoub) 149 65 IN GOOD COMPANY (Dennis Quaid, Topher Grace) 66 SIDEWAYS (Paul Giamatti, Thomas Haden Church) 67 TERMINAL, THE (Tom Hanks, Catherine Zeta-Jones) 68 LASSIE (Tom Guiry, Helen Slater) TARGET MOVIE: SKY HIGH ID Movie Title 1 INCREDIBLES, THE (Craig T. Nelson, Holly Hunter) 2 SPY KIDS (Antonio Banderas, Carla Gugino) 3 X-MEN (Hugh Jackman, Patrick Stewart) 4 MYSTERY MEN (Hank Azaria, Janeane Garofalo) 5 HOUSE ARREST (Jamie Lee Curtis, Kevin Pollak) 6 CATCH THAT KID (Kristen Stewart, Corbin Bleu) HARRY POTTER AND THE CHAMBER OF SECRETS (Daniel Radcliffe, Rupert 7 Grint) 8 TEENAGE MUTANT NINJA TURTLES 3 (Paige Turco, Elias Koteas) 9 MAX KEEBLE'S BIG MOVE (Alex D. Linz, Larry Miller) 10 BENJI OFF THE LEASH! (Nick Whitaker, Nate Bynum) 11 HELLBOY (Ron Perlman, John Hurt) 12 SHADOW, THE (Alec Baldwin, John Lone) 13 SLEEPOVER (Alexa Vega, Mika Boorem) 14 ANGUS (Charlie Talbert, George C. Scott) 15 MATILDA (Mara Wilson, Danny DeVito) 16 AIRBORNE (Shane McDermott, Seth Green) 17 NEW GUY, THE (DJ Qualls, Eliza Dushku) 18 MEAN GIRLS (Lindsay Lohan, Rachel McAdams) 19 PINOCCHIO (Roberto Benigni, Nicoletta Braschi) 20 LITTLE VAMPIRE, THE (Jonathan Lipnicki, Richard E. 
Grant) 21 HULK (Eric Bana, Jennifer Connelly) 22 NOTHING TO LOSE (Martin Lawrence, Tim Robbins) 23 IDLE HANDS (Devon Sawa, Seth Green) 24 BORROWERS, THE (John Goodman, Jim Broadbent) 25 SURVIVING CHRISTMAS (Ben Affleck, James Gandolfini) 26 STUPIDS, THE (Tom Arnold, Jessica Lundy) 27 BRADY BUNCH MOVIE, THE (Shelley Long, Gary Cole) 28 MALLRATS (Shannen Doherty, Jeremy London) 29 HIGH SCHOOL HIGH (Jon Lovitz, Tia Carrere) 30 D3: THE MIGHTY DUCKS (Emilio Estevez, Jeffrey Nordling) 31 HOME FOR THE HOLIDAYS (Holly Hunter, Robert Downey Jr.) 32 LIZZIE MCGUIRE MOVIE, THE (Hilary Duff, Adam Lamberg) 33 WILD AMERICA (Jonathan Taylor Thomas, Devon Sawa) 34 HAPPY GILMORE (Adam Sandler, Christopher McDonald) 35 FLIPPER (Elijah Wood, Paul Hogan) 36 KRIPPENDORF'S TRIBE (Richard Dreyfuss, Jenna Elfman) 37 MIRACLE (Kurt Russell, Patricia Clarkson) 38 PERFECT SCORE, THE (Erika Christensen, Chris Evans) 39 NEW YORK MINUTE (Ashley Olsen, Mary-Kate Olsen) 40 STARGATE (Kurt Russell, James Spader) 41 HOT CHICK, THE (Rob Schneider, Anna Faris) 42 BEDAZZLED (Brendan Fraser, Elizabeth Hurley) 43 MR. NANNY (Hulk Hogan, Sherman Hemsley) 150 44 AMAZING PANDA ADVENTURE, THE (Stephen Lang, Ryan Slater) 45 SIMON BIRCH (Ian Michael Smith, Joseph Mazzello) DUMB AND DUMBERER: WHEN HARRY MET LLOYD (Eric Christian Olsen, 46 Derek Richardson) 47 GOOD BURGER (Kel Mitchell, Kenan Thompson) 48 RACE THE SUN (Halle Berry, James Belushi) 49 AMERICAN WEDDING (Jason Biggs, Alyson Hannigan) 50 DUDE, WHERE'S MY CAR? (Ashton Kutcher, Seann William Scott) 51 PUNISHER, THE (Tom Jane, John Travolta) 52 3000 MILES TO GRACELAND (Kurt Russell, Kevin Costner) 53 RAISING HELEN (Kate Hudson, John Corbett) 54 SOLDIER (Kurt Russell, Jason Scott Lee) 55 DARK BLUE (Kurt Russell, Brendan Gleeson) 56 SLACKERS (Devon Sawa, Jason Schwartzman) 57 LEAVE IT TO BEAVER (Christopher McDonald, Janine Turner) 58 MYSTERY, ALASKA (Russell Crowe, Hank Azaria) 59 ESCAPE FROM L.A. (Kurt Russell, Stacy Keach) 60 TEACHING MRS. 
TINGLE (Helen Mirren, Katie Holmes) 61 ERNEST RIDES AGAIN (Jim Varney, Ron K. James) 62 ORANGE COUNTY (Colin Hanks, Jack Black) 63 VEGAS VACATION (Chevy Chase, Beverly D'Angelo) 64 EUROTRIP (Scott Mechlowicz, Michelle Trachtenberg) 65 WHAT'S EATING GILBERT GRAPE? (Johnny Depp, Juliette Lewis) 66 BARNEY'S GREAT ADVENTURE (George Hearn, Shirley Douglas) 67 CRUSH, THE (Cary Elwes, Alicia Silverstone) 69 LE DIVORCE (Kate Hudson, Naomi Watts) 70 BREAKDOWN (Kurt Russell, J. T. Walsh) TARGET MOVIE: STEALTH ID Movie Title 1 I, ROBOT (Will Smith, Bridget Moynahan) 2 IMPOSTOR (Gary Sinise, Madeleine Stowe) 3 TERMINATOR 3: RISE OF THE MACHINES (Arnold Schwarzenegger, Nick Stahl) 4 MIMIC (Mira Sorvino, Jeremy Northam) 5 FORTRESS (Christopher Lambert, Kurtwood Smith) 6 TERMINAL VELOCITY (Charlie Sheen, Nastassja Kinski) 7 EVENT HORIZON (Laurence Fishburne, Sam Neill) UNIVERSAL SOLDIER: THE RETURN (Jean-Claude Van Damme, Michael Jai 8 White) 9 A.I. ARTIFICIAL INTELLIGENCE (Haley Joel Osment, Jude Law) 10 WING COMMANDER (Freddie Prinze Jr., Saffron Burrows) 11 THIRTEENTH FLOOR, THE (Craig Bierko, Armin Mueller-Stahl) 12 BATTLEFIELD EARTH (John Travolta, Barry Pepper) 13 MATRIX REVOLUTIONS, THE (Keanu Reeves, Laurence Fishburne) 14 6TH DAY, THE (Arnold Schwarzenegger, Tony Goldwyn) 15 MINORITY REPORT (Tom Cruise, Colin Farrell) 16 DEMOLITION MAN (Sylvester Stallone, Wesley Snipes) 17 ESCAPE FROM L.A. 
(Kurt Russell, Stacy Keach) 18 SKY CAPTAIN AND THE WORLD OF TOMORROW (Gwyneth Paltrow, Jude Law) 19 U-571 (Matthew McConaughey, Bill Paxton) 20 ARMAGEDDON (Bruce Willis, Billy Bob Thornton) 151 21 SPECIES 2 (Michael Madsen, Natasha Henstridge) 22 STEEL (Shaquille O'Neal, Annabeth Gish) 23 MASTERMINDS (Patrick Stewart, Vincent Kartheiser) 24 28 DAYS LATER (Cillian Murphy, Noah Huntley) 25 DARK CITY (Rufus Sewell, Kiefer Sutherland) 26 GODZILLA 2000 (Takehiro Murata, Shiro Sano) 27 PITCH BLACK (Radha Mitchell, Vin Diesel) 28 ONE, THE (Jet Li, Carla Gugino) STAR WARS: EPISODE II - ATTACK OF THE CLONES (Ewan McGregor, Natalie 29 Portman) 30 ADVENTURES OF PLUTO NASH, THE (Eddie Murphy, Randy Quaid) 31 COLLATERAL DAMAGE (Arnold Schwarzenegger, Elias Koteas) 32 XXX (Vin Diesel, Asia Argento) 33 JUDGE DREDD (Sylvester Stallone, Armand Assante) 34 CHRONICLES OF RIDDICK, THE (Vin Diesel, Colm Feore) 35 HIGHLANDER 3: FINAL DIMENSION (Christopher Lambert, Mario Van Peebles) 36 DEEP IMPACT (Robert Duvall, Tea Leoni) 37 CORE, THE (Aaron Eckhart, Hilary Swank) LARA CROFT TOMB RAIDER: THE CRADLE OF LIFE (Angelina Jolie, Gerard 38 Butler) 39 FINAL FANTASY: THE SPIRITS WITHIN (Ming-Na (Vocal), Alec Baldwin (Vocal)) 40 TITAN A.E. (Matt Damon (Vocal), Drew Barrymore (Vocal)) 41 PAYCHECK (Ben Affleck, Aaron Eckhart) 42 THUNDERBIRDS (Bill Paxton, Anthony Edwards) 43 WORLD IS NOT ENOUGH, THE (Pierce Brosnan, Sophie Marceau) 44 POSTMAN, THE (Kevin Costner, Will Patton) 45 TREASURE PLANET (Joseph Gordon-Levitt (Vocal), Brian Murray (Vocal)) 46 GATTACA (Ethan Hawke, Uma Thurman) 47 12 MONKEYS (Bruce Willis, Madeleine Stowe) 48 MARS ATTACKS! (Jack Nicholson, Glenn Close) BATMAN: MASK OF THE PHANTASM (Kevin Conroy (Vocal), Dana Delany 49 (Vocal)) 50 TANK GIRL (Lori Petty, Malcolm McDowell) 51 ISLAND OF DR. 
MOREAU, THE (Marlon Brando, Val Kilmer) 52 CELL, THE (Jennifer Lopez, Vince Vaughn) 53 TEXAS CHAINSAW MASSACRE, THE (Jessica Biel, Jonathan Tucker) 54 VAN HELSING (Hugh Jackman, Kate Beckinsale) 55 PROOF OF LIFE (Meg Ryan, Russell Crowe) 56 3 NINJAS KICK BACK (Victor Wong, Max Elliott Slade) 57 SUMMER CATCH (Freddie Prinze Jr., Jessica Biel) 58 NEXT KARATE KID, THE (Pat Morita, Hilary Swank) 59 HOUSE OF FLYING DAGGERS (Takeshi Kaneshiro, Andy Lau) 60 BABE: PIG IN THE CITY (James Cromwell, Magda Szubanski) 61 FOUR FEATHERS, THE (Heath Ledger, Wes Bentley) 62 SWEET HOME ALABAMA (Reese Witherspoon, Josh Lucas) 63 JUNGLE BOOK 2, THE (John Goodman (Vocal), Haley Joel Osment (Vocal)) 152 TARGET MOVIE: VALIANT ID Movie Title 1 CHICKEN RUN (Mel Gibson (Vocal), Julia Sawalha (Vocal)) FAR FROM HOME: THE ADVENTURES OF YELLOW DOG (Mimi Rogers, Bruce 2 Davison) 3 PAULIE (Gena Rowlands, Tony Shalhoub) 4 BALTO (Kevin Bacon (Vocal), Bob Hoskins (Vocal)) 5 ANTZ (Woody Allen (Vocal), Dan Aykroyd (Vocal)) 6 FINDING NEMO (Albert Brooks, Ellen DeGeneres) 7 SHARK TALE (Will Smith, Angelina Jolie) 8 TOY STORY (Tom Hanks (Vocal), Tim Allen (Vocal)) 9 BENJI OFF THE LEASH! 
(Nick Whitaker, Nate Bynum)
10 BABE: PIG IN THE CITY (James Cromwell, Magda Szubanski)
11 STUART LITTLE 2 (Geena Davis, Hugh Laurie)
12 TIGGER MOVIE, THE (Jim Cummings (Vocal), Nikita Hopkins (Vocal))
13 I'LL DO ANYTHING (Nick Nolte, Whittni Wright)
14 HOMEWARD BOUND 2: LOST IN SAN FRANCISCO (Robert Hays, Kim Greist)
15 SEE SPOT RUN (David Arquette, Michael Clarke Duncan)
16 MAJOR PAYNE (Damon Wayans, Karyn Parsons)
17 SHREK (Mike Myers (Vocal), Eddie Murphy (Vocal))
18 ZEUS AND ROXANNE (Steve Guttenberg, Kathleen Quinlan)
19 RUGRATS GO WILD (Bruce Willis (Vocal), Chrissie Hynde (Vocal))
20 TOM & JERRY: THE MOVIE (Richard Kind (Vocal), Dana Hill (Vocal))
21 AIR BUD: GOLDEN RECEIVER (Kevin Zegers, Cynthia Stevenson)
22 AMAZING PANDA ADVENTURE, THE (Stephen Lang, Ryan Slater)
23 ANDRE (Keith Carradine, Tina Majorino)
24 WE'RE BACK: A DINOSAUR'S STORY (John Goodman (Vocal), Blaze Berdahl (Vocal))
25 OPERATION DUMBO DROP (Danny Glover, Ray Liotta)
26 MCHALE'S NAVY (Tom Arnold, Tim Curry)
27 ED (Matt LeBlanc, Jayne Brook)
28 FLIPPER (Elijah Wood, Paul Hogan)
29 COUNTRY BEARS, THE (Christopher Walken, Stephen Tobolowsky)
30 THAT DARN CAT (Christina Ricci, Doug E. Doug)
31 MONKEY TROUBLE (Thora Birch, Harvey Keitel)
32 BLACK BEAUTY (Sean Bean, David Thewlis)
33 JUNGLE BOOK (LIVE ACTION), THE (Jason Scott Lee, Cary Elwes)
34 GOOD BOY! (Molly Shannon, Liam Aiken)
35 SGT. BILKO (Steve Martin, Dan Aykroyd)
36 CROCODILE HUNTER: COLLISION COURSE. (Steve Irwin (Himself), Terri Irwin (Herself))
37 BUDDY (Rene Russo, Robbie Coltrane)
38 DUNSTON CHECKS IN (Jason Alexander, Faye Dunaway)
39 HOT SHOTS! PART DEUX (Charlie Sheen, Lloyd Bridges)
40 EYE OF THE BEHOLDER (Ewan McGregor, Ashley Judd)
41 JAKOB THE LIAR (Robin Williams, Alan Arkin)
42 WHITE FANG 2 (Scott Bairstow, Charmaine Craig)
43 BLACK HAWK DOWN (Josh Hartnett, Ewan McGregor)
44 BIG FISH (Ewan McGregor, Albert Finney)
45 STAR WARS EPISODE II - ATTACK OF THE CLONES (Ewan McGregor, Natalie Portman)
46 LIFE LESS ORDINARY, A (Ewan McGregor, Cameron Diaz)
47 DOWN WITH LOVE (Renee Zellweger, Ewan McGregor)
48 MOULIN ROUGE! (Nicole Kidman, Ewan McGregor)

TARGET MOVIE: WAR OF THE WORLDS
ID Movie Title
1 INDEPENDENCE DAY (Will Smith, Bill Pullman)
2 SKY CAPTAIN AND THE WORLD OF TOMORROW (Gwyneth Paltrow, Jude Law)
3 STARGATE (Kurt Russell, James Spader)
4 DEEP IMPACT (Robert Duvall, Tea Leoni)
5 FORTRESS (Christopher Lambert, Kurtwood Smith)
6 28 DAYS LATER (Cillian Murphy, Noah Huntley)
7 TITAN A.E. (Matt Damon (Vocal), Drew Barrymore (Vocal))
8 SPECIES 2 (Michael Madsen, Natasha Henstridge)
9 REIGN OF FIRE (Christian Bale, Matthew McConaughey)
10 VOLCANO (Tommy Lee Jones, Anne Heche)
11 GODZILLA 2000 (Takehiro Murata, Shiro Sano)
12 PUPPET MASTERS, THE (Donald Sutherland, Eric Thal)
13 GHOSTS OF MARS (Ice Cube, Natasha Henstridge)
14 FINAL FANTASY: THE SPIRITS WITHIN (Ming-Na (Vocal), Alec Baldwin (Vocal))
15 AGENT CODY BANKS 2: DESTINATION LONDON (Frankie Muniz, Anthony Anderson)
16 FIRE IN THE SKY (D. B. Sweeney, Robert Patrick)
17 ESCAPE FROM L.A. (Kurt Russell, Stacy Keach)
18 SOLARIS (George Clooney, Natascha McElhone)
19 CORE, THE (Aaron Eckhart, Hilary Swank)
20 LAWNMOWER MAN 2 (Patrick Bergin, Matt Frewer)
21 THIRTEENTH FLOOR, THE (Craig Bierko, Armin Mueller-Stahl)
22 TEARS OF THE SUN (Bruce Willis, Monica Bellucci)
23 GATTACA (Ethan Hawke, Uma Thurman)
24 TIME MACHINE, THE (Guy Pearce, Samantha Mumba)
25 WING COMMANDER (Freddie Prinze Jr., Saffron Burrows)
26 FROM DUSK TILL DAWN (Harvey Keitel, George Clooney)
27 ROBOCOP 3 (Robert John Burke, Nancy Allen)
28 STAR WARS: EPISODE I - THE PHANTOM MENACE (Liam Neeson, Ewan McGregor)
29 MINORITY REPORT (Tom Cruise, Colin Farrell)
30 POSTMAN, THE (Kevin Costner, Will Patton)
31 CHAIN REACTION (Keanu Reeves, Morgan Freeman)
32 DARK CITY (Rufus Sewell, Kiefer Sutherland)
33 TIMELINE (Paul Walker, Frances O'Connor)
34 JASON X (Lexa Doig, Lisa Ryder)
35 DAYLIGHT (Sylvester Stallone, Amy Brenneman)
36 WALKING DEAD, THE (Allen Payne, Eddie Griffin)
37 STREET FIGHTER (Jean-Claude Van Damme, Raul Julia)
38 SOLDIER (Kurt Russell, Jason Scott Lee)
39 POINT OF NO RETURN (Bridget Fonda, Gabriel Byrne)
40 BEHIND ENEMY LINES (Owen Wilson, Gene Hackman)
41 BULLETPROOF (Damon Wayans, Adam Sandler)
42 MISSION: IMPOSSIBLE II (Tom Cruise, Dougray Scott)
43 WINDTALKERS (Nicolas Cage, Adam Beach)
44 JACKIE CHAN'S FIRST STRIKE (Jackie Chan, Jackson Lou)
45 13TH WARRIOR, THE (Antonio Banderas, Diane Venora)
46 VANILLA SKY (Tom Cruise, Penelope Cruz)
47 LAST SAMURAI, THE (Tom Cruise, Ken Watanabe)
48 BARB WIRE (Pamela Anderson Lee, Temuera Morrison)
49 KISS OF THE DRAGON (Jet Li, Bridget Fonda)
50 KULL THE CONQUEROR (Kevin Sorbo, Tia Carrere)
51 BATMAN: MASK OF THE PHANTASM (Kevin Conroy (Vocal), Dana Delany (Vocal))
52 BALLISTIC: ECKS VS. SEVER (Antonio Banderas, Lucy Liu)
53 CHILL FACTOR (Cuba Gooding Jr., Skeet Ulrich)
54 THREE KINGS (George Clooney, Mark Wahlberg)
55 TROY (Brad Pitt, Orlando Bloom)
56 COLLATERAL (Tom Cruise, Jamie Foxx)
57 PROFESSIONAL, THE (Jean Reno, Gary Oldman)
58 THUNDERBIRDS (Bill Paxton, Anthony Edwards)
59 CAPTAIN CORELLI'S MANDOLIN (Nicolas Cage, Penelope Cruz)
60 JERRY MAGUIRE (Tom Cruise, Cuba Gooding Jr.)
61 TUXEDO, THE (Jackie Chan, Jennifer Love Hewitt)
62 MY FAVORITE MARTIAN (Christopher Lloyd, Jeff Daniels)
63 EYES WIDE SHUT (Tom Cruise, Nicole Kidman)
64 SPEED 2: CRUISE CONTROL (Sandra Bullock, Jason Patric)
65 DESPERADO (Antonio Banderas, Salma Hayek)
66 SNATCH (Benicio Del Toro, Dennis Farina)
67 NEXT KARATE KID, THE (Pat Morita, Hilary Swank)

TARGET MOVIE: WEDDING CRASHERS, THE
ID Movie Title
1 WEDDING SINGER, THE (Adam Sandler, Drew Barrymore)
2 DICK (Kirsten Dunst, Michelle Williams)
3 NATIONAL LAMPOON'S VAN WILDER (Ryan Reynolds, Tara Reid)
4 SWEETEST THING, THE (Cameron Diaz, Christina Applegate)
5 MEET THE PARENTS (Robert De Niro, Ben Stiller)
6 SURVIVING CHRISTMAS (Ben Affleck, James Gandolfini)
7 POOTIE TANG (Lance Crouther, Jennifer Coolidge)
8 50 FIRST DATES (Adam Sandler, Drew Barrymore)
9 MAN OF THE HOUSE (1995) (Chevy Chase, Farrah Fawcett)
10 CONNIE AND CARLA (Nia Vardalos, Toni Collette)
11 BIG BOUNCE, THE (Owen Wilson, Morgan Freeman)
12 MARCI X (Lisa Kudrow, Damon Wayans)
13 ISN'T SHE GREAT (Bette Midler, Nathan Lane)
14 STARSKY & HUTCH (Ben Stiller, Owen Wilson)
15 WHAT A GIRL WANTS (Amanda Bynes, Colin Firth)
16 ZOOLANDER (Ben Stiller, Owen Wilson)
17 DODGEBALL: A TRUE UNDERDOG STORY (Vince Vaughn, Christine Taylor)
18 LADIES MAN, THE (Tim Meadows, Karyn Parsons)
19 MCHALE'S NAVY (Tom Arnold, Tim Curry)
20 STUPIDS, THE (Tom Arnold, Jessica Lundy)
21 TAXI (Queen Latifah, Jimmy Fallon)
22 BYE BYE, LOVE (Matthew Modine, Randy Quaid)
23 PCU (Jeremy Piven, Chris Young)
24 UPTOWN GIRLS (Brittany Murphy, Dakota Fanning)
25 DUDE, WHERE'S MY CAR? (Ashton Kutcher, Seann William Scott)
26 BIG MOMMA'S HOUSE (Martin Lawrence, Nia Long)
27 LIFE LESS ORDINARY, A (Ewan McGregor, Cameron Diaz)
28 I SPY (Eddie Murphy, Owen Wilson)
29 LE DIVORCE (Kate Hudson, Naomi Watts)
30 WELCOME TO MOOSEPORT (Gene Hackman, Ray Romano)
31 JOHNSON FAMILY VACATION (Cedric The Entertainer, Vanessa Williams)
32 MURIEL'S WEDDING (Toni Collette, Bill Hunter)
33 PUNCH-DRUNK LOVE (Adam Sandler, Emily Watson)
34 I GOT THE HOOK-UP (Master P, A. J. Johnson)
35 RIDING IN CARS WITH BOYS (Drew Barrymore, Steve Zahn)
36 JOE DIRT (David Spade, Dennis Miller)
37 AUSTIN POWERS: INTERNATIONAL MAN OF MYSTERY (Mike Myers, Elizabeth Hurley)
38 JUWANNA MANN (Miguel A. Nunez Jr., Vivica A. Fox)
39 HANGING UP (Meg Ryan, Diane Keaton)
40 PUSHING TIN (John Cusack, Billy Bob Thornton)
41 SIDEWAYS (Paul Giamatti, Thomas Haden Church)
42 LIFE AQUATIC WITH STEVE ZISSOU (Bill Murray, Owen Wilson)
43 MRS. DOUBTFIRE (Robin Williams, Sally Field)
44 MALLRATS (Shannen Doherty, Jeremy London)
45 MR. NICE GUY (Jackie Chan, Richard Norton)
46 OCEAN'S ELEVEN (George Clooney, Matt Damon)
47 ROBIN HOOD: MEN IN TIGHTS (Cary Elwes, Richard Lewis)
48 FREAKY FRIDAY (Jamie Lee Curtis, Lindsay Lohan)
49 TRUTH ABOUT CHARLIE, THE (Mark Wahlberg, Thandie Newton)
50 RAISING HELEN (Kate Hudson, John Corbett)
51 FAMILY MAN, THE (Nicolas Cage, Tea Leoni)
52 NUTTY PROFESSOR, THE (Eddie Murphy, Jada Pinkett-Smith)
53 CAMP NOWHERE (Jonathan Jackson, Christopher Lloyd)
54 BLUES BROTHERS 2000 (Dan Aykroyd, John Goodman)
55 BEVERLY HILLS COP 3 (Eddie Murphy, Judge Reinhold)
56 BIG LEBOWSKI, THE (Jeff Bridges, John Goodman)
57 SANTA CLAUSE, THE (Tim Allen, Judge Reinhold)
58 ANYWHERE BUT HERE (Susan Sarandon, Natalie Portman)
59 HOUSE ARREST (Jamie Lee Curtis, Kevin Pollak)
60 LEAVE IT TO BEAVER (Christopher McDonald, Janine Turner)
61 CHOCOLAT (Juliette Binoche, Lena Olin)
62 IDLE HANDS (Devon Sawa, Seth Green)
63 JOSIE AND THE PUSSYCATS (Rachael Leigh Cook, Tara Reid)
64 WHAT'S EATING GILBERT GRAPE? (Johnny Depp, Juliette Lewis)
65 ARMY OF DARKNESS (Bruce Campbell, Embeth Davidtz)
66 RETURN TO PARADISE (Vince Vaughn, Anne Heche)
67 MATILDA (Mara Wilson, Danny DeVito)
68 ERNEST RIDES AGAIN (Jim Varney, Ron K. James)
69 HOME FOR THE HOLIDAYS (Holly Hunter, Robert Downey Jr.)
70 FIERCE CREATURES (John Cleese, Jamie Lee Curtis)
71 AFTER THE SUNSET (Pierce Brosnan, Salma Hayek)
72 PSYCHO (Vince Vaughn, Anne Heche)
73 BABY'S DAY OUT (Joe Mantegna, Lara Flynn Boyle)
74 CELL, THE (Jennifer Lopez, Vince Vaughn)
75 LIFE AS A HOUSE (Kevin Kline, Kristin Scott Thomas)

Note T2: Hierarchical Cluster Analysis using SPSS

In this section the cluster membership schedules for the 19 target movies are presented. The yellow-shaded region depicts the top robust cluster and its cluster iteration range. Where there is a “tie” in the most robust cluster, a conservative approach of taking the most inclusive cluster (larger number of cases) is used.
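The selection rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure actually run in SPSS: it assumes that the "top" cluster is cluster 1 (the cluster containing case 1), that "robust" means the membership of that cluster stays unchanged over the longest run of consecutive cluster counts, and that ties are broken in favour of the larger cluster. The function names and the toy schedule are hypothetical and are not taken from the thesis data.

```python
from itertools import groupby


def top_cluster_runs(schedule, cluster_counts):
    """Group consecutive cluster counts over which cluster 1 keeps the same members.

    schedule: dict mapping case ID -> list of cluster labels, one label per
              entry of cluster_counts (same column order as a membership schedule).
    Returns a list of (members, first_k, last_k, run_length) tuples.
    """
    members_by_k = []
    for col, k in enumerate(cluster_counts):
        # Cases labelled 1 in this column form the top cluster for k clusters.
        members = frozenset(cid for cid, labels in schedule.items() if labels[col] == 1)
        members_by_k.append((k, members))

    runs = []
    # groupby merges adjacent columns whose top-cluster membership is identical.
    for members, group in groupby(members_by_k, key=lambda pair: pair[1]):
        ks = [k for k, _ in group]
        runs.append((members, ks[0], ks[-1], len(ks)))
    return runs


def most_robust_top_cluster(schedule, cluster_counts):
    """Pick the longest-lived top cluster; on a tie, take the more inclusive one.

    ASSUMPTION: "robust" is read here as "unchanged over the most cluster counts".
    """
    runs = top_cluster_runs(schedule, cluster_counts)
    return max(runs, key=lambda run: (run[3], len(run[0])))


# Hypothetical toy schedule (NOT thesis data): 7 cases, solutions with 5 to 2 clusters.
toy_counts = [5, 4, 3, 2]
toy_schedule = {
    1: [1, 1, 1, 1],
    2: [1, 1, 1, 1],
    3: [2, 2, 1, 1],
    4: [2, 2, 1, 1],
    5: [3, 3, 2, 2],
    6: [4, 3, 2, 2],
    7: [5, 4, 3, 2],
}

members, first_k, last_k, length = most_robust_top_cluster(toy_schedule, toy_counts)
print(sorted(members), first_k, last_k, length)  # [1, 2, 3, 4] 3 2 2
```

In the toy schedule the top cluster is {1, 2} for the 5- and 4-cluster solutions and {1, 2, 3, 4} for the 3- and 2-cluster solutions, so the two candidates tie on run length and the larger one is kept, mirroring the conservative tie rule stated above.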
Also, the “ID” column here is not the same as that in the robust similarity function; rather the ID is an ordering index assigned by SPSS and refers to the case as ordered in the robust similarity function. Therefore using the “Must Love Dogs” example, movie 3 in the robust similarity function is labeled movie 1 in the cluster membership schedules.

TARGET CASE: BAD NEWS BEARS
Number of Clusters
ID 10 9 8 7 6 5 4 3 2
1 1 1 1 1 1 1 1 1 1
2 2 2 2 2 2 1 1 1 1
3 2 2 2 2 2 1 1 1 1
4 2 2 2 2 2 1 1 1 1
5 2 2 2 2 2 1 1 1 1
6 2 2 2 2 2 1 1 1 1
7 3 3 3 3 3 2 2 1 1
8 3 3 3 3 3 2 2 1 1
9 3 3 3 3 3 2 2 1 1
10 3 3 3 3 3 2 2 1 1
11 3 3 3 3 3 2 2 1 1
12 3 3 3 3 3 2 2 1 1
13 3 3 3 3 3 2 2 1 1
14 3 3 3 3 3 2 2 1 1
15 4 4 4 3 3 2 2 1 1
16 4 4 4 3 3 2 2 1 1
17 4 4 4 3 3 2 2 1 1
18 4 4 4 3 3 2 2 1 1
19 4 4 4 3 3 2 2 1 1
20 5 5 5 4 4 3 3 2 2
21 5 5 5 4 4 3 3 2 2
22 5 5 5 4 4 3 3 2 2
23 5 5 5 4 4 3 3 2 2
24 5 5 5 4 4 3 3 2 2
25 5 5 5 4 4 3 3 2 2
26 5 5 5 4 4 3 3 2 2
27 5 5 5 4 4 3 3 2 2
28 5 5 5 4 4 3 3 2 2
29 5 5 5 4 4 3 3 2 2
30 5 5 5 4 4 3 3 2 2
31 5 5 5 4 4 3 3 2 2
32 6 6 6 5 5 4 3 2 2
33 6 6 6 5 5 4 3 2 2
34 6 6 6 5 5 4 3 2 2
35 6 6 6 5 5 4 3 2 2
36 6 6 6 5 5 4 3 2 2
37 7 7 6 5 5 4 3 2 2
38 7 7 6 5 5 4 3 2 2
39 7 7 6 5 5 4 3 2 2
40 7 7 6 5 5 4 3 2 2
41 7 7 6 5 5 4 3 2 2
42 7 7 6 5 5 4 3 2 2
43 7 7 6 5 5 4 3 2 2
44 7 7 6 5 5 4 3 2 2
45 7 7 6 5 5 4 3 2 2
46 7 7 6 5 5 4 3 2 2
47 8 8 7 6 6 5 4 3 2
48 8 8 7 6 6 5 4 3 2
49 8 8 7 6 6 5 4 3 2
50 8 8 7 6 6 5 4 3 2
51 8 8 7 6 6 5 4 3 2
52 8 8 7 6 6 5 4 3 2
53 8 8 7 6 6 5 4 3 2
54 8 8 7 6 6 5 4 3 2
55 8 8 7 6 6 5 4 3 2
56 8 8 7 6 6 5 4 3 2
57 8 8 7 6 6 5 4 3 2
58 8 8 7 6 6 5 4 3 2
59 8 8 7 6 6 5 4 3 2
60 8 8 7 6 6 5 4 3 2
61 8 8 7 6 6 5 4 3 2
62 8 8 7 6 6 5 4 3 2
63 8 8 7 6 6 5 4 3 2
64 8 8 7 6 6 5 4 3 2
65 8 8 7 6 6 5 4 3 2
66 8 8 7 6 6 5 4 3 2
67 9 8 7 6 6 5 4 3 2
68 9 8 7 6 6 5 4 3 2
69 9 8 7 6 6 5 4 3 2
70 9 8 7 6 6 5 4 3 2
71 9 8 7 6 6 5 4 3 2
72 10 9 8 7 6 5 4 3 2
73 10 9 8 7 6 5 4 3 2

TARGET CASE: BEWITCHED
Number of Clusters
ID 10 9 8 7 6 5 4 3 2
1 1 1 1 1 1 1 1 1 1
2 2 2 2 2 2 2 1 1 1
3 3 2 2 2 2 2 1 1 1
4 4 3 3 3 3 3 2 2 2
5 4 3 3 3 3 3 2 2 2
6 4 3 3 3 3 3 2 2 2
7 4 3 3 3 3 3 2 2 2
8 4 3 3 3 3 3 2 2 2
9 4 3 3 3 3 3 2 2 2
10 4 3 3 3 3 3 2 2 2
11 5 4 4 3 3 3 2 2 2
12 5 4 4 3 3 3 2 2 2
13 5 4 4 3 3 3 2 2 2
14 5 4 4 3 3 3 2 2 2
15 5 4 4 3 3 3 2 2 2
16 5 4 4 3 3 3 2 2 2
17 5 4 4 3 3 3 2 2 2
18 6 5 5 4 4 4 3 2 2
19 6 5 5 4 4 4 3 2 2
20 6 5 5 4 4 4 3 2 2
21 6 5 5 4 4 4 3 2 2
22 6 5 5 4 4 4 3 2 2
23 6 5 5 4 4 4 3 2 2
24 6 5 5 4 4 4 3 2 2
25 6 5 5 4 4 4 3 2 2
26 6 5 5 4 4 4 3 2 2
27 6 5 5 4 4 4 3 2 2
28 6 5 5 4 4 4 3 2 2
29 6 5 5 4 4 4 3 2 2
30 6 5 5 4 4 4 3 2 2
31 6 5 5 4 4 4 3 2 2
32 6 5 5 4 4 4 3 2 2
33 6 5 5 4 4 4 3 2 2
34 6 5 5 4 4 4 3 2 2
35 6 5 5 4 4 4 3 2 2
36 6 5 5 4 4 4 3 2 2
37 6 5 5 4 4 4 3 2 2
38 6 5 5 4 4 4 3 2 2
39 6 5 5 4 4 4 3 2 2
40 6 5 5 4 4 4 3 2 2
41 6 5 5 4 4 4 3 2 2
42 7 6 6 5 5 4 3 2 2
43 7 6 6 5 5 4 3 2 2
44 7 6 6 5 5 4 3 2 2
45 7 6 6 5 5 4 3 2 2
46 7 6 6 5 5 4 3 2 2
47 7 6 6 5 5 4 3 2 2
48 7 6 6 5 5 4 3 2 2
49 7 6 6 5 5 4 3 2 2
50 8 7 6 5 5 4 3 2 2
51 8 7 6 5 5 4 3 2 2
52 8 7 6 5 5 4 3 2 2
53 8 7 6 5 5 4 3 2 2
54 9 8 7 6 6 5 4 3 2
55 9 8 7 6 6 5 4 3 2
56 9 8 7 6 6 5 4 3 2
57 9 8 7 6 6 5 4 3 2
58 9 8 7 6 6 5 4 3 2
59 9 8 7 6 6 5 4 3 2
60 9 8 7 6 6 5 4 3 2
61 9 8 7 6 6 5 4 3 2
62 9 8 7 6 6 5 4 3 2
63 10 9 8 7 6 5 4 3 2

TARGET CASE: BROTHERS GRIMM, THE
Number of Clusters
ID 10 9 8 7 6 5 4 3 2
1 1 1 1 1 1 1 1 1 1
2 2 2 2 2 2 2 1 1 1
3 2 2 2 2 2 2 1 1 1
4 2 2 2 2 2 2 1 1 1
5 2 2 2 2 2 2 1 1 1
6 2 2 2 2 2 2 1 1 1
7 2 2 2 2 2 2 1 1 1
8 2 2 2 2 2 2 1 1 1
9 2 2 2 2 2 2 1 1 1
10 2 2 2 2 2 2 1 1 1
11 3 3 3 2 2 2 1 1 1
12 3 3 3 2 2 2 1 1 1
13 3 3 3 2 2 2 1 1 1
14 3 3 3 2 2 2 1 1 1
15 3 3 3 2 2 2 1 1 1
16 4 4 4 3 3 3 2 2 1
17 4 4 4 3 3 3 2 2 1
18 4 4 4 3 3 3 2 2 1
19 4 4 4 3 3 3 2 2 1
20 4 4 4 3 3 3 2 2 1
21 4 4 4 3 3 3 2 2 1
22 4 4 4 3 3 3 2 2 1
23 4 4 4 3 3 3 2 2 1
24 4 4 4 3 3 3 2 2 1
25 4 4 4 3 3 3 2 2 1
26 4 4 4 3 3 3 2 2 1
27 4 4 4 3 3 3 2 2 1
28 4 4 4 3 3 3 2 2 1
29 4 4 4 3 3 3 2 2 1
30 4 4 4 3 3 3 2 2 1
31 5 5 5 4 4 3 2 2 1
32 5 5 5 4 4 3 2 2 1
33 5 5 5 4 4 3 2 2 1
34 5 5 5 4 4 3 2 2 1
35 5 5 5 4 4 3 2 2 1
36 6 5 5 4 4 3 2 2 1
37 6 5 5 4 4 3 2 2 1
38 6 5 5 4 4 3 2 2 1
39 6 5 5 4 4 3 2 2 1
40 6 5 5 4 4 3 2 2 1
41 6 5 5 4 4 3 2 2 1
42 6 5 5 4 4 3 2 2 1
43 6 5 5 4 4 3 2 2 1
44 6 5 5 4 4 3 2 2 1
45 6 5 5 4 4 3 2 2 1
46 7 6 6 5 5 4 3 2 1
47 7 6 6 5 5 4 3 2 1
48 7 6 6 5 5 4 3 2 1
49 7 6 6 5 5 4 3 2 1
50 7 6 6 5 5 4 3 2 1
51 7 6 6 5 5 4 3 2 1
52 8 7 7 6 6 5 4 3 2
53 8 7 7 6 6 5 4 3 2
54 8 7 7 6 6 5 4 3 2
55 8 7 7 6 6 5 4 3 2
56 9 8 8 7 6 5 4 3 2
57 9 8 8 7 6 5 4 3 2
58 9 8 8 7 6 5 4 3 2
59 9 8 8 7 6 5 4 3 2
60 10 9 8 7 6 5 4 3 2
61 10 9 8 7 6 5 4 3 2

TARGET CASE: CHARLIE AND THE CHOCOLATE FACTORY
Number of Clusters
ID 10 9 8 7 6 5 4 3 2
1 1 1 1 1 1 1 1 1 1
2 2 1 1 1 1 1 1 1 1
3 2 1 1 1 1 1 1 1 1
4 2 1 1 1 1 1 1 1 1
5 2 1 1 1 1 1 1 1 1
6 2 1 1 1 1 1 1 1 1
7 3 2 2 2 2 2 2 2 1
8 3 2 2 2 2 2 2 2 1
9 3 2 2 2 2 2 2 2 1
10 3 2 2 2 2 2 2 2 1
11 3 2 2 2 2 2 2 2 1
12 3 2 2 2 2 2 2 2 1
13 3 2 2 2 2 2 2 2 1
14 3 2 2 2 2 2 2 2 1
15 3 2 2 2 2 2 2 2 1
16 3 2 2 2 2 2 2 2 1
17 3 2 2 2 2 2 2 2 1
18 4 3 3 3 3 2 2 2 1
19 4 3 3 3 3 2 2 2 1
20 4 3 3 3 3 2 2 2 1
21 4 3 3 3 3 2 2 2 1
22 4 3 3 3 3 2 2 2 1
23 4 3 3 3 3 2 2 2 1
24 4 3 3 3 3 2 2 2 1
25 5 4 4 3 3 2 2 2 1
26 5 4 4 3 3 2 2 2 1
27 5 4 4 3 3 2 2 2 1
28 5 4 4 3 3 2 2 2 1
29 5 4 4 3 3 2 2 2 1
30 6 5 5 4 4 3 3 2 1
31 6 5 5 4 4 3 3 2 1
32 6 5 5 4 4 3 3 2 1
33 6 5 5 4 4 3 3 2 1
34 6 5 5 4 4 3 3 2 1
35 6 5 5 4 4 3 3 2 1
36 6 5 5 4 4 3 3 2 1
37 6 5 5 4 4 3 3 2 1
38 6 5 5 4 4 3 3 2 1
39 6 5 5 4 4 3 3 2 1
40 6 5 5 4 4 3 3 2 1
41 6 5 5 4 4 3 3 2 1
42 7 6 6 5 4 3 3 2 1
43 8 7 7 6 5 4 4 3 2
44 8 7 7 6 5 4 4 3 2
45 9 8 7 6 5 4 4 3 2
46 9 8 7 6 5 4 4 3 2
47 9 8 7 6 5 4 4 3 2
48 10 9 8 7 6 5 4 3 2
49 10 9 8 7 6 5 4 3 2
50 10 9 8 7 6 5 4 3 2

TARGET CASE: DARK WATER
Number of Clusters
ID 10 9 8 7 6 5 4 3 2
1 1 1 1 1 1 1 1 1 1
2 1 1 1 1 1 1 1 1 1
3 1 1 1 1 1 1 1 1 1
4 1 1 1 1 1 1 1 1 1
5 2 2 2 2 2 1 1 1 1
6 2 2 2 2 2 1 1 1 1
7 2 2 2 2 2 1 1 1 1
8 2 2 2 2 2 1 1 1 1
9 2 2 2 2 2 1 1 1 1
10 3 3 3 3 3 2 2 2 1
11 3 3 3 3 3 2 2 2 1
12 3 3 3 3 3 2 2 2 1
13 3 3 3 3 3 2 2 2 1
14 3 3 3 3 3 2 2 2 1
15 3 3 3 3 3 2 2 2 1
16 3 3 3 3 3 2 2 2 1
17 3 3 3 3 3 2 2 2 1
18 3 3 3 3 3 2 2 2 1
19 3 3 3 3 3 2 2 2 1
20 3 3 3 3 3 2 2 2 1
21 3 3 3 3 3 2 2 2 1
22 3 3 3 3 3 2 2 2 1
23 3 3 3 3 3 2 2 2 1
24 4 4 4 4 3 2 2 2 1
25 4 4 4 4 3 2 2 2 1
26 4 4 4 4 3 2 2 2 1
27 4 4 4 4 3 2 2 2 1
28 4 4 4 4 3 2 2 2 1
29 4 4 4 4 3 2 2 2 1
30 5 5 5 5 4 3 3 2 1
31 5 5 5 5 4 3 3 2 1
32 5 5 5 5 4 3 3 2 1
33 5 5 5 5 4 3 3 2 1
34 5 5 5 5 4 3 3 2 1
35 5 5 5 5 4 3 3 2 1
36 5 5 5 5 4 3 3 2 1
37 5 5 5 5 4 3 3 2 1
38 5 5 5 5 4 3 3 2 1
39 5 5 5 5 4 3 3 2 1
40 5 5 5 5 4 3 3 2 1
41 6 6 6 5 4 3 3 2 1
42 6 6 6 5 4 3 3 2 1
43 6 6 6 5 4 3 3 2 1
44 6 6 6 5 4 3 3 2 1
45 6 6 6 5 4 3 3 2 1
46 6 6 6 5 4 3 3 2 1
47 6 6 6 5 4 3 3 2 1
48 7 7 7 6 5 4 4 3 2
49 7 7 7 6 5 4 4 3 2
50 7 7 7 6 5 4 4 3 2
51 7 7 7 6 5 4 4 3 2
52 7 7 7 6 5 4 4 3 2
53 8 7 7 6 5 4 4 3 2
54 8 7 7 6 5 4 4 3 2
55 8 7 7 6 5 4 4 3 2
56 8 7 7 6 5 4 4 3 2
57 8 7 7 6 5 4 4 3 2
58 8 7 7 6 5 4 4 3 2
59 9 8 8 7 6 5 4 3 2
60 9 8 8 7 6 5 4 3 2
61 9 8 8 7 6 5 4 3 2
62 9 8 8 7 6 5 4 3 2
63 9 8 8 7 6 5 4 3 2
64 9 8 8 7 6 5 4 3 2
65 9 8 8 7 6 5 4 3 2
66 10 9 8 7 6 5 4 3 2
67 10 9 8 7 6 5 4 3 2

TARGET CASE: DEUCE BIGALOW: EUROPEAN GIGOLO
Number of Clusters
ID 10 9 8 7 6 5 4 3 2
1 1 1 1 1 1 1 1 1 1
2 2 2 2 2 2 2 2 2 2
3 2 2 2 2 2 2 2 2 2
4 2 2 2 2 2 2 2 2 2
5 2 2 2 2 2 2 2 2 2
6 2 2 2 2 2 2 2 2 2
7 3 3 3 3 2 2 2 2 2
8 3 3 3 3 2 2 2 2 2
9 3 3 3 3 2 2 2 2 2
10 3 3 3 3 2 2 2 2 2
11 3 3 3 3 2 2 2 2 2
12 3 3 3 3 2 2 2 2 2
13 3 3 3 3 2 2 2 2 2
14 4 4 4 4 3 3 3 2 2
15 4 4 4 4 3 3 3 2 2
16 4 4 4 4 3 3 3 2 2
17 5 4 4 4 3 3 3 2 2
18 5 4 4 4 3 3 3 2 2
19 5 4 4 4 3 3 3 2 2
20 5 4 4 4 3 3 3 2 2
21 5 4 4 4 3 3 3 2 2
22 5 4 4 4 3 3 3 2 2
23 6 5 5 5 4 4 3 2 2
24 6 5 5 5 4 4 3 2 2
25 6 5 5 5 4 4 3 2 2
26 6 5 5 5 4 4 3 2 2
27 6 5 5 5 4 4 3 2 2
28 6 5 5 5 4 4 3 2 2
29 6 5 5 5 4 4 3 2 2
30 6 5 5 5 4 4 3 2 2
31 6 5 5 5 4 4 3 2 2
32 6 5 5 5 4 4 3 2 2
33 6 5 5 5 4 4 3 2 2
34 6 5 5 5 4 4 3 2 2
35 7 6 6 5 4 4 3 2 2
36 7 6 6 5 4 4 3 2 2
37 7 6 6 5 4 4 3 2 2
38 7 6 6 5 4 4 3 2 2
39 7 6 6 5 4 4 3 2 2
40 7 6 6 5 4 4 3 2 2
41 7 6 6 5 4 4 3 2 2
42 7 6 6 5 4 4 3 2 2
43 8 7 7 6 5 5 4 3 2
44 8 7 7 6 5 5 4 3 2
45 8 7 7 6 5 5 4 3 2
46 8 7 7 6 5 5 4 3 2
47 8 7 7 6 5 5 4 3 2
48 8 7 7 6 5 5 4 3 2
49 8 7 7 6 5 5 4 3 2
50 9 8 8 7 6 5 4 3 2
51 9 8 8 7 6 5 4 3 2
52 9 8 8 7 6 5 4 3 2
53 9 8 8 7 6 5 4 3 2
54 9 8 8 7 6 5 4 3 2
55 9 8 8 7 6 5 4 3 2
56 10 9 8 7 6 5 4 3 2
57 10 9 8 7 6 5 4 3 2
58 10 9 8 7 6 5 4 3 2
59 10 9 8 7 6 5 4 3 2
60 10 9 8 7 6 5 4 3 2

TARGET CASE: DUKES OF HAZZARD, THE
Number of Clusters
ID 10 9 8 7 6 5 4 3 2
1 1 1 1 1 1 1 1 1 1
2 1 1 1 1 1 1 1 1 1
3 1 1 1 1 1 1 1 1 1
4 2 2 2 2 2 2 2 1 1
5 2 2 2 2 2 2 2 1 1
6 2 2 2 2 2 2 2 1 1
7 2 2 2 2 2 2 2 1 1
8 2 2 2 2 2 2 2 1 1
9 2 2 2 2 2 2 2 1 1
10 2 2 2 2 2 2 2 1 1
11 2 2 2 2 2 2 2 1 1
12 2 2 2 2 2 2 2 1 1
13 2 2 2 2 2 2 2 1 1
14 2 2 2 2 2 2 2 1 1
15 2 2 2 2 2 2 2 1 1
16 2 2 2 2 2 2 2 1 1
17 2 2 2 2 2 2 2 1 1
18 2 2 2 2 2 2 2 1 1
19 2 2 2 2 2 2 2 1 1
20 3 3 3 2 2 2 2 1 1
21 3 3 3 2 2 2 2 1 1
22 3 3 3 2 2 2 2 1 1
23 3 3 3 2 2 2 2 1 1
24 3 3 3 2 2 2 2 1 1
25 3 3 3 2 2 2 2 1 1
26 3 3 3 2 2 2 2 1 1
27 3 3 3 2 2 2 2 1 1
28 3 3 3 2 2 2 2 1 1
29 3 3 3 2 2 2 2 1 1
30 4 3 3 2 2 2 2 1 1
31 4 3 3 2 2 2 2 1 1
32 5 4 4 3 3 3 3 2 2
33 5 4 4 3 3 3 3 2 2
34 5 4 4 3 3 3 3 2 2
35 5 4 4 3 3 3 3 2 2
36 5 4 4 3 3 3 3 2 2
37 5 4 4 3 3 3 3 2 2
38 5 4 4 3 3 3 3 2 2
39 5 4 4 3 3 3 3 2 2
40 5 4 4 3 3 3 3 2 2
41 5 4 4 3 3 3 3 2 2
42 5 4 4 3 3 3 3 2 2
43 5 4 4 3 3 3 3 2 2
44 6 5 5 4 4 3 3 2 2
45 6 5 5 4 4 3 3 2 2
46 6 5 5 4 4 3 3 2 2
47 6 5 5 4 4 3 3 2 2
48 6 5 5 4 4 3 3 2 2
49 6 5 5 4 4 3 3 2 2
50 7 6 5 4 4 3 3 2 2
51 7 6 5 4 4 3 3 2 2
52 7 6 5 4 4 3 3 2 2
53 7 6 5 4 4 3 3 2 2
54 7 6 5 4 4 3 3 2 2
55 7 6 5 4 4 3 3 2 2
56 7 6 5 4 4 3 3 2 2
57 8 7 6 5 5 4 4 3 2
58 8 7 6 5 5 4 4 3 2
59 8 7 6 5 5 4 4 3 2
60 9 8 7 6 5 4 4 3 2
61 9 8 7 6 5 4 4 3 2
62 9 8 7 6 5 4 4 3 2
63 9 8 7 6 5 4 4 3 2
64 9 8 7 6 5 4 4 3 2
65 9 8 7 6 5 4 4 3 2
66 9 8 7 6 5 4 4 3 2
67 9 8 7 6 5 4 4 3 2
68 9 8 7 6 5 4 4 3 2
69 9 8 7 6 5 4 4 3 2
70 9 8 7 6 5 4 4 3 2
71 9 8 7 6 5 4 4 3 2
72 9 8 7 6 5 4 4 3 2
73 9 8 7 6 5 4 4 3 2
74 10 9 8 7 6 5 4 3 2

TARGET CASE: FANTASTIC FOUR
Number of Clusters
ID 10 9 8 7 6 5 4 3 2
1 1 1 1 1 1 1 1 1 1
2 1 1 1 1 1 1 1 1 1
3 1 1 1 1 1 1 1 1 1
4 1 1 1 1 1 1 1 1 1
5 1 1 1 1 1 1 1 1 1
6 1 1 1 1 1 1 1 1 1
7 1 1 1 1 1 1 1 1 1
8 1 1 1 1 1 1 1 1 1
9 1 1 1 1 1 1 1 1 1
10 1 1 1 1 1 1 1 1 1
11 1 1 1 1 1 1 1 1 1
12 1 1 1 1 1 1 1 1 1
13 2 2 2 2 2 2 1 1 1
14 2 2 2 2 2 2 1 1 1
15 2 2 2 2 2 2 1 1 1
16 2 2 2 2 2 2 1 1 1
17 2 2 2 2 2 2 1 1 1
18 2 2 2 2 2 2 1 1 1
19 2 2 2 2 2 2 1 1 1
20 2 2 2 2 2 2 1 1 1
21 2 2 2 2 2 2 1 1 1
22 3 3 3 2 2 2 1 1 1
23 3 3 3 2 2 2 1 1 1
24 3 3 3 2 2 2 1 1 1
25 3 3 3 2 2 2 1 1 1
26 3 3 3 2 2 2 1 1 1
27 3 3 3 2 2 2 1 1 1
28 3 3 3 2 2 2 1 1 1
29 4 4 4 3 3 3 2 2 1
30 4 4 4 3 3 3 2 2 1
31 4 4 4 3 3 3 2 2 1
32 4 4 4 3 3 3 2 2 1
33 4 4 4 3 3 3 2 2 1
34 4 4 4 3 3 3 2 2 1
35 4 4 4 3 3 3 2 2 1
36 4 4 4 3 3 3 2 2 1
37 4 4 4 3 3 3 2 2 1
38 4 4 4 3 3 3 2 2 1
39 4 4 4 3 3 3 2 2 1
40 4 4 4 3 3 3 2 2 1
41 4 4 4 3 3 3 2 2 1
42 4 4 4 3 3 3 2 2 1
43 4 4 4 3 3 3 2 2 1
44 4 4 4 3 3 3 2 2 1
45 5 5 5 4 3 3 2 2 1
46 5 5 5 4 3 3 2 2 1
47 5 5 5 4 3 3 2 2 1
48 5 5 5 4 3 3 2 2 1
49 5 5 5 4 3 3 2 2 1
50 5 5 5 4 3 3 2 2 1
51 5 5 5 4 3 3 2 2 1
52 5 5 5 4 3 3 2 2 1
53 5 5 5 4 3 3 2 2 1
54 5 5 5 4 3 3 2 2 1
55 6 6 5 4 3 3 2 2 1
56 6 6 5 4 3 3 2 2 1
57 6 6 5 4 3 3 2 2 1
58 6 6 5 4 3 3 2 2 1
59 7 7 6 5 4 4 3 3 2
60 7 7 6 5 4 4 3 3 2
61 8 7 6 5 4 4 3 3 2
62 8 7 6 5 4 4 3 3 2
63 8 7 6 5 4 4 3 3 2
64 8 7 6 5 4 4 3 3 2
65 8 7 6 5 4 4 3 3 2
66 8 7 6 5 4 4 3 3 2
67 9 8 7 6 5 5 4 3 2
68 9 8 7 6 5 5 4 3 2
69 9 8 7 6 5 5 4 3 2
70 9 8 7 6 5 5 4 3 2
71 10 9 8 7 6 5 4 3 2

TARGET CASE: FOUR BROTHERS
Number of Clusters
ID 10 9 8 7 6 5 4 3 2
1 1 1 1 1 1 1 1 1 1
2 1 1 1 1 1 1 1 1 1
3 2 2 2 1 1 1 1 1 1
4 2 2 2 1 1 1 1 1 1
5 2 2 2 1 1 1 1 1 1
6 2 2 2 1 1 1 1 1 1
7 3 3 3 2 2 1 1 1 1
8 3 3 3 2 2 1 1 1 1
9 3 3 3 2 2 1 1 1 1
10 3 3 3 2 2 1 1 1 1
11 4 4 4 3 3 2 2 2 2
12 4 4 4 3 3 2 2 2 2
13 4 4 4 3 3 2 2 2 2
14 4 4 4 3 3 2 2 2 2
15 4 4 4 3 3 2 2 2 2
16 4 4 4 3 3 2 2 2 2
17 4 4 4 3 3 2 2 2 2
18 4 4 4 3 3 2 2 2 2
19 4 4 4 3 3 2 2 2 2
20 4 4 4 3 3 2 2 2 2
21 4 4 4 3 3 2 2 2 2
22 4 4 4 3 3 2 2 2 2
23 4 4 4 3 3 2 2 2 2
24 4 4 4 3 3 2 2 2 2
25 5 5 5 4 3 2 2 2 2
26 5 5 5 4 3 2 2 2 2
27 5 5 5 4 3 2 2 2 2
28 5 5 5 4 3 2 2 2 2
29 5 5 5 4 3 2 2 2 2
30 5 5 5 4 3 2 2 2 2
31 5 5 5 4 3 2 2 2 2
32 5 5 5 4 3 2 2 2 2
33 5 5 5 4 3 2 2 2 2
34 5 5 5 4 3 2 2 2 2
35 5 5 5 4 3 2 2 2 2
36 5 5 5 4 3 2 2 2 2
37 5 5 5 4 3 2 2 2 2
38 5 5 5 4 3 2 2 2 2
39 5 5 5 4 3 2 2 2 2
40 5 5 5 4 3 2 2 2 2
41 5 5 5 4 3 2 2 2 2
42 5 5 5 4 3 2 2 2 2
43 5 5 5 4 3 2 2 2 2
44 5 5 5 4 3 2 2 2 2
45 5 5 5 4 3 2 2 2 2
46 5 5 5 4 3 2 2 2 2
47 5 5 5 4 3 2 2 2 2
48 5 5 5 4 3 2 2 2 2
49 5 5 5 4 3 2 2 2 2
50 5 5 5 4 3 2 2 2 2
51 5 5 5 4 3 2 2 2 2
52 5 5 5 4 3 2 2 2 2
53 5 5 5 4 3 2 2 2 2
54 6 6 6 5 4 3 3 3 2
55 6 6 6 5 4 3 3 3 2
56 6 6 6 5 4 3 3 3 2
57 6 6 6 5 4 3 3 3 2
58 6 6 6 5 4 3 3 3 2
59 6 6 6 5 4 3 3 3 2
60 6 6 6 5 4 3 3 3 2
61 7 7 7 6 5 4 3 3 2
62 7 7 7 6 5 4 3 3 2
63 7 7 7 6 5 4 3 3 2
64 7 7 7 6 5 4 3 3 2
65 7 7 7 6 5 4 3 3 2
66 7 7 7 6 5 4 3 3 2
67 7 7 7 6 5 4 3 3 2
68 7 7 7 6 5 4 3 3 2
69 8 8 7 6 5 4 3 3 2
70 8 8 7 6 5 4 3 3 2
71 8 8 7 6 5 4 3 3 2
72 8 8 7 6 5 4 3 3 2
73 8 8 7 6 5 4 3 3 2
74 8 8 7 6 5 4 3 3 2
75 9 8 7 6 5 4 3 3 2
76 10 9 8 7 6 5 4 3 2

TARGET CASE: ISLAND, THE
Number of Clusters
ID 10 9 8 7 6 5 4 3 2
1 1 1 1 1 1 1 1 1 1
2 1 1 1 1 1 1 1 1 1
3 1 1 1 1 1 1 1 1 1
4 1 1 1 1 1 1 1 1 1
5 1 1 1 1 1 1 1 1 1
6 2 2 1 1 1 1 1 1 1
7 3 3 2 2 2 2 2 1 1
8 3 3 2 2 2 2 2 1 1
9 3 3 2 2 2 2 2 1 1
10 3 3 2 2 2 2 2 1 1
11 3 3 2 2 2 2 2 1 1
12 3 3 2 2 2 2 2 1 1
13 3 3 2 2 2 2 2 1 1
14 3 3 2 2 2 2 2 1 1
15 3 3 2 2 2 2 2 1 1
16 3 3 2 2 2 2 2 1 1
17 3 3 2 2 2 2 2 1 1
18 3 3 2 2 2 2 2 1 1
19 3 3 2 2 2 2 2 1 1
20 4 4 3 3 3 3 2 1 1
21 4 4 3 3 3 3 2 1 1
22 4 4 3 3 3 3 2 1 1
23 4 4 3 3 3 3 2 1 1
24 4 4 3 3 3 3 2 1 1
25 4 4 3 3 3 3 2 1 1
26 4 4 3 3 3 3 2 1 1
27 4 4 3 3 3 3 2 1 1
28 4 4 3 3 3 3 2 1 1
29 4 4 3 3 3 3 2 1 1
30 5 5 4 3 3 3 2 1 1
31 5 5 4 3 3 3 2 1 1
32 5 5 4 3 3 3 2 1 1
33 5 5 4 3 3 3 2 1 1
34 5 5 4 3 3 3 2 1 1
35 5 5 4 3 3 3 2 1 1
36 5 5 4 3 3 3 2 1 1
37 5 5 4 3 3 3 2 1 1
38 5 5 4 3 3 3 2 1 1
39 5 5 4 3 3 3 2 1 1
40 6 6 5 4 4 4 3 2 2
41 6 6 5 4 4 4 3 2 2
42 6 6 5 4 4 4 3 2 2
43 7 6 5 4 4 4 3 2 2
44 7 6 5 4 4 4 3 2 2
45 7 6 5 4 4 4 3 2 2
46 7 6 5 4 4 4 3 2 2
47 7 6 5 4 4 4 3 2 2
48 7 6 5 4 4 4 3 2 2
49 7 6 5 4 4 4 3 2 2
50 8 7 6 5 5 4 3 2 2
51 8 7 6 5 5 4 3 2 2
52 8 7 6 5 5 4 3 2 2
53 8 7 6 5 5 4 3 2 2
54 8 7 6 5 5 4 3 2 2
55 8 7 6 5 5 4 3 2 2
56 8 7 6 5 5 4 3 2 2
57 9 8 7 6 6 5 4 3 2
58 9 8 7 6 6 5 4 3 2
59 9 8 7 6 6 5 4 3 2
60 9 8 7 6 6 5 4 3 2
61 10 9 8 7 6 5 4 3 2
62 10 9 8 7 6 5 4 3 2
63 10 9 8 7 6 5 4 3 2

TARGET CASE: MUST LOVE DOGS
Number of Clusters
ID 10 9 8 7 6 5 4 3 2
1 1 1 1 1 1 1 1 1 1
2 1 1 1 1 1 1 1 1 1
3 2 1 1 1 1 1 1 1 1
4 2 1 1 1 1 1 1 1 1
5 2 1 1 1 1 1 1 1 1
6 3 2 2 2 2 2 2 2 1
7 3 2 2 2 2 2 2 2 1
8 3 2 2 2 2 2 2 2 1
9 3 2 2 2 2 2 2 2 1
10 4 3 3 2 2 2 2 2 1
11 4 3 3 2 2 2 2 2 1
12 4 3 3 2 2 2 2 2 1
13 4 3 3 2 2 2 2 2 1
14 4 3 3 2 2 2 2 2 1
15 4 3 3 2 2 2 2 2 1
16 4 3 3 2 2 2 2 2 1
17 4 3 3 2 2 2 2 2 1
18 4 3 3 2 2 2 2 2 1
19 4 3 3 2 2 2 2 2 1
20 4 3 3 2 2 2 2 2 1
21 4 3 3 2 2 2 2 2 1
22 4 3 3 2 2 2 2 2 1
23 4 3 3 2 2 2 2 2 1
24 4 3 3 2 2 2 2 2 1
25 4 3 3 2 2 2 2 2 1
26 5 4 4 3 3 3 2 2 1
27 5 4 4 3 3 3 2 2 1
28 5 4 4 3 3 3 2 2 1
29 5 4 4 3 3 3 2 2 1
30 5 4 4 3 3 3 2 2 1
31 5 4 4 3 3 3 2 2 1
32 5 4 4 3 3 3 2 2 1
33 5 4 4 3 3 3 2 2 1
34 5 4 4 3 3 3 2 2 1
35 5 4 4 3 3 3 2 2 1
36 5 4 4 3 3 3 2 2 1
37 5 4 4 3 3 3 2 2 1
38 6 5 5 4 4 3 2 2 1
39 6 5 5 4 4 3 2 2 1
40 6 5 5 4 4 3 2 2 1
41 6 5 5 4 4 3 2 2 1
42 6 5 5 4 4 3 2 2 1
43 6 5 5 4 4 3 2 2 1
44 6 5 5 4 4 3 2 2 1
45 6 5 5 4 4 3 2 2 1
46 6 5 5 4 4 3 2 2 1
47 6 5 5 4 4 3 2 2 1
48 6 5 5 4 4 3 2 2 1
49 7 6 6 5 5 4 3 3 2
50 7 6 6 5 5 4 3 3 2
51 7 6 6 5 5 4 3 3 2
52 7 6 6 5 5 4 3 3 2
53 8 7 6 5 5 4 3 3 2
54 8 7 6 5 5 4 3 3 2
55 8 7 6 5 5 4 3 3 2
56 8 7 6 5 5 4 3 3 2
57 8 7 6 5 5 4 3 3 2
58 9 8 7 6 6 5 4 3 2
59 9 8 7 6 6 5 4 3 2
60 9 8 7 6 6 5 4 3 2
61 9 8 7 6 6 5 4 3 2
62 10 9 8 7 6 5 4 3 2
63 10 9 8 7 6 5 4 3 2
64 10 9 8 7 6 5 4 3 2
65 10 9 8 7 6 5 4 3 2

TARGET CASE: REBOUND
Number of Clusters
ID 10 9 8 7 6 5 4 3 2
1 1 1 1 1 1 1 1 1 1
2 1 1 1 1 1 1 1 1 1
3 1 1 1 1 1 1 1 1 1
4 1 1 1 1 1 1 1 1 1
5 2 2 2 2 2 2 2 2 1
6 2 2 2 2 2 2 2 2 1
7 3 2 2 2 2 2 2 2 1
8 3 2 2 2 2 2 2 2 1
9 3 2 2 2 2 2 2 2 1
10 4 3 3 3 2 2 2 2 1
11 4 3 3 3 2 2 2 2 1
12 4 3 3 3 2 2 2 2 1
13 4 3 3 3 2 2 2 2 1
14 4 3 3 3 2 2 2 2 1
15 4 3 3 3 2 2 2 2 1
16 5 4 4 4 3 3 2 2 1
17 5 4 4 4 3 3 2 2 1
18 5 4 4 4 3 3 2 2 1
19 5 4 4 4 3 3 2 2 1
20 5 4 4 4 3 3 2 2 1
21 5 4 4 4 3 3 2 2 1
22 5 4 4 4 3 3 2 2 1
23 5 4 4 4 3 3 2 2 1
24 5 4 4 4 3 3 2 2 1
25 5 4 4 4 3 3 2 2 1
26 5 4 4 4 3 3 2 2 1
27 6 5 5 5 4 4 3 3 2
28 6 5 5 5 4 4 3 3 2
29 6 5 5 5 4 4 3 3 2
30 6 5 5 5 4 4 3 3 2
31 6 5 5 5 4 4 3 3 2
32 6 5 5 5 4 4 3 3 2
33 7 6 6 6 5 4 3 3 2
34 7 6 6 6 5 4 3 3 2
35 7 6 6 6 5 4 3 3 2
36 7 6 6 6 5 4 3 3 2
37 7 6 6 6 5 4 3 3 2
38 7 6 6 6 5 4 3 3 2
39 7 6 6 6 5 4 3 3 2
40 7 6 6 6 5 4 3 3 2
41 8 7 6 6 5 4 3 3 2
42 8 7 6 6 5 4 3 3 2
43 8 7 6 6 5 4 3 3 2
44 8 7 6 6 5 4 3 3 2
45 8 7 6 6 5 4 3 3 2
46 8 7 6 6 5 4 3 3 2
47 8 7 6 6 5 4 3 3 2
48 8 7 6 6 5 4 3 3 2
49 8 7 6 6 5 4 3 3 2
50 8 7 6 6 5 4 3 3 2
51 8 7 6 6 5 4 3 3 2
52 8 7 6 6 5 4 3 3 2
53 8 7 6 6 5 4 3 3 2
54 8 7 6 6 5 4 3 3 2
55 9 8 7 7 6 5 4 3 2
56 9 8 7 7 6 5 4 3 2
57 9 8 7 7 6 5 4 3 2
58 9 8 7 7 6 5 4 3 2
59 9 8 7 7 6 5 4 3 2
60 9 8 7 7 6 5 4 3 2
61 9 8 7 7 6 5 4 3 2
62 10 9 8 7 6 5 4 3 2
63 10 9 8 7 6 5 4 3 2
64 10 9 8 7 6 5 4 3 2
65 10 9 8 7 6 5 4 3 2

TARGET CASE: RED EYE
Number of Clusters
ID 10 9 8 7 6 5 4 3 2
1 1 1 1 1 1 1 1 1 1
2 1 1 1 1 1 1 1 1 1
3 1 1 1 1 1 1 1 1 1
4 2 2 2 2 1 1 1 1 1
5 2 2 2 2 1 1 1 1 1
6 2 2 2 2 1 1 1 1 1
7 2 2 2 2 1 1 1 1 1
8 2 2 2 2 1 1 1 1 1
9 2 2 2 2 1 1 1 1 1
10 3 3 3 3 2 2 2 1 1
11 3 3 3 3 2 2 2 1 1
12 3 3 3 3 2 2 2 1 1
13 3 3 3 3 2 2 2 1 1
14 3 3 3 3 2 2 2 1 1
15 3 3 3 3 2 2 2 1 1
16 3 3 3 3 2 2 2 1 1
17 3 3 3 3 2 2 2 1 1
18 3 3 3 3 2 2 2 1 1
19 3 3 3 3 2 2 2 1 1
20 3 3 3 3 2 2 2 1 1
21 3 3 3 3 2 2 2 1 1
22 4 4 4 4 3 2 2 1 1
23 4 4 4 4 3 2 2 1 1
24 4 4 4 4 3 2 2 1 1
25 4 4 4 4 3 2 2 1 1
26 4 4 4 4 3 2 2 1 1
27 4 4 4 4 3 2 2 1 1
28 4 4 4 4 3 2 2 1 1
29 4 4 4 4 3 2 2 1 1
30 4 4 4 4 3 2 2 1 1
31 4 4 4 4 3 2 2 1 1
32 4 4 4 4 3 2 2 1 1
33 5 4 4 4 3 2 2 1 1
34 5 4 4 4 3 2 2 1 1
35 5 4 4 4 3 2 2 1 1
36 5 4 4 4 3 2 2 1 1
37 5 4 4 4 3 2 2 1 1
38 5 4 4 4 3 2 2 1 1
39 5 4 4 4 3 2 2 1 1
40 5 4 4 4 3 2 2 1 1
41 5 4 4 4 3 2 2 1 1
42 5 4 4 4 3 2 2 1 1
43 6 5 5 5 4 3 3 2 2
44 6 5 5 5 4 3 3 2 2
45 6 5 5 5 4 3 3 2 2
46 6 5 5 5 4 3 3 2 2
47 6 5 5 5 4 3 3 2 2
48 6 5 5 5 4 3 3 2 2
49 6 5 5 5 4 3 3 2 2
50 6 5 5 5 4 3 3 2 2
51 6 5 5 5 4 3 3 2 2
52 6 5 5 5 4 3 3 2 2
53 6 5 5 5 4 3 3 2 2
54 7 6 5 5 4 3 3 2 2
55 7 6 5 5 4 3 3 2 2
56 7 6 5 5 4 3 3 2 2
57 7 6 5 5 4 3 3 2 2
58 7 6 5 5 4 3 3 2 2
59 8 7 6 6 5 4 3 2 2
60 8 7 6 6 5 4 3 2 2
61 8 7 6 6 5 4 3 2 2
62 8 7 6 6 5 4 3 2 2
63 8 7 6 6 5 4 3 2 2
64 8 7 6 6 5 4 3 2 2
65 8 7 6 6 5 4 3 2 2
66 8 7 6 6 5 4 3 2 2
67 8 7 6 6 5 4 3 2 2
68 8 7 6 6 5 4 3 2 2
69 8 7 6 6 5 4 3 2 2
70 8 7 6 6 5 4 3 2 2
71 8 7 6 6 5 4 3 2 2
72 8 7 6 6 5 4 3 2 2
73 8 7 6 6 5 4 3 2 2
74 9 8 7 7 6 5 4 3 2
75 9 8 7 7 6 5 4 3 2
76 9 8 7 7 6 5 4 3 2
77 9 8 7 7 6 5 4 3 2
78 9 8 7 7 6 5 4 3 2
79 9 8 7 7 6 5 4 3 2
80 9 8 7 7 6 5 4 3 2
81 9 8 7 7 6 5 4 3 2
82 9 8 7 7 6 5 4 3 2
83 10 9 8 7 6 5 4 3 2
84 10 9 8 7 6 5 4 3 2

TARGET CASE: SKELETON KEY, THE
Number of Clusters
ID 10 9 8 7 6 5 4 3 2
1 1 1 1 1 1 1 1 1 1
2 1 1 1 1 1 1 1 1 1
3 2 2 1 1 1 1 1 1 1
4 2 2 1 1 1 1 1 1 1
5 2 2 1 1 1 1 1 1 1
6 2 2 1 1 1 1 1 1 1
7 3 3 2 2 2 2 2 1 1
8 3 3 2 2 2 2 2 1 1
9 3 3 2 2 2 2 2 1 1
10 3 3 2 2 2 2 2 1 1
11 3 3 2 2 2 2 2 1 1
12 3 3 2 2 2 2 2 1 1
13 3 3 2 2 2 2 2 1 1
14 4 4 3 3 2 2 2 1 1
15 5 4 3 3 2 2 2 1 1
16 5 4 3 3 2 2 2 1 1
17 5 4 3 3 2 2 2 1 1
18 5 4 3 3 2 2 2 1 1
19 5 4 3 3 2 2 2 1 1
20 5 4 3 3 2 2 2 1 1
21 5 4 3 3 2 2 2 1 1
22 5 4 3 3 2 2 2 1 1
23 5 4 3 3 2 2 2 1 1
24 5 4 3 3 2 2 2 1 1
25 5 4 3 3 2 2 2 1 1
26 6 5 4 4 3 3 3 2 1
27 6 5 4 4 3 3 3 2 1
28 6 5 4 4 3 3 3 2 1
29 6 5 4 4 3 3 3 2 1
30 6 5 4 4 3 3 3 2 1
31 6 5 4 4 3 3 3 2 1
32 7 6 5 4 3 3 3 2 1
33 7 6 5 4 3 3 3 2 1
34 7 6 5 4 3 3 3 2 1
35 7 6 5 4 3 3 3 2 1
36 7 6 5 4 3 3 3 2 1
37 7 6 5 4 3 3 3 2 1
38 7 6 5 4 3 3 3 2 1
39 7 6 5 4 3 3 3 2 1
40 7 6 5 4 3 3 3 2 1
41 7 6 5 4 3 3 3 2 1
42 7 6 5 4 3 3 3 2 1
43 7 6 5 4 3 3 3 2 1
44 7 6 5 4 3 3 3 2 1
45 7 6 5 4 3 3 3 2 1
46 7 6 5 4 3 3 3 2 1
47 7 6 5 4 3 3 3 2 1
48 8 7 6 5 4 4 3 2 1
49 8 7 6 5 4 4 3 2 1
50 8 7 6 5 4 4 3 2 1
51 8 7 6 5 4 4 3 2 1
52 8 7 6 5 4 4 3 2 1
53 9 8 7 6 5 4 3 2 1
54 9 8 7 6 5 4 3 2 1
55 10 9 8 7 6 5 4 3 2
56 10 9 8 7 6 5 4 3 2
57 10 9 8 7 6 5 4 3 2
58 10 9 8 7 6 5 4 3 2
59 10 9 8 7 6 5 4 3 2
60 10 9 8 7 6 5 4 3 2
61 10 9 8 7 6 5 4 3 2
62 10 9 8 7 6 5 4 3 2
63 10 9 8 7 6 5 4 3 2
64 10 9 8 7 6 5 4 3 2
65 10 9 8 7 6 5 4 3 2
66 10 9 8 7 6 5 4 3 2
67 10 9 8 7 6 5 4 3 2
68 10 9 8 7 6 5 4 3 2

TARGET CASE: SKY HIGH
Number of Clusters
ID 10 9 8 7 6 5 4 3 2
1 1 1 1 1 1 1 1 1 1
2 2 2 2 2 1 1 1 1 1
3 3 3 3 3 2 2 2 2 2
4 3 3 3 3 2 2 2 2 2
5 4 3 3 3 2 2 2 2 2
6 4 3 3 3 2 2 2 2 2
7 4 3 3 3 2 2 2 2 2
8 4 3 3 3 2 2 2 2 2
9 5 4 4 4 3 3 3 2 2
10 5 4 4 4 3 3 3 2 2
11 5 4 4 4 3 3 3 2 2
12 5 4 4 4 3 3 3 2 2
13 5 4 4 4 3 3 3 2 2
14 5 4 4 4 3 3 3 2 2
15 5 4 4 4 3 3 3 2 2
16 5 4 4 4 3 3 3 2 2
17 6 5 5 4 3 3 3 2 2
18 6 5 5 4 3 3 3 2 2
19 6 5 5 4 3 3 3 2 2
20 6 5 5 4 3 3 3 2 2
21 6 5 5 4 3 3 3 2 2
22 6 5 5 4 3 3 3 2 2
23 6 5 5 4 3 3 3 2 2
24 6 5 5 4 3 3 3 2 2
25 6 5 5 4 3 3 3 2 2
26 7 6 6 5 4 4 4 3 2
27 7 6 6 5 4 4 4 3 2
28 7 6 6 5 4 4 4 3 2
29 7 6 6 5 4 4 4 3 2
30 7 6 6 5 4 4 4 3 2
31 7 6 6 5 4 4 4 3 2
32 7 6 6 5 4 4 4 3 2
33 7 6 6 5 4 4 4 3 2
34 7 6 6 5 4 4 4 3 2
35 7 6 6 5 4 4 4 3 2
36 7 6 6 5 4 4 4 3 2
37 7 6 6 5 4 4 4 3 2
38 7 6 6 5 4 4 4 3 2
39 7 6 6 5 4 4 4 3 2
40 7 6 6 5 4 4 4 3 2
41 8 7 6 5 4 4 4 3 2
42 8 7 6 5 4 4 4 3 2
43 8 7 6 5 4 4 4 3 2
44 8 7 6 5 4 4 4 3 2
45 8 7 6 5 4 4 4 3 2
46 8 7 6 5 4 4 4 3 2
47 8 7 6 5 4 4 4 3 2
48 8 7 6 5 4 4 4 3 2
49 8 7 6 5 4 4 4 3 2
50 8 7 6 5 4 4 4 3 2
51 8 7 6 5 4 4 4 3 2
52 8 7 6 5 4 4 4 3 2
53 8 7 6 5 4 4 4 3 2
54 8 7 6 5 4 4 4 3 2
55 8 7 6 5 4 4 4 3 2
56 8 7 6 5 4 4 4 3 2
57 8 7 6 5 4 4 4 3 2
58 8 7 6 5 4 4 4 3 2
59 8 7 6 5 4 4 4 3 2
60 9 8 7 6 5 5 4 3 2
61 9 8 7 6 5 5 4 3 2
62 9 8 7 6 5 5 4 3 2
63 9 8 7 6 5 5 4 3 2
64 9 8 7 6 5 5 4 3 2
65 9 8 7 6 5 5 4 3 2
66 9 8 7 6 5 5 4 3 2
67 9 8 7 6 5 5 4 3 2
68 10 9 8 7 6 5 4 3 2
69 10 9 8 7 6 5 4 3 2

TARGET CASE: STEALTH
Number of Clusters
ID 10 9 8 7 6 5 4 3 2
1 1 1 1 1 1 1 1 1 1
2 1 1 1 1 1 1 1 1 1
3 1 1 1 1 1 1 1 1 1
4 1 1 1 1 1 1 1 1 1
5 2 2 2 2 2 1 1 1 1
6 2 2 2 2 2 1 1 1 1
7 2 2 2 2 2 1 1 1 1
8 2 2 2 2 2 1 1 1 1
9 2 2 2 2 2 1 1 1 1
10 2 2 2 2 2 1 1 1 1
11 3 3 3 3 3 2 2 2 1
12 3 3 3 3 3 2 2 2 1
13 3 3 3 3 3 2 2 2 1
14 3 3 3 3 3 2 2 2 1
15 3 3 3 3 3 2 2 2 1
16 3 3 3 3 3 2 2 2 1
17 3 3 3 3 3 2 2 2 1
18 3 3 3 3 3 2 2 2 1
19 3 3 3 3 3 2 2 2 1
20 4 4 3 3 3 2 2 2 1
21 4 4 3 3 3 2 2 2 1
22 4 4 3 3 3 2 2 2 1
23 4 4 3 3 3 2 2 2 1
24 4 4 3 3 3 2 2 2 1
25 4 4 3 3 3 2 2 2 1
26 4 4 3 3 3 2 2 2 1
27 4 4 3 3 3 2 2 2 1
28 4 4 3 3 3 2 2 2 1
29 5 5 4 4 4 3 2 2 1
30 5 5 4 4 4 3 2 2 1
31 5 5 4 4 4 3 2 2 1
32 5 5 4 4 4 3 2 2 1
33 5 5 4 4 4 3 2 2 1
34 5 5 4 4 4 3 2 2 1
35 5 5 4 4 4 3 2 2 1
36 5 5 4 4 4 3 2 2 1
37 5 5 4 4 4 3 2 2 1
38 6 5 4 4 4 3 2 2 1
39 6 5 4 4 4 3 2 2 1
40 6 5 4 4 4 3 2 2 1
41 6 5 4 4 4 3 2 2 1
42 6 5 4 4 4 3 2 2 1
43 6 5 4 4 4 3 2 2 1
44 7 6 5 5 5 4 3 3 2
45 7 6 5 5 5 4 3 3 2
46 7 6 5 5 5 4 3 3 2
47 7 6 5 5 5 4 3 3 2
48 7 6 5 5 5 4 3 3 2
49 8 7 6 5 5 4 3 3 2
50 8 7 6 5 5 4 3 3 2
51 8 7 6 5 5 4 3 3 2
52 8 7 6 5 5 4 3 3 2
53 8 7 6 5 5 4 3 3 2
54 8 7 6 5 5 4 3 3 2
55 9 8 7 6 6 5 4 3 2
56 9 8 7 6 6 5 4 3 2
57 10 9 8 7 6 5 4 3 2
58 10 9 8 7 6 5 4 3 2
59 10 9 8 7 6 5 4 3 2
60 10 9 8 7 6 5 4 3 2
61 10 9 8 7 6 5 4 3 2
62 10 9 8 7 6 5 4 3 2
63 10 9 8 7 6 5 4 3 2

TARGET CASE: VALIANT
Number of Clusters
ID 10 9 8 7 6 5 4 3 2
1 1 1 1 1 1 1 1 1 1
2 1 1 1 1 1 1 1 1 1
3 2 2 2 1 1 1 1 1 1
4 2 2 2 1 1 1 1 1 1
5 2 2 2 1 1 1 1 1 1
6 2 2 2 1 1 1 1 1 1
7 2 2 2 1 1 1 1 1 1
8 2 2 2 1 1 1 1 1 1
9 3 3 3 2 2 2 2 2 1
10 3 3 3 2 2 2 2 2 1
11 3 3 3 2 2 2 2 2 1
12 3 3 3 2 2 2 2 2 1
13 4 4 4 3 3 2 2 2 1
14 4 4 4 3 3 2 2 2 1
15 4 4 4 3 3 2 2 2 1
16 4 4 4 3 3 2 2 2 1
17 4 4 4 3 3 2 2 2 1
18 4 4 4 3 3 2 2 2 1
19 4 4 4 3 3 2 2 2 1
20 5 5 4 3 3 2 2 2 1
21 5 5 4 3 3 2 2 2 1
22 5 5 4 3 3 2 2 2 1
23 5 5 4 3 3 2 2 2 1
24 5 5 4 3 3 2 2 2 1
25 5 5 4 3 3 2 2 2 1
26 5 5 4 3 3 2 2 2 1
27 5 5 4 3 3 2 2 2 1
28 6 6 5 4 4 3 2 2 1
29 6 6 5 4 4 3 2 2 1
30 6 6 5 4 4 3 2 2 1
31 6 6 5 4 4 3 2 2 1
32 6 6 5 4 4 3 2 2 1
33 6 6 5 4 4 3 2 2 1
34 7 6 5 4 4 3 2 2 1
35 7 6 5 4 4 3 2 2 1
36 7 6 5 4 4 3 2 2 1
37 7 6 5 4 4 3 2 2 1
38 7 6 5 4 4 3 2 2 1
39 7 6 5 4 4 3 2 2 1
40 8 7 6 5 5 4 3 3 2
41 8 7 6 5 5 4 3 3 2
42 8 7 6 5 5 4 3 3 2
43 8 7 6 5 5 4 3 3 2
44 8 7 6 5 5 4 3 3 2
45 9 8 7 6 5 4 3 3 2
46 9 8 7 6 5 4 3 3 2
47 10 9 8 7 6 5 4 3 2
48 10 9 8 7 6 5 4 3 2

TARGET CASE: WAR OF THE WORLDS
Number of Clusters
ID 10 9 8 7 6 5 4 3 2
1 1 1 1 1 1 1 1 1 1
2 2 2 2 2 2 2 2 2 2
3 3 3 3 3 2 2 2 2 2
4 3 3 3 3 2 2 2 2 2
5 3 3 3 3 2 2 2 2 2
6 3 3 3 3 2 2 2 2 2
7 3 3 3 3 2 2 2 2 2
8 3 3 3 3 2 2 2 2 2
9 3 3 3 3 2 2 2 2 2
10 3 3 3 3 2 2 2 2 2
11 3 3 3 3 2 2 2 2 2
12 4 4 4 4 3 3 2 2 2
13 4 4 4 4 3 3 2 2 2
14 4 4 4 4 3 3 2 2 2
15 4 4 4 4 3 3 2 2 2
16 4 4 4 4 3 3 2 2 2
17 4 4 4 4 3 3 2 2 2
18 4 4 4 4 3 3 2 2 2
19 4 4 4 4 3 3 2 2 2
20 4 4 4 4 3 3 2 2 2
21 4 4 4 4 3 3 2 2 2
22 4 4 4 4 3 3 2 2 2
23 4 4 4 4 3 3 2 2 2
24 4 4 4 4 3 3 2 2 2
25 4 4 4 4 3 3 2 2 2
26 4 4 4 4 3 3 2 2 2
27 4 4 4 4 3 3 2 2 2
28 5 5 4 4 3 3 2 2 2
29 5 5 4 4 3 3 2 2 2
30 5 5 4 4 3 3 2 2 2
31 5 5 4 4 3 3 2 2 2
32 5 5 4 4 3 3 2 2 2
33 5 5 4 4 3 3 2 2 2
34 5 5 4 4 3 3 2 2 2
35 6 6 5 5 4 4 3 2 2
36 6 6 5 5 4 4 3 2 2
37 6 6 5 5 4 4 3 2 2
38 6 6 5 5 4 4 3 2 2
39 6 6 5 5 4 4 3 2 2
40 6 6 5 5 4 4 3 2 2
41 6 6 5 5 4 4 3 2 2
42 6 6 5 5 4 4 3 2 2
43 6 6 5 5 4 4 3 2 2
44 6 6 5 5 4 4 3 2 2
45 6 6 5 5 4 4 3 2 2
46 7 7 6 6 5 4 3 2 2
47 7 7 6 6 5 4 3 2 2
48 7 7 6 6 5 4 3 2 2
49 7 7 6 6 5 4 3 2 2
50 7 7 6 6 5 4 3 2 2
51 7 7 6 6 5 4 3 2 2
52 8 7 6 6 5 4 3 2 2
53 8 7 6 6 5 4 3 2 2
54 8 7 6 6 5 4 3 2 2
55 8 7 6 6 5 4 3 2 2
56 8 7 6 6 5 4 3 2 2
57 8 7 6 6 5 4 3 2 2
58 8 7 6 6 5 4 3 2 2
59 8 7 6 6 5 4 3 2 2
60 9 8 7 7 6 5 4 3 2
61 9 8 7 7 6 5 4 3 2
62 9 8 7 7 6 5 4 3 2
63 9 8 7 7 6 5 4 3 2
64 9 8 7 7 6 5 4 3 2
65 9 8 7 7 6 5 4 3 2
66 10 9 8 7 6 5 4 3 2
67 10 9 8 7 6 5 4 3 2
68 10 9 8 7 6 5 4 3 2

TARGET CASE: WEDDING CRASHERS
Number of Clusters
ID 10 9 8 7 6 5 4 3 2
1 1 1 1 1 1 1 1 1 1
2 1 1 1 1 1 1 1 1 1
3 1 1 1 1 1 1 1 1 1
4 1 1 1 1 1 1 1 1 1
5 2 2 2 2 1 1 1 1 1
6 2 2 2 2 1 1 1 1 1
7 2 2 2 2 1 1 1 1 1
8 2 2 2 2 1 1 1 1 1
9 2 2 2 2 1 1 1 1 1
10 2 2 2 2 1 1 1 1 1
11 2 2 2 2 1 1 1 1 1
12 2 2 2 2 1 1 1 1 1
13 2 2 2 2 1 1 1 1 1
14 2 2 2 2 1 1 1 1 1
15 2 2 2 2 1 1 1 1 1
16 3 3 3 3 2 2 2 2 1
17 3 3 3 3 2 2 2 2 1
18 3 3 3 3 2 2 2 2 1
19 3 3 3 3 2 2 2 2 1
20 3 3 3 3 2 2 2 2 1
21 3 3 3 3 2 2 2 2 1
22 3 3 3 3 2 2 2 2 1
23 3 3 3 3 2 2 2 2 1
24 3 3 3 3 2 2 2 2 1
25 4 4 3 3 2 2 2 2 1
26 4 4 3 3 2 2 2 2 1
27 4 4 3 3 2 2 2 2 1
28 4 4 3 3 2 2 2 2 1
29 4 4 3 3 2 2 2 2 1
30 4 4 3 3 2 2 2 2 1
31 4 4 3 3 2 2 2 2 1
32 4 4 3 3 2 2 2 2 1
33 4 4 3 3 2 2 2 2 1
34 5 5 4 4 3 3 2 2 1
35 5 5 4 4 3 3 2 2 1
36 5 5 4 4 3 3 2 2 1
37 5 5 4 4 3 3 2 2 1
38 5 5 4 4 3 3 2 2 1
39 5 5 4 4 3 3 2 2 1
40 5 5 4 4 3 3 2 2 1
41 5 5 4 4 3 3 2 2 1
42 5 5 4 4 3 3 2 2 1
43 5 5 4 4 3 3 2 2 1
44 5 5 4 4 3 3 2 2 1
45 5 5 4 4 3 3 2 2 1
46 5 5 4 4 3 3 2 2 1
47 5 5 4 4 3 3 2 2 1
48 5 5 4 4 3 3 2 2 1
49 5 5 4 4 3 3 2 2 1
50 5 5 4 4 3 3 2 2 1
51 5 5 4 4 3 3 2 2 1
52 5 5 4 4 3 3 2 2 1
53 5 5 4 4 3 3 2 2 1
54 5 5 4 4 3 3 2 2 1
55 6 6 5 5 4 4 3 3 2
56 6 6 5 5 4 4 3 3 2
57 6 6 5 5 4 4 3 3 2
58 7 6 5 5 4 4 3 3 2
59 7 6 5 5 4 4 3 3 2
60 7 6 5 5 4 4 3 3 2
61 7 6 5 5 4 4 3 3 2
62 7 6 5 5 4 4 3 3 2
63 8 7 6 5 4 4 3 3 2
64 8 7 6 5 4 4 3 3 2
65 8 7 6 5 4 4 3 3 2
66 8 7 6 5 
4 4 3 3 2 67 8 7 6 5 4 4 3 3 2 68 8 7 6 5 4 4 3 3 2 69 8 7 6 5 4 4 3 3 2 70 8 7 6 5 4 4 3 3 2 71 9 8 7 6 5 5 4 3 2 72 9 8 7 6 5 5 4 3 2 73 9 8 7 6 5 5 4 3 2 74 9 8 7 6 5 5 4 3 2 75 10 9 8 7 6 5 4 3 2 183