<<

How Cognitive Biases Can Undermine Program Scale-Up Decisions

Susan Mayer, Rohen Shah, and Ariel Kalil

University of Chicago

August 2020

Forthcoming in List, J., Suskind, D., & Supplee, L. H. (Eds.), The Scale-up Effect in Early Childhood and Public Policy: Why interventions lose impact at scale and what we can do about it. Routledge.

Abstract

Many social programs that yield positive results in pilot studies fail to yield such results when implemented at scale. A variety of possible explanations aim to account for this failure. In this chapter we emphasize the potential role of cognitive biases in the decision-making process that takes an intervention from an idea to a fully implemented program. In that process, researchers, practitioners, program participants, and funders all make decisions that influence whether and how the program is implemented and evaluated and how evaluations are interpreted. Each decision is subject to the cognitive biases of the decision maker, and these biases can increase the chance that false positive evaluation results lead to scaled, and in this case failed, implementation. Although the potential for cognitive biases to influence the scale-up process is large, the research evidence is nearly nonexistent, suggesting the need for a psychologically informed understanding of the scale-up process.

Introduction

Many social programs that yield positive results in pilot studies fail to work when scaled to a large population. A seemingly successful pilot intervention might fail to scale for any number of reasons, including unrepresentative samples of participants, low implementation fidelity, treatment heterogeneity, and other factors that are discussed in this book. But problems may also arise because of the cognitive biases of those involved in the process of initiating, evaluating, and scaling a program.

Researchers, practitioners, program participants, and policy makers must all make decisions that influence whether a program is scaled up, and each decision is potentially subject to cognitive biases (Sunstein, 2014). In this chapter we focus on three cognitive biases that previous research has shown to have an important influence on consequential decisions, namely confirmation bias, status quo or default bias, and bandwagon bias.

A cognitive bias is a consistent and predictable propensity to make decisions in a particular way that deviates from common principles of logic and probability. As such, cognitive biases sometimes result in suboptimal decisions. Psychologists generally believe that cognitive biases fulfill a current or evolutionary psychological need that is "hardwired" in the brain and thus form the default patterns of human cognition (e.g., Haselton, Nettle, & Murray, 2016; Kahneman, 2011). Cognitive biases arise because when individuals are faced with a decision, they confront vast quantities of information, time constraints, a social context and history that provides meaning to the information, and limited brain power to process all the information. Under these circumstances, humans use shortcuts that may produce quick and satisfying decisions characterized by perceptual distortion, inaccurate judgment, and inconsistent interpretations, collectively referred to as irrationality (e.g., Kahneman & Tversky, 1972; Tversky & Kahneman, 1974; Ariely, 2008; Gilovich, Griffin, & Kahneman, 2002). For example, Tversky and Kahneman (1974), who did pioneering work on cognitive biases, demonstrated that human probability judgments are consistently insensitive to knowledge of prior probability and sample size but are just as consistently influenced by irrelevant factors, such as the ease of imagining an event or the provision of an unrelated random number. Other cognitive biases are related to social influences on decision making, including the need for social acceptance and the desire to preserve one's identity (Akerlof & Kranton, 2010).

Not all social scientists agree about how many cognitive biases exist, and many of the cognitive biases discussed in the research literature overlap. However, there is broad agreement that cognitive biases should be distinguished from random errors in judgement, computational errors, or other errors that are the result of misinformation. Errors that people make due to misinformation can be corrected by providing accurate information, but cognitive biases are "hardwired": they are difficult to change and cannot be changed by simply providing information. Changing the "choice architecture" (Thaler & Sunstein, 2008), which is the structure in which decisions are made, and using behavioral tools, such as commitment devices and reminders (Mayer, Kalil, Oreopoulos, & Gallegos, 2019; Kalil, Mayer, & Gallegos, 2020), can help people adapt to cognitive biases in specific contexts. However, it is unclear whether people can completely overcome these biases (see Nisbett, 2015 and Hershfield, 2011 for the view that at least some biases can be overcome). Cognitive biases should also be distinguished from responses to incentives that might result in distortions to decision making. Responding to an incentive is neither surprising nor contrary to logic or probability assessment, even if the incentive itself encourages behavior that is suboptimal from the point of view of the person offering the incentive.

Not everyone shares the same view of the origin, persistence, or influence of cognitive biases (e.g., see Gigerenzer, 1996 for a critique of the "heuristics and biases" program, which was among the first formal accounts of decision making to incorporate cognitive biases). About 185 cognitive biases have been identified, but some are trivial and others rest on dubious empirical research. In addition, not all decisions are seriously compromised by cognitive biases and, in some cases, cognitive biases are beneficial. For instance, cognitive shortcuts mitigate "decision fatigue," i.e., the decrease in decision-making quality that results from a prolonged period of decision making (Baumeister, 2003).

Until recently, economists and other social scientists seldom accounted for consistent deviations from "rationality" in how people gather and interpret information in models of decision making. However, a growing research literature and the proliferation of "nudge" units integrating behavioral insights into government around the world have led to a new awareness of the ways in which cognitive biases influence policy decisions (Afif, Islan, Calvo-Gonzalez, & Dalton, 2019).

This chapter focuses on cognitive biases that are related to the information that policy makers and others involved in decisions to scale up programs seek out and consume. We focus on how these biases might lead to scaling up programs that should not have been scaled up in the first place. In doing so, we suggest a possible role for cognitive biases in explaining the commonly observed "voltage drop" in scaled-up programs and hence provide insight into the question "why do so many programs fail to scale?" The discussion of the potential influence of cognitive biases in the scale-up process is necessarily speculative because there is almost no research on this topic. The intent of this chapter is to make the case that such research is needed.

How Cognitive Biases Might Influence Scaling Decisions

Ideally, the process leading to a social program being implemented at scale begins with a theory about what treatment will change the behavior of participants in a positive way. The theory then gets translated into a program with a set of procedures that, when followed, might result in a change in the behavior of program participants. If funders are willing, the program is then implemented in a pilot or at a small scale to see if behavior changes in the predicted way. If researchers decide that a pilot program or replication study changes behavior in the predicted way, and if funders are again willing, the program may then be implemented at a larger scale.

In this ideal process, researchers make objective decisions about the research based only on uncovering the true effect of the intervention. However, researchers are not always completely objective. Social scientific research often requires researchers to make decisions about both the research process and the interpretation of results. Put another way, researchers have many degrees of freedom (Simmons, Nelson, & Simonsohn, 2011) or "forking paths" (Gelman & Loken, 2014) in the research process that require decisions. Such decisions include the sample size, whether to omit some observations, what factors to include in a model, how to construct outcome measures, and so on. These decisions can influence the type of data collected from a pilot study and the way that results get analyzed, ultimately shaping the conclusions that researchers draw about whether or not a program is effective.
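To see how quickly these degrees of freedom can inflate false positives, consider a simple simulation (our illustration, not an analysis from the chapter; the sample size, the five outcome measures, and the independence of outcomes are all assumptions made for the sketch). A pilot with no true effect on any outcome still produces at least one "significant" result far more often than 5 percent of the time if the analyst is free to report whichever outcome looks best:

```python
# Illustrative sketch: one "researcher degree of freedom" -- choosing among
# several outcome measures after seeing the data -- inflates the false-positive
# rate above the nominal 5%. Assumes no true effect and independent outcomes.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_sims, n_per_arm, n_outcomes = 10_000, 100, 5

false_positives = 0
for _ in range(n_sims):
    treat = rng.normal(size=(n_per_arm, n_outcomes))    # treatment group scores
    control = rng.normal(size=(n_per_arm, n_outcomes))  # control group scores
    pvals = stats.ttest_ind(treat, control, axis=0).pvalue
    if pvals.min() <= 0.05:          # report whichever outcome looks best
        false_positives += 1

print(f"share of null pilots with a 'significant' result: {false_positives / n_sims:.3f}")
# Expect roughly 1 - 0.95**5 = 0.226 rather than 0.05.
```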

When researchers face ambiguous decisions in the research process, they tend to make decisions that increase the chance of achieving statistically significant results (p ≤ .05) for the desired outcome (Babcock & Loewenstein, 1997; Dawson, Gilovich, & Regan, 2002; Gilovich, 1983; Hastorf & Cantril, 1954; Kunda, 1990; Zuckerman, 1979). This is a threat to accurate inference from the results of policy research (Al-Ubaydli, List, & Suskind, 2019; Al-Ubaydli, Lee, List, Mackevicius, & Suskind, 2019). Researcher decisions are influenced by the incentives built into the research process: positive results are more highly rewarded through publication and funding. Other influences on decisions in the research process arise from the funding source.

This potential influence is widely recognized in academia, as indicated by the requirement that researchers report conflicts of interest for funded research. Industry-funded clinical trials and cost-effectiveness analyses, for instance, yield positive results far more often than studies that are not funded or conducted by industry or other interested entities (Davidson, 1986; Rochon et al., 1994; Cho & Bero, 1996; Friedberg, Saffran, Stinson, Nelson, & Bennett, 1999; Krimsky, 2013).

The many degrees of freedom in the research process have two important implications for the decision to scale up a program. First, conventional measures of statistical significance do not necessarily provide a strong signal in favor of scientific claims (Cohen, 1994; Ioannidis, 2005; Simmons et al., 2011; Francis, 2013; Button et al., 2013; Gelman & Carlin, 2014) because researchers may have made decisions that were designed to produce those results. Relying solely on the statistical significance of an estimate can lead to accepting false positive results (e.g., Simmons et al., 2011; Maniadis, Tufano, & List, 2014). When researchers accept false positive results for program impacts, scaling those programs can lead to disappointing null effects in a scaled-up implementation.

Second, the fact that there are many ambiguous decision points in the research process opens the door for cognitive biases to influence how researchers design, implement, and interpret the results of pilot studies. Moreover, researchers are not the only ones involved in the decision to scale up an intervention. Policy makers, practitioners, and funders also play important roles in that decision, and the fact that they, too, have many degrees of freedom in their decision process leaves open the possibility for their cognitive biases to influence the decision.

A pilot study is intended to influence individuals' beliefs about whether a program is truly effective. According to the economic model of scaling, when faced with the results of a pilot study, individuals should update their prior belief about an intervention's efficacy by calculating a post-study probability (PSP) that the intervention is effective. To do this, they should begin with their belief about the probability of effectiveness prior to the study, calculate the likelihood that a pilot study's results are true based on the study's power and statistical significance, and then update their belief about the probability that the program is effective. The power of a study is the likelihood of detecting a treatment effect if the effect truly exists. Power tends to be greater when the sample in an intervention is larger. However, larger samples also allow smaller effects to be statistically significant at conventional levels, so power and statistical significance are aids for interpreting the credibility of effect sizes; neither alone tells us whether a program is worth scaling.
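This updating rule has a compact Bayesian form. Following the post-study probability formula in Maniadis, Tufano, and List (2014), let π be the prior probability that the program is effective, α the significance level, and 1 − β the study's power; then

```latex
\[
\mathrm{PSP}_{\text{sig}} \;=\; \frac{(1-\beta)\,\pi}{(1-\beta)\,\pi + \alpha\,(1-\pi)},
\qquad
\mathrm{PSP}_{\text{nonsig}} \;=\; \frac{\beta\,\pi}{\beta\,\pi + (1-\alpha)\,(1-\pi)},
\]
```

where the first expression is the update after a statistically significant pilot result and the second is the analogous Bayes update after a nonsignificant one.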

Al-Ubaydli, Lee, List, Mackevicius, and Suskind (2019) recommend not scaling an intervention until the PSP for the intervention reaches 0.95 or greater. They also recommend starting with a low prior belief about an intervention's effectiveness (0.10 on a scale of 0 to 1) when there is minimal information about a program's effectiveness. Given these recommendations, it would take two well-powered studies with positive, statistically significant results to reach the PSP recommended for scaling.

With each statistically significant finding, the PSP increases (Maniadis et al., 2014). However, the PSP should also be adjusted for studies finding insignificant results: an insignificant pilot result should lower the PSP relative to the prior belief.
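A short calculation makes these recommendations concrete. The sketch below is our illustration: it assumes α = .05 and power of .80 for every pilot, together with the 0.10 starting prior recommended above. Under those assumptions, exactly two significant results push the PSP past 0.95, and a single null result pulls it back down:

```python
# Sequential PSP updating under assumed alpha = .05 and power = .80.

def update_psp(prior: float, significant: bool,
               alpha: float = 0.05, power: float = 0.80) -> float:
    """Bayes update of the probability that the program is truly effective."""
    p_eff = power if significant else 1 - power    # P(result | effective)
    p_not = alpha if significant else 1 - alpha    # P(result | not effective)
    return p_eff * prior / (p_eff * prior + p_not * (1 - prior))

psp = 0.10                                         # low starting prior
for study in (1, 2):
    psp = update_psp(psp, significant=True)
    print(f"after positive study {study}: PSP = {psp:.3f}")
# after positive study 1: PSP = 0.640
# after positive study 2: PSP = 0.966  (crosses the 0.95 threshold)

print(f"after one null result:   PSP = {update_psp(psp, significant=False):.3f}")
# a single nonsignificant replication drops the PSP back to 0.857
```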

This formal process of updating prior beliefs assumes an unbiased decision maker. But researchers, like everyone else, are subject to cognitive biases that affect their judgements. These biases can increase the chance of their accepting false positive results and scaling programs that are likely to fail. In the next three sections, we describe how three different cognitive biases might influence the decision to scale a program:

• Confirmation Bias: The tendency to search for and interpret information in a way that confirms one's preconceptions.

• Status Quo (Default) Bias: The tendency to want things to stay relatively the same.

• Bandwagon Bias: The tendency to do or believe things solely because many other people do or believe them.

Confirmation Bias

As already noted, updating one’s beliefs about the efficacy of a program in light of the results of a pilot is fundamental to the decision about whether to scale the program. In this section, we describe how confirmation bias can alter both research procedures and the interpretation of results in ways that can result in an individual accepting false positive results.

Confirmation bias leads people to gather, interpret, and recall information in a way that conforms to their previously existing beliefs (Jonas, Schulz-Hardt, Frey, & Thelen, 2001; Wason, 1960, 1968; Kleck & Wheaton, 1967). Imagine Jack and Jill. Jack supports candidate A and Jill supports candidate B. Jack seeks out news stories and social media contacts who support A while Jill does the same for B. Both of their views are confirmed by the evidence that they find. Similarly, when Jack sees evidence in support of candidate B, he is likely to find reasons not to believe the evidence, and when Jill sees evidence in support of A, she is likely to discount that, too. This same process of selectively attending to information occurs in many arenas.

In his iconic test of confirmation bias, Peter Wason (1960) developed the Rule Discovery Test. He gave subjects three numbers and asked them to come up with the rule that applied to the selection of those numbers. Subjects could ask questions and devise other sets of numbers that researchers would say did or did not conform to the rule. In this way, subjects were asked to determine if their rule was correct. Given the sequence "2-4-6," subjects typically formed a hypothesis that the rule was even numbers. They tested this by presenting other sequences of even numbers, all of which were correct. After several correct tries, the subjects believed that they had discovered the rule. In fact, they had failed, because the rule was "increasing numbers."

The significance of this study is that almost all subjects formed a hypothesis and tested only number sequences that conformed to it. Very few tried a number sequence, or asked a question, that might disprove their hypothesis. Wason's Rule Discovery Test demonstrated that most people do not try to test their hypotheses critically but instead try only to confirm them.

Since this pioneering experiment, dozens of laboratory and field experiments have demonstrated the existence and importance of confirmation bias. For example, one study found that when presented with an article containing ambiguous information on the pros and cons of policies, conservatives and liberals alike interpreted the same, neutral article as making a case for their own prior viewpoint (Angner, 2016). A similar phenomenon might occur when a pilot program measuring several equally important outcomes produces results showing that it benefits some outcomes but not others. In these circumstances, the presence of confirmation bias suggests that a person with a prior belief in the program might interpret equivocal results such as these as supporting the program being scaled. Alternatively, suppose a researcher has a strong prior belief that an intervention will be effective. She reads about the results of two pilot studies that are consistent with her prior. As noted earlier, this should be enough evidence to bring her PSP above the 0.95 threshold if it is the only evidence available. But what if there were also eight other similarly powered studies that all produced nonsignificant results? A researcher facing some studies with positive results and others with negative results should update her beliefs by increasing her prior twice for the two positive results and then decreasing it eight times for the negative results. The resulting PSP will fall far short of the 0.95 threshold.

Under the influence of confirmation bias, however, she will give greater weight to the two results that agree with her prior belief than to the eight that disagree with it. That is, the PSP the researcher calculates will be above 0.95 and meet the recommendation in the economic model of scaling, even though the true PSP, taking the eight negative results into account, is far lower. This could result in her deciding to scale up a program that should not be scaled up.
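The gap between these two readings of the evidence can be made explicit with a toy model (our illustration, using the same assumed α = .05 and power = .80 as above). An unbiased updater processes all ten studies; a confirmation-biased updater, modeled here in the extreme as simply ignoring the null results, reaches a very different conclusion:

```python
# Toy comparison: unbiased Bayesian updating versus a stylized confirmation-
# biased updater who ignores disconfirming (null) studies entirely.

def update_psp(prior, significant, alpha=0.05, power=0.80):
    p_eff = power if significant else 1 - power
    p_not = alpha if significant else 1 - alpha
    return p_eff * prior / (p_eff * prior + p_not * (1 - prior))

results = [True, True] + [False] * 8       # two positive pilots, eight nulls

unbiased = biased = 0.10                   # same low starting prior for both
for sig in results:
    unbiased = update_psp(unbiased, sig)   # weighs every study
    if sig:
        biased = update_psp(biased, sig)   # "counts" only the confirmations

print(f"unbiased PSP: {unbiased:.4f}")     # about 0.0001 -- nowhere near 0.95
print(f"biased PSP:   {biased:.4f}")       # about 0.966 -- clears the threshold
```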

While there is not much research on how confirmation bias may influence the research process, some evidence shows that it occurs in the peer review process (Mahoney, 1977; Goldsmith, Blalock, Bobkova, & Hall, 2006; Lee et al., 2013). In an early study on this topic, Mahoney (1977) asked 75 journal reviewers to review manuscripts that had identical introductory sections and experimental procedures but reported data that were either consistent or inconsistent with the reviewer's theoretical perspective. The reviewers were strongly biased against manuscripts that reported results contrary to their own theoretical perspective. In other words, they were more inclined to publish results that conformed to their prior belief than results that disconfirmed it.

Another way that confirmation bias could influence the decision to scale a program is through funding decisions. A substantial research literature shows that government agencies have agendas and are influenced by fashions, which create strong prior beliefs about what will work, and that they fund programs based on these prior beliefs (e.g., see Vaganay, 2016 for a summary). Similarly, foundations have missions and agendas that result in their selectively choosing research to support those pre-existing beliefs. This cognitive bias can lead to funding weakly vetted programs that have a high probability of failure.

Some kinds of data that are frequently used to evaluate programs may also be subject to confirmation bias and therefore lead to scaling programs that are likely to fail. For example, in program evaluations, teacher reports are often used to measure changes in student outcomes (e.g., Brackett, Rivers, Reyes, & Salovey, 2012). Parent responses are also frequently used as an outcome measure in evaluations of parenting programs because what parents do in the home is often unobservable (see e.g., York & Loeb, 2014). This kind of self-reported data is problematic for many reasons, but one important reason is the possibility of confirmation bias among practitioners and participants. For example, several studies show that fidelity to a program is greater when practitioners believe the program will work and that they are more likely to believe a program will work when it aligns with their prior beliefs (Mihalic, Fagan, & Argamaso, 2008; Ringwalt et al., 2003; Rohrbach, Graham, & Hansen, 1993). We can speculate that parents or teachers are more likely to believe that there has been a positive change in their behavior when they participate in a program that is consistent with their priors about what works than when they participate in a program that is inconsistent with what they believe works. This belief may not correspond with actual changes in behavior, leading researchers to scale up a program that is likely to fail.

Finally, confirmation bias could result in researchers and others placing more weight on the results of early pilot programs than on later data about the same program (Baron, 2008; Allahverdyan & Galstyan, 2014). This tendency is known as the "primacy effect." If early results confirm one's priors, it becomes more likely that later disconfirming evidence will be discounted. This too can result in relying on false positive results to scale a program.

Status Quo (Default) Bias

It has long been recognized that governments are slow to adopt new programs even when evidence shows that the new program generates benefits that far outweigh its costs. A variety of explanations for this phenomenon have been offered (e.g., Fernandez & Rodrik, 1991), including status quo or default bias. Status quo bias refers to the fact that individuals tend to prefer the current state of events to a change, even if a change is in their interest (Samuelson & Zeckhauser, 1988; Kahneman, Knetsch, & Thaler, 1991). There are two main explanations for status quo bias. The first suggests that people are hardwired to assume that the status quo is good, or at least better than an alternative (Eidelman & Crandall, 2014). The second suggests that it is a result of loss aversion (Moshinsky & Bar-Hillel, 2010; Quattrone & Tversky, 1988). Loss aversion is a cognitive bias that refers to the documented phenomenon that the psychological discomfort of a loss is about twice as powerful as the psychological pleasure of an equal gain (Kahneman & Tversky, 1979). Because loss is so painful, people prefer the known status quo over changes that might result in loss.

Status quo bias can result in people choosing the default option. A classic demonstration of status quo bias is in decisions about retirement programs. Research has shown that 401(k) participation increases dramatically when companies switch from an enrollment system in which employees must opt into the program to one in which employees are automatically enrolled and must opt out (Madrian & Shea, 2001; Choi, Laibson, Madrian, & Metrick, 2004). Employees prefer whatever the current state or status quo is over having to make a decision to change something, even when the change is in their own financial interest.

In terms of scaling up a program, imagine a researcher who has completed a pilot study of a reading curriculum. The results show that the new curriculum raises reading skill by an impressive amount compared to a control group receiving the curriculum currently used in schools. The new curriculum is also easier and less expensive to implement. As Chambers and Norton argue in Chapter 14 of this book, a scenario like this suggests that de-implementation of the current curriculum should be considered. So a researcher might then try to convince school districts to replace the old curriculum with the new one. Anyone who has been in this position knows that school districts are unlikely to jump at the chance to implement the new curriculum (see e.g., Berkovich, 2011; Terhart, 2013). A possible explanation for this reluctance is status quo bias. Given the choice between a superior new program and a current program, decision makers often choose the current program, even when the new program is cost effective after accounting for the cost of transition (see e.g., Polites & Karahanna, 2012).

Status quo bias has been demonstrated in many experimental situations. For example, in one study, participants were given the option to reduce their anxiety while waiting for an electric shock. When doing nothing was the status quo option, participants frequently did not select the option that would reduce their anxiety (Suri, Sheppes, Schwartz & Gross, 2013).

Outside the laboratory, status quo bias has been used to explain many choices that seem irrational or suboptimal. Consider the fact that drivers are often offered the choice between expensive car insurance that protects the insured's right to sue and cheaper plans that restrict the right to sue. In a study of insurance decisions, one state had set the default plan as the expensive one whereas the cheaper plan was the default in another state. In both states, individuals were more likely to choose the default option (Johnson, Hershey, Meszaros, & Kunreuther, 1993). Many studies have examined the effects of changing from opt-in to opt-out choice structures on everything from charitable giving (Sanders, Halpern, & Service, 2013) to end-of-life care (Halpern et al., 2013) to green energy (Pichert & Katsikopoulos, 2008). Status quo bias has also been demonstrated in research on the advantage of incumbents in elections and on the preference of retirees for their current investment distribution over a new distribution, even when the new distribution would improve their rate of return (Beshears, Choi, Laibson, & Madrian, 2006; Ameriks & Zeldes, 2000).

The fact that status quo bias appears to have a widespread influence on consequential policy decisions raises the possibility of its influencing decisions in the scale-up process. We can speculate that status quo bias plays a role in why a new program that works may not be scaled up. If a decision maker must choose between a piloted program with positive results and the existing program it would displace, status quo bias gives weight to a decision in favor of the existing program, all else equal. On the other hand, status quo bias can also help explain why a new program that does not work may get scaled. If a decision maker faces a choice between a pilot program that is already being implemented and an entirely new program, status quo bias may lead them to stick with and scale up the existing program, even if its results have been disappointing. In short, status quo bias might prevent a potentially beneficial program from being scaled or result in scaling a program that is not beneficial.

Bandwagon Bias

Social scientists have long tried to understand how some things become fashionable compared to other things that appear to be just as functional and cost less. The attraction of luxury-brand items like Louis Vuitton and Gucci bags, Cartier and Piaget watches, or BMW and Mercedes cars seems not to align with rational decision making. Similarly, it is hard to explain the popularity of pet rocks or videos that go viral. And why do some diet and exercise regimens become fads while others, including those with more evidence of effectiveness, fail to become widespread?

In 1951, Solomon Asch developed a laboratory experiment intended to understand this kind of behavior. He recruited students to participate in what they thought was a vision experiment. Researchers put these students in classrooms with other students who were actually part of the experimental design, posing as fellow participants. Everyone in the room was shown a picture with three lines of varying lengths, some obviously longer than others. Each person in the room was asked to state out loud which was the longest line. In one experimental condition, the students who were part of the experimental design all identified the wrong line. On average, over a third of the actual participants went along with the clearly incorrect choice. In some trials, as many as 75 percent of participants went along with the clearly wrong answer. In a control condition with no students placed in the room as part of the experimental design, virtually all students chose the correct answer.

Asch's experiment demonstrated the existence of bandwagon bias. Bandwagon bias refers to the tendency of people to adopt what others do or think simply because everyone else is doing it. The more people who adopt a particular trend, the more likely it becomes that other people will also "hop on the bandwagon" (see e.g., Henshel & Johnston, 1987). Bandwagon bias is one of several biases that arise from peer or other social influences. Psychologists believe that bandwagon bias, like other cognitive biases, aids in processing information for quick decision making. But in this case, the information is processed in light of the views and behavior of others. Since Asch's experiment, a large research literature has demonstrated the importance of bandwagon bias in human decision making.

Bandwagon bias can interfere with scale-up decisions because it can magnify the impact of false positives. For reasons described above, false positives are more likely to occur than the 5% implied by a result that is significant at α = .05. When a pilot intervention with a false positive result is promoted by a trusted or influential (but not necessarily expert) source, or is generously funded, it can result in others duplicating the program before additional research is done. As noted above, initial pilot results might be favored over later evidence due to confirmation bias. When early but erroneous results are favored by influential sources, it can create a bandwagon effect that magnifies the confirmation bias. Once a critical mass has hopped "on the bandwagon," it becomes harder to convince people to change their now-strong prior beliefs. This can lead to scaling up a program that is likely to fail.

While fashion and toy choices are clear examples of bandwagon bias, research also suggests that this cognitive bias occurs in more serious and costly decisions. Bandwagon bias has led to the widespread adoption of programs in education with weak or no basis in evidence other than the popularity of the program. These education fads include open classrooms, whole language learning, and "new math" (Phillips, 2015). A large literature documents bandwagon biases in medicine, including the use of estrogen to prevent heart disease (Rossouw, 1996) and other medical procedures with dubious efficacy (McCormack & Greenhalgh, 2000; Trotter, 2002). In these cases, bandwagon bias may have resulted in more harm than good, both because the policies may have been harmful and because bandwagon ideas can crowd out policies or programs that are more likely to be productive.

Factors That Can Amplify or Reduce the Influence of Cognitive Biases

We speculate that additional factors in the decision-making environment and the type of available evidence increase the chance that cognitive biases influence the scale-up decision. We know of no research that has tested how these factors interact with cognitive biases. However, we believe that they should be considered in future research. Understanding what factors exacerbate the influence of cognitive biases would help us find ways to ameliorate that influence.

The decision-making environment can greatly affect the extent to which cognitive biases influence decisions about scaling up a program. Decision-making environments that encourage reflective judgment and an understanding of the potential influence of cognitive biases can reduce their influence. For example, recent research indicates that delaying a decision by even a tenth of a second significantly increases decision accuracy (Teichert, Ferrera, & Grinband, 2014). As noted earlier, meeting the 0.95 PSP threshold before scaling a program implies that the decision to scale should be based on the results of multiple independent replications, in part because the longer time period required to make the decision reduces the likelihood that cognitive bias affects the final decision.

When decision makers have a lot of autonomy, cognitive biases are likely to have more influence than when there are many checks in the decision-making process. As we described above, researchers have a great deal of autonomy in making decisions about the research process.

Checks on those decisions are supposed to occur in the peer review process, including conference and seminar presentations. However, these potentially valuable checks may sometimes also be influenced by participants' cognitive biases. Autonomy among other decision makers also raises the likelihood that cognitive biases exert influence. In schools and many social programs, practitioners have a great deal of autonomy in decisions about what programs get implemented and in the day-to-day operation of programs. This autonomy may result in scaling programs that should not be scaled because it allows practitioners to make decisions influenced by cognitive biases.

While multiple checks in the decision process may reduce cognitive biases, they may also result in decision makers becoming more entrenched in their views. When people have invested a lot of effort to develop a program, implement a pilot, and defend the results in seminars and other venues, they tend to want to protect that investment by not changing their mind about whether the program works. This can foster confirmation bias as well as status quo bias.

Common advice for avoiding confirmation bias includes establishing systems that allow for reviews of decisions, using many sources of information, and other strategies that lengthen the decision-making process. However, while these procedures might allow for more input, the additional investment may also mean that the decision is more entrenched in the end. Striking a balance between creating decision structures that allow sufficient reflection and review but are not so burdensome as to entrench ideas could be an important goal in reforming the research process.

The type of available evidence can also exacerbate or ameliorate the influence of cognitive biases (Baddeley, Curtis, & Wood, 2004). Although cognitive biases influence decisions in a way that is "irrational," high-quality information may moderate the extent of that influence. The amount and quality of available confirming evidence and the amount and quality of available disconfirming evidence (or the ratio of confirming to disconfirming evidence) can influence the extent of confirmation bias, status quo bias, and bandwagon bias among decision makers. If both confirming and disconfirming evidence is readily available, the chance for cognitive bias to influence decisions about scaling a program is high because any view can be supported by the evidence. If there are, for example, five studies with positive results and five with null (or negative) results, a person with a high prior belief in the program will favor the positive results and discount the null results. As more negative results accumulate relative to the positive results, it becomes harder to discount the negative results and cherry-pick the positive results. Different program domains may have different evidence bases, leading to more or less confirmation bias across domains. In terms of status quo bias, additional consistent information can reduce the probability that a decision will lead to a psychological loss (Iyengar & Lepper, 2000; Dean, Kibris, & Masatlioglu, 2014). In terms of bandwagon bias, having both confirming and disconfirming evidence equally available implies that neither is dominant, and there is therefore no "bandwagon" to jump on. Consequently, a way to minimize the influence of cognitive bias is to build a strong and believable research base on the efficacy of programs.

We can also build opportunities into the decision-making process to truly consider alternatives, which helps mitigate cherry-picking evidence from biased searches. Because it is hard to ignore high-quality and consistent evidence about a program, we should support repeated pilots of important and promising programs and give equal weight in the publication process to positive, negative, and insignificant results of well-powered studies. This aligns well with the recommendation from the economic model of scaling to reward researchers for replicating results as well as for designing interventions that can be independently replicated. Doing so would remove the financial incentive for a researcher to look only for evidence that confirms their priors.

Finally, although incentives are not cognitive biases, they may give rise to biases such as confirmation bias. We should remove pernicious incentives from the research process. The economic model of scaling shows us that publication bias results in a higher likelihood of scaling up false positives, and changing incentives so that null results are published can help mitigate this. This balanced increase in available evidence will help ensure that decision makers accurately adjust their PSP.

Conclusion

There is not enough data to know the extent to which cognitive biases influence the decision to scale up an intervention. However, research shows that across domains, even decision makers who believe themselves to be objective and dispassionate can be subject to cognitive biases because they are not conscious of them. Even when they are, they may have a hard time overcoming them without changes to the environment in which decisions are made. Furthermore, research has shown that in many policy domains, cognitive biases play an important role in decision making and that they can reduce the quality of decision making. This suggests that a psychologically informed understanding of the scale-up decision process that includes the roles of both researchers and non-researchers may provide evidence that improves decision making and reduces the voltage drop effect. This research should include both laboratory and field experiments to understand the influence of theoretically probable cognitive biases on decisions in the research process, the funding process, the peer review process, and the implementation process, in order to identify where in those processes cognitive biases are resulting in suboptimal decision making.

In sum, confirmation bias occurs when one looks for information that supports one's existing beliefs and rejects data that go against those beliefs; status quo bias arises from our tendency to want things to stay relatively the same; and bandwagon bias occurs because humans have a tendency to do or believe things solely because many other people do or believe them. From a practical perspective, it is important in the decision-making process not only to acknowledge that these biases exist but also to build in processes to mitigate them. As noted, building decision-making environments that encourage reflective judgment and an understanding of the potential influence of cognitive biases can reduce their influence. Crucially, it is important for the decision maker to look for ways to challenge what she thinks she sees. She could accomplish this by surrounding herself with a diverse group of people and actively soliciting dissenting views. A team of decision makers can be designed to include one or more individuals who challenge others' opinions. Or, team members can be assigned to play the role of "devil's advocate" for major decisions or to role-play situations that take into account multiple perspectives. Whether in the research or policy process, adopting these decision-check techniques at critical junctures in the decision-making process holds promise for reaching more well-rounded and less biased decisions and avoiding the pitfalls that arise when cognitive bias interferes with rational decision making.

References

Afif, Z., Islan, W. W., Calvo-Gonzalez, O., & Dalton, A. G. (2019). Behavioral science around the world: Profiles of 10 countries (English). eMBeD brief. Washington, DC: World Bank Group.

Akerlof, G., & Kranton, R. (2010). Identity economics: How our identities shape our work, wages, and well-being. Princeton University Press.

Allahverdyan, A. E., & Galstyan, A. (2014). Opinion dynamics with confirmation bias. PLoS ONE, 9(7), e99557. https://doi.org/10.1371/journal.pone.0099557

Al-Ubaydli, O., List, J. A., & Suskind, D. (2019). The science of using science: Towards an understanding of the threats to scaling experiments. National Bureau of Economic Research.

Ameriks, J., & Zeldes, S. P. (2000). How do household portfolio shares vary with age? Working paper, Columbia University and TIAA-CREF.

Angner, E. (2016). A course in behavioral economics (2nd ed.). Macmillan International Higher Education.

Ariely, D. (2008). Predictably irrational. New York, NY: HarperAudio.

Asch, S. E. (1951). Effects of group pressure upon the modification and distortion of judgment. In H. Guetzkow (Ed.), Groups, leadership and men. Pittsburgh, PA: Carnegie Press.

Babcock, L., & Loewenstein, G. (1997). Explaining bargaining impasse: The role of self-serving biases. Journal of Economic Perspectives, 11(1), 109-126.

Baddeley, M. C., Curtis, A., & Wood, R. (2004). An introduction to prior information derived from probabilistic judgements: Elicitation of knowledge, cognitive bias and herding. Geological Society, London, Special Publications, 239(1), 15-27. https://doi.org/10.1144/GSL.SP.2004.239.01.02

Baron, J. (2008). Thinking and deciding. Cambridge: Cambridge University Press.

Baumeister, R. (2003). The psychology of irrationality: Why people make foolish, self-defeating choices. The Psychology of Economic Decisions, 1, 3-16.

Berkovich, I. (2011). No we won't! Teachers' resistance to educational reform. Journal of Educational Administration, 49(5), 563-578. https://doi.org/10.1108/09578231111159548

Beshears, J., Choi, J. J., Laibson, D., & Madrian, B. (2006). Early decisions: A regulatory framework (No. w11920). National Bureau of Economic Research.

Brackett, M. A., Rivers, S. E., Reyes, M. R., & Salovey, P. (2012). Enhancing academic performance and social and emotional competence with the RULER feeling words curriculum. Learning and Individual Differences, 22(2), 218-224.

Button, K. S., Ioannidis, J. P., Mokrysz, C., Nosek, B. A., Flint, J., Robinson, E. S., & Munafò, M. R. (2013). Empirical evidence for low reproducibility indicates low pre-study odds. Nature Reviews Neuroscience, 14(12), 877.

Chambers, D., & Norton, W. (forthcoming). Sustaining impact after scaling using data and continuous feedback. In List, J., Suskind, D., & Supplee, L. H. (Eds.), The scale-up effect in early childhood and public policy. Routledge.

Cho, M. K., & Bero, L. A. (1996). The quality of drug studies published in symposium proceedings. Annals of Internal Medicine, 124(5), 485-489.

Choi, J. J., Laibson, D., Madrian, B. C., & Metrick, A. (2004). For better or for worse: Default effects and 401(k) savings behavior. In D. A. Wise (Ed.), Perspectives in the economics of aging (pp. 81-121). Chicago: University of Chicago Press.

Cohen, J. (1994). The earth is round (p < .05). American Psychologist, 49, 997-1003.

Davidson, R. A. (1986). Source of funding and outcome of clinical trials. Journal of General Internal Medicine, 1(3), 155-158.

Dawson, E., Gilovich, T., & Regan, D. T. (2002). Motivated reasoning and performance on the Wason selection task. Personality and Social Psychology Bulletin, 28(10), 1379-1387.

Dean, M., Kibris, O., & Masatlioglu, Y. (2014). Limited attention and status quo bias (Working Paper No. 2014-11). Brown University, Department of Economics, Providence, RI.

Eidelman, S., & Crandall, C. S. (2014). The intuitive traditionalist: How biases for existence and longevity promote the status quo. In Advances in experimental social psychology (Vol. 50, pp. 53-104). Academic Press.

Fernandez, R., & Rodrik, D. (1991). Resistance to reform: Status quo bias in the presence of individual-specific uncertainty. The American Economic Review, 81(5), 1146-1155.

Francis, G. (2013). Replication, statistical consistency, and publication bias. Journal of Mathematical Psychology, 57(5), 153-169.

Friedberg, M., Saffran, B., Stinson, T. J., Nelson, W., & Bennett, C. L. (1999). Evaluation of conflict of interest in economic analyses of new drugs used in oncology. JAMA, 282(15), 1453-1457.

Gelman, A., & Carlin, J. (2014). Beyond power calculations: Assessing type S (sign) and type M (magnitude) errors. Perspectives on Psychological Science, 9(6), 641-651.

Gelman, A., & Loken, E. (2014). The statistical crisis in science: Data-dependent analysis--a "garden of forking paths"--explains why many statistically significant comparisons don't hold up. American Scientist, 102(6), 460-466.

Gigerenzer, G. (1996). On narrow norms and vague heuristics: A reply to Kahneman and Tversky. Psychological Review, 103(3), 592-596.

Gilovich, T. (1983). Biased evaluation and persistence in gambling. Journal of Personality and Social Psychology, 44(6), 1110-1126. https://doi.org/10.1037/0022-3514.44.6.1110

Gilovich, T., Griffin, D., & Kahneman, D. (2002). Heuristics and biases: The psychology of intuitive judgment. Cambridge University Press.

Goldsmith, L. A., Blalock, E. N., Bobkova, H., & Hall, R. P., III. (2006). Picking your peers. The Journal of Investigative Dermatology, 126(7), 1429-1430. https://doi.org/10.1038/sj.jid.5700387

Halpern, S. D., Loewenstein, G., Volpp, K. G., Cooney, E., Vranas, K., Quill, C. M., ... & Arnold, R. (2013). Default options in advance directives influence how patients set goals for end-of-life care. Health Affairs, 32(2), 1-10.

Haselton, M. G., Nettle, D., & Murray, D. R. (2016). The evolution of cognitive bias. In D. M. Buss (Ed.), The handbook of evolutionary psychology: Integrations (pp. 968-987). John Wiley & Sons.

Hastorf, A. H., & Cantril, H. (1954). They saw a game; a case study. The Journal of Abnormal and Social Psychology, 49(1), 129.

Henshel, R. L., & Johnston, W. (1987). The emergence of bandwagon effects: A theory. Sociological Quarterly, 28(4), 493-511.

Hershfield, H. E. (2011). Future self-continuity: How conceptions of the future self transform intertemporal choice. Annals of the New York Academy of Sciences, 1235, 30-43. https://doi.org/10.1111/j.1749-6632.2011.06201.x

Ioannidis, J. P. (2005). Why most published research findings are false. PLOS Medicine, 2(8), e124.

Iyengar, S. S., & Lepper, M. R. (2000). When choice is demotivating: Can one desire too much of a good thing? Journal of Personality and Social Psychology, 79(6), 995-1006.

Johnson, E. J., Hershey, J., Meszaros, J., & Kunreuther, H. (1993). Framing, probability distortions, and insurance decisions. Journal of Risk and Uncertainty, 7(1), 35-51. https://doi.org/10.1007/BF01065313

Jonas, E., Schulz-Hardt, S., Frey, D., & Thelen, N. (2001). Confirmation bias in sequential information search after preliminary decisions: An expansion of dissonance theoretical research on selective exposure to information. Journal of Personality and Social Psychology, 80(4), 557.

Kahan, D. M., Peters, E., Dawson, E. C., & Slovic, P. (2013). Motivated numeracy and enlightened self-government (Cultural Cognition Project Working Paper No. 116). New Haven, CT: Yale Law School.

Kahneman, D. (2011). Thinking, fast and slow. New York, NY: Farrar, Straus and Giroux.

Kahneman, D., Knetsch, J. L., & Thaler, R. H. (1991). Anomalies: The endowment effect, loss aversion, and status quo bias. Journal of Economic Perspectives, 5(1), 193-206.

Kahneman, D., & Tversky, A. (1972). Subjective probability: A judgment of representativeness. Cognitive Psychology, 3(3), 430-454.

Kahneman, D., & Tversky, A. (1979). Prospect theory: An analysis of decision under risk. Econometrica, 47(2), 263-292.

Kalil, A., Mayer, S. E., & Gallegos, S. (2020). Using behavioral insights to increase attendance at subsidized preschool programs: The Show Up to Grow Up intervention. Organizational Behavior and Human Decision Processes. Forthcoming.

Kleck, R. E., & Wheaton, J. (1967). Dogmatism and responses to opinion-consistent and opinion-inconsistent information. Journal of Personality and Social Psychology, 5(2), 249.

Krimsky, S. (2013). Do financial conflicts of interest bias research? An inquiry into the "funding effect" hypothesis. Science, Technology, & Human Values, 38(4), 566-587.

Kunda, Z. (1990). The case for motivated reasoning. Psychological Bulletin, 108(3), 480.

Lee, C. J., Sugimoto, C. R., Zhang, G., & Cronin, B. (2013). Bias in peer review. Journal of the American Society for Information Science and Technology, 64(1), 2-17.

Madrian, B. C., & Shea, D. F. (2001). The power of suggestion: Inertia in 401(k) participation and savings behavior. The Quarterly Journal of Economics, 116(4), 1149-1187.

Mahoney, M. J. (1977). Publication prejudices: An experimental study of confirmatory bias in the peer review system. Cognitive Therapy and Research, 1(2), 161-175.

Maniadis, Z., Tufano, F., & List, J. A. (2014). One swallow doesn't make a summer: New evidence on anchoring effects. American Economic Review, 104(1), 277-290.

Mayer, S. E., Kalil, A., Oreopoulos, P., & Gallegos, S. (2019). Using behavioral insights to increase parental engagement: The Parents and Children Together intervention. Journal of Human Resources, 54(4), 900-925.

McCormack, J., & Greenhalgh, T. (2000). Seeing what you want to see in randomised controlled trials: Versions and perversions of UKPDS data. BMJ, 320(7251), 1720-1723.

Mihalic, S. F., Fagan, A. A., & Argamaso, S. (2008). Implementing the LifeSkills Training drug prevention program: Factors related to implementation fidelity. Implementation Science, 3(1), 5.

Moshinsky, A., & Bar-Hillel, M. (2010). Loss aversion and status quo label bias. Social Cognition, 28(2), 191-204. https://doi.org/10.1521/soco.2010.28.2.191

Nisbett, R. E. (2015). Mindware: Tools for smart thinking. New York, NY: Farrar, Straus and Giroux.

Phillips, C. J. (2015). The new math. University of Chicago Press.

Pichert, D., & Katsikopoulos, K. V. (2008). Green defaults: Information presentation and pro-environmental behaviour. Journal of Environmental Psychology, 28(1), 63-73.

Polites, G. L., & Karahanna, E. (2012). Shackled to the status quo: The inhibiting effects of incumbent system habit, switching costs, and inertia on new system acceptance. MIS Quarterly, 36(1), 21-42.

Quattrone, G. A., & Tversky, A. (1988). Contrasting rational and psychological analyses of political choice. American Political Science Review, 82(3), 719-736.

Ringwalt, C. L., Ennett, S., Johnson, R., Rohrbach, L. A., Simons-Rudolph, A., Vincus, A., & Thorne, J. (2003). Factors associated with fidelity to substance use prevention curriculum guides in the nation's middle schools. Health Education & Behavior, 30(3), 375-391.

Rochon, P. A., Gurwitz, J. H., Simms, R. W., Fortin, P. R., Felson, D. T., Minaker, K. L., & Chalmers, T. C. (1994). A study of manufacturer-supported trials of nonsteroidal anti-inflammatory drugs in the treatment of arthritis. Archives of Internal Medicine, 154(2), 157-163.

Rohrbach, L. A., Graham, J. W., & Hansen, W. B. (1993). Diffusion of a school-based substance abuse prevention program: Predictors of program implementation. Preventive Medicine: An International Journal Devoted to Practice and Theory.

Samuelson, W., & Zeckhauser, R. (1988). Status quo bias in decision making. Journal of Risk and Uncertainty, 1(1), 7-59.

Sanders, M., Halpern, D., & Service, O. (2013). Applying behavioural insights to charitable giving. London: Cabinet Office and Behavioural Insights Team.

Simmons, J. P., Nelson, L. D., & Simonsohn, U. (2011). False-positive psychology: Undisclosed flexibility in data collection and analysis allows presenting anything as significant. Psychological Science, 22(11), 1359-1366.

Sunstein, C. R. (2014). Why nudge? The politics of libertarian paternalism. Yale University Press.

Suri, G., Sheppes, G., Schwartz, C., & Gross, J. J. (2013). Patient inertia and the status quo bias: When an inferior option is preferred. Psychological Science, 24(9), 1763-1769.

Teichert, T., Ferrera, V. P., & Grinband, J. (2014). Humans optimize decision-making by delaying decision onset. PLOS ONE, 9(3).

Terhart, E. (2013). Teacher resistance against school reform: Reflecting an inconvenient truth. School Leadership & Management, 33(5), 486-500. https://doi.org/10.1080/13632434.2013.793494

Thaler, R. H., & Sunstein, C. R. (2008). Nudge: Improving decisions about health, wealth, and happiness. New Haven, CT: Yale University Press.

Trotter, G. (2002). Why were the benefits of tPA exaggerated? The Western Journal of Medicine, 176(3), 194-197.

Tversky, A., & Kahneman, D. (1974). Judgment under uncertainty: Heuristics and biases. Science, 185(4157), 1124-1131.

Vaganay, A. (2016). Outcome reporting bias in government-sponsored policy evaluations: A qualitative content analysis of 13 studies. PLOS ONE, 11(9). https://doi.org/10.1371/journal.pone.0163702

Van der Meer, T. W., Hakhverdian, A., & Aaldering, L. (2016). Off the fence, onto the bandwagon? A large-scale survey experiment on the effect of real-life poll outcomes on subsequent vote intentions. International Journal of Public Opinion Research, 28(1), 46-72.

Wason, P. C. (1960). On the failure to eliminate hypotheses in a conceptual task. Quarterly Journal of Experimental Psychology, 12(3), 129-140.

Wason, P. C. (1968). Reasoning about a rule. Quarterly Journal of Experimental Psychology, 20(3), 273-281.

World Bank. (2015). World development report 2015: Mind, society, and behavior. World Bank.

York, B. N., & Loeb, S. (2014). One step at a time: The effects of an early literacy text messaging program for parents of preschoolers (No. w20659). National Bureau of Economic Research.

Zuckerman, M. (1979). Attribution of success and failure revisited, or: The motivational bias is alive and well in attribution theory. Journal of Personality, 47(2), 245-287.