
The Economist as Plumber


Citation Duflo, Esther. “The Economist as Plumber.” American Economic Review 107, no. 5 (May 2017): 1–26. 2018 © American Economic Association

As Published http://dx.doi.org/10.1257/AER.P20171153

Publisher American Economic Association

Version Final published version

Citable link http://hdl.handle.net/1721.1/114020

Terms of Use Article is made available in accordance with the publisher's policy and may be subject to US copyright law. Please refer to the publisher's site for terms of use.

American Economic Review: Papers & Proceedings 2017, 107(5): 1–26
https://doi.org/10.1257/aer.p20171153

RICHARD T. ELY LECTURE

The Economist as Plumber †

By Esther Duflo*

* Department of Economics, MIT, 77 Massachusetts Avenue, Building E52-544, Cambridge, MA 02139. This essay is based on the Ely lecture, which I delivered at the AEA meeting in January 2017. Many thanks to Alvin Roth, both for his invitation to give this lecture, and for his license to get our hands dirty with policy work. I thank and Cass Sunstein, master plumbers, for their encouragements, for several conversations that shaped this lecture, and for detailed comments on a first draft. David Atkin, Robert Gibbons, Parag Pathak, and , provided useful comments. Vestal McIntyre gave wonderful and detailed comments on style; all awkwardness remains mine. Laura Stilwell provided excellent research assistance. Plumbing is not a solitary activity, and I would like to acknowledge the many people who have plumbed and discussed with me over the years, notably: Abhijit Banerjee, Rukmini Banerji, James Berry, Raghabendra Chattopadhyay, Iqbal Dhaliwal, Rachel Glennerster, Michael Greenstone, , Clement Imbert, , Shobhini Mukherjee, Karthik Muralidharan, Nicholas Ryan, Rohini Pande, Benjamin Olken, Anna Schrimpf, and Michael Walton.

† Go to https://doi.org/10.1257/aer.p20171153 to visit the article page for additional materials and author disclosure statement.

Economists are increasingly getting the opportunity to help governments around the world design new policies and regulations. This gives them a responsibility to get the big picture, or the broad design, right. But in addition, as these designs actually get implemented in the world, this gives them the responsibility to focus on many details about which their models and theories do not give much guidance.

There are two reasons for this need to attend to details. First, it turns out that policymakers rarely have the time or inclination to focus on them, and will tend to decide on how to address them based on hunches, without much regard for evidence. Figuring all of this out is therefore not something that economists can just leave to policymakers after delivering their report: if they are taking on the challenge to influence the real world, not only do they need to give general prescriptions, they must engage with the details. Second, details that we as economists might consider relatively uninteresting are in fact extraordinarily important in determining the final impact of a policy or a regulation, while some of the theoretical issues we worry about most may not be that relevant. This sentiment is well summarized by Klemperer (2002, p. 170), who presents his views on what matters for practical auction design, based on his own experience designing them and advising bidders: “in short,” he writes, “good auction design is mostly good elementary economics, [whereas] most of the extensive auction literature is of second-order importance for practical auction design.”

It seems appropriate to open an essay on plumbing with an actual plumbing example that illustrates the two points I made above (Devoto et al. 2012). Many cities in the developing world seek to improve citizens’ access to home water connections. Even when there are public taps, urban households without a connection at home spend several hours per week collecting water, and this burden causes them considerable stress and tension. The typical policy for improving water access is to build the necessary infrastructure, and then to encourage “end of pipe” connections through subsidized tariffs and/or subsidized loans. In 2007 in Tangier, a firm called Amendis (the local subsidiary of Veolia Environnement), which was in charge of the water and sanitation for the city, had spent considerable resources building large pipes and installing toilets in each house, and, in collaboration with the city, was offering interest-free loans to poor households to make it possible to cover the marginal cost of new water connections. But take-up of the subsidized loan program was very low (less than 10 percent); applying for the program required a trip to the municipal office with supporting documents, and this proved a real barrier. When the research team randomly visited households and offered procedural assistance by photocopying the required documents at home and delivering them to the municipal office, take-up of the loan and the water connections increased to 69 percent. As a result of this small additional expenditure, poor residents of Tangier gained access to water, and therefore recovered a considerable amount of time to do other things. They were ultimately much happier and less stressed, despite a large increase in their water bill.

Improving access to private water connections was a sensible policy idea, and the entire effort was broadly well designed. But the lack of attention to the very last step (the administrative steps to sign up) had been preventing this large investment in both physical and financial infrastructure from paying off.

These kinds of practical design questions are ubiquitous in policy design. When new frontline workers are hired to implement a program, will emphasizing wage and career prospects discourage publicly minded individuals or encourage the most talented to join? When thinking about an immunization policy, should policymakers assume that parents understand the full costs and benefits of immunization and rationally internalize them, or assume they may be ill informed and/or present biased? When designing the health exchanges for the Affordable Care Act, should the health-plan options be labeled with precious metal (platinum, gold, or bronze) or will that inadvertently bias the participant choice toward one type of plan versus the other? At the outset, it will be difficult to know the answer to these questions, especially when the policy problem is fairly new.

Paying attention to the details of policy requires a mindset that is slightly different from that which graduate school instills in economists. Banerjee (2007, pp. 161–162) summarizes the reluctance of economists to engage with those details very well in his essay “Inside the Machine.” Economists, he writes, tend to think in “machine mode”: they want to find out the button that will get the machine started, the root cause of what makes the world go round. He writes:

“The reason we like these buttons so much, it seems to me, is that they save us the trouble of stepping into the machine. By assuming that the machine either runs on its own or does not run at all, we avoid having to go looking for where the wheels are getting caught and figuring out what small adjustments it would take to get the machine to run properly. To say that we need to move to a voucher system does not oblige us to figure out how to make it work—how to make sure that parents do not trade in the vouchers for cash (because they do not attach enough value to their children’s education) and that schools do not take parents for a ride (because parents may not know what a good education looks like). And how to get the private schools to be more effective? After all, at least in India, even children who go to private schools are nowhere near grade level. And many other messy details that every real program has to contend with.”

When we are concerned with such details, ex ante, we will have some priors on what features will be important, and this guides our first-pass attempts at design. But it is not clear that either the policymakers or the scientists will correctly identify the most important choices. Those may not have been the focus of either practical or theoretical attention, and thus may have been completely ignored. So an economist who cares about the details of policy implementation will need to pay attention to many details and complications, some of which may appear to be far below their pay grade (e.g., the font size on posters) or far beyond their competence level (e.g., the intricacy of government budgeting in a federal system). It will sometimes appear that the extensive training they received is underused if, as Klemperer notes, the theoretical complexities turn out to be second order. On the other hand, they will have a chance to apply their economist’s mind, since many of the details have implications for issues that are an economist’s bread and butter: incentives, information, imperfect rationality, etc. They will also need to be very observant, and keep a close eye on the impact of any change they recommend.

Inspired by this exhortation to go inside the machine, Alvin Roth’s image of the economist as engineer, and Banerjee’s (2002) image of the economist as an experienced craftsman,1 I label this detail-focused approach as the “plumbing” mindset. The economist-plumber stands on the shoulders of scientists and engineers, but does not have the safety net of a bounded set of assumptions. She is more concerned about “how” to do things than about “what” to do. In the pursuit of good implementation of public policy, she is willing to tinker. Field experimentation is her tool of choice.

1 Banerjee in fact mentioned plumbers, in defending the reputation that economists will provide good advice because, like plumbers, they care about their reputation.

What I will try to argue in this lecture is not that all of us should be plumbers, or even that any of us should be plumbers all the time; instead, I will try to show that there is value for economists to take on some plumbing projects, in the interest of both society and our discipline.

I. Scientists, Engineers, Plumbers

A. Definitions

Alvin Roth’s seminal 1999 Fisher-Schultz lecture (Roth 2002) invited economists to adopt an “engineering” approach to their craft. Economists, he pointed out, are increasingly called upon, not just to analyze real-world institutions, but also to design them. Roth’s focus is the design of markets, but economists are also called upon to help design incentive schemes for firms and regulation and social policies for governments. In this paper, I will consider the role of economists in the design of policies and regulation.

For Roth (2002, p. 1341), intervening in the real world should fundamentally alter the attitude of the economist and her way of working. He sets the tone in the abstract of the paper (emphasis added):

“Market design involves a responsibility for detail, a need to deal with all of a market’s complications, not just its principle features. Designers therefore cannot work only with the simple conceptual models used for theoretical insights into the general working of markets. Instead, market design calls for an engineering approach.”

The scientist provides the general framework that guides the design. In Roth’s Fisher-Schultz lecture, which again is focused on market design, game theory provides the general principles. For all the other domains where economists will be called to provide inputs, there exists a relevant body of theory (or at least general insight) that they can use to guide design.

The engineer takes these general principles into account, but applies them to a specific situation. This requires careful attention to the details of the environment being studied, but also new tools: the economist-engineer cannot shrug off the fact that a particular situation is not covered by the assumptions of the theorem, and cannot ask agents to change their preferences so the assumptions hold. If the specific real-world problem at hand cannot be solved analytically, then she will reach for other tools—in particular computation and laboratory experiments—and will simulate the behavior of a market.

For example, in the matching for doctors who are applying for their first residency to a hospital, which is the focus of Roth (2002), the simple matching theory does not accommodate the fact that some of these new doctors come as married couples, and need to be assigned to the same town. Roth refined the theory to accommodate married couples, but at that point, he could not solve the problem analytically: in particular, theory suggests that, with couples, there could be a situation without stable matching, and that the sequencing of the decision could in principle affect results. So Roth and his colleagues used computation to design an algorithm, and examine the impact of different rules (using data from previous years), including the potential impact of sequencing. The computations suggested that the algorithm never failed to converge to a stable match, and that sequencing effects were small and unsystematic. This allowed them to suggest a matching algorithm that would work (and has worked in practice), even though the theory for why couples were probably not a big problem after all had not been fully developed.

The plumber goes one step further than the engineer: she installs the machine in the real world, carefully watches what happens, and then tinkers as needed. At the time she inherits the machine, the broad goals are clear, but many details still need to be worked out. The fundamental difference between an engineer and a plumber is that the engineer knows (or assumes she knows) what the important features of the environment are, and can design the machine to address these features—in the abstract, at least. There may not be a theory fully worked out to accommodate these features, but she can use computation and lab experiments to simulate how they will play out. When the plumber fits the machine, there are many gears and joints, and many parameters of the world that are difficult to anticipate and will only become known once the machine grinds into motion. The plumber will use a number of things—the engineering design, his understanding of the context, prior experience, and the science to date—to tune every feature of the policy as well as possible, keeping an eye on all the relevant details as best he can. But with respect to some details, there will remain genuine uncertainty about the best way to proceed, because the solution depends on a host of factors he cannot easily quantify, or sometimes even identify, in the abstract. (These are the “unknown unknowns”: all the issues we can’t predict but will arise anyway.) Thus, another difference between plumbers and engineers is that engineers will start from the outcome they are seeking to attain and engineer the machine to reach it. Plumbers, on the other hand, will have to adopt a more tentative approach, starting from the machine’s characteristics and identifying their effect (compared to another possible set of choices).

For example, in the early 2000s the city of Boston decided to change how it assigned students to schools. There was an important engineering part to choosing the mechanism that would be used, initiated by Abdulkadiroglu and Sönmez (2003) and followed by a considerable literature (see Pathak 2011 and Abdulkadiroglu and Sönmez 2013 for reviews). But once city leaders settled on a mechanism, they still had to make many decisions. How to communicate the change to parents? How to persuade them that they could reveal their true preferences when they ranked schools? Should there be a limit on how many schools parents can (or are required to) rank? Should parents living near a school receive preferential treatment, and how should that be set up? And so on. According to Pathak (forthcoming, p. 3), who reviewed his experience working in several cities and who modeled his conclusion on Klemperer’s:

“What really matters for school choice market design are basic insights about straightforward incentives, transparency, avoiding inefficiency through coordination of offers and well-functioning aftermarkets, and influencing inputs to the design, including applicant decision-making and the quality of schools. Some of the issues examined in the extensive theoretical literature on school choice matching market design are less important for practical design.”

What is more, we may not even know ex ante which of these decisions will in fact matter, and our models offer fairly limited guidance on what to pay attention to. Pathak (forthcoming, p. 3) writes: “I will discuss a handful of issues examined in the theoretical literature on matching mechanisms that have proven to be first-order. It’s worth emphasizing that it is only with the benefit of several design case studies that we’re beginning to understand which issues are quantitatively important.”

This uncertainty—concerning what the true model is—has consequences for policy engineering very similar to the problems discussed in the macroeconomics literature on “robust” policy (Cogley et al. 2008). Since we don’t know what the true model is, we need to design policies that are as robust as possible to this model uncertainty. For example, Chetty (2015) argues that public policy that includes “nudges” (such as default) is more robust, in the Hansen and Sargent sense, to a specific model uncertainty (are people rational or not?): rational people will choose what they want, and “behavioral” agents will be in default that should largely work for them. A smart nudging policy is thus a fairly robust policy, in general. It will have an effect only if people suffer from behavioral biases, but have no effect otherwise.

There is no general theory of how to design policy under this kind of model uncertainty, however, and in many cases, even the best educated guess will still be just that, a guess. The economist-plumber will use all they know (including model uncertainty) to come up with the best guess possible, and then pay careful attention to what happens in reality. The uncertainty in the environment creates a highly stochastic world: the natural way to “pay attention” to what happens, as I will argue below, is thus to analyze natural experiments or set up field experiments to try out different plumbing possibilities.

B. Designing the Taps, Laying Down the Pipes: Two Examples

Although I will often talk about details as a general way to describe the handiwork of plumbers, there are two different kinds of plumbing in policy design. First is the “design of the tap” work: taking care of apparently irrelevant details, such as the way the policy is communicated or the default options offered to customers. Second is “layout of the pipes” work: important logistical decisions that are fundamental to the policy’s functioning but often treated as purely mechanical, such as the way money flows from point A to point B, or which government worker has sign-off on what decisions.

Banerjee et al. (2015) worked with the Indonesian government on a “design of the tap” problem, in the context of the Raskin program, a subsidized rice distribution scheme that, with an annual budget of US$1.5 billion and a targeted population of 17.5 million households, is Indonesia’s largest targeted transfer program. The Raskin program is plagued with corruption: the data in Banerjee et al. show that beneficiary households pay more than they should and receive less than their full quota of rice, to the extent that they receive only about one-third of the subsidy they are entitled to, overall. Working with the government of Indonesia, the researchers designed a set of field experiments to provide information directly to eligible households. In 378 villages (randomly selected from among 572 villages spread over 3 provinces), the central government mailed “Raskin identification cards” to eligible households to inform them of their eligibility and the quantity of rice that they were entitled to. In addition, the team varied several features of the design of the cards and the way they were introduced: whether the copay price was also listed on the card, whether information about all the beneficiaries was posted in the village, whether cards were sent to all eligible households or only to a subset, and whether the cards came with coupons to acknowledge the delivery of the rice.

A first plumbing issue is whether or not households got a physical card. Traditionally, households had not gotten one—the program was designed to be implemented by local leaders, who were given considerable discretion, and any effort at information dissemination had focused primarily on them, not on citizens. The optimistic policymaker who initially designed this machine clearly envisioned a beneficent local administration dedicated to the good of the people. The very specific design features of the cards and their distribution are further plumbing issues, and would normally have been ignored, even after the government had been convinced that sending cards was a good idea. Someone (a graphic designer maybe?) would have made a decision on whether or not to list on the card the price to be paid for rice. Distribution of the cards would have been up to the local leaders, and so on. But there are good reasons to think that each of these features matters. The price to be paid for rice is one piece of information that recipients themselves do not have, so local leaders can easily skim off the program by inflating the copay. Making the list widely available will create communal knowledge, potentially changing the nature of the bargaining game between the local leader and the citizen. Distributing physical cards may not even be necessary, if lists of those eligible are posted in the village.2

2 In a different setting, the government of India made the conscious decision not to necessarily issue an identity card when it established biomarker-linked unique identifiers for all Indians.

In practice, distributing cards made a large difference: on net, across all of the variations of the program, the cards led to a large increase in subsidy received by eligible households. Eligible households in treatment villages received a 26 percent increase in subsidy, stemming from both an increase in quantity and a decrease in the copay price. This occurred despite imperfect implementation: eligible households in treatment villages were only 30 percentage points more likely to have received a card relative to the control. Moreover, several of the variations did in fact matter: gains were substantially larger when the information was public and prominent, when the prices were displayed on the card, and when people actually got a physical card. The coupons feature did not seem to be particularly relevant. This piece of plumbing research was influential: since the success of the cards was blatant, the government decided to scale them up and immediately started distributing them to everyone, for Raskin and other government transfer programs. In total more than 60 million beneficiaries got cards. The upside of the fact that plumbing is not a prime focus of governments is that when you hand them a good plumbing solution, it stands a better chance of being widely adopted than when you present a brand new program, which is prone to come up against someone’s ideology.

Banerjee, Duflo, Imbert, Mathew, and Pande (2016) focus on the hidden pipes. In many governments worldwide, some programs (such as local workfare programs or infrastructure building) are financed centrally but implemented by local communities. This causes a basic plumbing problem: how to transfer money from the center to the outposts, which are normally strapped for cash and unable to front the expenses for the programs. Traditionally, due to slow transmission both of financial orders and information, the only practical way to transfer funds is to give the local community an advance and, after they provided proof of how they used it, give them another advance. But this system is fraught with difficulties. First, there is a basic financial management issue: some money lies idle in one locale, whereas another might lack the money. The advances need to be large enough to avoid paralyzing the program, but will add up to the government’s expenditure. Second, and perhaps more importantly, the ex post justification of expenditures is often delayed and patchy, making auditing and accounting difficult. This creates opportunities for leakage and embezzlement, and the monitoring structure that is put in place to control it may become part of the problem, creating its own red tape, inefficiencies, and leakage opportunities (Banerjee 1997).

We had the chance to experiment with a system that was largely the brainchild of Santhosh Mathew, one of the coauthors of the study. Mathew is a rare creature—a bureaucrat-plumber. A career civil servant with a master’s in economics and a PhD in development studies, Mathew is deeply concerned with the pragmatic issues of any program he is involved with. In the context of the MGNREGS, a public workfare program financed by the Federal government but implemented at the village level, a unified financial platform was put in place in Mathew’s adopted state of Bihar, whereby the money transfer could be made directly to the local government from the bank in the state capital. This solved half of the problem, that of the transmission of money. The other half (transmission of information on who should get money) was originally not solved. Figure 1 shows the flow of authorizations under the traditional system: the local government (panchayat) transmits a request for funds to the next administrative level, the block, which ratifies it and transmits it to the next level up, the district, which ratifies it and enters it in the financial management platform, which generates an invoice to the Central Bank of India, leading to a transfer right back down onto the panchayat account.

Figure 1. Flow of Authorizations under the Traditional System (diagram labels: Panchayat, Block, District, CPSMS access, State pool in Central Bank of India, Panchayat savings account; numbered steps 1 to 5 cover the fund request and the fund transfer)

Mathew designed a reform to this system. The Panchayat could now log directly into the financial management platform, enter an invoice for a specific activity (“we have hired Mrs. Kumar for 14 days”), and receive a corresponding payment on a “zero balance account,” which they would immediately use to pay Mrs. Kumar. We had the opportunity to design and implement a large-scale experiment to evaluate the impact of this reform. We worked in 12 districts, in 1,000 treatment and 2,000 control local governments, covering a population of 30 million people.

The main benefit that Mathew foresaw in this system was that it would remove the uncertainty on whether and when money would come, and he believed that it would empower local governments to be more proactive in implementing the program. As it turned out, unforeseen events meant that this could not happen in this context. A few days after the program was launched, the central government cut Bihar’s MGNREGS funds, due to a dispute on data entry, and massive uncertainty ensued on when the money would be available again. This took five months to be resolved, and since it happened just after the introduction of the system, many local officials (wrongly) believed the new uncertainty they experienced on whether money would come was the new system’s fault.

The plumbing reform, however, changed something else: the reformed system was considerably more transparent than the status quo. The local official was putting his signature on a document where he said that he had recruited specific people. This made it easy for auditors to go look for them, and punish the officials if they did not exist or had not worked for the program (many of the workers in MGNREGS are “ghost workers”). Although this was not the primary motivation of the reform, basic economics suggested that this could change the behavior of the local officials. We looked for evidence of a reduction in corruption, and we found plenty.

First, we saw a reduction of about 22 percent in the expenditures on the program, without apparent change in the number of workdays in an independent survey we conducted. Since the survey data could be noisy and miss a real decline in employment, we looked for smoking guns, and found two: first, we matched all the names reported in the public database for the program to a census that was conducted at the same time. We estimated that a match was less likely to be found for someone in the public database in the control group than in the treatment group, indicating that they were more likely to be a ghost worker. Second, we examined the wealth declarations of the Panchayat officials, and found a significant reduction in the median wealth (though we find no impact at the top and the average effect is insignificant).

Most anticorruption drives focus on explicit anticorruption tools (such as audits) or on incentives, but not on the way that the program is structured. This example showed that structure is not only relevant, but quantitatively important: during the course of the experiment, the reduction in expenditure alone saved the government $6 million. And as in the case of the Raskin program in Indonesia cited above, the experiment was actually instrumental in getting this program scaled up across all states, and then for the financial transfers in other types of program. Initially, before the corruption result came out, the program was canceled in Bihar: faced with a reduction in expenditure and the objections of the local officials, who for obvious reasons were not fans of the program, Mathew’s successor decided to cancel it. But after it had become apparent that the reduction in expenditure stemmed mostly from a reduction in corruption, the story changed: when the Prime Minister was keen to show that his government was doing something to improve the implementation of public programs, this particular reform seemed uncontroversial enough for top bureaucrats—a “technical reform” they could get behind relatively easily. The program was scaled up to MGNREGS. It is now poised to be scaled up to other kinds of centrally sponsored schemes and has the potential to affect billions of dollars in central government spending.

II. Why Policymakers Need the Plumbers

It seems quite natural for policymakers to rely on economist-engineers to design complicated policies and institutions. There is nothing obvious about designing a school choice mechanism, an auction, or an exchange market for kidneys. Economic theory has developed the tools to think about those questions, and it stands to reason that policymakers would ask for economists’ guidance, very much like economists provide advice on macroeconomic policy. It may however be less clear that policymakers need plumbers, or that economists would be particularly well suited to play this role. This section argues that policymakers do need plumbers. The next section will argue that our discipline prepares us well to think about many of the plumbing problems that they face.

Abhijit Banerjee, Gita Gopinath, and I recently spent a few hours with bureaucrats and consultants from the health department in the state of Kerala, in the south of India. Kerala is one of India’s most successful states, at least as far as the social sector is concerned. It is starting to face first-world problems: an aging population, high prevalence of hypertension, diabetes, and obesity. The bureaucracy and the elected leaders are committed to trying to do something about both rising health care costs and noncommunicable disease prevention. Their best idea is to reform the public primary health care system in order to make it more attractive to customers (who for the most part seek their health care in the private sector, like elsewhere in India), and to instate better practices of prevention, management, and treatment. They would try out a new organization of the health care sector. Nurses, volunteers from the local governments, and doctors would work seamlessly in a health team that would be in charge of keeping the population healthy, with a heavy focus on encouraging lifestyle changes and preventative activities. They are currently planning to try the system in 152 health centers.

Though the Additional Chief Secretary (top bureaucrat in charge of health) had invited us to the meeting, he was called away to deal with a doctors’ strike around the time the conversation turned to the specifics of the reform. He handed us over to a retired professor and a retired doctor, who have been charged with designing the specifics of the policy. This in itself is symptomatic: top policymakers usually have absolutely no time for figuring out the details of a policy plan, and delegate it to “experts.” In our conversation, we started to push on some specific questions on the model that they had in mind: why would patients pay attention to a nurse, given that until now they have only taken doctors seriously? Were they really sure that if the nurse started to take blood pressure and fill prescriptions, this would give her the authority she would need to dispense advice? Or that doctors would be willing and able to signal that nurses were to be respected, in a system that has always been heavily hierarchical? For that matter, did the planners really think it was going to be possible for health care professionals to spend a lot more time on public health and prevention when there were only 2 doctors for every 30,000 people?

What was striking was, not only did they not have any answer to these questions, but they showed no real interest in even entertaining them. Whenever we asked them to spell out what they thought their policy lever was (as opposed to their aspiration), the stock answer was that they did not really have one, that the local governments and medical officers could not be forced to do anything. Maybe a village committee would need to decide to organize yoga classes, but another one would not, so there was really no way to find out what really worked. This was entirely beside the point, since

of obesity to be built in communities, etc.). Interestingly enough, it had nothing to do with the health care reform that we had discussed that morning. It appeared that the details would have to wait for another day.

I have since learned to avoid these kinds of meetings in general, but this encounter reminded me of my early days as a plumber. It turns out that most policymakers, and most bureaucrats, are not very good plumbers.

Part of this is because, contrary to most economists’ generous beliefs, agents in general are not necessarily very skilled at what they do. This is the central argument of Banerjee (2002) in favor of normative economics. He points out that agents have not necessarily had the opportunity to experiment, even in their main line of trade. For example, farmers may not have stumbled on the best mix of fertilizer, because once they have something that works well enough, it may be optimal to stop experimenting. Even if they are inclined to experiment, the results may not be informative if the outcome data is very noisy, and learning from neighbors is difficult. This leaves a lot of scope for specialized expertise, including that of economists.
Banerjee focuses the presence or absence of yoga class would on the example of microcredit, and the fact that, have been an (outcome of )what they could do at before the microfinance movement, bankers did the central government train local governments not figure out how they could use some basic in the fundamental of public( health, for exam- principles of mechanism design to make it pos- ple . But they seemed to have no understanding sible to lend to the poor. of )a causal chain going from policy design to Hanna, Mullainathan, and Schwartzstein implementation and final outcome. Their posi- 2014 emphasize another barrier to learning: tion oscillated between presenting the illusion of the( tendency,) even for experienced agents, not to the perfect system, and presenting the illusion notice some of the important features of the envi- of complete powerlessness in the face of local ronment. They test a model of selective attention power and initiative. with seaweed farmers in Indonesia, which could We tried, and failed, to engage them on the just as well apply to the plight of policy practi- details of policy. Not only did they have no tioners. Seaweed is farmed by attaching strands understanding of plumbing issues, there was not pods to lines in the ocean. A large number of even a realization that plumbing was an issue dimensions( ) affect yield e.g., length of the line, at all. When the top bureaucrat popped in and optimal distance between( pods, optimal length was apprised of the conversation, it was decided of a cycle, size of the pod . The farmers have that we would be shown some details. We went limited attention, and can only) attend to a lim- away to another meeting and came back after ited number of dimensions. They allocate their three hours. They had set up the projector, a attention optimally, but a feedback loop may sign that things were about to get serious. 
They occur: if they start with the prior that something displayed a power point with each of the new is not important, then they don’t pay any atten- UN Sustainable Development Goals, and a list tion to it, and they may never notice that it is of proposals to achieve them in Kerala. These in fact important. The authors present experi- amounted to a long, meritorious, and likely mental evidence that farmers not only ignore totally vacuous wish list 30 minutes of exercise­ some dimensions of farming, they continue to per day mandatory in all( schools, awareness ignore them even when they are presented data VOL. 107 NO. 5 richard t. ely lecture 9

suggesting that they actually matter, unless they are told very specifically to pay attention to them. In this instance, they tend to ignore the size of the pod, and in interpreting the trial data, they do not think to compare trial by pod size. Very experienced farmers can get it terribly wrong, even when experimenting is not particularly costly, simply because they do not know what to look at. Very experienced policymakers often get it just as wrong.

This inability to notice details is maybe even stronger for the economist-scientists and the policy practitioners than for farmers. The economist-scientist is effectively trained to ignore them: the art of modeling requires simplifying reality to work out the logic of some essential assumptions. The spotlight is on general principles, not on details that may be very specific to an environment; indeed, such details are distractions. In order to focus on the general principles, the economist-scientist will tend to dismiss those nasty details (which in fact will make the policy work or not). This applies to theorists, but to us field experimenters as well, when we wear our scientists’ hats. At those times, we run experiments that are as tightly controlled as possible to try to isolate a specific mechanism or test a theoretical principle. This is not a shortcoming, just a feature of the scientist’s work, yet it means that scientists will not typically be the ones able to identify the key issues that hinder policy success. The policy practitioner is often not even willing to try to acknowledge details.

Many are, to one degree or another, afflicted by the specific blindness that affects the “high modernist” state described by James Scott in Seeing Like a State (Scott 1998). Policymakers and bureaucrats have a tendency to simplify the problems that face them until they fit some preconceived notion of what a human being should want, or need, or be. Scott’s book is mainly concerned with episodes where this high modernist approach went particularly awry (including the Soviet period and the Villagization policy in Tanzania), but the temptation to regulate people into what they want them to be affects policymakers more generally.

A striking recent example is the other unexpected and potentially calamitous event that took place on November 8, 2016. On that day, the prime minister of India, Narendra Modi, announced that, effective immediately, all existing Rs500 and Rs1,000 notes would be replaced by new notes, and would become illegal tender by the end of December. The initial stated objective was to cut corruption by getting rid of black money, though other goals were mentioned later: making India cashless, fighting terrorists by removing their means of payment, creating a windfall for the government in the case that less money came back than was removed from circulation. What was absolutely remarkable is that this major move was made with zero preparation beforehand. The central bank had not printed enough money, creating a cash shortage and huge bank queues. The new money was a different size and most ATMs had to be retrofitted. Cash deposits in poor people’s “no frills” accounts skyrocketed, presumably because someone had given them the money for safekeeping. Between the time the policy was announced and the end of December, the government issued and changed more than 100 different rules concerning when individuals could deposit their cash, how much they could deposit, what kind of justification was needed, when they could withdraw, how much they would have to pay in penalty if they could not prove they had paid taxes and so on and so forth. The result was a combination of a flabbergasted public, a massive disruption in the economy and, in short, complete confusion.

The idea that you could remove 86 percent of the cash in an economy without first thinking through the logistics, including what would replace it (considering that mobile money, credit cards, and account-to-account transfers were not particularly well developed in India) is of course extreme. But it expresses the kind of magical thinking that many governments indulge in, and not just in developing countries: that if you have a generally good idea that makes sense in some abstract way, the details will work themselves out.3

3 An interesting side question is why bureaucrats cannot be given incentives to get the detail of the policy right, since unlike policymakers they are not elected and should not be wedded to a particular viewpoint and ideology.

The failure to experiment, the failure to notice, and the grip of ideology combine in what Banerjee and Duflo (2012) described as the “Three I’s” (ideology, ignorance, inertia). Policymakers tend to design schemes based on the ideology of the time, in complete ignorance

[Figure omitted: four histograms with the vertical axis in percent and the horizontal axis running from 0 to 400. Panel A: control plants, audits (Mass: 0.7297). Panel B: treatment plants, audits (Mass: 0.3913). Panels C and D: only the label “Backchecks” and one value (Mass: 0.1892) survive.]

Figure 2. Audit Readings for Suspended Particulate Matter (SPM)

of the reality of the field, and once these policies are in place, they just stay in place.

There are two implications of this attitude of policymakers, a negative one and a more positive one. On the negative side, it means that many well-intentioned policies fail in practice (or don’t succeed as well as they could have) even if their underlying principles were sound, because the details are not gotten right. Examples are numerous. Despite the considerable amount of political capital that was expended to launch the Affordable Care Act in the United States, and the importance of what was at stake, the roll-out of the health exchanges was plagued by entirely avoidable software glitches. Another example is the third-party audit system that the Supreme Court of Gujarat mandated in an attempt to improve the monitoring of industrial pollution (Duflo et al. 2013). Under the scheme, each firm with high pollution potential had to hire a private firm to complete a yearly auditing report. Each firm selected and paid its own auditor. Auditors completed three visits and submitted the resulting report to the government regulating body, the Gujarat Pollution Control Board. A number of rules were in place to ensure quality and limit corruption: auditing teams could not sign up more than a certain number of clients, there were qualification requirements for auditing team members, and dishonest auditors risked decertification. Yet, the system was a complete failure. As part of a randomized evaluation, we performed backchecks on the auditors by sending an independent team to take pollution measurements right after them. The results for one particular pollutant (Suspended Particulate Matter) are reproduced in Figure 2. The auditors’ reports show considerable bunching right below the threshold for acceptable pollution, which is completely absent in the backcheck. Even low pollution measurements, which are apparent in the backcheck, are absent in the auditors’ readings. It seems that, in many cases, they were not even visiting the firms. This was reflected in the going rate for auditors’ reports, which was below the minimum cost of running an actual pollution test. The likely explanation is that the system was plagued by conflicts of interest: since the firms were selecting and paying the auditors, the auditors had every reason to give them the report they wanted, if possible for very little money. Although the system was designed with worthy general principles in mind, basic design flaws made it essentially useless.

On the positive side, policymakers’ inattention to detail creates a space for policy improvements even in environments where the policy discussion is paralyzed, or where institutions are not good (Banerjee and Duflo 2012). When there are existing laws on the books, finding ways to make them work better meets with little opposition. In the United States, even as Congress and the executive branch were fighting over every single line item in the budget, millions more children were certified between 2008 and 2014 as part of the school lunch program by successive improvements in direct certification for children of parents who benefited from other programs: SNAP, TANF, or Medicaid (Moore, Conway, and Kyler 2012; Moore et al. 2013). The Child Nutrition and WIC Reauthorization Act of 2004 mandated direct certification, but the rate of enrollment accelerated greatly after the Healthy, Hunger-free Kids Act of 2010. The act contained a set of “plumbing” reforms that led to significant increases in direct certification: mandating name matching of different databases three times a year; removing the option of “letter certification,” where a letter was sent to the parents which had to be forwarded to the school; establishing awards for the best-performing states. In 2011–2012, 1.7 million children were directly certified (a 17 percent increase from the previous year), and 740,000 were added the following year (Moore, Conway, and Kyler 2012; Moore et al. 2013). This did not raise eyebrows in Congress, and probably did more to help kids in poverty than a more publicized measure would have.

III. Why Economists Make Good Plumbers

All of these arguments suggest that we need someone smart to take care of the plumbing. But why does it need to be an economist?

The reason is that many plumbing issues are actually fundamentally economists’ problems. Not all economists need to be plumbers; we need scientists and engineers too. And not all plumbers need to be economists. Sometimes, what a policymaker needs is a good software engineer, lawyer, or subject expert. But economics, as a discipline, provides domain expertise for some of the fundamental plumbing issues in public policy.

There are three broad classes of reasons why plumbing matters for public policy, and all three involve economics.

First, the citizens (or the supposed beneficiaries of the policies) are humans, with conflicting objectives, limited information sets, limited attention, and limited willpower. This means that the specific way that policies are presented and implemented will potentially have tremendous influence on whether they will work or not. Therefore, when a new policy is put in place, it is essential to think about those details. For example, if households receive a cash transfer, it may matter whether the woman or the man of the household receives the money (Field et al. 2016; Benhassine et al. 2015; Almås et al. 2015). It may matter if the transfer is presented as something that can help children go to school (Benhassine et al. 2015), or if households are reminded that they can save it, and so on.

The success of the nudge agenda, inspired by the work of behavioral economists, and popularized by Thaler and Sunstein (2008), has given prominence to “design of the tap” issues among economists. Indeed, the last two Ely lectures centered on this issue. Campbell (2016) argued that the regulation of household finance must take into account the fact that households do not understand finance very well, and also that they are subject to behavioral biases. Chetty (2015) argued that the “pragmatist” public finance will pay attention to those issues regardless of their scientific foundations, because they do lead to greater success in what the policymakers are trying to do. It is not just economists advocating subtle behavioral intervention. Policymakers are also aware of them, at least in the United States and United Kingdom, where “Nudge Units” have been created in the White House and at 10 Downing Street. The main idea behind “choice architecture” is precisely that what may appear to be irrelevant details (default options, ease of making a decision, the way some numbers are presented, etc.) actually influence final consumer choices. Policymakers therefore need to be thoughtful about all those choices. (It is worth noting that many tap-design issues don’t have anything to do with behavioral economics: the issues explored by Banerjee et al. in Indonesia’s Raskin program have much more to do with political economy and corruption, for example.)

The economist’s job does not end once policymakers are made aware that these tap-design issues matter, however, particularly because policymakers love a free lunch, and some of the behavioral ideas may stroke this disposition. They are thus wont to treat new behavioral ideas as general “solutions to every problem,” rather than as tools in a plumber’s kit, perhaps necessary to tune any kind of policy properly. A plumber’s mindset is required to carefully watch how these ideas play out in practice, and sometimes an economist may be needed to analyze how it works. We will see several examples of cases where “obvious” improvements did not turn out to be improvements at all. Ben Handel’s and Jonathan Kolstad’s work on health insurance illustrates the subtle economic analysis necessary to analyze the plumbing of a comprehensive health reform such as the Affordable Care Act (a.k.a. Obamacare). They are concerned with issues of practical relevance: in 2015, they wrote a policy document with two specific proposals for the health exchanges (Handel and Kolstad 2015), a consumer search tool that personalizes choice framing and recommendations based on an individual’s specific characteristics, and a “smart default” policy, targeted to particular consumers, to which the government could default consumers during open enrollment when considerable evidence indicates that they are in the wrong plan. Those proposals are quite detailed, and they draw from a substantial academic literature (to which Handel and Kolstad have contributed before). Because the proposals are realistic, they need to tackle issues that are beyond the simple observations that consumers are ill informed and tend to be characterized by inertia (although this clearly is part of the problem). For example, Handel (2013) observes that, due to inertia, many consumers remain in dominated plans, where they lose a substantial amount of money (consumers leave approximately $2,000 on the table, on average). However, Handel also finds that, if consumer inertia is reduced, adverse selection will likely increase in a marketplace with no insurer risk-adjustment transfers: this means that any proposal that helps consumers solve the inertia issue needs to address the adverse selection issue at the same time.

Second, those who implement policies, often government workers, are humans too. They may suffer from the same limitations as the final subjects of the policy, and their incentives are not necessarily to work very hard or in the best interests of the citizens they serve. In much the same way as plumbing details create a choice architecture for the final consumers, they also create an “incentive architecture” for the government workers who implement the policies.

Despite the current vogue of catchphrases such as “state capacity,” these problems have received fairly little attention in economics until relatively recently. There are, however, several recent interesting experiments on the “personnel economics” of the state (see a review in Finan, Olken, and Pande 2015). Indeed, the decisions on how to hire, fire, and reward government employees create the basic environment in which government work is happening.

Consider what is involved in the decision to hire new front-line workers (community liaison workers, health workers) to carry out a new program. While the bureaucracy will generally be quite good at determining how many new workers are needed and may be considering how to set their wages, it will almost surely not be thinking about what the job-announcement posters should advertise. And yet, Ashraf, Bandiera, and Lee (2015) show that a small change in the framing of a front-line health worker job on the advertisement poster (namely, emphasizing the career path that the job can lead to rather than the opportunity to do good for the community) makes a large difference on the observable and unobservable characteristics of those who apply. The “go-getters” who applied for the job in the career-path framing treatment conduct more visits, accomplish more during visits, and ultimately lead to better health outcomes than the “do-gooders.” The overall goal of this new program was to create a new layer of health workers close to people in the community; however, the seemingly minor advertisement decision turns out to have very significant effects.

We are far from understanding how this plays out. In contrast to the results just discussed, Deserranno (2015) finds that financial incentives can lead to a less socially motivated applicant pool. She randomized the advertisements for a new position of health promoter for BRAC, a large international NGO. In one treatment arm, the job advertisement mentioned the minimum amount that a health promoter was expected to earn (“low-pay” treatment) and, in another treatment arm, the maximum (“high-pay” treatment). A third treatment advertised the mean of the expected earnings distribution (“medium-pay” treatment). Note that the difference was only in the framing: the actual pay was the same in all treatments. The study finds that while the high-pay treatment attracted 30 percent more applicants relative to the low-pay treatment, the applicants had much less experience as health volunteers and were much more likely to state “earning money” as the most important feature of the job. Applicants under the medium- and high-pay treatments were also 24 percentage points less likely to make a donation to a public health NGO in the context of a dictator game.

Some aspects of personnel economics of the state such as salary and incentives are very salient and politically charged, and I would not characterize them as plumbing.4 Some are much less salient, such as the framing of the positions (as described above) or transfer policy. Transfers and posting to more or less desirable locations are routinely used by the governments of developing countries to reward performance or punish people who are not liked, but this is not done in a systematic way. Banerjee et al. (2012a) find some positive effects of reducing the frequency of transfers of police officers on their performance. Banerjee et al. (2012b) find that the promise of a transfer out of a punishment posting induced police officers to work harder on conducting random sobriety checkpoints. Khan, Khwaja, and Olken (2016a) propose, implement, and test one particular system (performance-ranked serial dictatorship) in which tax inspectors select their most favored location in an order that is determined by performance. The authors show that the mechanism does in fact work, increasing tax collection by 40 to 80 percent. Moreover, the people who, ex ante, their model predicts will be the most sensitive to this incentive are in fact those who increase their collection the most. But note that none of these studies investigate an essential aspect of this policy in equilibrium: if transfers are used as rewards, or if postings are more secure, the most corrupt people may work very hard to get transferred to where they want to be. This could entirely change the welfare implication of the policy, so this is an area where more plumbing is needed.

4 Although they are extremely important, and the subject of important field experiments: see, for example, Dal Bó, Finan, and Rossi (2013) on front-line worker recruitment and Khan, Khwaja, and Olken (2016b) on financial incentives for tax collectors.

But importantly, the incentive infrastructure issues apply to the design of any policy and regulation, not just incentives policy specifically or even personnel economics more generally. Regardless of whether the policymaker is drafting an education intervention, a banking regulation, a new tax, or anything else, there will be decisions to be made on who will carry it out, under what supervision, with what level of autonomy, and how implementation will interact with their other duties. Any particular set of rules will, advertently or inadvertently, affect the ability and willingness of the front-line workers to implement the policy. Mindful incentive infrastructure requires understanding the government as an organization: any policy will necessarily take place in an organization that has power structures, and a culture that has large impacts on how the policy will play out. The specific way it is drafted, even the parts of it that do not formally address incentives, will determine its ability to be implemented (Gibbons 2010). This gives particular salience to the piping issues, which, having not benefited from the behavioral economics revival, are even more neglected than the tap-design issues.

Between 2001 and 2014, a group of us (Banerjee et al. 2016) experienced firsthand the power of incentive architecture as we attempted to mainstream a remedial education program through the government system. The basic idea of the intervention was that children must be taught at the level at which they currently are, rather than at the level prescribed by the curriculum for their grade. Several RCTs showed that it could be implemented by volunteers or paid NGO workers to very good effect, and it had been shown to raise learning outcomes when run by Pratham, a large NGO, outside of school. To mainstream it within schools, Pratham trained teachers in the pedagogy. We evaluated this attempt, as part of a large scale-up of the policy in the Indian states of Bihar and Uttarakhand over the two school years of 2008–2009 and 2009–2010. The results were disappointing: despite official endorsement, teachers did not seem to apply the methodology when they were back in their classrooms, and the intervention did not result in any increase in test scores. Our process monitoring and a set of qualitative interviews suggested that there was an incentive-architecture problem in the way Pratham and its partner government had implemented the program. While there was endorsement of the approach at the top, nothing else in the teachers’ lives had really changed. In particular, their stated responsibility remained to complete the curriculum, and from their point of view, any time devoted to the remedial activities was essentially wasted.

To have a chance of success, something needed to be changed, not in the basic philosophy of the program or in the content of the training, but in the plumbing. Working with the state of Haryana, Pratham implemented two key changes to the way the intervention was run. First, the program would be implemented during an extra hour in the day (added in all schools, not just those where the program was implemented, but earmarked to this in treatment schools). Since this hour was new, it was easy to make salient that the responsibility of the teachers was to carry out the remedial activity during that time. Second, a cadre of supervisors (from within the government system) were recruited and trained to act as both supporting people and monitors for the teachers as they were implementing the program. Nothing changed in a teacher’s pay or her formal role, but these two adjustments made the difference between signaling that the program was an optional add-on or part of the teacher’s core duty. We evaluated this version of the program in an RCT in 400 schools (200 treatment, 200 control), and found significant increases in test scores.

Finally, any regulation that seeks to affect the way that firms behave, or any policy that relies on firms for implementation at any point must also take firm behavior into account. This behavior is one reason why “obvious” solutions do not always work the way they would be expected to. Willis (2013) makes this point in the context of default enrollment in an overdraft protection plan (which Campbell (2016) describes as one of the signature attempts to incorporate behavioral economics into the regulation of household finance products). Defaults are settings or rules about how products, policies, or legal relationships function that are in place unless users actively change them. Based on their success in retirement savings in particular (Madrian and Shea 2001; Chetty et al. 2014), default-in programs have become popular with policymakers. Since people tend to stick with the option they are presented with, policymakers can get the outcome they want by setting the default carefully, while leaving people the freedom of choice (this is what Sunstein and Thaler have called “libertarian paternalism”). When drafting policy, picking a default is often unavoidable, and hence, regardless of whether one is convinced of their stickiness, choosing the right default is always a plumber’s problem.

But in some cases, policymakers can attempt to influence the relationship between consumers and firms by mandating that the firm offers a particular default option. In the consumer finance area, legally required default terms have been put in place for checking account overdraft coverage, credit card over-the-limit practices, and some home-mortgage escrows. In particular, a 2010 regulation imposed on banks a policy default regarding overdraft coverage. Under this regulation, banks were not allowed to charge an overdraft fee when an account went into a negative balance due to an ATM transaction or a nonrecurrent debit card transaction unless the consumer had opted out of the default and chosen the bank overdraft protection plan. The idea was that the overdraft coverage is basically a high-interest, low-risk loan charged by banks and is not in the interest of most customers. Banks get away with high overdraft coverage fees because customers do not expect to go into overdraft, and thus pay no attention to this feature when they sign up for an account. Moreover, they don’t necessarily realize when they are going into overdraft (Stango and Zinman 2014).

The 2010 regulation required banks to offer no overdraft protection by default, and prohibited them from charging overdraft fees in the default if the account ended up being overdrawn anyway.5 In doing so, the regulators cited the research on the stickiness of the default, and expressed the hope that most consumers would stick to the new default of no overdraft protection. To enhance the stickiness of the default, the regulator set some safeguards: to get out of the default, customers needed to make an “affirmative choice” (talk to someone or click a box), and before allowing them to do so, the bank needed to show them some documents.

5 This applied to ATM or one-time transactions only. Banks could still authorize and charge for overdraft occasioned by checks or recurring transactions.

Thus, the regulator had taken care of the obvious plumbing concerns they could think of. And yet, the default did not work that well. Willis (2013) cites data from the Consumer Financial Protection Bureau suggesting that around one-half of the bank’s clients had opted out of the default, and that those who switched were those more likely to then pay high overdraft fees. Banks achieved that by using any number of strategies that Willis describes in detail, such as minimizing the costs of opting out as much as the law allowed them to, stationing employees at ATMs to convince people to opt out, using forms for new clients that made sticking with the default just as complicated as opting out, and using language that made the default scary and the alternative attractive.

The bottom line is that, if the objective of the government was to protect customers from the bank’s abusive use of overdraft policy, then the policy should have been drafted with an understanding of the economics of the bank’s likely response. This would have probably led to a different set of rules. The 2009 Credit Card Accountability Responsibility and Disclosure (CARD) Act was less respectful of the banks’ freedom. It contained some “nudges,” but also hard ceilings on the late fees that banks were allowed to charge. Agarwal et al. (2015) find important effects of the CARD Act on interest and fees paid by customers (as the ceilings were binding on some products and were not offset by increases on other products), and relatively little change of behavior related to the nudge.

Economists have some domain expertise to understand those dynamics, and therefore can help draft policies that take those incentives into account. Duflo et al. (2013) not only studied the faulty third-party audit system for the environment that the court had ordered in Gujarat, but set out to improve it. Given what we know from economics, it was not rocket science to identify the key problem: since the system was plagued by conflicts of interest, the performance of the auditors could be expected to improve if the financial link between them and the firm was severed, and if they were instead made liable to the regulator, who would then reward them for accuracy. Similar proposals have been made for reforming financial audits in the United States and Europe, though they have not been implemented.

These two examples may give a somewhat too simplistic view of the problem: firms, like governments, are organizations themselves and may thus not always respond optimally to a change in the environment. Taking into account how a regulation will affect firms as organizations further underscores the role of details rules and how they interact with the environment.

To summarize, economists have the disciplinary training to make good plumbers: economics trains us in behavioral science, incentives issues, and firm behavior; it also gives us an understanding of both governments and firms as organizations, though more work probably remains to be done there. We economists are even equipped to think about market equilibrium consequences of apparently small changes. This comparative advantage, along with the importance of getting these issues right, makes it a responsibility for our profession to engage with the world on those terms.

IV. Plumbing and Experimentation

Roth (2002, p. 1342) made the point that “in the service of design, lab experimental and computational economics [are] natural complements to game theory.” Lab experiments and computational work are essential tools for the economist-engineer because sometimes real life will not oblige with the nice assumptions that are necessary for elegant models with closed-form solutions. If that is the case, simulations are needed to understand the behavior of the proposed systems under the more complicated set of assumptions.

Similarly, in the service of fitting policies for
The the real world, field experiments are a natural Gujarat Pollution Control Board the regulator in complement to theory and economic intuition. charge implemented a reformed( system, where Evaluation in the field is necessary for the auditors) were randomly assigned to firms, paid plumbers because intuition, however sophis- from a central pool, and independently moni- ticated and however well grounded in existing tored in a randomized experiment in 473 firms theory and prior related evidence, is often a 233 of which where assigned to the new sys- very poor guide of what will happen in reality. tem( . Compared to audits conducted under the Therefore, the economist-plumber is eminently status) quo even with the same auditor operating fallible. We have already seen several interven- in both systems( , we found those “reformed” tions that did not initially succeed, despite being audits to be significantly) more accurate. This designed as well as possible, and with econo- additional accuracy led to a reduction of pol- mists’ inputs e.g., the 2010 default regulation lution among the worst offending plants. This to prevent the( banks from using overdraft pol- work inspired a reform of the entire third party icy, and the first attempt to scale up Pratham’s audit system in Gujarat, once again demonstrat- “teaching at the right level” intervention . My ing the convincing power of plumbing. career is peppered with such examples,) and 16 AEA PAPERS AND PROCEEDINGS MAY 2017 occasionally, the best-intentioned efforts have involved with, as rigorously as possible, and backfired. help correct the course: the economist-plumber In Banerjee, Duflo, and Glennerster 2008 , needs to persistently experiment and repeat the we helped the local government in Udaipur( dis)- cycle of trying something out, observing, tinker- trict, Rajasthan, design an incentive program to ing, trying again. 
get nurses to come to work more often a base- This underscores the need for evaluation, line survey had shown that they were( absent but of course, randomized controlled trials more than one-half of the days . Each nurse was are not the only way to evaluate policies. For given a time-and-date stamp and) asked to stamp example, Handel and Kolstad’s work, which I a register affixed to the wall of the center sev- discussed, largely exploit nonrandomized situ- eral times in the course of a day, one day a week ations, although they have some experimental on Monday to prove her presence. The gov- papers as well. Conversely, most RCTs are not ernment( announced) that those who didn’t show plumbing experiments—they are designed and up at least 50 percent of the time on Mondays implemented carefully with as little disturbance would get their wages docked. This policy was as possible, to learn a specific parameter or mea- evaluated in an RCT. Initially, everything went sure the effect of one particular intervention. according to plan: nurse attendance, which They are not concerned with the specifics of was around 30 percent before the launch of the how such an intervention would be implemented program, jumped to 60 percent after the pro- at scale. gram was introduced in centers where it was in Still, randomized field experiments are an place, while it remained unchanged elsewhere. economist-plumber’s natural tool, for several However, a few months later, the tide turned. reasons. Nurse attendance in the monitored centers First, there are the traditional identifica- started to drop, and kept dropping. Eight months tion arguments: it is generally difficult to form into the program, the monitored and unmon- a proper counterfactual when a policy was itored centers were performing at an exactly launched in the world, and hence we may not equal—and very poor—level. When we looked know the real effect of many policy attempts. 
into what happened, we were struck by the fact Second, precisely because policymakers are that recorded absence remained low even after not particularly interested in plumbing, they the program fell apart. What went up sharply will in general make a large number of deci- were “exempt days”—days when the nurses sions at the same time. Even when it is possible claimed that they were not required to come in to evaluate one policy or one regulation with for some reason training and meetings were the a credible identification design, it may be dif- most common reasons( . There seems to have ficult to “unpack.” For example Agarwal et al. been a strong collusion) between the nurses and 2015 estimate the impact of the 2009 CARD the doctors in charge of supervising them. At Act,( which) regulated the fees that banks could some point, attendance in the monitored centers charge on consumer business cards, but not actually fell below the unmonitored ones and business credit cards, and also had some nudg- remained lower until the end of the study: by ing provisions. They compare fees and interest the end, the nurses in monitored centers came paid before and after the acts on consumer cards to work only 25 percent of the time. We had versus business cards. This gives us a convinc- completely failed to foresee how this collusion ing estimate of the effect of the CARD Act taken would undermine the program, and would in as a whole, but it is a little difficult to separate fact make the situation worse than it had been out the impact of its different provisions. In before, as it dawned on nurses that they could contrast, with a carefully designed experiment, get away with anything. it is possible to manipulate one lever at a time. For some people, this very fallibility is a Thus, randomized experiments are a very natu- reason why we should not intervene at all. But ral complement to policy experimentation. 
This everybody makes mistakes, so this is hardly an is nicely illustrated by the Banerjee et al. paper excuse for inaction Banerjee 2002 . However, on the Raskin information card, to name just one because the economist-plumber( intervenes) in example. the real world, she has a responsibility to assess Third, an explicit experiment, with a start the effects of whatever manipulation she was date and an end date by which a report must VOL. 107 NO. 5 richard t. ely lecture 17 be provided­ and a decision made, has import- it forces the policymakers to think about the ant benefits for the process of government. levers that they actually have control over. In Greenstone 2009 argues for a “culture of per- our conversation with partners from the Kerala sistent regulatory( )experimentation and evalua- health department, which I recounted above, it tion,” where every regulation should be subject was apparent that they were very confused as to to experimentation, and an automatic sunset what they could and could not do as a policy. clause by which it must be reviewed to deter- On the one hand, they had an ideal vision of the mine its( costs and benefits during the exper- nurse and the village officials working together imental period and an expansion clause by to be key dispensers of health advice. On the which it will be) expanded if it turns out to have( other hand, they resisted any notion that they large benefits and low costs . Greenstone argues could actually tell local governments what to do. that, in the United States, )there are opportuni- If they had to design an experiment, they would ties to reform the system for evaluating, adopt- have needed to spell out what was in fact in their ing, and getting rid of policies. Most evaluations control training the panchayats? Training doc- are done ex ante, with very little relevant data to tors and( nurses in a set of new guidelines and determine the cost and benefits, and very little roles and responsibilities? . 
This would actually serious review ex post. Almost no federal regu- have forced them to write )down a policy, rather lation comes with a sunset clause, although they than keep it suspended at the purely aspirational are used more frequently by states Baugus and level. Bose 2015 . Partly inspired by this( article and Learning happens in the course of the experi- by Greenstone) himself who hung up his pro- ment itself as well. With Banerjee et al. 2012a , fessor’s gown for a time( to engage in full-time we conducted a first set of randomized( exper)- plumbing , the Obama administration called iments, designed to improve the behavior of for a “look) back,” a comprehensive review of police officers with ordinary citizens in the field. all the costs and benefits of regulations in place The Rajasthan police top brass were very com- by all agencies Executive Order 13563, 2011 . mitted to the goal of improving the force, and This was a bold( move, but without proper data) ready to immediately implement reforms that available and without any counterfactual, the we would have suggested. We convinced them exercise had a narrower scope than was origi- to start with an experiment to test out some of nally envisioned. Moreover, inertia is deeply the main proposals made by the various police ingrained in governments everywhere. Once commissions in India: training police officers, an apparatus is installed to support a policy or introducing community observers, rationalizing a regulation, it is unlikely to go away. The case the workweek with a day off and a duty roster, is different if the apparatus is the explicit sub- and limiting the frequency of transfers. Even ject of an experiment, with an anticipation of a though this was a personnel experiment carried review to come. 
Of course, what matters for this out on a large scale more than 100 police sta- political economy process is that the process is tions, covering a population( of over eight mil- viewed as experimental, not necessarily that the lion people and with a government, this was not experiment be randomized. But once a regula- a plumbing) experiment, in that we did not pay tor agrees to try something out perhaps differ- much attention to the details of how each inter- ent variations of a program on an( experimental vention was going to be carried out. The reason basis, the marginal cost of randomizing) is often is that our partners in the police did not think low, and the results of an RCT are more trans- that plumbing would ever be an issue at all. The parent and easier to explain.6 police in India is structured like a military orga- Fourth, the process of designing a random- nization, and the top brass assumed that if they ized experiment in conjunction with a govern- ordered something, it would happen. So they ment is an important learning process, because simply sent the order to carry out these differ- ent activities down the hierarchy. Unfortunately, this was not exactly what happened. Some of 6 Greenstone’s agenda found some resonance in the the reforms the transfer freeze and the training Obama administration. A 2013 memo from the Office of ( ) Management and Budget Burwell et al. 2013 encouraged could be more or less directly implemented from all heads of agencies to use( existing evidence )to draft new the top, and were indeed carried out. But others rules, but also to conduct randomized experiments. the rationalization of the workweek, and the ( 18 AEA PAPERS AND PROCEEDINGS MAY 2017 community observer relied on the station mas- squad would get them a ticket out of the dog- ters for implementation.) If they had been carried house. 
We fitted the police cars of this brigade out, they would have caused a direct limitation with GPS, which allowed us to monitor whether on the autonomy of those very station masters. they were in fact doing their job. Compared to So, with the benefit of hindsight, it is not entirely regular police stations, officers in those squads surprising that those measures were in fact never were in fact more likely to be at their checkpoint actually implemented in the field: neat reports and arrested more drunks. Thus, the first exper- made their way up, but when we sent monitors to iment gave the police enough understanding of the field, they found that nothing had changed in what was under their control in their organiza- practice. Collectively, we entirely neglected the tion and what was not, information that allowed plumbing when we were designing the reforms them to better design a new policy. and their implementation and accordingly, we probably tested out reforms( that were unwork- V. The Plumber and the Scientist able, although they had been recommended by various reform commissions , because we had The knowledge that is generated in the course a vision of the organization that) turned out to be of a plumbing experiment is often quite con- entirely different to what it was in practice. But text-specific to a particular organization and this project was carried out as an RCT, rather institutions . One( concern is that it may not gen- than just rolled out which had been our police eralize beyond) this setting. In this case, even if a partner’s initial idea( , so we were at least able randomized experiment is used to pick the ver- to realize what we had) gotten wrong: not only sion of the policy that actually works, is it still because we saw no change in the final outcome, science, or is it just glorified consulting? 
After but also because we had put in place a compre- all, firms also use randomized controlled trials hensive data collection apparatus to see what to choose the price of their goods, the color of was happening on the ground. their yogurt packets, or what advertisement to While we never got to learn whether the feature on a landing page. The number and the rationalization of the workweek or community scale of these experiments have exploded with observers could have worked if they had been the advent of the Internet and A B testing. Many implemented, we learned a fair amount about of these marketing experiments/ are of use to the what kind of interventions were possible to carry companies that run them, but are not treated as out with the police. We took this lesson to the research, do not receive IRB approval, and are next project we carried out together, when Nina never published. When they are see Simester Singh got the responsibility to design an anti- forthcoming for a review it is (because they drunk-driving project Banerjee et al. 2012b . manage to make a more general) point on con- Some of the issues that( we were concerned with) sumer behavior. Are plumbing experiments able are central to policing, and related to quite subtle to help us draw similarly able to help us draw questions of economics: should enforcement be general conclusions, valid in other contexts as always in a fixed place or vary locations? Should well? it follow a crackdown model or be at a lower level There are several answers to this criticism, for a long period of time? We were interested in which sets apart policy plumbing from usual setting up an experiment to learn that. Based on A B testing. the previous experiment, however, neither the /First, plumbing experimentation can create researchers nor the police officials were ready to the counterfactual that allows for the evaluation assume that traffic checkpoints to catch offend- of the policy itself. 
Many policies are imple- ers would regularly happen at the specified time mented nationwide, which makes it difficult to and place. So we thought hard about the police evaluate them. Plumbing experiments can help as an organization, with the constraints that it by creating variation in how well they are imple- faces in particular, the prohibition of merit pay mented. This can generate the variation suffi- in order( to find a space where it was possible) cient to evaluate the causal impact of the policy to provide incentives. We formed drunk-driving itself, even when it has already been rolled out. enforcement squads out of officers assigned in a Consider Devoto et al. 2012 , the experiment certain type of station perceived as a punishment on water connection. By( fixing) some “design of posting. We promised that good behavior on this taps” issues, and removing the barriers to access VOL. 107 NO. 5 richard t. ely lecture 19 to the loan and hence the water connections , particular improvement, we would only need to they generated( an exogenous increase in water) be willing to assume that other reasons for inten- access in a randomly selected set of households sive margin increase in program participation 69 percent take-up for treatment households, would have a similar effect on wages in the pri- versus( about 0 in control households , which vate sector. That assumption is more palatable gave them the opportunity to evaluate the) impact that the assumption that smartcards themselves of access to water at home on health and house- would work well in any context; this is a state- hold welfare. ment about the functioning of the private labor In addition, a particular feature of plumbing market rather than about the government as an experiments is that, because they can be car- organization. 
ried out with governments, and can therefore Second, the plumbing experiments have also be done on a large scale, they give us a window generated insights that are useful to pure science. to evaluate the equilibrium effect of policies, A first relationship between plumbing and which is otherwise very difficult to assess and pure science is that plumbing experiments can be model.7 For example, Muralidharan, Niehaus, designed to test mechanisms, precisely because and Sukhtankar 2016a evaluate the impact the researcher can operate very specific levers. of biometrically authenticated( ) payment in the Thus, to employ the terminology of Ludwig, MGNREGS program in Andhra Pradesh. The Kling, and Mullainathan 2011 and Congdon new system replaced the traditional method et al. 2017 , plumbing experiments( ) can be both of paying beneficiaries through their postal mechanism( ) and policy experiments. To be sure, account, which required them to travel to the this is not a specific advantage of plumbers. The post office and often pay a bribe to an agent most common use of field experiments these if they wanted to get paid. In the new system, days is a careful test of a theoretical proposition, they could withdraw money from an ATM in and this use of experiment as “science” usually the village staffed by a local woman, using takes place in a better controlled environment their fingerprint for identification. The rest of see Banerjee and Duflo 2009 for a discussion the implementation of the MGNREGS program of( this use of experiments, and Glennerster 2017 including how the money got to the Panchayat for practical trade-offs between control and rele- account( was not changed. The authors found vance that arise when running experiments . But that treatment) improved the implementation of plumbing does not preclude science, and )pres- the program in many dimensions. In particular ents some advantages. 
First, the policy context it reduced leakages and delays, leading to higher is usually not replicable outside of this policy participation in the program. Importantly, the environment. Second, the relative indifference randomization was carried out at the subdis- of governments to plumbing questions often trict level a group of 20–25 villages, covering translates into a willingness to give the plumb- over 60,000( people . This allowed the authors, ers a relatively free reign on the design. Third, in follow-up work )Muralidharan, Niehaus, and the scale also allows many subtreatments with- Sukhtankar 2016b (, to study the impact of the out losing power. The plumbers can then turn to program on wages.) They found that their inter- their scientist friends or to the scientist within vention led to an increase in private sector wages to design quite subtle “mechanism( experiments,”) without reduction in employment. This general moving one parameter at a time to test a very equilibrium impact is quantitatively important: specific scientific theory, in the policy-relevant 90 percent of the increase in individual’s income context, at scale, and on a representative popu- due to the intervention could be attributed to the lation. In the Banerjee et al. 2015 experiment private wage increase. The question of the equi- on the Raskin card, for example,( ) there were librium impact of workfare programs like the several manipulations that permit quite precise MGNREGS is of central interest for the design evaluation of the mechanism at play. One of the of social safety nets. Of course, to generalize the manipulations involved varying the visibility of finding beyond Andhra Pradesh, or beyond this the card distribution program to affect high-or- der belief not only do I know the benefits I’m entitled to,( but I know that the official knows, 7 and I know that others know that we both know . 
Muralidharan and Niehaus 2016 make the same argu- ) ment in their discussion of experimentation( ) at scale. This provides quite unique empirical evidence 20 AEA PAPERS AND PROCEEDINGS MAY 2017 on the impact of common knowledge, a concept measured by home visits and the organization in many models of game theory and political of community meetings. In contrast, Deserranno economy. 2015 provides evidence in support of the rela- Another nice example is the Kling et al. tionship( ) between prosocialness and job perfor- 2012 study of comparison friction. We have mance. Compared to the high-pay treatment, the seen( that) inertia is an important feature of con- health promoters recruited under the low-pay sumer behavior. One reason may be that it is dif- treatment visited a larger number of households, ficult for them to compare options, even when organized more public presentations in the vil- the information is available. There are a number lage, and provided more antenatal checks. She of studies of “comparison friction” contrasting also finds that they were more likely to target the the Internet to other markets, suggesting that this most vulnerable households. Clearly, these two is a general problem. Kling et al. 2012 set up studies do not settle the issue, as they have oppo- an experiment that isolates comparison( )friction site conclusions, but they help us think about it from other factors. In the treatment group, senior better. citizens in Medicare Part D received personal- To be fair, it is not always possible to have ized information on the performance of different very complex design in plumbing experiments: plans—information that they could, in principle, in some instances, it places too much strain have acquired very easily by going to a website on the implementing agency. For example, the and entering the data. 
The intervention had large Muralidharan, Niehaus, and Sukhtankar 2016a impact on plan switching which increased from experiment with smartcards in MGNREGS( var)- 27 percent to 38 percent (. The design, which is ies many things at the same time they give the focused on the very narrow) channel of providing smartcards, they move the collection( point to the the results of the comparison, isolates the role village, they move it from the post office to the of comparison friction from other mechanisms, ATM, and they introduce local female “bank- such as difficulty of acquiring information, pro- ing correspondents” who act as local agents for crastination, etc. the bank . This makes it difficult in this setting More generally, well-designed plumbing to isolate) one particular mechanism. But when experiments can sometimes introduce varia- possible, it makes the experiment particularly tion that does not exist in natural conditions, valuable, and my sense is that such trials will and thus generate a counterfactual to illumi- become increasingly feasible, especially as the nate theoretical mechanisms that are not easily first experiments provide examples to follow.8 observable in nature. For example, it is a ques- A first relationship between plumbing and tion of general interest whether people hired pure science is related to the “learning by with prosocial motivations would generally noticing” concept that I referenced earlier. make better employees, and specifically bet- Many scientists-turned-engineers and later, ter public service employees. But since proso- ­turned-plumbers Banerjee, Klemperer, Pathak cial tendencies are not randomly assigned, this have said that experience( taught them to pay) is not something where evaluating causality attention to an entirely different set of notions is easy. The Ashraf, Bandiera, and Lee 2015 than the ones they started with. 
The scientist’s study and the Deserranno 2015 studies( that )I focus on specific models and assumptions mentioned earlier provide (examples) of how a requires not noticing things that are unimportant, plumbing experiment can help us make progress or that cannot be made sense of in their particu- on difficult-to-settle empirical issues: in both lar paradigm. The plumber must notice anything studies, the authors used the different recruit- if it will matter for implementation. The plumber ment strategies to create variation in the type will share these observations with engineers and of health promoters that were recruited. Once scientists or with the engineer and scientist ver- employed, they were all tasked with the same sion of themselves( and thus hopefully generate responsibilities and faced the same incentives. new interesting and) important questions to work Thus, the experiments clearly identify selection, controlling for incentive. Based on this design, 8 Ashraf, Bandiera, and Lee 2015 find health For other examples of complex plumbing experiments ( ) with many treatments that look specifically at mechanisms, workers attracted by career incentives are much see Olken 2007 ; Alatas et al. 2012, 2016 ; Khan, Khwaja, more effective at delivering health services, as and Olken ( 2016a) ; and Banerjee( et al. 2012b) . ( ) ( ) VOL. 107 NO. 5 richard t. ely lecture 21 on. Plumbing experiments shine the spotlight on ­policymakers, market participants, or research- new problems, and new theories can be devel- ers before the research. In a working paper ver- oped to address those problems. sion, Dur et al. 2013 report how their research For example, Pathak forthcoming, p. 
concludes by reflecting on his many years of helping cities design and implement school assignment mechanisms, with which I opened this essay by saying: “Experiences from the field highlight new issues that have not been the focus of this earlier literature such as nonconsequentialist rationales for straightforward incentives, the importance of transparency and simplicity in influencing designs, the value of single-offer systems over multiple-offer alternatives, the need to streamline aftermarkets, and the importance of participation on the school and student side. It is my hope that further interaction between theory and practice will sharpen focus on these and potentially other significant issues that have received relatively little attention from the theoretical literature.” At the end of the paper, Pathak goes on to open yet another agenda on feedback between market-clearing algorithms, school attributes, and student preferences. Those are, I am guessing, difficult and interesting theoretical problems, but they would not have arisen without careful study of the properties and behavior of actual school choice mechanisms.

Indeed, the school choice literature has several examples of where questions that first arose as purely practical motivated further theoretical work, advancing market design as a field. The questions of how to break a tie in school assignment mechanisms originated as a practical question that parents and school administrators were very worried about, and led to an active literature (see Pathak forthcoming for a review). Similarly, researchers had not originally paid much attention to a central feature of school assignment in Boston, the “walk zone.” But when Mayor Menino promised that 50 percent of the students should go to a school they could walk to, researchers started simulating what would happen if the size of the “walk zone” was increased given the assignment mechanism that exists in Boston, and they were surprised to see that it did not lead to many more students assigned to their walk zone. This motivated theoretical work on what happens in an assignment system where a student can qualify for a school on two separate lists. The results are actually not obvious and are interesting theoretically. The basic mechanisms at play were not understood by […] was influenced by the policy debate and how the policy debate was influenced by their research, which ultimately led to the abandonment of the “walk zone” in Boston on the grounds that the MIT-Boston College team had shown it made no difference, but it was perceived to unfairly advantage some students. For a last example, in many cases, participants in school choice mechanisms have a different view than the scientists on whether a particular school assignment system can be “gamed” by being strategic about it. This is an example of a much more general problem, where participants’ perceptions of the optimal strategy in a system may be quite different than what the optimal strategy is. As scientists, we may have dismissed this, since agents are just wrong. But as plumbers, it obviously matters, since it will affect final outcomes. Motivated by these sorts of issues, Li (2016) actually tried to address this issue formally by defining the concept of “obviously strategy proof.”

This ability of experiments to put the spotlight on new problems is important. Many new fields in economics came about because someone decided that we could not continue to ignore the failure of assumptions that were staring us in the face: informational economics, behavioral economics, and network economics all come to mind. For this kind of innovation to happen, we need to leave a space for surprising information to come in. Of course, this requires being open to processing reality. Any field study, not just plumbing experiments (and not just experiments), can generate such insight. The plumbing studies may even have a particular handicap in this regard because, since they focus on details, they may often be difficult to theorize, even when they are important for policy (Banerjee 2002), beyond broad and rather simple conclusions (“information matters,” “default matters,” etc.). Plumbing experiments, since they are primarily motivated by pragmatism, must focus on what is important for the world, not necessarily on the very subtle issues that theorists would find worth discussing.

On the other hand, surprising information is perhaps more interesting to theorists, and thus likely to prompt more thinking when it is generated in very high-stakes environments and involving many people. The unexpected

behavior of a feature of a policy implemented at scale may motivate more interest than the result from a small experiment, however well controlled. Behavioral public finance is a good example of such back-and-forth between theoretical insights and findings from the plumbing experiments. Madrian and Shea (2001) were motivated by the basic idea of procrastination to look at the role of default options in retirement savings. They found that employees enrolled in the 401(k) by default were much more likely to stay enrolled in the long term than those who were not defaulted in, and this prompted a much more systematic investigation of the prevalence of the default effect in retirement savings. The first wave of studies were empirical (see, e.g., Choi 2002, 2003; Choi et al. 2004, 2012; Chetty et al. 2014) and these, in turn, motivated theoretical work both on whether setting defaults is desirable (e.g., Carroll et al. 2009) and more generally on how to set them, since they had been shown to matter. The core theoretical question that defaults raise is motivated by welfare analysis: since economist-plumbers have demonstrated that defaults matter, the economist-engineer wants to set the default at a “reasonable” place. What is the right reasonable rate? This is not an easy theoretical question to answer, since it is likely to depend both on the underlying reasons why defaults matter in the first place, and because welfare analysis cannot rely on the standard tools when agents exhibit behavioral biases. Amador, Werning, and Angeletos (2006) start tackling the question of the optimal savings policy for an agent with hyperbolic discounting preferences. Bernheim, Fradkin, and Popov (2015) explicitly address the question of the optimal default policy, and show both how this depends on the underlying behavioral model, and how some relatively general results can nonetheless be obtained.

Last and perhaps most important, plumbing experiments, by their nature, are typically run on real policies and on a large scale. They are not just data points: they affect many real people and potentially have real benefits as they are carried out. When economists work on understanding how to design a policy that is going to affect millions of people and cost millions of dollars, the stakes are high enough that it is perhaps less important to know whether the findings will generalize beyond the particular work. Working with governments to evaluate versions of these programs as they are being deployed represents a tremendous opportunity to generate evidence and improve the effectiveness of money that is already being spent. Moreover, as we have seen, the lessons that are generated from these partnerships are often actually acted upon, which means that gains from evaluation are not just potential, they are actual.

To take just one extreme example, the Affordable Care Act is a unique institution, and perhaps a less-than-ideal result of a long and complicated political process. It is very unlikely that anybody designing a health insurance system from scratch would adopt anything resembling it, so in a sense there is very little external validity to any work about the Affordable Care Act. Yet, when Handel, Kolstad, and others write papers that are concerned with detailed issues of implementation of a system like the ACA, it does not really occur to us to wonder whether this is perhaps a little too specific.

Similarly, most plumbing studies have first-order impacts. Getting the details right on school choice directly affects the welfare of millions of school children. The work of Banerjee et al. (2015) on the Raskin program led to the scale-up of cards to more than 60 million beneficiaries of public transfer programs. Obviously the scaled-up policy was adapted and slightly different than what was tested in the experiment. If the effects of this scaled-up version were the same as what they had estimated, they would translate into benefits anywhere between $80 million and $170 million annually. The work of Duflo et al. (2013) on environmental audits in Gujarat, which I discussed above, led to a reform of the regulation in the state, which has a population of 66 million. The evaluations that are described in Banerjee, Duflo, Imbert, Mathew, and Pande (2016) and Muralidharan, Niehaus, and Sukhtankar (2016a) directly affected hundreds of thousands of participants in the Indian workfare program, and eliminated millions of dollars of leakage, just while the research was underway, in the areas they covered. These studies also played a part in the adoption of a combination of those reforms as direct payment to beneficiary from the center in August 2015.⁹ This will affect the lives of 182 million beneficiaries, for a budget of $5.6 billion.

Many of us chose economics because, ultimately, we thought science could be leveraged to make a positive change in the world. There are many different paths to get there. Scientists design general frames, engineers turn them into relevant machinery, and plumbers finally make them work in a complicated, messy policy environment. As a discipline, we are sometimes a little overwhelmed by “physics envy,” searching for the ultimate scientific answer to all questions, and this will lead us to question the legitimacy of plumbing. This essay is an attempt to argue that plumbing should be an inherent part of our profession: we are well prepared for it, reasonably good at it, and it is how we make ourselves useful. A feature unique to economics is that scientists, engineers, and plumbers all talk to each other (and in fact are often talking to themselves: the same economist wearing different hats). This conversation should continue: it is what will keep us relevant and, possibly, honest.

⁹ Unusually, a briefing presented by the Ministry of Rural Development to the Cabinet meeting actually cited a research paper (Ministry of Rural Development 2015). “As of today, 92 percent of Panchayats are brought onto e-FMS platform. The e-FMS has reduced the problem of parking of funds and increased transparency. Studies on the effectiveness of electronic fund flow in Bihar done by eminent economists (Banerjee et al. 2014) showed that e-governance reduced fund flow leakage by 25 percent.”

References

Abdulkadiroglu, Atila, and Tayfun Sönmez. 2003. “School Choice: A Mechanism Design Approach.” American Economic Review 93 (3): 729–47.
Abdulkadiroglu, Atila, and Tayfun Sönmez. 2013. “Matching Markets: Theory and Practice.” In Advances in Economics and Econometrics, Vol. 1, edited by D. Acemoglu, M. Arellano, and E. Dekel, 3–47. Cambridge, MA: Cambridge University Press.
Agarwal, Sumit, Souphala Chomsisengphet, Neale Mahoney, and Johannes Stroebel. 2015. “Regulating Consumer Financial Products: Evidence from Credit Cards.” Quarterly Journal of Economics 130 (1): 111–64.
Alatas, Vivi, Abhijit Banerjee, Rema Hanna, Benjamin A. Olken, Ririn Purnamasari, and Matthew Wai-Poi. 2016. “Self-Targeting: Evidence from a Field Experiment in Indonesia.” Journal of Political Economy 124 (2): 371–427.
Alatas, Vivi, Abhijit Banerjee, Rema Hanna, Benjamin A. Olken, and Julia Tobias. 2012. “Targeting the Poor: Evidence from a Field Experiment in Indonesia.” American Economic Review 102 (4): 1206–40.
Almås, Ingvild, Alex Armand, Orazio Attanasio, and Pedro Carneiro. 2015. “Measuring and Changing Control: Women’s Empowerment and Targeted Transfers.” National Bureau of Economic Research Working Paper 21717.
Amador, Manuel, Iván Werning, and George-Marios Angeletos. 2006. “Commitment vs. Flexibility.” Econometrica 74 (2): 365–96.
Ashraf, Nava, Oriana Bandiera, and Scott S. Lee. 2015. “Do-Gooders and Go-Getters: Career Incentives, Selection, and Performance in Public Service Delivery.” Unpublished.
Banerjee, Abhijit V. 1997. “A Theory of Misgovernance.” Quarterly Journal of Economics 112 (4): 1289–332.
Banerjee, Abhijit V. 2002. “The Uses of Economic Theory: Against a Purely Positive Interpretation of Theoretical Results.” Unpublished.
Banerjee, Abhijit V. 2007. Making Aid Work. Cambridge, MA: MIT Press.
Banerjee, Abhijit, Rukmini Banerji, James Berry, Esther Duflo, Harini Kannan, Shobhini Mukherji, Marc Shotland, and Michael Walton. 2016. “From Proof of Concept to Scalable Policies: Challenges and Solutions, with an Application.” National Bureau of Economic Research Working Paper 22931.
Banerjee, Abhijit, Raghabendra Chattopadhyay, Esther Duflo, Daniel Keniston, and Nina Singh. 2012a. “Can Institutions be Reformed from Within? Evidence from a Randomized Experiment with the Rajasthan Police.” Unpublished.
Banerjee, Abhijit, Raghabendra Chattopadhyay, Esther Duflo, Daniel Keniston, and Nina Singh. 2012b. “Improving Police Performance in Rajasthan, India: Experimental Evidence on Incentives, Managerial Autonomy and Training.” National Bureau of Economic Research Working Paper 17912.
Banerjee, Abhijit V., and Esther Duflo. 2009. “The Experimental Approach to Development Economics.” Annual Review of Economics 1 (1): 151–78.
Banerjee, Abhijit V., and Esther Duflo. 2012. Poor Economics: A Radical Rethinking of the Way to Fight Global Poverty. New York: PublicAffairs.

Banerjee, Abhijit V., Esther Duflo, and Rachel Glennerster. 2008. “Putting a Band-Aid on a Corpse: Incentives for Nurses in the Indian Public Health Care System.” Journal of the European Economic Association 6 (2–3): 487–500.
Banerjee, Abhijit, Esther Duflo, Clement Imbert, Santhosh Mathew, and Rohini Pande. 2016. “E-governance, Accountability, and Leakage in Public Programs: Experimental Evidence from a Financial Management Reform in India.” National Bureau of Economic Research Working Paper 22803.
Banerjee, Abhijit, Rema Hanna, Jordan C. Kyle, Benjamin A. Olken, and Sudarno Sumarto. 2015. “The Power of Transparency: Information, Identification Cards and Food Subsidy Programs in Indonesia.” National Bureau of Economic Research Working Paper 20923.
Baugus, Brian, and Feler Bose. 2015. “Sunset Legislation in the States: Balancing the Legislature and the Executive.” https://www.mercatus.org/system/files/Baugus-Sunset-Legislation.pdf.
Benhassine, Najy, Florencia Devoto, Esther Duflo, Pascaline Dupas, and Victor Pouliquen. 2015. “Turning a Shove into a Nudge? A ‘Labeled Cash Transfer’ for Education.” American Economic Journal: Economic Policy 7 (3): 86–125.
Bernheim, B. Douglas, Andrey Fradkin, and Igor Popov. 2015. “The Welfare Economics of Default Options in 401(k) Plans.” American Economic Review 105 (9): 2798–837.
Burwell, Sylvia, John Holdren, Alan Krueger, and Cecilia Munoz. 2013. M-13-17 Memorandum to the Heads of Departments and Agencies: Next Steps in the Evidence and Innovation Agenda. Washington, DC: Executive Office of the President, Office of Management and Budget.
Campbell, John Y. 2016. “Richard T. Ely Lecture: Restoring Rational Choice: The Challenge of Consumer Financial Regulation.” American Economic Review 106 (5): 1–30.
Carroll, Gabriel D., James J. Choi, David Laibson, Brigitte C. Madrian, and Andrew Metrick. 2009. “Optimal Defaults and Active Decisions.” Quarterly Journal of Economics 124 (4): 1639–74.
Chetty, Raj. 2015. “Richard T. Ely Lecture: Behavioral Economics and Public Policy: A Pragmatic Perspective.” American Economic Review 105 (5): 1–33.
Chetty, Raj, John N. Friedman, Søren Leth-Petersen, Torben Heien Nielsen, and Tore Olsen. 2014. “Active versus Passive Decisions and Crowd-Out in Retirement Savings Accounts: Evidence from Denmark.” Quarterly Journal of Economics 129 (3): 1141–219.
Choi, James J. 2002. “Defined Contribution Pensions: Plan Rules, Participant Choices, and the Path of Least Resistance.” In Tax Policy and the Economy, Vol. 16, edited by James M. Poterba, 67–113. Cambridge, MA: MIT Press.
Choi, James J. 2003. “Optimal Defaults.” American Economic Review 93 (2): 180–85.
Choi, James J., John Beshears, David Laibson, and Brigitte C. Madrian. 2012. “Default Stickiness among Low-Income Individuals.” RAND Working Paper WR-926-SSA.
Choi, James J., David Laibson, Brigitte C. Madrian, and Andrew Metrick. 2004. “For Better or Worse: Default Effects and 401(k) Savings Behavior.” In Perspectives on the Economics of Aging, edited by David A. Wise, 81–121. Chicago: University of Chicago Press.
Cogley, Timothy, Riccardo Colacito, Lars Peter Hansen, and Thomas J. Sargent. 2008. “Robustness and U.S. Monetary Policy Experimentation.” Journal of Money, Credit, and Banking 40 (8): 1599–623.
Congdon, William J., Jeffrey R. Kling, Jens Ludwig, and Sendhil Mullainathan. 2017. “Social Policy: Mechanism Experiments and Policy Evaluations.” In Handbook of Field Experiments, Vol. 1, edited by Esther Duflo and Abhijit Banerjee. Amsterdam: North Holland.
Dal Bó, Ernesto, Frederico Finan, and Martin A. Rossi. 2013. “Strengthening State Capabilities: The Role of Financial Incentives in the Call to Public Service.” Quarterly Journal of Economics 128 (3): 1169–218.
Deserranno, Erika. 2015. “Financial Incentives as Signals: Experimental Evidence from the Recruitment of Health Workers.” Unpublished.
Devoto, Florencia, Esther Duflo, Pascaline Dupas, William Pariente, and Vincent Pons. 2012. “Happiness on Tap: Piped Water Adoption in Urban Morocco.” American Economic Journal: Economic Policy 4 (4): 68–99.
Duflo, Esther, Michael Greenstone, Rohini Pande, and Nicholas Ryan. 2013. “Truth-Telling by Third-Party Auditors and the Response of Polluting Firms: Experimental Evidence from India.” Quarterly Journal of Economics 128 (4): 1499–545.

Dur, Umut M., Scott Duke Kominers, Parag A. Pathak, and Tayfun Sönmez. 2013. “The Demise of Walk Zones in Boston: Priorities vs. Precedence in School Choice.” National Bureau of Economic Research Working Paper 18981.
Executive Order 13563. 2011. “Improving Regulation and Regulatory Review.” Federal Register 76: 3821–23.
Field, Erica, Rohini Pande, Natalia Rigol, Simone Schaner, and Charity Troyer Moore. 2016. “On Her Account: Can Strengthening Women’s Financial Control Boost Female Labor Supply?” Unpublished.
Finan, Frederico, Benjamin A. Olken, and Rohini Pande. 2015. “The Personnel Economics of the State.” National Bureau of Economic Research Working Paper 21825.
Gibbons, Robert. 2010. “Inside Organizations: Pricing, Politics, and Path Dependence.” Annual Review of Economics 2 (1): 337–65.
Glennerster, Rachel. 2017. “The Practicalities of Running Randomized Evaluations: Partnerships, Measurement, Ethics, and Transparency.” In Handbook of Field Experiments, Vol. 1, edited by Esther Duflo and Abhijit Banerjee. Amsterdam: North Holland.
Greenstone, Michael. 2009. “Toward a Culture of Persistent Regulatory Experimentation and Evaluation.” In New Perspectives on Regulation, edited by David Moss and John Cisternino, 111–25. Cambridge, MA: The Tobin Project.
Handel, Benjamin R. 2013. “Adverse Selection and Inertia in Health Insurance Markets: When Nudging Hurts.” American Economic Review 103 (7): 2643–82.
Handel, Benjamin R., and Jonathan Kolstad. 2015. “Getting the Most from Marketplaces: Smart Policies on Health Insurance Choice.” Brookings Hamilton Project Discussion Paper 2015–08.
Hanna, Rema, Sendhil Mullainathan, and Joshua Schwartzstein. 2014. “Learning through Noticing: Theory and Evidence from a Field Experiment.” Quarterly Journal of Economics 129 (3): 1311–53.
Khan, Adnan Q., Asim I. Khwaja, and Benjamin A. Olken. 2016a. “Making Moves Matter: Experimental Evidence on Incentivizing Bureaucrats through Performance-Based Postings.” Unpublished.
Khan, Adnan Q., Asim I. Khwaja, and Benjamin A. Olken. 2016b. “Tax Farming Redux: Experimental Evidence on Performance Pay for Tax Collectors.” Quarterly Journal of Economics 131 (1): 219–71.
Klemperer, Paul. 2002. “What Really Matters in Auction Design.” Journal of Economic Perspectives 16 (1): 169–89.
Kling, Jeffrey R., Sendhil Mullainathan, Eldar Shafir, Lee C. Vermeulen, and Marian V. Wrobel. 2012. “Comparison Friction: Experimental Evidence from Medicare Drug Plans.” Quarterly Journal of Economics 127 (1): 199–235.
Li, Shengwu. 2016. “Obviously Strategy-Proof Mechanisms.” http://siepr.stanford.edu/sites/default/files/publications/16-015.pdf.
Ludwig, Jens, Jeffrey R. Kling, and Sendhil Mullainathan. 2011. “Mechanism Experiments and Policy Evaluations.” Journal of Economic Perspectives 25 (3): 17–38.
Madrian, Brigitte C., and Dennis F. Shea. 2001. “The Power of Suggestion: Inertia in 401(k) Participation and Savings Behavior.” Quarterly Journal of Economics 116 (4): 1149–87.
Ministry of Rural Development (MoRD). 2015. “Note to Cabinet.” Government of India.
Moore, Quinn, Kevin Conway, and Brandon Kyler. 2012. “Direct Certification in the National Lunch Program: State Implementation Progress School Year 2011–2012. Report to Congress: Summary.” Alexandria, VA: US Department of Agriculture, Food and Nutrition Service, Office of Research and Analysis.
Moore, Quinn, Kevin Conway, Brandon Kyler, and Andrew Gothro. 2013. “Direct Certification in the National Lunch Program: State Implementation Progress, School Year 2012–2013.” Report CN-13-DC. Alexandria, VA: US Department of Agriculture, Food and Nutrition Service, Office of Research and Analysis.
Muralidharan, Karthik, and Paul Niehaus. 2016. “Experimentation at Scale.” Unpublished.
Muralidharan, Karthik, Paul Niehaus, and Sandip Sukhtankar. 2016a. “Building State Capacity: Evidence from Biometric Smartcards in India.” American Economic Review 106 (10): 2895–929.
Muralidharan, Karthik, Paul Niehaus, and Sandip Sukhtankar. 2016b. “General Equilibrium Effects of Improving Public Employment Programs: Experimental Evidence from India.” Unpublished.
Olken, Benjamin A. 2007. “Monitoring Corruption: Evidence from a Field Experiment in Indonesia.” Journal of Political Economy 115 (2): 200–49.
Pathak, Parag A. 2011. “The Mechanism Design Approach to Student Assignment.” Annual Review of Economics 3 (1): 513–36.
Pathak, Parag. Forthcoming. “What Really Matters in Designing School Choice Mechanisms.” In Advances in Economics and Econometrics, 11th World Congress of the Econometric Society.
Roth, Alvin E. 2002. “The Economist as Engineer: Game Theory, Experimentation, and Computation as Tools for Design Economics.” Econometrica 70 (4): 1341–78.
Scott, James C. 1998. Seeing Like a State: How Certain Schemes to Improve the Human Condition Have Failed. New Haven: Yale University Press.
Simester, Duncan. Forthcoming. “Field Experiments in Marketing.” In Handbook on Field Experiments, Vol. 1, edited by Esther Duflo and Abhijit Banerjee. Amsterdam: North Holland.
Stango, Victor, and Jonathan Zinman. 2014. “Limited and Varying Consumer Attention: Evidence from Shocks to the Salience of Bank Overdraft Fees.” Review of Financial Studies 27 (4): 990–1030.
Thaler, Richard H., and Cass R. Sunstein. 2008. Nudge: Improving Decisions about Health, Wealth, and Happiness. New Haven: Yale University Press.
Willis, Lauren E. 2013. “When Nudges Fail: Slippery Defaults.” University of Chicago Law Review 80 (3): 1155–229.