Essays on the Economics of Water by Nicholas W. Hagerty Sc.B., Brown University (2010)

Submitted to the Department of Economics in partial fulfillment of the requirements for the degree of Doctor of Philosophy at the

MASSACHUSETTS INSTITUTE OF TECHNOLOGY June 2018 c 2018 Nicholas W. Hagerty. All rights reserved.

The author hereby grants to MIT permission to reproduce and to distribute publicly paper and electronic copies of this thesis document in whole or in part in any medium now known or hereafter created.

Author...... Department of Economics May 15, 2018

Certified by ...... Esther Duflo Abdul Latif Jameel Professor of Poverty Alleviation and Development Economics Thesis Supervisor Certified by ...... Benjamin Olken Professor of Economics Thesis Supervisor

Accepted by...... Ricardo Caballero Ford International Professor of Economics Chairman, Departmental Committee on Graduate Studies 2 Essays on the Economics of Water by Nicholas W. Hagerty

Submitted to the Department of Economics on May 15, 2018, in partial fulfillment of the requirements for the degree of Doctor of Philosophy

Abstract This thesis studies three questions in the economics of water resource management. Chapter 1 estimates the economic gains available from greater use of large-scale water markets in California. I develop a revealed-preference empirical approach that exploits observed choices in the existing water market, and I apply it to comprehensive new data on California’s water economy. This approach overcomes the challenge posed by transaction costs, which insert an unobservable wedge between observed prices and marginal valuations. First, I directly estimate transaction costs and use them to recover equilibrium marginal valuations. Then, I use supply shocks to estimate price elasticities of demand, which govern how marginal valuations vary with quantity. I find even a relatively modest market scenario would create additional benefits of $480 million per year, which can be weighed against both the benefits of existing market restrictions and the setup costs of larger-scale markets. Chapter 2 estimates the possible costs of industrial to agriculture in , focusing on 63 industrial sites identified by the central government as “severely polluted.” I exploit the spatial discontinuity in pollution concentrations that these sites generate along a river. First, I show that these sites do in fact coincide with a large, discontinuous rise in pollutant concentrations in the nearest river. Then, I find some evidence that agricultural revenues may be substantially lower in districts immediately downstream of polluting sites, relative to districts immediately upstream of the same site in the same year. These results suggest that damages to agriculture could represent a major cost of water pollution. Chapter 3, co-authored with Ariel Zucker, presents an experimental protocol for a project that pays smallholder farmers in India to reduce their consumption of . This project will test the effectiveness of payments for voluntary conservation – a policy instrument that may be able to sidestep regulatory constraints common in developing countries. It will also measure the price response of demand for groundwater in irrigated agriculture, a key input to many possible reforms. Evidence from a pilot suggests that the program may have reduced groundwater pumping by a large amount, though confidence intervals are wide.

Thesis Supervisor: Esther Duflo Title: Abdul Latif Jameel Professor of Poverty Alleviation and Development Economics

Thesis Supervisor: Benjamin Olken Title: Professor of Economics

3 4 Acknowledgments

I owe tremendous gratitude to my advisors, Ben Olken, Esther Duflo, and Chris Knittel, for their guidance, insight, and wisdom. Through generous time, attention, support, and mentorship, they have kept me inspired, open-minded, on track, and reaching high. I also thank Michael Greenstone, my advisor earlier in graduate school, who gave me crucial opportunities and kept my focus on the big picture. This thesis benefited dramatically from the comments and feedback of other MIT Economics fac- ulty members, who view graduate education as a collective responsibility. I am particularly grateful to Abhijit Banerjee, Amy Finkelstein, Dave Donaldson, David Autor, David Atkin, Jim Poterba, and Simon Jäger. Thanks to Clay Landry and WestWater Research, LLC, for sharing the data that made the first chapter possible, and Divyang Waghela and the staff of the Coastal Salinity Prevention Cell for making the third chapter possible. For excellent research assistance I thank J-PAL staff mem- bers Neha Doshi, Gokul Sampath, Aditya Madhusudan, and Suresh Bharda, and MIT undergraduate student Kavish Gandhi. Throughout graduate school, I may have learned the most from my fellow Ph.D. students. For years of collaboration, camaraderie, and friendship, I want to especially thank Arianna Ornaghi, Ariel Zucker, Donghee Jo, Gabriel Kreindler, Jack Liebersohn, John Firth, Josh Dean, Matt Lowe, Michael Abrahams, Peter Hull, and Rachael Meager. I am also grateful for funding from the MIT Energy Initiative, the MIT Tata Center for Technology and Design, the Abdul Latif Jameel World Water and Food Security Lab (J-WAFS) at MIT, the Weiss Fund for Research in Development Economics, and the International Growth Centre. I am beyond fortunate to have benefited from a long line of extraordinary mentors who helped me reach this point. To name a few standouts: Arik Levinson encouraged me and helped confirm my decision to pursue a Ph.D. Sri Nagavarapu indulged my undergraduate overeagerness and inspired me to think of economics as a potential career path. Amit Kobrowski first taught me to truly read and think critically. Derek Stein, Holly Simon, and Antonio Baptista introduced me to scientific research and were generous with their time and patience to a degree that in retrospect I can scarcely comprehend. Finally, I thank my good friends, my partner Ann, my brother Steve, and my parents for constant love and support. My parents gave me my intellectual curiosity, my sense of justice, and my care for the natural environment – and no matter what I get myself into, they always know exactly the right thing to say. I dedicate this thesis to them.

5 6 Contents

1 Liquid Constrained in California: Estimating the Potential Gains from Water Markets 15 1.1 Introduction ...... 15 1.2 Background on Water in California ...... 21 1.2.1 Water is initially allocated by fixed rules and environmental conditions . . . 22 1.2.2 Secondary markets are inhibited by transaction costs ...... 24 1.3 Theoretical Framework ...... 26 1.3.1 Simplified graphical model ...... 27 1.3.2 Model of an exchange economy with transaction costs ...... 28 1.3.3 From theory to estimation ...... 31 1.4 New Data on California’s Water Economy ...... 37 1.4.1 Water transactions ...... 37 1.4.2 Water entitlements ...... 38 1.4.3 Crosswalk file and user locations file ...... 39 1.5 Step 1: Estimating Transaction Costs from Observable Factors ...... 39 1.5.1 Selecting cost factors ...... 40 1.5.2 Econometric specification ...... 40 1.5.3 Results ...... 42 1.5.4 Discussion ...... 44 1.6 Step 2: Recovering Equilibrium Marginal Valuations of Water ...... 45 1.7 Step 3: Estimating Demand Elasticities ...... 46 1.7.1 Empirical strategy ...... 47 1.7.2 Results ...... 51 1.8 Step 4: Simulating the Gains from Trade ...... 52 1.8.1 Setting up simulations ...... 53 1.8.2 Results ...... 54 1.9 Conclusion ...... 56 1.10 Figures ...... 58 1.11 Tables ...... 67 1.12 Appendix: Validating Low Marginal Values of Water in Agriculture ...... 73 1.12.1 Data ...... 73 1.12.2 Outcome variables ...... 74 1.12.3 Empirical strategy ...... 75 1.12.4 Results ...... 76 1.13 Appendix: Market Power as an Alternative Explanation for Price Gaps ...... 80 1.13.1 Model of spatial trade with market power ...... 80

7 1.13.2 Deriving the estimation procedure ...... 82 1.13.3 Empirical implementation ...... 84 1.13.4 Results ...... 85 1.14 Appendix: Nash-in-Nash Bargaining as an Alternative Model ...... 89 1.14.1 Basic bilateral negotiation ...... 89 1.14.2 Nash equilibrium in simultaneous bilateral negotiations ...... 90 1.14.3 Small transaction quantities ...... 91 1.14.4 Fixed costs ...... 92 1.15 Appendix: Supplementary Tables ...... 94 1.16 Data Appendix ...... 96 1.16.1 Transactions ...... 96 1.16.2 Entitlements and Deliveries ...... 98 1.16.3 Quantity consumed ...... 104 1.16.4 Crosswalk file ...... 104 1.16.5 User location file ...... 105

2 The Costs of Industrial Water Pollution to 107 2.1 Introduction ...... 107 2.2 Background ...... 109 2.2.1 Industrial water pollution and crops ...... 109 2.2.2 Welfare measures ...... 110 2.3 Data ...... 111 2.3.1 Sources ...... 111 2.3.2 Matching pollution data to industrial sites ...... 113 2.3.3 Matching agricultural data to industrial sites ...... 114 2.3.4 Covariate balance ...... 114 2.4 Empirical strategy ...... 115 2.4.1 Regression discontinuity in pollution ...... 115 2.4.2 Spatial matching in agricultural outcomes ...... 116 2.5 Results ...... 117 2.5.1 Pollution ...... 117 2.5.2 Agricultural outcomes ...... 118 2.6 Conclusion ...... 120 2.7 Figures ...... 121 2.8 Tables ...... 123

3 Measuring Demand for Groundwater Irrigation: An Experimental Protocol Using Con- servation Payments 127 3.1 Introduction ...... 127 3.2 Background ...... 130 3.2.1 Optimal groundwater policy: A framework ...... 130 3.2.2 Existing evidence: Costs, benefits, and damages of groundwater extraction in irrigated agriculture ...... 132 3.2.3 Conservation credits as a Pigouvian tax ...... 133 3.2.4 Equivalence between water, energy, and pumping time ...... 134 3.3 Study Setting and Experimental Design ...... 134

8 3.3.1 Setting ...... 135 3.3.2 Enrollment ...... 135 3.3.3 Sample size ...... 136 3.3.4 Randomization ...... 137 3.3.5 Interventions ...... 138 3.4 Data ...... 139 3.4.1 Data collection ...... 139 3.4.2 Data processing ...... 140 3.5 Analysis Plan ...... 141 3.5.1 Balance checks ...... 142 3.5.2 Program evaluation via intent-to-treat estimates ...... 142 3.5.3 Demand estimation via instrumental variables ...... 146 3.5.4 Cost-effectiveness ...... 147 3.6 Pilot Results ...... 149 3.7 Conclusion ...... 150 3.8 Figures ...... 152 3.9 Tables ...... 158

Bibliography 159

9 10 List of Figures

1-1 Structure of water entitlements in California...... 58 1-2 Observed volume of market transactions compared with total water supply...... 59 1-3 Trade between two districts, buying district d (for destination) and selling district o (for origin)...... 60 1-4 Distribution of prices, controlling for year, logarithmic scale...... 61 1-5 Distributions of estimated equilibrium marginal valuations for buyers and sellers. . . 62 1-6 Estimated equilibrium marginal valuations by sector and year...... 63 1-7 Estimated equilibrium marginal valuations by geography...... 64 1-8 Entitlements by market over time...... 65 1-9 Annual water quantities traded in the existing market and counterfactual scenarios, by geography...... 66 1-10 Estimated pass-through rates ...... 87

2-1 Locations of “severely polluted” industrial sites (red dots) and water pollution mea- surement stations (green dots)...... 121 2-2 Discontinuity in river pollution levels at the point of “severely polluted” industrial sites.122

3-1 Price regulation for groundwater management...... 152 3-2 Budget set of conservation credits...... 153 3-3 Power curves for a range of sample sizes and minimum detectable effects...... 154 3-4 Experimental design ...... 155 3-5 Distribution of monthly hours of groundwater irrigation, pooled across months . . . 156 3-6 Event study ...... 157

11 12 List of Tables

1.1 Summary statistics of transactions data ...... 67 1.2 Cost factors for which marginal transaction costs are econometrically identified in price data ...... 68 1.3 Marginal transaction costs from specific cost factors ...... 69 1.4 Aggregate gains from reducing regulatory costs ...... 70 1.5 Demand elasticities ...... 71 1.6 Economic benefits from water markets in several scenarios ...... 72 1.7 Effects of surface water entitlements on land use ...... 78 1.8 Effects of surface water entitlements on profits and revenues ...... 79 1.9 Marginal transaction costs, adjusting for incomplete pass-through ...... 88 1.10 Marginal transaction costs: robustness checks ...... 94 1.11 Demand elasticities: further regressions ...... 95

2.1 Covariate balance (full sample)...... 123 2.2 Covariate balance (sample trimmed by excluding sites with a large upstream-downstream difference in baseline irrigation rates)...... 124 2.3 Local linear regressions of water pollution estimated at the point of a “severely pol- luted” industrial site...... 125 2.4 Regressions of agricultural revenues comparing districts immediately upstream and downstream of “severely polluted” industrial sites...... 125 2.5 Regressions of several agricultural outcomes comparing districts immediately up- stream and downstream of “severely polluted” industrial sites...... 126

3.1 Intent-to-treat effect of conservation credits program in pilot...... 158

13 14 Chapter 1

Liquid Constrained in California: Estimating the Potential Gains from Water Markets

1.1 Introduction

Water supplies are becoming scarcer and more variable in many parts of the world (UNDP 2006). Fueled by population pressure and climate change, water scarcity can increase poverty and conflict (Sekhri 2014; Burke et al. 2015) and is set to worsen in the coming decades (World Bank Group 2016). To help societies adapt to water scarcity, many observers advocate for greater use of water markets. Like other markets, water markets may yield benefits by allocating scarce resources to the greatest social need and allowing participants to flexibly respond to changing conditions. However, the costs of setting up water markets can be substantial, often involving major new infrastructure for conveyance and monitoring, as well as overhauls of legal and regulatory institutions. Meanwhile, the benefits can be highly uncertain. Historical experience is thin, since robust, large-scale water markets are rare even in wealthy industrialized countries (with the exception of Australia’s Murray-Darling Basin). Relatively little quantitative research is available to estimate these prospective benefits and help policymakers weigh them against the costs. In this chapter, I provide an estimate of the potential gains from a more efficient water market in California. I take a revealed-preference approach, in which welfare calculations depend on parameters that I estimate from observed trading behavior in the existing water market. I begin by modeling California’s thin water market as an exchange economy with transaction costs, rationalizing observed transaction prices with an unobserved set of transaction costs and demand curves. As in any exchange

15 economy, these demand curves plus initial endowments are sufficient to find the efficient allocation without transaction costs. I derive an empirical procedure to construct demand curves, and I apply it to comprehensive new data on water transactions and annual entitlements in California. Lastly, I combine the demand curves to simulate the result of a fully efficient market and calculate the resulting gains from trade. There are two key reasons why California is a useful setting in which to study the potential benefits of water markets. First, there are potentially large economic gains available from water reallocation. California has a large and diverse economy, but most of it depends on water supplies that are imported over great distances and prone to droughts. This water is not allocated by a price mechanism but rather according to complex, historically-determined rules and formulas. A secondary market exists, but it is highly regulated, transaction volume is low, and price dispersion is high. Perhaps as a result, retail prices can vary over more than two orders of magnitude: in 2017, according to their websites, commercial and industrial customers in the city of San Diego paid $2,491 per acre- foot1, while agricultural customers less than two hours away in the Imperial Valley paid just $20 per acre-foot. Second, substantial reallocation is achievable through policy reforms alone. California has already built most of the physical infrastructure necessary to support a robust water market, and it has spare capacity. Canals, pipelines, and rivers together form a nearly complete hydrological network connecting the vast majority of water users in the state, such that it is technologically feasible to transfer water between nearly any two consumers. Two salient features of California’s statewide water market are high price dispersion and low transaction volume, relative to the number of water districts and independent consumers who might transact. To rationalize these facts as equilibrium outcomes, I model California’s water market as an exchange economy in which consumers trade endowments of a single homogeneous good via intermediaries. Consumers incur transaction costs that may be location pair-specific and directionally asymmetric. Transaction costs, which I define broadly, may arise from a range of cost factors, some of which are observable (such as regulatory reviews or conveyance distance) and others which are not (such as search or contracting difficulty). The results of this model rationalize a set of observed trading outcomes with a set of demand curves and transaction costs, and they provide an equilibrium condition that can be used to empirically recover these objects.

1An acre-foot, the standard unit of volume for water in California, is the amount of water that would cover one acre of land with twelve inches of water.

16 My goal is to calculate the possible gains from trade from an efficient market without transaction costs. (This scenario could be achieved through a combination of deregulation – i.e., streamlining the policies governing water transfers – and the creation of market-supporting institutions – e.g., setting up a central exchange – that together eliminate the wedge between buying or selling and in situ use.) I note that the efficient allocation is fully determined by initial endowments (which can be read directly from the data) and demand curves. To construct these demand curves and estimate the potential gains from trade, I follow a four-step empirical procedure:

Step 1: Estimate transaction costs arising from observable cost factors, by comparing prices across different transactions. Many water districts sell to (or buy from) more than one other district in a given year. Sometimes, one of these transactions is subject to an ob- servable cost factor (such as an extra regulatory review) and another is not. Under an assumption of perfect competition, a district transacting with multiple counterparties is indifferent between them in equilibrium, so any difference in prices can be interpreted as marginal transaction costs. By averaging over all such cases for both buyers and sellers, I estimate the marginal transaction costs associated with specific, observable regulatory and physical cost factors.

Step 2: Recover market participants’ marginal valuations of water at the observed equilib- rium, using revealed-preference conditions on prices. I estimate marginal valuations of water for all market participants using simple revealed-preference conditions: a buy- ing district’s marginal valuation must be at least as high as its highest price paid, and a selling district’s marginal valuation must be at least as low as its lowest price accepted, after adjusting prices for observed transaction costs (as estimated in Step 1). If there are remaining unobserved transaction costs, these estimates will understate the true disper- sion in marginal valuations, in which case my final results would likely provide a lower bound on the potential gains from trade.

Step 3: Estimate price elasticities of demand, exploiting supply shocks driven by weather and amplified by historical allocation rules. Step 2 gives an equilibrium point on each district’s demand curve; to extrapolate away from equilibrium, I also need an elasticity. To identify price elasticities of demand, I exploit California’s historically-determined al- location rules, which turn shared precipitation shocks into vastly different supply shocks

17 for different regions. Because transaction costs lead to persistence of these initial alloca- tions, I can measure how steeply marginal valuations change in response to exogenous changes in quantity consumed.

Step 4: Simulate the efficient allocation and calculate the resulting gains in consumer sur- plus, by combining demand curves in an optimization problem. I first construct demand curves by combining my estimates of equilibrium marginal valuations and de- mand elasticities with data on equilibrium quantities. Then, I combine demand curves in a constrained optimization problem to solve the social planner’s problem, finding the ef- ficient allocation and calculating the resulting gain in consumer surplus. An ideal market could implement the efficient allocation, so this gain represents the value of an efficient market. Since true physical transportation costs can never be eliminated, I include them (as estimated in Step 1) in the objective function.

To conduct this empirical analysis, I construct what may be the most comprehensive dataset yet compiled on California’s water economy. First, I assemble for the first time the universe of yearly surface water entitlements in California, including federal and state water project allocations and surface water rights. Second, I use a proprietary dataset on open-market water transactions that to my knowledge is the most complete in existence; crucially, it provides a mostly complete record of prices. I build a large crosswalk file to link users across datasets and years and a geospatial dataset on user locations and boundaries. For supplementary analysis in an appendix, I also incorporate two uniquely high-resolution datasets on land use and farm-level agricultural finances. In Step 1, I document large price gaps resulting from specific cost factors. For example, I find that transactions that must cross the Sacramento–San Joaquin Delta (an environmentally sensitive juncture triggering additional regulatory reviews) are associated with a price premium for sellers of $258 per acre-foot, and a price discount for buyers of $65. I interpret these as marginal transaction costs, totaling $324 per acre-foot. These are large as compared with the mean price in my data, $144. Other regulatory reviews also each result in similarly large marginal transaction costs. These results suggest that if state and federal agencies reduced the compliance costs, fees, risks, and/or water losses associated with their regulatory jurisdictions, the gains to market participants would be on the order of several hundred dollars for every additional acre-foot transferred. In Steps 2 and 3, I obtain two intermediate empirical results that are consistent with conventional

18 wisdom in California. First, equilibrium marginal valuations are about 80% higher on average for urban districts than for agricultural districts, a relationship that has remained consistent over time. This suggests that substantial welfare gains are available from reducing transaction costs and enabling farms to sell more water to cities. Second, agricultural districts have more elastic water demand than urban districts; I estimate that the price elasticity of demand is -1.40 for the agricultural sector and -0.74 for the urban sector. This suggests that the full incidence of transaction costs in the water market falls more heavily on agriculture. My central results come from Step 4. First, I find that observed trading in the existing market achieves welfare gains of $150 million per year, an amount roughly equal to the market value of sales ($140 million per year). Then, I simulate an intermediate market scenario, in which all regions are allowed to trade the same percentage of their water as the region that currently exports the greatest volume of water, or 17.5%. I estimate that this intermediate scenario would result in additional gains of $480 million per year – more than quadrupling the benefits of the existing market. Scenarios allowing full equalization of marginal valuations (up to physical transportation costs) generate even larger gains. These results carry three important limitations. First, I cannot identify transaction costs that are both unobserved and constant within user, such as search or contracting costs. This may lead my estimates to understate the true dispersion in marginal valuations. Because greater dispersion in marginal valuations results in greater gains from trade, my simulations likely are lower bounds on the true potential gains. Second, I treat water districts as the key economic agents, leading me to miss any gains from reallocation among the retail customers within these water districts. Third, my approach accounts for only the potential benefits of water markets. It omits the ben- efits from existing market-restricting regulations (such as achieving ecological goals and avoiding hydrological externalities) as well as the costs of setting up market-supporting institutions (such as expanded water-use monitoring systems or a centralized trading exchange) that may be required to achieve the full potential benefits. Many stakeholders in California believe there is scope for reforms to dramatically simplify regulatory review processes while accomplishing the same environmental goals; regardless, a complete policy analysis needs to account for both the benefits and costs of real- locating water. Finally, in appendices, I perform two further analyses supporting these main results. First, I conduct a reduced-form analysis of agricultural outcomes to help validate the result that marginal

19 valuations of water are low in agriculture relative to cities. I find that cutbacks in water entitlements during droughts (1) do not affect whether farmers plant more or less water-intensive crops; (2) in- crease land fallowing, but only by a small amount; and (3) have at most a small effect on profits. Taken together, these results suggest that the marginal value of water is indeed low in agriculture, and that substantial efficiency gains are available from making it easier for farmers to sell water to urban areas. Second, I investigate whether the price gaps I document in Step 1 might partly be explained by market power rather than marginal transaction costs. I relax the assumption of perfect competition and allow both buyers and sellers to exercise market power, extending my model in ways that follow Atkin and Donaldson (2015). I then derive a further two-step empirical procedure to adjust prices for possible markups and markdowns and to re-estimate marginal transaction costs, net of market power. The results do not suggest that market power explains the large marginal transaction costs I estimate in Step 1, and so I interpret the gains from trade estimated in Step 4 as true deadweight loss rather than transfers between buyers and sellers. This chapter makes several contributions. First, I provide a new approach to estimating the prospective gains from trade in water markets, and I assemble new data that enables this approach to overcome previous limitations. My approach differs from the prior literature in two important ways: (1) it is based on a small number of parameters that are econometrically estimated within the model, and (2) these parameters are estimated from data on observed transactions. A large literature uses calibrated optimization models to estimate the prospective gains from water markets, for California (Howitt et al. 1999; Sunding et al. 2002; Jenkins et al. 2003; Medellín-Azuara et al. 2007) as well as for Australia (Peterson et al. 2005; Qureshi et al. 2009) and Chile (Rosegrant et al. 2000). While these models incorporate rich institutional and scientific knowledge, their economic components rely on large numbers of imputed parameters and functional form assumptions (Mérel and Howitt 2014). In contrast, my approach is more parsimonious and estimates parameters with particular attention to causal identification. In addition, by inferring the preferences of market participants directly from observed transactions, I sidestep the need to directly model agricultural production functions or other fundamental determinants of water demand. Working directly with the implied objective functions of water districts also makes my approach more policy-relevant, since these districts will continue to be the primary market participants under most proposals aiming to strengthen water markets. There may be additional frictions between the districts and their retail customers, but eliminating them would

20 likely be more difficult than simply easing trading among districts. More generally, this chapter proposes a method to analyze the welfare impacts of transaction costs in thin asset markets. This may be particularly relevant to other settings in environmental economics, such as pollution permits or individual transferable quotas. There is a large literature in financial economics on thin markets and liquidity, but it is typically focused on strategic trading rather than transaction costs (Kyle 1989; Rostek and Weretka 2012, 2015). The literature on pollution permit markets covers the theoretical effects of market power (Hahn 1984; Malueg and Yates 2009; Liski and Montero 2011) and transaction costs (Stavins 1995; Liski 2001), with some empirical analysis of transaction costs (Gangadharan 2000; Cason and Gangadharan 2003), but there are few empirical studies of the welfare impacts of transaction costs. In the context of water, Carey et al. (2002) and Regnacq et al. (2016) study the effects of transaction costs on trading quantities in water markets, while Ayres et al. (2017) study transaction costs in groundwater management decisions. This chapter also relates to a literature in international trade that estimates trade costs from price gaps (Donaldson 2012; Atkin and Donaldson 2015; Bergquist 2016). Furthermore, it contributes to a broad and growing literature on the costs of misallocation, in settings such as housing (Glaeser and Luttmer 2003), capital (Hsieh and Klenow 2009), energy (Davis and Kilian 2011), labor (Bryan and Morten 2015; Adamopoulos et al. 2017), and land (Restuccia and Santaeulalia-Llopis 2017). Finally, my results contribute to a literature in agricultural economics on the value of water in irrigated agriculture (Schlenker et al. 2007; Buck et al. 2014; Mukherjee and Schwabe 2014). I find in both revealed-preference and profits-based analyses that the marginal value of water is small. Prior studies find much larger estimates, but in using land values, they measure a long-run marginal value. My approach instead measures within-year marginal values, in which farmers likely have greater ability to substitute toward groundwater.

1.2 Background on Water in California

Water is scarce in most of California, the largest economy and most populous state in the United States. A majority of the state’s population lives in Southern California, where there is little rainfall and no major rivers. Farms in the Central Valley, a major agricultural region, receive little rainfall during the summer growing season and instead rely on irrigation. Most precipitation in the state falls during the winter in mountain ranges in the north and east.

21 Moving water throughout the state is technologically feasible, thanks to an interconnected system of water infrastructure that is the world’s most complex. Federal, state, and local authorities operate canals and pipelines that, together with the river system, form a fully connected hydrological net- work among the vast majority of water users in the state of California. Although there are capacity constraints, at the margin it is possible to transfer water between nearly any two consumers in the state. California can be thought of as a closed hydrological system. Essentially none of its precipitation flows to other states or countries. The only major water source it shares with other states is the Col- orado River, but the amounts that each state receives are governed by long-term interstate compacts that I treat as fixed.

1.2.1 Water is initially allocated by fixed rules and environmental conditions

Property rights to water in California are distributed not according to private or social value but instead following a complex system of historical precedent. Some districts and consumers hold entitlements that are almost never curtailed, while others are rationed according to precipitation and runoff during the previous winter. In my empirical exercises, I exploit this rationing to estimate demand elasticities and the marginal value of water in agriculture. I summarize California’s hierarchy of water entitlements in Figure 1-1. All lawful surface water use in the state derives from a legal framework of appropriative and riparian rights. Some independent consumers (such as rural households or isolated farmers) hold their own water rights; others obtain water from federal or state water projects. More commonly, retail consumers (including farms, house- holds, and other consumers) obtain water from their local water district, which in turn either holds its own water rights or long-term contracts with the federal or state water projects. Water districts go by many legal classifications, such as irrigation district, water agency, or mutual water company, and may be public, private, or a blend. They may also have multiple layers, in which a wholesale district sells to retail districts. The different sources of water entitlements are governed by different allocation rules:

1. Appropriative and riparian rights. Rights follow a seniority rule determined by the date of first use; in droughts, senior rights-holders are entitled to their full claim before junior rights- holders are entitled to any. However, it is rare for this seniority system to substantially affect

22 water diversions in major rivers, since the residual claimants are generally the high-volume federal and state water projects.

2. Central Valley Project (CVP). Operated by the U.S. Bureau of Reclamation (USBR), the CVP stores and delivers water to irrigation districts, municipal water districts, and individual farms throughout the Central Valley. Contractors do not buy water at market-clearing rates; instead they are entitled to a certain volume of water each year. Contractors each have a fixed maximum volume, specified in multi-decade contracts. Actual yearly allocations vary from year to year, mostly on the basis of weather in the mountains during the previous winter, as well as environmental regulations. Allocations are announced as percentages of maximum contract volumes, determined separately for each of 14 contract categories based on history, geography, and sector. Some categories tend to have priority over others, but the ordering is neither constant (due to regional differences in water availability) nor lexicographic (like appropriative rights) (Stene 1995).

3. State Water Project (SWP). Operated by the California Department of Water Resources (DWR), the SWP stores and distributes water to users throughout the state. Contractors do not buy water at market-clearing rates but rather are entitled to a certain volume of water each year. SWP entitlements also vary from year to year on the basis of weather during the previous winter as well as environmental regulations. Contractors each have a fixed maximum annual volume, specified in multi-decade contracts. In droughts, all contractors receive cutbacks in equal proportion, except before 1992 when there were different proportions for urban and agri- cultural contractors. These proportions have varied from 100 percent as recently as 2006 to 5 percent of contract maximums in 2014 (California Department of Water Resources 2015).

4. Lower Colorado Project. Also operated by the USBR, the Lower Colorado Project distributes California’s share of Colorado River water (fixed in compacts dating back to 1922) to con- tractors in Southern California. To date, California’s Lower Colorado contactors have always received their full entitlements.

5. Groundwater. Local groundwater is another major source of water for both farms and cities, but its use is generally unmonitored. Availability and pumping costs vary considerably across regions. In this chapter, I treat local groundwater supplies as fixed.

23 1.2.2 Secondary markets are inhibited by transaction costs

Because water is not allocated according to a price mechanism, a robust secondary market might be expected. In fact, California’s statewide water market is thin. Figure 1-2 plots the total volume of market transactions over time, as compared with total water supply, using data described in Section 1.4. I focus on the statewide water market, which I define as transactions directly among districts and other independent consumers at freely negotiated prices. This definition excludes transactions be- tween wholesale districts and retail districts, and between retail districts and retail consumers. Such transactions take place within fixed, long-term relationships in which neither prices nor quantities are always flexible. It also excludes transactions involving retail customers within a water district. Intra-district transactions between consumers are rare in urban water districts but common in some irrigation districts. Unfortunately, data is scarce: even when districts keep records of these transac- tions, they rarely record prices. Retail consumers are usually not allowed or able to negotiate their own transactions with external districts; instead they must rely on their own district to represent their interests on the statewide water market. Many factors may make transactions in this market costly. Next, I outline a typology of transac- tion costs, building on Regnacq et al. (2016), Scheer (2016), and others.

Administrative transaction costs To trade water, a buyer or seller must first search for a po- tential trading partner. Without a central exchange, this happens mostly by word of mouth in social networks; sometimes a professional broker helps with matchmaking. There is no single standard con- tract for water transactions. The buyer and seller must negotiate over the quantity, duration, price, payment terms, delivery date, point of delivery, and delivery pathway. Transaction durations fall into two basic types: (1) permanent sales of water rights or contract entitlements, and (2) intra-year leases, in which the seller transfers a certain quantity of water while retaining the underlying entitle- ment. Together, search and contracting processes may create considerable administrative transaction costs, both explicit (i.e., attorney fees) and implicit (e.g., hassle costs), for both buyers and sellers.

Physical transaction costs Water is heavy; moving it from one place to another is costly. Wa- ter is lost in conveyance to evaporation and percolation. Pumping water uphill into canals requires energy to run turbines. Not all transactions incur these costs: upstream transfers may not incur any

24 conveyance losses, while downstream transfers on a river may not incur any pumping costs. However, these pure physical costs may differ from the costs directly incurred by buyers or sellers. Transactors pay “wheeling” charges to the owners of the intermediate conveyance facilities along the delivery pathway, including canals, pumping stations, and reservoirs. Wheeling charges are not generally equal to the true physical marginal cost of conveyance and pumping; some stakeholders believe they are often substantially higher (Western Water Company 2000).

Regulatory transaction costs Proposed transactions can be subject to regulatory review by three main agencies: California’s State Water Resources Control Board (SWRCB), California’s De- partment of Water Resources (DWR), and the U.S. Bureau of Reclamation (USBR). Some transac- tions are reviewed by counties. Depending on the proposed source and destination, transactions may be reviewed by more than one of these agencies, or none of them. In these reviews, agencies (1) require sellers to compile records proving they have legal enti- tlement to the water and physical ability to transfer it; (2) carefully estimate consumptive use (the amount not returned to the water system); (3) set up monitoring systems to verify sellers do not continue using water once transferred; (4) conduct extensive environmental impact analyses to meet requirements of the National Environmental Policy Act (NEPA) and the California Environmental Quality Act (CEQA); (5) estimate impacts on the local economy; and (6) schedule delivery for a time with available capacity (California State Water Resources Control Board 1999; California Department of Water Resources and U.S. Bureau of Reclamation 2015). Transactions that move water across the Sacramento–San Joaquin Delta, the biggest bottleneck in the system, must meet additional environmental regulations concerning outflow volumes, salin- ity levels, and endangered fish. These transactions can be risky; because the DWR and USBR are sometimes not allowed to pump water into their canals, there is no guarantee that the seller’s water will actually reach the buyer. In addition, these transactions are assessed “carriage losses” to satisfy environmental goals and regulatory constraints. In sum, regulatory reviews may create substantial policy-induced transaction costs, which again may be explicit or implicit. Explicit costs include agency review fees and the engineering fees to prepare documents. Implicit costs include the hassle or general disutility from the process, the time costs of the review plus its public notice and public comment periods, and the risks of disapproval or delivery failure. Both buyers and sellers may bear these costs.

25 Political-economy transaction costs Implicit transaction costs may also arise from political economy effects. First, water districts imperfectly represent the interests of their retail customers, driving a wedge between statewide water prices and an individual farmer’s or household’s willingness to pay or to accept. This is especially relevant in districts controlled by popular vote rather than property value or land area. Second, farmers may be reluctant to sell water because they fear voters will think they don’t need it and may take away their property rights in the future (Carey and Sunding 2001).

Many types of transaction costs may be pure loss to society, such as administrative costs. Some types may represent transfers, in the case of attorney fees. Others may have important benefits in pre- venting negative externalities. Environmental reviews protect public goods like wildlife and ecosys- tem services, and determining consumptive use protects the property rights of downstream users. Political economy constraints may help prevent pecuniary externalities to the origin community. However, the evidentiary standards for regulatory approval of water transactions are much higher than for the use of the water in the first place. Many stakeholders have proposed reforms that could accomplish many of the same regulatory goals while streamlining the process (Western Water Com- pany 2000; Culp et al. 2014; Gray et al. 2015; Association of California Water Agencies 2016). In my analysis, I focus on the costs of these regulations, which then can be weighed against their benefits. In addition to the costs incurred for transactions that do take place, transaction costs also may prevent many mutually beneficial trades from occurring.

1.3 Theoretical Framework

In this section I lay out a model of California’s water market to guide my empirical analysis. This model serves two purposes. First, it allows me to rationalize transactions in the data as equilibrium outcomes that are consistent with a conceptually precise set of demand curves and transaction costs. This can explain the high price dispersion and low transaction volume seen in California, and it also clarifies the relationship between the observed equilibrium and the efficient allocation. Second, the model yields equilibrium conditions that allow me to empirically identify these demand curves and find the efficient allocation. I first present a simplified graphical model to build intuition. I then model an exchange economy in which consumers trade endowments of a homogeneous good through intermediaries, which incur

26 transaction costs. Finally, I derive my four-step empirical procedure.

1.3.1 Simplified graphical model

Consider two districts, d and o (for destination and origin), which might be thought of as an urban

water district d and agricultural irrigation district o. They have initial endowments of water Eo and

Ed and inverse demand curves Vd(Qd) and Vo(Qo), which give their consumers’ marginal valuations of water as a function of quantity demanded. Figure 1-3 plots these two inverse demand curves, with the axis reversed for district o and lined up such that the overall width of the graph is equal to the overall resource constraint – the sum of the endowments – in the style of an Edgeworth box. If consumers in these two districts were allowed to costlessly trade with each other, we would ∗ ∗ ∗ expect them to arrive at the competitive equilibrium, resulting in allocation (Qd,Qo) and price P . 0 0 Suppose they instead arrive at allocation (Qd,Qo) and transaction price Pod. Also suppose that each district is internally efficient such that at this new allocation they reach internally district-clearing 0 0 prices Pd = Vd(Qd) and Po = Vo(Qo), where Pd > Pod > Po. How can we make sense of this scenario, in which three different prices are observed in equilibrium?

Observation 1: Transaction costs can rationalize price gaps. Different prices can be observed in equilibrium within and across districts if trade across districts incurs a marginal (per-unit) transaction b cost. In Figure 1-3, buyers in d must be paying per-unit transaction cost τ = Pd − Pod, while sellers s in o must be paying transaction cost τ = Pod − Po. These transaction costs shift down the buyer’s b marginal willingness-to-pay for cross-district trade, Vd − τ , relative to its marginal valuation: the buyer requires a price discount equal to the transaction costs in order to be indifferent between origins. s Similarly, the seller’s marginal willingness-to-accept for cross-district trade, Vo +τ , is higher than its marginal valuation: the seller requires a price premium in order to be indifferent between destinations. Under an assumption of perfect competition, transaction costs must be exactly equal to the price gap. If they were lower, the buyer and seller would trade more, and if they were higher, the buyer and seller would trade less (or not at all).

Observation 2: Potential gains from trade can be estimated with knowledge of demand curves and initial endowments. The welfare gains from reducing transaction costs can be calculated as the change in consumer surplus of the two districts. This equals ∆CSd +∆CSo, as shown in Figure 1-3.

27 Buyers’ consumer surplus is standard welfare analysis; sellers’ consumer surplus here is analogous to producer surplus. Four vectors of information suffice to approximate this gain: initial endow- 0 0 ments (Ed,Eo), baseline allocation (Qd,Qo), internal prices at this allocation (Pd,Po), and demand

elasticities (ηd,ηo).

Unfortunately, Pd and Po are unobserved in my data, which contains mostly cross-district trans- actions. To overcome this obstacle, below I develop an empirical approach leveraging the fact that districts often trade with more than one other district.

1.3.2 Model of an exchange economy with transaction costs

I now develop a more formal model, in order to more precisely clarify the insights above, microfound them with explicit assumptions, and extend them to more than two consumers. In this model, two layers of intermediaries engage in spatial arbitrage of a homogeneous good across districts. Interme- diaries, which might be thought of as brokers representing each district, incur transaction costs that may be district pair-specific and directionally asymmetric. This reflects that physical, regulatory, and other costs vary not only across buyers and sellers but also by who they choose to transact with, and in which direction. Perfect competition leads to the result that prices equalize marginal valuations, up to transaction costs. The primary purposes of the model are to (1) rationalize price dispersion in equilibrium, and (2) motivate the idea that water market participants buy at their marginal willingness to pay and sell at their marginal willingness to accept. Several alternative models can obtain these same basic results. Here I present a model with intermediaries and perfect competition because it is relatively simple and may be familiar from the international trade literature. In Appendix 1.14, I describe a potentially more realistic bargaining model, in which water districts directly engage in simultaneous bilateral negoti- ations. As individual transaction volumes become small relative to total quantities consumed, prices given by the Nash bargaining solutions (within a Nash equilibrium across negotiations) converge to the result from the perfectly competitive model shown here. Consider N water districts indexed by n (or by o for origin and d for destination). Each district

has initial endowment En of a single homogeneous good (water) that is allocated efficiently among a continuum of consumers. Consumers’ preferences can be aggregated such that each district has an

inverse demand function Vn(Qn), which gives marginal valuations as a function of quantity consumed

28 Qn. Inverse demand is decreasing and twice differentiable. Trade across districts is conducted by two layers of intermediaries. In each district, selling inter- mediaries (“sellers”) can buy units of water from consumers (at their marginal valuations Vo(Qo)) and sell to buying intermediaries (“buyers”) in another district. Buyers, in turn, buy water from sellers

and can sell to consumers in their own district (at their marginal valuations Vd(Qd)). Sellers and buy-

ers meet at exchange points unique to each pair of districts, where prices Pod are determined. Each transaction i generates transaction costs for both sellers and buyers:

Assumption 1. Per-unit transaction costs. To complete a transaction of qiod units of water, sellers and buyers each incur constant marginal transaction costs. They are non-negative and may be specific to each origin-destination district pair:

s s (Sellers) Cod(qiod) ≡ τodqiod (1.1)

b b (Buyers) Cod(qiod) ≡ τodqiod.

These transaction costs capture not only physical transportation costs and legal contracting costs but also factors such as hassle costs, regulatory costs, and political pressures – anything that intro- duces a wedge between the value of the good for a consumer’s own use and its value in a transaction. Note that this assumption rules out fixed costs of trading, which simplifies the model. This approx- imation still reflects many aspects of California’s water market: some percentage of water is lost in conveyance, larger transactions receive greater regulatory scrutiny, owners of canals and pipelines often charge per-unit fees (“wheeling” charges), and larger transactions may generate more political backlash. I make one more key simplifying assumption:

Assumption 2. Perfect competition. Each district has enough intermediaries that they behave as price takers. That is, the quantity sold or purchased by any one intermediary does not affect the equilibrium price for any district pair: dPod/dqiod = 0 for all o and d.

In Appendix 1.13, I relax this assumption and conduct my empirical analysis allowing for markups or markdowns by either buyers or sellers. There, I find little evidence that market power explains the observable transaction costs that I estimate below.

29 Sellers and buyers choose non-negative quantities for each origin and destination pair od to max- imize net profits:

s s s s s (Sellers) max s P q −V (Q )q − q s.t. q ≥ 0 (1.2) qiod od iod o o iod τod iod iod b b b b b (Buyers) max b Vd(Qd)q − Podq − τ q s.t. q ≥ 0 qiod iod iod od iod iod

s Each problem has two candidate solutions. First, sellers and buyers may not trade at all (qiod = 0, b qiod = 0). If sellers’ transaction costs are too large, or marginal valuations in the origin district are too s high (Vo(Qo)+τod > Pod), sellers will not trade. If buyers’ transaction costs are too large, or marginal b valuations in the destination district are too low (Pod > Vd(Qd) − τod), buyers will not trade. In order for trading quantities to be positive, there must be non-negative marginal surplus between the seller b s and buyer: Vd(Qd) − τod ≥ Vo(Qo) + τod. If buyers and sellers do trade, taking first-order conditions yields my main result:

s (Sellers) Pod = Vo(Qo) + τod (1.3)

b (Buyers) Pod = Vd(Qd) − τod.

That is, if two districts trade at all, the price between each pair of districts equalizes sellers’ marginal willingness to accept with buyers’ marginal willingness to pay. Relative to marginal valuations, the negotiated price gives a premium to the seller, and a discount to the buyer, that is exactly large enough to compensate for the transaction costs that each incurs. These are the key conditions I use in my empirical analysis. As I will show, I directly observe prices, but not marginal valuations or transaction costs. However, because marginal valuations vary only by district, while transaction costs vary by district pair, I can separately identify them empirically.

Trading quantities traded are defined implicitly by combining the inverse demand curves Vo(Qo)

and Vd(Qd) with the market clearing condition: the sum of quantities sold by sellers must equal the sum of quantities bought by buyers. Intuitively, in partial equilibrium (holding constant transactions in all other pairs of districts), sellers and buyers keep increasing quantities until the marginal surplus is arbitraged away and the first-order conditions are met. In general equilibrium, quantities adjust so that each district’s marginal valuation is equalized across all of the other districts it sells to or buys from.

30 Consumer surplus for each district is calculated by comparing demand curves with transaction expenditures or revenues. For a district that buys water, its consumers obtain consumer surplus equal to the difference between marginal willingness to pay and transaction expenditures, summed over all quantities purchased in excess of the endowment. For a district that sells water, its consumers obtain consumer surplus equal to the difference between transaction revenues and willingness to accept, summed over all quantities sold from the endowment.2 Expressing these mathematically, consumer surplus at quantity Qn is defined as:

Z Qd b (Buying District) CSd(Qd) ≡ Vd(ϕ)dϕ − ∑Podqod − ∑τodqod (1.4) Ed o o Z Qo s (Selling District) CSo(Qo) ≡ Vo(ϕ)dϕ + ∑Podqod − ∑τodqod Eo d d where qod ≡ ∑i qiod is the sum of quantities traded by all intermediaries for that district. Because b at equilibrium a district’s marginal values are equalized across transactions (Vd(Qd) = Pod + τod and s Vo(Qo) = Pod − τod, from Equation 1.3), consumer surplus can also be written as:

Z Qn   CSn(Qn) = Vn(ϕ) −Vn(Qn) dϕ. (1.5) En

1.3.3 From theory to estimation

Using this theoretical framework, I now derive my empirical procedure. My goals are to (1) find the competitive equilibrium without transaction costs, and (2) calculate the resulting change in consumer surplus. With knowledge of the demand functions Vn(Qn) and initial endowments En, the compet- itive equilibrium (in a counterfactual efficient market) can be found by solving the social planner’s problem. That is: choose Qn for all districts n to maximize the sum of all districts’ consumer sur- plus, subject to the resource constraint that the sum of quantities must be equal to the sum of initial endowments. From there it is straightforward to calculate consumer surplus, using Equation 1.5. The main empirical task is to estimate demand curves, since initial endowments can be read di- rectly from the data. Demand curves are comprised of marginal valuations at different values of quantity. Given equilibrium marginal valuations and quantities in multiple time periods, I can esti-

2This is identical to buyer’s consumer surplus, for negative quantities relative to the endowment. On a graph, the seller’s consumer surplus appears similar to producer surplus, but it is determined from a demand curve, not a supply curve – sellers in an exchange economy do not produce anything.

31 mate price elasticities and reconstruct these demand functions. To estimate marginal valuations in each time period, I rely on the result that prices equalize up to transaction costs, and add two more assumptions. At this point I introduce time subscripts t, allowing all variables and parameters to vary arbitrarily across time periods and making further restrictions when needed. (In my empirical appli- cation I define a time period as one year; this is a reasonable approximation since almost two-thirds of observed transactions are arranged in the same three-month period of May through July.)

Obtaining marginal valuations by estimating transaction costs from observable factors

The first task is to estimate marginal valuations for all districts in each period, Vnt (Qnt ). Ideally, marginal valuations would be observed directly in the data – for example, from prices on intra-district water markets. Unfortunately, these internal prices are difficult to obtain, and most water districts appear to not keep systematic records. Most observations in my dataset are cross-district transactions. Instead, I can estimate marginal valuations by comparing prices across a district’s transactions in each period. Combining the result that prices equal marginal willingness to pay (Equation 1.3) with the fact that transaction costs are non-negative (Assumption 1), it follows that Podt ≤ Vdt for all buying districts d: all prices paid by a buying district are less than its marginal valuation. Therefore, I can

lower-bound a buying district’s marginal valuation with the highest price it paid: Vdt ≥ maxo{Podt }. By a parallel argument, all prices accepted by a selling district are higher than its marginal valuation

(Podt ≥ Vot for all selling districts o) so I can upper-bound a selling district’s marginal valuation with

the lowest price it accepted: Vot ≤ mind{Podt }. However, I can improve on these bounds by leveraging information across transactions. Although s b the overall levels of transaction costs τodt and τodt are unobserved, portions of these costs may be due to observable cost factors, such as conveyance distance or specific regulatory reviews. Many of these observable factors vary across transactions, generating within-district variation that I can use to estimate factor-specific transaction costs. Then, I can use these estimates to adjust raw prices and obtain tighter bounds on marginal valuations. For example, suppose I observe Redding selling both to Sacramento at $50 and to Los Angeles at $70, where the sale to Los Angeles undergoes a regulatory review that the sale to Sacramento does not. I bound Redding’s marginal valuation as no more than $50 and interpret the price difference – $20 – as the cost of this review nominally incident on Redding. Then, if I also observe Fairfield selling to Los Angeles for $80, and I know this transaction undergoes the same regulatory review as

32 the Redding–Los Angeles sale, I can leverage this information. I know that this regulatory review costs sellers $20, so I can improve my upper bound of Fairfield’s marginal valuation from $80 to $60. To formalize this intuition, I first make a further assumption on the functional form of transaction costs.

Assumption 3. Factor-specific transaction costs are additively linear and constant across trans- s s s b b actions. Transaction costs can be decomposed as τodt = τ Bod +τ˜odt (for sellers) and τodt = τ Bod + b s b τ˜odt (for buyers), where Bod is a vector of observable transaction cost factors and τ and τ are vectors of coefficients, and all parameter and cost factor values are non-negative.

This carries three substantive restrictions: first, transaction costs arising from specific cost factors

Bod are constant across consumers with the same values of Bod; second, they are constant across time; and third, cost factors do not interact with each other – when a transaction is subject to multiple cost factors, the total costs equal the sum of their parts. In principle I could relax either the first or the second restriction, but not both; for τs and τb to be identified, they must be constant in at least one dimension. The third restriction in principle could be relaxed nonparametrically. Next, recall that Equation 1.3 relates observed prices to transaction costs and marginal valuations. Both of the latter are unobserved in levels, but some components of transaction costs vary observably across transactions while marginal valuations do not. Combining that result with Assumption 3 yields

s s (Sellers) Podt = αot + τ Bod + εodt (1.6)

b b (Buyers) Podt = αdt + τ Bod + εodt ,

s b where the α’s collect terms fixed within consumer and year (αot ≡ Vot +Ed[τ˜odt ]; αdt ≡ Vdt +Eo[τ˜odt ]) s s s b b b and the ε’s collect miscellaneous terms (εodt ≡ τ˜odt − Ed[τ˜odt ]; εodt ≡ τ˜odt − Eo[τ˜odt ]) that I will treat as econometric errors. Consistent estimators of τs and τb, which I describe in Section 1.5, then will allow me to more strongly bound marginal valuations. Before considering transaction costs, my best upper bound for

sellers was the lowest price: Vot ≤ Podt ∀d. Now, I can improve on this by adjusting these raw prices

33 with estimated transaction costs:

s (Sellers) Vot ≤ Podt − τ Bod ∀d (1.7)

b (Buyers) Vdt ≥ Podt + τ Bod ∀o,

which I obtain from Equation 1.3 and the assumption that all components of transaction costs are weakly positive (Assumption 3). Note that τs and τb are constant effects under Assumption 3, so these bounds hold exactly, not in expectation. Finally, I assume that these bounds are “good enough,” which allows me to interpret both marginal valuations and gains from trade as point estimates rather than bounds:

Assumption 4. Unobserved transaction costs are zero for at least one transaction. Unobserved s b transaction costs τ˜odt and τ˜odt are zero for the transaction in each district and period with the lowest (for selling districts) or highest (for buying districts) adjusted price.

This assumption implies that Equation 1.7 holds with equality for at least one transaction per district and period:

s (Sellers) Vot = min{Podt − τ Bod} (1.8) d b (Buyers) Vdt = max{Podt + τ Bod}. o

If this assumption is not true, then I will understate the dispersion in marginal valuations. Because more dispersion means more mutually beneficial transactions, my estimates of the potential gains from trade would likely be a lower bound. However, this assumption may not be unreasonable, since s my empirical estimates of τ will be large enough to put some sellers’ marginal valuations Vot near zero. Note that this procedure accounts for both observed and unobserved transaction costs. Observed transaction costs are reflected by adjusting prices for estimated costs of observable cost factors. Even after adjusting prices, though, prices may vary across transactions due to differing unobservable cost factors. Taking the minimum (or maximum) across these adjusted prices accounts for (at least some) unobserved transaction costs.

34 Estimating demand elasticities

My next task is to construct demand curves from these marginal valuations; specifically, inverse de- mand functions Vn(Qn) that describe how marginal valuations vary with quantities. The previous

procedure yields marginal valuations in each period Vnt , which I can use along with temporal varia-

tion in quantities Qnt to estimate the function Vnt = fn(Qnt ). This can be done using any consistent estimator. In practice, the classic endogeneity problem posed by the joint determination of prices and quanti- ties makes it difficult to find such a consistent estimator. To overcome this, I instrument for quantities using yearly water entitlements. Entitlements are driven by the interaction of weather fluctuations with historically-determined allocation rules, making their year-to-year variation likely exogenous.

The disadvantage of this quasi-experimental approach is that it cannot approximate fn(Qnt ) very flex- ibly. I will assume that demand is isoelastic in the range between observed and optimal quantities, with constant elasticities per sector.

Finding the efficient allocation and calculating consumer surplus

Finally, upon obtaining demand curves Vn(Qn), I can find the efficient allocation by solving the social planner’s problem. This is a constrained optimization exercise in which I choose Qn for all districts n to maximize total consumer surplus, subject to the resource constraint that the sum of quantities must be equal to the sum of initial endowments. The total gain in consumer surplus is equal to the potential gains from an efficient water market without transaction costs (neither observed nor unobserved). If Assumption 4 does not hold and instead there are unobserved transaction costs that affect all transactions, my estimate of this solution will be a lower bound of the true potential gains. This is because my estimates of Vn will be biased, understating the true dispersion in marginal valuations. To see this graphically in Figure 1-3, consider the efficient equilibrium between buying district d and selling district o calculated using their true demand curves, labeled with price P0. If I omit some unobserved transaction costs, I may underestimate marginal valuations for buyers and overestimate them for sellers, leading me to obtain the demand curves shown in dashed lines. Using these demand curves to calculate the efficient allocation would result in the equilibrium labeled with price Pod. The resulting consumer surplus is the triangle to the left of the dashed demand curves. Relative to this, the true equilibrium at P0 reallocates a greater quantity of water, creating a larger area of consumer

35 surplus. I also note two implications of using consumer surplus as my welfare measure. First, it relies on the Pareto welfare criterion, which treats any voluntary transaction as inherently good and rules out any notion of merit goods – in which society might value certain parties holding possession of water more than the parties themselves do. Second, it does not account for any externalities associated with reallocation. Quantifying the externalities of water markets is beyond the scope of this chapter, but a complete policy analysis should account for them.

Summary of empirical procedure

Summarizing the discussion in this section, I propose an empirical procedure to estimate the potential gains from trade in a thin exchange economy with transaction costs. This procedure has four steps:

• Step 1: Estimate transaction costs from observable cost factors. Find cost factors that are both observable and heterogeneous across transactions. Estimate the transaction costs associ- ated with these cost factors from variation in prices across transactions within consumer and year.

• Step 2: Recover equilibrium marginal valuations. First adjust prices by these observable transaction costs, applying the knowledge that consumers must be compensated for transaction costs in order to agree to the transaction. Then adjust for unobservable transaction costs by taking each buyer’s highest adjusted price paid, and each seller’s lowest adjusted price accepted, as final estimates of marginal valuations.

• Step 3: Estimate demand curves. Combine these equilibrium marginal valuations with data on quantities to measure the relationship between them. Use instrumental variables to overcome the problem of joint determination and obtain consistent estimates of demand parameters.

• Step 4: Simulate counterfactual allocations. Combine demand curves to find the quantity vector that maximizes total consumer surplus, and calculate the resulting consumer surplus.

A remaining concern is that some of the estimated price gaps may in fact represent markups, rather s s than true transaction costs. To see this, consider a decomposition Podt = Vot (Qot )+τodt + µodt , where s µodt > 0 is an ologopolistic markup; here, markups are empirically isomorphic to transaction costs. The assumptions of my model rule out the successful exercise of market power by either side of the

36 market, but if these assumptions are false, then my estimation procedure for marginal transaction costs would also pick up differences in markups that are correlated with the transaction cost factors. Such markups could arise if, for example, most sellers in a low-valuation region find it prohibitively expensive to sell to a high-valuation region, leaving buyers in the high-valuation region with access to only a small number of effective potential sellers. I explore this possibility in Appendix 1.13. There, I present an alternative model that allows both buyers and sellers to exploit market power, and I derive and perform an empirical procedure based on Atkin and Donaldson (2015) that adjusts raw prices for market power, using estimated pass-through rates as sufficient statistics. Then I repeat Step 1, again estimating marginal transaction costs, but using these adjusted prices. I find that estimates are noisier, but if anything, marginal transaction costs are larger. I conclude that the issue of market power is small relative to transaction costs, and so I proceed with the assumption of perfect competition.

1.4 New Data on California’s Water Economy

To conduct my empirical analysis, I compile new data on California’s water economy. I assemble for the first time the universe of yearly surface water entitlements in California. I also use a proprietary dataset on open-market water transactions in California that is significantly more complete than pub- licly available datasets. To link across datasets and complete the analysis, I build a large crosswalk file and a geospatial dataset on user locations and boundaries. For further analysis of agricultural outcomes (Appendix 1.12), I further incorporate two uniquely high-resolution datasets on agriculture in California: (1) a satellite-derived remote sensing product for land use, and (2) microdata from the Census of Agriculture for agricultural finances.

1.4.1 Water transactions

No government agency or other institution maintains a centralized listing of all water transactions in California. Instead, I use a proprietary dataset compiled by WestWater Research, LLC. To my knowl- edge, this is the most complete dataset of water transactions in California. One water transactions dataset has been assembled and made publicly available by Gary Libecap at the University of Cali- fornia, Santa Barbara (e.g., Brewer et al. 2008); however, it has only a fraction of the transactions in California as the WestWater dataset. Another dataset has been assembled by Ellen Hanak at the Pub-

37 lic Policy Institute of California, but it is not public. This dataset also appears to focus on transaction volume rather than prices (Hanak and Stryjewski 2012), whereas prices are mostly complete in the WestWater dataset. I focus on the spot market (within-year leases) as opposed to permanent transfer of rights. The WestWater dataset includes 4,906 spot market transactions, of which prices are available for 4,415. Table 1.1 shows summary statistics. Panel A shows that the distribution of volumes is highly dis- persed, most transactions are within region, transactions across regions tend to have larger volumes but similar prices, and the mean price is $145 per acre-foot, in 2010 dollars. Panel B shows that the Sacramento River hydrologic region is the greatest net exporter, the San Joaquin River and Tulare Lake regions are the greatest net importers, and the South Lahontan hydrologic region accounts for the majority of observations but a relatively small proportion of the total volume. Figure 1-4 shows the distribution of prices on a base-10 logarithmic scale; transactions are centered around $100-200 but have substantial mass in the tails. I identify the geographic location and sector (urban/municipal, agricultural, or environmental) of most buyers and sellers in the data, using several methods described in Appendix 1.16.

1.4.2 Water entitlements

I assemble the universe of surface water entitlements in California, by user, sector, and year, from 1980 through 2016. By entitlements I mean water allocated to institutions or individuals on the basis of property rights or fixed long-term contracts. There are four sources of entitlements: Central Valley Project (CVP) allocations, State Water Project (SWP) allocations, Lower Colorado Project entitlements, and surface water rights. This compilation takes advantage of recently available data on water use under surface water rights, made possible by a new law that required all surface water rights holders to report their water use starting in 2010. For all sources, I construct entitlements by multiplying a time-invariant baseline maximum en- titlement by a year-varying allocation percentage. This ensures that when using fixed effects, all identifying variation comes from changing allocation percentages (which are driven by weather and hydrologic conditions) rather than changing maximum entitlements (which may be endogenously se- lected), although in most cases, maximum entitlements are fixed over multiple decades. For the CVP, allocation percentages vary across both years and contract types; for the SWP, allocation percentages

38 are constant across users, varying only across year. For water rights and the Lower Colorado Project, these allocation percentages are always 100%; they do not contribute to the identifying variation. For project entitlements, maximum contract amounts and yearly percentage allocations come from archives of the California Department of Water Resources (DWR) and U.S. Bureau of Recla- mation (USBR). For surface water rights, I use self-reported diversions collected by the State Water Resources Control Board (SWRCB) rather than the face value of rights, which are often outdated for post-1914 appropriative rights and are not recorded for pre-1914 or riparian rights. It is reasonable to treat these reported diversions as the full, legally defensible value of present water rights, since they are public information and thus could potentially be used in future legal disputes. I also combine entitlements and transactions to calculate another useful variable: quantity con- sumed. Because I cannot directly measure water consumption, I construct quantity consumed by summing (1) entitlements, (2) quantity purchased minus quantity sold from transactions data, and (3) mean groundwater supply as estimated by DWR, all within market and year. (I discuss the definition of market below in Section 1.7.) Details of sources, cleaning, and processing of these variables are described in Appendix 1.16.

1.4.3 Crosswalk file and user locations file

To link users across datasets, I build a crosswalk file that accounts for variations and errors in names as well as mergers and name changes across time. This file has 28,764 entries (input names) pointing to 14,830 targets (output names). To identify the locations, boundaries, and areas of water users, I combine several publicly available shapefiles into a single geospatial dataset. Details of the sources and construction of these files are described in Appendix 1.16.

1.5 Step 1: Estimating Transaction Costs from Observable Factors

The first step in my empirical approach is to estimate marginal transaction costs from observable cost factors. To do this, I find cost factors that are both observable and heterogeneous across transactions, and then I estimate the transaction costs associated with these cost factors from variation in prices. The intuition is that in equilibrium, a consumer selling to two buyers is indifferent between them at the margin, so any difference between them can be interpreted as marginal transaction costs.

39 1.5.1 Selecting cost factors

Transaction costs may arise from a litany of cost factors, which may fall into physical, administrative, policy-induced, or political-economy categories. I start with the typology of transaction cost factors in Section 1.2.2, compiled from institutional knowledge. For some of these cost factors, there are populations of both sellers and buyers for whom the factor is incident on some trading partners but not others. Comparing prices across these trading partners, within these populations of sellers and buyers, can identify the marginal transaction costs. From this list, I select the cost factors that are both (a) observable, and (b) heterogeneous across trading partners. These are the cost factors whose marginal transaction costs can be econometrically identified. Cost factors that are common to all of a consumer’s transactions (such as the costs of writ- ing a contract, or disutility from market participation), are excluded because they are indistinguishable in price data from a level shift in marginal valuations. In Table 1.2, I list the cost factors that meet these criteria, along with their populations and identifying comparisons. They are distance and several types of regulatory reviews: crossing the Sacramento–San Joaquin Delta, and reviews by each of several state, federal, and local government agencies.

1.5.2 Econometric specification

I stack the selected cost factors into a vector Bod, depending on seller (origin) o and buyer (destination) d. Most of these cost factors are discrete, so I use binary indicator variables; distance enters linearly. I also include interactions of the discrete cost factors where they exist in the data, in order to allow transaction costs to vary by the presence of other cost factors.

Then, to identify marginal transaction costs nominally incident on sellers, I regress price Pjodt in transaction j in year t on this vector Bod and seller-by-year fixed effects. Similarly, to identify costs incident on buyers, I use buyer-by-year fixed effects. The resulting regressions precisely follow the identification result derived earlier (Equation 1.6 in Section 1.3.3):

s s (Sellers) Pjodt = αot + τ Bod + ε jodt (1.9)

b b (Buyers) Pjodt = αdt + τ Bod + ε jodt ,

40 The seller regression measures price gaps across transactions within seller and year. That is, it isolates cases in which a seller completed transactions with two different buyers in the same year, of s which one was subject to a particular cost factor and the other was not. Each coefficient τh (on the hth

cost factor indexed in Bod) then is a regression-weighted average of the price differences in all such cases. The intuition is that if the seller has to pay greater transaction costs to complete a transaction, the price received must be higher to compensate. s For this regression to produce an unbiased estimate of τh, I also need to assume selection on

observables: that unobserved determinants of prices are uncorrelated with the cost factors Bod. In the special case where all other components of transaction costs are equal except those associated with the

cost factors Bod, this condition follows immediately from my model. This is because in equilibrium, prices equalize marginal willingness to accept across trading partners, so prices are affected only by marginal valuations and transaction costs. Marginal valuations are absorbed by the seller-by-year fixed effects, as are transaction costs common to all transactions, leaving only variation in transaction costs that differ across transactions. In the more general case where transaction costs vary arbitrarily across transactions, the assump- tion of mean independence is stronger. It requires that, conditional on seller and year, the selected cost factors are uncorrelated with other unobserved cost factors (or, without the assumptions of the model, other determinants of price). In robustness checks, I partially test this assumption by exploring whether the coefficient estimates change when I include other covariates. b The buyer regressions have exactly mirrored conditions and interpretations: τh measures differ- ences in prices paid by a buyer to different sellers. I cluster standard errors in all regressions at the level of subregion-by-year3 to allow for local spatial correlation. Finally, in alternative specifications, I use coarser fixed effects. In these specifications, I also include binary indicator variables for the identifying population. These restrict the control group, ensuring that τs and τb are identified from only the comparisons shown in Table 1.2. With buyer- or seller-specific fixed effects, these indicators are redundant, since they vary only by buyer or seller.

3I use “subregion” to refer to planning subarea (PSA) as defined by the California Department of Water Resources. California has 46 PSAs.

41 1.5.3 Results

Aggregated cost factors First, I show that the major discrete cost factors create large marginal transaction costs. To do this, I create a summary variable that indicates whether a transaction is subject to any major regulatory review (cost factors 2-5 in Table 1.2; I exclude county reviews since these affect relatively few transactions) beyond the minimum number of reviews incurred in all of the district’s transactions. I then run the regressions in Equation 1.9, where the vector of cost factors

Bod includes distance, this summary variable, and a placebo test: an indicator for whether the transfer crosses sectors, from agricultural to urban. To my knowledge, there is nothing specific to cross-sector transfers that lead them to incur greater transaction costs than comparable within-sector transfers. Table 1.3, Panel A shows these first results. Columns (4) and (8) present my preferred specifi- cations, with seller-by-year and buyer-by-year fixed effects. They show that the major discrete cost factors are indeed costly: prices received by sellers are $89 per acre-foot higher for transactions sub- ject to one of these cost factors, relative to other transactions by the same seller in the same year. Similarly, prices paid by buyers are $162 per acre-foot lower. These estimates are quite large, con- sidering the mean price in the sample is $144 (conditional on any regulatory review, $233), and their 95% confidence intervals both exclude zero. Distance appears much less important: for buyers, trans- acting from a seller 100 km away is worth a discount of only $6.70; for sellers, the point estimate is even smaller in magnitude. Neither distance coefficient is statistically distinguishable from zero. Cross-sector transfers also are small and indistinguishable from zero, satisfying the placebo test. In the same panel, columns 1-3 (for sellers) and 5-7 (for buyers) show results of the same regressions with coarser fixed effects: region-by-year, subregion-by-year, or region-by-year with seller/buyer fixed effects. Compared with the preferred specification, magnitudes vary somewhat but the overall results are broadly similar, suggesting that they do not depend on the highly specific within-seller comparison. Column (9) computes the total marginal transaction costs by adding the sellers’ coefficient (i.e., the price premium) to the negative of the buyers’ coefficient (i.e., the price discount). A transaction subject to any major discrete cost factor incurs total marginal transaction costs of $250 per acre-foot.

Disaggregated cost factors In Panel B of Table 1.3, I disaggregate these cost factors. I estimate separate coefficients for each of the cost factors listed in Table 1.2, with the major regulatory reviews (SWRCB, DWR, and USBR) estimated separately by whether the transaction also crosses the Sacra-

42 mento–San Joaquin Delta. These disaggregated estimates are noisier, with large confidence intervals, but the magnitudes are large for the major discrete cost factors. Buyers consistently require large price discounts, and sellers require large price premiums, to choose transactions subject to these cost factors. Again, specifications with coarser fixed effects show broadly similar results. The buyer/seller balance of incidence appears to vary across cost factors. SWRCB review is costly for both sides, while Delta crossing is incident mostly on the seller, and DWR review is incident almost entirely on the buyer. Meanwhile, marginal transaction costs are small for both distance and county reviews of groundwater exports. Previous work (Hanak 2005; Regnacq et al. 2016) find that these county reviews reduce the volume of trade; it may be that they impose large fixed transaction costs, since my approach only detects marginal transaction costs.

Robustness checks Appendix Table 1.10 reports the results of several robustness checks. (For reference, columns 1 and 5 repeat the preferred specifications from Table 1.3.) First, the market may award discounts or premiums for larger transactions, contrary to my model assumptions, and this tendency could be correlated with the cost factors. Columns 2 and 6 report regressions that include the natural log of transaction volume. Second, transactions occur at different points throughout the year; it may be inappropriate to treat them as a single time period if marginal valuations vary as districts move up or down their demand curves. Columns 3 and 7 report regressions that include district-by-year-by-month fixed effects, comparing prices within each month. Third, a large portion of my dataset consists of low-volume transactions of adjudicated groundwater rights within Southern California; if my constant effects assumption is false, estimates from this data may be misleading for a broader context. Columns 4 and 8 report regressions on a subset of the data that excludes these transactions, leaving mostly surface water transactions. None of these modifications substantially change the results. Magnitudes move slightly but gen- erally stay within the range of uncertainty. Signs and orders of magnitude are unaffected.

Aggregate gains from reducing regulatory costs Finally, I apply these estimates to measure the gains available from marginal reductions in the transaction costs associated with regulatory reviews. This is the marginal effect on aggregate consumer surplus of a $1 change in marginal transaction costs; in Figure 1-3 it is represented by the width of the rectangular area of consumer surplus. To calculate it, I simply sum the quantities of observed transactions subject to each cost factor. This

43 estimate captures only gains from reducing costs of existing transactions; it cannot capture gains resulting from new transactions that occur as a result of the reduction in transaction costs – likely a large factor. Table 1.4 shows the results. Simultaneously reducing the costs of all major regulatory reviews by 10% would result in benefits of $14.9 million per year, with a confidence interval of ($9.6, $20.2) million. The total cost in the existing market – the area of the consumer surplus rectangle – is ten times this figure, or $149 million per year. Again, these figures do not include benefits from new trades that might take place. They also exclude the benefits of these regulations, since my focus here is on their costs.

1.5.4 Discussion

This exercise estimates the price effects of several salient cost factors that are both observable and heterogeneous across trading partners. This is a limited analysis of transaction costs for several reasons. One, this exercise misses cost factors that are not econometrically identified in price data; many other types of transaction costs may be important as well, but my approach cannot measure them. Two, my specification only measures marginal transaction costs, capturing transaction costs that scale with the size of the transaction. If there are also fixed costs, either they are not picked up at all, or they may partially load onto the estimate of marginal transaction costs (i.e., as average transaction costs). I emphasize that the estimated transaction costs are not necessarily limited to direct costs of reg- ulatory reviews themselves. This analysis may capture other kinds of transaction costs – including physical, administrative, or political economy costs – that are collinear with the regulatory reviews. These estimates represent the marginal cost of selling or buying across a given institutional or ge- ographic boundary, inclusive of all causes. In addition, despite my focus on costs, these regulatory reviews may bring important social benefits by preventing environmental externalities. However, many stakeholders believe these benefits can be achieved through less costly review processes. I next use these estimated transaction costs to obtain districts’ marginal valuations. This relies on the notion that a seller will not accept a price unless it covers both the transaction costs and the seller’s marginal valuation. For this exercise, I need to interpret my estimates as constant effects – costs incurred equally by everyone, as in Assumption 3 – not as average treatment effects. If the

44 marginal transaction costs associated with my selected cost factors instead differ across transactions, my results may be biased in ambiguous ways.

1.6 Step 2: Recovering Equilibrium Marginal Valuations of Water

Next, I combine these estimated transaction costs with observed prices to estimate marginal valuations of water in each district and year. I use a simple revealed-preference condition: a buying district’s marginal valuation must be as high as its highest price paid, while a selling district’s marginal val- uation must be as low as its lowest price paid, after adjusting for the transaction costs estimated in Step 1. This is motivated by the structural knowledge that transaction costs insert a wedge between observed prices and marginal valuations: to compensate for higher transaction costs, a buyer requires a discount, while a seller requires a premium. Adjusting prices for estimated transaction costs cor- rects for observed transaction costs, while taking the minimum (for sellers) or maximum (for buyers) across transactions within a year corrects for some unobserved transaction costs. To do so, I adjust observed prices for estimated transaction costs and then take the minimum (for sellers) or maximum (for buyers) of these adjusted prices within each district-year cell, as in equation 1.8:

 s (Sellers) Vˆot = min Pjodt − τˆ Bod (1.10) j  ˆb (Buyers) Vˆdt = max Pjodt + τ Bod j

More specifically, I first classify districts as either net buyers or net sellers in each year on the basis of transaction volume (the vast majority only buy or sell, not both). Next, to calculate transaction costs s b τ Bod and τ Bod, I use the disaggregated regressions from Step 1. From these regressions, I obtain

fitted values from the cost factors Bod only, not the fixed effects or controls. Then, I obtain adjusted prices by adding these transaction costs to the raw prices, imposing free disposal by censoring at zero when occasionally necessary. Finally, I obtain marginal valuations for each seller by taking the minimum across these adjusted prices within each year, and marginal valuations for each buyer by taking the maximum. Figure 1-5 plots kernel densities of these marginal valuations separately for buyers and sellers, along with a histogram of the raw prices. Although these distributions are still quite disperse, this

45 exercise reveals considerable separation in the marginal valuations of buyers and sellers. Figure 1-6 displays these marginal valuations by sector over time, taking the volume-weighted mean across consumers within each sector and year, ignoring whether they are buyers or sellers. The dashed lines show that average transaction prices have been largely similar for agricultural and urban consumers. However, in the presence of transaction costs, prices do not directly reflect marginal valuations. The solid lines, in contrast, reveal that marginal valuations are consistently $100-150 per acre-foot higher for urban consumers than for agricultural consumers. This is consistent with both conventional wisdom in California and the prior literature. Note this is a result, not an assumption; nowhere was it built into the model. Figure 1-7 maps the geographic distribution of marginal valuations for a typical year. These show unit-level marginal valuations, which will be direct inputs to Step 4. Units are categories defined by geography, institutions, and sector.4 I form unit-level marginal valuations by partialing out surface water entitlements (in natural logs of unit-level aggregates) in separate regressions for each unit, and computing the fitted value at the mean value of entitlements. To generate the map, I average over all units in a planning area (i.e., up to six), weighting by quantity consumed. The results show that marginal valuations are low in the Sacramento Valley, higher in the San Francisco Bay area and the southern San Joaquin Valley, and very high in urban Southern California. These figures reveal dispersion in marginal valuations across buyers and sellers, across sectors, and across geography. This dispersion implies that substantial gains are available from reducing trans- action costs, both observable and unobservable. That is, in a competitive equilibrium with smaller transaction costs, urban consumers would purchase more water from agricultural consumers, south- ern consumers would purchase more water from northern consumers, and both sides would enjoy welfare gains. However, to quantify these gains, I need to understand how marginal valuations vary with quantity consumed. This is the subject of the next section.

1.7 Step 3: Estimating Demand Elasticities

The third step in my empirical approach is to estimate demand elasticities. Having estimated marginal valuations in each district and year, I can characterize the marginal inefficiencies arising from trans-

4That is, interactions of planning areas (hydro-geographical areas defined by the California Department of Water Re- sources), projects (State Water Project [SWP] contactors, Central Valley Project [CVP] contractors that are not also SWP contractors, and users who are neither SWP nor CVP contractors), and urban versus agricultural sectors.

46 action costs – but I need demand elasticities in order to describe how steeply marginal valuations change as districts’ quantities evolve away from the observed equilibrium. Given enough data and enough exogenous variation, demand could be estimated nonparametri- cally and separately for each district. In practice, both data and exogenous variation are limited. In order to estimate demand quasi-experimentally, I limit attention to two parameters: price elasticities of demand in the urban and agricultural sectors. In other words, I assume that demand is isoelastic – at least in the range between observed and optimal quantities – with constant elasticities within sectors. The assumption of isoelastic demand can be viewed as a local approximation to the true demand curve. This makes my empirical exercise especially well-suited for analyzing relatively small changes in trading quantity. It allows my counterfactuals to extrapolate in a simple but transparent way, re- lying on past observed trading behavior rather than specifying the full objective functions of water consumers.

1.7.1 Empirical strategy

Estimating demand presents a classic endogeneity problem: quantities and prices are jointly deter- mined by the market interaction of all consumers’ demand curves. I need a supply shifter: an instru- ment that provides exogenous variation in either prices or quantities. My proposed instrument is the natural log of annual water entitlements, as constructed in my dataset detailed in Section 1.4. Entitlements are the full maximum values of a district’s rights or contracts multiplied by the year’s percentage allocation for that contract type. The variation I use comes from districts holding contracts with the Central Valley Project and State Water Project, for which these percentage allocations are nonlinear functions of the contract type and reservoir levels – which in turn are primarily determined by precipitation during the previous winter. (For the Lower Colorado Project and appropriative and riparian rights, percentage allocations are always 100%.) This rich variation allows me to simultaneously compare within districts over time and within years across districts, while remaining confident that the variation is driven by natural phenomena rather than demand-side factors. I estimate inverse elasticities of aggregate demand for sets of districts I call markets, using market- level aggregate entitlements to instrument for market-level aggregate quantity demanded. Because

47 transaction costs create market inertia, entitlements are persistent – they are highly correlated with quantities demanded. The thought experiment is this: when cutbacks in aggregate endowments lead to lower aggregate quantities demanded, how does this affect the equilibrium prices in that market? There are three reasons I estimate inverse demand at an aggregate level instead of an individual district level. First, it allows for much more accurate matches between entitlements and transaction prices.5 Second, it sidesteps the issue of selection into the market. Individual district-level elasticities may be biased because I only observe prices for years in which a district does choose to trade. Instead of proceeding with many missing values, or modeling both the intensive and extensive margin of trading for each district, I estimate aggregate elasticities that subsume both margins. Third, my instrument varies only at the level of entitlement contract type, not district. Any ex- ogenous variation specific to an individual district is collinear with variation for other districts of the same contract type, violating the exclusion restriction.6 Therefore, I define 26 markets that are supersets of contract types, making it possible to condition on entitlements of other markets while leaving variation in a market’s own entitlements. These markets span all water districts (and all other consumers) in California, dividing them by geography, sector, and entitlement type.7 Figure 1-8 plots aggregate entitlements at the level of these markets.

Main specification

Summarizing the discussion above, I estimate inverse elasticities of aggregate demand in a differences- in-differences, instrumental variables setup. I perform two-stage least squares, regressing prices for transaction j by district n in market m and sector s in year t on log market-level aggregate quantity

5Often a wholesale district receives entitlements but a geographically-overlapping retail district buys or sells them. It is difficult to track the full extent of the linkages between wholesale and retail water districts, which typically have long-term purchasing arrangements and may not be fully independent actors. Aggregating to market simplifies such situations, since it treats overlapping districts together as one market. 6To see this more concretely, first consider that estimating individual demand would require an exclusion restriction that market-level supply not affect individual quantity demanded except through price. This exclusion restriction is violated because market-level supply is collinear (in logs) with individual-level supply, which can directly affect individual-level quantity demanded in the presence of transaction costs. Second, consider that estimating inverse individual demand would require that market-level supply affect price only through individual quantity demanded. Again the exclusion restriction is violated, because market-level supply also affects price through others’ quantity demanded. 7Market definitions are urban and agricultural sectors interacted with the following 13 classifications: (1) SWP: Con- sumers holding contracts with the State Water Project (SWP). (2) CVP North: Consumers north of the Sacramento-San Joaquin Delta holding contracts with the federal Central Valley Project (CVP) and not the SWP. (3) CVP East: Consumers south of the Delta and east of the San Joaquin River who hold contracts with the CVP and not the SWP. (4) CVP South: Consumers south of the Delta and west of the San Joaquin River who hold contracts with the CVP and not the SWP. (5-13) Consumers in each of nine Hydrological Regions designated by the California Department of Water Resources who do not hold contracts with either the SWP or CVP.

48 with market and year fixed effects:

lnPjnmst = ηs lnQmst + αms + θt + ν jnmst , (1.11)

instrumenting lnQmst with lnEmst , the log of annual market-level aggregate entitlements. I estimate the elasticity separately for each sector: agriculture and urban. I cluster standard errors by market- by-year, the level of variation in both my instrument and endogenous variable. (Because there is no transaction- or district-level variation on the right-hand side, this regression is equivalent to a regression of market-level mean prices.) I also include as controls the cost factors Bod interacted with direction (buying or selling). The intuitive version of my identification assumption is that, conditional on a market’s typical en- titlement (subsumed in market fixed effect) and statewide water availability (subsumed in year fixed effect), year-to-year entitlements affect prices only through quantities demanded. More specifically, there are four identifying assumptions in this setup: instrument relevance, instrument exclusion, par- allel trends in quantities, and parallel trends in prices (Hudson et al. 2015). I discuss each of these in turn. Then, to address concerns raised by these assumptions, I propose an alternative econometric specification.

Relevance First, entitlements must be correlated with quantities demanded. In a completely efficient market, they would be uncorrelated (conditional on statewide aggregate entitlement), as con- sumers reach a Coasian solution independent of their entitlement. However, under transaction costs, the equilibrium will depend on these initial entitlements, so this condition will be met. Empirically, instrument relevance is testable. As I show below, the first stage is strong.

Exclusion Next I assume that a market’s aggregate entitlement affects prices only through that market’s aggregate quantity demanded. Intuitively, entitlements are a supply shock; they should affect prices only through movements along the demand curve, not shifts in the demand curve. This can be derived from the theoretical model. The set of first-order conditions described by Equation 1.3 imply that sellers’ quantities can be described by the implicit function Qot = Qot (Eot ,Ed=1,t ,...,Ed=D,t ). Thus, conditional on the entitlements of potential buyers, quantities depend only on entitlement. In other words, entitlements enter the model only through the quantity argument of inverse demand. To translate this derivation to the regression, I assume entitlements to others affect prices log-linearly,

49 such that a year fixed effect is sufficient to control for them.

Parallel trends Finally I assume parallel counterfactual trends in both quantities and prices on a log scale: Absent cutbacks, markets’ quantities and prices would vary multiplicatively (i.e., proportionally) across years. This assumption allows for unobserved year-specific statewide level shifts, relaxing identification based on pure time-series variation. However, interactions across markets present a potential issue for my differences-in-differences approach. If each market is in autarky, they will have stable treatment values: prices and quantities for each market will not be affected by entitlements in other markets. But if markets are connected by a trading network, there may be spillovers: increased entitlements to one market will result in more sales to other markets. Specifically, if one market’s entitlement increases, other markets’ quantities demanded will be larger, and prices will be lower. If I compare these markets within year, quantity and price differences will be smaller than without spillovers, making both first-stage and reduced- form underestimates of the true response. In the IV, this bias is unsigned; it depends on the relative size of the quantity and price biases. Cross-market trades make up only 23% of all trades, constituting some evidence that this problem may be minor. However, the possibility remains that the benefits of year fixed effects (i.e., absorbing common shocks) are outweighed by the costs (i.e,. the bias from cross-market interactions). There- fore, I next propose a secondary identification strategy that reduces this potential bias.

Secondary specification

To partly address the risk of bias from spillovers across markets, I also estimate demand elasticities in a time series–instrumental variables approach. Instead of year fixed effects, this secondary specifi- cation conditions on (1) a market-specific time trend and (2) the natural log of an average E¯−m of the entitlements of other markets, weighted in proportion to total volume traded with that market over the timespan of the data. The regression is:

lnPjnmst = ηs lnQmst + αms + θmst + θ lnE¯−mt + ν jnmst , (1.12)

still instrumenting lnQmst with lnEmst , the log of annual market-level aggregate entitlements. This approach reduces reliance on the within-year comparison performed by the differences-in-differences

50 setup. It no longer absorbs unobserved statewide shocks to prices and quantities, but it still controls linearly for entitlements of other markets – likely the most important shock to condition on. It also allows for unobserved linear trends in both prices and quantities in each market.

1.7.2 Results

Table 1.5 reports the results of these inverse demand regressions. I start with the first stage; Panel A shows results of regressions of market-level quantity consumed (the endogenous variable) on entitle- ments (the instrument), both in natural logs of the market-level totals. Each specification appears in two columns, first for the agricultural sector and then for the urban sector. Columns 1 and 4 show that the instrument is relevant and strong, with a precisely estimated elasticity close to 1 and a first-stage F-statistic much larger than the typical rules of thumb. Columns 2-3 and 5-6 show that this result holds when including either subregion or district fixed effects. The main results of this section appear in Panel B, columns 1 and 4. These are inverse demand elasticities estimated using the main specification. The instrumented effect of log quantities on log prices is -0.72 for the agricultural sector and -1.35 for the urban sector. These are directly inter- pretable as elasticities, implying that a 10% change in quantity results in a 7% change in price for agricultural districts and a 14% change for urban districts. Both estimates have 95% confidence in- tervals that exclude zero. Because these are elasticities of inverse demand, larger magnitudes indicate that demand slopes down faster and is less elastic. This result then implies that urban districts have less elastic demand for water transactions on the statewide market than agricultural districts, although the estimates are not precise enough to reject their equality with a t-test. This is consistent with con- ventional wisdom in California: farmers will fallow land more readily than households will cut back on lawn-watering. Columns 2-3 and 5-6 of Panel B add subregion or district fixed effects, which are only intended to potentially reduce noise – they do not matter for my identification strategy. Relative to the main specification, these coefficients are very similar for agriculture; they decrease for the urban sector but are not statistically different. Panel C shows the results of my secondary specification, in which I forgo year fixed effects in favor of market-specific time trends and controlling for a weighted average of other-market entitle- ments. The estimates are quite similar to the main specification. Panel D shows the main specifica-

51 tion but restricting to within-market transactions only, with the intention of reducing spillover effects across markets. These estimates are noisier, especially so for the urban sector. Another natural approach would be to place estimated marginal valuations on the left-hand side instead of prices. However, results of that specification are likely to be less precise, since marginal valuations are estimated with their own uncertainty. Estimates from this specification are shown in Panels C and D of Appendix Table 1.11. Point estimates for the urban sector are somewhat volatile, while those for agriculture are attenuated. For reference, the corresponding reduced form and ordinary least squares regressions appear in Panels A and B of Appendix Table 1.11. Estimates in both panels are broadly similar to the two-stage least squares estimates, reflecting the high persistence of initial entitlements.

1.8 Step 4: Simulating the Gains from Trade

The last step in my empirical approach is to simulate trading outcomes in counterfactual scenarios in which transaction costs are reduced, and to calculate the economic gains from increased trade on California’s statewide water market. So far, I have estimated marginal transaction costs associated with specific cost factors, used these estimates to infer bounds on consumers’ marginal valuations, and estimated price elasticities of inverse demand. Gaps in these marginal valuations characterize the marginal inefficiencies arising from transaction costs, while the elasticities describe how marginal valuations evolve away from the observed baseline. Now, I combine these to find (a) the efficient allocation in a market without transaction costs, and (b) calculate the resulting gains in consumer surplus. To do so, I solve the social planner’s problem in a constrained optimization problem. I find the vector of bilateral transaction quantities that maximizes aggregate consumer surplus, subject to the resource constraint that nobody can sell more than the water they start with. (This is equivalent to re- quiring that the aggregate sum of final quantities must be equal to the aggregate sum of endowments.) Because an ideal market could implement the efficient allocation, the increase in consumer surplus from the social planner’s problem represents the potential gains from trade. To reflect that some transaction costs are fundamentally physical costs rather than manipulable policy-induced or administrative transaction costs, I also subtract these pair- and direction-specific physical costs (as estimated in Step 1) from consumer surplus in the objective function.

52 1.8.1 Setting up simulations

For counterfactual simulations, I work at the level of 94 units defined by exhaustive combinations of geography, institutions, and sector, and treat them as representative consumers.8 This aggregation is for three reasons that are similar to the reasons for estimating demand at the aggregate level. First, units span all possible water districts and consumers in California, sidestepping the problem of miss- ing marginal valuation estimates for districts that do not appear in my dataset. Second, it allows for a more accurate match between entitlements and marginal valuations (as discussed in Step 3). Third, the demand elasticities I estimated in Step 3 are aggregate rather than individual, so applying them to other aggregate units requires weaker assumptions about demand aggregation. I use unit-level marginal valuations constructed in Step 2 and shown in Figure 1-7. In most scenarios, I calculate consumer surplus relative to baseline quantities rather than initial 0 endowments Ek. I define this baseline quantity Qk as mean water consumption over 1993-2015, which includes past market transactions, for each unit k. Using this baseline implicitly assumes that water is efficiently allocated within the unit – that it is goes to the districts and other users with the greatest marginal valuations. If this is not true, as seems likely, then there are further gains possible (Davis and Kilian 2011). In the market scenarios that I compare to this baseline, I do assume that water is efficiently allocated among districts and other users within the unit (although not necessarily among retail consumers within these districts). Simulations consist of solving a constrained optimization problem. In these, I find the transaction quantities qkl (non-negative amounts delivered from seller k to buyer l) for all combinations of k and l that maximize aggregate consumer surplus, subject to constraints. Consumer surplus is the area under the demand curve from initial endowment Ek to post-trading quantity Qk, minus physical transaction 9 costs cklqkl, where ckl is the physical marginal cost of delivering water from k to l. (Prices are not included because they are merely transfers between units k, netting out in the sum.) In some scenarios I include a trading constraint, in which each unit may trade no more than Q¯k. Together, the full optimization problem is:

8Again, units are defined by full interactions of planning areas, water project membership, and urban versus agricultural sector. Units are subsets of the market definitions from Step 3. h 9To derive the maximand, I use Equations 1.4 and 1.5: CS (Q ) −CS (Q0) = R Qk V (ϕ)dϕ + P q − ∑k k k k k ∑k Ek k ∑l kl kl 0 i h i  R Qk  0   R Qk 0 0 ∑l Plkqlk − ∑l cklqkl − E Vk(ϕ) −Vk(Qk) dϕ = ∑k 0 Vk(ϕ)dϕ + (Qk − Ek)Vk − ∑l cklqkl . k Qk

53 Z Qk h 0 0 i max Vk(ϕ)dϕ + (Qk − Ek)Vk − cklqkl (1.13) {q } ∑ 0 ∑ kl k

s.t.

0 (resource constraint) Qk ≥ ∑ qkl1(qkl > 0) + ∑ qlk1(qlk > 0) ∀k l>k l 0) + ∑ qlk1(qlk > 0) ≥ −Qk ∀k l>k l

In evaluating the integral of inverse demand, I apply the assumption of isoelastic demand. This 0 0 permits an analytical expression for the integral that depends only on Qk, Qk, Vk , and ηs (the demand elasticity for sector s):

Z Qk  η +1   1  0 0 Qk  s Vk(ϕ)dϕ = Vk Qk − 1 . (1.14) 0 + 0 Qk ηs 1 Qk

1.8.2 Results

Scenario 1: Gains achieved by observed transactions in California’s existing water market.

To quantify the economic benefits achieved to date by the statewide water market, I simply calculate consumer surplus directly from observed transactions. I use the expression for consumer surplus in Equation 1.5 and evaluate the integral using Equation 1.14. I do not need to specifically include prices or transaction costs, since I assume marginal valuations net of transaction costs are equalized at the margin – any variation in prices within consumer-year is interpreted as transaction costs. In an average year within the timespan of my dataset, the total quantity traded in the existing market is 382,576 acre-feet. Multiplying volumes by prices, average total sales is $144 million per year. Applying my estimated elasticities, Table 1.6 shows that the corresponding gain in consumer surplus of these transactions relative to initial endowments is $146 million per year. Figure 1-9 shows the geographic distribution of net water sales in the existing market. Sellers tend to be in the Central Valley and along the Colorado River; buyers tend to be urban centers.

Scenario 2: Gains from all units trading as much as today’s highest-exporting unit.

I run the optimization problem including physical transaction costs and a trading quantity constraint, ¯ 0 in which I limit each unit’s total transaction quantity to 17.5% of its endowment: Qk ≥ 0.175 · Qk.

54 In other words, all units are allowed to trade up to the same percentage currently traded by the unit which sells the greatest volume of water. This scenario takes advantage of equilibrium gains without extrapolating far from the current equilibrium. This estimate is likely to be more reliable than scenarios that extrapolate farther for two reasons: (a) demand elasticities can be considered local estimates without relying heavily on a functional form assumption, and (b) infrastructure capacity constraints are less likely to bind. Results again are in Table 1.6 and Figure 1-9. In this simulation, total quantity traded increases by a factor of six relative to the existing market, and consumer surplus increases by $481 million per year. These gains are additional to the gains in Scenario 1, the water market that already exists, increasing economic gains more than four times. More water moves from the Central Valley, especially the Sacramento Valley in the north, to the coastal cities and the southern end of the San Joaquin Valley.

Scenario 3: Gains from relaxing all non-physical transaction costs (“first-best”).

I run the optimization problem still including physical transaction costs but without a trading quantity constraint. This scenario eliminates all differences in marginal valuations up to physical transaction costs. Because this scenario extrapolates into realms in which (a) the demand response relies more heavily on the isoelastic assumption, and (b) unmodeled capacity constraints are more likely to bind, these estimates are more speculative and should be viewed with some caution. Table 1.6 shows that in this scenario, the gain in consumer surplus would be $1.05 billion per year, and total quantity traded would increase to 7.58 million acre-feet per year. In Figure 1-9, even more water moves from the Sacramento Valley and northern San Joaquin Valley to the southern San Joaquin Valley and coastal cities.

Scenario 4: Gains with zero transaction costs (“zeroth-best”).

I run the simplest possible optimization problem, in which there are no transaction costs, even physi- cal ones, and all marginal valuations are equalized at a single price P. This is a poor guide to policy but may be an interesting thought experiment. To simulate this scenario, I set all entries of Ckl to zero; because the delivery pathways no longer matter, I can re-express the resource constraint as

∑k Qk = ∑k Ek. Then the optimization problem becomes simpler, since I only need to search over K 2 parameters Qk instead of O(K ) parameters qkl.

55 Table 1.6 shows that in this impractical scenario, the gain in consumer surplus would be $1.22 billion per year and total quantity traded would increase slightly to 7.60 million acre-feet per year. The geographical distribution, as shown in Figure 1-9, is broadly similar to Scenario 3.

Together, these simulations suggest that polices to support an expanded water market in California could result in large economic gains. In such a market, substantial amounts of water would flow from the Sacramento Valley to other parts of the state, which may have undesirable environmental consequences. The estimates here serve as a quantitative benchmark against which to weigh the value of these potential costs of water reallocation.

1.9 Conclusion

In this chapter, I use a revealed-preference approach to estimate the potential gains from increased trade in California’s statewide water market. I develop a four-step empirical procedure to analyze welfare in thin markets with transaction costs, and I apply it to new, uniquely comprehensive data on California’s water economy. First, I estimate marginal transaction costs associated with observable cost factors by measuring price gaps across transactions within district and year. I find that regulatory reviews and crossing the Sacramento–San Joaquin Delta create large transaction costs, each on the order of $200 per acre-foot. Second, I estimate districts’ equilibrium marginal valuations by taking the maximum or minimum price after adjusting for estimated marginal transaction costs. I find that urban districts have consistently higher marginal valuations on average than agricultural districts, implying that substantial welfare gains are available from cross-sector reallocation. Third, I estimate elasticities of inverse aggregate demand using weather-driven variation in yearly surface water entitlements. I find that agricultural districts have more elastic demand than urban districts, suggesting that farms bear the greater incidence of transaction costs. Having estimated equilibrium marginal valuations and demand elasticities, I combine them to construct demand curves and simulate counterfactual equilibria. I find that observed trading in the existing market increases consumer surplus by $150 million per year, an amount roughly equal to the total market value of sales. In an intermediate scenario, in which all regions are allowed to trade the same percentage of their endowments as the highest-exporting region (17.5%), I find that consumer surplus would increase by $480 million per year. More speculative scenarios that allow unlimited trading yield even larger benefits.

56 The full optimization exercises are idealized but can serve as policy benchmarks. Essentially, I document empirical gaps in marginal valuations across water users, and the simulations quantify potential gains that could result from closing these gaps with an efficient statewide water market. Both gaps and gains, however, may be underestimated if there are additional unobserved transaction costs that do not vary across a user’s transactions. Still, knowing a conservatively-estimated size of the potential benefits may help in making decisions about policy reforms. My estimates also miss welfare gains from trading within water districts or allowing retail cus- tomers direct access to the statewide water market. This is intrinsic to my empirical approach, which infers the preferences of the water districts from their observed trading behavior. However, these gains may be more difficult to achieve than the ones I do measure, since they would require reforming the institutional structure of local governments and water utilities. Finally, this chapter estimates only the potential benefits of a more efficient water market, not the costs. A complete policy analysis should also consider the benefits of existing market-restricting regulations and the costs of creating stronger market-supporting institutions. Still, there may be many ways to develop more robust water markets without compromising environmental and other policy goals. Local hydrological externalities could be resolved via a streamlined and unified system for determining consumptive use, which would help reduce the large difference in regulatory treatment between on-site water use and off-site water transfers. Instream flow requirements could be met through direct environmental purchases or by capping and auctioning off instream capacity. Even with significant cross-region flow constraints (as might be preferred, for example, at California’s Sacramento–San Joaquin Delta), large economic gains are likely available from within-region water markets.

57 1.10 Figures

Appropriative & riparian rights

Federal & state projects (State Water Project, Central Valley Project, Lower Colorado Project)

Water districts (agricultural or urban; public or private)

Independent consumers (farms, ranches, institutions, others)

Retail consumers (farms, households, institutions, others)

Figure 1-1: Structure of water entitlements in California. All water use is governed by a legal system of appropriative and riparian rights. Some water districts and independent consumers directly hold their own water rights; others hold long-term contracts with the federal and state water projects, which in turn hold water rights. Retail consumers purchase water from water districts.

58 ufc ae ihs loain rmtefdrladsaewtrpoet,adaeaeana groundwater annual average and projects, water state includes supply and Water federal afterward). year the every from and supply. year allocations permanent transaction rights, and the water in leases, counted longer-term surface are leases), (which (within-year rights transactions of market transfers spot include transactions Market iue1-2: Figure

bevdvlm fmre rnatoscmae ihttlwtrsupply. water total with compared transactions market of volume Observed 0 10 20 30 40 1990 Water Supplyvs.MarketVolumeinCalifornia 1995 Total watersupply (million acre-feetperyear) 2000 59 2005 Transaction volume 2010 2015 P P Vd(Qd)

b Vd(Qd) - τ

Pd

b τ ∆CS d P’

Pod ∆CSo τs s Vo(Qo) + τ Po

Vo(Qo)

* ’ Ed Qd Qd Qd

* ’ Qo Eo Qo Qo

Figure 1-3: Trade between two districts, buying district d (for destination) and selling district o (for origin).

Districts have endowments En and inverse demand Vn(Qn); demand for o is on a reversed axis so that the total width of the graph equals the sum of endowments. With costless trading, the competitive equilibrium among consumers in the two districts would result in price P0. If cross-market transactions incur per-unit transaction costs τs (for sellers) and τb (for buyers), buyers’ marginal willingness to pay shifts down relative to demand, and sellers’ marginal willingness to accept shifts up. Competitive equilibrium in this case would result in three b s distinct prices: a cross-market price Pod and within-market prices Po and Pd, with Pd − τ = Pod = Po + τ . Consumer surplus for buyers is shaded with diagonal lines, and consumer surplus for sellers (analogous to producer surplus) is shaded with vertical lines.

60 Price Distribution (controlling for year, per acre-foot, 2010$)

$1 $10 $100 $1000

Figure 1-4: Distribution of prices, controlling for year, logarithmic scale. Graph plots a histogram of observed prices on the California statewide water market, 1980-2015, converted to 2010 dollars using the CPI. I control for year by regressing log price on year fixed effects, taking the residual, and adding the grand mean.

61 Marginal valuations, buyers and sellers 2010$ per acre-foot

Raw prices Marginal valuations (sellers) Marginal valuations (buyers)

$1 $10 $100 $1000

Figure 1-5: Distributions of estimated equilibrium marginal valuations for buyers and sellers. Marginal valuations are estimated by taking the highest price paid (for buyers) or the lowest price accepted (for sellers) in each year, after adjusting prices for estimated transaction costs. The result is that buyers’ marginal valuations are higher (i.e., shifted to the right) than those of sellers. Lines are kernel density plots of marginal valuations separately for buyers and sellers, where an observation is a user in each year it appears in the dataset. Histogram of raw prices is not adjusted for year fixed effects, unlike Figure 1-4. Plot is on a logarithmic scale and omits marginal valuations equal to zero (less than 1% of the estimates).

62 cinvlm.Rwpie dte ie)aeol lgtyhge o ra sr hnfrarclua users; agricultural for than users urban trans- for by sector. higher weighted urban slightly year, the for and only higher sector are consistently by lines) are valuations valuations (dotted marginal marginal prices estimated Raw or prices volume. raw action either of means plot Lines iue1-6: Figure

2010$ per acre-foot 0 200 400 600 1990 siae qiiru agnlvlain ysco n year. and sector by valuations marginal equilibrium Estimated Agriculture rawprices Agriculture marginalvaluations 1995 Marginal valuations,bysector 2000 63 Year 2005 Urban rawprices Urban marginalvaluations 2010 2015 Marginal Valuations ($/acre-foot, average year) $25-100 $100-150 $150-200 $200-250 $250-300 $300-350 $350-400 $400-600

Figure 1-7: Estimated equilibrium marginal valuations by geography. Geographic polygons correspond to planning areas as defined by the California Department of Water Re- sources; areas with diagonal lines have no observed transactions. User-by-year marginal valuations are ag- gregated to planning area by (1) partialing out year-varying surface water entitlements, (2) averaging within unit (defined as the intersection of geography [planning area], sector [urban or agricultural], and project mem- bership [State Water Project, Central Valley Project, or neither]), and then (3) averaging across units within planning area, weighting by quantity consumed. Unit-level valuations are direct inputs to the counterfactual simulations in Step 4.

64 ntebsso hs otattps ahtm eiso h rp ersnsasprt market. separate a represents graph “markets” weather the to on of users series water basis aggregate time the I Each on Water types. types. contract rights contract State year 14 these = to water of of SWP year each of basis for Project; vary the sum Valley separately on allocations the set Central project is = are variability while (CVP This entitlements time-invariant, projects conditions. water are water rights Surface state Water and elasticities. federal Project). demand the from estimate allocations to and variation this use I

300 1000 3000 1980 iue1-8: Figure 1985 SWP CVP North 1990 Surface waterentitlements nilmnsb aktoe time. over market by Entitlements (thousand acre-feet,logscale) 1995 Non-Project Regions CVP East 65 2000 2005 2010 CVP South 2015 Scenario 1. Observed transactions Scenario 2. Trade up to 17.5%; physical costs

Quantities traded Quantities traded Observed transactions Up to 17.5% trade; physical costs Sell >500 taf Sell >500 taf Sell 200-500 taf Sell 200-500 taf Sell 100-200 taf Sell 100-200 taf Sell 20-100 taf Sell 20-100 taf Trade <20 taf Trade <20 taf Buy 20-100 taf Buy 20-100 taf Buy 100-200 taf Buy 100-200 taf Buy 200-500 taf Buy 200-500 taf Buy >500 taf Buy >500 taf

Scenario 3. Unlimited trading; physical costs Scenario 4. Unlimited trading, costless trade

Quantities traded Quantities traded Unlimited trade; physical costs Unlimited trade, no costs Sell >500 taf Sell >500 taf Sell 200-500 taf Sell 200-500 taf Sell 100-200 taf Sell 100-200 taf Sell 20-100 taf Sell 20-100 taf Trade <20 taf Trade <20 taf Buy 20-100 taf Buy 20-100 taf Buy 100-200 taf Buy 100-200 taf Buy 200-500 taf Buy 200-500 taf Buy >500 taf Buy >500 taf

Figure 1-9: Annual water quantities traded in the existing market and counterfactual scenarios, by geography. “taf” = thousand acre-feet. Geographic polygons correspond to planning areas as defined by the California Department of Water Resources; areas with diagonal shading have no observed transactions. Quantities shown on map sum over units in a planning area. For Scenario 2, all units are constrained to buy or sell no more than 17.5% of their endowments, the percentage sold by the single unit that exports the greatest volume of water. For Scenario 3, trading quantities are limited only by endowments (average annual surface water entitlements plus average annual groundwater supplies), and transactions incur per-kilometer physical costs as estimated in Step 1. For Scenario 4, physical transportation costs are set to zero. (Unit is defined as the intersection of geography [planning area], sector [urban or agricultural], and project membership [State Water Project, Central Valley Project, or neither].)

66 1.11 Tables

Table 1.1: Summary statistics of transactions data

Panel A: Summary statistics of transactions data Volume (AF) Price (2010$/AF) Distance (km) Observations Mean SD Mean SD Mean SD All transactions 4,906 2,807 14,473 144.80 158.50 57.8 122.3 Within hydrologic region 4,065 1,588 8,283 147.50 161.80 20.7 20.5 Across hydrologic regions 841 8,703 29,140 132.90 142.10 213.8 214.6 Panel B: Transactions by hydrologic region As Origin (Seller) As Destination (Buyer) Volume Volume Net sales Hydrologic Region Count (TAF) Count (TAF) (TAF) North Coast 6 35.6 2 0.7 34.8 North Lahontan 0 0.0 0 0.0 0.0 Sacramento River 302 5,374.5 145 1,968.9 3,405.6 San Francisco Bay 18 194.7 64 392.2 -197.4 Central Coast 49 41.8 36 23.5 18.3 San Joaquin River 280 3,256.8 349 4,448.9 -1,192.1 Tulare Lake 146 1,636.5 267 3,423.3 -1,786.8 South Coast 902 1,827.4 893 2,518.1 -690.8 South Lahontan 3003 559.4 2977 637.3 -77.9 Colorado River 255 845.2 342 359.8 485.4

TAF = thousand acre-feet. Panel A reports statistics for all transactions, those whose origin and destination fall within the same hydrologic region (as defined by the California Department of Water Resources), and those which cross a hydrologic region boundary. Panel B reports the count and total volume of transactions which begin (left) and end (right) in each hydrologic region. Column sums do not exactly agree due to a small number of transactions that involve more than two parties and count in more than one hydrologic region.

67 Table 1.2: Cost factors for which marginal transaction costs are econometrically identified in price data

Nominally incident on Identifying population & comparison Cost factor transactions For sellers For buyers 1 Distance In proportion to distance. All sellers: Sell to buyer All buyers: Buy from (Euclidean) further vs. nearer. seller further vs. nearer. 2 Delta crossing Moving water across the Sellers north of Delta: Sell Buyers south of Delta: (losses & risks from Sacramento-San Joaquin to south of Delta, vs. to Buy from north of Delta, environmental Delta. remain on same side. vs. from same side. restrictions) 3 SWRCB review Involving a change in the Sellers within SWP or Buyers within SWP or (State Water Resources place of use of a post-1914 CVP: Sell outside of CVP: Buy from outside of Control Board) appropriative water right. project, vs. remain within. project, vs. from within. 4 DWR review Within, or using facilities Sellers outside SWP: Sell Buyers outside SWP: Buy (California Dept. of of, the State Water Project into SWP, vs. remain on from SWP, vs. from Water Resources) (SWP). outside. outside. 5 USBR review Within, or using facilities Sellers outside CVP: Sell Buyers outside CVP: Buy (U.S. Bureau of of, the Central Valley into CVP, vs. remain on from CVP, vs. from Reclamation) Project (CVP). outside. outside. 6 County review Exporting groundwater Sellers within restricting Buyers outside of (Counties with from counties with county: Sell out of county, restricting county: Buy ordinances restricting restrictive ordinances. vs. remain within. from inside county, vs. groundwater exports) from elsewhere.

68 Table 1.3: Marginal transaction costs from specific cost factors

Dependent variable: SELLERS (positive coefficients are costly) BUYERS (negative coefficients are costly) Total Cost Price ($/acre-foot) Panel A: Aggregated cost factors (1) (2) (3) (4) (5) (6) (7) (8) (9) Additional regulatory review 45.7 * 30.2 78.4 *** 88.5 *** -74.9 * -78.3 * -153.5 ** -161.5 ** 250.0 *** (24.9) (27.0) (25.3) (30.7) (44.4) (43.0) (69.2) (64.9) (71.8) Distance (km) -0.047 -0.023 0.004 0.003 -0.082 -0.026 -0.049 -0.067 0.070 (0.041) (0.034) (0.040) (0.045) (0.051) (0.057) (0.072) (0.078) (0.090) Cross-sector transfer (ag-to-urban) -7.4 -6.9 * 5.9 -3.8 18.4 *** -1.7 1.8 4.0 -7.8 (5.2) (3.7) (3.8) (4.9) (5.1) (4.3) (3.9) (4.4) (6.6) Seller's region by year FE     Seller's subregion by year FE   Seller FE   Seller by year FE  Buyer's region by year FE     Buyer's subregion by year FE   Buyer FE   Buyer by year FE  Observations 2,584 2,584 2,584 2,584 2,727 2,727 2,727 2,727 Clusters 177 177 177 177 192 192 192 192

Panel B: Disaggregated cost factors (1) (2) (3) (4) (5) (6) (7) (8) (9) Delta crossing only 172.6 *** 186.5 *** 161.3 *** 258.1 *** -124.8 -49.9 * -85.1 * -65.4 323.5 *** (42.6) (54.9) (47.9) (58.0) (76.4) (29.2) (47.7) (42.9) (72.2) SWRCB review only 42.6 ** 44.6 *** 51.2 *** 58.0 *** -16.4 -67.3 -124.5 -136.8 * 194.8 ** (19.6) (11.3) (17.4) (15.2) (24.1) (49.9) (86.1) (80.3) (81.7) DWR review only -4.7 -40.2 71.2 21.3 -314.8 ** -318.4 ** -376.0 *** -385.3 *** 406.5 *** (26.6) (31.2) (65.8) (33.3) (152.9) (123.7) (132.3) (111.9) (116.7) USBR review only -27.9 -60.8 * -97.0 -8.7 -25.3 -58.4 -121.4 -125.4 116.7 (57.2) (34.2) (108.7) (11.2) (26.0) (45.8) (83.4) (77.3) (78.1) Delta crossing + SWRCB review - - - - 13.6 -31.7 -62.5 -72.7 - - - - - (38.4) (33.7) (57.3) (51.3) - Delta crossing + DWR review 123.5 *** 106.0 ** 117.4 *** 183.5 *** - - - - - (33.0) (47.5) (40.3) (65.0) - - - - - Delta crossing + USBR review 54.8 ** 72.2 ** 84.9 *** 217.6 *** 5.0 -33.1 -108.5 * -114.4 ** 332.0 *** (27.1) (29.4) (26.5) (77.6) (60.1) (30.6) (61.4) (51.6) (93.2) County review 12.8 6.7 3.3 1.1 4.3 11.5 28.3 ** 21.7 ** -20.6 * (12.0) (8.5) (10.1) (7.4) (11.4) (11.2) (11.7) (9.3) (11.9) Distance (km) -0.066 -0.058 * -0.030 -0.059 -0.091 -0.089 -0.168 * -0.160 0.101 (0.040) (0.034) (0.043) (0.045) (0.060) (0.073) (0.099) (0.101) (0.111) Seller's region by year FE     Seller's subregion by year FE   Seller FE   Seller by year FE  Buyer's region by year FE     Buyer's subregion by year FE   Buyer FE   Buyer by year FE  Observations 2,584 2,584 2,584 2,584 2,727 2,727 2,727 2,727 Clusters 177 177 177 177 192 192 192 192

Regressions of transaction price on cost factors from Table 1.2 (i.e., binary variables for regulatory reviews, and distance). The left side (columns 1-4) include seller-side fixed effects, comparing prices across buyers. Positive coefficients reflect a price premium, indicating that the cost factor is costly to the seller. The right side (columns 5-8) include buyer-side fixed effects, comparing prices across sellers. Negative coefficients reflect a price discount, indicating that the cost factor is costly to the buyer. Column (9) calculates the linear combination of the absolute value of (4) and (8), the preferred specifications. Mean price in data is $144; mean price conditional on being subject to any regulatory review is $233. Standard errors are shown in parentheses and clustered by region X year. Region and subregion are hydrological region and planning subarea as defined by the California Department of Water Resources. All regressions also include comparison-sample indicator variables. * p<.1, ** p<.05, *** p<.01.

69 Table 1.4: Aggregate gains from reducing regulatory costs

Marginal benefits from reducing transaction costs in existing market (excluding benefits from new transactions that might take place) Panel A: Regulatory transaction costs Volume Gains ($1000s/year) from Regulatory review affected reducing costs by: (acre-feet/year) 10% 50% 100% Delta crossing 114,010 3,688 18,438 36,877 (823) (4,114) (8,229) SWRCB review 151,475 2,951 14,755 29,510 (1,237) (6,186) (12,373) DWR review 172,919 7,030 35,149 70,297 (2,019) (10,093) (20,185) USBR review 125,931 1,470 7,350 14,699 (984) (4,919) (9,838) County review 99,891 -206 -1,030 -2,059 (118) (592) (1,185) Total 14,932 74,662 149,324 (2,695) (13,476) (26,952) Panel B: Physical transaction costs Distance 547,234 320 1,598 3,196 (350) (1,748) (3,497)

Estimated regulatory costs in the existing market. Benefits of these regulations are not quantified. Right-most column (100%) multiplies the estimated marginal cost of each cost factor (summed across buyers and sellers, as in column 9 of Table 1.3) by the annual average volume of observed spot-market transactions that are subject to each factor. Other columns (10%, 50%) simply scale this proportionally. Standard errors in parentheses.

70 Table 1.5: Demand elasticities

Panel A: First Stage Ln(Quantity consumed) Agriculture Urban (1) (2) (3) (4) (5) (6) Ln(Entitlements) 0.95 0.94 0.94 0.89 0.86 0.86 (0.02) (0.02) (0.02) (0.08) (0.08) (0.09) F-statistic 3520 3783 3019 132 112 92 Cost factors X Direction       Year fixed effects       Market fixed effects       Subregion fixed effects   District fixed effects   Observations 2,421 2,421 2,421 2,547 2,547 2,547 Clusters 204 204 204 157 157 157 Panel B: Inverse Demand Elasticities (2SLS) Ln (Price) Agriculture Urban Ln(Quantity consumed) -0.72 -0.84 -0.86 -1.35 -0.84 -0.47 (0.23) (0.22) (0.19) (0.37) (0.27) (0.25) Cost factors X Direction       Year fixed effects       Market fixed effects       Subregion fixed effects   District fixed effects   Observations 2,421 2,421 2,421 2,547 2,547 2,547 Clusters 204 204 204 157 157 157 Panel C: Alternative Specification (2SLS) Ln (Price) Agriculture Urban Ln(Quantity consumed) -0.61 -0.59 -0.95 -1.57 -0.76 -0.94 (0.25) (0.24) (0.18) (0.47) (0.26) (0.39) Cost factors X Direction       Other-market entitlements       Market fixed effects       Market-specific time trends       Subregion FE & time trends   District FE & time trends   Observations 2,421 2,421 2,421 2,547 2,547 2,547 Clusters 204 204 204 157 157 157 Panel D: Within-Market Transactions Only (2SLS) Ln (Price) Agriculture Urban Ln(Quantity consumed) -1.67 -1.63 -1.35 -1.39 0.36 -11.02 (0.39) (0.41) (0.31) (1.34) (0.79) (7.90) Cost factors X Direction       Year fixed effects       Market fixed effects       Subregion fixed effects   District fixed effects   Observations 1,748 1,748 1,748 1,985 1,985 1,985 Clusters 93 93 93 101 101 101

Dependent variable is listed immediately below each panel title. Observations are transactions; quantities and entitlements are market-level sums per year. Two-stage least squares models (Panels B-D) regress transaction prices on quantity con- sumed, instrumenting with entitlements. Within-market variation in entitlements over time is set by government agencies on the basis of precipitation and snowmelt. Markets are supersets of entitlement types (i.e., distinct values of the instru- ment). All regressions include binary indicators for the cost factors from Table 1.2, interacted with transaction direction (i.e., whether buying or selling). Alternative specification (Panel C) omits year fixed effects in favor of an average of other- market entitlements (weighted in proportion to each market pair’s total cross-market trading volume) and market-specific time trends. Standard errors (in parentheses) are clustered by market X year, the level of variation in both the instrument and the endogenous variable. Subregion is planning subarea as defined by the California Department of Water Resources.

71 Table 1.6: Economic benefits from water markets in several scenarios

Total quantity Net sales from Economic gains Scenarios traded farms to cities (consumer surplus) (acre-feet/year) (acre-feet/year) ($/year) 1 Observed transactions 382,576 266,108 $145,600,000 (In existing statewide market) 2 Intermediate scenario 2,456,841 690,526 $481,400,000 (Trade up to 17.5% of endowment; physical costs) 3 First-best scenario 7,581,138 1,926,043 $1,051,100,000 (Unlimited trade; physical costs) 4 Zeroth-best scenario 7,595,753 3,028,704 $1,218,500,000 (Unlimited trade; no physical costs)

Results of counterfactual simulations, as found by solving constrained optimization problems as described in Step 4. For Scenario 2, all units are constrained to buy or sell no more than 17.5% of their endowments, the percentage sold by the single unit that exports the greatest volume of water. For Scenario 3, trading quantities are limited only by endowments (average annual surface water entitlements plus average annual groundwater supplies), and transactions incur per-kilometer physical costs as estimated in Step 1. For Scenario 4, trading quantities are limited only by endowments, and physical transportation costs are set to zero.

72 1.12 Appendix: Validating Low Marginal Values of Water in Agricul- ture

Counterfactual results suggest that under reduced transaction costs, substantial amounts of water will be transferred from farms to cities. This result ultimately comes from the fact that water transactions involving agricultural users have lower prices on average than those involving urban users. To provide further, independent evidence to support this result, and to begin to investigate some of the reasons for this, I turn to agricultural outcomes. My goals in this section are to (1) measure the effect of water entitlements on land use, and (2) estimate the average marginal product of (surface) water entitlements in agriculture.

1.12.1 Data

For land use, I use a remote sensing (i.e., satellite data) product from U.S. Department of Agriculture (USDA), the Cropland Data Layer. This gives the crop grown at every pixel in a 30-meter grid for the entire United States, in each year from 2007 through 2016. California alone has 300 million pixels per year. I aggregate pixels to farm fields (quarter quarter sections) using the Public Land Survey System, keeping the modal land use within each field. This leaves about 450,000 observations per year. Aggregating to fields reduces noise and computational time without losing much information signal, since the survey system is highly collinear with cropping patterns. For agricultural revenues and profits, I obtain access to restricted-use microdata from the USDA’s Census of Agriculture. This gives self-reported farm-level revenues and expenditures for a sample of farm operations in California at five-year intervals, from 1982 through 2012. (Although it is a census, some operations answer only a short-form questionnaire without financial information.) In some analyses, I also use revenue and profit variables derived from land use data. I construct these by parameterizing observed crop choice using mean prices (within crop category and county) and mean variable expenditures (within crop category). Prices come from California County Agricul- tural Commissioners’ Reports; for each crop category, I divide total revenues by acres harvested and take the mean across years within county. For variable expenditures, I use the Census of Agriculture microdata, taking the coefficients from a regression of total expenditures on area harvested of each crop category and a constant. For crop water intensity, I use preliminary output from Cal-SIMETAW, the DWR’s new agricul-

73 tural model for the California Water Plan Update 2018. This gives applied water per acre for each of 20 crop categories, 278 geographical regions (detailed analysis unit by county), and 36 years (1980- 2015). Estimates are based on weather and plant physiology along with detailed knowledge of soils, geology, and irrigation practices. I divide crop categories into high and low water intensity by median within each geographical region. For weather covariates, I use gridded data from Schlenker and Roberts (2009), derived from PRISM monthly data and daily weather station observations. This gives daily minimum and maxi- mum temperatures and precipitation on a 4-km grid for the contiguous United States. I follow their methods to construct the following secondary variables at the quarter-year level: total precipitation; time at different temperatures, degree days, and vapor pressure deficit (Roberts et al. 2013). Details of the construction of all agricultural variables and covariates are described in the Data Appendix, Section 1.16.

1.12.2 Outcome variables

For land use, my primary data source is remote sensing data. This data includes essentially the universe of farms in California during the years it is available, and its high spatial resolution allows extremely precise matching with my water entitlements dataset, via irrigation district boundaries. It is subject to measurement error from imperfect remote identification of crops, but this is of low concern because it appears on the left-hand side of the regression. For financial outcome variables, I use both remote sensing data and the Census of Agriculture, each of which has advantages and disadvantages. The Census data directly gives self-reported rev- enues and profits, but its spatial identifiers are low-resolution (county and zip code). This may create substantial measurement error on the right-hand side of the regression from imprecise matching with my water entitlements dataset. In addition, the Census data is a smaller sample: not all farms are surveyed, and it is available only once every five years. It is also a highly imbalanced panel; about one-third of the observations are observed only once and cannot be used with farm fixed effects. Because revenues and profits are inherently noisy outcomes, statistical power may be low. Because of concerns about power and measurement error, I also use revenue and profit variables constructed from remote sensing data. This allows me to use a larger sample and reduce measurement error on the right-hand side. The disadvantage is that this measure captures changes in revenues

74 and profits only through crop choice, missing any changes in yields, crop quality, or compensatory expenditures, which would bias the estimate downward. On the other hand, it also may be subject to selection into different crops by land quality or farmer comparative advantage, which would bias it upward.

1.12.3 Empirical strategy

To measure the effect of yearly surface water entitlements on land use and financial variables, I run panel regressions of per-acre outcomes Yf nt (for farm f within irrigation district n in year t) on per- acre surface water entitlements Ent , farm fixed effects, and year fixed effects:

Yf nt = βEnt + κ f n + λt + ε f nt . (1.15)

Intuitively, the identifying assumption is that a farm’s water entitlements are as good as randomly assigned, conditional on the farm’s mean entitlements over time (subsumed in the farm fixed effect) and statewide water availability (subsumed in the year fixed effect). This is plausible given that my variation comes from nonlinear functions of winter precipitation in mountain ranges (i.e., in a different location from the farm) and long-term contract types within the Central Valley Project and the State Water Project. Slightly more rigorously, I assume parallel trends in outcomes: absent cutbacks, farms in different irrigation districts would follow trends over time that differ only by an additive constant. The clearest threats to identification are regional time-varying factors that correlate with both water entitlements and the outcomes. One possibility is local precipitation and other weather, despite the typical geographic separation between where water falls and is used, so I control for a host of weather variables in both the current and previous year.10 Another possibility is crop prices; for remote sensing data, I construct revenues using time-invariant means of crop prices. For Census data, I control for a weighted average of crop prices, where weights equal the baseline crop mix within the county. To make results more representative of the sample, I weight observations by area (for remote sensing data, usually one-sixteenth of a square mile). To account for spatial and temporal correlation, I cluster standard errors either by 8-digit watershed as defined by the U.S. Geological Survey (for

10I include quarterly sums or means of precipitation, temperature, degree days, and vapor pressure difference for the nearest grid point to the farm. See Section 1.4 for details.

75 remote sensing data) or by zip code (for Census data).

1.12.4 Results

Table 1.7 shows the results for land use. In Panel A, I show separate regressions of the linear prob- ability of several land-use choices. First, I find that for every additional acre-foot per acre of water entitlements, 3.5 percentage points less farmland is fallowed (that is, not planted with crops nor used for another purpose). For comparison, the average crop uses 3.4 acre-feet per acre of water. This estimate may appear small; it reflects that farmers have multiple margins of adaptation. These in- clude substitution to groundwater, deficit irrigation (use less water on the same crops), ex-post water purchases, or reducing any ex-post water sales. In the following columns of Panel A, I find that water entitlements increase the acreage of both low- and high-water intensity crops (divided by median). This implies that the response to water cutbacks is predominantly through the extensive margin (cropping or not) rather than the intensive margin (more or less water-intensive crops). I also find that the entire shift to fallowing comes from annual crops (those which must be planted every year). The effect on perennial crops, which require a fixed investment and can be harvested over multiple years, is a precisely estimated zero. In summary, when droughts reduce water availability, farmers fallow some of their annual crops, but not very much of them. Panel B shows results of a few robustness checks. First I show the results of adding lags of water entitlements. When including only contemporaneous weather covariates, the previous year’s entitlements matter. However, when I add weather covariates for the previous year, the previous-year coefficients become smaller and statistically insignificant, and the current-year coefficient is very similar to the main specification. In the last column, I estimate the effect of water entitlements on fallowing in the Census of Agriculture, an entirely independent dataset, and find a nearly identical result. Table 1.8 shows the results for revenue and profits. In Panel A, using remote sensing data (i.e., imputed financial outcomes derived from crop choice), I find that every extra acre-foot of water entitlements raises revenues by $103, and profits by $26. Like the effects on land use, these may appear small, but they are consistent with my estimated agricultural marginal valuations from Section 1.5, which are often around $100. Using Census of Agriculture data, I find that increases in water

76 entitlements reduce profits and revenues, but these estimates are imprecise, with large confidence intervals that include the remote-sensing coefficients. In Panel B, I take the remote sensing estimate and vary either the inclusion of weather covariates or the level of fixed effects. The point estimates are similar to the main specification, falling between $21 and $48 per acre-foot. The empirical specification in this section could be viewed as the reduced form of an instrumental variables model in which actual water consumption is the endogenous variable. This would require stronger identifying assumptions, and none of the available measures of water consumption is ideal. However, I estimate the first-stage of this model in Section 1.7, where Table 1.5 shows that the coefficient for agriculture is 0.90. Dividing the profit estimate from remote sensing (the reduced form), by this first stage yields a two-sample instrumental variables estimate of 29.2, little different from the reduced form. This section provides two basic results: (1) the short-term marginal product of surface water in California agriculture is small to zero, and (2) surface water entitlements affect cropping patterns by only a small amount. Taken together, these results suggest that farmers have scope to reduce their water consumption without resorting to fallowing their land or even forgoing much profit, at least in the short term. Thus, an entirely independent empirical approach lends support to the results in the rest of the chapter: marginal valuations in agriculture are low relative to cities, and if water markets were allowed to expand, substantial amounts of water would be reallocated from farms to cities, resulting in economic gains.

77 Table 1.7: Effects of surface water entitlements on land use

Panel A: Land use effects (linear probability) Crops, by water intensity Crops, by life cycle Fallowing Low High Perennials Annuals Water entitlements (acre-feet/acre) -0.035 *** 0.020 ** 0.021 ** 0.000 0.035 *** (0.007) (0.009) (0.010) (0.004) (0.007) Weather covariates, same year      Weather covariates, previous year      Field fixed effects      Year fixed effects      Observations 3,687,424 3,687,424 3,687,424 3,687,424 3,687,424 Clusters 121 121 121 121 121 Panel B: Robustness checks Fallowing (linear probability) Remote sensing Census Water entitlements (acre-feet/acre) -0.035 *** -0.029 ** -0.036 *** -0.034 *** -0.031 *** (0.007) (0.013) (0.009) (0.007) (0.007) Water entitlements, 1 year prior -0.038 ** -0.020 -0.018 (0.017) (0.012) (0.014) Water entitlements, 2 years prior 0.011 (0.011) Weather covariates, same year     Weather covariates, previous year    Field fixed effects      Year fixed effects      Observations 3,687,424 3,242,530 3,242,530 2,781,603 84,920 Clusters 121 121 121 121 308 Regressions of the linear probability (remote sensing) or percentage (Census) of agricultural land uses on average annual water entitlements. Observations are quarter quarter sections (remote sensing) or operations (Census) and are weighted by area. Standard errors (in parentheses) are clustered by 8-digit watershed (remote sensing) or zip code (Census). * p<0.10, ** p<0.05, *** p<0.01.

78 Table 1.8: Effects of surface water entitlements on profits and revenues

Panel A: Profits and Revenues ($/acre) Remote sensing Census of Agriculture Profit Revenue Profit Revenue Water entitlements (acre-feet/acre) 26.3 103.3 *** -54.6 -103.2 (29.0) (29.9) (46.8) (106.5) Weather covariates, same year   Weather covariates, previous year   Field fixed effects     Year fixed effects     Observations 3,687,416 3,687,416 77,448 80,593 Clusters 121 121 308 307 Panel B: Robustness Checks Remote sensing: Profit ($/acre) Water entitlements (acre-feet/acre) 26.3 47.6 21.9 47.4 * (29.0) (29.9) (30.6) (24.7) Weather covariates, same year    Weather covariates, previous year   Field fixed effects     Year fixed effects     County-by-year fixed effects  Observations 3,687,416 3,704,111 3,687,423 3,687,416 Clusters 121 121 121 121 Regressions of per-acre profits or revenues on average annual water entitlements. For the remote sensing sample, outcomes are constructed from crop choices using time-invariant average prices within crop category and county. Observations are quarter quarter sections (remote sensing) or operations (Census) and are weighted by area. Standard errors (in parentheses) are clustered by 8-digit USGS watershed (remote sensing) or zip code (Census). * p<0.10, ** p<0.05, *** p<0.01.

79 1.13 Appendix: Market Power as an Alternative Explanation for Price Gaps

In the main sections of this chapter, I document price gaps associated with regulatory barriers and other cost factors, and I interpret these price gaps as marginal transaction costs. In this appendix, I investigate whether these price gaps may instead be explained by market power. Because my model of bilateral negotiations rules out the successful exercise of market power, I first develop an alternative model of spatial trade that allows both buyers and sellers to exploit market power. Then, I derive and perform an empirical procedure that adjusts raw prices for market power, using estimated pass-through rates as sufficient statistics. Both the model and empirical procedure are based in, and extend the approach of, Atkin and Donaldson (2015). Finally, using these adjusted prices, I re-estimate marginal transaction costs as in Section 1.5. I find little change in the overall pattern of marginal transaction costs, and so I conclude that the issue of market power is small relative to transaction costs. Even if markups are non-negligible, however, the approach in the main sections are still inter- pretable. The price gaps I estimate in 1.5 would instead measure both marginal transaction costs and differences in markups (and markdowns) that are correlated with the selected cost factors. Because both transaction costs and markups drive wedges between observed prices and marginal valuations, the bounds I estimate for marginal valuations using these estimated price gaps are still valid.11 Then, as long as these differential markups are endogenous to the transaction costs (as opposed to aris- ing from exogeous aspects of the market structure), eliminating transaction costs would also elimi- nate markups, and counterfactual price and quantity analysis proceeds through the same way. What markups do change is the interpretation of consumer surplus in welfare analysis. The deadweight loss triangles still represent true potential economic gains from removing the cost factors, but now some of the price gap rectangle may simply represent transfers between parties without lost efficiency.

1.13.1 Model of spatial trade with market power

As in the main model, N water districts indexed by n (or by o for origin and d for destination) are given

initial endowment En that is allocated efficiently among a continuum of consumers. Consumers’ pref-

11Under an analogous assumption for markups as Assumption 3 for transaction costs: that differences in markups asso- ciated with specific cost factors are constant and additively linear.

80 erences can be aggregated such that each district has an inverse demand function Vn(Qn), which gives marginal valuations as a function of quantity consumed Qn. Trade across districts is conducted by two layers of intermediaries. In each district, selling intermediaries (“sellers”) can buy units of water from consumers (at their marginal valuations Vo(Qo)) and sell to buying intermediaries (“buyers”) in another district. Buyers, in turn, buy water from sellers and can sell to consumers in their own district

(at their marginal valuations Vd(Qd)). Sellers and buyers meet at exchange points unique to each pair of districts, where prices Pod are determined. Each transaction i generates constant marginal transaction costs for both sellers and buyers, as in s s s Assumption 1. However, sellers and buyers may also incur fixed costs: Cod(qiod) = τodqiod + Fod and b b b Cod(qiod) = τodqiod + Fod. Also unlike in the main model, strategic interactions are possible among sellers at specific desti- nations, and among buyers at specific origins. One seller’s quantity decisions may affect the profits of other sellers to the same district through the aggregate quantity sold, and similarly for buyers. I summarize this strategic interaction using a competitiveness index, following Atkin and Donaldson (2015).

Assumption 5. Market structure. Buying and selling intermediaries choose quantities qiod to maximize profits subject to the expected response of other intermediaries. The competitiveness in- −1 −1 dex φ s ≡ dQd  Qd  for sellers and φ b ≡ dQo  Qo  for buyers is fixed for each origin- od dqiod qiod od dqiod qiod destination district pair.

This approach nests many specific models of market structure. For Cournot oligopoly with iden- s tical intermediaries, the competitiveness index φod equals Qd/qiod, or the number of intermediaries. s s For perfect collusion or a pure monopoly, φod = 1, and for perfect competition, φod −→ ∞. Note that both buyers and sellers cannot both successfully exercise market power in the same origin/destination pair. Market power comes from keeping quantities lower or higher than efficient; these are mutually exclusive. However, by permitting both in the same model, I allow the market power to appear on either side (which may even change from place to place). Next, I define markups (also referred to as markdowns, for buyers) as any difference between

81 prices and willingness to pay (or to accept):

s s (Sellers) µod(Qo) ≡ Pod(Qo) −Vo(Qo) − τod (1.16)

b b (Buyers) µod(Qd) ≡ Vd(Qd) − τod − Pod(Qd).

I then assume there are no strategic interactions across sellers and buyers.

Assumption 6. No interactions across sides of the market. Markups set by sellers and buyers are s b not affected by the overall quantities consumed in their own districts: ∂ µod = 0 and ∂ µod = 0. ∂Qo ∂Qd

This says that in choosing quantities, intermediaries consider only the other side’s demand re- sponse, not any expected strategic change in markups. A sufficient condition for this assumption is that market power exists on only one side of the market: for any origin-destination pair, one side may exert market power, while the other consists of price-takers. (Appendix 1.14 describes a model of bargaining under bilateral oligopoly that allows for more symmetric analysis.) With this setup in place, the first-order conditions for buyers and sellers yield expressions for markups:

s ∂Vd  Qd (Sellers) µod = − s (1.17) ∂Qd φod b ∂Vo  Qo (Buyers) µod = + b . ∂Qo φod

Note that if markups are zero (such as under perfect competition), Equation 1.16 becomes identical to the result for price determination in the main model.

1.13.2 Deriving the estimation procedure

Following Atkin and Donaldson (2015), I use pass-through rates as sufficient statistics for the com- petitive structure of the market. I first define pass-through rates as the absolute rate at which costs (for s sellers) or revenues (for buyers) are passed through to market equilibrium prices: ρod ≡ ∂Pod/∂MWTPod b s b and ρod ≡ ∂Pod/∂MWTAod, where MWTPod = Vo(Qo) + τod and MWTAod = Vd(Qd) + τod. Then, I assume demand takes a particularly convenient functional form.

Assumption 7. Bulow-Pfeiderer demand. Consumer preferences are time-invariant and take the

82 form  a − b (Q )δn if δ > 0,a > 0,b > 0,0 < Q < (a /b )1/δn  n n n n n n n n n  (a /b ) Vn(Qn) = an − bn ln(Qn) if δn = 0,an > 0,bn > 0,0 < Qn < e n n    δ an − bn(Qn) n if δn < 0,an ≥ 0,bn < 0,0 < Qn < ∞.

s δd −1 b Under this assumption, pass-through rates can be expressed as ρ = 1 + s and ρ = od φod od δo −1 1 + b , and each is fixed within seller-buyer pair. Inserting this functional form for demand into φod buyers’ and sellers’ first-order conditions, and adding time subscripts, yields:

s s s s b s 1 (Sellers) Podt = γodVot + ρodτodt − (1 − ρod)(τodt − ad) + (1 − ρod)( b − 1)ao (1.18) ρod b b b b s b 1 (Buyers) Podt = γodVdt − ρodτodt + (1 − ρod)(τodt + ao) + (1 − ρod)( s − 1)ad, ρod

s s b b b s b s where γod ≡ (ρod + ρod − 1)/ρod and γod ≡ (ρod + ρod − 1)/ρod. From here, I can obtain unbiased estimates of transaction costs – accounting for markups – via a two-step procedure.

Step 1: Estimate pair-specific pass-through rates for buyers and sellers. First, to transform 1.18 into a regression model, I again adopt Assumption 3 (additively linear factor-specific transaction

costs). I absorb demand levels (terms depending on ao and ad) and time-invariant portions of transac- tion costs with buyer-seller pair fixed effects. Then, I regress prices on previously-estimated marginal valuations (also restoring Assumption 4 to ensure marginal valuations are point estimates rather than bounds):

s ˆ s s (Sellers) Piodt = γodVot + λod + εiodt (1.19) b ˆ b b (Buyers) Piodt = γodVdt + λod + εiodt ,

where I assume movement in marginal valuations within buyer-seller pair is uncorrelated with other, unobserved determinants of prices. (The error terms arise from idiosyncratic time variation in trans- s b s b s b action costs, τ˜odt and τ˜odt .) Upon obtaining estimates γˆod and γˆod, I can back out ρˆod and ρˆod s b s b s b s b by solving the system of two equations defining γod and γod: ρod = γod/(γod + γod − γodγod) and s s s b s b ρod = γod/(γod + γod − γodγod).

83 Step 2: Adjust prices for pass-through and re-estimate transaction costs from gaps in ad- justed prices. Second, by inserting Assumption 3 into Equation 1.18 and rearranging, I obtain

  1 s s ˆ s s (Sellers) s ρˆodPiodt + (1 − ρˆod)Vdt = ωot + τ Bod + viodt (1.20) ρˆod   1 b b ˆ b b (Buyers) b ρˆodPiodt + (1 − ρˆod)Vot = ωdt + τ Bod + viodt . ρˆod

These can be estimated as regressions, yielding new unbiased estimates of factor-specific transaction

s ˆb costs τˆ and τ . Here I have to assume that inverse demand levels an (absorbed by the error terms viodt ) are uncorrelated with the cost factors Bod, which may appear to be a strong assumption but is also implicitly assumed in the main sections of this chapter. Finally, note that the full price gap as estimated in Section 1.5 (i.e., without adjusting prices for imperfect pass-through) identifies the sum of factor-specific transaction costs and factor-specific markups. Therefore, the difference between these two estimates identifies the factor-specific markups.

1.13.3 Empirical implementation

To bring the regressions from Step 1 and Step 2 to the data, I make several more choices. First, I coarsen the level of origin and destination od in order to conserve statistical power. Instead of estimating parameters at the consumer level, I use the level of groups defined by the interaction of hydrologic region and major cost factors (the ones for which I measure transaction costs). Regression 1.19 is more demanding of the data than other regressions in this chapter, as it requires estimating both an intercept and a slope (i.e., a fixed effect and a coefficient) for each origin-destination pair. If I continued to use consumer-level pairs, very few parameters would be identified, since most consumer pairs do not have repeated transactions over time. This coarser grouping of origins and destinations still allows markups to vary discontinuously across the cost factors as well as across geography. Of 1,190 possible group-pair cells, 48 have at least 3 observations (the minimum necessary to identify both point estimates and standard errors). Second, I instrument for marginal valuations using the leave-out mean of marginal valuations within each of these cells, estimating Equation 1.19 via two-stage least squares. This addresses the potential concern of mechanical correlation between prices and marginal valuations, since marginal valuations are constructed using within-consumer price data. It does not necessarily solve any other

84 potential endogeneity concerns. Unfortunately, I cannot use the same instrument as in estimating demand elasticities, namely fluctuations in water entitlements. Here, I need to estimate cell-specific parameters, but water entitlements have variation within only a subset of the cells. s b Third, I impute 1 for any parameters γˆod or γˆod that are unidentified, in order to still calculate s b values for ρˆod and ρˆod. That is, I assume complete pass-through, as in the case of perfect competition. This affects origin-destination cells that have only 1 or 2 observations, which comprise 15% of my dataset. s b Finally, in the second step, I drop observations for which the pass-through rate (ρˆod or ρˆod) is below 0.2 or greater than 2, following Atkin and Donaldson (2015). This ensures that the estimates are not driven by noisy outliers in estimated pass-through rates, which may be amplified because they appear in the denominator on the left-hand side. This sample restriction affects 19% of my dataset.

1.13.4 Results

s In Step 1, I estimate Equation 1.19 by two-stage least squares regression, obtain the estimates γˆod and b s b γˆod, and algebraically solve their definitions to obtain ρˆod and ρˆod. These are pass-through rates for sellers and buyers, estimated using time-series variation in regional marginal valuations. Figure 1-10 plots histograms of these estimated pass-through rates. Sellers’ pass-through rates cluster tightly around 1, which is consistent with models in which sellers are not able to successfully exercise market power, including the model of perfect competition in the main sections of this chapter. Buyers’ pass-through rates are more dispersed, with a mode below 1, suggesting that buyers may have some ability to exercise market power. (Note that depending on market structure, market power may be consistent with pass-through rates either below or above 1). In Step 2, I adjust raw prices for possible markups and markdowns and then re-estimate marginal transaction costs associated with the cost factors selected in Section 1.5. Under the assumptions of the model in this appendix, the adjusted prices reflect what prices would be in a partial-equilibrium counterfactual without markups or markdowns. The prior estimates of transaction costs (in Section 1.5) then may be biased, since they capture both marginal transaction costs as well as any differential markups or markdowns that vary with the cost factors (and presumably arise from the cost factors, since transaction costs reduce competition). In contrast, the estimates here account for markups and markdowns, allowing me to better isolate the marginal transaction costs.

85 Table 1.9 shows the results of these marginal transaction cost regressions, from Equation 1.20. This table is identical to Table 1.3 from Section 1.5, except that the dependent variable is adjusted prices rather than raw prices. For sellers, adjusting for markups leaves the estimates essentially un- changed. For example, the coefficient on the aggregated cost factor variable (Panel A, column 4) decreases from 88.5 to 72.4, but this difference is well within the range of statistical uncertainty. Coefficients on the disaggregated cost factors are also individually similar to those in Table 1.3. For buyers, the overall pattern is similar as in Table 1.3, but the coefficients are considerably noisier. The point estimates are larger, suggesting that market power in this setting does not exacerbate marginal transaction costs, but rather acts to reduce the transaction costs associated with these cost factors. However, once again, these new estimates (using adjusted prices) are not statistically distinguishable from the prior estimates (which used raw prices). Overall, the results of this exercise suggest that market power does not explain the large marginal transaction costs I estimate in Section 1.5. After adjusting for possible markups, marginal transaction costs for sellers remain large and statistically distinguishable from zero. After adjusting for possi- ble markdowns, marginal transaction costs for buyers are no smaller; if anything they may even be larger than when estimated with raw prices. I conclude that the assumption of no exercise of market power, maintained throughout the main sections of this chapter, is reasonable, and that I can interpret marginal transaction costs as true deadweight loss rather than transfers between consumers.

86 Figure 1-10: Estimated pass-through rates

Estimated pass-through rates: Sellers

Estimated pass-through rates: Buyers

0 .5 1 1.5 2

Histograms of estimated pass-through rates for sellers and buyers. Each pass-through rate corresponds to a different origin-destination region pair, where regions are defined by the interaction of hydrologic region and major cost factors. Pass-through rates are derived from transaction-level regressions of prices on marginal valuations (as estimated in Step 2 of the main part of this chapter) in which both coefficients and fixed effects are estimated separately for each origin-destination pair. Each marginal valuation is instrumented with the leave- out regional mean of marginal valuations, meaning the coefficients are identified using time-series variation in regional marginal valuations.

87 Table 1.9: Marginal transaction costs, adjusting for incomplete pass-through

Dependent variable: SELLERS (positive coefficients are costly) BUYERS (negative coefficients are costly) Total Cost Price ($/acre-foot) Panel A: Aggregated cost factors (1) (2) (3) (4) (5) (6) (7) (8) (9) Additional regulatory review 30.8 43.9 * 42.5 72.4 *** -273 -594 -1103 * -1132 1204 * (43.6) (22.6) (30.8) (25.7) (258) (512) (576) (717) (717) Distance (km) -5.21 -3.44 * -0.97 -5.00 -3.51 -17.39 ** -33.41 *** -29.21 ** 24.21 ** (3.94) (1.98) (2.20) (3.13) (12.21) (8.46) (9.27) (11.87) (12.28) Cross-sector transfer (ag-to-urban) -9.4 * -9.0 *** 6.5 -6.3 17.9 -2.3 18.3 * 18.8 -25.1 * (4.8) (3.0) (4.4) (4.0) (16.2) (11.6) (10.6) (13.4) (14.0) Seller's region by year FE     Seller's subregion by year FE   Seller FE   Seller by year FE  Buyer's region by year FE     Buyer's subregion by year FE   Buyer FE   Buyer by year FE  Observations 1,971 1,971 1,971 1,971 1,823 1,823 1,823 1,823 Clusters 125 125 125 125 127 127 127 127

Panel B: Disaggregated cost factors (1) (2) (3) (4) (5) (6) (7) (8) (9) Delta crossing only 135 *** 130 *** 108 ** 195 *** -716 * -914 ** -1681 *** -1740 *** 1934 *** (36) (48) (47) (49) (406) (460) (559) (636) (638) SWRCB review only -26 35 *** 14 46 *** 18 144 -478 -262 308 (26) (12) (26) (13) (386) (658) (739) (1173) (1173) DWR review only -37 -56 * 212 10 120 -153 -498 -68 79 (39) (34) (195) (33) (396) (211) (350) (111) (116) USBR review only -91 -56 -541 -3 91 126 -454 -237 233 (105) (36) (500) (11) (329) (602) (707) (1109) (1109) Delta crossing + SWRCB review - - - - -227 -986 -1265 ** -1383 * - - - - - (496) (595) (625) (719) - Delta crossing + DWR review 165 *** 105 ** 129 *** 151 *** - - - - - (37) (43) (46) (50) - - - - - Delta crossing + USBR review 1 51 * 94 * 195 *** -335 -875 * -1244 ** -1532 ** 1727 ** (32) (30) (48) (72) (338) (487) (565) (679) (683) County review 83.2 * -24.5 * -26.8 -30.4 ** 34.1 -8.8 -25.3 -27.0 -3.4 (43.2) (14.5) (19.1) (14.1) (80.4) (64.0) (73.8) (64.5) (66.0) Distance (km) -0.079 -0.006 0.047 0.002 -0.103 -0.524 -0.677 -0.326 0.328 (0.074) (0.037) (0.040) (0.056) (0.404) (0.355) (0.579) (0.591) (0.593) Seller's region by year FE     Seller's subregion by year FE   Seller FE   Seller by year FE  Buyer's region by year FE     Buyer's subregion by year FE   Buyer FE   Buyer by year FE  Observations 2,037 2,037 2,037 2,037 1,834 1,834 1,834 1,834 Clusters 181 181 181 181 130 130 130 130

Regressions of transaction price on cost factors from Table 1.2, adjusting prices for incomplete pass-through (and therefore for some possible forms of market power). The left side (columns 1-4) include seller-side fixed effects, comparing prices across buyers. Positive coefficients reflect a price premium, indicating that the cost factor is costly to the seller. The right side (columns 5-8) include buyer-side fixed effects, comparing prices across sellers. Negative coefficients reflect a price discount, indicating that the cost factor is costly to the buyer. Column (9) calculates the linear combination of the absolute value of (4) and (8), the preferred specifications. Standard errors are shown in parentheses and clustered by region X year. Region and subregion are hydrological region and planning subarea as defined by the California Department of Water Resources. All regressions also include comparison-sample indicator variables. * p<.1, ** p<.05, *** p<.01.

88 1.14 Appendix: Nash-in-Nash Bargaining as an Alternative Model

The model in the main section of this chapter relies on an assumption of perfect competition among intermediaries that trade across water districts. In this section, I propose a model with alternative foundations that may be more realistic. I show that, under certain conditions, this model yields approximately the same results on equilibrium prices as I use in my empirical approach. In this model, a finite number of districts engage in simultaneous bilateral negotiations to trade endowments of a homogeneous good. Both buyers and sellers incur nominal transaction costs that may be pair-specific and asymmetric. The solution concept, as in other bargaining models in the industrial organization literature (e.g., Ho and Lee 2017), is a Nash equilibrium in Nash bargaining solutions.

1.14.1 Basic bilateral negotiation

I begin with a general bargaining model. Here I make no assumptions on how negotiations proceed; the goal is simply to characterize the set of transaction prices that are individually rational.

An economy has N districts indexed by n who each have initial endowment En of water, a single homogeneous good. Each district has a total valuation function Vn(Qn) that is positive, increasing,

concave, and differentiable. The derivative of total valuation, Vn(Qn) ≡ dVn(Qn)/dQn gives inverse

demand, or marginal valuation at a given quantity demanded Qn. Any pair of districts can trade

water with each other; a transaction consists of quantity qod (sold from origin o to destination d) and s per-unit price Pod. In trading, districts incur pair-specific, asymmetric transaction costs: Cod(qod) in b selling and Cod(qod) in buying. These cost functions are weakly positive, weakly increasing, and differentiable.

Consider a negotiation between a single pair of districts, holding fixed Qn (the outcomes of all other negotiations). Net profits for a seller o and a buyer d are the additional profits obtained by a given transaction:

s s πod ≡ Podqod + Vo(Qo − qod) − Vo(Qo) −Cod(qod) (1.21) b b πod ≡ −Podqod + Vd(Qd + qod) − Vd(Qd) −Cod(qod).

A transaction is individually rational if and only if net profits are positive for both seller and

89 s b buyer: πod ≥ 0, πod ≥ 0. Rearranged, these conditions are:

s  s  Podqod ≥ WTAod ≡ Vo(Qo) − Vo(Qo − qod) −Cod(qod) (1.22) b  b  Podqod ≤ WTPod ≡ Vd(Qd + qod) −Cod(qod) − Vd(Qd).

Combining these, the amount transferred must fall between the seller’s willingness to accept and the s b buyer’s willingness to pay: WTAod ≤ Podqod ≤ WTPod. Without loss of generality, I parameterize

the proportion of bilateral surplus that is obtained by the seller as σod ∈ [0,1]. Then, all individually s s rational prices meet the condition Podqod = σodWTPod + (1 − σod)WTAod, or

    Vo(Qo) − Vo(Qo − qod) Vd(Qd + qod) − Vd(Qd) Pod = (1 − σod) + σod qod qod  s   b  Cod(qod) Cod(qod) +(1 − σod) − σod . (1.23) qod qod

In other words, individually rational prices are a linear combination of average incremental valuations and average transaction costs.

1.14.2 Nash equilibrium in simultaneous bilateral negotiations

Next, I specify conditions of these bilateral negotiations that will permit an equilibrium in prices and quantities. Throughout, I rely on results from Björnerstedt and Stennek (2007). Each possible pair of districts in the economy engages in simultaneous Rubinstein-Stahl negotia- tions. Districts alternate bids in each round until a bid is accepted or negotation fails. Each bilateral negotiation is conducted by a separate delegated agent, who observes the entire history of bids and decisions up to that point but not bids in the same round. This setup does not literally describe the negotiation patterns in the California water market, where transactions may be negotiated sequentially rather than simultaneously, and negotiations are likely conducted with full knowledge of any other simultaneous negotiations. However, I believe it requires substantially weaker assumptions than the model in the main section of the chapter. In- stead of perfect competition among hypothetical intermediaries, this model allows bilateral oligopoly among districts, who conduct transactions directly. Collard-Wexler et al. (2017) relax the delegated- agent assumption, allowing consumers to use information across simultaneous negotiations, but their framework is not a close fit for my context, since it does not allow for endogenous trading-network

90 formation nor for pair-specific transaction costs. s s b b I again adopt Assumption 1, constant marginal transaction costs: Cod = τodqod and Cod = τodqod. Björnerstedt and Stennek (2007) generalize this slightly, but fixed costs still need to be small. Agents maximize the sum of their districts’ per-period profits, discounted by common factor δ, where per- period profits are:

s s Πod ≡ Vo(Eo − ∑qod) + ∑Podqod − ∑τodqod (1.24) d d d b b Πod ≡ Vd(Ed + ∑qod) − ∑Podqod − ∑τodqod. o o o

Given this setup, Proposition 1 of Björnerstedt and Stennek (2007) ensures two things. First, there exists a sequential equilibrium in which agents immediately agree on the vector of transaction quantities qod that maximizes bilateral surplus – i.e., the additional combined profit taking others as s b given, πod +πod. Taking the first-order condition shows that each pair of districts either does not trade

(qod = 0) or trades a positive amount implicitly defined by the condition

b s Vd(Qd + qod) − τod = Vo(Qo − qod) + τod. (1.25)

This result is familiar from this chapter’s main model; it is identical to Equation 1.3. It says that the quantity traded equates the marginal valuations of buyer and seller, up to transaction costs. Second, as the time between negotiation periods becomes small (δ → 1), prices give an equal split 1 of the bilateral surplus. That is, σod = 2 for all districts o and d. Finally, Proposition 2 of Björnerstedt and Stennek (2007) ensures that quantities in this equilibrium coincide with the Walrasian quantities; i.e., the quantities that would result if all districts were price-takers. These results also constitute a Nash equilibrium in Nash bargaining solutions; the outcome of each bilateral Rubinstein-Stahl negotiations (as the discount factor approaches 1) maximizes that pair’s bilateral Nash product, taking as given the outcomes of all other negotiations.

1.14.3 Small transaction quantities

With an equal split of surplus, prices in this equilibrium are given as

    1 Vo(Qo) − Vo(Qo − qod) 1 Vd(Qd + qod) − Vd(Qd) 1 s b Pod = + + (τod − τod). (1.26) 2 qod 2 qod 2

91 Note that the first two expressions in parentheses resemble the definition of a derivative. As individual transaction quantities become small relative to total quantity consumed (qod/Qn → 0), prices converge to12 s b Pod = Vo(Qo) + τod = Vd(Qd) − τod. (1.27)

Once again, prices equalize the seller’s marginal willingness to accept with the buyer’s marginal willingness to pay. This is exactly the same result as Equation 1.3 in the perfectly competitive model – the key identifying condition I use to estimate transaction costs and marginal valuations. With Equation 1.27, this appendix suggests that my overall empirical approach does not hinge on the strong assumptions of the perfectly competitive model from Section 1.3. Instead, my key revealed-preference condition can be viewed as the approximate outcome of a significantly more realistic model of bilateral bargaining. This holds as long as transaction quantities are small relative to total quantity consumed – which is true for the vast majority of water districts in California.

1.14.4 Fixed costs

With arbitrary fixed costs, the equilibrium results of Björnerstedt and Stennek (2007) do not neces- sarily hold. (See Appendix 1.13 for a model that admits arbitrary fixed costs.) However, the set of individually rational prices can still be characterized. s Instead of Assumption 1, I allow both constant marginal transaction costs and fixed costs: Cod = s s b b b τodqod + Fod and Cod = τodqod + Fod. Here, prices in observed transactions must meet the condition

    Vo(Qo) − Vo(Qo − qod) Vd(Qd + qod) − Vd(Qd) Pod = (1 − σod) + σod qod qod s b 1  s b  +(1 − σod)τod − σodτod + (1 − σod)Fod − σodFod . (1.28) qod for some value of σod ∈ [0,1].

As transaction quantities become small relative to total quantity consumed (qod/Qn → 0), assum-

12This result is obtained from applying the definition of a derivative to Equation 1.26 and inserting Equation 1.25. It corresponds to Proposition 6 of Björnerstedt and Stennek (2007), which allows slightly more general transaction costs.

92 1 ing an equal split of surplus (σod = 2 ), prices converge to:

s 1 s b  Pod = Vo(Qo) + τod + Fod − Fod (1.29) 2qod b 1 s b  = Vd(Qd) − τod + Fod − Fod . 2qod

Quantities cannot become small relative to fixed costs, or the transaction would become too ex- pensive. However, it is entirely possible that fixed costs are non-negligible yet individual transaction quantities are still small relative to overall quantities. This result modifies Equation 1.27 under some conditions to account for fixed costs. What affects prices is not the absolute size of fixed costs, but the difference between the fixed costs incurred by buyer and seller. If the buyer and seller incur dramatically different fixed costs, prices look quite different from Equation 1.27. However, if both bargaining power and the size of fixed costs are similar between buyer and seller, prices are approximately the same as without fixed costs.

93 1.15 Appendix: Supplementary Tables

Table 1.10: Marginal transaction costs: robustness checks

Dependent variable: SELLERS (positive coefficients are costly) BUYERS (negative coefficients are costly) Price ($/acre-foot) Panel A: Aggregated cost factors (1) (2) (3) (4) (5) (6) (7) (8) Additional regulatory review 91.2 *** 87.6 *** 92.9 *** 92.6 *** -161.1 ** -163.1 ** -157.5 -166.8 ** (31.8) (30.1) (31.5) (32.5) (65.0) (64.7) (96.7) (76.2) Distance (km) -0.026 -0.012 -0.037 -0.040 -0.067 -0.067 -0.065 -0.007 (0.045) (0.037) (0.046) (0.064) (0.077) (0.078) (0.086) (0.088) Sample All All All Surface All All All Surface Volume (natural log points)   Seller by year FE     Seller by year by month FE  Buyer by year FE     Buyer by year by month FE  Observations 2,584 2,584 2,584 507 2,727 2,727 2,727 634 Clusters 177 177 177 110 192 192 192 119

Panel B: Disaggregated cost factors (1) (2) (3) (4) (5) (6) (7) (8) Delta crossing only 258.1 *** 246.2 *** 338.0 *** 269.1 *** -65.4 -68.7 -151.4 ** -76.7 * (58.0) (54.8) (34.3) (62.3) (42.9) (42.1) (69.8) (41.2) SWRCB review only 58.0 *** 57.6 *** 51.6 *** 58.9 *** -136.8 * -142.9 * -169.1 -129.9 (15.2) (15.0) (13.9) (15.5) (80.3) (80.7) (110.6) (84.5) DWR review only 21.3 15.7 -4.7 23.0 -385.3 *** -384.2 *** - -521.5 *** (33.3) (30.7) (26.6) (34.6) (111.9) (108.6) - (16.3) USBR review only -8.7 -9.4 -0.4 -10.2 -125.4 -125.6 -157.8 -125.1 (11.2) (10.3) (9.0) (11.7) (77.3) (78.2) (107.1) (81.8) Delta crossing + SWRCB review - - - - -72.7 -67.9 -88.8 -67.7 - - - - (51.3) (51.5) (83.7) (53.4) Delta crossing + DWR review 183.5 *** 179.9 *** 289.6 *** 182.4 *** - - - - (65.0) (62.9) (39.3) (68.1) - - - - Delta crossing + USBR review 217.6 *** 212.7 *** 188.0 *** 219.8 *** -114.4 ** -120.3 ** -121.2 -80.6 (77.6) (74.0) (56.7) (81.8) (51.6) (51.2) (85.6) (51.2) County review 1.1 1.0 -3.4 3.6 21.7 ** 21.4 ** 20.1 ** 17.0 * (7.4) (7.2) (5.9) (8.1) (9.3) (9.7) (8.0) (9.4) Distance (km) -0.059 -0.043 -0.072 -0.094 -0.160 -0.168 * -0.199 -0.131 (0.045) (0.037) (0.045) (0.064) (0.101) (0.100) (0.127) (0.097) Sample All All All Surface All All All Surface Volume (natural log points)   Seller by year FE    Seller by year by month FE  Buyer by year FE    Buyer by year by month FE  Observations 2,584 2,584 2,584 507 2,727 2,727 2,727 634 Clusters 177 177 177 110 192 192 192 119

Regressions of transaction price on cost factors from Table 1.2. The left side (columns 1-4) include seller-side fixed effects, comparing prices across buyers. Positive coefficients reflect a price premium, indicating that the cost factor is costly to the seller. The right side (columns 5-8) include buyer-side fixed effects, comparing prices across sellers. Negative coefficients reflect a price discount, indicating that the cost factor is costly to the buyer. Columns 1 and 5 repeat the preferred specifications from Table 1.3 for reference. Columns 2 and 6 control for volume, columns 3 and 7 control for district-by-month fixed effects, and columns 4 and 8 exclude transactions of water originating in adjudicated groundwater rights. Standard errors are shown in parentheses and clustered by region X year. Region and subregion are hydrological region and planning subarea as defined by the California Department of Water Resources. All regressions also include comparison-sample indicator variables. * p<.1, ** p<.05, *** p<.01.

94 Table 1.11: Demand elasticities: further regressions

Panel A: Reduced Form Ln (Price) Agriculture Urban (1) (2) (3) (4) (5) (6) Ln(Entitlements) -0.68 -0.79 -0.81 -1.19 -0.73 -0.40 (0.22) (0.20) (0.18) (0.29) (0.22) (0.20) Cost factors X Direction       Year fixed effects       Market fixed effects       Subregion fixed effects   District fixed effects   Observations 2,421 2,421 2,421 2,547 2,547 2,547 Clusters 204 204 204 157 157 157 Panel B: OLS Ln (Price) Agriculture Urban Ln(Quantity consumed) -0.86 -0.91 -0.94 -0.67 -0.32 -0.07 0.24 0.22 0.19 0.20 0.19 0.14 Cost factors X Direction       Year fixed effects       Market fixed effects       Subregion fixed effects   District fixed effects   Observations 2,421 2,421 2,421 2,547 2,547 2,547 Clusters 204 204 204 157 157 157 Panel C: Marginal Valuations (2SLS) Ln (Marginal valuation) Agriculture Urban Ln(Quantity consumed) -0.34 -0.44 -0.77 0.00 0.10 -0.03 (0.20) (0.16) (0.15) (0.39) (0.38) (0.35) Year fixed effects       Market fixed effects       Subregion fixed effects   District fixed effects   Observations 4,112 4,112 4,112 3,851 3,851 3,851 Clusters 237 237 237 183 183 183 Panel D: Alternative Specification with Marginal Valuations (2SLS) Ln (Marginal valuation) Agriculture Urban Ln(Quantity consumed) -0.68 -0.73 -1.05 -0.52 -0.52 -0.52 (0.15) (0.13) (0.18) (0.36) (0.46) (0.52) Other-market entitlements       Market fixed effects       Market-specific time trends       Subregion FE & time trends   District FE & time trends   Observations 4,112 4,112 4,112 3,851 3,851 3,851 Clusters 237 237 237 183 183 183

Dependent variable is listed immediately below each panel title. Observations are transactions; quantities and entitlements are market-level sums per year. Panel A reports reduced-form regressions of prices on market-level entitlements. Panel B reports OLS regressions of prices on quantity consumed, without instrumenting. Panels C and D report two-stage least squares regressions of estimated marginal valuations (rather than prices) on quantity consumed, instrumenting with entitlements. Panel D omits year fixed effects in favor of an average of other-market entitlements (weighted in proportion to each market pair’s total cross-market trading volume) and market-specific time trends. Panels A and B include binary indicators for the cost factors from Table 1.2, interacted with transaction direction (i.e., whether buying or selling). Markets are supersets of entitlement types (i.e., distinct values of the instrument). Standard errors (in parentheses) are clustered by market X year, the level of variation in both the instrument and the endogenous variable. Subregion is planning subarea as defined by the California Department of Water Resources.

95 1.16 Data Appendix

1.16.1 Transactions

WestWater Research, LLC provided a dataset containing 6,263 water transactions in California be- tween 1990 and 2015. Variables include transaction date, volume, price, and duration; and name, latitude and longitude, and water use category of both origin and destination parties. I focus on the spot market, which I define as transactions that are initiated, delivered, and completed within one year. I drop multi-year leases and permanent transfers, leaving 4,906 spot market transactions. Of these, prices are available for 4,415.

Cleaning. I calculate price per acre-foot and deflate to 2010 dollars using the CPI. I reshape the data so there is one observation per party per transaction, creating 13,328 observations. For transactions with multiple buyers or multiple sellers, if more specific information is not available, I assume transaction volume is divided equally across parties.

Location. I classify all observations into one of 10 hydrologic regions (defined by California’s Department of Water Resources [DWR]). When possible, I also generate latitude and longitude co- ordinates. I first attempt to use centroids from the user location file (see Section 1.16.5), matching parties to users via the crosswalk file (see Section 1.16.4). This matches 6,221 transaction-by-party observations. Second, I manually geolocate 65 users not appearing in the user location file but which are common in either transactions or entitlements data. For these users, I generate coordinates based on addresses, towns, or maps found via user websites and other publicly available documents; they match 180 additional observations. For remaining unmatched observations, I use location information from the original WestWater dataset. This assigns hydrologic region for all remaining observations and coordinates for 3,721 additional observations. This process leaves 3,206 observations for which location coordinates are unavailable. Finally, I use a user-written Stata command, -geoinpoly-, to lo- cate coordinates within 8-digit watershed (hydrological unit code, as defined by the U.S. Geological Survey), sub-sub-region (detailed analysis unit, as defined by DWR), and county. These shapefiles are available from DWR’s California Water Plan: http://www.water.ca.gov/waterplan/gis/index.cfm.

96 Sector. I classify all parties into one of three sectors: agriculture, urban/municipal, or environ- mental. I use the first successful method in the following order of priority:

1. Total historical entitlements (21% of observations). If the party appears in the entitlements dataset (see Section 1.16.2), I assign to agriculture or municipal depending on which sector receives a majority of total historical entitlements.

2. Project agencies (3%). I classify the DWR (which runs the State Water Project) and US. Bu- reau of Reclamation (USBR, which runs the Central Valley Project) as environmental, because they either devote the water to environmental flows or act as intermediaries. This essentially excludes them from sector-wise analysis.

3. Keywords (41%). I classify users based on the following keywords in their name. Agriculture: almonds, citrus, contractors, dairies, dairy, famers, family, farm, farmers, farming, grower, irrigating, irrigation, nurseries, nursery, orchard, ranch, river interests, trust. Municipal: arch- bishop, automobile, cement, cemetery, chemical, Chevron, church, city of, cold storage, col- lege, communities, community services, companies, company, container, corporations, country club, developer, development, electric, energy, estate, foods, gardens, golf, gravel, homeown- ers, homes, housing, inc., Indians, industries, investment, K.O.A., landscaper, leasing, LLC, LP, military, mobile home, monastery, motor, municipal, mutual water, non-ag, oil, owners, park, paving, power authority, properties, property, railway, real estate, realty, recycled, re- fining, retail, rock, sanitation, school, speedway, Texaco, town of, tribe, university, ventures. Environmental: conservancy, duck hunting, ducks, fish & wildlife, forest service, forestry & fire prevention, water bank.

4. Original use categories (23%). I apply WestWater’s original water use categories, based on agriculture, irrigation, and environmental, counting all other categories as urban.

5. Individual names (10%). I assume names of individual people are farmers and therefore agri- cultural.

6. Remainder (2%). I assume all remaining observations are urban.

97 1.16.2 Entitlements and Deliveries

I compile the universe of surface water entitlements (often called allocations) in California by user and year. These entitlements are from a combination of four water sources: Central Valley Project (CVP) allocations, State Water Project (SWP) allocations, and Lower Colorado Project entitlements, and water rights. All portions of entitlements are constructed by multiplying a time-invariant baseline maximum entitlement by a year-varying allocation percentage. This ensures that when using fixed effects, all identifying variation comes from the allocation percentages rather than changing maxi- mum entitlements. For water rights and the Lower Colorado Project, these allocation percentages are simply 1. For the CVP, allocation percentages vary across both years and contract types; for the SWP, allocation percentages vary only across year.

Central Valley Project (CVP)

CVP entitlements are constructed by multiplying present maximum contract volume by yearly per- centage allocations for each user, sector, and contract category. Entitlements are summed across contract categories within user and sector. Maximum contract volumes are downloaded from the U.S. Bureau of Reclamation (USBR) at https://www.usbr.gov/mp/cvp-water/water-contractors.html. I sum the volume of contracts by user, sector (municipal & industrial vs. agricultural), percentage-allocation category (see below), and project vs. base supply. Base supply is contracts for delivery of water based on water rights pre- dating the CVP, while project supply is contracts for delivery of new water made available by the CVP. Percentage allocations by year are downloaded from the USBR at https://www.usbr.gov/mp/cvo/ vungvari/water_allocations_historical.pdf. They are available for each contract year (the 12 months from March of the named year through February of following year) from 1977 to the present. Percent- age allocations are determined separately for each of 14 contract categories (North of Delta Agricul- tural Contractors, North of Delta Urban Contractors (M&I), North of Delta Wildlife Refuges, North of Delta Settlement Contractors/Water Rights, American River M&I Contractors, In Delta - Contra Costa, South of Delta Agricultural Contractors, South of Delta Urban Contractors (M&I), South of Delta Wildlife Refuges, South of Delta Settlement Contractors/Water Rights, Eastside Division Con- tractors, Friant - Class 1, Friant - Class 2, Friant - Hidden & Buchanan Units). Some categories are

98 combined in earlier years; when “American River M&I Contractors” and “In Delta – Contra Costa” are not specified separately, I impute the value for “North of Delta Urban Contractors”. Maximum contract volumes and percentage allocations are merged on contract names via the crosswalk file (see Section 1.16.4). Relative to other sources of entitlements data, CVP entitlements are offset by two months because they are based on the contract year (Mar-Feb) rather than calendar year. However, only 4.7% of water deliveries occur in January and February, so the difference is small.

State Water Project (SWP)

SWP entitlements are constructed by multiplying maximum contract amount in 1990 by yearly percentage allocations. I choose 1990 because that is the first year all now-existing sections of the SWP were completed, and maximum Table A amounts stabilized. Maximum contract (Ta- ble A) amounts are taken from Table B-4 of Bulletin 132-16, downloaded from the DWR at http: //www.water.ca.gov/swpao/bulletin_appendix_b.cfm. They are available by user, sector, and year from 1962 through 2016. Percentage allocations by year and sector are available from 1970 through 2017. For 1996-2017 they are taken from published notices to SWP contractors, downloaded from the DWR at http://www.water.ca.gov/swpao/deliveries.cfm. For 1970-1995 they are taken from Table 2-3 of the Monterey Plus Draft Environmental Impact Report, downloaded from http://www.water. ca.gov/environmentalservices/monterey_plus.cfm.

Lower Colorado Project

Lower Colorado Project entitlements are constructed from lists of Lower Colorado River water entitle- ments in California, downloaded from https://www.usbr.gov/lc/region/g4000/contracts/entitlements. html, and Appendix E of the Final Environmental Impact Statement of 2007 for the Colorado River Interim Guidelines for Lower Basin Shortages and Coordinated Operations for Lakes Powell and Mead, downloaded from https://www.usbr.gov/lc/region/programs/strategies/FEIS/index.html. Post- 2007 entitlements are used as baseline. Percentage allocations do not exist because a shortage has never been declared on the Colorado River.

99 Surface Water Rights

Surface water rights are not directly measured, so I construct entitlements from Statements of Diver- sion and Use and annual reports on file with the State Water Resources Control Board (SWRCB). Water rights rarely change nor are curtailed, so I treat them as permanent, fixed entitlements.

Reporting. Users holding post-1914 appropriative rights are required to submit annual reports of use. Riparian and pre-1914 appropriative rights were not systematically tracked by any government agency prior to 2010. However, since 2010 these rights holders must submit Statements of Diversion & Use, with civil penalties for noncompliance. From 2010 to 2016 this reporting requirement was once every three years; since 2016 rights holders must report every year. This means from 2015 onward, the SWRCB had at least one report of quantity diverted of nearly every water right claimed in California. Although these diversion statements are self-reported, it is reasonable to treat them as the full, legally defensible value of present water rights. This is because appropriative rights are based on documented continuous beneficial use, and these statements are public information, so they could be used in future legal disputes. Therefore, users have incentives to neither report less than they would like to use in the future nor more than other evidence would support.

Data. All of SWRCB’s records – water right permits, licenses, and Statements of Diversion & Use – are publicly available in the SWRCB’s Electronic Water Rights Information Management Sys- tem (eWRIMS). The online eWRIMS interface makes it difficult to view or download details for many records at once. Instead, I use a full extraction of the eWRIMS database, as of February 26, 2015, posted online as an exhibit in a 2016 administrative civil liability hearing for the Byron-Bethany Irrigation District. This was downloaded from http://www.waterboards.ca.gov/waterrights/water_ issues/programs/hearings/byron_bethany/docs/exhibits/pt/wr70.xlsx. Variables include: amount di- verted for select years from 2010 through 2014, face value of rights (for post-1914 rights), types of beneficial uses, status of right, year of first diversion, county, HUC12, and latitude & longitude of POD.

Cleaning. I follow the data cleaning and quality control procedures described by SWRCB in another exhibit (“Exhibit WR-11: Testimony of Jeffrey Yeazell”, http://www.waterboards.ca.gov/

100 waterrights/water_issues/programs/hearings/byron_bethany/docs/exhibits/pt/wr11.pdf), adding a num- ber of further checks and corrections. I drop rights that are canceled, inactive, removed, or revoked, and those not yet active, and minor types of water rights (such as stock ponds and livestock), leaving only appropriative rights and statements of diversion and use. The dataset has 95,535 observations at the level of right by point of diversion (POD) by beneficial use type, with a few duplicates. I drop duplicate observations so that the combination of these three variables form a unique key, then I reshape to the level of right by point of diversion, resulting in a dataset of 56,508 observations. For rights with multiple PODs, I keep only one so that a right is a unique record. SWRCB chooses the POD by alphabetical order on watershed name; I instead choose the POD from the watershed, source within watershed, and 12-digit hydrologic unit within source with the most PODs for that right; if there duplicates within 12-digit hydrologic unit, I keep the POD with the earliest number. To construct the year a right first began, I use the year of first use when available (nearly all pre-1914 and riparian rights holders, and some post-1914 rights holders), followed by original permit issue date when available, license original issue date when available, and record status year when available. To construct the year a right ended, I take the first year a right was canceled, closed, inactive, rejected, or revoked. I remove non-consumptive diversions by power-only and aquaculture-only, following SWRCB procedure. For rights that report no diversion to storage, I set diversions to zero. For diversions that do report diversion to storage, I subtract the amount used from the amount diverted, censoring negative values at zero. I correct for over-reporting, following SWRCB procedure. For post-1914 rights, most observa- tions include the face value of the rights, so if reported diversions in a year exceed the face value, I scale down that year’s monthly reports so their total equals the face value. For pre-1914 and riparian rights, face value is not available but some report irrigated acres, so if reported diversions exceed 8 acre-feet per acre, I scale down that year’s monthly reports so their total equals this limit. I add one more correction not performed by SWRCB: For post-1914 rights for which the face value is unavail- able but irrigated acres is available, I apply the same acres-based correction, but conservatively only for observations whose total diversions exceed 80 acre-feet per acre. I make further corrections to high outliers in a process not separately conducted by SWRCB. Many of these are likely errors in unit selection; there may also be low outliers, but I cannot detect

101 them effectively. I calculate the standard deviation of the natural log of all monthly diversion values. For observations for which this standard deviation is greater than 2 and the average annual diversion exceeds the face value of the rights by more than 100 acre-feet, for years in which the total diversion exceed the smallest annual total by more than 100 times, I scale down each monthly value propor- tionally so that that year’s total equals the smallest annual total. Although this correction process affects only 82 observations, it changes the total statewide reported diversions by more than 12 orders of magnitude. I also drop one riparian right held by an individual that implausibly reports an annual diversion of more than 100,000 acre-feet.

Further sample restrictions. I drop water held by federal and state projects (which are added to the entitlements dataset separately). I drop non-consumptive rights: those whose beneficial use is aesthetic, aquaculture, fish & wildlife, incidental power, power, recreational, or snow-making; sev- eral known environmental or recreational users (California Department of Fish & Wildlife; California Department of Forestry & Fire Prevention; California Department of Parks & Recreation; Nature Conservancy; Pine Mountain Lake Association; Tuscany Research; U.S. Bureau of Land Manage- ment; U.S. National Park Service; U.S. Forest Service; U.S. Department of Fish & Wildlife; White Mallard, Inc.; Woody’s on the River, LLC); two known electricity-generating users (Pacific Gas & Electric Co., Southern California Edison Company); and those whose name includes one of several keywords (duck club, gun club, power, preservation, shooting club, waterfowl, wetlands). I drop a small number of rights (157) with no location information available.

Sector. I categorize each right as agricultural or municipal based on whether its record lists irrigation or stockwatering as a beneficial use. I then set to municipal all users whose names include “city of” or “golf”, and several known municipal users in Orange County (Irvine Ranch W.D., Orange County W.D., Santa Margarita W.D., Serrano W.D.).

Final variables. For each right, I average across reported annual diversions from 2010 through 2014. I then sum across rights within user and sector, keeping location information for the point of delivery with the largest volume. For each year from 1980 through 2015, I calculate the sum for rights that were held and active in that year. I use this sum as of 1990 for the baseline value, since that is the start of the water transactions dataset. Finally, CVP settlement and exchange contractors may have

102 the same rights counted in both CVP and rights dataset. So as not to double count these, I subtract the maximum contract volume for base supply from their rights volumes.

Entitlements datasets

I combine these four sources to create three datasets of entitlements: user-by-year, market-by-year, and polygon-by-year.

User-by-year. I merge all sources on name and year, matching users via the crosswalk file (see Section 1.16.4), and sum entitlements across sources. I restrict to 1981-2015, the years for which entitlements are available from all four sources. This results in a nearly balanced panel of 7,168 users over 35 years, for a total of 250,129 observations. For locations, I use centroids from the user location file (see Section 1.16.5). For users not available in the user location file, I use the point of diversion listed in the rights dataset, assuming the place of use is in the same region as the point of diversion. For most users this is reasonable, but for some appropriative rights holders with large internal distribution systems, this could introduce substantial error. Unfortunately there is no way to systematically identify and correct these. For users not available in either the user location file or rights dataset, I merge to a dataset of 65 manually geolocated users. For these users, I generate coordinates based on addresses, towns, or maps found via user websites and other publicly available documents.

Market-by-year. I define 14 “markets” by geography and project status: SWP, CVP North, CVP East, CVP South, and non-project users in each of the 10 hydrologic regions. These are su- persets of contract types and coincide with several major transaction cost factors. For the few users that hold both SWP and CVP contracts, I choose the project from which the user draws the larger maximum entitlement. CVP North is defined by users north of the Sacramento–San Joaquin Delta, CVP East is users south of the Delta and east of the San Joaquin River, and CVP South is users south of the Delta and west of the San Joaquin River. Starting with the user-by-year dataset, I drop 408 observations without location information, which represent less than two-thousandths of a percent of total entitlements. I then sum across users within market and year.

103 Polygon-by-year. For agricultural analysis, I calculate per-acre entitlements for each of many overlapping polygons, comprised of the user locations file (see Section 1.16.5) and 8-digit water- sheds (“HUC8”, hydrological unit code as defined by the U.S. Geological Survey). Starting with the user-by-year dataset, I match user entitlements to their own polygons when possible, and calculate entitlements and deliveries per acre of cropland within shape. When user polygons are not available, I aggregate users within 8-digit watersheds. (The users are mostly rights-holders only.) This assumes the place of use is in the same region as the point of diversion – which is reasonable for most users but could introduce substantial error for some appropriative rights holders with large private distribution systems; unfortunately there is no way to systematically identify and correct these. It also introduces measurement error via averaging; while most irrigation districts do typically allocate deliveries across retail customers on a per-acre basis, individual rights-holders may not be spread out so evenly. How- ever, this problem is minor for my empirical analyses because I use water rights only for the overall level of water availability, not for identifying variation.

1.16.3 Quantity consumed

In some analyses I use market-level quantity consumed as an endogenous variable. (For definition of market, see “Entitlements datasets” above.) I do not measure water consumption directly but rather construct it from other datasets. I sum, by market and sector: (1) entitlements, (2) quantity purchased minus quantity sold, from transactions data, and (3) mean groundwater supply. Groundwater consumption is not directly measured or monitored at all in California except in certain local areas. However, the DWR estimates groundwater supplies in several publications. I use average annual groundwater supply for 2005-2010 by hydrologic region, subregion (planning area), and sector, from the Volume 2 Regional Reports of DWR’s California Water Plan Update 2013. I sum across planning areas within hydrologic region and sector. This measure of groundwater consumption does not capture changes in consumption from year to year, but my primary purpose is to accurately set the overall log-level of water consumption.

1.16.4 Crosswalk file

I create a crosswalk dataset that links water users by name across all other datasets used in this chapter. To create it, I export raw names from each dataset and append them together. I strip punctuation

104 and correct misspellings and other typos. I standardize common terms into acronyms (e.g., I.D. for irrigation district; M.W.C. for mutual water company; F.C.W.C.D. for flood control and water conservation district). For names of individual people, I match full names to entries with the same last name but only first initial(s) available. For agencies, when names are closely but not precisely similar I use agency websites and other publicly available documents to determine whether (a) one agency has changed its name, (b) one name is erroneous, or (c) they are indeed distinct agencies. I use footnotes and notes in original data sources to link users with name changes over time, keeping the most recent name. When a merger has occurred, I roll users up into the most aggregate version to maintain consistent definitions. The exception is companies with service in noncontiguous locations, for which I treat each location as a separate user. The final crosswalk file has 28,765 entries (input names) pointing to 14,830 targets (output names).

1.16.5 User location file

By combining all the relevant and publicly available georeferenced digital maps I can find, I create a dataset of the most accurate locations, areas, and boundaries for as many water users as possible. I combine the following datasets and link them via the crosswalk file. For each user, I keep one shape (feature) according to the following priority order:

1. DWR’s Water Districts Boundaries, downloaded via the Query link found at https://gis.water. ca.gov/arcgis/rest/services/Boundaries/WaterDistricts/FeatureServer.

2. Federal, State, and Private Water Districts shapefiles maintained by USBR and DWR, down- loaded from the California Atlas at http://www.atlas.ca.gov/download.html.

3. Mojave Water Agency Water Companies, downloaded at https://www.mojavewater.org/geospatial-library. html.

4. California Environmental Health Tracking Program’s Water Boundary Tool, downloaded at http://www.cehtp.org/page/water/download.

Before I append (merge) sources, I combine noncontiguous shapes for the same user (dissolve to create multipart features). After selecting one shape per user, I calculate the user’s centroid (restricted to within shape), area, and cropland area (via zonal statistics). The latter uses the 2015 cropland mask from the USDA’s Cropland Data Layer. I also construct versions of the user location file in which

105 user shapes are interacted with either 8-digit watershed (hydrological unit code, as defined by the U.S. Geological Survey) or the intersection of sub-sub-region (detailed analysis unit, as defined by DWR) and county. These shapefiles are available from DWR’s California Water Plan: http://www.water.ca. gov/waterplan/gis/index.cfm.

106 Chapter 2

The Costs of Industrial Water Pollution to Agriculture in India

2.1 Introduction

Pollution levels in developing countries are often orders of magnitude larger than in developed coun- tries. Simple linear extrapolation suggests that the costs to health, productivity, compensatory behav- ior, and ecology are relatively more important in developing countries. Moreover, there is evidence that costs from pollution may be nonlinear, with marginal costs increasing in pollution levels (Arceo et al. 2016). Unfortunately, most quasi-experimental evidence on the costs of pollution comes from developed countries, with little basis to extrapolate to developing settings. More specifically, air pol- lution has enjoyed growing academic interest and exploding public interest in developing countries, but water pollution, while also high, has received less attention. Public pressure in India has brought down air pollution levels over the last few decades (to still-extreme levels by developed country stan- dards), but similarly strict regulation has not discernibly improved water quality (Greenstone and Hanna 2014). In this chapter I estimate the costs of industrial water pollution to agriculture in India, taking advantage of an unusally rich dataset on water pollution obtained and made public by Greenstone and Hanna (2014). I focus on 63 industrial sites identified by the Central Pollution Control Board in 2009 as “severely polluted” with respect to water pollution, out of 88 sites selected for intensive study. Unlike other forms of pollution, water pollution almost always flows in only one direction from its source. My empirical approach exploits the fact that when industrial wastewater is released into a flowing river, it creates a spatial discontinuity in pollution concentrations along that

107 river. Areas immediately downstream of a heavily polluting industrial site will have higher pollution levels than areas immediately upstream, yet they are likely similar otherwise. This makes upstream areas a reasonable counterfactual for the downstream areas in studying the impacts of water pollution on economic outcomes. I show three sets of results. First, I show that there is a large, discontinuous increase in industrial pollution at the location of these “severely polluted” industrial sites which raises existing levels of pollution in the rivers by a factor of three to five. Second, I find that agricultural revenues in districts immediately downstream of these industrial sites are 9 percent lower than in upstream districts, with a 95% confidence interval of 2 to 17 percent. Third, I document that farmers are neither substituting factors of production away from agriculture nor applying additional compensating inputs; the effect of being downstream on yields is large and significant, while the effects on crop area, irrigation, fer- tilizer, labor, and population are small to zero. Two important caveats are that these effects are precise enough to exclude zero only when accounting for baseline characteristics, and that this approach can- not exclude the possibility that areas upstream and downstream of pollution sources differ in other systematic ways. Still, these results suggest that damages to agriculture may represent a major cost of water pollution and warrant further study. This chapter contributes to the existing literature in four ways. First, it provides among the first quasi-experimental evidence on any form of costs of industrial water pollution. In recent papers on India, Do et al. (2018) study the effects of industrial water pollution on infant mortality, while Brainerd and Menon (2014) study the effects of agricultural water pollution to child health. In the United States, Keiser and Shapiro (2017) study the effect of all water pollution on property values. Most other economics literature on the costs of water pollution deals with domestic water pollution in the context of providing clean drinking water. Second, this chapter documents economic costs of pollution to agriculture; to my knowledge only one other paper does so quasi-experimentally, but in the context of air pollution (Aragón and Rud 2016). Third, it adds to the small but rapidly growing literature on the costs of pollution in developing countries (Jayachandran 2009; Chen et al. 2013; Adhvaryu et al. 2016). Finally, this chapter contributes to a broader understanding of structural transformation and the relationship between industry and agriculture in developing countries. Most existing literature fo- cuses on input reallocation between sectors (Ghatak and Mookherjee 2014; Bustos et al. 2016), while this chapter is among the first studies to document a non-pecuniary externality from industry to agri-

108 culture.

2.2 Background

2.2.1 Industrial water pollution and crops

Manufacturing plants like those in India produce a variety of waste chemicals which, if untreated or insufficiently treated, will reach surface or ground water systems. These chemicals include or- ganic chemicals including petroleum products and chlorinated hydrocarbons; heavy metals including cadmium, lead, copper, mercury, selenium, and chromium; salts and other inorganic compounds and ions; and acidity or alkalinity. Many of these products are carcinogenic or otherwise toxic in sufficient quantities to humans and other plants and animals. Agricultural crops are no exception. Biologically, it is well known that plant growth is sensitive to salinity, pH (i.e., acidity and alkalinity), heavy metals, and toxic organic compounds. In addition, oil and grease can block soil interstices, interfering with the ability of roots to draw water (Scott et al. 2004). Chlorine in particular can cause leaf tip burn. Pollutants, especially heavy metals, harm by accumulating in the soil over long periods of time, but they can also harm directly through irrigation (Hussain et al. 2002). Agronomic field experiments confirm reduced yields and crop quality from irrigation with industrially polluted water. Experiments have found rice to have more damaged grains and disagreeable taste, wheat to have lower protein content, and in general, plant height, leaf area, and dry matter to be reduced (World Bank and State Environmental Protection Administration 2007). A few small case studies suggest that the findings of these field experiments extend to real-world settings. Reddy and Behera (2006) found an 88% decline in cultivated area in a village immedi- ately downstream of an industrial cluster in Andhra Pradesh, India. Lindhjem et al. (2007) found that farmland irrigated with wastewater had lower corn and wheat production quantity and quality in Shijiazhuang, Hebei Province, China. Khai and Yabe (2013) found that areas in Can Tho, Vietnam irrigated with industrially polluted water had 12 percent lower yields and 26 percent lower prof- its. History also suggests that crop loss from industrial water pollution is not unknown to farmers; Patancheru, Andhra Pradesh saw massive farmer protests and a grassroots lawsuit in the late 1980s (Murty and Kumar 2011). In contrast with industrial wasetewater, domestic or municipal wastewater can sometimes have positive effects on crop growth due to the nutrient value (Hussain et al. 2002). This is especially true

109 for treated municipal wastewater. However, undiluted untreated wastewater can in fact have levels of nitrogen, phosphorous, and potassium that are so high they harm crop growth, and it poses health risks to agricultural workers, potentially reducing labor supply.

2.2.2 Welfare measures

I estimate the effects of industrial water pollution on agricultural revenue. Revenue is not directly interpretable as welfare, but I argue it is a reasonable proxy given conceptual and data limitations. One candidate measure of rural welfare is agricultural profits: revenues net of costs. While some cost data is available, unfortunately it tends to be noisy and incomplete: for example, it fails to include the value of household labor. Estimating lost revenue will underestimate lost profits if farmers apply additional compensatory inputs, and it will overestimate lost welfare if farmers migrate or substitute other factors of production away from agriculture. However, I can measure many of these shifting factors, and I show these in one set of results. A second candidate measure of welfare is agricultural yields as measured in crop quantities, taking the view that price fluctuations in both agricultural inputs and outputs are largely just transfers between consumers and producers, and that the welfare of both matters. The claim is that, in the case of additional inputs, lost profits overestimates total lost welfare because, for example, purchasing more fertilizer increases the welfare of fertilizer producers. However, this is also incomplete, failing to consider the opportunity cost of input factors of production and the fact that consumer substitution from any single crop is positively elastic. Nevertheless, I also report estimates on agricultural yields in one set of results. A final candidate measure of welfare is land values or land rental rates. Unfortunately, systematic data do not exist on these with enough spatial coverage for my study. Even if they did, property values are a highly imperfect welfare measure in the presence of location preference heterogeneity, moving costs, or other imperfections in the land market which are likely high in India (Moretti 2011).

110 2.3 Data

2.3.1 Sources

Comprehensive Environmental Performance Index (CEPI)

The Central Pollution Control Board (CPCB) selected 88 industrial sites for detailed, long-term study in 2009. Names of these sites were taken from the CPCB document Comprehensive Environmental Assessment of Industrial Clusters (2009a). I identified the geolocation of each site using Google Earth and other publicly available reference information. These sites are displayed as red dots in Figure 2-1. The CPCB document also contains numerical scores for air, water, and land pollution, and an overall score, each out of 100. Details of the scoring methodology are provided in the companion document Criteria for Comprehensive Environmental Assessment of Industrial Clusters (2009b). The CPCB considers a site “severely polluted” if the score for a single pollution type exceeds 50, or if the overall score exceeds 60 (the overall score is a nonlinear combination of the component scores). Sixty-three sites received such a “severe” rating in water pollution in 2009; these constitute my sam- ple.

Pollution data

I use a rich dataset of water pollution measurements along rivers in India collected by the Central Pollution Control Board, as collected and made publicly available by Greenstone and Hanna (2014). This dataset includes 55,461 observations from 459 monitoring stations along 145 rivers between the years 1986 and 2005. The CPCB’s water quality monitoring network has been expanding rapidly and presently there are over 2,500 monitoring stations. In future work I plan to incorporate this additional data since 2005. The Greenstone & Hanna data include a noisy location measure as well as river name and a decsription of the sampling location. I identified, refined, or corrected the geolocation of each station using these variables, Google Earth, other CPCB documents, and other publicly available reference information. The locations of these stations are displayed as green dots in Figure 2-1. Many water quality parameters have been collected by the CPCB at some point. However, only a few parameters are measured consistently. I focus on chemical oxygen demand (COD), a standardized laboratory test that serves as an omnibus meaure of organic compounds, which factories typically

111 generate in high quantities. Along with the related but narrower test of biochemical oxygen demand (BOD), COD is the Indian government’s top priority in regulating industrial wastewater (Duflo et al. 2013). One disadvantage of focusing on COD is that it misses inorganic pollutants like heavy metals, which, physiologically, are leading candidates for harming crop growth. Unfortunately, heavy metals are measured in less than three percent of observations. Therefore my best option is to use COD to proxy for all industrial pollution. Another disadvantage to COD is that it is not an exclusive meausure of industrial pollution; it also can indicate the presence of domestic or municipal pollution (i.e., untreated ). Because many industrial sites are located in cities, they may be coincident with domestic pollution, confounding the interpretation of my results as being driven by industrial pollution. However, another consistently measured water quality indicator is fecal coliforms, a count of the number of certain types of bacteria that originate from human waste. Because fecal coliforms are overwhelmingly produced by domestic pollution, not industrial pollution, partialing out fecal coliforms from COD should leave only the component of COD that is not produced by domestic pollution, i.e., industrial pollution. Therefore in some specifications I nonparametrically control for fecal coliforms.

Agricultural outcomes

For data on agricultural outcomes I use the Meso Scale Dataset from ICRISAT’s Village Dynamics in South Asia project. This is a comprehensive district-level panel from 1966 to 2009. My primary outcome of interest is agricultural revenue. To calculate this, I multiply the quantity of each of 16 crops available in the dataset by the mean price for that crop in that district between 1966 and 1971. For districts without price data, I impute the state mean if available or the national mean otherwise. I also construct the following variables for use as baseline covariates and outcomes:

• Climate variables: Average precipitation (mm/year); average potential evapotranspiration (mm/year).

• Irrigation variables: Land irrigated (net irrigated area per total district area, %); crops irrigated (gross irrigated area per gross cropped area, %); surface water and groundwater specific mea- sures of the two variables above; pump density (number of diesel and electric pumps per district area, in thousands of hectares).

• Other agricultural variables: Land cropped (net cropped area per total district area, %); fertil- ization intensity (total fertilizer consumed per district area, in thousands of hectares).

112 • Population variables: Rural population density (total rural population per district area, in thou- sands of hectares); farmer density (total cultivators and agricultural workers per district area, in thousands of hectares).

Census

For geographic district boundaries and some population data I use the historic GIS shapefile of India’s 1971 Census, from ML Infomap. I match districts to the ICRISAT data on state and district name. The ICRISAT data uses district definitions from 1966, so the boundaries may be slightly different, but a GIS shapefile of 1966 district boundaries was not available, and district boundaries changed little between 1961 and 1971. Using ArcGIS, I calculate the longitude and latitude of the centroid of each district, as well as the land area in square kilometers. I also use the following variables from the 1971 census, included in the shapefile, as baseline covariates: total population density (people per square kilometer); urbanization rate (%); scheduled castes (%); scheduled tribes (%); and literacy rate (%).

2.3.2 Matching pollution data to industrial sites

I match the pollution observations with polluted industrial sites via the following procedure. First, I obtain a digital elevation model (DEM) at 15 arc-second resolution for the South Asia area from the HydroSHEDS project of the United States Geological Survey. From this DEM, I use the Hydrology tools in ArcGIS to obtain three principal products:

1. A vectorized river network, consisting of river segments (edges) and junctions (nodes).

2. A Shreve stream order system. This numbers each segment of a river such that the number increases at every junction; i.e., two lower-numbered tributaries always combine to form a higher-numbered river, regardless of whether one tributary has the same name as the down- stream river segment.

3. A flow length raster, which calculates from each cell the distance along rivers that a particle would travel before reaching an ocean (or the edge of the raster).

I snap each industrial site to the nearest river within 10 km (only a few do not meet this criteria), and for each industrial site and pollution obsevation station I extract the river segment, stream order, and

113 flow length. I join each industrial site to all pollution stations on the same river network, and kept only pairs that meet criteria for being either upstream or downstream of each other (i.e., a site and station on separate tributaries are neither). I construct distance, which I use as a running variable, as the difference in flow lengths between station and industrial site, and a downstream indicator variable equaling one if the distance variable is positive (station is downstream of industrial site). Finally, I assign each station to its nearest industrial site and discard other matches, so that each station is only in the final dataset once.

2.3.3 Matching agricultural data to industrial sites

I identify districts upstream and downstream of each industrial site by inspection in ArcGIS using district and river maps and elevation, flow direction, and flow length data. I first identify the district the industrial site lies in and classify it according to whether more of its land lies upstream or downstream of the site. I then select 1-2 districts upstream of this district and 1-2 districts downstream. The final sample consists of minimum 3 and maximum 5 districts per industrial site. In future work I will formalize this procedure and calculate the percentage of each district’s land area which lies in the watershed (upstream) or catchment area (downstream) of each site.

2.3.4 Covariate balance

Full sample

Table 2.1 presents the balance of many covariates in 1971-72 across districts categorized as either upstream or downstream of industrial sites. The first column shows the mean and standard deviation of upstream districts, the second shows the same for downstream districts, and the third column shows the difference and the standard error of a t-test comparing the two means. For some variables whose distributions are very right-skewed, I take the natural logarithm of the data and show the same comparisons. The two groups are somewhat similar. Downstream districts have slightly more scheduled castes and slightly fewer scheduled tribes, but since both are disadvantaged categories, socioeconomic char- acteristics should be similar. More worrying is that downstream districts have substantially more irrigation, especially by surface water. Irrigation is a very important factor in determining agricul- tural productivity, so this diference may confound the relationship between polluted factories and

114 agricultural revenues or yields. The difference is likely due to dams which make it easier to irrigate downstream and not upstream. Currently I do not use data on dams; in future work I can directly control for them.

Balanced sample

In an attempt to construct a sample in which upstream and downstream districts are more similar at baseline, I discard industrial sites which have an upstream-downstream difference in either land irri- gated or crops irrigated exceeding 20 percentage points. This is admittedly a fairly arbitrary threshold but one that represents a natural break in the histograms of these differences. Table 2.2 presents again the covariate balance for this trimmed sample. Now the irrigation vari- ables are extremely similar between upstream and downstream districts. In fact, none of the covariates has a statistically significant difference. I show results using both the full sample and this balanced sample.

2.4 Empirical strategy

2.4.1 Regression discontinuity in pollution

The spatial discontinuity in pollution and the linear flow of rivers presents a natural setting for a regression discontinuity design. I match each polluted industrial site to upstream and downstream pollution measurement stations on the same river path, and I line up these stations in river space such that each industrial site is at zero. The running variable is distance along a river path from the industrial site, with positive values indicating that a measurement station is downstream of an industrial site. I estimate local linear regressions of the following form:

codsit = βdownstreamsi + γdistancesi + δdistancesidownstreamsi + αst + εsit (2.1)

where codsit is chemical oxygen demand measured at station i in year t, which spans the years 1986

to 2005. Variable downstreamsi is the discontinuous treatment; it takes a value of 1 if station i lies downstream of industrial site s and 0 if upstream. I estimate local linear regressions without higher order polynomials, following the advice of Gelman and Imbens (2014), allowing both the intercept

and slope of the running variable distancesi to vary on opposite sides of the discontinuity.

115 I include site-by-year fixed effects αst , so that the treatment effect at the discontinuity is identified only using variation between upstream and downstream measurements on the same river path in the same year. I use a triangular kernel, which is optimal for estimating local linear regressions at a boundary (Fan and Gijbels 1996). I cluster standard errors by station and report results using a range of reasonable bandwidths. For the regression discontinuity graph, I first partial out the state-by-year fixed effects from the

COD measurements by running a triangular-weighted local linear regression of codsit on αst . I then graph the residuals from this regression plus the upstream mean of the fixed effects, and fit local linear regressions of these residuals on distancesi, downstreamsi, and their interaction.

2.4.2 Spatial matching in agricultural outcomes

Unfortunately, agricultural data is not available at a finer spatial resolution than district, making a regression discontinuity design infeasible for these outcomes. Instead, I simply match upstream and downstream districts, estimating ordinary least squares regressions of the following form:

ysit = βdownstreamsi + ΠXsi + αst + εsit (2.2) where ysit is an agricultural outcome measured in district i in year t, Xsi is a vector of baseline covari- ates and αst are site-by-year fixed effects. The agricultural data spans 1966-2009; 1972 is the first year for which all covariates are observed, so I divide the time series there, including baseline co- variates from 1972 and estimating the regressions only on data from 1973 onward. Site-by-year fixed effects again ensure that the treatment effect of being downstream is identified only using differences between upstream and downstream districts near the same industrial site in the same year. I cluster standard errors by district to account for serial correlation. The identifying assumption is that no other factors that are relevant to agricultural outcomes also vary systematically between upstream and downstream districts. This is potentially plausible, since the polluted industrial sites are located all over India, on rivers oriented in all directions. This assumption does not seem to hold very well in the full sample (Table 2.1), but it does seem to hold on many observed factors in the balanced sample (Table 2.2). This of course does not ensure that it holds on unobserved factors, and so in future work I will combine this spatial difference with sources of time series variation, which will allow me to remove time-invariant district-specific characteristics.

116 2.5 Results

2.5.1 Pollution

I first show that the industrial sites considered “severely polluted” by the Central Pollution Control Board do in fact increase pollution levels discontinuously in nearby rivers.

Regression discontinuity. Photographic evidence of this discontinuity can be seen at one site in my sample: the Nazafgarh Drain Basin on the River just north of New Delhi (available at, for example, https://www.google.com/maps/place/28.708,77.231). The river flows from north to south and enters the image at the top with a green color. In the center of the image, an industrial effluent channel meets the river, discontinuously turning the river black. Color is neither a sufficient nor necessary condition for industrial water pollution, but it confirms the presence of water from another source and is correlated with pollution. Moving to the full sample and using direct, quantitative measurements of pollution, Figure 2-2 provides visual evidence of the discontinuity in pollution concentrations at polluting industrial sites. This figure plots measurements of chemical oxygen demand (COD) - an omnibus measure of water pollution - after partialing out site-by-year fixed effects, against distance from a polluting industrial site. Positive distance values (right of the vertical line) indicate that the measurement is downstream; negative values (left of the vertical line) are upstream. Data is collapsed to 3-km wide bins; the light blue circles plot the mean within each bin, with the diameter of the circle proportional to the number of observations falling within that bin. The dark blue lines represent a local linear regression at distance = 0 using a bandwidth of 50 km and a triangular kernel, in which both intercept and slope are allowed to vary on opposite sides of distance = 0. Visually there appears to be evidence of a large, discontinuous rise in pollution concentrations at distance = 0, coinciding with the presence of large polluting industrial sites. This confirms that the industrial sites selected by Central Pollution Control Board for detailed study are indeed releasing large amounts of water pollution into India’s river systems. Pollution levels appear to fall downstream of the site; this makes sense because the further downstream the river runs, the more that the original pollution concentrations in the river are diluted with runoff from other areas. It is less clear why pollution might rise approaching the site from upstream; however, the estimated upstream slope is sensitive to bandwidth and often smaller or negative.

117 Varying bandwidths. Table 2.3 makes this visual evidence more formal. Panel A presents results of eight local linear regressions of the form 2.1, with bandwidth varying from 200 km to 7 km, the smallest bandwidth for which I can estimate all parameters of the regression. The first row shows the coefficient on the downstream indicator, which is the estimate of the average discontinuity in pollution at an industrial site. For all bandwidths this estimate is positive, large, and has a 99% or 95% confidence interval that excludes zero. Downstream pollution concentrations are expected to fall with increasing distance from a pollution source; accordingly, the estimate of the discontinuity grows larger in magnitude as the bandwidth narrows and we zoom in on the polluting industrial site. At the third-smallest bandwidth, 15 km, the discontinuity at an industrial site is 95 mg/L. Compar- ing this to the upstream mean (the estimate of the limit of COD concentrations as distance approaches 0 from the upstream side) of 34 mg/L, this implies that the average “severely polluted” industrial site nearly triples pollution levels in the nearest river.

Fecal coliforms. Panel B of Table 2.3 shows the same local linear regressions but linearly control- ling for fecal coliforms. The distribution of this variable is heavily skewed right so I use its natural logarithm and allow its slope to differ upstream and downstream of the industrial sites. While COD is produced by both industrial and domestic sources, fecal coliforms are overwhelmingly produced by domestic sources only, so by controlling for the latter, the remaining variation in COD should reflect only industrial sources. The results here are quite consistent with Panel A. As expected, the estimates of the pollution discontinuity are smaller, but still large, positive, and statistically signifi- cant at bandwidths smaller than 50 km. This suggests that the discontinuous rise in pollution at the point of “severely polluted” industrial sites does in fact reflect industrially generated pollution, not just domestic sources which happen to coincide with the industrial sites.

2.5.2 Agricultural outcomes

Having shown that water pollution in rivers rises discontinuously at “severely polluted” industrial sites, I investigate the downstream effects this pollution has on agriculture, using the sample trimmed by excluding industrial sites with a large upstream-downstream difference in baseline irrigation rates.

Revenues. Table 2.4 reports the results of regressions of the natural logarithm of total agricultural revenues on an indicator for being downstream, baseline covariates, and site-by-year fixed effects,

118 according to equation 2.2. The sample is limited to 3-5 districts immediately surrounding each pol- luting industrial site, and the fixed effects ensure the identifying variation is only between upstream and downstream districts around the same site in the same year. The leftmost column includes no baseline covariates, the middle columns include one group of covariates each, and the rightmost column includes all baseline covariates. Because the dependent variable is in natural logarithms, the coefficients can be interpreted as roughly proportional changes in revenues. The point estimates of the effect of being downstream are almost all negative, and I cannot statistically reject that they are all equal. When all baseline covariates are included, being downstream is associated with a 9 percent reduction in agricultural revenues, with a 95% confidence interval of (-2, -17) log points.

Other outcomes. Table 2.5 shows the results of similar regressions for a variety of other agricultural outcome variables, each including all baseline covariates. These estimates can provide evidence on the channels through which the negative effect on revenues operates, or alternatively whether farmers may be engaging in compensatory behavior, increasing agricultural inputs to soften the impact on revenues but at a higher cost. Column (1) is repeated from Table 2.4, for ease of comparison. The evidence suggests that the entire effect of industrial water pollution operates through reduced yields, i.e., revenues per cropped area, and not through reallocation of factors of production such as land, labor, irrigation, or fertilizer away from agriculture. Moreover, farmers do not appear to be compensating for the reduced yields by applying additional inputs. Of course, it is possible that there are heterogeneous effects: some farmers respond to water pollution by substituting factors away from farming, while others maintain revenue by applying more inputs, but on average there is no evidence that either is happening systematically. This can be seen by noting that the effect of being downstream on yields is 16% and even more precisely estimated than the effect on revenues, while the effects on all other outcomes are smaller and most not significantly different from zero. Land area irrigated and crop area irrigated appear to decline by 2%, but their 95% confidence intervals do not exclude zero in the balanced sample, and these changes could explain at most 20% of the effect on revenues. The effect of being downstream of polluted industrial sites on rural population and numbers of farmers are imprecise but the point estimates are positive, suggesting that the fall in revenues is not coming from out-migration or exit from farming.

119 2.6 Conclusion

This chapter provides the first quasi-experimental evidence on the costs of industrial water pollution to agriculture. I examine 63 industrial sites in India identified by the government as “severely pollut- ing” and estimate the costs of their pollution to downstream agriculture. First, I find that the location of these industrial sites coincides with a large, discontinuous jump in water pollution in nearby rivers. Second, I find that each district immediately downstream of these sites has, on average, 9 percent lower yields (with a 95% confidence interval of -2% to -17%) than a corresponding district immedi- ately upstream of the same site, in the same year. Third, I find that this effect is driven by the direct impact on yields; there is no evidence that factor reallocation either mitigates or exacerbates it. In future work I plan three major improvements to enhance the credulity of these estimates. First, I plan to use a high-resolution vegetation index derived from NASA satellite data to improve the spatial resolution of the agricultural outcomes. Currently the district-to-district comparison is rather coarse; remote sensing data will permit me to implement a true spatial regression discontinuity de- sign on a proxy for crop yields. Second, I plan to combine my existing spatial variation with temporal variation in the number of polluting factories in each industrial site or corresponding district. Under my current empirical strategy, there is still a possibility that upstream and downstream districts differ systematically in unobserved ways that affect agricultural outcomes, confounding the effect of in- dustrial pollution. Adding temporal variation will allow me to control for unobserved time-invariant factors. Third, I plan to also incorporate short-term temporal variation from a river flow instrument. This instrument is inverse flow volume as predicted by upstream rainfall. Economic outcomes depend on pollution concentrations, the ratio of pollution to water volume; upstream rainfall increases the denominator without affecting the numerator. By comparing these results to those using the number of polluting factories over time, I can provide evidence on the difference between short- and long-term effects of pollution.

120 2.7 Figures

Figure 2-1: Locations of “severely polluted” industrial sites (red dots) and water pollution measure- ment stations (green dots).

!( !( !(!( !( !(!(!(!( !( !(!(!( !( !(!( !(!(!(!( !( !(!(!(!(!( !(!(!(!(!( !(!( !(!( !(!(!( !(!( !( !(!( !(!(!( !( !(!(!(!( !( !(!( !(!(!(!( !( !( !( !(!( !(!( !( !( !(!( !(!(!( !( !( !( !(!( !( !( !(!( !( !(!( !(!( !( !( !( !( !( !(!( !( !( !( !( !( !( !( !( !(!(!( !(!( !( !( !( !(!( !(!( !(!( !(!( !( !( !( !( !( !( !( !( !( !( !( !( !( !( !(!( !( !( !( !( !( !( !( !( !(!( !( !( !(!( !(!( !( !(!(!( !(!( !( !(!( !( !( !( !( !( !( !( !( !(!(!( !( !( !( !(!( !( !(!( !( !(!( !( !( !( !( !( !(!( !( !( !(!( !( !( !( !(!( !( !(!( !( !(!(!(!( !( !( !( !( !( !( !( !( !( !(!( !( !( !(!( !( !( !( !(!(!(!( !(!(!( !(!( !( !( !( !( !( !( !( !(!( !( !(!(!( !( !( !( !( !( !( !( !(!( !(!(!( !( !( !(!(!( !( !(!(!( !(!( !( !(!(!( !( !( !( !( !(!( !( !(!( !( !( !(!(!(!( !( !(!( !( !( !(!( !( !( !( !( !(!( !(!( !( !( !( !( !( !(!(!(!( !( !( !(!( !( !( !(!( !( !( !( !(!( !( !(!( !(!( !( !(!( !( !( !( !(!( !(!( !( !( !( !( !( !(!( !(!( !( !(!(!(!( !(!( !( !( !(!( !( !( !(!(!( !( !(!(!(!(!(!(!( !( !( !(!(!( !(!( !(!(!( !( !(!(!(!(!(!(

121 Figure 2-2: Discontinuity in river pollution levels at the point of “severely polluted” industrial sites. 70 60 50 40 30 20 COD (mg/L), fixed effects partialed out 10 0 -50 -25 0 25 50 Distance downstream of polluting industrial cluster (km)

122 2.8 Tables

Table 2.1: Covariate balance (full sample).

Levels Logs Up Down Diff Up Down Diff Latitude (degrees) 23.4 24.5 1.1 [5.1] [4.8] (0.8) Longitude (degrees) 79.3 78.6 -0.7 [4.0] [3.8] (0.6) Land area (km2) 9951 8787 -1165 [5497] [5921] (949) Population density, 1971 (people/km2) 259 282 23 [157] [142] (25) Urban, 1971 (%) 0.21 0.19 -0.01 [0.13] [0.12] (0.02) Scheduled castes, 1971 (%) 0.14 0.17 0.03*** [0.06] [0.06] (0.01) Scheduled tribes, 1971 (%) 0.11 0.05 -0.06** [0.19] [0.10] (0.02) Literacy rate, 1971 (%) 0.27 0.27 -0.01 [0.08] [0.08] (0.01) Avg. precipitation (mm/yr) 1027 934 -93 [397] [324] (59) Avg. potential evapotranspiration (mm/yr) 1579 1569 -10 [200] [148] (29) Land cropped, 1972 (%) 0.56 0.62 0.07** [0.21] [0.17] (0.03) Land irrigated, 1972 (%) 0.19 0.27 0.09** -2.48 -1.84 0.64*** [0.20] [0.22] (0.04) [1.47] [1.27] (0.23) Crops irrigated, 1972 (%) 0.28 0.38 0.10** -1.82 -1.33 0.49*** [0.24] [0.25] (0.04) [1.18] [1.01] (0.18) Land irrigated by surface water, 1972 (%) 0.07 0.12 0.05*** -3.89 -2.95 0.94*** [0.096] [0.11] (0.02) [1.83] [1.83] (0.30) Land irrigated by groundwater, 1972 (%) 0.11 0.15 0.03 -3.36 -2.85 0.50 [0.13] [0.14] (0.02) [1.89] [1.80] (0.31) Crops irrigated by surface water, 1972 (%) 0.11 0.18 0.07*** -3.21 -2.43 0.78*** [0.13] [0.15] (0.02) [1.69] [1.75] (0.29) Crops irrigated by groundwater, 1972 (%) 0.16 0.20 0.04 -2.68 -2.33 0.35 [0.17] [0.18] (0.03) [1.58] [1.53] (0.26) Water pumps per land area, 1972 (km-2) 0.019 0.021 0.002 -4.94 -4.76 0.18 [0.026] [0.021] (0.004) [1.69] [1.80] (0.29) Fertilizer per cropped area, 1972 (ton/Kha) 19.3 21.3 2.0 2.45 2.58 0.13 [17.2] [16.3] (2.8) [1.17] [1.54] (0.23) Observations 67 80 Standard deviations in brackets. Standard errors of t-tests in parentheses.

123 Table 2.2: Covariate balance (sample trimmed by excluding sites with a large upstream-downstream difference in baseline irrigation rates).

Levels Logs Up Down Diff Up Down Diff Latitude (degrees) 23.4 24.0 0.6 [5.1] [4.6] (1.0) Longitude (degrees) 77.8 78.1 0.3 [3.2] [3.6] (0.7) Land area (km2) 9468 9258 -210 [5328] [6488] (1203) Population density, 1971 (people/km2) 249 273 24 [142] [144] (29) Urban, 1971 (%) 0.21 0.19 -0.02 [0.13] [0.13] (0.03) Scheduled castes, 1971 (%) 0.15 0.16 0.02 [0.05] [0.06] (0.011) Scheduled tribes, 1971 (%) 0.06 0.05 -0.01 [0.13] [0.11] (0.02) Literacy rate, 1971 (%) 0.28 0.27 -0.02 [0.08] [0.09] (0.02) Avg. precipitation (mm/yr) 911 906 -5 [371] [294] (66) Avg. potential evapotranspiration (mm/yr) 1622 1589 -33 [200] [153] (35) Land cropped, 1972 (%) 0.63 0.63 0.002 [0.17] [0.16] (0.03) Land irrigated, 1972 (%) 0.22 0.24 0.02 -2.21 -2.02 0.19 [0.21] [0.21] (0.04) [1.42] [1.30] (0.27) Crops irrigated, 1972 (%) 0.29 0.32 0.03 -1.73 -1.52 0.21 [0.23] [0.23] (0.05) [1.20] [1.05] (0.22) Land irrigated by surface water, 1972 (%) 0.08 0.10 0.02 -3.77 -3.21 0.55 [0.10] [0.091] (0.02) [1.89] [1.98] (0.39) Land irrigated by groundwater, 1972 (%) 0.14 0.13 0.00 -2.86 -3.03 -0.18 [0.13] [0.14] (0.03) [1.65] [1.87] (0.36) Crops irrigated by surface water, 1972 (%) 0.12 0.15 0.03 -3.25 -2.71 0.54 [0.13] [0.13] (0.03) [1.78] [1.90] (0.37) Crops irrigated by groundwater, 1972 (%) 0.19 0.18 -0.01 -2.34 -2.53 -0.19 [0.17] [0.17] (0.04) [1.45] [1.62] (0.31) Water pumps per land area, 1972 (km-2) 0.021 0.017 -0.004 -4.53 -4.84 -0.31 [0.024] [0.015] (0.004) [1.38] [1.67] (0.31) Fertilizer per cropped area, 1972 (ton/Kha) 20.2 17.7 -2.5 2.54 2.36 -0.18 [16.8] [13.5] (3.0) [1.16] [1.69] (0.30) Observations 44 58 Standard deviations in brackets. Standard errors of t-tests in parentheses.

124 Table 2.3: Local linear regressions of water pollution estimated at the point of a “severely polluted” industrial site.

Dependent Variable: Chemical oxygen demand (COD), mg/L

Panel A: Benchmark RD Bandwidth (km) 200 100 50 30 20 15 10 7 Downstream indicator 12.6** 19.7*** 30.8*** 38.3*** 58.4*** 95.0*** 112.2*** 152.4*** (5.6) (7.1) (9.7) (12.4) (14.8) (19.7) (23.0) (3.6) Distance along river (km) 0.1* 0.1 0.0 -0.2 -0.9 -2.6 -4.9 -11.0*** (0.03) (0.1) (0.2) (0.4) (0.9) (2.5) (3.7) (0.18) Downstream X Distance -0.3*** -0.6*** -1.0** -0.9 -1.1 -2.3 0.0 5.8*** (0.1) (0.2) (0.4) (0.6) (1.3) (2.4) (3.8) (1.0) Site-by-year fixed effects X X X X X X X X Observations 13,767 10,150 7,437 6,282 5,002 3,715 3,234 2,744 Clusters (stations) 132 100 70 55 43 33 28 24 Upstream mean 36.8 40.3 40.4 40.0 39.9 34.2 28.1 15.1

Panel B: Partialing out Fecal Coliforms Bandwidth (km) 200 100 50 30 20 15 10 7 Downstream indicator 5.0 13.8 19.6* 28.8** 44.7*** 82.4*** 95.0*** 111.2*** (8.4) (8.3) (10.2) (11.9) (14.2) (22.1) (17.7) (12.1) Distance along river (km) 0.0 0.1 0.0 -0.2 -0.8 -3.5 -6.9** -9.4*** (0.0) (0.1) (0.1) (0.3) (0.7) (2.5) (2.5) (0.6) Downstream X Distance -0.2*** -0.4*** -0.5 -0.2 -0.4 0.2 6.5** 10.4*** (0.1) (0.1) (0.3) (0.4) (1.0) (2.4) (2.8) (2.7) Ln Fecal coliforms 2.5*** 2.7*** 2.0** 1.9* 1.6 1.3 1.1 2.1 (0.7) (0.8) (0.9) (1.0) (1.2) (1.4) (1.5) (1.7) Downstream X Ln Fecal Coli. 0.7 0.2 0.5 -0.2 0.0 0.2 0.7 -0.4 (1.0) (0.9) (0.9) (1.1) (1.5) (1.9) (2.3) (2.6) Site-by-year fixed effects X X X X X X X X Observations 10,577 7,530 5,297 4,381 3,423 2,577 2,175 1,804 Clusters (stations) 129 97 68 53 42 31 26 22 Upstream mean 19.4 21.0 25.4 25.2 26.8 19.3 14.9 6.4 Standard errors clustered by station. * p<0.10, ** p<0.05, *** p<0.01

Table 2.4: Regressions of agricultural revenues comparing districts immediately upstream and down- stream of “severely polluted” industrial sites.

Dependent Variable: Ln Total Agricultural Revenues (per total land area) Downstream indicator -0.01 -0.03 -0.10 -0.10 -0.03 0.01 -0.09** (0.11) (0.13) (0.09) (0.10) (0.05) (0.07) (0.04) Geography (Lat/Lon) X X Climate (Rain/PET) X X Social (Dens/Urban/SC/ST) X X Irrigation (%Land/%Crops/Surface/Ground) X X Fertilizer X X Site-by-year fixed effects X X X X X X X Observations 2,303 2,303 2,231 2,303 2,303 2,231 2,231 Clusters (districts) 53 53 51 53 53 51 51 Standard errors clustered by district. * p<0.10, ** p<0.05, *** p<0.01

125 Table 2.5: Regressions of several agricultural outcomes comparing districts immediately upstream and downstream of “severely polluted” industrial sites. 177 X X X X X (8) 0.08 (0.05) Farmers 180 X X X X X (7) 0.09 Rural (0.06) Population * p<0.10, ** p<0.05, *** p<0.01 *** p<0.05, ** p<0.10, * 2,227 X X X X X (6) -0.01 (0.05) Standard errors district. clustered by Fertilizer 2,170 X X X X X (5) -0.02 (0.01) Crops Irrigated 2,174 X X X X X (4) Area (0.01) -0.02* Irrigated 2,237 X X X X X (3) 0.02 Area (0.01) Cropped 2,249 X X X X X (2) (0.03) Yields -0.16*** 2,231 X X X X X (1) (0.04) -0.09** Revenues Agricultural Agricultural (6) Ln Fertilizer Consumption(6) Ln Fertilizer per 1000 hectares area land per(7) 1000 hectares Ln Rural Population area land Workers(8) Agricultural + Ln Cultivators per 1000 hectares area land Downstream indicator Geographic covariates Climate covariates covariates Population covariates Irrigation fixed effects Site-by-year Observations Dependent variable definitions: (1) Ln Revenues area) (revenue land per(INR/1000 total hectares) (revenue per cropped(2) Yields Ln Agricultural area) (INR/1000 hectares) (3) Land cropped (%) (4) (%) Land irrigated (5) Crops Irrigated (%)

126 Chapter 3

Measuring Demand for Groundwater Irrigation: An Experimental Protocol Using Conservation Payments

3.1 Introduction

Groundwater is a major source of irrigation and drinking water worldwide, especially for farmers in developing countries (Ministry of Agriculture of the , 2014). Unfortunately, falling groundwater levels are creating negative consequences in many regions. Depletion reduces water availability, raises the cost of further extraction, may harm water quality, and can increase poverty and conflict (Sekhri, 2014). While groundwater pumping is currently unregulated in much of the world, many regulatory tools are available, ranging from quantity restrictions and tradeable quotas to simple price instruments. However, to implement these tools efficiently, a regulator requires knowledge of the demand for groundwater – a key input for which evidence is thin. This chapter presents an experimental protocol to measure the price response of demand for groundwater in irrigated agriculture. To measure demand, we introduce a mechanism that imple- ments a price incentive without requiring the power of taxation: payments for reduced groundwater pumping, or “conservation credits”. We plan to conduct this study as a randomized controlled trial among well-owning farmers in the Indian state of Gujarat between 2018 and 2020. The basic research design will be to (1) install meters on the groundwater pumps of all study participants, (2) offer pay- ments for reduced pumping, relative to a benchmark quantity, to a randomly selected sub-sample of participants, and then (3) compare the quantity of groundwater extracted by these farmers to that of the rest of the sample (i.e., the control group).

127 Besides allowing us to measure demand, our intervention may be a promising policy tool in itself. By offering payments for voluntary conservation, we may be able to overcome constraints often faced in regulating common-pool resources in developing countries. One such constraint is weak enforcement capacity: In many areas, a natural regulator for groundwater would be the state- owned utility that provides the electricity used for pumping – but consumer-level metering is rare, and electricity theft is widespread (Antmann, 2009; Northeast Group LLC, 2014; Golden and Min, 2012). Another constraint is political concerns: Both energy subsidies and open access to groundwater are often entrenched means of redistribution; in India, reform efforts are commonly met with forceful protests (Sovacool, 2017). The conservation credits model may be able to relax these constraints in two ways. First, our program may be easier to enforce than electricity sales; both technical and institutional features of the program may make cheating both more difficult to do and easier to detect. Second, we do not attempt to interfere with existing de facto entitlements. Unlike (for example) a new Pigouvian tax on groundwater consumption, which has large costs to large users of free groundwater, we instead offer payments relative to existing usage patterns. While conservation credits require large expenditures, a Pareto improvement may be possible: an electric utility may be willing to implement conservation credits if the outlay per unit of energy conserved is smaller than the marginal cost of electricity provision. Our analysis will consist of three parts. First, evaluating our intervention as a whole, we will measure how much water and energy is saved by the conservation credits program. This will provide reduced-form evidence on the response of demand for groundwater irrigation to price incentives, as well as evidence on the ability of the conservation credits model to reduce resource consumption in our context. Our primary outcome is duration of pump operation, as measured directly using hours-of-use meters. We will also estimate treatment effects in energy and water equivalents, and assess mechanisms of water conservation and follow-on environmental and economic impacts. To assess “leakage” in this program (negative spillovers to non-monitored actions), we will also estimate intervention effects on the use of other, unmetered water sources. Second, we will use the design of our intervention to estimate the slope of groundwater demand with respect to price, a parameter that is an important input to the design of any type of groundwater regulation. We will estimate demand by instrumental variables, using treatment group assignment as instruments for price faced at the margin. To obtain a quantitative estimate of demand parameters, we

128 need an instrumental variables approach; intent-to-treat estimates do not suffice because of the struc- ture of our intervention design. Specifically, it would be cost-prohibitive to offer enough incentives to ensure everyone is marginal – in practice, some participants find their benchmark too low to af- fect their decisions. Additional random variation in both prices and benchmarks will help to increase statistical power and first-stage instrument strength. Third, we will assess the cost-effectiveness of the intervention as implemented in the study. Con- servation credits could be implemented by even a budget-constrained electric utility if the cost of the energy conserved is larger than the cost of the program. To evaluate the viability of this potential Pareto improvement, we will compare these costs and test whether an electric utility could be enlisted to reduce groundwater consumption using conservation credits. If the answer is no, we will calcu- late the minimum per-unit groundwater conservation subsidy that would be required for conservation credits to yield net benefits. Our study design is informed by a small pilot trial implemented among 90 farmers in our study region during the winter of 2017-18. This pilot demonstrates logistical feasibility of our interven- tion: farmers approached were overwhelmingly willing to participate in the study and install meters, the meters functioned properly, and we observed little evidence of tampering. The pilot also yielded several improvements in intervention design, as well as preliminary data on pumping hours that in- formed our sample size calculations. Results from the pilot are highly imprecise but point estimates are consistent with a large reduction in pumping hours in the treatment group. Our study also follows an earlier, non-randomized pilot of a similar program in northern Gujarat (Fishman et al., 2016). This study will make several contributions. First, this study will provide, to our knowledge, the first experimental evidence on the price sensitivity of demand for groundwater. Price variation is scarce for an open-access resource, so most previous estimates have used proxies for the cost of pumping (Gonzalez-Alvarez et al., 2006; Hendricks and Peterson, 2012), but these proxies may be correlated with other determinants of groundwater demand. Bruno (2018) exploits panel variation in prices across three regions of an irrigation district in California, but there is still a possibility that these prices may have responded to groundwater consumption; an experiment can rule out both concerns. We also focus on a developing country, where evidence on groundwater demand is particularly scarce. Meenakshi et al. (2013) use differences-in-differences to study a phased-in switch to metering in , India, but they rely on self-reported pumping data and find imprecise results. Badiani and Jessoe (2017) estimate an aggregate price elasticity using panel variation in the fixed cost of an

129 electricity connection, but marginal incentives may produce quite different results. Second, we will contribute to the literature on the cost-effectiveness of payments for environmen- tal services (PES) for resource conservation. Our conservation credits intervention has the same basic structure as hundreds of programs designed to incentivized the provision of environmental services, ranging from increased forest or wetland cover, to reduced input intensity in agriculture.1 Despite their prevalence, rigorous evaluation of these types of programs has been limited (see Pattanayak et al. (2010) and Börner et al. (2017) for reviews). Most existing evaluations use covariate matching and are unable to address selection bias, a particular concern for a voluntary program. The two ex- ceptions are Jayachandran et al. (2017), who use a randomized controlled trial to find that conditional payments to forest-owning households in Uganda reduce deforestation rates by 50 percent, and Jack and Cardona Santos (2017), who find that while contracts for tree planting in Malawi increase trees planted, they also increase tree clearing on unenrolled plots. Our study will provide evidence on the feasibility and effectiveness of PES in a novel context: promoting irrigation efficiency in agriculture. We will also build on previous studies by using detailed survey data to investigate the behavioral mechanisms underlying the response to a PES program. Understanding the response to this program will inform our understanding of whether a PES model holds promise as a method for governments and donors to reduce energy and water use. Finally, we will contribute to literature connecting the price response of electricity consumption in developing countries to policy decisions about energy-sector investment and reform. Experimental and quasi-experimental studies are still limited, but a few have been conducted recently on rural households in Columbia (McRae, 2015), urban households in South Africa (Jack and Smith, 2016), and new grid connections in Kenya (Lee et al., 2018).

3.2 Background

3.2.1 Optimal groundwater policy: A framework

Groundwater is a shared, common-pool resource. Extraction by one user (most often irrigators) imposes an externality on other users in the form of lower water availability and higher costs of extraction. Multiple regulatory tools - including both quantity and price instruments - are available

1For example, in the United States alone, payments are available to farmers for actions to mitigate flood and wildfire risks, provide habitat for endangered species, salinity mitigation, and water and energy conservation.

130 to reduce over-extraction and restore efficiency, and demand for groundwater is an essential input to all of them. In this section we show how the optimal Pigouvian price level is set, and how this calculation is affected by the demand for groundwater. We focus on price regulation because our study implements a type of price instrument, but the analysis would be similar for the quantity instruments more frequently used for groundwater management. Figure 3-1 illustrates consequences of price regulation in the presence of groundwater externali- ties. Irrigators have aggregate inverse demand for groundwater as a function of water quantity, D(q). Inverse demand equals private marginal benefits net of private marginal costs of extraction; it first declines with quantity but eventually slopes upward as marginal costs rise. Extraction generates so- cial marginal damages, SMD(q), which increases with quantity. Although this analysis represents the situation at a single point in time, it can fully incorporate dynamics: the present discounted value of future costs of today’s extraction may be included in demand (the internalized portion) and social marginal damages (the remainder). When groundwater extraction has a price of zero, irrigators continue using water until net private marginal benefits are zero - where the demand curve intersects the x-axis, or q0. This level of extrac- tion is inefficient, since the social marginal damages are greater than the net private marginal benefits. The efficient level of extraction, instead, is found where these two curves intersect, or q∗. One way to achieve this allocation is through a price, or tax, per unit quantity extracted. If the price p is set to equal p∗, the value of social marginal damages at q∗, irrigators will fully internalize the externality of extraction, shifting down the effective demand curve. Then, they will extract only up to the efficient quantity q∗, since net private marginal benefits including the tax are zero. To set this per-unit price p∗, a common heuristic is to set the price equal to the social marginal damages as measured locally. If social marginal damages are constant, the slope of demand does not matter, since the efficient quantity is simply whatever amount results from this price. However, there are two reasons a policymaker pursuing price regulation may need to know the full shape of the groundwater demand curve. First, social marginal damages may not be constant. In

Figure 3-1, if the price were set at SMD(q0), the resulting quantity extracted would be far too low. Constant social marginal damages may be a reasonable approximation over the range of groundwater conserved in small programs in large aquifers, but the slope of the demand curve is essential for larger programs or smaller aquifers. Second, even if social marginal damages are constant, the process of enacting a new policy may incur costs (such as political or administrative costs). Whether the policy is

131 worthwhile depends on the quantity of water conserved, which can only be predicted with knowledge of the demand curve.

3.2.2 Existing evidence: Costs, benefits, and damages of groundwater extraction in irrigated agriculture

Existing evidence is relatively thin for both the social damages and demand functions for groundwater extraction. Social damages are difficult to quantify overall, but the components are well understood. Some components have known values, while others are best estimated using scientific models. De- mand for groundwater, which is the difference between private marginal benefits and private marginal costs, is less well understood. Private marginal costs can be modeled fairly easily, but private marginal benefits are unknown. Our study fills the gap in knowledge by directly estimating demand.

Social damages

Social damages from groundwater extraction come first through the depletion of the resource. Ground- water extraction by one user generally leads directly to a decline in water levels for other users. The precise relationship between extraction and water levels depends on geology, topography, soil, rain- fall, and climate. Deeper groundwater levels raise the cost of extraction, which can lead to increases in poverty and conflict (Sekhri (2014)). Depletion can also degrade water quality, either through inherent local properties of soil and geology, or by drawing in seawater from the ocean in coastal areas. These externalities can be complex and difficult to estimate, since the spatial extent of the extrac- tion externality varies greatly across locations. Depending on geology, in some areas, the externality may fall almost entirely on a small group of neighbors, in which case Coasian bargaining may some- times be able to govern the aquifer efficiently. However, in many areas, and especially over longer periods of time, the externality is felt over a very large area, making local cooperation less likely to be sustained. Another major source of social damages, which is easier to measure, is the costs associated with the energy required to pump groundwater to the surface. Typical energy sources are electricity and diesel, both of which create greenhouse gases and air pollution. In many developing countries, in- cluding almost all states of India, political pressure constrains governments to provide electricity

132 to agricultural customers at a marginal price of zero. In this case, the social marginal damages of groundwater extraction include the marginal cost of electricity provision by the electric utility.

Demand

Private marginal costs in the short run can be modeled reasonably easily: they depend on the price of fuel (which may be zero), water levels, and pump characteristics. In the long run - that is, over large changes in water levels - discontinuities in private marginal costs may arise from deepening wells or purchasing new pump hardware. Private marginal benefits of groundwater extraction are more difficult to estimate since they de- pend on the agricultural production function and any non-profit-maximizing behavior by farmers. Anecdotal evidence suggests that, especially in developing countries, water inputs often exceed yield- and profit-maximizing levels. Instead of measuring inputs precisely, some farmers simply flood their fields - which would suggest that private marginal benefits are low at the current equilibrium. Because these private benefits are difficult to model, we instead directly estimate groundwater demand using a revealed-preference approach, in which we observe how quantity extracted changes with price.

3.2.3 Conservation credits as a Pigouvian tax

Our objective is to estimate groundwater demand by varying the price of extraction. As an exter- nal party lacking the power of the state, we cannot require irrigators to pay a tax. Instead, we offer payments for reduced water extraction, relative to a benchmark amount - an intervention called “con- servation credits.” This intervention provides the same marginal incentives as a Pigouvian tax, at least for some participants. Figure 3-2 illustrates the budget set of the conservation credits contract. Two thresholds are set: a benchmark, and a maximum payment. If the irrigator extracts a greater quantity than the benchmark, the payment is zero. If the irrigator conserves water relative to the benchmark, the payment equals the price times the difference between the quantity and the benchmark. If the irrigator conserves very large amounts of water, the maximum payment may be reached, after which further conservation does not increase the payment. Under a Pigouvian tax, all irrigators are marginal to the incentive, in the sense that any posi- tive quantity extracted is subject to a per-unit price. Under conservation credits, many irrigators are

133 marginal, but not all. To see this, Figure 3-2 plots quasi-linear indifference curves over groundwater extraction (including both the private benefits and costs) and payments of conservation credits. With- out conservation credits, the budget set is flat and coincides with the x-axis; with conservation credits, the budget set is piecewise linear. Irrigator A is marginal: her indifference curves are tangent to the A A x-axis at q0 and tangent to the conservation credits budget set at q1 , indicating that she will reduce groundwater extraction when eligible for conservation credits. Irrigator B is extra-marginal: his in-

difference curves are tangent to both budget sets at qB, indicating that he will not reduce extraction in response to conservation credits.

3.2.4 Equivalence between water, energy, and pumping time

Some of the costs and benefits discussed so far are in units of water quantity, while others are in units of energy consumed. In data collection, our main outcome of interest will be a third unit: pumping time. These three objects are closely related and can be converted using mechanical formulas:

P P η E = b ×t q = b p ×t (3.1) ηm kh

where E is energy consumed, q is quantity of water pumped, and t is duration of pump operation. Pb is the power rating of the pump’s motor (“brake horsepower”). ηm and ηp are the motor and pump efficiencies; they are unitless, between zero and one. h is the total hydraulic head, approximately equal to the depth to water level (plus friction and outlet pressure). k is a conversion constant, equal −1 to 3960 when q is in gallons, Pb is in horsepower, t is in minutes, and head is in feet. Slopes of demand can be related similarly by differentiating both sides of the formulas in Equation 3.1. The price elasticities of demand for water, energy, and hours are equal.

3.3 Study Setting and Experimental Design

To estimate groundwater demand, we will implement a randomized controlled trial among groundwater- irrigating farmers in Gujarat, India. The trial will have two overarching treatment arms: conservation credit farmers will be eligible to receive payments for conserving groundwater below a benchmark, whereas control farmers will receive no such incentives.

134 3.3.1 Setting

Our trial will be implemented in Saurashtra, a water-scarce region of Gujarat state, India. The study villages are located in coastal areas of Talaja block in Bhavnagar district, where falling groundwater levels lead not only to increased irrigation costs, but also to increased risk of seawater intrusion into the freshwater aquifer. Salinity levels in Talaja aquifers are already extremely high (Central Ground Water Board (2013)), with 60% of villages either prone to increased salinity or already partially or fully saline,2 reducing the ability of farmers to grow high-value crops (Samadhan E Cube Innovator Pvt. Ltd. (2016)). The primary source of employment in Talaja block is in agriculture (Registrar General and Cen- sus Commissioner of India (2001)). The literacy rate is low compared to the rest of India, at ap- proximately 44%. In addition, households are relatively large, with an average household size of 5.7 individuals (compared to 4.7 across India). Only 3% of individuals are from Scheduled Castes or Scheduled Tribes. Agricultural land is primarily irrigated by groundwater (47%), although 28% is surface-water irrigated, and the remaining 25% is rain-fed (Registrar General and Census Commis- sioner of India (2011)).

3.3.2 Enrollment

We will recruit our sample from 44 study villages. The study villages were selected by our implement- ing partner, the Coastal Salinity Prevention Cell (CSPC). CSPC is not yet working in these villages, but plans to roll out a number of programs beginning in 2018, ranging from health to agricultural development interventions. The sample in each village will be randomly selected from eligible households of landowning villagers who are willing to participate, following the procedure developed in our pilot study. The sample frame will be formed on the basis of official village landowner lists, which can be obtained from the village talati (accountant). These lists include all land-owning villagers as of the date of the list (and sometimes include other information, such as landholding size and location). CSPC will augment the village list with phone numbers3 through its network of village extension volunteers (local villagers who carry out simple organizational tasks for a small stipend).

2Prone-to-saline indicates average total dissolved salt (TDS) concentration >500mg/L, partially saline from 1000 to 2000 mg/L, and fully saline > 2000 mg/L. 3A survey funded by CSPC in Talaja and neighboring Gogha block in 2016 revealed 78% of farming households owned at least one phone.

135 Random sampling will be conducted as follows. First, each name on a village list will be assigned a random number. Surveyors will call and/or visit individuals on the list, in the order of the number assigned, to determine if the primary agricultural decision-maker (PAD) in the household meets the study eligibility criteria, and is willing to participate in the study. Surveyors will then visit the eligible and willing PADs to obtain informed consent for enrollment in the study. In order to be eligible for the study, the household’s PAD must meet the following criteria:

Inclusion Criteria

• Also be the primary agricultural decision maker for the land owned by the household.

• In the previous Rabi (winter) season:

– Must have planted crops.

– Must have irrigated at least one farm using primarily groundwater.

– Must have solely used an electric-powered pump for this purpose.

• Must plan to farm and irrigate during the next Rabi season.

Exclusion Criteria

• May not have a diesel pump in use on the primary water source.

• May not have multiple pumps, or pump starters, in use on the primary irrigation source (i.e. well).

Farmers who meet the eligibility criteria, complete a baseline survey, and consent to the full study will be enrolled.

3.3.3 Sample size

We plan to draw a total sample of 2,200 well-owning farmers, equally divided between treatment and control groups. These choices are based on power calculations using meter reading data from a pilot of 90 farmers (the pilot is more fully described in Section 3.6). Our primary object of interest is the intent-to-treat estimate: the average treatment effect of eligibility for conservation credits on the duration of pump operation. We measure this as the difference in group means of total pumping

136 hours after partialing out strata and month effects. This is likely conservative, since individual-level time-varying covariates may be able to improve efficiency. First, we divide the sample between treatment groups equally, since the primary object of interest is a simple difference in means. Assuming equal variance in the outcome variable across treatment groups (which our pilot data cannot reject), power is maximized at a treatment proportion of 0.5. Second, we calculate the sample size required for a minimum detectable effect of 10 percent of the sample mean pumping hours, at a power of 0.9. This is the sample size required to reject a null hypothesis of no effect with 90 percent probability when the true effect is 10 percent of the mean. We choose 10 percent because it is a salient and quantitatively reasonable threshold. From pilot data, the conditional variance of total pumping hours (i.e., after partialing out several baseline covariates) is 0.47. The required sample size is then 1,990. Allowing for an approximately 10 percent attrition rate,4 the total sample to be recruited is 2,200. Figure 3-3 plots full power curves using the pilot data for a range of sample sizes and minimum detectable effects.

3.3.4 Randomization

Randomization will be stratified by village and forecasted hours of irrigation. Specifically, the final sample within each village will be divided into above- and below-median forecasted hours of irri- gation, creating two equally-sized cells in each village. Farmers in each cell will then be randomly allocated to one of two treatment arms using a pseudo-random number generator (Stata software): 50% will be allocated to the Conservation Credit arm, and 50% to the Control arm. Within the Con- servation Credit arm, we will cross-randomize both the benchmark below which individuals are in- centivized between a high and low option, and the size of the marginal conservation incentive between a high and low option, resulting in four equally-sized sub-treatments (Figure 3-4). The intervention will run for approximately 18 months, from randomization in late summer 2018, until the final meter readings are completed in February, 2020.

4This is larger than the 2 percent attrition in the pilot; we are conservative about attrition due to the increased duration of the planned experiment compared to the pilot, to 18 months from 5 months.

137 3.3.5 Interventions

Conservation credits

Participants in the Conservation Credits arm will have an hours-of-use meter installed on the electric pump starter of their primary irrigation source. The meter measures the total hours of irrigation done by the farmer. Meters will be read monthly by CSPC village extension volunteers. Farmers will be incentivized for conserving water for five months of the Rabi season, from September-January, across two consecutive years. This is the period of peak irrigation; as there is typically no rainfall during Rabi, agriculture is entirely dependent on irrigation. At each meter reading, farmers are informed of their benchmark for the following month, and the payment for the previous month is calculated. Payments are awarded at a fixed rate for consuming fewer hours of irrigation than the monthly benchmark, according the formula:

Paymentit = max(0,pricei × ((hours benchmark)it − (hours consumed)it )) (3.2)

where pricei is the per-hour incentive rate, (hours benchmark)it is an individual-month-specific bench- mark, and (hours consumed)it is the monthly meter reading. The payments are later disbursed as checks.

Conservation Credit Sub-treatments

The four Conservation Credits sub-treatments differ along two dimensions: the per-hour incentive rate, and the benchmark. Individuals assigned a high price receive 40 INR (0.61 USD) per hour conserved, and those assigned a low price receive 20 INR (0.31 USD) per hour conserved. The prices were chosen to be realistic estimates of the groundwater price that a policymaker might wish to set. The prices are similar to the cost of electricity provision for the median farmer, which is approximately 30 INR per hour of use (authors’ calculation).5 Individuals assigned the high and low benchmark receive 125% and 75%, respectively, of their forecasted monthly hours of irrigation. The monthly hours of irrigation forecast used to set bench- marks (and to stratify randomization) will be created from baseline survey data collected before the

5A rate of 30 INR per hour is approximately equal to the unsubsidized average cost of electricity supply in Gujarat for the power rating of a typical pumpset in the pilot region. That is: (5 INR/kWh average cost of electricity provision in Gujarat) * (6.2 HP average pump brake power) / (74% typical motor efficiency) * (0.75 kW/HP conversion factor) ≈ 31 INR/hr.

138 program is introduced. The survey collects the self-reported duration of irrigation in the previous year’s Rabi season by asking for the number of irrigations made during Rabi, the average duration of each irrigation, and the first and last irrigation dates. This method of constructing benchmarks is potentially liable to manipulation: if farmers know how survey answers will be used, we might expect farmers to artificially inflate their reported irrigation to increase their expected conservation credit payment. Because farmers will not yet know the program details at baseline, we do not fore- see manipulation in our setting. However, any scaled-up program would have to rely on a different method of benchmark setting, such as collecting actual usage data.

Control

Participants in the Control arm will also have an hours-of-use meter installed and read monthly for 18 months. However, these farmers will not be incentivized for conservation.

3.4 Data

3.4.1 Data collection

In order to conduct our analysis, we will collect four datasets. First, we will conduct a baseline survey with both self-reported and field measurement components prior to randomizing participants into treatments. Self-reported data will include demographic and socioeconomic characteristics, such as landholding size and household size; cropping, crop management, and irrigation decisions in the previous year; the power of the primary pumpset, and water conservation strategies and attitudes. Field measurements will include the precise geolocation, depth-to-water and salinity levels (i.e., total dissolved solids) of each well on the participant’s largest farm. All data will be collected electronically through tablet surveys. Second, we will directly measure groundwater pumping for all study participants, using hours- of-use meters installed on the pump starter of each participant’s primary irrigation source.6 Village extension workers will record meter readings each month using a digital tablet survey. Meter data quality will be assured through random audits, in which a research associate will compare the digitally recorded meter readings with dated, geo-located photographs of the meter dial included on the tablet

6Analog hours-of-use meters manufactured by Nishant Engineers (model: NE53/6S).

139 survey. Our third and fourth datasets will be collected in two endline surveys, one after each Rabi sea- son. These two tablet-based surveys will collect the same field measurements and retrospective self- reported information as the baseline survey, as well as specific questions on changes to irrigation behavior and new technology adoption. Finally, we will collect a supplementary dataset of water and energy consumption, measured simultaneously with hours of use. We will use this dataset us to calibrate pump and motor efficiencies under realistic conditions similar to our study sample, which in turn will allow us to convert between our measured hours of irrigation and the water and energy equivalents. For each major pump and motor type in our sample, we will select small calibration sub-samples and compare hours of irrigation with readings from both a portable electricity meter and an ultrasonic water flow meter.

3.4.2 Data processing

To assess the response of groundwater irrigation to water prices, our primary outcome will be monthly hours of groundwater irrigation. Hours of irrigation during each meter-reading period will calculated as the difference between total hours consumed at the end and beginning of the period. For individuals whose meters have been disconnected following the drying of a well, hours will be recorded as usual (i.e. according to the meter dial). For individuals whose meters are otherwise tampered with (e.g. if the meter is disconnected or broken but the well is not dry), hours will be recorded as missing. Because meter-reading periods may vary slightly over time and across individuals, we will normalize the measured hours of irrigation in each period by the number of days in the period. We will construct two secondary outcomes, water consumption and energy consumption, from hours of irrigation using the conversion formulas in Equation 3.1. Pump and motor efficiencies will be imputed based on pump and motor type from the dataset of pump and motor efficiencies, pump power will be collected in surveys, and monthly depth-to-water will be interpolated from baseline and endline measurements. Other secondary outcomes will be measured in each of the two endline surveys. One group of outcomes measures the environmental and economic follow-on effects of water conservation. We will assess environmental impacts through measured water depth and salinity levels in the metered wells, and economic impacts through self-reported crop yields for selected crops, crop revenue, and

140 farm profits in the Rabi season. Crop yields will be measured through questions asking the total kg of crops harvested, crop revenue will be measured as self-reported price per kg times quantity sold, and total profit will be measured both through a direct elicitation and by subtracting total reported costs from total revenues. Another group of outcomes sheds light on the mechanisms through which water conservation may be affected: through adopting efficient irrigation technology, shifting to less water-intensive cropping patterns, or simply through irrigating less. To measure irrigation technology adoption, we will col- lect self-reported data on technological water conservation measures taken (such as micro-irrigation, alternate furrow irrigation, or mulching). We then create two technological water conservation out- comes: a dummy variable for whether any measure was taken, and an index of z-scores following Kling et al. (2007) where we set less intensive measures to 1 if a more intensive but mutually exclu- sive measure was taken (e.g. if alternate furrow is used, we set furrow to 1, since alternate furrow is a more intensive conservation strategy than furrow alone and the two are mutually exclusive). To measure changes in the water-intensity of planted crops, we both measure gross cropped area, and we will create an index of crop water requirements using data on cropped area and crop choice, param- eterized by agronomic estimates of optimal water application rates. To understand the margins over which individuals adjust irrigation, we will measure irrigated area, irrigation frequency, and irrigation intensity. A third group of outcomes investigates the possibility that conservation credits, as implemented on only one well, could cause farmers to substitute to other wells or water sources – a form of what is known as “leakage” in the PES literature. We will assess these substitution patterns with five outcomes: an indicator for the use of any other non-metered irrigation source, the area irrigated from other ground and surface water sources, and an estimate of irrigation water volume drawn from other ground and surface water sources (derived by multiplying together irrigated area, irrigation frequency, and irrigation intensity).

3.5 Analysis Plan

After checking for balance between treatment groups and attrition status, our analysis will proceed in three steps. First, we will report evidence on how individuals respond to groundwater prices through intent-to-treat (ITT) analysis of the conservation credits intervention as a whole. Second, we will

141 estimate a model of demand for groundwater irrigation, using the price variation induced by our experiment in an instrumental variables strategy. Third, we will analyze the cost-effectiveness of the intervention from the perspective of a budget-constrained electric utility, by estimating the cost per unit of energy conserved.

3.5.1 Balance checks

Treatment Groups We will report baseline characteristics of the experimental sample in each treat- ment group including, but not necessarily limited to: household size, age of primary agricultural decision-maker, farm size (hectares), primary Rabi crop (including long-duration cotton planted be- fore Rabi onset), irrigation technology, irrigation intensity, the power of the metered pumpset, and the depth-to-water and total dissolved solids in the metered well. We will test for balance using Wald test for the hypothesis that there is no difference between any of these variables in treatment and control groups.

Attrition. We will next test for potential differential attrition in treatment and control in each of our hours-or-irrigation and endline datasets. First, we will assess whether missing observations in each dataset are equally prevalent in treatment and control groups. Second, we will check for balance in missing and non-missing data using the same baseline characteristics as in our treatment balance checks. In the case that we find evidence of differential attrition of either kind, we will report bounds for all treatment effects following Lee (2009).

3.5.2 Program evaluation via intent-to-treat estimates

Primary statistical model. We will first report intent-to-treat (ITT) estimates of the effects of the conservation credits intervention as a whole, regardless of the specific sub-treatments. These esti- mates can be interpreted as a reduced-form measure of whether individuals respond to water prices. Each outcome variable will be compared between treatment (all farmers receiving any form of con- servation credits) and control (farmers not receiving conservation credits). We will use ordinary least squares to estimate a monthly panel regression of the following form:

0 Yit = α + β(Conservation Credits)i + Xit γ + µt + εit , (3.3)

142 where Yit is an outcome variable for farmer i in month t, (Conservation Credits)i is an indicator for being in one of the conservation treatment groups, µt are month fixed effects, and Xit is a vector of individual-specific covariates (with time subscripts because some will be interacted with month indicators). Covariates will include stratification variables (village indicators and an indicator for being above or below median forecasted hours of irrigation) interacted with month indicators. Covariates will also include baseline characteristics; to avoid overfitting and cherry-picking, we will use the double- LASSO method (following Belloni et al., 2013) to choose covariates from a high-dimensional set of variables derived from the baseline survey and other pre-randomization data. As a secondary specification, results will also be shown without these baseline characteristics. In all models, we will use cluster-robust standard errors, clustering by individual, to correct for arbitrary correlation in outcomes within individual across months. To increase power, we will exclude any month in which less than 50% of the control group has nonzero, non-missing observations – we try to schedule our program in months where pumping is the norm, but an unusual rainfall pattern may lead to unexpectedly low pumping in program months.

Primary outcome variable. The primary outcome variable will be hours of pump operation in each month of meter reading. Because the true functional form of the treatment effect is unknown, we will consider three specifications of this outcome variable. The first two are the untransformed hours of operation and its inverse hyperbolic sine (Asinh) transformation. These transformations will estimate the treatment effect more precisely if it is, respectively, constant (i.e., everyone reduces by an equal number of hours) or proportional (i.e., everyone reduces by an equal percentage of their hours absent the intervention). Regression coefficients from an Asinh transformation can be interpreted similarly as a natural log transformation (i.e., as proportional changes). We choose the Asinh over the natural log because the Asinh admits zeroes, unlike the natural log, albeit in a particular functional form. We will also consider a third specification in which the dependent variable is the natural log of total hours of pump operation across all months of potential payments within each year, and farmers who never pump at all (likely a small number) are excluded. This specification models a propor- tional treatment effect but reduces the influence of the functional form choice for handling individual monthly zeros, by combining zero and nonzero observations into a single total. The downside is that this specification may cost some power; the time period is interpreted as one year instead of one

143 month, so covariates cannot vary by month. We are interested in both whether the program had an effect and the quantitative magnitude of the effect. To answer the first question, we will conduct one-sided t-tests (α = 0.05), in which the alternative hypothesis (Ha) is that the program had a negative effect on pumping duration, and the null hypothesis (H0) is that the program had a zero or positive effect. Because having three versions of the primary outcome constitutes multiple hypothesis testing, we will adjust the p-values using the free step-down approach of Westfall and Young (1993) following Kling et al. (2007). To reduce the influence of outliers, final variables (after any transformations) will be winsorized by replacing extreme outliers with the next most extreme value. We define extreme outliers as values exceeding the third quartile plus three times the interquartile range, of the nonzero values of the same variable.

Secondary statistical models. In addition to the basic linear regression specification, we will show results from three other models. First, to investigate seasonal patterns in treatment effects, we will augment the primary regression to include time-varying (i.e., month-specific) treatment effects. Sec- ond, to more explicitly distinguish between the intensive and extensive margin of irrigation, we will apply a dynamic unobserved effects Tobit Type II model for censored values. Third, to reduce reliance on functional form, we will study the treatment effects across the full distribution using quantile regressions. We will simultaneously estimate quantile treatment effects at the 19 quantiles {0.05,0.10,...,0.95}, obtaining standard errors via bootstrap to account for correlation across quan- tiles, using the following form:

0 QYit (τ) = α + β(Conservation Credits)i + Xit γ + µt + εit (3.4)

Secondary outcome variables. In addition to the primary outcome variable, we will use Equation 3.3 to examine the effects of the intervention on several other outcomes. These will enable us to better understand (1) the impacts as measured in units of energy and water, (2) whether the intervention has any measurable environmental or economic impacts, (3) the particular mechanisms through which farmers reduce water consumption, and (4) whether the intervention induced substitution (“leakage”) to other water sources. The precise variables follow, with details of construction in Section 3.4.2:

1. Unit conversions: Implied energy consumption; implied water consumption (for each: monthly,

144 monthly Asinh, and natural log of yearly totals).

2. Environmental impacts: Depth to water level; total dissolved solids.

3. Economic impacts: Crop revenue; farm profits (for each: level, Asinh, and both per hectare).

4. Mechanisms: Water conservation measures; gross cropped area; crop water intensity; gross irrigated area; irrigation frequency; irrigation intensity.

5. Leakage: Use of other irrigation sources; gross area irrigated by other sources, water volume used for irrigation from other sources.

Because secondary outcomes will be measured annually following each Rabi season, in these regressions, the time period in Equation 3.3 will be interpreted as one year instead of one month. For the outcomes in the unit conversion and economic impact categories, we will adjust p-values category- wise using the same method as for the primary outcomes. For the outcomes in the environmental impact, mechanism, and leakage categories, we plan not to adjust the p-values, because each outcome answers a different question, and they are not measures by which we will judge the overall success of the intervention. All variables will be winsorized in the same way as the primary variables.

Heterogeneity. We also will analyze treatment effect heterogeneity according to baseline charac- teristics in order to further assess mechanisms of conservation and the feasibility of better targeting the program. These analyses will involve a variation of the regression in Equation 3.3, in which the treatment indicator is interacted with an exhaustive set of indicators in a particular category. Three of these categories will be: (1) whether the farmer is a medium or large landowner (defined as owning more than 2 hectares of land), (2) whether the farmer had previously invested in micro-irrigation tech- nology such as sprinkler or drip irrigation, and (3) whether the farmer shares their primary irrigation source with others. Two additional heterogeneity analyses will serve as indirect tests of leakage. The idea of these tests is to check whether the treatment effect is larger in areas where there are more opportunities for leakage to other groundwater sources. Specifically, we would expect to see more leakage in areas where more un-priced groundwater is available. In these tests, we will interact the treatment indicator with (1) an indicator for whether the farmer has unmetered groundwater sources (a proxy for the availability of on-farm unpriced groundwater), and (2) the number of neighboring farms that

145 have a groundwater source but are not enrolled in the conservation credit treatment (a proxy for the availability of off-farm unpriced groundwater). In both cases, a positive interaction coefficient is evidence of leakage. Unlike the direct tests, which compare water use from secondary groundwater sources among treated and untreated farmers, the indirect tests have the advantage that they do not rely on self-reported data.

3.5.3 Demand estimation via instrumental variables

Our second analysis will estimate a model of demand for groundwater irrigation. We will use a linear regression to predict duration of pump operation:

0 Yit = α + β pit + Xit γ + µt + εit (3.5)

where pit ∈ {0,20,40} indicates the marginal cost of an hour of irrigation for farmer i in month t, and µt are month fixed effects. Again, Xit is a vector of individual-specific covariates includ- ing stratification variables interacted with month indicators, plus baseline characteristics chosen by double-LASSO. Standard errors will be clustered by individual. We will estimate Equation 3.5 by two-stage least squares to correct for endogeneity in price. Note that while Control farmers always face a price of 0, Conservation Credits farmers in the low price sub- treatments face a price of either 0 or 20, and those in high price sub-treatments face a price of either 0 or 40, depending on whether their consumption is above or below their benchmark. This introduces endogeneity into Equation 3.5: in the Conservation Credit treatment, positive consumption shocks εit are mechanically correlated with zero prices, biasing OLS estimates of β downward. To boost precision in this estimate while avoiding overfitting and weak instruments concerns, we will apply the instrumental variables LASSO method of Belloni et al. (2012). Our set of candidate instruments will consist of indicators for each of the four conservation credit sub-treatments, and their interactions with baseline characteristics. The final set of instruments will be chosen from this candidate set by the algorithm. As a secondary specification, we will also show results using only the four sub-treatment indicators as instruments. The IV estimate of β can be interpreted as a local average treatment effect of our experimental price variation on those farmers whose marginal consumption is priced. Our methodology is in the spirit of quasi-experimental estimates of the elasticity of taxable income from non-linear budget sets

146 (as summarized by Saez et al., 2012) and of electricity demand (Ito, 2014). The exclusion restriction for these instruments is that the Conservation Credit sub-treatment does not affect consumption except through the actual price of irrigation. This assumption will be vio- lated if the Conservation Credit sub-treatments affect consumption even for farmers who do not face positive marginal incentives in a given month – for example, if they attempt to conserve below the benchmark but fail to reach their target. This is one limitation of our empirical strategy.

Outcome variables. The primary outcome variable will be hours of pump operation as measured in meter readings. We will consider the same three specifications as in the intent-to-treat analysis, adjusting p-values in the same way: (1) the untransformed measure in each month, (2) an inverse hyperbolic sine (Asinh) transformation, and (3) the natural log of the yearly total hours. Secondary outcome variables will be the unit conversions: implied water consumption, and implied energy con- sumption.

Heterogeneity analysis. We will again explore the first three dimensions of heterogeneity for de- mand as for intent-to-treat effects: farm size, micro-irrigation technology, and well sharing.

3.5.4 Cost-effectiveness

Our third analysis will consider the cost-effectiveness of the conservation credits intervention as im- plemented in the study. Because groundwater conservation yields the side benefit of reduced electric- ity demand, a conservation credits program could be implemented by a budget-constrained electric utility under one condition: that the cost of the energy conserved is larger than the cost of the pro- gram. We explore the viability of this idea through three questions. For each, we will report answers for the program as a whole, as well as the low price treatment group alone (discarding data from the high price treatment group).

Question 1. What is the minimum marginal cost of electricity for which the program could be imple- mented by an electric utility with a budget-balance constraint?

For each unit of energy conserved through the program, an implementing electric utility reduces its costs by the marginal cost of electricity. Therefore, the minimum marginal cost (for which the program is viable) is equal to the average cost of the program per unit energy conserved. This will be

147 calculated using the following formula:

Total monthly expenditures Cost per unit energy conserved = (3.6) Total monthly energy conserved

Total monthly expenditures will be tabulated from program data. To estimate total monthly energy conserved, we will multiply the linear treatment effect of the program on energy consumption (from the intent-to-treat analysis) by the number of participants in the treatment group. To obtain a confi- dence interval that allows for correlation between the numerator and denominator, we will bootstrap the treatment effect and expenditures together, stratifying resampling draws on treatment assignment. Note that a limitation of this confidence interval is that it will ignore uncertainty in pump efficiencies.

Question 2. At the best estimate of actual marginal costs faced by electric utilities in India, could the program be implemented by an electric utility with a budget-balance constraint?

To answer this question, we will conduct a literature review on the marginal costs of electric- ity provision in India and arrive at a best estimate of the typical marginal cost prior to performing these calculations. Then, we will conduct a one-tailed bootstrap test (α = 0.05), where the null hy- pothesis that the cost per unit energy conserved is greater than or equal to this marginal cost, while the alternative hypothesis is that the cost per unit energy conserved is less than this marginal cost (revenue-positive). Ignoring uncertainty in the marginal cost estimate, we will draw 1,000 bootstrap samples and count the number in which the cost per unit energy conserved exceeds the marginal cost of electricity. The null hypothesis is rejected if this condition is met for fewer than 950 draws (95 percent).

Question 3. What is the minimum subsidy per unit of energy that an electric utility with a budget- balance constraint would require to implement the program?

We will consider this question only if the answer to Question 2 is no (i.e., we fail to reject the null hypothesis). Even if a conservation credits program does not pay for itself through electricity cost savings, a government placing social value on groundwater conservation may be willing to subsidize the program. We will calculate the minimum necessary subsidy by subtracting the estimated cost of the program per unit energy conserved from the best estimate of the marginal cost of electricity. Again ignoring uncertainty in the marginal cost estimate, the confidence interval will be calculated in the same way as in Question 1.

148 3.6 Pilot Results

We conducted a small pilot of this experiment among 90 farmers in three villages in a nearby district of the same state (Khambhalia, Gujarat). This pilot informed the experimental design in four ways.

Demonstrates logistical feasibility. The pilot shows that the intervention can be successfully im- plemented among a similar population as the experimental sample. First, farmers were broadly will- ing to participate, voluntarily accept hours-of-use meters and agree to monthly meter readings. Of 144 farmers randomly sampled from village rosters, 100% agreed to allow us to install a meter on their pump. Of 90 farmers meeting eligibility criteria, meters were successfully installed for 100%. Of the same group, one withdrew during the intervention, yielding a 99% completion rate. Second, meter tampering is difficult and appeared to be minimal. The meter itself is sealed, with no controls other than a reset button (which can be easily detected after a first reading). Disconnection is not simple and leaves indications in the form of uncoiled wires; only two farmers showed evidence of having disconnected and reconnected in the same month. Third, farmers appear to understand the program; during the initial intervention visit, farmers were asked questions designed to measure com- prehension and corrected if necessary; surveyors reported a subjective assessment that most farmers understood the program very well.

Improvements in intervention design. The pilot yielded several ideas for improving the effective- ness and power of the intervention that we will incorporate into the experiment. First, 20 percent of farmers permanently disconnected their meters following their last irrigation of the season. However, disconnecting the meter disqualified treatment group farmers from receiving payments, so the dis- connections were highly concentrated in the control group (14 of 18). To ensure accurate data from the control group, all farmers will be offered a small financial reward to keep their meter connected through the end of the meter-reading period. Second, the experiment will focus on months and geographical regions in which the vast majority of farmers have access to groundwater (i.e., without deepening a well). In the pilot, 29 percent of meter readings showed zero consumption, a pattern that rose to 50 percent by the end of the pilot. Discussion with farmers revealed that many had stopped pumping because their well had gone dry. These zeros substantially reduced statistical power (by increasing the variance of the outcome vari- able), and paying farmers whose well had gone dry was perceived to be unfair by the implementing

149 partner. For the experiment, the geographical region was chosen in part because it has more reliable water availability. In addition, conservation credits will be paid during a more limited number of months (those in which a large majority of farmers are known to irrigate crops).

Sample size calculations. Neither water consumption nor our proxy, hours of pump operation, is often measured at the farm level in India, and so our pilot measurements represent a contribution in themselves. Figure 3-5 plots the full distribution of monthly measurements of pump operation time across all farmers in our pilot. The variance is much larger than initially expected, which informed our power calculations and led us to revise upward the sample size of the experiment.

Suggests conservation credits may yield the expected effects. While the pilot has low statistical power and cannot yield precise results, analysis is not inconsistent with the intervention having the expected direction of response and a large effect magnitude. Figure 3-6 plots the mean number of hours pumped per month of the pilot. Before the price incentive was introduced, farmers in the treatment group pumped for more mean hours than those in control; after conservation credits began, the treatment group pumped for fewer mean hours than control each month (although none of these differences are statistically significant). To quantify these differences, Table 3.1 shows the results of linear regressions following Equation 3.3, with each column including a different set of covariates. Point estimates suggest that eligibility for conservation credits induces a practically large reduction in pumping hours: a 32 percent de- crease on a control-group mean of 38 hours per month. Although these point estimates are imprecise (confidence intervals include both zero and some positive values), they appear to be stable across specifications.

3.7 Conclusion

This chapter presents an experimental protocol to estimate the demand response to groundwater pric- ing in irrigated agriculture in a region of Gujarat, India. To estimate demand, the study will introduce random variation in prices through an intervention that offers payments for groundwater conserva- tion relative to a benchmark quantity. We show how to use our demand estimate - given a marginal social damages function - to calculate the optimal Pigouvian groundwater tax in our setting. The op-

150 timal quantity of groundwater conservation could be achieved through a marginal incentive on either agricultural electricity, groundwater, or duration of pump operation. Our study will also evaluate the effectiveness of conservation credits, a second-best policy solu- tion similar to “payments for environmental services” programs. In many settings, Pigouvian taxes may be politically infeasible. By exchanging corrective subsidies for taxes, this program overcomes the political barriers to taxing the agricultural sector, while still introducing marginal incentives for conservation. This may be a promising policy approach for reducing inefficient groundwater extrac- tion. Conservation credits, however, are generally not efficient, unless benchmarks can be perfectly targeted or revenue constraints do not bind. Some farmers are likely to be extra-marginal: their ex- traction is so far beyond their benchmark that they do not benefit from conservation. This raises another important question: what is the optimal conservation credit program that a donor or govern- ment would be willing to implement? Evaluating this question, given the goals and constraints of a funder, depends not only on aggregate groundwater demand, but also on the distribution of utilization across farmers. Future research may be able to use variation in the program design parameters, like that introduced in this study, to make progress on this question.

151 3.8 Figures

Price

D(q) = PMB(q) – PMC(q)

D(q) – p* SMD(q)

p*

q* q0 Quantity Extracted

Figure 3-1: Price regulation for groundwater management.

This figure shows how price regulation can be used to achieve the optimal groundwater quantity extracted. Inverse demand for groundwater D(q) is the difference between private marginal benefits PMB(q) and private marginal costs PMC(q); groundwater extraction also creates social marginal damages SMD(q). Without regu- lation (i.e., at a price of zero), irrigators will consume the amount where demand meets the x-axis, q0. When the price is set to p∗, the value of social marginal damages when it equals demand, irrigators will internalize the social damages, shifting effective demand down such that they instead consume the optimal quantity q∗.

152 Payment

maximum IC A payment 1 Slope: p 1

A IC0

ICB

q A q A benchmark qB Quantity 1 0 Quantity Conserved Extracted

Figure 3-2: Budget set of conservation credits.

This figure shows the general form of the budget set created by a conservation credit program, along with indifference curves of two representative participants. The payment equals the price p times the quantity units conserved below the benchmark, up to a maximum payment. Irrigator A is marginal and will respond to the program by reducing quantity extracted. Irrigator B is extra-marginal, and does not change quantity extraction in response to the program.

153 Power calculations: Conservation Credits Model: Linear OLS. 1 .8 n=3000 n=2000 n=1000 .6 .4 .2 Power (probability of rejecting a zero effect) 0 0 .05 .1 .15 .2 .25 Minimum detectable effect (proportion of sample mean) Variance of meter reading totals from pilot of 90 farmers, after partialing out baseline controls.

Figure 3-3: Power curves for a range of sample sizes and minimum detectable effects.

154

Sample Frame Listed in village talati lists (n= )

Assessed for eligibility (n= )

Excluded (n= ) ♦ Ineligible (n= ) Enrollment ♦ Refused consent (n= ) ♦ Incomplete Baseline (n= ) ♦ Other reasons (n= )

Randomized (n= )

Allocated to Control (Pr=.5, n= ) Allocation Allocated to Conservation Credits (Pr=.5, n= ) Subtreatments: ♦ Low Price, Low Benchmark (Probability = .25) ♦ Low Price, High Benchmark (Probability = .25) ♦ High Price, Low Benchmark (Probability = .25) ♦ High Price, High Benchmark (Probability = .25)

Follow-Up Completed Endline (n= ) Completed Endline (n= ) Discontinued intervention (n= ) Discontinued intervention (n= ) Lost to follow-up (n= ) Lost to follow-up (n= )

Analysis Analysed (n= ) Analysed (n= ) ♦ Attrition (n= ) ♦ Attrition (n= ) ♦ Never farmed in Rabi (n= ) ♦ Never farmed in Rabi (n= )

Figure 3-4: Experimental design

Notes: This figure displays the design of a randomized experiment to estimate the demand response to agricul- tural groundwater prices.

155 oe:Ti gr lt h itga ftemnhyhuso rudae riainmaue nte2017- the Gujarat. in Khambaliya, measured in farmers irrigation 90 groundwater of of study hours pilot monthly our in the (October-February) of season histogram Rabi the 2018 plots figure This Notes:

iue3-5: Figure Density 0 .01 .02 .03 .04 .05 0 itiuino otl or fgonwtrirgto,poe cosmonths across pooled irrigation, groundwater of hours monthly of Distribution Distribution ofhourspumped(inpilotdata) 100 Monthly hourspumped 200 156 300 400 500 mn vrtewne 0721 rga.Tebr eoetesadr roso h mean. the of errors standard exper- the pilot denote our in bars farmers The among program. irrigation 2017-2018 groundwater winter of the hours over monthly average iment the plots figure This Notes:

Hours pumped 0 25 50 75 100 125 Oct 17 Hours pumpedbeforeandafterinterventionbegan Control iue3-6: Figure Water UseOverTime Nov 17 157 Month vn study Event Conservation Credits Dec 17 Jan 18 3.9 Tables

Table 3.1: Intent-to-treat effect of conservation credits program in pilot.

Monthly Pumping Hours (1) (2) (3) (4) (5) Conservation Credit treatment group -12.067 -12.126 -16.666 -12.032 -16.872 (12.071) (11.423) (13.305) (15.391) (15.672) Strata FE X X X X Month FE X X Sub-village FE X X Baseline controls X X Observations 270 270 270 258 258 Clusters 90 90 90 86 86 Standard errors clustered by individual.

158 Bibliography

Adamopoulos, Tasso, Loren Brandt, Jessica Leight, and Diego Restuccia, “Misallocation, Selec- tion, and Productivity: A Quantitative Analysis with Panel Data from China,” 2017.

Adhvaryu, Achyuta, Namrata Kala, and Anant Nyshadham, “Management and Shocks to Worker Productivity,” 2016.

Antmann, Pedro, “Reducing technical and non-technical losses in the power sector.,” Technical Report July 2009 2009.

Aragón, Fernando M. and Juan Pablo Rud, “Polluting Industries and Agricultural Productivity: Evidence from Mining in Ghana,” The Economic Journal, 2016, 126 (597), 1980–2011.

Arceo, Eva, Rema Hanna, and Paulina Oliva, “Does the Effect of Pollution on Infant Mortality Dif- fer Between Developing and Developed Countries? Evidence from Mexico City,” The Economic Journal, 2016, 126 (591), 257–280.

Association of California Water Agencies, “Recommendations for Improving Water Transfers and Access to Water Markets in California,” 2016.

Atkin, David and Dave Donaldson, “Who’s Getting Globalized? The Size and Nature of Intrana- tional Trade Costs,” 2015.

Ayres, Andrew B., Eric C. Edwards, and Gary D. Libecap, “How Transaction Costs Obstruct Collective Action to Limit Common-Pool Losses: Evidence from California’s Groundwater,” 2017.

Badiani, Reena and Katrina Jessoe, “Electricity Prices, Groundwater and Agriculture: The Envi- ronmental and Agricultural Impacts of Electricity ,” in Wolfram Schlenker, ed., Understanding Productivity Growth in Agriculture, University of Chicago Press, 2017.

Belloni, Alexandre, Daniel Chen, Victor Chernozhukov, and Christian Hansen, “Sparse Models and Methods for Optimal Instruments with an Application to Eminent Domain,” Econometrica, 2012, 80 (6), 2369–2429.

, Victor Chernozhukov, and Christian Hansen, “Inference on treatment effects after selection among high-dimensional controls,” Review of Economic Studies, 2013, 81 (2), 608–650.

Bergquist, Lauren Falcao, “Pass-through, Competition, and Entry in Agricultural Markets: Experi- mental Evidence from Kenya,” 2016.

Björnerstedt, Jonas and Johan Stennek, “Bilateral oligopoly - The efficiency of intermediate goods markets,” International Journal of Industrial Organization, 2007, 25 (5), 884–907.

159 Börner, Jan, Kathy Baylis, Esteve Corbera, Driss Ezzine de Blas, Jordi Honey-Rosés, U. Martin Persson, and Sven Wunder, “The Effectiveness of Payments for Environmental Services,” World Development, 2017, 96, 359–374.

Brainerd, Elizabeth and Nidhiya Menon, “Seasonal effects of water quality: The hidden costs of the Green Revolution to infant and child ,” Journal of Development Economics, 2014, 107, 49–64.

Brewer, Jedidiah, Robert Glennon, Alan Ker, and Gary Libecap, “2006 Presidential Address: Water Markets in the West: Prices, Trading, and Contractual forms,” Economic Inquiry, 2008, 46 (2), 91–112.

Bruno, Ellen M., “Agricultural Groundwater Markets: Understanding the Gains from Trade and the Role of Market Power,” Job Market Paper, 2018, pp. 1–54.

Bryan, Gharad and Melanie Morten, “Economic Development and the Spatial Allocation of Labor: Evidence From Indonesia,” 2015.

Buck, Steven, Maximilian Auffhammer, and David Sunding, “Land markets and the value of wa- ter: Hedonic analysis using repeat sales of farmland,” American Journal of Agricultural Economics, 2014, 96 (4), 953–969.

Burke, Marshall, Solomon M. Hsiang, and Edward Miguel, “Climate and Conflict,” Annual Re- views of Economics, 2015, 7, 577–617.

Bustos, Paula, Bruno Caprettini, and Jacopo Ponticelli, “Agricultural Productivity and Structural Transformation: Evidence from Brazil,” American Economic Review, 2016, 106 (6), 1320–1365.

California Department of Water Resources, “State Water Project Delivery Capability Report 2015,” 2015.

and U.S. Bureau of Reclamation, “Technical Information for Preparing Water Transfer Proposals (Water Transfer White Paper),” 2015.

California State Water Resources Control Board, “A Guide to Water Transfers,” 1999.

Carey, Janis, David L. Sunding, and David Zilberman, “Transaction costs and trading behavior in an immature water market,” Environment and Development Economics, 2002, 7 (04), 733–750.

Carey, Janis M. and David L. Sunding, “Emerging Markets in Water: A Comparative Institutional Analysis of the Central Valley and Colorado-Big Thompson Projects,” Natural Resources Journal, 2001, (Spring).

Cason, Timothy N. and Lata Gangadharan, “Transaction Costs in Tradable Permit Market: An Experimental Study of Pollution Market Design,” Journal of Regulatory Economics, 2003, 23 (2), 145–165.

Central Ground Water Board, “Ground Water Year Book - India 2012-13,” Technical Report, Min- istry of Water Resources, Government of India, Faridabad 2013.

Central Pollution Control Board, “Comprehensive Environmental Assessment of Industrial Clus- ters,” Technical Report December 2009.

160 , “Criteria for Comprehensive Environmental Assessment of Industrial Clusters,” Technical Report December 2009.

Chen, Yuyu, Avraham Ebenstein, Michael Greenstone, and Hongbin Li, “Evidence on the im- pact of sustained exposure to air pollution on life expectancy from China’s Huai River policy.,” Proceedings of the National Academy of Sciences of the United States of America, aug 2013, 110 (32), 12936–12941.

Collard-Wexler, Allan, Gautam Gowrisankaran, and Robin S. Lee, “"Nash-in-Nash" Bargaining: A Microfoundation for Applied Work,” 2017.

Culp, Peter W., Robert Glennon, Gary Libecap, George A. Akerlof, Roger C. Altman, Gordon S. Rentschler Memorial, and Laura D. Andrea Tyson, “Shopping for Water: How the Market Can Mitigate Water Shortages in the American West,” 2014.

Davis, Lucas W. and Lutz Kilian, “The Allocative Cost of Price Ceilings in the U.S. Residential Market for Natural Gas,” Journal of Political Economy, 2011, 119 (2), 212–241.

Do, Quy Toan, Shareen Joshi, and Samuel Stolper, “Can environmental policy reduce infant mor- tality? Evidence from the Ganga Pollution Cases,” Journal of Development Economics, 2018, 133 (September 2016), 306–325.

Donaldson, Dave, “Railroads of the Raj: Estimating the Impact of Transportation Infrastructure,” 2012.

Duflo, Esther, Michael Greenstone, Rohini Pande, and Nicholas Ryan, “Truth-Telling by Third- Party Auditors and the Response of Polluting Firms: Experimental Evidence from India,” The Quarterly Journal of Economics, 2013, pp. 1–47.

Fan, Jianqing and Irene Gijbels, “Local Polynomial Modelling and its Applications,” Monographs on Statistics and Applied Probability, 1996, 66.

Fishman, Ram, Upmanu Lall, Vijay Modi, and Nikunj Parekh, “Can Electricity Pricing Save India’s Groundwater? Field Evidence from a Novel Policy Mechanism in Gujarat,” Journal of the Association of Environmental and Resource Economists, dec 2016, 3 (4), 819–855.

Gangadharan, Lata, “Transaction Costs in Pollution Markets: An Empirical Study,” Land Eco- nomics, 2000, 76 (4), 601–614.

Gelman, Andrew and Guido Imbens, “Why High-Order Polynomials Should Not Be Used in Re- gression Discontinuity Designs,” National Bureau of Economic Research Working Paper Series, 2014, No. 20405.

Ghatak, Maitreesh and Dilip Mookherjee, “Land acquisition for industrialization and compensa- tion of displaced farmers,” Journal of Development Economics, 2014, 110, 303–312.

Glaeser, Edward L. and Erzo F. P. Luttmer, “The Misallocation of Housing Under Rent Control,” American Economic Review, 2003, 93 (4), 1027–1046.

Golden, Miriam and Brian Min, “Corruption and theft of electricity in an Indian state,” 2012.

161 Gonzalez-Alvarez, Yassert, Andrew G. Keeler, and Jeffrey D. Mullen, “Farm-level irrigation and the marginal cost of water use: Evidence from Georgia,” Journal of Environmental Management, 2006, 80 (4), 311–317.

Gray, Brian, Ellen Hanak, Richard Frank, Richard Howitt, Jay Lund, Leon Szeptycki, and Barton "Buzz" Thompson, “Allocating California’s Water: Directions for Reform,” 2015.

Greenstone, Michael and Rema Hanna, “Environmental Regulations, Air and Water Pollution, and Infant Mortality in India,” American Economic Review, oct 2014, 104 (10), 3038–3072.

Hahn, Robert W., “Market Power and Transferable Property Rights,” The Quarterly Journal of Economics, 1984, 99 (4), 753–765.

Hanak, Ellen, “Stopping the drain: Third-party responses to California’s water market,” Contempo- rary Economic Policy, 2005, 23 (1), 59–77.

and Elizabeth Stryjewski, “California’s Water Market, By the Numbers: Update 2012,” 2012.

Hendricks, Nathan P. and Jeffrey M. Peterson, “Fixed Effects Estimation of the Intensive and Extensive Margins of Irrigation Water Demand,” 2012.

Ho, Kate and Robin S. Lee, “Insurer Competition in Health Care Markets,” Econometrica, 2017, 85 (2), 379–417.

Howitt, Richard E., Jay R. Lund, K.W. Kirby, Marion W. Jenkins, and Andrew J. Draper, “Integrated Economic-Engineering Analysis of California’s Future Water Supply: a report for the State of California Resources Agency,” 1999.

Hsieh, Chang-Tai and Peter J. Klenow, “Misallocation and Manufacturing TFP in China and India,” Quarterly Journal of Economics, 2009, 124 (4).

Hudson, Sally, Peter Hull, and Jack Liebersohn, “Interpreting Instrumented Difference-in- Differences,” 2015.

Hussain, Intizar, Liqa Raschid, Munir A. Hanjra, Fuard Marikar, and Wim van der Hoek, Wastewater Use in Agriculture: Review of Impacts and Methodological Issues in Valuing Impacts 2002.

Ito, Koichiro, “Do Consumers Respond to Marginal or Average Price? Evidence from Nonlinear Electricity Pricing,” American Economic Review, 2014, 104 (2), 537–563.

Jack, B. Kelsey and Elsa Cardona Santos, “The leakage and livelihood impacts of PES contracts: A targeting experiment in Malawi,” Land Use Policy, 2017, 63, 645–658.

and Grant Smith, “Charging Ahead: Prepaid Electricity Metering in South Africa,” NBER Work- ing Paper Series, 2016.

Jayachandran, Seema, “Air Quality and Early-Life Mortality Evidence from Indonesia’s Wildfires,” Journal of Human Resources, 2009, 44 (4).

162 , Joost De Laat, Eric F. Lambin, Charlotte Y. Stanton, Robin Audy, and Nancy E. Thomas, “Cash for carbon: A randomized trial of payments for ecosystem services to reduce deforestation,” Science, 2017, 357 (6348), 267–273.

Jenkins, Marion W., Jay R. Lund, and Richard E. Howitt, “Using Economic Loss Functions to Value Urban Water Scarcity in California,” American Water Works Association, 2003, 95 (2), 395–429.

Keiser, David A. and Joseph S. Shapiro, “Consequences of the Clean Water Act and the Demand for Water Quality,” 2017.

Khai, Huynh Viet and Mitsuyasu Yabe, “Impact of Industrial Water Pollution on Rice Production in Vietnam,” in “International Perspectives on Water Quality Management and Pollutant Control” 2013.

Kling, Jeffrey R., Jeffrey B. Liebman, and Lawrence F. Katz, “Experimental Analysis of Neigh- borhood Effects,” Econometrica, 2007, 75 (1), 83–119.

Kyle, Albert S., “Informed Speculation with Imperfect Competition,” The Review of Economic Stud- ies, 1989, 56 (3), 317–355.

Lee, David S., “Training, wages, and sample selection: Estimating sharp bounds on treatment ef- fects,” Review of Economic Studies, 2009, 76 (3), 1071–1102.

Lee, Kenneth, Edward Miguel, and Catherine Wolfram, “Experimental Evidence on the Eco- nomics of Rural Electrification,” Working Paper, 2018.

Lindhjem, Henrik, Tao Hu, Zhong Ma, John Magne Skjelvik, Guojun Song, Haakon Vennemo, Jian Wu, and Shiqiu Zhang, “Environmental economic impact assessment in China: Problems and prospects,” Environmental Impact Assessment Review, 2007, 27 (1), 1–25.

Liski, Matti, “Thin versus Thick CO2 Market,” Journal of Environmental Economics and Manage- ment, 2001, 41, 295–311.

and Juan Pablo Montero, “Market Power in an Exhaustible Resource Market: The Case of Storable Pollution Permits,” Economic Journal, 2011, 121 (551), 116–144.

Malueg, David A. and Andrew J. Yates, “Bilateral oligopoly, private information, and pollution permit markets,” Environmental and Resource Economics, 2009, 43 (4), 553–572.

McRae, Shaun, “Infrastructure Quality and the Subsidy Trap,” American Economic Review, 2015, 105 (1), 35–66.

Medellín-Azuara, Josué, Julien J. Harou, Marcelo A. Olivares, Kaveh Madani, Jay R. Lund, Richard E. Howitt, Stacy K. Tanaka, Marion W. Jenkins, and Tingju Zhu, “Adaptability and adaptations of California’s water supply system to dry climate warming,” Climatic Change, 2007, 87.

Meenakshi, JV, Abhijit Banerji, Aditi Mukherji, and Anubhab Gupta, “Does marginal cost pric- ing of electricity affect groundwater pumping behaviour of farmers? Evidence from India,” 3ie Impact Evaluation Report, 2013, 4.

163 Mérel, Pierre and Richard Howitt, “Theory and Application of Positive Mathematical Program- ming in Agriculture and the Environment,” Annual Review of Resource Economics, 2014, 6 (1), 451–470.

Ministry of Agriculture of the Government of India, “Agriculture Census 2010-11,” 2014.

Moretti, Enrico, Local Labor Markets, Vol. 4, Elsevier B.V., 2011.

Mukherjee, M. and K. Schwabe, “Irrigated Agricultural Adaptation to Water and Climate Vari- ability: The Economic Value of a Water Portfolio,” American Journal of Agricultural Economics, 2014, 97 (2009), 809–832.

Murty, M.N. and Surender Kumar, “Water Pollution in India: An Economic Appraisal,” in “India Infrastructure Report” 2011.

Northeast Group LLC, “Emerging Markets Smart Grid: Outlook 2015,” Technical Report 2014.

Pattanayak, Subhrendu K., Sven Wunder, and Paul J. Ferraro, “Show me the money: Do pay- ments supply environmental services in developing countries?,” Review of Environmental Eco- nomics and Policy, 2010, 4 (2), 254–274.

Peterson, Deborah, Gavan Dwyer, David Appels, and Jane Fry, “Water trade in the southern Murray-Darling basin,” Economic Record, 2005, 81 (255), S115–S127.

Qureshi, M. Ejaz, Tian Shi, Sumaira E. Qureshi, and Wendy Proctor, “Removing barriers to facilitate efficient water markets in the Murray-Darling Basin of Australia,” Agricultural Water Management, 2009, 96 (11), 1641–1651.

Reddy, V. Ratna and Bhagirath Behera, “Impact of water pollution on rural communities: An economic analysis,” Ecological Economics, 2006, 58 (3), 520–537.

Registrar General and Census Commissioner of India, “Census of India 2001,” 2001.

, “Census of India 2011,” 2011.

Regnacq, Charles, Ariel Dinar, and Ellen Hanak, “The Gravity of Water: Water Trade Frictions in California,” American Journal of Agricultural Economics, 2016, 98 (5), 1273–1294.

Restuccia, Diego and Raul Santaeulalia-Llopis, “Land Misallocation and Productivity,” 2017.

Roberts, Michael J., Wolfram Schlenker, and Jonathan Eyer, “Agronomic weather measures in econometric models of crop yield with implications for climate change,” American Journal of Agricultural Economics, 2013, 95 (2), 236–243.

Rosegrant, M. W., C. Ringler, D. C. McKinney, X. Cai, A. Keller, and G. Donoso, “Integrated economic-hydrologic water modeling at the basin scale: The Maipo river basin,” Agricultural Eco- nomics, 2000, 24 (1), 33–46.

Rostek, Marzena and Marek Weretka, “Price Inference in Small Markets,” Econometrica, 2012, 80 (2), 687–711.

and , “Dynamic Thin Markets,” Review of Financial Studies, 2015, 28 (10), 2946–2992.

164 Saez, Emmanuel, Joel Slemrod, and Seth H Giertz, “The Elasticity of Taxable Income with Re- spect to Marginal Tax Rates: A Critical Review,” Journal of Economic Literature, 2012, 50 (1), 3–50.

Samadhan E Cube Innovator Pvt. Ltd., “Baseline Study for project villages of Coastal Salinity Prevention Cell in Gujarat under Kharash Vistarotthan Yojana (KVY): Talaja and Gogha Taluka, Bhavnagar District,” Technical Report, Coastal Salinity Prevention Cell 2016.

Scheer, Jennifer Beth, “Participation, Price, and Competition in the Sacramento Valley Water Mar- ket,” 2016.

Schlenker, Wolfram and Michael J. Roberts, “Nonlinear Temperature Effects Indicate Severe Dam- ages to U.S. Corn Yields Under Climate Change,” Proceedings of the National Academy of Sci- ences, 2009, 106 (37), 15594–15598.

, W. Michael Hanemann, and Anthony C. Fisher, “Water availability, degree days, and the potential impact of climate change on irrigated agriculture in California,” Climatic Change, 2007, 81 (1), 19–38.

Scott, C.I., N. I. Faruqui, and L. Raschid-Sally, Wastewater Use in Irrigated Agriculture: Con- fronting the Livelihood and Environmental Realities 2004.

Sekhri, Sheetal, “Wells, Water, and Welfare: The Impact of Access to Groundwater on Rural Poverty and Conflict,” American Economic Journal: Applied Economics, 2014, 6 (3), 76–102.

Sovacool, Benjamin K., “Reviewing, Reforming, and Rethinking Global Energy Subsidies: Towards a Political Economy Research Agenda,” Ecological Economics, 2017, 135, 150–163.

Stavins, Robert N., “Transaction Costs and Tradeable Permits,” Journal of Environmental Economics and Management, 1995, 29, 133–148.

Stene, Eric A., “Central Valley Project Overview,” 1995.

Sunding, David L., David Zilberman, Richard Howitt, Ariel Dinar, and Neal MacDougall, “Measuring the costs of reallocating water from agriculture: A multi-model approach,” Natural Resources Modeling, 2002, 15 (2), 201–225.

UNDP, “Human Development Report 2006 - Beyond scarcity: Power, poverty and the global water crisis,” 2006.

Western Water Company, “Statement for the State Water Resources Control Board Public Work- shop on Water Transfers,” 2000.

Westfall, Peter H. and S. Stanley Young, Resampling-Based Multiple Testing: Examples and Meth- ods for p-Value Adjustment, New York: Wiley, 1993.

World Bank and State Environmental Protection Administration, “Cost of : Economic Estimates of Physical Damages,” Technical Report 10 2007.

World Bank Group, “High and Dry: Climate Change, Water, and the Economy,” 2016.

165