<<

OPTIMAL DESIGN OF IN THE PRESENCE OF INTERFERENCE

Sarah Baird, J. Aislinn Bohren, Craig McIntosh, and Berk Özler*

Abstract—We formalize the optimal when there is complex design choices. Therefore, researchers interested interference between units, that is, an individual’s outcome depends on the outcomes of others in her group. We focus on randomized saturation in using experiments to study interference face a novel set designs, two-stage experiments that first randomize treatment saturation of design questions. of a group, then individual treatment assignment. We map the potential In this paper, we study experimental design in the pres- outcomes framework with partial interference to a regression model with clustered errors, calculate standard errors of randomized saturation designs, ence of interference. We focus on settings with partial and derive analytical insights about the optimal design. We show that the interference, in which individuals are split into mutually power to detect average treatment effects declines precisely with the ability exclusive clusters, such as villages or schools and inter- to identify novel treatment and spillover effects. ference occurs between individuals within a cluster but not across clusters. As Hudgens and Halloran (2008) established, I. Introduction a two-stage procedure, in which first each cluster is randomly assigned a treatment saturation, and sec- HE possibility of interference in experiments, where the ond, individuals within each cluster are randomly assigned T treatment status of an individual affects the outcomes to treatment according to the realized treatment saturation, of others, gives rise to a plethora of important questions. can identify treatment and spillover effects when there is How does the benefit of treatment depend on the intensity of partial interference.1 Hudgens and Halloran (2008), Liu and treatment within a population? What if a program benefits Hudgens (2014), and Tchetgen Tchetgen and VanderWeele some by diverting these benefits from others? Does the study (2010) define causal estimands, find unbiased of have an unpolluted counterfactual? Further, in the presence these estimands, and characterize the distributions of these of interference, a full understanding of the policy environ- estimators in such designs.2 But a key questions remains: ment requires a measure of spillover effects that are not How should a researcher designing such a randomized satu- captured, or are even a source of bias, in standard experimen- ration (RS) select the set of treatment saturations tal designs. This is critical to determine the overall program and the share of clusters to assign to each saturation? In this impact. paper, we explore the trade-offs involved in these design Empirical researchers across multiple academic disci- choices, from the perspective of how they affect the standard plines have become increasingly interested in bringing such errors of the estimators of different treatment and spillover spillover effects under the lens of experimental investigation. effects. Over the past decade, a new wave of experimental stud- Our first contribution is to provide a foundation for the ies has relaxed the assumptions around interference between regression models that economists commonly use to analyze units. These studies have used a variety of methods, includ- RS experiments. We map a potential outcomes model with ing experimental variation across treatment groups (Bobba partial interference into a regression model with intracluster & Gignoux, forthcoming; Miguel & Kremer, 2004), leaving correlation, which provides a bridge between the causal some members of a group untreated (Barrera-Osorio et al., inference literature and the methods used to analyze RS 2011), exploiting exogenous variation in within-network designs in practice. This mapping requires two restrictions treatments (Beaman, 2012), and intersecting an experiment on the population distribution of potential outcomes: (a) the with preexisting networks (Banerjee et al., 2013). Compared population average potential outcome depends only on an to experiments with no interference, these experiments often individual’s treatment status, and (b) the share of treated seek to measure a larger set of effects and involve more individuals in the cluster, and the - of the population distribution of potential outcomes, is Received for publication June 15, 2015. Revision accepted for publication August 7, 2017. Editor: Bryan S. Graham. 1 Many recent empirical papers use this randomization procedure, includ- * Baird: George Washington University; Bohren: University of Pennsyl- ing Banerjee et al. (2012), Busso and Galiani (2014), Crepon et al. (2013); vania; McIntosh: University of California, San Diego; Özler: World Bank. Gine and Mansuri (2018), and Sinclair, McConnell, and Green (2012). Par- We are grateful for the useful comments received from Frank DiTraglia, tial population experiments (Mofftt, 2001), in which clusters are assigned Elizabeth Halloran, Cyrus Samii, Patrick Staples, and especially Peter to treatment or control, and a subset of individuals in treatment clusters Aronow. We also thank seminar participants at Caltech, Cowles Economet- are offered treatment, also identify certain treatment and spillover effects rics Workshop, Econometric Society Australiasian Meeting, Fred Hutch when there is partial interference. But they provide no exogenous varia- workshop, Harvard T. H. Chan School of Public Health, Monash, Namur, tion in treatment saturation to identify whether these effects vary with the Paris School of Economics, Stanford, University of British Columbia, UC intensity of treatment. Most partial population experiments feature cluster- Berkeley, University of Colorado, University of Melbourne, the World level saturations that are either endogenous (Mexico’s conditional cash Bank, Yale Political Science, and Yale School of Public Health. We thank transfer program, PROGRESA/Oportunidades—Alix-Garcia et al., 2013; the Global Development Network, Bill & Melinda Gates Foundation, NBER Angelucci & De Giorgi, 2009; Bobonis & Finan, 2009) or fixed and typically Africa Project, World Bank’s Research Support Budget, and several World set at 50% (Duflo & Saez, 2003). Bank trust funds (Gender Action Plan, Knowledge for Change Program, 2 Aronow and Samii (2015) and Manski (2013) study identification and and Spanish Impact Evaluation fund) for funding. variance estimation under more general forms of interference.

The Review of Economics and , December 2018, 100(5): 844–860 © 2018 by the President and Fellows of Harvard College and the Massachusetts Institute of Technology doi:10.1162/rest_a_00716

Downloaded from http://www.mitpressjournals.org/doi/pdf/10.1162/rest_a_00716 by guest on 29 September 2021 OPTIMAL DESIGN OF EXPERIMENTS IN THE PRESENCE OF INTERFERENCE 845

block diagonal.3 Athey and Imbens (2017) perform a sim- in designing RS experiments. It is publicly available at ilar derivation for a model with uncorrelated observations http://pdel.ucsd.edu/solutions/index.html. and no interference. Our derivation is an extension of their The remainder of the paper is structured as follows. approach that allows for intracluster correlation and partial Section II sets up the potential outcomes framework, for- interference. malizes an RS design, and defines estimands related to We show that using this regression model to analyze spillovers. Section III connects the potential outcomes from an RS experiment identifies a set of novel estimands; framework to a regression model with clustered errors, not only can the researcher identify an unbiased estimate of presents closed-form expressions for the standard errors, and the usual intention-to-treat effect, but she can also observe derives properties of the optimal RS design to measure dif- spillover effects on treated and untreated individuals and ferent sets of estimands. Section IV presents the numerical understand how the intensity of treatment drives spillover application. All proofs are in the appendix. effects on these groups. These are the infinite population analogs of the estimands that Hudgens and Halloran (2008) II. Causal Inference with Partial Interference show can be consistently estimated in a finite population model. The estimate of the average effect on all individuals A. Potential Outcomes in treated clusters, which we refer to as the total causal effect, A researcher seeks to draw inference on the outcome dis- provides policymakers with a very simple tool to understand tribution of an infinite population I under different treatment how the intensity of treatment will drive outcomes for a allocations. The population is partitioned into equal-sized, representative individual. nonoverlapping groups, or clusters, of size n.4 Individual i in Next, we illustrate the power trade-offs that exist in cluster c has response function Y : {0, 1}n → Y that maps designing RS experiments—that is, choosing the set of satu- ic each potential cluster treatment vector t = (t1, ..., tn) ∈ rations and the share of clusters to assign to each saturation. n {0, 1} into potential outcome Yic(t) ∈ Y, where t ∈{0, 1} We derive closed-form expressions for the standard errors is a binary treatment status in which t = 1 corresponds to (SEs) of the OLS estimates of various treatment and spillover being offered treatment and t = 0 corresponds to not being estimands. Using these expressions, we derive properties offered treatment, and Y ⊂ is a set of potential outcomes. of the optimal designs to measure different sets of esti- The response function is independent of the treatment vec- mands. The ability to identify novel estimands, such as slope tors for all clusters d = c: spillovers may flow within a effects, comes at a cost: decreased statistical power to mea- cluster but not between clusters. Thus, we relax the stable sure intention-to-treat effects pooled across all saturations. In unit treatment value assumption (SUTVA) within clusters other words, the same variation in treatment saturation that but maintain it across clusters. This setup is referred to as permits measurement of how treatment and spillover effects partial interference (Sobel, 2006).5 vary with the intensity of treatment is detrimental to the A random sample is drawn from this infinite population power of the simple experimental comparison of treatment and randomly assigned treatment according to a prespec- to pure control. By placing RS designs in the clustered error ified experimental design. Our goal is to study the power regression framework, we provide the closest possible ana- of different experimental designs to detect treatment and log to the familiar power calculations in cluster randomized spillover effects by comparing the standard errors of esti- trials. This makes the design trade-offs present in RS experi- mands across designs. In order to characterize these standard ments transparent. In related work, Hirano and Hahn (2010) errors, we make two assumptions on the and the study the power of a partial population experiment to analyze variance- of the population distribution of a linear-in- model with no intracluster correlation. potential outcomes. We conclude with numerical simulations of hypothetical First, we assume that the expected potential outcome and published RS designs. First, we calculate the opti- n E[Yic(t)] at potential treatment vector t ∈{0, 1} , where mal designs for objective functions that include different the expectation is with respect to the population distribution sets of individual saturation, slope, and pooled estimands. of potential outcomes, depends only on individual treatment This demonstrates how the optimal design depends on the ≡ 1 n status ti and the treatment saturation p(t) n j=1 tj.In set of estimands that the researcher would like to identify other words, it is independent of the identity of the other and estimate precisely. Second, we calculate the standard individuals who receive treatment. errors for several RS designs used in published papers. This illustrates how design choices affect the standard errors of Assumption 1. There exists a Y : {0, 1}×(0, 1) → different estimators. To compute these numerical results, co(Y), where co(Y) is the convex hull of the set of potential we use software that we developed to assist researchers 4 We assume clusters are equal in size to simplify the analysis. In practice, data sets may have significant variation in the size of the cluster, and the 3 Hudgens and Halloran (2008) make the stronger assumption of strati- researcher may want to group clusters into different-sized bins (e.g., rural fied interference to estimate in a setting with partial interference. and urban clusters). Graham, Imbens, and Ridder (2010) relax this assumption with one of 5 The assumption of no interference across groups is testable. For example, observational symmetry (i.e., exchangeability). see Miguel and Kremer (2004).

Downloaded from http://www.mitpressjournals.org/doi/pdf/10.1162/rest_a_00716 by guest on 29 September 2021 846 THE REVIEW OF ECONOMICS AND STATISTICS

outcomes, such that for all treatment vectors t ∈{0, 1}n \ impose a variance structure on potential outcomes that maps {0n,1n} with p(t) = p, the expected potential outcome for an to the regression model typically used for power calculations individual with treatment status t ∈{0, 1} is Y(t, p). when there is no interference.8 It enables a direct comparison of the power of RS designs to the power of the canonical To maintain consistent notation, define Y(0, 0) ≡ individually randomized (blocked) and cluster-randomized n n E[Yic(0 )] and Y(1, 1) ≡ E[Yic(1 )] as the expected potential (clustered) designs, making explicit the impact that random- outcomes when no individuals and all individuals within a izing saturation has on power. A regression model with a cluster, respectively, are treated. Assumption 1 allows for a block-diagonal structure is also the model underlying the use characterization of the standard errors of estimands with- of OLS with clustered standard errors to analyze resulting out possessing information about the underlying network data, the method commonly used for analysis. structure within a cluster.6 Note that it does not preclude the realized potential outcomes from depending on the iden- B. A Randomized Saturation Design tity of the individuals who receive treatment. Therefore, it is Suppose a researcher draws a sample of C clusters of weaker than the stratified interference assumption proposed size n. A randomized saturation (RS) design is a two- by Hudgens and Halloran (2008), which assumes that the stage treatment assignment mechanism that specifies how realized potential outcomes of an individual are independent to assign treatment to these N ≡ nC individuals. The first of the identity of the other individuals assigned to treatment. stage randomizes the treatment saturation of each cluster. Second, we make an assumption about the variance- Let Π ⊂ [0, 1] be a finite set of treatment saturations. covariance matrix of the distribution of potential outcomes. Each cluster c is randomly assigned a treatment saturation Clustering of outcomes can be due to either (a) the extent to Pc ∈ Π according to the distribution f ∈ Δ(Π), which spec- which outcomes are endogenously driven by the treatment ifies the share of clusters assigned to each saturation. The of others in the same cluster, which is interference between second stage randomizes the treatment status of each indi- units, or (b) a statistical random effect in outcomes that vidual in the cluster according to the realized saturation of is correlated between individuals—correlated effects (Man- the cluster. Individual i in cluster c is randomly assigned ski, 1993)—which does not stem from interference between treatment T ∈{0, 1}, where the realized cluster treatment units. To capture outcome b, we allow potential outcomes ic saturation Pc specifies the share of individuals assigned to to be correlated across individuals within the same cluster n = treatment (i.e., i=1 Tic nPc). Let Tc denote the realized while maintaining no correlation across clusters. treatment vector for cluster c. An RS design is completely characterized by the pair {Π, f }.9 σ2 τ2 ≥ Assumption 2. There exist > 0 and 0 such that We refer to individuals assigned to treatment as treated  ∈{ }n for all t, t 0, 1 , the variance-covariance matrix for individuals, individuals in clusters assigned saturation zero the distribution of potential outcomes satisfies Var (Yic(t)) = 2 2 2 as pure controls, and individuals in treated clusters who σ + τ , Cov(Yic(t), Yjc(t)) = τ for i = j, and  are not assigned to treatment as within-cluster controls. Let Cov(Yic(t), Yjd(t )) = 0 for c = d. Sic = 1{Tic = 0, Pc > 0} and Cic = 1{Tic = 0, Pc = 0} denote whether individual ic is a within-cluster control or Assumption 2 imposes homoskedasticity across all poten- pure control, respectively. An RS design has share of treated tial outcomes for a given individual and across potential individuals μ ≡ pf ( p), share of within-cluster control outcomes between two individuals in the same cluster. In p∈Π individuals μS ≡ 1−μ−ψ, and share of control individuals other words, the variance and covariance of the distribution ψ ≡ f (0). There is a pure control if ψ > 0. We say an RS of potential outcomes do not depend on the treatment status design is nontrivial if it has at least two saturations, at least 7 of an individual or the treatment saturation of a cluster. We one of which is strictly interior. Multiple saturations guar- ρ ≡ τ2 τ2 + σ2 will often use /( ) to denote the intracluster antee a comparison group to determine whether effects vary correlation (ICC). with treatment saturation, and an interior saturation guar- Assumption 2 allows us to connect the potential outcomes antees the existence of within-cluster controls to identify framework to a regression model with a block-diagonal error spillovers on the untreated. structure. Our goal is to provide a bridge between the theoret- The RS design nests several common experimental ical literature and the use of field experiments in economics designs, including the clustered, blocked, and partial to measure spillover effects. To this end, it is natural to 8 See Duflo, Glennerster, and Kremer (2007) for the case of no interfer- 6 In the absence of this assumption, a researcher would need to observe the ence. complete network structure in each cluster, understand the heterogeneity in 9 The framework discussed here uses a simple, spatially defined defini- networks across clusters, and use a model of network-driven spillovers to tion of a cluster that is mutually exclusive and exhaustive. This is distinct simulate the variance in outcomes that could be generated by these networks. from determining how to assign treatment in overlapping social networks This is not an issue when there is no interference. (Aronow, 2012), which requires a more complex sequential randomization 7 The analysis can allow for heteroskedasticity, but standard errors are less routine (Toulis & Kao, 2013). An additional benefit of an RS design is that tractable to characterize analytically and, hence, optimal design results are it also creates exogenous variation in the saturation of any overlapping net- less tractable. The homoskedastic case is a natural benchmark to establish work in which two individuals in the same cluster have a higher how randomizing treatment saturation affects power. of being linked than two individuals in different clusters.

Downloaded from http://www.mitpressjournals.org/doi/pdf/10.1162/rest_a_00716 by guest on 29 September 2021 OPTIMAL DESIGN OF EXPERIMENTS IN THE PRESENCE OF INTERFERENCE 847

population designs.10 The blocked and clustered designs are channel for interference: treatment and spillover effects may trivial, and it is not possible to identify any spillover effects vary with treatment saturation due to compliance effects, in these designs. The partial population design is nontriv- in addition to the direct impact of an individual’s treatment ial, and it is possible to identify whether there are spillover on other outcomes. Exploring extensions that allow compli- effects on the untreated. ance to depend on treatment saturation, thereby defining a Variation in the treatment saturation introduces correla- response function that depends on whether individuals com- tion between the treatment statuses of individuals in the ply with assigned treatment, is an important avenue for future μ η2 ≡ 2 same cluster. Fixing and defining p∈Π p f ( p), research. the variance in treatment saturation η2 − μ2 and correlation in treatment status for an RS design is bracketed between C. Treatment and Spillover Estimands that of a clustered design, which has the maximum possi- ble variance in treatment saturation and perfect correlation Next we define a set of estimands for treatment and between the treatment statuses of individuals in the same spillover effects. We focus on average effects across all indi- cluster, and that of a blocked design, which has no variance viduals in the population. Recall that Y(t, p) is the expected in treatment saturation and slightly negative correlation in potential outcome at individual treatment t and saturation p. treatment status (due to without replacement). We Individuals offered treatment will experience a direct will show that when there is also correlation between the treatment effect from the program, as well as a spillover potential outcomes of individuals in the same cluster, ρ > 0, effect from the treatment of other individuals in their cluster. ≡ the of these two correlations plays a key role in Let p 1/n denote the treatment saturation corresponding determining the power of an RS design. to a cluster with a single treated individual. The treatment on the uniquely treated (TUT) measures the intention to ≡ Discussion. We implicitly assume that all individuals treat an individual, absent any spillover effects, TUT − who are part of the spillover network within a cluster are Y(1, p) Y(0, 0), and the spillover on the treated (ST) included in the sample. If spillovers occur on individuals measures the spillover effect at saturation p on individuals ≡ − outside the sampling frame, either because there is a gateway offered treatment, ST( p) Y(1, p) Y(1, p). The famil- to treatment within the cluster and not all eligible individu- iar intention to treat (ITT) is the sum of these two effects, = + als are sampled, or because not all individuals in a cluster’s ITT( p) TUT ST( p). Individuals not offered treat- spillover network are eligible for treatment, then it is neces- ment experience only a spillover effect. The spillover on sary to distinguish between the true treatment saturation (the the nontreated (SNT) is the analog of the ST for individ- ≡ − 12 share of treated individuals in the cluster) and the assigned uals not offered treatment, SNT( p) Y(0, p) Y(0, 0). treatment saturation (the share of treated individuals out of Given these definitions, there are spillover effects on the = the sampled individuals in the cluster).11 If the sampling treated (nontreated) if there exists a p such that ST( p) 0 = rate and share of the cluster eligible for treatment are con- (SNT( p) 0). We can also measure the slope of spillovers stant across clusters, the true saturation is proportional to the with respect to treatment saturation. The slope of spillovers on the treated measures the rate of change in the spillover assigned saturation. If sampling rates are driven by cluster  characteristics or the share of the cluster that is eligible for effect on treated individuals between saturations p and p ,  ≡  −  − treatment varies across clusters, the true saturation is endoge- DT( p, p ) (ST( p ) ST( p))/( p p). If spillover effects are affine, then this is a measure of the slope; otherwise, nous. In this case, the researcher can instrument for the true  saturation with the assigned saturation. To streamline the it is a first-order approximation. Let DNT( p, p ) denote the analysis, we maintain the assumption that the assigned and analog for individuals not offered treatment. true saturations coincide. In the presence of spillovers, the true effectiveness of a Our framework can be applied to settings with perfect program is measured by the total effect of treatment on compliance or to identify intention-to-treat effects in set- both treated and untreated individuals. The total causal tings with imperfect compliance. While noncompliance does effect (TCE) measures this overall cluster-level effect on not bias intention to treat estimands, it presents a second clusters treated at saturation p, compared to pure control clusters, TCE( p) ≡ pITT( p) + (1 − p)SNT( p).Wesay 10 Fixing μ, the clustered design corresponds to Π = {0, 1} and f (1) = μ, that treatment effects are diversionary at saturation p if the blocked design corresponds to Π = {μ} and f (μ) = 1, and the partial the benefits to treated individuals are offset by negative population design corresponds to Π = {0, P} and f (P) = μ/P. 11 For example, Gine and Mansuri (2018) sample every fourth household externalities imposed on untreated individuals in the same in a neighborhood and randomly offer treatment to 80% of these house- cluster, ITT( p)>0 and TCE( p)

Downloaded from http://www.mitpressjournals.org/doi/pdf/10.1162/rest_a_00716 by guest on 29 September 2021 848 THE REVIEW OF ECONOMICS AND STATISTICS

the program is muted compared to the intention-to-treatment obs ≡ where Yic Yic(Tc) denotes the observed outcome for indi- 13 effect. If the TCE is negative, the program causes an aggre- vidual ic and εic is an unobserved error. To map the potential gate reduction in the average outcome, although treatment outcomes framework into this regression model, we define effects may be positive. This highlights one reason why it is the regression coefficients and error in terms of potential out- imperative to use the TCE, rather than the ITT, to inform pol- comes, population average potential outcomes, and realized icy in the presence of spillovers. The ITT may misrepresent treatment status. Let β0 ≡ Y(0, 0), β1p ≡ Y(1, p) − Y(0, 0) 15 the true effectiveness of the program. and β2p ≡ Y(0, p) − Y(0, 0). Define the residual as We can also measure the direct impact of being assigned to treatment at a given saturation. The value of treatment (VT) ε ≡ 1 = Y (t) − Y(t , p(t)) ,(2) measures the individual value of receiving treatment at satu- ic Tc t ic i t∈{0,1}n ration p, VT( p) ≡ Y(1, p) − Y(0, p).IfVT( p) is decreasing in p, then the value of treatment is decreasing in the share where p(t) is the share of treated individuals in treatment of other individuals treated and spillover effects substitute vector t = (t1, ..., tn). The following lemma characterizes for treatment, while if the VT is increasing in p, the value the distribution of the error in terms of the distribution of of treatment is increasing in the share of other individuals potential outcomes. treated and spillover effects complement treatment. Hudgens and Halloran (2008) also study causal inference Lemma 1. Assume assumptions 1 and 2. Then the error in the presence of partial interference and define a similar set defined in equation (2) is strictly exogenous, E[εic|Tc]=0, of estimands for a finite population. The estimands we define and has a block-diagonal variance-covariance matrix with are the infinite population analogs.14 Examples of spillovers [ε2 | ]=σ2 + τ2 [ε ε | ]=τ2 = E ic Tc ,E ic jc Tc for i j and that can be measured using these estimands are discussed E[ε ε |T , T ]=0 for c = d. in the working paper version (Baird et al., 2017) of this ic jd c d manuscript. Athey and Imbens (2017) derive a similar result for a poten- tial outcomes model with no interference and no intracluster III. Standard Errors and Optimal Design correlation. Given lemma 1, the OLS estimate of equation (1) yields This section maps the potential outcomes framework from an unbiased estimate of β. For any RS design with an inte- section IIA into a regression model that identifies the esti- rior saturation and a pure control, this estimate identifies mands defined in section IIC, derives analytical expressions ITTˆ ( p) = βˆ 1p, SNTˆ ( p) = βˆ 2p, TCEˆ ( p) = pβˆ 1p + (1 − p)βˆ 2p for the standard errors of the OLS estimates and charac- ˆ = βˆ − βˆ ∈ Π \{ } terizes properties of the optimal RS design for several sets and VT( p) 1p 2p for each p 0 . Tests for the presence of treatment and spillover effects at saturation p of estimands. We begin with the individual saturation and βˆ = βˆ = βˆ = βˆ slope estimands and follow with complementary results for are 1p 0 and 2p 0, 1p 2p tests whether the value βˆ a model that estimates average effects across multiple satu- to treatment is nonzero, a one-tailed test of the sign of 2p rations (pooled estimands). We conclude with an illustration determines whether treatment creates a negative or positive {βˆ ≥ βˆ ≤ } of the power trade-off between measuring slope and pooled externality on untreated individuals, and 1p 0, 2p 0 estimands. tests for diversionary effects.16 Hudgens and Halloran (2008) present similar estimators for finite population estimands and show that these estimators are unbiased.17 A. Individual Saturation and Slope Effects The OLS estimate of equation (1) also yields unbiased  estimates of the slope estimands, DTˆ ( p, p ) = (βˆ 1p − βˆ 1p)/ A regression framework. A regression model to esti- ( p − p) for each ( p, p) ∈ Π \{0}, with an analogous mate treatment and spillover effects at each saturation in ˆ  Π expression for DNT( p, p ). A pure control is not required the support of an RS design ( , f ) is to estimate the slope estimands; any RS design with two interior saturations identifies the slope effect for both treat- obs = β + β × { = } Yic 0 1pTic 1 Pc p ment and within-cluster control individuals. If a design has p∈Π\{0} + β × { = }+ε 15 Note that we are not assuming a constant treatment effect in equation (1); 2pSic 1 Pc p ic, (1) β is the average effect. p∈Π\{0} 16 This model also allows for tests on the shape of the ITT( p) and SNT( p). For example, three interior saturations allows one to test for concavity or convexity. 13 This does not say anything about the welfare implications of diversion- 17 In Hudgens and Halloran (2008), the sample is equal to the population, ary effects. To do so requires a welfare criterion specifying the social value and uncertainty stems from the unobserved potential outcomes for each of different distributions of the outcome within a cluster. individual. Our model has an infinite population, and uncertainty stems from 14 The ST and SNT defined in our paper are the infinite population analogs both the unobserved potential outcomes for each individual and sampling of the indirect causal effects defined in their paper, the ITT is the analog of uncertainty from observing a subset of the population. Minor technical their total causal effect, the TCE is the analog of their overall causal effect, modifications to their proofs establish the analogous unbiased results in our and the VT is the analog of their direct causal effect. setting.

Downloaded from http://www.mitpressjournals.org/doi/pdf/10.1162/rest_a_00716 by guest on 29 September 2021 OPTIMAL DESIGN OF EXPERIMENTS IN THE PRESENCE OF INTERFERENCE 849

no pure control, replace the control group with the within- correlation, SEITT depends on a weighted average of the cluster controls in the lowest saturation in the RS design number of treated individuals and the number of clusters and redefine the coefficients in equation (1) to be relative to at saturation p. the population mean of untreated individuals at the lowest Next consider the of the slope effect for saturation. treated individuals. As the distance between two satura- tions increases, 1/( p − p) decreases, making it possible Standard errors. Our first result characterizes the stan- to detect smaller slope effects. At the same time, decreas- dard errors (SEs) for the OLS of the individual ing p decreases the number of treatment individuals at this saturation and slope estimands from equation (1).18 saturation, which increases the SE. The former effect domi- nates when the saturations are close together, and spreading Theorem 1. Assume assumptions 1 and 2. For any RS the saturations apart decreases the SE, while the latter effect design (Π, f ) with a pure control, the SE of the treatment dominates when p is close to 0, and further decreasing p effect at saturation p > 0 is will increase the SE. When ρ is large, the number of clusters assigned to each saturation plays a larger role in determin- τ2+σ2 ing the SE; a more equal distribution leads to a smaller SE. ∗ nρ 1 + 1 nC f ( p) ψ When ρ is small, the number of treated individuals assigned SE ( p) = (3) ITT to each saturation is more important than the number of clus- + (1 − ρ) 1 + 1 pf ( p) ψ ters; equalizing the number of treated individuals at each saturation reduces the SE. for each p ∈ Π. For any RS design (Π, f ) with at least two Theorem 1 can be used to characterize the power of an interior saturations, the SE for the slope effect on treated RS design. The minimum detectable effect (MDE) is the individuals between saturations p > 0 and p > pis smallest value of an estimand that it is possible to distinguish from 0 (Bloom, 1995). Given statistical significance level α, τ2+σ2 1 1 the null hypothesis of no treatment effect at saturation p is ∗ nρ +  1 nC f ( p) f ( p ) rejected with probability γ (the power) for values of ITT( p)  = SEDT ( p, p )  . = + p − p 1 1 that exceed MDE (t1−γ tα) SEITT ( p). The expressions +(1 − ρ) +   pf ( p) p f ( p ) for the MDEs of the spillover effect on untreated individuals (4) and the slope effects are analogous. − −   Substituting 1 p and 1 p for p and p < 1, respec- Optimal design: Individual saturation effects. Given a tively, yields analogous expressions for untreated individuals, Π  set of saturations , the design choice involves choosing denoted SESNT ( p) and SEDNT ( p, p ).19 the share of clusters to allocate to each saturation. If the researcher places equal weight on estimating the treatment Theorem 1 illustrates how the precision of the OLS estimates and spillover effect at each saturation in Π\{0}, she chooses depends on the the RS design and the correlation structure of f to minimize the sum of standard errors, outcomes. At one extreme, if there is no correlation (ρ = 0), the variation in ITTˆ ( p) is inversely proportional to the num- 20 min (SEITT ( p) + SESNT ( p)) . (5) ber of treated individuals at saturation p and the number of f ∈Δ(Π) control individuals. There is no correlation between poten- p∈Π\{0} tial outcomes within a cluster, so two observations from the same cluster provide the same amount of information First consider the choice of how many clusters to allo- about ITT( p) as two observations from different clusters. cate to each positive saturation. By design, clusters assigned At the other extreme, if there is perfect correlation between to extreme saturations have a more unequal number of potential outcomes within a cluster (ρ = 1), the variation in treatment and within-cluster control individuals, relative to ITTˆ ( p) is inversely proportional to the number of clusters at saturations closer to 0.5.21 A researcher who places equal saturation p and the number of control clusters. Observing weight on measuring effects at each positive saturation will Yic(1, p) provides perfect information about Yjc(1, p),soa want to allocate a larger share of clusters to these more second observation from the same cluster provides no addi- extreme saturations. This stems directly from the concavity tional information about ITT( p). At intermediate levels of 20 For ease of exposition, for all optimal design results, we maintain that 18 In general, the OLS estimator is inefficient when errors are correlated. any saturation p and distribution f are feasible. In other words, we ignore The standard errors characterized in theorem 1 will be conservative if GLS the indivisibility of individuals or clusters. The properties of the optimal or another more efficient estimator is used to analyze the resulting data. design extend in a straightforward way to the case where the set of feasi- However, the OLS standard errors are useful for studying ex ante design ble saturations and distributions is discrete. This objective is equivalent to questions due to their tractable analytical characterization. minimizing the minimum detectable effect. 19 Using these expressions to inform experimental design requires esti- 21 For example, if n = 20, clusters treated at saturation 0.25 have five mates of τ2 and σ2. One could use existing observational data or conduct a treated individuals and fifteen within-cluster controls, whereas clusters small pilot experiment (Hahn, Hirano, & Karlan, 2011). treated at saturation 0.5 have ten of each.

Downloaded from http://www.mitpressjournals.org/doi/pdf/10.1162/rest_a_00716 by guest on 29 September 2021 850 THE REVIEW OF ECONOMICS AND STATISTICS

of the SE with respect to the number of treated or within- Then she chooses an RS design with two saturations to cluster control individuals at that saturation. As ρ increases, solve the share of clusters at a given saturation has a larger impact on the SE, relative to the share of treated or within- min SEDT ( p1, p2) + SEDNT ( p1, p2). (6) ∈ 3 cluster control individuals at that saturation. Therefore, the p1,p2, f ( p1) (0,1) asymmetry of the optimal f decreases with ρ. The optimal saturations are symmetric about 0.5 in order Next, consider the optimal control group size. The mar- to equalize the share of treated individuals in the smaller ginal impact of adding another cluster to the control reduces saturation and the share of untreated individuals in the all SEs in equation (5), while the marginal impact of adding larger saturation. The optimal distance between saturations is another cluster to an interior saturation reduces only the SEs increasing in ρ, as the unequal share of treated and untreated at that saturation. Therefore, when outcomes within a cluster individuals at extreme saturations has a smaller impact on are perfectly correlated, the optimal design allocates more the SEs when outcomes are more correlated. An equal share clusters to the control group than to each positive treat- of clusters is allocated to each saturation, irrespective of ρ, ment saturation. When ρ < 1, the number of treated and since both saturations are equally extreme (i.e., the same dis- within-cluster control individuals is also important, and more tance from 0.5). This design equalizes the standard errors, extreme saturations have more unequal numbers of treated SE ( p∗, p∗) = SE ( p∗, p∗). and untreated individuals. In the optimal design, the num- DT 1 2 DNT 1 2 ber of treated individuals at saturations close to one may be larger than the number of pure control individuals, since Proposition 2 (optimal saturations). Assume assumptions a larger share of clusters is allocated to these high satura- 1 and 2. The RS design that minimizes equation (6) equally tions to guarantee enough within-cluster controls. Similar divides clusters between two saturations that are symmetric about 0.5, p∗ = (1 − Δ)/2 and p∗ = (1 + Δ)/2, where intuition holds for saturations close to 0. However, since 1 2 √ Δ ∈[ an additional control individual reduces all of the SEs in the optimal distance between saturations 2/2, 1) is ρ equation (5), the minimum number of treated and within- increasing in and n and satisfies cluster control individuals at each saturation is smaller than nρ 2Δ2 − 1 the number of control individuals. Proposition 1 formalizes = . − ρ − Δ2 2 these insights. 8(1 ) (1 ) √ If ρ = 0, then Δ = 2/2 for all n, and if ρ = 1, then Δ ≈ 22 Proposition 1 (optimal shares). Assume assumptions 1 and 1. 2, and fix a set of saturations Π. Let f ∗ minimize equation (5), with ψ∗ ≡ f ∗(0): Figure 1 plots the optimal treatment saturations as a function of ρ. 1. When ρ < 1, a larger share of clusters is allocated to More generally, if a researcher is interested in identifying more extreme saturations, and the minimum of the share individual saturation or slope effects at more than two satu- of treated and within-cluster control individuals at any rations, theorem 1 can be used to derive the optimal spacing positive treatment saturation is less than the share of of saturations and the optimal share of clusters to assign to pure control individuals: for any p, p ∈ Π \{0} with each saturation. For example, a design to test for linearity |0.5 − p| > |0.5 − p|,f∗( p)>f ∗( p), and ψ∗ > has three saturations. The optimal design to test for linearity min{p,1− p}f ∗( p) for all p ∈ Π \{0}. would ensure that the three saturations are sufficiently far 2. When ρ = 1, an equal share of clusters is allocated to apart and the two extreme saturations are not too large or each treatment saturation, and a larger share of clus- small. ters is allocated to the control group: ψ∗ > f ∗( p) = f ∗( p) for all p, p ∈ Π \{0}. B. Pooled Effects

For a given intracluster correlation ρ and cluster size n,itis Suppose a researcher would like to combine observations straightforward to numerically solve for the optimal share of from treated clusters to measure an average effect across clusters to assign to each saturation. all saturations in the RS design. A pooled estimand is a weighted sum of the estimand at each individual saturation. Given design (Π, f ) and vector of weights w : Π →[0, 1],a pooled treatment effect that assigns weight w( p) to ITT( p) Optimal design: Slope effects. There are two steps to the is ITT ≡ w( p)ITT( p). The definitions for ST, SNT, design choice to measure slope effects: choosing the set of Π\{0} TCE, and VT are analogous. saturations and choosing the share of clusters to allocate to each saturation. Suppose a researcher places equal weight 22 When ρ = 1 and n is finite, Δ is the largest distance such that there on estimating the slope effect for treated and untreated indi- is at least one treatment and one within-cluster control individual at each viduals and believes that both slope effects are monotonic. saturation: Δ = 1 − 2/n.

Downloaded from http://www.mitpressjournals.org/doi/pdf/10.1162/rest_a_00716 by guest on 29 September 2021 OPTIMAL DESIGN OF EXPERIMENTS IN THE PRESENCE OF INTERFERENCE 851

Figure 1.—Optimal Design: Slope Effects

A regression framework. A regression model to estimate The interpretation of βˆ is somewhat subtle. When observa- pooled effects is tions are pooled across saturations, βˆ 1 places a disproportion- ate weight on treated individuals in high-saturation clusters obs = β + β + β + ε Yic 0 1Tic 2Sic ic. (7) relative to low-saturation clusters; it identifies the pooled ITT with weight w( p) = pf ( p). Similarly, βˆ 2 places a dispro- As in section IIIA, we map the potential outcomes frame- portionate weight on untreated individuals in low-saturation work into this model by defining the regression coefficients clusters relative to high-saturation clusters; it identifies the and error in terms of potential outcomes, population average pooled SNT with weight w( p) = (1 − p)f ( p). Due to these potential outcomes, and realized treatment status. In the different weights, the comparison of the two pooled mea- pooled case, the definition of the regression coefficients also sures does not have a natural interpretation. Additionally, depends on the RS design. Given an RS design (Π, f ), let one must be careful when combining these estimates to iden- ≡ μ ≡ tify other effects. For example, βˆ + βˆ is a pooled measure Y(1) p∈Π\{0} pf ( p)Y(1, p)/ and Y(0) p∈Π\{0} 1 2 = βˆ − βˆ (1 − p)f ( p)Y(0, p)/μS be the population average potential of the TCE with weight w( p) f ( p),but 1 2 is not a outcome, averaged across all nonzero saturations in the RS pooled measure of the VT.23 design, for t = 1 and t = 0, respectively. Let β0 ≡ Y(0, 0), Pooling observations across multiple saturations intro- β1 ≡ Y(1) − Y(0, 0) and β2 ≡ Y(0) − Y(0, 0). Define the duces the possibility of heteroskedasticity. The form of this residual as 23 What we call saturation weights, which have a similar interpretation to sampling weights, can be used to adjust for the different probability εic ≡ Cic(Yic(0) − Y(0, 0)) of being assigned to treatment at each saturation. To estimate a pooled ITT and SNT that places equal weight w( p) = 1/|Π| on the treatment or + − spillover estimand at each saturation, estimate equation (7) with weights 1Tc=t Yic(t) Y(ti) . (8) s = 1/P f (P ) for treated individuals and weight s = 1/(1 − P )f (P ) t∈{0,1}n\{0} ic c c ic c c for within-cluster controls. Using these weights, βˆ 1 − βˆ 2 is now a pooled measure of the VT that places equal weight on each saturation, but βˆ 1 + βˆ 2 The following lemma establishes that the OLS estimate of is no longer a pooled measure of the TCE. For example, consider a design equation (7) is unbiased. with three saturations, Π ={0, 1/3, 2/3}, and an equal share of clusters assigned to each saturation, f ( p) = 1/3 for each p ∈ Π. An individual in a cluster assigned p = 2/3 is twice as likely to be treated as a cluster assigned Lemma 2. Assume assumption 1. For any RS design with p = 1/3. Weighting the treated individuals in clusters assigned p = 1/3 and p = 2/3bysic = 3 and sic = 3/2, respectively, allows one to calculate an interior saturation and a pure control, the OLS estimate the pooled estimate that places equal weight on both clusters rather than βˆ is an unbiased estimate of β. twice as much weight on the clusters treated at saturation two-thirds.

Downloaded from http://www.mitpressjournals.org/doi/pdf/10.1162/rest_a_00716 by guest on 29 September 2021 852 THE REVIEW OF ECONOMICS AND STATISTICS

heteroskedasticity depends on the RS design and the popu- where f ( p)/(1 − ψ) is the share of treated clusters assigned lation average potential outcome at each positive saturation. to saturation p > 0 (recall η2 − μ2 is the total variance in η2 = When ITT( p) and SNT( p) are relatively flat with respect to treatment saturation). Trivially, T 0 when there is a single p ∈ Π \{0}, the heteroskedasticity will be small, whereas positive saturation. Theorem 2 characterizes the SEs for the when these estimands significantly vary with the intensity of OLS estimator of the pooled estimands in equation (7). treatment, the heteroskedasticity will be large. The error is homoskedastic precisely when the ITT( p) and SNT( p) are Theorem 2. Assume assumptions 1 and 2. Let (Π, f ) be an constant with respect to the positive treatment saturations in RS design with at least one interior saturation and a pure the RS design. control, and suppose that treatment and spillover effects are constant on Π \{0}. The SE of the pooled treatment effect is Definition 1. Treatment and spillover effects are constant  on a set of saturations Π if for all p, p ∈ Π \{0}, Y(1, p) = τ2+σ2 ∗ ρ 1 + 1−ψ η2    nC n (1−ψ)ψ μ2 T Y(1, p ) and for all p, p ∈ Π \{1}, Y(0, p) = Y(0, p ). = SEITT . (10) + − ρ 1 + 1 (1 ) μ ψ Lemma 3. Assume assumptions 1 and 2. Given an RS design (Π, f ), the error defined in equation (8) has Substituting μ for μ yields an analogous expression for the homoskedastic variance and within-cluster covariance if and S SE of the pooled spillover effect on the untreated, denoted only if treatment and spillover effects are constant on the set SE . of positive saturations Π \{0}. SNT The SE for the pooled treatment effect depends on the num- Generally, cluster robust standard errors should be used in ber of treated and control individuals and the variance in two-level experiments due to the correlated outcomes within η2 treatment saturation across treated clusters, T . Crucially, clusters. This proposition provides an additional argument when outcomes within a cluster are correlated, the SE is for doing so when estimating equation (7) due to the varia- η2 strictly increasing in T , and introducing multiple treatment tion in treatment and spillover effects at different saturations. saturations reduces precision. The SE is minimized in a par- Lemma 4 in appendix A characterizes the precise form of tial population design in which there is a single positive this heteroskedasticity. saturation and a pure control. This design has no variation η2 = in treatment saturation across treated clusters, T 0. Standard errors. Since an RS design opens the door to a novel set of questions about how treatment and spillover Corollary 1 (optimality of partial population design). effects vary with intensity of treatment and still identifies Assume assumptions 1 and 2. For any (ψ, μ) ∈ (0, 1) × pooled treatment and spillover effects, it may be tempting (0, 1 − ψ), the partial population design with treatment sat- to conclude that there is no reason not to run an RS design. uration p = μ/(1 − ψ) simultaneously minimizes SEITT and If there is variation in treatment and spillover effects, then SESNT .24 the heteroskedastic errors in the pooled regression are not an important issue, as the researcher is more interested in the If a researcher a priori believes that slope effects are small individual saturation model, equation (1), while if no slope and intracluster correlation is high, she is best off selecting effects emerge, then the pooled model is homoskedastic and a partial population design. Moving away from the partial there is no need to worry about multiple treatment saturations population design to a design with multiple treatment satu- introducing heteroskedasticity. However, this line of reason- rations, the variance of the pooled treatment effect increases η2 ing misses a crucial piece of the story. Next, we show that linearly with respect to T . The rate at which this variance including multiple treatment saturations increases the stan- increases is proportional to ρ. Therefore, the power loss is dard errors of pooled estimates, even when the treatment and more severe for settings with higher intracluster correlation. spillover effects are constant, so that the error in equation (7) is homoskedastic. Optimal partial population design. Next, we character- In order to isolate the impact that multiple positive treat- ize the optimal treatment saturation and control size for a ment saturations have on the SEs for the pooled estimands, partial population design. In a partial population design with we focus on the case where treatment and spillover effects saturation p, the pooled effects are equivalent to the individ- are constant across all positive saturations in the RS design. ual effects at p. The SE of the ITT decreases with p, while η2 the SE of the SNT increases with p, as illustrated in the Let T be the variance in treatment saturation across treated clusters, left panel of figure 2. The relative importance a researcher p2f ( p) μ 2 η2 μ2 24 Corollary 1 holds even when treatment and spillover effects are not η2 ≡ − = − , constant on positive treatment saturations. In this case, reducing the vari- T 1 − ψ 1 − ψ 1 − ψ (1 − ψ)2 ation in treatment saturation leads to more precise standard errors through p∈Π\{0} two channels: that discussed above, as well as the resulting reduction in (9) heteroskedasticity. Note that no RS design has μ > 1 − ψ.

Downloaded from http://www.mitpressjournals.org/doi/pdf/10.1162/rest_a_00716 by guest on 29 September 2021 OPTIMAL DESIGN OF EXPERIMENTS IN THE PRESENCE OF INTERFERENCE 853

Figure 2.—Optimal Partial Population Design

places on estimating these two effects will determine the and SNT in a linear-in-means model and characterize the optimal choice of p. If a researcher places equal weight on standard errors for the case of no intracluster correlation. each effect, When ρ = 0, the optimal design in proposition 3 is equiv- alent to the optimal design in their sequential optimization min SEITT ( p) + SESNT ( p), (11) case. ( p,ψ)∈(0,1)2

then the optimal saturation p∗ = 0.5 creates equally sized C. The Design Trade-Off treatment and within-cluster control groups and equalizes Taken together, the results in section III illustrate a novel = the SEs, SEITT (0.5) SESNT (0.5). The optimal share of design trade-off: introducing variation in the intensity of ρ ρ control clusters depends on .As increases, the number of treatment identifies spillover estimands but reduces the pre- clusters at each saturation becomes more important than the cision of estimates of pooled effects, particularly when number of individuals in each treatment group. Similar to intracluster correlation is high. If the researcher has a strong proposition 1, the optimal share of control clusters increases prior belief that spillover effects are relatively flat with ρ in . It is always optimal to allocate more than a third of respect to treatment intensity but ρ is high, then choosing clusters to the pure control, since a control individual serves an RS design with multiple positive treatment saturations as a counterfactual for both treated individuals and within- will reduce precision without yielding novel insights, and the ρ = cluster controls. When 0, designating about 41% of researcher is better off running a partial population design. ρ = clusters as pure controls is optimal, while when 1, 50% However, partial population designs have the drawback that is optimal. The right panel of figure 2 illustrates the optimal they cannot identify or rule out spillover effects. To do so, share of control clusters for a partial population design with the researcher needs sufficient variation in the intensity of ∗ = p 0.5. Proposition 3 summarizes these results. treatment. Moreover, if the researcher is primarily interested in iden- Proposition 3. Assume assumptions 1 and 2. The par- tifying slope effects, a design with no pure control is optimal. tial population design that minimizes equation (11) has But such a design cannot identify treatment and spillover saturation p∗ = 0.5 and allocates share of clusters effects at any individual saturation or pooled across satura- −κ + κ2 + (1 − ρ)κ √ tions. Thus, the optimal RS design for a slope analysis stands ψ∗ = ∈[ − ) − ρ 2 1, 0.5 in sharp to that for an individual saturation or pooled 1 analysis. If the researcher seeks to identify both slope and to pure control for ρ ∈[0, 1), where κ ≡ 1 + (n − 1)ρ, and individual or pooled effects, the optimal design will depend ψ∗ = 0.5 for ρ = 1. The optimal control size ψ∗ is increasing on the relative importance that the researcher places on each in ρ and n. estimand, as well as the level of intracluster correlation. Figure 3 depicts the trade-off between measuring pooled This result is similar in spirit to Hirano and Hahn (2010). and slope effects for an RS design with a pure control and They show that a partial population design identifies the VT two interior saturations that are symmetric about 0.5. The

Downloaded from http://www.mitpressjournals.org/doi/pdf/10.1162/rest_a_00716 by guest on 29 September 2021 854 THE REVIEW OF ECONOMICS AND STATISTICS

Figure 3.—Trade-Off between SEs of Pooled and Slope Estimands

precision of the pooled estimate is increasing as the two standard deviations of the distribution of potential outcomes interior saturations approach 0.5, which corresponds to a (or, equivalently, assume total variance is σ2 + τ2 = 1). partial population design, while the precision of the slope When ρ = 0, SEITT (1) = 0.063. It increases with ρ, rising estimate first decreases and then increases as the interior to 0.087 when ρ = 0.1 and 0.200 when ρ = 1 (table 1, saturations approach 0.5, capturing the nonmonotonic effect columns 1–3). The researcher cannot identify any spillover of the distance between saturations on the precision of the effects on treated or untreated individuals. slope estimand. Next, suppose that the researcher also would like to mea- sure spillover effects on untreated individuals and cares IV. Application equally about the precision of the estimates of the pooled ITT and pooled SNT. Applying corollary 1, the optimal design To illustrate our results, we numerically characterize the is a partial population experiment. From proposition 3, we optimal design for several objective functions and calculate know that the optimal treatment saturation assigns 50% of the power of RS designs from published studies in economics the individuals in each treatment cluster to treatment, and the and political science. These examples quantify the power optimal share of control clusters ranges from 41% to 50% as trade-offs that arise between measuring individual, slope, ρ increases from 0 to 1 (table 1, columns 4–6). The optimal and pooled effects. The calculations are conducted using design equalizes the SEs for the ITT and SNT, which code we developed as a tool for researchers.25 from 0.076 to 0.200 as ρ increases from 0 to 1. These SEs are Suppose a researcher selects a sample of C = 100 clusters, larger than the SEs for the ITT in the clustered design. The each of which contains n = 10 individuals. As a benchmark, source of the power loss is obvious: it stems from reassigning suppose the researcher uses a clustered design to identify some treatment and control individuals to serve as within- the , ITT(1). She implements the cluster controls. The power loss is decreasing in ρ as the optimal clustered design, which assigns 50% of the clusters share of clusters at each saturation becomes more important to the control group and 50% to the treatment group. The for precision, and this share approaches that of the clustered standard error of her estimate will depend on the intracluster design. correlation, ρ. We measure this standard error in terms of Now suppose that the researcher cares more about esti- mating the pooled SNT relative to the pooled ITT. A partial 25 We created a graphical user interface (GUI) to answer many opti- population experiment remains optimal, but the optimal mal design questions and calculate power for a given RS design. Code treatment saturation decreases. If she places twice as much in R and Python is also available to conduct numerical optimization for more complex design questions. All code is available at http://pdel.ucsd weight on the SESNT ( p) in her objective function, relative to .edu/solutions/index.html. the SEITT ( p), then for ρ = 0.1, the optimal design assigns

Downloaded from http://www.mitpressjournals.org/doi/pdf/10.1162/rest_a_00716 by guest on 29 September 2021 OPTIMAL DESIGN OF EXPERIMENTS IN THE PRESENCE OF INTERFERENCE 855

Table 1.—Optimal Design for Pooled Estimands Clustered Design Partial Population Design

minp,ψ minp,ψ minp,ψ SEITT ( p) SEITT ( p) SESNT ( p); Objective ++SEITT ( p) Function minp,ψ SEITT ( p) SESNT ( p) 2SESNT ( p) ≤ .09 (1) (2) (3) (4) (5) (6) (7) (8) ICC ρ 0.0 0.1 1.0 0.0 0.1 1.0 0.1 0.1 Pure control 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 Optimal p 1.00 1.00 1.00 0.50 0.50 0.50 0.41 0.78 Optimal ψ 0.50 0.50 0.50 0.41 0.45 0.50 0.45 0.47 Optimal f ( p) 0.50 0.50 0.50 0.59 0.55 0.50 0.55 0.53 SEITT ( p) 0.063 0.087 0.200 0.076 0.097 0.200 0.100 0.090 SESNT ( p) . . . 0.076 0.097 0.200 0.094 0.117 Sample size: C = 100, n = 10.

41% of the individuals in each treatment cluster to treatment When ρ = 0.1, the optimal design assigns either 21% and and 45% of clusters to the control group (table 1, column 88% of individuals in treated clusters to treatment, and allo- 7). This produces SEs of 0.100 and 0.094 for the ITT and cates 33% of clusters to the low-treatment saturation, 39% SNT, respectively.26 Alternatively, the design that minimizes to the high-treatment saturation, and the remaining 28% to the SE of the SNT while maintaining a SE of 0.09 for the the control group (table 2, column 4).28 The SEs of 0.104 ITT (approximately the SE in the clustered design) assigns and 0.109 for the pooled ITT and SNT, respectively, are 7% 78% of the individuals in each treatment cluster to treatment to 13% larger than the SEs in the optimal partial population and 47% of clusters to the control group (table 1, column 8). design (table 1, column 5).29 The precision loss in moving This yields a SE of 0.117 for the SNT. from a partial population design to an RS design arises from If a researcher wishes to estimate the slope effect for the variation in treatment saturation. But this variation is pre- treated and untreated individuals and does not care about cisely what enables the identification of slope effects. This identifying individual or pooled effects, then from propo- illustrates the design trade-off discussed in section IIIC. sition 2, the optimal design will equally divide clusters Finally, we calculate the standard errors for RS designs between two interior saturations that are symmetric about used by three published studies. To facilitate comparabil- one half and will not have a pure control. When ρ = 0, ity with the optimal designs discussed above, we use the the optimal design assigns either 15% and 85% of individ- same number of clusters (C = 100), individuals per cluster uals in each cluster to treatment, and the SE of the slope (n = 10), and intracluster correlation (ρ = 0.1) as in our effect is 0.179 for both treated and nontreated individuals examples rather than the actual values from each study.30 (table 2, column 1). Increasing ρ moves the optimal satura- We begin with the RS design used in Banerjee et al. tions farther apart and increases the SEs for the slope effects (2012) and Crepon et al. (2013). Clusters were assigned (table 2, columns 2–3). When outcomes within a cluster to a pure control group and four equally spaced treatment are perfectly correlated, the optimal saturations are as far

apart as possible while still maintaining at least one treated to measure the slope for treated individuals (p2 and p3). This is because there and one nontreated individual in each treatment cluster. This are no treated individuals in the control group, so SEDT (0, p3) is undefined. corresponds to saturations 1/n and (n − 1)/n. 28 The optimal interior saturations are not symmetric around one half. Since the distance between saturations for the slope effect on nontreated However, few researchers will likely be interested in individuals is greater than the distance between saturations for the slope designing an experiment to maximize the precision of slope effect on treated individuals, the small share of nontreated individuals at p3 estimates, at the expense of sacrificing the ability to identify has less of an effect on the SEDNT , relative to the effect of the small share of treated individuals at p2 on the SEDT . 29 standard estimands, such as the ITT. Suppose a researcher The SEDT is larger than in the optimal slope design in column 2 because places equal weight on the precision of the pooled and slope the distance between saturations for treated individuals is smaller in this design, and fewer clusters are allocated to the saturations that identify the estimates, and chooses a design with a control group p1 = 0 DT. The SEDNT is approximately the same as in the optimal slope design in and two positive saturations p3 > p2 > 0 to minimize column 2—the distance between the saturations for untreated individuals is larger, but fewer clusters are allocated to the highest saturation. 27 30 min SEITT + SESNT + SEDT ( p2, p3) + SEDNT (0, p3). The pooled SEs are calculated for a model with constant treatment p2,p3,f and spillover effects, which implies homoskedastic errors. These are lower (12) bounds for the pooled SEs when treatment and spillover effects are not constant, and therefore errors are heteroskedastic. Even if it is not possible to reject the null hypothesis of a 0 slope effect, there may still be a small slope effect that creates heteroskedasticity. For example, in column 4, the 26 Moving to a more extreme objective that places nine times as much design is powered to detect slope effects on treated individuals that are larger weight on the SNT does not substantially alter the share of clusters allocated than 0.62. Suppose the true slope is 0.5. Then the design is not powered to pure control (47%) but does significantly reduce the optimal treatment to detect an effect this small, but there will still be heteroskedasticity, and saturation (23%). the pooled SE for treated individuals will be strictly larger than 0.104. To 27 With a control group, the saturations used to measure the slope for account for this, researchers should build a sample size cushion into their nontreated individuals (0 and p3) are further apart than the saturations used designs.

Downloaded from http://www.mitpressjournals.org/doi/pdf/10.1162/rest_a_00716 by guest on 29 September 2021 856 THE REVIEW OF ECONOMICS AND STATISTICS

Table 2.—Optimal Design for Slope Estimands and Precision in Existing Studies Optimal RS Designs Existing Studies Π minp2,p3,f Banerjee from SEITT + et al. (2012); Baird et al.; + minp1,p2,f SESNT Crepon Sinclair Baird minf SESNT Objective SEDT ( p1, p2)+ SEDT ( p2, p3)+ et al. et al. et al. s.t. SEITT Function SEDNT ( p1, p2) SEDNT (0, p3) (2013) (2012) (2012) ≤ .095 (1) (2) (3) (4) (5) (6) (7) (8) ICC ρ 0.0 0.1 1.0 0.1 0.1 0.1 0.1 0.1 p1 0.15 0.13 0.10 0.00 0.00 0.00 0.00 0.00 p2 0.85 0.87 0.90 0.21 0.25 0.10 0.33 0.33 p3 . . . 0.88 0.50 0.50 0.67 0.67 p4 . . . . 0.75 1.00 1.00 1.00 p5 . . . . 1.00 . . . f ( p1) 0.50 0.50 0.50 0.28 0.20 0.25 0.55 0.45 f ( p2) 0.50 0.50 0.50 0.33 0.20 0.25 0.15 0.21 f ( p3) . . . 0.39 0.20 0.25 0.15 0.21 f ( p4) . . . . 0.20 0.25 0.15 0.13 f ( p5) . . . . 0.20 . . . SEITT . . . 0.104 0.113 0.109 0.095 0.095 SESNT . . . 0.109 0.120 0.111 0.115 0.106 SEDT 0.179 0.191 0.250 0.217 0.240 0.242 0.289 0.269 SEDNT 0.179 0.191 0.250 0.192 0.240 0.274 0.251 0.221 Sample size: C = 100, n = 10.

saturations in equal shares, Π ={0, 0.25, 0.50, 0.75, 1} and Π ={0, 0.33, 0.67, 1} and f ={0.55, 0.15, 0.15, 0.15}. f ={0.2, 0.2, 0.2, 0.2, 0.2}. By virtue of having a pure con- While the saturations in this design are equally spaced, they trol group and more than two interior saturations, this study are not equally sized: the pure control group, at 55% of design can identify the ITT and SNT (pooled and saturation- clusters, is much larger than the share assigned to any treat- specific) effects and slope effects. Our power calculations ment saturation. The combination of having a larger control yield SEITT = 0.113, SESNT = 0.120, and SEDT (0.25, 1) = group and smaller variation in treatment saturations produces SEDNT (0, .75) = 0.240 (table 2, column 5). All of these SEs a smaller SE for the pooled ITT, relative to Banerjee et al. are higher than their counterparts in the design that mini- (2012) and Crepon et al. (2013), but higher SEs for the slope mizes the sum of these four SEs (table 2, column 4). This effects, particularly for treated individuals (table 2, column illustrates the power loss that arises from having a richer 7). The SE for the pooled SNT is 2 percentage points (or design that can, for example, test for the concavity of ITT( p) 21%) higher than that for the ITT, indicating that the pooled and SNT( p). spillover effects on the untreated are underpowered relative Our next example is the design used by Sinclair, to the pooled treatment effects. McConnell, and Green (2012). They randomized clusters Given this large difference between the SEs for the pooled into a pure control and three different saturations, Π = ITT and SNT, we explore whether it is possible to allo- {0, 1/n, 0.50, 1} and f ={0.25, 0.25, 0.25, 0.25}, where 1/n cate clusters to this set of saturations in a way that reduces is the saturation in which only one household is treated.31 In the SE of the pooled SNT, while maintaining the SE of the addition to the estimands that can be identified in Banerjee pooled ITT at 0.095. The optimal distribution of treatment et al. (2012) and Crepon et al. (2013), this design can also saturations allocates a lower share of clusters to the pure identify the TUT and the ST( p). Our power calculations control group and saturation 1 and a higher share to the yield SEITT = .109, SESNT = 0.111, SEDT (1/n,1) = 0.242, two interior saturations, one-third and two-third (table 2, and SEDNT (0, 0.5) = 0.274 (table 2, column 6). The pooled column 8). Such a design dominates the original study SEs are quite similar to their counterparts in the design that design, as it not only lowers the SE for the pooled SNT, minimizes the sum of these four SEs (table 2, column 4). but it also decreases the SEs of both slope effects. The However, the slope effect SEs are substantially higher, par- improved precision comes from redistributing clusters more ticularly for the nontreated (0.274 versus 0.192). This is efficiently between different treatment saturations, particu- because the largest saturation containing within-cluster con- larly by reallocating clusters from the pure control to interior trols is 0.50, so the saturations used to identify the slope saturations. effect on nontreated individuals are too close together. Our final example is Baird, McIntosh, and Özler (2011), which has a pure control and three positive saturations, V. Conclusion In recent years, empirical researchers have become 31 The saturation of 0.5 is approximate, as one core household plus half of the remaining households were assigned to treatment in these clusters. increasingly interested in studying interference between sub- For the purposes of our calculations, we use 0.5. jects. Experiments designed to rigorously estimate spillovers

Downloaded from http://www.mitpressjournals.org/doi/pdf/10.1162/rest_a_00716 by guest on 29 September 2021 OPTIMAL DESIGN OF EXPERIMENTS IN THE PRESENCE OF INTERFERENCE 857

open up a fascinating set of research questions and provide Rajasthan, India: Experimental Evidence on Incentives, Managerial policy-relevant information about program design. For Autonomy and Training,” NBER working paper 17912 (2012). Barrera-Osorio, Felipe, Marianne Bertrand, Leigh Linden, and Francisco example, if a vaccination or a bed net distribution program Perez-Calle, “Improving the Design of Conditional Cash Trans- with fixed resources can treat either 50% of all villages or fer Programs: Evidence from a Randomized Education Experiment 100% of half of them, measuring spillover effects will deter- in Colombia,” American Economic Journal: Applied Economics 3 (2011), 167–195. mine which treatment allocation maximizes the total benefit. Beaman, Lori A., “Social Networks and the Dynamics of Labour Market Variation in the intensity of treatment can determine whether Outcomes: Evidence from Refugees Resettled in the U.S.,” Review there are important scale or congestion effects that lead to of Economic Studies 79 (2012), 128–161. Bloom, Howard S., “Minimum Detectable Effects: A Simple Way to Report differential impacts on prices, norms, or behavior. Further, the Statistical Power of Experimental Designs,” Evaluation Review RCTs that fail to account for spillovers can produce biased 19 (1995), 547–556. estimates of intention-to-treat effects while finding mean- Bobba, Matteo, and Jeremie Gignoux, “Neighborhood Effects in Integrated ingful treatment effects but failing to observe deleterious Social Policies,” World Bank Economic Review (forthcoming). https://doi.org/10.1093/wber/lhw061. spillovers can lead to misconstrued policy conclusions. The Bobonis, Gustavo J., and Frederico Finan, “Neighborhood Peer Effects in RS design presented here provides an experimental frame- Secondary School Enrollment Decisions,” this review 91 (2009), work that can inform these policy questions and bolster both 695–716. Busso, Matias, and Sebastian Galiani, “The Causal Effect of Competition internal and external . on Prices and : Evidence from a Field Experiment,” NBER We formalize the design and analysis of such RS designs. working paper 20054 (2014). We show that varying the treatment saturation across clus- Crepon, Bruno, Esther Duflo, Marc Gurgand, Roland Rathelot, and Philippe Zamora, “Do Labor Market Policies Have Displacement Effects? ters generates direct experimental evidence on the nature of Evidence from a Clustered ,” Quarterly spillover effects for both treated and nontreated individu- Journal of Economics 128 (2013), 531–580. als. Having laid out the assumptions necessary to identify Duflo, Esther, Rachel Glennerster, and Michael Kremer, “Using Random- ization in Development Economics Research: A Toolkit,” CEPR average treatment and spillover effects, we derive analyt- discussion paper 6059 (2007). ical closed-form expressions for the standard errors. This Duflo, Esther, and Emmanuel Saez, “The Role of Information and Social allows us to gain analytical insights into the optimal design of Interactions in Retirement Plan Decisions: Evidence from a Ran- such experiments and derive ex ante power calculations. The domized Experiment,” Quarterly Journal of Economics 118 (2003), 815–842. standard errors for the pooled intention-to-treat effect and Gine, Xavier, and Ghazala Mansuri, “Together We Will: Experimental spillover effect on the nontreated are directly related to the Evidence on Female Voting Behavior in Pakistan,” AEJ: Microe- variation in treatment saturation. A design trade-off emerges: conomics 10 (2018), 207–235. Graham, Bryan S., Guido W. Imbens, and Geert Ridder, “Measuring varying the treatment saturation allows the researcher to the Effects of Segregation in the Presence of Social Spillovers: identify novel estimands, but this variation comes with a A Nonparametric Approach,” NBER working paper 16499 cost—it reduces the precision of the estimates of more basic (2010). Hahn, Jinyong, Keisuke Hirano, and Dean Karlan, “Adaptive Experimen- estimands. This is an inherent feature of RS designs. tal Design Using the Propensity Score,” Journal of Business and Economic Statistics 29 (2011), 96–108. Hirano, Keisuke, and Jinyong Hahn, “Design of Randomized Experi- ments to Measure Social Interaction Effects,” Economics Letters REFERENCES 106 (2010), 51–53. Hudgens, Michael, and Elizabeth Halloran, “Towards Causal Inference with Alix-Garcia, Jennifer, Craig McIntosh, Katharine R. E. Sims, and Jarrod R. Interference,” Journal of the American Statistical Association 103 Welch, “The Ecological Footprint of Poverty Alleviation: Evidence from Mexico’s Oportunidades Program,” this review 95 (2013), (2008), 832–842. 417–435. Killeen, G. F., T. A. Smith, H. M. Ferguson, H. Mshinda, S. Abdulla, C. Angelucci, Manuela, and Giacomo De Giorgi, “Indirect Effects of an Aid Lengeler, and S. P.Kachur, “Preventing Childhood Malaria in Africa Program: How Do Cash Transfers Affect Ineligibles’ Consump- by Protecting Adults from Mosquitoes with Insecticide-Treated tion?” American Economic Review 99 (2009), 486–508. Nets,” PLoS Med. 4 (2007), e229. Aronow, Peter, “A General Method for Detecting Interference in Ran- Liu, Lan, and Michael G. Hudgens, “Large Sample Randomization Infer- domized Experiments,” Sociological Methods Research 41 (2012), ence of Causal Effects in the Presence of interference,” Journal of 3–16. the American Statistical Association 109 (2014), 288–301. Aronow, Peter M., and Cyrus Samii, “Estimating Average Causal Effects Manski, Charles, “Identification of Endogenous Social Effects: The Reflec- under General Interference,” Annals of Applied Statistics 12 (2015), tion Problem,” Review of Economic Studies 60 (1993), 531–542. 1912–1947. ——— “Identification of Treatment Response with Social Interactions,” Athey, S., and G. W. Imbens, “The of Randomized Exper- Econometrics Journal 16:1 (2013), S1–S23. iments,” Handbook of Economic Field Experiments 1 (2017), McIntosh, Craig, Tito Alegria, Gerardo Ordóñez, and René Zenteno, “The 73–140. Neighborhood of Local Infrastructure Investment: Evidence from Baird, Sarah, Aislinn Bohren, Craig McIntosh, and Berk Özler, “Opti- Urban Mexico,” American Economic Journal: Applied Economics mal Design of Experiments in the Presence of Interference,” Penn 10 (2018), 263–286. Institute for Economic Research woking paper 16-025 (2017). Miguel, Edward, and Michael Kremer, “Worms: Identifying Impacts on Baird, Sarah, Craig McIntosh, and Berk Özler, “Cash or Condition? Evi- Education and Health in the Presence of Treatment Externalities,” dence from a Cash Transfer Experiment,” Quarterly Journal of Econometrica 72 (2004), 159–217. Economics 126 (2011), 1709–1753. Moffitt, Robert A., “Policy Interventions, Low-Level Equilibria and Social Banerjee, Abhijit, Arun G. Chandrasekhar, Esther Duflo, and Matthew O. Interactions” (pp. 45–82), in Steven Durlauf and Peyton Young,eds., Jackson, “The Diffusion of Microfinance,” Science 341:6144 (2013), Social Dynamics (Cambridge, MA: MIT Press 2001). 1236498. Sinclair, Betsy, Margaret McConnell, and Donald P. Green, “Detecting Banerjee, Abhijit, Raghabendra Chattopadhyay, Esther Duflo, Daniel Spillover Effects: Design and Analysis of Multilevel Experiments,” Keniston, and Nina Singh, “Improving Police Performance in American Journal of Political Science 56 (2012), 1055–1069.

Downloaded from http://www.mitpressjournals.org/doi/pdf/10.1162/rest_a_00716 by guest on 29 September 2021 858 THE REVIEW OF ECONOMICS AND STATISTICS

Sobel, Michael E., “What Do Randomized Studies of Housing Mobility Lemma 4. Assume assumptions 1 and 2. Given RS design (Π, f ), the Demonstrate? Causal Inference in the Face of Interference,” Journal error defined in equation (8) has individual variances, of the American Statistical Association 101 (2006), 1398–1407. Tchetgen Tchetgen, Eric J., and Tyler VanderWeele, “On Causal Infer- E[ε2 |T = 1, T = t]=σ2 + τ2 + (Y(1, p(t)) − Y(1))2, ence in the Presence of Interference,” Statistical Methods in Medical ic ic c [ε2 | = = ]=σ2 + τ2 + − 2 Research 21 (2010), 55–75. E ic Sic 1, Tc t (Y(0, p(t)) Y(0)) , 2 2 2 Toulis, Panos, and Edward Kao, “Estimation of Causal Peer Influence E[ε |Cic = 1, Tc = t]=σ + τ , Effects,” Journal of Machine Learning Research 28 (2013). ic and within-cluster , APPENDIX A 2 2 E[εicεjc|Tic = Tjc = 1, Tc]=τ + (Y(1, p(t)) − Y(1)) , Proofs from Section III 2 2 E[εicεjc|Sic = Sjc = 1, Tc]=τ + (Y(0, p(t)) − Y(0)) , Preliminary Calculations 2 E[εicεjc|Tic = Sjc = 1, Tc]=τ + (Y(0, p(t)) Consider the OLS estimate of − Y(0))(Y(1, p(t)) − Y(1)), 2 E[εicεjc|Cic = Cjc = 1, Tc]=τ . =  β + ε Yic Xic ic, (A1) The errors are uncorrelated across clusters, E[εicεjd |Tc, Td ]=0 for c = d. where Xic is a vector of treatment status covariates and εic is an error term  =[ ] with a block-diagonal variance-covariance matrix. Given Xc X1c ...Xnc ,  =[   ] ε =[ε ε ] [ε ε | ]=σ2 + τ2 X X1 ...XC and c 1c ... nc , let E c c Xc In 1n denote Proof. The variance of the error for treated individuals is the within-cluster variance-covariance matrix, where 1n is the n × n matrix of ones and assume E[εicεjd |X]=0 for all c = d. Let Y denote the vector 2 2 E[ε |Tic = 1, Tc = t]=E[(Yic(1, t−i) − Y(1)) ] of observed potential outcomes. The estimate of β is βˆ = A−1XY and the ic = τ2 + σ2 + − 2 exact finite sample variance of βˆ is Var(βˆ|X) = σ2A−1 +τ2A−1BA−1, where (Y(1, p(t)) Y(1)) .

C C n The covariance of the error between treated individuals in the same cluster ≡  =  is A XcXc XicXic, (A2) c=1 c=1 i=1 C E[εicεjc|Tic = Tjc = 1, Tc = t] B ≡ X 1 X . (A3) c n c = E[(Yic(1, t−i) − Y(1))(Yjc(1, t−j) − Y(1))] c=1 = τ2 + (Y(1, p(t)) − Y(1))2.

Proof of Lemma 1. Suppose the realized treatment vector for cluster c The other variances and covariances are analogous. Errors across clusters = [ε | = ]= [ − ]= is t (t1, ..., tn). Then E ic Tc t E Yic(t) Y(ti, p(t)) 0. are not correlated since outcomes across clusters are not correlated. [ε2 | = ]= [ − 2]= The variance of the error is E ic Tc t E (Yic(t) Y(ti, p(t))) τ2 + σ2. The covariance of the error between individuals in the same cluster 2 is E[εicεjc|Tc = t]=E[(Yic(t) − Y(ti, p(t)))(Yjc(t) − Y(tj, p(t)))]=τ . Proof of Lemma 3. Suppose treatment effects are constant on Π \{0}. Errors across clusters are not correlated, since outcomes across clusters are Then Y(1, p) = Y(1) and Y(0, p) = Y(0) for all p ∈ Π \{0}. From lemma not correlated. [ε2 | = = ]=σ2 + τ2 + − 2 = σ2 + τ2 4, E ic Tic 1, Tc t (Y(1, p(t)) Y(1)) , with similar calculations for the other variances and covariances. Therefore, Proof of Lemma 2. The error defined in equation (8) is not strictly exoge- the variance-covariance matrix reduces to a block diagonal structure with σ2 + τ2 τ2 nous, so we need to establish directly that βˆ is unbiased for equation (A1) variance and covariance . when X =[1 T S ]. From βˆ = A−1XY, with For the other direction, suppose the variance-covariance matrix is block ic ic ic σ2 +τ2 τ2 − = ⎡ ⎤ ⎡ ⎤ diagonal with variance and covariance . Then Y(1, p(t)) Y(1) 0 = Π 1 T S 1 μμ for all t 0 that arise in ( , f ). But then there must be no variation in the C n ⎢ ic ic ⎥ ⎢ S ⎥ population average potential outcome across positive saturations for treated = ⎣ 2 ⎦ = ⎣ μμ ⎦ A Tic Tic TicSic nC 0 (A4) individuals. Similarly, there must be no variation in the population average c=1 i=1 2 μ μ Sic TicSic Sic S 0 S potential outcome across positive saturations for within-cluster controls. Therefore, treatment effects are constant on Π \{0}. and

C n    Proof of Theorem 1. Assume assumptions 1 and 2. Consider an RS X Y = Yic TicYic SicYic , design with two interior saturations, p1 and p2, and a pure control. Let c=1 i=1 Tkic ≡ Tic ∗ 1{Pc = pk } and S1ic ≡ Sic ∗ 1{Pc = pk } for k = 1, 2. We want to compute Var(βˆ|X) for equation (A1) when C n   1  βˆ = 1 1 − 1 1 − 1  =[ ] ψ CicYic μ TicYic ψ CicYic μ SicYic ψ CicYic . Xic 1 T1ic S1ic T2ic S2ic . nC S c=1 i=1

By lemma 1, the error distribution is block diagonal. Let μk ≡ pk f ( pk ), [βˆ | ]= ≡ − η ≡ 2 ≡ − 2 = −μ +η Therefore, E 0 X Y(0, 0), sk (1 pk )f (pk ), k pk f (pk ) and qk (1 pk ) f (pk ) sk k k . From (A), Var(βˆ|X) = σ2A−1 + τ2A−1BA−1, with 1 E[βˆ |X]= pf ( p)Y( p) − Y( ) = Y( ) − Y( ) 1 μ 1, 0, 0 1 0, 0 , ⎡ ⎤ p∈Π\{0} 1 T1ic S1ic T2ic S2ic ⎢ 2 ⎥ 1 C n ⎢ T1ic T S1icT1ic T2icT1ic S2icT1ic ⎥ [βˆ | ]= − − = − ⎢ 1ic ⎥ E 2 X (1 p)f ( p)Y(0, p) Y(0, 0) Y(0) Y(0, 0), 2 μ A = ⎢ S1ic T1icS1ic S T2icS1ic S2icS1ic ⎥ S p∈Π\{0} ⎢ 1ic ⎥ c=1 i=1 ⎣ 2 ⎦ T2ic T1icT2ic S1icT2ic T2ic S2icT2ic 2 which establishes that βˆ is unbiased. S2ic T1icS2ic S1icS2ic T2icS2ic S2ic

Downloaded from http://www.mitpressjournals.org/doi/pdf/10.1162/rest_a_00716 by guest on 29 September 2021 OPTIMAL DESIGN OF EXPERIMENTS IN THE PRESENCE OF INTERFERENCE 859

⎡ ⎤ 1 μ1 s1 μ2 s2 and ⎢ μ μ ⎥ ⎡ ⎤ ⎢ 1 1 000⎥ 2 n n ⎢ ⎥ n n = T n = S = nC ⎢ s 0 s 00⎥ , C ⎢ i 1 ic i 1ic ⎥ ⎢ 1 1 ⎥ = ⎣ n n 2 n n ⎦ ⎣ ⎦ B n i=1 Tic i=1 Tic i=1 Tic i=1 Sic μ2 00μ2 0 c=1 n n n n 2 n i=1 Sic i=1 Tic i=1 Sic i=1 Sic s2 000s2 ⎡ ⎤ ⎛⎡ ⎤ ⎡ ⎤⎞ 1 μμ n n ⎢ S ⎥ ⎜⎢ ⎥ ⎢ ⎥ ⎟ = n2C ⎣ μη2 μ − η2 ⎦ , (A10) ⎜⎢ n T ⎥ ⎢ n T ⎥ ⎟ C ⎜⎢ i=1 1ic ⎥ ⎢ i=1 1ic ⎥ ⎟ μ μ − η2 μ − μ + η2 B = ⎜⎢ n ⎥ ∗ ⎢ n ⎥ ⎟ S S ⎜⎢ i=1 S1ic ⎥ ⎢ i=1 S1ic ⎥ ⎟ n n c=1 ⎝⎣ ⎦ ⎣ ⎦ ⎠ n n i=1 T2ic i=1 T2ic where the second equalities follow from T = nP , S = n n i=1 ic c i=1 ic C C n 2 i=1 S2ic i=1 S2ic n(1 − P )1 , nP = nC pf ( p) = nCμ, ( T ) = ⎡ ⎤ c Pc >0 c=1 c p∈Π c=1 i=1 ic μ μ 2 C 2 = 2 η2 C n n = 2 C − = 1 1 s1 2 s2 n c=1 Pc n C , c=1( i=1 Tic i=1 Sic) n c=1 Pc(1 Pc) ⎢ ⎥ 2 μ − η2 C n 2 = 2 C − 2 = 2 − ⎢ μ1 η1 μ1 − η1 00⎥ n C( ) and c=1( i=1 Sic) n c=1(1 Pc) 1Pc >0 n C(1 ⎢ ⎥ ψ− μ+η2 = 2 μ −μ+η2 βˆ| = 2 ⎢ μ − η ⎥ 2 ) n C( S ). Taking the diagonal entries of Var( X) n C ⎢ s1 1 1 q1 00⎥ , ⎣ μ η μ − η ⎦ yields 2 00 2 2 2 s μ − η q 1 η2 1 1 1 2 002 2 2 Var (βˆ ) = × nτ2 + + σ2 + (A11) 1 μ2 ψ μ ψ nC n = n = μ − μ + η2 where the second equalities follow from i=1 Tkic npk , i=1 Skic 1 2 S 1 2 1 1 C n C n Var (βˆ ) = × nτ + + σ + . n(1 − p ), T = np × Cf ( p ) = nCμ , S = 2 μ2 ψ μ ψ k c=1 i=1 kic k k k c=1 i=1 kic nC S S − × = 2 = C n 2 = 2 2 × n(1 pk ) Cf ( pk ) nCsk , Tkic Tkic, c=1( i=1 Tkic) n pk (A12) = 2 η C n × n = 2 − × = Cf ( pk ) n C k, c=1( i=1 Tkic i=1 Skic) n pk (1 pk ) Cf ( pk ) 2 μ − η n n = ρ = τ2 τ2 +σ2 η2 η2 n C( k k ), i=1 T1ic i=1 S2ic 0, and other analogous calcu- Substituting /( ) and the expression relating and T defined lations. Taking the diagonal entries of Var(βˆ|X) = σ2A−1 + τ2A−1BA−1 in equation (9), then taking the square root yields the standard errors. yields Proof of Proposition 1. Suppose that there are two interior saturations, p1 and p2. Without loss of generality, let p1 > p2 ≥ 0.5. Let f denote f ( p1). 1 2 1 1 2 1 1 = −ψ− ψ ∈[ −ψ] Var (βˆ ) = ∗ nτ + + σ + , (A5) Then f ( p2) 1 f . First fix and consider the optimal f 0, 1 1pj nC f ( p ) ψ μ ψ ρ = = j j to minimize equation (5). If 1, then SEITT ( p) SESNT ( p), and the first-order condition is 1 2 1 1 2 1 1 Var (βˆ 2p ) = ∗ nτ + + σ + , (A6) j nC f ( p ) ψ s ψ −1 j j 1 1 1 1 + + = f 2(1 − ψ − f )−2. f ψ 1 − ψ − f ψ for each pj ∈ Π \{0}. Taking the square root yields the standard errors. To compute the SE , note Var(DTˆ ( p , p )) = Var (βˆ − βˆ )/ DT 1 2 1p2 1p1 ∗ = − ∗ − ψ ( p − p )2 and from the matrix Var(βˆ|X),Cov(βˆ , βˆ ) = (nτ2 + σ2)/ Therefore, f 1 f and an equal share of clusters is allocated to 2 1 1p1 1p2 = = − ψ ψnC. Therefore, each treatment saturation, f ( p1) f ( p2) (1 )/2. This allocation equalizes all four SEs. If ρ = 0, then for p > 0.5, SEITT ( p) p2 > 1 − p2 > 1 − p1. Therefore, if f = 1 − ψ − f , equation (A13) is positive. The marginal value, measured in terms of the marginal reduction Proof of Theorem 2. Assume assumptions 1 and 2. Consider an RS Π in the SE, from an additional individual in a given group is concave. When design ( , f ) with at least one interior saturation and a pure control, and f = 1 − ψ − f , it is highest for within-cluster controls at saturation p , suppose that treatment and spillover effects are constant on Π \{0}.We 1 ˆ  followed by within-cluster controls at p2, then treated individuals at p2, want to compute Var(β|X) for equation (A1) when X =[1 Tic Sic ].By ic and finally, treated individuals at saturation p1. Given this concavity, when βˆ| = lemma 2, the error distribution is block diagonal. From (A), Var( X) f = 1 − ψ − f , the marginal value of an additional cluster at saturation p1 σ2 −1 + τ2 −1 −1 A A BA , with is higher than the marginal value of an additional cluster at saturation p2, ⎡ ⎤ ⎡ ⎤ and it must be that f ∗ > 1 − f ∗ − ψ. Similarly, for ρ ∈ (0, 1), it must be μμ ∗ − ∗ − ψ C n 1 Tic Sic 1 S that f > 1 f . ⎢ ⎥ ⎢ ⎥ Next, we consider the optimal share of control clusters. An additional A = ⎣ T T 2 T S ⎦ = nC ⎣ μμ0 ⎦ (A9) ic ic ic ic control cluster reduces the SEs for every term in equation (5), whereas an c=1 i=1 2 μ μ Sic TicSic Sic S 0 S additional cluster at saturation p reduces the SEs for only that saturation.

Downloaded from http://www.mitpressjournals.org/doi/pdf/10.1162/rest_a_00716 by guest on 29 September 2021 860 THE REVIEW OF ECONOMICS AND STATISTICS √ If ρ = 0, then noncontrol clusters are divided evenly between each treat- If ρ = 0, then 2Δ2 − 1 = 0, yielding Δ∗ = 2/2. Note that (2Δ2 − 1)/ ment saturation, f = (1 − ψ)/2, and the SEs are equalized. Therefore, the (1−Δ√2)2 is monotonically increasing for Δ ∈[0, 1) and strictly positive for objective simplifies to Δ > 2/2. The left-hand side is√ increasing in ρ and n and strictly positive when ρ > 0. Therefore, Δ∗ > 2/2 for ρ > 0, and Δ∗ is increasing in ρ 2 1 and n.Ifρ > 0, then the left-hand side converges to ∞ as n →∞, which min + , requires Δ∗ → 1. At the extreme, when ρ = 1, the optimal saturations ψ∈[0,1] 1 − ψ ψ are the farthest apart saturations that maintain one treated individual and one with-cluster control individual in each saturation, p∗ = 1/n and p∗ = ψ∗ ∗ ρ 1 2 which has the solution > f .If > 0, the optimal control size depends (n − 1)/n. on the saturations. Suppose p1 is close to 1. Then when ψ = f , the marginal value of an additional cluster at saturation p1 can be higher than the marginal value of an additional control cluster due to the low number of within-cluster μ ψ ∈ ψ∗ ∗ Proof of Corollary 1. Fixing ( , ) (0, 1), SEITT and SESNT are both controls in each p1 cluster. Therefore, it is possible that < f . There are η2 = minimized at T 0. This corresponds to a partial population design with a large number of treated individuals at extreme p1, so it may also be the ∗ ∗ ∗ ∗ a control group of size ψ and a treatment saturation of p = μ/(1 − ψ). case that ψ <(1 − p1)f . However, it can never be that ψ <(1 − p1)f , as it would be optimal to reallocate a cluster from p1 to the control. Proof of Proposition 3. Consider a partial population design with share- Proof of Proposition 2. Consider an RS design with two interior satu- of-control clusters ψ and share of treated individuals μ. Then rations, and let p2 > p1. Denote the size of saturation p1 by f ( p1) = f , and f ( p2) = 1 − f . Let Δ ≡ p2 − p1 denote the distance between the two − − saturations. The SEDT and SEDNT are concave in p1, p2 and 1 p1,1 p2, τ2 + σ2 1 1 1 (βˆ ) = ∗ nρ + ( − ρ) + respectively, and symmetric about one-half. Therefore, the optimal design SE 1 − ψ ψ 1 μ ψ . will equalize the SEs and minimizing equation (6) is equivalent to solving nC (1 ) (A14) 1 1 1 min nρ + + βˆ μ − μ − ψ ψ Δ ∈[ ]3 Δ2 − SE( 2) is analogous, replacing with 1 . Fixing , the optimal f , ,p1 0,1 f 1 f treatment share solves minμ SE(βˆ 1) + SE(βˆ 2), which has solution μ = 1 1 1 − ψ μ = μ (1 − ρ) + (1 )/2. This implies S , which corresponds to a partial population Δ2 − + Δ ∗ = μ = − ψ fp1 (1 f )( p1 ) experiment with treatment saturation p 0.5. Plugging (1 )/2 1 1 into equation (A14) yields SE(βˆ 1) = SE(βˆ 2). Thus, the optimal share of + + . control clusters solves f (1 − p1) (1 − f )(1 − Δ − p1) Fixing Δ, the FOCs with respect to p and f are 1 1 + ψ 1 min nρ + (1 − ρ) . (A15) ψ ψ(1 − ψ) ψ(1 − ψ) f p2(1 − p )2(2( p + Δ) − 1) = 1 1 1 , √ 2 2 ∗ 1 − f ( p1 + Δ) (1 − p1 − Δ) (1 − 2p1) When ρ = 0, equation (A15) is minimized at ψ = 2 − 1. When ρ = 1, ∗ 1 1 − ρ equation (A15) is minimized at ψ = 0.5. When ρ ∈ (0, 1), the general + nρ − 2 + Δ − Δ − FOC for equation (A15) is (1 f ) ( p1 )(1 p) 1 1 − ρ = + nρ . − ρ ψ2 + ψ − + ρ ψ − = 2 (1 )( 2 1) n (2 1) 0. (A16) f p1(1 − p1) Using the quadratic formula with a = 1 − ρ, b = 2(1 − ρ + nρ) and The solutions to these FOCs are p1 = (1−Δ)/2 and f = 0.5, which implies = + Δ = + Δ c =−(1 − ρ + nρ) to solve for ψ yields the optimal control group size. p2 p1 (1 )/2. Therefore, the optimal size of each saturation + ψ ψ − ψ ψ − ψ bin is equal, and the optimal saturations are symmetric about 0.5. The Δ Given that (1 )/ (1 ) and 1/ (1 ) are both convex and have that minimizes equation (6) is equivalent to solving unique minimums, any weighted sum of these functions is minimized at a value ψ∗ that lies between√ the minimum of each function. Therefore, when ρ ∈ ψ∗ ∈ − 1 1 (0, 1), ( 2 1, 1/2). Taking the derivative of equation (A16) min nρ + 8(1 − ρ) . with respect to n yields Δ Δ2 1 − Δ2 Δ∗ ∂ψ∗ ρ(1 − 2ψ∗) The optimal solves = ≥ 0 ∂n 2nρ + 2(1 − ρ)(1 + ψ∗) nρ 2Δ2 − 1 = . ∂ψ∗ 2 2 8(1 − ρ) (1 − Δ ) A similar calculation establishes that ∂ρ > 0.

Downloaded from http://www.mitpressjournals.org/doi/pdf/10.1162/rest_a_00716 by guest on 29 September 2021