The Short-Run and Long-Run Effects of Behavioral Interventions: Experimental Evidence from Energy Conservation

Hunt Allcott and Todd Rogers

May 30, 2013

Abstract

Interventions to affect behaviors such as smoking, exercise, workplace effort, or pro-social behaviors can often have large short-run impacts but uncertain or disappointing long-run effects. We study the three longest-running sites of a larger program designed to induce energy conservation, in which home energy reports containing personalized feedback, social comparisons, and energy conservation information are being repeatedly mailed to more than six million households across the United States. We show that treatment group households reduce electricity use within days of receiving each of their initial few reports, but these immediate responses decay rapidly in the months between reports. As more reports are delivered, the average treatment effect grows but the high-frequency pattern of action and backsliding attenuates. When a randomly-selected group of households has reports discontinued after two years, the effects are much more persistent than they had been between the initial reports, implying that households have formed a new "capital stock" of physical capital or consumption habits. We show how assumptions about long-run persistence can be important enough to change program adoption decisions, and we illustrate how program design that accounts for the capital stock formation process can significantly improve cost effectiveness. JEL Codes: D03, D11, L97, Q41. Keywords: Energy efficiency, habit formation, persistence, social comparisons.

———————————————————————————— We thank Ken Agnew, David Cesarini, Gary Charness, Paul Cheshire, Xavier Gabaix, Francesca Gino, Judd Kessler, Karen Palmer, Charlie Sprenger, staff at the utilities we study, and a number of seminar participants for feedback and helpful conversations. Thanks also to Tyler Curtis, Lisa Danz, Rachel Gold, Arkadi Gerney, Marc Laitin, Laura Lewellyn, Elena Washington, and many others at Opower for sharing data and insight with us. We are grateful to the Sloan Foundation for financial support of our research on the economics of energy efficiency. Stata code for replicating the analysis is available from Hunt Allcott's website.

Allcott: NYU Department of Economics. [email protected]. Rogers: Harvard Kennedy School. [email protected].

1 Introduction

When making choices such as whether or not to smoke, exercise, eat healthfully, work or study hard, pay bills on time, and conserve resources, our choices sometimes differ from those that would maximize social welfare, or perhaps even our own long-run welfare. Individuals, employers, and government agencies have therefore experimented with many different kinds of interventions to "improve" behaviors, including financial incentives, information provision, commitment contracts, appeals to the public good, and social comparisons. In an effort to produce useful and timely insights about these interventions, evaluations often only examine short-run effects. The studies that do examine long-run effects often find that it is difficult to sustainably change behaviors.1

Given this, several related questions are crucial to designing and evaluating programs aimed at changing human behaviors. First, do people continue to respond to repeated intervention, or do we eventually stop paying attention? Second, how persistent are effects after the intervention ends? Third, do longer interventions cause more persistent post-intervention effects? These questions often determine whether an intervention is cost-effective, and understanding the answers can help optimize program design.

This paper is a comprehensive study of the short-run and long-run effects of an intervention that aims to induce energy conservation by sending home energy reports that feature personalized feedback, social comparisons, and energy conservation information. The reports are mailed to households monthly or every few months for an indefinite period of time. The program, which is managed by a company called Opower, is typically implemented as a randomized control trial, giving particularly credible estimates of its effects on energy use. Opower's programs are being implemented at 85 utilities across the United States, and there are now 8.9 million households in their experimental populations, including 6.2 million in treatment. Utilities hire Opower to implement the intervention primarily because the resulting energy savings help to comply with state regulations requiring energy conservation.

We study three Opower programs with three important features. First, these are the three longest-running Opower programs, having begun between early 2008 and early 2009. By the end of our sample, some treatment group households have received home energy reports for 60 consecutive months, and one might wonder whether people attenuate to this repeated intervention over such a long time horizon. Second, a randomly-selected subset of treatment group households was dropped from treatment after about two years, allowing us to measure the persistence of effects for two to three years after the intervention stops. Third, while most utilities manually record household electricity use on a monthly basis, one of our three utilities uses advanced meters that record consumption each day. Although in recent years, millions of households have been outfitted with similar "smart meters" (Joskow 2012, Joskow and Wolfram 2012), the granularity of these data has generated privacy concerns that make them especially difficult to acquire for research. At this site alone, we have 225 million observations of daily electricity use at 123,000 de-identified households.

1 For example, see Cahill and Perera (2009) for a review of the long-run effects of interventions to encourage smoking cessation, as well as a number of studies of exercise, weight loss, school performance, and other behaviors that we discuss later in the introduction.

The data reveal a remarkable pattern of "action and backsliding": consumers reduce electricity use markedly within days of receiving each of their initial reports, but these immediate efforts decay at a rate that might cause the effects to disappear in well under a year if the treatment were not repeated. However, this cyclical pattern of action and backsliding attenuates after the first few reports: the immediate consumption decreases after report arrivals become much smaller, and the decay rate between reports becomes statistically indistinguishable from zero. Although the short-run pattern of cyclical responses attenuates, the average treatment effects do not: for the groups that continue to receive reports at the same frequency throughout our samples, the effects continue to grow - even after 60 consecutive months.

Compared to behavioral interventions in other domains, the effects are remarkably persistent. For the groups whose reports are discontinued after about two years, the effects begin to decay at about 10 to 20 percent per year. Furthermore, this decay rate is at least six times slower than the decay rate between the initial reports. This difference implies that as the intervention is repeated, people gradually develop a new "capital stock" that makes the effects persistent. This capital stock might be physical capital, such as new energy efficient lightbulbs, or "consumption capital" - a stock of energy use habits in the sense of Becker and Murphy (1988).

Tangibly, what are consumers doing in response to the intervention? To answer this, we analyze detailed surveys of 6000 households in treatment and control groups across five sites. The results suggest that the intervention makes very little difference on the "extensive margin": that is, the set of energy conservation actions that households report taking. Although these self-reports should be interpreted very cautiously, one interpretation is that some of the behavior changes are on the "intensive margin," by which we mean that the program motivates households to take more of the same actions that they were already taking.

The surveys do suggest that the intervention increases participation in other utility-run energy efficiency programs, which involve improvements to physical capital stock, such as insulation or refrigerators, that would mechanically generate persistent energy savings. To corroborate this, we analyze two utilities' administrative data on program participation by treatment and control groups. We show that the intervention does cause statistically significant increases in utility program participation, but these differences are not economically large: they explain only a small share of the treatment effects. Our interpretation of the treatment effect patterns, the surveys, and the program participation data is that instead of responding in one or two major ways, different households respond by taking a wide variety of actions, no one of which constitutes an important share of the total reductions by the treatment group.

After presenting the empirical results, we turn to the economic implications. We show how long-run persistence can play an important role in whether or not it is cost-effective to adopt an intervention. In these three sites, different assumptions about persistence would have changed the ex-ante predicted cost effectiveness by a factor of two or more. One broader implication is that when deciding whether or not to scale up other interventions when long-run results such as these are not available, it will in some cases be optimal to first measure whether short-run effects are persistent - even if this measurement delays the decision-making process. In our context, it turns out that analysts had systematically made assumptions that were too conservative, but in other settings this may not be the case.

We also show how the dynamics of persistence can play an important role in designing behavioral interventions. In doing this, we conceptually distinguish two separate channels through which repeated intervention changes cumulative outcomes. The additionality effect reflects the extent to which repeated intervention increases effects during the treatment period, holding constant the post-intervention decay rate. The persistence effect captures the extent to which repeated intervention induces households to change physical capital or consumption habits, which in turn causes the changes in outcomes to last longer after the intervention ends. We quantify these effects in the context of the Opower program, showing that the persistence effect is the primary reason why repeated intervention improves cost effectiveness. For Opower and for other interventions in other contexts, this suggests that it is important to repeat an intervention until participants have developed a new "capital stock" of habits or other technologies. After that point, it may be optimal to reduce the frequency of intervention, unless the additionality effect is very powerful.

Our results are related to several different literatures. The action and backsliding in response to home energy reports is reminiscent of evidence that consumers "learn" about late fees and other charges as they incur them, but act as if they forget that knowledge over time (Agarwal et al. 2011, Haselhuhn et al. 2012). For some consumers, the home energy report acts simply as a reminder to conserve energy, making this related to studies of reminders to save money (Karlan, McConnell, Mullainathan, and Zinman 2010) or take medicine (Macharia et al. 1992). Ebbinghaus (1885), Rubin and Wenzel (1996), and others have quantified the decay of memory and the functional form of "forgetting curves." Our results are novel in that they provide a clear illustration of how consumers' attention waxes and wanes in response to repeated stimuli, but this cyclical action and backsliding eventually attenuates.

There are also studies of the medium- and long-run effects of interventions to affect exercise (Charness and Gneezy 2009, Royer, Stehr, and Sydnor 2013), smoking (Gine, Karlan, and Zinman 2010, Volpp et al. 2009), weight loss (Anderson et al. 2010, Burke et al. 2012, John et al. 2011), water conservation (Ferraro, Miranda, and Price 2011, Ferraro and Price 2013), academic performance (Jackson 2010, Levitt, List, and Sadoff 2010), voting (Gerber, Green, and Shachar 2003), charitable donations (Landry et al. 2010), labor effort (Gneezy and List 2006), and other repeated choices. Compared to these studies, we document relatively persistent changes in outcomes over a relatively long time horizon.

Aside from being of scientific interest, these results have very concrete practical importance. Each year, electric and natural gas utilities spend several billion dollars on energy conservation programs in an effort to both reduce energy use externalities and ameliorate other market failures that affect investment in energy-using durable goods (Allcott and Greenstone 2012). Traditionally, one significant disposition of these funds has been to subsidize energy efficient investments, such as Energy Star appliances or home energy weatherization. Recently, there has been significant interest in "behavioral" energy conservation programs, by which is meant information, persuasion, and other non-price interventions.2 The Opower programs are perhaps the most salient example of this approach. One of the foremost questions on practitioners' minds has been the extent to which behavioral interventions have persistent long-run effects: while capital stock changes like new insulation are believed to reduce energy use for many years, it was not obvious what would happen after several years of home energy reports. Our results give some initial evidence on this issue.

The paper proceeds as follows. Section 2 gives additional background on the program and describes the data. Section 3 presents the short-run analysis using high-frequency data, while Section 4 presents the long-run analysis. Section 5 presents evidence on the physical actions that consumers take in response to the intervention. Section 6 carries out the cost effectiveness analysis, and Section 7 concludes.

2 Experiment Overview

2.1 Background

Aside from selling energy, most large electric and natural gas utilities in the U.S. also run energy conservation programs, such as home energy audit and weatherization programs and rebates for energy efficient light bulbs and appliances. While energy conservation can reduce revenues for private investor-owned utilities, many states require utilities to fund conservation programs out of small surcharges called System Benefits Charges, and in recent years many states have passed Energy Efficiency Resource Standards that require utilities to cause consumers to reduce energy use by some amount relative to counterfactual, often 0.5 to 1 percent per year.

Opower is a firm that contracts with utilities to help meet these energy conservation requirements. Their "technology" is unusual: instead of renovating houses or subsidizing energy efficiency, they send two-page Home Energy Report letters to residential consumers every month or every several months. Figure 1 reproduces a home energy report for an example utility. The first page features a "Social Comparison Module," which compares the household's energy use from the previous month to that of 100 neighbors with similar house characteristics. The second page includes personalized energy consumption feedback, which varies from report to report. This feedback might include individualized comparisons to usage from the same season in previous years or trends in usage compared to neighbors. The second page also includes an "Action Steps Module," which provides energy conservation tips. These tips are drawn from a large library of possible tips, and they vary with each report. Some tips are targeted at specific types of households: for example, a household with relatively heavy summer usage is more likely to see information about purchasing energy efficient air conditioners.

2 Abrahamse et al. (2005) is a useful literature review of behavioral interventions centered around energy conservation, and Allcott and Mullainathan (2010b) cite some of the more recent work.

The initial proof of concept that social comparisons could affect energy use was developed in a pair of papers by Nolan et al. (2008) and Schultz et al. (2007). There is also a body of evidence that social comparisons affect choices in a variety of domains, such as voting (Gerber and Rogers 2009), retirement savings (Beshears et al. 2012), water use (Ferraro and Miranda 2013, Ferraro and Price 2013) and charitable giving (Frey and Meier 2004), as well as a broader literature in psychology on social norms, including Cialdini, Reno, and Kallgren (1990) and Cialdini et al. (2006). There is now a mini-literature focusing on the Opower programs. Allcott and Mullainathan (2012) show that the average treatment effects across the first 14 Opower sites range from 1.4 to 2.8 percent of electricity use. Opower programs have also been studied by Allcott (2011), Ayres, Raseman, and Shih (2012), Costa and Kahn (2013), Davis (2011), and a number of industry practitioners and official program evaluators such as Ashby et al. (2012), Integral Analytics (2012), KEMA (2012), Opinion Dynamics (2012), Perry and Woehleke (2013), and Violette, Provencher, and Klos (2009).

Within this literature on Opower, our contributions are clear. First, we are the first to document consumers' "action and backsliding" using high-frequency data.3 Second, this is the first (publicly-available) study that compares long-run persistence across multiple sites. Third, we bring together the short-run and long-run analyses to analyze how the persistence affects cost effectiveness and optimal program design.

3 Although we are the first to be able to document it, this potential effect has been of previous interest. For example, Ayres, Raseman, and Shih (2012) test for what they call a "staleness effect" using monthly data for recipients of quarterly reports, but find no evidence that effects vary with the time since reports. It would have been unlikely for Ayres, Raseman, and Shih (2012) or others to even find suggestive evidence in monthly data because the report arrival dates do not match up well with the monthly billing cycles.

2.2 Experimental Design

We study the three longest-running Opower home energy report programs, which take place at three utilities which we have been asked not to identify directly. Table 1 outlines experimental design and provides descriptive statistics. The three sites are in different parts of the country: Site 1 is in the upper Midwest, with cold winters and mild summers, while Sites 2 and 3 are on the West coast.

The initial experimental populations across the three sites comprise 234,000 residential electricity consumers. The populations have several common features: all households must be single-family homes, have at least one to two years of valid pre-experiment energy use data, and satisfy some additional technical conditions.4 Some aspects of the populations differ across sites. Site 1 is a relatively small utility, and its entire residential customer population was included. Site 2 is a larger utility, with about 930,000 residential customers in 2007 (U.S. Energy Information Administration 2013). The utility decided to limit the program to households in particular parts of their service territory that purchase both electricity and natural gas, and they included only relatively heavy energy users (more than 80 million British thermal units per year). Site 3 is also a relatively large utility, with about 520,000 residential customers. In Site 3, Opower selected census tracts within the customer territory to maximize the number of eligible households.

The experimental populations were randomly assigned to treatment and control. In Site 3, which was Opower's first program ever, households were grouped into 952 geographically-contiguous "block batch" groups, each with an average of 88 households. This was done because of initial concern over geographic spillovers: that people would talk with their neighbors about the reports. No evidence of this materialized, and all programs since then, including Sites 1 and 2, have been randomized at the household level. In Sites 1 and 2, treatment group households were randomly assigned to receive either monthly or quarterly reports. In Site 3, heavier users were assigned to receive monthly reports, while lighter users were assigned to quarterly.

4 Typically, households in Opower's experimental populations need to have valid names and addresses, no negative electricity meter reads, at least one meter read in the last three months, no significant gaps in usage history, exactly one account per customer per location, and a sufficient number of neighbors to construct the neighbor comparisons. Households that have special medical rates or photovoltaic panels are sometimes also excluded. Utility staff and "VIPs" are sometimes automatically enrolled in the reports, and we exclude these non-randomized report recipients from any analysis. These technical exclusions eliminate only a small portion of the potential population.

The three experiments began between early 2008 and early 2009. After about two years, a subset of treatment group households was randomly selected to stop receiving reports. We call this group the "dropped group." The remainder of the treatment group, which we call the "continued group," is still receiving reports. In Sites 2 and 3, the entire continued group is still receiving reports at their original assigned frequency. In Site 1, however, the utility wished to taper the frequency for the continued group, so this group is now receiving biannual reports.

2.3 Data for Long-Run Analysis

In the "long-run analysis," we analyze monthly billing data from three sites and estimate treatment e¤ects over the past four to …ve years. The three utilities in our analysis bill customers approxi- mately once a month, and our outcome variable is mean electricity use per day over a billing period.

We therefore have about 12 observations per household per year, or 16.7 million total observations in the three sites.

In each site, we construct baseline usage from the earliest one-year period when we observe electricity bills for nearly all households. In each site, average baseline usage is around 30 kilowatt-hours (kWh) per day, or between 11,000 and 11,700 kWh per year. These figures are close to the mean residential electricity usage for 2007 that the utilities reported to the U.S. Energy Information Administration (2013), although in Sites 2 and 3 the slightly higher means reflect the fact that the populations were drawn from heavier users. The figures are also close to the national average of 11,280 kWh per year (U.S. Energy Information Administration 2011). For context, one kilowatt-hour is enough energy to run either a typical new refrigerator or a standard 60-watt incandescent lightbulb for about 17 hours. In the average American home, space heating and cooling are the two largest uses of electricity, comprising 26 percent of consumption. Refrigerators and hot water heaters use 17 and 9 percent of electricity, respectively, while lighting also uses about 9 percent (U.S. Energy Information Administration 2009). Appendix Figure A1 provides more detail on nationwide household electricity use.

The three utilities also have fairly standard pricing policies. Site 1 charges 10 to 11 cents/kWh, depending on the season. Sites 2 and 3 have increasing block schedules, with marginal prices of 8 to 11 cents/kWh and 8 to 18 cents/kWh, respectively, depending again on the season.

While there appear to be very few errors in the dataset, there are a small number of very high meter reads that may be inaccurate. We exclude any observations with more than 1500 kilowatt-hours per day. In Site 2, for example, this is 0.00035 percent of observations. In all three sites, baseline energy usage is balanced between treatment and control groups, as well as between the dropped and continued groups within the treatment group.

We also observe temperature data from the National Climatic Data Center, which are used to construct heating degree-days (HDDs) and cooling degree-days (CDDs). The heating degree-days for a particular day equal the difference between 65 degrees and the mean temperature, or zero, whichever is greater. Similarly, the cooling degree-days for a particular day equal the difference between the mean temperature and 65 degrees, or zero, whichever is greater. For example, a day with average temperature 95 has 30 CDDs and zero HDDs, and a day with average temperature 60 has zero CDDs and 5 HDDs. HDDs and CDDs vary at the household level, as households are mapped to different nearby weather stations. Because heating and cooling are such important uses of electricity in the typical household, heating and cooling degrees are important correlates of electricity demand.
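For concreteness, the degree-day construction can be written as a short function. This is a minimal sketch of the definitions above, not the authors' code; the function name and the 65-degree Fahrenheit base are our own choices.

def degree_days(mean_temp_f, base=65.0):
    """Return (HDD, CDD) for one day, given the day's mean temperature in degrees Fahrenheit."""
    hdd = max(base - mean_temp_f, 0.0)  # heating degrees: how far the mean falls below 65
    cdd = max(mean_temp_f - base, 0.0)  # cooling degrees: how far the mean rises above 65
    return hdd, cdd

# Examples from the text: a 95-degree day has 30 CDDs and zero HDDs;
# a 60-degree day has zero CDDs and 5 HDDs.
assert degree_days(95.0) == (0.0, 30.0)
assert degree_days(60.0) == (5.0, 0.0)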

There is one source of attrition from the data: households that become "inactive," typically when they move houses. If a customer becomes inactive, he or she no longer receives reports after the inactive date, and in most cases we do not observe the electricity bills. In our primary specifications, we do include the households that become inactive, but we exclude any data observed after the inactive date. As Table 1 shows, 20 to 26 percent of households move in the four to five years after treatment begins, or about five percent per year. The table presents six tests of balanced attrition: treatment vs. control and dropped vs. continued in each of the three sites. One of those six tests rejects equality: in Site 1, dropped group households are slightly more likely to move than continued households. We are not very concerned that this could bias the results, because the two groups are balanced on pre-treatment usage, Figure 4a shows that the treatment effects during the joint treatment period are almost visually indistinguishable, and Appendix Table A5 confirms that the treatment effects are statistically indistinguishable during the first and second years of joint treatment.

There is also a source of attrition from the program: people in the treatment group can contact the utility and opt out of treatment. In these sites, about two percent of the treatment group has opted out since the programs began. We continue to observe electricity bills for households that opt out, and we of course cannot drop them from our analysis because this would generate imbalance between treatment and control. We estimate an average treatment effect (ATE) of the program, where by "treatment" we more precisely mean "receiving reports or opting out." Our treatment effects could also be viewed as intent-to-treat estimates, where by the end of the sample, the Local Average Treatment Effect on the compliers who do not opt out is about 1/0.98 larger than our reported ATE. Because the opt-out rate is so low, we do not make any more of this distinction in our analysis. However, when calculating cost effectiveness, we make sure to include costs only for letters actually sent, not letters that would have been sent to households that opted out or moved.
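In other words, the adjustment from the reported ATE to the effect on compliers is the simple rescaling

$$\text{LATE} \;=\; \frac{\text{ATE}}{1 - \text{opt-out rate}} \;=\; \frac{\text{ATE}}{0.98} \;\approx\; 1.02 \times \text{ATE}.$$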

2.4 Data for Short-Run Analysis

In Sites 1 and 3, electricity meters are read each month by utility staff, who record the total consumption over the billing period. By contrast, Site 2 has Advanced Metering Infrastructure (AMI), which records daily electricity consumption. In the "short-run analysis," we exploit these daily energy use data to test for high-frequency variation in the average treatment effects (ATEs).

Table 2 presents Site 2's daily electricity use data. We separate the households into three different groups based on the frequency of the intervention. The first two are the monthly and quarterly groups from the initial experimental population of about 79 thousand households, which began in October 2008 and is discussed above. The third group is a "second wave" of about 44 thousand households from a nearby suburb that were added in February 2011. Half of these households were assigned to bimonthly treatment and half to control. Within each of the three frequency groups, all households were scheduled to receive reports on the same days. For this part of the analysis, we exclude the dropped group households in the monthly and quarterly groups after their reports are discontinued. This reduces the sample size somewhat after September 2010 but does not generate imbalance because the households were randomly selected.

While pre-treatment usage is balanced between treatment and control for the monthly and quarterly groups, this is not the case for the bimonthly group that begins in February 2011: the treatment group's average pre-treatment usage is lower than that of its control group by 0.69 kWh/day, with a robust standard error of 0.20 kWh/day. The reason is that the partner utility asked Opower to allocate these households to treatment and control based on odd vs. even street address numbers. Thus, it is especially important that we control for baseline usage when analyzing this group of households. After controlling appropriately, we have found no evidence that the imbalance biases our results. We include this group to provide additional supporting evidence, although readers may feel free to focus on the results from the monthly and quarterly groups.

In order to analyze how daily average treatment effects respond to the receipt of home energy reports, we must predict when the reports actually arrive. In this experiment, all reports delivered in a given month for any of the three frequency groups are generated and mailed at the same time. Opower's computer systems generate the reports between Tuesday and Thursday of the first or second week of the month. The computer file of reports for all households in each utility is sent to a printing company in Ohio, which prints and mails them on the Tuesday or Wednesday of the following week. According to the U.S. Postal Service "Modern Service Standards," the monthly and quarterly groups are in a location where expected transit time is eight USPS "business days," which include Saturdays but not Sundays or holidays. The bimonthly group is in a nearby suburb where the expected transit time is nine business days. Of course, reports may arrive before or after the predicted day, and people may not open the letters immediately.
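As a minimal sketch (not the authors' code) of this arrival-date prediction, the calculation is simply a walk forward from the mailing date over USPS business days, which include Saturdays but skip Sundays and holidays. The mailing date in the example below is hypothetical, chosen so that the predicted arrival matches the October 24, 2008 date discussed in the next section.

from datetime import date, timedelta

def predicted_arrival(mail_date, transit_days, holidays=frozenset()):
    """Advance from the mailing date by transit_days USPS business days
    (Monday through Saturday, excluding holidays)."""
    d = mail_date
    counted = 0
    while counted < transit_days:
        d += timedelta(days=1)
        if d.weekday() == 6 or d in holidays:  # weekday() == 6 is Sunday
            continue
        counted += 1
    return d

# Hypothetical mailing on Wednesday, October 15, 2008, with the eight-business-day
# transit time quoted for the monthly and quarterly groups:
print(predicted_arrival(date(2008, 10, 15), 8))  # -> 2008-10-24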

3 Short-Run Analysis

3.1 Graphical

We begin by plotting the average treatment effects for each day of the first year of the Site 2 experiment for the quarterly groups, using a seven-day moving window to smooth over idiosyncratic variation. These ATEs are calculated simply by regressing $Y_{it}$, household $i$'s electricity use on day $t$, on a treatment indicator $T_i$, for all days $t$ within a seven-day window around day $d$. We control for a vector of three baseline usage variables $Y_i^b$: average baseline usage (January-December 2007), average summer baseline usage (June-September 2007), and average winter baseline usage (January-March and December 2007). We also include a set of day-specific constants $\mu_t$. For each day $d$, the regression is:

$$Y_{it} = \tau^d T_i + \beta Y_i^b + \mu_t + \varepsilon_{it}, \qquad \forall t \in [d-3, d+3] \qquad (1)$$

Figure 2 plots the quarterly group's ATEs $\hat{\tau}^d$ with 90 percent confidence intervals, while Appendix Figure A2 plots both the monthly and quarterly groups. In this regression and all others in the paper, standard errors are robust and clustered at the household level to control for arbitrary serial correlation in $\varepsilon_{it}$, per Bertrand, Duflo, and Mullainathan (2004). Here and everywhere else in the paper, superscripts always index time periods; we never use exponents.
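A minimal sketch of this day-by-day regression, under an assumed data layout (column names such as usage_kwh and treated are ours, not the authors'), illustrates the structure of equation (1): a seven-day window around day $d$, the treatment indicator, the three baseline usage controls, day dummies, and household-clustered standard errors.

import pandas as pd
import statsmodels.formula.api as smf

def daily_ate(df, d):
    """Return (tau_d, clustered standard error) for the 7-day window centered on day d.
    Assumes columns: household_id, day, usage_kwh, treated (0/1),
    base_annual, base_summer, base_winter."""
    window = df[(df["day"] >= d - pd.Timedelta(days=3)) &
                (df["day"] <= d + pd.Timedelta(days=3))].copy()
    window["day_f"] = window["day"].astype(str)  # day-specific constants
    fit = smf.ols(
        "usage_kwh ~ treated + base_annual + base_summer + base_winter + C(day_f)",
        data=window,
    ).fit(cov_type="cluster", cov_kwds={"groups": window["household_id"]})
    return fit.params["treated"], fit.bse["treated"]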

Figure 2 shows that the $\hat{\tau}^d$ coefficients increase rapidly in absolute value around October 24th, 2008, the date when the first report is predicted to arrive. Two weeks before the predicted arrival date, the quarterly group's ATE is -0.03 kWh/day, with a 90 percent confidence interval of (-0.17, 0.11). By November 3rd, 10 days after the predicted arrival date, the ATE is -0.45 kWh/day, with a confidence interval of (-0.60, -0.31). This ATE is about 1.5 percent of average electricity use, which from Table 1 was about 30 kWh/day. Tangibly, this is equivalent to each treatment group household turning off seven standard 60-watt lightbulbs for an hour, every day. Between early November and early January, however, the treatment effect weakens by about 0.2 kWh/day. About half of the conservation efforts that were taken initially were abandoned within two months.

The quarterly group's second report arrives on approximately January 23rd, 2009. In the 14 days between January 19th and February 2nd, the point estimates of the quarterly group's treatment effect increase in absolute value from -0.25 to -0.58. These effects similarly decay away until mid-April, when the third report arrives. Around this and the fourth report, the effects similarly jump and decay, although these cyclical responses appear to become less pronounced.

This initial presentation of raw data makes clear the basic trends in households' responses to the treatment. However, the standard errors are wide, and the point estimates fluctuate quite a bit on top of this basic potential pattern of jumps and decays. Furthermore, holidays and weather are likely to influence the treatment effects. Collapsing across multiple report arrivals helps to address this, by reducing standard errors and smoothing over idiosyncratic variation. In addition, controlling for weather can both increase efficiency and, if weather is correlated with report arrivals, remove bias when estimating trends in effects associated with report arrivals. For example, if the second or third report happened to arrive in an extremely cold week, the treatment effects would likely have been stronger in that week even if the report had arrived a week later. Failing to control for weather could cause us to falsely interpret such treatment effect fluctuations as immediate responses to report arrivals.

We therefore control for weather and estimate the treatment effects in "event time," meaning days before and after predicted report arrival. We include a vector of indicators $\nu_t$ for the periods around each individual report arrival. The interaction of $\nu_t$ with $T_i$ controls for the fact that the treatment effect could be weaker or stronger over the entire window around each particular report, due to seasonality or other factors. We index the 48-hour periods before and after report arrival by $a$ and define an indicator variable $A_t^a$ that takes value 1 if day $t$ is $a$ 48-hour periods after a report arrival date. Denote $\tau^a$ as the ATE for each period $a$. The omitted period is the day of and the day after predicted report arrival, so the $\tau^a$ coefficients represent the difference between the ATE for period $a$ and the ATE for those two days. The vector $M_{it}$ includes heating degrees and cooling degrees on day $t$ at the weather station closest to household $i$. The event time regression is:

$$Y_{it} = \nu_t T_i + \sum_a \tau^a A_t^a T_i + \delta_1 T_i M_{it} + \delta_2 M_{it} + \beta Y_i^b + \mu_t + \varepsilon_{it} \qquad (2)$$

Figure 3a plots the $\hat{\tau}^a$ coefficients and 90 percent confidence intervals for the monthly, quarterly, and bimonthly groups using data around each group's first four reports. Within each group, the effects follow remarkably similar patterns in event time. The treatment group decreases consumption by about 0.2 kWh/day in the several days around the predicted arrival date. Some reports arrive and are opened before the predicted date, which causes consumption to decrease before $a = 0$. The absolute value of the treatment effect reaches its maximum eight to ten days after the report arrives. After that point, the treatment effect decays, as the treatment group's conservation efforts diminish. This decay is difficult to observe for the monthly group, because before much decay happens, the next report arrives, causing event time to re-start. For the quarterly group, the treatment effect decays by 0.2 kWh/day between 10 days and 80 days after the report arrival.

Figure 3b is analogous to Figure 3a, except that it uses data for all reports beginning with the fifth report. The treatment effects are very close to constant in event time. Coefficients for the bimonthly group are very imprecisely estimated because there are only six reports delivered to this group, meaning that this graph is estimated off of the event windows around only two reports. The treatment group does appear to decrease consumption just after the arrival of these fifth and sixth reports.

3.2 Empirical Strategy

We now estimate formally the "action and backsliding" patterns suggested in the figures. First, we estimate the magnitude of the increases in the absolute value of the treatment effect around the report arrival window. Second, we estimate the rate of decay in the treatment effect between reports.

Define $S_t^0$ as an indicator variable for the "arrival period": the seven days beginning three days before and ending three days after the predicted arrival date. $S_t^1$ is an indicator for the seven-day period after that, which we call the "post-arrival period," and $S_t^{-1}$ is an indicator for the seven-day period before. Define $S_t^a = S_t^{-1} + S_t^0 + S_t^1$ as an indicator for all 21 days in that window. As above, $\nu_t$ is a vector of indicators for the window around each individual report, $M_{it}$ captures heating and cooling degrees near household $i$ on day $t$, $Y_i^b$ is the three seasonal baseline usage controls, and $\mu_t$ are day-specific dummies. The regression is:

$$Y_{it} = (\tau S_t^a + \tau^0 S_t^0 + \tau^1 S_t^1 + \nu_t) T_i + \delta_1 T_i M_{it} + \delta_2 M_{it} + \beta Y_i^b + \mu_t + \varepsilon_{it} \qquad (3)$$

The coefficient $\tau^1$ reflects the change in the treatment effect in period $S^1$ relative to period $S^{-1}$, while $\nu_t$ captures the treatment effect for days not included in the 21-day windows where $S_t^a = 1$.

To estimate the decay rate, we define an indicator variable $S_t^w$ to take value 1 if day $t$ is in a window beginning eight days after a predicted arrival date and ending four days before the earliest arrival of a subsequent report. The variable $d_t$ is the integer number of days past the beginning of that period, divided by 365. For example, for a $t$ that is 18 days after a predicted arrival date, $d_t$ takes value (18-8)/365. Thus, the coefficient on $d_t$, denoted $\theta$, measures the decay of the treatment effect over the window $S^w$ in units of kWh/day per year. The regression is:

$$Y_{it} = (\tau S_t^w + \theta d_t S_t^w + \nu_t) T_i + \delta_1 T_i M_{it} + \delta_2 M_{it} + \beta Y_i^b + \mu_t + \varepsilon_{it} \qquad (4)$$

In this model, treatment effects decay linearly over time. One might hypothesize that the decay process could be convex or concave, and it is almost certainly unrealistic to extrapolate beyond the time when the predicted treatment effect reaches zero. However, the time between reports is not long enough for the effects to return to zero or to precisely estimate non-linearities. We therefore use the linear model for simplicity.
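A minimal sketch of the decay regression in equation (4), under an assumed data layout (column names are ours) and with the report-window indicators and weather controls collapsed to a single treatment main effect for brevity, shows how $d_t$ and the between-report window $S^w$ enter the specification.

import pandas as pd
import statsmodels.formula.api as smf

def decay_rate(df):
    """Return (theta_hat, clustered standard error): the decay of the treatment
    effect, in kWh/day per year, over the between-report window S^w.
    Assumes columns: household_id, day, usage_kwh, treated (0/1), base_annual,
    base_summer, base_winter, and window_start (the date eight days after the
    most recent predicted report arrival; NaT for days outside S^w)."""
    df = df.copy()
    df["in_window"] = df["window_start"].notna().astype(int)               # S^w_t
    df["d_t"] = ((df["day"] - df["window_start"]).dt.days / 365.0).fillna(0.0)
    df["day_f"] = df["day"].astype(str)                                    # day dummies
    fit = smf.ols(
        "usage_kwh ~ treated + treated:in_window + treated:in_window:d_t"
        " + base_annual + base_summer + base_winter + C(day_f)",
        data=df,
    ).fit(cov_type="cluster", cov_kwds={"groups": df["household_id"]})
    return fit.params["treated:in_window:d_t"], fit.bse["treated:in_window:d_t"]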

3.3 Results

Analogously to Figures 3a and 3b, we run the above regressions separately for the initial set of four reports and all reports after that initial set. Table 3a presents the estimates of Equation (3) for the windows around the first four reports. There are three pairs of columns, for the monthly, quarterly, and bimonthly groups. Within each pair, the regression on the right includes the degree-day controls $\delta_1 T_i M_{it}$ and $\delta_2 M_{it}$ from the weather station nearest to household $i$, while the one on the left does not. In all cases where they are statistically significant, the $\hat{\delta}_2$ coefficients on heating and cooling degrees are positive, as less mild weather causes more electricity use, while the $\hat{\delta}_1$ coefficients are negative, as more counterfactual electricity use is associated with stronger (more negative) treatment effects. These weather controls would have even more explanatory power if the day-specific dummies $\mu_t$ and the report-specific treatment effects $\nu_t$ were excluded. Across the six regressions, the coefficient $\hat{\tau}^1$ on the $T_i S_t^1$ interaction ranges from -0.129 to -0.201 kWh/day. This means that between the week before the seven-day arrival windows and the week after those windows, the average household that receives a letter reduces consumption relative to counterfactual by the equivalent of about three 60-watt lightbulbs for one hour. The coefficients are very robust to the weather controls, implying that the treatment group's immediate conservation is caused by the report arrivals, not by the weather simultaneously becoming less mild.

These results are especially remarkable in that they highlight how much of consumers' responses to the intervention happen almost immediately after receiving reports. Consider first the monthly group. Multiplying the incremental post-arrival period effect $\hat{\tau}^1$ by four gives a total decrease of 0.72 kWh/day - the equivalent of turning off a standard 60-watt lightbulb for an additional 12 hours. This means that if the intervention's only effect were to generate immediate action in the post-arrival period, and if that immediate action were sustained over time, the treatment effect after the first four reports would be -0.72 kWh/day. However, the average treatment effect just before the fifth report (the monthly group's $\hat{\tau}^d$ estimated by Equation (1) for February 13, 2009) is -0.52. The reason for this discrepancy is that the treatment group's immediate action is not sustained over time - the effects decay in the intervening days between the seven-day post-arrival period $S^1$ and the arrival of the next report.

The story is the same for the quarterly group. Multiplying $\hat{\tau}^1$ by four gives a total decrease of 0.8 kWh/day. Thus, if these immediate actions were sustained, the treatment effect after the first four reports would be -0.8 kWh/day. In contrast, the ATE just before the fifth report is -0.35. This discrepancy is larger than for the monthly group because the quarterly group has three times longer to backslide on its immediate actions.

Table 4a formally measures this backsliding using Equation (4). The estimates of $\theta$ vary across the three groups, but the coefficients are positive in all regressions and statistically positive in all but one. The quarterly and bimonthly estimates are highly robust to the weather controls, but for the monthly group, the point estimates differ somewhat when weather is included. This is because relative to the quarterly and bimonthly groups, the monthly group has shorter event windows $S^w$ that can be used to estimate $\hat{\theta}$, and because the sample period is limited to the first four reports, there are fewer days that can be used to estimate the weather controls $\hat{\delta}$. The monthly estimates for the first four reports in both Tables 3a and 4a do not include cooling degrees because all days in this period between October 2008 and February 2009 have zero cooling degrees.

To put the magnitudes of $\hat{\theta}$ in context, focus on the estimates for the quarterly group, controlling for weather. A $\hat{\theta}$ of 0.7 means that a treatment effect of -0.7 kWh/day would decay to zero in one year, if the linear decay continued to hold. Thus, the jump in treatment effects of $\hat{\tau}^1 = -0.2$ from column (4) of Table 3a would decay away fully within just over three months. This never happens, because the next report arrives less than three months after the window $S^w$ begins.
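Concretely, the implied horizon for the post-arrival jump to decay away is

$$\frac{|\hat{\tau}^1|}{\hat{\theta}} \approx \frac{0.2\ \text{kWh/day}}{0.7\ \text{kWh/day per year}} \approx 0.29\ \text{years} \approx 3.4\ \text{months}.$$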

Tables 3b and 4b replicate Tables 3a and 4a for the remainder of the samples beginning with the fifth report. Table 3b shows that in the monthly and quarterly groups from the first wave, the $\hat{\tau}^1$ coefficients are still statistically significant, but they are only about one-fifth the magnitude of $\hat{\tau}^1$ for the initial four reports. The coefficient for the bimonthly group, however, is relatively large. Because this is estimated off of only the fifth and sixth reports, it is difficult to infer much of a pattern. It could be that there are unobserved moderators of the treatment effects that coincide with these reports, or that the information included in these particular reports was different in a particularly compelling way. The standard errors are wider, and the estimates are not statistically different from the estimates from the first four reports.

Table 4b shows that there is no statistically significant decay of the effects after the first four reports. Interestingly, all of the point estimates of $\theta$ are positive, suggesting that there may still be some decay, but the event windows are not long enough for precise estimates. This highlights the importance of the next section, in which we exploit the discontinuation of reports to estimate a decay rate over a much longer period: two years instead of two to ten weeks.

Appendix Tables A1a-b and A2a-b replicate Tables 3a-b and 4a-b with two sets of additional robustness checks. The left column of each pair includes more flexible functions of weather5 in place of the original $M_{it}$. The right column of each pair excludes outliers: all observations of $Y_{it}$ greater than 300 kWh/day and all households $i$ with average baseline usage greater than 150 kWh/day, which is five times the mean. Based on our inspection of the data, these high-usage observations appear to be correct, not measurement errors. However, they implicitly receive significant weight in the OLS estimation, so a small number of high-usage households could in theory drive the results. Neither of these alternative specifications meaningfully changes the results, and the exact point estimates change only slightly.

All households in all treatment groups receive reports around the same day of the month, typically between the 19th and the 25th. One might worry that our results could somehow be spuriously driven by underlying monthly patterns in the treatment effect. Of course, these underlying patterns would have to take a very specific form: they would need to generate cycles in treatment effects that begin in October 2008 and eventually attenuate for the monthly and quarterly groups, then appear beginning in February 2011 for second wave households but do not re-appear for the monthly and quarterly groups. We can explicitly test for spurious monthly patterns by exploiting the differences in report frequencies to generate placebo report arrivals. We focus on the monthly vs. quarterly frequencies, because they are randomly assigned, and consider only the period after the first four reports, because before that, the quarterly ATE changes significantly in the time between reports.

If there were spurious day-of-month effects, the quarterly group's treatment effects would jump in absolute value at the times when the monthly group receives reports but the quarterly group does not. Appendix Table A3 shows that the $\hat{\tau}^0$ and $\hat{\tau}^1$ coefficients for these placebo report arrival dates are statistically zero and economically small relative to those estimated in Tables 3a and 3b.

5 This function was based on inspection of the relationship between ATEs and degree days for this site. The relationship is close to linear in heating and cooling degree days, as assumed for the primary regressions, but not exactly. For this robustness check, $M_{it}$ contains six variables: $1(CDD_{it} > 0)$, $CDD_{it}$, $1(0 < HDD_{it} \le 5)$, $1(5 < HDD_{it} \le 35)$, $HDD_{it} \cdot 1(5 < HDD_{it} \le 35)$, and $1(HDD_{it} > 35)$.

4 Long-Run Analysis

In the long-run analysis, we ask two questions. First, do treatment effects grow as the intervention continues, or do people eventually attenuate? Second, do treatment effects persist after the intervention is discontinued?

4.1 Graphical

For the long-run analysis, we analyze the household-by-month billing data at each of the three sites.

We do not study the bimonthly treatment group from Site 2, because this "second wave" began much later, so long-run outcomes are not yet available.

We first plot the time path of the average treatment effects over the sample for both the continued and dropped groups, for each of the three sites. For this graph, we combine the monthly and quarterly groups. Analogously to the short-run graphical analysis, we use a three-month moving window to smooth over idiosyncratic variation. The variables $D_i$ and $E_i$ are indicator variables for whether household $i$ was assigned to the dropped group and the continued group, respectively. Both variables take value 0 if the household was assigned to the control group, meaning that $D_i + E_i = T_i$. The sets of coefficients $\tau_n^D$ and $\tau_n^E$ are the average treatment effects for the three-month window around month $n$ for the dropped and continued groups, respectively. The variable $m$ indexes calendar months. We include month-by-year controls for baseline usage, denoted $\beta_m Y_{im}^b$, where $Y_{im}^b$ is household $i$'s average usage in the same calendar month during the baseline period. The $\mu_m$ are month-by-year intercepts. For each month $n$, the regression is:

$$Y_{im} = \tau_n^D D_i + \tau_n^E E_i + \beta_m Y_{im}^b + \mu_m + \varepsilon_{im}, \qquad \forall m \in [n-1, n+1] \qquad (5)$$

In this regression, standard errors are clustered over time at the level of randomization, per Bertrand, Duflo, and Mullainathan (2004). We cluster by household in Sites 1 and 2, and by block batch in Site 3.

Figures 4a-4c present the results for Sites 1-3. The y-axis is the treatment effect, which is negative because the treatment causes a reduction in energy use. The three figures all illustrate the same basic story. To the left of the first vertical line, the intervention has not yet started, and the treatment effects are statistically zero. The effects grow fairly rapidly over the intervention's first year, after which the growth rate slows. Until the second vertical line, both the continued and dropped groups receive the same treatment, and the effects for the two groups are indistinguishable, as would be expected due to random assignment. The average treatment effects in the second year range from 0.7 to 1.0 kWh/day, or about three percent of total consumption. After the dropped group's last report, the effects begin to decay relative to what they had been during the intervention, but the effects are remarkably persistent. The dropped group's ATEs seem to diminish by about 0.1 to 0.15 kWh/day each year.

The effects are highly seasonal. In all three sites, effects are stronger in the winter compared to the adjacent fall and spring. Although the great majority of households in the populations primarily use natural gas instead of electricity for heat, the fans for natural gas heating systems use electricity, and many homes also have portable electric heaters. In Sites 1 and 3, the effects are also stronger in the summer compared to the fall and the spring. This suggests that an important way in which people respond to the treatment is to reduce heating and cooling energy, either through reducing utilization or perhaps changing to more energy efficient physical capital stock. In Site 2, the average daily temperature in July is a mild 67 degrees, so air conditioner use is more limited, and the treatment effects are relatively weak in the summer. In Site 3, the monthly point estimates jump around more because of the block batch-level randomization, but they do not move more than we would expect given the confidence intervals and underlying seasonality.

The graphs also illustrate that the continued groups do not attenuate to treatment, even after four years. In Sites 2 and 3, continued group households continue to receive reports at their original assigned monthly or quarterly frequencies, and their effects strengthen throughout the sample. In Site 1, however, the continued group's effects begin to diminish in 2011. This diminution is associated with the reduction in frequency to biannual reports from [tk] the monthly and quarterly mix of the first two years.

In Sites 1 and 2, households were randomly assigned to monthly and quarterly frequencies. Does the monthly group attenuate more quickly to the intervention? Does the more frequent contact induce more capital stock formation and thus more persistence? Figure 5a presents the ATEs over time for the monthly and quarterly groups in Site 2, with standard errors omitted for legibility. The solid line is the monthly group, while the dot-dashed line is the quarterly group. As treatment ends for the dropped group, a second line splits off for each frequency, representing the ATEs for the subset of households that were dropped.

Remarkably, even after more than four years of home energy reports, the monthly continued group's ATE continues to strengthen. Households receiving quarterly reports also continue to respond more intensely as long as treatment is continued. By January 2013, the ATEs for the continued monthly and quarterly groups are -1.22 and -1.08 kWh/day. The fact that the monthly ATE is much less than three times the quarterly ATE means that there are sharply diminishing marginal returns to treatment intensity in this range. The point estimates suggest that the two effects are gradually converging, implying that the marginal returns might actually be close to zero.

During the first two years after treatment is discontinued, the dropped groups at each frequency see their effects decay in similar proportion to the initial treatment effect. Although the standard errors on these point estimates are wide, the dropped monthly and quarterly group ATEs also may be converging. This again suggests sharply diminishing marginal returns to treatment intensity.

Figure 5b replicates the same analysis for Site 1. Here the results are even more stark: throughout the sample, households assigned to monthly vs. quarterly treatment frequencies have statistically indistinguishable effects.

4.2 Empirical Strategy

For the long-run analysis, we break the samples into four periods. Period 0 is the pre-treatment period, period 1 is the first year of treatment, and period 2 runs from the beginning of the second year to the time when treatment is discontinued for the dropped group. Period 3 is the post-drop period: the remainder of the sample after the dropped group is discontinued. We denote $P_m^p$ as indicator variables for whether month $m$ is in period $p$. The variable $r_m$ measures the time (in years) since the beginning of period 3. Analogous to the short-run analysis, $M_{im}$ represents two weather controls: average heating degrees and average cooling degrees for household $i$ in month $m$. The primary estimating equation is:

$$\begin{aligned}
Y_{im} = \; & (\tau^0 P_m^0 + \tau^1 P_m^1 + \tau^2 P_m^2) T_i \\
& + \alpha^D D_i P_m^3 + \alpha^E E_i P_m^3 \\
& + \omega^{LR} r_m D_i P_m^3 + \omega^{C} r_m E_i P_m^3 \\
& + \delta_1 M_{im}(P_m^2 + P_m^3) T_i + \delta_2 M_{im}(P_m^2 + P_m^3) \\
& + \beta_m Y_{im}^b + \mu_m + \varepsilon_{im}
\end{aligned} \qquad (6)$$

In the first line of this equation, the coefficients $\tau^0$, $\tau^1$, and $\tau^2$ are ATEs for the treatment groups (the continued and dropped groups combined) for the pre-treatment period and the two joint treatment periods. The second and third lines parameterize the treatment effects for the continued and dropped groups in the post-drop period. The coefficient $\omega^{LR}$ captures the treatment effect decay rate for the dropped group, while $\omega^{C}$ measures the change in the continued group treatment effect. Because $r_m$ has units in years, the units on $\omega^{LR}$ and $\omega^{C}$ are kWh/day per year. Because $r_m$ takes value zero at the beginning of the post-drop period, $\alpha^D$ and $\alpha^E$ reflect fitted treatment effects for the dropped and continued groups, respectively.

The fourth line includes controls for the interaction of $(P_m^2 + P_m^3) T_i$ with heating and cooling degrees $M_{im}$. This means that the $\tau^2$, $\alpha^D$, and $\alpha^E$ coefficients reflect fitted treatment effects for a month when the mean temperature each day is 65 degrees. Similarly, the $\omega^{LR}$ and $\omega^{C}$ coefficients reflect changes in effects after controlling for weather-related fluctuations. Controlling for weather is important because if temperatures were more (less) mild later in the post-drop period, this would likely make the treatment effects weaker (stronger), which would otherwise load onto $\omega^{LR}$ and $\omega^{C}$. In other words, when we estimate $\omega^{LR}$ and $\omega^{C}$ unconditionally, we measure all causes of persistence, while when we condition on changes in the economic environment such as weather, we more closely capture changes in underlying consumer behaviors.
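A minimal sketch of this specification, under an assumed data layout (column names are ours) and with the weather interactions in the fourth line omitted for brevity, shows how the period indicators, the post-drop time trend, and the clustering at the level of randomization fit together.

import pandas as pd
import statsmodels.formula.api as smf

def long_run_fit(df, cluster_col="household_id"):
    """Estimate a simplified version of equation (6).
    Assumes columns: usage_kwh, treated (T_i), dropped (D_i), continued (E_i),
    period (0-3), years_since_drop (r_m, zero before period 3), baseline_kwh
    (same-calendar-month baseline usage), month_year, and cluster_col
    (household_id in Sites 1 and 2, block_batch in Site 3)."""
    df = df.copy()
    for p in (0, 1, 2, 3):
        df[f"P{p}"] = (df["period"] == p).astype(int)
    formula = (
        "usage_kwh ~ treated:P0 + treated:P1 + treated:P2"
        " + dropped:P3 + continued:P3"
        " + years_since_drop:dropped:P3 + years_since_drop:continued:P3"
        " + baseline_kwh:C(month_year) + C(month_year)"
    )
    return smf.ols(formula, data=df).fit(
        cov_type="cluster", cov_kwds={"groups": df[cluster_col]}
    )

# The coefficient on years_since_drop:dropped:P3 plays the role of the
# dropped-group decay rate (omega^LR in the notation above).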

4.3 Statistical Results

Table 5 begins by simply estimating the average treatment effects for each period. The estimating equation includes only the first, second, and fifth lines of Equation (6), omitting the growth rate parameters and weather controls. The estimates closely map to the illustrations in Figures 4a-4c. Treatment effects are statistically zero in the pre-treatment period, rising to 0.45 to 0.64 kWh/day in the first year and 0.66 to 0.87 kWh/day for the remainder of the joint treatment period. In tangible terms, a treatment effect of -0.66 kWh/day means that the average treatment group household took actions equivalent to turning off a standard 60-watt lightbulb for about 11 hours each day. Recalling that average usage is around 30 kWh/day, this corresponds to two percent of electricity use.

Table 5 demonstrates basic evidence of persistence: in all three sites, treatment e¤ects for the dropped group are at least somewhat weaker in the post-drop period compared to the second year.

This also provides basic evidence that people do not attenuate to the intervention: treatment effects for the continued group are at least somewhat stronger in the post-drop period than before.

Table 6 presents estimates of the δ^LR and β parameters. For each site, the left column presents estimates without conditioning on weather, corresponding to Equation (6) with the fourth line omitted. The right column presents estimates conditional on weather, corresponding exactly to Equation (6). The δ^LR and β coefficients are the bottom two coefficients in each column. As in the short-run analysis, the coefficients are highly insensitive to the weather controls.

In Sites 2 and 3, people do not attenuate to the treatment during the post-drop period: the negative β coefficient means that the treatment effect is getting stronger (more negative) as time moves into the future. The β parameter is not quite statistically significant for Site 2 (p=0.127), although the point estimate is negative and Figure 4b seems to illustrate that the continued group's treatment effect becomes more negative through 2011, 2012, and 2013. In Site 3, β = -0.08 implies that every year, the average continued group household does the equivalent of turning off one additional 60-watt lightbulb for another 80 minutes each day. In this site, the treatment effect is becoming about 10 percent stronger each year. This result is quite remarkable: even as the quarterly group receives their 20th home energy report, and the monthly group receives their 60th, consumers have not attenuated to the intervention.
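The Site 3 conversion works the same way; a brief check of the figures quoted above:

beta_kwh_per_day_per_year = 0.08   # annual strengthening of the continued group's effect, Site 3
bulb_kw = 0.060                    # a standard 60-watt lightbulb
print(60 * beta_kwh_per_day_per_year / bulb_kw)   # about 80 additional minutes of avoided bulb use per day
print(100 * beta_kwh_per_day_per_year / 0.865)    # about 9-10 percent of the 0.865 kWh/day second-year ATE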

In Site 1, the β coefficient is positive. This matches the graphical result in Figure 4a. As we have discussed, this is associated with the decrease in treatment intensity. Of course, a decrease in efficacy does not necessarily imply a decrease in cost effectiveness. In Section 6, we analyze this issue in more detail.

The δ^LR parameters are our primary measure of persistence. They range from 0.09 kWh/day per year in Site 3 to 0.18 kWh/day per year in Site 1. If the linear trend continues, the effects would not return to zero until five to ten years after treatment was discontinued. Based on Figures 4a to 4c, we hypothesize that the decay rate will tail off over time, meaning that the effects will be more persistent than the linear decay model predicts, but not enough time has passed to meaningfully test this formally. To the extent that the linear model understates persistence, our cost effectiveness projections in the next section are conservative.
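Under the linear model, a rough implied time for the dropped group's effect to reach zero is the post-drop effect level divided by the decay rate. The sketch below uses the base-specification point estimates from Table 6; the exact figure depends on which effect level and specification one uses.

# (post-drop level, decay rate) in kWh/day and kWh/day per year, dropped group, Table 6 base columns
for site, level, decay in [("Site 1", 0.786, 0.176),
                           ("Site 2", 0.718, 0.114),
                           ("Site 3", 0.725, 0.086)]:
    print(site, round(level / decay, 1), "years")   # roughly 4.5, 6.3, and 8.4 years, respectively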

One of the basic questions we asked in the introduction was whether longer interventions lead to more persistent post-intervention effects. We can answer this for Site 2 by comparing the estimated long-run decay rate δ^LR to the decay rate between each of the first four reports, estimated in the previous section. In five of the six specifications, that short-run decay rate was between 0.7 and 1.4 kWh/day per year, which is six to 12 times faster than δ^LR. This implies that between the first four reports and the time when treatment is discontinued, the dropped group forms some kind of "capital stock" which causes substantially more persistence. In the next two sections, we discuss the potential causes and consequences of this process.6
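The six-to-twelve-fold comparison is a simple ratio of the two decay rates; for Site 2:

short_run_decay_range = [0.7, 1.4]   # kWh/day per year, between the first four reports (Table 4a)
long_run_decay = 0.114               # kWh/day per year, dropped group post-drop (Table 6, column (3))
print([round(d / long_run_decay, 1) for d in short_run_decay_range])   # roughly [6.1, 12.3]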

6 We note that the long-run persistence is measured from one to four years after the period when the short-run decay rate is measured. It is possible that changes in macroeconomic conditions or other time-varying factors might cause differences in these decay rates. Ultimately, however, we are not very concerned with this issue.

In Figures 5a and 5b, we saw that the monthly frequency generates at best only slightly larger effects than quarterly, but it was not clear whether the effects decay at different rates. Relatedly, Allcott (2011) documents that the program causes more conservation by heavier baseline users. Appendix Table A4 formally confirms these results for Sites 1 and 2, but the standard errors are too large to infer much about whether effects decay proportionally faster or slower for the different frequency groups or for heavier baseline users.

The online appendix also presents several robustness checks. Appendix Table A5 estimates the differences in ATEs between continued and dropped groups for all periods. In all three sites, the effects do not differ statistically in the pre-treatment and joint treatment periods, and the point estimates are almost exactly the same. Tables A6 and A7 replicate Tables 5 and 6, except excluding all data for households that move at any point. These checks are important because by the end of the sample, 20 to 26 percent of households have moved. Even though moving is balanced between treatment and control, if the movers had systematically different treatment effects, this could cause the estimated treatment effects to change over time, which would in turn bias our inference about persistence. However, the coefficients are statistically and economically the same when limiting the sample to this balanced panel.

5 Mechanisms

Concretely, what actions underlie the observed effects? In many behavioral interventions, this question can be difficult to answer. For example, if a program incentivizes people to lose weight, it may not be clear how much of the observed weight loss comes from exercise vs. reduced calorie intake, and, within these categories, what form of exercise is most useful and what foods have been cut from participants' diets. In our setting, we can use utility program participation data and households' self-reported actions to shed some light on two questions. First, what do treatment group households report that they are doing differently than control group households? Second, how much of the effect comes from large observed changes to the physical capital stock?

5.1 Surveys of Self-Reported Actions

In the past two years, Opower has surveyed about six thousand people in treatment and control groups in six sites nationwide, including 800 people in Site 2. The surveys are conducted via telephone on behalf of the utility, and for practical reasons no effort is made to obscure the fact that the surveys are about energy use and conservation behaviors. Completion rates are typically between 15 and 25 percent of attempts. Because these are self-reported actions and the majority of people contacted decline to respond, the results should be interpreted with caution.

In these surveys, respondents are first asked if they have taken any steps to reduce energy use in the past 12 months. Those who report that they have are then asked whether or not they have taken a series of specific actions in the past twelve months. The particular actions vary by survey, but many of the same actions are queried in multiple sites. We group actions into three categories: repeated actions, such as switching off power strips and turning computers off at night; physical capital changes, such as purchasing Energy Star appliances; and intermittent actions, such as replacing air filters on air conditioning or heating systems.

Table 7 presents the survey data. Column (1) presents the share of all respondents that report taking the action in the past 12 months. Column (2) presents the difference between the shares of respondents in treatment vs. control that report taking the action. As a robustness check, column (3) presents the difference in shares in treatment vs. control after controlling for five respondent characteristics: gender, age, whether homeowner or renter, education, and annual income.

There is little difference in self-reported actions across treatment and control groups. The only difference that is consistent across different surveys is that treatment group households are more likely to report having had a home energy audit. Audits often include installation of new compact fluorescent lightbulbs, which typically last several years until they must be replaced, and are also typically required before moving forward with larger investments such as weather sealing or new insulation. Data from one utility in a relatively warm climate show that its treatment group is more likely to report using fans to keep cool instead of running air conditioners. Across all sites, the treatment group is more likely to report participating in utility energy efficiency programs, although the difference is not statistically significant in the specification that conditions on observables.

For each of the three categories, the first row (in bold) presents a test of whether the average probability of taking actions in each category differs between treatment and control. When aggregating in this way across actions and across sites, our standard errors are small enough to rule out with 90 percent confidence that the intervention increases the probability of reporting energy conservation actions by more than one to two percent. Throughout Table 7, this lack of statistical significance would only be further reinforced by adjusting the p-values for multiple hypothesis testing.
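The bound is computed in the usual way: the upper limit of a two-sided 90 percent confidence interval is the point estimate plus 1.645 standard errors. A sketch with purely hypothetical numbers (the actual estimates and standard errors are those reported in Table 7):

from scipy.stats import norm

diff = 0.005   # HYPOTHETICAL treatment-control difference in the share reporting an action (0.5 pp)
se = 0.006     # HYPOTHETICAL standard error of that difference
upper = diff + norm.ppf(0.95) * se
print(round(100 * upper, 1))   # upper bound of the 90 percent CI, in percentage points (about 1.5 here)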

There are multiple interpretations of these results. First, the true probabilities of taking actions might differ between treatment and control, but the surveys might not pick this up due to demand effects, respondents' general tendency to systematically over-report, biased non-response, and the fact that different respondents might interpret questions differently. Second, the treatment could cause small changes in the true probability of taking a wide variety of actions. Such changes could potentially add up to the observed effects on electricity use even though no one action accounts for much on its own. Third, it is possible that the intervention does not change whether or not people take specific actions to conserve energy, which is what Table 7 reports, but instead changes the intensity with which some people take actions they were already taking. In other words, an important impact of the intervention could be not to give information about new ways to conserve, but instead to increase attention and motivation to conserve in more of the same ways.

5.2 Utility Energy Efficiency Program Participation

Perhaps the only area where the survey data suggest differences between treatment and control is participation in utility-sponsored energy efficiency programs. Like many utilities across the country, the utilities we study run a series of other energy conservation programs that subsidize or directly install energy efficient capital stock. The utilities maintain household-level program participation data, which are primarily used to estimate the total amount of energy conserved through each program. These household-level data are also useful for estimating whether the Opower intervention affects energy use through an increase in program participation. For this study, we examine the administrative data from Sites 2 and 3.

In Site 3, the utility offers rebates or low-interest loans in three categories: appliances, "home improvement," and HVAC (heating, ventilation, and air conditioning). For appliances, the utility mails $50 to $75 rebate checks to consumers who purchase energy efficient clothes washers, dishwashers, or refrigerators. To claim the rebate, the homeowner needs to fill out a one-page rebate form and mail it with the purchase receipt and a current utility bill to the utility within 30 days of purchase. For "home improvement," the utility offers up to $5,000 in rebates for households that install better insulation or otherwise retrofit their homes in particular ways. For HVAC, the utility offers $400 to $2,000 for energy efficient central air conditioning systems or heat pumps, or $50 for energy efficient window air conditioners. Most home improvement and HVAC jobs are done by contractors. Some consumers probably buy energy efficient appliances and window air conditioners without claiming the utility rebates, and thus these capital stock changes might be unobserved in the data. However, because the home improvement and HVAC rebates are larger, and because the contractors coordinate with the utility and facilitate the rebate process, consumers who undertake these large physical capital improvements are very likely to claim the rebates and thus be observed in the administrative data.

The top panel of Table 8 presents Site 3's program participation statistics for the first two years of the program, from April 2008 through June 2010. In total, 3,855 households in the experimental population participated in one of the three programs, a relatively high participation rate compared to other utilities. In column (1), we list relatively optimistic estimates of the savings that might accrue to the average participant. Column (3) presents the difference in participation rates between treatment and control, in units of percentage points ranging from 0 to 100. The results confirm the qualitative conclusions from the household surveys: the treatment group is 0.417 percentage points more likely to participate in energy conservation programs. Out of every 1,000 households in control, about 44 participated, while about 48 out of every 1,000 treatment group households participated.

How much of the treatment effect does this explain? Table 5 shows that by the program's second year, it is causing an 865 Watt-hour/day (0.865 kilowatt-hour/day) reduction in energy use. Column (4) multiplies the difference in participation rates by the optimistic estimate of savings, showing that the difference in program participation might cause energy use to decrease by 14 Watt-hours per day. Thus, while there are statistically significant changes in program participation, this explains less than two percent of the treatment effect.
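The 14 Watt-hour figure is the product of the participation difference and the per-participant savings estimate. A sketch of that arithmetic, in which the per-participant figure is an illustrative round number standing in for the column (1) estimate:

participation_diff = 0.417 / 100     # treatment-control difference in participation (0.417 percentage points)
savings_per_participant_wh = 3400    # Wh/day per participant; ILLUSTRATIVE stand-in for column (1) of Table 8
implied_wh_per_day = participation_diff * savings_per_participant_wh
print(round(implied_wh_per_day, 1))                 # about 14 Wh/day
print(round(100 * implied_wh_per_day / 865, 1))     # about 1.6 percent of the 865 Wh/day second-year effect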

The short-run and long-run treatment effects suggest a shift in the composition of responses over time toward capital stock changes. For this reason, it is also important to examine program participation data from later in the program. We therefore also examine similar administrative data for 2011 in Site 2. This utility offers a similar set of programs as in Site 3, except that the exact rebate amounts may vary, and some rebate forms can be submitted online instead of by mail. In these data, we observe more precisely the action that the consumer took, as well as the utility's estimate of the electricity savings.

The bottom panel of Table 8 presents these data. The most popular programs are clothes washer rebates, insulation, removal of old energy-inefficient refrigerators and freezers, installation of low-flow showerheads, energy efficient windows, and compact fluorescent lightbulbs (CFLs). Savings are zero for insulation and windows because, for regulatory purposes, the utility deems that these programs reduce natural gas use but not electricity use.

Columns (3) and (4) compare the takeup rates and implied electricity savings between the control group and the continued treatment group, which is still receiving home energy reports during 2011. There is a statistically significant difference for only one program: CFL replacement, which generates 2.25 Watt-hours/day of incremental savings in the continued treatment group. Using the estimates in the bottom row, which combine the savings across all programs, the upper bound of the 90 percent confidence interval on savings is about 6 Watt-hours/day. By contrast, the continued group's treatment effect in the post-drop period was (negative) 870 Watt-hours/day (0.870 kilowatt-hours/day), which was an increment of 181 Watt-hours/day compared to the year before. Thus, as in Site 3, only a small fraction of the savings is due to participation in utility energy efficiency programs.7

5.3 Summary: The Dynamics of Household Responses

Taken together, our analyses of short-run and long-run treatment effects, along with information on program participation and self-reported actions, start to paint a picture of how consumers respond to the Opower intervention. As the initial reports arrive, some consumers are immediately motivated to conserve. They take actions that are feasible within a short period of time, probably changing utilization choices by adjusting thermostats, turning off lights, and unplugging unused electronics. However, behaviors soon begin to "backslide" toward their pre-intervention levels. In this intervention, subsequent reports arrive before behaviors fully return to their pre-intervention state, and these additional reports repeatedly induce at least some of the same households to conserve. This process of action and backsliding is a repeated version of the phenomenon documented by Agarwal et al. (2011) and Haselhuhn et al. (2012), who show that consumers learn to avoid credit card and movie rental late fees after incurring a fee, but act as if this learning depreciates over time.

7 Several recent consulting reports and official program evaluations have also examined the intervention's effect on utility program participation at these sites and others, including Integral Analytics (2012), KEMA (2012), Opinion Dynamics (2012), and Perry and Woehleke (2013). Their findings are very similar to ours: the Opower intervention sometimes causes increases in program participation, but this accounts for only a small fraction of the overall reduction in energy use.

Several mechanisms might cause this repeated action and backsliding. Repeated learning and forgetting could generate this pattern, although it seems unlikely that people would forget this information at this rate. Alternatively, it could be that consumers learn that energy conservation actions are more difficult or save less energy than they had thought. However, to generate repeated cycles in energy use, consumers would have to be experimenting with different energy conservation actions after receiving each new report, and it is not clear whether this is the case. A third mechanism is what we call malleable attention: reports immediately draw attention to energy conservation, but attention gradually returns to its baseline allocation.

After the first few reports, this repeated action and backsliding attenuates, and decay of treatment effects can only be observed over a longer period after reports are discontinued for part of the treatment group. This means that in the intervening one to two years between the initial reports and the time when the reports are discontinued, consumers form some type of new "capital stock" that decays at a much slower rate. The program participation data show that very little of this capital stock consists of large changes to physical capital such as insulation or home energy retrofits. However, consumers may make a wide variety of smaller changes to the physical capital stock, such as installing energy efficient compact fluorescent lightbulbs or window air conditioners.

Much of this capital stock may also reflect changes to consumers' utilization habits, which Becker and Murphy (1988) call "consumption capital." This stock of past conservation behaviors lowers the future marginal cost of conservation, because the behavior has become automatic and can be carried out with little mental attention (Oullette and Wood 1998, Schneider and Shiffrin 1977, Shiffrin and Schneider 1977). However, just as in the Becker and Murphy (1988) model and most models of capital stock, consumption capital also decays. This story is consistent with the results of Charness and Gneezy (2009), who show that financial incentives to exercise have some long-run effects after the incentives are removed, suggesting that they induced people to form new habits of going to the gym.

6 Cost Effectiveness and Program Design

In this section, we assess the importance of persistence for cost effectiveness and for program design.

We use a simple measure of cost effectiveness: the cost to produce and mail reports divided by the kilowatt-hours of electricity conserved. Although cost effectiveness is a common metric by which interventions are assessed, we emphasize several of the reasons why this is not the same as a welfare evaluation. First, consumers might experience additional unobserved costs and benefits from the intervention: they may spend money to buy more energy efficient appliances or spend time turning off the lights, and they might be more or less happy after learning how their energy use compares to their neighbors'. Second, the treatment causes households to reduce natural gas use along with electricity; we have left this for a separate analysis. Third, this measure does not take into account the fact that electricity has different social costs depending on the time of day when it is consumed.

Of course, this distinction between the observed outcome and welfare is not unique to this domain: with the exception of DellaVigna, Malmendier, and List (2012), most studies of weight loss, smoking, charitable contributions, and other behaviors are only able to estimate effects on behaviors, not on welfare. In our setting, however, the focus on cost effectiveness is still extremely relevant: rightly or wrongly, utilities are mandated to cause cost-effective energy conservation, with little explicit assessment of the consumer welfare effects.

We assume that the cost per report is $1 and that there are no fixed costs of program implementation. In our first analysis below, we study the "extensive margin" policy decision: how different assumptions about persistence can affect cost effectiveness, which can in turn affect a policymaker's decision of whether or not to adopt an intervention. In our second analysis, we show how the dynamics of persistence can inform "intensive margin" policy decisions: how to optimize program design to maximize cost effectiveness.

6.1 Persistence Matters for Cost Effectiveness

When assessing the cost effectiveness of Opower home energy reports and other "behavioral" energy conservation programs, utilities have typically assumed zero persistence, either implicitly or explicitly. For example, when calculating cost effectiveness for regulatory purposes, utility analysts have typically tallied energy conserved and program costs incurred in a given year, without accounting for the fact that the reports delivered during that year also cause incremental energy conservation in future years. This is a sharp contrast to how utilities evaluate traditional energy conservation programs that directly replace physical capital such as air conditioners or lightbulbs. For these programs, utilities include all future savings from capital stock changes in a given year using estimated "measure lives." The reason for this difference is that until now, there was significant concern that "behavioral" interventions like the home energy reports would not generate persistent savings. When evaluating interventions still in progress, academic studies such as Ayres, Raseman, and Shih (2012) and our own past work (Allcott 2011) have similarly calculated cost effectiveness by considering only the costs accrued and energy savings up to a given date.

How important are these assumptions? Table 9 presents electricity savings and cost effectiveness for the two-year intervention in Site 2, under alternative assumptions about long-run persistence after treatment is discontinued. The table uses Section 4's empirical estimates for the dropped group and provides a stylized example of the calculations that a policymaker might go through in order to decide whether to adopt a similar two-year intervention. To keep the results transparent and avoid extrapolating out of sample, we assume no time discounting and limit the time horizon to the observed 4.5 years after treatment began.

Scenario 1 in Table 9 assumes zero persistence: after the treatment stops, the effects end immediately. This parallels the most common assumption. Scenario 2 makes the opposite extreme assumption: that the full effect will persist over the remaining 30 months of the sample. Scenario 3 uses the estimated actual persistence.

The electricity savings estimates are simply the average treatment effects for each period estimated in column (2) of Table 5, multiplied by the length of each period. For example, the 405 kWh of conservation during the first two years of treatment is (τ^1 = 0.450 kWh/day) x 365 days + (τ^2 = 0.659 kWh/day) x 365 days. Under the full persistence assumption, post-treatment savings equal the ATE from the second year of treatment (τ^2 = 0.659 kWh/day) multiplied by the 910 days in the observed post-treatment period. Post-treatment savings under observed persistence are (0.584 kWh/day, the dropped group's post-drop ATE from Table 5) x (910 days). Total electricity cost savings for the two-year intervention would sum to $6.76 million if applied to all households in the experimental population. Standard errors are calculated using the Delta method.

Under the zero persistence assumption, dividing a total cost of $18 per household by the 405 kWh of observed savings gives cost effectiveness of 4.44 cents/kWh. By contrast, the observed post-treatment savings show that the true cost effectiveness to date is 1.92 cents/kWh. This underscores the importance of these empirical results: the program is more than twice as cost effective as policymakers had often previously assumed.
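The three scenarios can be reproduced from the point estimates quoted above (Site 2, no discounting, $1 per report and roughly nine reports per year over two years); a minimal sketch:

cost_per_household = 18.0                  # dollars over the two-year intervention
savings_2yr = 0.450 * 365 + 0.659 * 365    # about 405 kWh during the two treatment years
scenarios = {
    "zero persistence": 0.0,
    "full persistence": 0.659 * 910,       # second-year ATE held constant for the 910 post-treatment days
    "observed persistence": 0.584 * 910,   # dropped group's post-drop ATE over the same 910 days
}
for label, post_kwh in scenarios.items():
    cents_per_kwh = 100 * cost_per_household / (savings_2yr + post_kwh)
    print(label, round(cents_per_kwh, 2))  # about 4.44, 1.79, and 1.92 cents/kWh, respectively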

Appendix Tables A9 and A10 re-create this table for Sites 1 and 3, respectively. These two sites have slightly larger treatment effects and are thus slightly more cost effective, but because the decay rates are similar across programs, the relative cost effectiveness under the three assumptions is about the same: cost effectiveness is about twice as good under observed persistence compared to the zero persistence assumption.

Extrapolating further into the future would magnify the importance of long-run persistence assumptions, while time discounting would diminish their importance. Appendix Table A10 re-creates Table 9 under a ten-year time horizon, extrapolating future treatment effects in Scenario 3 using the estimated linear decay parameter δ^LR through the point when the effects are projected to hit zero. Total savings are about 20 percent larger than in Scenario 3 of Table 9, and cost effectiveness is thus about 20 percent better.

In places comparable to Site 2, assuming the observed persistence instead of zero persistence could impact program adoption decisions. Opower competes against other energy conservation programs, and cost effectiveness is one of the most important metrics for comparison. There are some benchmark numbers available, although they are controversial (Allcott and Greenstone 2012).

Using nationwide data, Arimura et al. (2011) estimate cost effectiveness to be about 5.0 cents/kWh, assuming a discount rate of five percent. A research and advocacy group called the American Council for an Energy Efficient Economy (ACEEE) estimates that in 14 states with aggressive energy conservation programs, the states' cost effectiveness estimates ranged from 1.6 to 3.3 cents per kilowatt-hour (Friedrich et al. 2009). Under the conservative zero persistence assumption, the two-year program in Site 2 is competitive with Arimura et al.'s estimates, but not with ACEEE's. This suggests that at least for some utilities, alternative energy conservation programs might be preferred. Allowing for the observed persistence, however, the two-year program in Site 2 is about as good as the most optimistic estimates from the literature. This example suggests that persistence results like these could make a difference in policymakers' "extensive margin" decisions about whether or not to adopt the intervention.

6.2 Persistence Matters for Program Design

Table 10 compares four different program designs with increasing amounts of intervention. The objective of the table is to show the marginal costs and benefits of extending a behavioral intervention, and how these costs and benefits depend on persistence. The first program design is a one-shot intervention: one report, with no future contact. The second design exactly replicates the dropped group from Site 2: a two-year intervention with the treatment group split between monthly and quarterly frequencies, for an average of nine reports per year. The third design begins with the same two-year intervention but adds biannual reports for two additional years. This design mimics the first four years of the intervention received by the continued group in Site 1. The fourth design comprises exactly four years of monthly and quarterly reports, which replicates the first four years for the continued group in Site 2.

To estimate electricity savings for each design, we use the estimated treatment effects and decay rates for Site 2 from Tables 5 and 6.8 Because we are now considering longer interventions than in Table 9, we consider the full horizon of treatment effects until the predicted savings decay to zero. All dollar costs and electricity savings are now discounted to the beginning of the program at a five percent discount rate.

8 For the first design, we assume that there is an initial effect of 0.30 kWh/day which decays linearly at 0.70 kWh/day per year, consistent with estimates for the quarterly group from Table 4. As a result, the effect is assumed to have fully disappeared in just under six months. For the second design, we use the treatment effects from column (2) of Table 5. For the third design, we assume that the treatment effect for the first two years is the same as in the second design, and we assume that it increases in absolute value by 0.063 kWh/day between the second year and the third and fourth years, as estimated in column (1) of Table 5. For the fourth design, we use the estimated treatment effects for the continued group in Site 2, again from column (2) of Table 5. For the two- and four-year interventions, we use the long-run decay rate δ^LR estimated in Table 6. With additional empirical experimentation, the decay assumptions might be refined, and one might hypothesize that the decay rate is slower after four years than after two.

The concepts of additionality and persistence have analogues when analyzing the effects of continuing or intensifying an intervention. The additionality effect comprises the incremental savings while holding the post-intervention decay rate constant. The persistence effect comprises the incremental savings caused by the change in the post-intervention decay rate.

Figure 6 illustrates these two effects in the context of lengthening the intervention from one report to two years. The one-report intervention causes the average household to conserve 22 kilowatt-hours. The additionality effect of 466 kWh accrues both because the ATE during the intervention grows and, mechanically, because the treatment period lasts longer. The continued intervention also causes consumers to change the composition of their responses toward "capital stock" changes, which slows the estimated decay rate. This results in a persistence effect of 395 kWh.

The bottom panel of Table 10 quantifies the incremental effects of more intervention through these two channels. The largest improvement in cost effectiveness occurs as the intervention is extended from one report to two years, as was illustrated in Figure 6. After two years, we assume that the decay rate holds constant at the δ^LR estimated for Site 2.

Although the largest improvement in average cost effectiveness occurs when the intervention is extended from one report to two years, this does not mean that the marginal continuation of treatment is not cost effective. In fact, adding biannual reports in the third and fourth years generates 512 kWh in present discounted incremental savings at $3.60 in present discounted incremental cost, for an incremental cost effectiveness of 0.70 cents/kWh. However, the results for design 4 suggest that increasing the treatment intensity in the third and fourth years from biannual to the monthly-quarterly mix has somewhat worse incremental cost effectiveness of 3.58 cents/kWh.9
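The incremental figure for design 3 is just the ratio of incremental cost to incremental savings; as a check:

incremental_cost_dollars = 3.60    # present discounted cost of the additional biannual reports
incremental_savings_kwh = 512.0    # present discounted incremental electricity savings
print(round(100 * incremental_cost_dollars / incremental_savings_kwh, 2))   # about 0.70 cents/kWh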

Despite the fact that we exploit extensive data and several different experiments, there are several reasons why one should draw only limited conclusions from these calculations. First, we have had to extrapolate out of sample, both to compare different program designs that in fact occurred in different sites and to forecast future persistence. Second, our cost effectiveness estimates are certainly site-specific, and, as we discuss above, cost effectiveness is not a welfare measure. Third, there are many other potential program designs that we could have evaluated, and many margins on which a program could be varied. For example, aside from varying the duration and intensity of any given intervention, it would be useful to experiment with the content. In this context, marketing weatherization programs or providing more tips about energy efficient appliances in the reports might induce some households to make long-lasting changes to physical capital stock.

9 We calculate that relative to design 2, design 4 has incremental cost effectiveness of 1.87 cents/kWh. These incremental savings are visible graphically in Figure 4b, as the treatment effects for the continued group continue to strengthen relative to the dropped group in the post-drop period.

In the context of exercise, Royer, Stehr, and Snydor (2013) show that combining incentives with commitment contracts causes more persistent changes in gym attendance than incentives alone.

These issues notwithstanding, these results have a clear implication for program design: the persistence effect - inducing consumers to change their "capital stock" - can be a significant driver of improved long-run cost effectiveness. In this context, it appears that there is a strong persistence effect during the first two years of treatment. After that point, incremental treatment can be cost effective because it generates an additionality effect, but reducing treatment frequency improves incremental cost effectiveness. In general, a crucial ingredient in designing behavioral interventions in many different contexts is to understand how long the intervention needs to be continued until the persistence effect has been exploited.

7 Conclusion

In this paper, we study part of a large and policy-relevant set of randomized controlled trials designed to induce people to conserve energy. Aside from the specific relevance of our results to policymakers and economists studying energy markets, we believe that there are four broader takeaways for behavioral scientists. First, our data clearly illustrate how people can cycle through action and backsliding in response to repeated interventions. This suggests that attention is malleable, not static: an intervention can draw attention to one set of repeated behaviors, but that attention gradually returns to its baseline allocation. Second, our empirical results document how repeated intervention can eventually cause people to change the composition of their responses, which generates more persistent changes in outcomes. These persistent effects might result from habitual behavior change, or they may result from changes in physical capital or other technologies that change outcomes without additional action. As Charness and Gneezy (2009) and others have shown, the same effect translates to other contexts: for example, a one-time encouragement to lose weight might cause people to diet for a week, while a longer-term intervention is more likely to eventually encourage people to find a workout partner and habitually go to the gym.

Third, our cost effectiveness analysis shows how long-run persistence can materially change whether or not a program is cost effective. This suggests that for other types of interventions where long-run results such as these are not available, it may sometimes be worthwhile to delay decisions about scaling up a program until long-run effects can be measured. Fourth, we conceptually distinguish and quantitatively measure two different reasons for ongoing intervention: the additionality and persistence effects. In our example, we show that the greatest improvement in cost effectiveness happens as the intervention is continued long enough for the persistence effect to take hold. This suggests that an important part of the future research agenda on behavioral interventions is to more precisely identify when and how people form a new "capital stock" that causes persistent changes in outcomes.

References

[1] Abrahamse, Wokje, Linda Steg, Charles Vlek, and Talib Rothengatter (2005). “A Review of Intervention Studies Aimed at Household Energy Conservation.”Journal of Environmental Psychology, Vol. 25, No. 3 (September), pages 273-291. [2] Agarwal, Sumit, John Driscoll, Xavier Gabaix, and (2011). "Learning in the Credit Card Market." Working Paper, Harvard University (April). [3] Allcott, Hunt (2011). "Social Norms and Energy Conservation." Journal of Public Economics, Vol. 95, No. 9-10 (October), pages 1082-1095. [4] Allcott, Hunt and Michael Greenstone (2012). "Is There an Energy E¢ ciency Gap?" Journal of Eco- nomic Perspectives, Vol. 26, No. 1 (Winter), pages 3-28. [5] Allcott, Hunt, Sendhil Mullainathan, and Dmitry Taubinsky (2012). "Energy Policy with Externalities and Internalities." Working Paper, New York University (September). [6] Allcott, Hunt, and Sendhil Mullainathan (2010a). "Behavior and Energy Policy." Science, Vol. 327, No. 5970 (March 5th). [7] Allcott, Hunt, and Sendhil Mullainathan (2010b). "Behavioral Science and Energy Policy." Working Paper, New York University (February). [8] Anderson, LA, T. Quinn, Karen Glanz, G Ramirez, LC Kahwati, DB Johnson, LR Buchanan, WR Archer, S Chattopadhyay, GP Kalra, DL Katz, and the Task Force on Community Preventive Services (2009). "The E¤ectiveness of Worksite Nutrition and Physical Activity Interventions for Controlling Employee Overweight and Obesity: A Systematic Review." American Journal of Preventive Medicine, Vol. 37, No. 4 (October), pages 340-357. [9] Argote, Linda, Sara Beckman, and Dennis Epple (1990). "The Persistence and Transfer of Learning in Industrial Settings." Management Science, Vol. 36, No. 2 (February), pages 140-154. [10] Arimura, Toshi, Shanjun Li, Richard Newell, and Karen Palmer (2011). "Cost-E¤ectiveness of Electricity Energy E¢ ciency Programs." Resources for the Future Discussion Paper 09-48 (May). [11] Ashby, Kira, Hilary Forster, Bruce Ceniceros, Bobbi Wilhelm, Kim Friebel, Rachel Henschel, and Sha- hana Samiullah (2012). "Green with Envy: Neighbor Comparisons and Social Norms in Five Home En- ergy Report Programs." http://www.aceee.org/…les/proceedings/2012/data/papers/0193-000218.pdf [12] Ayres, Ian, Sophie Raseman, and Alice Shih (2012). "Evidence from Two Large Field Experiments that Peer Comparison Feedback Can Reduce Residential Energy Usage." Journal of Law, Economics, and Organization, August 20. [13] Becker, Gary, and Kevin Murphy (1988). "A Theory of Rational Addiction." Journal of Political Econ- omy, Vol. 96, No. 4 (August), pages 675-700. [14] Benkard, Lanier (2000). "Learning and Forgetting: The Dynamics of Aircraft Production." American Economic Review, Vol. 90, No. 4 (September), pages 1034-1054. [15] Bertrand, Marianne, Esther Du‡o, and Sendhil Mullainathan (2004). "How Much Should We Trust Di¤erences-in-Di¤erences Estimates?" Quarterly Journal of Economics, Vol. 119, No. 1 (February), pages 249-275. [16] Bertrand, Marianne, Dean Karlan, Sendhil Mullainathan, Eldar Sha…r, and Jonathan Zinman (2010). "What’sAdvertising Content Worth? Evidence from a Consumer Credit Marketing Field Experiment." Quarterly Journal of Economics, Vol. 125, No. 1 (February), pages 263-306. [17] Beshears, John, James Choi, David Laibson, Brigitte Madrian (2009). "How Does Simpli…ed Disclosure A¤ect Individuals’Mutual Fund Choices?" NBER Working Paper No. 14859 (April).

38 [18] Beshears, John, James Choi, David Laibson, Brigitte Madrian, and Katherine Milkman (2012). "The E¤ect of Providing Peer Information on Retirement Savings Decisions." Working Paper, Stanford Uni- versity (May). [19] Bollinger, Brian, Phillip Leslie, and Alan Sorensen (2011). “Calorie Posting in Chain Restaurants.” American Economic Journal: Economic Policy, Vol. 3, No. 1 (February), pages 91-128. [20] Burke, LE, MA Styn, SM Sereika, MB Conroy, L Ye, Karen Glanz, MA Sevick, and LJ Ewing. "Using mHealth Technology to Enhance Self-Monitoring for Weight Loss: A Randomized Trial." American Journal of Preventive Medicine, Vol. 43, No. 1 (July), pages 20-26. [21] Cahill, Kate, and Rafael Perera (2008). "Competitions and Incentives for Smoking Cessation." Cochrane Database of Systematic Reviews, No 3. [22] Charness, Gary, and Uri Gneezy (2009). "Incentives to Exercise." Econometrica, Vol. 77, No. 3 (May), pages 909-931. [23] Chetty, Raj, Adam Looney, and Kory Kroft (2009). "Salience and Taxation: Theory and Evidence." American Economic Review, Vol. 99, No. 4 (September), pages 1145-1177. [24] Cialdini, Robert, Linda Demaine, Brad Sagarin, Daniel Barrett, Kelton Rhoads, and Patricia Winter (2006). "Managing Social Norms for Persuasive Impact." Social In‡uence, Vol. 1, No. 1, pages 3-15. [25] Cialdini, Robert, Raymond Reno, and Carl Kallgren (1990). "A Focus Theory of Normative Conduct: Recycling the Concept of Norms to Reduce Littering in Public Places." Journal of Personality and Social Psychology, Vol. 58, No. 6, pages 1015-1026. [26] Costa, Dora, and Matthew Kahn (2013). "Energy Conservation Nudges and Environmentalist Ideol- ogy: Evidence from a Randomized Residential Electricity Field Experiment." Journal of the European Economic Association, forthcoming. [27] Cutler, David, Robert Huckman, and Mary Beth Landrum (2004). "The Role of Information in Medical Markets: An Analysis of Publicly Reported Outcomes in Cardiac Surgery." American Economic Review, Vol. 94, No. 2 (May), pages 342-346. [28] Davis, Lucas (2008). "Durable Goods and Residential Demand for Energy and Water: Evidence from a Field Trial." RAND Journal of Economics, Vol. 39, No. 2 (Summer), pages 530-546. [29] Davis, Matthew (2011). “Behavior and Energy Savings." Working Paper, Environmental Defense Fund (May). [30] DellaVigna, Stefano, and Ulrike Malmendier (2006). "Paying Not to Go to the Gym." American Eco- nomic Review, Vol. 96, No. 3 (June), pages 694-719. [31] DellaVigna, Stefano, Ulrike Malmendier, and John List (2012). "Testing for Altruism and Social Pressure in Charitable Giving." Quarterly Journal of Economics, Vol. 127, No. 1 (February), pages 1-56. [32] Dranove, David, and Ginger Zhe Jin (2010). "Quality Disclosure and Certi…cation: Theory and Prac- tice.”Journal of Economic Literature, Vol. 48, No. 4 (December), pages 935-963. [33] Dranove, David and Andrew Sfekas (2008) “Start Spreading the News: A Structural Estimate of the E¤ects of New York Hospital Report Cards.”Journal of Health Economics, Vol. 27, No. 5, pages 1201– 07. [34] Du‡o, Esther, and Emmanuel Saez (2003). "The Role of Information and Social Interactions in Retire- ment Plan Decisions: Evidence from a Randomized Experiment." Quarterly Journal of Economics, Vol. 118, No. 3 (August), pages 815-842. [35] Ebbinghaus, Hermann (1885). Memory: A Contribution to Experimental Psychology. New York: Columbia University Teachers College.

39 [36] Ferraro, Paul, and Michael Price (2013). "Using Non-Pecuniary Strategies to In‡uence Behavior: Evi- dence from a Large Scale Field Experiment." Review of Economics and Statistics, Vol. 95, No. 1 (March), pages 64–73. [37] Ferraro, Paul, Juan Jose Miranda, and Michael Price (2011). "The Persistence of Treatment E¤ects with Norm-Based Policy Instruments: Evidence from a Randomized Environmental Policy Experiment." American Economic Review, Vol. 101, No. 3 (May), pages 318-322. [38] Ferraro, Paul, and Juan Jose Miranda (2013). "Heterogeneous Treatment E¤ects and Causal Mecha- nisms in Non-Pecuniary, Information-Based Environmental Policies: Evidence from a Large-Scale Field Experiment." Resource and Energy Economics, Vol. 35, pages 356-379. [39] Figlio, David, and Maurice Lucas (2004) “What’s in a Grade? School Report Cards and the Housing Market.”American Economic Review, Vol. 94, No. 3, pages 591-604. [40] Frey, Bruno, and Stephan Meier (2004). "Social Comparisons and Pro-Social Behavior: Testing ‘Con- ditional Cooperation’in a Field Experiment." American Economic Review, Vol. 94, No. 5 (December), pages 1717-1722. [41] Friedrich, Katherine, Maggie Eldridge, Dan York, Patti Witte, and Marty Kushler (2009). “Saving Energy Cost-E¤ectively: A National Review of the Cost of Energy Saved through Utility-Sector Energy E¢ ciency Programs.”ACEEE Report No. U092 (September). [42] Gabaix, Xavier (2012). "A Sparsity-Based Model of ." Working Paper, NYU (Sep- tember). [43] Gabaix, Xavier, and David Laibson (2006). "Shrouded Attributes, Consumer Myopia, and Information Suppression in Competitive Markets.”Quarterly Journal of Economics, Vol. 121, No. 2, pages 505-540. [44] Gerber, Alan, Donald Green, and Ron Shachar (2003). "Voting May Be Habit-Forming: Evidence from a Randomized Field Experiment." American Journal of Political Science, Vol. 47, No. 3 (July), pages 540-550. [45] Gerber, Alan, and Todd Rogers (2009). “Descriptive Social Norms and Motivation to Vote: Everybody’s Voting and So Should You.”Journal of Politics, Vol. 71, pages 1-14. [46] Gine, Xavier, Dean Karlan, and Jonathan Zinman (2010). "Put Your Money Where Your Butt Is: A Commitment Contract for Smoking Cessation." American Economic Journal: Applied Economics, Vol. 2, No. 4 (October), pages 213-235. [47] Gneezy, Uri, and John List (2006). "Putting to Work: Testing for Gift Exchange in Labor Markets Using Field Experiments." Econometrica, Vol. 74, No. 5 (September), pages 1365- 1384. [48] Gneezy, Uri, Stephan Meier, and Pedro Rey-Biel (2011). "When and Why Incentives (Don’t) Work to Modify Behavior." Journal of Economic Perspectives, Vol. 25, No. 4 (Fall), pages 191-210. [49] Greenstone, Michael, Paul Oyer, and Annette Vissing-Jorgensen (2006). “Mandated Disclosure, Stock Returns, and the 1964 Securities Acts Amendments.”Quarterly Journal of Economics, 121(2): 399–460. [50] Haselhuhn, Michael, Devin Pope, Maurice Schweitzer, and Peter Fishman (2012). "The Impact of Personal Experience on Behavior: Evidence from Video-Rental Fines." Management Science, Vol. 58, No. 1 (January), pages 52-61. [51] Hastings, Justine and Je¤rey Weinstein (2008) “Information, School Choice and Academic Achievement: Evidence from Two Experiments.”Quarterly Journal of Economics, Vol. 123, No. 4, pages 1373-1414. [52] Integral Analytics (2012). "Sacramento Municipal Utility District Home Energy Report Program." http://www.integralanalytics.com/ia/Portals/0/FinalSMUDHERSEval2012v4.pdf [53] Jackson, Kirabo (2010). 
"A Little Now for a Lot Later: A Look at a Texas Advanced Placement Incentive Program." Journal of Human Resources, Vol. 45, No. 3 (Summer), pages 591-639.

40 [54] Jin, Ginger Zhe, and Alan T. Sorensen (2006) “Information and Consumer Choice: The Value of Publicized Health Plan Ratings.”Journal of Health Economics, 25(2): 248–75. [55] John, Leslie, , Andrea Troxel, Laurie Norton, Jennifer Fassbender, and Kevin Volpp (2011). "Financial Incentives for Extended Weight Loss: A Randomized, Controlled Trial." Journal of General Internal Medicine, Vol. 26, No. 6 (June), pages 621-626. [56] Joskow, Paul (2012). "Creating a Smarter U.S. Electricity Grid." Journal of Economic Perspectives, Vol. 26, No. 1 (Winter), pages 29-48. [57] Joskow, Paul, and Catherine Wolfram (2012). "Dynamic Pricing of Electricity." American Economic Review, Vol. 102, No. 3 (May). [58] Karlan, Dean, Margaret McConnell, Sendhil Mullainathan, and Jonathan Zinman (2010). "Getting to the Top of Mind: How Reminders Increase Saving." Working Paper, Harvard University (October). [59] KEMA (2012). "Puget Sound Energy’sHome Energy Reports Program: Three Year Impact, Behavioral and Process Evaluation." Madison, Wisconsin: DNV KEMA Energy and Sustainability. [60] Kling, Je¤rey, Sendhil Mullainathan, Eldar Sha…r, Lee Vermeulen, and Marian Wrobel (2012). "Com- parison Friction: Experimental Evidence from Medicare Drug Plans." Quarterly Journal of Economics, Vol. 127, No. 1 (February). [61] Landry, Craig, Andreas Lange, John List, Michael Price, and Nicholas Rupp (2010). "Is a Donor in Hand Better than Two in the Bush? Evidence from a Natural Field Experiment." American Economic Review, Vol. 100 (June), pages 958-983. [62] Levitt, Steve, John List, and Sally Sado¤ (2010). "The E¤ect of Performance-Based Incentives on Educational Achievement: Evidence from a Randomized Experiment." Working Paper, University of Chicago. [63] Macharia, W., G. Leon, B. Rowe, B. Stephenson, and R. Haynes (1992). "An Overview of Interventions to Improve Compliance with Appointment Keeping for Medical Services." Journal of the American Medical Association, Vol. 267, No. 13 (April 1), pages 1813-1817. [64] Mathios, Alan (2000). "The Impact of Mandatory Disclosure Laws on Product Choices: An Analysis of the Salad Dressing Market." Journal of Law and Economics, Vol. 43, No. 2 (October), pages 651-678. [65] Mullainathan, Sendhil (2002). "A Memory-Based Model of Bounded Rationality." Quarterly Journal of Economics, Vol. 117, No. 3 (August), pages 735-774. [66] Nolan, Jessica, Wesley Schultz, Robert Cialdini, Noah Goldstein, and Vladas Griskevicius (2008). “Nor- mative In‡uence is Underdetected.”Personality and Social Psychology Bulletin, Vol. 34, pages 913-923. [67] Opinion Dynamics (2012). "Massachusetts Three Year Cross-Cutting Behavioral Program Evaluation Integrated Report." Waltham, MA: Opinion Dynamics Corporation. [68] Oullette, Judith, and Wendy Wood (1998). "Habit and Intention in Everyday Life: The Multiple Processes by Which Past Behavior Predicts Future Behavior." Psychological Bulletin, Vol. 124, No. 1, pages 54-74. [69] Perry, Michael, and Sarah Woehleke ( 2013).“Evaluation of Paci…c Gas and Electric Company’sHome Energy Report Initiative for the 2010-2012 Program.”Freeman, Sullivan, and Company (April). [70] Pope, Devin (2009). "Reacting to Rankings: Evidence from "America’s Best Hospitals." Journal of Health Economics, Vol. 28, No. 6, pages 1154-1165. [71] Reis, Ricardo (2006a). "Inattentive Consumers." Journal of Monetary Economics, Vol. 53, No. 8, pages 1761-1800. [72] Reiss, Peter, and Matthew White (2008). "What Changes Energy Consumption? Prices and Public Pressure." RAND Journal of Economics, Vol. 
39, No. 3 (Autumn), pages 636-663.

41 [73] Royer, Heather, Mark Stehr, and Justin Snydor (2013). "Incentives, Commitments, and Habit Formation in Exercise: Evidence from a Field Experiment with Workers at a Fortune-500 Company." Working Paper, University of California at Santa Barbara (March). [74] Rubin, David, and Amy Wenzel (1996). "One Hundred Years of Forgetting: A Quantitative Description of Retention." Psychological Review, Vol. 103, No. 4 (October), pages 734-760. [75] Sallee, James (2011). "Rational Inattention and Energy E¢ ciency." Working Paper, University of Chicago (June). [76] Scanlon, Dennis, Michael Chernew, Catherine McLaughlin, and Gary Solon (2002). “The Impact of Health Plan Report Cards on Managed Care Enrollment.” Journal of Health Economics, Vol. 21, No. 1, pages 19–41. [77] Schneider, Walter, and Richard Shi¤rin (1977). "Controlled and Automatic Human Information Process- ing: I. Detection, Search, and Attention." Psychological Review, Vol. 84, No. 1, pages 1-66. [78] Schultz, Wesley, Jessica Nolan, Robert Cialdini, Noah Goldstein, and Vladas Griskevicius (2007). "The Constructive, Destructive, and Reconstructive Power of Social Norms." Psychological Science, Vol. 18, pages 429-434. [79] Sexton, Steven (2011). "Automatic Bill Payment, Price Salience, and Consumption: Evidence from Residential Electricity Consumption. Working Paper, UC Berkeley (November). [80] Shi¤rin, Richard, and Walter Schneider (1977). "Controlled and Automatic Human Information Process- ing: II. Perceptual Learning, Automatic Attending and a General Theory." Psychological Review, Vol. 84, No. 1, pages 127-190. [81] Sumi, David, and Brian Coates (1989). "Persistence of Energy Savings in Seattle City Light’sResiden- tial Weatherization Program." Energy Program Evaluation: Conservation and Resource Management: Proceedings of the August 23-25, 1989 Conference, pages 311-316. Chicago: Energy Program Evaluation Conference. [82] U.S. Energy Information Administration (2009). "Table 2. Residen- tial Consumption of Electricity by End Use, 2001." Available from http://www.eia.gov/emeu/recs/recs2001/enduse2001/enduse2001.html#table2 [83] U.S. Energy Information Administration (2005). "Table US 14: Av- erage Consumption by Energy End Uses, 2005." Available from http://www.eia.gov/consumption/residential/data/2005/c&e/summary/pdf/tableus14.pdf [84] U.S. Energy Information Administration (2011). "Table 5A. Residential Av- erage Monthly Bill by Census Division, and State." Available from http://www.eia.gov/electricity/sales_revenue_price/html/table5_a.html. [85] U.S. Energy Information Administration (2013). "Form EIA-861 Data Files." Available from http://www.eia.gov/electricity/data/eia861/ [86] van Dulmen, Sandra, Emmy Sluijs, Liset van Dijk, Denise de Ridder, Rob Heerdink, and Jozien Bens- ing (2007). "Patient Adherence to Medical Treatment: A Review of Reviews." BMC Health Services Research, Vol. 7, No. 55, [87] Violette, Daniel, Provencher, Bill, and Mary Klos (2009). "Impact Evaluation of Positive Energy SMUD Pilot Study." Boulder, CO: Summit Blue Consulting. [88] Volpp, Kevin, Andrea Troxel, Mark Pauly, Henry Glick, Andrea Puig, David Asch, Robert Galvin, Jingsan Zhu, Fei Wan, Jill deGuzman, Elizabeth Corbett, Janet Weiner, and Janet Audrain-McGovern (2009). "A Randomized, Controlled Trial of Financial Incentives for Smoking Cessation." New England Journal of Medicine, Vol. 360 (February 12), pages 699-709.

Tables

Table 1: Descriptive Statistics

                                           Site 1                Site 2               Site 3
Region                                     Upper Midwest         Northwest            Southwest
Average January Heating Degrees            46.9                  25.4                 19.3
Average July Cooling Degrees               5.6                   2.2                  8.9

Narrative
  Baseline period begins                   October 1, 2007       January 1, 2007      April 1, 2006
  First reports generated                  January 13, 2009      October 8, 2008      March 25, 2008
                                           to February 24, 2009  to October 9, 2008   to May 6, 2008
  Last report generated for dropped group  January 25, 2011      September 10, 2010   June 29, 2010
  End of sample                            April 29, 2013        March 31, 2013       March 20, 2013

Frequency                                  60% Monthly,          72% Monthly,         71% Monthly (Heavier users),
                                           40% Quarterly         28% Quarterly        Quarterly (Lighter users)
                                           (Randomly assigned)   (Randomly assigned)  (Randomly assigned)
                                           2011: Continued
                                           group to bimonthly

Number of Households
  Treatment: Continued                     26,262                23,399               21,630
  Treatment: Dropped                       12,368                11,543               12,117
  Control                                  33,524                43,945               49,290
  Total                                    72,154                78,887               83,037

Number of Electricity Bill Observations    4,931,925             5,418,250            6,393,523

Utility-wide Average Usage
  Mean for all residential
  consumers (kWh/day)                      29.9                  32.3                 24.2

Baseline Usage (kWh/day)
(For experimental population)
  Mean                                     30.1                  30.3                 32.1
  Standard deviation                       16.7                  13.5                 15.6
  Treatment - Control                      0.024                 0.044                -0.450
  (Standard error)                         (0.124)               (0.097)              (0.51)
  Dropped - Continued                      -0.074                0.062                0.026
  (Standard error)                         (0.182)               (0.154)              (0.17)

Attrition due to Moving
  Share of households that move            0.20                  0.23                 0.26
  Treatment - Control                      -0.0043               0.0021               0.0109
  (Standard error)                         (0.0030)              (0.0030)             (0.0069)
  Dropped - Continued                      0.011                 -0.0074              0.0032
  (Standard error)                         (0.0044)              (0.0048)             (0.0051)

Attrition due to Opting Out
  Share of treatment group that opts out   0.020                 0.019                0.026

Table 2: Descriptive Statistics for Short-Run Analysis

Narrative
  Wave                        1                  1                  2
  Start Date                  October 8, 2008    October 8, 2008    February 2011
  Frequency                   Monthly            Quarterly          Bimonthly

Number of Households
  Treatment                   24,851             9,923              21,972
  Control                     33,003             10,995             21,895
  Total                       57,854             20,918             43,867

Number of Observations        113,473,575        40,681,988         71,289,771

Table 3: Immediate Effects at Arrival Window

Table 3a: First Four Reports

(1) (2) (3) (4) (5) (6) Monthly Monthly Quarterly Quarterly Bimonthly Bimonthly Base Weather Base Weather Base Weather 1(Treated) 1(Arrival Period) -0.062 -0.062 -0.067 -0.070 -0.047 -0.043  (0.024)*** (0.024)*** (0.029)** (0.028)** (0.033) (0.033) 1(Treated) 1(Post-Arrival Period) -0.172 -0.185 -0.201 -0.197 -0.129 -0.152  (0.030)*** (0.027)*** (0.041)*** (0.035)*** (0.038)*** (0.036)*** 1(Treated) -0.534 -0.451 -0.391 -0.420 -0.366 -0.276 (0.065)*** (0.086)*** (0.067)*** (0.084)*** (0.059)*** (0.106)*** 1(Treated) Heating Degrees -0.004 0.002 -0.009  (0.004) (0.005) (0.008) Heating Degrees 0.038 0.020 0.083 (0.016)** (0.014) (0.011)*** 1(Treated) Cooling Degrees 0.000 -0.031  (0.010) (0.019)* Cooling Degrees 0.281 0.016 (0.019)*** (0.027) N 8,515,691 8,515,691 19,333,058 19,333,058 9,609,303 9,609,303

Table 3b: After First Four Reports

(1) (2) (3) (4) (5) (6) Monthly Monthly Quarterly Quarterly Bimonthly Bimonthly Base Weather Base Weather Base Weather 1(Treated) 1(Arrival Period) -0.015 -0.017 -0.010 -0.005 -0.025 -0.129  (0.007)** (0.007)** (0.019) (0.019) (0.047) (0.049)*** 1(Treated) 1(Post-Arrival Period) -0.032 -0.033 -0.045 -0.038 -0.211 -0.230  (0.009)*** (0.009)*** (0.022)** (0.022)* (0.060)*** (0.061)*** 1(Treated) -0.801 -0.706 -0.657 -0.509 -0.645 -0.048 (0.058)*** (0.059)*** (0.092)*** (0.095)*** (0.089)*** (0.143) 1(Treated) Heating Degrees -0.006 -0.010 -0.034  (0.003)** (0.005)** (0.008)*** 1(Treated) Cooling Degrees -0.017 -0.007 -0.050  (0.007)** (0.013) (0.028)* Heating Degrees 0.004 0.007 0.082 (0.010) (0.011) (0.013)*** Cooling Degrees 0.090 0.023 0.463 (0.012)*** (0.015) (0.029)*** N 75,217,587 75,217,587 52,418,516 52,418,516 19,554,914 19,554,914

Notes: Tables 3a and 3b present the estimates of Equation (3) for the first four reports and all remaining reports, respectively. In each pair of estimates, the left column does not control for degree days, while the right column does. The outcome variable is electricity use, in kilowatt-hours per day. Standard errors are robust, clustered by household. *, **, ***: Statistically significant with 90, 95, and 99 percent confidence, respectively.
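As a rough guide to the mechanics behind these estimates (the exact specification of Equation (3), including its fixed effects and controls, is defined in the text), the sketch below runs an arrival-window interaction regression with household-clustered standard errors on simulated data. The data, variable names, and coefficient values are hypothetical; this is not the paper's code.

```python
# Minimal sketch of an arrival-window regression: daily electricity use on a
# treatment indicator and its interactions with arrival-period and
# post-arrival-period indicators, with standard errors clustered by household.
# All data below are simulated; the paper's Equation (3) includes additional
# controls (e.g., day effects and degree days) not reproduced here.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n_households, n_days = 500, 120
hh_id = np.repeat(np.arange(n_households), n_days)
day = np.tile(np.arange(n_days), n_households)
treated = (hh_id % 2 == 0).astype(int)            # hypothetical assignment
arrival = ((day >= 30) & (day < 40)).astype(int)  # hypothetical arrival window
post_arrival = (day >= 40).astype(int)

kwh = (30 + rng.normal(0, 5, hh_id.size)          # baseline usage plus noise
       - 0.45 * treated                           # pre-existing treatment effect
       - 0.06 * treated * arrival                 # extra effect during the arrival window
       - 0.18 * treated * post_arrival)           # extra effect after the window

df = pd.DataFrame(dict(kwh=kwh, treated=treated, arrival=arrival,
                       post_arrival=post_arrival, hh_id=hh_id))
fit = smf.ols("kwh ~ treated + arrival + post_arrival"
              " + treated:arrival + treated:post_arrival", data=df).fit(
    cov_type="cluster", cov_kwds={"groups": df["hh_id"]})
print(fit.summary().tables[1])
```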

Table 4: Decays Between Reports

Table 4a: First Four Reports

                                       (1)         (2)          (3)          (4)          (5)         (6)
                                       Monthly     Monthly      Quarterly    Quarterly    Bimonthly   Bimonthly
                                       Base        Weather      Base         Weather      Base        Weather
1(Treated) × 1(Window) × Time          1.356       4.082        0.706        0.708        1.012       0.948
                                       (1.265)     (1.302)***   (0.195)***   (0.187)***   (0.439)**   (0.426)**
1(Treated)                             -0.413      -0.098       -0.346       -0.338       -0.408      -0.242
                                       (0.064)***  (0.095)      (0.071)***   (0.084)***   (0.067)***  (0.104)**
1(Treated) × Heating Degrees                       -0.013                    -0.000                   -0.011
                                                   (0.004)***                (0.004)                  (0.008)
Heating Degrees                                    0.042                     0.021                    0.085
                                                   (0.016)***                (0.014)                  (0.011)***
1(Treated) × Cooling Degrees                                                 -0.004                   -0.028
                                                                             (0.012)                  (0.019)
Cooling Degrees                                                              0.282                    0.015
                                                                             (0.019)***               (0.027)
N                                      8,515,691   8,515,691    19,333,058   19,333,058   9,609,303   9,609,303

Table 4b: After First Four Reports

                                       (1)         (2)         (3)          (4)          (5)         (6)
                                       Monthly     Monthly     Quarterly    Quarterly    Bimonthly   Bimonthly
                                       Base        Weather     Base         Weather      Base        Weather
1(Treated) × 1(Window) × Time          0.333       0.393       0.017        0.023        0.022       0.134
                                       (0.322)     (0.315)     (0.141)      (0.140)      (0.539)     (0.536)
1(Treated)                             -0.777      -0.682      -0.606       -0.532       -0.551      -0.080
                                       (0.056)***  (0.058)***  (0.087)***   (0.091)***   (0.089)***  (0.141)
1(Treated) × Heating Degrees                       -0.007                   -0.006                   -0.030
                                                   (0.003)**                (0.004)                  (0.008)***
1(Treated) × Cooling Degrees                       -0.014                   -0.005                   -0.044
                                                   (0.008)*                 (0.013)                  (0.029)
Heating Degrees                                    0.004                    0.007                    0.080
                                                   (0.010)                  (0.011)                  (0.013)***
Cooling Degrees                                    0.089                    0.023                    0.460
                                                   (0.012)***               (0.015)                  (0.029)***
N                                      75,217,587  75,217,587  52,418,516   52,418,516   19,554,914  19,554,914

Notes: Tables 4a and 4b present the estimates of Equation (4) for the first four reports and all remaining reports, respectively. In each pair of estimates, the left column does not control for degree days, while the right column does. The outcome variable is electricity use, in kilowatt-hours per day. Standard errors are robust, clustered by household. *, **, ***: Statistically significant with 90, 95, and 99 percent confidence, respectively.

Table 5: Long-Run Effects

                                       (1)          (2)          (3)
                                       Site 1       Site 2       Site 3
                                       ATEs         ATEs         ATEs
1(Treated) × 1(Pre-Treatment)          -0.037       -0.022       -0.008
                                       (0.057)      (0.036)      (0.065)
1(Treated) × 1(1st Year)               -0.565       -0.450       -0.641
                                       (0.042)***   (0.043)***   (0.084)***
1(Treated) × 1(2nd Year Until Drop)    -0.867       -0.659       -0.865
                                       (0.053)***   (0.052)***   (0.092)***
1(Continued) × 1(Post-Drop)            -0.930       -0.870       -1.007
                                       (0.068)***   (0.069)***   (0.110)***
1(Dropped) × 1(Post-Drop)              -0.591       -0.584       -0.627
                                       (0.085)***   (0.089)***   (0.123)***
N                                      3,294,294    4,435,689    5,063,949

Notes: This table presents the estimates of Equation (6), omitting the third and fourth lines. The outcome variable is monthly average electricity use, in kilowatt-hours per day. Standard errors are robust, clustered by household. *, **, ***: Statistically significant with 90, 95, and 99 percent confidence, respectively.

Table 6: Long-Run Decay Parameters

                                       (1)          (2)          (3)          (4)          (5)          (6)
                                       Site 1       Site 1       Site 2       Site 2       Site 3       Site 3
                                       Base         Weather      Base         Weather      Base         Weather
1(Treated) × 1(Pre-Treatment)          -0.037       -0.037       -0.022       -0.022       -0.008       -0.008
                                       (0.057)      (0.057)      (0.036)      (0.036)      (0.065)      (0.065)
1(Treated) × 1(1st Year)               -0.565       -0.565       -0.450       -0.450       -0.641       -0.641
                                       (0.042)***   (0.042)***   (0.043)***   (0.043)***   (0.084)***   (0.084)***
1(Treated) × 1(2nd Year Until Drop)    -0.867       -0.925       -0.659       -0.584       -0.865       -0.756
                                       (0.053)***   (0.062)***   (0.052)***   (0.062)***   (0.092)***   (0.107)***
1(Dropped) × 1(Post-Drop)              -0.786       -0.840       -0.718       -0.631       -0.725       -0.590
                                       (0.090)***   (0.090)***   (0.095)***   (0.091)***   (0.129)***   (0.134)***
1(Continued) × 1(Post-Drop)            -1.029       -1.030       -0.804       -0.805       -0.889       -0.895
                                       (0.071)***   (0.071)***   (0.076)***   (0.076)***   (0.118)***   (0.117)***
1(Dropped) × 1(Post-Drop) × Time       0.176        0.178        0.114        0.113        0.086        0.090
                                       (0.053)***   (0.053)***   (0.047)**    (0.047)**    (0.045)*     (0.046)*
1(Continued) × 1(Post-Drop) × Time     0.091        0.087        -0.062       -0.061       -0.079       -0.082
                                       (0.041)**    (0.041)**    (0.039)      (0.039)      (0.034)**    (0.036)**
N                                      3,294,294    3,294,294    4,435,689    4,435,689    5,063,949    5,063,949

Notes: This table presents the estimates of Equation (6). The outcome variable is monthly average electricity use, in kilowatt-hours per day. Standard errors are robust, clustered by household. *, **, ***: Statistically significant with 90, 95, and 99 percent confidence, respectively.
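One way to read the decay parameters: under a linear extrapolation, the ratio of the post-drop level to the post-drop time trend gives the implied number of years until the dropped group's effect would reach zero. A small sketch using the column (1) point estimates; this is a reading aid only, not the paper's preferred persistence calculation.

```python
# Implied years until the dropped group's effect decays to zero, assuming the
# linear post-drop trend continues. Point estimates from Table 6, column (1).
post_drop_level = 0.786      # |ATE| just after reports are discontinued (kWh/day)
decay_per_year = 0.176       # decline in the |ATE| per year (kWh/day per year)

years_to_zero = post_drop_level / decay_per_year
print(f"Implied years to zero effect: {years_to_zero:.1f}")  # roughly 4.5 years
```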

Table 7: Self-Reported Actions

                                                   (1)       (2)            (3)
                                                             Treatment-     Treatment-
"In the past twelve months, have you..."           Mean      Control        Control | X
Taken any steps to reduce energy use?              0.77      0.010          -0.001
                                                             (0.012)        (0.015)

Repeated Actions                                   0.62      0.005          0.011
                                                             (0.008)        (0.010)
Adjusted your thermostat settings?                 0.63      0.012          0.007
                                                             (0.015)        (0.019)
Unplugged devices and chargers?                    0.65      -0.020         -0.013
                                                             (0.039)        (0.044)
Switched off power strips or appliances            0.59      0.002          0.011
when unused?                                                 (0.014)        (0.018)
Turned off lights when unused?                     0.96      0.005          0.006
                                                             (0.009)        (0.010)
Hung laundry to dry?                               0.42      0.010          0.001
                                                             (0.024)        (0.027)
Used energy saving or sleep features               0.56      0.008          0.021
on your computer?                                            (0.021)        (0.029)
Turned off computer at night?                      0.65      -0.034         -0.030
                                                             (0.023)        (0.026)
Used fans to keep cool?                            0.80      0.072         0.086
                                                             (0.034)        (0.039)

Physical Capital Changes                           0.55      -0.002         0.002
                                                             (0.008)        (0.010)
Replaced incandescent light bulbs with LEDs?       0.70      0.013          0.016
                                                             (0.038)        (0.043)
Purchased Energy Star appliances?                  0.74      0.002          0.019
                                                             (0.016)        (0.022)
Disposed of a second refrigerator or freezer?      0.26      -0.001         0.030
                                                             (0.015)        (0.019)
Installed light timers or sensors?                 0.30      -0.018         -0.014
                                                             (0.038)        (0.043)
Replaced incandescent light bulbs with CFLs?       0.81      0.000          -0.017
                                                             (0.013)        (0.017)
Added insulation or replaced windows?              0.54      -0.039         -0.055
                                                             (0.024)        (0.029)
Had a home energy audit?                           0.19      0.057         0.058
                                                             (0.022)        (0.026)
Installed a programmable thermostat?               0.79      -0.033         -0.025
                                                             (0.032)        (0.037)

Intermittent Actions                               0.62      0.006          0.007
                                                             (0.012)        (0.017)
Tuned up your AC system?                           0.63      -0.016         -0.014
                                                             (0.018)        (0.024)
Used a programmable thermostat?                    0.59      0.009          0.076
                                                             (0.028)        (0.047)
Added weather-stripping or caulking                0.60      0.008          -0.009
around windows?                                              (0.018)        (0.025)
Cleaned or replaced heating or AC                  0.70      0.017          0.010
system air filters?                                          (0.038)        (0.044)
Participated in any utility energy                 0.19      0.018         0.010
efficiency programs?                                         (0.010)        (0.013)

Notes: This table presents survey data from 5,856 households on self-reported energy conservation actions. The three columns at left present aggregated results across six sites, while the three on the right present results from the utility we study in the rest of the paper. Standard errors are robust. *, **, ***: Statistically significant with 90, 95, and 99 percent confidence, respectively.

Table 8: Program Participation

                                            (1)           (2)           (3)              (4)
                                            Savings       Number        Treatment-       Treatment-
                                            Measure       of            Control          Control
                                            (kWh/day)     Households    Participation    Savings
                                                                        Rate (%)         (Wh/day)
Site 3
  New Appliance                             1             1,590         0.23            2.3
                                                                        (0.098)          (1.0)
  Heating, Ventilation,                     5             1,481         0.028            1.4
  and Air Conditioning                                                  (0.093)          (4.7)
  Insulation, Air Sealing, and              5             890           0.18            9.0
  other "Home Improvement"                                              (0.074)          (3.7)
  All                                       3.4           3,855         0.417           14.2
                                                                        (0.149)          (5.1)

Site 2
  Clothes Washer                            0.35          1,357         0.11             0.38
                                                                        (0.11)           (0.43)
  Insulation                                0             271           0.040            0
                                                                        (0.049)          (0)
  Refrigerator Decommissioning              1.37          215           0.045            0.41
                                                                        (0.043)          (0.59)
  Showerhead                                0.15          214           0.025            0.09
                                                                        (0.043)          (0.10)
  Windows                                   0             213           -0.021           0
                                                                        (0.042)          (0)
  Compact Fluorescent Lightbulbs            2.27          204           0.161           2.25
                                                                        (0.046)          (1.14)
  Water Heater                              1.36          144           0.035            -0.99
                                                                        (0.035)          (1.16)
  Freezer Decommissioning                   1.52          99            0.020            0.31
                                                                        (0.028)          (0.43)
  Heat Pump                                 1.77          41            0.010            -0.19
                                                                        (0.019)          (0.38)
  New Refrigerator                          1.75          6             -0.007           -0.10
                                                                        (0.007)          (0.13)
  Windows                                   6.69          5             0.008            0.89
                                                                        (0.008)          (0.82)
  Conversion to Gas Heat                    28.08         1             -0.002           -0.64
                                                                        (0.002)          (0.64)
  All                                       0.70          2,481         0.36             2.40
                                                                        (0.14)           (2.19)

Notes: This table presents data on participation in energy conservation programs in Site 3 for April 2008 through June 2010, and in Site 2 for calendar year 2011. For readability, the coefficients in column (3) are in percentage points, ranging from 0 to 100, and the coefficients in column (4) are in Watt-hours per day instead of kilowatt-hours per day. Standard errors are robust. *, **, ***: Statistically significant with 90, 95, and 99 percent confidence, respectively.
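As a units check on column (4): multiplying a measure's savings (kWh/day) by the treatment-control difference in participation rates (in percentage points) and converting to Watt-hours reproduces the Site 3 rows. The Site 2 rows do not follow this simple product, so the sketch below is only a back-of-the-envelope illustration of the units, not the estimator used in the table.

```python
# Back-of-the-envelope units check for Table 8, Site 3 "All" row:
# kWh/day saved per participating household x participation-rate difference
# (percentage points) -> expected savings difference in Wh/day.
measure_savings_kwh_per_day = 3.4        # column (1), Site 3 "All"
participation_diff_pct_points = 0.417    # column (3), treatment minus control

savings_wh_per_day = (measure_savings_kwh_per_day
                      * participation_diff_pct_points / 100   # pct points -> share
                      * 1000)                                  # kWh -> Wh
print(f"{savings_wh_per_day:.1f} Wh/day")  # about 14.2, matching column (4)
```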

Table 9: In-Sample Cost Effectiveness of a Two-Year Intervention at Site 2

Scenario                                      1             2             3
Time Horizon                                  Sample        Sample        Sample
Assumed Persistence                           Zero          Full          Observed

Electricity Savings and Costs
  Savings during treatment (kWh)              405           405           405
  (Standard Error)                            (25)          (25)          (24.6)
  Post-treatment savings (kWh)                0             600           531
  (Standard Error)                            (0)           (47)          (81)
  Total savings (kWh)                         405           1,004         936
  (Standard Error)                            (25)          (53)          (85)

  Total cost ($)                              18            18            18

Cost Effectiveness
  Cost Effectiveness (cents/kWh)              4.44          1.79          1.92
  (Standard Error)                            (0.27)        (0.09)        (0.17)

Retail Electricity Cost Savings
  Per household cost savings ($)              40            100           94
  (Standard Error)                            (2.5)         (5.3)         (8.5)
  Total population cost savings ($millions)   2.92          7.25          6.76
  (Standard Error)                            (0.18)        (0.38)        (0.61)

Notes: This table shows the cost effectiveness of a two-year intervention in Site 2 under different assumptions about post-treatment persistence. See text for details.
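The cost-effectiveness figures are simply program cost divided by electricity savings. Reproducing the Scenario 3 ("Observed" persistence) column from the point estimates above:

```python
# Cost effectiveness = program cost per household / total kWh saved per household.
# Point estimates from Table 9, Scenario 3 (observed persistence).
cost_dollars = 18.0
savings_during_kwh = 405.0
savings_post_kwh = 531.0

total_kwh = savings_during_kwh + savings_post_kwh            # 936 kWh
cents_per_kwh = 100 * cost_dollars / total_kwh
print(f"Cost effectiveness: {cents_per_kwh:.2f} cents/kWh")  # about 1.92
```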

Table 10: Cost Effectiveness and Program Design

Design                                           1            2            3            4
Description                                      One report   2 years      2 years      4 years
                                                              Monthly-     Monthly-     Monthly-
                                                              Quarterly    Quarterly,   Quarterly
                                                                           2 years
                                                                           Biannual

Electricity Savings and Costs
  Savings during treatment (kWh PDV)             0            397          868          973
  |ATE| during last treatment period (kWh/day)   0.30         0.66         0.72         0.88
  Decay rate (kWh/day per year)                  0.70         0.11         0.11         0.11
  Years from end of treatment to zero effect     0.43         5.8          6.3          7.7
  Post-treatment savings (kWh PDV)               23           580          612          890
  Total savings (kWh PDV)                        23           977          1,480        1,863

  Total number of reports                        1            18           22           36
  Total cost ($ PDV)                             1.0          17.7         21.3         33.8

  Cost Effectiveness (cents/kWh)                 4.30         1.81         1.44         1.81

Incremental Effects (Relative to previous column)
  Additionality Effect (kWh PDV)                              468          503          383
  Persistence Effect (kWh PDV)                                485          0            0
  Incremental savings (kWh PDV)                               953          503          383
  Incremental cost ($ PDV)                                    16.7         3.6          12.5
  Incremental cost-effectiveness (cents/kWh)                  1.75         0.71         3.27

Notes: This table shows the cost effectiveness of different program designs in Site 2. See text for details.

Figures

Figure 1: Home Energy Report, Front and Back

Figure 2: Short-Run Effects for Quarterly Group

Notes: This figure plots the smoothed ATEs for each day of the first year of treatment for the quarterly treatment groups, as estimated by Equation (1). The dotted lines reflect 90 percent confidence intervals, with robust standard errors clustered by household.

Figure 3: Short-Run Effects in Event Time

Figure 3a: First Four Reports

Figure 3b: After First Four Reports

Notes: Figures 3a and 3b plot the ATEs in event time for the first four reports and all remaining reports, respectively, as estimated by Equation (2). These ATEs are residualized on degree-day and report-specific controls. The dotted lines reflect 90 percent confidence intervals, with robust standard errors clustered by household.
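The 90 percent bands in these figures correspond to the point estimate plus and minus 1.645 standard errors. A minimal plotting sketch with made-up event-time coefficients (the series below is illustrative only, not the estimates underlying Figure 3):

```python
# Plot a hypothetical event-time ATE path with 90 percent confidence bands
# (point estimate +/- 1.645 * standard error). Values are illustrative only.
import numpy as np
import matplotlib.pyplot as plt

days = np.arange(-10, 61)                          # days relative to report arrival
ate = np.where(days < 0, -0.15,
               -0.15 - 0.06 * np.exp(-days.clip(min=0) / 20.0))
se = np.full(days.shape, 0.03)

plt.plot(days, ate, color="black", label="ATE")
plt.plot(days, ate - 1.645 * se, linestyle=":", color="gray")
plt.plot(days, ate + 1.645 * se, linestyle=":", color="gray")
plt.axvline(0, linewidth=0.8, color="black")
plt.xlabel("Days since report arrival")
plt.ylabel("ATE (kWh/day)")
plt.legend()
plt.show()
```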

Figure 4: Long-Run Effects

Figure 4a: Site 1

Figure 4b: Site 2

Figure 4c: Site 3

Notes: These figures plot the ATEs for each month of the sample for the continued and dropped groups, estimated by Equation (5). The dotted lines reflect 90 percent confidence intervals, with robust standard errors clustered by household.

Figure 5: Monthly and Quarterly Effects

Figure 5a: Site 2

Figure 5b: Site 1

Notes: This figure plots the ATEs for each month of the sample for the monthly and quarterly groups. The dotted lines reflect 90 percent confidence intervals, with robust standard errors clustered by household.

Figure 6: Effects of Repeated Intervention

Notes: This figure shows the channels of incremental electricity conservation when the intervention is lengthened from one report to two years. Electricity savings are kilowatt-hours (kWh) per household, discounted to the present at five percent. The assumptions used to calculate these effects are detailed in the text, and the numbers match those in Table 10.

Appendix: For Online Publication

The Short-Run and Long-Run Effects of Behavioral Interventions: Experimental Evidence from Energy Conservation

Hunt Allcott and Todd Rogers

Appendix Tables

Table A1: Short-Run Effects at Arrival Window

Table A1a: First Four Reports

                                       (1)         (2)         (3)          (4)          (5)         (6)
                                       Monthly     Monthly     Quarterly    Quarterly    Bimonthly   Bimonthly
                                       Outliers    Full M      Outliers     Full M       Outliers    Full M
1(Treated) × 1(Arrival Period)         -0.061      -0.061      -0.070       -0.069       -0.032      -0.052
                                       (0.024)**   (0.023)***  (0.028)**    (0.028)**    (0.031)     (0.029)*
1(Treated) × 1(Post-Arrival Period)    -0.183      -0.185      -0.190       -0.193       -0.154      -0.160
                                       (0.027)***  (0.027)***  (0.036)***   (0.035)***   (0.034)***  (0.036)***
1(Treated)                             -0.430      -0.580      -0.413       -0.408       -0.228      -0.379
                                       (0.086)***  (0.114)***  (0.083)***   (0.091)***   (0.101)**   (0.103)***
1(Treated) × Heating Degrees                -0.004                  0.002                     -0.011
                                            (0.004)                 (0.005)                   (0.007)
Heating Degrees                             0.039                   0.021                     0.089
                                            (0.016)**               (0.014)                   (0.010)***
1(Treated) × Cooling Degrees                                        0.000                     -0.018
                                                                    (0.010)                   (0.016)
Cooling Degrees                                                     0.279                     -0.016
                                                                    (0.019)***                (0.022)
N                                      8,514,078   8,515,691   19,330,176   19,333,058   9,589,391   9,609,303

Table A1b: After First Four Reports

                                       (1)         (2)         (3)          (4)          (5)         (6)
                                       Monthly     Monthly     Quarterly    Quarterly    Bimonthly   Bimonthly
                                       Outliers    Full M      Outliers     Full M       Outliers    Full M
1(Treated) × 1(Arrival Period)         -0.016      -0.017      -0.004       -0.006       -0.129      -0.137
                                       (0.007)**   (0.007)**   (0.019)      (0.019)      (0.049)***  (0.050)***
1(Treated) × 1(Post-Arrival Period)    -0.030      -0.032      -0.036       -0.035       -0.217      -0.233
                                       (0.009)***  (0.009)***  (0.022)      (0.022)      (0.060)***  (0.061)***
1(Treated)                             -0.696      -0.762      -0.509       -0.555       -0.041      -0.144
                                       (0.059)***  (0.063)***  (0.094)***   (0.115)***   (0.138)     (0.141)
1(Treated) × Heating Degrees                -0.007                  -0.009                    -0.034
                                            (0.003)**               (0.005)*                  (0.008)***
1(Treated) × Cooling Degrees                -0.017                  -0.007                    -0.042
                                            (0.007)**               (0.013)                   (0.026)
Heating Degrees                             0.004                   0.007                     0.082
                                            (0.010)                 (0.011)                   (0.012)***
Cooling Degrees                             0.089                   0.022                     0.444
                                            (0.012)***              (0.015)                   (0.028)***
N                                      75,201,504  75,217,587  52,409,856   52,418,516   19,513,453  19,554,914

Notes: Tables A1a and A1b present the estimates of Equation (3) for the first four reports and all remaining reports, respectively. In each pair of estimates, the left column includes more flexible temperature controls, while the right column excludes outliers. The outcome variable is electricity use, in kilowatt-hours per day. Standard errors are robust, clustered by household. *, **, ***: Statistically significant with 90, 95, and 99 percent confidence, respectively.

Table A2: Short-Run Effects Between Reports

Table A2a: First Four Reports

                                       (1)         (2)         (3)          (4)          (5)         (6)
                                       Monthly     Monthly     Quarterly    Quarterly    Bimonthly   Bimonthly
                                       Outliers    Full M      Outliers     Full M       Outliers    Full M
1(Treated) × 1(Window) × Time          4.061       4.476       0.674        0.697        0.744       0.884
                                       (1.290)***  (1.309)***  (0.185)***   (0.187)***   (0.392)*    (0.417)**
1(Treated)                             -0.083      -0.544      -0.325       -0.319       -0.222      -0.342
                                       (0.094)     (0.120)***  (0.083)***   (0.089)***   (0.101)**   (0.101)***
1(Treated) × Heating Degrees                -0.014                  -0.001                    -0.011
                                            (0.004)***              (0.004)                   (0.007)
Heating Degrees                             0.043                   0.021                     0.089
                                            (0.016)***              (0.014)                   (0.010)***
1(Treated) × Cooling Degrees                                        -0.004                    -0.010
                                                                    (0.012)                   (0.016)
Cooling Degrees                                                     0.280                     -0.020
                                                                    (0.019)***                (0.022)
N                                      8,514,078   8,515,691   19,330,176   19,333,058   9,589,391   9,609,303

Table A2b: After First Four Reports

                                       (1)         (2)         (3)          (4)          (5)         (6)
                                       Monthly     Monthly     Quarterly    Quarterly    Bimonthly   Bimonthly
                                       Outliers    Full M      Outliers     Full M       Outliers    Full M
1(Treated) × 1(Window) × Time          0.312       0.449       0.037        0.023        -0.115      0.275
                                       (0.313)     (0.312)     (0.140)      (0.141)      (0.526)     (0.528)
1(Treated)                             -0.672      -0.725      -0.532       -0.509       -0.070      -0.144
                                       (0.058)***  (0.061)***  (0.091)***   (0.104)***   (0.136)     (0.141)
1(Treated) × Heating Degrees                -0.007                  -0.005                    -0.031
                                            (0.003)**               (0.004)                   (0.008)***
1(Treated) × Cooling Degrees                -0.014                  -0.004                    -0.037
                                            (0.008)*                (0.013)                   (0.026)
Heating Degrees                             0.004                   0.007                     0.080
                                            (0.010)                 (0.011)                   (0.012)***
Cooling Degrees                             0.088                   0.021                     0.441
                                            (0.012)***              (0.015)                   (0.028)***
N                                      75,201,504  75,217,587  52,409,856   52,418,516   19,513,453  19,554,914

Notes: Tables A2a and A2b present the estimates of Equation (4) for the first four reports and all remaining reports, respectively. In each pair of estimates, the left column includes more flexible temperature controls, while the right column excludes outliers. The outcome variable is electricity use, in kilowatt-hours per day. Standard errors are robust, clustered by household. *, **, ***: Statistically significant with 90, 95, and 99 percent confidence, respectively.

Table A3: Placebo Report Arrivals

                                       (1)          (2)
                                       Base         Weather
1(Treated) × 1(Arrival Period)         -0.001       -0.007
                                       (0.016)      (0.015)
1(Treated) × 1(Post-Arrival Period)    0.011        -0.004
                                       (0.019)      (0.019)
1(Treated)                             -0.671       -0.489
                                       (0.095)***   (0.093)***
1(Treated) × Heating Degrees                        -0.012
                                                    (0.005)**
1(Treated) × Cooling Degrees                        -0.008
                                                    (0.013)
Heating Degrees                                     0.007
                                                    (0.011)
Cooling Degrees                                     0.023
                                                    (0.015)
N                                      52,418,516   52,418,516

Notes: This table presents the estimates of Equation (3) for the quarterly group, for reports that the monthly group received but the quarterly group did not. The sample includes the period after the quarterly group's first four reports. The left column does not control for degree days, while the right column does. The outcome variable is electricity use, in kilowatt-hours per day. Standard errors are robust, clustered by household. *, **, ***: Statistically significant with 90, 95, and 99 percent confidence, respectively.

Table A4: Persistence by Subgroup

                                       (1)          (2)          (3)          (4)          (5)          (6)
                                       Site 1       Site 1       Site 2       Site 2       Both Sites   Both Sites
                                       Levels       Decays       Levels       Decays       Levels       Decays
1(Dropped)                             -0.601       -0.832       -0.650       -0.805       -0.626       -0.812
                                       (0.090)***   (0.097)***   (0.101)***   (0.108)***   (0.068)***   (0.072)***
1(Dropped) × 1(Quarterly)              0.077        0.324        0.233        0.293        0.169        0.290
                                       (0.188)      (0.202)      (0.177)      (0.190)      (0.131)      (0.139)**
1(Dropped) × Baseline Usage            -0.283       -0.477       -0.632       -0.561       -0.476       -0.495
                                       (0.163)*     (0.184)***   (0.142)***   (0.154)***   (0.107)***   (0.119)***
1(Dropped) × 1(Post-Drop) × Time                    0.211                     0.131                     0.164
                                                    (0.057)***                (0.054)**                 (0.040)***
Quarterly Decay Difference                          -0.232                    -0.050                    -0.109
                                                    (0.122)*                  (0.093)                   (0.075)
Baseline Usage Decay Difference                     0.183                     -0.061                    0.017
                                                    (0.111)*                  (0.081)                   (0.067)
N                                      956,848      956,848      1,387,473    1,387,473    2,344,321    2,344,321

Notes: This table presents the estimates of Equation (6), allowing the post-drop level and decay parameters to differ for monthly vs. quarterly groups and as a function of baseline usage, normalized to mean 0 and standard deviation 1. The sample is limited to the post-drop period and includes only dropped and control group households. The outcome variable is monthly average electricity use, in kilowatt-hours per day. Standard errors are robust, clustered by household. *, **, ***: Statistically significant with 90, 95, and 99 percent confidence, respectively.

Table A5: Differences Between Continued and Dropped Groups

                                       (1)          (2)          (3)
                                       Site 1       Site 2       Site 3
                                       Base         Base         Base
1(Treated) × 1(Pre-Treatment)          -0.063       -0.035       -0.011
                                       (0.064)      (0.041)      (0.069)
1(Treated) × 1(1st Year)               -0.573       -0.456       -0.640
                                       (0.046)***   (0.048)***   (0.088)***
1(Treated) × 1(2nd Year Until Drop)    -0.874       -0.669       -0.879
                                       (0.058)***   (0.058)***   (0.098)***
1(Treated) × 1(Post-Drop)              -0.930       -0.870       -1.007
                                       (0.068)***   (0.069)***   (0.110)***
1(Dropped) × 1(Pre-Treatment)          0.079        0.038        0.007
                                       (0.085)      (0.057)      (0.055)
1(Dropped) × 1(1st Year)               0.024        0.018        -0.002
                                       (0.062)      (0.067)      (0.066)
1(Dropped) × 1(2nd Year)               0.022        0.032        0.039
                                       (0.075)      (0.081)      (0.080)
1(Dropped) × 1(Post-Drop)              0.339        0.286        0.380
                                       (0.086)***   (0.096)***   (0.093)***
N                                      3,294,294    4,435,689    5,063,949

Notes: This table presents estimates of the differences between continued and dropped group treatment effects in each period. The outcome variable is monthly average electricity use, in kilowatt-hours per day. Standard errors are robust, clustered by household. *, **, ***: Statistically significant with 90, 95, and 99 percent confidence, respectively.

Table A6: Long-Run Effects with Balanced Panel

                                       (1)          (2)          (3)
                                       Site 1       Site 2       Site 3
                                       ATEs         ATEs         ATEs
1(Treated) × 1(Pre-Treatment)          0.002        -0.024       -0.035
                                       (0.062)      (0.039)      (0.065)
1(Treated) × 1(1st Year)               -0.578       -0.469       -0.722
                                       (0.045)***   (0.045)***   (0.083)***
1(Treated) × 1(2nd Year Until Drop)    -0.878       -0.671       -0.904
                                       (0.056)***   (0.054)***   (0.092)***
1(Dropped) × 1(Post-Drop)              -0.605       -0.554       -0.618
                                       (0.088)***   (0.093)***   (0.126)***
1(Continued) × 1(Post-Drop)            -0.934       -0.853       -1.037
                                       (0.070)***   (0.072)***   (0.110)***
N                                      2,924,939    3,800,809    4,226,607

Notes: This table presents the estimates of Equation (6), omitting the third and fourth lines, with the sample limited to households that never move. It replicates Table 5, except with a balanced panel. The outcome variable is monthly average electricity use, in kilowatt-hours per day. Standard errors are robust, clustered by household. *, **, ***: Statistically significant with 90, 95, and 99 percent confidence, respectively.

Table A7: Long-Run Decay Parameters with Balanced Panel

                                       (1)          (2)          (3)          (4)          (5)          (6)
                                       Site 1       Site 1       Site 2       Site 2       Site 3       Site 3
                                       Base         Weather      Base         Weather      Base         Weather
1(Treated) × 1(Pre-Treatment)          0.002        0.002        -0.024       -0.024       -0.035       -0.035
                                       (0.062)      (0.062)      (0.039)      (0.039)      (0.065)      (0.065)
1(Treated) × 1(1st Year)               -0.578       -0.578       -0.469       -0.469       -0.722       -0.722
                                       (0.045)***   (0.045)***   (0.045)***   (0.045)***   (0.083)***   (0.083)***
1(Treated) × 1(2nd Year Until Drop)    -0.878       -0.923       -0.671       -0.597       -0.904       -0.783
                                       (0.056)***   (0.065)***   (0.054)***   (0.065)***   (0.092)***   (0.106)***
1(Dropped) × 1(Post-Drop)              -0.783       -0.826       -0.682       -0.595       -0.706       -0.551
                                       (0.092)***   (0.093)***   (0.097)***   (0.091)***   (0.129)***   (0.130)***
1(Continued) × 1(Post-Drop)            -1.013       -1.014       -0.778       -0.779       -0.966       -0.973
                                       (0.073)***   (0.073)***   (0.078)***   (0.078)***   (0.114)***   (0.112)***
1(Dropped) × 1(Post-Drop) × Time       0.158        0.161        0.107        0.106        0.075        0.080
                                       (0.050)***   (0.050)***   (0.044)**    (0.044)**    (0.043)*     (0.044)*
1(Continued) × 1(Post-Drop) × Time     0.072        0.067        -0.068       -0.067       -0.041       -0.044
                                       (0.039)*     (0.039)*     (0.036)*     (0.036)*     (0.032)      (0.034)
N                                      2,924,939    2,924,939    3,800,809    3,800,809    4,226,607    4,226,607

Notes: This table presents the estimates of Equation (6), with the sample limited to households that never move. It replicates Table 6, except with a balanced panel. The outcome variable is monthly average electricity use, in kilowatt-hours per day. Standard errors are robust, clustered by household. *, **, ***: Statistically significant with 90, 95, and 99 percent confidence, respectively.

Table A8: In-Sample Cost Effectiveness of a Two-Year Intervention at Site 1

Scenario                                      1             2             3
Time Horizon                                  Sample        Sample        Sample
Assumed Persistence                           Zero          Full          Observed

Electricity Savings and Costs
  Savings during treatment (kWh)              523           523           523
  (Standard Error)                            (25)          (25)          (24.7)
  Post-treatment savings (kWh)                0             709           483
  (Standard Error)                            (0)           (43)          (70)
  Total savings (kWh)                         523           1,232         1,006
  (Standard Error)                            (25)          (50)          (74)

  Total cost ($)                              17            17            17

Cost Effectiveness
  Cost Effectiveness (cents/kWh)              3.31          1.40          1.72
  (Standard Error)                            (0.16)        (0.06)        (0.13)

Retail Electricity Cost Savings
  Per household cost savings ($)              52            123           101
  (Standard Error)                            (2.5)         (5)           (7.4)
  Total population cost savings ($millions)   3.77          8.89          7.26
  (Standard Error)                            (0.18)        (0.36)        (0.53)

Notes: This re-creates Table 9 for Site 1. See text for further details.

Table A9: In-Sample Cost Effectiveness of a Two-Year Intervention at Site 3

Scenario                                      1             2             3
Time Horizon                                  Sample        Sample        Sample
Assumed Persistence                           Zero          Full          Observed

Electricity Savings and Costs
  Savings during treatment (kWh)              628           628           628
  (Standard Error)                            (52)          (52)          (52)
  Post-treatment savings (kWh)                0             859           623
  (Standard Error)                            (0)           (91)          (122)
  Total savings (kWh)                         628           1,487         1,251
  (Standard Error)                            (52)          (105)         (133)

  Total cost ($)                              20            20            20

Cost Effectiveness
  Cost Effectiveness (cents/kWh)              3.20          1.35          1.61
  (Standard Error)                            (0.26)        (0.1)         (0.17)

Retail Electricity Cost Savings
  Per household cost savings ($)              63            149           125
  (Standard Error)                            (5.2)         (10.5)        (13.3)
  Total population cost savings ($millions)   4.53          10.73         9.03
  (Standard Error)                            (0.38)        (0.76)        (0.96)

Notes: This re-creates Table 9 for Site 3. See text for further details.

Table A10: Ten-Year Cost Effectiveness of a Two-Year Intervention at Site 2

Scenario                                      1             2             3
Time Horizon                                  10 Years      10 Years      10 Years
Assumed Persistence                           Zero          Full          Extrapolated

Electricity Savings and Costs
  Savings during treatment (kWh)              405           405           405
  (Standard Error)                            (25)          (24.6)        (24.6)
  Post-treatment savings (kWh)                0             1,924         733
  (Standard Error)                            (0)           (260)         (87)
  Total savings (kWh)                         405           2,329         1,138
  (Standard Error)                            (25)          (261)         (90)

  Total cost ($)                              18            18            18

Cost Effectiveness
  Cost Effectiveness (cents/kWh)              4.44          0.77          1.58
  (Standard Error)                            (0.27)        (0.09)        (0.12)

Retail Electricity Cost Savings
  Per household cost savings ($)              40            233           114
  (Standard Error)                            (2.5)         (26.1)        (9)
  Total population cost savings ($millions)   2.92          16.81         8.21
  (Standard Error)                            (0.18)        (1.88)        (0.65)

Notes: This re-creates Table 9 with a ten-year time horizon. Savings in Scenario 3 are extrapolated using the estimated long-run linear decay parameter. See text for further details.

Appendix Figures

Figure A1: Breakdown of Household Electricity Use

Notes: This figure shows the breakdown of electricity use for the average American household in 2001, the most recent year for which detailed figures are available. Source: U.S. Energy Information Administration (2009).

Figure A2: Short-Run Effects for Monthly and Quarterly Groups

Notes: This figure plots the smoothed ATEs for each day of the first year of treatment for the monthly and quarterly treatment groups, as estimated by Equation (1).