
Slum Upgrading and Long-run Urban Development: Evidence from Indonesia

Mariaflavia Harari† (University of Pennsylvania)
Maisy Wong‡ (University of Pennsylvania and NBER)

First version: July 2017. This version: October 2019.

Abstract

Developing countries face massive urbanization. We investigate how to facilitate urban transformation through the lens of the 1969-1984 KIP program, which provided public goods to five million slum residents in Jakarta, Indonesia. We assemble high-resolution data on program boundaries and current outcomes, including novel slum indexes from photographs. Among historical kampungs, KIP areas have 12% lower land values and 50% fewer high-rises. They are more informal, denser, and land is more fragmented. Boundary estimates are similar. As weak property rights limit mobility, place-based distortions from upgrades make central slums persist longer, leading to dynamic inefficiencies as land becomes scarce.

JEL Classifications: R14, R31, R48

Keywords: Cities, urbanization, place-based policies, land markets, informality

∗ We are grateful to Pak Darrundono, Gilles Duranton, Marja Hoek-Smit, Vernon Henderson, Ben Olken, and Tavneet Suri for their advice. We also thank the many authorities in the Jakarta Government for making the data available for academic research. We acknowledge the individual sources in the paper. We thank participants at the Asian Development Review conference, BREAD/CEPR/LSE/TCD Workshop in Development , CIFAR Research Workshop on Smart Inclusive Cities, the Harvard-IGC, IEB Workshop on , the Urban Economic Association Meetings, the NBER Summer Institute (Urban), Society of Economic Dynamics Meetings, and seminar participants at Harvard/MIT, Haas, University of Delaware, USC, UC Irvine, and Wharton. Adil Ahsan, Kania Azrina, Xinzhu Chen, Gitta Djuwadi, Krista Iskandar, Richard Jin, Jeremy Kirk, Melinda Martinus, Joonyup Park, Yuan Pei, Xuequan Peng, Arliska Fatma Rosi, Beatrix Siahaan, Vincent Tanutama, and Ramda Yanurzha were excellent research assistants. We thank the Research Sponsors Program of the Zell/Lurie Real Estate Center, the Tanoto ASEAN Initiative, and the Global Initiatives at the Wharton School. All errors are our own.

† Corresponding author. Wharton Real Estate. 428 Vance Hall, 3733 Spruce Street, Philadelphia, PA 19104-6301. Email: [email protected].

‡ Wharton Real Estate. 434 Vance Hall, 3733 Spruce Street, Philadelphia, PA 19104-6301. Email: [email protected].

1 Introduction

Developing countries are experiencing massive urbanization.1 To promote growth, cities require transformative , while facing increasing land scarcity (Bryan et al., 2019). Moreover, weak land institutions associated with unclear property rights hinder the allocation of land to its most productive use, preventing governments from coordinating provision and land taxation to fund investments. A prominent manifestation is the persistence of slums alongside formal developments, pointing to misallocation (Henderson et al., 2016). Despite the growing importance of slums, which host one billion people worldwide (UN Habitat, 2016), we know little about the effects of policies targeting them due to a lack of data and endogeneity challenges (Field and Kremer, 2006).

We make three contributions to the debate on how to facilitate urban transformation in settings with weak land market institutions. Our first contribution is to provide novel causal estimates of the long-term impacts of one of the world’s largest slum upgrading programs.2 The 1969-1984 Kampung Improvement Program (KIP)3 provided local public goods, such as roads and sanitation, to 5 million beneficiaries in the then-impoverished city of Jakarta, Indonesia. To encourage private investments by residents, planners bundled public investments with informal non-eviction guarantees.4 While slum upgrading improves the quality of life in slums, policy makers are also concerned that the improvements will encourage crowding and make slums more persistent (Duranton and Venables, 2018). Our main finding is that KIP neighborhoods today have lower land values and fewer tall buildings. They are also more likely to stay informal and have high population density, echoing the concerns of policy makers.

Our second contribution is to shed light on how place-based policies can induce dynamic inefficiencies when a city grows out of informality (Krugman, 1991).
Place-based investments directly improve local land values, but also distort the location decisions of people and firms (Kline and Moretti, 2014), particularly in slums, where weak property rights limit mobility (Field, 2007). By inducing residents to stay, KIP improvements and the non-eviction guarantee endogenously increase population density and land fragmentation over time. As the city faces rapid urbanization and land becomes scarce, it is more challenging to redevelop (denser and more fragmented) KIP neighborhoods,5 due to the higher costs of relocating residents and assembling land parcels. While non-KIP neighborhoods become formal, KIP neighborhoods stay informal with lower land values and building heights. This way, the negative indirect effects induced by place-based distortions can be large enough to reverse the direct improvements in land values in the short run (Duranton and Venables, 2018). We conceptualize this through a model of spatial equilibrium across neighborhoods à la Rosen-Roback (Rosen, 1979; Roback, 1982), augmented with formalization costs.

1 The estimates that 2.5 billion people will be added to cities by 2050 and that 95% of the urban expansion will be in the developing world (UN Habitat, 2012).

2 Slum upgrading is an increasingly popular approach and has been considered in , , Colombia, Kenya, , Indonesia, and Tanzania (UN Habitat, 2011; , 2017; UN Habitat, 2017; Government of India, 2016; World Bank, 2018).

3 Kampung is a colloquial term used in Indonesia to describe traditional (rural and urban) . Unless stated otherwise, we will use the terms slums, informal settlements, and kampungs interchangeably.

4 Non-eviction guarantees encourage slum residents to make long-term investments and are commonly bundled with physical slum upgrades (Collier et al., 2018), especially in settings with weak land market institutions and limited capacity to establish a system of secure, enforceable, and marketable property rights (Glaeser and Shleifer, 2002).

Third, we contribute a uniquely rich, granular, and comprehensive dataset to quantify the impacts of KIP and empirically assess potential channels. We address critical challenges, including a lack of data in slums and a lack of standardized definitions of informality. Our core datasets include KIP policy maps, assessed land values, and a photographic survey of 28,000 photos from which we measure building heights and the degree of informality. Crucially, our photos cover both formal and informal areas, combining Google Street View imagery with photos we took in slums where roads are inaccessible to Street View cars. We also obtained auxiliary data on one million land parcels to characterize land fragmentation and ten million individuals from the 2010 Population Census to assess population density and demographics. Our primary outcomes are current land values and building heights. We describe other outcomes below.

To isolate the long-term causal impacts of KIP, the main threat is program selection bias because KIP planners targeted low-quality neighborhoods.
Our first empirical strategy restricts the sample to historical kampungs that existed before KIP and compares treated ones with those that were never treated, within the same neighborhood. Our thought experiment involves two locations (T and C) that were both slums before KIP, with T becoming treated. Over time, massive urbanization in Jakarta introduced demand shocks common to both T and C. Absent KIP, our assumption is that both neighborhoods would have relatively similar market potential today, conditional on granular locality fixed effects (comparable in area to census tracts in the ) and rich distance and topography controls.

Furthermore, we refine our comparison using a boundary discontinuity exercise within 200 meters of KIP boundaries, utilizing the high-resolution policy maps and rich variation. The identification assumption is that unobserved market potential does not change discontinuously around the boundaries, conditional on observables. According to World Bank (1995), developers accounted for KIP status as they selected sites for development, in line with perceptions of greater costs in KIP. Our robustness checks assess spatial spillovers and confounding of KIP boundaries by administrative boundaries.

5 The process by which slums become formal neighborhoods (formalization) is complex and requires coordinating costly verification of ownership, the relocation of large numbers of slum residents, and potential political backlash. It also requires assembling many land parcels and facing holdout problems, as dense slums are highly fragmented due to repeated and uncoordinated land subdivisions.

Our preferred estimate suggests that KIP areas have 12% lower land values today and half as many tall buildings (defined as having more than three floors). Reassuringly, we can account for 88% of the effect from the assessed values analysis by imputing the of the missing tall buildings in KIP. This assuages the concern that land values may not be accurately measured in (informal) KIP areas, since heights represent real quantities from our representative photo sample and our imputing exercise only uses land values in non-KIP areas. Aggregating across locations, the -12% effect on land values translates into an of of US$11 billion. While part of this may reflect a mere reshuffling of development activity, we provide evidence that points to an aggregate reduction: central KIP areas have worse outcomes than middle non-KIP areas, with two-thirds of KIP being central by now.

Next, we exploit the staggered roll-out of KIP across three waves to assess program selection bias. KIP planners formulated a scoring rule to prioritize kampungs with worse neighborhood quality. If our results were driven by selection bias only, we should find the worst outcomes for the earliest wave. Indeed, we estimate a statistically significant monotonic pattern in land value impacts consistent with the scoring rule. Importantly, this pattern disappears once we include granular location fixed effects and refine the comparison to KIP and non-KIP historical kampungs only. The findings are robust to controlling for differences in treatment intensity across the three waves.
As an additional identification check, to assess confounding by the generic persistence of slums, we repeat our boundary discontinuity analysis using placebo borders from non-KIP historical kampungs, finding no discontinuity.

To investigate why land values and heights are lower in KIP areas, we develop novel proxies of informality to show that KIP areas are more likely to stay informal today, relative to non-KIP areas in the same locality that were also informal before KIP. Defining and measuring informality is challenging. We make progress using our photographic survey to construct a novel quality index to quantify varying degrees of informality. We also complement this with an attributes-based index where we coded up fifteen physical attributes to quantify vehicular access, neighborhood appearance, and permanence of structures in neighborhoods. Our estimates suggest that KIP areas are 15 percentage points more likely to be informal, as per our quality index. We find similar patterns with the attributes-based index.

We then consider some of the contributing factors underlying informality and provide metrics for land fragmentation and population density. First, we develop a proxy of land assembly costs using the number of land parcels per unit area. In KIP neighborhoods, assembling the land area needed to build a high-rise requires 9 more parcels (47%), exacerbating holdout problems, land assembly costs, and relocation costs associated with formalizing slums. A back-of-the-envelope exercise suggests that greater land fragmentation in KIP explains 75% (38%) of the effect of KIP on land values (heights). Along with the greater parcel density, we also find greater population density and migration patterns consistent with residents remaining in KIP and sub-dividing land parcels. Interestingly, KIP areas are more fragmented than non-KIP even when considering peripheral and informal areas that plausibly have not formalized yet, suggesting that KIP directly increased parcel and population density.

Taken together, our results suggest an aggregate loss of surplus from delayed formalization, manifesting itself when urbanization increases the market potential of central kampungs. As the city expands, the initial place-based investments can evolve into sunk costs, but KIP residents cannot monetize the increased market potential due to weak property rights. Moreover, we show that even if all the surplus from redevelopment could be distributed to current KIP residents, it may not be enough to compensate them for displacement to the periphery (Bilal and Rossi-Hansberg, 2019; Barnhardt et al., 2017). This rationalizes why formalization of slums is so slow. The ensuing misallocation of land also leads to an inefficient spatial configuration of the city that limits urban connectivity, agglomeration , and the land tax base. The foregone tax revenues could fund person-based redistribution to the poor and improvements in infrastructure in the periphery, which can moderate relocation costs for central slum dwellers.

Our paper extends the literature on urban development to deepen our understanding of how cities grow out of informality.
Henderson et al. (2016) and Gechter and Tsivanidis (2018) highlight misallocation and opportunity costs of land use in the context of slums in Kenya and India, respectively. Our results emphasize similar patterns by examining one of the world’s largest slum upgrading programs. These findings underscore the severe challenges in cities facing rapid urbanization under limited resources and weak institutions.6

Second, we contribute to the literature on place-based policies by examining neighborhood improvements in settings with weak land market institutions.7 Michaels et al. (2017) evaluate the effects of a “sites and services” program in Tanzania in the 1970’s that provided infrastructure to unsettled areas, contrasting it with slum upgrading on-site.8 Together, our papers provide complementary evidence on how to facilitate urban transformation in cities at different stages of development,9 by examining the two most common policy options to manage slum proliferation. Sites and services programs help shed light on the importance of coordinating ex ante, when land is still available, to minimize dynamic inefficiencies. We show that policy makers considering slum upgrading need to internalize ex post distortionary costs that manifest when land becomes scarce. These costs need to be weighed against the benefits for slum residents and redistributive goals that policy makers may have.

We also contribute to research on housing and slums in developing countries. Beyond the urban renewal literature above, researchers have examined the relationship between slums and access to labor market opportunities, titling, housing affordability, provision of local public goods, and the political in slums.10

6 The frictions we highlight also echo the urban economics literature on land assembly costs (Brooks and Lutz, 2016), property rights and land fragmentation (Libecap and Lueck, 2013), coordination failures (Hornbeck and Keniston, 2017), path dependence (Bleakley and Lin, 2012), and land use regulations in the United States (Turner et al., 2014) and in India (Duranton et al., 2015).

7 The existing literature has focused on settings with well-defined property rights, largely in developed countries (Neumark and Simpson, 2015). In urban , McIntosh et al. (2018) and Gonzalez-Navarro and Quintana-Domeque (2016) find that infrastructural improvements increase land in the short run for low-income neighborhoods where tenure security is not contentious.

The rest of the paper proceeds as follows. Section 2 describes the background, Section 3 describes the data, Section 4 outlines the conceptual framework, Section 5 describes the empirical strategy, Section 6 presents our main results, Section 7 explores potential channels, Section 8 addresses threats to identification, Section 9 discusses the policy implications, and Section 10 concludes.
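The first empirical strategy summarized in the Introduction compares treated and never-treated historical kampungs within the same locality. The following is a minimal sketch of that within-locality comparison on synthetic data, not the paper's specification: the variable names, the data-generating process, and the assumed effect of -0.12 log points are all hypothetical, chosen only to illustrate how locality fixed effects can remove selection on neighborhood quality.

```python
import numpy as np

def within_locality_effect(outcome, treated, locality):
    """OLS slope on `treated` after demeaning both variables within locality,
    which is equivalent to including locality fixed effects."""
    outcome = np.asarray(outcome, dtype=float)
    treated = np.asarray(treated, dtype=float)
    _, group = np.unique(locality, return_inverse=True)
    group = group.ravel()
    counts = np.bincount(group)
    y_dm = outcome - (np.bincount(group, weights=outcome) / counts)[group]
    t_dm = treated - (np.bincount(group, weights=treated) / counts)[group]
    return float(t_dm @ y_dm / (t_dm @ t_dm))

# Synthetic data (hypothetical): 200 localities with 100 pixels each.
rng = np.random.default_rng(0)
n_localities, pixels_per_locality = 200, 100
locality = np.repeat(np.arange(n_localities), pixels_per_locality)

# Unobserved locality quality; worse localities are more likely to be treated,
# mimicking a program that targets low-quality neighborhoods.
quality = rng.normal(0.0, 0.5, n_localities)
p_treat = 1.0 / (1.0 + np.exp(2.0 * quality))
treated = rng.random(locality.size) < p_treat[locality]

true_effect = -0.12  # assumed effect on log land values, for illustration only
log_value = (quality[locality] + true_effect * treated
             + rng.normal(0.0, 0.2, locality.size))

# Naive difference in means is contaminated by selection on quality.
naive = log_value[treated].mean() - log_value[~treated].mean()
# The within-locality estimate differences out locality quality.
fe = within_locality_effect(log_value, treated, locality)
print(f"naive: {naive:.3f}, within-locality: {fe:.3f}")
```

In this toy setup the naive comparison overstates the negative effect because treatment is more likely in low-quality localities, while the within-locality estimate stays close to the assumed value.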

2 Background

Indonesia is the fourth most populous country in the world with 240 million inhabitants and a GDP per capita of US$3,570 (World Bank, 2016b). Slightly more than half of the population lives in cities today, a share expected to exceed two thirds by 2025 (World Bank, 2016a). Jakarta, the capital, is a mega-city with close to eleven million inhabitants, a land area of 260 square miles (World Bank, 2012), and annual gross regional product per capita of US$14,000 (World Population Review, 2018).

The key stages of urban development in Jakarta can be seen through the evolution of the city’s Master Plans. After Indonesia declared independence from the Dutch in 1945, Jakarta experienced massive in-migration, leading to concerns about overcrowding in the city’s traditional settlements (“kampungs” or urban villages). The 1960 Master Plan prioritized the provision of basic public services and promoted slum upgrading in kampungs. In 1970, Jakarta was declared a “closed city” to limit in-migration, but the city’s population size continued to increase.

To accommodate the rapidly growing population and to facilitate urban transformation, the 1985-2005 Master Plan began to envision the redevelopment of kampungs and the creation of a (the Golden Triangle) slightly south of the city’s historic center (the National Monument in Merdeka Square, indicated in Figure 1). This marked a departure from the previous plan that considered kampungs as part of the urban fabric (Silver, 2007). However, the transition was interrupted by an oil crisis in 1986 and by the Asian Financial Crisis in 1997, with the economy only recovering by the mid-2000’s.11

Today, Jakarta continues to face increasing land scarcity with an annual population growth rate of 2.5% (Haryanto, 2018). The city has been expanding into the sprawling of Jabodetabek,12 the world’s second-largest, home to 30 million inhabitants and over 5 million commuters (Rukmana, 2015). The most recent Jakarta Master Plan aims to address concerns of overpopulation, a severe shortage in housing, and traffic congestion in the center by further promoting the redevelopment of central areas and supporting investments in transit infrastructure (including a new subway) to better connect the city with the adjacent (Human Cities Coalition, 2017).

8 A number of papers have examined smaller-scale slum programs that have an upgrading component: Takeuchi et al. (2008) contrast on-site slum improvements versus relocations, and Kaufmann and Quigley (1987) and Galiani et al. (2017) study the impact of improving housing in slums.

9 The average GDP per capita for Indonesia during KIP was around $1000 (2015 dollars). Today, this is comparable to the GDP per capita of Bangladesh, Kenya, Pakistan, and Tanzania.

10 See Field (2007) and Galiani and Schargrodsky (2010) on titling; Barnhardt et al. (2017) and Franklin (2019) on relocations; Feler and Henderson (2011) and Castells-Quintana (2017) on public services; Marx et al. (2019) on private investments. Also see Brueckner and Lall (2015) and Marx et al. (2013) for an overview.

2.1 Land market institutions and informality

The history of modern property rights in Jakarta can be traced back to the 17th century, when the Dutch colonial administration established a system of well-defined land rights in Dutch settlements (bewoude-kom or “built-up” areas). Conversely, they retained local customary land rights (adat law) in traditional kampungs. After independence, the Basic Agrarian Law of 1960 was intended to unify this dual system, establishing a process for registering customary rights with the National Land Agency. However, the transition was never fully completed.

Today, land market institutions in Indonesia face several challenges towards creating a system of secure, legally enforceable, and marketable land rights. The problems are not limited to titling, but also depend on the trust in government and an efficient judicial system to verify tenure status, resolve ownership disputes, and enforce property rights. As it is, there still exists a complex system of formal and customary property rights. Customary rights (hak girik) are unofficially secured by property tax records, sales receipts, and other documents. They are tradable and recorded in local administrative offices (Leaf, 1994).

11 For example, according to The Skyscraper Center (2019), virtually all skyscrapers in Jakarta were built after 2005. Notable early examples include the 32-story Stock Exchange Building in the Golden Triangle (1994) and the 48-story “Wisma 46” office tower (1996), which would remain the tallest building in Jakarta until 2015. By the late 1990’s, there were still only twenty luxury apartment complexes (Firman, 1999).

12 Jabodetabek comprises Jakarta and the adjacent municipalities of Bogor, Depok, Tangerang, and Bekasi.

Although no comprehensive measure of informality exists, estimates suggest that a quarter of Jakarta’s population lives in informal kampungs (McCarthy, 2003). In 2016, we conducted a field survey covering 300 households in eight kampungs. We found kampungs in Jakarta to be relatively high-quality compared to informal settlements in other developing countries. Structures are relatively permanent and residents have access to many public amenities. However, the roads are narrow and some households lack water and sanitation facilities.13 25% of respondents reported having a formal title. The average annual income is US$3,500 and the annual rental cost is US$1,600.

Given the challenges associated with weak land market institutions, it is not surprising that the process of redeveloping a kampung into a formal neighborhood (formalization) is complex (Leitner and Sheppard, 2018). Land assembly, the crucial phase, involves complicated negotiations between developers, residents, and officials. Without formal titles, it is challenging to identify existing claimants, verify their tenure status, and resolve disputes over land ownership. Furthermore, densely-populated kampungs tend to have fragmented land through repeated and unplanned parcel subdivisions, making it difficult to coordinate the assembly of contiguous parcels and exacerbating holdout problems (Brooks and Lutz, 2016). Local governments also fear political backlash from slum clearance. The compensation paid to residents is highly contentious.
This is well exemplified by the case of “Wisma 46”, an office tower in Central Jakarta just outside the boundaries of a KIP neighborhood (McCarthy, 2003). The developer offered to pay full only to those who could produce a formal title, which was too costly to obtain (potentially involving bribery) for the majority of informal residents. Ultimately, many residents are thought to have received less than the negotiated price and were displaced to peripheral areas of the city.

13 77% of houses had brick or concrete walls and 6% wood-only walls. 93% of respondents reported having metered electricity, 79% utilizing private , and 71% having private toilets. Only 12% of respondents reported that cars can access the road nearest to their house. See Wong (2019) for more details on the survey. We obtained permission from the local government to conduct this household survey.

2.2 The Kampung Improvement Program

KIP is one of the earliest and largest slum upgrading programs ever implemented. The program in Jakarta covered 11,000 hectares and 5 million beneficiaries, with a total outlay between $450 million and $550 million (2016 USD). KIP was later expanded to other cities, eventually covering 50,000 hectares and 15 million beneficiaries in Indonesia (see World Bank (1995), Darrundono (1997), and Darrundono (2012) for more details about KIP).

History of KIP. The earliest slum interventions in Jakarta date back to the 1920’s, when the Dutch upgraded settlements (kampoeng verbeteering) surrounding their communities. After independence, rapid in-migration raised concerns about floods, fires, health hazards, and political riots, especially in kampungs. At that time, Indonesia was resource-constrained, being one of the poorest countries in the world (with a GDP per capita below that of India, Bangladesh, and ). Slum upgrading thus appeared as an affordable policy tool to provide immediate assistance to a large number of kampung residents (Darrundono, 2012).

This study considers the first three waves of KIP upgrades implemented in Jakarta between 1969 and 1984. KIP was discontinued due to a negative budget shock when oil prices plummeted in 1986. Beginning in the 1990’s, KIP was expanded to other cities and the government broadened the scope to include community building and microcredit programs, in contrast to the top-down approach of the earlier waves which prioritized physical upgrades (Kessides, 1997). However, this expansion was disrupted by the 1997 Asian Financial Crisis. Recently, the government has expanded slum upgrading efforts again with the KotaKu program, slated to cover 154 cities and 9.7 million beneficiaries. Many of these cities are not as developed as Jakarta is today.

Program Details. The primary objective of KIP was to improve neighborhood conditions in kampungs.
The government aimed to benefit as many low-income households as possible in the shortest period of time and under a limited budget. They provided basic upgrades with an estimated useful life of 15 years. Beyond budget constraints, the upgrades were basic to avoid making the neighborhood too attractive for higher-income groups (Devas, 1981). Residents were not relocated during the upgrading process.

To encourage residents to invest in their properties to complement the public investments, KIP planners verbally promised not to evict residents for 15 years (Darrundono, 2012, p.50). Physical upgrades by the government also contribute to establishing the permanence of informal settlements by shifting residents’ perceptions of their occupancy rights (De Soto, 1989; Garr, 1996). Given the challenges in establishing formal land market institutions, it is common for slum upgrading programs to bundle some form of tenure security (verbal guarantees or occupancy certificates) with the physical upgrades.

KIP provided three types of physical upgrades. First, the government wanted to improve access in kampungs by widening and paving vehicular roads, bridges, and pedestrian footpaths. Widening roads was of particular importance as policy makers wanted to ensure that ambulances and fire fighters could access slum locations in the event of emergencies. The second component focused on sanitation and water , such as the provision of public water supply and sanitation facilities. As many areas in Jakarta are prone to floods, the government also installed drainage canals and waste disposal facilities, to minimize the clogging of waterways. Third, KIP included the construction of community buildings such as primary schools and neighborhood health clinics. Figure A1 shows an example of a kampung before and after KIP upgrades.

KIP had a staggered roll-out over three five-year plans (Pelita): Pelita I (1969-1974), II (1974-1979), and III (1979-1984). The selection rule prioritized kampungs according to need. Specifically, planners created a scoring rule that ranked kampungs based on physical conditions (e.g. sanitation facilities, flood damage, and road quality), age of the kampung, population density, and estimates of income (KIP, 1969). Given the time constraint and limited information, the scoring rule over-weighted physical conditions that were easy to observe. Moreover, the selection rule stated that kampungs had to be distributed evenly across the five of Jakarta. Figure A2 shows the location of treated kampungs across the three KIP waves.

Descriptive evidence of KIP’s short-run impacts. KIP is generally held up by practitioners and policy makers as an example of a successful slum upgrading program (Devas (1981), Taylor (1987), World Bank (1995), Darrundono (1997), and Darrundono (2012)).
A 1995 evaluation report by the World Bank concluded that KIP “improved the quality of life of Indonesian urban areas at a low cost of ” (World Bank, 1995, p. 71).14 At the same time, there was also evidence that “non-KIP kampungs [had] caught up” (p.6), as a result of broader . We highlight three findings from the 1995 report.

First, the majority of respondents reported that neighborhood quality improved after KIP. Specifically, KIP neighborhoods attained better access to clean water, better education and health facilities, improved footpaths, better living spaces, and less flooding. This was accompanied by a crowd-in of private investments: for instance, households upgraded their floors once they realized that the neighborhood was less prone to flooding after KIP. There is also suggestive evidence that prices in KIP neighborhoods were higher than in non-KIP areas.

Second, KIP was associated with strengthened perceptions of tenure security. Even though respondents “had no land certificate or document to prove [ownership]” (p.111), 47% of KIP respondents claimed a right of ownership compared to 32% of non-KIP respondents (Table 13) and the study suggests that the physical upgrades “were crucial to establishing the permanence of the kampungs” (p.59).

Third, the demographic composition appeared stable, along with improvements in the residents’ : “KIP did not encourage an influx of higher-income groups into the kampungs, as had originally been feared...Residents are, however, now better educated and healthier, household size have declined, more residents are employed and have greater income” (p.6). The average length of stay for KIP residents was 27 years (Table 1).

14 The findings are based on extensive interviews with multiple stakeholders and surveys of two KIP kampungs in central Jakarta as well as one non-KIP kampung in the periphery.

The patterns above illustrate how KIP likely improved neighborhood conditions and enhanced the permanence of kampungs in the short run, in line with the objectives of KIP planners and the 1960 Master Plan. Over time, as the city began to grow, planners started promoting the redevelopment of kampungs. This formalization process is complicated by weak property rights, holdout problems, and the political costs associated with relocating kampung residents. Slum upgrading programs that bundle place-based improvements with informal tenure security can exacerbate these frictions through the interplay of entitlement effects, crowding, and land fragmentation. Below, we capture this in a stylized model with formalization costs and show how slum upgrading can increase land values in the short run but also delay the redevelopment of kampungs in the long run.

3 Data

This section discusses our key data sources. We begin by assembling policy maps and granular data on our primary outcomes, assessed land values and building heights. We additionally contribute novel measures of informality and land fragmentation. We complement these sources with census data, data on land use and amenities, and historical maps. Unless stated otherwise, the unit of observation is a 75 meter-by-75 meter pixel that we obtain by superimposing a grid over the of Jakarta (the full grid has 89,463 pixels). Each pixel has a size comparable to the land area required for an average high-rise development project in Jakarta, based on reports from the Jakarta City Agency.
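As an illustration of the pixel construction described above, the following minimal sketch builds a regular 75 meter-by-75 meter grid over a bounding box and assigns a point to its pixel. It assumes coordinates are already in a projected system measured in meters; the toy bounding box, function names, and row-major indexing are hypothetical and are not taken from the paper. Each such pixel covers 75 × 75 = 5,625 square meters, or roughly 0.56 hectares.

```python
import numpy as np

CELL = 75.0  # pixel side length in meters, as stated in the text

def make_grid(x_min, y_min, x_max, y_max, cell=CELL):
    """Return the lower-left corners of square cells covering a bounding box,
    ordered row by row (south to north), plus the grid dimensions."""
    n_cols = int(np.ceil((x_max - x_min) / cell))
    n_rows = int(np.ceil((y_max - y_min) / cell))
    xs = x_min + cell * np.arange(n_cols)
    ys = y_min + cell * np.arange(n_rows)
    gx, gy = np.meshgrid(xs, ys)  # shape (n_rows, n_cols)
    corners = np.column_stack([gx.ravel(), gy.ravel()])
    return corners, n_rows, n_cols

def pixel_index(x, y, x_min, y_min, n_cols, cell=CELL):
    """Map a point inside the bounding box to its row-major pixel index."""
    col = int((x - x_min) // cell)
    row = int((y - y_min) // cell)
    return row * n_cols + col

# Toy bounding box (hypothetical): 750 m wide and 300 m tall.
corners, n_rows, n_cols = make_grid(0.0, 0.0, 750.0, 300.0)
idx = pixel_index(80.0, 10.0, 0.0, 0.0, n_cols)
print(len(corners), n_rows, n_cols, idx, corners[idx])
```

Outcomes such as assessed land values or parcel counts could then be aggregated to this pixel index, which is one simple way to obtain a common unit of observation across data sources.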

3.1 Policy maps

We obtained permission to use a 2011 publication by the Jakarta Department of Housing (DPGP, 2011), consisting of more than 200 physical maps of Jakarta, with a detailed indication of KIP boundaries as well as KIP investments. Extensive ground surveying was performed by the Jakarta Department of Housing mapping team to ensure accuracy. We were able to access the raw AutoCAD files that form the basis of these maps, achieving a high level of precision (1:5000 scale, or a resolution of 2.5 meters) in georeferencing and tracing KIP boundaries and locations of KIP investments. Figure 1 displays KIP treated areas as unshaded polygons. To implement the boundary discontinuity exercise, we manually select a subset of 125 “clean” boundary segments such that the control group is not contaminated by treated areas nearby (Turner et al., 2014). Figure A3 displays the boundaries used in our discontinuity analyses, which are evenly distributed across Jakarta. Details of this selection procedure are provided in the Data Appendix.

One of the goals of the government was to make a detailed inventory of KIP investments in Jakarta. In addition to KIP policy boundaries, these maps also detail the individual assets provided as part of KIP: infrastructure (the network of vehicular and pedestrian road segments), sanitation facilities (garbage collection bins, public taps, public toilets, deep water wells, drainage canals), and community buildings (markets, health centers, and schools). An example map is provided in Figure A4.

3.2 Assessed land values

We observe assessed land values in Jakarta from a digital map created through the Smart City Jakarta initiative. In developing countries, it is notoriously challenging to obtain reliable data on land values. In the absence of directly observed market transaction data, researchers often turn to tax record data, which may suffer from underreporting due to tax evasion. The data we employ are based on a computer-assisted mass appraisal assessment approach, recently adopted by the Jakarta tax office in an effort to improve transparency and property tax collection. The newly developed valuation model, similar to those used in developed countries, utilizes data from brokers, online websites, and administrative and notary offices, including market transaction prices. As discussed above, transactions of unregistered land are recorded in parcel books in the local administrator’s office and kampung residents pay local taxes as a proof of residency. Further adjustments are then made, based on property and neighborhood attributes, data quality, as well as the timing of the transaction. The land office also implements field visits to verify property and location attributes. The estimated property value is then decomposed into a building component (imputed using replacement costs) and a land component (nilai tanah), which is what we consider. The property tax due for each building is calculated as the assessed land value multiplied by the area of the building.

The result of this process is a database with assessed land values in Rupiah per square meter at the sub-block level, where a sub-block is the smallest unit in zoning and city planning maps. The dataset covers 20,000 sub-blocks, shown in Figure A5. The average land value in our sample of sub-blocks is 12 million Rupiahs per square meter (around 90 USD per square foot).

To gauge the extent to which assessed values reflect market prices, we also scraped and manually geo-referenced a sub-sample of 4,000 transaction prices from Indonesia’s largest property website. Figure A6 presents a scatterplot indicating that assessed values are positively correlated with market transaction prices, with a correlation coefficient of 0.56.

3.3 Building heights from photographic survey

Besides land values, we also examine quantities by assembling a novel dataset of building heights. Our interest in heights is threefold. First, the intensity of land use has important implications because low-intensity land use limits agglomeration externalities and irregular height gradients can reduce connectivity and complicate the provision of public services (Glaeser and Henderson, 2017). Second, observing heights allows us to examine real quantities that validate our assessed land value analysis. Finally, albeit imperfect, building heights can also serve as a proxy for formality, insofar as informal building technologies and weak enforcement of property rights limit investments in high-rises (Henderson et al., 2016).

To measure building heights and informality, we construct a novel photographic survey of 7,104 pixels as follows: we first restrict the full Jakarta grid of 89,463 pixels to two of our primary estimation samples (detailed in Section 5 below): a historical kampung sample, comprising areas that were slums before the implementation of KIP, and a boundary discontinuity sample, comprising areas within a 500 meter distance from KIP boundaries. We then randomly sample from the historical sample and from the boundary sample, resulting in 7,104 pixels when pooled together. We also stratified by distance terciles to the city center to ensure we have adequate coverage of the center, middle, and periphery of Jakarta, and include these strata fixed effects in our regressions. For each of the 7,104 pixels, we obtain 4 photos, corresponding to the north, south, east, and west angles from the pixel’s center. In total, we have 28,000 photos. Figure A7 presents a map of the photo sample.

The main advantage of our approach is the ability to construct a representative sample including both formal and informal areas. We downloaded most of the photos from Google Street View imagery. However, Street View cars are typically too large to go through the narrow streets of kampungs (19% of pixels) and cannot access private gated developments (5% of pixels). For the 1,703 pixels (24%) not covered by Google Street View, we obtain photos from enumerators sent to the field.15 In addition, our approach also overcomes the problem of under-reporting of buildings in administrative records (e.g. due to tax evasion).

We record the height, in floors, of the tallest building in the pixel. If we could not determine the height from the photo, we obtained the information from concierges, leasing offices, or visits to the building’s elevators. Pixels with no buildings (4% of the sample), corresponding to large roads, parks, or empty lots, were assigned a height of 0 and a “no building” dummy; results are robust to excluding those pixels. The average height in our sample is 4 floors, with a median of 2 floors, and a long right tail (a 95th percentile of 19 floors, and a 99th percentile of 44 floors). Our primary height outcome is an indicator that is one if the height is greater than three. We report results for other measures of building height in the Appendix.
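The sampling and outcome definitions above can be illustrated with a minimal sketch. It assumes a simple design in which pixels are ranked by distance to the city center, split into terciles, and sampled with equal counts per stratum; the function names, toy distances, and per-stratum sample size are hypothetical and are not taken from the paper.

```python
import random

def tercile_labels(distances):
    """Label each pixel 0, 1, or 2 by the rank-based tercile of its distance."""
    order = sorted(range(len(distances)), key=lambda i: distances[i])
    labels = [0] * len(distances)
    for rank, i in enumerate(order):
        labels[i] = rank * 3 // len(distances)
    return labels

def stratified_sample(pixel_ids, distances, n_per_stratum, seed=0):
    """Draw up to n_per_stratum pixels at random from each distance tercile."""
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    labels = tercile_labels(distances)
    sample = {}
    for stratum in range(3):
        members = [p for p, lab in zip(pixel_ids, labels) if lab == stratum]
        k = min(n_per_stratum, len(members))
        sample[stratum] = sorted(rng.sample(members, k))
    return sample

def tall_building(floors):
    """Primary height outcome: one if the tallest building exceeds three floors."""
    return 1 if floors > 3 else 0

# Hypothetical toy data: nine pixels with distances (km) to the city center.
pixel_ids = list(range(9))
distances_km = [0.5, 1.0, 1.5, 4.0, 5.0, 6.0, 10.0, 12.0, 15.0]
sample = stratified_sample(pixel_ids, distances_km, n_per_stratum=2)
```

The stratum label returned by `tercile_labels` is what would enter a regression as a strata fixed effect.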

3.4 Measuring informality from photos

Defining and measuring urban informality is notoriously difficult. Frictions associated with legal enforcement, coordination failures, and credit constraints have tangible manifestations in the built environment in the form of low building heights, irregular and non-permanent structures, narrow roads, with implications for connectivity and (Collier et al., 2018). Absent detailed land records and surveys, the literature has relied on remotely-sensed imagery (see Kuffer et al. (2016) for a review), which has comprehensive coverage but misses many attributes that can only be seen from the ground. Recently, ground imagery from Google Street View has been employed to detect urban change in the United States (Naik et al., 2017), but this is problematic for slums, because of the coverage bias discussed above.

We utilize the photographic survey above to develop two innovative indexes to quantify the degree of informality. We first develop a rank-based index which provides a holistic assessment of the neighborhood’s quality. The index takes values ranging from 0 (very formal) to 4 (very informal).16 Examples of photos that we classified can be found in Figure A8. We trained two research assistants from Jakarta who independently performed the assessment. They were instructed to rely on characteristics of the neighborhood (including the density and irregularity of structures, cleanliness) and of the buildings (such as the durability of materials and the size of windows). We then averaged both rankings; our results are robust to different aggregation approaches and to fixed effects that account for subjective differences in the two rankings. The correlation between the two research assistants’ rankings is 0.78. In our analyses below, we define a binary “kampung” indicator which is one for rank-based index values greater than one; our results are robust to using zero and two as threshold values.

15 We obtained permission from the local government to conduct our photographic survey.
16 In particular, a value of 0 corresponds to areas that are completely formalized and comparable to a developed country city; 1 for neighborhoods that appear formal but retain some of the traditional features of kampungs, such as the narrow roads; 2 for kampungs that are overall in good conditions (e.g. they have paved roads and concrete buildings);

Next, we construct an attribute-based index that quantifies features in three important domains: vehicular access to the neighborhood, neighborhood appearance, and the permanence of structures. Guided by the Jakarta government’s criteria to define slums,17 we coded fifteen binary attributes, which are then averaged and standardized within each subsample into a z-score. Due to the complexity and heterogeneity of the imagery, we manually coded the attributes in 28,000 photos, as opposed to relying on machine learning approaches. The resulting index takes positive (negative) values for locations that have a higher (lower) degree of informality than average based on those attributes.

The two indexes are positively correlated (0.64). The rank-based index measures the degree of informality in a comprehensive manner, for quality features that may not be captured by the fifteen attributes. The attribute-based index quantifies the presence of a rich set of clearly definable characteristics. Table A1 reports the correlation between the rank-based index and each of the attributes. The attributes in all three domains are largely positively correlated with the rank-based index.
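As a rough illustration of how the two indexes could be computed, the sketch below averages two raters' rankings, applies the greater-than-one threshold, and turns binary attributes into a z-score. It rests on assumptions not stated in the text: each attribute is coded 1 when the informal feature is present, the population standard deviation is used for standardization, and only three attributes per pixel are shown for brevity instead of fifteen. All names and toy values are hypothetical.

```python
from statistics import mean, pstdev

def rank_based_index(rank_a, rank_b):
    """Average the two research assistants' rankings (each from 0 to 4)."""
    return (rank_a + rank_b) / 2

def kampung_indicator(rank_index, threshold=1):
    """Binary kampung indicator: one if the averaged rank exceeds the threshold."""
    return 1 if rank_index > threshold else 0

def attribute_index(attribute_rows):
    """Average binary attributes per pixel, then standardize across pixels.

    attribute_rows holds one list of 0/1 codes per pixel (1 = informal feature,
    an assumed coding). Returns one z-score per pixel within this subsample.
    """
    raw = [mean(row) for row in attribute_rows]
    mu, sd = mean(raw), pstdev(raw)
    if sd == 0:
        return [0.0 for _ in raw]  # no variation: every pixel is at the mean
    return [(r - mu) / sd for r in raw]

# Hypothetical toy subsample: three pixels, three attributes each.
rows = [[1, 1, 0], [0, 0, 0], [1, 0, 0]]
z_scores = attribute_index(rows)
```

Positive entries of `z_scores` mark pixels that are more informal than the subsample average, matching the sign convention described above.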

3.5 Land fragmentation

Next, we consider land fragmentation. We rely on a series of detailed digital cadastral maps created by the Jakarta Department of Housing in 2011, based on aerial and ground surveying. These maps are made available through the website of the Jakarta Regional Disaster Management Agency and include the outlines of land parcels (Figure A9). We consider the number of parcels in each pixel as a simple metric of land fragmentation that proxies for the costs of land assembly. For concreteness, consider a developer starting a high-rise project that requires an amount of contiguous land roughly as large as one of our pixels. This will require purchasing potentially many adjacent land parcels from current owners, exacerbating holdout problems.

3 for kampungs that are in worse conditions and 4 for areas that are “very informal”.
17 The Jakarta government defines slums using seven criteria associated with housing conditions, access to roads, drainage, access to , sanitation, solid waste management, and fire prevention.

3.6 Population

We also capture population density in hamlets and demographics using the complete count 2010 Population Census (obtained from the Harvard Library Government Documents Group). We observe 10 million individuals in Jakarta and we utilize information on their age, gender, educational attainment, and migration status (based on district of birth and district of residence five years before).18 The finest geographic unit in the population census that we can geo-reference is the hamlet. We consider a hamlet treated if the majority of its pixels are in KIP. The conclusions are similar if we require all pixels or any pixel to be in KIP. In all the corresponding regressions, we average the distance and topography controls at the hamlet level.
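The hamlet-level treatment rule and its two alternatives can be written compactly. The sketch below is illustrative only: it assumes "majority" means strictly more than half of a hamlet's pixels, and the function name and toy input are hypothetical.

```python
def hamlet_treated(pixel_in_kip, rule="majority"):
    """Classify a hamlet as treated from its pixels' KIP status (list of 0/1)."""
    share = sum(pixel_in_kip) / len(pixel_in_kip)
    if rule == "majority":
        return share > 0.5   # baseline rule (ties counted as untreated, an assumption)
    if rule == "all":
        return share == 1.0  # stricter alternative: every pixel in KIP
    if rule == "any":
        return share > 0.0   # looser alternative: at least one pixel in KIP
    raise ValueError(f"unknown rule: {rule}")

# Hypothetical hamlet with four pixels, three of them inside KIP boundaries.
example = [1, 1, 1, 0]
```

For `example`, the majority and any-pixel rules classify the hamlet as treated, while the all-pixels rule does not.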

3.7 Amenities

Our first source for amenities is an administrative map of current land use patterns made available through the website of the Jakarta Government. The map includes the location of retail, office, industrial, and residential properties. From these maps, we construct two measures of commercial density by computing the land share of each pixel corresponding to retail and office buildings respectively. Our source for current public amenities is OpenStreetMap, a publicly available, crowd-sourced database which supplements our administrative source. We measure distance from the center of each pixel to the closest school, hospital, police station, and bus stop.
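The distance-to-amenity measures can be sketched as a nearest-neighbor calculation. This is a minimal illustration that assumes coordinates are already projected to meters, so that straight-line distance is meaningful; the function name and coordinates are hypothetical.

```python
import math

def distance_to_nearest(pixel_center, amenities):
    """Straight-line distance (meters) from a pixel center to the closest amenity.

    pixel_center is an (x, y) pair and amenities a list of (x, y) pairs,
    all assumed to be in a projected coordinate system measured in meters.
    """
    px, py = pixel_center
    return min(math.hypot(px - ax, py - ay) for ax, ay in amenities)

# Hypothetical example: two schools near a pixel centered at the origin.
schools = [(300.0, 400.0), (1000.0, 0.0)]
nearest_school_m = distance_to_nearest((0.0, 0.0), schools)
```

The same function would be applied separately to each amenity type (schools, hospitals, police stations, bus stops).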

3.8 Historical settlements

We identify areas that were kampungs before the implementation of KIP through two maps that we georeferenced and digitized. The first is a 1959 U.S. Army Map Service map of Jakarta (U.S. Army Map Service, 1959) with a scale of 1:50,000 (corresponding to a resolution of approximately 25 meters). Our second source is a 1937 map of the then-Batavia (G. Kolff & Co, 1937), with a scale of 1:23,500 (or a resolution of roughly 11 meters). We consider historical kampungs as areas that are marked as “kampung” in either the 1937 or the 1959 map. These areas correspond to the shaded areas in Figure 1.19 In both maps, kampungs are clearly demarcated as distinct from “built-up” areas originally settled by the Dutch. We also use these maps to trace major historical road arteries.

18 The Census surveys households based on physical dwellings and irrespective of their legal status. However, it is notoriously difficult to track people in slums. To assess the extent to which this differs by KIP status, we consider households living with unrelated individuals (presumably tenants, a common living arrangement among informal households). Reassuringly, the prevalence of such households is very similar in KIP and non-KIP areas, with, if anything, fewer of them in KIP. Results are available upon request. Household sizes are also similar (Table A7).

Taken together, the data sources above form the basis of our empirical strategy. As can be seen in the maps, our database is uniquely rich and high-resolution, with comprehensive coverage of Jakarta. Our primary outcomes are land values and building heights. We also consider informality indexes, land fragmentation, population density, and amenities. In the Appendix, we discuss the sources of other data, including distance and topography controls, administrative boundaries, and market transactions. We also provide further details on the data and variable construction.

Table 1 presents summary statistics for our outcomes and controls. Panel A reports statistics for our outcomes. We have land values for 19,862 sub-blocks from the assessed land values database, building heights and the informality index for 7,104 pixels in our photo sample, as well as land fragmentation and amenities from administrative data, covering the full grid of 89,463 pixels. Panel B reports summary statistics for our distance and topography controls, calculated at the pixel level. In the next Section, we present a simple model to outline how KIP relates to our key variables.

4 Conceptual framework

This Section outlines a stylized framework to characterize urban development in a context with weak land market institutions. We adapt a standard model of spatial equilibrium across neighborhoods (Rosen, 1979; Roback, 1982), introducing a friction in the form of a fixed cost of formalization. The model describes how kampungs (K) and formal (F) neighborhoods arise endogenously in equilibrium, differing by land values (p), building heights (h), and amenities (A).

We illustrate how a neighborhood improvement program directly increases land values in treated neighborhoods in the short run (Kline and Moretti, 2014; Glaeser and Gottlieb, 2008), but can indirectly lead to lower land values relative to control neighborhoods in the long run, in the presence of formalization costs. In the model, we conceptualize slum upgrading programs as increasing neighborhood amenities but also formalization costs. We focus on housing markets to deliver intuition, but our conclusions hold in a full general equilibrium model also incorporating local labor markets. Below we focus on contrasting kampungs and formal neighborhoods, relegating other details, which are standard in the Rosen-Roback framework, to the Theoretical Appendix.

19 Note that there are KIP areas that do not correspond to historical kampungs; these are kampungs that were settled post 1959.

4.1 Housing markets and the formalization process

Our model rests on the building blocks of the Rosen-Roback framework: housing demand, housing supply, and spatial equilibrium.

Housing demand is standard. Consumers, who are homogeneous and freely mobile, choose how much housing to consume, conditional on their choice of neighborhood. Their utility depends positively on housing $h$ and the local bundle of amenities $A$. Specifically, they maximize $U = A C^{1-\alpha} h^{\alpha}$ subject to their budget constraint $C + ph = 1$, where $C$ denotes non-housing consumption and $p$ is the price of housing.

Turning to housing supply, competitive developers choose how many floors of housing space $h$ to build in a given neighborhood. The cost of building at height $h$ is

$$C(h) = h^{\delta} + c\,\mathbb{I}[h > \bar{h}], \quad \text{with } \delta > 1 \text{ and } c > 0. \tag{1}$$

To account for frictions in the development process, we assume that developers face a fixed cost $c$ if they choose to supply buildings with height above $\bar{h}$. The non-linearity in the costs of building high-rises stems from engineering costs (e.g. steel frames), compliance with regulations (e.g. minimum lot sizes), and formalization costs (e.g. land assembly frictions and political costs from clearing slums, as discussed in Section 2.1). We define neighborhoods with heights below $\bar{h}$ as “kampung” (K) and neighborhoods with taller buildings as “formal” (F), respectively. This echoes the urban development model in Henderson et al. (2016), featuring different building technologies and heights in formal and informal settlements.

Given the fixed costs, the developers’ profit maximization problem delivers a discontinuous supply curve, depicted in red in Figure A below. When prices $p$ are sufficiently high relative to the fixed cost $c$, that is, above the threshold $\bar{P}(c)$ marked on the horizontal axis, developers will formalize, i.e. develop at a height above $\bar{h}$ along an upward sloping curve. Otherwise, they will supply at height $h = \bar{h}$ and the neighborhood will remain informal.

To close the model, we impose spatial equilibrium so that consumers are indifferent across neighborhoods and have no incentives to deviate. Neighborhoods are differentiated by the presence of two types of amenities: $A_P$, such as local public goods or favorable topography, which are exogenous to the model, and $A_F$, such as malls or office complexes, which arise endogenously when a neighborhood formalizes in equilibrium.

Using the optimality condition for developers, that for consumers, and the spatial equilibrium condition, we characterize equilibrium prices in a neighborhood as

$$\log p = \frac{1}{\alpha} \log A + \text{const.} \tag{2}$$

where $A$ represents the bundle of amenities. Intuitively, neighborhood prices will provide a compensating differential for amenity differences. The two possible housing market equilibria are depicted in Figure A, where we denote log prices on the horizontal axis and log heights on the vertical axis.
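One way to see where the price equation (2) comes from, using only the Cobb-Douglas preferences and budget constraint stated above, is sketched below. This is an illustrative derivation rather than a reproduction of the Theoretical Appendix, and $\bar{u}$ (the common utility level imposed by spatial equilibrium) and $V$ (indirect utility) are notation introduced here.

```latex
\begin{align*}
\max_{C,\,h}\; A\,C^{1-\alpha}h^{\alpha} \;\; \text{s.t. } C + ph = 1
  &\;\Longrightarrow\; h = \frac{\alpha}{p}, \qquad C = 1-\alpha, \\
V(p, A) &= A\,(1-\alpha)^{1-\alpha}\,\alpha^{\alpha}\,p^{-\alpha}, \\
V(p, A) = \bar{u}
  &\;\Longrightarrow\; \log p = \frac{1}{\alpha}\log A
     + \underbrace{\frac{1}{\alpha}\Big[(1-\alpha)\log(1-\alpha)
     + \alpha\log\alpha - \log\bar{u}\Big]}_{\text{const.}}
\end{align*}
```

The slope $1/\alpha$ reflects that a smaller housing share requires a larger price change to offset a given amenity difference.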

Figure A: Kampung and formal equilibrium

Equilibrium in a kampung. This equilibrium, corresponding to the intersection of $h^D_K$ and $h^S$, conceptualizes what most of Jakarta was like before KIP. When demand for a neighborhood is low relative to formalization costs, the neighborhood is a kampung, characterized by low prices and low heights, constrained at $\bar{h}$. We provide empirical evidence of a bunching pattern in heights in Section 6.2.

Equilibrium in a formal neighborhood. When demand for a neighborhood is high relative to formalization costs, the equilibrium corresponds to the intersection of $h^D_F$ and $h^S$. The neighborhood is formal, with taller buildings and higher prices, which are an increasing function of local amenities, including both $A_P$ and endogenous amenities from formalization $A_F$.

Through the lens of this model, a city starts formalizing once housing demand pushes prices high enough to overcome formalization costs. As a neighborhood becomes formal, its price and building height increase to new levels that also reflect the newly acquired amenities from formalization $A_F$. This echoes Demsetz (1967), where the optimal intensity of land use is achieved when the costs of enforcing property rights are sufficiently low relative to the gains.
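The discontinuous supply curve behind the two equilibria can be illustrated numerically. The sketch below assumes a price-taking developer who maximizes $p h - C(h)$ with the cost function in equation (1) and treats the price as given; this profit function, the function names, and the parameter values are illustrative assumptions, not the paper's calibration.

```python
def supply_height(p, delta, c, h_bar):
    """Height chosen by a developer with cost h**delta + c * 1[h > h_bar]."""
    h_star = (p / delta) ** (1.0 / (delta - 1.0))  # optimum ignoring the fixed cost
    if h_star <= h_bar:
        return h_star  # the fixed cost is never triggered
    profit_formal = p * h_star - h_star ** delta - c
    profit_capped = p * h_bar - h_bar ** delta
    return h_star if profit_formal >= profit_capped else h_bar

def threshold_price(delta, c, h_bar, hi=1e6, tol=1e-9):
    """Lowest price at which building above h_bar is worthwhile (bisection)."""
    lo = delta * h_bar ** (delta - 1.0)  # price at which h_star equals h_bar

    def gain(p):
        h_star = (p / delta) ** (1.0 / (delta - 1.0))
        return (p * h_star - h_star ** delta) - (p * h_bar - h_bar ** delta) - c

    # gain is increasing in p on [lo, hi], so bisection finds its single root
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gain(mid) >= 0:
            hi = mid
        else:
            lo = mid
    return hi

# Hypothetical parameters: delta = 2, height cap h_bar = 2, fixed cost 1 or 4.
low_cost_threshold = threshold_price(2.0, 1.0, 2.0)
high_cost_threshold = threshold_price(2.0, 4.0, 2.0)
```

With these hypothetical values, a higher fixed cost raises the price needed to formalize, so at the same price of 7 the low-cost neighborhood builds above the cap while the high-cost neighborhood stays at $\bar{h}$.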

4.2 Comparing KIP and non-KIP neighborhoods

Below we illustrate how a slum upgrading program such as KIP can impact land values and building heights in a treated neighborhood relative to a non-treated one. Motivated by our discussion in Section 2.2, we model KIP as (i) raising public amenity levels ($A_P$) and (ii) increasing formalization costs ($c$). The former captures direct effects of the on-site improvements, whereas the latter incorporates the indirect effects induced by the ensuing place-based distortions, including residents’ greater propensity to stay in the neighborhood. We characterize the static equilibrium shortly after the introduction of KIP (which we label “short run”) and today (which we refer to as the “long run”). The short-run equilibrium is one in which amenities and demand are low everywhere to start with, so that both neighborhoods begin as kampungs. The long-run equilibrium is one in which the city as a whole has experienced a boost in housing demand, capturing the steep increase in urbanization rates that Jakarta has experienced in recent years.

In the short run, KIP has a positive direct effect on land values: the improved amenities ($A_P$) attract population into the treated neighborhood, inducing an outward shift in the demand curve, which bids up prices. Thus, in the short run, KIP neighborhoods will have higher prices than non-KIP ones.20

Figure B: Long-run, KIP neighborhood
Figure C: Long-run, non-KIP neighborhood

In the long run, the housing demand curve shifts outward, reflecting in-migration from outside Jakarta. Short-run direct differences in $A_P$ between KIP and non-KIP neighborhoods have converged by this point, through depreciation of the initial investments or non-KIP neighborhoods catching up, which we demonstrate empirically in Section 7.2 below. Crucially, formalization costs $c$ are higher in the treated neighborhood, corresponding to an outward extension of the flat portion of the supply curve in Figure B. If the increase in demand is not large enough to overcome these higher formalization costs, the KIP neighborhood remains informal, with heights at level $\bar{h}$, despite the neighborhood being in high demand.

By contrast, the non-KIP neighborhood will be in a formal equilibrium with higher prices and taller buildings. In non-KIP neighborhoods, where formalization costs did not change, the increase in demand is now large enough to justify formalization. In Figure C, the same long-run demand curve intersects the supply curve in its upward sloping portion. Moreover, through endogenous amenities from formalization $A_F$, there is a further outward shift in demand in formal neighborhoods.

In sum, while the program had the direct effect of improving amenities and land values in the short run, in the long run it delays the redevelopment of areas that now have high market potential, echoing dynamic inefficiency concerns. The delayed formalization in KIP neighborhoods represents a deadweight loss to society, which in our model is captured by the forgone formal amenities ($A_F$) that would have been realized absent KIP. The surplus from $A_F$ is entirely given up, as neither developers nor consumers can capture it due to weak property rights.

While we conceptualize the fixed cost $c$ as an exogenous constant, in reality formalization costs reflect the equilibrium interaction of several forces reinforcing each other and amplifying over time. Absent well-defined property rights, KIP-provided amenities and the non-eviction guarantee make residents more likely to stay in the neighborhood and subdivide land, leading to greater population density and land fragmentation over time. At the same time, because of the political costs of clearing dense slums, entitlement perceptions will be reinforced by high population density (Jimenez, 1985). Both population density and land fragmentation thus contribute to increasing formalization costs endogenously.

In the Appendix we consider possible extensions of the model, including (i) mobility costs, (ii) heterogeneous consumers, (iii) agglomeration and congestion, and (iv) spillovers across neighborhoods, all of which preserve the intuitions highlighted above. In addition, we acknowledge that some benefits of KIP may not be fully capitalized in the presence of market imperfections. For example, person-based benefits are capitalized into land values through the crowd-in of private investments in the neighborhood (e.g. housing upgrades). These crowd-in effects are probably limited due to credit constraints and weak property rights (Field, 2007).

We test the model’s predictions on land values (p) and building heights (h) in Section 6. In Section 7.1 we examine empirically some of the equilibrium correlates of formalization costs $c$, providing evidence on population density and land fragmentation. We also consider amenities $A_P$ and $A_F$ in Section 7.2.

20 Beyond the forces at play in our model, land values may also increase in KIP areas because of crowd-in of private investments by residents, as discussed in Section 2.2.

5 Empirical framework

To estimate the long-run impacts of KIP on land values and building height, we develop empirical strategies that leverage the high-resolution policy maps, the large scale of the program, and rich data. We consider the following regression model linking current outcomes (Y) to KIP treatment status and an index capturing local unobserved market potential (ξ):

$$Y_{ij} = \alpha + \beta\,\mathrm{KIP}_{ij} + \xi_j + \varepsilon_{ij} \tag{3}$$

where unit $i$ is a sub-block (land values) or pixel (heights) in neighborhood $j$ and $\varepsilon_{ij}$ is an error term. The parameter of interest is $\beta$, which captures the long-term impacts of KIP.

The main threat to identification is program selection bias because KIP planners formulated a scoring rule to prioritize kampungs with low quality. To the extent that historical differences are persistent, KIP areas may have worse outcomes today due to the selection bias ($E[\xi \mid \mathrm{KIP} = 1] - E[\xi \mid \mathrm{KIP} = 0] < 0$). While we do not observe the scores, we develop empirical strategies that restrict our control groups to non-KIP areas that likely had similar scores as treated ones: kampungs existing before KIP (historical sample) or areas next to KIP boundaries (boundary sample). We also use the sequential rollout of KIP across three waves to empirically assess selection bias in Section 6.3.

Our first strategy restricts the sample to historical kampungs that existed before KIP and includes locality fixed effects.21 Our thought experiment involves two locations (T and C) in the same locality that were both kampungs before KIP. Unconditionally, T had a lower $\xi_{pre}$ than C, and was selected into KIP on the basis of the scoring rule. However, over time, massive urbanization and transformations under the new master plans introduced large shocks common to both T and C. As we describe in the model, we assume that, had KIP not increased formalization costs, T and C would have similar market potential today, conditional on observables and fine-grained locality fixed effects. Standard errors are clustered by locality.

Our second strategy is to implement a boundary discontinuity design (BDD) comparing observations located on either side of KIP boundaries. The identification assumption is that unobserved market potential varies smoothly at the program boundaries,22 conditional on the controls. According to World Bank (1995), developers accounted for KIP status as they selected sites for development, likely perceiving formalization costs to be greater in KIP. The strength of this strategy is that we can restrict the sample to narrow distance bands (within 200 meters), which holds constant market potential today. We label this the boundary sample. We include boundary fixed effects and quadratic distance controls to KIP boundaries; the results are robust to other types of distance controls (e.g. logs). Standard errors are clustered by KIP boundaries.

We have nine controls capturing access and topography. These include distance from the National Monument (the historical city center), from the Stock Exchange Building, and from historical main roads, as well as various topography and flood proneness controls.23 The full list of controls can be found in Table 2 and all variables are described in the Data Appendix. Our results are unaffected if we drop the controls that are not pre-determined (see Section 8.5).

Table 2 assesses the comparability between KIP and non-KIP areas using granular fixed effects. Each pair of cells reports the coefficient and p-value, in square brackets, from a regression of location attributes $X_{ij}$ on the treatment dummy and geographic fixed effects. The first three columns correspond to the sub-block level dataset (for the land values analysis), followed by the pixel-level dataset (for other outcomes).

As a preliminary step, columns 1 and 4 show that KIP and non-KIP areas differ in these characteristics, when we condition on coarse district fixed effects. Next, we show that these differences become smaller for our two primary specifications. Columns 2 and 5 (3 and 6) restrict to the historical kampung specification (boundary specification).

21 Localities are fine geographic units, comparable to U.S. census tracts. They have an average area of 2.5 square kilometers.
While there are four variables that show statistical significance, none of them can explain our findings. Distance to the Stock Exchange and slope are not economically significant.24 Next, differences in the distance from waterways would bias against our results.25 Lastly, the flood-proneness dummy is significant for the boundary analysis in column 3 only and insignificant for all other specifications. This likely reflects spurious differences due to how the variable is constructed.26 Notably, our main results are nearly unchanged if we drop the latter two variables (see Section 8.5).

22 Moreover, KIP neighborhood boundaries are pre-determined because they largely depend on hamlet boundaries defined during World War II by the Japanese for security purposes.
23 Jakarta lies on a coastal lowland and parts of the city are frequently paralyzed by flooding.
24 KIP pixels are only 0.5% closer to the Stock Exchange (Panel A, column 6), and this difference, if anything, would bias against finding a lower land value in KIP. Differences in slope are also small (0.24 and 0.21 degrees relative to a mean of 2 degrees, columns 5 and 6 in Panel B).
25 The estimated differences are between 14% and 27% (Panel B, columns 3, 5, and 6) but locations that are farther from waterways have higher land values, perhaps because they tend to be less flood prone.
26 The flood-proneness variable is defined as a binary hamlet-level indicator, even though actual flood proneness varies

6 Main results

Our model suggests that, if KIP increases formalization costs in treated areas, KIP neighborhoods could have lower land values (p) and building heights (h) than control areas in the long run. In this Section, we take these predictions to the data.
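To make the fixed-effects comparison concrete, the sketch below computes the coefficient on a KIP dummy after demeaning the outcome and the dummy within each locality, which is numerically equivalent to including locality fixed effects when KIP is the only regressor. It is a simplified illustration on made-up data: it omits the controls and the clustered standard errors, and all names and values are hypothetical.

```python
from collections import defaultdict

def within_estimator(y, kip, group):
    """Slope on KIP from OLS after demeaning y and KIP within each group."""
    sums = defaultdict(lambda: [0.0, 0.0, 0])  # per group: sum of y, sum of kip, count
    for yi, ki, g in zip(y, kip, group):
        s = sums[g]
        s[0] += yi
        s[1] += ki
        s[2] += 1
    num = den = 0.0
    for yi, ki, g in zip(y, kip, group):
        sum_y, sum_k, n = sums[g]
        y_dm, k_dm = yi - sum_y / n, ki - sum_k / n
        num += y_dm * k_dm
        den += k_dm * k_dm
    if den == 0:
        raise ValueError("no within-group variation in KIP status")
    return num / den

# Hypothetical data: log land values equal a locality effect plus -0.12 * KIP.
group = ["A", "A", "A", "A", "B", "B", "B", "C", "C"]
kip = [1, 0, 1, 0, 1, 1, 0, 0, 0]
effects = {"A": 10.0, "B": 20.0, "C": 15.0}
y = [effects[g] - 0.12 * k for g, k in zip(group, kip)]
beta_hat = within_estimator(y, kip, group)
```

Locality C has no variation in KIP status, so its observations contribute nothing to the estimate, which is why only localities containing both treated and untreated units identify the coefficient.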

6.1 Effect of KIP on land values

Table3 presents our main results for the effect of KIP on land values (columns 1 and 2). The dependent variable is the log of price per square meter in a sub-block, from the assessed land values database. Column 1 presents the historical kampung specification and column 2 presents the boundary analysis. We control for nine distance and topographic attributes. Our baseline estimate in column 1 shows that KIP areas have 12% lower land values compared to historical kampungs within the same locality that were not part of KIP. This specification in- cludes 196 locality fixed effects and is identified from 88 localities spread across Jakarta that have variation in KIP status. The magnitude is large, translating into an aggregate effect of $11 billion US dollars.27 Deflating back to 1969 and dividing by the number of beneficiary households sug- gest the long-run cost per household is slightly more than twice the annual household income in 1969.28 This negative impact can be interpreted as a long-run opportunity cost of land use in KIP areas, consistent with the prediction in the model that non-KIP areas may formalize first if formalization costs are greater in KIP. As with any spatial analysis, it is difficult to quantify how much of the $11 billion represents an aggregate effect versus spatial reallocation of development activity. A full general equilibrium model that calibrates the city’s counter-factual spatial development and accounts for agglomeration externalities would be needed to perform the decomposition exercise.

continuously in space. Given that KIP boundaries are sometimes near hamlet boundaries, a small change in actual flood proneness as the KIP boundary is crossed may be recorded as a discontinuous jump from 0 to 1 in our binary indicator variable. 27The average assessed value for historical kampungs is 12 million Rupiahs per square meter (around US$89 per square foot), translating into an effect size of 1.4 million Rupiahs per square meter, or US$11 per square foot (at an exchange rate of 13,371 Rupiahs to US dollars). We obtain the aggregate effect by multiplying by the total area under KIP, 10,000 hectares. 28We use the GDP deflator (International Monetary Fund, 2018) and divided the aggregate amount by the five million original beneficiaries mentioned in KIP reports assuming five persons per household, and 45,000 Rupiahs of monthly household income KIP(1969). 23 Nevertheless, in Section 8.4, we provide suggestive evidence that our results do not merely reflect a reshuffling of development activity. In column 2, we present our boundary discontinuity analysis showing a similar effect (-13%) comparing observations within 200 meters of KIP boundaries. We include 124 boundary fixed effects and are identified using boundary pairs that provide a different source of variation than the historical sample. In Section8, we show that the boundary analysis survives additional robustness checks to address challenges related to spatial spillovers, stability across different distance bands, and confounding between KIP boundaries and administrative boundaries.
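The aggregate figure can be checked with simple arithmetic. The sketch below takes the rounded inputs quoted in the text and in footnote 27 as given (the function and constant names are illustrative, not the authors'), so it reproduces the order of magnitude rather than the exact reported number.

```python
# Back-of-the-envelope aggregate land value effect, using rounded figures from the text.
AVG_VALUE_RP_PER_M2 = 12_000_000  # average assessed value in historical kampungs (Rupiahs per m2)
KIP_EFFECT = 0.12                 # 12% lower land values in KIP (Table 3, column 1)
KIP_AREA_HECTARES = 10_000        # total area under KIP
RP_PER_USD = 13_371               # exchange rate quoted in footnote 27
M2_PER_HECTARE = 10_000           # unit conversion


def aggregate_effect_usd(avg_value_rp, effect, hectares, rp_per_usd):
    """Loss per square meter times total KIP area, converted to US dollars."""
    loss_per_m2 = avg_value_rp * effect
    total_m2 = hectares * M2_PER_HECTARE
    return loss_per_m2 * total_m2 / rp_per_usd


aggregate_usd = aggregate_effect_usd(
    AVG_VALUE_RP_PER_M2, KIP_EFFECT, KIP_AREA_HECTARES, RP_PER_USD
)
print(f"Aggregate effect: US${aggregate_usd / 1e9:.1f} billion")
```

With these rounded inputs the result falls just below the reported US$11 billion; the small gap presumably reflects rounding in the quoted averages.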

6.2 Effect of KIP on heights

Columns 3 and 4 of Table 3 test the second prediction of the model that building heights are lower in KIP neighborhoods. The unit of analysis is a 75-meter pixel and the dependent variable is a dummy indicating whether the tallest building in a pixel has more than three floors. We add sampling strata fixed effects (from our photographic survey) as well as a dummy for pixels with no buildings (corresponding to public spaces and roads; see Section 3.3).

Our historical sample estimate suggests buildings in KIP are 12 percentage points (p.p.) less likely to have more than three floors, relative to observably identical non-KIP areas. This is half of the control group mean (0.24). The estimate for the boundary sample is similar (9 p.p.). In terms of the number of floors, we estimate an effect of -1.6 floors for the historical sample, relative to a mean of 5 floors for the control group (see Table A2, column 1 for number of floors and the other columns for different height outcomes).

Figure 2 shows that the negative effect is driven by non-KIP areas having more tall buildings. Here, we plot the distributions of building heights for the treated (shaded bars) and control (unshaded bars) groups in historical kampungs. Interestingly, there is a striking bunching pattern at two floors for KIP neighborhoods. This is consistent with the model, whereby buildings in KIP neighborhoods bunch around the h̄ threshold, corresponding to the maximum height attainable with the informal building technology. By contrast, at the right tail of the distribution, non-KIP areas have twice as many buildings with five to ten floors, and likewise for buildings with more than ten floors.

Importantly, we demonstrate that the magnitudes we find for building heights corroborate our land values analysis. Translating our height effects to land values, we show that the lack of tall buildings in KIP explains 88% of the aggregate land value impact, lending support to our -12% estimated impact for land values (Table 3, column 1) and mitigating the concern that it may be confounded by differences in the accuracy of land values data in KIP and non-KIP areas. Specifically, we estimate a hedonic regression using land values in non-KIP historical kampungs only and indicators for medium and tall buildings, respectively.29 We then apply the estimated price premium to impute the aggregate loss from having fewer tall buildings.30 This approach circumvents concerns around measurement error in informal KIP areas (by using land values in non-KIP areas only), and overcomes sample coverage bias (by using heights data from a representative sample). The results are similar using formal areas (defined using our informality index) to estimate the price premium.
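Two pieces of arithmetic underlie this comparison: converting hedonic coefficients from log points to percentage premia, and expressing the imputed heights effect as a share of the land value effect. The sketch below uses only the figures reported in footnotes 29 and 30; the helper name is hypothetical.

```python
import math


def log_points_to_percent(log_points):
    """Convert a coefficient expressed in log points into a percentage change."""
    return (math.exp(log_points / 100.0) - 1.0) * 100.0


# Hedonic premia reported in footnote 29 (in log points).
medium_premium_pct = log_points_to_percent(22)  # buildings with four to ten floors
tall_premium_pct = log_points_to_percent(42)    # buildings with more than ten floors

# Aggregate figures reported in footnote 30 (US$ billions).
heights_effect_total = 2.1          # reported sum of the tall and medium components
land_value_effect_historical = 2.4  # aggregate land value impact, historical kampung sample

share_explained = heights_effect_total / land_value_effect_historical

print(f"Medium-height premium: {medium_premium_pct:.1f}%")
print(f"Tall-building premium: {tall_premium_pct:.1f}%")
print(f"Share of land value impact explained: {share_explained:.1%}")
```

The reported components (US$1.3 billion and US$900 million) are themselves rounded, so they need not add up exactly to the reported US$2.1 billion total.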

6.3 Program selection bias

Recall that KIP planners formulated a scoring rule that prioritized low-quality kampungs. Here, we use the sequential rollout of KIP across the three Pelita waves (five-year plans) to empirically assess program selection bias (E[ξ|KIP = 1] − E[ξ|KIP = 0] < 0). Specifically, we augment Equation 3 by decomposing the overall KIP indicator into three dummies corresponding to the three KIP waves, to assess whether E[ξ|KIP_I] < E[ξ|KIP_II] < E[ξ|KIP_III]. Below, we first trace out a monotonic pattern indicative of the earliest wave having the lowest quality, followed by the second, then third wave. Critically, we next show that this monotonic pattern disappears in our historical kampung specification.

Column 1 of Table 4 uses the full administrative data from land assessments to estimate a monotonic pattern in line with the selection rule, with estimates for the three waves being -0.52 (wave I), -0.33 (wave II), and -0.14 (wave III). We control for district fixed effects (the selection rule specified that KIP had to be distributed evenly across the five districts of Jakarta), as well as distance and topography controls.

We report p-values from one-sided hypothesis tests indicating that the monotonic pattern in column 1 is statistically significant, consistent with planners prioritizing lower-quality areas. Specifically, we find that -0.52 is statistically significantly less than -0.33 (we reject the one-sided null that the first wave is better than the second wave with a p-value of 2.2%). Similarly, for the second and third wave, we find that -0.33 is statistically significantly less than -0.14 (p-value of 6.5%). To assess both one-sided hypothesis tests jointly, we also implemented 10,000 bootstrap simulations, finding that 90% of the iterations resulted in estimated coefficients for the three waves that corroborate the selection rule, β_I < β_II < β_III.

Reassuringly, the differences in column 1 become insignificant in our historical kampung specification (column 2). Here, we restrict the sample to kampungs predating KIP, which likely had similarly low ξ, and we include 196 locality fixed effects. The corresponding estimates are -0.13 (wave I), -0.09 (wave II), and -0.14 (wave III), relative to historical kampung settlements within the same locality that were not in KIP. The p-values for the separate one-sided tests are much larger now (28% and 68%). In the bootstrap simulations, the three coefficients are monotonic in only 22% of the 10,000 iterations, compared to 90% before.

Next, column 3 shows that the conclusions remain the same even after accounting for differences in program design across the three waves. For example, Figure A2 shows that earlier KIP waves were implemented in more central and older parts of the city. In addition to our baseline distance controls and locality fixed effects, we further control for distance terciles from the center. Moreover, the investments provided by each of the three waves were not identical: for example, the first wave focused on sanitation facilities and the paving of footpaths. We account for this by controlling for the intensity of KIP-provided investments.31

In columns 4 through 6, we replicate a similar analysis for building heights and our results are consistent, with no monotonic pattern in the historical kampung specification. As discussed in Section 3, the photo sample for heights is already restricted to historical kampungs and areas near KIP boundaries, which plausibly had similarly low ξ. Therefore, it is not surprising that in column 4 (where we pool historical and boundary samples) we cannot detect a monotonic selection pattern. By contrast, in column 1, the assessed land values data covers all of Jakarta.

Taken together, while we observe large initial differences indicative of program selection bias, it is reassuring that these differences are greatly attenuated in the historical kampung specifications. These results are also in line with descriptions of the of KIP and non-KIP kampungs documented in World Bank (1995). Additionally, there should be little heterogeneity in treatment effects associated with the timing of the investments: as we discuss in Section 7.2 below, we find evidence that the direct effects of KIP investments are small by now, plausibly due to depreciation. Finally, it is also unlikely that areas treated earlier display muted effects simply because they had more time to redevelop, as the redevelopment process would only take off in the mid-2000s (see Section 2).

29The key variables are a dummy for buildings with four to ten floors and a dummy for buildings with more than ten floors. The omitted group represents buildings with three or fewer floors. We include the standard controls for the heights specification, including locality fixed effects. We find a land value increase of 22 log points (25%) for medium-height buildings and 42 log points (52%) for tall buildings.
30KIP areas have 146 fewer buildings with more than ten floors. Combined with the 52% price premium in price per square meter (relative to a price of 13.4 million Rupiahs per square meter for the omitted group), and assuming a building in each pixel, the total heights effect (expressed in terms of land value) is US$1.3 billion. The effect for buildings from four to ten floors is US$900 million. Therefore, the aggregate impact from the heights analysis is US$2.1 billion, which is 88% of the aggregate land value impact for the historical kampung sample, US$2.4 billion.
31From our detailed policy maps, we can measure KIP roads, sanitation facilities, and public buildings located within 500 meters of each observation (see Section 7.2 below).
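The wave-by-wave decomposition and the bootstrap check of monotonicity in Section 6.3 can be illustrated on synthetic data. The sketch below is not the authors' estimation code: the data are simulated, the effect sizes are assumed, controls and fixed effects are omitted, and a simple observation-level bootstrap stands in for whatever resampling scheme was actually used.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic sample (illustrative only): wave 0 = never in KIP, 1 to 3 = KIP waves I to III.
n = 3000
wave = rng.integers(0, 4, size=n)
assumed_effects = np.array([0.0, -0.5, -0.3, -0.1])  # assumed values, not estimates from the paper
y = assumed_effects[wave] + rng.normal(0.0, 1.0, size=n)


def wave_coefficients(y, wave):
    """OLS of y on an intercept and three wave dummies; returns (beta_I, beta_II, beta_III)."""
    dummies = [(wave == k).astype(float) for k in (1, 2, 3)]
    X = np.column_stack([np.ones(len(y))] + dummies)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[1:]


def monotone_share(y, wave, n_boot=500, rng=rng):
    """Share of bootstrap resamples in which beta_I < beta_II < beta_III."""
    hits = 0
    for _ in range(n_boot):
        idx = rng.integers(0, len(y), size=len(y))  # resample observations with replacement
        b1, b2, b3 = wave_coefficients(y[idx], wave[idx])
        hits += int(b1 < b2 < b3)
    return hits / n_boot


share = monotone_share(y, wave)
print(f"Share of resamples with a monotonic wave pattern: {share:.2f}")
```

A share close to one indicates that the ordering of the wave coefficients is stable across resamples; a share near the value expected under no ordering indicates that the pattern is not distinguishable from noise.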

7 Why do upgraded areas have low land values and heights?

Below, we investigate potential channels that can explain why KIP areas have lower land values and building heights. We first examine whether KIP areas are more informal and quantify the proximate barriers to formalization. Then, we investigate differences in amenities (A_F and A_P in the model) and educational attainment in KIP.

7.1 Barriers to Formalization

In our model, differences in land values and heights between KIP and non-KIP areas are driven by higher formalization costs in KIP neighborhoods. As we take the model to the data, one key challenge we had to overcome was measurement as there is little data to quantify the various aspects of formalization costs. We first use our novel informality indexes to establish that KIP areas are more likely to remain informal. Then, we investigate two factors associated with informality: land fragmentation and population density.

KIP areas are more informal. Consistent with our narrative of delayed formalization, we find that KIP areas are more likely to be informal kampungs. Figure 3 plots the distribution of our rank-based informality index for treated and control pixels, within the historical kampung sample. Here, 0 indicates very formal areas and 4 indicates very informal areas. There is a continuum across the index values, reflecting the varying degrees of informality in a city undergoing urban transformation. In line with the patterns in the building heights histogram, control areas (unshaded bars) are more likely to be formal whilst KIP areas are more likely to have a higher degree of informality, as per our index. In particular, non-KIP areas are more likely to bunch at 0. Interestingly, there is a sizable share of KIP areas that have formalized.

In Table 5 we examine historical kampungs to show that KIP areas are 15 p.p. more likely to be informal kampungs today, relative to non-KIP areas that were already informal before KIP (column 1). The dependent variable is 1 if the rank-based informality index (the average of the codings of both research assistants) is strictly above one. The mean for non-KIP historical kampungs is 40%. The effect is 20 p.p. for the boundary analysis.32 Table A3 shows that our conclusions remain the same using different thresholds for the indicator besides one, and also pooling both research assistants’ rankings and adding a fixed effect for one of them. Columns 3 and 4 use the rank as the dependent variable, instead of an indicator.

Columns 5 and 6 report effects on the attribute-based index, echoing those on the rank-based index. In historical kampungs, the summary index is 0.05 standard deviation units higher (more informal) in KIP areas, relative to a control group mean of zero. Table A4 shows that KIP areas have poorer vehicular access (as proxied by width of roads, quality of pavements, etc.), lower neighborhood quality by appearance (as proxied by the presence of trash, low-hanging wires, lack of drainage canals), and less permanent structures. The signs are consistent with the aggregate effects but we lack statistical power.

32Furthermore, we employ the continuous informality index to assess whether, among areas that are still informal (index>1) today, those that were part of KIP are better quality, consistent with direct KIP improvement effects. Accordingly, we find suggestive evidence of KIP having a negative impact on the informality index, for informal settlements outside of central Jakarta, but we caution that these findings are not robust.

KIP areas have higher population density and land fragmentation. Next, we provide evidence on two factors jointly contributing to formalization costs: land fragmentation and population density. The land assembly and relocation costs associated with kampung redevelopment are more pronounced in areas with high population and parcel density. Entitlement effects are also likely to be stronger in denser communities. Together, these factors reinforce each other as residents are encouraged to stay in the neighborhood, which endogenously increases formalization costs in the long run. In contrast with the informality results above, we are not aware of data on parcel density or population density before 1969.

We begin by considering the number of land parcels within a pixel as a novel proxy for land assembly costs, capturing the intuition that more claimants exacerbate holdout problems. Besides our standard set of distance and geography controls, we also control for the log of total road length in the pixel, as the presence of road intersections may mechanically increase observed fragmentation. Column 1 of Table 6 shows that KIP areas are more fragmented, with an average of 9 more parcels per pixel, relative to a mean of 19 parcels in non-KIP areas in historical kampungs. A back-of-the-envelope exercise implies that KIP’s impact of raising the parcel count by 9 translates into a 9% effect on prices (75% of the overall 12% price effect).33 The BDD estimate is larger (13 parcels, column 2). The implied effect on building height is -0.045 (38% of the overall -0.12 effect on the likelihood of tall buildings).

In a similar vein, column 3 of Table 6 indicates that population density in KIP hamlets is greater by 34 log points (40%) in the historical sample. The unit of analysis is a hamlet and we control for locality fixed effects as well as distance and topography controls averaged to the hamlet level. The effect size translates into 51 more people per pixel, consistent with the parcel density estimates.34

33We estimate that one additional parcel per pixel in KIP areas is associated with a 1% decline in land values within historical kampungs in KIP (Table A5, column 1).
34Assuming four individuals per parcel (the average household size in the Census data), 9 more parcels per pixel implies 36 more individuals per pixel. This figure could approach 51, since some parcels may have two floors (the median building height in KIP) and one household per floor.

Greater population density in KIP could be driven by differential migration, fertility, or mortality. Decomposing the effects would require granular data on year of in-migration, exit rates, and vital statistics, which are not available. Nevertheless, we do not see differences in birth migrants and five-year migrants (Table A6, columns 1 and 2). Additionally, Table A7 shows no differences in household size and proxies for fertility (number of live births per woman) and mortality (number of child deaths per 1000 live births).

Taken together, the results above are consistent with the interpretation that KIP may have induced residents to stay longer than they otherwise would. Over time, the ensuing household formation and parcel sub-division could have contributed to greater population and parcel density in KIP areas.

Heterogeneity analyses for parcel and population density. In Table A8, we further investigate whether parcel and population density are higher in KIP areas because (i) non-KIP areas are formal with already assembled parcels; or (ii) KIP areas have a higher density of parcels and people as a direct effect of KIP. We find supportive evidence of the latter channel when we examine parts of the city where neighborhoods have not yet formalized (periphery or informal areas per our index). Indeed, KIP areas are more fragmented and have higher population density in those areas, consistent with a direct effect of KIP on land fragmentation and population density that predates assembly of control areas.

Next, we explore heterogeneity in physical features of the KIP design by manually identifying grid networks in KIP-provided paved roads by the presence of 90-degree angles and straight lines. An orderly, grid-like layout of blocks could reduce fragmentation by easing coordination failures (Fuller and Romer, 2014) and by encouraging spatially contiguous development (Baruah et al., 2017). Interestingly, column 5 of Table A8 shows that being part of a vehicular grid road network reduces fragmentation, with a large enough effect size to almost fully offset the direct KIP effect on land fragmentation. This is suggestive of the benefits of planning regular blocks ahead and consistent with Michaels et al. (2017).

Finally, we investigate whether congestion costs associated with higher population density may reduce land values directly, through channels unrelated to formalization costs. To assess direct impacts of congestion on land values abstracting from KIP, we consider 45 informal non-KIP hamlets with high population density and estimate spatial decay in land values away from these hamlets. Notably, Figure A10 shows no spatial decay that is large enough to explain our effects. We can statistically reject (at the 5% level) that the direct effect of population density on land values can explain our effect sizes (-0.12 to -0.13).
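The back-of-the-envelope figures for parcel fragmentation in Section 7.1 follow from a few multiplications and ratios. The sketch below simply restates them with the numbers quoted in the text and footnotes 33 and 34; the variable names are illustrative.

```python
# Inputs quoted in the text and footnotes.
extra_parcels = 9                # additional parcels per pixel in KIP (Table 6, column 1)
price_effect_per_parcel = 0.01   # 1% lower land value per additional parcel (Table A5, column 1)
overall_price_effect = 0.12      # overall KIP effect on land values
implied_height_effect = 0.045    # implied effect on the likelihood of tall buildings
overall_height_effect = 0.12     # overall KIP effect on the likelihood of tall buildings
persons_per_parcel = 4           # average household size assumed in footnote 34

# Price effect implied by fragmentation, and its share of the overall effect.
implied_price_effect = extra_parcels * price_effect_per_parcel
price_share = implied_price_effect / overall_price_effect

# Share of the overall height effect accounted for by fragmentation.
height_share = implied_height_effect / overall_height_effect

# Additional residents per pixel implied by the extra parcels.
implied_extra_people = extra_parcels * persons_per_parcel

print(f"Implied price effect: {implied_price_effect:.0%} ({price_share:.0%} of the overall effect)")
print(f"Share of height effect: {height_share:.1%}")
print(f"Implied extra residents per pixel: {implied_extra_people}")
```

The height share computed here is 37.5%, which the text reports as 38% after rounding.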

7.2 Amenities

Let us now examine amenities, which have their model counterpart in A_P and A_F. Table 7 demonstrates that the original KIP investments (initial A_P shock induced by KIP) have plausibly fully depreciated. Turning to current amenities, Table A9 shows that KIP and non-KIP areas have comparable access to public amenities today (current A_P). However, non-KIP neighborhoods have more endogenous formal amenities (current A_F), consistent with delayed formalization in KIP.

Initial KIP investments. Table 7 rules out that differences in land values between KIP and non-KIP areas are driven by differences in the original KIP investments. Specifically, we examine four primary KIP policy components: vehicular roads, pedestrian roads, sanitation facilities, and public buildings (health centers and schools). We observe the location and type of KIP investments from the policy maps. For each assessed land value observation, we quantify the amount of KIP investments located within a 500 meter buffer, for observations in KIP and non-KIP areas. This allows for the possibility that residents in non-KIP areas were also able to access KIP investments. To quantify road investments, we calculate the length of roads built as part of KIP within 500 meters of each observation. To quantify the prevalence of sanitation facilities and public buildings, we count the number of facilities within 500 meters. Additionally, we de-mean these four measures so that the coefficient on the treatment indicator corresponds to the average treatment effect (i.e. evaluated at the average prevalence of KIP investments).

Column 1 reports the results for the historical sample and column 2 presents results for the full sample. Interestingly, we do not find differential treatment effects by type of investment on current land values. This suggests that differences in initial public investments may have equalized across KIP and non-KIP areas by now. Given that planners assumed a useful life of 15 years, it is plausible that the initial KIP investments, implemented more than four decades ago, have significantly depreciated by now.

Current amenities. Table A9 demonstrates that KIP areas today have similar access to public amenities (A_P), but fewer formal amenities (A_F) relative to non-KIP areas. For public amenities (columns 1 to 4), differences in access to the nearest school, hospital, police station, and bus stop cannot explain our results. This corroborates the discussion in World Bank (1995) that KIP accelerated the provision of amenities in treated neighborhoods, but that non-KIP kampungs converged as a result of broader economic growth in Jakarta.

In contrast, for formal amenities (A_F), columns 5 and 6 show that KIP areas have 2 p.p. lower retail density and 4 p.p. lower office density, in line with our findings of lower land values, lower heights, and more informality. The dependent variables are the share of each pixel with retail or office developments, respectively, according to administrative data on land use patterns.
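The buffer-based investment measures described under "Initial KIP investments" can be sketched as follows. This is a minimal illustration with made-up coordinates, assuming all locations are in a projected coordinate system measured in meters; the function names are hypothetical and the actual construction from the policy maps may differ.

```python
import numpy as np


def count_within_buffer(observations, facilities, radius_m=500.0):
    """Count facilities within radius_m of each observation.

    Both inputs are (n, 2) arrays of projected x/y coordinates in meters.
    """
    diffs = observations[:, None, :] - facilities[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=2))
    return (dists <= radius_m).sum(axis=1)


def demean(values):
    """Subtract the sample mean, so interaction terms are evaluated at the average."""
    values = np.asarray(values, dtype=float)
    return values - values.mean()


# Made-up example: three land value observations and three sanitation facilities.
obs = np.array([[0.0, 0.0], [1000.0, 0.0], [3000.0, 0.0]])
facilities = np.array([[100.0, 0.0], [300.0, 300.0], [1200.0, 0.0]])

counts = count_within_buffer(obs, facilities)
counts_demeaned = demean(counts)
print(counts, counts_demeaned)
```

De-meaning matters only when the measures are interacted with the treatment indicator: the coefficient on the indicator then corresponds to the effect at the average level of investment rather than at zero investment.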

7.3 Educational attainment

Table A10 shows that the lower land values in KIP are unlikely to be driven by compositional differences in the resident population. We examine educational attainment and find that, if anything, individuals in KIP have slightly better rates of junior secondary and high school attainment, echoing the findings in World Bank (1995). The effect is 1 p.p. relative to control group means of 0.76 for junior secondary and 0.58 for high school completion. The sample includes 4.9 million individuals above the age of 25, and we control for gender, age, and locality fixed effects, as well as distance and topography controls at the hamlet level.

8 Threats to identification

This Section discusses threats to identification, to lend further support to our main findings of lower land values and lower building heights in KIP areas. We discuss potential confounding due to the persistence of slums, spatial spillovers, displacement effects, and provide various robustness checks.

8.1 Persistence of slums

To probe the extent to which the lower land values at the KIP boundary are confounded by the generic persistence of slums, we present a falsification test using historical slum boundaries in non-KIP areas as placebo borders (Table 8). Specifically, column 1 restricts the sample to locations within 200 meters of historical slum boundaries outside KIP. Here, we include 45 historical slum boundaries that are not contaminated, control for distance to the boundary (and its square), and cluster standard errors by boundary segments. In column 2 we consider observations within a 500 meter distance band. If historical slums have persistently lower land values, we should find a negative and significant effect when we compare areas that were historical kampungs against areas that were not. Instead, we find an insignificant effect for both the 500 meter and the 200 meter samples.

8.2 Spatial externalities

Next, we consider the extent to which spatial spillovers may confound the boundary analysis. For example, negative spillovers stemming from unsanitary living conditions and depressed public and private investment in slums could lower land values in control areas just outside the KIP boundary, attenuating the estimates from the boundary analysis. Following Turner et al. (2014), we argue that localized externalities related to KIP should decline with distance from KIP areas.

We test this by estimating spatial decay away from KIP boundaries. Specifically, we repeat the same specifications as Table 3, but the omitted group is now the treatment group and the key regressors are binary indicators for areas outside of KIP and within 100 meter distance bands. We plot the effects by distance bands and the 95% confidence intervals, for the historical kampungs specification. We start with 0 to 100 meters, and include up to 10 distance bands (up to 900 to 1000 meters).35

Figure 4 shows the effect on land values does not exhibit any spatial decay as we move away from KIP boundaries. This lack of spatial decay reduces the concern that the boundary discontinuity estimates are attenuated by spillovers from KIP contaminating non-KIP neighborhoods. This is consistent with the prominence of gated communities in formal neighborhoods in Jakarta, which minimizes spillovers from nearby neighborhoods. The full set of regression results is reported in Table A11 in the Appendix.
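The distance-band regressors used in the spatial decay exercise can be built as in the sketch below. It assumes a vector of distances to the nearest KIP boundary (in meters) and a KIP indicator; the function names are hypothetical. The second helper reproduces the equivalent-area radius quoted in footnote 35, which motivates the 1000 meter cutoff.

```python
import math

import numpy as np


def distance_band_dummies(dist_m, in_kip, band_width=100.0, n_bands=10):
    """Build an (n, n_bands) matrix of 0/1 indicators.

    Column k flags non-KIP observations located between k * band_width and
    (k + 1) * band_width meters from the KIP boundary. KIP observations
    (the omitted group) and observations beyond the last band get all zeros.
    """
    dist_m = np.asarray(dist_m, dtype=float)
    in_kip = np.asarray(in_kip, dtype=bool)
    band = np.floor(dist_m / band_width).astype(int)
    dummies = np.zeros((len(dist_m), n_bands), dtype=int)
    keep = (~in_kip) & (band >= 0) & (band < n_bands)
    dummies[np.flatnonzero(keep), band[keep]] = 1
    return dummies


def equivalent_radius_m(area_km2):
    """Radius (in meters) of a circle with the given area in square kilometers."""
    return math.sqrt(area_km2 / math.pi) * 1000.0


# Made-up example: four non-KIP observations and one KIP observation.
dist = [50.0, 150.0, 950.0, 1200.0, 30.0]
kip = [False, False, False, False, True]
bands = distance_band_dummies(dist, kip)
radius = equivalent_radius_m(2.5)
print(bands.sum(axis=1), round(radius))
```

Observations beyond the last band carry no indicator here; in the exercise described above they are dropped from the sample rather than pooled with the omitted group.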

8.3 Robustness for boundary analysis

Another potential threat to the boundary discontinuity exercise is that KIP boundaries may be confounded by administrative boundaries. We additionally control for locality fixed effects in Table A12 (with the 500 meter distance band), showing that the effects of KIP at the boundary are similar when considering variation within the same administrative unit.36

Moreover, Table A13 shows that the boundary discontinuity estimates are similar across different buffer distances ranging from 150 meters (the optimal bandwidth for land values as per Calonico et al. (2014)) to 500 meters. This is consistent with the weak evidence of spatial spillovers reported above.

35We drop observations beyond 1000 meters since we are identified off of within-locality variation only. The average locality has an area of 2.5 square kilometers, implying an equivalent area radius of 892 meters.
36Localities are important administrative units in Jakarta. For example, land transfers are recorded in registries in the kelurahan office.

8.4 Displacement and full sample analysis

A common concern with place-based programs is that the effects are localized and reflect a mere spatial reshuffling of economic activity. This also raises external validity concerns because the effects may average out across locations upon scaling up. This is less of a concern given the already comprehensive coverage and large scale of KIP. Below, we empirically assess this.

Effects do not average out in full sample. To assuage concerns that the effects detected in the boundary sample are only hyper-localized effects, Table 9 presents estimates using the full sample. If the effects were purely localized, they would net out once we average across all locations, leaving small effects in aggregate. This is not what we find. For example, we find a similar magnitude of -11% for land values when we use the full sample. This specification controls for 2,060 hamlet fixed effects.37 There are 304 hamlets that have within-hamlet variation in treatment status. The estimates for the other outcomes (heights, informality, parcel and population density) continue to be economically and statistically significant in the full sample. Overall, it is reassuring that the estimates for all our outcomes are similar in the full sample, the historical sample, and the boundary samples, as they are identified from different sources of variation.

Potential displacement from central KIP to middle non-KIP areas. Table 10 examines heterogeneity by distance to the National Monument, shedding light on the spatial patterns of development induced by KIP. We construct indicators for the three distance terciles (central, middle, and periphery) and interact them with the KIP dummy. The omitted group is the periphery. We report heterogeneous effects for land values (column 1) and heights (column 2) in the historical kampung sample. Consistent with the model’s predictions, we find little difference in land values in peripheral areas, where the gains from formalization are likely to be low relative to the costs.

Interestingly, our estimates are consistent with KIP-induced displacement of development activity from high market potential areas in the center to lower market potential areas in the middle. In fact, we can reject that central KIP areas are more developed (i.e. have higher land values and building height) than non-KIP areas in the middle (p-values are 1% or less). This displacement is suggestive of an aggregate reduction in realized land values. KIP areas are concentrated in central areas (68% of KIP observations), with 24% in the middle, and 8% in the periphery, consistent with Jakarta’s area expanding by 50% since 1965.
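The tercile interactions described above can be constructed as in the following sketch. It assumes a vector of distances to the National Monument and a KIP dummy; the function and key names are hypothetical, and the tercile cutoffs are computed from the sample at hand, which may differ from how the cutoffs were defined in the actual analysis.

```python
import numpy as np


def tercile_interactions(dist_to_center, kip):
    """Build tercile indicators and their interactions with the KIP dummy.

    The periphery (farthest tercile) is the omitted group, so the coefficient on
    `kip` alone captures the KIP effect in the periphery.
    """
    dist = np.asarray(dist_to_center, dtype=float)
    kip = np.asarray(kip, dtype=int)
    cut1, cut2 = np.quantile(dist, [1 / 3, 2 / 3])
    central = (dist <= cut1).astype(int)
    middle = ((dist > cut1) & (dist <= cut2)).astype(int)
    return {
        "kip": kip,
        "central": central,
        "middle": middle,
        "kip_x_central": kip * central,
        "kip_x_middle": kip * middle,
    }


# Made-up example with six observations ordered by distance from the center.
regressors = tercile_interactions([1, 2, 3, 4, 5, 6], [1, 0, 1, 0, 1, 0])
print(regressors["kip_x_central"], regressors["kip_x_middle"])
```

With this coding, the KIP effect in the center is the sum of the coefficients on `kip` and `kip_x_central`, and similarly for the middle tercile.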

37Hamlets are smaller than localities, with an average of 0.24 square kilometers and are comparable to U.S. census block groups.

8.5 Robustness checks

Below we provide robustness checks that address concerns related to endogenous sorting, selection of building heights, biases in assessed land values, and historical land institutions. We also discuss the robustness to dropping non-predetermined controls and to the clustering of standard errors.

Endogenous sorting. Endogenous sorting of lower-income households into KIP neighborhoods could lead to persistently lower land values (Ambrus et al., 2019). Table A6 shows that sorting patterns are unlikely to explain away the lower land values and higher population density in KIP. The first two columns show that the migration rates are slightly lower in KIP but the effects are small relative to the mean. Defined by district of birth, the migration rate is 2 p.p. lower in KIP (relative to a control group mean of 0.46). The 5-year migration rate (based on the district of residence 5 years before) is 1 p.p. lower relative to a control group mean of 0.09. The next two columns show that migrants have slightly more years of schooling. The effect is 0.19 more years for 5-year migrants relative to a mean of 10.6 years. These patterns are inconsistent with KIP investments causing crowding through in-migration of people with low education. Additionally, we examine differences in educational attainment of the stayers and find small positive effects for KIP (Table A14), which would work against finding lower land values. These patterns corroborate the conclusions in World Bank (1995) that “KIP did not disturb the existing residential stability of the kampungs” (p. 6).

Selection for development activity. We also consider selection into development activity stemming from the fact that the potential for building high-rises depends on zoning regulations and market access. Table A15 shows that the results for building heights survive after dropping pixels with no buildings (columns 1 and 2), restricting to pixels zoned for commercial developments (columns 3 and 4), or restricting to pixels within 1000 meters of pre-determined historical main roads, as a proxy of market access (columns 5 and 6).

Selection for land values assessment. Next, we assess whether there are systematic differences in the quality and coverage of the assessed land values data. First, to the extent that the measurement error is smooth across KIP boundaries, it is unlikely that the boundary discontinuity estimates are biased by differences in the land values assessment. Nevertheless, one concern is that KIP areas are more likely to be informal today, and property data for informal settlements are less likely to be reported. On the contrary, Table A16 shows that KIP areas are more likely to appear in the assessed values database. This is consistent with kampung residents recording property transactions in the local administrator’s office (Section 2.1).

Historical land institutions. In Table A17, we explore whether the results are confounded by differences between titled and untitled areas, including differential land values assessment. We are not aware of any systematic data on titles. As a proxy, we consider early Dutch settlements (as identified by our historical maps), which may be of unobservedly higher quality or more likely to be titled. We replicate our main results (Table 3) dropping hamlets with any Dutch settlements. Reassuringly, our results are very similar.

Non-predetermined controls. In Table A18 we replicate our main results dropping the three controls that are not pre-determined and that may thus be affected by KIP: distance to the Stock Exchange Building (built in 1994), distance to waterways (as currently reported in OpenStreetMap), and flood-proneness (a measure based on the frequency of recent floods). The effect sizes are very similar to the baseline ones.

Standard errors robustness. In Table A19, we replicate the specifications in Table 3 and show that the p-values for the KIP treatment effect are similar under alternative standard error specifications. For our historical kampung specification, where we cluster by locality at baseline, we consider a coarser clustering by sub-districts, as well as Conley (1999) standard errors with a radius of 700 meters, 900 meters (comparable to clustering by locality), and 1200 meters. Moreover, our boundary discontinuity inference, where we cluster by KIP boundary at baseline, is also robust to clustering standard errors at the sub-district level, with p-values of 0.02 and 0.08, respectively.

9 Policy discussion

Overall, we find an opportunity cost from delayed formalization of US$11 billion. This represents a deadweight loss to society, which cannot be monetized by residents, developers, or the government. Below we discuss the trade-offs faced by residents and policy makers and the implications of our findings. We begin by providing a bounding exercise to illustrate why the redevelopment process is so slow. As a first step, we divide the aggregate US$11 billion38 by estimates of the number of current residents of KIP areas. We calculate that the surplus available per household represents three times annual household income in kampungs (based on our household survey in 2016, see Section 2).39

38We focus on the opportunity costs of land use. Incorporating the program’s implementation costs does not change the conclusions.
39An average population density of 363 people per hectare and 11,000 hectares in KIP implies one million households (assuming a household size of four) in KIP neighborhoods today. The surplus available per displaced household would be US$10,788, about three times the current household income.
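
The arithmetic behind footnote 39 can be retraced with a short sketch. It uses only the rounded inputs quoted in the text (363 people per hectare, 11,000 hectares, four persons per household, and an aggregate cost of US$11 billion); the function names are hypothetical, and the unrounded inputs behind the reported US$10,788 are not available here.

```python
# Rounded inputs as quoted in the text and in footnote 39.
DENSITY_PER_HECTARE = 363        # people per hectare in KIP areas
KIP_AREA_HECTARES = 11_000       # total KIP area
HOUSEHOLD_SIZE = 4               # assumed persons per household
AGGREGATE_COST_USD = 11e9        # opportunity cost of delayed formalization


def households_in_kip(density, hectares, household_size):
    """Implied number of households: population divided by household size."""
    return density * hectares / household_size


def surplus_per_household(aggregate_cost, households):
    """Aggregate opportunity cost spread evenly across households."""
    return aggregate_cost / households


households = households_in_kip(DENSITY_PER_HECTARE, KIP_AREA_HECTARES, HOUSEHOLD_SIZE)
surplus = surplus_per_household(AGGREGATE_COST_USD, households)

print(f"Implied households in KIP: {households:,.0f}")
print(f"Surplus per household: US${surplus:,.0f}")
```

With these rounded inputs the sketch gives roughly one million households and a per-household surplus of a little over US$11,000, within about 2% of the reported US$10,788. The remaining gap presumably reflects rounding of the aggregate figure.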

To benchmark the surplus per resident, a critical cost component we consider is the displacement of residents. Two thirds of KIP areas are central by now and residents face major disruptions if they are relocated to the periphery, including changes in job opportunities, amenities, commuting times, and the potential destruction of social capital and informal insurance networks.40 We quantify displacement costs by assuming that the utility gain from living in the center versus the periphery is at least as large as the difference in rental costs, where differences in the value of productive and consumption amenities are presumably capitalized into rents.41 Given this, we calculate that as long as households expect to keep living in KIP neighborhoods for more than seven years, the expected benefits from remaining will be large enough to offset the US$11 billion opportunity cost of land use, shedding light on why formalization is such a slow process.42

We caveat that our US$11 billion opportunity cost estimate may not fully incorporate person-based benefits induced by KIP (e.g., on residents’ human capital and inter-generational mobility). Furthermore, there are foregone tax revenues from depressed land values. Our exercise also abstracts away from important considerations including incidence, complexities in implementation, and political sensitivities involved in redeveloping slums.

Beyond Indonesia, as cities face rapid urbanization and increasing land scarcity, policy makers face pressures to manage slum proliferation. An urgent policy question becomes where, when, and how to intervene (Duranton and Venables, 2018), acknowledging that land market institutions are slow moving and full-scale titling is not an immediate option. Our findings inform these decisions. First, our results indicate that the location of place-based investments is crucial. In central areas, the place-based distortions can be severe and give rise to sunk costs, suggesting that relocating slum residents earlier will be less costly, if land is available nearby (Michaels et al., 2017).43 Second, ex post dynamic inefficiencies associated with slum upgrading can be large, which points to the importance of the timing of interventions. In a city in an early stage of development, slum upgrading provides a low-cost way to target poor residents and improve their quality of life, conveying immediate benefits that may be large relative to future distortionary costs. In this respect, slum upgrading can be viewed as a “stop-gap” policy for early-stage, resource-constrained cities (Duranton and Venables, 2018).44 Third, some of the goals of slum upgrading could be achieved at a lower distortionary cost by combining person-based benefits, which improve the income-generating potential of residents, with place-based targeting.45

Ultimately, policy responses to the proliferation of slums entail a balancing act between aggregate efficiency and equity considerations. As governments seek to promote agglomeration externalities and upward mobility for migrants moving to cities, they are also concerned about housing affordability and redistributive considerations, to protect the of slum residents. Our empirical exercise helps inform these trade-offs by shedding light on the potential efficiency costs of a widely implemented slum policy approach.

40While the literature provides no explicit estimates of displacement costs for slum households, Barnhardt et al. (2017) consider a relocation program in Ahmedabad, India, and find that the majority of program recipients had given up heavily subsidized housing in the city’s periphery (as large as 1.4 times annual household income) and moved back to the slums.
41From our household survey (see Footnote 13), we calculate that the annual rental costs for residents in the center versus the periphery differ by US$2,160. The conclusions are similar if we use differences in income instead of rental costs.
42We assume an annual discount rate of 10% and four persons per household.
43A recently proposed policy approach in Jakarta is the redevelopment of central slums with on-site rehousing, which could minimize displacement costs while also freeing up land for other uses. The viability of these efforts will ultimately depend on the government’s ability to enforce requirements in prime areas.
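
The seven-year threshold in the bounding exercise of this section can be retraced with a short sketch. This is one possible reading rather than the calculation used in the paper: it assumes that the annual benefit of staying equals the US$2,160 rental gap (footnote 41), that benefits accrue at the end of each year and are discounted at 10% (footnote 42), and that the relevant target is the US$10,788 surplus per household (footnote 39). The function names are hypothetical.

```python
def annuity_present_value(annual_benefit, discount_rate, years):
    """Present value of a constant end-of-year benefit received for `years` years."""
    return annual_benefit * (1 - (1 + discount_rate) ** -years) / discount_rate


def break_even_years(annual_benefit, discount_rate, target, max_years=100):
    """Smallest whole number of years whose discounted benefits reach the target."""
    for years in range(1, max_years + 1):
        if annuity_present_value(annual_benefit, discount_rate, years) >= target:
            return years
    return None  # target not reached within max_years


ANNUAL_RENT_GAP_USD = 2_160          # center versus periphery rental gap (footnote 41)
DISCOUNT_RATE = 0.10                 # annual discount rate (footnote 42)
SURPLUS_PER_HOUSEHOLD_USD = 10_788   # surplus per household (footnote 39)

years_needed = break_even_years(
    ANNUAL_RENT_GAP_USD, DISCOUNT_RATE, SURPLUS_PER_HOUSEHOLD_USD
)
print(f"Years of residence needed to offset the surplus: {years_needed}")
```

Under these assumptions the discounted benefits after seven years are still below the per-household surplus and first exceed it in the eighth year, in line with the statement that households need to expect to stay for more than seven years. Treating benefits as accruing at the start of each year would shorten the horizon, so the timing convention is an assumption that matters here.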

10 Conclusion

Policy makers are debating different policy approaches to accommodate massive urbanization flows. One popular approach is slum upgrading. In this paper, we contribute novel causal evidence of the long-term impacts of the world’s largest scale slum upgrading program, the Kampung Improvement Program, on urban development patterns in the city of Jakarta. Our setting is unique in that we have granular data of formal and informal areas for a mega-city that has started formalizing, over 40 years from the implementation of the program.

Across empirical exercises and robustness checks, we find a consistent pattern with KIP neighborhoods having 12% lower land values and half as many tall buildings. They are also more informal, with greater land fragmentation and population density. Our findings shed light on how place-based investments can directly improve land values but also give rise to large distortionary sunk costs in the long run, informing the trade-offs faced by policy makers.

There are several avenues for future research. First, it will be important to shed light on the redistributive aspects associated with upgrading and redeveloping slums, including the incidence of costs and benefits for residents, developers, and the government. This will require a better understanding of the housing and labor market decisions of slum dwellers as well as their patterns of intergenerational mobility in and out of slums. In future work, it will also be interesting to elaborate on the policy trade-offs further, by studying slum upgrading programs that are implemented in cities at earlier stages of urban development or that bundle physical upgrades with other interventions.

Finally, our study also points to the use of ground imagery and rank-based indexes as a promising approach to measure the properties of the built environment. Policy makers are interested in identifying slums for targeting and for tracking their proliferation, but lack the data to quantify the degree of informality. In future work, this could be extended to other outcomes, such as the behavior of residents, or scaled up to produce systematic mappings.

44This echoes some of the trade-offs discussed in Leaf (1994): "Despite the important contributions which KIP has made to the betterment of physical conditions in the city (Devas, 1981), it is quite possible that in the broader view KIP will have the effect of being more of a temporary ameliorative measure (although for many kampung, temporary may refer to decades)...".
45In particular, the 1990s wave of KIP complemented physical improvements with person-based benefits, but it was discontinued following the 1997 Asian Financial Crisis.

References

Ambrus, A., E. Field, and R. Gonzalez (2019). Loss in the Time of Cholera: Long-run Impact of a Disease Epidemic on the Urban Landscape. Working paper.
Barnhardt, S., E. Field, and R. Pande (2017). Moving to Opportunity or Isolation? Network Effects of Randomized Housing Lottery in Urban India. American Economic Journal: Applied Economics 9(1), 1–32.
Baruah, N., V. J. Henderson, and C. Peng (2017). Colonial Legacies: Shaping African Cities. SERC/Urban and Spatial Programme Discussion Paper.
Bilal, A. and E. Rossi-Hansberg (2019). Location as an Asset. Working paper.
Bleakley, H. and J. Lin (2012). Portage and Path Dependence. Quarterly Journal of Economics 127(2), 587–644.
Brooks, L. and B. Lutz (2016). From Today’s City to Tomorrow’s City: An Empirical Investigation of Urban Land Assembly. American Economic Journal: Economic Policy 8(3), 69–105.
Brueckner, J. K. and S. V. Lall (2015). Cities in Developing Countries: Fueled by Rural-Urban Migration, Lacking in Tenure Security, and Short of Affordable Housing. In G. Duranton, J. V. Henderson, and W. C. Strange (Eds.), Handbook of Regional and Urban Economics Volume 5, Chapter 21. Elsevier Science.
Bryan, G., E. Glaeser, and N. Tsivanidis (2019). Cities in the Developing World. Working paper.
Calonico, S., M. D. Cattaneo, and R. Titiunik (2014). Robust Nonparametric Confidence Intervals for Regression-Discontinuity Designs. Econometrica 82(6), 2295–2326.
Castells-Quintana, D. (2017). Malthus Living in a Slum: Urban Concentration, Infrastructure and Economic Growth. Journal of Urban Economics 98, 158–173.
Collier, P., E. Glaeser, A. Venables, M. Blake, and P. Manwaring (2018). Land rights - unlocking land for urban development. International Growth Centre, Cities that Work Policy Brief.
Conley, T. G. (1999). GMM Estimation with Cross Sectional Dependence. Journal of Econometrics 92, 1–45.
Darrundono (1997). Kampung Improvement Programme, Jakarta, Indonesia. Aga Khan Award for Architecture brief.
Darrundono (2012). Proyek Muhammad Husni Thamrin.
De Soto, H. (1989). The Other Path: The Invisible Revolution in the Third World. Harper & Row, New York.
Demsetz, H. (1967). Toward a Theory of Property Rights. American Economic Review: Papers and Proceedings 57, 347–359.
Devas, N. (1981). Indonesia’s Kampung Improvement Program: An Evaluative Case Study. Ekistics 286, 19–36.
DPGP (2011). Updating Aset MHT Administrasi Dan Kepulauan Seribu. Dinas Perumahan Dan Gedung Pemerintah Daerah (DPGP), Provinsi Daerah Khusus Ibukota Jakarta.
Duranton, G., E. Ghani, A. G. Goswami, and W. Kerr (2015). The Misallocation of Land and Other Factors of Production in India. World Bank Policy Research Working Paper 7221.

Duranton, G. and A. Venables (2018). Place-Based Policies for Development. World Bank Policy Research Working Paper 8410.
Feler, L. and J. V. Henderson (2011). Exclusionary Policies in Urban Development: Urban-Servicing Migrant Households in Brazilian Cities. Journal of Urban Economics 69(3), 253–272.
Field, E. (2007). Entitled to Work: Urban Property Rights and Labor Supply in Peru. The Quarterly Journal of Economics 122(4), 1561–1602.
Field, E. and M. Kremer (2006). Impact Evaluation for Slum Upgrading Interventions. World Bank, Poverty Reduction and Economic Management, Thematic Group on Poverty Analysis, Monitoring and Impact Evaluation.
Firman, T. (1999). From “Global City” to “City of Crisis”: Jakarta Metropolitan Region Under Economic Turmoil. Habitat International 23(4), 447–466.
Franklin, S. (2019). The Demand for Government Housing: Evidence from Lotteries for 200,000 Homes in Ethiopia.
Fuller, B. and P. Romer (2014). Urbanization as Opportunity. World Bank Policy Research Working Paper 6874.
G. Kolff & Co (1937). Plattegrond van Batavia, schaal 1:23.500.
Galiani, S., P. Gertler, R. Cooper, S. Martinez, A. Ross, and R. Undurraga (2017). Shelter from the Storm: Upgrading Housing Infrastructure in Latin American Slums. Journal of Urban Economics 98(1), 187–213.
Galiani, S. and E. Schargrodsky (2010). Property Rights for the Poor; Effects of Land Titling. Journal of Public Economics 94(9), 700–29.
Garr, D. J. (1996). Expectative Land Rights, House Consolidation and Cemetery Squatting: Some Perspectives from Central Java. World Development 24(12), 1925–1933.
Gechter, M. and N. Tsivanidis (2018). Efficiency and Equity of Land Policy in Developing Country Cities: Evidence from the Mumbai Mills Redevelopment.
Glaeser, E. and V. Henderson (2017). Urban Economics for the Developing World: An Introduction. Journal of Urban Economics 98, 1–5.
Glaeser, E. L. and J. Gottlieb (2008). The Economics of Place-Making Policies. Brookings Papers on Economic Activity 2(1), 155–239.
Glaeser, E. L. and A. Shleifer (2002). Legal Origins. The Quarterly Journal of Economics 117(4), 1193–1229.
Gonzalez-Navarro, M. and C. Quintana-Domeque (2016). Paving Streets for the Poor: Experimental Analysis of Infrastructure Effects. Review of Economics and Statistics 98(2), 254–267.
Government of India (2016). Pradhan Mantri Awas Yojana: Housing for All (Urban) Scheme Guidelines.
Haryanto, W. (2018). Jakarta Must Build Upwards for More Space. Accessed online on 11 April 2019, available at https://www.thejakartapost.com/academia/2018/11/24/jakarta-must-build-upwards-for-more-space.html.
Henderson, J. V., A. Venables, and T. Regan (2016). Building the City: Sunk Capital, Sequencing, and Institutional Frictions. SRC Discussion Paper 196.

Hornbeck, R. and D. Keniston (2017). Creative Destruction: Barriers to Urban Growth and the Great Boston Fire of 1872. American Economic Review 107(6), 1365–1398.
Human Cities Coalition (2017). Rapid Scan of Policies, Plans and On-going Programs.
International Monetary Fund (2018). International Financial Statistics (IFS). Country Tables, United States, Consumer Price Index. Washington, DC: International Monetary Fund.
Jimenez, E. (1985). Urban Squatting and Community Organization in Developing Countries. Journal of Public Economics 27(1), 69–92.
Kaufmann, D. and J. M. Quigley (1987). The Consumption Benefits of Investment in Infrastructure: The Evaluation of Sites-and-services Programs in Underdeveloped Countries. Journal of Development Economics 25(2), 263–284.
Kessides, C. (1997). World Bank Experience with the Provision of Infrastructure Services for the Urban Poor: Preliminary Identification and Review of Best Practices.
KIP (1969). Kriteria Pemilihan Kampung Yang Akan Mendapat Perbaikan Dalam Proyek Muhammad Husni Thamrin Ditinjau Dari Segi Physik Kampung.
Kline, P. and E. Moretti (2014). People, Places, and Public Policy: Some Simple Welfare Economics of Local Economic Development Programs. Annual Review of Economics 6, 629–662.
Krugman, P. (1991). History versus Expectations. The Quarterly Journal of Economics 106(2), 651–667.
Kuffer, M., K. Pfeffer, and R. Sliuzas (2016). Slums from Space-15 Years of Slum Mapping Using Remote Sensing. Remote Sensing 8(6).
Leaf, M. (1994). Legal Authority in an Extralegal Setting: The Case of Land Rights in Jakarta, Indonesia. Journal of Planning Education and Research.
Leitner, H. and E. Sheppard (2018). From Kampungs to Condos? Contested Accumulations Through Displacement in Jakarta. Environment and Planning A: Economy and Space 50(2), 437–456.
Libecap, G. D. and D. Lueck (2013). The Demarcation of Land and the Role of Coordinating Property Institutions. Journal of Political Economy 119(3), 426–57.
Marx, B., T. Stoker, and T. Suri (2013). The Economics of Slums in the Developing World. Journal of Economic Perspectives 27(4), 187–210.
Marx, B., T. Stoker, and T. Suri (2019). There Is No Free House: Ethnic Patronage in a Kenyan Slum. American Economic Journal: Applied Economics 11(4), 36–70.
McCarthy, P. (2003). The Case of Jakarta, Indonesia. UN Habitat Case Study.
McIntosh, C., T. Alegría, G. Ordóñez, and R. Zenteno (2018). The Neighborhood Impacts of Local Infrastructure Investment: Evidence from Urban Mexico. American Economic Journal: Applied Economics 10(3), 263–86.
Michaels, G., D. Nigmatulina, F. Rauch, T. Regan, N. Baruah, and A. Dahlstrand-Rudin (2017). Planning Ahead for Better Neighborhoods: Long Run Evidence from Tanzania. Working paper.
Naik, N., S. D. Kominers, R. Raskar, E. L. Glaeser, and C. A. Hidalgo (2017). Computer Vision Uncovers Predictors of Physical Urban Change. Proceedings of the National Academy of Sciences 114(29), 7571–7576.
NASA and METI (2011). ASTER Global Digital Elevation Model, Version 2. USGS/Earth Resources Observation and Science (EROS) Center, Sioux Falls, South Dakota.
Neumark, D. and H. Simpson (2015). Place-based Policies. In G. Duranton, J. V. Henderson, and W. C. Strange (Eds.), Handbook of Regional and Urban Economics Volume 5, Chapter 18, pp. 1197–1287. Elsevier.
Pekel, J.-F., A. Cottam, N. Gorelick, and A. Belward (2016). High-Resolution Mapping of Global Surface Water and Its Long-Term Changes. Nature 540, 418–422.
Roback, J. (1982). Wages, Rents, and the Quality of Life. Journal of Political Economy 90(6), 1257–1278.
Rosen, S. (1979). Wage-Based Indexes of Urban Quality of Life. In P. Mieszkowski and M. Straszheim (Eds.), Current Issues in Urban Economics. Johns Hopkins University Press.
Rukmana, D. (2015). Peripheral Pressures. In Archaeology of the Periphery. Moscow Urban Forum.
Silver, C. (2007). Planning the Megacity: Jakarta in the Twentieth Century. London, United Kingdom and New York, NY, USA: Routledge.
Takeuchi, A., M. Cropper, and A. Bento (2008). Measuring the Welfare Effects of Slum Improvement Programs: The Case of Mumbai. Journal of Urban Economics 64(1), 65–84.
Taylor, J. L. (1987). Evaluation of the Jakarta Kampung Improvement Program. Shelter Upgrading for the Urban Poor: Evaluation of Third World Experience, 39–67.
The Skyscraper Center (2019). The Global Tall Building Database of the CTBUH.
Turner, M. A., A. Haughwout, and W. Van Der Klaauw (2014). Land Use Regulation and Welfare. Econometrica 82(4), 1341–1403.
UN Habitat (2011). Building Urban Safety Through Slum Upgrading.
UN Habitat (2012). State of the World’s Cities 2012/2013.
UN Habitat (2016). Slums Almanac 2015-16. Tracking Improvement in the Lives of Slum Dwellers.
UN Habitat (2017). Participatory Slum Upgrading Programme.
U.S. Army Map Service (1959). Djakarta - 1:50,000. Edition 1-AMS, Series T725.
Wong, M. (2019). Intergenerational Mobility in Slums: Evidence from a Field Survey in Jakarta. Asian Development Review 36(1), 1–19.
World Bank (1995). Enhancing the Quality of Life in Urban Indonesia: The Legacy of Kampung Improvement Program.
World Bank (2012). Jakarta Case Study Overview.
World Bank (2016a). Indonesia’s Urban Story.
World Bank (2016b). World Development Indicators.
World Bank (2017). Tanzania - Urban Water Supply Project.
World Bank (2018). Indonesia National Slum Upgrading Project: Implementation Status and Results Report.
World Population Review (2018). Indonesia Population 2018.

Tables

Table 1: Summary statistics

Variable name                                    N       Unit        Mean       SD

Panel A: Outcomes
Assessed land values, thousand Rupiahs per sqm   19862   sub-block   12386      14687
1(Height>3)                                      7104    pixel       0.19       0.40
Building height, number of floors                7104    pixel       4.21       7.50
1(Rank-based informality index > 1)              7104    pixel       0.47       0.50
Rank-based informality index                     7104    pixel       1.11       1.12
Attribute-based informality index                7104    pixel       0.00       0.42
Parcel count                                     89463   pixel       15.80      16.18
Retail density                                   89463   pixel       0.02       0.10
Office density                                   89463   pixel       0.04       0.16
Population density                               2534    hamlet      25691      25915

Panel B: Controls
Distance to Monument, m                          89463   pixel       10725.01   4684.06
Distance to historical main road, m              89463   pixel       6986.00    4452.91
Distance to Stock Exchange, m                    89463   pixel       10228.02   4564.43
Elevation, m                                     89463   pixel       21.95      14.81
Slope, degrees                                   89463   pixel       4.85       3.37
1(Flood-prone hamlet)                            89463   pixel       0.38       0.48
Distance to nearest waterways, m                 89463   pixel       485.57     611.81
Distance to surface water occurrence, m          89463   pixel       2429.41    1510.93
Distance to coast, m                             89463   pixel       11527.70   6946.25

Notes: Panel A reports summary statistics for outcome variables, including land values for 19,862 sub-blocks in the assessed land values database, building heights, the rank-based and attribute-based informality indexes, and a binary “kampung” indicator (if the rank-based index is greater than 1) for 7,104 pixels in our photographic survey, land use patterns (parcel count, retail, and office density) for 89,463 pixels, and population density for 2,534 hamlets. Panel B reports summary statistics for distance and topography controls measured at the pixel level.

Table 2: Comparing KIP and non-KIP areas

Unit of analysis:                         Sub-block level                        Pixel level
Sample:                                   Full         Historical   BDD          Full         Historical   BDD
                                          sample       kampung      200m         sample       kampung      200m
                                          (1)          (2)          (3)          (4)          (5)          (6)

Panel A: Distance controls
Log distance to Monument                  -0.31***     -0.01        0.0007       -0.33***     0.004        0.004
                                          [0.0001]     [0.25]       [0.90]       [0.00001]    [0.53]       [0.25]
Log distance to historical main road      -0.30**      -0.01        -0.003       -0.34***     0.02         0.02
                                          [0.02]       [0.72]       [0.86]       [0.004]      [0.49]       [0.21]
Log distance to Stock Exchange            -0.21**      -0.01        -0.002       -0.25***     -0.01        -0.005*
                                          [0.03]       [0.24]       [0.57]       [0.0004]     [0.41]       [0.10]

Panel B: Topography controls
Elevation (m)                             -3.81*       0.005        0.15         -4.04**      -0.36        0.07
                                          [0.09]       [0.99]       [0.75]       [0.02]       [0.11]       [0.78]
Slope (degrees)                           -0.16        0.15         -0.15        -0.13        -0.24**      -0.21*
                                          [0.53]       [0.60]       [0.51]       [0.36]       [0.03]       [0.07]
1(Flood-prone hamlet)                     0.02         0.02         0.10*        -0.01        -0.03        -0.001
                                          [0.76]       [0.55]       [0.05]       [0.69]       [0.17]       [0.96]
Log distance to nearest waterways         0.14         0.12         0.27***      -0.05        0.14***      0.23***
                                          [0.34]       [0.13]       [0.004]      [0.68]       [0.003]      [0.00003]
Log distance to surface water occurrence  -0.04        0.002        -0.0008      -0.15        -0.01        0.02
                                          [0.70]       [0.96]       [0.95]       [0.16]       [0.65]       [0.23]
Log distance to coast                     -0.19**      -0.002       -0.003       -0.18***     -0.002       0.003
                                          [0.03]       [0.92]       [0.64]       [0.01]       [0.76]       [0.47]

N                                         19862        3147         1345         89463        11015        5412
Geography FE                              District     Locality     KIP Boundary District     Locality     KIP Boundary

* 0.10 ** 0.05 *** 0.01

Notes: This table reports regressions of distance and topography attributes as the dependent variables, with the treatment indicator as the key regressor, and geography fixed effects. For each variable, the top row reports the coefficient, and the bottom row reports the p-value in brackets. The unit of analysis is either a sub-block (assessed land values analysis, columns 1 to 3) or a pixel (heights analysis, columns 4 to 6). Columns 1 and 4 report results for the full sample, columns 2 and 5 report results for the historical kampung sample, and columns 3 and 6 correspond to the boundary discontinuity sample, within a 200 meter distance band. We have 3 distance controls (log distance to the National Monument, historical main road, and stock exchange) and 6 topography controls. Standard errors are clustered by locality except for the boundary analysis where we cluster by KIP boundary.

Table 3: Effect of KIP on land values and building heights

Dependent variable:       Log land values             1(Height>3)
Sample:                   Historical    BDD           Historical    BDD
                          kampung       200m          kampung       200m
                          (1)           (2)           (3)           (4)

KIP                       -0.12***      -0.13**       -0.12***      -0.09***
                          (0.05)        (0.06)        (0.02)        (0.03)

N                         3147          1345          5280          1452
R-Squared                 0.73          0.81          0.29          0.37
Distance                  Y             Y             Y             Y
Topography                Y             Y             Y             Y
Distance to KIP boundary  N             Y             N             Y
Geography FE              Locality      KIP Boundary  Locality      KIP Boundary

* 0.10 ** 0.05 *** 0.01

Notes: This table reports the effect of KIP on land values and building heights. Columns 1 and 2 report the effect of KIP on log of assessed land values in a sub-block, where the key regressor is an indicator that is 1 for sub-blocks in KIP. Column 1 includes the historical kampung sample with 196 locality fixed effects. Column 2 uses observations within 200 meters from a KIP boundary, controlling for distance to the KIP boundary (and its square), and 124 KIP boundary fixed effects. Columns 3 and 4 present the analysis for heights at the pixel level, where the dependent variable is a dummy that is 1 if the tallest building in the pixel has more than 3 floors. We also control for strata fixed effects from our photographic survey and an indicator for pixels with no buildings. Distance and topography controls are defined in Table 2. Standard errors are clustered by locality (historical specification) and by KIP boundary (BDD specification).

Table 4: Heterogeneous effects by KIP waves

Dependent variable:       Log land values                     1(Height>3)
Sample:                   Full        Historical  Historical  Photo       Historical  Historical
                          sample      kampung     kampung     sample      kampung     kampung
                          (1)         (2)         (3)         (4)         (5)         (6)

KIP I (1969-1974)         -0.52***    -0.13       -0.08       -0.11***    -0.10***    -0.08***
                          (0.10)      (0.09)      (0.10)      (0.03)      (0.03)      (0.03)
KIP II (1974-1979)        -0.33***    -0.09       -0.04       -0.12***    -0.10***    -0.08***
                          (0.08)      (0.06)      (0.06)      (0.02)      (0.02)      (0.02)
KIP III (1979-1984)       -0.14       -0.14*      -0.09       -0.05**     -0.07**     -0.05
                          (0.09)      (0.07)      (0.08)      (0.03)      (0.03)      (0.03)

N                         19862       3147        3147        7104        5280        5280
R-Squared                 0.55        0.73        0.73        0.18        0.29        0.29
p-val (H0: βI ≥ βII)      0.022       0.279       0.295       0.557       0.527       0.431
p-val (H0: βII ≥ βIII)    0.065       0.680       0.672       0.016       0.211       0.257
Distance                  Y           Y           Y           Y           Y           Y
Topography                Y           Y           Y           Y           Y           Y
KIP investments           N           N           Y           N           N           Y
Distance tercile          N           N           Y           N           N           Y
Geography FE              District    Locality    Locality    District    Locality    Locality

* 0.10 ** 0.05 *** 0.01

Notes: This table assesses whether there is a monotonic pattern in the effects of the three KIP waves that is consistent with the scoring rule prioritizing worse neighborhoods first (βI < βII < βIII). Specifically, we estimate heterogeneous effects on land values (columns 1 to 3) and building heights (columns 4 to 6) with the key regressors being dummies for each of the three KIP Pelita waves (five-year plans). Column 1 includes the full sample of 19,862 sub-blocks from the assessed land values data and 5 district fixed effects. Column 2 restricts to the historical kampung sample with 3,147 sub-blocks and includes 196 locality fixed effects. Column 3 further adds controls for KIP investments (see Table 7) and dummies for distance terciles to the National Monument. Columns 4 through 6 present the analogous analysis for heights. Column 4 includes the full photographic survey sample, corresponding to 7,104 pixels drawn from the historical kampung sample and the boundary sample. The first p-value corresponds to the one-sided null hypothesis that the first wave has better outcomes (higher land values and building heights) than the second wave. The second p-value tests whether the second wave has better outcomes than the third. A small p-value is evidence in favor of the one-sided alternative, i.e., rejecting the null implies a statistically monotonic pattern consistent with the scoring rule. Standard errors are clustered by locality.
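
To make the one-sided tests in the notes to Table 4 concrete, the following is a minimal sketch of a one-sided test of H0: βI ≥ βII against the alternative βI < βII. It uses a normal approximation and, as a simplifying assumption, sets the covariance between the two estimates to zero by default. The table does not report that covariance, so the sketch is not expected to reproduce the reported p-values, which rest on the full clustered covariance matrix. The function names are hypothetical.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * math.erfc(-z / math.sqrt(2))


def one_sided_p_value(beta_first, beta_second, se_first, se_second, covariance=0.0):
    """p-value for H0: beta_first >= beta_second against H1: beta_first < beta_second.

    The covariance between the two estimates defaults to zero, which is an
    illustrative assumption rather than a feature of the estimates in Table 4.
    """
    variance_of_difference = se_first ** 2 + se_second ** 2 - 2 * covariance
    z = (beta_first - beta_second) / math.sqrt(variance_of_difference)
    # A very negative difference is evidence against the null, so use the lower tail.
    return normal_cdf(z)


# Column 1 of Table 4: KIP I versus KIP II coefficients and standard errors.
p_wave_one_vs_two = one_sided_p_value(-0.52, -0.33, 0.10, 0.08)
print(f"One-sided p-value under zero covariance: {p_wave_one_vs_two:.3f}")
```

A positive covariance between the two coefficient estimates shrinks the variance of their difference and lowers the p-value, which is one reason the reported value can be smaller than the zero-covariance figure.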

Table 5: Effect of KIP on informality

Dependent variable:       1(Kampung)                  Rank-based index            Attribute-based index
Sample:                   Historical    BDD           Historical    BDD           Historical    BDD
                          kampung       200m          kampung       200m          kampung       200m
                          (1)           (2)           (3)           (4)           (5)           (6)

KIP                       0.15***       0.20***       0.29***       0.39***       0.05**        0.08*
                          (0.02)        (0.05)        (0.05)        (0.12)        (0.02)        (0.04)

N                         5280          1452          5280          1452          5280          1452
R-Squared                 0.24          0.31          0.25          0.34          0.17          0.20
Distance                  Y             Y             Y             Y             Y             Y
Topography                Y             Y             Y             Y             Y             Y
Distance to KIP boundary  N             Y             N             Y             N             Y
Geography FE              Locality      KIP boundary  Locality      KIP boundary  Locality      KIP boundary

* 0.10 ** 0.05 *** 0.01

Notes: This table reports the effect of KIP on informality, using the building heights specifications of Table 3 at the pixel level. In columns 1 and 2 the dependent variable is the binary kampung indicator (1 if the rank-based informality index is greater than 1). In columns 3 and 4 the dependent variable is the continuous rank-based informality index. Columns 5 and 6 employ the attribute-based index as the dependent variable. Standard errors are clustered by locality (historical specification) and by KIP boundary (BDD specification).

Table 6: Effect of KIP on parcel and population density

Dependent variable:       Parcel count                Log population density
Sample:                   Historical    BDD           Historical
                          kampung       200m          kampung
                          (1)           (2)           (3)

KIP                       8.65***       12.54***      0.34***
                          (1.07)        (1.13)        (0.07)

N                         11015         5412          1184
R-Squared                 0.51          0.40          0.56
Distance                  Y             Y             Y
Topography                Y             Y             Y
Distance to KIP boundary  N             Y             N
Geography FE              Locality      KIP boundary  Locality

* 0.10 ** 0.05 *** 0.01

Notes: This table reports the effects of KIP on the number of parcels in a pixel (columns 1 and 2) and the log of population density in a hamlet (column 3). Columns 1 and 2 repeat the building heights specifications of Table 3, adding the log length of roads in a pixel as a control. In column 3, we report effects for population density at the hamlet level with locality fixed effects. The KIP dummy is equal to 1 if the majority of the area in the hamlet is in KIP. Controls are all averaged to the hamlet level. Standard errors are clustered by locality except for the boundary analysis where we cluster by KIP boundary.

Table 7: Heterogeneous effects by KIP components

Dependent variable:                       Log land values
Sample:                                   Historical kampung  Full sample
                                          (1)                 (2)

KIP                                       -0.08               -0.11***
                                          (0.05)              (0.04)
Length of vehicular roads (in km)         -0.02               -0.02
                                          (0.03)              (0.02)
Length of pedestrian roads (in km)        -0.01               0.02
                                          (0.02)              (0.02)
Number of sanitation facilities           0.001               0.003
                                          (0.01)              (0.01)
Number of public buildings                0.01                0.01
                                          (0.03)              (0.02)
KIP X Length of vehicular roads           -0.01               0.002
                                          (0.03)              (0.02)
KIP X Length of pedestrian roads          -0.004              -0.01
                                          (0.02)              (0.02)
KIP X Number of sanitation facilities     0.01                -0.004
                                          (0.01)              (0.01)
KIP X Number of public buildings          -0.01               0.02
                                          (0.03)              (0.02)

N                                         3147                19862
R-Squared                                 0.73                0.85
Distance                                  Y                   Y
Topography                                Y                   Y
Geography FE                              Locality            Hamlet

* 0.10 ** 0.05 *** 0.01

Notes: This table reports heterogeneous effects on land values by four policy components (vehicular roads, pedestrian roads, sanitation, public buildings). Column 1 presents the historical kampung specification with locality fixed effects. Column 2 presents the full sample with hamlet fixed effects. The intensity of KIP investments is measured by length of vehicular and paved roads, number of sanitation facilities, and number of public buildings within a 500 meter buffer around each observation. The KIP intensity variables have been demeaned so that the coefficient on the KIP indicator reflects the effects when evaluated at average intensity levels. The omitted category is non-KIP areas. Standard errors are clustered by locality.

Table 8: Effect of placebo boundaries

Dependent variable:                         Log land values
Sample:                                     BDD 200m             BDD 500m
                                            (1)                  (2)
Historical Kampung                          -0.001               -0.04
                                            (0.04)               (0.06)
N                                           1790                 2628
R-Squared                                   0.49                 0.49
Distance                                    Y                    Y
Topography                                  Y                    Y
Distance to historical kampung boundary     Y                    Y
Geography FE                                Historical Kampung   Historical Kampung
                                            Boundary             Boundary

* 0.10 ** 0.05 *** 0.01

Notes: This table reports the effect of placebo boundaries on land values, where the key regressor is the historical kampung indicator. The sample includes sub-blocks that are not in KIP and are within 200 (500) meters of a historical kampung boundary for column 1 (2), conditional on 45 (41) historical kampung boundary fixed effects. Both control for distance to the nearest historical kampung boundary (and its square). Standard errors are clustered by historical kampung boundary.

Table 9: Effect of KIP: Robustness to full sample

Dependent variable:   Log land   1(Height>3)  1(Kampung)  Rank       Attribute  Parcel     Log population
                      values                              index      index      count      density
                      (1)        (2)          (3)         (4)        (5)        (6)        (7)
KIP                   -0.11***   -0.07***     0.14***     0.29***    0.06***    10.13***   0.50***
                      (0.03)     (0.02)       (0.03)      (0.05)     (0.02)     (0.55)     (0.06)
N                     19862      7104         7104        7104       7104       89463      2528
R-Squared             0.85       0.43         0.43        0.47       0.38       0.52       0.46
Distance              Y          Y            Y           Y          Y          Y          Y
Topography            Y          Y            Y           Y          Y          Y          Y
Geography FE          Hamlet     Hamlet       Hamlet      Hamlet     Hamlet     Hamlet     Locality

* 0.10 ** 0.05 *** 0.01

Notes: This table reports robustness using the full sample for all the outcomes in Tables 3, 5, and 6. The samples include 19,862 sub-blocks from assessed land values data, 7,104 pixels from the photo sample, 89,463 pixels from the full grid of Jakarta and land parcels data, and 2,528 hamlets from the population census. All columns include hamlet fixed effects with the exception of the population density (column 7) where locality fixed effects are used. Dependent variables are the log of assessed land values (column 1), the binary indicator for buildings above 3 floors (column 2), the binary kampung indicator that is 1 if the rank-based informality index is greater than 1 (column 3), the continuous rank-based informality index (column 4), the attribute index (column 5), parcel count per pixel (column 6), and log of population density (column 7). Distance and topography controls are defined in Table 2. Standard errors are clustered by locality.

Table 10: Heterogeneous effects by distance to the city center

Dependent variable:                          Log land values   1(Height>3)
Sample:                                      Historical        Historical
                                             kampung           kampung
                                             (1)               (2)
KIP X Central                                -0.10             -0.13***
                                             (0.08)            (0.02)
KIP X Middle                                 -0.22**           -0.11***
                                             (0.09)            (0.03)
KIP X Peripheral                             -0.06             -0.10***
                                             (0.08)            (0.03)
Central                                      -0.56**           0.01
                                             (0.28)            (0.06)
Middle                                       -0.23             -0.01
                                             (0.23)            (0.05)
N                                            3147              5280
R-Squared                                    0.73              0.29
p-val (H0: βCentralKIP ≥ βMiddleControl)     0.01              0.00
Distance                                     Y                 Y
Topography                                   Y                 Y
Geography FE                                 Locality          Locality

* 0.10 ** 0.05 *** 0.01

Notes: This table reports heterogeneous effects by distance to the city center. We create dummy variables defined by the terciles of distance to the city center (defined as the National Monument) within each of the relevant samples. The omitted group is the periphery. The dependent variables are land values (column 1) and the binary indicator for buildings with more than three floors (column 2). The p-value tests the one-sided hypothesis that outcomes in central KIP areas (coefficient for the Central dummy plus the coefficient for KIP interacted with the Central dummy) are better than outcomes in non-KIP middle areas (coefficient for Middle). A low p-value indicates that central KIP areas are in fact worse (lower land values and heights). Standard errors are clustered by locality.
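As a rough illustration of the comparison behind the p-values in Table 10, the sketch below combines the reported point estimates. It reproduces only the point estimates: the test itself also needs the estimated covariance of the coefficients, which is not reported in the table. The dictionary layout and the function name are hypothetical.

```python
# Point estimates copied from Table 10
# (column 1: log land values; column 2: 1(Height>3)).
coefs = {
    "log_land_values": {"kip_x_central": -0.10, "central": -0.56, "middle": -0.23},
    "tall_buildings": {"kip_x_central": -0.13, "central": 0.01, "middle": -0.01},
}

def central_kip_minus_middle_control(c):
    """Central KIP outcome (Central + KIP x Central) minus the non-KIP
    middle outcome (Middle), both relative to the omitted peripheral group."""
    return (c["central"] + c["kip_x_central"]) - c["middle"]

gaps = {name: central_kip_minus_middle_control(c) for name, c in coefs.items()}
for name, gap in gaps.items():
    # A negative gap means central KIP areas fare worse than non-KIP middle areas.
    print(name, round(gap, 2))
```

Both gaps are negative, in line with the interpretation given in the table notes.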

Figures

Figure 1: KIP boundaries and historical kampungs

Notes: Map showing KIP boundaries (thick border) and historical kampungs that existed before KIP (shaded region). The grey borders are locality boundaries. The black triangle is the National Monument.

Figure 2: Distribution of building heights

[Figure: histogram of the number of buildings (vertical axis, 0 to 1800) by number of floors (horizontal axis: 1, 2, 3, 4, 5-10, >10), shown separately for non-KIP and KIP.]

Notes: This histogram shows the distribution of building heights for the tallest building in KIP and non-KIP pixels in the historical kampung sample. The horizontal axis represents the number of floors.

Figure 3: Distribution of the rank-based informality index

[Figure: histogram of the number of pixels (vertical axis, 0 to 3000) by index value (horizontal axis: 0, 1, 2, 3, 4), shown separately for non-KIP and KIP.]

Notes: This histogram shows the distribution of the rank-based informality index. Index values range from 0, corresponding to “very formal”, to 4, corresponding to “very informal”. We report the integer index values, as coded by each of the two research assistants.

Figure 4: Spatial decay: Effect of KIP on land values

[Figure: coefficient plot. Vertical axis: coefficient, from -0.2 to 0.4. Horizontal axis: distance in meters, in 100-meter bins from 100 to 1000.]

Notes: Each point corresponds to a coefficient and 95% confidence interval for the distance bins regressors in the historical kampung sample. The omitted group is KIP areas. These coefficients reflect differences in land values moving away from KIP boundaries. The full regression is reported in column 1 of Table A11.

Online Appendix

Section A: Appendix Tables
Table A1: Rank-based informality index and attributes
Table A2: Robustness checks for building heights
Table A3: Robustness checks for the rank-based informality index
Table A4: Effect of KIP on informality, domains of the attribute-based index
Table A5: Fragmentation, land values, and building heights
Table A6: Migration
Table A7: Differences in household size, mortality, and fertility
Table A8: Heterogeneous effects for land fragmentation
Table A9: Access to current amenities
Table A10: Educational attainment
Table A11: Effect of KIP on land values: spatial decay
Table A12: Boundary analysis with locality fixed effects
Table A13: Boundary analysis, robustness to different distance bands
Table A14: Educational attainment for stayers
Table A15: Selection for building heights
Table A16: Selection for assessed land values
Table A17: Robustness to excluding Dutch areas
Table A18: Robustness to predetermined controls only
Table A19: Effect of KIP on land values, building heights, and informality, standard errors robustness

Section B: Appendix Figures
Figure A1: Kampungs in Jakarta, before and after KIP
Figure A2: Map of KIP waves
Figure A3: Map of KIP boundary segments
Figure A4: Policy maps: KIP assets
Figure A5: Map of the assessed land values database
Figure A6: Scatterplot of assessed land values and transaction prices
Figure A7: Map of photographic survey locations
Figure A8: Examples of coding of the rank-based informality index
Figure A9: Example of cadastral map of land parcels
Figure A10: Spatial decay: Distance from high-density hamlet boundaries

Section C: Data Appendix
Section D: Theoretical Appendix

Appendix Tables

Table A1: Rank-based informality index and attributes

Dependent variable:                  Rank-based informality index
Sample:                              Historical     BDD
                                     kampung        200m
                                     (1)            (2)

Panel A: Access
Road accessible by car (1=no)        0.82***        0.82***
                                     (0.05)         (0.08)
Paved road (1=no)                    0.40           -0.16
                                     (0.25)         (0.46)
Unpaved road (1=yes)                 0.10           0.31
                                     (0.12)         (0.22)
Damaged road pavement (1=yes)        0.09           -0.19
                                     (0.08)         (0.15)
Garden (1=no)                        0.32***        0.33***
                                     (0.05)         (0.07)

Panel B: Neighborhood appearance
Exposed wires (1=yes)                0.41***        0.33***
                                     (0.04)         (0.07)
Drainage canals (1=no)               0.25***        0.30**
                                     (0.06)         (0.13)
Trash (1=yes)                        0.26***        0.46***
                                     (0.05)         (0.08)

Panel C: Permanence of structures
Unfinished buildings (1=yes)         -0.23***       -0.12
                                     (0.08)         (0.14)
Permanent wall (1=no)                0.87**         -0.64***
                                     (0.35)         (0.24)
Unfinished wall (1=yes)              0.51***        0.49***
                                     (0.03)         (0.07)
Non-permanent wall (1=yes)           0.36***        0.36***
                                     (0.04)         (0.07)
Damaged wall (1=yes)                 0.23***        0.22***
                                     (0.04)         (0.07)
Permanent fence (1=no)               0.09*          0.005
                                     (0.05)         (0.08)
Rust (1=yes)                         0.22***        0.15
                                     (0.05)         (0.11)

N                                    5280           1452
R-Squared                            0.60           0.64
Distance                             Y              Y
Topography                           Y              Y
Distance to KIP boundary             N              Y
Geography FE                         Locality       KIP boundary

* 0.10 ** 0.05 *** 0.01

Notes: This table reports regressions at the pixel level where the dependent variable is the rank-based informality index and the regressors are the individual components of the attribute-based informality index. Column 1 includes the historical kampung sample and column 2 includes the boundary sample. Standard errors are clustered by locality in column 1 and by KIP boundary in column 2.

Other building height outcomes

Table A2 repeats the building heights analysis for different height outcomes. We discuss the historical kampung sample results (odd columns) but the conclusions are similar for the boundary sample (even columns). In columns 1 and 2, we consider the number of floors for the tallest building in the pixel, finding an effect of -1.64 floors, relative to a control group mean of 5 floors. In columns 3 and 4, the dependent variable is the log of height; we obtain an effect of -19 log points (-21 percent, implying an effect size of 1.1 floors). In columns 5 through 8, we address the concern that some of the survey photos may be outside of the pixel due to GPS imprecision in the field. Results are similar if we only consider photos taken within the pixel (columns 5 and 6) and if we replace field photos taken outside of the pixel with photos from Google Street View that are closer to the intended location (columns 7 and 8).

Table A2: Robustness checks for building heights

Dependent variable:             Building Heights        Log(height)             1(Height>3)
Sample:                         Historical  BDD         Historical  BDD         Historical  BDD         Historical  BDD
                                kampung     200m        kampung     200m        kampung     200m        kampung     200m
                                (1)         (2)         (3)         (4)         (5)         (6)         (7)         (8)
KIP                             -1.64***    -1.40**     -0.19***    -0.13**     -0.12***    -0.11***    -0.12***    -0.10***
                                (0.38)      (0.64)      (0.04)      (0.06)      (0.02)      (0.03)      (0.02)      (0.03)
N                               5280        1452        5064        1415        5000        1368        5280        1452
R-Squared                       0.32        0.52        0.37        0.52        0.30        0.38        0.30        0.37
Distance                        Y           Y           Y           Y           Y           Y           Y           Y
Topography                      Y           Y           Y           Y           Y           Y           Y           Y
Distance to KIP boundary        N           Y           N           Y           N           Y           N           Y
Exclude photos outside pixel    N           N           N           N           Y           Y           N           N
Replace photos outside pixel    N           N           N           N           N           N           Y           Y
Geography FE                    Locality    KIP         Locality    KIP         Locality    KIP         Locality    KIP
                                            boundary                boundary                boundary                boundary

* 0.10 ** 0.05 *** 0.01

Notes: This table reports specifications similar to those in Table 3. Standard errors are clustered by locality in odd columns and by KIP boundary in even columns.

Other measures of the informality index

Table A3 shows that our conclusion that KIP areas are more informal remains the same using different approaches, such as defining the kampung indicator using different threshold values for the rank-based informality index besides one (columns 1 through 4), and pooling the index values for the two research assistants (and adding a fixed effect for one of them), as opposed to averaging the two scores (columns 5 and 6). Table A4 examines the three domains of the attribute-based index. The dependent variables correspond to z-scores for each domain (standardized using the control group mean and standard deviation within each estimation sample): access (columns 1 and 2), neighborhood appearance (columns 3 and 4), and permanence of structures (columns 5 and 6). The individual attributes in each domain are the ones listed in Table A1.
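The standardization used for the Table A4 outcomes can be sketched as follows. This is a minimal illustration, not the authors' code: the function name is hypothetical, and the use of the population standard deviation (rather than the sample standard deviation) is an assumption.

```python
from statistics import mean, pstdev

def control_group_zscores(values, is_kip):
    """Standardize every observation with the mean and standard deviation
    of the control (non-KIP) observations in the same estimation sample.

    Assumption: population standard deviation (pstdev); the source does not
    say which variant is used.
    """
    control = [v for v, treated in zip(values, is_kip) if not treated]
    mu = mean(control)
    sigma = pstdev(control)
    return [(v - mu) / sigma for v in values]

# Toy domain scores: two control pixels followed by two KIP pixels.
scores = [0.0, 2.0, 4.0, 6.0]
kip = [False, False, True, True]
z = control_group_zscores(scores, kip)
print(z)
```

With this scaling, a coefficient on the KIP dummy reads as a difference measured in control-group standard deviations.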

Table A3: Robustness checks for the rank-based informality index

Dependent variable:         1(Informality index > 0)    1(Informality index > 2)    Informality index, pooling
Sample:                     Historical   BDD            Historical   BDD            Historical   BDD
                            kampung      200m           kampung      200m           kampung      200m
                            (1)          (2)            (3)          (4)            (5)          (6)
KIP                         0.12***      0.15***        0.05***      0.05           0.29***      0.39***
                            (0.02)       (0.04)         (0.02)       (0.04)         (0.05)       (0.11)
N                           5280         1452           5280         1452           10560        2904
R-Squared                   0.28         0.33           0.13         0.22           0.23         0.30
Distance                    Y            Y              Y            Y              Y            Y
Topography                  Y            Y              Y            Y              Y            Y
Distance to KIP boundary    N            Y              N            Y              N            Y
Geography FE                Locality     KIP boundary   Locality     KIP boundary   Locality     KIP boundary

* 0.10 ** 0.05 *** 0.01

Notes: This table reports specifications similar to those in Table 5. Standard errors are clustered by locality in odd columns and by KIP boundary in even columns.

Table A4: Effect of KIP on informality, domains of the attribute-based index

Dependent variable:         Lack of access              Poor neighborhood appearance    Non-permanent structures
Sample:                     Historical   BDD            Historical   BDD                Historical   BDD
                            kampung      200m           kampung      200m               kampung      200m
                            (1)          (2)            (3)          (4)                (5)          (6)
KIP                         0.04         0.08*          0.05         0.18***            0.06***      0.03
                            (0.03)       (0.05)         (0.04)       (0.07)             (0.02)       (0.04)
N                           5280         1452           5280         1452               5280         1452
R-Squared                   0.13         0.15           0.14         0.16               0.13         0.16
Distance                    Y            Y              Y            Y                  Y            Y
Topography                  Y            Y              Y            Y                  Y            Y
Distance to KIP boundary    N            Y              N            Y                  N            Y
Geography FE                Locality     KIP boundary   Locality     KIP boundary       Locality     KIP boundary

* 0.10 ** 0.05 *** 0.01

Notes: This table reports specifications similar to those in Table 5. Standard errors are clustered by locality in odd columns and by KIP boundary in even columns.

Fragmentation, land values, and building heights

As discussed in the paper, Table 6 shows that KIP neighborhoods have greater land fragmentation, with an effect size of 9 more parcels per pixel compared to non-KIP neighborhoods. Table A5 below shows that land fragmentation is associated with lower land values and fewer tall buildings. For the land values analysis (columns 1 and 3), in order to merge land values observations with land fragmentation data, we calculate parcel count at the sub-block level. This exercise helps us assess how the KIP effect on land fragmentation can help explain the effects of KIP on land values and building heights. The first two columns restrict the sample to historical kampungs within KIP only, showing that one additional parcel per pixel is associated with a 1% decline in land values in KIP areas and a 0.005 reduced likelihood of tall buildings. These estimates imply that KIP's effect of raising the parcel count by 9 translates into a 9% decline in land values (75% of the overall 12% price effect) and a 0.045 lower likelihood of tall buildings (38% of the overall -0.12 effect on tall buildings).
The conclusions are the same if we do not restrict the analysis to observations within KIP only, as shown in the last two columns.
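The back-of-the-envelope calculation in the preceding paragraph can be reproduced with a few lines of arithmetic. The variable names are illustrative; the inputs are the rounded figures quoted in the text (9 extra parcels per pixel, per-parcel associations of -0.01 and -0.005, and overall KIP effects of -0.12 for both outcomes).

```python
# Inputs quoted in the text (rounded).
extra_parcels_in_kip = 9          # KIP effect on parcel count per pixel
land_value_per_parcel = -0.01     # log land values per additional parcel
tall_prob_per_parcel = -0.005     # probability of a tall building per additional parcel
overall_land_value_effect = -0.12
overall_tall_building_effect = -0.12

# Effect of KIP operating through fragmentation.
implied_land_value_effect = extra_parcels_in_kip * land_value_per_parcel
implied_tall_building_effect = extra_parcels_in_kip * tall_prob_per_parcel

# Share of the overall KIP effect accounted for by fragmentation.
share_land_values = implied_land_value_effect / overall_land_value_effect
share_tall_buildings = implied_tall_building_effect / overall_tall_building_effect

print(f"land values: {implied_land_value_effect:.3f} ({share_land_values:.0%} of overall effect)")
print(f"tall buildings: {implied_tall_building_effect:.3f} ({share_tall_buildings:.1%} of overall effect)")
```

The implied change for tall buildings is a change in probability of 0.045, that is, 37.5% of the overall effect, which the text rounds to 38%.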

Table A5: Fragmentation, land values, and building heights

Dependent variable:   Log land values   1(Height>3)   Log land values   1(Height>3)
                      (1)               (2)           (3)               (4)
Parcel count          -0.01***          -0.005***     -0.01***          -0.01***
                      (0.002)           (0.0007)      (0.001)           (0.0006)
N                     1207              2699          2225              5280
R-Squared             0.80              0.35          0.77              0.35
Distance              Y                 Y             Y                 Y
Topography            Y                 Y             Y                 Y
Within KIP            Y                 Y             N                 N
Geography FE          Locality          Locality      Locality          Locality

* 0.10 ** 0.05 *** 0.01

Notes: The unit of analysis is a sub-block in odd columns and a pixel in even columns. The estimation sample is restricted to KIP areas in the historical kampung sample in columns 1 and 2, and all areas in the historical kampung sample in columns 3 and 4.

Migration

Table A6 reports regressions at the individual level using the 2010 Population Census. Each observation is an individual residing in Jakarta in the 2010 Census. In column 1 the dependent variable is a dummy equal to 1 if the individual was not born in the same district of residence ("migrant by birthplace"). In column 2 the dependent variable is a dummy equal to 1 if the individual was not living in the same district of residence 5 years prior ("5-year migrant"). The dependent variable for columns 3 and 4 is years of schooling. Column 3 restricts the sample to migrants by birthplace, and column 4 restricts the sample to 5-year migrants. The KIP dummy is equal to 1 for individuals residing in a hamlet that is in KIP for the majority of its area. All columns include gender fixed effects, locality fixed effects, and 70 age fixed effects. Distance and topography controls are averaged at the hamlet level.

Here, we explore the concern that KIP areas have worse outcomes due to the endogenous in-sorting of negatively selected migrants. Instead, we find that KIP areas are less likely to have birth migrants and five-year migrants (columns 1 and 2) and the migrants have more education (columns 3 and 4).

Table A6: Migration

Dependent variable:       Migrant by     5-year       Years of     Years of
                          birthplace     migrant      schooling    schooling
                          (1)            (2)          (3)          (4)
KIP                       -0.02***       -0.01***     0.02         0.19***
                          (0.005)        (0.003)      (0.07)       (0.07)
N                         8621849        7861339      2785144      328807
R-Squared                 0.14           0.06         0.10         0.10
Distance                  Y              Y            Y            Y
Topography                Y              Y            Y            Y
Gender FE                 Y              Y            Y            Y
Age FE                    Y              Y            Y            Y
Migrant by birthplace     N              N            Y            N
5-year migrant            N              N            N            Y
Geography FE              Locality       Locality     Locality     Locality

* 0.10 ** 0.05 *** 0.01

Notes: Standard errors are clustered by locality.

Table A7 reports the effect of KIP on household size, mortality, and fertility, drawn from the 2010 Population Census. Column 1 is analogous to the population density specification of column 3 in Table 6, with the dependent variable being the average household size in a hamlet. The next two columns present individual-level specifications similar to Table A6, but the sample is restricted to ever-married women who are over 10 years old (column 2) and have ever had a live birth (column 3). We do not find systematically different patterns in KIP with respect to household size, mortality, and fertility.

Table A7: Differences in household size, mortality, and fertility

Dependent variable:   Household size   Number of child deaths     Number of children
                                       per 1000 live births
                      (1)              (2)                        (3)
KIP                   0.005            0.23                       0.02
                      (0.03)           (0.64)                     (0.01)
N                     2528             2010321                    2010321
R-Squared             0.34             0.03                       0.24
Distance              Y                Y                          Y
Topography            Y                Y                          Y
Age FE                Y                Y                          Y
Geography FE          Locality         Locality                   Locality

* 0.10 ** 0.05 *** 0.01

Notes: Standard errors are clustered by locality.

How can KIP affect fragmentation

Table A8 extends the main results on land fragmentation and population density (Table 6) and presents evidence of how KIP affects fragmentation. In column 1, we restrict the sample to pixels in the periphery, where we know that the KIP effects on land values and building heights are insignificant. At the periphery, the control areas are less likely to have formalized with assembled land. Importantly, we continue to find an effect of KIP on fragmentation with 8.4 more parcels per pixel in KIP neighborhoods. In column 2, we restrict the sample to pixels that are classified as kampungs as per our photo survey, with a similar estimate (9.5 more parcels per pixel). Columns 3 and 4 repeat this analysis at the hamlet level, with the log of population density as an outcome, finding evidence of higher population density in KIP areas in both subsamples. Taken together, these results suggest that KIP may have increased fragmentation and population density directly following the initial KIP improvements.

Next, we explore whether the design of KIP roads influences fragmentation. From the KIP policy maps we manually code the dummy "Grid roads" to be equal to 1 in KIP pixels where the roads paved as part of KIP form a grid, identified by the presence of 90-degree angles and straight lines. Column 5 replicates the historical sample specification and column 6 reports the full sample analysis. Around 8% of KIP pixels in the full sample and 3% in the historical sample have one grid-like network of roads. Pixels with and without grids are comparable in terms of observables. Since we have limited statistical power to detect within-hamlet effects, we only include locality fixed effects for the full sample analysis. We also interact the "Grid roads" indicator with the share of KIP vehicular roads over total KIP-provided roads in the pixel ("Vehicular"), which we interpret as a proxy for the permanence of the grid.
About 42% of KIP-provided roads are vehicular in the historical sample (48% in the full sample). The share of vehicular roads is de-meaned, so that the coefficient on the treatment indicator corresponds to the average treatment effect, evaluated for the average level of this share. Interestingly, the estimated effects show that grids tend to reduce the positive impact of KIP on fragmentation by 7.67 to 9.48 parcels per pixel, significantly offsetting the direct effect of KIP on fragmentation.
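To see why de-meaning an interacted variable changes the interpretation of the main coefficient, the sketch below fits the same interaction model twice on synthetic data, once with the raw share and once with the de-meaned share. Everything here is illustrative: the data are simulated, the model is a simplified stand-in (a treatment dummy interacted directly with a share), and the helper functions `solve` and `ols` are hypothetical, written in pure Python to keep the example self-contained.

```python
import random

def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * p for x, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def ols(y, rows):
    """OLS coefficients from the normal equations (X'X) beta = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)

# Synthetic data: the treatment effect varies with the share.
random.seed(0)
n = 500
kip = [random.randint(0, 1) for _ in range(n)]
share = [random.random() for _ in range(n)]
y = [1.0 + (2.0 + 3.0 * s) * k + 0.5 * s + random.gauss(0.0, 0.1)
     for k, s in zip(kip, share)]

mean_share = sum(share) / n

# Regressors: constant, treatment, share, treatment x share.
raw = ols(y, [[1.0, k, s, k * s] for k, s in zip(kip, share)])
dem = ols(y, [[1.0, k, s - mean_share, k * (s - mean_share)]
              for k, s in zip(kip, share)])

# With the raw share, the treatment coefficient is the effect at share = 0.
# With the de-meaned share, it is the effect at the average share.
effect_at_mean = raw[1] + raw[3] * mean_share
print(round(dem[1], 4), round(effect_at_mean, 4))
```

The two printed numbers coincide up to numerical error: de-meaning does not change the fit, only the point at which the coefficient on the treatment indicator is evaluated.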

Table A8: Heterogeneous effects for land fragmentation

Dependent variable:        Parcel count              Log population density    Parcel count
Sample:                    Peripheral   Informal     Peripheral   Informal     Historical   Full
                           sample       sample       sample       sample       kampung      sample
                           (1)          (2)          (3)          (4)          (5)          (6)
KIP                        8.40***      9.49***      0.40***      0.25***      8.85***      11.86***
                           (1.17)       (2.83)       (0.11)       (0.08)       (1.05)       (0.66)
Grid roads                                                                     -7.46**      2.25
                                                                               (3.42)       (2.09)
Grid roads x Vehicular                                                         -7.67*       -9.48***
                                                                               (4.34)       (2.67)
N                          29821        2307         844          762          11015        89461
R-Squared                  0.41         0.73         0.42         0.52         0.51         0.33
Distance                   Y            Y            Y            Y            Y            Y
Topography                 Y            Y            Y            Y            Y            Y
Geography FE               Hamlet       Hamlet       Locality     Locality     Locality     Locality

* 0.10 ** 0.05 *** 0.01

Notes: This table studies heterogeneous effects of KIP on parcel count and population density. Standard errors are clustered by locality.

Current amenities

Table A9 investigates differences in access to public amenities (columns 1 through 4). In particular, the outcomes are logs of the distance (from the middle of the pixel) to the nearest school, hospital, police station, and bus stop, all drawn from OpenStreetMap. These can be viewed as being part of the vector AP in the model. There are no effects for schools and police stations. The 7 percent effect for hospitals translates to 70 meters (small relative to a mean of 1000 meters), and the 31 percent for bus stops is equivalent to 258 meters (moderate relative to a mean of 835 meters). Notably, Panel B shows that the difference for bus stops reduces to 7% in the full sample. Since our main results are similar in the full sample, the difference in distance to bus stops cannot explain our main result. This corroborates the discussion in World Bank (1995) that KIP accelerated the provision of amenities in treated neighborhoods, but that non-KIP kampungs converged as a result of broader economic growth in Jakarta.

By contrast, columns 5 and 6 show that KIP areas have fewer formal commercial developments, such as retail and office developments (part of vector AF in the model). The dependent variables measure the share of each pixel that has retail activity or office developments, respectively, according to the administrative database on land use patterns (described in Section 3.7). Our estimates indicate KIP areas have 2 p.p. lower retail density and 4 p.p. lower office development density. This is in line with our findings of KIP neighborhoods having lower land values, shorter buildings, and being less formal.

Table A9: Access to current amenities

Dependent variable:   Log distance to                                    Density
                      School     Hospital   Police     Bus Stop          Retail      Office
                      (1)        (2)        (3)        (4)               (5)         (6)

Panel A: Historical kampung
KIP                   0.04       0.07*      0.01       0.31***           -0.02***    -0.04**
                      (0.06)     (0.04)     (0.04)     (0.06)            (0.00)      (0.02)
N                     11015      11015      11015      11015             11015       11015
R-Squared             0.24       0.58       0.70       0.55              0.15        0.26

Panel B: Full Sample
KIP                   -0.02      0.03**     0.01       0.07***           -0.01***    -0.02***
                      (0.03)     (0.01)     (0.02)     (0.02)            (0.00)      (0.01)
N                     89463      89463      89463      89463             89463       89463
R-Squared             0.51       0.78       0.87       0.84              0.25        0.39

Distance              Y          Y          Y          Y                 Y           Y
Topography            Y          Y          Y          Y                 Y           Y

* 0.10 ** 0.05 *** 0.01

Notes: The dependent variables are log of distance of a pixel's center to the nearest school, hospital, police station, and bus stop (columns 1 through 4), and share of retail (column 5) and office development within a pixel (column 6). The sample includes 11,015 pixels in the historical kampung (Panel A) and 89,463 pixels in the full sample (Panel B). Standard errors are clustered by locality.

Education

Table A10: Educational attainment

Dependent variable:   Junior Secondary   High School   College     Years of Schooling
                      (1)                (2)           (3)         (4)
KIP                   0.01**             0.01*         -0.004      0.07
                      (0.01)             (0.01)        (0.01)      (0.07)
N                     4920036            4920036       4920036     4920036
R-Squared             0.11               0.10          0.06        0.13
Distance              Y                  Y             Y           Y
Topography            Y                  Y             Y           Y
Gender FE             Y                  Y             Y           Y
Age FE                Y                  Y             Y           Y
Geography FE          Locality           Locality      Locality    Locality

* 0.10 ** 0.05 *** 0.01

Notes: This table reports individual level regressions using the 2010 Population Census, with educational attainment dummies (columns 1 through 3) and years of schooling (column 4) as the dependent variables. The sample is restricted to individuals above age 25. All columns include fixed effects for gender, age, and locality. Standard errors are clustered by locality.

Spatial decay

Table A11 estimates spatial decay patterns by exploring heterogeneous effects by distance bands. We repeat the same specifications as Table 3, but the omitted group is now the treatment group and the key regressors include indicators for areas outside of KIP and within 100 meter distance bands. We start with 0 to 100 meters, and include up to 10 distance bands (900 to 1000 meters). We drop observations beyond 1000 meters. The average locality has an area of 2.5 square kilometers, implying an equivalent area radius of 892 meters. Column 1 includes the historical kampung sample with locality fixed effects. In column 2, we include the full sample and add hamlet fixed effects. The coefficients beyond the 500 meter band are not significant because there is little within-hamlet variation beyond 500 meters. The average hamlet has an area of 243,156 square meters, implying an equivalent area radius of 278 meters. All columns indicate no significant spatial decay pattern. In both columns, the 95% confidence intervals are quite stable and overlap with each other.
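The equivalent area radius quoted above is the radius of a circle with the same area as the unit in question, r = sqrt(A / pi). A minimal check of the two figures, using a hypothetical helper name:

```python
import math

def equivalent_area_radius(area_m2):
    """Radius in meters of a circle whose area equals area_m2 (square meters)."""
    return math.sqrt(area_m2 / math.pi)

radius_2_5_km2 = equivalent_area_radius(2.5e6)    # 2.5 square kilometers
radius_hamlet = equivalent_area_radius(243_156)   # average hamlet area

print(round(radius_2_5_km2))  # 892
print(round(radius_hamlet))   # 278
```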

Table A11: Effect of KIP on land values: spatial decay

Dependent variable:              Log land values
Sample:                          Historical kampung   Full Sample
Distance to KIP boundary (m)     (1)                  (2)
[1, 100]                         0.15***              0.10***
                                 (0.05)               (0.04)
[101, 200]                       0.07                 0.14***
                                 (0.06)               (0.05)
[201, 300]                       0.06                 0.13***
                                 (0.10)               (0.05)
[301, 400]                       0.15*                0.15***
                                 (0.09)               (0.06)
[401, 500]                       0.26**               0.14**
                                 (0.10)               (0.06)
[501, 600]                       0.18**               0.08
                                 (0.08)               (0.07)
[601, 700]                       0.22*                0.04
                                 (0.12)               (0.07)
[701, 800]                       0.04                 0.05
                                 (0.12)               (0.08)
[801, 900]                       -0.03                0.07
                                 (0.12)               (0.08)
[901, 1000]                      0.12                 0.04
                                 (0.17)               (0.07)
N                                2886                 14185
R-squared                        0.75                 0.86
Distance                         Y                    Y
Topography                       Y                    Y
Geography FE                     Locality             Hamlet

* 0.10 ** 0.05 *** 0.01

Notes: This table replicates the assessed land values analysis of Table 3, but the omitted group is the treatment group and the key regressors are dummies corresponding to different distance bands; each dummy equals 1 if the observation is not treated and within a given 100 meter distance band from KIP boundaries. Observations beyond 1000 meters are excluded. Standard errors are clustered by locality.
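The distance-band regressors described in the notes to Table A11 can be sketched as a simple binning rule. The function below is a hypothetical illustration, not the authors' code; in particular, how distances that fall between two integer band edges (for example 100.5 meters) are assigned, and the choice to keep all KIP observations regardless of distance, are assumptions.

```python
import math

def distance_band(distance_m, in_kip, width=100, max_distance=1000):
    """Label the 100-meter band of a non-KIP observation.

    Returns "KIP (omitted)" for treated observations, None for non-KIP
    observations beyond max_distance (excluded from the sample), and a
    label such as "[101, 200]" otherwise.
    """
    if in_kip:
        return "KIP (omitted)"
    if distance_m > max_distance:
        return None
    # Assumption: a band includes its upper edge, so 100 m falls in [1, 100].
    upper = max(1, math.ceil(distance_m / width)) * width
    return f"[{upper - width + 1}, {upper}]"

examples = [(50, False), (100, False), (150, False), (950, False), (1200, False), (30, True)]
labels = [distance_band(d, k) for d, k in examples]
print(labels)
```

Each label then corresponds to one dummy variable in the regression, with KIP observations serving as the omitted group.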

Robustness for boundary analysis

Another potential concern with the boundary discontinuity exercise is that we may be picking up differences across administrative units, to the extent that KIP boundaries coincide with administrative boundaries. In Table A12 we mitigate this concern by showing that the effects of KIP at the boundary are similar when considering variation within the same administrative unit, as we control for locality fixed effects within a 500 meter distance band.

Next, Table A13 shows that the estimates are similar across different buffer distances ranging from 150 meters (the optimal bandwidth for land values as per Calonico et al. (2014)) to 500 meters. Once again, we find similar effect sizes with KIP areas having 14 to 16% lower land values and a likelihood of having tall buildings that is between 9 and 10 p.p. lower. One concern that arises with the boundary discontinuity sample is contamination or spillovers between treatment and control areas. The stability of the estimates across the buffer distances suggests limited evidence of spatial externalities due to KIP, echoing our findings above on a lack of spatial decay patterns.

Table A12: Boundary analysis with locality fixed effects

Sample:                     BDD 500m
Dependent variable:         Log land values            1(Height>3)
                            (1)          (2)           (3)          (4)
KIP                         -0.15***     -0.13**       -0.10***     -0.09***
                            (0.05)       (0.06)        (0.02)       (0.02)
N                           2825         2755          3586         3586
R-Squared                   0.81         0.86          0.26         0.33
Distance                    Y            Y             Y            Y
Topography                  Y            Y             Y            Y
Distance to KIP boundary    Y            Y             Y            Y
KIP Boundary FE             Y            Y             Y            Y
Locality FE                 N            Y             N            Y

* 0.10 ** 0.05 *** 0.01

Notes: This table reports specifications analogous to the boundary analysis in Table 3. Standard errors are clustered by KIP boundary.

Table A13: Boundary analysis, robustness to different distance bands

Dependent variable:         Log land values                       1(Height>3)
Sample:                     BDD         BDD         BDD           BDD         BDD         BDD
                            150m        300m        500m          150m        300m        500m
                            (1)         (2)         (3)           (4)         (5)         (6)
KIP                         -0.14**     -0.16***    -0.15***      -0.09**     -0.09***    -0.10***
                            (0.06)      (0.05)      (0.05)        (0.03)      (0.02)      (0.02)
N                           985         2044        2825          968         2601        3586
R-Squared                   0.81        0.81        0.81          0.41        0.29        0.26
Distance                    Y           Y           Y             Y           Y           Y
Topography                  Y           Y           Y             Y           Y           Y
Distance to KIP boundary    Y           Y           Y             Y           Y           Y

* 0.10 ** 0.05 *** 0.01

Notes: This table reports specifications analogous to the boundary analysis in Table 3. Standard errors are clustered by KIP boundary.

Other robustness checks: Education for stayers

Table A14: Educational attainment for stayers

Dependent variable:           Junior Secondary   High School   College     Years of Schooling
                              (1)                (2)           (3)         (4)
KIP                           0.01**             0.02**        -0.002      0.09
                              (0.01)             (0.01)        (0.01)      (0.07)
N                             2134892            2134892       2134892     2134892
R-Squared                     0.22               0.20          0.08        0.25
Distance                      Y                  Y             Y           Y
Topography                    Y                  Y             Y           Y
Gender FE                     Y                  Y             Y           Y
Age FE                        Y                  Y             Y           Y
Born in the same district     Y                  Y             Y           Y
Geography FE                  Locality           Locality      Locality    Locality

* 0.10 ** 0.05 *** 0.01

Notes: This table reports specifications similar to those in Table A10, but restricting the sample to individuals above age 25 born in the same district of residence. Standard errors are clustered by locality.

Other robustness checks: Selection bias for building heights

Below, we consider selection bias arising from development activity, stemming from the fact that the potential for building tall buildings depends on zoning regulations and market potential. In Table A15, we show that the results for building heights are similar if we drop pixels with no buildings (columns 1 and 2), restrict the sample to pixels in places that are zoned for commercial developments, based on digital zoning maps provided by the Jakarta City Government (columns 3 and 4), or restrict the sample to pixels that are within 1000 meters of a pre-determined historical main road, as a proxy of market potential (columns 5 and 6). Our results are similar if we include all the observations but add controls for zoning and for being close to historical roads.

Table A15: Selection for building heights

Dependent variable:                       1(Height>3)
Sample:                                   Historical   BDD            Historical   BDD            Historical   BDD
                                          kampung      200m           kampung      200m           kampung      200m
                                          (1)          (2)            (3)          (4)            (5)          (6)
KIP                                       -0.12***     -0.09***       -0.15***     -0.09          -0.13***     -0.11***
                                          (0.02)       (0.03)         (0.04)       (0.08)         (0.02)       (0.04)
N                                         5084         1419           857          278            3620         955
R-Squared                                 0.29         0.37           0.43         0.59           0.31         0.40
Distance                                  Y            Y              Y            Y              Y            Y
Topography                                Y            Y              Y            Y              Y            Y
Distance to KIP Boundary                  N            Y              N            Y              N            Y
Exclude no building pixels                Y            Y              N            N              N            N
Only pixels zoned for services            N            N              Y            Y              N            N
Only pixels near predetermined roads      N            N              N            N              Y            Y
Geography FE                              Locality     KIP Boundary   Locality     KIP Boundary   Locality     KIP Boundary

* 0.10 ** 0.05 *** 0.01

Notes: This table reports specifications similar to the heights analysis in Table 3. Standard errors are clustered by locality in odd columns and by KIP boundary in even columns.

Other robustness checks: Selection bias for assessed land values

One concern is that KIP areas are more likely to be informal today, and property data for informal settlements are less likely to be reported. Table A16 investigates whether KIP areas are less likely to be represented in the assessed values dataset. The unit of analysis is a pixel and the dependent variable is whether we observe an assessed value for the pixel. In contrast with concerns that KIP areas are less likely to be included in the assessed values database, we actually find a positive coefficient on the treatment dummy. Column 1 includes the full sample with hamlet fixed effects and column 2 restricts the sample to historical kampungs only, with locality fixed effects. Standard errors are clustered by locality.

Table A16: Selection for assessed land values

Dependent variable: 1(Has assessed values)

| Sample: | Full sample | Historical kampung |
|---|---|---|
| | (1) | (2) |
| KIP | 0.03*** | 0.04*** |
| | (0.00) | (0.01) |
| N | 89463 | 11015 |
| R-Squared | 0.09 | 0.08 |
| Distance | Y | Y |
| Topography | Y | Y |
| Geography FE | Hamlet | Locality |

* 0.10 ** 0.05 *** 0.01

Other robustness checks: Historical land institutions

In Table A17, we drop all hamlets that have any Dutch settlement areas (identified from our historical maps) since these places are more likely to be formal and high quality, given that these settlements have formal titles. Columns 1 and 2 show that our land values result remains the same even after dropping hamlets with Dutch settlements. Columns 3 and 4 present similarly robust findings for building heights, which also speaks to the concern of selection of development activity since these places are more likely to have high market potential.

Table A17: Robustness to excluding Dutch areas

| Dependent variable: | Log land values | Log land values | 1(Height>3) | 1(Height>3) |
|---|---|---|---|---|
| Sample: | Historical kampung | BDD 200m | Historical kampung | BDD 200m |
| | (1) | (2) | (3) | (4) |
| KIP | -0.13*** | -0.17*** | -0.12*** | -0.08*** |
| | (0.05) | (0.06) | (0.02) | (0.03) |
| N | 1886 | 972 | 5243 | 1417 |
| R-Squared | 0.72 | 0.80 | 0.29 | 0.37 |
| Distance | Y | Y | Y | Y |
| Topography | Y | Y | Y | Y |
| Distance to KIP boundary | N | Y | N | Y |
| Exclude hamlets with Dutch settlements | Y | Y | Y | Y |
| Geography FE | Locality | KIP Boundary | Locality | KIP Boundary |

* 0.10 ** 0.05 *** 0.01

Notes: This table reports specifications analogous to those in Table 3, excluding Dutch settlements indicated in our historical maps. Standard errors are clustered by locality except for the boundary analysis where we cluster by KIP boundary.

Other robustness checks: Dropping controls that are not pre-determined

Table A18 replicates our main results (Table 3), but drops the three controls that are not pre-determined. These controls include the distance to the stock exchange, the distance to waterways, and flood-proneness (a measure based on the frequency of recent floods). All are drawn from OpenStreetMap, a crowd-sourced database. The conclusions remain the same.

Table A18: Robustness to predetermined controls only

| Dependent variable: | Log land values | Log land values | 1(Height>3) | 1(Height>3) |
|---|---|---|---|---|
| Sample: | Historical kampung | BDD 200m | Historical kampung | BDD 200m |
| | (1) | (2) | (3) | (4) |
| KIP | -0.13** | -0.16*** | -0.12*** | -0.09*** |
| | (0.05) | (0.06) | (0.02) | (0.03) |
| N | 3147 | 1345 | 5280 | 1452 |
| R-Squared | 0.73 | 0.80 | 0.28 | 0.36 |
| Pre-determined Distance | Y | Y | Y | Y |
| Pre-determined Topography | Y | Y | Y | Y |
| Distance to KIP boundary | N | Y | N | Y |
| Geography FE | Locality | KIP Boundary | Locality | KIP Boundary |

* 0.10 ** 0.05 *** 0.01

Notes: This table reports specifications similar to those in Table 3. Standard errors are clustered by locality except for the boundary analysis where we cluster by KIP boundary.

Other robustness checks: Standard errors

In Table A19, we demonstrate the robustness of our inference to alternative standard error specifications. We replicate the specifications in Table 3 and report the p-values corresponding to the KIP treatment effect. In our baseline historical kampung sample specification we cluster standard errors by locality, which has an equivalent-area radius of approximately 900 meters (column 1). In column 2, we consider a coarser level of clustering, the sub-district. There are 45 sub-districts in Jakarta, with an average area of approximately 15 square kilometers and radius of 2 kilometers. In columns 3 through 5, we employ the Conley (1999) GMM approach, allowing for arbitrary spatial correlation between observations within 700 meters, 900 meters (similar to the radius of our baseline clusters), and 1200 meters. P-values are very similar to our baseline ones. In addition, our boundary discontinuity inference - where we cluster by KIP boundary at baseline - is also robust to clustering standard errors at the sub-district level, with p-values of respectively 0.02 and 0.08.
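As a rough check on the orders of magnitude quoted above, the following worked arithmetic assumes that "equivalent-area radius" means the radius of a circle with the same area as the administrative unit (an interpretation, not something the text states explicitly). The locality area in the last line is only what this assumption would imply; it is not a reported figure.

$$
r = \sqrt{\frac{\text{area}}{\pi}}, \qquad
r_{\text{sub-district}} \approx \sqrt{\frac{15\ \text{km}^2}{\pi}} \approx 2.19\ \text{km}, \qquad
\text{area}_{\text{locality}} \approx \pi \times (0.9\ \text{km})^2 \approx 2.54\ \text{km}^2 .
$$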

Table A19: Effect of KIP on land values, building heights, and informality, standard errors robustness

P-values of ATE

| Dependent variable | Cluster: locality | Cluster: sub-district | Conley, 700m cutoff | Conley, 900m cutoff | Conley, 1200m cutoff |
|---|---|---|---|---|---|
| | (1) | (2) | (3) | (4) | (5) |
| Log Land Values | 0.01 | 0.00 | 0.00 | 0.01 | 0.01 |
| 1(Height>3) | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |

* 0.10 ** 0.05 *** 0.01

Notes: This table reports p-values for the significance of the treatment indicator. Each row corresponds to a regression. We replicate the specifications of Table 3, columns 1 and 3. In column 1 standard errors are clustered at the locality level (baseline). In column 2 we cluster at the sub-district level. In columns 3 through 5 we employ Conley (1999) standard errors with cutoffs of 700 meters, 900 meters, and 1200 meters respectively.
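To illustrate what a distance cutoff does in spatially robust inference, the sketch below computes a sandwich variance for a single OLS slope, letting residuals be correlated between any two observations within the cutoff. It is a simplified illustration and not the paper's implementation: it assumes one regressor that has already been demeaned (other controls partialled out), a uniform kernel (weight 1 within the cutoff, 0 beyond), and planar coordinates in meters. The function name and toy data are hypothetical.

```python
import math

def conley_slope_variance(x, resid, coords, cutoff_m):
    """Sandwich variance of the OLS slope in y = b*x + e.

    x        : demeaned regressor values
    resid    : OLS residuals
    coords   : (easting, northing) pairs in meters
    cutoff_m : pairs at most this far apart may be correlated
    Assumption: uniform kernel, i.e. weight 1 if distance <= cutoff_m, else 0.
    """
    sxx = sum(xi * xi for xi in x)
    meat = 0.0
    for i in range(len(x)):
        for j in range(len(x)):
            dist = math.hypot(coords[i][0] - coords[j][0],
                              coords[i][1] - coords[j][1])
            if dist <= cutoff_m:
                meat += x[i] * resid[i] * x[j] * resid[j]
    return meat / (sxx * sxx)

# Toy data: four observations spaced 1000 meters apart on a line.
x = [1.0, -1.0, 2.0, -2.0]
resid = [0.5, -0.2, 0.1, 0.3]
coords = [(0.0, 0.0), (1000.0, 0.0), (2000.0, 0.0), (3000.0, 0.0)]

v_no_neighbors = conley_slope_variance(x, resid, coords, 0.0)     # only i == j terms
v_all_pairs = conley_slope_variance(x, resid, coords, 5000.0)     # every pair counted
```

With a cutoff of zero only the own-observation terms survive, so the formula collapses to the usual heteroskedasticity-robust variance. Wider cutoffs add cross-products between neighbors. A uniform kernel is not guaranteed to yield a positive estimate in finite samples, which is one reason tapered kernels are often used instead.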

Appendix Figures

Figure A1: Kampungs in Jakarta, before and after KIP (Darrundono, 2012)

Figure A2: Map of KIP waves

Notes: Map showing areas treated as part of the 3 KIP Pelita waves. Red, blue and green areas were respectively exposed to KIP wave I, II, and III.

Figure A3: Map of KIP boundary segments

Notes: Map showing KIP boundary segments selected for the boundary discontinuity design.

Figure A4: Policy maps: KIP assets

Notes: Map showing KIP assets. Dotted lines indicate boundaries of KIP areas, with different colors corresponding to different Pelita waves. Solid lines indicate vehicular roads (in pink), footpaths (yellow), and canals (blue). Dots denote public buildings.

Figure A5: Map of the assessed land values database

Notes: Map showing the coverage of the assessed land values database throughout Jakarta. Each shaded polygon corresponds to a sub-block. Thick boundaries correspond to localities. Light boundaries correspond to hamlets.

Figure A6: Scatterplot of assessed land values and transaction prices

[Scatterplot: ln(assessed value, Rp. per sq. m) on the vertical axis against ln(transaction price, Rp. per sq. m) on the horizontal axis, both ranging from 14 to 19. Correlation = 0.56.]

Notes: This figure correlates the log of assessed land values and the log of transaction prices from www.brickz.id. Each point represents values averaged at the hamlet level.

Figure A7: Map of photographic survey locations

Notes: Map showing the 7,104 pixels sampled for the photographic survey.

Figure A8: Examples of coding of the rank-based informality index

Figure A9: Example of cadastral map of land parcels

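Notes: Cadastral map for one sub-district. The solid red boundaries indicate KIP treated areas.

Figure A10: Spatial decay: Distance from high-density hamlet boundaries

[Plot: estimated coefficients (vertical axis, ticks from -.1 to .3) against distance from the hamlet boundary (horizontal axis, 200 to 1000 meters).]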

Notes: We investigate the spatial decay pattern of land values away from the boundaries of 45 high-density, informal, and non-KIP hamlets. Specifically, these hamlets do not belong to KIP, have population density above the median, and are informal as per the kampung indicator. The figure estimates spatial decay patterns by exploring heterogeneous effects by distance bands. The omitted group is areas that are inside the 45 hamlets, and the key regressors include indicators for areas outside of the hamlet boundaries and within 200 meter distance bands. We start with 0 to 200 meters, and include up to 5 distance bands (800 to 1000 meters). We drop observations beyond 1000 meters and keep only non-KIP observations that are within 1000 meters. We include boundary pair fixed effects for the 45 hamlets, assigning all observations that are within 1000 meters from the closest hamlet boundary to belong to the same hamlet. Each point in the figure corresponds to a coefficient and 95% confidence interval for the distance-band regressors. We also compare the coefficient for the 0-200 meter distance band with the coefficient on KIP in Table 3, column 2. We reject the null hypothesis that these two coefficients are the same at the 1% level.
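The distance-band regressors described in these notes could be constructed along the following lines. This is a minimal sketch, not the authors' code: the function name is hypothetical, and the treatment of observations lying exactly on a band edge (assigned to the outer band, except at 1000 meters) is an assumption.

```python
def band_dummies(inside_hamlet, dist_m, width_m=200.0, n_bands=5):
    """Return the five distance-band indicators for one observation.

    inside_hamlet : True if the observation lies inside one of the hamlets
                    (the omitted group, so all indicators are 0)
    dist_m        : distance in meters to the closest hamlet boundary
    Returns None for observations beyond width_m * n_bands (1000 meters),
    which are dropped from the sample.
    """
    if inside_hamlet:
        return [0] * n_bands
    if dist_m > width_m * n_bands:
        return None
    # Band 0 is 0-200 m, band 4 is 800-1000 m; edge handling is an assumption.
    k = min(int(dist_m // width_m), n_bands - 1)
    dummies = [0] * n_bands
    dummies[k] = 1
    return dummies

inside = band_dummies(True, 50.0)
near = band_dummies(False, 150.0)
far = band_dummies(False, 999.0)
dropped = band_dummies(False, 1200.0)
```

Each retained observation outside the hamlets switches on exactly one indicator, so each coefficient is read relative to the omitted inside-hamlet group.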

Data Appendix

Program boundaries

Our source for KIP program boundaries is the Jakarta Department of Housing (DPGP, 2011) publication discussed in Section 3.1. We employ this data source both to create our main explanatory variable - a binary indicator for whether a sub-block or a pixel falls within a KIP treated area - and for our boundary discontinuity exercise. Selecting boundary segments for the latter exercise involves an additional data processing step. When we restrict the sample to areas within 500 meters of KIP boundaries, some of the observations classified as control - i.e. on the non-KIP side - relative to one particular boundary segment may fall within the KIP side of a nearby boundary segment. We exclude contaminated units by manually selecting a subset of “clean” boundary segments, following Turner et al. (2014). These boundary segments are displayed in Figure A3. We follow a similar procedure to select clean placebo boundaries for the test discussed in Section 8.

Market transactions

We compare assessed land values with real estate transaction prices scraped from the Brickz Indonesia website (www.brickz.id). Brickz has been collecting data on property sales since January 2015; sales are reported for properties advertised on the Rumah123 website (www.rumah123.com), an online property portal advertising sales and rentals. We scraped all the data available for sales of apartments and houses in Jakarta as of October 2016. For each entry, Brickz reports number of rooms, square footage, sale price and a street address, with varying precision. By a combination of Google API and manual search, we were able to geocode about 3800 entries at the street and street number level. In order to compare these data with assessed land values, we average transacted prices per square foot at the hamlet level.

Photographic survey

Sampling procedure

In order to construct a representative sample of locations, we start from our grid of 75-meter pixels and sample from the two subsamples that we consider for our main specifications (detailed in Section 5): the historical kampung sample and the boundary discontinuity sample, comprising areas within a 500 meter distance from the KIP boundaries used in the analyses. We select a random sample of 5000 pixels from the historical kampung sample and 2500 pixels from the discontinuity sample. We stratify by terciles of distance from the National Monument within each subsample, to ensure we have a broad spatial distribution. The proportions of pixels in the first, second and third distance terciles are respectively 50%, 40% and 10%. Within each distance stratum, we draw half of the observations from KIP and half from non-KIP areas. The proportion of KIP and non-KIP pixels in the original samples is comparable - about 45% KIP and 55% non-KIP. Because there is some overlap between the two subsamples, in the end we have 7104 observations in total, of which 5280 are in the historical kampung sample and 1824 are in the 500 meter discontinuity sample. Figure A7 shows our sampled pixels.

Photographs

For each pixel, we first draw imagery from Google Street View. The Street View imagery was collected mostly in 2015-2017 (about 10% were collected in 2013). All locations for which Google could not return imagery were covered by our field enumerators. We provided them with latitude and longitude of the locations to be surveyed (corresponding to the center of pixels in the sample) and instructed them to take four photographs from as close as possible to the exact coordinate points. We showed them photos from Google Street View as examples. They were asked to report GPS coordinates of the photos they took, so that the accuracy of the location could be verified. In a few cases the enumerators could not reach the exact location due to buildings, walls, or roads blocking the access. In these cases, we used the closest available photos from Google Street View. Our results are robust to dropping those locations.
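The stratified draw described under the sampling procedure can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the function name and the pixel fields `dist` (distance from the National Monument) and `kip` are hypothetical, terciles are formed by rank, and the toy grid is synthetic.

```python
import random

def stratified_sample(pixels, n_total, shares=(0.5, 0.4, 0.1), seed=0):
    """Draw n_total pixels: 50/40/10% across distance terciles, and within
    each tercile half from KIP and half from non-KIP pixels.

    pixels : list of dicts with hypothetical keys "dist" and "kip"
    Raises ValueError (from random.sample) if a cell has too few pixels.
    """
    rng = random.Random(seed)
    ordered = sorted(pixels, key=lambda p: p["dist"])
    n = len(ordered)
    sample = []
    for t, share in enumerate(shares):
        # Rank-based tercile of distance (assumption on how terciles are cut).
        stratum = ordered[t * n // 3:(t + 1) * n // 3]
        per_group = round(n_total * share / 2)
        for kip_status in (True, False):
            group = [p for p in stratum if p["kip"] == kip_status]
            sample.extend(rng.sample(group, per_group))
    return sample

# Synthetic grid for illustration only: 3000 pixels, alternating KIP status.
toy_pixels = [{"dist": float(i), "kip": i % 2 == 0} for i in range(3000)]
toy_sample = stratified_sample(toy_pixels, 100)
```

Fixing the seed makes the draw reproducible. Oversampling the nearest tercile relative to its one-third population share is what produces the 50/40/10 split.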

Building height

To measure height at each surveyed pixel, we instructed our research assistants to count the number of floors of the tallest building within the pixel, as seen in the photos corresponding to each location. Occasionally, there would be tall buildings visible in the photos but located outside of the sampled 75-meter pixel, in which case they were not considered in the ranking. When uncertain, research assistants referred to Google Maps, where they would locate the dubious building and calculate the distance from the sampled location, thus determining whether it fell within or outside the pixel. For locations surveyed in the field, enumerators were instructed to consider only buildings within 75 meters of the location and provided with a rule of thumb of a maximum distance of 50 steps. For particularly tall buildings, the total number of floors was either not visible in the photo or not easy to count from the ground. In these cases, the dubious building was identified through Google Maps and the information on number of floors was either found on the building’s website or obtained by contacting the leasing office or concierge. For field locations with buildings of dubious height, enumerators entered the building and checked the number of floors from elevators.

Rank-based informality index

We trained two research assistants, both from Jakarta, to independently rank photos on a scale ranging from 0 to 4. We performed an initial calibration of the index on a subset of photos and provided our research assistants with sample photos belonging to each category, explaining the characteristics that determined our assessment. When ranking locations, we instructed them to consider holistically the following aspects: width, paving, and condition of roads; density of structures; regularity of building heights; overall cleanliness of the neighborhood, including presence of rust, garbage, low-hanging electrical wires; quality and durability of building materials; irregularity of structures and presence of setbacks; size and quality of windows and doors. We also instructed our research assistants to focus on the physical appearance of the built environment and not on the activities of people, nor on the assets (such as parked cars) that may be visible in the photos.

Attribute-based informality index

Our attribute-based slum index is based on the coding of fifteen attributes detailed below:

• Access:
  1. Is the location accessible by a four-wheeler (based on the width of roads/pathways): 1 = no
  2. Presence of paved ground / road / footpath / access way: 1 = no
  3. Presence of unpaved ground / road / footpath / access way: 1 = yes
  4. Presence of damage to the pavement (e.g. sitting water, potholes) or incomplete paving: 1 = yes
  5. Presence of green space (e.g. garden, orchard): 1 = no. The latter attribute captures the fact that green space is not paved but does not imply a lack of accessibility.
• Neighborhood appearance:
  6. Presence of wires at building level: 1 = no
  7. Presence of drainage canals: 1 = no
  8. Presence of trash (uncollected garbage): 1 = yes
• Permanence of structures:
  9. Presence of unfinished buildings (e.g. without roof, partially finished upper floors): 1 = yes
  10. Presence of permanent (concrete or brick) and finished (painted) walls: 1 = no
  11. Presence of permanent but unfinished walls: 1 = yes
  12. Presence of non-permanent walls (wood or zinc): 1 = yes
  13. Presence of damaged wall (graffiti, peeling paint, holes): 1 = yes
  14. Presence of permanent fences: 1 = no
  15. Presence of rust: 1 = yes

We standardize each attribute as a z-score and then average them in an index applying equal weights.
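The standardize-then-average step can be sketched as below. This is a minimal illustration rather than the authors' code: the function name is hypothetical, the population standard deviation is an assumption (the text does not say which variant is used), and the toy example uses three attributes instead of fifteen.

```python
from statistics import mean, pstdev

def attribute_index(rows):
    """Equal-weight average of attribute z-scores, one value per location.

    rows : list of equal-length lists of 0/1 codes (1 = more informal).
    Assumptions: population standard deviation; an attribute with no
    variation contributes a z-score of 0 for every location.
    """
    n_attr = len(rows[0])
    z_cols = []
    for k in range(n_attr):
        col = [r[k] for r in rows]
        m, s = mean(col), pstdev(col)
        z_cols.append([(v - m) / s if s > 0 else 0.0 for v in col])
    return [mean(z[i] for z in z_cols) for i in range(len(rows))]

# Four hypothetical locations coded on three attributes.
codes = [[1, 1, 1],
         [0, 0, 0],
         [1, 0, 1],
         [0, 1, 0]]
index = attribute_index(codes)
```

Because every attribute is centered before averaging, the index has mean zero across locations and higher values indicate a more informal appearance.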

Control variables

Our baseline set of distance controls includes distance, in logs, from the headquarters of the Indonesia Stock Exchange, built in 1994 in the Golden Triangle (see Section 2), from the 1961 National Monument in Merdeka Square, and from the main road arteries reported in the 1959 historical map. Our topography controls include slope and elevation, computed based on the Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER) Global Digital Elevation Model (NASA and METI, 2011), with a resolution of 30 meters. We include four measures of local hydrology. The first is a binary indicator for whether a hamlet is classified as “flood-prone” in OpenStreetMap, a collaborative mapping project which provides crowd-sourced maps of the world and is continuously updated. Our second measure is the distance, in logs, from the nearest permanent or semi-permanent water body. We detect water bodies drawing upon the European Commission Joint Research Centre’s Global Surface Water Dataset (Pekel et al., 2016), a global 30 meter resolution raster map reporting the occurrence of water bodies from 1984 to October 2015. We consider pixels corresponding to water for at least 50% of the sample period. Finally, we control for distance from waterways (from OpenStreetMap) and from the coast.

Other variables

Boundaries of sub-city administrative divisions (localities and hamlets), as well as the location of schools, hospitals, police stations, bus stops, and roads are all drawn from OpenStreetMap.

Theoretical Appendix

Below we present our model of spatial equilibrium across neighborhoods (Rosen, 1979; Roback, 1982). The building blocks of the model are housing supply, housing demand, and spatial equilibrium. The key endogenous variables are land values ($p$) and building heights ($h$), which we characterize in formal ($F$) and informal ($K$) neighborhoods as a function of local amenities ($A$).

Housing supply

Competitive developers produce housing space $H$ in a given location using land $l$ and building height $h$, where $H = l \cdot h$. Construction costs are convex in height, but in order to build above a certain height threshold $\bar{h}$ there is an additional location-specific fixed cost $c$ that developers need to incur. The cost of building at height $h$ over a land surface $l$ is

$$C(h,l) = \left( h^{\delta} + c\,I[h>\bar{h}] \right) l \quad \text{with } \delta > 1 \text{ and } c > 0. \tag{B.1}$$

Buildings with a height below and above $\bar{h}$ can be thought of as “kampung” and “formal” respectively. Assuming all locations have the same fixed amount of land normalized to 1 and denoting the location-specific price of housing per square meter as $p$, the developers’ profit maximization problem for a given location reads:

$$\max_{h} \pi(h), \qquad \pi(h) := \left\{ ph - h^{\delta} - c\,I[h>\bar{h}] \right\}. \tag{B.2}$$

Define $h^{*}(p) := \arg\max_{h} \{ ph - h^{\delta} \} = \left( \frac{p}{\delta} \right)^{\frac{1}{\delta-1}}$. We assume that $\bar{h} < h^{*}(p)$, so that developers will choose to formalize when the fixed cost $c$ is low enough.46 Developers choose to formalize, i.e. supply at a height $h^{*}(p)$ above $\bar{h}$, if

$$p\bar{h} - \bar{h}^{\delta} < p h^{*} - (h^{*})^{\delta} - c. \tag{B.3}$$

Otherwise, they choose the corner solution $\bar{h}$. This can be shown graphically in Figure B.1, in which we depict two examples of profit functions $\pi(h)$ for given $p$. For a fixed cost of $c$, the profit function is represented by the blue solid line and the profit-maximizing $h$ is $h^{*}(p)$. For a fixed cost $c' > c$, the profit function is represented by the union of the solid blue line for $h < \bar{h}$ and the thick green line for $h > \bar{h}$. In this case the profit-maximizing $h$ is $\bar{h}$. In the data, we observe bunching at a height of 2 floors in kampungs, which is consistent with our prediction of a corner solution. Equation B.3 simplifies to

$$\bar{h}p - \bar{h}^{\delta} + c \le A p^{B} \tag{B.4}$$

where $A := \frac{\delta-1}{\delta^{\delta/(\delta-1)}} > 0$ and $B := \frac{\delta}{\delta-1} > 1$. Provided that $\bar{h} < c^{1/\delta}$, the inequality in B.4 is satisfied for $p$ greater than a given threshold $\bar{P}(c)$ that is increasing in $c$. This can be shown graphically in Figure B.2, where the shaded region on the $p$ axis represents the range of prices for which developers do not formalize. Housing supply in a given location is thus given by

$$h^{S} = \begin{cases} \bar{h} & \text{if } p \le \bar{P}(c) \\ \left( \frac{p}{\delta} \right)^{\frac{1}{\delta-1}} & \text{if } p > \bar{P}(c) \end{cases} \tag{B.5}$$

46 Without making this assumption, there would be a region in which the neighborhood remains informal regardless of the extent of the fixed cost $c$. Since our focus is on the transition of neighborhoods from informal to formal, we rule out this case.

Figure B.1: Profit-maximizing h

where the threshold $\bar{P}$ is increasing in $c$. Intuitively, when formalization costs are high, there is a narrower range of prices for which developers are willing to formalize. Note that, because the land area in each location is normalized to 1, $h^{S}$ is also equal to the total supply of housing space.
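The supply side can be made concrete with a small numerical sketch. It is an illustration under stated assumptions, not part of the model's derivation: function names and parameter values are hypothetical, and the threshold $\bar{P}(c)$ is found by bisection on $A p^{B} - (\bar{h}p - \bar{h}^{\delta} + c)$, where $A = (\delta-1)\,\delta^{-\delta/(\delta-1)}$ and $B = \delta/(\delta-1)$ follow from substituting $h^{*}(p) = (p/\delta)^{1/(\delta-1)}$ into the profit function.

```python
def h_star(p, delta):
    """Interior optimum of p*h - h**delta."""
    return (p / delta) ** (1.0 / (delta - 1.0))

def formalizes(p, delta, c, h_bar):
    """Condition (B.3): interior profit net of c beats the corner h_bar."""
    hs = h_star(p, delta)
    return p * h_bar - h_bar ** delta < p * hs - hs ** delta - c

def price_threshold(delta, c, h_bar, tol=1e-10):
    """Price above which developers formalize, found by bisection."""
    if not h_bar < c ** (1.0 / delta):
        raise ValueError("requires h_bar < c**(1/delta)")
    A = (delta - 1.0) * delta ** (-delta / (delta - 1.0))
    B = delta / (delta - 1.0)

    def gap(p):
        return A * p ** B - (h_bar * p - h_bar ** delta + c)

    lo, hi = 0.0, 1.0
    while gap(hi) <= 0.0:      # expand until formalizing is profitable
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gap(mid) <= 0.0:
            lo = mid
        else:
            hi = mid
    return hi

def supply_height(p, delta, c, h_bar):
    """Piecewise supply: corner h_bar up to the threshold, h_star above it."""
    if p <= price_threshold(delta, c, h_bar):
        return h_bar
    return h_star(p, delta)

# Hypothetical parameters: delta = 2, h_bar = 1, fixed cost c = 2 or 3.
threshold_low_cost = price_threshold(2.0, 2.0, 1.0)
threshold_high_cost = price_threshold(2.0, 3.0, 1.0)
```

With these illustrative values the threshold rises when the fixed cost rises from 2 to 3, in line with the statement that $\bar{P}$ is increasing in $c$.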

Housing demand

Assume there is a homogeneous population of consumers who move to the city as long as they can attain positive utility there. Once in the city, they optimally choose in which neighborhood to locate, facing no mobility costs. Neighborhoods are differentiated by the presence of two types of amenities entering the utility function as multiplicative constants ($A_P > 1$ and $A_F > 1$). $A_P$ corresponds to exogenous amenities and $A_F$ represents endogenous amenities from formalization, acquired by the neighborhood when it becomes formal, such that:

$$A_F \begin{cases} = 1 & \text{if } h \le \bar{h} \\ > 1 & \text{if } h > \bar{h}. \end{cases} \tag{B.6}$$

The utility of consumers depends positively on amenities $A_P$ and $A_F$ in each location and on the consumption of a numeraire good $C$ and housing space $H$. Assume they all face the same exogenous income normalized to 1. Conditional on the choice of neighborhood, their utility maximization problem reads:

$$\max_{C,H} A_P A_F C^{1-\alpha} H^{\alpha} \quad \text{s.t. } C = 1 - pH \tag{B.7}$$

where $\alpha \in (0,1)$ is identical across consumers and locations. Each household optimally consumes

$$H^{*} = \frac{\alpha}{p}. \tag{B.8}$$

Figure B.2: Range of prices in the kampung

Denoting population in a given location as $N$, demand for housing space is thus equal to

$$h^{D} = \frac{N\alpha}{p}. \tag{B.9}$$

Equilibrium

To close the model, we invoke spatial equilibrium. Housing markets are cleared such that consumers are indifferent across locations, otherwise they would move. Their equilibrium utility must be constant in all locations:

$$A_P A_F (1 - pH^{*})^{1-\alpha} H^{*\alpha} = \upsilon \tag{B.10}$$

which using B.8 can be rewritten as

$$\log p = \frac{1}{\alpha} \left( \log A_P + \log A_F \right) + \eta \tag{B.11}$$

where $\eta := \log \left[ (1-\alpha)^{\frac{1-\alpha}{\alpha}} \alpha \right]$. From B.11 it follows that a neighborhood will formalize when its amenities are large enough relative to the fixed cost $c$, i.e. when

$$\log A_P + \log A_F > \alpha \bar{P}(c) + \alpha \eta. \tag{B.12}$$

Using the optimality condition for developers (B.5), that for households (B.8), and spatial equilibrium (B.10), the model can be solved for the three endogenous variables housing prices $p$, population $N$, and building height $h$ as a function of the model’s parameters, in particular the location-specific ones such as amenities $A_P$ and $A_F$ and the fixed cost $c$. The first step is to solve for the housing market equilibrium as a function of $N$, equating aggregate housing supply (B.5) and housing demand (B.9). The solution is depicted graphically in Figure A in the paper.47

47 Given the discontinuity in the supply curve, there could be intermediate levels of demand for which the demand curve does not intersect the supply curve. In Figure A, this corresponds to demand curves that do not intersect any segment of the solid red line, but intersect the dashed vertical line in correspondence of $p = \bar{P}(c)$. In such cases, there would be excess demand

Denote prices and quantities in kampungs and formal neighborhoods with the subscripts $K$ and $F$. Analytically, the housing market equilibrium in the kampung case is defined by the following:

$$\begin{cases} \log h_K^{D} = -\log p + \log \alpha + \log N \\ \log h_K^{S} = \log \bar{h} \end{cases}$$

which implies

$$\log p_K = -\log \bar{h} + \log \alpha + \log N. \tag{B.13}$$

The equilibrium in the formal case is defined by

$$\begin{cases} \log h_F^{D} = -\log p + \log \alpha + \log N \\ \log h_F^{S} = \frac{1}{\delta-1} \log p - \frac{1}{\delta-1} \log \delta \end{cases}$$

which implies:

$$\log p_F = \frac{\delta-1}{\delta} \log \alpha + \frac{1}{\delta} \log \delta + \frac{\delta-1}{\delta} \log N \tag{B.14}$$

$$\log h_F = \frac{1}{\delta} \log \alpha - \frac{1}{\delta} \log \delta + \frac{1}{\delta} \log N. \tag{B.15}$$

In order to pin down $N$, we equate the equilibrium price resulting from the housing market equilibrium (B.13 and B.14 respectively) with the price dictated by spatial equilibrium (2). Denoting the equilibrium population of kampungs and formal neighborhoods as $N_K$ and $N_F$, we obtain:

$$\log N_K = \frac{1}{\alpha} \log A_P + \eta - \log \alpha + \log \bar{h} \tag{B.16}$$

and

$$\log N_F = \frac{\delta}{(\delta-1)\alpha} \left( \log A_P + \log A_F \right) - \frac{1}{\delta-1} \log \delta - \log \alpha + \frac{\delta}{\delta-1} \eta. \tag{B.17}$$

We can then solve for the equilibrium $h$ in the formal neighborhood by plugging equation B.17 into B.15 and we obtain:

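$$\log h_F = \frac{1}{\alpha(\delta-1)} \left( \log A_P + \log A_F \right) - \frac{1}{\delta-1} \log \delta + \frac{\eta}{\delta-1}. \tag{B.18}$$

To sum up, the equilibria are as follows:

$$\text{Kampung:} \quad \begin{cases} \log p_K = \frac{1}{\alpha} \log A_P + \eta \\ \log h_K^{S} = \log \bar{h} \end{cases}$$

$$\text{Formal:} \quad \begin{cases} \log p_F = \frac{1}{\alpha} \left( \log A_P + \log A_F \right) + \eta \\ \log h_F = \frac{1}{\alpha(\delta-1)} \left( \log A_P + \log A_F \right) - \frac{1}{\delta-1} \log \delta + \frac{\eta}{\delta-1}. \end{cases}$$

It can be easily verified that $p_F > p_K$ and $h_F > h_K$ (the latter follows from $\bar{h} < h^{*}$).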
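The closed-form equilibria can be checked numerically. The sketch below evaluates the expressions for prices (B.11), populations (B.16, B.17), and formal heights (B.18), then confirms that they clear the housing market, i.e. that height equals $N\alpha/p$ in both cases. Function names and parameter values are hypothetical and chosen only for illustration.

```python
import math

def eta(alpha):
    """eta := log[(1 - alpha)^((1 - alpha)/alpha) * alpha]."""
    return math.log((1.0 - alpha) ** ((1.0 - alpha) / alpha) * alpha)

def kampung_equilibrium(A_P, alpha, h_bar):
    """Kampung case: A_F = 1 and height fixed at the corner h_bar."""
    log_p = math.log(A_P) / alpha + eta(alpha)
    log_N = log_p - math.log(alpha) + math.log(h_bar)          # (B.16)
    return {"p": math.exp(log_p), "N": math.exp(log_N), "h": h_bar}

def formal_equilibrium(A_P, A_F, alpha, delta):
    """Formal case: height at the interior optimum."""
    la = math.log(A_P) + math.log(A_F)
    e = eta(alpha)
    log_p = la / alpha + e                                     # (B.11)
    log_N = (delta * la / ((delta - 1.0) * alpha)
             - math.log(delta) / (delta - 1.0)
             - math.log(alpha)
             + delta * e / (delta - 1.0))                      # (B.17)
    log_h = (la / (alpha * (delta - 1.0))
             - math.log(delta) / (delta - 1.0)
             + e / (delta - 1.0))                              # (B.18)
    return {"p": math.exp(log_p), "N": math.exp(log_N), "h": math.exp(log_h)}

# Hypothetical parameter values, for illustration only.
alpha, delta, h_bar = 0.3, 2.0, 1.0
kampung = kampung_equilibrium(3.0, alpha, h_bar)
formal = formal_equilibrium(3.0, 1.5, alpha, delta)
```

For these values the formal neighborhood has both a higher price and taller buildings than the kampung, consistent with the comparison stated at the end of the summary.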

for housing. In our model, this case is ruled out because we assume that consumers move to the city only if they can attain positive utility, which they would not be able to do if they cannot consume any housing there. Therefore, those demand curves do not arise in equilibrium.

Discussion

To deliver intuition, the model makes a number of simplifying assumptions. Below we consider possible extensions, which would not change the key predictions. First, as in the standard version of the Rosen-Roback model, consumers are perfectly mobile and homogeneous. In a model with mobility costs or heterogeneous preferences for locations or amenities, we would no longer have a perfect equalization of utility across neighborhoods and spatial equilibrium would hold only for the marginal consumer. However, the qualitative impacts on prices and heights would be the same. If consumers had heterogeneous income levels and non-homothetic preferences over housing, the (formalized) non-KIP neighborhood would attract higher-income households, further increasing prices and housing space per capita. We could also explicitly allow consumers to derive positive utility from interactions with high-income neighbors, which in our current model is parsimoniously captured through greater $A_F$ in formal neighborhoods. Additionally, relaxing the assumption of homogeneous parcel size would provide a richer characterization of how land fragmentation arises endogenously. Next, the model does not allow for agglomeration or congestion. In a more general framework with productive amenities and agglomeration benefits that are stronger in formal neighborhoods, foregone $A_F$ could also represent losses in productivity and agglomeration externalities that would add to the social welfare loss. A richer model may introduce congestion in KIP amenities, which would tend to attenuate the response of population and prices to changes in amenities, but, again, would not change the sign of the impacts. Our model also rules out spatial externalities to surrounding neighborhoods. In the presence of spillovers, the difference in price between the informal KIP neighborhood and the formalized non-KIP one will be attenuated. In the empirics, we leverage our granular data to explore the extent of spatial externalities, finding limited evidence that spillovers would overturn our conclusions.
