The power of context and location: a spatial approach to model the market for new housing in Bogota, Colombia

by

Sebastian Perez Sarralde

B.S., Architecture, 2012

Universidad de los

Submitted to the Program in Real Estate Development in Conjunction with the Center for Real Estate in Partial Fulfillment of the Requirements for the Degree of Master of Science in Real Estate Development

at the

Massachusetts Institute of Technology

February, 2019

©2019 Sebastian Perez Sarralde All rights reserved

The author hereby grants to MIT permission to reproduce and to distribute publicly paper and electronic copies of this thesis document in whole or in part in any medium now known or hereafter created.

Signature of Author______

Center for Real Estate January 11, 2019

Certified by______

Dr. Andrea Chegut
Research Scientist, MIT Center for Real Estate
Thesis Supervisor

Accepted by______

Professor Dennis Frenchman
Class of 1922 Professor of Urban Design and Planning, Director, Center for Real Estate
School of Architecture and Planning

The power of context and location: a spatial approach to model the market for new housing in Bogota, Colombia

by

Sebastian Perez Sarralde

Submitted to the Program in Real Estate Development in Conjunction with the Center for Real Estate on January 11, 2019 in Partial Fulfillment of the Requirements for the Degree of Master of Science in Real Estate Development

ABSTRACT

This study introduces a geographical approach to analyze the market for new housing in Bogota, Colombia, and to address the limitations of currently available research, which is not sensitive to the underlying spatial determinants of this city. The overall purpose of this investigation is to provide a framework to evaluate this market from a data-driven perspective in a context where information is often limited or dispersed, while illustrating the importance of spatial interactions in developing estimations through quality-adjusted hedonic price models. The analysis is based on a dataset with information on more than 400 thousand new condominium transactions between August 2010 and August 2018 in Bogota and surrounding municipalities. The properties are reverse-geocoded, assigned to their specific local planning jurisdictions within the city and surroundings, and analyzed in relation to their structural parameters.

The intersection between transactional and spatial data is explored through three approaches that support the notion that social-political territorial subdivision is an important driver of the residential market, while suggesting an initial route to develop accurate predictive models for this city based on location rather than on overly detailed datasets. The first approach consists of a comprehensive data summary that integrates several variables into graphical and geographical representations to portray urban characteristics of the city, reveal patterns and provide insights through the lens of the new housing market. The second approach involves the construction of quality-adjusted housing price indices for new housing. The precision of a model with limited structural attributes is enhanced by including a combination of neighborhood fixed effects and factors that provide a qualitative assessment of the properties’ socioeconomic context, a method that proves effective in substantially raising coefficients of determination and lowering residual standard errors. The scope of the price index is then expanded to analyze price dynamics according to locations and socioeconomic strata. Finally, the same methodology used for the construction of the price indices is implemented to generate estimations for property area and prices at the individual property level.

Thesis Supervisor: Dr. Andrea Chegut
Title: Research Scientist, MIT Center for Real Estate

ACKNOWLEDGEMENTS

To my wife, Natalia, thank you for being my unconditional partner in life and adventures, my greatest source of joy, inspiration and light. My admiration for you has no limits. To my family, my highest and most sincere gratitude for your immeasurable love and support.

Special thanks to the people that made this research possible. Thanks to my advisor, Andrea Chegut, for the guidance, knowledge and enthusiasm to make ideas happen. Many thanks to the staff of Galeria Inmobiliaria and Rodrigo Archila for your generosity, time and information. Thanks to Alexander Van de Minne for the great help and advice.

Finally, infinite thanks to the MIT CRE Community, professors, friends and especially the MKG for this life-changing experience.

TABLE OF CONTENTS

1. INTRODUCTION

2. CONTEXT AND MARKET OVERVIEW

3. LITERATURE REVIEW

4. DATA SUMMARY

4.1. DESCRIPTION AND MANIPULATION

4.2. SUMMARY

4.3. GEOGRAPHIC DATA SUMMARY

5. NEW HOUSING PRICE INDICES: METHODOLOGY AND RESULTS

6. INDEX ANALYSIS

6.1. GENERAL MARKET

6.2. INDICES BY STRATA

6.3. INDICES BY LOCATION

6.3.1. SUBMARKET: NORTHEAST

6.3.2. SUBMARKET: NORTHWEST

6.3.3. SUBMARKET: SOUTHWEST

6.3.4. SUBMARKET: MUNICIPALITIES

7. LINEAR MODEL FOR ESTIMATION

8. CONCLUSION

BIBLIOGRAPHY

APPENDICES

FIGURES

FIGURE 1: Construction and Real Estate Industries Share of GDP
FIGURE 2: Population per Stratum in Bogota
FIGURE 3: Socioeconomic Urban Stratification of Bogota
FIGURE 4: Administrative and Planning Subdivision of Bogota
FIGURE 5: New Housing Price Index (IPVN-BR)
FIGURE 6: Annual Variation (IPVN-BR)
FIGURE 7: Properties per Area (m2)
FIGURE 8: Properties Sold per Number of Rooms
FIGURE 9: Units Sold per Quarter
FIGURE 10: Properties Sold per Stratum and Bedrooms
FIGURE 11: Stratum 2 - Properties Sold per Area and Bedrooms
FIGURE 12: Stratum 3 - Properties Sold per Area and Bedrooms
FIGURE 13: Stratum 4 - Properties Sold per Area and Bedrooms
FIGURE 14: Stratum 5 - Properties Sold per Area and Bedrooms
FIGURE 15: Stratum 6 - Properties Sold per Area and Bedrooms
FIGURE 16: Number of Projects per Location
FIGURE 17: Gross Revenue per Location
FIGURE 18: Square Meters Sold per Location
FIGURE 19: Projects by Stratum
FIGURE 20: Average Property Area
FIGURE 21: Average Property Area
FIGURE 22: Average Number of Garages per Property
FIGURE 23: Total Area Sold per Project v. Price/m2
FIGURE 24: Gross Revenue per Project v. Price/m2
FIGURE 25: Av. Area Sold per Project v. Properties Sold v. Median Property Price
FIGURE 26: Cumulative Area Sold in quadrant v. Properties Sold v. Median Property Price
FIGURE 27: Av. Revenue per Project in quadrant v. Properties Sold v. Median Property Price
FIGURE 28: Cumulative Revenue per quadrant v. Properties Sold v. Median Property Price
FIGURE 29: Locations by Coefficients
FIGURE 30: INDICES BY REGRESSION MODEL (by year)
FIGURE 31: INDICES BY REGRESSION MODEL (Monthly)
FIGURE 32: INDEX COMPARISON
FIGURE 33: INDICES BY STRATA
FIGURE 34: REPRESENTATIVE LOCATIONS BY SUBMARKET
FIGURE 35: INDICES BY SUBMARKET - NORTHEAST
FIGURE 36: INDICES BY SUBMARKET - NORTHWEST
FIGURE 37: INDICES BY SUBMARKET - SOUTHWEST
FIGURE 38: INDICES BY SUBMARKET - MUNICIPALITIES
FIGURE 39: Sample Prediction

TABLES

TABLE 1: Descriptive statistics of property parameters
TABLE 2: Data summary by stratum and year
TABLE 3: Aggregated Summary by Project
TABLE 4: REGRESSION RESULTS SUMMARY
TABLE 5: Correlation chart for selected variables
TABLE 6: REGRESSION RESULTS BY SOCIOECONOMIC STRATA
TABLE 7: REGRESSION BY LOCATION RESULTS
TABLE 8: LINEAR REGRESSION FOR AREA RESULTS

APPENDIX A: Complete Regression Results – Price Indices
APPENDIX B: Area Regression Results

1. INTRODUCTION

Location is often quoted as the supreme value driver for real estate. Given the fixed nature of the industry’s assets in space, absolute position within urban environments and the quality of immediate surroundings are key determinants of property value perception and potential for appreciation. Locational externalities are integral components that affect the economic value of properties and define market dynamics. Features such as accessibility, zoning regulations, quality of public space and general demographics are some of the inherent characteristics of neighborhoods that influence prices for new housing, in addition to the structural features of properties.

To analyze a highly heterogeneous asset class such as residential property, there is consensus around the microeconomic theory of hedonic price modeling as a method capable of describing goods as a bundle of attributes. Regression techniques are effective for estimating the marginal contribution of each attribute to the overall value of an asset. However, value drivers may vary deeply across geographies, and urban characteristics may not be properly identified or accounted for, or may be misinterpreted, as a result of unobserved heterogeneity. Including neighborhood fixed effects, in combination with factors that provide insights into the contextual qualitative conditions of residences, offers an opportunity to incorporate unobserved value drivers and achieve a higher level of accuracy with a limited number of structural attributes.

This paper introduces a geographical approach to analyze the market for new housing in Bogota, Colombia and address major limitations of currently available research that is not sensitive to underlying spatial determinants in this city. The overall purpose of this investigation is to provide a framework to evaluate this market from a data-driven perspective, as a method capable of increasing transparency in a context where information is often limited, while illustrating the importance of spatial interactions to develop estimations.

The analysis is based on a dataset provided by Galeria Inmobiliaria, one of the largest real estate data providers in Colombia, which consists of a detailed collection of more than 400 thousand new condominium transactions during the period between August 2010 and August 2018 in Bogota and surrounding municipalities. The properties are reverse-geocoded, assigned to their specific local planning districts within the city and surroundings, and analyzed in relation to their structural parameters. The intersection between transactional and spatial data is explored to provide three approaches that will hopefully serve business decision making and further research of the housing market in Colombia.

The first approach consists of a comprehensive data summary that integrates several variables into graphical and geographical representations to portray urban characteristics of the city through the lens of the new housing market. Spatial and market regulations, as well as historical and geographical characteristics, are critical elements that shape the way housing is deployed in a territory and how people dwell within the urban landscape. Visual data aggregation reveals defined patterns, challenges and opportunities, and insights for different stakeholders of the market for residential property.

The second approach involves the construction of a quality-adjusted housing price index for new housing, and a subsequent expansion of its scope to analyze price dynamics according to specific socioeconomic context parameters and local submarkets. Price indices are an important input to measure housing demand, contrast trends at varying levels of analysis, guide residential real estate investment decisions and understand the effect of policies, programs and subsidies. Different levels of accuracy are obtained depending on the number and nature of the factors included, which reveals that the inclusion of neighborhood and locational effects is an efficient way to improve estimations. Indices constructed through hedonic models developed with a geographical reference are believed to be more precise, based on the increase in the resulting coefficients of determination and the reduction in standard errors. The benefits of this outcome are twofold: first, it creates models suitable for constructing indices with limited structural attributes, which is a common limitation of transactional datasets; second, it makes it possible to develop price indices for specific subregions or categorical groups of assets, providing insights into how different layers of the market are trending.

The final approach consists of implementing the same methodology for the construction of the price indices and applying the estimated coefficients to generate estimations for area and prices at an individual property level. The objective for this section is to provide the reader with a tool built upon market-wide evidence to cross-examine assumptions in the conception, design and valuation of assets in this market. Frequently, individual property design, size and attribute selection are left to outside consultants without further verification prior to commercialization. Developing a model that generates estimations with a considerable level of confidence will hopefully enrich discussions at the moment of structuring residential products, offering an additional point of view based on the evidence provided by this particular dataset.

Lastly, there is a final reflection on the contribution of this approach, which underscores the importance of considering the political and social subdivision of the territory as a key determinant of geographical markets for residential real estate. A city-wide examination of the market through this lens characterizes and quantifies trends beyond what individual stakeholders’ intuition may provide, while suggesting an efficient route to further develop accurate predictive models, automate appraisals and monitor market fluctuations based on the quoted segmentations and a few structural attributes. This notion is complemented by a consideration of additional questions that arose during the course of this research, addressing opportunities to enhance the methodology and consolidate a data-driven framework for decision-making in this dynamic market.

2. CONTEXT AND MARKET OVERVIEW

Colombia has a total population of 45.5 million, 77.8% of which lives in main urban centers according to the 2018 National Census performed by the DANE (National Administrative Department of Statistics). For a developing country, this high share of urban population is a result of atypical internal migration dynamics, pushed by historical socioeconomic circumstances. There are 13.8 million households with an average of 3.1 people per household and a housing deficit of 5.2% (586 thousand households)1. By the end of 2017, total GDP was $309 billion, growing at a 2.8% rate, and GDP per capita was estimated at $7,600. The inflation rate closed at 3.23% and interest rates averaged 4.25%.2

The construction industry plays an important role in the overall performance of the national economy, given its relation to other sectors such as the metallurgic, wood and cement sectors on one hand, and real estate activities (property sales, leasing and management) on the other. It generates an intensive demand for unskilled labor: in 2012, 89% of the sector's total labor was unskilled, representing 7.3% of the total unskilled labor of the country.3 The new building construction subsector has consistently maintained a share of between 4% and 6% of GDP for the last ten years, while the general construction sector has grown substantially since 2012, driven by government investment in infrastructure. At the same time, the real estate activities sector has increased its share from 8% to 13% since 2005.

1 Ministry of Housing, City and Territory: Colombia exceeded housing deficit goals for 2018.
2 Trading Economics: Colombia - Economic Indicators.
3 Banco de la Republica: New Housing Price Index for Bogota.

FIGURE 1: Construction and Real Estate Industries Share of GDP

[Line chart, 2005 to 2018; vertical axis: share of GDP, 0% to 14%. Series: Construction Sector/GDP; Building Construction Subsector/GDP; Real Estate Activities/GDP.]

Source: DANE. Calculations by author.

Bogota, the capital, is the largest city of the country with a population of 8.2 million and a share of nearly 27% of the overall national GDP. It has 2.4 million households and a total housing deficit of 299 thousand according to 2017 figures.4

QUALITATIVE CONTEXT: SOCIOECONOMIC STRATA

Residential urban sectors and properties are classified into socioeconomic strata to support a system that charges for public services, allocates subsidies and taxes property on a differentiated basis, based on constitutional premises of solidarity and income redistribution. The classification, which ranges from 1 (lowest) to 6 (highest), is based on an evaluation of the interior and exterior characteristics of residences (material quality and size, among others), their immediate urban surroundings (parks, access to schools) and their functional context, factors that have significant associations with the socioeconomic conditions of the inhabitants of such residences. Users’ income is not considered in this metric.5 This category is highly relevant for this study because it provides a ground-level assessment of the contextual quality of properties and neighborhoods.

4 Secretary of Habitat: New Methodology for urban housing deficit in Bogota 2017.
5 DANE: Socioeconomic Stratification.

FIGURE 2: Population per Stratum in Bogota

[Pie chart of population share by stratum. Legend: Stratum 1: Lowest; Stratum 2: Low; Stratum 3: Medium-Low; Stratum 4: Medium; Stratum 5: Medium-High; Stratum 6: High.]

Source: DANE-SDP.6

FIGURE 3: Socioeconomic Urban Stratification of Bogota 7

Source: SDP

6 Administrative Unit of District Cadastre: Real Estate Analysis 2016-2017.
7 District Planning Secretary: Socioeconomic Stratification.

LOCATION: REGIONAL ADMINISTRATIVE SUBDIVISION

The city functions as an independent district divided into 20 administrative local regions (19 urban and 1 rural), each with its own Local Administrative Board (JAL). These local boards are elected by popular vote and have specific functions, including reconciling the district’s plans and programs for social and economic development at a local level and exercising surveillance and control over public investment, among others. The 19 urban local regions are further subdivided into a total of 112 Zonal Planning Units (UPZ) that group several neighborhoods. These units provide an intermediate scale at which urban planning and policy are deployed across the territory: they define uses and FAR regulations, serve as the focus of discretionary investment in infrastructure, and promote citizen participation at different levels. For these reasons and for the purposes of this research, the UPZ subdivision provides a suitable framework to deal with spatial segmentation.

FIGURE 4: Administrative and Planning Subdivision of Bogota

Source: CCB.8

8 Chamber of Commerce of Bogota: UPZ Distribution

The condominium market in Bogota cannot be understood, however, without including the surrounding municipalities in the analysis. As a result of a brief period of regulatory uncertainty that began at the end of 2012, and driven mostly by rising prices for urban land, developers and households of various categories have found an alternative in developing and acquiring residential property outside the district’s perimeter at lower prices.

For the last 10 years, prices for new residential properties have grown at a compounded annual rate of 5.6% in Bogota and by 3.1% in the surrounding municipalities.
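As a brief illustration of what a compounded annual rate implies over a decade, the sketch below applies the two rates quoted above to an arbitrary starting level of 100. The function names and the base level are assumptions made for this example; they are not part of the IPVN methodology.

```python
def compound_annual_growth_rate(start_value: float, end_value: float, years: float) -> float:
    """Return the constant yearly rate that takes start_value to end_value."""
    if start_value <= 0 or end_value <= 0 or years <= 0:
        raise ValueError("values and years must be positive")
    return (end_value / start_value) ** (1.0 / years) - 1.0


def grow(start_value: float, annual_rate: float, years: float) -> float:
    """Apply a constant annual growth rate for a number of years."""
    return start_value * (1.0 + annual_rate) ** years


# Rates quoted in the text; the starting level of 100 is an arbitrary base.
bogota_level = grow(100.0, 0.056, 10)
surroundings_level = grow(100.0, 0.031, 10)

# Recovering the rate from the two end points returns the quoted figure.
bogota_rate = compound_annual_growth_rate(100.0, bogota_level, 10)
```

Under these rates, the hypothetical Bogota level ends the decade well above that of the surrounding municipalities, which is the gap that the index figures illustrate.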

FIGURE 5: New Housing Price Index (IPVN-BR)

FIGURE 6: Annual Variation (IPVN-BR)

[Two line charts, December 2005 to June 2018. Series in both charts: Bogota; Bogota Surroundings.]

Source: IPVN – Banco de la Republica.9 Calculations by author.

The index for prices of new housing (IPVN) developed by the central bank (Banco de la Republica) illustrates the dynamics of price growth for the quoted regions. The index shows a cyclical behavior of housing prices, with very high growth rates before the international financial crisis in 2007 and before the peak of the commodity boom in 2013-14. After the drop in the price of crude oil in 2014, national GDP growth was lower than expected and housing prices even contracted in both regions by the end of 2015. Prices in Bogota, in comparison to prices in the surrounding municipalities, appear to be more susceptible to the cycles of the overall economy. This may be explained by the fact that demand for higher-end condominium property, which is highly concentrated in Bogota and had been commanding a substantial share of the sector’s growth, decreased considerably and created a sudden mismatch between supply and demand that required prices to adjust precipitously.

9 Banco de la Republica: New Housing Price Index (IPVNBR).

In addition, starting in 2015 the government gave a major boost to the affordable housing incentives program in order to accomplish national housing policy goals and reactivate this economic sector.10 Subsidies for down payments during the pre-sales period and on mortgage interest rates, in combination with a series of tax incentives for developers, have skewed residential development activity towards lower-income housing. In 2017, to further reactivate the sector and provide housing solutions for a growing middle class, the program was extended to cover middle-income housing purchases.

10 A property is classified as affordable housing depending on its purchase price in relation to the legislated minimum monthly salary (SMMLV for its initials in Spanish). Affordable housing is divided into two categories, Housing of Priority Interest (VIP) and Housing of Social Interest (VIS), which must be priced below 70 SMMLV and 135 SMMLV respectively. The 2017 program extension covers residences priced up to 435 SMMLV.
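The price thresholds in note 10 can be expressed as a simple classification rule. The sketch below is only an illustration: the function name and category labels are hypothetical, the value of the SMMLV must be supplied by the user for the year of interest, and treating each limit as inclusive is an assumption, since the note does not state how a price exactly at the limit is handled.

```python
# Thresholds from note 10, expressed in multiples of the minimum monthly salary.
VIP_LIMIT_SMMLV = 70
VIS_LIMIT_SMMLV = 135
EXTENSION_LIMIT_SMMLV = 435  # 2017 program extension


def classify_by_price(price: float, smmlv: float) -> str:
    """Classify a purchase price by its multiple of the SMMLV.

    Both arguments must be in the same currency and refer to the same year.
    Limits are treated as inclusive, which is an assumption of this sketch.
    """
    if price <= 0 or smmlv <= 0:
        raise ValueError("price and smmlv must be positive")
    multiple = price / smmlv
    if multiple <= VIP_LIMIT_SMMLV:
        return "VIP"
    if multiple <= VIS_LIMIT_SMMLV:
        return "VIS"
    if multiple <= EXTENSION_LIMIT_SMMLV:
        return "2017 extension"
    return "not covered"
```

Expressing prices as multiples of the SMMLV keeps the rule valid across years, because the salary is re-legislated periodically while the multiples stay fixed.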

3. LITERATURE REVIEW

Because of the importance of housing markets for socioeconomic development in many countries, and the impact of their relation to economic cycles and financial stability, there is a substantial literature that seeks to explain the evolution of housing prices through a variety of methods.

This research focuses on the methodology of hedonic price modeling to deal with the expected heterogeneity of a residential market in a growing metropolis such as Bogota. Under this perspective, a residence is a heterogeneous good that can be evaluated by the utility that its attributes generate. By measuring the marginal utility of these attributes, we can explain the price of such good as a bundle of its performing characteristics. The characteristics for a residence may be divided into two main categories, the first one relating to the structural characteristics of the property (dwelling size, number of bedrooms and bathrooms, among others), and the second one relating to the context where it exists, given that buildings are assets fixed in a spatial context.
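A generic hedonic specification of this kind can be written as follows. This is an illustrative sketch in notation chosen for this example, not necessarily the specification estimated later in this research: $P_i$ is the price of property $i$, $X_{ik}$ are its structural characteristics, $Z_{ij}$ are its contextual or locational characteristics, and $\varepsilon_i$ is an error term.

```latex
\ln P_{i} = \alpha + \sum_{k=1}^{K} \beta_{k} X_{ik} + \sum_{j=1}^{J} \gamma_{j} Z_{ij} + \varepsilon_{i}
```

Under this form, each coefficient $\beta_k$ or $\gamma_j$ is read as the approximate proportional change in price associated with a one-unit change in the corresponding attribute, holding the others constant.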

For the residential market in Colombia, research under this methodology can be divided into two groups. The first one, which has not been widely explored but has considerable institutional importance, has focused on creating hedonic price functions to develop housing price indices. Castano, Laverde & Morales (2013) developed an index for prices of new housing which controls for qualitative characteristics of the properties and includes fixed effects to account for common unobserved factors of the projects that host the properties. Using data on transactions for new housing in Bogota from 2003 to 2013, for an aggregate total of 3,348 projects and 400,526 properties, the index accounts for relative price changes as a function of the marginal utility of property attributes (area, number of bathrooms, presence of a storage room, garage, studio, floor level, gas and electrical fixtures) associated with a time variable, and multiplicative and additive fixed effects to account for the marginal value of a project’s common characteristics. By isolating the effect of these components, overestimation of price growth over time is reduced, and the authors find that changes in property characteristics have positive effects on the price of housing.

A second group of research has focused on explaining the effect that specific attributes and characteristics associated with locations have on housing prices. Carriazo, et al. (2011) test for the correlation of unobserved property attributes with measurements of air quality and its impact on residential rental prices. Revollo (2009) tests for the structural and locational attributes (including property age, proximity to public transport and parks) that affect housing prices and finds that investment in public infrastructure can have mixed effects over property prices depending on the location.

Even though the effects of location may be described to an extent by including project fixed effects or by testing for specific spatial attributes of neighborhoods, the lack of data to fully account for a diverse array of price drivers poses a risk of misrepresenting how strongly prices are correlated with their context, neighbors or unobserved factors.

It is expected that dealing with spatial heterogeneity and accounting for unobserved characteristics of locations by segmenting the market geographically will provide another layer of precision to the estimates. Basu & Thibodeau (1998) tested for spatial autocorrelation under the reasoning that neighborhoods tend to be developed at the same time and, as such, share similar structural property characteristics, such as dwelling unit sizes and qualitative features. Moreover, property values in the same locations capitalize on shared characteristics such as socioeconomic and demographic variables, predominant land use, local services and amenities, accessibility, and exposure to factors such as pollution and proximity to parks. The authors used data for 5,320 transactions of single-family housing in Dallas, geocoded the properties and assigned them to submarkets within the metropolitan area, finding mixed evidence of spatial autocorrelation among properties within a 1,200-meter radius in four of the eight submarkets.
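To make the notion of spatial autocorrelation within a distance radius concrete, the sketch below computes Moran's I, a common diagnostic, using binary weights for pairs of properties that lie within a given radius. It is offered only as an illustration and is not the procedure used by Basu & Thibodeau; the function name is hypothetical and the coordinates are assumed to be projected, in meters.

```python
import math


def morans_i(points, values, radius):
    """Moran's I with binary weights: w_ij = 1 when i != j and distance <= radius.

    points: list of (x, y) tuples in projected coordinates (meters assumed).
    values: list of observed values (for example, price per square meter).
    """
    n = len(values)
    if n != len(points) or n < 2:
        raise ValueError("need at least two points, one value per point")
    mean = sum(values) / n
    deviations = [v - mean for v in values]

    cross_products = 0.0
    weight_sum = 0.0
    for i in range(n):
        for j in range(n):
            if i != j and math.dist(points[i], points[j]) <= radius:
                weight_sum += 1.0
                cross_products += deviations[i] * deviations[j]

    squared_deviations = sum(d * d for d in deviations)
    if weight_sum == 0 or squared_deviations == 0:
        raise ValueError("no neighbor pairs within radius, or values are constant")
    return (n / weight_sum) * (cross_products / squared_deviations)
```

Positive values indicate that neighboring properties tend to have similar values, while negative values indicate that neighbors tend to differ.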

By augmenting the hedonic model to include spatial characteristics, Chegut et al. (2014) examined spatial dependence in the prices of office buildings as a way to account for adjacency effects that result from transaction price spillovers within locations. This paper is one of the few examples of spatial and spatial-temporal autoregressive models developed for commercial real estate, employing a global database of office property transactions in six cities to determine the relevance of spatial dependence in hedonic models and price indices. To this end, spatial-temporal autoregressive models that also control for the spatial dependence of the error terms were developed and compared to a benchmark hedonic model, finding that spatial dependence tends to be more significant than temporal dependence. The base hedonic model, which accounts for exogenous property characteristics such as age, size, stories, renovation status, investor type, seller type and year of transaction, was extended by including spatial interactions with neighbors. The spatial component relates the transaction prices to a spatially lagged price of neighboring buildings as a function of a weight matrix that incorporates the physical distances between them. Even though this research found considerable statistical evidence of spatial dependence for the six markets, the models indicate weak economic impacts on actual prices, possibly as a result of studying a period that was impacted by the Global Financial Crisis.
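The idea of a spatially lagged price can be sketched as follows: each property receives a weighted average of its neighbors' prices, with weights that decay with distance. The inverse-distance, row-normalized weighting used here is one common choice and an assumption of this example, not necessarily the weight matrix of the cited paper; the function names are hypothetical.

```python
import math


def inverse_distance_weights(coords):
    """Row-normalized inverse-distance weight matrix with a zero diagonal.

    coords: list of (x, y) tuples; all locations are assumed to be distinct.
    """
    n = len(coords)
    if n < 2:
        raise ValueError("need at least two locations")
    weights = []
    for i in range(n):
        row = []
        for j in range(n):
            if i == j:
                row.append(0.0)  # a property is not its own neighbor
            else:
                distance = math.dist(coords[i], coords[j])
                if distance == 0:
                    raise ValueError("duplicate locations are not supported")
                row.append(1.0 / distance)
        total = sum(row)
        weights.append([w / total for w in row])  # each row sums to one
    return weights


def spatial_lag(weights, prices):
    """Weighted average of neighbors' prices for each property (W times p)."""
    return [sum(w * p for w, p in zip(row, prices)) for row in weights]
```

In a spatial autoregressive specification, the resulting lag enters the hedonic regression as an additional regressor, so that its coefficient measures how strongly a price responds to neighboring prices.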

Another approach to deal with the unobserved heterogeneity that arises from locational attributes is the one proposed by Francke & Van de Minne (2018). Starting from the notion that property and locational characteristics are limited and scarce in most databases, the authors employ a standard hedonic model for residential markets in Atlanta and Heemstede, with attributes for dwelling size, age, maintenance and presence of yards or gardens, among others. The model is extended by adding, first, mutually independent property-level random effects and, second, spatial property effects that relate properties to their neighbors within a predefined threshold or distance. For the latter, the results suggest strong spatial effects in Atlanta north of the CBD, and substantial variation across locations that relates to spatial heterogeneity. On the other hand, Heemstede, a town with a smaller area, presents lower variation, possibly as a result of a more homogeneous urban landscape.

4. DATA SUMMARY

4.1. DESCRIPTION AND MANIPULATION

The main data source for this research was provided by Galeria Inmobiliaria, one of the largest real estate data providers in the country, which surveys a diverse array of asset classes and geographies in Colombia, Peru and Panama. The original dataset consists of 97 monthly update files (from August 2010 to August 2018, inclusive) which provide details of the commercial activity of new residential developments in Bogota and the surrounding municipalities. The database, which is updated by field survey researchers who call or visit each project’s sales gallery every month, provides diverse details about the projects and a breakdown of each development into its corresponding unit mix.
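The count of 97 monthly files follows from including both end months of the period. A minimal sketch that enumerates the months, with a hypothetical helper name, is shown below.

```python
from datetime import date


def monthly_periods(start: date, end: date):
    """List (year, month) pairs from start to end, both months included."""
    periods = []
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        periods.append((year, month))
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return periods


# One entry per monthly update file, August 2010 through August 2018.
periods = monthly_periods(date(2010, 8, 1), date(2018, 8, 1))
```

The resulting list can serve as the key for aligning the monthly sales fields when the files are merged.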

The relevant fields for this study are classified into two levels included in the raw data - project and individual property level information - and various categories. Additional fields were derived from the raw data into a third category to facilitate the purposes of the research:

A. Project Level Information: 1. Identification: a. Project Code: Unique numerical project identifier. b. Project Name: Commercial Name of the Project. c. Developer: Entity promoting the project. In Colombia, most real estate development companies are vertically integrated. They invest seed equity, source debt, structure the product, manage the sales and marketing process, entitle and build the projects. 2. Locational - Temporal: a. Address: Urban or rural address. Given that geographical systems do not generally interpret addresses in Colombia with great precision, this field serves mostly for verification of the location given by the coordinates.

- 19 -

b. Coordinates: Exact location of the project, expressed in decimal format of latitude and longitude. c. Project Start Date: This field documents the commercial start date of the project. Generally, a project is launched commercially for a period of pre-sales until it achieves the ‘break-even’ point, when developers are able to obtain a construction loan and begin execution. The ‘break-even’ point oscillates between 50 to 60% of total units sold, but varies according to the creditors’ assessment of risk. B. Unit (Property) Level Information: 1. Categorical: a. Stratum: As defined in Chapter 2, the stratum category looks to identify the socioeconomic “category” for the project. The categories run from 1 (lowest) to 6 (highest), but Stratum 1 is not included in the database because it corresponds to self-built residences or government-led housing initiatives that are not commercially available. b. Type: Indicates if the property is an apartment or a house. The original database also included a few samples for lots within gated developments which were removed, since they distort the relationship between areas and price. 2. Numerical: a. Area: Gross area for property in square meters. b. Rooms: Number of bedrooms. Studios, microunits and 1-bedroom units are classified under the 1-bedroom category. c. Garages: Number of garages per residential unit. d. Down Payment: The down payment reflects the portion that buyers must pay to an escrow account during the pre-sales period (between the commercial start date and the property delivery date). Generally, this portion oscillates between 30 and 40% of the total purchase price and is paid in monthly installments. 3. Sales in time: a. Number of Units Sold for the Specific Month: Each of the 97 original files has a field that informs how many units of each property type were sold for that month. As such, the aggregated data includes 97 fields that document the corresponding unit sales from August 2010 to August 2018.


b. Sales price: Each file also contains a field informing the transaction price for each of the units sold at the specific month of sale.

C. Derived Information:

The following fields result from data manipulation and organization by the author:

o Property ID: To facilitate data processing, a unique ID was assigned to each property type based on its corresponding project code, area and number of rooms.
o Normal sales price: Property prices are adjusted to real values, using the monthly CPI index of Bogota provided by the Banco de la Republica and taking August 2010 as the base month.
o Price per square meter: Nominal sales price for each specific month divided by the unit area.
o Normal price per square meter: Normal sales price divided by area.
o Latitude – Longitude: The coordinates field was split into Lat – Long values. Hundreds of samples with coordinates that were missing or expressed in other formats were verified and corrected.
o Location: The projects were reverse-geocoded in GIS to their corresponding planning and zoning sectors. Properties within the perimeter of Bogota were assigned to their corresponding UPZ (Zonal Planning Unit), and properties outside the city were assigned to their corresponding municipality. This location assignment procedure follows the reasoning of allocating each project to the jurisdiction that determines, on one hand, the project’s zoning, overall size, scale, use and entitlement process, and on the other, its relation to locational amenities, infrastructure and local demographics.
o Location code: Derived from the previous item. The projects were located within a total of 103 UPZs, with codes between 1 and 116, and a total of 32 municipalities with 5-digit codes.
o Total Units Sold: Sum of the 97 monthly sales fields explained in B.3.a. of this chapter.
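The derived price fields listed above can be sketched as follows. This is a minimal illustration and not the author's processing code: the function names and all numeric inputs (including the CPI values) are hypothetical, and only the logic of rebasing prices to August 2010 and dividing by area comes from the text.

```python
def to_real_price(nominal_price, cpi_sale_month, cpi_base_month):
    """Deflate a nominal price to base-month values (August 2010 in the text)."""
    return nominal_price * cpi_base_month / cpi_sale_month


def price_per_m2(price, area_m2):
    """Price divided by the gross area of the unit in square meters."""
    return price / area_m2


# Hypothetical inputs: CPI is 100.0 in the base month and 125.0 in the sale month.
nominal = 250_000_000.0  # pesos, illustrative value
area = 50.0              # square meters, illustrative value

real = to_real_price(nominal, cpi_sale_month=125.0, cpi_base_month=100.0)
nominal_m2 = price_per_m2(nominal, area)  # "Price per square meter"
real_m2 = price_per_m2(real, area)        # "Normal price per square meter"
```

Rebasing by the ratio of CPI values keeps prices from different months comparable, which is what later allows the time coefficients to be read net of general inflation.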

The raw data files included information on properties that were not commercially active during the research period, which had either already been sold before August 2010 or been removed from the market. For this work, only properties that had actual sales activity during the period and a complete set of fields were considered. After merging and filtering the information, the initial result is a database with 22,826 samples (one sample per condominium unit type, along with its corresponding details and sales timeline), consolidating information on 4,012 projects and 434,159 transacted properties during the eight-year period.
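The filtering rule described above (keep only unit types with sales in the period and a complete set of fields) could look roughly like this. The record layout, the field names and the `REQUIRED` tuple are assumptions made for the example; the source does not specify how the files were structured.

```python
def total_units_sold(monthly_sales):
    """Sum of the monthly sales fields (97 of them in the thesis data)."""
    return sum(monthly_sales)


def is_usable(record, required_fields):
    """True when every required field is present and the unit type had sales."""
    complete = all(record.get(field) is not None for field in required_fields)
    return complete and total_units_sold(record["monthly_sales"]) > 0


# Hypothetical field names and records, shortened to three months each.
REQUIRED = ("project_code", "area", "rooms", "stratum")
records = [
    {"project_code": 1, "area": 48.0, "rooms": 3, "stratum": 2,
     "monthly_sales": [0, 4, 2]},
    {"project_code": 2, "area": 80.0, "rooms": 2, "stratum": 4,
     "monthly_sales": [0, 0, 0]},   # no sales during the period
    {"project_code": 3, "area": None, "rooms": 1, "stratum": 5,
     "monthly_sales": [1, 0, 0]},   # incomplete set of fields
]

kept = [r for r in records if is_usable(r, REQUIRED)]
```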

4.2. SUMMARY

TABLE 1: Descriptive statistics of property parameters

                     Stratum   Down PMT     Area    Rooms   Garages
Minimum                 2.00       0.10    12.00     1.00      0.00
Maximum                 6.00       0.90   783.00     4.00      8.00
Median                  4.00       0.35    77.00     3.00      1.00
Mean                    4.39       0.38    62.04     2.61      1.47
Standard Deviation      1.20       0.10    28.47     0.61      0.88

Distribution by Property Type
Apartment   94.10%
House        5.90%

FIGURE 7: Properties Sold per Number of Rooms
[Bar chart: properties sold (0 to 300,000) by number of rooms (1 to 4).]

FIGURE 8: Properties per Area (m2)
[Histogram: properties sold (0 to 50,000) by area in 2.5 m2 bins, from 22.5 to 155 m2.]

Aggregating the data by their main parameters gives a general picture of the residential market for the period. Figure 7 shows that three-bedroom properties command a very high share of the total number of properties sold, with almost 300 thousand units. The weighted mean for the number of rooms is 2.61 with a standard deviation of 0.61, indicating that 3 rooms is within the main range that defines demand. This may be explained by the fact that properties for the lower-income segment of the market, which commands a substantial share, are mainly 3-bedroom units. The distribution of properties by area in Figure 8 indicates a high demand for units of around 50 square meters, an area that may well be suitable for 1, 2 or 3 bedrooms, depending on the target customers of a specific socioeconomic bracket.
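A weighted mean and standard deviation of this kind, with units sold acting as weights, can be sketched as below. The unit counts are hypothetical (they are not the thesis data), and the sketch assumes a population-style weighted standard deviation, since the text does not state which variant was used.

```python
import math


def weighted_mean_std(values, weights):
    """Weighted mean and (population-style) weighted standard deviation."""
    total = sum(weights)
    mean = sum(v * w for v, w in zip(values, weights)) / total
    variance = sum(w * (v - mean) ** 2 for v, w in zip(values, weights)) / total
    return mean, math.sqrt(variance)


rooms = [1, 2, 3, 4]
units_sold = [10, 20, 60, 10]  # hypothetical counts per number of rooms

mean_rooms, std_rooms = weighted_mean_std(rooms, units_sold)
```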

Figure 9 aggregates sales information over time, classified over quarters during the eight-year period. Intuitively, a cyclical behavior within each year may be observed, with sales peaking in the third quarter of almost every year, before descending into the troughs that may be generally found in the fourth and first quarters.

FIGURE 9: Units Sold per Quarter
[Bar chart: units sold per quarter, Q3 2010 to Q3 2018; vertical axis from 0 to 20,000 units; quarterly data labels range from 8,813 to 18,005 units.]
*Q3-2010 considers only August and September
**Q3-2018 considers only July and August
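The quarterly classification used for Figure 9 amounts to grouping monthly unit sales by calendar quarter. A minimal sketch follows; the monthly figures are hypothetical and the dictionary layout keyed by (year, month) is an assumption of the example.

```python
from collections import defaultdict


def quarter_label(year, month):
    """Calendar quarter label, e.g. (2010, 8) -> 'Q3 2010'."""
    return f"Q{(month - 1) // 3 + 1} {year}"


def units_per_quarter(monthly_units):
    """Sum monthly unit sales, keyed by (year, month), into quarterly totals."""
    totals = defaultdict(int)
    for (year, month), units in monthly_units.items():
        totals[quarter_label(year, month)] += units
    return dict(totals)


# Hypothetical monthly sales; Q3 2010 only has August and September,
# mirroring the partial first quarter noted under Figure 9.
monthly_units = {
    (2010, 8): 100, (2010, 9): 120,
    (2010, 10): 90, (2010, 11): 80, (2010, 12): 70,
}

quarterly = units_per_quarter(monthly_units)
```

Partial quarters at either end of the sample are not comparable to full ones, which is why the figure footnotes flag them.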


A further breakdown of the information by stratum and year provides the following summary:

TABLE 2: Data summary by stratum and year

Stratum                          2         3         4         5         6     Total
Total Projects                 416     1,258       946       699       693     4,012
Properties Sold            105,920   222,942    75,274    17,999    12,024   434,159
Market Share                 24.4%     51.4%     17.3%      4.1%      2.8%    100.0%
Average Area (m²)            48.99     55.56     79.27     99.47    133.43
Average Rooms                 2.57      2.68      2.59      2.29      2.29

Total Units Sold
2010 (since August)          7,835     8,385     4,015       891     1,008    22,134
2011                        15,090    27,866     9,875     2,807     2,305    57,943
2012                        13,856    27,617     9,142     2,905     1,934    55,454
2013                        10,103    34,580     9,520     2,354     1,773    58,330
2014                        13,400    27,988     8,134     1,825     1,529    52,876
2015                        10,874    25,785     9,672     1,717     1,130    49,178
2016                        16,603    27,005    10,245     2,225     1,005    57,083
2017                        10,307    23,176     9,244     1,571       905    45,203
2018 (until August)          7,852    20,540     5,427     1,704       435    35,958

Average Property Price (millions of pesos)
2010 (since August)           46.0      79.2     234.5     374.7     575.5     1,310
2011                          51.9      76.1     236.5     364.1     711.6     1,440
2012                          58.8      84.1     252.5     387.3     701.7     1,484
2013                          72.4      92.0     273.1     469.5     849.2     1,756
2014                          66.3     104.5     289.1     532.7     928.3     1,921
2015                          72.0     122.2     289.6     546.7   1,015.7     2,046
2016                          75.4     126.9     293.6     633.9     928.3     2,058
2017                          92.1     143.6     324.8     597.2     972.1     2,130
2018 (until August)          104.5     139.7     327.6     462.8     964.7     1,999

Average Price per m² (millions of pesos)
2010 (since August)          0.979     1.393     2.641     3.450     4.348
2011                         1.072     1.390     2.783     3.593     5.137
2012                         1.194     1.517     3.080     3.996     5.317
2013                         1.423     1.698     3.392     4.584     6.138
2014                         1.333     1.872     3.564     4.906     6.645
2015                         1.498     2.104     3.657     5.215     7.380
2016                         1.559     2.238     4.054     6.264     7.730
2017                         1.863     2.563     4.293     6.133     8.091
2018 (until August)          2.092     2.636     4.578     6.057     7.974
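As a quick arithmetic check, the market shares in Table 2 follow directly from the properties sold per stratum. The figures below are taken from the table; the variable names are chosen for this example only.

```python
# Properties sold per stratum, as reported in Table 2.
properties_sold = {2: 105_920, 3: 222_942, 4: 75_274, 5: 17_999, 6: 12_024}

total_sold = sum(properties_sold.values())

# Share of each stratum in total properties sold, in percent, to one decimal.
market_share = {
    stratum: round(100 * units / total_sold, 1)
    for stratum, units in properties_sold.items()
}
```

Strata 2 and 3 together account for roughly three quarters of all properties sold, which is the basis for the later observation that the market is dominated by lower-middle segments.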


FIGURE 10: Properties Sold per Stratum and Bedrooms
[Bar chart: properties sold (0 to 160,000) by number of bedrooms (1 to 4), with one series per stratum (2 to 6).]

This additional breakdown of the information by bedrooms and stratum suggests that the market has been largely driven by a 3-bedroom product in stratum 3 locations. A combination of factors may explain this trend, such as the fact that Bogota hosts a consistently rising middle-class population, which has also had the means to take the greatest advantage of circumstances and incentives to become homeowners. At the same time, supply may be driving this trend, with development companies finding in this product a point that balances the rising cost of urban land with the actual purchasing capacity of lower-middle segments of the population. Figures 11 to 15 characterize the distribution of properties by area, rooms and stratum:


FIGURE 11: Stratum 2 - Properties Sold per Area and Bedrooms
[Stacked bar chart: properties sold (0 to 45,000) by area in 5 m2 bins, from 20 to 95 m2, with series for 1 BR, 2 BR, 3 BR and 4 BR.]

FIGURE 12: Stratum 3 - Properties Sold per Area and Bedrooms
[Stacked bar chart: properties sold (0 to 50,000) by area in 5 m2 bins, from 15 to 145 m2, with series for 1 BR, 2 BR, 3 BR and 4 BR.]


FIGURE 13: Stratum 4 - Properties Sold per Area and Bedrooms
[Stacked bar chart: properties sold (0 to 9,000) by area in 5 m2 bins, from 10 to 245 m2, with series for 1 BR, 2 BR, 3 BR and 4 BR.]

From the previous graphs it is possible to identify that, as the target stratum increases, the purchased properties become more diverse in their combination of rooms and area. While the properties from stratum 2 are concentrated in the 45-50 square meter range with a mix of 2 to 3 bedrooms, the stratum 3 graph illustrates how 3-bedroom properties span a wider range of areas. In the stratum 4 graph, 1-bedroom properties become more relevant and the area ranges widen further, but still resemble a skewed normal distribution.


FIGURE 14: Stratum 5 - Properties Sold per Area and Bedrooms
[Stacked bar chart: properties sold (0 to 1,200) by area in 5 m2 bins, from 10 to 295 m2, with series for 1 BR, 2 BR, 3 BR and 4 BR.]

FIGURE 15: Stratum 6 - Properties Sold per Area and Bedrooms
[Stacked bar chart: properties sold (0 to 800) by area in 5 m2 bins, from 15 to 350 m2, with series for 1 BR, 2 BR, 3 BR and 4 BR.]

On the other hand, the graphs for strata 5 and 6 suggest that the active properties were highly diverse products in their combinations of area and rooms. Area ranges become considerably wider, and three-bedroom unit areas may even span from 70 to 700m2. At this point, these distributions indicate high product heterogeneity, and the area-room combination may be insufficient to explain their commercial activity. For higher-end properties, product differentiation becomes critical, and a host of other variables, such as location, amenities and product marketing, may be the determinants for this segment.

It is worth highlighting the different impact that the definition of stratum has on the different segments of the market. For the lower strata, the fact that stratum categorization is determined rigorously by the final price of properties, and limits the pool of buyers to a specific demographic that has access to credit and subsidies, confines the possible outcomes to a well-defined combination of attributes. Facing limited design choices, developers will opt for high product replicability in macro-projects that can achieve convenient economies of scale. For the middle and higher segments, on the other hand, the stratum classification becomes less relevant in determining a specific product and buyer, and the qualitative assessments of developers and buyers may become more diverse.

Finally, location remains critical in the definition of the characteristics and historical performance of a project. The following graphs show the first 20 locations of each list, ranked by a specific criterion. Light blue represents UPZ locations, and dark blue represents the city’s surrounding municipalities.

FIGURE 16: Number of Projects per Location
[Bar chart: number of projects (0 to 450) for the first 20 locations.]

Santa Barbara, Los Cedros, Chico Lago, La Alhambra and Usaquen UPZs tend to be composed of middle-high to high income neighborhoods. The graph indicates a large number of projects clustered in those specific locations. [...] is a municipality to the south-west of Bogota, with public transport infrastructure integrated with the city. Chia is the first municipality to the north of the city, and new development there tends to target middle to upper income households. The graph may give a hint of how fragmented and competitive the market is at each location, judging by the number of available options.

FIGURE 17: Gross Revenue per Location
[Bar chart: gross revenue in billions of Colombian pesos (0 to 7,000) for the first 20 locations.]

Ranking locations by gross nominal revenue yields the same first four locations as the previous graph, in different positions. While the price per square meter in Soacha tends to be low in comparison to other locations, the graph suggests that the number of units sold may be high enough to result in such high revenue.

FIGURE 18: Square Meters Sold per Location
[Bar chart: square meters sold (0 to 5,000,000) for the first 20 locations.]

Lastly, ranking locations by the total number of square meters sold indicates that an important segment of buyers is looking for housing opportunities in the surrounding municipalities. Six of the first seven locations with the largest areas transacted are neighboring municipalities. The case of Soacha is outstanding for its magnitude, with 4.5 million square meters commercialized during the previous eight years, leading the fastest pace of urbanization. Since it is a location that generally hosts lower-income households that purchase units with smaller average areas, population density is expected to rise considerably at this end of the city.

4.3. GEOGRAPHIC DATA SUMMARY

The intersection between locational and transactional data makes it possible to present an overview of the market across the geography with the aid of online tools for geographical data visualization.¹¹ To this end, the database was aggregated and summarized by project to create the following additional fields for representation:

o Total Sales: Aggregate total of units sold per project
o Total Area: Sum of total square meters sold per project
o Total Revenue: Sum of total number of properties times their corresponding sales price. Property prices are deflated to August 2010 figures to make prices comparable across the years.
o Average Area: Weighted average of sold properties’ area per project
o Average Rooms: Weighted average of the number of bedrooms per property for each project
o Average Garages: Weighted average of the number of garages per property for each project
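The project-level fields in the list above can be sketched as a single aggregation over the unit types of a project. The dictionary keys and the sample values are hypothetical, and the sketch assumes that the weights in the weighted averages are units sold, which the text implies but does not spell out.

```python
def summarize_project(unit_types):
    """Aggregate the unit types of one project into project-level fields."""
    total_sales = sum(u["units_sold"] for u in unit_types)
    total_area = sum(u["units_sold"] * u["area"] for u in unit_types)
    total_revenue = sum(u["units_sold"] * u["real_price"] for u in unit_types)
    rooms = sum(u["units_sold"] * u["rooms"] for u in unit_types)
    garages = sum(u["units_sold"] * u["garages"] for u in unit_types)
    return {
        "total_sales": total_sales,
        "total_area": total_area,
        "total_revenue": total_revenue,        # in deflated (August 2010) pesos
        "average_area": total_area / total_sales,
        "average_rooms": rooms / total_sales,
        "average_garages": garages / total_sales,
    }


# Hypothetical project with two unit types.
project = [
    {"units_sold": 30, "area": 50.0, "rooms": 3, "garages": 1,
     "real_price": 100_000_000.0},
    {"units_sold": 10, "area": 90.0, "rooms": 2, "garages": 2,
     "real_price": 200_000_000.0},
]

summary = summarize_project(project)
```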

TABLE 3: Aggregated Summary by Project

            total sales   total area     total revenue   average area   average rooms   average garage
Minimum:              1           37        26,800,000          19.61             0.5              0.0
Maximum:          1,992      107,912   385,620,185,463         821.00             4.0              6.5
Median:              39        3,164    10,352,314,964          71.67             2.6              1.0
Mean:               131        8,198    18,687,899,195          85.13             2.5              1.2
St. Dev.:           200       11,202    25,539,776,470          50.60            0.61             0.81

11 Kepler.gl: Large-scale WebGL-powered Geospatial Data Visualization Tool


A first look at the projects by stratum reveals a defined spatial structure: a dense Stratum 6 cluster at a prime location, flanked by the [...] on one side and Stratum 5 projects on the others; Stratum 4 projects that are far from this cluster appear to follow the path of major roads, while Strata 2 and 3 are more scattered and occupy the remaining regions of the city. The results are varied outside the urban limit: to the north, supply is very heterogeneous and ranges from Stratum 2 to 5, while projects in southern and western municipalities are predominantly Strata 3 and 4, targeting the low-middle range of the market.

FIGURE 19: Projects by Stratum


The following map shows the weighted average unit size per project sorted into quantiles. Properties to the south and southwest are mostly below 60m2, which is consistent with Figure 8. Going north, the average unit size of properties becomes more mixed in space, except for a clear cluster of units above 109m2 to the east. Property sizes above 109m2 are also common in the northern municipalities, evidencing the common tradeoff of larger properties at greater distances from urban centers. To the southwest, properties in Soacha are in the lower area range, while the ones in Mosquera tend to be in the mid-ranges.

FIGURE 20: Average Property Area


Charting projects by average number of bedrooms illustrates a defined trend, where most of the properties averaging below 2.59 bedrooms are concentrated in the most affluent regions, while properties with mainly 3 bedrooms are more common in the rest of the city. In relation to Figure 20, properties in affluent locations tend to be larger but have fewer bedrooms, and vice versa. This provides a snapshot of customer demographics, with wealthier users acquiring larger properties with fewer bedrooms for (presumably) smaller households, while convenience-oriented products tend to maximize bedrooms as a function of their area.

FIGURE 21: Average Property Area


Mapping the average allocation of garages per unit provides an insight into one of the starkest contrasts in this market. In general, points in red tones indicate projects where every property has at least one parking space; these are concentrated in a large but defined area towards the north. Towards the south and in the western municipalities (paradoxically, the regions farthest from the CBD), garage allocation is less than one per property, and spaces may be shared or purchased individually. This characteristic is a result of zoning legislation; it draws a clear differential between assets across the geography and may reinforce existing urban (in)accessibility trends.

FIGURE 22: Average Number of Garages per Property


Figures 23 and 24 illustrate the relationship between price per square meter across the geography and total project area sold and total project revenue, in order to understand the gross sales performance of projects according to their location. Prices per square meter vary from dark violet in the lower price ranges up to yellow tones in the highest price bracket. Price dynamics are consistent with previous graphics, with a defined central cluster of the highest prices, where key locational amenities are found, and a gradual price decrease towards the extremes of the urban setting.

The radii of the points in Figure 23 present a comparison of the total area sold per project, which provides an insight into the developments’ overall size and urban magnitude. Presumably, larger projects may be found farther from the urban centrality where raw land costs are lower. Conversely, projects of higher price/m2 appear to be smaller but abundant, supporting the creation of a very heterogeneous supply in the market’s higher end.

FIGURE 23: Total Area Sold per Project v. Price/m2

In Figure 24, the radii express the gross revenue per project and indicate that a large proportion of the gross sales in the market tends to concentrate where higher prices/m2 are found, despite smaller project sizes.

FIGURE 24: Gross Revenue per Project v. Price/m2

A three-dimensional view of the data is obtained by segmenting the city into equal quadrants that summarize the activity of the projects that fall within their areas. The height of each quadrant extrusion corresponds to the average area sold per project in Figure 25, and to the cumulative area sold (the sum of area for all the projects within each quadrant) in Figure 26. Block colors express average property prices, and the radii of the blue rings at ground level indicate the number of individual sales, mapping the transactional intensity of the specific projects. The cumulative area sold in Soacha in Figure 26 indicates where the most relevant urban residential expansion is taking place.

FIGURE 25: Av. Total Area Sold per Project in quadrant v. Properties Sold v. Median Property Price

FIGURE 26: Cumulative Area Sold in quadrant v. Properties Sold v. Median Property Price
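The quadrant summaries behind Figures 25 and 26 can be approximated by binning projects into a regular grid and computing, per cell, the average and cumulative area sold. The sketch below is only an assumption about how such a grid could be built: the origin, the cell size in degrees, the coordinates and the field names are all hypothetical, and cells of equal size in degrees are only approximately equal in ground area.

```python
import math
from collections import defaultdict


def cell_of(lon, lat, origin_lon, origin_lat, cell_size_deg):
    """Column and row of the grid cell that contains a coordinate."""
    col = math.floor((lon - origin_lon) / cell_size_deg)
    row = math.floor((lat - origin_lat) / cell_size_deg)
    return (col, row)


def summarize_cells(projects, origin_lon, origin_lat, cell_size_deg):
    """Average and cumulative area sold for the projects in each grid cell."""
    areas = defaultdict(list)
    for p in projects:
        key = cell_of(p["lon"], p["lat"], origin_lon, origin_lat, cell_size_deg)
        areas[key].append(p["total_area"])
    return {
        key: {
            "projects": len(values),
            "cumulative_area": sum(values),
            "average_area": sum(values) / len(values),
        }
        for key, values in areas.items()
    }


# Hypothetical projects: two fall in the same cell, one in another.
projects = [
    {"lon": -74.05, "lat": 4.65, "total_area": 1000},
    {"lon": -74.08, "lat": 4.68, "total_area": 3000},
    {"lon": -74.22, "lat": 4.58, "total_area": 500},
]

cells = summarize_cells(projects, origin_lon=-74.3, origin_lat=4.4,
                        cell_size_deg=0.1)
```

Comparing `average_area` with `cumulative_area` per cell is what separates a general locational trend from the effect of a single large project.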


Analyzing the difference between quadrant averages and cumulative figures reveals whether the metrics analyzed result from a general locational trend or from the performance of specific outliers. This is the case of Chia, where overall sales performance may be skewed by one specific project. Also, contrary to what Figure 24 might intuitively suggest, Figure 28 reaffirms the strong development performance in Soacha in gross area transacted, total revenue and number of transactions. Finally, while the “prime” cluster between El Refugio and Usaquen commands high regional cumulative revenue, its average revenue per project is small. Developments around the cluster, however, present consistent commercial activity at both the regional and the individual project level.

FIGURE 27: Average Revenue per Project in quadrant v. Properties Sold v. Median Property Price

FIGURE 28: Cumulative Revenue per quadrant v. Properties Sold v. Median Property Price


5. NEW HOUSING PRICE INDICES: METHODOLOGY AND RESULTS

An effective method to describe real estate prices and build a housing price index involves the definition of a hedonic price model. The fact that the quality and attributes of the properties change with time makes it necessary to define a function that relates observed housing prices to a set of attributes that are expected to influence the value of the goods. This type of model makes it possible to determine the marginal contribution of each of the observed characteristics to the price of the unit sold or, in other words, to estimate a customer's willingness to pay for each of the characteristics of the property.

To construct the indices, the statistical model that controls for differences in qualities incorporates a time-dummy variable, which indicates the date or year when the transaction took place and captures the trend of housing prices over time. Three different models are proposed in order to understand the relevance of each set of attributes in the specification of housing prices.

The first one consists of a basic regression that holds the price of property as the dependent variable, explained as a function of the time dummies and property area in square meters, without controlling for qualitative characteristics. The function is defined by the formula:

\[
p_i = \alpha^{0} + \mathit{Time}_i\,\mu^{\mathit{Time}} + \mathit{Area}_i\,\beta + \epsilon_i \tag{1}
\]

Where $p_i$ is the logarithm of the transaction price (adjusted for inflation) for property $i$, defined as a function of $\mathit{Time}_i$, the dummy matrix for the time of sale with its corresponding coefficient $\mu^{\mathit{Time}}$, $\mathit{Area}_i$, which indicates the logarithm of the size of property $i$ in square meters with its corresponding coefficient $\beta$, and the error term $\epsilon_i$.
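To make the structure of this first specification concrete, the sketch below fits a time-dummy regression of log price on log area by ordinary least squares. It is an illustration under stated assumptions rather than the estimation used in the thesis: the data are synthetic and noise-free, the coefficient values are hypothetical, and the small solver (normal equations with Gaussian elimination) stands in for whatever statistical software was actually used.

```python
import math


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def ols(rows, y):
    """Ordinary least squares through the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def design_row(year, area_m2, years):
    """Intercept, one dummy per year except the first (omitted), and log area."""
    dummies = [1.0 if year == t else 0.0 for t in years[1:]]
    return [1.0] + dummies + [math.log(area_m2)]


# Synthetic data generated from known, hypothetical coefficients.
years = [2010, 2011, 2012]
alpha, beta = 14.0, 0.9
mu = {2010: 0.0, 2011: 0.05, 2012: 0.12}
sample = [(t, a) for t in years for a in (40.0, 60.0, 90.0)]

X = [design_row(t, a, years) for t, a in sample]
y = [alpha + mu[t] + beta * math.log(a) for t, a in sample]  # log real price

coef = ols(X, y)  # order: intercept, mu_2011, mu_2012, beta
```

Omitting the first period from the dummy set makes it the reference period, so each remaining time coefficient is read relative to it.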

The second equation includes controls for the effect of quality and attributes of the property, defined by the formula:


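\[
p_i = \alpha^{0} + \mathit{Time}_i\,\mu^{\mathit{Time}} + \mathit{Area}_i\,\beta + \mathit{Rooms}_i\,\beta + \mathit{Garage}_i\,\beta + \sum_{k=1}^{K} T_i + \sum_{m=1}^{M} S_i + \epsilon_i \tag{2}
\]

Where $\mathit{Rooms}_i$ indicates the number of dwelling rooms in the property, $\mathit{Garage}_i$ the number of garages for that property, $T_i$ the property type (Apartment or House) and $S_i$ the socioeconomic stratum of the property. Each regressor carries its own coefficient, written generically as $\beta$.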

Finally, in addition to the inherent property characteristics, a variable to control for the locational effects in property value is included. Locational effects can be defined as the attributes associated with the geographic location of a property, and can be distinguished between two levels: (1) adjacency effects, which are the externalities associated with the absolute location of the structure and (2) neighborhood effects, which are the array of locational characteristics (neighbors, accessibility, public service provision) that will lead to a differential household housing demand for certain locations (Can, 1997). For the purposes of this research, we consider only the neighborhood effects in the following formula:

\[
p_i = \alpha^{0} + \mathit{Time}_i\,\mu^{\mathit{Time}} + \mathit{Area}_i\,\beta + \mathit{Rooms}_i\,\beta + \mathit{Garage}_i\,\beta + \sum_{k=1}^{K} T_i + \sum_{m=1}^{M} S_i + \sum_{l=1}^{L} L_i + \epsilon_i \tag{3}
\]

Where variable $L_i$ captures the fixed effects of the location of property $i$ within its planning and territorial jurisdiction (UPZ or Municipality - as defined in Chapter 4 point C), which controls for unobserved heterogeneity of neighborhood qualities and city planning determinants.

The housing price index is determined by the changes in $\mu$ across the time range between August 2010 and August 2018, omitting the first period of the time-of-sale dummy matrix and starting the index at 100. Because the dependent variable is the logarithm of price, the difference between the coefficients of two periods, for instance 2010 and 2011, is a log return, and its exponential gives the ratio of the index between the two periods:

\[
\Delta \mathit{Index}_{2011,2010} = \exp(\mu_{2011} - \mu_{2010})
\]
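The index construction can be sketched in a few lines. The coefficient values are hypothetical; the only elements taken from the text are that the omitted first period has a coefficient of zero, that the index starts at 100, and that changes between periods are the exponential of the coefficient difference.

```python
import math


def price_index(mu_by_period, base=100.0):
    """Index level per period; the omitted first period has mu = 0, index = base."""
    return {period: base * math.exp(mu) for period, mu in mu_by_period.items()}


def index_ratio(mu_later, mu_earlier):
    """Ratio of index levels between two periods, exp(mu_later - mu_earlier)."""
    return math.exp(mu_later - mu_earlier)


# Hypothetical time-dummy coefficients; 2010 is the omitted reference period.
mu = {2010: 0.0, 2011: 0.05, 2012: 0.12}

index = price_index(mu)
growth_2011_2012 = index_ratio(mu[2012], mu[2011]) - 1.0  # arithmetic return
```

Subtracting one from the ratio gives the arithmetic return between the two periods.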


TABLE 4: REGRESSION RESULTS SUMMARY


The first regression, which accounts only for the area of the property and the year of transaction, yields an adjusted R-squared of 0.66 and provides a framework that explains housing prices well, considering the limited number of variables employed. Naturally, the variable that commands the highest marginal contribution to prices in all models is the area of the property. The growth rate of the time coefficients decreases over the years, which is consistent with general economic conditions and growth during the period between 2010 and 2018.

In the second model, controlling for the inherent qualities of the properties increases the ability of the model to predict housing prices as a function of attributes and time, judging by the increase in the adjusted R-squared and the low p-values. Interestingly, the number of rooms presents a negative coefficient, which means that, holding all else equal for a property (namely its size), an additional room decreases the price of the asset. This is consistent with economic theory, given that increasing the number of rooms increases the density, or number of people per dwelling unit, which commands a marginal reduction in value as consumers will pay less for that characteristic (Wheaton, DiPasquale, 1996). In contrast, properties that have garages see a marginal increase in their values. This may be explained by the fact that having a garage gives users the right to use an additional piece of real estate and contributes to the user’s mobility. The coefficient for the property type House indicates a marginal reduction of the price if the property is not an Apartment, which may be related to diverse factors on the demand and supply sides of the market that this difference implies. Finally, the coefficients for the socioeconomic strata from 2 to 6 indicate that this characteristic is highly relevant and commands substantial price increments as properties move from one category to the next. Not only is this category relevant to explain housing prices because it describes the overall socioeconomic conditions of the properties and their surroundings, but its effect on housing prices may also overemphasize a specific trend, because only properties within the lower price ranges and strata are eligible for subsidies. Table 5 indicates how numerical factors correlate at a raw level:
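As a worked reading of these coefficients: because the dependent variable is the logarithm of price, a coefficient on a regressor that enters in levels (such as the number of rooms or a stratum dummy) translates into a percentage price effect through the exponential. The symbol $\beta_{Rooms}$ and the value $-0.05$ are hypothetical and used only for illustration; they are not taken from Table 4.

```latex
\[
\frac{P(\mathit{Rooms}+1)}{P(\mathit{Rooms})} = \exp(\beta_{Rooms}),
\qquad
\%\Delta P = 100\,\bigl(\exp(\beta_{Rooms}) - 1\bigr)
\]
\[
\beta_{Rooms} = -0.05 \;\Rightarrow\; \%\Delta P = 100\,\bigl(e^{-0.05} - 1\bigr) \approx -4.9\%
\]
```

Since area enters the model in logarithms, its coefficient reads instead as an elasticity: the approximate percentage change in price for a one percent change in area.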

TABLE 5: Correlation chart for selected variables

            area   rooms   garage   price.m2
area        1.00    0.42     0.77       0.33
rooms       0.42    1.00     0.16      -0.31
garage      0.77    0.16     1.00       0.57
price.m2    0.33   -0.31     0.57       1.00
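Raw pairwise correlations such as those in Table 5 are Pearson coefficients, which can be computed as sketched below. The sample values are hypothetical and chosen only to show the calculation; they do not reproduce the figures in the table.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


# Hypothetical unit types: larger units tend to come with more garages.
area = [40.0, 55.0, 70.0, 90.0, 120.0]
garages = [0, 1, 1, 2, 2]

r_area_garages = pearson(area, garages)
```

These coefficients describe raw co-movement only; unlike the regression coefficients, they do not hold the other attributes constant.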


Finally, the inclusion of the location variable in the third model is very effective in improving the fit of the model. Accounting for neighborhood effects through this method helps to capture unobserved heterogeneity, and we obtain a diverse array of results for the 135 locations. The majority of locations, particularly the most relevant ones from a real estate development market perspective, are statistically significant and consistent with actual market dynamics. For instance, among the locations with the highest coefficients are El Refugio, Chico Lago and Parque [...], which are all prime locations in terms of attributes, amenities and accessibility, among others. On the other hand, locations within or around Bogota such as El Porvenir, Ciudad [...], Soacha or Tintal Sur are places where competition in the affordable housing market is rife on a cost basis, and prices will generally be lower in comparison to other places in the city. In this case, the omitted and baseline location category is the UPZ “20 de Julio”.


Overall, the relatively high adjusted R-square measurements indicate that housing prices are rather predictable in Bogota when the model controls for the quoted variables.

Mapping the location coefficients according to their corresponding regions in Figure 29 presents results that are consistent with the findings of Section 4.2. The highest property prices are clustered at a centrality with location coefficients between 0.58 and 0.79. Prices start to decrease with distance from this cluster, with a steeper marginal decline towards the south and a gradual one towards the north. Interestingly, this price dynamic spills over the urban limit of Bogota towards the municipalities, which also follow a pattern of marginal price increase as they go north. Judging from this graphic and these results, the evidence is consistent with the monocentric city model, with an urban centrality limited to the east by natural features and a radial, semicircular price gradient towards the extremes.

FIGURE 29: Locations by Coefficients


6. INDEX ANALYSIS

6.1. GENERAL MARKET

For the three regressions, the time dummy coefficients $\mu$ were all positive and growing across time, denoting a sustained increase in prices. New housing price growth may be related to a constrained supply that is not catching up with market demand. The growing population in Bogota, a prevailing housing deficit, the regulatory uncertainty that characterized part of this period and increasing land prices may converge to sustain this dynamic over time. The indices are built by plotting the delta between the coefficients $\mu$ of each year for each regression, as shown by the following graph:

FIGURE 30: INDICES BY REGRESSION MODEL (by year)
[Line chart: yearly index values (95 to 155), 2010 to 2018, with series for Reg 1, Reg 2 and Reg 3.]

The marginal effect of time on housing prices changes depending on the number and relevance of the variables included, as the difference between the three index curves indicates. The index for Regression 1, which only considers housing prices as a function of area and time, captures the overall trend of housing prices as they grow above the city’s inflation rate since 2010. Nonetheless, the result indicates a rather constant growth rate over the years and is not very sensitive to general market dynamics. When property characteristics are included in Regression 2, the effect of time on prices decreases. The curve indicates a price expansion from 2010 until the end of 2014 (when general economic growth starts to stagnate), after which prices stop growing for a year until they reach an inflection point in 2016 and begin growing again. Finally, when neighborhood effects are included in Regression 3, the index indicates a higher magnitude of the pure effect of time on housing prices. The index curve shows price dynamics similar to those of Regression 2, but the price expansion from 2010 to the end of 2014, a period of sustained economic growth, is more pronounced.

FIGURE 31: INDICES BY REGRESSION MODEL (Monthly)

[Figure 31 plot: monthly index, Aug-10 to Jul-18; vertical axis from 95.0 to 155.0; series: Reg 1, Reg 2, Reg 3]

Analyzing the index results on a monthly basis provides an indication of the volatility of the time coefficients for each model, depending on the specification of the regression formula. Potential problems such as omitted variable bias are absorbed in the residual error terms of the equation, which is reflected in the overall “noisiness” of the curves. The index for Regression 1 is very volatile from month to month, since it does not control for many factors that determine the price, a notion supported by the residual standard error of 0.48 reported in Table 4. As the specification of the regression formula includes more relevant variables, the residual standard error drops to 0.16 in Regression 3 and the resulting index curve isolates the pure effect of time on new housing prices with higher precision.
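One rough way to read these residual standard errors is to translate them from log points into a relative price deviation. The sketch below assumes that the dependent variable is the natural log of price, so that a deviation of one residual standard error corresponds to a multiplicative factor of exp(σ). The function name is illustrative and the reading is approximate.

```python
import math


def typical_relative_error(residual_se):
    """Relative price deviation implied by one residual standard error.

    Assumption: the model is fitted on log(price), so an error of
    `residual_se` log points scales the price by exp(residual_se).
    """
    return math.exp(residual_se) - 1.0


for label, se in [("Regression 1", 0.48), ("Regression 3", 0.16)]:
    print(f"{label}: about {typical_relative_error(se):.1%} above the fitted price")
```

Under this reading, a one-standard-error miss is on the order of 60% of the fitted price in Regression 1 and under 20% in Regression 3.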

The following graph compares the resulting index to equivalent indices from national agencies, in order to assess the appropriateness of the model in capturing trends of new residential properties in Bogota:

FIGURE 32: INDEX COMPARISON

[Figure 32 plot: monthly index, Aug-10 to Aug-18; vertical axis from 95 to 160; series: REG 3, BAN-REP, DNP, DANE]

The institutional indices for new housing prices are the following: BAN-REP corresponds to a superlative Fisher price index created by the central bank (Banco de la Republica), which measures the monthly evolution of new housing prices and employs data from the same source as this paper (Galeria Inmobiliaria); DNP corresponds to the index for new housing prices created by the National Planning Department, whose calculation method is not clearly documented and which uses data from “La Guia”, a real estate brokerage company; DANE corresponds to a superlative Fisher price index created by the National Administrative Department of Statistics, which measures quarterly changes of new housing prices and uses data from the Building Census (CEED).
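For readers unfamiliar with the term, a Fisher index is the geometric mean of the Laspeyres and Paasche indices. The sketch below applies that textbook definition to two hypothetical housing segments; the prices and quantities are invented for illustration, and the specific weighting schemes used by BAN-REP and DANE are not described in this text.

```python
import math


def fisher_index(p0, q0, p1, q1):
    """Return (Laspeyres, Paasche, Fisher) price indices between two periods.

    p0, q0: base-period prices and quantities per segment.
    p1, q1: current-period prices and quantities per segment.
    """
    laspeyres = sum(p * q for p, q in zip(p1, q0)) / sum(p * q for p, q in zip(p0, q0))
    paasche = sum(p * q for p, q in zip(p1, q1)) / sum(p * q for p, q in zip(p0, q1))
    return laspeyres, paasche, math.sqrt(laspeyres * paasche)


# Hypothetical data: two segments, prices per m2 and units sold.
laspeyres, paasche, fisher = fisher_index(
    p0=[100.0, 200.0], q0=[10, 5],
    p1=[110.0, 210.0], q1=[8, 6],
)
print(round(laspeyres, 4), round(paasche, 4), round(fisher, 4))
```

Because the Fisher index sits between the base-weighted and current-weighted measures, it is less sensitive to shifts in the mix of properties sold than either one alone.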

The four indices represent the overall trends and movements of prices in a similar fashion, but the magnitude of price expansions or contractions varies significantly. The most relevant index for this comparison is BAN-REP, because it employs the same data but uses a different methodology. In general terms, this index presents behavior and magnitudes of change over time that are very similar to those of the Regression 3 index, accounting for a higher price expansion during the period of economic growth before 2015 but leveling off in a similar range as the index from our model. The index from DANE also trends in a similar way but indicates that prices stabilize at a lower level. The fact that the model is comparable to the estimations from government agencies provides a certain degree of confidence to continue expanding the scope of the analysis.

6.2. INDICES BY STRATA

In order to explore the model further to explain market prices in Bogota, the data is separated into the corresponding categories of socioeconomic strata (2 to 6) and independent hedonic regressions are run to evaluate each subset under the following specification:

$$p_i = \alpha^{0} + Time_i\,\mu^{Time} + Area_i\,\beta + Rooms_i\,\beta + Garage_i\,\beta + \sum_{k=1}^{K} T_i + \sum_{l=1}^{L} L_i + \epsilon_i \tag{4}$$


Table 6: REGRESSION RESULTS BY SOCIOECONOMIC STRATA

The results suggest that segmenting the market by strata does not compromise the consistency of the estimations, with almost all the variables being statistically significant. Nonetheless, the fit of the model for Stratum 2 is lower than that of Regression 3, arguably because of the reduced number of observations. Additionally, the model for Stratum 2 shows Rooms as the only variable of the set that is not statistically significant. This could potentially be explained by the low variation of this factor within the subset, as properties in this segment tend to be more homogeneous in their attributes and do not vary enough to affect housing prices. The coefficient for area tends to be near 1 for all the categories with the exception of Stratum 2, suggesting diminishing returns for this category, where prices do not increase in the same proportion as the property grows in size. The coefficient for the number of garages has diverse impacts on housing prices according to the stratum, with Stratum 3 presenting the most critical case. Since properties in Stratum 2 can be offered without an allocated private parking space, or with one shared parking space for several units, perhaps one of the most important differentials of buildings in Stratum 3 is the fact that a larger number of owners can own a car.


Once again, the delta between the time dummy coefficients for each regression represents the effect of time on housing prices, which is illustrated in the following graph. In the interest of visual simplicity, the indices are presented on a yearly basis:

FIGURE 33: INDICES BY STRATA

[Figure 33 plot: yearly index, 2010 to 2018; vertical axis from 95 to 175; series: Stratum 2, Stratum 3, Stratum 4, Stratum 5, Stratum 6; annotations: “Oil Price Drop Begins” and “Housing Subsidy Program”]

The resulting indices suggest an important difference in the dynamics of housing prices according to their socioeconomic categorization. During the period 2010-2014 Colombia experienced consistent GDP growth of between 4% and 6%, which may have driven the prices for all housing types to grow at a compound annual growth rate of about 7.5% above inflation.
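The compound annual growth rate implied by a time-dummy coefficient can be checked with a short calculation. The stratum-level coefficients of Table 6 are not reproduced in this text, so the sketch below uses the 2014 coefficient of Regression 3 from Appendix A (0.305) as a stand-in. It assumes that the index equals 100·exp(μ) with 2010 as the base year and that the dependent variable is already expressed in real terms.

```python
import math


def real_cagr_from_coefficient(mu_end, years):
    """Compound annual growth rate implied by a log-price time coefficient.

    Assumption: the cumulative price ratio over `years` is exp(mu_end).
    """
    return math.exp(mu_end / years) - 1.0


# Regression 3, 2014 coefficient relative to the 2010 base (Appendix A).
cagr_2010_2014 = real_cagr_from_coefficient(0.305, years=4)
print(f"{cagr_2010_2014:.2%}")
```

The result, slightly under 8% per year, is of the same order as the growth rate cited for the strata indices.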

During 2014, a combination of events that began with the oil price crash reduced the country’s external revenue and consumer expectations and increased interest rates, which may have initiated the price divergence that can be observed in 2015. This divergence may be explained by several factors. First, in the face of reduced disposable income, consumers may demand fewer higher-end properties and opt to purchase assets that adjust to a more disciplined economic reality. Prices for housing within Strata 4 to 6 plateaued and even presented a marginal drop. Second, by 2015 the national government announced an important program to boost home-ownership through subsidies for buyers and incentives for developers, which targeted mainly Strata 2 and 3 and may have increased demand for these properties. Moreover, properties are eligible for the incentive program only if their total price is below 135x the minimum wage, which is decreed by the government every year to grow above the national inflation rate. As a result, the conjunction of increased demand and pricing incentives leads housing prices in the lower strata to grow at higher rates.
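The eligibility rule described above can be expressed as a simple price cap that moves with the decreed minimum wage. In the sketch below, the multiple of 135 comes from the text, while the wage values and the property price are hypothetical placeholders chosen only to show how a rising wage lifts the cap.

```python
def subsidy_price_cap(minimum_wage, multiple=135):
    """Maximum total price for eligibility, as a multiple of the minimum wage."""
    return multiple * minimum_wage


def is_eligible(total_price, minimum_wage, multiple=135):
    """A property qualifies only if its total price is below the cap."""
    return total_price < subsidy_price_cap(minimum_wage, multiple)


# Hypothetical monthly minimum wages for two consecutive years.
wage_year_a = 700_000
wage_year_b = 750_000
price = 100_000_000  # hypothetical total price of a unit

print(subsidy_price_cap(wage_year_a), is_eligible(price, wage_year_a))
print(subsidy_price_cap(wage_year_b), is_eligible(price, wage_year_b))
```

In this example the same unit becomes eligible once the wage is raised, which illustrates how the annual decree leaves room for prices in the subsidized segment to rise.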

6.3. INDICES BY LOCATION

A final exploration of the methodology to illustrate the nature of housing price growth consists of creating indices for 20 of the most representative planning units (UPZ) or municipalities in the market. The locations were selected by considering only the categories with high statistical significance in Regression 3 (p-values below 0.005), and then taking the top 15 locations by total square meters sold and the top 15 locations by total revenue during the 2010-2018 period. Keeping only the unique values yields the following 20 locations, grouped into the following regional submarkets:

FIGURE 34: REPRESENTATIVE LOCATIONS BY SUBMARKET

NORTH-EAST: Los Cedros, Usaquen, El Refugio, Santa Barbara, Chico Lago
NORTH-WEST: Suba, Garces Navas, Niza, Casa Blanca Suba, Britalia
SOUTH-WEST: Tintal Sur, Castilla, Calandaima
MUNICIPALITIES: Soacha, Mosquera, Madrid, Cajica, Chia, Zipaquira
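The selection procedure described above (significance filter, two rankings, union of the results) can be sketched as follows. The record fields and the sample data are hypothetical; only the thresholds (p-value below 0.005, top 15 in each ranking) come from the text.

```python
def select_locations(stats, top_n=15, p_max=0.005):
    """Return the sorted union of the top locations by area sold and by revenue.

    stats: list of dicts with hypothetical keys
           'name', 'p_value', 'm2_sold' and 'revenue'.
    """
    # Step 1: keep only locations that are highly significant in the regression.
    significant = [s for s in stats if s["p_value"] < p_max]
    # Step 2: rank the remaining locations by each criterion and keep the top N.
    by_area = sorted(significant, key=lambda s: s["m2_sold"], reverse=True)[:top_n]
    by_revenue = sorted(significant, key=lambda s: s["revenue"], reverse=True)[:top_n]
    # Step 3: the union removes locations that appear in both rankings.
    return sorted({s["name"] for s in by_area} | {s["name"] for s in by_revenue})


# Hypothetical records, with top_n reduced to 2 to keep the example small.
sample = [
    {"name": "A", "p_value": 0.001, "m2_sold": 500, "revenue": 10},
    {"name": "B", "p_value": 0.001, "m2_sold": 400, "revenue": 50},
    {"name": "C", "p_value": 0.001, "m2_sold": 100, "revenue": 90},
    {"name": "D", "p_value": 0.200, "m2_sold": 900, "revenue": 99},
    {"name": "E", "p_value": 0.004, "m2_sold": 50, "revenue": 5},
]
selected = select_locations(sample, top_n=2)
print(selected)
```

Because many locations rank highly on both criteria, the union of two top-15 lists is considerably shorter than 30 entries.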

A data subset is created for each of the 20 locations above and an independent hedonic regression is run for each subset following the equation:

$$p_i = \alpha^{0} + Time_i\,\mu^{Time} + Area_i\,\beta + Rooms_i\,\beta + Garage_i\,\beta + \sum_{k=1}^{K} T_i + \sum_{m=1}^{M} S_i + \epsilon_i \tag{6}$$


TABLE 7: REGRESSION BY LOCATION RESULTS


The overall results suggest that the level of confidence holds for the regressions of each location subset. Adjusted R-squared values range from 0.88 to 0.97, residual standard errors range from 0.09 to 0.16, and the majority of variables are statistically significant. Most of the coefficients maintain the overall trend of the main regressions: mostly positive time-dummy coefficients; positive area coefficients, although their values suggest a very heterogeneous impact on prices depending on the location, from Tintal Sur at the low end (0.28) to Funza at the high end (1.25); a generally negative coefficient for the number of rooms, which follows the reasoning of previous sections; positive coefficients for garages across a wide range of magnitudes; a negative impact on price if the property is a house; and finally, a diverse impact on prices of mixed strata categories within the locations.
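To make the range of area coefficients more tangible, the sketch below interprets them as elasticities. It assumes a log-log specification (log of price on log of area, as in Appendix A), so that a change in area by a given ratio multiplies the expected price by that ratio raised to the coefficient, holding the other attributes constant. The function name is illustrative.

```python
def price_ratio_for_area_change(beta_area, area_ratio):
    """Expected price multiplier when area is scaled by `area_ratio`.

    Assumption: log(price) is linear in log(area) with slope `beta_area`.
    """
    return area_ratio ** beta_area


# Coefficients cited in the text for the two extreme locations.
for location, beta in [("Tintal Sur", 0.28), ("Funza", 1.25)]:
    ratio = price_ratio_for_area_change(beta, area_ratio=1.10)
    print(f"{location}: a 10% larger unit is priced about {ratio - 1:.1%} higher")
```

A coefficient below 1 therefore implies that price grows less than proportionally with size, while a coefficient above 1 implies the opposite.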

The following pages illustrate the dynamics and relationship of prices in the sample locations, which may give the reader insight into the market geography.


6.3.1. SUBMARKET: NORTHEAST

This group of five UPZs concentrates a large number of individual projects and very heterogeneous products within the strata range of 4 to 6. An important portion of the supply for the higher end of the market is concentrated in these locations, as a result of several historical, connectivity, social and geographical factors. Following the trend from previous examples, prices grow consistently between 2010 and 2014. El Refugio is one of the most prime locations and presents remarkable price growth during 2013, but prices there have fallen in real terms since 2014. Usaquen, also a higher-end location, likewise falls in real terms, while Los Cedros and Chico Lago seem to resume price growth after 2016.

[Figure 35 plot: yearly index, 2009 to 2019; vertical axis from 90 to 160; series: Los Cedros, Usaquen, El Refugio, Santa Barbara, Chico Lago]

FIGURE 35: INDICES BY SUBMARKET - NORTHEAST


6.3.2. SUBMARKET: NORTHWEST

This group of locations is representative of the middle-range market, with property supply targeting mainly strata 3 and 4. Suba, Casa Blanca Suba and Britalia share similar characteristics, which may be the reason why their price indices follow a similar pattern. Their price growth exceeds that of all the other locations within the north-east and north-west submarkets. Niza, on the other hand, is one of the most heterogeneous locations of the city, with supply within the strata range of 3 to 6 and a diverse mix of property types (apartments, houses and row-housing). Prices in this location stopped growing in 2014. Even though supply in Garces Navas concentrates in the well-performing Stratum 3, prices there do not grow as much as in other locations, possibly because of a large and competitive supply.

[Figure 36 plot: yearly index, 2009 to 2019; vertical axis from 90 to 160; series: Suba, Garces Navas, Niza, Casa Blanca Suba, Britalia]

FIGURE 36: INDICES BY SUBMARKET - NORTHWEST


6.3.3. SUBMARKET: SOUTHWEST

Most of the supply within the city limits for the lower range of the market, strata 2 and 3, is concentrated in the south-west portion of the city. This group of three locations presents a smaller number of projects but a large number of properties and square meters sold, indicating the presence of larger mega-projects. Of the 20-location subset, Tintal Sur, Castilla and Calandaima are the locations that present the highest price growth over the period 2010-18 and are evidence of the price dynamics by strata mentioned in section 6.2. Additionally, many planning initiatives to enhance transport and connectivity target this portion of the city, which may increase its attractiveness relative to others and maintain prices on an upward trend.

[Figure 37 plot: yearly index, 2009 to 2019; vertical axis from 90 to 170; series: Tintal Sur, Castilla, Calandaima]

FIGURE 37: INDICES BY SUBMARKET - SOUTHWEST


6.3.4. SUBMARKET: MUNICIPALITIES

As for the surrounding municipalities, these locations present much more heterogeneity. Supply tends to inherit characteristics similar to those found in the neighboring locations within city limits: Soacha, Mosquera and Madrid supply an important number of properties for strata 2-3, with housing prices growing consistently over time, while Chia and Cajica target the middle-higher segments of the market and present declining housing prices after 2015. Prices in Funza have been more volatile. Finally, prices in Zipaquira, the farthest location from the city and still predominantly rural, have grown at a slower but very consistent pace over the years.

[Figure 38 plot: yearly index, 2009 to 2019; vertical axis from 90 to 160; series: Soacha, Mosquera, Madrid, Cajica, Chia, Zipaquira, Funza]

FIGURE 38: INDICES BY SUBMARKET - MUNICIPALITIES


7. LINEAR MODEL FOR ESTIMATION

The same methodology used to construct indices that describe market prices at a general level also makes it possible to generate estimations at the individual level. Determining property area for commercialization is perhaps one of the most critical steps in the real estate development process to deliver an appropriate product in accordance with its location and attributes. An imprecise combination of attributes, size and price for a property may condition its attractiveness in the market. Adjusting a non-performing product once the pre-sales process begins is very costly and time-consuming, while a more precise initial specification will probably earn a better market reception. An evidence-based approach to this issue is possible through the models derived from the actual dataset, which provides information on actual transactions and hence a snapshot of the effective absorption for the condominium market.

In the interest of creating a simple tool that works for actual application in the industry, the method consists of implementing a model that estimates property area according to the input for the main attributes, to subsequently calculate a price range depending on the resulting estimation for area and a year of transaction input. The first model for this section is built under the following formula:

$$A_i = \alpha^{0} + Rooms_i\,\beta + Garage_i\,\beta + \sum_{k=1}^{K} T_i + \sum_{m=1}^{M} S_i + \sum_{l=1}^{L} L_i + \epsilon_i \tag{7}$$

Where $A_i$ is the log of property area, held as the dependent variable and calculated as a function of the number of $Rooms$, the number of $Garage$ spaces, property type $T_i$ (house or apartment), property stratum $S_i$ (2 to 6) and location $L_i$.


TABLE 8: LINEAR REGRESSION FOR AREA RESULTS

Dependent variable: log(area)

| Variable | Coefficient |
|---|---|
| rooms | 0.26*** |
| garage | 0.25*** |
| as.factor(type)house | 0.25*** |
| as.factor(stratum)3 | 0.07*** |
| as.factor(stratum)4 | 0.19*** |
| as.factor(stratum)5 | 0.27*** |
| as.factor(stratum)6 | 0.39*** |
| as.factor(location) | (135 location coefficients omitted in this summary) |
| Constant | 2.94*** |
| Observations | 104,269 |
| R2 | 0.84 |
| Adjusted R2 | 0.84 |
| Residual Std. Error | 0.17 (df = 104127) |
| F Statistic | 3,930.23*** (df = 141; 104127) |

Note: *p<0.1; **p<0.05; ***p<0.01

Except for a few location variables, all variables are statistically significant with p-values below 0.01, resulting in a regression that explains 84% of the variation in log area, according to its adjusted R-squared value. As expected, the number of bedrooms, the number of parking spaces, the house type and higher strata each have a positive marginal impact on property area. All locations also have positive coefficients of varying magnitudes, with the smallest in Tintal Norte and the highest in San Isidro Patios (see Appendix B for full results).
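A point estimate from this model can be reproduced by hand from the rounded coefficients in Table 8. The sketch below assumes that apartments and Stratum 2 are the omitted base categories (coefficient zero) and takes the location coefficient as an input; the value 0.23 used in the example is the coefficient listed for Chico Lago in Appendix B. Because the published coefficients are rounded to two decimals, the output is only approximate.

```python
import math

# Rounded coefficients from Table 8 (dependent variable: log(area)).
COEF = {
    "constant": 2.94,
    "rooms": 0.26,
    "garage": 0.25,
    "house": 0.25,
    # Stratum 2 is assumed to be the omitted base category.
    "stratum": {2: 0.0, 3: 0.07, 4: 0.19, 5: 0.27, 6: 0.39},
}


def predict_area(rooms, garages, is_house, stratum, location_coef):
    """Predicted area in m2, obtained by exponentiating the fitted log(area)."""
    log_area = (
        COEF["constant"]
        + COEF["rooms"] * rooms
        + COEF["garage"] * garages
        + COEF["house"] * (1 if is_house else 0)
        + COEF["stratum"][stratum]
        + location_coef
    )
    return math.exp(log_area)


# 2-bedroom, 1-garage, stratum 4 apartment; location coefficient for Chico Lago.
area = predict_area(rooms=2, garages=1, is_house=False, stratum=4, location_coef=0.23)
print(round(area, 2))
```

The example yields roughly 62 m², which is of the same order as the areas reported for comparable units elsewhere in this chapter.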

The estimation of a property’s area consists of entering values for the five attributes into Equation 7 and reporting the result with a 99% confidence interval. The confidence interval is defined as the sample result ± the margin of error, where the margin of error is the critical value multiplied by the standard error. The standard error of the mean is given by the model as 0.0019, and the critical value is a t-score of 2.576, derived from a 0.99 confidence level (hence a 0.995 critical probability) and 104,127 degrees of freedom.
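The interval calculation can be sketched numerically. The standard error (0.0019) and critical value (2.576) are the figures given above; the point estimate of 62.59 m² is taken from the sample prediction in Figure 39. The sketch assumes that the margin of error is applied on the log scale and then exponentiated, which is an inference from the bounds reported in that figure and not a step stated explicitly in the text.

```python
import math


def area_confidence_interval(area_estimate, standard_error=0.0019, critical_value=2.576):
    """Return (lower, upper) bounds for area in m2.

    Assumption: the margin of error applies to log(area), so the bounds
    are exp(log(area) -/+ critical_value * standard_error).
    """
    margin = critical_value * standard_error
    log_area = math.log(area_estimate)
    return math.exp(log_area - margin), math.exp(log_area + margin)


lower, upper = area_confidence_interval(62.59)
print(round(lower, 2), round(upper, 2))
```

The resulting bounds differ from the point estimate by about half a percent, which agrees with the bounds in Figure 39 up to rounding.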

The results are then used as the input for area in the regression model for the general market prices (Equation 3 from Chapter 5), which estimates the property price in accordance to the initial specified attributes and year of transaction.


Following the adage that recommends being approximately right rather than precisely wrong for business approximations, the estimation range is deliberately widened by dividing the lower-bound price estimation by the upper-bound area estimation to obtain the lower range of price/m2, and vice versa to obtain the higher range of the estimation. The total price range for the unit is then recalculated as the resulting price/m2 times the area.
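The widening step can be written out as two small functions. The price bounds used in the example are hypothetical placeholders; the pairing of the lower price/m2 with the lower-bound area when recomputing the total price is an assumption inferred from the sample prediction in Figure 39.

```python
def price_per_m2_range(price_low, price_high, area_low, area_high):
    """Widened price/m2 range: lowest price over largest area, and vice versa."""
    return price_low / area_high, price_high / area_low


def total_price_range(ppm_low, ppm_high, area_low, area_high):
    """Total price range, pairing each price/m2 bound with the matching area bound."""
    return ppm_low * area_low, ppm_high * area_high


# Hypothetical model outputs: price bounds in currency units, area bounds in m2.
ppm_low, ppm_high = price_per_m2_range(100.0, 121.0, 10.0, 11.0)
total_low, total_high = total_price_range(ppm_low, ppm_high, 10.0, 11.0)
print(round(ppm_low, 3), round(ppm_high, 3))
print(round(total_low, 2), round(total_high, 2))
```

The recalculated total range is wider than the original price bounds, which is the intended conservative effect of the adjustment.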

In an attempt to make this method open and accessible, and to provide the seed for a platform that can assist developers or appraisers in validating estimations, the linear models are rebuilt in Excel and made available online for download (see footnote 12). Figure 39 is an example of the interface, where the user must specify the property’s attributes in the first line to obtain first an estimation of the corresponding area and then the hypothetical price according to the year of transaction:

FIGURE 39: Sample Prediction

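ATTRIBUTES

| Type | Stratum | Rooms | Garage | Location | Year |
|---|---|---|---|---|---|
| Apt. | 4 | 2 | 1 | LOS CEDROS | 2018 |

| | Area (m²) | Price/m² | Price |
|---|---|---|---|
| Lower Bound | 62.28 | 5,570,776 | 346,961,120 |
| Estimate | 62.59 | 5,629,266 | 352,363,097 |
| Upper Bound | 62.91 | 5,688,369 | 357,849,180 |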

The results for this example, considering a 2-bedroom, 1-garage, stratum 4 apartment in “Los Cedros”, are consistent with actual market offerings. While the model is certainly not infallible, the value of the method comes from an approach that picks up trends from the complete dataset across a wide market geography and applies them at an individual level to provide an additional guide for the mentioned stakeholders.

12 The file can be downloaded from the following site: https://github.com/spsarralde/housing-bogota/raw/master/Bogota-New_housing_model.xlsx


8. CONCLUSION

The results of this research provide another layer of analysis that has not been fully explored to understand the market for new residential property in Bogota. Accounting for the intersection between locational effects and socioeconomic stratification factors of the urban landscape not only provides higher levels of confidence for econometric modelling, bringing the coefficients of determination from 0.66 to 0.96 and reducing the model’s standard error from 0.48 to 0.16, but is also helpful to provide a tangible and contextual picture of market trends along a determined geographical space. The results suggest that the spatial segmentation approach, which allocates points to their local planning jurisdictions (UPZ or municipalities), implements an appropriate scale and is effective in capturing unobserved locational attributes. This is perhaps one of the most insightful characteristics of this research because it evidences the impact of political economies over geographies, which reinforces the idea that researchers can build accurate predictive models by specifying locational drivers, rather than relying on extremely meticulous datasets.

The analysis produced results that are both insightful and intuitive. Any person acting in this market understands that higher property prices are concentrated in the indicated regions and within higher socioeconomic strata. Nonetheless, the results indicate that supply is biased by this fact and has not effectively aligned with the growing demand in other locations and population segments. New developments continue to proliferate in well-served areas despite high competition, stagnating prices in real terms and decreasing commercial activity, spreading sales thinly between a large number of projects. On the other hand, other portions of the city have seen the effect of induced demand and a more sophisticated articulation of public and private interests to develop real estate at a large scale to attend the effective housing shortage. Even though underserved regions present a more spatially disperse supply, thus less immediate competition, high demand and rising prices, a substantial portion of new residential development has not shifted towards those regions to benefit from this opportunity. The apparent mismatch between supply and demand is characteristic of the real estate development industry where decisions are often made based upon information that lags actual market conditions and without a comprehensive set of open aggregated evidence that reveals city-wide trends.


It is also evident that socioeconomic stratification of the urban space is helpful to describe housing price dynamics, and a straightforward tool for institutions to design policies and implement programs that target specific population segments with limited access to become home-owners. While the indices portray the differentiated price dynamics depending on the strata, the data summaries illustrate the magnitude of the effect that this policy has over the private market, creating and reinforcing development clusters outside conventional regions. However, from a city planning standpoint, creating a differentiated approach based on socioeconomic strata may deliver mixed outcomes. The results suggest an effective implementation in the short term to induce demand and facilitate supply, given by the increasing amounts of area transacted and rising prices over the years in the lower income segments. But in doing so, it limits new housing supply for these segments to places with compatible pre-existing conditions, reinforcing the existing social-spatial segregation, deterring income mixture within neighborhoods and reducing regional housing heterogeneity, which is evident by the structured distribution of housing with identifiable qualitative characteristics across the city.

Despite the high levels of confidence expressed by the regression models, the estimations can be said to be suggestive but not conclusive. This research is focused on introducing a spatial approach that underscores the importance of neighborhoods while providing tangible insights at a general level. However, an extension of this research should complement the model specification by including controls for spatial-temporal dependence among properties to reduce error covariance and adjust the confidence intervals to incorporate the effect of localized dependencies. Accounting for price differentials that are not justified by their exclusive “housing service” contribution, but by the fact that properties are adjacent to other developments that make neighborhoods more attractive, is a critical path to understanding the clustering of prices and developments in space. Finally, an additional approach that would be particularly helpful for real estate developers in Bogota would explore the influence that location externalities, in conjunction with structural attributes, have on property marketability, in order to cross-examine the sales pace assumptions used in underwriting. Hopefully, industry professionals and academics will continue to refine the scope of real estate geospatial analysis in Bogota to encourage more efficient market dynamics, capable of delivering greater value by creating a more inclusive urban housing landscape.


BIBLIOGRAPHY

1. Ministry of Housing, City and Territory (2018). “Portal Minvivienda Colombia supero la meta del déficit habitacional consignada en el Plan Nacional de Desarrollo”. Retrieved from http://www.minvivienda.gov.co/sala-de-prensa/noticias/2018/abril/colombia-supero-la-meta-del-deficit-habitacional-consignada-en-el-plan-nacional-de-desarrollo
2. Trading Economics (2018). “Colombia – Economic Indicators”. Retrieved from https://tradingeconomics.com/colombia/indicators
3. Castaño, J., Laverde, M., Morales, M.A., Yaruro, A.M. (2016). “Índice de los precios de la vivienda nueva para : metodología de precios hedónicos”. Banco de la Republica.
4. District Secretary of Habitat (2017). “Nueva metodología de déficit habitacional urbano para Bogotá”. Retrieved from http://habitatencifras.habitatbogota.gov.co/documentos/Estudios_Sectoriales/DeficitHabitacional2017.pdf
5. National Administrative Department of Statistics (DANE). “Estratificación socioeconómica para servicios públicos domiciliarios”. Retrieved from https://www.dane.gov.co/index.php/en/citizen-service/information-services/socioeconomic-stratification
6. Administrative Unit of District Cadastre (2017). “Analisis Inmobiliario 2016-2017”. Retrieved from https://www.catastrobogota.gov.co/sites/default/files/Resultados_Censo_2017%20version%20final.pdf
7. District Planning Secretary (2017). “Estratificación Socioeconomica por Localidad”. Retrieved from http://www.sdp.gov.co/gestion-estudios-estrategicos/estratificacion/estratificacion-por-localidad
8. Chamber of Commerce of Bogota. “Mapa Interactivo: Distribucion de UPZ”. Retrieved from http://recursos.ccb.org.co/ccb/pot/PC/files/3distribucion.html
9. Banco de la Republica (2018). “New Housing Price Index (IPVNBR)”. Retrieved from http://www.banrep.gov.co/es/indice-precios-vivienda-nueva-ipvnbr
10. Carriazo, F., Ready, R. & Shortle, J. (2011). “Using Frontier Models to Mitigate Omitted Variable Bias in Hedonic Pricing Models: A Case Study for Air Quality in Bogotá, Colombia”. Documentos Cede.
11. Revollo, D. (2009). “Calidad de la vivienda a partir de la metodología de precios hedónicos para la ciudad de Bogota-Colombia”. Revista Digital Universitaria 10-7, UNAM.
12. Basu, S. & Thibodeau, T. G. (1998). “Analysis of spatial autocorrelation in housing prices”. Journal of Real Estate Finance and Economics, Vol. 17:1. Kluwer Academic Publishers, Boston.
13. Chegut, A.M., Eichholtz, P., Rodrigues, P. (2014). “Spatial Dependence in International Office Markets”. Journal of Real Estate Finance and Economics. Springer Science+Business Media, New York.
14. Francke, M. & Van de Minne, A. (2018). “Dealing with Unobserved Heterogeneity in Hedonic Price Models”. Available at SSRN: https://ssrn.com/abstract=3249256 or http://dx.doi.org/10.2139/ssrn.3249256
15. Can, A. & Megbolugbe, I. (1997). “Spatial Dependence and House Price Index Construction”. Journal of Real Estate Finance and Economics. Kluwer Academic Publishers.
16. Geltner, D. & Van de Minne, A. (Forthcoming). Real Estate Price Index Methodologies, with applications in R.
17. kepler.gl: Large-scale WebGL-powered Geospatial Data Visualization Tool. Available at http://kepler.gl/#/. Date accessed: December 26, 2018.
18. DiPasquale, D. & Wheaton, W. (1996). “Urban Economics and Real Estate Markets”. Englewood Cliffs: Prentice Hall.
19. Callahan, Mark F. (2017). “Using Transactional and Spatial Data to Determine Drivers of Industrial Land Value”. Massachusetts Institute of Technology. Cambridge, MA.


APPENDICES

APPENDIX A: Complete Regression Results – Price Indices

Dependent variable: log(price.a) Regression 1 Regression 2 Regression 3 as.factor(year)2011 0.055*** 0.038*** 0.056*** as.factor(year)2012 0.125*** 0.097*** 0.139*** as.factor(year)2013 0.221*** 0.179*** 0.235*** as.factor(year)2014 0.272*** 0.226*** 0.305*** as.factor(year)2015 0.309*** 0.257*** 0.339*** as.factor(year)2016 0.327*** 0.259*** 0.343*** as.factor(year)2017 0.342*** 0.286*** 0.378*** as.factor(year)2018 0.374*** 0.317*** 0.403*** log(area) 1.533*** 1.218*** 1.087*** rooms -0.251*** -0.096*** garage 0.255*** 0.132*** as.factor(type)Casa -0.406*** -0.072*** as.factor(stratum)3 0.249*** 0.177*** as.factor(stratum)4 0.479*** 0.338*** as.factor(stratum)5 0.578*** 0.401*** as.factor(stratum)6 0.719*** 0.478*** as.factor(location)ALAMOS 0.561*** as.factor(location)AMERICAS 0.368*** as.factor(location)APOGEO 0.07 as.factor(location)ARBELÁEZ -0.127*** as.factor(location)ARBORIZADORA 0.02 as.factor(location)BAVARIA 0.306*** as.factor(location)BOGOTÁ 0.118*** as.factor(location)BOJACÁ 0.064*** as.factor(location)BOLIVIA 0.143*** as.factor(location)BOSA CENTRAL -0.144*** as.factor(location)BOSA OCCIDENTAL -0.033* as.factor(location)BOYACA REAL 0.437*** as.factor(location)BRITALIA 0.386*** as.factor(location)CAJICÁ 0.087*** as.factor(location)CALANDAIMA 0.154*** as.factor(location)CAPELLANIA 0.505*** as.factor(location)CARVAJAL 0.266*** as.factor(location)CASA BLANCA SUBA 0.372*** as.factor(location)CASTILLA 0.294*** as.factor(location)CHAGUANÍ -0.380***

as.factor(location)                             0.638***
as.factor(location)CHÍA                         0.152***
as.factor(location)CHICO LAGO                   0.744***
as.factor(location)CIUDAD JARDIN                0.173***
as.factor(location)CIUDAD MONTES                0.393***
as.factor(location)CIUDAD SALITRE OCCIDENTAL    0.541***
as.factor(location)CIUDAD SALITRE ORIENTAL      0.469***
as.factor(location)CIUDAD USME                  -0.322***
as.factor(location)COMUNEROS                    -0.121***
as.factor(location)CORABASTOS                   0.307***
as.factor(location)COTA                         0.072***
as.factor(location)COUNTRY CLUB                 0.515***
as.factor(location)DANUBIO                      -0.120***
as.factor(location)DOCE DE OCTUBRE              0.470***
as.factor(location)                             -0.299***
as.factor(location)EL MINUTO DE DIOS            0.302***
as.factor(location)EL MOCHUELO                  0.210***
as.factor(location)EL PORVENIR                  -0.380***
as.factor(location)EL PRADO                     0.451***
as.factor(location)EL REFUGIO                   0.827***
as.factor(location)EL RINCON                    0.233***
as.factor(location)EL ROSAL                     -0.315***
as.factor(location)EL TESORO                    -0.171***
as.factor(location)ENGATIVA                     0.181***
as.factor(location)FACATATIVÁ                   -0.397***
as.factor(location)FONTIBON                     0.286***
as.factor(location)FONTIBON SAN PABLO           0.00
as.factor(location)FUNZA                        -0.155***
as.factor(location)GACHANCIPÁ                   -0.192***
as.factor(location)GALERIAS                     0.543***
as.factor(location)GARCES NAVAS                 0.262***
as.factor(location)GRAN BRITALIA                -0.096***
as.factor(location)GRAN YOMASA                  -0.231***
as.factor(location)GRANJAS DE                   0.293***
as.factor(location)GUACHETÁ                     (0.12)
as.factor(location)                             0.205***
as.factor(location)GUAYMARAL                    0.131***
as.factor(location)GUTIÉRREZ                    -0.479***
as.factor(location)ISMAEL PERDOMO               0.055**
as.factor(location)JARDIN BOTANICO              0.610***
as.factor(location)JERUSALEM                    (0.03)
as.factor(location)KENNEDY CENTRAL              0.125**
as.factor(location)LA ALHAMBRA                  0.432***
as.factor(location)LA CALERA                    0.191***
as.factor(location)                             0.360***
as.factor(location)LA ESMERALDA                 0.508***
as.factor(location)LA FLORESTA                  0.421***
as.factor(location)LA GLORIA                    -0.136***
as.factor(location)LA MACARENA                  0.292***
as.factor(location)LA SABANA                    0.418***
as.factor(location)LA URIBE                     0.361***
as.factor(location)LAS CRUCES                   0.548***
as.factor(location)LAS FERIAS                   0.456***
as.factor(location)LAS MARGARITAS               0.056**
as.factor(location)LAS NIEVES                   0.566***
as.factor(location)                             -0.115***
as.factor(location)LOS ALCAZARES                0.653***
as.factor(location)LOS ANDES                    0.529***
as.factor(location)LOS CEDROS                   0.411***
as.factor(location)LOS LIBERTADORES             -0.242***
as.factor(location)LOURDES                      (0.02)
as.factor(location)LUCERO                       -0.156*
as.factor(location)MADRID                       -0.232***
as.factor(location)MARCO FIDEL SUAREZ           0.055***
as.factor(location)MARRUECOS                    0.01
as.factor(location)MODELIA                      0.427***
as.factor(location)MONTE BLANCO                 -0.226***
as.factor(location)MOSQUERA                     -0.174***
as.factor(location)NEMOCÓN                      -0.376***
as.factor(location)NIZA                         0.414***
as.factor(location)PARDO RUBIO                  0.619***
as.factor(location)PARQUE EL SALITRE            0.818***
as.factor(location)PASCA                        0.050***
as.factor(location)PASEO DE LOS LIBERTADORES    0.03
as.factor(location)PATIO BONITO                 0.163***
as.factor(location)                             0.434***
as.factor(location)QUINTA PAREDES               0.406***
as.factor(location)QUIROGA                      0.110***
as.factor(location)RESTREPO                     0.02
as.factor(location)SAGRADO CORAZON              0.768***
as.factor(location)SAN ANTONIO DEL              -0.606***
as.factor(location)SAN BERNARDO                 -0.060***
as.factor(location)SAN BLAS                     0.00
as.factor(location)SAN CRISTOBAL NORTE          0.346***
as.factor(location)SAN ISIDRO-PATIOS            0.813***
as.factor(location)SAN JOSE                     (0.01)
as.factor(location)SAN JOSE DE BAVARIA          0.319***
as.factor(location)SAN RAFAEL                   0.386***
as.factor(location)SANTA BARBARA                0.551***
as.factor(location)SANTA CECILIA                0.555***
as.factor(location)SANTA ISABEL                 0.326***
as.factor(location)SOACHA                       -0.326***
as.factor(location)SOPÓ                         0.118***
as.factor(location)SOSIEGO                      0.076***
as.factor(location)SUBA                         0.239***
as.factor(location)                             (0.04)
as.factor(location)                             -0.075***
as.factor(location)                             0.01
as.factor(location)                             0.534***
as.factor(location)                             0.01
as.factor(location)TIMIZA                       0.109***
as.factor(location)TINTAL NORTE                 -0.309***
as.factor(location)TINTAL SUR                   -0.301***
as.factor(location)TOBERIN                      0.288***
as.factor(location)                             0.687***
as.factor(location)TOCANCIPÁ                    -0.167***
as.factor(location)                             0.056*
as.factor(location)                             -0.224***
as.factor(location)USAQUEN                      0.519***
as.factor(location)VENECIA                      0.318***
as.factor(location)VERBENAL                     0.263***
as.factor(location)ZIPAQUIRÁ                    -0.243***
as.factor(location)ZONA FRANCA                  0.038**
as.factor(location)ZONA INDUSTRIAL              0.475***

Constant                                        12.305***                         13.665***                         13.785***
Observations                                    104,269                           104,269                           104,269
R2                                              0.657                             0.888                             0.961
Adjusted R2                                     0.656                             0.888                             0.961
Residual Std. Error                             0.481 (df = 104259)               0.274 (df = 104252)               0.162 (df = 104118)
F Statistic                                     22,142.720*** (df = 9; 104259)    51,750.300*** (df = 16; 104252)   17,153.740*** (df = 150; 104118)

Note: *p<0.1; **p<0.05; ***p<0.01
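The degrees of freedom reported in the summary rows above can be cross-checked with simple arithmetic. The following Python sketch is not part of the original analysis. It assumes each specification is an ordinary least squares fit with an intercept, so that the residual degrees of freedom equal n - k - 1, where k is the numerator degrees of freedom of the F statistic. The name residual_df is illustrative.

```python
# Cross-check of the reported degrees of freedom.
# Assumption: each model is OLS with an intercept, so residual df = n - k - 1.

N_OBS = 104_269  # observations reported for all three specifications


def residual_df(n_obs: int, n_regressors: int) -> int:
    """Residual degrees of freedom for OLS with an intercept."""
    return n_obs - n_regressors - 1


# (numerator df of the F statistic, reported residual df), one pair per model
REPORTED = [(9, 104_259), (16, 104_252), (150, 104_118)]

# True where the reported residual df matches n - k - 1
checks = [residual_df(N_OBS, k) == df for k, df in REPORTED]
print(checks)
```

All three checks hold for the figures reported above. The third specification has 150 - 16 = 134 more regressors than the second, which is consistent with the location factor entering only that specification.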


APPENDIX B: Area Regression Results

Dependent variable: log(area)

rooms                                           0.26***
garage                                          0.25***
as.factor(type)Casa                             0.25***
as.factor(stratum)3                             0.07***
as.factor(stratum)4                             0.19***
as.factor(stratum)5                             0.27***
as.factor(stratum)6                             0.39***
as.factor(location)ALAMOS                       0.23***
as.factor(location)AMERICAS                     0.22***
as.factor(location)APOGEO                       0.08
as.factor(location)ARBELÁEZ                     0.31***
as.factor(location)ARBORIZADORA                 0.17***
as.factor(location)BAVARIA                      0.20***
as.factor(location)BOGOTÁ                       0.32***
as.factor(location)BOJACÁ                       0.16***
as.factor(location)BOLIVIA                      0.13***
as.factor(location)BOSA CENTRAL                 0.21***
as.factor(location)BOSA OCCIDENTAL              0.15***
as.factor(location)BOYACA REAL                  0.14***
as.factor(location)BRITALIA                     0.28***
as.factor(location)CAJICÁ                       0.22***
as.factor(location)CALANDAIMA                   0.22***
as.factor(location)CAPELLANIA                   0.38***
as.factor(location)CARVAJAL                     0.19***
as.factor(location)CASA BLANCA SUBA             0.33***
as.factor(location)CASTILLA                     0.19***
as.factor(location)CHAGUANÍ                     0.27***
as.factor(location)CHAPINERO                    0.26***
as.factor(location)CHÍA                         0.30***
as.factor(location)CHICO LAGO                   0.23***
as.factor(location)CIUDAD JARDIN                0.31***
as.factor(location)CIUDAD MONTES                0.28***
as.factor(location)CIUDAD SALITRE OCCIDENTAL    0.30***
as.factor(location)CIUDAD SALITRE ORIENTAL      0.40***
as.factor(location)CIUDAD USME                  0.01
as.factor(location)COMUNEROS                    0.17***
as.factor(location)CORABASTOS                   0.17***
as.factor(location)COTA                         0.26***
as.factor(location)COUNTRY CLUB                 0.28***
as.factor(location)DANUBIO                      0.15***
as.factor(location)DOCE DE OCTUBRE              0.35***
as.factor(location)EL COLEGIO                   0.27***
as.factor(location)EL MINUTO DE DIOS            0.14***
as.factor(location)EL MOCHUELO                  0.21***
as.factor(location)EL PORVENIR                  0.16***
as.factor(location)EL PRADO                     0.25***
as.factor(location)EL REFUGIO                   0.32***
as.factor(location)EL RINCON                    0.21***
as.factor(location)EL ROSAL                     0.09**
as.factor(location)EL TESORO                    0.08
as.factor(location)ENGATIVA                     0.17***
as.factor(location)FACATATIVÁ                   0.20***
as.factor(location)FONTIBON                     0.16***
as.factor(location)FONTIBON SAN PABLO           0.13***
as.factor(location)FUNZA                        0.22***
as.factor(location)GACHANCIPÁ                   0.14***
as.factor(location)GALERIAS                     0.25***
as.factor(location)GARCES NAVAS                 0.24***
as.factor(location)GRAN BRITALIA                0.24***
as.factor(location)GRAN YOMASA                  0.17***
as.factor(location)GRANJAS DE TECHO             0.22***
as.factor(location)GUACHETÁ                     0.77***
as.factor(location)GUASCA                       0.03
as.factor(location)GUAYMARAL                    0.28***
as.factor(location)GUTIÉRREZ                    0.26***
as.factor(location)ISMAEL PERDOMO               0.15***
as.factor(location)JARDIN BOTANICO              0.27***
as.factor(location)JERUSALEM                    0.21***
as.factor(location)KENNEDY CENTRAL              0.44***
as.factor(location)LA ALHAMBRA                  0.20***
as.factor(location)LA CALERA                    0.26***
as.factor(location)LA CANDELARIA                0.20***
as.factor(location)LA ESMERALDA                 0.28***
as.factor(location)LA FLORESTA                  0.26***
as.factor(location)LA GLORIA                    0.18***
as.factor(location)LA MACARENA                  0.23***
as.factor(location)LA SABANA                    0.27***
as.factor(location)LA URIBE                     0.28***
as.factor(location)LAS CRUCES                   0.35***
as.factor(location)LAS FERIAS                   0.15***
as.factor(location)LAS MARGARITAS               0.21***
as.factor(location)LAS NIEVES                   0.20***
as.factor(location)LENGUAZAQUE                  0.12***
as.factor(location)LOS ALCAZARES                0.25***
as.factor(location)LOS ANDES                    0.24***
as.factor(location)LOS CEDROS                   0.24***
as.factor(location)LOS LIBERTADORES             0.08***
as.factor(location)LOURDES                      0.24***
as.factor(location)LUCERO                       0.05
as.factor(location)MADRID                       0.18***
as.factor(location)MARCO FIDEL SUAREZ           0.11***
as.factor(location)MARRUECOS                    0.02
as.factor(location)MODELIA                      0.30***
as.factor(location)MONTE BLANCO                 0.19***
as.factor(location)MOSQUERA                     0.23***
as.factor(location)NEMOCÓN                      0.09
as.factor(location)NIZA                         0.31***
as.factor(location)PARDO RUBIO                  0.33***
as.factor(location)PARQUE EL SALITRE            0.36***
as.factor(location)PASCA                        0.20***
as.factor(location)PASEO DE LOS LIBERTADORES    0.36***
as.factor(location)PATIO BONITO                 0.37***
as.factor(location)PUENTE ARANDA                0.27***
as.factor(location)QUINTA PAREDES               0.31***
as.factor(location)QUIROGA                      0.17***
as.factor(location)RESTREPO                     0.20***
as.factor(location)SAGRADO CORAZON              0.28***
as.factor(location)SAN ANTONIO DEL TEQUENDAMA   0.17***
as.factor(location)SAN BERNARDO                 0.21***
as.factor(location)SAN BLAS                     0.15***
as.factor(location)SAN CRISTOBAL NORTE          0.29***
as.factor(location)SAN ISIDRO-PATIOS            0.92***
as.factor(location)SAN JOSE                     0.01
as.factor(location)SAN JOSE DE BAVARIA          0.25***
as.factor(location)SAN RAFAEL                   0.22***
as.factor(location)SANTA BARBARA                0.18***
as.factor(location)SANTA CECILIA                0.27***
as.factor(location)SANTA ISABEL                 0.21***
as.factor(location)SOACHA                       0.15***
as.factor(location)SOPÓ                         0.38***
as.factor(location)SOSIEGO                      0.23***
as.factor(location)SUBA                         0.19***
as.factor(location)TABIO                        0.12***
as.factor(location)TAUSA                        0.22***
as.factor(location)TENJO                        0.15***
as.factor(location)TEUSAQUILLO                  0.27***
as.factor(location)TIBABUYES                    0.18***
as.factor(location)TIMIZA                       0.09***
as.factor(location)TINTAL NORTE                 0.07**
as.factor(location)TINTAL SUR                   0.18***
as.factor(location)TOBERIN                      0.26***
as.factor(location)TOCAIMA                      0.48***
as.factor(location)TOCANCIPÁ                    0.16***
as.factor(location)TUNJUELITO                   0.20***
as.factor(location)UNE                          0.26***
as.factor(location)USAQUEN                      0.31***
as.factor(location)VENECIA                      0.34***
as.factor(location)VERBENAL                     0.21***
as.factor(location)ZIPAQUIRÁ                    0.22***
as.factor(location)ZONA FRANCA                  0.20***
as.factor(location)ZONA INDUSTRIAL              0.26***

Constant                                        2.94***
Observations                                    104,269
R2                                              0.84
Adjusted R2                                     0.84
Residual Std. Error                             0.17 (df = 104127)
F Statistic                                     3,930.23*** (df = 141; 104127)

Note: *p<0.1; **p<0.05; ***p<0.01
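Because the dependent variable is log(area), a coefficient b on an indicator term does not read directly as a percentage. Under the usual log-linear interpretation, the implied difference in area relative to the omitted baseline category is 100 * (exp(b) - 1) percent. The following Python sketch applies that conversion to three coefficients copied from the table above. It is an illustrative reading aid rather than part of the original analysis, and the helper name pct_effect is hypothetical.

```python
import math


def pct_effect(coef: float) -> float:
    """Percentage difference in the dependent variable implied by a
    coefficient in a log-linear model: 100 * (exp(coef) - 1)."""
    return 100.0 * (math.exp(coef) - 1.0)


# Coefficients copied from the table above (dependent variable: log(area)).
COEFS = {
    "as.factor(type)Casa": 0.25,
    "as.factor(stratum)6": 0.39,
    "as.factor(location)SAN ISIDRO-PATIOS": 0.92,
}

# Print the implied percentage difference for each selected term.
for term, coef in COEFS.items():
    print(f"{term}: {pct_effect(coef):+.1f}%")
```

Under these assumptions, the stratum 6 coefficient of 0.39 corresponds to units roughly 48 percent larger than those in the omitted stratum, holding the other regressors fixed. For small coefficients the conversion changes little, but for values such as 0.92 the gap between the raw coefficient and the implied percentage is large.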
