Habitat International 67 (2017) 69–78

Contents lists available at ScienceDirect

Habitat International

journal homepage: www.elsevier.com/locate/habitatint

Spatial and hedonic analysis of housing prices in

Zezhou Huang a, Ruishan Chen a,b,*, Di Xu c,**, Wei Zhou d

a Key Laboratory of Geographic Information Science, Ministry of Education, and School of Geographic Sciences, East China Normal University, Shanghai 200241, China
b Key Laboratory of the Coastal Zone Exploitation and Protection, The Ministry of Land and Resource of China, Nanjing 210023, China
c Urban Development Research Institution, Shanghai Normal University, Shanghai 200234, China
d School of Public Administration, Hohai University, Nanjing 210098, China

article info

Article history:
Received 1 March 2017
Received in revised form 2 July 2017
Accepted 5 July 2017

* Corresponding author. Present/permanent address: Key Laboratory of Geographic Information Science, Ministry of Education, and School of Geographic Sciences, East China Normal University, Shanghai 200241, China.
** Corresponding author. Present/permanent address: Urban Development Research Institution, Shanghai Normal University, Shanghai 200234, China.
http://dx.doi.org/10.1016/j.habitatint.2017.07.002
0197-3975/© 2017 Elsevier Ltd. All rights reserved.

1. Introduction

Housing prices in Chinese cities, particularly in Shanghai, Beijing, Guangzhou and Shenzhen, are among the highest in the world. The average price for an apartment in Shanghai was around $10,000 per square metre in 2014 (Shanghai Bureau of Statistics, 2015), which is clearly beyond affordability for most families and strongly affects the quality of life of the urban population. According to Tobler's first law of geography, the closer two geographical entities are, the greater their influence on each other (Tobler, 1970). Accordingly, the price of a house is likely to be most strongly affected by the prices of neighbouring houses, although the degree of influence remains to be explored. Hedonic models may be the most appropriate tools to understand this (Rosen, 1974). Hedonic models deconstruct housing prices into their component attributes, such as structural, neighborhood and accessibility attributes, estimate the values of these attributes, and evaluate housing prices according to them (Hui, Zhong, & Yu, 2012; Liao & Wang, 2012; Wen, Xiao, & Zhang, 2017).

Shanghai, one of the leading industrial centers in China, suffers greatly from the problem of high housing prices. In 2003, the average housing price in Shanghai reached 5118 Chinese Yuan (618.34 US dollars at that time) per square meter, which even surpassed the average housing price in Beijing. Since then, Shanghai has been one of the cities with the highest average housing prices in China (National Bureau of Statistics of China, 2003). While the United Nations Center for Human Settlements estimated that a rational ratio of housing price to income is about 3:1, the ratio in Shanghai in 2004 was 14.8:1. Even though the residents' average income in Shanghai was less than one-tenth of that in the US, the average housing price in Shanghai was higher than that in New York and Chicago (Gu, 2005). In 2010, with the ratio of housing price to income at about 17:1, middle-income residents in Shanghai could not afford housing through bank loans (Luan, Zhou, & Yi, 2012). Under such circumstances, it is meaningful to further explore the spatial pattern and determining factors of housing prices in Shanghai.

A number of different approaches have been adopted to improve the accuracy of housing price predictions. Among these, Ordinary Least Squares (OLS) regression is perhaps the most commonly employed hedonic pricing approach (Craven & Islam, 2011). OLS has several distinctive features, such as simplicity of calculation, fast computation, and easy interpretation. However, since many hedonic problems have been generated by the variables, researchers have challenged the validity of OLS regression (Gao & Li, 2011). One problem of OLS is that the distribution of its residuals over space is characterized by regularity and significant spatial correlation (Des Rosiers, Theriault, & Villeneuve, 2000). Spatial auto-correlation, the correlation between features based upon replicated realizations of the geographic distribution of some attributes, persists in the result of OLS regression (Dunse, Jones, Orr, & Tarbet, 1998). The main shortcoming of spatial auto-correlation is that it can make the t-test values of OLS regression unreliable (Bowen, Mikelbank, & Prestegaard, 2001). The resultant t-values make it impossible to decide whether some explanatory variables are significant in modeling the housing prices (Miron, 1984). Small sampling variability is also a potential weakness in OLS regression (Basu & Thibodeau, 1998). An additional problem of the method is spatial heterogeneity, whereby there is continuous change and uneven distribution of spatial data in a study area. The relevance of the independent and dependent variables is likely to vary within the study area (Gillen, Thibodeau, & Wachter, 2001).

In order to model spatially heterogeneous processes, the Geographically Weighted Regression (GWR) model is an appropriate alternative (Cleveland & Devlin, 1988; Wang et al., 2016). GWR provides a local model of variables to calculate the regression equations. It incorporates both independent and dependent variables within the study area to build local equations (Fotheringham, Brunsdon, & Charlton, 2003). However, GWR also has its limitations and its credibility has been challenged (Wheeler, 2014, pp. 1435–1459). Problems such as multicollinearity (Wheeler & Tiefelsdorf, 2005), extreme coefficients (Cho, Lambert, Kim, & Jung, 2009) and dependencies between spatial errors (Fotheringham et al., 2003) have been uncovered in GWR. In general, however, GWR is an appropriate model to analyze the spatial non-stationarity of housing price distribution (Paez, Long, & Farber, 2008).

There have been many studies on housing prices in China in the last ten years. For example, Zheng, Kahn, and Liu (2010) used hedonic models to examine the relationship between housing prices, investment, wages, and pollution and found that Chinese cities are undergoing a transition from "producer cities" to "consumer cities". Hou (2010) analyzed housing market prices in Beijing and Shanghai and concluded that a housing price bubble seems to have appeared in Beijing from 2005 to 2008 and in Shanghai from 2003 to 2004. Li and Ge (2008) used their models to evaluate inflation capabilities of housing assets in Shanghai and found that, even if hedging against anticipated and unanticipated inflation was not a feature of the Shanghai real estate market, the actual yield in ratio was still positive. However, while many studies have analyzed the real estate market on the macro level, studies of housing prices on the micro level in Shanghai are still limited.

Although the theory of hedonic price models has been widely applied to analyze housing prices in countries like the US (Sander & Polasky, 2009), France (Gourieroux & Laferrere, 2009), Norway (Osland, 2010), Japan (Shimizu, Takatsuji, & Nishimura, 2010), Austria (Helbich, Brunauer, Vaz, & Nijkamp, 2013), and the Netherlands (Özyurt, 2014), similar studies in China are still scarce. Questions such as how housing prices vary over space in Shanghai, or which factors are most important in determining the market value of a single property, are still unanswered. Therefore, the main purpose of this article is to estimate housing prices in light of their relations with a series of independent variables including the size, quality, neighborhood, and accessibility of the property.

We applied big data techniques, more specifically a large set of data on house location, characteristics and price extracted from a real estate website, to examine the spatial distribution of housing prices in Shanghai and its determining factors.

The methodology and analysis of results may help government administration to better understand the spatial pattern and determining factors of housing prices in Shanghai so that they can, accordingly, formulate policies on land use and urban planning. The performances of both OLS and GWR models are examined in our analysis, which contributes to theoretical development. Although the analysis of housing prices is limited to Shanghai in this article, the basic method can be applied to further analyze the housing prices in other major cities around the world.

This article consists of 4 sections. In section 2, the data set and the hedonic pricing theory are described. In section 3, the results of the spatial auto-correlation analysis, hot spot analysis, and the hedonic models estimated by OLS and GWR are presented. In section 4, discussion and conclusion are provided.

2. Data and method

2.1. Housing price and attributes data

This article uses 12,732 individual records from a Shanghai real estate website (http://sh.centanet.com/ershoufang), which is owned by Centaline Property Agency, to analyze the spatial relation of housing prices in Shanghai in early May 2016. Centaline Property Agency, founded in 1978, is the largest real estate intermediary business agency in Shanghai. Its data have high credibility. Because thousands of records are uploaded to the website each day and they are always presented in a mixed and disorderly way, it is difficult to gather this information in good order manually. However, a web crawler, an Internet bot which systematically browses the Internet and gathers useful information, can collect, clean, and filter data automatically, which greatly saves manpower. In this study, we used a web crawler to acquire all the data. The distribution of the housing data collected is shown in Fig. 1.

Data collected by the web crawler include the price, the area, the address, the orientation, the floor, the decoration condition, and the age of houses that were sold in early May of 2016 in Shanghai. Each address is assigned a latitude and a longitude using a software program called Xgeocoding. It is evident that the features are concentrated in the center of Shanghai and are not uniformly distributed. The socioeconomic data of different districts, including the GDP per capita, male-female ratio, population density and average wage in 2010, are collected from the 2010 Shanghai population census. Shanghai has been readjusting and optimizing its own industrial structure since the development and opening of the [...] 1992. In recent years, the spatial distribution of industry in Shanghai has been relatively stable, which results in a relatively stable spatial distribution of the population. Therefore, data such as the male-female ratio and population density in Shanghai in 2010 can still be used to explain the housing prices in 2016. However, the division of districts in Shanghai changed between 2010 and 2016: the old Huangpu District and [...] merged into the new Huangpu District in 2011, and the old Jing'an District and [...] District merged into the new Jing'an District in 2015. In this article, we use the division of districts in 2010.

The housing prices in Shanghai are analyzed with Geographical Information System (GIS). GIS provides a powerful tool for the hedonic price model under Spatial Statistics Tools in ArcToolbox, and it has been applied by many scholars in recent years. GIS can effectively present all the geographic entities by positioning them on the map according to their coordinates, which is conducive to further analysis (Din, Hoesli, & Bender, 2001). Besides, GIS also has the advantages of integrating data and analyzing spatial entities efficiently.

2.2. Hedonic price theory

In the past 3 decades, the hedonic price model has been widely used to evaluate the value of houses worldwide. The hedonic price model decomposes its object of study into component attributes and estimates the values of these attributes (Lancaster, 1966; Rosen, 1974). In the field of real estate, the hedonic price model commonly uses regression analysis to estimate the effects of various housing attributes, including structural attributes, accessibility attributes and neighborhood attributes, on housing prices (Wilhelmsson, 2002). Among them, the accessibility attributes and neighborhood attributes contain the main location factors. However, it is difficult for the traditional hedonic price model to capture the location factors. In order to accurately reflect the impact of location factors on the housing prices, the spatial effects should be
considered.

Fig. 1. The distribution of extracted housing data in Shanghai (Chongming and other islands are not included).

To resolve this problem, many scholars began to apply the spatial hedonic price model. One important issue in constructing a spatial hedonic price model is the integration of spatial effects into the model. If the spatial effects are not properly identified, a series of problems such as multicollinearity, spatial correlation, and spatial heterogeneity will arise (Des Rosiers, Theriault, & Villeneuve, 2000). In the early literature on the spatial hedonic price model, Dubin (1998) used the geostatistical method of Kriging to estimate the covariance structure of the model. Can (1990) used a special lag model with variable coefficients to capture the neighborhood effects.

In recent years, scholars have begun to integrate the hedonic price model with Geographical Information System (GIS). GIS makes it possible to calculate various neighborhood and location attributes in the hedonic price model. Real estate studies that have considered the spatial effect show a great improvement in terms of price forecasting and statistical inference (Bowen et al., 2001; Dubin, 1998). Since houses are fixed in geographical space, and almost all housing information is essentially spatial information, it is suitable to apply GIS to explore real estate (Belsky, Can, & Megbolugbe, 1998). Besides, GIS can also synthesize all kinds of spatial data and their analysis comprehensively. GIS has already been widely applied to analyze real estate in different countries and regions (Thrall, 1998).

2.3. Hedonic price models

The hedonic housing price model is a method to analyze the relation between housing prices and their attributes. Traditional hedonic price models set every attribute of housing as an explanatory variable and set the housing price as the dependent variable (Chau & Chin, 2002). Generally, the housing price can be expressed as:

$$ P = f(S, A, N) \tag{1} $$

where S is the structural attributes, A is the accessibility attributes, and N is the neighborhood attributes. Ordinary Least Squares (OLS) is usually used to estimate housing prices in traditional hedonic price models. OLS assumes that error terms are independent and identically distributed, samples are independent of each other, and explanatory variables are independent and exogenous. The OLS model can be defined as:

$$ Y = \alpha + X\beta + \varepsilon \tag{2} $$

where $Y$ is an $n$ vector of the housing prices, $\alpha$ is an $n$ vector of parameters, $\beta$ is a $k$ vector of parameters, $X$ is the $n \times k$ matrix of the combination of attributes and $\varepsilon$ is an $n \times 1$ error term. By examining the parameters, the effect of each attribute can be analyzed.

The main spatial effects in hedonic models are spatial auto-correlation and spatial heterogeneity (Des Rosiers, Theriault, & Villeneuve, 2000). If spatial effects exist in the data of nearby houses, the hedonic housing price models will not be consistent with the normal distribution, which brings forth challenges to spatial statistics. If OLS is still used to estimate housing prices regardless of the spatial effects, the precision of the OLS prediction and the significance of the model estimate will be impaired.

The extent of spatial dependency among geographic entities can be measured and analyzed by spatial auto-correlation analysis (Anselin & Bera, 1998). There are many ways to examine spatial auto-correlation, such as Geary's C (Geary, 1954), Moran's I (Moran, 1950) and Getis-Ord Gi* (Ord & Getis, 1995). Among them, Moran's I is the most common method, which is defined as:

$$ I = \frac{n \sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij} (x_i - \bar{X})(x_j - \bar{X})}{\sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij} \, \sum_{i=1}^{n} (x_i - \bar{X})^2} \tag{3} $$

And Getis-Ord Gi* is defined as:

$$ G_i^* = \frac{\sum_{j=1}^{n} w_{ij} x_j - \bar{X} \sum_{j=1}^{n} w_{ij}}{\sqrt{\left[ \frac{\sum_{j=1}^{n} x_j^2}{n} - \bar{X}^2 \right] \frac{n \sum_{j=1}^{n} w_{ij}^2 - \left( \sum_{j=1}^{n} w_{ij} \right)^2}{n - 1}}} \tag{4} $$

where $n$ is the sample size, $x_i$ is the housing price at location $i$, $x_j$ is the housing price in a specified area, $\bar{X}$ is the mean of the attribute and $w_{ij}$ is the element of the spatial weight matrix for locations $i$ and $j$.

If both spatial auto-correlation and spatial heterogeneity exist simultaneously in the same data, the problem will be far more difficult. De Graaff, Florax, Nijkamp, and Reggiani (2001) listed three obstacles in handling these spatial effects: the difficulty in distinguishing spatial auto-correlation and spatial heterogeneity through observation, the special form of spatial heterogeneity caused by spatial auto-correlation, and the difficulty in distinguishing spatial auto-correlation and spatial heterogeneity through an empirical method.

Different from OLS, Geographically Weighted Regression (GWR) is a local regression method whose parameters vary spatially (Cleveland & Devlin, 1988). It uses a series of weight matrices to integrate spatial structure and geographic information into the linear regression model. GWR then combines all the dependent and independent attributes within the bandwidths of target features and builds different equations. GWR can effectively address the spatial non-stationarity of housing price distribution (Paez et al., 2008), although issues like multicollinearity (Wheeler & Tiefelsdorf, 2005), extreme coefficients (Cho et al., 2009) and the dependence between spatial errors (Fotheringham et al., 2003) are still unavoidable.

2.4. Explanatory variables and their expected effects

Explanatory variables that influence housing prices have been widely discussed in applications of hedonic pricing models. The structural attribute, which refers to the internal attributes of the house structure, is a key factor that decides the housing price (Tu, Yu, & Sun, 2004). The living area (Baumont, Ertur, & Gallo, 2004; Helbich et al., 2013; Opoku & Abdul-Muhmin, 2010; Wilhelmsson, 2002) and the number of rooms or toilets (Baumont et al., 2004; Helbich et al., 2013; Opoku & Abdul-Muhmin, 2010) are widely recognized as the most important structural attributes. The type of house, such as whether it has a terrace, garage and basement (Baumont et al., 2004; Helbich et al., 2013), is often considered by the buyer. The age of the house (Baumont et al., 2004; Helbich et al., 2013) and its decoration conditions (Opoku & Abdul-Muhmin, 2010) are also emphasized by residents.

The neighborhood attribute, which is the attribute of the house environment and its surrounding area, is another kind of explanatory variable that is worth noting. The unemployment rate and average wage are both key neighborhood attributes which indicate the development degree of the area (Baumont et al., 2004; Helbich et al., 2013). The education level and social class of the population living in the neighborhood are also an appropriate reflection of the surrounding environment (Baumont et al., 2004).

The accessibility attribute is also considered a key factor by prospective residents. Residents always prefer to live in an area that is close to various amenities such as parks (Montgomery & Curtis, 2006). Distance to the nearest public facilities such as school, commercial center, hospital, workplace (Baumont et al., 2004) and nearby transportation (Wang & Jiang, 2016) will also be a major consideration for house purchasers.

2.4.1. Structural attributes

The housing data on Centaline Property Agency's website included the price, area, address, floor, orientation, decoration and age of houses. The housing price is recorded in units of ten thousand Chinese Yuan (min = 32, max = 8000, mean = 476 and std = 448). The housing price has a positive correlation with the area because, under similar conditions, larger houses are more expensive than smaller ones. The area is recorded in units of square meter (min = 16, max = 916, mean = 121 and std = 84). Besides, market values also depend greatly on the orientation, the floor, the decoration, and the age of houses.

Houses facing the west or east will be exposed to the sun too much, especially in summer. However, the environment in houses facing the north will be cold and damp because of the lack of sunshine. Houses facing the south are the best because they will receive enough sunshine without direct solar radiation. Therefore, houses facing the south will have higher prices than others.

Houses on the ground floor of a building may be cold and damp, and their views are always restricted. If the building is surrounded by woods, the harassment of mosquitoes and other insects is also unbearable on the lower floors. However, houses on the higher floors often suffer from bad weather such as extreme heat and air pollution. In a building without an elevator, it is also wearying to climb stairs. Therefore, houses on the intermediate floors of a building are always the most expensive.

A fully furnished house is usually more expensive than a rough house, and a newly built house is more expensive than an old one. Both the extent of decoration and the age of the houses should be considered as structural attributes of housing prices.

Each house is given the variables 'price' and 'area', which are respectively assigned the log of the housing price in ten thousand Chinese Yuan and the log of the total floor area in square meter. Dummy variables 'decoration', 'orientation', 'floor' and 'old' are also applied to each house: houses that are well furnished, facing the south, on the intermediate floors, or built before 1990 are assigned the value 1 for the corresponding variable, while others are assigned the value 0. The distributions of housing price and total floor area in Shanghai are shown in Fig. 2.

Fig. 2. The distribution of (a) housing price and (b) total floor area.

2.4.2. Neighborhood attributes

The first neighborhood attribute used in the hedonic analysis is GDP per capita (min = 7387 CNY, max = 41,726 CNY, mean = 13,948 CNY and std = 6328) in different districts, which reflects the living standards of people. The second neighborhood attribute is the average wage in different communities, which evaluates people's economic levels. Generally, the higher people's living standards and economic levels are, the more they are willing to pay for comfortable accommodation. The average wage is recorded in units of Chinese Yuan per year (min = 36,628, max = 97,125, mean = 61,836 and std = 7603).

The population density in every community is the third neighborhood attribute. Neighborhoods that are densely populated are expected to be more competitive. Housing prices rise as the demand for houses rises. The population density is recorded in units of number of people per square meter (min = 0.000, max = 1.057, mean = 0.107 and std = 0.096).

Male-female ratio (min = 0.015, max = 14.062, mean = 0.844 and std = 0.398) is also considered a neighborhood attribute in our study. The male-female ratio may not affect housing prices as directly as the previous attributes do. However, in China, it is always the men who need to buy houses. If there are more males than females in some places, males tend to be more eager to buy houses due to the pressure from society, and housing prices will be higher there.

Each house is given the variables 'gdppc', 'avewage', 'density' and 'gender', which are respectively assigned the GDP per capita, the average wage in Yuan per year, the population density in people per square meter and the male-female ratio. The distributions of GDP per capita, average wage, population density and male-female ratio in Shanghai are shown in Fig. 3.

Fig. 3. The distribution of (a) GDP per capita, (b) average wage, (c) population density and (d) male-female ratio in Shanghai.

2.4.3. Accessibility attributes

The distance from houses to the downtown area is one of the most important accessibility attributes. Houses that are near downtown areas provide more convenience to people for shopping, dining, and entertainment. In Shanghai, there are nine downtown areas that are worth mentioning, as shown in Fig. 4: Wujiaochang in [...], Quyang in [...], Daning in Zhabei district, Ever-bright Center in Zhabei district, Dapuqiao Station in Huangpu district, Jing'an Temple in Jing'an district, [...] in [...], Changfeng ecological business district in Putuo district and Central Business District in Putuo district. Each house is given a variable 'distance', which is assigned the Euclidean distance from the house to the downtown area in meters.

Also, houses near schools are favored by families with children. We marked the top 36 key primary schools in Shanghai with GIS as shown in Fig. 4, and houses that are within a buffer zone with a radius of 1 km are regarded as houses near the primary schools in this article. Each house is given a dummy variable 'school', which is assigned the value 1 for houses near primary schools and the value 0 for others.
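The variable definitions in Section 2.4 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the GIS workflow used in the study: the record fields, the helper names (`nearest_distance`, `encode_house`), the use of projected coordinates in meters, the natural logarithm, and the choice of the nearest downtown area for 'distance' are all hypothetical choices made for the example.

```python
import math

SCHOOL_BUFFER_M = 1000.0  # 1 km buffer radius around key primary schools (Section 2.4.3)


def nearest_distance(x, y, points):
    """Euclidean distance in meters from (x, y) to the closest point in `points`."""
    return min(math.hypot(x - px, y - py) for px, py in points)


def encode_house(house, downtowns, schools):
    """Turn one raw listing into explanatory variables in the style of Table 1.

    `house` is a hypothetical dict with projected coordinates in meters.
    """
    x, y = house["x"], house["y"]
    return {
        "price": math.log(house["price_10k_yuan"]),  # log of price (ten thousand Yuan)
        "area": math.log(house["area_m2"]),          # log of total floor area (m^2)
        "decoration": 1 if house["furnished"] else 0,
        "orientation": 1 if house["facing"] == "south" else 0,
        "floor": 1 if house["floor_level"] == "intermediate" else 0,
        "old": 1 if house["year_built"] < 1990 else 0,
        # Assumption: distance to the nearest of the downtown areas.
        "distance": nearest_distance(x, y, downtowns),
        "school": 1 if nearest_distance(x, y, schools) <= SCHOOL_BUFFER_M else 0,
    }


# Toy data, invented for illustration only.
downtowns = [(0.0, 0.0), (5000.0, 5000.0)]
schools = [(300.0, 400.0)]
house = {
    "price_10k_yuan": 476, "area_m2": 121, "furnished": True, "facing": "south",
    "floor_level": "ground", "year_built": 1985, "x": 0.0, "y": 0.0,
}
row = encode_house(house, downtowns, schools)
```

Working in projected coordinates keeps the 1 km buffer test a plain distance comparison; with raw latitude and longitude a geodesic distance would be needed instead.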

Fig. 4. The distribution of (a) downtown areas and (b) key primary schools.

The descriptions of all the explanatory variables used in the hedonic models are shown in Table 1.

3. Results

3.1. Spatial auto-correlation

In GIS, spatial auto-correlation is mainly evaluated by calculating Moran's I. After being normalized by variance, values of Moran's I are in the range of -1 to 1. Positive values of Moran's I mean that a high value tends to attract other high values but repel low values, and negative values mean just the opposite. The index will approximate zero if the features' values are the result of a random process. The absolute value of Moran's I indicates the degree to which values are attracted or repelled. The p-value is the probability that the observed pattern was produced by a random process, and it is always reported together with a z-score. The features' high and low values can be deemed spatially clustered rather than randomly distributed if the p-value is low and the z-score is positive.

Calculating Moran's I of the housing prices in Shanghai leads to the following results (Table 2). The p-value, which is almost zero, shows that it is almost impossible for these data to be the result of a random spatial process. The Moran's I, which is 0.916, means that houses with high prices are likely to cluster together, and so are the cheap ones. The spatial auto-correlation in Shanghai housing prices may affect the subsequent analysis.

Table 2
The result of auto-correlation analysis.

Moran's Index     0.916
Expected Index    0.001
Variance          0.002
z-score           22.941
p-value           0.000

3.2. Hot spots analysis

The Hot Spot Analysis tool in GIS can identify whether houses with high or low prices cluster together. The Getis-Ord Gi* statistic is used in the Hot Spot Analysis to recognize statistically significant spatial clusters of values. z-scores and p-values are also included in the output table and are used to ensure that the results are not the outcome of a random process. A False Discovery Rate (FDR) correction is then applied to account for multiple testing and spatial dependence. A high value alone is not enough for a point to be deemed a statistically significant hot spot; it must also be surrounded by other points with high values. We use a fixed distance band to conceptualize the spatial relationships. The threshold distance guarantees that every point has at least one neighbor. The output features are shown in Fig. 5.

Table 1
Descriptions of the explanatory variables.

variable       description
price          Log of housing price (ten thousand Chinese Yuan)
area           Log of total floor area (square meter)
decoration     Decoration of the house (1 = furnished; 0 = not furnished)
orientation    Orientation of the house (1 = south; 0 = others)
floor          Floor of the house (1 = intermediate; 0 = others)
old            Whether the house was built before 1990 (1 = yes; 0 = no)
gdppc          GDP per capita
school         Whether the house is near the primary schools (1 = yes; 0 = no)
gender         Male-female ratio
avewage        Average wage (Chinese Yuan per year)
density        Population density (number of people per square meter)
distance       Distance from the house to the downtown area (meter)

3.3. Ordinary Least Squares

A summary report of OLS was obtained by inputting the attributes into the Ordinary Least Squares tool in GIS (Table 3). The adjusted R-squared of the OLS model is 0.70, which indicates that almost 70% of the variation in logged prices can be explained. Coefficients of attributes marked with an asterisk in the probability column are unlikely to be zero. The results are based on the incorrect assumption of spatially uncorrelated model residuals, which may lead to inefficient parameter estimates.

Several quantities in Table 3 are worthy of attention. Firstly, the coefficients of the independent variables are most important because they show the relation between the dependent variable and the independent variables. Secondly, the standard errors also need to be
noticed because they indicate how likely the same values are to be obtained after resampling and recalculating, and they determine the significance of the results. Thirdly, t-statistics can evaluate the statistical significance of the independent variables. Finally, a low p-value demonstrates that a coefficient is not likely to be zero.

Table 3
The result of Ordinary Least Squares.

variable              coefficient   std error   t-statistic   probability
intercept             1.156         0.023       56.893        0.000*
area                  1.078         0.003       373.630       0.000*
decoration            0.026         0.003       8.585         0.000*
orientation           0.015         0.000       4.764         0.000*
floor                 0.003         0.376       0.885         0.000*
old                   0.117         0.004       22.134        0.000*
gdppc                 0.001         0.000       34.045        0.000*
school                0.193         0.007       28.823        0.000*
gender                0.026         0.008       3.121         0.002*
avewage               0.001         0.000       9.603         0.000*
density               0.789         0.095       7.839         0.000*
distance              0.001         0.000       173.108       0.000*
Multiple R-Squared    0.700
Adjusted R-Squared    0.700

All the structural attributes like decoration, orientation, and floor show a positive correlation with housing prices as expected. The age of the houses indeed has a great effect on housing prices. Most of the old houses have a short property right period and are also poor in appearance, which keeps young customers away, and a number of problems often emerge in their infrastructure. The neighborhood attributes like population density, male-female ratio, average wage and school proximity all promote housing prices to some extent.

The distance to the downtown area is also an essential attribute of housing prices as expected. Houses near downtown areas are favored by families with children and office workers because of their great convenience and accessibility.

Fig. 5. The distribution of hot spots of housing prices in Shanghai.

3.4. Geographically weighted regression

To improve the estimation of housing prices in Shanghai, we use Geographically Weighted Regression in GIS. In GWR, the Gaussian kernel is used to distinguish between global and local effects by assigning a high value to densely-distributed features and vice versa. Because GWR is demanding with respect to its variables, we choose total floor area, GDP per capita, distance to downtown areas, and male-female ratio as the explanatory variables from the structural, neighborhood and accessibility attributes of the housing prices. A fixed distance is used to calculate the bandwidths of the Gaussian kernel and the Akaike Information Criterion. The adjusted R-Squared in this GWR analysis is 0.826. However, the result is still subject to multicollinearity (Wheeler & Tiefelsdorf, 2005), extreme coefficients (Cho et al., 2009) and the dependence between spatial errors (Fotheringham et al., 2003). The local adjusted R-Squared or predicted value is shown in Fig. 6a. The local adjusted R-Squared ranges from 0.633 to 0.884 throughout Shanghai. Housing prices in downtown areas such as Jing'an District and Huangpu District are best explained. Some houses in suburban areas such as [...] and Jiading District are also well explained. Fig. 6b shows the different effects of explanatory variables on housing prices in Shanghai.

3.4.1. The effect of the total floor area

A positive effect of the total floor area on housing prices is shown in Fig. 7a. The strongest effect is shown in the downtown area while the weakest one is shown in suburban areas. Because of the convenience of the downtown area and the scarcity of its land, citizens are willing to pay more for houses with a larger area there. However, land in suburban areas is adequate, so larger houses there are not in high demand.

3.4.2. The effect of population density

The effect of population density on housing prices is relatively complicated in Shanghai (Fig. 7b). In traditional downtown areas like the old Jing'an District and Huangpu District, available land for construction has become difficult to find, which results in the scarcity of
Fig. 6. The distribution of (a) GWR performance and (b) predicted housing prices.

commercial houses. Besides, many high-rise office buildings cluster in these areas, which attracts a large number of high-income groups to live and work there. Finally, some historical and cultural factors also attract a lot of investment and make these areas overpopulated. Therefore, under this type of supply-demand relation, a positive effect of population density on housing prices is shown in these areas.

Fig. 7. The effects of (a) area, (b) population density, (c) distance and (d) male female ratio.

Although land in Xuhui District and Changning District is relatively ample, and population density there is not so high, housing prices there are still not lower than those in areas like old Jing'an District and Huangpu District, which results in a negative effect of population density on housing prices. One of the main reasons for the high housing prices in areas like Xuhui District and Changning District is the high educational and cultural level there. The housing demands of high-income groups also contribute to the high housing prices to some extent.

A negative effect of population density is shown in Yangpu District and Hongkou District because of the high proportion of old public houses and traditional Chinese dwellings there. While many workers in traditional industries cluster in these areas, their demand for high-priced houses is relatively low. Hence, housing prices are not so high in these areas.

When it comes to areas like Minhang District and Baoshan District, a positive effect of population density is shown because of both the low population density and the low housing prices. While plenty of land can be used for construction, housing prices there are relatively low. Even though many people move to areas like Minhang District and Baoshan District every year, the large size of these areas keeps the population density at a low level.

3.4.3. The effect of the distance to downtown areas

The distance to downtown areas has less effect on housing prices as the distance increases, and the effect almost drops to zero in suburban areas. This is because in downtown areas, the cultural and athletic facilities are well established, medical and health organizations are clustered, and amusement functions are well developed. As life there is very convenient and the demand for housing is large, housing prices are relatively high. However, the suburban areas in Shanghai are relatively independent. Residents in these areas can meet their needs through the suburban area's own facilities, which have no direct relation to the distance to downtown areas.

3.4.4. The male-female ratio

While male-female ratio has a positive effect on housing prices in many areas in Shanghai, it has a negative effect in areas like old Jing'an District. This is because a large number of high-income white-collar women tend to work in the high-rise office buildings in these areas, and the female population even surpasses the male population there. At the same time, their demand for houses also pushes up housing prices, which results in a negative effect of male-female ratio on housing prices.

4. Discussion and conclusion

This article examines the dynamics of housing prices in Shanghai based on 12,732 pieces of housing data collected in Shanghai in early May 2016, with the help of GIS. The data has house attributes including location, age, area, floor, orientation, the level of decoration and price, and covers all the districts of Shanghai except Jinshan and . Our study shows that the spatial distribution of high or low housing prices is so spatially clustered that it cannot be deemed the result of random processes, based on the value of Moran's I at 0.916. The Getis-Ord Gi* statistic also shows that the hot spots are spatially concentrated. In the global specification, OLS analysis with an adjusted R-squared of 0.7 has detected the relation between the attributes of the houses and the housing prices well, as expected. The GWR model with an adjusted R-squared of 0.826 has greatly improved the estimation of housing prices; however, the local adjusted R-squared, which ranges from 0.633 to 0.884, shows the heterogeneity of the relationship. The housing prices in downtown areas can be explained better than those in suburban areas. The result indicates that the effects of the attributes differ over space. The coefficients of total floor area and distance to the downtown area are distributed more regularly than those of population density and male-female ratio.

Since studies on the micro structure of housing prices in Shanghai are relatively limited, our spatial and hedonic analysis of housing prices in Shanghai is significant for understanding its heterogeneity and complexity. We agree that both the physical attributes and the geographical attributes of these houses contribute to the final housing prices (Widlak, Waszczuk, & Olszewski, 2015). Among all the structural attributes, the area and the age of the house affect housing prices most, especially in the downtown areas. Meanwhile, the effect of education quality is also worthy of attention. More high-quality schools should be established, and educational resources can be more equalized to ease the tension. Furthermore, it is also necessary to control the population density to meet the different needs of people from different classes.

In the face of spatial autocorrelation in hedonic models, different models have been used previously, such as estimated generalized least squares (Basu & Thibodeau, 1998) and the spatial lag model (Cohen & Coughlin, 2008). In this article, geographically weighted regression is used to deal with the spatial autocorrelation. By calculating geographical entities using regression equations and constructing local model equations, GWR can prevent the observed weakness of fixed effects in OLS. The adjusted R-Squared of the GWR model is 0.826, which demonstrates that GWR is effective in exploring the spatial non-stationarity of the housing price distribution.

Since the development and opening of Pudong District in 1992, Shanghai has become one of the world's global cities. Spatial and hedonic analysis of housing prices in Shanghai has important implications for policies on land use and urban planning in Shanghai. The downtown areas have the highest housing prices because they share the best public services. The analysis helps government administration better understand the spatial pattern and determining factors of housing prices in Shanghai, so that relevant policies on land use, industry planning, and public services configuration can be set to promote the healthy, harmonious, and sustainable development of economy and society.

However, the GWR model also has its limitations and is not appropriate for every kind of data. Some variables are held constant across the study area and others vary spatially. In that case, mixed or semi-parametric models can be applied to analyze these variables. Some variables may not be quite Poisson distributed, and a Negative Binomial model will be more useful (Charlton, Fotheringham, & Brunsdon, 2009). Considering the limitations and advantages of the methods can improve the estimation of housing prices in terms of their relations to more variables. Currently, many property agencies control different housing information in Shanghai, and most of them have websites that provide housing information for advertisement. It would be better to integrate all these data to take advantage of big data and identify more meaningful results on the spatial distribution and determining factors of housing prices in the near future.

Acknowledgement

We sincerely thank the editor and anonymous reviewers for constructive comments on the manuscript. We gratefully acknowledge the financial support from the National Natural Science Foundation of China (41401094) and the program of the Key Laboratory of the Coastal Zone Exploitation and Protection, the Ministry of Land and Resource of China (2015CZEPK05).

References

Anselin, L., & Bera, A. K. (1998). Spatial dependence in linear regression models with an introduction to spatial econometrics. Statistics Textbooks and Monographs, 155, 237–290.
Basu, S., & Thibodeau, T. G. (1998). Analysis of spatial autocorrelation in house prices. The Journal of Real Estate Finance and Economics, 17(1), 61–85.
Baumont, C., Ertur, C., & Gallo, J. (2004). Spatial analysis of employment and population density: The case of the agglomeration of Dijon 1999. Geographical
Analysis, 36(2), 146–176.
Belsky, E., Can, A., & Megbolugbe, I. (1998). A primer on geographic information systems in mortgage finance. Journal of Housing Research, 9(1), 5–31.
Bowen, W. M., Mikelbank, B. A., & Prestegaard, D. M. (2001). Theoretical and empirical considerations regarding space in hedonic housing price model applications. Growth and Change, 32(4), 466–490.
Can, A. (1990). The measurement of neighborhood dynamics in urban house prices. Economic Geography, 66(3), 254–272.
Charlton, M., Fotheringham, S., & Brunsdon, C. (2009). Geographically weighted regression. White paper. National Centre for Geocomputation. National University of Ireland Maynooth.
Chau, K. W., & Chin, T. L. (2002). A critical review of literature on the hedonic price model. International Journal for Housing Science and Its Applications, 27(2), 145–165.
Cho, S., Lambert, D. M., Kim, S. G., & Jung, S. (2009). Extreme coefficients in geographically weighted regression and their effects on mapping. GIScience & Remote Sensing, 46(3), 273–288.
Cleveland, W. S., & Devlin, S. J. (1988). Locally weighted regression: An approach to regression analysis by local fitting. Journal of the American Statistical Association, 83(403), 596–610.
Cohen, J. P., & Coughlin, C. C. (2008). Spatial hedonic models of airport noise, proximity, and housing prices. Journal of Regional Science, 48(5), 859–878.
Craven, B. D., & Islam, S. M. (2011). Ordinary least-squares regression. In L. Moutinho, & G. D. Hutcheson (Eds.), The SAGE dictionary of quantitative management research (pp. 224–228). Bangalore, India: SAGE Publications.
De Graaff, T., Florax, R. J., Nijkamp, P., & Reggiani, A. (2001). A general misspecification test for spatial regression models: Dependence, heterogeneity, and nonlinearity. Journal of Regional Science, 41(2), 255–276.
Des Rosiers, F., Theriault, M., & Villeneuve, P. Y. (2000). Sorting out access and neighbourhood factors in hedonic price modelling. Journal of Property Investment & Finance, 18(3), 291–315.
Din, A., Hoesli, M., & Bender, A. (2001). Environmental variables and real estate prices. Urban Studies, 38(11), 1989–2000.
Dubin, R. A. (1998). Spatial autocorrelation: A primer. Journal of Housing Economics, 7(4), 304–327.
Dunse, N., Jones, C., Orr, A., & Tarbet, H. (1998). The extent and limitations of local commercial property market data. Journal of Property Valuation and Investment, 16(5), 455–473.
Fotheringham, A. S., Brunsdon, C., & Charlton, M. (2003). Geographically weighted regression: The analysis of spatially varying relationships. Sussex, England: John Wiley & Sons Ltd.
Gao, J. B., & Li, S. C. (2011). Detecting spatially non-stationary and scale-dependent relationships between urban landscape fragmentation and related factors using geographically weighted regression. Applied Geography, 31, 292–302.
Geary, R. C. (1954). The contiguity ratio and statistical mapping. The Incorporated Statistician, 5(3), 115–146.
Gillen, K., Thibodeau, T., & Wachter, S. (2001). Anisotropic autocorrelation in house prices. The Journal of Real Estate Finance and Economics, 23(1), 5–30.
Gourieroux, C., & Laferrere, A. (2009). Managing hedonic housing price indexes: The French experience. Journal of Housing Economics, 18(3), 206–213.
Gu, W. (2005). The comparison of housing prices in Shanghai and America. Price: Theory & Practice, (01), 50–51 (in Chinese).
Helbich, M., Brunauer, W., Vaz, E., & Nijkamp, P. (2013). Spatial heterogeneity in hedonic house price models: The case of Austria. Urban Studies, 51(2), 390–411.
Hou, Y. (2010). Housing price bubbles in Beijing and Shanghai? A multi-indicator analysis. International Journal of Housing Markets and Analysis, 3(1), 17–37.
Hui, E. C., Zhong, J. W., & Yu, K. H. (2012). The impact of landscape views and storey levels on property prices. Landscape and Urban Planning, 105(1), 86–93.
Lancaster, K. J. (1966). A new approach to consumer theory. Journal of Political Economy, 74(2), 132–157.
Liao, W. C., & Wang, X. (2012). Hedonic house prices and spatial quantile regression. Journal of Housing Economics, 21(1), 16–27.
Li, H., & Ge, C. (2008). Inflation and housing market in Shanghai. Property Management, 26(4), 273–288.
Luan, G., Zhou, W., & Yi, W. (2012). Research of residents' housing affordability based on the housing price to income ratio-take the middle-income residents in Shanghai as an example. China Opening Journal, 02, 39–43 (in Chinese).
Miron, J. (1984). Spatial autocorrelation in regression analysis: A beginner's guide. In G. L. Gaile, & C. J. Willmott (Eds.), Spatial statistics and models (pp. 201–222). Boston, USA: D. Reidel.
Montgomery, M., & Curtis, C. (2006). Housing mobility and location choice: A review of the literature. Impacts of transit led development in a new rail Corridor (2006). Working paper #2. http://urbanet.curtin.edu.au/local/pdf/ARC_TOD_Working_Paper_2.pdf.
Moran, P. A. (1950). Notes on continuous stochastic phenomena. Biometrika, 37(1/2), 17–23.
National Bureau of Statistics of China. (2003). The national real estate development market prosperity status reports in 2003. http://www.stats.gov.cn/ztjc/ztfx/fxbg/200403/t20040312_14529.html (in Chinese).
Opoku, R. A., & Abdul-Muhmin, A. G. (2010). Housing preferences and attribute importance among low-income consumers in Saudi Arabia. Habitat International, 34(2), 219–227.
Ord, J. K., & Getis, A. (1995). Local spatial autocorrelation statistics: Distributional issues and an application. Geographical Analysis, 27(4), 286–306.
Osland, L. (2010). An application of spatial econometrics in relation to hedonic house price modeling. Journal of Real Estate Research, 32(3), 289–320.
Özyurt, S. (2014). Spatial dependence in commercial property prices: Micro evidence from The Netherlands. Working Paper Series #1627. European Central Bank.
Paez, A., Long, F., & Farber, S. (2008). Moving window approaches for hedonic price estimation: An empirical comparison of modelling techniques. Urban Studies, 45(8), 1565–1581.
Rosen, S. (1974). Hedonic prices and implicit markets: Product differentiation in pure competition. Journal of Political Economy, 82(1), 34–55.
Sander, H. A., & Polasky, S. (2009). The value of views and open space: Estimates from a hedonic pricing model for Ramsey County, Minnesota, USA. Land Use Policy, 26(3), 837–845.
Shanghai Bureau of Statistics. (2015). Shanghai statistics yearbook 2015. Beijing: China Statistics Press.
Shimizu, C., Takatsuji, H., Ono, H., & Nishimura, K. G. (2010). Structural and temporal changes in the housing market and hedonic housing price indices: A case of the previously owned condominium market in the Tokyo metropolitan area. International Journal of Housing Markets and Analysis, 3(4), 351–368.
Thrall, G. I. (1998). GIS applications in real estate and related industries. Journal of Housing Research, 9(1), 33–59.
Tobler, W. R. (1970). A computer movie simulating urban growth in the Detroit region. Economic Geography, 46(sup1), 234–240.
Tu, Y., Yu, S. M., & Sun, H. (2004). Transaction-based office price indexes: A spatiotemporal modeling approach. Real Estate Economics, 32(2), 297–328.
Wang, C., Du, S., Wen, J., Zhang, M., Gu, H., Shi, Y., et al. (2016). Analyzing explanatory factors of urban pluvial floods in Shanghai using geographically weighted regression. Stochastic Environmental Research and Risk Assessment, 1–14. http://dx.doi.org/10.1007/s00477-016-1242-6.
Wang, Y., & Jiang, Y. (2016). An empirical analysis of factors affecting the housing price in Shanghai. Asian Journal of Economic Modelling, 4(2), 104–111.
Wen, H., Xiao, Y., & Zhang, L. (2017). Spatial effect of river landscape on housing price: An empirical study on the Grand Canal in Hangzhou, China. Habitat International, 63, 34–44.
Wheeler, D. C. (2014). Geographically weighted regression. In Handbook of regional science. Berlin Heidelberg, Germany: Springer.
Wheeler, D., & Tiefelsdorf, M. (2005). Multicollinearity and correlation among local regression coefficients in geographically weighted regression. Journal of Geographical Systems, 7(2), 161–187.
Widlak, M., Waszczuk, J., & Olszewski, K. (2015). Spatial and hedonic analysis of house price dynamics in Warsaw. NBP Working Paper #197. http://econpapers.repec.org/paper/nbpnbpmis/197.htm.
Wilhelmsson, M. (2002). Spatial models in real estate economics. Housing, Theory and Society, 19(2), 92–101.
Zheng, S., Kahn, M. E., & Liu, H. (2010). Towards a system of open cities in China: Home prices, FDI flows and air quality in 35 major cities. Regional Science and Urban Economics, 40(1), 1–10.
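
The local estimation behind the GWR results in Section 3.4 can be illustrated with a minimal sketch. It is not the GIS implementation used in the study: the function names (gaussian_weights, gwr_fit, demo) are hypothetical, the data are synthetic, and the bandwidth is set by hand rather than selected with the Akaike Information Criterion. The sketch fits one weighted least-squares regression per observation, with Gaussian weights that decay with distance from that observation, so each location receives its own coefficients.

```python
import numpy as np


def gaussian_weights(distances, bandwidth):
    """Gaussian kernel: weights decay smoothly with distance from the regression point."""
    return np.exp(-0.5 * (distances / bandwidth) ** 2)


def gwr_fit(coords, X, y, bandwidth):
    """Fit one weighted least-squares regression per observation.

    coords: (n, 2) locations; X: (n, k) explanatory variables; y: (n,) response.
    Returns an (n, k + 1) array of local coefficients, intercept first.
    """
    n = len(y)
    design = np.column_stack([np.ones(n), X])  # prepend the intercept column
    betas = np.empty((n, design.shape[1]))
    for i in range(n):
        d = np.linalg.norm(coords - coords[i], axis=1)
        w = gaussian_weights(d, bandwidth)
        xtw = design.T * w  # X^T W without building the n x n weight matrix
        betas[i] = np.linalg.solve(xtw @ design, xtw @ y)
    return betas


def demo():
    """Synthetic example (assumed data, not the Shanghai sample)."""
    rng = np.random.default_rng(0)
    n = 400
    coords = rng.uniform(0.0, 10.0, size=(n, 2))
    area = rng.uniform(0.0, 1.0, size=n)
    # Assumption for illustration: the effect of area weakens from west to east.
    true_slope = 2.0 - 0.1 * coords[:, 0]
    y = 1.0 + true_slope * area + rng.normal(0.0, 0.05, size=n)
    betas = gwr_fit(coords, area[:, None], y, bandwidth=1.5)
    return coords, area, y, true_slope, betas
```

With a very large bandwidth all weights approach one and every local fit collapses to the global OLS solution, which is one way to see GWR as a spatially varying generalisation of the hedonic OLS model. A small bandwidth gives more local detail but noisier coefficients, which is why the bandwidth is normally chosen by a criterion such as AIC rather than by hand as here.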