Influence of urban green vegetation fraction on the urban heat island effect across Europe

Arjan Droste

Supervised by: Reinder Ronda and Natalie Theeuwes

Meteorology and Air Quality Group, Wageningen University

June 2015

Contents

Abstract
Introduction
Data and methodology
  General data and methodology
    ECMWF data
    Urban Heat Island calculation
    Green fraction and Local Climate Zones
    Statistics
  Rotterdam
    Temperature data and analyses
    Green fraction and Local Climate Zones
  Madrid
    Urban and rural data
    Green cover
  Oslo
    Urban and rural data
    Green cover
Results
  Data validation
    Validation of the Wunderground data
    Validation of the ECMWF data
  UHI vs vegetation
    NDVI and GVF
    Green cover estimate
  Seasonality of the Urban Heat Island
    Rotterdam
    Madrid
    Oslo
Discussion
  Rotterdam
  Madrid
  Oslo
Conclusions
Outlook and recommendations
Acknowledgments
References
Appendices

Appendix A: table and map of the Rotterdam stations
Appendix B: table and map of the Madrid stations
Appendix C: table and map of the Oslo stations
Appendix D: table of measurement accuracy of weather stations
Appendix E: histogram of the difference between Wunderground and CPC data, Rotterdam
Appendix F: hysteresis of Tryvannshøgda

Abstract

The difference in temperature between urban areas and their rural surroundings is known as the Urban Heat Island (UHI). In the light of climate change and ongoing world-wide urbanisation, it is important to take measures to reduce the UHI. Studies have shown that an increased vegetation fraction in cities correlates with a reduction in mean UHI values. Though this relation has been researched for single cities and countries, it is not yet known whether it is universal across different climate zones. This study investigates whether measurements from hobby meteorologists, combined with ECMWF model data, can be used to calculate the UHI, which is a new approach. We research the influence of the vegetation fraction of neighbourhoods on the maximum UHI, in order to establish a statistical relationship between vegetation and UHI reduction. This analysis is performed for several European cities in varying climates (Rotterdam, the Netherlands; Oslo, Norway; Madrid, Spain) to determine the role of climate zones in this relationship. Our results establish that using the ECMWF operational model data as a measure of rural background temperature is a valid new way of calculating UHI. For Rotterdam we find a significant (R² of 72%) relationship between vegetation and maximum UHI reduction, but we are unable to reproduce a significant relation for the cities of Madrid and Oslo. Water availability for the rural surroundings, the anthropogenic water flux and the regional climate appear to play a large role in the way vegetation affects the UHI.

Introduction

From 2008 onwards, over half of the world’s population has been living in cities, and projections show urbanisation to increase even further (United Nations, 2011). This rapid urbanisation, combined with the issue of climate change, can bring various health issues related to the Urban Heat Island (UHI) effect. This is the phenomenon of temperatures being several degrees higher in cities than in the rural surroundings. The elevated urban temperature can lead to additional heat stress or heat-related illness during hot days. The urban heat island is caused by enhanced radiative energy storage by the built-up urban area during the day, which is subsequently released during the night. The resulting elevated nocturnal temperatures can lead to thermal discomfort: when people cannot cool down at night, sleep deprivation and other health issues can follow (Bell, 1982). During the 2003 summer heat wave in Western Europe over 70,000 people lost their lives due to heat-related stress (Poumadère et al., 2005). Groups at risk include those suffering from cardiovascular disease, pregnant women, children and the elderly (Kovats and Hajat, 2008; Reid et al., 2009). In the light of climate change, which will increase extreme temperatures further, it is important to mitigate the urban heat island. Investigating potential measures to mitigate the UHI is therefore of great interest to society, especially in strongly urbanised areas such as Europe, North America and parts of East Asia.

One way of reducing urban temperatures is by using urban vegetation (parks, grass along roads, etc.). Vegetation has a much lower heat storage capacity than stone, concrete and other building materials, which means it cannot capture as much heat during the day to release at night. Vegetation affects the surface energy balance as well: it evaporates water during photosynthesis, which increases the amount of energy going to latent heat instead of sensible heat, the major component over built-up areas (Oke, 1982). Trees in urban parks provide additional shading, decreasing the amount of radiation reaching the surface and lowering air and soil temperatures (Lin and Lin, 2010). The ground heat flux in urban areas is much higher than in rural surroundings, since roads absorb high amounts of radiative energy (Oke, 1982). Compared to vegetated surfaces, roads have a very high thermal capacity and generally a low albedo (black asphalt roads), causing efficient radiative energy storage during the day and subsequent release at night. The cooling effect of urban parks and the effect of trees on surface energy have been discussed in many studies (Oke, 1982; Spronken-Smith and Oke, 1999; Lin and Lin, 2010; Petralli et al., 2014) and have been observed during mobile measurement campaigns in Rotterdam (Heusinkveld et al., 2014).

Steeneveld et al. (2011) found a robust relation between urban greenness and reduction of the urban heat island for various cities in the Netherlands. The 95th percentile of the UHI decreases by ~0.6 °C for every 10% vegetated surface in the area around the measurement site (confirmed in Heusinkveld et al., 2014). Whether such a relation exists for other countries, and whether it is universal or dependent on climate zones and city structure, is as yet unexplored. Weng et al. (2004) used remote sensing data to explore the relationship between green fraction and land surface temperature, finding a correlation between the spatial variations of the two. Zhao et al. (2014) compared UHI intensities in various cities across the United States, and found that night-time UHI does not correlate well with differences in regional climate zones, but that daytime UHI correlates with precipitation. They find that convective heat dissipation (i.e. the efficient release of heat into the atmosphere), rather than cooling by evaporation, has the largest impact on UHI differences. This would seem to suggest that a relation such as in Steeneveld et al. (2011) would not hold when comparing several climate zones. How exactly the influence of vegetation on the UHI changes with (local) climate zones, and which factors influence this relation, has not been researched yet. Doing this research is key to adaptive urban planning and ultimately to weather forecasts at street level: predicting where in a city the heat island is higher and where it is lower can save lives during heat waves. The following research questions will be addressed:

• By how much does the urban vegetation fraction reduce the Urban Heat Island in climate zones across Europe?
• How do local and regional climate differences influence the relationship between urban vegetation fraction and UHI reduction?

We investigate the influence of vegetation on the UHI in 3 distinctly different cities across Europe (Fig. 1): Rotterdam (the Netherlands), Madrid (Spain) and Oslo (Norway). Rotterdam is situated close to the Dutch North Sea coast, with a temperate maritime climate (Köppen class Cfb) and prevailing south-westerly winds. Madrid is situated inland, with an average altitude of 660 metres above sea level, and a dry Mediterranean climate (Köppen class Csa). Oslo lies in the Oslofjord in the southern part of Norway, with a steep topographical gradient from 0 metres at the harbour to 300 metres close to the northern city border, and a humid continental climate (Köppen class Dfb).

Figure 1: Map of Europe with the 3 cities circled. North: Oslo; middle: Rotterdam; south: Madrid.

We hypothesise that UHI reduction in drier climates will be lower, since the vegetation would not be evaporating at its optimal level (Priestley and Taylor, 1972) due to moisture deficit. Vegetation in temperate, humid climates will have enough moisture available to evaporate at (or close to) optimal values. Below-optimal evaporation means less energy is partitioned to latent heat and thus sensible heat will go up (i.e. the Bowen ratio will go up). The way vegetation in drier, warmer climates copes with the moisture deficit would therefore influence its efficiency in UHI reduction. As a result, our hypothesis is that urban vegetation in hot, dry climates will have a smaller effect on the surrounding neighbourhood. Similarly, we expect the largest UHI reduction effect to occur in the temperate zones where vegetation is not limited by moisture availability. Furthermore, in cities with a lot of open space (Oslo) we expect the UHI intensity to be lower than in compact cities (e.g. the centre of Madrid). Radiation is more easily trapped and stored in cities with a high urban canopy and a high building height to street width ratio (H/W: Oke, 1982), up to a certain optimal threshold after which shadowing effects limit the amount of radiation penetrating the canopy (Theeuwes et al., 2014).

Data and methodology

In this section we give an overview of the various sources of data we use for this study, as well as the methods behind our analysis. For each of the 3 cities we generally apply the same methods for quantifying the maximum (95th percentile) UHI and the vegetation fraction, though the calculation of the vegetation fraction in particular has to be adapted to the typical vegetation of each city.

General data and methodology

Data in urban areas, where often no conventional weather stations are placed, are obtained from hobby meteorologists who have their own personal weather station. The data from these stations are uploaded to the Weather Underground website (http://www.wunderground.com), which stores and shows meteorological data from these hobby meteorologists. These data are freely available for download, and provide an interesting new platform for urban weather analysis. Because there are no tight regulations for these hobbyist stations like there are for stations operated through the World Meteorological Organisation (WMO), we have applied several restrictions to ensure we use stations of appropriate quality (Oke, 2006, published some guidelines for obtaining representative data from urban stations). Selection criteria are as follows:

• A sufficient length of the dataset. A station is only suitable for contributing to this study if the available record is over 12 months long (in the timeframe of 1 January 2010 up to 1 October 2014, the start of this study), and preferably includes 2 summer (JJA) periods. This is to avoid a bias in UHI values due to a very hot or cold period in one year, which could create misleading statistics of e.g. the 95th percentile.
• As few gaps as possible, to ensure continuity of the data record. Stations that have several large gaps (>10% missing hours) are excluded from analysis.
• Only stations with enough metadata have been chosen. This includes elevation, type of hardware, and accurate information of the location (at least 3 decimal degrees latitude/longitude). Elevation information is required to correct for temperature differences due to the air temperature decreasing with height (adiabatic lapse rate). Hardware type is required to know the errors made in measuring, as some brands are more accurate than others. The table in Appendix D provides a list of the measurement accuracy of the most commonly used hardware in our analysis. Accurate location information is needed to correctly determine the vegetation fraction around the stations and place each station in the correct neighbourhood.
• Preferably, a selection of stations will be made from different neighbourhoods, to capture a satisfactory variation in green cover, and to gain a representative view of the UHI of the city as a whole, without overrepresentation of one neighbourhood.
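The first three selection criteria can be expressed as a simple screening function. The sketch below is only an illustration: the `Station` record and its field names are hypothetical, and the way decimal places are counted from coordinate strings is an assumption, not the procedure actually used in this study.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Station:
    """Hypothetical summary of one hobbyist station (field names are illustrative)."""
    name: str
    record_months: float          # length of the available record, in months
    missing_fraction: float       # fraction of missing hours, between 0 and 1
    elevation_m: Optional[float]  # None when the elevation is not reported
    hardware: Optional[str]       # None when the hardware type is not reported
    lat: Optional[str]            # kept as text so decimal places can be counted
    lon: Optional[str]


def decimals(coord: Optional[str]) -> int:
    """Number of digits after the decimal point of a coordinate given as text."""
    if coord is None or "." not in coord:
        return 0
    return len(coord.split(".")[1])


def is_suitable(s: Station) -> bool:
    """Apply the record-length, gap and metadata criteria listed above."""
    long_enough = s.record_months > 12
    few_gaps = s.missing_fraction <= 0.10
    has_metadata = (
        s.elevation_m is not None
        and s.hardware is not None
        and decimals(s.lat) >= 3
        and decimals(s.lon) >= 3
    )
    return long_enough and few_gaps and has_metadata
```

The fourth criterion (spreading stations over neighbourhoods) is a judgement about the set of stations as a whole and is therefore not part of this per-station check.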

We have applied several filters on the data of the stations that meet the aforementioned selection criteria, to remove incorrect data that might skew the analysis. Observations with precipitation, high relative humidity (>99%) or unrealistic spikes of air temperature have thus been removed. Often the stations are not radiation shielded, which could cause the temperature sensor to get wet. Wet instruments often give unrealistic temperature values, and could therefore cause erroneously high or low UHI values. Not every Wunderground station has reliable precipitation measurements, so we filtered out hours of rain based on the observations from the WMO stations in each respective city.
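As an illustration of these filters, the following minimal sketch drops observations taken during rain hours, at very high relative humidity, or with an unrealistic temperature jump. The record layout and the spike threshold `max_jump` are assumptions made for this example; the text above only fixes the relative humidity limit of 99%.

```python
def filter_observations(obs, rain_hours, max_rh=99.0, max_jump=5.0):
    """Return the observations that pass the quality filters.

    obs        : list of dicts with keys 'hour', 'temp' (deg C) and 'rh' (%),
                 sorted in time (hypothetical layout)
    rain_hours : set of hours in which the WMO station reported precipitation
    max_rh     : relative humidity above which an observation is rejected
    max_jump   : assumed spike threshold (K) relative to the last accepted value
    """
    kept = []
    previous_temp = None
    for o in obs:
        if o["hour"] in rain_hours:
            continue  # rain reported by the WMO station
        if o["rh"] > max_rh:
            continue  # sensor is probably wet
        if previous_temp is not None and abs(o["temp"] - previous_temp) > max_jump:
            continue  # unrealistic temperature spike
        kept.append(o)
        previous_temp = o["temp"]
    return kept
```

Comparing each value with the last accepted temperature, rather than the last raw one, keeps a single spike from also removing the valid observation that follows it.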

The variables of interest are air temperature; relative humidity; wind speed and direction; rain rate; air pressure. Air pressure is not used in the analyses, but functions as an extra quality check of the data: when air pressure is unphysically low or high, the rest of the data at that time step is likely to be erroneous as well.

Ideally, all the stations would have the exact same measuring period, to facilitate comparing the differences in temperature between them. However, since we make use of a network of weather hobbyists rather than a strictly controlled scientific setup, this is often not the case. There are a few stations that measure continuously during the timeframe of interest (January 2010 up to September 2014), but most stations stop recording before 2014, or start measuring after 2010. The exact start and end of the record of each station can be found in Appendices A to C. Due to the differences in measurement periods, the absolute temperature values cannot be directly compared using statistical tests. However, since the data records all span at least 1 year, we can analyse the difference in the descriptive statistics (95th percentile, mean, median) between stations.

A difficult problem in urban measurement setups is the height at which the sensors are installed. Sensors on top of buildings typically measure above the urban canopy layer, where the air is well-mixed. In this study the focus lies on the heterogeneous urban canopy layer, which is heavily impacted by the surface properties (Oke, 1982). Steeneveld et al. (2011) report that their statistical relationship between maximum UHI and green fraction severely deteriorates when including stations that measure above the urban canopy layer. We will perform separate analyses that consider the measurement height of the stations. The Wunderground website does not provide detailed information about the sensor height above ground level, but fortunately several weather hobbyists have their own website with extensive metadata about the measurement setup. We have used that information to estimate whether stations are located on the roofs or in the urban canopy itself. In most cases where this information was missing, we have used Google Street View™ to determine the most likely location of the measurement station in the urban structure.

ECMWF data

An important issue in UHI calculations is the choice of rural stations. Stewart (2011) found in a review study that many studies fail to report the metadata for their rural stations, and Sakakibara and Owa (2005) mention that the choice of the rural station is not only dependent on the distance to the city or absence of built-up area, but also on distance to coast and soil type. Selecting an unsuitable rural station to accompany the urban measurements can therefore lead to large variations in UHI. Rural measurement stations often feel an urban effect if they are downwind from their closest city, which leads to apparently lower UHI values (see Heusinkveld et al., 2014). WMO stations close to large cities are often located at airports (e.g. station Zestienhoven in Rotterdam, the Netherlands, or the Madrid Barajas station in Spain), which cannot be fully seen as rural stations. The presence of large impervious surfaces and the proximity to the city may lead to influences from asphalt heating and heat advection.

An as yet unexplored option is to use weather model data as an approximation for the rural data. The model data can be selected in a grid cell near the city of interest, upwind of the urban area to circumvent the problem of urban boundary layer air advection. Since the ECMWF data is available globally, this offers an opportunity to devise a universal UHI calculation strategy. If successful, this approach could be used by, for example, Non-Governmental Organisations (NGOs) for risk mapping in megacities. We research the use of ECMWF (European Centre for Medium-Range Weather Forecasts) model data to represent the rural conditions, to calculate UHI without the need to set up a rural reference station. After initially unsatisfying results from using ERA-Interim reanalysis data, we have chosen to use data from the operational model, which provides output with a spatial resolution of 0.15° (latitude and longitude) and a 6-hour temporal resolution (00, 06, 12 and 18 UTC). The variables used are the 2-metre air temperature, dew point temperature, soil temperature (at a depth of 8 cm), and the zonal and meridional components of the wind (u and v). For each city, several grid points have been chosen around the built-up area, and the most representative grid point was chosen to calculate the UHI with (grid points are visualised by purple markers in the maps of Appendices A to C). In addition, information about the surface geopotential height, vegetation classes and land cover was obtained, to calculate an average surface elevation per grid box, and to explain any differences in 2-metre temperature that could arise due to different vegetative cover. Results for the verification of both the Wunderground and ECMWF data are found in chapter 3.
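To judge whether a grid point lies upwind of the city, the wind direction can be derived from the u and v components. The sketch below uses the standard meteorological convention (the direction the wind blows from, measured clockwise from north). The helper `is_upwind` and its tolerance of 45° are assumptions for illustration only; they do not describe the selection procedure applied in this study.

```python
import math


def wind_direction_deg(u: float, v: float) -> float:
    """Direction (degrees) the wind blows FROM: 0 = north, 90 = east, 270 = west."""
    return math.degrees(math.atan2(-u, -v)) % 360.0


def angular_difference(a: float, b: float) -> float:
    """Smallest absolute difference between two compass angles, in degrees."""
    return abs((a - b + 180.0) % 360.0 - 180.0)


def is_upwind(grid_bearing_deg: float, wind_from_deg: float,
              tolerance_deg: float = 45.0) -> bool:
    """True if a grid point, seen from the city centre at grid_bearing_deg,
    lies within tolerance_deg of the direction the wind comes from.
    The tolerance is an assumed value."""
    return angular_difference(grid_bearing_deg, wind_from_deg) <= tolerance_deg
```

With south-westerly winds (u > 0, v > 0) a grid point to the west of the city would be accepted as upwind under this tolerance, while a point to the east would not.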

Urban Heat Island calculation

There are various definitions of the Urban Heat Island, all based on the concept of a temperature difference between the urban and rural area (Oke, 1982; Stewart, 2011; Stewart and Oke, 2012). This difference can be expressed in various ways: using maximum daily air temperatures, hourly averaged air temperatures, temperatures at the exact same time step, minimum daily temperatures, etc. (e.g. Steeneveld et al., 2011). In this study we focus on the maximum daily UHI: the value that represents the maximum hourly UHI for a given day. Ordinarily, to calculate this maximum UHI one would first calculate the hourly UHI values, and then select the largest value. However, since the ECMWF data has a six-hourly temporal resolution, calculating the maximum UHI from only 4 data points would create large uncertainty. From theory we know the maximum UHI most often occurs during night-time, after sunset. Kim and Baik (2002) report that the maximum daily UHI is 3.3 times more likely to occur during the night than during the day. Therefore, we focus on the nocturnal UHI at 00 UTC, which we take to be near the maximum UHI. For every city under consideration, 00 UTC is indeed after sunset and before sunrise, even during the summer solstice in Oslo.

While the ECMWF data is limited by its low temporal resolution, the Wunderground data is not, and we are able to calculate hourly UHI values by calculating UHI with the measurements of the WMO reference stations. In order to correctly establish the performance of the ECMWF data, we will compare the 00 UTC ECMWF UHI values to the 00 UTC Reference station UHI values, and not calculate separate maximum UHI using the maximum hourly UHI. To deal with the highly heterogeneous temporal resolution of the Wunderground stations, we define the UHI as the difference in the hourly averaged air temperature between the rural and urban stations:

Equation 1: $UHI_{max}(0\,\mathrm{UTC}) = T_{air,avg}^{Urban}(0\,\mathrm{UTC}) - T_{air,avg}^{Rural}(0\,\mathrm{UTC})$

The averages were taken over the values 30 minutes before and 30 minutes after the whole hour: e.g. the UHI at 00:00 UTC is based on all the measurements taken between 23:30 and 00:30 UTC. These UHI_max values are used in the statistical regression with the green fraction: for each station, the 95th percentile of the UHI_max is plotted against the vegetation fraction of that particular station.
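A minimal sketch of this calculation is given below. The data layout (lists of time and temperature pairs) and the nearest-rank definition of the 95th percentile are assumptions made for the example; the text does not state which percentile definition was used.

```python
from datetime import datetime, timedelta
from statistics import mean


def hourly_average(obs, hour_centre, half_window=timedelta(minutes=30)):
    """Mean temperature of all (time, temperature) pairs within 30 minutes
    before or after hour_centre; None if there are no observations."""
    temps = [temp for when, temp in obs if abs(when - hour_centre) <= half_window]
    return mean(temps) if temps else None


def uhi_at(urban_obs, rural_temp, hour_centre):
    """Equation 1: urban hourly average minus the rural temperature at that hour."""
    t_urban = hourly_average(urban_obs, hour_centre)
    return None if t_urban is None else t_urban - rural_temp


def percentile_95(values):
    """95th percentile using the nearest-rank method (an assumed definition)."""
    ordered = sorted(values)
    rank = (95 * len(ordered) + 99) // 100  # ceil(0.95 * n) in integer arithmetic
    return ordered[rank - 1]
```

For a station, one would collect `uhi_at(...)` for every night at 00 UTC and pass the resulting list to `percentile_95` to obtain the value that enters the regression against the green fraction.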

Green fraction and Local Climate Zones

We have used several methods to estimate a green vegetation fraction for the stations of all cities. These methods vary between the 3 cities due to the different characteristics of the vegetation in each climate, and the methods used are explained separately for each city. The calculated green fractions are plotted against the 95th percentile of the maximum UHI, and subsequently a regression analysis is performed to infer a statistical relationship between the two.

Aside from determining the vegetation fraction for each site, we have also assigned each site its appropriate Local Climate Zone (LCZ), following the framework of Stewart and Oke (2012). They developed this framework to bring more clarity to the definition of what exactly is considered to be ‘urban’ and what ‘rural’, as well as to improve the metadata about the sites used. Their classification is based on surface properties, such as the vegetative fraction, impervious fraction and the albedo, as well as building properties such as sky view factor, mean building height and the building surface fraction. Based on these zone properties they distinguish between 10 urban zones (labelled 1 to 10) and 7 rural zones (labelled A to G). In the tables with station data (Appendices A to C) each site is assigned its appropriate LCZ label. The evaluation is based on surface characteristics and approximate building height of the neighbourhood surrounding each station, utilising Google Street View™. The LCZs of the rural reference stations used are given in the upcoming sections, which detail the methods used per city.

Statistics

In our analyses we apply a range of statistical tests to research significant differences between groups (e.g. ranges of vegetation fractions). While we assume knowledge of descriptive statistics (e.g. mean, median, percentiles) to be common, we will briefly go over some of the inferential statistics used:

• Student’s t-test. This is a very basic test used to test whether the mean of a (normally distributed) group of data is significantly different from the value tested against. This test can be done on one group of data, e.g. the differences between two air temperature records, and tested against 0. A significant result would indicate that the mean difference is not zero: therefore the two groups are significantly different from one another. The test can also be performed pair-wise, where records from two sites are compared pair-wise (at the same moment in time) and tested for any significant difference. The t-test only tests whether the mean of a group of data is significantly different from the tested value: in case of a significant result, it does not provide information on what the actual difference is.
• Analysis Of Variance (ANOVA). This test in its simplest form is a generalisation of the t-test for two or more groups. It tests whether the means of two or more groups are equal to one another, or whether significant differences occur between groups. It is more reliable than repeatedly performing two-sample t-tests, because it lowers the likelihood of incorrectly rejecting the null hypothesis (a false positive test result). While the ANOVA is a powerful tool, which can be used in a wide array of analyses, we limit ourselves to the one-way ANOVA, which tests groups of data for the effect of one independent variable (vegetation) on one dependent variable (temperature).
• Least Significant Difference (LSD) and Tukey’s Honest Significant Difference (HSD) tests. Both the LSD and HSD tests are post-hoc tests: performed after the experiment to locate any patterns in the data. They can be used to quantify differences between groups, similar to the one-way ANOVA discussed above. While a significant ANOVA result does not indicate which groups are significantly different from each other, the LSD and HSD tests report which groups significantly differ from each other and the magnitude of the difference (hence the term Least Significant Difference). In essence, the LSD test is a series of t-tests, with a pooled standard deviation instead of a standard deviation calculated from the two groups under testing. The HSD test works in nearly the same way, but corrects for the increased probability of drawing false conclusions when performing a series of tests (the multiple comparisons problem), and is therefore generally safer to use with a large number of groups. Both tests assume normality of the data and homogeneity of variances. The normality assumption is not fully met: histograms show that the UHI distribution is skewed to the right (i.e. a higher number of data points at the right side of the peak). The distribution is still approximately Gaussian, so we consider these tests usable, but some caution is required when interpreting the standard deviation.
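The multiple comparisons problem mentioned above can be made concrete with a short calculation. The sketch assumes independent tests at a significance level of 0.05, which is a simplification, and the number of groups in the example is hypothetical.

```python
from math import comb


def pairwise_comparisons(n_groups: int) -> int:
    """Number of pair-wise tests needed to compare every group with every other."""
    return comb(n_groups, 2)


def familywise_error_rate(n_tests: int, alpha: float = 0.05) -> float:
    """Probability of at least one false positive in n_tests independent tests,
    each performed at significance level alpha."""
    return 1.0 - (1.0 - alpha) ** n_tests


# Example: 5 groups of vegetation fraction (a hypothetical number of groups)
n = pairwise_comparisons(5)        # 10 pair-wise t-tests
rate = familywise_error_rate(n)    # roughly 0.40 instead of 0.05
```

Under these assumptions, ten uncorrected t-tests would give a chance of about 40% of at least one false positive, which is why a correction such as the one in the HSD test is preferable when many groups are compared.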

The upcoming 3 sections describe the specific data and methods for each city separately: the urban and rural stations used and the determination of the green fractions.

Rotterdam

Rotterdam is a coastal city in the Netherlands, bordering the North Sea, with a temperate maritime climate (Cfb in Köppen), and prevailing winds from the south-west. The river Rhine flows through the city, dividing the city centre. The city is bordered by extensive pasture lands and smaller towns.

Temperature data and analyses

Applying the station selection criteria mentioned in the previous section, 8 Wunderground stations remain with sufficient quality for analysis. In addition to the Wunderground stations we use data from an urban measurement network in Rotterdam. This is a network of 12 urban stations and 1 rural reference station installed by the WUR for the Climate-Proof Cities project (from now on referred to as CPC), which was part of the Knowledge for Climate programme (Van Hove et al., 2015). The network is currently maintained by the Technical University of Delft, and it has provided a detailed and accurate urban measurement record since 2010 (2009 for some stations). Van Hove et al. (2015) report that the stations have an accuracy of 0.4 K for temperature; 2% for relative humidity (for 10-90%, 4% outside this range) and 0.5 m s⁻¹ for wind speed. The location of these stations can be found on the map in Appendix A, and station characteristics (hardware, measuring period and other properties) can be found in the accompanying table in Appendix A.

We use the data from this detailed urban network to verify the Wunderground measurements, to check for any bias within the Wunderground stations compared to the CPC network. The Reference station (located to the North-East of Rotterdam) of the CPC network is used to calculate UHI for Rotterdam. In addition to this Reference station we also use the data from the WMO station Zestienhoven, located near the airport of Rotterdam (in the north-west of the city). Validation of the ECMWF model data as estimation of rural temperatures is done using both of these rural stations.

We are interested in the air temperature at 00 UTC, which we assume to be near the maximum UHI, and will therefore only compare data from this time, rather than the other hours (06, 12, 18 UTC). For each city there are several ECMWF grid points (shown on the maps in Appendices A to C) available, so it is necessary to determine whether these points differ substantially from each other to avoid creating a bias in the UHI calculation. For Rotterdam we focus on 3 points: West (51.90°, 4.20°), South (51.75°, 4.50°) and East (51.90°, 4.65°). Grid points to the north are either located in city area (The Hague and Gouda) or very close to the North Sea. There are a few grid points in the city centre of Rotterdam itself, and we will later on test whether the ECMWF model recognises this and has its own UHI.

Aside from determining a region’s dominant vegetation type, the regional climate also regulates the background air temperature, which determines the resulting city temperature and subsequently the UHI magnitude. From that, it may follow that changes in climate could influence the seasonality of the Urban Heat Island. To research the seasonal variation of the UHI we employ a method from Zhou et al. (2013), who visualise the intensity of the UHI over the year plotted against background (rural) temperature in a hysteresis-like curve. Their method is based on remote sensing data averaged over various days and Fourier-transformed, whereas our data are very local measurements. Nevertheless, the way of visualisation allows for thorough analysis of the yearly variation in UHI.

For Rotterdam we have calculated the median UHI per month for each station, and subsequently averaged over all these stations to get an average UHI for the entire city of Rotterdam. This introduces some error, as there are stations that have a much longer record than others (mostly the Wunderground stations). However, applying weighted averages discriminates against the stations with shorter records, which can cause an overrepresentation of certain parts of the city and weaken the claim that we have created an average UHI seasonality for Rotterdam itself.
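The averaging described above can be sketched as follows: first a median per month for each station, then an unweighted mean over the stations. The input layout (a dictionary of station names with lists of month and UHI pairs) is an assumption made for this example.

```python
from collections import defaultdict
from statistics import mean, median


def monthly_median(uhi_records):
    """uhi_records: iterable of (month, uhi) pairs for one station.
    Returns a dictionary {month: median UHI}."""
    by_month = defaultdict(list)
    for month, uhi in uhi_records:
        by_month[month].append(uhi)
    return {month: median(values) for month, values in by_month.items()}


def city_seasonality(stations):
    """stations: {station name: list of (month, uhi) pairs}.
    Returns {month: unweighted mean over stations of the monthly medians}."""
    per_month = defaultdict(list)
    for records in stations.values():
        for month, med in monthly_median(records).items():
            per_month[month].append(med)
    return {month: mean(values) for month, values in sorted(per_month.items())}
```

Because every station contributes one value per month regardless of its record length, a station with a short record counts as much as one with a long record, which matches the unweighted choice discussed above.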

Green fraction and Local Climate Zones

For Rotterdam we determine the vegetation fraction using 3 separate methods, in order to see which method yields the most favourable result in the regression with the UHI. We use the widely used Normalized Difference Vegetation Index (NDVI), a Green Vegetation Fraction (GVF; Gutman and Ignatov, 1998) derived from this NDVI, and a simple function to count the number of green pixels in Google Earth images (Steeneveld et al., 2011).

From satellite imagery we can construct an index for vegetation, the Normalized Difference Vegetation Index (NDVI). Based on the difference in reflective properties of the surface the vegetated fraction can be determined. The NDVI can range from -1 to 1, where high values (around 0.6) indicate a densely vegetated surface, and low values (around 0 or 0.05) indicate a surface with little vegetation (water, bare soil, or built-up area). Using the ArcMap programme we have made an NDVI map of the Rotterdam area at 25x25 metre resolution, using Landsat 8 imagery from spring months as input. The accuracy of this map is validated with the higher resolution (10x10 m) NDVI map courtesy of the Geodesk, Alterra. The equation for the NDVI is:

Equation 2: $NDVI = \dfrac{NIR - VIS}{NIR + VIS}$

Where NIR is the reflectance in the near-infrared spectrum (0.725 to 1.1 micrometre) and VIS the reflectance in the visible spectrum (0.58 to 0.68 micrometre) (Gutman and Ignatov, 1998).
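As a brief illustration of Equation 2, the function below computes the NDVI from a pair of reflectances. Returning `None` when both reflectances are zero is a choice made for this sketch, and the example values are hypothetical.

```python
def ndvi(nir: float, vis: float):
    """Normalized Difference Vegetation Index from near-infrared (nir)
    and visible (vis) reflectance, following Equation 2."""
    total = nir + vis
    if total == 0:
        return None  # undefined when both reflectances are zero
    return (nir - vis) / total


# Hypothetical reflectances of a vegetated pixel: high NIR, low VIS
example = ndvi(0.5, 0.1)  # about 0.67
```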

While NDVI is in itself an index of vegetation, and is therefore suitable for analysis, it is not very straightforward to interpret. The range is not completely uniform: values differ between vegetation types since some are greener than others, meaning that a fully vegetated surface of one plant species can have a substantially different NDVI than a surface with a different species (Gutman and Ignatov, 1998). The NDVI also has the disadvantage of being very sensitive to perturbations: atmospheric perturbations from water vapour and aerosols, or surface perturbations from shadowing by clouds and buildings. A green vegetation fraction (i.e. the fraction of the surface covered by active vegetation) is more intuitive to work with and would be applicable in areas with multiple plant species, as is often the case in urban areas due to gardening. To this end, we employ the method of Gutman and Ignatov (1998), who derive a green vegetation fraction (GVF) from the NDVI:

Equation 3: $f_g = \dfrac{\mathrm{NDVI} - \mathrm{NDVI}_{min}}{\mathrm{NDVI}_{max} - \mathrm{NDVI}_{min}}$

Where fg is the green vegetation fraction (-), NDVI_min is the NDVI of bare soil, and NDVI_max is the NDVI of a densely vegetated surface (Gutman and Ignatov, 1998). For every measurement station in Rotterdam we have determined the average NDVI of a circle around the station (100, 250, 500 metres radius), and the corresponding green vegetation fraction. For the low threshold (NDVI_min) we use an NDVI of 0.03 to represent bare soil, and for the upper threshold (NDVI_max) we use an NDVI of 0.66 to represent densely vegetated surface. These values represent the minimum and maximum NDVI values found in the area around Rotterdam. The GVF values were checked for reliability by comparing them to the CPC station data as given by Van Hove et al. (2015), who also performed a detailed analysis of the surface cover fractions (vegetation, water, impervious fraction) of the CPC Rotterdam stations (Table 1 in Van Hove et al., 2015).
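As an illustration of equations 2 and 3, the following minimal Python sketch computes the NDVI from two reflectances and rescales it to a green vegetation fraction with the thresholds quoted above (0.03 and 0.66). The function names are hypothetical, and clipping the result to the range 0 to 1 is an assumption that is not stated in the text.

```python
def ndvi(nir, vis):
    """Equation 2: normalised difference of near-infrared and visible reflectance."""
    total = nir + vis
    if total == 0:
        raise ValueError("NIR + VIS must be non-zero")
    return (nir - vis) / total


def green_vegetation_fraction(ndvi_value, ndvi_min=0.03, ndvi_max=0.66):
    """Equation 3: linear rescaling of NDVI between bare soil and dense vegetation.

    The default thresholds are the Rotterdam values quoted in the text.
    Clipping to [0, 1] is an assumption made for this sketch.
    """
    fg = (ndvi_value - ndvi_min) / (ndvi_max - ndvi_min)
    return min(1.0, max(0.0, fg))


if __name__ == "__main__":
    # Hypothetical reflectances, for illustration only.
    value = ndvi(nir=0.30, vis=0.20)
    print(f"NDVI = {value:.3f}, GVF = {100 * green_vegetation_fraction(value):.1f}%")
```

Because equation 3 is linear, changing the two thresholds only shifts and stretches the GVF values; it does not change how stations rank relative to each other.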

Since the Landsat data used for the NDVI maps is still rather coarse for detailed urban analysis (25x25 metres) we have employed another method of calculating a green fraction, as used by Steeneveld et al. (2011). They use Google Maps™ imagery to estimate the green vegetation fraction in a square with sides of 600 metres around each measurement point. They make use of the increased green reflectance of vegetation as compared to blue and red reflectance, quite similarly to the NDVI (Steeneveld et al., 2011; Sarlikioti et al., 2011). We have slightly modified the original equation (Steeneveld et al., 2011, their equation 3), to reduce the underestimation of the green fraction for imagery containing shadowing effects (e.g. from trees). In situations where this problem still occurs, we have manually greened the parts that we know to be covered in vegetation, to improve the results.

Equation 4: $\mathrm{Green\%} = (R > 1.15B) \;\&\; (G > 1.01R) \;\&\; (G > 1.01B)$

Here R is the reflectance in the red spectrum; B is the reflectance in the blue spectrum and G is the reflectance in the green spectrum. For each station the green cover has been estimated using equation 4 for a square with an area of 600x600 metres around the station.
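The pixel test of equation 4 can be sketched in Python as follows. This is a minimal illustration, not the original implementation: it applies the three conditions exactly as printed in equation 4, assumes the image is already available as a sequence of (R, G, B) tuples, and uses hypothetical function names. Reading the image and cutting out the 600x600 metre square are left out.

```python
def is_green(r, g, b, rb_ratio=1.15, gr_ratio=1.01, gb_ratio=1.01):
    """Apply the three conditions of equation 4, as printed, to one pixel."""
    return r > rb_ratio * b and g > gr_ratio * r and g > gb_ratio * b


def green_percentage(pixels):
    """Percentage of pixels classified as green.

    `pixels` is any iterable of (R, G, B) tuples (an assumed input format).
    """
    pixels = list(pixels)
    if not pixels:
        raise ValueError("no pixels given")
    green = sum(1 for r, g, b in pixels if is_green(r, g, b))
    return 100.0 * green / len(pixels)


if __name__ == "__main__":
    # Four hypothetical pixels: two vegetation-like, one grey, one bluish.
    sample = [(120, 140, 90), (60, 80, 40), (200, 200, 200), (90, 140, 120)]
    print(f"Green cover: {green_percentage(sample):.1f}%")
```

Keeping the three ratios as parameters makes it straightforward to test other thresholds when the vegetation in the imagery has a different colour.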

Madrid The city of Madrid, Spain, is situated inland, with an average altitude of 660 metres above sea level, and a dry Mediterranean climate with dry and hot summers (Köppen class Csa). The city is dominated by tall, multiple storey buildings and little urban vegetation. The rural surroundings are mainly covered in grass and shrubs, which are inactive during the hot dry summer season.


Urban and rural data Additional urban weather stations from scientific campaigns are not available in Madrid. The abundance of Wunderground stations is high, resulting in 28 stations with sufficient quality after applying the selection criteria (see Appendix B). Since validation of the Wunderground stations against a network of known quality is not possible, quality assessment is based on the results of the Rotterdam validation. The validation of the ECMWF data is problematic, since there is no truly reliable rural reference station. The WMO station at Madrid Barajas airport is located more or less within the urban area (see the map in Appendix B), and would therefore likely be too contaminated by urban influences for a good rural reference station. The same problem is the case for the WMO station at Cuatro Vientos airport located to the south of Madrid. We will not use the Barajas data to quality-check the ECMWF data, but use it as a second way of calculating UHI, with the ECMWF data being the primary focus. In order to select the most representative rural background for the calculation of UHI we make use of the land-cover data of the ECMWF model and whether significant differences in air temperature occur between the ECMWF points. We have selected 8 grid points of the ECMWF model that lie around Madrid, visible on the map in Appendix B. To test which point is the most suitable to calculate with, we use an ANOVA with two post-hoc tests, the LSD and the HSD test. A Levene’s test for homogeneity of variance was first conducted, which proved that variances of the 8 sets were equal and the test assumptions are valid.

Madrid contains a widely varying topography: the lowest station (Arganda del Rey) has an elevation of 545 metres above mean sea level, whereas the highest station (Plaza de Castilla) is located at 750 metres a.m.s.l. In the troposphere, air temperature decreases with height, due to the decrease of density with height (Wallace and Hobbs, 2006, ch. 1). This requires us to correct for differences in elevation when calculating UHIs. The ECMWF data provides only a surface geopotential, which can be converted into the geopotential height when divided by the gravitational acceleration (taken as 9.81 m s^-2). Using this surface geopotential height as an estimate for mean elevation we can correct the ECMWF data for elevation as well. To accurately calculate the UHI, we need to determine the difference in air temperature between stations at a shared base level, which is set at the mean sea level. To achieve this, we have corrected each temperature value for the height by adiabatically moving it to sea level according to equation 5:

Equation 5: $T_h = T_{ref} + dh \cdot \gamma$

Here T_h is the temperature (°C) at a height h (metres); T_ref is the temperature (°C) at a reference level (usually mean sea level); dh (metres) is the difference in height between the reference level and h; gamma is the dry adiabatic lapse rate (K m^-1). The dry adiabatic lapse rate is set at -6 × 10^-3 K m^-1, within the range given by Wallace and Hobbs (2006, chapter 1). Gao et al. (2012) use a more accurate monthly varying lapse rate (ranging from -4.4 × 10^-3 to -8.2 × 10^-3 K m^-1) to correct the ECMWF ERA-Interim data, but since our objective is only to correct the temperature so that two locations can be compared at the same point in time, a fixed lapse rate suffices.
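The elevation correction can be illustrated with a short Python sketch. It solves equation 5 for T_ref, using the lapse rate and gravitational acceleration quoted above. The function names and the example temperatures are hypothetical; only the constants and the two elevations come from the text.

```python
G = 9.81             # gravitational acceleration (m s^-2), as taken in the text
LAPSE_RATE = -6e-3   # fixed lapse rate gamma (K m^-1), as set in the text


def geopotential_to_height(geopotential):
    """Convert a surface geopotential (m^2 s^-2) to geopotential height (m)."""
    return geopotential / G


def to_reference_level(t_h, dh, gamma=LAPSE_RATE):
    """Solve equation 5 for T_ref: T_ref = T_h - dh * gamma.

    t_h : temperature measured at height h (deg C)
    dh  : height of the station above the reference level (m)
    """
    return t_h - dh * gamma


if __name__ == "__main__":
    # Hypothetical temperatures at the highest and lowest elevations named in the text.
    high = to_reference_level(t_h=15.0, dh=750.0)
    low = to_reference_level(t_h=14.0, dh=545.0)
    print(f"Raw difference: {15.0 - 14.0:.2f} K, corrected difference: {high - low:.2f} K")
```

With a negative gamma, a station above the reference level is always warmed by the correction, so temperature differences between stations at different heights change by the height difference times the lapse rate.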

Green cover Though Landsat data is freely available, higher resolution NDVI data (e.g. the 10x10 metre data from Geodesk) is not available for Madrid as it is for Rotterdam. Therefore we focus solely on the green cover estimation method using Google Earth for Madrid. Initially, the green estimation function (equation 4) gave very unrealistic green values (judged by visual comparison with the imagery used). Madrid has darker vegetation than Rotterdam, with more dark trees and shrubs, whereas Rotterdam has more light green grass. The equation was optimized for these lighter colours, causing the green fractions for Madrid to be severely underestimated. Trees have nearly the same reflectivity in the red, blue and green spectra, making it difficult to discriminate between actual green trees and other structures. After several rounds of testing we have found that the following equation gave satisfactory results:

Equation 6: G > B & G > R & (G - R+B/2 > 20) & (R + B < 200)


This ensures that the green reflectivity is always the dominant factor, and that the total reflectivity of the red and blue bands is low. Since this function has not been reported in earlier work, we have taken extra care to visually inspect each resulting green fraction and to correct it where necessary. Some stations were still underestimated by this new function, and those have been manually greened.

The spread of green cover in Madrid is limited: the majority of the stations are located in neighbourhoods with a low to very low green cover (0 to 15%). The histogram in Figure 2 shows the distribution of green fraction over the 28 stations under consideration, showing that 80% of the stations have a green fraction below 15%. Madrid is a dense city, with the majority of the neighbourhoods classifying as LCZ 2 or LCZ 3: compact mid/low rise. This means that the space for vegetation is limited, and is generally only abundant in parks or the sub-urban areas.

[Histogram: frequency and cumulative % (y-axes) against green fraction (%) (x-axis)]

Figure 2: Histogram of the frequency of green fractions in Madrid. There are a total of 28 stations, 21 of which fall below 15% green fraction

Oslo Lastly we consider the city of Oslo, Norway. The city is situated in the Oslofjord, bordering water, with a steep topographical gradient from 0 metres at the harbour to 300 metres close to the northern city border, and a humid continental climate (Köppen class Dfb). It is an open structured city with a very high urban green cover and generally low buildings.

Urban and rural data Applying the selection criteria to the Wunderground data results in 18 stations in Oslo. The website of the Norwegian weather service (http://eklima.net.no) provides data from 3 stations in the city proper, resulting in a total of 21 urban stations. The spread of the urban stations over the city is limited: there are no suitable stations in the old city centre, and the majority of the stations are located in the sub-urban areas. We make use of 2 reference stations to the north of Oslo, the data of which is provided by the Norwegian weather service as well. Tryvannshøgda is located on a rather steep hill just to the north of Oslo, while Bjørnholt is located about 8 kilometres north of the city. Both are WMO regulated stations, but Tryvannshøgda has the complication of being located in a forest, while Bjørnholt is located on a grass field near a small lake. The Norwegian weather service stations report every 6 hours, at the same interval as the ECMWF data. This makes them useful for comparison with the ECMWF data, but we cannot inspect the diurnal cycle of the UHI or rural temperature around the city.

There are 6 ECMWF grid points around Oslo (see Appendix C for the full map of stations and locations). The southern points are located near or over the water, and are therefore not favourable for analysis, since the water might influence the temperature too much. An LSD test (not shown) indicates that the difference between the 6 points is negligible: on the order of 0.2 °C. We will use ECMWF point 1, located to the north-west of the city centre, to calculate the UHI.

Oslo has a large gradient in topography: the harbour is near sea level, but the elevation increases sharply towards the northern border of the city, reaching 348 metres for the highest station (Voksenlia). On average the urban stations lie between 0 and 120 metres. To correct for temperature differences arising from the differences in elevation we have adiabatically corrected all temperatures towards mean sea level, similar to the method for Madrid (equation 5).

Green cover Similar to Rotterdam we use equation 4 for the determination of green cover in Oslo. No further optimization of the terms of the equation was required, as the vegetation in Oslo bears similarities to that of Rotterdam. 3 stations (Snarøya, Nakholmen, Bleikoya) are located on small islands close to the city. This causes the function to prescribe a very low green fraction when the land is actually very green, but the 600x600 metre rectangle around the station is mostly filled with water. These stations have been omitted from the statistical analysis, since the water would be the dominant surface cover to interact with air temperature, rather than the green cover. The spread of the green cover is low: as briefly mentioned the majority of the stations is situated in the sub-urban areas around the old city centre, which causes a bias towards high green fractions.

Results In this chapter we outline our results, starting with the validation of the Wunderground data and the ECMWF model data. We proceed with the results of the statistical analysis of the UHI and the green fractions in the 3 cities. Finally we discuss the yearly trend of the UHI, linked to the climatological background of each city.

Data validation

Validation of the Wunderground data To validate the Wunderground data, and see whether the stations provide data of sufficient quality to use in analysis, we have compared the Wunderground stations with the CPC network of Rotterdam. Most CPC stations are located in different neighbourhoods than the selected Wunderground stations, except for the Vlaardingen station, which is close to the Vlaardingen Wunderground station. We will compare these two stations since they are in similar neighbourhoods, very close to one another. As an extra test we have compared the data of Beverwaard (Wunderground) and Zuid (CPC), which are located in very similar neighbourhoods (LCZ 2, little vegetation, mainly impervious surface, high human/traffic activity). Any systematic differences could indicate that we cannot freely compare data from the CPC stations with the Wunderground stations, since differences would be caused by the measuring setup rather than surface characteristics. We have run a one-sample t-test for these two pairs of stations to test whether the difference in air temperature between CPC and Wunderground (i.e. all tests are done for the difference in air temperature between the CPC station and the Wunderground station) is statistically close to zero (Table 1):


Table 1: statistical output for test of Tair CPC - Wunderground (i.e. negative difference indicates the Wunderground temperature is higher). The years 2013 and 2014 (up to September) have been included, since this is the timeframe where all 4 stations have data. Test Value = 0; alpha = 0.05

| Station, year | t-value | Significance (2-tailed) | Mean Difference | 95% Confidence Interval, Lower | 95% Confidence Interval, Upper | Std. Deviation | Range |
|---|---|---|---|---|---|---|---|
| Vlaardingen, 2013 | -4.945 | 0 | -0.06892 | -0.0962 | -0.0861 | 1.08569 | 19.73 |
| Vlaardingen, 2014 | -4.352 | 0 | -0.05932 | -0.0416 | -0.0326 | 1.08052 | 18.72 |
| Beverwaard, 2014 | -5.376 | 0 | -0.04597 | -0.0627 | -0.0292 | 0.62191 | 7.13 |
| Beverwaard, 2013 | -11.94 | 0 | -0.07091 | -0.0826 | -0.0593 | 0.54508 | 7.57 |

For both combinations it follows that the difference is significantly different from 0 (the significance is lower than the threshold of 0.05), and that therefore the Wunderground stations do not report the same as their CPC counterparts. For both the tests the Wunderground stations are consistently warmer than their CPC counterparts: the mean difference is always negative. However, this is partly to be expected, since the set-up and local circumstances are different, even if they are relatively close together like in Vlaardingen. Moreover, the mean difference is very small (on the order of 0.1 °C, though with a relatively large spread), and the difference is centred around zero. The 95% confidence interval around the difference is still lower than the measurement accuracy of both the Wunderground and the CPC stations (on the order of 0.5 °C, for the exact numbers we refer to Appendix D). This indicates that the difference between the Wunderground and CPC stations lies within the measurement error, and that the difference could be due to random error rather than a systematic error. Histogram plots of the difference are given in Appendix E. The statistics show that the two Vlaardingen stations have a larger range of temperature difference than the Beverwaard/Zuid stations, which are much further apart but are located in similar LCZs, whereas the two Vlaardingen stations are slightly different in terms of neighbourhood structure.
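The one-sample t-test used above can be sketched in Python with the standard library only. This is a minimal illustration under stated assumptions, not the statistical package output reported in Table 1: the p-value and confidence interval use the normal approximation to the t-distribution, which is only reasonable for the large sample sizes of this study, and the function name and example values are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev


def one_sample_t(differences, test_value=0.0, confidence=0.95):
    """One-sample t-test of the mean of `differences` against `test_value`.

    Assumption: p-value and confidence interval use the normal approximation,
    which is adequate only for large samples.
    """
    n = len(differences)
    if n < 2:
        raise ValueError("need at least two values")
    m = mean(differences)
    s = stdev(differences)            # sample standard deviation (n - 1)
    if s == 0:
        raise ValueError("standard deviation is zero")
    se = s / math.sqrt(n)             # standard error of the mean
    t = (m - test_value) / se
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    p = 2 * (1 - NormalDist().cdf(abs(t)))
    return {
        "t": t,
        "p_approx": p,
        "mean_difference": m,
        "ci_lower": m - z * se,
        "ci_upper": m + z * se,
        "std_dev": s,
    }


if __name__ == "__main__":
    # Hypothetical CPC minus Wunderground differences (deg C).
    print(one_sample_t([-0.2, 0.1, -0.1, 0.0, -0.3]))
```

With thousands of observations the standard error becomes very small, so even a mean difference well below the instrument accuracy can test as significantly different from zero, which is the situation described above.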

We will assume that our initial quality checks when selecting the stations and downloading the data make the Wunderground stations of sufficient quality to work with, and the statistics seem to back this conclusion, since the difference in air temperature is rather small, though significantly different from 0. Since we lack a detailed independent urban network for Madrid and Oslo, we cannot separately assess the quality of the Wunderground stations in those cities. We will work under the assumption that those stations are of equal quality. Many of the hobby meteorologists have the same hardware type of weather station, so there should not be large differences in quality between the cities.

Validation of the ECMWF data For our analysis of the ECMWF data we make the assumption that the UHI at 0 UTC will be near the maximum UHI. This assumption is tested using the Rotterdam data: the Reference data is available at half-hourly resolution rather than the six-hourly resolution of the ECMWF. We have calculated hourly UHI values and selected the maximum UHI for each day, looking at which hour it falls. The histogram in Figure 3 shows the results for the Beverwaard Wunderground station. The histogram identifies 3 peaks: one at 0 UTC, one at 13 UTC and one at 23 UTC. The 23 and 0 UTC peaks together make up over 20% of maximum UHI occurrences (on 318 out of 1567 days the maximum falls at 23 or 0 UTC). The peak at 13 UTC could be caused by the angle of the sun on the station used for the histogram. As we have mainly focused on the nocturnal UHI we do not further consider this local peak.

Figure 3: Histogram of maximum daily UHI occurrence of the Beverwaard station, Rotterdam. Total days considered = 1567.
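The counting behind a histogram such as Figure 3 can be sketched as follows. The record format (day, hour in UTC, UHI) and the function name are assumptions made for this illustration, and the example values are hypothetical. When two hours tie within a day, the sketch keeps the first one encountered.

```python
from collections import Counter


def hour_of_daily_maximum(records):
    """Count, per hour of day, how often the daily maximum UHI falls at that hour.

    `records` is an iterable of (day, hour_utc, uhi) tuples (assumed format).
    """
    best = {}  # day -> (hour, uhi) of the largest UHI seen so far
    for day, hour, uhi in records:
        if day not in best or uhi > best[day][1]:
            best[day] = (hour, uhi)
    return Counter(hour for hour, _ in best.values())


if __name__ == "__main__":
    # Hypothetical hourly UHI values (K) for three days.
    sample = [
        ("day1", 0, 3.1), ("day1", 13, 1.0), ("day1", 23, 2.5),
        ("day2", 0, 1.2), ("day2", 13, 2.8),
        ("day3", 23, 4.0), ("day3", 0, 3.9),
    ]
    counts = hour_of_daily_maximum(sample)
    for hour in sorted(counts):
        print(f"{hour:02d} UTC: {counts[hour]} day(s)")
```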

In each city we select an appropriate grid point from the ECMWF model data near the city of interest, and then test whether the data of that grid point is significantly different from the temperature data from the rural measurement station.

For Rotterdam we have selected 3 ECMWF grid points that are near the city: these are named ECMWF West, South and East respectively (see the map in Appendix A). To test how the ECMWF points perform compared to the Rotterdam Reference station, and which of the 3 points of interest performs best, we have performed a one-sample t-test of the difference between the Reference station (CPC) and the 3 ECMWF points (West, South, East). Results are in Table 2.

Table 2: T-test of the difference in air temperature between the CPC Reference station and the 3 ECMWF points for Rotterdam. Only data from 0 UTC has been used. Test Value = 0; alpha = 0.05

| ECMWF point | t-value | Significance (2-tailed) | Mean Difference | 95% Confidence Interval of the Difference, Lower | 95% Confidence Interval of the Difference, Upper | Std. Deviation | Std. Error Mean |
|---|---|---|---|---|---|---|---|
| West | -14.879 | 0.000 | -0.447 | -0.506 | -0.388 | 1.131 | 0.030 |
| East | 3.108 | 0.002 | 0.085 | 0.031 | 0.138 | 1.028 | 0.027 |
| South | 1.936 | 0.053 | 0.055 | -0.001 | 0.111 | 1.070 | 0.028 |

From this test we can conclude that the South point performs best, being not significantly different from the CPC reference station (the significance is above the 0.05 threshold, indicating no significant difference, though only barely). The choice of reference station can matter when determining UHI, but additional testing has shown that the CPC Reference station and the Zestienhoven Airport WMO station are also not significantly different from each other (results not shown). Zestienhoven is rather close to the city (see the map of Rotterdam in Appendix A), so we have chosen to use the Reference CPC station as our rural measurement station, to avoid data contamination by the advection of warm urban air. For analysis with the ECMWF data we will use the South point, since it is most similar to the measured Reference data and a proper distance away from the city border.

Madrid

We cannot reliably test the ECMWF data of Madrid against the Barajas data due to the airport’s proximity to the city. Instead, we tested which of the 8 grid points around the city (see the map in Appendix B) is the most reliable to calculate UHI. Figure 4 shows the means plot of the 8 ECMWF points. The LSD and HSD tests showed that point 1 and 7 are significantly different from all the other points, visible in the means plot by their respective low and high values compared to the other points. The statistics show that points 3 up to 6 have the least difference between them. However, this need not necessarily mean they are the most representative of the rural conditions around Madrid. The ECMWF land-use data reveals that points 7 and 8 have a different low vegetation type than the other 6 points, namely Semi-desert instead of Mixed Crops (see ECMWF, 2012, ch. 8, for additional information). The high vegetation type is either interrupted forest or mixed forest/woodland, but table 8.1 in the ECMWF model documentation shows that their surface properties are nearly the same. Each land cover type in the ECMWF model has a certain fraction of bare soil: the ground cover is never 100% vegetated (ECMWF, 2012, ch. 8). The ECMWF model documentation mentions that the total vegetation (low or high) coverage is calculated as the fraction vegetation times a vegetation type dependent ground coverage fraction (ECMWF, 2012, equation 8.1). The bare soil fraction is the residual after the total coverage has been determined for high and low vegetation.

Figure 4: Means plot of the air temperature (°C) of 8 ECMWF grid points around Madrid

The difference in vegetation type causes a high fraction of bare soil in the grid points 7 and 8, which makes them unsuitable to serve as a well-vegetated rural background for the calculation of UHI. On the basis of location we have chosen to use grid point 3 as our rural reference, since it is close to the city (to the north). Points 1 and 2 have a higher elevation than the other grid points, and could be subject to local circulations impacting especially the nocturnal air temperatures (katabatic winds), so we will use point 3 instead, as there is no significant difference between this point and the majority of the ECMWF grid points, suggesting a good representation of the local climate.


Oslo

There are 6 ECMWF grid points around Oslo (see Appendix C for the full map of stations and locations). The southernmost points are located near or over the water, and are therefore not favourable for analysis, since the water could influence air temperature. An LSD test (not shown) indicates that the difference between the 6 points is negligible: on the order of 0.2 °C. We will use ECMWF point 1, located to the north-west of the city centre, to calculate the UHI.

The two reference stations from the Norwegian weather service are located in distinctly different areas (Tryvannshøgda in a dense forest, LCZ A, while Bjørnholt is in a grass field near a lake, LCZ D). To see whether this difference in location influences the resulting UHI values we have performed a one-way ANOVA of the (adiabatically corrected) air temperature of both stations. Results in Table 3 show that Bjørnholt is nearly 2.3 °C colder on average (mean temperature of 4.06 °C versus 6.34 °C), and the difference is highly significant (p-value reported as 0.000, not shown). We suspect that Tryvannshøgda might be too close to the city edge and too inconveniently placed (located on a very steep hill in a forest) to be a good rural reference station. Plotting the UHI statistics (5%, 95% and median) for the 3 different references (Tryvannshøgda, Bjørnholt and the ECMWF point) results in the largest UHI values for Bjørnholt and the lowest values (with a median around 0) for Tryvannshøgda, with the ECMWF values lying in between (not shown). The low UHI values for Tryvannshøgda suggest that the station might indeed be too close to the city, and its high elevation compared to the other stations (545 m a.m.s.l. whereas the Oslo stations range from 0 to 100 metres a.m.s.l.) could cause additional issues.

Table 3: One-way ANOVA of air temperature between the rural WMO stations of Oslo. Differences were highly significant at the 0.05 level. Only data from 0 UTC has been used.

| Name | N | Mean | Std. Deviation | 95% Confidence Interval for Mean, Lower Bound | 95% Confidence Interval for Mean, Upper Bound |
|---|---|---|---|---|---|
| Tryvannshøgda | 1731 | 6.3413 | 7.565 | 5.9847 | 6.6979 |
| Bjørnholt | 1710 | 4.0567 | 8.371 | 3.6596 | 4.4537 |
| Total | 3441 | 5.2059 | 8.056 | 4.9367 | 5.4752 |
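As a rough consistency check, the one-way ANOVA F statistic can be recomputed from the group sizes, means and standard deviations in Table 3 alone. The sketch below is an illustration with a hypothetical function name, not the original analysis: it works from the rounded table values, so the result only approximates the output of the statistical package.

```python
def anova_f_from_summary(groups):
    """One-way ANOVA F statistic from per-group (n, mean, std_dev) tuples.

    Returns (F, grand_mean). Uses only summary statistics, so rounding in the
    inputs carries over to the result.
    """
    if len(groups) < 2:
        raise ValueError("need at least two groups")
    total_n = sum(n for n, _, _ in groups)
    grand_mean = sum(n * m for n, m, _ in groups) / total_n
    # Between-group and within-group sums of squares.
    ss_between = sum(n * (m - grand_mean) ** 2 for n, m, _ in groups)
    ss_within = sum((n - 1) * sd ** 2 for n, _, sd in groups)
    df_between = len(groups) - 1
    df_within = total_n - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, grand_mean


if __name__ == "__main__":
    # (N, mean, std. deviation) for Tryvannshøgda and Bjørnholt, from Table 3.
    f_stat, grand_mean = anova_f_from_summary(
        [(1731, 6.3413, 7.565), (1710, 4.0567, 8.371)]
    )
    print(f"grand mean = {grand_mean:.4f} deg C, F = {f_stat:.1f}")
```

The recomputed grand mean matches the "Total" row of Table 3, and the F value is far above 1, in line with the highly significant difference reported in the text.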

A final point of interest is whether the ECMWF ‘feels’ the influence of the urban area: i.e. does the ECMWF model distinguish between rural and urban area? To test whether urban influence might play a role in the data we have compared an ‘urban’ grid point with the ‘rural’ grid point in Rotterdam. The urban grid point is situated in the middle of the city area, at the same longitude as the chosen rural grid point (ECMWF South), but to the north. Figure 5 shows the scatterplot of the ‘urban’ ECMWF grid point (x-axis) versus the ‘rural’ ECMWF South grid point (y-axis). The black line is the 1:1 line, i.e. where the grid points show the same data. All data are centred around this 1:1 line: had there been some urban influence in the ECMWF data, there should have been a bias to below this 1:1 line (i.e. higher urban temperatures and lower rural temperatures). Since we do not see this behaviour for the urban point, even when comparing it with the other 2 rural ECMWF points discussed above (plots not shown), we find no evident urban influence on the air temperature in the ECMWF model.

Figure 5: Comparison of air temperature between the 'urban' ECMWF grid point (the western point) to the used 'rural' ECMWF grid point. The black line is the 1:1 line (i.e. both points give equal temperatures). R² is 0.9, suggesting nearly perfect fit: no significant difference between the two points.

UHI vs vegetation In this section we evaluate 3 ways of determining the urban vegetation fraction: the NDVI, the GVF and the green fraction estimate. Using the NDVI maps for Rotterdam we calculate these 3 indices and relate them to the Rotterdam UHI, to see which provides the most significant statistical relationship. We then apply the most successful method to Madrid and Oslo to relate the 3 cities to each other.

NDVI and GVF Having established the quality of Wunderground stations as urban data, and ECMWF data as suitable proxy for rural data, we have calculated the Urban Heat Island for Rotterdam, and relate any differences between stations to the difference in surface properties. Appendix A provides the NDVI and green fractions from Van Hove et al. (2015) for each of the CPC stations. Figure 6 shows the resulting regression between NDVI and this green fraction.

[Scatter plot: green cover (%) on the y-axis against NDVI (-) on the x-axis, with linear regression y = 266.9x - 19.93, R² = 0.8388]

Figure 6: NDVI against green cover (as determined by Van Hove et al., in %). Both are determined in a circular area with a radius of 250 metres around the CPC stations in Rotterdam. The regression equation is shown on the chart. Note that the outlier point with 99% green fraction is in fact the Reference station and not part of the urban network.

Results show that NDVI and green fraction are correlated quite well (fraction of explained variance of 0.8388), establishing that NDVI can be translated to an easier to understand green fraction. Though the regression behaves unphysically (at low NDVI the green fraction goes to below 0%, which could indicate the relation is non-linear), the relation behaves appropriately for the typical NDVI values we find in a city (which for Rotterdam range approximately between 0.1 and 0.25). When the regression is forced through the origin, the R² drops to 71.6%, which is still a robust relation.
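The behaviour of the Figure 6 regression can be checked with a few lines of Python. The sketch below only evaluates the regression equation shown on the chart (y = 266.9x - 19.93); the function names are hypothetical.

```python
def green_cover_from_ndvi(ndvi, slope=266.9, intercept=-19.93):
    """Green cover (%) predicted by the Figure 6 regression for a given NDVI."""
    return slope * ndvi + intercept


def ndvi_at_zero_cover(slope=266.9, intercept=-19.93):
    """NDVI below which the regression predicts a negative green cover."""
    return -intercept / slope


if __name__ == "__main__":
    print(f"Zero green cover at NDVI = {ndvi_at_zero_cover():.3f}")
    for value in (0.05, 0.10, 0.25):
        print(f"NDVI {value:.2f} -> green cover {green_cover_from_ndvi(value):.1f}%")
```

The zero crossing lies below the lower end of the typical urban NDVI range quoted above (0.1), which is why the unphysical negative values do not affect the stations analysed here.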

Next, we relate the NDVI to the calculated UHI statistics (median and 95 percentile) to see whether we can find a significant relationship, for circles of 250 metre and 500 metre radius. Table 4 provides the results of the regression statistics for this (1st block) and the following analyses. Though all regressions of the NDVI shown in Figure 7A and 7B are statistically significant (i.e. they are an improvement over predicting with a constant value, at an alpha-level of 0.05), none of the runs perform particularly well: Table 4 shows the R² of the NDVI regression does not exceed 0.48. There is a rather high amount of spread in the UHI whereas the NDVI values are clumped together, with 16 out of 21 stations falling within the 0.15 to 0.25 range. We suspect that the limited range of NDVI values (the reference station with 99% green cover has an NDVI of merely 0.44) is the main cause of the poor regression results. Therefore we will attempt to calculate a green fraction using the Green Vegetation Fraction method from Gutman and Ignatov (1998) as discussed in the methodology.

As a means of quality check, we compare the calculated GVF values with the green cover fractions as calculated by Van Hove et al. (2015). Figure 8 shows the results of this quality check. The calculated GVF values generally do not match the green cover as calculated by Van Hove et al. (2015), and are quite often more than 10% off (for example one station, which has a 65% green fraction in Van Hove et al., is estimated at only 33%). Changing the NDVI_min and NDVI_max values does not improve the relation: the regression stays nearly the same. Though the GVF does not seem an improvement over the NDVI as an accurate estimate of the green fraction we have performed the regression against the 95 percentile UHI as before, to see whether this improves on the results using NDVI. Results are in Figure 7 (C and D panels), and Table 4 (2nd block).

Figure 7: Regression and confidence interval of NDVI (A, B) and GVF (C, D) for a circle with 500 metre radius against the 95 percentile UHI from ECMWF (A,C) and Reference (B,D) data, in Rotterdam.


[Scatter plot: Green Cover (%) on the y-axis against Green Vegetation Fraction (%) on the x-axis, with linear trend line y = 1.1712x - 2.4471, R² = 0.4896]

Figure 8: Green Vegetation Fraction (x-axis) calculated from 250 metre radius NDVI versus Green Cover (y-axis) as given in Van Hove et al. (2015) for the 13 CPC stations. Each marker represents one CPC station, and the solid line is the resulting linear trend line. The dashed line represents the 1:1 line.

The results are quite similar to the NDVI analysis: the best performance is at the 500 metre radius, where the R² is 0.5 for both ECMWF and Reference (Table 4, 2nd block). We find no significant differences in performance between the UHI calculated using the measured Reference data and the UHI from the ECMWF model data. The relationship between green fraction and maximum UHI is similar to the one found by Steeneveld et al. (2011), namely a regression slope on the order of 0.6 to 0.8 K UHI reduction per added 10% green cover. However, this (and the previous) analysis contains all the stations in Rotterdam. Steeneveld et al. (2011) mention that their results significantly improved when including only the urban canopy stations, rather than also including stations situated on roofs and balconies.

Table 4: Regression statistics for the 95 percentile UHI against NDVI and GVF. ‘ECMWF’ indicates the UHI has been calculated using the ECMWF model data; ‘Ref’ indicates UHI is calculated from the Reference measurement station data. Radius refers to the radius around each station in which the average NDVI or GVF has been determined. The x in the regression equation refers to either the NDVI or the GVF, indicated by the header. All regressions are significant at the 0.05 level (p=0.000).

| Index | Data & Radius | R² | RMSE (°C) | Regression eqn. |
|---|---|---|---|---|
| NDVI | ECMWF, 250 m | 0.34 | 0.734 | -8.856x + 4.362 |
| NDVI | ECMWF, 500 m | 0.477 | 0.653 | -12.735x + 5.090 |
| NDVI | Ref, 250 m | 0.229 | 0.856 | -7.843x + 4.862 |
| NDVI | Ref, 500 m | 0.4 | 0.756 | -12.57x + 5.7338 |
| GVF | ECMWF, 250 m | 0.378 | 0.732 | -0.0632x + 4.2119 |
| GVF | ECMWF, 500 m | 0.5 | 0.656 | -0.0855x + 4.7901 |
| GVF | Ref, 250 m | 0.342 | 0.797 | -0.0634x + 4.8494 |
| GVF | Ref, 500 m | 0.499 | 0.695 | -0.0903x + 5.5283 |
| GVF, Canopy stations | ECMWF, 250 m | 0.572 (0.314) | 0.697 (0.733) | -0.1074x + 5.5236 |
| GVF, Canopy stations | ECMWF, 500 m | 0.651 (0.438) | 0.63 (0.664) | -0.1275x + 5.9436 |
| GVF, Canopy stations | Ref, 250 m | 0.553 (0.37) | 0.768 (0.796) | -0.1138x + 6.3356 |
| GVF, Canopy stations | Ref, 500 m | 0.709 (0.6) | 0.619 (0.634) | -0.1434x + 6.9839 |

Including only canopy stations leaves 12 stations to be analysed, since several of the CPC network stations are located on roofs or balconies, and a few Wunderground stations are rooftop measurements as well. The regression using only canopy stations was performed for the GVF, since there was no significant difference in performance with the NDVI, and the GVF range is larger. Results are in the third section of Table 4.


Using canopy stations significantly improves the fraction of explained variance (R²), which increases by roughly 15 to 20 percentage points for all datasets compared to the test with all stations. The 500 metre radius performs the best, and the RMSE slightly decreases. However, the regression is heavily influenced by the outlier value of the Centrum station, with a low NDVI and high UHI. Removing this station from analysis strongly reduces the R² (Table 4, 3rd block, the values in brackets), with only the 500 metre Reference data remaining an improvement over the original analysis. The regression slope does not seem to be impacted by this outlier, only becoming slightly steeper (about 0.01 K extra). The slope of the regression becomes steeper when only including canopy stations, suggesting a stronger decrease of UHI with added vegetation (on the order of 1.2 K per 10% green cover!). This is nearly twice the value found by Steeneveld et al., so as final method we apply their method of estimating green area, using equation 4 (Methodology).

Green cover estimate

Analysis results of the regression of the green cover estimations against the 95 percentile UHI are given in Table 5 below. One radius is used instead of various different radii, and the regression is performed for all stations, and separately for the canopy-only stations.

Table 5: Regression statistics of the green cover estimation versus 95 percentile UHI. In the first column, ‘ECMWF’ indicates the UHI has been calculated using ECMWF model data; ‘Ref’ indicates UHI calculated from the Reference station measurements. The x in the regression equation refers to the GVF, in %.

Data used             R²      RMSE (°C)   Regression eqn.
ECMWF, all stations   0.651   0.557       -0.0683x + 4.097
Ref, all stations     0.398   0.604       -0.0442x + 5.752
ECMWF, canopy only    0.729   0.592       -0.0748x + 4.285
Ref, canopy only      0.771   0.562       -0.0794x + 5.096

The results from this test seem to be the most favourable, with the ECMWF data performing very well even when all stations are included: R² is above 0.65 in both cases. Removing outliers slightly improved the R² (though not enough to report on separately), so there seems to be a solid statistical relationship between this green cover estimate and the 95 percentile of UHI. When analysing all stations, the ECMWF data does better than the Reference station data, with R² values of 0.65 (ECMWF) against 0.40 (Reference). When only canopy stations are analysed, the Reference station data gains considerably in explained variance and performs better than the ECMWF (higher R² and lower RMSE). Regression plots of the canopy-only analysis are given in Figure 9. A notable difference between the ECMWF and Reference data is the regression line itself: in the all-station analysis the ECMWF line is steeper and starts at a lower intercept (slope of about 0.7 K per 10% green fraction, and an intercept of around 4 °C), whereas for the canopy-only stations the slopes are nearly equal and mainly the intercept differs. The intercept represents the UHI we would find at a (fictional) station with 0% green cover. This UHI is higher for the Reference data, which means that the UHI magnitude based on the Reference data is generally higher than that based on the ECMWF. This is consistent with our data: the air temperatures of the ECMWF are generally higher than the measurements at the Reference station.
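To make the slopes and intercepts of Table 5 easier to compare, the fitted lines can be evaluated directly. The sketch below only re-uses the coefficients printed in Table 5; the function names and the 30% example value are our own illustrative choices, not part of the original analysis.

```python
# Regression coefficients copied from Table 5:
# slope in K per % green cover, intercept in K.
TABLE5 = {
    "ECMWF, all stations": (-0.0683, 4.097),
    "Ref, all stations": (-0.0442, 5.752),
    "ECMWF, canopy only": (-0.0748, 4.285),
    "Ref, canopy only": (-0.0794, 5.096),
}


def predicted_uhi95(label, green_cover_pct):
    """Evaluate the fitted line UHI95 = slope * x + intercept."""
    slope, intercept = TABLE5[label]
    return slope * green_cover_pct + intercept


def reduction_per_10pct(label):
    """UHI95 reduction (K) for 10 percentage points of extra green cover."""
    slope, _ = TABLE5[label]
    return -10.0 * slope


for label in TABLE5:
    # 30 % is an arbitrary example value inside the observed range.
    print(f"{label}: {reduction_per_10pct(label):.2f} K per 10 %, "
          f"UHI95 at 30 % green cover = {predicted_uhi95(label, 30):.2f} K")
```

Evaluating the line at 0% green cover returns the intercept, i.e. the UHI of the fictional station without any vegetation discussed above.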

Figure 9: Regression of the green cover estimate versus the 95 percentile of UHI using only data from stations in the urban canopy for Rotterdam. A: Reference data; B: ECMWF data.

Given the favourable results of using the simple green cover estimate to construct a statistical relationship between vegetation fraction and UHI, we have applied this method to Madrid and Oslo. Unlike Rotterdam, the results for Madrid show no significant relationship. Both the ECMWF and the Barajas regression data show R² values of around 1%, and no model significance (i.e. a constant value predicts the UHI better than the regression). When only canopy stations are taken into the regression analysis (plots given in Figure 10) the results improve only very slightly, with R² increasing to 6.9% for Barajas, and 9.1% for the ECMWF point. The choice of ECMWF grid point does not change the outcome significantly: all 8 points give similar results (not shown), with R² ranging between 7 and 10%, and none of the models having any significant predictive value (p-values around 0.15 to 0.20). Performing the regression using the median UHI instead of the 95 percentile slightly increases the model reliability: R² values go up to around 12% and the p-values decrease slightly, but none of the ECMWF points or Barajas performs nearly well enough to establish a good statistical relationship.
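The regression statistics quoted here (slope, intercept, R², RMSE) follow from an ordinary least-squares fit. A minimal sketch in plain Python is given below; the station values are hypothetical and only illustrate the two situations described in the text, and the RMSE is assumed to be the root of the mean squared residual, which may differ from the convention of the statistics package used for the thesis.

```python
from math import sqrt


def linear_fit(x, y):
    """Ordinary least-squares fit y = slope * x + intercept.

    Returns (slope, intercept, r_squared, rmse).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1.0 - ss_res / ss_tot
    rmse = sqrt(ss_res / n)  # assumption: residuals averaged over n
    return slope, intercept, r_squared, rmse


# Hypothetical green cover (%) and 95 percentile UHI (K) per station.
green = [0, 10, 20, 30]
uhi_clear = [4.0, 3.0, 2.0, 1.0]     # UHI drops steadily with green cover
uhi_no_trend = [2.0, 3.0, 3.0, 2.0]  # no linear dependence on green cover

print(linear_fit(green, uhi_clear))
print(linear_fit(green, uhi_no_trend))
```

In the second case the fitted slope is zero and R² is zero: the mean UHI describes the data as well as the regression line does, which is the situation reported for Madrid.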

Our initial thought was that this could be due to the different type of vegetation in the city. To test this hypothesis we have repeated the regression using only the data from the growing season of vegetation in spring and early summer (March to June, MAMJ). In these months, vegetation should still be active and not have withered from water shortage and heat. However, the regression results do not improve: the resulting R² values are still lower than 10% for both ECMWF and Barajas, with similar p-values (far from significant). Repeating this analysis for the growing season using the median UHI rather than the 95 percentile provides a slight improvement, similar to the results of the median UHI for the all-year data. Ultimately, with the current data no statistical relationship can be established between the green vegetation cover and the UHI for Madrid.
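The two UHI statistics compared here, the median and the 95 percentile, can be illustrated with a short sketch. The UHI is taken as urban minus rural temperature at matching times; the percentile uses linear interpolation between ranked values, which is an assumption on our part since the exact percentile definition is not restated in this section, and the temperatures are hypothetical.

```python
def uhi_series(urban, rural):
    """UHI per time step: urban minus rural (reference) temperature."""
    return [u - r for u, r in zip(urban, rural)]


def percentile(values, q):
    """q-th percentile (0-100) with linear interpolation between ranks."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lower = int(pos)  # pos is non-negative, so int() floors it
    upper = min(lower + 1, len(ordered) - 1)
    frac = pos - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * frac


# Hypothetical night-time temperatures (degrees C) for one station.
urban = [12.0, 15.0, 9.5, 11.0, 14.0, 8.0]
rural = [10.0, 14.0, 9.0, 8.0, 13.5, 8.5]

uhi = uhi_series(urban, rural)
print("median UHI:", percentile(uhi, 50))
print("95 percentile UHI:", percentile(uhi, 95))
```

The 95 percentile emphasises the strongest heat island events, whereas the median describes typical conditions, which is why the two can lead to different regression results.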

Figure 10: Regression of 95 percentile UHI versus green cover, using only stations in the urban canopy. A: Barajas airport data, B: ECMWF data.

In order to discern some influence of the local background climate on the UHI in Madrid, we have performed a least significant difference (LSD) test on the stations categorized into their respective Local Climate Zones. Four LCZs have been included in the analysis, since they account for the majority of the stations: compact mid-rise (LCZ 2, 12 stations); compact low-rise (LCZ 3, 3 stations); open mid-rise (LCZ 5, 7 stations) and open low-rise (LCZ 6, 5 stations). Since each category needs at least 3 members in order to do the test successfully, stations not belonging to these 4 residential LCZs have been reclassified into the nearest LCZ. One station has been left out of this analysis, because it was closer to a rural zone (LCZ 9, sparsely built) than to an urban residential zone. The LSD test showed (full statistics not shown, means plot in Figure 11) that only LCZ 2 and 6 were significantly different from each other (at alpha = 0.05), though there is a difference of over 1 °C between the mid-rise (2 and 3) and the low-rise zones (5 and 6). The building height (mid-rise versus low-rise) is more determining of the maximum UHI intensity than the density of the buildings (compact versus open). Van Hove et al. (2015) also report a strong dependence of the UHI on the building height, stronger than on the vegetation fraction.
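As an illustration of the pairwise comparison, the sketch below assumes that the LSD test is Fisher's least significant difference: two group means differ significantly when their absolute difference exceeds t_crit * sqrt(MSE * (1/n_i + 1/n_j)), with MSE the pooled within-group variance. The UHI values are hypothetical, only two groups are shown, and the critical t-value is passed in by hand (2.776 for 4 error degrees of freedom at alpha = 0.05, two-sided) because the Python standard library has no t-distribution.

```python
from math import sqrt


def pooled_mse(groups):
    """Within-group mean square and its degrees of freedom (one-way layout)."""
    n_total = sum(len(values) for values in groups.values())
    ss_within = 0.0
    for values in groups.values():
        mean = sum(values) / len(values)
        ss_within += sum((v - mean) ** 2 for v in values)
    df_error = n_total - len(groups)
    return ss_within / df_error, df_error


def lsd_threshold(mse, n_i, n_j, t_crit):
    """Smallest difference in means that counts as significant."""
    return t_crit * sqrt(mse * (1.0 / n_i + 1.0 / n_j))


def significantly_different(groups, a, b, t_crit):
    """Compare the means of groups a and b against the LSD threshold."""
    mse, _ = pooled_mse(groups)
    mean_a = sum(groups[a]) / len(groups[a])
    mean_b = sum(groups[b]) / len(groups[b])
    threshold = lsd_threshold(mse, len(groups[a]), len(groups[b]), t_crit)
    return abs(mean_a - mean_b) > threshold


# Hypothetical 95 percentile UHI values (K) per station, grouped by LCZ.
groups = {
    "LCZ 2": [3.0, 3.2, 2.8],
    "LCZ 6": [1.9, 2.1, 2.0],
}
T_CRIT = 2.776  # t(0.975, df = 4); must match the error degrees of freedom

print(significantly_different(groups, "LCZ 2", "LCZ 6", T_CRIT))
```

With all four LCZs included, the error degrees of freedom and therefore the critical t-value change, so the threshold has to be recomputed for the full data set.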

Figure 11: Average 95 percentile UHI (°C) per LCZ, Madrid. 2 = Compact Mid-Rise; 3 = Compact Low-Rise; 5 = Open Mid-Rise; 6 = Open Low-Rise

For Oslo, the regression analysis using the 95 percentile UHI gives no significant results for any of the reference data sets: all 3 (Tryvannshøgda, Bjørnholt, ECMWF) show a scattered UHI with no real pattern (R² around 0.09, results not shown). When only the canopy stations are included the result does not improve (results for the ECMWF and Bjørnholt data shown in Figure 12). The regression line even trends slightly upwards, though when 2 outliers are removed the line becomes nearly flat, so we cannot draw any conclusions concerning this opposite behaviour. Plotting the same regression for the median UHI only deteriorates the relation further, with R² going down to 0.02, unlike in Madrid. Both the ECMWF and Bjørnholt datasets show the same upwards trending line, though the UHI magnitude of Bjørnholt is nearly 2 °C higher than the ECMWF-based UHI. The ECMWF data for Rotterdam and Madrid has shown a lower UHI magnitude as well, though on the order of 1 °C. The larger difference in this case could indicate that the Bjørnholt data is unusually cold. A brief analysis of the temperature values of Bjørnholt and Tryvannshøgda has shown that Bjørnholt is nearly 4 °C colder during the yearly minima, which occur in the winter months. A separate regression analysis disregarding the winter months does not improve results, however (R² = 0.07, results not plotted).

The upwards regression line seen in both plots of Figure 12 could partially be explained by the stations of the Norwegian weather service: they seem to give lower air temperatures than the Wunderground stations, most likely due to being well-ventilated and therefore less prone to heating up. The Alna station (identified as eKlima – Alna on the map in Appendix C), located near the Oslo railway station, is one of these, and has the lowest green cover and a correspondingly low UHI. Removing these stations from the regression does not improve the fit (since the weak pattern vanishes) but does make the line nearly flat again, suggesting no relation.

Figure 12: Regression of 95 percentile UHI versus green cover (%) for Oslo, only canopy stations (A: ECMWF, B: Bjørnholt). R² values were around 0.06 for both reference datasets, with no model significance.

Seasonality of the Urban Heat Island

In this section we plot the mean monthly UHI (at 0 UTC) against the rural background temperature. This method visualises the seasonality of the UHI, which can help explain the absence of a significant relationship between UHI and vegetation in Madrid and Oslo.

Rotterdam

Figure 13 shows the resulting hysteresis curve for Rotterdam. The differences between the ECMWF and the Reference station plots are not large: the ECMWF data seems to underestimate the mean UHI, giving negative values during the autumn and winter months (October until February) when the Reference data gives values close to 0. However, the pattern of both plots is the same, with the direction and relative intensity of the UHI following the same course. The direction of the hysteresis curve is clockwise, which is in agreement with the findings of Zhou et al. (2013), who find a clockwise direction for the majority of cities under research. The maximum magnitude of the hysteresis effect is about 1.1 °C for the Reference data and 0.6 °C for the ECMWF, due to the higher winter UHI intensity. This hysteresis effect means that for the same air temperature, differences in radiation uptake, wind speed (advection) or other meteorological circumstances cause a difference in mean UHI intensity at night. Autumn and spring have about the same rural air temperatures (as seen from the plots), and the difference in UHI is likely caused by radiation differences: there is more insolation during the spring months, which translates to more nocturnal heat release by the urban fabric.
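The direction of such a hysteresis loop can also be determined numerically rather than by eye. One possible approach, sketched below, is the signed area of the closed polygon formed by the twelve monthly points (background temperature on the x-axis, mean UHI on the y-axis, ordered from January to December): a negative area corresponds to a clockwise loop, a positive area to a counter-clockwise loop. The monthly values are hypothetical and the function names are ours.

```python
def signed_loop_area(temps, uhis):
    """Shoelace formula over the closed loop Jan -> ... -> Dec -> Jan."""
    n = len(temps)
    total = 0.0
    for i in range(n):
        j = (i + 1) % n  # wrap around from December to January
        total += temps[i] * uhis[j] - temps[j] * uhis[i]
    return 0.5 * total


def loop_direction(temps, uhis):
    """Classify the loop direction from the sign of the enclosed area."""
    area = signed_loop_area(temps, uhis)
    if area > 0:
        return "counter-clockwise"
    if area < 0:
        return "clockwise"
    return "undetermined"


# Hypothetical monthly means: background temperature (degrees C) and UHI (K),
# with a higher UHI in spring than in autumn at similar temperatures.
temps = [2, 3, 6, 9, 13, 16, 18, 18, 15, 11, 6, 3]
uhis = [0.2, 0.3, 0.6, 1.2, 1.1, 1.3, 1.5, 1.5, 1.0, 0.5, 0.2, 0.1]

print(loop_direction(temps, uhis), signed_loop_area(temps, uhis))
```

For a self-intersecting loop the areas of the two lobes partly cancel, so the sign then only indicates the dominant direction.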

Figure 13: Hysteresis curves of night-time background temperature (ECMWF and Reference based) against mean UHI for all Rotterdam stations. The numbers at the data points indicate the month (1 = January and so on). Standard deviations amount to 0.5 °C for UHI, and 2 °C for the background temperature.

From the hysteresis in Figure 13 we can observe that in July and August the UHI intensity is the largest, as we would expect from theory: the insolation is large, causing increased heat uptake by the urban fabric during the day and subsequent nocturnal release, increasing the UHI. Aside from summertime, there is a curious peak in UHI intensity in April, seen in both the ECMWF and the Reference plots. At this time of year the sun’s radiation is not very high yet, but starting to build, and air temperatures are generally low. A high UHI indicates either a high city temperature, which we deem unlikely due to the aforementioned weaker solar radiation input, or a low rural background temperature.

Figure 14 shows the diurnal cycle of the hourly average urban temperature over the period of interest, with time in UTC on the x-axis and the Day of the Year (DOY) on the y-axis. To compare spring and autumn we need to look at DOY 90 to 150 (roughly April/May) and DOY 275 to 330 (roughly October/November). The hysteresis plots of Figure 13 only show the night-time UHI at 0 UTC due to the limited ECMWF temporal resolution. Focussing on that time shows that May (DOY 120 to 150) is clearly warmer at night (typical temperature around 10 °C) than October or November (typical temperature around 7 to 8 °C), even though the rural background temperature is more or less equal.
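The correspondence between Day of the Year and calendar dates used above can be checked with a few lines of code. The sketch assumes a non-leap year (2011 is used as an example); in a leap year all dates after February shift by one day.

```python
from datetime import date, timedelta


def doy_to_date(year, doy):
    """Convert a 1-based Day of the Year to a calendar date."""
    return date(year, 1, 1) + timedelta(days=doy - 1)


# Boundaries of the spring and autumn windows mentioned in the text.
for doy in (90, 120, 150, 275, 330):
    print(doy, doy_to_date(2011, doy).isoformat())
```

For 2011 this gives 31 March to 30 May for DOY 90 to 150 and 2 October to 26 November for DOY 275 to 330, so the windows match April/May and October/November to within a few days.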

Figure 14: Contour plot of hourly averaged urban temperature (averaged over all Rotterdam urban stations) over the period 2010 to 2014

Madrid

To understand why we find a weak relationship between UHI and green fraction in Madrid we will look at the seasonal changes in UHI. The ECMWF-3 point (located to the North of the city, see the map in Appendix B) has been used in the hysteresis, though as an additional check we have constructed hysteresis plots of all 8 ECMWF points. These plots showed that differences in seasonality of UHI between the ECMWF grid points were negligible.

Figure 15 shows the resulting hysteresis loops. The plots show the opposite of the previous results for Rotterdam, i.e. the highest UHI intensity in winter and autumn, and the lowest (negative) intensity during summer. The ECMWF data gives higher UHI values than Barajas airport for the winter months, and the opposite is true for the summertime, when Barajas gives stronger negative UHI intensities. A second observation is that the direction of the hysteresis loops is counter-clockwise, as opposed to the results for Rotterdam. This indicates that spring has a lower UHI (or a more negative UHI in this case) than autumn for the same rural air temperature. The same effect is visible in the contour plot of the average urban temperature of Madrid, in Figure 16. Urban night-time temperatures are on the order of 17 °C in April and May, whereas October temperatures occasionally approach 20 °C, with the background temperature nearly equal for those two periods. Urban temperatures in October might be slightly higher because of lingering heat from summer.

In Figure 11 we saw a difference between the more open-spaced LCZ 6 and the compact LCZ 2, indicating that the UHI in Madrid does vary between neighbourhoods. To further quantify the differences in UHI within the city we have constructed 3 different hysteresis curves, based on green fraction. These are plotted in Figure 17, where the Low curve consists of 4 stations with a vegetation fraction below 6% (Las Mercedes, San Sebastian, Chamberi, Léganes); Moderate of stations between 10 and 15 % (Vallecas, Aravaca, Alameda, Alcala de Henares); High of stations with more than 25% vegetation fraction (Tres Cantos, Avenida de Illustracion, Ciemat, Ciemat Edificio). Notably, the stations with a low vegetation fraction have a higher UHI than those with a moderate or high vegetation fraction, though the shape of the hysteresis curves is nearly equal. The high January value for the High curve is caused by gaps in the dataset and would in reality probably be lower.

Figure 15: Hysteresis loops of mean monthly UHI (over all stations) against rural background temperature of Barajas data and ECMWF data (Madrid). The numbers at the markers indicate the month. Standard deviations for the UHI are on the order of 0.75 °C, for the background temperature this becomes 4 °C.

Figure 16: Contour plot of urban air temperature over the day (x-axis) and year (y-axis) for Madrid. Data from all urban stations over 4 years (2010-2013) has been used.

Figure 17: Hysteresis plot of stations with low (0-10%), moderate (15-20%) and high (25+%) vegetation fraction for Madrid. UHI is calculated from ECMWF data at 00 UTC. The numbers at the markers indicate the month. Standard deviations amount to around 0.7 °C for the UHI, and 4 °C for the background temperature.

Oslo

Figure 18 shows the hysteresis plots for Oslo, with curves from both the ECMWF data and the Bjørnholt reference data. The hysteresis effect is not as strong as it was for Rotterdam, and the maximum UHI seems to occur during the winter (ECMWF) or spring (Bjørnholt). Remarkably, the two data sources have opposite hysteresis directions. The ECMWF data shows a counter-clockwise direction, which contrasts with the findings of Zhou et al. (2013), who report that the majority of the city clusters in their research exhibit a clockwise hysteresis curve of UHI, while Bjørnholt goes clockwise. This large difference between the datasets shows that the choice of reference station is crucial in determining the UHI magnitude (further researched by Sakakibara and Owa, 2005), and conclusions should therefore be drawn with care. Interestingly, the hysteresis curve for Tryvannshøgda (Appendix F) shows a middle ground between the two: a bow-tie shape, going clockwise in the first half of the year and then turning counter-clockwise, similar to the shape we see for the summer months in the ECMWF hysteresis plot. For both the ECMWF (Figure 18) and Tryvannshøgda (Appendix F) the standard deviation of the UHI (on the order of 0.5 °C) is larger than the hysteresis effect, and what we see in the plots could therefore just as well be caused by statistical error rather than actual physical phenomena. The standard deviation in the background temperature is substantial as well, especially during the winter months, when temperatures can go down to -25 °C. This deviation is smaller in the summer months, which are less extreme.

For Bjørnholt this error is relatively smaller and the hysteresis shape can generally be trusted. Here we see the large April UHI, which we have also observed in Rotterdam, with the minimum UHI during autumn and a slight increase in winter again. The high winter UHI could be caused by Bjørnholt’s remote location, rather than by physical processes in the city itself. As seen in the results for the Oslo regression analysis (Figure 12), Bjørnholt’s air temperatures are nearly 4 °C colder than those of Tryvannshøgda, with the difference being especially prominent in wintertime. Anthropogenic heat production might also contribute to the higher winter UHI, though Oslo is a very open city without dense traffic or industry, so the only substantial source would then be domestic heating. The high variability between the reference datasets makes this difficult to research, however. The low summertime UHI intensity of the ECMWF data can be explained using the urban temperature plot in Figure 19. Night-time temperatures during July and August (approximately DOY 200 to 250) are around 15 °C, which is equal to the rural background temperature for those months in the ECMWF plot in Figure 18.

Figure 18: Hysteresis plots of median UHI against background temperature (A: ECMWF; B: Bjørnholt). UHI is calculated from ECMWF data at 00 UTC. The numbers at the markers indicate the month. Standard deviations amount to around 0.5 °C for the UHI, and 6 °C for the background temperature.

Figure 19: Contour plot of urban air temperature for Oslo over the day (x-axis) and year (y-axis). Data from all urban stations over 4 years (2010-2013) has been used.

Discussion

In this study we have looked at individual cities, comparing different neighbourhoods with different green fractions in order to find a universal relationship such as in Steeneveld et al. (2011). However, they look at various cities in one country, with one or more stations per city. In such a study it is easier to find cities with a varying green cover; the limited spread in green cover was the major issue in our study of Oslo and Madrid.

Neither city has a large spread in its green cover: Madrid is dense and built-up, with generally low green cover (80% of our stations had a green cover below 15%), whereas Oslo is a very open and green city, where only one station had a green cover below 20% and the majority lay in neighbourhoods with over 50% green cover. This limits the power of the statistics we apply, since we can only apply them to a small part of the spectrum of green covers. With more stations covering a wider portion of the green cover spectrum we would have obtained more robust results, regardless of the relationship found, on which a firm conclusion could be based.

The Wunderground stations tend to be clustered in some neighbourhoods, and absent from others. This causes a geographic bias in which neighbourhoods are represented on the Wunderground platform, and consequently in our analysis. However, over the past years the coverage of Wunderground stations has steadily increased, and cities that had few measurements several years ago (like Rotterdam) are now building up a dense network of hobbyist weather stations. In a few years’ time, repeating this study may yield much more valuable results, if this dense network remains in operation.

Supplementing the Wunderground data with data from a scientific network (like the CPC stations of Rotterdam) is strongly recommended, since a network with known setup, instruments and documentation is preferable to a network of hobbyist stations, no matter how thorough their owners might be. The amount of metadata on the Wunderground site is limited, and contacting every single owner of a weather station proves to be nigh impossible, since there is usually no contact information.

The ECMWF data has proven to be valuable for UHI research: validation for Rotterdam showed no significant differences with the measured reference data, though the resulting UHI values are always somewhat lower than the UHI calculated from measurements, which we also see in Oslo. Night-time temperatures of the ECMWF are somewhat higher than those measured in the field, due to difficulties in modelling the stable boundary layer. However, the limited temporal resolution of the ECMWF is a hindrance for detailed UHI research, since the diurnal cycle cannot be resolved at this timescale. Using the ECMWF data as a reference has proved to be an innovative way of calculating UHI, though there is still quite some difference with the measurements done at rural stations. This supports the point made by Sakakibara and Owa (2005) that the Urban Heat Island is in essence defined by the choice of the reference station. Perhaps it would be best to take a multitude of reference points and average over these, so that the UHI patterns are governed by the urban circumstances rather than by local oddities around the reference station (which we suspect may be the case for the Tryvannshøgda station in Oslo).
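The suggestion to average over several reference points could be implemented along the following lines. This is a minimal sketch under the assumption that all series are complete and aligned in time; handling of missing data, which is unavoidable with real station records, is left out. The temperatures and function names are hypothetical.

```python
def mean_reference(reference_series):
    """Average several rural reference series, time step by time step."""
    return [sum(values) / len(values) for values in zip(*reference_series)]


def uhi(urban, rural):
    """UHI per time step: urban minus rural temperature."""
    return [u - r for u, r in zip(urban, rural)]


# Hypothetical night-time temperatures (degrees C) at three time steps.
rural_a = [10.0, 12.0, 8.0]   # e.g. a cold, sheltered reference site
rural_b = [12.0, 14.0, 9.0]   # e.g. a model grid point
urban = [13.0, 14.0, 10.5]

combined = mean_reference([rural_a, rural_b])
print("UHI against site A:  ", uhi(urban, rural_a))
print("UHI against the mean:", uhi(urban, combined))
```

A single cold reference site inflates the UHI at every time step; averaging dampens the influence of any one site, at the cost of mixing sites with possibly different land cover.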

Rotterdam

For Rotterdam we found a curious peak of the UHI in April, next to the expected peak during the summer months. Oke (1982) reports that in temperate latitudes the highest UHI intensities are most often found during summer and autumn; Georgescu et al. (2012, in Zhou et al., 2013) also report summertime maxima and less intense UHI during spring and autumn for the USA; Kim and Baik (2002) conversely report the weakest UHI in summertime for Seoul, and a moderate UHI in spring. A study by Loridan and Grimmond (2012) researches the flux partitioning in the urban environment in relation to the active surface (vegetation and built-up area), which also shows some variability with the seasons. They find a minimum in the ratio of outgoing to incoming radiation during spring and summer, attributed to the increased dissipation of incoming radiation by turbulence and radiation storage. This elevated radiation uptake could partially explain the peak UHI in April, though we are cautious about drawing this conclusion without additional testing. Offerle et al. (2006) performed flux measurements in Lodz, Poland, which show that April and May have a significantly higher sensible heat flux during the day than October and November. This could indicate that the urban canopy layer warms up during the day and retains some of that heat during the night due to heating from the built environment, giving rise to high urban heat islands even in spring.

A second explanation might be that the rural air temperature is lagging slightly behind: i.e. cooling processes hamper the increase of rural air temperature over the month and over the day. This can be related to the vegetation: the Reference station of Rotterdam has a grass-cover all around it, which is active early in the year, compared to shrubs and trees, which start actively growing later in the year. The ECMWF land cover data shows that the used grid point (South) is dominantly grass-covered as well. Active vegetation has a higher latent heat flux (i.e. evaporates more strongly) than inactive or dead vegetation, which decreases the sensible heat flux if the net radiation is equal. This causes cooling of the surface layer and therefore colder rural air temperature compared to the urban environment.
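The reasoning in this paragraph rests on the surface energy balance. As a sketch, in notation that is common in urban climatology but is our own shorthand here (Q* net radiation, Q_H sensible heat flux, Q_E latent heat flux, Q_G ground heat flux; anthropogenic heat and advection are neglected for the rural grass site):

```latex
Q^{*} = Q_H + Q_E + Q_G
\quad\Longrightarrow\quad
Q_H = Q^{*} - Q_E - Q_G ,
\qquad
\Delta Q_H = -\,\Delta Q_E
\quad \text{for fixed } Q^{*} \text{ and } Q_G .
```

Under these assumptions, any extra evaporation by active grass is taken directly from the sensible heat flux, which is the flux that warms the near-surface air.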

A different explanation for cold rural temperatures could be the soil temperature. The soil, especially the deeper layers, shows a delayed response to air temperature, heating up very slowly due to its poor heat conductivity. The soil is therefore colder than the air above it, which cools down the surface layer in the rural area. In urban areas this effect will be absent, since there is very little bare soil. The ground in the urban environment is instead mostly covered by asphalt and other building material, which has a high heat conductivity (and is indeed a major source of the nocturnal heat island). This means that the soil temperature does not have a dampening influence on the air temperature in the urban environment, which contributes to an urban heat island. We have attempted an analysis of the ECMWF soil temperature data, but this variable is only provided for a depth of 8 centimetres, and is nearly linearly coupled to the air temperature. Analysis of the soil temperature in deeper layers could provide insight into the early peaking of the UHI.

The ECMWF model data has shown good performance as an estimate of the rural temperatures. For all cities we find that the UHI calculated from ECMWF is on the order of 1 °C lower than the UHI calculated from measurements, which indicates that the ECMWF air temperature is higher. This seems to be consistent with the warm bias at night found in many climate and mesoscale models (e.g. Holtslag et al., 2013): resolving the stable boundary layer is still a large issue in numerical weather prediction, and can cause overestimation of night-time air temperatures, resulting in a lower UHI.

Madrid

The absence of a significant statistical relationship between UHI and vegetation fraction in Madrid could be caused by the difference in vegetation between Spain and e.g. the Netherlands. From the Google Earth™ imagery used, the streets and gardens of Madrid appear to contain more trees and shrubs than Rotterdam, which is mainly grass-covered. The effect of trees on the energy balance is more complex than the effect of grass (whose main influence is through the latent heat flux), due to the added effects of roughness and shading (Van Heerwaarden and Teuling, 2014). Lin and Lin (2010) find that shade trees decrease local air and soil temperatures during the day, while Spronken-Smith and Oke (1999) mention that the reduction of the sky-view factor by trees decreases outgoing longwave radiation, enhancing the nocturnal air temperature. Zhou et al. (2013) have looked at Madrid specifically in their analysis, and they find the negative UHI intensity during the summer as well, with typical values between -1 and -2 °C. Our results give summertime UHI intensities on the order of -0.5 °C, though our analysis is based on air temperatures, whereas Zhou et al. (2013) considered large-scale surface temperatures measured from satellites. Zhou et al. (2013) also find a different direction of their hysteresis curve, which moves clockwise with the peak UHI located in spring (though small, on the order of 0.5 °C) and a wintertime UHI close to 0 °C. Their hypothesis for the hysteresis curve is related to different phenological phases in urban and rural areas, meaning that the onset of the growing season might differ between these areas, similar to our hypothesis for Rotterdam.

A second hypothesis for these negative UHI intensities in summer is the anthropogenic water flux. Madrid is situated in a dry climate, meaning that the rural surroundings are not irrigated naturally. Therefore, the rural vegetation is no longer active in summertime to decrease rural temperatures, whereas in the city itself gardens and parks will be watered by the inhabitants and the municipality. The irrigation in the city during the dry summer months causes an influx of water in the urban environment that is absent from the rural environment (Oke, 1982). The potential evaporation is very high during the hot summer months, so any additional water will be evaporated, shifting the flux partitioning towards an elevated latent heat flux. Results from Oke (1982) even suggest that the potential evaporation rate of an irrigated lawn is exceeded by nearly 30% during hot days, due to advection of warm dry air. Moreover, in southern European countries it is common to clean the streets with water, which provides another anthropogenic water flux into the urban system (Grimmond et al., 2010). This increased latent heat flux compared to the dry rural area causes city temperatures to be lower, resulting in a negative UHI.

Thirdly, we have briefly mentioned that Madrid is a very dense city, with the majority of the neighbourhoods classifying as Compact Mid-Rise (LCZ 2), indicating buildings over 5 storeys tall. This limits the amount of radiation going in and out, which might also go some way to explaining the negative UHI. Theeuwes et al. (2014) find that the aspect ratio (building height divided by street width) influences the maximum UHI through two counteracting processes: longwave radiation trapping (increasing urban temperature) and shadowing (decreasing urban temperature). The net effect depends on the exact value of the aspect ratio: the UHI does not increase linearly with the aspect ratio. Above an aspect ratio of 1 the maximum UHI stops increasing and even decreases slightly due to the stronger shadowing effect. Though we do not have actual data on building height and street width in Madrid, the paper by Stewart and Oke (2012) provides ranges of aspect ratio values belonging to LCZ classes. LCZ 2 is characterised by an aspect ratio of 0.75 to 2 (Stewart and Oke, 2012, table 3). This relatively high aspect ratio could contribute to the lower UHI magnitude.
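As a small worked example of the aspect ratio argument: the sketch below computes the ratio for a hypothetical street canyon and checks it against the LCZ 2 range quoted above and against the threshold of 1 above which the maximum UHI no longer increases. The building height and street width are illustrative values, not measurements from Madrid.

```python
# Aspect ratio range for LCZ 2 as quoted in the text (Stewart and Oke, 2012).
LCZ2_ASPECT_RANGE = (0.75, 2.0)
# Aspect ratio above which the maximum UHI stops increasing, as stated in the text.
UHI_PLATEAU_RATIO = 1.0


def aspect_ratio(building_height_m, street_width_m):
    """Building height divided by street width."""
    if street_width_m <= 0:
        raise ValueError("street width must be positive")
    return building_height_m / street_width_m


def in_lcz2_range(ratio):
    """True if the ratio falls inside the quoted LCZ 2 range."""
    low, high = LCZ2_ASPECT_RANGE
    return low <= ratio <= high


# Hypothetical canyon: six storeys of about 3 m each along a 12 m wide street.
ratio = aspect_ratio(18.0, 12.0)
print(ratio, in_lcz2_range(ratio), ratio > UHI_PLATEAU_RATIO)
```

A canyon like this one lies within the LCZ 2 range and above the threshold of 1, i.e. in the regime where shadowing starts to offset longwave trapping.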

Oslo

The UHI seasonality of Oslo is highly dependent on which reference dataset is used: each of the 3 datasets gives a markedly different distribution of the UHI throughout the year. In general, the UHI is high during winter and spring, and low during the summer months. The winter UHI can be caused by anthropogenic heat release, which governs the UHI at higher latitudes due to the limited radiation influx (Oke, 1982).

Suomi and Käyhkö (2012) have researched the influence of environmental factors on the UHI in the city of Turku, south-west Finland. This city is located near the coast, and thus experiences a similar water influence as our case of Oslo. They find that measurement locations near the water (buffer of 2 km) are clearly influenced: a warming effect in autumn and winter, and a cooling effect in spring (Suomi and Käyhkö, 2012). The warming effect of water during wintertime could be an explanation for the relatively high winter UHI we observe in the seasonal trends of the UHI. This would however not explain the high values in spring seen in the Bjørnholt dataset (Figure 18). Given that we do not see a similar structure in the ECMWF UHI values, it is likely caused by the temperature variation of the Bjørnholt station itself. The reference station is situated near a lake, on a grassy field surrounded by trees. Using data from a reference station situated in a uniform area free from possible contamination by water and forest (through turbulence) could improve the reliability of the UHI found for Oslo.

The strong influence of the water could explain why no significant relationship is found between UHI and vegetation in Oslo. Suomi and Käyhkö (2012) do find a significant influence of the urban land use on temperature differences, but they do not differentiate between green cover fractions, researching differences between LCZ-like classes instead. Due to the limited spread of LCZs in our data (the majority is LCZ 6, open low-rise) we have opted not to do this analysis for Oslo, since it would not have enough statistical power to draw conclusions from. For the city of Gothenburg, Sweden, no significant cooling effect of urban trees could be found (Konarska et al., 2015), suggesting that vegetation might not play a large role at higher latitudes.


Conclusions

At the start of this study we set out to answer the following research questions:

- By how much does urban vegetation fraction reduce the Urban Heat Island in climate zones across Europe?
- How do local and regional climate differences influence the relationship between urban vegetation fraction and UHI reduction?

We have found that vegetation cover does not seem to play a determining role in mediating the maximum Urban Heat Island in cities across Europe. Though a good relation was found for Rotterdam (10% additional vegetation decreases the maximum UHI by 0.5 to 0.7 °C), we were not able to establish a similar relationship for the cities of Madrid and Oslo. In Madrid, the water available for irrigation in the city causes what seems to be an Urban Cool Island compared to the dry rural surroundings. In Oslo, the influence of the Oslo-fjord might be the dominant driver of temperature changes in the city, as there was no visible pattern related to the green fraction of the city. This is consistent with a study of the coastal city of Turku, Finland, by Suomi and Käykhö (2012). The already low temperatures in Oslo and its surroundings are not greatly influenced by enhanced evaporation from vegetation, which places the cause of UHI variability elsewhere.
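As a rough worked example of what the Rotterdam relation implies, the reported range can be written as a linear sensitivity. This is only a sketch: it assumes that "10% additional vegetation" refers to 10 percentage points of green fraction and that the relation may be applied linearly, and the symbols s and Δf are introduced here purely for illustration.

```latex
\Delta \mathrm{UHI}_{\max} \approx -s\,\Delta f,
\qquad s \in [0.05,\ 0.07]\ ^{\circ}\mathrm{C}\ \text{per percentage point of green fraction}

% Example: \Delta f = 15 percentage points
\Delta \mathrm{UHI}_{\max} \approx -(0.05 \text{ to } 0.07) \times 15 = -0.75 \text{ to } -1.05\ ^{\circ}\mathrm{C}
```

Under these assumptions, 15 percentage points of additional green cover would lower the maximum UHI by roughly 0.75 to 1.05 °C. The estimate applies to Rotterdam only, since no comparable relation was found for Madrid or Oslo.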

This answers our last research question: the regional climate appears to be the driving factor in determining the role of the vegetation. In climates with a low or negative precipitation surplus, vegetation does not evaporate at its potential level but is limited by the input of water. In wetter climates such as Rotterdam and Oslo, the input of radiation would be the limiting factor (disregarding other limiting factors such as nutrients). In colder climates, additional evaporation by urban vegetation does not seem to influence the urban temperatures any further, and the magnitude of the UHI is generally smaller.

Outlook and recommendations

For further research we urge the development of a consistent way of determining the urban green fraction. Currently a plethora of methods is in use (of which we have explored but a few), some using NDVI and some using innovative green estimates, which makes it very difficult to compare results with those found by others. A consistent, preferably easily performed method of calculating the urban green fraction could be a big step towards standardizing research of the urban climate. Stewart and Oke (2012) have already made a step in this direction by introducing the Local Climate Zone framework, but they do not yet mention how the various surface fractions are to be determined.

Related to that point would be a standardized way of determining and calculating the UHI. This concerns the reference station, which strongly determines the magnitude and diurnal cycle of the UHI, but also the UHI equation itself. The UHI can be calculated with minimum temperature, maximum temperature, hourly averages or on-the-hour values, and each method will give a slightly different UHI. The UHI is thereby not a physical property of the urban area, but more of a mathematical construction which depends too much on the choices made. Creating a type of UHI for thermal comfort might also be an interesting venture: rather than expressing the urban-rural difference in air temperature, it would express how much more heat (or cold) stress the urban area gives compared to the rural surroundings.
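To illustrate how much the outcome depends on the chosen definition, the sketch below computes four UHI variants from the same pair of temperature series. It is a minimal example and not the method used in this study: the function name `uhi_definitions`, the dictionary keys and the temperature values are all hypothetical and chosen only for illustration.

```python
from statistics import mean


def uhi_definitions(urban, rural):
    """Return several UHI estimates for one day of paired temperatures.

    urban, rural: equally long sequences of air temperatures sampled at
    the same times (same unit, e.g. degrees Celsius).
    """
    if len(urban) != len(rural) or not urban:
        raise ValueError("need two non-empty series of equal length")
    # Urban minus rural temperature at each sampling time.
    paired_diff = [u - r for u, r in zip(urban, rural)]
    return {
        "tmin_difference": min(urban) - min(rural),  # difference of daily minima
        "tmax_difference": max(urban) - max(rural),  # difference of daily maxima
        "max_paired_difference": max(paired_diff),   # largest simultaneous difference
        "mean_paired_difference": mean(paired_diff), # average simultaneous difference
    }


# Hypothetical, made-up temperatures (degrees Celsius), for illustration only.
rural = [10.0, 9.0, 8.0, 8.0, 9.0, 12.0, 16.0, 19.0, 20.0, 18.0, 14.0, 11.0]
urban = [12.0, 11.5, 11.0, 10.5, 10.5, 12.5, 16.0, 19.5, 20.5, 19.5, 16.5, 13.5]

result = uhi_definitions(urban, rural)
for name, value in result.items():
    print(f"{name}: {value:.2f}")
```

With these made-up series the four definitions give clearly different values for the same day, which is the point made above: the UHI magnitude is partly a product of the definition chosen.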

Specifically related to this study, we recommend researching several cities per climate zone or country, rather than one. We have now found distinctly different results for different climate zones, but it would be very interesting to see how much the relation to green fraction varies within these climate zones. The results for Madrid point to the important effect of the anthropogenic water flux on urban temperatures in drier climates, and the effect of the Oslo-fjord on the urban air temperature may also be significant. Including water effects, in relation to vegetation, would improve future results.

We have obtained satisfactory results using model data as an estimate of the rural temperature, though validation against actual measurements will always be necessary. Using models of higher resolution, e.g. mesoscale models such as WRF, could provide an improvement over the large-scale model used in this study. In the GABLS model inter-comparison study it has been noted that models with a complex land-surface coupling tend to reproduce night-time stable boundary layer temperatures better (Holtslag et al., 2013). Using these models could be a valuable follow-up to this study.

Acknowledgments

We thank all the weather hobbyists uploading their personal weather station data to the Wunderground platform for providing such a versatile and extensive source of urban meteorological data; Jan Elbers for maintaining the Rotterdam CPC data network and providing information about these data; Bert van Hove for sharing methods and suggestions about the Rotterdam data analysis; Gert-Jan Steeneveld for supplying the topic and extra information on the vegetation fraction calculation; and the Norwegian Weather Service for supplying the data for Oslo.

Furthermore, my personal thanks to my supervisors, Natalie and Reinder, who have helped me greatly over the past months to shape this thesis, provided useful ideas and feedback whenever I needed advice, and did not object to all the other activities I explored during these months. Lastly, I want to thank my friends, with whom I have shared many thesis struggles, wacky ideas and even more lunches.


References

Bell, P. A. (1981). Physiological, Comfort, Performance, and Social Effects of Heat Stress. Journal of Social Issues, 37, 71–94, doi: 10.1111/j.1540-4560.1981.tb01058.x

Blazejczyk, K., Epstein, Y., Jendritzky, G., Staiger, H., and Tinz, B. (2012). Comparison of UTCI to selected thermal indices. International Journal of Biometeorology, 56, 515–535

ECMWF (2012). IFS documentation - Cy37r2 Operational implementation 18 May 2011: Part IV: physical processes. http://old.ecmwf.int/research/ifsdocs/CY37R2/IFSPart4.pdf (retrieved on 25-5-2015)

Georgescu, M., Mahalov, A. and Mousaoui, M. (2012). Seasonal hydro-climatic impacts of Sun Corridor expansion. Environmental Res. Letters, 7 (3), 034026, doi: 10.1088/1748-9326/7/3/034026

Gutman, G. and Ignatov, A. (1998). The derivation of the green vegetation fraction from NOAA/AVHRR data for use in numerical weather prediction models. Int. J. Remote Sensing, 19 (8), 1533-1543

Holtslag, A. A. M., G. Svensson, P. Baas, S. Basu, B. Beare, A. C. M. Beljaars, F. C. Bosveld, J. Cuxart, J. Lindvall, G. J. Steeneveld, M. Tjernström and B. J. H. Van de Wiel (2013). Stable Atmospheric Boundary Layers and Diurnal Cycles: Challenges for Weather and Climate Models. Bull. Amer. Meteor. Soc., 94, 1691–1706. doi: 10.1175/BAMS-D-11-00187.1

Heusinkveld, B. G., G. J. Steeneveld, L. W. A. van Hove, C. M. J. Jacobs and A. A. M. Holtslag (2014). Spatial variability of the Rotterdam urban heat island as influenced by urban land use. J. Geophys. Res. Atmos., 119, 677-692, doi: 10.1002/2012JD019399

Kim, Y. and J. Baik (2002). Maximum urban heat island intensity in Seoul. J. Appl. Meteor., 41, 651–659.

Konarska, J., J. Uddling, B. Holmer, M. Lutz, F. Lindberg, H. Pleijel and S. Thorsson (2015). Transpiration of urban trees and its cooling effect in a high latitude city. Int. J. Biometeorol., doi: 10.1007/s00484-015-1014-x

Kovats, R. S. and S. Hajat (2008). Heat stress and public health: a critical review. Annual. Rev. Public Health, 29, 41–55, doi: 10.1146/annurev.publhealth.29.020907.090843

Lin, B. S. and Y. J. Lin (2010). Cooling effect of shade trees with different characteristics in a subtropical urban park. HortScience, 45, 83-86.

McCarthy, M. P., M. J. Best, and R. A. Betts (2010). Climate change in cities due to global warming and urban effects. Geophys. Res. Lett., 37, doi: 10.1029/2010GL042845.

Offerle, B., Grimmond, C.S.B., Fortuniak, K., Klysik, K. and Oke, T.R. (2006). Temporal variations in heat fluxes over a central European city centre. Theor. Appl. Climatol., 84, 103–115, doi: 10.1007/s00704-005-0148-x

Oke, T. R. (1982). The energetic basis of the urban heat island. Quart. J. R. Met. Soc., 108, 1-24.

Oke, T. R. (2006). Initial guidance to obtaining representative meteorological observations at urban sites. World Meteorological Organisation, Instruments and Observing methods, Report no. 81

Petralli, M., L. Massetti, G. Brandani and S. Orlandini (2014). Urban planning indicators: useful tools to measure the effect of urbanization and vegetation on summer air temperatures. Int. J. Climatol., 34, 1236–1244, doi: 10.1002/joc.3760


Priestley, C. H. B. and R. J. Taylor (1972). On the assessment of surface heat flux and evaporation using large-scale parameters. Mo. Weather Rev., 100, 81-92

Poumadère, M., C. Mays, S. Le Mer, and R. Blong (2005). The 2003 Heat Wave in France: Dangerous Climate Change Here and Now. Risk Analysis, 25, 1483–1494. doi: 10.1111/j.1539-6924.2005.00694.x

Reid, C. E., M. S. O'Neill, C. J. Gronlund, S. J. Brines, D. G. Brown, A. V. Diez-Roux, and J. Schwartz (2009). Mapping community determinants of heat vulnerability. Environ. Health Perspect., 117 (11), 1730-1736.

Sakakibara, Y and K. Owa (2005). Urban-rural temperature differences in coastal cities: influence of rural sites. Int. J. Climatol., 25, 811-820, doi: 10.1002/joc.1180

Sarlikioti, V., Meinen, E. and Marcelis, L. F. M. (2011). Crop reflectance as a tool for the online monitoring of LAI and PAR interception in two different greenhouse crops. Biosys. Engineering, 118, 114-120.

Spronken-Smith, R. A. and T. R. Oke (1999). Scale modelling of nocturnal cooling in urban parks. Boundary-layer meteorology, 93 (2), 287-312. doi: 10.1023/A:1002001408973

Steeneveld, G. J., S. Koopmans, B. G. Heusinkveld, L. W. A. Van Hove and A. A. M. Holtslag (2011). Quantifying urban heat island effects and human comfort of variable size and urban morphology in the Netherlands. J. Geophys. Res., 116, doi: 10.1029/2011JD015988

Stewart, I. D. (2011). A systematic review and scientific critique of methodology in modern urban heat island literature. Int. J. Climatol., 31, 200-217. doi: 10.1002/joc.2141

Stewart, I. D. and T. R. Oke (2012). Local climate zones for urban temperature studies. Bull. Amer. Met. Soc., 93, 1879-1900, doi: 10.1175/BAMS-D-11-00019.1

Suomi, J. and J. Käykhö (2012). The impact of environmental factors on urban temperature variability in the coastal city of Turku, SW Finland. Int. J. Climatol., 32, 451-463, doi: 10.1002/joc.2277

Tan, J., Y. Zheng, X. Tang, C. Guo, L. Li, G. Song, X. Zhen, D. Yuan, A. J. Kalkstein, F. Li and H. Chen (2010). The urban heat island and its impact on heat waves and human health in Shanghai. Int. J. Biometeorol., 54, 75-84. doi: 10.1007/s00484-009-0256-x

Theeuwes, N.E., Steeneveld, G.J., Ronda, R.J., Heusinkveld, B.G., Van Hove, L.W.A. and Holtslag, A.A.M. (2014). Seasonal dependence of the urban heat island on the street canyon aspect ratio. Q. J. R. Meteorol. Soc. 140: 2197–2210

United Nations, department of social and economic affairs (2012). World urbanization prospects: the 2011 revision: highlights. http://esa.un.org/unup/pdf/WUP2011_Highlights.pdf (retrieved on 17-9-2014)

Van Hove, L.W.A., Jacobs, C.M.J., Heusinkveld, B.G., Elbers, J.A., Van Driel, B.L., Holtslag, A.A.M. (2015). Temporal and spatial variability of urban heat island and thermal comfort within the Rotterdam agglomeration. Building and Environment, 83, 91-103. doi: 10.1016/j.buildenv.2014.08.029

Van Heerwaarden, C. C. and Teuling, A. J. (2014). Disentangling the response of forest and grassland energy exchange to heatwaves under idealized land–atmosphere coupling. Biogeosciences, 11, 6159-6171. doi: 10.5194/bg-11-6159-2014

Wallace, J.M. and Hobbs, P.V. (2006). Atmospheric Science: an introductory survey, 2nd edition. Academic Press, 504 pp.


Weng, Q., D. Lu and J. Schrubring (2004). Estimation of land surface temperature-vegetation abundance relationship for urban heat island studies. Remote Sensing of Environment, 89, 467-483. doi: 10.1016/j.rse.2003.11.005

Zhou, B., Rybski, D. and Kropp, J.P. (2013). On the statistics of urban heat island intensity. Geophys. Res. Letters, 40, 5486-5491. doi: 10.1002/2013GL057320


Appendices

Appendix A: table and map of the Rotterdam stations

| Code/name | Begin | End | Lat | Lon | LCZ | Hardware |
|---|---|---|---|---|---|---|
| Kralingen IZUIDHOL158 | Dec 2013 | Running | 51.921 | 4.518 | 6 | Unknown |
| Kralingseveer IZUIDHOL136 | Jan 2013 | Running | 51.974 | 4.623 | 6 | Davis Vantage pro 2 |
| Beverwaard IZUIDHOL87 | Mar 2010 | Running | 51.895 | 4.568 | 2 | Davis Vantage pro 2 |
| Steins Weer Site IZHROTTE3 | Aug 2012 | Aug 2014 | 51.941 | 4.546 | 5 | Unknown |
| Capelle IZUIDHOL36 | 2008 | Running | 51.950 | 4.571 | 6 | Davis Vantage pro 2 plus |
| Vlaardingen IZHVLAAR1 | Mar 2013 | Running | 51.914 | 4.338 | 3/6 | Davis Vantage pro 2 plus |
| IJsselmonde IZUIDHOL3 | 2009 | May 2013 | 51.882 | 4.526 | 2/3 | Ultimeter 2100 |
| Rijnmond IZUID-HO16 | 2009 | Feb 2011 | 51.876 | 4.467 | 2 | Davis Vantage pro 2 |
| CPC stations | | | | | | |
| Bernisse | 2010 | Running | 51.821 | 4.251 | 3 | Campbell Scientific |
| Bolnes | 2010 | Running | 51.898 | 4.552 | 3 | Campbell Scientific |
| Capelle | 2011? | Running | 51.927 | 4.585 | 3 | Campbell Scientific |
| Centrum | 2009 | Running | 51.922 | 4.468 | 1/2 | Campbell Scientific |
| Hoogvliet | 2010 | Running | 51.861 | 4.374 | 3 | Campbell Scientific |
| Lansingerland | 2010 | Running | 52.010 | 4.529 | 3 | Campbell Scientific |
| Ommoord | 2010 | Running | 51.958 | 4.547 | 3 | Campbell Scientific |
| Oost | 2009 | Running | 51.925 | 4.548 | 6 | Campbell Scientific |
| Ridderkerk | 2010 | Running | 51.878 | 4.585 | 9 | Campbell Scientific |
| Rijnhaven | 2010 | Running | 51.906 | 4.493 | 4 (on a pier) | Campbell Scientific |
| Spaansepolder | 2010 | Running | 51.933 | 4.415 | 8 | Campbell Scientific |
| Zuid | 2009 | Running | 51.887 | 4.488 | 2 | Campbell Scientific |
| Reference | 2009 | Running | 51.986 | 4.436 | D | Campbell Scientific |
| Zestienhoven (WMO) | - | Running | 51.959 | 4.442 | D | |

| CPC station | NDVI (250 m) | Green cover (250 m, %, from Van Hove et al., 2015) |
|---|---|---|
| Bernisse | 0.1472 | 32 |
| Bolnes | 0.2051 | 25 |
| Capelle | 0.1807 | 13 |
| Rijnhaven | 0.0487 | 4 |
| Spaansepolder | 0.0649 | 4 |
| Hoogvliet | 0.1956 | 32 |
| Lansingerland | 0.1672 | 17 |
| Ommoord | 0.2009 | 23 |
| Ridderkerk | 0.2454 | 64 |
| Vlaardingen | 0.2305 | 36 |
| Centrum | 0.0652 | 0 |
| Oost | 0.0907 | 4 |
| Zuid | 0.1984 | 20 |
| Reference | 0.4021 | 99 |

Data for the CPC stations come from Van Hove et al. (2015), their Table 1. All CPC stations have the same measurement setup; for the exact instruments we refer to Van Hove et al. (2015). Specifics for the WMO station Zestienhoven can be found at the website of the Royal Netherlands Meteorological Institute (KNMI): http://www.knmi.nl
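As a quick consistency check on the two vegetation measures listed for the CPC stations, the sketch below computes the Pearson correlation between the NDVI and green cover columns. The values are copied from the table above; the helper `pearson_r` and the variable names are introduced here for illustration and are not part of the original analysis.

```python
from math import sqrt

# (NDVI at 250 m, green cover at 250 m in %) per CPC station, from the table above.
stations = {
    "Bernisse": (0.1472, 32),
    "Bolnes": (0.2051, 25),
    "Capelle": (0.1807, 13),
    "Rijnhaven": (0.0487, 4),
    "Spaansepolder": (0.0649, 4),
    "Hoogvliet": (0.1956, 32),
    "Lansingerland": (0.1672, 17),
    "Ommoord": (0.2009, 23),
    "Ridderkerk": (0.2454, 64),
    "Vlaardingen": (0.2305, 36),
    "Centrum": (0.0652, 0),
    "Oost": (0.0907, 4),
    "Zuid": (0.1984, 20),
    "Reference": (0.4021, 99),
}


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


ndvi = [values[0] for values in stations.values()]
green_cover = [values[1] for values in stations.values()]
r = pearson_r(ndvi, green_cover)
print(f"Pearson r between NDVI and green cover: {r:.2f}")
```

Note that the rural Reference station lies far from the other points in both variables and therefore weighs heavily on the coefficient; repeating the calculation without it is a simple way to test how robust the agreement is.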


Map of the Rotterdam Wunderground stations (white marks), CPC stations (yellow marks) and ECMWF points (purple marks)


Appendix B: table and map of the Madrid stations

| Code/name | Begin | End | Lat | Lon | LCZ | Elevation | Hardware |
|---|---|---|---|---|---|---|---|
| IMADRIDM2 Vallecas | July 2009 | Running | 40.388 | -3.661 | 2 | 659 m | VM918 |
| IMADRIDM24 Jesus M Canillas | Jan 2010 | 14 Jun 2012 | 40.450 | -3.683 | 3 | 720 m | Davis Vantage Pro 2 |
| IMADRIDM36 Fermin Caballero | July 2010 | Running | 40.481 | -3.716 | 2 | 709 m | Davis Vantage Vue |
| IMADRIDM41 Aravaca | Jan 2011 | Running | 40.445 | -3.786 | 5/6 | 672 m | Davis Vantage Vue |
| IMADRIDM42 Ciemat edif22 | Feb 2011 | Running | 40.455 | -3.724 | 9 | 669 m | Vaisala WXT520 |
| IMADRIDM43 Ciemat edificio42 | Mar 2011 | Jun 2014 | 40.456 | -3.730 | 9 | 651 m | Geonica Meteodata 3000C |
| IMADRIDM44 Guindalera | 12 Jul 2011 | 1 Jun 2014 | 40.435 | -3.663 | 2 | 685 m | Unknown |
| IMADRIDM45 Villaverde | Oct 2011 | Running | 40.359 | -3.694 | 2/5 | 623 m | Davis Vantage Vue |
| IMADRIDM46 Centro de Investigación | 2012 | Running | 40.389 | -3.630 | 5/6 | 655 m | Davis Vantage Pro 2 plus |
| IMADRIDM49 Moratalaz | 19 Feb 2012 | 8 Dec 2014 | 40.407 | -3.653 | 2 | 667 m | WH2081 |
| IMADRIDM51 Valdebernardo | Oct 2012 | Running | 40.398 | -3.622 | 2 | 691 m | PCE-FWS 20 |
| IMADRIDM53 Campus UAM-CSIC | 1 Feb 2013 | 23 Sep 2014 | 40.543 | -3.688 | 5/9? | 709 m | Davis Vantage Pro 2 |
| ICOMINID35 Avenida de la Ilustracion | 2008 | Running | 40.479 | -3.718 | 1/2 | 685 m | Davis Vantage Pro2 plus |
| ICOMUNID56 Chamberi | 2009 | Running | 40.432 | -3.697 | 2 | 697 m | Davis Vantage Pro2 |
| ICOMUNID82 Tres Cantos | Oct 2010 | Dec 2013 | 40.606 | -3.718 | 5 | 749 m | Davis Vantage Pro |
| ICOMUNID88 Parque de las Avenidas | Nov 2010 | Running | 40.438 | -3.663 | 2 | 690 m | Davis Vantage Vue |
| ICOMUNID90 Pozuelo de Alarcon | Jan 2011 | 2013; gaps in 2014 | 40.452 | -3.801 | 3/6 | 657 m | Davis Vantage Vue |
| ICOMUNID91 Leganés | Feb 2011 | Running | 40.333 | -3.767 | 5 | 670 m | Young (?) |
| ICOMUNID96 Alameda de Osuna | Apr 2011 | Apr 2013 | 40.460 | -3.587 | 3/6 | 646 m | Davis Vantage Vue |
| ICOMUNID99 San Sebastian de los Reyes | Apr 2011 | Aug 2013 | 40.564 | -3.611 | 2 | 667 m | Watson W-8681 |
| Icomunid103 Alcala de Hernares | June 2011 | Running | 40.501 | -3.368 | 2/3 | 613 m | Oregon Scientific WMR200 |
| ICOMUNID113 Las Mercedes | Oct 2011 | Running | 40.444 | -3.586 | 2/8 | 620 m | Oregon Scientific WMR928N |
| ICOMUNID115 Daganzo de Arriba | Oct 2011 | Running | 40.548 | -3.462 | 6 | 679 m | PCE FWS-20 |
| Icomunid134 Arganda del Rey | Dec 2012 | Running | 40.313 | -3.478 | 3 | 545 m | PCE FWS-20 |
| ICOMUNID144 Plaza de castilla | April 2013 | Running | 40.469 | -3.689 | 2/1 | 750 m | Davis Vantage Vue |
| IMADRIDG2 Getafe | 2009 | Running | 40.317 | -3.733 | 5 | 646 m | Davis Vantage Pro2 |
| IMADRIDA10 Alcorcon S.J. Valderas | April 2012 | Running | 40.355 | -3.815 | 2 | 694 m | Davis Vantage Vue |
| IMADRIDA7 Alcorcon | Jan 2008 | Running | 40.341 | -3.817 | 2 | 700 m | Davis Vantage 6163 |


Map of Madrid Wunderground stations (white and yellow marks) and ECMWF points (blue marks)


Appendix C: table and map of the Oslo stations

| Code/name | Begin | End | Lat | Lon | LCZ | Elevation | Hardware |
|---|---|---|---|---|---|---|---|
| IOSLOOSL5 Hoybraten | Jan 2010 | Running | 59.944 | 10.932 | 6 | 170 m | Oregon Scientific WMR200 |
| IOSLOOSL9 Solveien | Jan 2010 | Running | 59.857 | 10.790 | 6 | 91 m | Oregon Scientific WMR200 |
| IOSLOOSL12 Holmlia | Jan 2010 | Running | 59.828 | 10.788 | 3 | 100 m | Unknown |
| IOSLOOSL13 [-] (nameless) | Jan 2010 | April 2014 | 59.917 | 10.828 | 8 | 150 m | Davis Vantage Pro |
| IOSLOOSL14 Bleikoya | March 2010 | Running | 59.889 | 10.740 | 9 | 18 m | WH1080 |
| IOSLOOSL21 Trasopp | April 2011 | Running | 59.907 | 10.855 | 9 | 213 m | Clas Ohlson 36-3242 |
| IOSLOOSL23 Høyenhall | May 2012 | Running | 59.903 | 10.806 | 3/6 | 131 m | Unknown |
| IOSLOOSL24 Gamlebyen | Nov 2011 | May 2014 | 59.906 | 10.773 | 2 | 28 m | WMR 928 |
| IOSLOOSL26 Vaekeroe | Dec 2011 | Running | 59.916 | 10.650 | 6 | 13 m | WS-1080 |
| IOSLOOSL28 Godlia | Jan 2012 | Running | 59.903 | 10.833 | 8 | 135 m | Fine offset WH-1080 |
| IOSLOOSL31 Abilds | March 2012 | Running | 59.884 | 10.822 | 6 | 114 m | Unknown |
| IOSLOOSL32 Storo | April 2012 | Running | 59.946 | 10.780 | 3 | 123 m | WH 1080 |
| IOSLOOSL36 Nakholmen | Jan 2013 | March 2014 | 59.891 | 10.695 | 6 | 0 m | WS-1080 |
| IOSLO5 Voksenlia | Jan 2010 | Running | 59.970 | 10.659 | 9 | 348 m | 1-wire |
| IOSLO16 Husybybakken | April 2010 | Running | 59.933 | 10.669 | 6 | 79 m | Unknown |
| IAKERSHU18 Sandvika | Jun 2011 | Running | 59.889 | 10.521 | 2 | 24 m | Oregon Scientific WMR200A |
| IAKERSHU27 Bomansvik | Feb 2012 | Running | 59.791 | 10.695 | D | 96 m | Davis Vantage pro |
| IAKERSHU32 Snarøya | Jun 2012 | April 2014 | 59.892 | 10.635 | 5/G (water) | 0 m | Davis Vantage pro |
| eKlima - Alna | Jan 2010 | Running | 59.927 | 10.835 | 8/E (paved) | 90 m | OS WMR200 |
| eKlima - Blindern | Jan 2010 | Running | 59.942 | 10.72 | 6/D (grass) | 94 m | WMR200 |
| eKlima - Bygdøy | Nov 2012 | Running | 59.905 | 10.683 | 6 | 15 m | WH1080 |
| eKlima - Tryvannshøgda | 1927 | Running | 59.984 | 10.669 | A | 514 m | |
| eKlima - Bjørnholt | 1876 | Running | 60.051 | 10.688 | D | 360 m | |


Map of Oslo Wunderground stations (light blue marks), Reference stations (white marks) and ECMWF points (blue marks). The urban Norwegian weather stations have the same colour as the Wunderground stations and are marked as eKlima.


Appendix D: table of measurement accuracy of weather stations

| Weather station hardware | Air temperature (K) | Relative humidity (%) | Wind speed (m s-1) |
|---|---|---|---|
| Davis Vantage (2) Pro | 0.5 | 3 | 1 |
| Davis Vantage (2) Pro plus | 0.5 | 3 | 1 |
| Ultimeter 2100 | 0.5 | 4 | 0.9 |
| Vaisala WXT520 | 0.3 | 3 (for 0-90%), 5 (90+) | 3% |
| WH2081 | 1 | 5 | 1 (10% for ws > 10 m/s) |
| Davis Vantage Vue | 0.5 | 3 | 1 |
| Oregon Scientific WMR200 | 1 | 5 | 3 (10% for > 10 m/s) |
| Clas Ohlson 36-3242 | 1 | 5 | 1 (10% for ws > 10 m/s) |

Table of the measurement accuracy of the weather station types most commonly used by the hobby meteorologists. Not all manufacturers provide information about the accuracy of their weather stations, but the majority of the stations under research use hardware of which the accuracy is known.


Appendix E: Histogram of the difference between Wunderground and CPC data, Rotterdam


Plots 1 and 2 are histograms of the difference between the CPC Vlaardingen station and the Wunderground Vlaardingen station (CPC minus Wunderground). Plots 3 and 4 show the difference between the Zuid CPC station and the Beverwaard Wunderground station, which have a similar LCZ.


Appendix F: hysteresis of Tryvannshøgda

Hysteresis plot of median UHI versus background temperature, from Tryvannshøgda, Oslo.
