Thomaes, A., Kervyn, T. & Maes, D. 2008. Applying species distribution modelling for the conservation of the threatened saproxylic Stag Beetle (Lucanus cervus). Biological Conservation, 141: 1400-1410.

Authors

Arno THOMAES^a, Thierry KERVYN^b & Dirk MAES^c

Affiliations

a Research Institute for Nature and Forest (INBO), Gaverstraat 4, B-9500 Geraardsbergen, Belgium, email: [email protected]
b Directorate General for Nature Resources and Environment (DGRNE), Avenue Prince de Liège 15, B-5100 Jambes, Belgium, [email protected]
c Research Institute for Nature and Forest (INBO), Kliniekstraat 25, B-1070 Brussels, Belgium, [email protected]

*Full address for correspondence

Dirk MAES, Research Institute for Nature and Forest (INBO), Kliniekstraat 25, B-1070 Brussels, Belgium, Tel. +32 2 558 18 37; fax +32 2 558 18 05; [email protected]

Abstract

Despite its size and attractiveness, many sites of Lucanus cervus remain undetected in NW Europe because of its short flight period and its nocturnal activity. Therefore, present-day designated conservation areas for L. cervus are probably insufficient for a sustainable conservation of the species. We applied eight species distribution modelling techniques (Artificial Neural Networks, Classification Tree Analysis, Generalised Additive Models, Generalised Boosting Models, Generalised Linear Models, Mixture Discriminant Analysis, Multivariate Adaptive Regression Splines and Random Forests) to predict the distribution of L. cervus in Belgium using ten randomly generated calibration and evaluation sets. We used AUC, sensitivity (% correctly predicted presences in the evaluation set), specificity (% correctly predicted absences in the evaluation set) and Kappa statistics to compare model performances. To avoid the incorporation of only marginally suitable woodland sites into the Natura 2000 network, we conservatively considered the species as being present only in grid cells where all ten randomly generated model sets predicted the species as such. Model performance was, on average, good, allowing us to predict the potential distribution of L. cervus accurately. According to the predicted distribution using the more robust prevalence threshold, only 5 731 ha (11% of the potentially suitable area) is protected under the Natura 2000 scheme in Belgium. Subsequently, we categorised the potentially suitable woodlands into three conservation priority categories based on their surface area and the already designated Natura 2000 area. Including the most suitable L. cervus woodlands that are not yet part of Natura 2000 sites in the network would require protecting an area of 15 260 ha. Finally, we discuss the implications of using species distribution modelling for nature policy decisions in designating conservation networks.

Keywords: Belgium, saproxylic beetle, Natura 2000, predictive modelling, AUC, Kappa statistics, model comparison, protected area effectiveness

1. Introduction

The accelerating decline and extinction of many species has made species conservation in particular and nature conservation in general a globally important issue (Thomas et al., 2004). This resulted in the ratification of the Convention on Biological Diversity (CBD) in which world leaders agreed to halt biodiversity loss by 2010 (Secretariat of the Convention on Biological Diversity, 2006). Although they make up 75-80% of all known species, invertebrate focal species are rarely used for hotspot analyses or site designation (McGeoch, 1998; Samways, 2005). However, coincidence of invertebrate hotspots with those from the more commonly used taxonomic groups (vertebrates and/or plants) could be rather low (Prendergast et al., 1993; Maes et al., 2005).

In Europe, two directives oblige member states to protect sites where focal species occur: the Habitats Directive (92/43/EEC) and the Birds Directive (79/409/EEC). Furthermore, member states have to adequately map the distribution and monitor the abundance of known populations of species that are listed in the annexes of both directives. Together, sites protected under the European Birds Directive and Habitats Directive form the so-called Natura 2000 network (see Wätzold and Schwerdtner, 2005). To designate sites on the basis of focal species, information on their distribution is essential (Kareiva and Levin, 2003; Hortal et al., 2007). Even in well studied regions, species can remain undetected because of their inconspicuous behaviour (e.g., nocturnal), small size (e.g., many invertebrates), low abundance, taxonomical problems (difficult to classify) or spatial biases in distribution data (Dennis and Thomas, 2000). Furthermore, if distribution data are present, sites are usually only designated if the species is actually known to be present. This approach almost always results in too limited an area of protected sites, because sites with a high potential for the occurrence of the species, but without its documented permanent presence, are rarely designated (Decleer, 2007).

The threatened saproxylic Stag Beetle Lucanus cervus (LINNAEUS 1758) is one of the invertebrate species that is listed in Annex II of the European Habitats Directive.
Saproxylic invertebrates have been identified as one of the most threatened invertebrate communities in Europe (Speight, 1989; Berg et al., 1994) and many saproxylic species have been used as indicators for the quality of woodlands (e.g., Fowles et al., 1999; Ranius, 2002). Despite its size and attractiveness, distribution data of L. cervus are not available on a scale that allows an adequate designation of protected areas in many NW European countries. This is because L. cervus occurs in low numbers and is only active during a short period of warm nights in June and July (Smith, 2003; Smit, 2004).

When distribution data are scarce, modelling techniques are increasingly used to fill in gaps in distribution maps or to target conservation efforts towards sites with a potentially high conservation value (e.g., Guisan and Zimmermann, 2000; Luoto et al., 2002; Wilson et al., 2005; Heikkinen et al., 2007). Modelling techniques use distribution data to link species to a set of biotic (e.g., land cover, species interactions) and/or abiotic variables (e.g., soil type, climate data) and make it possible to predict presence probabilities for un-surveyed sites (Guisan and Zimmermann, 2000). Furthermore, predictive modelling is advocated as an increasingly powerful tool in conservation biology because it allows un-surveyed areas to be incorporated into nature policy decision making (Pullin et al., 2004; Rushton et al., 2004). A prerequisite, however, is that a minimum number of data are available to build and evaluate the species distribution models (Pearson et al., 2007). Despite their attractiveness and recommended use in conservation biology, models always need to be critically evaluated and validated if they are to be applied to designate sites (Hortal et al., 2007).
The use of different modelling techniques and different methods to test model efficiency is, therefore, highly recommended (Thuiller, 2003). Models can, at most, help to prioritise sites and support policy makers in their decision to designate the most cost-effective sites for the protection of species or habitats (Chefaoui et al., 2005).

Here, we use the threatened saproxylic Stag Beetle Lucanus cervus in Belgium as an example of how modelling techniques can be used to help designate ecological networks. First, we describe the characteristics of the known distribution pattern of L. cervus in Belgium. Secondly, we apply different modelling techniques to predict the potential distribution of this species. Subsequently, we assign conservation priority values to woodland sites that have a high probability of harbouring the species and suggest their incorporation in the Natura 2000 network. Finally, we discuss the use of species distribution modelling to detect un-surveyed but potentially suitable sites for the focal species and how such sites can be incorporated in ecological networks and conservation policy making.

2. Material & Methods

2.1. Study area

Belgium is a strongly industrialised NW European country with high human population density (335 inhabitants/km², Van Goethem, 2001) and, consequently, intense pressure on nature (OECD, 1998). The general landscape and topography differ considerably between the two administrative regions of Belgium: Flanders and Wallonia. Flanders, the northern part, is a lowland zone (mean elevation = 38 m) and only has a limited total area of nature reserves (1.6% of the territory, Van Goethem, 2001) and forest (8% - CEC, 1994). The highest amount of woodland in Flanders is found in the north-eastern part of the country and in the surroundings of Brussels (e.g., Sonian Forest, Haller Forest, Meerdaal Forest). Wallonia, the southern part and comparatively an upland region (mean elevation = 310 m), has a similar total area of nature reserves (ca. 1% of the territory, Van Goethem, 2001). However, it has a considerably higher amount of woodlands, mainly coniferous (31%, CEC, 1994).

2.2. Study species and data

The saproxylic Stag Beetle Lucanus cervus (LINNAEUS 1758) is one of the largest beetle species in Europe (Luce, 1996). In the past, the species was thought to be confined to large woodlands (Tochtermann, 1992), but more recent studies in NW Europe have shown that L. cervus can also occur in open and more urban habitats such as gardens, parks, open forests, hollow ways, orchards and afforested slopes in the vicinity of large woodlands (Rink and Sinsch, 2006). The larvae live about five years underground in woody debris, mostly of Quercus spp., on loamy or silty soils (Pratt, 2000). L. cervus has a limited colonization capacity: maximal female dispersal is about 1 km, while males can fly distances of about 3 km (Rink and Sinsch, 2007). The medium home range size of females is about 0.2 ha, while males have a home range of about 1 ha (Sprecher-Uebersax, 2003).

Distribution data of L. cervus in Belgium were collated from literature, collections, public surveys and field observations (Thomaes et al., 2008). Only recent observations (1974-2005) were used for modelling purposes (see discussion). These observations were attributed to 5 x 5 km squares of the UTM (Universal Transverse Mercator) grid for zones 31 and 32, hereafter called grid cells. The known present-day and historical distribution of L. cervus in Belgium is shown in Fig. 1. Its distribution is concentrated in the centre and east of the country, around the cities of Brussels and Liège, but some more isolated locations are scattered mainly in the south.

We selected a total of 356 grid cells for which reliable presence/absence data were available (73 grid cells with and 283 grid cells without the species, Fig. 2). We only used grid cells in ecological regions (sensu Dufrêne and Legendre, 1991) with old or recent observations and, therefore, excluded three ecological regions (i.e., Dunes, Polders and Sandy Loam) from the analyses (Fig. 2, see discussion). The absence of the species in these regions can be explained by the lack of southerly exposed terrain, unsuitable soil types and a very low amount of woodland.

2.3. Environmental variables

Four types of environmental data were collected (Table 1 - Maes et al., 2003): (1) land use variables (Corine Land Cover - CEC, 1994), (2) topographic variables (digital elevation model for Belgium with a resolution of 20 m - National Geographical Institute), (3) climate variables for the period 1996-2001 (Royal Meteorological Institute of Belgium) and (4) soil variables (Soil Service of Belgium). The area of each land use and topographic class in each grid cell was estimated using ArcView 3.2 (ESRI, Redlands, CA, USA). We deliberately used the 1990 Corine Land Cover data (CEC, 1994) instead of the more recent 2000 data (Nunes de Lima, 2005), because the former coincided better with the time window of the L. cervus data used in this analysis. Furthermore, shifts in urban area and deciduous/mixed woodland (the two land cover classes used in the analysis, see further) in Belgium were small: only 2 305 ha of deciduous/mixed woodland (average change of 1.7 ha per grid cell) and 2 158 ha of urban area (1.6 ha per grid cell) changed into another land cover type between both mapping periods.

The relatively short and recent time period for the climatic data (1996-2001) was preferred because a much larger number of point data was available compared to the data for a longer time period (1970-2000, for example). This allowed a more accurate interpolation of the climatic variables to the whole of Belgium (see Maes et al., 2003 for more details). The range in elevation was calculated as the difference between the highest and lowest altitude in the grid cell. From the soil association map, a soil suitability index (0/1) for L. cervus was derived according to its larval preferences (see above). Soil types with entirely or mostly sandy and/or loamy soils and a year-round water table below 40 cm had a suitability of 1; all other soil types had a suitability of 0 (including all clay soils, pure sandy soils, soils with shallow rock layers and permanently wet soils). The value of the soil suitability index in each grid cell was calculated as a weighted average of the area of the different soil types.
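The two derived variables described above (the area-weighted soil suitability index and the range in elevation) can be sketched in a few lines. This is only an illustration in Python under simple assumptions, not the authors' workflow, which relied on ArcView and R; the function names and the example values are hypothetical.

```python
def soil_suitability_index(soil_patches):
    """Area-weighted mean of 0/1 suitability scores within one grid cell.

    soil_patches: iterable of (area_ha, suitability) pairs, suitability in {0, 1}.
    """
    patches = list(soil_patches)
    total_area = sum(area for area, _ in patches)
    if total_area <= 0:
        raise ValueError("grid cell has no mapped soil area")
    return sum(area * score for area, score in patches) / total_area


def elevation_range(elevations_m):
    """Difference between the highest and lowest altitude in a grid cell."""
    return max(elevations_m) - min(elevations_m)


# Hypothetical 5 x 5 km cell (2 500 ha): 1 500 ha suitable, 1 000 ha unsuitable.
example_cell = [(1500.0, 1), (700.0, 0), (300.0, 0)]
index = soil_suitability_index(example_cell)   # share of suitable soil area
relief = elevation_range([38.0, 112.5, 67.0])  # metres
```

With these made-up inputs the index is simply the suitable share of the cell area, so a cell that is entirely suitable scores 1 and a cell without suitable soils scores 0.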

2.4. Building and validating predictive models

Variables entered in the model were urban area, deciduous/mixed woodland, mean annual temperature, range in elevation and soil suitability index. These variables were selected for ecological reasons (Mac Nally, 2002; Austin et al., 2006): forest as major biotope type, urban area because of the warmer microclimate and the species’ occurrence in semi-urban environments (Sprecher-Uebersax, 2003; Hawes, 2004), temperature because of the thermophilous character of the species (Whitehead, 1993; Pratt, 2000; Napier, 2003), range in elevation as a proxy for the occurrence of warm exposed hillsides and soil suitability as a limiting factor for larval development (Pratt, 2000; Napier, 2003; Hawes, 2004).

We randomly selected 70% of all grid cells with reliable presence/absence data (i.e., 249 out of the 356 grid cells) as calibration set, using the remaining 30% (i.e., 107) as evaluation set. This data splitting procedure was repeated ten times, to generate ten calibration-evaluation data sets (Olden and Jackson, 2000).

We applied eight different modelling techniques to each of the ten random calibration sets: Artificial Neural Networks (ANN - Ripley, 1996), Classification Tree Analysis (CTA - Breiman et al., 1984), Generalised Additive Models (GAM - Hastie and Tibshirani, 1987), Generalised Boosting Models (GBM - Friedman et al., 2000), Generalised Linear Models (GLM - McCullagh and Nelder, 1989), Mixture Discriminant Analysis (MDA - Hastie et al., 1994), Multivariate Adaptive Regression Splines (MARS - Friedman, 1991) and Random Forests (RF - Breiman, 2001). All analyses were done in R using the BIOMOD module (Thuiller, 2003). Model efficiency was tested using the Area under the Curve (AUC) of the receiver operating characteristic plot (ROC, Fielding and Bell, 1997) and the Kappa statistics (Cohen, 1960) of the evaluation set.
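The repeated data-splitting step can be illustrated with a short sketch. It assumes plain random sampling without stratification and uses only the Python standard library; it is not the routine BIOMOD uses internally, and the function name, the seed and the use of integer cell identifiers are illustrative assumptions.

```python
import random


def repeated_splits(n_cells=356, calibration_fraction=0.7, n_repeats=10, seed=0):
    """Return n_repeats (calibration_ids, evaluation_ids) pairs.

    Each repeat shuffles all grid-cell ids and cuts the list once, so the
    two sets are disjoint and together cover every cell.
    """
    rng = random.Random(seed)  # fixed seed only for reproducibility of the sketch
    n_calibration = int(n_cells * calibration_fraction)  # 249 for 356 cells
    splits = []
    for _ in range(n_repeats):
        ids = list(range(n_cells))
        rng.shuffle(ids)
        splits.append((sorted(ids[:n_calibration]), sorted(ids[n_calibration:])))
    return splits


splits = repeated_splits()
calibration_ids, evaluation_ids = splits[0]
```

Each of the ten pairs would then be used to fit the candidate models on the calibration cells and to score them on the evaluation cells.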
For both methods we calculated the sensitivity (i.e., the % correctly predicted presences) and the specificity (i.e., the % correctly predicted absences) in the evaluation set. For each of the ten random datasets and for both evaluation methods, we used the model with the highest AUC or the highest Kappa value in the evaluation set to make predictions for the whole of Belgium. Predicted probabilities of occurrence were transformed into binary presence/absence data using the prevalence criterion (in our case, 0.205, i.e. 73/356) in the total data set for the AUC method (Manel et al., 2001; Liu et al., 2005; Jiménez-Valverde and Lobo, 2007) and the Kappa threshold for the Kappa statistics (Table 2). Subsequently, we summed all binary predictions of the ten random sets per grid cell. Conservatively, we considered L. cervus as predicted to be present only in those grid cells where all ten random sets predicted its occurrence (i.e., the sum of the binary predictions of the ten random model runs = 10; cf. Araújo and New, 2007).

Finally, we transformed the binary predictions of occurrence in a grid cell into probabilities per woodland site by calculating a weighted average of the predictions for the grid cells that overlap with a given woodland site (following Araújo, 2004). Only woodland sites that fall completely within grid cells where all ten random sets predicted the species as present are considered as potentially suitable woodland sites for the species. These sites were given conservation priority values based on their size and their amount of already designated Natura 2000 area. Priority-1 sites are >50 ha and have, at present, only 0-10% of their area designated as Natura 2000 site, Priority-2 sites are >50 ha and have 10-50% of their area designated as Natura 2000 site and Priority-3 sites are >50 ha and have >50% of their area designated as Natura 2000 site (cf. Chefaoui et al., 2005).
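The thresholding, evaluation and consensus steps of this section can be sketched as follows. This is a minimal illustration, not the BIOMOD implementation: the function names, the toy probabilities and the choice of `>=` at the threshold are assumptions, and the sketch presumes that the evaluation set contains both presences and absences.

```python
def binarise(probabilities, threshold):
    """Presence (1) where probability >= threshold (the >= is an assumption)."""
    return [1 if p >= threshold else 0 for p in probabilities]


def evaluate(observed, predicted):
    """Return sensitivity, specificity and Cohen's Kappa for 0/1 vectors."""
    pairs = list(zip(observed, predicted))
    tp = sum(1 for o, p in pairs if o == 1 and p == 1)
    tn = sum(1 for o, p in pairs if o == 0 and p == 0)
    fp = sum(1 for o, p in pairs if o == 0 and p == 1)
    fn = sum(1 for o, p in pairs if o == 1 and p == 0)
    n = tp + tn + fp + fn
    sensitivity = tp / (tp + fn)  # share of presences predicted as present
    specificity = tn / (tn + fp)  # share of absences predicted as absent
    observed_agreement = (tp + tn) / n
    chance_agreement = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / n ** 2
    kappa = (observed_agreement - chance_agreement) / (1 - chance_agreement)
    return sensitivity, specificity, kappa


def consensus(binary_runs, min_votes=None):
    """Cells predicted present in at least min_votes runs (default: all runs)."""
    if min_votes is None:
        min_votes = len(binary_runs)
    votes = [sum(cell) for cell in zip(*binary_runs)]
    return [1 if v >= min_votes else 0 for v in votes]


prevalence = 73 / 356  # threshold quoted in the text, about 0.205

# Toy data for five grid cells and two model runs (hypothetical values).
observed = [1, 1, 0, 0, 0]
run_a = binarise([0.90, 0.10, 0.30, 0.05, 0.10], prevalence)
run_b = binarise([0.80, 0.40, 0.10, 0.02, 0.15], prevalence)
sens, spec, kappa = evaluate(observed, run_a)
unanimous = consensus([run_a, run_b])
```

Passing a smaller `min_votes` to the hypothetical `consensus` helper relaxes the unanimity rule, for instance to nine out of ten runs.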

3. Results

The grid cells where L. cervus is present have a significantly higher amount of urban area and deciduous/mixed woodland, a higher range in elevation and a higher soil suitability index than the rest of the country, whereas agricultural area was significantly lower in these grid cells. Coniferous woodland, temperature, rainfall and shrubland did not differ significantly between cells with and without the species (Table 1).

Using AUC to evaluate model performance, Random Forests performed best in six out of ten random modelling sets, Generalised Linear Models (GLM) in three cases and Generalised Additive Models (GAM) in one case. Using Kappa statistics, Random Forests performed best in five cases, Generalised Additive Models in four cases and Mixture Discriminant Analysis (MDA) in one case (Table 2). The use of AUC and Kappa resulted in different model selections: GLM was selected three times as best model using AUC but never when using Kappa, while MDA was not selected when using AUC. Average model performance was good using AUC and nearly good using Kappa statistics (Table 2). The percentage of correctly predicted presences (sensitivity) was, on average, much higher using the prevalence threshold, while the percentage of correctly predicted absences (specificity) was higher using the Kappa threshold (Table 2).

The predicted distributions of L. cervus in Belgium based on the sum of all ten random models (only grid cells where all ten random sets predicted the species as present are shown) according to the prevalence and the Kappa threshold are shown in Fig. 3a and 3b respectively. In both cases, grid cells with a high probability of harbouring L. cervus were situated in a large area around the capital Brussels in the centre of Belgium, around the city of Liège in the east, along the Meuse, in the Condroz region and in some more isolated grid cells in the Gaume region in the south.
Using the prevalence threshold, additional regions where L. cervus was predicted are the Sambre and Haine river valleys in the south-western part and the Campine region in the north-eastern part of the country.

In total, 18 449 ha are designated as Natura 2000 sites for the presence of L. cervus in Belgium (Fig. 3a), of which 5 717 ha and 4 059 ha were predicted to be suitable for the species using the prevalence and the Kappa threshold respectively. Out of 2 498 deciduous/mixed woodland sites in Belgium (total area = 461 792 ha), 425 were predicted to be suitable for L. cervus using the prevalence threshold (i.e., 51 997 ha of which 11% is designated for L. cervus); when using the Kappa threshold, 142 woodland sites were predicted to be suitable for L. cervus (i.e., 16 840 ha of which 24% is designated for L. cervus - Table 3).

Using the prevalence threshold, 111 woodlands were classified as Priority-1 sites (larger than 50 ha and <10% of their area designated as Natura 2000 site) and a further 49 woodlands as Priority-2 sites (larger than 50 ha and 10-50% of their area designated as Natura 2000 site). Fifty-nine potentially suitable woodlands were classified as Priority-3 sites (larger than 50 ha and >50% of their area designated as Natura 2000 site - Table 3). Using the Kappa threshold, 40 woodlands were classified as Priority-1 sites and a further 18 woodlands as Priority-2 sites. Finally, 14 potentially suitable woodlands were classified as Priority-3 sites. Priority-1 and Priority-2 sites are mainly located in the Condroz region and the Meuse valley, regardless of whether the prevalence or the Kappa threshold was used (Fig. 4).
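The priority rules used for this classification can be written down as a small decision function. The sketch below is illustrative only: the function name and the example woodlands are hypothetical, and because the text does not say which class a site with exactly 10% designated area belongs to, assigning it to Priority-2 is an assumption.

```python
def priority_class(area_ha, natura2000_percent):
    """Return 1, 2 or 3 for suitable woodlands >50 ha, otherwise None.

    natura2000_percent: share of the woodland already designated (0-100).
    Boundary handling at exactly 10% is an assumption (treated as Priority-2).
    """
    if area_ha <= 50:
        return None  # small woodlands receive no priority value
    if natura2000_percent < 10:
        return 1
    if natura2000_percent <= 50:
        return 2
    return 3


# Hypothetical woodlands: (area in ha, % already designated as Natura 2000).
woodlands = [(120.0, 0.0), (80.0, 25.0), (300.0, 75.0), (30.0, 0.0)]
classes = [priority_class(area, pct) for area, pct in woodlands]
```

Applied to the suitable woodlands of Table 3, a rule of this kind would yield the counts per priority class reported above.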

4. Discussion

Despite the fact that the threatened saproxylic Stag Beetle Lucanus cervus is a large and appealing species, distribution data and ecological knowledge are far from complete in NW Europe. Comparing grid cells with and without L. cervus in Belgium revealed that this species is present in grid cells with a relatively lower amount of agricultural area, a higher amount of urban and deciduous/mixed woodland area, larger differences in elevation and a higher soil suitability. We were able to predict the species’ distribution quite accurately and showed that only 11-24% of the potentially suitable woodland sites are protected under the Natura 2000 legislation.

4.1. Data quality

Despite its size, L. cervus is difficult to survey because of its low numbers and its nocturnal activity during a short period of warm nights in June and July in NW Europe (Smith, 2003; Smit, 2004). Because L. cervus is easily recognised, some countries used surveys by volunteers and questionnaires to the general public to map its distribution, which may have caused an overrepresentation of urban environments (Smith, 2003; Smit, 2004; Thomaes and Vandekerkhove, 2004). The data used here were collated from different sources: field observations, literature and collection data. We used mainly observation data from 1974 onwards because the exact location of older data is usually less well documented (Janssens, 1960; Leclercq et al., 1973). The fact that L. cervus has been observed in a site since 1974 indicates that the site was at least temporarily suitable and justifies the use of all sites with relatively old observations (see Willis et al., 2007). Here, we used 5 x 5 km grid cells because this resolution coincides with the maximal geographic accuracy of the (historical) faunistic data and with the vagility of the species (cf. Chefaoui et al., 2005; Lobo et al., 2006).
To avoid false absences as much as possible (Engler et al., 2004; Araújo and Guisan, 2006), we considered L. cervus as absent in grid cells that were at least 10 km away from all recent and historic records, but in an ecological region with records of L. cervus. This approach does not completely exclude the incorporation of false absences in the data set, but is certainly more reliable than using all grid cells without records of L. cervus (cf. Lobo et al., 2006; Vanreusel et al., 2007).

4.2. Factors limiting L. cervus distribution in Belgium

In NW Europe, L. cervus is often observed in small wooded biotopes within and near cities (Pratt, 2000; Napier, 2003; Smith, 2003; Sprecher-Uebersax, 2003; Rink and Sinsch, 2007; Thomaes et al., 2008). One of the possible explanations could be that this thermophilous species prefers the warmer microclimate of cities. The historic protection of dead wood and large old trees as a romantic landscape in parks nearby or within cities could have created suitable small habitats. Large woodlands remain, however, the prime habitat type for this saproxylic species on a landscape scale. The importance of woodland area for L. cervus can be attributed to the presence of a higher amount of and greater continuity in coarse woody material in larger woodlands that are managed in an ecological manner (Speight, 1989; Tochtermann, 1992; Davies et al., 2006; Ranius and Kindvall, 2006; Franc et al., 2007).

Two variables emphasize the thermophilous character of the species (Whitehead, 1993; Pratt, 2000; Napier, 2003): mean annual temperature (as in Great Britain, the distribution of L. cervus in Belgium is largely contained within the 16.5°C mean July isotherm - Hawes, 2004) and the range in elevation, which is used as a surrogate for warm south-facing slopes (Rink and Sinsch, 2006).

Although some authors suggest that the distribution of L. cervus is limited by the amount of rainfall (Percy et al., 2000; Pratt, 2000), we did not find a significant difference in rainfall between cells with and without L. cervus (Table 1). Furthermore, since rainfall was strongly correlated with temperature (Pearson r = -0.852, p < 0.0001), we did not use it to model the distribution of L. cervus.

4.3. Model performance

Random Forests was the modelling technique selected most often, using either AUC or Kappa evaluation statistics. Different modelling techniques for species distribution modelling have been compared for several taxonomic groups and in different regions (e.g., Segurado and Araújo, 2004; Elith et al., 2006; Austin, 2007; Heikkinen et al., 2007; Meynard and Quinn, 2007). However, Random Forests (RF) is one of the newest approaches in species distribution modelling and comparisons with other more classical techniques are still scarce. Three recent studies showed that RF significantly outperformed most modelling techniques and this technique was advocated to be robust for predictive mapping (Lawler et al., 2006; Prasad et al., 2006; Peters et al., 2007).

On average, cross evaluation of the model results revealed that model performance was either close to good (according to the Kappa criterion) or good (according to AUC). AUC was recently criticized as a measure for model performance (Lobo et al., 2007). Therefore, we additionally calculated sensitivity and specificity of the models. Sensitivity was, on average, higher using the prevalence threshold, while specificity was, on average, higher using the Kappa threshold. In our case, the use of the prevalence value to transform the model’s probabilities into binary presence/absence values was justified by its higher sensitivity compared with the Kappa threshold (Table 2).

Predictions of species distribution models can be strongly influenced by how the calibration and evaluation sets are selected (Guisan and Zimmermann, 2000; Araújo and Guisan, 2006). Here, we accounted for this potential drawback in three ways. First, we randomly selected ten calibration and evaluation sets, while most studies use only a single calibration and evaluation set to build and evaluate models. Secondly, we applied eight different modelling techniques and selected the modelling technique that resulted in the highest Area under the Curve (AUC) of the Receiver Operating Characteristic (ROC) plot or the highest Kappa statistic (Thuiller, 2003). Thirdly, we combined the results of all the best models to decide whether L. cervus could be present in a given grid cell. Our approach was highly conservative, since we considered L. cervus as present only in the grid cells where the models calibrated on all ten randomly selected data sets predicted the species as present (cf. Araújo and New, 2007). This probably explains why not all present-day L. cervus sites were predicted correctly by our method (see grey squares in Fig. 3). The use of ten randomly selected data sets allows for a much more robust selection of potentially suitable sites for a focal species and helps to avoid, as much as possible, the designation of only marginally suitable sites. A less conservative approach, for example incorporating woodlands where L. cervus was predicted as present in nine out of ten model runs, would increase the area of Priority-1 sites to be protected from 15 260 ha to 65 123 ha using AUC and from 4 469 ha to 11 077 ha using Kappa (Fig. 5).

The predicted distribution of L. cervus in Belgium largely coincided with its historical distribution (before 1974 - Fig. 1). The present-day absence of L. cervus in the historical sites can have two causes: i) historical sites were recently under-surveyed or ii) L. cervus actually disappeared from the sites.
By predicting potential distributions, mapping schemes can more easily target grid cells with a high probability for threatened species (Dennis and Hardy, 1999).

4.4. Implications for ecological network designation

Detailed distribution data are usually scarce or even lacking for most invertebrate species. Since sites that are potentially suitable for a species can also be designated as Natura 2000 site without evidence for the actual occurrence of the species (Decleer, 2007), modelling can help to detect such sites (e.g., Chefaoui et al., 2005; Lobo et al., 2006; Buse et al., 2007). Given the possible socio-economic implications (costs, land acquisition) of site designation based on such predictive models, some criticism of their use in conservation biology is appropriate (Liu et al., 2005; Hortal et al., 2007).

To make the models more reliable with respect to the species’ ecology, we only used predictive variables that are of direct importance to the species such as its biotope and its thermoregulatory requirements (cf. Mac Nally, 2002). This allows for a better understanding of why woodland sites are predicted to be suitable for focal species such as L. cervus in our case. Furthermore, only woodland sites in grid cells for which all ten model runs predicted the presence of the species are, very conservatively, proposed for incorporation into the Natura 2000 network in Belgium. Many papers predict species distribution at different resolutions (5 x 5 km, 10 x 10 km or even 50 x 50 km; e.g., Maes et al., 2003; Araújo, 2004). Relatively large grid cells are very useful to detect general patterns and changes in distribution under, for example, climate change scenarios on a larger scale (e.g., Bakkenes et al., 2002; Harrison et al., 2006). To be informative for policy makers and for prioritising conservation decisions, we transformed the predicted presences of L. cervus in grid cells into probabilities in woodland sites within those grid cells (see Araújo, 2004; Cabeza et al., 2004; Araújo et al., 2005).

The major difference between the use of the prevalence and the Kappa threshold was the number of woodlands predicted to be suitable for L. cervus. The model using the Kappa statistic predicted far fewer grid cells and woodlands to be suitable for L. cervus, due to the different thresholds used to transform probabilities into presence/absence predictions. This has important consequences for the cost-effectiveness of the sites to be designated in Belgium.
Using the Kappa threshold, three times less woodland area was predicted to be suitable for L. cervus than with the prevalence threshold. In our case, the prevalence threshold appeared more liberal than the Kappa threshold (cf. Jiménez-Valverde and Lobo, 2006), but proved to be more accurate when predicting the distribution of L. cervus in Belgium (see Fig. 3). Recent studies have shown that the threshold that maximises the Kappa statistic was among the worst-performing criteria (Jiménez-Valverde and Lobo, 2007) and was less robust than the prevalence threshold (Liu et al., 2005). Moreover, the prevalence threshold is suggested as one of the most appropriate ones, because it reflects the data used to build the model, especially when species are relatively rare (Liu et al., 2005; Jiménez-Valverde and Lobo, 2006), as is the case for L. cervus in Belgium. In addition, the prevalence threshold resulted in a much more accurate prediction of presences in the evaluation set. When using modelling for conservation biology purposes, we are especially interested in predicting a species’ presence to underpin the designation of sites in ecological networks (Lobo et al., 2007). We therefore prefer to use the predictions obtained with the prevalence threshold to prioritise the designation of Natura 2000 sites in Belgium. The use of the more conservative Kappa threshold would not only lead to a much smaller number of Natura 2000 sites to be designated but also to a much more fragmented network of sites, which would hamper the exchange of individuals among suitable sites.

A rule of thumb indicates that member states should strive for 60% of all sites of an Annex II species to be designated as Natura 2000 sites (Decleer, 2007). As shown in our analysis using the prevalence threshold, only 11% of all potentially suitable sites are actually designated as such.
To achieve the designation of at least 60% of all potential L. cervus sites, a first step should be to designate all Priority1-sites as Natura 2000 sites as well (i.e., 15 260 ha). This would increase the designated fraction of woodlands suitable for L. cervus to 40%. Additionally designating all Priority2-sites would increase the share of sites designated for L. cervus to 69% of all suitable sites in Belgium. Priority3-sites are already largely designated as Natura 2000 sites, but not specifically for the presence of L. cervus. Here, the emphasis should be on adequate management to make the woodlands more suitable for L. cervus in particular and for other saproxylic invertebrates in general. The use of different priority levels for conservation helps decision makers to rank sites according to their importance for L. cervus (cf. Maes et al., 2004). Further prioritisation can be done by first designating the largest sites (e.g., ≥ 100 ha; Table 3), which assure a more sustainable conservation of the focal species.

All sites predicted suitable for L. cervus using the Kappa threshold were a subset of the sites predicted suitable using the prevalence threshold, which could additionally be used in ranking sites for designation.

Although L. cervus can also be found in smaller wooded areas outside woodlands in NW Europe (e.g., city parks, gardens with large trees), we focus on large woodland sites for its conservation. Relatively large woodlands (>100 ha) are more likely to conserve L. cervus than smaller woodlands and are often also inhabited by other (threatened) saproxylic beetles (Tochtermann, 1992; Ranius and Kindvall, 2006; Franc et al., 2007). Most of the larger woodlands in NW Europe have a historical continuity of dead wood and greater habitat variability, which permits different species to co-occur. Therefore, L. cervus could act as a focal species for the conservation of suitable woodlands for saproxylic beetles (cf. Osmoderma eremita - Ranius, 2002; Cerambyx cerdo - Buse et al., 2007). Because many saproxylic invertebrates, including L. cervus, are very sedentary, small suitable woodland patches could also be protected (and incorporated into the Natura 2000 network) and managed appropriately, especially when they are situated in the neighbourhood of existing populations (Ranius and Hedin, 2001; Grove, 2002; Ranius and Kindvall, 2006). Apart from designating sites for the conservation of threatened species, an appropriate management scheme is an additional prerequisite for the local preservation of the species (Pickett et al., 1992; Kareiva and Levin, 2003). Beneficial management measures consist of creating small microhabitats with different types and quantities of dead and decaying wood (Ranius, 2002; Davies et al., 2006; Schroeder et al., 2007).
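As an illustration of the threshold comparison discussed in this section, the following minimal Python sketch converts predicted probabilities of occurrence into presence/absence with (i) a prevalence threshold and (ii) the threshold that maximises Cohen's Kappa, and then applies a unanimity rule across model runs. The observations, probabilities, function names, the grid of candidate thresholds and the convention that a probability equal to the threshold counts as a presence are all illustrative assumptions; they are neither the data nor the BIOMOD code used in this study.

```python
# Illustrative sketch only: toy data and hypothetical function names.

def confusion(observed, predicted):
    """Return (tp, fp, fn, tn) for two equal-length sequences of 0/1 values."""
    pairs = list(zip(observed, predicted))
    tp = sum(1 for o, p in pairs if o == 1 and p == 1)
    fp = sum(1 for o, p in pairs if o == 0 and p == 1)
    fn = sum(1 for o, p in pairs if o == 1 and p == 0)
    tn = sum(1 for o, p in pairs if o == 0 and p == 0)
    return tp, fp, fn, tn


def cohen_kappa(observed, predicted):
    """Cohen's Kappa: agreement corrected for agreement expected by chance."""
    tp, fp, fn, tn = confusion(observed, predicted)
    n = tp + fp + fn + tn
    p_obs = (tp + tn) / n
    p_exp = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    if p_exp == 1.0:  # degenerate case: Kappa is undefined, report 0
        return 0.0
    return (p_obs - p_exp) / (1.0 - p_exp)


def sensitivity_specificity(observed, predicted):
    """Fractions of correctly predicted presences and absences.

    Assumes at least one observed presence and one observed absence.
    """
    tp, fp, fn, tn = confusion(observed, predicted)
    return tp / (tp + fn), tn / (tn + fp)


def classify(probabilities, threshold):
    """Assumed convention: probability >= threshold means presence."""
    return [1 if p >= threshold else 0 for p in probabilities]


def prevalence_threshold(observed):
    """Threshold equal to the fraction of presences in the data."""
    return sum(observed) / len(observed)


def kappa_threshold(observed, probabilities):
    """Candidate threshold (0.01 ... 0.99) with the highest Kappa."""
    candidates = [i / 100 for i in range(1, 100)]
    return max(candidates,
               key=lambda t: cohen_kappa(observed, classify(probabilities, t)))


def unanimous(run_predictions):
    """Presence only in cells where every model run predicts presence."""
    return [1 if all(cell) else 0 for cell in zip(*run_predictions)]


# Toy evaluation set: 2 presences among 10 grid cells (prevalence 0.2).
observed = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
probabilities = [0.90, 0.22, 0.40, 0.25, 0.10, 0.10, 0.05, 0.05, 0.15, 0.02]

t_prev = prevalence_threshold(observed)
t_kappa = kappa_threshold(observed, probabilities)

print(sensitivity_specificity(observed, classify(probabilities, t_prev)))   # (1.0, 0.75)
print(sensitivity_specificity(observed, classify(probabilities, t_kappa)))  # (0.5, 1.0)
print(unanimous([[1, 1, 1, 0], [1, 0, 1, 0], [1, 1, 1, 1]]))                # [1, 0, 1, 0]
```

In this toy example the Kappa-maximising threshold is higher than the prevalence threshold, so it yields fewer predicted presences, a lower sensitivity and a higher specificity. This is the same qualitative pattern as in Table 2, although the numbers themselves carry no meaning.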

Acknowledgements

We thank Roger Cammaerts (ULB), Olivier Beck (BIM), Luc Crevecoeur (LIKONA) and all the volunteers for collecting field observations. The Royal Belgian Institute for Natural Sciences and the Universities of Liège, Ghent and Brussels gave us permission to use their collection data of L. cervus in Belgium. We kindly thank Wilfried Thuiller for his permission to use BIOMOD and Dirk Bauwens, Joaquín Hortal and an anonymous reviewer for useful comments on the manuscript. We are grateful to Miska Luoto and Mathieu Marmion for fruitful discussions on modelling techniques. Finally, we acknowledge the support of the European Community - Research Infrastructure Action under the FP6 “Structuring the European Research Area” Programme, LAPBIAT.

References

Araújo, M.B., 2004. Matching species with reserves - uncertainties from using data at different resolutions. Biological Conservation 118, 533-538.

Araújo, M.B., Guisan, A., 2006. Five (or so) challenges for species distribution modelling. Journal of Biogeography 33, 1677-1688.

Araújo, M.B., New, M., 2007. Ensemble forecasting of species distributions. Trends in Ecology & Evolution 22, 42-47.

Araújo, M.B., Thuiller, W., Williams, P.H., Reginster, I., 2005. Downscaling European species atlas distributions to a finer resolution: implications for conservation planning. Global Ecology and Biogeography 14, 17-30.

Austin, M., 2007. Species distribution models and ecological theory: A critical assessment and some possible new approaches. Ecological Modelling 200, 1-19.

Austin, M.P., Belbin, L., Meyers, J.A., Doherty, M.D., Luoto, M., 2006. Evaluation of statistical models used for predicting plant species distributions: Role of artificial data and theory. Ecological Modelling 199, 197-216.

Bakkenes, M., Alkemade, J.R.M., Ihle, F., et al., 2002. Assessing effects of forecasted climate change on the diversity and distribution of European higher plants for 2050. Global Change Biology 8, 390-407.

Berg, A., Ehnstrom, B., Gustafsson, L., Hallingback, T., Jonsell, M., Weslien, J., 1994. Threatened Plant, Animal, and Fungus Species in Swedish Forests - Distribution and Habitat Associations. Conservation Biology 8, 718-731.

Breiman, L., 2001. Random forests. Machine Learning 45, 5-32.

Breiman, L., Friedman, J.H., Olshen, R.A., Stone, C.J., 1984. Classification and regression trees. New York.

Buse, J., Schroder, B., Assmann, T., 2007. Modelling habitat and spatial distribution of an endangered longhorn beetle - A case study for saproxylic insect conservation. Biological Conservation 137, 372-381.

Cabeza, M., Araújo, M.B., Wilson, R.J., Thomas, C.D., Cowley, M.J.R., Moilanen, A., 2004. Combining probabilities of occurrence with spatial reserve design. Journal of Applied Ecology 41, 252-262.

CEC, 1994. CORINE Land Cover technical guide. European Commission, Luxembourg.

Chefaoui, R.M., Hortal, J., Lobo, J.M., 2005. Potential distribution modelling, niche characterization and conservation status assessment using GIS tools: a case study of Iberian Copris species. Biological Conservation 122, 327-338.

Cohen, J., 1960. A coefficient of agreement for nominal scales. Educational and Psychological Measurement 20, 37-46.

Davies, Z.G., Tyler, C., Stewart, G.B., Pullin, A.S., 2006. Are current management recommendations for conserving saproxylic invertebrates effective? Centre for Evidence-Based Conservation, University of Birmingham, Birmingham, UK.

Decleer, K., 2007. Europees beschermde natuur in Vlaanderen en het Belgisch deel van de Noordzee. Habitattypen | Dier- en plantensoorten. Instituut voor Natuur- en Bosonderzoek, Brussels.

Dennis, R.L.H., Hardy, P.B., 1999. Targeting squares for survey: predicting species richness and incidence of species for a butterfly atlas. Global Ecology and Biogeography Letters 8, 443-454.

Dennis, R.L.H., Thomas, C.D., 2000. Bias in butterfly distribution maps: the influence of hot spots and recorder's home range. Journal of Insect Conservation 4, 73-77.

Dufrêne, M., Legendre, P., 1991. Geographic structure and potential ecological factors in Belgium. Journal of Biogeography 18, 257-266.

Elith, J., Graham, C.H., Anderson, R.P., Dudik, M., Ferrier, S., Guisan, A., Hijmans, R.J., Huettmann, F., Leathwick, J.R., Lehmann, A., Li, J., Lohmann, L.G., Loiselle, B.A., Manion, G., Moritz, C., Nakamura, M., Nakazawa, Y., Overton, J.M., Peterson, A.T., Phillips, S.J., Richardson, K., Scachetti-Pereira, R., Schapire, R.E., Soberon, J., Williams, S., Wisz, M.S., Zimmermann, N.E., 2006. Novel methods improve prediction of species' distributions from occurrence data. Ecography 29, 129-151.

Engler, R., Guisan, A., Rechsteiner, L., 2004. An improved approach for predicting the distribution of rare endangered species from occurrence and pseudo-absence data. Journal of Applied Ecology 41, 263-274.

Fielding, A.H., Bell, J.F., 1997. A review of methods for the assessment of prediction errors in conservation presence/absence models. Environmental Conservation 24, 38-49.

Fowles, A.P., Alexander, K.N.A., Key, R.S., 1999. The saproxylic quality index: evaluating wooded habitats for the conservation of dead-wood Coleoptera. Coleopterist 8, 121-141.

Franc, N., Gotmark, F., Okland, B., Norden, B., Paltto, H., 2007. Factors and scales potentially important for saproxylic beetles in temperate mixed oak forest. Biological Conservation 135, 86-98.

Friedman, J., Hastie, T., Tibshirani, R., 2000. Additive logistic regression: A statistical view of boosting. Annals of Statistics 28, 337-374.

Friedman, J.H., 1991. Multivariate Adaptive Regression Splines. Annals of Statistics 19, 1-67.

Grove, S.J., 2002. Saproxylic insect ecology and the sustainable management of forests. Annual Review of Ecology and Systematics 33, 1-23.

Guisan, A., Zimmermann, N.E., 2000. Predictive habitat distribution models in ecology. Ecological Modelling 135, 147-186.

Harrison, P.A., Berry, P.M., Butt, N., New, M., 2006. Modelling climate change impacts on species' distributions at the European scale: implications for conservation policy. Environmental Science & Policy 9, 116-128.

Hastie, T., Tibshirani, R., 1987. Generalized additive models: some applications. Journal of the American Statistical Association 82, 371-386.

Hastie, T., Tibshirani, R., Buja, A., 1994. Flexible Discriminant-Analysis by Optimal Scoring. Journal of the American Statistical Association 89, 1255-1270.

Hawes, C.J., 2004. The Stag beetle Lucanus cervus (L.) (Coleoptera: Lucanidae) in the County of Suffolk (England): Distribution and Monitoring. Proceedings of the 3rd Symposium and Workshop on the Conservation of Saproxylic Beetles (07th-11th July 2004), Riga, Latvia, pp. 51-67.

Heikkinen, R.K., Luoto, M., Kuussaari, M., Toivonen, T., 2007. Modelling the spatial distribution of a threatened butterfly: Impacts of scale and statistical technique. Landscape and Urban Planning 79, 347-357.

Hortal, J., Lobo, J.M., Jimenez-Valverde, A., 2007. Limitations of biodiversity databases: Case study on seed-plant diversity in Tenerife, Canary Islands. Conservation Biology 21, 853-863.

Janssens, A., 1960. Faune de Belgique: Insectes Coléoptères Lamellicornes. Koninklijk Belgisch Instituut voor Natuurwetenschappen, Brussel.

Jiménez-Valverde, A., Lobo, J.M., 2006. The ghost of unbalanced species distribution data in geographical model predictions. Diversity and Distributions 12, 521-524.

Jiménez-Valverde, A., Lobo, J.M., 2007. Threshold criteria for conversion of probability of species presence to either-or presence-absence. Acta Oecologica-International Journal of Ecology 31, 361-369.

Kareiva, P., Levin, S.A., 2003. The Importance of Species: Perspectives on Expendability and Triage. Princeton University Press, Princeton.

Lawler, J.J., White, D., Neilson, R.P., Blaustein, A.R., 2006. Predicting climate-induced range shifts: model differences and model reliability. Global Change Biology 12, 1568-1584.

Leclercq, J., Gaspar, C., Verstraeten, C., 1973. Atlas provisoire des Insectes de Belgique (et des régions limitrophes). Faculte des sciences agronomiques de l'etat, zoologie generale et faunistique, Gembloux.

Liu, C.R., Berry, P.M., Dawson, T.P., Pearson, R.G., 2005. Selecting thresholds of occurrence in the prediction of species distributions. Ecography 28, 385-393.

Lobo, J.M., Jimenez-Valverde, A., Real, R., 2007. AUC: a misleading measure of the performance of predictive distribution models. Global Ecology and Biogeography, in press.

Lobo, J.M., Verdu, J.R., Numa, C., 2006. Environmental and geographical factors affecting the Iberian distribution of flightless Jekelius species (Coleoptera: Geotrupidae). Diversity and Distributions 12, 179-188.

Luce, J.-M., 1996. Lucanus cervus (Linnaeus, 1758). In: eds. P. J. van Helsdingen, L. Willemse, & M. C. D. Speight, Background information on invertebrates of the Habitat Directive and the Bern Convention, pp. 53-58.

Luoto, M., Kuussaari, M., Toivonen, T., 2002. Modelling butterfly distribution based on remote sensing data. Journal of Biogeography 29, 1027-1037.

Mac Nally, R., 2002. Multiple regression and inference in ecology and conservation biology: further comments on identifying important predictor variables. Biodiversity and Conservation 11, 1397-1401.

Maes, D., Bauwens, D., De Bruyn, L., Anselin, A., Vermeersch, G., Van Landuyt, W., De Knijf, G., Gilbert, M., 2005. Species richness coincidence: conservation strategies based on predictive modelling. Biodiversity and Conservation 14, 1345-1364.

Maes, D., Gilbert, M., Titeux, N., Goffart, P., Dennis, R.L.H., 2003. Prediction of butterfly diversity hotspots in Belgium: a comparison of statistically-focused and land use-focused models. Journal of Biogeography 30, 1907-1920.

Maes, D., Vanreusel, W., Talloen, W., Van Dyck, H., 2004. Functional conservation units for the endangered Alcon Blue butterfly Maculinea alcon in Belgium (Lepidoptera, Lycaenidae). Biological Conservation 120, 229-241.

Manel, S., Williams, H.C., Ormerod, S.J., 2001. Evaluating presence-absence models in ecology: the need to account for prevalence. Journal of Applied Ecology 38, 921-931.

McCullagh, P., Nelder, J.A., 1989. Generalized linear models. Chapman & Hall, London.

McGeoch, M.A., 1998. The selection, testing and application of terrestrial insects as bioindicators. Biological Reviews of the Cambridge Philosophical Society 73, 181-201.

Meynard, C.N., Quinn, J.F., 2007. Predicting species distributions: a critical comparison of the most common statistical models using artificial species. Journal of Biogeography 34, 1455-1469.

Napier, D., 2003. The Great Stag Hunt: Methods and findings of the 1998 National Stag Beetle Survey. In: ed. People's Trust for Endangered Species, Proceedings of the second pan-European conference on Saproxylic Beetles, London, pp. 32-35.

Nunes de Lima, V., 2005. CORINE Land Cover updating for the year 2000. European Commission, Ispra.

OECD, 1998. Environmental performance reviews Belgium. OECD Editions, Paris.

Olden, J.D., Jackson, D.A., 2000. Torturing data for the sake of generality: How valid are our regression models? Ecoscience 7, 501-510.

Pearson, R.G., Raxworthy, C.J., Nakamura, M., Peterson, A.T., 2007. Predicting species distributions from small numbers of occurrence records: a test case using cryptic geckos in Madagascar. Journal of Biogeography 34, 102-117.

Percy, C., Bassford, G., Keeble, V., 2000. Findings of the 1998 National Stag Beetle Survey. London.

Peters, J., De Baets, B., Verhoest, N.E.C., Samson, R., Degroeve, S., De Becker, P., Huybrechts, W., 2007. Random forests as a tool for ecohydrological distribution modelling. Ecological Modelling 207, 304-318.

Pickett, S.T.A., Parker, V.T., Fiedler, P.L., 1992. The new paradigm in ecology: implications for conservation biology above the species level. In: eds. P. L. Fiedler & S. K. Jain, Conservation biology: the theory and practice of nature conservation preservation and management. Chapman & Hall, New York, pp. 65-88.

Prasad, A., Iverson, L., Liaw, A., 2006. Newer Classification and Regression Tree Techniques: Bagging and Random Forests for Ecological Prediction. Ecosystems 9, 181-199.

Pratt, C.R., 2000. An investigation into the status history of the stag beetle Lucanus cervus (Linnaeus) (Lucanidae) in Sussex. Coleopterist 9, 75-90.

Prendergast, J.R., Quinn, R.M., Lawton, J.H., Eversham, B.C., Gibbons, D.W., 1993. Rare species, the coincidence of diversity hotspots and conservation strategies. Nature 365, 335-337.

Pullin, A.S., Knight, T.M., Stone, D.A., Charman, K., 2004. Do conservation managers use scientific evidence to support their decision-making? Biological Conservation 119, 245-252.

Ranius, T., 2002. Osmoderma eremita as an indicator of species richness of beetles in tree hollows. Biodiversity and Conservation 11, 931-941.

Ranius, T., Hedin, J., 2001. The dispersal rate of a beetle, Osmoderma eremita, living in tree hollows. Oecologia 126, 363-370.

Ranius, T., Kindvall, O., 2006. Extinction risk of wood-living model species in forest landscapes as related to forest history and conservation strategy. Landscape Ecology 21, 687-698.

Rink, M., Sinsch, U., 2007. Aktuelle Verbreitung des Hirschkäfers (Lucanus cervus) im nördlichen Rheinland-Pfalz mit Schwerpunkt Moseltal. Decheniana 160, in press.

Rink, M., Sinsch, U., 2006. Habitatpräferenzen des Hirschkäfers Lucanus cervus (Linnaeus, 1758) in der Kulturlandschaft - eine methodenkritische Analyse (Coleoptera: Lucanidae). Entomologische Zeitschrift 116, 228-234.

Rink, M., Sinsch, U., 2007. Radio-telemetric monitoring of dispersing stag beetles: implications for conservation. Journal of Zoology 272, 235-243.

Ripley, B.D., 1996. Pattern recognition and neural networks. Cambridge University Press, Cambridge.

Rushton, S.P., Ormerod, S.J., Kerby, G., 2004. New paradigms for modelling species distributions? Journal of Applied Ecology 41, 193-200.

Samways, M.J., 2005. Insect diversity conservation. University of Cambridge, Cambridge.

Schroeder, L.M., Ranius, T., Ekbom, B., Larsson, S., 2007. Spatial occurrence of a habitat-tracking saproxylic beetle inhabiting a managed forest landscape. Ecological Applications 17, 900-909.

Secretariat of the Convention on Biological Diversity, 2006. Global Biodiversity Outlook 2. Montreal.

Segurado, P., Araújo, M.B., 2004. An evaluation of methods for modelling species distributions. Journal of Biogeography 31, 1555-1568.

Smit, J.T., 2004. Inhaalslag verspreidingsonderzoek vliegend hert. Leiden.

Smith, M.N., 2003. National stag beetle survey 2002. London.

Speight, M.C.D., 1989. Saproxylic invertebrates and their conservation. Council of Europe, Strasbourg.

Sprecher-Uebersax, E., 2003. The status of Lucanus cervus in Switzerland. Proceedings of the second pan-European conference on Saproxylic Beetles, pp. 1-3.

Thomaes, A., Kervyn, T., Beck, O., Cammaerts, R., 2008. Distribution of Lucanus cervus in Belgium: surviving in a changing landscape (Coleoptera: Lucanidae). La Terre et la Vie-Revue d'Ecologie, in press.

Thomaes, A., Vandekerkhove, K., 2004. Ecologie en verspreiding van Vliegend hert in Vlaanderen. Instituut voor Bosbouw en Wildbeheer, Geraardsbergen.

Thomas, J.A., Telfer, M.G., Roy, D.B., Preston, C.D., Fox, R., Clarke, R.T., Lawton, J.H., 2004. Comparative losses in British butterflies, birds, and plants and the global extinction crisis. Science 303, 1879-1881.

Thuiller, W., 2003. BIOMOD - optimizing predictions of species distributions and projecting potential future shifts under global change. Global Change Biology 9, 1353-1362.

Tochtermann, E., 1992. Das "Spessartmodell" heute. Neue biologische Fakten und Problematik der Hirschkäferförderung. Allg. Forst Zeitschr. 47, 308-311.

Van Goethem, J., 2001. Second National Report of Belgium to the Convention on Biological Diversity. Royal Belgian Institute of Natural Sciences (RBINS), Brussels.

Vanreusel, W., Maes, D., Van Dyck, H., 2007. Transferability of Species Distribution Models: a Functional Habitat Approach for Two Regionally Threatened Butterflies. Conservation Biology 21, 201-212.

Wätzold, F., Schwerdtner, K., 2005. Why be wasteful when preserving a valuable resource? A review article on the cost-effectiveness of European biodiversity conservation policy. Biological Conservation 123, 327-338.

Whitehead, P.F., 1993. Lucanus cervus (L.) (Coleoptera: Lucanidae) in Worcestershire with a hypothesis for its distribution. Entomologist's Monthly Magazine 129, 206.

Willis, K.J., Araújo, M.B., Bennett, K.D., Figueroa-Rangel, B., Froyd, C.A., Myers, N., 2007. How can a knowledge of the past help to conserve the future? Biodiversity conservation and the relevance of long-term ecological studies. Philosophical Transactions of the Royal Society B-Biological Sciences 362, 175-186.

Wilson, K.A., Westphal, M.I., Possingham, H.P., Elith, J., 2005. Sensitivity of conservation planning to different approaches to using predicted distribution data. Biological Conservation 122, 99-112.

Table 1 - Average values ± StDev and the results of a t-test (t- and p-value) for the different variables for grid cells of 5 x 5 km with and without L. cervus. Variables are given in decreasing order of significance.

Variable                         Present          Absent           t         p
N                                73               283
Agriculture (ha)                 1041.6±573.9     1557.4±526.4     -7.326    <0.001
Urban area (ha)                  765.9±540.6      414.6±319.6      7.131     <0.001
Range in elevation (m)           108.9±75.5       65.7±63.3        4.983     <0.001
Soil suitability index (%)       67.41±23.49      57.25±24.72      3.063     0.002
Deciduous/mixed woodland (ha)    348.8±369.3      237.3±269.1      2.908     0.004
Coniferous woodland (ha)         257.4±283.7      196.3±306.8      1.539     0.125
Temperature (°C)                 14.09±0.67       13.90±1.11       1.402     0.162
Rainfall (mm y-1)                955±142          972±154          -0.866    0.387
Shrubland (ha)                   26.2±44.4        25.8±59.0        0.048     0.962

Table 2 - Best model according to the Area Under the Curve (AUC) of the Receiver Operating Characteristic (ROC) plots, sensitivity (% correctly predicted presences in the evaluation set), specificity (% correctly predicted absences in the evaluation set) and according to Cohen’s Kappa of the evaluation sets in the ten randomly generated data sets. GLM = Generalised Linear Models, RF = Random Forests, GAM = Generalised Additive Models, MDA = Mixture Discriminant Analysis.

RandomSet          1      2      3      4      5      6      7      8      9      10     Average±s.e.
Best model AUC     GLM    RF     GLM    RF     RF     GAM    RF     RF     RF     GLM
AUC                0.902  0.915  0.865  0.882  0.829  0.900  0.859  0.839  0.824  0.856  0.867±0.010
Sensitivity        81.8   86.4   86.4   81.8   68.2   81.8   68.2   63.6   77.3   59.1   75.7±3.1
Specificity        82.4   83.5   71.8   80.0   82.4   84.7   78.8   75.3   71.8   81.2   79.2±1.5

Best model Kappa   GAM    RF     GAM    RF     RF     RF     GAM    MDA    RF     GAM
Cohen’s Kappa      0.622  0.719  0.543  0.669  0.471  0.645  0.564  0.502  0.564  0.614  0.591±0.024
pKappa             0.485  0.460  0.509  0.410  0.390  0.570  0.290  0.440  0.520  0.529
Sensitivity        50.0   68.2   54.5   68.2   45.5   54.5   59.1   45.5   50.0   50.0   54.5±2.6
Specificity        95.3   97.6   90.6   95.3   95.3   98.8   91.8   90.6   97.6   97.6   95.1±1.0

Table 3 - Area (ha) of priority sites for the Stag Beetle in Belgium based on area and the already designated area of the sites. Bold = Priority1-site, Italic = Priority2-site, Underlined = Priority3-site. Between brackets is the number of woodlands in the different categories.

Prevalence threshold
                    Woodland area
%Area Natura2000    ≥100 ha         50-100 ha       <50 ha          Total
0-10                10 788 (47)     4 472 (64)      4 132 (153)     19 392 (37.3%)
10-50               13 841 (33)     1 093 (16)      709 (22)        15 649 (30.1%)
51-100              14 695 (39)     1 412 (20)      855 (31)        16 962 (32.6%)
Total               39 325 (119)    6 977 (100)     5 696 (206)     51 997 (425)

Kappa threshold
                    Woodland area
%Area Natura2000    ≥100 ha         50-100 ha       <50 ha          Total
0-10                2 431 (12)      2 038 (28)      1 244 (46)      5 713 (33.9%)
10-50               2 291 (11)      465 (7)         290 (9)         3 046 (18.1%)
51-100              7 400 (10)      260 (4)         421 (15)        8 081 (48.0%)
Total               12 122 (33)     2 763 (39)      1 955 (70)      16 840 (142)
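The designation shares discussed in Section 4.4 can be re-derived from the prevalence-threshold part of Table 3. The short Python sketch below does so under two assumptions that cannot be read from the table as extracted, because its bold/italic/underlined formatting is lost: that Priority1-sites are the woodlands of at least 50 ha with 0-10% of their area in Natura 2000, and that Priority2-sites are the woodlands of at least 50 ha with 10-50% of their area in Natura 2000. The 5 731 ha of already protected area is taken from the abstract. The function and variable names are illustrative.

```python
# Illustrative arithmetic only; the priority-class assignment is an assumption.

TOTAL_SUITABLE_HA = 51_997      # Table 3, prevalence threshold, grand total
ALREADY_PROTECTED_HA = 5_731    # protected under Natura 2000 (abstract)
PRIORITY1_HA = 10_788 + 4_472   # assumed: 0-10% row, >=100 ha and 50-100 ha
PRIORITY2_HA = 13_841 + 1_093   # assumed: 10-50% row, >=100 ha and 50-100 ha


def cumulative_shares(total_ha, protected_ha, additions_ha):
    """Percentage of total_ha designated before and after each addition."""
    running = protected_ha
    shares = [100 * running / total_ha]
    for area in additions_ha:
        running += area
        shares.append(100 * running / total_ha)
    return shares


shares = cumulative_shares(TOTAL_SUITABLE_HA, ALREADY_PROTECTED_HA,
                           [PRIORITY1_HA, PRIORITY2_HA])

for label, share in zip(["current", "+ Priority1", "+ Priority2"], shares):
    print(f"{label}: {share:.0f}%")
```

Under these assumptions the sketch gives 11%, 40% and 69%, which matches the percentages quoted in the text, and the assumed Priority1 area equals the 15 260 ha given in the abstract.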


Fig. 1 - Known present-day (since 1974, black squares) and historical (before 1974, white dots) distribution of Lucanus cervus and the ecological regions in Belgium.

Fig. 2 - Location of the grid cells that were used to model the distribution of Lucanus cervus in Belgium.

Fig. 3 - Predicted distribution (black dots) using the prevalence threshold (a) and the Kappa threshold (b) and the observed present-day distribution (grey squares) of Lucanus cervus in Belgium; Natura 2000 sites already designated where Lucanus cervus is present (c).

Fig. 4 - Priority sites for L. cervus in Belgium: Priority1-sites using the prevalence threshold (a1) and the Kappa threshold (a2); Priority2-sites using the prevalence threshold (b1) and the Kappa threshold (b2); Priority3-sites using the prevalence threshold (c1) and the Kappa threshold (c2).

Fig. 5 - Area of Priority1-sites to be designated using the prevalence (black bars) and the Kappa threshold (white bars) as a function of the number of times L. cervus was predicted as present in the ten model runs.
