Marine Species Distributions: from Data to Predictive Models
Samuel Bosch
Promoter: Prof. Dr. Olivier De Clerck

Thesis submitted in partial fulfilment of the requirements for the degree of Doctor (PhD) in Science – Biology
Academic year 2016-2017

Members of the examination committee
Prof. Dr. Olivier De Clerck – Ghent University (Promoter)*
Prof. Dr. Tom Moens – Ghent University (Chairman)
Prof. Dr. Elie Verleyen – Ghent University (Secretary)
Prof. Dr. Frederik Leliaert – Botanic Garden Meise / Ghent University
Dr. Tom Webb – University of Sheffield
Dr. Lennert Tyberghein – Vlaams Instituut voor de Zee
* non-voting members

Financial support
This thesis was funded by the ERANET INVASIVES project (EU FP7 SEAS-ERA/INVASIVES SD/ER/010) and by VLIZ as part of the Flemish contribution to the LifeWatch ESFRI.

Table of contents
Chapter 1 General Introduction
Chapter 2 Fishing for data and sorting the catch: assessing the data quality, completeness and fitness for use of data in marine biogeographic databases
Chapter 3 sdmpredictors: an R package for species distribution modelling predictor datasets
Chapter 4 In search of relevant predictors for marine species distribution modelling using the MarineSPEED benchmark dataset
Chapter 5 Spatio-temporal patterns of introduced seaweeds in European waters, a critical review
Chapter 6 A risk assessment of aquarium trade introductions of seaweed in European waters
Chapter 7 Modelling the past, present and future distribution of invasive seaweeds in Europe
Chapter 8 General discussion
References
Summary
Samenvatting
Acknowledgements

Chapter 1 General Introduction

Species distribution modelling
Throughout most of human history, knowledge of species diversity and of the distributions of species was essential for survival and civilization.
In this respect it is not surprising that the distributions of species, and the mechanisms governing them, have intrigued humans from the dawn of humanity to the present. Some of the earliest scientific writings about distributions of species and the interrelationships between species and the environment in which they live are found in History of Animals by Aristotle and Enquiry into Plants by Theophrastus (Mayr, 1985; Magner, 2002). More recently, the link between species distributions, geography and the physical environment was noted by the likes of von Humboldt and Bonpland (1805), Watson (1847), de Candolle (1855), Wallace (1876) and Grinnell (1904), gradually leading to the advent of the research fields of ecology and biogeography. While these early writings were merely qualitative and descriptive, in the 20th century ecology and biogeography, under the influence of e.g. Hutchinson, Elton, MacArthur and Wilson, became progressively infused with theory and mathematics, thereby paving the way for species distribution modelling.

The core of predictive modelling in geographic space involves the quantification of species-environment relationships (Guisan & Zimmermann, 2000). Species distribution modelling (SDM) as a separate research field emerged from the intersection of ecological gradient analysis (Whittaker et al., 1973), biogeography (Box, 1981), remote sensing and geographic information science (Franklin, 1995). SDM, also known as ecological niche modelling, habitat suitability modelling or climate envelope modelling, is now a widely used method in ecology, conservation biology, biogeography, paleoecology and global change biology.
Apart from modelling species distributions, the methods used in SDM have also been applied in other fields to model, for instance, the geographical variation in species traits (Briones et al., 2014), agricultural crops (Hijmans et al., 2003), wildfire frequency (Syphard et al., 2008) and the distribution of humans and Neanderthals (Banks et al., 2008; Benito et al., 2017).

Species distribution modelling is the process of using numerical tools to combine observations of species occurrence or abundance with environmental information (Elith & Leathwick, 2009). One of the core assumptions of SDM is that species are in equilibrium with their environment, whereby a species occupies all suitable habitats (Austin, 2002; O’Connor, 2002; Araújo & Peterson, 2012). Invasive species in theory violate this assumption because they are expected to still be expanding their range in the invaded region. These static, correlative models contrast with more dynamic models of ecosystem processes (Guisan & Theurillat, 2000; Guisan & Zimmermann, 2000). Disturbances of the equilibrium between a species and its environment due to migration, invasion or biotic interactions put a natural upper limit on the performance of species distribution models. Some authors attempt to overcome this by building mechanistic distribution models, which incorporate information on the response of a species to different environmental conditions (Kearney & Porter, 2009). Others combine environmental and species co-occurrence data into a joint species distribution model (Clark et al., 2014; Harris, 2015; Ovaskainen et al., 2015; Tikhonov et al., 2017). Alternatively, some studies include a temporal or dispersal component in their models (Zurell et al., 2009; Gutt et al., 2012; Génard & Lescourret, 2013; Mieszkowska et al., 2013; Hayes et al., 2015), resulting in models that better capture the dynamic aspect of the species distribution.
Such models, however, require additional knowledge of the dispersal capacity and ecophysiology of the species, which is often unavailable for marine species. Regardless of the flavour of the species distribution model, all models require an estimate of the species distribution, either through presence-only or presence-absence data, as well as environmental data relevant to the geography of the species. Evidently, inadequate estimates of the distribution and inadequate environmental information will reduce the quality of the SDM.

Niche
The theoretical framework of SDM is based on the ecological niche concept (Guisan & Zimmermann, 2000; Pulliam, 2000). Biologists commonly distinguish three types of niches: the “Grinnellian”, the “Eltonian” and the “Hutchinsonian”. Grinnell (1917) restricted the term niche to the climatic and habitat conditions where a species occurs (Pulliam, 2000; Guisan & Thuiller, 2005; Peterson et al., 2011). Elton (1927), on the other hand, viewed the niche as the functional role of a species in a community, especially its position in food webs, thereby highlighting species interactions. Lastly, Hutchinson (1957) defined the niche as “… the hypervolume defined by the environmental dimensions within which that species can survive and reproduce.” He furthermore distinguished the fundamental from the realized niche: the fundamental niche is the response of the species to the environment without taking species interactions into account, whereas the realized niche does take species interactions into account. As correlative species distribution modelling starts from the realized distribution in geographic space, it is only able to model the realized niche in environmental space (Austin, 2002; Colwell & Rangel, 2009). Translated to geographic space, this realized niche may include areas where the species is not present due to dispersal limitation or biotic interactions.
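The idea of a niche as a hypervolume in environmental space, estimated from the realized distribution, can be illustrated with a deliberately simple rectangular envelope. The sketch below is an illustration only and not a method used in this thesis: it assumes the hypervolume can be approximated by the minimum and maximum of each environmental variable at the presence sites. The variable names (`sst`, `salinity`), the function names and all values are hypothetical.

```python
from typing import Dict, List, Tuple

Conditions = Dict[str, float]
Envelope = Dict[str, Tuple[float, float]]


def fit_envelope(presences: List[Conditions]) -> Envelope:
    """Return the (min, max) range of each variable over the presence sites."""
    variables = presences[0].keys()
    return {
        var: (
            min(site[var] for site in presences),
            max(site[var] for site in presences),
        )
        for var in variables
    }


def is_suitable(envelope: Envelope, site: Conditions) -> bool:
    """A site is suitable if every variable falls inside the fitted range."""
    return all(low <= site[var] <= high for var, (low, high) in envelope.items())


# Hypothetical presence records: sea surface temperature (degrees C) and salinity (PSU).
presences = [
    {"sst": 12.0, "salinity": 34.5},
    {"sst": 15.5, "salinity": 35.1},
    {"sst": 18.0, "salinity": 33.8},
]

envelope = fit_envelope(presences)
inside = is_suitable(envelope, {"sst": 14.0, "salinity": 34.0})
outside = is_suitable(envelope, {"sst": 25.0, "salinity": 34.0})
print(envelope, inside, outside)
```

Because the envelope is fitted to observed presences only, it describes the realized niche at best. A single record from a site where the species cannot persist would widen the envelope and enlarge the area predicted as suitable.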
For species with source-sink populations, even modelling the realized niche can be problematic when occurrences are recorded in unsuitable geographic and environmental space (Pulliam, 2000; Austin, 2002). For instance, an invasive seaweed might be recorded shortly after its introduction in places where the year-round conditions are not suitable for its long-term survival and reproduction. If such a record is used in a species distribution model, it will lead to an overprediction of suitable areas.

Figure 1. BAM diagram adapted from Soberón (2007) and Peterson et al. (2011) with the suitable abiotic (A) and biotic (B) conditions and the accessible area (M). Solid circles represent source populations, open shapes absences. The intersection between A and B consists of GO, which is the occupied distributional area, and GI, which is the potentially invadable distributional area. Note that the species is absent in all areas except GO, but absences are depicted only in areas relevant for niche modelling. Open stars represent areas where the species is not competitive, open squares are areas with abiotically unsuitable conditions. G is the total area of the study.

Soberón and Peterson (2005) identified four factors determining the distribution of a species: abiotic conditions (A), biotic factors (B), regions that are accessible through dispersal (M) and the evolutionary capacity of the species to adapt to new conditions. Building on previous work by Pulliam (2000), Soberón and Peterson (2005) developed the BAM diagram, a Venn diagram representing all these distribution-determining factors except evolution (Fig. 1). The center of the BAM diagram is the area occupied by the species (GO). The invadable area, which is abiotically and biotically suitable but geographically not accessible through migration, is denoted as GI.
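The regions of the BAM diagram can be written as set operations: GO is the part of the intersection of A and B that lies within M, and GI is the part that lies outside M. The following minimal sketch assumes a study area G divided into numbered grid cells. The cell numbers and the membership of each set are hypothetical and serve only to make the relations concrete.

```python
# Hypothetical grid-cell identifiers for a small study area G.
G = set(range(1, 11))

A = {1, 2, 3, 4, 5, 6}  # cells with suitable abiotic conditions
B = {3, 4, 5, 6, 7, 8}  # cells with suitable biotic conditions
M = {1, 2, 3, 4, 9}     # cells accessible through dispersal

suitable = A & B        # abiotically and biotically suitable cells
G_O = suitable & M      # occupied distributional area
G_I = suitable - M      # potentially invadable distributional area

print(sorted(G_O), sorted(G_I))
```

In this toy example the occupied and invadable areas together make up the whole intersection of A and B, and they do not overlap, which matches the description of Figure 1.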
Distribution data
Most commonly used species distribution modelling algorithms require species occurrence and sometimes species absence data as modelling input. Ideally, such distribution data used for SDM comes from a well-designed biological survey, with accurately located species presences and absences from a known