
Biogeographical network analysis of plant species distribution in the Mediterranean region

Maxime Lenormand,1,∗ Guillaume Papuga,2,3,∗ Olivier Argagnon,2 Maxence Soubeyrand,1 Guilhem De Barros,2 Samuel Alleaume,1 and Sandra Luque1
1 Irstea, UMR TETIS, 500 rue JF Breton, 34093 Montpellier, France
2 Conservatoire botanique national méditerranéen de Porquerolles, Parc scientifique Agropolis, 2214 boulevard de la Lironde, 34980 Montferrier sur Lez, France
3 UMR 5175 CEFE, CNRS, 1919 route de Mende, 34293 Montpellier cedex 5, France

The delimitation of bioregions helps to understand historical and ecological drivers of species distribution. In this work, we performed a network analysis of the spatial distribution patterns of plants in the south of France (Languedoc-Roussillon and Provence-Alpes-Côte d'Azur) to analyze the biogeographical structure of the French Mediterranean flora at different scales. We used a network approach to identify and characterize biogeographical regions, based on a large database containing 2.5 million geolocalized plant records corresponding to more than 3,500 plant species. The methodology follows five steps: the construction of the biogeographical bipartite network, the identification of biogeographical regions in the form of spatial network communities, the analysis of their interactions, and the identification of clusters of plant species based on the species contribution to the biogeographical regions. First, we identified two sub-networks that distinguish Mediterranean and temperate biota. Then, we separated eight statistically significant bioregions that present a complex spatial structure. Some of them are spatially well delimited and match particular geological entities. On the other hand, fuzzy transitions arise between adjacent bioregions that share a common geological setting but are spread along a climatic gradient. The proposed network approach illustrates the biogeographical structure of the flora in southern France, and provides precise insights into the relationships between bioregions. This approach sheds light on ecological drivers shaping the distribution of Mediterranean biota: the interplay between a climatic gradient and geological substrate shapes biodiversity patterns. Finally, this work exemplifies why fragmented distributions are common in the Mediterranean region, isolating groups of species that share a similar eco-evolutionary history.

arXiv:1803.05275v3 [q-bio.PE] 7 Jan 2019

∗ Corresponding authors: [email protected] & guil-[email protected] who contributed equally to this work.

INTRODUCTION

The delimitation of biogeographical regions or bioregions based on the analysis of their biota has been a founding theme in biogeography, from the pioneering work of Wallace (1876), Murray (1866) or Wahlenberg (1812) to the most recent advances of Cheruvelil et al. (2017) and Ficetola et al. (2017). Describing spatial patterns of biodiversity has proved fundamental to understanding the historical diversification of biota, and to gaining a better understanding of the ecological factors that imprint spatial patterns of biodiversity (Graham & Hijmans, 2006; Ricklefs, 2004). Additionally, it has become a key element in the identification of spatial conservation strategies (Funk et al., 2002; Mikolajczak et al., 2015; Rushton et al., 2004). To divide a given territory into meaningful and coherent bioregions, the overall aim is to minimize the heterogeneity in taxonomic composition within regions, while maximizing differences between them (Kreft & Jetz, 2010; Stoddart, 1992). Although such delineation of bioregions has long been based on expert knowledge of qualitative data collection, the increasing availability of species-level distribution data and recent technological advances have allowed for the development of more rigorous frameworks (Kreft & Jetz, 2010). Multivariate methods, such as hierarchical clustering algorithms, have thus been successfully applied in a wide range of studies focused on a variety of organisms, at very different spatial scales (from a regional to a worldwide perspective). Yet, the production of detailed cartographic outputs portraying the differentiation of vegetation into distinct homogeneous bioregions remains difficult, especially where spatial heterogeneity of assemblages is associated with complex environmental gradients (Mikolajczak et al., 2015). Besides, the identification of meaningful and coherent bioregions represents only one step of biogeographical regionalization (Morrone, 2018). It is also crucial to propose new metrics to quantify the relationship between bioregions and to analyze species and spatial relationships.

Some regions of the world present inherent difficulties due to their highly diversified biota, reflecting complex eco-evolutionary processes. The Mediterranean basin is one of the largest and most important biodiversity hotspots in the world (Blondel et al., 2010; Myers et al., 2000). This region hosts about 25,000 plant species representing 10% of the world's total floristic richness concentrated on only 1% of the world's surface (Greuter, 1991). Additionally, a high level of narrow endemism is a major feature of this […] (Thompson, 2005). Endemism and richness result in a very heterogeneous region, where understanding the spatial patterns of plant distribution is key to gaining better insights into the past and present processes shaping biodiversity (Quézel, 1999). The onset of the Mediterranean climate during the Pliocene and the diverse glacial periods of the Pleistocene (Quézel & Médail, 2004) have shaped the most important phases of plant evolution since the Tertiary (Thompson, 2005). Additionally, due to a long history of human presence, contemporary flora has been widely influenced by human-mediated dispersal, land use and other pressures (Dahlin et al., 2014; Fenu et al., 2014).

The French Mediterranean area stretches from the Pyrenees in the south-west to the slopes of the Maritime and Ligurian Alps in the east. It encompasses three zones highlighted as glacial refugia (Médail & Diadema, 2009), and the eastern sector represents one of the ten main biodiversity hotspots in the Mediterranean area (Médail & Quézel, 1997). This area represents the northern limit of the Mediterranean climate in the western basin, and thus constitutes a climatic transition from a Mediterranean zone that has a summer drought to a temperate zone less prone to summer drought (Walter & Breckle, 1991, 1994). On a finer scale, the climate is more complex, with several subtypes and intricate boundaries (Joly et al., 2010; Tassin, 2017). Several works have tried to map the distribution of biogeographical entities. To date, no statistical analysis has been run to test the reliability of those expert-based maps against up-to-date plant records.

In order to depict spatial structure in such a complex regional flora, a large dataset is required. While the level of diversity and complexity of such a dataset may appear overwhelming at first glance, the emergence of network-based approaches has opened new paths for identifying and delimiting bioregions where the presence-absence matrix is represented by a bipartite network. For example, Kougioumoutzis et al. (2014) applied the NetCarto algorithm (Guimerà & Nunes Amaral, 2005) in order to identify biogeographical modules within the phytogeographical area of the Cyclades. Similarly, Vilhena & Antonelli (2015) proposed a network approach for delimiting biogeographical regions based on the InfoMap algorithm (Rosvall & Bergstrom, 2008). By embedding species distributional data into complex networks, these methods have the great advantage of being generic and flexible, and of incorporating several scales in the analysis. Most importantly, these methods integrate species communities and spatial units within a single framework, which makes it possible to test the relative contribution of each taxon to the bioregions depicted, and to represent the relationship between those bioregions based on those contributions.

In this study, we present a biogeographical network analysis of plant species distribution in the French Mediterranean area at different scales. The French Mediterranean territory represents an interesting study area to test new approaches, given the excellent knowledge of the spatial distribution of the plant species revealed by botanical inventories (Tison & Foucault, 2014; Tison et al., 2014) and the detailed databases compiled by the French National Botanic Conservatory of Porquerolles and the Alpine National Botanic Conservatory. The objective of this work is to delineate bioregions, identify groups of species and analyse the relationships between the two entities.

MATERIALS AND METHODS

Dataset and study area

The study area, situated in southern France, encompasses the former Languedoc-Roussillon region (five departments of the current Occitanie region: Pyrénées-Orientales, Aude, Hérault, Gard and Lozère) and the whole Provence-Alpes-Côte d'Azur region. It extends around the entire Mediterranean coastline of mainland France and inland, comprising almost all the Mediterranean hinterland, totalling 58,776 km² (Figure 1). The topography is structured by three major mountain ranges, the Pyrenees in the south-west, the Massif Central in the north-west and the Maritime Alps in the north-east. In between, the landscape is mostly hilly with some lowlands around rivers that flow into lagoons or marshy deltas such as the Camargue. The Rhône is the main structuring river and delimits western and eastern subregions. Acidic substrates and silicate soils are mainly found in the aforementioned mountain ranges and in the smaller Maures-Estérel range in southern Provence. The remaining part of the territory is dominated by calcareous or marly substrates (principally Cretaceous and Jurassic), with some significant alluvial zones and small volcanic areas.

The SILENE database¹ was created in 2006 and is the reference botanical database in the study area. It contains historical data gathered from the scientific literature and herbaria along with more recent data coming from public studies, partnerships, local amateur botanist networks and professional botanists of the Botanical Conservatory. Our analysis is based on 5 × 5 km² grid cells. We decided to retain only data whose georeferencing precision is below 10 meters. While the SILENE database contained nearly five million observations at the date of the export (June 2016), we deleted several taxa whose distribution is still insufficiently known and could distort the results (e.g. apomictic taxa such as Rubus or Hieracium). For the same reason, we also aggregated all sub-taxa at the species level. The final dataset comprises 4,263,734 vegetation plant samples corresponding to 3,697 plant species. We divided the study area using a UTM grid composed of 2,607 squares of lateral

¹ Conservatoire Botanique National Méditerranéen & Conservatoire Botanique National Alpin (Admin.). AAAA. SILENE-Flore [online]. http://flore.silene.eu (accessed 16/03/2018)

[Figure 1 map; color scale: #Species per grid cell, from 0 to 1000]

Figure 1. Distribution of the number of species per grid cell (l = 5 km). The inset shows a map of France including the studied area colored in red. An altitude map of the studied area is available in Appendix.

size l = 5 km. In order to assess the impact of the spatial resolution on the results (Divíšek et al., 2016; Lennon et al., 2001), we also applied the biogeographical network analysis with a grid composed of squares of lateral size l = 10 km (see Figure S2 and Table S1 in Appendix for more details).

Biogeographical network analysis

1. Biogeographical bipartite network. Delineating bioregions requires a link between the species studied and their spatial environment. This link is usually identified with presence-absence matrices where each row represents a grid cell and each column a species. The region of interest is usually divided into grid cells, the resolution of which depends mostly on the size of the study area, the taxonomic group under study and the accuracy of the data. According to the type and quality of data, but also to the research question, the species relevés can be aggregated spatially or by group of species. Another way of formalizing complex interactions between species and grid cells is to build a biogeographical bipartite network. This bipartite network enables us to model relations between two disjoint sets of nodes, grid cells and species (in our case), which are linked by the presence of a species (or a group of species) in a given grid cell during a certain time window (Step 1 in Figure 2). This way of understanding complex interactions makes it possible to visualize and analyze complex spatio-ecological systems as a whole, from individual interactions to local and global biogeographical properties.

2. Delineating bioregions. To identify bioregions we projected our biogeographical bipartite network on a spatial template (Step 2 in Figure 2), by defining a metric to measure the similarity of species composition between grid cells. Several measures based on beta diversity have been proposed to quantify the degree of (dis)similarity between grid cells, typically taking into account the number of shared species between grid cells (Koleff et al., 2003; Wilson & Shmida, 1984). These measures are mostly based on presence-absence data and aim at quantifying species turnover and species nestedness among grid cells, together or separately (Baselga, 2012). Although the Jaccard index may be influenced by gradients in species richness (Baselga, 2012; Dapporto et al., 2015; Lennon et al., 2001), the results obtained with it were more spatially coherent in our case.

The resulting network is a weighted undirected spatial network whose link intensities between grid cells range from 0, absence of a link (no species in common), to 1 (identical species composition). The detection

[Figure 2 schematic; panels 1 to 5; node sets: grid cells and plant species; matrix: test-value]

Figure 2. Steps of the biogeographical network analysis. 1. Biogeographical bipartite network where grid cells and species are linked by the presence of a species (or a group of species) in a given grid cell during a certain time window. Note that there is no link between nodes belonging to the same set. 2. The bipartite network is then spatially projected by using a similarity measure of species composition between grid cells. Bioregions are then identified with a network community detection algorithm. 3. The test-value matrix based on the contribution of species to bioregions is computed. 4. Then, a network of similarity between species is built, based on the test-value matrix. Groups of species sharing similar spatial features are identified using a community detection algorithm. 5. Finally, a coarse-grained biogeographical network unveiling the biogeographical structure of the studied area and the relationship between bioregions is obtained.
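Steps 1 and 2 of Figure 2 can be illustrated with a minimal sketch. It assumes that the bipartite network is stored as a binary presence-absence matrix (rows are grid cells, columns are species) and projects it onto grid cells with the Jaccard index, the measure retained in this study. The function name `jaccard_projection` and the toy matrix are hypothetical and serve only as an illustration; the community detection step (OSLOM in this study) would then be run on the resulting weighted network and is not shown.

```python
import numpy as np

def jaccard_projection(presence):
    """Project a cell-by-species presence matrix onto grid cells.

    presence: array of shape (n_cells, n_species), nonzero = species present.
    Returns an (n_cells, n_cells) symmetric matrix of Jaccard similarities:
    0 when two cells share no species, 1 when their composition is identical.
    """
    p = (np.asarray(presence) != 0).astype(int)
    shared = p @ p.T                      # species in common for each pair of cells
    richness = p.sum(axis=1)              # number of species per cell
    union = richness[:, None] + richness[None, :] - shared
    weights = np.zeros(shared.shape, dtype=float)
    np.divide(shared, union, out=weights, where=union > 0)
    np.fill_diagonal(weights, 0.0)        # assumption: no self-links in the network
    return weights

# Toy example (hypothetical data): 3 grid cells, 4 species.
toy = np.array([[1, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 0, 1, 0]])
weights = jaccard_projection(toy)
```

Here the first two cells share one species out of three present in either cell, so their link weight is 1/3, while the first and third cells share none and remain unlinked.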

of community structure in biogeographical networks is an interesting alternative approach to delineating bioregions (Kougioumoutzis et al., 2014; Vilhena & Antonelli, 2015). Community structure is indeed an important feature, revealing both the network's internal organization and similarity patterns among its individual elements. In this study we used the Order Statistics Local Optimization Method (OSLOM) (Lancichinetti et al., 2011). OSLOM uses an iterative process to detect statistically significant communities with respect to a global null model (i.e. random graph without community structure). The main characteristic of OSLOM is that it is based on a score used to quantify the statistical significance of a cluster in the network (Lancichinetti et al., 2010). The score is defined as the probability of finding the cluster in a random null model. The random null model used in OSLOM is the configuration model (Molloy & Reed, 1995), which generates random graphs while preserving an essential property of the network: the distribution of the number of neighbors of a node (i.e. the degree distribution). Therefore, the output of OSLOM consists of a collection of clusters that are unlikely to be found in an equivalent random network with the same degree sequence. This algorithm is nonparametric in the sense that it identifies the statistically significant partition without defining the number of communities a priori. However, the tolerance value that determines whether a cluster is significant or not might play an important role in the determination of the clusters found by OSLOM. The influence of this value, fixed initially, is however relevant only when the community structure of the network is not pronounced. When communities are well defined, as is usually the case in biogeography, the results of OSLOM do not depend on the particular choice of tolerance value (Lancichinetti et al., 2011). See Lancichinetti et al. (2011) for a comparison between OSLOM and other community detection algorithms.

3. Test-value matrix. To analyse the bioregions and their species composition, we rely on test-values measuring the under- or over-representation of species in a bioregion. Let us consider a studied area divided into $n$ grid cells, a species $i$ present in $n_i$ grid cells and a biogeographical region $j$ composed of $n_j$ grid cells. The test-value compares the actual number of grid cells $n_{ij}$, located in biogeographical region $j$ and supporting species $i$, with the average number $n_i n_j / n$ that would be expected if the species were uniformly distributed over the whole studied area. Since this quantity depends on $n_i$ and $n_j$, it is normalized by the standard deviation associated with the average expected number of grid cells (Lebart et al., 2000). The test-value $\rho_{ij}$ is then defined as

$$\rho_{ij} = \frac{n_{ij} - \frac{n_i n_j}{n}}{\sqrt{\frac{n - n_j}{n - 1}\left(1 - \frac{n_j}{n}\right)\frac{n_i n_j}{n}}} \tag{1}$$

The test-value $\rho_{ij}$ is negative if the species $i$ is under-represented in region $j$, equal to 0 if the species $i$ is present in region $j$ in the same proportion as in the whole study area, and positive if the species $i$ is over-represented in region $j$. In the latter case we consider that the species $i$ contributes positively to region $j$, and the level of contribution depends on the $\rho_{ij}$ value. Additionally, we consider that a plant species contributes positively and significantly to a bioregion $j$ if $\rho_{ij}$ is higher than a predetermined significance threshold $\delta$. Hence, the test-value matrix $\rho$ can be used to highlight the sets of species which best characterize the bioregions. The test-values are easy for specialists to interpret and represent a user-friendly way of ranking species according to their relevance.

4. Groups of species. The next step is to identify how similarities between species are spatially distributed across the study area. Here also we build a network, in which the similarity $s_{ii'}$ between two species $i$ and $i'$ is equal to

$$s_{ii'} = \frac{1}{1 + \sqrt{\sum_j \left(\rho_{ij} - \rho_{i'j}\right)^2}} \tag{2}$$

This similarity metric is based on the Euclidean distance between test-values for each pair of species. Again, the community detection algorithm OSLOM is used to detect significant groups of species sharing the same spatial features (Step 4 in Figure 2). This step produces a preliminary delimitation of the relationships between bioregions by identifying how the groups of species contribute to one or several bioregions.

5. Coarse-grained biogeographical network. To quantitatively characterize relationships between bioregions, we retained only the positive and significant species contributions by considering only test-values higher than $\delta = 1.96$ (5% two-sided significance level of a Gaussian distribution, i.e. 2.5% in the upper tail).

$$\rho^{+}_{ij} = \rho_{ij}\,\mathbb{1}_{\rho_{ij} > 1.96} \tag{3}$$

Then, since we are interested in interactions between bioregions, we focused on the way species contributions are distributed among regions by normalizing $\rho^{+}$ by row (Equation 4).

$$\hat{\rho}^{+}_{ij} = \frac{\rho^{+}_{ij}}{\sum_{j'} \rho^{+}_{ij'}} \tag{4}$$

We then determined, for each bioregion $j$, whether the species of the set $A_j = \{i \mid \rho_{ij} > 1.96\}$ that contribute to this biogeographical region are specific to it or also contribute to other regions (Equation 5).

$$\lambda_{jj'} = \frac{1}{|A_j|} \sum_{i \in A_j} \hat{\rho}^{+}_{ij'} \tag{5}$$

$\lambda_{jj'}$ therefore represents the average fraction of contribution to cluster $j'$ of species that contribute significantly to cluster $j$. The specificity of a biogeographical region is therefore measured with $\lambda_{jj}$, while the relationship with other regions is given by $\lambda_{jj'}$. It is important to note that, for a given region $j$, the vector $\lambda_{j\cdot}$ sums to one and can be expressed as a percentage.

At the end of the process, we obtain a coarse-grained biogeographical network summarizing the biogeographical structure of the study area. This network is composed of the bioregions and the species groups (Step 5 in Figure 2). All the metrics used to measure the similarity between the different bioregions are derived from the matrix of test-values $\rho$.

[Figure 3 plot; y-axis: PDF (0 to 0.004); x-axis: Degree (0 to 2000); curves: #Plant species per grid cell, #Grid cells per plant species]

Figure 3. Degree distributions of the biogeographical bipartite network. Probability density functions of the number of plant species per grid cell (in blue) and the number of cells covered per plant species (in red). Similar figures showing histograms instead of densities are available in Figure S13 in Appendix.

RESULTS

Biogeographical bipartite network

The bipartite network extracted from the database is composed of 2,607 5 × 5 km² grid cells and 3,697

[Figure 4 map; legend: 1 Gulf of Lion coast; 2 Cork oak zone; 3 Mediterranean lowlands; 4 Mediterranean border; 5 Cévennes sensu lato; 6 Subatlantic mountains; 7 Pre-Alps; 8 High mountains]

Figure 4. Bioregions based on similarity in plant species (l = 5 km). Eight bioregions have been identified. 1. Gulf of Lion coast in red. 2. Cork oak zone in orange. 3. Mediterranean lowlands in light green. 4. Mediterranean border in dark green. 5. Cévennes sensu lato in purple. 6. Subatlantic mountains in pink. 7. Pre-Alps and other medium mountains in yellow. 8. High mountains in brown. The inset shows a map of France including the studied area colored in red. An altitude map of the studied area is available in Appendix (Figure S14).

plant species, where the links represent the occurrence of plant species in the grid cells. Two network degree distributions can be associated with this network: the number of species per grid cell and the number of cells covered by each species. The probability density functions of these two distributions are displayed in Figure 3. The spatial component of the network is very dense. Most of the grid cells host between 200 and 500 plant species, with an average of 360 species per cell (i.e. ∼ 15 species/km²). On the species side, the situation is different: the majority of plant species cover less than 10% of the study area, which highlights the importance of range-restricted taxa. Nevertheless, the distribution exhibits a long tail with a non-negligible number of widespread species.

Delineating bioregions

We identified eight statistically significant bioregions reflecting the biogeographical structure of the French Mediterranean area based on plant species distribution (Figure 4). Cluster sizes vary from 120 to 807 square cells. Clusters are spatially coherent, exhibiting a connectivity measure always higher than 0.5 (i.e. ratio between the number of grid cells in the largest patch and the total number of grid cells (Turner et al., 2001)). The results obtained are not scale sensitive, and the spatial coherence of each cluster according to the scale (l = 5 and 10 km) can be found in Table S1 in Appendix. It is also important to note that this step can also be performed with standard hierarchical clustering methods. The results obtained with Ward's clustering are available in Appendix.

Groups of plant species

The test-value matrix can be used to identify plant species that contribute positively and significantly to one or more bioregions. It is worth noting that the number of contributions and their intensity vary among species. Indeed, some species contribute very little to only one region while other species contribute significantly to three or more regions. The number of species contributing to a given number of regions depends on the significance threshold δ. A very low (strongly negative) value of δ will imply that almost all plant

species contribute significantly to the 8 bioregions. In contrast, a very high value of δ will result in all species contributing to no regions. In order to get a better understanding of species contribution mechanisms and to assess the influence of δ, we plot in Figure 5 the fraction of species contributing positively to a given number of bioregions as a function of the significance threshold value. If we consider the default threshold δ = 1.96, which corresponds to a 2.5% one-sided significance level of a Gaussian distribution, we observe that the vast majority of plant species contribute positively to one or two regions, representing 35% and 45% of species, respectively. A further 20% of plant species contribute to three or more bioregions. If we increase the minimum level of contribution necessary to claim that a species contributes to a region, we see that the fraction of species contributing to two or more bioregions dramatically decreases while the fraction of species with no contribution increases. However, it is interesting to note that the fraction of species contributing to one region increases until reaching a plateau. This shows that 50% of plant species are strongly connected to a single region.

[Figure 5 plot; y-axis: Fraction of species (0 to 0.5); x-axis: Significance threshold α (0 to 10); one curve per number of bioregions: 0, 1, 2, 3, 4, 5+]

Figure 5. Fraction of species contributing positively and significantly to a given number of bioregions (from 0 to 5 or more) as a function of the significance threshold. The vertical line represents the significance threshold δ = 1.96.

The similarities between plant species' contributions to the 8 regions allowed us to identify 20 groups of species, and their contribution to each bioregion is displayed in Figure 7. We observed different patterns of contributions in terms of shape and intensity. This allows for the identification of groups of species sharing similar spatial features and highlights relationships between bioregions through the way plant species contribute to different groups of regions.

Relationships between bioregions

This leads us to the study of relationships between bioregions. The network of interactions $\lambda$ derived from the test-value matrix is plotted in Figure 6. We found that, globally, plant species contributing significantly to a region contribute mostly to this region, with an average specificity of 51% across the eight bioregions. It must be pointed out however that some regions are more specific than others, with $\lambda_{jj}$ values ranging from 40% to 65%.

Analysis of how bioregions connect with each other showed that there is no isolated region, in the sense that every region is connected with at least one other region, with a $\lambda_{jj'}$ value varying from 1 to 28%. Moreover, for all regions, the maximal $\lambda_{jj'}$ value is always higher than 10%. Although they generally are, the relationships are not necessarily symmetric, which represents an interesting way of detecting hierarchical relationships. A table displaying all $\lambda_{jj'}$ values is available in Table S4 in Appendix.

[Figure 6 network diagram; nodes: bioregions 1 to 8; links labelled with $\lambda_{jj'}$ values in percent]

Figure 6. Network of interactions between bioregions. $\lambda_{jj'}$, expressed here in percentage, represents the average fraction of contribution to cluster $j'$ of species that contribute significantly to cluster $j$. Only links with a value $\lambda_{jj'}$ higher than 10% are shown.

DISCUSSION

In this study we delineate spatial bioregions in southern France, a transition area between a Mediterranean and a temperate climate. The present analysis represents to our knowledge one of the largest network-based studies published to date, relying on a database containing more than four million data points across a territory of about 58,776 km². While this territory has previously been divided into bioregions on the basis of expert knowledge, we confront those approaches with a data-driven classification, and discuss the coherence of the

[Figure 7 plot; panels a to t, one per group of plant species; y-axis: Test-value (−10 to 40); x-axis: Biogeographical regions (1 to 8)]

Figure 7. Description of the groups of plant species. Boxplot of test-values according to the bioregions and the plant species groups. The horizontal line represents the significance threshold δ = 1.96. The number of plant species per group is available in Table S3 in Appendix.
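The test-values summarized in Figure 7 feed the two remaining steps of the method: the similarity between species (Equation 2) and the interactions between bioregions λ (Equations 3 to 5). The sketch below illustrates both, assuming that the test-value matrix is available as an array with species in rows and bioregions in columns. The function names `species_similarity` and `bioregion_interactions` and the toy matrix are hypothetical.

```python
import numpy as np

def species_similarity(rho):
    """s[i, i'] = 1 / (1 + Euclidean distance between test-value rows) (Equation 2)."""
    rho = np.asarray(rho, dtype=float)
    diff = rho[:, None, :] - rho[None, :, :]
    return 1.0 / (1.0 + np.sqrt((diff ** 2).sum(axis=2)))

def bioregion_interactions(rho, delta=1.96):
    """lam[j, j']: average share of contribution to j' of species significant for j."""
    rho = np.asarray(rho, dtype=float)
    rho_plus = np.where(rho > delta, rho, 0.0)                  # Equation 3
    totals = rho_plus.sum(axis=1, keepdims=True)
    rho_hat = np.zeros_like(rho_plus)
    np.divide(rho_plus, totals, out=rho_hat, where=totals > 0)  # Equation 4
    n_regions = rho.shape[1]
    lam = np.zeros((n_regions, n_regions))
    for j in range(n_regions):
        in_a_j = rho[:, j] > delta                              # species set A_j
        if in_a_j.any():
            lam[j] = rho_hat[in_a_j].mean(axis=0)               # Equation 5
    return lam

# Toy test-value matrix (hypothetical): 3 species, 2 bioregions.
toy_rho = np.array([[3.0, 0.0],
                    [2.0, 2.0],
                    [0.0, 4.0]])
sim = species_similarity(toy_rho)
lam = bioregion_interactions(toy_rho)
```

In this toy case each bioregion has a specificity of 75% and sends 25% of its contributions to the other one, and each row of `lam` sums to one. The pairwise array built in `species_similarity` grows with the square of the number of species, so for several thousand species a dedicated routine such as `scipy.spatial.distance.pdist` would probably be more memory-friendly.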

different perspectives. We delineated eight statistically significant bioregions, which we will first present in relation to previously published work, emphasizing their specificity regarding associated groups of species. We then discuss the observed spatial patterns in terms of ecological and historical drivers, to provide insights into the mechanisms driving the assemblage of vegetation communities.

Bioregions

The clustering approach identified eight statistically significant spatial clusters, which represent coherent territories detailed below. Regions are presented from Mediterranean toward temperate and mountainous climates.

1. Gulf of Lion coast is a bioregion that extends west of the Rhône, penetrating more inland around the […] of the Rhône Delta. The latter, along with the Languedoc lagoons, is frequently used as an example of azonal vegetation (Ozenda, 1994), and the originality of the flora and the vegetation of these areas has long been recognized (Molinier & Tallon, 1970). Some subdivisions have been suggested separating, even at a coarse scale, the sand-dune complex, the halophytic vegetation and the salt meadows (Bohn et al., 2000), but were not found here, probably due to the size of the cells we used. From a geological point of view, this bioregion is essentially made of sand dunes, lagoon sediments and modern alluvium. It is entirely situated under a Mediterranean climate, in the mesomediterranean climatic belt, with a dry season of two or three months in the summer (Rivas-Martínez et al., 2004a). Taxa specific to this cluster exhibit a distribution following the Mediterranean coastal area, extending in some cases towards other coastal areas or to arid inland zones. They are mostly encountered in halophytic communities and, surprisingly, much less in dunes, suggesting that the key factor defining this bioregion might be the saline soils rather than the coastal position alone. Several narrow endemics rely on those habitats, especially in the genus Limonium, whose rapid radiation is typical of Mediterranean neoendemics (Lledó et al., 2005).

2. Cork oak zone encompasses the Maures-Estérel range and neighbouring areas. West of the Rhône, it is fragmented, with cells in the eastern tip of the Pyrenees (low Albères and the Roussillon lowlands), plus a few more sparsely dispersed zones in Languedoc. The Provence and Albères areas have been identified by phytogeographers (Ozenda, 1994; Ozenda & Lucas, 1987) as the "Cork oak zone", a silicicolous warm mesomediterranean area. Indeed, climatic data show a clear summer dry period of one to two months. Almost all of the cells contain acidic soils over a variety of substrates (granites, gneiss, schists, sandstones, alluvial deposits, etc.). Species most linked to the "Cork oak zone" have a Mediterranean distribution, with some extending towards the Atlantic area. Characteristic species have ecological preferences for acid soils, and belong to various vegetation stages (forest, scrub or […] formations).

3. Mediterranean lowlands bioregion covers the hinterland of the Gulf of Lion from the Roussillon to western Provence. Several authors have individualized an arc-shaped mesomediterranean zone (Dupias & Rey, 1985; Ozenda, 1994) but their limits do not exactly fit ours. The closest match is the catalonian-provençal mesomediterranean holm oak forests unit of the European natural vegetation map (Bohn et al., 2000). The area is principally composed of sedimentary rocks (mostly limestones and marls) and alluvium. Its climate is Mediterranean (Rivas-Martínez et al., 2004a), with a summer dry period of one to three months. With few exceptions, species most linked to this bioregion have a distribution included in the Mediterranean region (Rivas-Martínez et al., 2004b). Most of them belong to communities of the Quercetea ilicis or of the former Thero-Brachypodietea, i.e. the matorral / forest and grassland communities making up the landscape locally called "garrigues". The other part of these taxa is usually found in disturbed communities, showing the strong incidence of human activities in this area.

4. Mediterranean border is a bioregion whose northern edge roughly follows the limit of the Mediterranean world as it is usually depicted (Dupias & Rey, 1985; Quézel & Médail, 2004). It broadly coincides with what has been called a supramediterranean belt (Ozenda, 1994) or a submediterranean zone (Bólos, 1961), and fits quite well with four mapping units of the Map of the natural vegetation of Europe (Bohn et al., 2000), namely the catalonian-provençal supramediterranean holm oak forests and three types of downy oak forests (ligurian-middle apennine, languedocian and those extending from the southern Pyrenees to the southwest pre-Alps). The substratum of this area is mainly calcareous and marly. This area has a short (one month) summer drought period with the exception of some Var and Alpes-Maritimes places where the summer drought is more pronounced […] and share a common […], occurring frequently in communities belonging to the Helianthemo italici-Aphyllanthion monspeliensis and to a lesser extent to the Ononidetalia striatae (Gaultier, 1989; Rivas-Martínez et al., 2002), i.e. dry dwarf scrubs and their associated […] on calcareous and marly eroded soils (Mucina et al., 2016).

5. Cévennes sensu lato is a bioregion in which most of the cells are situated in the Cévennes areas, while the remainder is scattered over the eastern Pyrenees piedmont and the Montagne Noire (southern limit of the Massif Central). This spatial cluster overlays four zones of the phyto-ecological regions (Dupias & Rey, 1985): the lower Cévennes, the "warm" Cévennes valleys, the Aspres and the chestnut zone of the southern edge of the Montagne Noire. The Cévennes proper part of this cluster has also been identified by other authors (Braun-Blanquet, 1923; Ozenda, 1994) and putative glacial refugia have been positioned there (Médail & Diadema, 2009). This area is not subject to a summer drought and covers siliceous substrata such as schists, granites or gneiss. Taxa exhibiting the strongest link to this biogeographical region are either Cévennes endemics, subendemics (Dupont, 2015; Lavergne et al., 2004) or plants with a more or less Atlantic distribution (Dupont, 2015), but no clear ecological pattern emerges among these taxa.

6. Subatlantic mountains The largest area covered by cells of this biogeographical region is the northern part of the Lozère department. The remaining cells are mostly distributed in the Massif Central and in the Pyrenees. These areas belong to the beech (Fagus sylvatica L.) montane belt (Bohn et al., 2000; Ozenda, 1994) with a few exceptions where Scots pines (Pinus sylvestris L.) dominate. It corresponds to the predominantly siliceous subatlantic type (Ozenda & Lucas, 1987), where the climate is rather wet, with precipitation frequently exceeding 1,000 mm per year and no dry period. Thus, […] and bogs are not rare, and the substratum is made of igneous rocks, which explains the acidic nature of the soils. The majority of the taxa most linked to this spatial cluster are generally distributed all over the eurosiberian region or the western part of this region, corresponding to a subatlantic distribution (Dupont, 2015; Rivas-Martínez et al., 2004b). Interestingly, most of those plants grow in wetland habitats, a trend already noticed in the Massif Central (Braun-Blanquet, 1923).

7. Pre-Alps and other medium mountains represent a bioregion whose cells are disseminated through the lower parts of the eastern Pyrenees including almost all the Pyrenean part of the Aude department, through the highest areas of the Causses, around the Mont Ventoux and through the most eastern part of the Pre-Alps.
This area has rarely (two months). Species most linked to this bioregion been individualised in such a way even if at a present a western eury-mediterranean distribution, European scale it can be related to several more or 10 less calcicolous beech or fir-beech forest belts (Bohn approach allowed to discriminate two “sub-networks” et al., 2000) (Abies alba L. and Fagus sylvatica L.), with little exchange regarding species composition and or more specifically, for the Var department, to a different relative contribution to each area, which pre-alpine district (Lavagne, 2008). Most of the rock globally relate to a temperate and a Mediterranean underlying this area is calcareous. Climatically, we sub-groups. Several earlier bioregionalizations in the are outside of the Mediterranean climate as there is have failed to separate mediter- no dry period. The distribution of taxa most linked ranean from eurosiberian ensembles, suggesting this to this biogeographical region is basically holarctic, boundary would be highly permeable (Garc´ıa-Barros avoiding the Mediterranean parts of Europe. Some et al., 2002; Saiz et al., 1998) and easily crossed by of these taxa also avoid the most Atlantic part of species. Here, the use of a precise dataset coupled the continent. Their ecology is varied, pertaining to with a network analyse has proven to be relevant to different stages (grasslands, shrubs, forests) of moun- depict such spatial transition, which reinforce the need tain vegetation series, often (but not systematically) to gather coherent dataset to characterize complex calcicolous. and intricate spatial structures. This biogeographi- cal boundary has been linked to a change in the an- 8. High mountains This bioregion regroups the nual distribution of precipitation, which induces a pro- highest part of the Alps and the Pyrenees. 
If most longed summer drought and a stronger climatic sea- authors agree on individualizing the upper vegetation sonality in the mediterranean (Antonelli, 2017). At belts of these mountain ranges, its unity and the a finer scale, the three mediterranean clusters present common points are less often identified (Ozenda, a high spatial coherence, and closely fit to the me- 2002). Both calcareous and acidic soils are to be somediterranean thermoclimatic belt (Rivas-Mart´ınez found in this area. Cells of this region are the et al., 2004b) (see Figure S11 in Appendix). The high coldest of our study area, and there is no dry period: congruence between climatic model (Rivas-Mart´ınez the climate is relatively harsh and the vegetation et al., 2004b) and biogeographic entities has never period is reduced (Ozenda, 2002) compared to the been pictured by previous bioregionalization works other clusters. Taxa most linked to this region are (see Appendix for maps), as most of them presented a mainly European mountains endemics, venturing wider definition of the mediterranean biome, extend- also in the . They belong to grasslands or ing northward. Then, the absence of orogenic bar- snowbeds communities, which is consistent with their riers along this climate-based distinction is likely to occurrence on the highest ranges. produce shallow boundaries typical of transition ar- eas (Antonelli, 2017; Ficetola et al., 2017) exemplified here by the cluster “Mediterranean border” that con- tains all historical attempt to delimitate the mediter- Species and spatial relationships among ranean biome. West of the Rhone, this region is rel- bioregions atively thin and fence around the mesomediterranean ensemble; east of the Rhone, it occupies a wide area Defining the Mediterranean region on the Alpine piedmont. 
Thus, instead of drawing a single line (Cox, 2001), we propose to identify a tran- At a global scale, the delimitation of the Mediter- sition area (Droissart et al., 2017; Latini et al., 2017) ranean border has been a long running question (La- with an upper boundary as the limit of the Mediter- tini et al., 2017), and the mismatch of the numerous ranean biome (Antonelli, 2017). attempts attest to the difficulties (Figures S5-12 in Appendix). In France, the first attempt goes back to third edition of the Flore Fran¸caiseby Lamarck & Vicariance and fragmentation among bioregions Candolle(1805), as shown in Ebach & Goujet(2006) followed by several other works such as Flahault & The relationship between bioregions can be seen Durand(1887), who considered the distribution limit through the understanding of species relative impor- of the olive tree (Olea europaea L.) as a marker of tance in each area. First, the regions “Gulf of Lion the Mediterranean biome. This was later generalized coast”, “Cork oak zone” and “Mediterranean low- to the evergreen oak belt (Qu´ezel, 1999), but it ap- lands”, all included within the same bioclimatic belt peared that the situation was more complex (Qu´ezel& (Rivas-Mart´ınez et al., 2004a), differ mostly on sub- M´edail, 2004). Thus, variability in results has not lead stratum, i.e. calcareous (bioregion 3), siliceous (biore- to a comprehensive framework yet. This has several gion 2) or quaternary deposits (bioregion 1). Thus, implications regarding conservation programs, as the they are well defined and little uncertainty exists con- delimitation mentioned by European legislation has cerning their spatial configuration (Figure S15 in Ap- been used as a reference to delimit the distribution of pendix); those three entities can be seen as climatic several protected habitat2. In this study, the network vicariant bioregions which have conjointly developed

2 http://www.eea.europa.eu/data-and-maps/ F50D9CF8-FFEE-475F-8A65-4E095512CBB7 (accessed the 04/07/2018) 11 on different geological substrates, or “islands”. As cal communities (Ricklefs, 1987). Indeed, our study a result, they share an important pool of species, area is at the crossroad of recolonization routes out of and present the highest complementarity in the net- two major refugia, i.e. the Iberic and Italian peninsu- work, as they are the only three clusters all related las (Hewitt, 2000), and represents an admixture zone to each other. In contrast, the relationship between for several mediterranean taxa (Lumaret et al., 2002). the “Cork oak zone” and the “C´evennes” exemplify Joint action of colonization-retraction sequences and the opposite process: those two areas share a simi- long term persistence within microrefugia has been lar bedrock (mainly acidic substrate) but are located suspected to generate fragmented distribution. Thus, at each extreme of the Mediterranean climatic gra- one particular feature of such climatic transition area dient. While the “Cork oak zone” is present under is the high proportion of population isolated at the pe- hot and dry mesomediterranean climate (some coastal riphery of their main range (Thompson, 2005), either cells even belonging to a thermomediterranean belt), at the rear or at the leading edge of their distribution the “C´evennes” present a higher impluvium and a very (Hampe & Petit, 2005). However, spatial patterns weak summer drought. Consequently, they share a alone do not inform on the evolutionary isolation of common set of species which interestingly are typical such populations, could it be of recent dispersal fol- of the “C´evennes” cluster, and extend into the “Cork lowing Last Glacial Maximum (Lumaret et al., 2002), oak zone”. 
Noteworthy point, those population can or long term persistence in a given refugia (M´edail constitute relictual rear edge populations, which of- & Diadema, 2009; Papuga et al., 2015). Thus, in- ten retain particular interest for conservation (Hampe tegrating phylogenies within bioregionalization would & Petit, 2005; Lavergne et al., 2006). prove informative to analyse historical events that Finally, the “Pre-Alps” and “High mountains” have shaped current spatial patterns of biodiversity bioregions are both present within the three moun- (Nieto Feliner, 2014), and capture the evolutionary tain chains, and occupy climatic conditions with no relationship among bioregions (Holt et al., 2013). dry period at all, and especially harsh prolonged win- Nevertheless, analysing the spatial organization of ter for the second. Several species groups are highly flora can help us to understand ecological factors that informative for both of those bioregions, which signify shape such bioregions. Orographic barriers and past that they share an important group of species globally tectonic movement are expected to have little im- adapted to mountain environment. “High mountains” pact on our study area, as no such events have oc- present the highest percentage of typical species. Yet, curred there since the onset of the Mediterranean cli- within the numerous plant species groups characteriz- mate in the Pliocene (Rosenbaum et al., 2002). In ing those entities (5 groups in Figure7), the relative our analysis, spatial structuration relies principally contribution of each toward one or the other bioregion on two elements. On the one hand, a climatic gra- might differ slightly, sometimes in association with an- dient from Mediterranean to temperate climate cre- other bioregion such as the “Mediterranean border” ates fuzzy spatial limits among adjacent groups, and (Figure7). 
This illustrates that groups of taxa are increases uncertainty when delimitating groups (Fig- unevenly important across these two regions, proba- ure S15 in Appendix). This is exemplified by the bly reflecting the complex geological substrate. Thus, spatial imbrication of “Mediterranean lowlands” and while our analysis reflect an overall homogeneity of “Mediterranean border”. On the other hand, geologi- mountain flora mainly driven by climate, it is likely cal variations can form sharp transitions creating im- that finer divisions based on a more precise study portant species turnover between places close apart. could be expected. This has been pinpointed by Bohn This is exemplified by the “Cork oak zone” whose spa- et al. (2000) who pictured a high local heterogeneity tial delimitation is very clear, due to the presence of due to steep altitudinal gradients and geological diver- an acidic substrate surrounded by places dominated sity, despite some vegetation groups shared between by calcareous-based rock. Interestingly, this area still the Alps and the Pyrenees. Therefore, a compara- shares an important part of its biota with other places tive analysis including a broader spatial perspective in the Mediterranean basin probably inherited from on those massif could improve our understanding of times where such geological islands formed a single the spatial structure of mountain flora in western Eu- ensemble, before the separation and later migration rope. of these islands (M´edail & Qu´ezel, 1997; Rosenbaum et al., 2002). The joint action of these two ecological factors has already been highlighted in previous biore- Eco-evolutionary factors driving the spatial organisation gionalization of the Mediterranean basin (Buira et al., of plant diversity 2017). 
As a result, complex geo-climatic variation have played a key role in shaping island-like territories The spatial distribution and species relative impor- which have fragmented species distributions, a factor tance for each bioregion can help us to better un- that has strong influence on populations character- derstand processes that have shaped Mediterranean istics both genetically and demographically (Pironon biota in the south of France. The regional species et al., 2017). pool results from several waves of colonization fol- The flora of the Mediterranean basin shows recur- lowing glacial cycles, constrained by ecological filters rent patterns of narrow endemism, species turnover that allowed taxa to persist and ultimately shaped lo- and highly disjunct distributions (Thompson, 2005). 12

While allopatric isolation has been suspected to be the main mechanism explaining the differentiation of taxa, the shared significance of different ecological variables (namely climate and geology) points out the combined importance of spatial isolation and heterogeneous selective pressures (Anacker & Strauss, 2014; Thompson, 2005). Additionally, recent studies have shown that this can be enhanced by small scale changes of the ecological niche (Lavergne et al., 2004; Papuga et al., 2018; Thompson et al., 2005). Contrary to other mediterranean [...] (e.g. South Africa and [...]), the mediterranean basin is marked by an active speciation, which has led to the high observed proportion of neoendemic species (Rundel et al., 2016). While evidence has accumulated concerning cryptic microrefugia for temperate trees (Stewart & Lister, 2001), little is known regarding mediterranean taxa, especially those that exhibit low dispersal capacity, a shared trait among mediterranean endemics (Lavergne et al., 2004). Thus, this bioregionalization sets the scene to investigate the shared phylogeographic legacy of the Mediterranean biota (Pușcaș & Choler, 2012), and to measure the evolutionary isolation of such communities that separate peripheral isolates from newly differentiated species (Crawford, 2010).

CONCLUSION

The quality of a bioregionalization is dependent on the data and the method used. To our knowledge, the present analysis constitutes the densest species-cells network analysed in a bioregionalization study at such a high spatial resolution. Therefore, results of this study demonstrate that new statistical methods based on network analysis can bring solutions to manage and analyse large databases, and provide efficient bioregionalization at different scales. New perspectives for bioregionalization will integrate community structure across different scales, in order to understand how deterministic (i.e. niche based) processes and stochastic events (dispersal, random extinction, ecological drift) interact to shape plant communities, from regional species pool to local assemblages (Chase & Myers, 2011).

ACKNOWLEDGEMENTS

This work was supported by a grant from the French National Research Agency (project NetCost, ANR-17-CE03-0003 grant). Partial financial support has been received from the French Ministère de la Transition Ecologique et Solidaire (MTES). We thank the Alpine National Botanic Conservatory for providing some of the Provence-Alpes-Côte d'Azur data. We wish to thank Christelle Hély-Alleaume, Virgile Noble and James Molina for useful discussions. Special thanks go to John D. Thompson for correcting the English and for interesting remarks.

DATA AVAILABILITY

Code and data are available at www.maximelenormand.com/Codes

REFERENCES

Anacker, B. L. & Strauss, S. Y. (2014) The geography and ecology of plant speciation: range overlap and niche divergence in sister species. Proceedings of the Royal Society of London B: Biological Sciences, 281, 20132980.

Antonelli, A. (2017) Biogeography: Drivers of bioregionalization. Nature Ecology & Evolution, 1, 0114.

Baselga, A. (2012) The relationship between species replacement, dissimilarity derived from nestedness, and nestedness. Global Ecology and Biogeography, 21, 1223–1232.

Blondel, J., Aronson, J., Bodiou, J.-Y. & Boeuf, G. (2010) The Mediterranean Region: Biological Diversity in Space and Time. Oxford University Press, Oxford, New York, second edition.

Bohn, U., Gollub, G. & Hettwer, C. (2000) Karte der natürlichen Vegetation Europas. Bundesamt für Naturschutz. Landwirtschaftsvlg Münster, Bonn, 1. Aufl. edition.

Bólos, O. (1961) La transición entre la depresión del Ebro y los Pirineos en el aspecto geobotánico. Anal. Inst. Bot. Cavanilles, 18, 99–254.

Braun-Blanquet, J. (1923) L'origine et le développement des flores dans le massif central de France; avec aperçu sur les migrations des flores dans l'Europe sud-occidentale. L. Lhomme, Paris.

Buira, A., Aedo, C. & Medina, L. (2017) Spatial patterns of the Iberian and Balearic endemic vascular flora. Biodiversity and Conservation, 26, 479–508.

Chase, J. M. & Myers, J. A. (2011) Disentangling the importance of ecological niches from stochastic processes across scales. Philosophical Transactions of the Royal Society of London. Series B, Biological Sciences, 366, 2351–2363.

Cheruvelil, K. S., Yuan, S., Webster, K. E., Tan, P.-N., Lapierre, J.-F., Collins, S. M., Fergus, C. E., Scott, C. E., Henry, E. N., Soranno, P. A., Filstrup, C. T. & Wagner, T. (2017) Creating multithemed ecological regions for macroscale ecology: Testing a flexible, repeatable, and accessible clustering method. Ecology and Evolution, 7, 3046–3058.

Cox, B. (2001) The biogeographic regions reconsidered. Journal of Biogeography, 28, 511–523.

Crawford, D. J. (2010) Progenitor-derivative species pairs and plant speciation. Taxon, 59, 1413–1423.

Dahlin, K. M., Asner, G. P. & Field, C. B. (2014) Linking vegetation patterns to environmental gradients and human impacts in a mediterranean-type island ecosystem. Landscape Ecology, 29, 1571–1585.

Dapporto, L., Ciolli, G., Dennis, R. L. H., Fox, R. & Shreeve, T. G. (2015) A new procedure for extrapolating turnover regionalization at mid-small spatial scales, tested on British butterflies. Methods in Ecology and Evolution, 6, 1287–1297.

Divíšek, J., Storch, D., Zelený, D. & Culek, M. (2016) Towards the spatial coherence of biogeographical regionalizations at subcontinental and landscape scales. Journal of Biogeography, 43, 2489–2501.

Droissart, V., Dauby, G., Hardy, O. J., Deblauwe, V., Harris, D. J., Janssens, S., Mackinder, B., Blach-Overgaard, A., Sonké, B., Sosef, M. M., Stévart, T., Svenning, J.-C., Wieringa, J. J. & Couvreur, T. L. P. (2017) Beyond trees: Biogeographical regionalization of tropical Africa. Journal of Biogeography.

Dupias, G. & Rey, P. (1985) Document pour un zonage des régions phyto-écologiques. Centre d'écologie des ressources renouvelables.

Dupont, P. (2015) Les plantes vasculaires atlantiques, les pyrénéo-cantabriques et les éléments floristiques voisins dans la péninsule ibérique et en France. Société botanique du Centre-Ouest.

Ebach, M. C. & Goujet, D. F. (2006) The first biogeographical map. Journal of Biogeography, 33, 761–769.

Fenu, G., Fois, M., Cañadas, E. M. & Bacchetta, G. (2014) Using endemic-plant distribution, geology and geomorphology in biogeography: the case of Sardinia (Mediterranean Basin). Systematics and Biodiversity, 12, 181–193.

Ficetola, G. F., Mazel, F. & Thuiller, W. (2017) Global determinants of zoogeographical boundaries. Nature Ecology & Evolution, 1, 0089.

Flahault, C. & Durand, M. (1887) Limite de la région méditerranéenne en France. Publications de la Société Linnéenne de Lyon, 5, 9–9.

Funk, V. A., Richardson, K. S. & Sakai, A. K. (2002) Systematic Data in Biodiversity Studies: Use It or Lose It. Systematic Biology, 51, 303–316.

García-Barros, E., Gurrea, P., Luciáñez, M. J., Cano, J. M., Munguira, M. L., Moreno, J. C., Sainz, H., Sanz, M. J. & Simón, J. C. (2002) Parsimony analysis of endemicity and its application to animal and plant geographical distributions in the Ibero-Balearic region (western Mediterranean). Journal of Biogeography, 29, 109–124.

Gaultier, C. (1989) Relations entre pelouses eurosibériennes (Festuco-Brometea Br.-Bl. et Tx. 43) et groupements méditerranéens (Ononido-Rosmarinetea Br.-Bl. 47) : étude régionale (Diois) et synthèse sur le pourtour méditerranéen Nord-occidental. PhD thesis, Université de Paris-Sud.

Graham, C. H. & Hijmans, R. J. (2006) A comparison of methods for mapping species ranges and species richness. Global Ecology and Biogeography, 15, 578–587.

Greuter, W. (1991) Botanical diversity, endemism, rarity and extinction in the Mediterranean area: an analysis based on the published volumes of MedChecklist. Botanika Chronika, 10, 63–79.

Guimerà, R. & Nunes Amaral, L. A. (2005) Functional cartography of complex metabolic networks. Nature, 433, 895–900.

Hampe, A. & Petit, R. J. (2005) Conserving biodiversity under climate change: the rear edge matters. Ecology Letters, 8, 461–467.

Hewitt, G. (2000) The genetic legacy of the Quaternary ice ages. Nature, 405, 907–913.

Holt, B. G., Lessard, J.-P., Borregaard, M. K., Fritz, S. A., Araújo, M. B., Dimitrov, D., Fabre, P.-H., Graham, C. H., Graves, G. R., Jønsson, K. A., Nogués-Bravo, D., Wang, Z., Whittaker, R. J., Fjeldså, J. & Rahbek, C. (2013) An Update of Wallace's Zoogeographic Regions of the World. Science, 339.

Joly, D., Brossard, T., Cardot, H., Cavailhes, J., Hilal, M. & Wavresky, P. (2010) Les types de climats en France, une construction spatiale. Cybergeo : European Journal of Geography.

Koleff, P., Gaston, J. & Lennon, J. (2003) Measuring beta diversity for presence-absence data. Journal of Animal Ecology, 72, 367–382.

Kougioumoutzis, K., Simaiakis, S. M. & Tiniakou, A. (2014) Network biogeographical analysis of the central Aegean archipelago. Journal of Biogeography, 41, 1848–1858.

Kreft, H. & Jetz, W. (2010) A framework for delineating biogeographical regions based on species distributions. Journal of Biogeography, 37, 2029–2053.

Lamarck, J. d. M. d. & Candolle, A. (1805) Flore française, ou descriptions succinctes de toutes les plantes qui croissent naturellement en France, disposées selon une nouvelle méthode d'analyse, et précédées par un exposé des principes élémentaires de la botanique. Paris.

Lancichinetti, A., Radicchi, F. & Ramasco, J. J. (2010) Statistical significance of communities in networks. Physical Review E, 81, 046110.

Lancichinetti, A., Radicchi, F., Ramasco, J. J. & Fortunato, S. (2011) Finding Statistically Significant Communities in Networks. PLOS ONE, 6, e18961.

Latini, M., Bartolucci, F., Conti, F., Iberite, M., Nicolella, G., Scoppola, A. & Abbate, G. (2017) Detecting Phytogeographic Units Based on Native Woody Flora: A Case Study in Central Peninsular Italy. The Botanical Review, 83, 253–281.

Lavagne, A. (2008) Synthèse phytogéographique du département du Var. In Le Var et sa flore. Plantes rares ou protégées. Naturalia Publications, Inflovar.

Lavergne, S., Molina, J. & Debussche, M. (2006) Fingerprints of environmental change on the rare mediterranean flora: a 115-year study. Global Change Biology, 12, 1466–1478.

Lavergne, S., Thompson, J. D., Garnier, E. & Debussche, M. (2004) The biology and ecology of narrow endemic and widespread plants: a comparative study of trait variation in 20 congeneric pairs. Oikos, 107, 505–518.

Lebart, L., Piron, M. & Morineau, A. (2000) Statistique exploratoire multidimensionnelle. Dunod, Paris.

Lennon, J. J., Koleff, P., Greenwood, J. J. D. & Gaston, K. J. (2001) The geographical structure of British bird distributions: Diversity, spatial turnover and scale. Journal of Animal Ecology, 70, 966–979.

Lledó, M. D., Crespo, M. B., Fay, M. F. & Chase, M. W. (2005) Molecular phylogenetics of Limonium and related genera (Plumbaginaceae): biogeographical and systematic implications. American Journal of Botany, 92, 1189–1198.

Lumaret, R., Mir, C., Michaud, H. & Raynal, V. (2002) Phylogeographical variation of chloroplast DNA in holm oak (Quercus ilex L.). Molecular Ecology, 11, 2327–2336.

Médail, F. & Diadema, K. (2009) Glacial refugia influence plant diversity patterns in the Mediterranean Basin. Journal of Biogeography, 36, 1333–1345.

Médail, F. & Quézel, P. (1997) Hot-Spots Analysis for Conservation of Plant Biodiversity in the Mediterranean Basin. Annals of the Missouri Botanical Garden, 84, 112–127.

Mikolajczak, A., Marchal, D., Sanz, T., Isenmann, M., Thierion, V. & Luque, S. (2015) Modelling spatial distributions of alpine vegetation: A graph theory approach to delineate ecologically-consistent species assemblages. Ecological Informatics, 30, 196–202.

Molinier, R. & Tallon, G. (1970) Prodrome des unités phytosociologiques observées en Camargue. Molinier.

Molloy, M. & Reed, B. (1995) A critical point for random graphs with a given degree sequence. Random Structures & Algorithms, 6, 161–180.

Morrone, J. J. (2018) The spectre of biogeographical regionalization. Journal of Biogeography, 105, 1118–1123.

Mucina, L., Bültmann, H., Dierßen, K., Theurillat, J.-P., Raus, T., Čarni, A., Šumberová, K., Willner, W., Dengler, J., García, R. G., Chytrý, M., Hájek, M., Di Pietro, R., Iakushenko, D., Pallas, J., Daniëls, F. J., Bergmeier, E., Santos Guerra, A., Ermakov, N., Valachovič, M., Schaminée, J. H. J., Lysenko, T., Didukh, Y. P., Pignatti, S., Rodwell, J. S., Capelo, J., Weber, H. E., Solomeshch, A., Dimopoulos, P., Aguiar, C., Hennekens, S. M. & Tichý, L. (2016) Vegetation of Europe: hierarchical floristic classification system of vascular plant, bryophyte, lichen, and algal communities. Applied Vegetation Science, 19, 3–264.

Murray, A. (1866) The geographical distribution of mammals. Day and Son, Limited, London.

Myers, N., Mittermeier, R. A., Mittermeier, C. G., da Fonseca, G. A. B. & Kent, J. (2000) Biodiversity hotspots for conservation priorities. Nature, 403, 853–858.

Nieto Feliner, G. (2014) Patterns and processes in plant phylogeography in the Mediterranean Basin. A review. Perspectives in Plant Ecology, Evolution and Systematics, 16, 265–278.

Ozenda, P. (1994) Végétation du continent Européen. Delachaux et Niestlé, Lausanne.

Ozenda, P. (2002) Perspectives pour une géobiologie des montagnes. Presses Polytechniques et Universitaires Romandes, Lausanne.

Ozenda, P. & Lucas, M. J. (1987) Esquisse d'une carte de la végétation potentielle de la France 1/1 500 000. Documents de cartographie écologique.

Papuga, G., Gauthier, P., Pons, V., Farris, E. & Thompson, J. (2018) Ecological niche differentiation in peripheral populations: a comparative analysis of eleven mediterranean plant species. Ecography, 176, 724–738.

Papuga, G., Gauthier, P., Ramos, J., Pons, V., Pironon, S., Farris, E. & Thompson, J. D. (2015) Range-Wide Variation in the Ecological Niche and Floral Polymorphism of the Western Mediterranean Geophyte Narcissus dubius Gouan. International Journal of Plant Sciences, 176, 724–738.

Pironon, S., Papuga, G., Villellas, J., Angert, A. L., García, M. B. & Thompson, J. D. (2017) Geographic variation in genetic and demographic performance: new insights from an old biogeographical paradigm. Biological Reviews, 92, 1877–1909.

Pușcaș, M. & Choler, P. (2012) A biogeographic delineation of the European Alpine System based on a cluster analysis of Carex curvula-dominated grasslands. Flora - Morphology, Distribution, Functional Ecology of Plants, 207, 168–178.

Quézel, P. (1999) Les grandes structures de végétation en région méditerranéenne : Facteurs déterminants dans leur mise en place post-glaciaire. Geobios, 32, 19–32.

Quézel, P. & Médail, F. (2004) Ecologie et biogéographie des forêts du bassin méditerranéen. Elsevier Masson, Paris.

Ricklefs, R. E. (1987) Community diversity: relative roles of local and regional processes. Science (New York, N.Y.), 235, 167–171.

Ricklefs, R. E. (2004) A comprehensive framework for global patterns in biodiversity. Ecology Letters, 7, 1–15.

Rivas-Martínez, S., Díaz, T., Fernández-González, F., Izco, J., Loidi, J., Lousã, M. & Penas, A. (2002) Vascular plant communities of Spain and Portugal: addenda to the syntaxonomical checklist of 2001. Part II. Itinera Geobot., 15(2), 433–922.

Rivas-Martínez, S., Penas, A. & Díaz, T. (2004a) Bioclimatic Map of Europe. University of León.

Rivas-Martínez, S., Penas, A. & Díaz, T. (2004b) Biogeographic Map of Europe. University of León.

Rosenbaum, G., Lister, G. S. & Duboz, C. (2002) Reconstruction of the tectonic evolution of the western Mediterranean since the Oligocene. Journal of the Virtual Explorer, 8, 107–130.

Rosvall, M. & Bergstrom, C. T. (2008) Maps of random walks on complex networks reveal community structure. Proceedings of the National Academy of Sciences, 105, 1118–1123.

Rousseeuw, P. J. (1987) Silhouettes: A graphical aid to the interpretation and validation of cluster analysis. Journal of Computational and Applied Mathematics, 20, 53–65.

Rundel, P. W., Arroyo, M., Cowling, R. M., Keeley, J. E., Lamont, B. B. & Vargas, P. (2016) Mediterranean Biomes: Evolution of Their Vegetation, Floras, and Climate. Annual Review of Ecology, Evolution, and Systematics, 47, 383–407.

Rushton, S. P., Ormerod, S. J. & Kerby, G. (2004) New paradigms for modelling species distributions? Journal of Applied Ecology, 41, 193–200.

Saiz, J. C. M., Parga, I. C. & Ollero, H. S. (1998) Numerical analyses of distributions of Iberian and Balearic endemic monocotyledons. Journal of Biogeography, 25, 179–194.

Stewart, J. R. & Lister, A. M. (2001) Cryptic northern refugia and the origins of the modern biota. Trends in Ecology & Evolution, 16, 608–613.

Stoddart, D. R. (1992) Biogeography of the Tropical Pacific. University of Hawaii Press.

Tassin, C. (2017) Paysages végétaux du domaine méditerranéen : Bassin méditerranéen, Californie, Chili central, Afrique du Sud, Australie méridionale. Référence. IRD Éditions, Marseille.

Thompson, J. D. (2005) Plant Evolution in the Mediterranean. Oxford University Press, Oxford, New York.

Thompson, J. D., Lavergne, S., Affre, L., Gaudeul, M. & Debussche, M. (2005) Ecological Differentiation of Mediterranean Endemic Plants. Taxon, 54, 967–976.

Tison, J.-M. & Foucault, B. d. (2014) Flora Gallica : Flore de France. Biotope Editions, Mèze.

Tison, J.-M., Jauzein, P. & Michaud, H. (2014) Flore de la France méditerranéenne continentale. Naturalia Publications, Turriers, 1re édition.

Turner, M. G., Gardner, R. H. & O'Neill, R. V. (2001) Landscape Ecology in Theory and Practice: Pattern and Process. Springer-Verlag.

Vilhena, D. A. & Antonelli, A. (2015) A network approach for identifying and delimiting biogeographical regions. Nature Communications, 6, 6848.

Wahlenberg, G. (1812) Flora Lapponica. Taberna libraria scholae realis, Berolini.

Wallace, A. R. (1876) The geographical distribution of animals : with a study of the relations of living and extinct faunas as elucidating the past changes of the earth's surface. Harper and Brothers, Publishers, New York.

Walter, H. & Breckle, S.-W. (1991) Ökologie der Erde. Band 4: Spezielle Ökologie der Gemäßigten und Arktischen Zonen außerhalb Euro-Nordasiens. Gustav Fischer Verlag.

Walter, H. & Breckle, S.-W. (1994) Ökologie der Erde. Band 3: Spezielle Ökologie der Gemäßigten und Arktischen Zonen außerhalb Euro-Nordasiens. Gustav Fischer Verlag.

Wilson, M. V. & Shmida, A. (1984) Measuring Beta Diversity with Presence-Absence Data. Journal of Ecology, 72, 1055–1064.

APPENDIX

Interactive web application

An interactive web application (https://maximelenormand.shinyapps.io/Biogeo/) has been designed to provide an easy-to-use interface for visualizing the results and the maps of the main paper and the Appendix (Figure S1). Its source code can be downloaded from www.maximelenormand.com/Codes.

Figure S1. Screenshot of the interactive web application.

Influence of scale on the biogeographical regions delineation

In order to assess the impact of the spatial resolution on the results, we also applied the analysis to a grid composed of squares of lateral size l = 10 km (Figure S2). The spatial coherence, defined as the ratio between the number of grid cells in the largest patch and the total number of grid cells (Turner et al., 2001), is displayed for both scales in Table S1.


Figure S2. Biogeographical regions based on similarity in plant species (l = 10 km). Eight biogeographical regions have been identified. 1. Gulf of Lion coast in red. 2. Cork oak zone in orange. 3. Mediterranean lowlands in light green. 4. Mediterranean border in dark green. 5. Cévennes sensu lato in purple. 6. Subatlantic mountains in pink. 7. Prealps and other medium mountains in yellow. 8. High mountains in brown.

Table S1. Spatial coherence of the biogeographical regions according to the scale.

Bioregion   n_i (l = 5 km)   SP_i (l = 5 km)   n_i (l = 10 km)   SP_i (l = 10 km)
1           170              0.63              63                0.75
2           183              0.73              47                0.87
3           529              0.86              124               0.83
4           807              0.57              164               0.51
5           120              0.50              45                0.31
6           152              0.70              48                0.79
7           400              0.67              114               0.75
8           246              0.78              110               0.77
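The spatial coherence reported in Table S1 can be illustrated with a minimal sketch. It assumes that each bioregion is given as a set of integer (row, column) grid coordinates and that a patch is a group of cells connected through shared edges; both the function name `spatial_coherence` and this adjacency rule are assumptions made for illustration, not details taken from the paper.

```python
from collections import deque


def spatial_coherence(cells):
    """Ratio between the size of the largest patch and the total number of cells.

    `cells` is an iterable of (row, col) integer grid coordinates belonging to
    one bioregion. Two cells are assumed to be in the same patch when they are
    connected through shared edges (4-neighbourhood); this is an assumption.
    """
    remaining = set(cells)
    total = len(remaining)
    if total == 0:
        raise ValueError("a bioregion must contain at least one cell")

    largest = 0
    while remaining:
        # Breadth-first search over one connected patch.
        start = remaining.pop()
        queue = deque([start])
        size = 1
        while queue:
            r, c = queue.popleft()
            for nb in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                if nb in remaining:
                    remaining.remove(nb)
                    queue.append(nb)
                    size += 1
        largest = max(largest, size)
    return largest / total


# Toy bioregion: one patch of three cells plus one isolated cell.
region = [(0, 0), (0, 1), (1, 1), (3, 3)]
print(spatial_coherence(region))  # 0.75
```

A value close to 1 indicates that most cells of the bioregion form a single contiguous patch, whereas a low value indicates a fragmented bioregion.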

Comparison of OSLOM to standard clustering methods

In this work, bioregions are delineated using the community detection algorithm OSLOM, applied to a weighted undirected spatial network in which the intensity of the links between grid cells is measured with the Jaccard similarity coefficient. This algorithm is nonparametric in the sense that it identifies statistically significant communities with respect to a global null model, and therefore the number of communities does not need to be defined a priori. In order to assess the accuracy of the method, we compared the results obtained with OSLOM with the ones obtained with standard hierarchical clustering methods. Note that these standard methods cannot be directly applied to the spatial network described above: we first need to transform the network into a dissimilarity matrix. Three different agglomeration methods have been tested: average (UPGMA), mcquitty (WPGMA) and Ward (see footnote 1). To choose the number of clusters, we used the average silhouette index $\bar{S}$ (Rousseeuw, 1987). For each cell g, we can compute a(g), the average dissimilarity of g (based on the Jaccard index in our case) with all the other cells in the cluster to which g belongs. In the same way, we can compute the average dissimilarities of g to the other clusters and define b(g) as the lowest average dissimilarity among them. Using these two quantities, we compute the silhouette index s(g), defined as

$$ s(g) = \frac{b(g) - a(g)}{\max\{a(g), b(g)\}} \tag{1} $$

which measures how well clustered g is. This measure ranges from −1 for a very poor clustering quality to 1 for an appropriately clustered g. We choose the number of clusters that maximizes the average silhouette index over all the n grid cells, $\bar{S} = \sum_{g=1}^{n} s(g)/n$. UPGMA and WPGMA failed to detect any coherent partition: most of the grid cells were gathered in a giant cluster, even when the number of clusters was increased significantly. Better results were obtained with Ward’s method. The average Silhouette index as a function of the number of clusters is shown in Figure S3.
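As an illustration of Equation (1), the following minimal sketch computes s(g) and the average silhouette for a given partition. It assumes that each grid cell is described by its set of species and that the dissimilarity is one minus the Jaccard similarity; the function names, the toy data and the convention s(g) = 0 for a cell alone in its cluster are assumptions made for this example, not the authors' implementation (which relies on R).

```python
def jaccard_dissimilarity(a, b):
    """One minus the Jaccard similarity between two sets of species (assumed)."""
    union = len(a | b)
    if union == 0:
        return 0.0
    return 1.0 - len(a & b) / union


def silhouette(cells, labels):
    """Return (s, S_bar): the silhouette of each cell and its average.

    cells:  dict mapping a cell id to its set of species
    labels: dict mapping a cell id to its cluster label
    """
    clusters = {}
    for g, lab in labels.items():
        clusters.setdefault(lab, []).append(g)
    if len(clusters) < 2:
        raise ValueError("the silhouette needs at least two clusters")

    s = {}
    for g in cells:
        own = labels[g]
        mean_d = {}
        for lab, members in clusters.items():
            others = [h for h in members if h != g]
            if others:
                total = sum(jaccard_dissimilarity(cells[g], cells[h]) for h in others)
                mean_d[lab] = total / len(others)
        if own not in mean_d:
            # Cell alone in its cluster: s(g) is set to 0 (assumed convention).
            s[g] = 0.0
            continue
        a = mean_d[own]                                         # a(g)
        b = min(d for lab, d in mean_d.items() if lab != own)   # b(g)
        denom = max(a, b)
        s[g] = 0.0 if denom == 0 else (b - a) / denom
    return s, sum(s.values()) / len(s)


# Hypothetical toy data: four cells and two candidate partitions.
cells = {
    "g1": {"A", "B"},
    "g2": {"A", "B", "C"},
    "g3": {"X", "Y"},
    "g4": {"X", "Y", "Z"},
}
partitions = {
    "good": {"g1": 0, "g2": 0, "g3": 1, "g4": 1},
    "bad": {"g1": 0, "g2": 1, "g3": 0, "g4": 1},
}
for name, labels in partitions.items():
    print(name, round(silhouette(cells, labels)[1], 3))

# Keep the partition with the highest average silhouette.
best = max(partitions, key=lambda name: silhouette(cells, partitions[name])[1])
print(best)  # good
```

In practice the candidate partitions would be obtained by cutting the hierarchical clustering tree at different numbers of clusters, and the same selection rule would be applied to them.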


Figure S3. Average Silhouette as a function of the number of clusters obtained with Ward’s clustering (in blue) and OSLOM (in red).

Two optimal partitions have been detected with the average Silhouette index. It is interesting to note that the number of clusters of the second partition is the same as the one automatically detected with OSLOM (Figure S3). A map of the eight optimal bioregions obtained with Ward’s method is displayed in Figure S4. In order to compare the two partitions, a contingency table between the partitions obtained with Ward’s method and OSLOM is shown in Table S2.

1 method = "average", "mcquitty" and "ward.D2" with the hclust R function


Figure S4. Biogeographical regions based on similarity in plant species obtained with Ward’s clustering (l = 5 km). Eight biogeographical regions have been identified.

Table S2. Contingency table between the partitions obtained with Ward (in rows) and OSLOM (in columns).

Bioregion   1     2     3     4     5     6     7     8
1           125   2     54    0     4     0     0     0
2           0     115   0     2     0     0     0     0
3           28    47    435   396   0     0     0     0
4           0     4     4     187   48    1     43    1
5           17    15    34    20    53    32    8     15
6           0     0     0     0     15    119   38    33
7           0     0     0     200   0     0     216   0
8           0     0     0     2     0     0     95    197
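A contingency table such as Table S2 can be built by counting, for every pair of labels, the grid cells that receive the first label in one partition and the second label in the other. The sketch below assumes that both partitions are given as lists of labels ordered by grid cell; the function name and the toy labels are hypothetical.

```python
from collections import Counter


def contingency_table(row_labels, col_labels):
    """Cross-tabulate two partitions of the same grid cells.

    row_labels[i] and col_labels[i] are the labels given to cell i by the
    two methods (for instance Ward in rows and OSLOM in columns).
    """
    if len(row_labels) != len(col_labels):
        raise ValueError("both partitions must label the same cells")
    counts = Counter(zip(row_labels, col_labels))
    rows = sorted(set(row_labels))
    cols = sorted(set(col_labels))
    table = [[counts[(r, c)] for c in cols] for r in rows]
    return rows, cols, table


# Hypothetical labels for six grid cells.
ward = [1, 1, 2, 2, 2, 3]
oslom = [1, 1, 1, 2, 2, 2]
rows, cols, table = contingency_table(ward, oslom)
for r, line in zip(rows, table):
    print(r, line)
```

Each column sum gives the number of cells assigned to the corresponding cluster of the second partition, which offers a quick consistency check against the cluster sizes.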

Comparison with other delineations

Lamarck (1805)

Figure S5. Comparison of the results obtained with OSLOM (l = 5 km) with Lamarck’s limit of the Mediterranean level (Ebach & Goujet, 2006; Lamarck & Candolle, 1805).

Flahault (1887)

Figure S6. Comparison of the results obtained with OSLOM (l = 5 km) with Flahault’s limit of the olive tree distribution (Flahault & Durand, 1887).

Ozenda (1994)

Figure S7. Comparison of the results obtained with OSLOM (l = 5 km) with Ozenda’s mediterranean/supramediterranean limit (Ozenda, 1994).

Julve (1999)

Figure S8. Comparison of the results obtained with OSLOM (l = 5 km) with Julve’s mediterranean/supramediterranean limit (?).

Bohn (2000)

Figure S9. Comparison of the results obtained with OSLOM (l = 5 km) with Bohn’s mediterranean/supramediterranean limit (Bohn et al., 2000).

Rivas-Martínez (2004): thermoclimatic limit

Figure S10. Comparison of the results obtained with OSLOM (l = 5 km) with Rivas-Martínez’s thermoclimatic limit (Rivas-Martínez et al., 2004a).

Rivas-Martínez (2004): biogeographical limit

Figure S11. Comparison of the results obtained with OSLOM (l = 5 km) with Rivas-Martínez’s biogeographical limit (Rivas-Martínez et al., 2004b).

Natura 2000 (2006)

Figure S12. Comparison of the results obtained with OSLOM (l = 5 km) with the Natura 2000 limit (?).

Supplementary Figures


Figure S13. Histograms of the degree distributions of the biogeographical bipartite network. Histogram of the number of plant species per grid cell (a) and the number of cells covered per plant species (b).
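The two degree distributions shown in Figure S13 can be obtained directly from the list of links of the bipartite network. The sketch below assumes that the input is a list of (grid cell, species) records and that repeated records of the same species in the same cell count as a single link; the function name, the toy records and this deduplication rule are assumptions made for illustration.

```python
from collections import Counter


def bipartite_degrees(records):
    """Degrees of both node types in a cell-species bipartite network.

    records: iterable of (cell_id, species) pairs. Duplicated pairs are
    collapsed, so a link only encodes presence (assumed here).
    """
    links = set(records)
    species_per_cell = Counter(cell for cell, _ in links)
    cells_per_species = Counter(species for _, species in links)
    return species_per_cell, cells_per_species


# Hypothetical records: species A was recorded twice in cell c1.
records = [("c1", "A"), ("c1", "B"), ("c1", "A"), ("c2", "A"), ("c3", "C")]
species_per_cell, cells_per_species = bipartite_degrees(records)
print(dict(species_per_cell))
print(dict(cells_per_species))

# Frequency of each degree value, i.e. the data behind a histogram like panel (a).
degree_frequencies = Counter(species_per_cell.values())
print(sorted(degree_frequencies.items()))  # [(1, 2), (2, 1)]
```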

Figure S14. Altitude map of the studied area (in meters; the color scale ranges from 0 to 3500).

Figure S15. Uncertainty map (l = 5 km) showing the quantity 1 − J2/J1 (color scale from 0.0 to 1.0). For a given cell, J1 represents the average Jaccard similarity index between this cell and all the cells that belong to its cluster, and J2 represents the average Jaccard similarity index between this cell and all the cells belonging to the second closest cluster (based on the Jaccard similarity).
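The quantity mapped in Figure S15 can be sketched as follows. The example assumes that each cell is described by its set of species, that the cell itself is excluded from the average J1, and that the second closest cluster is the other cluster with the highest average Jaccard similarity to the cell; the function names and the toy data are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity between two sets of species."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def one_minus_j2_over_j1(cell, cells, labels):
    """Compute 1 - J2/J1 for one cell.

    cells:  dict mapping a cell id to its set of species
    labels: dict mapping a cell id to its cluster label
    """
    own = labels[cell]
    sums, counts = {}, {}
    for other, species in cells.items():
        if other == cell:
            continue  # the cell is not compared with itself (assumption)
        lab = labels[other]
        sums[lab] = sums.get(lab, 0.0) + jaccard(cells[cell], species)
        counts[lab] = counts.get(lab, 0) + 1
    means = {lab: sums[lab] / counts[lab] for lab in sums}

    if own not in means or len(means) < 2:
        raise ValueError("needs another cell in the same cluster and a second cluster")
    j1 = means[own]
    j2 = max(value for lab, value in means.items() if lab != own)
    if j1 == 0:
        raise ValueError("J1 is zero, the ratio J2/J1 is undefined")
    return 1.0 - j2 / j1


# Hypothetical toy data: two clusters of two cells each.
cells = {
    "g1": {"A", "B"},
    "g2": {"A", "B", "C"},
    "g3": {"B", "C", "D"},
    "g4": {"C", "D"},
}
labels = {"g1": 0, "g2": 0, "g3": 1, "g4": 1}
print(round(one_minus_j2_over_j1("g1", cells, labels), 4))  # 0.8125
print(round(one_minus_j2_over_j1("g2", cells, labels), 4))  # 0.4375
```

In this toy case g2 shares more species with the other cluster than g1 does, so its value is lower: the closer the value is to 0, the more similar the cell is to its second closest cluster.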

Supplementary Tables

Table S3. Number of plant species per group.

Group   Number of species
a       445
b       149
c       230
d       299
e       169
f       277
g       37
h       180
i       242
j       136
k       125
l       95
m       180
n       180
o       178
p       186
q       44
r       59
s       212
t       274

Table S4. Network of interactions between biogeographical regions.

Bioregion   1      2      3      4      5      6      7      8
1           0.52   0.24   0.15   0.07   0      0.01   0.01   0
2           0.15   0.56   0.11   0.14   0.01   0.01   0.02   0
3           0.13   0.18   0.42   0.25   0      0      0.01   0
4           0.03   0.11   0.1    0.55   0.01   0.01   0.19   0.01
5           0.01   0.13   0      0.07   0.4    0.24   0.11   0.05
6           0.01   0.03   0      0.02   0.1    0.49   0.17   0.19
7           0      0.01   0      0.16   0.01   0.04   0.52   0.26
8           0      0      0      0.01   0.01   0.05   0.28   0.65