Functional traits, environmental variability and bioindication: fish communities of European rivers
M. Logez

To cite this version:

M. Logez. Traits fonctionnels, variabilité environnementale et bioindication : les communautés piscicoles des cours d’eau européens. Environmental sciences. Doctoral thesis in Environmental Sciences, Université de Provence, Aix-Marseille 1, 2010. In French. tel-02596713

HAL Id: tel-02596713 https://hal.inrae.fr/tel-02596713 Submitted on 15 May 2020

HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.

Thesis presented at the UNIVERSITÉ DE PROVENCE AIX-MARSEILLE 1

Doctoral School: ENVIRONMENTAL SCIENCES

To obtain the degree of DOCTOR
Specialty: ECOLOGY

Functional traits, environmental variability and bioindication: fish communities of European rivers

Maxime LOGEZ

Defended on 13 December 2010

Jury members:
Mr Rémi CHAPPAZ (Professor, Université de Provence), Chair
Mr Joël GUIOT (Research Director, CEREGE), Examiner
Mr Daniel HERING (Full Professor, University of Duisburg-Essen), Reviewer
Mr Robert HUGHES (Senior Research Professor, Oregon State University), Reviewer
Mr Bernard HUGUENY (Research Director, IRD), Thesis co-supervisor
Mr Didier PONT (Research Director, Cemagref), Thesis supervisor

Work carried out in the Running Water Ecosystems team (Ecosystèmes Eaux Courantes), Cemagref, Hydrobiology Research Unit (HYAX)

To my parents

Acknowledgements

This experience has taught me that a thesis is both a scientific and a human adventure, one that cannot be completed without the help, advice and support of the many people to whom I wish to say a big thank you.

The first person I wish to thank is Didier Pont. Didier, thank you for giving me the opportunity to pursue my wish to do a thesis and for believing in me. Thank you for your advice, your explanations and your more than precious help throughout these years, but above all thank you for being much more to me than a thesis supervisor. If I had to sum it up, I would simply say: thank you for everything!

I also wish to thank my thesis co-supervisor, Bernard Hugueny. Bernard, thank you for sharing your insight, for your unfailing availability, and for your advice and sound perspective on my thesis and the articles. Thank you also for sharing your immense and inimitable sense of humour.

I particularly wish to thank Daniel Hering and Robert Hughes for doing me the honour of reviewing this manuscript and for travelling so many kilometres to sit on my jury. Thanks also to Rémi Chappaz and Joël Guiot for agreeing to evaluate this work as members of my jury.

How could I not thank all the members of the European EFI+ project for what was, for me, a wonderful professional and personal experience. Among all these people, a big thank you to Teresa, Stefan, Andi, Rafi, Nico, Stéphane, Giuseppe, Enrico, Patrick, Pedro, Ian and Richard. Finally, very special thanks to Gerti for your kindness and for all the good times spent in your company, whether at Cemagref or during EFI+. Thank you also for having, without knowing it, helped me improve my English.

Thank you to all past and present members of the HYAX research unit, and especially those of the Running Water Ecosystems team, for the welcome you gave me throughout these years. Special thanks to Isabelle and Yann, who made sure this thesis went off without a hitch.

I wish to thank the members of my thesis steering committee for their scientific contributions: Joël Guiot, Stefan Schmutz and Jérôme Belliard.

I also want to thank those who, in their own way, made me want to do a thesis. Thank you to the members of the office across the hall at the University of Lyon: Asghar, Franck, Vincent, Yorick and Pablo. Among all these people, how could I not thank you in particular, Pierre. Life is strange and saw to it that our paths ran side by side during my DEA and my thesis. So Pierre, thank you for our heated discussions about statistics and R, for your more than enlightened advice, for turning every trip into an adventure, for our memorable fits of laughter, for the office decoration and, above all, thank you for your friendship.

Thanks also to Dim and Nico for that unforgettable DEA year. Others must still remember it!

Thank you, Renaud, for turning these years at Cemagref into one great burst of laughter. Thank you for your good humour, for your jokes in more or less good taste that make me laugh so much, and for the close friendship that binds us. Oh yes, and thank you for broadening my musical horizons…

A big thank you to Lionel for your refreshing naivety, for our many lunchtime crossword sessions and for our great bouts of silliness. Our musical laughing duo, the sea lion and the hyena, will not soon be forgotten at HYAX. Fair winds, my friend, in your new ministerial adventures.

I also wish to thank everyone who contributed, directly or indirectly, to this manuscript. So thank you to Oliv, Did, Clémence and her mum, and to my parents-in-law Brigitte and Yves. Very special thanks to Georges for your suggestions and for the fish drawings that brighten up some of the figures.

On a more personal note, I want to thank my whole family for their support and encouragement throughout this thesis, especially my mum, my little sister Soso and my brother-in-law P’tilou.

Thank you, Yann, my gone, my Biboune, who with a smile, a burst of laughter or even a glance makes me the happiest of dads. Every day with you and your mum is a delight. Thank you for being so sweet and calm, for growing up serenely and for letting me work peacefully during this final year of the thesis.

Finally, I thank Anne, my love, for being there throughout this adventure, for putting up with me and supporting me, for showing so much patience (oh yes…) and understanding, for doing everything so that I could carry out this thesis in the best possible conditions, and for giving me so much day after day, in good times and bad. And to finish: I promise, this is the first and the last!

Summary

The Water Framework Directive (WFD) provides a framework for managing and protecting surface waters, coastal waters, groundwater and transitional waters. One of the major advances of the WFD is that it takes the different biological compartments (algae, macrophytes, macroinvertebrates and fish) into account when assessing the status of water bodies. The WFD recommends that the ecological status of water bodies be assessed by comparing current communities with the theoretical communities expected under reference conditions (without anthropogenic pressures). Implementation of the WFD by the member states has accelerated the development of indices based on several community descriptors (metrics) to diagnose ecological status.

Alongside fish indices designed at the national scale, the European FAME project led to the first pan-European fish index, the European Fish Index (EFI). The 10 metrics included in the computation of the index score are based on the biological and ecological characteristics (traits) of species (e.g. spawning substrate, type of reproduction). Although the EFI responds satisfactorily to the degradation level of sites, several hypotheses underlying the development of multimetric indices (MMIs) could not be tested during the FAME project. MMI development rests on several premises: taking several community characteristics into account should give a better estimate of system condition; the metrics included in the index respond specifically to pressures and reflect the particularities of the region where they are applied; and the score associated with a metric at a given site is influenced only by the intensity of degradation.

The first part of this work tests some of these hypotheses and develops metrics specific to species-poor rivers. The study of relationships between traits reveals a relative functional redundancy of communities, particularly for characteristics related to species tolerance and reproduction. It is also shown that the functional structure of communities changes along the thermal and physical gradients of rivers in a relatively similar way in Western Europe and the Iberian Peninsula, two regions whose species pools are nevertheless distinct. However, some metrics respond differently to the environment in these two regions. In addition, among the 96 metrics tested, which combine species traits and individual body size, three were successfully modelled as a function of the environment and are sensitive to anthropogenic alteration. These results show that assemblage size structure can be taken into account when assessing rivers, particularly where species richness is very low. All of this work contributed to the design of the new European fish index developed within the European EFI+ project.

The second part of this work proposes solutions for estimating the uncertainty associated with metric and index scores. The expected metric values in the absence of pressures are predicted by statistical models relating metrics to the environment. Two approaches were proposed to estimate the prediction interval associated with the expected metric values. The uncertainty around the metrics and the index is estimated by propagating the error around the expected metric values (limits of the prediction interval).

The third part of this work assesses the effect of climate change on the measurement of the ecological status of rivers using predictive indicators. Temperature was shown to play a major role in the growth of young-of-the-year brown trout (Salmo trutta fario) and in species distribution. These results point to important long-term consequences for current river assessment methods, and in particular for reference states.

Keywords: Fish communities, biological and ecological traits, environmental variability, convergence, ecoregions, multimetric index, reference condition, size class, uncertainty, global change, growth, distribution range, Europe.

Abstract

The Water Framework Directive (WFD) establishes a framework for protecting and managing the different water bodies: surface waters, transitional waters, coastal waters and groundwater. The WFD specifies that the different components of the biocenosis (diatoms, macrophytes, benthic macroinvertebrates and fish) must be taken into account in assessing the condition of water bodies, a major innovation compared to past policies. The WFD recommends evaluating the ecological status of water bodies by comparing the observed communities with the theoretical communities expected under reference conditions (without human pressures). The implementation of the WFD by the different member countries accelerated the development of indices based on various assemblage characteristics (metrics) to diagnose ecological status.

In parallel with the national fish indices, the first pan-European fish index was developed within the European FAME project: the European Fish Index (EFI). The ten metrics integrated into the EFI computation are based on species’ biological and ecological characteristics (e.g., spawning substrate, type of reproduction). Although the EFI score shows a satisfactory response to the degradation level, several hypotheses underlying the development of a multimetric index (MMI) were not addressed in the FAME project. MMIs are based on several premises: taking into account different assemblage characteristics enables a better assessment of the system’s condition; metrics show a specific response to human pressures and should reflect the specificities of the region where they are applied; and the score associated with a metric should only reflect the between-site difference in degradation.

The first part of this thesis tested a number of these hypotheses and developed new metrics specific to species-poor rivers. The study of the relationships between traits highlights the relative functional redundancy of the communities, especially for the characteristics related to species tolerance and species reproduction. The assemblages’ functional structure was also shown to vary along rivers’ physical and thermal gradients in much the same way in Western Europe and the Iberian Peninsula, although these two regions have distinct species pools. Nevertheless, some metrics display a divergent response to the environment between these two regions. In addition, among the 96 metrics tested combining both species traits and fish body size, three were successfully modeled as a function of the environment and are sensitive to anthropogenic alterations. These results show that it is possible to evaluate stream condition using assemblage size structure, especially when species richness is very low. All these investigations served to develop the new European fish index conceived within the European EFI+ project.

The second part proposes solutions for estimating the uncertainty associated with metric and index scores. Expected metric values in the absence of pressures are predicted using statistical models relating each metric to the environment. Two approaches were proposed to estimate the prediction interval associated with the expected metric values. The uncertainty associated with metric and index scores was estimated by propagating the error associated with the expected values (prediction interval limits).
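As a rough illustration of the idea summarized in this paragraph, the sketch below fits a simple linear regression of a metric on one environmental variable, computes an approximate confidence interval (around the mean) and prediction interval (around a single observation), and then carries the prediction interval limits through to a score. Several elements are assumptions made only for this example and are not taken from the thesis: the function names `regression_intervals` and `score_bounds`, the use of a single predictor, the normal quantile 1.96 in place of a Student t quantile, and the definition of the score as the observed value minus the expected value.

```python
import numpy as np


def regression_intervals(x, y, x_new, z=1.96):
    """Fit y = b0 + b1 * x by least squares and return, for x_new:
    the fitted value, an approximate confidence interval for the mean,
    and an approximate prediction interval for a new observation.
    z = 1.96 is a normal approximation (assumption for this sketch)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    design = np.column_stack([np.ones(n), x])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ beta
    s2 = residuals @ residuals / (n - 2)           # residual variance
    sxx = np.sum((x - x.mean()) ** 2)
    leverage = 1.0 / n + (x_new - x.mean()) ** 2 / sxx
    fit = beta[0] + beta[1] * x_new
    se_mean = np.sqrt(s2 * leverage)               # uncertainty of the mean
    se_pred = np.sqrt(s2 * (1.0 + leverage))       # uncertainty of one observation
    conf_int = (fit - z * se_mean, fit + z * se_mean)
    pred_int = (fit - z * se_pred, fit + z * se_pred)
    return fit, conf_int, pred_int


def score_bounds(observed, pred_int):
    """Propagate the prediction interval limits to a score, assuming
    (hypothetically) that score = observed - expected."""
    lower, upper = pred_int
    return observed - upper, observed - lower


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    env = np.linspace(0.0, 10.0, 50)               # hypothetical environmental variable
    metric = 1.0 + 2.0 * env + rng.normal(0.0, 1.0, env.size)
    fit, ci, pi = regression_intervals(env, metric, x_new=5.0)
    print("expected value:", fit)
    print("confidence interval:", ci)
    print("prediction interval:", pi)
    print("score bounds for an observed value of 9.0:", score_bounds(9.0, pi))
```

The prediction interval is always wider than the confidence interval because it adds the residual variance of a single observation to the uncertainty of the fitted mean, which is why it is the relevant interval when a single site is scored.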

The third part evaluates the potential effect of global climate change on the assessment of streams’ ecological status using predictive indicators. It was shown that temperature plays a major role in the growth of brown trout (Salmo trutta fario) young of the year and in species distribution. These results suggest long-term consequences for how streams are currently assessed, especially for reference states.

Keywords: Fish community, biological and ecological traits, environmental variability, convergence, ecoregions, multimetric index, reference condition, size class, uncertainty, global change, growth, distribution area, Europe.

Table of contents

1. Introduction

Part One: Synthesis

2. Presentation of the data
2.1. Environment
2.2. Anthropogenic pressures
2.3. Faunal data
2.4. Biological and ecological species traits
2.5. Data set used

3. Results
3.1. Hypotheses and constraints underlying the use of functional traits in bioindication
3.1.1. Functional structure of assemblages and trait redundancy
3.1.2. Influence of the environment on associations of biological/ecological traits
3.1.3. Test of the convergence of communities of the Iberian Peninsula and Western Europe
3.1.4. Development of metrics specific to species-poor rivers
3.1.5. Conclusion: implications of these results for the development of the new index

3.2. Uncertainties associated with the use of predictive multimetric bioindicators
3.2.1. Choice of the confidence interval
3.2.2. Illustration: the case of linear regression
3.2.3. Estimating the uncertainty around the score
3.2.3.1. Estimating the prediction interval by approximation
3.2.3.2. Estimating the prediction interval by simulation
3.2.4. Estimating the uncertainty around the index
3.2.4.1. Illustration of the effect of correlation between metrics, by simulation
3.2.4.2. Estimating the uncertainty around the index by approximation
3.2.4.3. Estimating the uncertainty around the index by simulation
3.2.4.4. Perspectives

3.3. Potential implications of climate change
3.3.1. Growth of young-of-the-year brown trout, Salmo trutta fario, L.
3.3.1.1. Limitations: sampling period
3.3.1.2. Perspectives
3.3.2. Modelling species distribution
3.3.2.1. Goodness of fit and model validation
3.3.2.2. Effects of environmental variables on species occurrence
3.3.2.3. Confidence interval
3.3.2.4. Perspectives
3.3.2.5. Limitation: use of air temperature rather than water temperature

4. Discussion and perspectives
5. Bibliography

Part Two: Articles and English translation

Introduction

Presentation of the data

P1: Functional fish assemblage structure in European rivers

P2: Do Iberian and European fish faunas exhibit convergent functional structure along environmental gradients?

P3: Development of metrics based on fish body size and species traits to assess European coldwater streams

P4: Implication of this work in the development of the new European Fish Index, EFI+

P5: Uncertainty associated with predictive multimetric indices

Climate change

P6: Variation of brown trout Salmo trutta young of the year growth along environmental gradients in Europe

P7: Modelling ecological niche of fish species at the European scale: sensitivity to climate variables (temperature, run-off) and associated uncertainties

Discussion and Perspectives

Annexes (French version)
Annexe 1: Description of the EFI+ database
Annexe 2: EFI+ software user manual

Appendices
Appendix 1: Description of the EFI+ database
Appendix 2: EFI+ user guide (see pages 317–361)

List of figures

Figure 1: First factorial plane of the Hill and Smith analysis: a) relationships between variables, b) ordination of sites. Grey squares represent the barycentre of each country: Germany (DE), Austria (AT), Spain (ES), Finland (FI), France (FR), Hungary (HU), Italy (IT), Lithuania (LT), Netherlands (NL), Poland (PL), Portugal (PT), Romania (RO), United Kingdom (UK), Sweden (SE) and Switzerland (CH).

Figure 2: Ecoregions adapted from Illies (1978): Alps (ALPS), Baltic province (BAL), Central highlands (C.H), Central plains (C.P), the Carpathians (CAR), Eastern region (EAST), Great Britain (ENG), Ibero-Macaronesian (IBE), Italy, Corsica & Malta (ITA), Mediterranean (MED), Nordic region (NOR), Western plains (W.P) and Western highlands (W.H).

Figure 3: Distribution of pressure index values among the five classes.

Figure 4: Location of the 9,948 sites. Countries participating in the EFI+ project are shown in grey.

Figure 5: Ordination of the relative abundance of trait categories (with contributions > 0.5%) on the first factorial plane of the PCA (variance-covariance matrix).

Figure 6: Ecological filter theory, adapted from Poff (1997). According to this author, the presence and abundance of a species in a given environment depend on its characteristics (traits).

Figure 7: Results of the RDA: a) ordination of trait categories, b) correlation between the environmental variables and the first two axes.

Figure 8: Ordination of sites on the first factorial plane of the PCA. Grey squares are placed at the barycentre of each ecoregion (see Figure 2).

Figure 9: Relationship between a trait category and the environment in two regions (black or grey line): a) quantitative convergence (the same model for both regions); b), c) and d) qualitative convergence, where the relationships vary in the same direction in both regions; e) and f) non-convergence between regions: in e) the trait category varies with the environment in one region but not in the other, and in f) the relationships with the environment are opposite.

Figure 10: Marginal effect (Fox 1987, 2003) of temperature (TEMP), slope (SLOP) and distance from source (DIS) on the 8 metrics based on total fish density (DENS) and on the density of eurytopic (EURY), insectivorous (INSEV), lithophilic (LITH), omnivorous (OMNI), potamodromous (POTAD), rheophilic (RHEO) and intolerant (INTOL) fish. Only variables, or interactions between variables and region, with a significant effect are shown (drop-in-deviance F-tests, Chambers & Hastie 1993). The common response is shown as a dotted line (model 1), the response of region IP in black and that of region FB in grey. Metrics with significant interactions (model 3) are marked with an asterisk. Calcareous (c) and siliceous (s) environments are distinguished when the interaction between geology and region (GEO × REG) is significant. "ns" is added to the plot when the environmental variable had a significant effect but its interaction with region did not.

Figure 11: Marginal effect (Fox 1987, 2003) of temperature (TEMP), slope (SLOP) and distance from source (DIS) on the 9 metrics based on local species richness (RICH) and on the richness (Ns) of benthic (BENTH), eurytopic (EURY), intolerant (INTOL), lithophilic (LITH), potamodromous (POTAD), rheophilic (RHEO), tolerant (TOLE) and water-column (living and feeding in the water column, WATE) fish. Only variables, or interactions between variables and region, with a significant effect are shown (drop-in-deviance χ²-tests, Chambers & Hastie 1993). The common response is shown as a dotted line (model 1), the response of region IP in black and that of region FB in grey. Metrics with significant interactions (model 3) are marked with an asterisk.

Figure 12: Relative abundance of gudgeon, chub and minnow within the 35 IP assemblages where they are present.

Figure 13: Illustration of metric calculation for a given trait category. In this example, the calculation involves two species possessing the trait of interest, trout and bullhead (black outlines), and only small individuals are taken into account (shaded grey). (a) All individuals of both species are taken into account, i.e. 12/21. (b) Only small trout are used, since bullhead is not a large-bodied species, i.e. 5/14.

Figure 14: Procedure for metric selection and score calculation.

Figure 15: Relationships between predicted and observed values for reference sites: (a) number of small individuals intolerant to low oxygen concentrations (O2INTOL, N = 214), (b) number of small individuals intolerant to habitat degradation (HINTOL, N = 214), (c) number of small rheophilic individuals (RHEO, N = 212), (d) number of small insectivorous individuals (INSEV, N = 212). The solid line is the first bisector (1:1 line) and the dotted curve a general trend (loess regression, f = 0.667; Hastie et al. 2009).

Figure 16: Variation in metric scores along the pressure gradient: (a) number of small individuals (< 150 mm) intolerant to low oxygen concentrations, (b) number of small individuals intolerant to habitat degradation, (c) number of small rheophilic individuals, (d) number of small insectivorous individuals.

Figure 17: Diagram of the development of a predictive multimetric index based on reference conditions, adapted from Pont (2010).

Figure 18: Principle of error propagation.

Figure 19: Theoretical relationship between a metric (y-axis) and an environmental variable (x-axis). The black line represents the values predicted by a linear regression, the dark grey band the confidence interval around the model (the mean), and the light grey band the prediction interval (around an observation).

Figure 20: Results of the 1,000 simulations for the test site: (a) residuals of metric Y1, (b) residuals of metric Y2, (c) relationship between the residuals of the two metrics (black square located at the barycentre of the two distributions), and (d) distribution of the sum of the residuals. The dotted curve represents the theoretical distribution of the sum of the residuals in the absence of correlation, and the solid curve the theoretical distribution of the sum of the residuals with a correlation of 0.5424 (the correlation observed between the two distributions). The two segments represent the estimated bounds of the confidence interval with α = 0.05.

Figure 21: Illustration of the estimation of the uncertainty around the index (dotted line) for tolerant assemblages, with an 80% interval (a) and a 95% interval (b). The lower and upper bounds (black curves) were obtained by simulation.

Figure 22: Sum of 10 random variables following normal distributions with parameters μ = 0 and σ² = 4, with different levels of correlation (r) between each pair of variables: 0, 0.25, 0.5 and 0.75.

Figure 23: Example of the size distribution of a trout population. The parameters of the normal distribution associated with the YOY (dotted line) are used to estimate the maximum size (black dot).

Figure 24: Marginal profiles (Fox 1987, 2003) of the effect of mean annual air temperature and catchment area on the maximum size of 0+ trout.

Figure 25: Location of the 1,548 slightly impacted or unimpacted sites.

Figure 26: Effect of mean July air temperature on the probability of presence of the 24 species, with the other environmental variables held fixed (Fox 1987, 2003) at the median of the sites where the species is present. The red curve represents the estimated probability and the grey polygon the confidence interval estimated by the Wald approach (Hosmer & Lemeshow 2000, Agresti 2002).

Figure 27: Effect of slope on the probability of presence of the species, with the other environmental variables held fixed (Fox 1987, 2003) at the median of the sites where the species is present. The red curve represents the estimated probability and the grey polygon the confidence interval estimated by the Wald approach (Hosmer & Lemeshow 2000, Agresti 2002).

Figure 28: Effect of pseudo run-off (lPA) on the probability of presence of the species, with the other environmental variables held fixed (Fox 1987, 2003) at the median of the sites where the species is present. The red curve represents the estimated probability and the grey polygon the confidence interval estimated by the Wald approach (Hosmer & Lemeshow 2000, Agresti 2002).

Figure 29: Effect of the thermal amplitude between January and July on the probability of presence of the species, with the other environmental variables held fixed (Fox 1987, 2003) at the median of the sites where the species is present. The red curve represents the estimated probability and the grey polygon the confidence interval estimated by the Wald approach (Hosmer & Lemeshow 2000, Agresti 2002).

List of tables

Table 1: Summary of the different articles.

Table 2: Description of the environmental variables.

Table 3: Description of the 11 ecological and biological species traits.

Table 4: Statistics and adjusted p-values of the tests comparing nested models (F for density metrics (Ni), χ² for richness-based metrics (Ns)). The method of Benjamini and Yekutieli (2001), which controls the false discovery rate (FDR), was used to adjust the p-values for multiple comparisons (Dudoit & van der Laan 2008). The regional effect is non-significant (ns) if none of the comparisons is significant; additive if the comparison between models 1 and 2 is significant but not the comparison between models 2 and 3; and interactive if the comparison between models 2 and 3 is significant. Metrics can be non-convergent, or show quantitative or qualitative convergence.

Table 5: Calculation of the 12 metrics for a given trait.

Table 6: Overview of the variance-covariance matrix of the model relating the environment to the metric "number of small individuals intolerant to habitat degradation", estimated with either a Poisson distribution or a negative binomial distribution.

Table 7: Statistics and p-values of the Wilcoxon rank-sum tests between the scores of class 1 and the scores of classes 4-5 of the pressure index.

Table 8: Correlations between metric scores (Spearman’s ρ).

Table 9: Traits of the ten most abundant species in the cold-water stream data set (cell shaded if the species has the trait).

Table 10: List of the 19 species characteristic of intolerant assemblages.

Table 11: Class boundaries for the "intolerant" index (Bady et al. 2009a,b).

Table 12: Results of the hierarchical partitioning of the multiple regression relating maximum YOY size to the environmental variables: independent contribution (I), joint contribution (J) and relative independent contribution (%Ii). * The effects of the variable and of its quadratic form (equation 3.33) are estimated simultaneously.

Table 13: Summary of the number of sites available for each species, their prevalence within these sites, and the marine regions where they are considered native (MMR; Kottelat & Freyhof 2007, Reyjol et al. 2007).

Table 14: Summary of model goodness of fit: percentage of covariate patterns (Hosmer & Lemeshow 2000) greater than 4 (CovPat), mean variance inflation factor (Mvif; Greene 2003, Kutner et al. 2005), pseudo-R² (pR²) based on the deviance ratio (Estrella 1998), Akaike information criterion (AIC; Akaike 1974, Collett 2002), threshold probability (Cut), sensitivity (Sens), specificity (Spec), area under the ROC curve (AUC; Agresti 2002), kappa statistic (κ; Agresti 2002) and correct classification rate (percentage of correctly classified sites, Class).

Table 15: Estimation of the optimism (Harrell 2001) of kappa and of the correct classification rate. For each species, 200 new training data sets (Train) are defined by randomly selecting the calibration sites with replacement. Model coefficients are re-estimated for each of the 200 new samples. From these models, kappa and the correct classification rate are computed for the training data set and for the initial data set (Init) of calibration sites. The difference between the means of the 200 values computed for the randomly selected calibration sites (Train) and for the initial calibration sites is an estimate of the optimism (Optim; Harrell 2001).

Table 16: Means and standard deviations of the sensitivity (Sens), specificity (Spec), area under the ROC curve (AUC), kappa and correct classification rate (Class) of the models, estimated on an independent data set by a method derived from split-sampling (Harrell 2001).

Table 17: Results of the hierarchical partitioning (Chevan & Sutherland 1991, Mac Nally 2000, Walsh & Mac Nally 2008) for 24 European fish species: total reduction in deviance (R), independent contribution of the variables (I), joint contribution (J), ratio between independent and joint contributions (I/J) and relative independent contribution (%Ii = Ii/I).


1. Introduction

This thesis was carried out within the European project EFI+¹ and builds on the work of the preceding European project, FAME². Both projects aimed to develop tools supporting the implementation of the Water Framework Directive (WFD; European Union (EC) 2000). In parallel with the development of national methods, the FAME project produced a fish index, the European Fish Index (EFI), applicable in principle to all European rivers (Pont et al. 2006, Pont et al. 2007). The EFI+ project aimed to pursue the development of this indicator and to improve it in order to overcome a number of its limitations. This required a database describing more precisely the environmental conditions and the anthropogenic pressures acting on rivers, but also, and above all, testing some of the assumptions underlying the development of a bioindicator of this type. That is the purpose of this work.

¹ Improvement and Spatial extension of the European Fish Index (http://efi-plus.boku.ac.at/), 01/01/2007 – 31/04/2009, contract number 044096
² Development, Evaluation and Implementation of a standardised Fish-based Assessment Method for the Ecological Status of European Rivers (http://fame.boku.ac.at/), 01/01/2002 – 31/10/2004, contract number EVK1-CT-2001-00094

The Water Framework Directive (WFD) and bioindication tools

The WFD is an environmental policy established by the European Parliament (European Union (EC) 2000). It sets a framework for the protection and management of the different water bodies: surface waters (lakes and rivers), transitional waters (estuaries and lagoons), coastal waters and groundwater. The WFD sets the objective of restoring or maintaining the "good status" of water bodies by 2015 (or by 2027 at the latest). Its implementation is planned in three phases: the first diagnoses the status of the water bodies, the second establishes restoration plans for the water bodies identified as degraded, and the third reassesses the status of the water bodies after restoration. The WFD specifies that the assessment of water bodies must consider both the integrity of the physico-chemical processes of the ecosystem (e.g. hydrological regime, pollutants) and the functioning of the communities (e.g. species composition, age structure) of four elements of the biocoenosis: algae (diatoms), macrophytes, benthic macroinvertebrates and fish. Characterizing the ecological functioning of water bodies in order to assess their status is one of the major innovations of this directive. "Good ecological status" as defined by the WFD (European Union (EC) 2000) must be assessed by comparing the current characteristics of communities with those expected in the absence of anthropogenic alteration: the "reference condition approach" (Bailey et al. 1998). The WFD has thus encouraged and accelerated the development of national indices based on community characteristics. In France, the river fish index (indice poisson rivière, IPR; Oberdorff et al. 2002) was developed in this way to assess the "health" of French rivers from the characteristics of fish assemblages.

Multimetric biotic (fish) indicators

Bioindication already has a century-long history, mainly in connection with the assessment of water degradation caused by urban and industrial discharges (Kolkwitz & Marsson 1909).

Regarding the use of fish as indicators of the condition of aquatic environments, Karr's work in the United States was decisive. His Index of Biotic Integrity, IBI (Karr 1981, Karr et al. 1986), was developed directly in the context of the implementation of measures protecting natural waters in the United States, the Clean Water Act in 1972 (“To restore and maintain the chemical, physical, and biological integrity of the Nation’s waters.”) and then the National Wildlife Refuge System Improvement Act in 1977. The IBI rests on postulates that are largely taken up in the Water Framework Directive, in particular the notion of biotic integrity. Biotic integrity is defined as the set of characteristics of biological communities (in structural and functional terms) observable at so-called "pristine" sites, that is, sites undisturbed by humans. The IBI then seeks to measure a deviation from this state, in relation to the intensity of anthropogenic pressures. In addition, it assesses the response of fish assemblages to disturbance from a set of descriptors (metrics) of assemblage condition. The IBI laid both the theoretical and the practical foundations for the development of multimetric indicators (MMI). The concept of this index, originally developed for warmwater streams of the Midwestern United States, has been exported to other regions and stream types of the United States (e.g. Leonard & Orth 1986, Angermeier & Schlosser 1987, McCormick et al. 2001) and to almost every continent (e.g. Oberdorff & Hughes 1992, Lyons et al. 1995, Hugueny et al. 1996, Harris & Silveira 1999, Hughes & Oberdorff 1999, An et al. 2002, Joy & Death 2004, Magalhaes et al. 2008). MMIs rest on several postulates.

The first postulate is that taking into account different assemblage characteristics, such as species composition, richness, trophic structure, reproductive modes, abundance and the health of individuals (Karr 1981, Fausch et al. 1984, Karr et al. 1986, Karr 1991, Karr & Chu 1999), gives a better estimate of system condition (biotic integrity, health, ecological condition, etc.) than considering a single attribute of these assemblages (Karr & Chu 1999). Each metric is assumed to provide unique information about the assemblage (Karr et al. 1986), describing the quality of one component of the community (Karr 1991).

The second postulate is that these metrics respond specifically to pressures (Karr & Chu 2000) and that their sensitivity varies along a gradient of anthropogenic pressures (Angermeier & Karr 1986). At severely degraded sites, several metrics should reflect this poor condition. Integrating the signal carried by each metric into a single index ultimately makes it possible to detect a large number of human pressures and to represent an overall degree of alteration. Ideally, an index should be sensitive to all the anthropogenic pressures encountered in the geographic area where it is applied (Karr 1991). In addition to the specificity of the signal they return, metrics must reflect the particular features of the region where they will be applied: species pool, environment and pressures. This seemingly trivial postulate is of great importance. The results of Harris and Silveira (1999) show that including a metric that is poorly represented in a region does not allow relevant discrimination among sites. For example, in the first IBI, Karr included a metric based on the percentage of green sunfish (Lepomis cyanellus, Rafinesque). Using this metric in Europe would be inconsistent (Hering et al. 2006), since the range of this species is limited to North America (Lee et al. 1980, Nelson 2006).

The third postulate inherent in the development of an MMI is that the score associated with a metric is influenced only by the difference in degradation among sites (Hughes et al. 1998, Karr & Chu 1999, 2000, Oberdorff et al. 2002, Hering et al. 2006, Pont et al. 2007, Stoddard et al. 2008, Pont et al. 2009). In other words, a difference in score between sites must reflect a difference in alteration. Fausch et al. (1984) were the first to consider that assemblage characteristics (richness) could be influenced by environmental conditions (stream size and region).

Two strategies are possible for selecting metrics. The first is to select metrics considered invariant whatever the environmental conditions or regions considered (Statzner et al. 2001). The second is to control for the share of metric variability related to the environment (Fausch et al. 1984, Angermeier & Karr 1986, Karr et al. 1986, Karr & Chu 1999, Oberdorff et al. 2002, Pont et al. 2007, Stoddard et al. 2008, Pont et al. 2009, Hawkins et al. 2010), either by restricting the area over which the metrics are applied or by standardizing them with respect to the environment. The latter approach was adopted in both European projects (Pont et al. 2006, 2007, Melcher et al. 2007).

Several methods are used to account for the environmental variability of metrics or of assemblage composition:
- the maximum value line and its derivatives (maximum species richness line; Fausch et al. 1984, Karr et al. 1986, Hughes et al. 1998, Roset et al. 2007),
- statistical modelling (Oberdorff et al. 2002, Baker et al. 2005, Pont et al. 2006, Pont et al. 2007, Pont et al. 2009),
- discriminant analysis (Joy & Death 2002),
- machine-learning algorithms (e.g. random forest; Hawkins et al. 2010),
- nearest neighbour (Bates Prins & Smith 2007).

Whatever the method, the underlying assumption is always the same: two communities living under the same environmental conditions in the absence of anthropogenic pressures have similar characteristics. Although it is central, this assumption is very rarely tested.

The development of European indicators (FAME and EFI+ projects)

In parallel with the development of national methods, the FAME project produced a European fish index (European Fish Index, EFI) applicable in principle to all European rivers (Pont et al. 2006, Pont et al. 2007). EFI is based on the aggregation of the scores of 10 characteristics of assemblage structure, also called metrics (Hering et al. 2006). Score values are computed following the principle of the "reference condition approach" (Bailey et al. 1998): the metric values observed within assemblages are compared with the theoretical values expected in the absence of anthropogenic pressures. Setting up such an index was made possible by the creation of a database shared by 12 European countries (FIDES), integrating faunistic data (number of individuals caught per species), ecological and biological traits of those species (traits, trophic guilds, reproduction), environmental descriptors (e.g. slope, temperature) and a characterization of human pressures. The creation of this database is one of the main achievements of FAME, in particular the assessment of five major anthropogenic alterations concerning the hydrological regime, morphological conditions, acidification and toxic substances, organic nutrient inputs, and breaks in connectivity.

Although EFI scores show a significant negative linear relationship with an index of human disturbance (Pont et al. 2007), this index has several limitations:
- its geographic coverage is insufficient, since no Eastern European country took part in FAME (Schmutz et al. 2007);
- the sensitivity of the index varies among European regions; in particular, EFI is less sensitive in Mediterranean rivers;
- the sensitivity of EFI differs significantly with the type of anthropogenic pressure; overall, EFI is more sensitive to chemical pressures than to hydromorphological pressures;
- the use of this index in large rivers is relatively limited.

The European project EFI+ aimed to pursue the development of the European fish index and to improve it in order to overcome its current limitations. This required improving our knowledge and, more specifically, testing some of the assumptions underlying the development of a multimetric index (MMI) that could not be studied during FAME.

Objectives of the thesis

The objectives of this thesis are both theoretical and practical:
- 1. To test several ecological hypotheses on which the development of a large-scale multimetric index rests, and to use this new knowledge in the development of the new European fish index, EFI+.
- 2. To go beyond the development of an index by proposing solutions for estimating the uncertainty around the metric scores and the index score.
- 3. To estimate the consequences of global change, and more specifically of climate warming, for fish assemblages, in order to anticipate the potential effects of this change on the assessment of the ecological status of rivers.

The results of this thesis are organized in three main parts.

The first part concerns the testing of the hypotheses underlying the indices and the development of the EFI+ fish index. EFI was based on the use of traits, which group species sharing the same characteristics into the same guild. The first objective of this part is to study the representativeness of, and the links between, the biological and ecological traits of species within European fish assemblages, the aim being to assess functional redundancy within assemblages (P1).

The second objective is to assess the influence of the environment on the functional structure of assemblages. More precisely, it examines the share of variability in assemblage structure explained by the environment, and the link between environmental gradients and variation in community structure (P1).

The third objective is to test the convergence of the functional responses of communities to environmental gradients between two regions with relatively distinct species pools: the Iberian Peninsula on the one hand, France and Belgium on the other. The aim is to test whether the functional structure of assemblages in these two regions varies similarly along environmental gradients. In other words, can the same statistical model be applied to both regions to predict the expected value of a metric in a given environment? (P2).

The fourth objective is to develop, for rivers with low species richness, metrics based on size classes and on species characteristics (P3).

This part concludes with the implications of the results of this work for the development of the new European fish index (P4).

The second part of this thesis concerns the estimation of the uncertainty associated with the metric scores and with the index score. One section of this part deals more specifically with the importance of the correlation among metrics in estimating the uncertainty associated with the index score (P5).

The third part deals with the consequences of global change, and more specifically of climate warming, for the future assessment of the ecological status of rivers. The influence of this change was studied by evaluating the relative effect of temperature and of other environmental factors on the growth of young-of-the-year brown trout (Salmo trutta fario, L.) (P6) and on the presence-absence of 24 European species (P7).

This manuscript is structured in two main sections. The first covers the three parts in the form of a synthesis, based on the articles, of the work carried out during this thesis. It comprises an introduction, the presentation of the data, the results, a conclusion and perspectives, followed by the bibliography. The second gathers the articles written during this thesis (Table 1), together with the English translation of the chapters that will later be submitted as articles (the chapter on the EFI+ fish index and the chapter on uncertainty).

Table 1: Summary of the articles.

| Part | Article | Status |
|---|---|---|
| P1 | Functional fish assemblage structure in European rivers. | In preparation |
| P2 | Do Iberian and European fish faunas exhibit convergent functional structure along environmental gradients? | Journal of the North American Benthological Society, published |
| P3 | Development of metrics based on fish body size and species traits to assess European coldwater streams | Ecological Indicators, in press |
| P4 | Implication of this work in the development of the new European Fish Index, EFI+. | |
| P5 | Uncertainty associated with predictive multimetric indices | |
| P6 | Variation of brown trout Salmo trutta young-of-the-year growth along environmental gradients in Europe | Journal of Fish Biology, in press |
| P7 | Modelling ecological niche of fish species at the European scale: sensitivity to climate variables (temperature, run-off) and associated uncertainties | In preparation |

Part One: Synthesis

2. Presentation of the data

With the exception of the study on community convergence (P2), which is based on data from the European FAME programme, all the data used come from the EFI+ project. The EFI+ database comprises 14,458 sites distributed among 15 countries. It contains information on the site environment (climate, abiotic factors, river structure, etc.), anthropogenic pressures (physical alterations, water quality, loss of connectivity), the fish assemblages present at these sites (number of individuals per species, size of individuals) and species characteristics (biological, ecological and life-history traits).

A detailed presentation of the database is available at http://efi-plus.boku.ac.at/downloads/EFI+%200044096%20Deliverable%20D2_1-2_2.pdf. A complete list of the database variables is also given in Appendix 1. Only the data used in this work are presented here.

2.1. Environment

Twelve variables were selected (Table 2) to describe the environmental conditions characterizing the sites.

Table 2: Description of the environmental variables.

| Variable | Unit or category |
|---|---|
| Mean annual air temperature | °C |
| Mean July air temperature | °C |
| Thermal amplitude between January and July* | °C |
| Annual precipitation | mm |
| Slope | ‰ |
| Substrate | small, medium, large |
| Dominant geology of the catchment | calcareous, siliceous, organic |
| Catchment area | km² |
| Distance from source | km |
| Hydrological regime | glacial, nival, pluvial, dominant karstic or aquifer inputs |
| Geomorphological type | meandering, braided, constrained channel |
| Presence of a floodplain | yes, no |

Only variables with known effects on fish assemblages and little or not modified by human activities were selected. The aim is to describe the site on the basis of criteria that depend only on the "natural" variability of the environment. River width was not used because it can easily be modified by humans, for example when a river is channelized. Catchment area and distance from source are used as proxies for river size because of their good correlation with width.

Site hydromorphology was summarized in two variables synthesizing the information contained in the following variables: catchment area, distance from source, hydrological regime, geomorphological type and presence of a floodplain. To do this, an ordination method able to analyse quantitative and qualitative variables together was used: the analysis of Hill and Smith (1976). This analysis behaves like a principal component analysis (PCA) on the correlation matrix when it is used with quantitative variables only. Conversely, it behaves like a multiple correspondence analysis (MCA) when it is used with qualitative variables only. The method seeks linear combinations of the variables, that is, pairwise orthogonal axes that explain the largest share of the inertia of the point cloud.
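To make the behaviour of this ordination more concrete, the following is a minimal sketch of a Hill and Smith type analysis, not the code used in the thesis. It assumes the usual formulation in which quantitative variables are centred and scaled, each category of a qualitative variable is coded as an indicator divided by its frequency minus one, and categories are weighted by their frequency. The function name `hill_smith` and its arguments are hypothetical.

```python
import numpy as np

def hill_smith(quant, qual):
    """Ordination of mixed variables in the spirit of Hill & Smith (1976).

    quant: (n, p) array of quantitative variables (p may be 0).
    qual:  list of length-n sequences of category labels (may be empty).
    Returns (eigenvalues, site_scores), axes sorted by decreasing inertia.
    """
    quant = np.asarray(quant, dtype=float)
    n = quant.shape[0]
    blocks, weights = [], []
    if quant.shape[1] > 0:
        # Quantitative block: centre and scale to unit variance (1/n convention).
        blocks.append((quant - quant.mean(axis=0)) / quant.std(axis=0))
        weights.append(np.ones(quant.shape[1]))
    for labels in qual:
        labels = np.asarray(labels)
        cats = np.unique(labels)
        dummy = (labels[:, None] == cats[None, :]).astype(float)
        freq = dummy.mean(axis=0)          # relative frequency of each category
        blocks.append(dummy / freq - 1.0)  # centred indicator columns
        weights.append(freq)               # categories weighted by frequency
    table = np.hstack(blocks)
    w = np.concatenate(weights)
    # Weighted PCA via SVD: uniform row weights 1/n, column weights w.
    u, s, _ = np.linalg.svd(table * np.sqrt(w) / np.sqrt(n), full_matrices=False)
    return s ** 2, u * s * np.sqrt(n)
```

With quantitative variables only, the eigenvalues returned by this sketch are those of the correlation matrix, which matches the PCA behaviour described above. The first two columns of the site scores would play the role of the two synthetic hydromorphological variables.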

The first factorial plane of this analysis accounts for 51.9% of the total inertia, of which 32.9% is explained by the first axis and 19% by the second. The first axis (SYNGEO1) corresponds to a river size gradient associated with the presence or absence of a floodplain (Figure 1a). The second axis (SYNGEO2) contrasts braided rivers with a non-pluvial regime (strong glacio-nival component) with non-braided rivers with a pluvial regime (Figure 1a). Whereas the first axis represents a physical gradient, the second clearly has a geographic component: Finland and Sweden stand out very clearly from the other countries (Figure 1b).

Figure 1: First factorial plane of the Hill and Smith analysis: a) relationships among the variables, b) ordination of the sites. Grey squares represent the centroids of the countries: Germany (DE), Austria (AT), Spain (ES), Finland (FI), France (FR), Hungary (HU), Italy (IT), Lithuania (LT), Netherlands (NL), Poland (PL), Portugal (PT), Romania (RO), United Kingdom (UK), Sweden (SE) and Switzerland (CH).

Each site was also characterized by the ecoregion to which it belongs (Illies 1978). These ecoregions are contiguous geographic units that give a general description of the environment at a regional scale. The ecoregions used in the EFI+ project are adapted from those of Illies (1978) to account for the specific features of the data set. First, because of a relatively small number of sites, some regions were merged into larger geographic units (Figure 2). The ecoregions "Alps" and "Pyrenees" were merged into the Alps ecoregion; "Hungarian lowlands", "Eastern plains" and "Pontic province" into the Eastern region; and "Fenno scandian shield" and "Borealic uplands" into the Nordic region. Second, Illies' ecoregions do not take into account the particular features of the Mediterranean climate (Gasith & Resh 1999). To account for this, a Mediterranean ecoregion was defined from level 1 of "mediterraneity" in the classification of Segurado et al. (2008).

Figure 2: Ecoregions adapted from Illies (1978): Alps (ALPS), Baltic province (BAL), Central highlands (C.H), Central plains (C.P), The Carpathians (CAR), Eastern region (EAST), Great Britain (ENG), Ibero-Macaronesian (IBE), Italy, Corsica & Malta (ITA), Mediterranean (MED), Nordic region (NOR), Western plains (W.P) and Western highlands (W.H).
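The regrouping of ecoregions described above can be written as a simple lookup. This is only an illustrative sketch: the function `efi_plus_ecoregion` and the boolean flag `is_mediterranean` are hypothetical, and the rule that the Mediterranean assignment takes precedence over the Illies ecoregion is an assumption, since the text does not state how overlaps are resolved.

```python
# Merged units named in the text; every other Illies ecoregion is kept as is.
MERGED_ECOREGIONS = {
    "Alps": "Alps",
    "Pyrenees": "Alps",
    "Hungarian lowlands": "Eastern region",
    "Eastern plains": "Eastern region",
    "Pontic province": "Eastern region",
    "Fenno scandian shield": "Nordic region",
    "Borealic uplands": "Nordic region",
}

def efi_plus_ecoregion(illies_region, is_mediterranean=False):
    """Return the adapted ecoregion for a site.

    is_mediterranean is a hypothetical flag standing for level 1 of the
    Segurado et al. (2008) classification; giving it precedence is an assumption.
    """
    if is_mediterranean:
        return "Mediterranean"
    return MERGED_ECOREGIONS.get(illies_region, illies_region)
```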

2.2. Anthropogenic pressures

Five main types of anthropogenic alteration were considered in the EFI+ project, relating to connectivity, hydrology, morphology, water quality and uses that can affect the fauna (navigation). Whereas the FAME project used only a five-level classification to estimate the degree of alteration, a much more precise characterization was implemented here. For each alteration, several descriptors were considered and assessed mainly by expert judgement. All the pressures and their categories are described in Appendix 1B.XX. Characterizing these pressures makes it possible to select the sites that are little or not impacted, which are considered "reference" sites. The pressures taken into account and the accepted degree of alteration were adjusted to the question addressed in each article. The criteria used for parts P1 and P3 are more restrictive than those used for part P5. The relative abundance of the different trait categories is a priori more easily affected by anthropogenic pressures than the presence-absence of species. Nevertheless, each selection of sites was made so that the expression of the biological phenomena studied was not limited by human activities.

The main objective in defining reference sites is to obtain the most faithful representation possible of the processes studied, for example the relationship between the environment and the functional structure of communities (P1). Anthropogenic pressures can modify the different components of assemblages (life-history traits of species, species composition, etc.) and thereby add noise to the biological signal. The selection of reference sites on objective criteria (Whittier et al. 2007) is thus a fundamental step in the development of indices based on the reference condition approach (e.g. Pont et al. 2006).

To this end, a synthetic index of anthropogenic pressures was developed to estimate the overall level of alteration of the sites. It was computed as the first axis of an MCA based on seven pressures: modification of current velocity by an impoundment, hydropeaking, water abstraction, presence of toxic substances, water quality, modification of the cross-section due to channelization, and presence of a barrier to migration downstream of the reach. The first axis contrasts the categories that reflect the absence of pressure or degradation (e.g. very good water quality, no hydropeaking, no toxic substances) with the categories representing maximal stress (e.g. heavy water abstraction, extreme modification of the cross-section, presence of hydropeaking). The categories of the different pressures are mostly ordered along this axis according to the level of degradation they represent.

The factorial coordinates of the sites were rescaled to values ranging from 0 (little or no impact) to 1 (very strongly impacted) with a min-max transformation (Legendre & Legendre 1998, Saporta 2006). Five classes, ranging from 1 (little or not impacted) to 5 (strongly impacted), were defined by partitioning the index values (Figure 3) with the k-means algorithm (Hartigan & Wong 1979)³.

Figure 3: Distribution of the pressure index values among the five classes.

³ For more details on the development of this pressure index, see EFI+ reports 4.1 and 4.2 (Bady et al. 2009a,b).
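The rescaling and classification of the pressure index can be sketched as follows. This is a simplified illustration under stated assumptions: the input scores are made-up values standing for site coordinates on the first MCA axis, oriented so that larger values mean more pressure, and the partitioning uses a plain Lloyd-style k-means iteration rather than the Hartigan and Wong (1979) algorithm cited in the text. The function names are hypothetical.

```python
def min_max(values):
    """Rescale values linearly so that the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def kmeans_1d(values, k=5, n_iter=100):
    """Lloyd-style k-means on a list of floats.

    Returns one class label per value, from 1 (lowest centre) to k (highest).
    """
    ordered = sorted(values)
    # Deterministic start: centres at evenly spaced quantiles (arbitrary choice).
    centres = [ordered[int((i + 0.5) * len(ordered) / k)] for i in range(k)]
    for _ in range(n_iter):
        groups = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda j: abs(v - centres[j]))
            groups[nearest].append(v)
        # An empty group keeps its previous centre.
        updated = [sum(g) / len(g) if g else c for g, c in zip(groups, centres)]
        if updated == centres:
            break
        centres = updated
    order = sorted(range(k), key=lambda j: centres[j])
    rank = {j: r + 1 for r, j in enumerate(order)}
    return [rank[min(range(k), key=lambda j: abs(v - centres[j]))] for v in values]

# Hypothetical first-axis coordinates for ten sites.
axis_scores = [-1.2, -1.1, -0.5, -0.4, 0.1, 0.2, 0.9, 1.0, 2.3, 2.4]
pressure_index = min_max(axis_scores)
pressure_class = kmeans_1d(pressure_index, k=5)
print(pressure_class)  # -> [1, 1, 2, 2, 3, 3, 4, 4, 5, 5]
```

Because k-means depends on its starting centres, a different initialisation or algorithm variant could give slightly different class boundaries on real data.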

2.3. Faunistic data

All sites were sampled between 1955 and 2007, mostly after 1995, by electrofishing, either wading or from a boat depending on river depth. The individuals caught during the successive passes (between 1 and 5) were identified to species. A total of 161 species and 7,686,350 individuals were caught, and 6,309,639 fish were measured, that is 82% of the individuals. To use data as homogeneous as possible, only the individuals sampled during the first pass were considered. In addition, to limit temporal autocorrelation, a single fishing occasion (date) per site was selected at random.
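The two filtering steps above (first pass only, then one randomly chosen fishing occasion per site) can be illustrated with a short sketch. The record layout, with the keys `site`, `date`, `run`, `species` and `count`, is hypothetical and is not the structure of the EFI+ database; the seed is only there to make the draw reproducible.

```python
import random

def select_occasions(records, seed=0):
    """Keep first-pass catches for one randomly drawn fishing date per site.

    records: iterable of dicts with hypothetical keys
             'site', 'date', 'run' (pass number), 'species', 'count'.
    """
    rng = random.Random(seed)
    # Step 1: keep only the individuals caught during the first pass.
    first_pass = [r for r in records if r["run"] == 1]
    # Step 2: draw a single fishing occasion (date) for each site.
    dates_by_site = {}
    for r in first_pass:
        dates_by_site.setdefault(r["site"], set()).add(r["date"])
    chosen = {site: rng.choice(sorted(dates)) for site, dates in dates_by_site.items()}
    return [r for r in first_pass if r["date"] == chosen[r["site"]]]

# Made-up example records.
records = [
    {"site": "A", "date": "2001-06-01", "run": 1, "species": "Salmo trutta", "count": 12},
    {"site": "A", "date": "2001-06-01", "run": 2, "species": "Salmo trutta", "count": 4},
    {"site": "A", "date": "2003-07-15", "run": 1, "species": "Salmo trutta", "count": 9},
    {"site": "B", "date": "1999-09-10", "run": 1, "species": "Cottus gobio", "count": 30},
]
kept = select_occasions(records, seed=1)
```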

2.4. Biological and ecological traits of species

Of the 24 traits considered in the EFI+ project, comprising 15 ecological and biological traits (qualitative) and 9 life-history traits (quantitative), I focused only on the following 11 ecological and biological characteristics: temperature tolerance, tolerance of hypoxia, tolerance of habitat degradation, trophic regime, feeding habitat, habitat (current affinity), spawning substrate, type of reproduction, reproductive behaviour, parental care and migratory behaviour.

Each biological or ecological trait is described by several categories (Table 3), and a single category per trait is assigned to each species. Unlike for benthic macroinvertebrates (Usseglio-Polatera et al. 2000), a species cannot be represented in several categories, and only the characteristics of the adult stage are considered.

32 2.4- Traits biologiques et écologiques des espèces

Tableau 3 : Description des 11 traits écologiques et biologiques des espèces. Trait Catégories Eurytherme (EUTHER) : espèce tolérant une large ampltitude thermique Tolérance à la température Sténotherme (STTHER) : espèce ne vivant que dans une plage étroite de températures Intolérante (O2INTOL) : espèce qui a besoin de plus de 6 mg.L-1 Intermédiaire (O2IM) : espèce relativement tolérante aux faibles Tolérance à l’hypoxie concentrations en oxygène Tolérante (O2TOL) : espèce capable de vivre dans des eaux avec moins de 3 mg.L-1 Intolérante (HINTOL) Tolérance à la dégradation de Intermédiaire (HIM) l'habitat Tolérante (HTOL) Détritivore (DETR) : alimentation composée d'une forte proportion de détritus Herbivore (HERB) : alimentation composée d'au moins 75 % de matière végétale Insectivore (INSEV) : alimentation composé au moins de 75 % d'insectes Guilde trophique des adultes Omnivore (OMNI) : alimentation composée de plus de 25 % de matière végétale et de plus de 25 % de matière animale Parasite (PARA) : présente un mode d'alimentation parasitaire Piscivore (PISC) : alimentation composée de poisson à plus de 75 % Planctivore (PLAN) : alimentation composée d'au moins 75 % de phytoplancton Benthique (B) : espèce préférant vivre et se nourrir vers le fond Habitat d’alimentation Colonne d'eau (WC) : espèce qui vit et se nourrit dans la colonne d'eau Limnophile (LIMNO) : espèce qui préfère les courants faibles ou les eaux Affinité pour la vitesse du stagnantes courant Rhéophile (RH) : espèce qui préfère les eaux courantes Eurytope (EURY) : espèce avec une large tolérance (LIPAR) espèce pondant préférentiellementdans des eaux stagnantes Habitat de ponte (RHPAR) espèce pondant préférentiellement dans des eaux courantes (EUPAR) espèce sans préférence marquée Ariadnophile (ARIAD) : espèce spécialisée dans la construction d'un nid, comportement très souvent associé avec des soins parentaux Lithopelagophile (LIPE) : espèce pondant dans les rochers ou les graviers, avec des 
embryons pélagiques libres Lithophile (LITH) : espèce pondant exclusivement sur les graviers, galets, pierres, rochers ou débris avec des larves photophobes Ostracophile (OSTRA) : espèce qui pond dans les mollusques bivalves Pelagophile (PELA) : espèce qui pond dans la zone pélagique Reproduction Phyto-lithophile (PHLI) : espèce qui dépose ses œufs dans des habitats d'eau claire, sur des plantes ou d'autres supports immergés tels que des souches, des graviers, des rochers. Sa larve est photophobe Phytophile (PHYT) : espèce qui dépose ses œufs dans des eaux claires sur des plantes immergées Psammophile (PSAM) : espèce qui pond sur les racines ou sur l'herbe au- dessus d'un fond sableux ou sur le sable lui-même Spéléophile (SPEL) : espèce qui pond dans les espaces interstitiels, fissures ou cavernes


Table 3: Description of the 11 ecological and biological traits of the species (continued).

Trait: Reproductive behaviour
  Single (SIN): species spawning once during the reproductive season
  Fractional (FR): species that reproduces repeatedly during the season, or in which different fractions of the population reproduce at different times
  Protracted (PRO): species that reproduces over a long period during the reproductive season

Trait: Parental care
  (PROT): species whose embryonic or larval stages are protected
  (NOP): species providing no protection of early stages

Trait: Migratory behaviour
  Resident (RESID): species with limited movements, within a particular river segment
  Potamodromous (POTAD): species moving within the river network
  Anadromous (LMA): species living at the juvenile or sub-adult stage in the sea and migrating into rivers at maturity
  Catadromous (LMC): species whose early stages live in fresh water and which migrates to the sea at maturity to reproduce

2.5. Dataset used

After selecting the sites for which all environmental, anthropogenic pressure and faunal (abundance per species) information was available, the final dataset comprises 9,948 sites (Figure 4) sampled between 1974 and 2007 (87% of the sites after 1995). One hundred and forty-seven species and 1,938,339 individuals were caught. Length data are available for 3,436 fishing occasions, representing 727,976 individuals.

Figure 4: Location of the 9,948 sites. The member countries of the EFI+ project are shown in grey.


3. Results

3.1. Assumptions and constraints underlying the use of functional traits in bioindication

This chapter summarises the results of the studies carried out in articles P1, P2 and P3. The objectives are: to study the functional structure of assemblages (based on their composition in biological and ecological traits); to study the relative influence of the environment and of ecoregions on the trait composition of assemblages; to test the convergence of the response of community functional structure to environmental gradients in two regions with relatively distinct species pools; and to develop metrics specific to species-poor rivers that integrate individual size and species characteristics. Article P4 describes the development process of the EFI+ index and the contribution of my work to it.

3.1.1. Functional structures of assemblages and trait redundancy

The first step in developing a multimetric index (MMI) is to select metrics that are representative of the region of interest (Karr 1991, Angermeier et al. 2000). Consequently, the extent of this region and its species richness condition the choice of candidate metrics for an index.

Ideally, the selected metrics should make it possible to assess assemblages regardless of the environment in which they live. The diversity of environmental conditions and of rivers observed across the regions of Europe (Koster 2005, Tockner et al. 2009) is the first challenge in developing the European fish index. The metrics must be usable under highly contrasting climatic, hydrological and geomorphological conditions. They must be able to assess the ecological condition of Mediterranean, Alpine, coastal and Scandinavian rivers, among others.

The diversity of the European fish fauna is the second major challenge for the European fish index. This diversity is expressed both by the number of species present in Europe, slightly more than 500 according to Kottelat & Freyhof (2007), and by the disparity of regional species pools (Reyjol et al. 2007). By grouping the major river basins according to their faunal similarity, Reyjol et al. (2007) defined seven ichthyoregions. The richness and faunal composition of these regions are highly variable and depend on the distribution ranges of the species.

The niche of species (Hutchinson 1957, Pont et al. 2005), the spatial structure of the environment (Huet 1954, Grenouillet et al. 2004), geography (Hewitt 2004), and the three major evolutionary processes of speciation, extinction and migration (Stearns 1992) are the main natural factors responsible for the current distribution of species. A species can be present in a region only if the environmental conditions necessary for its existence are met (Hutchinson 1957). At the scale of a continent, however, a key environmental factor such as temperature varies with latitude, altitude and continentality (Ward 1985, Figure 1.5A in Tockner et al. 2009). The formation of river basins and the last glaciations are the two most recent geological events that contributed most to the current distribution of species. The geographical isolation of populations that followed the formation of river basins favoured allopatric speciation. This is notably the case in the Iberian Peninsula, where species of the same genus (e.g. Luciobarbus, Kottelat & Freyhof 2007) occur in different river basins (Gomez & Lunt 2007, Filipe et al. 2009). Basin boundaries are impassable geographical barriers, and each basin can be regarded as a biogeographical island (Hugueny 1989). Only anadromous species (Table 3) or species tolerating brackish waters are able to migrate from one basin to another.

The last glacial period of the Pleistocene (between 110,000 and 10,000 years ago) is a key period for the current distribution of species (Hewitt 1999, 2000, 2004, Kontula & Vainola 2001, Koskinen et al. 2002, Griffiths 2006). The covering of a large part of the European landmass by the ice sheet (Banarescu 1992, Keith 1998, Clark et al. 2009) caused extensive species extinctions (Hewitt 2000). At the glacial maximum, the ice sheet covered northern Europe, Iceland and the British Isles, and extended southwards into Germany and Poland. Only the southernmost regions, such as the Danube basin and the Ponto-Caspian region, and the Iberian Peninsula, served as refugia for the fish fauna (Banarescu 1992, Hewitt 1999, 2004). Phases of glacial expansion correspond to phases of range contraction (Qian & Ricklefs 2000). Conversely, glacial retreat corresponds to a phase of range expansion for some species (Hewitt 2000). The glaciated areas were recolonised by species present in the refugia (mainly the Ponto-Caspian region) via different migration routes (Banarescu 1989, 1992, Hewitt 1999, 2000, 2004, Nesbo et al. 1999, Kontula & Vainola 2001, Koskinen et al. 2002, Griffiths 2006). The Danube basin was the main dispersal centre for recolonisation. Huge cold-water lakes formed by the melting ice created passages between adjacent river basins (Banarescu 1992, Griffiths 2006). Cold-water, generalist species with a high dispersal capacity were the best able to recolonise the glaciated areas (Griffiths 2006). Mountain ranges, acting as geographical barriers, isolated the southernmost regions, such as the Iberian Peninsula, and prevented any recolonisation from these areas (Hewitt 2000, 2004). The most visible consequences of these events for the present European fish fauna are latitudinal and longitudinal gradients in species richness (Griffiths 2006) and a higher number of endemic species in the areas not reached by the ice (Bianco 1995, Reyjol et al. 2007). Western Europe has few endemic species and a poorer fauna, consisting mainly of species originating from Central Europe (Reyjol et al. 2007).

Consequently, at large scale, the patterns derived from the analysis of the faunal composition of assemblages mainly reflect the role of historical and geographical factors in the current distribution of species (Van Sickle & Hughes 2000, Heino 2005, Bremner et al. 2006a, Hoeinghaus et al. 2007). Using composition-based metrics (presence-absence or relative abundance of a species; Alcorlo et al. 2006, Pollard & Yuan 2009) to develop an index at the European scale therefore does not appear relevant: the spatial extent over which such metrics can be used is limited by the distribution range of each species. Metrics that group within a single variable the species sharing similar biological and ecological traits (Usseglio-Polatera et al. 2000, Melville et al. 2006, Hoeinghaus et al. 2007, Noble et al. 2007) make it possible to compare faunistically different assemblages (Moyle & Herbold 1987, Wiens 1991, Kelt et al. 1996, Smith & Ganzhorn 1996, Reich et al. 1997, Statzner et al. 1997, Winemiller & Adite 1997, Bellwood et al. 2002, Lamouroux et al. 2002, Vila-Gispert et al. 2002, Goldstein & Meador 2004, Statzner et al. 2004, Melville et al. 2006, Bonada et al. 2007, Hoeinghaus et al. 2007, Irz et al. 2007, Ibañez et al. 2009). Consequently, within the EFI+ project, only metrics based on the biological and ecological traits of species were considered (see chapter 2.4). Beyond their spatial coverage, the use of these traits has been recommended (Statzner et al. 2001, Bonada et al. 2006) and successfully applied in bioindication, notably at large scale (Pont et al. 2006, 2007, 2009).

The analysis of the association of the 41 trait categories (88 species, Table 3) within 849 slightly impacted or unimpacted European fish assemblages (P2) shows some redundancy between trait categories. This redundancy appears as the proximity of the projections of the categories on the first factorial plane of the PCA (Figure 5a). Two main gradients are observed: a tolerance gradient and a reproduction gradient. The first axis (48.3% of the inertia) opposes assemblages dominated by stenothermal individuals (STTHER) that are intolerant of hypoxia (O2INTOL) and of habitat degradation (HINTOL) to assemblages dominated by eurythermal individuals (EUTHER) with an intermediate tolerance to water oxygenation (O2IM) and tolerant of habitat degradation (HTOL). The first axis also represents a trophic and habitat gradient, opposing the "insectivore" and "rheophilic" categories to the "omnivore" and "eurytopic" categories. The second axis (15.4% of the inertia) opposes assemblages composed of weakly mobile individuals (RESID), without spawning habitat preferences (EUPAR) and with fractional spawning (FR), to assemblages of potamodromous individuals (POTAD) spawning preferentially in running water (RHPAR) and having a single annual spawning (SIN). The orthogonality observed between these two gradients reveals their relative independence: communities dominated by tolerant individuals may equally well be dominated by single spawners or by fractional spawners.

Figure 5: Ordination of the relative abundance of trait categories (with contributions > 0.5%) on the first factorial plane of the PCA (variance-covariance matrix). Categories displayed: FR, RESID, EUPAR, INSV, STTHER, O2INTOL, RH, NOP, HINTOL, LITH, PHLI, HTOL, HIM, O2TOL, O2IM, PROT, OMNI, EURY, EUTHER, POTAD, RHPAR, SIN.
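The ordination described above can be illustrated with a minimal Python sketch of a PCA on the variance-covariance matrix of a sites-by-categories table. This is not the analysis code of the thesis: the toy data, the function name `covariance_pca`, and the formula used for the contribution of a category to the first factorial plane (squared loadings weighted by the eigenvalues of the first two axes) are assumptions made for illustration.

```python
import numpy as np

def covariance_pca(table):
    """PCA of a sites x categories table on the variance-covariance matrix."""
    centred = table - table.mean(axis=0)          # centre columns, no scaling
    cov = centred.T @ centred / (table.shape[0] - 1)
    eigval, eigvec = np.linalg.eigh(cov)          # eigh returns ascending eigenvalues
    order = np.argsort(eigval)[::-1]              # reorder to descending
    eigval, eigvec = eigval[order], eigvec[:, order]
    inertia = eigval / eigval.sum()               # share of inertia per axis
    # Assumed definition: contribution of each category to the first plane
    plane = (eigval[:2] * eigvec[:, :2] ** 2).sum(axis=1) / eigval[:2].sum()
    return inertia, eigvec, plane

# Hypothetical relative abundances along one shared gradient (not thesis data)
rng = np.random.default_rng(0)
g = rng.uniform(0, 1, 60)
names = ["STTHER", "O2INTOL", "EUTHER", "OSTRA"]
table = np.column_stack([
    g,                               # stenothermal individuals
    g + rng.normal(0, 0.02, 60),     # nearly redundant with STTHER
    1 - g,                           # opposed to STTHER
    rng.uniform(0, 0.05, 60),        # rare category, almost no variance
])

inertia, loadings, plane = covariance_pca(table)
for name, a1, c in zip(names, loadings[:, 0], plane):
    print(f"{name:8s} axis-1 loading {a1:+.2f}  plane contribution {100 * c:.2f}%")
```

In this toy table, redundant categories receive loadings of the same sign on the first axis, and the rare category falls below the 0.5% contribution threshold, which mirrors the reading of Figure 5.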


Implication of these results for the development of a multimetric index.

The strong redundancy of some trait categories limits the number of metrics available for developing the index. Strongly linked categories such as STTHER, O2INTOL and HINTOL will a priori provide the same information on the state of assemblages, which runs counter to MMI theory (Karr & Chu 1999). If these metrics pass the successive selection phases (Hughes et al. 1998, Hering et al. 2006, Pont et al. 2006, Stoddard et al. 2008) and measure the effect of anthropogenic pressures, only one of them should be retained. In addition to their ability to detect pressures, the geographical distribution of metrics must be a selection criterion. Nineteen of the 41 categories contribute only very weakly to the first factorial plane (relative contribution below 0.5%) and are not shown in Figure 5a. Some of these categories are infrequent within assemblages and therefore take very little part in the analysis. Ostracophilic reproduction (OSTRA) and parasitic (PARA) and herbivorous (HERB) diets are present in fewer than 10% of the sites. Rare metrics are poorly representative of assemblages and are therefore of little use for developing an index. They can, however, be used to develop additional tools specific to certain types of assemblages or rivers. Some traits may vary independently of the two major gradients and contribute to other axes. This is the case for feeding habitat, whose categories are mainly represented on the third axis of the PCA. Such categories are of great interest because they carry complementary information on assemblage structure. Likewise, categories that are relatively stable between assemblages, and therefore take very little part in the analysis, are potentially useful for developing MMIs (Charvet et al. 2000, Statzner et al. 2001, Hering et al. 2006).

3.1.2. Influence of the environment on associations of biological and ecological traits

Theory

Several hypotheses have been put forward to explain the association of traits within assemblages. The neutral theory developed by Hubbell (2001) considers every individual of every species to be ecologically equivalent. According to this theory, community structure is due solely to ecological drift and is assumed to be independent of the niche and of species characteristics. The theory predicts no link between the environment, species interactions and the relative abundance of species or their traits.


Conversely, the theory of limiting similarity states that the relative abundance of species in an assemblage depends on interspecific competition (MacArthur & Levins 1967). It suggests that the co-occurrence and abundance of species with similar characteristics (trait, niche, morphology) within the same assemblage are limited (Weiher & Keddy 1995, Winston 1995, Kelt & Brown 1999, Mouillot et al. 2007, Cornwell & Ackerly 2009). Under this hypothesis, traits are expected to be overdispersed. The "habitat templet" theory (Southwood 1977, Southwood 1988, Townsend & Hildrew 1994) and the "landscape filters" theory (Tonn et al. 1990, Poff 1997) consider habitat to be the key factor explaining the distribution of species within assemblages. The habitat templet theory holds that the spatio-temporal variability of the environment drives the evolution of species traits. The filter theory assumes that the species composition of a community depends on the regional species pool and on the action of successive filters (Figure 6). Each filter acts at a different spatial scale (Poff 1997) and selects the species possessing the appropriate characteristic or characteristics (Keddy 1992, Diaz et al. 1998, Diaz et al. 1999, Weiher & Keddy 1999, Lamouroux et al. 2004, Statzner et al. 2004). Only species with the characteristics required to persist and develop in a given environment are selected. According to this theory, traits should be underdispersed within assemblages (Mouillot et al. 2007).

Figure 6: The theory of ecological filters, adapted from Poff (1997). According to this author, the presence and abundance of a species in a given environment depend on its characteristics (traits). The filters shown act successively from the regional species pool through the watershed, the valley, the channel unit and the microhabitat down to the local assemblage.

Analysis and results

The results of the RDA (Redundancy Analysis, or PCA on instrumental variables; Lebreton et al. 1991, Legendre & Legendre 1998, Baty et al. 2006) confirm that the environment strongly influences the structure of assemblages (Poff & Allan 1995, Diaz et al. 1998, Bellwood et al. 2002, Oberdorff et al. 2002, Goldstein & Meador 2004, Haybach et al. 2004, Heino et al. 2005, Bêche et al. 2006, Bremner et al. 2006a, Yoshimura et al. 2006, Bonada et al. 2007, Hoeinghaus et al. 2007, Ibañez et al. 2007, Horrigan & Baird 2008, Tedesco et al. 2008, Cornwell & Ackerly 2009, Friberg et al. 2009, Ibañez et al. 2009), and of fish assemblages in Europe in particular (Lamouroux et al. 2002, Santoul et al. 2005, Pont et al. 2006, Blanck & Lamouroux 2007). Temperature, thermal amplitude, slope, substrate size and geomorphology explain 30.5% of the trait structure of assemblages (Monte-Carlo test, P < 0.001). The effect of the environment is mainly represented on the first axis of the RDA (83.2% of the inertia explained by the analysis). The tolerance and trophic gradients are relatively well explained by the environment. The reproduction gradient is less distinct than in the PCA (Figure 5 and Figure 7a).

Assemblages are structured along a thermal gradient, "Tjuly" (Wehrly et al. 2003), and along a physical gradient represented by slope and SYNGEO1 (Figure 7b; Huet 1954, Matthews 1998, May & Brown 2002). The relative orthogonality between these two gradients (Figure 7b) suggests that, at large scale, they structure communities independently. Once the environment is taken into account, the inertia explained by ecoregions is much lower (Sandin & Johnson 2004, Bremner et al. 2006b, Hewitt et al. 2008), at about 7%. The spatial variability observed in the trait composition of communities is therefore mainly due to the spatial structure of the environment (Hoeinghaus et al. 2007) in Europe (Koster 2005, Tockner et al. 2009). At large scale, temperature varies with latitude, altitude and continentality (Ward 1985), whereas the physico-chemical conditions of a river change along the upstream-downstream gradient (Allan & Castillo 2007).
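As an illustration of the method named above, the following Python sketch implements an RDA as a multivariate linear regression of the trait table on the environmental table, followed by a PCA of the fitted values, with a simple permutation (Monte-Carlo) test on the explained share of inertia. It is a sketch under stated assumptions, not the thesis code: the function names, the toy data and the choice of 199 permutations are hypothetical.

```python
import numpy as np

def rda(traits, env):
    """RDA: regress centred traits on centred env, then PCA of fitted values."""
    Y = traits - traits.mean(axis=0)
    X = env - env.mean(axis=0)
    coef, *_ = np.linalg.lstsq(X, Y, rcond=None)
    fitted = X @ coef
    explained = (fitted ** 2).sum() / (Y ** 2).sum()   # share of total inertia
    _, s, vt = np.linalg.svd(fitted, full_matrices=False)
    axis_share = s ** 2 / (s ** 2).sum()               # share per constrained axis
    return explained, axis_share, vt.T

def permutation_pvalue(traits, env, n_perm=199, seed=0):
    """Monte-Carlo test: permute sites of the trait table, recompute the share."""
    rng = np.random.default_rng(seed)
    observed = rda(traits, env)[0]
    hits = sum(
        rda(traits[rng.permutation(len(traits))], env)[0] >= observed
        for _ in range(n_perm)
    )
    return (hits + 1) / (n_perm + 1)

# Hypothetical data: two predictors (say TJULY and SLOP) and three categories
rng = np.random.default_rng(0)
n = 100
env = np.column_stack([rng.normal(size=n), rng.normal(size=n)])
weights = np.array([[1.0, -1.0, 0.5],
                    [0.2, 0.3, -0.8]])
traits = env @ weights + rng.normal(0, 0.3, size=(n, 3))

explained, axis_share, loadings = rda(traits, env)
p_value = permutation_pvalue(traits, env)
print(f"inertia explained: {100 * explained:.1f}%")
print(f"first constrained axis: {100 * axis_share[0]:.1f}% of explained inertia")
print(f"permutation p-value: {p_value:.3f}")
```

The two printed percentages correspond to the two quantities reported in the text: the share of total inertia explained by the environment, and the share of that explained inertia carried by the first axis.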


Figure 7: Results of the RDA: a) ordination of the trait categories; b) correlation between the environmental variables (TDIF, TJULY, SLOP, SYNGEO1, SYNGEO2, NATSED-small, NATSED-medium) and the first two axes.

Effect of ecoregions

The ordination of the sites clearly shows a spatial structure in the distribution of traits (Figure 8). The between-ecoregion inertia (ecoregions adapted from Illies 1978, Figure 2) represents 29.4% of the total inertia (Monte-Carlo test, P < 0.001). Once the effect of the environment is taken into account, ecoregions explain only 7% of the inertia of assemblage structure. This effect, although weak, suggests interregional variation in the response of traits to environmental conditions.

Figure 8: Ordination of the sites on the first factorial plane of the PCA. The grey squares are placed at the barycentre of each ecoregion (cf. Figure 2).

Potential limits of these results: a majority of small or medium-sized rivers.

The dataset of 849 slightly disturbed or undisturbed sites contains few large rivers. Ninety-five percent of the sites are located in rivers with a Strahler order of five or less. This restriction of the dataset is mainly due to the increase in human activities in the downstream reaches of rivers. The dataset probably reflects only partially the assemblages and processes of large rivers with a floodplain. It is nevertheless representative of the EFI+ project dataset, since 92% of the 9,948 sites are located in rivers with a Strahler order of five or less.

Implication of these results for the development of the index.

Once the environment is taken into account, ichthyoregions (regions grouping river basins according to their faunal similarity; Reyjol et al. 2007) explain 2.3% of the residual inertia (P < 0.001), that is 1.6% of the total inertia of assemblage structure. This result shows that, at large scale, trait-based metrics make it possible to overcome regional faunal differences (Pont et al. 2006). The same metric can be representative of regions with similar environmental conditions but different species pools. Conversely, although assemblage composition varies with the environment (Huet 1954, Rahel & Hubert 1991, Schlosser 1991, Hawkins et al. 1997, Matthews 1998, Magalhaes et al. 2002a, Wehrly et al. 2003, Hoeinghaus et al. 2007, Infante et al. 2009), at large scale the patterns derived from taxonomic analysis mostly reflect geographical differences linked to species distributions (Oswood et al. 2000, Van Sickle & Hughes 2000, Hoeinghaus et al. 2007). Consequently, the use of metrics based on "identity", that is on species composition, will be spatially limited.

Ideally, the score associated with a metric reflects only the degree of alteration of the site (Hughes et al. 1998, Karr & Chu 1999, Hering et al. 2006, Pont et al. 2007, Stoddard et al. 2008, Pont et al. 2009). The between-site variation in scores should represent the variability in the degree of alteration between sites. In the "reference condition approach" (Bailey et al. 1998), the first step is to compare the observed value of the metric with the value expected in the absence of pressures. The results of the RDA show that the environment must be taken into account when computing the scores of metrics based on these trait categories (Oberdorff et al. 2002, Pont et al. 2006, Pont et al. 2007). Without this step, it seems impossible to separate the share of score variability due to the environment from the share due to pressures.

One possibility is to model the variability of metrics as a function of the environment (Oberdorff et al. 2002, Baker et al. 2005, Pont et al. 2007, Pont et al. 2009, Hawkins et al. 2010), using reference sites (Stoddard et al. 2006) as calibration sites. These sites are selected on objective criteria (Whittier et al. 2007) as being slightly impacted or unimpacted (e.g. Hughes et al. 1998, Bates Prins & Smith 2007, Pont et al. 2007, Stoddard et al. 2008, Pont et al. 2009). The value predicted by the model is the expected value of the metric for a given environment. Environmental variability is accounted for by subtracting the expected value of the metric from the observed value, which gives the residual. Residuals "measure the range of variation of a metric expected after removing the effect of abiotic predictors" (Pont et al. 2009). The reliability of the predicted expected values, for a given environment in the absence of pressures, will partly determine the detection capacity of the metrics. Without consistent predictions, it is impossible to know what the deviation between observed and expected values represents, whatever the method used: linear and generalised linear models (Pont et al. 2006, Pont et al. 2007, Pont 2010), discriminant analysis (Joy & Death 2002), nearest neighbour (Bates Prins & Smith 2007), random forest (Hawkins et al. 2010), maximum value line (Fausch et al. 1984, Hughes et al. 2004, Roset et al. 2007), etc.
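The residual approach described above can be sketched in a few lines of Python. The example assumes the simplest possible case, a straight-line model with a single environmental predictor calibrated on reference sites; the function names and the numerical values are hypothetical and do not come from the thesis, which relies on richer models.

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x (one predictor)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def residual(model, env_value, observed):
    """Observed value minus the value expected in the absence of pressures."""
    intercept, slope = model
    expected = intercept + slope * env_value
    return observed - expected

# Hypothetical calibration on reference sites: July temperature (deg C)
# against a metric such as the number of intolerant species
ref_temperature = [10.0, 14.0, 18.0, 22.0]
ref_metric = [8.0, 6.0, 4.0, 2.0]
model = fit_line(ref_temperature, ref_metric)

# Hypothetical test site at 16 deg C with 3 intolerant species observed
score = residual(model, 16.0, 3.0)
print(model, score)
```

A negative residual indicates fewer intolerant species than expected for that environment; whether the deviation reflects a pressure depends on the reliability of the calibrated model, as discussed in the text.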

The residual inertia explained by ecoregions (see the paragraph on ecoregions) suggests that the variations of trait categories along environmental gradients may differ between ecoregions. Can the same model therefore be used to predict the expected values of a metric for all regions (Pont et al. 2006, Pont et al. 2007)? Or should one model per region be used (Pont et al. 2009)? To answer these questions, the convergence of the response of communities to environmental variations must be tested between regions (Wiens 1991, Smith & Ganzhorn 1996, Bellwood et al. 2002, Lamouroux et al. 2002, Irz et al. 2007, Ibañez et al. 2009, Hugueny et al. 2010).


3.1.3. Testing the convergence of communities of the Iberian Peninsula and Western Europe

The convergence of the structure of communities living in similar environments is predicted by the theory of local determinism (Ricklefs & Schluter 1993, Ricklefs 2006). This theory considers that the environment is the main determinant of community structure (Tonn et al. 1990, Townsend & Hildrew 1994, Weiher & Keddy 1995, Poff 1997, Diaz et al. 1998, Bellwood et al. 2002) and that evolution is unidirectional. The same strategies, that is the same sets of traits, would be selected to cope with the same environmental conditions. The theory of local determinism thus predicts the convergence of communities living in similar environments, regardless of the geographical proximity and evolutionary history of the regions.

Two types of convergence, quantitative and qualitative, can be observed (Hugueny et al. 2010). Quantitative convergence corresponds to identical responses to the environment in the two regions (Figure 9a; Irz et al. 2007). The same model can then be used to predict assemblage structure whatever the region. Under qualitative convergence, the responses to the environment in the two regions are analogous but not identical (Figure 9b, c and d). Region-specific models are then needed to predict assemblage structure.

Conversely, special factors (Ricklefs 2006) such as historical factors (Ricklefs & Schluter 1993, Smith & Ganzhorn 1996, Samuels & Drake 1997, Lobo & Davis 1999, Qian & Ricklefs 2000, Tedesco et al. 2005, Vitt & Pianka 2005), for instance sequence effects (Samuels & Drake 1997) or coevolution between species (Smith & Ganzhorn 1996), can lead to divergence between regions (Figures 9e, f; Wiens 1991, Smith & Ganzhorn 1996, Irz et al. 2007).

Interregional environmental variability is an expected source of divergence between communities. Assemblages evolving in different environments should display different structures (Wiens 1991).


Figure 9: Relationship between a trait category (y-axis) and the environment (x-axis) in two regions (black or grey line): a) quantitative convergence (the same model for both regions); b), c) and d) qualitative convergence, the relationships vary in the same direction in both regions; e) and f) no convergence between regions: in e) the trait category varies with the environment in one region but not in the other, and in f) the relationships to the environment are opposite.

The convergence of the response of community structures to environmental gradients was tested by comparing the assemblages of France and Belgium (FB) with the Iberian assemblages (IP). These two regions have relatively different evolutionary histories (Banarescu 1992, Hewitt 2000, 2004) and faunas (Ferreira et al. 2007a, Reyjol et al. 2007). The Iberian Peninsula was not affected by the last glaciations (Hewitt 2004) and has many endemic species (Reyjol et al. 2007). Of the 57 species recorded in the two regions, only 10 are common to both. Three models with different hypotheses were defined for each trait category. The first model assumes quantitative convergence between the regions (Figure 9a). The second model assumes that the responses to the environment are parallel between the regions (qualitative convergence, Figure 9b). The third model allows the responses of traits to the environment to be similar but with amplitudes that differ between regions (Figures 9c, d), opposite between regions (Figure 9f), or visible in one region and absent in the other (Figure 9e). The pairwise comparison of the models (Table 4) shows that the responses of assemblage structure to the environment are globally convergent between the two regions (Bellwood et al. 2002, Lamouroux et al. 2002, Tedesco et al. 2005, Hugueny et al. 2010). Of the 17 metrics tested, three show quantitative convergence, 11 show qualitative convergence (eight of them with parallel responses) and three show non-convergent responses (Table 4). These results tend to show that the structures of assemblages drawn from different regional species pools can respond similarly to environmental variations (Bellwood et al. 2002, Lamouroux et al. 2002, Hoeinghaus et al. 2007). The environment can be regarded as a major selective force acting on species traits (Tonn et al. 1990, Keddy 1992, Townsend & Hildrew 1994, Weiher & Keddy 1995, Poff 1997, Townsend et al. 1997, Diaz et al. 1999, Cornwell et al. 2006, Cornwell & Ackerly 2009). These results are consistent with those of the RDA (chapter 3.1.2).
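The three nested hypotheses can be written as three linear models and compared with F statistics, as in the sketch below. This is an illustration only: it assumes ordinary least squares with a single environmental predictor and a two-level region factor, uses simulated data with a built-in additive regional effect, and the helper names `rss` and `nested_f` are hypothetical. P-values would come from the F distribution with the returned degrees of freedom, which is left to a statistics library.

```python
import numpy as np

def rss(design, y):
    """Residual sum of squares and number of parameters of an OLS fit."""
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    return float(resid @ resid), design.shape[1]

def nested_f(small, large, y):
    """F statistic comparing a model with a larger model that contains it."""
    rss_s, p_s = rss(small, y)
    rss_l, p_l = rss(large, y)
    n = len(y)
    f = ((rss_s - rss_l) / (p_l - p_s)) / (rss_l / (n - p_l))
    return f, (p_l - p_s, n - p_l)

# Simulated metric with parallel responses in the two regions (additive effect)
rng = np.random.default_rng(1)
n = 200
env = rng.uniform(0, 1, n)
region = (np.arange(n) % 2).astype(float)        # 0 = FB, 1 = IP (hypothetical coding)
y = 1.0 + 2.0 * env + 0.8 * region + rng.normal(0, 0.1, n)

ones = np.ones(n)
m1 = np.column_stack([ones, env])                        # model 1: same response
m2 = np.column_stack([ones, env, region])                # model 2: parallel responses
m3 = np.column_stack([ones, env, region, env * region])  # model 3: region-specific slopes

f12, df12 = nested_f(m1, m2, y)
f23, df23 = nested_f(m2, m3, y)
print(f"model 1 vs 2: F = {f12:.1f}, df = {df12}")
print(f"model 2 vs 3: F = {f23:.2f}, df = {df23}")
```

With data simulated this way, the comparison of models 1 and 2 yields a very large F statistic while the comparison of models 2 and 3 does not, which is the pattern labelled as an additive regional effect in Table 4.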

Tableau 4 : Statistiques et p-valeur ajustées des tests, F pour les métriques en densité (Ni), de 2 pour les métriques basées sur la richesse (Ns), de comparaison des modèles emboités. La méthode de Benjamini et Yekutieli (2001), qui contrôle le taux de mauvaise erreur « false discovery rate » (FDR), a été utilisée pour ajuster les p-valeurs du fait des comparaisons multiples (Dudoit & van der Laan 2008). L’effet régional est non significatif (ns) si aucune des comparaisons n’est significative ; additif si la comparaison entre les modèles 1 et 2 est significative mais pas la comparaison entre les modèles 2 et 3 ; et interactif si la comparaison entre les modèles 2 et 3 est significative. Les métriques peuvent être soit non convergentes, présenter une convergence quantitative ou qualitative.

            Model 1 vs Model 2      Model 2 vs Model 3      Regional
Metric      Statistic   P-value     Statistic   P-value     effect        Convergence
DENS         4.023       0.309       4.522       0.008      interactive   none
Ni_EURY      1.545       1.000       4.729       0.006      interactive   qualitative
Ni_INSEV     0.196       1.000       5.586       0.002      interactive   qualitative
Ni_LITH      2.576       0.617       3.184       0.068      ns            quantitative
Ni_OMNI      0.964       1.000       1.854       0.610      ns            quantitative
Ni_POTAD    13.897       0.004       6.191       0.001      interactive   none
Ni_RHEO     12.288       0.007       5.454       0.002      interactive   none
Ni_TOLE      9.897       0.018       0.470       1.000      additive      qualitative
RICH        27.039     < 0.001      13.150       0.154      additive      qualitative
Ns_BENTH    19.022     < 0.001       1.157       1.000      additive      qualitative
Ns_EURY      1.777       0.983      16.880       0.041      interactive   qualitative
Ns_INTOL    33.438     < 0.001       4.506       1.000      additive      qualitative
Ns_LITH     16.342       0.001       4.556       1.000      additive      qualitative
Ns_POTAD     0.819       1.000      15.563       0.068      ns            quantitative
Ns_RHEO     31.059     < 0.001      10.230       0.439      additive      qualitative
Ns_TOLE      9.604       0.018       9.560       0.540      additive      qualitative
Ns_WATE      9.633       0.018      14.862       0.081      additive      qualitative
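The adjustment and the decision rule described in the caption of Table 4 can be sketched as follows. This is a minimal illustration, not the code used for the analysis: the function names are hypothetical, and the 0.05 significance threshold is an assumption (it is consistent with the table but is not stated in the caption).

```python
def by_adjust(pvalues):
    """Benjamini-Yekutieli adjusted p-values (step-up, valid under dependence)."""
    m = len(pvalues)
    c_m = sum(1.0 / k for k in range(1, m + 1))  # harmonic correction factor
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        value = min(1.0, pvalues[idx] * m * c_m / rank)
        running_min = min(running_min, value)
        adjusted[idx] = running_min
    return adjusted


def regional_effect(p_m1_vs_m2, p_m2_vs_m3, alpha=0.05):
    """Classify the regional effect from the two adjusted p-values.

    alpha = 0.05 is an assumed threshold.
    """
    if p_m2_vs_m3 < alpha:
        return "interactive"
    if p_m1_vs_m2 < alpha:
        return "additive"
    return "ns"
```

For instance, the adjusted p-values reported for DENS (0.309 and 0.008) give an interactive effect, and those for Ni_TOLE (0.018 and 1.000) give an additive effect.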

Most of the metrics that converge between regions show qualitative convergence (Irz et al. 2007, Hugueny et al. 2010). Of these eleven metrics, eight show parallel responses and three show similar response patterns but with different amplitudes of variation (Figure 10 and Figure 11). Only the number of potamodromous species (Ns_POTAD, Figure 11), the density of lithophilic species (Ni_LITH, Figure 10) and the density of omnivorous species (Ni_OMNI, Figure 10) show quantitative convergence. The differences observed between regions (constant gap or different response amplitude) probably result from geographical or historical processes (Ricklefs & Schluter 1993, Lobo & Davis 1999, Qian & Ricklefs 2000, Lamouroux et al. 2002, Johnson et al. 2004, Tedesco et al. 2005, Ricklefs 2006, Irz et al. 2007, Hugueny et al. 2010).


Figure 10: Marginal effect (Fox 1987, 2003) of temperature (TEMP), slope (SLOP) and distance from source (DIS) on the metrics based on total fish density (DENS) and on the density of eurytopic (EURY), insectivorous (INSEV), lithophilic (LITH), omnivorous (OMNI), potamodromous (POTAD), rheophilic (RHEO) and tolerant (TOLE) fish. Only the variables, or the interactions between variables and region, with a significant effect are shown (drop-in-deviance F-tests, Chambers & Hastie 1993). The common response is drawn as a dotted line (model 1), the response of the IP region in black and that of the FB region in grey. Metrics with significant interactions (model 3) are marked with an asterisk. Calcareous (c) and siliceous (s) environments are distinguished when the interaction between geology and region (GEO × REG) is significant. "ns" is added to a panel when the environmental variable had a significant effect but its interaction with region did not. Panels, from top to bottom: log(DENS)*, log(Ni_EURY)*, log(Ni_INSEV)*, log(Ni_LITH), log(Ni_OMNI), log(Ni_POTAD)*, log(Ni_RHEO)*, log(Ni_TOLE); x-axes: TEMP, log(SLOP), log(DIS).


Figure 11: Marginal effect (Fox 1987, 2003) of temperature (TEMP), slope (SLOP) and distance from source (DIS) on the metrics based on local species richness (RICH) and on the richness (Ns) of benthic (BENTH), eurytopic (EURY), intolerant (INTOL), lithophilic (LITH), potamodromous (POTAD), rheophilic (RHEO) and tolerant (TOLE) fish and of fish living and feeding in the water column (WATE). Only the variables, or the interactions between variables and region, with a significant effect are shown (drop-in-deviance χ² tests, Chambers & Hastie 1993). The common response is drawn as a dotted line (model 1), the response of the IP region in black and that of the FB region in grey. Metrics with significant interactions (model 3) are marked with an asterisk. Panels, from top to bottom: RICH, Ns_BENTH, Ns_EURY*, Ns_INTOL, Ns_LITH, Ns_POTAD, Ns_RHEO, Ns_TOLE, Ns_WATE; x-axes: TEMP, log(SLOP), log(DIS).

For streams of equal size, the local richness of Iberian Mediterranean streams is lower than that of Western European streams (Reyjol et al. 2007) and of French Mediterranean streams (Ferreira et al. 2007a). A maximum of six species was observed in the Mediterranean assemblages, in line with earlier observations (Godinho et al. 1997, Carmona et al. 1999, Pires et al. 1999, Godinho et al. 2000, Magalhaes et al. 2002a,b, Mesquita & Coelho 2002, Clavero et al. 2004, Pires et al. 2004, Clavero & Garcia-Berthou 2006, Mesquita et al. 2006, Ferreira et al. 2007a, Ferreira et al. 2007b), against a maximum of 20 species in French and Belgian streams. The gaps between the responses of the richness-based metrics to environmental gradients (Figure 11) are therefore partly due to the difference in local richness between the two regions (Reyjol et al. 2007, Leprieur et al. 2008b). Local species richness itself depends on regional species richness (Hugueny et al. 2010). Part of the quantitative convergences is therefore probably due to regional differences in the size of the species pools (Hugueny et al. 2010).

Limits of these results: the weight of introductions in the Iberian fauna

Introductions of species native to central or western Europe (Clavero & Garcia-Berthou 2006) may have blurred the results of the convergence tests (Leprieur et al. 2008a). Of the 10 species common to both regions, three are native to FB but not to IP (Kottelat & Freyhof 2007): the gudgeon (Gobio gobio, L.), the chub (Squalius cephalus, L.) and the minnow (Phoxinus phoxinus, L.). These species occur in 46.7% of the IP sites and account for between 1.5 and 98.5% of the individuals where they are present (Figure 12). This homogenisation may have favoured the convergence of the traits carried by these species.

Figure 12: Relative abundance (%) of gudgeon, chub and minnow within the 35 IP assemblages in which they are present.

Perspectives: how does phylogeny influence community structure and convergence?

Convergence between the communities of two regions is more readily observed when the species of the two regions, although different, belong to the same lineages (Bellwood et al. 2002, Lamouroux et al. 2002). Conversely, regions whose species belong to different lineages rarely show convergent patterns (Cadle & Greene 1993, Smith & Ganzhorn 1996). The different evolution of lineages within the same phylum, and their different geographical spread, can explain the divergences in assemblage structure observed between two regions (Cadle & Greene 1993, Ricklefs & Schluter 1993). The evolutionary history of a lineage can profoundly shape the present-day characteristics of its species (Vitt & Pianka 2005). Although only 10 species are common to the two regions, the great majority of species belong to the same genera, subfamilies or families. This phylogenetic proximity of the regional species pools probably favours the convergence observed between the two regions (Bellwood et al. 2002). New techniques that account for phylogeny are being developed and should help to address this question in the future (for example, Pavoine et al. 2010).

Consequences for the development of bioindicators

These results confirm the importance of accounting for the environment when establishing the scores associated with each metric (Joy & Death 2002, Oberdorff et al. 2002, Hughes et al. 2004, Pont et al. 2006, Bates Prins & Smith 2007, Pont et al. 2007, Roset et al. 2007, Hawkins et al. 2010). They also show that the environmental variables explaining the variability of the metrics differ from one metric to another. It is therefore essential to select, for each metric, the environmental variables that best explain its variability (Oberdorff et al. 2002, Pont et al. 2006, Bates Prins & Smith 2007, Pont et al. 2009, Hawkins et al. 2010). Keeping all the environmental variables to predict expected values would lead to over-fitting (Hastie et al. 2009): the more variables a model includes, the lower its prediction error on the calibration dataset, but the higher its prediction error on an independent dataset is likely to be (see Figure 2.11, p. 38, in Hastie et al. 2009). The small number of metrics showing quantitative convergence implies that region must be taken into account when using metrics based on density or on number of species (Oberdorff et al. 2002, Pont et al. 2006, 2009). Using a common model to predict the expected values of these metrics would bias the estimates. When the responses are parallel between regions, a common model would, on average, overestimate the expected values in one region and underestimate them in the other.


Nevertheless, Mediterranean regions host many endemic species (Reyjol et al. 2007) and lineages of their own, for example the genus Iberochondrostoma (Kottelat & Freyhof 2007). The singularity of the Mediterranean fauna may therefore have exacerbated the functional divergences observed with France and Belgium. Conversely, species differences between the other regions of Europe are much less marked (Reyjol et al. 2007). It can thus be assumed that the functional structures of the assemblages in these regions are relatively similar, and consequently that the same metrics should be usable across all of them (Pont et al. 2006, 2007). Mediterranean regions probably constitute a special case in Europe. These hypotheses remain to be tested. One way to overcome regional differences is to use metrics based on ratios (Matzen & Berge 2008) of the number of species or individuals displaying a given trait category.

3.1.4. Development of metrics specific to streams with low species richness

In Europe, streams with low species richness (fewer than four species) are mostly Mediterranean streams, coastal streams and cold streams. The last group has received particular attention because of the specificity of its environment and of its assemblages (Lyons et al. 1996, Mundahl & Simon 1999, Mebane et al. 2003, Wang et al. 2003, Hughes et al. 2004). Besides their low richness, the assemblages of these streams consist mainly of species with similar characteristics (Halliwell et al. 1999, Zaroban et al. 1999), most of them intolerant (Figure 7, section 3.1.2). The singular response of cold-water fish communities to anthropogenic pressures supports the development of dedicated bioindication tools (Lyons et al. 1996, Mundahl & Simon 1999, Wang et al. 2003, Hughes et al. 2004). Many metrics respond to pressures in opposite ways in warm and cold streams (Lyons et al. 1996, Mebane et al. 2003). Typically, species richness decreases with habitat alteration in a warm stream, whereas it may increase in a cold stream (Lyons et al. 1996, Mundahl & Simon 1999).


Developing metrics and indices for naturally species-poor streams (Harris & Silveira 1999) is a challenge. Low richness limits both the number of metrics and their variability (Simon & Lyons 1995, Lyons et al. 1996). In response, some indices developed for these streams have either included metrics based on other vertebrate groups (Hughes et al. 2004) or used fewer metrics than for warm streams (Lyons et al. 1996, Langdon 2001, Southerland et al. 2007). Another option is to consider other assemblage attributes, such as age or size structure (Oberdorff & Porcher 1994, Langdon 2001, Breine et al. 2004, Hughes et al. 2004). This is the option explored here. The aim was to develop new metrics suited to assemblages with low species richness, which live mostly in cold streams. These metrics had to account for both the traits of the species and the size of the individuals. Eight trait categories representative of these assemblages (Figure 5) were considered: intolerant to most pressures (INTOL), intolerant to low oxygen concentrations (O2INTOL), intolerant to habitat degradation (HINTOL), rheophilic (RHEO), insectivorous (INSEV), potamodromous (POTAD), lithophilic (LITH) and spawning only once a year (SIN, see Table 3). Size structure was taken into account by considering either small or large individuals, defined relative to a size limit. Because this study is a preliminary step in the development of such metrics, three size limits were tested to define the size classes: 100, 150 and 200 mm. The first step is to count the individuals that possess the trait of interest, for example the number of O2INTOL individuals.
The second step is to separate, among these individuals, the fish smaller or larger than the size limit, depending on the size class considered. Each metric is the number of individuals that display a given trait and belong to the size class of interest (small or large), for example the number of small rheophilic individuals, below 100 mm (Table 5). Small individuals include both the individuals of naturally small species (e.g. the bullhead, Cottus gobio, L.) and the young individuals of large species (e.g. the barbel, Barbus fluviatilis, L.) (Figure 13). To account for this dichotomy, the same metrics were computed either on all the species of the assemblage (Figure 13a) or on large species only (maximum species size 300 mm; Figure 13b).
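The two counting steps can be sketched in a few lines. The record layout, the function name and the sample below are hypothetical; the sample composition (14 trout, 5 of them small, and 7 small bullheads) is an assumption chosen only so that the counts reproduce the 12/21 and 5/14 ratios quoted for Figure 13. Assigning a fish exactly at the size limit to the large class is also an assumption.

```python
from collections import namedtuple

# One record per captured fish (hypothetical layout).
Fish = namedtuple("Fish", "species length_mm has_trait large_species")


def size_trait_metric(fish, size_limit_mm, size_class, large_species_only=False):
    """Count individuals bearing the trait that fall in the requested size class."""
    count = 0
    for f in fish:
        if not f.has_trait:
            continue  # step 1: keep only individuals with the trait of interest
        if large_species_only and not f.large_species:
            continue  # optional restriction to the pool of large species
        is_small = f.length_mm < size_limit_mm  # step 2: split by size limit
        if (size_class == "small") == is_small:
            count += 1
    return count


# Hypothetical sample: 14 trout (5 below the limit) and 7 bullheads (all below).
sample = (
    [Fish("Salmo trutta", 80, True, True)] * 5
    + [Fish("Salmo trutta", 220, True, True)] * 9
    + [Fish("Cottus gobio", 60, True, False)] * 7
)

small_all = size_trait_metric(sample, 150, "small")
small_large_spp = size_trait_metric(sample, 150, "small", large_species_only=True)
n_total = len(sample)
n_large_spp = sum(1 for f in sample if f.large_species)
```

With this sample, `small_all / n_total` corresponds to case (a) and `small_large_spp / n_large_spp` to case (b).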


Figure 13: Illustration of how the metrics are computed for a given trait category. In this example, two species, the trout and the bullhead, bear the trait of interest (black outlines), and only small individuals are counted (shaded grey). (a) All individuals of both species are considered, i.e. 12/21. (b) Only small trout are used, the bullhead not being a large species, i.e. 5/14.

For each of the eight traits, 12 metrics were computed, one for each combination of species pool (all species or large species only), size limit (100, 150 or 200 mm) and size class (small or large individuals; Table 5). In total, 96 metrics were thus developed.

Table 5: Computation of the 12 metrics for a given trait.

Species pool     Threshold (mm)   Size class
All species      100              Small
                                  Large
                 150              Small
                                  Large
                 200              Small
                                  Large
Large species    100              Small
                                  Large
                 150              Small
                                  Large
                 200              Small
                                  Large
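The count of 96 follows directly from the combinations. A short enumeration, in which the label format is a hypothetical naming convention and not one used in this work, makes the arithmetic explicit:

```python
from itertools import product

TRAITS = ["INTOL", "O2INTOL", "HINTOL", "RHEO", "INSEV", "POTAD", "LITH", "SIN"]
POOLS = ["all_species", "large_species"]
SIZE_LIMITS_MM = [100, 150, 200]
SIZE_CLASSES = ["small", "large"]

# One label per combination: 8 traits x 2 pools x 3 limits x 2 classes.
metric_labels = [
    f"{trait}_{pool}_{limit}_{size_class}"
    for trait, pool, limit, size_class in product(
        TRAITS, POOLS, SIZE_LIMITS_MM, SIZE_CLASSES
    )
]

per_trait = len(POOLS) * len(SIZE_LIMITS_MM) * len(SIZE_CLASSES)
```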

Each metric was then related to the environment using generalised linear models (GLM; McCullagh & Nelder 1989, Faraway 2006). In a GLM, the linear predictor $\eta$ is linearly related to the explanatory variables X (the environmental variables) by:

$$\eta = \beta_0 + \sum_i \beta_i X_i \qquad (3.1)$$

with $\beta_0$ the intercept and $\beta_i$ the parameter associated with the environmental variable $X_i$. Model parameters are estimated by maximum likelihood (McCullagh & Nelder 1989). The response variable Y is linked to the linear predictor through the link function g, such that:

$$g(Y) = \eta \qquad (3.2)$$

$$E(Y) = g^{-1}(\eta) \qquad (3.3)$$

Although the Poisson and negative binomial distributions are both suited to count data (Cameron & Trivedi 1998), the negative binomial distribution was preferred. Using the Poisson distribution assumes that the mean and the variance are equal, i.e. equidispersion (Cameron & Trivedi 1998):

$$E(Y) = \lambda \qquad (3.4)$$

$$V(Y) = \lambda \qquad (3.5)$$

with $\lambda$ the parameter of the Poisson distribution. According to Cameron & Trivedi (1998, p. 77), failure of the equidispersion assumption has the same qualitative consequences as failure of the homoscedasticity assumption in the linear regression model: the estimate of the variance-covariance matrix of the parameters is biased. Using the negative binomial distribution makes it possible to account for overdispersion of the response variable Y. The variance function associated with this distribution is:

$$V(Y) = \lambda + \alpha \lambda^2 \qquad (3.6)$$

with $\lambda = E(Y)$ and $\alpha$ the dispersion parameter, estimated by maximum likelihood (Cameron & Trivedi 1998). In practice, the coefficients of models fitted with a Poisson or a negative binomial distribution are close, and so are their predictions. For the metric "number of small HINTOL individuals", the values predicted with the two distributions are strongly correlated (r = 0.997 on the scale of the response variable). As expected, however, the estimated variance-covariance matrices differ appreciably between the two distributions (Table 6; Cameron & Trivedi 1998). This has important consequences for the estimation of confidence intervals (see chapter 3.2).
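The contrast between the two variance assumptions can be made concrete with a small sketch. It assumes the quadratic negative binomial variance function (mean plus dispersion times squared mean); the function names and the counts are hypothetical.

```python
from statistics import mean, variance


def poisson_variance(mu):
    """Poisson: the variance equals the mean (equidispersion)."""
    return mu


def negbin_variance(mu, alpha):
    """Negative binomial (quadratic form): variance = mu + alpha * mu**2."""
    return mu + alpha * mu ** 2


def dispersion_ratio(counts):
    """Sample variance over sample mean; values well above 1 suggest overdispersion."""
    return variance(counts) / mean(counts)


# Hypothetical catches of individuals at ten sites.
counts = [0, 2, 3, 5, 40, 1, 0, 12, 7, 90]
ratio = dispersion_ratio(counts)
```

A ratio far above 1, as with these counts, is the situation in which Poisson standard errors would be too optimistic; with a dispersion of zero the two variance functions coincide.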

Table 6: Excerpt of the variance-covariance matrix (intercept column) of the model relating the environment to the metric "number of small individuals intolerant to habitat degradation", estimated with either a Poisson or a negative binomial distribution.

                      Poisson        Negative binomial
                      Intercept      Intercept
Intercept              0.001612       0.025572
Slope                  0.001445       0.011266
Slope²                 0.000821       0.012116
Thermal amplitude     -0.000063      -0.000989


The link function associated with the negative binomial distribution is the natural logarithm (Cameron & Trivedi 1998). The relationship between the metric and the environmental variables therefore takes the form:

$$\log(Y) = \beta_0 + \sum_i \beta_i X_i \qquad (3.7)$$

Only sites with little or no disturbance, known as "reference" sites (Stoddard et al. 2006), are used to estimate the model parameters (Figure 14; Pont et al. 2006, Pont et al. 2007, Pont et al. 2009, Hawkins et al. 2010). The values predicted by these models are the values expected in the absence of pressures (Bailey et al. 1998, Pont et al. 2006, Pont et al. 2007). Rather than relating the metrics directly to the environment, an offset variable was included in the models in order to model:
- the proportion of individuals with a given trait and a given size in the assemblage, for the metrics based on all species, for example the proportion of small rheophilic individuals (Figure 13a);
- the proportion of individuals with a given trait and a given size among the individuals of large species, for the metrics based on large species, for example the proportion of small rheophilic individuals among the individuals of large species (Figure 13b).
The metrics based on large species are akin to age-class metrics: small individuals are mostly young, and large individuals are generally older. Accordingly, the logarithm of the total number of individuals was included as an offset (Chambers & Hastie 1993, Faraway 2006) in the models of the metrics based on all species, and the logarithm of the number of individuals of large species was included as an offset in the models of the metrics based on large species. The parameter associated with the offset is fixed at 1, which makes it possible to model a proportion; this is known as a "rate model" (Cameron & Trivedi 1998):

$$\log(Y) = \beta_0 + \sum_i \beta_i X_i + \log(N) \qquad (3.8)$$

$$\log\left(\frac{Y}{N}\right) = \beta_0 + \sum_i \beta_i X_i \qquad (3.9)$$

with N the total number of individuals, or the total number of individuals of large species, depending on the type of metric.
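The equivalence between the offset form and the proportion form can be checked numerically. The sketch below uses hypothetical coefficients and environmental values; it only illustrates the algebra of a log-link model with a log(N) offset whose coefficient is fixed at 1.

```python
import math


def linear_predictor(x, beta0, betas):
    """beta0 + sum_i beta_i * x_i for one site."""
    return beta0 + sum(b * xi for b, xi in zip(betas, x))


def expected_count(x, beta0, betas, n_total):
    """Expected count with log(n_total) entered as an offset (coefficient 1)."""
    return math.exp(linear_predictor(x, beta0, betas) + math.log(n_total))


def expected_proportion(x, beta0, betas):
    """Expected ratio Y / N implied by the same model."""
    return math.exp(linear_predictor(x, beta0, betas))


# Hypothetical site and coefficients.
x_site = [1.0, 2.0]
beta0, betas = -1.0, [0.2, -0.1]
count_50 = expected_count(x_site, beta0, betas, 50)
count_100 = expected_count(x_site, beta0, betas, 100)
proportion = expected_proportion(x_site, beta0, betas)
```

Because the offset coefficient is fixed at 1, doubling the number of individuals caught doubles the expected count while leaving the expected proportion unchanged.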


Figure 14: Procedure for selecting the metrics and computing the scores. Flow chart: datasets (global dataset and calibration sites, CAL); modelling of each metric as a function of the environment; computation of residuals and selection of metrics; scaling to 0-1 (upper limit set using the SID sites, lower limit set using all sites); sensitivity of the scores to the pressure index (classes 1 to 5); redundancy of the scores; final selection.

Of the 96 metrics tested (12 metrics for each trait, Table 5), only four were successfully modelled from the environment (judged on the normality of Pearson residuals, heteroscedasticity of the residuals, influence of leverage points, and a linear y = x relationship between predicted and observed values; Figure 15): (1) the number of small individuals intolerant to low oxygen concentrations (O2INTOL, Figure 15a), (2) the number of small individuals intolerant to habitat degradation (HINTOL, Figure 15b), (3) the number of small rheophilic individuals (RHEO, Figure 15c) and (4) the number of small insectivorous individuals (INSEV, Figure 15d). All four metrics were computed on all species with a 150 mm threshold separating the size classes. No metric computed with another threshold, based on large individuals, or restricted to the individuals of large species was retained, because the corresponding models fitted poorly.


Figure 15: Relationships between predicted values (x-axis, number of individuals predicted) and observed values (y-axis, number of individuals observed) for the reference sites: (a) number of small individuals intolerant to low oxygen concentrations (O2INTOL, N = 214), (b) number of small individuals intolerant to habitat degradation (HINTOL, N = 214), (c) number of small rheophilic individuals (RHEO, N = 212), (d) number of small insectivorous individuals (INSEV, N = 212). The solid line is the first bisector (y = x) and the dotted curve a general trend (loess regression, f = 0.667; Hastie et al. 2009).

The next step is to transform the observed values of the metrics into scores and to keep only the metrics whose scores respond to anthropogenic pressures (Figure 14). The environmental variability of the metrics is accounted for by subtracting the expected value of the metric from the observed value. The subtraction is carried out on the link scale (equations 3.2 and 3.7):

$$\log(y_i + 1) - \log(\hat{y}_i + 1) \qquad (3.10)$$

with $y_i$ the observed value of the metric at site i and $\hat{y}_i$ the value expected in the absence of pressures. Adding one to the observed and predicted values (equation 3.10) handles the cases where either value is zero. The scores are then rescaled to a unitless score between 0 and 1 with a min-max transformation (Legendre & Legendre 1998, Hering et al. 2006, Saporta 2006):

$$\frac{\text{score} - \min}{\max - \min} \qquad (3.11)$$

A value of 0 represents very severe degradation and the maximum value of 1 characterises a site in very good condition. The sensitivity of the metrics was assessed by comparing the scores of class 1 of the pressure index (see chapter 2.2) with the scores of classes 4 and 5 combined (Table 7). Only the metric "number of small insectivorous individuals" does not respond to the pressure gradient (Table 7 and Figure 16d). Conversely, the metrics "number of small O2INTOL individuals", "HINTOL" and "RH" respond significantly to variations in pressure (Table 7 and Figure 16a, b, c).

Table 7: Statistics and p-values of the Wilcoxon rank-sum tests between the scores of class 1 and the scores of classes 4-5 of the pressure index.

Metric     W statistic   p-value
O2INTOL    64326         < 0.001
HINTOL     66911         < 0.001
RH         61557         < 0.001
INSV       43599           0.145
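The two transformations, the log-scale difference between observed and expected values followed by min-max rescaling, can be sketched as below. The function names and the bounds used in the example are hypothetical, and the optional clipping to the 0-1 interval is an assumption about how values outside the bounds might be handled.

```python
import math


def log_residual(observed, expected):
    """Difference between observed and expected values on the log scale.

    Adding 1 to both terms keeps the expression defined when either is zero.
    """
    return math.log(observed + 1) - math.log(expected + 1)


def min_max(value, lower, upper, clip=True):
    """Rescale a value to a unitless score using the bounds lower and upper.

    clip=True (an assumption) forces the result into the interval [0, 1].
    """
    scaled = (value - lower) / (upper - lower)
    if clip:
        scaled = max(0.0, min(1.0, scaled))
    return scaled


# Hypothetical site: 9 small individuals observed where 99 were expected.
residual = log_residual(9, 99)
score = min_max(residual, lower=-3.0, upper=1.0)
```

A site with far fewer individuals than expected gets a strongly negative residual and hence a low score.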

Figure 16: Variation of the metric scores (y-axis, 0 to 1) along the pressure gradient (x-axis, classes 1 to 5 of the pressure index): (a) number of small individuals (< 150 mm) intolerant to low oxygen concentrations (O2INTOL), (b) number of small individuals intolerant to habitat degradation (HINTOL), (c) number of small rheophilic individuals (RHEO), (d) number of small insectivorous individuals (INSEV).

The correlations between the scores of these three metrics are relatively high (Spearman's ρ > 0.8, Table 8) and significant (p-values < 0.001).

Table 8: Correlations between the metric scores (Spearman's ρ).

Metric scores   HINTOL   RH      INSV
O2INTOL         0.966    0.811   0.676
HINTOL          -        0.838   0.675
RH              -        -       0.74
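Spearman's coefficient is the Pearson correlation computed on ranks, with tied values sharing their average rank. A minimal standard-library sketch, in which the function names are hypothetical and the inputs in the usage note are made-up values rather than scores from this study:

```python
import math


def average_ranks(values):
    """Ranks starting at 1; tied values receive the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

For example, `spearman_rho([1, 2, 3, 4, 5], [2, 1, 4, 3, 5])` measures how closely the two orderings agree regardless of the actual values.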


Limits: fish stocking

Fish stocking is widespread in Europe (European Inland Fisheries Advisory Commission 1982, Largiadèr & Scholl 1996, Hansen 2002, Caudron et al. 2009), both to maintain populations and for recreational purposes (Cowx 1994, Rutherford 2002, Cowx & Gerdeaux 2004). Because of their economic interest and heritage status, salmonids are probably the fish most affected by stocking in Europe. They account for more than 59% of the individuals caught in the EFI+ dataset. This input of exogenous individuals can alter both the size structure of the assemblages and the number of individuals sampled at a given site. Unfortunately, the EFI+ project database holds no information on this practice. Stocking may therefore have biased the relationships observed between the environment and the metrics, as well as the responses of the metrics to anthropogenic pressures (Mebane et al. 2003).

Implications of these results for bioindication: geographical extent over which these metrics can be used

Most of the species that contribute to these metrics, the salmonids (Salmo trutta, L., Salmo salar, L.) and the cottids (Cottus gobio), are widely distributed in Europe (Kottelat & Freyhof 2007). The extent of their ranges ensures that these metrics are highly representative at the European scale, regardless of the historical and biogeographical factors that shaped the present distribution of species (Banarescu 1992, Hewitt 1999, 2000, Griffiths 2006, Hoeinghaus et al. 2007). The use of these metrics can be extended to other regions of the world, notably North America. Cold-water faunas are relatively similar between the two regions (Moyle & Herbold 1987). Like European fish assemblages, North American cold-water assemblages are dominated by intolerant species (Table 9) (Halliwell et al. 1999, Zaroban et al. 1999).

Table 9: Traits (O2INTOL, HINTOL, RH, INSEV) of the ten most abundant species in the cold-stream dataset (cell shaded if the species has the trait). Species listed: Salmo trutta, Phoxinus phoxinus, Cottus gobio, Barbatula barbatula, Rutilus rutilus, Salmo salar, Gobio gobio, Oncorhynchus mykiss, Anguilla anguilla, Thymallus thymallus.


Accounting for the size structure of assemblages in stream assessment

Of the 96 metrics tested, three passed the successive selection steps (Hughes et al. 1998, Karr & Chu 2000, Hughes et al. 2004, Hering et al. 2006, Roset et al. 2007, Stoddard et al. 2008) and are candidates for the final computation of the index. Although the use of size-class or age-class metrics has often been recommended (Karr 1991, Roset et al. 2007), notably by the Water Framework Directive (European Union (EC) 2000), relatively few indices include them to date (Oberdorff & Porcher 1994, Langdon 2001, Breine et al. 2004, Hughes et al. 2004). Using these metrics would bring the size-class structure of assemblages into the assessment of stream condition. However, to limit the redundancy of the information provided by these three metrics (Table 8; Karr et al. 1986, Karr & Chu 1999, 2000), it would a priori be necessary to select only one of them. The variability of the scores observed between the classes of the pressure index (Figure 16) suggests that these metrics differ in their sensitivity to the various pressures (Karr & Chu 1999, Angermeier et al. 2000). The pressure index is a linear combination (Tenenhaus & Young 1985) of several pressures, so two sites affected by different pressures may fall in the same class of the index. The differences between scores may reflect between-site variability in anthropogenic alterations (Leonard & Orth 1986). This hypothesis remains to be tested.

3.1.5. Conclusion: implications of these results for the development of the new index

The results of the preceding tests highlighted:
- the functional redundancy within European assemblages, with gradients of tolerance and of reproduction;
- the variation of the functional structure of assemblages along physical and thermal environmental gradients;
- the between-region variability of assemblage structure once the environment is accounted for;
- the relative convergence of the variation in the functional structure of communities along environmental gradients;
- the value of size-class metrics for assessing the condition of cold streams with low species richness.
These results were used to develop the new European fish index.


La conception du nouvel indice EFI+ a été très fortement inspiré d’EFI développé au cours du précédent projet FAME (Figure 17 ; Pont et al. 2006, Pont et al. 2007). A l’instar d’EFI, les métriques testées sont basées sur les traits biologiques et écologiques des espèces (Usseglio-Polatera et al. 2000, Noble et al. 2007). Les métriques ont été calculées sur la base d’un nombre d’espèces, ou d’individus présentant un trait donné, par exemle le nombre d’espèces rhéophiles ou le nombre d’individus rhéophiles. Le nouvel indice reprend la « reference condition approach » (Bailey et al. 1998) utilisée lors du projet FAME et préconisé par la DCE (European Union (EC) 2000). La première étape a consisté à sélectionner, jeu de données de sites peu ou pas impactés dit de référence (Oberdorff et al. 2002, Pont et al. 2006, Stoddard et al. 2006, Hawkins et al. 2010), et un jeu de données de sites perturbés (Figure 17 ; Pont et al. 2006). Les sites de référence servent à calibrer les modèles reliant les métriques aux variables environnementales (Oberdorff et al. 2002, Pont et al. 2006, Pont et al. 2007, Pont et al. 2009, Hawkins et al. 2010). Les valeurs attendues des métriques dans un environnement donné en absence de pressions sont prédites à partir de ces modèles. Les indices utilisant des modèles prédictifs sont des bioindicateurs prédictifs (Pont 2010). Dans d’autres approches, les valeurs attendues des métriques sont directement estimées à partir de la distribution des valeurs observées dans les sites de références (ex. Bates Prins & Smith 2007). Une grande différence dans la modélisation des métriques par rapport au projet FAME, réside dans l’intégration dans chaque modèle d’une variable offset (richesse totale, nombre d’individus capturés, etc.) pour modéliser des proportions plutôt que des métriques brutes (voir chapitre précédent ; Chambers & Hastie 1993, Cameron & Trivedi 1998, Faraway 2006). 
Observed metric values are then compared with the expected values in order to account for the environmental variability of the metrics (equation 3.10). The resulting values are then transformed into a unitless score ranging from 0 to 1. The conversion of residuals into scores proceeds in two steps. The first standardizes the residuals:

\[ S_i = \frac{res_i - \bar{x}_i}{s} \tag{3.12} \]

where $S_i$ is the score, $res_i$ the difference between the observed value and the expected value, $\bar{x}_i$ the mean of the residuals of the reference sites in ecoregion $i$ and $s$ the standard deviation of the residuals of all undisturbed sites (see Appendix 2). The scoring process therefore takes into account the ecoregion to which each site belongs. The second step rescales the standardized residuals with the min-max transformation (equation 3.11 and Figure 14; Legendre & Legendre 1998, Hering et al. 2006, Saporta 2006). The lower and upper bounds were computed so that the median of the minimally or non-disturbed sites equals 0.8. The transformation used in FAME gave an expected score of 0.5 for a reference site. Consequently, with the new index, the comparison between the score of a test site and the score expected in the absence of pressure spans a much wider range of values. The sensitivity of each metric to pressures was assessed by comparing the mean score of the minimally or non-disturbed sites with the mean score of the heavily disturbed sites (Figure 16). Only the least redundant metrics, those with the lowest correlations, were included in the final index. The index value is the mean of the scores obtained for the selected metrics (see Appendix 2).
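The two scoring steps described above (standardization, equation 3.12, then min-max rescaling) can be sketched as follows. This is only an illustrative sketch: the function names, the reference residuals and the bounds `lower` and `upper` are hypothetical, and truncating out-of-range values to [0, 1] is an assumption. In the text the bounds are calibrated so that the median reference score is 0.8, which is not reproduced here.

```python
from statistics import mean, stdev

def standardize(residual, ecoregion, reference_residuals):
    """Equation 3.12: centre on the ecoregion's reference mean, scale by the pooled SD."""
    pooled = [r for values in reference_residuals.values() for r in values]
    return (residual - mean(reference_residuals[ecoregion])) / stdev(pooled)

def min_max(value, lower, upper):
    """Rescale to [0, 1]; truncating out-of-range values is an assumption."""
    return min(1.0, max(0.0, (value - lower) / (upper - lower)))

# Hypothetical reference residuals for two ecoregions, "A" and "B".
reference_residuals = {"A": [-0.2, 0.1, 0.4], "B": [0.0, 0.3, -0.3]}

standardized = standardize(-0.5, "A", reference_residuals)
score = min_max(standardized, lower=-4.0, upper=1.0)  # placeholder bounds
print(round(standardized, 3), round(score, 3))
```

Centring uses only the reference sites of the site's own ecoregion, whereas scaling uses the standard deviation pooled over all reference sites, as in the definition of $\bar{x}_i$ and $s$.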

[Figure 17: flow chart. Database (environmental variables, fish community, pressure assessment) > candidate metrics > modelling of metrics on reference sites (metric ~ f(environment); expected values in the absence of pressures) > accounting for environmental variability at disturbed sites (observed - expected) > transformation (rescaling to scores in [0; 1]) > metric sensitivity (pressure-impact relationships) > metric selection (redundancy) > index construction and validation (aggregation of scores).]

Figure 17: Development scheme of a predictive multimetric index based on reference conditions, adapted from Pont (2010).

The results of the study on trait redundancy within European fish assemblages (chapter 3.1.1) led us to consider two types of community: those dominated by intolerant species and those dominated by tolerant species. An index was therefore developed for each community type. In the end, two indices were developed during the EFI+ project, a major difference from the FAME project and the EFI index. The aim of this distinction is to select sets of metrics that are specific to, and representative of, each assemblage type (chapter 3.1.4). The resulting indices should allow a more accurate assessment of site condition. The distinction between the two community types is based on the relative abundance of species that are intolerant of low oxygen concentrations and of habitat degradation, stenothermal, lithophilic or speleophilic spawners, and that spawn in lotic habitats. Assemblages termed "intolerant" are dominated by these species, whereas these species are in the minority in assemblages termed "tolerant". From a faunistic point of view, "intolerant" assemblages are mostly dominated by salmonids and their associated species (Table 10), whereas "tolerant" assemblages are dominated by cyprinids. Anthropogenic pressures can alter dominance among species and the faunistic composition of assemblages (McCormick et al. 2001, Kruk & Penczak 2003, Quinn & Kwak 2003, Wang et al. 2003, Quist et al. 2005, Haxton & Findlay 2008). The observed proportion of intolerant individuals in an assemblage may therefore not correspond to the assemblage type expected in the absence of disturbance. One solution is to classify sites according to the abiotic factors that control the species composition of assemblages.
The typology of European fish assemblages of Melcher et al. (2007) distinguishes 15 assemblage types on the basis of seven environmental variables. These 15 broad types were grouped to match the "intolerant-tolerant" classification.

Table 10: List of the 19 species characteristic of intolerant assemblages.

Alburnoides bipunctatus
Cobitis calderoni
Coregonus lavaretus
Cottus gobio
Cottus poecilopus
Eudontomyzon mariae
Hucho hucho
Lampetra planeri
Phoxinus phoxinus
Salmo salar
Salmo trutta fario
Salmo trutta lacustris
Salmo trutta macrostigma
Salmo trutta trutta
Salmo trutta marmoratus
Salvelinus fontinalis
Salvelinus namaycush
Salvelinus umbla
Thymallus thymallus

The major consequence of distinguishing rivers by assemblage type is that the two indices rely on different metrics. The index for "intolerant" assemblages is based on the number of individuals intolerant of low oxygen concentrations (Ni.O2INTOL) and on the number of individuals intolerant of habitat degradation and smaller than 150 mm (Ni.HINTOL.150):

\[ Index_{Intol} = \frac{\mathrm{Ni.O2INTOL} + \mathrm{Ni.HINTOL.150}}{2} \tag{3.13} \]

The index for "tolerant" assemblages is based on the number of species that spawn in running-water habitats (Ric.RHPAR) and on the number of individuals with lithophilic reproduction (Ni.LITH):

\[ Index_{Tol} = \frac{\mathrm{Ric.RHPAR} + \mathrm{Ni.LITH}}{2} \tag{3.14} \]

The standardization of the model residuals (3.12) also takes this distinction into account:

\[ S_{ij} = \frac{res_{ij} - \bar{x}_{ij}}{s_j} \tag{3.15} \]

where $S_{ij}$ is the score, $res_{ij}$ the difference between the observed value and the expected value, $\bar{x}_{ij}$ the mean of the residuals of the reference sites in ecoregion $i$ and for river type $j$ (tolerant or intolerant) and $s_j$ the standard deviation of the residuals of all undisturbed sites (see Appendix 2). The results of the development of metrics based on size classes (chapter 3.1.4) made it possible to include the metric "number of small individuals intolerant of habitat degradation" (Ni.HINTOL.150) in the index for assemblages dominated by intolerant species (3.13). This metric was preferred to the other two because of its wider spatial distribution and its better representativeness in some regions of Europe. All these results have been implemented in software available at: http://efi-plus.boku.ac.at/software/.

3.2. Uncertainties associated with the use of predictive multimetric bioindicators

Estimating the uncertainty around index values is increasingly requested by the international scientific community and by managers, the aim being to assess the reliability of the diagnosis provided by multimetric indices (MMIs). The Water Framework Directive (WFD) requires water bodies to be assigned to one of five classes according to their status, from 1 for a water body in very good status to 5 for a very heavily degraded water body. Class 2 corresponds to "good ecological status". Each water body is assigned to a class by comparing the index score with the values that define the class boundaries (Table 11).

Table 11: Class boundaries for the "intolerant" index (Bady et al. 2009a,b).

Class   Boundaries
1       [0.911 ; 1]
2       [0.755 ; 0.911]
3       [0.503 ; 0.755[
4       [0.252 ; 0.503]
5       [0 ; 0.252]
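As a minimal sketch of how an index value maps to a class using the boundaries of Table 11: the function name is hypothetical, and treating each lower boundary as inclusive is an assumption, since the table does not settle which class a value lying exactly on a boundary belongs to.

```python
# Lower boundary of each class for the "intolerant" index (Table 11).
# Assumption: a value equal to a boundary is assigned to the better class.
CLASS_LOWER_BOUNDS = [(0.911, 1), (0.755, 2), (0.503, 3), (0.252, 4), (0.0, 5)]

def ecological_class(index_value):
    """Return the status class (1 = very good ... 5 = very heavily degraded)."""
    if not 0.0 <= index_value <= 1.0:
        raise ValueError("the index is defined on [0, 1]")
    for lower_bound, status_class in CLASS_LOWER_BOUNDS:
        if index_value >= lower_bound:
            return status_class

# Two sites with nearly identical index values fall on either side
# of the boundary between classes 2 and 3.
print(ecological_class(0.760), ecological_class(0.750))
```

A difference of 0.01 in the index is enough here to move a site from class 2 ("good ecological status") to class 3, which is why an interval around the index is useful.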

It is entirely possible for two sites with very similar index values to be assigned to two different ecological status classes. Such a small numerical difference can have major consequences for managers, notably economic ones. In one case the site will be considered degraded and will have to be restored, whereas in the other its status will comply with WFD standards. Attaching a confidence interval to the index would give an idea of how reliable the classification of a site is. At present, apart from the work of Clarke (2000) on estimating the uncertainty associated with RIVPACS-type indices (Wright et al. 2000), there appear to be no other attempts to compute uncertainty in the literature. Accounting for every potential source of error likely to influence the index value, such as sampling or the uncertainty in the measurement of environmental variables, seems unrealistic. Nevertheless, the use of statistical models to predict expected values (Figure 17) should make it possible to estimate the uncertainty around these values. It therefore seems possible to assess the uncertainty around the score by propagating the uncertainty around the expected value (Figure 18; Bevington & Robinson 2003).

[Figure 18: flow chart. Statistical model > CI of the predicted value [ŷmin ; ŷmax], combined with the observed value (y) > CI of the residual [y - ŷmin ; y - ŷmax] > standardization and transformation > CI of the score [score_inf ; score_sup].]

Figure 18: Principle of error propagation.

3.2.1. Choice of the confidence interval

With each value predicted by a statistical model, one can associate either a "confidence interval" or a "prediction interval" (Hahn & Meeker 1991). The confidence interval reflects how well a characteristic of a population is known from a random sample (Hahn & Meeker 1991). This interval should contain, with 100(1 - α) % confidence, the true value of the population parameter under study, for example the mean (Hahn & Meeker 1991, Scherrer 2009). Each value predicted by a linear model or a GLM corresponds to the expectation of the response variable Y given the explanatory variables X (McCullagh & Nelder 1989, Saporta 2006):

\[ \hat{y}_i = E(y_i \mid X_i) \tag{3.16} \]

Consequently, the confidence interval associated with a prediction is the interval that should contain the mean of the metric for a given environment. Conversely, the prediction interval for a single future observation is an interval that, with a confidence level of 100(1 - α) %, will contain the next observation drawn at random from a population (Hahn & Meeker 1991, p. 31). This interval estimates the uncertainty associated with predicting a new observation given what has already been observed. The prediction interval associated with a predicted value estimates the range of values expected for a metric, for a new site or a new sampling occasion, in a given environment. The prediction interval is therefore wider than the confidence interval (Figure 19): observations are more variable than means.

3.2.2. Illustration: the case of linear regression

The confidence interval around a value predicted by a linear regression is:

\[ CI(\hat{y}_i) = \hat{y}_i \pm t_{\alpha/2,\,n-2}\; se(\hat{y}_i) \tag{3.17} \]

where $1 - \alpha$ is the confidence level, $\hat{y}_i$ the expected value of the metric in a given environment in the absence of pressure and $se(\hat{y}_i)$ the standard error of the prediction:

\[ se(\hat{y}_i) = \sqrt{\hat{\sigma}^2\, X_i'(X'X)^{-1}X_i} \tag{3.18} \]

where $\hat{\sigma}^2$ is the dispersion of the model (the residual variance), $X$ the matrix of explanatory variables and $X_i$ a vector of this matrix. $X$ is the matrix of environmental variables and $X_i$ the environment observed at site $i$. This interval reflects the variability associated with the estimation of the model coefficients. Indeed, $\hat{\sigma}^2 (X'X)^{-1}$ in equation 3.18 is the estimate of the variance-covariance matrix of the model coefficients (Kutner et al. 2005). These coefficients are themselves random variables (Greene 2003, Kutner et al. 2005, Saporta 2006). Their estimates depend on the sample of the population used to calibrate the model, and hence on the reference sites selected to estimate the model coefficients (Figure 14 and Figure 17). The same coefficients estimated from two different samples will take different values. Consequently, the means estimated for a given environment by the two models will also differ. The confidence interval estimates this variability around the mean (Figure 19).

The prediction interval is computed with:

\[ PI(\hat{y}_j) = \hat{y}_j \pm t_{\alpha/2,\,n-2}\; se(\hat{y}_j) \tag{3.19} \]

where $\hat{y}_j$ is the prediction and $se(\hat{y}_j)$ the standard error of the prediction, computed as:

\[ se(\hat{y}_j) = \sqrt{\hat{\sigma}^2 \left(1 + X_j'(X'X)^{-1}X_j\right)} \tag{3.20} \]

The "+1" term in equation 3.20 accounts for the fact that a prediction is a realization drawn from a distribution (here a normal distribution) and, in the vast majority of cases, departs from the mean (Hahn & Meeker 1991). Both intervals also account for the distance between the environment at a given site ($X_i$) and the environment of the sites used to estimate the model coefficients. The further the environment of a site is from the mean environment of the reference sites, the wider the interval around the metric value predicted by the model (Figure 19).

Figure 19: Theoretical relationship between a metric (y-axis) and an environmental variable (x-axis). The black line shows the values predicted by a linear regression, the dark grey band the confidence interval around the model (around the mean) and the light grey band the prediction interval (around an observation).

Outlook: which interval should be chosen?

The choice of interval depends on the main objective: either to describe the population or process from which the sample was drawn, or to predict the results of a new sample from the same population. The confidence interval addresses the first objective, the prediction interval the second (Hahn & Meeker 1991). Each site for which the index is computed is a new observation, for which one seeks the bounds of the interval within which it should fall. The prediction interval is therefore the more appropriate for measuring the uncertainty around the predicted value and, consequently, around the score (Figure 18).
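The difference between the two intervals (equations 3.17 to 3.20) can be illustrated with a simple linear regression on one explanatory variable, for which the quadratic form $X_i'(X'X)^{-1}X_i$ reduces to $1/n + (x - \bar{x})^2 / S_{xx}$. The sketch below is illustrative only: the data are invented, the function name is hypothetical, and a normal quantile replaces the Student quantile $t_{\alpha/2,\,n-2}$ to stay within the standard library (a noticeable approximation for small n).

```python
from statistics import NormalDist, mean

def regression_intervals(x, y, x_new, alpha=0.05):
    """Confidence and prediction intervals at x_new for y = a + b*x."""
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    a = y_bar - b * x_bar
    # Residual variance with n - 2 degrees of freedom.
    sigma2 = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y)) / (n - 2)
    # X_i'(X'X)^-1 X_i for a model with an intercept and one covariate.
    leverage = 1 / n + (x_new - x_bar) ** 2 / sxx
    q = NormalDist().inv_cdf(1 - alpha / 2)  # stand-in for the t quantile
    y_hat = a + b * x_new
    se_mean = (sigma2 * leverage) ** 0.5        # equation 3.18
    se_pred = (sigma2 * (1 + leverage)) ** 0.5  # equation 3.20
    return ((y_hat - q * se_mean, y_hat + q * se_mean),
            (y_hat - q * se_pred, y_hat + q * se_pred))

x = [0, 1, 2, 3, 4, 5]
y = [0.1, 1.9, 4.2, 5.8, 8.1, 9.9]
ci_centre, pi_centre = regression_intervals(x, y, 2.5)   # mean environment
ci_far, pi_far = regression_intervals(x, y, 10.0)        # remote environment
```

At any point the prediction interval contains the confidence interval, and both widen as `x_new` moves away from the mean of the calibration data.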

3.2.3. Estimating the uncertainty around the score

The uncertainty around the score is estimated by propagating the uncertainty around the values predicted by the models (Figure 18). The first step is to estimate the prediction interval around the expected value. For a normally distributed variable, the estimation of the prediction interval (equations 3.19 and 3.20) rests on solid statistical theory (Neter et al. 1983, Kutner et al. 2005). However, none of the metrics used to compute the indices is normally distributed. They follow Poisson distributions for richness-based metrics, or negative binomial distributions for metrics based on the number of individuals. Two alternatives are available to estimate the prediction interval: approximation or simulation.

3.2.3.1. Estimating the prediction interval by approximation

Several authors have proposed formulas to estimate prediction intervals for distributions other than the normal (Cameron & Trivedi 1998, Vidoni 2003, Wood 2005). Cameron and Trivedi (1998) proposed the following equation for a negative binomial distribution:

\[ PI(\hat{y}_i) = \hat{y}_i \pm z_{\alpha/2} \sqrt{\sigma^2(\hat{y}_i, \hat{\alpha}) + \hat{y}_i^{2}\, X_i'(X'WX)^{-1}X_i} \tag{3.21} \]

where $\hat{y}_i$ is the value predicted by the model (the expected value of the metric), $\sigma^2(\hat{y}_i, \hat{\alpha})$ the variance function of the mean (equation 3.11), $\hat{\alpha}$ the estimated dispersion parameter of the negative binomial distribution (distinct from the level $\alpha$ in $z_{\alpha/2}$) and $W$ the following diagonal matrix:

\[ W = \begin{pmatrix} \dfrac{\hat{y}_1}{1 + \hat{\alpha}\,\hat{y}_1} & & 0 \\ & \ddots & \\ 0 & & \dfrac{\hat{y}_n}{1 + \hat{\alpha}\,\hat{y}_n} \end{pmatrix} \tag{3.22} \]

As in the linear model, the matrix $(X'WX)^{-1}$ is an estimate of the variance-covariance matrix of the coefficients of a GLM. This interval applies to the prediction on the scale of the variable, for example the number of species or of individuals. The residuals of each metric used in the two indices are computed on the link scale (equation 3.7). The link function must therefore be applied to the bounds of this interval to obtain the interval around the linear predictor $\eta$.

The second step is to apply to the bounds of this interval all the transformations that lead from the predicted metric value to the score: computation of the residual (equation 3.7), standardization (equation 3.15) and min-max transformation (equation 3.8). The resulting interval represents the uncertainty around the score.

The prediction interval estimated with equation 3.21 is symmetric, whereas the negative binomial distribution is, a priori, asymmetric. This interval may therefore be biased.
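The symmetry issue can be made concrete with a small numerical sketch of an interval of the form of equation 3.21. Several elements are assumptions rather than the thesis' own implementation: the variance function is taken to be the usual NB2 form $\hat{y} + \hat{\alpha}\hat{y}^2$, the coefficient uncertainty enters through a delta-method term $\hat{y}^2 X_i'(X'WX)^{-1}X_i$, the quadratic form is supplied directly as a number, and all names and values are hypothetical.

```python
from statistics import NormalDist

def nb_prediction_interval(y_hat, dispersion, quad_form, level=0.95):
    """Symmetric approximate prediction interval for a negative binomial count.

    y_hat      -- predicted mean on the scale of the count
    dispersion -- negative binomial dispersion (NB2 form assumed)
    quad_form  -- value of X_i'(X'WX)^-1 X_i for the site (given, not computed)
    """
    variance = y_hat + dispersion * y_hat ** 2         # assumed variance function
    se = (variance + y_hat ** 2 * quad_form) ** 0.5    # delta-method term assumed
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return y_hat - z * se, y_hat + z * se

lower, upper = nb_prediction_interval(y_hat=3.0, dispersion=0.5, quad_form=0.01)
print(lower < 0, round(upper, 2))
```

With a small predicted count, the lower bound of the symmetric interval is negative, which is impossible for a count and must be handled before the link function is applied. This illustrates why such an interval may be biased for an asymmetric distribution.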

3.2.3.2. Estimating the prediction interval by simulation

The method detailed in this section is the one proposed during the EFI+ project (Bady et al. 2009a,b). It assumes that predicted values on the link scale follow a normal distribution with parameters $\mu = \hat{\eta}$ and $\sigma^2 = \hat{\sigma}^2$. For each site $x$, the standard deviation of this normal distribution is approximated by the one computed for a normal distribution (equation 3.20):

\[ \hat{\sigma}(\hat{\eta}_x) = \sqrt{\hat{\phi}^{2} \left(1 + X_x^{t}(X^{t}X)^{-1}X_x\right)} \tag{3.23} \]

The dispersion parameter $\hat{\phi}^2$ is estimated as the Pearson statistic (Cameron & Trivedi 1998, Agresti 2002, Faraway 2006) divided by the residual degrees of freedom (McCullagh & Nelder 1989):

\[ \hat{\phi}^2 = \frac{1}{n - p} \sum_{i=1}^{n} \frac{(y_i - \hat{y}_i)^2}{\hat{\sigma}_i^2} \tag{3.24} \]

where $n$ is the number of reference sites used to calibrate the model, $p$ the number of model parameters, $y_i$ the observed value at reference site $i$, $\hat{y}_i$ the expected value and $\hat{\sigma}_i^2$ the conditional variance of $y_i$ (equation 3.11; Cameron & Trivedi 1998). Once all parameters have been estimated, 99 values are drawn at random from a normal distribution $N(\hat{\eta}_x, \hat{\sigma}^2(\hat{\eta}_x))$. A vector of 99 residuals is then obtained by subtracting each generated value from the observed value. These residuals are then transformed into scores (Figure 17). The bounds of the interval around the score are estimated from quantiles of the distribution of the 99 generated scores.
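The simulation procedure described above can be sketched in a few lines. This is a hedged illustration, not the EFI+ software: `to_score` is a hypothetical stand-in for the standardization and min-max steps, the numerical values are invented, and the choice of the 2.5 % and 97.5 % quantiles is an assumption (the text only states that quantiles of the 99 scores are used).

```python
import random
from statistics import quantiles

def simulated_score_interval(observed, eta_hat, sd_eta, to_score, n_sim=99, seed=1):
    """Interval around a metric score obtained by simulating expected values.

    observed -- observed metric value on the link scale
    eta_hat  -- predicted (expected) value on the link scale
    sd_eta   -- standard deviation attached to the prediction
    to_score -- function turning a residual into a score in [0, 1]
    """
    rng = random.Random(seed)
    # Residual = observed value minus each simulated expected value.
    scores = [to_score(observed - rng.gauss(eta_hat, sd_eta)) for _ in range(n_sim)]
    cuts = quantiles(scores, n=40)  # cut points every 2.5 %
    return cuts[0], cuts[-1]        # 2.5 % and 97.5 % quantiles

def to_score(residual):
    """Hypothetical residual-to-score transformation (placeholder bounds)."""
    return min(1.0, max(0.0, (residual + 4.0) / 5.0))

low, high = simulated_score_interval(observed=1.0, eta_hat=1.5, sd_eta=0.5,
                                     to_score=to_score)
```

Because the interval is read from the empirical distribution of the scores, it need not be symmetric around the score once the transformation truncates values at 0 or 1.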

3.2.4. Estimating the uncertainty around the index

Whatever the river type, the index is computed from the sum of the scores of two metrics, divided by two (equations 3.13 and 3.14). Simply adding the bounds of the intervals of each metric does not appear to be a sound approximation of the uncertainty around the index: it would overestimate the interval around the index.

Let $Y_3$ be a random variable defined as the sum of two random variables $Y_1$ and $Y_2$. The variance of the sum of two random variables equals the sum of their variances plus twice their covariance. The variance of $Y_3$ is therefore:

\[ \sigma^2_{Y_3} = \sigma^2_{Y_1} + \sigma^2_{Y_2} + 2\,\sigma_{Y_1 Y_2} \tag{3.25} \]

The variance of $Y_3$ can also be expressed as a function of the correlation between $Y_1$ and $Y_2$. Equation 3.25 becomes:

\[ \sigma^2_{Y_3} = \sigma^2_{Y_1} + \sigma^2_{Y_2} + 2\, r_{Y_1 Y_2}\, \sigma_{Y_1} \sigma_{Y_2} \tag{3.26} \]

According to equation 3.26, the standard deviation of $Y_3$ can equal the sum of the standard deviations of $Y_1$ and $Y_2$ only when the correlation between $Y_1$ and $Y_2$ equals 1. Because metrics are selected so as to limit their redundancy, such a correlation is never observed. Consequently, summing the bounds of the confidence intervals around the means of $Y_1$ and $Y_2$ would overestimate the width of the confidence interval around the mean of $Y_3$. The same would happen with the prediction interval. Equation 3.26 nevertheless shows how important it is to account for the correlation between metrics when computing the uncertainty around the index.
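A short numerical check of equation 3.26 makes the argument explicit. The function name and the standard deviations used are arbitrary illustrative values.

```python
from math import sqrt

def sd_of_sum(sd1, sd2, r):
    """Standard deviation of Y1 + Y2 given their SDs and correlation r (eq. 3.26)."""
    return sqrt(sd1 ** 2 + sd2 ** 2 + 2 * r * sd1 * sd2)

for r in (0.0, 0.5, 1.0):
    print(r, round(sd_of_sum(2.0, 3.0, r), 3))
```

Only with r = 1 does the standard deviation of the sum reach the sum of the standard deviations (2 + 3 = 5); for any smaller correlation it is lower, so adding interval bounds overstates the uncertainty.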

3.2.4.1. Illustration by simulation of the effect of correlation between metrics

Consider a population of 1,000 reference sites, with two metrics Y1 and Y2 and one environmental variable Z, all three normally distributed with parameters:

        μ          σ
Y1     -2.1212    4.3234
Y2     -0.9568    4.3609
Z       2.0000    1.5000

and with correlations:

        Y1         Y2
Y2      0.6591
Z       0.6500    0.5500

From this population of 1,000 sites, 200 sites are selected at random. These 200 sites are used to estimate the parameters of the linear regression between each metric and the environmental variable Z, for example:

\[ Y_1 = a_1 + b_1 Z \tag{3.27} \]

The differences between the values observed at the 1,000 sites of the population and the expected values are then computed, giving two vectors of 1,000 residuals. This operation is repeated 1,000 times, which yields, for each site, 1,000 residual values for each metric. For example, for a test site with the following metric and environment values:

Y1         Y2         Z
-5.1090    -4.1968    1.0921

the following distributions are obtained:

                  Metric residuals
Simulation no.    Y1         Y2
1                 -1.6484    -1.9626
2                 -1.3286    -1.3967
3                 -1.3290    -1.3247
4                 -1.3414    -1.9118
5                 -1.1860    -1.3958
6                 -1.5652    -1.5761

A thousand estimates of the sum of the residuals are thus computed for each site.

On average over the 1,000 simulations, the correlation between the residual distributions of the two metrics is 0.4755, which is close to the partial correlation between Y1 and Y2, $r_{Y_1 Y_2 | Z} = 0.4752$. For the test site, the residuals appear to be normally distributed (Shapiro tests, P > 0.05, Figure 20a,b). Figure 20c clearly shows that the residual distributions of the two metrics at the test site are strongly correlated, $r_{test} = 0.5424$ (P < 0.001). The sum of the residuals of metrics Y1 and Y2 is more dispersed than it would be without correlation between these two distributions (Figure 20d). The parameter $\mu$ of the two normal distributions shown by the solid and dashed curves was estimated as the sum of the means of the two residual distributions, since:

\[ \mu_{Y_1 + Y_2} = \mu_{Y_1} + \mu_{Y_2} \tag{3.28} \]

The variance of the sum of the two normal distributions was estimated with equation 3.26, with $r_{Y_1 Y_2} = r_{test}$ for the solid curve and $r_{Y_1 Y_2} = 0$ for the dashed curve. The standard deviation of each distribution was estimated from the following equation:

\[ \hat{s}_Y^2 = X_{test}'\; \overline{Cov}\; X_{test} \tag{3.29} \]

where $X_{test}$ is the environment at the test site and $\overline{Cov}$ the mean of the variance-covariance matrices of the models computed at each simulation.
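The partial correlation quoted above can be recomputed from the three pairwise correlations given for the simulated population, using the standard first-order partial correlation formula. The function name is illustrative.

```python
from math import sqrt

def partial_correlation(r12, r1z, r2z):
    """Correlation between Y1 and Y2 after removing the linear effect of Z."""
    return (r12 - r1z * r2z) / sqrt((1 - r1z ** 2) * (1 - r2z ** 2))

r_partial = partial_correlation(r12=0.6591, r1z=0.6500, r2z=0.5500)
print(round(r_partial, 4))  # 0.4752
```

The residuals of the two regressions on Z remain correlated because Z explains only part of the association between Y1 and Y2, which is why the average correlation between residuals (0.4755) is close to this value.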

Figure 20: Results of the 1,000 simulations for the test site: (a) residuals of metric Y1, (b) residuals of metric Y2, (c) relationship between the residuals of the two metrics (black square located at the barycentre of the two distributions), and (d) distribution of the sum of the residuals. The dashed curve shows the theoretical distribution of the sum of the residuals in the absence of correlation and the solid curve the theoretical distribution of the sum of the residuals with a correlation of 0.5424 (correlation observed between the two distributions). The two segments show the estimated bounds of the confidence interval with α = 0.05.

These simulations do not truly represent the uncertainty around the score: they are based on the confidence interval rather than the prediction interval, and the residuals are not transformed. Nevertheless, the results show how the correlation between two metrics affects the sum of their residuals, particularly its variability. The sum of two positively correlated random variables is more dispersed than the sum of two independent random variables. This demonstrates the need to account for the correlation between metric scores when estimating the uncertainty around the index.

As for the uncertainty around metric scores, two approaches can be considered: approximation or simulation.

3.2.4.2. Estimating the uncertainty around the index by approximation

The aim is to estimate the expectation and the standard deviation associated with each score, so as to estimate the expectation and the variance of the sum of the scores via equations 3.28 and 3.25. Once these parameters are known, the uncertainty around the summed score can be estimated as follows:

\[ CI(score) = score \pm z_{\alpha/2} \cdot se(score) \tag{3.30} \]

The uncertainty around the index is then obtained by dividing the bounds of this interval by two (equations 3.13 and 3.14). The main difficulty of this approach lies in estimating the standard deviation associated with the score of a metric. Indeed, only an approximation of the standard deviation associated with a prediction for a new observation is available (see chapter 3.2.3.1).

3.2.4.3. Estimating the uncertainty around the index by simulation

This approach extends the one presented in chapter 3.2.3.2 and was proposed within the EFI+ project (Bady et al. 2009a,b). For each metric, 99 expected values on the link scale are drawn at random from a normal distribution $N(\hat{\eta}_x, \hat{\sigma}^2(\hat{\eta}_x))$. These expected values are then transformed into scores and aggregated to obtain 99 simulated index values. The interval around the index for a given site is estimated from the quantiles of this vector of 99 values (Figure 21).

Figure 21: Illustration of the estimated uncertainty around the index (dotted line) for tolerant assemblages, with an 80% interval (a) and a 95% interval (b). The lower and upper bounds (black curves) were obtained by simulation. Both axes show the index, from 0.0 to 1.0.

At present, this method does not account for the correlation between metrics: the vectors of random values are generated independently for each metric. One possible improvement would be to draw the vectors of expected values for all metrics jointly from a multivariate normal distribution (Ripley 1987). Such an algorithm would make it possible to account for the correlation between metrics.
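For two metrics, drawing from a bivariate normal distribution requires only two independent standard normal draws, combined through the Cholesky factor of the correlation matrix. The sketch below illustrates that idea; the function name and parameter values are hypothetical, and this is not the algorithm implemented in the EFI+ software.

```python
import random
from math import sqrt

def correlated_normal_pairs(mu1, sd1, mu2, sd2, r, n, seed=1):
    """Draw n pairs from a bivariate normal distribution with correlation r."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        # Cholesky factor of [[1, r], [r, 1]] is [[1, 0], [r, sqrt(1 - r^2)]].
        first = mu1 + sd1 * z1
        second = mu2 + sd2 * (r * z1 + sqrt(1.0 - r * r) * z2)
        pairs.append((first, second))
    return pairs

# 99 joint draws of the expected values of two metrics at one site.
draws = correlated_normal_pairs(mu1=1.5, sd1=0.5, mu2=0.8, sd2=0.4, r=0.5, n=99)
```

Each pair would then be converted to two scores and averaged, so that the 99 simulated index values carry the correlation between the metrics.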

3.2.4.4. Outlook

The two approaches presented in this chapter to estimate the uncertainty around metric scores and around the index value are only preliminary. Further development will be needed before these methods can be used routinely. In particular, the results of these approaches should be compared with intervals obtained by simulation under known conditions of variance, mean, coefficient values, etc.

Each method has advantages and drawbacks. The approximation method should in principle require much less computing time, an advantage that grows with the number of sites for which a user wishes to compute the index. The main advantage of the simulation approach is its simplicity: it can be understood by a fairly wide audience and requires little prior statistical knowledge. On the other hand, the computing time needed to estimate the uncertainties increases with the number of simulations and of sites. Because distributions other than the normal are used, and because of the successive steps that lead to the scores and the index, both methods yield only approximations of the two uncertainties.

These results also showed how important it is to account for the correlation between metrics when computing the uncertainty of the index. Multimetric indices very often include between 8 and 12 metrics (Karr 1981, Fausch et al. 1984, Simon & Lyons 1995, Lyons et al. 1996, Hughes et al. 1998, Pont et al. 2006, 2007, Stoddard et al. 2008). Metrics are selected so as to limit their redundancy. Typically, only metrics whose correlation is below an arbitrary threshold are retained (e.g. Hughes et al. 1998, Hering et al. 2006, Pont et al. 2006, 2007, 2009, Stoddard et al. 2008). This threshold generally varies between 0.7 and 0.8 depending on the authors. The higher the correlations between metric scores, the greater the uncertainty around the index. The variance of a sum of random variables is:

\[ Var(Y) = \sum_{i=1}^{p} Var(X_i) + 2 \sum_{i<j} Cov(X_i, X_j) \tag{3.31} \]

Replacing the covariance by the correlation in equation 3.31 gives:

\[ Var(Y) = \sum_{i=1}^{p} Var(X_i) + 2 \sum_{i<j} Cor(X_i, X_j) \sqrt{Var(X_i)\, Var(X_j)} \tag{3.32} \]

The more strongly the metric scores are correlated, the larger the variance of the sum of the scores (Figure 22).

Figure 22: Sum of 10 random variables following normal distributions with parameters μ = 0 and σ² = 4, for different levels of correlation (r) between variables: 0, 0.25, 0.5 and 0.75. The plot shows density (y-axis) against the sum of the 10 random variables (x-axis, from -60 to 60).
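For the setting of Figure 22 (10 variables, each with variance 4, and the same correlation r between every pair), equation 3.32 reduces to $p\sigma^2 + p(p-1)\,r\,\sigma^2$. The short sketch below evaluates this expression for the four correlation levels of the figure; the function name is illustrative.

```python
def variance_of_sum(p, variance, r):
    """Variance of the sum of p variables with equal variance and pairwise correlation r."""
    # p variance terms plus p(p - 1)/2 pairs, each contributing 2 * r * variance.
    return p * variance + p * (p - 1) * r * variance

for r in (0.0, 0.25, 0.5, 0.75):
    print(r, variance_of_sum(p=10, variance=4.0, r=r))
# 0.0 40.0
# 0.25 130.0
# 0.5 220.0
# 0.75 310.0
```

Going from independent variables to a pairwise correlation of 0.75 multiplies the variance of the sum by almost eight, which is the flattening of the density curves that Figure 22 describes.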

Using redundant metrics in the index increases the risk of misclassification (Bowman & Somers 2005). An unaltered test site with a low score for a given metric will probably also score low on other metrics because of their correlations. Such a site will probably be considered impacted. The diagnosis is then partly an artefact of the partial redundancy between metrics. This artefact runs counter to the theory of multimetric indices (Karr 1991, Karr & Chu 1999, 2000). Detection of the impact of multiple pressures by several metrics is desirable when the metrics used are genuinely non-redundant (Karr 1991). In that case, several metrics with low scores show that several aspects of community functioning are altered by anthropogenic pressures. A low index value is then consistent and gives accurate and reliable information on the condition of a site. With an index that includes several redundant metrics, the effect of pressures on the index value cannot be separated from the effect of the correlation between metrics. To reduce the effect of correlation between metrics, one can be stricter and lower the maximum correlation tolerated when selecting metrics. Limiting the number of metrics included in the index is also an effective way to reduce this effect (see Figure 20d and Figure 22). This is one of the reasons why only two metrics were retained in the final computation of EFI+.


3.3. Potential implications of climate change

Using the “reference condition approach” to develop a multimetric index assumes that systems are relatively stable and predictable. Whatever the statistical method used (Bowman & Somers 2005), the observed metric values are compared with values from reference sites. The comparison may be direct, for example with the mean of the reference sites closest to the test site, or indirect, through statistical models. For given environmental conditions and in the absence of anthropogenic pressures, a single value is expected for each metric. A slight change in the environment should produce a slight change in assemblage structure, which is expected to follow the relationships estimated by the models. Otherwise, the way the models account for the environmental variability of the metrics will be biased and the assessment of site condition distorted. Temperature is a major environmental factor acting on living organisms at several levels of organisation: individuals, species and communities (Daufresne et al. 2009). The observed changes in the thermal regime of some rivers under global change (Webb & Nobilis 1995, Webb 1996, Milly et al. 2005, Webb & Nobilis 2007), together with the temperature changes predicted by the IPCC (Intergovernmental Panel on Climate Change; IPCC 2007), suggest that fish assemblages will change in the future (Buisson et al. 2008a,b, Lassale et al. 2008, Daufresne et al. 2009). The aim of this part is to explore the potential consequences of climate change for the assessment of the ecological status of rivers with current indices (Wilby et al. 2006).

The first objective of this part is to study the effect of the environment on the growth of young-of-the-year (YOY, 0+) trout, in order to assess the effect of climate change on the size structure of assemblages. A change in the size structure of trout populations would matter all the more because this species dominates “intolerant” assemblages. The index developed for these assemblages includes the only metric based on size structure (equation 3.13): “the number of habitat-intolerant individuals (HINTOL) smaller than 150 mm”. More generally, all the metrics used in the EFI+ project are based on grouping species with shared characteristics into guilds (Usseglio-Polatera et al. 2000, Noble et al. 2007). Future changes in species ranges (Buisson et al. 2008a,b, Lassale et al. 2008) could therefore have major consequences for assemblage structure. The second objective of this part is to study the environmental parameters that influence the distribution of 24 European fish species.

3.3.1. Growth of young-of-the-year brown trout, Salmo trutta fario, L.

In addition to temperature (Brown 1951, Elliott et al. 1995, Abdoli et al. 2007), many abiotic and biotic factors control the individual growth of young-of-the-year fish: discharge (Jonsson et al. 2001, Arnekleiv et al. 2006), food availability (Clarke & Scruton 1999, Boughton et al. 2007, Ward et al. 2009), intra- and interspecific interactions such as aggression and interspecific competition (Bystrom & Garcia-Berthou 1999, Lahti et al. 2001), and density-dependent feedbacks (Elliott 1994). The growth of young trout was studied through the maximum size that a 0+ individual can reach in an assemblage. The size distribution of each trout population is treated as a mixture of two normal distributions (Macdonald 1987). The first component represents the 0+ individuals and the subsequent components the older individuals (Figure 23). The parameters of the mixture were estimated with an EM algorithm (Benaglia et al. 2009). In each of the 105 assemblages considered, the maximum size that a 0+ individual can reach is estimated as a quantile (the 97.5th percentile) of the normal distribution associated with the 0+ individuals (Figure 23).

Figure 23: Example of the size distribution of a trout population. The parameters of the normal distribution associated with the YOY (dotted line) are used to estimate the maximum size (black dot). [Plot: density (y-axis) against individual sizes (x-axis), with the YOY and > 0+ components labelled.]
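The estimation of the maximum YOY size can be sketched as follows. This is a simplified Python illustration on simulated sizes, not the procedure actually run for the thesis (which cites Benaglia et al. 2009): the EM loop, the starting values (lower and upper halves of the sorted sizes) and the function names are all assumptions made for the example.

```python
import math
import random
from statistics import NormalDist

def fit_two_normals(sizes, iterations=100):
    """Fit a two-component normal mixture by EM.

    Returns [(weight, mean, sd), (weight, mean, sd)], smaller-sized component first.
    """
    data = sorted(sizes)
    half = len(data) // 2
    params = []
    # Crude start: lower half of the sizes for the YOY, upper half for older fish.
    for chunk in (data[:half], data[half:]):
        mean = sum(chunk) / len(chunk)
        sd = math.sqrt(sum((x - mean) ** 2 for x in chunk) / len(chunk))
        params.append((0.5, mean, max(sd, 1e-6)))
    for _ in range(iterations):
        young = NormalDist(params[0][1], params[0][2])
        old = NormalDist(params[1][1], params[1][2])
        # E-step: probability that each fish belongs to the first component.
        resp = []
        for x in data:
            d_young = params[0][0] * young.pdf(x)
            d_old = params[1][0] * old.pdf(x)
            resp.append(d_young / (d_young + d_old))
        # M-step: weighted updates of weight, mean and standard deviation.
        new_params = []
        for weights in (resp, [1 - r for r in resp]):
            total = sum(weights)
            mean = sum(w * x for w, x in zip(weights, data)) / total
            var = sum(w * (x - mean) ** 2 for w, x in zip(weights, data)) / total
            new_params.append((total / len(data), mean, max(math.sqrt(var), 1e-6)))
        params = new_params
    return params

def yoy_max_size(sizes):
    """97.5th percentile of the normal component associated with the YOY."""
    _, mean, sd = fit_two_normals(sizes)[0]
    return NormalDist(mean, sd).inv_cdf(0.975)

# Simulated population (illustrative values): 300 YOY and 200 older fish, sizes in mm.
rng = random.Random(42)
sizes = [rng.gauss(80, 10) for _ in range(300)] + [rng.gauss(200, 40) for _ in range(200)]
print(round(yoy_max_size(sizes), 1))
```

Using a quantile of the fitted YOY component rather than the largest observed small fish makes the estimate less sensitive to a few misassigned individuals in the overlap between the two components.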


Because sampling ran from 1 August to 22 November, the growth of 0+ individuals is assumed to be constant between these two dates (linear growth). After controlling for the effect of the sampling period (day of the year, D), mean annual air temperature (T), the thermal amplitude between January and July (difference between the two monthly means, Td), catchment area (A) and the dominant geology of the catchment (calcareous or siliceous, G) explain 49 % of the variation in YOY maximum size. Compared with a model including temperature as the only environmental variable, the full model explains a significantly larger share of the variance (F(4, 98) = 13.1, P < 0.001). The estimated relationship between maximum size and the environmental variables is:

$$L = -42.81 - 5.64\,\log(A) + 1.65\,\log(A)^{2} + 11.94\,T - 0.74\,T^{2} + 0.31\,D + 1.86\,T_{d} - 23.01\,G \qquad (3.33)$$

where L is the YOY maximum size estimated from the mixture of normal distributions, D the day of the year (213-327, 1 August-22 November) and G the dominant geology (siliceous).

Hierarchical partitioning (Chevan & Sutherland 1991, Pont et al. 2005, Walsh & Mac Nally 2008) shows that temperature is the environmental factor with the strongest large-scale influence on the growth of 0+ trout (Table 12; Elliott & Hurley 1997, Ojanguren et al. 2001, Parra et al. 2009). The relationship between temperature and YOY maximum size is quadratic (equation 3.33; Elliott & Hurley 1997, Ojanguren et al. 2001). Growth increases with temperature up to a thermal optimum and then declines (Figure 24; Ojanguren et al. 2001).

Figure 24: Marginal profiles (Fox 1987, 2003) of the effect of mean annual air temperature and catchment area on the maximum size of 0+ trout. [Plots: maximum size (y-axis) against mean annual temperature and against catchment area (x-axes).]

Catchment area is the environmental factor with the second-largest relative independent contribution (%Ii) to the growth of YOY trout (Table 12). The growth of 0+ individuals increases with stream size (Figure 24; Tedesco et al. 2009). This pattern is probably linked to longitudinal variation in available food resources. At equal temperature, food is a key factor in trout growth (Brown 1951). Recently, Descroix et al. (2010) showed that the growth of salmon (Salmo salar) parr increases along the upstream-downstream gradient of rivers, together with a change in diet. The decrease in 0+ growth estimated for very small catchment areas (Figure 24) has no real biological meaning and is probably a statistical artefact.

The dominant geology of the catchment accounts for 14.2 % of the independent contributions of the environmental variables (Table 12). On average, juvenile trout grow faster in calcareous than in siliceous catchments (negative coefficient associated with G, equation 3.33). This is probably explained by the higher alkalinity and fertility of the water in calcareous catchments (Kwak & Waters 1997, Almodovar et al. 2006). Variability in water chemistry can be a major source of the observed variability in growth (Almodovar et al. 2006), and water chemistry itself depends strongly on geology (Allan & Castillo 2007).

The thermal amplitude between January and July is the environmental variable with the smallest effect on the maximum size of YOY trout (Table 12).

Table 12: Results of the hierarchical partitioning of the multiple regression relating YOY maximum size to the environmental variables: independent contribution (I), joint contribution (J) and relative independent contribution (%Ii). * The effects of the variable and of its quadratic form (equation 3.33) are estimated simultaneously.

| Environmental variable | I | J | Total | \|I/J\| | %Ii |
|---|---|---|---|---|---|
| Temperature* | 0.2213 | 0.0043 | 0.2170 | 51.9 | 45.2 |
| Catchment area* | 0.1721 | 0.0048 | 0.1673 | 35.7 | 35.1 |
| Thermal amplitude | 0.0268 | 0.0262 | 0.0006 | 1 | 5.5 |
| Geology | 0.0697 | 0.0068 | 0.0629 | 10.3 | 14.2 |
| Total | 0.4898 | 0.0420 | 0.4478 | 11.7 | 100 |
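The principle of hierarchical partitioning can be illustrated with a short Python sketch. It is a simplified version written for this example, not the implementation used here: it works on simulated data, uses R² of ordinary least squares as the goodness-of-fit measure, and treats every predictor column separately (a variable and its quadratic form are not grouped). For each variable, the independent contribution I is the gain in R² obtained by adding the variable, averaged within each hierarchy level and then across levels; the joint contribution J is the R² of the variable alone minus I.

```python
import random
from itertools import combinations

def r_squared(columns, y):
    """R² of an OLS fit of y on the given predictor columns plus an intercept."""
    n = len(y)
    mean_y = sum(y) / n
    total_ss = sum((v - mean_y) ** 2 for v in y)
    if not columns:
        return 0.0
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(X[0])
    # Normal equations (X'X) b = X'y, solved by Gauss-Jordan elimination.
    A = [[sum(X[i][a] * X[i][b] for i in range(n)) for b in range(k)]
         + [sum(X[i][a] * y[i] for i in range(n))] for a in range(k)]
    for c in range(k):
        pivot = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[pivot] = A[pivot], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    beta = [A[r][k] / A[r][r] for r in range(k)]
    resid_ss = sum((y[i] - sum(b * x for b, x in zip(beta, X[i]))) ** 2
                   for i in range(n))
    return 1 - resid_ss / total_ss

def hierarchical_partitioning(predictors, y):
    """Return {name: (independent, joint)} contributions to R²."""
    names = list(predictors)
    p = len(names)
    fit = {}
    for size in range(p + 1):
        for subset in combinations(names, size):
            fit[frozenset(subset)] = r_squared([predictors[v] for v in subset], y)
    result = {}
    for name in names:
        others = [v for v in names if v != name]
        level_means = []
        for size in range(p):
            gains = [fit[frozenset(s) | {name}] - fit[frozenset(s)]
                     for s in combinations(others, size)]
            level_means.append(sum(gains) / len(gains))
        independent = sum(level_means) / p
        result[name] = (independent, fit[frozenset([name])] - independent)
    return result

# Simulated data, for illustration only.
rng = random.Random(1)
n = 80
temperature = [rng.uniform(2, 14) for _ in range(n)]
log_area = [rng.uniform(0, 6) for _ in range(n)]
geology = [float(rng.random() < 0.5) for _ in range(n)]
size = [60 + 3 * t + 4 * a - 8 * g + rng.gauss(0, 5)
        for t, a, g in zip(temperature, log_area, geology)]
predictors = {"temperature": temperature, "log_area": log_area, "geology": geology}
parts = hierarchical_partitioning(predictors, size)
for name, (independent, joint) in parts.items():
    print(name, round(independent, 3), round(joint, 3))
```

By construction, the independent contributions sum to the R² of the full model, which is what allows them to be expressed as percentages (%Ii).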

3.3.1.1. Limits: sampling period

The 105 sites used in this study were sampled between August and November. Day of the year was included in the multiple regression to account for size differences related to the sampling date. This assumes that the growth of YOY trout is linear over this period and that the growth rate is the same whatever the region in which the populations are located.

3.3.1.2. Perspectives

The influence of temperature on the growth of YOY trout suggests that climate warming will have a marked effect on the size structure of populations. Individual size is itself directly related to the life-history strategy of fish species (Winemiller & Rose 1992, Vila-Gispert et al. 2002) and to population dynamics (Quinn & Peterson 1996). Fast growth during early life stages is often associated with an early age and size at first reproduction (Abdoli et al. 2007, McDermid et al. 2007). Populations living in warmer environments show higher growth rates during early stages (0+, 1+) and earlier maturation (Abdoli et al. 2007, McDermid et al. 2007). However, growth patterns are reversed in older individuals: old individuals living in warm environments seem to grow more slowly than individuals living in cold environments (Abdoli et al. 2007, McDermid et al. 2007). This is probably due to a trade-off between gonadal and somatic investment. Populations with later maturity would favour somatic investment, whereas populations with fast growth would favour gonadal investment (McDermid et al. 2007). The existence of such a trade-off, together with the influence of temperature on the growth of YOY trout, suggests that trout could adapt their strategy to new thermal conditions. The size structure of trout populations is therefore very likely to change. Such a change would have major consequences for the assessment of fish assemblage condition through the size-class metric, because trout is the species that contributes most to this metric.

3.3.2. Modelling species distribution

Two main statistical approaches are used to predict species presence-absence, depending on the objective. The first is purely predictive and disconnected from any ecological theory (Austin 2007, Elith & Graham 2009). It mainly relies on machine-learning methods such as decision or regression trees (Tirelli & Pessani 2009), random forests (Elith et al. 2008), neural networks (Tirelli & Pessani 2009) and MARS (Leathwick et al. 2006), or on generalised additive models (GAM; Guisan et al. 2002, Buisson et al. 2008a). Its objective is to predict changes in species ranges under various climate change scenarios (Buisson et al. 2008b, Lassale et al. 2008, Tirelli & Pessani 2009). The second approach is based on ecological theories such as Liebig's law (limiting factor) or the species niche (Austin 2007), and uses inferential methods such as quantile regression (Vaz et al. 2008) or logistic regression (Pont et al. 2005). Its objective is to study the effect of environmental factors on species presence-absence in the river network (Pont et al. 2005).

Identifying the physical, chemical and climatic environmental parameters that control species presence-absence (Jackson et al. 2001, Pont et al. 2005, Buisson et al. 2008a) is a fundamental step (Guisan & Zimmerman 2000, Austin 2007, Elith & Leathwick 2009, Olden et al. 2010) in predicting changes in species ranges (Buisson et al. 2008b, Lassale et al. 2008, Tirelli & Pessani 2009). Pont et al. (2005) showed that species respond differently to local and regional environmental constraints. Failing to select the variables that best explain species distribution would lead to overfitting (Harrell 2001, Hastie et al. 2009). Overfitting occurs when the model is too complex: some of the conclusions drawn from the analysis then result from modelling noise or spurious associations between the presence-absence of a species and the environmental variables (Harrell 2001). With overfitting, the prediction error on the calibration data set is relatively low, but the prediction error on an independent data set is high (Hastie et al. 2009).

The predictive power of species distribution models (SDM) also depends on the data set used to fit the models (Elith & Graham 2009, Sinclair et al. 2010). The larger the spatial extent of the data set, the more accurate the estimate of the niche or bioclimatic envelope of the species. Many projections of species ranges (e.g. Buisson et al. 2008b, Lassale et al. 2008, Tirelli & Pessani 2009) were built from data sets covering only part of the current range of these species (Kottelat & Freyhof 2007). These models probably capture only part of the realised niche of the species. The predicted climatic conditions (IPCC 2007) may fall outside the range of values covered by the calibration data set. Predictions outside the values used to calibrate the models are then mere extrapolations. Moreover, the estimated relationships between species presence-absence and the environmental variables may be imprecise, or even biased. Species presences cannot be associated with combinations of environmental variables that are not observed in the calibration data set.

To estimate the relationships between habitat conditions and the presence-absence of 24 European fish species (Table 13), a data set of 1,548 slightly or non-impacted sites was selected (Figure 25). These relationships were estimated with logistic regressions (Hosmer & Lemeshow 2000, Collett 2002, Pont et al. 2005) including mean air temperature in July (Tjul), the thermal amplitude between July and January (Tdif), slope (log-transformed, Slope) and an estimate of run-off (lPA, the logarithm of catchment area multiplied by annual precipitation). Temperature is a limiting factor for any living organism through its direct action on metabolism (Begon et al. 2006). Temperature can also act indirectly on aquatic organisms, for example by affecting oxygen solubility (Allan & Castillo 2007). lPA is a proxy for river flow: for an equivalent catchment area, the greater the precipitation over the catchment, the higher the value of lPA and hence the greater the discharge. Slope represents a physical gradient that structures rivers from upstream to downstream (Allan & Castillo 2007) and is an environmental factor known to influence the composition of fish communities (Huet 1954). The quadratic forms of Tjul, Slope and lPA were included in the models.
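The structure of such a model can be sketched in Python as follows. This is only an illustration of the linear predictor of a logistic regression with quadratic terms for Tjul, Slope and lPA: the coefficient values are hypothetical placeholders, not the fitted values of this study, the function names are illustrative, and lPA is assumed here to be the natural logarithm of the product of catchment area and annual precipitation.

```python
import math

def log_pa(area_km2, precipitation_mm):
    """Pseudo run-off, assumed here to be ln(catchment area x annual precipitation)."""
    return math.log(area_km2 * precipitation_mm)

def occurrence_probability(coef, tjul, tdif, slope, area_km2, precipitation_mm):
    """Probability of presence from a logistic model with quadratic terms."""
    log_slope = math.log(slope)
    lpa = log_pa(area_km2, precipitation_mm)
    eta = (coef["intercept"]
           + coef["Tjul"] * tjul + coef["Tjul2"] * tjul ** 2
           + coef["Tdif"] * tdif
           + coef["Slope"] * log_slope + coef["Slope2"] * log_slope ** 2
           + coef["lPA"] * lpa + coef["lPA2"] * lpa ** 2)
    return 1.0 / (1.0 + math.exp(-eta))

def thermal_optimum(coef):
    """July temperature at which the quadratic response peaks (requires Tjul2 < 0)."""
    return -coef["Tjul"] / (2.0 * coef["Tjul2"])

# Hypothetical coefficients, for illustration only.
example_coef = {"intercept": -20.0, "Tjul": 2.0, "Tjul2": -0.05, "Tdif": -0.1,
                "Slope": 0.3, "Slope2": -0.05, "lPA": 0.2, "lPA2": -0.01}

p = occurrence_probability(example_coef, tjul=18.0, tdif=17.0, slope=5.0,
                           area_km2=120.0, precipitation_mm=900.0)
print(round(p, 3), thermal_optimum(example_coef))
```

A negative coefficient on the squared term gives a unimodal, symmetric response on the logit scale, with its maximum at minus the linear coefficient divided by twice the quadratic coefficient.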

Figure 25: Location of the 1,548 slightly or non-impacted sites.


Table 13: Summary of the number of sites available for each species, their prevalence within these sites, and the marine regions where they are considered native (MMR; Kottelat & Freyhof 2007, Reyjol et al. 2007).

| Species | Family | Common name | Occurrence | Sites | Prevalence | MMR |
|---|---|---|---|---|---|---|
| Salmo trutta | Salmonidae | Brown trout | 1212 | 1544 | 0.785 | 44 |
| Phoxinus phoxinus | Cyprinidae | Minnow | 492 | 1319 | 0.373 | 37 |
| Barbatula barbatula | Nemacheilidae | Stone loach | 455 | 1236 | 0.368 | 34 |
| Cottus gobio | Cottidae | Bullhead | 451 | 1263 | 0.357 | 36 |
| Gobio gobio | Cyprinidae | Gudgeon | 387 | 1265 | 0.306 | 34 |
| Pseudochondrostoma duriense | Cyprinidae | Duero nase* | 119 | 393 | 0.303 | 5 |
| Leuciscus cephalus | Cyprinidae | Chub | 351 | 1179 | 0.298 | 38 |
| Rutilus rutilus | Cyprinidae | Roach | 285 | 1065 | 0.268 | 33 |
| Anguilla anguilla | Anguillidae | Eel | 317 | 1314 | 0.241 | 42 |
| Perca fluviatilis | Percidae | Perch | 231 | 996 | 0.232 | 32 |
| Alburnoides bipunctatus | Cyprinidae | Spirlin | 143 | 687 | 0.208 | 11 |
| Esox lucius | Esocidae | Pike | 210 | 1029 | 0.204 | 32 |
| Leuciscus leuciscus | Cyprinidae | Dace | 210 | 1054 | 0.199 | 32 |
| Alburnus alburnus | Cyprinidae | Bleak | 179 | 986 | 0.182 | 28 |
| Gasterosteus aculeatus | Gasterosteidae | Three-spined stickleback | 150 | 864 | 0.174 | 32 |
| Chondrostoma nasus | Cyprinidae | Nase | 64 | 385 | 0.166 | 5 |
| Salmo salar | Salmonidae | Atlantic salmon | 163 | 1005 | 0.162 | 30 |
| Barbus barbus | Cyprinidae | Barbel | 131 | 908 | 0.144 | 23 |
| Lota lota | Lotidae | Burbot | 104 | 775 | 0.134 | 16 |
| Telestes souffia | Cyprinidae | Blageon | 59 | 441 | 0.134 | 7 |
| Thymallus thymallus | Thymallidae | Grayling | 119 | 894 | 0.133 | 27 |
| Lampetra planeri | Petromyzonidae | Brook lamprey | 181 | 1442 | 0.126 | 36 |
| Rhodeus amarus | Cyprinidae | Bitterling | 93 | 835 | 0.111 | 16 |
| Pungitius pungitius | Gasterosteidae | Nine-spined stickleback | 76 | 719 | 0.106 | 24 |

* Common name following Kottelat & Freyhof (2007).

3.3.2.1. Goodness of fit and model validation

Overall, the goodness of fit of the models is fairly satisfactory: 20 models have an area under the ROC curve (AUC) greater than 0.8, 22 models have a kappa greater than 0.3, and the mean correct classification rate is 76.3 % (Table 14). The “optimism” values of the kappa and of the correct classification rate (Harrell 2001) are low (Table 15), which suggests a weak influence of outliers and good model stability. Nevertheless, the goodness of fit varies between species. The models estimated for the minnow, the stone loach, the bullhead and the brook lamprey have low goodness-of-fit statistics (Table 14). None of their areas under the curve exceeds 0.8, and neither their sensitivity (percentage of presences correctly predicted) nor their specificity (percentage of absences correctly predicted; Agresti 2002, Buisson et al. 2008a) exceeds 70 %. The discriminating power of these models is low. Conversely, the goodness-of-fit statistics of the models for the blageon, the bleak, the nase and the barbel are relatively high and correspond to a strong discriminating power: AUC greater than 0.9, kappa greater than 0.46 and a correct classification rate between 0.76 and 0.88. The specificity and sensitivity values of these models are consistent with a strong discriminating power. Some values are greater than 0.9 (Table 14), and the sensitivities of these four models are always greater than their specificities. On average, the sensitivity of the models is greater than their specificity (Table 14). The classification error associated with species presence is lower than the classification error associated with species absence.
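The statistics discussed above can be computed from observed presence-absence data and predicted probabilities. The Python sketch below uses their standard definitions on a small made-up example; the function names and the data are illustrative and are not taken from the models of this study.

```python
def classification_summary(observed, probabilities, cut):
    """Sensitivity, specificity, correct classification rate and Cohen's kappa."""
    tp = fp = tn = fn = 0
    for obs, prob in zip(observed, probabilities):
        predicted = prob >= cut  # predicted presence at or above the threshold
        if obs and predicted:
            tp += 1
        elif obs:
            fn += 1
        elif predicted:
            fp += 1
        else:
            tn += 1
    n = tp + fp + tn + fn
    rate = (tp + tn) / n
    # Agreement expected by chance, from the row and column totals.
    chance = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / n ** 2
    return {"sensitivity": tp / (tp + fn), "specificity": tn / (tn + fp),
            "rate": rate, "kappa": (rate - chance) / (1 - chance)}

def auc(observed, probabilities):
    """Area under the ROC curve as the share of presence-absence pairs ranked correctly."""
    presences = [p for o, p in zip(observed, probabilities) if o]
    absences = [p for o, p in zip(observed, probabilities) if not o]
    wins = sum((p > a) + 0.5 * (p == a) for p in presences for a in absences)
    return wins / (len(presences) * len(absences))

# Made-up example: 8 sites, threshold probability of 0.5.
observed = [1, 1, 1, 0, 0, 0, 0, 1]
probabilities = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]
summary = classification_summary(observed, probabilities, cut=0.5)
print(summary)  # sensitivity 0.75, specificity 0.75, rate 0.75, kappa 0.5
print(auc(observed, probabilities))  # 0.9375
```

Unlike the other four statistics, the AUC does not depend on the threshold probability, which is why it is reported alongside them.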

Table 14: Summary of the goodness of fit of the models: percentage of covariate patterns (Hosmer & Lemeshow 2000) greater than 4 (CovPat), mean variance inflation factor (Mvif; Greene 2003, Kutner et al. 2005), pseudo-R² (pR²) based on the deviance ratio (Estrella 1998), Akaike information criterion (AIC; Akaike 1974, Collett 2002), threshold probability (Cut), sensitivity (Sens), specificity (Spec), area under the ROC curve (AUC; Agresti 2002), kappa statistic (κ; Agresti 2002) and correct classification rate (percentage of correctly classified sites, Class).

| Species | CovPat | Mvif | pR² | AIC | Cut | Sens | Spec | AUC | κ | Class |
|---|---|---|---|---|---|---|---|---|---|---|
| Salmo trutta | 0.036 | 1.43 | 0.302 | 1155 | 0.739 | 0.829 | 0.711 | 0.845 | 0.482 | 0.804 |
| Phoxinus phoxinus | 0.010 | 1.43 | 0.082 | 1648 | 0.412 | 0.630 | 0.625 | 0.652 | 0.243 | 0.627 |
| Barbatula barbatula | 0.008 | 1.38 | 0.154 | 1448 | 0.332 | 0.804 | 0.542 | 0.716 | 0.308 | 0.638 |
| Cottus gobio | 0.014 | 1.13 | 0.157 | 1457 | 0.362 | 0.818 | 0.602 | 0.727 | 0.376 | 0.679 |
| Gobio gobio | 0.020 | 1.34 | 0.349 | 1115 | 0.417 | 0.739 | 0.818 | 0.848 | 0.534 | 0.794 |
| Pseudochondrostoma duriense | 0.015 | 1.40 | 0.271 | 387 | 0.282 | 0.815 | 0.693 | 0.810 | 0.443 | 0.730 |
| Leuciscus cephalus | 0.019 | 1.39 | 0.304 | 1081 | 0.236 | 0.849 | 0.637 | 0.826 | 0.403 | 0.700 |
| Rutilus rutilus | 0.030 | 1.25 | 0.365 | 849 | 0.297 | 0.765 | 0.795 | 0.859 | 0.507 | 0.787 |
| Anguilla anguilla | 0.022 | 1.50 | 0.227 | 1164 | 0.217 | 0.808 | 0.675 | 0.808 | 0.376 | 0.707 |
| Perca fluviatilis | 0.026 | 1.33 | 0.312 | 776 | 0.313 | 0.723 | 0.809 | 0.850 | 0.473 | 0.789 |
| Alburnoides bipunctatus | 0.020 | 1.40 | 0.231 | 560 | 0.243 | 0.783 | 0.734 | 0.823 | 0.399 | 0.744 |
| Esox lucius | 0.022 | 1.30 | 0.321 | 723 | 0.257 | 0.781 | 0.790 | 0.868 | 0.467 | 0.788 |
| Leuciscus leuciscus | 0.022 | 1.41 | 0.324 | 727 | 0.329 | 0.719 | 0.859 | 0.872 | 0.522 | 0.831 |
| Alburnus alburnus | 0.018 | 1.10 | 0.394 | 563 | 0.143 | 0.905 | 0.751 | 0.901 | 0.469 | 0.779 |
| Gasterosteus aculeatus | 0.014 | 1.33 | 0.327 | 536 | 0.215 | 0.860 | 0.790 | 0.885 | 0.485 | 0.802 |
| Chondrostoma nasus | 0.021 | 1.97 | 0.400 | 212 | 0.130 | 0.922 | 0.782 | 0.919 | 0.500 | 0.805 |
| Salmo salar | 0.039 | 1.51 | 0.283 | 622 | 0.303 | 0.681 | 0.910 | 0.865 | 0.558 | 0.873 |
| Barbus barbus | 0.014 | 1.52 | 0.373 | 442 | 0.159 | 0.916 | 0.834 | 0.923 | 0.546 | 0.846 |
| Lota lota | 0.032 | 1.30 | 0.174 | 494 | 0.147 | 0.740 | 0.762 | 0.816 | 0.326 | 0.759 |
| Telestes souffia | 0.016 | 1.79 | 0.280 | 245 | 0.180 | 0.864 | 0.853 | 0.899 | 0.534 | 0.855 |
| Thymallus thymallus | 0.024 | 1.33 | 0.194 | 549 | 0.163 | 0.782 | 0.777 | 0.835 | 0.367 | 0.777 |
| Lampetra planeri | 0.025 | 1.26 | 0.122 | 929 | 0.136 | 0.801 | 0.694 | 0.786 | 0.271 | 0.707 |
| Rhodeus amarus | 0.020 | 1.30 | 0.211 | 432 | 0.100 | 0.936 | 0.714 | 0.872 | 0.330 | 0.739 |
| Pungitius pungitius | 0.026 | 1.23 | 0.210 | 356 | 0.122 | 0.829 | 0.768 | 0.860 | 0.334 | 0.775 |


Table 15: Estimates of the optimism (Harrell 2001) of the kappa and of the correct classification rate. For each species, 200 new train data sets are defined by randomly selecting calibration sites with replacement. The model coefficients are re-estimated for each of the 200 new samples. From these models, the kappa and the correct classification rate are computed for the train data set and for the initial data set (Init) of calibration sites. The difference between the means of the 200 values computed for the randomly selected calibration sites (Train) and for the initial calibration sites is an estimate of the optimism (Optim; Harrell 2001).

| Species | Kappa: Train | Kappa: Init | Kappa: Optim | Correct classification rate: Train | Correct classification rate: Init | Correct classification rate: Optim |
|---|---|---|---|---|---|---|
| Salmo trutta | 0.471 | 0.468 | 0.004 | 0.800 | 0.798 | 0.002 |
| Phoxinus phoxinus | 0.235 | 0.227 | 0.008 | 0.623 | 0.618 | 0.005 |
| Barbatula barbatula | 0.300 | 0.295 | 0.005 | 0.636 | 0.633 | 0.003 |
| Cottus gobio | 0.368 | 0.361 | 0.006 | 0.674 | 0.672 | 0.002 |
| Gobio gobio | 0.528 | 0.517 | 0.011 | 0.791 | 0.788 | 0.004 |
| Pseudochondrostoma duriense | 0.434 | 0.409 | 0.025 | 0.728 | 0.717 | 0.010 |
| Leuciscus cephalus | 0.404 | 0.397 | 0.007 | 0.701 | 0.697 | 0.004 |
| Rutilus rutilus | 0.501 | 0.498 | 0.003 | 0.785 | 0.784 | 0.002 |
| Anguilla anguilla | 0.378 | 0.368 | 0.010 | 0.710 | 0.704 | 0.005 |
| Perca fluviatilis | 0.464 | 0.458 | 0.005 | 0.786 | 0.785 | 0.001 |
| Alburnoides bipunctatus | 0.412 | 0.392 | 0.019 | 0.750 | 0.742 | 0.008 |
| Esox lucius | 0.469 | 0.466 | 0.004 | 0.790 | 0.790 | 0.000 |
| Leuciscus leuciscus | 0.506 | 0.497 | 0.009 | 0.828 | 0.822 | 0.005 |
| Alburnus alburnus | 0.463 | 0.460 | 0.002 | 0.778 | 0.776 | 0.002 |
| Gasterosteus aculeatus | 0.487 | 0.478 | 0.010 | 0.803 | 0.798 | 0.004 |
| Chondrostoma nasus | 0.510 | 0.490 | 0.020 | 0.810 | 0.804 | 0.006 |
| Salmo salar | 0.547 | 0.540 | 0.006 | 0.868 | 0.867 | 0.002 |
| Barbus barbus | 0.539 | 0.527 | 0.012 | 0.844 | 0.841 | 0.004 |
| Lota lota | 0.320 | 0.299 | 0.021 | 0.756 | 0.750 | 0.005 |
| Telestes souffia | 0.506 | 0.472 | 0.034 | 0.846 | 0.835 | 0.011 |
| Thymallus thymallus | 0.362 | 0.348 | 0.014 | 0.779 | 0.774 | 0.005 |
| Lampetra planeri | 0.262 | 0.255 | 0.007 | 0.706 | 0.702 | 0.004 |
| Rhodeus amarus | 0.332 | 0.319 | 0.013 | 0.746 | 0.742 | 0.004 |
| Pungitius pungitius | 0.325 | 0.313 | 0.011 | 0.772 | 0.769 | 0.003 |
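The bootstrap procedure described in the caption of Table 15 can be sketched in Python. To keep the example short and self-contained, the logistic regression is replaced by a deliberately simple stand-in model (a single threshold on one predictor, chosen to maximise the correct classification rate); this substitution, the simulated sites and all function names are assumptions of the sketch, not part of the original analysis.

```python
import random

def correct_rate(sample, cut):
    """Share of sites whose presence (y == 1) is correctly predicted by x >= cut."""
    return sum((x >= cut) == (y == 1) for x, y in sample) / len(sample)

def fit_cut(sample):
    """Stand-in model: the threshold on x that maximises the correct classification rate."""
    return max(sorted({x for x, _ in sample}),
               key=lambda cut: correct_rate(sample, cut))

def bootstrap_optimism(sample, n_boot=200, seed=0):
    """Mean Train score, mean Init score and their difference (the optimism)."""
    rng = random.Random(seed)
    train_scores, init_scores = [], []
    for _ in range(n_boot):
        boot = [rng.choice(sample) for _ in sample]     # sites drawn with replacement
        cut = fit_cut(boot)                             # model refitted on the bootstrap sample
        train_scores.append(correct_rate(boot, cut))    # performance on that sample (Train)
        init_scores.append(correct_rate(sample, cut))   # performance on the initial sites (Init)
    train = sum(train_scores) / n_boot
    init = sum(init_scores) / n_boot
    return train, init, train - init

# Simulated sites: (predictor value, presence), for illustration only.
rng = random.Random(3)
sites = []
for _ in range(100):
    present = int(rng.random() < 0.4)
    sites.append((rng.gauss(1.0 if present else 0.0, 1.0), present))

train, init, optimism = bootstrap_optimism(sites)
print(round(train, 3), round(init, 3), round(optimism, 3))
```

A small optimism means that the performance measured on the data used for fitting is close to the performance on the initial sites, which is the stability argument made from Table 15.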

The predictive quality of the models is estimated with a method derived from split-sampling (Harrell 2001), by randomly splitting the calibration data set into two data sets, train and test, containing 70 % and 30 % of the sites respectively. For each species, the models are re-estimated on the train data set, and the area under the ROC curve (AUC), specificity, sensitivity, kappa and correct classification rate are computed on the test data set (a pseudo-independent data set). This operation is repeated 200 times to obtain estimates of these five statistics that are as reliable as possible. The means of these five statistics estimated on the pseudo-independent data sets (Table 16) are similar to those computed on the whole calibration data set (Table 14), particularly for specificity, AUC and the correct classification rate. The standard deviations of these five statistics are relatively small. Only the Duero nase, the blageon, the burbot and the nine-spined stickleback show notable differences in sensitivity between the calibration sites (Table 14) and the pseudo-independent data set (Table 16). Their standard deviations are markedly higher than for the other species.
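The repeated 70/30 split can be sketched in the same spirit. In this Python illustration the logistic regression is again replaced by a simple stand-in (a threshold halfway between the mean predictor values of presences and absences), and the sites are simulated; both are assumptions made for the example, as are the function names.

```python
import random
import statistics

def fit_cut(train):
    """Stand-in model: threshold halfway between the class means of the predictor."""
    present = [x for x, y in train if y == 1]
    absent = [x for x, y in train if y == 0]
    return (statistics.mean(present) + statistics.mean(absent)) / 2

def correct_rate(sample, cut):
    """Share of sites whose presence (y == 1) is correctly predicted by x >= cut."""
    return sum((x >= cut) == (y == 1) for x, y in sample) / len(sample)

def repeated_split(sample, n_rep=200, train_share=0.7, seed=0):
    """Mean and standard deviation of the test-set score over repeated random splits."""
    rng = random.Random(seed)
    n_train = round(train_share * len(sample))
    scores = []
    for _ in range(n_rep):
        shuffled = sample[:]
        rng.shuffle(shuffled)
        cut = fit_cut(shuffled[:n_train])                      # fit on the train part
        scores.append(correct_rate(shuffled[n_train:], cut))   # score on the test part
    return statistics.mean(scores), statistics.stdev(scores)

# Simulated sites: 50 presences and 70 absences, for illustration only.
rng = random.Random(7)
sites = ([(rng.gauss(1.0, 1.0), 1) for _ in range(50)]
         + [(rng.gauss(0.0, 1.0), 0) for _ in range(70)])

mean_rate, sd_rate = repeated_split(sites)
print(round(mean_rate, 3), round(sd_rate, 3))
```

Reporting the standard deviation alongside the mean, as in Table 16, shows how much the statistic depends on which sites happen to fall in the test set.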

Table 16: Means and standard deviations of the sensitivity (Sens), specificity (Spec), area under the ROC curve (AUC), kappa (κ) and correct classification rate (Class) of the models, estimated on a pseudo-independent data set with a method derived from split-sampling (Harrell 2001).

| Species | Sens | Spec | AUC | κ | Class |
|---|---|---|---|---|---|
| Salmo trutta | 0.825 ± 0.023 | 0.694 ± 0.047 | 0.841 ± 0.020 | 0.463 ± 0.041 | 0.796 ± 0.017 |
| Phoxinus phoxinus | 0.619 ± 0.045 | 0.617 ± 0.038 | 0.648 ± 0.023 | 0.223 ± 0.041 | 0.617 ± 0.022 |
| Barbatula barbatula | 0.790 ± 0.042 | 0.535 ± 0.035 | 0.710 ± 0.023 | 0.288 ± 0.041 | 0.628 ± 0.024 |
| Cottus gobio | 0.807 ± 0.037 | 0.594 ± 0.033 | 0.724 ± 0.021 | 0.357 ± 0.038 | 0.669 ± 0.022 |
| Gobio gobio | 0.719 ± 0.044 | 0.817 ± 0.025 | 0.846 ± 0.018 | 0.516 ± 0.039 | 0.787 ± 0.018 |
| Pseudochondrostoma duriense | 0.760 ± 0.082 | 0.685 ± 0.055 | 0.794 ± 0.037 | 0.390 ± 0.074 | 0.707 ± 0.038 |
| Leuciscus cephalus | 0.840 ± 0.038 | 0.633 ± 0.031 | 0.825 ± 0.019 | 0.393 ± 0.038 | 0.694 ± 0.022 |
| Rutilus rutilus | 0.757 ± 0.043 | 0.793 ± 0.023 | 0.859 ± 0.020 | 0.497 ± 0.042 | 0.783 ± 0.018 |
| Anguilla anguilla | 0.792 ± 0.042 | 0.669 ± 0.028 | 0.804 ± 0.020 | 0.358 ± 0.036 | 0.698 ± 0.020 |
| Perca fluviatilis | 0.709 ± 0.050 | 0.804 ± 0.028 | 0.847 ± 0.021 | 0.454 ± 0.051 | 0.782 ± 0.021 |
| Alburnoides bipunctatus | 0.756 ± 0.072 | 0.728 ± 0.039 | 0.812 ± 0.029 | 0.374 ± 0.058 | 0.733 ± 0.028 |
| Esox lucius | 0.772 ± 0.048 | 0.792 ± 0.030 | 0.867 ± 0.018 | 0.464 ± 0.052 | 0.788 ± 0.023 |
| Leuciscus leuciscus | 0.703 ± 0.053 | 0.849 ± 0.024 | 0.867 ± 0.019 | 0.491 ± 0.047 | 0.820 ± 0.019 |
| Alburnus alburnus | 0.891 ± 0.046 | 0.749 ± 0.031 | 0.899 ± 0.020 | 0.457 ± 0.050 | 0.774 ± 0.024 |
| Gasterosteus aculeatus | 0.842 ± 0.049 | 0.786 ± 0.029 | 0.880 ± 0.019 | 0.469 ± 0.055 | 0.796 ± 0.024 |
| Chondrostoma nasus | 0.873 ± 0.078 | 0.782 ± 0.040 | 0.902 ± 0.027 | 0.469 ± 0.070 | 0.796 ± 0.032 |
| Salmo salar | 0.662 ± 0.064 | 0.910 ± 0.020 | 0.860 ± 0.025 | 0.543 ± 0.057 | 0.869 ± 0.017 |
| Barbus barbus | 0.877 ± 0.052 | 0.830 ± 0.024 | 0.920 ± 0.015 | 0.516 ± 0.055 | 0.836 ± 0.020 |
| Lota lota | 0.681 ± 0.097 | 0.756 ± 0.040 | 0.805 ± 0.034 | 0.285 ± 0.053 | 0.744 ± 0.029 |
| Telestes souffia | 0.765 ± 0.107 | 0.834 ± 0.038 | 0.878 ± 0.031 | 0.440 ± 0.084 | 0.824 ± 0.031 |
| Thymallus thymallus | 0.745 ± 0.077 | 0.777 ± 0.031 | 0.833 ± 0.028 | 0.347 ± 0.052 | 0.772 ± 0.024 |
| Lampetra planeri | 0.777 ± 0.063 | 0.691 ± 0.029 | 0.786 ± 0.024 | 0.256 ± 0.038 | 0.701 ± 0.024 |
| Rhodeus amarus | 0.891 ± 0.065 | 0.719 ± 0.032 | 0.865 ± 0.025 | 0.319 ± 0.050 | 0.738 ± 0.027 |
| Pungitius pungitius | 0.783 ± 0.097 | 0.764 ± 0.037 | 0.862 ± 0.033 | 0.307 ± 0.062 | 0.765 ± 0.030 |

3.3.2.2. Effects of environmental variables on species occurrence

Temperature is the only environmental parameter retained in every final model by the stepwise procedure (Agresti 2002). Slope and pseudo run-off were retained for 23 species, whereas thermal amplitude was retained in 17 models (Table 17). The relative influence of each environmental variable on species occurrence varies strongly between species (Pont et al. 2005). Slope is the most important environmental factor (Huet 1954), with the largest relative independent contribution (%Ii) for more than half of the species (Table 17). Temperature and pseudo run-off are the two environmental factors with the largest independent contributions after slope (Table 17). Thermal amplitude has the largest independent contribution for only one species, the eel (Anguilla anguilla, L.). Overall, thermal amplitude is the environmental factor with the smallest independent contribution to species occurrence. For all species, the independent effect of the environmental variables is greater than the joint effect (I/J ratio). However, this ratio does not exceed two for 11 of the 24 species, which suggests joint effects of several variables on the presence-absence of many species. Conversely, one third of the species have ratios greater than three, which implies a relatively independent effect of the environmental variables on the occurrence of these species. This is particularly true for the bullhead, the eel, the salmon and the nine-spined stickleback (Table 17).

Table 17: Results of hierarchical partitioning (Chevan & Sutherland 1991, Mac Nally 2000, Walsh & Mac Nally 2008) for 24 European fish species: total reduction in deviance (R), independent contribution of the variables (I), joint contribution (J), ratio of independent to joint contributions (|I/J|) and relative independent contribution (%Ii = Ii/I). Columns R, I, J and |I/J| give the total partition; the remaining columns give %Ii for each model term.

Species                            R       I       J   |I/J|    Tjul   Tjul²   Slope  Slope²     lPA    lPA²    Tdif
Salmo trutta                  -696.9  -468.6  -228.2     2.1       –    30.9       –    34.4       –    19.7    15.1
Phoxinus phoxinus             -136.2  -108.9   -27.3     4.0       –    37.6       –    38.5       –    23.9       –
Barbatula barbatula           -258.5  -194.6   -63.9     3.0       –    31.4       –    46.4       –    18.9     3.3
Cottus gobio                  -212.3  -201.6   -10.8    18.7       –    73.8       –    17.4       –       –     8.8
Gobio gobio                   -723.5  -459.1  -264.5     1.7       –    20.0       –    49.0       –    21.3     9.7
Pseudochondrostoma duriense   -141.3  -109.4   -31.9     3.4       –    41.0       –    19.4       –    39.6       –
Leuciscus cephalus            -566.3  -368.9  -197.4     1.9       –    39.0       –    23.9       –    37.0       –
Rutilus rutilus               -664.3  -400.2  -264.1     1.5       –    21.2    45.3       –    30.8       –     2.7
Anguilla anguilla             -275.8  -301.8    26.0    11.6       –     6.9       –     5.5     2.3       –    85.3
Perca fluviatilis             -519.0  -315.4  -203.7     1.5       –    20.6       –    49.1    30.3       –       –
Alburnoides bipunctatus       -261.4  -159.2  -102.2     1.6       –    14.6       –    23.0       –    51.0    11.5
Esox lucius                   -533.2  -330.5  -202.7     1.6       –    11.8       –    62.2    26.1       –       –
Leuciscus leuciscus           -529.1  -341.6  -187.5     1.8       –    17.4       –    45.1       –    33.3     4.2
Alburnus alburnus             -680.7  -383.6    -297     1.3    16.1       –       –    37.8    41.5       –     4.6
Gasterosteus aculeatus        -379.6  -278.1  -101.5     2.7       –    29.5       –    50.1       –     9.8    10.6
Chondrostoma nasus            -240.6  -150.1   -90.5     1.7       –    26.1       –    17.9       –    53.3     2.6
Salmo salar                   -202.9  -279.0    76.1     3.7    69.1       –       –       –       –    14.6    16.3
Barbus barbus                 -457.8  -323.8    -134     2.4       –    24.0       –    15.1       –    59.1     1.8
Lota lota                     -208.5  -131.3   -77.2     1.7     3.5       –       –    43.1       –    40.1    13.3
Leuciscus souffia             -102.0  -118.5    16.5     7.2       –    66.7       –    10.6       –     5.4    17.3
Thymallus thymallus           -181.5  -168.8   -12.7    13.3       –    14.5       –    28.7       –    48.1     8.7
Lampetra planeri              -158.4  -172.1    13.7    12.6       –    34.2       –    51.1    14.7       –       –
Rhodeus amarus                -309.3  -168.0  -141.3     1.2       –    29.4       –    37.7       –    23.6     9.2
Pungitius pungitius           -138.6  -142.9     4.3    33.5       –    39.7       –    42.1       –    18.3       –
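The quantities I and J in Table 17 can be illustrated with a minimal sketch of hierarchical partitioning. It assumes that a goodness-of-fit value is already available for every subset of predictors; the function name, the dictionary layout and the numerical values are hypothetical and are not taken from the thesis, which works with reductions in deviance from logistic regressions.

```python
from itertools import combinations


def hierarchical_partitioning(fit, n_vars):
    """Split each variable's explanatory power into independent (I) and joint (J) parts.

    `fit` maps a sorted tuple of variable indices to a goodness-of-fit value
    (for instance a reduction in deviance); the empty tuple is the null model.
    """
    independent, joint = [], []
    for v in range(n_vars):
        others = [c for c in range(n_vars) if c != v]
        level_means = []
        # One hierarchy level per number of other variables already in the model.
        for size in range(n_vars):
            gains = [fit[tuple(sorted(subset + (v,)))] - fit[subset]
                     for subset in combinations(others, size)]
            level_means.append(sum(gains) / len(gains))
        # Independent contribution: mean gain, averaged over hierarchy levels.
        i_v = sum(level_means) / n_vars
        independent.append(i_v)
        # Joint contribution: what the variable explains alone, minus I.
        joint.append(fit[(v,)] - fit[()] - i_v)
    return independent, joint


# Hypothetical goodness-of-fit values for two predictors (illustration only).
fit = {(): 0.0, (0,): 0.30, (1,): 0.20, (0, 1): 0.40}
independent, joint = hierarchical_partitioning(fit, 2)
```

By construction the independent contributions sum to the fit of the full model, which is why the relative contributions %Ii in each row of the table add up to about 100.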

3.3- Potential implications of climate change

With the exception of bleak, Atlantic salmon, burbot and bitterling, the species show unimodal, symmetric (parabolic) responses along the thermal gradient (Figure 26). This contrasts with the results of Pont et al. (2005), who mostly found that the probability of presence of species increased with temperature. The discrepancy probably stems from the different thermal ranges covered by the datasets used to estimate these relationships: the range is 13.8 °C (11.3 to 25.1 °C) in this work, whereas it was only 9.5 °C (4.5 to 14 °C) in the data of Pont et al. (2005).

Responses to the thermal gradient vary widely among species, which differ markedly in niche breadth and thermal optimum (Wehrly et al. 2003). Brown trout, minnow, stone loach and chub have relatively broad thermal niches (eurythermal species), whereas the Duero nase (Pseudochondrostoma duriense), three-spined stickleback, nine-spined stickleback, dace, barbel, souffia and grayling have relatively narrow thermal tolerance ranges (stenothermal species, Figure 26). Four of the 24 species show distinctive response patterns. Brown trout tolerates the coldest temperatures and shows the greatest plasticity, that is, the broadest niche. The probabilities of occurrence of Atlantic salmon and burbot decline along the whole thermal gradient. Conversely, the probability of presence of bleak increases with temperature (Figure 26). Salmon, burbot and bleak are the only species for which the quadratic term for Tjul was not selected by the stepwise regression (Table 17).
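The difference between a unimodal, symmetric response and a monotone one comes down to whether a quadratic term is present on the logit scale. The sketch below illustrates this for a single covariate, assuming a logistic model of the form logit(p) = b0 + b1·T + b2·T²; the coefficient values are hypothetical and chosen only for illustration.

```python
import math


def presence_probability(t, b0, b1, b2=0.0):
    """Logistic response with an optional quadratic term on temperature t."""
    eta = b0 + b1 * t + b2 * t * t
    return 1.0 / (1.0 + math.exp(-eta))


def thermal_optimum(b1, b2):
    """Temperature maximising the probability; defined only for a bell-shaped curve (b2 < 0)."""
    if b2 >= 0:
        return None  # monotone (b2 == 0) or U-shaped response: no interior optimum
    return -b1 / (2.0 * b2)


# Hypothetical coefficients (not estimates from the thesis).
b0, b1, b2 = -26.0, 3.2, -0.1
optimum = thermal_optimum(b1, b2)

# The response is symmetric around the optimum on the probability scale.
below = presence_probability(optimum - 2.0, b0, b1, b2)
above = presence_probability(optimum + 2.0, b0, b1, b2)
```

When the quadratic term is dropped, as for salmon, burbot and bleak, `thermal_optimum` returns `None` and the curve simply rises or falls along the gradient according to the sign of `b1`.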

Compared with temperature, the patterns of species occurrence along the slope gradient are much more varied. Brown trout shows the most distinctive response: it can occur in physically constraining habitats (steep slopes) and across a wide range of slopes (Figure 27). Apart from trout, several groups of species show relatively similar responses to the slope gradient. Minnow, stone loach and grayling have unimodal, symmetric (parabolic) response curves (Figure 27). Other species, such as chub, bleak and three-spined stickleback, show similar responses that are truncated at low slopes. Conversely, many other species respond monotonically to the slope gradient: the probabilities of occurrence of gudgeon, roach, perch, pike, bleak and nine-spined stickleback increase as slope decreases (Figure 27). Species optima are distributed along the slope gradient (Huet 1954).


[Figure 26: one panel per species, in the order of Table 17 (the souffia panel is titled Telestes souffia); x-axis: mean July air temperature (12 to 24 °C); y-axis: probability of presence (0 to 1).]

Figure 26: Effect of mean July air temperature on the probability of presence of the 24 species, with the other environmental variables held fixed (Fox 1987, 2003) at the median of the sites where the species is present. The red curve is the estimated probability and the grey polygon the confidence interval estimated by the Wald approach (Hosmer & Lemeshow 2000, Agresti 2002).


[Figure 27: one panel per species, in the order of Table 17 (the souffia panel is titled Telestes souffia); x-axis: slope (tick labels from -2 to 6); y-axis: probability of presence (0 to 1).]

Figure 27: Effect of slope on the probability of presence of the species, with the other environmental variables held fixed (Fox 1987, 2003) at the median of the sites where the species is present. The red curve is the estimated probability and the grey polygon the confidence interval estimated by the Wald approach (Hosmer & Lemeshow 2000, Agresti 2002).


As with temperature and slope, responses to run-off (lPA) vary among species. Two main response patterns were observed. Brown trout, minnow, stone loach, gudgeon, spirlin and grayling show unimodal, symmetric responses along the run-off gradient. Three-spined stickleback, nine-spined stickleback and souffia show similar response curves that are truncated at low or high run-off (Figure 28). With the exception of the Duero nase and the brook lamprey, the probabilities of occurrence of the other species increase along the run-off gradient. Species such as chub, barbel, dace and Atlantic salmon appear to occur mostly at high lPA (Figure 28). The brook lamprey is the only species with the opposite response: its probability of occurrence decreases as run-off increases. The occurrence of eel appears to be little influenced by run-off (Table 17 and Figure 28). Bullhead is the only species for which run-off was not retained in the final model. Three-spined and nine-spined sticklebacks appear to prefer low run-off. In contrast, grayling, souffia, dace, chub and barbel appear to prefer high lPA. Brown trout, stone loach and minnow have very similar optima, as do gudgeon, spirlin, grayling and souffia. The brook lamprey appears to prefer low lPA, whereas bleak, burbot, nase, perch, pike and roach appear to prefer high run-off (Figure 28).

For eel, thermal amplitude is the variable with the highest independent contribution (85.3 %, Table 17); the probability of occurrence of this species decreases as thermal amplitude increases. The probabilities of presence of spirlin, bleak, bullhead, burbot, three-spined stickleback, grayling and stone loach increase with thermal amplitude. Conversely, those of barbel, brown trout, dace, nase, roach, souffia and Atlantic salmon decrease as thermal amplitude increases.


[Figure 28: one panel per species, in the order of Table 17 (the souffia panel is titled Telestes souffia); x-axis: pseudo run-off, lPA (tick labels from 6 to 18); y-axis: probability of presence (0 to 1).]

Figure 28: Effect of pseudo run-off (lPA) on the probability of presence of the species, with the other environmental variables held fixed (Fox 1987, 2003) at the median of the sites where the species is present. The red curve is the estimated probability and the grey polygon the confidence interval estimated by the Wald approach (Hosmer & Lemeshow 2000, Agresti 2002).


[Figure 29: one panel per species, in the order of Table 17 (the souffia panel is titled Telestes souffia); x-axis: thermal amplitude between January and July (tick labels from 10 to 25); y-axis: probability of presence (0 to 1).]

Figure 29: Effect of the thermal amplitude between January and July on the probability of presence of the species, with the other environmental variables held fixed (Fox 1987, 2003) at the median of the sites where the species is present. The red curve is the estimated probability and the grey polygon the confidence interval estimated by the Wald approach (Hosmer & Lemeshow 2000, Agresti 2002).


3.3.2.3. Confidence interval

The precision with which the relationships between environmental variables and probability of presence are estimated varies considerably with the species and the environmental factor considered. The relationships between mean July air temperature and species presence-absence appear to be estimated fairly precisely, as shown by the narrow confidence intervals (Figure 26). For some species, however, such as barbel, burbot, grayling and spirlin, the confidence bands around the model predictions are relatively wide at the extremes of the thermal gradient. The lack of data (presences or absences) for these species in those thermal ranges within the calibration dataset explains the high uncertainty of the predictions there. The patterns observed for slope and run-off are broadly similar to those observed for temperature. Confidence intervals are mostly narrow, indicating precise estimates. As with temperature, the intervals for some species widen at the extremes of the slope or lPA gradients. Unlike temperature, however, some species have relatively wide confidence intervals along the whole of these two gradients (e.g. souffia).
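The widening of the confidence bands at the ends of a gradient follows directly from how a Wald interval is built. The sketch below is a minimal illustration, assuming a fitted logistic regression with coefficient vector `beta` and covariance matrix `cov`; these names and all numerical values are hypothetical. The interval is computed on the logit scale and then back-transformed to a probability.

```python
import math


def expit(value):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-value))


def wald_interval(x, beta, cov, z=1.96):
    """Wald confidence interval for a logistic-regression prediction.

    x    : covariate vector, including the leading 1 for the intercept
    beta : estimated coefficients
    cov  : estimated covariance matrix of beta, as a list of rows
    Returns (lower, estimate, upper) on the probability scale.
    """
    eta = sum(xi * bi for xi, bi in zip(x, beta))
    # Variance of the linear predictor: x' * cov * x
    n = len(x)
    var_eta = sum(x[i] * cov[i][j] * x[j] for i in range(n) for j in range(n))
    half_width = z * math.sqrt(var_eta)
    return expit(eta - half_width), expit(eta), expit(eta + half_width)


# Hypothetical fit with an intercept and one centred covariate.
beta = [0.0, 0.1]
cov = [[0.04, 0.0], [0.0, 0.01]]
centre = wald_interval([1.0, 0.0], beta, cov)  # middle of the gradient
edge = wald_interval([1.0, 6.0], beta, cov)    # far end of the gradient
```

Because the slope variance is multiplied by the squared distance from the centre of the data, the interval at `edge` is wider than at `centre`, which mirrors the pattern described for species with few observations at the extremes.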

3.3.2.4. Perspectives

These results clearly show that climate change, as predicted by the various IPCC scenarios (IPCC 2007), will affect the future ranges of species (Buisson et al. 2008b, Lassale et al. 2008). However, not all species should be equally sensitive to these changes (Hering et al. 2009). Species that appear stenothermal, such as three-spined and nine-spined sticklebacks, should in principle be more affected by climatic change than species with a broader climatic envelope, such as brown trout. Whether the ranges of these species expand or contract will depend on the magnitude of the thermal changes and on the ability of species to reach thermally more favourable areas. Species with the greatest dispersal capacities should be best able to cope with the changes (Reynolds et al. 2005, Griffiths 2006, Hering et al. 2009). Conversely, less mobile species will be less able to move through the river network and settle in more favourable habitats.


Compared with earlier results (Pont et al. 2005, Buisson et al. 2008a), these results show that the whole range of a species must be considered in order to estimate its niche and its future distribution.

These results also show that temperature is not the only climatic factor affected by global change (Barnett et al. 2005, Milly et al. 2005) that influences species occurrence. Changes in rainfall will also affect species occurrence, as shown by the importance of pseudo run-off (annual precipitation multiplied by catchment area). Despite the importance of this factor, most scenarios of species range shifts consider only temperature-related changes. The omission of precipitation from future projections probably reflects the difficulty of handling this variable.
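As a small worked example, pseudo run-off can be computed from the two quantities named above. The units (millimetres and square kilometres) and the use of a natural logarithm for lPA are assumptions made here for illustration; the text only states that pseudo run-off is annual precipitation multiplied by catchment area.

```python
import math


def pseudo_runoff(annual_precip_mm, catchment_area_km2):
    """Annual precipitation multiplied by catchment area (units assumed)."""
    return annual_precip_mm * catchment_area_km2


def log_pseudo_runoff(annual_precip_mm, catchment_area_km2):
    """Log-transformed pseudo run-off; the natural log is an assumption."""
    return math.log(pseudo_runoff(annual_precip_mm, catchment_area_km2))


# Hypothetical site: 1000 mm of rain per year over a 100 km² catchment.
current = log_pseudo_runoff(1000.0, 100.0)
# Same site under a hypothetical 10 % decrease in precipitation.
drier = log_pseudo_runoff(900.0, 100.0)
```

A scenario that changes precipitation shifts a site along this gradient independently of any change in temperature, which is why projections based on temperature alone may miss part of the expected response.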

The uncertainty in the estimated probability of presence of some species in some environments limits the projection of their future ranges. Occurrence estimates for some species at the extremes of the environmental gradients appear relatively imprecise, as shown by the width of the confidence intervals. Parameter estimation is therefore a non-negligible source of error when assessing which habitats will be suitable for these species in the future (Elith & Graham 2009, Sinclair et al. 2010). Among the potential sources of bias (Wilby et al. 2006, Buisson et al. 2010), the uncertainty of the model itself is rarely mentioned in the literature. Moreover, global change is likely to create novel environmental conditions that are not observed today and were therefore not included when the model parameters were estimated (Sinclair et al. 2010). Applying the models to such conditions means extrapolating species presence or absence without knowing whether the models can faithfully reflect the influence of these new environments on the fauna.

3.3.2.5. Limitation: use of air temperature rather than water temperature

Water temperature is rarely monitored at national scale, and even less so at international scale. Water temperature is influenced by air temperature (Petts & Amoros 1996), which is far more readily available. Air temperature is therefore often used as a surrogate for water temperature (Allan & Castillo 2007). However, the link between the two is not direct and depends strongly on hydrology. Albek and Albek (2009) observed that, over the long term, water temperature does not necessarily follow the trend in air temperature. Hydrological changes occurring alongside climate change explain this lack of correspondence: once the effect of hydrology is controlled for, the trend in water temperature follows that of air temperature (Albek & Albek 2009). Other authors have shown that, within a single basin, stream temperatures did not systematically follow warming trends (Rao 1993). The interaction between hydrology and water temperature is a potential source of error for projections based on air temperature.


4. Discussion and perspectives

This work had three objectives:
- To test several ecological hypotheses underlying the development of a large-scale multimetric index, and to use this new knowledge in developing the new European fish index, EFI+.
- To go beyond index development by proposing ways of estimating the uncertainty around metric and index scores.
- To estimate the consequences of global change, and of climate warming in particular, for fish assemblages, so that the potential effects of this change can be anticipated in the assessment of the ecological status of rivers.

The various tests highlighted:
1) the close link between certain trait categories within assemblages, notably a tolerance gradient and a reproduction gradient;
2) variation in community structure among ecoregions, some having assemblages dominated by tolerant, eurythermal species (e.g. the Mediterranean region) and others having assemblages dominated by intolerant, stenothermal species (e.g. the Alps);
3) changes in community structure along a thermal gradient and a physical gradient;
4) the relative convergence, in response to environmental gradients, of the structure of assemblages drawn from different regional species pools;
5) the influence of temperature and physical factors on species distributions and life-history traits.

The results of the RDA and of the hierarchical partitioning analyses showed that climatic conditions and the physical structure of the habitat are key environmental factors for species life-history traits, species distributions and the functional structure of assemblages. These analyses also revealed that, at large scale, the respective influences of these two factors (climate and physical structure) on the different levels of community organisation appear relatively independent, as shown by the near-orthogonality of these gradients on the RDA correlation circle and by the small joint contributions of the two factors relative to their independent contributions. The analyses also made it possible to define two main assemblage types on the basis of their functional traits. A major consequence of these results is the use of these two assemblage types in building the index. The distinction between tolerant and intolerant assemblages was introduced to account for their specific features. These assemblages have rather different characteristics, linked to the environmental conditions in which they occur. Intolerant assemblages, mainly associated with small streams and cold streams, are dominated in Europe by salmonids (Huet et al. 1954, Ferreira et al. 2007a, Melcher et al. 2007). Conversely, tolerant assemblages are associated with larger and/or warmer rivers and are dominated by cyprinids. Using this classification allowed metrics specific to each assemblage type to be selected.
The main advantage is that the observed assemblage structure is compared with metrics whose expected values are non-zero and always positive, since the metrics were selected to be representative of these assemblages. When a functional trait is naturally absent from a community in a given environment, it is difficult to know what the deviation between observed and expected values actually quantifies. This is all the more true if pressures disadvantage the species carrying that trait. If the observed and expected values of the metric are both zero, it becomes impossible to tell how far the observed value is due to the environment or to pressures (Harris & Silveira 1999). In the end, two multimetric indices based on assemblage characteristics were developed during the European EFI+ project. Both include a small number of metrics compared with most multimetric indices, even those developed for relatively species-poor faunas (Lyons et al. 1996, Langdon 2001, Southerland et al. 2007). By comparison, the previous European fish index, EFI, included 10 metrics (Pont et al. 2006, 2007). The small number of metrics is both a deliberate choice and the consequence of a practical constraint. Since the first IBI (Karr 1981) and its many developments (Fausch et al. 1984, Angermeier & Karr 1986, Karr 1991, Karr & Chu 1999), the use of many metrics has been widely encouraged, the idea being that each metric provides different information on assemblages: species richness, abundance of individuals, number of particular taxa or species groups (guilds), number of intolerant taxa, relative abundance of intolerant individuals, and fish health (percentage of individuals with lesions) (Karr & Chu 1999). Each metric, or metric type, is supposed to respond to different pressures or to be sensitive to different degrees of alteration (Angermeier & Karr 1986, Karr & Chu 1999). In practice, metrics are never completely independent. First, whatever the units in which they are expressed (number of species, number of individuals, biomass, etc.), they are all calculated from the same individuals caught at a site. Second, the analysis of trait associations within European fish assemblages (P1) clearly showed the strong interdependence of some trait categories, some of which can even be considered redundant. The four metrics based on size classes and species traits are also strongly correlated (P3). Developers of multimetric indices typically set a maximum tolerable correlation when selecting the metrics to include in the final score (Hughes et al. 1998, Hering et al. 2006, Pont et al. 2006, 2007, Roset et al. 2007, Stoddard et al. 2008). In addition, the effect of correlation between metrics on the uncertainty of the final index score was clearly demonstrated (part P5). The more metrics an index includes, the stronger the correlations between them and the greater the uncertainty around the index score.
The small number of metrics in the new European fish index EFI+ therefore results both from the strong redundancy between trait categories within assemblages, which limits the number of candidate metrics, and from the aim of limiting the uncertainty of the index score. Nevertheless, using only two metrics limits the sensitivity of the index. EFI+ will potentially be more sensitive to certain types of pressure, to severe habitat alteration, or to combinations of pressures. A balance probably has to be struck between selecting enough metrics to detect the impact of as many human pressures as possible and limiting the uncertainty of the index score.
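The role of correlation in the uncertainty of a composite score can be illustrated with a simplified calculation. The sketch assumes that the index is the plain mean of its metrics and that all metrics have the same variance and the same pairwise correlation; these are illustrative assumptions, not the procedure used in part P5.

```python
def variance_of_mean_score(n_metrics, metric_variance, correlation):
    """Variance of the mean of n equally variable metrics sharing one pairwise correlation."""
    if n_metrics < 1:
        raise ValueError("n_metrics must be at least 1")
    return metric_variance / n_metrics * (1.0 + (n_metrics - 1) * correlation)


# Two metrics with unit variance: independent versus strongly correlated.
independent = variance_of_mean_score(2, 1.0, 0.0)
correlated = variance_of_mean_score(2, 1.0, 0.8)
```

Under these assumptions, averaging independent metrics reduces the variance of the score, whereas averaging strongly correlated metrics brings almost no reduction, so redundant metrics add little information while still carrying their own sampling error.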

The results of parts P1, P2 and P3 confirm the need to account for the environmental variability of trait categories so that river status assessments are comparable and independent of the environment. Overall, the faunas of the Iberian Peninsula and of western Europe respond convergently to environmental variation. Most metrics show either parallel responses to the environment or similar responses with different amplitudes. The comparisons of the three nested models, combined with the multiple testing procedure, mainly selected models 2 and 3. However, the statistical methodology used to test community convergence does not pursue the same objectives as index development. In bioindication, statistical models are used to predict the expected values of metrics in a given environment in the absence of pressures. Estimating the deviation between expected and observed metric values (model residuals) is central to building an index under the reference condition approach (Pont et al. 2006, 2007, 2009). The question is then whether the same models can be used throughout Europe to predict community structure, or whether they must be adapted by region. Whether or not region is included is decided for each metric individually. In contrast, the multiple testing procedure used to assess the convergence of Iberian and western European communities (P3) serves a multiple-inference objective: controlling a type I error rate (FWER, FDR, etc.) associated with testing a set of null hypotheses simultaneously. The reasoning therefore no longer concerns a single metric but the whole set of metrics tested (more precisely, the whole set of tests performed) (Lamouroux et al. 2002). It is therefore possible that the use of the Benjamini and Yekutieli (2001) method, which controls the false discovery rate (FDR), together with adjusted p-values (Dudoit & van der Laan 2006), masked some divergences between the two regions (P3).
The purpose of such a procedure is to adjust the threshold beyond which the differences between models are considered large enough to reject the null hypotheses (hypotheses of equality). Without these corrections, the regional divergences between the Iberian Peninsula and France and Belgium would probably have appeared more pronounced, as would the need to include region in the models (Oberdorff et al. 2002, Pont et al. 2006, 2007).
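To make the effect of the correction concrete, the following sketch computes Benjamini-Yekutieli adjusted p-values in plain Python. It is a generic illustration of the step-up adjustment, not the code used in P3, and the p-values in the example are hypothetical.

```python
def benjamini_yekutieli(p_values):
    """Benjamini-Yekutieli adjusted p-values (FDR control under arbitrary dependence)."""
    m = len(p_values)
    if m == 0:
        return []
    # Harmonic-sum penalty that distinguishes this method from Benjamini-Hochberg.
    c_m = sum(1.0 / k for k in range(1, m + 1))
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Step-up pass from the largest p-value down, enforcing monotonicity
    # and capping adjusted values at 1.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m * c_m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values from three model comparisons.
raw = [0.01, 0.02, 0.03]
adjusted = benjamini_yekutieli(raw)
```

Each adjusted value is at least as large as the raw one, so a comparison that looks significant at the 5 % level before correction may no longer be rejected afterwards, which is the masking effect discussed above.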

Alongside functional convergence, it would be interesting to study the convergence of assemblage responses to human pressures. The sensitivity of metrics to pressures is generally assessed against a global pressure gradient (P3, P4) (Karr & Chu 1999, Pont et al. 2006, 2007, Quataert et al. 2007). In practice, only metrics whose scores overlap little between reference and impacted sites are used (Stoddard et al. 2008, Pont et al. 2009). The sensitivity of metrics to pressures may differ depending on the regional species pools from which assemblages are drawn. Some regions, such as the Iberian Peninsula, have fish faunas adapted to specific environmental conditions. Species living in river reaches exposed to the vagaries and harshness of the Mediterranean climate are probably more tolerant than species living in temperate areas. Mediterranean assemblages may therefore be less sensitive to some human pressures than temperate assemblages. The sensitivity of an index designed for large-scale use may differ with the region in which it is applied (Quataert et al. 2007). The sensitivity of metrics may also depend on the environmental conditions in which the pressures act. A reversal in the direction of a metric's response to the same pressure is the extreme manifestation of the interaction between environment and pressure. Several authors (Lyons et al. 1996, Maret 1999) observed that habitat alteration reduces species richness in warm-water streams, whereas richness tends to increase with degradation in cold-water streams. In practice, such metrics are difficult to handle and rarely used: the distribution of residuals at impacted sites overlaps that of reference sites and is often wider. Only metrics with a single direction of change are easy to use (values decrease, or increase, with pressure). The environment-pressure interaction can also modulate metric sensitivity: the same pressure may have no impact on assemblages in one environment and impair their integrity under other conditions.
At present, studies of pressure-environment interactions are limited and remain largely descriptive. The multitude of pressures acting at the same place hampers the study of such interactions; sites subject to a single pressure or a single type of pressure are rare. Pressure-environment interactions may matter all the more in the context of global change. If the effect of a pressure depends on the temperature of the environment in which it acts, its impact on fish communities may change even if the pressure itself remains constant. Such changes could have important consequences for future river management and restoration. One option would be to identify, among the 9,948 sites of the EFI+ project, a set of reference sites and a set of sites affected by a single type of pressure (e.g. presence of an impoundment). Different components of the communities could then be modelled as a function of the environment and of the presence or absence of impoundments at the site. These models should include the interaction between environment and pressure. Once the model or models are fitted, the significance of the environment-pressure interaction(s) should be tested (P2). If at least one interaction is significant, the way it is expressed should be examined (Irz et al. 2007, Hugueny et al. 2010) (P2): modulation of the effect of the pressure by the environment, reversal of the effect of the pressure on the community characteristic depending on the environment, etc. A second option would be to focus on a region or catchment with a single major type of alteration (e.g. hydrological, water quality). This approach would avoid inter-regional variation in the relationships between assemblage structure and environment (P2). Among the many environmental factors able to modulate the impact of pressures on fish communities, climatic factors could play a major role in the future. The hierarchical partitioning of the presence-absence models for the 24 European fish species (P7) clearly highlighted the importance of climate for current species distributions (Pont et al. 2005, Buisson et al. 2008a). Of these models, four fit poorly and twenty show good to very good agreement between predictions and observations. This suggests that a consistent estimate is available of the niche of the most frequent species in small and medium-sized rivers of western Europe. The strength of these models is the geographical extent of the datasets used to calibrate them.
Working at the European scale should, a priori, give a more precise estimate of the realised niche (Hutchinson 1957) of these species than studies conducted at the national scale (Pont et al. 2005) or within a single catchment (Buisson et al. 2008a). The limited spatial extent of calibration data sets is one of the main criticisms (Sinclair et al. 2010) levelled at SDMs (species distribution models; Guisan & Zimmermann 2000, Austin 2007, Elith & Graham 2009, Elith & Leathwick 2009). The predictions of the logistic regressions are not absolute: they represent the expected occurrence of a species in a given environment. The confidence intervals estimated around the predicted values clearly show that these estimates are more or less reliable depending on the environmental ranges over which they are made (P7). Estimating the prediction intervals (Hahn & Meeker 1991) associated with the probability of presence of each species would make it possible to quantify the uncertainty of these projections. The complexity of estimating this uncertainty is one explanation for the absence of this information in studies of the consequences of global change.
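As a minimal sketch of how the reliability of a predicted occurrence probability can vary, the example below builds a Wald-type confidence interval on the logit scale and back-transforms it. The linear predictor `eta` and its standard error `se_eta` are hypothetical values, not estimates from the models of P7, and the function names are illustrative. This is an interval for the expected probability, not the prediction interval of Hahn & Meeker (1991).

```python
import math


def expit(x):
    """Inverse logit: map a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def wald_interval(eta, se_eta, z=1.96):
    """Approximate 95% confidence interval for the expected probability.

    The interval is computed on the logit scale and then back-transformed,
    so both bounds stay strictly between 0 and 1.
    Returns (lower bound, fitted probability, upper bound).
    """
    return expit(eta - z * se_eta), expit(eta), expit(eta + z * se_eta)


# Hypothetical case: same fitted value, but the standard error is larger
# in an environmental range that is poorly covered by the calibration data.
well_sampled = wald_interval(eta=0.5, se_eta=0.2)
poorly_sampled = wald_interval(eta=0.5, se_eta=1.0)

for label, (low, fit, high) in [("well sampled", well_sampled),
                                ("poorly sampled", poorly_sampled)]:
    print(f"{label}: p = {fit:.2f}, 95% CI = [{low:.2f}, {high:.2f}]")
```

The fitted probability is identical in both cases; only the width of the interval changes, which is the pattern described above for poorly covered environmental ranges.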

Most of the 24 species considered in part 7 are the dominant species of the small and medium-sized rivers of western Europe. These models should therefore make it possible to estimate the sensitivity of these assemblages to climate warming. This sensitivity could be assessed by predicting theoretical assemblages under different temperature scenarios, such as an increase of one or two degrees Celsius. Once the theoretical composition of the assemblages has been estimated, their functional structure could be derived (P1) and its sensitivity to global change evaluated. These analyses could also be repeated while varying precipitation. Run-off, roughly estimated as the product of catchment area and annual precipitation, plays an important role in the occurrence of 23 of the 24 species (P7).
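The scenario analysis outlined above could be sketched as follows. Everything specific here is an assumption made for illustration: the two species labels, the coefficients, the quadratic temperature response and the use of the logarithm of the run-off proxy are hypothetical and are not the fitted models of P7. Only the run-off proxy itself (catchment area times annual precipitation) follows the text.

```python
import math


def expit(x):
    """Inverse logit: map a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def runoff_proxy(catchment_area_km2, annual_precip_mm):
    """Coarse run-off proxy: catchment area times annual precipitation."""
    return catchment_area_km2 * annual_precip_mm


# Hypothetical coefficients: (intercept, temperature, temperature squared,
# log run-off proxy). Illustrative only, NOT estimates from the thesis.
MODELS = {
    "cold_water_species": (2.0, 0.10, -0.02, 0.10),
    "warm_water_species": (-12.0, 1.00, -0.025, 0.10),
}


def occurrence_probability(coefs, temperature_c, runoff):
    """Logistic model of occurrence for one species."""
    b0, b1, b2, b3 = coefs
    eta = b0 + b1 * temperature_c + b2 * temperature_c ** 2 + b3 * math.log(runoff)
    return expit(eta)


def predict_assemblage(temperature_c, runoff, warming_c=0.0):
    """Occurrence probability of every species under a warming scenario."""
    t = temperature_c + warming_c
    return {sp: occurrence_probability(c, t, runoff) for sp, c in MODELS.items()}


site_runoff = runoff_proxy(catchment_area_km2=150.0, annual_precip_mm=900.0)
for warming in (0.0, 1.0, 2.0):
    probs = predict_assemblage(temperature_c=12.0, runoff=site_runoff, warming_c=warming)
    print(f"+{warming:.0f} °C:", {sp: round(p, 2) for sp, p in probs.items()})
```

Varying `annual_precip_mm` instead of `warming_c` would give the corresponding precipitation scenarios.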

Developing a predictive multimetric index applicable across Europe is a challenge because of the environmental and faunal variability observed at large scales. Despite the diversity of the European fauna, European fish assemblages show some functional redundancy, particularly for traits related to species tolerance and reproduction. Thermal regime and the physical structure of rivers are the key environmental factors that determine species occurrence and the functional structure of assemblages. The effect of climate on the trait structure of communities could have important consequences for the assessment of the ecological status of rivers with the European fish index. This status is estimated by comparing the current values of the metrics with the values expected in the absence of pressure, as predicted by statistical models. For this approach to remain valid under climate change, the functional structure of communities must change predictably with climate. This hypothesis could be tested by comparing the metric values predicted by the statistical models with the metric values calculated from theoretical assemblages predicted by the species occurrence models.
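A minimal sketch of the comparison proposed above is given below, assuming the metric is the number of species carrying a given trait. The species labels, probabilities, trait table and the directly predicted metric value are all hypothetical. The species-based value is the expected number of trait-bearing species, which by linearity of expectation is the sum of their occurrence probabilities.

```python
def expected_trait_richness(occurrence_probs, has_trait):
    """Expected number of species carrying a trait in the theoretical assemblage.

    By linearity of expectation, this is the sum of the occurrence
    probabilities of the species that carry the trait.
    """
    return sum(p for sp, p in occurrence_probs.items() if has_trait[sp])


def relative_difference(metric_from_model, metric_from_species):
    """Relative gap between the two ways of predicting the same metric."""
    return (metric_from_species - metric_from_model) / metric_from_model


# Hypothetical occurrence probabilities from species models under one scenario
occurrence_probs = {"species_a": 0.8, "species_b": 0.5, "species_c": 0.2}
# Hypothetical trait table: True if the species is classed as intolerant
is_intolerant = {"species_a": True, "species_b": True, "species_c": False}

metric_from_species = expected_trait_richness(occurrence_probs, is_intolerant)
metric_from_model = 1.25  # hypothetical value predicted directly by a metric model

gap = relative_difference(metric_from_model, metric_from_species)
print(f"species-based: {metric_from_species:.2f}, "
      f"metric model: {metric_from_model:.2f}, relative gap: {gap:+.1%}")
```

A small and unbiased gap across sites and scenarios would support the hypothesis that the metrics change predictably with climate; a systematic gap would not.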

5. Bibliography

Abdoli, A., D. Pont, and P. Sagnes. 2007. Intrabasin variations in age and growth of bullhead: the effects of temperature. Journal of Fish Biology 70:1224-1238. Agresti, A. 2002. Categorical Data Analysis. 2nd edition. John Wiley & Sons, Inc., Hoboken, New Jersey. Akaike, H. 1974. A new look at the statistical model identification. IEEE Transactions on Automatic Control 19:716-723. Albek, M., and E. Albek. 2009. Stream temperature trends in Turkey. CLEAN – Soil, Air, Water 37:142-149. Alcorlo, P., M. Otero, M. Crehuet, A. Baltanas, and C. Montes. 2006. The use of the red swamp crayfish (Procambarus clarkii, Girard) as indicator of the bioavailability of heavy metals in environmental monitoring in the River Guadiamar (SW, Spain). Science of The Total Environment 366:380-390. Allan, J. D., and M. M. Castillo. 2007. Stream Ecology: Structure and Function of Running Waters. 2nd edition. Kluwer Academic Publishers, Boston. Almodovar, A., G. G. Nicola, and B. Elvira. 2006. Spatial variation in brown trout production: the role of environmental factors. Transactions of the American Fisheries Society 135:1348-1360. An, K. G., S. S. Park, and J. Y. Shin. 2002. An evaluation of a river health using the index of biological integrity along with relations to chemical and habitat conditions. Environment International 28:411-420. Angermeier, P. L., and J. R. Karr. 1986. Applying an index of biotic integrity based on stream-fish communities: considerations in sampling and interpretation. North American Journal of Fisheries Management 6:418-429. Angermeier, P. L., and I. J. Schlosser. 1987. Assessing biotic integrity of the fish community in a small Illinois stream. North American Journal of Fisheries Management 7:331-338. Angermeier, P. L., R. A. Smogor, and J. R. Stauffer. 2000. Regional frameworks and candidate metrics for assessing biotic integrity in Mid-Atlantic highland streams. Transactions of the American Fisheries Society 129:962-981. Arnekleiv, J. V., A. G.
Finstad, and L. Ronning. 2006. Temporal and spatial variation in growth of juvenile Atlantic salmon. Journal of Fish Biology 68:1062-1076. Austin, M. P. 2007. Species distribution models and ecological theory: a critical assessment and some possible new approaches. Ecological Modelling 200:1-19. Bady, P., D. Pont, M. Logez, and J. Veslot. 2009a. Deliverable 4.1: report on the modelling of reference conditions and on the sensitivity of candidate metrics to anthropogenic pressures. Cemagref, pp. 41-43. Bady, P., D. Pont, M. Logez, and J. Veslot. 2009b. Deliverable 4.2: report on the final development and validation of the new European Fish Index and method, including a complete technical description of the new method. Cemagref, pp. 1-180. Bailey, R. C., M. G. Kennedy, M. Z. Dervish, and R. M. Taylor. 1998. Biological assessment of freshwater ecosystems using a reference condition approach: comparing predicted and actual benthic invertebrate communities in Yukon streams. Freshwater Biology 39:765-774.
Baker, E. A., K. E. Wehrly, P. W. Seelbach, L. Wang, M. J. Wiley, and T. Simon. 2005. A multimetric assessment of stream condition in the northern lakes and forests ecoregion using spatially explicit statistical modeling and regional normalization. Transactions of the American Fisheries Society 134:697-710. Banarescu, P. 1989. Zoogeography and history of the freshwater fish fauna of Europe. Pages 88-107 In J. Holcik (editor). The Freshwater Fishes of Europe. Aula-Verlag, Wiesbaden. Banarescu, P. 1992. Zoogeography of Fresh Waters. Volume 2: Distribution and Dispersal of Freshwater Animals in North America and Eurasia. AULA-Verlag, Wiesbaden. Barnett, T. P., J. C. Adam, and D. P. Lettenmaier. 2005. Potential impacts of a warming climate on water availability in snow-dominated regions. Nature 438:303-309. Bates Prins, S. C., and E. P. Smith. 2007. Using biological metrics to score and evaluate sites: a nearest-neighbour reference condition approach. Freshwater Biology 52:98-111. Baty, F., M. Facompré, J. Wiegand, J. Schwager, and M. H. Brutsche. 2006. Analysis with respect to instrumental variables for the exploration of microarray data structures. BMC Bioinformatics 7. Bêche, L. A., E. P. McElravy, and V. H. Resh. 2006. Long-term seasonal variation in the biological traits of benthic-macroinvertebrates in two Mediterranean-climate streams in California, USA. Freshwater Biology 51:56-75. Begon, M., C. R. Townsend, and J. L. Harper. 2006. Ecology: From Individuals to Ecosystems. 4th edition. Blackwell Publishing, Oxford, United Kingdom. Bellwood, D. R., P. C. Wainwright, C. J. Fulton, and A. Hoey. 2002. Assembly rules and functional groups at global biogeographical scales. Functional Ecology 16:557-562. Benaglia, T., D. Chauveau, D. R. Hunter, and D. Young. 2009. mixtools: an R package for analyzing finite mixture models. Journal of Statistical Software 32:1-29. Benjamini, Y., and D.
Yekutieli. 2001. The control of the false discovery rate in multiple testing under dependency. The Annals of Statistics 29:1165–1188. Bevington, P. R., and D. K. Robinson. 2003. Data Reduction and Error Analysis for the Physical Sciences. 3rd edition. McGraw-Hill, New York. Bianco, P. G. 1995. Mediterranean endemic freshwater fishes of Italy. Biological Conservation 72:159-170. Blanck, A., and N. Lamouroux. 2007. Large-scale intraspecific variation in life-history traits of European freshwater fish. Journal of Biogeography 34:862-875. Bonada, N., S. Doledec, and B. Statzner. 2007. Taxonomic and biological trait differences of stream macroinvertebrate communities between mediterranean and temperate regions: implications for future climatic scenarios. Global Change Biology 13:1658-1671. Bonada, N., N. Prat, V. H. Resh, and B. Statzner. 2006. Developments in aquatic insect biomonitoring: a comparative analysis of recent approaches. Annual Review of Entomology 51:495-523. Boughton, D. A., M. Gibson, R. Yedor, and E. Kelley. 2007. Stream temperature and the potential growth and survival of juvenile Oncorhynchus mykiss in a southern California creek. Freshwater Biology 52:1353-1364.

Bowman, M. F., and K. M. Somers. 2005. Considerations when using the reference condition approach for bioassessment of freshwater ecosystems. Water Quality Research Journal of Canada 40:347-360. Breine, J., I. Simoens, P. Goethals, P. Quataert, D. Ercken, C. Van Liefferinghe, and C. Belpaire. 2004. A fish-based index of biotic integrity for upstream brooks in Flanders (Belgium). Hydrobiologia 522:133-148. Bremner, J., S. I. Rogers, and C. L. J. Frid. 2006. Matching biological traits to environmental conditions in marine benthic ecosystems. Journal of Marine Systems 60:302-316. Bremner, J., S. I. Rogers, and C. L. J. Frid. 2006. Methods for describing ecological functioning of marine benthic assemblages using biological traits analysis (BTA). Ecological Indicators 6:609-622. Brown, M. E. 1951. The growth of brown trout (Salmo trutta Linn.): IV. The effect of food and temperature on the survival and growth of fry. Journal of Experimental Biology 28:473-491. Buisson, L., L. Blanc, and G. Grenouillet. 2008a. Modelling stream fish species distribution in a river network: the relative effects of temperature versus physical factors. Ecology of Freshwater Fish 17:244-257. Buisson, L., W. Thuiller, N. Casajus, S. Lek, and G. Grenouillet. 2010. Uncertainty in ensemble forecasting of species distribution. Global Change Biology 16:1145-1157. Buisson, L., W. Thuiller, S. Lek, P. Lim, and G. Grenouillet. 2008b. Climate change hastens the turnover of stream fish assemblages. Global Change Biology 14:2232-2248. Bystrom, P., and E. Garcia-Berthou. 1999. Density dependent growth and size specific competitive interactions in young fish. Oikos 86:217-232. Cadle, J. E., and H. W. Greene. 1993. Phylogenetic patterns, biogeography, and the ecological structure of neotropical snake assemblages. Pages 350-363 In R. E. Ricklefs and D. Schluter (editors). Species Diversity in Ecological Communities. University of Chicago Press, Chicago. Cameron, A. C., and P. K. Trivedi. 1998.
Regression Analysis of Count Data. Cambridge University Press, Cambridge. Carmona, J. A., I. Doadrio, A. L. Marquez, R. Real, B. Hugueny, and J. M. Vargas. 1999. Distribution patterns of indigenous freshwater fishes in the Tagus River basin, Spain. Environmental Biology of Fishes 54:371-387. Caudron, A., A. Champigneulle, C. R. Largiader, S. Launey, and R. Guyomard. 2009. Stocking of native Mediterranean brown trout (Salmo trutta) into French tributaries of Lake Geneva does not contribute to lake-migratory spawners. Ecology of Freshwater Fish 18:585-593. Chambers, J. M., and T. J. Hastie. 1993. Statistical Models in S. Chapman & Hall, Boca Raton, Florida. Charvet, S., B. Statzner, P. Usseglio-Polatera, and B. Dumont. 2000. Traits of benthic macroinvertebrates in semi-natural French streams: an initial application to biomonitoring in Europe. Freshwater Biology 43:277-296. Chevan, A., and M. Sutherland. 1991. Hierarchical partitioning. The American Statistician 45:90-96.

Clark, P. U., A. S. Dyke, J. D. Shakun, A. E. Carlson, J. Clark, B. Wohlfarth, J. X. Mitrovica, S. W. Hostetler, and A. M. McCabe. 2009. The last glacial maximum. Science 325:710-714. Clarke, K. D., and D. A. Scruton. 1999. Brook trout production dynamics in the streams of a low fertility Newfoundland watershed. Transactions of the American Fisheries Society 128:1222-1229. Clarke, R. 2000. Uncertainty in estimates of biological quality based on RIVPACS. Pages 39-54 In J. F. Wright, D. W. Sutcliffe and M. T. Furse (editors). Assessing the Biological Quality of Freshwaters. RIVPACS and Other Techniques. Freshwater Biological Association, Ambleside, United Kingdom. Clavero, M., F. Blanco-Garrido, and J. Prenda. 2004. Fish fauna in Iberian Mediterranean river basins: biodiversity, introduced species and damming impacts. Aquatic Conservation: Marine and Freshwater Ecosystems 14:575-585. Clavero, M., and E. Garcia-Berthou. 2006. Homogenization dynamics and introduction routes of invasive freshwater fish in the Iberian Peninsula. Ecological Applications 16:2313-2324. Collett, D. 2002. Modelling Binary Data. Chapman & Hall/CRC, Boca Raton, Florida. Cornwell, W. K., and D. D. Ackerly. 2009. Community assembly and shifts in plant trait distributions across an environmental gradient in coastal California. Ecological Monographs 79:109-126. Cornwell, W. K., D. W. Schwilk, and D. D. Ackerly. 2006. A trait-based test for habitat filtering: convex hull volume. Ecology 87:1465-1471. Cowx, I. 1994. Stocking strategies. Fisheries Management and Ecology 1:15-30. Cowx, I., and D. Gerdeaux. 2004. The effects of fisheries management practises on freshwater ecosystems. Fisheries Management and Ecology 11:145-151. Daufresne, M., K. Lengfellner, and U. Sommer. 2009. Global warming benefits the small in aquatic ecosystems. Proceedings of the National Academy of Sciences of the United States of America 106:12788-12793. Descroix, A., C. Desvilettes, A. Bec, P. Martin, and G. Bourdier. 2010.
Impact of macroinvertebrate diet on growth and fatty acid profiles of restocked 0+ Atlantic salmon (Salmo salar) parr from large European river (the Allier). Canadian Journal of Fisheries & Aquatic Sciences 67:659-672. Diaz, S., M. Cabido, and F. Casanoves. 1998. Plant functional traits and environmental filters at a regional scale. Journal of Vegetation Science 9:113-122. Diaz, S., M. Cabido, and F. Casanoves. 1999. Functional implications of trait–environment linkages in plant communities. Pages 338-362 In P. Weiher and P. Keddy (editors). Ecological Assembly Rules: Perspectives, Advances, and Retreats. Cambridge University Press, Cambridge, UK. Dudoit, S., and M. J. van der Laan. 2008. Multiple Testing Procedures with Applications to Genomics. Springer Series in Statistics, Springer, New York. Elith, J., and C. H. Graham. 2009. Do they? How do they? Why do they? On finding reasons for differing performances of species distribution models. Ecography 32:66-77.

Elith, J., and J. R. Leathwick. 2009. Species distribution models: ecological explanation and prediction across space and time. Annual Review of Ecology, Evolution, and Systematics 40:677-697. Elith, J., J. R. Leathwick, and T. Hastie. 2008. Working guide to boosted regression trees. Journal of Animal Ecology 77:802-813. Elliott, J. M. 1994. Quantitative Ecology and the Brown Trout. Oxford University Press, Oxford, New York and Tokyo. Elliott, J. M., and M. A. Hurley. 1997. A functional model for maximum growth of Atlantic Salmon parr, Salmo salar, from two populations in northwest England. Functional Ecology 11:592-603. Elliott, J. M., M. A. Hurley, and R. J. Fryer. 1995. A new, improved growth model for brown trout, Salmo trutta. Functional Ecology 9:290-298. Estrella, A. 1998. A new measure of fit for equations with dichotomous dependent variables. Journal of Business and Economic Statistics 16:198-205. European Inland Fisheries Advisory Commission. 1982. Report of the Symposium on stock enhancement in the management of freshwater fisheries. Held in Budapest, Hungary, 31 May-2 June 1982 in conjunction with the Twelfth session of EIFAC. EIFAC Technical Paper 42:43 p. European Union (EC). 2000. Directive 2000/60/EC of the European Parliament and of the council establishing a framework for the community action in the field of water policy. Official Journal of the European Communities L327:1-72. Faraway, J. J. 2006. Extending the Linear Model with R. Generalized Linear, Mixed Effects and Nonparametric Regression Models. Chapman & Hall/CRC, Boca Raton, Florida. Fausch, K. D., J. R. Karr, and P. R. Yant. 1984. Regional application of an index of biotic integrity based on stream fish communities. Transactions of the American Fisheries Society 113:39-55. Ferreira, T., J. Oliveira, N. Caiola, A. De Sostoa, F. Casals, R. Cortes, A. Economou, S. Zogaris, D. Garcia-Jalon, M. Ilheu, F. Martinez-Capel, D. Pont, C. Rogers, and J. Prenda. 2007a.
Ecological traits of fish assemblages from Mediterranean Europe and their responses to human disturbance. Fisheries Management and Ecology 14:473-481. Ferreira, T., L. Sousa, J. M. Santos, L. Reino, J. Oliveira, P. R. Almeida, and R. V. Cortes. 2007b. Regional and local environmental correlates of native Iberian fish fauna. Ecology of Freshwater Fish 16:504-514. Filipe, A. F., M. B. Araujo, I. Doadrio, P. L. Angermeier, and M. J. Collares-Pereira. 2009. Biogeography of Iberian freshwater fishes revisited: the roles of historical versus contemporary constraints. Journal of Biogeography 36:2096-2110. Fox, J. 1987. Effect displays for generalized linear models. Sociological Methodology 17:347-361. Fox, J. 2003. Effect displays in R for generalised linear models. Journal of Statistical Software 8:1-27. Friberg, N., J. B. Dybkjaer, J. S. Olafsson, G. M. Gislason, S. E. Larsen, and T. L. Lauridsen. 2009. Relationships between structure and function in streams contrasting in temperature. Freshwater Biology 54:2051–2068.

Gasith, A., and V. H. Resh. 1999. Streams in Mediterranean climate regions: abiotic influences and biotic responses to predictable seasonal events. Annual Review of Ecology and Systematics 30:51-81. Godinho, F. N., M. T. Ferreira, and R. V. Cortes. 1997. Composition and spatial organization of fish assemblages in the lower Guadiana basin, southern Iberia. Ecology of Freshwater Fish 6:134-143. Godinho, F. N., M. T. Ferreira, and J. M. Santos. 2000. Variation in fish community composition along an Iberian river basin from low to high discharge: relative contributions of environmental and temporal variables. Ecology of Freshwater Fish 9:22-29. Goldstein, R. M., and M. R. Meador. 2004. Comparisons of fish species traits from small streams to large rivers. Transactions of the American Fisheries Society 133:971-983. Gomez, A., and D. H. Lunt. 2007. Refugia within refugia: patterns of phylogeographic concordance in the Iberian Peninsula. Pages 155-188 In S. Weiss and N. Ferrand (editors). Phylogeography of Southern European Refugia. Springer, Dordrecht. Greene, W. H. 2003. Econometric Analysis. 5th edition. Pearson Education, Inc., Upper Saddle River, New Jersey. Grenouillet, G., D. Pont, and C. Herisse. 2004. Within-basin fish assemblage structure: the relative influence of habitat versus stream spatial position on local species richness. Canadian Journal of Fisheries and Aquatic Sciences 61:93-102. Griffiths, D. 2006. Pattern and process in the ecological biogeography of European freshwater fish. Journal of Animal Ecology 75:734-751. Guisan, A., T. C. Edwards Jr, and T. Hastie. 2002. Generalized linear and generalized additive models in studies of species distributions: setting the scene. Ecological Modelling 157:89-100. Guisan, A., and N. E. Zimmermann. 2000. Predictive habitat distribution models in ecology. Ecological Modelling 135:147-186. Hahn, G. J., and W. Q. Meeker. 1991. Statistical Intervals: A Guide for Practitioners. John Wiley & Sons Inc, New York, NY.
Halliwell, D. B., R. W. Langdon, R. A. Daniels, J. P. Kurtenbach, and R. A. Jacobson. 1999. Classification of freshwater fish species of the northeastern United States for use in the development of indices of biological integrity, with regional applications. Pages 301-333 In T. P. Simon (editor). Assessing the Sustainability and Biological Integrity of Water Resources Using Fish Communities. CRC Press, Boca Raton, Florida. Hansen, M. M. 2002. Estimating the long-term effects of stocking domesticated trout into wild brown trout (Salmo trutta) populations: an approach using microsatellite DNA analysis of historical and contemporary samples. Molecular Ecology 11:1003-1015. Harrell, F. E. 2001. Regression Modeling Strategies: With Applications to Linear Models, Logistic Regression, and Survival Analysis. Springer, New York. Harris, J. H., and R. Silveira. 1999. Large-scale assessments of river health using an Index of Biotic Integrity with low-diversity fish communities. Freshwater Biology 41:235-252. Hartigan, J. A., and M. A. Wong. 1979. Algorithm AS136: a k-means clustering algorithm. Applied Statistics 28:100-108.

Hastie, T., R. Tibshirani, and J. Friedman. 2009. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. 2nd edition. Springer, New York. Hawkins, C. P., J. N. Hogue, L. M. Decker, and J. W. Feminella. 1997. Channel morphology, water temperature, and assemblage structure of stream insects. Journal of the North American Benthological Society 16:728-749. Hawkins, C. P., Y. Cao, and B. Roper. 2010. Method of predicting reference condition biota affects the performance and interpretation of ecological indices. Freshwater Biology 55:1066-1085. Haxton, T. J., and C. S. Findlay. 2008. Meta-analysis of the impacts of water management on aquatic communities. Canadian Journal of Fisheries and Aquatic Sciences 65:437-447. Haybach, A., F. Schöll, B. König, and F. Kohmann. 2004. Use of biological traits for interpreting functional relationships in large rivers. Limnologica - Ecology and Management of Inland Waters 34:451-459. Heino, J. 2005. Functional biodiversity of macroinvertebrate assemblages along major ecological gradients of boreal headwater streams. Freshwater Biology 50:1578-1587. Hering, D., C. K. Feld, O. Moog, and T. Ofenböck. 2006. Cook book for the development of a multimetric index for biological condition of aquatic ecosystems: experiences from the European AQEM and STAR projects and related initiatives. Hydrobiologia 566:311-324. Hering, D., A. Schmidt-Kloiber, J. Murphy, S. Lücke, C. Zamora-Muñoz, M. J. Lopez-Rodriguez, T. Huber, and W. Graf. 2009. Potential impact of climate change on aquatic insects: a sensitivity analysis for European caddisflies (Trichoptera) based on distribution pattern and ecological preferences. Aquatic Sciences 71:3-14. Hewitt, G. M. 1999. Post-glacial re-colonization of European biota. Biological Journal of the Linnean Society 68:87-112. Hewitt, G. M. 2000. The genetic legacy of the Quaternary ice ages. Nature 405:907-913. Hewitt, G. M. 2004. Genetic consequences of climatic oscillations in the Quaternary.
Philosophical Transactions of the Royal Society of London Series B-Biological Sciences 359:183-195. Hewitt, J. E., S. F. Thrush, and P. D. Dayton. 2008. Habitat variation, species diversity and ecological functioning in a marine system. Journal of Experimental Marine Biology and Ecology 366:116-122. Hill, M. O., and A. J. E. Smith. 1976. Principal component analysis of taxonomic data with multi-state discrete characters. Taxon 25:249-255. Hoeinghaus, D. J., K. O. Winemiller, and J. S. Birnbaum. 2007. Local and regional determinants of stream fish assemblage structure: inferences based on taxonomic vs. functional groups. Journal of Biogeography 34:324-338. Horrigan, N., and D. J. Baird. 2008. Trait patterns of aquatic insects across gradients of flow- related factors: a multivariate analysis of Canadian national data. Canadian Journal of Fisheries & Aquatic Sciences 65:670-680. Hosmer, D. W., and S. Lemeshow. 2000. Applied Logistic Regression. 2nd edition. John Wiley & Sons, Inc., New York. Hubbell, S. P. 2001. The Unified Neutral Theory of Biodiversity and Biogeography. Princeton University Press.

Huet, M. 1954. Biologie, profils en long et en travers des eaux courantes. Bulletin Français de Pisciculture 175:41-53. Hughes, R. M., S. Howlin, and P. R. Kaufmann. 2004. A biointegrity index (IBI) for coldwater streams of Western Oregon and Washington. Transactions of the American Fisheries Society 133:1497-1515. Hughes, R. M., P. R. Kaufmann, A. T. Herlihy, T. M. Kincaid, L. Reynolds, and D. P. Larsen. 1998. A process for developing and evaluating indices of fish assemblage integrity. Canadian Journal of Fisheries & Aquatic Sciences 55:1618-1631. Hughes, R. M., and T. Oberdorff. 1999. Applications of IBI concepts and metrics to waters outside the United States and Canada. Pages 79-93 In T. P. Simon (editor). Assessing the Sustainability and Biological Integrity of Water Resources Using Fish Communities. CRC Press, Boca Raton, Florida. Hugueny, B. 1989. West-African rivers as biogeographic islands - species richness of fish communities. Oecologia 79:236-243. Hugueny, B., S. Camara, B. Samoura, and M. Magassouba. 1996. Applying an index of biotic integrity based on fish assemblages in a West African river. Hydrobiologia 331:71-78. Hugueny, B., T. Oberdorff, and P. A. Tedesco. 2010. Community ecology of river fishes: a large-scale perspective. Pages 29-62 In D. A. Jackson and K. B. Gido (editors). Community Ecology of Stream Fishes: Concepts, Approaches, and Techniques. American Fisheries Society, Symposium 73, Bethesda, Maryland. Hutchinson, G. E. 1957. Concluding remarks. Cold Spring Harbor Symposia on Quantitative Biology 22:415-427. Ibañez, C., J. Belliard, R. M. Hughes, P. Irz, A. Kamdem-Toham, N. Lamouroux, P. A. Tedesco, and T. Oberdorff. 2009. Convergence of temperate and tropical stream fish assemblages. Ecography 32:658-670. Ibañez, C., T. Oberdorff, G. Teugels, V. Mamononekene, S. Lavoue, Y. Fermon, D. Paugy, and P. K. Tohams. 2007. Fish assemblages structure and function along environmental gradients in rivers of Gabon (Africa).
Ecology of Freshwater Fish 16:315-334. Illies, J. 1978. Limnofauna Europaea. Gustav Fischer Verlag, Stuttaart, New York. Infante, D., J. David Allan, S. Linke, and R. Norris. 2009. Relationship of fish and macroinvertebrate assemblages to environmental factors: implications for community concordance. Hydrobiologia 623:87-103. IPCC. 2007. Climate change 2007: synthesis report. Contribution of working groups I, II and III to the fourth assessment report of the Intergovernmental Panel on Climate Change. IPCC, Geneva, Switzerland, p. 104. Irz, P., F. Michonneau, T. Oberdorff, T. R. Whittier, N. Lamouroux, D. Mouillot, and C. Argillier. 2007. Fish community comparisons along environmental gradients in lakes of France and north-east USA. Global Ecology and Biogeography 16:350-366. Jackson, D. A., P. R. Peres-Neto, and J. D. Olden. 2001. What controls who is where in freshwater fish communities - the roles of biotic, abiotic, and spatial factors. Canadian Journal of Fisheries and Aquatic Sciences 58:157-170. Johnson, R. K., W. Goedkoop, and L. Sandin. 2004. Spatial scale and ecological relationships between the macroinvertebrate communities of stony habitats of streams and lakes. Freshwater Biology 49:1179-1194.

Jonsson, B., N. Jonsson, E. Brodtkorb, and P.-J. Ingebrigtsen. 2001. Life-history traits of brown trout vary with the size of small streams. Functional Ecology 15:310-317. Joy, M. K., and R. G. Death. 2002. Predictive modelling of freshwater fish as a biomonitoring tool in New Zealand. Freshwater Biology 47:2261-2275. Joy, M. K., and R. G. Death. 2004. Application of the index of biotic integrity methodology to New Zealand freshwater fish communities. Environmental Management 34:415-428. Karr, J. R. 1981. Assessment of biotic integrity using fish communities. Fisheries 6:21-27. Karr, J. R. 1991. Biological integrity: a long-neglected aspect of water resource management. Ecological Applications 1:66-84. Karr, J. R., and E. W. Chu. 1999. Restoring Life in Running Waters: Better Biological Monitoring. Island Press, Washington D.C. Karr, J. R., and E. W. Chu. 2000. Sustaining living rivers. Hydrobiologia 422:1-14. Karr, J. R., K. D. Fausch, P. L. Angermeier, P. R. Yant, and I. J. Schlosser. 1986. Assessing biological integrity in running waters: a method and its rationale. Special Publication 5. Illinois Natural History Survey, Champaign, IL. Keddy, P. A. 1992. Assembly and response rules: two goals for predictive community ecology. Journal of Vegetation Science 3:157-164. Keith, P. 1998. Evolution des peuplements ichtyologiques de France et stratégies de conservation. PhD Thesis. Université de Rennes I, Rennes, 239 pp. Kelt, D. A., and J. H. Brown. 1999. Community structure and assembly rules: confronting conceptual and statistical issues with data on desert rodents. Pages 75-107 In E. Weiher and P. Keddy (editors). Ecological Assembly Rules: Perspectives, Advances, Retreats. Cambridge University Press, Cambridge, United Kingdom. Kelt, D. A., J. H. Brown, E. J. Heske, P. A. Marquet, S. R. Morton, J. R. W. Reid, K. A. Rogovin, and G. Shenbrot. 1996. Community structure of desert small mammals: comparisons across four continents. Ecology 77:746-761. Kolkwitz, R., and M. 
Marsson. 1909. Ökologie der tierischen Saprobien. Internationale Revue Der Gesamten Hydrobiologie 2:126-152. Kontula, T., and R. Vainola. 2001. Postglacial colonization of Northern Europe by distinct phylogeographic lineages of the bullhead, Cottus gobio. Molecular Ecology 10:1983-2002. Koskinen, M. T., J. Nilsson, A. J. Veselov, A. G. Potutkin, E. Ranta, and C. R. Primmer. 2002. Microsatellite data resolve phylogeographic patterns in European grayling, Thymallus thymallus, Salmonidae. Heredity 88:391-401. Koster, E. A. (editor). 2005. The Physical Geography of Western Europe. Oxford University Press, Oxford, USA. Kottelat, M., and J. Freyhof. 2007. Handbook of European Freshwater Fishes. Publications Kottelat, Cornol, Switzerland. Kruk, A., and T. Penczak. 2003. Impoundment impact on populations of facultative riverine fish. Annales De Limnologie-International Journal of Limnology 39:197-210. Kutner, M. H., C. J. Nachtsheim, J. Neter, and W. Li. 2005. Applied Linear Statistical Models. 5th edition. McGraw-Hill/Irwin, New York.

Kwak, T. J., and T. F. Waters. 1997. Trout production dynamics and water quality in Minnesota streams. Transactions of the American Fisheries Society 126:35-48. Lahti, K., A. Laurila, K. Enberg, and J. Piironen. 2001. Variation in aggressive behaviour and growth rate between populations and migratory forms in the brown trout, Salmo trutta. Animal Behaviour 62:935-944. Lamouroux, N., S. Dolédec, and S. Gayraud. 2004. Biological traits of stream macroinvertebrate communities: effects of microhabitat, reach, and basin filters. Journal of the North American Benthological Society 23:449-466. Lamouroux, N., N. L. Poff, and P. L. Angermeier. 2002. Intercontinental convergence of stream fish community traits along geomorphic and hydraulic gradients. Ecology 83:1792-1807. Langdon, R. W. 2001. A preliminary index of biological integrity for fish assemblages of small coldwater streams in Vermont. Northeastern Naturalist 8:219-232. Largiadèr, C. R., and A. Scholl. 1996. Genetic introgression between native and introduced brown trout Salmo trutta L. populations in the Rhone River Basin. Molecular Ecology 5:417-426. Lassalle, G., M. Béguer, L. Beaulaton, and E. Rochard. 2008. Diadromous fish conservation plans need to consider global warming issues: an approach using biogeographical models. Biological Conservation 141:1105-1118. Leathwick, J. R., J. Elith, and T. Hastie. 2006. Comparative performance of generalized additive models and multivariate adaptive regression splines for statistical modelling of species distributions. Ecological Modelling 199:188-196. Lebreton, J. D., R. Sabatier, G. Banco, and A. M. Bacou. 1991. Principal component and correspondence analyses with respect to instrumental variables: an overview of their role in studies of structure-activity and species-environment relationships. In J. Devillers and W. Karcher (editors). Applied Multivariate Analysis in SAR and Environmental Studies. Kluwer Academic Publishers, Dordrecht. Lee, D. S., C. R. Gilbert, C. H.
Hocutt, R. E. Jenkins, D. E. McAllister, and J. Stauffer, J.R. 1980. Atlas of North American Freshwater Fishes. North Carolina State Museum of Natural History, Raleigh, NC. Legendre, L., and P. Legendre. 1998. Numerical Ecology. Elsevier science B.V., Amsterdam. Leonard, P. M., and D. J. Orth. 1986. Application and testing of an index of biotic integrity in small, coolwater streams. Transactions of the American Fisheries Society 115:401-414. Leprieur, F., O. Beauchard, S. Blanchet, T. Oberdorff, and S. Brosse. 2008a. Fish invasions in the world's river systems: when natural processes are blurred by human activities. Plos Biology 6:404-410. Leprieur, F., O. Beauchard, B. Hugueny, G. Grenouillet, and S. Brosse. 2008b. Null model of biotic homogenization: a test with the European freshwater fish fauna. Diversity and Distributions 14:291-300. Lobo, J. M., and A. L. V. Davis. 1999. An intercontinental comparison of dung beetle diversity between two mediterranean-climatic regions: local versus regional and historical influences. Diversity and Distributions 5:91-103.

Lyons, J., S. Navarro-Pérez, P. A. Cochran, E. Santana, and M. Guzmán-Arroyo. 1995. Index of biotic integrity based on fish assemblages for the conservation of streams and rivers in West-Central Mexico. Conservation Biology 9:569-584.
Lyons, J., L. Wang, and T. D. Simonson. 1996. Development and validation of an index of biotic integrity for coldwater streams in Wisconsin. North American Journal of Fisheries Management 16:241-256.
Mac Nally, R. 2000. Regression and model building in conservation biology, biogeography and ecology: the distinction between and reconciliation of 'predictive' and 'explanatory' models. Biodiversity and Conservation 9:655-671.
MacArthur, R., and R. Levins. 1967. The limiting similarity, convergence, and divergence of coexisting species. American Naturalist 101:377-385.
Macdonald, P. D. M. 1987. Analysis of length-frequency distributions. Pages 371-384 In R. C. Summerfelt and G. E. Hall (editors). Age and Growth of Fish. Iowa State University Press, Ames.
Magalhaes, M. F., D. C. Batalha, and M. J. Collares-Pereira. 2002a. Gradients in stream fish assemblages across a Mediterranean landscape: contributions of environmental factors and spatial structure. Freshwater Biology 47:1015-1031.
Magalhaes, M. F., P. Beja, C. Canas, and M. J. Collares-Pereira. 2002b. Functional heterogeneity of dry-season fish refugia across a Mediterranean catchment: the role of habitat and predation. Freshwater Biology 47:1919-1934.
Magalhaes, M. F., C. E. Ramalho, and M. J. Collares-Pereira. 2008. Assessing biotic integrity in a Mediterranean watershed: development and evaluation of a fish-based index. Fisheries Management and Ecology 15:273-289.
Maret, T. R. 1999. Characteristic of fish assemblages and environmental conditions in streams of the upper Snake River Basin in Eastern Idaho and Western Wyoming. Pages 273-299 In T. P. Simon (editor). Assessing the Sustainability and Biological Integrity of Water Resources Using Fish Communities. CRC Press, Boca Raton, Florida.
Matthews, W. J. 1998. Patterns in Freshwater Fish Ecology. Chapman & Hall, New York.
Matzen, D. A., and H. B. Berge. 2008. Assessing small-stream biotic integrity using fish assemblages across an urban landscape in the Puget Sound lowlands of Western Washington. Transactions of the American Fisheries Society 137:677-689.
May, J. T., and L. R. Brown. 2002. Fish communities of the Sacramento River Basin: implications for conservation of native fishes in the Central Valley, California. Environmental Biology of Fishes 63:373-388.
McCormick, F. H., R. M. Hughes, P. R. Kaufmann, D. V. Peck, J. L. Stoddard, and A. T. Herlihy. 2001. Development of an index of biotic integrity for the Mid-Atlantic Highlands region. Transactions of the American Fisheries Society 130:857-877.
McCullagh, P., and J. A. Nelder. 1989. Generalized Linear Models. 2nd edition. Chapman and Hall, London.
McDermid, J. L., P. E. Ihssen, W. N. Sloan, and B. J. Shuter. 2007. Genetic and environmental influences on life history traits in lake trout. Transactions of the American Fisheries Society 136:1018-1029.
Mebane, C. A., T. R. Maret, and R. M. Hughes. 2003. An index of biological integrity (IBI) for Pacific Northwest rivers. Transactions of the American Fisheries Society 132:239-261.

Melcher, A., S. Schmutz, G. Haidvogl, and K. Moder. 2007. Spatially based methods to assess the ecological status of European fish assemblage types. Fisheries Management and Ecology 14:453-463.
Melville, J., L. J. Harmon, and J. B. Losos. 2006. Intercontinental community convergence of ecology and morphology in desert lizards. Proceedings of the Royal Society Biological Sciences Series B 273:557-563.
Mesquita, N., and F. Coelho. 2002. The ichthyofauna of the small Mediterranean-type drainages of Portugal: its importance for conservation. Pages 65-71 In M. J. Collares-Pereira, I. G. Cowx and F. Coelho (editors). Conservation of Freshwater Fish: Options for the Future. Fishing News Books, Blackwell Science, Oxford.
Mesquita, N., M. M. Coelho, and M. F. Magalhaes. 2006. Spatial variation in fish assemblages across small Mediterranean drainages: effects of habitat and landscape context. Environmental Biology of Fishes 77:105-120.
Milly, P. C. D., K. A. Dunne, and A. V. Vecchia. 2005. Global pattern of trends in streamflow and water availability in a changing climate. Nature 438:347-350.
Mouillot, D., N. W. H. Mason, and J. B. Wilson. 2007. Is the abundance of species determined by their functional traits? A new method with a test using plant communities. Oecologia 152:729-737.
Moyle, P. B., and B. Herbold. 1987. Life-history patterns and community structure in stream fishes of western North America: comparisons with Eastern North America and Europe. Pages 25-32 In W. J. Matthews and D. C. Heins (editors). Community and Evolutionary Ecology of North American Stream Fishes. University of Oklahoma Press, Norman, London.
Mundahl, N. D., and T. P. Simon. 1999. Development and application of an index of biotic integrity for coldwater streams of the upper Midwestern United States. Pages 383-411 In T. P. Simon (editor). Assessing the Sustainability and Biological Integrity of Water Resources Using Fish Communities. CRC Press, Boca Raton, Florida.
Nelson, J. S. 2006. Fishes of the World. 4th edition. John Wiley & Sons, Inc., Hoboken, New Jersey.
Nesbo, C. L., T. Fossheim, L. A. Vollestad, and K. S. Jakobsen. 1999. Genetic divergence and phylogeographic relationships among European perch (Perca fluviatilis) populations reflect glacial refugia and postglacial colonization. Molecular Ecology 8:1387-1404.
Neter, J., W. Wasserman, and M. H. Kutner. 1983. Applied Linear Regression Models. Richard D. Irwin, Inc., Homewood, Illinois.
Noble, R. A. A., I. G. Cowx, D. Goffaux, and P. Kestemont. 2007. Assessing the health of European rivers using functional ecological guilds of fish communities: standardising species classification and approaches to metric selection. Fisheries Management and Ecology 14:381-392.
Oberdorff, T., and R. M. Hughes. 1992. Modification of an index of biotic integrity based on fish assemblages to characterize rivers of the Seine Basin, France. Hydrobiologia 228:117-130.
Oberdorff, T., D. Pont, B. Hugueny, and J. P. Porcher. 2002. Development and validation of a fish-based index for the assessment of 'river health' in France. Freshwater Biology 47:1720-1734.

Oberdorff, T., and J. P. Porcher. 1994. An index of biotic integrity to assess biological impacts of salmonid farm effluents on receiving waters. Aquaculture 119:219-235.
Ojanguren, A. F., F. G. Reyes-Gavilan, and F. Brana. 2001. Thermal sensitivity of growth, food intake and activity of juvenile brown trout. Journal of Thermal Biology 26:165-170.
Olden, J. D., M. J. Kennard, F. Leprieur, P. A. Tedesco, K. O. Winemiller, and E. Garcia-Berthou. 2010. Conservation biogeography of freshwater fishes: recent progress and future challenges. Diversity and Distributions 16:496-513.
Oswood, M. W., J. B. Reynolds, J. G. Irons III, and A. M. Milner. 2000. Distributions of freshwater fishes in ecoregions and hydroregions of Alaska. Journal of the North American Benthological Society 19:405-418.
Parra, I., A. Almodovar, G. G. Nicolas, and B. Elvira. 2009. Latitudinal and altitudinal growth patterns of brown trout Salmo trutta at different spatial scales. Journal of Fish Biology 74:2355-2373.
Pavoine, S., M. Baguette, and M. B. Bonsall. 2010. Decomposition of trait diversity among the nodes of a phylogenetic tree. Ecological Monographs 80:485-507.
Petts, G. E., and C. Amoros (editors). 1996. Fluvial Hydrosystems. Chapman & Hall, London.
Pires, A. M., I. G. Cowx, and M. M. Coelho. 1999. Seasonal changes in fish community structure of intermittent streams in the middle reaches of the Guadiana basin, Portugal. Journal of Fish Biology 54:235-249.
Pires, A. M., L. M. Da Costa, M. J. Alves, and M. M. Coelho. 2004. Fish assemblage structure across the Arade basin (Southern Portugal). Cybium 28:357-365.
Poff, N. L. 1997. Landscape filters and species traits: towards mechanistic understanding and prediction in stream ecology. Journal of the North American Benthological Society 16:391-409.
Poff, N. L., and J. D. Allan. 1995. Functional organization of stream fish assemblages in relation to hydrological variability. Ecology 76:606-627.
Pollard, A. I., and L. L. Yuan. 2010. Assessing the consistency of response metrics of the invertebrate benthos: a comparison of trait- and identity-based measures. Freshwater Biology 55:1420-1429.
Pont, D. 2010. Bio-indication et peuplements piscicoles dans les cours d'eau : une approche fonctionnelle et prédictive. Sciences Eaux & Territoires 1:40-45.
Pont, D., R. M. Hughes, T. R. Whittier, and S. Schmutz. 2009. A predictive index of biotic integrity model for aquatic-vertebrate assemblages of Western U.S. streams. Transactions of the American Fisheries Society 138:292-305.
Pont, D., B. Hugueny, U. Beier, D. Goffaux, A. Melcher, R. Noble, C. Rogers, N. Roset, and S. Schmutz. 2006. Assessing river biotic condition at a continental scale: a European approach using functional metrics and fish assemblages. Journal of Applied Ecology 43:70-80.
Pont, D., B. Hugueny, and T. Oberdorff. 2005. Modelling habitat requirement of European fishes: do species have similar responses to local and regional environmental constraints? Canadian Journal of Fisheries and Aquatic Sciences 62:163-173.

Pont, D., B. Hugueny, and C. Rogers. 2007. Development of a fish-based index for the assessment of river health in Europe: the European Fish Index. Fisheries Management and Ecology 14:427-439.
Qian, H., and R. E. Ricklefs. 2000. Large-scale processes and the Asian bias in species diversity of temperate plants. Nature 407:180-182.
Quataert, P., J. Breine, and I. Simoens. 2007. Evaluation of the European Fish Index: false-positive and false-negative error rate to detect disturbance and consistency with alternative fish indices. Fisheries Management and Ecology 14:465-472.
Quinn, J. W., and T. J. Kwak. 2003. Fish assemblage changes in an Ozark river after impoundment: a long-term perspective. Transactions of the American Fisheries Society 132:110-119.
Quinn, T. P., and N. P. Peterson. 1996. The influence of habitat complexity and fish size on over-winter survival and growth of individually marked juvenile coho salmon (Oncorhynchus kisutch) in Big Beef creek, Washington. Canadian Journal of Fisheries and Aquatic Sciences 53:1555-1564.
Quist, M. C., W. A. Hubert, and F. J. Rahel. 2005. Fish assemblage structure following impoundment of a Great Plains River. Western North American Naturalist 65:53-63.
Rahel, F. J., and W. A. Hubert. 1991. Fish assemblages and habitat gradients in a Rocky-Mountain Great-Plains stream - biotic zonation and additive patterns of community change. Transactions of the American Fisheries Society 120:319-332.
Rao, P. G. 1993. Climatic changes and trends over a major river basin in India. Climate Research 2:215-223.
Reich, P. B., M. B. Walters, and D. S. Ellsworth. 1997. From tropics to tundra: global convergence in plant functioning. Proceedings of the National Academy of Sciences of the United States of America 94:13730-13734.
Reyjol, Y., B. Hugueny, D. Pont, P. G. Bianco, U. Beier, N. Caiola, F. Casals, I. Cowx, A. Economou, T. Ferreira, G. Haidvogl, R. Noble, A. de Sostoa, T. Vigneron, and T. Virbickas. 2007. Patterns in species richness and endemism of European freshwater fish. Global Ecology and Biogeography 16:65-75.
Reynolds, J. D., T. J. Webb, and L. A. Hawkins. 2005. Life history and ecological correlates of extinction risk in European freshwater fishes. Canadian Journal of Fisheries and Aquatic Sciences 62:854-862.
Ricklefs, R. E. 2006. Evolutionary diversification and the origin of the diversity-environment relationship. Ecology 87:S3-S13.
Ricklefs, R. E., and D. Schluter. 1993. Species diversity: regional and historical influences. Pages 350-363 In R. E. Ricklefs and D. Schluter (editors). Species Diversity in Ecological Communities. University of Chicago Press, Chicago.
Ripley, B. D. 1987. Stochastic Simulation. Wiley.
Roset, N., G. Grenouillet, D. Goffaux, D. Pont, and P. Kestemont. 2007. A review of existing fish assemblage indicators and methodologies. Fisheries Management and Ecology 14:393-405.
Rutherford, E. S. 2002. Fishery management. In L. A. Fuiman and R. G. Werner (editors). Fishery Science: the Unique Contributions of Early Life Stages. Blackwell Science, Oxford.

Samuels, C. L., and J. A. Drake. 1997. Divergent perspectives on community convergence. Trends in Ecology & Evolution 12:427-432.
Sandin, L., and R. K. Johnson. 2004. Local, landscape and regional factors structuring benthic macroinvertebrate assemblages in Swedish streams. Landscape Ecology 19:501-514.
Santoul, F., J. Cayrou, S. Mastrorillo, and R. Céréghino. 2005. Spatial patterns of the biological traits of freshwater fish communities in southwest France. Journal of Fish Biology 66:301-314.
Saporta, G. 2006. Probabilités, Analyses de Données et Statistiques. Editions TECHNIP, Paris.
Scherrer, B. 2009. Biostatistique, Volume 1. Gaëtan Morin, Montréal, Canada.
Schlosser, I. J. 1991. Stream fish ecology - a landscape perspective. Bioscience 41:704-712.
Schmutz, S., I. G. Cowx, G. Haidvogl, and D. Pont. 2007. Fish-based methods for assessing European running waters: a synthesis. Fisheries Management and Ecology 14:369-380.
Segurado, P., M. T. Ferreira, P. Pinheiro, and J. M. Santos. 2008. Mediterranean River Assessment. Testing the response of guild-based metric, Work Package 3, Subtask 7. EFI+ Consortium - Improvement and spatial extension of the European Fish Index, EU-Project Nr. 044096, pp. 5-9.
Simon, T. P., and J. Lyons. 1995. Application of the index of biotic integrity to evaluate water resource integrity in freshwater ecosystems. Pages 245-262 In W. S. Davis and T. P. Simon (editors). Biological Assessment and Criteria: Tools for Water Resource Planning and Decision Making. CRC Press, Boca Raton, Florida.
Sinclair, S. J., M. D. White, and G. R. Newell. 2010. How useful are species distribution models for managing biodiversity under future climates? Ecology and Society 15.
Smith, A. P., and J. U. Ganzhorn. 1996. Convergence in community structure and dietary adaptation in Australian possums and gliders and Malagasy lemurs. Australian Journal of Ecology 21:31-46.
Southerland, M. T., G. M. Rogers, M. J. Kline, R. P. Morgan, D. M. Boward, R. Kazyak, R. J. Klauda, and S. A. Stranko. 2007. Improving biological indicators to better assess the condition of streams. Ecological Indicators 7:751-767.
Southwood, T. R. E. 1977. Habitat, the templet for ecological strategies? Journal of Animal Ecology 46:337-365.
Southwood, T. R. E. 1988. Tactics, strategies and templets. Oikos 52:3-18.
Statzner, B., B. Bis, S. Dolédec, and P. Usseglio-Polatera. 2001. Perspectives for biomonitoring at large spatial scales: a unified measure for the functional composition of invertebrate communities in European running waters. Basic and Applied Ecology 2:73-85.
Statzner, B., S. Dolédec, and B. Hugueny. 2004. Biological trait composition of European stream invertebrate communities: assessing the effects of various trait filter types. Ecography 27:470-488.
Statzner, B., K. Hoppenhaus, M.-F. Arens, and P. Richoux. 1997. Reproductive traits, habitat use and templet theory: a synthesis of world-wide data on aquatic insects. Freshwater Biology 38:109-135.
Stearns, S. 1992. The Evolution of Life Histories. Oxford University Press, Oxford.

Stoddard, J. L., A. T. Herlihy, D. V. Peck, R. M. Hughes, T. R. Whittier, and E. Tarquinio. 2008. A process for creating multimetric indices for large-scale aquatic surveys. Journal of the North American Benthological Society 27:878-891.
Stoddard, J. L., D. P. Larsen, C. P. Hawkins, R. K. Johnson, and R. H. Norris. 2006. Setting expectations for the ecological condition of streams: the concept of reference condition. Ecological Applications 16:1267-1276.
Tedesco, P. A., B. Hugueny, T. Oberdorff, H. H. Dürr, S. Mérigoux, and B. de Mérona. 2008. River hydrological seasonality influences life history strategies of tropical riverine fishes. Oecologia 156:691-702.
Tedesco, P. A., T. Oberdorff, C. A. Lasso, M. Zapata, and B. Hugueny. 2005. Evidence of history in explaining diversity patterns in tropical riverine fish. Journal of Biogeography 32:1899-1907.
Tedesco, P. A., P. Sagnes, and J. Laroche. 2009. Variability in the growth rate of chub Leuciscus cephalus along a longitudinal river gradient. Journal of Fish Biology 74:312-319.
Tenenhaus, M., and F. W. Young. 1985. An analysis and synthesis of multiple correspondence analysis, optimal scaling, dual scaling, homogeneity analysis and other methods for quantifying categorical multivariate data. Psychometrika 50:91-119.
Tirelli, T., and D. Pessani. 2009. Use of decision tree and artificial neural network approaches to model presence/absence of Telestes muticellus in Piedmont (North-Western Italy). River Research and Applications 25:1001-1012.
Tockner, K., U. Uehlinger, and C. T. Robinson (editors). 2009. Rivers of Europe. Academic Press.
Tonn, W. M., J. J. Magnuson, M. Rask, and J. Toivonen. 1990. Intercontinental comparison of small-lake fish assemblages: the balance between local and regional processes. American Naturalist 136:345-375.
Townsend, C. R., and A. G. Hildrew. 1994. Species traits in relation to a habitat templet for river systems. Freshwater Biology 31:265-275.
Usseglio-Polatera, P., M. Bournaud, P. Richoux, and H. Tachet. 2000. Biological and ecological traits of benthic freshwater macroinvertebrates: relationships and definition of groups with similar traits. Freshwater Biology 43:175-205.
Van Sickle, J., and R. M. Hughes. 2000. Classification strengths of ecoregions, catchments, and geographic clusters for aquatic vertebrates in Oregon. Journal of the North American Benthological Society 19:370-384.
Vaz, S., C. S. Martin, P. D. Eastwood, B. Ernande, A. Carpentier, G. J. Meaden, and F. Coppin. 2008. Modelling species distributions using regression quantiles. Journal of Applied Ecology 45:204-217.
Vidoni, P. 2003. Prediction and calibration in generalized linear models. Annals of the Institute of Statistical Mathematics 55:169-185.
Vila-Gispert, A., R. Moreno-Amich, and E. Garcia-Berthou. 2002. Gradients of life-history variation: an intercontinental comparison of fishes. Reviews in Fish Biology and Fisheries 12:417-427.

Vitt, L. J., and E. R. Pianka. 2005. Deep history impacts present-day ecology and biodiversity. Proceedings of the National Academy of Sciences of the United States of America 102:7877-7881.
Walsh, C., and R. Mac Nally. 2008. hier.part: Hierarchical Partitioning. R package version 1.0-3.
Wang, L. Z., J. Lyons, and P. Kanehl. 2003. Impacts of urban land cover on trout streams in Wisconsin and Minnesota. Transactions of the American Fisheries Society 132:825-839.
Ward, J. V. 1985. Thermal-characteristics of running waters. Hydrobiologia 125:31-46.
Ward, D. M., K. H. Nislow, and C. L. Folt. 2009. Increased population density and suppressed prey biomass: relative impacts on juvenile Atlantic salmon growth. Transactions of the American Fisheries Society 138:135-143.
Webb, B. W. 1996. Trends in stream and river temperature. Hydrological Processes 10:205-226.
Webb, B. W., and F. Nobilis. 1995. Long term water temperature trends in Austrian rivers. Hydrological Sciences - Journal des Sciences Hydrologiques 40:83-96.
Webb, B. W., and F. Nobilis. 2007. Long-term changes in river temperature and the influence of climatic and hydrological factors. Hydrological Sciences - Journal des Sciences Hydrologiques 52:74-85.
Wehrly, K. E., M. J. Wiley, and P. W. Seelbach. 2003. Classifying regional variation in thermal regime based on stream fish community patterns. Transactions of the American Fisheries Society 132:18-38.
Weiher, E., and P. Keddy (editors). 1999. Ecological Assembly Rules: Perspectives, Advances, Retreats. Cambridge University Press, Cambridge, UK.
Weiher, E., and P. A. Keddy. 1995. Assembly rules, null models, and trait dispersion - new questions from old patterns. Oikos 74:159-164.
Whittier, T. R., J. L. Stoddard, D. P. Larsen, and A. T. Herlihy. 2007. Selecting reference sites for stream biological assessments: best professional judgment or objective criteria. Journal of the North American Benthological Society 26:349-360.
Wiens, J. A. 1991. Ecological similarity of shrub-desert avifaunas of Australia and North America. Ecology 72:479-495.
Wilby, R. L., H. G. Orr, M. Hedger, D. Forrow, and M. Blackmore. 2006. Risks posed by climate change to the delivery of Water Framework Directive objectives in the UK. Environment International 32:1043-1055.
Winemiller, K. O., and A. Adite. 1997. Convergent evolution of weakly electric fishes from floodplain habitats in Africa and South America. Environmental Biology of Fishes 49:175-186.
Winemiller, K. O., and K. A. Rose. 1992. Patterns of life-history diversification in North-American fishes - implications for population regulation. Canadian Journal of Fisheries and Aquatic Sciences 49:2196-2218.
Winston, M. R. 1995. Co-occurrence of morphologically similar species of stream fishes. American Naturalist 145:527-545.
Wood, G. R. 2005. Confidence and prediction intervals for generalised linear accident models. Accident Analysis and Prevention 37:267-273.

Wright, J. F., D. W. Sutcliffe, and M. T. Furse (editors). 2000. Assessing the Biological Quality of Freshwaters. RIVPACS and Other Techniques. Freshwater Biological Association, Ambleside, United Kingdom.
Yoshimura, C., K. Tockner, T. Omura, and O. Moog. 2006. Species diversity and functional assessment of macroinvertebrate communities in Austrian rivers. Limnology 7:63-74.
Zaroban, D. W., M. P. Mulvey, T. R. Maret, R. M. Hughes, and G. D. Merritt. 1999. Classification of species attributes for Pacific Northwest freshwater fishes. Northwest Science 73:81-93.

Part Two: Articles and English translation

Introduction

This thesis was part of the European EFI+ project⁴ (01/01/2007–30/04/2009) and built on the work carried out during the earlier European FAME project⁵ (01/01/2002–31/10/2004). The objective of these two projects was to develop tools to help implement the Water Framework Directive (WFD; European Union (EC) 2000). Along with the development of national methods, the FAME project produced a fish index that can be used in all European streams: the European Fish Index (EFI; Pont et al. 2006, Pont et al. 2007). The EFI+ project aimed to continue this development and to enhance the indicator in order to overcome some of its limitations. A more precise database describing the environmental conditions and the anthropogenic pressures acting on streams was used. A number of hypotheses underlying the development of this type of bioindicator were also tested; testing them is the subject of this work.

⁴ Improvement and Spatial extension of the European Fish Index (http://efi-plus.boku.ac.at/), contract number 044096
⁵ Development, Evaluation and Implementation of a standardised Fish-based Assessment Method for the Ecological Status of European Rivers (http://fame.boku.ac.at/), contract number EVK1-CT-2001-00094

The Water Framework Directive (WFD) and the bioindication tools

The WFD is an environmental policy directive instituted by the European Parliament (European Union (EC) 2000). It establishes a framework for protecting and managing the different waterbodies: surface waters (lakes, rivers), transitional waters (estuaries and lagoons), coastal waters and groundwater. The WFD states that waterbodies should maintain or achieve “good status” by 2015 (at the very latest by 2027). Its implementation is organised into three steps. The first step diagnoses the status of the different waterbodies (from very bad to good condition); the second step plans restoration measures for the degraded waterbodies (condition below good); and the last step assesses the condition of the waterbodies after restoration. The WFD specifies that both the integrity of the physico-chemical processes of the ecosystem (e.g. the hydrological regime, pollutants) and the community functioning of four biocenosis components (diatoms, macrophytes, benthic macroinvertebrates and fish) must be taken into account to assess the condition of waterbodies. Assessing waterbody status by characterising ecological functioning is one of the major innovations of the WFD. The “good ecological status” as defined by the WFD (European Union (EC) 2000) must be evaluated by comparing the current community characteristics to the characteristics expected without any human alteration: the “reference condition approach” (Bailey et al. 1998). The WFD stimulated and accelerated the development of national indices based on community characteristics. In France, the “indice poisson rivière” (river fish index) (IPR; Oberdorff et al. 2002) was developed to assess the ecological status of French streams based on fish assemblage structure.

The (fish) multimetric biotic indicators

Bioindication has a history of more than a century, particularly in assessing the degradation of waters caused by urban and industrial sewage (Kolkwitz and Marsson 1909). Concerning fish as indicators of the condition of the aquatic environment, Karr’s studies in the United States were decisive. His Index of Biotic Integrity (IBI; Karr 1981, Karr et al. 1986) was developed within the framework of water protection measures in the US: the Clean Water Act in 1972 (“To restore and maintain the chemical, physical, and biological integrity of the Nation’s waters”) and subsequently the National Wildlife Refuge System Improvement Act in 1997. The IBI is based on principles that were largely taken up by the Water Framework Directive, especially the notion of biotic integrity. Biotic integrity is defined as the entire set of structural and functional characteristics of a biological assemblage that can be observed at pristine sites, in other words at sites undisturbed by human activities. On this basis, the IBI seeks to quantify the deviation from this pristine status, which is related to the intensity of anthropogenic pressures. It assesses the response of the fish assemblage to perturbations using a set of descriptors (metrics) of assemblage condition. The IBI laid both the theoretical and practical foundations of multimetric indicators (MMIs). The concept of this index, originally developed for warmwater streams of the Midwestern United States, was exported to other US regions and stream types (e.g. Leonard and Orth 1986, Angermeier and Schlosser 1987, McCormick et al. 2001) and to almost all continents (e.g. Oberdorff and Hughes 1992, Lyons et al. 1995, Hugueny et al. 1996, Harris and Silveira 1999, Hughes and Oberdorff 1999, An et al. 2002, Joy and Death 2004, Magalhaes et al. 2008). MMIs are based on several premises. First, taking into account different assemblage characteristics (species composition, richness, trophic structure, reproductive mode, abundance and health of individuals; Karr 1981, Fausch et al. 1984, Karr et al. 1986, Karr 1991, Karr and Chu 1999) enables a better assessment of stream condition (biotic integrity, health, ecological condition, etc.) than considering only one assemblage attribute (Karr and Chu 1999). Each of these metrics should provide unique information on the assemblage (Karr and Chu 1999), describing the quality of a community element (Karr 1991).

These metrics must also show a specific response to human pressures (Karr and Chu 2000). Their sensitivity should vary along a gradient of anthropogenic pressures (Angermeier and Karr 1986). When a site is heavily degraded, several metrics must reflect its impairment. Integrating the signals given by the different metrics into a single index should make it possible to detect a wide range of human pressures and to represent an overall level of impairment. Ideally, an index should be sensitive to all the pressures that may be encountered in the geographic area where the index is used (Karr 1991).

In addition to the specificity of their signal, metrics should reflect the specificities of the region where they are applied: species pool, environment and pressures. This basic premise is of major importance. The results of Harris and Silveira (1999) demonstrated that including a metric that has little or no representation in the region limits the discrimination between sites. For instance, in the first IBI Karr included a metric based on the relative abundance of the green sunfish (Lepomis cyanellus Rafinesque). Using this metric in Europe would be ineffective because this species is only distributed in North America (Lee et al. 1980, Nelson 2006).

The third premise inherent in the development of an MMI states that the score associated with a metric should only reflect between-site differences in degradation (Hughes et al. 1998, Karr and Chu 1999, 2000, Oberdorff et al. 2002, Hering et al. 2006, Pont et al. 2007, Stoddard et al. 2008, Pont et al. 2009), i.e. the differences between scores should only reflect the level of site impairment. Fausch et al. (1984) were the first to consider that assemblage attributes (here, richness) vary along environmental gradients (stream size and region). Two strategies can be used to select metrics: first, selecting metrics that are invariant regardless of the environmental conditions or the region considered (Statzner et al. 2001); second, controlling the environmental part of the variability of the metrics (Fausch et al. 1984, Angermeier and Karr 1986, Karr et al. 1986, Karr and Chu 1999, Oberdorff et al. 2002, Pont et al. 2007, Stoddard et al. 2008, Pont et al. 2009, Hawkins et al. 2010), either by limiting their geographical area of use or by standardising metric values in relation to the environmental conditions. This latter approach was retained in the two European projects (Pont et al. 2006, 2007, Melcher et al. 2007).
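
As an illustration of what "standardising metric values in relation to the environmental conditions" can mean in practice, the following minimal sketch fits a straight line relating a metric to one environmental variable at reference sites, then expresses the metric observed at a test site as a residual in standard-deviation units. The data, the single predictor (a hypothetical log catchment area), the linear form and all function names are assumptions made for this example; they are not the models used in FAME or EFI+.

```python
from statistics import mean, stdev


def fit_line(x, y):
    """Ordinary least squares for y = a + b * x with a single predictor."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    return a, b


def standardised_residual(observed, env, a, b, resid_sd):
    """Deviation of an observed metric from its expected value, in SD units."""
    expected = a + b * env
    return (observed - expected) / resid_sd


# Hypothetical reference (undisturbed) sites: environmental variable
# (e.g. log catchment area) and species richness. Invented numbers.
ref_env = [1.0, 2.0, 3.0, 4.0, 5.0]
ref_richness = [3.0, 5.0, 6.0, 9.0, 10.0]

a, b = fit_line(ref_env, ref_richness)

# Sample standard deviation of the residuals at the reference sites.
ref_residuals = [y - (a + b * x) for x, y in zip(ref_env, ref_richness)]
resid_sd = stdev(ref_residuals)

# Two test sites with the same observed richness but different stream size:
# the same raw value is above expectation in one case and below in the other.
small = standardised_residual(6.0, 2.0, a, b, resid_sd)
large = standardised_residual(6.0, 5.0, a, b, resid_sd)
print(f"small stream: {small:+.2f} SD, large stream: {large:+.2f} SD")
```

The point of the sketch is that a raw richness of six species carries no information on impairment by itself; only its deviation from the value expected for that environment does.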

Whatever method is used to control the environmental variability of the metrics or of the assemblage composition, whether a maximum value line (e.g. maximum species richness line; Fausch et al. 1984, Karr et al. 1986, Hughes et al. 1998, Roset et al. 2007), statistical models (Oberdorff et al. 2002, Baker et al. 2005, Pont et al. 2006, Pont et al. 2007, Pont et al. 2009), discriminant analysis (Joy and Death 2002), a learning algorithm (e.g. random forest; Hawkins et al. 2010) or a nearest-neighbour approach (Bates Prins and Smith 2007), the underlying hypothesis remains the same: two communities living in similar environmental conditions without any stressors display similar attributes. Although this is a strong assumption, it remains for the most part untested.

Developing European indicators (FAME and EFI+ project)

The European Fish Index (EFI) was developed within the FAME project (Pont et al. 2006, Pont et al. 2007). The EFI is based on the aggregation of the scores of ten metrics, or assemblage attributes (Hering et al. 2006). Scores were computed using the reference condition approach (Bailey et al. 1998): the metric values observed among fish assemblages are compared to the theoretical values expected in the absence of human pressure. This index was developed by gathering data in a large database shared by 12 European countries (FIDES), encompassing faunistic data (number of individuals caught per species), the ecological and biological traits of European fish species (e.g. trophic guild, reproduction), environmental descriptors (e.g. slope, temperature), and a description of human pressures. The creation of this database was one of the major outcomes of the FAME project, especially the evaluation of five major anthropogenic alterations: hydrological regime, morphological conditions, acidification and toxics, organic nutrients, and connectivity disruption. Although the EFI score “shows a significant negative linear response to a gradient of human disturbance” (Pont et al. 2006), this index presents several limits:
- its geographical range of use: none of the Eastern European countries were involved in FAME (Schmutz et al. 2007);
- the varying sensitivity of the EFI between the different European regions (Quataert et al. 2007): the EFI was less sensitive in Mediterranean streams;
- the varying sensitivity of the EFI depending on the kind of anthropogenic pressure: the EFI was more sensitive to chemical pressures than to physical pressures (Pont et al. 2007);
- its relatively limited applicability to large streams (Schmutz et al. 2007).


The aim of the European EFI+ project was to pursue the development and enhancement of the European fish index to overcome its current limits. It was therefore necessary to improve knowledge and, more specifically, to test certain hypotheses underlying the development of a multimetric index on a large spatial extent, which had not been done during FAME.

The objectives of this thesis

The objectives of this thesis were both theoretical and practical. The first goal was to test certain hypotheses on which the development of a MMI on a large scale is founded and to use these results and new knowledge to develop the new European fish index, EFI+. The second objective was to review the development of this index and propose possible methods to estimate the uncertainty associated with metric and index scores. The final objective was to investigate the possible consequences of global change, most particularly global warming, on fish assemblages. The goal of this part was to consider the potential consequences of changing environmental conditions for the evaluation of the ecological status of streams.

The results of this thesis are structured in three parts. The first part of the thesis included five major goals:
1. Test the MMI’s underlying hypothesis and the development of the European fish index EFI+. The first goal here was to study the representativeness of, and the relations between, the biological and ecological traits of species among European fish assemblages. The second goal of this part was to assess the functional redundancy among assemblages (P1). The EFI was based on traits, i.e. guilds that combine all species sharing the same attribute into a single variable.
2. Study the influence of the environment on the functional structure of fish assemblages, i.e. estimate the part of the variability of the fish assemblage structure explained by the environment and study how fish assemblage structure varies along environmental gradients (P1).
3. After studying the relationships between environment and assemblage structure at the European scale, test the convergence of assemblage functional responses to environmental gradients between two regions with relatively distinct species pools (the Iberian Peninsula versus France and Belgium), and thus test whether the structure of the assemblages living in these two regions varies similarly along environmental gradients. In other words, should the same statistical models be used in the two regions to predict the expected values of the metrics in a given environment? (P2).
4. Develop new metrics based both on size classes and species attributes for low-species rivers (P3).
5. Summarise the implications of these results for the development of the new European Fish Index (P4).

The second part of this thesis sought to estimate the uncertainty associated with: 1. The metric scores. 2. The index scores. One section in this part mainly focuses on the consequences of the correlation between metric scores on the estimation of the uncertainty around index scores (P5).

The third part of the results presents the effect of global change and more specifically of global warming on the assessment of the ecological status of streams. The influence of these changes was studied by evaluating the relative effects of temperature and other environmental factors on: 1. The growth of brown trout (Salmo trutta fario, Linnaeus) young of the year (P6) 2. The presence of 24 European fish species (P7).

This manuscript is organised into two sections. The first one comprises the French synthesis, based on the papers and the studies conducted as part of this thesis. The second section includes all the papers written during this thesis and the chapters that have been translated into English as well as those intended for future papers (the chapter on the EFI+ index and uncertainty).


Table 1: Summary of the different papers.

Part | Paper | Status
P1 | Functional fish assemblage structure in European rivers. | In preparation
P2 | Do Iberian and European fish faunas exhibit convergent functional structure along environmental gradients? | Journal of the North American Benthological Society, published
P3 | Development of metrics based on fish body size and species traits to assess European coldwater streams | Ecological Indicators, in press
P4 | Implication of this work in the development of the new European Fish Index, EFI+. |
P5 | Uncertainty associated with predictive multimetric indices |
P6 | Variation of brown trout Salmo trutta young-of-the-year growth along environmental gradients in Europe | Journal of Fish Biology, in press
P7 | Modelling ecological niche of fish species at the European scale: sensitivity to climate variables (temperature, run-off) and associated uncertainties | In preparation


Presentation of the data


Except for the section on the assemblage convergence (P2), which presents the data from the European FAME project, all the data used were provided by the EFI+ project. The database of the EFI+ encompasses 14,458 sites distributed throughout 15 countries. This database contains information on the environment in and around sites (climate, abiotic factors, stream structure, etc.), anthropogenic pressures (physical alterations, water quality, connectivity), the fish assemblages living in these sites (number of individuals caught per species, fish body size) and the species attributes (biological, ecological and life history traits).

A detailed presentation of this database is available on the index web site: http://efi-plus.boku.ac.at/downloads/EFI+%200044096%20Deliverable%20D2_1-2_2.pdf. A complete list of these variables is also provided in Appendix 1. Only the data used in this study are presented here.

Environment

To describe the environmental conditions characterising the sites, twelve variables were finally retained (Table 2).

Table 2: Description of the environmental variables.

Variable | Unit or modality
Mean annual temperature* | °C
Mean temperature in July* | °C
Thermal amplitude between January and July* | °C
Annual precipitation | mm
Slope | ‰
Substrate | Small, medium, large
Dominant geology of the drainage area | Calcareous, siliceous, organic
Surface of the drainage area | km²
Distance from the source | km
Hydrological regime | Glacial, nival, pluvial, karstic or groundwater dominant source
Geomorphological river type | Meandering, braided, constrained stream
Floodplain | Yes, no

*Air temperature.


These 12 variables were selected to represent the habitat conditions occurring at the different sites. Only variables with a known effect on fish assemblages and that were only slightly or not modified by human activities were selected. The goal was to describe sites based on criteria only reflecting the natural variability of the environment. For instance, stream width was not used because it could be easily modified by human activities such as channelisation. In this case, the drainage basin area and the distance from the source were used as a proxy for river size because they are closely related to stream width.

The hydromorphology of sites was summarised by two variables synthesising the following information: surface of the drainage area, distance from the source, hydrological regime, geomorphological river type and presence of a floodplain. The ordination method developed by Hill and Smith (1976) was used to analyse both categorical and quantitative variables. If only quantitative variables are used, this analysis is similar to a principal component analysis (PCA) on a correlation matrix. Conversely, if only categorical variables are used, it is similar to a multiple correspondence analysis (MCA). This method searches for the linear combinations of the variables (orthogonal axes) that explain the highest amount of inertia of a scatter of points.

The first factorial plane of the Hill and Smith analysis explains 51.9% of the total inertia: 32.9% is explained by the first axis and 19% by the second axis. The first axis (SYNGEO1) reflects a stream size gradient associated with the presence/absence of a floodplain (Figure 1a). The second axis (SYNGEO2) contrasts braided streams having a non-pluvial regime (strong glacio-nival component) with non-braided streams having a pluvial regime (Figure 1a). Whereas the first axis represents a physical gradient, the second axis clearly displays a regional component. Sweden and Finland are clearly distinguished from the other countries in the sample (Figure 1b).


Figure 1: First factorial plane of the Hill and Smith analysis: a) relations between variables, b) site ordination. Squares are located at the barycentre of the sites of each country: Germany (DE), Austria (AT), Spain (ES), Finland (FI), France (FR), Hungary (HU), Italy (IT), Lithuania (LT), Netherlands (NL), Poland (PL), Portugal (PT), Romania (RO), United Kingdom (UK), Sweden (SE) and Switzerland (CH).

Each site was also characterised by its ecoregion (Illies 1978). These ecoregions are contiguous geographical units which provide a general description of the environment surrounding a site, at a regional resolution. The ecoregions used in the EFI+ project were adapted from the ecoregions initially proposed by Illies. First, due to the relatively low number of sites, some ecoregions were grouped into larger geographical units (Figure 2). The Alps and Pyrenees ecoregions were grouped into the new Alps ecoregion; Hungarian lowlands, Eastern plains and the Pontic province were grouped into the Eastern region; the Fennoscandian shield and Borealic uplands ecoregions were grouped into the Nordic region. Second, Illies’s ecoregions do not take into account the specificity of the Mediterranean climate (Gasith and Resh 1999). To account for this specificity, a Mediterranean region was defined based on the mediterranity level 1 of the classification of Segurado et al. (2008).


Figure 2: Ecoregions adapted from Illies (1978): Alps (ALPS), Baltic province (BAL), Central highlands (C.H), Central plains (C.P), the Carpathians (CAR), Eastern region (EAST), Great Britain (ENG), Ibero-Macaronesian (IBE), Italy, Corsica and Malta (ITA), Nordic region (NOR), Western plains (W.P) and Western highlands (W.H).

Anthropogenic pressures

Five main groups of pressures were considered during the EFI+ project: hydrological, morphological, water quality, connectivity and uses that could affect fauna (navigation). In contrast to the European FAME project, which only estimated the degradation level using a five-modality classification, a much more accurate characterisation of pressures was established. For each alteration, several descriptors were considered and evaluated essentially by expert assessment. All the pressures and their modalities are described in Appendix 1B.

Characterising these pressures makes it possible to select a set of sites which are only slightly impacted or not impacted and that are considered as reference sites. The pressures taken into account and the level of alteration accepted to define the reference condition were adapted depending on the focus of the article. The criteria used for P1 and P3 were more restrictive than the criteria used for P5. The relatively high number of trait categories is a priori more easily affected by human pressures than species presence/absence. Nevertheless, each set of reference sites was selected so that the biological phenomenon studied was not limited or affected by anthropogenic pressures. Reference sites were selected to provide the most accurate representation possible of the process studied, e.g. the relationship between environmental gradients and the functional structure of the assemblage (P1). Pressures could affect various aspects of the assemblages and thus blur the biological signal or add noise to the data. Selecting reference sites based on objective criteria (Whittier et al. 2007) is one of the most important steps in the development of the indices based on the reference condition approach (e.g. Pont et al. 2006).

An index summarising the pressures was developed. It estimates the overall alteration level of the sites. It was computed using the first axis of a MCA integrating seven pressures: modification of the flow velocity due to impoundment, modification of a river section associated with channelisation, hydropeaking, toxic substances, water quality, and presence of a barrier downstream of the stream reach (at 1, 5 or 10 km depending on the surface of the drainage area upstream of the reach). The first axis of the MCA contrasts the modalities reflecting no or slight pressures or alterations (e.g. class 1 of water quality, no hydropeaking, no toxic substances) with the modalities representing a high level of stress (e.g. heavy water abstraction, extreme modification of the stream cross-section, hydropeaking). The categories of these variables are mainly ordered along this axis according to the level of stress that they represent.

The coordinates of the sites on the first MCA axis were rescaled with a min-max transformation (Legendre and Legendre 1998, Saporta 2006) into a range of values from 0 (not or slightly impacted) to 1 (heavily impacted). A five-class index was defined by partitioning these values (Figure 3) with a k-means algorithm (Hartigan and Wong 1979). For greater detail concerning the development of the pressure index, see the EFI+ deliverables 4.1 and 4.2 (Bady et al. 2009; http://efi-plus.boku.ac.at/downloads/EFI+DeliverablesD4.1andD4.2.pdf).

Figure 3: Distribution of the index values between the five classes.
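The rescaling and partitioning steps described above can be illustrated with a minimal Python sketch. It is not the EFI+ implementation: the function names are hypothetical, the site coordinates would come from the MCA (not computed here), and a simple Lloyd-style k-means in one dimension stands in for the Hartigan and Wong (1979) algorithm cited in the text.

```python
def min_max(values):
    """Rescale values linearly so that the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("all values are identical; min-max rescaling is undefined")
    return [(v - lo) / (hi - lo) for v in values]


def kmeans_1d(values, k, n_iter=100):
    """Lloyd-style k-means in one dimension (assumption: not Hartigan-Wong).

    Returns one class label per value, from 1 (lowest centre) to k (highest centre).
    """
    ordered = sorted(values)
    # Deterministic start: centres placed at evenly spaced quantiles of the data.
    centres = [ordered[int((i + 0.5) * len(ordered) / k)] for i in range(k)]
    for _ in range(n_iter):
        groups = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda j: abs(v - centres[j]))
            groups[nearest].append(v)
        # An empty group keeps its previous centre.
        updated = [sum(g) / len(g) if g else centres[j] for j, g in enumerate(groups)]
        if updated == centres:
            break
        centres = updated
    # Relabel the clusters so that class 1 has the lowest centre.
    order = sorted(range(k), key=lambda j: centres[j])
    rank = {j: r + 1 for r, j in enumerate(order)}
    return [rank[min(range(k), key=lambda j: abs(v - centres[j]))] for v in values]


# Hypothetical site coordinates on the first MCA axis (illustrative values only).
axis_coordinates = [-1.0, -0.9, -0.5, -0.45, 0.0, 0.05, 0.5, 0.55, 1.0, 1.1]
pressure_index = min_max(axis_coordinates)   # 0 = not or slightly impacted, 1 = heavily impacted
pressure_class = kmeans_1d(pressure_index, k=5)
```

With real data, the class boundaries depend on the distribution of the rescaled coordinates, so the five classes are generally not of equal width.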


Faunistic data

All sites were sampled between 1955 and 2007 (essentially after 1995) using electrofishing techniques. They were sampled either by wading or by boat depending on the river depth. All individuals caught during the different passes (between one and five) were identified to the species level. A total of 161 species and 7,686,350 individuals were caught, and 6,309,639 fish were measured (82% of the individuals). To use homogeneous data, only fish collected during the first pass were considered. Moreover, to limit temporal autocorrelation, only one date per site (one fishing occasion) was randomly selected.
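As an illustration of the two filtering rules above (first pass only, one randomly selected fishing occasion per site), a minimal Python sketch follows. The record layout (keys `site`, `date`, `pass_no`, `species`, `n`) and the function name are assumptions made for the example; they do not reflect the actual structure of the EFI+ database.

```python
import random


def select_first_pass_single_occasion(catches, seed=0):
    """Keep first-pass records from one randomly chosen fishing occasion per site.

    `catches` is a list of dicts with the hypothetical keys
    'site', 'date', 'pass_no', 'species' and 'n'.
    """
    rng = random.Random(seed)  # fixed seed so that the selection is reproducible
    first_pass = [c for c in catches if c["pass_no"] == 1]

    # Collect the distinct fishing dates available for each site.
    dates_by_site = {}
    for c in first_pass:
        dates_by_site.setdefault(c["site"], set()).add(c["date"])

    # Draw one date per site at random.
    chosen = {site: rng.choice(sorted(dates)) for site, dates in dates_by_site.items()}
    return [c for c in first_pass if c["date"] == chosen[c["site"]]]
```

Drawing the date at the site level, rather than filtering record by record, keeps all species caught on the selected occasion together.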

Biological and ecological species traits

Among the 24 traits considered in the EFI+ project, 15 ecological and biological traits (quantitative) and nine life history traits (qualitative), I only focused on the 11 following biological and ecological species attributes: - tolerance to temperature; - tolerance to oxygen; - tolerance to habitat degradation; - trophic regime; - feeding habitat; - affinity to flow velocity (habitat); - spawning substrate; - reproduction; - reproductive behaviour; - parental care; - migration behaviour.

This classification is based on the revision of a previous classification of biological and ecological traits of European fish species (Noble et al. 2007). Each trait was divided into several categories (Table 3), and only one category per trait was attributed to each species by expert judgment. If more than 50% of the experts agreed on one category, this category was assigned to the species. In contrast to the approach used for benthic macroinvertebrates (Usseglio-Polatera et al. 2000), a species could not be assigned to several categories of the same trait, and only the attributes at the adult stage were considered.
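The majority rule described above can be written in a few lines of Python. This is only a sketch: the function name is hypothetical, and returning `None` when no category gathers more than 50% of the votes is an assumption, since the text does not say how such cases were resolved.

```python
from collections import Counter


def assign_category(votes):
    """Return the category chosen by more than 50% of the experts, else None.

    `votes` is a list with one category code per expert, e.g. ["RH", "RH", "EURY"].
    """
    if not votes:
        return None
    category, count = Counter(votes).most_common(1)[0]
    # Strict majority: exactly 50% is not enough.
    return category if count > len(votes) / 2 else None
```

For example, two votes for RH and one for EURY yield RH, whereas a one-to-one split yields no assignment under this assumption.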


Table 3: Description of trait modalities for the 11 biological and ecological traits of the species.

Temperature tolerance
- Eurythermal (EUTHER): species able to withstand a wide range of temperature
- Stenothermal (STTHER): species able to withstand a narrow range of temperature

Tolerance to oxygen
- Intolerant (O2INTOL): species requiring more than 6 mg of oxygen per litre
- Intermediate (O2IM): species relatively tolerant to low oxygen concentration
- Tolerant (O2TOL): species able to live in water with less than 3 mg.L⁻¹

Tolerance to habitat degradation
- Intolerant (HINTOL): species that cannot compensate for any degradation of their habitat
- Intermediate (HIM): species showing an intermediate tolerance of habitat degradation
- Tolerant (HTOL): species that are not very sensitive to degradation of their habitat

Adult trophic guild
- Detritivorous (DETR): adult diet composed of a high proportion of detritus
- Herbivorous (HERB): adult diet composed of at least 75% plant material
- Insectivorous (INSV): adult diet composed of at least 75% insects
- Omnivorous (OMNI): adult diet composed of more than 25% plant material and more than 25% animal material
- Parasitic (PARA): adults exhibiting a parasitic feeding mode
- Piscivorous (PISC): adult diet composed of more than 75% fish
- Planktivorous (PLAN): adult diet composed of more than 75% phytoplankton or zooplankton

Feeding habitat
- Benthic (B): species preferring to live near the bottom, from where they feed
- Water column (WC): species that live and feed in the water column

Affinity to flow velocity (habitat)
- Limnophilic (LIMNO): species preferring to live in slow-flowing to stagnant conditions
- Rheophilic (RH): species preferring to live in high-flow conditions
- Eurytopic (EURY): species with a wide tolerance to flow conditions

Spawning habitat
- LIPAR: species preferring to spawn in stagnant water
- RHPAR: species preferring to spawn in running waters
- EUPAR: species without clear spawning preferences

Reproduction
- Ariadnophilic (ARIAD): species specialised in nest building, a behaviour very often associated with parental care
- Lithopelagophilic (LIPE): species spawning on rocks and gravels with pelagic free embryos
- Lithophilic (LITH): species spawning exclusively on gravel, rocks, stones, rubble or pebbles and with photophobic hatchlings
- Ostracophilic (OSTRA): species spawning in bivalve molluscs
- Pelagophilic (PELA): species spawning in the pelagic zone
- Phyto-lithophilic (PHLI): species depositing their eggs in clear water habitats on submerged plants or on other submerged items such as logs, gravel and rocks, and whose larvae are photophobic
- Phytophilic (PHYT): species depositing their eggs in clear water habitats on submerged plants
- Psammophilic (PSAM): species spawning on roots or grass above sandy bottom or on the sand itself
- Speleophilic (SPEL): species spawning in interstitial spaces, crevices or caves

Reproductive behaviour
- Single (SIN): species with a single spawning event during the reproductive season
- Fractional (FR): species which either spawn repeatedly in a season or with different components of their populations spawning at different times
- Protracted (PRO): species spawning over a long period during the reproductive season

Parental care
- PROT: species presenting egg or larval life stages with protection
- NOP: species with no protection for early life stages

Migration behaviour
- Resident (RESID): species moving over small areas within a particular river segment
- Potamodromous (POTAD): species migrating within the inland waters of a river
- Anadromous (LMA): species living as older juveniles and sub-adults in the sea and migrating up rivers to spawn at maturity
- Catadromous (LMC): species with early life stages living in fresh water and migrating down rivers to spawn in the sea at maturity

Data set used

After selecting sites with all the information available – environment, pressure, fauna (abundance per species) – the final data set encompasses 9,948 sites (Figure 4) sampled between 1974 and 2007 (87% of the sites after 1995). One hundred and forty-seven species were caught. Fish body sizes were available for 3,436 fishing occasions accounting for 727,976 individuals.


Transition to P1

The first step of the development of a MMI is to select metrics which are representative of the region of interest (Karr 1991, Angermeier et al. 2000). Consequently, the set of candidate metrics for the index depends on the spatial range of this region as well as on the diversity of its species pool. Ideally, the selected metrics should enable the assessment of assemblage conditions whatever the environmental conditions in which the assemblages live. The diversity of the stream environments observed in the various European regions (Koster 2005, Tockner et al. 2009) was the first challenge to the development of the European fish index. The metrics selected should be useful in very diverse physical and climatic conditions. They should be able to assess the status of streams as diverse as Mediterranean streams, alpine streams, coastal streams, cold Scandinavian streams, etc. The second major challenge to the development of the European fish index stemmed from the diversity of the European fish fauna. This diversity is expressed both by the number of species occurring in Europe (more than 500; Kottelat and Freyhof 2007) and by the regional heterogeneity of the species pool (Reyjol et al. 2007). By grouping the major hydrographic catchments based on the similarity of their fauna (list of species), Reyjol et al. (2007) defined seven ichthyoregions. The species composition and the richness of these regions are highly variable and depend on the distribution areas of the species. The species niche (Hutchinson 1957, Pont et al. 2005), the spatial distribution of the environment (Huet 1954, Grenouillet et al. 2004), the geography (Hewitt 2004) and the three evolutionary processes (speciation, extinction and migration; Stearns 1992) are the major natural factors responsible for the current distribution of the species. A species can only occur in a region if the environmental conditions suitable to its development are observed (Hutchinson 1957). However, at the continental scale a key environmental factor such as temperature varies with latitude, altitude and continentality (Ward 1985, Figure 1.5A of Tockner et al. 2009).

The formation of the hydrographic basins and the last glaciation are the two most recent geological events that have contributed the most to the current distribution of the species. The geographical isolation of populations up to the formation of the hydrographic basins has favoured allopatric speciation processes. For instance, in the Iberian Peninsula some species belonging to the same genus (e.g. Luciobarbus, Kottelat and Freyhof 2007) occur in different watersheds. Moreover, the limits of the watersheds represent impassable geographical boundaries for fish. Each catchment could be considered a biogeographical island (Hugueny 1989). Only anadromous species (Table 3) or species able to tolerate brackish waters can migrate from one catchment to another. The last glacial period, which occurred during the Pleistocene (from 100,000 to 10,100 BCE), is a key period for the current species distribution (Hewitt 1999, 2000, 2004, Kontula and Vainola 2001, Koskinen et al. 2002, Griffiths 2006). The considerable areas covered by the ice sheets in Europe (Banarescu 1992, Keith 1998, Clark et al. 2009) led to massive extinctions. During the last glacial maximum, the ice sheet covered the entire north of Europe, Iceland and the British Isles, extending to Germany and Poland. Only the most southern regions of Europe, such as the Danubian catchment and the Ponto-Caspian region, constituted refugia for fish fauna (Banarescu 1992, Hewitt 1999, 2004). Periods of ice sheet expansion correspond to periods of contraction of the species distribution areas (Qian and Ricklefs 2000). In contrast, the periods of ice retreat correspond to periods of expansion of the species distribution areas (Hewitt 2000). Areas which were covered with ice were re-colonised by the species that were living in the refugia (principally in the Ponto-Caspian region) through different migratory pathways (Banarescu 1989, 1992, Hewitt 1999, 2000, 2004, Nesbo et al. 1999, Kontula and Vainola 2001, Koskinen et al. 2002, Griffiths 2006). The Danubian catchment was the major source of dispersal for re-colonisation. Moreover, huge proglacial lakes formed as the glaciers retreated. These lakes enabled connections between adjacent catchments (Banarescu 1992, Griffiths 2006). Cold-water generalist species with high dispersal capacity were the best suited to re-colonising the glaciated areas (Griffiths 2006). Mountain blocks constituted geographical barriers that isolated the Southern European regions (e.g. the Iberian Peninsula) and prevented any re-colonisation from these areas (Hewitt 2000, 2004).

The most visible consequences of these past events on the current European fish fauna are a latitudinal gradient of species richness (Griffiths 2006) and a higher number of endemic species in the unglaciated areas (Bianco 1995, Reyjol et al. 2007). Western Europe also displays a low number of endemic species and a fauna composed of species very often originating from Central Europe (Reyjol et al. 2007). Consequently, at a large spatial extent, the pattern derived from the analysis of the assemblage species composition mainly reflects the role played by geographical and historical factors in the current species distribution (Van Sickle and Hughes 2000, Heino 2005, Bremner et al. 2006a, Hoeinghaus et al. 2007). Using metrics based on assemblage composition (species presence/absence, species relative abundance; Alcorlo et al. 2006, Pollard and Yuan 2009) to develop an index over a large spatial extent therefore does not appear relevant: the spatial extent over which a metric based on species composition or identity can be used is limited to the distribution area of the species.

Using metrics grouping species sharing the same attribute into one variable (Usseglio-Polatera et al. 2000, Melville et al. 2006, Hoeinghaus et al. 2007, Noble et al. 2007) makes it possible to compare assemblages composed of different species pools (Moyle and Herbold 1987, Wiens 1991, Kelt et al. 1996, Smith and Ganzhorn 1996, Reich et al. 1997, Statzner et al. 1997, Winemiller and Adite 1997, Bellwood et al. 2002, Lamouroux et al. 2002, Vila-Gispert et al. 2002, Goldstein and Meador 2004, Statzner et al. 2004, Melville et al. 2006, Bonada et al. 2007, Hoeinghaus et al. 2007, Irz et al. 2007, Ibañez et al. 2009). Therefore, in the EFI+ project only metrics based on the biological or ecological traits of species were considered. In addition to their wide spatial applicability, such traits have been recommended (Statzner et al. 2001a, Bonada et al. 2006) and successfully used in bioindication studies, especially over large spatial extents (Pont et al. 2006, Pont et al. 2007, Pont et al. 2009).
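To make the idea of a trait-based metric concrete, the following minimal Python sketch computes, for one assemblage, the richness and relative abundance of the species belonging to a given trait category. The function name, the species labels and their trait assignments are hypothetical placeholders, not values from the EFI+ database.

```python
def trait_metrics(abundances, trait_of, category):
    """Richness and relative abundance of the species assigned to `category`.

    `abundances` maps species to the number of individuals caught at a site;
    `trait_of` maps species to their category for one trait (e.g. flow affinity).
    """
    total = sum(abundances.values())
    members = [sp for sp, n in abundances.items()
               if n > 0 and trait_of.get(sp) == category]
    n_members = sum(abundances[sp] for sp in members)
    return {
        "richness": len(members),
        "relative_abundance": n_members / total if total else 0.0,
    }


# Two assemblages with no species in common (placeholder names and traits).
flow_affinity = {"sp_a": "RH", "sp_b": "LIMNO", "sp_c": "RH", "sp_d": "EURY"}
site_1 = {"sp_a": 30, "sp_b": 10}
site_2 = {"sp_c": 45, "sp_d": 15}
metric_1 = trait_metrics(site_1, flow_affinity, "RH")
metric_2 = trait_metrics(site_2, flow_affinity, "RH")
```

Although the two sites share no species, both metrics are expressed on the same scale and can be compared directly, which is the property exploited for indices covering several species pools.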


P1: Functional fish assemblage structure in European rivers. In preparation.


Abstract

The theory of traits (life history, ecological and biological traits) states that a species’ characteristics should enable its persistence and development in given environmental conditions. If habitat is the major factor controlling assemblage functional structure (AFS), species with similar attributes are expected to inhabit a similar environment. This study examines the relations between trait categories among 849 European riverine fish assemblages and tests the relative influence of environmental and regional factors on AFS. Two main strategies were observed within assemblages: (1) assemblages dominated by stenothermal intolerant individuals and (2) assemblages dominated by eurythermal, eurytopic and tolerant individuals. Environment appears to be a major factor influencing AFS, while the influence of region appears limited. AFS varied mainly along a longitudinal stream gradient and along an altitudinal and latitudinal temperature gradient. The implications of these results for the development of bio-indicators in Europe are also discussed.


Introduction

The theory of life history states that natural selection designed co-adapted traits to solve particular ecological problems (Stearns 1992). The set of traits (strategies) and features of a species (morphology, behaviour, demography) would allow it to develop and persist in a given environment.

Two theories relate habitat and species characteristics. The “habitat templet”

(Southwood 1977, 1988) states that the spatio-temporal variability of habitat drives the selection of species traits. If local environment is the major selective force upon species traits, a similar local environment would select species with a similar set of strategies. The

“landscape filters” (Poff 1997, Tonn et al. 1990) theory states that local communities are the results of different hierarchical filters acting at various spatial scales (e.g. from continent to microhabitat in streams). The presence and abundance of a species in an assemblage depends on its suite of traits given that only adapted traits enable species to pass through filter sieves

(Keddy 1992, Poff 1997, Tonn et al. 1990). If habitat is the major factor controlling assemblage functional structure (AFS), traits are expected to be under-dispersed because species living in a similar environment would present similar attributes (Cornwell et al. 2006,

Diaz et al. 1998, Keddy 1992, Weiher and Keddy 1999, Weiher and Keddy 1995).

Consequently, assemblages living under similar environmental conditions should display convergent AFS (Orians and Paine 1983). This hypothesis has led to an increasing number of comparisons between geographically and phylogenetically distant assemblages (Ibañez et al.

2009, Kelt et al. 1996, Lamouroux et al. 2002, Wiens 1991). In contrast, if species interaction

(competition) is the major factor controlling AFS trait, over-dispersion is assumed to occur

(Weiher and Keddy 1995), because species similarity would limit the amount of resources available for each species (MacArthur and Levins 1967). Several studies have supported the action of each mechanism on AFS and even of both mechanism conjointly (Cornwell and

158 P1: Functional fish assemblage structure in European rivers.

Ackerly 2009). Nevertheless, phylogenetic constraints (Vitt and Pianka 2005), trade-offs (Poff et al. 2006, Stearns 1992, Townsend and Hildrew 1994) and different evolutionary pathways to cope with the same environment (Southwood 1988) may reduce our ability to identify clear-cut strategies.

In addition to their relation to community ecological concepts, trait-based approaches present several advantages over species-based approaches. First, establishing classifications of species based on their attributes rather than their taxonomy is assumed to relate species directly or indirectly to ecosystem functioning (Lavorel and Garnier 2002). Second, aggregating species on their life history (Hoeinghaus et al. 2007) and their morphological (Melville et al. 2006), ecological or biological (Usseglio-Polatera et al. 2000) characteristics makes it possible to compare systems and assemblages composed of different species pools (Kelt et al. 1996, Lamouroux et al. 2002), whereas patterns derived from taxonomic composition may first reflect the role played by geographical and historical factors in current species distribution (Hoeinghaus et al. 2007).

In addition, functional metrics based on biological or ecological characteristics have been recommended and successfully used in bio-indication (Bonada et al. 2006, Karr 1981, Statzner et al. 2001) to assess the ecological status of the system of interest (in particular for lotic hydrosystems). Whereas using metrics based on taxonomy limited the spatial range in which indices could be used due to regional specificity (the set of metrics has to be adapted for each region), using functional metrics has enabled the development of multimetric indices on a large spatial extent (Pont et al. 2009, Pont et al. 2006).

Aquatic habitats offer many environmental gradients along which species could develop adaptive strategies, and traits have been defined for almost all aquatic biocenosis components (e.g. Barnett et al. 2007, Usseglio-Polatera et al. 2000, Willby et al. 2000, Winemiller and Rose 1992). Studies investigating the relations between traits and environment mainly focussed on the spatio-temporal variability of the environment (Statzner et al. 1994, Townsend et al. 1997), hydraulic constraints (Horrigan and Baird 2008, Lamouroux et al. 2002, Poff and Allan 1995) and food resource availability (Acolas et al. 2008). Several studies have compared freshwater fish communities with regard to their composition in biological traits, aiming to find general patterns such as community convergence (Irz et al. 2007, Lamouroux et al. 2002), trait–environment associations (Goldstein and Meador 2004, Ibañez et al. 2007, Tedesco et al. 2008) or trait dispersion within assemblages (Mason et al. 2007). Nevertheless, at a large spatial extent, only very few studies have addressed the question of the link between trait categories (strategies) among assemblages and the influence of the environment and region on AFS (e.g. Statzner et al. 2004), especially for fish assemblages.

The aim of the present paper is to analyze and quantify: (1) the link between trait categories among European riverine fish assemblages, (2) the influence of the environment on the AFS, and (3) the influence of large-scale ecological filters such as ecoregions on AFS before and after taking into account the influence of environmental conditions. For this purpose, we used four data sets describing the ecological and biological traits of European fish species, the fauna of the sampling sites, the environmental conditions of these sites, and the site location at the regional scale.

Methods

Sampling site selection

We used data from fish surveys of 13 European countries conducted by several laboratories and government environmental agencies (1981–2007). The sites were sampled by electrofishing during the low-flow period, either by wading (85 % of sites) or by boat, depending on the river depth. All sites were located in permanent streams.


We only considered sites for which at least two species were caught to avoid working on monospecific assemblages.

Only sites that were not, or only slightly, influenced by human pressures at the reach resolution were selected, in order to reduce the bias due to human-induced modifications of fish community structure (Pont et al. 2006): good water quality, no or few modifications of the river cross-section, river channel and water flow, no impoundment, no or few alterations of the river banks and bottom habitat, and no major alteration of connectivity.

For greater detail see http://efi-plus.boku.ac.at/index.htm.

The network structure of streams and the ability of fish to move within this network should limit the independence between the sites selected. To reduce potential bias due to spatial autocorrelation in our data set, the minimum distance between sites belonging to the same catchment was 10 km (only a few species can migrate over such a distance). Once these sites were selected, some geographical areas remained over-represented in our data set. To reduce their influence in our analyses, a maximum of 30 sites per spatial unit (two degrees of latitude and longitude) was retained by random selection. Finally, our data set consisted of 849 sampling sites distributed throughout Europe (Fig. 1).
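The two thinning steps described above can be illustrated with a minimal Python sketch. It is not the procedure used for this data set (the analyses were run in R): the function and field names (`thin_sites`, `catchment`, `lat`, `lon`) are hypothetical, the distance is assumed to be the great-circle distance rather than a distance along the river network, and the within-catchment filter is assumed to be greedy in the order the sites are listed.

```python
import math
import random
from collections import defaultdict


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km (assumption: not a river-network distance)."""
    r = 6371.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def thin_sites(sites, min_km=10.0, cell_deg=2.0, max_per_cell=30, seed=0):
    """Keep sites >= min_km apart within a catchment, then cap each grid cell."""
    rng = random.Random(seed)

    # Step 1: greedy filter on the minimum distance within each catchment.
    kept_by_catchment = defaultdict(list)
    for s in sites:
        kept = kept_by_catchment[s["catchment"]]
        if all(haversine_km(s["lat"], s["lon"], k["lat"], k["lon"]) >= min_km for k in kept):
            kept.append(s)

    # Step 2: group the remaining sites by cell_deg x cell_deg spatial unit.
    cells = defaultdict(list)
    for kept in kept_by_catchment.values():
        for s in kept:
            key = (math.floor(s["lat"] / cell_deg), math.floor(s["lon"] / cell_deg))
            cells[key].append(s)

    # Step 3: random selection of at most max_per_cell sites per spatial unit.
    selected = []
    for members in cells.values():
        if len(members) > max_per_cell:
            members = rng.sample(members, max_per_cell)
        selected.extend(members)
    return selected
```

A different site ordering or random seed would retain a different subset, so the seed should be recorded if the selection has to be reproduced.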

Environmental variables

The local environment (site resolution) was characterised by six environmental variables. River slope (SLOP; 0–194.5 m·km⁻¹, log-transformed), July mean air temperature (TJULY; 11.9–25.1°C) and thermal amplitude between July and January (TDIF; 8.6–29°C) are fundamental descriptors of river habitat at the reach resolution (Pont et al. 2005). We also considered bottom sediment structure, simplified into three classes so that comparable information could be obtained for such a large data set (SED: small (sand, silt), medium (cobble, pebble) and large (rock, block)).


Hydro-geomorphologic functioning also includes key processes structuring river habitats and their associated fish assemblages. Their effect on rivers can be described by the river size (distance to source and upstream drainage area), the hydrological regime (pluvial-dominated vs glacial-nival), the geomorphologic types (meandering, braided, constrained; Petts and Amoros 1996), and the presence of a floodplain (Junk et al. 1989, Poff and Allan 1995).

To reduce the number of variables and the multicollinearity between them, these quantitative and qualitative variables were summarised in two synthetic independent variables, using an appropriate multivariate analysis (Hill and Smith 1976): the first factorial axis (SYNGEO1) mainly describes the river size gradient and the common occurrence of a floodplain in the downstream part, and the second (SYNGEO2) the opposition between meandering rivers with a pluvial regime and others.

Regional units

We used the ecoregions defined by Illies (1978) as regional units. Illies’s ecoregions are contiguous geographical units which give a description of the general type of environment around a site at the regional level.

Ecological and biological traits

A previous classification of biological and ecological traits of European fish species (Noble et al. 2007) was revised and completed during the European EFI+ project (http://efi-plus.boku.ac.at/). The eleven biological and ecological traits retained cover the main features of the life history strategies of species, their trophic positions in the food webs, their affinity to several habitat characteristics and their sensitivity to water quality and habitat alteration (Table 1). For each trait, each species was assigned to one of the different categories (Appendix). All 88 species represented in our sampling site data set are described, providing an accurate description of the functional structure of our fish assemblages (11 traits, 40 categories).


Statistical analysis

Four initial data sets were available: species abundance per site (849 sites × 88 species), trait description per species (88 species × 40 categories), environmental description (849 sites × 6 descriptors) and site assignment by region. They were used independently or in combination for the statistical analyses (Fig. 2).

As our main objective was to compare the functional structure of fish communities between sites, we combined the first two data sets by summing for each trait-category the relative abundance of all species characterised by this trait-category within a given site:

$$T_{ij} = \sum_{p \,\in\, \text{species}} a_p A_p ,$$

where $T_{ij}$ is the proportion of the $j$th category of the $i$th trait in a given sampling site, $A_p$ is the relative abundance of species $p$ and $a_p$ the affinity of species $p$ to the considered trait category (1 if the species displays this category, 0 otherwise).
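As an illustration of this computation, the following Python sketch builds the site × category table from an abundance table and a 0/1 affinity table. The function name and the toy arrays are assumptions made for the example; they are not taken from the study data.

```python
import numpy as np


def trait_category_proportions(abundance, affinity):
    """Return the sites x categories table of proportions T.

    abundance: sites x species counts (each site must have at least one fish).
    affinity:  species x categories table of 0/1 affinities (a_p).
    """
    abundance = np.asarray(abundance, dtype=float)
    affinity = np.asarray(affinity, dtype=float)
    # Relative abundance A_p of each species within each site.
    relative = abundance / abundance.sum(axis=1, keepdims=True)
    # Sum of a_p * A_p over species, for every site and category.
    return relative @ affinity


# Toy example: 2 sites, 3 species, one trait with 2 categories.
abundance = [[10, 30, 60],
             [5, 5, 0]]
affinity = [[1, 0],   # species 1 displays category 1
            [0, 1],   # species 2 displays category 2
            [1, 0]]   # species 3 displays category 1
T = trait_category_proportions(abundance, affinity)
```

Because each species is assigned to exactly one category per trait, the proportions of the categories of a given trait sum to 1 within each site.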

The link between trait categories in fish assemblages was analysed by applying a fuzzy principal component analysis (FPCA; Chevenet et al. 1994) to a matrix structured by trait. In fact, the sum of categories per trait (Tij) is always equal to 1 (for more details see http://pbil.univ-lyon1.fr/ADE-4/).

To investigate the influence of the environment on the functional assemblage structure, we used a principal component analysis with respect to instrumental variables (PCAIV; Baty et al. 2006, Dray et al. 2003, Lebreton et al. 1991, Rao 1964, Ter Braak 1988). Each category was related to the environmental data set with a multiple linear regression and the predicted category values were analysed within a PCA including the row and column weights of the FPCA. The influence of the environment was tested using a Monte-Carlo test (Manly 1997) on the explained inertia. Rows of the environmental table were permuted 999 times, defining an empirical distribution of the explained inertia. The p-value was obtained by comparing the explained inertia of the PCAIV with the empirical distribution.
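The logic of this analysis can be sketched in a few lines of Python. This is a simplified illustration, not the ade4 implementation used in the study: it assumes uniform row and column weights instead of the FPCA weights, a fully numeric environmental table `X` (a qualitative variable such as SED would first have to be dummy-coded), and hypothetical function names.

```python
import numpy as np


def explained_inertia(Y, X):
    """Share of the inertia of Y reproduced by linear regression on X."""
    Yc = Y - Y.mean(axis=0)                       # centre each category
    Xd = np.column_stack([np.ones(len(X)), X])    # add an intercept
    B, *_ = np.linalg.lstsq(Xd, Yc, rcond=None)   # one regression per category
    Y_hat = Xd @ B                                # predicted category values
    return (Y_hat ** 2).sum() / (Yc ** 2).sum(), Y_hat


def pcaiv(Y, X, n_perm=999, seed=0):
    """Explained inertia, inertia share per axis and permutation p-value."""
    rng = np.random.default_rng(seed)
    observed, Y_hat = explained_inertia(Y, X)
    # PCA of the predicted table: squared singular values are axis inertias.
    _, s, _ = np.linalg.svd(Y_hat, full_matrices=False)
    axis_share = s ** 2 / (s ** 2).sum()
    # Monte-Carlo test: permute the rows of the environmental table.
    sims = np.array([
        explained_inertia(Y, X[rng.permutation(len(X))])[0]
        for _ in range(n_perm)
    ])
    p_value = (np.sum(sims >= observed) + 1) / (n_perm + 1)
    return observed, axis_share, p_value
```

With 999 permutations the smallest attainable p-value under this convention is 0.001, reached when no permuted table explains as much inertia as the observed one.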


The significance of a regional effect on AFS was tested before and after taking into account the environmental influence using a Monte-Carlo procedure based on the between-region inertia percentage (Dolédec and Chessel 1987, Romesburg 1985).
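A between-region inertia test of this kind can be sketched as follows in Python. The sketch again assumes uniform weights and hypothetical function names. Taking the environment "into account" is represented here by running the same test on the residuals of the regressions on the environmental variables, which is an assumption about the procedure and not a description of the ade4 calls used.

```python
import numpy as np


def between_region_inertia(Y, regions):
    """Share of the total inertia of Y carried by the region means."""
    regions = np.asarray(regions)
    Yc = Y - Y.mean(axis=0)
    total = (Yc ** 2).sum()
    between = 0.0
    for r in np.unique(regions):
        block = Yc[regions == r]
        # Inertia of the region centroid, weighted by the number of sites.
        between += len(block) * (block.mean(axis=0) ** 2).sum()
    return between / total


def regional_test(Y, regions, n_perm=999, seed=0):
    """Observed between-region share and its permutation p-value."""
    rng = np.random.default_rng(seed)
    regions = np.asarray(regions)
    observed = between_region_inertia(Y, regions)
    sims = np.array([
        between_region_inertia(Y, rng.permutation(regions))
        for _ in range(n_perm)
    ])
    p_value = (np.sum(sims >= observed) + 1) / (n_perm + 1)
    return observed, p_value


def residuals_after_environment(Y, X):
    """Part of Y not predicted by a linear regression on X (assumed step)."""
    Yc = Y - Y.mean(axis=0)
    Xd = np.column_stack([np.ones(len(X)), X])
    B, *_ = np.linalg.lstsq(Xd, Yc, rcond=None)
    return Yc - Xd @ B
```

Calling `regional_test(Y, regions)` and then `regional_test(residuals_after_environment(Y, X), regions)` gives the "before" and "after" versions of the test under these assumptions.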

All statistical analyses and sub-sampling procedures were performed using the statistical software R v2.8.1 (R Development Core Team 2008) and in particular the ade4 library (Chessel et al. 2004) for the multivariate analyses.

Results

Functional traits

Eighty-eight species distributed among 15 families were recorded in our samples

(Table 1). The most represented family in the guild matrix was the Cyprinidae with 49 species, whereas the Acipenseridae, Anguillidae, Esocidae, Lotidae, Nemacheilidae,

Pleuronectidae and Siluridae were represented by only one species. The Cobitidae, Cottidae,

Gasterosteidae, , Percidae, Petromyzontidae and Salmonidae comprised between two and eight species.

The Cyprinidae exhibited the widest range of strategies, with at least one species of this family represented in all trait categories except for migration (Table 2). Cyprinids were only resident or potamodromous. Salmonidae and Cobitidae exhibited very close strategies; they are all stenothermal, oxygen- and habitat-intolerant, rheophilous with a single spawning event during the reproductive season (Table 2).

Assemblage functional structure

The histogram of eigenvalues of the FPCA indicated that the composition in traits of fish assemblages was highly structured (Fig. 3b). The first factorial plane explained about two-thirds of the total inertia (F1: 48.1 %, F2: 15.3 %). The F1 axis displayed a strong association between traits reflecting tolerance, affinity to flow velocity and diet. The categories STTHER, O2INTOL and HINTOL were highly correlated (Fig. 3a) and associated with RH, RHPAR and INSEV in the left part of the F1 axis. On the other hand, the categories associated with tolerance such as EUTHER, O2IM, HTOL, EURY, EUPAR and OMNI were located in the opposite part of the F1 axis (Fig. 3a).

The second axis distinguished fish assemblages regarding spawning habitat and migratory and reproductive behaviour. Fish assemblages characterised by potamodromous individuals spawning in running waters only once a year were contrasted with fish assemblages with resident individuals spawning repeatedly in a season but without spawning habitat preferences. Some categories seemed to be specific to particular assemblages. O2TOL, OMNI and PHLI occurred in fewer than 50 % of the fish assemblages, whereas all other categories were present in at least two-thirds of the assemblages.

Environmental influence

The PCAIV significantly explained 30.5 % of the inertia of the functional assemblage structures (Monte-Carlo permutation test, p < 0.001), suggesting a strong, consistent influence of the environment on the trait composition of fish assemblages. The strong influence of the environment was mostly reflected by the first axis of the PCAIV (83.2 % of the inertia explained by the analysis; Fig. 4b). The relation between the FPCA F1 axis and the PCAIV F1 axis was strong (Fig. 4c). The patterns displayed by the scores of the predictions of the trait categories by the environment (Fig. 4a) were quite consistent with the patterns observed in fish assemblages (Fig. 3a). The most important explanatory variables were TJULY, SYNGEO1 and SLOP (Fig. 4d). They were highly correlated with the F1 axis (respectively, r = 0.728, 0.645 and 0.508, Fig. 4d) but TJULY was only weakly correlated with the two others (r = 0.127 and 0.036).

In contrast, the environmental variables considered seemed to influence only slightly the reproduction and migration behaviour gradients observed along the F2 axis of the FPCA (Fig. 4d).


Regional effect

Illies’s ecoregions explained 29.4 % of the inertia of fish assemblage structure before taking into account the environment and only 7.1 % after. Half of the remaining inertia was accounted for by the Mediterranean region (1.41 %), the Western plains (1.36 %) and the Carpathians (0.77 %).

Discussion

Our results highlight the organisation of European fish AFS in two main strategies mainly related to two environmental gradients: river size and temperature. In addition, the influence of Illies’s ecoregions on fish AFS was limited once the effect of the environment was taken into account.

Relations between ecological and biological characteristics

The high percentage of inertia explained by the first factorial plane of the FPCA (66 %) highlights strong links between several species trait categories among European fish assemblages. AFS appears to be distributed along a gradient opposing two main strategies: (1) assemblages dominated by stenothermal individuals (STTHER) requiring a high concentration of dissolved oxygen (O2INTOL) and intolerant to habitat degradation (HINTOL), and (2) assemblages dominated by eurytopic (EURY), eurythermal (EUTHER) individuals tolerant to habitat degradation (HTOL) and tolerant to lower oxygen concentrations (O2IM). To a lesser extent, the first axis also represents a trophic gradient, contrasting insectivorous (INSV) and omnivorous species (OMNI), and a gradient of affinity to flow velocity with rheophilic (RH) contrasted with eurytopic species (EURY). Consequently, several traits appear to be correlated, finally showing only a few strategies within European fish assemblages. Even if the strategies of European fish species have been previously studied (Blanck et al. 2007, Vila-Gispert et al. 2002), to our knowledge, this is the first time that such strategies among fish assemblages are described at the European scale based on more than hydrological variables alone.

The second axis of the FPCA reflects a reproduction gradient. It discriminates assemblages with single reproduction in rheophilic areas from assemblages with fractional spawning in diverse spawning habitats and resident behaviour. These results should be taken with caution due to the lower inertia explained by the second axis, which could reflect noise in the data instead of relevant ecological patterns.

Relation between traits among European fish assemblages and environmental gradients

The environment explains a large share of the variability in AFS, in agreement with previous studies (Bremner et al. 2003, Hoeinghaus et al. 2007, Pont et al. 2006). Two major environmental gradients shape the European fish AFS: the longitudinal gradient of streams (represented by stream size and slope) and the thermal gradient.

The influence of slope on fish communities is well known (Huet 1954). The proportion of rheophilous and eurytopic individuals varies along the slope gradient. As slope increases, the proportion of rheophilic individuals also increases (Oberdorff et al. 2002, Pont et al. 2007), while the proportion of eurytopic individuals decreases.

A high concentration of dissolved oxygen in streams is associated with turbulent flow conditions and low temperature, which is generally the case for small rivers in temperate countries. As temperature increases, the concentration of dissolved oxygen decreases and habitat becomes unfavourable for species with a high physiological oxygen demand such as salmonids (Crisp 2000, Elliott 1994). Correspondingly, the O2INTOL category was inversely related to TJULY. It also appeared that nine species recorded in our samples were classified as O2INTOL, HINTOL and STTHER, such as the brown trout (Salmo trutta, Linnaeus), the bullhead (Cottus gobio, Linnaeus) and the minnow (Phoxinus phoxinus, Linnaeus). They are typical of rhithronic fish assemblages (Matthews 1998), which prefer fast-flowing, cold and well-oxygenated waters (Blanck et al. 2007). The positive relation of these categories with SLOP and SYNGEO1 and the negative relation with TJULY appeared consistent with river functioning in temperate climates.

Numerous previous studies have also demonstrated the role played by temperature on the species composition of riverine assemblages (Hawkins et al. 1997, Lyons 1996, Rahel and Hubert 1991, Wehrly et al. 2003) and traits (Friberg et al. 2009, Oberdorff et al. 2002, Pont et al. 2007). Our results also support the idea that temperature is a major factor controlling AFS.

At a large spatial extent, temperature varies with three major geographical components: latitude, altitude and continentality (Ward 1985). This pattern is fully substantiated in Europe as temperature varies along the latitudinal gradient (decreasing toward the pole) and along the altitudinal gradient (decreasing with increasing elevation; Koster 2005, Tockner et al. 2009). Therefore, since temperature varies over a larger spatial extent than the longitudinal gradient of streams, the influence of temperature on fish assemblages tends to override the influence of the longitudinal gradient at the European resolution. The AFS of a large Nordic stream could be more similar to the AFS of small mountainous streams than to the AFS of large plain streams located in lower latitudes or altitudes. Due to low temperatures, Nordic and mountainous streams are dominated by intolerant stenothermal individuals, whereas lowland warmer large streams are dominated by tolerant eurythermal species. Jacobsen et al. (1997) also demonstrated that the composition of Ecuadorian mountain streams was more similar to that of Danish lowland streams than to that of Ecuadorian lowland streams.

The influence of environmental variables depends on the scale on which studies are conducted (Wiens 1989) and on the scale at which the variability of the variable is fully expressed (Angermeier and Winston 1999, Blackburn and Gaston 2002, Infante et al. 2009). In a previous study on the relationship between the habitat preferences of European fish species and their life history traits, Blanck et al. (2007) demonstrated that hydraulic conditions were more important than temperature and oxygen in explaining the species’ ecological strategies, while in our study temperature appears to be a key environmental factor. This can be explained by the difference in scale between the two studies: Blanck et al. investigated a smaller scale than the present study. The relatively slight influence of temperature in their study could therefore be explained by a lower variability of this environmental factor at their spatial extent. The action of temperature on the selection of species characteristics occurs at a large spatial extent, which is consistent with the definition of landscape filters by Poff (1997): mean air temperature in July, slope, stream size (represented by SYNGEO1) and natural sediment size are associated with basin, reach and channel unit filters. On the other hand, this study did not consider factors acting at a microhabitat resolution, a possible limitation to our analysis. Moreover, the temporal variability of stream habitat could also influence AFS since the variations in proportions observed along the environmental gradients follow the variation in habitat conditions (Friberg et al. 2009, Goldstein and Meador 2004, Hoeinghaus et al. 2007). In a fluctuating environment, the relative densities of species vary in accordance with habitat and the AFS mirrors those fluctuations, even if the species compositions are not modified (Bêche et al. 2006). Assemblage structure is more stable between years than taxonomic composition and species abundance (Bêche et al. 2006).

The relative shift in trophic structure, from insectivores to omnivores, along the longitudinal gradient of streams corresponds to observations from Europe (Oberdorff et al. 2002, Pont et al. 2007), North America (McGarvey and Hughes 2008) and African tropical streams (Ibañez et al. 2007) and seems to hold on several continents (Ibañez et al. 2009). Along the longitudinal gradient, the proportion of insectivores decreases while the proportion of omnivores increases (with decreasing SLOP and SYNGEO1).


Regional factors

The relative stability of AFS between regions, after removing the effect of local environment, suggests that the selective forces driven by environment are consistent among regions, whatever the differences between the species pools (Bremner et al. 2003, Hewitt et al. 2008, Horrigan and Baird 2008), leading to a convergent assemblage structure (Bellwood and Hughes 2001). It also suggests that species living in different areas may play the same ecological role or function (Angermeier and Winston 1999, Bremner et al. 2006) in assemblages experiencing similar environmental conditions. Nevertheless, the greatest amount of residual inertia is explained by the Mediterranean region. In Europe, this area displays both singular environmental conditions (high intra- and inter-annual variability of hydrological conditions; Gasith and Resh 1999) and particular fish fauna (Reyjol et al. 2007). This result is consistent with the results of Logez et al. (Logez M., Pont D. and Ferreira M. T. submitted), which highlighted different relationships between certain trait categories and the environment between the Iberian Peninsula and Western Europe.

Previous studies have demonstrated that local environmental conditions could explain and predict individual trait categories (Bellwood and Hughes 2001, Ibañez et al. 2007, Lamouroux et al. 2004, Lamouroux et al. 2002, Oberdorff et al. 2002, Pont et al. 2006, Pont et al. 2007). In addition, this study demonstrates that a small number of combinations of functional traits, rather than individual traits, change simultaneously along environmental gradients; these results are similar to those of studies conducted in North America (Hoeinghaus et al. 2007). Consequently, the adaptive value of a given trait makes sense only with regard to other trait values (Townsend et al. 1997, Verberk et al. 2008). Nevertheless, determining how many traits enable the presence of a species in a given environment remains difficult because a single trait or a combination of traits may be necessary depending on species and environmental conditions (Townsend and Hildrew 1994).


Perspective and implications in the development of bio-indicators at a large spatial extent

The variation of several trait categories along environmental gradients has important consequences, particularly in a reference condition approach (Bailey et al. 1998), which is advocated by the Water Framework Directive in Europe. With this approach, the ecological status of a test site is determined by comparing the observed values of the metrics with the theoretical values expected in the absence of human pressures (Pont et al. 2009, Pont et al. 2006).

When metrics vary along both environmental and human pressure gradients, it is necessary to disentangle the variability attributable to each factor. If metric values are not standardised for environment before the scoring process, the effect of anthropogenic stressors will be inaccurately evaluated and the test sites misclassified. Therefore, it appears necessary at the European resolution to take into account the influence of environment on fish AFS in the development of bio-indicators (Pont et al. 2006, Pont et al. 2007).
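One simple way to picture such a standardisation is to model a metric from the environment at reference sites and to express each test site as a deviation from its expected value. The Python sketch below is a generic illustration under that assumption (a linear model, hypothetical function names and simulated values); it does not reproduce the scoring procedure of any published index.

```python
import numpy as np


def fit_reference_model(X_ref, metric_ref):
    """Linear model of a metric on environmental variables at reference sites."""
    Xd = np.column_stack([np.ones(len(X_ref)), X_ref])
    coef, *_ = np.linalg.lstsq(Xd, metric_ref, rcond=None)
    residuals = metric_ref - Xd @ coef
    # Residual standard deviation, corrected for the fitted parameters.
    sd = residuals.std(ddof=Xd.shape[1])
    return coef, sd


def standardised_deviation(X_test, metric_test, coef, sd):
    """Observed minus expected metric value, in reference residual SD units."""
    Xd = np.column_stack([np.ones(len(X_test)), X_test])
    return (metric_test - Xd @ coef) / sd


# Simulated reference sites: the metric declines with July temperature.
rng = np.random.default_rng(0)
temp_ref = rng.uniform(12, 25, size=(100, 1))
metric_ref = 0.8 - 0.02 * temp_ref[:, 0] + rng.normal(0, 0.02, size=100)
coef, sd = fit_reference_model(temp_ref, metric_ref)

# Two test sites at 15 degrees C: one at the expected value, one well below.
scores = standardised_deviation(np.array([[15.0], [15.0]]),
                                np.array([0.5, 0.2]), coef, sd)
```

In this toy case the first site scores close to zero and the second strongly negative, although a raw threshold on the metric could not separate an environmental effect from a pressure effect.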

The relatively low number of strategies observed in European fish assemblages should limit the number of independent metrics available for the development of bio-indicators at the European resolution. Candidate metrics for multimetric indices should reflect different aspects of assemblage biological conditions or AFS (Karr 1981, Karr and Chu 1999); each metric should provide original information. As several trait categories are highly correlated, only a few sets of metrics will provide complementary information on the functional structure of European fish assemblages and be selected to assess the ecological condition of European streams. Whereas metrics with redundant responses to human pressure will provide a more precise estimation of human alteration (e.g. Karr and Chu 1999, Mundahl and Simon 1999), functionally redundant metrics should be avoided in the development of bio-indicators.


Acknowledgements

Work on this manuscript was funded by the European Commission under the Sixth Framework Programme (EFI+ project, contract number 044096). We are grateful to all members who took part in this project.


References

Acolas, M.L., Roussel, J.M., and Bagliniere, J.L. 2008. Linking migratory patterns and diet to reproductive traits in female brown trout (Salmo trutta L.) by means of stable isotope analysis on ova. Ecol. Freshw. Fish. 17(3): 382-393.

Angermeier, P.L., and Winston, M.R. 1999. Characterizing fish community diversity across Virginia landscapes: Prerequisite for conservation. Ecol. Appl. 9(1): 335-349.

Bailey, R.C., Kennedy, M.G., Dervish, M.Z., and Taylor, R.M. 1998. Biological assessment of freshwater ecosystems using a reference condition approach: comparing predicted and actual benthic invertebrate communities in Yukon streams. Freshw. Biol. 39(4): 765-774.

Barnett, A.J., Finlay, K., and Beisner, B.E. 2007. Functional diversity of crustacean zooplankton communities: towards a trait-based classification. Freshw. Biol. 52: 796-813.

Baty, F., Facompré, M., Wiegand, J., Schwager, J., and Brutsche, M.H. 2006. Analysis with respect to instrumental variables for the exploration of microarray data structures. BMC Bioinformatics 7.

Bêche, L.A., McElravy, E.P., and Resh, V.H. 2006. Long-term seasonal variation in the biological traits of benthic-macroinvertebrates in two Mediterranean-climate streams in California, USA. Freshw. Biol. 51(1): 56-75.

Bellwood, D.R., and Hughes, T.P. 2001. Regional-scale assembly rules and biodiversity of coral reefs. Science 292(5521): 1532-1534.

Blackburn, T.M., and Gaston, K.J. 2002. Scale in macroecology. Glob. Ecol. Biogeogr. 11(3): 185-189.

Blanck, A., Tedesco, P.A., and Lamouroux, N. 2007. Relationships between life-history strategies of European freshwater fish species and their habitat preferences. Freshw. Biol. 52: 843-859.


Bonada, N., Prat, N., Resh, V.H., and Statzner, B. 2006. Developments in aquatic insect biomonitoring: A comparative analysis of recent approaches. Annu. Rev. Entomol. 51: 495-523.

Bremner, J., Rogers, S.I., and Frid, C.L.J. 2003. Assessing functional diversity in marine benthic ecosystems: a comparison of approaches. Mar. Ecol. Prog. Ser. 254: 11-25.

Bremner, J., Rogers, S.I., and Frid, C.L.J. 2006. Matching biological traits to environmental conditions in marine benthic ecosystems. Journal of Marine Systems 60(3-4): 302-316.

Chessel, D., Dufour, A.-B., and Thioulouse, J. 2004. The ade4 package - I: One-table methods. R News 4: 5-10.

Chevenet, F., Dolédec, S., and Chessel, D. 1994. A fuzzy coding approach for the analysis of long-term ecological data. Freshw. Biol. 31: 295-309.

Cornwell, W.K., and Ackerly, D.D. 2009. Community assembly and shifts in plant trait distributions across an environmental gradient in coastal California. Ecol. Monogr. 79(1): 109-126.

Cornwell, W.K., Schwilk, D.W., and Ackerly, D.D. 2006. A trait-based test for habitat filtering: convex hull volume. Ecology 87(6): 1465-1471.

Crisp, D.T. 2000. Trout and salmon: ecology, conservation and rehabilitation. Fishing News Books, Oxford.

Diaz, S., Cabido, M., and Casanoves, F. 1998. Plant functional traits and environmental filters at a regional scale. J. Veg. Sci. 9(1): 113-122.

Dolédec, S., and Chessel, D. 1987. Rythmes saisonniers et composantes stationnelles en milieu aquatique I- Description d'un plan d'observations complet par projection de variables. Acta Oecol 8(3): 403–426.

Dray, S., Chessel, D., and Thioulouse, J. 2003. Co-inertia analysis and the linking of ecological tables. Ecology 84(11): 3078-3089.


Elliott, J.M. 1994. Quantitative ecology and the brown trout. Oxford University Press, Oxford, New York and Tokyo.

Friberg, N., Dybkjaer, J.B., Olafsson, J.S., Gislason, G.M., Larsen, S.E., and Lauridsen, T.L. 2009. Relationships between structure and function in streams contrasting in temperature. Freshw. Biol. 54(10): 2051–2068.

Gasith, A., and Resh, V.H. 1999. Streams in Mediterranean climate regions: Abiotic influences and biotic responses to predictable seasonal events. Annu. Rev. Ecol. Syst. 30: 51-81.

Goldstein, R.M., and Meador, M.R. 2004. Comparisons of Fish Species Traits from Small Streams to Large Rivers. Trans. Am. Fish. Soc. 133(4): 971-983.

Hawkins, C.P., Hogue, J.N., Decker, L.M., and Feminella, J.W. 1997. Channel morphology, water temperature, and assemblage structure of stream insects. J. N. Am. Benthol. Soc 16(4): 728-749.

Hewitt, J.E., Thrush, S.F., and Dayton, P.D. 2008. Habitat variation, species diversity and ecological functioning in a marine system. J. Exp. Mar. Biol. Ecol. 366(1-2): 116-122.

Hill, M.O., and Smith, A.J.E. 1976. Principal component analysis of taxonomic data with multi-state discrete characters. Taxon 25: 249-255.

Hoeinghaus, D.J., Winemiller, K.O., and Birnbaum, J.S. 2007. Local and regional determinants of stream fish assemblage structure: inferences based on taxonomic vs. functional groups. J. Biogeogr. 34(2): 324-338.

Horrigan, N., and Baird, D.J. 2008. Trait patterns of aquatic insects across gradients of flow-related factors: a multivariate analysis of Canadian national data. Can. J. Fish. Aquat. Sci. 65: 670-680.

Huet, M. 1954. Biologie, profils en long et en travers des eaux courantes. Bull. fr. Piscic. 175: 41-53.


Ibañez, C., Belliard, J., Hughes, R.M., Irz, P., Kamdem-Toham, A., Lamouroux, N., Tedesco, P.A., and Oberdorff, T. 2009. Convergence of temperate and tropical stream fish assemblages. Ecography 32: 658-670.

Ibañez, C., Oberdorff, T., Teugels, G., Mamononekene, V., Lavoue, S., Fermon, Y., Paugy, D., and Tohams, P.K. 2007. Fish assemblages structure and function along environmental gradients in rivers of Gabon (Africa). Ecol. Freshw. Fish. 16(3): 315-334.

Illies, J. 1978. Limnofauna Europaea. Gustav Fischer Verlag, Stuttgart, New York.

Infante, D., Allan, J.D., Linke, S., and Norris, R. 2009. Relationship of fish and macroinvertebrate assemblages to environmental factors: implications for community concordance. Hydrobiologia 623(1): 87-103.

Irz, P., Michonneau, F., Oberdorff, T., Whittier, T.R., Lamouroux, N., Mouillot, D., and Argillier, C. 2007. Fish community comparisons along environmental gradients in lakes of France and north-east USA. Glob. Ecol. Biogeogr. 16(3): 350-366.

Junk, W.J., Bayley, P.B., and Sparks, R.E. 1989. The Flood pulse concept in river-floodplain systems. In Proceedings of the International Large River Symposium (LARS). Edited by D.P. Dodge. Canadian Journal of Fisheries and Aquatic Sciences Special Publication.

Karr, J.R. 1981. Assessment of biotic integrity using fish communities. Fisheries 6(6): 21-27.

Karr, J.R., and Chu, E.W. 1999. Restoring Life in Running Waters: Better Biological Monitoring. Island Press, Washington D.C.

Keddy, P.A. 1992. Assembly and response rules: two goals for predictive community ecology. J. Veg. Sci. 3(2): 157-164.

Kelt, D.A., Brown, J.H., Heske, E.J., Marquet, P.A., Morton, S.R., Reid, J.R.W., Rogovin, K.A., and Shenbrot, G. 1996. Community structure of desert small mammals: comparisons across four continents. Ecology 77(3): 746-761.


Koster, E.A. 2005. The physical geography of Western Europe. Oxford University Press, Oxford, USA.

Lamouroux, N., Dolédec, S., and Gayraud, S. 2004. Biological traits of stream macroinvertebrate communities: effects of microhabitat, reach, and basin filters. J. N. Am. Benthol. Soc. 23: 449-466.

Lamouroux, N., Poff, N.L., and Angermeier, P.L. 2002. Intercontinental convergence of stream fish community traits along geomorphic and hydraulic gradients. Ecology 83(7): 1792-1807.

Lavorel, S., and Garnier, E. 2002. Predicting changes in community composition and ecosystem functioning from plant traits: revisiting the Holy Grail. Funct. Ecol. 16(5): 545-556.

Lebreton, J.D., Sabatier, R., Banco, G., and Bacou, A.M. 1991. Principal component and correspondence analyses with respect to instrumental variables: an overview of their role in studies of structure-activity and species-environment relationships. In Applied Multivariate Analysis in SAR and Environmental Studies. Edited by J. Devillers and W. Karcher. Kluwer Academic Publishers, Dordrecht.

Lyons, J. 1996. Patterns in the species composition of fish assemblages among Wisconsin streams. Environ. Biol. Fishes 45(4): 329-341.

MacArthur, R., and Levins, R. 1967. The limiting similarity, convergence, and divergence of coexisting species. Am. Nat. 101: 377–385.

Manly, B.J.F. 1997. Randomization, Bootstrap and Monte Carlo Methods in Biology. Chapman & Hall, London.

Mason, N.W.H., Lanoiselee, C., Mouillot, D., Irz, P., and Argillier, C. 2007. Functional characters combined with null models reveal inconsistency in mechanisms of species turnover in lacustrine fish communities. Oecologia 153(2): 441-452.


Matthews, W.J. 1998. Patterns in Freshwater Fish Ecology. Chapman & Hall, New York.

McGarvey, D.J., and Hughes, R.M. 2008. Longitudinal zonation of Pacific Northwest (U.S.A.) fish assemblages and the species-discharge relationship. Copeia 2008(2): 311-321.

Melville, J., Harmon, L.J., and Losos, J.B. 2006. Intercontinental community convergence of ecology and morphology in desert lizards. Proc. R. Soc. Biol. Sci. Ser. B 273(1586): 557-563.

Mundahl, N.D., and Simon, T.P. 1999. Development and application of an index of biotic integrity for coldwater streams of the upper Midwestern United States. In Assessing the Sustainability and Biological Integrity of Water Resources Using Fish Communities. Edited by T.P. Simon. CRC Press, Boca Raton, Florida. pp. 383-411.

Noble, R.A.A., Cowx, I.G., Goffaux, D., and Kestemont, P. 2007. Assessing the health of European rivers using functional ecological guilds of fish communities: standardising species classification and approaches to metric selection. Fish. Manag. Ecol. 14(6): 381-392.

Oberdorff, T., Pont, D., Hugueny, B., and Porcher, J.P. 2002. Development and validation of a fish-based index for the assessment of 'river health' in France. Freshw. Biol. 47(9): 1720-1734.

Orians, G.H., and Paine, R.T. 1983. Convergent evolution at the community level. In Coevolution. Edited by D.J. Futuyma and M. Slatkin. Sinauer, Sunderland. pp. 431-458.

Petts, G.E., and Amoros, C. 1996. Fluvial Hydrosystems. Chapman & Hall, London.

Poff, N.L. 1997. Landscape filters and species traits: Towards mechanistic understanding and prediction in stream ecology. J. N. Am. Benthol. Soc. 16(2): 391-409.

Poff, N.L., and Allan, J.D. 1995. Functional organization of stream fish assemblages in relation to hydrological variability. Ecology 76(2): 606-627.

Poff, N.L., Olden, J.D., Vieira, N.K.M., Finn, D.S., Simmons, M.P., and Kondratieff, B.C. 2006. Functional trait niches of North American lotic insects: traits-based ecological applications in light of phylogenetic relationships. J. N. Am. Benthol. Soc. 25(4): 730-755.


Pont, D., Hughes, R.M., Whittier, T.R., and Schmutz, S. 2009. A Predictive Index of Biotic Integrity Model for Aquatic-Vertebrate Assemblages of Western US Streams. Trans. Am. Fish. Soc. 138(2): 292-305.

Pont, D., Hugueny, B., Beier, U., Goffaux, D., Melcher, A., Noble, R., Rogers, C., Roset, N., and Schmutz, S. 2006. Assessing river biotic condition at a continental scale: a European approach using functional metrics and fish assemblages. J. Appl. Ecol. 43(1): 70-80.

Pont, D., Hugueny, B., and Oberdorff, T. 2005. Modelling habitat requirement of European fishes: do species have similar responses to local and regional environmental constraints? Can. J. Fish. Aquat. Sci. 62(1): 163-173.

Pont, D., Hugueny, B., and Rogers, C. 2007. Development of a fish-based index for the assessment of river health in Europe: the European Fish Index. Fish. Manag. Ecol. 14: 427-439.

R Development Core Team. 2008. R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria.

Rahel, F.J., and Hubert, W.A. 1991. Fish Assemblages and Habitat Gradients in a Rocky Mountain-Great Plains Stream: Biotic Zonation and Additive Patterns of Community Change. Trans. Am. Fish. Soc. 120(3): 319-332.

Rao, C.R. 1964. The use and interpretation of principal component analysis in applied research. Sankhyā: The Indian Journal of Statistics, Series A 26(4): 329-358.

Reyjol, Y., Hugueny, B., Pont, D., Bianco, P.G., Beier, U., Caiola, N., Casals, F., Cowx, I., Economou, A., Ferreira, T., Haidvogl, G., Noble, R., de Sostoa, A., Vigneron, T., and Virbickas, T. 2007. Patterns in species richness and endemism of European freshwater fish. Glob. Ecol. Biogeogr. 16(1): 65-75.

Romesburg, H.C. 1985. Exploring, confirming and randomization tests. Computers and Geosciences 11: 19-37.


Segurado, P., Ferreira, M.T., Pinheiro, P., and Santos, J.M. 2008. Mediterranean River Assessment. Testing the response of guild-based metric, Work Package 3, Subtask 7, EFI+ Consortium - Improvement and spatial extension of the European Fish Index, EU-Project Nr. 044096.

Southwood, T.R.E. 1977. Habitat, the templet for ecological strategies? J. Anim. Ecol. 46: 337-365.

Southwood, T.R.E. 1988. Tactics, strategies and templets. Oikos 52: 3-18.

Statzner, B., Bis, B., Dolédec, S., and Usseglio-Polatera, P. 2001. Perspectives for biomonitoring at large spatial scales: a unified measure for the functional composition of invertebrate communities in European running waters. Basic Appl. Ecol. 2: 73-85.

Statzner, B., Dolédec, S., and Hugueny, B. 2004. Biological trait composition of European stream invertebrate communities: assessing the effects of various trait filter types. Ecography 27(4): 470-488.

Statzner, B., Resh, V.H., and Dolédec, S. (Guest Editors). 1994. Ecology of the Upper Rhône River: a test of habitat templet theories. Freshw. Biol. 31(3).

Stearns, S. 1992. The evolution of life histories. Oxford University Press, Oxford.

Tedesco, P.A., Hugueny, B., Oberdorff, T., Dürr, H.H., Mérigoux, S., and de Mérona, B. 2008. River hydrological seasonality influences life history strategies of tropical riverine fishes. Oecologia 156: 691-702.

Ter Braak, C.J.F. 1988. Partial canonical correspondence analysis. In Classification and related methods of data analysis. Edited by H.H. Bock. Elsevier Science, Amsterdam. pp. 551-558.

Tockner, K., Uehlinger, U., and Robinson, C.T. 2009. Rivers of Europe. Academic Press.


Tonn, W.M., Magnuson, J.J., Rask, M., and Toivonen, J. 1990. Intercontinental comparison of small-lake fish assemblages: the balance between local and regional processes. Am. Nat. 136(3): 345-375.

Townsend, C.R., Dolédec, S., and Scarsbrook, M. 1997. Species traits in relation to temporal and spatial heterogeneity in streams: a test of habitat templet theory. Freshw. Biol. 37(2): 367-387.

Townsend, C.R., and Hildrew, A.G. 1994. Species traits in relation to a habitat templet for river systems. Freshw. Biol. 31(3): 265-275.

Usseglio-Polatera, P., Bournaud, M., Richoux, P., and Tachet, H. 2000. Biological and ecological traits of benthic freshwater macroinvertebrates: relationships and definition of groups with similar traits. Freshw. Biol. 43: 175-205.

Verberk, W.C.E.P., Siepel, H., and Esselink, H. 2008. Life-history strategies in freshwater macroinvertebrates. Freshw. Biol. 53: 1722-1738.

Vila-Gispert, A., Moreno-Amich, R., and Garcia-Berthou, E. 2002. Gradients of life-history variation: an intercontinental comparison of fishes. Reviews in Fish Biology and Fisheries 12(4): 417-427.

Vitt, L.J., and Pianka, E.R. 2005. Deep history impacts present-day ecology and biodiversity. Proc. Natl. Acad. Sci. U. S. A. 102(22): 7877-7881.

Wehrly, K.E., Wiley, M.J., and Seelbach, P.W. 2003. Classifying regional variation in thermal regime based on stream fish community patterns. Trans. Am. Fish. Soc. 132(1): 18-38.

Weiher, E., and Keddy, P. 1999. Ecological Assembly Rules: Perspectives, Advances, Retreats. Cambridge University Press, Cambridge, UK.

Weiher, E., and Keddy, P.A. 1995. Assembly Rules, Null Models, and Trait Dispersion: New Questions from Old Patterns. Oikos 74(1): 159-164.


Wiens, J.A. 1989. Spatial scaling in ecology. Funct. Ecol. 3: 385-397.

Wiens, J.A. 1991. Ecological similarity of shrub-desert avifaunas of Australia and North America. Ecology 72(2): 479-495.

Willby, N.J., Abernethy, V.J., and Demars, B.O.L. 2000. Attribute-based classification of European hydrophytes and its relationship to habitat utilization. Freshw. Biol. 43: 43-74.

Winemiller, K.O., and Rose, K.A. 1992. Patterns of life-history diversification in North American fishes: implications for population regulation. Can. J. Fish. Aquat. Sci. 49(10): 2196-2218.


Table 1. Description of the categories of the eleven biological and ecological traits considered.

Trait: Categories

Temperature tolerance
  Eurythermal (EUTHER): species able to withstand a large range of temperature
  Stenothermal (STTHER): species able to withstand a narrow range of temperature

Tolerance to oxygen
  Intolerant (O2INTOL): species requiring more than 6 mg of oxygen per litre
  Intermediate (O2IM): species relatively tolerant to low oxygen concentration
  Tolerant (O2TOL): species able to live in water with less than 3 mg of oxygen per litre

Tolerance to habitat degradation
  Intolerant (HINTOL)
  Intermediate (HIM)
  Tolerant (HTOL)

Adult trophic guild
  Detritivorous (DETR): adult diet composed of a high proportion of detritus
  Herbivorous (HERB): adult diet composed of at least 75% plant material
  Insectivorous (INSV): adult diet composed of at least 75% insect individuals
  Omnivorous (OMNI): adult diet composed of more than 25% plant material and more than 25% animal material
  Parasitic (PARA): adult exhibiting a parasitic feeding mode
  Piscivorous (PISC): adult diet composed of more than 75% fish
  Planktivorous (PLAN): adult diet composed of more than 75% phytoplankton or zooplankton

Feeding habitat
  Benthic (B): species preferring to live near the bottom, from where they feed
  Water column (WC): species that live and feed in the water column

Affinity to flow velocity (habitat)
  Limnophilic (LIMNO): species preferring to live in slow-flowing to stagnant conditions
  Rheophilic (RH): species preferring to live in high-flow conditions
  Eurytopic (EURY): species with a wide tolerance to flow conditions

Spawning habitat
  (LIPAR): species preferring to spawn in stagnant water
  (RHPAR): species preferring to spawn in running waters
  (EUPAR): species without clear spawning preferences

Reproduction
  Ariadnophilic (ARIAD): species specialised in nest building, a behaviour very often associated with parental care
  Lithopelagophilic (LIPE): species spawning on rocks and gravels with pelagic free embryos
  Lithophilic (LITH): species spawning exclusively on gravel, rocks, stones, rubble or pebbles and with photophobic hatchlings
  Ostracophilic (OSTRA): species spawning in bivalve molluscs
  Pelagophilic (PELA): species spawning in the pelagic zone
  Phyto-lithophilic (PHLI): species depositing their eggs in clear water habitats on submerged plants or on other submerged items such as logs, gravel and rocks, and whose larvae are photophobic
  Phytophilic (PHYT): species depositing their eggs in clear water habitats on submerged plants
  Psammophilic (PSAM): species spawning on roots or grass above a sandy bottom or on the sand itself
  Speleophilic (SPEL): species spawning in interstitial spaces, crevices or caves

Reproductive behaviour
  Single (SIN): species with a single spawning event during the reproductive season
  Fractional (FR): species which either spawn repeatedly in a season or have different components of their populations spawning at different times
  Protracted (PRO): species spawning over a long period during the reproductive season

Parental care
  (PROT): species whose egg or larval life stages are protected
  (NOP): species with no protection for early life stages

Migration behaviour
  Resident (RESID): species moving over small areas within a particular river segment
  Potamodromous (POTAD): species migrating within the inland waters of a river
  Anadromous (LMA): species living as older juveniles and sub-adults in the sea and migrating up rivers to spawn at maturity
  Catadromous (LMC): species with early life stages living in fresh water and migrating down rivers to spawn in the sea at maturity


Table 2. Distribution of trait categories among families.

Category Cobitidae Cottidae Cyprinidae Gobiidae Percidae Salmonidae Others

STTHER 0 2 2 0 0 4 3

EUTHER 5 0 47 6 8 0 11

O2INTOL 0 2 10 0 2 4 3

O2IM 4 0 26 6 6 0 10

O2TOL 1 0 13 0 0 0 1

HINTOL 2 2 18 0 2 4 3

HIM 2 0 18 2 4 0 7

HTOL 1 0 13 4 2 0 4

DETR 0 0 3 0 0 0 2

HERB 0 0 2 0 0 0 0

INSV 5 2 20 6 5 2 4

OMNI 0 0 21 0 0 0 3

PARA 0 0 0 0 0 0 2

PISC 0 0 1 0 3 2 3

PLAN 0 0 2 0 0 0 0

B 5 2 27 6 6 0 9

WC 0 0 22 0 2 4 5

LIMNO 1 0 8 0 0 0 1

RH 4 2 25 0 4 4 6

EURY 0 0 16 6 4 0 7

LIPAR 1 0 7 0 0 0 4

RHPAR 1 2 31 2 3 4 5

EUPAR 3 0 11 4 5 0 5

ARIAD 0 0 0 0 0 0 2

LIPE 0 0 0 0 0 0 0

LITH 0 0 28 2 3 4 7

OSTRA 0 0 1 0 0 0 0

PELA 0 0 0 0 0 0 2


PHLI 0 0 8 1 5 0 0

PHYT 5 0 9 0 0 0 3

PSAM 0 0 3 0 0 0 0

SPEL 0 2 0 3 0 0 0

SIN 1 2 28 2 5 4 10

FR 4 0 19 0 1 0 3

PRO 0 0 2 4 2 0 1

PROT 0 2 2 6 2 2 4

NOP 5 0 47 0 6 2 10

RESID 5 2 27 6 7 0 7

POTAD 0 0 22 0 1 3 3

LMA 0 0 0 0 0 1 2

LMC 0 0 0 0 0 0 2


Fig. 1. Location of the 849 sampling sites among the thirteen ecoregions adapted from Illies (1978). Because of the low number of sites in some regions, some ecoregions were merged into larger geographical areas: Alps and Pyrenees into Alps (ALPS, n = 30); Hungarian lowlands, Eastern plains and Pontic province into Eastern region (EAST, 108); and Fenno-scandian shield and Borealic uplands into Nordic region (NOR, 67). To take into account the specificity of the Mediterranean areas (Gasith and Resh 1999), a Mediterranean region was defined (MED, 53, dark grey) as the Mediterranity level 1 of the classification by Segurado et al. (2008). The former Ibero-Macaronesian (IBE, 134) and Italy, Corsica and Malta ecoregions (ITA, 22) were divided into two distinct areas. Baltic province (BAL, 33), the Carpathians (CAR, 34), Great Britain (ENG, 102), Central plains (C.P, 105), Central highlands (C.H, 31), Western plains (W.P, 94) and Western highlands (W.H, 36) remained unchanged.

Fig. 2. Description of the statistical analyses and the data sets used in the study.

Fig. 3. First factorial plane of the FPCA on the trait composition of fish assemblages. a) Trait category scores. Only categories with the highest contributions to the F1 and F2 axes are shown. b) Histogram of eigenvalues. Ordination of sampling sites per region: c) Alps, d) Baltic province, e) Central highlands, f) Central plains, g) the Carpathians, h) Great Britain, i) Eastern region, j) Ibero-Macaronesian region, k) Italy, Corsica and Malta, l) Mediterranean region, m) Nordic region, n) Western highlands and o) Western plains.

Fig. 4. a) Ordination of trait categories predicted by the environment on the first factorial plane of the PCAIV (only the categories with the highest contributions are represented). b) Histogram of eigenvalues. c) Projections of the axes of the FPCA (arrows) on the axes of the PCAIV. d) Correlations between environmental variables used in the multiple linear regressions and the scores of sampling sites.

[Figure panels a, b (image not reproduced). Plotted trait category labels: FR, RESID, EUPAR, INSV, STTHER, O2INTOL, RH, NOP, HINTOL, LITH, LMA, PHLI, HTOL, HIM, O2IM, PISC, O2TOL, PROT, OMNI, EURY, EUTHER, POTAD, RHPAR, SIN.]

[Figure panels a, b, c, d (image not reproduced). Plotted labels: TDIF, SYNGEO1, SYNGEO2, NATSED-small, NATSED-medium, SLOP, TJULY, Axis1, Axis2; trait categories PROT, FR, RESID, PHLI, PISC, EUPAR, HTOL, STTHER, LONG-LMA, O2INTOL, RH, OMNI, EURY, O2IM, HINTOL, RHPAR, EUTHER, SIN, HIM, INSV, POTAD, LITH, NOP.]

Appendix

Characteristics of the 88 species sampled: temperature tolerance (Temp), tolerance to oxygen (O2tol), tolerance to habitat degradation (Htol), adult trophic guild (Atroph), feeding habitat (Fehab), affinity to flow velocity (Hab), spawning habitat (Habsp), reproduction (Repro), reproductive behaviour (Reprob), parental care (Pc), and migration behaviour (Mig).

Species name Temp O2tol Htol Atroph Fehab Hab Habsp Repro Reprob Pc Mig
Abramis brama EUTHER O2TOL HTOL OMNI B EURY EUPAR PHLI SIN NOP POTAD
Achondrostoma arcasii EUTHER O2TOL HINTOL DETR WC RH RHPAR PHYT SIN NOP RESID
Achondrostoma oligolepis EUTHER O2TOL HIM OMNI WC LIMNO RHPAR PHYT FR NOP RESID
Acipenser ruthenus EUTHER O2IM HIM INSV B RH RHPAR LITH SIN NOP POTAD
Alburnoides bipunctatus STTHER O2INTOL HINTOL INSV WC RH RHPAR LITH FR NOP RESID
Alburnus alburnus EUTHER O2IM HTOL PLAN WC EURY EUPAR PHLI FR NOP RESID
Anguilla anguilla EUTHER O2TOL HTOL INSV B EURY LIPAR PELA SIN NOP LMC
Aspius aspius EUTHER O2INTOL HTOL PISC WC EURY RHPAR LITH SIN NOP POTAD
Ballerus ballerus EUTHER O2IM HINTOL PLAN WC RH EUPAR LITH PRO NOP RESID
Ballerus sapa EUTHER O2TOL HTOL INSV B RH RHPAR LITH SIN NOP POTAD
Barbatula barbatula EUTHER O2IM HIM INSV B RH EUPAR LITH FR NOP RESID
Barbus barbus EUTHER O2IM HINTOL INSV B RH RHPAR LITH FR NOP POTAD
Barbus haasi EUTHER O2INTOL HINTOL OMNI B RH RHPAR LITH SIN NOP POTAD
Barbus meridionalis EUTHER O2IM HINTOL INSV B RH RHPAR LITH SIN NOP POTAD
Barbus peloponnesius EUTHER O2INTOL HINTOL INSV B RH RHPAR LITH FR NOP POTAD
Barbus plebejus EUTHER O2IM HIM INSV B RH RHPAR LITH SIN NOP POTAD
Barbus tyberinus EUTHER O2INTOL HINTOL INSV B RH RHPAR LITH SIN NOP POTAD
Blicca bjoerkna EUTHER O2TOL HTOL OMNI B EURY EUPAR PHLI PRO NOP RESID
Carassius carassius EUTHER O2TOL HTOL OMNI B LIMNO LIPAR PHYT FR NOP RESID
Carassius gibelio EUTHER O2TOL HTOL OMNI B EURY LIPAR PHYT FR NOP RESID
Chondrostoma nasus EUTHER O2IM HINTOL HERB B RH RHPAR LITH SIN NOP POTAD
Cobitis elongatoides EUTHER O2IM HTOL INSV B RH EUPAR PHYT FR NOP RESID
Cobitis taenia EUTHER O2IM HIM INSV B RH EUPAR PHYT FR NOP RESID
Cottus gobio STTHER O2INTOL HINTOL INSV B RH RHPAR SPEL SIN PROT RESID
Cottus poecilopus STTHER O2INTOL HINTOL INSV B RH RHPAR SPEL SIN PROT RESID
Cyprinus carpio EUTHER O2TOL HTOL OMNI B EURY LIPAR PHYT SIN NOP RESID
Esox lucius EUTHER O2IM HTOL PISC WC EURY LIPAR PHYT SIN NOP POTAD
Eudontomyzon mariae EUTHER O2INTOL HINTOL DETR B RH RHPAR LITH SIN NOP RESID
Gasterosteus aculeatus EUTHER O2IM HTOL OMNI WC EURY LIPAR PHYT PRO PROT RESID
Gasterosteus gymnurus EUTHER O2IM HIM OMNI WC EURY EUPAR ARIAD FR PROT RESID
Gobio gobio EUTHER O2IM HTOL INSV B RH RHPAR PSAM FR NOP RESID
Gymnocephalus baloni EUTHER O2IM HIM INSV B RH EUPAR PHLI PRO NOP RESID
Gymnocephalus cernua EUTHER O2IM HTOL INSV B EURY EUPAR PHLI FR NOP RESID
Gymnocephalus schraetser EUTHER O2IM HIM INSV B RH RHPAR LITH PRO NOP RESID
Hucho hucho STTHER O2INTOL HINTOL PISC WC RH RHPAR LITH SIN NOP POTAD
Iberochondrostoma almacai EUTHER O2IM HINTOL OMNI WC LIMNO RHPAR PHLI FR NOP RESID
Iberochondrostoma lusitanicum EUTHER O2TOL HIM OMNI WC LIMNO RHPAR PHLI FR NOP RESID
Lampetra fluviatilis EUTHER O2IM HINTOL PARA B RH RHPAR LITH SIN NOP LMA
Lampetra planeri STTHER O2INTOL HINTOL DETR B RH RHPAR LITH SIN NOP RESID
Leucaspius delineatus EUTHER O2IM HIM OMNI WC LIMNO LIPAR PHYT FR PROT RESID
Leuciscus idus EUTHER O2IM HIM OMNI WC RH EUPAR PHLI SIN NOP POTAD
Leuciscus leuciscus EUTHER O2IM HIM OMNI WC RH RHPAR LITH SIN NOP RESID
Lota lota STTHER O2INTOL HIM PISC B EURY EUPAR LITH SIN NOP POTAD
Luciobarbus bocagei EUTHER O2IM HIM OMNI B EURY RHPAR LITH SIN NOP POTAD
Luciobarbus graellsii EUTHER O2IM HIM OMNI B EURY RHPAR LITH SIN NOP POTAD
Luciobarbus sclateri EUTHER O2TOL HIM OMNI B EURY RHPAR LITH SIN NOP POTAD
Misgurnus fossilis EUTHER O2TOL HINTOL INSV B LIMNO LIPAR PHYT SIN NOP RESID
Neogobius fluviatilis EUTHER O2IM HTOL INSV B EURY EUPAR SPEL PRO PROT RESID
Neogobius gymnotrachelus EUTHER O2IM HTOL INSV B EURY EUPAR PHLI PRO PROT RESID
Neogobius kessleri EUTHER O2IM HTOL INSV B EURY EUPAR LITH PRO PROT RESID
Neogobius melanostomus EUTHER O2IM HTOL INSV B EURY EUPAR LITH PRO PROT RESID
Padogobius bonelli EUTHER O2IM HIM INSV B EURY RHPAR SPEL SIN PROT RESID
Padogobius nigricans EUTHER O2IM HIM INSV B EURY RHPAR SPEL SIN PROT RESID
Parachondrostoma miegii EUTHER O2IM HINTOL HERB B RH RHPAR LITH SIN NOP POTAD
Perca fluviatilis EUTHER O2IM HTOL PISC WC EURY EUPAR PHLI SIN NOP RESID
Petromyzon marinus EUTHER O2IM HIM PARA WC RH RHPAR LITH SIN NOP LMA
Phoxinus phoxinus STTHER O2INTOL HINTOL INSV WC RH EUPAR LITH FR NOP RESID
Platichthys flesus STTHER O2IM HIM INSV B EURY EUPAR PELA SIN NOP LMC
Protochondrostoma duriense EUTHER O2IM HIM DETR B RH RHPAR LITH SIN NOP POTAD
Protochondrostoma genei EUTHER O2IM HINTOL INSV B RH RHPAR LITH SIN NOP POTAD
Protochondrostoma polylepis EUTHER O2IM HIM DETR B RH RHPAR LITH SIN NOP POTAD
Pungitius pungitius EUTHER O2IM HIM OMNI WC LIMNO LIPAR ARIAD FR PROT RESID
Rhodeus amarus EUTHER O2IM HINTOL OMNI WC LIMNO LIPAR OSTRA FR PROT RESID
Romanogobio albipinnatus EUTHER O2INTOL HIM INSV B RH RHPAR PSAM FR NOP RESID
Romanogobio kesslerii EUTHER O2INTOL HINTOL INSV B RH RHPAR PSAM FR NOP RESID
Rutilus pigus EUTHER O2INTOL HIM INSV B RH RHPAR PHYT SIN NOP RESID
Rutilus rubilio EUTHER O2IM HIM INSV WC EURY RHPAR PHLI SIN NOP RESID
Rutilus rutilus EUTHER O2TOL HTOL OMNI WC EURY EUPAR PHLI SIN NOP POTAD
Sabanejewia aurata EUTHER O2IM HIM INSV B RH EUPAR PHYT FR NOP RESID
Sabanejewia balcanica EUTHER O2IM HINTOL INSV B RH RHPAR PHYT FR NOP RESID
Salmo salar STTHER O2INTOL HINTOL PISC WC RH RHPAR LITH SIN PROT LMA
Salmo trutta STTHER O2INTOL HINTOL INSV WC RH RHPAR LITH SIN NOP POTAD
Sander lucioperca EUTHER O2IM HIM PISC WC EURY EUPAR PHLI SIN PROT POTAD
Sander volgensis EUTHER O2IM HIM PISC B EURY EUPAR PHLI SIN PROT RESID
Scardinius erythrophthalmus EUTHER O2TOL HIM OMNI WC LIMNO LIPAR PHYT FR NOP RESID
Silurus glanis EUTHER O2IM HTOL PISC B EURY EUPAR PHYT SIN PROT RESID
Squalius alburnoides EUTHER O2IM HTOL INSV WC EURY EUPAR LITH SIN NOP RESID
Squalius carolitertii EUTHER O2IM HIM INSV WC EURY EUPAR LITH SIN NOP RESID
Squalius cephalus EUTHER O2IM HTOL OMNI WC RH RHPAR LITH FR NOP POTAD
Squalius lucumonis EUTHER O2IM HTOL OMNI WC EURY RHPAR LITH SIN NOP POTAD
Squalius pyrenaicus EUTHER O2IM HIM INSV WC EURY EUPAR LITH SIN NOP RESID
Squalius torgalensis EUTHER O2IM HIM INSV WC EURY EUPAR LITH SIN NOP RESID
Telestes souffia EUTHER O2INTOL HINTOL INSV B RH RHPAR LITH SIN NOP RESID
Thymallus thymallus STTHER O2INTOL HINTOL INSV WC RH RHPAR LITH SIN PROT POTAD
Tinca tinca EUTHER O2TOL HINTOL OMNI B LIMNO LIPAR PHYT FR NOP RESID
Vimba vimba EUTHER O2IM HINTOL INSV B RH RHPAR LITH FR NOP POTAD
Zingel streber EUTHER O2INTOL HINTOL INSV B RH RHPAR LITH SIN NOP RESID
Zingel zingel EUTHER O2INTOL HINTOL INSV B RH RHPAR LITH SIN NOP RESID


Implications of these results for the development of a large-scale MMI


The high redundancy among several trait categories limits the number of candidate metrics for developing the index. Closely related categories such as STTHER (stenothermal), O2INTOL (oxygen intolerant) and HINTOL (intolerant to habitat degradation) provide, a priori, the same information on assemblage condition, which runs counter to MMI theory (Karr and Chu 1999). If these metrics pass all the selection steps (Hughes et al. 1998, Hering et al. 2006, Pont et al. 2006, Stoddard et al. 2008) and respond to human pressures, only one of them should be retained. In addition to the ability of the metrics to detect human pressures, their geographical distribution must be a selection criterion. Nineteen of the 41 categories contribute little or nothing to the first factorial plane (relative contribution lower than 0.5%) and are not represented in Figure 5a. Some of these categories are only rarely observed in the assemblages and therefore do not contribute to the PCA: ostracophilic reproduction (OSTRA) and the parasitic (PARA) and herbivorous (HERB) trophic guilds occur in less than 10% of the sites. Rare trait categories are poorly representative of the assemblages and thus contribute little to the development of an MMI for a large spatial extent, although they could be useful for developing additional tools specific to certain assemblages or stream types. Other traits vary independently of the two major gradients (tolerance and reproduction) and are represented on other axes. This is the case for feeding habitat, whose categories are mainly associated with the third axis of the PCA. These categories are of particular interest because they provide complementary information on fish assemblage structure. Likewise, relatively stable trait categories (varying little between assemblages) may not contribute to the PCA but are potentially advantageous for MMI development (Charvet et al. 2000, Statzner et al. 2001b, Hering et al. 2006).
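The rarity criterion mentioned above (categories present in less than 10% of the sites) can be illustrated with a minimal Python sketch. The toy presence table, the toy trait table and the use of 10% as a hard filter are illustrative assumptions, not the data or the procedure of the study.

```python
import numpy as np

def category_occurrence(presence, traits):
    """Fraction of sites where each trait category is carried by at least one species.

    presence: (n_sites, n_species) 0/1 array (hypothetical site-by-species table)
    traits:   (n_species, n_categories) 0/1 array (hypothetical species-by-category table)
    """
    species_per_category = presence @ traits  # species count per category, per site
    return (species_per_category > 0).mean(axis=0)

# Hypothetical toy data: 20 sites, 3 species, one trophic category per species
categories = ["INSV", "OMNI", "PARA"]
presence = np.zeros((20, 3), dtype=int)
presence[:, 0] = 1      # species 1 present at every site
presence[:10, 1] = 1    # species 2 present at half of the sites
presence[0, 2] = 1      # species 3 present at a single site
traits = np.eye(3, dtype=int)

occurrence = category_occurrence(presence, traits)
threshold = 0.10  # assumed cut-off, taken from the 10% figure quoted in the text
frequent = [c for c, f in zip(categories, occurrence) if f >= threshold]
```

In this toy case the parasitic category occurs at one site out of twenty and falls below the assumed cut-off, so only the two widespread categories remain as candidates.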

Once the effect of the environment is controlled for, the ichthyoregions (Reyjol et al. 2007) account for 2.3% of the remaining inertia (P < 0.001), i.e. 1.6% of the total inertia of the fish assemblage structure. This result suggests that, at a large spatial extent, trait-based metrics overcome regional differences in fauna (Pont et al. 2006). The same metric could be representative of regions with similar environmental conditions but different species pools. In contrast, while assemblage composition varies with environmental conditions (Huet 1954, Rahel and Hubert 1991, Schlosser 1991, Hawkins et al. 1997, Matthews 1998, Magalhaes et al. 2002a, Wehrly et al. 2003, Hoeinghaus et al. 2007, Infante et al. 2009), at a large spatial extent the pattern derived from the taxonomic analysis mainly reflects the geographical differences stemming from species distributions (Oswood et al. 2000, Van Sickle and Hughes 2000, Hoeinghaus et al. 2007). Therefore, a metric based on taxonomic identity or composition would be spatially limited.

Ideally, metric scores should only reflect the impairment level of sites (Hughes et al. 1998, Karr and Chu 1999, Hering et al. 2006, Pont et al. 2007, Stoddard et al. 2008, Pont et al. 2009). The between-site variability of the scores should only represent the variability of the alteration between sites. In the reference condition approach (Bailey et al. 1998), the first step consists of comparing observed values with the values expected in the absence of pressure. The results from the RDA clearly demonstrate that the environmental variability of the trait categories must be taken into account to compute metric scores (Oberdorff et al. 2002, Pont et al. 2006, 2007). Without this consideration, it would be impossible to separate the score variability caused by the environment from that caused by the pressures to which a site is subjected. One solution is to model the variability of the metrics in relation to the environment (Oberdorff et al. 2002, Baker et al. 2005, Pont et al. 2007, Pont et al. 2009, Hawkins et al. 2010) using reference sites (Stoddard et al. 2006) as calibration sites. Reference sites are selected on objective criteria (Whittier et al. 2007) reflecting the absence of impact or only slight impact (e.g. Hughes et al. 1998, Bates Prins and Smith 2007, Pont et al. 2007, 2009, Stoddard et al. 2008). The fitted values of the models are the expected values of the metrics in a given environment. Environmental variability is controlled by subtracting the expected value from the observed value; the resulting residuals are the deviations. The “residuals measure the range of metric variation expected after removing the effects of the abiotic predictor variables” (Pont et al. 2009). The ability of a metric to detect anthropogenic alterations will be determined in part by the ability to predict reliable expected values. Without consistent fitted values, it would be impossible to determine what the estimated deviation represents, whatever the method used (e.g. Fausch et al. 1984, Joy and Death 2002, Hughes et al. 2004, Pont et al. 2006, 2007, Roset et al. 2007, Hawkins et al. 2010).
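The residual-based scoring described in this paragraph can be sketched in a few lines of Python. This is only an illustration under strong assumptions, not the procedure of the cited studies: it uses a single environmental predictor and an ordinary least-squares line, and the helper names (`fit_line`, `metric_score`) and all data values are hypothetical.

```python
def fit_line(env, metric):
    """Ordinary least-squares fit of metric = intercept + slope * env."""
    n = len(env)
    mean_env = sum(env) / n
    mean_metric = sum(metric) / n
    sxx = sum((e - mean_env) ** 2 for e in env)
    sxy = sum((e - mean_env) * (m - mean_metric) for e, m in zip(env, metric))
    slope = sxy / sxx
    intercept = mean_metric - slope * mean_env
    return intercept, slope


def metric_score(observed, env, model):
    """Deviation = observed value minus the value expected in this environment."""
    intercept, slope = model
    expected = intercept + slope * env
    return observed - expected


# Hypothetical calibration data: reference (undisturbed) sites only.
ref_env = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]      # e.g. log distance from source
ref_metric = [1.1, 1.9, 3.2, 3.8, 5.1, 5.9]   # e.g. number of rheophilic species

model = fit_line(ref_env, ref_metric)

# Scores of the reference sites scatter around zero by construction.
ref_scores = [metric_score(m, e, model) for m, e in zip(ref_metric, ref_env)]

# Hypothetical test site: far fewer species than expected for its environment.
test_score = metric_score(observed=1.0, env=2.5, model=model)
```

A strongly negative `test_score` relative to the spread of `ref_scores` would flag a deviation that the environment alone does not explain, which is the quantity the paragraph above calls the residual.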

Once the effect of the environment has been controlled for, the ecoregions account for 7% of the total inertia of the fish assemblage structure. Even if this effect is low, it suggests an interregional variation of the response of the trait categories to environmental conditions. Consequently, is it possible to use the same model to predict the expected metric values for all regions (Pont et al. 2006, Pont et al. 2007)? Or must one model per region be used (Pont et al. 2009)? To answer these questions, one must test the convergence of the assemblage response to environmental variability between different regions (Wiens 1991, Smith and Ganzhorn 1996, Bellwood et al. 2002, Lamouroux et al. 2002, Irz et al. 2007, Ibañez et al. 2009, Hugueny et al. 2010).


P2: Do Iberian and European fish faunas exhibit convergent functional structure along environmental gradients? Journal of the North American Benthological Society

The convergence of the assemblage structure responses to environmental gradients was tested by comparing the assemblages from France and Belgium (FB) to assemblages from the Iberian Peninsula (MED). The two regions have different evolutionary histories (Banarescu 1992, Hewitt 2000, 2004) and relatively different species pools (Ferreira et al. 2007a, Reyjol et al. 2007). Among the 57 species observed, only ten occur in the two regions. Three models with different hypotheses concerning the relationship between traits and environment were used for each trait category. The first model hypothesises a qualitative convergence (Hugueny et al. 2010) between regions (Figure 8a); the second model hypothesises parallel responses to the environment between regions (quantitative convergence; Figure 8b); and the third model hypothesises that the responses to the environment could be different depending on the region (Figure 8c–f).

[Figure 8, panels a to f. x-axis: Environment; y-axis: Trait category.]

Figure 8: Theoretical relationship between a trait category and the environment in two regions (black and grey curves): a) qualitative convergence (the same model, i.e. the same curve, for the two regions); b), c) and d) quantitative convergence, the responses to the environment are not perfectly similar but vary in the same way for the two regions; e) and f) no convergence: in e) the trait category varies along the environmental gradient in one region but not in the other, and in f) the responses to the environment are opposite.
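The three hypotheses can be made concrete with a small Python sketch. It is a deliberately simplified illustration, assuming one environmental variable and a linear response; the coefficient values and function names are hypothetical and are not estimates from the study.

```python
# Hypothetical coefficients chosen only to illustrate the three hypotheses.
INTERCEPT = 1.0      # expected metric value at env = 0 in the baseline region (FB)
SLOPE = 0.5          # response to the environmental gradient in FB
REGION_SHIFT = -0.4  # constant deviation of MED (models 2 and 3)
SLOPE_SHIFT = -0.8   # additional slope in MED (model 3 only)


def is_med(region):
    """Indicator variable: 1.0 for the Iberian Peninsula, 0.0 for France/Belgium."""
    return 1.0 if region == "MED" else 0.0


def model1(env, region):
    """Same curve for both regions: region is ignored."""
    return INTERCEPT + SLOPE * env


def model2(env, region):
    """Parallel responses: same slope, constant shift between regions."""
    return INTERCEPT + SLOPE * env + REGION_SHIFT * is_med(region)


def model3(env, region):
    """Region-specific responses: the slope itself depends on the region."""
    return (INTERCEPT + SLOPE * env
            + REGION_SHIFT * is_med(region)
            + SLOPE_SHIFT * is_med(region) * env)


def regional_gap(model, env_values):
    """Difference MED minus FB along the gradient."""
    return [model(e, "MED") - model(e, "FB") for e in env_values]


gradient = [0.0, 1.0, 2.0, 3.0]
gaps1 = regional_gap(model1, gradient)  # zero everywhere
gaps2 = regional_gap(model2, gradient)  # constant along the gradient
gaps3 = regional_gap(model3, gradient)  # changes along the gradient
```

With these illustrative values the MED slope in `model3` is 0.5 - 0.8 = -0.3, so the two regions respond in opposite directions; a smaller `SLOPE_SHIFT` would instead give responses of the same sign but different magnitude.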


J. N. Am. Benthol. Soc., 2010, 29(4):1310–1323 © 2010 by The North American Benthological Society DOI: 10.1899/09-125.1 Published online: 7 September 2010

Do Iberian and European fish faunas exhibit convergent functional structure along environmental gradients?

Maxime Logez1 Cemagref, UR HYAX, 3275 Route de Cézanne – CS 40061, F-13182 Aix en Provence, France

Didier Pont2 Cemagref, UR HBAN, Parc de Tourvoie BP 44, 92163 Antony Cedex, France

Maria Teresa Ferreira3 Forest Research Centre, Technical University of Lisbon, Tapada da Ajuda, 1349-017 Lisboa, Portugal

Abstract. We tested whether the functional structures of Mediterranean and temperate western European fish communities responded similarly along environmental gradients. The species pools of the 2 regions were quite different, with few species common to both regions. Each species was assigned to 1 trait for each of 6 guilds considered. We aggregated occurrences or densities of the species sharing the same traits and then computed 26 metrics describing the functional structure of fish assemblages. For each metric, we fitted and then compared 3 nested models. The 1st model related the metric to environmental variables without taking into account the region. Therefore, the response was assumed to be the same between regions. The 2nd model related the metric to the environmental variables with the region as an additive parameter. Therefore, the response was assumed to be similar between regions but with a constant deviation between them. The 3rd model took into account all interactions between the environmental variables and region. Therefore, the response to the environmental gradient was assumed to be different in the 2 regions. For the 17 metrics finally tested, 11 metrics responded similarly to environmental gradients but generally showed a constant deviation between the 2 regions, and responses of 6 metrics differed between the regions. Our results highlight the roles played by biogeographical factors and the environment on current community structure in 2 neighboring but ecologically distinct regions.

Key words: convergence, functional trait, fish community, Mediterranean, environmental gradients.

The biological and ecological traits of local communities are selected through filters acting at different temporal and spatial scales (Tonn et al. 1990, Statzner et al. 2004). The River Habitat Templet theory proposed by Townsend and Hildrew (1994) extended Southwood’s theory (1977) and suggests that the spatial and temporal variability of the environment acts as a framework for evolution and selection of traits. Local environmental factors have been considered by some authors to be the principal determinants of community structure (Ricklefs and Schluter 1993, Ricklefs 2006). Convergence of features between communities living in similar environments is predicted by the local determinism theory independently of the geographical proximity and the evolutionary history of regions (Ricklefs 2006). This theory suggests that communities in different regions with different histories should present similar traits if the regions are environmentally similar.

Nonrandom association between traits displayed by communities and the environment has been demonstrated with macroinvertebrate (Statzner et al. 2004, Yoshimura et al. 2006) and fish communities (Magalhaes et al. 2002a, Pont et al. 2006, 2007, Hoeinghaus et al. 2007, Ibañez et al. 2007). The composition of traits in lotic fish communities appears to be structured along an environmental gradient (Pont et al. 2006, 2007, Hoeinghaus et al. 2007, Ibañez et al. 2007, 2009), whereas patterns derived from taxonomic composition reflect the role played by geographical and historical factors in current species distribution (Hoeinghaus et al. 2007).

1 E-mail addresses: [email protected]
2 [email protected]
3 [email protected]

209 2010] TEST OF FUNCTIONAL CONVERGENCE OF FISH ASSEMBLAGES 1311

Intercontinental comparisons of community structures have shown that communities composed of different species pools can display similar responses to environmental gradients (Winemiller and Adite 1997, Bellwood et al. 2002, Lamouroux et al. 2002, Melville et al. 2006). These observations support the hypothesis that the local environment drives evolution toward convergence because assemblages composed of different species tend to display similar characteristics to cope with the environmental conditions. Such underdispersion of traits is expected if habitat is the factor selecting species traits (Diaz et al. 1999, Cornwell et al. 2006). These studies compared current communities but did not test potential convergences because no information concerning the degree of similarity of past communities was available (Blondel 1991, Schluter and Ricklefs 1993). Other studies on fish and on other groups failed to find analogy between communities of different regions (Wiens 1991, Kelt et al. 1996, Smith and Ganzhorn 1996, Andrews and O’Brien 2000, Irz et al. 2007), a result suggesting that special factors (Ricklefs 2006), such as history (Ricklefs and Schluter 1993, Smith and Ganzhorn 1996, Samuels and Drake 1997, Lobo and Davis 1999, Qian and Ricklefs 2000, Tedesco et al. 2005, Vitt and Pianka 2005), also might partially control present community features.

The main objective of our study was to test whether characteristics of fish assemblages display convergent responses to environmental gradients in 2 regions with very different species pools. We focused on 2 European regions: the Iberian Peninsula (MED) and France and Belgium (FB). These 2 regions are contiguous but separated by the Pyrenees, which isolate the Iberian Peninsula from the rest of the European continent. Because of this isolation, the Iberian fish fauna was less affected by the last glacial maximum than the western fish fauna. The Iberian Peninsula acted as a refuge for fish and several other animal groups (Hewitt 1999, 2000, 2001, Gomez and Lunt 2007). Except for recent invasions (Lobón-Cerviá et al. 1989, Clavero and Garcia-Berthou 2006), species currently occurring in Iberia have long been established in this region (Mesquita and Coelho 2002) and potentially have experienced a long period of evolution (Filipe et al. 2009). On the contrary, the western European species pool mainly results from migratory events from eastern Europe (in particular from the Danube catchment) via multiple pathways (Koskinen et al. 2002, Weiss et al. 2002, Hewitt 2004). The present legacy of the last glacial maximum and of the Iberian isolation is illustrated by a higher number of endemic species in the Iberian Peninsula (Griffiths 2006, Reyjol et al. 2007), whereas the fauna of western Europe is taxonomically closer to eastern European faunas than to geographically closer faunas (Ferreira et al. 2007a, Reyjol et al. 2007).

We tested whether the structures of MED and FB lotic fish assemblages respond similarly to environmental gradients. Pont et al. (2006, 2007) demonstrated at the European scale that ecological and biological traits of fish are significantly related to environmental variability. We tested whether the relationship of these traits to environment variables was similar, parallel, or different in the 2 regions. We also tested whether fish community structure can be used to measure responses to the same environmental gradient even though the communities are from different regions and regardless of the differences in evolutionary history between regions.

Methods

Data collection

We used data from fish surveys of the Iberian Peninsula (central and southern Spain, and Portugal), France, and Belgium (Fig. 1), collected by several laboratories and governmental agencies. MED stream sites (river reaches) (n = 75) had a Mediterranean climate and were in biogeographic regions 1 and 2 (sensu Reyjol et al. 2007), whereas FB sites (n = 123) had a temperate climate and were in biogeographic region 7 (Reyjol et al. 2007).

FIG. 1. Locations of the 75 Iberian Peninsula (MED) sites and 123 sites in France and Belgium (FB). The biogeographical regions (BR) 1, 2, and 7 are adapted from Reyjol et al. (2007).

All sites were sampled using electrofishing methods during low-flow periods. Sampling methods were similar to those described by Pont et al. (2006). To

210 1312 M. LOGEZ ET AL. [Volume 29

homogenize the sampling effort between regions, only data collected during the first pass and only sites with ≥50 individuals were considered. All sites were sampled by wading, and the fished area (FISH; m²) was measured in the field. All sites were in permanent streams and were selected to be undisturbed or minimally disturbed to limit biases caused by human pressure on rivers (Ferreira et al. 2007b).

To test whether MED and FB fish assemblages present a convergent response to environmental gradients, we characterized the local environmental conditions with 5 environmental variables: distance from source (DIS; km), reach slope (SLOP; m/km), mean annual air temperature (TEMP; °C), geological type (GEO; calcareous or siliceous), and FISH. These variables were chosen to represent local environmental conditions because previous studies have demonstrated their influence on assemblage structure in France (Oberdorff et al. 2002) and more generally in western Europe (Pont et al. 2007). Wetted width and altitude were not retained because of their strong correlations with other variables (Pearson coefficients, r > 0.8, p < 0.05). Each site also was characterized by region (REG; MED or FB).

Metrics tested

To test the convergence of the functional structure of fish communities across regions, we considered 6 traits divided into 14 modalities (Table 1): 1) the tolerance of species to common anthropogenic impacts (alteration of flow regime, nutrient regime, habitat structure, and water chemistry; Pont et al. 2006), 2) the diet of adult individuals, 3) species feeding habitat, 4) species habitat (affinity to flow velocity), 5) reproduction (spawning substrate), and 6) migration behavior. The piscivorous modality (PISC) was not retained because only 0.15% of the fish caught were classified as piscivorous. These traits represent the most important aspects of the biology and ecology of fish species and could potentially influence the presence and abundance of species in a given environmental condition (Tonn et al. 1990, Verberk et al. 2008). We also included local richness (RICH) and total density (DENS) as metrics. We considered all species sampled to compute the metrics, “even the nonnative species, because they can play an important functional role in river ecosystems” (Pont et al. 2006, p. 74).

TABLE 1. Traits and modalities of the fish species used in this study.

Trait | Abbreviation | Modalities
Tolerance | INTOL | Intolerant species
Tolerance | TOLE | Tolerant species
Adult trophic guild | INSEV | Insectivorous/invertivorous species, diet composed of >75% of macroinvertebrates
Adult trophic guild | OMNI | Omnivorous species, diet containing >25% of plant material and >25% of animal material
Adult trophic guild | PISC | Obligatory piscivorous, >75% of fish in their diet
Feeding habitat | BENTH | Benthic species, prefer to live near the bottom where they feed
Feeding habitat | WATE | Water-column species, live and feed in the water column
Affinity to flow velocity (habitat) | LIMN | Limnophilic species, prefer slow flowing to stagnant conditions
Affinity to flow velocity (habitat) | RHEO | Rheophilic species, prefer to live in high-flow conditions
Affinity to flow velocity (habitat) | EURY | Eurytopic species, tolerant of wide range of flow conditions
Spawning habitat | LITH | Lithophilic species, require unsilted mineral substrate to spawn, larvae photophobic
Spawning habitat | PHYT | Phytophilic species, tend to spawn on vegetation, larvae not photophobic
Migration | POTAD | Potamodromous species, migrate within the inland waters of a river
Migration | LONG | Long-migratory (diadromous) species, migrate across a transition zone between fresh and marine water

Metrics can be expressed in number of species (richness; Ns), e.g., the number of insectivorous species, or in number of individuals (densities; Ni), e.g., the number of insectivorous individuals/ha (computed with FISH). For each metric, we considered both density and richness because they reflect different aspects of system functioning. The same metric could have a different response to environmental variables depending on whether it was based on density or richness. Ibañez et al. (2009) demonstrated that RICH and DENS vary differently along a longitudinal gradient.

The evolution of some trait categories along the environmental gradient has been studied in France (Oberdorff et al. 2002) and in western Europe (Pont et al. 2007), but little information is available for Mediterranean areas. We expected RICH to increase along an upstream–downstream gradient (Pont et al. 2007, Ibañez et al. 2009). We also expected tolerant


(TOLE), omnivorous (OMNI), piscivorous (PISC), water-column (WATE), and limnophilous (LIMN) fishes to increase with DIS or decrease with SLOP. On the other hand, we expected intolerant (INTOL) and rheophilous (RHEO) fishes to decrease with DIS and increase with SLOP. We also expected TOLE and OMNI to be positively related to increasing TEMP, whereas we expected INTOL to be negatively linked with TEMP.

Statistical analysis

Because the number of sites available in MED was lower than in FB, we subsampled FB sites to obtain similar distributions of environmental variables between the 2 regions. The subsamples were selected to limit potential geographical bias resulting from environmental dissimilarities between regions rather than differences in fish communities.

To test whether patterns of relationships between metrics and environment are the same in the 2 regions, we defined 3 nested multiple-regression models, which constituted our set of candidate models. Each model was related to a specific hypothesis. Model 1 related the metric to the environmental variables. The predictors were TEMP + TEMP² + SLOP + DIS + GEO. Model 1 assumed that responses to environmental gradients would be similar between regions (variable REG was not included in the model). Model 2 related each metric to environmental variables and REG as an additive parameter. The predictors were TEMP + TEMP² + SLOP + DIS + GEO + REG. Model 2 assumed that responses would be similar between regions but would differ by some constant amount between regions (parallel responses). Model 3 included all interactions between REG and the environmental variables. The predictors were TEMP + TEMP² + SLOP + DIS + GEO + REG + (REG × TEMP) + (REG × TEMP²) + (REG × SLOP) + (REG × DIS) + (REG × GEO), where (REG × variable) represents a multiplicative interaction between REG and the variable of interest. Model 3 assumed that the responses to environmental gradients were different in the 2 regions. We included a quadratic term for temperature in each model because Pont et al. (2007) showed nonlinear responses of traits to temperature. Moreover, because sampling effort could influence the species richness observed at a site (Angermeier and Schlosser 1989, Poff and Allan 1995), we added FISH as a predictor variable to each model of an Ns trait.

We used log-linear models (McCullagh and Nelder 1989, Faraway 2006) rather than linear models to model metrics based on number of species because of the nature of these metrics (count data; Cameron and Trivedi 1998). Log-linear models use a nonnormal distribution for the model errors, and dependent variables are linearly related to predictors through a link function (the logarithm function for the Poisson distribution; McCullagh and Nelder 1989, Cameron and Trivedi 1998). Therefore, the models have the form:

$$\log(Y) = \alpha + \sum_i \beta_i X_i$$

where Y is the dependent variable (i.e., each metric based on the number of species), α is the intercept, and β_i is the ith parameter, associated with the environmental variable X_i. The coefficients are estimated by maximizing the likelihood (McCullagh and Nelder 1989, Faraway 2006) rather than by ordinary least squares as for normal-error models (Kutner et al. 2005, Montgomery et al. 2006). The number of sites available for MED and FB differed, so we used prior weights in all models such that the sum of weights of the sites was equal between the 2 regions. We then tested model 2 vs model 1 and model 3 vs model 2 with a χ² test on the difference of deviance (McCullagh and Nelder 1989). To model metrics based on densities (Ni), which were previously log(x)-transformed because of the skewness of their distributions, we used multiple linear regressions. We compared the nested models (model 2 vs model 1 and model 3 vs model 2) by analyzing the difference of mean squares with F tests (ANOVA; Kutner et al. 2005, Montgomery et al. 2006).

After selecting the adequate model (model 1, model 2, or model 3) for each metric, we retained only those metrics for which the selected model presented: 1) no autocorrelation in the residuals (checked visually by representing residuals along fitted values), 2) normally distributed residuals (checked visually), 3) a linear trend between observed and fitted values, and 4) slope and intercept of the linear regression between observed and fitted values equal to 1 and 0, respectively.

We tested similarities of responses between regions for many metrics with the same statistical procedure, which generated a multiple testing issue (Shaffer 1995, Lamouroux et al. 2002). Therefore, we used the method proposed by Benjamini and Yekutieli (2001), which controls the false discovery rate (FDR; Benjamini and Hochberg 1995). Procedures that control FDR are more powerful, less conservative, and better limit the number of hypotheses inconsistently not rejected (Dudoit and van der Laan 2008) than procedures that control the family-wise error rate (e.g., simple Bonferroni method). As for every

multiple testing procedure (Dudoit and van der Laan 2008), the aim of Benjamini and Yekutieli’s procedure is to adjust the significance threshold that determines whether the null hypothesis of a test is rejected. We considered the 34 model comparisons (17 metrics × 2 model comparisons) as forming a family of hypotheses. All tests were mutually dependent because the same predictor data were used for all response metrics. For a simpler visualization of the test outputs, we used the formula provided by Dudoit and van der Laan (2008) to adjust the test p-values for the FDR. Adjusted p-values “reflect the strength of the evidence against each null hypothesis” (Dudoit and van der Laan 2008; p. 33) and, thus, provide a direct assessment of test significance.

We represented each significant relationship or divergent pattern (interaction) between the environmental variables and metrics through a graph effect display (Fox 1987, 2003). These graphs represent the predicted variation of each metric (fitted values) along 1 environmental gradient by holding constant the values of the other predictors (Fox 1987, 2003). Predicted values were computed with the coefficients of the final selected models. We displayed the response curves only for the environmental predictor or interactions with a significant effect on metric variability. The significance of each predictor or interaction was assessed using a drop-in-deviance test (F test for Ni metrics and χ² test for Ns metrics) by comparing the final model with the final model without the predictor or the interaction of interest (Chambers and Hastie 1993). All analyses were done in R (version 2.8.2; R Development Core Team, Vienna, Austria).

Results

MED and FB fish species pools were composed of 26 and 41 species, respectively, and had 10 species in common. Among these 10 species, 4 were nonnative in both regions. The number of nonnative species was similar between regions (8 in MED and 11 in FB). They represented, on average, 17.6% of the total number of individuals recorded in MED sites and 0.9% of the number of individuals in FB sites. Among the 8 nonnative species in MED, 3 were native from FB and represented, on average, 15.7% of the number of individuals sampled at MED sites. Moreover, RICH was higher at FB (6 species on average) than at MED sites (3.48 species; Wilcoxon, p < 0.001).

The environment across MED and FB was similar, except for TEMP and DIS (Kolmogorov–Smirnov tests on transformed variables, all p < 0.001; Fig. 2A–D, Table 2). Mean TEMP was higher (Kruskal–Wallis, p < 0.001; Table 2) and more variable (Levene’s test, p < 0.01; Fig. 2A) in MED than in FB. Ranges for DIS were similar between regions, but variance was lower in MED (Levene’s test, p < 0.05; Fig. 2C). Ranges and distributions of SLOP and FISH were similar between regions (Kolmogorov–Smirnov, p > 0.05; Fig. 2B, D). The distribution of GEO modalities did not differ between regions (χ², p > 0.05; Table 2).

FIG. 2. Histograms of the distribution of temperature (TEMP) (A), reach slope (SLOP) (B), distance from headwaters (DIS) (C), and area fished (FISH) (D) at sites in the Iberian Peninsula (MED) and France and Belgium (FB).

Only 17 metrics remained after checking the final model selected for each metric (see Methods) (Table 3). These included 8 density-based metrics (including DENS) and 9 richness-based metrics (including RICH). Density- and richness-based eurytopic (EURY), lithophilic (LITH), potamodromous (POTAD), RHEO, and TOLE remained, whereas only density-based insectivorous (INSEV) and OMNI and only richness-based benthic (BENTH), WATE, and INTOL remained. Neither density- nor richness-based phytophilous (PHYT) could be fitted successfully (Table 3).

Three of the 17 metrics had nonsignificant responses to REG or the REG × environmental variable interaction (model 1), 8 had significant additive responses to REG (model 2), and 6 had significant responses to ≥1 REG × environmental variable interaction (model 3) (Table 3). The 6 metrics with significant responses to REG × environmental variable interactions could be grouped into 2 categories. Ni_EURY (Fig. 3B), Ni_INSEV (Fig. 3C), and Ns_EURY (Fig. 4C) responded similarly but with different magnitudes to the same environmental gradient


in both regions (both slopes were positive or negative but the coefficients differed), whereas DENS (Fig. 3A), Ni_POTAD (Fig. 3F), and Ni_RHEO (Fig. 3G) responded oppositely to the same environmental gradient in both regions (a positive slope in 1 region and a negative slope in the other region).

The significant REG × environmental variable interactions observed for Ns_EURY, Ni_EURY, and Ni_INSEV were mainly caused by different magnitudes of variation of those metrics along environmental gradients. In both regions, Ni_EURY was higher in a siliceous than in a calcareous geology (partial F-test, p < 0.001), but the deviation between the 2 dominant geologies was lower in FB than in MED (partial F-test, p < 0.05). Ni_INSEV increased with decreasing DIS in MED, whereas it varied only slightly with DIS in FB, and the maximum observed in the relationship between Ni_INSEV and TEMP was shifted toward warmer temperatures in MED than in FB (Fig. 3C). The interaction between GEO and REG also was significant for Ni_INSEV (partial F-test, p < 0.05). The deviation between the 2 dominant geologies was almost null in FB, whereas Ni_INSEV was on average lower in calcareous geology than in siliceous geology in MED (Fig. 3C).

The 3 metrics with opposite patterns of response to environmental variables in MED and FB were based on density. DENS was positively related to SLOP in MED and negatively in FB (Fig. 3A). Ni_POTAD decreased with SLOP in both regions. However, Ni_POTAD decreased with increasing DIS in MED and increased with increasing DIS in FB (Fig. 3F). Ni_RHEO displayed opposite patterns of variation along DIS between MED and FB. Ni_RHEO increased with DIS in FB and decreased with DIS in MED (Fig. 3G). Moreover, the relationships between TEMP and Ni_POTAD and Ni_RHEO were approximately linear in FB and quadratic in MED.

Most metrics (8 of 11) with nonsignificant interactions between REG and environmental variables were based on richness (Table 3). RICH (Fig. 4A), Ns_BENTH (Fig. 4B), Ns_INTOL (Fig. 4D), Ns_LITH (Fig. 4E), Ns_RHEO (Fig. 4G), Ns_TOLE (Fig. 4H), Ns_WATE (Fig. 4I), and Ni_TOLE (Fig. 3H) had similar responses in MED and FB but theoretical values expected in a given environment were always lower in MED than in FB (all MED curves were below FB curves). Ni_OMNI (Fig. 3E) and Ni_TOLE (Fig. 3H), 2 traits associated with resistance to stressful conditions, and Ni_LITH (Fig. 3D) were the only 3 density-based metrics that had nonsignificant interactions (either model 1 or model 2 was selected for these metrics; Table 3). Ni_OMNI and Ni_TOLE tended to increase with DIS and TEMP and to decrease with SLOP. Ns_POTAD, Ni_LITH, and Ni_OMNI were the only 3 metrics with no significant effect of region (Table 3). The R² values of multiple linear regressions relating density-based metrics and environmental variables varied between 0.193 and 0.461 (Table 3), whereas the pseudo-R² values of the log-linear models relating richness-based metrics and environmental variables varied between 0.266 and 0.671 (Table 3).

TABLE 2. Means (SD) and ranges of environmental variables in the Iberian Peninsula (MED), France and Belgium (FB), and both regions combined. TEMP = temperature, SLOP = reach slope, DIS = distance from the headwaters, FISH = area fished at a reach, GEO = geology.

Variables | Units | Statistics | MED | FB | Combined regions
TEMP | °C | Range | 2–2.9 | 1.8–2.7 | 1.8–2.9
TEMP | °C | Mean (SD) | 2.58 (0.18) | 2.27 (0.17) | 2.38 (0.23)
log(SLOP) | m/km | Range | 0–4.1 | −0.9–4.2 | −0.9–4.2
log(SLOP) | m/km | Mean (SD) | 2.18 (1.06) | 1.96 (1.01) | 2.04 (1.03)
log(DIS) | km | Range | 1.1–5.1 | 0–5.3 | 0–5.3
log(DIS) | km | Mean (SD) | 2.92 (0.78) | 2.49 (1.01) | 2.65 (0.95)
log(FISH) | m² | Range | 5–8.3 | 4.6–9.7 | 4.6–9.7
log(FISH) | m² | Mean (SD) | 6.31 (0.69) | 6.48 (0.81) | 6.42 (0.77)
GEO | number of sites | Calcareous | 35 | 57 | 92
GEO | number of sites | Siliceous | 40 | 66 | 106

TABLE 3. Partial F (for total density [DENS] and density-based [Ni] metrics) and χ² (for species richness [RICH] and richness-based [Ns] metrics) statistics and their adjusted p-values for comparison of nested regression models. The regional effect was not significant (ns) if no comparisons were significant, additive if the comparison of models 1 and 2 was significant and the comparison of models 2 and 3 was not significant, and interactive if the comparison of models 2 and 3 was significant (see Methods for details). Pseudo-R² (% deviance explained) assesses the quality of log-linear models for Ns and RICH metrics. Metric abbreviations are given in Table 1.

Metric | Model 1 vs Model 2: Statistic | Model 1 vs Model 2: p-value | Model 2 vs Model 3: Statistic | Model 2 vs Model 3: p-value | Regional effect | R² or pseudo-R²: Model 1 | Model 2 | Model 3
DENS | 4.023 | 0.309 | 4.522 | 0.008 | Interactive | 0.245 | 0.261 | 0.341
Ni_EURY | 1.545 | 1 | 4.729 | 0.006 | Interactive | 0.252 | 0.258 | 0.341
Ni_INSEV | 0.196 | 1 | 5.586 | 0.002 | Interactive | 0.313 | 0.314 | 0.403
Ni_LITH | 2.576 | 0.617 | 3.184 | 0.068 | ns | 0.193 | 0.204 | 0.267
Ni_OMNI | 0.964 | 1 | 1.854 | 0.610 | ns | 0.311 | 0.314 | 0.347
Ni_POTAD | 13.897 | 0.004 | 6.191 | 0.001 | Interactive | 0.211 | 0.265 | 0.370
Ni_RHEO | 12.288 | 0.007 | 5.454 | 0.002 | Interactive | 0.342 | 0.382 | 0.461
Ni_TOLE | 9.897 | 0.018 | 0.47 | 1 | Additive | 0.323 | 0.357 | 0.365
RICH | 27.039 | <0.001 | 13.15 | 0.154 | Additive | 0.497 | 0.584 | 0.626
Ns_BENTH | 19.022 | <0.001 | 1.157 | 1 | Additive | 0.409 | 0.500 | 0.506
Ns_EURY | 1.777 | 0.983 | 16.88 | 0.041 | Interactive | 0.308 | 0.315 | 0.379
Ns_INTOL | 33.438 | <0.001 | 4.506 | 1 | Additive | 0.447 | 0.586 | 0.604
Ns_LITH | 16.342 | 0.001 | 4.556 | 1 | Additive | 0.426 | 0.536 | 0.567
Ns_POTAD | 0.819 | 1 | 15.563 | 0.068 | ns | 0.266 | 0.270 | 0.348
Ns_RHEO | 31.059 | <0.001 | 10.23 | 0.439 | Additive | 0.491 | 0.627 | 0.671
Ns_TOLE | 9.604 | 0.018 | 9.56 | 0.540 | Additive | 0.340 | 0.372 | 0.404
Ns_WATE | 9.633 | 0.018 | 14.862 | 0.081 | Additive | 0.438 | 0.485 | 0.556
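The decision rule stated in the caption of Table 3 can be replayed on the adjusted p-values reported in that table. The Python sketch below is an illustration only: the 0.05 threshold is an assumption (the caption says "significant" without restating the level), the function name `regional_effect` is hypothetical, and p-values reported as "<0.001" are stored as 0.001, an upper bound.

```python
from collections import Counter

ALPHA = 0.05  # assumed significance threshold for the adjusted p-values

# (metric, adjusted p for model 1 vs model 2, adjusted p for model 2 vs model 3),
# copied from Table 3; "<0.001" is stored as 0.001 (upper bound).
TABLE3 = [
    ("DENS", 0.309, 0.008), ("Ni_EURY", 1.0, 0.006), ("Ni_INSEV", 1.0, 0.002),
    ("Ni_LITH", 0.617, 0.068), ("Ni_OMNI", 1.0, 0.610), ("Ni_POTAD", 0.004, 0.001),
    ("Ni_RHEO", 0.007, 0.002), ("Ni_TOLE", 0.018, 1.0), ("RICH", 0.001, 0.154),
    ("Ns_BENTH", 0.001, 1.0), ("Ns_EURY", 0.983, 0.041), ("Ns_INTOL", 0.001, 1.0),
    ("Ns_LITH", 0.001, 1.0), ("Ns_POTAD", 1.0, 0.068), ("Ns_RHEO", 0.001, 0.439),
    ("Ns_TOLE", 0.018, 0.540), ("Ns_WATE", 0.018, 0.081),
]


def regional_effect(p_1v2, p_2v3, alpha=ALPHA):
    """Classify the regional effect following the rule in the Table 3 caption."""
    if p_2v3 < alpha:
        return "interactive"   # model 3 improves on model 2
    if p_1v2 < alpha:
        return "additive"      # model 2 improves on model 1, model 3 adds nothing
    return "ns"                # no comparison is significant


effects = {metric: regional_effect(p12, p23) for metric, p12, p23 in TABLE3}
counts = Counter(effects.values())
```

Under the assumed threshold, `counts` gives 3 metrics with no regional effect, 8 additive, and 6 interactive, which matches the totals reported in the Results.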

FIG. 3. Marginal effect of temperature (TEMP), reach slope (SLOP), and distance from headwaters (DIS) on total fish density (DENS) (A) and density (Ni) of eurytopic (EURY) (B), insectivorous (INSEV) (C) lithophilic (LITH) (D), omnivorous (OMNI) (E), potamodromous (POTAD) (F), rheophilic (RHEO) (G), and tolerant (TOLE) (H) fishes, conditional on the medians of other predictors (Fox 1987). Only significant effects are plotted (see Methods for details). The common response pattern (model 1) is represented by a dashed curve; otherwise black and grey curves represent responses in the Iberian Peninsula (MED) and France and Belgium (FB). Metrics with significant interactions are indicated with an asterisk. Calcareous (c) and siliceous (s) environments were differentiated if the interaction between geology (GEO) and region (REG) was significant. ns was added to the graph if the environmental predictor had a significant effect but the region × variable interaction was not statistically significant.

Discussion

The aim of our study was to test whether the structure of communities between Mediterranean and western European faunas (represented by France and Belgium), composed of different pools of species (Clavero and Garcia-Berthou 2006, Ferreira et al. 2007a, Reyjol et al. 2007), responded similarly along environmental gradients (Hoeinghaus et al. 2007). In other words, we asked whether community structures indicated convergent responses to environmental gradients between these 2 regions.

Overall, MED and FB fish faunas showed convergent functional structure along environmental gradients. Three metrics had the same response patterns, 8 metrics had similar response patterns but significant additive deviation between the 2 regions, and 6 metrics had different response patterns between regions (interactive effects) to the same environmental variables. Among the metrics with interactive responses, 3 metrics showed similar response patterns, and 3 showed opposite response patterns to environmental gradients.
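The three outcomes distinguished above (a common response, a parallel response with an additive regional deviation, and an interactive response) correspond to three nested regression models, referred to in the text as models 1, 2, and 3. The Python sketch below illustrates that hierarchy on simulated data. The variable names, the simulated coefficients, the region coding, and the use of a plain F statistic are assumptions made for illustration; the sketch does not reproduce the model-selection procedure described in the Methods.

```python
import numpy as np


def fit_rss(X, y):
    """Ordinary least squares; returns coefficients and residual sum of squares."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return beta, float(resid @ resid)


def nested_f(rss_small, rss_big, df_extra, df_resid_big):
    """F statistic comparing a smaller model nested in a bigger one."""
    return ((rss_small - rss_big) / df_extra) / (rss_big / df_resid_big)


rng = np.random.default_rng(0)
n = 200
x = rng.uniform(0.0, 1.0, size=2 * n)   # hypothetical standardized gradient
region = np.repeat([0.0, 1.0], n)       # illustrative coding: 0 = FB, 1 = MED
# Simulated "parallel" response: same slope, constant offset between regions.
y = 2.0 - 1.5 * x - 0.8 * region + rng.normal(0.0, 0.3, size=2 * n)

ones = np.ones_like(x)
X1 = np.column_stack([ones, x])                      # model 1: common response
X2 = np.column_stack([ones, x, region])              # model 2: additive region effect
X3 = np.column_stack([ones, x, region, x * region])  # model 3: region x gradient interaction

(b1, rss1), (b2, rss2), (b3, rss3) = (fit_rss(X, y) for X in (X1, X2, X3))
f_12 = nested_f(rss1, rss2, 1, len(y) - X2.shape[1])
f_23 = nested_f(rss2, rss3, 1, len(y) - X3.shape[1])
print(f"F(model 1 vs 2) = {f_12:.1f}, F(model 2 vs 3) = {f_23:.2f}")
```

In this simulated case the additive term accounts for most of the improvement in fit, which is the situation described for the 8 metrics with parallel responses; a large F statistic for model 2 versus model 3 would instead point to an interactive response.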

Multidimensional nature of the environment

We are convinced that a statistical strategy that uses several key environmental variables simultaneously is necessary to test the convergence of assemblage structure variations along environmental gradients. From an ecological point of view, this strategy integrates the multidimensional aspect of the environment. Various environmental variables influence assemblage structure (Oberdorff et al. 2002, Pont et al. 2005, Hoeinghaus et al. 2007). Thus, key variables should be used jointly to represent local environmental conditions. At a large scale, 2 given sites could present common and divergent environmental attributes (e.g., the same SLOP but different TEMP). From a statistical point of view, use of models that simultaneously integrate the effects of various environmental variables seems more consistent than using several models, each with each single variable, to explain the variability of the metrics. In multiple linear regressions, each parameter reflects the partial effect of 1 predictor variable when the other predictor variables included in the model are held constant (Kutner et al. 2005). When predictors are not independent, the coefficients estimated by univariate and multiple-predictor models (e.g., simple linear regression and multiple linear regression) can show opposite patterns (Greene 2003). Thus, the pattern of variation between a given metric could be either positively or negatively related to the same environmental variable depending on the model used. We are convinced that using multiple-predictor models rather than various single-predictor models would provide a more accurate estimate of the relationships between assemblage structures and individual environmental variables because the estimated effect of each environmental variable is less confounded by the effects of other correlated variables.

The relatively low R² or pseudo-R² values observed for some metrics, even with the most complex models (model 3), indicate that other mechanisms or environmental variables also might be important explanatory factors for these metrics. For example, inclusion of precipitation probably would have increased the amount of variance or deviance explained by our models. High spatial or temporal sampling variability also could be a reason for low R² or pseudo-R² values. Uncertainty about the taxonomy of cyprinid taxa and insufficient knowledge of the preferences of some Iberian species also could limit our conclusions (Ferreira et al. 2007a, b).

FIG. 4. Marginal effect of temperature (TEMP), reach slope (SLOP), and distance from headwaters (DIS) on total fish species richness (RICH) (A) and richness (Ns) of benthic (BENTH) (B), eurytopic (EURY) (C) intolerant (INTOL) (D), lithophilic (LITH) (E), potamodromous (POTAD) (F), rheophilic (RHEO) (G), tolerant (TOLE) (H), and water-column (WATE) (I) fishes, conditional on the medians of other predictors (Fox 1987). Only significant effects are plotted (see Methods for details). The common response pattern (model 1) is represented by a dashed curve; otherwise black and grey curves represent responses in the Iberian Peninsula (MED) and France and Belgium (FB). Metrics with significant interactions are indicated with an asterisk. Calcareous (c) and siliceous (s) environments were differentiated if the interaction between geology (GEO) and region (REG) was significant. ns was added to the graph if the environmental predictor had a significant effect but the region × variable interaction was not statistically significant.

Convergence of traits

Eleven of 17 metrics showed the same or parallel responses to environmental gradients in the 2 regions. Thus, these responses could be considered convergent between MED and FB (Lamouroux et al. 2002, Tedesco et al. 2005, Irz et al. 2007, Hugueny et al. 2010). Moreover, responses for 3 of the metrics with significant interactions between region and environmental variables were fairly similar between the 2 regions. However, the 3 remaining metrics clearly are not convergent between MED and FB based on their opposite responses to at least 1 environmental variable between regions (Irz et al. 2007, Hugueny et al. 2010).

Most of the response patterns observed for the 14 metrics that were similar between regions were consistent with our expectations. The 9 metrics based on richness increased along the longitudinal gradient (decreases with SLOP). Increasing local species richness along the longitudinal stream gradient is a very well-known pattern (e.g., Horwitz 1978, Hugueny 1989, Oberdorff et al. 1995, Grenouillet et al. 2004) that has been observed on almost all continents (Ibañez et al. 2009).

In general, local species richness in Mediterranean streams is low and is substantially lower than in western European streams (Reyjol et al. 2007). Local species richness of Iberian streams is also lower than that of French Mediterranean streams (Ferreira et al. 2007a). We observed a maximum of 6 fish species in the most-downstream section of the MED streams, a result that is in accordance with results of previous studies (Godinho et al. 1997, 2000, Carmona et al. 1999, Pires et al. 1999, 2004, Magalhaes et al. 2002a, b, 2007, Mesquita and Coelho 2002, Clavero et al. 2004, Clavero and Garcia-Berthou 2006, Mesquita et al. 2006, Ferreira et al. 2007a, b), whereas a maximum of 20 species was recorded for FB streams. The difference in RICH between MED and FB can be partly explained by the difference of regional species richness (Hugueny et al. 2010) between these 2 regions (Reyjol et al. 2007, Leprieur et al. 2008b). The magnitude of richness-based metric responses always was lower in MED than in FB (except for Ns_POTAD) because RICH was lower in the most-downstream part of MED streams than in comparable reaches of FB streams. Therefore, the 7 parallel responses and the interaction patterns displayed by richness-based metrics can be interpreted as the mechanical consequences of these differences.

The pattern of response to environmental gradients displayed by densities of omnivorous and tolerant individuals also matched our expectations. Indeed, these 2 traits, associated with tolerance, both increased along longitudinal gradients (decreasing with SLOP or increasing with DIS). Such patterns have been observed previously in Europe (Oberdorff et al. 2002, Pont et al. 2007), North America (McGarvey and Hughes 2008), and certain African tropical streams (Ibañez et al. 2007), suggesting a general pattern (Ibañez et al. 2009). Densities of omnivorous and tolerant individuals also were positively related to temperature, which seems to be a general pattern in Europe (Pont et al. 2007).

Our results demonstrate that assemblages composed of species from different species pools (with <15% of common species) can show similar responses to environmental gradients (Bellwood et al. 2002, Lamouroux et al. 2002, Hoeinghaus et al. 2007). The western European fish fauna is composed mainly of species that have colonized this area through different migratory pathways following the latest glaciations (Hewitt 2000, 2004, Kontula and Vainola 2001, Weiss et al. 2002, Griffiths 2006, Gomez and Lunt 2007). In French Mediterranean streams, the occurrence of certain species that are abundant in central Europe attests to this past colonization (Ferreira et al. 2007a). Moreover, Spanish and Portuguese faunas appear to be much more similar to each other than to French Mediterranean faunas. Only headwater and colder-stream fish faunas, which generally are dominated by salmonids, were relatively similar between MED and FB (Ferreira et al. 2007a). Despite these differences, MED and FB fish assemblage structures have generally convergent patterns of variation to environmental gradients. Thus, the environment could be considered as a driving force of selection on species characteristics (Tonn et al. 1990, Keddy 1992, Townsend and Hildrew 1994, Diaz et al. 1999, Cornwell et al. 2006).

The similar response patterns in MED and FB of 11 metrics for which REG had a significant effect (either additive or interactive) suggest that the natural processes involved in the 2 regions are similar. The consistent deviation between regions observed for these metrics probably is a consequence of geographical or historical factors (Ricklefs and Schluter 1993, Lobo and Davis 1999, Qian and Ricklefs 2000, Lamouroux et al. 2002, Johnson et al. 2004, Tedesco et al. 2005, Ricklefs 2006, Irz et al. 2007, Hugueny et al. 2010). For example, Ibañez et al. (2009) demonstrated that local species richness changed along a longitudinal gradient on 4 continents but found specific response curves for each continent. However, the opposite patterns displayed by 3 metrics in our study suggest that not all community features change similarly in different geographical areas (Smith and Ganzhorn 1996, Irz et al. 2007). Richness-based metrics seem to be more convergent than density-based metrics. In our study, 5 of the 6 metrics with significant interaction between region and environmental variables were based on density, whereas richness-based metrics had similar response patterns between regions, even if the responses differed between regions by some constant magnitude.

Homogenization of the Iberian fish fauna (Clavero and Garcia-Berthou 2006) could have blurred the patterns observed in our study (Leprieur et al. 2008a). In MED, 30% of lithophilic individuals belong to nonnative species. This pattern should have led to failure to reject the hypothesis of common responses between MED and FB (model 1) or the hypothesis of parallel responses to environmental gradients (model 2). Overall, the effect of faunal homogenization should be relatively small because of the low number of nonnative individuals occurring in MED and FB. For example, the number of nonnative omnivorous individuals in MED was very low (<10% on average at each site), and this metric was 1 of the 3 metrics with a common response to the environment in the 2 regions.

Implication in the development of bioindicators

The use of metrics based on assemblage characteristics rather than on taxonomy has enabled development of multimetric indices at a very large scale (Pont et al. 2006, 2007). Such indices assess the status of a test site by comparing its assemblage characteristics with the theoretical characteristics that would be expected in absence of human pressure or in least-disturbed conditions (Hughes et al. 1998, Hering et al. 2006, Pont et al. 2006, Stoddard et al. 2006). The reference-condition approach (Bailey et al. 1998) has been widely used and has been advocated in Europe by the Water Framework Directive (European Union 2000). With this approach, accurate estimation of reference-assemblage characteristics is a crucial step in the development of multimetric indices. Several authors (e.g., Baker et al. 2005, Pont et al. 2006, 2007, 2009, Hawkins et al. 2010) have used a similar modeling approach to predict the site-specific values of assemblage characteristics in reference conditions, as a function of (nonanthropogenic) variables, such as air temperature, slope, and distance from source. Use of such models to predict reference-assemblage characteristics jointly for MED and FB is based on the assumption that response patterns are the same in the 2 regions. Therefore, our results should have important consequences for the development of common bioindicators for MED and FB. Most of the metrics show a similar pattern of variation along environmental gradients. Nevertheless, most of them also show consistent deviations between the 2 regions, and some of them are influenced by interactions between regional and environmental variables. Our results suggest that using the same models to predict assemblage characteristics for both regions based on the densities or the number of species in absence of human activities would lead to inaccurately fitted reference values. This problem is especially acute for metrics based on densities because responses of a high proportion of these metrics show a significant interactive effect of region and environment. Regional differences must be taken into account (Pont et al. 2009) during development of common indices. Considering metrics based on proportion rather than on densities or richness might be a valuable alternative for development of a large-scale multimetric index.

Acknowledgements

We thank John Van Sickle, Pamela Silver, and 2 anonymous referees for comments that greatly improved this manuscript. This work was funded by the European Commission within the Sixth Framework Programme (EFI+ project, contract number 044096). This study was conducted with data collected during the Fish-based Assessment Method for the Ecological Status of European Rivers (FAME; http://fame.boku.ac.at/). We are also grateful to F. Capel, R. Cortes, D. García de Jalón, and A. Sostoa for their participation in the construction of the Mediterranean database.

Literature Cited

ANDREWS, P., AND E. M. O’BRIEN. 2000. Climate, vegetation, and predictable gradients in mammal species richness in southern Africa. Journal of Zoology 251:205–231.
ANGERMEIER, P. L., AND I. J. SCHLOSSER. 1989. Species-area relationships for stream fishes. Ecology 70:1450–1462.
BAILEY, R. C., M. G. KENNEDY, M. Z. DERVISH, AND R. M. TAYLOR. 1998. Biological assessment of freshwater ecosystems using a reference condition approach: comparing predicted and actual benthic invertebrate communities in Yukon streams. Freshwater Biology 39:765–774.
BAKER, E. A., K. E. WEHRLY, P. W. SEELBACH, L. WANG, M. J. WILEY, AND T. SIMON. 2005. A multimetric assessment of stream condition in the northern lakes and forests ecoregion using spatially explicit statistical modeling and regional normalization. Transactions of the American Fisheries Society 134:697–710.
BELLWOOD, D. R., P. C. WAINWRIGHT, C. J. FULTON, AND A. HOEY. 2002. Assembly rules and functional groups at global biogeographical scales. Functional Ecology 16:557–562.
BENJAMINI, Y., AND Y. HOCHBERG. 1995. Controlling the false discovery rate: a practical and powerful approach to multiple testing. Journal of the Royal Statistical Society Series B: Statistical Methodology 57:289–300.
BENJAMINI, Y., AND D. YEKUTIELI. 2001. The control of the false discovery rate in multiple testing under dependency. Annals of Statistics 29:1165–1188.
BLONDEL, J. 1991. Assessing convergence at the community-wide level. Trends in Ecology and Evolution 6:271–272.
CAMERON, A. C., AND P. K. TRIVEDI. 1998. Regression analysis of count data. Cambridge University Press, Cambridge, UK.
CARMONA, J. A., I. DOADRIO, A. L. MARQUEZ, R. REAL, B. HUGUENY, AND J. M. VARGAS. 1999. Distribution patterns of indigenous freshwater fishes in the Tagus River basin, Spain. Environmental Biology of Fishes 54:371–387.
CHAMBERS, J. M., AND T. J. HASTIE. 1993. Statistical models in S. Chapman and Hall, London, UK.
CLAVERO, M., F. BLANCO-GARRIDO, AND J. PRENDA. 2004. Fish fauna in Iberian Mediterranean river basins: biodiversity, introduced species and damming impacts. Aquatic Conservation: Marine and Freshwater Ecosystems 14:575–585.
CLAVERO, M., AND E. GARCIA-BERTHOU. 2006. Homogenization dynamics and introduction routes of invasive freshwater fish in the Iberian Peninsula. Ecological Applications 16:2313–2324.
CORNWELL, W. K., D. W. SCHWILK, AND D. D. ACKERLY. 2006. A trait-based test for habitat filtering: convex hull volume. Ecology 87:1465–1471.
DIAZ, S., M. CABIDO, AND F. CASANOVES. 1999. Functional implications of trait–environment linkages in plant communities. Pages 338–362 in P. Weiher and P. Keddy (editors). Ecological assembly rules: perspectives, advances, and retreats. Cambridge University Press, Cambridge, UK.
DUDOIT, S., AND M. J. VAN DER LAAN. 2008. Multiple testing procedures with applications to genomics. Springer Series in Statistics. Springer, New York.
EUROPEAN UNION. 2000. Directive 2000/60/EC of the European parliament and of the council establishing a framework for the community action in the field of water policy. European Commission: Journal of the European Community L327:1–72.
FARAWAY, J. J. 2006. Extending the linear model with R. Generalized linear, mixed effects and nonparametric regression models. Chapman and Hall/CRC, Boca Raton, Florida.
FERREIRA, T., J. OLIVEIRA, N. CAIOLA, A. DE SOSTOA, F. CASALS, R. CORTES, A. ECONOMOU, S. ZOGARIS, D. GARCIA-JALON, M. ILHÉU, F. MARTINEZ-CAPEL, D. PONT, C. ROGERS, AND J. PRENDA. 2007a. Ecological traits of fish assemblages from Mediterranean Europe and their responses to human disturbance. Fisheries Management and Ecology 14:473–481.
FERREIRA, T., L. SOUSA, J. M. SANTOS, L. REINO, J. OLIVEIRA, P. R. ALMEIDA, AND R. V. CORTES. 2007b. Regional and local environmental correlates of native Iberian fish fauna. Ecology of Freshwater Fish 16:504–514.
FILIPE, A. F., M. B. ARAUJO, I. DOADRIO, P. L. ANGERMEIER, AND M. J. COLLARES-PEREIRA. 2009. Biogeography of Iberian freshwater fishes revisited: the roles of historical versus contemporary constraints. Journal of Biogeography 36:2096–2110.
FOX, J. 1987. Effect displays for generalized linear models. Sociological Methodology 17:347–361.
FOX, J. 2003. Effect displays in R for generalised linear models. Journal of Statistical Software 8:1–27.
GODINHO, F. N., M. T. FERREIRA, AND R. V. CORTES. 1997. Composition and spatial organization of fish assemblages in the lower Guadiana basin, southern Iberia. Ecology of Freshwater Fish 6:134–143.
GODINHO, F. N., M. T. FERREIRA, AND J. M. SANTOS. 2000. Variation in fish community composition along an Iberian river basin from low to high discharge: relative contributions of environmental and temporal variables. Ecology of Freshwater Fish 9:22–29.
GOMEZ, A., AND D. H. LUNT. 2007. Refugia within refugia: patterns of phylogeographic concordance in the Iberian Peninsula. Pages 155–188 in S. Weiss and N. Ferrand (editors). Phylogeography of southern European refugia. Springer, Dordrecht, The Netherlands.
GREENE, W. H. 2003. Econometric analysis. 5th edition. Pearson Education, Inc., Upper Saddle River, New Jersey.
GRENOUILLET, G., D. PONT, AND C. HÉRISSÉ. 2004. Within-basin fish assemblage structure: the relative influence of habitat versus stream spatial position on local species richness. Canadian Journal of Fisheries and Aquatic Sciences 61:93–102.
GRIFFITHS, D. 2006. Pattern and process in the ecological biogeography of European freshwater fish. Journal of Animal Ecology 75:734–751.
HAWKINS, C. P., Y. CAO, AND B. ROPER. 2010. Method of predicting reference condition biota affects the performance and interpretation of ecological indices. Freshwater Biology 55:1066–1085.
HERING, D., C. K. FELD, O. MOOG, AND T. OFENBÖCK. 2006. Cook book for the development of a multimetric index for biological condition of aquatic ecosystems: experiences from the European AQEM and STAR projects and related initiatives. Hydrobiologia 566:311–324.
HEWITT, G. M. 1999. Post-glacial re-colonization of European biota. Biological Journal of the Linnean Society 68:87–112.
HEWITT, G. M. 2000. The genetic legacy of the Quaternary ice ages. Nature 405:907–913.
HEWITT, G. M. 2001. Speciation, hybrid zones and phylogeography—or seeing genes in space and time. Molecular Ecology 10:537–549.
HEWITT, G. M. 2004. Genetic consequences of climatic oscillations in the Quaternary. Philosophical Transactions of the Royal Society of London Series B: Biological Sciences 359:183–195.
HOEINGHAUS, D. J., K. O. WINEMILLER, AND J. S. BIRNBAUM. 2007. Local and regional determinants of stream fish assemblage structure: inferences based on taxonomic vs. functional groups. Journal of Biogeography 34:324–338.
HORWITZ, R. J. 1978. Temporal variability patterns and the distributional patterns of stream fishes. Ecological Monographs 48:307–321.
HUGHES, R. M., P. R. KAUFMANN, A. T. HERLIHY, T. M. KINCAID, L. REYNOLDS, AND D. P. LARSEN. 1998. A process for developing and evaluating indices of fish assemblage integrity. Canadian Journal of Fisheries and Aquatic Sciences 55:1618–1631.
HUGUENY, B. 1989. West-African rivers as biogeographic islands: species richness of fish communities. Oecologia (Berlin) 79:236–243.
HUGUENY, B., T. OBERDORFF, AND P. A. TEDESCO. 2010. Community ecology of river fishes: a large-scale perspective. in D. A. Jackson and K. B. Gido (editors). Community ecology of stream fishes: concepts, approaches, and techniques. Symposium 73. American Fisheries Society, Bethesda, Maryland (in press).
IBAÑEZ, C., J. BELLIARD, R. M. HUGHES, P. IRZ, A. KAMDEM-TOHAM, N. LAMOUROUX, P. A. TEDESCO, AND T. OBERDORFF. 2009. Convergence of temperate and tropical stream fish assemblages. Ecography 32:658–670.
IBAÑEZ, C., T. OBERDORFF, G. TEUGELS, V. MAMONONEKENE, S. LAVOUÉ, Y. FERMON, D. PAUGY, AND A. K. TOHAM. 2007. Fish assemblages structure and function along environmental gradients in rivers of Gabon (Africa). Ecology of Freshwater Fish 16:315–334.
IRZ, P., F. MICHONNEAU, T. OBERDORFF, T. R. WHITTIER, N. LAMOUROUX, D. MOUILLOT, AND C. ARGILLIER. 2007. Fish community comparisons along environmental gradients in lakes of France and north-east USA. Global Ecology and Biogeography 16:350–366.
JOHNSON, R. K., W. GOEDKOOP, AND L. SANDIN. 2004. Spatial scale and ecological relationships between the macroinvertebrate communities of stony habitats of streams and lakes. Freshwater Biology 49:1179–1194.
KEDDY, P. A. 1992. Assembly and response rules: two goals for predictive community ecology. Journal of Vegetation Science 3:157–164.
KELT, D. A., J. H. BROWN, E. J. HESKE, P. A. MARQUET, S. R. MORTON, J. R. W. REID, K. A. ROGOVIN, AND G. SHENBROT. 1996. Community structure of desert small mammals: comparisons across four continents. Ecology 77:746–761.
KONTULA, T., AND R. VAINOLA. 2001. Postglacial colonization of Northern Europe by distinct phylogeographic lineages of the bullhead, Cottus gobio. Molecular Ecology 10:1983–2002.
KOSKINEN, M. T., J. NILSSON, A. J. VESELOV, A. G. POTUTKIN, E. RANTA, AND C. R. PRIMMER. 2002. Microsatellite data resolve phylogeographic patterns in European grayling, Thymallus thymallus, Salmonidae. Heredity 88:391–401.
KUTNER, M. H., C. J. NACHTSHEIM, J. NETER, AND W. LI. 2005. Applied linear statistical models. 5th edition. McGraw–Hill/Irwin, New York.
LAMOUROUX, N., N. L. POFF, AND P. L. ANGERMEIER. 2002. Intercontinental convergence of stream fish community traits along geomorphic and hydraulic gradients. Ecology 83:1792–1807.
LEPRIEUR, F., O. BEAUCHARD, S. BLANCHET, T. OBERDORFF, AND S. BROSSE. 2008a. Fish invasions in the world’s river systems: when natural processes are blurred by human activities. PLoS Biology 6:404–410.
LEPRIEUR, F., O. BEAUCHARD, B. HUGUENY, G. GRENOUILLET, AND S. BROSSE. 2008b. Null model of biotic homogenization: a test with the European freshwater fish fauna. Diversity and Distributions 14:291–300.
LOBO, J. M., AND A. L. V. DAVIS. 1999. An intercontinental comparison of dung beetle diversity between two Mediterranean-climatic regions: local versus regional and historical influences. Diversity and Distributions 5:91–103.
LOBÓN-CERVIÁ, J., B. ELVIRA, AND P. A. RINCÓN. 1989. Historical changes in the fish fauna of the River Duero basin. Pages 221–232 in G. E. Petts (editor). Historical change of large alluvial rivers: western Europe. John Wiley, Chichester, UK.
MAGALHAES, M. F., D. C. BATALHA, AND M. J. COLLARES-PEREIRA. 2002a. Gradients in stream fish assemblages across a Mediterranean landscape: contributions of environmental factors and spatial structure. Freshwater Biology 47:1015–1031.
MAGALHAES, M. F., P. BEJA, C. CANAS, AND M. J. COLLARES-PEREIRA. 2002b. Functional heterogeneity of dry-season fish refugia across a Mediterranean catchment: the role of habitat and predation. Freshwater Biology 47:1919–1934.
MAGALHAES, M. F., P. BEJA, I. J. SCHLOSSER, AND M. J. COLLARES-PEREIRA. 2007. Effects of multi-year droughts on fish assemblages of seasonally drying Mediterranean streams. Freshwater Biology 52:1494–1510.
MCCULLAGH, P., AND J. A. NELDER. 1989. Generalized linear models. 2nd edition. Chapman and Hall, London, UK.
MCGARVEY, D. J., AND R. M. HUGHES. 2008. Longitudinal zonation of Pacific Northwest (U.S.A.) fish assemblages and the species-discharge relationship. Copeia 2008:311–321.
MELVILLE, J., L. J. HARMON, AND J. B. LOSOS. 2006. Intercontinental community convergence of ecology and morphology in desert lizards. Proceedings of the Royal Society of London Series B: Biological Sciences 273:557–563.
MESQUITA, N., AND F. COELHO. 2002. The ichthyofauna of the small Mediterranean-type drainages of Portugal: its importance for conservation. Pages 65–71 in M. J. Collares-Pereira, I. G. Cowx, and F. Coelho (editors). Conservation of freshwater fish: options for the future. Fishing News Books/Blackwell Science, Oxford, UK.
MESQUITA, N., M. M. COELHO, AND M. F. MAGALHAES. 2006. Spatial variation in fish assemblages across small Mediterranean drainages: effects of habitat and landscape context. Environmental Biology of Fishes 77:105–120.
MONTGOMERY, D. C., E. A. PECK, AND G. G. VINING. 2006. Introduction to linear regression analysis. 4th edition. Wiley Series in Probability and Statistics. Wiley, New York.
OBERDORFF, T., J. F. GUÉGAN, AND B. HUGUENY. 1995. Global scale patterns of fish species richness in rivers. Ecography 18:345–352.
OBERDORFF, T., D. PONT, B. HUGUENY, AND J. P. PORCHER. 2002. Development and validation of a fish-based index for the assessment of ‘river health’ in France. Freshwater Biology 47:1720–1734.
PIRES, A. M., I. G. COWX, AND M. M. COELHO. 1999. Seasonal changes in fish community structure of intermittent streams in the middle reaches of the Guadiana basin, Portugal. Journal of Fish Biology 54:235–249.
PIRES, A. M., L. M. DA COSTA, M. J. ALVES, AND M. M. COELHO. 2004. Fish assemblage structure across the Arade basin (Southern Portugal). Cybium 28:357–365.
POFF, N. L., AND J. D. ALLAN. 1995. Functional organization of stream fish assemblages in relation to hydrological variability. Ecology 76:606–627.
PONT, D., R. M. HUGHES, T. R. WHITTIER, AND S. SCHMUTZ. 2009. A predictive index of biotic integrity model for aquatic-vertebrate assemblages of Western US streams. Transactions of the American Fisheries Society 138:292–305.
PONT, D., B. HUGUENY, U. BEIER, D. GOFFAUX, A. MELCHER, R. NOBLE, C. ROGERS, N. ROSET, AND S. SCHMUTZ. 2006. Assessing river biotic condition at a continental scale: a European approach using functional metrics and fish assemblages. Journal of Applied Ecology 43:70–80.
PONT, D., B. HUGUENY, AND T. OBERDORFF. 2005. Modelling habitat requirement of European fishes: do species have similar responses to local and regional environmental constraints? Canadian Journal of Fisheries and Aquatic Sciences 62:163–173.
PONT, D., B. HUGUENY, AND C. ROGERS. 2007. Development of a fish-based index for the assessment of river health in Europe: the European Fish Index. Fisheries Management and Ecology 14:427–439.
QIAN, H., AND R. E. RICKLEFS. 2000. Large-scale processes and the Asian bias in species diversity of temperate plants. Nature 407:180–182.
REYJOL, Y., B. HUGUENY, D. PONT, P. G. BIANCO, U. BEIER, N. CAIOLA, F. CASALS, I. COWX, A. ECONOMOU, T. FERREIRA, G. HAIDVOGL, R. NOBLE, A. DE SOSTOA, T. VIGNERON, AND T. VIRBICKAS. 2007. Patterns in species richness and endemism of European freshwater fish. Global Ecology and Biogeography 16:65–75.
RICKLEFS, R. E. 2006. Evolutionary diversification and the origin of the diversity-environment relationship. Ecology 87:S3–S13.
RICKLEFS, R. E., AND D. SCHLUTER. 1993. Species diversity: regional and historical influences. Pages 350–363 in R. E. Ricklefs and D. Schluter (editors). Species diversity in ecological communities. University of Chicago Press, Chicago, Illinois.
SAMUELS, C. L., AND J. A. DRAKE. 1997. Divergent perspectives on community convergence. Trends in Ecology and Evolution 12:427–432.
SCHLUTER, D., AND R. E. RICKLEFS. 1993. Convergence and the regional component of species diversity. Pages 230–240 in R. E. Ricklefs and D. Schluter (editors). Species diversity in ecological communities. University of Chicago Press, Chicago, Illinois.
SHAFFER, J. P. 1995. Multiple hypothesis testing. Annual Review of Psychology 46:561–584.
SMITH, A. P., AND J. U. GANZHORN. 1996. Convergence in community structure and dietary adaptation in Australian possums and gliders and Malagasy lemurs. Australian Journal of Ecology 21:31–46.
SOUTHWOOD, T. R. E. 1977. Habitat, the templet for ecological strategies? Journal of Animal Ecology 46:337–365.
STATZNER, B., S. DOLÉDEC, AND B. HUGUENY. 2004. Biological trait composition of European stream invertebrate communities: assessing the effects of various trait filter types. Ecography 27:470–488.
STODDARD, J. L., D. P. LARSEN, C. P. HAWKINS, R. K. JOHNSON, AND R. H. NORRIS. 2006. Setting expectations for the ecological condition of streams: the concept of reference condition. Ecological Applications 16:1267–1276.
TEDESCO, P. A., T. OBERDORFF, C. A. LASSO, M. ZAPATA, AND B. HUGUENY. 2005. Evidence of history in explaining diversity patterns in tropical riverine fish. Journal of Biogeography 32:1899–1907.
TONN, W. M., J. J. MAGNUSON, M. RASK, AND J. TOIVONEN. 1990. Intercontinental comparison of small-lake fish assemblages: the balance between local and regional processes. American Naturalist 136:345–375.
TOWNSEND, C. R., AND A. G. HILDREW. 1994. Species traits in relation to a habitat templet for river systems. Freshwater Biology 31:265–275.
VERBERK, W. C. E. P., H. SIEPEL, AND H. ESSELINK. 2008. Life-history strategies in freshwater macroinvertebrates. Freshwater Biology 53:1722–1738.

VITT, L. J., AND E. R. PIANKA. 2005. Deep history impacts present-day ecology and biodiversity. Proceedings of the National Academy of Sciences of the United States of America 102:7877–7881.
WEISS, S., H. PERSAT, R. EPPE, C. SCHLÖTTERER, AND F. UIBLEIN. 2002. Complex patterns of colonization and refugia revealed for European grayling Thymallus thymallus, based on complete sequencing of the mitochondrial DNA control region. Molecular Ecology 11:1393–1407.
WIENS, J. A. 1991. Ecological similarity of shrub-desert avifaunas of Australia and North America. Ecology 72:479–495.
WINEMILLER, K. O., AND A. ADITE. 1997. Convergent evolution of weakly electric fishes from floodplain habitats in Africa and South America. Environmental Biology of Fishes 49:175–186.
YOSHIMURA, C., K. TOCKNER, T. OMURA, AND O. MOOG. 2006. Species diversity and functional assessment of macroinvertebrate communities in Austrian rivers. Limnology 7:63–74.

Received: 15 September 2009
Accepted: 27 July 2010

Implications of these results for the development of a large scale MMI

P2: Do Iberian and European fish faunas exhibit convergent functional structure along environmental gradients?

Perspectives: how important is phylogeny to the functional structure of assemblages and to convergence?

Convergence between assemblages from two regions is more easily observed when the species of these regions, even if different, belong to the same lineage (Bellwood et al. 2002, Lamouroux et al. 2002). Conversely, regions composed of species belonging to different lineages seldom display convergent responses (Cadle and Greene 1993, Smith and Ganzhorn 1996). The divergent evolution of lineages within the same clade, together with their different spatial distributions, could explain the divergent assemblage structures observed between two regions (Cadle and Greene 1993, Ricklefs and Schluter 1993). The evolutionary history of a lineage could play an important role in shaping current species attributes (Vitt and Pianka 2005). In our case, although only ten species occur in both regions, the great majority of the 57 species occurring in the two regions belong to the same genus, subfamily or family. This phylogenetic proximity of the regional species pools should favour the convergence observed between the two regions (Bellwood et al. 2002). New methods taking phylogeny into account are being developed (e.g. Pavoine et al. 2010) and should provide a better understanding of this question in the future.

Implications of these results for the development of a large-scale MMI

These results reinforced the importance of taking the environment into account when computing metric scores (e.g. Joy and Death 2002, Oberdorff et al. 2002, Hughes et al. 2004, Pont et al. 2006, Bates Prins and Smith 2007, Pont et al. 2007, Roset et al. 2007, Hawkins et al. 2010). They also demonstrated that the set of environmental variables used to model the metrics has to be adapted to each metric. It is thus necessary to select the environmental factors that best reflect metric variability (Oberdorff et al. 2002, Pont et al. 2006, Bates Prins and Smith 2007, Pont et al. 2009, Hawkins et al. 2010). Failing to select the set of environmental predictors may lead to an "overfitting" problem (Hastie et al. 2009): as the number of predictors integrated into the model increases beyond what the data support, the prediction error on an independent data set is also expected to increase (see Figure 2.11, p. 38, Hastie et al. 2009).

The low number of traits presenting quantitative convergence implies that the region has to be taken into account when metrics based on densities and on the number of species are used (Oberdorff et al. 2002, Pont et al. 2006, 2009). The expected values of such metrics would be biased if they were predicted using common models. With a parallel response of a metric between regions (same trend, different level), a common model would on the whole overestimate the expected values in one region and underestimate them in the other. This could be remedied by considering metrics based on the ratio (Matzen and Berge 2008) of the number of species or the number of individuals for a given trait category.
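The bias induced by a common model under parallel responses can be illustrated with a minimal numerical sketch. All numbers below are hypothetical (two regions sharing the same slope along a gradient but differing in level), and ordinary least squares is used only for simplicity; it is not the modelling approach of the studies cited above.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares fit of y = a + b * x; returns (a, b)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


# Hypothetical environmental gradient, identical in both regions.
gradient = [0.0, 1.0, 2.0, 3.0, 4.0]

# Parallel responses: same slope, different intercepts (illustrative values).
region_a = [10.0 + 2.0 * x for x in gradient]
region_b = [4.0 + 2.0 * x for x in gradient]

# Common model fitted on the pooled data of both regions.
a, b = fit_line(gradient + gradient, region_a + region_b)

# Mean of (expected - observed) in each region under the common model.
bias_a = mean(a + b * x - y for x, y in zip(gradient, region_a))
bias_b = mean(a + b * x - y for x, y in zip(gradient, region_b))

print(f"common model: intercept={a:.1f}, slope={b:.1f}")
print(f"mean (expected - observed), region A: {bias_a:+.1f}")
print(f"mean (expected - observed), region B: {bias_b:+.1f}")
```

In this toy case the common model recovers the shared slope, but its expected values are systematically too low in the region with the higher level and too high in the other one, which is the pattern described above.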

P3: Development of metrics based on fish body size and species traits to assess European coldwater streams. Ecological Indicators

The objective of this part was to develop new metrics specific to species-poor assemblages living mainly in cold-water streams. These metrics had to integrate both fish body size and the ecological or biological attributes of species. Each of the eight trait categories used was representative of these assemblages (Figure 5): general intolerant (intolerant to most pressures, INTOL), intolerant to low oxygen concentrations (O2INTOL), intolerant to habitat degradation (HINTOL), rheophilic (RH), insectivorous (INSEV), potamodromous (POTAD), lithophilic (LITH) and species with a single spawning event during the reproductive season (SIN, see Table 3).

Two size classes were retained to integrate body size: small individuals and large individuals. Each individual was assigned to one size class depending on whether its total length was shorter or longer than a threshold size. This study was a preliminary step toward the development of these metrics. Consequently, three arbitrary threshold lengths were used to define the size classes: 100, 150 and 200 mm. Each combination of trait and size class defined a new metric, computed as the number of fish with a given trait belonging to a specific size class, for instance the number of small (less than 100 mm) rheophilic individuals (Table 5).

Table 5: The twelve metrics computed for one trait category.

Species pool    Threshold (mm)    Size class
Total           100               Small
Total           100               Large
Total           150               Small
Total           150               Large
Total           200               Small
Total           200               Large
Large           100               Small
Large           100               Large
Large           150               Small
Large           150               Large
Large           200               Small
Large           200               Large

Among the small individuals, individuals of naturally small species (e.g. the bullhead, Cottus gobio, L.) were distinguished from young individuals of large species (e.g. the barbel, Barbus fluviatilis, L.) (Figure 12). To account for this dichotomy, the same metrics were computed using either all species or only the large ones (species maximum length greater than 300 mm; Figure 12B). Thus, for each of the eight trait categories, 12 metrics were developed (Table 5), constituting a set of 96 potential new metrics.
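The counting rule behind the 12 metrics of one trait category can be sketched as follows. This is only an illustration of the definitions given above: the data structures, the function name and the species maximum lengths are hypothetical, and a fish is treated as small when its length is strictly below the threshold.

```python
from typing import NamedTuple


class Fish(NamedTuple):
    species: str
    length_mm: float


THRESHOLDS_MM = (100, 150, 200)
# A species counts as "large" when its maximum length exceeds 300 mm.
LARGE_SPECIES_MIN_LENGTH_MM = 300


def size_trait_metrics(catch, has_trait, species_max_length_mm):
    """Return {(pool, threshold, size_class): count} for one trait category."""
    metrics = {
        (pool, threshold, size_class): 0
        for pool in ("total", "large")
        for threshold in THRESHOLDS_MM
        for size_class in ("small", "large")
    }
    for fish in catch:
        if not has_trait.get(fish.species, False):
            continue  # only individuals carrying the trait are counted
        pools = ["total"]
        if species_max_length_mm[fish.species] > LARGE_SPECIES_MIN_LENGTH_MM:
            pools.append("large")
        for threshold in THRESHOLDS_MM:
            size_class = "small" if fish.length_mm < threshold else "large"
            for pool in pools:
                metrics[(pool, threshold, size_class)] += 1
    return metrics


# Hypothetical catch and trait table (the rheophilic trait is used as example;
# the maximum lengths are illustrative values, not taken from the study).
catch = [
    Fish("Cottus gobio", 80), Fish("Cottus gobio", 120),
    Fish("Salmo trutta", 90), Fish("Salmo trutta", 160),
    Fish("Salmo trutta", 250), Fish("Rutilus rutilus", 110),
]
has_trait = {"Cottus gobio": True, "Salmo trutta": True, "Rutilus rutilus": False}
max_length = {"Cottus gobio": 150, "Salmo trutta": 1000, "Rutilus rutilus": 450}

m = size_trait_metrics(catch, has_trait, max_length)
print(m[("total", 150, "small")], m[("large", 150, "small")])
```

Repeating this computation for the eight trait categories gives the 96 candidate metrics.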


[Figure 12, panels A and B. Legend: Trait (yes / no); Size (yes / no).]

Figure 12: Theoretical assemblage composed of three species illustrating the computation of the metrics. In this example, the individuals with the trait of interest are represented in black (outlines) and the small individuals are coloured in grey (inside). In A) all species are taken into account, whereas in B) only the large species are taken into account. The metric would be in A) the number of small individuals with a given trait (12/21) and in B) the number of small individuals from large species with a given trait (5/14).

Comment: methodological scheme

The following graph summarises the different metric selection steps.

[Figure 13, diagram elements: data set; sites not or only slightly disturbed (global set, SID) and reference sites (REF); model of each metric as a function of the environment; residuals; min-max transformation with a minimum and a maximum anchor; scores ∈ [0;1]; test of the sensitivity of the scores to the pressure index (classes 1 to 5).]

Figure 13: Metric selecting and scoring process.
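The scoring step of this scheme can be sketched as follows. The residual definition and the upper anchor (95% quantile of the residuals of the slightly disturbed sites) follow the article reproduced in this part (Section 2.3.3). The closed-form lower anchor, the quantile convention and the clipping of scores to [0, 1] are assumptions made for this illustration, and all residual values are hypothetical.

```python
import math
from statistics import median


def log_residual(observed, expected):
    """Residual on the log scale; +1 handles sites with no fish in the metric."""
    return math.log(observed + 1) - math.log(expected + 1)


def quantile(values, q):
    """Quantile by linear interpolation (one of several common conventions)."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lo = math.floor(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (pos - lo) * (ordered[hi] - ordered[lo])


def anchors(sid_residuals, target_median=0.8):
    """Upper anchor: 95% quantile of SID residuals.

    Lower anchor (assumption): value solving
    (median - lower) / (upper - lower) = target_median.
    """
    upper = quantile(sid_residuals, 0.95)
    med = median(sid_residuals)
    lower = (med - target_median * upper) / (1 - target_median)
    return lower, upper


def score(residual, lower, upper):
    """Min-max rescaling; clipping to [0, 1] is an assumption of this sketch."""
    raw = (residual - lower) / (upper - lower)
    return min(1.0, max(0.0, raw))


# Hypothetical residuals of slightly disturbed (SID) sites.
sid_residuals = [-0.6, -0.3, -0.1, 0.0, 0.1, 0.2, 0.4]
lower, upper = anchors(sid_residuals)
sid_scores = [score(r, lower, upper) for r in sid_residuals]
print(round(median(sid_scores), 3))
```

With these anchors the median score of the slightly disturbed sites equals the target value, so a strongly negative residual at a test site translates into a score well below that of the reference situation.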


Ecological Indicators

journal homepage: www.elsevier.com/locate/ecolind

Development of metrics based on fish body size and species traits to assess European coldwater streams

Maxime Logez a,∗, Didier Pont b

a Cemagref, UR HYAX, 3275 Route de Cézanne–CS 40061, F-13182 Aix-en-Provence, France
b Cemagref, UR HBAN, Parc de Tourvoie BP 44, 92163 Antony Cedex, France

Article history: Received 17 March 2010; Received in revised form 4 November 2010; Accepted 31 December 2010

Keywords: Coldwater streams; Size classes; Metric; Fish assemblages; Species traits; Multimetric index

Abstract

During the last decade multimetric indices (MMIs) have been greatly improved by the use of appropriate criteria to define reference conditions and by the use of statistical analysis to select a consistent set of metrics. Among the large number of MMIs developed to assess the ecological status of streams based on fish communities, the emphasis was mainly put on warmwater assemblages. When compared with warmwater fish assemblages, coldwater assemblages present depauperate faunas with a limited suite of traits. Thus, very often the number of metrics used to compute MMIs for coldwater streams is lower than for warmwaters. The objective of this study was to develop new metrics specific to European coldwater assemblages that integrate both the species traits and the body size of fish. Indeed, whereas the use of size or age classes has been highly advocated for developing MMIs, it remains largely underrepresented. Therefore, we used eight biological and ecological traits to characterize species and two size classes: small and large individuals. Among the 96 metrics tested, four were successfully related to environmental gradients and three displayed a significant response to anthropogenic pressures: the number of small rheophilous individuals, the number of small oxygen-intolerant individuals, and the number of small habitat-intolerant individuals. These results demonstrate that metrics based on size classes could be used in the development of MMIs for coldwater streams and more generally for low-species rivers.

© 2011 Elsevier Ltd. All rights reserved.

Abbreviations: MMI, multimetric index; INTOL, general intolerant; O2INTOL, oxygen-intolerant; HINTOL, habitat-intolerant; RH, rheophilous; INSEV, insectivorous; POTAD, potamodromous; LITH, lithophilic; SINREP, single reproduction; EFT, European fish type; SED, bottom sediment structure; CAL, calibration sites; SID, slightly or not impacted sites.

∗ Corresponding author. Tel.: +33 04 42 66 69 86; fax: +33 04 42 66 99 34.
E-mail addresses: [email protected] (M. Logez), [email protected] (D. Pont).

1470-160X/$ – see front matter © 2011 Elsevier Ltd. All rights reserved. doi:10.1016/j.ecolind.2010.12.023

1. Introduction

Since the first index of biotic integrity (IBI) developed by Karr (1981), the concept of multimetric indices (MMIs) to assess stream ecological conditions using various fish assemblage attributes has been successfully extended to different regions in the United States (Angermeier and Schlosser, 1987; Angermeier et al., 2000; McCormick et al., 2001; Mebane et al., 2003; Wang et al., 2003) and various countries (Oberdorff and Hughes, 1992), and MMIs are now used on almost all continents (An et al., 2002; Bramblett et al., 2005; Hughes and Oberdorff, 1999; Hugueny et al., 1996; Lyons et al., 1995; Oberdorff and Hughes, 1992; Pont et al., 2006).

The concept of IBI, and more generally of MMIs, is based on the statement that using diverse aspects of assemblages to assess system conditions (biotic integrity, health, ecological condition, etc.) would provide a more reliable or robust estimation than when only considering one attribute (Karr and Chu, 1999). Indeed, individually, each metric (assemblage attribute measured) integrated in an MMI is expected to represent a singular assemblage attribute (Karr et al., 1986) responding to specific human pressure, with varying sensitivity along the pressure gradient (Angermeier and Karr, 1986; Karr et al., 1986). In this way, encompassing multiple metrics into a unique index would make it sensitive to a broad array of alterations. Moreover, under severe degradation, several metrics should be modified, reinforcing the accuracy of the estimation (Karr, 1991).

During the last decade, selecting appropriate metrics has become a key process in the development of multimetric indices (Karr and Chu, 2000), which led to the development of rigorous selection protocols integrating several criteria and statistical analysis (Hering et al., 2006; Hughes et al., 1998; Pont et al., 2006, 2009; Roset et al., 2007; Stoddard et al., 2008). To be useful, metrics should be representative of the region for which the MMIs are developed; for example, using the metric percent of green sunfish (Lepomis cyanellus, Rafinesque) integrated into the first IBI (Fausch et al., 1984; Karr, 1981) to assess European streams would not be consistent, as its distribution area is restricted to North America (Lee et al., 1980; Nelson, 2006). A metric only slightly represented in a region is unlikely to discriminate consistently between sites (Harris and Silveira, 1999). Metrics should also reflect different aspects and

provide singular information of assemblages. To avoid redundancy, several authors have used correlation criteria to select a metric or have chosen the most efficient metric between two redundant ones (Hering et al., 2006; Pont et al., 2006). The signal reflected by metrics should only display the variability of human pressure between sites and not the environmental differences between them (Hering et al., 2006; Hughes et al., 1998; Karr and Chu, 2000; Pont et al., 2006; Stoddard et al., 2008). Finally, the most important criteria are the ability of a metric to detect human pressure and to discriminate degraded sites from reference sites (Bailey et al., 1998; Hering et al., 2006; Hughes et al., 1998; Karr and Chu, 1999; Pont et al., 2006, 2009; Southerland et al., 2007; Stoddard et al., 2008).

Since the original set of metrics used in the first IBI for warmwater streams of the Midwestern United States (Karr, 1981), a wide variety of metrics have been used in MMIs (e.g., Simon and Lyons, 1995) to cope with different species pools (Hughes and Oberdorff, 1999; Hugueny et al., 1996; Lyons et al., 1995; Mebane et al., 2003; Oberdorff and Hughes, 1992), specific regional environmental conditions (Angermeier et al., 2000; Magalhaes et al., 2008), river types (Matzen and Berge, 2008; Mundahl and Simon, 1999), specific pressures (Oberdorff and Porcher, 1994; Wang et al., 2003), and system functioning. For example, Magalhaes et al. (2008) and Bramblett et al. (2005) used a specific set of metrics to develop MMIs for Mediterranean and semi-arid streams.

Coldwater streams have received particular attention because they display both specific environmental conditions and associated fish species communities (Hughes et al., 2004; Lyons et al., 1996; Mebane et al., 2003; Mundahl and Simon, 1999; Wang et al., 2003). Compared to most warmwater streams, coldwater streams present depauperate faunas (Hughes et al., 2004; Lyons, 1996; Matzen and Berge, 2008; Mebane et al., 2003) mainly composed of intolerant species (Halliwell et al., 1999; Zaroban et al., 1999). The low species richness commonly observed in coldwater streams limits the number of metrics available and their variability (Lyons, 1996; Simon and Lyons, 1995). The particular relationships between metrics and human pressures in coldwater streams are also a key factor supporting the development of coldwater-specific MMIs (Hughes et al., 2004; Lyons et al., 1996; Mundahl and Simon, 1999; Wang et al., 2003). Indeed, several metrics displaying a relatively well-known pattern of variation in warmwater streams displayed opposite patterns in coldwater streams (Lyons, 1996; Mebane et al., 2003). Species richness, one of the most common metrics included in MMIs, very often decreases with human pressure in warmwater streams, whereas it could increase with human alteration in coldwater streams (Lyons, 1996; Mundahl and Simon, 1999).

To cope with these various constraints, authors developing MMIs for coldwater streams have either integrated metrics based on other vertebrate groups (Hughes et al., 2004), used fewer metrics than for warmwater MMIs (Langdon, 2001; Lyons et al., 1996; Southerland et al., 2007), or used other components of fish assemblages such as species age or length classes (Breine et al., 2004; Hughes et al., 2004; Langdon, 2001; Oberdorff and Porcher, 1994). Although the use of age or length structure has been advocated by several authors (Breine et al., 2004; Karr, 1991; Roset et al., 2007) and by the Water Framework Directive (European Union, 2000), the number of MMIs using such assemblage attributes remains low.

The main purpose of this study was to develop specific metrics for coldwater streams dominated by intolerant species that take into account individual fish body size. To achieve this objective, we used information on species biological and ecological traits and on fish size. Individuals were classified into two size classes (small and large) by comparing fish lengths with different arbitrary thresholds. Once metrics were computed and standardized by the environment, we assessed their ability to detect anthropogenic disturbances.

2. Material and methods

2.1. Fish sampling and site characterization

We used data from the fish surveys of 12 European countries (Fig. 1) conducted by several laboratories and governmental environmental agencies (1981–2007, 95% after 1990). Sites were sampled using electrofishing methods either by wading or by boat, depending on stream depth. All sites were located in small permanent streams (drainage area less than 500 km², mean air temperature in July lower than 20 °C) and were sampled during low-flow periods. The species of each individual collected was identified and its total length (mm) was measured. To homogenize the sampling effort between regions, only fish collected during the first pass were considered.

Fig. 1. Location of the 2012 sites. Member countries of the EFI+ project are shown in gray.

To characterize the local environmental conditions, we used the slope, July mean air temperature, and the thermal amplitude between January and July, as they are fundamental descriptors of river habitat at the reach scale (Pont et al., 2005). In addition, we considered bottom sediment structure (SED) in a simplified manner because it is difficult to obtain more precise and comparable information for such a large data set. We thus used three classes: small (sand, silt), medium (cobble, pebble), and large (rock, block).

Hydrogeomorphic processes are also a major factor controlling river habitat and fish assemblage structure (Junk et al., 1989; Petts and Amoros, 1996; Poff and Allan, 1995; Poff et al., 1997). Therefore, we also considered the river size (using drainage area and distance from source as surrogates), the hydrological regime (pluvial dominated vs. glacial-nival), the geomorphologic types (meandering, braided, or constraint), and the presence of a floodplain. To reduce the number of variables and their multicollinearity, we used the first two axes of a multivariate analysis (Hill and Smith, 1976) as synthetic geomorphic variables (SYNGEO1 and SYNGEO2). The method developed by Hill and Smith (1976) is designed to handle both quantitative and qualitative variables. It acts as a principal component analysis on the correlation matrix if all variables are quantitative and as a multiple correspondence analysis if all variables are qualitative. SYNGEO1 describes the stream size gradients and the presence of floodplain downstream, whereas SYNGEO2 contrasts meandering rivers with pluvial regime to the others (Fig. 2).

Fig. 2. Ordination of the hydrogeomorphic variables on the first factorial plan of the Hill and Smith (1976) analysis. Plotted variables: drainage area, distance from source, floodplain, no floodplain, pluvial, non-pluvial, braided, constraint, sinuous, meandering.

2.2. Definition of metrics

To develop specific metrics for European coldwater streams, we selected eight species traits that are highly represented in these streams (Noble et al., 2007): general intolerant (INTOL), oxygen-intolerant (O2INTOL), habitat-intolerant (HINTOL), rheophilous (RH), insectivorous (INSEV), potamodromous (POTAD), lithophilic (LITH), and single reproduction (SINREP). Each of the 68 species was either assigned to these traits or not (e.g., Table 1) depending on its biological and ecological characteristics. The classification of European fish species was established during the EU EFI+ project (http://efi-plus.boku.ac.at/) and is a revision of the classification developed in a previous European project (http://fame.boku.ac.at, Noble et al., 2007).

In combination with species traits, we considered two size classes: small and large individuals. Each fish was assigned to one of the two classes by comparing its length with an arbitrary threshold (Table 2). All fishes with a length lower than the given threshold were considered small and all the others large. As our study was a preliminary step in the development of such metrics, we tested three arbitrary thresholds to define the size classes: 100, 150, and 200 mm (Table 2). For each of the eight traits, we computed the number of small and large individuals presenting the characteristic of interest, e.g., the number of small rheophilous fishes or the number of large rheophilous fishes (Table 2). All combinations of traits and size classes defined a new metric. In this step, we considered 48 metrics based on fish abundance: eight traits, two size classes, and three thresholds.

Fishes could be small either because they belonged to naturally small species (e.g., the bullhead Cottus gobio, Linnaeus) or because they were young. To take this dichotomy into account, in the second step we computed the same metrics integrating only the individuals of large species (species longer than 300 mm; Table 2). These metrics are therefore closer to metrics based on age classes, since small individuals could be viewed as young individuals and large ones as older fishes.

In summary, we used eight ecological or biological species traits, two size classes, three different thresholds, and two sets of species (all or only large species), computing 96 metrics. Each metric was subjected to several selection steps before being considered useful for detecting anthropogenic disturbances.

2.3. Metric selection process

The metric selection process used in this study was adapted from Pont et al. (2006, 2007). In the first step, we defined a reference data set to be used to relate each metric with environmental conditions. In the second step, the metrics were standardized by removing their part of variability explained by the environment. In a third step, metrics were rescaled from 0 to 1 (metric scores). Finally, we tested metric score sensitivity to human pressures (Fig. 3) and assessed the redundancy of the scores using a Spearman rank correlation coefficient.
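The redundancy check mentioned in the last step relies on a rank correlation. A minimal sketch of the Spearman coefficient (Pearson correlation computed on ranks, with tied values receiving their average rank) is given below; the function names and the score values are hypothetical and only serve the illustration.

```python
import math
from statistics import mean


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def spearman_rho(xs, ys):
    """Spearman rank correlation between two equally long sequences."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical scores of two metrics at the same five sites.
scores_metric_1 = [0.82, 0.75, 0.40, 0.91, 0.66]
scores_metric_2 = [0.80, 0.70, 0.35, 0.95, 0.72]
print(round(spearman_rho(scores_metric_1, scores_metric_2), 3))
```

A coefficient close to 1 between two metric scores indicates that both metrics rank the sites in nearly the same order and therefore carry largely redundant information.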

Table 1
Traits of the ten most abundant species. See Section 2 for details concerning species trait abbreviations.

Trait columns, in order: INTOL, O2INTOL, HINTOL, RH, INSEV, POTAD, LITH, SINREP

Salmo trutta            √√ √ √√√ √√
Phoxinus phoxinus       √√√√ √
Cottus gobio            √√ √ √√ √
Barbatula barbatula     √√ √
Rutilus rutilus         √√
Salmo salar             √√ √ √ √√
Gobio gobio             √√
Oncorhynchus mykiss     √√√√√√
Anguilla anguilla       √√
Thymallus thymallus     √√ √ √√√ √√

Table 2
Definition of the 12 metrics computed per species trait.

Species pool    Threshold (mm)    Size class
Total           100               Small
Total           100               Large
Total           150               Small
Total           150               Large
Total           200               Small
Total           200               Large
Large           100               Small
Large           100               Large
Large           150               Small
Large           150               Large
Large           200               Small
Large           200               Large

Fig. 3. Methodological scheme for the selection of the metrics. Diagram elements: data set and calibration sites (CAL); modeling of each metric as a function of the environment; residual computation; metric selection; rescaling (0–1), with the upper anchor defined using SID sites and the lower anchor using all sites; scores (0, 1); sensitivity of the scores to the pressure index (classes 1 to 5); redundancy of the scores; final metric selection.

2.3.1. Definition of the data sets

The selection of the sites that were not or only slightly impacted, against which the ecological and biological characteristics of test sites will be compared, is of major importance (Pont et al., 2006, 2007; Whittier et al., 2007). Sites that were not or only slightly impacted are expected to represent the biological or ecological conditions that would be expected in absence of human activities or in the least disturbed conditions (Pont et al., 2006, 2009; Stoddard et al., 2006, 2008). Sites were selected to be either not impacted or slightly impacted by anthropogenic pressures (Pont et al., 2006). We defined two data sets of slightly disturbed sites: one for the modeling process (CAL) and one for the scoring process (SID) (Fig. 3). CAL sites were selected on the basis of 20 criteria integrating the presence of given human pressures (e.g., presence of an impoundment) and the modifications induced by these pressures (e.g., modification of hydrological regime; Table 3), while SID sites were selected based on presence of ten human pressures only (Table 3). The SID data set (290 sites) encompassed the 218 CAL sites, but it was spread over a larger geographical area. In addition, we selected sites with at least 50 individuals sampled to have a consistent estimation of the assemblage structure. To limit a potential temporal bias due to the sampling date, we only retained sites sampled between August and November. Because of data availability, the number of CAL sites varied slightly depending on the metric considered (between 212 and 214 sites, with an overall set of 218 sites used).

2.3.2. Modeling process

Removing the share of variability of the metrics caused by environmental gradients is a major concern before assessing the sensitivity of each metric to anthropogenic pressures (Hughes et al., 1998; Karr and Chu, 1999; Oberdorff et al., 2002; Pont et al., 2006, 2009; Southerland et al., 2007; Stoddard et al., 2008). Using statistical models enables us to estimate the value of the metric that we would observe in the absence of pressure depending on the environmental conditions (Fig. 3; Pont et al., 2006, 2009).

Metrics were related to environment by using generalized linear models (GLM; Faraway, 2006; McCullagh and Nelder, 1989) with a negative binomial distribution (Cameron and Trivedi, 1998; Venables and Ripley, 2002). This distribution is designed for count data and takes into account the potential overdispersion of the data (Cameron and Trivedi, 1998). In multiple linear regression, the dependent variable Y is directly expressed as a linear combination of the independent variables X with a model taking the following

form:

$$Y = \alpha + \sum_i \beta_i X_i + \varepsilon,$$

with Y the variable to model, α the intercept, β_i the ith parameter associated with the ith variable X_i, and ε the error. In GLM, the linear predictor η is related to the dependent variable Y through the link function g rather than directly:

$$\eta = \alpha + \sum_i \beta_i X_i, \qquad g(Y) = \eta, \qquad E(Y) = g^{-1}(\eta)$$

The logarithm is the link function associated with the negative binomial distribution (Cameron and Trivedi, 1998); thus, the link between Y and the independent variables X is:

$$\log(y_i) = \alpha + \sum \beta_i X_i$$

The coefficients of the models are estimated by maximum likelihood (Cameron and Trivedi, 1998; Faraway, 2006; McCullagh and Nelder, 1989), whereas linear models use ordinary least squares (Kutner et al., 2005). All models were fitted using the CAL data sets.

For each metric, environmental variables were selected using a stepwise procedure based on Akaike's information criterion (AIC; Pont et al., 2006; Venables and Ripley, 2002). This procedure selects the combination of variables that minimizes the model's AIC. In each model, we also added as a predictor the logarithm of the number of captures, for the metrics based on all species, or the logarithm of the number of individuals of large species, for the metrics based on large species only, and set its coefficient to one (as offset; Faraway, 2006). By adding these offsets as predictors in the models, we finally fitted the proportion of individuals of a given metric either in the assemblage or among the large-species individuals:

$$\log(y_i) = \alpha + \sum \beta X_i + \log(n_i)$$
$$\log(y_i) - \log(n_i) = \alpha + \sum \beta X_i$$
$$\log\left(\frac{y_i}{n_i}\right) = \alpha + \sum \beta X_i$$

where y_i represents the number of fish with the characteristics of interest (observed value of the metric) and n_i represents the number of fish caught or the number of fish from large species of the ith site.

Goodness of fit and model adequacy were assessed by visually checking the normality of the Pearson residuals (Q-Q plot and histogram of the residuals), the heteroskedasticity of the residuals (graph of the Pearson residuals versus fitted values), the influence of leverage values (graphs of hat values versus the expected values), and the relationship between expected and observed values (a linear relation of the form y = x was expected). The evaluation of the models was completed by internal validation based on the bootstrap technique (Davidson and Hinkley, 1997; Efron and Tibshirani, 1993). Only metrics matching these criteria were used in the next step.

Table 3
Pressures and criteria used to select CAL (all pressures) and SID (*) data sets.

Pressure                                                          Criteria

Pressure
Presence of downstream barriers (segment scale)*                  No or partial
Natural flow velocity reduced in site due to impoundment*         No
Site affected by hydropeaking*                                    No
Water abstraction                                                 No or weak
Channelization*                                                   No or intermediate
Alteration of the cross-section*                                  No
Presence of colinear connected reservoir*                         No
Toxic substances*                                                 No or intermediate
Acidification*                                                    No
Water quality index*                                              Very good or good

Effect of pressure
Seasonal hydrograph modification: no, yes                         No
Natural flow velocity increase                                    No
Input of fine sediment                                            No or weak
Alteration of instream habitat condition                          No or intermediate
Artificial embankment                                             No or local
Alteration of riparian vegetation                                 No or slight
Alteration of water thermal regime                                No

Water quality
Artificial eutrophication                                         No or low
Organic pollution                                                 No or weak
Organic siltation                                                 No

2.3.3. Scoring process

Once models were fitted, the environmental effect on the metric was removed by subtracting metric theoretical values from observed values: the residuals (Fig. 3). Residuals "measure the range of metric variation expected after removing the effects of the abiotic predictor variables and minor human disturbance" (Pont et al., 2009). Numerous residuals could be computed from GLM models (e.g., Pearson or deviance residuals; Faraway, 2006; McCullagh and Nelder, 1989). As the linearity between predictors and the response variable is established on the link function (logarithm), we computed residuals as

$$\log(y_i + 1) - \log(\hat{y}_i + 1),$$

where y_i and ŷ_i represent the observed and theoretical values. The value of one was added both to observed and predicted values to handle sites presenting no fish belonging to the metric considered.

Metric residuals were transformed into a unitless score varying between 0 and 1 using the following transformation (Hering et al., 2006; Legendre and Legendre, 1998):

$$\text{score} = \frac{\text{residual} - \text{lower anchor}}{\text{upper anchor} - \text{lower anchor}}$$

For each metric, the upper anchor was defined as the 95% quantile of the SID residuals (Stoddard et al., 2008), whereas the lower anchor was computed on all residuals so that the median of SID scores was equal to 0.8 after rescaling (Fig. 3).

2.3.4. Sensitivity to human pressure

To test the sensitivity of the metrics, we used the first axis of a multiple correspondence analysis (MCA; Tenenhaus and Young, 1985) as a synthetic index of human pressures (Karr and Chu, 1999). Seven stressors were included in the analysis: impoundment, hydropeaking, water abstraction, presence of toxic substance, water quality, modification of river section associated with the channelization level, and the presence of downstream barriers on the segment. We used k-means clustering based on the algorithm proposed by Hartigan and Wong (1979) to define five classes

235 Table 6 Spearman rank correlation coefficients between metric scores of SID sites.

HINTOL RH INSEV

O2INTOL 0.977 0.901 0.720 HINTOL 0.908 0.706 RH 0.745

and the thermal amplitude between January and July ranged from 7.9 to 28.9 ◦C (mean = 15, sd = 4.2).

3.2. Fish community

A total of 68 species distributed among ten families were collected. The brown trout (Salmo trutta fario, Linnaeus) was the most widely distributed species in our samples (occurring in more than 89.9% of the sites) and also the most abundant one (representing 54% of the total catches, Table 1). Bullhead, stone loach (Barbatula barbatula, Linnaeus), and minnow (Phoxinus phoxinus, Linnaeus) occurred in at least 10% of the sites and were the most abundant species after the brown trout (Table 1). Most of the remaining species occurred in less than 1% of the sampling sites. The local species richness of the sampling sites was low, with two-thirds of the sites displaying fewer than three species.

Fig. 4. Number of sites per pressure index class.

Table 4
Number of sites per pressure index class.

Index class   Number of sites
1             836
2             401
3             206
4             79
5             15

of pressure ranging from 1 (unimpacted or slightly impacted sites) to 5 (heavily impacted sites) (Fig. 4). The sensitivity of the metric was assessed by visually comparing the distribution of scores in the five classes (Fig. 3) and by testing the differences in mean scores between the minimally impacted sites (class 1) and the most impacted sites. Because of the low number of sites in class 5 (Table 4), we grouped the sites of classes 4 and 5 to represent the highest degree of disturbance. We thus compared the mean scores of class 1 and of class 4/5 using a Wilcoxon test.

3. Results

3.1. Site environment

Sampling sites were principally located in small coldwater streams (80% of sites with a catchment area smaller than 100 km² and a wetted width of less than 9 m) of 12 European countries (Fig. 1). The data set was distributed among 12 Illies ecoregions (1, 2, 3, 4, 8, 9, 10, 13, 14, 18, 20, 22) and covered an area of about 5.5 × 10⁶ km². The number of sites was highly unbalanced between countries and between stream sizes (Table 5). The annual mean air temperature ranged from −1.5 °C to 16.5 °C (mean = 9.3, sd = 3.5),

3.3. Metric selection

Ninety-two of the 96 metrics tested were rejected, mainly because of heteroskedasticity of the residuals (74% of the metrics) and the bootstrap (79.2%) and, to a lesser extent, non-normality of the residuals (54.2%) and nonlinear relationships between the observed and the predicted values (56.3%) (Appendix).

The four metrics retained were the number of small oxygen-intolerant individuals (O2INTOL), the number of small habitat-intolerant individuals (HINTOL), the number of small rheophilous individuals (RHEO), and the number of small insectivores (INSEV). They showed a fairly good relationship between the observed and the predicted values (Fig. 5). These four metrics were computed using all species, and the 150-mm threshold was used to distinguish small from large individuals. None of the metrics based on large species, on large individuals, or on the 100- and 200-mm thresholds was retained, because they departed too strongly from the statistical assumptions.

All correlations between metric scores of the SID sites were very high (all Spearman ρ > 0.71, Table 6) and highly significant (all p-values < 0.001). O2INTOL and HINTOL were the most correlated metrics, with a ρ greater than 0.98. Correlations between metric scores (all sites) (Table 7) were highly significant (all p-values < 0.001). Despite these high correlations, only three of the four metrics displayed a significant response to the gradient of human pressures: O2INTOL, HINTOL, and RHEO (Fig. 6). For these

Table 5
Number of sites per country and per drainage area size (km²).

Country          0–10   10–50   50–100   100–250   250–500
Austria          13     70      37       46        21
Finland          0      0       3        1         0
France           32     113     36       32        14
Germany          0      5       3        0         5
Italy            25     26      8        5         0
Poland           7      19      5        4         0
Portugal         1      9       0        1         0
Romania          2      4       1        2         0
Spain            83     353     126      61        19
Sweden           39     113     27       22        15
Switzerland      75     82      23       21        7
United Kingdom   50     152     63       53        15

Fig. 5. Relationship between observed and theoretical metric values (number of individual fish): (a) O2INTOL (number of sites = 214), (b) HINTOL (n = 214), (c) RHEO (n = 212) and (d) INSEV (n = 212). The black lines represent the curve y = x and the gray lines represent a general trend (Loess regression curves, f = 0.667; Hastie et al., 2009).

metrics, the Wilcoxon tests demonstrated statistically significant differences between the scores of pressure index classes 1 and 4/5 (Table 8), with class 4/5 scores lower than class 1 scores (Fig. 6). In contrast, the scores of the metric INSEV varied only slightly along the gradient of human influences (Fig. 6), and no significant difference was observed between the scores of classes 1 and 4/5 (Table 8).

Table 7
Spearman rank correlation coefficients between metric scores (all sites).

          HINTOL   RH      INSEV
O2INTOL   0.977    0.901   0.720
HINTOL             0.908   0.706
RH                         0.745

Table 8
Statistics and p-values of the Wilcoxon rank tests comparing the mean scores of pressure index classes 1 and 4–5.

Metric    W statistic   p-value
O2INTOL   64326         <0.001
HINTOL    66910.5       <0.001
RH        61557         <0.001
INSEV     43598.5       0.145

4. Discussion

Of the 96 metrics tested, three could be considered candidate metrics for the development of coldwater multimetric indices at the European scale: the number of small oxygen-intolerant fishes, the number of small habitat-intolerant fishes, and the number of small rheophilous fishes. None of the metrics based on large species, large individuals, or on a threshold other than 150 mm provided consistent results.

All these results clearly demonstrated that metrics combining individual fish body sizes (Breine et al., 2004; Mebane et al., 2003) and species traits (Pont et al., 2006) could assess the effect of anthropogenic stressors on European coldwater fish assemblages. Despite the large variability in score values among pressure index classes 3 and 4, both the medians and the first quartiles of scores decreased with an increasing pressure index. These values were also lower than the values observed for index classes 1 and 2. Moreover, even if only a low number of sites displayed the highest level of disturbance, the scores of class 5 were consistently lower than the scores of minimally disturbed sites.

4.1. Metric sensitivities to pressure

The sensitivity of the metrics to a specific human pressure (Angermeier et al., 2000; Karr and Chu, 1999) or a combination of pressures should have shown high score variability among the pressure index classes. Pressure index values were computed as a linear combination (Tenenhaus and Young, 1985) of numerous anthropogenic stressors. Therefore, sites belonging to the same index class could present different combinations of human pressures. If the three metrics are sensitive to specific anthropogenic alterations (single or combined pressures), the variability of scores

Fig. 6. Metric score distributions (box plots, Chambers et al., 1983) among the pressure index classes (from 1, very good, to 5, bad conditions). (a) O2INTOL, (b) HINTOL, (c) RHEO, and (d) INSEV. The box represents the interquartile range, the bold line represents the median and the circles represent the outliers.

observed within each pressure class could reflect the between-site variability in anthropogenic alteration (Leonard and Orth, 1986). Two sites of the same pressure class could present a high or low score if they are impacted by different pressures. The relatively low number of sites experiencing a single pressure made it difficult to test the specificity of metric responses to diverse human alterations. Nevertheless, we think that using a pressure index gives global insight into the metrics' ability to detect anthropogenic stress on streams (Angermeier and Karr, 1986; Breine et al., 2004; Hughes et al., 2004; Karr et al., 1986; Pont et al., 2006).

The use of metrics based on ratios (Matzen and Berge, 2008) rather than raw abundance should be more coherent for assessing stream conditions at a large scale. The number of captures at a given site is highly dependent on the sampling effort (e.g., Kennard et al., 2006; Reid et al., 2009). Therefore, metrics based on raw abundance would reflect between-site variations in sampling effort rather than a consistent difference in ecological conditions. Moreover, metrics based on ratios are expected to respond to pressures that modify the expected balance between small and large individuals by either favouring large individuals or disadvantaging small individuals. For instance, an increasing number of warmwater and/or tolerant species originating from downstream locations has been widely reported in impacted coldwater streams. Metrics based on a ratio would be sensitive to such shifts in the assemblage structure, as individuals naturally occurring in these areas are replaced by individuals that are better adapted to the modified environmental conditions (Hughes et al., 2004; Langdon, 2001; Lyons et al., 1996; Matzen and Berge, 2008; Mebane et al., 2003; Mundahl and Simon, 1999; Wang et al., 2003). Therefore, the proportion of individuals present before the environmental alteration would decrease, either through a decrease in the number of fish adapted to the former environmental conditions while the total abundance of fish remained unchanged (Mundahl and Simon, 1999) or through an increase in the overall abundance of fish with different attributes.

4.2. Fish stocking

Fish stocking to sustain a wild species population or for recreational purposes (Cowx, 1994; Cowx and Gerdeaux, 2004; Rutherford, 2002) is a broad and very common practice in Europe (e.g., Caudron et al., 2009; European Inland Fisheries Advisory Commission, 1982; Hansen, 2002; Largiadèr and Scholl, 1996). Among stocked species, salmonids and especially brown trout are used throughout Europe, mainly because of financial interests and patrimonial status. The input of exogenous individuals should modify the size class distribution of fish populations and the number of individuals present in a given place. Therefore, fish stocking is likely to have blurred the observed metric responses to human pressures. In a previous experiment to develop an IBI for Pacific Northwest coldwater streams, Mebane et al. (2003) found artificially higher IBI scores at some sites where hatchery fishes belonging to different age classes had been released. The influence of stocking should be even more

important, in that the brown trout was the most abundant and most frequent species in our study. Unfortunately, no information concerning fish stocking in our samples was available, which prevented testing the effect of stocking on metric scores and is a limitation of our study.

4.3. Advantages of metrics based on size classes

We believe that using defined size classes rather than age classes to assess stream conditions presents several advantages. Fish body size can be measured directly in the field during fish inventory, with no additional handling or physical harm. While fish age can be assessed from scales and otoliths (Summerfelt and Hall, 1987), this requires both an experienced experimenter (Rifflart et al., 2006) and significant additional laboratory work to assess fish ages accurately. This may explain why some authors using metrics based on age classes derived them from length distributions (Breine et al., 2004; Hughes et al., 2004) or from age–length relationships (Bramblett et al., 2005). Indeed, compared with fish age, data on fish length are much more frequently available, easier and safer to collect, and easier to handle. For these reasons, we think that using metrics based on size classes rather than on age classes is more advantageous in assessing stream conditions at a large spatial extent.

Metrics based on size classes should be very useful for assessing the ecological condition of low-species rivers. Due to their low number of species, depauperate assemblages present simpler assemblage structures than warmwater assemblages (Lyons et al., 1996), limiting the number of metrics available for these streams. Dividing fishes into several classes depending on their body size increases the number of taxa occurring at a given site, since each combination of species and size class could be considered an independent entity or taxon. Using size classes would thus increase the variability and the number of candidate metrics available for low-species rivers. Moreover, the size structure of fish communities is a community attribute that is only sparsely used in river bioassessment (Breine et al., 2004; Hughes et al., 2004; Langdon, 2001; Mebane et al., 2003).

4.4. Geographic range of metric use

Under the reference condition approach, the effect of the environment on assemblage composition and structure is controlled either by a type-specific or a site-specific approach (Roset et al., 2007). The type-specific approach aims first to cluster reference sites on their faunal similarities (e.g., Joy and Death, 2002; Melcher et al., 2007; Schmutz et al., 2007; Simpson and Norris, 2000; Wright et al., 2000). Then the environment is used to assign each site to one of the faunal groups because both environmental conditions (e.g., Grossman et al., 1998; Huet, 1954; Jackson et al., 2001; Magalhaes et al., 2002; Matthews, 1998; Poff et al., 1997; Vehanen et al., 2010) and human pressures influence the species composition of fish assemblages (e.g., Haxton and Findlay, 2008; Kruk and Penczak, 2003; McCormick et al., 2001; Quinn and Kwak, 2003; Quist et al., 2005; Wang et al., 2003). Then the species composition or the assemblage structure of a test site is compared to the values observed within the reference sites of the same faunal group (Joy and Death, 2002; Melcher et al., 2007; Roset et al., 2007; Schmutz et al., 2007). For instance, Melcher et al. (2007) identified 15 European fish types (EFT) and used seven environmental variables to predict the EFT of each site.

In this study, we used the site-specific approach and predicted the expected metric values directly from the site's environmental conditions (Hawkins et al., 2010; Oberdorff et al., 2002; Pont et al., 2006, 2007; Roset et al., 2007). With this approach, we did not consider the faunal dissimilarities between European regions (Melcher et al., 2007; Reyjol et al., 2007; Schmutz et al., 2007). Nevertheless, the use of metrics based on species traits enables one to compare assemblages composed of different species pools (Hoeinghaus et al., 2007; Hugueny et al., 2010; Irz et al., 2007; Kelt et al., 1996; Lamouroux et al., 2002; Logez et al., 2010), and the analysis of assemblage structure reflects the effect of habitat on assemblage structure (e.g., Bellwood et al., 2002; Bremner et al., 2006; Heino et al., 2005; Hoeinghaus et al., 2007).

Moreover, the three metrics responding to human pressures could be useful for assessing the impact of anthropogenic activities on coldwater streams across Europe. Intolerance to low oxygen concentration and habitat degradation and, to a lesser extent, affinity to flow velocity are three widespread species attributes in small coldwater European streams (Wootton, 1992). Moreover, the traits intolerance to low oxygen concentrations and habitat degradation are supported by fish species with very large distribution areas (Kottelat and Freyhof, 2007): salmonids (e.g., the brown trout and the Atlantic salmon, S. salar, Linnaeus), sculpins (the bullhead and the Siberian sculpin, C. poecilopus, Heckel), and the minnow. This ensures a broad geographical representation of these metrics among European coldwater assemblages, independently of the biogeographical and historical factors that have shaped the current species distribution (Banarescu, 1989; Griffiths, 2006; Hewitt, 1999, 2000; Hoeinghaus et al., 2007). These metrics could thus be used across Europe to assess the ecological status of coldwater streams, even in Mediterranean regions. While Mediterranean warmwater fish assemblages present specific species composition (Ferreira et al., 2007) and functional structure (Logez et al., 2010) compared to western European assemblages, coldwater faunas are fairly similar among these regions (Ferreira et al., 2007). Therefore, the metrics based on species traits can assess stream conditions over a very large scale, without using any prior site classification (Pont et al., 2006, 2007).

We are convinced that this kind of size-based metric could be used outside Europe, especially in North America. The relative similarity of coldwater faunas between these two regions (Moyle and Herbold, 1987) would enable this geographical extension. First, North American coldwater assemblages are mostly dominated by salmonids and sculpins (Hughes et al., 2004; Langdon, 2001; Lyons et al., 1996; Mebane et al., 2003; Moyle and Herbold, 1987; Wang et al., 2003), the two most frequent and abundant families in European coldwater assemblages. Second, even if some genera of these families are endemic to one particular region, such as Salmo in Europe and Salvelinus in North America, the attributes of salmonids and sculpins are similar in the two regions. Indeed, except for a few species (Halliwell et al., 1999), salmonids and sculpins are considered intolerant coldwater species in North America (Halliwell et al., 1999; Zaroban et al., 1999), which is consistent with our species classification. Therefore, using information on both fish body size and species traits should overcome the biogeographical differences between regions (Hoeinghaus et al., 2007; Pont et al., 2006).

5. Conclusion

The methodology used in this study to develop and select new metrics consisted of: (1) defining the new metrics (species trait, size class, threshold length, species pool), (2) selecting a calibration data set composed only of sites that were undisturbed or slightly disturbed, (3) controlling the environmental variability of the metrics, (4) quantifying the deviation between the observed and the expected metric values in the absence of human pressures, (5) transforming the deviations into unitless scores, (6) testing the sensitivity of the metrics to human pressures, (7) assessing the redundancy between metrics, and (8) selecting the candidate metrics for the development of an MMI.
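Steps (1) and (6) of this methodology can be illustrated with a short sketch. It is a minimal example under stated assumptions, not the implementation used in the study: the `Fish` record, the helper names, the example catch, and the choice to count a fish as small when its length is strictly below the threshold are all hypothetical. The rank-sum statistic follows the Mann-Whitney convention (rank sum of the first sample minus its smallest possible value), which is assumed here to correspond to the W statistic reported in Table 8.

```python
from typing import NamedTuple


class Fish(NamedTuple):
    species: str
    length_mm: float


def count_small_with_trait(catch, trait_species, threshold_mm=150.0):
    """Count individuals of trait-bearing species shorter than the threshold.

    Assumption: "small" means strictly below the threshold length.
    """
    return sum(
        1 for f in catch
        if f.species in trait_species and f.length_mm < threshold_mm
    )


def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def rank_sum_w(x, y):
    """Rank sum of x in the pooled sample, minus its minimum n_x(n_x + 1)/2."""
    ranks = average_ranks(list(x) + list(y))
    n_x = len(x)
    return sum(ranks[:n_x]) - n_x * (n_x + 1) / 2


# Hypothetical catch at one site and an illustrative set of intolerant species.
catch = [
    Fish("Salmo trutta", 120),
    Fish("Salmo trutta", 210),
    Fish("Cottus gobio", 80),
    Fish("Squalius cephalus", 90),
]
intolerant = {"Salmo trutta", "Cottus gobio"}
small_intolerant = count_small_with_trait(catch, intolerant)

# Hypothetical metric scores for class 1 sites and class 4/5 sites.
w = rank_sum_w([0.8, 0.9, 0.7], [0.2, 0.3, 0.7])
```

A value of W close to its maximum (the product of the two sample sizes) indicates that scores in the first group tend to exceed those in the second; the associated p-value would come from a statistical package rather than from this sketch.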

Table A1
Summary of the statistics checked for each model (y if the criterion was fulfilled, n otherwise): normality of the Pearson residuals (1), heteroskedasticity of the residuals (2), leverage values (3), linear relationship between observed and expected values (4), and bootstrap (5). The metric name is composed of the trait considered, the species pool used (G for all species, L for large species only), the threshold length (100, 150 or 200 mm), and the size class (s for small individuals, l for large individuals).

Metric           1  2  3  4  5    Metric         1  2  3  4  5

INTOL-G-100-s    y  n  y  y  n    INSEV-G-100-s  y  y  y  n  n
INTOL-G-100-l    y  n  y  n  y    INSEV-G-100-l  y  n  y  n  n
INTOL-G-150-s    y  y  y  n  n    INSEV-G-150-s  y  y  y  y  y
INTOL-G-150-l    n  n  y  n  y    INSEV-G-150-l  n  n  y  y  n
INTOL-G-200-s    n  y  y  y  n    INSEV-G-200-s  n  n  y  n  n
INTOL-G-200-l    n  n  y  n  n    INSEV-G-200-l  n  n  y  n  n
INTOL-L-100-s    y  y  y  n  n    INSEV-L-100-s  y  y  y  n  n
INTOL-L-100-l    y  y  y  y  n    INSEV-L-100-l  y  y  y  y  n
INTOL-L-150-s    n  n  y  y  n    INSEV-L-150-s  y  n  y  y  n
INTOL-L-150-l    n  n  y  n  n    INSEV-L-150-l  n  n  y  n  n
INTOL-L-200-s    n  n  y  y  y    INSEV-L-200-s  n  n  y  y  n
INTOL-L-200-l    n  n  y  n  n    INSEV-L-200-l  n  n  y  n  n
O2INTOL-G-100-s  y  n  y  y  n    POTAD-G-100-s  n  n  y  n  y
O2INTOL-G-100-l  y  n  y  n  n    POTAD-G-100-l  y  n  y  n  y
O2INTOL-G-150-s  y  y  y  y  y    POTAD-G-150-s  y  n  y  n  y
O2INTOL-G-150-l  n  n  y  y  n    POTAD-G-150-l  n  n  y  n  y
O2INTOL-G-200-s  n  n  y  y  n    POTAD-G-200-s  y  n  y  n  n
O2INTOL-G-200-l  n  n  y  n  y    POTAD-G-200-l  n  n  y  n  y
O2INTOL-L-100-s  y  y  y  n  n    POTAD-L-100-s  y  n  y  n  n
O2INTOL-L-100-l  y  y  y  y  n    POTAD-L-100-l  y  y  y  y  n
O2INTOL-L-150-s  n  n  y  y  n    POTAD-L-150-s  y  y  y  y  n
O2INTOL-L-150-l  n  n  y  n  n    POTAD-L-150-l  n  n  y  n  n
O2INTOL-L-200-s  n  n  y  y  n    POTAD-L-200-s  n  n  y  y  n
O2INTOL-L-200-l  n  n  y  n  n    POTAD-L-200-l  n  n  y  n  n
HINTOL-G-100-s   y  n  y  y  n    LITH-G-100-s   y  n  y  n  y
HINTOL-G-100-l   y  n  y  n  n    LITH-G-100-l   y  n  y  y  n
HINTOL-G-150-s   y  y  y  y  y    LITH-G-150-s   y  n  y  n  y
HINTOL-G-150-l   n  n  y  y  n    LITH-G-150-l   n  n  y  y  n
HINTOL-G-200-s   n  n  y  y  n    LITH-G-200-s   y  n  y  n  n
HINTOL-G-200-l   n  n  y  n  n    LITH-G-200-l   n  n  y  n  n
HINTOL-L-100-s   y  y  y  n  n    LITH-L-100-s   y  y  y  n  n
HINTOL-L-100-l   y  y  y  y  n    LITH-L-100-l   y  y  y  y  n
HINTOL-L-150-s   n  n  y  y  n    LITH-L-150-s   y  n  y  n  n
HINTOL-L-150-l   n  n  y  n  y    LITH-L-150-l   n  n  y  n  n
HINTOL-L-200-s   n  n  y  y  n    LITH-L-200-s   n  n  y  y  n
HINTOL-L-200-l   n  n  y  n  n    LITH-L-200-l   n  n  y  n  n
RH-G-100-s       y  y  y  y  n    SIN-G-100-s    y  n  y  y  y
RH-G-100-l       y  n  y  n  n    SIN-G-100-l    y  n  y  n  n
RH-G-150-s       y  y  y  y  y    SIN-G-150-s    y  y  y  n  n
RH-G-150-l       n  n  y  y  n    SIN-G-150-l    n  n  y  n  y
RH-G-200-s       n  n  y  y  y    SIN-G-200-s    n  n  y  y  n
RH-G-200-l       n  n  y  n  n    SIN-G-200-l    n  n  y  n  n
RH-L-100-s       y  y  y  n  n    SIN-L-100-s    y  y  y  n  n
RH-L-100-l       y  y  y  y  n    SIN-L-100-l    y  y  y  y  n
RH-L-150-s       n  n  y  y  n    SIN-L-150-s    n  n  y  n  n
RH-L-150-l       n  n  y  n  n    SIN-L-150-l    n  n  y  n  y
RH-L-200-s       n  n  y  y  n    SIN-L-200-s    n  n  y  y  n
RH-L-200-l       n  n  y  n  n    SIN-L-200-l    n  n  y  n  n
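The naming convention of Table A1 and the rule that a metric is kept only when all five checks are fulfilled can be made concrete with a small sketch. This is an illustration under assumptions, not code from the study: the `MetricName` type and the helper names are hypothetical, and only four rows of the table are copied in as sample data.

```python
from typing import NamedTuple

# Codes taken from the Table A1 caption.
POOLS = {"G": "all species", "L": "large species only"}
SIZE_CLASSES = {"s": "small individuals", "l": "large individuals"}


class MetricName(NamedTuple):
    trait: str
    pool: str
    threshold_mm: int
    size_class: str


def parse_metric_name(name):
    """Split a name such as 'O2INTOL-G-150-s' into its four components."""
    trait, pool, threshold, size_class = name.split("-")
    return MetricName(trait, POOLS[pool], int(threshold), SIZE_CLASSES[size_class])


def passes_all_checks(flags):
    """True when every one of the five checks is marked 'y'."""
    return all(flag == "y" for flag in flags.split())


# Four rows copied from Table A1 (checks 1 to 5, in order).
rows = {
    "O2INTOL-G-150-s": "y y y y y",
    "O2INTOL-G-150-l": "n n y y n",
    "INSEV-G-150-s": "y y y y y",
    "INSEV-L-150-s": "y n y y n",
}
retained = [name for name, flags in rows.items() if passes_all_checks(flags)]
```

Applied to the full table, the same filter would be expected to return the four metrics with a row of five y values, all of them computed on all species with the 150-mm threshold and the small size class.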

Our study clearly demonstrated the ability of metrics based on the combination of species traits and fish lengths to assess the ecological conditions of coldwater streams dominated by intolerant species. Moreover, these metrics enable one to take into account different assemblage components, which meets one of the requirements of the European Water Framework Directive.

Acknowledgments

Work on this study was funded by the European Commission under the Sixth Framework Programme (EFI+ project, contract number 044096). We are grateful to all members who took part in this project.

Appendix A.

See Table A1.

References

An, K.G., Park, S.S., Shin, J.Y., 2002. An evaluation of a river health using the index of biological integrity along with relations to chemical and habitat conditions. Environ. Int. 28, 411–420.
Angermeier, P.L., Karr, J.R., 1986. Applying an index of biotic integrity based on stream-fish communities: considerations in sampling and interpretation. N. Am. J. Fish. Manage. 6, 418–429.
Angermeier, P.L., Schlosser, I.J., 1987. Assessing biotic integrity of the fish community in a small Illinois stream. N. Am. J. Fish. Manage. 7, 331–338.
Angermeier, P.L., Smogor, R.A., Stauffer, J.R., 2000. Regional frameworks and candidate metrics for assessing biotic integrity in Mid-Atlantic highland streams. Trans. Am. Fish. Soc. 129, 962–981.
Bailey, R.C., Kennedy, M.G., Dervish, M.Z., Taylor, R.M., 1998. Biological assessment of freshwater ecosystems using a reference condition approach: comparing predicted and actual benthic invertebrate communities in Yukon streams. Freshw. Biol. 39, 765–774.
Banarescu, P., 1989. Zoogeography and history of the freshwater fish fauna of Europe. In: Holcik, J. (Ed.), The Freshwater Fishes of Europe. Aula-Verlag, Wiesbaden, pp. 88–107.
Bellwood, D.R., Wainwright, P.C., Fulton, C.J., Hoey, A., 2002. Assembly rules and functional groups at global biogeographical scales. Funct. Ecol. 16, 557–562.
Bramblett, R.G., Johnson, T.R., Zale, A.V., Heggem, D.G., 2005. Development and evaluation of a fish assemblage index of biotic integrity for northwestern Great Plains streams. Trans. Am. Fish. Soc. 134, 624–640.

Breine, J., Simoens, I., Goethals, P., Quataert, P., Ercken, D., Van Liefferinghe, C., Belpaire, C., 2004. A fish-based index of biotic integrity for upstream brooks in Flanders (Belgium). Hydrobiologia 522, 133–148.
Bremner, J., Rogers, S.I., Frid, C.L.J., 2006. Matching biological traits to environmental conditions in marine benthic ecosystems. J. Mar. Syst. 60, 302–316.
Cameron, A.C., Trivedi, P.K., 1998. Regression Analysis of Count Data. Cambridge University Press, Cambridge.
Caudron, A., Champigneulle, A., Largiader, C.R., Launey, S., Guyomard, R., 2009. Stocking of native Mediterranean brown trout (Salmo trutta) into French tributaries of Lake Geneva does not contribute to lake-migratory spawners. Ecol. Freshw. Fish. 18, 585–593.
Chambers, J.M., Cleveland, W.S., Kleiner, B., Tukey, P.A., 1983. Graphical Methods for Data Analysis. Wadsworth & Brooks/Cole, Pacific Grove, California.
Cowx, I., 1994. Stocking strategies. Fish. Manag. Ecol. 1, 15–30.
Cowx, I., Gerdeaux, D., 2004. The effects of fisheries management practises on freshwater ecosystems. Fish. Manag. Ecol. 14, 145–151.
Davidson, A.C., Hinkley, D.V., 1997. Bootstrap Methods and Their Application. Cambridge University Press.
Efron, B., Tibshirani, R.J., 1993. An Introduction to the Bootstrap. Chapman & Hall, London.
European Inland Fisheries Advisory Commission, 1982. Report of the symposium on stock enhancement in the management of freshwater fisheries. Held in Budapest, Hungary, 31 May–2 June 1982 in conjunction with the twelfth session of EIFAC. EIFAC Tech. Pap. 42, 43 pp.
European Union, 2000. Directive 2000/60/EC of the European Parliament and of the Council establishing a framework for the community action in the field of water policy. Off. J. Eur. Commun. L327, 1–72.
Faraway, J.J., 2006. Extending the Linear Model With R. Generalized Linear, Mixed Effects and Nonparametric Regression Models. Chapman & Hall/CRC, Boca Raton, Florida.
Fausch, K.D., Karr, J.R., Yant, P.R., 1984. Regional application of an index of biotic integrity based on stream fish communities. Trans. Am. Fish. Soc. 113, 39–55.
Ferreira, T., Oliveira, J., Caiola, N., De Sostoa, A., Casals, F., Cortes, R., Economou, A., Zogaris, S., Garcia-Jalon, D., Ilheu, M., Martinez-Capel, F., Pont, D., Rogers, C., Prenda, J., 2007. Ecological traits of fish assemblages from Mediterranean Europe and their responses to human disturbance. Fish. Manag. Ecol. 14, 473–481.
Griffiths, D., 2006. Pattern and process in the ecological biogeography of European freshwater fish. J. Anim. Ecol. 75, 734–751.
Grossman, G.D., Ratajczak, R.E.J., Crawford, M., Freeman, M.C., 1998. Assemblage organization in stream fishes: effects of environmental variation and interspecific interactions. Ecol. Monogr. 68, 395–420.
Halliwell, D.B., Langdon, R.W., Daniels, R.A., Kurtenbach, J.P., Jacobson, R.A., 1999. Classification of freshwater fish species of the northeastern United States for use in the development of indices of biological integrity with regional applications. In: Simon, T.P. (Ed.), Assessing the Sustainability and Biological Integrity of Water Resources Using Fish Communities. CRC Press, Boca Raton, Florida, pp. 301–333.
Hansen, M.M., 2002. Estimating the long-term effects of stocking domesticated trout into wild brown trout (Salmo trutta) populations: an approach using microsatellite DNA analysis of historical and contemporary samples. Mol. Ecol. 11, 1003–1015.
Harris, J.H., Silveira, R., 1999. Large-scale assessments of river health using an index of biotic integrity with low-diversity fish communities. Freshw. Biol. 41, 235–252.
Hartigan, J.A., Wong, M.A., 1979. Algorithm AS 136: a k-means clustering algorithm. Appl. Stat. 28, 100–108.
Hastie, T., Tibshirani, R., Friedman, J., 2009. The Elements of Statistical Learning: Data Mining, Inference, and Prediction, second ed. Springer, New York.
Hawkins, C.P., Cao, Y., Roper, B., 2010. Method of predicting reference condition biota affects the performance and interpretation of ecological indices. Freshw. Biol. 55, 1066–1085.
Haxton, T.J., Findlay, C.S., 2008. Meta-analysis of the impacts of water management on aquatic communities. Can. J. Fish. Aquat. Sci. 65, 437–447.
Heino, J., Parviainen, J., Paavola, R., Jehle, M., Louhi, P., Muotka, T., 2005. Characterizing macroinvertebrate assemblage structure in relation to stream size and tributary position. Hydrobiologia 539, 121–130.
Hering, D., Feld, C.K., Moog, O., Ofenböck, T., 2006. Cook book for the development of a multimetric index for biological condition of aquatic ecosystems: experiences from the European AQEM and STAR projects and related initiatives. Hydrobiologia 566, 311–324.
Hewitt, G.M., 1999. Post-glacial re-colonization of European biota. Biol. J. Linn. Soc. 68, 87–112.
Hewitt, G.M., 2000. The genetic legacy of the Quaternary ice ages. Nature 405, 907–913.
Hill, M.O., Smith, A.J.E., 1976. Principal component analysis of taxonomic data with multi-state discrete characters. Taxon 25, 249–255.
Hoeinghaus, D.J., Winemiller, K.O., Birnbaum, J.S., 2007. Local and regional determinants of stream fish assemblage structure: inferences based on taxonomic vs. functional groups. J. Biogeogr. 34, 324–338.
Huet, M., 1954. Biologie, profils en long et en travers des eaux courantes. Bull. fr. Piscic. 175, 41–53.
Hughes, R.M., Howlin, S., Kaufmann, P.R., 2004. A biointegrity index (IBI) for coldwater streams of western Oregon and Washington. Trans. Am. Fish. Soc. 133, 1497–1515.
Hughes, R.M., Kaufmann, P.R., Herlihy, A.T., Kincaid, T.M., Reynolds, L., Larsen, D.P., 1998. A process for developing and evaluating indices of fish assemblage integrity. Can. J. Fish. Aquat. Sci. 55, 1618–1631.
Hughes, R.M., Oberdorff, T., 1999. Applications of IBI concepts and metrics to waters outside the United States and Canada. In: Simon, T.P. (Ed.), Assessing the Sustainability and Biological Integrity of Water Resources Using Fish Communities. CRC Press, Boca Raton, Florida, pp. 79–93.
Hugueny, B., Camara, S., Samoura, B., Magassouba, M., 1996. Applying an index of biotic integrity based on fish assemblages in a West African river. Hydrobiologia 331, 71–78.
Hugueny, B., Oberdorff, T., Tedesco, P.A., 2010. Community ecology of river fishes: a large-scale perspective. In: Jackson, D.A., Gido, K.B. (Eds.), Community Ecology of Stream Fishes: Concepts, Approaches, and Techniques. American Fisheries Society, Symposium 73, Bethesda, Maryland, pp. 29–62.
Irz, P., Michonneau, F., Oberdorff, T., Whittier, T.R., Lamouroux, N., Mouillot, D., Argillier, C., 2007. Fish community comparisons along environmental gradients in lakes of France and north-east USA. Glob. Ecol. Biogeogr. 16, 350–366.
Jackson, D.A., Peres-Neto, P.R., Olden, J.D., 2001. What controls who is where in freshwater fish communities – the roles of biotic, abiotic, and spatial factors. Can. J. Fish. Aquat. Sci. 58, 157–170.
Joy, M.K., Death, R.G., 2002. Predictive modelling of freshwater fish as a biomonitoring tool in New Zealand. Freshw. Biol. 47, 2261–2275.
Junk, W.J., Bayley, P.B., Sparks, R.E., 1989. The flood pulse concept in river-floodplain systems. In: Dodge, D.P. (Ed.), Proceedings of The International Large River Symposium (LARS). Canadian Journal of Fisheries and Aquatic Sciences Special Publication.
Karr, J.R., 1981. Assessment of biotic integrity using fish communities. Fisheries 6, 21–27.
Karr, J.R., 1991. Biological integrity: a long-neglected aspect of water resource management. Ecol. Appl. 1, 66–84.
Karr, J.R., Chu, E.W., 1999. Restoring Life in Running Waters: Better Biological Monitoring. Island Press, Washington, DC.
Karr, J.R., Chu, E.W., 2000. Sustaining living rivers. Hydrobiologia 422, 1–14.
Karr, J.R., Fausch, K.D., Angermeier, P.L., Yant, P.R., Schlosser, I.J., 1986. Assessing biological integrity in running waters: a method and its rationale. Special publication 5. Illinois Natural History Survey, Champaign, IL.
Kelt, D.A., Brown, J.H., Heske, E.J., Marquet, P.A., Morton, S.R., Reid, J.R.W., Rogovin, K.A., Shenbrot, G., 1996. Community structure of desert small mammals: comparisons across four continents. Ecology 77, 746–761.
Kennard, M.J., Pusey, B.J., Harch, B.D., Dore, E., Arthington, A.H., 2006. Estimating local stream fish assemblage attributes: sampling effort and efficiency at two spatial scales. Mar. Freshwat. Res. 57, 635–653.
Kottelat, M., Freyhof, J., 2007. Handbook of European Freshwater Fishes. Publications Kottelat, Cornol, Switzerland.
Kruk, A., Penczak, T., 2003. Impoundment impact on populations of facultative riverine fish. Ann. Limnol.-Int. J. Lim. 39, 197–210.
Kutner, M.H., Nachtsheim, C.J., Neter, J., Li, W., 2005. Applied Linear Statistical Models, fifth ed. McGraw-Hill/Irwin, New York.
Lamouroux, N., Poff, N.L., Angermeier, P.L., 2002. Intercontinental convergence of stream fish community traits along geomorphic and hydraulic gradients. Ecology 83, 1792–1807.
Langdon, R.W., 2001. A preliminary index of biological integrity for fish assemblages of small coldwater streams in Vermont. Northeast. Nat. 8, 219–232.
Largiadèr, C.R., Scholl, A., 1996. Genetic introgression between native and introduced brown trout Salmo trutta L. populations in the Rhone river basin. Mol. Ecol. 5, 417–426.
Lee, D.S., Gilbert, C.R., Hocutt, C.H., Jenkins, R.E., McAllister, D.E., Stauffer, Jr., J.R., 1980. Atlas of North American Freshwater Fishes. North Carolina State Museum of Natural History, Raleigh, NC.
Legendre, L., Legendre, P., 1998. Numerical Ecology. Elsevier Science B.V., Amsterdam.
Leonard, P.M., Orth, D.J., 1986. Application and testing of an index of biotic integrity in small, coolwater streams. Trans. Am. Fish. Soc. 115, 401–414.
Logez, M., Pont, D., Ferreira, M.T., 2010. Do Iberian and European fish faunas exhibit convergent functional structure along environmental gradients? J. North. Am. Benthol. Soc. 29, 1310–1323.
Lyons, J., 1996. Patterns in the species composition of fish assemblages among Wisconsin streams. Environ. Biol. Fishes 45, 329–341.
Lyons, J., Navarroperez, S., Cochran, P.A., Santana, E., Guzmanarroyo, M., 1995. Index of biotic integrity based on fish assemblages for the conservation of streams and rivers in west-central Mexico. Conserv. Biol. 9, 569–584.
Lyons, J., Wang, L., Simonson, T.D., 1996. Development and validation of an index of biotic integrity for coldwater streams in Wisconsin. N. Am. J. Fish. Manage. 16, 241–256.
Magalhaes, M.F., Batalha, D.C., Collares-Pereira, M.J., 2002. Gradients in stream fish assemblages across a Mediterranean landscape: contributions of environmental factors and spatial structure. Freshw. Biol. 47, 1015–1031.
Magalhaes, M.F., Ramalho, C.E., Collares-Pereira, M.J., 2008. Assessing biotic integrity in a Mediterranean watershed: development and evaluation of a fish-based index. Fish. Manag. Ecol. 15, 273–289.
Matthews, W.J., 1998. Patterns in Freshwater Fish Ecology. Chapman & Hall, New York.
Matzen, D.A., Berge, H.B., 2008. Assessing small-stream biotic integrity using fish assemblages across an urban landscape in the Puget Sound lowlands of western Washington. Trans. Am. Fish. Soc. 137, 677–689.

McCormick, F.H., Hughes, R.M., Kaufmann, P.R., Peck, D.V., Stoddard, J.L., Herlihy, A.T., 2001. Development of an index of biotic integrity for the Mid-Atlantic Highlands region. Trans. Am. Fish. Soc. 130, 857–877.
McCullagh, P., Nelder, J.A., 1989. Generalized Linear Models, second ed. Chapman and Hall, London.
Mebane, C.A., Maret, T.R., Hughes, R.M., 2003. An index of biological integrity (IBI) for Pacific Northwest rivers. Trans. Am. Fish. Soc. 132, 239–261.
Melcher, A., Schmutz, S., Haidvogl, G., Moder, K., 2007. Spatially based methods to assess the ecological status of European fish assemblage types. Fish. Manag. Ecol. 14, 453–463.
Moyle, P.B., Herbold, B., 1987. Life-history patterns and community structure in stream fishes of Western North America: comparisons with Eastern North America and Europe. In: Matthews, W.J., Heins, D.C. (Eds.), Community and Evolutionary Ecology of North American Stream Fishes. University of Oklahoma Press, Norman, London, pp. 25–32.
Mundahl, N.D., Simon, T.P., 1999. Development and application of an index of biotic integrity for coldwater streams of the upper Midwestern United States. In: Simon, T.P. (Ed.), Assessing the Sustainability and Biological Integrity of Water Resources Using Fish Communities. CRC Press, Boca Raton, Florida, pp. 383–411.
Nelson, J.S., 2006. Fishes of The World, fourth ed. John Wiley & Sons, Inc., Hoboken, New Jersey.
Noble, R.A.A., Cowx, I.G., Goffaux, D., Kestemont, P., 2007. Assessing the health of European rivers using functional ecological guilds of fish communities: standardising species classification and approaches to metric selection. Fish. Manag. Ecol. 14, 381–392.
Oberdorff, T., Hughes, R.M., 1992. Modification of an index of biotic integrity based on fish assemblages to characterize rivers of the Seine Basin, France. Hydrobiologia 228, 117–130.
Oberdorff, T., Pont, D., Hugueny, B., Porcher, J.P., 2002. Development and validation
Rifflart, R., Marchand, F., Rivot, E., Bagliniere, J.L., 2006. Scale reading validation for estimating age from tagged fish recapture in a brown trout (Salmo trutta) population. Fish. Res. 78, 380–384.
Reyjol, Y., Hugueny, B., Pont, D., Bianco, P.G., Beier, U., Caiola, N., Casals, F., Cowx, I., Economou, A., Ferreira, T., Haidvogl, G., Noble, R., De Sostoa, A., Vigneron, T., Virbickas, T., 2007. Patterns in species richness and endemism of European freshwater fish. Glob. Ecol. Biogeogr. 16, 65–75.
Roset, N., Grenouillet, G., Goffaux, D., Pont, D., Kestemont, P., 2007. A review of existing fish assemblage indicators and methodologies. Fish. Manag. Ecol. 14, 393–405.
Rutherford, E.S., 2002. Fishery management. In: Fuiman, L.A., Werner, R.G. (Eds.), Fishery Science: The Unique Contributions of Early Life Stages. Blackwell Science, Oxford.
Schmutz, S., Melcher, A., Frangez, C., Haidvogl, G., Beier, U., Boehmer, J., Breine, J., Simoens, I., Caiola, N., De Sostoa, A., Ferreira, M.T., Oliveira, J., Grenouillet, G., Goffaux, D., Leeuw, J.J., Roset, N., Virbickas, T., 2007. Spatially based methods to assess the ecological status of riverine fish assemblages in European ecoregions. Fish. Manag. Ecol. 14, 441–452.
Simon, T.P., Lyons, J., 1995. Application of the index of biotic integrity to evaluate water resource integrity in freshwater ecosystems. In: Davis, W.S., Simon, T.P. (Eds.), Biological Assessment and Criteria: Tools for Water Resource Planning and Decision Making. CRC Press, Boca Raton, Florida, pp. 245–262.
Simpson, J., Norris, R.H., 2000. Biological assessment of water quality: development of AUSRIVAS models and outputs. In: Wright, J.F., Sutcliffe, D.W., Furse, M.T. (Eds.), Assessing the Biological Quality of Freshwaters. RIVPACS and Other Techniques. Freshwater Biological Association, Ambleside, United Kingdom, pp. 125–142.
Southerland, M.T., Rogers, G.M., Kline, M.J., Morgan, R.P., Boward, D.M., Kazyak, R., Klauda, R.J., Stranko, S.A., 2007.
Improving biological indicators to better assess of a fish-based index for the assessment of ‘river health’ in France. Freshw. Biol. the condition of streams. Ecol. Indic. 7, 751–767. 47, 1720–1734. Stoddard, J.L., Herlihy, A.T., Peck, D.V., Hughes, R.M., Whittier, T.R., Tarquinio, E., Oberdorff, T., Porcher, J.P., 1994. An index of biotic integrity to assess biologi- 2008. A process for creating multimetric indices for large-scale aquatic surveys. cal impacts of salmonid farm effluents on receiving waters. Aquaculture 119, J. N. Am. Benthol. Soc. 27, 878–891. 219–235. Stoddard, J.L., Larsen, D.P., Hawkins, C.P., Johnson, R.K., Norris, R.H., 2006. Setting Petts, G.E., Amoros, C. (Eds.), 1996. Fluvial Hydrosystems. Chapman & Hall, London. expectations for the ecological condition of streams: the concept of reference Poff, N.L., Allan, J.D., 1995. Functional organization of stream fish assemblages in condition. Ecol. Appl. 16, 1267–1276. relation to hydrological variability. Ecology 76, 606–627. Summerfelt, R.C., Hall, G.E. (Eds.), 1987. Age and Growth of Fish. Iowa State Univer- Poff, N.L., Allan, J.D., Bain, M.B., Karr, J.R., Prestegaard, K.L., Richter, B.D., Sparks, R.E., sity Press. Stromberg, J.C., 1997. The natural flow regime: a paradigm for river conservation Tenenhaus, M., Young, F.W., 1985. An analysis and synthesis of multiple corre- and restoration. BioScience 47, 769–784. spondence analysis, optimal scaling, dual scaling, homogeneity analysis and Pont, D., Hughes, R.M., Whittier, T.R., Schmutz, S., 2009. A predictive index of biotic other methods for quantifying categorical multivariate data. Psychometrika 50, integrity model for aquatic-vertebrate assemblages of western U.S. streams. 91–119. Trans. Am. Fish. Soc. 138, 292–305. Vehanen, T., Sutela, T., Korhonen, H., 2010. 
Environmental assessment of boreal Pont, D., Hugueny, B., Beier, U., Goffaux, D., Melcher, A., Noble, R., Rogers, C., Roset, rivers using fish data—a contribution to the water framework directive. Fish. N., Schmutz, S., 2006. Assessing river biotic condition at a continental scale: a Manag. Ecol. 17, 165–175. European approach using functional metrics and fish assemblages. J. Appl. Ecol. Venables, W.N., Ripley, B.D., 2002. Modern Applied Statistics with S, fourth ed. 43, 70–80. Springer, New York. Pont, D., Hugueny, B., Oberdorff, T., 2005. Modelling habitat requirement of European Wang, L.Z., Lyons, J., Kanehl, P., 2003. Impacts of urban land cover on trout streams fishes: do species have similar responses to local and regional environmental in Wisconsin and Minnesota. Trans. Am. Fish. Soc. 132, 825–839. constraints? Can. Can. J. Fish. Aquat. Sci. 62, 163–173. Whittier, T.R., Stoddard, J.L., Larsen, D.P., Herlihy, A.T., 2007. Selecting reference Pont, D., Hugueny, B., Rogers, C., 2007. Development of a fish-based index for the sites for stream biological assessments: best professional judgment or objective assessment of river health in Europe: the European Fish Index. Fish. Manag. Ecol. criteria. J. N. Am. Benthol. Soc. 26, 349–360. 14, 427–439. Wootton, R.J., 1992. Fish Ecology. Blackie USA. Chapman & Hall, New York. Quinn, J.W., Kwak, T.J., 2003. Fish assemblage changes in an Ozark river after Wright, J.F., Sutcliffe, D.W., Furse, M.T. (Eds.), 2000. Assessing the Biological Quality of impoundment: a long-term perspective. Trans. Am. Fish. Soc. 132, 110–119. Freshwaters. RIVPACS and Other Techniques. Freshwater Biological Association, Quist, M.C., Hubert, W.A., Rahel, F.J., 2005. Fish assemblage structure following Ambleside, United Kingdom. impoundment of a great plains river. West. N. Am. Nat. 65, 53–63. Zaroban, D.W., Mulvey, M.P., Maret, T.R., Hughes, R.M., Merritt, G.D., 1999. Classifi- Reid, S.M., Yunker, G., Jones, N.E., 2009. 
Evaluation of single-pass backpack electric cation of species attributes for Pacific Northwest freshwater fishes. Northwest. fishing for stream fish community monitoring. Fish. Manag. Ecol. 16, 1–9. Sci. 73, 81–93.

P3: Development of metrics based on fish body size and species traits to assess European coldwater streams.

Comment: modelling process

Only the metrics successfully modelled by the environment were retained. Each metric was linked to the environmental conditions using Generalized Linear Models (GLM; McCullagh and Nelder 1989, Faraway 2006) with a negative binomial distribution (NB; Cameron and Trivedi 1998, Venables and Ripley 2002). In a GLM, the linear predictor $\eta$ is related to the independent variables $X$ (the environmental predictors):

$$\eta = \beta_0 + \sum_i \beta_i X_i \tag{3.1}$$

with $\beta_0$ the intercept and $\beta_i$ the coefficient associated with the environmental variable $X_i$. The variable $Y$ (metric) is related to the linear predictor through the link function $g$, so:

$$g(Y) = \eta \tag{3.2}$$

$$E(Y) = g^{-1}(\eta) \tag{3.3}$$

The link function associated with the NB distribution is the natural logarithm (Cameron and Trivedi 1998). The relationship between the metric and the environmental variables is:

$$\log(Y) = \beta_0 + \sum_i \beta_i X_i \tag{3.4}$$

Model coefficients are estimated by maximum likelihood (McCullagh and Nelder 1989).

The logarithm of the total number of individuals was included as an “offset” (Chambers and Hastie 1993, Faraway 2006) in the models for the metrics based on all species, and the logarithm of the total number of individuals from large species was included in the models for the metrics based on large-species individuals. The coefficient associated with the offset variable is fixed at one, so that the models describe proportions rather than absolute numbers of individuals. These models are also called rate models (Cameron and Trivedi 1998):

$$\log(Y) = \beta_0 + \sum_i \beta_i X_i + \log(N) \tag{3.5}$$

$$\log\left(\frac{Y}{N}\right) = \beta_0 + \sum_i \beta_i X_i \tag{3.6}$$

with $N$ the total number of individuals or the number of large-species individuals.
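The effect of the offset in equations 3.5 and 3.6 can be illustrated with a minimal Python sketch. The function name, the coefficients and the environmental values below are hypothetical and are not taken from the fitted models; the sketch only shows that, with the offset coefficient fixed at one, the predicted proportion does not depend on the number of individuals caught.

```python
import math

def expected_count(beta0, betas, x, n_total):
    """Expected metric value under a log link with log(N) as offset (eq. 3.5)."""
    eta = beta0 + sum(b * xi for b, xi in zip(betas, x))  # linear predictor without offset
    return n_total * math.exp(eta)  # exp(eta + log(N)) = N * exp(eta)

# Hypothetical coefficients and environment, for illustration only.
beta0, betas, x = -1.2, [0.4, -0.1], [0.5, 2.0]

# The predicted proportion Y/N (eq. 3.6) is the same for a small and a large catch.
prop_small_catch = expected_count(beta0, betas, x, 50) / 50
prop_large_catch = expected_count(beta0, betas, x, 500) / 500
```

Both proportions equal exp(eta), which is why such a model is read as a model of rates rather than of raw counts.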

For the metrics based on all species, this results in modelling the proportion of individuals with both a given attribute and a given size among all assemblage individuals, e.g. the proportion of small rheophilic fish (Figure 12A). For the metrics based on the large species, this leads to modelling the proportion of individuals with both a given attribute and size class among the individuals from large species, e.g. the proportion of small rheophilic fish among large-species individuals (Figure 12B). The latter resemble metrics based on age classes: the small individuals are mainly young individuals and the large individuals are mainly older individuals.

Comment: the advantage of negative binomial distribution

Although both the Poisson and the negative binomial distributions are suited to count data (Cameron and Trivedi 1998), the negative binomial distribution was preferred. Using a Poisson distribution assumes that the expected value and the variance are equal, the equidispersion hypothesis (Cameron and Trivedi 1998):

$$E(Y) = \mu \tag{3.9}$$

$$V(Y) = \mu \tag{3.10}$$

with $\mu$ the parameter of the Poisson distribution. “Failure of the Poisson assumption of equidispersion has similar qualitative consequences to failure of the assumption of homoskedasticity in the linear regression model” (Cameron and Trivedi 1998, p.77). Therefore, without equidispersion the estimate of the variance–covariance matrix of the model coefficients would be biased. Using a negative binomial distribution takes into account the overdispersion of the dependent variable $Y$. The variance function associated with this distribution is:

$$V(Y) = \mu + \alpha\mu^2 \tag{3.11}$$

with $\alpha$ the dispersion parameter, computed by maximum likelihood estimation (Cameron and Trivedi 1998). In practice, the model coefficients estimated with either a Poisson or an NB distribution are very close, and so are the predicted values. For instance, for the number of small habitat intolerant individuals metric, the fitted values computed with the two distributions had a correlation of 0.997 (in the link space). Nevertheless, as expected, the estimates of the variance–covariance matrices differ substantially depending on the distribution considered (Table 6; Cameron and Trivedi 1998). These results will have considerable consequences when estimating the confidence intervals (see section P5).
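As a short illustration of the two variance functions (equations 3.10 and 3.11), the following Python sketch compares them for one expected value. The function names and the values of the mean and of the dispersion parameter are hypothetical, chosen only to show the order of magnitude that overdispersion can reach.

```python
def poisson_variance(mu):
    """Equidispersion: the variance equals the mean (eq. 3.10)."""
    return mu

def negative_binomial_variance(mu, alpha):
    """Variance with a quadratic overdispersion term (eq. 3.11)."""
    return mu + alpha * mu ** 2

# Hypothetical values: an expected count of 20 and a dispersion parameter of 0.5.
mu, alpha = 20.0, 0.5
var_poisson = poisson_variance(mu)                 # 20.0
var_nb = negative_binomial_variance(mu, alpha)     # 20 + 0.5 * 400 = 220.0
```

With these illustrative values the NB variance is eleven times the Poisson variance for the same mean, which is the kind of gap that inflates the variance–covariance estimates without changing the fitted means much.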


Table 6: Overview of the variance–covariance matrices of the model linking the environment to the number of small habitat intolerant fish, estimated with either a Poisson or a negative binomial distribution.

                     Poisson      Negative binomial
                     Intercept    Intercept
Intercept            0.001612     0.025572
Slope                0.001445     0.011266
Slope²               0.000821     0.012116
Thermal amplitude   -0.000063    -0.000989

Taking into account assemblage size structure to assess streams

Of the 96 metrics tested, three went through all the selection steps (Hughes et al. 1998, Karr and Chu 2000, Hughes et al. 2004, Hering et al. 2006, Roset et al. 2007, Stoddard et al. 2008) and are candidates for the final index. Although using a metric based on age or size classes has often been recommended (Karr 1991, Roset et al. 2007), especially by the WFD (European Union (EC) 2000), only a few indices have integrated such metrics (Oberdorff and Porcher 1994, Langdon 2001, Breine et al. 2004, Hughes et al. 2004). Using one of the three metrics would integrate the assemblage size structure in stream assessment. Nevertheless, to limit the redundancy of the information provided by these metrics (Table 8; Karr et al. 1986, Karr and Chu 1999, 2000) it would be necessary to retain only one of the three metrics (Hughes et al. 1998, Hering et al. 2006, Pont et al. 2006, Pont et al. 2007, Stoddard et al. 2008).

The variability of the scores observed among the various index classes suggests that the sensitivity of these metrics should differ depending on the pressures (Karr and Chu 1999, Angermeier et al. 2000). The pressure index is a linear combination (Tenenhaus and Young 1985) of several pressures. Two sites belonging to the same pressure index class may therefore be subjected to different pressures. Observed differences between scores could reflect the between-site variability of anthropogenic pressures (Leonard and Orth 1986). This hypothesis still needs to be tested.


P4: Implication of this work in the development of the new European Fish Index, EFI+. Thesis chapter

P4: Implication of this work in the development of the new European Fish Index, EFI+.

The new index development was inspired by the EFI developed during the FAME project (Figure 16; Pont et al. 2006, Pont et al. 2007). As in the EFI, all the metrics tested were based on the ecological and biological traits of species (Usseglio-Polatera et al. 2000, Noble et al. 2007). The metrics tested were based either on the number of species or the number of individuals, e.g. the number of rheophilic species or the number of rheophilic individuals. The new index is also based on the reference condition approach (Bailey et al. 1998) used during FAME and recommended by the WFD (European Union (EC) 2000).

The first step consists in selecting a data set of sites that are not or only slightly impacted, also called reference sites (Oberdorff et al. 2002, Pont et al. 2006, Stoddard et al. 2006, Hawkins et al. 2010), and a data set of impacted sites (Figure 16; Pont et al. 2006). The reference sites are used to calibrate the models relating metric variability to environmental variables (Oberdorff et al. 2002, Pont et al. 2006, Pont et al. 2007, Pont et al. 2009, Hawkins et al. 2010). These models are used to predict the expected values of the metrics in a given environment. Indices using predictive models are also called predictive bioindicators (bio-indicateur prédictif; Pont 2000). Other approaches directly estimate the expected values from the distribution of the values observed in the reference sites (e.g. Bates Prins and Smith 2007). In each model, an offset was added (total richness, number of captures) to model proportions rather than raw metrics (see previous chapter; Chambers and Hastie 1993, Cameron and Trivedi 1998, Faraway 2006). This is an important difference in the statistical strategy used to model the metrics compared to the FAME project.

Observed metric values are then compared to the expected values to control for the environmental part of the variability of the metric (e.g. equation 3.7):

$$\log(y_i + 1) - \log(\hat{y}_i + 1) \tag{3.7}$$

with $y_i$ the observed metric value in a site $i$ and $\hat{y}_i$ the expected value in the absence of pressures. The deviations between observed and expected values are then transformed into a score scaled between 0 and 1 (0 for highly degraded and 1 for very good condition). Two steps are needed to transform deviations into scores. First the deviations are standardised:

$$S_i = \frac{res_i - \bar{x}_i}{s} \tag{3.12}$$

with $S_i$ the score, $res_i$ the deviation between observed and expected values (equation 3.7), $\bar{x}_i$ the mean of the deviations of the reference sites located in the ecoregion $i$ and $s$ the standard deviation of all reference sites (see Appendix 2). Thus the scoring process takes into account the ecoregion of the sites. The score was then rescaled between 0 and 1 using a min–max transformation (Legendre and Legendre 1998, Hering et al. 2006, Saporta 2006):

$$\frac{score - min}{max - min} \tag{3.8}$$

The minimum and maximum were chosen so that the anchor, computed as the median of the scores of the sites that were not or only slightly disturbed, was equal to 0.8. The transformation used in FAME gives an expected value of 0.5 for the sites that are not or only slightly impacted. Consequently, the comparison between the score of a test site and the score expected in the absence of pressure is established over a wider range of values than in FAME. The sensitivity of each metric to pressure was evaluated by testing the difference between the means of sites with a null or slight impact and heavily impacted sites. Only the least redundant metrics, those with the lowest correlations, were integrated into the final index. The index score is the mean of the scores computed for each selected metric (see Appendix 2).
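The chain leading from an observed value to a score (equations 3.7, 3.12 and 3.8) can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the EFI+ implementation: the function names, the numeric values and the clipping of the rescaled score to [0, 1] are all hypothetical choices made for illustration.

```python
import math

def residual(observed, expected):
    """Deviation between observed and expected metric values (eq. 3.7)."""
    return math.log(observed + 1) - math.log(expected + 1)

def standardise(res, ref_mean, ref_sd):
    """Centre on the reference-site mean and scale by the reference sd (eq. 3.12)."""
    return (res - ref_mean) / ref_sd

def rescale(score, score_min, score_max):
    """Min-max transformation (eq. 3.8); clipping to [0, 1] is an assumption."""
    scaled = (score - score_min) / (score_max - score_min)
    return min(1.0, max(0.0, scaled))

# Hypothetical site: 3 individuals observed where 12 were expected.
res = residual(observed=3, expected=12)
score = rescale(standardise(res, ref_mean=0.0, ref_sd=0.6), score_min=-3.0, score_max=1.0)
```

A site observed well below its expected value receives a low score, while a site matching its expected value has a residual of zero and a score that depends only on the chosen minimum and maximum.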

Figure 16: Scheme of the development of a predictive MMI based on the reference condition approach, adapted from Pont (2010). Elements shown in the scheme: data base (environmental variables, fish assemblage, pressure evaluation, potential metrics); reference sites and degraded sites; modelling metrics (Metric ~ ƒ(environment), expected values without pressures); controlling environmental variability (observed − expected); transformation (scores rescaled to [0;1]); metric sensitivity to pressures (pressure–impact relationship); metric selection (redundancy); index construction and validation (aggregating scores).


The results from the study on the redundancy between traits among European fish assemblages (P1) led to considering two types of assemblages: those dominated by intolerant species and those dominated by tolerant species. An index was developed for each assemblage type, giving two indices that were finally developed during the EFI+ project, a major difference compared to the FAME project and the EFI index. This distinction sought to select two sets of metrics that were specific and representative of each assemblage type. Thus indices should enable a more accurate assessment of site conditions.

The two types of assemblages are distinguished by the relative abundance of the stenothermal species (STTHER), intolerant to low oxygen concentration (O2INTOL), intolerant to habitat degradation (HINTOL), with lithophilic (LITH) or speleophilic (SPEL) reproduction and spawning in running waters (RHPAR). The intolerant assemblages are dominated by these species, whereas these species are absent or only slightly represented in tolerant assemblages. Intolerant assemblages are mainly dominated by salmonids and their associated species (Table 10), whereas tolerant assemblages are mainly dominated by cyprinids.

Anthropogenic activities could modify species dominance and species composing the assemblages (e.g. McCormick et al. 2001, Kruk and Penczak 2003, Quinn and Kwak 2003, Wang et al. 2003, Quist et al. 2005, Haxton and Findlay 2008). Therefore, the observed proportion of intolerant species should differ from the assemblage expected without degradation. It is thus possible to classify sites depending on the abiotic factors that control for assemblage-specific composition. The classification of Melcher et al. (2007) integrates seven environmental variables to differentiate 15 assemblage types. These 15 types were grouped to match the intolerant–tolerant assemblage classification.

Table 10: List of the 19 species characteristic of intolerant assemblages.

Alburnoides bipunctatus
Cobitis calderoni
Coregonus lavaretus
Cottus gobio
Cottus poecilopus
Eudontomyzon mariae
Hucho hucho
Lampetra planeri
Phoxinus phoxinus
Salmo salar
Salmo trutta fario
Salmo trutta lacustris
Salmo trutta macrostigma
Salmo trutta trutta
Salmo trutta marmoratus
Salvelinus fontinalis
Salvelinus namaycush
Salvelinus umbla
Thymallus thymallus


The major consequence of distinguishing streams according to their assemblage type is that the two indices are computed with different metrics. The index for intolerant assemblages is based on the number of individuals intolerant to oxygen depletion (Ni.O2INTOL) and on the number of individuals intolerant to habitat degradation that are smaller than 150 mm (Ni.HINTOL.150):

$$Index_{Intol} = \frac{\text{Ni.O2INTOL} + \text{Ni.HINTOL.150}}{2} \tag{3.13}$$

The index for tolerant assemblages is based on the number of species spawning in running waters (Ric.RHPAR) and on the number of individuals with lithophilic reproduction (Ni.LITH):

$$Index_{Tol} = \frac{\text{Ric.RHPAR} + \text{Ni.LITH}}{2} \tag{3.14}$$

Residual standardisation (3.12) also takes this classification into account:

$$S_{ij} = \frac{res_{ij} - \bar{x}_{ij}}{s_j} \tag{3.15}$$

with $S_{ij}$ the score, $res_{ij}$ the deviation between the observed and expected value, $\bar{x}_{ij}$ the average value of the residuals of the reference sites in the $i$th ecoregion and in the $j$th stream type (intolerant or tolerant) and $s_j$ the standard deviation of the sites that are not or only slightly disturbed (see Appendix 2). The outcome of developing new metrics based on size classes is the integration of the number of small individuals intolerant to habitat degradation (Ni.HINTOL.150) into the computation of the index for intolerant assemblages (3.13). This metric was selected rather than the two other candidate metrics because it is better represented in certain European regions.

P5: Uncertainty associated with predictive multimetric indices Thesis chapter

P5: Uncertainty associated with predictive multimetric indices

3.2 Uncertainty associated with predictive multimetric IBI use

Uncertainty assessment associated with index values represents an important requirement of the international scientific community and water managers aiming to assess the reliability of the diagnosis provided by the MMI. The WFD stipulates that the waterbodies should be divided into five groups according to their condition: from 1 representing a very good condition to 5, a degraded waterbody. Group 2 represents good ecological status. Each waterbody is allocated to one group after having compared the index score to each group limit (e.g. Table 11).

Table 11: Class boundaries for the intolerant index (Bady et al. 2009a,b).

Class   Boundaries
1       [0.911; 1]
2       [0.755; 0.911]
3       [0.503; 0.755[
4       [0.202; 0.503]
5       [0; 0.252]

Two sites with very close index scores can nevertheless be assigned to two different ecological status classes. This narrow numeric difference can have major consequences, particularly economic ones, for water managers. For instance, a site could be classified as degraded and would then have to be restored, whereas its actual condition may comply with the WFD. Consequently, associating a confidence interval with the index score may make it possible to assess the reliability of the site classification. Currently, apart from Clarke’s (2000) studies on the uncertainty estimate related to RIVPACS-like indices (Wright 2000), other attempts to calculate uncertainty seem absent from the literature.

Taking into account all of the potential error sources (e.g. sampling, uncertainty around environmental variables) that can influence the index value seems illusory. Nevertheless, the use of statistical models to predict the expected values (Figure 16) makes it possible to assess the uncertainty around the index value. It therefore seems possible to assess the uncertainty around the index score by propagating the uncertainty around the expected value (Figure 17; Bevington and Robinson 2003).


Figure 17: Error propagation principle; CI, confidence interval. Elements shown in the scheme: statistical model; CI of the fitted value [ŷmin ; ŷmax]; observed value (y); CI of the deviation [y-ŷmin ; y-ŷmax]; standardisation and transformation; CI of the score [scoremin ; scoremax].

3.2.1 Confidence interval choice

Each value predicted with a statistical model can be associated with either a confidence interval or a prediction interval (Hahn and Meeker 1991). The confidence interval provides information on the degree of knowledge of a population characteristic from a random sample (Hahn and Meeker 1991). This interval should contain the true value of a parameter of the studied population (e.g. the mean) with a $100(1-\alpha)\%$ level of confidence (Hahn and Meeker 1991, Scherrer 2009). Each value predicted by a linear model or a GLM corresponds to the expectation of the response variable $Y$ given the explanatory variables $X$ (McCullagh and Nelder 1989, Saporta 2006):

$$\hat{y}_i = E(y_i \mid X_i) \tag{3.16}$$

Consequently, the confidence interval associated with a prediction is the interval that should contain the average value of the metric for a given environment. In contrast, “a prediction interval for a single observation is an interval that will, with a specified degree of confidence, contain the next randomly selected observation from a population” (Hahn and Meeker 1991, p. 31). This interval estimates the uncertainty associated with the prediction of a new observation, given what has already been observed. The prediction interval associated with a predicted value estimates the range of values expected for a metric in a new site or a new sample for a given environment. The prediction interval is therefore wider than the confidence interval (Figure 18), because observed values are more variable than mean values.


3.2.2 An example: the linear regression case

The confidence interval around the mean value predicted by a linear regression is equal to:

$$CI(\hat{y}_i) = \hat{y}_i \pm t_{\alpha/2,\,n-2}\; se(\hat{y}_i) \tag{3.17}$$

where $1-\alpha$ is the confidence level, $\hat{y}_i$ the predicted value, i.e. the expected value of the metric for a given environment with no anthropogenic pressure, and $se(\hat{y}_i)$ the standard error associated with the prediction:

$$se(\hat{y}_i) = \sqrt{\hat{\sigma}^2\, X_i' (X'X)^{-1} X_i} \tag{3.18}$$

where $\hat{\sigma}^2$ is the dispersion associated with the model (residual variance), $X$ the explanatory variable matrix and $X_i$ a vector of the matrix $X$. $X$ corresponds to the environmental variable matrix and $X_i$ corresponds to the environment observed in the $i$th site. This interval accounts for the variability associated with the estimation of the model’s coefficients. Indeed, $\hat{\sigma}^2 (X'X)^{-1}$ in equation 3.18 corresponds to the estimate of the variance–covariance matrix of the model’s coefficients (Kutner et al. 2005). These coefficients are themselves random variables (2003, Kutner et al. 2005, Saporta 2006). Their estimate depends on the sample of the population that has been used to calibrate the model. Therefore, parameter estimates depend on the reference sites that have been selected to estimate the model’s coefficients (Figure 13 & 16). The same coefficients estimated from two different samples will have different values. Consequently, the average values estimated by these two models, for a given environment, will also be different. The confidence interval estimates this variability around the mean value (Figure 18).

The prediction interval can be calculated as follows:

$$PI(\hat{y}_j) = \hat{y}_j \pm t_{\alpha/2,\,n-2}\; se(\hat{y}_j) \tag{3.19}$$

where $\hat{y}_j$ is the prediction and $se(\hat{y}_j)$ the standard error associated with the predicted value, which can be calculated as follows:

$$se(\hat{y}_j) = \sqrt{\hat{\sigma}^2 \left(1 + X_j' (X'X)^{-1} X_j\right)} \tag{3.20}$$

The term “+1” in 3.20 takes into account that a prediction is a realisation of a distribution (a normal distribution here) and, in most cases, departs from the mean value (Hahn and Meeker 1991). The two intervals also account for the deviation between the environment of a given site ($X_i$) and the environment of the sites that have been used to estimate the model coefficients. The greater the distance from the environment of a site to the average environment of the reference sites, the wider the intervals associated with the metric value predicted by the models will be (Figure 18).

Figure 18: Theoretical relation between a metric (y-axis) and an environmental variable (x-axis). The black curve represents the expected values resulting from a linear regression, the dark grey polygon illustrates the confidence interval around the model (of the mean value) and the light grey polygon represents the prediction interval (around the observation).
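The two standard errors of equations 3.18 and 3.20 can be sketched in Python for the simplest case, a regression with an intercept and a single environmental variable, where the quadratic form X0'(X'X)^-1 X0 reduces to 1/n + (x0 - mean(x))²/Sxx. The data and function names below are hypothetical. The Student quantile of equations 3.17 and 3.19 is deliberately left out, since the Python standard library does not provide it; each half-width is that quantile multiplied by the corresponding standard error.

```python
import math

def fit_simple_regression(x, y):
    """Least-squares fit of y = intercept + slope * x, with residual variance."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    sigma2 = rss / (n - 2)  # residual variance with n - 2 degrees of freedom
    return intercept, slope, sigma2, x_bar, sxx

def standard_errors(x0, n, sigma2, x_bar, sxx):
    """Standard errors of the mean response (eq. 3.18) and of a new observation (eq. 3.20)."""
    leverage = 1 / n + (x0 - x_bar) ** 2 / sxx  # scalar form of X0'(X'X)^-1 X0
    se_mean = math.sqrt(sigma2 * leverage)
    se_pred = math.sqrt(sigma2 * (1 + leverage))
    return se_mean, se_pred

# Hypothetical calibration data (environment x, metric y).
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.1, 2.9, 5.2, 6.8, 9.1, 10.9]
intercept, slope, sigma2, x_bar, sxx = fit_simple_regression(x, y)

# At the centre of the calibration data, then far outside it.
se_mean_centre, se_pred_centre = standard_errors(x_bar, len(x), sigma2, x_bar, sxx)
se_mean_far, se_pred_far = standard_errors(10.0, len(x), sigma2, x_bar, sxx)
```

The prediction standard error always exceeds the standard error of the mean, and both grow as the site's environment moves away from the average environment of the calibration sites.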

3.2.2.1 Perspectives: which interval should be chosen?

The choice of the interval depends on the main objective: either one should describe the population or the process from which the sample has been selected or one should predict the results of a new sample from the same population. The confidence interval responds to the first point, whereas the prediction interval responds to the second one (Hahn and Meeker 1991). Each site (for which an index will be calculated) represents a new observation; the limits of the interval in which this observation should be observed will also be estimated. The prediction interval is thereby the best adapted to measuring the uncertainty around the predicted value and therefore the uncertainty around the score (Figure 17).

3.2.3 Estimating uncertainty around the score

The uncertainty around the score is estimated by propagating the uncertainty around the value predicted by the models (Figure 17). The first step consists in estimating the prediction interval around the expected value. When studying a variable that is normally distributed, the estimation of the prediction interval (equations 3.19 and 3.20) rests on a solid statistical theory (Neter et al. 1983, Kutner et al. 2005). Nevertheless, no metric used in calculating the indices follows a normal distribution. These metrics follow either a Poisson distribution (e.g. richness-related metrics) or a negative binomial distribution (e.g. metrics based on the number of individuals). Two alternatives are possible to estimate the prediction interval: approximation or simulation.

3.2.3.1 Estimating the prediction interval by approximation

Several authors have proposed formulae to estimate prediction intervals for distributions other than the normal distribution (ref). Cameron and Trivedi (1998) have thereby proposed the following equation for the negative binomial distribution:

$$PI(\hat{y}_i) = \hat{y}_i \pm z_{\alpha/2} \sqrt{\sigma^2(\hat{y}_i, \hat{\alpha}) + \hat{y}_i^2\, X_i' (X'WX)^{-1} X_i} \tag{3.21}$$

where $\hat{y}_i$ is the value predicted by the model (the metric’s expected value), $\sigma^2(\hat{y}_i, \hat{\alpha})$ the variance function of the mean (equation 3.11) and $W$ the following diagonal matrix:

$$W = \begin{pmatrix} \dfrac{\hat{y}_1}{1 + \hat{\alpha}\hat{y}_1} & & 0 \\ & \ddots & \\ 0 & & \dfrac{\hat{y}_n}{1 + \hat{\alpha}\hat{y}_n} \end{pmatrix} \tag{3.22}$$

As for the linear models, the matrix $(X'WX)^{-1}$ is an estimate of the variance–covariance matrix of the GLM coefficients. This interval is associated with the prediction in the variable space (e.g. the number of species or the number of individuals). The residuals associated with each metric used in the two indices are calculated in the link space (equation 3.7). It will therefore be necessary to apply the link function to the interval limits in order to estimate the interval around the linear predictor $\eta$.

The second step consists in applying to the limits of this interval, all transformations that lead from the expected values to the metric score: residual calculation (equation 3.7), standardisation (equation 3.15) and min–max transformation (equation 3.8). This calculated interval represents the uncertainty around the score. The prediction interval estimated using equation 3.21 is symmetric, whereas the negative binomial distribution is a priori asymmetrical. Therefore, this interval may be biased.


3.2.3.2 Estimating prediction interval by simulation

The method described in this section is the method that was proposed during the European EFI+ project (Bady et al. 2009a,b). This approach assumes that the predicted values in the link space follow a normal distribution with the parameters $\mu = \hat{\eta}$ and $\sigma^2 = \hat{\sigma}^2$. For each site $x$, the standard deviation of the normal distribution is approximated by the one calculated for a linear model (equation 3.20):

$$\hat{\sigma}(\hat{\eta}_x) = \sqrt{\hat{\sigma}^2 \left(1 + X_x^t (X^t X)^{-1} X_x\right)} \tag{3.23}$$

The dispersion parameter $\hat{\sigma}^2$ is estimated as the Pearson statistic (Cameron and Trivedi 1998, Agresti 2002, Faraway 2006) divided by the number of residual degrees of freedom (McCullagh and Nelder 1989):

$$\hat{\sigma}^2 = \frac{1}{n - p} \sum_{i=1}^{n} \frac{(y_i - \hat{y}_i)^2}{\hat{\omega}_i} \tag{3.24}$$

where $n$ is the number of observations, i.e. the number of reference sites used for calibrating the model, $p$ the number of the model’s parameters, $y_i$ the observed value of the $i$th reference site, $\hat{y}_i$ the expected value and $\hat{\omega}_i$ the conditional variance of $y_i$ (equation 3.11; Cameron and Trivedi 1998). When all the parameters are estimated, 99 values are randomly generated according to a normal distribution $N(\hat{\eta}_x, \hat{\sigma}^2(\hat{\eta}_x))$. One vector of 99 residuals is then obtained by subtracting each generated value from the observed value. These residuals are then transformed into scores (Figure 16). The limits of the interval around the score are estimated as quantiles of the distribution of the 99 randomly generated scores.
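The simulation procedure can be sketched in Python as follows. This is a minimal sketch under several assumptions that are not stated in the text: the function and argument names are hypothetical, the generated link-space values are back-transformed with the exponential before applying equation 3.7, the scores are not clipped, and the interval limits are taken as the 2.5% and 97.5% quantiles (a 95% interval).

```python
import math
import random
import statistics

def simulated_score_interval(observed, eta_hat, sd_eta, ref_mean, ref_sd,
                             score_min, score_max, n_draws=99, seed=1):
    """Interval around a metric score from values simulated in the link space."""
    rng = random.Random(seed)  # fixed seed so that the sketch is reproducible
    obs_log = math.log(observed + 1)
    scores = []
    for _ in range(n_draws):
        eta_sim = rng.gauss(eta_hat, sd_eta)           # draw from N(eta_hat, sd_eta^2)
        res = obs_log - math.log(math.exp(eta_sim) + 1)  # residual as in eq. 3.7
        standardised = (res - ref_mean) / ref_sd         # standardisation as in eq. 3.12
        scores.append((standardised - score_min) / (score_max - score_min))  # eq. 3.8
    cuts = statistics.quantiles(scores, n=40)  # cut points every 2.5%
    return cuts[0], cuts[-1]                   # 2.5% and 97.5% quantiles

# Hypothetical site: 12 individuals observed, 15 expected, sd of 0.4 in the link space.
lower, upper = simulated_score_interval(
    observed=12, eta_hat=math.log(15), sd_eta=0.4,
    ref_mean=0.0, ref_sd=0.5, score_min=-3.0, score_max=3.0)
```

Because the limits are empirical quantiles of the simulated scores, the resulting interval need not be symmetric around the score computed from the expected value.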

3.2.4 Estimating uncertainty around the index

Index computation, whatever the assemblage type, corresponds to the sum of the scores of two metrics (equations 3.13 and 3.14). The simple addition of the limits of each metric interval does not seem to be a relevant approximation of the uncertainty around the index. This method will overestimate the interval around the index.


Let Y3 be a random variable computed as the sum of two random variables Y1 and Y2. The variance of the sum of two random variables is equal to the sum of their variances plus twice their covariance. The variance of Y3 is therefore equal to:

$$\sigma^2_{Y_3} = \sigma^2_{Y_1} + \sigma^2_{Y_2} + 2\sigma_{Y_1 Y_2} \qquad (3.25)$$

The variance of Y3 can also be expressed according to the correlation between Y1 and Y2. Equation 3.25 becomes:

$$\sigma^2_{Y_3} = \sigma^2_{Y_1} + \sigma^2_{Y_2} + 2 r_{Y_1 Y_2} \sigma_{Y_1} \sigma_{Y_2} \qquad (3.26)$$

According to equation 3.26, the standard deviation of Y3 can be equal to the sum of the standard deviations of Y1 and Y2 only when the correlation between Y1 and Y2 is equal to 1. The metrics being selected to limit redundancy, this type of correlation is never observed.

Consequently, the sum of the limits of the confidence intervals around the means of Y1 and Y2 will overestimate the range of the confidence interval around the mean value of Y3. The same phenomenon will occur for the prediction interval. Nevertheless, equation 3.26 demonstrates how important it is to take into account the correlation between the metrics when estimating uncertainty around the index.
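The size of this overestimation can be made explicit with a short worked step. It only rearranges equation 3.26, using the same symbols and assuming nothing beyond non-negative standard deviations:

```latex
\begin{aligned}
(\sigma_{Y_1} + \sigma_{Y_2})^2 - \sigma^2_{Y_3}
  &= \sigma^2_{Y_1} + \sigma^2_{Y_2} + 2\sigma_{Y_1}\sigma_{Y_2}
     - \left(\sigma^2_{Y_1} + \sigma^2_{Y_2}
     + 2 r_{Y_1 Y_2}\sigma_{Y_1}\sigma_{Y_2}\right) \\
  &= 2\,(1 - r_{Y_1 Y_2})\,\sigma_{Y_1}\sigma_{Y_2} \;\geq\; 0
\end{aligned}
```

The difference vanishes only when the correlation equals 1 (or when one of the standard deviations is zero). For instance, with two unit standard deviations and a correlation of 0.5, the sum of the standard deviations is 2, whereas the standard deviation of the sum is $\sqrt{3} \approx 1.73$.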

3.2.4.1 Illustration of the correlation effect between metrics using simulation

Consider a population made up of 1000 reference sites, with two metrics Y1 and Y2 and one environmental variable Z, all following a normal distribution with the following parameters and correlations:

Variable    μ          σ
Y1          -2.1212    4.3234
Y2          -0.9568    4.3609
Z            2.0000    1.5000

Correlation    Y1        Y2
Y2             0.6591
Z              0.6500    0.5500

Within this population of 1000 sites, 200 sites are selected randomly. These 200 sites are used to estimate the parameters of the linear regression between each metric and the environmental variable Z, for example:

$$Y_1 = a_1 + b_1 Z \qquad (3.27)$$

The differences between the observed values for the 1000 sites of the population and the expected values are then calculated. Two vectors of 1000 residuals are thus obtained, one vector for each metric.


This process is repeated 1000 times so as to estimate, for each site, 1000 values of residuals for each metric. For instance, for a test site that presents the following metric and environmental values:

Y1         Y2         Z
-5.1090    -4.1968    1.0921

the following distributions are obtained:

                 Metric residuals
Simulation n°    Y1         Y2
1                -1.6484    -1.9626
2                -1.3286    -1.3967
3                -1.3290    -1.3247
4                -1.3414    -1.9118
5                -1.1860    -1.3958
6                -1.5652    -1.5761

One thousand estimates of the sum of the deviations are calculated for each site.
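The simulation design described above can be sketched as follows. This is a minimal illustration, not the code used for the thesis: the function names are hypothetical, the second parameter of each variable is read as a standard deviation (an assumption; the correlation between residuals does not depend on that choice), and a given random seed will not reproduce the exact figures reported in the text.

```python
import numpy as np

# Population parameters as listed in the text, in the order Y1, Y2, Z.
MEANS = np.array([-2.1212, -0.9568, 2.0])
SDS = np.array([4.3234, 4.3609, 1.5])  # assumed to be standard deviations
CORR = np.array([[1.0, 0.6591, 0.65],
                 [0.6591, 1.0, 0.55],
                 [0.65, 0.55, 1.0]])


def partial_correlation(r12, r1z, r2z):
    """Partial correlation between Y1 and Y2 once Z is controlled for."""
    return (r12 - r1z * r2z) / np.sqrt((1 - r1z**2) * (1 - r2z**2))


def mean_residual_correlation(n_pop=1000, n_cal=200, n_rep=1000, seed=1):
    """Average correlation between the residuals of the two metrics."""
    rng = np.random.default_rng(seed)
    cov = CORR * np.outer(SDS, SDS)
    pop = rng.multivariate_normal(MEANS, cov, size=n_pop)
    metrics, z = pop[:, :2], pop[:, 2]
    design = np.column_stack([np.ones(n_pop), z])  # intercept and Z
    corrs = np.empty(n_rep)
    for k in range(n_rep):
        # Calibrate both regressions on a random subset of sites.
        cal = rng.choice(n_pop, size=n_cal, replace=False)
        coef, *_ = np.linalg.lstsq(design[cal], metrics[cal], rcond=None)
        # Residuals for all sites of the population.
        residuals = metrics - design @ coef
        corrs[k] = np.corrcoef(residuals[:, 0], residuals[:, 1])[0, 1]
    return float(corrs.mean())


if __name__ == "__main__":
    print("partial correlation:", partial_correlation(0.6591, 0.65, 0.55))
    print("mean residual correlation:", mean_residual_correlation())
```

The average correlation between residuals is expected to lie close to the partial correlation of Y1 and Y2 given Z, because each regression removes the linear effect of Z from the corresponding metric.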

On average over the 1000 simulations, the correlation between the distributions of the two metric residuals is equal to 0.4755, which is very close to the partial correlation between Y1 and Y2, $r_{Y_1 Y_2 | Z}$ = 0.4752. The residuals’ distributions for the test site appear to be normally distributed (Shapiro tests, P > 0.05, Figure 19a,b). Figure 19c clearly shows that the residuals of the test site are correlated, $r_{test}$ = 0.5424 (P < 0.001). The sum of the residuals of both metrics Y1 and Y2 is much more widely scattered than it would be if no correlation were observed between these two distributions (Figure 19d). The mean of the two normal distributions represented by the continuous and the dashed curves was estimated by the sum of the averages of the two distributions of the residuals, because:

$$\mu_{Y_1 + Y_2} = \mu_{Y_1} + \mu_{Y_2} \qquad (3.28)$$

The variance of the sum of the two normal distributions was estimated by equation 3.26, where $r_{Y_1 Y_2} = r_{test}$ for the continuous curve and $r_{Y_1 Y_2} = 0$ for the dashed curve. The variance of each distribution, and hence its standard deviation, was estimated via the following equation:

$$\hat{s}^2_Y = X_{test}^t \, \mathrm{Cov} \, X_{test} \qquad (3.29)$$

where $X_{test}$ represents the environment in the test site and Cov the average of the variance–covariance matrices of the models calculated at each simulation.


[Figure 19. Panels a and b: density of the residuals of Y1 and of Y2. Panel c: residuals of Y2 plotted against residuals of Y1. Panel d: density of the sum of the residuals of Y1 and Y2, with theoretical curves for r = 0 and r = 0.5424.]

Figure 19: Results of the 1,000 simulations for the test site: a) residuals of the metric Y1, b) residuals of the metric Y2, c) the relation between the residuals of the two metrics (black square located at the barycentre of the two distributions), and d) distribution of the sum of the residuals. The dashed curve represents the theoretical distribution of the sum in the absence of correlation and the continuous curve represents the theoretical distribution of the sum with a correlation of 0.5424 (the correlation observed between the two distributions). The two segments represent the estimated limits of the confidence interval with α = 0.05.

These simulations do not represent the real uncertainty around the score. They are not based on the prediction interval but on the confidence interval, and the residuals are not transformed. Nevertheless, these results show the influence of the correlation between the two metrics on the sum of their residuals; in particular, on the variability of this sum. The sum of two correlated random variables is much more scattered than the sum of two independent random variables. This demonstrates the need to take into account the correlation between the metric scores to estimate the uncertainty around the index.


As for the uncertainty estimation around the metrics score, two approaches are conceivable: approximation or simulation.

3.2.4.2 Estimating uncertainty around the index by approximation

The purpose is to estimate the expectation and standard deviation associated with each score in order to estimate the expectation and the variance of the sum of the scores via equations 3.24 and 3.25. Once the parameters are known, the uncertainty around the score can be estimated as follows:

$$IC_{score} = score \pm z_{\alpha/2} \cdot se_{score} \qquad (3.30)$$

The uncertainty around the index can then be estimated by dividing the limits of this interval by two (equations 3.13 and 3.14). The difficulty with this approach lies principally in estimating the standard deviation associated with the metric score. Indeed, only one approximation of the standard deviation associated with one prediction for a new observation is currently available (see section 3.2.3.1).
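As an illustration of this approximation, the sketch below combines equation 3.26 with the normal interval of equation 3.30 and then halves the limits to move from the sum of two scores to the index. The function name and arguments are hypothetical, and the standard errors and the correlation between the two scores are assumed to be known, which is precisely the difficult part in practice.

```python
from math import sqrt
from statistics import NormalDist


def approximate_index_interval(score1, score2, se1, se2, r, alpha=0.05):
    """Approximate interval around an index defined as the mean of two scores.

    score1, score2 : metric scores
    se1, se2       : standard errors attached to each score (assumed known)
    r              : correlation between the two scores (assumed known)
    """
    z = NormalDist().inv_cdf(1 - alpha / 2)
    total = score1 + score2
    # Variance of the sum of two correlated variables (equation 3.26).
    var_sum = se1**2 + se2**2 + 2 * r * se1 * se2
    se_sum = sqrt(max(var_sum, 0.0))  # guard against rounding below zero
    lower, upper = total - z * se_sum, total + z * se_sum
    # The index is the sum of the two scores divided by two.
    return lower / 2, upper / 2


if __name__ == "__main__":
    for r in (0.0, 0.5, 1.0):
        print(r, approximate_index_interval(0.6, 0.4, 0.1, 0.1, r))
```

Running the example shows the interval widening as the correlation increases, up to the width obtained by simply adding the two standard errors when r = 1.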

3.2.4.3 Estimating uncertainty around the index by simulation

This approach is an extension of the one presented in section 3.2.3.2, which was proposed during the EFI+ project (Bady et al. 2009a,b). For each metric, 99 expected values in the link space were randomly generated from a normal distribution $N(\hat{\eta}_x, \hat{\sigma}^2(\hat{\eta}_x))$. These expected values were then transformed into scores and aggregated to obtain 99 simulated index values. For a given site, the interval around the index was estimated as the quantiles of this vector of 99 values (Figure 20).


[Figure 20. Panels A and B: index values on axes ranging from 0 to 1.]

Figure 20: Illustration of the estimation of uncertainty around the index (dashed line) for tolerant populations, with an 80% confidence interval (A) and a 95% confidence interval (B). The lower and upper limits of the intervals were computed by simulation.

Currently, this method does not take into account the correlation between metrics. The vectors of random values for each metric are generated independently from each other. One improvement could be to randomly generate the vectors of expected values for each metric according to a multivariate normal distribution (Ripley 1987). Using such an algorithm should enable the correlation between metrics to be considered.
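The improvement suggested above can be sketched with a bivariate normal generator. This is an assumption-laden illustration rather than the EFI+ procedure: the function names are hypothetical, the correlation `r` between the two metrics in the link space is assumed to be known, and `to_score` is a placeholder for the transformations leading to the scores.

```python
import numpy as np


def simulated_index_interval(eta_hat, sd_eta, r, to_score,
                             level=0.95, n_sim=99, seed=None):
    """Interval around a two-metric index using correlated simulated values.

    eta_hat  : predicted values in the link space, one per metric
    sd_eta   : standard deviations of these predictions, one per metric
    r        : correlation between the two metrics (assumed known)
    to_score : function mapping an (n_sim, 2) array to scores
    """
    rng = np.random.default_rng(seed)
    eta_hat = np.asarray(eta_hat, dtype=float)
    sd_eta = np.asarray(sd_eta, dtype=float)
    # Covariance matrix built from the standard deviations and r.
    corr = np.array([[1.0, r], [r, 1.0]])
    cov = corr * np.outer(sd_eta, sd_eta)
    # Both metrics are drawn jointly, so their correlation is preserved.
    draws = rng.multivariate_normal(eta_hat, cov, size=n_sim)
    scores = to_score(draws)
    index = scores.mean(axis=1)  # index = mean of the two scores
    alpha = 1.0 - level
    lower, upper = np.quantile(index, [alpha / 2, 1 - alpha / 2])
    return float(lower), float(upper)


def identity_score(values):
    """Placeholder transformation (hypothetical): leaves values unchanged."""
    return np.asarray(values, dtype=float)


if __name__ == "__main__":
    for r in (0.0, 0.8):
        print(r, simulated_index_interval([0.0, 0.0], [1.0, 1.0], r,
                                          identity_score, n_sim=20000, seed=0))
```

Setting `r = 0` reproduces independent draws for each metric, which makes it easy to compare the two variants on the same site.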

3.2.4.4 Perspectives

The two approaches presented in this section to estimate uncertainties around the metric scores and the index values are still preliminary. Further developments are needed before these methods can be used routinely. In particular, it would be useful to compare the results obtained with these approaches with confidence intervals obtained by simulation under known conditions (variance, mean, coefficient values, etc.). Both methods have advantages and disadvantages. The method based on approximation a priori requires a much shorter computing time than the simulation method. This advantage will be even greater if a user has a large number of sites for which to calculate an index. The main advantage of the simulation approach is its simplicity: it can be understood by a broad audience with little prior statistical knowledge. On the other hand, the computing time needed to estimate uncertainties will climb as the number of sites and simulations increases. In both cases, the use of distributions other than the normal distribution, together with the different steps leading to the scores and the index calculations, makes it possible to calculate only an approximation of the two uncertainties.


These results have also shown the importance of taking into account the correlation between the metrics used in the index uncertainty calculation. Very often, multimetric indices integrate eight to 12 metrics in their calculation (e.g. Karr 1981, Fausch et al. 1984, Simon and Lyons 1995, Lyons et al. 1996, Hughes et al. 1998, Pont et al. 2006, 2007, Stoddard et al. 2008). The metrics are selected in order to limit their redundancy. Usually, only metrics with a correlation lower than an arbitrary threshold are selected (e.g. Hughes et al. 1998, Hering et al. 2006, Pont et al. 2006, 2007, 2009, Stoddard et al. 2008). Generally, this threshold varies between 0.7 and 0.8 depending on the authors. The higher the correlation between the metric scores, the more uncertainty there is around the index. The variance of the sum Y of p random variables $X_1, \ldots, X_p$ is equal to:

$$\mathrm{Var}(Y) = \sum_{i=1}^{p} \mathrm{Var}(X_i) + 2 \sum_{i<j} \mathrm{Cov}(X_i, X_j) \qquad (3.31)$$

Substituting the covariance in equation 3.31 with the correlation gives:

$$\mathrm{Var}(Y) = \sum_{i=1}^{p} \mathrm{Var}(X_i) + 2 \sum_{i<j} \mathrm{Cor}(X_i, X_j) \sqrt{\mathrm{Var}(X_i)\,\mathrm{Var}(X_j)} \qquad (3.32)$$

The more correlated the metric scores, the greater the variance of the scores’ sum will be (Figure 21).

[Figure 21. Density of the sum of 10 random variables (x-axis from -60 to 60) for r = 0, 0.25, 0.5 and 0.75.]

Figure 21: Sum of ten random variables distributed according to a normal distribution, with μ = 0 and σ² = 4, and with different levels of correlation (r) between each variable: 0, 0.25, 0.5 and 0.75.
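The variances underlying the four curves of Figure 21 can be recomputed directly from the covariance matrix, since the variance of a sum is the sum of all the entries of that matrix. The sketch below assumes, as in the figure, that every pair of variables shares the same correlation; the function name is hypothetical.

```python
import numpy as np


def variance_of_sum(variances, r):
    """Variance of the sum of variables sharing a common pairwise correlation r."""
    sd = np.sqrt(np.asarray(variances, dtype=float))
    p = sd.size
    corr = np.full((p, p), float(r))
    np.fill_diagonal(corr, 1.0)
    cov = corr * np.outer(sd, sd)
    # Summing every entry gives sum(Var) + 2 * sum over pairs of Cov.
    return float(cov.sum())


if __name__ == "__main__":
    for r in (0.0, 0.25, 0.5, 0.75):
        v = variance_of_sum([4.0] * 10, r)
        print(f"r = {r}: variance = {v:.0f}, standard deviation = {v ** 0.5:.2f}")
```

For ten variables with variance 4, the variance of the sum is 40 + 360r, i.e. 40, 130, 220 and 310 for the four correlation levels of the figure.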

Using redundant metrics in the index calculation leads to an increase in misclassification risk (Bowman and Somers 2005). A test site with no degradation but with a low score for a given metric will probably present other metrics with low scores due to the metric correlations. This site will probably be considered as impacted. Here, this diagnosis is partly an artefact due to the partial redundancy between the metrics. This artefact goes against the theory on multimetric indices (Karr 1991, Karr and Chu 1999, 2000). The detection of a multiple pressure impact by several metrics is desirable when the metrics used are indeed not redundant (Karr 1991). In this case, several metrics with low scores demonstrate that several aspects of community functioning are affected by anthropogenic pressures, and a low index score is consistent and provides reliable and accurate information on the site’s condition. With an index integrating several redundant metrics, it is not possible to disentangle the effect of pressures on the index value from the effect of the redundancy between metrics. To reduce the effect of the correlation between metrics, one can choose to be stricter by lowering the maximum correlation tolerated when selecting the metrics. Limiting the number of metrics used in the index calculation is also a good way to reduce this effect (see Figures 19d and 21). This is one reason why only two metrics were considered in the EFI+ index computation.


Climate change


Using the reference condition approach to develop a multimetric index assumes that the systems are relatively stable and predictable. Whatever statistical method is used (Bowman and Somers 2005), the observed metric values are compared to the reference site values. This can be a direct comparison, e.g. with the average metric values of the nearest reference sites (Bates Prins and Smith 2007), or an indirect comparison using statistical models (e.g. Pont et al. 2006, 2007). For a given environment in the absence of pressure, only one value per metric is expected. A slight environmental change should lead to a slight variation in assemblage structure. The assemblage structure is assumed to evolve according to the relationships estimated by the models. If this assumption is not confirmed, the control of the metric’s environmental variability would be biased, as would the assessment of the site’s condition. Temperature is one of the major environmental factors constraining living beings and acting on various organisation levels: individuals, species and communities (Daufresne et al. 2009). The observed thermal changes of certain streams under the effect of global change (Webb and Nobilis 1995, Webb 1996, Milly et al. 2005, Webb and Nobilis 2007), associated with the thermal variation predicted by the IPCC (Intergovernmental Panel on Climate Change; IPCC 2007), suggest future evolution of fish assemblages (Buisson et al. 2008a, Buisson et al. 2008b, Lassale et al. 2008, Daufresne et al. 2009). The purpose of this section is to examine the potential consequences of climate change on the assessment of the streams’ ecological status estimated with current indices (Wilby et al. 2006).

The first objective of this section is to study the effect of the environment on brown trout young of the year (YOY, 0+) growth, to investigate the effect of climate change on assemblage size structure. The more the size structure of brown trout population varies, the greater the consequences will be because the brown trout dominate “intolerant” assemblages. The intolerant index integrates the only metric based on assemblage size structure (equation 3.13): the number of individuals smaller than 150 mm intolerant to habitat degradation. More generally, the metrics used during the EFI+ project grouped species sharing attributes into guilds (Usseglio-Polatera et al. 2000, Noble et al. 2007). The future changes of species distribution area (Buisson et al. 2008a,b, Lassale et al. 2008) should have important consequences on the functional assemblage structure. The second objective of this section is to study the environmental parameters influencing the distribution of 24 European fish species.


P6: Variation of brown trout Salmo trutta young of the year growth along environmental gradients in Europe. Journal of Fish Biology

Journal of Fish Biology (2011) doi:10.1111/j.1095-8649.2011.02928.x, available online at wileyonlinelibrary.com

BRIEF COMMUNICATION Variation of brown trout Salmo trutta young-of-the-year growth along environmental gradients in Europe

M. Logez*† and D. Pont‡

*Cemagref, UR HYAX, 3275 Route de Cézanne – CS 40061, F-13182 Aix en Provence, France and ‡Cemagref, UR HBAN, Parc de Tourvoie BP 44, 92163 Antony Cedex, France

(Received 29 June 2010, Accepted 28 January 2011)

This study analysed the influence of temperature and other environmental factors on the growth of brown trout Salmo trutta YOY in Europe. Air temperature accounted for the greatest proportion of the variance in maximum total length, but the inclusion of other factors significantly increased the proportion of the variance explained. © 2011 The Authors Journal of Fish Biology © 2011 The Fisheries Society of the British Isles

Key words: drainage-basin effects; Europe; large scale; salmonid; temperature.

†Author to whom correspondence should be addressed. Tel.: +33 4 42 66 69 86; email: maxime.logez@cemagref.fr

Temperature is a major environmental factor constraining living organisms and acting at various levels of organisation: community, species and individuals (Daufresne et al., 2009). Among the numerous environmental effects of temperature, its influence on fish growth has been widely studied and documented (Brown, 1951; Elliott et al., 1995; Abdoli et al., 2007). Nevertheless, growth of individuals and particularly the growth pattern of the young of the year (YOY) are also influenced by various other abiotic as well as biotic factors (Lobón-Cerviá, 2010): water discharge (Jonsson et al., 2001a; Arnekleiv et al., 2006), feeding resources (Clarke & Scruton, 1999; Boughton et al., 2007; Ward et al., 2009), inter- and intraspecific interactions such as aggressive behaviour and interspecific competition (Byström & García-Berthou, 1999; Lahti et al., 2001), and density-dependent feedback (Elliott, 1994).

Although temperature is highly spatially structured at the large scale, varying with latitude, continentality and altitude (Ward, 1985), the variation of body size and growth rate between populations has been studied mainly along latitudinal gradients (Heibo et al., 2005; Almodóvar et al., 2006; Parra et al., 2009; Chavarie et al., 2010), focusing on the Rapoport rule hypothesis (Blanck & Lamouroux, 2007). Nevertheless, other studies at the large spatial scale also demonstrated the effect of environmental factors other than temperature on growth rate, such as river size (Tedesco et al., 2009). The present study focuses on the variation in the growth of brown trout Salmo trutta L. 1758 YOY along environmental gradients. The wide distribution area of S. trutta in Europe and its presence in a large range of habitat conditions make it possible to assess the effect of various environmental gradients on YOY growth over a large spatial scale.

One hundred and five sites, unaffected or only slightly affected by human activities, were sampled by electrofishing in nine European countries (Fig. 1). Data were collected by several laboratories and governmental environmental agencies by wading during low-flow periods (between August and November). The number of fish collected during the first pass for each site ranged from 53 to 802. The age of fish was estimated from the total length (LT) distribution by fitting a mixture of normal distributions (one distribution for the YOY and a second one for older fish; Macdonald, 1987) using an expectation-maximization (EM) algorithm (Benaglia et al., 2009). Normal distribution parameters were estimated by maximizing the likelihood. The 97·5th percentile of the normal distribution fitted for the YOY was used to estimate the maximum total length (LTmax, mm) that a YOY S. trutta could reach at a given site (Fig. 2). All fitted LTmax were visually checked.

Fig. 1. Location of the 105 sites at which YOY Salmo trutta were sampled.

Fig. 2. Example of total length (LT) distribution of a Salmo trutta population (x-axis: individual LT in mm; y-axis: relative frequency of individuals), showing the Gaussian distribution fitted for the YOY and the maximum LT computed as the 97·5th percentile of the normal distribution.

Four variables were used to characterize the local environmental conditions: annual mean air temperature (T; 1·3–14·8 °C, median, 10·7 °C), thermal amplitude (Td; difference between July and January mean air temperatures, 8·6–24·4 °C, median, 15·5 °C), drainage-basin area (A; 2–449 km², median, 34 km²) and the dominant geology (G) of the upstream drainage basin (1 for siliceous geology, otherwise 0). The drainage area was ln transformed based on the skewness of its distribution. A multiple linear regression with these four environmental variables as predictors was used to explain the field variability of YOY LTmax. A quadratic term was used for both annual mean air temperature and area of drainage basin to allow for a non-linear pattern of responses. The continuous growth period between the two extreme sampling dates was taken into account using the day of year (D, 213–327) of the sampling date as the control variable.

The four environmental variables and the two quadratic forms significantly explained 49% of the variance of the YOY LTmax (F6,98 = 17·6, P < 0·001). The amount of variance explained by the complete model was significantly higher compared to a model with only mean annual temperature and its quadratic form as predictor variables (F4,98 = 13·1, P < 0·001). Moreover, the complete model displayed the smallest AIC among all the possible sub-set models. The relatively low root mean square error (RMSE = 11·6 mm) and the relationship between predicted and observed LTmax (Fig. 3) underlined this model’s good fit. The estimated relationship between LTmax and the environmental variables was:

$$L_{T\max} = -24.18 - 5.46\ln(A) + 1.56\,[\ln(A)]^2 + 11.49\,T - 0.47\,T^2 + 0.13\,D + 1.68\,T_d - 10.32\,G$$

As revealed by hierarchical partitioning (Chevan & Sutherland, 1991; Pont et al., 2005), the total contribution of the environmental variables was mainly independent (Table I). Indeed, the ratio of independent contribution (I) to joint contribution (J) was relatively high (11·7). Similar patterns were independently observed for each variable, except for the thermal amplitude for which the I:J ratio was equal to 1 (Table I). By explaining 45·2 and 35·1% of the total independent contributions, respectively, the annual mean air temperature and the area of the drainage basin were the two main factors explaining the variability of YOY LTmax (Table I). These variables were followed by the dominant geology and the thermal amplitude.
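The fitted relationship reported above can be evaluated numerically, for instance to locate the temperature at which the predicted LTmax peaks. The sketch below is only an illustration: the function names are hypothetical, the coefficients are copied from the reported equation with the squared terms taken to apply to ln(A) and T, and day 270 is an arbitrary sampling date chosen within the observed range (213–327).

```python
from math import log


def predict_ltmax(area_km2, temp_c, day_of_year, thermal_amp_c, siliceous):
    """Predicted YOY maximum total length (mm) from the reported coefficients.

    area_km2      : drainage-basin area A (km2)
    temp_c        : annual mean air temperature T (degrees C)
    day_of_year   : sampling date D (day of year)
    thermal_amp_c : thermal amplitude Td (degrees C)
    siliceous     : True if the dominant geology is siliceous (G = 1)
    """
    ln_a = log(area_km2)
    g = 1 if siliceous else 0
    return (-24.18
            - 5.46 * ln_a + 1.56 * ln_a**2
            + 11.49 * temp_c - 0.47 * temp_c**2
            + 0.13 * day_of_year
            + 1.68 * thermal_amp_c
            - 10.32 * g)


def temperature_at_maximum():
    """Temperature maximising the quadratic temperature effect: b / (2 * |c|)."""
    return 11.49 / (2 * 0.47)


if __name__ == "__main__":
    # Median environment reported in the text; D = 270 is an arbitrary choice.
    print(round(predict_ltmax(34, 10.7, 270, 15.5, False), 1))
    print(round(temperature_at_maximum(), 2))
```

Under these assumptions the temperature effect peaks at about 12.2 °C, which lies inside the sampled range of annual mean air temperatures (1·3–14·8 °C).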


Fig. 3. Relationship between observed maximum total length (LTmax) and predicted LTmax of YOY Salmo trutta. The line represents y = x.

The YOY LTmax was positively associated with the thermal amplitude and the sampling date. Sampling later in the year yielded higher YOY LTmax, and YOY in a contrasting thermal environment were estimated to grow more than those in environments with low thermal amplitude.

The growth of YOY S. trutta was on average lower in siliceous drainage basins than in calcareous drainage basins (negative coefficient associated with G). This pattern probably results from higher conductivity and water fertility (Kwak & Waters, 1997; Almodóvar et al., 2006) in calcareous drainage basins. Almodóvar et al. (2006) demonstrated that the variation in water chemical features accounted for 61% of the variance in S. trutta production in 10 Spanish streams, and water chemistry is directly affected by the geology of the drainage basin (Allan & Castillo, 2007).

Table I. Results from the hierarchical partitioning of the multiple linear regression relating YOY Salmo trutta maximum total length (LTmax) to four environmental variables: I, the independent contribution of each variable; J, the joint contribution; %Ii, the contribution of each variable to the total independent contribution

Variable             I          J           Total      |I:J|    %Ii
Temperature*         0·22130    −0·00426    0·21704    51·9     45·2
Drainage area*       0·17207    −0·00482    0·16725    35·7     35·1
Thermal amplitude    0·02678    −0·02618    0·00060    1·0      5·5
Geology              0·06965    −0·00677    0·06288    10·3     14·2
Total                0·48980    −0·04204    0·44776    11·7     100·0

*The effects of the variable and its quadratic form were assessed simultaneously.

Fig. 4. Effect display (Fox, 1987) of the annual mean air temperature and of the drainage-basin area on maximum total length (LTmax) of YOY Salmo trutta. The two curves represent the theoretical effect of these two variables when all other variables are held constant. All other environmental variables were set at their median.

The relatively low correlation between the mean annual temperature and the drainage-basin area (Pearson’s correlation coefficient, r = 0·105; P > 0·05) and the relatively low joint contribution of these two variables (Table I) allowed an analysis of the effect of each variable on YOY LTmax. The relationship between YOY LTmax and area of the drainage basin (Fig. 4) showed that YOY S. trutta growth tends to increase along the river size gradient. This pattern is probably related to the longitudinal variation of the feeding resources available and to the variation in YOY diet. Previous experiments conducted under controlled thermal conditions demonstrated the importance of diet on the growth of S. trutta (Brown, 1951). Moreover, the recent study of Descroix et al. (2010) on the feeding behaviour of Atlantic salmon Salmo salar L. 1758 revealed an increase in parr growth rate along an upstream–downstream gradient. This pattern was attributed to the difference in the type of prey ingested by S. salar parr along the longitudinal gradient (Descroix et al., 2010).

The hierarchical partitioning (Table I) demonstrated that the mean annual air temperature was the major environmental determinant of YOY S. trutta growth. The LTmax of underyearling S. trutta was estimated to increase with increasing annual mean air temperature up to a maximum value, and then slightly decrease (Fig. 4). This pattern is highly consistent with the skewed bell-shaped relationships between yearling S. trutta growth rates and temperature previously observed by other authors (Elliott & Hurley, 1997; Jonsson et al., 2001b; Forseth et al., 2001; Ojanguren et al., 2001; Larsson et al., 2005).

Moreover, the results are consistent with previous field observations. Parra et al. (2009) showed that the growth of YOY S. trutta varied both along altitudinal and latitudinal gradients. These patterns could, at least in part, be explained by the observed effect of the temperature on YOY S. trutta and by the pattern of variation of the temperature at a large scale, since mean air temperature varies across latitudinal and altitudinal gradients (Ward, 1985). Nevertheless, the use of the mean air temperature rather than the water temperature constitutes a limitation of the analyses.

Given the importance of growth and body size in the life-history strategies of fish species (Winemiller & Rose, 1992; Vila-Gispert et al., 2002) and population dynamics (e.g. YOY overwinter survival; Quinn & Peterson, 1996), these results should have important implications in a context of global warming. Fast growth in early life stages is often associated with early maturation as well as age and size at first reproduction for fish species, which Winemiller & Rose (1992) described as the opportunistic strategy within their framework of fish species’ life-history strategies. This interrelation between early growth and age or size at maturity (He & Stewart, 2001) was also observed at the population level (Abdoli et al., 2007; McDermid et al., 2007). Populations living in warmer conditions display faster growth rates in early life stages (0+ and 1+ years of age) and earlier maturation (Abdoli et al., 2007; McDermid et al., 2007). This early maturation can be explained by the rapid early growth observed. Nevertheless, the growth patterns for older individuals (>1+ years) are reversed: older individuals grow faster in a cold-water environment than in warmer conditions (Abdoli et al., 2007; McDermid et al., 2007). Such patterns could be explained by an existing trade-off between gonadal and somatic investment. Populations with delayed maturation should favour somatic investment, whereas populations with early maturation should favour gonadal investment (McDermid et al., 2007). Therefore, it can be expected that S. trutta populations adapt their strategy to the new thermal conditions induced by global warming, by modifying their balance between somatic and gonadal investment depending on the modification of early life-stage growth.

Work on this study was funded by the European Commission under the Sixth Framework Programme (EFI+ project, contract number 044096). We are grateful to all members who took part in this project.

References

Abdoli, A., Pont, D. & Sagnes, P. (2007). Intrabasin variations in age and growth of bullhead: the effects of temperature. Journal of Fish Biology 70, 1224–1238.
Allan, J. D. & Castillo, M. M. (2007). Stream Ecology: Structure and Function of Running Waters. Boston, MA: Kluwer Academic Publishers.
Almodóvar, A., Nicola, G. G. & Elvira, B. (2006). Spatial variation in brown trout production: the role of environmental factors. Transactions of the American Fisheries Society 135, 1348–1360.
Arnekleiv, J. V., Finstad, A. G. & Ronning, L. (2006). Temporal and spatial variation in growth of juvenile Atlantic salmon. Journal of Fish Biology 68, 1062–1076.
Benaglia, T., Chauveau, D., Hunter, D. R. & Young, D. (2009). Mixtools: an R package for analyzing finite mixture models. Journal of Statistical Software 32, 1–29.
Blanck, A. & Lamouroux, N. (2007). Large-scale intraspecific variation in life-history traits of European freshwater fish. Journal of Biogeography 34, 862–875.
Boughton, D. A., Gibson, M., Yedor, R. & Kelley, E. (2007). Stream temperature and the potential growth and survival of juvenile Oncorhynchus mykiss in a southern California creek. Freshwater Biology 52, 1353–1364.
Brown, M. E. (1951). The growth of brown trout (Salmo trutta Linn.): IV. The effect of food and temperature on the survival and growth of fry. Journal of Experimental Biology 28, 473–491.
Byström, P. & García-Berthou, E. (1999). Density dependent growth and size specific competitive interactions in young fish. Oikos 86, 217–232.


Chavarie, L., Dempson, J. B., Schwarz, C. J., Reist, J. D., Power, G. & Power, M. (2010). Latitudinal variation in growth among Arctic charr in eastern North America: evidence for countergradient variation? Hydrobiologia 650, 161–177.
Chevan, A. & Sutherland, M. (1991). Hierarchical partitioning. The American Statistician 45, 90–96.
Clarke, K. D. & Scruton, D. A. (1999). Brook trout production dynamics in the streams of a low fertility Newfoundland watershed. Transactions of the American Fisheries Society 128, 1222–1229.
Daufresne, M., Lengfellner, K. & Sommer, U. (2009). Global warming benefits the small in aquatic ecosystems. Proceedings of the National Academy of Sciences of the United States of America 106, 12788–12793.
Descroix, A., Desvilettes, C., Bec, A., Martin, P. & Bourdier, G. (2010). Impact of macroinvertebrate diet on growth and fatty acid profiles of restocked 0+ Atlantic salmon (Salmo salar) parr from large European river (the Allier). Canadian Journal of Fisheries and Aquatic Sciences 67, 659–672.
Elliott, J. M. (1994). Quantitative Ecology and the Brown Trout. Oxford: Oxford University Press.
Elliott, J. M. & Hurley, M. A. (1997). A functional model for maximum growth of Atlantic salmon parr, Salmo salar, from two populations in northwest England. Functional Ecology 11, 592–603.
Elliott, J. M., Hurley, M. A. & Fryer, R. J. (1995). A new, improved growth model for brown trout, Salmo trutta. Functional Ecology 9, 290–298.
Forseth, T., Hurley, M. A., Jensen, A. J. & Elliott, J. M. (2001). Functional models for growth and food consumption of Atlantic salmon parr, Salmo salar, from a Norwegian river. Freshwater Biology 46, 173–186.
Fox, J. (1987). Effect displays for generalized linear models. Sociological Methodology 17, 347–361.
He, J. X. & Stewart, D. J. (2001). Age and size at first reproduction of fishes: predictive models based only on growth trajectories. Ecology 82, 784–791.
Heibo, E., Magnhagen, C. & Vøllestad, L. A. (2005). Latitudinal variation in life-history traits in Eurasian perch. Ecology 86, 3377–3386.
Jonsson, B., Jonsson, N., Brodtkorb, E. & Ingebrigtsen, P.-J. (2001a). Life-history traits of brown trout vary with the size of small streams. Functional Ecology 15, 310–317.
Jonsson, B., Forseth, T., Jensen, A. J. & Næsje, T. F. (2001b). Thermal performance of juvenile Atlantic salmon, Salmo salar L. Functional Ecology 15, 701–711.
Kwak, T. J. & Waters, T. F. (1997). Trout production dynamics and water quality in Minnesota streams. Transactions of the American Fisheries Society 126, 35–48.
Lahti, K., Laurila, A., Enberg, K. & Piironen, J. (2001). Variation in aggressive behaviour and growth rate between populations and migratory forms in the brown trout, Salmo trutta. Animal Behaviour 62, 935–944.
Larsson, S., Forseth, T., Berglund, I., Jensen, A. J., Näslund, I., Elliott, J. M. & Jonsson, B. (2005). Thermal adaptation of Arctic charr: experimental studies of growth in eleven charr populations from Sweden, Norway and Britain. Freshwater Biology 50, 353–368.
Lobón-Cerviá, J. (2010). Density dependence constrains mean growth rate while enhancing individual size variation in stream salmonids. Oecologia 164, 109–115.
Macdonald, P. D. M. (1987). Analysis of length-frequency distributions. In Age and Growth of Fish (Summerfelt, R. C. & Hall, G. E., eds), pp. 371–384. Ames, IA: Iowa State University Press.
McDermid, J. L., Ihssen, P. E., Sloan, W. N. & Shuter, B. J. (2007). Genetic and environmental influences on life history traits in lake trout. Transactions of the American Fisheries Society 136, 1018–1029.
Ojanguren, A. F., Reyes-Gavilán, F. G. & Braña, F. (2001). Thermal sensitivity of growth, food intake and activity of juvenile brown trout. Journal of Thermal Biology 26, 165–170.
Parra, I., Almodóvar, A., Nicola, G. G. & Elvira, B. (2009). Latitudinal and altitudinal growth patterns of brown trout Salmo trutta at different spatial scales. Journal of Fish Biology 74, 2355–2373.

© 2011 The Authors Journal of Fish Biology © 2011 The Fisheries Society of the British Isles, Journal of Fish Biology 2011, doi:10.1111/j.1095-8649.2011.02928.x


Pont, D., Hugueny, B. & Oberdorff, T. (2005). Modelling habitat requirement of European fishes: do species have similar responses to local and regional environmental constraints? Canadian Journal of Fisheries and Aquatic Sciences 62, 163–173.
Quinn, T. P. & Peterson, N. P. (1996). The influence of habitat complexity and fish size on over-winter survival and growth of individually marked juvenile coho salmon (Oncorhynchus kisutch) in Big Beef Creek, Washington. Canadian Journal of Fisheries and Aquatic Sciences 53, 1555–1564.
Tedesco, P. A., Sagnes, P. & Laroche, J. (2009). Variability in the growth rate of chub Leuciscus cephalus along a longitudinal river gradient. Journal of Fish Biology 74, 312–319.
Vila-Gispert, A., Moreno-Amich, R. & Garcia-Berthou, E. (2002). Gradients of life-history variation: an intercontinental comparison of fishes. Reviews in Fish Biology and Fisheries 12, 417–427.
Ward, D. M., Nislow, K. H. & Folt, C. L. (2009). Increased population density and suppressed prey biomass: relative impacts on juvenile Atlantic salmon growth. Transactions of the American Fisheries Society 138, 135–143.
Ward, J. V. (1985). Thermal-characteristics of running waters. Hydrobiologia 125, 31–46.
Winemiller, K. O. & Rose, K. A. (1992). Patterns of life-history diversification in North American fishes – implications for population regulation. Canadian Journal of Fisheries and Aquatic Sciences 49, 2196–2218.


P7: Modelling ecological niche of fish species at the European scale: sensitivity to climate variables (temperature, run-off) and associated uncertainties.

In preparation

Identifying the physical, chemical, climatic, and environmental parameters that control species presence/absence (Jackson et al. 2001, Pont et al. 2005, Buisson et al. 2008a) is a fundamental step (Guisan and Zimmerman 2000, Austin 2007, Elith and R. 2009, Olden et al. 2010) toward predicting changes in species distribution area (Buisson et al. 2008b, Lassale et al. 2008, Tirelli and Pessani 2009). Failing to select the environmental variables that best explain species distribution can lead to overfitting (Harrell 2001, Hastie et al. 2009). This phenomenon occurs when models are too complex (too many free parameters) and when irrelevant explanatory variables are used. When overfitting occurs, “some of the findings of the analysis come from fitting noise or finding spurious associations” (Harrell 2001, p. 60) between environmental variables and species presence/absence. With overfitting, the prediction error on the calibration data set is relatively low, whereas the prediction error on an independent data set is relatively high (Hastie et al. 2009).

Choosing a statistical approach among the numerous methods used to predict species presence/absence (e.g. logistic regression, quantile regression, generalised additive models (GAM), decision or regression trees, random forests, neural networks, MARS) depends on the objective. Learning methods are principally predictive and are not linked with ecological theory (Austin 2007). On the other hand, inferential methods such as quantile regression and logistic regression are better suited to applying ecological theory such as Liebig’s law (limiting factor) or the species niche (Hutchinson 1957, Austin 2007).

The predictive power of species distribution models (SDM) is highly dependent on the data set used to estimate the models (Elith and Graham 2009, Sinclair et al. 2010). The more spatially extended the calibration data set is, the better the estimation of the species niche or bioclimatic envelope will be. Numerous projections of the species distribution area (e.g. Buisson et al. 2008b, Lassale et al. 2008, Tirelli and Pessani 2009) are established from a data set covering a limited fraction of the current species distribution (Kottelat and Freyhof 2007). It is likely that these models only integrate a portion of the species' realized niche. It is also possible that the predicted future climatic conditions (IPCC 2007) are outside the range of values encompassed in the calibration data set. Predicting species occurrence for environmental conditions outside the range of values used to calibrate the models is pure extrapolation. Indeed, it is not certain that the relationships between environment and species occurrence fitted on the calibration data set hold over this range of environmental conditions. Moreover, the estimation of the environmental effect on species presence/absence could be inaccurate or even biased. The species could be present in environmental conditions not observed in the calibration data set.

A data set composed of 1548 sites that are not or only slightly impacted was selected to estimate the relation between habitat conditions and the presence/absence of 24 European fish species. These relations were estimated through logistic regressions (Hosmer and Lemeshow 2000, Collett 2002, Pont et al. 2005), integrating the mean air temperature in July (Tjul), the thermal amplitude between July and January (Tdif), slope (log-transformed, Slope) and a pseudo-runoff (logarithm of the product of the drainage basin area and the annual precipitation, lPA). The quadratic terms associated with Tjul, Slope and lPA were integrated into the models.

1. Introduction

Global climate change, bioindication, and, more generally, ecosystem management needs have clearly boosted the production of theoretical and applied studies on the consequences of changing environmental conditions for species distribution (Austin 2002; Guisan et al. 2002; Guisan and Zimmerman 2000; Scott et al. 2002). In addition, the prediction of species distribution, including the prerequisite knowledge and quantification, has become a central aspect of conservation biology and restoration. In lotic systems, the environmental conditions have been recognised as having a major impact on fish distribution (Matthews 1998). Many studies have been conducted to show the dependence of fish assemblage structures on their environment (Buisson et al. 2008; Lassale et al. 2008; Oberdorff et al. 2001; Pont et al. 2005). A partial list of the physical, chemical and geomorphological parameters influencing the structure of fish assemblages includes discharge, slope, water velocity, water temperature, substrate, geomorphological type, and dissolved oxygen (Allan and Castillo 2007; Statzner et al. 1988). In 1949, Huet already observed that fish communities are particularly influenced by the slope of the river segment. However, the perception and influence of these factors can vary depending on the temporal and spatial scale as well as on the level of resolution (Frissell et al. 1986; Minshall 1988; Poff et al. 1997; Statzner et al. 1988). For example, temperature is a critical component related to metabolism as well as to distribution along the river's length and over geographic regions (Allan and Castillo 2007).

Over the last few years, a wide variety of statistical and machine-learning methods have been proposed to predict species distribution areas, such as GLM (Austin 2002; Guisan et al. 2002), GAM (Guisan et al. 2002), quantile regression (Vaz et al. 2008), MARS (Leathwick et al. 2006), and the boosting procedure (Leathwick et al. 2008). The objective of this paper is not to propose a new modelling approach or to compare methods. We will not stimulate rivalry between modelling techniques but would like to encourage readers to consult the very substantial review proposed by Austin (2007) on good modelling practices in ecology. Indeed, we agree with Austin (2007), who writes “it is clear that there is no standard for current best practice when modelling species environmental niche or geographical distribution, whether plant or animal. Numerous incompatibilities between the ecological, data and statistical models can be identified.” Ideas and conclusions converge in the remarks of several scientists and statisticians on the role and the use of models (e.g. Box and Draper 1987; Buja 2000; McCullagh and Nelder 1989; Mease and Wyner 2008; van Tongeren 1995).

A classical approach based on the generalized linear model (GLM) is used to model species distribution, because it provides interesting properties such as the computation of confidence intervals and the correction of additional effects, such as the cluster effect, with robust methods (e.g. Harrell 2001; Liang and Zeger 1986). For example, a new variance-covariance matrix can be easily estimated by robust sandwich estimation (French et al. 2006; Harrell 2001; Liang and Zeger 1986) to take into account cluster effects induced by the river basin or regional groups, such as marine regions (Reyjol et al. 2007). This method corrects for heteroscedasticity and for correlated responses from cluster samples. A procedure based on GLM appears to be a good compromise between predictive power and interpretability.

The aim of the present study was to propose distribution models for 24 fish species that are native to Europe. These models are intended to provide the theoretical bioclimatic envelope in which species should occur (Elith and Graham 2009) in the absence of anthropogenic stress. Species occurrence and environmental information come from the database constructed in the European project EFI+ (http://efi-plus.boku.ac.at/). The EFI+ database contains environmental and biological information for 14 countries of the European Union and Switzerland, covering the main European rivers. The construction of the working data set attempted to limit sources of bias such as unrepresentative sampling, variable instability over space and time, interference (e.g. historical effect), and contamination from a number of sources (e.g. Hellmann and Fowler 1999; Magurran 1988). A specific random spatial sub-sampling method was developed to limit the potential bias induced by the initial geographical distribution of the sites. Our objective was to establish a model that could be used to estimate species responses to their environmental conditions, together with the error associated with the expectations. Species distribution areas could be strongly affected by climate changes. Therefore, the modelling of species distribution areas was deliberately restricted to the region where each species is considered native.

2. Material and Methods

2.1 Study design

2.1.1 Site selections

The EFI+ database contains 14,221 sites distributed in 15 countries: Austria, Great Britain, Finland, France, Germany, Hungary, Italy, Lithuania, the Netherlands, Poland, Portugal, Romania, Spain, Sweden and Switzerland. First we selected only the sites (Fig. 1) for which complete information concerning environmental conditions and human pressures was available (9936 sites). We then used pressure criteria to select only sites that were not impacted or only slightly impacted by anthropogenic activities. Sites were selected in this way to ensure that the presence or absence of a species at a given site was not driven by human alterations. By modifying the environmental conditions, human activities could alter the species composition of communities by promoting either the presence or the absence of certain species. After this selection, some regions displayed very high concentrations of sites, with numerous sites geographically very close to each other, such as Galicia and Asturias in Spain (about 26 % of the sites). This spatial organisation can conflict with the necessary assumption of independence between observations in (generalised) linear regression (Kutner et al. 2005; Montgomery et al. 2006). Consequently, a sub-sampling strategy was used to limit the effect of spatial autocorrelation and to reduce the weight of over-represented regions. We defined a grid with cells of 0.2 decimal degrees and randomly selected one site per cell (0.1 decimal degree corresponds to approximately 11.1 km). Finally, our data set was composed of 1548 sites distributed over 14 countries (excluding the Netherlands; Fig. 2). Each site was sampled for 1–28 years. Therefore, we randomly selected only one sampling occasion (date) per site to reduce the temporal correlation in the data.
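The spatial sub-sampling and the selection of one sampling occasion per site can be sketched as follows. This is only an illustration under stated assumptions, not the procedure actually coded for the study: the site identifiers, coordinates and dates are invented, the function names are hypothetical, and cells are assumed to be aligned on multiples of 0.2 decimal degrees.

```python
import math
import random
from collections import defaultdict

def grid_cell(lon, lat, cell_size=0.2):
    """Return the integer (column, row) index of the grid cell containing a point."""
    return (math.floor(lon / cell_size), math.floor(lat / cell_size))

def spatial_subsample(sites, cell_size=0.2, seed=0):
    """Keep one randomly chosen site per grid cell.

    `sites` is a list of dicts with hypothetical keys 'id', 'lon' and 'lat'.
    """
    rng = random.Random(seed)
    cells = defaultdict(list)
    for site in sites:
        cells[grid_cell(site["lon"], site["lat"], cell_size)].append(site)
    # Cells are visited in sorted order so the result depends only on the seed.
    return [rng.choice(cells[key]) for key in sorted(cells)]

def one_occasion_per_site(occasions, seed=0):
    """Keep one randomly chosen sampling date per site.

    `occasions` maps a site id to the list of its sampling dates.
    """
    rng = random.Random(seed)
    return {site_id: rng.choice(dates) for site_id, dates in sorted(occasions.items())}

# Invented coordinates: "a" and "b" fall in the same 0.2 degree cell.
sites = [
    {"id": "a", "lon": -8.05, "lat": 43.01},
    {"id": "b", "lon": -8.12, "lat": 43.15},
    {"id": "c", "lon": -7.55, "lat": 43.05},
    {"id": "d", "lon": 4.85, "lat": 45.75},
]
kept = spatial_subsample(sites)
dates = one_occasion_per_site({"a": ["2001-06-12", "2004-09-03"], "c": ["1999-07-20"]})
print([site["id"] for site in kept], dates)
```

Fixing the seed makes the random draw reproducible, which matters here because every species model is then fitted on the same sub-sample.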

14,221 sites, 29,509 sampling occasions
  -> Exclusion of the sites with missing data; one sampling occasion per site
9936 sites
  -> Only undisturbed or slightly disturbed sites
2453 sites
  -> Spatial subsampling
1548 sites

Fig. 1. Data set selection procedure.


Fig. 2. Geographical location of the 1548 sites.

2.1.2 Selected Species

This study only focused on species that were native to Europe (Kottelat and Freyhof 2007). To avoid the problem of rare events in the estimation of species presence (see King and Zeng 2001 for more details), only species common in Europe were retained. Therefore, we selected species that occurred in at least 10 % of the sites located in the main marine regions (MMR) where they are considered native (Kottelat and Freyhof 2007; Reyjol et al. 2007). We also only considered the species for which at least 300 sampling sites were located in their geographical distribution area (Table 1). This was done so that each species model was fitted on a data set of sufficient size. Finally, we retained 24 species among the 159 fish species recorded in the EFI+ database (Table 1).


Table 1. Occurrence of the 24 selected fish species, with the number of sites located in their geographical distribution, their occurrence rate among these sites and the number of MMRs in which they are considered as native and with sampling sites.

Species                      Family          Occurrence  Sites  Prevalence  MMR
Salmo trutta                 Salmonidae      1212        1544   0.785       44
Phoxinus phoxinus            Cyprinidae      492         1319   0.373       37
Barbatula barbatula          Nemacheilidae   455         1236   0.368       34
Cottus gobio                 Cottidae        451         1263   0.357       36
Gobio gobio                  Cyprinidae      387         1265   0.306       34
Pseudochondrostoma duriense  Cyprinidae      119         393    0.303       5
Leuciscus cephalus           Cyprinidae      351         1179   0.298       38
Rutilus rutilus              Cyprinidae      285         1065   0.268       33
Anguilla anguilla            Anguillidae     317         1314   0.241       42
Perca fluviatilis            Percidae        231         996    0.232       32
Alburnoides bipunctatus      Cyprinidae      143         687    0.208       11
Esox lucius                  Esocidae        210         1029   0.204       32
Leuciscus leuciscus          Cyprinidae      210         1054   0.199       32
Alburnus alburnus            Cyprinidae      179         986    0.182       28
Gasterosteus aculeatus       Gasterosteidae  150         864    0.174       32
Chondrostoma nasus           Cyprinidae      64          385    0.166       5
Salmo salar                  Salmonidae      163         1005   0.162       30
Barbus barbus                Cyprinidae      131         908    0.144       23
Lota lota                    Lotidae         104         775    0.134       16
Telestes souffia             Cyprinidae      59          441    0.134       7
Thymallus thymallus          Thymallidae     119         894    0.133       27
Lampetra planeri             Petromyzonidae  181         1442   0.126       36
Rhodeus amarus               Cyprinidae      93          835    0.111       16
Pungitius pungitius          Gasterosteidae  76          719    0.106       24
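The two selection rules of Section 2.1.2 (occurrence in at least 10 % of the sites of the native range, and at least 300 sites in that range) amount to a simple filter. The sketch below is a simplified illustration: it treats the Prevalence column of Table 1 as the occurrence rate used for the 10 % rule, which is an assumption, and the function name and the two "hypothetical" species are invented for the example.

```python
def select_species(records, min_prevalence=0.10, min_sites=300):
    """Return the species passing both thresholds, by decreasing prevalence.

    `records` maps a species name to (occurrences, sites), where `sites` is the
    number of sampling sites located in the native range of the species.
    """
    selected = []
    for name, (occurrences, sites) in records.items():
        prevalence = occurrences / sites
        if sites >= min_sites and prevalence >= min_prevalence:
            selected.append((name, round(prevalence, 3)))
    return sorted(selected, key=lambda item: item[1], reverse=True)

records = {
    "Salmo trutta": (1212, 1544),            # counts taken from Table 1
    "Chondrostoma nasus": (64, 385),         # counts taken from Table 1
    "hypothetical rare species": (20, 900),   # invented: fails the 10 % rule
    "hypothetical local species": (90, 150),  # invented: fails the 300-site rule
}
selected = select_species(records)
print(selected)
```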

2.1.3 Environmental Data

The set of predictor variables was selected a priori using hypothesised habitat relations based on the organism’s ecology (Burnham and Anderson 2002). The environmental variables included in the models take several aspects of the river characteristics into account, such as geomorphology and climate conditions. Finally, we selected four environmental variables: actual river slope (log-transformed, m/km), July temperature (Tjul, °C), thermal amplitude (Tdif, °C) and a runoff estimation (lPA). The thermal variables provide an opportunity to take climate conditions into account. For a given site, the maximum temperature was represented by the July temperature, and the thermal amplitude was obtained as the difference between the July and January temperatures, which reduces the correlation between the two thermal variables:

$T_{dif} = T_{jul} - T_{jan}$

A pseudo-runoff was estimated by the product of the annual precipitation (P, in mm) and the drainage area (A, in km²):

$lPA = \log(A \times P)$

In the modelling process, this variable was log-transformed because its distribution is particularly asymmetric. Polynomial functions for slope and temperature (e.g. Pont et al. 2006; Pont et al. 2007) were used to integrate nonlinear responses (e.g. Jongman et al. 1995).
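As an illustration of the derived predictors, the short sketch below computes Tdif, lPA and the log-transformed slope for one invented site. The use of the natural logarithm is an assumption; it is consistent with the lPA range reported in Table 2 (about 6 to 19) given the ranges of drainage area and precipitation. Function names and site values are hypothetical.

```python
import math

def thermal_amplitude(t_jul, t_jan):
    """Tdif: difference between July and January mean air temperatures (deg C)."""
    return t_jul - t_jan

def pseudo_runoff(area_km2, precipitation_mm):
    """lPA: natural logarithm of drainage area (km2) times annual precipitation (mm)."""
    return math.log(area_km2 * precipitation_mm)

# Invented site, with values inside the ranges reported in Table 2.
t_dif = thermal_amplitude(t_jul=18.0, t_jan=1.5)
l_pa = pseudo_runoff(area_km2=100.0, precipitation_mm=900.0)
l_slope = math.log(12.8)  # slope in m/km, log-transformed

print(round(t_dif, 1), round(l_pa, 2), round(l_slope, 2))
```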

2.2 Statistical analysis

2.2.1 Modelling approach

To relate species presence or absence to the environmental variables, we used logistic regressions (Agresti 2002; Collett 2002; Hosmer and Lemeshow 2000), which belong to the generalised linear models (GLMs; McCullagh and Nelder 1989; Nelder and Wedderburn 1972). GLMs are an extension of the classical linear models designed for random variables that are generated from a distribution function belonging to the exponential family (e.g. Poisson, Gamma). Thus classical linear models are a special case of GLM with a random variable Y generated by a Gaussian distribution and models in the following form: $Y = \beta_0 + \sum_i \beta_i X_i + \varepsilon$, with $Y$ the response variable, $\beta_0$ the intercept, $\beta_i$ the $i$th parameter associated with the $i$th predictor $X_i$ and $\varepsilon$ the error. In linear models, the coefficients are estimated by ordinary least squares (OLS) and the error is assumed to be normally distributed (Kutner et al. 2005; Montgomery et al. 2006).

In GLMs, the explanatory variables are not directly related to the response variable but through a linear predictor: $\eta = \beta_0 + \sum_i \beta_i X_i$. The linear predictor is related to the expectation of the response variable $Y$ through a link function $g$: $g(E(Y)) = \eta$ and $E(Y) = g^{-1}(\eta)$. Moreover, the coefficients are estimated by maximising the likelihood (Faraway 2006; McCullagh and Nelder 1989) rather than by ordinary least squares. The link function classically used in logistic regressions is the logit: $g(p) = \mathrm{logit}(p) = \ln\left(\frac{p}{1-p}\right)$, with $p$ the probability that $Y = 1$ (Collett 2002; Hosmer and Lemeshow 2000). Therefore, $\mathrm{logit}(p) = \beta_0 + \sum_i \beta_i X_i$ and $E(Y \mid X) = p = \frac{1}{1 + e^{-(\beta_0 + \sum_i \beta_i X_i)}}$. With the logistic regressions, we thus model the probability of the presence of a species at a given site ($Y = 1$) conditional on the environmental conditions ($X_i$).

To select the set of variables that best explained the presence or absence of species, for each species we first computed a complete model integrating all environmental variables and the quadratic forms of Tjul, lslope and lPA (e.g. Pont et al. 2006; Pont et al. 2007). These quadratic forms were integrated to allow for nonlinear relationships (Jongman et al. 1995; Pont et al. 2005). Thus the complete models took the following form: $\mathrm{logit}(p) \sim T_{jul} + T_{jul}^2 + lslope + lslope^2 + lPA + lPA^2 + T_{dif}$. Then a stepwise procedure based on Akaike’s information criterion (AIC) was used to select the set of variables that best predicted species occurrence (Pont et al. 2005; Venables and Ripley 2002). One of the main advantages of this procedure, in addition to computation time, is that selecting adequate variables (on AIC criteria) limits the overfitting problem that can arise from using too many variables to make predictions. For further detail on this procedure, see Hosmer and Lemeshow (2000).
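The fitting and selection steps can be illustrated on simulated data. The sketch below is not the code used in the study (the analyses were run in R): it fits a logistic regression by Newton-Raphson in plain Python and applies a backward elimination on AIC, which is only one possible variant of a stepwise procedure. The variables `x` (a standardised gradient with a quadratic effect) and `z` (an irrelevant variable), the true coefficients and all function names are assumptions made for the example.

```python
import math
import random

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def fit_logistic(rows, y, n_iter=25):
    """Maximum-likelihood logistic regression by Newton-Raphson; returns (beta, AIC)."""
    k = len(rows[0])
    beta = [0.0] * k
    for _ in range(n_iter):
        p = [1.0 / (1.0 + math.exp(-sum(b * v for b, v in zip(beta, r)))) for r in rows]
        grad = [sum(r[j] * (yi - pi) for r, yi, pi in zip(rows, y, p)) for j in range(k)]
        hess = [[sum(r[j] * r[l] * pi * (1 - pi) for r, pi in zip(rows, p))
                 for l in range(k)] for j in range(k)]
        beta = [b + s for b, s in zip(beta, solve(hess, grad))]
    p = [1.0 / (1.0 + math.exp(-sum(b * v for b, v in zip(beta, r)))) for r in rows]
    loglik = sum(math.log(pi if yi else 1.0 - pi) for yi, pi in zip(y, p))
    return beta, 2 * k - 2 * loglik

# Simulated presence/absence with a unimodal response to x (optimum near 0.17).
rng = random.Random(1)
x = [rng.uniform(-2, 2) for _ in range(600)]
z = [rng.uniform(-2, 2) for _ in range(600)]
true_eta = [1.0 + 0.5 * xi - 1.5 * xi * xi for xi in x]
y = [1 if rng.random() < 1 / (1 + math.exp(-e)) else 0 for e in true_eta]
terms = {"x": x, "x2": [xi * xi for xi in x], "z": z}

def aic_of(names):
    """Fit the model containing an intercept and the named terms."""
    rows = [[1.0] + [terms[name][i] for name in names] for i in range(len(y))]
    return fit_logistic(rows, y)

# Backward elimination: drop the term whose removal lowers AIC most, until none does.
current = list(terms)
beta, best = aic_of(current)
improved = True
while improved and current:
    improved = False
    aic, dropped = min((aic_of([t for t in current if t != d])[1], d) for d in current)
    if aic < best:
        best, improved = aic, True
        current.remove(dropped)
beta, best = aic_of(current)
print(current, [round(b, 2) for b in beta], round(best, 1))
```

Because AIC penalises each additional coefficient, a term is kept only when it improves the likelihood enough to offset that penalty, which is how the selection limits overfitting.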

2.2.2 Model assessment

Model diagnostic
GLMs are based on various statistical assumptions such as independence between observations. We therefore checked the quasi-normality of the residuals through the QQ-plot of the standardised residuals and the histogram of the Pearson residuals, the heteroskedasticity of the residuals by plotting the residuals against the fitted values on the link scale, and the potentially influential points by plotting hat values against the standardised residuals (Collett 2002). We also checked for multicollinearity between the environmental variables by computing the variance inflation factor (VIF; e.g. Belsley et al. 1980; Chatterjee et al. 2000; Fox 1997).

Goodness of fit
The model’s goodness of fit was first assessed by representing the relationship between observed and predicted values. Model performance was assessed by computing the sensitivity and the specificity of each model (e.g. Fielding and Bell 1997). Sensitivity corresponds to the percentage of presences correctly predicted and specificity corresponds to the percentage of absences correctly predicted. To compute these two statistics, we defined a cut-off probability (c) for each model: a species was considered present at a site if the fitted probability was greater than this cut-off (p(Y = 1|X) > c), and absent if the fitted probability was lower than this cut-off (p(Y = 1|X) < c). This cut-off was chosen to maximise the sum of the sensitivity and the specificity (Fielding and Bell 1997). We also computed the rate of correct classification (number of sites correctly classified divided by the total number of sites). The predictive power of the models was finally estimated by computing the area under the ROC curve (e.g. Hosmer and Lemeshow 2000). The kappa statistic on the confusion matrix, which is a measure of agreement between the observations and the predictions (Agresti 2002; Altman et al. 2000; Conger 1980), and a pseudo r-squared based on the ratio of deviance (Collett 2002) were also computed.
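The classification statistics described above can be written compactly. The following sketch uses ten invented fitted probabilities; the function names are hypothetical, the cut-off is searched only among the fitted values (an assumption), and the AUC is computed through its rank interpretation (the probability that a presence site receives a higher fitted value than an absence site).

```python
def confusion(y, p, cutoff):
    """Return (tp, fp, fn, tn) when presence is predicted for p > cutoff."""
    tp = sum(1 for yi, pi in zip(y, p) if yi == 1 and pi > cutoff)
    fp = sum(1 for yi, pi in zip(y, p) if yi == 0 and pi > cutoff)
    fn = sum(1 for yi, pi in zip(y, p) if yi == 1 and pi <= cutoff)
    tn = sum(1 for yi, pi in zip(y, p) if yi == 0 and pi <= cutoff)
    return tp, fp, fn, tn

def sens_plus_spec(y, p, cutoff):
    """Sum of sensitivity and specificity at a given cut-off."""
    tp, fp, fn, tn = confusion(y, p, cutoff)
    return tp / (tp + fn) + tn / (tn + fp)

def best_cutoff(y, p):
    """Cut-off among the fitted values that maximises sensitivity + specificity."""
    return max(sorted(set(p)), key=lambda c: sens_plus_spec(y, p, c))

def kappa(tp, fp, fn, tn):
    """Cohen's kappa: agreement between observations and predictions beyond chance."""
    n = tp + fp + fn + tn
    observed = (tp + tn) / n
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    return (observed - expected) / (1 - expected)

def auc(y, p):
    """Area under the ROC curve, counting ties between a presence and an absence as 0.5."""
    pos = [pi for yi, pi in zip(y, p) if yi == 1]
    neg = [pi for yi, pi in zip(y, p) if yi == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))

# Invented observations (1 = presence) and fitted probabilities.
y = [1, 1, 1, 1, 0, 1, 1, 0, 0, 0]
p = [0.9, 0.8, 0.7, 0.6, 0.5, 0.45, 0.35, 0.3, 0.2, 0.1]

cutoff = best_cutoff(y, p)
tp, fp, fn, tn = confusion(y, p, cutoff)
correct_rate = (tp + tn) / len(y)
print(cutoff, tp / (tp + fn), tn / (tn + fp), correct_rate)
print(round(kappa(tp, fp, fn, tn), 3), round(auc(y, p), 3))
```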

Internal validation
Interpretability and predictive power are two important characteristics, but other criteria must be considered in model selection: natural handling of mixed-type data, robustness to outliers, computational scalability (large N), ability to deal with irrelevant inputs and ability to extract linear combinations of variables (Collett 2002; Faraway 2006; Hastie et al. 2009; Snee 1977). The stability of the model (robustness to outliers) was assessed by internal validations based on the bootstrap technique (Davison and Hinkley 1997; Efron and Tibshirani 1993). For each species we constituted 200 new calibration data sets (resamples) by randomly selecting calibration sites with replacement. The new data sets could thus include the same site several times. For each of the 200 resamples, we re-estimated the model coefficients. With these 200 re-estimated models, we computed the rate of correct classification and the kappa statistic both on the resampled calibration data sets and on the initial calibration data set. Finally, for each statistic two vectors of 200 values were obtained, one for the randomly selected calibration sites (train vector) and one for the initial calibration data set (test vector). The difference between the mean values of the train and test vectors is an estimation of the optimism (Harrell 2001). The optimism estimates the bias due to overfitting in the final model fit.
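The bootstrap estimate of optimism can be sketched generically. In the example below the species model is replaced by a deliberately simple stand-in (a single threshold on one simulated variable) so that the resampling logic stays visible; the data, the stand-in model and all function names are assumptions, not the models fitted in this study.

```python
import math
import random

def accuracy(cut, data):
    """Rate of correct classification of the rule 'present if x > cut'."""
    return sum(1 for x, y in data if (x > cut) == (y == 1)) / len(data)

def fit_threshold(data):
    """Toy stand-in for a species model: the cut on x with the best accuracy."""
    return max(sorted({x for x, _ in data}), key=lambda c: accuracy(c, data))

def bootstrap_optimism(data, fit, score, n_boot=200, seed=0):
    """Mean score on the resamples minus mean score on the original data."""
    rng = random.Random(seed)
    train, test = [], []
    for _ in range(n_boot):
        resample = [rng.choice(data) for _ in data]  # sampling with replacement
        model = fit(resample)                        # re-estimate on the resample
        train.append(score(model, resample))         # apparent performance
        test.append(score(model, data))              # performance on initial data
    return sum(train) / n_boot - sum(test) / n_boot

# Simulated presence/absence along one invented gradient.
rng = random.Random(42)
xs = [rng.uniform(-2, 2) for _ in range(60)]
data = [(x, 1 if rng.random() < 1 / (1 + math.exp(-2 * x)) else 0) for x in xs]

optimism = bootstrap_optimism(data, fit_threshold, accuracy)
print(round(optimism, 3))
```

An optimism close to zero suggests that the apparent performance is not inflated by overfitting; larger positive values indicate that the model is tuned to its own calibration sites.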

Split-Sampling
The model’s predictive reliability was assessed using a method derived from the classical cross-validation methods (Harrell 2001). First we randomly split the calibration data set into two subsets, train and test, containing 70 and 30 % of the sites, respectively. The train data set was used to re-estimate the models’ coefficients, and the sensitivity, specificity, kappa, correct classification rate and area under the ROC curve were then computed on the test data set. This operation was repeated 200 times to obtain a vector of values for each statistic computed on independent data sets (test).
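A minimal sketch of the repeated 70/30 split is given below. As in any such illustration, the model is a toy stand-in (a cut halfway between the mean value of presences and absences on one invented variable) and only the correct classification rate is computed; the data and function names are assumptions.

```python
import random

def split_sample(items, train_fraction, rng):
    """Randomly split items into (train, test) without replacement."""
    shuffled = list(items)
    rng.shuffle(shuffled)
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]

def fit_midpoint(train):
    """Toy model: cut halfway between the mean x of presences and of absences."""
    present = [x for x, y in train if y == 1]
    absent = [x for x, y in train if y == 0]
    return (sum(present) / len(present) + sum(absent) / len(absent)) / 2

def accuracy(cut, data):
    """Rate of correct classification of the rule 'present if x > cut'."""
    return sum(1 for x, y in data if (x > cut) == (y == 1)) / len(data)

def repeated_split(data, fit, score, n_repeats=200, train_fraction=0.7, seed=0):
    """Score on the test subset for each of n_repeats random train/test splits."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_repeats):
        train, test = split_sample(data, train_fraction, rng)
        scores.append(score(fit(train), test))
    return scores

# Invented data: presence above x = 2.5, with four sites deliberately mislabelled.
flipped = {20, 22, 27, 30}
data = [(i / 10, int((i >= 25) != (i in flipped))) for i in range(50)]

scores = repeated_split(data, fit_midpoint, accuracy)
print(round(sum(scores) / len(scores), 3), min(scores), max(scores))
```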

Contribution of environmental variables
We used hierarchical partitioning (Chevan and Sutherland 1991) to determine the relative influence of each environmental variable on the occurrence of fish species (Pont et al. 2005). This method was developed to disentangle the independent influence (I) and the joint effect (J) of correlated independent variables on a response variable. Hierarchical partitioning is a general method that can be performed using a large variety of statistics as goodness of fit measures (e.g. R², log-likelihood). We used the difference of deviance as the statistic to assess the influence of the environmental variables on species presence or absence.

Because the response is indirectly related to the predictors in GLMs (through a link function), it is difficult to interpret the relationships between independent and dependent variables directly from the coefficients of the models, especially when using polynomials or interactions (Fox 2003). Nevertheless, the coefficients’ signs provide information on how the response varies in relation to the predictors. Therefore, we used effect displays (Fox 1987; 2003) to investigate the relationships between species occurrence and environmental variables. These graphs represent the variation of the fitted probability of presence of a species along one environmental gradient (e.g. Tjul). They are obtained by computing the estimated probability of the presence of a species over the whole range of a given environmental variable while using a typical value for the other environmental variables (Fox 1987; 2003). Classically, the means of the independent variables are used as typical values to draw the effect display (Fox 1987; 2003). Nevertheless, using mean values may be inappropriate and poorly adapted to represent species distribution. If the species optimum for a variable is far from the mean values observed in the data set, the relationship represented on the graph may be misleading (e.g. no apparent relationship between the independent variable and species presence). Consequently, in order to properly represent the relation between an environmental variable and species occurrence, the values of the other predictors should be fixed such that these variables do not limit species occurrence. Therefore, we used the median values of the environmental variables at the sites where the species was recorded. We also displayed the confidence intervals on these graphs.
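The computation behind such an effect display can be sketched as follows: the focal variable is varied over a grid while the other predictors are held at their median over the sites where the species was recorded. The coefficients, the four sites and the function names are invented for the illustration and do not correspond to any fitted model of this study.

```python
import math
from statistics import median

def typical_values(sites, presence):
    """Median of each variable over the sites where the species was recorded."""
    present = [s for s, y in zip(sites, presence) if y == 1]
    return {name: median(s[name] for s in present) for name in present[0]}

def fitted_probability(intercept, coef, values):
    """Inverse logit of the linear predictor; coef maps a variable to (linear, quadratic)."""
    eta = intercept + sum(b1 * values[v] + b2 * values[v] ** 2
                          for v, (b1, b2) in coef.items())
    return 1.0 / (1.0 + math.exp(-eta))

def effect_curve(intercept, coef, focal, grid, typical):
    """Fitted probability along `grid` for `focal`, other variables fixed at `typical`."""
    return [(g, fitted_probability(intercept, coef, {**typical, focal: g})) for g in grid]

# Hypothetical coefficients: unimodal response to Tjul, linear response to lslope.
intercept = -20.0
coef = {"Tjul": (2.4, -0.075), "lslope": (0.5, 0.0)}

sites = [
    {"Tjul": 15.0, "lslope": 2.0},
    {"Tjul": 17.0, "lslope": 3.0},
    {"Tjul": 22.0, "lslope": 0.5},
    {"Tjul": 16.0, "lslope": 2.5},
]
presence = [1, 1, 0, 1]

typical = typical_values(sites, presence)
curve = effect_curve(intercept, coef, "Tjul", range(11, 26), typical)
peak = max(curve, key=lambda point: point[1])[0]
print(typical, peak)
```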


Confidence intervals
The confidence interval estimates the error associated with the mean response value of Y for a given vector of X ($E(Y \mid X)$). This interval is used as an indicator of model precision (de Jong and Heller 2008). The interval for an estimated probability ($\hat{p}_x$) was first constructed on the link scale (for $\mathrm{logit}(\hat{p}_x)$): $\mathrm{CI}(\mathrm{logit}(\hat{p}_x)) = \mathrm{CI}(\hat{\eta}_x) = \hat{\eta}_x \pm z_{1-\alpha/2}\,\mathrm{se}(\hat{\eta}_x)$, where $z_{1-\alpha/2}$ is the upper $1-\alpha/2$ point of the standard normal distribution (with $\alpha$ taken to be 0.05) and $\mathrm{se}(\hat{\eta}_x)$ the standard error of the linear predictor. Then the interval of the estimated probability is constructed by transforming the limits of the linear predictor with the inverse of the link function (Collett 2002): $\mathrm{CI}(\hat{p}_x) = g^{-1}(\mathrm{CI}(\hat{\eta}_x))$, where $g^{-1}$ is the inverse of the link function.

The standard error for the confidence interval was estimated using the Wald approach (Agresti 2002; Hosmer and Lemeshow 2000): $\mathrm{se}(\hat{\eta}_x) = \sqrt{\phi\, X_x' (X'WX)^{-1} X_x}$, where $\phi$ is the dispersion parameter (taken to be 1), $X_x$ (the environment at a given site) a vector of the design matrix $X$, and $W$ the diagonal matrix containing the estimated variance of $Y$. The matrix $(X'WX)^{-1}$ corresponds to the estimated covariance matrix of the coefficients.
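The two steps of the interval construction (Wald interval on the logit scale, then back-transformation) can be illustrated with a small numerical sketch. The coefficient vector and its covariance matrix are invented, the dispersion parameter is taken as 1, and the function name is hypothetical.

```python
import math

def wald_interval(x, beta, cov, z=1.959964):
    """95 % interval of a fitted probability: (lower, estimate, upper).

    x    : predictor vector of the site, including the leading 1 for the intercept
    beta : estimated coefficients
    cov  : estimated covariance matrix of the coefficients, (X'WX)^-1
    """
    eta = sum(b * v for b, v in zip(beta, x))
    # Variance of the linear predictor: x' cov x (dispersion parameter = 1).
    var = sum(x[i] * cov[i][j] * x[j] for i in range(len(x)) for j in range(len(x)))
    se = math.sqrt(var)

    def inv_logit(e):
        return 1.0 / (1.0 + math.exp(-e))

    # The limits are computed on the link scale, then mapped back to probabilities.
    return inv_logit(eta - z * se), inv_logit(eta), inv_logit(eta + z * se)

# Invented two-coefficient model evaluated where the second predictor equals 0.
beta = [0.0, 1.0]
cov = [[0.04, 0.0], [0.0, 0.01]]
lower, estimate, upper = wald_interval([1.0, 0.0], beta, cov)
print(round(lower, 3), estimate, round(upper, 3))
```

Building the interval on the link scale guarantees that both limits stay between 0 and 1, which a symmetric interval computed directly on the probability would not.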

Species environmental optimum
The species optimum for a given variable was estimated differently depending on whether or not the quadratic form of the variable was selected by the stepwise procedure. For an environmental variable with no quadratic term selected (e.g. Tdif), the optimum was estimated as the extreme value observed at the sites where the species occurred. If the coefficient related to the environmental variable was positive, the optimum was taken as the maximum value observed, whereas the optimum was taken as the minimum value if the coefficient was negative.

For a variable with a quadratic term, the optimum of each species was computed by identifying the value for which the partial derivative of the linear predictor $\eta = g(p)$ with respect to the environmental variable of interest $X_i$ was equal to 0: $\frac{\partial g}{\partial X_i} = 0$. The partial derivative is $\frac{\partial g}{\partial X_i} = \beta_1 + 2 \beta_2 X_i$, with $\beta_1$ the coefficient related to the variable $X_i$ (e.g. Tjul) and $\beta_2$ the parameter associated with its quadratic term (e.g. Tjul²). The value of the optimum is given by $x = -\frac{\beta_1}{2 \beta_2}$, and if $\beta_2$ is negative, the extremum corresponds to a maximum.
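Both rules for locating the optimum are short enough to be written out. In the sketch below the coefficients and observed values are invented, and the function names are hypothetical; the nature of the stationary point follows from the sign of the second derivative, $2\beta_2$.

```python
def quadratic_optimum(beta1, beta2):
    """Stationary point of beta1 * x + beta2 * x**2 and its nature."""
    if beta2 == 0:
        raise ValueError("no quadratic term: use the observed extreme instead")
    x_opt = -beta1 / (2.0 * beta2)
    kind = "maximum" if beta2 < 0 else "minimum"
    return x_opt, kind

def linear_optimum(beta1, observed_values):
    """Observed extreme at presence sites, in the direction given by the sign of beta1."""
    return max(observed_values) if beta1 > 0 else min(observed_values)

# Hypothetical Tjul coefficients: linear term 3.2, quadratic term -0.1.
x_opt, kind = quadratic_optimum(3.2, -0.1)
# Hypothetical Tdif values at presence sites, with a negative coefficient.
tdif_opt = linear_optimum(-0.3, [9.5, 14.2, 21.0])
print(x_opt, kind, tdif_opt)
```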

All statistical analyses were performed with R software (version 2.9.1, R Development Core Team 2008). We also used dedicated packages to perform the bootstrap (boot, Canty and Ripley 2009), the hierarchical partitioning (hier.part, Walsh and Mac Nally 2008) and the effect displays (effects, Fox 2003).

P7: Modelling ecological niche of fish species at the European scale: sensitivity to climate variables (temperature, run-off) and associated uncertainties.

3. Results

3.1 Environment

The global data set comprising 1548 sites was mostly composed of sites located in small streams. Half of the sites had a Strahler stream order lower than 3 and about 90 % of the sites had an order lower than or equal to 4. Nevertheless, this data set encompassed a wide range of climatic and physical conditions (Table 2). For instance, the July mean air temperature ranged from 11.3 to 25.1°C and the thermal amplitude ranged from 8.4 to 28.8°C (Table 2).

Table 2. Summary of the environmental variables of the global data set (1548 sites) before transformation.

| Environmental variable | Unit | Mean | Sd | Range |
|---|---|---|---|---|
| Tjul | °C | 18.1 | 2.2 | 11.3–25.1 |
| Tdif | °C | 16.9 | 4.3 | 8.4–28.8 |
| Slope | m/km | 12.8 | 19.6 | 0.1–294.7 |
| Precipitations | mm | 926.7 | 268.8 | 403.8–1879.8 |
| Area of drainage basin | km² | 950.2 | 5811.8 | 1–100161 |
| lPA | - | 11.5 | 1.7 | 6–19 |

3.2 Calibration data sets

The number of sites located in species distribution areas was highly variable (Table 3), from 385 for the nase (Chondrostoma nasus, L.) to 1544 for the brown trout (Salmo trutta, L.). Following these variations, the rate of occurrence was also largely inconsistent between species, with two-thirds of the species recorded in less than 25 % of their calibration sites (Table 3). The Douro nase (Pseudochondrostoma duriense, C.), gudgeon (Gobio gobio, L.), bullhead (Cottus gobio, L.), stone loach (Barbatula barbatula, L.), minnow (Phoxinus phoxinus, L.) and the brown trout were the only species occurring in more than 30 % of the sampling sites located in their distribution area. The brown trout was different from the other species, with an approximately 80 % rate of occurrence and the greatest number of sites available (Table 3).


Table 3. Summary of the number of sites available for each species and their occurrence among these sites.

| Species | Family | Occurrence | Sites | %Occur | Region |
|---|---|---|---|---|---|
| Salmo trutta | Salmonidae | 1212 | 1544 | 0.785 | 44 |
| Phoxinus phoxinus | Cyprinidae | 492 | 1319 | 0.373 | 37 |
| Barbatula barbatula | Nemacheilidae | 455 | 1236 | 0.368 | 34 |
| Cottus gobio | Cottidae | 451 | 1263 | 0.357 | 36 |
| Gobio gobio | Cyprinidae | 387 | 1265 | 0.306 | 34 |
| Pseudochondrostoma duriense | Cyprinidae | 119 | 393 | 0.303 | 5 |
| Leuciscus cephalus | Cyprinidae | 351 | 1179 | 0.298 | 38 |
| Rutilus rutilus | Cyprinidae | 285 | 1065 | 0.268 | 33 |
| Anguilla anguilla | Anguillidae | 317 | 1314 | 0.241 | 42 |
| Perca fluviatilis | Percidae | 231 | 996 | 0.232 | 32 |
| Alburnoides bipunctatus | Cyprinidae | 143 | 687 | 0.208 | 11 |
| Esox lucius | Esocidae | 210 | 1029 | 0.204 | 32 |
| Leuciscus leuciscus | Cyprinidae | 210 | 1054 | 0.199 | 32 |
| Alburnus alburnus | Cyprinidae | 179 | 986 | 0.182 | 28 |
| Gasterosteus aculeatus | Gasterosteidae | 150 | 864 | 0.174 | 32 |
| Chondrostoma nasus | Cyprinidae | 64 | 385 | 0.166 | 5 |
| Salmo salar | Salmonidae | 163 | 1005 | 0.162 | 30 |
| Barbus barbus | Cyprinidae | 131 | 908 | 0.144 | 23 |
| Lota lota | Lotidae | 104 | 775 | 0.134 | 16 |
| Leuciscus souffia | Cyprinidae | 59 | 441 | 0.134 | 7 |
| Thymallus thymallus | Thymallidae | 119 | 894 | 0.133 | 27 |
| Lampetra planeri | Petromyzonidae | 181 | 1442 | 0.126 | 36 |
| Rhodeus amarus | Cyprinidae | 93 | 835 | 0.111 | 16 |
| Pungitius pungitius | Gasterosteidae | 76 | 719 | 0.106 | 24 |

3.3 Model goodness of fit and validation

3.3.1 Goodness of fit

For each model, the proportion of covariate patterns (groups of sites sharing the same environmental conditions) was low, always lower than 5 % (Table 4). The relatively low mean VIF (ranging from 1.1 to 1.97) indicates that multicollinearity was limited in the models (Table 4). On average, the models’ goodness of fit was good: 20 models had a very good AUC (greater than 0.8), 22 models had a kappa greater than 0.3, and the average rate of good classification was 0.763. The pseudo-r-square reflected this pattern less clearly, with values ranging from 0.08 to 0.4 and an average of 0.265 (Table 4). Nevertheless, four species displayed models with low goodness of fit statistics: the minnow, the stone loach, the bullhead and the European brook lamprey (Lampetra planeri, B.). Their pseudo-r-square was lower than 0.16, their AUCs were all lower than 0.8 and either their specificity or their sensitivity did not exceed 70 %. These statistics indicate that the discriminant power of these models was low (Table 4). On the other hand, the riffle dace (Telestes souffia, R.), the bleak (Alburnus alburnus, L.), the nase and the barbel (Barbus barbus, L.) presented models with high goodness of fit, with the AUC around or greater than 0.9, the kappa statistic greater than 0.46 and the good classification rate between 0.76 and 0.88. Specificity and sensitivity values for these species were consistent with high discriminating powers, some greater than 0.9 (Table 4). For these four species, sensitivity was always higher than specificity (Table 4). On average, the sensitivity of the models was higher than the specificity (for 17 models out of 24; Table 4), indicating that the classification errors were lower for species presence than for species absence. The cut-off probabilities used to transform predicted probabilities into estimated species presence or absence were highly correlated with the species’ rate of occurrence in their calibration data set (Spearman rank correlation, ρ = 0.854, p-value < 0.001; Table 4).

Table 4. Summary statistics of the models’ goodness of fit: percentage of covariate patterns greater than 4 (CovPat), the mean VIF (Mvif), a pseudo-r-square based on deviance ratio (pR2; e.g. Estrella 1998), the AIC, cut-off probability (Cut), sensitivity (Sens), specificity (Spec), prevalence (Prev), area under the ROC curve (AUC), kappa statistic (κ) and rate of good classification (percentage of sites well classified, Class).

| Species | CovPat | Mvif | pR2 | AIC | Cut | Sens | Spec | Prev | AUC | κ | Class |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Salmo trutta | 0.036 | 1.43 | 0.302 | 1154.8 | 0.7392 | 0.829 | 0.711 | 0.785 | 0.845 | 0.482 | 0.804 |
| Phoxinus phoxinus | 0.010 | 1.43 | 0.082 | 1647.6 | 0.4121 | 0.630 | 0.625 | 0.373 | 0.652 | 0.243 | 0.627 |
| Barbatula barbatula | 0.008 | 1.38 | 0.154 | 1447.9 | 0.3316 | 0.804 | 0.542 | 0.368 | 0.716 | 0.308 | 0.638 |
| Cottus gobio | 0.014 | 1.13 | 0.157 | 1456.7 | 0.3617 | 0.818 | 0.602 | 0.357 | 0.727 | 0.376 | 0.679 |
| Gobio gobio | 0.020 | 1.34 | 0.349 | 1114.9 | 0.4165 | 0.739 | 0.818 | 0.306 | 0.848 | 0.534 | 0.794 |
| Pseudochondrostoma duriense | 0.015 | 1.40 | 0.271 | 386.6 | 0.2824 | 0.815 | 0.693 | 0.303 | 0.810 | 0.443 | 0.730 |
| Leuciscus cephalus | 0.019 | 1.39 | 0.304 | 1080.9 | 0.2361 | 0.849 | 0.637 | 0.298 | 0.826 | 0.403 | 0.700 |
| Rutilus rutilus | 0.030 | 1.25 | 0.365 | 849.1 | 0.2966 | 0.765 | 0.795 | 0.268 | 0.859 | 0.507 | 0.787 |
| Anguilla anguilla | 0.022 | 1.50 | 0.227 | 1164.2 | 0.2173 | 0.808 | 0.675 | 0.241 | 0.808 | 0.376 | 0.707 |
| Perca fluviatilis | 0.026 | 1.33 | 0.312 | 775.5 | 0.3132 | 0.723 | 0.809 | 0.232 | 0.850 | 0.473 | 0.789 |
| Alburnoides bipunctatus | 0.020 | 1.40 | 0.231 | 559.6 | 0.2428 | 0.783 | 0.734 | 0.208 | 0.823 | 0.399 | 0.744 |
| Esox lucius | 0.022 | 1.30 | 0.321 | 722.9 | 0.2571 | 0.781 | 0.790 | 0.204 | 0.868 | 0.467 | 0.788 |
| Leuciscus leuciscus | 0.022 | 1.41 | 0.324 | 727.0 | 0.3293 | 0.719 | 0.859 | 0.199 | 0.872 | 0.522 | 0.831 |
| Alburnus alburnus | 0.018 | 1.10 | 0.394 | 562.5 | 0.1427 | 0.905 | 0.751 | 0.182 | 0.901 | 0.469 | 0.779 |
| Gasterosteus aculeatus | 0.014 | 1.33 | 0.327 | 535.5 | 0.2154 | 0.860 | 0.790 | 0.174 | 0.885 | 0.485 | 0.802 |
| Chondrostoma nasus | 0.021 | 1.97 | 0.400 | 212.3 | 0.1303 | 0.922 | 0.782 | 0.166 | 0.919 | 0.500 | 0.805 |
| Salmo salar | 0.039 | 1.51 | 0.283 | 622.0 | 0.3030 | 0.681 | 0.910 | 0.162 | 0.865 | 0.558 | 0.873 |
| Barbus barbus | 0.014 | 1.52 | 0.373 | 441.6 | 0.1592 | 0.916 | 0.834 | 0.144 | 0.923 | 0.546 | 0.846 |
| Lota lota | 0.032 | 1.30 | 0.174 | 493.8 | 0.1474 | 0.740 | 0.762 | 0.134 | 0.816 | 0.326 | 0.759 |
| Telestes souffia | 0.016 | 1.79 | 0.280 | 244.6 | 0.1798 | 0.864 | 0.853 | 0.134 | 0.899 | 0.534 | 0.855 |
| Thymallus thymallus | 0.024 | 1.33 | 0.194 | 548.5 | 0.1625 | 0.782 | 0.777 | 0.133 | 0.835 | 0.367 | 0.777 |
| Lampetra planeri | 0.025 | 1.26 | 0.122 | 929.4 | 0.1363 | 0.801 | 0.694 | 0.126 | 0.786 | 0.271 | 0.707 |
| Rhodeus amarus | 0.020 | 1.30 | 0.211 | 431.5 | 0.1000 | 0.936 | 0.714 | 0.111 | 0.872 | 0.330 | 0.739 |
| Pungitius pungitius | 0.026 | 1.23 | 0.210 | 356.4 | 0.1223 | 0.829 | 0.768 | 0.106 | 0.860 | 0.334 | 0.775 |
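To make the threshold-dependent statistics of Table 4 concrete, the Python sketch below derives sensitivity, specificity, kappa and the rate of good classification from a 2 × 2 confusion matrix. The function name `classification_stats` is hypothetical, and the four counts are an assumption: they are not reported in the text but were back-calculated approximately from the brown trout figures (1212 presences among 1544 sites, sensitivity 0.829, specificity 0.711).

```python
def classification_stats(tp, fn, fp, tn):
    """Threshold-dependent statistics from a presence/absence confusion matrix.

    tp, fn : observed presences predicted present / absent
    fp, tn : observed absences predicted present / absent
    """
    n = tp + fn + fp + tn
    sens = tp / (tp + fn)          # presences correctly predicted
    spec = tn / (tn + fp)          # absences correctly predicted
    good = (tp + tn) / n           # rate of good classification
    # Agreement expected by chance, from the row and column margins
    chance = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / (n * n)
    kappa = (good - chance) / (1 - chance)
    return {"Sens": sens, "Spec": spec, "kappa": kappa, "Class": good}


# Approximate counts back-calculated for brown trout (assumption, see above)
stats = classification_stats(tp=1005, fn=207, fp=96, tn=236)
print({name: round(value, 3) for name, value in stats.items()})
```

With these assumed counts, the four statistics come out close to the brown trout row of Table 4 (about 0.83, 0.71, 0.48 and 0.80), which illustrates how kappa discounts the agreement expected from prevalence alone.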

3.3.2 Model validation

The mean values of the kappa and the rate of good classification obtained with the bootstraps (200 resamples) were consistent with the values observed for the calibration data set (Table 5). The estimated optimism never exceeded 6.5 % of the initial statistic, revealing the models’ good stability and low sensitivity to outliers.


Table 5. Bootstrap results (mean of the 200 re-samples) on the kappa and good classification statistics: the initial value of the statistic (Init), the mean value obtained from the train vectors (Train, computed on the randomly selected set of sites), the mean value from the test vector (Test, computed using the initial calibration data set) and the optimism (Optim) associated with the statistic (Train-Test).

| Species | Kappa Init | Kappa Train | Kappa Test | Kappa Optim | Class Init | Class Train | Class Test | Class Optim |
|---|---|---|---|---|---|---|---|---|
| Salmo trutta | 0.482 | 0.471 | 0.468 | 0.004 | 0.804 | 0.8 | 0.798 | 0.002 |
| Phoxinus phoxinus | 0.242 | 0.235 | 0.227 | 0.008 | 0.627 | 0.623 | 0.618 | 0.005 |
| Barbatula barbatula | 0.308 | 0.3 | 0.295 | 0.005 | 0.638 | 0.636 | 0.633 | 0.003 |
| Cottus gobio | 0.376 | 0.368 | 0.361 | 0.006 | 0.679 | 0.674 | 0.672 | 0.002 |
| Gobio gobio | 0.534 | 0.528 | 0.517 | 0.011 | 0.794 | 0.791 | 0.788 | 0.004 |
| Pseudochondrostoma duriense | 0.443 | 0.434 | 0.409 | 0.025 | 0.73 | 0.728 | 0.717 | 0.01 |
| Leuciscus cephalus | 0.403 | 0.404 | 0.397 | 0.007 | 0.7 | 0.701 | 0.697 | 0.004 |
| Rutilus rutilus | 0.507 | 0.501 | 0.498 | 0.003 | 0.787 | 0.785 | 0.784 | 0.002 |
| Anguilla anguilla | 0.376 | 0.378 | 0.368 | 0.01 | 0.707 | 0.71 | 0.704 | 0.005 |
| Perca fluviatilis | 0.473 | 0.464 | 0.458 | 0.005 | 0.789 | 0.786 | 0.785 | 0.001 |
| Alburnoides bipunctatus | 0.399 | 0.412 | 0.392 | 0.019 | 0.744 | 0.75 | 0.742 | 0.008 |
| Esox lucius | 0.467 | 0.469 | 0.466 | 0.004 | 0.788 | 0.79 | 0.79 | 0 |
| Leuciscus leuciscus | 0.522 | 0.506 | 0.497 | 0.009 | 0.831 | 0.828 | 0.822 | 0.005 |
| Alburnus alburnus | 0.469 | 0.463 | 0.46 | 0.002 | 0.779 | 0.778 | 0.776 | 0.002 |
| Gasterosteus aculeatus | 0.485 | 0.487 | 0.478 | 0.01 | 0.802 | 0.803 | 0.798 | 0.004 |
| Chondrostoma nasus | 0.5 | 0.51 | 0.49 | 0.02 | 0.805 | 0.81 | 0.804 | 0.006 |
| Salmo salar | 0.558 | 0.547 | 0.54 | 0.006 | 0.873 | 0.868 | 0.867 | 0.002 |
| Barbus barbus | 0.546 | 0.539 | 0.527 | 0.012 | 0.846 | 0.844 | 0.841 | 0.004 |
| Lota lota | 0.326 | 0.32 | 0.299 | 0.021 | 0.759 | 0.756 | 0.75 | 0.005 |
| Telestes souffia | 0.534 | 0.506 | 0.472 | 0.034 | 0.855 | 0.846 | 0.835 | 0.011 |
| Thymallus thymallus | 0.367 | 0.362 | 0.348 | 0.014 | 0.777 | 0.779 | 0.774 | 0.005 |
| Lampetra planeri | 0.271 | 0.262 | 0.255 | 0.007 | 0.707 | 0.706 | 0.702 | 0.004 |
| Rhodeus amarus | 0.33 | 0.332 | 0.319 | 0.013 | 0.739 | 0.746 | 0.742 | 0.004 |
| Pungitius pungitius | 0.334 | 0.325 | 0.313 | 0.011 | 0.775 | 0.772 | 0.769 | 0.003 |
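The Train, Test and Optim columns of Table 5 follow a bootstrap scheme that can be sketched as below. This is a hedged illustration only: a toy one-variable cut-off rule (`fit_threshold`) stands in for the logistic models, the sites are synthetic, and all function names are hypothetical. The analyses themselves relied on the R package boot.

```python
import random


def accuracy(cut, data):
    """Rate of good classification when presence is predicted for x >= cut."""
    return sum((x >= cut) == (y == 1) for x, y in data) / len(data)


def fit_threshold(data):
    """Toy stand-in for model fitting: pick the cut-off maximising accuracy."""
    return max(sorted({x for x, _ in data}), key=lambda c: accuracy(c, data))


def bootstrap_optimism(data, n_boot=200, seed=1):
    """Mean Train and Test statistics and their difference (the optimism)."""
    rng = random.Random(seed)
    train_sum = test_sum = 0.0
    for _ in range(n_boot):
        sample = [rng.choice(data) for _ in data]  # resample sites with replacement
        cut = fit_threshold(sample)                # refit on the resample
        train_sum += accuracy(cut, sample)         # Train: evaluated on the resample
        test_sum += accuracy(cut, data)            # Test: evaluated on the initial data
    train, test = train_sum / n_boot, test_sum / n_boot
    return train, test, train - test


# Synthetic sites (hypothetical): presence becomes more likely as x increases
gen = random.Random(0)
sites = []
for _ in range(100):
    x = gen.random()
    sites.append((x, 1 if gen.random() < x else 0))

init = accuracy(fit_threshold(sites), sites)
train, test, optimism = bootstrap_optimism(sites)
print(round(init, 3), round(train, 3), round(test, 3), round(optimism, 3))
```

A small optimism relative to the initial statistic, as reported in Table 5, suggests that the apparent performance is not driven by a few influential sites.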

3.3.3 Predictive accuracy

The mean values of the five statistics computed for the independent data sets (Table 6) were similar to those computed on the whole calibration data set (Table 4), especially specificity, AUC and rate of good classification. The standard deviations associated with the five statistics were generally very low. Only the Douro nase, riffle dace, burbot and Eurasian nine-spine stickleback presented a notable deviation in sensitivity between the calibration data set (Table 4) and the independent data sets (Table 6). Their standard deviations were somewhat higher than for the other species (Table 6).


Table 6. Mean and standard deviation of the statistics computed on the independent data sets: sensitivity (Sens), specificity (Spec), area under the ROC curve (AUC), kappa (k) and rate of good classification (Class).

| Species | Sens | Spec | AUC | k | Class |
|---|---|---|---|---|---|
| Salmo trutta | 0.825 ± 0.023 | 0.694 ± 0.047 | 0.841 ± 0.020 | 0.463 ± 0.041 | 0.796 ± 0.017 |
| Phoxinus phoxinus | 0.619 ± 0.045 | 0.617 ± 0.038 | 0.648 ± 0.023 | 0.223 ± 0.041 | 0.617 ± 0.022 |
| Barbatula barbatula | 0.790 ± 0.042 | 0.535 ± 0.035 | 0.710 ± 0.023 | 0.288 ± 0.041 | 0.628 ± 0.024 |
| Cottus gobio | 0.807 ± 0.037 | 0.594 ± 0.033 | 0.724 ± 0.021 | 0.357 ± 0.038 | 0.669 ± 0.022 |
| Gobio gobio | 0.719 ± 0.044 | 0.817 ± 0.025 | 0.846 ± 0.018 | 0.516 ± 0.039 | 0.787 ± 0.018 |
| Pseudochondrostoma duriense | 0.760 ± 0.082 | 0.685 ± 0.055 | 0.794 ± 0.037 | 0.390 ± 0.074 | 0.707 ± 0.038 |
| Leuciscus cephalus | 0.840 ± 0.038 | 0.633 ± 0.031 | 0.825 ± 0.019 | 0.393 ± 0.038 | 0.694 ± 0.022 |
| Rutilus rutilus | 0.757 ± 0.043 | 0.793 ± 0.023 | 0.859 ± 0.020 | 0.497 ± 0.042 | 0.783 ± 0.018 |
| Anguilla anguilla | 0.792 ± 0.042 | 0.669 ± 0.028 | 0.804 ± 0.020 | 0.358 ± 0.036 | 0.698 ± 0.020 |
| Perca fluviatilis | 0.709 ± 0.050 | 0.804 ± 0.028 | 0.847 ± 0.021 | 0.454 ± 0.051 | 0.782 ± 0.021 |
| Alburnoides bipunctatus | 0.756 ± 0.072 | 0.728 ± 0.039 | 0.812 ± 0.029 | 0.374 ± 0.058 | 0.733 ± 0.028 |
| Esox lucius | 0.772 ± 0.048 | 0.792 ± 0.030 | 0.867 ± 0.018 | 0.464 ± 0.052 | 0.788 ± 0.023 |
| Leuciscus leuciscus | 0.703 ± 0.053 | 0.849 ± 0.024 | 0.867 ± 0.019 | 0.491 ± 0.047 | 0.820 ± 0.019 |
| Alburnus alburnus | 0.891 ± 0.046 | 0.749 ± 0.031 | 0.899 ± 0.020 | 0.457 ± 0.050 | 0.774 ± 0.024 |
| Gasterosteus aculeatus | 0.842 ± 0.049 | 0.786 ± 0.029 | 0.880 ± 0.019 | 0.469 ± 0.055 | 0.796 ± 0.024 |
| Chondrostoma nasus | 0.873 ± 0.078 | 0.782 ± 0.040 | 0.902 ± 0.027 | 0.469 ± 0.070 | 0.796 ± 0.032 |
| Salmo salar | 0.662 ± 0.064 | 0.910 ± 0.020 | 0.860 ± 0.025 | 0.543 ± 0.057 | 0.869 ± 0.017 |
| Barbus barbus | 0.877 ± 0.052 | 0.830 ± 0.024 | 0.920 ± 0.015 | 0.516 ± 0.055 | 0.836 ± 0.020 |
| Lota lota | 0.681 ± 0.097 | 0.756 ± 0.040 | 0.805 ± 0.034 | 0.285 ± 0.053 | 0.744 ± 0.029 |
| Leuciscus souffia | 0.765 ± 0.107 | 0.834 ± 0.038 | 0.878 ± 0.031 | 0.440 ± 0.084 | 0.824 ± 0.031 |
| Thymallus thymallus | 0.745 ± 0.077 | 0.777 ± 0.031 | 0.833 ± 0.028 | 0.347 ± 0.052 | 0.772 ± 0.024 |
| Lampetra planeri | 0.777 ± 0.063 | 0.691 ± 0.029 | 0.786 ± 0.024 | 0.256 ± 0.038 | 0.701 ± 0.024 |
| Rhodeus amarus | 0.891 ± 0.065 | 0.719 ± 0.032 | 0.865 ± 0.025 | 0.319 ± 0.050 | 0.738 ± 0.027 |
| Pungitius pungitius | 0.783 ± 0.097 | 0.764 ± 0.037 | 0.862 ± 0.033 | 0.307 ± 0.062 | 0.765 ± 0.030 |

3.4 Influence of the environmental variables

Among the four environmental variables included in the initial model, July mean air temperature was the only one retained in all the models, while slope and lPA were almost always selected (both selected in 23 models out of 24; Tables 7 and 8). Thermal amplitude was only selected in 17 models (Table 7). Nevertheless, the hierarchical partitioning revealed that their influence on species occurrence varied widely between species (Table 7). For all models, the overall independent influence of variables was always greater than the joined influence (Table 7), but for 11 models the absolute ratio between these two values did not exceed 2, and for nine species the total independent influence was three times greater than the joined influence (Table 7). In more than half of the models, slope was the environmental variable presenting the greatest contribution to the independent effects (Table 7), followed by July mean air temperature and lPA, which mostly contributed to the independent effects for five species each. Thermal amplitude (Tdif) presented the highest independent effect for only one species: the European eel (Anguilla anguilla, L.). When selected by the stepwise procedure, thermal amplitude generally had the lowest participation in the independent effects of all the environmental variables, accounting for less than 15 % of the independent effects in two-thirds of the models (Table 7).


For some species such as brown trout, minnow, chub and bitterling (Rhodeus amarus, B.), the independent effects were quite well balanced between July mean air temperature, slope and lPA. In contrast, for 16 species, one of the four environmental variables alone accounted for more than 45 % of the independent effects (Table 7): Tjul for the bullhead, Atlantic salmon (Salmo salar, L.) and the riffle dace; lslope for the stone loach, gudgeon, roach (Rutilus rutilus, L.), perch (Perca fluviatilis, L.), pike (Esox lucius, L.), dace (Leuciscus leuciscus, L.), three-spined stickleback (Gasterosteus aculeatus, L.) and the European brook lamprey; lPA for the spirlin (Alburnoides bipunctatus, B.), nase, barbel and the grayling (Thymallus thymallus, L.); and Tdif for the European eel.

Table 7. Hierarchical partitioning results: the overall effect of the variables (difference of deviance, R), the overall independent influence (I), the joined influence (J), the ratio between independent and joined influence (|I/J|) and the relative independent influence of each environmental variable (%Ii). The relative independent influence of each variable was assessed either for the single variable or for the variable and its quadratic form conjointly (variable names followed by ‘²’). Columns R to |I/J| give the total partition; columns Tjul to Tdif give %Ii.

| Species | R | I | J | \|I/J\| | Tjul | Tjul² | Slope | Slope² | lPA | lPA² | Tdif |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Salmo trutta | 696.9 | 468.6 | 228.2 | 2.1 | – | 30.9 | – | 34.4 | – | 19.7 | 15.1 |
| Phoxinus phoxinus | 136.2 | 108.9 | 27.3 | 4 | – | 37.6 | – | 38.5 | – | 23.9 | – |
| Barbatula barbatula | 258.5 | 194.6 | 63.9 | 3 | – | 31.4 | – | 46.4 | – | 18.9 | 3.3 |
| Cottus gobio | 212.3 | 201.6 | 10.8 | 18.7 | – | 73.8 | – | 17.4 | – | – | 8.8 |
| Gobio gobio | 723.5 | 459.1 | 264.5 | 1.7 | – | 20 | – | 49 | – | 21.3 | 9.7 |
| Pseudochondrostoma duriense | 141.3 | 109.4 | 31.9 | 3.4 | – | 41 | – | 19.4 | – | 39.6 | – |
| Leuciscus cephalus | 566.3 | 368.9 | 197.4 | 1.9 | – | 39 | – | 23.9 | – | 37 | – |
| Rutilus rutilus | 664.3 | 400.2 | 264.1 | 1.5 | – | 21.2 | 45.3 | – | 30.8 | – | 2.7 |
| Anguilla anguilla | 275.8 | 301.8 | 26 | 11.6 | – | 6.9 | – | 5.5 | 2.3 | – | 85.3 |
| Perca fluviatilis | 519 | 315.4 | 203.7 | 1.5 | – | 20.6 | – | 49.1 | 30.3 | – | – |
| Alburnoides bipunctatus | 261.4 | 159.2 | 102.2 | 1.6 | – | 14.6 | – | 23 | – | 51 | 11.5 |
| Esox lucius | 533.2 | 330.5 | 202.7 | 1.6 | – | 11.8 | – | 62.2 | 26.1 | – | – |
| Leuciscus leuciscus | 529.1 | 341.6 | 187.5 | 1.8 | – | 17.4 | – | 45.1 | – | 33.3 | 4.2 |
| Alburnus alburnus | 680.7 | 383.6 | 297 | 1.3 | 16.1 | – | – | 37.8 | 41.5 | – | 4.6 |
| Gasterosteus aculeatus | 379.6 | 278.1 | 101.5 | 2.7 | – | 29.5 | – | 50.1 | – | 9.8 | 10.6 |
| Chondrostoma nasus | 240.6 | 150.1 | 90.5 | 1.7 | – | 26.1 | – | 17.9 | – | 53.3 | 2.6 |
| Salmo salar | 202.9 | 279 | 76.1 | 3.7 | 69.1 | – | – | – | – | 14.6 | 16.3 |
| Barbus barbus | 457.8 | 323.8 | 134 | 2.4 | – | 24 | – | 15.1 | – | 59.1 | 1.8 |
| Lota lota | 208.5 | 131.3 | 77.2 | 1.7 | 3.5 | – | – | 43.1 | – | 40.1 | 13.3 |
| Leuciscus souffia | 102 | 118.5 | 16.5 | 7.2 | – | 66.7 | – | 10.6 | – | 5.4 | 17.3 |
| Thymallus thymallus | 181.5 | 168.8 | 12.7 | 13.3 | – | 14.5 | – | 28.7 | – | 48.1 | 8.7 |
| Lampetra planeri | 158.4 | 172.1 | 13.7 | 12.6 | – | 34.2 | – | 51.1 | 14.7 | – | – |
| Rhodeus amarus | 309.3 | 168 | 141.3 | 1.2 | – | 29.4 | – | 37.7 | – | 23.6 | 9.2 |
| Pungitius pungitius | 138.6 | 142.9 | 4.3 | 33.5 | – | 39.7 | – | 42.1 | – | 18.3 | – |

3.5 Species niche

3.5.1 July mean air temperature

The response patterns of species occurrence to July mean air temperature were mainly bell-shaped curves, as shown in the effect graphs (Figs. 3 to 5). The main differences between species were the magnitudes of their thermal range (conditioned to the median of the other environmental variables) and their thermal optima. Indeed, species such as brown trout, minnow, stone loach and chub exhibited broad thermal ranges, whereas Douro nase, European three-spine stickleback, nase, barbel, riffle dace, grayling and Eurasian nine-spine stickleback displayed a narrow range of thermal tolerance (Figs. 3–5). Four species also presented a singular pattern of response to July mean air temperature. The brown trout appeared to tolerate a wide variety of low temperatures and its estimated probability of presence declined sharply above a certain temperature (Figs. 3 and 4). The occurrence probabilities of the Atlantic salmon and burbot were estimated to decline over the whole thermal gradient, whereas the probability of presence of bleak was estimated to rise with increasing July mean air temperature (Figs. 3 and 5). These last three species were the only species without a quadratic term for Tjul selected by the stepwise procedures (Table 8).

Concerning the species thermal optimum, two species had an optimum lower than 16°C: brown trout and grayling; four species had an optimum between 16 and 18°C: bullhead, European brook lamprey, European three-spine stickleback and minnow; nine species had an optimum between 18 and 20°C: pike, Eurasian nine-spine stickleback, perch, stone loach, dace, roach, spirlin, gudgeon, and nase; and six species had an optimum greater than 20°C: barbel, chub, Douro nase, riffle dace, European eel, and bitterling (Table 9). No optimum could be computed for the bleak, Atlantic salmon and burbot (Table 9) because the quadratic term associated with Tjul in their models was not selected. Nevertheless, bleak seemed mostly associated with high temperatures, whereas burbot and Atlantic salmon were associated with low temperatures (Table 9 and Figs. 3 and 5).


Fig. 3. Effect display of the July mean air temperature on the estimated probability of presence of the 24 species. The red lines represent the mean estimated probability, the polygon in grey represents the confidence interval around the mean fitted probabilities.

Fig. 4. Effect display of the July mean air temperature on: (a) Alburnoides bipunctatus, (b) Anguilla anguilla, (c) Barbatula barbatula, (d) Cottus gobio, (e) Esox lucius, (f) Gobio gobio, (g) Leuciscus cephalus, (h) Perca fluviatilis, (i) Phoxinus phoxinus, (j) Pseudochondrostoma duriense, (k) Rutilus rutilus, and (l) Salmo trutta. To compare species response, the probabilities were rescaled between 0 and 1.


Fig. 5. Effect display of the July mean air temperature on: (m) Alburnus alburnus, (n) Barbus barbus, (o) Chondrostoma nasus, (p) Gasterosteus aculeatus, (q) Lampetra planeri, (r) Leuciscus leuciscus, (s) Telestes souffia, (t) Lota lota, (u) Pungitius pungitius, (v) Rhodeus amarus, (w) Salmo salar, (x) Thymallus thymallus. To compare species response, the probabilities were rescaled between 0 and 1.

Table 8. Coefficients of the environmental variables selected by the stepwise procedures. A dash indicates that this variable or its quadratic form was not retained in the selected model. The logit of the probability of presence of the brown trout is computed with the formula $\operatorname{logit}(\hat{p}) = \beta_0 + \beta_{\mathrm{lslope1}}\,\mathrm{lslope} + \beta_{\mathrm{lslope2}}\,\mathrm{lslope}^2 + \beta_{\mathrm{Tjul1}}\,\mathrm{Tjul} + \beta_{\mathrm{Tjul2}}\,\mathrm{Tjul}^2 + \beta_{\mathrm{Tdif}}\,\mathrm{Tdif} + \beta_{\mathrm{lPA1}}\,\mathrm{lPA} + \beta_{\mathrm{lPA2}}\,\mathrm{lPA}^2$, where the coefficients are the values given in the Salmo trutta row (17.973, 0.754, 0.017, 1.882, 0.061, 0.115, 1.202 and 0.052, in column order).

| Species | Intercept | lslope1 | lslope2 | Tjul1 | Tjul2 | Tdif | lPA1 | lPA2 |
|---|---|---|---|---|---|---|---|---|
| Salmo trutta | 17.973 | 0.754 | 0.017 | 1.882 | 0.061 | 0.115 | 1.202 | 0.052 |
| Phoxinus phoxinus | 28.164 | 0.203 | 0.126 | 2.275 | 0.064 | – | 1.339 | 0.056 |
| Barbatula barbatula | 39.559 | 0.105 | 0.169 | 3.317 | 0.090 | 0.029 | 1.510 | 0.064 |
| Cottus gobio | 44.776 | 0.139 | 0.105 | 5.303 | 0.160 | 0.061 | – | – |
| Gobio gobio | 57.710 | 0.477 | 0.134 | 4.374 | 0.113 | 0.083 | 2.316 | 0.089 |
| Pseudochondrostoma duriense | 140.874 | 1.120 | 0.177 | 15.085 | 0.368 | – | 2.772 | 0.147 |
| Leuciscus cephalus | 53.155 | 0.011 | 0.102 | 4.052 | 0.099 | – | 1.607 | 0.048 |
| Rutilus rutilus | 53.083 | 0.757 | – | 5.346 | 0.141 | 0.105 | 0.396 | – |
| Anguilla anguilla | 4.976 | 0.550 | 0.072 | 0.822 | 0.019 | 0.347 | 0.103 | – |
| Perca fluviatilis | 59.939 | 0.587 | 0.077 | 6.088 | 0.166 | – | 0.345 | – |
| Alburnoides bipunctatus | 70.353 | 0.301 | 0.199 | 3.992 | 0.103 | 0.174 | 3.916 | 0.134 |
| Esox lucius | 33.466 | 0.871 | 0.090 | 3.333 | 0.092 | – | 0.257 | – |
| Leuciscus leuciscus | 65.576 | 0.482 | 0.208 | 5.722 | 0.152 | 0.149 | 1.943 | 0.060 |
| Alburnus alburnus | 15.160 | 0.571 | 0.174 | 0.400 | – | 0.049 | 0.505 | – |
| Gasterosteus aculeatus | 128.192 | 0.794 | 0.221 | 13.700 | 0.388 | 0.104 | 1.322 | 0.073 |
| Chondrostoma nasus | 129.901 | 0.779 | 0.335 | 11.085 | 0.282 | 0.262 | 2.996 | 0.077 |
| Salmo salar | 21.236 | – | – | 0.830 | – | 0.238 | 1.422 | 0.084 |
| Barbus barbus | 108.831 | 0.702 | 0.235 | 7.442 | 0.184 | 0.132 | 4.440 | 0.127 |
| Lota lota | 2.699 | 0.493 | 0.107 | 0.218 | – | 0.112 | 0.649 | 0.035 |
| Telestes souffia | 136.040 | 1.398 | 0.294 | 12.517 | 0.299 | 0.339 | 1.409 | 0.047 |
| Thymallus thymallus | 59.081 | 1.175 | 0.442 | 3.590 | 0.114 | 0.111 | 3.816 | 0.128 |
| Lampetra planeri | 37.863 | 0.289 | 0.176 | 4.798 | 0.138 | – | 0.386 | – |
| Rhodeus amarus | 57.611 | 0.628 | 0.149 | 3.818 | 0.087 | 0.116 | 1.976 | 0.071 |
| Pungitius pungitius | 247.269 | 0.676 | 0.226 | 26.387 | 0.724 | – | 1.745 | 0.102 |


Table 9. Species estimated optimum. Dash means that the variable was not selected by the stepwise procedure and stars that the optimum computed was outside the boundaries of the calibration data set. Optimums computed as the maximal or minimal values are followed by a star.

| Species | Tjul | lslope | lPA | Tdif |
|---|---|---|---|---|
| Salmo trutta | 15.37* | 5.69 | 11.46 | 8.40* |
| Phoxinus phoxinus | 17.87 | 0.81 | 11.88 | – |
| Barbatula barbatula | 18.46 | 0.31 | 11.82 | 24.00* |
| Cottus gobio | 16.57 | 0.66 | – | 28.50* |
| Gobio gobio | 19.36 | 1.78 | 13.01 | 24.10* |
| Pseudochondrostoma duriense | 20.51 | 3.16 | 9.42 | – |
| Leuciscus cephalus | 20.41 | 0.05 | 16.86 | – |
| Rutilus rutilus | 18.99* | 2.30 | 18.96* | 10.20* |
| Anguilla anguilla | 21.38 | 3.80 | 15.82* | 8.40* |
| Perca fluviatilis | 18.33* | 2.30 | 18.96* | – |
| Alburnoides bipunctatus | 19.30 | 0.76 | 14.61 | 23.90* |
| Esox lucius | 18.14* | 2.30 | 18.96* | – |
| Leuciscus leuciscus | 18.82 | 1.16 | 16.22 | 10.80* |
| Alburnus alburnus | 23.20 | 1.64* | 18.96* | 24.10* |
| Gasterosteus aculeatus | 17.67 | 1.80 | 9.06 | 24.00* |
| Chondrostoma nasus | 19.67 | 1.16 | 18.96* | 16.60* |
| Salmo salar | 11.90 | –* | 8.47 | 9.30* |
| Barbus barbus | 20.17 | 1.50 | 17.48 | 13.10* |
| Lota lota | 11.90* | 2.30* | 9.28 | 28.50* |
| Leuciscus souffia | 20.92 | 2.38 | 15.11 | 14.60* |
| Thymallus thymallus | 15.75 | 1.33 | 14.92 | 28.50* |
| Lampetra planeri | 17.39 | 0.82 | 6.05* | – |
| Rhodeus amarus | 21.90 | 2.11 | 13.84 | 23.90* |
| Pungitius pungitius | 18.23 | 1.50 | 8.54 | – |

3.5.2 Slope

Compared to temperature, the variation patterns of species probability of occurrence along the slope gradient were much more diverse. As for temperature, the brown trout displayed the most singular pattern of response, with a high estimated probability of presence over a wide range of high slopes and a decline in its fitted occurrence probability when slope decreased (Figs. 6 and 7). Apart from brown trout, several groups of species displayed very similar patterns of variation along the slope gradient. Indeed, species such as the minnow, the stone loach and the grayling displayed bell-shaped response curves (Figs. 6–8). Other species displayed similar response trends but with a bell-shaped curve truncated in the lowest part of the slope gradient, e.g. chub, bleak and European three-spine stickleback (Figs. 6–8). For several other species, the estimated probability of occurrence varied in a single direction along the slope gradient, decreasing with increasing slope: gudgeon, roach, perch and pike, bleak, European three-spine stickleback, bitterling and Eurasian nine-spine stickleback (Figs. 6–8). Nevertheless, the quadratic term associated with slope was selected in almost all models (Table 8). The tolerance ranges of fish species to slope varied between species but seemed in general relatively wide, except for a few species such as the grayling, bleak, bitterling and Eurasian nine-spine stickleback (Figs. 6 to 8).


The species optimum for slope was computed for only 16 species (two-thirds), either because the quadratic term was not selected (Table 8) or because the estimated optimum was outside the range of slope observed in the calibration data sets. The latter explanation applied to the great majority of species with a missing optimum. The riffle dace, barbel, grayling and nase were the four species with the highest estimated optimum for slope (Table 9 and Fig. 8), while bleak, gudgeon, European three-spine stickleback and the bitterling presented the lowest optimum (Table 9, Fig. 8). Even though no optimum could be computed for burbot, European eel, perch, pike and roach, these species seemed to be preferentially associated with low slope (Table 8 and Figs. 6–8).

Fig. 6. Effect display of the slope (in logarithm) on the estimated probability of presence of the 24 species. The red lines represent the mean estimated probability, the polygon in grey represents the confidence interval around the mean fitted probabilities.


Fig. 7. Effect display of the slope (logarithm) on: (a) Alburnoides bipunctatus, (b) Anguilla anguilla, (c) Barbatula barbatula, (d) Cottus gobio, (e) Esox lucius, (f) Gobio gobio, (g) Leuciscus cephalus, (h) Perca fluviatilis, (i) Phoxinus phoxinus, (j) Pseudochondrostoma duriense, (k) Rutilus rutilus, (l) Salmo trutta. To compare species response, the probabilities were rescaled between 0 and 1.

Fig. 8. Effect display of the slope (logarithm) on: (m) Alburnus alburnus, (n) Barbus barbus, (o) Chondrostoma nasus, (p) Gasterosteus aculeatus, (q) Lampetra planeri, (r) Leuciscus leuciscus, (s) Telestes souffia, (t) Lota lota, (u) Pungitius pungitius, (v) Rhodeus amarus, (x) Thymallus thymallus. To compare species response, the probabilities were rescaled between 0 and 1.

3.5.3 Pseudo run-off (lPA)

As revealed by the effect graphs, two main patterns of presence variation along the lPA gradient were observed (conditioned to the median of the other environmental variables; Figs. 9–11). Within the range of lPA observed in the data sets, brown trout, minnow, stone loach, gudgeon, spirlin and grayling displayed bell-shaped response curves (Figs. 9–11). European three-spine stickleback, riffle dace and Eurasian nine-spine stickleback displayed similar tendencies but with a curve truncated either in the lowest part or in the highest part of the lPA gradient (Figs. 9 and 11). Except for the European eel, Douro nase and European brook lamprey, the probability of presence of the other species was estimated to increase with increasing lPA (Figs. 9–11). Species such as chub, barbel, nase and Atlantic salmon were estimated to mostly occur at high lPA. The European brook lamprey was the only species displaying the opposite pattern: its probability of occurrence was estimated to decline along the lPA gradient. The Douro nase displayed a very singular pattern of variation, while the probability of occurrence of the eel seemed only poorly related to lPA variation. The quadratic term associated with this variable was selected for 17 models out of 23 (Table 8). Indeed, for the bullhead this environmental variable was not selected by the stepwise procedure.

The optimums of the European three-spine stickleback and the Eurasian nine-spine stickleback were located in the lowest portion of the lPA gradient, whereas grayling, riffle dace, dace, chub and barbel had the highest optimum for this environmental variable. Brown trout, stone loach and minnow had very close optimums (around 11), as well as gudgeon and bitterling (around 13) and spirlin, grayling and riffle dace (around 16; Table 9). No optimum could be computed for bleak, burbot, Douro nase, eel, lamprey, nase, perch, pike, roach and salmon. Nevertheless, the lamprey seemed to be associated with low lPA, whereas bleak, burbot, nase, perch, pike, roach and salmon were associated with high lPA (Table 8, Figs. 9–11).

Fig. 9. Effect display of the lPA (pseudo-runoff) on the estimated probability of presence of the 24 species. The red lines represent the mean estimated probability, the polygon in grey represents the confidence interval around the mean fitted probabilities.


Fig. 10. Effect display of the lPA on: (a) Alburnoides bipunctatus, (b) Anguilla anguilla, (c) Barbatula barbatula, (e) Esox lucius, (f) Gobio gobio, (g) Leuciscus cephalus, (h) Perca fluviatilis, (i) Phoxinus phoxinus, (j) Pseudochondrostoma duriense, (k) Rutilus rutilus, and (l) Salmo trutta. To compare species response, the probabilities were rescaled between 0 and 1.

Fig. 11. Effect display of the lPA on: (m) Alburnus alburnus, (n) Barbus barbus, (o) Chondrostoma nasus, (p) Gasterosteus aculeatus, (q) Lampetra planeri, (r) Leuciscus leuciscus, (s) Telestes souffia, (t) Lota lota, (u) Pungitius pungitius, (v) Rhodeus amarus, (w) Salmo salar, and (x) Thymallus thymallus. To compare species response, the probabilities were rescaled between 0 and 1.

3.5.4 Thermal amplitude

Eel was the only species for which the thermal amplitude had the greatest independent influence (Table 7). The negative coefficients associated with Tdif (Table 8) revealed that the occurrence of this species was estimated to decline with increasing thermal amplitude (Figs. 12 and 13). The probabilities of presence of bitterling, bleak, bullhead, burbot, European three-spine stickleback, grayling, gudgeon, spirlin and stone loach were estimated to increase with increasing thermal amplitude, whereas the probabilities of occurrence of barbel, brown trout, dace, nase, roach, riffle dace and salmon were estimated to decrease along the thermal amplitude gradient (Figs. 12–14).


Fig. 12. Effect display of the thermal amplitude on the estimated probability of presence of the 24 species. The red lines represent the mean estimated probability, the polygon in grey represents the confidence interval around the mean fitted probabilities.

Fig. 13. Effect display of the thermal amplitude on: (a) Alburnoides bipunctatus, (b) Anguilla anguilla, (c) Barbatula barbatula, (d) Cottus gobio, (f) Gobio gobio, (k) Rutilus rutilus, and (l) Salmo trutta. To compare species response, the probabilities were rescaled between 0 and 1.


Fig. 14. Effect display of the thermal amplitude on: (m) Alburnus alburnus, (n) Barbus barbus, (o) Chondrostoma nasus, (p) Gasterosteus aculeatus, (r) Leuciscus leuciscus, (s) Telestes souffia, (t) Lota lota, (v) Rhodeus amarus, (w) Salmo salar, (x) Thymallus thymallus. To compare species response, the probabilities were rescaled between 0 and 1.

3.5.5 Summary

The global niche of each species can be perceived through the univariate and bivariate graph displays (Appendix 1–3) and Table 10. The brown trout had the most singular niche with the most extreme optimums for Tjul and slope as well as the greatest tolerance ranges for Tjul, slope and lPA. The minnow and the stone loach displayed very similar patterns of variation along the different environmental gradients. They were estimated to mainly occur in cool temperatures, average slope and lPA, and they displayed average or wide ranges of tolerance to the three main environmental variables. The bullhead differentiated itself from the two former species by a colder thermal optimum and a narrower thermal tolerance range. The gudgeon and the dace presented a very similar niche except for the lPA optimum. Gudgeon displayed an average lPA optimum, whereas the dace optimum was high. Otherwise they were both estimated to mainly occur in average temperature and slope conditions, to have average tolerance to Tjul and slope, and high tolerance to lPA conditions. Dace was also estimated to prefer low thermal amplitude but exhibited a broad tolerance range to this environmental variable. Perch, pike and roach displayed optimums very close to dace, e.g. their estimated thermal optimums were all in the same degree. They were therefore all estimated to occur preferentially at average temperatures, low slopes and high lPA. The differences between these species were expressed in their tolerance range, varying between average and high depending on the variable and species considered. Spirlin and chub also displayed very similar trends of response to the environmental gradients, preferring warm temperatures, average slope and high lPA. Chub had a lower thermal and slope optimum, a higher lPA optimum but most of all a broader tolerance range to all environmental gradients. Grayling, nase, barbel and riffle dace were all estimated to prefer high slopes and high lPA. 
The great difference between these species was their thermal optimum, with a 5° difference between the grayling and riffle dace. The tolerance ranges of these species also varied. Bleak and eel were both estimated to prefer warm temperature, low slope and high lPA conditions, with a narrow range of tolerance for both species. Nevertheless, these results should be taken with caution for the eel because the thermal amplitude seemed to be the only major environmental determinant for the occurrence of this species. Eel displayed a narrow range of tolerance to thermal amplitude and was estimated to mainly occur in environments with a restricted thermal range. The European three-spine stickleback and Eurasian nine-spine stickleback displayed very singular niches: they were both estimated to prefer low slope conditions, with low lPA and relatively cool temperatures (species optimum around 18°C) and a narrow thermal tolerance range. Among the species for which all optimums could be computed, the bitterling displayed the most extreme optimum for the July mean air temperature and the slope. This species was thus estimated to prefer high temperatures and very low slopes with a narrow tolerance range for these two variables. Bitterling seemed to have a narrow tolerance and a low optimum along the lPA gradient. The Atlantic salmon also displayed a very particular niche, tolerating very low temperatures, very high lPA and low thermal amplitude. Salmon was the only species for which the slope was not selected. Burbot displayed very similar characteristics, with low tolerance to all environmental variables and a preference for low slopes. The European brook lamprey was estimated to occur preferentially at low temperatures and average slopes, tolerating a narrow range of thermal conditions. Finally, the patterns observed for the Douro nase were very singular, with quadratic responses that were the inverse of those of the other species (displaying a minimum instead of a maximum).

Table 10. Summary of species optimums and range values (conditioned to the median of the other environmental variables). Values in bold indicated the environmental variable for each species with the highest independent effects.

                             Optimum                              Range
Species                      Tjul      Slope    lPA      Tdif     Tjul     Slope    lPA         Tdif
Salmo trutta                 very low  high     average  low      large    large    very large  large
Phoxinus phoxinus            average   average  average  -        average  large    large       -
Barbatula barbatula          average   average  average  high     average  large    large       large
Cottus gobio                 low       average  -        high     narrow   large    -           large
Gobio gobio                  average   low      average  high     average  average  large       large
Pseudochondrostoma duriense  high      -        -        -        narrow   narrow   narrow      -
Leuciscus cephalus           high      average  high     -        large    large    large       -
Rutilus rutilus              average   low      high     low      average  narrow   narrow      average
Anguilla anguilla            high      low      -        low      large    narrow   narrow      narrow
Perca fluviatilis            average   low      high     -        narrow   narrow   average     -
Alburnoides bipunctatus      high      average  high     high     average  average  average     narrow
Esox lucius                  average   low      high     -        average  narrow   average     -
Leuciscus leuciscus          average   low      high     low      average  average  large       large
Alburnus alburnus            high      low      high     high     narrow   narrow   narrow      average
Gasterosteus aculeatus       low       low      low      high     narrow   average  large       large
Chondrostoma nasus           average   high     high     average  average  average  average     narrow
Salmo salar                  very low  -        high     low      narrow   -        narrow      narrow
Barbus barbus                high      high     high     low      average  average  large       average
Lota lota                    very low  low      high     high     narrow   narrow   narrow      narrow
Telestes souffia             high      high     high     low      narrow   average  large       narrow
Thymallus thymallus          low       high     high     high     average  narrow   average     average
Lampetra planeri             low       average  low      -        narrow   average  narrow      -
Rhodeus amarus               high      low      average  high     narrow   narrow   average     average
Pungitius pungitius          average   low      low      -        narrow   average  average     -

3.6 Confidence intervals

As shown in Figure 3, the confidence bands around the estimated probability of occurrence along the thermal gradient were mainly narrow, suggesting that the fitted responses were estimated precisely. The species displaying a larger confidence band for Tjul were: spirlin around its optimum, pike at low and high temperatures, bleak at high temperatures, nase near the extremes of its thermal range, barbel at high temperatures, burbot at very low temperatures, grayling at very low temperatures and bitterling at low and high temperatures. The relatively wide bands correspond to parts of the thermal gradient far from the mean observed in the calibration data sets and to temperatures that were only infrequently found in the calibration data sets of these species. The confidence bands estimated around the responses to slope were generally narrow, but could be very wide at the extreme edges of the slope gradients (Fig. 6). This was particularly visible for three species: nase, barbel and riffle dace. The wide bands were mainly observed in the gradient portion for which relatively few data were available. For instance, in our data set the nase never occurred at a slope (in logarithm) greater than 2.1 and the confidence band widened just after this threshold. Moreover, these wide bands concerned mainly species with a low occurrence in their calibration data sets. The three former species occurred in less than 17 % of their calibration sites (Table 3). As shown in Figure 9, the same patterns as for slope were observed for pseudo-runoff (lPA). For nearly all species, the confidence bands over the entire lPA gradient were narrow, showing that the models had good precision, but some species displayed very wide confidence bands at the extremes of the lPA distribution, e.g. European three-spine stickleback and grayling. The riffle dace was the only species that displayed a wide confidence band throughout the lPA gradient.
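The widening of the bands towards the edges of a gradient can be illustrated with a Wald-type interval computed on the logit scale and then back-transformed to a probability. This is only a sketch under stated assumptions: the interval method, the function names, the coefficients and the covariance matrix below are hypothetical and are not taken from the fitted models, and the predictor is assumed to be centred on its mean in the calibration data.

```python
import math


def expit(z):
    """Inverse logit: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def confidence_band(x_row, beta, cov, z=1.96):
    """Wald interval for the fitted probability at one design row.

    x_row : design vector, e.g. [1.0, x] for intercept and slope
    beta  : coefficient estimates (same length as x_row)
    cov   : covariance matrix of the estimates (nested lists)
    Returns (lower, fitted, upper) on the probability scale.
    """
    n = len(x_row)
    eta = sum(b * x for b, x in zip(beta, x_row))
    # Variance of the linear predictor: x' V x
    var = sum(x_row[i] * cov[i][j] * x_row[j] for i in range(n) for j in range(n))
    half_width = z * math.sqrt(var)
    return expit(eta - half_width), expit(eta), expit(eta + half_width)


# Hypothetical estimates for a centred predictor (0 = calibration mean).
BETA = [-0.5, 0.8]
COV = [[0.01, 0.0],
       [0.0, 0.04]]

for x in (0.0, 1.5, 3.0):
    lower, fitted, upper = confidence_band([1.0, x], BETA, COV)
    print(f"x={x:.1f} fitted={fitted:.3f} band width={upper - lower:.3f}")
```

Because the variance of the linear predictor grows with the squared distance from the centre of the data, the band is narrowest near the calibration mean and widens towards the extremes, which is consistent with the pattern described above.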
Concerning the thermal amplitude, the confidence bands were mostly close to the probabilities of presence estimated throughout the range of Tdif (Fig. 12). Brown trout, eel and Atlantic salmon displayed particularly thin bands, whereas spirlin, nase, barbel, riffle dace and bitterling displayed wider confidence intervals (Fig. 9).

4. Discussion and conclusions

Our results highlight that the four environmental variables used to define the species niche were major determinants of species occurrence. The estimated theoretical niches were highly variable between species. In addition, the confidence bands around the fitted probability of presence were mainly narrow, whereas the prediction bands were always very wide.

Goodness of fit

We believe that the area under the ROC curve (AUC) is a convenient and consistent estimation of the models’ overall goodness of fit, but that this statistic alone is not sufficient to provide a complete overview of the models’ accuracy. Indeed, several models in this study displayed very close AUC and good classification rates but their sensitivity (percentage of presence correctly predicted) and specificity (percentage of absence correctly predicted) varied. For instance, chub and bitterling both had an AUC equal to 0.87, which is considered very good, whereas chub presented very high sensitivity and lower specificity and bitterling displayed the opposite pattern. Sensitivity and specificity therefore provide additional information on the possible misclassifications when using these models to predict species occurrence. Whatever the statistical method used to relate species occurrence to environmental conditions, we deem it necessary to report these statistics when modelling species niche. To our knowledge, only very few studies on the niche of European freshwater fish species have provided such detailed information concerning niche model outputs (e.g. Pont et al. 2005). Moreover, using logistic regressions presents a number of advantages compared to other statistical methods. First of all, the sign of the coefficient associated with each predictor is a direct and visible assessment of the effect of a given environmental variable on the probability of presence of a given species. Indeed, whatever the inverse link function, a positive coefficient indicates that the probability of presence rises with increasing predictor values. This can be extended to predictors with a quadratic term, because a negative coefficient for the term in degree two indicates the presence of a maximum and thus of a bell-shaped relationship. GLM also offer a strong conceptual framework for the estimation of coefficients and confidence intervals (e.g. McCullagh and Nelder 1989; Agresti 2002; Collett 2002). The close match between the goodness-of-fit statistics computed on the calibration data sets and the results from the cross-validations highlights the consistency between the misclassification made on the calibration data set and the misclassification that would be made when predicting species occurrence on new sites.
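The point that two models can share the same AUC while differing in sensitivity and specificity can be sketched as follows. This is an illustrative example only: the observations, the predicted probabilities, the function names and the 0.5 classification threshold are all assumptions and do not come from the study. The AUC is computed as the proportion of presence/absence pairs in which the presence receives the higher predicted probability, with ties counted as one half.

```python
def auc(y_true, p_hat):
    """Rank-based AUC: share of (presence, absence) pairs ranked correctly."""
    pos = [p for y, p in zip(y_true, p_hat) if y == 1]
    neg = [p for y, p in zip(y_true, p_hat) if y == 0]
    wins = 0.0
    for pp in pos:
        for pn in neg:
            if pp > pn:
                wins += 1.0
            elif pp == pn:
                wins += 0.5   # ties count for one half
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(y_true, p_hat, threshold=0.5):
    """Sensitivity and specificity when predicting presence if p >= threshold."""
    tp = sum(1 for y, p in zip(y_true, p_hat) if y == 1 and p >= threshold)
    fn = sum(1 for y, p in zip(y_true, p_hat) if y == 1 and p < threshold)
    tn = sum(1 for y, p in zip(y_true, p_hat) if y == 0 and p < threshold)
    fp = sum(1 for y, p in zip(y_true, p_hat) if y == 0 and p >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical data: 1 = presence, 0 = absence.
Y = [1, 1, 1, 0, 0, 0]
P_MODEL_A = [0.90, 0.80, 0.60, 0.70, 0.55, 0.20]   # tends to over-predict presence
P_MODEL_B = [0.54, 0.48, 0.36, 0.42, 0.33, 0.12]   # same ranking, lower probabilities

for name, p in (("A", P_MODEL_A), ("B", P_MODEL_B)):
    sens, spec = sensitivity_specificity(Y, p)
    print(f"model {name}: AUC={auc(Y, p):.3f} sensitivity={sens:.2f} specificity={spec:.2f}")
```

Both hypothetical models rank the sites identically, so their AUC is the same, but at a fixed threshold one favours sensitivity and the other specificity, which is why reporting all three statistics is informative.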
Among the 24 species studied, the minnow, stone loach, bullhead and European brook lamprey displayed low AUC values, specificity and sensitivity. These results suggest that other factors not taken into account in the initial model control the distribution of these species. Other environmental variables probably also explain the presence/absence of these four species. Moreover, biotic interactions such as competitive exclusion may have blurred the observed relationships between the physical environment and species occurrence. In addition, it is also possible that our data set was not sufficient to consistently capture the niche of certain species. This data set was biased towards low and average Strahler ranks, with less than 10 % of the sites located in a stream with a rank greater than 4. Two main factors could account for this unbalanced data set. First, due to the sampling problems in very large rivers, only very few data were available for these systems. Second, and more importantly, it was difficult to identify sites located in large rivers that were not impacted or only slightly impacted. The degree of alteration mostly increases along the upstream–downstream gradient, as human activities generally increase in the most downstream part. It is therefore possible that this data set did not cover all the environmental conditions suitable for some species. Despite these limitations, we firmly believe that the spatial extent of this data set, from Portugal to Sweden and Romania, provides a more relevant and accurate visualisation of the species niche than other recent studies working at the regional or national resolution (Buisson et al. 2008; Pont et al. 2005). First, by working at the continental resolution we expect to encounter a wider range of environmental conditions (combinations of environmental variables) and would therefore be able to disentangle more efficiently the effect of each environmental variable on species occurrence. This pattern should be increasingly consistent as the species distribution areas increase. For instance, the niche of the Douro nase could be investigated working at the regional resolution, as this species is endemic to the Duero Basin (both in Portugal and Spain; Kottelat and Freyhof 2007; Mesquita and Coelho 2002). By contrast, for a widespread species such as the brown trout, working at the regional resolution would only provide a narrow vision of the environmental conditions suitable for this species. Moreover, even if the species prevalence in our data sets was close to those observed in previous studies, the number of occurrences was always higher. Consequently, the coefficients relating the probability of presence of a given species to the predictors were estimated on a greater number of observations. For this reason, we expected these estimations to be more representative of the effect of these variables on species occurrence. The relatively narrow confidence bands around the estimated probabilities of presence observed on each graph display partly confirmed the stability of the coefficient estimation and thus the consistency of the observed patterns. Nevertheless, for some species relatively wide confidence bands were observed in some parts of the environmental gradients. The substantial uncertainty observed in these environmental conditions was due to the absence of recorded presences in these parts of the environmental gradient.

Species niche

Among the four environmental variables included in the initial models, the July mean air temperature was the only variable always selected by the stepwise procedure. Even if this result is not surprising (e.g. Wehrly et al. 2003), it demonstrates the importance of temperature for the distribution of fish species in the hydrographic network.
As expected from earlier studies, nearly every species displayed a bell-shaped response to the thermal gradient. These species therefore displayed a theoretical optimum and a thermal tolerance range outside which they would not occur. Burbot, Atlantic salmon and bleak were the only species displaying alternative patterns, with a monotonic response along the thermal gradient. It can be assumed that the range of temperatures observed in our data set was not sufficient to provide a complete representation of the thermal tolerance of these species. Among the species studied, the brown trout displayed the most singular pattern of response to temperature. It occupied a very particular niche, mostly found at low temperatures but estimated to be able to sustain a wide range of thermal conditions. On the contrary, Eurasian nine-spine stickleback, European three-spine stickleback, Douro nase, riffle dace and bitterling displayed narrow thermal ranges and avoided low temperatures. The hierarchical partitioning revealed that the slope was on average the environmental variable with the highest independent effect on species probability of occurrence. These results are in accordance with a previous study at the national resolution in France (Pont et al. 2005). Moreover, the distribution of the theoretical species optimums observed in our study nearly matches Huet’s fish zonation (1954). The brown trout was estimated to occur on the highest slopes and to be able to occur within a wide range of slopes. The bullhead and minnow, two species associated with brown trout, presented lower optimums but substantial overlaps with the brown trout niche. Concerning the auxiliary species associated with the grayling zone, the barbel displayed a higher estimated optimum for slope than the grayling. Nase and chub displayed slope optimums lower than those of the other two species, in accordance with Huet’s zonation. Consistent with Huet’s definition of the barbel zone, these three rheophilic cyprinids demonstrated a large overlap in their tolerance to slope. The pseudo run-off (logarithm of annual precipitation multiplied by the area of the drainage basin) was also selected for 23 species (not retained for the bullhead) and was the major environmental factor for spirlin, bleak, nase, barbel and grayling. On the contrary, the effect of lPA was relatively limited for eel, European three-spine stickleback and riffle dace. Brown trout, minnow, stone loach, gudgeon, spirlin, grayling and bitterling displayed bell-shaped relationships between lPA and their probability of presence. European three-spine stickleback, European brook lamprey and Eurasian nine-spine stickleback were the only three species estimated to prefer low lPA. The occurrence of the ten other species was estimated to increase along the lPA gradient. The thermal amplitude between January and July had only a slight influence on species occurrence, as revealed both by the hierarchical partitioning and its selection for only two-thirds of the species. Nevertheless, this variable was the major environmental determinant for the European eel. Eel was estimated to prefer areas with low thermal amplitude, characteristic of an oceanic climate. Indeed, in our data set this catadromous species was mainly recorded in coastal streams or in streams a short distance from the sea. The migratory behaviour of this species probably limits our ability to assess its niche accurately. Indeed, if eel is sampled along its migratory pathway, its presence will be related to a great extent to environmental conditions that may not reflect the real niche of this species. This could explain, at least in part, the low effect of the other environmental variables on its estimated occurrence. Both the univariate (Fox 1987, 2003) and bivariate graph displays highlighted the great diversity of species response along environmental gradients. This result is in accordance with Pont et al. (2005), who highlighted the variety of niches of the riverine fish species at the national resolution for France. Indeed, both the theoretical optimums and the tolerance ranges were highly variable between species and depended on the environmental variable considered. Among the 24 species studied, the brown trout was probably the species displaying the most singular niche. The brown trout was the only species that occurred on very high slopes and at very low temperatures, two environmental conditions that are hostile for almost all the other species. But most of all, the greatest particularity of this species is its plasticity, being able to tolerate a wide range of environmental conditions. On the other hand, the Eurasian nine-spine stickleback displayed a narrow niche, occurring on very low slopes and in a limited thermal window. These results are consistent with the study conducted by Buisson et al. (2008b), in which the brown trout displayed the highest niche breadth, whereas the Eurasian nine-spine stickleback had the lowest tolerance value. Moreover, the niche breadth of the 30 species was highly variable (Buisson et al. 2008b), confirming the patterns observed in the various graph displays. The singular curvilinear responses between the probabilities of presence of the Douro nase and slope, Douro nase and lPA, and burbot and lPA more likely reflect a statistical artefact than a meaningful ecological pattern. Even if these quadratic terms were selected, they should be removed in further developments as they are not related to any ecological assumptions (Austin 2007).


Perspectives for global change

The main objective of the numerous studies developing niche models was to predict the evolution of species distribution areas under various projected scenarios of climate change (Buisson et al. 2008b; Lassale et al. 2008; Tirelli and Pessani 2009). Due to the strong relation observed between thermal conditions and the estimated probabilities of presence of fish species (Buisson et al. 2008a; Jackson et al. 2001; Pont et al. 2005), some modifications of the current distribution areas of riverine fish species should be expected (Buisson et al. 2008b). Nevertheless, the magnitude of these modifications may vary depending on the species. It should be expected that the species with very limited niche breadth will be more affected by thermal modifications than eurythermal fish species, which would be able to cope with a broader range of temperatures. It is also clear that the ability of species to disperse through the hydrographic network and the amount of suitable areas will also play a major role in the delimitation of the future distribution areas. To date, the studies conducted on the effect of climate change on species distribution have only focused on temperature. Our study demonstrates a tendency for precipitation to also be a key climatic factor acting on species occurrence. Contrary to temperature, no scenarios illustrating changes in precipitation are available; such scenarios should be integrated into the forecasting of species distribution. Criticisms of the use of species distribution models to predict the future evolution of distribution areas have recently been expressed. First, although the necessity of providing the uncertainty around the predictions has been recognised by some authors (Buisson et al. 2010), the great majority of the studies on climate change only provide the predicted probability of species presence. Nevertheless, even for generalised linear models, which are very well-known models with strong theoretical foundations, to our knowledge computing prediction intervals remains an open issue, especially for logistic regression. The use of historical data could constitute a credible alternative to assess the predictive power of the models. Indeed, comparing the model outputs to the historical presence of the species should provide reliable information concerning the error of predictions. Second, some authors have raised theoretical doubts concerning the use of species distribution models. Pearman et al. (2008) emphasised that predicting the future species distribution area using niche models is based on the hypothesis of niche conservatism. Indeed, our models are based on the realised species niche, which is only a portion of the fundamental niche. Therefore, it should be expected that in a new environment a species extends its realised niche towards an unoccupied portion of its fundamental niche and thus occurs in some unexpected environmental conditions (Pearman et al. 2008). We are convinced that using a data set that covers almost all of Europe and almost the entire current species distribution areas (realised niche) would provide a more reliable estimation of how species would react to global change. Compared to studies conducted at the national resolution, by working at the European resolution we could encompass larger ranges of environmental conditions and thus move closer to the species fundamental niche. In the future, this study could be extended to: 1) compare the estimated species optimums with the values observed in the literature, 2) use some of the thermal change scenarios to estimate the possible modification of the species distribution area, 3) add the uncertainty related to these predictions, 4) also include the precipitation scenario, and 5) use different types of statistical models or machine learning to compare their ability to predict the distributions of European riverine fish species.
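Extension 2) above can be sketched in a few lines. This is a hypothetical illustration and not a result of the study: it assumes a quadratic logistic response to the July mean air temperature alone, with invented coefficients giving an optimum at 18 °C, and applies a uniform warming of 2 °C chosen only as an example value. The function names, the site temperatures and the warming value are all assumptions.

```python
import math


def presence_probability(tjul, b0, b1, b2):
    """Probability of presence for logit(p) = b0 + b1*Tjul + b2*Tjul**2."""
    eta = b0 + b1 * tjul + b2 * tjul ** 2
    return 1.0 / (1.0 + math.exp(-eta))


def scenario_change(site_temps, coefs, delta_t):
    """Compare current and future predictions under a uniform warming delta_t.

    Returns a list of (temperature, p_current, p_future) tuples.
    """
    b0, b1, b2 = coefs
    rows = []
    for t in site_temps:
        p_now = presence_probability(t, b0, b1, b2)
        p_future = presence_probability(t + delta_t, b0, b1, b2)
        rows.append((t, p_now, p_future))
    return rows


# Hypothetical coefficients: optimum at -b1 / (2 * b2) = 18 degrees C.
COEFS = (-25.92, 2.88, -0.08)
SITES = [14.0, 18.0, 22.0]   # hypothetical July mean air temperatures
WARMING = 2.0                # illustrative value, not a published scenario

for t, p_now, p_future in scenario_change(SITES, COEFS, WARMING):
    print(f"Tjul={t:.0f}: current p={p_now:.3f}, future p={p_future:.3f}")
```

Under these assumptions, sites colder than the optimum become more suitable and sites at or above the optimum become less suitable, which is the kind of species-specific shift discussed above. Such a projection ignores dispersal limits and the uncertainty around the coefficients, both of which the text identifies as necessary additions.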

5. Acknowledgments

This paper is a result of the project WISER (Water bodies in Europe: Integrative Systems to assess Ecological status and Recovery) funded by the European Union under the 7th Framework Programme, Theme 6 (Environment including Climate Change) (contract No. 226273), www.wiser.eu.

6. References

Agresti, 2002. Categorical Data Analysis, 2nd ed. John Wiley & Sons, Inc., Hoboken, New Jersey. Allan, J.D., Castillo, M.M., 2007. Stream Ecology: Structure and Function of Running Waters, 2nd ed. Kluwer Academic Publishers, Boston. Altman, D.G., Machin, D., Bryant, T.N., Gardner, M.J., 2000. Statistics with Confidence, 2nd ed. BMJ Books, London. Austin, M.P., 2002. Spatial prediction of species distribution: an interface between ecological theory and statistical modelling. Ecol. Model. 157, 101-118. Austin, M.P., 2007. Species distribution models and ecological theory: a critical assessment and some possible new approaches. Ecol. Model. 200, 1-19. Belsey, D., Kuh, E., Welsch, R., 1980. Regressions Diagnostics. John Wiley & Sons, New York. Box, G.E.P., Norman, R.D., 1987. Empirical Model-Building and Response Surfaces. Wiley, New York. Buisson, L., Blanc, L., Grenouillet, G.. 2008a. Modelling stream fish species distribution in a river network: the relative effects of temperature versus physical factors. Ecol. Freshw. Fish. 17, 244- 257. Buisson, L., Thuillier, W., Lek, S., Lim, P., Grenouillet, G., 2008b. Climate change hastens the turnover of stream fish assemblages. Glob. Change Biol. 14, 2232-2248. Buja, A., 2000. On "Additive Logistic Regression: A Statistical View of Boosting" by Friedman, J.H., Hastie, T., and Tibshirani, R. The Annals of Statistics. 28, 387-391. Burnham, K.P., Anderson, D.R., 2002. Model selection and multimodel inference: a practical information-theoretic approach, 2nd ed. Springer-Verlag, New York. Canty, A., Ripley, B., 2009. boot: Bootstrap R (S-Plus) Functions. R package version 1.2-41. Chatterjee, S., Hadi, A.S., Price, B., 2000. Regression Analysis by Example, 3rd ed. John Wiley & Sons, New York. Chevan, A., Sutherland, M., 1991. Hierarchical partitioning. The American Statistician. 45, 90-96. Collett, D., 2002. Modelling Binary Data. Champman & Hall/CRC, Boca Raton, Florida. Conger, A.J., 1980. 
Integration and generalization of kappas for multiple raters. Psychological Bulletin. 88, 322-328. Davidson, A.C., Hinkley, D.V., 1997. Bootstrap Methods and Their Application. Cambridge University Press. De Jong, P., Heller, G.Z., 2008. Generalized Linear Models for Insurance Data. Cambridge University Press, Cambridge. Efron, B., Tibshirani, R., 1993. An Introduction to the Bootstrap. Chapman & Hall/CRC, Boca Raton, Florida. Elith, J., Graham, C.H., 2009. Do they? How do they? WHY do they differ? On finding reasons for differing performances of species distribution models. Ecography. 32, 66-77. Faraway, J.J., 2006. Extending the Linear Model with R. Generalized Linear, Mixed Effects and Nonparametric Regression Models. Chapman & Hall/CRC, Boca Raton, Florida. Fiedling, A.H., Bell, J.F., 1997. A review of methods for the assessment of prediction error in conservation resence/absence models. Environ. Conserv. 24, 38-49. Fox, J., 1987. Effect displays for generalized linear models. Sociological Methodology. 17, 347-361. Fox, J., 1997. Applied Regression, Linear Models, and Related Methods. Sage.

319 P7: Modelling ecological niche of fish species at the European scale: sensitivity to climate variables (temperature, run-off) and associated uncertainties.

Fox, J., 2003. Effect displays in R for generalised linear models. Journal of Statistical Software. 8, 1-27.
French, B., Lumley, T., Monks, S.A., Rice, K.M., Hindorff, L.A., Reiner, A.P., Psaty, B.M., 2006. Simple estimates of haplotype relative risks in case-control data. Genet. Epidemiol. 30, 485-494.
Frissell, C.A., Liss, W.J., Warren, C.E., Hurley, M.D., 1986. A hierarchical framework for stream habitat classification: viewing streams in a watershed context. Environ. Manage. 10, 199-214.
Guisan, A., Edwards Jr, T.C., Hastie, T., 2002. Generalized linear and generalized additive models in studies of species distributions: setting the scene. Ecol. Model. 157, 89-100.
Guisan, A., Zimmermann, N.E., 2000. Predictive habitat distribution models in ecology. Ecol. Model. 135, 147-186.
Harrell, F.E., 2001. Regression Modeling Strategies With Applications to Linear Models, Logistic Regression, and Survival Analysis. Springer, New York.
Hastie, T., Tibshirani, R., Friedman, J., 2009. The Elements of Statistical Learning: Data Mining, Inference, and Prediction, 2nd ed. Springer, New York.
Hellmann, J.J., Fowler, G.W., 1999. Bias, precision and accuracy of four measures of species richness. Ecol. Appl. 9, 824-834.
Hosmer, D.W., Lemeshow, S., 2000. Applied Logistic Regression, 2nd ed. John Wiley & Sons, Inc., New York.
Huet, M., 1949. Aperçu des relations entre la pente et les populations piscicoles des eaux courantes. Revue Suisse d'Hydrobiologie. 11, 332-351.
Jackson, D.A., Peres-Neto, P.R., Olden, J.D., 2001. What controls who is where in freshwater fish communities - the roles of biotic, abiotic, and spatial factors. Can. J. Fish. Aquat. Sci. 58, 157-170.
Jongman, R.H.G., Ter Braak, C.J.F., Van Tongeren, O.F.R., 1995. Data Analysis in Community and Landscape Ecology. Cambridge University Press, Cambridge, United Kingdom.
King, G., Zeng, L., 2001. Logistic regression in rare events data. Political Analysis. 9, 137-163.
Kottelat, M., Freyhof, J., 2007. Handbook of European freshwater fishes. Publications Kottelat, Cornol, Switzerland.
Kutner, M.H., Nachtsheim, C.J., Neter, J., Li, W., 2005. Applied Linear Statistical Models, 5th ed. McGraw-Hill/Irwin, New York.
Lassalle, G., Béguer, M., Beaulaton, L., Rochard, E., 2008. Diadromous fish conservation plans need to consider global warming issues: an approach using biogeographical models. Biol. Conservation. 141, 1105-1118.
Leathwick, J.R., Elith, J., Hastie, T., 2006. Comparative performance of generalized additive models and multivariate adaptive regression splines for statistical modelling of species distributions. Ecol. Model. 199, 188-196.
Leathwick, J.R., Moilanen, A., Francis, M., Elith, J., Taylor, P., Julian, K., Hastie, T., Duffy, C., 2008. Novel methods for the design and evaluation of marine protected areas in offshore waters. Conservation Letters. 1, 91-102.
Liang, K.-Y., Zeger, S., 1986. Longitudinal data analysis using generalized linear models. Biometrika. 73, 13-22.
Magurran, A.E., 1988. Ecological diversity and its measurement. Princeton University Press, Princeton, N. J.
Matthews, W.J., 1998. Patterns in Freshwater Fish Ecology. Chapman & Hall, New York.
McCullagh, P., Nelder, J.A., 1989. Generalized Linear Models, 2nd ed. Chapman and Hall, London.
Mease, D., Wyner, A., 2008. Evidence contrary to the statistical view of boosting. Journal of Machine Learning Research. 9, 131-156.
Mesquita, N., Coelho, F., 2002. The ichthyofauna of the small Mediterranean-type drainages of Portugal: its importance for conservation. Pages 65-71 in M. J. Collares-Pereira, I. G. Cowx and F. Coelho (editors). Conservation of Freshwater Fish: Options for the Future. Fishing News Books, Blackwell Science, Oxford.
Minshall, G.W., 1988. Stream ecosystem theory: a global perspective. J. N. Am. Benthol. Soc. 7, 263-288.
Montgomery, D.C., Peck, E.A., Vining, G.G., 2006. Introduction to Linear Regression Analysis, 4th ed. Wiley Series in Probability and Statistics, New York.
Nelder, J.A., Wedderburn, R.W.M., 1972. Generalized linear models. Journal of the Royal Statistical Society, Series A. 135, 370-384.
Oberdorff, T., Pont, D., Hugueny, B., Chessel, D., 2001. A probabilistic model characterizing fish assemblages of French rivers: a framework for environmental assessment. Freshwater Biology. 46, 399-415.
Poff, N.L., Allan, J.D., Bain, M.B., Karr, J.R., Prestegaard, K.L., Richter, B.D., Sparks, R.E., Stromberg, J.C., 1997. The natural flow regime: A paradigm for river conservation and restoration. Bioscience. 47, 769-784.
Pont, D., Hugueny, B., Beier, U., Goffaux, D., Melcher, A., Noble, R., Rogers, C., Roset, N., Schmutz, S., 2006. Assessing river biotic condition at a continental scale: a European approach using functional metrics and fish assemblages. Journal of Applied Ecology. 43, 70-80.
Pont, D., Hugueny, B., Oberdorff, T., 2005. Modelling habitat requirement of European fishes: do species have similar responses to local and regional environmental constraints? Canadian Journal of Fisheries and Aquatic Sciences. 62, 163-173.
Pont, D., Hugueny, B., Rogers, C., 2007. Development of a fish-based index for the assessment of river health in Europe: the European Fish Index. Fisheries Management and Ecology. 14, 427-439.
R Development Core Team, 2008. R: A language and environment for statistical computing. R Foundation for Statistical Computing.
Reyjol, Y., Hugueny, B., Pont, D., Bianco, P.G., Beier, U., Caiola, N., Casals, F., Cowx, I., Economou, A., Ferreira, T., Haidvogl, G., Noble, R., De Sostoa, A., Vigneron, T., Virbickas, T., 2007. Patterns in species richness and endemism of European freshwater fish. Global Ecology and Biogeography. 16, 65-75.
Scott, J.M., Heglund, P.J., Samson, F., Haufler, J., Morrison, M., Raphael, M., Wall, B., 2002. Predicting Species Occurrences: Issues of Accuracy and Scale. Island Press, Covelo, CA.
Snee, R.D., 1977. Validation of regression models: methods and examples. Technometrics. 19, 415-428.
Statzner, B., Gore, J.A., Resh, V.H., 1988. Hydraulic stream ecology: observed patterns and potential applications. J. N. Am. Benthol. Soc. 7, 307-360.
Tirelli, T., Pessani, D., 2009. Use of decision tree and artificial neural network approaches to model presence/absence of Telestes muticellus in Piedmont (North-Western Italy). River Res. Appl. 25, 1001-1012.
Van Tongeren, O.F.R., 1995. Data analysis or simulation model: a critical evaluation of some methods. Ecol. Model. 78, 51-60.
Vaz, S., Martin, C.S., Eastwood, P.D., Ernande, B., Carpentier, A., Meaden, G.J., Coppin, F., 2008. Modelling species distributions using regression quantiles. J. Appl. Ecol. 45, 204-217.
Venables, W.N., Ripley, B.D., 2002. Modern Applied Statistics with S, 4th ed. Springer, New York.
Walsh, C., Mac Nally, R., 2008. hier.part: Hierarchical Partitioning. R package version 1.0-3.
Wehrly, K.E., Wiley, M.J., Seelbach, P.W., 2003. Classifying regional variation in thermal regime based on stream fish community patterns. Trans. Am. Fish. Soc. 132, 18-38.

7. Appendices

List of appendices

Appendix 1. Bivariate effect display of slope and July mean air temperature

Appendix 2. Bivariate effect display of lPA and July mean air temperature

Appendix 3. Bivariate effect display of slope and lPA

Appendix 1. Bivariate effect display of slope (x-axis) and July mean air temperature (y-axis) on the species probabilities of occurrence. The darker the points, the higher the probability of presence. The grey rectangle delineates the environmental range observed in the calibration data set. A star after a species name indicates that these two variables had the two greatest independent effects.

Appendix 2. Bivariate effect display of lPA (x-axis) and July mean air temperature (y-axis) on the species probabilities of occurrence. The darker the points, the higher the probability of presence. The grey rectangle delineates the environmental range observed in the calibration data set. A star after a species name indicates that these two variables had the two greatest independent effects.

Appendix 3. Bivariate effect display of slope (x-axis) and lPA (y-axis) on the species probabilities of occurrence. The darker the points, the higher the probability of presence. The grey rectangle delineates the environmental range observed in the calibration data set. A star after a species name indicates that these two variables had the two greatest independent effects.

Discussion and Perspectives


The objectives of this work were to:
- test certain hypotheses underlying the development of a predictive multimetric index usable at a large spatial extent, and use the results of this research to develop the new European fish index;
- go beyond the development of the index by studying the possibility of associating an uncertainty with metric and index scores;
- evaluate the sensitivity of the index to climate change.

The different tests showed:
1. the close relation between some trait categories among fish assemblages, in particular along a tolerance and reproduction gradient;
2. the variation of assemblage structure between ecoregions: some ecoregions present assemblages mainly dominated by eurythermal tolerant species (e.g. the Mediterranean region), whereas others display assemblages dominated by stenothermal intolerant species (e.g. the Alps);
3. the evolution of assemblage structure along a thermal and a physical gradient;
4. the relative convergence of the responses to environmental gradients of assemblages coming from different regional species pools;
5. the influence of temperature and physical factors on species distribution and species life history traits.

The results from the RDA and from the hierarchical partitioning analyses demonstrated that climatic conditions and the physical structure of streams are key environmental factors for species life history traits, species distribution and functional assemblage structure. These analyses also revealed that the influences of these two factors (climate and physical structure) on the different levels of community organisation seem relatively independent. This is shown by the orthogonality of the two gradients on the correlation circle of the RDA and by the low joint effects of these two factors relative to their independent contributions.

These analyses also define two main assemblage types based on their functional structure. A major consequence of these results was to consider these two main assemblage types when developing the new index. The distinction between intolerant and tolerant assemblages was used to take into account the specificity of these assemblages, which present very different characteristics that allow them to cope with the environmental conditions they live in. Intolerant assemblages are principally associated with small and/or cold streams, and correspond mainly to assemblages dominated by salmonids in Europe (Huet et al. 1954, Ferreira et al. 2007, Melcher et al. 2007). In contrast, tolerant assemblages are mainly associated with larger and/or warmer streams and are mainly dominated by cyprinids. Using this classification makes it possible to select sets of metrics specific to each assemblage type. The great advantage is that the observed assemblage structure is compared to metrics whose expected values are never null and always positive. Indeed, the metrics were selected to be representative of these assemblages. When a functional trait is naturally absent from an assemblage in a given environment, it is difficult to know what is quantified by the deviation between the observed and expected metric values. This is even truer if a pressure affects the species presenting this attribute. If both observed and expected metric values are null, then it becomes impossible to know whether the observed value results from the environment or from the impact of a pressure (Harris and Silveira 1999).

Finally, two indices based on assemblage characteristics were developed during the EFI+ project. These two indices integrate a reduced number of metrics compared to the majority of multimetric indices, even those developed for low-diversity faunas (e.g. Lyons et al. 1996, Langdon 2001, Southerland et al. 2007). In comparison, the previous European fish index, EFI, integrated ten metrics (Pont et al. 2006, 2007). The limited number of metrics used was both a deliberate choice and the consequence of practical restrictions. Since the first IBI (Karr 1981) and its numerous developments (e.g. Fausch et al. 1984, Angermeier and Karr 1986, Karr 1991, Karr and Chu 1999), the use of numerous metrics has been strongly advocated. The idea was that each metric provides different information about assemblages: species richness, number of individuals, number of particular taxa or species groups (guilds), number of intolerant taxa, relative abundance of intolerant individuals, individual health (percentage of individuals with external abnormalities, diseases, etc.) (Karr and Chu 1999). Each metric, or type of metric, is expected to respond to different pressures or to different levels of degradation (Angermeier and Karr 1986, Karr and Chu 1999).

In practice, metrics are never completely independent. First, whatever units are used to express them (number of species, number of individuals, biomass, etc.), they are all computed from the same individuals caught at a given site. Second, the analysis of the relations between traits among European fish assemblages (P1) clearly demonstrated the high interdependency of several trait categories. Some of these categories could be considered redundant. The four metrics based on species traits and body size were also highly correlated (P3). Commonly, during the development of a multimetric index, a maximum correlation threshold is used to select the candidate metrics that will be integrated into the final index score (Hughes et al. 1998, Hering et al. 2006, Pont et al. 2006, 2007, Roset et al. 2007, Stoddard et al. 2008). In addition, the effect of the metric correlations on the uncertainty associated with the index score was clearly underlined (P5). The more metrics are integrated into the index, the greater the correlations between these metrics and the higher the uncertainty associated with the index score. The low number of metrics (two) taken into account in the final computation of the new European fish index, EFI+, therefore results both from the substantial redundancy between trait categories among assemblages, which limits the number of candidate metrics, and from the need to limit the uncertainty associated with the index score. Nevertheless, taking into account only two metrics probably limits the sensitivity of these indices. EFI+ is potentially more sensitive to certain types of pressure, to considerable impairment, or to combinations of pressures. The number of metrics should therefore be balanced so as to detect as many pressures as possible while limiting the uncertainty associated with the index score.
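The role of metric correlation in this trade-off can be illustrated with a simple calculation. The sketch below assumes, purely for illustration, that the index is the arithmetic mean of k metric scores sharing the same variance and the same pairwise correlation; neither assumption is taken from P5, and the function name `index_variance` is hypothetical. Under these assumptions the variance of the index is sigma2 * (1 + (k - 1) * rho) / k, so a positive correlation cancels most of the reduction in uncertainty that averaging independent metrics would bring.

```python
def index_variance(k, sigma2=1.0, rho=0.0):
    """Variance of the mean of k equicorrelated metric scores.

    Hypothetical illustration: every metric has variance sigma2 and every
    pair of metrics has correlation rho (assumptions, not the P5 method).
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    # Equicorrelation is only possible for rho >= -1 / (k - 1).
    if not -1.0 / max(k - 1, 1) <= rho <= 1.0:
        raise ValueError("rho is outside the range allowed for k metrics")
    return sigma2 * (1.0 + (k - 1) * rho) / k


if __name__ == "__main__":
    for rho in (0.0, 0.5, 0.9):
        row = [round(index_variance(k, rho=rho), 3) for k in (1, 2, 5, 10)]
        print(f"rho={rho}: variance for k=1,2,5,10 -> {row}")
```

With rho = 0.9, going from two to ten metrics only lowers the variance from 0.95 to 0.91 in this toy setting, whereas with independent metrics it would drop from 0.5 to 0.1.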

The results from sections P1, P2 and P3 confirm the need to control for the environmental variability of the trait categories so that stream condition assessment is comparable and independent of the environment. On the whole, Iberian and Western European fish faunas exhibit convergent responses to environmental gradients. The metrics principally display either parallel responses to environmental variability or similar responses of different magnitudes. Models 2 and 3 were mainly selected after the comparison of the three nested models and the multiple test procedure. Nevertheless, the statistical approach used to test the convergence of the communities does not pursue the same objectives as the development of a multimetric index. In bioindication, statistical models are used to predict the expected values of a metric in a given environment, in the absence of pressure. Estimating the deviation between observed and expected values is central to developing an index following the reference condition approach (e.g. Pont et al. 2006, 2007, 2009). The major issue here is to know whether the same model can be used to predict the expected metric values for every European stream or whether different models are needed for different European regions. Whether or not to take the region into account in the model must be determined individually for each metric. On the other hand, using a multiple test procedure to test for assemblage convergence between the Iberian Peninsula and Western Europe (P3) serves a multiple inference purpose. The aim is to control a class of type I error rates (FWER, FDR, etc.) because a large set of null hypotheses is tested simultaneously. Therefore, the reasoning applies to the whole set of metrics tested (more precisely, to the whole set of null hypotheses tested) (e.g. Lamouroux et al. 2002) and not to a single metric. The use of the procedure of Benjamini and Yekutieli (2001), which controls the false discovery rate (FDR), and of adjusted p-values (Dudoit and van der Laan 2006) may have hidden possible divergence between the two regions (P3). These procedures adjust the threshold at which the differences between two models are considered significant. Without these corrections, the regional divergence between Western Europe and the Iberian Peninsula would likely have appeared more pronounced, and so would the necessity to integrate the region into the models (Oberdorff et al. 2002, Pont et al. 2006, 2007).

In parallel to the functional convergence, it would be interesting to study the convergence of metric responses to anthropogenic pressures. Generally, metric sensitivity is assessed along a global pressure index (P3, P4) (Karr and Chu 1999, Pont et al. 2006, 2007, Quataert et al. 2007). In practice, only the metrics with a low overlap between the scores of reference sites and the scores of impacted sites are used (Stoddard et al. 2008, Pont et al. 2009). Depending on the regional species pool, the metric sensitivity to pressures varies. Some regions, such as the Iberian Peninsula, display fish faunas adapted to specific environmental conditions. Species living in stream sections exposed to the unpredictability and harshness of the Mediterranean climate are probably more tolerant than species living in temperate areas. Consequently, Mediterranean assemblages are probably less sensitive to human pressures than temperate assemblages.
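The Benjamini and Yekutieli (2001) adjustment discussed above can be sketched in a few lines. This is a minimal pure-Python illustration of the standard step-up procedure, not the code used in P3; the function name `by_adjust` and the example p-values are hypothetical. Each sorted p-value is multiplied by m * c(m) / rank, where m is the number of tests and c(m) = 1 + 1/2 + ... + 1/m, and monotonicity is then enforced from the largest p-value downwards.

```python
def by_adjust(pvalues):
    """Benjamini-Yekutieli adjusted p-values (FDR under arbitrary dependence)."""
    m = len(pvalues)
    if m == 0:
        return []
    # Harmonic correction factor c(m) specific to the BY procedure.
    c_m = sum(1.0 / i for i in range(1, m + 1))
    # Indices of the p-values sorted in increasing order.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Step-up: walk from the largest p-value to the smallest.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        candidate = pvalues[idx] * m * c_m / rank
        running_min = min(running_min, candidate)
        adjusted[idx] = running_min
    return adjusted


if __name__ == "__main__":
    raw = [0.001, 0.01, 0.03, 0.5]  # made-up p-values, one per metric
    for p, q in zip(raw, by_adjust(raw)):
        print(f"raw={p:.3f}  BY-adjusted={q:.4f}")
```

Because of the factor c(m), the adjusted values grow quickly with the number of metrics tested, which is consistent with the remark that the correction may mask regional divergence. In R, `p.adjust(p, method = "BY")` provides the same adjustment.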
The sensitivity of a multimetric index that can be used at a large spatial extent could therefore vary depending on the region where it is used (Quataert et al. 2007). Metric sensitivity should also vary depending on the environmental conditions in which the pressure occurs. The extreme manifestation of the interaction between environment and pressure is visible when the response of a metric to the same pressure varies depending on the environment. Several authors (e.g. Lyons et al. 1996, Maret 1999) observed that species richness very often declines with habitat degradation in warm-water streams, whereas it tends to increase with habitat degradation in cold-water streams. In practice these metrics are very difficult to work with and are rarely used. The distribution of the scores of reference sites overlaps with the distribution of the scores of degraded sites, which are much more widely scattered. Only metrics with a single type of response to pressure (values decreasing, or increasing, with pressure) are easily used. The environment–pressure interaction could also modulate metric sensitivity. The same pressure may therefore have no impact on assemblages in a given environment while altering assemblage integrity in other environmental conditions.

Currently, studies of the interactions between pressures and environment are relatively limited and mainly descriptive. The large number of pressures that can occur in the same place limits the study of these interactions. Sites impacted by a single pressure or type of pressure are scarce. The effect of the pressure–environment interaction can be all the more important in the perspective of global change. If the effect of a pressure depends on temperature, the impact of this pressure on assemblages will evolve even if the pressure itself remains constant. Such changes would have important consequences for the future management and restoration of streams. One possibility would be to identify, among the 9948 EFI+ sites, a reduced set of both reference sites and sites impacted by a single pressure (e.g. impoundment). It would then be possible to model assemblage attributes in relation to the environment and the presence/absence of impoundment. It would be necessary to integrate the interaction between environment and pressure into the models. Once the different models are fitted, the significance of the interactions can be tested (P2). If at least one environment–pressure interaction is significant, it would be necessary to investigate how this interaction is expressed (Irz et al. 2007, Hugueny et al. 2010): modulation of the effect of the pressure along environmental gradients, opposite effects of the same pressure in different environments, etc. A second possibility would be to focus on a single region or watershed with only one type of major alteration (e.g. hydrological alteration, water quality).
This approach would not be limited by the different relations between environment and metrics in the different European regions.

Among the numerous environmental factors that may modulate the effects of pressures, climatic factors may play a major role in the future. The results from the hierarchical partitioning of the species distribution models of the 24 European fish species (P7) highlighted the importance of climate for current species distributions (Pont et al. 2005, Buisson et al. 2008a). Four models present a low goodness of fit, whereas the 20 others present a good or very good match between predictions and observations. This suggests that the niches of the most common species in small and medium West European streams were consistently estimated. The strength of these models stems from the spatial extent of the calibration data sets used to fit them. Working at a large spatial extent provides a more accurate estimation of the species realized niche (Hutchinson 1957) than, for instance, studies conducted at a national scale (e.g. Pont et al. 2005) or on a given watershed (Buisson et al. 2008a). The relatively limited spatial extent of the calibration data set is one of the major criticisms (Sinclair et al. 2010) of SDMs (Guisan and Zimmermann 2000, Austin 2007, Elith and Graham 2009, Elith and Leathwick 2009).

The vast majority of the 24 species considered in section P7 dominate small and medium streams in Western Europe. Consequently, these models could be used to estimate the sensitivity of fish assemblages to global warming. This sensitivity could be evaluated by predicting the composition of theoretical assemblages under different thermal scenarios, e.g. a rise of one or two degrees Celsius. Once the theoretical future assemblage composition has been estimated, it is possible to estimate the functional structure of the assemblages (P1). Evaluating the sensitivity of assemblage structure to global change therefore becomes feasible. It would also be useful to perform the same analyses while varying annual precipitation. The contribution of the pseudo run-off, coarsely estimated by multiplying the area of the drainage basin by annual precipitation, to species presence/absence was high for 23 species out of 24 (P7).

Predictions from logistic regression are not absolute. They represent the expected probability of occurrence of a species in a given environment. The confidence intervals estimated around the expected probabilities clearly show that the reliability of these predictions varies along the environmental gradients (P7). Estimating the prediction interval (Hahn and Meeker 1991) associated with the species probability of presence would make it possible to estimate the uncertainty of the projections under the different scenarios. The complexity of computing these uncertainties probably explains why this information is never provided in studies dealing with the consequences of global change.
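As an illustration of how a logistic model yields an expected probability of occurrence with an interval under warming scenarios, the sketch below computes a Wald confidence interval on the logit scale and back-transforms it. It is a simplified example, not the P7 models: the coefficients, their covariance matrix, the baseline temperature and the function names are all hypothetical. Note that this is a confidence interval for the expected probability, not the prediction interval of Hahn and Meeker (1991) mentioned above.

```python
import math


def expit(eta):
    """Inverse logit: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))


def occurrence_with_ci(beta, cov, x, z=1.96):
    """Expected probability of occurrence and Wald confidence limits.

    beta: fitted coefficients; cov: their covariance matrix; x: covariate
    vector (leading 1 for the intercept). The interval is built on the
    logit scale and back-transformed, so it stays within (0, 1).
    """
    eta = sum(b * xi for b, xi in zip(beta, x))
    # Variance of the linear predictor: x' V x.
    var = sum(x[i] * cov[i][j] * x[j]
              for i in range(len(x)) for j in range(len(x)))
    half_width = z * math.sqrt(var)
    return expit(eta), expit(eta - half_width), expit(eta + half_width)


# Hypothetical cold-water species: intercept and July air temperature effect.
BETA = [6.0, -0.35]
COV = [[1.0, -0.05], [-0.05, 0.003]]

if __name__ == "__main__":
    baseline = 16.0  # degrees Celsius, made-up site value
    for warming in (0.0, 1.0, 2.0):
        p, low, high = occurrence_with_ci(BETA, COV, [1.0, baseline + warming])
        print(f"+{warming:.0f} degC: p={p:.2f} (95% CI {low:.2f}-{high:.2f})")
```

Repeating this calculation for every species at a site would give the theoretical assemblage composition under each scenario, together with an indication of how reliable each predicted probability is.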

Annexes

Annex 1: Description of the EFI+ database (A) List of environmental variables

- Ecoregion
- Catchment area
- Distance from source
- Distance to the sea
- Actual river slope
- Valley slope
- Valley form
- Geomorphological type
- Water source type
- Natural flow regime
- Dominant geology
- Presence of a natural floodplain
- Naturally dominant sediment
- Presence of a lake upstream
- Mean annual precipitation (catchment and basin upstream of the site)
- Mean annual temperature (catchment and basin upstream of the site)
- Mean monthly temperature (x12)
- Land use
- Fished area
- Width
- Wetted width
- Conductivity

Annex 1: Description of the EFI+ database (B) Pressures

Pressures are grouped by type of alteration; the possible categories are given after each pressure.

Connectivity
- Presence of migration barriers downstream (catchment scale): no, yes, partial
- Presence of barriers in the upstream segment: no, yes, partial
- Presence of barriers in the downstream segment: no, yes, partial
- Number of barriers in the upstream segment
- Number of barriers in the downstream segment
- Distance between the site and the nearest barrier in the upstream segment (km)
- Distance between the site and the nearest barrier in the downstream segment (km)

Hydrology
- Reduction of flow velocity by impoundments: no, low, high
- Increase of the natural flow velocity: no, yes
- Hydropeaking: no, yes
- Water abstraction: no, low, high
- Type of use of the abstracted water: none, irrigation, diversion for hydropower, drinking water, fish-farm pond, snow production, cooling of thermal or nuclear power plant, industrial and/or tourism, other
- Presence of small hillside reservoirs (or fishing ponds): no, yes
- Modification of the annual hydrograph: no, yes
- Alteration of the water thermal regime: no, permanent increase, permanent decrease, summer increase, summer decrease
- Impact of reservoir flushing with sediment input: no, yes
- Input of fine sediments: no, low, medium, high

Morphology
- Channelisation, alteration of the longitudinal profile: no, intermediate, straightened
- Channelisation, alteration of the cross-section: no, intermediate, resectioned
- Channelisation, alteration of instream habitat diversity: no, low, high
- Channelisation, alteration of riparian vegetation: no, low, medium, high
- Channelisation, artificial banks: no, low, medium, high
- Flood protection, presence of dykes: no, yes
- Flood protection, alteration of the floodplain: no, more than 50 % remains, between 10 and 50 % remains, less than 10 %, remnants

Water quality
- Presence of toxic substances: no, low, high
- Acidification: no, yes
- Quality class (national system): very good, good, average, passable, poor
- Artificial eutrophication: no, low, medium, extreme
- Organic pollution: no, low, high
- Organic siltation: no, yes

Uses
- Navigation: no, low, medium, high

Annex 2: User manual of the EFI+ software (http://efi-plus.boku.ac.at/software/doc/EFI+Manual.pdf)

MANUAL FOR THE APPLICATION

OF THE

NEW European Fish Index – EFI+

Developed within the project Improvement and Spatial Extension of the European Fish Index

http://efi-plus.boku.ac.at/software

FINAL VERSION, June 2009 DEVELOPED BY THE EFI+ CONSORTIUM

EFI+ - Improvement and Spatial extension of the European Fish Index A project under the 6th Framework Programme, task “Ecological Status Assessment”. Priority “Integrating and Strengthening the European Research Area - Scientific Support to Policies”, Task 4 Contract Number 044096. http://efi-plus.boku.ac.at

To be cited as:

EFI+ CONSORTIUM (2009). Manual for the application of the new European Fish Index – EFI+. A fish-based method to assess the ecological status of European running waters in support of the Water Framework Directive. June 2009.

The text of this manual was written by Joaquin Solana, Diego Garcia de Jalon, Didier Pont, Pierre Bady, Maxime Logez, Richard Noble, Rafaela Schinegger, Gertrud Haidvogl, Andreas Melcher & Stefan Schmutz.

Acknowledgements

The EFI+ consortium would like to thank numerous individual persons, private and public institutions throughout Europe for providing the large volume of data that made the EFI+ project possible.

Special thanks go to the EFI+ external evaluation and advisory group: Robert Hughes, Stephane Stroffek, Nikolaus Schotzko, Tom Buijse, Ian Cowx, Nicolas Roset, Didier Pont, Birgit Vogel, Cornelia Schütz, Pedro Serra, Wouter van de Bund, Niels Jepsen, Panagiotis Balabanis, and Jorge Rodriguez-Romero.

The EFI+ project was financed by the EC contract no. 044096.

Help desk: Please direct all questions and suggestions for improvements to [email protected].

Keywords: new European Fish Index (EFI+), fish based assessment method, multi metric index, Water Framework Directive, software, modelling, metrics, fish communities, reference conditions, prediction, zonation.

Contents

PREFACE
General introduction

Part I:

1. Introduction
1.1. The Water Framework Directive (WFD)
1.2. A Fish-based ecological assessment method for Europe

2. The European research project EFI+
2.1. Central Database: a basic tool to develop EFI+

3. The development of the European Fish Index (EFI+)
3.1. Objectives
3.2. Method description
1st step: Setup and input data
2nd step: River type and ecoregion
3rd step: Predictive models
4th step: Metric score
5th step: Standardization and re-scaling of metric scores
6th step: Fish Index
7th step: Ecological class boundaries

4. EFI+ Application
4.1. Site selection
4.2. Environmental variables and sampling methods
4.3. Fish sampling
4.4. Fish data

Part II: A manual for application

1. Introduction
2. Web page
3. Input data
4. How to insert data
4.1. Manually
4.2. By MS Excel© spread sheet
5. Output
6. References

Annexes

1. Field protocol: 3 tables
2. River groups, ecoregions and Mediterranean types: 3 figures, 2 tables
3. Fish species in EFI software: 1 table
4. Guilds table: 4 tables
5. Data catalogue: 3 tables
6. Software technical report: 2 figures
7. Index development: 1 figure, 1 table
8. Glossary for EFI+

List of figures, pictures and tables

Figure 1: Steps prescribed in the WFD for ecological status assessment
Figure 2: Locations of the sites used to develop the EFI
Figure 3: Flow chart describing the modelling procedure
Figure 4: Map of Illies ecoregions represented in EFI+
Figure 5: Different steps required to apply the assessment method
Figure 6: The front page of EFI+
Figure 7: The front page of EFI+ software
Figure 8: Manual data input selection
Figure 9: Manual entering data form
Figure 10: Get the MS Excel© copy of input data
Figure 11: Getting the MS Excel© template
Figure 12: MS Excel© input file
Figure 13: Example of warnings and errors page
Figure 14: Content of Input.txt
Figure 15: Running the script for the calculation of the EFI+
Figure 16: The windows-related error message
Figure 17: The web help system
Figure 18: Showing the first part of the MS Excel© output table

Picture 1: Electric fishing in a wadable river
Picture 2: Electric fishing from a boat

Table 1: Number of records included in the central database
Table 2: Metrics used to calculate the Salmonid and Cyprinid Fish Index
Table 3: List of intolerant species
Table 4: Description of the numerical variables used in the modelling procedure
Table 5: Description of the categorical variables used in the modelling procedure
Table 6: Ecoregions covered by EFI+
Table 7: Variables used to compute the European Fish Types (EFT)
Table 8: Parameters and factors correlated to the four selected metrics
Table 9: Summary of the four selected metrics distribution for undisturbed sites
Table 10: Summary of the different options to select the appropriate fish index
Table 11: Ecological class boundaries for the two indices
Table 12: Fishing method: Rivers < 0.7 m depth = wadable rivers
Table 13: Fishing method: Rivers > 0.7 m depth = non-wadable rivers (boat fishing)
Table 14: Input variables required for the calculation of the EFI+
Table 15: List of species selected to compute the index based on diadromous species
Table 16: Allowed ranges of variables
Table 17: Comments and explanation of the software output

PREFACE

This manual describes the new European Fish Index (EFI+) and its application software. The EFI+ software and manual have been developed within the EFI+ project. The EFI+ project was funded by the European Commission (EC) under the 6th Framework Programme, “Energy, Environment and Sustainable Development”, Key Action 1: Sustainable Management and Quality of Water of the European Commission (Specific Targeted Research Project FP6-2005-SSP-5-A, Task 4: Ecological status assessment – filling the gaps).

In the year 2000, the EC adopted a new legislation, the Water Framework Directive (WFD). This new legislation, now implemented in 27 EU member countries, aims for good ecological conditions in all surface waters. Fishes are, for the first time, part of a European-wide monitoring network designed to assess the ecological status of running waters. Between 2001 and 2004 the EC funded the FAME project developed, evaluated and implemented new standardised fish- based methods to assess the ecological status of running waters in Europe (FP5, Energy, Environment and Sustainable Management. Key Action 1: Sustainable Management and Quality of Water, EVK1-CT-2001-00094, http://fame.boku.ac.at).

The main output of the FAME project was the European Fish Index (EFI), the first standardised fish-based assessment method applicable across a wide range of European rivers. The EFI employs a number of environmental descriptors to predict biological reference conditions and then quantifies the deviation of the fish community structure from these reference conditions on a statistical basis. The EFI was developed mainly based on data from Western and Northern Europe and was calibrated against estimates of human pressures and impacts. Although a wide range of river types was included in the development of the EFI, some river types, e.g. very large rivers, were underrepresented.

The EFI has now been tested by European countries within their national monitoring programmes and has been evaluated for use for reporting under the WFD. During this evaluation process a number of limitations were observed in the performance of the index. Therefore, the overall objective of the EFI+ project was to overcome the existing limitations of the EFI by developing a new, more accurate and pan-European fish index. The scientific and technological objectives were to (1) evaluate the applicability of the existing EFI and make necessary improvements in Central-Eastern Europe and Mediterranean ecoregions; (2) extend the scope of the existing EFI to cover large rivers; (3) analyse relationships between hydromorphological pressures (including continuity disruption) and fish assemblages to increase the accuracy of the EFI; (4) adapt the existing software to the requirements of the EFI+ to allow calculation of the ecological status for running waters.

General introduction

This manual is divided into two parts.

Part I introduces the concepts and methods on which the EFI+ is based. This section gives an overview of the development of the new European Fish Index and its achievements to fulfil the objectives of the Water Framework Directive in terms of using fishes as indicators for assessing the ecological status of running waters.

Part II is the instruction manual to the web-based software. It details the fish assemblage and environmental data required and the process for obtaining scores and classification of the ecological status of a sampling site (or set of sampling sites) using the EFI+ online software.

Part I:

1. Introduction

1.1. The Water Framework Directive (WFD)

The aim of the WFD is to create a European framework for the protection of inland surface waters, transitional waters, coastal waters and groundwater (EU Water Framework Directive, 2000). Its principal objective is to protect, enhance and restore all bodies of surface water with the aim of achieving a good ecological status by 2015 (WFD Article 4). The WFD requires member states to assess the ecological quality status (EcoQ) of their water bodies (WFD Article 8). The EcoQ is based on the status of biological quality elements supported by hydromorphological and chemical/physicochemical quality elements. Consequently, the implementation of the WFD requires appropriate and standardised methods to assess the ecological status. The four biological elements to be considered in rivers are (1) phytoplankton; (2) phytobenthos and macrophytes; (3) benthic invertebrate fauna; and (4) fish fauna.

The WFD prescribes the following steps for ecological status assessment (Figure 1):

[Flow chart: classify river types; define reference conditions; sample monitoring sites; assess deviation; assign quality status in five classes: high (1), good (2), moderate (3), poor (4), bad (5)]

Figure 1: Steps prescribed in the WFD for ecological status assessment

Initially, river types have to be defined. Each type is described by abiotic parameters (System A or B, WFD Annex II 1.2) and verified by the biota. For each type, reference conditions with no or only very minor anthropogenic alterations have to be defined for each biological quality element. Reference conditions may be derived from actual data, historical data or modelling techniques. Finally, an assessment method for each quality element has to be developed. The assessment of a specific site is based on its deviation from type-specific reference conditions. The status of the fish fauna should be assessed with the following criteria: species composition, abundance, sensitive species, age structure and reproduction (WFD Annex V 1.2.1). The WFD distinguishes between five levels of ecological status: (1) high status, (2) good status, (3) moderate status, (4) poor status and (5) bad status. The methods used to develop the EFI and the EFI+ followed this general scheme of assessment.

1.2. A Fish-based ecological assessment method for Europe

The successful and coherent implementation of the WFD across the whole of Europe depends on the provision of reliable and standardised assessment tools. However, currently a number of different fish-based methods are used in different countries in Europe. Therefore, the observed need for a coherent standardised assessment method, applicable across Europe, was the motivation for the EC-funded FAME and EFI+ projects.

Fishes have proved their suitability as indicators for human disturbances for many reasons:

- Fishes are present in most surface waters.
- The identification of fishes is relatively easy and their taxonomy, ecological requirements and life histories are generally better known than in other species groups.
- Fishes have evolved complex migration patterns making them sensitive to continuum interruptions.
- The longevity of many fish species enables assessments to be sensitive to disturbance over relatively long time scales.
- The natural history and sensitivity to disturbances are well documented for many species and their responses to environmental stressors are often known.
- Fishes generally occupy high trophic levels, and thus integrate conditions of lower trophic levels. In addition, different fish species represent distinct trophic levels: omnivores, herbivores, insectivores, planktivores and piscivores.
- Fishes occupy a variety of habitats in rivers: benthic, pelagic, rheophilic, limnophilic, etc. Species have specific habitat requirements and thus exhibit predictable responses to human induced habitat alterations.
- Depressed growth and recruitment are easily assessed and reflect environmental stress.
- Fishes are valuable economic resources and are of public concern. Using fishes as indicators confers an easy and intuitive understanding of cause-effect relationships to stakeholders beyond the scientific community.

The EFI+ project is underpinned by long established concepts that fish assemblage structure responds to human alterations of aquatic ecosystems in a predictable and quantifiable manner (e.g. Index of Biotic Integrity (IBI) concepts, Karr 1981). The main concept for these models is to assess those components, guilds or traits (quantified as ‘metrics’) of the fish assemblage that can be predicted under reference conditions and respond to the different river degradation types in a predictable manner.

2. The European research project EFI+

The EFI+ project (Improvement and Spatial Extension of the European Fish Index) is an EC-funded research project that aims to contribute to the implementation of the Water Framework Directive. Research institutions based in 15 countries participated in the EFI+ project. The original project consortium consisted of partners based in Austria, Finland, France, Germany, Hungary, Italy, Poland, Portugal, Romania, Spain, Sweden, Switzerland and the UK. In addition, partners from the Netherlands (RIZA/Deltares) and Lithuania (University of Vilnius) participated in the EFI+ project as self-funded associate partners.

Figure 2: Locations of the sites used to develop the EFI+: The green (N=533), red (N=2526) and black (N=7244) points correspond to the calibration, slightly disturbed and disturbed datasets, respectively. The grey shading indicates the home countries of the research partners within the EFI+ project consortium.

2.1. Central Database - a basic tool to develop EFI+

The development of the EFI+ was based on a large database of about 30,000 fish assemblage surveys covering more than 14,000 sites from 2,700 rivers in 15 European eco-regions contributed by partners and associated institutions based in 15 countries (Figure 2, Table 1). For each survey information about the fish assemblage, environmental characteristics and human pressures was collected.

The database also includes a comprehensive list of European freshwater fish species assigned to functional guilds according to their ecological characteristics. This information was used to calculate metrics for the newly developed index.

Table 1: Number of records included in the central database

| Country | Sites | Fishing occasions | Diadromous samples | Catches | Length of fishes |
|---|---|---|---|---|---|
| Austria | 938 | 1172 | 0 | 6294 | 326039 |
| Finland | 530 | 530 | 257 | 2207 | 0 |
| France | 1145 | 6542 | 65700 | 62576 | 3896905 |
| Germany | 803 | 1817 | 27240 | 18543 | 648243 |
| Hungary | 193 | 193 | 246 | 2094 | 0 |
| Italy | 652 | 1152 | 0 | 4238 | 62847 |
| Lithuania | 115 | 130 | 280 | 1086 | 6275 |
| Netherlands | 182 | 790 | 11850 | 5903 | 135934 |
| Poland | 919 | 978 | 3480 | 6926 | 73140 |
| Portugal | 923 | 923 | 7384 | 45227 | 60431 |
| Romania | 263 | 323 | 0 | 1671 | 27722 |
| Spain | 4239 | 5176 | 10010 | 14092 | 233344 |
| Sweden | 615 | 5652 | 7607 | 16751 | 426826 |
| Switzerland | 717 | 969 | 0 | 2781 | 171583 |
| UK | 1987 | 3162 | 22134 | 16361 | 241111 |
| Total | 14221 | 29509 | 156188 | 206750 | 6310400 |

3. The development of the new European Fish Index (EFI+)

3.1. Objectives

The new European Fish Index (EFI+) is a multimetric index based on a predictive model that derives reference conditions from abiotic environmental characteristics of individual sites and quantifies the deviation between the predicted fish assemblage (in the “quasi absence” of any human disturbance) and the observed fish assemblage (described during a fish sampling occasion). The metrics used are based on functional guilds describing the main ecological and biological characteristics of the fish assemblage.

The purpose of the index is to evaluate the ecological status of sites at the European scale. One of the main objectives during the development phase was to define a calibration dataset and to model and select metrics in a way that the index could be correctly calibrated for all or most ecoregions and environmental situations across Europe. In addition the index needed to be comparable between ecoregions, river types and different local environments. The sensitivity of the final indices to pressures (especially morphological pressures) was considered at the European scale.

The ecological class boundaries are based on the distributions of index values for undisturbed sites in two types of rivers (see below). The main objective was firstly to optimise the specificity of the indices (capacity to correctly classify an undisturbed site as undisturbed, i.e. an unimpacted site should not have a low index value) and secondly to optimise the sensitivity of the indices (capacity to correctly classify a disturbed site as disturbed, i.e. an impacted site should not have a high index value).

3.2. Method description

For the EFI+ two indices were developed (Table 2). These two indices, each composed of two different metrics, can be computed depending on the river type classification of a given site:

Salmonid Dominated Fish Assemblage Index (Salm.Fish.Index) for sites classified as Salmonid Dominated Fish Assemblage River Type (Salmonid river type):
Salm.Fish.Index = (Ni.Hab.150 + Ni.O2.Intol) / 2

Cyprinid Dominated Fish Assemblage Index (Cypr.Fish.Index) for sites classified as Cyprinid Dominated Fish Assemblage River Type (Cyprinid river type):
Cypr.Fish.Index = (Ric.RH.Par + Ni.LITHO) / 2

Table 2: Metrics used to calculate the Salmonid and Cyprinid Fish Index

| Index | Metric name | Detailed name - guild |
|---|---|---|
| Salmonid | Ni.O2.Intol | Density (number of individuals per 100 m² in the first run of a sample site) of species intolerant to oxygen depletion, always more than 6 mg/l O2 in water. |
| Salmonid | Ni.Hab.Intol.150 | Density (number of individuals per 100 m² in the first run of a sample site) 150 mm (total length) of species intolerant to habitat degradation. |
| Cyprinid | Ric.RH.Par | Richness (number of species in the first run of a sample site) of rheopar species; requiring a rheophilic reproduction habitat, i.e. preference to spawn in running waters. |
| Cyprinid | Ni.LITHO | Density (number of individuals per 100 m² in the first run of a sample site) of species requiring lithophilic reproduction habitat, species which spawn exclusively on gravel, rocks, stones, cobble or pebbles. Their hatchlings are photophobic. |

One metric is expressed in terms of richness, two in density of individuals and one in density per size class. Two metrics are based on tolerance responses, and two are based on reproduction habitat requirements. The four metrics exhibit a negative response to increases in human pressure. The correlations between metrics are relatively low (Pearson’s coefficients less than 0.65). The species classified as oxygen depletion intolerant, habitat degradation intolerant, and those requiring lithophilic or rheophilic reproduction habitats are listed in Annex 4.
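As an illustration of how the observed value of a density or richness metric can be derived from a first-run catch, the following minimal sketch counts individuals of a guild and scales them to 100 m² of fished area. The function names, the example catch and the guild membership are hypothetical (the actual guild assignments are those of Annex 4), and the size-class restriction of Ni.Hab.Intol.150 is not handled here.

```python
from typing import Dict, Set


def density_per_100m2(counts: Dict[str, int], guild: Set[str],
                      fished_area_m2: float) -> float:
    """Number of individuals of guild species per 100 m² of fished area."""
    if fished_area_m2 <= 0:
        raise ValueError("fished area must be positive")
    n_guild = sum(n for species, n in counts.items() if species in guild)
    return 100.0 * n_guild / fished_area_m2


def guild_richness(counts: Dict[str, int], guild: Set[str]) -> int:
    """Number of guild species with at least one individual caught."""
    return sum(1 for species, n in counts.items() if species in guild and n > 0)


# Hypothetical first-run catch (individuals per species) and guild membership.
first_run = {"Salmo trutta fario": 30, "Cottus gobio": 12, "Rutilus rutilus": 8}
example_guild = {"Salmo trutta fario", "Cottus gobio"}  # illustrative only

density = density_per_100m2(first_run, example_guild, fished_area_m2=400.0)
n_species = guild_richness(first_run, example_guild)
```

Only the first run of a sampling occasion enters the counts, in line with the metric definitions of Table 2.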

The distinction between the two river types is based on the proportion (relative abundance of individuals) of typical species belonging to salmonid dominated fish assemblage (or Salmonid type species - denominated ST-species) - which are oxygen depletion intolerant, habitat alteration intolerant, stenothermic, lithophilic or speleophilic reproduction type species and with a rheophilic reproductive habitat. These 19 species are shown in Table 3.

Table 3: List of intolerant species, typically belonging to salmonid dominated fish communities.

Alburnoides bipunctatus
Cobitis calderoni
Coregonus lavaretus
Cottus gobio
Cottus poecilopus
Eudontomyzon mariae
Hucho hucho
Lampetra planeri
Phoxinus phoxinus
Salmo salar
Salmo trutta fario
Salmo trutta lacustris
Salmo trutta macrostigma
Salmo trutta trutta
Salmo trutta marmoratus
Salvelinus fontinalis
Salvelinus namaycush
Salvelinus umbla
Thymallus thymallus

Typically, an undisturbed Salmonid river type site is dominated by ST-species which represent more than 80% of the number of individuals caught (more than 90 % in most cases). Conversely, the relative abundance of these species is less than 20% (in most cases < 10%) for a typical undisturbed Cyprinid type site.
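The relative abundance of ST-species referred to above can be computed directly from the catch. The sketch below is one possible way to do so, using the species list of Table 3; the function name and the example catch are hypothetical.

```python
from typing import Dict

# Species of Table 3 (ST-species).
ST_SPECIES = frozenset({
    "Alburnoides bipunctatus", "Cobitis calderoni", "Coregonus lavaretus",
    "Cottus gobio", "Cottus poecilopus", "Eudontomyzon mariae", "Hucho hucho",
    "Lampetra planeri", "Phoxinus phoxinus", "Salmo salar",
    "Salmo trutta fario", "Salmo trutta lacustris", "Salmo trutta macrostigma",
    "Salmo trutta trutta", "Salmo trutta marmoratus", "Salvelinus fontinalis",
    "Salvelinus namaycush", "Salvelinus umbla", "Thymallus thymallus",
})


def st_species_share(counts: Dict[str, int]) -> float:
    """Proportion (0 to 1) of individuals caught that belong to ST-species."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no individuals caught")
    st_total = sum(n for species, n in counts.items() if species in ST_SPECIES)
    return st_total / total


# Hypothetical catch: 85 of 100 individuals are ST-species.
share = st_species_share(
    {"Salmo trutta fario": 45, "Cottus gobio": 40, "Barbatula barbatula": 15}
)
```

A share above 0.80 corresponds to the situation described for a typical undisturbed Salmonid river type site.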

As human pressures alter the fish assemblage structure, it is not possible to directly use a criterion based on the fish assemblage to discriminate between Salmonid-type and Cyprinid-type sites. A solution to classify sites was to use a typology based on abiotic variables. Melcher et al. (2007) produced such a typology at the European scale during the FAME project (European Fish Types classification - EFT). Using 7 environmental variables, the EFT classification differentiated between 15 fish-based river types. These 15 types have been re-classified into the two main river types for EFI+, considering criteria related to the relative abundance of ST-species.

The Salmonid/Cyprinid fish assemblage typology was used during the process of metric standardisation and selection. However, in some situations, sites may be misclassified into an inappropriate reference fish assemblage type:

- Some undisturbed sites classified as Cyprinid river type sites have a high relative abundance of ST-species.
- Alternatively, undisturbed sites classified as Salmonid river type sites have too low a relative abundance of typical intolerant ST-species (less common).

In such cases the proportion of Salmonid-type species then has to be evaluated by the user a posteriori, using the situation-dependent recommendations given in the 6th step, to validate the river type proposed for each site and to determine the correct index to evaluate the site (Salm.Fish.Index or Cypr.Fish.Index).

In the following section the EFI+ method is summarised, the limitations of the two indices are indicated and the procedures used for metric modelling, metric selection and standardisation are described. In Annex 8, the main results related to the performance of the indices (specificity versus sensitivity) and evaluations of uncertainties are presented.

The data flow and process for obtaining a new European Fish Index (EFI+) score is summarised in Figure 3.

[Flow chart: fish sample from the one-pass sampling method (1), environmental characteristics (1) and spatial coordinates (1); river type (2) and ecoregion (2); predictive models (3) and expected metric value (3); observed metric values (4) and residual standardisation (4); re-scaling (5) and metric scores (5); Salmonid fish index (6) and Cyprinid fish index (6); fish index value (6), index selection and ranking (7), based on the % of individuals of intolerant ST-species and end-user agreement]

Figure 3: Flow chart describing the modelling procedure. Rectangles: input data and end-user intervention. Rhomboid: computation and process. Hexagon: intermediate results available in the software output. Oval: fish index value and ranking in five classes. Numbers in brackets refer to the steps of the following description.

1st Step: Setup and Input data¹

The EFI+ method requires three types of obligatory data and one type of optional data:

- Data from single-pass electric fishing catches to calculate the assessment metrics. Individuals from all species have to be measured (total length in mm) to compute the observed values of metrics. Results should be recorded as number of individuals caught per species, including the numbers in two size classes determined by a 150-mm threshold.
- Data describing environmental conditions at the site scale or at the river segment scale as well as the sampling method (Tables 4 and 5, numerical and categorical variables).
- Data describing the fishing method.

In addition optional data on the present and historical occurrence of diadromous species can be used.

Table 4: Description of the numerical variables used in the modelling procedure.

| Variable | Median | Minimum | Maximum |
|---|---|---|---|
| Latitude | 46.26 | 36.77 | 68.80 |
| Longitude | 5.24 | -9.25 | 29.65 |
| Drainage area (km²) | 56.02 | 0.72 | 208,106.00 |
| Distance.from.source (km) | 13 | 0.50 | 1454.00 |
| Actual.river.slope (m.km-1) | 9.13 | 0.001 | 323.63 |
| Wetted.width (m) | 6.00 | 0.70 | 658.00 |
| Fished.area (m²) | 372 | 100 | 32,500 |
| Temp.jan (°C) | 1.60 | -16.00 | 11.40 |
| Temp.jul (°C) | 17.80 | 8.60 | 25.10 |

The ranges of environmental conditions (median, minimum and maximum values) for the slightly impacted (calibration) sites are shown in Table 4. For the categorical variables indicated in Table 5, the number of sites per modality is indicated. These values indicate the range of environmental conditions for which the models can be considered as calibrated (N = 2526 sites).
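Because the models can only be considered calibrated within these ranges, it can be useful to flag sites whose numerical descriptors fall outside the minimum and maximum values of Table 4. The following sketch shows one way to do this; the dictionary keys and the function name are hypothetical and are not taken from the EFI+ software, whose own allowed ranges are documented separately in Part II.

```python
from typing import Dict, List, Tuple

# (minimum, maximum) of the calibration sites, copied from Table 4.
CALIBRATION_RANGES: Dict[str, Tuple[float, float]] = {
    "Latitude": (36.77, 68.80),
    "Longitude": (-9.25, 29.65),
    "Drainage.area": (0.72, 208106.00),        # km²
    "Distance.from.source": (0.50, 1454.00),   # km
    "Actual.river.slope": (0.001, 323.63),     # m/km
    "Wetted.width": (0.70, 658.00),            # m
    "Fished.area": (100.0, 32500.0),           # m²
    "Temp.jan": (-16.00, 11.40),               # °C
    "Temp.jul": (8.60, 25.10),                 # °C
}


def outside_calibration(site: Dict[str, float]) -> List[str]:
    """Return the sorted names of variables lying outside the Table 4 ranges."""
    flagged = []
    for name, value in site.items():
        if name not in CALIBRATION_RANGES:
            continue  # variables without a tabulated range are ignored
        low, high = CALIBRATION_RANGES[name]
        if not low <= value <= high:
            flagged.append(name)
    return sorted(flagged)


# Hypothetical site: wetted width and July temperature exceed the ranges.
flags = outside_calibration(
    {"Latitude": 46.26, "Wetted.width": 700.0, "Temp.jul": 26.0}
)
```

A flagged variable does not prevent a computation; it only signals that the prediction extrapolates beyond the calibration conditions.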

In addition, information on the site location (longitude and latitude, for definition see Table 14, Part II), site name and sampling date is required. Spatial coordinates are used to define the corresponding ecoregion. Ecoregion classification has been done according to Illies, but with the addition of a Mediterranean region (Table 6, Figure 4). The Mediterranean region was delineated based on precipitation and temperature criteria.

Information about the historical and present occurrence of 17 diadromous species can be used to calculate an additional diadromous species metric. This information should not be based solely on current fish sampling data but should also integrate the user's knowledge from other sampling methods and data sources (e.g. commercial data, other samplings or literature, especially in the case of historical data). For species currently present but with no historical information, the user has to decide whether a historical occurrence can be assumed. The diadromous species used are listed in Table 15.

¹ The numbers of this and the following steps refer to the flow chart in Figure 3.

Table 5: Description of the categorical variables used in the modelling procedure.

| Variable | Modality | Number of sites |
|---|---|---|
| Water.source.type | Glacial | 12 |
| | Groundwater | 78 |
| | Nival | 539 |
| | Pluvial | 1897 |
| Floodplain.site | No | 2120 |
| | Yes | 406 |
| Natural.sediment | Boulder/Rock | 432 |
| | Gravel/Pebble/Cobble | 1853 |
| | Organic | 12 |
| | Sand | 197 |
| | Silt | 32 |
| Geomorph.river.type | Braided | 86 |
| | Meandering regular | 236 |
| | Meandering tortuous | 121 |
| | Naturally constraint | 1053 |
| | Sinuous | 1030 |
| Sampling.method | Wading or wading/boating | 2362 |
| | Boating | 164 |

Table 6: Ecoregions covered by EFI+. Abbreviation, full name and corresponding number are shown.

| Abbreviation | Ecoregion (number) |
|---|---|
| Alp | Alps (4) |
| Car | The Carpathians (10) |
| Pyr | Pyrenees (2) |
| GB | Great Britain (18) |
| Hun | Hungarian Lowlands (11) |
| Ibe | Iberian Peninsula (1) |
| E.p | Eastern Plains (16)* |
| Ita | Italy, Corsica and Malta (3) |
| Pon | Pontic Province (12)* |
| W.p | Western Plains (13) |
| Fen | Fenno-Scandian Shield (22) |
| W.h | Western Highlands (8) |
| Bor | Borealic Uplands (20) |
| C.h | Central Highlands (9) |
| Bal | Baltic Province (15) |
| C.p | Central Plains (14) |
| Med | Mediterranean region (1) |

* Pontic Province and Eastern Plains are not well represented in EFI+.


Figure 4: Map of Illies ecoregions represented in EFI+ and the additional Mediterranean region.

2nd Step: River type and ecoregion

As described in chapter 3.2, the European Fish Types (EFT) are used to identify the correct river type. The appropriate ecoregion for a sampling site is defined based on the spatial coordinates of the site.

Table 7: Variables used to compute the European Fish Types (EFT).

| EFI+ variable | Type | Description |
|---|---|---|
| Longitude | Numeric | Longitude in decimal degrees |
| Latitude | Numeric | Latitude in decimal degrees |
| Altitude | Integer | Altitude in metres |
| Distance from source | Integer | Distance from the source (in km) |
| Temp mean | Numeric | Average annual air temperature (in °C) |
| Wetted width | Numeric | Width of the river (in metres) |
| Actual river slope | Numeric | Slope of the river segment (in m/km) |

3rd step: Predictive models

For each metric and for a given site a statistical model is used to predict the metric value in the absence or quasi-absence of human disturbance (i.e. a value corresponding to “a reference condition”). These expected values are computed from environmental parameters (see tables 4 and 5) using generalised linear models. These models were calibrated using “undisturbed” sites. The parameters retained for each of the four models are given in Table 8. Details about the modelling procedure are given in Annex 8.

Table 8: Parameters and factors correlated to the four selected metrics.

Metric: Ni.O2.Intol, Ni.Hab.Intol.150, Ric.RH.Par, Ni.LITHO
Temperature July: +
Annual Temp Range: + +
Actual river slope: + + + +
Natural Sediment: + + + +
Syngeomorph1: + +
Syngeomorph2: + + +

The environmental variables describing the hydro-morphological characteristics of the river site are synthesised (using multivariate statistical methods) into two independent geomorphological factors (named Syngeomorph1 and Syngeomorph2, see Table 8). Syngeomorph1 refers to the variables drainage area, presence of a floodplain and distance from the source. This factor discriminates small rivers from large rivers, the latter being characterised by a floodplain, a long distance from the source and a large drainage area. Syngeomorph2 refers to geomorphological and water source types.

The annual temperature range is the difference between mean July air temperature and mean January air temperature.

4th step: Metric score

The metric score is the standardised distance (Miq) between the predicted value (Ti, i.e. the expected value in the absence of any significant human disturbance) and the observed value Oi (computed from the sampled fish assemblage).

The score (Miq) of each of the four metrics in a given river zone “q” (Salmonid river zone or Cyprinid zone) and a given ecoregion “j” is obtained in the following manner for each site:

\[ M_{iq} = \frac{R_i - M_{jq}}{S_q}, \qquad \text{where } R_i = O_i - T_i \]

- \(R_i\): model residual, i.e. difference between the observed and the expected metric value for the given site.
- \(M_{jq}\): median value of the residuals in the ecoregion “j” and the river zone “q” (Salmonid or Cyprinid) in the undisturbed dataset.
- \(S_q\): standard deviation of the residuals in the whole undisturbed dataset for a given river zone “q” (Salmonid or Cyprinid).

Sites defined here as “undisturbed” are those (N = 2526) which present no or only a slight degree of perturbation (selection based on the pressure variables: channelisation, impoundment, water quality, toxic presence, water abstraction, hydropeaking, presence of barrier at the river segment scale).

The value of the median is chosen because it is less sensitive to extreme values than the mean. For the same reason (model stability), the standard deviation of residuals of the whole undisturbed dataset is used instead of the standard deviation of the residual distribution in each ecoregion.
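The standardisation described in this step can be sketched as follows. This is an illustrative reading of the formula, not the EFI+ software code: the function name and the data layout (reference residuals of one river zone, grouped by ecoregion) are hypothetical, and the use of the sample standard deviation is an assumption, since the text does not state which estimator is used.

```python
from statistics import median, stdev
from typing import Dict, List


def standardised_score(observed: float, expected: float, ecoregion: str,
                       ref_residuals: Dict[str, List[float]]) -> float:
    """Standardised distance (R_i - M_jq) / S_q for one metric at one site.

    ref_residuals holds the residuals of the undisturbed sites of ONE river
    zone q, grouped by ecoregion j (hypothetical layout).
    """
    residual = observed - expected                      # R_i = O_i - T_i
    m_jq = median(ref_residuals[ecoregion])             # ecoregion median
    pooled = [r for values in ref_residuals.values() for r in values]
    s_q = stdev(pooled)   # assumption: sample SD over the whole zone dataset
    return (residual - m_jq) / s_q


# Hypothetical reference residuals for one river zone.
reference = {"Alp": [-1.0, 0.0, 1.0], "Pyr": [-1.0, 0.0, 1.0]}
score = standardised_score(observed=3.0, expected=5.0, ecoregion="Alp",
                           ref_residuals=reference)
```

The median is taken per ecoregion while the standard deviation is pooled over the whole river zone, mirroring the stability argument given above.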

5th step: Standardisation and re-scaling of metric scores

Standardised metric scores vary from −∞ to +∞. However, development of ecological quality ratios (EQRs) requires that each final metric score varies within a finite interval from 0 to 1 and each metric must have the same median value in the absence of any disturbance (i.e. in the undisturbed dataset). Such “rescaling” is accomplished by using two transformations (see Annex 8 for details).

The final result is that, when only considering undisturbed sites, all the four metrics have a median value of 0.80 and very similar values for the 25% quantile (0.69 to 0.73). Then, metrics can be aggregated, each one having a similar distribution in the absence of any significant disturbance.

Observed metric values from impacted sites deviate more strongly from the expected metric value than those from un-impacted or only slightly impacted sites. Impacted sites are therefore less likely to belong to the reference residual distribution and are characterised by a low metric score (i.e. value << 0.80).

Table 9: Summary of the four selected metrics distribution for undisturbed sites

| Metric | River type | Min. | 25% quantile | Median | Mean | 95% quantile | Max. |
|---|---|---|---|---|---|---|---|
| Ni.Hab.Intol.150 | Salmonid | 0.000 | 0.69 | 0.80 | 0.74 | 0.87 | 1.000 |
| Ni.O2.Intol | Salmonid | 0.000 | 0.73 | 0.80 | 0.77 | 0.86 | 1.000 |
| Ric.RH.Par | Cyprinid | 0.000 | 0.70 | 0.80 | 0.77 | 0.86 | 1.000 |
| Ni.LITHO | Cyprinid | 0.000 | 0.71 | 0.80 | 0.73 | 0.83 | 1.000 |

Metric based on diadromous species

A diadromous species metric can be calculated as the ratio between the number of diadromous species presently occurring and the richness of diadromous species that would have historically occurred. A loss of diadromous species can be due to several human pressures, but these species are mainly affected by barriers interrupting river continuity and fish migration pathways. Thus the metric indicates connectivity problems.

It is calculated as follows:

\[ \mathrm{IndMig} = \frac{\sum_{k=1}^{K} x_{k,\mathrm{actual}}}{\sum_{k=1}^{K} x_{k,\mathrm{historical}}} \]

where K corresponds to the number of diadromous species used in the index computation. The terms \(\sum_{k=1}^{K} x_{k,\mathrm{actual}}\) and \(\sum_{k=1}^{K} x_{k,\mathrm{historical}}\) correspond to the actual number of diadromous species and the number of diadromous species historically present at the site, respectively. If a species occurs naturally at present the user can decide to assume also historical occurrence (see Table 15 for details). The obtained value (“Index diadromous species richness”) ranges within [0,1].

The metric is not integrated into the final index because of its different structure compared to the other metrics. In addition further tests need to be done for a robust computation and interpretation of the metric values.
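A minimal sketch of this ratio is given below, under stated assumptions: species occurrences are passed as sets of names, only species that are also counted as historically present contribute to the numerator (so that the value stays within [0,1]), and the option to assume historical occurrence for currently present species is exposed as a flag. The function name, the flag and the example species are hypothetical.

```python
from typing import Optional, Set


def diadromous_index(actual: Set[str], historical: Set[str],
                     assume_historical_if_present: bool = True
                     ) -> Optional[float]:
    """Ratio of present to historical diadromous species richness."""
    reference = set(historical)
    if assume_historical_if_present:
        # User decision: a species present today is assumed to have
        # occurred historically as well.
        reference |= actual
    if not reference:
        return None  # no historical occurrence: the ratio is undefined
    # Assumption: only species in the historical reference are counted.
    return len(actual & reference) / len(reference)


# Hypothetical site: four species occurred historically, two remain.
ind_mig = diadromous_index(
    actual={"Anguilla anguilla", "Salmo salar"},
    historical={"Anguilla anguilla", "Salmo salar",
                "Alosa alosa", "Petromyzon marinus"},
)
```

Returning `None` when no diadromous species ever occurred avoids a division by zero and keeps such sites distinguishable from sites that have lost all their diadromous species.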

6th step: Fish Index

The EFI+ method finally consists of two indices:

Salm.Fish.Index = (Ni.Hab.150 + Ni.O2.Intol) / 2
Cypr.Fish.Index = (Ric.RH.Par + Ni.LITHO) / 2

The index values vary between 0 and 1. As for each metric, an undisturbed site would have an index value close to 0.80, and a highly disturbed site a value lower than the 25% quantile of the index distribution for undisturbed sites.
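The aggregation itself is a simple arithmetic mean of two re-scaled metric scores. The sketch below computes both indices from a dictionary of scores; the function name and the example scores are hypothetical, and the metric keys follow the names used in the two formulas of this step.

```python
from typing import Dict


def fish_indices(scores: Dict[str, float]) -> Dict[str, float]:
    """Compute both fish indices from re-scaled metric scores in [0, 1]."""
    for name, value in scores.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1], got {value}")
    return {
        "Salm.Fish.Index": (scores["Ni.Hab.150"] + scores["Ni.O2.Intol"]) / 2,
        "Cypr.Fish.Index": (scores["Ric.RH.Par"] + scores["Ni.LITHO"]) / 2,
    }


# Hypothetical metric scores for one site.
indices = fish_indices(
    {"Ni.Hab.150": 0.74, "Ni.O2.Intol": 0.82, "Ric.RH.Par": 0.60, "Ni.LITHO": 0.50}
)
```

Both values are returned because the choice between them depends on the river type classification discussed next.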

Each index is composed of 2 different metrics, and can be computed for each site, although the final selection of the most appropriate index depends on the river type classification. The critical point in the use of the EFI+ method is the classification of sites in one of the two river types (Salmonid type versus Cyprinid type).

As previously indicated, the automatic classification (based on prediction of fish type from abiotic environmental parameters) is more efficient at correctly determining the Salmonid river type than the Cyprinid type. For the Salmonid river type, generally only a small number of sites are considered as misclassified (i.e. sites with a very low relative abundance of ST-species are still classified as the Salmonid river type). Conversely, the automated classification of “Cyprinid river type” appears to have a higher rate of misclassification, e.g. classifying sites that are dominated by ST-species as the Cyprinid river type.

The consequences of this risk of misclassification are quite different, depending on the river type. For undisturbed sites classified by the model as the Salmonid river type but with a low relative abundance of ST-species in the sample, the Salmonid fish index cannot be used. Therefore, the site is misclassified and the Cyprinid fish index should be used. For undisturbed sites classified by the model as a Cyprinid type but with a high relative abundance of ST-species in the sample, the values given by the Cyprinid index are quite close to the ones given by the Salmonid index. However, in the case of disturbed sites, the impact is not correctly evaluated if the Cyprinid index is used instead of the Salmonid index. Therefore, such sites are misclassified in terms of river type and the Salmonid index should be applied.

Considering the risk of misclassification and the associated consequences on the evaluation of sites the best solution is for the end-user to systematically validate the initial classification of the site (Cyprinid or Salmonid river type), the relative abundance of ST-species and the value of both indices (Salmonid index and Cyprinid index) when they can be computed. In most cases, the river type proposed by the model is correct and the user can use the corresponding index proposed by the software. In other cases, the users, as experts, will have to evaluate the situation and to confirm the proposed classification or choose the most appropriate of the two fish indices.

There are several possibilities and associated recommendations (summarized in Table 10):

Sites classified by the EFT classification as Salmonid river type

The site is classified as a Salmonid type and the % of ST-species is high (i.e. > 80%). The classification is correct and the Salmonid index has to be used.

The site is classified as a Salmonid type and the % of ST-species is relatively high (from 50 to 80%). The reduction of the relative abundance of ST-species could be due to a human disturbance of the river ecosystem. The risk of misclassification is relatively low but the user has to check and confirm the proposed river type.

The site is classified as a Salmonid type but the % of ST-species is relatively low (from 20 to 50%) to very low (less than 20%). The reduction of the relative abundance of these intolerant species can only be due to severe human disturbance (i.e. heavy impoundment, high level of water quality degradation). The risk of misclassification is important and the user has to evaluate the proposed river type and to confirm or reject the choice of the fish index. A warning is included in the output of the software.

Sites classified by the EFT classification as a Cyprinid river type

The site is classified as Cyprinid type and the % of ST-species is very low (less than 20 %). The classification is correct and the Cyprinid index has to be used.

The site is classified as a Cyprinid type but the % of ST-species is relatively high (from 20 to 50%). The increase of the relative abundance of these ST-species can be due to some particular human disturbance of the river ecosystem (some types of channelisation and large increases of the water velocity, water cooling downstream from a dam, etc.). Nevertheless, in most cases, a misclassification of the site is possible. The software proposes to classify the site as a Salmonid river type and to use the Salmonid index. The user has to evaluate the proposed type and to confirm or reject the choice of the fish index. A warning is included in the output of the software.

The site is classified as a Cyprinid type but the % of ST-species is quite high (from 50 to 80%) or very high (more than 80%). The increase of the relative abundance of these ST-species can also be due to particular severe human disturbances, but the risk of misclassification is very high. A correction for the river type is included in the output of the software: the site is reclassified as a Salmonid type and the value of the Salmonid index is proposed. The user has to evaluate the proposed type and to confirm or reject the choice of the fish index. A warning is included in the output of the software.

Table 10: Summary of the different options to select the appropriate fish index, according to the percentage of intolerant Salmonid type species (ST-species).

| Initial site classification | [0% – 20%] | ]20% – 50%] | ]50% – 80%] | ]80% – 100%] |
|---|---|---|---|---|
| Salmonid type | Risk of misclassification. Salmonid index proposed. User has to confirm the river type and the index choice. | Risk of misclassification. Salmonid index proposed. User has to confirm the river zone and the index choice. | Salmonid index recommended. User has to check the classification. | Correct classification. Salmonid index should be used. |
| Cyprinid type | Correct classification. Cyprinid index should be used. | Increase of % of ST-species can be linked to a human disturbance. Salmonid index proposed. User has to confirm the river zone and the index choice. | Increase of % of intolerant species can be linked to particular extreme disturbance. Salmonid index proposed. User has to confirm the river zone and the index choice. | High risk of misclassification. Salmonid index proposed. User has to confirm the river zone and the index choice. |

In particular ecoregions, the probability that a site belongs to the Salmonid type is low. This is the case for the Hungarian lowlands, Eastern plains, Pontic province, Baltic province and Mediterranean region.

In other ecoregions, the probability that a site belongs to the Cyprinid type is low. This is the case for the Alps, Pyrenees, Fenno-Scandian shield and Borealic uplands.
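The decision rules summarised in Table 10 can be sketched in a few lines of Python. This is only an illustrative reading of the table, not the code of the EFI+ software: the function name `select_fish_index` and the keys of the returned dictionary are hypothetical, and the interval bounds follow the table (upper bounds inclusive).

```python
def select_fish_index(river_type: str, st_percent: float) -> dict:
    """Propose a fish index from the modelled river type and the % of ST-species.

    Hypothetical helper following Table 10; not part of the EFI+ software.
    """
    if river_type not in ("Salmonid", "Cyprinid"):
        raise ValueError("river_type must be 'Salmonid' or 'Cyprinid'")
    if not 0 <= st_percent <= 100:
        raise ValueError("st_percent must lie between 0 and 100")

    if river_type == "Salmonid":
        if st_percent > 80:
            # Correct classification: no confirmation needed.
            return {"index": "Salmonid", "user_check": False, "warning": False}
        if st_percent > 50:
            # Low risk of misclassification: the user checks the river type.
            return {"index": "Salmonid", "user_check": True, "warning": False}
        # 0-50 %: high risk of misclassification, warning in the output.
        return {"index": "Salmonid", "user_check": True, "warning": True}

    # Cyprinid type
    if st_percent <= 20:
        return {"index": "Cyprinid", "user_check": False, "warning": False}
    # Above 20 %: the Salmonid index is proposed, with a warning.
    return {"index": "Salmonid", "user_check": True, "warning": True}
```

In every case flagged with `user_check`, the final choice of index remains with the user, as described above.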

7th step: Ecological class boundaries

Ecological class boundaries are only based on the distributions of index values for undisturbed sites in the two river types (Table 11). As the sampling method (sampling by boat or by wading) has been shown to influence the score value, especially in the Cyprinid zone, class boundaries have been computed separately for sites sampled by boating and wading in the Cyprinid zone (see indices limitations section below).

The limits between class 1 and 2 correspond to the value of the 95% quantile of the index distribution for undisturbed sites.

The limits between class 2 and 3 correspond to the value of the 25% quantile of the index distribution for undisturbed sites.

The limits between classes 3-4 and 4-5 are defined in a way that the ranges between classes 3, 4 and 5 are similar.

The specific class scoring for Cyprinid type sites sampled by boating has to be considered as preliminary. More work is needed in the future (requiring more data from undisturbed or slightly disturbed boating sites) to correctly handle these parameters as predictors within the different index models.

Table 11: Ecological class boundaries for the two indices.

| Class | Salmonid index | Cyprinid index (wading) | Cyprinid index (boating) |
|---|---|---|---|
| Class 1 | [0.911 - 1] | [0.939 - 1] | [0.917 - 1] |
| Class 2 | [0.755 - 0.911[ | [0.655 - 0.939[ | [0.562 - 0.917[ |
| Class 3 | [0.503 - 0.755[ | [0.437 - 0.655[ | [0.375 - 0.562[ |
| Class 4 | [0.252 - 0.503[ | [0.218 - 0.437[ | [0.187 - 0.375[ |
| Class 5 | [0 - 0.252[ | [0 - 0.218[ | [0 - 0.187[ |
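As an illustration of how the boundaries of Table 11 translate a score into a class, here is a minimal Python sketch. The dictionary keys and the function name `ecological_class` are hypothetical labels chosen for this example; only the numerical lower bounds come from the table.

```python
# Lower bounds of classes 1 to 4 for each scoring scheme (Table 11).
# Scores below the class 4 bound fall into class 5.
LOWER_BOUNDS = {
    "salmonid": (0.911, 0.755, 0.503, 0.252),
    "cyprinid_wading": (0.939, 0.655, 0.437, 0.218),
    "cyprinid_boating": (0.917, 0.562, 0.375, 0.187),
}


def ecological_class(score: float, scheme: str) -> int:
    """Return the ecological class (1 = best, 5 = worst) for an index score."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must lie between 0 and 1")
    for cls, lower in enumerate(LOWER_BOUNDS[scheme], start=1):
        if score >= lower:  # intervals are closed on the left
            return cls
    return 5
```

The same score can fall into different classes depending on the scheme: a Cyprinid score of 0.60 is class 3 for a site sampled by wading but class 2 for a site sampled by boating.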

Limitations in the use of the Index due to the sampling location

The fish index, in the present state, cannot be used for surveys undertaken in lateral water bodies of the floodplain and is only calibrated correctly for sites sampled in the main river channel. In the case of fishing occasions where the lateral water bodies from the floodplain were sampled (in addition to or separately from sampling the main channel) the index values are significantly lower in comparison with sites where only the main channel is sampled.

Limitation in the use of the indices in relation with the environment

The statistical models that are used for the EFI+ reflect the average response of fish communities to environmental conditions. The application of the EFI+ for particular environmental situations might cause problems. These indices have been developed for sites located in the ecoregions presented in Annex 2. Therefore, the index should not be applied in areas with a fish fauna deviating from those of the tested ecoregions.

The model was developed using data from sites with environmental characteristics ranging between specific limits. These values are given in Tables 4 and 5. Sites should have characteristics within these ranges in order to obtain a reliable assessment.

Some environmental situations are not correctly handled by the two indices. These situations are:
- presence of a natural lake upstream from the site (see Table 14, Part II)
- presence of a winter dry period
- “organic” rivers (main substrate of the river is organic)

Although no clear differences in index behaviour have been observed for intermittent/summer dry rivers, the indices must be used with caution due to the low number of undisturbed sites available to calibrate and test the index in these conditions.

River size: The metrics have been mainly calibrated for rivers with an upstream drainage area less than 10,000 km². Independent of the sampling method, the river size does not seem to significantly influence the index values for undisturbed sites when the upstream drainage area is less than 10,000 km².

The index should be used with caution in the lowland reaches of very large rivers as no reference sites from these reaches have been available for the calibration of the index. In those cases the index uses only extrapolated predictions based on the trends observed in the models.

Limitations in the use of the Index due to very low species richness

The EFI+ is based on the analysis of the whole fish assemblage and metrics are based on the relative occurrence or abundance of functional guilds of species. Therefore, it is clear that assemblage-based metrics are unsuitable when the richness of a site is limited to one species. In most cases in Europe this relates to small headwater rivers where brown trout are the only fish species present.

The only case where such species composition based metrics could react is when the response to a disturbance is an increase of species richness. That could be the case for example for impounded sites in headwaters: other species can appear (including rheophilic cyprinid species if the water temperature is not too low) and the proportion of intolerant fish species can be lower than expected in the absence of human disturbance.

In principle, one metric selected for the Salmonid type (the abundance of individuals < 15 cm of habitat intolerant species (Hab.Intol.150)) should be able to give a response when only trout is present. But at present, additional analyses are needed to test the efficiency of this metric alone in such situations.

Nevertheless, in general, it has to be considered that the sensitivity of the EFI+ to disturbance is quite low in areas of naturally very low species richness.

Limitations in the use of the index due to the number of fish caught

When few specimens were caught the software still allows the calculation of the indices, but the results must be considered with caution. The same caution applies when the sampled area is smaller than 100 m². These criteria reflect the need for sampling to be adequate to assess the abundance and structure of the fish assemblage and the population structure of the species caught. The index seems relatively independent of the number of fish caught. This could be directly related to the modelling methods used: all four selected metrics are modelled after taking into account a measure of the total richness or the number of fish caught, depending on the metric. Nevertheless, too low a number of fish caught would alter the capacity of the index to assess the ecological status robustly. The user has to be careful when the number of fish caught is less than 30 individuals, and a warning is included in the output of the software in such a situation.

Two cases could be problematic and the EFI+ should be used with care: (1) undisturbed rivers with naturally low fish density and (2) heavily disturbed sites where fishes are nearly extinct.

In the first case, fish are close to the natural limits of occurrence and therefore might not be good indicators for human impacts. The occurrence of fish in those rivers is highly coincidental and therefore not predictable. In the second case, where the very low density is caused by severe human impacts, simpler methods or even expert judgement are sufficient to assess the ecological status of the river. Consequently, when no fish occur at a site, this method is not applicable.
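The cautions in this section can be expressed as a simple pre-check. The sketch below is a hypothetical helper, not the warning logic of the EFI+ software; only the thresholds (30 individuals, 100 m², no fish caught) come from the text, while the function name and the message wording are assumptions.

```python
def sampling_warnings(n_fish: int, fished_area_m2: float) -> list:
    """Return cautionary messages for a sample, following the thresholds in the text."""
    if n_fish < 0 or fished_area_m2 <= 0:
        raise ValueError("n_fish must be >= 0 and fished_area_m2 must be > 0")
    if n_fish == 0:
        # With no fish at the site, the method is not applicable.
        return ["no fish caught: the index is not applicable"]
    messages = []
    if n_fish < 30:
        messages.append("fewer than 30 fish caught: interpret the index with caution")
    if fished_area_m2 < 100:
        messages.append("fished area below 100 m2: interpret the index with caution")
    return messages
```

An empty list means that none of the cautions described here applies to the sample.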

Limitations in the use of the index due to the sampling method

The EFI+ has been calibrated using only fish data obtained from single-pass electric fishing (or the first run of a depletion survey). Therefore, the model is calibrated only on catch-per-unit-effort (CPUE) data and not on quantitative population estimates. If population estimates from multiple passes are used (i.e. same site fished several times and catches cumulated), the EFI+ will produce erroneous results. Therefore, where multi-pass sites are considered, only the first run data should be used to calculate the indices.

The sampling method (boating or wading) has a strong impact on the index values. Most of the calibration sites were sampled by wading and it was not possible to include the sampling method as a potential explanatory variable within the calibration model. The number of sites sampled by boating in the Salmonid type is very limited, but the response of the index at these sites does not differ much from that at sites sampled by wading. Conversely, there is a clear effect of the sampling method on the index values for the Cyprinid type. In most cases, sites sampled by boating tend to exhibit lower index values. This pattern is observed across the dataset, and these low values for boat-sampled sites were not generated by sites from any particular region or country.

In conclusion, it seems that the EFI+, in its present state, should be used with caution when sites have been sampled by boating, especially for the Cyprinid type, i.e. for larger and deeper rivers. Nevertheless, in the case of the Cyprinid index an alternative rating system is proposed to allow a specific ranking of sites sampled by boating in the five ecological classes. The boundaries between classes are defined in a different way than for sites sampled by wading. However, this alternative scoring should be considered only as preliminary, and more work is needed to evaluate this index (including additional collection of data from undisturbed or slightly disturbed sites sampled by boating).

4. EFI+ application (sampling and data requirements)

The general procedure of the EFI+ application is shown below (Figure 5).

1. Site selection
2. Data collection (1. Location; 2. Environment; 3. Sampling procedure) and fish sampling (fish data collection)
3. Input of data into database
4. Assessment of site with the EFI+ software:
   1. Import of data to software
   2. Run the software
   3. Output:
      - Identification of river type and appropriate index
      - Calculation of observed, theoretical and probability metrics
      - Calculation of EFI+ score and assignment to status class

Figure 5: Different steps required to apply the assessment method

4.1. Site selection

The selected site should be representative within the river segment in terms of habitat types and diversity, landscape use and intensity of human pressures.

A river segment is defined as:

1 km for small rivers (catchment<100 km²)

5 km for medium-sized rivers (100-1000 km²)

10 km for large rivers (>1000 km²)

A segment for a small river will thus be 500 m upstream and 500 m downstream of the sampling site.
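The segment definition can be written as a small lookup. The following Python sketch assumes that a catchment of exactly 100 km² or 1000 km² counts as medium-sized, which follows from the class limits given above; the function name `segment_length_km` is hypothetical.

```python
def segment_length_km(catchment_km2: float) -> int:
    """Return the river segment length (km) for a given catchment area (km²)."""
    if catchment_km2 <= 0:
        raise ValueError("catchment area must be positive")
    if catchment_km2 < 100:
        return 1   # small rivers
    if catchment_km2 <= 1000:
        return 5   # medium-sized rivers
    return 10      # large rivers


# For a small river the segment extends 500 m on either side of the site.
half_extent_m = segment_length_km(40) * 1000 / 2
```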

4.2. Environmental variables and sampling methods

To model the reference conditions for the sampled site, the variables from Annex 1-Table 2 should be recorded in the data sheet (see field protocol in Annex 1).

4.3. Fish sampling

To calculate the index, only fish data obtained by electric fishing can be used. Standardised electric fishing procedures for wadable and non-wadable rivers are precisely described in the CEN standard “Water Analysis – Fishing with Electricity” (EN 14011; CEN, 2003).

Fishing procedures and equipment differ depending upon the water depth and wetted width of the sampling site. The selection of waveform, DC (Direct Current) or PDC (Pulsed Direct Current) depends on the conductivity of the water, the dimensions of the water body and the fish species to be expected. AC (Alternating Current) is harmful for the fish and should not be used. The fishing procedure is summarised below, separately for wadable and non-wadable rivers. In both cases, fishing equipment must be suitable to sample small individuals (young-of-the-year).

According to the CEN standard, the main purpose of the standardised sampling procedure is to record information concerning fish composition and abundance; therefore, the standard defines no sampling period. However, the EFI+ approach recommends a sampling period of late summer/early autumn, except for non-permanent Mediterranean rivers where spring samples may be more appropriate.

Electric fishing at a given site must be conducted over a river length of 10 to 20 times the river width, with a minimum length of 100 m. This is to ensure that sampling covers the variability of habitats and fish communities within river sections, and to ensure accurate characterisation of a fish assemblage. However, in large and shallow rivers (width > 15 m and water depth < 70 cm) where electric fishing by wading can be used, several sampling areas totalling at least 1000 m² should be prospected, covering all types of mesohabitats present in a given sampling site (partial sampling method). The length of the sampling site (station) is also calculated as 10 to 20 times the river width. Fishing of longer river sections should be avoided, as some metrics referring to the number of species caught (e.g. number of rheophilic species) might be biased due to oversampling.
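The length rule can be checked with a short calculation. The Python sketch below is illustrative only; the function names are hypothetical, and the second function covers just the wadable case described in this paragraph (width above 15 m and depth below 0.7 m).

```python
def fishing_length_m(wetted_width_m: float) -> tuple:
    """Return the (minimum, maximum) fishing length in metres:
    10 to 20 times the wetted width, never less than 100 m."""
    if wetted_width_m <= 0:
        raise ValueError("wetted width must be positive")
    return (max(100.0, 10 * wetted_width_m), max(100.0, 20 * wetted_width_m))


def wadable_partial_sampling(wetted_width_m: float, depth_m: float) -> bool:
    """True for large, shallow wadable rivers, where several sampling areas
    totalling at least 1000 m² are prospected instead of the whole surface."""
    return wetted_width_m > 15 and depth_m < 0.7
```

For a 4 m wide stream, both limits collapse to the 100 m minimum; for a 12 m wide river the site is 120 to 240 m long.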

As a general guide, one anode per 5 m of wetted width should be appropriate for sampling in wadable rivers. The operators should fish upstream so that water and sediment disturbed by wading do not affect efficiency. Operators should move slowly, covering the habitat with a sweeping movement of the anodes and attempting to draw fish out of hiding. To aid effective fish capture in fast-flowing water, the catching nets should be held in the wake of the anode. Each anode is generally followed by one or two hand-netters (hand net: mesh size of 6 mm maximum) and one suitable vessel for transporting fish (Table 12).

In large rivers, the depth (> 0.7 m) and variety of habitats make prospecting the entire area impossible. Therefore, a partial sampling procedure is applied covering all types of habitats to obtain a representative sample of the site. Qualitative and semi-quantitative information can be obtained by using conventional electric fishing with hand-held electrodes in the river margins and delimited areas of habitat. Alternatively, where resources exist, capture efficiency can be improved by increasing the size of the effective electric field relative to the area being fished by increasing the number of catching electrodes (electric fishing boats with booms). Arrays comprising many pendant electrodes can be mounted on booms attached to the bows of the fishing boat. The principal array should be entirely anodic with separate provision being made for cathodes. Depending upon water conductivity, the current demands of multiple electrodes can be high and large generators and powerful control boxes may be needed (Tables 12 and 13). In Tables 12 and 13, the river width corresponds to the wetted width.

Table 12: Fishing method: Rivers < 0.7 m depth = wadable rivers

| Parameter | Specification |
|---|---|
| Waveform selection | DC or PDC |
| Number of anodes | One anode per 5 m of wetted width |
| Number of hand-netters | Each anode followed by 1 or 2 hand-netters (mesh size of 6 mm maximum) and 1 suitable vessel for holding fish |
| Number of runs | One run |
| Time of the day | Daylight hours |
| Fishing length | 10 - 20 times the wetted width, with a minimum length of 100 m |
| Fished area | River width < 15 m: the whole site surface. River width > 15 m: several separated sampling areas are selected and prospected within a sampling site, with a minimum of 1000 m² (partial sampling method) |
| Fishing direction | Upstream |
| Movement | Slowly, covering the habitat with a sweeping movement of the anodes and attempting to draw fish out of hiding |
| Stop nets | Used if necessary and feasible |

Picture 1: Electric fishing in a wadable river

Table 13: Fishing method: Rivers > 0.7 m depth = non-wadable rivers (boat fishing)

| Parameter | Specification |
|---|---|
| Waveform selection | DC or PDC |
| Number of anodes | Depending on boat configuration |
| Number of runs | One run |
| Time of the day | Daylight hours |
| Fishing length | 10 - 20 times the wetted width, with a minimum length of 100 m |
| Fished area | Both banks of the river or a number of sub-samples proportional to the diversity of the habitats present, with a minimum of 1000 m² (partial sampling method) |
| Fishing direction | Normal flow: downstream, in such a manner as to facilitate good coverage of the habitat, especially where weed beds are present or hiding places of any kind are likely to conceal fish. High flow: upstream. Low flow: not necessary to match boat movement to water flow, and the boat can be controlled by ropes from the bank side if required |
| Movement | Slowly, covering the habitat with a sweeping movement of the anodes or drifting with the boom along selected habitats and attempting to draw fish out of hiding |
| Stop nets | Used if necessary and feasible |

Picture 2: Electric fishing from a boat

4.4. Fish data

To calculate the EFI+, each collected specimen should be identified to species level by external morphological characters and the total number of specimens per species should be recorded on the field protocol data sheet (Annex 1-Table 1: fish data). The EFI+ also requires the number of fish larger and smaller than 150 mm to be recorded. Therefore, the total length (in mm) of all fish captured should be measured.
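The counts required per species can be derived directly from the measured lengths. The sketch below is a hypothetical helper, not part of the EFI+ software; it assumes that fish of exactly 150 mm are counted in the lower size group, and the dictionary keys are illustrative.

```python
from collections import defaultdict


def tally_catch(catch):
    """Summarise first-run catches per species.

    catch: iterable of (species_name, total_length_mm) pairs.
    Returns {species: {"total": n, "below": n, "over": n}}.
    """
    counts = defaultdict(lambda: {"total": 0, "below": 0, "over": 0})
    for species, length_mm in catch:
        if length_mm <= 0:
            raise ValueError("total length must be positive")
        row = counts[species]
        row["total"] += 1
        if length_mm > 150:
            row["over"] += 1
        else:
            # Assumption: exactly 150 mm is counted in the lower group.
            row["below"] += 1
    return dict(counts)


summary = tally_catch([
    ("Salmo salar", 120),
    ("Salmo salar", 150),
    ("Salmo salar", 151),
    ("Anguilla anguilla", 400),
])
```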


Part II: A manual for application

1. Introduction

This part of the manual explains how to use the web-based EFI+ software. It details the data input, the interpretation and handling of error messages and the features of the software output.

The EFI+ software is implemented as a web client-server application that uses R scripts for the statistical processing.

2. Web page

The EFI+ software is accessible via the webpage of the EFI+ project:

http://efi-plus.boku.ac.at/software


Figure 6: The front page of EFI+

The main software web page (Figure 7) can be accessed via the link above or via the “EFI+ Software” button on the EFI+ project homepage (Figure 6).


Figure 7: The front page of EFI+ software

The first page of the software contains the main Menu from where different elements of the software can be accessed. Home, Insert data, Help, Links and Contact are the options available (Figure 7).

Software home: To return to the first page of the software.

Insert data: To access the main part of the software where the metrics and the overall score can be obtained. The data input screen provides two different options: The first one lets users manually enter the data from a single sample; the second option allows the user to upload a set of samples, previously collated in an MS Excel© file format. A formatted blank MS Excel© input file can be downloaded from this page.

Help: To obtain help about the input variables and to access the online manual and appendices as well as other helpful information needed to calculate the EFI+.

Links: Users can follow links to external web sites to obtain more information on the database and the program. There are links to, e.g., the R Project for Statistical Computing (R Development Core Team, 2008) and the River and Catchment Database for Europe (CCM, European Commission 2007). The functionality of the links to external resources is not within the responsibility of the EFI+ consortium.

At the bottom of the page there are some links concerning legal warnings and requirements, as well as privacy, ITT terms, software releases, and style formats.

3. Input data

The data required to calculate the EFI+ are described next. This includes the correct formatting to ensure data are entered into the software correctly (Table 14).

Table 14: Input variables required for the calculation of the EFI+. Asterisks indicate mandatory variables.

Variables describing the location, name of site and date of fishing

| Variable | Description |
|---|---|
| *Site code | Code given to each sampling site by the user (could be country abbreviation + user's own code of the site, e.g. DE0001). |
| *Latitude | Latitude in decimal degrees, projection WGS 84. |
| *Longitude | Longitude in decimal degrees, projection WGS 84. |
| *Day | e.g.: 08 |
| *Month | e.g.: 10 |
| *Year | e.g.: 2008 |
| Country | Name of country (should be in English). |
| *River Name | National name of the river (for transboundary, small rivers, the name of the country where it confluences, i.e. Semois, Belgium – Semoy, France, could be used). |
| *Site Name | Location name, e.g. indicating a nearby town or village. |
| *Altitude | The altitude of the site in metres above average sea level. |
| *Ecoregion | Ecoregion according to Illies, according to Table 6, Part I. |
| *River Region | To define the river region use the table in Annex 2, Table 1 (e.g. Danube, Ebro, North_Sea, Mediterranean_Sea_WB). |

Variables describing the sampling method

| Variable | Description |
|---|---|
| *Sampling Location | Where the sampling site is situated in relation to the river. Categories: Main channel = sampling was done in the main channel; Backwaters = sampling was done in a floodplain water body; Mixed = sampling in both main channel and backwaters; NoData = no information on location. |
| *Method | How electric fishing was carried out, in three classes plus NoData: Boat, Wading, Mixed (sites sampled by both wading and boat). |
| *Fished Area | Area of the section that has been sampled (sampled length * sampled width), given in m². |
| *Wetted width | Wetted width in metres, normally calculated as the average of several transects across the stream. The wetted width is measured during fish sampling (performed mainly in autumn during low flow conditions). |

Environmental variables describing the sampling site, used to obtain the expected value of metrics

| Variable | Description |
|---|---|
| *Mediterranean Type | A factor of three classes that is needed for the classification of Mediterranean rivers (see Annex 2, Figure 3). |
| Natural Lake | Are there natural lakes present upstream of the site? Categories: Yes/No/NoData. Only applicable if the lake affects the fish fauna of the site, e.g. by altering thermal regime, flow regime or providing seston. Use the Water Framework Directive definition of lake: more than 50 ha. If there are artificial lakes (e.g. fish ponds upstream), these are pressures and must not be considered in environmental variables. |
| Flow Regime | Normal flow pattern for the river. Divided into four classes: Permanent = never (or extremely rarely) having zero water velocity or low flow, never drying out; Summer dry = in normal years having extreme summer low flow with no water velocity or even dry conditions (Mediterranean regime); Winter dry = in normal years having extreme winter low flow with no water velocity or even dry conditions; Intermittent = having extreme low flow with no water velocity (or even dry conditions) at intervals, the timing and length of which are unpredictable. NoData when data are not available. |
| *Geomorphology | Information in 5 categories to be selected: Naturally constraint no mob = naturally constrained without mobility (riverbed is fixed), Braided, Sinuous, Meandering regular, Meandering tortuous; NoData = not applicable. Describe the situation before any major human control of the river bed! |
| *Former Flood Plain | If the river has a former floodplain: proportion of connected floodplain still remaining. Categories: No/Small/Medium/Large/Some waterbodies remaining/NoData. |
| *Water Source | The source of the river water should be assigned to one of three classes: glacial, nival, and pluvial. Glacial = >15% glaciated area in the catchment, maximum monthly mean flow during summer. Nival = yearly flow regime dominated by snowmelt in spring, with spring maximum flow. Pluvial = yearly flow regime dominated by rainfall, maximum flow often during spring, autumn/winter. Mediterranean areas will fall under pluvial (but often with flow regime “Summer dry” or “Intermittent”). Groundwater = groundwater must be dominant! NoData = not available. (See Annex 7) |
| *Upstream Drainage Area | Drainage area upstream of the site in km². |
| *Distance from Source | Distance from source in kilometres to the sampling site measured along the river. In the case of multiple sources, measurement shall be made to the most distant upstream source (data source: maps, preferably 1:25 000). |
| *River Slope | Slope of streambed along stream expressed as per mill, m/km (‰). The slope is the drop of altitude divided by stream segment length. The stream segment should be as close as possible to 1 km for small streams, 5 km for intermediate streams and 10 km for large streams (data source: maps with scale 1:50 000 or 1:100 000). |
| *Air Temperature Mean Annual | Average annual air temperature measured for at least 10 years. Given in degrees Celsius (°C) (data source: nearby measuring site, interpolated data). |
| *Air Temperature January | Average January air temperature, given in degrees Celsius (°C) (data source: nearby measuring site, interpolated data). |
| *Air Temperature July | Average July air temperature, given in degrees Celsius (°C) (data source: nearby measuring site, interpolated data). |
| *Former Sediment | Naturally dominant sediment information in the following categories: Organic, Silt, Sand, Gravel/Pebble/Cobble, Boulder/Rock; NoData = not available. Situation before major changes of sediment conditions! (See Annex 7) |

Variables describing the fish data

| Variable | Description |
|---|---|
| *Species Name | Scientific name of species (see Annex 3). |
| *Total Number Run1 | All caught individuals (incl. 0+) of the species in run 1. |
| *Number Length Below | Number of individuals with total length ≤ 150 mm for a given species for the first run of sampling. |
| *Number Length Over | Number of individuals with total length > 150 mm for a given species for the first run of sampling. |

To analyse river connectivity and impacts on diadromous species, it is possible to upload additional information about the current and historical occurrence of diadromous species (according to Table 15).

Table 15: List of species selected to compute the index based on diadromous species.

| Number | Species name |
|---|---|
| 1 | Alosa alosa |
| 2 | Anguilla anguilla |
| 3 | Alosa fallax |
| 4 | Acipenser gueldenstaedti |
| 5 | Alosa immaculata |
| 6 | Acipenser nacarii |
| 7 | Acipenser nudiventris |
| 8 | Acipenser stellatus |
| 9 | Acipenser sturio |
| 10 | Diadromous Coregonidae family |
| 11 | Huso huso |
| 12 | Lampetra fluviatilis |
| 13 | Osmerus eperlanus |
| 14 | Platychtys flesus |
| 15 | Petromyzon marinus |
| 16 | Salmo salar |
| 17 | Salmo trutta trutta |

Insert information about the historical and present occurrence of each species. The valid categories for historical information are “No”, “Yes” and “NoData”. For present information the categories are “Yes” and “No”.

Please note: If the present occurrence of a diadromous species is mainly due to stocking, the user has to decide whether “No” will be chosen. If a species occurs at present and no historical data are available, historical presence can be assumed.

4. How to insert data

There are two ways to enter data: manually via the online form, or by uploading an MS Excel© file.

The manual data entry method, via an online form, uses variable modality lists and an integrated error check system to ensure correct entry of data. However, data must be entered site by site and step by step. With the MS Excel© file input, it is possible to upload large datasets. However, although the software will check for errors in the uploaded file, the user must correct these errors manually in the original data file before the software will process the dataset.

4.1. Manual data input


Figure 8: Manual data input selection

Once this input option has been chosen, a data entry form page with a large number of mandatory variables (all of them marked by an asterisk) appears. To avoid typing errors, records can be entered with the help of selection boxes (for variables Country, Date, Eco-regions, River-region, Sampling method, Flow regime, Natural lake upstream, Geomorphology, Former floodplain, Water source, Temperature of January, Temperature of July, Former sediment size, Species and Presence of diadromous species).

For some variables, information signs are available that can be clicked to access help with definitions or possible values of variables. Once the data have been filled in, they can be sent to the server (by clicking on the “Send” button) to check for errors and possible solutions. This information will be indicated on a new screen. Categorical data are checked against a list of possible classes and modalities whilst numerical variables are checked against a range of allowable values (see Table 16).

Table 16: Allowed ranges of variables.

| Variable | Range |
|---|---|
| Longitude | -12 to 40 |
| Latitude | 35 to 72 |
| Day | 1 to 31 |
| Month | 1 to 12 |
| Year | 1950 to 2020 |
| Altitude | 0 to 4800 m |
| Mediterranean type | 0 to 2 |
| Fished area | 50 to 100000 m² |
| River width | 1 to 6000 m |
| Upstream drainage area | 1 to 20000000 km² |
| Distance from source | 0.01 to 8000 km |
| River slope | 0.001 to 200 ‰ |
| Air temperature mean annual | -15 to 35 ºC |
| Air temperature January | -25 to 30 ºC |
| Air temperature July | -5 to 45 ºC |
| Total number of run1 | 1 to 1100000 |
| Number Length below 150mm | 0 to 1000000 |
| Number Length above 150mm | 0 to 1000000 |
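Before sending data, users may find it useful to check numerical values against these ranges themselves. The following Python sketch reproduces part of Table 16; it is not the validation code of the EFI+ server, and the dictionary keys and the function name `check_ranges` are hypothetical.

```python
# Allowed ranges taken from Table 16 (subset); key names are illustrative only.
ALLOWED_RANGES = {
    "longitude": (-12, 40),
    "latitude": (35, 72),
    "year": (1950, 2020),
    "altitude_m": (0, 4800),
    "fished_area_m2": (50, 100000),
    "river_width_m": (1, 6000),
    "distance_from_source_km": (0.01, 8000),
    "river_slope_permille": (0.001, 200),
    "air_temp_annual_c": (-15, 35),
    "air_temp_january_c": (-25, 30),
    "air_temp_july_c": (-5, 45),
}


def check_ranges(record: dict) -> list:
    """Return one message per numerical value lying outside its allowed range."""
    errors = []
    for name, value in record.items():
        if name not in ALLOWED_RANGES:
            continue  # variables without a numerical range are not checked here
        low, high = ALLOWED_RANGES[name]
        if not low <= value <= high:
            errors.append(f"{name}={value} is outside the allowed range {low} to {high}")
    return errors


errors = check_ranges({"latitude": 48.2, "longitude": 16.4, "altitude_m": 5000})
```

Here only the altitude is reported, since 5000 m exceeds the upper limit of 4800 m.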

Note: Manual input data are not recorded but the data entered are included in the output file for further use (e.g. data sheet input if you have to repeat the assessment procedure).

Figure 9: Manual entering data form

In addition, it’s possible to store manual input data by getting an MS Excel© copy before calculating the EFI+ (Figure 10).


Figure 10: Get the MS Excel© copy of input data

4.2. Data input by MS Excel© spread sheet

A standardised input data MS Excel© spread sheet file can be downloaded from the server.

Figure 11: Getting the MS Excel© input file

This spread sheet is named “EFI+SpreadSheet.xls” (Figure 12) and contains two work sheets. The first is called “input1 all actual variables”, where all site and survey information should be entered. The second work sheet is called “input2 all diadromous variables”, where information about diadromous species can be entered. Each combination of sampling occasion and species record forms one data array, which is entered as one row of the input1 sheet. Adding diadromous species information to the corresponding data arrays in the input2 sheet is optional.

Note: The name of the provided spread sheet can be changed by the user, but the new name must not contain any spaces between words and must not include special characters (e.g. “(”). To guarantee that the upload works, it is recommended to always use the MS Excel© input file provided at the software web site with its original file name (Figure 11).
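A renamed file can be checked before upload with a small Python sketch such as the following. The manual only names spaces and characters such as “(” as forbidden, so the set of accepted characters used here (letters, digits, “+”, “_”, “-” and “.”) is an assumption and not the rule actually applied by the server.

```python
import re

# Assumption: only these characters are considered safe in a file name.
SAFE_NAME = re.compile(r"[A-Za-z0-9+_.-]+")


def is_safe_filename(name):
    """Return True if the name has no spaces and no characters outside the safe set."""
    return SAFE_NAME.fullmatch(name) is not None


original_ok = is_safe_filename("EFI+SpreadSheet.xls")  # original file name
renamed_ok = is_safe_filename("my data (v2).xls")      # hypothetical renamed file
```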

Figure 12: MS Excel© input file

In the first work sheet (“input1 all actual variables”) each variable has to be entered in the appropriate column. Data and variable names must be entered carefully, and the correct variable name must be assigned to each data column; otherwise an error message will appear after the sheet has been checked by the software.

Note: Correct formatting and entry of values in the worksheet is essential. When formatting and variable mistakes happen, the array of data will be identified as an error by the software; therefore it is extremely important that imported data are thoroughly checked according to the input variable definitions in Annex 5-Table 1.

The second work sheet deals with diadromous species (“input2 all diadromous variables”). It is composed of a set of reference variables, which can be used to relate information on diadromous species status to the samples recorded in work sheet one, and a second group of Yes/No/NoData variables about presence/absence of diadromous species.

Once data have been entered into the blank data file, the MS Excel© file can be uploaded into the software using the “Browse” function to locate the file on the computer and the “Send file” button to send the information to the software. During the upload the file is sent to the server for a check, and errors in data formats and possible solutions are indicated on the next screen. Categorical data are checked against a list of possible classes and modalities whilst numerical variables are checked against a range of allowable values (see Table 16). Errors in the dataset must be corrected in the original MS Excel© file and then the corrected file must be re-imported into the software.

Figure 13: Example of warnings and errors page

The checked input files are converted into ASCII files in a semicolon-separated format: the first row holds the names of the variables used by the R modules that calculate the metrics, and the remaining rows hold the input data (Figure 14).

Figure 14: Content of Input.txt
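The layout of this file (semicolon-separated, variable names in the first row) can be illustrated with a short Python sketch. The software itself processes the file with R modules; the sample content and its three column names below are purely illustrative and do not reproduce the real Input.txt.

```python
import csv
import io


def read_input(text):
    """Parse semicolon-separated text whose first row holds the variable names."""
    reader = csv.DictReader(io.StringIO(text), delimiter=";")
    return list(reader)


# Hypothetical two-record sample; real files contain many more variables.
sample = "Sample.code;Latitude;Longitude\nS1;45.2;4.8\nS2;47.1;8.3\n"
rows = read_input(sample)
latitudes = [float(row["Latitude"]) for row in rows]
```

Each row becomes a dictionary keyed by variable name, and values stay as text until they are converted explicitly, as done for the latitudes.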

After upload of a clean data file, a separate web page will appear where the ASCII versions of both work sheets can be displayed (either “See the R-Input-file” or “See the R-Input-Diadromous file”) prior to calculation of the EFI+.

At this stage the EFI+ values can be calculated by clicking on the “Run script” button. This activity processes the ASCII input files using the R-Script analysis and produces the output files.

Figure 15: Running the script for the calculation of the EFI+

Windows-related problem

If you are using MS Excel© 2003 with “Service Pack 3 for Office 2003”, you may get the error message “File error: data may have been lost”. In this case, click OK once; the file then opens without any data being lost. The error message is related only to a version conflict of MS Excel© and has no influence on the calculation of results.

Figure 16: The MS Windows-related error message

Getting help

The EFI+ calculation software contains different help systems. For manual data input, selection boxes help to avoid misspelling for different variables and some variables have a mark up reference that helps users with definitions and deeper information.

Clicking the “Help” button opens a web page with detailed help information. The section “See documentation - PDF” provides the EFI+ software manual and the annexes in PDF format. In addition, a filled input template file is available, and the Illies ecoregions and the Mediterranean type can be checked directly via Google Maps (“See documentation - WEB”). The section “View values” explains all data required for the EFI+ calculation.

Figure 17: The web help system

5. Output

The final index values are computed and exported to an MS Excel© file (NewEFI+output.xls) by the web server. The output file is composed of four sets of data: reference and classification data, observed and expected metrics, partial and aggregated indices and final scores (Figure 18).

Reference and classification data

Contents: site, date, latitude, longitude, ecoregion, sampling method, EFT.river.typology, ST-Species, River type and Comment.river.type.

This set of variables reports the site information (Sample.code, Date, Latitude, Longitude) and the typological variables (Modified.ecoregion, Modified.typology), as well as the classification into Salmonid or Cyprinid type. It also includes the modified Illies ecoregion classification, in which the Mediterranean region has been subdivided.

Figure 18: Showing the first part of the MS Excel© output table.

Observed and expected metrics

Contents: Obs.dens.HINTOL.inf.150, Obs.dens.O2INTOL, Obs.ric.RH.PAR, Obs.dens.LITH, Exp.dens.HINTOL.inf150, Exp.dens.O2INTOL, Exp.ric.RH.PAR, Exp.dens.LITH, Hist.ric.diadromous, Present.ric.diadromous

This part of the output includes the observed density and richness data: Obs.dens.HINTOL.inf.150, density of individuals with length below 150 mm that belong to species intolerant to habitat degradation; Obs.dens.O2INTOL, density of species intolerant to oxygen depletion; Obs.ric.RH.PAR, richness (number of species) of species with rheophilic reproduction habitat; Obs.dens.LITH, density of species with lithophilic reproduction habitat.

Also included are the expected values predicted by the software models (Exp.dens.HINTOL.inf.150, Exp.dens.O2INTOL, Exp.ric.RH.PAR and Exp.dens.LITH), as well as the current presence of diadromous species (Present.ric.diadromous) and the historical one (Hist.ric.diadromous).

Partial and aggregated indices

Contents: Ids.dens.HINTOL.inf.150, Ids.dens.O2INTOL, Ids.ric.RH.PAR, Ids.dens.LITH, Aggregated.score.Salmonid.zone and Aggregated.score.Cyprinid.zone, Ids.ric.diadromous

This set of variables includes the scores of the indices Ids.dens.HINTOL.inf.150, Ids.dens.O2INTOL, Ids.ric.RH.PAR and Ids.dens.LITH, as well as the aggregated indices (Aggregated.score.Salmonid.zone and Aggregated.score.Cyprinid.zone), each obtained as the average of two single indices. For the Salmonid index it is the average of the oxygen-intolerant species index and the habitat-intolerant species index; for the Cyprinid index it is the average of the indices related to rheopar species. In addition, the variable “Ids.ric.diadromous” (index of diadromous species richness) is obtained by comparing the current presence of diadromous species (Present.ric.diadromous) with the historical one (Hist.ric.diadromous). If the current occurrence of diadromous species is mainly due to natural reproduction and this index is close to 1, it is unlikely that there are connectivity problems on a catchment scale. Conversely, an index value close to 0 indicates the possibility of severe migration barriers.

Final scores

Contents: Fish.Index, Fish.Index.class, Comment Fish Index, Comments sampling effort.

Finally, the aggregated indices are classified in a five class range (Class.scores.Salmonid.zone and Class.scores.Cyprinid.zone, see Table 10 in Part I).
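The averaging that produces the two aggregated indices can be sketched in Python as follows. The Salmonid rule follows the description of the partial indices (mean of the habitat-intolerant and oxygen-intolerant indices). Pairing Ids.ric.RH.PAR with Ids.dens.LITH for the Cyprinid score is an assumption based on the list of partial indices, and the class boundaries of Table 10 are not reproduced because they are not given in this part of the manual.

```python
from statistics import mean


def aggregated_scores(ids):
    """Average pairs of partial indices into the two aggregated scores."""
    salmonid = mean([ids["Ids.dens.HINTOL.inf.150"], ids["Ids.dens.O2INTOL"]])
    # Assumption: the Cyprinid score averages the two remaining partial indices.
    cyprinid = mean([ids["Ids.ric.RH.PAR"], ids["Ids.dens.LITH"]])
    return {
        "Aggregated.score.Salmonid.zone": salmonid,
        "Aggregated.score.Cyprinid.zone": cyprinid,
    }


# Hypothetical partial index values for one site.
scores = aggregated_scores({
    "Ids.dens.HINTOL.inf.150": 0.75,
    "Ids.dens.O2INTOL": 0.5,
    "Ids.ric.RH.PAR": 0.25,
    "Ids.dens.LITH": 0.5,
})
```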

In addition to the final metric values, comments and information regarding the limits of the EFI+ are given in the output file on a site specific basis. Four additional fields with important comments about the limits of results are indicated (Table 17). These four fields give comments and suggestions for end-users to validate the options used and assessments made.

Table 17: Comments and explanation of the software output (Comment: Explanation).

SAMPLING DATE
  Ok: Sampling date is within the EFI+ sampling period between August and November.
  Date out of sampling period: Please handle EFI+ results with care. The EFI+ sampling period must be between August and November, to consider recruitment and the end of the productive season.

RIVER TYPE
  Nothing to report: The initial river type selected seems correct (good agreement between the proportion of Salmonid type species and the selected river type).
  To be checked by user: The initial river type selected is not in agreement with the proportion of Salmonid type species. User has to confirm the river type and the index choice.

SAMPLING LOCATION
  Nothing to report: The main channel has been sampled.
  Fish index to be used with caution: Both the main channel and some connected water bodies have been sampled.
  Fish index inadequate: Only backwaters have been sampled.

SAMPLING EFFORT
  Nothing to report: High number of fish caught.
  Fish index to be used with caution: Low number of fish caught (less than 30 fish).

SPECIES RICHNESS
  Nothing to report: Species richness moderate or high.
  Fish index to be used with caution: Only one species.
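Three of the rules in Table 17 are simple enough to be sketched in Python. The function names are hypothetical, and two points are assumptions: that “between August and November” includes both months, and that a catch of exactly 30 fish raises no warning. The river type and sampling location comments are omitted because their decision criteria are not quantified in the table.

```python
CAUTION = "Fish index to be used with caution"
NOTHING = "Nothing to report"


def sampling_date_comment(month):
    """Month is 1 to 12; August to November (inclusive, by assumption) is accepted."""
    return "Ok" if 8 <= month <= 11 else "Date out of sampling period"


def sampling_effort_comment(n_fish):
    """Fewer than 30 fish caught triggers the caution comment."""
    return CAUTION if n_fish < 30 else NOTHING


def species_richness_comment(n_species):
    """A single species triggers the caution comment; zero species is not covered."""
    if n_species < 1:
        raise ValueError("Table 17 does not cover samples without any species")
    return CAUTION if n_species == 1 else NOTHING


# Hypothetical sample: fished in September, 24 fish of 3 species.
comments = (
    sampling_date_comment(9),
    sampling_effort_comment(24),
    species_richness_comment(3),
)
```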

6. References

CEN, 2003. Water Analysis: Fishing with Electricity (EN 14011; CEN, 2003) for wadable and non-wadable rivers.

EU Water Framework Directive, 2000. Directive 2000/60/EC of the European Parliament and the Council of 23 October 2000 establishing a framework for community action in the field of water policy. Official Journal of the European Communities (22.12.2000) L327,1.

European Commission 2007, EUR 22649 EN, DG Joint Research Centre, Institute for the Environment and Sustainability, European River and Catchment Database – Version 2.0 (CCM2) – Analysis Tools, Authors Jürgen Vogt, Stephanie Foisneau, 22pp. ISSN 1018

FAME, 2001-2004. FAME - Development, Evaluation and Implementation of a Standardised Fish-based Assessment Method for the Ecological Status of European Rivers. A Contribution to the Water Framework Directive. A project under the 5th Framework Programme Energy, Environment and Sustainable Development Key Action 1: Sustainable Management and Quality of Water. Contract No: EVK1 -CT-2001-00094. http://fame.boku.ac.at

FAME CONSORTIUM (2004). Manual for the application of the European Fish Index - EFI. A fish-based method to assess the ecological status of European rivers in support of the Water Framework Directive. Version 1.1, January 2005.

Karr J.R., 1981. Assessment of biotic integrity using fish communities. Fisheries 6 (6), 21-27.

Melcher A., Schmutz S., Haidvogl G., 2007. Spatially based methods to assess the ecological status of European fish assemblage types. Fisheries Management and Ecology 14, 453-463.

R Development Core Team (2008). R: A language and environment for statistical computing. R Foundation for Statistical Computing. Vienna, Austria. ISBN 3-900051-07-0, URL: http://www.R-project.org

Appendices

Appendix 1: Description of the EFI+ data base

(A) List of the environmental variables

Altitude
Ecoregion
Size of catchment
Distance from source
Distance from the sea
Actual river slope
Valley slope
Valley form
Geomorphological river type
Water source type
Flow regime
Geological typology
Floodplain
Natural sediment
Presence of a lake upstream
Strahler rank
Mean annual precipitation (primary and upstream catchment)
Mean annual temperature (primary and upstream catchment)
Mean monthly temperature (x12)
Land cover
Fished area
Width
Wetted width
Conductivity

Appendix 1: Description of the EFI+ data base

(B) Pressures (grouped by alteration; Pressure: Modality)

Connectivity
  Presence of downstream barriers (catchment scale): no, yes, partial
  Presence of upstream barriers (segment scale): no, yes, partial
  Presence of downstream barriers (segment scale): no, yes, partial
  Number of barriers in the segment upstream
  Number of barriers in the segment downstream
  Distance to next upstream barrier in the segment: km
  Distance to next downstream barrier in the segment: km

Hydrology
  Natural flow velocity reduced at site due to impoundment: no, weak, strong
  Natural flow velocity increase: no, yes
  Site affected by hydropeaking: no, yes
  Water abstraction: no, weak, strong
  Main purpose for water usage: no, irrigation, hydropower, drinking water, industrial water, snow production, nuclear or thermal plant cooling, fish ponds, other
  Presence of colinear connected reservoir: no, yes
  Seasonal hydrograph modification: no, yes
  Alteration of water thermal regime: no, permanent increase, permanent decrease, summer increase, summer decrease
  Impact of reservoir flushing: no, yes

Morphology
  Input of fine sediment: no, weak, medium, high
  Alteration of natural morphological channel plan form: no, intermediate, straightened
  Alteration of the cross section: no, intermediate, technical cross section/U-profile
  Alteration of instream habitat condition: no, weak, strong
  Alteration of riparian vegetation: no, slight, intermediate, high
  Artificial embankment: no, local, continuous permeable, continuous no permeability
  Presence of dykes for flood protection: no, yes
  Flood plain protection, proportion of connected floodplain still remaining: no, more than 50%, between 10 and 50%, less than 10%, residual

Water quality
  Toxic substances: no, intermediate, high concentration
  Acidification: no, yes
  Water quality index: very good, good, moderate, poor, bad
  Artificial eutrophication: no, low, intermediate, extreme
  Organic pollution: no, weak, strong
  Organic siltation: no, yes

Use
  Navigation: no, low, medium, strong

Appendix 2: EFI+ user guide7 (see pages 343–387).

7 http://efi-plus.boku.ac.at/software/doc/EFI+Manual.pdf