Assessing the links between phylogeny and ecological traits in diatoms: prospects for use in the bioindication of aquatic environments. François Keck

To cite this version:

François Keck. Évaluation des liens entre phylogénie et traits écologiques chez les diatomées : pistes d’utilisation pour la bioindication des milieux aquatiques. Biodiversité et Ecologie. Université Grenoble Alpes, 2016. Français. NNT: 2016GREAA004. tel-01502707

HAL Id: tel-01502707 https://tel.archives-ouvertes.fr/tel-01502707 Submitted on 6 Apr 2017

HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.

THESIS
Submitted for the degree of DOCTOR OF THE COMMUNAUTÉ UNIVERSITÉ GRENOBLE ALPES
Specialty: Biodiversity, Ecology, Environment
Ministerial decree: 7 August 2006

Presented by François KECK

Thesis supervised by Alain FRANC and co-supervised by Agnès BOUCHEZ and Frédéric RIMET, prepared at the Laboratoire INRA UMR CARRTEL within the École Doctorale SISEO

Assessing the links between phylogeny and ecological traits in diatoms: prospects for use in the bioindication of aquatic environments

Thesis publicly defended on 26 April 2016, before a jury composed of:
M. Philippe USSEGLIO-POLATERA, PR, Université de Lorraine, Metz (Reviewer)
M. Emmanuel PARADIS, DR, Institut de recherche pour le développement, Montpellier (Reviewer)
M. Yorick REYJOL, CM, Office national de l'eau et des milieux aquatiques, Vincennes (Examiner)
M. Koen SABBE, PR, Ghent University, Ghent, Belgium (Chair)

Doctoral thesis

Assessing the links between phylogeny and ecological traits in diatoms: prospects for use in the bioindication of aquatic environments.

François Keck

Thonon-les-Bains 2016 Centre Alpin de Recherche sur les Réseaux Trophiques et Écosystèmes Limniques

Institut National de la Recherche Agronomique

Université Savoie Mont Blanc

Station d’Hydrobiologie Lacustre

75 avenue de Corzent

74203 Thonon-les-Bains Cedex, France

Tel. +33 (0)4 50 26 78 00
http://www.dijon.inra.fr/thonon

Under various names I have praised only you, rivers!

— Czesław Miłosz, Rivers

Abstract

Diatoms are microalgae widely used to assess the ecological quality of aquatic environments. The vast majority of diatom-based biotic indices rely on the pollution sensitivity of species. This hinders their use, because taxonomic identification at the species level is complex, time-consuming, costly and error-prone. To simplify identification, biotic indices based on taxonomic levels above the species, such as the genus, have been developed. However, the loss of information that comes with reduced taxonomic resolution is likely to make these tools less effective.

A more recent alternative approach proposes to base the simplification on phylogeny rather than on taxonomy. This approach implicitly assumes that there is a phylogenetic signal in the ecological preferences of species, that is, that two phylogenetically close species are more likely to show similar ecological responses than two species taken at random. If such a signal exists, it implies possible phylogenetic redundancy in existing bioindication tools, particularly those based on the finest taxonomic levels. The aim is to exploit this signal to simplify the ecological assessment of aquatic environments.

This work develops this approach for diatoms and is divided into three parts. We first present a new R package entirely dedicated to the analysis of phylogenetic signal and to the study of the distribution of trait values in phylogenies. We then demonstrate the presence of a phylogenetic signal for many ecological traits in freshwater diatoms. These traits are the ecological optima of 127 species for a set of physico-chemical parameters, measured over eight years in rivers of eastern France. We show that the signal varies among traits but that the ecological niche of the species studied is, in general, dependent on phylogeny. In a third part, we propose a method to extract clusters of species that share similar traits while being phylogenetically close. We apply this method to pollution sensitivity data to demonstrate the possibilities for simplifying diatom-based biotic indices by taking phylogenetic redundancy into account. Our results tend to show that the potential for simplification using phylogeny as a guide is significant.

Summary

Diatoms are micro-algae commonly used to assess the ecological quality of freshwaters. Most diatom-based biological indices rely on species sensitivity to pollution. This constitutes an obstacle to the use of diatoms in ecological assessment because taxonomic identification at species level is difficult, time-consuming, costly and a source of errors. To avoid this problem, scientists developed biological indices based on higher taxonomic levels such as the genus. However, the loss of information caused by the decrease in taxonomic resolution can make these methods less efficient.

A more recent alternative proposes to use phylogeny to simplify ecological assessment methods. This approach makes the implicit hypothesis that there is a phylogenetic signal in species' ecological preferences, i.e. that closely related species are more likely to share similar ecological preferences than species taken at random. If such a signal exists, it may mean that there is phylogenetic redundancy in bioassessment tools, especially those based on the species level. The aim is to exploit this signal to simplify the biological assessment of aquatic ecosystems. This work aims to develop this approach with diatoms and is divided into three parts.

First, we introduce a new R package dedicated to the analysis of phylogenetic signal and to the study of trait value patterns in phylogenies. In a second part, we demonstrate the presence of phylogenetic signal for many ecological traits in freshwater diatoms. These traits are the ecological optima of 127 species for a set of physical and chemical parameters. They were estimated from data collected over 8 years in rivers in eastern France. We show that the strength of the signal varies significantly from one trait to another but that, overall, diatom ecological niches are related to phylogeny. Finally, in a third part, we introduce a new method to extract clusters of species that share similar traits and are phylogenetically related. We apply this method to pollution sensitivity data in order to demonstrate the possibility of simplifying diatom-based biological indices by taking phylogenetic redundancy into account. Our results suggest that phylogenetic approaches offer scope for simplification without an important loss of ecological information.

Acknowledgements

I would like to express my most sincere thanks to my three thesis supervisors, Agnès Bouchez, Frédéric Rimet and Alain Franc, for giving me the opportunity to carry out this thesis and for the trust they placed in me.

I would also like to warmly thank the members of the jury, Emmanuel Paradis, Philippe Usseglio-Polatera, Yorick Reyjol and Koen Sabbe, who did me the honour of agreeing to assess my work.

Thank you to the members of my thesis committee, Marie-Agnès Coutellec, Maria Kahlert and Stéphane Dray, for their attentive listening and sound advice.

Thank you to colleagues at INRA in Thonon-les-Bains and elsewhere.

Thank you to my family, thank you to my friends. Thank you to Teofana.

Table of contents

Abstract
Summary
Acknowledgements
Table of contents

1 Introduction
1.1 General context
1.2 Bioindication and biological indicators
1.3 Diatoms
1.4 Diatoms in bioindication
1.5 The phylogenetic signal approach
1.6 Organization of the thesis

2 phylosignal: An R package to measure, test and explore the phylogenetic signal
2.1 Introduction
2.2 The phylosignal package
2.3 Example: Phylogenetic signal of pollution sensitivity in diatoms
2.4 Conclusion
2.5 Acknowledgments

3 Phylogenetic signal in ecology: perspectives for aquatic ecosystems biomonitoring
3.1 Introduction
3.2 Material and Methods
3.3 Results
3.4 Discussion
3.5 Acknowledgments

4 Linking phylogenetic similarity and pollution sensitivity to develop ecological assessment methods: a test with river diatoms
4.1 Introduction
4.2 Material and Methods
4.3 Results
4.4 Discussion
4.5 Acknowledgments

5 Discussion and perspectives
5.1 Phylogenetic signal and ecological traits in diatoms
5.2 Phylogenetic clusters and simplification of bioindication tools
5.3 Phylogenetic signal and clusters: extension to metabarcoding

A Linking Diatom Sensitivity to Herbicides to Phylogeny: A Step Forward for Biomonitoring?
A.1 Introduction
A.2 Materials and Methods
A.3 Results
A.4 Discussion

B Supplementary material
B.1 Additional material for Chapter 3
B.2 Additional material for Chapter 4
B.3 Additional material for Appendix A
B.4 Data accessibility

Bibliography

CHAPTER 1 Introduction

1.1 General context

Only a tiny fraction of the water on Earth flows in rivers. Rivers are nevertheless a compartment of prime importance, an elementary component of the hydrological cycle that carries fresh water from continental rainfall and glaciers to the oceans. They are extremely varied natural environments, home to many animal and plant species interacting within complex ecosystems. Although fresh waters cover only 0.01% of the Earth's surface, they host 9.5% of the animal biodiversity known on Earth (Balian et al., 2007).

Rivers are also closely tied to the development of human activities. Since prehistory, they have provided people with fresh water for drinking, but also with food through fishing, a means of transport, the possibility of irrigating crops, a source of energy, and more. Human societies have always benefited from these many ecosystem services and amenities for their development and remain largely dependent on them.

Over the last two centuries, human pressures on the environment have multiplied and intensified (Rockström et al., 2009). Freshwater ecosystems are particularly affected by human activities, notably through chemical pollution of water and hydromorphological alteration of habitats (Carpenter et al., 2011). The degradation of aquatic environments has deleterious effects on ecosystems that are difficult to reverse (Dudgeon et al., 2006; Vörösmarty et al., 2010). Biodiversity loss appears to be fastest in fresh waters, ahead of marine and terrestrial environments (Jenkins, 2003).

Industry, agriculture and urban areas release several tens of millions of tonnes of chemical compounds into fresh waters every year (Schwarzenbach et al., 2006). These products are extremely varied and their effects are often harmful to aquatic ecosystems. In Europe, more than 100,000 compounds are registered and hundreds of synthetic molecules are developed every year (Schwarzenbach et al., 2006). A large share of these products is likely to contaminate river water. Large quantities of nitrogen and phosphorus are transferred to watercourses in densely populated areas and agricultural regions (Filippelli, 2008; Vitousek et al., 1997). These nutrient inputs increase primary production in rivers, alter their trophic functioning, deplete oxygen and cause toxic algal blooms. Agriculture also uses many pesticides (insecticides, fungicides and herbicides) whose degradation time can be long and which are often found in surface waters (Fenner et al., 2013; Schwarzenbach et al., 2006). The biocidal properties of these compounds make them highly toxic to aquatic organisms. Rivers are also affected by discharges from mining and industrial activities, which cause major pollution by heavy metals, hydrocarbons and persistent organic compounds, whose toxicity is established (Macdonald et al., 2000). Finally, surface waters contain a growing number of so-called “emerging” pollutants (pharmaceuticals, endocrine disruptors, pollutant metabolites), potentially active at very low doses, whose occurrence, environmental fate and potential impacts on ecosystems remain poorly known (Farré et al., 2008).

In addition to chemical pollution, watercourses have undergone major physical alterations, resulting from the many developments and structures built by humans to control river discharge and channel morphology (Elmore and Kaushal, 2008; Nilsson et al., 2005). These developments take many forms: channelization, diversion, embankment, paving and recalibration of the floodplain, dams, weirs and penstocks. They are intended to combat bank erosion and flood damage, facilitate navigation, produce energy and supply water to people, livestock and crops. But by affecting key hydrological and morphological parameters (slope, depth, discharge, current velocity, sedimentation, shape and nature of the banks), these operations have consequences for habitats, which tend to lose diversity and connectivity. This leads to major changes in the biological diversity and functioning of these ecosystems.

The degradation of aquatic environments was not really perceived as a problem at first, and the scale of the phenomenon was almost certainly underestimated. From the 1950s onwards, it became apparent that river pollution and biodiversity loss came with an economic and social cost, caused by the impairment of ecosystem services provided until then (Cardinale et al., 2012; Wilson and Carpenter, 1999). This period was also one of growing ecological awareness among citizens, marked by the publication of Rachel Carson's book Silent Spring in 1962 in the United States and by the beginnings of political ecology. This movement led to environmental legislation for the preservation of ecosystems, the safeguarding of biodiversity and the protection of human health (Suter, 2008). In the field of water, the United States passed the Clean Water Act as early as 1972, with the objective of restoring and maintaining the chemical, physical and biological integrity of inland waters. In Europe, water protection is enshrined in the Water Framework Directive (2000/60/EC), which notably requires member states to inventory water bodies and to implement management plans and measures in order to achieve good chemical and ecological status by 2015.

Implementing environmental legislation requires the ability to measure and quantify degradation and to assess the progress made. In this context, bioindication methods prove particularly well suited to assessing the ecological status of watercourses. They are therefore now standardized, incorporated into legislation and applied by public agencies.

1.2 Bioindication and biological indicators

1.2.1 Definition of bioindication

The state of an environment can be described simply by assessing its physical and chemical integrity. However, these parameters offer only a limited view of the quality and ecological potential of the habitat. The principle of bioindication is to use the biological component of the ecosystem to measure environmental conditions in an integrated way (Blandin, 1986; Markert et al., 2003a). It is a very broad term that covers a wide variety of approaches and methods. Thus, a bioindicator may refer to an entire community (biocenosis), a subset of species (guilds, functional groups), a particular species (indicator species), a part of an organism (organs, tissues), or a gene (DNA) or the product of its expression (RNA, proteins).

Bioindicators have a number of advantages over physico-chemical analyses. First, they integrate variations in the environment. Unlike physico-chemical measurements, which give a snapshot of the environment, biological indicators reflect conditions over a longer period. This is a clear advantage in rivers, where environmental conditions vary in space and time (Dokulil, 2003; Stevenson et al., 2010). Second, they integrate the effects of multiple environmental factors and reflect the actual ecological capacity of the environment. Bioindicators provide a direct measure of the ecological impact of all the physico-chemical parameters of the environment and account for their additive, antagonistic and synergistic effects (Altenburger and Schmitt-Jansen, 2003). Finally, bioindicators make it possible to assess environmental quality relatively quickly and at moderate cost.

It is not easy to provide a typology of bioindication. Terminology varies from one author to another and the boundaries between some concepts remain blurred. A simple classification can be proposed that distinguishes three main approaches to bioindication:

– Biomarkers (Amiard-Triquet et al., 2013), which cover methods that measure changes (biochemical, cytological, physiological, behavioural) in organisms and relate them to the state of the environment or to a particular pollution.

– Bioaccumulators (Newman, 2014), which cover methods that use the capacity of certain organisms to accumulate certain pollutants in their tissues. Measuring the toxicants in these tissues thus provides an estimate of the exposure of individuals.

– Biological indicators, also called ecological indicators (Jørgensen et al., 2010), which cover methods that use certain species (known as indicator species) or species communities as indicators of environmental quality. Biological indicators reveal, through their presence, absence or demographic behaviour, the characteristics and evolution of an environment (Blandin, 1986).

1.2.2 Principles of biological indicators

Biological indicators have been developed to assess the quality of many environments: soils (Ritz et al., 2009), air (Conti and Cecchetti, 2001), the marine environment (Hayes et al., 2015) and wetlands (Fennessy et al., 2015). Inland waters are certainly the environment on which the most research has focused to date.

In a river, the concentrations of inorganic nutrients (nitrogen, phosphorus and carbon), oxygen and organic matter are indicators of the productivity and trophic functioning of the system. In addition to these parameters, one must take into account the very many toxicants from human activities, such as plant protection products, heavy metals, PCBs, hydrocarbons and micropollutants. The costs of these analyses and the technical impossibility of continuously monitoring thousands of molecules, sometimes active at very low concentrations, have led to the development and widespread use of biological indicators (Ibáñez et al., 2010).

Biological indicators are based on the principle that environmental conditions, and all the more so anthropogenic pressures, shape communities. The theoretical foundations of biological indicators are therefore to be found in community ecology and in the link between the environment and the assemblage of species that develops and persists there. In particular, bioindication methods based on biological indicators are strongly linked to the concept of the ecological niche. The ecological niche of a species comprises all the environmental conditions and resources needed for a viable population to develop. The representation of the niche proposed by Hutchinson (1957) is that of a hypervolume in an n-dimensional space, where each dimension corresponds to a resource or an environmental parameter. The ecological niche therefore appears to provide a simplified but relevant conceptual link between the abiotic factors of the environment and the species that develop there.
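As an illustration of the hypervolume idea, the niche can be sketched as a set of tolerated ranges, one per environmental dimension. This is only a minimal sketch: the axis-aligned (rectangular) shape, the `Niche` class, the variable names and all numerical values are assumptions made for the example and are not taken from the thesis data.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class Niche:
    """Axis-aligned approximation of an n-dimensional niche (hypothetical)."""

    # Maps each environmental dimension to its tolerated (min, max) range.
    limits: Dict[str, Tuple[float, float]]

    def contains(self, site: Dict[str, float]) -> bool:
        """Return True if the site lies within the range of every dimension."""
        return all(
            name in site and low <= site[name] <= high
            for name, (low, high) in self.limits.items()
        )


# Illustrative values only; they do not describe any real species.
niche = Niche({
    "pH": (6.5, 8.5),
    "phosphate_mg_L": (0.0, 0.1),
    "temperature_C": (4.0, 22.0),
})
clean_site = {"pH": 7.8, "phosphate_mg_L": 0.03, "temperature_C": 12.0}
enriched_site = {"pH": 7.8, "phosphate_mg_L": 0.60, "temperature_C": 12.0}

print(niche.contains(clean_site))     # every dimension is within range
print(niche.contains(enriched_site))  # phosphate exceeds the tolerated range
```

A single dimension outside its range is enough to exclude a site, which is why the presence or absence of a species can carry information about several environmental parameters at once.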

Biological indicators can provide information and data on water quality that can be used in different ways. They make it possible to characterize the effect of human pressures, to classify environments (e.g. trophic level, geological influence), to identify local and diffuse sources of pollution, and to study long-term trends in the case of multi-year monitoring. Deploying a biological indicator across a network of monitoring stations at the scale of a region or a country requires the indicator to guarantee a minimum level of performance while respecting cost constraints (training, equipment, labour, time). A suitable biological indicator should therefore ideally have the following characteristics:

– Well-documented biology.

– A wide geographical distribution.

– A guaranteed minimum abundance in the environment.

– A clearly defined link between changes in the environment and the response.

– A response time suited to the bioindication objectives (short to provide an early warning signal in the event of pollution, or long to integrate the effects of pollution).

– A simple and rapid identification process with a minimal error rate.

– A well-defined and stable taxonomy.

In practice, these characteristics are rarely all met. This has led to the development of multiple methods based on many types of organisms, in a quest for the best-performing indicator. To some extent, certain biological indicators are complementary. This complementarity can motivate a multi-indicator approach (see Section 1.2.3).

1.2.3 History of biological indicators in rivers

People have long observed that their activities could have an impact on the environment. Such observations date back at least to antiquity, as attested by the writings of Pliny the Elder (23–79 AD), cited by Markert et al. (2003b). In rivers, urbanization from the 16th century and industrialization in the 19th century heavily polluted and degraded the environment (e.g. the Thames, the Rhine, the Rhône), sometimes with spectacular manifestations and severe consequences for societies: mass fish kills, algal proliferation, and epidemics of cholera and diphtheria.

In the mid-19th century, a number of authors observed that the organisms growing in polluted waters differ substantially from those growing in unpolluted waters (Cohn, 1853, 1872; Hassall, 1850; Kolenati, 1848; Mez, 1898). However, it was not until the early 20th century and the work of Kolkwitz and Marsson (1902, 1908, 1909) that a clear link was established between biological communities and water quality. This work introduced the first biological index, based on 298 plant species and 527 animal species, which classifies waters into four levels of saprobity (level of organic matter). Saprobic indices developed and improved throughout the 20th century, eventually taking several thousand species into account (Sládecek, 1973). However, as policies to protect and restore aquatic environments took effect, organic pollution of rivers decreased, making saprobic indices less relevant. Meanwhile, anthropogenic pressures on watercourses diversified (fertilizers, pesticides, micropollutants, radionuclides, nanoparticles), making it necessary to develop a new generation of biological indicators.

Various avenues have been explored since the 1950s. These approaches build on new theoretical developments in community ecology (multidimensional ecological niches, the link between diversity and disturbance, the link between biological traits and the environment). They were also made possible by the introduction of new statistical tools in ecology, in particular multivariate analysis methods (Green, 1971; Ter Braak, 1986; Whittaker, 1967). Many autecological biotic indices have been developed around a wide variety of indicators, making it possible to assess the general state of the environment and to discriminate between certain forms of pollution. Other, more sophisticated approaches have also emerged: multimetric approaches, which attempt to combine different metrics describing the structure and function of communities (Barbour et al., 1995); predictive approaches, which compare observed communities with those that would be expected in the absence of anthropogenic pressures (Wright, 1995); and functional approaches based on ecological guilds (Karr, 1987) or on the biological and functional traits of organisms (Statzner et al., 2001; Usseglio-Polatera et al., 2000).

Although the organisms included in the first saprobic index are very diverse (Kolkwitz and Marsson, 1908, 1909), biological indication tools quickly tended to specialize in different components of the biocenosis. This is explained partly by the fact that the designers of the various tools are often specialists in one group of organisms, but also by the need to develop tools that are more sensitive and able to discriminate between different pressures. Thus, the major groups of organisms that make up the river biocenosis are used separately to assess different types of degradation (Ibáñez et al., 2010). Fish communities, for example, are good indicators of hydromorphological changes and long-term pollution, whereas macroinvertebrates are good indicators of organic pollution, micro-habitat quality and heavy metal pollution. As mentioned above, multi-indicator approaches make better use of this complementarity. They are therefore encouraged by modern legislation (e.g. the Clean Water Act in the United States and the Water Framework Directive in Europe; see also Monnier et al., 2016; Reyjol et al., 2011).

1.3 Diatoms

Diatoms are microalgae belonging to the phylum Bacillariophyta. The discovery of this group of algae dates back to the early 18th century and the advent of the first optical microscopes. Round et al. (1990) report that the first description of a diatom cell, now known to be Tabellaria flocculosa, was published in 1703 in the Philosophical Transactions of the Royal Society of London. Since then, knowledge of the biology, ecology and diversity of diatoms has continued to grow, benefiting in particular from technical advances in optical and electron microscopy and in molecular biology.

1.3.1 General biology

Diatoms are microscopic, eukaryotic, unicellular, phototrophic algae. The size of individuals varies by species from 2 µm for the smallest (e.g. Minidiscus trioculatus) to 2–5 mm for the largest marine species (e.g. Ethmodiscus rex, Rhizosolenia spp., Thalassiothrix spp.). A distinction is traditionally made between centric diatoms (Figure 1.1A, B and D), characterized by radial symmetry and a generally circular shape, and pennate diatoms (Figure 1.1C), characterized by bilateral symmetry (Round et al., 1990).

Diatoms differ from other microalgae in having a highly differentiated outer wall composed essentially of silica, SiO₂ (Round et al., 1990). This wall as a whole is called the frustule and consists of two main parts, the valves, which fit one inside the other. The two valves are connected by thin bands of silica called girdle bands, or cingulum. The larger of the two valves is called the epivalve. Together with the epicingulum, which is part of the cingulum, it forms the epitheca. Similarly, the smaller valve, the hypovalve, forms the hypotheca together with the hypocingulum. The cohesion of the whole is ensured by the organic material that is part of the cell wall (Kröger et al., 1996; Round et al., 1990). The frustule provides complete protection for the cell and isolates the contents of the cytoplasm from the external environment. Exchanges between the cell and its environment take place only through the pores and slits that ornament the frustule. Many genera of pennate diatoms are characterized by a longitudinal slit in the frustule called the raphe. The raphe may be limited to a single valve (monoraphid pennate diatoms) or extend over both valves (biraphid pennate diatoms). This structure facilitates the excretion of a polysaccharide matrix involved in adhesion to the substrate and in cell motility (Edgar and Pickett-Heaps, 1984).

Diatoms contain the organelles typical of eukaryotic plant cells, namely a nucleus bounded by an envelope and containing the nuclear genetic material, a system for protein synthesis and maturation (endoplasmic reticulum, ribosomes, Golgi apparatus) and mitochondria whose number and shape vary among species (Round et al., 1990). Because diatoms are phototrophic organisms, their cell also contains one or more chloroplasts, where the key steps of photosynthesis take place. The chloroplasts are separated from the cytosol by an envelope of four membranes and contain the chlorophyll pigments a and c as well as the carotenoid pigments β-carotene, fucoxanthin, diatoxanthin and diadinoxanthin (Stauber and Jeffrey, 1988). It is the latter that give diatom biofilms their characteristic brown colour.

Diatoms are generally diploid organisms (Kociolek and Stoermer, 1989) whose life cycle consists of a phase of vegetative multiplication and a phase of sexual reproduction (Chepurnov et al., 2004). Vegetative multiplication is the more common and extends over several months or even several years. The rate of multiplication varies with conditions and species. Cellular mechanisms slow or block division when light, nutrients or the silica needed to build new valves are limiting (Darley and Volcani, 1971; Round et al., 1990). Multiplication begins with mitosis (Pickett-Heaps et al., 1984). The DNA is duplicated and the mother cell divides into two daughter cells, each with its own nucleus. At the end of mitosis, the two daughter cells lie within the frustule of the mother cell. Each daughter cell then synthesizes a new hypovalve, while the epivalve and the hypovalve of the mother cell become the epivalves of the daughter cells. The new valves are formed in silica deposition vesicles (SDV), which develop in each daughter cell and expand while faithfully reproducing the structures and ornamentation characteristic of the cell. Once formed, the new valves are expelled from the protoplasm of the daughter cells by membrane fusion. The two daughter cells then become independent and separate. Because each daughter cell must form a new silica hypovalve at every division, this mode of multiplication leads to a reduction in cell size over generations (MacDonald, 1869; Pfitzer, 1871).

When diatom cells reach a critical minimum size, sexual reproduction restores the initial size. Sexual reproduction is therefore an obligatory process that occurs occasionally, when cells become too small to function (Drebes, 1977; Geitler, 1932), but it also depends on environmental factors (Mouget et al., 2009). The other role of sexual reproduction is to allow genetic recombination. Sexual reproduction requires a first step of gametogenesis, in which gametes are produced by meiosis. The encounter and fusion of two gametes from different cells produce a zygote called an auxospore, which develops into a diatom cell. Although this process is common to all diatom species, strategies differ considerably, particularly among centric, araphid pennate and raphid pennate diatoms (see Chepurnov et al., 2004).

Figure 1.1: Diatoms observed by scanning electron microscopy. A. Biddulphia reticulata. Complete frustule of a centric diatom with its two valves and girdle bands. B. Eupodiscus radiatus. View of a single valve of a centric species. C. Diploneis sp. Two frustules (individuals) of a pennate species. D. Melosira varians. Frustule of a centric species. Photographs by Mary Ann Tiffany, University of San Diego (Bradbury, 2004).

1.3.2 Ecology

Diatoms have colonized most aquatic habitats, from oceans to mountain springs. They play a crucial role in the silica cycle and account for about 20% of global photosynthetic carbon fixation (Chepurnov et al., 2004; Field et al., 1998; Mann, 1999). Aerophilic species are also found on exposed rocks, on mosses, in caves, snow, ice and soils, provided there is sufficient moisture (Patrick, 1977). In freshwater systems, diatoms occur in lotic environments (mountain springs and torrents, streams, rivers and the brackish waters of estuaries) and in lentic environments (natural lakes and reservoirs, ponds, pools).

Depending on their way of life, diatoms are divided into pelagic species, which belong to the phytoplankton and live freely in the water column, and benthic species, which live in contact with the substrate. Pelagic diatoms are generally centric or araphid pennate. They are rarely motile, and their movements depend mainly on those of the water masses (currents, turbulence, convection). Because of their silica frustule, diatoms are denser than water (Round et al., 1990). Planktonic species therefore show a number of adaptations for staying near the surface, in the photic zone (e.g. cell size and shape, regulation of intracellular density; see Round et al., 1990; Walsby and Reynolds, 1980). In fresh waters, these species develop mainly in large, open lentic environments such as lakes and lowland rivers.

Benthic species are found in rivers as well as in lakes, where their development is nevertheless limited to the euphotic zone. The benthos is known to be more complex and diverse than the plankton, with higher species richness and many life forms (Round et al., 1990). Benthic diatoms grow on the surface of stones (epilithon), sediment (epipelon), sand (epipsammon), or aquatic plants and filamentous algae (epiphyton). Benthic species attach to the substrate by means of stalks or pads of gelatinous mucilage. These are composed mainly of polysaccharides and secreted through special pores, often located at the poles of the cell. Many benthic species, particularly those of the epipelon, have a raphe that allows them to adhere to the substrate and to move.

When they grow on a hard substrate, benthic diatoms interact with many other organisms: fungi, bacteria and algae. Together, these organisms form a biofilm and live in a more or less dense matrix of biopolymers made up of lipids, polysaccharides, nucleic acids, proteins and various organic components (Flemming and Wingender, 2010; Sutherland, 2001). This matrix plays an important role in the stability, retention and cohesion of the microorganisms, including diatoms, in the face of disturbance and the mechanical turbulence of the current. It also absorbs organic and inorganic compounds (ions, nutrients and also some pollutants) that are normally dissolved at low concentrations in the water column (Flemming and Leis, 2003). Finally, the matrix also appears to protect against grazing by herbivores (Lawrence et al., 2002) and to limit the action of some toxic compounds such as antibiotics (Gilbert et al., 1987; Stewart and Costerton, 2001).

Diatoms are unicellular organisms, but while some species live as free cells (e.g. Nitzschia, Navicula), others form colonies of several individuals (Rimet and Bouchez, 2012a). Colonial forms are numerous and varied: chains (e.g. Cyclotella, Thalassiosira), ribbons (e.g. Fragilaria), zig-zags (Diatoma), rosettes (Ulnaria) and tree-like colonies (Gomphonema, Cymbella). The cells of a colony are attached to one another by special silica structures (spines, portulae) or by secreted mucilage. Some species build a mucilage tube (e.g. Encyonema) within which the cells move in single file. This great diversity of life forms appears to be a response to multiple ecological constraints involving the need to adhere to the substrate or access to light and nutrients (Round et al., 1990).

Passy (2007) proposes another classification of benthic diatoms based on their tolerance to nutrient limitation and physical disturbance. She thus defines three ecological guilds. Diatoms of the low profile guild are small and resist current well, while being favoured by low nutrient concentrations. Conversely, high profile species, which are larger or colonial, are favoured in undisturbed, nutrient-rich habitats and grow more on hard substrates (epilithon and epiphyton). Finally, the motile guild comprises highly motile species that have the same preferences as high profile species while dominating the epipelon.

Physical and chemical factors exert strong control over the composition of diatom communities. The most important abiotic factors include current velocity (Lamb and Lowe, 1987), the light needed for photosynthesis (Patrick, 1977) and pH (Mulholland et al., 1986). The amount of available nutrients (nitrogen and phosphorus) is also an important factor (Kelly, 2003), as is the concentration of organic matter (Van Dam et al., 1994). Many diatom species are facultative heterotrophs (Hellebust and Lewin, 1977), while a few rare species (e.g. Nitzschia putrida, Hantzschia achroma) lack pigments and are obligate heterotrophs (Li and Volcani, 1987).

1.3.3 Diversity, classification and phylogeny

From the first descriptions in the eighteenth century, naturalists tried to relate diatoms to the organisms known at the time. The nature of diatoms nevertheless remained uncertain until the middle of the nineteenth century. Many scientists then regarded these motile microscopic organisms as animals (e.g. Bory de Saint-Vincent, 1822; Ehrenberg, 1838).

The work of Kützing (1844) and Smith (1853) appears to have ended this period of uncertainty by definitively classifying diatoms among the algae. Current knowledge places diatoms in the superclass Bacillariophyta (Haeckel, 1866a,b) and assigns them to the infrakingdom Heterokonta (the informal synonym stramenopiles is commonly used). Heterokonts comprise a very wide diversity of organisms belonging to several divisions (Chrysophyceae, Dictyochophyceae, Xanthophyceae, ...) that share the characteristic of having two different flagella at some point in their life cycle (Adl et al., 2005). The relationships among the divisions are not clearly established (Julius and Theriot, 2010), but Goertzen and Theriot (2003) identified the class Bolidophyceae (Guillou et al., 1999) as the clade closest to diatoms.

Diatoms are a relatively recent group, thought to have appeared during the Mesozoic, approximately 190 million years ago (Medlin et al., 1997, 2000). Since then, the clade has undergone strong diversification, the origins and dynamics of which remain poorly understood (Julius and Theriot, 2010). The result of this process is a group of exceptional morphological diversity.

Estimating that about 184 new species are described each year, and that this rate has remained surprisingly constant over the last century, Julius (2007) puts the number of described diatom species at 24,000. Applying his methodology, this figure can be updated to an estimate of about 25,500 species described as of 2015. This number, however, varies widely with the methods and authors. The Catalogue of Diatom Names (Fourtanier and Kociolek, 2011) lists more than 62,000 taxon names (genera, species and infraspecific ranks), whereas Guiry (2012) proposes a figure, which he himself describes as conservative, of 12,000 described species.
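The update from 24,000 to about 25,500 species can be reproduced with simple arithmetic. The sketch below assumes that the figure of 24,000 applies to the year 2007 and that the rate of 184 descriptions per year holds until 2015. Both the base year and the function name are assumptions made for illustration.

```python
def extrapolate_described_species(base_count, base_year, target_year, rate_per_year):
    """Linear extrapolation of the number of described species."""
    return base_count + rate_per_year * (target_year - base_year)


# Assumed inputs: 24,000 species in 2007, 184 new descriptions per year
estimate_2015 = extrapolate_described_species(24_000, 2007, 2015, 184)
print(estimate_2015)  # 25472, i.e. roughly 25,500
```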

Estimating the true number of existing diatom species is even more complex and necessarily relies on a number of extrapolations. Figures of 200,000 species (Mann and Droop, 1996), 30,000 (Guiry, 2012) and 100,000 (Mann and Vanormelingen, 2013) have recently been proposed and have prompted debate about methodology and about the very notion of species in diatoms (Mann, 1999; Mann and Vanormelingen, 2013; Mann and Droop, 1996). Despite the uncertainty surrounding them, these figures indicate that a large part of diatom diversity is still unknown. Two observations support this idea: first, a large part of the biosphere where diatoms are known to be abundant remains very poorly studied (Mann and Droop, 1996); second, molecular methods are revealing unexpected cryptic diversity within many species complexes (Mann and Evans, 2007). Examples include Nitzschia palea (Trobajo et al., 2009), Sellaphora pupula (Evans et al., 2008; Mann et al., 2004), Navicula cryptocephala (Poulíčková et al., 2010) and Navicula phyllepta (Vanelslander et al., 2009).

Frustule morphology has always been the taxonomists' criterion of choice for describing new species. The shape and ornamentation of the frustule can be extremely complex and have the advantage of being reproduced precisely from one generation to the next. However, delimiting species on the basis of the frustule alone has been criticized (Mann, 1982, 1999), and other characters are beginning to be taken into account (e.g. Cox and Williams, 2000, 2006).

The fact that taxonomists long focused mainly on the frustule has led to major inconsistencies in classification and to lists of species grouped within the same clades without any evolutionary justification (Williams and Kociolek, 2007). At the highest level, the traditional classification separates diatoms into two groups: centrics and pennates (see Section 1.3.1). This separation dates back to the nineteenth century (Schütt, 1896) and is based on observations made possible by light microscopy. Similarly, the discovery of the raphe made it possible to separate raphid pennate diatoms from araphid pennate diatoms (Round et al., 1990). However, molecular phylogenetic methods appear to indicate that this classification does not meet the criteria of modern systematics. It thus seems established that neither the centrics nor the araphid pennates form monophyletic clades (Kooistra et al., 2003a,b; Medlin et al., 2000; Medlin et al., 1996). Reconstructing the phylogeny of diatoms therefore requires a thorough reconsideration of their systematics. This is a long and complex task (Williams and Kociolek, 2007) that meets some resistance within the diatomist community, particularly from some taxonomists and from those who use diatoms as indicators (Julius and Theriot, 2010).

Attempts to bring a phylogenetic perspective to diatom systematics are not recent (Simonsen, 1979; Steinecke, 1931), but it was the advent of molecular methods and the spread of sequencing during the 1990s that really made it possible to begin reconstructing accurate phylogenies (Medlin et al., 1993) and to revisit the classification. Substantial work has been done since, summarized by Alverson and Theriot (2005), Mann and Evans (2007), Medlin (2011) and Theriot et al. (2011), which has produced different topologies and sometimes contradictory evolutionary hypotheses (e.g. Medlin and Kaczmarska, 2004; Sims et al., 2006; Theriot et al., 2009). These disagreements often result from the use of different phylogenetic reconstruction methods and different markers such as 18S (Medlin et al., 1996), 28S (Sorhannus et al., 1995), rbcL (Daugbjerg and Andersen, 1997), rpoA (Fox and Sorhannus, 2003), psbC (Theriot et al., 2010) or cox1 (Ehara et al., 2000). The most recent phylogenies now favour a multi-gene approach (e.g. Theriot et al., 2010, 2011, which combine 18S, rbcL and psbC; Theriot et al., 2015, which combine atpB, psaA, psaB, psbA, psbC and rbcL). Significant work nevertheless remains to be done, in particular because inconsistencies persist between molecular phylogenies and morphological characters, and because only a tiny fraction of diatom diversity (< 0.01% of known species) has been sequenced so far (Julius and Theriot, 2010).

1.4 Diatoms in bioindication

The ability of diatoms to reflect changes in the aquatic environment and their potential for bioindication were noticed very early. The first bioindication tool (Kolkwitz and Marsson, 1908, 1909) already included many diatom species (e.g. Cyclotella meneghiniana, Gomphonema parvulum, Nitzschia palea). Today, the use of diatoms as biological indicators has become widespread, and they are a key tool for bioindication programmes throughout the world (Ibáñez et al., 2010; Stevenson et al., 2010).

1.4.1 Value of diatoms for bioindication

There are many reasons to use diatoms as bioindicators. Diatoms occupy an important compartment of aquatic ecosystems, which makes them a relevant choice for indicating the ecological condition of rivers (Stevenson et al., 2010). By providing a substantial share of primary production and constituting an important food resource for invertebrates and fish, diatoms play a fundamental role in aquatic food webs (Lamberti, 1996). Changes in diatom communities are therefore likely to indicate changes in the environment either directly (e.g. a change in nutrient availability) or indirectly, by reflecting changes in the grazing pressure exerted by herbivores (top-down effect). Because diatoms lie at the base of the food web, changes in the community can also affect the structure of consumer communities (bottom-up effect).

Diatom communities respond directly to physical and chemical changes in the environment. Studies of diatom ecology have shown that community structure depends on many factors that can readily be related to human activities and water quality. These factors include organic matter content (Sládeček, 1986), pH (Renberg and Hellberg, 1982; Zampella et al., 2007) and nutrients (Kelly, 1998; Pan et al., 1996). These structural changes result from the wide diversity of ecological preferences among species. Because their generation time is relatively short, diatoms are considered to respond rapidly to environmental change and thus to provide an early warning signal of pollution (McCormick and Stevenson, 1998).

The choice of diatoms as biological indicators is also guided by practical considerations. Diatoms are a ubiquitous group (see Section 1.3.2): they are found in abundance in all types of rivers. Because they also occur in moist aerial environments, they can be used as bioindicators in soils, wet meadows, peatlands and marshes (Gaiser and Rühland, 2010; Martínez-Carreras et al., 2015). Sampling is simple: benthic diatoms are collected by scraping their substrate, whereas planktonic diatoms are collected directly from the water column using bottles, tubes or more elaborate systems that account for micro-stratification (e.g. Croome and Tyler, 1983). Although identification to species level by light microscopy is not easy (see Section 1.4.3), distinguishing the main genera is fairly simple. Diatom frustules also have the advantage of retaining similar morphological characteristics throughout the life cycle (Round et al., 1990), unlike many other classes of algae (Stevenson et al., 2010).

1.4.2 Diatom-based bioindication methods

A great many diatom-based bioindication methods have been developed (Stevenson et al., 2010). These methods rely on different types of community attributes relating to structure and function. Some indicators are simple general descriptors of the biological condition of the community. Examples include proxies of biomass such as dry mass, cell density or pigment concentration (Dodds et al., 1997; Stevenson et al., 2006). Estimators of biological diversity can also be used (Stevenson, 1984). However, most diatom-based biological indicators rely on the taxonomic composition of communities, on the assumption that the relative proportions of the species present directly reflect environmental conditions. These methods often involve calculating an index, a single numerical value that is estimated from community species composition data and is meant to convey information about the state of the environment.

Indices based on the taxonomic composition of the community all require identification of the taxa present and often an estimate of their relative frequencies. This crucial step is carried out under the microscope, in the laboratory. Slide preparation involves several steps: destroying the cell protoplasm, removing residual organic matter and mounting the frustules in a resin with a high refractive index (see Round et al., 1990). Diatom slides prepared in this way can be kept for several decades. Individuals are identified and counted by observing the frustules of dead diatoms under the microscope, although inspection of fresh material is also recommended (Kelly et al., 1998; Taylor et al., 2007). The relative frequencies of the different species are estimated by counting a predetermined number of frustules. Recommendations on the minimum number of frustules needed for reliable estimates vary but generally lie between 300 (e.g. Chessman et al., 2007) and 600 (e.g. Barbour et al., 1999). These recommendations aim to optimize counting speed, since an accurate estimate of diversity (including rare species) requires counting between 3,000 and 8,000 frustules (Stevenson et al., 2010). Conversely, to ensure that the sample is representative, counting fewer than 300 frustules is not advised (Battarbee, 1986; Prygiel et al., 2002).

Biotic indices based on the taxonomic composition of diatom communities fall into three groups: autecological indices, multimetric indices and indices based on predictive models.

Autecological indices are the simplest, the oldest and the most widely used, particularly in Europe. These indices generally take three types of information into account: the estimated relative frequencies of the taxa (a), their ecological optima (s) and their tolerances (v). Autecological indices are very often calculated from Equation 1.1, derived from the equation of Zelinka and Marvan (1961).

$$\text{index} = \frac{\sum_{i=1}^{n} a_i v_i s_i}{\sum_{i=1}^{n} a_i v_i} \tag{1.1}$$
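Equation 1.1 is a weighted average of the taxon optima s, with weights given by the products of a and v. The following is a minimal Python sketch, assuming the three inputs are supplied as lists of equal length with one entry per taxon. The function name and the sample values are hypothetical and do not correspond to any published index.

```python
def autecological_index(abundances, optima, tolerances):
    """Weighted average of Equation 1.1: sum(a*v*s) / sum(a*v)."""
    if not (len(abundances) == len(optima) == len(tolerances)):
        raise ValueError("all inputs must have one value per taxon")
    # Weight of each taxon: relative frequency times tolerance value
    weights = [a * v for a, v in zip(abundances, tolerances)]
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("the sum of a_i * v_i must be non-zero")
    return sum(w * s for w, s in zip(weights, optima)) / total_weight


# Hypothetical sample with three taxa (made-up values)
a = [0.5, 0.3, 0.2]  # relative frequencies
s = [4.0, 2.0, 5.0]  # ecological optima
v = [1.0, 3.0, 2.0]  # tolerance values
index = autecological_index(a, s, v)
```

With these values the weights are 0.5, 0.9 and 0.4, so the second taxon dominates the result even though it is not the most abundant one.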

A very large number of these indices have been developed (e.g. Coste, 1982; Dell’Uomo and Torrisi, 2011; Kelly and Whitton, 1995; Lenoir and Coste, 1996; see Stevenson et al., 2010 and Roux-Barthès, 2014, p. 138, for further references). These indices differ mainly in the number of species taken into account and in the optimum and tolerance values assigned to the species. Some indicate the overall level of pollution (Coste, 1982), whereas others focus on particular types of pollution (Dell’Uomo and Torrisi, 2011).

Multimetric indices seek to assess biological integrity by combining several metrics of community structure and function into a single index (Barbour et al., 1995). These indices are traditionally used in North America, where they were first developed for fish (Karr, 1981) and macroinvertebrates (Hilsenhoff, 1988). Multimetric indices have also been developed for diatoms (Fore and Grafe, 2002; Hill et al., 2000; Wang et al., 2005).

Indices based on predictive models rely on the concept of reference condition. Their principle is to compare the observed community with the reference community that the model predicts for the site. Predictive models are therefore built from a network of reference sites. Such models have been developed for diatoms (Chessman et al., 1999; Gevrey et al., 2004; Mendes et al., 2014).

1.4.3 Limitations of diatoms in bioindication

Diatoms offer many advantages for the bioindication of watercourses (see Section 1.4.1), but their routine use is not free of drawbacks or of technical and methodological constraints. The quality of the data generated depends mainly on two delicate steps: sampling and the identification of individuals (Besse-Lototskaya et al., 2011).

The sample must be representative of the site where it is collected. Diatom communities can vary widely within a single site according to the river facies (pool, riffle) and the substrate (stones, plants, sediments) sampled. The tools used for collection (brush, knife) are also a source of bias, since some species (e.g. Cocconeis) can be harder to detach than others (Round et al., 1990). To optimize and harmonize practices, a number of recommendations exist (Kelly et al., 1998) and some bioindication methods are covered by standards. In France this is the case for the Indice Biologique Diatomées (AFNOR NF T 90-354), used under the Water Framework Directive.

Identifying and counting individuals on slides is certainly the step that concentrates the most difficulties and generates the most errors and inaccuracies. Most diatom-based indices require the identification of several hundred individuals to species level and sometimes even to infraspecific level (subspecies, varieties).

Identifying frustules by light microscopy is difficult because of the exceptional diversity of diatoms (Besse-Lototskaya et al., 2006; Kociolek, 2005; see also Section 1.3.3). Two species are sometimes distinguished on the basis of morphological characters that are imperceptible under light microscopy. Species complexes (e.g. Achnanthidium minutissimum, Fragilaria capucina, Nitzschia palea) are a known source of identification errors and cause substantial variation in the calculated indices (Kahlert et al., 2009; Prygiel et al., 2002). Environmental conditions can also affect the morphology of some species (teratological forms), making identification difficult (Kociolek and Stoermer, 2010; Round et al., 1990). Finally, the description and publication of new species each year and the continual revision of the taxonomy cause problems both for identification and for the calculation of indices.

These technical difficulties also have economic implications for the implementation of bioindication. Integrating diatoms into environmental monitoring networks requires many sites to be monitored each year. Identifying and counting diatoms requires qualified, experienced staff for a task that can be repetitive and tiring. Staff training is long and requires knowledge to be kept constantly up to date, as well as participation in intercalibration exercises (Kahlert et al., 2009). Rigorous counting of a slide can take a long time, particularly if uncertainties require consulting a flora or the literature.

To try to make bioindication methods more accessible and to limit costs, diatomists have explored various approaches. These approaches aim to simplify identification, either by limiting the number of species included in the indices or by lowering the taxonomic resolution. The first approach reduces the range of diversity counted to a limited number of species, favouring the most abundant, the most indicative and the easiest to identify (e.g. Lavoie et al., 2009; Lenoir and Coste, 1996). The second approach proposes using a taxonomic level above the species, which is the level traditionally used. Many proposals have been made, in particular to replace the species with the genus (Chessman et al., 1999; Growns, 1999; Hill et al., 2001; Raunio and Soininen, 2007; Rimet and Bouchez, 2012b; Wunsam et al., 2002). These new methods raise the question of how much information is lost through such simplifications.

1.5 Phylogenetic signal as a possible way forward

1.5.1 Definition of phylogenetic signal

Phylogenetic signal is the share of variation in species trait values¹ that is explained by phylogenetic relatedness. It is an intuitive idea that follows directly from the Darwinian principle of descent with modification. The concept, which is relatively recent, spread rapidly through the literature from the late 1980s with the development of comparative methods that take the evolutionary history of species into account (Abouheif, 1999; Felsenstein, 1985). For a long time, biologists studied phylogenetic signal mainly to account for phylogenetic inertia in comparative analyses of traits across species and to avoid problems of non-independence in statistical analyses. The study of signal is now a whole branch of phylogenetic methods, used to identify and compare models and rates of trait evolution in species (Blomberg et al., 2003) and to understand the processes underlying species assembly in communities (Mouquet et al., 2012).

1. The concept of trait is central to this document. A trait is a quantitative or qualitative characteristic measurable on an organism. Here we apply the concept at the species level (or higher), which allows the term to designate characteristics that are not necessarily specific to an individual. Ecological traits (optima, tolerances, etc.), for example, are derived from the statistical description of a population.

From a statistical point of view, a widely accepted definition of phylogenetic signal is the one given by Blomberg and Garland (2002), namely, for a given phylogeny, “the tendency for related species to resemble each other more than they resemble species drawn at random from the tree”. This tendency is theoretically expected (by virtue of the principle of heritability). The simplest model of phylogenetic signal for continuous traits is Brownian motion (BM). This model assumes that the trait evolves randomly, in any direction and at a constant rate, so that the divergence between species increases with the square root of time (Letten and Cornwell, 2015).

Other models derived from Brownian motion have been proposed and may be more appropriate depending on the situation: for example, the Ornstein-Uhlenbeck model, which constrains the trait to evolve around a given value (Martins and Hansen, 1997), or the ACDC model, which allows the rate of evolution to accelerate or decelerate over time (Blomberg et al., 2003).
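The square-root scaling of divergence under Brownian motion can be checked numerically. The following Python sketch is only an illustration and is not taken from the thesis: the function name, the rate sigma2 = 1 and the number of simulated lineage pairs are assumptions. It simulates pairs of lineages diverging from a common ancestor and compares the standard deviation of their trait difference with the theoretical value sqrt(2 * sigma2 * t).

```python
import numpy as np

def simulate_bm_divergence(times, sigma2=1.0, n_pairs=20000, seed=0):
    """Simulate pairs of lineages evolving independently under Brownian
    motion from a shared ancestor; return the standard deviation of
    their trait difference at each time point (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    times = np.asarray(times, dtype=float)
    steps = np.diff(np.concatenate(([0.0], times)))  # time increments
    # Each lineage accumulates independent Gaussian increments whose
    # variance is sigma2 * dt (the defining property of Brownian motion).
    shape = (n_pairs, steps.size)
    scale = np.sqrt(sigma2 * steps)
    a = np.cumsum(rng.normal(0.0, scale, size=shape), axis=1)
    b = np.cumsum(rng.normal(0.0, scale, size=shape), axis=1)
    return (a - b).std(axis=0)

times = np.array([1.0, 4.0, 9.0, 16.0])
observed = simulate_bm_divergence(times)
expected = np.sqrt(2.0 * 1.0 * times)  # sd of the difference under BM
for t, o, e in zip(times, observed, expected):
    print(f"t = {t:4.0f}  simulated sd = {o:.3f}  theory = {e:.3f}")
```

Quadrupling the elapsed time roughly doubles the expected divergence, which is the behaviour described above.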

1.5.2 Measuring and testing phylogenetic signal

While the notion of phylogenetic signal is relatively simple and intuitive, developing a statistical methodology to measure and test for its presence is not straightforward. The literature contains many methods, each with its own specificities and behaviour (Diniz-Filho et al., 2012; Münkemüller et al., 2012).

These methods fall into two main groups, depending on whether or not they are based on a model of evolution (Diniz-Filho et al., 2012). Methods that do not rely on a model of trait evolution are mainly based on the principle of autocorrelation. They include phylogenetic autocorrelation indices (Moran’s I and its variants; Abouheif, 1999; Gittleman and Kot, 1990; Pavoine et al., 2008) and autoregressive approaches (Cheverud et al., 1985; Diniz-Filho et al., 1998). Most of these approaches are adaptations of methods developed for spatial statistics. Approaches based on a model of evolution comprise specific indices that compare observed values with those expected under a particular evolutionary model, usually a Brownian motion model (Blomberg et al., 2003; Pagel, 1999).

What these methods have in common is that they all aim to detect a dependence between trait values and the phylogeny. Classical tests therefore take as their null hypothesis that the values of the trait under study are randomly distributed in the phylogeny. Rejecting this hypothesis does not allow any conclusion to be drawn about the evolutionary processes at work (Revell et al., 2008).

An alternative approach is to fit different models to the trait values by generalized least squares (GLS), using different covariance structures (based on different models of evolution). Paradis (2011) proposes comparing these models with one another, and with a model assuming no covariance among species, on the basis of the Akaike information criterion (AIC), in order to reveal a possible phylogenetic signal and identify the best model of evolution.
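The model comparison described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure of Paradis (2011) itself: the function name, the four-tip toy tree and the trait values are invented for the example, and only an intercept (constant mean) is fitted. Two covariance structures are compared: one proportional to shared branch lengths (Brownian motion) and one with no covariance among species.

```python
import numpy as np

def fit_constant_mean_gls(y, V):
    """Fit y ~ N(mu * 1, sigma2 * V) by maximum likelihood for a known
    correlation structure V; return (mu, sigma2, AIC). Hypothetical helper."""
    y = np.asarray(y, dtype=float)
    V = np.asarray(V, dtype=float)
    n = y.size
    ones = np.ones(n)
    Vinv = np.linalg.inv(V)
    mu = (ones @ Vinv @ y) / (ones @ Vinv @ ones)  # GLS estimate of the mean
    r = y - mu
    sigma2 = (r @ Vinv @ r) / n                    # ML estimate of the variance scale
    _, logdet = np.linalg.slogdet(V)
    loglik = -0.5 * (n * np.log(2.0 * np.pi * sigma2) + logdet + n)
    aic = 2 * 2 - 2.0 * loglik                     # two parameters: mu and sigma2
    return mu, sigma2, aic

# Assumed toy tree ((A:0.1,B:0.1):0.9,(C:0.1,D:0.1):0.9): under Brownian
# motion the covariance of two tips is proportional to their shared path.
V_bm = np.array([[1.0, 0.9, 0.0, 0.0],
                 [0.9, 1.0, 0.0, 0.0],
                 [0.0, 0.0, 1.0, 0.9],
                 [0.0, 0.0, 0.9, 1.0]])
V_star = np.eye(4)                                 # no covariance among species
y = np.array([1.0, 1.1, -1.0, -0.9])               # invented trait values

mu_bm, s2_bm, aic_bm = fit_constant_mean_gls(y, V_bm)
mu_star, s2_star, aic_star = fit_constant_mean_gls(y, V_star)
print(f"AIC Brownian motion: {aic_bm:.2f}   AIC no covariance: {aic_star:.2f}")
```

With these clade-structured values the Brownian motion covariance obtains the lower AIC, which is the kind of evidence the approach uses to reveal a phylogenetic signal.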

1.5.3 Integrating phylogenetic signal into bioindication tools

Since appropriate tools became available, phylogenetic signal has been characterized for many traits, whether physiological, morphological, ecological or behavioural (see Blomberg et al., 2003). The study of phylogenetic signal in the context of bioindication is a recent line of research. Establishing a link between the phylogeny of species and some of their traits relevant to bioindication (e.g. sensitivity to toxicants, ecological preferences) offers two interesting prospects: predicting trait values for new species and simplifying the tools.

Prediction was the first aspect to attract the attention of ecotoxicologists, who revealed that ecophysiological traits were structured in the phylogeny of aquatic macroinvertebrates exposed to cadmium and zinc (Buchwalter et al., 2008; Poteat et al., 2013). Their discussions gave rise to the idea that phylogenetic signal in pollution sensitivity could help predict the effects of toxicants on communities (Guénard et al., 2011, 2014; Larras et al., 2014; Malaj et al., 2015).

The possibility of simplifying biotic indices on the basis of phylogenetic signal was first raised by Carew et al. (2011). That article demonstrates the presence of a signal for general pollution sensitivity in chironomids and mayflies and introduces the idea of phylogenetic redundancy in environmental diagnostic tools. The presence of a phylogenetic signal at a level above the species suggests that individuals belonging to phylogenetically close species can be pooled into a single group (a cluster, taxonomically defined or not) without impairing the effectiveness of the tool. This idea was already implicitly applied in indices using a taxonomic level above the species (e.g. Rumeau and Coste, 1988), and in that context phylogenetic signal could be detected for each taxonomic rank by hierarchical analysis of variance (Gittleman and Luh, 1992; Stearns, 1983). Building clusters of species from the phylogeny removes some of the rigidity imposed by taxonomy. In a phylogeny, where the notion of rank does not exist, the distance between species is assumed to be continuous (the common ancestor of the species in a cluster can be located at any depth). Moreover, linking the ecological traits of species to the phylogeny is assumed to be more realistic than linking them to the taxonomy, particularly in diatoms, where many incongruences remain between phylogeny and taxonomy (Julius and Theriot, 2010).

If the ecological preferences of diatoms are regarded as traits integrating other biological traits (morphological, physiological, life history), the trait-based approach (Figure 1.2, arrow 1) appears to be the most logical and direct way to infer the ecology of species. However, this approach has not seen major developments in diatoms, probably because building trait databases is far from simple for microscopic organisms. Bioindication tools have therefore been developed using an indirect route, that of taxonomy (Figure 1.2, arrow 2). The link between taxonomy and ecology is possible because taxonomy is itself based on biological traits. In diatoms, however, the loss of information is likely to be substantial, because taxonomy is essentially based on the morphological characteristics of the frustule and ignores many biological traits likely to strongly influence the ecology of species. The third route, which is the subject of this work, is that of phylogeny (Figure 1.2, arrow 3). Since phylogeny is a direct representation of evolution, it is likely to reflect information close to that of biological traits. Thus the link between phylogeny and ecological traits, although indirect, is probably more realistic than the link between taxonomy and ecological traits.

Figure 1.2: Conceptual diagram of the elements determining the realized niche of species. The effects of history (past events, dispersal) and of biotic interactions (competition, facilitation) are not specifically studied here. Solid arrows represent a direct link, while dotted arrows indicate an indirect link. The diagram explains the indirect links taxonomy-ecology (2) and phylogeny-ecology (3) by relating them to the link biological traits-ecology (1).
[Diagram elements: Evolution; Biological traits (physiology, morphology, life history); Taxonomy; Phylogeny; Environment; Biotic interactions; History; Realized niche (ecological traits).]

1.6 Organization of the thesis

1.6.1 Objectives

The main objective of this thesis is to explore the possibilities for simplifying diatom-based biological indices using phylogenetic signal. To this end, the work was divided into three parts (chapters 2, 3 and 4). The first part (phylosignal: An R package to measure, test and explore the phylogenetic signal) is methodological and technical, and presents a new R package dedicated to the study of phylogenetic signal. The second part (Phylogenetic signal in diatom ecology: perspectives for aquatic ecosystems biomonitoring) measures and tests the presence of phylogenetic signal for a set of ecological traits (optima for physico-chemical parameters) in freshwater diatoms. Finally, the last part (Linking phylogenetic similarity and pollution sensitivity to develop ecological assessment methods: a test with river diatoms) introduces a new phylogenetic clustering method and applies it to assess the possibilities for simplifying a diatom-based biological index.

1.6.2 Databases used

The work presented in this document is based on two main sources of data: genetic data, used to reconstruct the phylogeny of diatoms, and ecological data, used to estimate the ecological preferences of taxa and to validate the new indices. The genetic database contains sequences for two markers, 18S and rbcL. It is described in Section 3.2.1. The ecological database provides information on habitat characteristics and diatom community inventories for 2119 samples. It is described in Section 3.2.3 (see also Table 3.1).

1.6.3 phylosignal : An R package to measure, test and explore the phylogenetic signal

R is a statistical analysis software and programming language very widely used in the scientific community. One of the reasons for R’s success is its extensibility through packages developed by the community. These packages cover a very wide range of fields and can be made available to all, notably through The Comprehensive R Archive Network² (CRAN).

Many packages have been developed for phylogenetic analyses (see Paradis, 2011). Several of them implement functions for the study of phylogenetic signal, for example ape (Paradis et al., 2004), picante (Kembel et al., 2010), phytools (Revell, 2012) and adephylo (Jombart et al., 2010a). One limitation of the package-based extension system is the fragmentation of functionality in an ecosystem where each package may use a different data format and syntax.

2. https://cran.r-project.org/

The phylosignal package presented in this part is entirely dedicated to the analysis of phylogenetic signal. It reimplements, in a unified interface, a number of features already available in R while offering new approaches. It was designed to work on its own while promoting interoperability with other packages of the ecosystem through the use of the phylo4d class (Hackathon et al., 2013). Although a few methods based on a model of evolution are included, the package mainly adopts an approach based on the principle of autocorrelation. Many of the implemented methods were therefore originally developed for spatial analysis. The features of the package include:

– Data visualization: a set of functions to plot the values of one or more traits along a phylogeny.

– Computation of phylogenetic signal indices: several indices commonly used in the literature are available: Blomberg’s K and K∗ (Blomberg et al., 2003), Pagel’s λ (Pagel, 1999), Moran’s I (Moran, 1948, 1950) and Abouheif’s Cmean (Abouheif, 1999; Pavoine et al., 2008). Each of these indices can be tested against the null hypothesis that trait values are randomly distributed in the phylogeny.

– Comparison of index behaviour: simulation tools following the approach of Münkemüller et al. (2012) to study the behaviour and performance of the different phylogenetic signal indices for a given phylogeny.

– Computation and plotting of phylogenetic correlograms: the correlogram (Gittleman and Kot, 1990) represents the autocorrelation of the values of a trait at different phylogenetic distances.

– Computation of local indicators of phylogenetic autocorrelation: for each species, a local autocorrelation index (Anselin, 1995) is computed and tested; it indicates how similar the species is to its closest relatives.

– Clustering under phylogenetic constraint: implementation of the clustering method presented by Keck et al. (2016a, chapter 4).

The phylosignal package thus supports an advanced analysis of phylogenetic signal and of the patterns of trait values in the phylogeny. It also provides the functions and code needed to carry out the analyses of chapters 3 and 4.

1.6.4 Phylogenetic signal in diatom ecology : perspectives for aquatic ecosystems biomonitoring

Simplifying bioindication tools by integrating phylogenetic signal first requires demonstrating that such a signal exists for traits relevant to bioindication. Results pointing in this direction have already been obtained for macroinvertebrates (Buchwalter et al., 2008; Carew et al., 2011). For diatoms, a phylogenetic signal has been identified for sensitivity to different types of herbicides (Larras et al., 2014). The aim of this part is to extend the study of phylogenetic signal in freshwater diatoms to more general ecological preferences. We therefore focus on the ecological optima of species for a set of 19 physico-chemical parameters. These parameters (related to mineralization, organic matter, nutrients, etc.) reflect the chemical quality of surface waters and are often used to develop biotic indices (e.g. Coste et al., 2009).

The phylogenetic signal analyses required two types of data: a molecular phylogeny of diatoms and an estimate of their ecological optima. The molecular phylogeny was reconstructed from the sequences of two genetic markers: the nuclear gene encoding the 18S small-subunit rRNA and the chloroplast gene rbcL, which encodes the large subunit of the RuBisCO protein. We opted for a reconstruction strategy that includes as many species as possible (Section 3.2.2). The resulting phylogeny comprises 550 diatom species (marine and freshwater) and was used for all the analyses in the thesis. The optima for the 19 physico-chemical parameters, which we treat here as ecological traits with a value specific to each species, were estimated by weighted averaging from abundance measurements made on 2119 samples of benthic diatom communities.
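To make the weighted-averaging step concrete, here is a minimal Python sketch. It assumes the standard weighted-averaging estimator, in which the optimum of a species is the mean of the environmental values of the samples weighted by the abundance of the species in each sample; the function name and the numbers are invented, and the exact procedure used in the thesis is the one described in its chapter 3.

```python
import numpy as np

def weighted_average_optima(abundance, env):
    """Estimate one optimum per species by abundance-weighted averaging.

    abundance: array (samples x species) of abundances.
    env: array (samples,) with the value of one environmental parameter.
    """
    abundance = np.asarray(abundance, dtype=float)
    env = np.asarray(env, dtype=float)
    totals = abundance.sum(axis=0)
    weighted = (abundance * env[:, None]).sum(axis=0)
    # Species never observed get NaN instead of a division by zero.
    with np.errstate(invalid="ignore", divide="ignore"):
        return np.where(totals > 0, weighted / totals, np.nan)

# Invented example: 3 samples, 2 species, one parameter (e.g. pH).
abundance = np.array([[10, 0],
                      [30, 5],
                      [0, 15]])
ph = np.array([7.0, 7.5, 8.0])
optima = weighted_average_optima(abundance, ph)
print(optima)  # one estimated optimum per species
```

Repeating the computation for each physico-chemical parameter yields one ecological trait per parameter and per species.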

Crossing the phylogeny with the ecological traits restricted the analysis to 127 species. Using two methods (Pagel’s λ and Abouheif’s Cmean), we detected phylogenetic signal for the optima of several parameters. Analysing the data with a multivariate approach (pPCA; Jombart et al., 2010b) showed that the multidimensional ecological niche of the diatoms studied can be related to the phylogeny.

The results of this part show that a clear link can be established between the phylogeny of diatoms and their ecological preferences. This signal, and the presence of clades comprising species with similar ecological optima, are encouraging for the development of simplified biological indices.

1.6.5 Linking phylogenetic similarity and pollution sensitivity to develop ecological assessment methods : a test with river diatoms

Demonstrating phylogenetic signal for traits relevant to bioindication is a first step towards indicators that are potentially simpler and faster to apply. However, while the presence of a phylogenetic signal may indicate phylogenetic redundancy, there is no formally identified method for taking it into account. Indeed, the literature has mainly focused on measuring and testing phylogenetic signal and has so far remained rather vague on this point. Recommendations for grouping taxa are therefore based on a visual, and more or less subjective, assessment of trait values mapped onto phylogenies, sometimes supported by taxonomy (Carew et al., 2011; Keck et al., 2016b) and by other biological or ecological traits (Larras et al., 2014).

On this basis, this part introduces a simple method for extracting clusters of species that are phylogenetically close and share similar trait values (Section 4.2.2). We tested the ability of this method to simplify a frequently used biological index, the pollution sensitivity index IPS (Coste, 1982). We clustered the diatom species included in the index on the basis of their phylogenetic proximity and their general pollution sensitivity score (the value s included in the IPS). We used these clusters as base units to propose a new index derived directly from the IPS. This new index (IPSP) can be tuned by changing the clustering parameters to favour either precision or simplicity. To assess the performance of this approach, we compared different versions of the IPSP with the classical IPS on the above-mentioned dataset of 2119 samples of benthic diatom communities.

The results of this part show that the possibilities for simplifying the IPS using clusters of phylogenetically close species are real, but raise the issue of a trade-off between simplicity and precision. The discussion also addresses the limits of the method in the context of traditional bioindication and its potential applications in the context of new molecular approaches.

CHAPTER 2 phylosignal: An R package to measure, test and explore the phylogenetic signal

This is a self-edited version of an article published in Ecology and Evolution. Please cite: Keck F., Rimet F., Bouchez A. & Franc A. (2016) phylosignal: an R package to measure, test and explore the phylogenetic signal. Ecology and Evolution 6(9), 2774–2780.

Abstract

1. Phylogenetic signal is the tendency for closely related species to display similar trait values as a consequence of their phylogenetic proximity.

2. Ecologists and evolutionary biologists are becoming increasingly interested in studying the phylogenetic signal and the processes which drive patterns of trait values in the phylogeny.

3. Here, we present a new R package, phylosignal, which provides a collection of tools to explore the phylogenetic signal for continuous biological traits. These tools are mainly based on the concept of autocorrelation and were first developed in the field of spatial statistics.

4. To illustrate the use of the package, we analyse the phylogenetic signal in pollution sensitivity for 17 species of diatoms.

2.1 Introduction

A common observation is that continuous traits of closely related species in a phylogeny are often similar, especially when traits are under environmental selection pressure. More generally, the inheritance of traits, passed with modifications from one generation to the next, may lead to a structured distribution of trait values throughout the phylogeny. The link between phylogeny and continuous trait values is commonly referred to in the literature as phylogenetic signal. This concept has gained popularity among ecologists in recent years, but is often misunderstood and confused with other fundamental ideas such as phylogenetic conservatism (Losos, 2008). To avoid any confusion (see Revell et al., 2008, for a discussion disentangling the two notions), we stick here to the strict statistical definition of phylogenetic signal given by Blomberg and Garland (2002), i.e. the “tendency for related species to resemble each other more than they resemble species drawn at random from the tree”. Thus, the phylogenetic signal is a statistical dependence between the values of a continuous trait and the phylogenetic tree whose leaves are the measured species. Studying a statistical dependence leads to hypothesis testing and to the formalization of a null hypothesis. The presence of phylogenetic signal (as defined by Blomberg and Garland) can thus be tested by rejecting the null hypothesis that the trait values of two species are distributed independently of their phylogenetic distance in the tree.

The detection and correction of phylogenetic signal has long been motivated by the need to control for the non-independence of trait data in comparative studies (Abouheif, 1999; Felsenstein, 1985). However, recent work has shown that studying the phylogenetic signal can open interesting biological and ecological perspectives. For example, deciphering the phylogenetic signal may help to understand community assembly processes (Webb et al., 2002), detect niche conservatism (Losos, 2008) or identify evolutionary strategies (Jombart et al., 2010b).

There are two contrasting approaches to studying the phylogenetic signal of a trait as a statistical model. The first is based on an explicit evolutionary model for the trait. This is generally a Brownian motion model (Blomberg et al., 2003; Pagel, 1999), in which continuous traits evolve randomly over time along a branch, at a fixed rate. As soon as lineages split at a node of the phylogeny, evolution on the two branches becomes independent. To test for the presence of phylogenetic signal, the null hypothesis is that trait values are randomly distributed in the phylogeny. Another possible null hypothesis is that trait values follow a Brownian motion model, but it is less often used and implemented. The second approach relates to methods based on the concept of autocorrelation, the correlation of a vector with itself for a given lag. Autocorrelation is a mathematical tool that has been used extensively to study spatial and time series data. Autocorrelation methods are designed to detect whether the location of an individual gives information on the expected values of its traits, and they do not rely on any evolutionary model. In a phylogenetic context, the pattern of trait values of the species of a tree can be framed as the outcome of a marked point process. Thus phylogenetic tools based on autocorrelation were largely imported from spatial statistics (Cheverud et al., 1985; Gittleman and Kot, 1990; Jombart et al., 2010b).

We present a new R package, phylosignal, designed to quantify the phylogenetic signal for continuous biological traits. Most of the tools implemented in phylosignal are based on the concept of autocorrelation and thus are imported from spatial statistics. As such, they are well documented and understood. In this paper, we show how they can be used in a phylogenetic context and we describe their implementation in the package.

To illustrate the features of the package, we analyse the phylogenetic signal in pollution sensitivity for 17 species of diatoms.

2.2 The phylosignal package

The phylosignal package provides a collection of tools to visualize, measure, test and explore the phylogenetic signal in continuous traits (Table 2.1). The package is written in R and C++ and is fully accessible through the R environment. The latest stable version is available from The Comprehensive R Archive Network¹, while the development version is hosted on GitHub². The phylosignal package is free software released under the GNU GPL-3 license, and contributions are welcome. The package builds on the richness of the R ecosystem and takes full advantage of ape (Paradis et al., 2004) for tree manipulation and plotting and of adephylo (Jombart et al., 2010a) for tree-walking algorithms and the computation of phylogenetic distances.

1. https://cran.r-project.org/web/packages/phylosignal/
2. https://github.com/fkeck/phylosignal

Table 2.1: List of the phylosignal package main functions and their description.

Function | Description
barplot.phylo4d, dotplot.phylo4d, gridplot.phylo4d | Plot trait values along a phylogeny.
phyloSignal | Computes and tests the phylogenetic signal with different methods.
phyloSim, plot.phyloSim | Simulations to investigate the behaviour of different phylogenetic signal statistics for a given phylogenetic tree along a gradient of signal.
phyloSignalBS | Computes and plots the phylogenetic signal for bootstrapped replicates of a phylogeny.
phyloSignalINT | Computes and tests the phylogenetic signal at each internal node of a phylogeny.
phyloCorrelogram, plot.phylocorrelogram | Computes and plots a phylogenetic correlogram or a multivariate Mantel correlogram.
lipaMoran | Computes Local Indicators of Phylogenetic Association (local Moran’s I).
graphClust, plot.graphclust | Extracts clusters of species based on trait values and phylogenetic proximities.
focusTree, focusTraits, focusTips, focusStop | Utility functions to add graphical elements to plots created with barplot.phylo4d, dotplot.phylo4d and gridplot.phylo4d.
phyloWeights | Utility function to compute a matrix of phylogenetic weights with different methods.

2.2.1 Data format

The analysis of phylogenetic signal typically involves working with a phylogeny and trait values associated with each tip (leaf). The phylobase package (Hackathon et al., 2013) defines the S4 class phylo4d, designed specifically to handle this kind of data. A phylo4d object connects a phylogenetic tree with a table of trait values and constitutes the basic input for many functions implemented in phylosignal. The phylobase package comes with all the functions needed to construct and manipulate phylo4d objects. For users who are not used to handling phylogenetic data within the R environment, phylosignal adds the simple function read.p4d, which constructs a phylo4d object from a phylogenetic tree stored in a Newick file and tip data stored in a CSV file.

2.2.2 Data visualization

The first step of any statistical analysis should be a graphical exploration of the data. The R language provides very powerful and flexible graphics facilities (Murrell, 2005). They are extended for the visualization of phylogenetic trees with trait data by many packages: ape (Paradis et al., 2004), phytools (Revell, 2012), adephylo (Jombart et al., 2010a). The phylosignal package aims to provide a simple but complete interface to map trait data onto a phylogenetic tree. Users have access to three main functions to generate high-quality graphics: barplot.phylo4d, dotplot.phylo4d and gridplot.phylo4d, which represent univariate and multivariate trait data as bars, dots and coloured cells, respectively. Each of these functions comes with several arguments to precisely control graphical aspects. Figure 2.1 gives an example of a graphic generated with barplot.phylo4d.

2.2.3 Indices for general measurements of phylogenetic signal

Figure 2.1: Data visualization of 3 traits (IPSS, random, BM) mapped along the phylogeny of 17 diatom species. This output is obtained with the function barplot.phylo4d. By default data are centred and scaled by trait.
[Panels: IPSS, random, BM, each with a horizontal axis from -3 to 3. Tip labels: Neidium affine, Neidium bisulcatum, Neidium productum, Scoliopleura peisonis, Luticola goeppertiana, Diploneis subovalis, Craticula accomoda, Craticula molestiformis, Eolimna subminuscula, Craticula cuspidata, Stauroneis anceps, Stauroneis phoenicenteron, Stauroneis gracilior, Stauroneis acuta, Stauroneis kriegeri, Fistulifera saprophila, Fistulifera pelliculosa.]

The function phyloSignal provides a generic interface to compute indices and tests on multiple traits from a phylo4d object. The package implements two methods directly based on the autocorrelation principle.

– Moran’s I index (Moran, 1948, 1950) is the standard measure of autocorrelation used in spatial statistics and was proposed as a way to measure the phylogenetic signal by Gittleman and Kot (1990). The function phyloSignal computes I using Equation 2.1, with $y_i$ and $y_j$ being the trait values measured for species i and species j respectively, n being the number of species and, by default, $w_{ij} = 1/d_{ij}$, $d_{ij}$ being the patristic distance between species i and species j.

$$ I = \frac{n}{\sum_{i=1}^{n}\sum_{j=1}^{n} w_{ij}} \cdot \frac{\sum_{i=1}^{n}\sum_{j=1}^{n} w_{ij}\,(y_i - \bar{y})(y_j - \bar{y})}{\sum_{i=1}^{n} (y_i - \bar{y})^2} \qquad (2.1) $$

– Abouheif’s Cmean index (Abouheif, 1999) has been shown to be a Moran’s I index computed with a specific matrix of phylogenetic weights (Pavoine et al., 2008). Thus, phyloSignal computes Cmean using Equation 2.1 with $w_{ij}$ being the proximity matrix A described in Pavoine et al. (2008) and computed with proxTips(x, method = "Abouheif") from adephylo.

Additionally, the function phyloSignal can compute three indices based on evolutionary models: Blomberg’s K and K∗ (Blomberg et al., 2003) and Pagel’s λ (Pagel, 1999).
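Moran’s I with inverse patristic distance weights, as defined by Equation 2.1, can be sketched in Python as follows. This is an illustration written from the equation, not the package’s C++ implementation; the function names, the toy distance matrix and the trait values are assumptions, and setting the diagonal weights to zero (a species does not weight itself) is also an assumption of the sketch.

```python
import numpy as np

def inverse_distance_weights(d):
    """w_ij = 1 / d_ij for i != j, and 0 on the diagonal (assumed)."""
    d = np.asarray(d, dtype=float)
    w = np.zeros_like(d)
    off_diagonal = d > 0
    w[off_diagonal] = 1.0 / d[off_diagonal]
    return w

def morans_i(y, w):
    """Moran's I of trait values y for a weight matrix w (Equation 2.1)."""
    y = np.asarray(y, dtype=float)
    w = np.asarray(w, dtype=float)
    z = y - y.mean()                      # centred trait values
    return (y.size / w.sum()) * (z @ w @ z) / (z @ z)

# Assumed patristic distances for the toy tree
# ((A:0.1,B:0.1):0.9,(C:0.1,D:0.1):0.9) and invented trait values.
d = np.array([[0.0, 0.2, 2.0, 2.0],
              [0.2, 0.0, 2.0, 2.0],
              [2.0, 2.0, 0.0, 0.2],
              [2.0, 2.0, 0.2, 0.0]])
y = np.array([1.0, 1.1, -1.0, -0.9])
i_obs = morans_i(y, inverse_distance_weights(d))
print(f"Moran's I = {i_obs:.4f}")
```

Because sister species carry similar values here, the index is clearly positive. Replacing the weight matrix with another proximity matrix changes the index without changing the formula, which is how Cmean relates to Moran’s I.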

Each index can be tested against the null hypothesis of absence of signal (i.e. trait values are randomly distributed in the phylogeny). This is achieved by randomization for K, K∗, Cmean and I, and by a likelihood ratio test for λ. Index and test procedures are written in C++ to optimize speed when dealing with large phylogenies, multiple traits and simulations.
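The randomization test can be illustrated with Moran’s I: trait values are shuffled among the tips many times, and the observed index is compared with the resulting null distribution. The Python sketch below is a hedged illustration, not the package’s code; the function names, the fully balanced 16-tip tree with unit branch lengths, the trait values, the one-sided p-value and the number of permutations are all assumptions.

```python
import numpy as np

def balanced_tree_distances(levels=4):
    """Patristic distances between the tips of a fully balanced binary
    tree with unit branch lengths (assumed toy phylogeny)."""
    n = 2 ** levels
    return np.array([[2.0 * (i ^ j).bit_length() for j in range(n)]
                     for i in range(n)])

def morans_i(y, w):
    z = np.asarray(y, dtype=float) - np.mean(y)
    return (z.size / w.sum()) * (z @ w @ z) / (z @ z)

def randomization_test(y, w, n_perm=999, seed=0):
    """One-sided test of the null hypothesis that trait values are
    randomly distributed among the tips."""
    rng = np.random.default_rng(seed)
    observed = morans_i(y, w)
    null = np.array([morans_i(rng.permutation(y), w) for _ in range(n_perm)])
    # The observed value is counted as one of the permutations.
    p_value = (1 + np.sum(null >= observed)) / (n_perm + 1)
    return observed, p_value

d = balanced_tree_distances()
w = np.zeros_like(d)
w[d > 0] = 1.0 / d[d > 0]                 # inverse patristic distances
y = np.arange(16, dtype=float)            # invented, clade-structured trait
observed, p_value = randomization_test(y, w)
print(f"I = {observed:.3f}, p = {p_value:.3f}")
```

A small p-value only indicates that the values are not randomly distributed in the tree; as noted earlier, it does not identify the evolutionary process that produced the pattern.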

Choosing an appropriate method to measure and test the phylogenetic signal is not straightforward. Münkemüller et al. (2012) provided general and useful guidelines, but stressed that the behaviour of indices strongly depends on numerous parameters such as phylogenetic tree topology, sample size and the complexity of the evolutionary models generating trait patterns. Moreover, phylogenetic trees based on real data can differ greatly from the simulated trees commonly used in simulations. Therefore, it can be interesting to investigate how the indices behave with the phylogeny under study. The phyloSim function takes up the method described by Münkemüller et al. (2012) to simulate traits with variable strength of Brownian motion for a given phylogeny and then computes indices and tests along a gradient of phylogenetic signal. The results of these simulations can be used to compare the performance of the different methods and to interpret the index values obtained with real trait data, for a given phylogeny.
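The idea of a gradient of phylogenetic signal can be sketched as follows. The Python code is a simplified illustration and not the phyloSim implementation: it assumes that a trait of intermediate signal is obtained as a weighted mix of a Brownian motion trait and a shuffled copy of it, and it uses an assumed fully balanced 16-tip tree with unit branch lengths. All function names and parameter values are hypothetical.

```python
import numpy as np

def balanced_tree_matrices(levels=4):
    """Return (BM covariance, patristic distances) for the tips of a fully
    balanced binary tree with unit branch lengths (assumed toy phylogeny)."""
    n = 2 ** levels
    mrca = np.array([[(i ^ j).bit_length() for j in range(n)]
                     for i in range(n)])
    shared = (levels - mrca).astype(float)  # shared root-to-ancestor path
    return shared, 2.0 * mrca

def morans_i(y, w):
    z = np.asarray(y, dtype=float) - np.mean(y)
    return (z.size / w.sum()) * (z @ w @ z) / (z @ z)

def signal_gradient(bm_weights, n_rep=200, seed=0):
    """Mean Moran's I of simulated traits for each Brownian motion weight."""
    rng = np.random.default_rng(seed)
    shared, dist = balanced_tree_matrices()
    w = np.zeros_like(dist)
    w[dist > 0] = 1.0 / dist[dist > 0]
    chol = np.linalg.cholesky(shared)       # to draw correlated tip values
    means = []
    for a in bm_weights:
        values = []
        for _ in range(n_rep):
            bm = chol @ rng.standard_normal(shared.shape[0])
            noise = rng.permutation(bm)     # same values, signal destroyed
            values.append(morans_i(a * bm + (1.0 - a) * noise, w))
        means.append(np.mean(values))
    return np.array(means)

bm_weights = [0.0, 0.5, 1.0]
gradient = signal_gradient(bm_weights)
for a, m in zip(bm_weights, gradient):
    print(f"BM weight = {a:.1f}  mean Moran's I = {m:.3f}")
```

Running such a simulation on the phylogeny under study shows which index values to expect at each strength of signal, which is the purpose attributed to phyloSim above.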

2.2.4 The phylogenetic correlogram

The phylogenetic correlogram takes up the core idea of the spatial correlogram (Sokal and Oden, 1978). It aims to graphically represent how the data are autocorrelated at different lags of distance. The idea was introduced in a phylogenetic context by Gittleman and Kot (1990) as a way to locate the phylogenetic signal in the taxonomy. Using an accurate phylogeny, it is possible to replace taxonomic distances with phylogenetic distances (e.g. patristic distance). This method has been promoted by Hardy and Pavoine (2012) as an interesting way to characterize the nature of the phylogenetic signal, especially when model-based approaches are limited by the complexity of evolutionary processes.

However, an inherent issue of correlograms is that the autocorrelation must be computed within discretized distance classes. Therefore, the use of the correlogram may be strongly limited for small trees and when tips are not uniformly distributed within the phylogeny. In response to this potential problem, the phylosignal package comes with an original implementation of the phylogenetic correlogram for which the autocorrelation can be computed continuously. This is achieved by computing Moran’s I index using a specific matrix of phylogenetic weights w based on a normalized Gaussian function (Equation 2.2).

\[ w_{ij} = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{(d_{ij} - \mu)^2}{2\sigma^2}} \qquad (2.2) \]

Therefore, a phylogenetic weight matrix can be computed given µ, which defines the distance at which a tip will have the strongest influence, and σ, which defines the decrease of influence around µ. This matrix can be computed using the function phyloWeights, but the phylogenetic correlogram can be estimated directly with the function phyloCorrelogram. Additionally, a confidence envelope is computed using non-parametric bootstrap resampling. Finally, the function can estimate a multivariate Mantel correlogram (Oden and Sokal, 1986) if two or more traits are provided.
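The continuous correlogram can be sketched in a few lines of base R: Equation 2.2 is the Gaussian density, so dnorm gives the weights directly, and Moran's I is then evaluated along a grid of distances µ. The function names, the exclusion of self-pairs, the grid and the toy data are assumptions for the example; the bootstrap confidence envelope is not reproduced here.

```r
# Normalized Gaussian weights of Equation 2.2 for a distance matrix d.
gauss_weights <- function(d, mu, sigma) {
  w <- dnorm(d, mean = mu, sd = sigma)
  diag(w) <- 0        # assumption: self-pairs are excluded
  w
}

# Moran's I for a trait vector y and a weight matrix w.
moran_i <- function(y, w) {
  z <- y - mean(y)
  (length(y) / sum(w)) * sum(w * outer(z, z)) / sum(z^2)
}

# Moran's I evaluated along a grid of distances mu.
correlogram <- function(y, d, sigma, n_points = 20) {
  mu <- seq(0, max(d), length.out = n_points)
  i_mu <- vapply(mu, function(m) moran_i(y, gauss_weights(d, m, sigma)),
                 numeric(1))
  data.frame(distance = mu, moran = i_mu)
}

# Hypothetical data: 6 tips in two tight clades with contrasting values.
d <- matrix(2, 6, 6)
d[1:3, 1:3] <- 0.2
d[4:6, 4:6] <- 0.2
diag(d) <- 0
y <- c(1.0, 1.2, 0.9, 3.1, 2.8, 3.0)

cg <- correlogram(y, d, sigma = 0.3)
```

On these toy data the autocorrelation is positive at short distances (within clades) and negative at the largest distance (between clades), which is the pattern a correlogram is meant to reveal.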

Figure 2.2 gives an example of phylogenetic correlograms with their confidence envelope.

[Figure 2.2, panels A, B and C: correlation (y-axis) plotted against phylogenetic distance (x-axis, 0 to 2).]

Figure 2.2: Phylogenetic correlograms for 3 traits: A. random, B. BM and C. IPSS. The solid bold black line represents the Moran’s I index of autocorrelation and the dashed black lines represent the lower and upper bounds of the confidence envelope (here 95%). The horizontal black line indicates the expected value of Moran’s I under the null hypothesis of no phylogenetic autocorrelation. The coloured bar shows whether the autocorrelation is significant (based on the confidence interval): red for significant positive autocorrelation, black for non-significant autocorrelation and blue for significant negative autocorrelation.

2.2.5 Local Indicators of Phylogenetic Association – LIPA

Global measurements of autocorrelation like Moran’s I and phylogenetic autocorrelograms give valuable information about the general presence of a phylogenetic signal within a phylogeny. However, these approaches make the implicit assumption that traits evolve similarly across the phylogeny. There are solid grounds to expect that this is rarely the case and that phylogenetic signal is scale dependent and varies among clades. Therefore, it can be useful to use local statistics to describe local trait patterns.

Spatial statistics have introduced a class of statistical tools to analyse local patterns called Local Indicators of Spatial Association (LISA). One simple and well-described LISA is the local Moran’s I (Equation 2.3), noted Ii (Anselin, 1995), which can be used to detect hotspots of positive and negative autocorrelation. The same statistic can be applied to phylogenetic data to detect species with similar neighbours and species with different neighbours. In this context, we call these indicators Local Indicators of Phylogenetic Association (LIPA), for the sake of consistency in terminology, although the statistic remains the same.

\[ I_i = \frac{y_i - \bar{y}}{m_2} \sum_{j=1}^{n} w_{ij} (y_j - \bar{y}) \qquad (2.3) \]

with

\[ m_2 = \frac{\sum_{i=1}^{n} (y_i - \bar{y})^2}{n} \]

Local Moran’s I (Ii) can be computed with the function lipaMoran for each tip of the phylogeny and for one or more traits. By default, the function uses a phylogenetic weights matrix wij = 1/dij, dij being the patristic distance matrix. However, any matrix of weights can be provided. For each value of local Moran’s I, the function performs a non-parametric test by randomization and returns a p-value. Figure 2.3 gives an example of local Moran’s I (Ii) values plotted onto a phylogenetic tree.
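Equation 2.3 translates almost literally into base R. The sketch below computes Ii for every tip with weights 1/dij and attaches a randomization p-value. The function names, the two-sided form of the test, the absence of row standardization and the toy data are assumptions for the example and may differ from what lipaMoran does internally.

```r
# Local Moran's I (Equation 2.3) for a trait vector y and weights w.
local_moran <- function(y, w) {
  z <- y - mean(y)
  m2 <- sum(z^2) / length(y)
  as.vector((z / m2) * (w %*% z))
}

# Randomization test (assumed two-sided): trait values are shuffled
# among tips and the local indices are recomputed for each shuffle.
lipa_test <- function(y, w, reps = 999) {
  obs <- local_moran(y, w)
  # Each column of `null` holds the local indices of one shuffled trait.
  null <- replicate(reps, local_moran(sample(y), w))
  p <- (rowSums(abs(null) >= abs(obs)) + 1) / (reps + 1)
  data.frame(Ii = obs, p.value = p)
}

# Hypothetical data: 6 tips in two tight clades with contrasting values.
d <- matrix(2, 6, 6)
d[1:3, 1:3] <- 0.2
d[4:6, 4:6] <- 0.2
diag(d) <- 0
w <- 1 / d
diag(w) <- 0          # a tip is not its own neighbour
y <- c(1.0, 1.2, 0.9, 3.1, 2.8, 3.0)

set.seed(1)
res <- lipa_test(y, w, reps = 199)
```

A positive Ii indicates a tip whose close relatives carry similar values; a negative Ii indicates a tip that contrasts with its neighbours.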

[Figure 2.3: dot plot along the tips of the phylogeny, x-axis labelled IPSS (-2 to 2). Tip labels: Neidium affine, Neidium bisulcatum, Neidium productum, Scoliopleura peisonis, Luticola goeppertiana, Diploneis subovalis, Craticula accomoda, Craticula molestiformis, Eolimna subminuscula, Craticula cuspidata, Stauroneis anceps, Stauroneis phoenicenteron, Stauroneis gracilior, Stauroneis acuta, Stauroneis kriegeri, Fistulifera saprophila, Fistulifera pelliculosa.]

Figure 2.3: Local Moran’s index (Ii) values for each species for the trait IPSS, computed with lipaMoran and plotted with dotplot.phylo4d. Red points indicate significant Ii values.

2.2.6 Additional functionalities

The phylosignal package comes with some additional features to analyse phylogenetic signal. The function phyloSignalINT computes phylogenetic signal indices and tests for each internal node of a given phylogeny. Combined with lipaMoran, it can help to identify an interesting region of the phylogenetic tree, for example one exhibiting strong conservation. If bootstrapped replicates of the phylogeny are available, the function phyloSignalBS can be used to compute signal indices and tests for each bootstrap replicate. The function renders the results as boxplots, which makes it possible to assess the effect of phylogenetic reconstruction uncertainty on phylogenetic signal estimates. Finally, the function graphClust implements a simple method to perform trait clustering under phylogenetic constraints (Keck et al., 2016a).

2.3 Example: Phylogenetic signal of pollution sensitivity in diatoms

In order to demonstrate the application of phylosignal, we comment on an analysis of the phylogenetic signal for 17 diatom species. The trait analysed is the specific pollution sensitivity index, IPSS (Coste, 1982). The diatoms are taken from the order Naviculales and the phylogenetic tree is taken from Keck et al. (2016b). This dataset is deliberately kept simple for demonstration purposes: it gives only a very brief overview of the diversity existing in this clade, but it constitutes a good case study (for a more comprehensive discussion of the phylogenetic signal in diatom sensitivity to pollution, see Keck et al., 2016a,b). The dataset is included in the package and can be loaded with the following commands.

    library(phylosignal)  # the package that ships the navic dataset
    data(navic)

For illustration purposes, we add two other traits: random, which is randomly distributed in the phylogeny, and BM, which is generated under a Brownian motion model.

    library(ape)
    library(phylobase)
    # Random trait: 17 independent draws, one per tip
    tipData(navic)$random <- rnorm(17)
    # Brownian motion trait simulated along the tree
    tipData(navic)$BM <- rTraitCont(as(navic, "phylo"))

The data are loaded in the form of a phylo4d object. It is therefore straightforward to plot the phylogeny and the trait values (Figure 2.1).

    barplot.phylo4d(navic)

We can compute phylogenetic signal indices and the p-values of their respective tests.

    phyloSignal(navic)

    $stat
                 Cmean           I         K    K.star       Lambda
    IPSS    0.47915189  0.04286040 0.7897245 0.8541988 0.9588398276
    random -0.06522342 -0.10555838 0.3213491 0.3216638 0.0000704802
    BM      0.37543446  0.08060191 0.7267358 0.7852155 0.9798037571

    $pvalue
           Cmean     I     K K.star     Lambda
    IPSS   0.008 0.088 0.014  0.012 0.02593566
    random 0.464 0.713 0.565  0.629 1.00000000
    BM     0.006 0.035 0.014  0.008 0.07076068
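To read the table of p-values systematically, one can compare it with a significance threshold. The short sketch below copies the p-values printed above into a matrix and counts, for each trait, how many of the five tests fall below 0.05. The 5% threshold and the object names are choices made for the example.

```r
# p-values copied from the phyloSignal output shown above.
pvalue <- matrix(
  c(0.008, 0.088, 0.014, 0.012, 0.02593566,
    0.464, 0.713, 0.565, 0.629, 1.00000000,
    0.006, 0.035, 0.014, 0.008, 0.07076068),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("IPSS", "random", "BM"),
                  c("Cmean", "I", "K", "K.star", "Lambda"))
)

# TRUE where the test rejects the null hypothesis at the 5% level.
signif05 <- pvalue < 0.05

# Number of tests (out of 5) detecting a signal for each trait:
# IPSS 4, random 0, BM 4.
n_signif <- rowSums(signif05)
```

At this threshold the tests do not fully agree: Moran's I does not reject the null hypothesis for IPSS, and the likelihood ratio test on λ does not reject it for BM.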

Not surprisingly, the tests tend to detect a signal for BM and none of them detects a signal for random. The phylogenetic signal also appears to be significant for IPSS (at the 5% level for every index except Moran’s I). We can compute and plot a phylogenetic correlogram for each trait with the following commands:

    IPSS.cg <- phyloCorrelogram(navic, trait = "IPSS")
    random.cg <- phyloCorrelogram(navic, trait = "random")
    BM.cg <- phyloCorrelogram(navic, trait = "BM")
    plot(IPSS.cg)
    plot(random.cg)
    plot(BM.cg)

The phylogenetic correlogram of random is flat and non-significant (Figure 2.2A), while BM exhibits a positive autocorrelation for short lags (Figure 2.2B). The correlogram of IPSS is a bit different, with a strong positive autocorrelation for short lags and negative autocorrelation for medium lags (Figure 2.2C). This is due to the clade structure of the signal: two closely related species belonging to the same clade tend to share similar trait values, but two adjacent clades are likely to differ strongly (Figure 2.1).

Finally, we can compute local Moran’s I for each species to detect hotspots of autocorrelation in IPSS. The following commands compute local Moran’s I and represent the values on the phylogeny (Figure 2.3). The p-values are turned into colours to highlight hotspots. Here we use a proximity matrix based on the number of nodes to ignore the effect of long terminal branches and focus on clades.

    # Local Moran's I returned as a phylo4d object, ready for plotting
    local.i <- lipaMoran(navic, trait = "IPSS",
                         prox.phylo = "nNodes", as.p4d = TRUE)
    # Same analysis with the default output, from which p-values are extracted
    points.col <- lipaMoran(navic, trait = "IPSS",
                            prox.phylo = "nNodes")$p.value
    # Red for p < 0.05, black otherwise
    points.col <- ifelse(points.col < 0.05, "red", "black")
    dotplot.phylo4d(local.i, dot.col = points.col)

The LIPA analysis (Figure 2.3) reveals significant local positive autocorrelation in two clades: the genus Craticula (including Eolimna subminuscula) with low values of sensitivity and the genus Stauroneis with high values of sensitivity.

2.4 Conclusion

We have presented the phylosignal package and shown how it can be used to describe and analyse the phylogenetic signal in biological traits. The fact that phylosignal is integrated in the R ecosystem and uses the standard format phylo4d makes it interoperable with several other methods implemented in the R language. For example, users can complete these results with a phylogenetic principal component analysis (Jombart et al., 2010b) implemented in adephylo to detect combinations of traits that are phylogenetically autocorrelated. They can also use the tools implemented in ape to investigate evolutionary models through a generalized least squares approach (Paradis, 2011). The combination of these tools will help to characterize the phylogenetic signal and to identify the historical and ecological processes which drive patterns of trait values in the phylogeny.

2.5 Acknowledgments

This work was funded by ONEMA (French National Office for Water and Aquatic Ecosystems) in the context of the 2013-2015 “Phylogeny and Bioassessment” program.

CHAPTER 3

Phylogenetic signal in diatom ecology: perspectives for aquatic ecosystems biomonitoring

This is a self-edited version of an article published in Ecological Applications. Please cite:

Keck F., Rimet F., Franc A. & Bouchez A. (2016). Phylogenetic signal in diatom ecology: perspectives for aquatic ecosystems biomonitoring. Ecological Applications 26(3), 861–872.

Abstract

Diatoms include a great diversity of taxa and are recognized as powerful bioindicators in rivers. However, using diatoms for monitoring programs is costly and time consuming because most of the methodologies necessitate species-level identification. This raises the question of the optimal tradeoff between taxonomic resolution and bioassessment quality. Phylogenetic tools may form the basis of new, more efficient approaches for biomonitoring if relationships between ecology and phylogeny can be demonstrated. We estimated the ecological optima of 127 diatom species for 19 environmental parameters using count data from 2119 diatom communities sampled during 8 years in eastern France. Using uni- and multivariate analyses, we explored the relationships between freshwater diatom phylogeny and ecology (i.e. the phylogenetic signal). We found a significant phylogenetic signal for many of the ecological optima that were tested, but the strength of the signal varied significantly from one trait to another. Multivariate analysis also showed that the multidimensional ecological niche of diatoms can be strongly related to phylogeny. The presence of clades containing species that exhibit homogeneous ecology suggests that phylogenetic information can be useful for aquatic biomonitoring. This study highlights the presence of significant patterns of ecological optima for freshwater diatoms in relation to their phylogeny. These results suggest the presence of a signal above the species level, which is encouraging for the development of simplified methods for biomonitoring surveys.

3.1 Introduction

The monitoring of water quality using biological indicators is common practice among environmental managers worldwide (Ibáñez et al., 2010). This approach is based on the view that the composition of biological communities is directly affected by environmental conditions and human activities (Chapman, 1996). Algae and especially diatoms, a particular group of microalgae, are well suited to monitor environmental health, because they include groups exhibiting a wide range of ecological optima and their communities respond rapidly to changes in habitat quality (Lowe and Pan, 1996; Stevenson and Smol, 2003). Despite these advantages, ecological assessments using diatoms are less common than those using macroinvertebrates or fishes (Carter et al., 2006; Gallacher, 2002; Resh, 2007). This may be related to the huge diversity of diatoms, a group of algae containing approximately 12 000 described species (Guiry, 2012) and potentially 100 000 existing ones (Mann and Vanormelingen, 2013). Such diversity makes microscopic identification at species level a challenge (Besse-Lototskaya et al., 2006; Kelly et al., 1995; Kociolek, 2005). In fact, the majority of diatom indices are based on species or sub-species levels (Rimet, 2012) and require highly qualified staff and resources.

Two ways have been explored to avoid this problem. First, some authors have proposed to reduce the number of species used to calculate bioassessment indices (Lavoie et al., 2009; Lenoir and Coste, 1996). In the same vein, the results of DeNicola (2000) also suggested that resolution of taxonomic synonymies may reduce the number of species to evaluate. Secondly, it was proposed to work at a lower taxonomic resolution. Studies comparing the bioassessment performance of different taxonomic levels showed that some higher taxa (in particular at genus level) might be relatively precise and efficient compared to species level (Chessman et al., 1999; Growns, 1999; Hill et al., 2001; Kelly et al., 1995; Raunio and Soininen, 2007; Rimet and Bouchez, 2012b; Wunsam et al., 2002). However, some authors stressed the risk of oversimplification and emphasized the importance of working at the most precise taxonomic level (Bennett et al., 2014; Kociolek, 2005; Patrick and Palavage, 1994; Ponader and Potapova, 2007; Round, 1991). Thus, it is still unclear what constitutes an appropriate group for biomonitoring. An alternative and interesting way might be to use mixed taxonomic levels, as suggested by Jones (2008) for macroinvertebrates. The idea behind this is to adapt the taxonomic detail to the information content of each clade. For diatoms, moving from genus to species may require much more effort and special training. Reducing, where possible, the level of taxonomic identification could reduce both costs and errors.

Bioassessment protocols should be able to benefit from such approaches, but this raises methodological questions about how to select biomonitoring groups and how to test their efficiency. Recently, new frameworks have been proposed to link species assemblages, species traits and phylogeny (Mouquet et al., 2012; Webb et al., 2002). The integration of phylogenetics and community ecology aims to disentangle the role of phylogeny in species interactions, community structures and ecological processes. But such an approach could also be interesting for conservation purposes and applied ecology, especially when managers have to deal with complex biological indicators with important diversity, complicated taxonomy and arduous determination. The set of tools provided by phylogenetics seems promising for exploring the ecological variability within and between taxonomic levels and for making new proposals for efficient biomonitoring groups (Carew et al., 2011; Larras et al., 2014). However, using phylogeny in an environmental assessment context requires the ability to demonstrate the relationships existing between the phylogenetic position of species and their ecology (Carew et al., 2011). This can be assessed by analysing the phylogenetic signal, i.e. the tendency for closely related species to possess more similar trait values than more distantly related species. For instance, Buchwalter et al. (2008) explored the phylogenetic signal for physiological traits related to cadmium bioaccumulation, compartmentalization and susceptibility for 21 macroinvertebrate species. Carew et al. (2011) developed a similar approach, but focused on more general traits reflecting the sensitivity of chironomids and mayflies to pollution, and Larras et al. (2014) focused on the sensitivity of 14 diatom species to 4 different herbicides. These studies showed the existence of phylogenetic signals that were more or less pronounced depending on the trait and biological group considered. Above all, they highlighted the potential of including a phylogenetic component in the development of ecotoxicological and bioassessment tools.

The next step in the integration of phylogeny into biomonitoring tools using diatom communities is to pursue efforts for characterizing, measuring and testing phylogenetic signal on large databases including a significant number of species and their ecological optima. Here, we jointly analyze the phylogeny and the ecology of 127 freshwater diatom taxa included in river biomonitoring tools. The aim of this study was to take a preliminary view of the phylogenetic signals of diatoms’ ecological optima for a variety of environmental parameters. These parameters (related to mineralization, organic matter, nutrients, etc.) mainly reflect the chemical status of water bodies. They define, at least partially, the fundamental habitat niche of diatoms and are recognized to directly influence the water quality and the ecological potential of freshwaters. We provide a phylogenetic tree based on 18S and rbcL markers for 127 species. For all these species, we estimated ecological optima for 19 different environmental parameters affecting the quality of freshwaters. Using univariate and multivariate approaches, we explore the phylogenetic signal and discuss the implications of our findings for biomonitoring.

3.2 Material and Methods

3.2.1 Genetic database

We collected DNA sequences of diatoms from the open access GenBank database (NCBI 2013) and the Thonon Culture Collection (TCC, INRA 2013), which provides a nucleotide sequence database for indexed micro-algae species. We focused on the nuclear gene coding for the small subunit 18S rRNA and the chloroplast rbcL gene coding for the RuBisCO enzyme. These markers are among the most popular and they are available for an important diversity of diatom species (Theriot et al., 2011). DNA sequences of the 18S and rbcL genes were extracted from GenBank using Geneious Basic 5.6.3 software with the following query: (18S OR rbcl) AND (diatom OR Bacillariophyta). The query was rerun every two months in 2013 to keep the database updated. Data extracted from GenBank and TCC were manually cleaned. Each sequence was blasted against the GenBank database to check for taxonomic inconsistencies. In case of problems, we checked for taxonomic synonyms using the Catalogue of Diatom Names (Fourtanier and Kociolek, 2011) and, if the error persisted, we dropped the sequence. After cleaning, our database contained 1236 sequences for the 18S gene and 1084 sequences for the rbcL gene.

3.2.2 Phylogeny

Many species were sequenced several times from different strains, and sequences were derived from many different sources. Therefore, the phylogenetic reconstruction was divided into several steps, presented in Figure 3.1. Nucleotide sequences were first aligned using the Muscle algorithm (Edgar, 2004) provided through the SeaView graphical user interface (Gouy et al., 2010). The 18S alignment presented 634 complete sites (without gaps or undetermined nucleotides) and the rbcL alignment 467 complete sites. Tree topologies and branch lengths were computed separately for the two markers with the maximum likelihood (ML) method in PhyML 3.0 (Guindon et al., 2010). We used the MrAIC software (Nylander, 2004) to select the best substitution models using the Akaike Information Criterion (AIC). After reconstruction, each tree was pruned so that only one strain remained per species or subspecies. Some species appeared to be polyphyletic, which can result from taxonomic assignment errors, inaccurate tree reconstruction or a mismatch between taxonomy and phylogeny. To limit the impact of such problems, pruning was done manually with the following rules: for species with more than one available strain, strains isolated in the phylogenetic tree or sequences not published in peer-reviewed journals were deleted first. When only a monophyletic group remained, we selected one of the strains, prioritizing the working scale of the source publication (a study focusing on a specific genus was preferred to a study covering a broad taxonomic range).

After pruning, 411 unique species or subspecies remained on the 18S phylogenetic tree and 425 on the rbcL phylogenetic tree (the accession numbers of the selected sequences are provided in Appendix B, Section B.1.1). However, 18S and rbcL taken separately have a limited ability to recover phylogenetic relationships within diatoms (Theriot et al., 2011). Therefore, the previously selected 18S and rbcL sequences were concatenated for those species for which both markers were available (286 species). This alignment was used to estimate a more robust and accurate tree (Tree∩) using a partitioned ML analysis (20 runs; 100 bootstraps) in RAxML 7.2.8 (Stamatakis, 2006). Finally, 18S and rbcL sequences were concatenated even for species for which only one marker was available (550 species). This alignment was used to estimate a tree with a higher diversity (Tree∪) using a partitioned ML analysis (20 runs; 100 bootstraps) with Tree∩ as a topological constraint in RAxML. All the trees were dated in relative time using a semi-parametric method based on penalized likelihood (Sanderson, 2002).

[Figure 3.1, flowchart, for the 18S and rbcL markers respectively: database extraction (1236 and 1084 sequences); alignment, tree reconstruction and pruning to one sequence per species (411 and 425); intersection (286) followed by tree reconstruction, giving Tree∩; union (550) followed by tree reconstruction with Tree∩ as topological constraint, giving Tree∪.]

Figure 3.1: Flowchart describing the main steps of phylogenetic reconstruction. Numbers in grey frames represent the number of sequences available after each operation.

3.2.3 Ecological optima

Ecological optima for each species were estimated from the results of biomonitoring surveys conducted as part of the French program Réseau de Contrôle de Surveillance (RCS). Samples were collected in rivers and streams in eastern France between 2001 and 2008. The database connects the relative abundance of diatom species (estimated by counting 400 individuals) to the chemical and physical conditions for 2119 samples. Environmental parameters and their abbreviations are presented in Table 3.1.

    Parameter     Unit          Min      1st Quartile  Median   Mean     3rd Quartile  Max
    NO₃⁻          mg.L⁻¹        0.100    2.660         4.800    6.442    8.500         69.700
    NO₂⁻          mg.L⁻¹        0.010    0.020         0.053    0.126    0.120         2.960
    NH₄⁺          mg.L⁻¹        0.028    0.050         0.070    0.269    0.150         17.200
    NKJ           mg.L⁻¹ (N)    0.213    0.703         1.000    1.049    1.000         14.200
    PO₄³⁻         mg.L⁻¹        0.000    0.030         0.128    0.344    0.323         13.100
    TP            mg.L⁻¹ (P)    0.000    0.020         0.096    0.180    0.205         5.674
    DOC           mg.L⁻¹ (C)    0.200    1.500         2.600    2.867    3.663         19.720
    DOM           mg.L⁻¹        1.000    3.600         7.425    16.783   14.317        3890.000
    BOD           mg.L⁻¹        0.500    0.900         1.867    1.985    2.750         20.000
    O2p           %             0.000    83.600        93.500   91.369   101.000       202.200
    Na⁺           mg.L⁻¹        0.000    5.200         10.300   31.661   22.050        863.500
    Cl⁻           mg.L⁻¹        0.000    7.900         15.350   51.844   31.050        1411.000
    Ca²⁺          mg.L⁻¹        0.000    45.000        76.000   82.603   103.000       478.000
    K⁺            mg.L⁻¹        0.000    1.300         2.400    4.165    4.600         224.000
    Mg²⁺          mg.L⁻¹        0.000    3.750         6.800    12.863   14.038        136.600
    SO₄²⁻         mg.L⁻¹        0.000    12.533        31.700   89.182   84.725        1380.000
    Conductivity  µS.cm⁻¹       18.700   334.000       464.500  615.848  656.250       4818.000
    pH            -             6.300    7.653         7.900    7.828    8.100         8.800
    Temperature   °C            0.200    14.337        16.720   16.166   18.700        30.000

Table 3.1: Summary of the environmental parameters present in the dataset. NKJ = Kjeldahl nitrogen; TP = total phosphorus; DOC = dissolved organic carbon; DOM = dissolved organic matter; BOD = biological oxygen demand; O2p = pressure of oxygen.

For each species and each environmental parameter, we calculated an optimum value (i.e. the species optimum for the parameter). We chose the weighted averaging method (WA, Ter Braak and Looman, 1986) to estimate species optima. For a given species (k), a given environmental parameter (x) and a set of sampled sites (i = 1, …, n), WA corresponds to the mean of the parameter x weighted by yik, the abundance of the species k at each corresponding sampled site (Equation 3.1). This is a simple, reliable and robust statistic with a long tradition of use in diatom ecology (e.g. Charles, 1985; Oksanen et al., 1988; Salden, 1978), paleolimnological studies (Birks, 2010) and biomonitoring (Ector and Rimet, 2005).

\[ WA_k = \frac{\sum_{i=1}^{n} y_{ik} \, x_i}{\sum_{i=1}^{n} y_{ik}} \qquad (3.1) \]

To promote statistical robustness, we chose to estimate WA only for species occurring in at least 10 different samples. Diatom counts were log-transformed, and environmental parameters were log-transformed and standardized, before WA computation.
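The weighted averaging step can be sketched in base R as follows. The function name wa_optima, the use of log(x + 1) for both counts and the environmental parameter (the source does not state how zeros were handled) and the toy data are assumptions for the example, not the code used in the study.

```r
# Weighted-averaging optima (Equation 3.1) for one environmental parameter.
# abund: sites x species matrix of counts; env: one value per site.
wa_optima <- function(abund, env, min_occ = 10) {
  # Keep species occurring in at least min_occ samples.
  keep <- colSums(abund > 0) >= min_occ
  y <- log1p(abund[, keep, drop = FALSE])  # assumed transform: log(x + 1)
  x <- as.vector(scale(log1p(env)))        # assumed: log(x + 1), then z-score
  # Each column of y is multiplied element-wise by x (one value per site).
  colSums(y * x) / colSums(y)
}

# Hypothetical data: 12 sites along a gradient, 3 species.
env <- 1:12
abund <- cbind(
  A = c(30, 25, 20, 10, 5, 0, 0, 0, 0, 0, 0, 0),  # low end of the gradient
  B = c(0, 0, 0, 0, 0, 0, 0, 5, 10, 20, 25, 30),  # high end of the gradient
  C = c(0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0)       # too rare, dropped
)

optima <- wa_optima(abund, env, min_occ = 3)
```

Because the parameter is standardized, an optimum below zero places a species on the low side of the sampled gradient and an optimum above zero on the high side.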

3.2.4 Phylogenetic signal

Phylogenetic signal is “the tendency for related species to resemble each other more than they resemble species drawn at random from the tree” (Blomberg and Garland, 2002). From a statistical point of view, this is the non-independence of trait values among species as a direct consequence of their phylogenetic relatedness. There are numerous ways to measure and test the phylogenetic signal described in the literature (see Revell et al., 2008). Münkemüller et al. (2012) reviewed some of the most popular indices to measure the phylogenetic signal and showed that the performance of these statistics can vary strongly, depending on the size and structure of the phylogeny and on the complexity of the trait evolution model.

In this study, we estimated the phylogenetic signal for ecological optima using Pagel’s λ (Pagel, 1999) and Abouheif’s Cmean (Abouheif, 1999). Pagel’s λ ranges between 0 (no signal, traits are distributed randomly) and 1 (signal, traits are distributed following a Brownian motion model). It is a branch-length transformation method, estimated by maximum likelihood and tested by a likelihood ratio test (Pagel, 1999). Abouheif’s Cmean is an alternative way to test for phylogenetic signal. Contrary to Pagel’s λ, it does not rely on an evolutionary model but measures autocorrelation among tips with a particular matrix of phylogenetic proximities (Pavoine et al., 2008). It can be tested with randomized permutations. Both Pagel’s λ and Abouheif’s Cmean have been singled out by Münkemüller et al. (2012) as powerful and reliable methods to measure and test the phylogenetic signal and have been previously used on ecological traits (Comte et al., 2014; Freckleton et al., 2002).
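The estimation of λ can be sketched in base R under the usual formulation, in which λ multiplies the off-diagonal elements of the Brownian motion covariance matrix of the tips. The function names, the restriction of the search to [0, 1], the test against λ = 0 with one degree of freedom and the toy covariance matrix are assumptions for the example; the analyses in this study relied on phytools, not on this code.

```r
# Log-likelihood of trait y under Brownian motion when the off-diagonal
# elements of the phylogenetic covariance matrix C are scaled by lambda.
loglik_lambda <- function(lambda, y, C) {
  n <- length(y)
  Cl <- C * lambda
  diag(Cl) <- diag(C)                       # tip variances are unchanged
  inv <- solve(Cl)
  root <- sum(inv %*% y) / sum(inv)         # GLS estimate of the root state
  r <- y - root
  s2 <- as.numeric(t(r) %*% inv %*% r) / n  # ML estimate of the rate
  logdet <- as.numeric(determinant(Cl, logarithm = TRUE)$modulus)
  -0.5 * (n * log(2 * pi * s2) + logdet + n)
}

# ML estimate of lambda and likelihood ratio test against lambda = 0.
fit_lambda <- function(y, C) {
  opt <- optimize(loglik_lambda, interval = c(0, 1), y = y, C = C,
                  maximum = TRUE)
  ll0 <- loglik_lambda(0, y, C)
  lr <- max(0, 2 * (opt$objective - ll0))
  list(lambda = opt$maximum, logLik = opt$objective, LR = lr,
       p.value = pchisq(lr, df = 1, lower.tail = FALSE))
}

# Hypothetical ultrametric tree of height 1: two clades of 3 tips.
C <- kronecker(diag(2), matrix(0.9, 3, 3))
diag(C) <- 1
y <- c(1.0, 1.2, 0.9, 3.1, 2.8, 3.0)

fit <- fit_lambda(y, C)
```

With trait values this strongly structured by clade, the estimate of λ is expected to lie close to 1 and the likelihood ratio test to reject λ = 0.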

Since phylogenetic signal measurement strongly depends on the phylogenetic tree used as input, the robustness of the results was assessed by repeating the analyses with 100 bootstrap trees. This procedure yields a distribution of signal measures and tests that reflects uncertainty in both the topology and the branch lengths.

3.2.5 Phylogenetic Principal Component Analysis

Phylogenetic principal component analysis (pPCA) extends the classical PCA to the analysis of phylogenetic structures in biological traits (Jombart et al., 2010b). This is a dimension-reduction method which aims to uncover the main phylogenetic structures of a set of traits by finding combinations of them that are phylogenetically autocorrelated.

To this end, Jombart et al. (2010b) defined two types of phylogenetic patterns: global and local structures. Global structures occur at large scales in the phylogeny and define general patterns of trait similarities among taxa. They exhibit positive autocorrelation and are directly related to the idea of phylogenetic signal. By contrast, local structures occur in specific parts of the phylogeny between closely related taxa which exhibit strong variability. Local structures are characterized by negative autocorrelation and reflect trait overdispersion (i.e. antisignal). We applied pPCA directly on the standardized matrix of ecological optima to inspect Global and Local structures along the phylogeny with respect to the complete set of environmental variables. To this end, Global and Local scores of species were mapped onto the phylogeny to uncover interesting patterns. A species with a high score on a Global principal component shares similar trait values with its neighbors. On the other hand, a high score on a Local principal component means strong differences with its neighbors. We also used lagged scores (i.e. for one species, the mean value of its neighbors weighted by their distance). Lagged scores smooth scores for a given component and make visualizing patterns easier.
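The idea of a lagged score can be sketched in base R as a weighted mean of the neighbors' scores, using row-standardized proximities. The function name lag_scores, the choice of 1/d as proximity and the toy data are assumptions for the example; adephylo offers several proximity measures and the one used in the study is not restated here.

```r
# Lagged score of each species: mean of the other species' scores,
# weighted by phylogenetic proximity.
lag_scores <- function(scores, prox) {
  diag(prox) <- 0              # a species is not its own neighbour
  w <- prox / rowSums(prox)    # row-standardize: each row sums to 1
  as.vector(w %*% scores)
}

# Hypothetical data: 6 tips in two tight clades.
d <- matrix(2, 6, 6)
d[1:3, 1:3] <- 0.2
d[4:6, 4:6] <- 0.2
diag(d) <- 0
prox <- 1 / d                  # assumed proximity: inverse distance
scores <- c(-1.0, -0.8, -1.1, 1.1, 0.8, 1.0)

lagged <- lag_scores(scores, prox)
```

Each lagged value is pulled towards the scores of the close relatives, which is why mapping lagged scores onto the tree gives a smoother picture than the raw scores.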

3.2.6 Statistical Packages

We performed all the statistical analyses with R 3.0.2 software (R Development Core Team, 2013). Phylogenies were handled and dated with the ape package (Paradis et al., 2004). Pagel’s λ statistics and tests were computed with the package phytools (Revell, 2012). The pPCA and Abouheif’s Cmean tests were performed using the adephylo package (Jombart et al., 2010a).

3.3 Results

3.3.1 Ecological and Phylogenetic data

In the 2119 sampled sites, 909 different diatom species (or varieties) were identified. Using only species occurring in more than 10 different sites restricted the number of taxa to 398. Finally, crossing the ecological dataset with the phylogenetic tree (Tree∪) led to a subset of 127 species for the analyses. All the following results are based on this subset.

For both 18S and rbcL markers, the most complex evolutionary model (i.e. GTR+I+G) was selected according to AIC. The final phylogenetic tree (Tree∪), based on both 18S and rbcL but representing only the set of species included in the signal analyses (127 species), is presented in Figure 3.2, whereas the complete Tree∪ (550 species) and Tree∩ (286 species) are provided with bootstrap support values (Appendix B, Section B.4.1). The most recent branches are well supported by bootstraps overall, whereas some deep branches exhibit low support values.

[Figure 3.2: phylogenetic tree with 127 tip labels (not reproduced here) and clades labelled 1 to 6, next to four bar panels GPC1, GPC2, GPC1 LAG and GPC2 LAG, each with a score axis from -4 to 2.]

Figure 3.2: Phylogenetic tree reconstructed by maximum likelihood from 18S and rbcL DNA sequences for 127 different diatom taxa. The bars give the score of each taxon on the 1st (GPC1) and 2nd (GPC2) Global Principal Component of the pPCA. Lag vector scores for each of these components are also provided (GPC1 LAG and GPC2 LAG). The different clades discussed in the body of the text are directly labeled on the tree.

3.3.2 Phylogenetic signals

We found statistical evidence of phylogenetic signal for many tested traits (Table 3.2 and Figure 3.3). Except for SO₄²⁻ and Mg²⁺, phylogenetic signal measures and tests appeared to be robust with respect to phylogenetic tree reconstruction uncertainty (Figure 3.3). Using the most likely tree, both Pagel’s λ and Abouheif’s Cmean led to the conclusion of absence of a signal for NKJ. Pagel’s λ also indicated an absence of phylogenetic signal for NO₂⁻, NH₄⁺, pH and temperature. With both methods the strength of the phylogenetic signal was found to be variable depending on

λ p-value Cmean p-value – NO3 0.740 <0.001 0.404 <0.001 – NO2 0.116 1 0.185 0.002 + NH4 0.000 1 0.158 0.012 NKJ 0.000 1 0.057 0.149 3– PO4 0.667 <0.001 0.370 <0.001 TP 0.647 <0.001 0.352 <0.001 DOC 0.682 <0.001 0.330 <0.001 DOM 0.561 <0.001 0.286 <0.001 BOD 0.514 <0.001 0.266 <0.001

O2p 0.684 <0.001 0.352 <0.001 Na+ 0.784 0.001 0.238 <0.001 Cl– 0.776 <0.001 0.275 <0.001 Ca2+ 0.697 0.008 0.191 0.005 K+ 0.653 <0.001 0.286 <0.001 Mg2+ 0.713 0.016 0.146 0.012 2– SO4 0.674 0.026 0.159 0.003 Conductivity 0.723 0.002 0.206 0.003 pH 0.545 1 0.167 0.009 Temperature 0.250 0.356 0.17 0.009 Table 3.2: Measurements and tests of the phylogenetic signal for 19 ecological optima with two methods (Pagel’s λ and Abouheif’s Cmean). Statistics are in bold if their p-value < 0.05. For abbreviations references see Table 3.1.

+ the trait considered. The Pagel’s λ scaled between <0.001 (NKJ and NH4 ) and 0.784

+ – (Na ) whereas Abouheif’s Cmean scaled between 0.057 (NKJ) and 0.404 (NO3).

3.3.3 Phylogenetic principal component analysis

We kept the first two positive axes of the pPCA, which expressed a very strong Global structure (Figure 3.4A). By contrast, Local structure (negative eigenvalues) was very weak, and a test for negative autocorrelation on each variable separately found no statistical support for such a pattern (p-values ≥ 0.875). Thus, we do not include results on Local structure in this paper. The first global principal component (GPC1) essentially reflects a gradient of organic matter, nutrients and minerals on the one hand and oxygen on the other (Figure 3.4B). Species values for this component clearly exhibited patterns in the phylogeny (Figure 3.2, GPC1 and GPC1-Lag). Clusters 3, 4 and 6 are related to high values of organic matter, nutrients and minerals and low values of oxygen, while clusters 1 and 5 seem to characterize the opposite kind of water.

[Figure 3.3 image: box plots for the 19 ecological optima in two panels (Pagel’s λ, axis 0.0 to 0.8; Abouheif’s Cmean, axis 0.1 to 0.5), with a colour scale for the percentage of significant tests (0 to 100).]

Figure 3.3: Phylogenetic signal measured and tested for 19 ecological optima with two methods (Pagel’s λ and Abouheif’s Cmean) over 100 bootstrap trees. Boxes summarize the distribution of statistic values for the 100 phylogenetic trees. Box color indicates the percentage of associated significant tests (p-value < 0.05) over the 100 trees.

The second global principal component (GPC2) discriminates nutrients from minerals and also expresses the strong negative correlation between nutrient levels and oxygen in the water (Figure 3.4B). GPC2 clarified information given by GPC1, in particular that cluster 2 is more related to nutrients than cluster 6 (Figure 3.2, GPC2 and GPC2-Lag).

[Figure 3.4 image: panels A and B; panel labels GPC1 and GPC2; axis ticks from 0.0 to 3.0.]

Figure 3.4: A. Eigenvalues extracted by the pPCA. B. Loadings for the 1st (GPC1) and 2nd (GPC2) Global Principal Component of the pPCA (grid mesh = 0.2). For abbreviations reference see Table 3.1.

3.4 Discussion

3.4.1 Phylogenetic signal in ecological niches of diatoms

Ecologists are increasingly interested in the evolutionary perspective that sister species are ecologically similar (Losos, 2008). This has obvious implications for ecological studies and applied monitoring (Wiens et al., 2010). In the case of freshwater diatoms, most of the ecological optima showed a strong phylogenetic signal using Pagel’s λ, Abouheif’s Cmean and pPCA. The ecological optima that exhibit the strongest signal can be divided into three groups. The first is ions (Na⁺, K⁺, Cl⁻, Ca²⁺), which directly affect the salinity, alkalinity and conductivity of water; the second is inorganic nutrients (PO₄³⁻, TP, NO₃⁻); and the third is indicators of organic matter (DOC, DOM, BOD). These traits, which seem to be well conserved throughout evolution, are also known to be important factors in determining diatom assemblages (Kelly, 2003; Leland and Porter, 2000; Rimet, 2009; Van Dam et al., 1994), as they are involved in environmental filtering and species competition. However, none of these studies made a connection between ecological traits and phylogeny. It is possible that a number of the ecological preferences we studied here are to some extent correlated with species’ cell/body size, which is well conserved in diatom phylogeny (Nakov et al., 2014). However, the impact of cell size on diatom ecological optima is not clear, and previous studies have shown contrasting results (Berthon et al., 2011; Lavoie et al., 2010).

Conversely, some parameters showed low or no phylogenetic signal. This was the case for three of the four forms of nitrogen (NKJ, NO₂⁻ and NH₄⁺), but also for pH and temperature. At this point, it seems difficult to formulate any evolutionary hypothesis about the lability of these traits. As far as we know, they cannot be considered neutral: NKJ, NO₂⁻ and NH₄⁺ are important sources of nitrogen for diatoms (Berges et al., 2002; Patrick, 1977; Schoeman, 1973), while pH and temperature are also known to be important factors in determining community composition (e.g. Coste and Ector, 2000; Renberg and Hellberg, 1982; Zampella et al., 2007). The lack of signal may be the consequence of overdispersion due to recent evolutionary events such as convergent evolution or character displacement. For NKJ, the signal is probably blurred by the fact that the nitrogen-heterotrophic species listed in Patrick (1977) are rare and dispersed in the phylogeny (e.g. Gomphonema parvulum in cluster 1, Nitzschia frustulum, N. fonticola and N. palea in cluster 4 and Cyclotella meneghiniana in cluster 6). It has also been shown that very closely related taxa (cryptic and pseudo-cryptic species) can exhibit ecological niche differentiation through exclusive competition when co-occurring in sympatry (Vanelslander et al., 2009).

Methodological and data artifacts may also, at least partially, explain the absence of signal for some ecological parameters. It is possible that the very restricted range of NO₂⁻ and NH₄⁺ values in the database (with many of them below the detection threshold) makes the detection of the signal impossible. The absence of a signal for pH is surprising but may also be due to a restricted range of variation in the database, since pH values range from 7.04 to 8.3 in the 5th–95th percentile interval. Acid rivers are not well represented in this dataset, which implies that acidophilic diatoms are under-represented in the analyses (e.g. only two Eunotia and one Frustulia species). Additional work including more acidophilic species should be done to investigate further the phylogenetic signal of pH optima. Finally, temperature measurements in the field are often not representative of the real river temperature, but depend mostly on the season and the time of day when the sampling was done. Nonetheless, it is always difficult to interpret the phylogenetic signal from an evolutionary perspective (Revell et al., 2008) and, as stressed by Losos (2008), phylogenetic signal is a necessary condition, but it does not systematically imply phylogenetic niche conservatism with active constraints.

3.4.2 Ecological optima patterns in the phylogeny

Traits of interest may be more conserved in some clades than in others. Despite the large diversity (more than 6800 taxa) of the epipelic motile Naviculaceae (Round et al., 1990), for instance, this family is considered a robust indicator of slightly eutrophic conditions in generic diatom indices (e.g. Rumeau and Coste, 1988). Accordingly, our analyses showed that species of cluster 3, which corresponded to the Naviculaceae, displayed exactly this kind of ecology. Other clusters, such as clusters 1 and 5, also showed a homogeneous ecology, but on the opposite side of the gradient (rivers with low nutrient and organic matter concentrations). Cluster 5 encompasses araphid diatoms, which are non-motile, most of them being attached to hard substrates by a short stalk (Ulnaria, Fragilaria, Diatoma, Staurosira) (Rimet and Bouchez, 2012a; Round et al., 1990). Similarly, in cluster 1, many genera can also be attached to substrata by stalks (Gomphonema, Cymbella) or live in mucous tubules attached to substrates (Encyonema). These different life-forms confer very different advantages, especially a better competitive efficiency for nutrient uptake (Berthon et al., 2011; Passy, 2007), and generic diatom indices consider them indicators of low pollution levels (Rumeau and Coste, 1988; Wu, 1999). Here again, related taxa in the tree displayed a homogeneous ecology.

In contrast, other clades exhibited phylogenetic and ecological signals which are much fuzzier, especially along the 2nd axis of the pPCA. Cluster 4, for example, is composed of taxa of the Bacillariaceae. This family is also very diverse taxonomically and only a few of the freshwater taxa are represented in our study. The genus Nitzschia, which is well represented within clade 4, has recently been reported to be polyphyletic (Rimet et al., 2011). While generic diatom indices consider this epipelic motile genus (Round et al., 1990) an indicator of polluted rivers (Rumeau and Coste, 1988; Wu, 1999), the clade including Nitzschia also encompasses several marine planktonic genera (e.g. Pseudo-nitzschia, Cylindrotheca, Fragilariopsis, which were pruned before the statistical analyses) that are very distinct from Nitzschia in ecology and morphology. Moreover, cluster 4 encompasses a paraphyletic species (Nitzschia inconspicua), which showed different salinity responses in culture depending on the genotype (Rovira, 2013). Such studies are unfortunately too rare, but they explain why ecological variability can be observed in the phylogeny.

Both Pagel’s λ and Abouheif’s Cmean led to very similar conclusions about the phylogenetic signal. Despite low support for deep branches, both methods appeared to be robust in bootstrap analyses. The more accurate the phylogeny, the better the results are likely to be. However, in the present case, it seems that the phylogenetic signal can be detected even if deep nodes are hard to resolve. Thus, we argue that emphasis should be placed on maximizing the number of species in the phylogeny. It is important to have as much diversity as possible, with a good representation of every clade, to correctly estimate phylogenetic signal and to extract interesting subgroups (Losos, 2008).

3.4.3 Implications for biomonitoring

The existence of a phylogenetic signal for ecological traits is promising for the development of simpler bioassessment protocols. Using species-level analyses in environmental monitoring when the phylogenetic signal is strong at a higher level means that part of the extracted information is redundant (Carew et al., 2011). Avoiding such redundancy can probably save time and effort. Here, the environmental parameters which exhibit the strongest signal are of particular relevance in the context of ecological monitoring of freshwaters. Diatom indices have historically been developed to monitor organic pollution (Kobayasi and Mayama, 1989; Sládeček, 1986) and eutrophication (Kelly and Whitton, 1995; Torrisi and Dell’Uomo, 2006). Interestingly, the positions of diatom species on gradients of organic matter and nutrients seem to be highly dependent on the phylogeny. Moreover, the loadings of the first component of the pPCA clearly describe a gradient of general water pollution, whereas the second component discriminates factors associated with eutrophication. The fact that phylogeny is linked, at least partially, to the distribution of ecological traits between species should enable the development of simplified biomonitoring methods by merging species into categories of known ecological values. A simple way to achieve this would be to use taxonomic groups, assuming that phylogeny and taxonomy are strongly related (even if we showed that some taxa were polyphyletic). An obvious advantage of taxonomy is that there is no need to reconstruct a phylogeny to test the depth of the signal; standard nested ANOVA can be used directly with taxonomic categories (Gittleman and Luh, 1992).

Furthermore, a tool based on the currently recognized taxonomy should be more readily and widely adopted by environmental managers and field operators. On the other hand, phylogeny can be more flexible and in some cases can fit ecological traits better than taxonomy. We showed this for the genus Nitzschia, which is probably polyphyletic. Another typical case here is the genus Cyclotella, which we found divided into two clades with different optima. The ecological heterogeneity of the Cyclotella species included in the analyses is recognized: Cyclotella costei and C. ocellata are commonly found in oligotrophic waters (Rimet et al., 2009; Wunsam et al., 1995), whereas C. meneghiniana and C. atomus are associated with high nutrient concentrations and eutrophic waters (Duong et al., 2007; Weckström and Juggins, 2006).

The other interesting prospect that arises when the phylogenetic signal is strong enough is the possibility of using phylogeny to make predictions for new species. Nowadays it can be simpler to sequence a species for an individual marker than to perform an experimental estimation of its ecological optima (Shokralla et al., 2012). Knowing the DNA sequence of a given species, it is feasible to estimate its position in a phylogenetic tree and to estimate any kind of trait value from the knowledge we have of other related species (Guénard et al., 2013, 2011). Such approaches have only been tested on macroinvertebrates and under laboratory conditions. Thus, it is not yet apparent whether results can be extended to the natural habitat, but since diatoms are extremely diverse, it is impossible to measure traits for every species, and models could be of great help. Moreover, many species are rare, and the estimation of their ecological profile can be underpowered and strongly biased by very low relative abundances in sampling data. With the democratization of molecular biology, such modeling approaches appear promising in ecotoxicology (Guénard et al., 2011; Larras et al., 2014), and could probably be relevant in many other ecological applications.

Both ecological prediction and biomonitoring simplification require a strong phylogenetic signal. We demonstrated here that such a signal exists in diatoms. However, statistical patterns in the phylogeny are not sufficient to guarantee the efficiency of these methods in routine analysis. Moreover, intraspecific variability and phenotypic plasticity may degrade the signal. Further investigations will be necessary with additional species and data for validation. There is no doubt that the development of large ecological trait databases, large genetic databases and phylogenetic trees for diatoms will permit such exciting applications in the foreseeable future.

3.5 Acknowledgments

We thank Sylvain Guyot for his scientific assistance and Elliot Shubert for language editing. The authors would like to thank three anonymous reviewers for their very helpful comments. This program was funded by ONEMA (French National Office for Water and Aquatic Ecosystems) in the context of the 2013-2015 “Phylogeny and Bioassessment” program. UMR CARRTEL is a member of the ECOTOX INRA network. UMR CARRTEL and UMR BIOGECO are members of the R-Syst INRA network.

CHAPTER 4

Linking phylogenetic similarity and pollution sensitivity to develop ecological assessment methods: a test with river diatoms

This is a self-edited version of an article published in Journal of Applied Ecology. Please cite:

Keck F., Bouchez A., Franc A. & Rimet F. (2016) Linking phylogenetic similarity and pollution sensitivity to develop ecological assessment methods: a test with river diatoms. Journal of Applied Ecology 53(3), 856–864.

Abstract

1. Diatoms include a great diversity of taxa and are recognized as powerful bioindicators of freshwater quality. However, using diatoms for bioassessment is costly and time-consuming, because most of the indices require species-level identification. Simplifying diatom-based assessment protocols has attracted the attention of water managers and researchers in recent years.

2. The increasing availability of genomic data and phylogenies can benefit the development of bioassessment methods that make use of these tools, where a clade plays the role of a species if relevant. Indeed, the working hypothesis is that closely related species are more likely to exhibit similar environmental sensitivity because of phylogenetic constraints and inheritance. Such patterns have been reported recently for sensitivity to a variety of pollutants for two important groups of bioindicators used for freshwater monitoring: benthic macroinvertebrates and diatoms.

3. We introduce a method to extract clusters of species that share similar traits and are phylogenetically related. We apply this method to the general pollution sensitivity (IPS specific sensitivity value; Coste, 1982) of 262 species of diatoms and, by tuning the method settings, we generate different clade-based derivatives of the traditional IPS index.

4. Finally, we estimate traditional and derived IPS scores for 2119 natural communities of diatoms in eastern France to compare and assess the performance of these new indices.

5. Synthesis and applications. We show that phylogenetic approaches offer scope for simplification without an important loss of information, and we discuss the potential of their use in biomonitoring.

4.1 Introduction

Diatoms have traditionally been recognized as a good candidate group for monitoring freshwater ecosystems, because they exhibit a high diversity and their community composition is strongly structured by numerous environmental factors, including growth-stimulating nutrients (Lange-Bertalot, 1979; Patrick, 1961). From this premise, the first environmental quality indices based on diatom assemblages were developed about 50 years ago (e.g. Zelinka and Marvan, 1961), and nowadays diatoms are part of routine standard bioassessment methods in freshwater monitoring (Stevenson et al., 2010).

With an estimated 100 000 extant species, diatoms constitute one of the most diverse algal classes (Mann and Vanormelingen, 2013). Taxonomic diversity is important for biomonitoring, because it promotes assemblage diversity and allows ecological assessment at a fine level (Birks, 2010). However, the extreme diversity of diatoms also constitutes a challenge for applied biomonitoring. Indices are traditionally developed by skilled diatomists and are usually derived at species level to maximize their performance. Moreover, they may include several hundred species. Extending the use of such complex protocols, for example at the scale of a national network, is costly and requires training many operators with continuous intercalibration (Kahlert et al., 2009; Prygiel et al., 2002). In addition, there is still the risk of imprecise or wrong identifications, which can lead to a biased estimation of environmental quality and ultimately lead managers to take unsatisfactory decisions (Besse-Lototskaya et al., 2006).

Simplifying and standardizing diatom-based assessment protocols has attracted the attention of many researchers in recent years. Two main pathways have been explored: (i) reducing the number of species included in the indices by focusing on the most abundant and key species (Lavoie et al., 2009; Lenoir and Coste, 1996) and (ii) reducing the taxonomic resolution (Chessman et al., 1999; Growns, 1999; Hill et al., 2001; Kelly et al., 1995; Raunio and Soininen, 2007; Rimet and Bouchez, 2012b; Wunsam et al., 2002).

With the increasing availability of genetic data and phylogenies (Benson et al., 2008; Wheeler et al., 2008), the idea arose that the development of bioassessment methods could also benefit from phylogenetic statistical approaches. Carew et al. (2011) first formulated the concept of phylogenetic redundancy in freshwater monitoring by analyzing links between the pollution sensitivity of mayflies and chironomids and their phylogeny. The central idea is that closely related species are more likely to exhibit similar sensitivity because of phylogenetic constraints and inheritance. This hypothesis is commonly tested in the literature by measuring and testing the presence of phylogenetic signal (“the tendency for related species to resemble each other more than they resemble species drawn at random from the tree”; Blomberg and Garland, 2002). The presence of such a signal may have direct consequences for biomonitoring, because it opens up interesting possibilities of simplification by using larger clades instead of species. Interestingly, the phylogenetic signal has been assessed for sensitivity to pollution in two important groups of bioindicators used for freshwater monitoring: benthic macroinvertebrates and diatoms (Ibáñez et al., 2010). Phylogenetic signal has been found to be significant for macroinvertebrate sensitivity to various metals (Buchwalter et al., 2008; Poteat and Buchwalter, 2014; Poteat et al., 2013) and to general pollutants (Carew et al., 2011). For diatoms, significant phylogenetic signal was found for sensitivities to different herbicides (Larras et al., 2014) and for general ecological preferences (Keck et al., 2016b).

Demonstrating the presence of phylogenetic signal is essential, but it is only the first step in making proposals for biomonitoring tools based on phylogenetic knowledge. A second step is to develop methods to extract informative groups of species based on phylogenetic signal, in order to derive simpler indices and test their ability to predict environmental quality. Thus, we introduce a simple distance-based method to extract clusters of species that share similar traits and are also phylogenetically related. It is classical in ecology to compare two distances within a set of individuals, typically through a Mantel test comparing phenetic or genetic distance with geographic distance (Fortin and Gurevitch, 2001; Sokal, 1979; Vignieri, 2005). Here, we go one step further by building clusters of species based on both trait values and phylogenetic proximity, meaning that two distantly related species cannot be included in the same cluster even if they exhibit similar trait values. In this paper, we apply this method to obtain different sets of clusters from 262 diatom species. Clustering is based on the phylogeny and on the general pollution sensitivity of the species (IPS specific sensitivity value; Coste, 1982). We use these sets of clusters to develop derivatives of the traditional IPS index. Finally, we estimate traditional and derived IPS scores of 2119 samples to compare and assess the performance of these new indices.

4.2 Material and Methods

4.2.1 Phylogenetic tree reconstruction

We used the phylogenetic tree reconstructed in Keck et al. (2016b). This phylogeny is based on both the nuclear gene coding for the small subunit 18S rRNA and the chloroplast rbcL gene coding for the RuBisCO enzyme. The tree includes 549 diatom species for which genetic information is available for at least one of these markers. Reconstruction was done with RAxML 7.2.8 (Stamatakis, 2006) using a partitioned Maximum Likelihood analysis with a GTR+I+G evolutionary model (see Keck et al., 2016b, for details). The tree was dated in relative time using a semi-parametric method based on penalized likelihood (Sanderson, 2002).

4.2.2 Phylogenetically constrained clustering

We present here a simple co-clustering method for a set of n species in a phylogeny with one or more associated trait values (as illustrated in Figure 4.1A). The method is based both on the pairwise trait distance matrix T and the pairwise phylogenetic distance matrix P. We consider the graph G = (V,E) where V denotes the vertices (the species) and E the set of edges connecting the vertices. Here, the graph G is defined by its adjacency binary matrix A, an n × n matrix where Aij = 1 if there is an edge joining species i with species j and Aij = 0 otherwise. A variety of rules can be used to decide whether there is an edge or not between two vertices. Here, we propose a linear rule given in Equation 4.1, for which a graphical illustration is given in Figure 4.1B.

$$
A_{ij} =
\begin{cases}
1 & \text{if } t - \dfrac{t}{p} P_{ij} \geq T_{ij} \text{ and } i \neq j \\
0 & \text{otherwise}
\end{cases}
\tag{4.1}
$$

where t and p are the upper bounds to be considered for respectively trait and phylogenetic distances (see Figure 4.1B) and must be manually set. Thus, the higher the values of t and p, the lower the trait and phylogenetic constraints are, respectively.

Once the adjacency matrix is given, we compute the connected components of the associated graph, which define the clusters (Figure 4.1C). Note, however, that different strategies are possible (e.g. selection of cliques, use of community detection algorithms) which will be discussed later.
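The clustering step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation (the analyses in this chapter rely on R packages). It assumes that the linear rule selects a pair of species when its point lies on or below the straight line joining (0, t) on the trait-distance axis to (p, 0) on the phylogenetic-distance axis, that is, when t − (t/p)·P[i][j] ≥ T[i][j]. The function name `constrained_clusters` and the toy matrices are hypothetical.

```python
from itertools import combinations


def constrained_clusters(P, T, p, t):
    """Return clusters (sorted lists of species indices) as connected components.

    P and T are square, symmetric lists of lists holding pairwise phylogenetic
    and trait distances; p and t are the upper bounds (assumed linear rule).
    """
    n = len(P)
    neighbours = {i: set() for i in range(n)}
    # Build the adjacency structure: an edge joins i and j when the pair
    # lies on or below the line through (0, t) and (p, 0).
    for i, j in combinations(range(n), 2):
        if t - (t / p) * P[i][j] >= T[i][j]:
            neighbours[i].add(j)
            neighbours[j].add(i)
    # Extract connected components with an iterative depth-first search.
    seen, clusters = set(), []
    for start in range(n):
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], []
        while stack:
            node = stack.pop()
            component.append(node)
            for nb in neighbours[node] - seen:
                seen.add(nb)
                stack.append(nb)
        clusters.append(sorted(component))
    return clusters


# Toy example with four hypothetical species (distances already scaled to [0, 1]).
P = [[0.0, 0.1, 0.8, 0.9],
     [0.1, 0.0, 0.8, 0.9],
     [0.8, 0.8, 0.0, 0.05],
     [0.9, 0.9, 0.05, 0.0]]
T = [[0.0, 0.1, 0.1, 0.9],
     [0.1, 0.0, 0.2, 0.8],
     [0.1, 0.2, 0.0, 0.1],
     [0.9, 0.8, 0.1, 0.0]]
print(constrained_clusters(P, T, p=0.5, t=0.4))
```

In this toy example, species 0 and 2 have similar trait values but are distantly related, so they end up in different clusters, which is the behaviour the phylogenetic constraint is meant to enforce.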

[Figure 4.1 image: panel A, phylogenetic tree of species A to S with their trait values (axis −2 to 2); panel B, scatter plot of trait distance against phylogenetic distance (axis 0 to 100) with the bounds t and p marked; panel C, graph of the connected species.]

Figure 4.1: Phylogenetically constrained clustering process. A. The process is illustrated with a dataset of 19 species (identified by letters A–S). The trait data are simulated under a Brownian Motion model of trait evolution and are centred. B. Pairs of species as a function of their phylogenetic (patristic) distance and trait (Euclidean) distance. The selected pairs (following Equation 4.1, p and t values) are represented with crosses while non-selected pairs are represented with circles. The dashed line illustrates the selection limit. C. A graph where species are connected according to previously selected pairs, unveiling 8 disconnected components (clusters).

4.2.3 Defining new indices based on phylogenetic clusters

We chose to work with the IPS index (indice de polluo-sensibilité; Coste, 1982).

The IPS index is a weighted average autecological index based on a modified version of the Zelinka and Marvan (1961) equation (Equation 4.2) where ai is the proportional abundance of the taxon i, vi is its indicator value and si its pollution sensitivity.

$$
\mathrm{IPS} = \frac{\sum_{i=1}^{n} a_i \times v_i \times s_i}{\sum_{i=1}^{n} a_i \times v_i} \tag{4.2}
$$
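As an illustration, the weighted average of Equation 4.2 can be written in a few lines of Python. This is only a sketch: the function name `ips` and the three-taxon sample are hypothetical, and the values are not taken from OMNIDIA.

```python
def ips(abundances, indicator_values, sensitivities):
    """Weighted average of sensitivities s_i, with weights a_i * v_i."""
    numerator = sum(a * v * s for a, v, s in
                    zip(abundances, indicator_values, sensitivities))
    denominator = sum(a * v for a, v in zip(abundances, indicator_values))
    return numerator / denominator


# Hypothetical sample with three taxa.
a = [0.5, 0.3, 0.2]   # proportional abundances a_i
v = [1, 2, 3]         # indicator values v_i
s = [5, 3, 1]         # pollution sensitivities s_i
print(ips(a, v, s))
```

Taxa with a high indicator value weigh more in the score, so a rare taxon with a narrow ecological amplitude can influence the index as much as a more abundant generalist.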

We then simplified the index by defining clusters as explained above, averaging the values of vi and si and summing the values of ai over the species of each cluster. Let us denote a cluster by γ. The aggregated IPS index is then defined as in Equation 4.3.

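$$
\mathrm{IPS}_P = \frac{\sum_{\gamma} a_\gamma \times v_\gamma \times s_\gamma}{\sum_{\gamma} a_\gamma \times v_\gamma} \tag{4.3}
$$

where

$$
a_\gamma = \sum_{i \in \gamma} a_i, \qquad
v_\gamma = \frac{1}{n_\gamma} \sum_{i \in \gamma} v_i, \qquad
s_\gamma = \frac{1}{n_\gamma} \sum_{i \in \gamma} s_i, \qquad
n_\gamma = \#\{i : i \in \gamma\}
$$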
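The aggregation can be sketched in Python as follows: within each cluster, abundances are summed while indicator values and sensitivities are averaged, and the weighted average is then taken over clusters. The function name `ips_p`, the sample values and the cluster memberships are hypothetical.

```python
def ips_p(abundances, indicator_values, sensitivities, clusters):
    """Aggregated index: clusters is a list of lists of species indices."""
    numerator = 0.0
    denominator = 0.0
    for members in clusters:
        n_gamma = len(members)
        a_gamma = sum(abundances[i] for i in members)                  # summed
        v_gamma = sum(indicator_values[i] for i in members) / n_gamma  # averaged
        s_gamma = sum(sensitivities[i] for i in members) / n_gamma     # averaged
        numerator += a_gamma * v_gamma * s_gamma
        denominator += a_gamma * v_gamma
    return numerator / denominator


a = [0.5, 0.3, 0.2]
v = [1, 2, 3]
s = [5, 3, 1]
# Species 0 and 1 merged into one cluster, species 2 alone.
print(ips_p(a, v, s, clusters=[[0, 1], [2]]))
# With one species per cluster, the aggregated index reduces to the species-level index.
print(ips_p(a, v, s, clusters=[[0], [1], [2]]))
```

Comparing the two printed values gives a direct feel for the information lost when species are merged.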

Because the grain of species sensitivity variation is coarser (in IPSP, all species belonging to the same cluster are assumed to share the same values of s and v), the estimate of IPS will be coarser. To evaluate the discrepancy between IPS and IPSP, we calculated the error made by using IPSP instead of IPS through a first-order expansion of IPSP. We show that the discrepancy induced by using the coarse index is minimized when the clusters are such that the discrepancies between the average value and the individual values within each cluster are bounded from above. This is precisely the role of t in the above clustering. Indeed, the error made by using IPSP instead of IPS depends on two types of terms given in Equation 4.4 (see Appendix B, Section B.2.2 for details).

$$
\Delta_{v,i} = \sum_{i \in \gamma} a_i (v_i - v_\gamma), \qquad
\Delta_{s,i} = \sum_{i \in \gamma} a_i (s_i - s_\gamma) \tag{4.4}
$$

Each of these terms is a combination of two terms: the abundances in the environmental sample ai, and the discrepancy between the species values si and vi and the values sγ and vγ of the cluster it belongs to. Each term remains small, i.e. the approximation is acceptable, if (i) the species that are ill positioned in a group, i.e. those for which the term $|x_i - x_\gamma|$ is high, with x = s or x = v, have a low abundance, or (ii) the discrepancy is acceptable. Let us note that for each cluster γ, we have $\sum_{i \in \gamma} (s_i - s_\gamma) = \sum_{i \in \gamma} (v_i - v_\gamma) = 0$, which means that the error terms are expected to be small. The detailed calculations are available in Appendix B, Section B.2.2.
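A small numerical sketch may help to see how these error terms behave. For one cluster, the code below computes the abundance-weighted sums of deviations of v and s from the cluster means; the function name `cluster_error_terms` and the values are hypothetical.

```python
def cluster_error_terms(abundances, indicator_values, sensitivities, members):
    """Return (delta_v, delta_s) for the cluster made of the indices in members."""
    n_gamma = len(members)
    v_gamma = sum(indicator_values[i] for i in members) / n_gamma
    s_gamma = sum(sensitivities[i] for i in members) / n_gamma
    # Abundance-weighted sums of deviations from the cluster means.
    delta_v = sum(abundances[i] * (indicator_values[i] - v_gamma) for i in members)
    delta_s = sum(abundances[i] * (sensitivities[i] - s_gamma) for i in members)
    return delta_v, delta_s


v = [1, 2, 3]
s = [5, 3, 1]
# Unequal abundances within the cluster {0, 1}: the terms do not vanish.
print(cluster_error_terms([0.5, 0.3, 0.2], v, s, members=[0, 1]))
# Equal abundances within the cluster: deviations cancel out exactly.
print(cluster_error_terms([0.4, 0.4, 0.2], v, s, members=[0, 1]))
```

Because the unweighted deviations sum to zero within a cluster, the terms only grow when abundances are uneven and the deviating species are the abundant ones, which matches conditions (i) and (ii) above.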

4.2.4 Developing IPSP indices

We carried out clustering analyses using the method described above for species for which both phylogenetic and IPS data (s and v values) were available. IPS data were retrieved from OMNIDIA (Lecointe et al., 1993). We used a phylogenetic distance matrix P, based on the number of nodes separating two species i and j, and a trait distance matrix T, based on the pairwise Euclidean distance of IPS pollution sensitivity ($T_{ij} = \sqrt{(s_i - s_j)^2}$, i.e. the absolute difference between the two sensitivity values). To make things more interpretable, both P and T were divided by their respective maximum values so that all distances range between 0 and 1.
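The construction and scaling of the trait distance matrix can be sketched as follows. For a single trait, the Euclidean distance reduces to an absolute difference. The function names and the sensitivity values are hypothetical, and the same scaling helper could be applied to a node-count phylogenetic distance matrix obtained elsewhere.

```python
def scale_to_unit(matrix):
    """Divide every entry by the matrix maximum so that distances lie in [0, 1]."""
    maximum = max(max(row) for row in matrix)
    if maximum == 0:
        # All species identical: nothing to scale.
        return [list(row) for row in matrix]
    return [[d / maximum for d in row] for row in matrix]


def trait_distance_matrix(sensitivities):
    """Pairwise absolute differences of a single trait, scaled to [0, 1]."""
    raw = [[abs(si - sj) for sj in sensitivities] for si in sensitivities]
    return scale_to_unit(raw)


print(trait_distance_matrix([1, 3, 5]))
```

After scaling, the bounds t and p used for clustering can be read as fractions of the largest observed trait and phylogenetic distances.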

Since there is no rule to set t and p values, we tested different settings. A full grid of

104 combinations of t and p = {0.01, 0.02, 0.03, ..., 0.99, 1} was processed. However, for clarity, we report results for a set of representative combinations of t = {0.2, 0.4, 0.6, 0.8} and p = {0.05, 0.1, 0.15} giving 12 different graphs and as many different sets of clusters.

Then, we developed a series of phylogenetically based indices derived from the IPS using Equation 4.3 (referred to as IPSP[t,p], with t and p indicating the trait and phylogenetic constraints applied for the clustering).
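The principle of a cluster-based index can be sketched in Python as follows. Equation 4.3 is not reproduced in this section, so the sketch assumes the usual Zelinka-Marvan weighted average of the IPS with a linear rescaling to a 1 to 20 scale (4.75x − 3.75); both are assumptions, as are the function names and the toy sample. Species values are replaced by the unweighted mean of their cluster, consistent with the zero-sum property stated after Equation 4.4.

```python
def ips_score(abundances, s, v):
    """Abundance-weighted index (assumed Zelinka-Marvan form), rescaled to 1-20."""
    num = sum(a * si * vi for a, si, vi in zip(abundances, s, v))
    den = sum(a * vi for a, vi in zip(abundances, v))
    return 4.75 * (num / den) - 3.75


def cluster_means(values, clusters):
    """Replace each species value by the unweighted mean of its cluster."""
    sums, counts = {}, {}
    for x, c in zip(values, clusters):
        sums[c] = sums.get(c, 0.0) + x
        counts[c] = counts.get(c, 0) + 1
    return [sums[c] / counts[c] for c in clusters]


# Hypothetical sample: three species, the first two in the same cluster.
abundances = [100, 200, 100]
s = [5.0, 4.0, 1.0]   # sensitivity values
v = [1.0, 2.0, 3.0]   # indicator values
clusters = [0, 0, 1]

ips_standard = ips_score(abundances, s, v)
ips_p = ips_score(
    abundances,
    cluster_means(s, clusters),
    cluster_means(v, clusters),
)
```

In this toy sample the two scores differ by about half a point, because the two clustered species have different s and v values and unequal abundances.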

4.2.5 Comparing IPSP indices performances

To assess the performance of the IPSP indices, we used a database of 2119 diatom community samples collected in rivers and streams in eastern France between 2001 and 2008. For each of them, 400 diatom frustules were counted and identified at species level. Details about this database are given in Rimet and Bouchez (2012b). These count data were used to compute the IPSstandard value of each sample, which constitutes the reference index value. We then computed several statistics to compare the ability of the different IPSP indices to recover the information contained in IPSstandard. First, the Pearson correlation coefficient was computed as a measure of the dependence between the sample scores estimated by IPSP and by IPSstandard. Second, the residual sum of squares (RSS) was used as a measure of the discrepancy between the scores of IPSP and the scores of IPSstandard. Finally, it is common to use IPS scores to classify samples into 5 levels of water quality: [0; 7[ = Very Poor; [7; 11[ = Poor; [11; 13.5[ = Fair; [13.5; 16[ = Good; [16; 20] = Very Good (Prygiel et al., 1996). In particular, these thresholds are currently used by managers to make decisions on environmental restoration. We reported the percentage of correct classification and the percentages of misclassification (over- and underestimates) of samples by IPSP compared to IPSstandard.
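The three comparison statistics can be sketched in Python as below. This is an illustration under stated assumptions, not the analysis code of this study: the function names and the five sample scores are hypothetical, and only the class thresholds (7, 11, 13.5, 16) come from the text.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two score vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def rss(x, y):
    """Residual sum of squares between two score vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def quality_class(score):
    """Map a score to a class: 0 = Very Poor, ..., 4 = Very Good."""
    thresholds = [7, 11, 13.5, 16]
    return sum(score >= th for th in thresholds)


def shift_percentages(ipsp, ips_std):
    """Percentage of samples over-, exactly and under-classified by IPSP."""
    shifts = [quality_class(a) - quality_class(b) for a, b in zip(ipsp, ips_std)]
    n = len(shifts)
    return {
        "overestimate": 100 * sum(d > 0 for d in shifts) / n,
        "exact": 100 * sum(d == 0 for d in shifts) / n,
        "underestimate": 100 * sum(d < 0 for d in shifts) / n,
    }


# Hypothetical scores for five samples.
ips_std = [6.0, 10.0, 12.0, 15.0, 18.0]
ipsp = [6.5, 11.5, 12.0, 13.0, 17.0]

r = pearson(ipsp, ips_std)
residual = rss(ipsp, ips_std)
summary = shift_percentages(ipsp, ips_std)
```

A positive class shift means that the coarse index assigns a better quality class than the reference, i.e. an overestimate of water quality.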

4.2.6 Statistical Packages

We performed all the statistical analyses with R 3.0.2 (R Development Core Team, 2013). Phylogenies were handled with the ape package (Paradis et al., 2004) and the phylobase package (Hackathon et al., 2013). Phylogenetic distances were computed with the adephylo package (Jombart et al., 2010a). Phylogenetic clustering was performed with the phylosignal package (https://cran.r-project.org/web/packages/phylosignal/).

4.3 Results

The dataset includes 262 taxa which were found both in the phylogenetic tree and in the IPS database (Figure 4.2). The full grid approach generated $10^4$ sets of clusters. We investigated the effects of phylogenetic and trait constraints on the number of clusters produced (Figure 4.3A) and the relationship between the number of clusters of an IPSP index and its ability to classify samples (Figure 4.3B). The subset of 12 combinations of t and p values tested produced contrasting sets of clusters. The most restrictive graph (t = 0.2, p = 0.05; i.e. low trait and phylogenetic distances) is composed of 196 connected components (i.e. clusters), while the most relaxed graph (t = 0.8 and p = 0.15; i.e. high trait and phylogenetic distances) is composed of 9 components. Other graphs have a number of connected components ranging between these two extremes (Table 4.1).

Figure 4.2: Phylogenetic tree of 262 diatom species and their respective IPSstandard sensitivity value (s). The colors delineate 68 clusters based on t = 0.6 and p = 0.1. Diatom names are reported using 4-letter codes (Lecointe et al., 1993, see Appendix B, Section B.2.1 for corresponding Linnaean names).

| t | p | Number of clusters | Correlation with IPSstandard | RSS vs. IPSstandard | Overestimate by 2 classes (%) | Overestimate by 1 class (%) | Exact (%) | Underestimate by 1 class (%) | Underestimate by 2 classes (%) | Underestimate by 3 classes (%) |
|---|---|---|---|---|---|---|---|---|---|---|
| 0.2 | 0.05 | 196 | 0.935 | 3099 | 0.1 | 9.7 | 79.5 | 10 | 0.5 | 0.1 |
| 0.4 | 0.05 | 187 | 0.936 | 3081 | 0.1 | 8.2 | 79.8 | 11.2 | 0.5 | 0.1 |
| 0.6 | 0.05 | 157 | 0.929 | 3335 | 0.2 | 9.5 | 77.5 | 12.3 | 0.4 | 0.1 |
| 0.8 | 0.05 | 153 | 0.925 | 3989 | 0.2 | 6.5 | 73.9 | 18.1 | 1.2 | 0.1 |
| 0.2 | 0.1 | 126 | 0.938 | 2990 | 0.1 | 8 | 80.6 | 10.7 | 0.5 | 0.1 |
| 0.4 | 0.1 | 89 | 0.928 | 3594 | 0.2 | 7.4 | 76.1 | 15.4 | 0.8 | 0 |
| 0.2 | 0.15 | 86 | 0.938 | 3179 | 0.1 | 7.5 | 77.3 | 14.2 | 0.7 | 0.1 |
| 0.6 | 0.1 | 68 | 0.907 | 4357 | 0.4 | 9.9 | 73.4 | 15 | 1.2 | 0 |
| 0.8 | 0.1 | 51 | 0.863 | 7922 | 0.7 | 9 | 63.2 | 25 | 2 | 0 |
| 0.4 | 0.15 | 32 | 0.904 | 8684 | 0.1 | 2.2 | 55.3 | 39 | 3.1 | 0.2 |
| 0.6 | 0.15 | 16 | 0.774 | 17317 | 0.2 | 5.3 | 37.5 | 48.1 | 8.3 | 0.6 |
| 0.8 | 0.15 | 9 | 0.591 | 31120 | 0.9 | 5.9 | 23.5 | 37.4 | 30.2 | 2.1 |

Table 4.1: Comparison of the 12 IPSP indices. Each index is based on a set of clusters generated by a pair of t and p values. The performance of the indices is assessed by comparison with the results of IPSstandard for 2119 diatom samples. The quality classification columns give the percentage of samples whose quality class is over- or underestimated relative to IPSstandard.

Since the number of clusters on which they are based varies greatly, the capacity of the different IPSP indices to reflect the information of IPSstandard also varies markedly. The correlation between IPSP indices and IPSstandard is high (> 0.9) as long as the number of clusters remains high (≥ 68; Table 4.1 and Figure 4.4). The highest correlation (0.938) is achieved with IPSP[0.2,0.1] and IPSP[0.2,0.15] (i.e. low trait and moderate to high phylogenetic distances).

The residual sum of squares (RSS) ranges between 2990 and 4357 as long as the number of clusters remains high (≥ 68). Below this threshold, the error increases drastically (Table 4.1 and Figure 4.4). The lowest RSS is achieved with IPSP[0.2,0.1] (2990). More than 73% of the samples are correctly classified as long as the number of clusters remains at or above 68. For indices based on very few clusters (16 and 9), the number of correctly classified samples falls below the number of misclassified samples. The best percentage of classification is achieved with IPSP[0.2,0.1] (80.6% of correct classification). Strong overestimations of water quality (≥ 2 classes) are rare overall, while strong underestimations (≥ 2 classes) become more frequent when the number of clusters is very low. Overall, in case of misclassification, IPSP indices are more likely to underestimate water quality than to overestimate it.

4.4 Discussion

4.4.1 Phylogenetic clustering – methodological discussion

The idea of simplifying bioassessment methods using phylogenetics has been raised in the last few years (Carew et al., 2011; Larras et al., 2014), but no study has yet proposed a phylogenetically based biomonitoring tool. We introduced a simple and general approach to develop such tools and tested it by simplifying a popular diatom biomonitoring index: the IPS (Coste, 1982).

We proposed a simple method that clusters species by taking into account both their phylogenetic proximity and their trait similarity. Clusters generated by this method are not necessarily monophyletic clades. The method can be varied in many ways, since each step is independent and adaptable. First, a different phylogenetic distance matrix (P) can be used. Here we used the number of internal nodes separating two species, but patristic distance (length of the branches separating two species) or more complex distances (Pavoine and Ricotta, 2013; Pavoine et al., 2008) can be considered, as well as transformations of these distances (e.g. square root of patristic distance; Hardy and Pavoine, 2012). Second, we applied the method to a single trait (species sensitivity), but since clustering is based on the Euclidean distance between trait values (T), it can easily be extended to a multivariate framework. Third, different rules can be used to select which pairs of species are connected by an edge in the graph. We used a simple rule based on a linear equation (Equation 4.1, Figure 4.1B), but other options can be developed (e.g. rectangular and elliptical selections are included in the R package phylosignal). Finally, different cluster extraction approaches can be tested. In particular, for complex data, clusters can be detected using community detection algorithms (Newman and Girvan, 2004), and cluster validity can be assessed with statistics derived from graph theory, such as measures of density and connectivity (Van Steen, 2010).

Figure 4.3: A. Relation between the pair of t and p values and the number of clusters produced by the method. The color gradient indicates the percentage of samples correctly classified by the IPSP developed from the corresponding set of clusters. B. Relation between the number of clusters produced by the method and the percentage of samples correctly classified by the IPSP developed from the corresponding set of clusters. Data presented only for p < 0.2.

Figure 4.4: Relation between sample scores estimated with IPSstandard (horizontal axis) and IPSP[t,p] (vertical axis) for the 12 tested pairs of t (trait constraint; rows, t = 0.2 to 0.8) and p (phylogenetic constraint; columns, p = 0.05 to 0.15) values. The solid red line represents full equivalence between IPSstandard and IPSP.

Since the method is very general and flexible, it can be fitted to a wide variety of data. Clustering can be applied to any kind of trait. For example, in freshwater biomonitoring, other indices could be clustered, such as the trophic diatom index (Kelly and Whitton, 1995), the global periphyton indices (Rott et al., 1997, 1999) or the Brettum index for lake monitoring (Brettum, 1989), but the method could also be applied directly to species preferences through a multivariate approach (see Keck et al., 2016b).

The method does not provide an optimal pair of p and t constraint values. This can limit ease of use, but it is also a source of flexibility. Since the clustering algorithm we propose is not computationally intensive, thousands of settings can easily be tested. Thus, a practitioner developing a new index can pick the pair of phylogenetic and trait constraints that best fits his/her own needs in terms of the trade-off between simplification and precision. Representing the relationship between the number of clusters and the efficiency of indices (Figure 4.3B) may be a good way to support the decision process.

Overall, the results must always be interpreted carefully, and we stress the importance of analyzing in detail how t and p influence the clustering outcomes. One identified issue is the linkage effect: if there is an edge between species A and species B and an edge between species B and C, then A, B and C will be included in the same connected component (i.e. cluster), even if A and C are not connected. A way to overcome this problem might be to use more sophisticated methods to extract clusters from the graph, as discussed above. Another point which needs attention is that if the t or p values are too high, the method will converge to phylogenetic-only clustering (i.e. clusters strictly based on phylogenetic distances) or trait-only clustering (i.e. clusters strictly based on trait distances), respectively. For example, in our dataset, the cases where the number of clusters is very low while the performance of the index is very high arise because the phylogenetic constraint is effectively nonexistent (p > 0.25; high phylogenetic distance). The results then converge to a trait-only clustering, which is definitely not the aim here.
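The linkage effect can be reproduced on a three-species toy example. The Python sketch below uses hypothetical trait values and a trait-only threshold for simplicity; the function name `connected_components` is illustrative and the code is not taken from the phylosignal package.

```python
def connected_components(n, edges):
    """Return the connected components of an undirected graph as sorted lists."""
    adjacency = {i: set() for i in range(n)}
    for i, j in edges:
        adjacency[i].add(j)
        adjacency[j].add(i)

    seen, components = set(), []
    for start in range(n):
        if start in seen:
            continue
        # Depth-first traversal from an unvisited node.
        stack, component = [start], []
        seen.add(start)
        while stack:
            node = stack.pop()
            component.append(node)
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        components.append(sorted(component))
    return components


# Hypothetical trait values for species A, B and C, and a trait threshold.
names = ["A", "B", "C"]
s = [1.0, 2.0, 3.0]
t = 1.0

edges = [
    (i, j)
    for i in range(len(s))
    for j in range(i + 1, len(s))
    if abs(s[i] - s[j]) <= t
]
clusters = connected_components(len(s), edges)
a_c_distance = abs(s[0] - s[2])
```

Here A and C end up in the same cluster through B although their own distance exceeds the threshold, which is the chaining behaviour described above.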

4.4.2 Phylogenetically based indices – potential for applications

The tests we conducted showed that the number of clusters can be reduced without an important loss of information. These results tend to confirm that biomonitoring with diatoms can be simplified using taxonomic levels higher than the species level, as previously suggested by other authors (Chessman et al., 1999; Growns, 1999; Hill et al., 2001; Kelly et al., 1995; Raunio and Soininen, 2007; Rimet and Bouchez, 2012b; Wunsam et al., 2002). This is achieved for the first time using a phylogenetic approach in order to take phylogenetic redundancy into account (Carew et al., 2011).

It is important to note that the phylogeny of diatoms is far from complete. Only 262 species were included in the clustering method, whereas IPS computation is based on more than 5000 species, 909 of which are present in the samples of our test dataset. Including more species in the phylogenetic tree could produce more clusters, but it would also probably increase the performance of IPSP indices significantly. In particular, some missing taxa are important for biomonitoring, such as Achnanthidium subatomus, A. subatomoides and A. daonense, which are indicators of pristine rivers of relatively low conductivity. These missing species can probably explain the tendency of IPSP to underestimate water quality. On the other hand, pollution-tolerant taxa are better represented in the current phylogeny. Significant progress has been made in our understanding of diatom phylogeny in recent years (Medlin, 2011; Theriot et al., 2010, 2011). Large-scale phylogenetic trees, including many more species and based on many more markers, will be made available, making phylogenetic approaches more robust and relevant. Biomonitoring methods based on phylogenies can be easily updated as new data are made available.

The use of phylogenetic approaches aims principally to simplify biomonitoring by avoiding phylogenetic redundancy (Carew et al., 2011). In this paper, we try to address this issue by extracting clusters of phylogenetically related species sharing similar pollution sensitivities. Ideally, these clusters would provide the best compromise between simplicity and efficiency. However, it seems difficult to import a tool developed with phylogeny into traditional biomonitoring workflows based on classical microscopic counts, because there are several incongruences between morphology and DNA phylogeny (Kermarrec et al., 2011; Zimmermann et al., 2014). Moreover, the taxonomic classification rarely matches the proposed IPSP clusters, so technicians counting diatoms would have to learn these new clusters, even if many of them are intuitive (e.g. Ulnaria group, Nitzschia lanceolatae group). This is probably hardly applicable for already trained diatomists. However, in some cases, clusters match the taxonomy, especially at genus level. For example, in Figure 4.2, the genera Eunotia and Stauroneis are identified as two clusters with high sensitivity, while all Entomoneis species are detected in the same non-sensitive cluster (low sensitivity values). Such results can be of interest for developing biomonitoring tools based on a mixture of taxonomic levels, as suggested by Jones (2008) for macroinvertebrates. For diatoms, tools based on both species and genus levels exist (Kelly and Whitton, 1995), but could undergo new developments with phylogenetic approaches.

Finally, these approaches seem to be much better adapted to next-generation biomonitoring, so-called biomonitoring 2.0, which is based on metabarcoding and high-throughput sequencing methods (Baird and Hajibabaei, 2012) and aims to use DNA barcodes to assess environmental quality. Since this approach is based on molecular characters, it is much more straightforward to integrate phylogenetic considerations. In metabarcoding, one of the difficulties is the taxonomic assignment of metabarcode sequences (Coissac et al., 2012). Assigning DNA sequences to clusters of species, rather than to species, would be more flexible and would probably be achieved more easily. Another common issue is the lack of data in taxon-stressor response libraries. The use of phylogenetic methods to infer the traits of taxa from their phylogenetic position could offer a solution to this problem (Keck et al., 2016b), and a complete modeling framework has been proposed by Guénard et al. (2013). However, as a first step in a biomonitoring context, it would be simple to infer the trait values of unknown sampled species if they fall within a given cluster. Thus, increasing information on traits and taxa, through metabarcoding combined with phylogenetically based methods, should significantly enhance the efficiency of environmental monitoring.

4.5 Acknowledgments

This work was funded by ONEMA (French National Office for Water and Aquatic Ecosystems) in the context of the 2013-2015 “Phylogeny and Bioassessment” program.

CHAPTER 5 Discussion and perspectives

The main objective of this work is to assess the possibilities for simplifying bioindication tools by taking advantage of the links between phylogeny and ecological traits (i.e. the phylogenetic signal). The aim is to test the feasibility and practical implementation in diatoms of an idea that recently appeared in the literature (Carew et al., 2011), which suggests that discriminating individuals at the finest taxonomic level is not always necessary for bioindication and that phylogeny can help to group taxa into “indicator clusters”. Our strategy therefore consisted in developing technical tools to assess the phylogenetic signal (see the article in Chapter 2) and to extract clusters of species while taking phylogeny into account (see the article in Chapter 4), and then in applying these tools to study the phylogenetic signal of different ecological traits (optima for different physico-chemical parameters) in freshwater diatoms (see the article in Chapter 3) and to test the efficiency of biotic indices based on phylogenetic clusters (see the article in Chapter 4). This part provides a general discussion of the results and introduces a series of questions and perspectives raised by this work.

5.1 Phylogenetic signal and ecological traits in diatoms

In Chapter 3 we tested for the presence of phylogenetic signal in 19 ecological traits (species optima reflecting a potential for adaptation to 19 environmental physico-chemical parameters). For most of them, we were able to demonstrate a significant signal, which suggests that a link, even an indirect one, can be established between the phylogeny and the ecology of freshwater diatoms. Detecting phylogenetic signal is not a surprise in itself: by virtue of the principle of heritability and descent with modification, one can legitimately expect to find the imprint of evolution in the distribution of trait values across the phylogeny. However, the systematic presence of phylogenetic signal in ecological traits has been questioned (Blomberg et al., 2003; Freckleton et al., 2002) and cannot be taken for granted a priori (Losos, 2008). Our results confirm this view, since we detected a signal of variable intensity depending on the trait considered and we were not able to detect a phylogenetic signal for several traits (optima for NKJ, NO2−, NH4+, pH and temperature).

In addition to possible statistical biases and the poor representation of certain gradients, discussed in detail in Section 3.4.1, Losos (2008) puts forward two hypotheses to explain the lability of ecological traits: (i) the spatial and temporal variability of this type of trait, which is rarely taken into account, and (ii) the fact that environmental factors do not necessarily covary. Our dataset integrates both the spatial dimension (sampling at the regional scale over the whole of eastern France) and the temporal dimension (sampling conducted over a period of 8 years). As for the traits tested, they all show some degree of correlation (with the notable exception of pH).

We propose here a third hypothesis to explain the lability of ecological traits. As recalled in the introduction (Figure 1.2), these traits, as we estimated them, are components of the realized niche of species and result from many other biological traits (physiology, morphology, life-history traits), but also from biotic interactions between species (competition, facilitation) and from the history of species (past events, colonizations, dispersals, etc.). It cannot be ruled out that these last two elements play an important or even dominant role in defining the niche of species, thereby modulating the intensity of the phylogenetic signal. For example, it is generally accepted that species preferences can differ significantly depending on whether they grow in situ or in a controlled environment. Thus, one could expect a better match between phylogeny and the fundamental niche of species (which is never observed outside laboratory conditions) than between phylogeny and their realized niche.

One aspect that we have discussed little so far is phylogenetic niche conservatism. Niche conservatism covers the ecological and evolutionary processes that lead species to retain their ecological niche over time. It is therefore a concept close to phylogenetic signal applied to ecological traits, with which it has sometimes been confused (Losos, 2008). Niche conservatism has attracted considerable attention from researchers in recent years, but its precise definition is still debated. Disagreements persist about the very nature of the concept, which is considered either as a process (Wiens and Graham, 2005) or as a pattern resulting from evolution (Losos, 2008). The ecological and evolutionary conditions that favor the emergence of niche conservatism are not clearly established but probably include a regime of stabilizing selection, a lack of appropriate genetic variation, or functional constraints caused by pleiotropic effects (Harvey and Pagel, 1991; Wiens and Graham, 2005). Our results demonstrate the presence of phylogenetic signal in the ecological niche of the species studied but do not provide formal proof that niches are actively conserved over time. More in-depth modeling work (see Münkemüller et al., 2015) would make it possible to better characterize niche conservatism in freshwater diatoms.

Figure 5.1: The four panels represent the measured values of the same trait for 16 species (labeled A to P). Each panel shows a simplification by grouping species based on their assumed relatedness. A. No assumed relatedness, independent species. B. Taxonomic relationship, grouping based on one taxonomic level (e.g. genus). C. Taxonomic relationship, grouping based on a mixture of taxonomic levels (e.g. genus and family). D. Phylogenetic relationship, grouping based on phylogenetic distance.

Our understanding of the phylogenetic signal of ecological traits in diatoms, and of the ecological and evolutionary forces that regulate it, is therefore still far from complete, and many avenues remain to be explored. Future work could focus in particular on including a greater diversity of species, since a balanced representation of the different clades is an important element to take into account in order not to bias the estimation of the phylogenetic signal. Ecological trait values were estimated here using averages weighted by species abundance. These averages are associated with error measures that could be taken into account when computing the phylogenetic signal (Ives et al., 2007). Another avenue could be to study directly the phylogenetic signal of the distribution function of each species (Goolsby, 2015). Studying ecological traits with multivariate methods appears particularly relevant to better capture the multidimensional nature of ecological niches and to overcome the problems of correlation between traits. In this respect, phylogenetic principal component analysis (pPCA; Jombart et al., 2010b) appears to be an innovative and powerful tool. Adams (2014) recently proposed a multivariate version of the K of Blomberg et al. (2003) to estimate phylogenetic signal for multidimensional characters (morphometric traits). This type of approach could be used to estimate phylogenetic signal for complex niches.
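As an illustration of abundance-weighted trait estimation, the Python sketch below computes a species optimum for one environmental variable together with a weighted standard deviation as one possible spread measure. The function names and the three-site data are hypothetical, and the weighted standard deviation is only an assumption about what an associated error measure could look like, not the estimator used in Chapter 3.

```python
from math import sqrt


def weighted_optimum(abundances, env):
    """Abundance-weighted average of an environmental variable across sites."""
    total = sum(abundances)
    return sum(a * x for a, x in zip(abundances, env)) / total


def weighted_sd(abundances, env):
    """Abundance-weighted standard deviation around the optimum."""
    total = sum(abundances)
    optimum = weighted_optimum(abundances, env)
    variance = sum(a * (x - optimum) ** 2 for a, x in zip(abundances, env)) / total
    return sqrt(variance)


# Hypothetical data: abundance of one species at three sites and the site pH.
abundances = [10, 30, 60]
ph = [6.0, 7.0, 8.0]

optimum = weighted_optimum(abundances, ph)
spread = weighted_sd(abundances, ph)
```

A species recorded mostly at high-pH sites gets an optimum pulled toward those sites, and the spread indicates how tightly its occurrences cluster around that optimum.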

5.2 Phylogenetic clusters and simplification of bioindication tools

In Chapter 4 we presented a method for extracting clusters of species that share similar trait values while being phylogenetically close. This new method meets a technical need expressed by Carew et al. (2011) and Larras et al. (2014), with a view to developing simplified bioindication tools that integrate phylogenetic redundancy. We applied this method to assess the simplification potential of a species-level diatom index, the pollution sensitivity index IPS (Coste, 1982). Our results show that the possibilities for simplification using phylogeny as a guide are substantial. This is the first study to actually implement this approach.

The simplification of bioindication tools on the basis of the phylogenetic proximity between species can now take three distinct forms, represented in Figure 5.1. Starting from a tool based on a reference taxonomic level (Figure 5.1A), a simplification can be made by grouping taxa at a higher taxonomic level (Figure 5.1B). This is the simplest method of simplification and the one that has been most explored in diatoms (Chessman et al., 1999; Growns, 1999; Hill et al., 2001; Raunio and Soininen, 2007; Rimet and Bouchez, 2012b; Wunsam et al., 2002). Another, more sophisticated approach consists in adapting the taxonomic level according to the homogeneity of ecological trait values within clades (Figure 5.1C). This strategy is already put into practice for macroinvertebrates (Carter and Resh, 2001; Jones, 2008) but has been little developed in diatoms, with the notable exception of the Trophic Diatom Index (Kelly and Whitton, 1995), which proposed some groupings at the genus level. Finally, the last approach, which is the main subject of this thesis, is that of species clusters based on phylogeny (Figure 5.1D).

Phylogenetic clusters appear to be the most flexible and most effective way to simplify bioindication tools. However, this approach can be complex to develop, and the adoption of tools derived from this type of simplification in traditional workflows (identification and counting of individuals under the microscope) seems unrealistic (this point is discussed in detail in Section 4.4.2). In this context, it seems that the approach based on a mixture of taxonomic levels (Figure 5.1C) is a good compromise, offering some flexibility while preserving the taxonomic framework, which is well documented and easier for operators to grasp. The phylogenetic approach can be a useful way to detect taxa that are ecologically homogeneous and phylogenetically coherent, to be integrated into a multi-level taxonomic approach. Along the same lines, it may be worthwhile to test the phylogenetic clustering method presented in Section 4.2.2 with a distance matrix based on taxonomy.

Figure 5.2: Conceptual diagram showing the link between the data (phylogeny and trait distributions) and the species clusters. The diagram connects four elements: evolutionary history (phylogenies: topologies and branch lengths; traits: models and rates of evolution), phylogenetic signal (indices, correlograms), the relationship between phylogenetic distances and trait distances, and species clusters. The arrows are discussed in the text, where they are referred to by their number.

The clustering method that we presented in Chapter 4 can be varied in many ways (see Section 4.4.1). It therefore seems important to study in detail the implications of the different possible strategies (choice of a phylogenetic distance matrix, data transformation, criteria for extracting clusters from the graphs, etc.). Beyond the methodological choices left to the user, there is a real need to understand to what extent the data (phylogeny and trait values) influence the clustering results and how these results can be related to measures of phylogenetic signal. Figure 5.2 highlights four links (arrows) that need to be better understood. Arrow 1 corresponds to the link between evolutionary history (inferred from the data that can be collected) and the indices of phylogenetic signal.
This link is certainly the one that has been most studied so far (Diniz-Filho et al., 2012; Münkemüller et al., 2012; Revell et al., 2008). Arrow 2 represents the link between evolutionary history and the relationship between phylogenetic distances and trait distances. The shape of this relationship is known when traits are simulated under a Brownian motion model, for example (Diniz-Filho et al., 1998; Letten and Cornwell, 2015), but would benefit from being studied with more complex models of evolution and different shapes of phylogenies. Arrow 3 represents the link between measures of phylogenetic signal and the relationship between phylogenetic distances and trait distances. The detection of a phylogenetic signal probably conditions this relationship and therefore the ability to extract clusters. The question that arises here is that of the nature of the link between phylogenetic signal and phylogenetic clusters. Finally, arrow 4 corresponds to the link between phylogenetic and trait distances and the result of the clustering, and covers the influence of the methodological choices discussed above and in Section 4.4.1.

The links (arrows) shown in Figure 5.2 are therefore so many avenues of research that should be explored further in order to integrate species clusters into an evolutionary context.

5.3 Phylogenetic signal and clusters: extension to metabarcoding

Using diatoms as biological indicators requires the taxonomic identification of samples. This identification is traditionally carried out by visual inspection of the frustules under a microscope, a task that can be long, tedious, and error-prone (see Section 1.4.3). This observation is the starting point of the present work, which explores new ways to simplify the identification process.

Advances in molecular biology now make it possible to identify species on the basis of DNA barcodes (Hebert et al., 2003). The development of next-generation sequencing methods (Shokralla et al., 2012), coupled with extensive reference databases (Benson et al., 2008; Ratnasingham and Hebert, 2007) and efficient bioinformatics tools (Caporaso et al., 2010; Schloss et al., 2009), has made it possible to identify the taxonomic composition of communities directly from environmental DNA samples. This approach, known as metabarcoding, offers interesting applications for bioindication because it characterizes biodiversity rapidly and in an automated way, directly from the genetic information in the samples (Baird and Hajibabaei, 2012). Although the first attempts to apply metabarcoding to bioindication appear conclusive (Hajibabaei et al., 2011; Kermarrec et al., 2014), a number of technical obstacles remain before metabarcoding can be used routinely. Two points are of particular interest here: (i) the difficulty of assigning the barcodes found in samples to described taxa, because the reference databases associating a barcode with each species are still largely incomplete and because the diversity revealed by DNA is poorly compatible with a taxonomy based on morphological characters, and (ii) the lack of ecological and ecotoxicological data on the sensitivity of taxa to stressors.

[Figure 5.3: two panels (A, B), each showing a phylogeny with tips Sp. 1 to Sp. 25 and environmental sequences BARCODE 1 to BARCODE 5, next to a Trait axis ranging from -2 to 2.]

Figure 5.3: Two ways of integrating a phylogenetic approach to estimate the trait values of environmental sequences (in red) from a known phylogeny (in black). A. From phylogenetic clusters (mean trait value within each cluster). B. By modelling the signal with the phylogenetic eigenvector method (Guénard et al., 2013).

Phylogenetic approaches could help overcome these limitations because efficient tools are available to place environmental barcodes in reference phylogenies (Berger et al., 2011; Matsen et al., 2010). Once the environmental sequences have been placed in the phylogeny, a trait value can be assigned to them from their position, relying in particular on the phylogenetic signal (Figure 5.3). Thus, the phylogenetic signal that we characterized in Chapter 3 for ecological traits of interest for bioindication should also be exploitable in the context of metabarcoding. Two approaches for estimating trait values for new species (here, environmental sequences) are shown in Figure 5.3. The first (Figure 5.3A) relies on phylogenetic clusters extracted with the method introduced in Chapter 4: for each environmental sequence, the estimated trait value is the mean of the cluster to which it belongs. The second (Figure 5.3B) is based on the phylogenetic eigenvector method (Guénard et al., 2013). Other approaches exist for inferring trait values from the phylogeny (e.g., Kembel et al., 2012). Applying these approaches could simplify trait estimation in metabarcoding-based bioindication and add an evolutionary dimension to it.

APPENDIX A Linking Diatom Sensitivity to Herbicides to Phylogeny: A Step Forward for Biomonitoring?

This is a self-edited version of an article originally published in Environmental Science & Technology. Please cite: Larras F., Keck F., Montuelle B., Rimet F. & Bouchez A. (2014) Linking Diatom Sensitivity to Herbicides to Phylogeny: A Step Forward for Biomonitoring? Environmental Science & Technology 48(3), 1921–1930.

Abstract

Phylogeny has not yet been fully accepted in the field of ecotoxicology, despite studies demonstrating its potential for developing environmental biomonitoring tools, as it can provide an a priori assessment of the sensitivity of several indicator organisms. We therefore investigated the relationship between phylogeny and sensitivity to herbicides in freshwater diatom species. This study was performed on four photosystem II inhibitor herbicides (atrazine, terbutryn, diuron, and isoproturon) and 14 diatom species representative of Lake Geneva biofilm diversity. Using recent statistical tools provided by phylogenetics, we observed a strong phylogenetic signal for diatom sensitivity to herbicides. There was a major division in sensitivity to herbicides within the phylogenetic tree. The most sensitive species were mainly centrics and araphid diatoms (in this study, Thalassiosirales and Fragilariales), whereas the most resistant species were mainly pennates (in this study, Cymbellales, Naviculales, and Bacillariales). However, there was considerable variability in diatom sensitivity within the raphid clade, which could be explained by differences in trophic preferences (autotrophy or heterotrophy). These traits appeared to be complementary in explaining the differences in sensitivity observed at a refined phylogenetic level. Using phylogeny together with complementary traits, such as trophic preferences, may help to predict the sensitivity of communities with a view to protecting their ecosystem.

A.1 Introduction

Over the past 40 years, many methods have been developed for monitoring the ecological quality of water bodies (Lenoir and Coste, 1996; Liess et al., 2008; Schaumburg et al., 2004). Among these methods, bioindicators constitute a powerful tool to assess the anthropogenic pressures on biota and ecosystem function. Most bioindicators used in freshwaters, such as fish, macroinvertebrates, and microalgae, address specific biomonitoring needs (Resh, 2008). Microalgae, and especially diatoms, are suitable for monitoring the overall health of the environment. They are particularly used to assess the quality of rivers in terms of organic and nutrient pollution (Rimet, 2012). Their communities respond rapidly to changes in habitat quality (Lowe and Pan, 1996; Stevenson and Smol, 2003). Diatoms are good biomonitors because they are found in most aquatic habitats and exhibit a huge diversity in terms of taxonomy (over 30 000 species, Mann and Vanormelingen, 2013), morphology (Round et al., 1990), and ecology (Van Dam et al., 1994). Despite these advantages, fewer ecological assessments are carried out worldwide using diatoms than using macroinvertebrates or fish (Carter et al., 2006; Gallacher, 2002; Resh, 2007). The high sensitivity of diatoms to photosystem II inhibitor (PSII) herbicides has been widely observed at both the single-species (Larras et al., 2012; Lockert et al., 2006; Roubeix et al., 2011) and community levels (Pérès et al., 1996; Roubeix et al., 2011; Schmitt-Jansen and Altenburger, 2005). At these two levels of biological organization, some of these herbicides impair diatoms even at concentrations as low as those regularly found in the environment (Gilliom, 2007; Larras et al., 2012; Loos et al., 2009).

Ecotoxicological studies, based on toxic exposure in laboratory bioassays, provide the basic information required for developing biomonitoring tools. Studies at a higher complexity level, such as microcosms, provide more environmentally realistic data about toxic impacts, but identifying the environmental effects attributable to herbicides is rather complex. As a first step, it is important to determine herbicide toxicity under single-species conditions. We therefore performed single-species bioassays, which are useful tools for assessing the physiological sensitivity of a single diatom species to a single herbicide: confounding factors are excluded and a high degree of reproducibility can be achieved. However, sensitivity data obtained from these tests are only available for relatively few species because generating these data is time-consuming and labor-intensive. Because diatoms exhibit such a huge diversity, it seems inconceivable to assess with bioassays the sensitivity of even a small percentage of these species. Thus, developing methods to predict their sensitivity is a challenge that must be met if more species sensitivity data are to be used to improve biomonitoring efficiency. The current influx of DNA sequence data allows robust phylogenies to be established for many species. Connecting sensitivity data to the phylogeny of the tested species is a key tactic that might provide an a priori prediction of the species’ sensitivity to contaminants. Recent studies have investigated the potential of phylogeny to assess the sensitivity of amphibians (Hammond et al., 2012), microalgae (Eriksson et al., 2009; Wängberg and Blanck, 1988), fish (Jeffree et al., 2010), macroinvertebrates (Buchwalter et al., 2008), and chironomids and mayflies (Carew et al., 2011) to various pollutants.
Many authors have focused on the detection of a phylogenetic signal, which is the tendency for related species to share similar ecological or biological features. For instance, these studies showed that the phylogenetic signals of sensitivity varied in strength depending on the pollutant considered. However, the use of a phylogenetic framework to improve biomonitoring remains an unexplored field, and no data are available for diatoms. Approaches that integrate phylogeny and ecotoxicology could supply information for bioassessment tools operating at a larger taxonomic scale and could thus increase the effectiveness of biomonitoring. The lack of a quality data set based on a set of species presenting high diversity and a wide range of sensitivities (Guénard et al., 2011) has so far prevented this type of study. However, new statistical methods integrating phylogenies offer a promising avenue for exploring the variability of sensitivity within and between taxonomic levels. For example, Blomberg’s K statistic (Blomberg et al., 2003) and Pagel’s λ (Pagel, 1999) are both well-established statistical tools provided by phylogenetics to detect the phylogenetic signal, to measure its strength, and to test its significance. Blomberg’s K statistic has been applied extensively in the context of ecotoxicological studies by Buchwalter et al. (2008) and Carew et al. (2011), whereas Pagel’s λ performs well for complex models of trait evolution (Münkemüller et al., 2012). Phylogenetic principal component analysis (pPCA; Jombart et al., 2010b) is also a promising new method that allows for working in a multivariate framework to explore the phylogenetic signal and possible correlations among the distributions of sensitivities.

This study is a preliminary exploration of the potential use of phylogenetic signals to investigate the sensitivity of freshwater diatoms to herbicides.

The first aim was to explore the sensitivity patterns of diatoms to four PSII inhibitor herbicides within the phylogeny. These herbicides were chosen for their high level of phytotoxicity and because they are often detected in French surface waters (Dubois and Lacouture, 2011). We performed single-species laboratory bioassays of a set of 14 diatom species exposed to four PSII inhibitor herbicides (atrazine, terbutryn, diuron, and isoproturon). This species set is representative of the biofilm diversity in Lake Geneva and was developed in the context of biomonitoring this lake. We also provide a phylogeny of these 14 diatom species reconstructed from the 18S and rbcL markers.

Second, we wanted to determine whether the phylogeny of these diatoms was significantly linked to their herbicide sensitivities. We explored the phylogenetic signal for herbicide sensitivity using Blomberg’s K statistic and Pagel’s λ. We also applied the pPCA to our data as a tool to detect biologically meaningful combinations of herbicide sensitivities that are phylogenetically structured. Because we had sensitivity data for four different herbicides and few possible explanatory factors, the pPCA allowed us to uncover the underlying phylogenetic trends and patterns and to thus reveal the ecotoxicological evolutionary strategies of diatoms. The ability to explore the phylogenetic signal in a multivariate framework is also important in ecotoxicology to consider the multiple environmental stressors. Finally, the results are discussed in light of diatom ecology and provide useful insights into diatom biomonitoring. Traditional and phylogenetically based regression analyses were used to assess the relationships between herbicide sensitivities and commonly used biomonitoring indices. We wanted to determine whether the link between phylogeny and sensitivity should be considered for inclusion in the development of biomonitoring tools intended to protect environmental diatom communities against these herbicides.

A.2 Materials and Methods

A.2.1 Diatom Species

Fourteen freshwater benthic diatom species were selected to represent the diversity found in the Lake Geneva biofilm. Each species was maintained in culture and is registered at the Thonon Culture Collection (footnote 1). The selected species included Achnanthidium minutissimum (Kützing) Czarnecki (TCC-746), Craticula accomoda (Hustedt) D. G. Mann (TCC-107), Cyclotella meneghiniana Kützing (TCC-755), Encyonema silesiacum (Bleisch) D. G. Mann (TCC-678), Fistulifera saprophila (Lange-Bertalot & Bonik) Lange-Bertalot (TCC-535), Fragilaria capucina var. vaucheriae (Grunow) Lange-Bertalot (TCC-752), Fragilaria crotonensis Kitton (TCC-301), Fragilaria rumpens (Kützing) G. W. F. Carlson (TCC-666), Gomphonema parvulum (Kützing) Kützing (TCC-653), Gomphonema clavatum Ehrenberg (TCC-527), Mayamaea fossalis (Krasske) Lange-Bertalot (TCC-366), Nitzschia palea (Kützing) W. Smith (TCC-139-2), Sellaphora minima (Grunow) Mann (TCC-524), and Ulnaria ulna (C. L. Nitzsch) Compère (TCC-635). Photographs of the strains are available on the barcoding Web site of the INRA institute (footnote 2). The cultures were maintained in DV culture medium (footnote 3) that had been filtered at 0.22 µm (Millipore, Germany) and were grown in a 300 mL Erlenmeyer flask at 21 ± 2 °C with a 16 h/8 h light/dark cycle at 66 µmol·m−2·s−1.

1. http://www.inra.fr/carrtel-collection

A.2.2 Herbicide Solutions

We tested four herbicides, namely, atrazine, terbutryn, diuron, and isoproturon, from Sigma-Aldrich (St. Louis, MO, purity over 99.5%). For the bioassays, stock solutions were prepared in DV medium. Due to the low solubility of atrazine and diuron, we added a 0.05% concentration of the solvent dimethyl sulfoxide (DMSO) to the stock solutions and sonicated the mixture for 30 min. Previous research (unpublished data) has confirmed that this concentration has no adverse effects on benthic diatoms. Moreover, no interaction between DMSO and photosystem II inhibitor herbicide was observed below 0.5% DMSO (El Jay, 1996).

A.2.3 Sensitivity Data Set

The effective concentrations that reduce the population growth rate by 10% (EC10, Appendix B.3, Table S1) and 50% (EC50, Appendix B.3, Table S2) were available from Larras et al. (2013) for Fr. capucina var. vaucheriae, Fr. rumpens, U. ulna, Cr. accomoda, M. fossalis, S. minima, N. palea, A. minutissimum, Cy. meneghiniana, E. silesiacum, and G. parvulum. To increase the diversity of the tested species, we performed additional single-species bioassays on G. clavatum, Fi. saprophila, and Fr. crotonensis for these four herbicides as described by Larras et al. (2013). We worked at two levels of sensitivity so that our data could be generalized; EC50 and EC10 are both relevant in the environmental regulatory framework (European Community, 2003).

2. http://www.rsyst.inra.fr/?q=fr/content/micro-algues
3. http://www6.inra.fr/carrtel-collection_eng/Culture-media/Composition-of-the-culture-media

A.2.4 Mixture Toxicity Prediction

We predicted the sensitivity of each species at both the EC10 and EC50 levels to equitoxic mixtures of (1) all four selected herbicides (EC10-Mix and EC50-Mix), (2) the two herbicides belonging to the phenylurea family (EC10-Mixp and EC50-Mixp), and (3) the two herbicides belonging to the triazine family (EC10-Mixt and EC50-Mixt). The compositions of the equitoxic mixtures were determined using the concentration addition (CA) model (Equation A.1, Faust et al., 2001) because many studies have already shown that this model can predict the toxicity of mixtures of photosystem II inhibitors (Arrhenius et al., 2004; Faust et al., 2001; Porsbring et al., 2010). ECx,mix is the total concentration of the mixture that elicits a total effect of x%, Pi is the relative proportion of each substance within the mixture, ECx,i is the individual concentration of each substance that induces the effect of x%, and n is the number of substances in the mixture.

$$EC_{x,\mathrm{mix}} = \left( \sum_{i=1}^{n} \frac{P_i}{EC_{x,i}} \right)^{-1} \tag{A.1}$$
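As an illustration of Equation A.1, the following Python sketch computes the mixture effect concentration under concentration addition. It is not the code used in the study (the analyses were run in R). The function names and the EC50 values are hypothetical, and the equitoxic design is assumed here to mean that each herbicide is present in proportion to its own ECx.

```python
def equitoxic_proportions(ec_values):
    """Relative proportions P_i for an equitoxic mixture.

    Assumption: "equitoxic" means each substance is present at the same
    fraction of its own ECx, so P_i is proportional to ECx,i.
    """
    if any(ec <= 0 for ec in ec_values):
        raise ValueError("effect concentrations must be positive")
    total = sum(ec_values)
    return [ec / total for ec in ec_values]


def ec_mixture(proportions, ec_values):
    """Concentration addition: ECx,mix = (sum_i P_i / ECx,i) ** -1."""
    if len(proportions) != len(ec_values):
        raise ValueError("one proportion is needed per substance")
    if any(ec <= 0 for ec in ec_values):
        raise ValueError("effect concentrations must be positive")
    if abs(sum(proportions) - 1.0) > 1e-9:
        raise ValueError("proportions must sum to 1")
    return 1.0 / sum(p / ec for p, ec in zip(proportions, ec_values))


# Hypothetical single-substance EC50 values (arbitrary concentration units).
ec50 = [10.0, 20.0, 30.0, 40.0]
p = equitoxic_proportions(ec50)
ec50_mix = ec_mixture(p, ec50)
```

Under this equitoxic assumption the predicted ECx,mix reduces to the arithmetic mean of the individual ECx values, which gives a quick sanity check on any implementation.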

A.2.5 Phylogenetic Analyses

The diatom strains used in this study were genetically characterized by Sanger sequencing using two markers (18S and rbcL) as described in Kermarrec et al. (2013). Genbank accession numbers are provided for each sequence in Appendix B.3 (Table S3). The sequences were aligned using the Muscle algorithm (Edgar, 2004) provided in the SeaView graphical user interface (Gouy et al., 2010). The 18S and rbcL markers were first aligned separately and then combined. The resulting contig spanned 2457 bp. We used the maximum likelihood (ML) and Bayesian inference (BI) methods in parallel to compute and assess the tree topology from the aligned contig. Each tree was rooted using Bolidomonas pacifica L. Guillou & M.-J. Chrétiennot-Dinet, an algal species closely related to diatoms (Guillou et al., 1999). We used the MrAIC software (Nylander, 2004) to select the best substitution model. Because rbcL is a plastid gene and 18S is a nuclear gene, these genes may be subject to different evolutionary constraints. We carried out partitioned analyses to independently estimate the model parameters for each gene. Moreover, we constructed trees for the 18S and the rbcL genes separately to ensure that these trees both reflected the same evolutionary history. We used RAxML 7.2.8 (Stamatakis, 2006) to identify the most likely tree topology for the ML method. The branch support was assessed by 1000 bootstrap replicates. Finally, we ran an analysis for the BI method with MrBayes 3.1.2 (Ronquist et al., 2012). The analysis consisted of two runs, three heated chains, 1 000 000 generations, and 2000 samplings from the posterior probability distribution.

A.2.6 Sensitivity-Phylogeny Relationships

The phylogenetic data (i.e., tree topology with branch lengths) and trait-related data (sensitivity to herbicides) were jointly analyzed to assess the phylogenetic signal of diatom sensitivity. All of the analyses were performed after a square-root transformation of EC10 and EC50 values for single substances and mixtures to stabilize the variances. To ensure that our results were not an artifact of a particular sensitivity level, we performed phylogenetic signal analyses on both EC10 and EC50. However, only the EC50 values were used for the complementary analyses because there is less uncertainty for these values.

A.2.7 Phylogenetic Signal for Herbicide Sensitivity

To measure the phylogenetic signal of herbicide sensitivity (i.e., the tendency for related species to be similar to each other), we used the K statistic, which quantifies the strength of the phylogenetic signal for a given trait (here, the EC values for a given herbicide or mixture). We calculated a K value for each trait and used a randomization test based on the phylogenetically independent contrast (PIC) method (Blomberg et al., 2003) with 10 000 repetitions. Because our sample size was low (14 species), this test is slightly underpowered, and we may expect type II errors to be inflated. As an alternative to the K statistic, intercept-only GLS models that assume no covariance among species (i.e., a star phylogeny) can be fitted and compared with intercept-only GLS models that assume a Pagel’s λ correlation structure (Pagel, 1999). Pagel’s λ is a coefficient estimated from trait-related data that weights the Brownian model correlation structure. The models can be compared directly using the Akaike information criterion (AIC) (Paradis, 2011). As a rule of thumb, if the difference between the AIC values of two models exceeds 2 units, then the model with the lowest AIC is considered to best fit the data.
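The AIC rule of thumb above can be sketched in a few lines of Python. This is only an illustration, not the procedure run in the study: the function names, the log-likelihoods, and the parameter counts (one extra parameter for λ) are assumptions made for the example.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: AIC = 2k - 2 ln(L)."""
    return 2.0 * n_params - 2.0 * log_likelihood


def preferred_model(aic_star, aic_lambda, threshold=2.0):
    """Apply the rule of thumb: a model is preferred only if its AIC is
    lower than the other model's AIC by more than `threshold` units."""
    delta = aic_star - aic_lambda
    if delta > threshold:
        return "lambda"
    if delta < -threshold:
        return "star"
    return "undecided"


# Hypothetical log-likelihoods; the lambda model is assumed to have one
# extra parameter compared with the star-phylogeny model.
aic_star = aic(log_likelihood=-50.0, n_params=2)    # 104.0
aic_lambda = aic(log_likelihood=-46.5, n_params=3)  # 99.0
choice = preferred_model(aic_star, aic_lambda)      # "lambda"
```

Returning "undecided" when the two AIC values differ by 2 units or less keeps the sketch faithful to the rule of thumb, which says nothing about which model to prefer in that case.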

A.2.8 Phylogenetic Principal Component Analysis

pPCA is a multivariate method constraining traditional PCA to exhibit phylogenetic autocorrelation (Jombart et al., 2010b). This approach reveals the main phylogenetic structures of a set of traits. We used pPCA to highlight patterns among pesticide sensitivities in the diatom phylogeny. Jombart et al. (2010b) defined two types of phylogenetic structures that can occur in biological features. First, global structures are strongly linked to the idea of a phylogenetic signal and result from global patterns of sensitivity similarity in phylogenetically related taxa. Second, local structures express the overdispersion of sensitivity values that occurs for closely related species in specific parts of the phylogenetic tree. Both global and local structures are detected and extracted by pPCA in the form of synthetic variables known as the global principal component and the local principal component, respectively.

We performed pPCA on the EC50 values of each species for each herbicide and for mixtures of these herbicides and retained only the first global principal component (GPC1) and the first local principal component (LPC1), which correspond to the largest positive and the most negative eigenvalues, respectively. Scree plot inspection allowed us to check visually that the selected components were sufficient to summarize our data.
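The selection of GPC1 and LPC1 from the pPCA eigenvalues can be pictured with the short Python sketch below. It only illustrates the selection rule; the function name and the eigenvalues are hypothetical, and the pPCA itself (performed in the study with the adephylo R package) is not reproduced here.

```python
def first_global_and_local(eigenvalues):
    """Return the positions of the largest positive eigenvalue (GPC1) and
    of the most negative eigenvalue (LPC1); None if no such value exists."""
    if not eigenvalues:
        raise ValueError("at least one eigenvalue is required")
    positions = range(len(eigenvalues))
    global_idx = max(positions, key=lambda i: eigenvalues[i])
    local_idx = min(positions, key=lambda i: eigenvalues[i])
    return {
        "GPC1": global_idx if eigenvalues[global_idx] > 0 else None,
        "LPC1": local_idx if eigenvalues[local_idx] < 0 else None,
    }


# Hypothetical eigenvalues, not taken from the study.
eigs = [2.9, 0.6, -0.2, -1.1]
selected = first_global_and_local(eigs)
```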

A.2.9 Ancestral Character Estimations

We estimated ancestral character (sensitivity) values for the EC50-Mix to visualize the order of sensitivity of several diatom species to a mixture of herbicides within the phylogenetic tree. We estimated the ancestral characters for each node assuming a Brownian motion model of evolution for the examined traits using a restricted maximum likelihood method (Paradis, 2011; Schluter et al., 1997). Finally, we transformed our quantitative trait values (i.e., ancestral and current) into a categorical variable with two levels, which can be interpreted as a sensitive/resistant classification. The threshold between these two levels was selected as the mean of the square roots of the current EC50-Mix values (34.78). The values for sensitive species ranged between 8.18 and 20.20 (mean 16.14), while those for resistant species ranged between 36.46 and 79.27 (mean 53.42).
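The two-level classification described above can be sketched as follows in Python. This is a minimal illustration under stated assumptions: the species names and EC50-Mix values are hypothetical, the function name is invented for the example, and a value exactly equal to the threshold is arbitrarily labelled resistant. Ancestral estimates would be compared with the same threshold.

```python
from math import sqrt
from statistics import mean


def classify_sensitivity(ec50_mix):
    """Split species into 'sensitive' and 'resistant' around the mean of the
    square-root-transformed values (a lower EC50 means a more sensitive species)."""
    transformed = {name: sqrt(value) for name, value in ec50_mix.items()}
    threshold = mean(transformed.values())
    labels = {
        name: "sensitive" if value < threshold else "resistant"
        for name, value in transformed.items()
    }
    return threshold, labels


# Hypothetical EC50-Mix values, not taken from the study.
example = {"sp_a": 4.0, "sp_b": 9.0, "sp_c": 100.0, "sp_d": 225.0}
threshold, labels = classify_sensitivity(example)
```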

A.2.10 Phylogenetic Regressions

Phylogenetic generalized least squares (PGLS) and traditional regressions (with a star phylogeny) were used to assess and rank the explanatory power of two biomonitoring indices (VDAM-NH, Van Dam et al., 1994 and IPSS, Coste, 1982; Table A.1) for herbicide sensitivities. For each herbicide, we compared the set of PGLS and traditional regressions for both indices using AIC as an indicator of the relative support. Ecological guilds (Rimet and Bouchez, 2012b; Table A.1) and multiple regressions including both VDAM-NH and IPSS as explanatory variables were not tested here because these models require much more data. Our data set would have led to convergence errors or misleading results. The PGLS models were fitted by log-likelihood maximization.

| Metric | Value/Code | Class |
|---|---|---|
| Specific pollution sensitivity index (IPSS, Coste, 1982) | From 1 to 5 | From bad toward best water quality media |
| Nitrogen uptake metabolism (VDAM-NH, defined here as trophic mode, Van Dam et al., 1994) | 1 | N-autotrophic, tolerate very low organic [N] |
| | 2 | N-autotrophic, tolerate high organic [N] |
| | 3 | Facultative N-heterotrophic, need periodically high organic [N] |
| | 4 | Obligatory N-heterotrophic, need continuously high organic [N] |
| Ecological guild (Rimet and Bouchez, 2012b) | LP (low profile) | Small species that can tolerate physical disturbance and low levels of resources |
| | HP (high profile) | Taller species that do not resist physical disturbance and do tolerate high nutrient levels |
| | M (motile) | Species that can move and tolerate high nutrient levels |
| | P (planktonic) | Species that evolve freely in the water column |

Table A.1: Ecological and Biological Metrics of Diatoms Considered for Regression Analysis.

A.2.11 Statistical Packages

All of the statistical analyses were performed using R 3.0.0 software (R Development Core Team, 2013). The phylogenetic data handling, Pagel’s λ correlation structure, and ancestral character estimations were performed using the ape package (Paradis et al., 2004). The PGLS were fitted using the nlme package (Pinheiro et al., 2013). The phylogenetic multivariate analyses were performed using the adephylo package (Jombart et al., 2010a), and Blomberg’s K statistics were computed using the picante package (Kembel et al., 2010).

A.3 Results

A.3.1 Phylogenetic Tree Analysis

The two methods (ML and BI) used to reconstruct the phylogeny produced very similar topologies. Similarly, using 18S, rbcL, or 18S + rbcL contigs produced similar topologies with minor variations in the branch lengths. These small topological differences did not affect the methods used here (results not shown). Phylogenetic trees computed with every combination of methods and markers are provided in Appendix B.3 (Figure S1). The topologies were well supported overall by bootstraps and Bayesian posterior probabilities. The tree computed from the 18S + rbcL contig using the ML method, with branch support values for each method and its agreement with other references, is presented in Figure A.1. The diatom species clearly fell into three different groups: the centric, araphid, and pennate raphid diatoms. The centric diatom (Cy. meneghiniana) was placed near the pennate diatoms. The araphid pennate diatoms (U. ulna and Fragilariales species) were sister species to the nine raphid pennate diatom species in our data set. The raphid diatoms were divided into two clades: one clade has a raphe canal (N. palea), and the other does not (Naviculales, Achnanthales, and Cymbellales). Subsequent clades included taxa belonging to the Cymbellales, freshwater Achnanthales, and Naviculales species, which were well supported. The nodes that separate these important groups were robust with Bayesian support values close to 100 (Figure A.1). The support values computed by ML were lower, especially in the case of the node that separates the raphid and araphid diatoms (67). The deepest node of Naviculales was poorly supported by ML and was unresolved by BI. This convergent tree was used in all of the subsequent phylogenetic analyses.

A.3.2 Diatom Sensitivity to Herbicides

An overview of diatom sensitivity to herbicides is presented in Figure A.2. Species sensitivities show similar patterns for both EC10 and EC50 (Figure A.2). The complete data set of untransformed EC10 and EC50 values for each herbicide and each species is provided in Appendix B.3 (Tables S1 and S2). Cy. meneghiniana and Fragilariales species were the most sensitive to the four herbicides under both single and mixture conditions. More resistant species, such as N. palea and S. minima, showed different patterns depending on the herbicide family tested. N. palea was more resistant to the triazines, while S. minima was more resistant to the phenylurea herbicides, especially at the EC10 level. Fi. saprophila tended to be more resistant to the triazines, especially at the EC50 level. The opposite was observed for all of the herbicides for G. clavatum. As a general trend, E. silesiacum, A. minutissimum, and M. fossalis tended to have medium sensitivities compared to those of the other species.

A.3.3 Phylogenetic Signal for Herbicide Sensitivity

The phylogenetic signal tested with Blomberg’s K was significant (α < 5%) at the

EC10 and EC50 levels for atrazine, Mixt, and Mix (Table A.2). Mixp was found to be significant only at the EC10 level. We found that Pagel’s λ was closely correlated to Blomberg’s K (Pearson correlation 0.76, p-value 0.002) and that both values led to similar conclusions. Interestingly, EC10 appeared to reveal a stronger phylogenetic signal than did EC50 for the phenylurea compounds, especially for diuron. Blomberg’s A.3. Results 123

GuildsIPSS VDAM-NH

Encyonema silesiacum HP 5 2 Cymbellales (a, b)

100/100 Gomphonema clavatum HP 5 1

100/100

91/100 Gomphonema parvulum HP 2 3 Achnanthales (b) Freshwater

Achnanthidium minutissimum LP 5 2

96/100 Sellaphora minima M 3 3 Naviculales sensu lato 75/99

Mayamaea fossalis M 3 2

36/− 91/100 Craticula accomoda M 1 4

100/100

Fistulifera saprophila M 2 3 Bacillariales (c, d) Nitzschia palea M 1 4

67/99

Ulnaria ulna HP 3 2

95/100 Fragilariales (b) Fragilariales Fragilaria crotonensis P 4 2 75/98

Fragilaria rumpens HP 4 2 100/100

Fragilaria capucina HP 3.4 2 Thalassiosirales (c, d) Cyclotella meneghiniana P 2 3

Bolidomonas pacifica (outgroup) 0.05

Figure A.1: Phylogenetic tree of 14 diatom species inferred from the 18S and rbcL DNA sequences and their related ecological guild,50 global pollution sensitivity (IPSS, Coste, 1982), and trophic mode (VDAM-NH, Van Dam et al., 1994). The branch lengths were computed using the maximum likelihood method. Statistical support (as a percentage) is provided at each node for the two methods (ML bootstraps/BI posterior probabilities). The clades are replaced in their bibliographic context with a = Kermarrec et al. (2011); b = Medlin (2011); c = Medlin and Kaczmarska (2004); d = Theriot et al. (2011). For the ancestral characters, the herbicide sensitivity of the diatoms was based on EC50-Mix data. Black and white symbols represent sensitive and nonsensitive species, respectively. Circles represent current characteristics, and tilted squares represent inferred ancestral characters. Ecological guilds: P = planktonic, LP = low profile, HP = high profile, M = motile. IPSS: from 1 (species in poor-quality water) to 5 (species in good-quality water). VDAM-NH: 1 = autotrophic, sensitive to low organic N concentrations; 2 = autotrophic, tolerant to high organic N concentrations; 3 = facultative heterotrophic; 4 = obligatory heterotrophic. 124 A. Linking Diatom Sensitivity to Herbicides to Phylogeny : A Step Forward for Biomonitoring ?

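[Figure A.2 (image not reproduced): phylogenetic tree of the 14 species (G. clavatum, G. parvulum, E. silesiacum, A. minutissimum, S. minima, M. fossalis, C. accomoda, F. saprophila, N. palea, U. ulna, F. crotonensis, F. rumpens, F. capucina, C. meneghiniana) with three panels: sensitivity (EC50), sensitivity (EC10), and pPCA scores (global PC, local PC). The sensitivity panels have one column per treatment: diuron, isoproturon, atrazine, terbutryn, the phenylurea mixture, the triazine mixture, and the quaternary mixture (Mix). Inset: bar plot of the pPCA eigenvalues, with the first global (G) and first local (L) components marked.]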

Figure A.2: EC50, EC10, and pPCA species scores of the 14 diatom species mapped onto the phylogenetic tree. The EC values are centered and scaled by their standard deviation within the treatment. The scores are centered and scaled by their standard deviation for each component. The larger the circle, the higher (black circle) or the lower (white circle) the value. The bar plots represent the eigenvalues for the four principal components of the pPCA. The first global and local principal component eigenvalues are indicated by the letters “G” and “L”, respectively.
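As a minimal sketch of the standardization described in the Figure A.2 caption, the Python snippet below centers the EC values of each treatment and divides them by the standard deviation of that treatment. The function name, the input numbers, and the use of the sample standard deviation (n − 1 denominator) are assumptions made for illustration; none of them is taken from the study.

```python
from statistics import mean, stdev


def center_and_scale(values):
    """Center values on their mean and divide by their sample standard deviation."""
    m = mean(values)
    s = stdev(values)  # assumption: sample SD (n - 1 denominator)
    return [(v - m) / s for v in values]


# Hypothetical EC values for two treatments (arbitrary units, not study data).
ec_by_treatment = {
    "treatment_A": [10.0, 20.0, 30.0, 40.0],
    "treatment_B": [1.0, 1.5, 2.0, 4.5],
}

# Each treatment is standardized separately, so treatments become comparable.
scaled = {name: center_and_scale(vals) for name, vals in ec_by_treatment.items()}
```

After this step every treatment has mean 0 and standard deviation 1, so circle sizes in a plot of this kind reflect relative rather than absolute sensitivity.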

K statistic ranged from 0.401 for EC10-terbutryn to 0.862 for EC10-Mixt. We found no statistically significant difference in K values between the EC10 and EC50 groups of traits (Wilcoxon paired test, p-value = 0.438).
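The paired comparison of K between the EC10 and EC50 traits can be sketched as follows, using the rounded K values reported in Table A.2 and an exact two-sided Wilcoxon signed-rank test written from scratch. This is an illustrative reimplementation, not the authors' script: it assumes no zero differences and no tied absolute differences, and because it relies on rounded table values its p-value need not match the reported one exactly.

```python
from itertools import product


def wilcoxon_signed_rank_exact(x, y):
    """Exact two-sided paired Wilcoxon signed-rank test.

    Assumes no zero differences and no tied absolute differences.
    Returns (W, p), where W is the smaller of the two signed rank sums.
    """
    diffs = [b - a for a, b in zip(x, y)]
    if any(d == 0 for d in diffs):
        raise ValueError("zero differences are not handled in this sketch")
    n = len(diffs)
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0] * n
    for rank, i in enumerate(order, start=1):
        ranks[i] = rank
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    w = min(w_plus, w_minus)
    # Under H0 every rank is positive or negative with probability 1/2:
    # enumerate all 2^n sign patterns and count rank sums <= W.
    count = 0
    for signs in product((0, 1), repeat=n):
        total = sum(rank for rank, s in zip(range(1, n + 1), signs) if s)
        if total <= w:
            count += 1
    p_value = min(1.0, 2 * count / 2 ** n)
    return w, p_value


# K values from Table A.2, in the order:
# Diuron, Isoproturon, Mixp, Atrazine, Terbutryn, Mixt, Mix.
k_ec50 = [0.602, 0.517, 0.535, 0.691, 0.638, 0.717, 0.680]
k_ec10 = [0.624, 0.460, 0.627, 0.860, 0.401, 0.862, 0.834]

w, p = wilcoxon_signed_rank_exact(k_ec50, k_ec10)
print(f"W = {w}, two-sided exact p = {p:.3f}")
```

With these inputs the test does not reject the null hypothesis at the 5% level, which agrees with the conclusion stated in the text.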

To illustrate the phylogenetic signal, we mapped the EC50-Mix values and the estimated ancestral character values for EC50-Mix onto the phylogenetic tree (Figure A.1). The ancestral character estimation showed a clear phylogenetic segregation between the sensitive (Thalassiosirales and Fragilariales) and resistant (Cymbellales, Naviculales, and Bacillariales) species. Moreover, the trophic mode index VDAM-NH (Van Dam et al., 1994), the ecological guild (Rimet and Bouchez, 2012b), and the values of the specific pollution sensitivity index of the IPSS biotic diatom index (Coste, 1982) of each species are given (Figure A.1). From traditional and phylogenetic regressions (Appendix B.3, Table S4), it was clear that VDAM-NH offers better explanatory power for herbicide sensitivities than does IPSS. Terbutryn sensitivity was an exception, as it was equally well explained by both IPSS and VDAM-NH. The trophic mode appeared to match both the sensitivity and phylogeny, as demonstrated statistically (p-value < 0.05). It is also interesting to note that, for isoproturon, Mixp, and Mix, the phylogenetic models performed substantially better than did the traditional models. For diuron, atrazine, and Mixt, the traditional and phylogenetic models offered similar performances. Terbutryn is the exception for which traditional models performed better than did phylogenetic models.

Treatment      K      K p-value  Pagel’s λ   ΔAIC
EC50
Diuron         0.602  0.103      1.018        0.150
Isoproturon    0.517  0.399      0.233       -1.708
Mixp           0.535  0.258      0.279       -1.561
Atrazine       0.691  0.032      1.000        3.578
Terbutryn      0.638  0.062      1.030        0.452
Mixt           0.717  0.027      1.011        4.041
Mix            0.680  0.031      0.844        1.146
EC10
Diuron         0.624  0.083      1.057        0.867
Isoproturon    0.460  0.195      0.675       -1.022
Mixp           0.627  0.054      NA           NA
Atrazine       0.860  0.007      1.062        7.448
Terbutryn      0.401  0.178      0.290       -1.655
Mixt           0.862  0.008      1.053        5.882
Mix            0.834  0.008      1.058        5.768

Table A.2: Results of Statistical Analysis for the Phylogenetic Signal of the Toxicity of Herbicides Alone and in Mixtures. ∆AIC corresponds to the difference between the AIC of the “star” model and that of the “Pagel’s λ” model. ∆AIC > 2 indicates the presence of a phylogenetic signal. NA values are produced when the PGLS fit does not converge.
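The ∆AIC decision rule stated in the caption of Table A.2 can be sketched in a few lines of Python. The ∆AIC values are copied from Table A.2; the variable names and the use of None for the non-convergent fit are choices made here for illustration only.

```python
# ∆AIC = AIC("star" model) - AIC("Pagel's λ" model), values from Table A.2.
# None marks the fit reported as NA (PGLS did not converge).
delta_aic = {
    ("EC50", "Diuron"): 0.150,
    ("EC50", "Isoproturon"): -1.708,
    ("EC50", "Mixp"): -1.561,
    ("EC50", "Atrazine"): 3.578,
    ("EC50", "Terbutryn"): 0.452,
    ("EC50", "Mixt"): 4.041,
    ("EC50", "Mix"): 1.146,
    ("EC10", "Diuron"): 0.867,
    ("EC10", "Isoproturon"): -1.022,
    ("EC10", "Mixp"): None,
    ("EC10", "Atrazine"): 7.448,
    ("EC10", "Terbutryn"): -1.655,
    ("EC10", "Mixt"): 5.882,
    ("EC10", "Mix"): 5.768,
}

THRESHOLD = 2.0  # rule from the table caption: ∆AIC > 2 indicates a signal


def has_signal(value):
    """Apply the ∆AIC > 2 rule; a missing value cannot support a signal."""
    return value is not None and value > THRESHOLD


with_signal = sorted(key for key, value in delta_aic.items() if has_signal(value))
for level, treatment in with_signal:
    print(level, treatment, delta_aic[(level, treatment)])
```

Under this rule alone, the traits flagged are the triazine-related ones (atrazine and Mixt at both effect levels) plus the quaternary mixture at the EC10 level.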

The ecological guilds were closely related to the phylogeny but were not clearly linked to species’ sensitivity to herbicides. Most of the sensitive species (centrics/araphids) belonged to the high-profile guild, which also included some resistant species in the raphid group. Moreover, species belonging to the motile guild displayed various levels of sensitivity, similar to or lower than those of many of the high-profile species.

The pPCA on EC50 highlighted a strong global structure and a weak local structure (Figure A.2, scree plot), which supports the presence of a phylogenetic signal as demonstrated by statistical analyses. Species scores on the first global axis (Figure A.2, global PC) highlight groups of closely related species that share the same range of herbicide sensitivities. These scores also reveal a contrast between sensitive species, such as the Fragilariales and the centric species (Cy. meneghiniana), and resistant species, such as N. palea and S. minima. Species scores on LPC1 reveal a marked contrast between N. palea and both Fi. saprophila and G. parvulum (Figure A.2, local PC). These species are closely related, but their responses to the different pesticide families are different: G. parvulum is sensitive to triazine and resistant to phenylurea, whereas Fi. saprophila and N. palea are sensitive to phenylurea and resistant to triazine (Figure A.2, EC values).

All four herbicides were negatively correlated with the first global component (Figure A.3, global PC), suggesting the existence of a general pattern of sensitivities within the phylogeny that is independent of the herbicide. However, the first local component (Figure A.3, local PC) seemed to oppose the two groups of herbicides, even if it involved only three species (Figure A.2, local PC, N. palea, Fi. saprophila, and G. parvulum).

A.4 Discussion

A.4.1 Does Phylogeny Reflect Herbicide Sensitivity?

Interactions between phylogeny and ecology are often studied because they provide complementary information that can help to explain patterns of species occurrence (Pavoine et al., 2011) or extinction (Green et al., 2011) over time. Nevertheless, studies relating the phylogeny of freshwater organisms to their sensitivity to pesticides are less common, even if they could provide very useful data for monitoring purposes. Among these few studies, Hammond et al. (2012) found a phylogenetic signal of sensitivity for many amphibians to the insecticide endosulfan, implying that large ranids were the most sensitive frogs. Similarly, Wängberg and Blanck (1988) found phylogeny and sensitivity to be linked for a wide range of phototrophic organisms. Cyanophyta and Chlorococcales exhibited different levels of sensitivity toward many chemicals. These first studies gave promising results regarding the potential use of phylogeny in ecotoxicology (Guénard et al., 2011; Hammond et al., 2012; Wängberg and Blanck, 1988) for these specific taxonomic groups and chemicals. To enhance the reliability of these results, studies must be conducted on data sets with wide ranges of both species diversity and species sensitivity (Guénard et al., 2011). The lack of such a sensitivity data set in the literature makes it currently impossible to carry out this type of study, especially on organisms of interest for biomonitoring. Diatoms appear to be appropriate organisms due to their high degree of diversity, their varying sensitivities to herbicides, and their critical importance in river bioassessment. These organisms are already being successfully used as indicators to assess the ecological status of aquatic ecosystems (Lenoir and Coste, 1996; Schaumburg et al., 2004), which is required for regulatory monitoring, such as for the European Water Framework Directive (European Community, 2000). However, it may be difficult to identify diatoms at the species level because of their great diversity and because their identification requires highly qualified personnel and resources (Kelly et al., 1995; Rimet and Bouchez, 2012b).

[Figure A.3 (image not reproduced): two panels, Global PC and Local PC, each showing the loadings of EC50 Diuron, EC50 Isoproturon, EC50 MixUr, EC50 Atrazine, EC50 Terbutryne, EC50 MixTri, and EC50 Mix on an axis with ticks at −0.6, −0.2, and 0.2.]

Figure A.3: Loadings of herbicide treatments for the first global and local principal components of the EC50 pPCA. Black triangles represent single substances and the binary mixture of phenylurea compounds, black circles represent single substances and the binary mixture of triazine compounds, and white tilted squares represent the quaternary mixture.

In our study, we worked with a set of 14 taxa. These species were chosen because they occur in Lake Geneva and cover a wide range of diatom taxonomic diversity. Therefore, this species set is representative of biofilm diversity in Lake Geneva; the sensitivity of the species toward various pesticides has already been explored in mesocosm studies in the context of this particular lake (Rimet and Bouchez, 2011). Moreover, the chosen taxa belong to various clades that include the main freshwater orders. Diatom phylogeny has been explored on the basis of various markers, such as 18S, rbcL, psbC, cox1, ITS, and LSU, by several authors (Beszteri et al., 2001; Lundholm et al., 2002; Medlin and Kaczmarska, 2004; Medlin et al., 1996; Ruck and Theriot, 2011; Theriot et al., 2010; Zechman et al., 1994). The two methods (ML and BI) used to reconstruct the phylogeny produced exactly the same topology, which was statistically well supported and was consistent with previously published results (Kermarrec et al., 2011; Medlin, 2011; Theriot et al., 2011).

We observed a phylogenetic signal of the sensitivity of benthic diatoms to PSII herbicides at both the EC10 and EC50 levels. Blomberg’s K and Pagel’s λ statistics, which were used to detect this signal, were underpowered in this study. Simulations showed that, with 14 species, Blomberg’s K detects 80% of true positives and Pagel’s λ detects 75% of true positives. This result suggests that the phylogenetic signal was most likely more pronounced than is shown here. Therefore, a relationship likely exists between diatom phylogeny and herbicide sensitivity. Possible explanatory hypotheses are discussed in the following section.
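For readers unfamiliar with Pagel’s λ, the sketch below shows the transformation on which the statistic is usually based: the off-diagonal entries of the phylogenetic covariance matrix (shared branch lengths) are multiplied by λ while the diagonal is left unchanged. With λ = 1 the matrix is the untransformed (Brownian motion) one, and with λ = 0 it corresponds to a “star” phylogeny with no shared history, which is the null model compared in Table A.2. The three-tip matrix and the function name are hypothetical and serve only as an illustration; this is not the code used in the study.

```python
def pagel_lambda_transform(cov, lam):
    """Scale the off-diagonal entries of a covariance matrix by lam.

    cov is a square list of lists; the diagonal (tip variances) is kept as is.
    """
    n = len(cov)
    return [
        [cov[i][j] if i == j else lam * cov[i][j] for j in range(n)]
        for i in range(n)
    ]


# Hypothetical covariance matrix for three tips: tips 0 and 1 share 0.6 units
# of branch length, tip 2 shares none with the others.
cov = [
    [1.0, 0.6, 0.0],
    [0.6, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]

brownian = pagel_lambda_transform(cov, 1.0)      # unchanged tree structure
intermediate = pagel_lambda_transform(cov, 0.5)  # weaker phylogenetic signal
star = pagel_lambda_transform(cov, 0.0)          # no phylogenetic covariance
```

Fitting λ then amounts to finding the value that maximizes the likelihood of the trait data under the transformed matrix, and comparing that fit with the λ = 0 model.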

A.4.2 Sensitivity Patterns in Phylogeny: Explanatory Hypotheses

According to the phylogeny and the first global axis of the pPCA, two groups of sensitivity toward PSII inhibitors were observed: (a) one including the sole centric diatom species (Thalassiosirales) and the araphid pennate diatoms (Fragilariales), which were generally more sensitive than (b) the pennate raphid species (Bacillariales, Naviculales, Achnanthales, and Cymbellales). Cy. meneghiniana was the only centric diatom we included. Moreover, this species displayed mixotrophic capabilities (Van Dam et al., 1994), which are rare among centric diatoms, indicating that no generalization could be made regarding the sensitivities of centric diatoms. In our study, ancestral character estimation demonstrates that the resistant state emerged very early in the tree. The main difference between these two groups is the presence of the raphe, which is present only in raphid diatoms (Round et al., 1990; Sims et al., 2006; Theriot et al., 2010). The raphe played an important role in benthic habitat colonization, principally by enabling diatoms to move and to secrete an exopolysaccharide matrix (Sims et al., 2006). Nevertheless, it is hardly conceivable that the advent of the raphe in diatom evolution led directly to lowered sensitivity to herbicides. Rather, we hypothesize that, during evolution, species that developed a raphe structure simultaneously developed other characteristics (genetic, physiological, or cellular traits) that enhanced their resistance to PSII inhibitors. Raphid and araphid pennate diatoms are both characteristic of benthic communities (Brown and Austin, 1973; Rimet and Bouchez, 2011; Schmitt-Jansen and Altenburger, 2005). Under low light conditions, some of these diatoms can develop heterotrophic capacities and can metabolize other organic substrates, making them less dependent on photosynthesis (Hellebust and Lewin, 1977; Van Dam et al., 1994). In particular, genera such as Craticula, Encyonema, Fistulifera, Gomphonema, Nitzschia, and Sellaphora, most of which display motile and/or raphid characteristics, are known to be better adapted to coping with higher organic matter concentrations than most centrics (e.g., Cyclotella) or araphid diatoms (e.g., Fragilariales; Berthon et al., 2011).

The sensitivity levels of the different species are generally constant within the araphid/centric diatom clade, whereas they are variable within the raphid clade, which encompasses a greater diversity of genera. Several species from this latter clade, such as M. fossalis, A. minutissimum, and E. silesiacum, were more sensitive than other raphid species. We observed variability within the raphid clade, but more sensitivity data per genus are required to develop a firm hypothesis regarding the variation of sensitivity within the raphid clade in relation to phylogeny. However, we suggest that the trophic mode may play a role in the greater sensitivity observed for these three autotrophic species within the raphid clade. Our main hypothesis is that the trophic mode of diatoms, defined by Van Dam et al. (1994), is partly related to phylogeny and indirectly influences the sensitivity of species to PSII inhibitors. The results of traditional and phylogenetic regressions both suggest that phylogeny provides a first general level of sensitivity to these herbicides. Moreover, the trophic mode of species may help to refine the level of sensitivity within clades characterized by species with a wide diversity of ecological and physiological characteristics. Diatoms with higher heterotrophic capacities are less susceptible to PSII inhibitors because they are able to shift to organic nutrition to sustain themselves. Moreover, the mixotrophic capacities of diatoms vary greatly between species. Several studies seem to confirm this hypothesis: heterotrophic diatom species are less sensitive to herbicides than are autotrophic diatom species (Berard and Pelte, 1999; Debenest et al., 2009; Larras et al., 2012; Pérès et al., 1996; Roubeix et al., 2011), such as the highly sensitive araphid diatoms. Within the raphid clade, species exhibit various trophic modes, and some may be heterotrophic, which may explain the different levels of sensitivity found for the small number of species tested in our study.

Finally, we observed another level of variation among the most resistant species of the raphid group. As highlighted by the first local axis of the pPCA, there is greater variation in sensitivity to the different herbicide families (triazine vs phenylurea) for G. parvulum, E. silesiacum, Fi. saprophila, and N. palea, especially at the EC10 level. Triazine and phenylurea herbicides both inhibit photosynthesis by preventing the electron flow within thylakoid membranes (Moreland and Hill, 1962). However, these two herbicide families are characterized by different chemical structures (Gramatica et al., 2001). Moreover, within each chemical family, specific and different parameters influence herbicide toxicity (Gramatica et al., 2001). The fact that herbicides exhibit a range of different chemical structures may influence their uptake by algae and/or their affinity for the binding site and finally modulate their toxicity (Oettmeier, 1999).

A.4.3 Use of Phylogeny in the Context of Biomonitoring

Our results suggest that an a priori assessment of the sensitivity of benthic diatoms to PSII inhibitors can be made from both their phylogeny, which discriminates two major patterns of sensitivity, and their trophic mode, which refines the sensitivity level of diatom species, especially in the raphid clade. This indicates that (1) phylogeny has a considerable potential for use in ecotoxicological studies involving diatoms and PSII inhibitor herbicides and (2) phylogeny provides promising results to predict bioindicator sensitivity, which may be used for the environmental bioassessment of herbicide impacts. Obviously, this study is a first step, and more studies are needed (e.g., within clades and on other diatom species) before an operational biomonitoring tool for herbicide contamination can be proposed.

In our study and as a general trend, araphids and autotrophic species remain the most sensitive. Even Cy. meneghiniana, which is defined as mixotrophic and planktonic, was characterized as sensitive. This species is considered an exception because most of the mixotrophic diatom species present in the environment are pennate (Van Dam et al., 1994). Species that are both centric and mixotrophic are rare. Moreover, centric diatoms are rarely found in benthic habitats, and little information is available from micro-mesocosm studies. Pérès et al. (1996) have shown that centric (Melosira varians) and araphid (Staurosira venter) diatoms tend to disappear under pressure from isoproturon, whereas raphid species (Cymbella mesiana and S. minima) tend to be favored. Similarly, Schmitt-Jansen and Altenburger (2005) observed a decrease in the relative abundance of araphid species (Fragilaria sp., Fr. capucina var. gracilis) after exposure to atrazine and isoproturon. In an agricultural watershed, the relative abundance of Fr. capucina (araphid) tended to decrease, while that of Navicula lanceolata, Nitzschia linearis, and N. palea (raphids) tended to increase (Roubeix et al., 2011).

For IPSS (used to assess the water quality from diatom species abundance) and for ecological guilds, no relevant link was observed between the ranking of species’ sensitivity and their phylogeny. Indices relevant for monitoring a specific stressor cannot easily be transposed to monitor another stressor. In our study, we observed that although the IPSS index is relevant in the context of global pollution, it does not seem to be appropriate for herbicide risk assessment, highlighting the need to develop specific biomonitoring indices for herbicide contamination. In the context of risk assessment and bioassessment, integrating phylogeny into ecotoxicological studies seems to offer a promising way to assess the order of species’ sensitivity toward PSII inhibitor herbicides. Moreover, the predicted order of species’ sensitivity has been supported by other experiments that integrate higher levels of complexity, such as mesocosm (Rimet and Bouchez, 2011) and in situ studies (Marcel et al., 2013). In such studies, diatoms are exposed to numerous biotic and abiotic environmental factors (turbulence, light, grazing, etc.), and herbicide contamination accounts for only part of the stress on the diatom assemblage. Ecotoxicology is a discipline in which it is necessary to conduct studies of varying complexity, ranging from molecules to ecosystems (Boudou and Ribeyre, 1997), to provide a solid foundation for biomonitoring. Environmental monitoring aims (1) to connect pressures and impacts for a posteriori assessment and (2) to predict the a priori risk of contaminants. Both approaches rely on basic sensitivity data that are generally too scarce and heterogeneous and are obtained from model species unrepresentative of the ecosystems. Using the link between phylogeny and sensitivity offers the prospect of extending these data and increasing their environmental representativeness, thereby improving biomonitoring tools.

APPENDIX B Supplementary material

B.1 Additional material for Chapter 3

B.1.1 GenBank IDs of the sequences (18S and/or rbcL) selected after the pruning step.

Species  ID 18S  ID rbcL
Acanthoceras sp.  NA  HQ912540
Achnanthidium minutissimum  KT072992  KT072938
Actinocyclus curvatulus  X85401  NA
Actinocyclus sp.  NA  HQ912504
Actinoptychus sinensis  AJ535182  NA
Actinoptychus sp.  NA  HQ912438
Adlafia brockmannii  AM502020  AM710487
Amphiprora paludosa var. hyalina  FR865482  NA
Amphora capitellata  AJ535158  NA
Amphora coffeaeformis  HQ912602  NA
Amphora fogediana  AM502022  AM710489
Amphora libyca  AM501959  AM710425
Amphora montana  KC736615  NA
Amphora normanii  AM501958  NA
Amphora pediculus  KT072994  KT072943
Amphora proteus  AJ535147  NA
Anomoeoneis sphaerophora  AJ535153  NA
Arcocellulus mammifer  NA  HQ912433
Ardissonea baculus  AM746973  AB430664
Ardissonea formosa  NA  HQ912517
Ardissonea fulgens  NA  HQ912503
Asterionella formosa  HQ912633  HQ912497
Asteroplanus karianus  Y10568  AB430672
Attheya longicornis  AY485450  JN162766
Attheya septentrionalis  HQ912618  FJ002121
Aulacodiscus orientalis  NA  HQ912516
Aulacoseira alpigena  AY569578  NA
Aulacoseira ambigua  AY569579  AY569600
Aulacoseira baicalensis  AY121821  NA
Aulacoseira distans  X85403  NA
Aulacoseira granulata  AY569585  AY569604
Aulacoseira islandica  AY569572  NA
Aulacoseira nyassensis  AJ535187  NA
Aulacoseira skvortzowii  AY121822  NA
Aulacoseira subarctica  AY569577  AY569593
Aulacoseira valida  AY569586  AY569602
Bacillaria paxillifera  FR865484  HQ912491

Bacterosira bathyomphala  DQ514894  DQ514816
Bellerochea horologicalis  NA  HQ912536
Bellerochea malleus  DQ514845  DQ514763
Berkeleya rutilans  HQ912637  HQ912501
Biddulphia alternans  NA  HQ912541
Biddulphia antediluviana  NA  HQ912529
Biddulphia tridens  NA  HQ912538
Biddulphiopsis membranacea  NA  HQ912502
Biddulphiopsis titiana  HQ912641  HQ912505
Bleakeleya notata  NA  HM627327
Bolidomonas pacifica  HQ912557  HQ912557
Brockmanniella brockmannii  HQ912565  HQ912429
Caloneis amphisbaena  AM501954  AM710507
Caloneis budensis  AM502003  AM710470
Caloneis lauta  AM502039  AM710506
Caloneis lewisii  HQ912580  HQ912444
Caloneis silicula  NA  JN418663
Campylodiscus clypeus  HQ912412  HQ912398
Campylodiscus ralfsii  AJ535162  NA
Campylodiscus thuretii  NA  AB430693
Campylosira cymbelliformis  NA  HQ912487
Catacombas gaillonii  EF423402  NA
Centronella reicheltii  NA  HQ912499
Cerataulina pelagica  NA  HQ912533
Cerataulus smithii  NA  HQ912530
Chaetoceros atlanticus  NA  HQ685849
Chaetoceros calcitrans  AY625894  NA
Chaetoceros debilis  FR865489  NA
Chaetoceros didymus  X85392  HQ710593
Chaetoceros gracilis  AY625895  AY604697
Chaetoceros muelleri  HQ912558  HQ912422
Chaetoceros neogracile  NA  EU090033
Chaetoceros peruvianus  NA  HQ912514
Chaetoceros radicans  NA  AB430666
Chaetoceros rostratus  X85391  NA
Chaetoceros socialis  AY485446  FJ002154
Chrysanthemodiscus sp.  NA  HQ912506
Climaconeis riddleae  NA  HQ912508
Climacosphenia moniligera  AM746974  NA
Climacosphenia sp.  NA  HQ912549
Cocconeis pediculus  AM502010  AM710477
Cocconeis placentula  KT072968  KC736591
Cocconeis placentula var. euglypta  NA  KT072907
Corethron criophilum  X85400  NA
Corethron hystrix  EF192981  AY604696
Corethron inerme  AJ535180  NA
Coscinodiscus concinnus  NA  HQ912545
Coscinodiscus granii  AY485495  HQ912531
Coscinodiscus radiatus  X77705  HQ912424
Coscinodiscus wailesii  NA  HQ912532
Craticula accomoda  KF959652  KF959638
Craticula cuspidata  AM502000  AM710467
Craticula importuna  AM501978  AM710444
Craticula molestiformis  AM501989  AM710455
Ctenophora pulchella  NA  HQ912475
Cyclophora tenuis  AJ535142  HQ912524
Cyclostephanos dubius  HQ912575  HQ912439
Cyclostephanos invisitatus  DQ514899  DQ514827
Cyclostephanos tholiformis  DQ514898  DQ514826
Cyclotella atomus  DQ514858  DQ514779
Cyclotella bodanica  NA  DQ514829
Cyclotella choctawhatcheeana  JF790978  HQ828189
Cyclotella costei  KT072952  KT072901
Cyclotella cryptica  AY485499  FJ002141

Cyclotella distinguenda  DQ514859  DQ514780
Cyclotella gamma  DQ514852  DQ514772
Cyclotella meneghiniana  KT072987  KT072940
Cyclotella ocellata  NA  DQ514832
Cyclotella scaldensis  AY496209  NA
Cyclotella striata  DQ514851  DQ514771
Cyclotella stylorum  NA  DQ514773
Cylindrotheca closterium  DQ082741  DQ082743
Cylindrotheca fusiformis  FR865491  NA
Cymatopleura elliptica  AJ867030.1  HQ912523
Cymatosira belgica  NA  AB430667
Cymbella affinis  NA  AM710485
Cymbella aspera  AM502016  AM710483
Cymbella excisa  KT072997  NA
Cymbella lanceolata  AM502026  AM710493
Cymbella proxima  AM502017  AM710484
Cymbella tumida  NA  KT072945
Cymbopleura naviculiformis  AM501997  AM710464
Delphineis minutissima  NA  JN162770
Delphineis sp.  AY485465  NA
Denticula kuetzingii  NA  HQ912474
Detonula confervacea  DQ514871  HQ912481
Detonula pumila  DQ514892  DQ514814
Diadesmis gallica  AJ867023.1  NA
Diatoma hyemalis  FM164376  NA
Diatoma moniliformis  NA  AB430674
Diatoma tenuis  EF423403  HQ912486
Diatoma vulgaris  EF465466  NA
Dickieia ulvacea  AY485462  NA
Didymosphenia geminata  KT072999  NA
Dimeregramma minor var. nanum  NA  JN162771
Diploneis subovalis  HQ912597  HQ912461
Discostella pseudostelligera  DQ514905  DQ514833
Discostella stelligera  DQ514903  DQ514831
Discotella sp  NA  KT072936
Ditylum brightwellii  AY485444  JN162774
Ditylum intricatum  NA  HQ912542
Encyonema caespitosum  AM502035  AM710502
Encyonema minutum  AM501961  NA
Encyonema silesiacum  KT072991  KT072937
Encyonema sinicum  NA  AY571754
Encyonema triangulatum  AJ535157  NA
Entomoneis alata  AJ535160  FJ002099
Entomoneis ornata  HQ912411  HQ912397
Entomoneis paludosa  AY485468  FJ002140
Entomoneis punctulata  HM805031  NA
Eolimna subminuscula  KT072989  KT072935
Epithemia argus  NA  HQ912394
Epithemia sorex  AB546734  HQ912395
Epithemia turgida  AB546736  HQ912396
Eucampia antarctica  X85389  FJ002126
Eucampia zoodiacus  EF585584  NA
Eucocconeis laevis  KT072990  NA
Eunotia bilunaris  AJ866995.1  HQ912463
Eunotia formica  AB085830  AM710428
Eunotia glacialis  HQ912586  HQ912450
Eunotia implicata  AM502001  AM710468
Eunotia minor  NA  AY571744
Eunotia monodon  AB085831  NA
Eunotia pectinalis  HQ912636  HQ912500
Eunotogramma laevis  NA  AB430668
Extubocellulus cribriger  NA  HQ912435
Extubocellulus spinifer  FR865495  JN162780
Fallacia forcipata  NA  EF143289

Fallacia monoculata  HQ912596  HQ912460
Fallacia pygmaea  HQ912605  HQ912469
Fistulifera pelliculosa  AY485454  JN162792
Fistulifera saprophila  KT072975  KT072923
Fragilaria austriaca  AM497734  NA
Fragilaria barbararum  AJ971376  NA
Fragilaria bidens  AM497732  AB430676
Fragilaria capucina  KT072981  KT072928
Fragilaria capucina var. vaucheriae  KT072972  KC736594
Fragilaria crotonensis  KF959654  KT072903
Fragilaria delicatissima  AM497721  NA
Fragilaria famelica  HQ912588  HQ912452
Fragilaria perminuta  KT072954  NA
Fragilaria rotundissima  NA  JN162781
Fragilaria rumpens  KF959661  KT072939
Fragilaria striatula  AJ971377  DQ222440
Fragilariforma virescens  AM497737  HQ912492
Fragilariopsis cylindrus  AY485467  JN162784
Fragilariopsis kerguelensis  NA  EF423501
Frustulia vulgaris  AM502038  NA
Gomphoneis minutum  NA  KT072946
Gomphonema acuminatum  AM502019  KT072949
Gomphonema affine  KT072970  HQ912472
Gomphonema affine var. affine  NA  KT072918
Gomphonema angustatum  AM502005  NA
Gomphonema angustum  KT072958  KT072908
Gomphonema bourbonense  KT072957  KC736595
Gomphonema capitatum  NA  AY571751
Gomphonema clavatum  KC736622  KC736597
Gomphonema clevei  KT072959  KT072951
Gomphonema exilissimum  KT072955  KT072904
Gomphonema lagenula  KT072965  KT072915
Gomphonema micropus  AM501964  AM710431
Gomphonema parvulum  KT072961  KT072911
Gomphonema parvulum f. saprophilum  KT072960  KT072910
Gomphonema parvulum var. parvulum  NA  KT072929
Gomphonema parvulum var. parvulum f. saprophilum  NA  KT072909
Gomphonema productum  AM501993  AM710460
Gomphonema pumilum var. pumilum  KC736629  KC736599
Gomphonema rosenstockianum  KT072998  KT072948
Gomphonema truncatum  NA  AM710509
Grammatophora marina  AY216906  AB430677
Grammatophora oceanica  HQ912634  HQ912498
Grammonema islandica  AJ535190  NA
Guinardia delicatula  AJ535192  HQ912515
Guinardia flaccida  AJ535191  NA
Guinardia solstherfothii  AY485511  NA
Gyrosigma acuminatum  HQ912598  HQ912462
Gyrosigma limosum  AY485516  NA
Gyrosigma tenuissimum var. hyperborea  NA  JN162786
Halamphora coffeaeformis  NA  FJ002103
Halamphora montana  NA  KC736590
Halamphora normanii  NA  AM710424
Hantzschia amphioxys  HQ912404  NA
Hantzschia amphioxys var. major  NA  HQ912390
Haslea crucigera  AY485482  NA
Haslea ostrearia  AY485523  HE663064
Haslea pseudostrearia  AY485524  NA
Haslea spicula  HM805034  NA
Helicotheca tamesis  X85385  JN162787
Hemiaulus sinensis  NA  HQ912488
Hippodonta capitata  AM501966  AM710432
Hyalodiscus scoticus  NA  AB430660
Hyalodiscus sp.  HQ912649  NA

Hyalosira delicatula  FM164375  AB430678
Hyalosira tropicalis  NA  AB430692
Hyalosynedra laevigata  EF423408  JN162788
Hydrosera sp.  NA  HQ912547
Isthmia enervis  NA  HQ912548
Lampriscus kittonii  NA  AB430669
Lauderia annulata  DQ514849  DQ514769
Lauderia borealis  X85399  NA
Lemnicola hungarica  KT072963  KT072913
Leptocylindrus danicus  AJ535175  NA
Leptocylindrus minimum  AJ535176  NA
Leyanella arenaria  NA  HQ912434
Licmophora abbreviata  AY633761  NA
Licmophora communis  AY633756  NA
Licmophora flabellata  EF423409  NA
Licmophora gracilis  AY633758  NA
Licmophora grandis  EF423411  NA
Licmophora juergensii  AY633759  NA
Licmophora paradoxa  HQ912612  HQ912476
Licmophora reichardtii  EF423412  NA
Lithodesmioides polymorpha  HQ912655  HQ912519
Lithodesmium undulatum  Y10569  DQ514765
Luticola goeppertiana  NA  AM710433
Lyrella atlantica  AJ544659  AY571747
Lyrella hennedyi  NA  AY571755
Mastodiscus radiatus  NA  HQ912539
Mastogloia sp.  NA  HQ912496
Mayamaea atomus  AM501968  AM710434
Mayamaea atomus var. permitis  NA  JN418670
Mayamaea fossalis  KF959655  NA
Mayamaea permitis  KT072962  KC736600
Mediopyxis helysia  AJ968728  NA
Melosira dubia  NA  AB430661
Melosira nummuloides  NA  FJ002129
Melosira varians  KT072969  KT072933
Microfissurata sp.  NA  JN162789
Minidiscus trioculatus  DQ514872  FJ002109
Minutocellus polymorphus  AY485478  HQ912432
Nanofrustulum shiloi  EF491891  FJ002124
Navicula capitatoradiata  KT072982  KT072920
Navicula cari  AM501991  AM710457
Navicula cincta  KT072988  KT072934
Navicula cryptocephala  HQ912603  AM710439
Navicula cryptotenella  AM502015  AM710478
Navicula cryptotenelloides  KT072980  KT072927
Navicula diserta  AJ535159  NA
Navicula duerrenbergiana  NA  AY571749
Navicula gregaria  HM805037  AM710440
Navicula lanceolata  AY485484  FJ002148
Navicula lundii  KT072956  KT072906
Navicula perminuta  HM805044  NA
Navicula phyllepta  FJ624239  NA
Navicula pseudacceptata  JN674064  JQ432376
Navicula radiosa  AM502034  AM710501
Navicula ramosissima  AY485512  FJ002097
Navicula reinhardtii  AM501976  AM710442
Navicula rhynchocephala var. hankensis  NA  JQ432374
Navicula rostellata  KT072966  NA
Navicula salinicola  NA  AY604699
Navicula sclesviscensis  AY485483  FJ002098
Navicula symmetrica  KT072983  KT072930
Navicula tripunctata  KT072979  KT072925
Navicula veneta  AM501971  AM710437
Navicula viridula var. rostellata  NA  KT072916

Neidium affine  HQ912583  HQ912447
Neidium bisulcatum  HQ912591  HQ912455
Neidium productum  HQ912582  HQ912446
Neofragilaria nicobarica  AB433340  NA
Nitzschia acicularis  AJ867000.1  NA
Nitzschia acidoclinata  KT072971  KC736602
Nitzschia aff dissipata  NA  JN162793
Nitzschia amphibia  KT072977  KT072905
Nitzschia apiculata  JF791075  NA
Nitzschia capitellata  KT072978  KT072924
Nitzschia communis  AJ867014.1  NA
Nitzschia curvilineata  NA  JN162794
Nitzschia dissipata  AJ867018.1  NA
Nitzschia draveillensis  KC736635  KC736605
Nitzschia dubiiformis  NA  AB430696
Nitzschia epithemoides  FR865501  NA
Nitzschia filiformis  AJ866999.1  HQ912453
Nitzschia fonticola  AJ867022.1  KT072921
Nitzschia frigida  JQ582669  NA
Nitzschia frustulum  KT072974  KT072922
Nitzschia hantzschiana  KT072967  NA
Nitzschia inconspicua  AJ867021.1  KT072912
Nitzschia linearis  AJ867011.1  KT072917
Nitzschia longissima  AY881968  AY881967
Nitzschia lorenziana  NA  KC736608
Nitzschia microcephala  HM805040  NA
Nitzschia ovalis  FR865500  NA
Nitzschia palea  KT072985  KJ542469
Nitzschia paleacea  AJ866996.1  NA
Nitzschia paleaformis  AJ866997.1  NA
Nitzschia palea var. debilis  KC736638  KJ542486
Nitzschia pusilla  AJ867015.1  KT072926
Nitzschia sigma  AJ867279  NA
Nitzschia sigmoidea  NA  FN557033
Nitzschia supralitorea  AJ867019.1  NA
Nitzschia thermalis  AY485458  NA
Nitzschia vitrea  AJ867280  NA
Odontella aurita  EU818943  HQ912551
Odontella longicruris  NA  FJ002119
Odontella mobiliensis  FR865505  HQ685845
Odontella sinensis  HQ912564  HQ912428
Opephora guenter-grassii  AB436781  NA
Opephora sp.  NA  AB430683
Palmerina hardmaniana  NA  HQ912535
Papiliocellulus elegans  X85388  NA
Papiliocellulus simplex  NA  HQ912494
Paralia sol  AJ535174  NA
Paralia sulcata  HQ912573  EF143287
Pauliella taeniata  AY485528  FJ002105
Petroneis humerosa  NA  AY571757
Phaeodactylum tricornutum  FR744758  AF195952
Pierrecomperia catenuloides  HQ413684  HQ413686
Pinnularia acrosphaeria  KC736641  AM710488
Pinnularia acuminata  JN418597  JN418667
Pinnularia altiplanensis  NA  JN418643
Pinnularia anglica  KT072993  AM710446
Pinnularia borealis  JN418575  JN418640
Pinnularia borealis var. borealis  NA  JN418671
Pinnularia borealis var. subislandica  NA  JN418645
Pinnularia brebissonii  HQ912604  HQ912468
Pinnularia gibba  EF151977  EF143304
Pinnularia grunowii  JN418588  JN418658
Pinnularia interrupta  AJ544658  NA
Pinnularia isselana  JN418594  JN418664

Pinnularia marchica  JN418569  JN418639
Pinnularia mesolepta  AM502024  AM710491
Pinnularia microstauron  AM501983  AM710448
Pinnularia neglectiformis  JN418596  JN418666
Pinnularia neomajor  JN418571  JN418655
Pinnularia nodosa  NA  JN418657
Pinnularia obscura  AM501986  AM710452
Pinnularia parvulissima  NA  JN418661
Pinnularia rupestris  AJ867027.1  AM710458
Pinnularia subanglica  JN418598  JN418668
Pinnularia subcapitata  AM501979  AM710445
Pinnularia subcapitata var. elongata  NA  JN418649
Pinnularia subcommutata var. nonfasciata  JN418584  JN418654
Pinnularia subgibba  KT072984  KT072931
Pinnularia substreptoraphe  AM502036  AM710503
Pinnularia termitina  HQ912601  HQ912465
Pinnularia viridiformis  JN418589  JN418659
Pinnularia viridis  AM502023  AM710490
Placoneis constans  NA  AY571752
Placoneis elginensis  KT072964  KT072914
Placoneis hambergii  AM502030  AM710497
Placoneis paraelginensis  NA  AY571753
Plagiogramma atomus  AB433338  NA
Plagiogramma staurophorum  NA  HQ912520
Plagiostriata goreensis  NA  AB430684
Planktoniella sol  HQ912562  HQ912426
Planothidium frequentissimum  KT072986  KT072932
Pleurosigma elongatum  NA  HQ685847
Pleurosigma planktonicum  AY485514  NA
Pleurosira laevis  AJ535188  HQ912449
Podocystis spathulata  NA  HQ912525
Podosira stelligera  AY485507  HQ912431
Porosira glacialis  HQ912619  DQ514767
Porosira pseudodelicatula  NA  DQ514768
Porosira pseudodenticulata  DQ436461  DQ108387
Prestauroneis integra  AM502025  AM710492
Proboscia alata  AJ535181  NA
Proboscia indica  AY485470  NA
Proboscia inermis  EF192984  NA
Psammodictyon constrictum  NA  AB430697
Psammodictyon panduriforme  AY485485  FJ002125
Psammoneis japonica  AB433336  NA
Psammoneis pseudojaponica  AB433339  NA
Psammoneis senegalensis  AB433341  NA
Pseudogomphonema kamschaticum  NA  AY571748
Pseudogomphonema sp.  AJ535152  NA
Pseudohimantidium pacificum  NA  AB430685
Pseudo-nitzschia americana  NA  JN162807
Pseudo-nitzschia australis  GU373961  NA
Pseudo-nitzschia brasiliana  NA  FJ150740
Pseudo-nitzschia caciantha  NA  DQ813821
Pseudo-nitzschia calliantha  NA  FJ150760
Pseudo-nitzschia cuspidata  GU373960  DQ813820
Pseudo-nitzschia delicatissima  NA  FJ150743
Pseudo-nitzschia dolorosa  NA  DQ813822
Pseudo-nitzschia fraudulenta  NA  EF520333
Pseudo-nitzschia galaxiae  NA  EF423514
Pseudo-nitzschia mannii  NA  DQ813824
Pseudo-nitzschia multiseries  GU373964  NA
Pseudo-nitzschia multistriata  NA  FJ150757
Pseudo-nitzschia pseudodelicatissima  NA  DQ813817
Pseudo-nitzschia pungens  GU373968  FJ150759
Pseudo-nitzschia turgiduloides  NA  EF423508
Pseudostaurosira brevistriata  NA  HQ828191

Pseudostaurosira elliptica  EF423414  NA
Pseudostaurosira zeilleri var. elliptica  EF465473  NA
Pseudostaurosiropsis sp.  EF465477  HQ828195
Pseudostriatella pacifica  AB379680  AB430686
Pteroncola inane  NA  AB430687
Punctastriata sp.  NA  HQ828197
Reimeria sinuata  KT072996  KT072947
Rhabdonema minutum  NA  AB430682
Rhaphoneis amphiceros  AB433337  HQ912537
Rhaphoneis belgicae  X77703  NA
Rhizosolenia fallax  AY485480  FJ002127
Rhizosolenia imbricata  AJ535178  HQ912527
Rhizosolenia pungens  AY485486  FJ002101
Rhizosolenia setigera  AY485461  AB430662
Rhizosolenia shrubsolei  AY485510  FJ002128
Rhizosolenia similoides  JF791038  NA
Rhopalodia contorta  HQ912406  HQ912392
Rhopalodia gibba  AB546738  HQ912393
Rossia sp.  EF151968  EF143281
Scoliopleura peisonis  HQ912609  HQ912473
Sellaphora auldreekie  EF151965  EF143276
Sellaphora bacillum  EF151980  EF143311
Sellaphora blackfordensis  JN418599  EF143290
Sellaphora capitata  EF151971  EF143286
Sellaphora laevissima  EF151979  EF143313
Sellaphora lanceolata  EF151978  EF143315
Sellaphora minima  KT072973  KT072919
Sellaphora pupula  EF151982  EF143294
Sellaphora seminulum  KT072976  KT072950
Seminavis robusta  FJ624252  EF143299
Shionodiscus ritscheri  DQ514891  DQ514813
costatum  AY485473  FJ002107
Skeletonema dohrnii  AJ632211  JN162821
DQ011158  DQ514818
DQ011160  DQ514822
AB572842  JN162823
Skeletonema menzelii  AJ535168  DQ514821
Skeletonema pseudocostatum  X85393  DQ514819
Skeletonema subsalsum  AY684967  FJ002149
Skeletonema tropicum  EF138941  NA
Stauroneis acuta  HQ912579  HQ912443
Stauroneis anceps  AM502008  AM710475
Stauroneis gracilior  AM501988  AM710454
Stauroneis kriegeri  AM502037  AM710504
Stauroneis phoenicenteron  AM502031  AM710498
Staurosira construens  EF465467  HQ912451
Staurosira elliptica  NA  HQ828193
Staurosira mutabilis  AM746972  NA
Staurosira venter  NA  KT072941
Staurosirella martyi  NA  HQ828192
Staurosirella pinnata  EF465472  HQ912484
Stellarima microtrias  AY485477  EU090032
Stenopterobia curvula  HQ912416  HQ912402
Stephanodiscus agassizensis  DQ514895  DQ514823
Stephanodiscus binderanus  DQ514896  DQ514824
Stephanodiscus hantzschii  DQ093370  DQ514842
Stephanodiscus minutulus  DQ514900  DQ514843
Stephanodiscus neoastraea  DQ514906  DQ514834
Stephanodiscus niagarae  DQ514907  DQ514835
Stephanodiscus parvus  KT072953  KT072902
Stephanodiscus reimerii  DQ514909  DQ514837
Stephanodiscus yellowstonensis  DQ514910  DQ514838
Stephanopyxis palmeriana  AY485527  NA
Stephanopyxis turris  NA  HQ912521

Striatella unipunctata  NA  AB430689
Surirella angusta  KT072995  KT072944
Surirella brebissoni  AJ867029.1  NA
Surirella fastuosa  AJ535161  NA
Surirella minuta  HQ912658  HQ912522
Surirella splendida  HQ912415  HQ912401
Synedra fragilarioides  EF193001  FJ002123
Synedra minuscula  EF423415  JN162825
Synedra toxoneides  EF423421  NA
Synedropsis hyperborea  AY485464  HQ912485
Synedropsis recta  HQ912616  HQ912480
Tabellaria flocculosa  EF423416  HQ912448
Tabularia fasciculata  EF423417  JN162826
Tabularia laevis  NA  AB430690
Tabularia tabulata  AY216907  FJ002156
Talaroneis posidoniae  JF791047  NA
Terpsinoe musica  NA  HQ912546
Thalassionema bacillare  EF423418  NA
Thalassionema frauenfeldii  EF423419  FJ002113
Thalassionema nitzschioides  X77702  NA
Thalassiosira aestivalis  DQ514873  DQ514794
Thalassiosira allenii  HM991688  NA
Thalassiosira angulata  DQ514867  DQ514788
Thalassiosira anguste-lineata  AJ810854  DQ514786
Thalassiosira antarctica  EF192993  DQ514795
Thalassiosira concaviuscula  HM991689  NA
Thalassiosira conferta  NA  HQ710594
Thalassiosira curviseriata  HM991690  NA
Thalassiosira delicatula  AJ810855  NA
Thalassiosira eccentrica  HM991691  DQ514789
Thalassiosira fluviatilis  AJ535170  NA
Thalassiosira gessneri  DQ514864  DQ514785
Thalassiosira guillardii  DQ514869  DQ514796
Thalassiosira hendeyi  AM050629  NA
Thalassiosira lundiana  HM991692  NA
Thalassiosira mala  HM991693  NA
Thalassiosira mediterranea  NA  DQ514806
Thalassiosira minima  FR865522  DQ514797
Thalassiosira minuscula  DQ514882  DQ514809
Thalassiosira nodulolineata  DQ514866  DQ514787
Thalassiosira nordenskioeldii  DQ093365  AB018007
Thalassiosira oceanica  DQ514878  DQ514799
Thalassiosira oestrupii  DQ514870  DQ514791
Thalassiosira pacifica  HM991697  DQ514810
Thalassiosira profunda  AM235383  NA
Thalassiosira pseudonana  AY485452  HQ912419
Thalassiosira punctigera  AJ810856  JN162828
Thalassiosira rotula  AF374480  DQ514805
Thalassiosira tenera  AJ810858  NA
Thalassiosira tumida  DQ514890  DQ514812
Thalassiosira weissflogii  GQ281043  DQ514811
Thalassiothrix longissima  NA  AB430691
Toxarium hennedyanum  NA  HQ912526
Toxarium undulatum  NA  HQ912518
Triceratium dubium  HQ912572  HQ912436
Triceratium orbiculatum  NA  HQ912543
Triceratium shadboltianum  NA  HQ912544
Trigonium formosum  NA  HQ912512
Tryblionella apiculata  HQ912600  HQ912464
Ulnaria acus  AJ866994.1  KF959645
Ulnaria ulna  AJ866993.1  KT072942
Ulnaria ulna var. acus  NA  KT072899
Ulnaria ulna var. angustissima  NA  KT072900
Undatella sp.  AJ535163  NA

Urosolenia eriensis  HQ912577  HQ912441

B.2 Additional material for Chapter 4

B.2.1 Linnaean names and clusters of species presented in Figure 4.2

Code  Name  Cluster
AAMB  Aulacoseira ambigua  2
ACOF  Amphora coffeaeformis var. coffeaeformis  49
ADMI  Achnanthidium minutissimum  38
AFOR  Asterionella formosa  12
AGCU  Aulacoseira granulata  2
ALIB  Amphora libyca  48
AMFO  Amphora fogediana  48
AMMO  Amphora montana  49
ANOR  Amphora normanii  49
APED  Amphora pediculus  48
ASPH  Anomoeoneis sphaerophora  40
AUAL  Aulacoseira alpigena  2
AUDI  Aulacoseira distans  2
AUIS  Aulacoseira islandica  2
AUSU  Aulacoseira subarctica  2
AUVA  Aulacoseira valida  2
BPAX  Bacillaria paxillifera var. paxillifera  29
BRUT  Berkeleya rutilans  47
CABU  Caloneis budensis  60
CACY  Campylosira cymbelliformis  3
CAEX  Cymbella excisa var. excisa  42
CAFF  Cymbella affinis var. affinis  42
CASP  Cymbella aspera  42
CATO  Cyclotella atomus  3
CBEL  Cymatosira belgica  3
CBNA  Cymbopleura naviculiformis var. naviculiformis  42
CBOD  Cyclotella bodanica var. bodanica  4
CCHO  Cyclotella choctawhatcheeana  3
CCLO  Cylindrotheca closterium  25
CCOS  Cyclotella costei  4
CCRY  Cyclotella cryptica  3
CDTG  Cyclotella distinguenda var. distinguenda  9
CDUB  Cyclostephanos dubius  6
CELL  Cymatopleura elliptica var. elliptica  51
CERE  Centronella reicheltii  19
CGAI  Catacombas gaillonii  17
CHMU  Chaetoceros muelleri  11

CINV  Cyclostephanos invisitatus  6
CLAN  Cymbella lanceolata var. lanceolata  42
CLAU  Caloneis lauta  64
CMEN  Cyclotella meneghiniana  3
CMLF  Craticula molestiformis  55
COCE  Cyclotella ocellata  5
CPED  Cocconeis pediculus  39
CPLA  Cocconeis placentula var. placentula  39
CPLE  Cocconeis placentula var. euglypta  39
CPRX  Cymbella proxima var. proxima  42
CRAC  Craticula accomoda  55
CRCU  Craticula cuspidata  55
CSCD  Cyclotella scaldensis  3
CSIL  Caloneis silicula  64
CSTR  Cyclotella striata  3
CTHO  Cyclostephanos tholiformis  6
CTPU  Ctenophora pulchella  16
CTUM  Cymbella tumida  42
CWAI  Coscinodiscus wailesii  1
DGAL  Diadesmis gallica  38
DHIE  Diatoma hyemalis var. hyemalis  12
DITE  Diatoma tenuis  12
DKUE  Denticula kuetzingii var. kuetzingii  24
DMON  Diatoma moniliformis  12
DPST  Discostella pseudostelligera  8
DSBO  Diploneis subovalis  56
DSTE  Discostella stelligera  8
DULV  Dickieia ulvacea  40
DVBR  Diatoma vulgaris  12
EALA  Entomoneis alata  52
EARG  Epithemia argus var. argus  50
EBIL  Eunotia bilunaris var. bilunaris  68
ECAE  Encyonema caespitosum  42
EFOR  Eunotia formica  68
EGLA  Eunotia glacialis  68
EIMP  Eunotia implicata  68
EMIN  Eunotia minor  68
EMNT  Encyonema minutum  42
EMON  Eunotia monodon var. monodon  68
EORN  Entomoneis ornata  52
EPAL  Entomoneis paludosa var. paludosa  52
EPEC  Eunotia pectinalis var. pectinalis  68
EPTU  Entomoneis punctulata  52
ESBM  Eolimna subminuscula  55
ESLE  Encyonema silesiacum  42
ESNC  Encyonema sinicum  42
ESOR  Epithemia sorex  50
ETUR  Epithemia turgida var. turgida  50
EULA  Eucocconeis laevis  38
FAUT  Fragilaria austriaca  19
FBID  Fragilaria bidens  19
FCA1  Fragilaria capucina  19

FCRO  Fragilaria crotonensis  19
FCVA  Fragilaria capucina var. vaucheriae  19
FDEL  Fragilaria delicatissima  14
FFAM  Fragilaria famelica var. famelica  16
FFOR  Fallacia forcipata  65
FFVI  Fragilariforma virescens  14
FMOC  Fallacia monoculata  65
FPCY  Fragilariopsis cylindrus  23
FPEL  Fistulifera pelliculosa  53
FPEM  Fragilaria perminuta  19
FRUM  Fragilaria rumpens  19
FSAP  Fistulifera saprophila  53
FVUL  Frustulia vulgaris  47
GACU  Gomphonema acuminatum  44
GAFF  Gomphonema affine  44
GANG  Gomphonema angustatum  44
GBOB  Gomphonema bourbonense  44
GCAP  Gomphonema capitatum  44
GCLA  Gomphonema clavatum  44
GCLE  Gomphonema clevei  44
GEXL  Gomphonema exilissimum  46
GLGN  Gomphonema lagenula  45
GLIM  Gyrosigma limosum  31
GPAR  Gomphonema parvulum var. parvulum f. parvulum  45
GPAS  Gomphonema parvulum var. parvulum f. saprophilum  45
GPRO  Gomphonema productum  44
GRMA  Grammatophora marina  13
GROS  Gomphonema rosenstockianum  43
GTHY  Gyrosigma tenuissimum var. hyperborea  30
GTRU  Gomphonema truncatum  44
GYAC  Gyrosigma acuminatum  31
HAMP  Hantzschia amphioxys  27
HCAP  Hippodonta capitata  32
HCRU  Haslea crucigera  30
HPDO  Haslea pseudostrearia  30
HSPC  Haslea spicula  30
LGOE  Luticola goeppertiana  57
LHUN  Lemnicola hungarica  40
MAAT  Mayamaea atomus  59
MAFO  Mayamaea fossalis  59
MAPE  Mayamaea atomus var. permitis  59
MHEL  Mediopyxis helysia  3
MNUM  Melosira nummuloides  1
MPMI  Mayamaea permitis  59
MVAR  Melosira varians  2
NACD  Nitzschia acidoclinata  21
NACI  Nitzschia acicularis  26
NAPI  Nitzschia apiculata  25
NBIS  Neidium bisulcatum  58
NCAR  Navicula cari  37
NCIN  Navicula cincta  36
NCOM  Nitzschia communis  26

NCPL  Nitzschia capitellata  26
NCPR  Navicula capitatoradiata  37
NCRY  Navicula cryptocephala  35
NCTE  Navicula cryptotenella  35
NCTO  Navicula cryptotenelloides  35
NDIS  Nitzschia dissipata var. dissipata  28
NDIT  Navicula diserta  30
NDRA  Nitzschia draveillensis  26
NDUR  Navicula duerrenbergiana  36
NEAF  Neidium affine  58
NEPR  Neidium productum  58
NFIL  Nitzschia filiformis var. filiformis  25
NFON  Nitzschia fonticola  20
NGRE  Navicula gregaria  34
NHAN  Nitzschia hantzschiana  24
NIFR  Nitzschia frustulum var. frustulum  22
NILG  Nitzschia longissima  27
NINC  Nitzschia inconspicua  24
NIPU  Nitzschia pusilla  26
NIVI  Nitzschia vitrea var. vitrea  25
NLAN  Navicula lanceolata  37
NLIN  Nitzschia linearis var. linearis  27
NLOR  Nitzschia lorenziana  29
NLUN  Navicula lundii  36
NMIC  Nitzschia microcephala  23
NOVA  Nitzschia ovalis  23
NPAD  Nitzschia palea var. debilis  26
NPAE  Nitzschia paleacea  26
NPAL  Nitzschia palea  26
NPHY  Navicula phyllepta  35
NPNU  Navicula perminuta  30
NRAD  Navicula radiosa  37
NREI  Navicula reinhardtii  35
NROS  Navicula rostellata  36
NSIG  Nitzschia sigma  25
NSIO  Nitzschia sigmoidea  29
NSLC  Navicula salinicola  30
NSYM  Navicula symmetrica  36
NTHM  Nitzschia thermalis  26
NTPT  Navicula tripunctata  37
NVEN  Navicula veneta  33
NVRO  Navicula viridula var. rostellata  36
NZSU  Nitzschia supralitorea  23
OAUR  Odontella aurita  3
PACR  Pinnularia acrosphaeria var. acrosphaeria  61
PBMU  Pinnularia brebissonii  61
PBOR  Pinnularia borealis var. borealis  61
PBSI  Pinnularia borealis var. subislandica  61
PCOS  Placoneis constans  41
PELG  Placoneis elginensis  41
PELO  Pleurosigma elongatum  30
PGIB  Pinnularia gibba  61

PGST  Plagiogramma staurophorum  11
PIAN  Pinnularia anglica  61
PINT  Pinnularia interrupta  61
PLFR  Planothidium frequentissimum  39
PMEU  Pinnularia mesolepta  61
PMI3  Pinnularia microstauron  62
PNGF  Pinnularia neglectiformis  61
PNOD  Pinnularia nodosa var. nodosa  61
POMU  Pinnularia obscura  63
PPAN  Psammodictyon panduriforme  27
PPVS  Pinnularia parvulissima  62
PRUP  Pinnularia rupestris  61
PSBR  Pseudostaurosira brevistriata  12
PSCA  Pinnularia subcapitata var. subcapitata  61
PSEL  Pinnularia subcapitata var. elongata  61
PSGI  Pinnularia subgibba var. subgibba  61
PSUL  Paralia sulcata  1
RAMP  Rhaphoneis amphiceros  11
RGIB  Rhopalodia gibba var. gibba  50
RSIN  Reimeria sinuata  43
SAGA  Stephanodiscus agassizensis  7
SANG  Surirella angusta  51
SBIN  Stephanodiscus binderanus  6
SCON  Staurosira construens  12
SCPE  Scoliopleura peisonis  57
SEBA  Sellaphora bacillum  66
SELA  Sellaphora laevissima  66
SELI  Staurosira elliptica  12
SEMN  Sellaphora minima  65
SGRL  Stauroneis gracilior  54
SHAN  Stephanodiscus hantzschii  6
SKPC  Skeletonema pseudocostatum  3
SKSS  Skeletonema subsalsum  3
SLMA  Staurosirella martyi  12
SMIN  Synedra minuscula  18
SNEO  Stephanodiscus neoastraea  6
SPAV  Stephanodiscus parvus  6
SPHO  Stauroneis phoenicenteron  54
SPIN  Staurosirella pinnata  12
SPUP  Sellaphora pupula  67
SSEM  Sellaphora seminulum  65
SSMU  Staurosira mutabilis  12
SSPL  Surirella splendida  51
SSVE  Staurosira venter  12
STAC  Stauroneis acuta  54
STAN  Stauroneis anceps  54
STCU  Stenopterobia curvula  51
STKR  Stauroneis kriegeri  54
STMI  Stephanodiscus minutulus  6
SUMI  Surirella minuta  51
TAPI  Tryblionella apiculata  25
TFAS  Tabularia fasciculata  17

TFLO  Tabellaria flocculosa  12
TFLU  Thalassiosira fluviatilis  3
TGES  Thalassiosira gessneri  3
THNI  Thalassionema nitzschioides  15
TMED  Thalassiosira mediterranea  3
TMUS  Terpsinoe musica  10
TNOR  Thalassiosira nordenskioeldii  3
TPSD  Talaroneis posidoniae  1
TPSN  Thalassiosira pseudonana  3
TTAB  Tabularia tabulata  17
TTEN  Thalassiosira tenera  3
TWEI  Thalassiosira weissflogii  3
UACU  Ulnaria acus  19
UERI  Urosolenia eriensis  11
UUAC  Ulnaria ulna var. acus  19
UUAN  Ulnaria ulna var. angustissima  19
UULN  Ulnaria ulna  19

B.2.2 Quality of approximation of IPS index

The IPS index reads

\[
\mathrm{IPS} = \frac{\sum_{i \in I} a_i v_i s_i}{\sum_{k \in I} a_k v_k} \tag{B.1}
\]

where

– $I$ is the set of species

– $a_i$ is the abundance of species $i$ in the sample

– $s_i$ is the sensitivity of species $i$ to pollution

– $v_i$ is the variance of this sensitivity within the species.

Let us assume that species are categorized into clusters. Let us denote by $\gamma \in \Gamma$ the clusters. The sum $\sum_i$ can be written as a sum over the clusters and, within each cluster, over the species in this cluster, i.e.

\[
\sum_{i \in I} = \sum_{\gamma \in \Gamma} \sum_{i \in \gamma}
\]

Let us define

\[
v_\gamma = \frac{1}{|\gamma|} \sum_{i \in \gamma} v_i, \qquad
s_\gamma = \frac{1}{|\gamma|} \sum_{i \in \gamma} s_i, \qquad
a_\gamma = \sum_{i \in \gamma} a_i
\]

where $|\gamma|$ is the number of species $i$ in cluster $\gamma$.
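The definitions above can be illustrated with a minimal Python sketch. It evaluates the species-level index (B.1) and the cluster aggregates $a_\gamma$, $v_\gamma$ and $s_\gamma$ on a toy sample. The function names `ips` and `cluster_aggregates`, the cluster labels and all numerical values are hypothetical and chosen only for illustration; they do not come from the thesis data.

```python
from collections import defaultdict


def ips(a, v, s):
    """Species-level index (B.1): sum(a_i v_i s_i) / sum(a_k v_k)."""
    numerator = sum(ai * vi * si for ai, vi, si in zip(a, v, s))
    denominator = sum(ai * vi for ai, vi in zip(a, v))
    return numerator / denominator


def cluster_aggregates(a, v, s, cluster):
    """Return {gamma: (a_gamma, v_gamma, s_gamma)} for each cluster label."""
    members = defaultdict(list)
    for i, g in enumerate(cluster):
        members[g].append(i)
    aggregates = {}
    for g, idx in members.items():
        n = len(idx)                          # |gamma|
        a_g = sum(a[i] for i in idx)          # total abundance in the cluster
        v_g = sum(v[i] for i in idx) / n      # unweighted mean of v_i
        s_g = sum(s[i] for i in idx) / n      # unweighted mean of s_i
        aggregates[g] = (a_g, v_g, s_g)
    return aggregates


# Hypothetical toy sample: four species in two clusters.
a = [10.0, 30.0, 5.0, 55.0]
v = [1.0, 2.0, 3.0, 1.0]
s = [4.0, 5.0, 2.0, 3.0]
cluster = ["g1", "g1", "g2", "g2"]

print(ips(a, v, s))                            # equals 535 / 140
print(cluster_aggregates(a, v, s, cluster))
```

Note that $v_\gamma$ and $s_\gamma$ are unweighted means over the species of a cluster, whereas $a_\gamma$ is a sum, exactly as in the definitions above.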

The aggregated IPS index is defined as

\[
\mathrm{IPS}_P = \frac{\sum_{\gamma} a_\gamma v_\gamma s_\gamma}{\sum_{\alpha} a_\alpha v_\alpha} \tag{B.2}
\]

where sums are over clusters. Let us write, for each cluster $\gamma$ and each species $i \in \gamma$

\[
v_i = v_\gamma + dv_{i\gamma}, \qquad s_i = s_\gamma + ds_{i\gamma} \tag{B.3}
\]

where the clusters are designed such that the within-cluster variations of $s_i$ and $v_i$ are small. Then

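\[
\begin{aligned}
\mathrm{IPS} &= \frac{\sum_{i \in I} a_i v_i s_i}{\sum_{k \in I} a_k v_k} \\
&= \frac{\sum_{i \in I} a_i (v_\gamma + dv_{i\gamma})(s_\gamma + ds_{i\gamma})}{\sum_{k \in I} a_k (v_\gamma + dv_{k\gamma})} \\
&= \frac{\sum_{i \in I} \left( a_i v_\gamma s_\gamma + a_i s_\gamma \, dv_{i\gamma} + a_i v_\gamma \, ds_{i\gamma} \right)}{\sum_{k \in I} \left( a_k v_\gamma + a_k \, dv_{k\gamma} \right)}
\end{aligned}
\tag{B.4}
\]

if second order terms are ignored. In each sum, $\gamma$ denotes the cluster of the species being summed.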

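Let us note that

\[
\begin{aligned}
\sum_{i \in I} a_i v_\gamma &= \sum_{\gamma} \sum_{i \in \gamma} a_i v_\gamma \\
&= \sum_{\gamma} \Big( \sum_{i \in \gamma} a_i \Big) v_\gamma \\
&= \sum_{\gamma} a_\gamma v_\gamma
\end{aligned}
\tag{B.5}
\]

Let us denote

\[
C = \sum_{\gamma} a_\gamma v_\gamma
\]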

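We have

\[
\begin{aligned}
\frac{1}{\sum_{k \in I} \left( a_k v_\gamma + a_k \, dv_{k\gamma} \right)}
&= \frac{1}{\sum_{k} a_k v_\gamma} - \sum_{k} \frac{a_k}{\left( \sum_{m} a_m v_\gamma \right)^2} \, dv_{k\gamma} \\
&= \frac{1}{C} - \frac{1}{C^2} \sum_{k} a_k \, dv_{k\gamma}
\end{aligned}
\tag{B.6}
\]

at first order. Hence

\[
\begin{aligned}
\mathrm{IPS} &= \Big( \sum_{\gamma} a_\gamma v_\gamma s_\gamma + \sum_{i} a_i s_\gamma \, dv_{i\gamma} + \sum_{i} a_i v_\gamma \, ds_{i\gamma} \Big) \Big( \frac{1}{C} - \frac{1}{C^2} \sum_{k} a_k \, dv_{k\gamma} \Big) \\
&= \frac{A}{C} + \frac{1}{C} \sum_{i} a_i \Big( s_\gamma - \frac{A}{C} \Big) dv_{i\gamma} + \frac{1}{C} \sum_{i} a_i v_\gamma \, ds_{i\gamma} \\
&= \frac{A}{C} + \frac{1}{C} \sum_{\gamma} \Big( s_\gamma - \frac{A}{C} \Big) \Big( \sum_{i \in \gamma} a_i \, dv_{i\gamma} \Big) + \frac{1}{C} \sum_{\gamma} v_\gamma \Big( \sum_{i \in \gamma} a_i \, ds_{i\gamma} \Big)
\end{aligned}
\tag{B.7}
\]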

where $A = \sum_{\gamma} a_\gamma v_\gamma s_\gamma$. We recognize

\[
\mathrm{IPS}_P = \frac{A}{C}
\]

and, finally

    ∑ ( ) ∑ ∑ ∑ 1 A   1   IPS − IPSP = sγ − aidviγ + vγ aidsiγ (B.8) C γ C i∈γ C γ i∈γ

Terms involved are

\[
\sum_{i \in \gamma} a_i \, dv_{i\gamma}, \qquad \sum_{i \in \gamma} a_i \, ds_{i\gamma}
\]

which are expected to be small. Indeed, by construction, $\sum_{i \in \gamma} dv_{i\gamma} = \sum_{i \in \gamma} ds_{i\gamma} = 0$. These terms will therefore be negligible if abundances are well balanced within each cluster. The discrepancy between the two indices can only be significant if a species in cluster $\gamma$ combines a dominant abundance with a large deviation from its cluster mean, i.e. a large $dv_{i\gamma}$ or $ds_{i\gamma}$. The groups have been built to avoid, or minimize, such variability within clusters.
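As a numerical sanity check of this first-order result, the sketch below computes the species-level index, the aggregated index and the first-order estimate of their difference on a toy sample with small within-cluster deviations. The function name `compare_indices`, the cluster labels and the numbers are hypothetical and serve only as an illustration under these assumptions.

```python
from collections import defaultdict


def compare_indices(a, v, s, cluster):
    """Return (IPS, IPS_P, first-order estimate of IPS - IPS_P)."""
    groups = defaultdict(list)
    for i, g in enumerate(cluster):
        groups[g].append(i)

    # Species-level index: sum(a_i v_i s_i) / sum(a_k v_k).
    ips = sum(x * y * z for x, y, z in zip(a, v, s)) / sum(
        x * y for x, y in zip(a, v)
    )

    # Cluster means, then A = sum(a_g v_g s_g) and C = sum(a_g v_g).
    means = {}
    A = C = 0.0
    for g, idx in groups.items():
        a_g = sum(a[i] for i in idx)
        v_g = sum(v[i] for i in idx) / len(idx)
        s_g = sum(s[i] for i in idx) / len(idx)
        means[g] = (v_g, s_g)
        A += a_g * v_g * s_g
        C += a_g * v_g
    ips_p = A / C  # aggregated index

    # First-order difference: abundance-weighted deviations within clusters.
    gap = 0.0
    for g, idx in groups.items():
        v_g, s_g = means[g]
        sum_a_dv = sum(a[i] * (v[i] - v_g) for i in idx)
        sum_a_ds = sum(a[i] * (s[i] - s_g) for i in idx)
        gap += (s_g - ips_p) * sum_a_dv + v_g * sum_a_ds
    return ips, ips_p, gap / C


# Hypothetical toy sample: two clusters with small within-cluster deviations.
a = [10.0, 30.0, 20.0, 40.0]
v = [2.0, 2.1, 1.0, 1.1]
s = [4.0, 4.2, 2.0, 1.9]
cluster = ["g1", "g1", "g2", "g2"]

ips, ips_p, gap = compare_indices(a, v, s, cluster)
print(ips - ips_p, gap)  # the two values agree up to second-order terms
```

On this toy sample the exact difference and the first-order estimate are both close to 0.019 and differ only by terms of second order in the deviations.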

Let us notice that this can be summarized as

    ∑ ( ) ∑ ∑ 1 A     IPS − IPSP = ∆γ, ∆γ = sγ − aidviγ + vγ aidsiγ (B.9) C γ C i∈γ i∈γ

Then, the quality of aggregation of the index can be estimated globally by $C^{-1}$ and per cluster by $\Delta_\gamma$. Globally, the higher the variance of the index over the community, the better the aggregation, as $C = \sum_{\gamma} a_\gamma v_\gamma$. For one cluster, the discrepancy due to the aggregation is lower when

– the mean index $s_\gamma$ over species in cluster $\gamma$ is close to the aggregated index $\mathrm{IPS}_P$

– the mean variance within the cluster $v_\gamma$ is low

– the mean discrepancies weighted by abundances are low (terms $\sum_{i} a_i \, ds_{i\gamma}$ and $\sum_{i} a_i \, dv_{i\gamma}$)
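The per-cluster diagnostic can be sketched in Python as follows. The example contrasts a cluster whose two species have balanced abundances with the same cluster dominated by one species. The function name `cluster_discrepancies` and all values are hypothetical; this is an illustration of the formula for $\Delta_\gamma$, not the procedure used to build the clusters in Chapter 4.

```python
from collections import defaultdict


def cluster_discrepancies(a, v, s, cluster):
    """Return ({gamma: Delta_gamma}, C) for the per-cluster diagnostic."""
    groups = defaultdict(list)
    for i, g in enumerate(cluster):
        groups[g].append(i)

    means = {}
    A = C = 0.0
    for g, idx in groups.items():
        a_g = sum(a[i] for i in idx)
        v_g = sum(v[i] for i in idx) / len(idx)
        s_g = sum(s[i] for i in idx) / len(idx)
        means[g] = (v_g, s_g)
        A += a_g * v_g * s_g
        C += a_g * v_g

    deltas = {}
    for g, idx in groups.items():
        v_g, s_g = means[g]
        sum_a_dv = sum(a[i] * (v[i] - v_g) for i in idx)
        sum_a_ds = sum(a[i] * (s[i] - s_g) for i in idx)
        # Delta_gamma = (s_gamma - A/C) * sum(a_i dv) + v_gamma * sum(a_i ds)
        deltas[g] = (s_g - A / C) * sum_a_dv + v_g * sum_a_ds
    return deltas, C


# Hypothetical species values shared by both scenarios.
v = [2.0, 2.1, 1.0, 1.1]
s = [4.0, 4.2, 2.0, 1.9]
cluster = ["g1", "g1", "g2", "g2"]

# Cluster g1 holds 40 individuals in both scenarios; only their split changes.
balanced, _ = cluster_discrepancies([20.0, 20.0, 20.0, 40.0], v, s, cluster)
skewed, _ = cluster_discrepancies([2.0, 38.0, 20.0, 40.0], v, s, cluster)
print(balanced["g1"], skewed["g1"])
```

With balanced abundances the weighted deviations cancel and $\Delta_\gamma$ vanishes for that cluster, while a dominant species with a non-zero deviation yields a clearly non-zero $\Delta_\gamma$.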

B.3 Additional material for Appendix A

This content is available online. Larras F., Keck F., Montuelle B., Rimet F., Bouchez A. (2014). Linking Diatom Sensitivity to Herbicides to Phylogeny: A Step Forward for Biomonitoring? Figshare Repository. https://dx.doi.org/10.1021/es4045105.s001

– Table S1: Sensitivity data (EC10 in ) of each diatom species for each herbicide

– Table S2: Sensitivity data (EC50 in ) of each diatom species for each herbicide

– Table S3: GenBank accession numbers for 18S and rbcL markers

– Table S4: Results of traditional and phylogenetic regressions

– Figure S1: Phylogenetic trees of the 14 diatom species computed with all combinations of methods and markers

B.4 Data accessibility

B.4.1 Genetic sequences and phylogenetic data

This content is available online. Keck F., Rimet F., Franc A., Bouchez A. (2015). Data from: Phylogenetic signal in diatom ecology: perspectives for aquatic ecosystems biomonitoring. Dryad Digital Repository. http://dx.doi.org/10.5061/dryad.0db3f

– Alignment of the 18S sequences (all)

– Alignment of the 18S sequences (after pruning)

– Taxonomic names of 18S sequences

– Alignment of the rbcL sequences (all)

– Alignment of the rbcL sequences (after pruning)

– Taxonomic names of rbcL sequences

– Alignment of the 18S and rbcL sequences (after union)

– Alignment of the 18S and rbcL sequences (after intersection)

– Phylogenetic tree of diatoms (union data)

– Phylogenetic tree of diatoms (intersection data)

B.4.2 Diatom community samples data

This content is available online. Rimet F., Berthon V., Bouchez A. (2016). Diatom community samples. Figshare Repository. https://dx.doi.org/10.6084/m9.figshare.2068590.v2

– Diatom species abundances

– Chemistry data

Bibliography

Abouheif, E. (1999). “A method for testing the assumption of phylogenetic independence in comparative data”. In: Evolutionary Ecology Research 1.8, pp. 895–909.

Adams, D. C. (2014). “A Generalized K Statistic for Estimating Phylogenetic Signal from Shape and other High-dimensional Multivariate Data”. In: Systematic Biology 63.5, pp. 685–697.

Adl, S. M. et al. (2005). “The New Higher Level Classification of Eukaryotes with Emphasis on the Taxonomy of Protists”. In: Journal of Eukaryotic Microbiology 52.5, pp. 399–451.

Altenburger, R. and M. Schmitt-Jansen (2003). “Predicting toxic effects of contaminants in ecosystems using single species investigations”. In: Bioindicators & biomonitors: principles, concepts, and applications. Ed. by B. A. Markert, A. M. Breure, and H. G. Zechmeister. Amsterdam; Boston: Elsevier, pp. 153–198.

Alverson, A. J. and E. C. Theriot (2005). “Comments on Recent Progress Toward Reconstructing the Diatom Phylogeny”. In: Journal of Nanoscience and Nanotechnology 5.1, pp. 57–62.

Amiard-Triquet, C., C. Cossu-Leguille, and C. Mouneyrac (2013). “Biomarkers of Defense, Tolerance, and Ecological Consequences”. In: Ecological biomarkers: indicators of ecotoxicological effects. Ed. by C. Amiard-Triquet, J.-C. Amiard, and P. S. Rainbow. Boca Raton, Fla.: CRC Press, pp. 45–74.

Anselin, L. (1995). “Local Indicators of Spatial Association—LISA”. In: Geographical Analysis 27.2, pp. 93–115.

Arrhenius, Å., F. Grönvall, M. Scholze, T. Backhaus, and H. Blanck (2004). “Predictability of the mixture toxicity of 12 similarly acting congeneric inhibitors of photosystem II in marine periphyton and epipsammon communities”. In: Aquatic Toxicology 68.4, pp. 351–367.

Baird, D. J. and M. Hajibabaei (2012). “Biomonitoring 2.0: a new paradigm in ecosystem assessment made possible by next-generation DNA sequencing”. In: Molecular Ecology 21.8, pp. 2039–2044.

Balian, E. V., H. Segers, C. Lévèque, and K. Martens (2007). “The Freshwater Animal Diversity Assessment: an overview of the results”. In: Hydrobiologia 595.1, pp. 627–637.

Barbour, M. T., J. Gerritsen, B. D. Snyder, and J. B. Stribling (1999). Rapid Bioassessment Protocols for Use in Streams and Wadeable Rivers: Periphyton, Benthic Macroinvertebrates, and Fish. EPA 841-B-99-002. Washington, D.C.: U.S. Environmental Protection Agency, p. 339.

Barbour, M. T., J. B. Stribling, and J. R. Karr (1995). “Multimetric approaches for establishing biocriteria and measuring biological condition”. In: Biological Assessment and Criteria: Tools for Water Resource Planning and Decision Making. Ed. by W. S. Davis and T. P. Simon. Boca Raton, Florida: CRC Press, pp. 63–77.

Battarbee, R. W. (1986). “Diatom Analysis”. In: Handbook of Holocene Palaeoecology and Palaeohydrology. Ed. by B. E. Berglund. Chichester, England: John Wiley & Sons, pp. 527–570.

Bennett, J. R., D. R. Sisson, J. P. Smol, B. F. Cumming, H. P. Possingham, and Y. M. Buckley (2014). “Optimizing taxonomic resolution and sampling effort to design cost-effective ecological models for environmental assessment”. In: Journal of Applied Ecology 51.6, pp. 1722–1732.

Benson, D. A., I. Karsch-Mizrachi, D. J. Lipman, J. Ostell, and D. L. Wheeler (2008). “GenBank”. In: Nucleic Acids Research 36 (Database Issue), pp. D25–D30.

Berard, A. and T. Pelte (1999). “The impact of photo system II (PS II) inhibitors on algae communities and dynamics”. In: Journal of Water Science 12.2, pp. 333–361.

Berger, S. A., D. Krompass, and A. Stamatakis (2011). “Performance, Accuracy, and Web Server for Evolutionary Placement of Short Sequence Reads under Maximum Likelihood”. In: Systematic Biology 60.3, pp. 291–302.

Berges, J. A., D. E. Varela, and P. J. Harrison (2002). “Effects of temperature on growth rate, cell composition and nitrogen metabolism in the marine diatom Thalassiosira pseudonana (Bacillariophyceae)”. In: Marine Ecology Progress Series 225, pp. 139–146.

Berthon, V., A. Bouchez, and F. Rimet (2011). “Using diatom life-forms and ecological guilds to assess organic pollution and trophic level in rivers: a case study of rivers in south-eastern France”. In: Hydrobiologia 673.1, pp. 259–271.

Besse-Lototskaya, A., P. F. Verdonschot, M. Coste, and B. Van de Vijver (2011). “Evaluation of European diatom trophic indices”. In: Ecological Indicators 11.2, pp. 456–467.

Besse-Lototskaya, A., P. F. M. Verdonschot, and J. A. Sinkeldam (2006). “Uncertainty in diatom assessment: sampling, identification and counting variation”. In: Hydrobiologia 566.1, pp. 247–260.

Beszteri, B., È. Acs, J. Makk, G. Kovács, K. Márialigeti, and K. T. Kiss (2001). “Phylogeny of six naviculoid diatoms based on 18S rDNA sequences.” In: International Journal of Systematic and Evolutionary Microbiology 51.4, pp. 1581–1586.

Birks, H. J. B. (2010). “Numerical methods for the analysis of diatom assemblage data”. In: The Diatoms: Applications for the Environmental and Earth Sciences. Ed. by J. P. Smol and E. F. Stoermer. 2nd ed. Cambridge, UK: Cambridge University Press, pp. 23–54.

Blandin, P. (1986). “Bioindicateurs et diagnostic des systèmes écologiques”. In: Bulletin d’écologie 17.4, pp. 215–307.

Blomberg, S. P. and T. Garland (2002). “Tempo and mode in evolution: phylogenetic inertia, adaptation and comparative methods”. In: Journal of Evolutionary Biology 15.6, pp. 899–910.

Blomberg, S. P., T. Garland, and A. R. Ives (2003). “Testing for phylogenetic signal in comparative data: behavioral traits are more labile”. In: Evolution 57.4, pp. 717–745.

Bory de Saint-Vincent, J.-B. (1822). Dictionnaire classique d’histoire naturelle. Vol. 2. Paris: Rey et Gravier, Baudouin Frères. 621 pp.

Boudou, A. and F. Ribeyre (1997). “Aquatic ecotoxicology: from the ecosystem to the cellular and molecular levels.” In: Environmental health perspectives 105.1, pp. 21–35.

Bradbury, J. (2004). “Nature’s Nanotechnologists: Unveiling the Secrets of Diatoms”. In: PLOS Biology 2.10, e306.

Brettum, P. (1989). Algen als Indikatoren für die Gewässerqualität in norwegischen Binnenseen. Norway: Norsk Institutt for vannforskning (NIVA), p. 102.

Brown, S. D. and A. P. Austin (1973). “Diatom Succession and Interaction in Littoral Periphyton and Plankton”. In: Hydrobiologia 43.3, pp. 333–356.

Buchwalter, D. B., D. J. Cain, C. A. Martin, L. Xie, S. N. Luoma, and T. Garland Jr (2008). “Aquatic insect ecophysiological traits reveal phylogenetically based differences in dissolved cadmium susceptibility”. In: Proceedings of the National Academy of Sciences 105.24, pp. 8321–8326.

Caporaso, J. G. et al. (2010). “QIIME allows analysis of high-throughput community sequencing data”. In: Nature Methods 7.5, pp. 335–336.

Cardinale, B. J. et al. (2012). “Biodiversity loss and its impact on humanity”. In: Nature 486.7401, pp. 59–67.

Carew, M. E., A. D. Miller, and A. A. Hoffmann (2011). “Phylogenetic signals and ecotoxicological responses: potential implications for aquatic biomonitoring”. In: Ecotoxicology 20.3, pp. 595–606.

Carpenter, S. R., E. H. Stanley, and M. J. V. Zanden (2011). “State of the World’s Freshwater Ecosystems: Physical, Chemical, and Biological Changes”. In: Annual Review of Environment and Resources 36.1, pp. 75–99.

Carter, J. L. and V. H. Resh (2001). “After site selection and before data analysis: sampling, sorting, and laboratory procedures used in stream benthic macroinvertebrate monitoring programs by USA state agencies”. In: Journal of the North American Benthological Society 20.4, pp. 658–682.

Carter, J. L., V. H. Resh, D. M. Rosenberg, T. B. Reynoldson, G. Ziglio, M. Siligardi, and G. Flaim (2006). “Biomonitoring in North American rivers: a comparison of methods used for benthic macroinvertebrates in Canada and the United States”. In: Biological Monitoring of Rivers: Applications and Perspectives. 1st ed. Chichester, England: John Wiley & Sons, pp. 203–228.

Chapman, D. V. (1996). Water quality assessments: a guide to the use of biota, sediments and water in environmental monitoring. E & Fn Spon London.

Charles, D. F. (1985). “Relationships between Surface Sediment Diatom Assemblages and Lakewater Characteristics in Adirondack Lakes”. In: Ecology 66.3, pp. 994–1011.

Chepurnov, V. A., D. G. Mann, K. Sabbe, and W. Vyverman (2004). “Experimental Studies on Sexual Reproduction in Diatoms”. In: International Review of Cytology 237, pp. 91–154.

Chessman, B. C., N. Bate, P. A. Gell, and P. Newall (2007). “A diatom species

index for bioassessment of Australian rivers”. In: Marine and Freshwater Research

58.6, pp. 542–557. Chessman, B., I. Growns, J. Currey, and N. Plunkett-Cole (1999). “Predicting

diatom communities at the genus level for the rapid biological assessment of rivers”.

In: Freshwater Biology 41.2, pp. 317–331.

Cheverud, J. M., M. M. Dow, and W. Leutenegger (1985). “The quantitative

assessment of phylogenetic constraints in comparative analyses: sexual dimorphism in body weight among primates”. In: Evolution 39.6, pp. 1335–1351.

Cohn, F. (1853). “Über lebendige Organismen im Trinkwasser”. In: Zurnal of Klinikal

Medizin 4, pp. 229–237.

— (1872). “Über den Brunnenfaden Crenothrix polyspora mit Bemerkungen über die mikroskopische Analyse des Brunnenwassers”. In: Beiträge zur Biologie der Pflanzen

1, pp. 108 –132.

Coissac, E., T. Riaz, and N. Puillandre (2012). “Bioinformatic challenges for DNA

metabarcoding of plants and animals”. In: Molecular Ecology 21.8, pp. 1834–1847.

Comte, L., J. Murienne, and G. Grenouillet (2014). “Species traits and phylo- genetic conservatism of climate-induced range shifts in stream fishes”. In: Nature

Communications 5.5023.

Conti, M. E. and G. Cecchetti (2001). “Biological monitoring: lichens as bioindicators

of air pollution assessment — a review”. In: Environmental Pollution 114.3, pp. 471–492.

Coste, M., S. Boutry, J. Tison-Rosebery, and F. Delmas (2009). “Improvements

of the Biological Diatom Index (BDI): Description and efficiency of the new version

(BDI-2006)”. In: Ecological Indicators 9.4, pp. 621–650.

Coste, M. (1982). Étude des méthodes biologiques d’appréciation quantitative de la

qualité des eaux. Cemagref, p. 218.

Coste, M. and L. Ector (2000). “Diatomées invasives exotiques ou rares en France: principales observations effectuées au cours des dernières décennies”. In: Systematics and Geography of Plants 70.2, pp. 373–400.

Cox, E. J. and D. M. Williams (2000). “Systematics of naviculoid diatoms: the interrelationships of some taxa with a stauros”. In: European Journal of Phycology 35.3, pp. 273–282.

— (2006). “Systematics of naviculoid diatoms (Bacillariophyta): A preliminary analysis

of protoplast and frustule characters for family and order level classification”. In:

Systematics and Biodiversity 4.4, pp. 385–399.

Croome, R. L. and P. A. Tyler (1983). “Mallomonas plumosa (Chrysophyceae), a new species from Australia”. In: British Phycological Journal 18.2, pp. 151–158.

Darley, W. M. and B. E. Volcani (1971). “Synchronized cultures: diatoms”. In: Methods in Enzymology 23, pp. 85–96.

Daugbjerg, N. and R. A. Andersen (1997). “A molecular phylogeny of the heterokont algae based on analyses of chloroplast-encoded rbcL sequence data.” In: Journal of Phycology 33.6, pp. 1031–1041.

Debenest, T., E. Pinelli, M. Coste, J. Silvestre, N. Mazzella, C. Madigou,

and F. Delmas (2009). “Sensitivity of freshwater periphytic diatoms to agricultural

herbicides”. In: Aquatic Toxicology 93.1, pp. 11–17.

Dell’Uomo, A. and M. Torrisi (2011). “The Eutrophication/Pollution Index-Diatom based (EPI-D) and three new related indices for monitoring rivers: The case study

of the river Potenza (the Marches, Italy)”. In: Plant Biosystems 145.2, pp. 331–341.

DeNicola, D. M. (2000). “A review of diatoms found in highly acidic environments”.

In: Hydrobiologia 433, pp. 111–122.

Diniz-Filho, J. A. F., C. E. R. de Sant’Ana, and L. M. Bini (1998). “An Eigenvector

Method for Estimating Phylogenetic Inertia”. In: Evolution 52.5, p. 1247.

Diniz-Filho, J. A. F., T. Santos, T. F. Rangel, and L. M. Bini (2012). “A comparison of metrics for estimating phylogenetic signal under alternative evolutionary

models”. In: Genetics and Molecular Biology 35.3, pp. 673–679.

Dodds, W. K., V. H. Smith, and B. Zander (1997). “Developing nutrient targets to

control benthic chlorophyll levels in streams: A case study of the Clark Fork River”.

In: Water Research 31.7, pp. 1738–1750.

Dokulil, M. (2003). “Algae as ecological bio-indicators”. In: Bioindicators & biomonitors: principles, concepts, and applications. Ed. by B. A. Markert, A. M. Breure,

and H. G. Zechmeister. Amsterdam; Boston: Elsevier, pp. 285–326.

Drebes, G. (1977). “Sexuality”. In: The Biology of Diatoms. Ed. by D. Werner. Vol. 13. Botanical Monographs. Oxford, UK: Blackwell Scientific Publications, pp. 250–283.

Dubois, A. and L. Lacouture (2011). Bilan de présence des micropolluants dans

les milieux aquatiques continentaux Période 2007 - 2009. Commissariat général au

développement durable. Service de l’observation et des statistiques, pp. 59–71.

Dudgeon, D., A. H. Arthington, M. O. Gessner, Z.-I. Kawabata, D. J. Knowler,

C. Lévêque, R. J. Naiman, A.-H. Prieur-Richard, D. Soto, M. L. J. Stiassny,

and C. A. Sullivan (2006). “Freshwater biodiversity: importance, threats, status

and conservation challenges”. In: Biological Reviews 81.2, pp. 163–182.

Duong, T. T., A. Feurtet-Mazel, M. Coste, D. K. Dang, and A. Boudou (2007). “Dynamics of diatom colonization process in some rivers influenced by urban pollution (Hanoi, Vietnam)”. In: Ecological Indicators 7.4, pp. 839–851.

Ector, L. and F. Rimet (2005). “Using bioindicators to assess rivers in Europe: An

overview”. In: Modelling community structure in freshwater ecosystems. Ed. by S. Lek,

M. Scardi, P. Verdonschot, J.-P. Descy, and Y. Park. Berlin: Springer-Verlag,

pp. 7–19.

Edgar, L. and J. D. Pickett-Heaps (1984). “Diatom locomotion”. In: Progress in Phycological Research. Ed. by F. E. Round and D. J. Chapman. Vol. 3. Bristol,

UK: Biopress, pp. 47–88.

Edgar, R. C. (2004). “MUSCLE: multiple sequence alignment with high accuracy and

high throughput”. In: Nucleic Acids Research 32.5, pp. 1792–1797.

Ehara, M., Y. Inagaki, K. I. Watanabe, and T. Ohama (2000). “Phylogenetic analysis of diatom coxI genes and implications of a fluctuating GC content on mitochondrial genetic code evolution”. In: Current Genetics 37.1, pp. 29–33.

Ehrenberg, C. G. (1838). Die Infusionsthierchen als vollkommene Organismen. Leipzig:

Leopold Voss. 548 pp.

El Jay, A. (1996). “Effects of organic solvents and solvent-atrazine interactions on two algae, Chlorella vulgaris and Selenastrum capricornutum”. In: Archives of Environmental Contamination and Toxicology 31.1, pp. 84–90.

Elmore, A. J. and S. S. Kaushal (2008). “Disappearing headwaters: patterns of

stream burial due to urbanization”. In: Frontiers in Ecology and the Environment 6.6, pp. 308–312.

Eriksson, K. M., A. Antonelli, R. H. Nilsson, A. K. Clarke, and H. Blanck

(2009). “A phylogenetic approach to detect selection on the target site of the antifouling compound irgarol in tolerant periphyton communities”. In: Environmental Microbiology 11.8, pp. 2065–2077.

European Community (2000). Directive 2000/60/EC of the European parliament and

of the council of 23 October 2000 establishing a framework for Community action in

the field of water policy.

European Community (2003). Technical Guidance Document on Risk Assessment.

Part II. Luxembourg: European Commission Joint Research Centre, p. 328.

Evans, K. M., A. H. Wortley, G. E. Simpson, V. A. Chepurnov, and D. G. Mann (2008). “A Molecular Systematic Approach to Explore Diversity Within the Sellaphora Pupula Species Complex (Bacillariophyta)”. In: Journal of Phycology 44.1,

pp. 215–231.

Farré, M. la, S. Pérez, L. Kantiani, and D. Barceló (2008). “Fate and toxicity of

emerging pollutants, their metabolites and transformation products in the aquatic environment”. In: TrAC Trends in Analytical Chemistry. Advanced MS Analysis of

Metabolites and Degradation Products - II 27.11, pp. 991–1007.

Faust, M., R. Altenburger, T. Backhaus, H. Blanck, W. Boedeker, P. Gramatica, V. Hamer, M. Scholze, M. Vighi, and L. H. Grimme (2001). “Predicting the joint algal toxicity of multi-component s-triazine mixtures at low-effect concentrations of individual toxicants”. In: Aquatic Toxicology 56.1, pp. 13–32.

Felsenstein, J. (1985). “Phylogenies and the Comparative Method”. In: The American

Naturalist 125.1, pp. 1–15.

Fenner, K., S. Canonica, L. P. Wackett, and M. Elsner (2013). “Evaluating Pesticide Degradation in the Environment: Blind Spots and Emerging Opportunities”.

In: Science 341.6147, pp. 752–758.

Fennessy, S., C. Ibañez, A. Munné, N. Caiola, N. Kirchner, and C. Sola (2015).

“Biological Indices Based on Macrophytes: An Overview of Methods Used in Catalonia and the USA to Determine the Status of Rivers and Wetlands”. In: The Handbook of Environmental Chemistry. Berlin: Springer, pp. 1–19.

Field, C. B., M. J. Behrenfeld, J. T. Randerson, and P. Falkowski (1998). “Primary Production of the Biosphere: Integrating Terrestrial and Oceanic Components”. In: Science 281.5374, pp. 237–240.

Filippelli, G. M. (2008). “The global phosphorus cycle: Past, present, and future”. In:

Elements 4.2, pp. 89–95.

Flemming, H.-C. and A. Leis (2003). “Sorption Properties of Biofilms”. In: Encyclopedia of Environmental Microbiology. New York, USA: John Wiley & Sons, Inc.,

pp. 505–517.

Flemming, H.-C. and J. Wingender (2010). “The biofilm matrix”. In: Nature Reviews

Microbiology 8.9, pp. 623–633.

Fore, L. S. and C. Grafe (2002). “Using diatoms to assess the biological condition of large rivers in Idaho (U.S.A.)” In: Freshwater Biology 47.10, pp. 2015–2037.

Fortin, M.-J. and J. Gurevitch (2001). “Mantel tests: spatial structure in field experiments”. In: Design and Analysis of Ecological Experiments. Ed. by S. M. Scheiner and J. Gurevitch. 2nd. New York: Oxford University Press, pp. 308–326.

Fourtanier, E. and J. P. Kociolek (2011). Catalogue of Diatom Names, On-Line

Version. California Academy of Sciences. url: http://research.calacademy.org/research/diatoms/names/index.asp.

Fox, M. G. and U. M. Sorhannus (2003). “RpoA: A Useful Gene for Phylogenetic Analysis in Diatoms”. In: Journal of Eukaryotic Microbiology 50.6, pp. 471–475.

Freckleton, R. P., P. H. Harvey, and M. Pagel (2002). “Phylogenetic analysis and

comparative data: a test and review of evidence”. In: The American Naturalist 160.6,

pp. 712–726.

Gaiser, E. and K. Rühland (2010). “Diatoms as indicators of environmental change

in wetlands and peatlands”. In: The Diatoms: Applications for the Environmental and Earth Sciences. Ed. by J. P. Smol and E. F. Stoermer. 2nd ed. Cambridge,

UK: Cambridge University Press, pp. 473–496.

Gallacher, D. (2002). “The application of rapid bioassessment techniques based on

benthic macroinvertebrates in East Asian rivers (a review).” In: Proceedings of the

International Association of Theoretical and Applied Limnology. 27th Congress in

Dublin 1998. Vol. 27. Williams W. D., Ed., pp. 3503–3509.

Geitler, L. (1932). “Der Formwechsel der pennaten Diatomeen (Kieselalgen)”. In: Archiv für Protistenkunde 18.1, pp. 1–226.

Gevrey, M., F. Rimet, Y. S. Park, J.-L. Giraudel, L. Ector, and S. Lek (2004).

“Water quality assessment using diatom assemblages and advanced modelling techniques”. In: Freshwater Biology 49.2, pp. 208–220.

Gilbert, P., M. R. Brown, and J. W. Costerton (1987). “Inocula for antimicrobial sensitivity testing: a critical review”. In: The Journal of Antimicrobial Chemotherapy

20.2, pp. 147–154.

Gilliom, R. J. (2007). “Pesticides in US streams and groundwater”. In: Environmental

Science & Technology 41.10, pp. 3408–3414.

Gittleman, J. L. and M. Kot (1990). “Adaptation: Statistics and a null model for

estimating phylogenetic effects”. In: Systematic Biology 39.3, pp. 227–241.

Gittleman, J. L. and H.-K. Luh (1992). “On Comparing Comparative Methods”. In:

Annual Review of Ecology and Systematics 23, pp. 383–404.

Goertzen, L. R. and E. C. Theriot (2003). “Effect of Taxon Sampling, Character Weighting, and Combined Data on the Interpretation of Relationships Among the

Heterokont Algae”. In: Journal of Phycology 39.2, pp. 423–443.

Goolsby, E. W. (2015). “Phylogenetic comparative methods for evaluating the evolutionary history of function-valued traits”. In: Systematic Biology.

Gouy, M., S. Guindon, and O. Gascuel (2010). “SeaView version 4: A multiplatform graphical user interface for sequence alignment and phylogenetic tree building”. In:

Molecular Biology and Evolution 27.2, pp. 221–224.

Gramatica, P., M. Vighi, F. Consolaro, R. Todeschini, A. Finizio, and M. Faust

(2001). “QSAR approach for the selection of congeneric compounds with a similar

toxicological mode of action”. In: Chemosphere 42.8, pp. 873–883.

Green, R. H. (1971). “A Multivariate Statistical Approach to the Hutchinsonian Niche:

Bivalve Molluscs of Central Canada”. In: Ecology 52.4, pp. 543–556.

Green, W. A., G. Hunt, S. L. Wing, and W. A. DiMichele (2011). “Does extinction

wield an axe or pruning shears? How interactions between phylogeny and ecology

affect patterns of extinction”. In: Paleobiology 37.1, pp. 72–91.

Growns, I. (1999). “Is genus or species identification of periphytic diatoms required to

determine the impacts of river regulation?” In: Journal of Applied Phycology 11.3,

pp. 273–283.

Guénard, G., P. Legendre, and P. Peres-Neto (2013). “Phylogenetic eigenvector maps: a framework to model and predict species traits”. In: Methods in Ecology and

Evolution 4.12, pp. 1120–1131.

Guénard, G., P. C. v. d. Ohe, S. C. Walker, S. Lek, and P. Legendre (2014).

“Using phylogenetic information and chemical properties to predict species tolerances

to pesticides”. In: Proceedings of the Royal Society B: Biological Sciences 281.1789, p. 20133239.

Guénard, G., P. C. v. d. Ohe, D. de Zwart, P. Legendre, and S. Lek (2011).

“Using phylogenetic information to predict species tolerances to toxic chemicals”. In:

Ecological Applications 21.8, pp. 3178–3190.

Guillou, L., M.-J. Chrétiennot-Dinet, L. K. Medlin, H. Claustre, S. L.-d. Goër, and D. Vaulot (1999). “Bolidomonas: A new genus with two species belonging to a new algal class, the Bolidophyceae (Heterokonta)”. In: Journal of Phycology 35.2, pp. 368–381.

Guindon, S., J.-F. Dufayard, V. Lefort, M. Anisimova, W. Hordijk, and O.

Gascuel (2010). “New algorithms and methods to estimate maximum-likelihood

phylogenies: assessing the performance of PhyML 3.0”. In: Systematic Biology 59.3, pp. 307–321.

Guiry, M. D. (2012). “How many species of algae are there?” In: Journal of Phycology

48.5, pp. 1057–1063.

Hackathon et al. (2013). phylobase: Base package for phylogenetic structures and

comparative data. Version 0.6.5.2.

Haeckel, E. (1866a). Generelle Morphologie der Organismen. Allgemeine Anatomie

der Organismen. Vol. 1. Berlin, G. Reimer, 574 pp.

— (1866b). Generelle Morphologie der Organismen. Allgemeine Entwickelungsgeschichte

der Organismen. Vol. 2. Berlin, G. Reimer, 462 pp.

Hajibabaei, M., S. Shokralla, X. Zhou, G. A. C. Singer, and D. J. Baird (2011). “Environmental Barcoding: A Next-Generation Sequencing Approach for Biomonitoring Applications Using River Benthos”. In: PLoS ONE 6.4, e17497.

Hammond, J. I., D. K. Jones, P. R. Stephens, and R. A. Relyea (2012). “Phylogeny

meets ecotoxicology: evolutionary patterns of sensitivity to a common insecticide”. In: Evolutionary Applications 5.6, pp. 593–606.

Hardy, O. J. and S. Pavoine (2012). “Assessing phylogenetic signal with measurement error: A comparison of Mantel tests, Blomberg et al.’s K, and phylogenetic

distograms”. In: Evolution 66.8, pp. 2614–2621.

Harvey, P. H. and M. D. Pagel (1991). The Comparative Method in Evolutionary Biology. Oxford, New York: Oxford University Press. 248 pp.

Hassall, A. H. (1850). A microscopic examination of the water supplied to the inhabitants of London and the suburban districts. London, UK: Samuel Highley. 69 pp.

Hayes, K. R., J. M. Dambacher, G. R. Hosack, N. J. Bax, P. K. Dunstan, E. A.

Fulton, P. A. Thompson, J. R. Hartog, A. J. Hobday, R. Bradford, S. D.

Foster, P. Hedge, D. C. Smith, and C. J. Marshall (2015). “Identifying indicators and essential variables for marine ecosystems”. In: Ecological Indicators 57,

pp. 409–419.

Hebert, P. D., A. Cywinska, S. L. Ball, and J. R. deWaard (2003). “Biological identifications through DNA barcodes”. In: Proceedings of the Royal Society of London. Series B: Biological Sciences 270.1512, pp. 313–321.

Hellebust, J. A. and J. Lewin (1977). “Heterotrophic nutrition”. In: The Biology of

Diatoms. Ed. by D. Werner. Botanical Monographs 13. Oxford, UK: Blackwell

Scientific Publications, pp. 169–197.

Hill, B. H., A. T. Herlihy, P. R. Kaufmann, R. J. Stevenson, F. H. McCormick, and C. B. Johnson (2000). “Use of periphyton assemblage data as an index of biotic

integrity”. In: Journal of the North American Benthological Society 19.1, pp. 50–67.

Hill, B. H., R. J. Stevenson, Y. Pan, A. T. Herlihy, P. R. Kaufmann, and C. B.

Johnson (2001). “Comparison of correlations between environmental characteristics

and stream diatom assemblages characterized at genus and species levels”. In: Journal of the North American Benthological Society 20.2, pp. 299–310.

Hilsenhoff, W. L. (1988). “Rapid Field Assessment of Organic Pollution with a

Family-Level Biotic Index”. In: Journal of the North American Benthological Society

7.1, pp. 65–68.

Hutchinson, G. E. (1957). “Concluding remarks”. In: Cold Spring Harbor symposia on quantitative biology. Vol. 22. Cold Spring Harbor Laboratory Press, pp. 415–427.

Ibáñez, C., N. Caiola, P. Sharpe, and R. Trobajo (2010). “Ecological indicators

to assess the health of river ecosystems”. In: Handbook of Ecological Indicators for

Assessment of Ecosystem Health. Ed. by S. E. Jørgensen, L. Xu, and R. Costanza.

2nd. Boca Raton, Florida: CRC Press, pp. 447–464.

Ives, A. R., P. E. Midford, and T. Garland (2007). “Within-species variation and measurement error in phylogenetic comparative methods”. In: Systematic Biology

56.2, pp. 252–270.

Jeffree, R. A., F. Oberhansli, and J.-L. Teyssie (2010). “Phylogenetic consistencies

among chondrichthyan and teleost fishes in their bioaccumulation of multiple trace

elements from seawater”. In: Science of The Total Environment 408.16, pp. 3200–3210.

Jenkins, M. (2003). “Prospects for Biodiversity”. In: Science 302.5648, pp. 1175–1177.

Jombart, T., F. Balloux, and S. Dray (2010a). “adephylo: new tools for investigating

the phylogenetic signal in biological traits”. In: Bioinformatics 26.15, pp. 1907–1909.

Jombart, T., S. Pavoine, S. Devillard, and D. Pontier (2010b). “Putting phylogeny into the analysis of biological traits: a methodological approach”. In: Journal

of Theoretical Biology 264.3, pp. 693–701.

Jones, F. C. (2008). “Taxonomic sufficiency: the influence of taxonomic resolution on

freshwater bioassessments using benthic macroinvertebrates”. In: Environmental Reviews 16, pp. 45–69.

Jørgensen, S. E., L. Xu, and R. Costanza, eds. (2010). Handbook of Ecological Indicators for Assessment of Ecosystem Health. 2nd. Boca Raton, Florida: CRC Press.

484 pp.

Julius, M. L. (2007). “Perspectives on the Evolution and Diversification of the Diatoms”. In: Paleontological Society Papers 13, pp. 25–36.

Julius, M. L. and E. C. Theriot (2010). “The diatoms: A primer”. In: The Diatoms:

Applications for the Environmental and Earth Sciences. Ed. by J. P. Smol and E. F.

Stoermer. 2nd. Cambridge, UK: Cambridge University Press, pp. 8–22.

Kahlert, M. et al. (2009). “Harmonization is more important than experience—results

of the first Nordic–Baltic diatom intercalibration exercise 2007 (stream monitoring)”.

In: Journal of Applied Phycology 21.4, pp. 471–482.

Karr, J. R. (1981). “Assessment of Biotic Integrity Using Fish Communities”. In: Fisheries 6.6, pp. 21–27.

— (1987). “Biological monitoring and environmental assessment: a conceptual framework”. In: Environmental Management 11.2, pp. 249–256.

Keck, F., A. Bouchez, A. Franc, and F. Rimet (2016a). “Linking phylogenetic similarity and pollution sensitivity to develop ecological assessment methods: a test

with river diatoms”. In: Journal of Applied Ecology 53.3, pp. 856–864.

Keck, F., F. Rimet, A. Franc, and A. Bouchez (2016b). “Phylogenetic signal in

diatom ecology: Perspectives for aquatic ecosystems biomonitoring”. In: Ecological Applications 26.3, pp. 861–872.

Kelly, M. G. (1998). “Use of the trophic diatom index to monitor eutrophication in

rivers”. In: Water Research 32.1, pp. 236–242.

— (2003). “Short term dynamics of diatoms in an upland stream and implications for

monitoring eutrophication”. In: Environmental Pollution 125.2, pp. 117–122.

Kelly, M. G., C. J. Penny, and B. A. Whitton (1995). “Comparative performance

of benthic diatom indices used to assess river water quality”. In: Hydrobiologia 302.3,

pp. 179–188.

Kelly, M. G. and B. A. Whitton (1995). “The trophic diatom index: a new index for

monitoring eutrophication in rivers”. In: Journal of Applied Phycology 7.4, pp. 433–444.

Kelly, M. G. et al. (1998). “Recommendations for the routine sampling of diatoms for

water quality assessments in Europe”. In: Journal of Applied Phycology 10.2, pp. 215–224.

Kembel, S. W., P. D. Cowan, M. R. Helmus, W. K. Cornwell, H. Morlon,

D. D. Ackerly, S. P. Blomberg, and C. O. Webb (2010). “Picante: R tools for

integrating phylogenies and ecology”. In: Bioinformatics 26.11, pp. 1463–1464.

Kembel, S. W., M. Wu, J. A. Eisen, and J. L. Green (2012). “Incorporating 16S Gene Copy Number Information Improves Estimates of Microbial Diversity and Abundance”. In: PLoS Computational Biology 8.10, e1002743.

Kermarrec, L., L. Ector, A. Bouchez, F. Rimet, and L. Hoffmann (2011). “A

preliminary phylogenetic analysis of the Cymbellales based on 18S rDNA gene sequencing”. In: Diatom Research 26.3, pp. 305–315.

Kermarrec, L., A. Franc, F. Rimet, P. Chaumeil, J.-M. Frigerio, J.-F. Humbert, and A. Bouchez (2014). “A next-generation sequencing approach to river biomonitoring using benthic diatoms”. In: Freshwater Science 33.1, pp. 349–363.

Kermarrec, L., A. Franc, F. Rimet, P. Chaumeil, J. F. Humbert, and A. Bouchez

(2013). “Next-generation sequencing to inventory taxonomic diversity in eukaryotic

communities: a test for freshwater diatoms”. In: Molecular Ecology Resources 13.4,

pp. 607–619.

Kobayasi, H. and S. Mayama (1989). “Evaluation of river water quality by diatoms”. In: The Korean Journal of Phycology 4.2, pp. 121–133.

Kociolek, J. P. (2005). “Taxonomy and ecology: further considerations”. In: Proceedings

of the California Academy of Sciences 56.1, pp. 99–106.

Kociolek, J. P. and E. F. Stoermer (1989). “Chromosome Numbers in Diatoms: A

Review”. In: Diatom Research 4.1, pp. 47–54.

Kociolek, J. P. and E. F. Stoermer (2010). “Variation and polymorphism in diatoms:

the triple helix of development, genetics and environment. A review of the literature”.

In: Vie et Milieu 60.2, pp. 75–87.

Kolenati, F. A. (1848). “Über Nutzen und Schaden der Trichopteren”. In: Stettiner

Entomologische Zeitung 9, pp. 50–52.

Kolkwitz, R. and M. Marsson (1902). “Grundsätze für die biologische Beurteilung des Wassers nach seiner Flora und Fauna”. In: Mitteilungen der königlichen Prüfanstalt für Wasserversorgung und Abwasserbeseitigung 1, pp. 33–72.

— (1908). “Ökologie der pflanzlichen Saprobien”. In: Berichte der Deutschen Botanischen Gesellschaft 26.7, pp. 505–519.

— (1909). “Ökologie der tierischen Saprobien. Beiträge zur Lehre von der biologischen Gewässerbeurteilung”. In: Internationale Revue der gesamten Hydrobiologie und Hydrographie 2.1, pp. 126–152.

Kooistra, W. H. C. F., M. De Stefano, D. G. Mann, N. Salma, and L. K. Medlin

(2003a). “Phylogenetic Position of Toxarium, a Pennate-Like Lineage Within Centric Diatoms (Bacillariophyceae)”. In: Journal of Phycology 39.1, pp. 185–197.

Kooistra, W. H. C. F., M. D. Stefano, D. G. Mann, and K. Medlin (2003b). “The

Phylogeny of the Diatoms”. In: Silicon Biomineralization. Ed. by W. E. G. Müller. Progress in Molecular and Subcellular Biology 33. Springer Berlin Heidelberg, pp. 59–97.

Kröger, N., C. Bergsdorf, and M. Sumper (1996). “Frustulins: Domain Conservation in a Protein Family Associated with Diatom Cell Walls”. In: European Journal

of Biochemistry 239.2, pp. 259–264.

Kützing, F. T. (1844). Die kieselschaligen Bacillarien oder Diatomeen. Nordhausen,

152 pp.

Lamberti, G. A. (1996). “The role of periphyton in benthic food webs”. In: Algal ecology: Freshwater benthic ecosystems. Ed. by R. J. Stevenson, M. L. Bothwell, and R. L. Lowe. Academic Press, San Diego, CA, pp. 533–572.

Lamb, M. A. and R. L. Lowe (1987). “Effects of Current Velocity on the Physical

Structuring of Diatom (Bacillariophyceae) Communities”. In: The Ohio Journal of

Science 87.3, pp. 72–78.

Lange-Bertalot, H. (1979). “Pollution tolerance of diatoms as a criterion for water

quality estimation”. In: Nova Hedwigia 64, pp. 285–304.

Larras, F., A. Bouchez, F. Rimet, and B. Montuelle (2012). “Using bioassays and

species sensitivity distributions to assess herbicide toxicity towards benthic diatoms”.

In: PLoS ONE 7.8, e44458.

Larras, F., F. Keck, B. Montuelle, F. Rimet, and A. Bouchez (2014). “Linking

Diatom Sensitivity to Herbicides to Phylogeny: A Step Forward for Biomonitoring?”

In: Environmental Science & Technology 48.3, pp. 1921–1930.

Larras, F., B. Montuelle, and A. Bouchez (2013). “Assessment of toxicity thresholds in aquatic environments: does benthic growth of diatoms affect their exposure and sensitivity to herbicides?” In: Science of The Total Environment 463-464,

pp. 469–477.

Lavoie, I., P. J. Dillon, and S. Campeau (2009). “The effect of excluding diatom taxa

and reducing taxonomic resolution on multivariate analyses and stream bioassessment”. In: Ecological Indicators 9.2, pp. 213–225.

Lavoie, I., J. Lento, and A. Morin (2010). “Inadequacy of size distributions of stream

benthic diatoms for environmental monitoring”. In: Journal of the North American

Benthological Society 29.2, pp. 586–601.

Lawrence, J. R., B. Scharf, G. Packroff, and T. R. Neu (2002). “Microscale evaluation of the effects of grazing by invertebrates with contrasting feeding modes

on river biofilm architecture and composition”. In: Microbial Ecology 44.3, pp. 199–207.

Lecointe, C., M. Coste, and J. Prygiel (1993). ““Omnidia”: software for taxonomy,

calculation of diatom indices and inventories management”. In: Hydrobiologia 269-270.1, pp. 509–513.

Leland, H. V. and S. D. Porter (2000). “Distribution of benthic algae in the upper

Illinois River basin in relation to geology and land use”. In: Freshwater Biology 44.2,

pp. 279–301.

Lenoir, A. and M. Coste (1996). “Development of a practical diatom index of overall

water quality applicable to the French National Water Board Network”. In: Use of Algae for Monitoring Rivers II. International symposium, Volksbildungsheim Grilhof

Vill, AUT, 17-19 September 1995. Universität Innsbruck: Whitton, B.A., Rott, E.,

Eds., pp. 29–43.

Letten, A. D. and W. K. Cornwell (2015). “Trees, branches and (square) roots: why evolutionary relatedness is not linearly related to functional distance”. In: Methods

in Ecology and Evolution 6.4, pp. 439–444.

Li, C.-W. and B. E. Volcani (1987). “Four new apochlorotic diatoms”. In: British

Phycological Journal 22.4, pp. 375–382.

Liess, M., R. B. Schäfer, and C. A. Schriever (2008). “The footprint of pesticide stress in communities—Species traits reveal community effects of toxicants”. In: Science of The Total Environment 406.3, pp. 484–490.

Lockert, C. K., K. D. Hoagland, and B. D. Siegfried (2006). “Comparative sensitivity of freshwater algae to atrazine”. In: Bulletin of Environmental Contamination and Toxicology 76.1, pp. 73–79.

Loos, R., B. M. Gawlik, G. Locoro, E. Rimaviciute, S. Contini, and G. Bidoglio

(2009). “EU-wide survey of polar organic persistent pollutants in European river

waters”. In: Environmental Pollution 157.2, pp. 561–568.

Losos, J. B. (2008). “Phylogenetic niche conservatism, phylogenetic signal and the relationship between phylogenetic relatedness and ecological similarity among species”. In: Ecology Letters 11.10, pp. 995–1003.

Lowe, R. L. and Y. Pan (1996). “Benthic algal communities as biological monitors”.

In: Algal ecology: Freshwater benthic ecosystems. Ed. by R. J. Stevenson, M. L.

Bothwell, and R. L. Lowe. San Diego, California, USA: Academic Press, pp. 705–739.

Lundholm, N., N. Daugbjerg, and Ø. Moestrup (2002). “Phylogeny of the Bacillariaceae with emphasis on the genus Pseudo-nitzschia (Bacillariophyceae) based on

partial LSU rDNA”. In: European Journal of Phycology 37.1, pp. 115–134.

MacDonald, J. D. (1869). “On the structure of the Diatomaceous frustule, and its

genetic cycle”. In: Annals and Magazine of Natural History 3.13, pp. 1–8.

Macdonald, R. W. et al. (2000). “Contaminants in the Canadian Arctic: 5 years of

progress in understanding sources, occurrence and pathways”. In: Science of The

Total Environment 254.2, pp. 93–234.

Malaj, E., G. Guénard, R. B. Schäfer, and P. C. von der Ohe (2015). “Evolutionary patterns and physicochemical properties explain macroinvertebrate sensitivity to heavy metals”. In: Ecological Applications.

Mann, D. G. (1982). “The use of the central raphe endings as a taxonomic character”.

In: Plant Systematics and Evolution 141.2, pp. 143–152.

— (1999). “The species concept in diatoms”. In: Phycologia 38.6, pp. 437–495.

Mann, D. G. and K. M. Evans (2007). “Molecular genetics and the neglected art of diatomics”. In: Unravelling the algae: the past, present, and future of algal systematics.

Ed. by J. Brodie and J. Lewis. Boca Raton, Florida: CRC Press, pp. 231–265.

Mann, D. G., S. M. McDonald, M. M. Bayer, S. J. M. Droop, V. A. Chepurnov,

R. E. Loke, A. Ciobanu, and J. M. H. du Buf (2004). “The Sellaphora pupula species complex (Bacillariophyceae): morphometric analysis, ultrastructure and mating data provide evidence for five new species”. In: Phycologia 43.4, pp. 459–482.

Mann, D. G. and P. Vanormelingen (2013). “An inordinate fondness? The number, distributions, and origins of diatom species”. In: Journal of Eukaryotic Microbiology

60.4, pp. 414–420.

Mann, D. G. and S. J. M. Droop (1996). “Biodiversity, biogeography and conservation

of diatoms”. In: Hydrobiologia 336, pp. 19–32.

Marcel, R., A. Bouchez, and F. Rimet (2013). “Influence of herbicide contamination on diversity and ecological guilds of river diatoms”. In: Cryptogamy Algology 34.2,

pp. 169–183.

Markert, B. A., A. M. Breure, and H. G. Zechmeister, eds. (2003a). Bioindicators

& biomonitors: principles, concepts, and applications. Oxford, UK: Elsevier. 1014 pp.

— (2003b). “Definitions, strategies and principles for bioindication/biomonitoring of the

environment”. In: Bioindicators & biomonitors: principles, concepts, and applications.

Oxford, UK: Elsevier, pp. 3–39.

Martínez-Carreras, N., C. E. Wetzel, J. Frentress, L. Ector, J. J. McDonnell, L. Hoffmann, and L. Pfister (2015). “Hydrological connectivity as indicated by transport of diatoms through the riparian–stream system”. In: Hydrology

and Earth System Sciences 12.2, pp. 2391–2434.

Martins, E. P. and T. F. Hansen (1997). “Phylogenies and the Comparative Method:

A General Approach to Incorporating Phylogenetic Information into the Analysis of

Interspecific Data”. In: The American Naturalist 149.4, pp. 646–667.

Matsen, F. A., R. B. Kodner, and E. V. Armbrust (2010). “pplacer: linear time

maximum-likelihood and Bayesian phylogenetic placement of sequences onto a fixed

reference tree”. In: BMC Bioinformatics 11, p. 538.

McCormick, P. V. and R. J. Stevenson (1998). “Periphyton as a Tool for Ecological

Assessment and Management in the Florida Everglades”. In: Journal of Phycology

34.5, pp. 726–733.

Medlin, L. K. and I. Kaczmarska (2004). “Evolution of the diatoms: V. Morphological

and cytological support for the major clades and a taxonomic revision”. In: Phycologia

43.3, pp. 245–270.

Medlin, L., W. Kooistra, R. Gersonde, P. A. Sims, and U. Wellbrock (1997).

“Is the origin of diatoms related to the end-Permian mass extinction?” In: Nova Hedwigia 65, pp. 1–11.

Medlin, L., W. Kooistra, and A.-M. Schmid (2000). “A review of the evolution of

the diatoms-a total approach using molecules, morphology and geology”. In: The

Origin and Early Evolution of the Diatoms: Fossil, Molecular and Biogeographical Approaches. Ed. by A. Witkowski and J. Sieminska. Cracow, Poland: W. Szafer

Institute of Botany, Polish Academy of Sciences, pp. 13–35.

Medlin, L. K. (2011). “A review of the evolution of the diatoms from the origin of the

lineage to their populations”. In: The Diatom World. New York, USA: Seckbach, J.,

Kociolek, P., Eds., pp. 93–118.

Medlin, L. K., W. H. Kooistra, R. Gersonde, and U. Wellbrock (1996). “Evolution of the diatoms (Bacillariophyta). II. Nuclear-encoded small-subunit rRNA sequence comparisons confirm a paraphyletic origin for the centric diatoms.” In: Molecular Biology and Evolution 13.1, pp. 67–75.

Medlin, L., D. Williams, and P. Sims (1993). “The Evolution of the Diatoms (Bacil- lariophyta). I. Origin of the Group and Assessment of the Monophyly of Its Major

Divisions”. In: European Journal of Phycology 28.4, pp. 261–275. Bibliographie 179

Mendes, T., A. R. Calapez, C. L. Elias, S. F. P. Almeida, and M. J. Feio (2014). “Comparing alternatives for combining invertebrate and diatom assessment in stream quality classification”. In: Marine and Freshwater Research 65.7, pp. 612–623.

Mez, C. (1898). Mikroskopische Wasseranalyse. Berlin, Heidelberg: Springer.

Monnier, O., L. Basilico, Y. Reyjol, and M.-C. Ximénès (2016). La bioindication en outre-mer. Situation et perspectives dans le contexte de la directive cadre sur l’eau. Onema, p. 124.

Moran, P. A. P. (1948). “The Interpretation of Statistical Maps”. In: Journal of the Royal Statistical Society. Series B (Methodological) 10.2, pp. 243–251.

— (1950). “Notes on Continuous Stochastic Phenomena”. In: Biometrika 37, pp. 17–23.

Moreland, D. E. and K. L. Hill (1962). “Interference of Herbicides with the Hill Reaction of Isolated Chloroplasts”. In: Weeds 10.3, pp. 229–236.

Mouget, J.-L., R. Gastineau, O. Davidovich, P. Gaudin, and N. A. Davidovich (2009). “Light is a key factor in triggering sexual reproduction in the pennate diatom Haslea ostrearia”. In: FEMS Microbiology Ecology 69.2, pp. 194–201.

Mouquet, N., V. Devictor, C. N. Meynard, F. Munoz, L. F. Bersier, J. Chave, P. Couteron, A. Dalecky, C. Fontaine, and D. Gravel (2012). “Ecophylogenetics: advances and perspectives”. In: Biological Reviews 87.4, pp. 769–785.

Mulholland, P. J., J. W. Elwood, A. V. Palumbo, and R. J. Stevenson (1986). “Effect of Stream Acidification on Periphyton Composition, Chlorophyll, and Productivity”. In: Canadian Journal of Fisheries and Aquatic Sciences 43.10, pp. 1846–1858.

Münkemüller, T., F. C. Boucher, W. Thuiller, and S. Lavergne (2015). “Phylogenetic niche conservatism - common pitfalls and ways forward”. In: Functional Ecology 29.5. Ed. by P. Venail, pp. 627–639.

Münkemüller, T., S. Lavergne, B. Bzeznik, S. Dray, T. Jombart, K. Schiffers, and W. Thuiller (2012). “How to measure and test phylogenetic signal”. In: Methods in Ecology and Evolution 3.4, pp. 743–756.

Murrell, P. (2005). R Graphics. 1st ed. Computer Science and Data Analysis Series. London, UK: Chapman and Hall/CRC. 328 pp.

Nakov, T., E. C. Theriot, and A. J. Alverson (2014). “Using phylogeny to model cell size evolution in marine and freshwater diatoms”. In: Limnology and Oceanography 59.1, pp. 79–86.

Newman, M. E. J. and M. Girvan (2004). “Finding and evaluating community structure in networks”. In: Physical Review E 69.2, p. 026113.

Newman, M. C. (2014). Fundamentals of Ecotoxicology: The Science of Pollution. 4th ed. Boca Raton: CRC Press. 680 pp.

Nilsson, C., C. A. Reidy, M. Dynesius, and C. Revenga (2005). “Fragmentation and Flow Regulation of the World’s Large River Systems”. In: Science 308.5720, pp. 405–408.

Nylander, J. (2004). MrAIC.pl. Version 1.4.5. Uppsala University.

Oden, N. L. and R. R. Sokal (1986). “Directional Autocorrelation: An Extension of Spatial Correlograms to Two Dimensions”. In: Systematic Zoology 35.4, pp. 608–617.

Oettmeier, W. (1999). “Herbicide resistance and supersensitivity in photosystem II”. In: Cellular and Molecular Life Sciences 55.10, pp. 1255–1277.

Oksanen, J., E. Läärä, P. Huttunen, and J. Meriläinen (1988). “Estimation of pH optima and tolerances of diatoms in lake sediments by the methods of weighted averaging, least squares and maximum likelihood, and their use for the prediction of lake acidity”. In: Journal of Paleolimnology 1.1, pp. 39–49.

Pagel, M. (1999). “Inferring the historical patterns of biological evolution”. In: Nature 401.6756, pp. 877–884.

Pan, Y., R. J. Stevenson, B. H. Hill, A. T. Herlihy, and G. B. Collins (1996). “Using Diatoms as Indicators of Ecological Conditions in Lotic Systems: A Regional Assessment”. In: Journal of the North American Benthological Society 15.4, pp. 481–495.

Paradis, E. (2011). Analysis of Phylogenetics and Evolution with R. 2nd ed. Use R! New York, USA: Springer. 400 pp.

Paradis, E., J. Claude, and K. Strimmer (2004). “APE: analyses of phylogenetics and evolution in R language”. In: Bioinformatics 20.2, pp. 289–290.

Passy, S. I. (2007). “Diatom ecological guilds display distinct and predictable behavior along nutrient and disturbance gradients in running waters”. In: Aquatic Botany 86.2, pp. 171–178.

Patrick, R. (1977). “Ecology of Freshwater Diatoms - Diatom Communities”. In: The Biology of Diatoms. Ed. by D. Werner. Botanical Monographs 13. Berkeley and Los Angeles, California: University of California Press, pp. 284–332.

Patrick, R. (1961). “A study of the numbers and kinds of species found in rivers in eastern United States”. In: Proceedings of the Academy of Natural Sciences of Philadelphia 113.10, pp. 215–258.

Patrick, R. and D. M. Palavage (1994). “The value of species as indicators of water quality”. In: Proceedings of the Academy of Natural Sciences of Philadelphia 145, pp. 55–92.

Pavoine, S. and C. Ricotta (2013). “Testing for Phylogenetic Signal in Biological Traits: The Ubiquity of Cross-Product Statistics”. In: Evolution 67.3, pp. 828–840.

Pavoine, S., S. Ollier, D. Pontier, and D. Chessel (2008). “Testing for phylogenetic signal in phenotypic traits: new matrices of phylogenetic proximities”. In: Theoretical Population Biology 73.1, pp. 79–91.

Pavoine, S., E. Vela, S. Gachet, G. De Bélair, and M. B. Bonsall (2011). “Linking patterns in phylogeny, traits, abiotic variables and space: a novel approach to linking environmental filtering and plant community assembly”. In: Journal of Ecology 99.1, pp. 165–175.

Pérès, F., D. Florin, T. Grollier, A. Feurtet-Mazel, M. Coste, F. Ribeyre, M. Ricard, and A. Boudou (1996). “Effects of the phenylurea herbicide isoproturon on periphytic diatom communities in freshwater indoor microcosms”. In: Environmental Pollution 94.2, pp. 141–152.

Pfitzer, E. (1871). “Untersuchungen über Bau und Entwicklung der Bacillariaceen (Diatomaceen)”. In: Botanische Abhandlungen aus dem Gebiet der Morphologie und Physiologie 1.2, pp. 1–189.

Pickett-Heaps, J. D., A.-M. M. Schmid, and D. H. Tippit (1984). “Cell division in diatoms”. In: Protoplasma 120, pp. 132–154.

Pinheiro, J., D. Bates, and S. DebRoy (2013). nlme: Linear and Nonlinear Mixed Effects Models. Version 3.1-109.

Ponader, K. C. and M. G. Potapova (2007). “Diatoms from the genus Achnanthidium in flowing waters of the Appalachian Mountains (North America): Ecology, distribution and taxonomic notes”. In: Limnologica 37.3, pp. 227–241.

Porsbring, T., T. Backhaus, P. Johansson, M. Kuylenstierna, and H. Blanck (2010). “Mixture toxicity from photosystem II inhibitors on microalgal community succession is predictable by concentration addition”. In: Environmental Toxicology and Chemistry 29.12, pp. 2806–2813.

Poteat, M. D. and D. B. Buchwalter (2014). “Phylogeny and Size Differentially Influence Dissolved Cd and Zn Bioaccumulation Parameters among Closely Related Aquatic Insects”. In: Environmental Science & Technology 48.9, pp. 5274–5281.

Poteat, M. D., T. Garland, N. S. Fisher, W.-X. Wang, and D. B. Buchwalter (2013). “Evolutionary Patterns in Trace Metal (Cd and Zn) Efflux Capacity in Aquatic Organisms”. In: Environmental Science & Technology 47.14, pp. 7989–7995.

Poulíčková, A., J. Veselá, J. Neustupa, and P. Škaloud (2010). “Pseudocryptic Diversity versus Cosmopolitanism in Diatoms: a Case Study on Navicula cryptocephala Kütz. (Bacillariophyceae) and Morphologically Similar Taxa”. In: Protist 161.3, pp. 353–369.

Prygiel, J., P. Carpentier, S. Almeida, M. Coste, J.-C. Druart, L. Ector, D. Guillard, M.-A. Honoré, R. Iserentant, and P. Ledeganck (2002). “Determination of the biological diatom index (IBD NF T 90–354): results of an intercomparison exercise”. In: Journal of Applied Phycology 14.1, pp. 27–39.

Prygiel, J., L. Lévêque, and R. Iserentant (1996). “Un nouvel Indice Diatomique Pratique pour l’évaluation de la qualité des eaux en réseau de surveillance”. In: Journal of Water Science 9.1, pp. 97–113.

Ratnasingham, S. and P. D. N. Hebert (2007). “bold: The Barcode of Life Data System (http://www.barcodinglife.org)”. In: Molecular Ecology Notes 7.3, pp. 355–364.

Raunio, J. and J. Soininen (2007). “A practical and sensitive approach to large river periphyton monitoring: comparative performance of methods and taxonomic levels.” In: Boreal Environment Research 12.1, pp. 55–63.

R Development Core Team (2013). R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing.

Renberg, I. and T. Hellberg (1982). “The pH history of lakes in southwestern Sweden, as calculated from the subfossil diatom flora of the sediments.” In: AMBIO: A Journal of the Human Environment 11.1, pp. 30–33.

Resh, V. H. (2007). “Multinational, freshwater biomonitoring programs in the developing world: Lessons learned from African and Southeast Asian river surveys”. In: Environmental Management 39.5, pp. 737–748.

Resh, V. H. (2008). “Which group is best? Attributes of different biological assemblages used in freshwater biomonitoring programs”. In: Environmental Monitoring and Assessment 138.1, pp. 131–138.

Revell, L. J. (2012). “phytools: an R package for phylogenetic comparative biology (and other things)”. In: Methods in Ecology and Evolution 3.2, pp. 217–223.

Revell, L. J., L. J. Harmon, and D. C. Collar (2008). “Phylogenetic signal, evolutionary process, and rate”. In: Systematic Biology 57.4, pp. 591–601.

Reyjol, Y., V. Spyratos, and L. Basilico (2011). Bioindication : des outils pour évaluer l’état écologique des milieux aquatiques. Perspectives en vue du 2e cycle DCE – Eaux de surface continentales. Onema, p. 55.

Rimet, F. and A. Bouchez (2012a). “Life-forms, cell-sizes and ecological guilds of diatoms in European rivers”. In: Knowledge and Management of Aquatic Ecosystems 406.1, pp. 1–14.

Rimet, F. (2009). “Benthic diatom assemblages and their correspondence with ecoregional classifications: case study of rivers in north-eastern France”. In: Hydrobiologia 636.1, pp. 137–151.

Rimet, F. (2012). “Recent views on river pollution and diatoms”. In: Hydrobiologia 683.1, pp. 1–24.

Rimet, F. and A. Bouchez (2011). “Use of diatom life-forms and ecological guilds to assess pesticide contamination in rivers: Lotic mesocosm approaches”. In: Ecological Indicators 11.2, pp. 489–499.

Rimet, F. and A. Bouchez (2012b). “Biomonitoring river diatoms: Implications of taxonomic resolution”. In: Ecological Indicators 15.1, pp. 92–99.

Rimet, F., J.-C. Druart, and O. Anneville (2009). “Exploring the dynamics of plankton diatom communities in Lake Geneva using emergent self-organizing maps (1974–2007)”. In: Ecological Informatics 4.2, pp. 99–110.

Rimet, F., L. Kermarrec, A. Bouchez, L. Hoffmann, L. Ector, and L. K. Medlin (2011). “Molecular phylogeny of the family Bacillariaceae based on 18S rDNA sequences: focus on freshwater Nitzschia of the section Lanceolatae”. In: Diatom Research 26.3, pp. 273–291.

Ritz, K., H. I. J. Black, C. D. Campbell, J. A. Harris, and C. Wood (2009). “Selecting biological indicators for monitoring soils: A framework for balancing scientific and technical opinion to assist policy development”. In: Ecological Indicators 9.6, pp. 1212–1221.

Rockström, J., W. Steffen, K. Noone, Å. Persson, F. Chapin, E. Lambin, T. Lenton, M. Scheffer, C. Folke, and H. Schellnhuber (2009). “A safe operating space for humanity”. In: Nature 461, pp. 472–475.

Ronquist, F., M. Teslenko, P. van der Mark, D. L. Ayres, A. Darling, S. Höhna, B. Larget, L. Liu, M. A. Suchard, and J. P. Huelsenbeck (2012). “MrBayes 3.2: efficient Bayesian phylogenetic inference and model choice across a large model space”. In: Systematic Biology 61.3, pp. 539–542.

Rott, E., P. G. Hofmann, K. Pall, P. Pfister, and E. Pipp (1997). “Indikationslisten für Aufwuchsalgen in österreichischen Fliessgewässern. Teil 1: Saprobielle Indikation”. In: Bundesministerium für Land- und Forstwirtschaft, Wasserwirtschaftskataster, Wien.

Rott, E., E. Pipp, P. Pfister, H. Van Dam, K. Ortler, N. Binder, and K. Pall (1999). “Indikationslisten für Aufwuchsalgen in österreichischen Fliessgewässern. Teil 2: Trophie-indikation sowie geochemische Präferenz; taxonomische und toxikologische Anmerkungen”. In: Bundesministerium für Land- und Forstwirtschaft, Wasserwirtschaftskataster, Wien.

Roubeix, V., N. Mazzella, L. Schouler, V. Fauvelle, S. Morin, M. Coste, F. Delmas, and C. Margoum (2011). “Variations of periphytic diatom sensitivity to the herbicide diuron and relation to species distribution in a contamination gradient: implications for biomonitoring”. In: Journal of Environmental Monitoring 13.6, pp. 1768–1774.

Round, F. E. (1991). “Diatoms in river water-monitoring studies”. In: Journal of Applied Phycology 3.2, pp. 129–145.

Round, F. E., R. M. Crawford, and D. G. Mann (1990). The diatoms: biology and morphology of the genera. Cambridge, UK: Cambridge University Press. 760 pp.

Roux-Barthès, A. (2014). “Biofilms phototrophes de rivières non permanentes : dynamiques de communautés microbiennes et des populations de diatomées et pertinence de leur utilisation en bioindication”. PhD thesis. Université Toulouse 3. 220 pp.

Rovira, L. (2013). “The ecology and taxonomy of estuarine benthic diatoms and their use as bioindicators in a highly stratified estuary (Ebro Estuary, NE Iberian Peninsula): a multidisciplinary approach”. PhD thesis. IRTA San Carlos de la Ripita (Spain): University of Barcelona. 1-295.

Ruck, E. C. and E. C. Theriot (2011). “Origin and evolution of the canal raphe system in diatoms”. In: Protist 162.5, pp. 723–737.

Rumeau, A. and M. Coste (1988). “Initiation à la systématique des diatomées d’eau douce”. In: Bulletin Français de la Pêche et de la Pisciculture 309, pp. 1–69.

Salden, N. (1978). Beiträge zur Ökologie der Diatomeen (Bacillariophyceae) des Süsswassers. Naturhistorischer Verein der Rheinlande und Westfalens.

Sanderson, M. J. (2002). “Estimating Absolute Rates of Molecular Evolution and Divergence Times: A Penalized Likelihood Approach”. In: Molecular Biology and Evolution 19.1, pp. 101–109.

Schaumburg, J., C. Schranz, G. Hofmann, D. Stelzer, S. Schneider, and U. Schmedtje (2004). “Macrophytes and phytobenthos as indicators of ecological status in German lakes – a contribution to the implementation of the water framework directive”. In: Limnologica - Ecology and Management of Inland Waters 34.4, pp. 302–314.

Schloss, P. D., S. L. Westcott, T. Ryabin, J. R. Hall, M. Hartmann, E. B. Hollister, R. A. Lesniewski, B. B. Oakley, D. H. Parks, C. J. Robinson, J. W. Sahl, B. Stres, G. G. Thallinger, D. J. V. Horn, and C. F. Weber (2009). “Introducing mothur: Open-Source, Platform-Independent, Community-Supported Software for Describing and Comparing Microbial Communities”. In: Applied and Environmental Microbiology 75.23, pp. 7537–7541.

Schluter, D., T. Price, A. Ø. Mooers, and D. Ludwig (1997). “Likelihood of ancestor states in adaptive radiation”. In: Evolution 51.6, pp. 1699–1711.

Schmitt-Jansen, M. and R. Altenburger (2005). “Predicting and observing responses of algal communities to photosystem II-herbicide exposure using pollution-induced community tolerance and species-sensitivity distributions”. In: Environmental Toxicology and Chemistry 24.2, pp. 304–312.

Schoeman, F. R. (1973). A systematical and ecological study of the diatom flora of Lesotho with special reference to the water quality. Pretoria, South Africa: National Institute for Water Research, p. 355.

Schütt, F. (1896). “Bacillariales (Diatomaceae)”. In: Die natürlichen Pflanzenfamilien. Ed. by A. Engler and K. Prantl. Vol. 1. Leipzig: W. Engelmann, pp. 31–105.

Schwarzenbach, R. P., B. I. Escher, K. Fenner, T. B. Hofstetter, C. A. Johnson, U. v. Gunten, and B. Wehrli (2006). “The Challenge of Micropollutants in Aquatic Systems”. In: Science 313.5790, pp. 1072–1077.

Shokralla, S., J. L. Spall, J. F. Gibson, and M. Hajibabaei (2012). “Next-generation sequencing technologies for environmental DNA research”. In: Molecular Ecology 21.8, pp. 1794–1805.

Simonsen, R. (1979). “The diatom system: ideas on phylogeny”. In: Bacillaria 2, pp. 9–71.

Sims, P. A., D. G. Mann, and L. K. Medlin (2006). “Evolution of the diatoms: insights from fossil, biological and molecular data”. In: Phycologia 45.4, pp. 361–402.

Sládeček, V. (1973). System of Water Quality from the Biological Point of View. Stuttgart: Lubrecht & Cramer Ltd. 218 pp.

Sládeček, V. (1986). “Diatoms as Indicators of Organic Pollution”. In: Acta Hydrochimica et Hydrobiologica 14.5, pp. 555–566.

Smith, W. (1853). A synopsis of the British Diatomaceæ. Vol. 1. London, UK: J. Van Voorst. 89 pp.

Sokal, R. R. (1979). “Testing Statistical Significance of Geographic Variation Patterns”. In: Systematic Zoology 28.2, pp. 227–232.

Sokal, R. R. and N. L. Oden (1978). “Spatial autocorrelation in biology: 1. Methodology”. In: Biological Journal of the Linnean Society 10.2, pp. 199–228.

Sorhannus, U., F. Gasse, R. Perasso, and A. B. Tourancheau (1995). “A preliminary phylogeny of diatoms based on 28S ribosomal RNA sequence data”. In: Phycologia 34.1, pp. 65–73.

Stamatakis, A. (2006). “RAxML-VI-HPC: maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models”. In: Bioinformatics 22.21, pp. 2688–2690.

Statzner, B., B. Bis, S. Dolédec, and P. Usseglio-Polatera (2001). “Perspectives for biomonitoring at large spatial scales: a unified measure for the functional composition of invertebrate communities in European running waters”. In: Basic and Applied Ecology 2.1, pp. 73–85.

Stauber, J. L. and S. W. Jeffrey (1988). “Photosynthetic Pigments in Fifty-One Species of Marine Diatoms”. In: Journal of Phycology 24.2, pp. 158–172.

Stearns, S. C. (1983). “The Influence of Size and Phylogeny on Patterns of Covariation among Life-History Traits in the Mammals”. In: Oikos 41.2, p. 173.

Steinecke, F. (1931). “Die Phylogenie der Algophyten”. In: Schriften der Königsberger gelehrten Gesellschaft 8, pp. 127–298.

Stevenson, R. J. (1984). “Epilithic and epipelic diatoms in the Sandusky River, with emphasis on species diversity and water pollution”. In: Hydrobiologia 114.3, pp. 161–175.

Stevenson, R. J., S. T. Rier, C. M. Riseng, R. E. Schultz, and M. J. Wiley (2006). “Comparing Effects of Nutrients on Algal Biomass in Streams in Two Regions with Different Disturbance Regimes and with Applications for Developing Nutrient Criteria”. In: Hydrobiologia 561.1, pp. 149–165.

Stevenson, R. and J. Smol (2003). “Use of algae in environmental assessments”. In: Freshwater Algae in North America: Ecology and Classification. Ed. by J. Wehr and R. Sheath. San Diego, California, USA: Academic Press, pp. 775–804.

Stevenson, R. J., P. Yangdong, and H. Van Dam (2010). “Assessing environmental conditions in rivers and streams with diatoms”. In: The Diatoms: Applications for the Environmental and Earth Sciences. Ed. by J. P. Smol and E. F. Stoermer. 2nd ed. Cambridge University Press, pp. 55–85.

Stewart, P. S. and J. W. Costerton (2001). “Antibiotic resistance of bacteria in biofilms”. In: The Lancet 358.9276, pp. 135–138.

Suter, G. W. (2008). “Ecological risk assessment in the United States Environmental Protection Agency: A historical overview”. In: Integrated Environmental Assessment and Management 4.3, pp. 285–289.

Sutherland, I. W. (2001). “The biofilm matrix – an immobilized but dynamic microbial environment”. In: Trends in Microbiology 9.5, pp. 222–227.

Taylor, J. C., W. R. Harding, and C. G. M. Archibald (2007). A Methods Manual for the Collection, Preparation and Analysis of Diatom Samples. Pretoria, South Africa: Water Research Commission, p. 49.

Ter Braak, C. J. F. (1986). “Canonical Correspondence Analysis: A New Eigenvector Technique for Multivariate Direct Gradient Analysis”. In: Ecology 67.5, pp. 1167–1179.

Ter Braak, C. J. F. and C. W. N. Looman (1986). “Weighted averaging, logistic regression and the Gaussian response model”. In: Vegetatio 65.1, pp. 3–11.

Theriot, E. C., M. Ashworth, E. Ruck, T. Nakov, and R. K. Jansen (2010). “A preliminary multigene phylogeny of the diatoms (Bacillariophyta): challenges for future research”. In: Plant Ecology and Evolution 143.3, pp. 278–296.

Theriot, E. C., E. Ruck, M. Ashworth, T. Nakov, and R. K. Jansen (2011). “Status of the pursuit of the diatom phylogeny: Are traditional views and new molecular paradigms really that different?” In: The Diatom World. Ed. by J. Seckbach and J. Kociolek. New York, USA: Springer, pp. 119–142.

Theriot, E. C., M. P. Ashworth, T. Nakov, E. Ruck, and R. K. Jansen (2015). “Dissecting signal and noise in diatom chloroplast protein encoding genes with phylogenetic information profiling”. In: Molecular Phylogenetics and Evolution 89, pp. 28–36.

Theriot, E. C., J. J. Cannone, R. R. Gutell, and A. J. Alverson (2009). “The limits of nuclear-encoded SSU rDNA for resolving the diatom phylogeny”. In: European Journal of Phycology 44.3, pp. 277–290.

Torrisi, M. and A. Dell’Uomo (2006). “Biological Monitoring of Some Apennine Rivers (central Italy) Using the Diatom-Based Eutrophication / Pollution Index (EPI-D) Compared to Other European Diatom Indices”. In: Diatom Research 21.1, pp. 159–174.

Trobajo, R., E. Clavero, V. A. Chepurnov, K. Sabbe, D. G. Mann, S. Ishihara, and E. J. Cox (2009). “Morphological, genetic and mating diversity within the widespread bioindicator Nitzschia palea (Bacillariophyceae)”. In: Phycologia 48.6, pp. 443–459.

Usseglio-Polatera, P., M. Bournaud, P. Richoux, and H. Tachet (2000). “Biomonitoring through biological traits of benthic macroinvertebrates: how to use species trait databases?” In: Hydrobiologia 422-423, pp. 153–162.

Van Dam, H., A. Mertens, and J. Sinkeldam (1994). “A coded checklist and ecological indicator values of freshwater diatoms from the Netherlands”. In: Netherlands Journal of Aquatic Ecology 28.1, pp. 117–133.

Vanelslander, B., V. Créach, P. Vanormelingen, A. Ernst, V. A. Chepurnov, E. Sahan, G. Muyzer, L. J. Stal, W. Vyverman, and K. Sabbe (2009). “Ecological Differentiation Between Sympatric Pseudocryptic Species in the Estuarine Benthic Diatom Navicula phyllepta (Bacillariophyceae)”. In: Journal of Phycology 45.6, pp. 1278–1289.

Van Steen, M. (2010). Graph Theory and Complex Networks: An Introduction. Maarten van Steen. 300 pp.

Vignieri, S. N. (2005). “Streams over mountains: influence of riparian connectivity on gene flow in the Pacific jumping mouse (Zapus trinotatus)”. In: Molecular Ecology 14.7, pp. 1925–1937.

Vitousek, P. M., J. D. Aber, R. W. Howarth, G. E. Likens, P. A. Matson, D. W. Schindler, W. H. Schlesinger, and D. G. Tilman (1997). “Human alteration of the global nitrogen cycle: sources and consequences”. In: Ecological Applications 7.3, pp. 737–750.

Vörösmarty, C. J., P. B. McIntyre, M. O. Gessner, D. Dudgeon, A. Prusevich, P. Green, S. Glidden, S. E. Bunn, C. A. Sullivan, C. R. Liermann, and P. M. Davies (2010). “Global threats to human water security and river biodiversity”. In: Nature 467.7315, pp. 555–561.

Walsby, A. and C. Reynolds (1980). “Sinking and floating”. In: Physiological ecology of phytoplankton. Ed. by I. Morris. Oxford, UK: Blackwell, pp. 371–412.

Wängberg, S.-Å. and H. Blanck (1988). “Multivariate patterns of algal sensitivity to chemicals in relation to phylogeny”. In: Ecotoxicology and Environmental Safety 16.1, pp. 72–82.

Wang, K., R. J. Stevenson, and L. Metzmeier (2005). “Development and evaluation of a diatom-based Index of Biotic Integrity for the Interior Plateau Ecoregion, USA”. In: Journal of the North American Benthological Society 24.4, pp. 990–1008.

Webb, C. O., D. D. Ackerly, M. A. McPeek, and M. J. Donoghue (2002). “Phylogenies and community ecology”. In: Annual Review of Ecology and Systematics 33, pp. 475–505.

Weckström, K. and S. Juggins (2006). “Coastal Diatom–Environment Relationships from the Gulf of Finland, Baltic Sea”. In: Journal of Phycology 42.1, pp. 21–35.

Wheeler, D. L. et al. (2008). “Database resources of the National Center for Biotechnology Information”. In: Nucleic Acids Research 36 (Database Issue), pp. D13–D21.

Whittaker, R. H. (1967). “Gradient Analysis of Vegetation”. In: Biological Reviews 42.2, pp. 207–264.

Wiens, J. J., D. D. Ackerly, A. P. Allen, B. L. Anacker, L. B. Buckley, H. V. Cornell, E. I. Damschen, T. Jonathan Davies, J. A. Grytnes, and S. P. Harrison (2010). “Niche conservatism as an emerging principle in ecology and conservation biology”. In: Ecology Letters 13.10, pp. 1310–1324.

Wiens, J. J. and C. H. Graham (2005). “Niche Conservatism: Integrating Evolution, Ecology, and Conservation Biology”. In: Annual Review of Ecology, Evolution, and Systematics 36.1, pp. 519–539.

Williams, D. M. and J. P. Kociolek (2007). “Pursuit of a natural classification of diatoms: History, monophyly and the rejection of paraphyletic taxa”. In: European Journal of Phycology 42.3, pp. 313–319.

Wilson, M. A. and S. R. Carpenter (1999). “Economic valuation of freshwater ecosystem services in the United States: 1971–1997”. In: Ecological Applications 9.3, pp. 772–783.

Wright, J. F. (1995). “Development and use of a system for predicting the macroinvertebrate fauna in flowing waters”. In: Australian Journal of Ecology 20.1, pp. 181–197.

Wu, J.-T. (1999). “A generic index of diatom assemblages as bioindicator of pollution in the Keelung River of Taiwan”. In: Hydrobiologia 397, pp. 79–87.

Wunsam, S., A. Cattaneo, and N. Bourassa (2002). “Comparing diatom species, genera and size in biomonitoring: a case study from streams in the Laurentians (Quebec, Canada)”. In: Freshwater Biology 47.2, pp. 325–340.

Wunsam, S., R. Schmidt, and R. Klee (1995). “Cyclotella-taxa (Bacillariophyceae) in lakes of the Alpine region and their relationship to environmental variables”. In: Aquatic Sciences 57.4, pp. 360–386.

Zampella, R. A., K. J. Laidig, and R. L. Lowe (2007). “Distribution of diatoms in relation to land use and pH in blackwater coastal plain streams”. In: Environmental Management 39.3, pp. 369–384.

Zechman, F. W., E. A. Zimmer, and E. C. Theriot (1994). “Use of ribosomal DNA internal transcribed spacers for phylogenetic studies in diatoms”. In: Journal of Phycology 30.3, pp. 507–512.

Zelinka, M. and P. Marvan (1961). “Zur Präzisierung der biologischen Klassifikation der Reinheit fließender Gewässer”. In: Archiv für Hydrobiologie 57.3, pp. 389–407.

Zimmermann, J., N. Abarca, N. Enk, O. Skibbe, W.-H. Kusber, and R. Jahn (2014). “Taxonomic Reference Libraries for Environmental Barcoding: A Best Practice Example from Diatom Research”. In: PLoS ONE 9.9, e108793.