Proc. R. Soc. B (2005) 272, 2073–2081 doi:10.1098/rspb.2005.3195 Published online 17 August 2005

The extent of protist diversity: insights from molecular ecology of freshwater eukaryotes

Jan Šlapeta, David Moreira and Purificación López-García*

Unité d'Ecologie, Systématique et Évolution, UMR CNRS 8079, Université Paris-Sud, 91405 Orsay Cedex, France

Classical studies on protist diversity of freshwater environments worldwide have led to the idea that most microbial eukaryotes are known. One exemplary case would be constituted by the ciliates, which have been claimed to encompass a few thousand ubiquitous species, most of them already described. Recently, molecular methods have revealed an unsuspected protist diversity, especially in oceanic as well as some extreme environments, suggesting the occurrence of a hidden diversity of eukaryotic lineages. In order to test whether this also holds for freshwater environments, we have carried out a molecular survey of small subunit ribosomal RNA genes in water and sediment samples of two ponds, one oxic and another suboxic, from the same geographic area. Our results show that protist diversity is very high. The majority of phylotypes were affiliated with a few well established eukaryotic kingdoms or phyla, including alveolates, cryptophytes, heterokonts, Cercozoa, Centroheliozoa and haptophytes, although a few sequences did not display a clear taxonomic affiliation. The diversity of sequences within groups was very large, particularly that of ciliates, and a number of them were very divergent from known species, which could define new intra-phylum groups. This suggests that, contrary to current ideas, the diversity of freshwater protists is far from being completely described.

Keywords: eukaryotes; environmental PCR; molecular ecology; protist diversity

1. INTRODUCTION
The accessibility of freshwater ecosystems has made these environments a privileged choice for protozoological studies for more than three centuries, since Leeuwenhoek's times (Finlay & Esteban 2001). Protists are key components of aquatic food webs both as members of the phytoplankton (e.g. ) and as major consumers of bacterial biomass (e.g. heterotrophic flagellates and ciliates). In particular, nanoflagellates are abundant in planktonic communities, acting as the primary bacterial consumers and playing a cardinal role in nutrient cycling (Arndt et al. 2000; Weisse 2002). Aquatic ecologists have traditionally focused their attention on functional categories (trophic chain levels) rather than on species composition, although there is a recognized increasing interest in integrating the existing taxonomic data (Weisse 2002; Finlay 2004). Indeed, many species have been isolated and described, and taxonomic inventories of freshwater protists have been constructed over many years. The recurrent observation of the same morphological types in freshwater systems from all over the world has led to the idea that the global protist species richness is relatively low. Ciliates, easily distinguishable due to their complex morphology, are used as a paradigmatic example. With nearly 4000 species identified, it has been proposed that most species have already been discovered (Finlay 1998).

Molecular methods based on the amplification and sequencing of small subunit ribosomal RNA (SSU rRNA) genes have opened the possibility of studying microbial diversity independently of morphological identification and cultivation. In recent years, these methods have been applied to the study of protist diversity in a number of different ecosystems, including oceanic waters (Diez et al. 2001; López-García et al. 2001; Moon-van der Staay et al. 2001; Romari & Vaulot 2004), deep-sea vents (Edgcomb et al. 2002; López-García et al. 2003), anoxic environments (Dawson & Pace 2002; Stoeck et al. 2003), river sediment (Berney et al. 2004) and one acidic river (Amaral Zettler et al. 2002). All these studies have revealed a large protist diversity and, although the lineages identified generally belong to well established eukaryotic kingdoms (Berney et al. 2004; Cavalier-Smith 2004), they are very often highly divergent from known protist sequences. This suggests that a large fraction of eukaryotic microbial communities remains to be discovered (Moreira & López-García 2002). The existence of such a hidden diversity would support the critics of the above-mentioned idea that most protist species have already been described (Foissner 1999).

In order to see whether the results obtained by molecular methods fit those accumulated by classical, morphological studies, we carried out a comprehensive survey of protist small subunit ribosomal DNA (SSU rDNA) diversity from two different freshwater reservoirs. We analysed picoplanktonic and small nanoplanktonic (0.2–5 µm range) and larger planktonic (greater than 5 µm) fractions, as well as sediment samples, from two ponds, one oxic and one suboxic, located in the same geographic region. Our results reveal a large diversity of lineages within known eukaryotic phyla, including some phylotypes that are divergent from already known sequences. This is particularly noticeable in the case of ciliates, which dominate the protist diversity in the suboxic system. These results support the idea that the inventory of freshwater protists is far from complete.

* Author for correspondence ([email protected]).

Received 19 January 2005; accepted 27 April 2005. © 2005 The Royal Society

2. MATERIAL AND METHODS

(a) Sampling
Samples were collected in late autumn–early winter from a suboxic pond (CV1) and from an oxic pool (CH1) located a few kilometres south of Paris, France. The suboxic CV1 was a pond approximately 15 m in diameter located on the campus of the Université Paris-Sud (pH 6.5, water temperature 11.5 °C). When stirred, its waters bubbled and smelled strongly of H2S and CH4, indicators of oxygen-depleted environments. Oxygen measurements with a ConOx probe (WTW, Weilheim) yielded values ranging from 0.38 mg l⁻¹ (bottom) to 2.4 mg l⁻¹ (surface). The small (ca 30 × 50 m²), normally oxygenated (9.6 mg O2 per litre) shallow lake CH1 was located at the Chevreuse locality in the Regional Natural Park of the Haute Vallée de Chevreuse (pH 7.6, water temperature 4 °C, a thin ice layer covered the lake). In both cases, sediment and plankton were sampled. Dark brown/black sediments rich in organic detritus were collected from the bottom of the two ponds using sterile 50 ml Falcon tubes. A second, sandy, sediment sample was additionally collected from the edge of the suboxic CV1 system. Water was collected into sterile 1 l glass bottles with a sterile gauze adapted to their openings. Samples were processed (filtration and DNA extraction) immediately after collection. Water samples were passed through several layers of sterile gauze to filter out the components larger than 50–100 µm in diameter. Approximately 0.5 l water was then filtered across a 5 µm-pore diameter filter (TMTP, Millipore), and the resulting filtrate was further passed across a 0.22 µm-pore diameter filter (GTTP, Millipore). DNA was extracted from the biomass retained on the 5 µm (CV1-5 and CH1-5) and 0.22 µm (CV1-2 and CH1-2) filters.

(b) DNA extraction, PCR amplification, cloning and sequencing
The planktonic biomass retained on filters was suspended in 500 µl PBS. Cells were then lysed in the presence of 80 µg ml⁻¹ proteinase K, 1% SDS, 1.4 M NaCl, 0.2% β-mercaptoethanol and 2% CTAB (final concentrations) at 55 °C. DNA was then extracted once with phenol–chloroform–isoamylalcohol, and once with chloroform–isoamylalcohol. Nucleic acids were concentrated by ethanol precipitation. DNA from sediments was directly extracted using the SoilMaster DNA Extraction Kit (Epicentre) following the manufacturer's instructions. Amplification of 95–98% of the total length of the SSU rDNA was done using all four combinations of the following specific eukaryotic primers: 18S–42F (CTCAARGAYTAAGCCATGCA), 18S–82F (GAAACTGCGAATGGCTC), 18S–1498R (CACCTACGGAAACCTTGTTA) and 18S–1520R (CYGCAGGTTCACCTAC). PCR reactions were carried out in 25 µl of reaction buffer containing 1 µl DNA template (10–100 ng), 1.5 mM MgCl2, dNTPs (10 nmol each), 20 pmol of each primer, 1 U Taq DNA polymerase (Promega), and sometimes 1.25 µl DMSO and 0.05% of a detergent solution. PCR reactions were performed under the following conditions: 30 cycles (denaturation at 94 °C for 15 s, annealing at 55 °C for 30 s, extension at 72 °C for 2 min) preceded by 2 min denaturation at 94 °C, and followed by 8 min extension at 72 °C. Amplicons were cloned into pCR2.1 Topo TA cloning vector (Invitrogen) and transformed into Escherichia coli TOP100 One Shot cells (Invitrogen) according to the manufacturer's instructions. A total of 14 different libraries were constructed in this way, including duplicate libraries for both sediment and plankton samples. Clone inserts were amplified with vector T7 and M13R primers, and inserts of the expected size were sequenced directly using either specific or vector primers by Genome-Express (Meylan, France).

(c) Sequence and phylogenetic analyses
Initially, we obtained partial sequences from positive clones that were trimmed for ambiguities and used to construct a local database, which was then analysed by internal BLAST to identify identical or highly similar sequences. Sequences exhibiting greater than 97% identity were considered to be single phylotypes. In total, 377 eukaryotic sequences from 7 duplicate libraries (247 sequences corresponding to CV1 and 130 to CH1) were included in our analysis (Electronic Appendix, table S1). Representative sequences of the different (greater than 97%) identity groups were then selected and fully sequenced to obtain an almost complete SSU rDNA. Full-length sequences were checked for the presence of chimeras using CHIMERA_CHECK v. 2.7 at the Ribosomal Database Project II (Cole et al. 2003). For suspicious sequences, we looked for incongruencies in phylogenetic trees obtained from different parts of the sequence, and subsequently, multiple alignments were inspected visually to detect possible recombination sites between distant phylotypes. In addition, full sequence datasets were analysed using BELLEROPHON (Huber et al. 2004), and putative chimeric clones thus identified were manually checked. From the potential chimeras identified by CHIMERA_CHECK and BELLEROPHON, only nine chimeras were unambiguously verified (three in the CV1 sample, six in the CH1 sample), and these were excluded from our analyses. In total, we obtained 83 almost complete eukaryotic SSU rDNA sequences, which were included in a local eukaryotic SSU rDNA database containing more than 5000 sequences. The obtained sequences ranged from 1335 bp (CV1-B2-35) to 1956 bp (CH1-S2-16). The closest BLAST matches to our new sequences that were not included initially in this database were retrieved from GenBank at NCBI and aligned using CLUSTALX v. 1.82 (Thompson et al. 1997), and the multiple alignment was edited manually. Ambiguous positions in the alignment were discarded from phylogenetic analyses. Trees were reconstructed using both Bayesian and maximum likelihood methods. Bayesian phylogenetic analyses were done with MRBAYES v. 3.0b4 (Ronquist & Huelsenbeck 2003) using the GTR+Γ+I model of sequence evolution. They were repeatedly run from a random starting tree and run well beyond convergence. We used four to six hidden Markov chains, and up to 5 000 000 generations. For the reconstruction of the consensus Bayesian tree with posterior probabilities, the trees before convergence were discarded. Maximum likelihood trees were constructed with PHYML v. 2.3/2.4.4 (Guindon & Gascuel 2003) using the HKY+Γ model of sequence evolution. Non-parametric bootstrapping was performed with PHYML using 100 replicates. The number of sequences and that of positions used in the construction of the different trees were, respectively: 67 and 1345 for Opisthokonta and Amoebozoa (figure 2), 190 and 1179 for the bikont diversity with the exception of alveolates, plants and excavates (figure 3) and 148 and 1099 for alveolates (figure 4). Alignments and detailed trees are available from the authors upon request. Sequences reported in this paper have been deposited in the GenBank database under accession numbers AY821916–AY821998.
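The four primer combinations and the degenerate positions of the primers listed in section 2(b) can be enumerated with a short script. The following is a minimal Python sketch and not part of the original protocol: the function name `expand_degenerate` and the dictionary layout are illustrative assumptions, and only the IUPAC ambiguity codes that occur in these primers (R = A or G, Y = C or T) are included.

```python
from itertools import product

# IUPAC codes needed for the primers below (subset only; an assumption
# made to keep the sketch short).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "AG", "Y": "CT"}

# Primer sequences as given in the text (names written with plain hyphens).
FORWARD = {
    "18S-42F": "CTCAARGAYTAAGCCATGCA",
    "18S-82F": "GAAACTGCGAATGGCTC",
}
REVERSE = {
    "18S-1498R": "CACCTACGGAAACCTTGTTA",
    "18S-1520R": "CYGCAGGTTCACCTAC",
}


def expand_degenerate(primer: str) -> list:
    """Return every non-degenerate sequence encoded by a degenerate primer."""
    return ["".join(bases) for bases in product(*(IUPAC[b] for b in primer))]


# All forward/reverse pairings: 2 forward x 2 reverse primers.
PRIMER_PAIRS = list(product(FORWARD, REVERSE))

for fwd, rev in PRIMER_PAIRS:
    n_fwd = len(expand_degenerate(FORWARD[fwd]))
    n_rev = len(expand_degenerate(REVERSE[rev]))
    print(f"{fwd} + {rev}: {n_fwd} forward variant(s), {n_rev} reverse variant(s)")
```

Each degenerate position multiplies the number of distinct oligonucleotides in the primer mix, which is why 18S–42F (one R and one Y) corresponds to four sequences and 18S–1520R (one Y) to two.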
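The phylotype criterion described in section 2(c), where sequences exhibiting greater than 97% identity are treated as a single phylotype, can be illustrated with a small example. This is a minimal Python sketch under stated assumptions: the authors grouped sequences from BLAST comparisons against a local database, whereas this example swaps in a simple greedy grouping on pre-aligned toy sequences. The function names (`percent_identity`, `group_phylotypes`, `mutate`) and the toy clones are hypothetical.

```python
from typing import Dict, List


def percent_identity(a: str, b: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue  # skip gapped columns
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def group_phylotypes(seqs: Dict[str, str], threshold: float = 97.0) -> List[List[str]]:
    """Greedy grouping: a sequence joins the first group whose representative
    (its first member) is more than `threshold` percent identical to it."""
    groups: List[List[str]] = []
    for name, seq in seqs.items():
        for group in groups:
            if percent_identity(seq, seqs[group[0]]) > threshold:
                group.append(name)
                break
        else:
            groups.append([name])  # no match: start a new phylotype
    return groups


def mutate(seq: str, positions: List[int]) -> str:
    """Toy helper: substitute the base at each given position."""
    bases = list(seq)
    for i in positions:
        bases[i] = "C" if bases[i] == "A" else "A"
    return "".join(bases)


reference = "ACGT" * 25  # 100 aligned positions (toy data)
toy_clones = {
    "clone_a": reference,
    "clone_b": mutate(reference, [0, 50]),              # 2 substitutions
    "clone_c": mutate(reference, [1, 11, 21, 31, 41]),  # 5 substitutions
}
groups = group_phylotypes(toy_clones)
print(groups)
```

With these toy data, clone_b (98% identical to clone_a) falls into the same phylotype as clone_a, while clone_c (95% identical) starts a second one. A greedy scheme like this depends on input order, so it should be read as an illustration of the threshold rather than a reproduction of the published grouping.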


3. RESULTS AND DISCUSSION

(a) Overall eukaryotic diversity in two, oxic and suboxic, freshwater systems
We carried out a molecular survey of the eukaryotic diversity in the sediment and plankton in two contrasting freshwater environments. One was a common permanent shallow lake in a temperate wooded region in France, while the other, also located in the same geographic area, was suboxic due to a very limited water circulation and a massive input of organic matter from tree leaves and the surrounding vegetation. The degradation of this organic material triggers a rapid oxygen consumption by heterotrophic oxygen-respiring microorganisms. Reduced gases, H2S and CH4, are abundantly degassed from the system when its waters are stirred, attesting to the presence of active sulphate-reducing bacteria and methanogenic Archaea which, being strict anaerobes, characterize oxygen-depleted environments. In both cases, we studied the diversity of eukaryotic SSU rDNA sequences retrieved from sediment and from two different planktonic fractions: that in the range 0.2–5 µm, encompassing the picoplankton (less than or equal to 2 µm) and small nanoplankton (2–20 µm), and that greater than 5 µm, encompassing larger planktonic organisms. Since small nanoplankton and, most especially, picoplankton in freshwater systems have been traditionally less studied, this type of analysis might reveal previously unrecognized eukaryotic lineages.

After DNA extraction and amplification of eukaryotic SSU rDNA with different primer combinations, 14 libraries were constructed. A total of 377 eukaryotic partial sequences (ca 750–1000 bp) were determined. Their taxonomic distribution was very wide, revealing the presence of members of 13 major eukaryotic phyla (figure 1). However, their partition in the plankton and sediment of the two freshwater systems was very unequal, revealing important differences in community composition. The most remarkable was the overwhelming presence of ciliate sequences in both the plankton and sediment of the suboxic pond (CV1). Ciliates were also present in sediment and water of the oxic lake (CH1) but, although relatively abundant, they did not dominate CH1 libraries. In turn, the most abundant clones in the oxic lake belonged to planktonic cryptophytes, which were absent from the suboxic pond libraries. Heterokonts (stramenopiles) were also remarkably abundant in sediment, but not in the plankton libraries from the suboxic system. However, they were present in plankton and sediment in the oxic lake in much lower proportions. Metazoan clones were similarly highly abundant in libraries obtained from sediment in both freshwater systems. Animals were not detected in plankton because they were probably filtered away during the pre-filtration step (greater than 150 µm). The high number of metazoan clones obtained resulted most likely from the bias derived from the fact that, being multicellular, they contributed large amounts of DNA. Nevertheless, animal sequences were relatively diverse (see below), implying that this bias is limited. The other eukaryotic group that was relatively abundant in our libraries corresponded to the fungi in the sediment of the suboxic system. The rest of eukaryotic groups were usually represented by one or a few sequences that were almost invariably retrieved from sediment libraries, either from the oxic or the suboxic pond. The only exception corresponded to two parabasalid sequences identified in the suboxic CV1 picoplanktonic fraction.

The diversity of eukaryotic clones identified was not only very large among major eukaryotic phyla, but also the sequences belonging to each detected phylum were generally highly diverse. This was particularly striking for the fungi and heterokonts, with 77 and 64% of the detected sequences diverging by greater than 3% nucleotide identity, respectively (see below and Electronic Appendix, table S1). Once we had gained a first insight into the overall eukaryotic diversity of our samples, and from the total 377 eukaryotic partial sequences obtained, we identified 83 groups of sequences that shared more than 97% nucleotide identity, which we used as threshold to define phylotypes, and we selected one member of each to be fully sequenced. Detailed phylogenetic analyses were then carried out using different sets of sequences covering the whole eukaryotic diversity, and particularly that of the most represented phyla, in order to place our environmental phylotypes in trees.

Figure 1. Taxonomic distribution of eukaryotic SSU rDNA phylotypes retrieved from sediment and two different planktonic size fractions in a suboxic (CV1) and a normal, oxygenated (CH1) freshwater system. For details see Electronic Appendix table S1. (Bar chart: number of SSU rDNA clones in libraries per taxonomic group, for the 0.2–5 µm, >5 µm and sediment libraries of CV1 and CH1.)

(b) Opisthokonta and Amoebozoa
With the exception of one crustacean and one fungal sequence retrieved from plankton, all amoebozoan and opisthokont sequences were identified in sediment (figure 2 and Electronic Appendix, table S1). Only one amoebozoan phylotype, CV1-B1-92, was identified in our libraries that probably represents a true lobose amoeba. It was 88% identical to the environmental clone BOLA868, retrieved from marine sediment in a tidal flat (Dawson & Pace 2002), and very distantly related to Hartmannella cantabrigiensis. On the contrary, the fungal phylotypes were very diverse (figure 2), none of them being represented by more than two clones in our libraries. Notably, two nearly identical basidiomycete sequences (CV1-B1-42 and CH1-S1-56, 99% identity) were obtained from each one of the freshwater systems studied. They were almost identical, in turn, to Cryptococcus carnescens. The rest of basidiomycete and ascomycete


phylotypes were found to be greater than 98% identical to known fungi as well. However, the remaining fungal phylotypes were in general much more divergent. They were ascribed to the poorly resolved 'lower fungi' (figure 2), including phylotypes related to the ubiquitous water mold Allomyces spp. or to the environmental sequence LKM11, retrieved from an anaerobic digestor (van Hannen et al. 1999). The finding of such a level of divergence between our phylotypes and those from known species suggests that the diversity of this heterogeneous group of fungi (lower fungi) is still very poorly known despite the fact that these flagellated fungi occupy a pivotal phylogenetic position to understand fungal origin and evolution (Bowman et al. 1992; Cavalier-Smith & Chao 2003b).

Interestingly, we identified two quite divergent phylotypes (CV1-B2-17 and CV1-B1-36) related to choanoflagellates in the anoxic sediments of the suboxic pond. Choanoflagellates are phylogenetically among the closest relatives to animals (King & Carroll 2001; Snell et al. 2001). They are ubiquitous in aquatic habitats but, to our knowledge, they have never been described in suboxic or anoxic environments. The only putative exception could correspond to the phylotype LEMD189 from the sediment of lake Lemon, another divergent sequence apparently related to choanoflagellates but not robustly related to our sequences (Dawson & Pace 2002). Nevertheless, since our sequences are very divergent, they might also be morphologically distant from known choanoflagellates, and hence, difficult to identify as such. As mentioned, with the exception of one crustacean sequence (CV1-5A-4), metazoan phylotypes recovered from both freshwater sites were exclusively found in sediments (figure 2). In both cases, the most abundant sequences belonged to the gastrotrichs, represented by two almost identical phylotypes (CV1-B1-5 and CH1-S1-38) related to Chaetonotus spp. and to one environmental sequence (LKM88) retrieved from an anaerobic digestor (van Hannen et al. 1999). Chaetonotus is a genus of worm-like aquatic invertebrates frequent in the interstitial waters of fine sediments or on overlying detritus (Margulis & Schwartz 1998). Rotifers are widespread in freshwater habitats, and in fact, two rotifer phylotypes (CV1-A1-28, CV1-A1-10) were recovered from CV1 sediments. They were closely related to the typical benthic rotifers Lepadella patella and Philodina acuticornis, respectively. In addition, we retrieved two phylotypes closely related to the tubificid oligochaete Limnodrilus hoffmeisteri, a cosmopolitan and abundant bioturbator annelid often found in polluted areas (Margulis & Schwartz 1998), and to the turbellarian platyhelminth Microstomum lineare, and a third phylotype (CH1-S1-25) that was distantly related to several nematode species.

Figure 2. Bayesian phylogenetic tree of opisthokont and amoebozoan SSU rDNA sequences retrieved from sediment and plankton of a suboxic pond (CV1) and an oxic freshwater (CH1) shallow lake. Sixty-seven sequences and 1345 sites were used. For the chimerical phylotype LEMD255, only the amoebozoan portion (nucleotides 1–1379) was included in the analysis (see Berney et al. 2004). Sequences derived from this study are highlighted in bold. Geometric symbols indicate the type of environment in which each phylotype was identified (plankton 0.2–5 µm, plankton >5 µm or sediment). Posterior probabilities are given on the left at nodes. Bootstrap values obtained in maximum likelihood analyses are given at nodes (right) only when they are greater than 60%. The scale bar represents the number of substitutions per 100 positions per unit branch length.

(c) Diverse bikont phyla
A large fraction of eukaryotic sequences in our libraries was distributed in a variety of bikont phyla including alveolates, cryptophytes, haptophytes, heterokonts (stramenopiles), Cercozoa, centrohelid Heliozoa, euglenids and parabasalids. Among the typical photosynthetic groups, the oxic freshwater lake was clearly dominated by cryptophyte sequences, while the suboxic pond showed a variety of photosynthetic lineages (figure 1). Cryptophytes were of relatively low diversity with two major lineages being detected, one closely related to Cryptomonas spp., and the other most similar to species of Teleaulax and other related marine genera (figure 3). Interestingly, most cryptophyte sequences were retrieved from the 0.2–5 µm fraction, where the most abundant phylotype was CH1-S1-20, almost identical to the freshwater Cryptomonas sp. M420 (Marin et al. 1998; Deane et al. 2002). This suggests that cryptophytes may be important contributors to the largely understudied freshwater eukaryotic picoplankton. In addition to bona fide cryptophytes, we retrieved a phylotype (CH1-5A-4), which branched robustly as sister group to the cryptophytes, including the plastid-lacking phagocytic Goniomonas spp. (figure 3 and Electronic Appendix, table S1). Its phylogenetic position is very interesting, and the identification and structural study of this lineage may shed light on the evolutionary history of cryptophytes. One appealing possibility would be that this phylotype belongs to

kathablepharids, which are unclassified heterotrophic flagellates abundant in aquatic systems (Arndt et al. 2000) that have been proposed to be related to cryptophytes (Clay & Kugrens 1999; Cavalier-Smith 2004). Unfortunately, SSU rDNA sequences are not yet available for this group.

Cryptophytes were not detected in libraries from the suboxic pond, which suggests that they may be sensitive to oxygen-depleted environments, but a variety of phylotypes affiliated to typical heterokont photosynthetic groups were identified. Indeed, all likely photosynthetic heterokonts, diatoms and one chrysophyte lineage, were detected exclusively in the sediment fraction of the suboxic pool (figure 3 and Electronic Appendix, table S1). Four different phylotypes related to pennate diatom species were identified. Diatoms are mostly benthic, which explains their association with the sediment fraction. The chrysophyte phylotype CV1-B1-76 was highly similar to species of the photosynthetic genus Chrysamoeba, amoeboid organisms with filopodial extensions (Andersen et al. 1999). The other chrysophyte phylotype (CV1-B1-34) was most closely related to the colourless Spumella genus and to the photosynthetic genus Ochromonas and, therefore, its physiology cannot be deduced. No additional sequences belonging to chrysophytes were detected, although they represent abundant heterotrophic flagellates in freshwater (Arndt et al. 2000). Although many silicoflagellate species are photosynthetic, the only silicoflagellate sequence we detected, CH1-2A-22 from the oxic lake picoplankton, was closely related to the colourless genera Pteridomonas and Pseudopedinella, suggesting that it is a phagotrophic organism. As a matter of fact, a wide diversity of typical heterotrophic heterokont lineages was detected in both freshwater ponds. These comprised the saprophytic oomycetes (three phylotypes) and labyrinthulids (two phylotypes), which were retrieved from the anoxic and extremely organic matter-rich sediment of CV1, and the predatory bicosoecids (three phylotypes), detected in the plankton of the oxic lake (figure 3). Additionally, the phylotype CH1-S1-35 occupies an uncertain position in the heterokont tree. Within the labyrinthulids, our clone CV1-B1-3, from anoxic sediment, formed a very robust monophyletic cluster with two very distantly related environmental clones from hydrothermal sediment of the Guaymas basin (CS_E036) and from the extremely acidic river Tinto (RT5iin25) (Amaral Zettler et al. 2002; Edgcomb et al. 2002). Not only are the environmental sequences forming this group very divergent among themselves, but the group as a whole is very distant from known species of labyrinthulids.

An interesting cluster was formed by two phylotypes closely related to each other that were robustly placed in phylogenetic analyses either within the haptophytes (figure 3) or at their base (not shown). Together with Pavlovales and Prymnesiophyta, this cluster would form a third lineage of haptophytes. Although known haptophytes are photosynthetic, the degree of divergence of this third cluster does not allow any safe assumption about its actual lifestyle to be made. Haptophytes have been essentially described in oceanic waters, with only sparse records from freshwater (Green et al. 1990; Edvardsen et al. 2000). The discovery of a third, divergent, group indicates that the diversity and ecological distribution of this phylum may be much broader than previously thought.

Another group detected in the sediment of the suboxic pond was that of centrohelid Heliozoa, characteristic predators frequent in freshwater. We detected two phylotypes (CV1-B2-46 and CV1-B1-93) that were related to each other, forming an independent lineage within this phylum. In this same sediment, two cercozoan lineages were identified. CV1-B1-11 was related to Cryothecomonas and Lecythium spp. and to a number of environmental sequences including several from anaerobic environments (BOLA383, LKM45; van Hannen et al. 1999; Dawson & Pace 2002). CV1-B1-41 formed a cluster with a group of environmental sequences coming from anoxic sediment in a salt marsh (Stoeck & Epstein 2003). In addition, the cercozoan clone CH1-2A-20 was retrieved from the picoplankton of the oxic lake, which was related, although more distantly, to the same group of sequences as the clone CV1-B1-11 (figure 3). Cercozoa comprises a vast variety of eukaryotic lineages with very different lifestyles, although predators dominate, which frequently display reticulate filose pseudopodia (Cavalier-Smith & Chao 2003a). Recent molecular surveys indicate that this group is even more diverse than classical studies suggested, with at least nine novel groups at the taxonomic order level or above (Bass & Cavalier-Smith 2004).

Three euglenid sequences were identified in the sediment of the oxic lake. Although euglenids comprise various photosynthetic members, our phylotypes are most closely related to Petalomonas spp., which are colourless heterotrophs, either phagotrophic or osmotrophic (Electronic Appendix, figure S1; Fleedale & Vickerman 2000). One tree sequence, CV1-B1-85, was identified in the sediment, possibly from a natural contaminant integrating the partially decomposed vegetal matter in the sediment (Electronic Appendix, figure S1). Curiously, despite the fact that the sediment contained a large amount of plant debris, it was the only green plant sequence retrieved, suggesting that the organic matter forming the sediment is degraded very rapidly. One parabasalid phylotype was detected in the plankton of the suboxic system, CV1-2B-35, most likely corresponding to a parasite (Electronic Appendix, figure S1).

(d) Ciliates
By far, the most abundant and diverse group detected in our libraries corresponded to the alveolates and, most particularly, the ciliates (figures 1 and 4). Only one sequence (CH1-S2-24), belonging to one apicomplexan affiliated to the parasitic gregarines, was identified. The rest of the sequences were widely distributed in different ciliate groups (figure 4). A total of 21 distinct phylotypes were detected in the suboxic system, where ciliates appeared to be dominant, while five different phylotypes were retrieved from the oxic lake. This corroborated previous observations by light microscopy revealing a remarkable diversity of ciliate morphologies (not shown), especially in the suboxic system. Ciliates are major consumers of bacteria and other protists and, although they are fundamentally aerobic protists, several ciliate groups have independently adapted to anoxic environments and are among the most usual eukaryotes in anaerobic communities (Fenchel & Finlay 1995). Known groups of anaerobic ciliates comprise the order Armophorida, including the families Caenomorphidae


Figure 3. Bayesian phylogenetic tree of SSU rDNA sequences retrieved from sediment and plankton of a suboxic pond (CV1) and an oxic freshwater shallow lake (CH1), covering the diversity of heterokonts, haptophytes, centrohelid Heliozoa, cryptophytes and Cercozoa. One hundred and ninety sequences and 1179 sites were used. Sequences derived from this study are highlighted in bold. Geometric symbols indicate the type of environment in which each phylotype was identified (plankton 0.2–5 µm, plankton >5 µm, or sediment). Posterior probabilities are given on the left at nodes. Bootstrap values obtained in maximum likelihood analyses are given at nodes (right) only when they are greater than 60%. The scale bar represents the number of substitutions per 100 positions per unit branch length. [Tree image not reproduced.]

(Caenomorpha) and Metopidae (Metopus, Brachonella), and the order Plagiopylida, including the families Plagiopylidae (Plagiopyla), Trimyemidae (Trimyema) and Sonderiidae (Sonderia; Lynn & Small 2000). Phylotypes belonging to nearly all of them have been retrieved from the suboxic pond (figure 4 and Electronic Appendix, table S1). One of the two phylotypes closely related to Caenomorpha uniserialis, CV1-A2-24, was the most abundantly retrieved from both CV1 plankton and sediment libraries. CV1-5A-17 was almost identical (99.5%) to Brachonella sp., CV1-5A-8 and CV1-B2-81 were distantly related to other sequences of Brachonella and Metopus spp., and CV1-5B-12 grouped with sequences of anaerobic Plagiopyla spp. and an environmental clone from a suboxic salt marsh (CCW92; Stoeck & Epstein 2003). In addition to phylotypes related to characteristic anaerobic ciliates, many sequences affiliated to other ciliate groups, including the oligohymenophoreans, prostomateans, phyllopharyngeans, litostomateans, and the spirotrichean subclasses Hypotrichia, Stichotrichia and Choreotrichia (figure 4).

In total, we detected ciliate phylotypes assigned to six of the ten major ciliate classes defined (Lynn & Small 2000). In many cases, our sequences were very closely related to known species or genera of ciliates. However, in other cases, we identified sequences that, although clearly affiliating to a particular class or subclass, are very distant from sequenced species and most likely belong to genera or groupings of higher taxonomic rank for which no sequences were available. For instance, CV1-B1-45 and CV1-B2-52 form a cluster within the Oligohymenophorea, CV1-B2-43 falls within the Litostomatea, and CV1-B1-1 within the Hypotrichia. Furthermore, in the case of the cluster formed by CV1-2A-27 and CV1-A1-16, it was not even possible to ascribe it to a particular class of ciliates. Its position in phylogenetic analyses was unstable despite the fact that both sequences were slow-evolving. In addition, an extreme example may be constituted by the highly divergent, very fast-evolving sequence CV1-A2-16, which appeared to branch within ciliates in several, but not all, phylogenetic analyses, and was not related to any known ciliate group (Electronic Appendix, figure S1).

(e) How extensive is our knowledge of protist diversity?
Our observations appear to confirm a trend profiled in previous molecular eukaryotic diversity surveys carried out in different biotopes. Many phylotypes very distant from known species or genera were detected, even in environments that have traditionally been studied more intensively, such as freshwater systems (Berney et al. 2004 and this work).
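The identity figures quoted in this section (for example, a clone being 99.5% identical to a reference sequence) and the criterion of greater than 97% SSU rDNA identity used in this work to define a phylotype can be illustrated with a small sketch. The grouping procedure actually used is not described in this section, so the code below is only one plausible reading: it assumes pre-aligned sequences, skips gapped columns, and groups clones by single linkage. The function names, the `mutate` helper and the toy clones are hypothetical and exist only for illustration.

```python
from itertools import combinations


def percent_identity(a: str, b: str) -> float:
    """Fraction of identical positions between two pre-aligned sequences.

    Assumption: columns where either sequence has a gap ('-') are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue
        compared += 1
        matches += x == y
    return matches / compared if compared else 0.0


def group_phylotypes(seqs: dict, threshold: float = 0.97) -> list:
    """Single-linkage grouping (an assumed procedure): any two clones whose
    identity is strictly greater than the threshold end up in one phylotype."""
    parent = {name: name for name in seqs}

    def find(name):
        # Follow parent links to the group representative, compressing the path.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for a, b in combinations(seqs, 2):
        if percent_identity(seqs[a], seqs[b]) > threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in seqs:
        groups.setdefault(find(name), set()).add(name)
    return list(groups.values())


# Toy data (hypothetical): a 100-nt reference and two mutated copies.
SWAP = {"A": "C", "C": "G", "G": "T", "T": "A"}


def mutate(seq, positions):
    """Substitute the base at each given position (helper for the demo only)."""
    chars = list(seq)
    for i in positions:
        chars[i] = SWAP[chars[i]]
    return "".join(chars)


reference = "ACGT" * 25
clones = {
    "clone_a": reference,
    "clone_b": mutate(reference, [0, 50]),            # 2 substitutions
    "clone_c": mutate(reference, range(0, 100, 10)),  # 10 substitutions
}
phylotypes = group_phylotypes(clones)
print(phylotypes)
```

In this toy case, clone_a and clone_b differ at 2 of 100 positions and fall into one phylotype, while clone_c differs from both by more than 3% and forms its own. Single linkage is a permissive choice: it can chain together clones that are each less than 97% identical to one another through intermediates, which is one reason a fixed threshold may misestimate the number of species present.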


Figure 4. Bayesian phylogenetic tree of alveolate SSU rDNA sequences retrieved from sediment and plankton of a suboxic pond (CV1) and an oxic shallow lake (CH1). One hundred and forty-eight sequences and 1099 sites were used. Sequences derived from this study are highlighted in bold. For the chimeric phylotypes AT6-4 and LEMD119, only the ciliate (nucleotides 332–1708) and gregarine (nucleotides 1–901) parts were included in the analysis (see Berney et al. 2004). For sequences retrieved in this work, geometric symbols indicate the type of environment in which each phylotype was identified (plankton 0.2–5 µm, plankton >5 µm, or sediment). Posterior probabilities are given on the left at nodes. Bootstrap values obtained in maximum likelihood analyses are given at nodes (right) only when they are greater than 60%. The scale bar represents the number of substitutions per 100 positions per unit branch length. [Tree image not reproduced.]

It is possible that a number of these phylotypes correspond to described species or groups for which the SSU rDNA sequence is not yet available (Baldauf 2003). This may be the case even for ciliates, which have been studied for several centuries and are one of the protist groups for which both morphological and SSU rDNA records are best sampled. Nevertheless, ciliate SSU rDNA sequences for representatives of all ciliate classes and virtually all subclasses are available in databases (Lynn & Small 2000). In this study, we detected not only divergent ciliate sequences within classes or subclasses, but also sequences that were difficult to classify within the known subclasses and even classes. For instance, the environmental clones BOLA721 (Dawson & Pace 2002), AT6-4 (López-García et al. 2003) and CV1-2A-6 (this work) might define new spirotrichean subclasses. The case of the Phyllopharyngea is particularly remarkable, since most of the environmental sequences branching within this group are extremely distant from known species (figure 4), and the clusters defined by AT1-2, AT7-23 and AT7-37, and by BOLA408, BOLA439 and CV1-A1-26 (Dawson & Pace 2002; López-García et al. 2003 and this work) could represent new subclasses as well. Furthermore, as mentioned above, CV1-2A-27 and CV1-A1-16 did not clearly affiliate to any ciliate class, and this was also the case for other environmental sequences such as DH147-EKD23 (López-García et al. 2001). Recent works have shown that morphologically well defined species may have highly divergent SSU rDNA sequences that do not branch within the expected ciliate classes (Johnson et al. 2004). This might also be the case for several of our divergent sequences and, consequently, our results have to be considered cautiously until their eventual validation by morphological identification. Remarkably, a relatively high frequency of divergent sequences is seen even though saturation has not been reached in the few molecular surveys carried out to date, including this study (not shown). Moreover, the criteria that we applied to define a phylotype may underestimate the actual diversity. We arbitrarily used a threshold of greater than 97% identity to define a phylotype in this work, but cells with greater than 97% identity at the level of their SSU rDNA may very well belong to different species.

Altogether, these observations suggest that an important fraction of the eukaryotic diversity is still awaiting discovery, even for very well known groups such as the ciliates. This is in sharp contrast with the idea of Finlay, Fenchel and co-workers (Finlay 1998, 2002, 2004; Finlay et al. 1998; Finlay & Fenchel 1999; Finlay & Esteban 2001) that nearly all free-living ciliate species have already been discovered, particularly in freshwater environments. On the contrary, even having taken into account the limitation imposed by the lack of sequenced SSU rDNAs

from all described species, our data tend to support the opposite view, i.e. only a limited fraction of ciliate diversity has been sampled, in agreement with Foissner (1999). According to him, the number of free-living ciliate species may exceed by at least one order of magnitude that of those already described. The slow increase in new species descriptions would be due to the decreasing number of ciliate taxonomists rather than to an actual low species number, and to the fact that most environments remain underexplored, including the soil (Foissner 1999). The latter appears confirmed by the identification of many novel, divergent ciliate sequences coming from the deep sea and other unstudied habitats (López-García et al. 2001; Dawson & Pace 2002; Edgcomb et al. 2002; López-García et al. 2003) but, nevertheless, even in the well studied freshwater habitats, the frequency of new divergent sequences seems remarkably high. An additional problem for ciliate taxonomists is the occurrence of cryptic species, which are morphologically indistinguishable but genetically different. Cryptic species have been observed in ciliates (Nanney et al. 1998) and other protozoan groups (de Vargas et al. 1999; Saez et al. 2003). It is true that the opposite case is known to occur as well, since dimorphic species have been identified in some protist groups, such as the cryptophytes (Hoef-Emden & Melkonian 2003). However, molecular surveys are independent of morphological observations and therefore overcome this type of bias.

What is observed for ciliates appears to be also the case for other, less-studied, eukaryotic phyla. Although Patterson suggested that most heterotrophic flagellates were already known (Patterson et al. 2000), similar to what Finlay & Fenchel (1999) stated for ciliates, recent molecular surveys suggest that general protist diversity is very poorly known both in freshwater and other environments. Thus, the global diversity of Cercozoa (Bass & Cavalier-Smith 2004), bodonids (López-García et al. 2003; von der Heyden et al. 2004) and heterotrophic heterokonts (Massana et al. 2004) appears to be huge. Altogether, this points to the existence of a large hidden eukaryotic diversity whose actual extent will be very difficult to evaluate until more comprehensive molecular surveys combined with classical studies in different environments become available.

NOTE ADDED IN PROOF
The SSU rDNA sequences from two kathablepharid species have recently been published by Okamoto and Inouye (2005, Protist 156, 163–179). Phylogenetic analysis indicates that those sequences form a monophyletic group with our clone CH1-5A-4, confirming the suspicion mentioned above.

This work was supported by an ATIP grant of the French Centre National de la Recherche Scientifique, section 'Dynamique de la biodiversité'.

REFERENCES
Amaral Zettler, L. A., Gomez, F., Zettler, E., Keenan, B. G., Amils, R. & Sogin, M. L. 2002 Microbiology: eukaryotic diversity in Spain's River of Fire. Nature 417, 137.
Andersen, R. A., Van de Peer, Y., Potter, D., Sexton, J. P., Kawachi, M. & LaJeunesse, T. 1999 Phylogenetic analysis of the SSU rRNA from members of the Chrysophyceae. Protist 150, 71–84.
Arndt, H., Dietrich, D., Auer, B., Cleven, E.-J., Grafenhan, T., Weitere, M. & Mylnikov, A. P. 2000 Functional diversity of heterotrophic flagellates in aquatic ecosystems. In The flagellates: unity, diversity and evolution (ed. B. S. C. Leadbeater & J. C. Green), pp. 240–268. London: Taylor & Francis.
Baldauf, S. L. 2003 The deep roots of eukaryotes. Science 300, 1703–1706.
Bass, D. & Cavalier-Smith, T. 2004 Phylum-specific environmental DNA analysis reveals remarkably high global biodiversity of Cercozoa (Protozoa). Int. J. Syst. Evol. Microbiol. 54, 2393–2404.
Berney, C., Fahrni, J. & Pawlowski, J. 2004 How many novel eukaryotic 'kingdoms'? Pitfalls and limitations of environmental DNA surveys. BMC Biol. 2, 13.
Bowman, B. H., Taylor, J. W., Brownlee, A. G., Lee, J., Lu, S. D. & White, T. J. 1992 Molecular evolution of the fungi: relationship of the Basidiomycetes, Ascomycetes, and Chytridiomycetes. Mol. Biol. Evol. 9, 285–296.
Cavalier-Smith, T. 2004 Only six kingdoms of life. Proc. R. Soc. B 271, 1251–1262. (doi:10.1098/rspb.2004.2705.)
Cavalier-Smith, T. & Chao, E. E. 2003a Phylogeny and classification of phylum Cercozoa (Protozoa). Protist 154, 341–358.
Cavalier-Smith, T. & Chao, E. E. 2003b Phylogeny of Choanozoa, Apusozoa, and other Protozoa and early eukaryote megaevolution. J. Mol. Evol. 56, 540–563.
Clay, B. & Kugrens, P. 1999 Systematics of the enigmatic kathablepharids, including EM characterization of the type species, Kathablepharis phoenikoston, and new observations on K. remigera comb. nov. Protist 150, 43–59.
Cole, J. R. et al. 2003 The Ribosomal Database Project (RDP-II): previewing a new autoaligner that allows regular updates and the new prokaryotic taxonomy. Nucleic Acids Res. 31, 442–443.
Dawson, S. C. & Pace, N. R. 2002 Novel kingdom-level eukaryotic diversity in anoxic environments. Proc. Natl Acad. Sci. USA 99, 8324–8329.
Deane, J. A., Strachan, I. M., Saunders, G. W., Hill, D. R. A. & McFadden, G. I. 2002 Cryptomonad evolution: nuclear 18S rDNA phylogeny versus cell morphology and pigmentation. J. Phycol. 38, 1236–1244.
de Vargas, C., Norris, R., Zaninetti, L., Gibb, S. W. & Pawlowski, J. 1999 Molecular evidence of cryptic speciation in planktonic foraminifers and their relation to oceanic provinces. Proc. Natl Acad. Sci. USA 96, 2864–2868.
Diez, B., Pedros-Alio, C. & Massana, R. 2001 Study of genetic diversity of eukaryotic picoplankton in different oceanic regions by small-subunit rRNA gene cloning and sequencing. Appl. Environ. Microbiol. 67, 2932–2941.
Edgcomb, V. P., Kysela, D. T., Teske, A., de Vera Gomez, A. & Sogin, M. L. 2002 Benthic eukaryotic diversity in the Guaymas Basin hydrothermal vent environment. Proc. Natl Acad. Sci. USA 99, 7658–7662.
Edvardsen, B., Eikrem, W., Green, J. C., Andersen, R. A., Moon-van der Staay, S. Y. & Medlin, L. K. 2000 Phylogenetic reconstructions of the Haptophyta inferred from 18S ribosomal DNA sequences and available morphological data. Phycologia 39, 19–35.
Fenchel, T. & Finlay, B. J. 1995 Ecology and evolution in anoxic worlds. Oxford series in ecology and evolution. Oxford University Press.
Finlay, B. J. 1998 The global diversity of Protozoa and other small species. Int. J. Parasitol. 28, 29–48.
Finlay, B. J. 2002 Global dispersal of free-living microbial eukaryote species. Science 296, 1061–1063.
Finlay, B. J. 2004 Protist taxonomy: an ecological perspective. Phil. Trans. R. Soc. B 359, 599–610. (doi:10.1098/rstb.2003.1450.)


Finlay, B. J. & Esteban, G. F. 2001 Exploring Leeuwenhoek's legacy: the abundance and diversity of protozoa. Int. Microbiol. 4, 125–133.
Finlay, B. J. & Fenchel, T. 1999 Divergent perspectives on protist species richness. Protist 150, 229–233.
Finlay, B. J., Esteban, G. F. & Fenchel, T. 1998 Protozoan diversity: converging estimates of the global number of free-living ciliate species. Protist 149, 29–37.
Leedale, G. F. & Vickerman, K. 2000 Phylum Euglenozoa. In An illustrated guide to the Protozoa (ed. J. J. Lee, G. F. Leedale & P. Bradbury), vol. 2, pp. 1135–1185. Lawrence, KS: Allen Press.
Foissner, W. 1999 Protist diversity: estimates of the near-imponderable. Protist 150, 363–368.
Green, J. C., Perch-Nielsen, K. & Westbroek, P. 1990 Phylum Prymnesiophyta. In Handbook of Protoctista (ed. L. Margulis, J. O. Corliss, M. Melkonian & D. J. Chapman), pp. 293–317. Boston: Jones and Bartlett Publishers.
Guindon, S. & Gascuel, O. 2003 A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood. Syst. Biol. 52, 696–704.
Hoef-Emden, K. & Melkonian, M. 2003 Revision of the genus Cryptomonas (Cryptophyceae): a combination of molecular phylogeny and morphology provides insights into a long-hidden dimorphism. Protist 154, 371–409.
Huber, T., Faulkner, G. & Hugenholtz, P. 2004 Bellerophon: a program to detect chimeric sequences in multiple sequence alignments. Bioinformatics 20, 2317–2319.
Johnson, M. D., Tengs, T., Oldach, D. W., Delwiche, C. F. & Stoecker, D. K. 2004 Highly divergent SSU rRNA genes found in the marine ciliates Myrionecta rubra and Mesodinium pulex. Protist 155, 347–359.
King, N. & Carroll, S. B. 2001 A receptor tyrosine kinase from choanoflagellates: molecular insights into early animal evolution. Proc. Natl Acad. Sci. USA 98, 15 032–15 037.
López-García, P., Rodríguez-Valera, F., Pedrós-Alió, C. & Moreira, D. 2001 Unexpected diversity of small eukaryotes in deep-sea Antarctic plankton. Nature 409, 603–607.
López-García, P., Philippe, H., Gail, F. & Moreira, D. 2003 Autochthonous eukaryotic diversity in hydrothermal sediment and experimental microcolonizers at the Mid-Atlantic Ridge. Proc. Natl Acad. Sci. USA 100, 697–702.
Lynn, D. H. & Small, E. B. 2000 Phylum Ciliophora. In An illustrated guide to the Protozoa (ed. J. J. Lee, G. F. Leedale & P. Bradbury), vol. 1, pp. 371–656. Lawrence, KS: Allen Press.
Margulis, L. & Schwartz, K. V. 1998 Five kingdoms: an illustrated guide to the phyla of life on Earth, 3rd edn. New York: W. H. Freeman. 520pp.
Marin, B., Klingberg, M. & Melkonian, M. 1998 Phylogenetic relationships among the cryptophyta: analyses of nuclear-encoded SSU rRNA sequences support the monophyly of extant plastid-containing lineages. Protist 149, 265–276.
Massana, R., Castresana, J., Balague, V. V., Guillou, L., Romari, K., Groisillier, A., Valentin, K. & Pedros-Alio, C. 2004 Phylogenetic and ecological analysis of novel marine stramenopiles. Appl. Environ. Microbiol. 70, 3528–3534.
Moon-van der Staay, S. Y., De Wachter, R. & Vaulot, D. 2001 Oceanic 18S rDNA sequences from picoplankton reveal unsuspected eukaryotic diversity. Nature 409, 607–610.
Moreira, D. & López-García, P. 2002 The molecular ecology of microbial eukaryotes unveils a hidden world. Trends Microbiol. 10, 31–38.
Nanney, D. L., Park, C., Preparata, R. & Simon, E. M. 1998 Comparison of sequence differences in a variable 23S rRNA among sets of cryptic species of ciliated protozoa. J. Eukaryot. Microbiol. 45, 91–100.
Patterson, D. J., Vors, N., Simpson, A. G. B. & O'Kelly, C. 2000 Residual free-living and predatory heterotrophic flagellates. In An illustrated guide to the Protozoa (ed. J. J. Lee, G. F. Leedale & P. Bradbury), vol. 2, pp. 1302–1328. Lawrence, KS: Allen Press.
Romari, K. & Vaulot, D. 2004 Composition and temporal variability of picoeukaryote communities at a coastal site of the English Channel from 18S rDNA sequences. Limnol. Oceanogr. 49, 784–798.
Ronquist, F. & Huelsenbeck, J. P. 2003 MrBayes 3: Bayesian phylogenetic inference under mixed models. Bioinformatics 19, 1572–1574.
Saez, A. G., Probert, I., Geisen, M., Quinn, P., Young, J. R. & Medlin, L. K. 2003 Pseudo-cryptic speciation in coccolithophores. Proc. Natl Acad. Sci. USA 100, 7163–7168.
Snell, E. A., Furlong, R. F. & Holland, P. W. 2001 Hsp70 sequences indicate that choanoflagellates are closely related to animals. Curr. Biol. 11, 967–970.
Stoeck, T. & Epstein, S. 2003 Novel eukaryotic lineages inferred from small-subunit rRNA analyses of oxygen-depleted marine environments. Appl. Environ. Microbiol. 69, 2657–2663.
Stoeck, T., Taylor, G. T. & Epstein, S. S. 2003 Novel eukaryotes from the permanently anoxic Cariaco Basin (Caribbean Sea). Appl. Environ. Microbiol. 69, 5656–5663.
Thompson, J. D., Gibson, T. J., Plewniak, F., Jeanmougin, F. & Higgins, D. G. 1997 The CLUSTAL_X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res. 25, 4876–4882.
van Hannen, E. J., Mooij, W., van Agterveld, M. P., Gons, H. J. & Laanbroek, H. J. 1999 Detritus-dependent development of the microbial community in an experimental system: qualitative analysis by denaturing gradient gel electrophoresis. Appl. Environ. Microbiol. 65, 2478–2484.
von der Heyden, S., Chao, E. E., Vickerman, K. & Cavalier-Smith, T. 2004 Ribosomal RNA phylogeny of bodonid and diplonemid flagellates and the evolution of Euglenozoa. J. Eukaryot. Microbiol. 51, 402–416.
Weisse, T. 2002 The significance of inter- and intraspecific variation in bacterivorous and herbivorous protists. Antonie Van Leeuwenhoek 81, 327–341.

The supplementary Electronic Appendix is available at http://dx.doi.org/10.1098/rspb.2005.3195 or via http://www.journals.royalsoc.ac.uk.

As this paper exceeds the maximum length normally permitted, the authors have agreed to contribute to production costs.
