View metadata, citation and similar papers at core.ac.uk brought to you by CORE

provided by HAL-UNICE

PhytoREF: a reference database of the plastidial 16S rRNA gene of photosynthetic with curated Johan Decelle, Sarah Romac, Rowena F Stern, El Mahdi Bendif, Adriana Zingone, St´ephane Audic, Michel D Guiry, Laure Guillou, D´esir´eTessier, Florence Le Gall, et al.

To cite this version: Johan Decelle, Sarah Romac, Rowena F Stern, El Mahdi Bendif, Adriana Zingone, et al.. PhytoREF: a reference database of the plastidial 16S rRNA gene of photosynthetic eukaryotes with curated taxonomy. Molecular Ecology Resources, Blackwell, 2015, 15 (6), pp.1435-1445. <10.1111/1755-0998.12401>.

HAL Id: hal-01149047 http://hal.upmc.fr/hal-01149047 Submitted on 6 May 2015 HAL is a multi-disciplinary open access L’archive ouverte pluridisciplinaire HAL, est archive for the deposit and dissemination of sci- destin´eeau d´epˆotet `ala diffusion de documents entific research documents, whether they are pub- scientifiques de niveau recherche, publi´esou non, lished or not. The documents may come from ´emanant des ´etablissements d’enseignement et de teaching and research institutions in France or recherche fran¸caisou ´etrangers,des laboratoires abroad, or from public or private research centers. publics ou priv´es. Colomban deVargas Stéphane Audic Johan Decelle Gall This article is protected Allrights bycopyright. reserved. 10.1111/1755-0998.12401 of Record.versiondifferences and article Please between the cite this Version this as doi: through the copyediting, typesetting, pagination for publication accepted This articlehasbeen protists phytoplankton, metabarcoding, Keywords: 02,France Nice cedex F06108BP71. Valrose, Antipolis,UMR713 Nice-Sophia 8- Universitéde 02,France cedex 7- CNRS,UMR7138,Systématique Adaptation Ireland Galway, Institute, c/oRyan Foundation, 6- TheAlgaeBase Dohrn, Zoologica5- Stazione Anton Biological4- MarineLaboratory, The Association, Citadel Hill,Plymouth, PL1 2PB. 2PB. Foundation forOcean3- SirAlisterLaboratory, Hardy The Science, Hill,Plymouth, Citadel PL1 2- CNRS,UMR7144,StationFrance Biologique 29680Roscoff, Roscoff, de 29680 Roscoff,France 1- UMR7144-SorbonneUniversités,UPMCUniv Article type : Resource Article 02-Mar-2015 : Date Accepted :20-Feb-2015 Date Revised :12-Nov-2014 Date Received

Accepted Article th databaseof areference PhytoREF: 1,2 , Priscillia Gourvil plastidial 16SrRNAgene, high-throughput , sequencing, photosynthetic eukaryotes curated with taxonomy 1,2

, Sarah Romac 1,2 , Michael D.Guiry 1,2* , Richard Christen , Richard 1,2 , Adriana Lopes dos Santos 1,2 , Rowena F.Stern Villa Comunale, 80121 Naples, Italy Italy VillaComunale,80121Naples, 6 , Laure Guillou 7,8* and undergone full peer review buthasnot been review undergone fullpeer and

and proofreading process, which may lead to

Evolution, Parc Valrose, BP71. F06108 Nice Evolution, ParcValrose, National University of Ireland, Rd., University of National University 8, Systématique Adaptation Evolution,Parc 8, Systématique Adaptation Paris06,StationBiologique Roscoff, de e plastidial 16S rRNAgeneof e 3 , ElMahdi Bendif 1,2 1,2 , Ian Probert , DésiréTessier 4 1,2 , AdrianaZingone , Daniel Vaulot , Daniel 7,8 , Florence Le , Le Florence 1,2 5 , , This article is protected Allrights bycopyright. reserved. al rDNAsequenceswere 16S environmental Sanger strainsbelong microalgal cultured marine plastidialgeno from extracted and 879sequences photosynthetic compiled3,333ampl We lineages. fromalargediversit sequences thatoriginate In thisstudy, builtthePhytoREF we database identified organisms.To re containing curated databases order tointerpretenvironmenta distinctivemorphologi lack microalgae thatoften studiesofth biological holistic eco-systems environm biosphere. Theongoing ofthe ecosystems most producersin main as the role haveacritical eukaryotes Photosynthetic Running title 830 144). (Tel: 33601 25 37)& [email protected] shouldbeaddressed. *To whomcorrespondence Abstract broadenapplicability the of PhytoREF aquatic to different and habitats. terrestrial PhytoREF, sequences) PhytoREF ofthe half (representing onmari mainly focuses Thedatabase sequence. cla taxonomic and aphylogeny-based filtering Acceptedaccessible molecularecology(http://phytoref.org),a resourceweb new via in to interface a is Article

: 16S rDNA sequences of microalgae inPhytoREF ofmicroalgae rDNAsequences : 16S date, no such reference framework ex framework date, nosuchreference l sequences, metabarcoding necessa l sequences,metabarcoding ference sequences of the targeted gene (or barcode) from from (orbarcode) gene targeted of the sequences ference ental metabarcoding revoluti that contains 6,490 plastidial 16S rDNA reference 16SrDNAreference contains 6,490plastidial that ing to different eukaryotic lineages. 1,867 eukaryotic todifferent ing Email: [email protected] (Tel:+33 29829 (Tel:+33 Email: [email protected] y of eukaryotes representing allknownmajor y ofeukaryotes representing ese organisms, in particular the unicellular the organisms, inparticular ese ne microalgae, but sequences from land fromland butsequences ne microalgae, ssification were applied for each 16S rDNA 16SrDNA foreach applied ssification were cal characters and have complex life cycles. In lifecycles.In complex andhave cal characters icon sequencesavailabl mes, and generated 411 mes, and and freshwater taxa werealsoincludedto taxa and freshwater so includedinthedata

ists forphotosynthetic eukaryotes.

on opens the perspective for on openstheperspectivefor rily relies ontaxonomically- relies rily e from publicdatabases e base. Stringent quality novel sequences from from novel sequences eukaryotic microalgae are key players in players are key eukaryotic microalgae (Knoll2014).Today, Earth system eventinthehistory the oxidation of toamajor led arguably radiationofphotosynthe multicellular forms.The classUlvophyceae) chlorophyte (e.g essentially macroalgae lineages, but afew harmful blooms, which can be highly detrimental tomarine life and human activities such as 2008; Hartmann th canevenexceed bacterivory Unrein 2014; & Sanders microbial prey (McKie-Krisberg being able are mixotrophs, microalgal taxa Inaddition, 2007). & Jackson sediments (Richardson 2004; Jardillier photosynthesis (Falkowski perform oxygenic through to theircapacity In they ecosystem, processes. themarine ar 2012; Not orde morphologicalacross morethaneight diversity This article is protected Allrights bycopyright. reserved. Accepted Article distributed are eukaryotes bearing photosynthesis throughEukaryotes thatacquired Introduction using sequencing. high-throughput andmonitoring assessment foster thediscovery, et al. et

et al et 2012). Most photosynthetic are eukaryotes et al. et . 2010), and to export and sequester organic carbon to the deep ocean and ocean deep the to carbon organic sequester and export to and . 2010), 2012). In coastal areas, some microalgal can be toxic and/or form form and/or betoxic can species microalgal some areas, In coastal 2012). at of strict heterotrophs in oceanic waters (Zubkov & Tarran & Tarran (Zubkov waters inoceanic strictheterotrophs at of acrossmosteukaryotic super-

and the embryophyte land plants, haveevolvedinto landplants, and theembryophyte aquatic food webs and global biogeochemical global biogeochemical and foodwebs aquatic e themajorcontributorsto primary production to bothphotosynthesize andfeedonvarious . the rhodophyte class Florideophyceae or the the or Florideophyceae class therhodophyte . tic marineprotistsdur endosymbiosis orplastid- withcyanobacteria eukaryotes photosynthetic of of thediversity rs ofmagnitude inorganism (Archibald size evidence is growing that many eukaryotic eukaryotic growing evidenceis thatmany

et al. et unicellular (referred to as protists), toas (referred unicellular groups and exhibit abewildering exhibit groups and 2014). Theircontributionto 2014). et al. et ing the Neoproterozoic Neoproterozoic the ing 2004; Worden 2004;Worden et al et . et al. et mark "universal" traditionally usedas RNAgenes (p ribosomal ofthenuclear regions the biodiversity eukaryotes: ofphotosynthetic regimes inpredators(Pompanon (Taberlet bio-assessment andbio-monitori 2012; Bittner assessnew and toolto thecomposition ecological microalgal function of communities (Bik unveiled avastandunsuspected diversity ofmicr environmental DNAmetaba a parasitic ormutualistic symbiont (Skovgaard Montresor & Lewis Gaebler-Schwarz 2006; morphologically distinctfo undergo asuccessionof render detection intheenvironment very withtraditional difficult microscopy. Many taxa al et lack ofdistinctivemorphological fortheprasinophyte small as0.8µm taxonom techniques. For mosttaxa, inthenatural of photosynthetic eukaryotes economicimportance, and Despite theirecological al. This article is protected Allrights bycopyright. reserved. & tourism(Zingone Wyatt 2005;Chambouvet and fisheries, aquaculture

Accepted Article 2012). . 1989). The complex life cycles of many microa ofmany lifecycles complex The . 1989). 2012,2014). However, several drawbacks limitthe drawbacks several However, 2012,2014). et al. et et al. et 2012; Kermarrec 2013). Environmental metabarcoding appr Environmentalmetabarcoding 2013). rcoding (high-throughput sequencin (high-throughput rcoding features, andfragility whencla features, ng of sentinel orindicator ng ofsentinel et al. et et al. et ic identification isgreatly hi ers inenvironmental surveys (Stoeck Ostreococcus 2012; Piñol 2013;Pawlowski et al. et (i) some 18S rDNA clone library-based surveys surveys library-based some18SrDNAclone (i) et al. et environment using classical microscopy-based using microscopy-based classical environment oorganisms in recent years, oorganisms inrecent provides apowerful 2010) or can be "hidden" within a host cell as cell as withinahost "hidden" can be 2010)or articularly the 18SrRNA)are small subunit, it remains difficult to assess the total diversity thetotaldiversity it remainsdifficulttoassess rms (e.g. sexual morphotypes, cysts; resting sexual rms (e.g. ; Courties 2012; Decelle 2012;Decelle et al. et lgal species are additional obstacles that obstaclesthat additional are lgal species

use of these nuclear markers to assess assess to markers nuclear these use of 2014).For marine protists,variable et al. et oaches have also been proposed for for also beenproposed have oaches et al. al. et species, including microalgae microalgae including species, ssical fixatives areused(Vaulot ssical fixatives g of DNA markers), which has markers),whichhas g ofDNA ndered by their minute size (as their minutesize byndered 2014), and thestudy ofdiet et al. et 1994; Vaulot et al. 2012).In this context, et al. et 2008;Anderson 2010; Logares 2010; et al et . 2008), et al. et et et

This article is protected Allrights bycopyright. reserved. subunit oftheRibulose-l,5-diphosphate genes (large photosynthetic protein-coding thephototrophiccomp tofocuson In order andf the taxonomic and/or taxonomic narrower resolution ( taxonomic genetic"universal" eukaryotes, diversity DNA of multi-functional pr complex (iii) distinctionbetweenphotot (Zhu organisms inthese genes ribosomal of number higher copy tendstobeprefer multinucleated) ormetazoans protist large DNA of 2011); (ii)ribosomal overlook phototrophsin complexco hetero biasedtowards tobe shown have been been used as phytoplankton markers for (Paul communities cyanophages (viruses), and toalesserextent ph (viruses), and cyanophages Man-Aharonovich 2013). However, annotationand 2013). However, IBIA, for forms (e.g. types sequence can havedifferent Accepted al. et lineages majoreukaryotic and candistinguish wi pr generalist touse regions conserved sufficiently contains it since surveys marine several in employed successfully gene hasbeen rRNA Article 2006; MacDonald 2006;MacDonald et al et unctional composition of eukaryotes in complex ecosystems. ecosystems. complex unctional compositionofeukaryotes in CBOL Protist:Pawlowski . 2010).However,theprimerstarget et al. et otistan groups, such as dinoflagellate otistan groups, as such functional focus need to bedevelo focus need functional 2007; Lepère 2007; psbA rophic and heterotrophic taxa is rophic andheterotrophictaxa interpretation oftheplastid (protein D1 of photosysytem-II reaction center)and ofphotosysytem-II reaction (proteinD1 mmunity (Vaulot assemblages et al. et entially PCR-amplified because ofthe relatively th a relatively good taxonomic resolution (Fuller resolution(Fuller good taxonomic arelatively th trophic eukaryotes, and consequently tend to tend consequently and eukaryotes, trophic et al. et otosynthetic eukaryotes markers cannot detect all lineages withahigh all markers cannotdetect cells (mainly heterotrophs and potentially andpotentially heterotrophs cells (mainly 2009;Shi imers totargetall -bearing eukaryotes artment of eukaryotic communities,the artment ofeukaryotic 2012). Therefore, barcoding systems with 2012).Therefore,barcoding

carboxylase/oxygenase, RuBisCO) have have RuBisCO) carboxylase/oxygenase,

rbc L).contrast, By the plastidial 16S et al. et et al et ed essentially cyanobacteria and cyanobacteria ed essentially ped to provide a better picture of of picture abetter toprovide ped s. In addition, given the extreme extreme Inaddition, given the s. ial 16S rDNA clone libraries ial 16S rDNA clonelibraries et al. et very notpossiblein often . 2000; Zeidner 2011;Kirkham et al et 2005; Godhe 2005; , andthesamespecies , . 2002;Kirkham et al. et et al. et et al. et 2008); 2008); 2003; 2011, 2011, et al. et rbc L

Plastidial 16S rDNA sequences were first retrie were rDNAsequences 16S Plastidial Retrieval ofplastidial16S rDNA Database Collaboration (I Database Collaboration This article is protected Allrights bycopyright. reserved. Data sources also useful butis ecosystems, evaluate andmonitorthediversity ofphotosynth Phyt strains. microalgal cultured of spectrum asnovelSangerplastidial genomes),as well ampl plastidial publicly available organisms. Thisdatabase, named PhytoREF, has co eukaryotes, ofphotosynthetic major lineages plastid ofthe database reference an extensive fortheplastidial16SrRNAgenedatabase exists (Cole Project Database Reference Guillou Database (PR2; available forribosomal RNA genes well-identified organisms. Although anumberof obtained inthesestudieshavebeen various habitats (e.g. water, se water, various habitats(e.g. applicationsin of range a for will pavetheway totarget specific probes and design primers Accepted studies in past environments,anddietary Article

et al. et NSDC: http://www.insdc.org)NSDC: (e.g. plastidial, using keywords various 16S rDNA sequences (amplicons rDNAsequences 16S to identifytaxonomically new pl diments and ice), paleoecological paleoecological diments andice), 2005), and Greengenes (DeSantis and Greengenes 2005), sequences from public databases sequences from hindered by the lackofrefe the by hindered of eukaryotes and prokaryotes, such as the ProtistRibosomal as such and prokaryotes, eukaryotes of et al. et unicellular and multicellular herbivores. 2013), SILVA (Pruesse lineages of photosynthetic eukaryotes. PhytoREF eukaryotes. ofphotosynthetic lineages ial 16S rRNA gene including sequences from all from all gene sequences including 16S rRNA ial oREF is not only a new resource to explore, explore, to resource new a isnotonly oREF ved from the International Nucleotide Sequence Sequence Nucleotide International the from ved of photosynthetic eukaryot been builtthrough the ofall compilation ofthe icons that we obtained from awidetaxonomicicons thatweobtainedfrom bio-monitoring in eukaryotes photosynthetic mprising terrestrial, freshwater and marine marine and freshwater mprising terrestrial, curated reference databases are publicly arepublicly databases curatedreference etic eukaryotes ina

astidial 16S rDNA sequences and sequences rDNA astidial 16S rence sequences of taxonomically of taxonomically sequences rence studies of primary producers in in primary producers studies of and sequences extracted from from extracted and sequences et al. et et al et 2006), no reference 2006),noreference quatic and terrestrial quatic andterrestrial es. Here, we describe es. Here,wedescribe . 2007),Ribosomal This article is protected Allrights bycopyright. reserved. Acceptedgrowth phaseconcentrated and by centrifugation. Total nucleic extracted acids were using the ofNaples(TableS1). Zoologica AntonDohrn (formerly CCMP;https://ncma.bigelow.org/),and the culture collection ofthe Stazione the RoscoffCultureCollection(RCC,http://r plastidial generatedWe 411novel sequ Newly generated16SrDNA available as at separate http://phytoref.org. files unambiguously andclassi annotate SILVA115, Quast (release from extracted were ofCyanobacteria sequences rDNA 16S of the all Finally, PhytoREF. simila basedonsequence level class the at assigned INSDC andincludedinPhytoREF. Thosesequences Article rDNAseque 16S Environmental Sanger strain orisolatedorganism)we thatoriginat rDNAsequences All plastidial16S available athttp://www.ncbi.nlm.nih.gov/ge al were rDNAsequences 16S source publications. taxonnames,culturestrain (e.g. specific features sequen literatureforeach 201),thesource (release (May 2014)(Guillou database distinct photosyntheticA lineages.eukaryotic BLAS plastid, chloroplast,16S,smallsubunit),and et al. et re defined as reference sequen asreference re defined et al. et ences from microalgal cultures cultures microalgal from ences 16S rDNA sequences from eukary 16S rDNAsequencesfrom fy eukaryotic sequences. The sequences.The fyeukaryotic 2013) in order to root the phylogenetic trees and trees 2013)inordertorootthephylogenetic 2013; http://ssu-rrna.org/). When available in GenBank available inGenBank 2013;http://ssu-rrna.org/). When nces obtained inclonelibraries obtained nces nomes/GenomesGroup.cgi?taxid=2759&opt=plastid. nomes/GenomesGroup.cgi?taxid=2759&opt=plastid. dditional sequences were dditional sequenceswere Culturedcellswereha s), resulting inabibliogr ce was searched to compile and/or verify their tocompileand/orverify their ce wassearched ed from an identified organism (e.g. culture culture organism (e.g. ed fromanidentified oscoff-culture-collection.org/), the NCMA NCMA the oscoff-culture-collection.org/), T searches with different query sequences of of query sequences withdifferent T searches rity scores with the references sequences of of sequences references the with rity scores that lacked taxonomic identification were identificationwere thatlackedtaxonomic so extracted from all plastidialgenomes all from extracted so

ces inthePhytoREF database. cyanobacterial sequences are are sequences cyanobacterial otic microalgal strains from microalgal strainsfrom otic were also retrieved from from also retrieved were retrieved from the PR2 from thePR2 retrieved rvested in exponential inexponential rvested aphic database of 565 databaseof565 aphic This article is protected Allrights bycopyright. reserved. (i) steps: filteredwas following different validated and obtained fromduly identifiedcultures.Eachrefere with una from publicdatabases content ofthedatabase iscomposed The core Constructionthe PhytoREF of database GGC GAGTTGCAGC-3 -3 GCT AA TCG GCA GAA TAA generalist eukaryot photosynthetic International). Spectrophotometer (Labtech ND-1000 Nanodrop a quantifiedusing and (Macherey-Nagel) IIkit Nucleospin RNA cultures, and shorter than 800bpfrompublicsecultures, andshorterthan (Tablecanbe S1),and retrieved also onthe PhytoREF web interface at http://phytoref.org. in deposited were sequences were primersequences and Codes), v1.7.5 (Gene we Raw Sangersequences (Applied Biosystems). using the ABI-PRIS directions and reverse forward both in andsequenced France), Hoerdt, II (Macherey-Nagel, kit Extract NucleoSpin® e by PCR productswerepurified at72°C; at60°C, and30sextension s annealing at98°C,30 of 10sdenaturation cycles by 30sat98°C;followed 35 following PCRparameters: ina25- (Finnzymes) Phusion high-fidelity polymerase DNA Accepted also were characters non-ACGT consecutive 10 than more with sequences (ii) removed; were Article ′ GenBank under the accession numbers LN735194 to LN735532LN735194 to numbers accession the GenBank under (West (West ither EXOSAP-IT (GE Healthcare Bio-Sciences Corp.) or the Corp.)or Bio-Sciences (GE Healthcare ither EXOSAP-IT mbiguous taxonomic assignati taxonomic mbiguous e primer set biased against cyanobacteria: PLA491F: 5'- GAG 5'- GAG PLA491F: against cyanobacteria: setbiased e primer ′ (Fuller et al. et 2001). PCR amplifications were performed withthe 2001).PCRamplifications were performed et al. et An 850 bpfragment wasPCR-amplifiedwithaAn M Big Dye Terminator Cycle Sequencing Kit Sequencing Cycle Big Dye Terminator M of the reference plastidial 16S rDNA sequences rDNAsequences plastidial16S thereference of with afinalelongationstepof10minat72°C. 2006), and OXY1313R: 5 OXY1313R: 2006),and quences (including environmental sequences) sequences) (includingquences environmental nce sequence, including taxonomic affiliation, taxonomic including nce sequence, trimmed off. The new plastidial 16S rDNA plastidial16SrDNA new The trimmedoff. re editedassembledre withChromasPro and

sequences shorter than 400 bp from bp than400 shorter sequences μ on, and the novel sequences sequences thenovel on, and L reaction volume,using L the reaction ′ - CTT CAY GYA GYA CAY - CTT This article is protected Allrights bycopyright. reserved. difficult-to-align nucleotide positi using MAFFT class level) atthe (e.g. alignmdiscarded; (iii) sequence package, which generate large-scale alignments large-scale generate which package, eukaryotic 18SsmallsubunitribosomalRNAusing 2Dst conserved aligned on based seque reference the against algorithm) Wunsch phylum using onglthreshold based asimilarity da tothisvalidatedcore added subsequently were Additional publicly plas available below). conflicts, otherpossible and mislabeled sequences model(Price GTR methodusingmaximum-likelihood the approximate andaccurate v.2.1.1,afast using FastTree level) class the at generally (i.e. cons treeswere 2009);(v)phylogenetic Gutierrez va a-gt program(with trimAlv1.4 the program toverifyvisualized ofintronsor thepresence Accepteddetermine thetaxonomiclevelof well as implementedinTreeDyn 2006). Functions v.4(Gouy implemented inSeaview 16S rRNAgene ribosomalgenes. orother Phylog alignments allowedus toverify 2009). 2D-based Article ents were performed for differe for performed ents were ons were removed forsubsequent ons wereremoved ructures and sequences of the archaeal 16S, bacterial 16S and and 16S bacterial 16S, archaeal ofthe andsequences ructures tidial 16SrDNA sequenceswith each sequence (e.g at the "Family" level). All sequences atthe"Family" (e.g level). Allsequences eachsequence et al. et v6. 953b with default options (Katoh withdefaultoptions(Katoh 953b v6. 2010), and visualized using TreeDyn (Chevenet (Chevenet usingTreeDyn and visualized 2010), putative chimericsequences ofuptomillions ofsequences (Nawrocki whether the new sequences corresponded to the tothe corresponded sequences new whether the obal pair-wise alignments (using aNeedleman- alignments pair-wise obal taset. These sequences were assigned to a given given a to assigned were sequences These taset. nces. Sequences of each phylum were then phylum then were ofeach nces. Sequences and to build up the taxonomic framework (see (see framework thetaxonomic and tobuildup lue of 0.8, and -st value of0.001;Capella- and -stvalue lue of0.8, tructed separately for for tructed separately enetic trees built were then using as BioNJ the SSU-align program and Infernal software Infernal software and program the SSU-align as specific Python scripts allowed usto scriptsallowed Python as specific

et al. et nt well-defined taxonomic groups taxonomic nt well-defined 2010), in order to identify identify to inorder 2010), phylogenetic analyses using using analyses phylogenetic uncertain taxonomic status taxonomic uncertain ; (iv) poorly aligned or ; (iv)poorly or aligned each taxonomic group taxonomic each et al. et 2002),and et al. et et al. et

This article is protected Allrights bycopyright. reserved. levels: 1-;2-Supe establis For validatedsequence,we every new The taxonomic framework ofPhytoREF PhytoREF ID number. included inPhytoREFhave twouniqueiden ). Overall, the standardized ). Overall,thestandardized http://www.nlm.nih.gov/research/umls/sourcerel wasfollowe (Mayclassification ofNCBI 2014) inAlgaeBase notpresent thatwere the taxa Guiry 2014;Guiry and (Guiry AlgaeBase database based were macroalgae and micro- freshwater eukaryotes (Adl Guillou wasderive PhytoREF framework of taxonomic Family;"Supe and10-Species.For the 9-; the "Family" level would be: Family,Family_X Accepteddescribed forthePR2da as waslabeled sequence In the cases, these level. "Genus" or "Family" the only at e.g. resolved, identity theta taxonomic because itwasnotpossible rDNAsequences, For some16S by envir high-throughput assist in theenvironmentalgenerated plastidial analysis large amplicons datasets of of 16SrDNA Article et al. et 2013), which mainly follows a comprehens 2013),whichmainly followsa et al. et 2012).The"Family" "Order" le and onmental metabarcoding. onmental metabarcoding. tabase. For instance, the taxonomic tabase. For thetaxonomic instance, r-group; 3-Phylum; Class;5- 4- Subc taxonomic framework established xonomic descriptionofthecorre xonomic hed a standardized and hed astandardized and PR2 (mostly embryophytes), thetaxonomic PR2(mostly embryophytes), and easedocs/current/NCBI/metarepresentation.html tifiers, numberanda theGenBankaccession (for the "Genus" level), and Family_XX (for the the Family_XX (for and the"Genus" level), (for on the taxonomic classification system of the ofthe classificationsystem on thetaxonomic d from the PR2 database (http://ssu-rrna.org/; (http://ssu-rrna.org/; database d fromthePR2 d (taxdump.nodes andtax d (taxdump.nodes et al et r-group", "Phylum"andr-group", "Class" levels, the to defineanaccura

. 2014; http://www.algaebase.org/). For For . 2014;http://www.algaebase.org/). ive recent classification framework of framework classification ive recent path ofasequenceidentifiedupto vels of terrestrial, marine and marineand vels ofterrestrial, in PhytoREF wasdesigned to lass; 6-Order; 7-Sub-order;8- sponding organism isnotfullysponding ranked taxonomy with10 dump.names filesat te and/or complete te and/or This article is protected Allrights bycopyright. reserved. Accepted allowedustodetermine that52sequen analyses BLAST with combined genomes, 1).2Dalignments Fig. plastidial from extracted sequences amp public databases(5,200 strains from marinemicroalgal are 6,051sequences (of which rDNA sequences c 1)currently (release The PhytoREF database OverviewPhytoREF of database andDiscussion Results "k"symbiont" or as modified andmarked was databases public in host the assigned to incorrectly generally are that sequences these symbiotic microalgae from originating sequences rDNA 16S included we also Finally, RefSeq=7:Genus). RefSeq=6:Family; RefSeq=5:Order; Article super-gr RefSeq=2: (RefSeq=1: Eukaryota; at whic indicating level PhytoREF the sequence, fo level level). Aconfidence ("Species" andclade_7A1+sp level) ("Genus" clade_7A1 level), ("Family" clade_7A level), an VII clade to belonging instance, prasinophytes the16SrRNAgene with analyses phylogenetic PhytoREF aboutthemol information database, IX VII, clades prasinophyte (e.g. based taxonomy onpublis based and sub-clades informal clades groupsof somekey Moreover, level). "Species" licons from organisms and envi licons fromorganisms r taxonomic assignation (named Refseq) was given to each wasgiven toeach assignation Refseq) r taxonomic (named were produced inthis were produced study a leptoplastid" inthe PhytoREF database. ontains 6,490partialand ontains oup; RefSeq=3:oup; RefSeq=4: Phylum; Class; hed phylogenetic analyses without morphology- analyses hed phylogenetic and indicatedatdiffere and h a given sequence is unambiguously assigned assigned isunambiguously given sequence h a d sub-clade A1 are annotated: clade 7 ("Order" 7("Order" clade annotated: are A1 d sub-clade ces (mostly from streptophytes) considered as as considered streptophytes) from ces (mostly > 800 bp long). In total, 411 novel sequences In 411novelsequences > 800bplong). total, or kleptoplastids found in hosts. The origin of hosts.Theorigin or kleptoplastidsfoundin ; Apicomplexa-related lineagesI For ; -V). the Apicomplexa-related ecular clade was verified through specific wasverified ecular clade microalgae have only classifiedinto have been microalgae

nd 6,079 sequences retrieved from from nd 6,079sequencesretrieved ronmental samples,and879 nt taxonomic ranks.For nt taxonomic complete plastidial16Scomplete This article is protected Allrights bycopyright. reserved. Accepted final includedinthe also were Ulvophyceae) angiosperms), and marine and freshwater macroalgae (e.g., Rhodophyceae, Phaeophyceae, (i.e streptophytes from sequences reference novelplastidialreferenc producing effo 2and3).Although (Figs. our (20 sequences) (501sequen sequences), Alveolata (144 (3,8 Archaeplastida the databaseisasfollows: inPhytoR represented are marine environments lineages photosynthetic major of All oftheknown Taxonomic composition of PhytoREF is alsoprovided). kleptoplastids, parasitic or mutualistic microalgae(in such cases, the taxonomic name ofthe host Article to morphologically belong id environmental or strain, and original publicati molecular (amplicon origin orextracted from to path,all sequences associated taxonomic were andclassifiedfollowing phylogenetically-analyzed was quality-checked, PhytoREF sequence Every analyses). andphylogenetic algorithm onsequence based to knowneukaryotic lineages contains 1,867environmentalSange PhytoREF.In tosequences addition identified from plastid-bearing organisms, PhytoREF nuclear actually were rDNAinGenBank 16S on. Additional categories indi e sequences) mainly focused mainly focused e sequences) r sequences from clonelibraries from r sequences

. from mosses and ferns to gymnosperms and to mossesandferns . from genomes), GenBank accession number, cultured number,cultured accession genomes), GenBank 34 sequences), Stramenop sequences), 34 thecompositionof level, super-group Atthe EF. reference database of all known photosynthetic known photosynthetic all of database reference 18S rDNA, and were therefore excluded from excluded therefore were rDNA,and 18S entified organisms, and if they correspond to correspond ifthey and entified organisms, ces), Excavata (288 sequences), and Rhizaria Rhizaria and (288 sequences), Excavata ces), our standardized taxonomy. In additiontothe taxonomy. ourstandardized a suiteofdescriptors,such astheorganism, rt whilebuilding in PhytoREF (inparticular similarity (combining a Needleman-Wunsch similaritya Needleman-Wunsch (combining eukaryotes from terrestrial, freshwater and and freshwater fromterrestrial, eukaryotes

cated whether sequences are are sequences whether cated on marine microalgal taxa, taxa, microalgal on marine , which have been assigned assigned havebeen , which ila (1,704 sequences), ila (1,704sequences), covering a wide taxonomic dive awidetaxonomic covering This article is protected Allrights bycopyright. reserved. freshwat from species PhytoREF with 115described fromthesuper- Emden 2005).Theeuglenozoans genera forthe except stillunderrevision, are cryptophytessystematics of featuresand not correspondwellwithtaxonomic (e.g. from marine species described 36 covering Ha the Amongst sequencing. reference for future ma thantheir represented less are taxa freshwater photosynthetic(Janouškovec lineageidentified toformanovel "S25_1200" Ocean,named found inthePacific 3).Within (Fig. respectively described species, most importantmicroalgae in (coccolithophor Haptophyta and algae (Chlorophyta) respec 82sequences, 161,46,and by represented are genera) (18 Ulvophyceae and genera), (7 genera), Phaeophyceae (61 Rhodophyceae ge 796 373familiesand representing sequences, arenumerically Land (streptophytes) plants pollen)or (gametes, reproductive stages ofm presence the detect asto aswell lakes), habita andfreshwater marine indifferent microalgae eukaryotes. Thus,PhytoREF can be used inmetabarcoding surveys tostudycommunities of Acceptedsp. and Article that have been relatively well delineat thathavebeenrelatively well sp.) and fresh (e.g. (e.g. and fresh sp.) the database with 1094,653a with thedatabase rsity with109,79,and34descri in the digestive tracts of herbivores. tractsofherbivores. in thedigestive acroalgae and streptophytes in aquatic systems as systems aquatic in andstreptophytes acroalgae the dominantgroup the inPhytoREF with2,973 tively (Fig. 3). Diatoms green (Bacillariophyta), crobia, there are 126 sequences of cryptophytes cryptophytes of 126 sequences are crobia, there sequence environmental one theHaptophyta, rine relatives and represent an obvious target anobvioustarget represent and rine relatives group Excavata are alsowellrepresentedin are group Excavata (EF574856), was included asithasbeen wasincluded (EF574856), ed (Hoef-Emden & Melkonian 2003; &ed (Hoef-Emden Melkonian Hoef- ts (e.g. seawater, estuaries, brackish waters, waters, brackish estuaries, (e.g.ts seawater, sp.) waters. Because genetic diversity does geneticBecause diversity does sp.)waters. life-stages are likely tobe complex, thelife-stages arelikely es and their relatives their es and classes the from macroalgae The nera. er and marine habitats, such as species of suchasspecies habitats, and marine er

sp.), brackish (e.g. (e.g. brackish sp.), nd 369 sequences, respectively, respectively, sequences, nd 369 et al. et bed genera and268,113,67 genera bed 2011). For the , algae, green 2011).For the ) are numerically the the numerically ) are Cryptomonas and and

This article is protected Allrights bycopyright. reserved. we success rates amplification PCR low very the small minicircles (Zhang tothenon-photosyn correspond sequences heterotrophic euglenid PhytoREF are parasites, very often as the such holoparasitic angiosperm distributed inmarineand freshwate (Hacr rappemonads the recentlydiscovered Euglena unique genomic ofplastidgenes organization in very is rapidly-evolving andtheirsequencesare gene photosynthetic dinoflagellates of rRNA theplastidial16S ARLs, and As inapicomplexans microalgae AR called lineages, 2014). Apicomplexan-related plastid relict which havekepta Toxoplasma (Williamssome taxa & Keeling single-celled eukaryotes,but a vestigial plastid co photosynthesisha evolution, course of During the evolutionary and environmenta gene iscu 16SrRNA theplastidial rappemonads, no nuclearribosomal(18SrR Accepted by Janouškovec proposed the framework inPhytoREF (mostl byrepresented 77sequences Article , Monomorphina , Chromera Babesia ), which are obligate), intracellular parasites and ofmetazoans protists, but (Moore (Moore Euglena longa and et al. al. et Trachelomonas l studies ofthislineage. 2002; Green 2011). This extreme genetic divergence may explain genetic may divergence explain Thisextreme 2002; Green2011). , knownastheapicoplast(Lim 2003). Someofthesenon-photosynth et al. et NA gene) and genomic sequences are available for the forthe available are genomic sequences NA gene) and rs, but taxonomically undescribed (Kim (Kim rs, buttaxonomically undescribed 2008) and 2008)and , and the green alga alga green , andthe et al. et thetic alveolat thetic . Of note, PhytoREF also contains 6 sequences of 6sequences note,PhytoREFalso contains . Of (2011) i.e. ARL I, II etc. III,etc. ARL i.e. (2011) obia), an uncultured microalgal group widely groupwidely microalgal anuncultured obia), obtained during the present study ondifferent obtainedduring thepresent ntaining a 16S rRNA gene has been retained in genentaining hasbeenretained a16SrRNA dinoflagellates thatcan y environmental), and classified according to to according classified y and environmental), difficult toalign. This may be related tothe rrently the only geneticrrentlyavailable theonly marker for s been lost in several lineages of plants and lineagesofplantsand s beenlostinseveral Ls (class Colpodellid) Vitrella

e apicomplexans (Oborník Helicosporidum & MacFadden 2010; MacFadden 2010;MacFadden MacFadden etic organismsetic presentin Epifagusvirginiana et al. et be found separately in be foundseparately , which include the the , whichinclude

. In particular, 33 In. particular, et al. et (e.g. 2012),arealso 2011).Since Plasmodium , the , This article is protected Allrights bycopyright. reserved. r several plastidial16S consider when interpreting metabarcoding using datasets PhytoREF.It important tonote that is dinoflage limited numberof dinofla cultures ofphotosynthetic levels (from 98 to 80%) and the longest sequences sequences and thelongest 98to80%) levels (from se These assignation query sequences. of taxonomic sequences have also beenincludedas inPhytoREF separate files any toavoid ambiguities inthe biases inannotationofthe Mesodinium rubrum other organisms inPhytoREF present that caneither establish kleptoplastidy (e.g. the Ciliata (Kim and chlorophytes raphidophytescryptophytes, dinoflagellate "stolen" dinoflage by photosynthetic eukaryotes they mislabeledas"dinoflagellate" were wh Acceptedcytochrome asth isnotashigh plastidial 16SrDNAbarcode using or metabarcoding total diversity photosynthetic of eukaryotes To date, PhytoREF only isthe toolinmolecular ecology specifically designedexplore the to PhytoREF: anew tool to explore the ecology photosynthetic eukaryotes of http://phytoref.org/. Article c oxydase I gene for (Herbert (Herbert animals for gene I oxydase Dinophysis

and benthic Foraminifera), or photosymbi or benthic Foraminifera), and DNA sequences from GenBank were re-a were fromGenBank DNA sequences metagenomicsapproach can sequester of different microalgal prey, such as suchas microalgal prey, ofdifferent cansequesterplastids llate sequences (34 sequences repr (34 sequences llate sequences metabarcoding reads. Finall reads. metabarcoding ). Re-assignment of these sequences was necessary to avoid avoid necessary to sequenceswas ofthese ). Re-assignment gellates. Consequently, oneshor from complex marine and marine and complex from at ofestablishedbarc at

en in fact they correspond toplastidsen infactthey ofcorrespond es. Although resolutionof the thetaxonomic llate hosts (kleptoplastids). For instance, the the llate hosts(kleptoplastids).For instance, et al. et of each cluster are of clusterare av each quences were clustered at clustered were quences et al. et 2003)

2012). This issue was also found in all all also foundin 2012).Thisissuewas y, 270016S rDNAcyanobacterial osis with microalgal cells (e.g. the the cells (e.g. osis withmicroalgal

and the large subunit of ribulose subunit of the large and esenting 15 genera), a caveat to to caveat a genera), 15 esenting ssigned in the database because thedatabasebecause in ssigned tcoming ofPhytoREF isthe odes like the mitochondrial like themitochondrial odes ailable for download at download for ailable terrestrial ecosystems terrestrial ecosystems different similarity similarity different This article is protected Allrights bycopyright. reserved. gene1,5-bisphosphate ( carboxylase 2010; Na (Edvardsen the haptophytes as major lineages, such Stern Cryptophyta; allphotos and distinguish recover potential tobeasuitableproxy Zhu >30,000 copies; algae green the (e.g., magnitude and biovol cellsize size, genome with correlates gene, which rRNA 18S nuclear the of that than important less much be to therefore seems numb copy The 2011). genes (Green ribosomal RNA withtheplastidwhich isinaccordance genome st 2 only sofar, sequenced (mainly streptophytes) Pedinomonas minor 6copiesintheeuglenophyte 4and from 1to10(e.g. of number thecopy In thisstudy, foundthat we the diversity ofphot toexplore "pre-barcode" rRNA(Pawlowski 18S for the Working Group Accepted (e.g a species cycle of also throughout life the copiesperindividual) of16S thenumber (hence canvary ofplastids gene. Thenumber 16S rRNA plastidial the with may occur biases also theenvi eukaryotic phototrophsin the 18S RNAgene, respectively, while the diatoms Article et al et . 2012), and diatoms(Pillet , respectively). However, inabout80% However, , respectively). et al et et al. et . 2014), and down to the genus and sometimes species level for most for species level genus and sometimes tothe and down . 2014), 2005;Godhe in metabarcoding studiesfora in metabarcoding Prasinococcus ronment. Nevertheless, one has to consider thatbiological toconsider onehas ronment. Nevertheless, ynthetic eukaryotes at the class level, family level (e.g. (e.g. level family level, class at the ynthetic eukaryotes rbc L) for plants (CBoL Working Group 2009), it can can 2009),it Group Working (CBoL Plant plants L) for et al. et osynthetic eukaryotes in the environment. environment. eukaryotes inthe osynthetic 2008). Thus, the plastidial 16S rRNA gene hasthe 16SrRNA Thus, theplastidial 2008). et alet the 16S rRNA gene in plastid genomes can range range genomescan gene inplastid the16SrRNA et al. et sp.and copies of the 16S rRNA were found (Table S2), (Table found were 16SrRNA ofthe copies not only within one cellamong eukaryotes, but . before and after cytokinesis). Although most andafter before . ructure withtwoinverted ructure . 2011).Asproposed by theCBoL Protist ume and can vary by up to four orders of of orders uptofour by can vary ume and 2012), the 16S rRNA gene can be used as a a usedas gene be can 2012),the16SrRNA er variation of the plastidial 16S rRNA gene 16SrRNAgene oftheplastidial er variation et al et Ditylum Ostreococcus

Euglena gracilis Euglena gracilis . 2011), euglenozoans (Linton . 2011),euglenozoans of theplastid genomes ofeukaryotes ssessing the relative abundance of of ssessing relativeabundance the sp.and sp.have2and4copiesof Coscinodiscus repeats that duplicate repeatsthatduplicate and the prasinophyte and theprasinophyte sp.have et al. et

TreeDyn programs. In foreachsequence, addition, programs. TreeDyn or Mothur QUIIME, the beused by to are proposed database ofthe formats anddifferent file, Information with eachsequen associated RCC393). or AY702161 (e.g. code strain culture the numberor accession GenBank asthe identifiers such phylum, rank(e.gby genus,species) taxonomic options. Seque search different charts) andperform This article is protected Allrights bycopyright. reserved. 2004). Bendich in landplantsfromtenstovaries greatly 50-100copiesof eukaryotes typically maintain rDNA 16S the alter can also which species, number Less knownaboutthe is centric diatoms). plastids,some or afew one diatoms) haveonly hapt groups(e.g. microalgal species inmany GenBank database providemore database GenBank and abstractavailableCulture linkstothe are Collection Roscoff onthe site. web Web and thedata users toexplore and allows sequences, and ra easy provides interface web The PhytoREF Description of the PhytoREF web interface Amphidinium operculatum hasbeen thecellcycle of plastidDNAthroughout 1998), copies(Armbrust genome contains about75 Accepted Hiramatsu Article et al et . 2006; Koumandou & Howe 2007). & 2007). Howe . 2006;Koumandou

In microalgae, the plastid of the chlorophyte chlorophyte the of plastid the microalgae, In and the chrysophyte chrysophyte andthe informationaboutthe taxonomy theorigin ofeach and hundreds during theplantdevelopment(Oldenburg & the plastidgenomes perplastid.This number taxa can harbour more than 100 plastids (e.g. 100plastids(e.g. than canharbourmore taxa using orthrough ataxonomy specific browser, ophytes, cryptophytes, chlorophytes, ophytes, chlorophytes, cryptophytes, pennate pid access to all reference plastidial 16S rDNA plastidial16S pid accesstoallreference copy number per individual.Photosyntheticcopy per number base withinteractive shown for this taxon and for the dinoflagellate andforthedinoflagellate shown forthistaxon ce can be also downloaded as a tab-separated bealsodownloadedasatab-separated ce can but continuous replication and accumulation andaccumulation butcontinuousreplication nces can be retrieved in Fasta format either either format Fasta in nces canberetrieved Ochromonas publication metadata su publicationmetadata of plastid genome copies in microalgal copiesinmicroalgal genome ofplastid

(Coleman & Nerozzi 1999; & Nerozzi (Coleman Chlamydomonas reinhardtii graphs (e.g. Krona pie Krona pie graphs (e.g. ch astitle, authors

This article is protected Allrights bycopyright. reserved. Conclusionperspectives and page. dedicated a in sequences reference toindicateerro encouraged usersare PhytoREF, hitseque downloadselected and sequences, reference users toidentify individualormultipleplastid plastidialBLAST sequence. interface Finally,aallowing isavailable 16SrDNA the site web on freshwater microalgae, which may leadtoco microalgae, freshwater Ph inthecurrent organisms areunder-represented free-living,or parasitic mutualisticfrom aquatic organisms and terrestrial habitats. Some eukaryotic lineageshave that a group totargetany design andprobes ofprimers environmental sequences frommassive metabarcodingand metagenomic datasets; and (iii) plastidial16SrDNA ofnew and classification eukaryotes givenin any It ecosystem. can be used PhytoREF explor isthefirstresourceallowing Accepted suchasth fromtheliterature, features from publicsequences validationofnew expert Updateswill beevery the performed database of eukaryotes in pastandpresente thepotentialtobeusedformanyap therefore Article functional or relictfunctional plastid are or re nvironments (water, sediment, ice) to feeding selectivity studies. studies. selectivity feeding to ice) sediment, (water, nvironments

e description of novel algal classes.. ofnovelalgal classes.. description e plications frombiomonitoringplications ofphotosynthetic ial 16S rDNA sequences against all PhytoREF against allPhytoREF ial 16SrDNAsequences rs and suggest better taxonomic placements for placementsfor andsuggestrs bettertaxonomic six monthsbysix new reference adding sequences, ation ofthetotaldiversity photosynthetic of GenBank, and inclusion of novel taxonomic ofnovel and inclusion GenBank, for a range of purposes, such as: (i) annotation annotation suchas:(i) ofpurposes, range fora ofphotosynthetic eukary arse taxonomic assign arse taxonomic sequences; (ii)tax sequences; ytoREF asdinoflagellates and version,such nces. In order to improve future releases of of releases future toimprove In order nces.

presented in PhytoREF, in including presented onomic assignation of of onomic assignation ations. PhytoREFations. has

otes. Allofthemain otes. strains. and microalgal Camille AkiraKuwata, Jackson, " resources of" andvisualization HPC the to access granted " French 6-311975) andthe MicroB3 (UE-c European Unionprograms d'Avenir This article is protected Allrights bycopyright. reserved. and da provided resources LG,IP MG, SR,DV,RCanalyzed JD, data. theexperiments. AZ,FG, PG,ALS performed re the anddesigned SR andCdVconceived JD, Contributions Author plants). land eukaryotes (essentially Table S2. inthis obtained been sequence have Table S1: Supporting Information providedinTableS1. are rDNAsequences 16S theacces and http://phytoref.org/, web interface inthe All oftheplastidial16S rDNA sequences Data Accessibility managed the by This workwassupportedby pr theOCEANOMICS Acknowledgments Accepted Article NiceSophiaAntipolis Université " ANR-11-BTBR-0008. Financial support for Financial" support ANR-11-BTBR-0008. Number of 16S rRNA copies found in pub Numberof16SrRNAcopiesfoundin List of the eukaryotic microalgal strains from which the plastidial 16S rDNA List 16S fromwhichtheplastidial strains microalgal ofthe eukaryotic Agence Nationale delaRecherche Agence tabases. RC and DT designed the tabases. RCandDTdesigned Investissements d'Avenir Investissements PoirierandDaphnéGruloisfo ". We are grateful to Ales Horák, Jan Janouškovec, Chris Janouškovec, toAlesHorák,Jan ". aregrateful We

study by DNA extraction andPCR. extraction studyDNA by ontract-287589) andMaCuMBA (FP7-KBBE-2012- ontract-287589) search, and wrote the paper. JD, SR, RS,MB, JD, thepaper. andwrote search, sion numbers of the newly produced reference reference produced numbersofthenewly sion PhytoREF database can be downloadedviathe can PhytoREF database , under the grant agreement " grant agreement the , under oject, funded by the French Government and Government French oject, fundedby the lic plastidialgenomes inphotosynthetic " program EMBRC-France. This work was workwas This EMBRC-France. " program

this workwasalsoprovidedby the Centre de Interactif Calcul web site platform ofPhytoREF. web siteplatform r providing 16S rDNA sequences r providing 16SrDNAsequences Investissements Investissements " hostedby Academy of Sciences ofth Academy ofSciences forlandplants. barcode CBoL (2009)ADNA PlantWorking Group analyses. phylogenetic large-scale in trimming alignment This article is protected Allrights bycopyright. reserved. Annual ReviewofMarineScience of Chloroplastsa of inheritance Armbrust EV(1998)Uniparental Microbiology Therevi M, HamplVetal.(2012) Adl SM,SimpsonAGB, BurkiLukes Lane BassD, Bowser F, SS,Brown CE, J, Dunthorn MW, References analysis. analysis. Project The RibosomalDatabase Chai BCole JR Farris RJ,Wang Q,Kulam SA oftrees. and annotationsforanalyses AL, B,Banuls BrunChevenet F, Christ C, Jacq blooms by serial parasitic killers. LChambouvet A,MorinP,MarieD,Guillou (200 Capella- (2014)Rhizaria. Burki PJ Keeling F, Naples Bay. ofunculture patterns Diversity Vargas C(2013) BittnerL,Audic S,Romac Gobet A, S,EggeSantini S,Ogata ES, I,B, H,Probert Edvardsen de 233-243. global biodiversity. eukaryotic way towardsunderstanding JG L, CreerS,Caporaso D Bik HM,Porazinska Botanical Research ofalgae (2012)Theevolution Archibald J and blooms: paradigm shifts GM(20 AD,Hallegraeff Anderson DM,Cembella pp. 93–113. Acceptedcomponent. AM (1999)Te Nerozzi Coleman AW, Article

Gutiérrez S,Silla-Gutiérrez Nucleic Acids Research, Acids Nucleic International Reviewof Cytology Molecular Ecology Molecular , 59 nd Mitochondriain , 429 , 64 – , 87-118. 493. e UnitedStatesofAmerica

Martínez JM, GabaldónT(2009) JM, Martínez new technologies for research, research, for new technologies , (RDP-II): sequences andt (RDP-II): sequences 22

Science , 33 sed classification of eukaryotes. ofeukaryotes. classification sed 4 , 87-101. , 87-101. , 143-176. , 143-176. , 294-296. , 294-296. Chlamydomonas Current Biology BMC Bioinformatics BMC mporal and spatial c and mporal , 322 by secondary and tertiary endosymbiosis. endosymbiosis. andtertiary secondary by ,

, 1254-1257. , 1254-1257. , McGarrell DM, Garrity GM, Tiedje JM (2005) Tiedje JM GM, DM,Garrity , McGarrell 193 en R (2006) TreeDyn: towards dynamic graphics dynamic towards R (2006)TreeDyn: en d Haptophytes unravelled by pyrosequencing in by pyrosequencing unravelled Haptophytes d , Knight R, Thomas WK (2012) Sequencing our Sequencing , Knight (2012) R,ThomasWK chloroplast genomes. In The Molecular Biology In Molecular genomes. The chloroplast 8) Control of toxic marine8) Controloftoxic dinoflagellate , 12) Progressinundersta

125-164. , , , S. Merchant, ed (Amsterdam: Kluwer), ed(Amsterdam: Merchant, , S. 106 24

, R103-R107. Bioinformatics , 12794-12797. , 12794-12797. oordination of cells withtheirplastid oordination ofcells , Trends in Ecology & Evolution Trends inEcology ools for high-throughput rRNA rRNA ools forhigh-throughput 7 monitoring, and management. and monitoring, , 439. TrimAl: a tool forautomated TrimAl: atool Journal of Eukaryotic Proceedings oftheNational Proceedings , 25 nding harmful algal algal harmful nding , 1972-1973. Advances in Advances in

, 27 , This article is protected Allrights bycopyright. reserved. taxonomy. phytoplankton. evolution ofmoderneukaryotic Falkowski ME,KnollAH,Quigg Ra PG,Katz A, (Haptophyta). Prymnesiales provi revision and amorphological phylogenies AG, Sáez ThrondsenJ, B, EikremW, Edvardsen compatible withARB. ofth Academy ofSciences of sy mode (2012) Anoriginal DecelleI,Bittner Probert L, J, S,de Desdevises Y,colin Vargas M,SimoR,NotF C,Gali Claustre H(1994)Smallest Courties C,VaquerA,Trousselier transect. transect. Sea anArabian along structure community picoeukaryote of photosynthetic analysis Molecular DG,Woodward GA,Cummings Tarran Fuller NJ, chimera-ch a (2006)Greengenes, Andersen GL DeSantis TZ,Hugenholtz P,Larsen N,Rojas M, Phaeocystis antarctica stage cell LK inthehaploid-diploi (2010) Anew S,DavidsonA,Assmy, P,Chen Gaebler-Schwarz plastids. algal marine towards biased at open photosynthetic diversity picoeukaryote Pitt FD, CampbellC,AllenDJ, Zwirgl Fuller NJ, Applied and Environmental Microbiology Applied andEnvironmental bioma diatom anddinoflagellate V, Härnström K,Saravanan Godhe A,AsplundME, Green BR (2011) Chloroplast genomesBR (2011)Chloroplast Green Evolution interface forsequence alignment and phylogenetic tree building. Gouy O(2010)SeaView M,GuindonS,Gascuel Accepted eukary ofunicellular acatalog (PR2): database Berney S,BassL, D, D,Audic Guillou Bachar 44. Article Limnology and Oceanography Limnology and , 27 Nucleic Acids Research Acids Nucleic , 221-224. anditsecological implications. Applied and Environmental Microbiology andEnvironmental Applied e UnitedStatesofAmerica eukaryotic organism. eukaryotic organism. European Journal ofPhycology European mbiosis in open ocean plankton. mbiosis inopenocean sses in coastal marine seawater samples by real-time PCR. PCR. real-time by samples seawater marine coastal sses in M, Lautier Chrétiennot-Dinet M, J, , 1-8: doi:10.1093/nar/gks1160 , 1-8:doi:10.1093/nar/gks1160 Aquatic Microbial Ecology Aquatic Microbial of photosynthetic eukaryotes. photosynthetic of , 51 , 74 , 2502-2514. , 2502-2514. , 7174-7182. ote Small Sub-UnitrRNAsequencesote withcurated ocean sites in the Arabian Sea using a PCR Seausing sitesintheArabian ocean C et al. (2012) The Protist Ribosomal Reference Ribosomal Reference C etal.(2012)TheProtist ecked 16SrRNAgeneda ecked de the basis for a revised taxonomy ofthe revised taxonomy thebasisfora de Science Probert I,LK RibosomalDNA Probert Medlin (2011) d life cycle of the colony-forming haptophyte haptophyte thecolony-forming d lifecycle of Brodie EL, Keller K, Huber T, Dalevi D, Hu P, D,HuP, T,Dalevi Huber Keller K, Brodie EL, maier K, Le Gall F et al. (2006b). Analysis of (2006b). Analysis et al. LeGall F maier K, Nature version 4: a multiplatform graphical user 4:amultiplatformgraphical user version ven JA, Schofield O, Taylor FJ (2004)The FJ SchofieldO,Taylor ven JA, JX, Henjes J, Nothig EM, Lunau Nothig EM, HenjesJ, JX, M,Medlin EMS, OrcuttKM,YallopMetal.(2006a). Tyagi A, et al. (2008) Quantification of TyagiA, etal.(2008) , 109 Journal ofPhycology , , 305

370, , 18000-18005. , 18000-18005. , 354-360. , 354-360. ,

46 255. Proceedings oftheNational Proceedings , ,

Molecular Biology and Molecular , 202-228. 202-228. 43 72 M-J, Neveux J, MachadoC, J, Neveux M-J, , 79-93. , 5069-5072. The Plant Journal The Plant tabase and workbench andworkbench tabase , 46 , 1006-1016. , 1006-1016. , 66 , 34- This article is protected Allrights bycopyright. reserved. DNA barcodes. JR A, Ball PDN,Cywinska DeWaard Herbert SL, 5756-5760. GuiryGuiry GM(2014) MD, 1500. Academy ofSc Proceedings oftheNational anddiverseplastid-bear (2011)Newly identified JM HM,RichardsTA,WordenAZ, Archibald MDM,Wilcox Sudek S,Jones Kim E,HarrisonJW, in mitochondrialandchloroplastnuc S, Misumi O,Kuroiw Hiramatsu T,Nakamura 313-321. ecosystems. basis ofAtlanticoligotrophic Bu AP, GA,Martina Hartmann M,GrobC,Tarran algae. for resource line an on- (2014)AlgaeBase: DJ P,Garbary I, CF, Kuipers Carter DM,Bárbara John Langangen A, L,Morrison GuiryGuiry, GM, RindiF, MD, Vale April 2014. on18 searched Ireland, Galway. http://www.algaebase.org; University of prymnesiophytes in the subtropical prymnesiophytes inthesubtropical ScanlanDJ L, PearmanJ, Zubkov Jardillier MV, reefs. coral in lineages apicomplexan-related reveals diversity FL,K HorákA,Barott KL, Rohwer J, Janouškovec reinhardtii on fast Fourier transform. transform. on fastFourier MAFF Katoh M,KumaM(2002) 1180-1192. dimorphism. andmor phylogeny combination ofmolecular M K, Melkonian Hoef-Emden operons. ribosomal anal combined phylogenetic - () K (2005)Multiple Hoef-Emden Accepteddiatoms. toinventorysequencing taxonomic Next-generation A(2013) Bouchez JF, Humbert P, F,Chaumeil Rimet A, L, Franc Kermarrec Article Molecular Ecology Resources Molecular Ecology ( Chlorophyceae Protist Proceedings of the Royal Society of oftheRoyalSociety Proceedings , 154 Cryptogamie Algologie Cryptogamie , 371-409. Nucleic Acids Research Acids Nucleic ) cell cycle. cycle. ) cell AlgaeBase (2003) Revision of the genus (2003) Revisionof independentlossesofphotosynthesis genus inthe Journal of Molecular Evolution Journal ofMolecular T: a novel method forrapidmultip T: anovelmethod diversity in eukaryotic comm diversityeukaryotic in and tropical northeast Atlantic Ocean. leoids and mitochondria during the during leoids andmitochondria , Journal of Phycology Journal of 13 . World-wide electronic publication, National publication, electronic . World-wide Proceedings National Academy Science USA NationalAcademyScience Proceedings , 607-619. , 607-619. iences of theUnitedStates ofAmerica iences phology provides insights provides phology intoalong-hidden a T, Nakamura S (2006) Morphological changes changes Morphological a T,NakamuraS(2006) , yses of DNA sequences of the nuclear andthe thenuclear of yses ofDNAsequences 35 (2010) Significant CO2 fixation by (2010)Significant CO2fixation small through(2003) Biological identifications , nzuela MirandaS,Math nzuela , 105-115. rkill PH, Scanlan DJ et al. (2012) Mixotrophic etal.(2012)Mixotrophic rkill PH,ScanlanDJ 30 eeling PJ (2012)Globalanalysis ofplastid PJ eeling LondonSeriesB-BiologicalSciences ing branch on the eukaryotic tree oflife. tree eukaryotic onthe branch ing , 3059-3066.

Cryptomonas ,

Current Biology 42 , unities: a test for freshwater unities: atestforfreshwater ,

1048-1058. 1048-1058. 60 le sequence alignment based le sequence , 183-195. Chlamydomonas (Cryptophyceae): a (Cryptophyceae): The ieson AC, Parker BC, ieson AC,Parker

ISME , 22 , , R518-R519. Cryptomonas 108

Journal , 1496- , 109, , 270 ,

4 , ,

perspective on marine photosynthetic photosynthetic perspective onmarine LE, BoJardiller Lepère NotF, C, Kirkham AR, Transect. ofphotosynthetic pi scale distributionpatterns LE, Tiganescu A,Pear Kirkham AR,Jardillier plastids ofmultiple algalorigins at the time. same Logares R, Audic S, Bass D, Bittner L,Boutte, D,BittnerBass Logares R,AudicS, gen.(Euglenophyta). nov. chloroplastSSU LSU, and nuclear SSUand Triemer RE(2010)Reconstructin This article is protected Allrights bycopyright. reserved. Kim M,S,YihW, Park MG( activity of uncultured marine activity ofuncultured S,Pernice MC,de Logares R,AudicS,Santini abundant marinemicr Perspect Biol perspec Knoll AH(2014)Paleobiological 922-936. Journal dramaticallygrowth phase with inthe dinoflagellate (2007)Thecopy number CJ KoumandouVL, Howe Micromonas (2014)PhagotZM, RW Sanders McKie-Krisberg Sea unveiledusing photosystem-II activemari Diversity of Vaulot D,BéjàO(2011) I, MF, Polz LeT, Berman-Frank Gall F, D,PhilosofA,KirkupBC, Yogev Man-Aharonovich (2014)Apicoplast. MacFadden GI AcceptedLinton Philosophical transactions oftheRoyalSocietyB Theevolution, (2010) L, GILim MacFadden Microbiology South EastPacific (2009) Photosyn DJ Lepère C,VaulotD,Scanlan Article103.

, EW, Karnkowska-Ishikawa A, Kim JI, Shin W, Bennett MS, Kwiatowski J, Zakry Bennett W, MS,KwiatowskiJ, A,KimJI, Shin Karnkowska-Ishikawa EW, 6 Environmental Microbiology Environmental , 1823-1833. : implicationsOceans. forArctic , , 6 11 , a016121. , a016121. , 3105-3117. , 3105-3117. Ocean encompassing the most oligotrophic waters on Earth. themostoligotrophic onEarth. encompassing waters Ocean obial eukaryotes. Protist heterotrophic unve flagellates heterotrophic , g euglenoid evolutionary relationships using three genes: relationships using three evolutionary g euglenoid 161 2012) The marine di marine 2012) The Current Biology psbA picoeukaryote community structure. structure. community picoeukaryote , 603-619. , Current Biology 13 transcripts. transcripts. tives on early eukaryotic evolution. evolution. eukaryotic tives onearly , 975-990. , 975-990. rDNA sequences and the description of andthedescription sequences rDNA metabolism andfunctions metabolism The ISME Journal coeukaryotes along an Atlantic Meridional Atlantic an along coeukaryotes man J, Zubkov MV, Scanlan DJ (2011)Basin- ZubkovDJ MV,Scanlan J, man Vargas C, Massana (2012) C,Massana Vargas C, ChristenRetal.( uman H, Mead A, Scanlan DJ (2013)Aglobal DJ H,MeadA,Scanlan uman ne picoeukaryotes in the Eastern Mediterranean Mediterranean intheEastern picoeukaryotes ne thetic picoeukaryote community structure inthe structure community thetic picoeukaryote , rophy by the picoeukaryotic green alga green alga picoeukaryotic rophy the by Harmful Algae 365 , of chloroplast gene minicircles changes changes geneminicircles of chloroplast The ISMEjournal 24 Amphidinium operculatum , 749-763. , R262-263. noflagellate genus , 24

iled with pyrosequencing. iled withpyrosequencing. , 813-821. , 813-821. , , 8 13 , 1953-1961. , 1953-1961. 2014) Patterns of rare and Patterns ofrare 2014) , 105-111. , 105-111. , oftheapicoplast. 4 , 1044-1052. Diversity and patterns Dinophysis The Cold SpringHarb

. Environmental Environmental ISME Protist Euglenaria

can retain The Journal ,

158

ISME ś B, B, , 89- , 7

, This article is protected Allrights bycopyright. reserved. Scanla McDonald SM,SarnoD, Great Barrier Reef. Reef. Barrier Great andultrastructure life of cycle Lukeš M, Oborník M,ModryD, photosynthetic closely rela alveolate A M,etal.(2008) Chrudimsky T,Vancova M,JanouškovecJ, Moore RB, Oborník USA,pp.91-129 Science Publishers,Enfield, phytoplankton. In Subba Rao, D. (2006)Phases,stages,Lewis J Montresor Mand 50 ultraphytoplankton Gulfof inthe DNA per plastid during chloroplast development in maize. developmentinmaize. chloroplast during DNA perplastid (2004)Changes AJ Bendich inthe Oldenburg DJ, culture trials of L,S, Feng Xiaobo Na X,ShaojunP,Tifeng Series Gulf ofMexico. fromthesoutheastern phytoplankton populations B AlfreiderA,Wawrik ( Paul JH, 1311-1330. eukaryotic marine phytoplankton. eukaryotic marine WHCF, SimonN,Vaulot Not F, SianoR,Kooistra Bioinformatics Infe Eddy SR(2009) EP,KolbeDL, Nawrocki Oceanology andLimnology, fish farming onbenthiccommunities. foraminifera seque monitoring protistnext-generation through EslingLejzerowiczPawlowski J, F, P, Ce chloroplasts andkleptoplastidy inForaminifera. kingdoms. plant,andfungal beyond theanimal, Barcoding richness eukaryotic AudicS,AdlPawlowski J, S,Bass D,Belbah 1140. Pillet L, De Vargas C, Pawlowski J (2011) Mo Pillet L, J C,Pawlowski Vargas De 10(11): e1001419.

Accepted Article, 75-89. , 198 , 9-18. , 9-18. , Eutreptiella gymnastica 25 , 1335-1337. Protist , 163

30 Vitrella brassicaformis , 446-455. , 446-455. , 306-323. n DJ, Zingone A (2007) Genetic (2007) Zingone A n DJ, Cernotikova-Stribrna E,Cihlar (ed.), Algal Cultures,Analogue (ed.), 2000) Micro- and macrodiversity in macrodiversity and Micro- 2000) Naples during an annual cycle. annualcycle. an Naples during

Advances in Botanical Research Botanical Advances in ted to apicomplexan parasites. apicomplexan ted to (Eutreptiales, Euglenophyceae). Euglenophyceae). (Eutreptiales, dhagen T, Wilding TA (2014) Environmental T, Wilding TA(2014)Environmental dhagen Z, Suqin G (2012) Molecular identificationand G(2012)Molecular Z, Suqin ri Let al.(2012)CBOL ProtistWorking Group: ri lecular identificationof rnal 1.0: Inference of RNA Alignments. RNAAlignments. Inference of 1.0: rnal Protist ncing metabarcoding: assessing theimpactof ncing metabarcoding: and shifts in the life cycles ofmarine shifts inthelifecycles and structure of DNA molecules andtheamountof molecules structure ofDNA D, Probert I (2012) Diversity and ecology of ecology I and Diversity D,Probert (2012) Molecular EcologyResources n. sp., n. gen., a novel chromerid from the chromeridfrom n.sp.,gen.,a novel , 162

Journal ofMolecularBiology , 394-404. J, etal.(2012)Morphology, J, diversity of eukaryotic diversity of s of Blooms andApplications. Blooms s of Aquatic Microbial Ecology Marine Ecology Progress , Nature 64 rbc sequestered diatom sequestered , 1-53. , 1-53. Chinese Journalof Chinese L sequences ambient in , 451 , 959-963. PLoS Biology , 14 , 1129- , 344 ,

, for largealignments. PS,ArkinAP(2010)FastTrPrice MN,Dehal 1950. eatingwhat: diet assessment usin Pompanon F, Symondson BE, Deagle Brown WOC, ocean. GA(2007)Small ph Richardson TL, Jackson Research Acids Nucleic im project: gene database ribosomal RNA SILVA Schweer P,GerkenJ, Quast C,PruesseE,Yilmaz compatible withARB. checke quality for resource comprehensive online Ludwig B,Pruesse E,QuastC,KnittelK,Fuchs probes. us the predators: generalist of diets of analysis picoplankton? picoplankton? autotr F (2002)Are Vaulot D,RomariK,Not waters. coastal inoligotrophicgrazers I, R NotF, Massana Forn Gasol JM, Unrein F, biodiversity DNAmetabarcoding. assessmentusing Brochmann F, Taberlet P,CoissacE,Pompanon water.anoxic ahighly reveals environmental DNAsequencing M, ChristenR, Stoeck T,BassNebel Jone D, (Dinophyceae) inScottishwaters. complex Stern RF, AmorimAL, BresnanE(2014 One, fl sortedby eukaryotic picophytoplankton VaulotD( DJ, Shi XL,Lepère C,Scanlan This article is protected Allrights bycopyright. reserved. Accepted diversity. species planktonicc gutinhabiting ofmarine, the L SA,Guillou Skovgaard A,Karpov ( Article SanAndrésV,ClareEL, MirG,Sy Piñol J, 6 (4), e18979. (4), e18979. Science Molecular EcologyResources Molecular Molecular Ecology Molecular , Trends in Microbiology 315 Frontiers inmicrobiology Frontiers , 838 PLoS ONE Nucleic Acids Research Acids Nucleic , – 41 840. , D590-D596. , D590-D596.

5 g next generation sequencing. g next , (3), e9490. (3), e9490. 19 , 21-31. , 21-31. , 2012) The parasitic dinoflagellates parasitic dinoflagellates The 2012) , 10 The ISME Journal 14 ) Diversity and plastid types in and plastid ) Diversity opepods: morphology, ecologyunrecognized and , 18-26. , 18-26. ow cytometry from the South Pacific Ocean. Ocean. Pacific South the from ow cytometry , 266-267. 2011) Plastid16SrRNAgeneamong diversity Harmful Algae mondson WOC (2014)Apragma WOC mondson , 3 s MD,etal.(2010)Multip ytoplankton and carbon export from the surface from thesurface export ytoplankton andcarbon ophs less diverse than heterotrophs inmarine ophs lessdiversethanheterotrophs ee 2 - Approximately maximum-likelihood trees maximum-likelihood - Approximately ee 2 e of next-generation sequencing with no blocking withnoblocking sequencing next-generation e of , 305 , (2014) Mixotrophic haptophytes are key bacterial bacterial haptophytes key Mixotrophic are (2014) 35 W, Peplies J, Glöckner FO (2007) SILVA: a GlöcknerFO(2007)SILVA: Peplies W, J, complex eukaryotic community in marine eukaryoticcommunity inmarine complex C, Willerslev E(2012)To Willerslev C, d and aligned ribosomal RNA sequence data data andaligned ribosomalRNAsequence d T, Yarza P, Peplies J, Glöckner FO (2013) The FO Glöckner (2013) P,PepliesJ, T, Yarza proved data processing proved dataprocessing , 7188-7196. .

Molecular Ecology Molecular

DS, Jarman SN, Taberlet P (2012) Who is P(2012)Who SN,Taberlet DS, Jarman ,

8 , , 164-176. 39 , 223-231. Molecular Ecology Molecular , le marker parallel tag marker parallel le Dinophysis acuminata Dinophysis 21 Blastodinium and web-based tools. web-based and wards next-generation wards next-generation , 2045-2050. tic approach tothe tic approach , 21 , 1931- spp. PLoS

This article is protected Allrights bycopyright. reserved. flow cytometric analyses. Vaulot D,CourtiesC,Partensk Microbiology related Rippka AmannRI, Fuller NJ, Schönhuber WA, NJ, West phytoplankton ( Viprey M,Moreau Vaulot D,EikremW, Zhang Z, Cavalier-Smith T, Green VR (2002)Evol T,Green Z, Cavalier-Smith Zhang by revealed picophytoplankton as diversity marine among Zeidner CM,DeLong R, EF, G,Preston Massana 49, picophytoplankton: theimportance Assessi PalenikB (2004) AZ,Nolan JK, Worden Parasitology (2003) BAP,Williams KeelingPJ Microbiology regions assitu hybridizati revealed by in AcceptedAtlantic Ocean. in theNorth by thesmallestphytoplankton bacterivory (2008) High GA Zubkov MV,Tarran pp.867-926. Press, Harvard, phytoplankton ecology. bl Harmfulalgal A, Wyatt T(2005) Zingone 92. ecosystems withquantitative PCRofthe 18SrRNA gene. D R,NotF,MarieD,Vaulot ZhuF, Massana Evolution divergence concerted and thepartially Article 168-179. Prochlorococcus Prochlorococcus , 19 , , 489-500.

54 5 147 , 212-216. , 212-216. ≤ Nature , 9-67. 3 , 1731-1744. , 1731-1744. μ m) in marine ecosystems. ecosystems. marine in m) , In 455 genotypes show remarkably different depth distributions in two oceanic two in distributions depth different showremarkably genotypes Cytometry : Robinson, A. R. & Brink, K. H. [Eds.] : Robinson,A.R.& K.H.[Eds.] Brink, , 224-226. y (1989) A simple method topres y(1989) Asimple Cryptic inpara organelles of the eukaryotic component. eukaryotic of the , 10 of their putative replicon origins. repliconorigins. oftheirputative ,629-635. ,629-635. on using 16S rRNA- targeted oligonucleotides. on using 16SrRNA-targeted oligonucleotides. H (2008) The diversity of small eukaryotic ofsmalleukaryotic Thediversity H (2008) ooms: keysofthe totheunderstanding (2005) Mapping of picoeucaryotes in marine inmarine Mapping picoeucaryotes of (2005) FEMS MicrobiologyReview FEMS ng the dynamics and ecology of marine andecology ofmarine thedynamics ng Post AF, Scanlan DJ, Béjà O (2003) Molecular Béjà O(2003)Molecular Post AF, ScanlanDJ, ution ofdinoflagellate

R, Post AF, Scanlan DJ (2001)Closely R,PostAF, ScanlanDJ FEMS MicrobiologyEcology FEMS psbA sitic protistsandfungi. analyses. analyses. erve oceanic phytoplankton for phytoplankton for erve oceanic Limnology andOceanography The Sea. M Environmental Environmental unigenic minicircles minicircles unigenic o s Harvard University University Harvard lecular Biologyand 32 ,795-820. Advances in in Advances , 52 , 79- , This article is protected Allrights bycopyright. reserved. Accepted Article (ca. with full-length and theotherone obt novelamplicons the containing long sequences Most 16SrDNAsequencesinPhytoREF dist are ge been novel ampliconsthathave 879sequences libraries, clone fromSanger produced fromidentified composed of3,333amplicons compiledinto rDNAsequences plastidial 16S Figure 1: FiguresLegends and Treemap and histogramsorigin showing and Treemap the nerated inthisstudynerated fromcultu 1500 bp) sequences from 1500 bp)sequences ributed intwopeaks:the one with700-900bp- organisms, 1,867envi the PhytoREF database. A:PhytoREF is the PhytoREF database. ained here from cultured microalgal strains, strains, cultured microalgal from ained here extracted from plastidial genomes, and 411 and 411 plastidialgenomes, from extracted and number (A), and length (B)ofthe andnumber(A),

public databases. publicdatabases. res res B: of marine microalgae. ronmental amplicons

This article is protected Allrights bycopyright. reserved. Accepted ArticlePhytoREF database isindicated insmall grey circles. thenumberofplas and highlighted ingreen, E & Keeling 2014). (Burki evidence morphological eukaryotic life. phylogenetic schematic The 2: Figure Distribution and number of PhytoREF plas Distribution andnumberofPhytoREF tidial 16S rDNA sequences available inthe available rDNAsequences 16S tidial tree isbasedonup-to-d tree

ach plastid-containing eukaryotic lineage eukaryotic ach plastid-containing is tidial 16S rDNA sequences in the tree of inthe tree rDNAsequences tidial 16S

ate phylogenomics and phylogenomics ate

This article is protected Allrights bycopyright. reserved. 3: Figure Accepted Articleclarity. a better for here notconsidered were genera) repres thatare Streptophytes (landplants) suchasthe description, lack fulltaxonomic in thatarepresent andspecies families, genera PhytoREF thenumberof plastidial16S represent Taxonomic compositionofthePhytoREFTaxonomic data ented by 2,973sequences ented by prasinophytes (clade VII) and the rappemonads. rappemonads. andthe VII) (clade prasinophytes a given class. Several key groups of microalgae groups ofmicroalgae key class.Several given a rDNA sequences and taxonomically-described and taxonomically-described rDNAsequences

base at the class level. Bar charts Barcharts class level. atthe base

(373 families and796 (373families