J. Phycol. 53, 790–803 (2017) © 2017 Phycological Society of America DOI: 10.1111/jpy.12540

PHYLOGENETIC POSITION OF THE CORAL SYMBIONT () INFERRED FROM GENOME DATA1

Heroen Verbruggen, Vanessa R. Marcelino School of BioSciences, University of Melbourne, Melbourne, Victoria 3010, Australia Michael D. Guiry AlgaeBase, Ryan Institute, National University of Ireland, Galway H91 TK33, Ireland Ma. Chiela M. Cremen, and Christopher J. Jackson 2 School of BioSciences, University of Melbourne, Melbourne, Victoria 3010, Australia

The green algal genus Ostreobium is an important Abbreviations: bp, base pairs; kbp, kilo base pairs; symbiont of corals, playing roles in reef LBA, long branch attraction; Mb, mega base pairs; decalcification and providing photosynthates to the ML, maximum likelihood; ORF, open reading coral during bleaching events. A chloroplast genome frame of a cultured strain of Ostreobium was available, but low taxon sampling and Ostreobium’s early-branching nature left doubt about its phylogenetic position. Ostreobium is an early-branching genus in the Here, we generate and describe chloroplast genomes siphonous (order ), a group from four Ostreobium strains as well as Avrainvillea of seaweeds composed of a single giant multinucle- mazei and Neomeris sp., strategically sampled early- ate cell (Vroom and Smith 2003). The genus is unu- branching lineages in the Bryopsidales and sual among the Bryopsidales in being microscopic respectively. At 80,584 bp, the chloroplast and endolithic, usually burrowing into marine lime- genome of Ostreobium sp. HV05042 is the most stone substrates. It is best known living in association compact yet found in the Ulvophyceae. The with corals, forming a green layer in the skeleton Avrainvillea chloroplast genome is ~94 kbp and underneath the living coral tissue (Verbruggen and contains introns in infA and cysT that have nearly Tribollet 2011). Ostreobium plays a major role in car- complete sequence identity except for an open bonate dissolution of live and dead corals, eroding reading frame (ORF) in infA that is not present in as much as a kilogram of reef carbonate per square cysT. In line with other bryopsidalean , it also meter per year (Tribollet 2008, Grange et al. 2015). contains regions with possibly bacteria-derived ORFs. It can also confer benefits to the coral hosts by pro- The Neomeris data did not assemble into a canonical viding them with photosynthates and protecting circular chloroplast genome but a large number of them from high-light stress (Schlichter et al. 1995, contigs containing fragments of chloroplast genes Inoue et al. 1997, Fine and Loya 2002). and showing evidence of long introns and intergenic Due to the importance of their ecological roles in regions, and the Neomeris chloroplast genome size reef ecosystems, endolithic algae have been studied was estimated to exceed 1.87 Mb. Chloroplast quite intensively in recent years. One of the key phylogenomics and 18S nrDNA data showed strong findings of recent work is that the Ostreobium lin- support for the Ostreobium lineage being sister to the eage, previously thought to be composed of only a remaining Bryopsidales. There were differences in handful of species, is in fact a complex of multiple branch support when outgroups were varied, but the undescribed lineages (Gutner-Hoch and Fine 2011, overall support for the placement of Ostreobium was del Campo et al. 2016, Marcelino and Verbruggen strong. These results permitted us to validate two 2016, Sauvage et al. 2016). suborders and introduce a third, the Ostreobineae. An earlier phylogenetic study of the siphonous Key index words: green algae suggested that Ostreobium was sister to Bryopsidales; chloroplast genome; the two main suborders of the Bryopsidales: Bryo- coral symbiosis; Dasycladales; molecular phylogenet- Ostreobium psidineae and Halimedineae (Verbruggen et al. ics; ; phylogenomics 2009). This study, however, was based on only five genes, of which only three were sampled for Ostreo- bium, and maximum likelihood bootstrap support 1Received 1 November 2016. Accepted 13 December 2016. First for its position was low (66% for the Bryopsidineae Published Online 10 April 2017. Published Online 12 May 2017, + Halimedineae grouping). Adding more data can Wiley Online Library (wileyonlinelibrary.com). 2Author for correspondence: e-mail [email protected]. help to resolve such phylogenetic uncertainties, with Editorial Responsibility: L. Graham (Associate Editor) both gene and taxon sampling playing important

790 OSTREOBIUM CHLOROPLAST PHYLOGENOMICS 791 roles. Furthermore, it has been demonstrated that sequences for Bryopsidales that were included in the datasets phylogenomic analysis with only a few taxa can yield are listed in Table 1. high support values for the wrong tree topology, Choosing an appropriate outgroup was difficult because especially in cases where systematic bias is an issue depending on taxon and gene sampling, different groups have been inferred to be the sister group of the Bryopsidales, (Hedtke et al. 2006). With the knowledge that usually with low support. To add to the problem, the class Ostreobium is a complex of multiple lineages, we can Ulvophyceae (to which the Bryopsidales belong) was non- now take advantage of that diversity by sequencing monophyletic in chloroplast genome phylogenies (Fucıkova chloroplast genomes from multiple Ostreobium spe- et al. 2014, Leliaert and Lopez-Bautista 2015, Sun et al. 2016, cies to reduce phylogenetic artifacts and more accu- Turmel et al. 2016) while nuclear data suggested monophyly rately infer the position of the genus among the (Cocquyt et al. 2010). We have chosen to use four outgroup lineages (Table 1) and have run analyses on various combina- siphonous green algae. tions of these (specified below). High-throughput sequencing methods permit Samples and cultures. Avrainvillea mazei HV02664 and Neo- addressing problematic phylogenetic relationships meris sp. HV02668 were collected from Tavernier (Florida, with increased sampling both of taxa and molecular USA) and Islamorada (Florida, USA), respectively, and pre- markers. Chloroplast genome sequencing and phy- served in silica-gel. To obtain Ostreobium tissue, two fragments logenomic analysis have been used to resolve, for of coral skeleton were collected. Both were taken from coral example, difficult relationships among red algae skeletons (Diploastrea heliopora) collected in the Kavieng area of Papua New Guinea. Samples HV05007 and HV05042 were (Costa et al. 2016), core (Fucıkova collected at 4 and 13 m depth, respectively, and both showed et al. 2014, Sun et al. 2016, Turmel et al. 2016), a pronounced green band in the skeleton. The fragments Trebouxiophyceae (Lemieux et al. 2014a) and were kept in unenriched sterilized seawater at 24°C and Chlorophyceae (Turmel et al. 2008). The first com- under dim light for several months until Ostreobium filaments plete chloroplast genomes of siphonous green started growing out of the limestone. When the filaments ~ algae, including Ostreobium quekettii, have been reached 1 cm, they were harvested for DNA extraction. We recently sequenced (Leliaert and Lopez-Bautista did not attempt to establish pure cultures. Molecular work and DNA sequencing. Total genomic DNA 2015, del Campo et al. 2016, Marcelino et al. 2016). was extracted using a modified CTAB protocol (Cremen et al. While these provide a good starting dataset to 2016). Two types of library preparation and sequencing were resolve the phylogenetic placement of the Ostreobium used. For A. mazei HV02664, a library was prepared from lineage, many early-branching lineages have not yet ~350 bp fragments using a TruSeq Nano LT kit and sequenced been sampled. This results in long branches which on the Illumina HiSeq 2000 platform (paired end, 100 bp). For the remaining strains, libraries were prepared from a would otherwise be broken up by additional taxon ~ sampling, leading to issues such as long branch 500 bp size fraction with a Kapa Biosystems kit and sequenced on the Illumina NextSeq platform (paired end, 150 bp). attraction (LBA), a phylogenetic artifact in which Assembly and annotation. Sequence reads from both Ostreo- long branches cluster together even if they are not bium strains were assembled using IDBA-UD 1.1.1 (Peng et al. closely related (Bergsten 2005). 2012), SPAdes 3.8.1 (Bankevich et al. 2012), CLC Genomics Here, we report additional chloroplast genomes Workbench 7.5.1 (http://www.clcbio.com) and MEGAHIT strategically obtained to break up the long branches 1.0.6 (Li et al. 2016). Contigs matching to Ulvophycean in siphonous green algae in order to infer the phy- chloroplast genomes reference sequences were imported into logenetic position of the Ostreobium lineage and the Geneious 8.0.5 (http://www.geneious.com), where complete- ness and circularity were evaluated by comparing the contigs early diversification of Bryopsidales. The goals of from different assemblers manually. Assembly graphs were this study are: (i) to obtain draft chloroplast gen- visually assessed in Bandage 0.8.0 (Wick et al. 2015) and read omes of multiple Ostreobium strains and from two mapping was carried out in Geneious 8.0.5. other early-branching families in the Bryopsidales Because one of the limestone samples turned out to con- () and Dasycladales (Dasy- tain multiple Ostreobium species, metagenome binning was cladaceae), (ii) infer the phylogenetic position of carried out to separate the contigs from different species. Ostreobium from chloroplast and nuclear sequences These analyses were carried out on scaffolds larger than 13 kb resulting from the SPAdes assembly and used MyCC while quantifying uncertainty resulting from out- (Lin and Liao 2016) and MaxBin 2.2 (Wu et al. 2014). group choice; and (iii) make taxonomic changes Contigs were annotated following Verbruggen and Costa where needed. (2015) and Marcelino et al. (2016). In brief, annotations were obtained from MFannot, DOGMA and ARAGORN and imported into Geneious. All annotations were vetted manu- MATERIALS AND METHODS ally and a master annotation track was created from them. We looked for open reading frames (ORFs) of potential bac- Sampling design. The main purpose of the sampling design terial origin in Ostreobium by performing a tblastx search was to avoid long branch attraction (LBA) in the phyloge- using bacteria-derived ORFs from chloroplast genomes of netic inference by adding taxa from poorly sampled groups. other Bryopsidales as a query (Leliaert and Lopez-Bautista We therefore aimed to add Ostreobium species that were 2015). The 18S sequences were identified by creating a different from the previously sequenced one, an Avrainvillea BLAST database of contigs and querying it with known Ulvo- species representing the early-branching family Dichoto- phyceae 18S sequences. mosiphonaceae of the Bryopsidales, and a Neomeris species Phylogenetic inference. Alignments of each named chloro- representing the , an early-branching Dasy- plast protein-coding gene were inferred with TranslatorX cladales family. The remaining available chloroplast genome (Abascal et al. 2010), which translates sequences to amino 792

TABLE 1. Strains and sequences used in this study. Nomenclature follows AlgaeBase (Guiry and Guiry 2016).

Category Strain Species Source Sequencing GenBank Ostreobium SAG 6.99 Ostreobium quekettii Marcelino et al. (2016) NextSeq PE-150 LT593849 OS1B Ostreobium sp. del Campo et al. (2016) HiSeq PE-100 KU979013 HV05007a Ostreobium sp. This study NextSeq PE-150 KY509311 KY509315 KY509312 HV05007b & c Ostreobium sp. This study NextSeq PE-150 KY766991–KY766996 KY819062

KY819067 VERBRU HEROEN HV05042 Ostreobium sp. This study NextSeq PE-150 KY509314 Other Bryopsidales HV02664 Avrainvillea mazei This study HiSeq 2000 PE-100 KY509313 HV04923 Halimeda discoidea Marcelino et al. (2016) HiSeq 2000 PE-100 KX808496 HV03798 Caulerpa cliftonii Marcelino et al. (2016) HiSeq 2000 PE-100 KX808498 WEST4838 Derbesia sp. Marcelino et al. (2016) HiSeq 2000 PE-100 KX808497 N/A Bryopsis hypnoides Lu€ et al. (2011) Sanger NC_013359 WEST4718 Bryopsis plumosa Leliaert and HiSeq 2000 PE-100 NC_026795 Lopez-Bautista (2015)

FL1151 Tydemania expeditionis Leliaert and HiSeq 2000 PE-100 NC_026796 AL. ET GGEN Lopez-Bautista (2015) Outgroup 1 DI1 acetabulum de Vries et al. (2013) 454 GS-FLX http://goo.gl/00Ni4a (Dasycladales) HV02668 Neomeris sp. This study NextSeq PE-150 KY495826–KY495874 Outgroup 2 UNA00071828 Ulva sp. Melton et al. (2015) HiSeq 2000 PE-100 KP720616 (UU clade) UNA00071832 Ulva fasciata Melton and MiSeq KT882614 Lopez-Bautista (2015) QD08 Ulva linza Unpublished Unknown KX058323 UTEX 1912 Pseudendoclonium Pombert et al. (2005) Sanger AY835431 akinetum NIES 360 Oltmannsiellopsis viridis Pombert et al. (2006) Sanger DQ291132 Outgroup 3 SAG 17.86 Scherffelia dubia Turmel et al. (2016) Sanger NC_029807 (Chlorodendrophyceae) CCMP 881 Tetraselmis sp. Turmel et al. (2016) Sanger KU167097 Outgroup 4 SAG 42.84 Pedinomonas tuberculata Lemieux et al. (2014a) 454 GS-FLX NC_025530 (Pedinophyceae) UTEX LB 1350 Pedinomonas minor Turmel et al. (2009) Sanger NC_016733 NIES 1824 Marsupiomonas sp. Lemieux et al. (2014a) 454 GS-FLX KM462870 OSTREOBIUM CHLOROPLAST PHYLOGENOMICS 793 acids, uses MAFFT 7.215 (Katoh and Toh 2010) to align the is even smaller than that of SAG6.99 (81,997 bp). amino acid sequences and generates the corresponding Similar to other green algae, the chloroplast gen- nucleotide alignments. A few genes were excluded either ome has a low GC content (30.9%). because they were present in only a handful of species or because they could not be aligned reliably: rpoA, rpoB, rpoC1, Assembly of sample HV05007 was substantially less rpoC2, ftsH, tilS, ycf1, psb30. Alignments of the 18S gene were straightforward. Initial assemblies showed several inferred with MAFFT using default settings. Three concate- ~20 kb long contigs with similarity to Ostreobium nated gene alignments were created: one of the nuclear ribo- chloroplast genome sequences. These contigs also somal genes (nrDNA) and two of the chloroplast genes at the showed that three distinct variants of some genome nucleotide level (cpDNA) and the amino acid level regions were present among the contigs. This indi- (cpPROT). All alignments are available in Appendix S1 in the Supporting Information. cates that the limestone sample contained three dif- In order to evaluate the robustness of results to variations ferent Ostreobium species, which we will refer to as in outgroup choice, six different configurations of outgroup HV05007a, b and c. Metagenome binning based on lineages were used. The first (abbreviated o1) used only the k-mer profiles (MyCC) and a combination of k-mer first outgroup lineage, the Dasycladales, which were indicated and coverage profiles (MaxBin) separated out the to be related to Bryopsidales in nuclear and chloroplast phy- contigs belonging to one of the species in a sepa- logenies (e.g., Cocquyt et al. 2010, Fucıkova et al. 2014). rate cluster (HV05007a, average coverage 9139), Strategy o2 used only the members of the Ulvales-Ulotrichales lineage, another group of ulvophycean algae. The third strat- but the sequences of the remaining two species egy (o3) used only Chlorodendrophyceae. Despite being a were recovered in a single cluster (average coverage different class, this group was inferred as the sister to Bryopsi- 40–499). The assembly failed across some highly dales in the ML analysis of Turmel et al. (2016), though with conserved RNA-coding regions: rns (16S), rnl (23S), low support. Strategy o12 used a combination of outgroups 1 rnpB (RNAse P) and a region with three consecutive and 2 (Dasycladales and Ulvales-Ulotrichales). Strategy o123 tRNAs (trnF-GAA, trnL-CAA, trnQ-TTG). The recov- expanded o12 by adding the Chlorodendrophyceae. Finally, ered contigs were completely collinear with those of strategy o1234 used all four outgroup lineages. The fourth group (Pedinophyceae) is early-branching in the core Chloro- SAG6.99 and HV05042. phyta and thus serves as an outgroup to all the remaining Despite a high average sequencing depth of taxa in our dataset (e.g., Turmel et al. 2016). 1,0539, the chloroplast genome of A. mazei At the nucleotide level, analyses were carried out with a HV02664 assembled into three contigs of 55.6, 23.4, simple GTR+Γ model (unpartitioned, labeled -u- in the code and 13.9 kbp. Fragments of atpI were present near column of Table 2) and with a separate GTR+Γ model for the edges of two contigs, suggesting that these con- each codon position (partitioned, labeled -p- in Table 2). At tigs were adjacent. The same was true for the cysT the amino acid level, analyses were carried out with the LG and the (better-fitting) cpREV model, and with Γ among-site gene near different contig edges. The assembly rate variation and estimated AA frequencies. Because of the graph (Fig. 2A) indicated that indeed, there are extensive analysis design with multiple alignments, multiple paths connecting these fragments of cysT and atpI models of sequence evolution and multiple outgroup strate- (region 1 in Fig. 2A, detailed in Fig. 2B). The graph gies (30 analysis types in total), computational time con- indicated a shared path with approximately double straints led us to limit ourselves to maximum likelihood sequencing depth connecting the gene fragments of analyses. We used RAxML 8.2.6 with 500 rapid bootstrap replicates (Stamatakis 2014). both cysT and atpI (Fig. 2B), also containing a bub- ble of single sequencing depth of which the longer path contained an ORF. By mapping paired reads where one read matched the adjacent coding RESULTS sequence and the other extended the contig, we The sequences of the green algal tissue obtained were able to resolve this region unambiguously. from the rough culture of limestone sample Both atpI and cysT have introns that are virtually HV05042 yielded a complete Ostreobium chloroplast identical in sequence for the first 518 bp and the genome assembled into a single contig by SPAdes last 90 bp (Fig. 2C). They differ in an ORF being and CLC Genomics, with 1,2849 average coverage. present in the atpI intron and a 26 bp region with We confirmed that this molecule is circular-map- minor sequence differences in the noncoding part ping (Fig. 1). The sequence was completely colli- (arrowhead near 200 bp mark in Fig. 2C). Another near with the genome of the previously published area of the assembly graph showed a trifurcation strain SAG6.99 (Fig. 1), but there were considerable near the psbE gene (region 2 in Fig. 2A, detailed in rearrangements with species from other families in Fig. 2D). Sequence fragments showing similarity to the order Bryopsidales (e.g., Bryopsis, Tydemania; the psbE gene were found on all three of the con- Fig. 1). All Ostreobium chloroplast genomes have a nected nodes. Using manual mapping of paired simple structure without the inverted repeats often reads, we were able to establish that in addition to found in land plants and green algae. As previously the canonical psbE gene at the end of node 2797, described for Ostreobium SAG6.99 (Marcelino et al. there is a second, more deviant copy of this gene 2016), the Ostreobium HV05042 chloroplast genome spanning nodes 2798 and 2796. Read mapping also is a compact and gene-dense, highly economical showed conclusively that the path from 2797 to genome. At 80,584 bp long, the HV05042 genome 2798 (skipping 2796) is not supported by paired reads. Instead, this exercise confirmed that node 794 HEROEN VERBRUGGEN ET AL.

TABLE 2. Summary of phylogenetic analyses and the placement of the Ostreobium lineage.

(O,(B,H)) (B,(O,H)) (H,(O,B)) Codea Genome DNA/protein Model Outgroupsb ML topologyc support support support cpDNA-p-o1 Chloroplast DNA GTR+Γ, part. o1 (O,(B,H)) 98.0 2.0 0.0 cpDNA-p-o2 Chloroplast DNA GTR+Γ, part. o2 (O,(B,H)) 95.0 4.8 0.2 cpDNA-p-o3 Chloroplast DNA GTR+Γ, part. o3 (O,(B,H)) 51.2 48.8 0.0 cpDNA-p-o12 Chloroplast DNA GTR+Γ, part. o12 (O,(B,H)) 96.4 3.6 0.0 cpDNA-p-o123 Chloroplast DNA GTR+Γ, part. o123 (O,(B,H)) 95.8 4.2 0.0 cpDNA-p-o1234 Chloroplast DNA GTR+Γ, part. o1234 (O,(B,H)) 86.4 13.6 0.0 cpDNA-u-o1 Chloroplast DNA GTR+Γ o1 (O,(B,H)) 100.0 0.0 0.0 cpDNA-u-o2 Chloroplast DNA GTR+Γ o2 (O,(B,H)) 89.4 10.6 0.0 cpDNA-u-o3 Chloroplast DNA GTR+Γ o3 (B,(O,H)) 33.4 66.6 0.0 cpDNA-u-o12 Chloroplast DNA GTR+Γ o12 (O,(B,H)) 97.4 2.6 0.0 cpDNA-u-o123 Chloroplast DNA GTR+Γ o123 (O,(B,H)) 96.2 3.8 0.0 cpDNA-u-o1234 Chloroplast DNA GTR+Γ o1234 (O,(B,H)) 99.0 1.0 0.0 cpPROT-CPREV-o1 Chloroplast Protein CPREV+F+Γ o1 (O,(B,H)) 89.2 9.6 1.2 cpPROT-CPREV-o2 Chloroplast Protein CPREV+F+Γ o2 (O,(B,H)) 98.6 0.2 1.2 cpPROT-CPREV-o3 Chloroplast Protein CPREV+F+Γ o3 (O,(B,H)) 60.8 38.2 1.0 cpPROT-CPREV-o12 Chloroplast Protein CPREV+F+Γ o12 (O,(B,H)) 98.4 0.6 1.0 cpPROT-CPREV-o123 Chloroplast Protein CPREV+F+Γ o123 (O,(B,H)) 99.2 0.6 0.2 cpPROT-CPREV-o1234 Chloroplast Protein CPREV+F+Γ o1234 (O,(B,H)) 96.6 2.2 1.2 cpPROT-LG-o1 Chloroplast Protein LG+F+Γ o1 (O,(B,H)) 91.4 8.0 0.6 cpPROT-LG-o2 Chloroplast Protein LG+F+Γ o2 (O,(B,H)) 98.4 0.0 1.6 cpPROT-LG-o3 Chloroplast Protein LG+F+Γ o3 (O,(B,H)) 54.6 43.2 2.2 cpPROT-LG-o12 Chloroplast Protein LG+F+Γ o12 (O,(B,H)) 97.8 1.2 1.0 cpPROT-LG-o123 Chloroplast Protein LG+F+Γ o123 (O,(B,H)) 98.2 1.6 0.2 cpPROT-LG-o1234 Chloroplast Protein LG+F+Γ o1234 (O,(B,H)) 95.2 3.4 1.4 nrDNA-u-o1 Nuclear DNA GTR+Γ o1 (O,(B,H)) 100.0 0.0 0.0 nrDNA-u-o2 Nuclear DNA GTR+Γ o2 (O,(B,H)) 100.0 0.0 0.0 nrDNA-u-o3 Nuclear DNA GTR+Γ o3 (O,(B,H)) 100.0 0.0 0.0 nrDNA-u-o12 Nuclear DNA GTR+Γ o12 (O,(B,H)) 100.0 0.0 0.0 nrDNA-u-o123 Nuclear DNA GTR+Γ o123 (O,(B,H)) 100.0 0.0 0.0 nrDNA-u-o1234 Nuclear DNA GTR+Γ o1234 (O,(B,H)) 100.0 0.0 0.0 aCodes with the label -p- indicate DNA analyses that were partitioned by codon position, whereas codes with -u- were not parti- tioned. Partitioned analyses are also labeled with the abbreviation “part.” in the Model column. bOutgroup lineages are: o1 = Dasycladales, o2 = UU clade, o3 = Chlorodendrophyceae, o4 = Pedinophyceae. Remaining codes are combinations of these. cTopology given as Newick string, e.g. (O,(B,H)) meaning that Bryopsidineae and Halimedineae are sister clades, with the Ostre- obium lineage sister to Bryopsidineae+Halimedineae.

2798 continues into 2796, which contains a few Like some other Bryopsidales, the Avrainvillea ORFs with no similarity to known genes. We were chloroplast genome has two regions containing mul- not able to establish whether node 2797 connects to tiple ORFs that do not code for typical chloroplast this structure elsewhere, and hence we cannot con- genes (regions 2 and 3 in Fig. 2A). firm circularity of the Avrainvillea chloroplast gen- Our whole chloroplast genome alignments of ome. The last area in the graph where multiple Ostreobium and Avrainvillea with other members of nodes were connected to one another (region 3 in the Bryopsidales (Figs. 1, S1 in the Supporting Fig. 2A) was resolved by the assembler and did not Information) show that while there are many rear- require manual attention. rangements within the order, there are also a num- Our final, manually created scaffold for A. mazei ber of regions with conserved gene order: (i) an HV02664 is nearly 94 kbp long, and assuming that atpA-atpF-atpH-atpI operon is conserved across the it serves as an approximation of the entire chloro- Bryopsidales, (ii) a chlB-psaA-psaB-psbZ-tilS-psbD-psbC plast genome, this puts the species at the lower end cluster is found in Bryopsidineae, Ostreobineae and of the size scale for Bryopsidales and Ulvophyceae Avrainvillea but is lost in the remaining Hali- more generally. Among currently sequenced chloro- medineae, (iii) ccsA-cysA-psbB-psbT-psbH is present in plast genomes, only those of Ostreobium species are all three suborders, and (iv) rpoA-rps11-rpl36-infA- smaller. Most other species have chloroplast gen- rps8-rpl5-rpl14-rpl16-rps3-rps19-rpl2-rpl23 is conserved omes exceeding 100 kbp. At 27.4%, its GC content across the order. is lower than that of Ostreobium, probably due to The Neomeris sp. HV02668 assembly failed to Avrainvillea having more intergenic regions of low recover a chloroplast genome. Instead, the assembly GC content. The genome has a similar gene reper- consisted of a large number of contigs containing toire than other members of the Bryopsidales, with fragments of what appeared to be chloroplast genes. only cemA, petL, psbT and ycf47 apparently missing. The coverage of these contigs was in the 180–2809 OSTREOBIUM CHLOROPLAST PHYLOGENOMICS 795

0

75,000 5,000 CDS

rps18

psbM rpl20 ycf3 yc

f tA 3

saI ft l23 p pe pl14 sH p r r 10,000 y rpl2 s3 cf20 rps19 p 70,000 psaC r nfA rpl16 l5 i psb rps4 p trnW(cca) r trnV(t 6 F psbJ trnP(tgg) rpl5 3 psbL rps8 l rp psbE rps11 ac rpoA accD trnR(acg)

) t rns rnH(gtg) rnM(cat) t rps7 trnS(gct) rps12 ycf1

r RNA bcL trnI(gat) cysT petG trnD(gtc) ycf47 15,000 65,000 rpl12 rps9 rpl32 trnM(cat) rnl trnG(gcc) petD trnL(gag) petB trnL(taa)rrn5 trnT(ggt) rps14 trnY(gta) trnE(ttc) trnS(tga) trnR(tct) atpB rpoB atpE psaM Ostreobium ycf12 sp. HV05042 trnM(cat) psbK 20,000 trnF(gaa) psbN 60,000 trnL(caa) psbI 80,584 nt trnQ(ttg) trnC(gca) chlB rpoC1

psaA

psaB 25,000 rpoC2 trnK trnG(tcc) psb 55,000 Z (ttt) tilS

psbD trn trnA psaJ trnR(cct) petL ) t

L(ta g psbC rps2 t

( (tgc)

atpI trnR(ccg) T g)

trnN(gtt) chlN

atpH trn Gene categories atpF chlL

ccsA atpA cys

ycf4 p psbT psbH cl ATP synthesis chlI r s 30,000

p A 470 b f pP

l19 B

50,000 ufA

or t genetic systems psbA metabolism photosystems ribosomal proteins 35,000 45,000 GC content transport 40,000 0.0 0.2 0.4 0.6 0.8 1.0 other/unknown

Ostreobium HV05042

Ostreobium SAG 6.99

Bryopsis

Tydemania

FIG. 1. Chloroplast genome map of Ostreobium sp. HV05042. Its alignment with Ostreobium quekettii SAG6.99 shows complete collinearity of these two genomes, while there are substantial rearrangements between the Ostreobium genomes and those of Bryopsis plumosa (NC_ 026795) and Tydemania expeditionis (NC_026796). range. Using metagenome binning, we isolated a chloroplast contigs. Using ORF identification group of 383 sequences ranging from 1,024 to and BLASTP searches followed by alignment of Bryo- 59,641 in size that were considered candidate psis or Acetabularia sequences, we identified 49 796 HEROEN VERBRUGGEN ET AL.

A

2

3

1

double sequencing depth B cysT atpI exon 1 exon 2

518 132 90

1264 atpI cysT exon 1 exon 2 ORF382

C 1 100 200 300 400 500 600 700 800 900 1,000 1,100 1,200 1,300 1,400 1,500 1,600 1,700 Identity

ORF382 atpI intron cysT intron

Node 2797

ORF88 unique to unique to alternative canonical

D ORF105 ORF71 psbE3 psbE1

psbJ ps rps4 psb

bL psbE2 F

in common between versions accD Node 2798 Node 2796

FIG. 2. Assembly graph of Avrainvillea mazei HV02664 and details of areas with ambiguous assembly. (A) Complete assembly graph from SPAdes, visualized in Bandage (Wick et al. 2015), with gene annotations mapped onto the graph using BLAST (Altschul et al. 1990). Numbers indicate regions of ambiguous assembly. (B) Detail of the region containing the cysT and atpI genes (number 1 from A), show- ing that the genes have highly similar introns. The common parts in the beginning and end have approximately twice the sequencing depth of the unique pieces. The numbers on the segments indicate their length. (C) Alignment of the resolved introns of the atpI and cysT genes, showing perfect identity of the first ~518 bp (except the 26 bases at the red arrowhead) and the last 90 bp. (D) Detail of the graph structure near the psbE gene. The annotations of psbE is done in sections, showing the part that is shared between the canonical and alternative versions of the gene (section psbE2) and those unique to the canonical (psbE1) and alternative (psbE3) versions. Paired read mapping showed that there was no connection between nodes 2797 and 2798 but rather than 2798 connected to 2796. OSTREOBIUM CHLOROPLAST PHYLOGENOMICS 797 chloroplast genes, the larger ones often containing at the base of the Bryopsidales, leaving the Bryo- introns. psidineae and Halimedineae suborders as sister From the newly assembled sequences and clades (Fig. 3). The choice of outgroup, use of sequences downloaded from GenBank, we created nucleotide or amino acid sequences, and the use of gene alignments that were subsequently concate- different models had an effect on the bootstrap sup- nated into a single alignment. The chloroplast data port values (Table 2; Fig. S2). Specifically, bootstrap alignment comprised 25 species, 13 of which were support for the relationship above reduced consid- ingroups. It consisted of 71 concatenated protein- erably when the Chlorodendrophyceae were used as coding genes and was 48,651 bases (cpDNA) or the outgroup (strategy o3; Table 2). In these analy- 16,217 amino acids (cpPROT) long. The 18S align- ses, the topology in which Ostreobium and the Hal- ment had 2,054 positions and included 36 species imedineae formed a sister clade gained support. In in total (16 ingroup species). one analysis (cpDNA-u-o3, i.e., chloroplast DNA, The phylogenies recovered from the concatenated unpartitioned model, using only Chloroden- chloroplast gene datasets were very well resolved drophyceae as outgroup), the ML analysis even pre- throughout (Figs. 3, S2 in the Supporting Informa- ferred the sister relationship between Ostreobium and tion), with a few exceptions that are highlighted the Halimedineae in favor of the more common below. All Ostreobium strains were recovered in a sin- topology where Ostreobium branches off at the base gle clade with 100% support irrespective of the anal- of the Bryopsidales. Analyses based on amino acid ysis method. The phylogenies also show that sequences agreed with those on DNA data and boot- O. quekettii strain SAG6.99 (Marcelino et al. 2016) strap support values were similar in magnitude, and strain OS1B (del Campo et al. 2016) are identi- although for analyses with only Dasycladales as out- cal, and that the same is true of two of our new group (o1), the bootstrap support for the position strains (HV05042 and HV05007a). The remaining of Ostreobium was ~10 units lower for amino acid- two strains (HV05007b & c) were sister to the based analyses than for DNA-based analyses SAG6.99 + OS1B lineage, but the relationships (Table 2). among these taxa were not confidently resolved. In line with previous findings, Avrainvillea formed The phylogenetic position of Ostreobium in the an early-branching lineage within the Halimedineae, Bryopsidales, which was the main objective of this sister to a well-supported cluster containing Caulerpa, study, was also recovered with high support. Our Halimeda, and Tydemania (Fig. 3). The phylogenetic analyses showed the Ostreobium clade branching off trees that included all outgroups (configuration

Neomeris sp. HV02668

Acetabularia acetabulum DI1

Ostreobium sp. HV05007a 100 Ostreobium sp. HV05042

100 Ostreobium sp. HV05007b 87 Ostreobineae Ostreobium sp. HV05007c

100 Ostreobium sp. OS1B

100 Ostreobium quekettii SAG6.99 100 Derbesia sp. WEST4838 100 Bryopsis hypnoides Bryopsidineae

100 Bryopsis plumosa WEST4718 98 Avrainvillea mazei HV02664

100 Caulerpa cliftonii HV03798 Halimedineae 100 Halimeda discoidea HV04923

0.3 100 Tydemania expeditionis FL1151

FIG. 3. Phylogeny inferred from the coding regions of chloroplast genomes. The result shown here is the ML tree of condition cpDNA-p-o1 (cf. Table 2) annotated with bootstrap branch support values. The three suborders of siphonous green algae are given to the right of the tree. Remaining taxa are outgroups. 798 HEROEN VERBRUGGEN ET AL. o1234), where the Pedinophyceae are used as a currently known in the core Chlorophyta, most of more distant outgroup, showed that the Dasycladales which have chloroplast genomes exceeding and Bryopsidales always grouped together with high 100 kbp (Turmel et al. 2015, 2016, Marcelino support (Fig. S2). et al. 2016). It is ~1,400 bp shorter than the Like the phylogenies of plastid genomes, the anal- chloroplast genome of Ostreobium strains SAG 6.99 yses of the nuclear 18S rRNA gene recovered the and OS1B. It is completely collinear with these Ostreobium lineage at the base of the Bryopsidales, genomes, and the size differences are due to irrespective of which outgroup was used (Figs. 4, indels in intergenic regions across the genome. S2). Support for the Bryopsidineae+Halimedineae The small genome size of Ostreobium has been clade was 100% in all analyses (Table 2). attributed in part to resource limitation due to its low-light habitat (Marcelino et al. 2016). In the Chlorophyta more broadly, smaller genomes are DISCUSSION common especially among the prasinophytes (Rob- Genome features. The 80,584 bp chloroplast gen- bens et al. 2007, Lemieux et al. 2014b, Leliaert ome of Ostreobium strain HV05042 is the smallest et al. 2016).

Bornetella nitida Z33464 54 100 Batophora oerstedii Z33463 Chlorocladus australasicus Z33466

100 Cymopolia vanbosseae Z33467 Neomeris dumentosa Z33469

97 Acetabularia acetabulum AY165776 88 Acetabularia caliculus AY165780 Chalmasia antillana AY165785 100 100 Halicoryne wrightii AY165786 63 Parvocaulis polyphysoides AY165781

95 Parvocaulis parvulus Z33471 95 Parvocaulis pusillus AY165784 Ostreobium sp. SAG6.99 100 Ostreobineae Ostreobium sp. HV05042

38 Derbesia sp. WEST4838 Codium fragile MJB0103 100 100 98 Codium platylobium FJ535849 Bryopsidineae Codium duthieae FJ535848 46 Bryopsis sp. HV04063

100 Halimeda discoidea HV04923 96 Halimeda tuna AF407250 68 Halimeda cylindracea AF416388

100 100 Caulerpa cliftonii HV03798 Caulerpa sertularioides AF479703 Halimedineae 46 Flabellia petiolata AF416390

69 Tydemania expeditionis FJ432634

50 Udotea flabellum AF416392 0.2 92 Udotea argentea AF416394

FIG. 4. Phylogeny inferred from nuclear 18S ribosomal DNA sequences. The result shown here is the ML tree of condition nrDNA-u- o1 (cf. Table 2) annotated with bootstrap branch support values. The three suborders of siphonous green algae are given to the right of the tree. Remaining taxa are outgroups. OSTREOBIUM CHLOROPLAST PHYLOGENOMICS 799

The chloroplast genomes of Ostreobium are struc- such regions in Ostreobium, which may have either turally similar to those of other Bryopsidales. They lost them during its genome streamlining process are circular-mapping and lack the large inverted (Marcelino et al. 2016) or, alternatively, Bryopsis and repeats seen in many other green algae (Manhart Tydemania may have acquired these regions later in et al. 1989, Lehman and Manhart 1997, Leliaert their evolution. The fact that Derbesia also lacks and Lopez-Bautista 2015). The fact that the early- these regions (Marcelino et al. 2016) would support branching Ostreobium lineage also lacks the large the latter. Halimeda, Caulerpa, and Avrainvillea,by inverted repeats suggests that they were lost at the contrast, have multiple stretches of consecutive base of the Bryopsidales or even earlier during evo- uncharacterized ORFs (Marcelino et al. 2016), lution. The Dasycladales, probably the most closely which may or may not be of bacterial origin. related order to Bryopsidales (Cocquyt et al. 2010, For Neomeris, proper chloroplast genome assem- Fucıkova et al. 2014, Sun et al. 2016), are estimated blies were clearly not achievable based on short-read to have extremely large chloroplast genomes sequencing alone. While several fairly long contigs (>1 Mb) that have not yet been sequenced. From were obtained, these typically contained only one restriction enzyme and DNA hybridization experi- gene, and all of the longer genes were interrupted ments, Acetabularia chloroplast genomes are thought by long introns and sometimes spread across multi- to contain large repeats (8–10 kbp) but these are ple contigs. The situation in Neomeris is similar to unlikely to be the canonical inverted repeats found that in Acetabularia, where high-throughput sequenc- in other green algae (Tymms and Schweiger 1985, ing also failed to recover a complete sequence (de 1989). The chloroplast genomes of two other closely Vries et al. 2013). Metagenome binning of our Neo- related orders, Cladophorales and Trentepohliales, meris data suggested that the chloroplast data had a have also not been sequenced or characterized in quite distinct k-mer signature, and the size of the detail (but see La Claire et al. 1997, 1998, La Claire corresponding bin, which can serve as a conservative and Wang 2004). The Ulvales–Ulotrichales lineage, estimate of genome size, was 1.87 Mb. This is drasti- which is more distantly related in phylogenies of cally larger than canonical green algal chloroplast nuclear genes (Cocquyt et al. 2010), has species genomes (typically in the 100–200 kbp range) but with and species without large inverted repeats in line with previous work using restriction patterns (Pombert et al. 2005, 2006, Melton et al. 2015). showing that Dasycladales chloroplast genomes are Our approach of sequencing rough cultures (i.e., large and repeat-rich (Luttke€ and Bonotto 1982, cultured field samples without strain isolation) Tymms and Schweiger 1985, 1989). Obtaining yielded one complete chloroplast genome complete sequences of these genomes will require (HV05042) but assembly was much less straightfor- long-read sequencing, perhaps supplemented with ward for culture HV05007, which contained multi- optical genome mapping. ple strains. Here, the assemblers were unable to Phylogenetic relationships. While previous molecular span across several highly conserved RNA-coding analyses have shown uncertainty in the placement of regions, resulting in four contigs per species of Ostreobium with respect to the other main lineages Ostreobium present in the mixed culture. These of siphonous green algae (Verbruggen et al. 2009, highly conserved regions would be almost identical del Campo et al. 2016), our approach based on between the three species and thus are obviously chloroplast phylogenomics offers a clear, well-sup- difficult for assemblers to handle using short read ported solution in which the Ostreobium lineage data. Metagenome binning permitted us to separate branches at the base of the Bryopsidales, leaving the the four contigs of HV05007a from the remaining Bryopsidineae and the Halimedineae as sister lin- eight contigs of HV05007b & c, but the latter could eages. Our taxon sampling allowed us to divide up not be separated out further. As an alternative several long branches, including that of Ostreobium approach to bin the contigs into species, we itself (by using multiple strains instead of one), the attempted mapping read pairs across the ends of branch leading to the Halimedineae (by including adjacent contigs, but the RNA-coding regions were Avrainvillea) and that to the closest outgroup taxon too long for this approach to work with our 500 bp Acetabularia (by adding Neomeris). It is well-known paired-end library. Based on the information that that systematic bias can occur in phylogenetic infer- can be derived from the contigs, all Ostreobium ence when taxon sampling is sparse and that gen- chloroplast genomes appear to be completely colli- ome-scale datasets can provide strong support for near. Completing the remaining Ostreobium genomes the wrong tree (e.g., Hedtke et al. 2006). By strate- – either by isolating strains or performing long-read gically sampling early-branching taxa, the likelihood or mate-pair sequencing – would be interesting but of these artifacts are reduced (see Verbruggen and was not needed to achieve the goals of our study. Theriot 2008 and citations therein), and we can be Horizontal gene transfer from bacteria to chloro- quite confident of the position of Ostreobium. More- plast genomes is not common but Bryopsis and Tyde- over, while GC content heterogeneity and codon mania both have regions larger than 6 kb usage bias can cause phylogenetic artifacts (e.g., containing several ORFs of putative bacterial origin Conant and Lewis 2001), this did not appear to be (Leliaert and Lopez-Bautista 2015). We did not find an issue here as our results were consistent when 800 HEROEN VERBRUGGEN ET AL. using either DNA or amino-acid datasets. The 18S were sequenced and assembled from a single DNA gene, which could also be recovered from among pool, chimeric assemblies may have resulted. Such the contigs resulting from our HTS data, provides a erroneous assemblies would introduce conflicting convenient independent confirmation of the posi- signals within the data and result in poorly resolved tion of Ostreobium at the base of the Bryopsidales. phylogenies. It is also possible that the reduced Any remaining uncertainty in the position of the bootstrap support is simply a consequence of the Ostreobium lineage was a consequence of outgroup use of only the genes on a shorter fragment (contig choice, with especially the o3 rooting strategy of ~14 kb) for these two species. (Chlorodendrophyceae only) showing increased Within the remaining Bryopsidales, the relation- support for an alternative topology with Ostreobium ships among taxa matched well with those of pre- sister to Halimedineae. The effect was mostly lim- vious studies based on richer taxon sampling but ited to support values and only changed the ML fewer genes (Curtis et al. 2008, Verbruggen et al. topology in one instance. Interestingly, this alterna- 2009). In the analyses including the Pedino- tive topology was also recovered with statistical sup- phyceae as a distant outgroup, the Dasycladales port in an analysis of only the 16S gene (Neomeris + Acetabularia) and Bryopsidales consis- incorporating a broader sampling of green algal tently grouped together, suggesting that these two taxa (del Campo et al. 2016), but our in-depth anal- siphonous orders form a natural group, in accor- yses show that this alternative topology is poorly sup- dance with previous studies based on nuclear and ported overall. Our results show a moderate effect chloroplast sequences (Cocquyt et al. 2010, Sun of outgroup sampling on phylogenetic tree infer- et al. 2016). A caveat in this conclusion is that ence. In this context, our inclusion of Neomeris our taxon sampling does not include other closely broke up the long branch leading to Acetabularia related lineages such as Trentepohliales and Cla- that was seen in previous studies (Fucıkova et al. dophorales. 2014, Sun et al. 2016). Particularly in rooting phylo- Taxonomic treatment. The strongly supported phy- genetic trees, a proper consideration of taxon sam- logenetic position of the Ostreobium lineage inferred pling is important because adding outgroups can here implies that it should not be included in either introduce long branches in the phylogeny, poten- of the two currently recognized suborders Hali- tially making the analysis vulnerable to systematic medineae and Bryopsidineae. Therefore, we pro- biases (Shavit et al. 2007, Verbruggen et al. 2007, pose a new suborder for the Ostreobium lineage. Verbruggen and Theriot 2008, Goremykin et al. Ostreobineae Verbruggen et Guiry, subordo nov. 2013). With high-throughput sequencing now being Description: Siphonous green algae endolithic in mainstream, further improvements in taxon sam- limestone substrata, developing irregularly ramifying pling in green algal chloroplast phylogenomics can nets of siphons in the rock matrix; represented as a be expected in the coming years. distinct lineage in phylogenies based on chloroplast Within the Ostreobium lineage, there is clear evi- genome data, with the genome of culture strain dence for two sublineages that separate from each SAG 6.99 serving as a reference (GenBank accession other fairly early in the evolution of the lineage LT593849). (split between HV05007a + HV05042 and the Automatically typified with the genus Ostreobium remaining Ostreobiums). In fact, recent studies based Bornet and Flahault 1889. on environmental sequencing show that there are Note: In accordance with ICN Art.10.7 (McNeill up to four or even five of these lineages that deserve et al. 2012), “the principle of typification does not recognition at the family level (del Campo et al. apply to names of taxa above the rank of family, 2016, Marcelino and Verbruggen 2016, Sauvage except for names that are automatically typified by et al. 2016). When the tufA sequences from the being based on generic names, the type of which is chloroplast genomes generated here are compared the same as that of the generic name.” to those of the family level lineages from Marcelino Included family: P.C.Silva ex Maggs & and Verbruggen (2016), they belong to Ostreobium J.Brodie (Brodie et al. 2007). lineages 3 (HV05007a and HV05042) and 4 Note: The original description of the Ostreobiaceae (HV05007b & c, SAG6.99, OS1B). Within the Ostreo- (Silva 1982) was invalid as it lacked the Latin diag- bium lineage, there is poor support for the place- nosis or description required from January 1, 1958 ment of HV05007b & c, two species that were to December 31, 2011 (McNeill et al. 2012, Art. sequenced from a single rough culture. While it is 39.1). In previous studies (Verbruggen et al. 2009, certainly possible that chloroplast genome Marcelino and Verbruggen 2016, Sauvage et al. sequences do not allow for resolution in this part of 2016), the Ostreobium lineage was casually referred the tree, we should also consider the alternative pos- to as “Ostreobidineae;” however, this name is ortho- sibility that the poor support is due to misassembly. graphically incorrect. According to the rules of the HV05007b & c are relatively closely related Ostreo- ICN (McNeill et al. 2012, Art. 16.3 and 17.1), a sub- bium strains with chloroplast genomes that are order-level termination (-ineae) should be added to highly similar in k-mer composition and coverage. It the stem of the genitive of the genus name. The is therefore not unimaginable that because they genitive of Ostreobium, a second-declension neuter OSTREOBIUM CHLOROPLAST PHYLOGENOMICS 801 noun is Ostreobii, hence the correct orthography is algae nested within the Halimedineae (Marcelino Ostreobineae. and Verbruggen 2016, Sauvage et al. 2016). These The other two suborder names of the Bryopsi- groups have not been characterized morphologically dales, Halimedineae, and Bryopsidineae, were intro- but are likely to be similar to Ostreobium in appear- duced by Hillis-Colinvaux (Hillis-Colinvaux 1984) ance despite having separate evolutionary origins. but invalidly as they lacked the Latin diagnosis or As a consequence, without knowing of the molecu- description required at the time of publication lar affinities of strains, it is impossible to say whether (McNeill et al. 2012, Art. 39.1). Accordingly, we vali- the observed features are valid for the Ostreobineae, date them here, giving due credit to Hillis-Colin- and more work is needed on characterizing the biol- vaux as the originator of the names. ogy and subcellular features of this suborder. Halimedineae Hillis-Colinvaux ex Verbruggen et To conclude, we note that our chloroplast gen- Guiry, nom. nov. ome-based phylogeny offers good support at all Description: Siphonous green algae featuring levels in our phylogeny. When all steps in the phylo- heteroplasty ( and amyloplasts), holo- genetic hierarchy are resolved, it is possible to con- carpic reproduction without septa at base of repro- struct a similarly resolved hierarchical classification, ductive structures; with a concentric lamellar system including levels in between the traditional levels. In in the chloroplasts. our case, families could be organized into well- Automatically typified by Halimeda J.V.Lamouroux defined suborders. A similar situation was shown in 1812, nom. et typ. cons. a phylogenomic study of the red algal order Nema- Included families: Caulerpaceae, Dichoto- liales (Costa et al. 2016). At a shallower taxonomic mosiphonaceae, Halimedaceae, Pseudocodiaceae, level, a chloroplast genome-based phylogenetic Rhipiliaceae, and Udoteaceae. study of the highly diverse red algal family Rhodo- Bryopsidineae Hillis-Colinvaux ex Verbruggen et melaceae (Dıaz Tapia et al. 2015) allowed the orga- Guiry, nom. nov. nization of genera into well-supported tribes. While Description: Siphonous green algae with homo- these studies clearly illustrate the potential of plasty (chloroplasts only), nonholocarpic reproduc- chloroplast phylogenomics to help algal taxonomists tion with septa commonly present at base of construct a fine-grained hierarchical classification, it reproductive structures; concentric lamellar system is unlikely to be a silver bullet for all levels of classi- in the chloroplasts absent. fication and for all algal taxa. In rapidly diversifying Automatically typified by Bryopsis J.V.Lamouroux groups, the generation of many evolutionary lin- 1809 eages over a short time scale may not leave enough Included families: Bryopsidaceae, Codiaceae, and of a genomic imprint to resolve their relationships. Derbesiaceae. The origination of the classes and orders of the As the descriptions of these latter two suborders core Chlorophyta may be an example of this. suggest, there are some subcellular features that dis- Chloroplast genome-based phylogenies have not tinguish between them, in particular the presence/ been able to resolve their relationships despite absence of the “thylakoid organizing body,” a con- impressive taxon sampling and extensive experimen- centric lamellar system situated at one end of the tation with inference methods (e.g., Fucıkova et al. chloroplasts (Borowitzka 1976), the presence/ab- 2014, Lemieux et al. 2014a, Sun et al. 2016, Turmel sence of separate amyloplasts and whether or not et al. 2016). septa are present at the base of reproductive struc- tures (Hillis-Colinvaux 1984). At the moment it is This work was supported by the Australian Biological difficult to discuss how the subcellular features of Resources Study (RFL213-08), the Australian Research Coun- the Ostreobineae relate to either of the other subor- cil (FT110100585, DP150100705), the University of Mel- ders because not enough work has been carried out. bourne (ECR and SPRINT grants to HV and MIRS/MIFRS to The genus Ostreobium has not yet been studied with VRM and MCMC), and by use of the Victorian Life Sciences transmission electron microscopy, so we cannot Computation Initiative (VLSCI) at the University of Mel- bourne (projects UOM0007, UOM0021) and the Nectar comment on the structure of its lamellar system. Research Cloud, a collaborative Australian research platform Amyloplasts have not been reported previously, so it supported by the National Collaborative Research Infrastruc- is likely that the suborder aligns with Bryopsidineae ture Strategy (NCRIS). We thank Claude Payri, the staff of in that respect. Kornmann and Sahling (1980) the Alis research vessel and the La Planete Revisitee program reported reproductive structures in O. quekettii that for facilitating fieldwork in Papua New Guinea. We are very did not have septa at their base, suggesting that they grateful to Francesca Benzoni for providing coral identifica- tions, Fabio Rindi for contributing his insights in the Latin align with Halimedineae in this feature. However, language, and Laura Lagourgue, Craig Schneider, Michael we cannot extrapolate from these results because we Wynne and Sigrid Berger for help with the identifications of do not currently know whether the strain studied by algal samples. The authors declare no conflict of interest. Kornmann and Sahling (1980) is actually a true Ostreobium. Environmental sequencing has shown Abascal, F., Zardoya, R. & Telford, M. J. 2010. TranslatorX: multi- that, besides the Ostreobium clade, there are two ple alignment of nucleotide sequences guided by amino acid additional groups of endolithic siphonous green translations. Nucleic Acids Res. 38:W7–13. 802 HEROEN VERBRUGGEN ET AL.

Altschul, S. F., Gish, W., Miller, W., Myers, E. W. & Lipman, D. J. La Claire, J. W. & Wang, J. S. 2004. Structural characterization of 1990. Basic local alignment search tool. J. Mol. Biol. 215:403– the terminal domains of linear plasmid-like DNA from the 10. green alga Ernodesmis (Chlorophyta). J. Phycol. 40:1089–97. Bankevich, A., Nurk, S., Antipov, D., Gurevich, A. A., Dvorkin, M., La Claire, J. W., Zuccarello, G. C. & Tong, S. 1997. Abundant Kulikov, A. S., Lesin, V. M. et al. 2012. SPAdes: a new gen- plasmid-like DNA in various members of the orders Siphono- ome assembly algorithm and Its applications to single-cell cladales and Cladophorales (Chlorophyta). J. Phycol. 33: sequencing. J. Comp. Biol. 19:455–77. 830–7. Bergsten, J. 2005. A review of long-branch attraction. Cladistics Lehman, R. L. & Manhart, J. R. 1997. A preliminary comparison 21:163–93. of restriction fragment patterns in the genus Caulerpa Borowitzka, M. A. 1976. Some unusual features of the ultrastruc- (Chlorophyta) and the unique structure of the chloroplast ture of the chloroplasts of the green algal order Caulerpales genome of Caulerpa sertularioides. J. Phycol. 33:1055–62. and their development. Protoplasma 89:129–47. Leliaert, F. & Lopez-Bautista, J. M. 2015. The chloroplast gen- Brodie, J., Maggs, C. A. & John, D. M. 2007. The Green Seaweeds of omes of Bryopsis plumosa and Tydemania expeditiones (Bryopsi- Britain and Ireland. British Phycological Society, Great Britain, dales, Chlorophyta): compact genomes and genes of 242 pp. bacterial origin. BMC Genom. 16:204. del Campo, J., Pombert, J. F., Slapeta, J., Larkum, A. & Keeling, Leliaert, F., Tronholm, A., Lemieux, C., Turmel, M., DePriest, P. J. 2016. The ‘other’ coral symbiont: Ostreobium diversity M. S., Bhattacharya, D., Karol, K. G., Fredericq, S., Zech- and distribution. ISME J. 11:296–9. man, F. W. & Lopez-Bautista, J. M. 2016. Chloroplast phy- Cocquyt, E., Verbruggen, H., Leliaert, F. & De Clerck, O. 2010. logenomic analyses reveal the deepest-branching lineage of Evolution and cytological diversification of the green sea- the Chlorophyta, Palmophyllophyceae class. nov. Sci. Rep. weeds (Ulvophyceae). Mol. Biol. Evol. 27:2052–61. 6:25367. Conant, G. C. & Lewis, P. O. 2001. Effects of nucleotide composi- Lemieux, C., Otis, C. & Turmel, M. 2014a. Chloroplast phyloge- tion bias on the success of the parsimony criterion in phylo- nomic analysis resolves deep-level relationships within the genetic inference. Mol. Biol. Evol. 18:1024–33. green algal class Trebouxiophyceae. BMC Evol. Biol. 14:211. Costa, J. F., Lin, S. M., Macaya, E. C., Fernandez-Garcia, C. & Ver- Lemieux, C., Otis, C. & Turmel, M. 2014b. Six newly sequenced bruggen, H. 2016. Chloroplast genomes as a tool to resolve chloroplast genomes from prasinophyte green algae provide red algal phylogenies: a case study in the Nemaliales. BMC insights into the relationships among prasinophyte lineages Evol. Biol. 16:205. and the diversity of streamlined genome architecture in Cremen, M. C. M., Huisman, J. M., Marcelino, V. R. & Ver- picoplanktonic species. BMC Genom. 15:857. bruggen, H. 2016. Taxonomic revision of Halimeda (Bryopsi- Li, D., Luo, R., Liu, C. M., Leung, C. M., Ting, H. F., Sadakane, dales, Chlorophyta) in south-western Australia. Austral. Syst. K., Yamashita, H. & Lam, T. W. 2016. MEGAHIT v1.0: A fast Bot. 29:41–54. and scalable metagenome assembler driven by advanced Curtis, N. E., Dawes, C. J. & Pierce, S. K. 2008. Phylogenetic anal- methodologies and community practices. Methods 102:3–11. ysis of the large subunit rubisco gene supports the exclusion Lin, H. H. & Liao, Y. C. 2016. Accurate binning of metagenomic of Avrainvillea and Cladocephalus from the Udoteaceae (Bry- contigs via automated clustering sequences using informa- opsidales, Chlorophyta). J. Phycol. 44:761–7. tion of genomic signatures and marker genes. Sci. Rep. Dıaz Tapia, P., Maggs, C. A., West, J. A. & Verbruggen, H. 2015. 6:24175. Tackling rapid radiations with chloroplast phylogenomics in Lu,€ F., Xu,€ W., Tian, C., Wang, G., Niu, J., Pan, G. & Hu, S. 2011. the Rhodomelaceae. Eur. J. Phycol. Supplement: European The Bryopsis hypnoides plastid genome: multimeric forms and Phycological Congress abstract book:54–5. complete nucleotide sequence. PLoS ONE 6:e14663. Fine, M. & Loya, Y. 2002. Endolithic algae: an alternative source Luttke,€ A. & Bonotto, S. 1982. Chloroplasts and chloroplast DNA of photoassimilates during coral bleaching. Proc. R. Soc. Lond. of Acetabularia mediterranea: facts and hypotheses. In Bourne, B Biol. Sci. 269:1205–10. G. H. & Danielli, J. F. [Eds.] International Review of Cytology. Fucıkova, K., Leliaert, F., Cooper, E. D., Skaloud, P., D’hondt, S., Academic Press, Cambridge, MA, pp. 205–42. De Clerck, O., Gurgel, C. F. D. et al. 2014. New phylogenetic Manhart, J. R., Kelly, K., Dudock, B. S. & Palmer, J. D. 1989. Unu- hypotheses for the core Chlorophyta based on chloroplast sual characteristics of Codium fragile chloroplast DNA sequence data. Front Ecol. Evol. 2:63. revealed by physical and gene mapping. Mol. Gen. Genet. Goremykin, V. V., Nikiforova, S. V., Biggs, P. J., Zhong, B., 216:417–21. Delange, P., Martin, W., Woetzel, S., Atherton, R. A., Mcle- Marcelino, V. R., Cremen, M. C. M., Jackson, C. J., Larkum, A. & nachan, P. A. & Lockhart, P. J. 2013. The evolutionary root Verbruggen, H. 2016. Evolutionary dynamics of chloroplast of flowering plants. Syst. Biol. 62:50–61. genomes in low light: a case study of the endolithic green Grange, J. S., Rybarczyk, H. & Tribollet, A. 2015. The three steps alga Ostreobium quekettii. Genome Biol. Evol. 8:2939–51. of the carbonate biogenic dissolution process by microborers Marcelino, V. R. & Verbruggen, H. 2016. Multi-marker metabar- in coral reefs (New Caledonia). Environ. Sci. Pollut. Res. coding of coral skeletons reveals a rich microbiome and 22:13625–37. diverse evolutionary origins of endolithic algae. Sci. Rep. Guiry, M. D. & Guiry, G. M. 2016. AlgaeBase. World-wide elec- 6:31508. tronic publication. National University of Ireland, Galway. McNeill, J., Barrie, F. R., Buck, W. R., Demoulin, V., Greuter, W., Gutner-Hoch, E. & Fine, M. 2011. Genotypic diversity and distri- Hawksworth, D. L., Herendeen, P. S. et al. 2012. International bution of Ostreobium quekettii within scleractinian corals. Coral Code of Nomenclature for algae, fungi and plants (Melbourne Code) Reefs 30:643–50. adopted by the Eighteenth International Botanical Congress Mel- Hedtke, S. M., Townsend, T. M. & Hillis, D. M. 2006. Resolution bourne, Australia, July 2011. Koeltz Scientific Books, Konig-€ of phylogenetic conflict in large data sets by increased taxon stein, 208 pp. sampling. Syst. Biol. 55:522–9. Melton, J. T., Leliaert, F., Tronholm, A. & Lopez-Bautista, J. M. Hillis-Colinvaux, L. 1984. Systematics of the Siphonales. In Irvine, 2015. The complete chloroplast and mitochondrial genomes D. E. G. & John, D. M. [Eds.] Systematics of the Green Algae. of the green macroalga Ulva sp. UNA00071828 (Ulvo- Academic Press, London and Orlando, Florida, pp. 271–96. phyceae, Chlorophyta). PLoS ONE 10:e0121020. Katoh, K. & Toh, H. 2010. Parallelization of the MAFFT multiple Melton, J. T. & Lopez-Bautista, J. M. 2015. The chloroplast gen- sequence alignment program. Bioinformatics 26:1899–900. ome of the marine green macroalga Ulva fasciata Delile Kornmann, P. & Sahling, P. H. 1980. Ostreobium quekettii (Codi- (Ulvophyceae, Chlorophyta). Mitochondr. DNA 28:1–3. ales, Chlorophyta). Helgolander€ Meeresun. 34:115–22. Peng, Y., Leung, H. C. M., Yiu, S. M. & Chin, F. Y. L. 2012. IDBA- La Claire, W. J., Loudenslager, M. C. & Zuccarello, C. G. 1998. UD: a de novo assembler for single-cell and metagenomic Characterization of novel extrachromosomal DNA from sequencing data with highly uneven depth. Bioinformatics giant-celled marine green algae. Curr. Genet. 34:204–11. 28:1420–8. OSTREOBIUM CHLOROPLAST PHYLOGENOMICS 803

Pombert, J. F., Lemieux, C. & Turmel, M. 2006. The complete M. M., Leliaert, F. & De Clerck, O. 2009. A multi-locus time- chloroplast DNA sequence of the green alga Oltmannsiellopsis calibrated phylogeny of the siphonous green algae. Mol. Phy- viridis reveals a distinctive quadripartite architecture in the logenet. Evol. 50:642–53. chloroplast genome of early diverging ulvophytes. BMC Biol. Verbruggen, H. & Costa, J. F. 2015. The plastid genome of the 4:15. red alga Laurencia. J. Phycol. 51:586–9. Pombert, J. F., Otis, C., Lemieux, C. & Turmel, M. 2005. The Verbruggen, H., Leliaert, F., Maggs, C. A., Shimada, S., Schils, T., chloroplast genome sequence of the green alga Pseudendoclo- Provan, J., Booth, D. et al. 2007. Species boundaries and phy- nium akinetum (Ulvophyceae) reveals unusual structural fea- logenetic relationships within the green algal genus Codium tures and new insights into the branching order of (Bryopsidales) based on plastid DNA sequences. Mol. Phylo- chlorophyte lineages. Mol. Biol. Evol. 22:1903–18. genet. Evol. 44:240–54. Robbens, S., Derelle, E., Ferraz, C., Wuyts, J., Moreau, H. & Van Verbruggen, H. & Theriot, E. C. 2008. Building trees of algae: de Peer, Y. 2007. The complete chloroplast and mitochon- some advances in phylogenetic and evolutionary analysis. drial DNA sequence of Ostreococcus tauri: organelle genomes Eur. J. Phycol. 43:229–52. of the smallest eukaryote are examples of compaction. Mol. Verbruggen, H. & Tribollet, A. 2011. Boring algae. Curr. Biol. 21: Biol. Evol. 24:956–68. R876–7. Sauvage, T., Schmidt, W. E., Suda, S. & Fredericq, S. 2016. A de Vries, J., Habicht, J., Woehle, C., Changjie, H., Christa, G., metabarcoding framework for facilitated survey of endolithic W€agele, H., Nickelsen, J., Martin, W. F. & Gould, S. B. 2013. phototrophs with tufA. BMC Ecol. 16:1–21. Is ftsH the key to plastid longevity in sacoglossan slugs?. Gen- Schlichter, D., Zscharnack, B. & Krisch, H. 1995. Transfer of pho- ome Biol. Evol. 5:2540–8. toassimilates from endolithic algae to coral tissue. Naturwis- Vroom, P. S. & Smith, C. M. 2003. The challenge of siphonous senschaften 82:561–4. green algae. Am. Sci. 89:524–31. Shavit, L., Penny, D., Hendy, M. D. & Holland, B. R. 2007. The Wick, R. R., Schultz, M. B., Zobel, J. & Holt, K. E. 2015. Bandage: problem of rooting rapid radiations. Mol. Biol. Evol. 24:2400– interactive visualization of de novo genome assemblies. Bioin- 11. formatics 31:3350–2. Silva, P. C. 1982. Chlorophycota. In Parker, S. B. [Ed.] Synopsis Wu, Y. W., Tang, Y. H., Tringe, S. G., Simmons, B. A. & Singer, S. and Classification of Living Organisms. McGraw-Hill, New York, W. 2014. MaxBin: an automated binning method to recover pp. 133–61. individual genomes from metagenomes using an expecta- Stamatakis, A. 2014. RAxML version 8: a tool for phylogenetic tion-maximization algorithm. Microbiome 2:1–18. analysis and post-analysis of large phylogenies. Bioinformatics Yamazaki, S. S., Nakamura, T. & Yamasaki, H. 2008. Photoprotec- 30:1312–3. tive role of endolithic algae colonized in coral skeleton for Sun, L., Fang, L., Zhang, Z., Chang, X., Penny, D. & Zhong, B. the host photosynthesis. In: Allen, J. F., Gantt, E., Golbeck, J. 2016. Chloroplast phylogenomic inference of green algae H. & Osmond, B. [Eds.] Photosynthesis. Energy from the Sun. relationships. Sci. Rep. 6:20528. Springer, Netherlands, Dordrecht, pp. 1391–95. Tribollet, A. 2008. Dissolution of dead corals by euendolithic microorganisms across the northern great barrier reef (Aus- tralia). Microb. Ecol. 55:569–80. Turmel, M., Brouard, J.-S., Gagnon, C., Otis, C. & Lemieux, C. Supporting Information 2008. Deep division in the Chlorophyceae (Chlorophyta) revealed by chloroplast phylogenomic analyses. J. Phycol. Additional Supporting Information may be 44:739–50. Turmel, M., de Cambiaire, J. C., Otis, C. & Lemieux, C. 2016. found in the online version of this article at the Distinctive architecture of the chloroplast genome in the publisher’s web site: chlorodendrophycean green algae Scherffelia dubia and Tetra- selmis sp. CCMP 881. PLoS ONE 11:e0148934. Figure S1. Conserved blocks between Avrainvil- Turmel, M., Otis, C. & Lemieux, C. 2009. The chloroplast genomes lea and members of the Bryopsidineae, Hal- of the green algae Pedinomonas minor, Parachlorella kessleri, and imedineae and Ostreobineae suborders. Oocystis solitaria reveal a shared ancestry between the Pedi- nomonadales and Chlorellales. Mol. Biol. Evol. 26:2317–31. Figure S2. Maximum likelihood trees with boot- Turmel, M., Otis, C. & Lemieux, C. 2015. Dynamic evolution of strap values for all analysis conditions. The title of the chloroplast genome in the green algal classes Pedino- each page corresponds to analysis conditions in phyceae and Trebouxiophyceae. Genome Biol. Evol. 7:2062–82. Tymms, M. J. & Schweiger, H. G. 1985. Tandemly repeated nonri- Table 2. bosomal DNA sequences in the chloroplast genome of an Appendix S1. Alignments used to infer all max- Acetabularia mediterranea strain. Proc. Natl. Acad. Sci. USA 82:1706–10. imum likelihood trees, in Phylip format. Tymms, M. J. & Schweiger, H. G. 1989. Significant differences between the chloroplast genomes of two Acetabularia mediter- ranea strains. Mol. Gen. Genet. 219:199–203. Verbruggen, H., Ashworth, M., LoDuca, S. T., Vlaeminck, C., Coc- quyt, E., Sauvage, T., Zechman, F. W., Littler, D. S., Littler,