<<

Iowa State University Capstones, Theses and Graduate Theses and Dissertations Dissertations

2012 The of the mitochondrial of and cnidarians Ehsan Kayal Iowa State University

Follow this and additional works at: https://lib.dr.iastate.edu/etd Part of the Evolution Commons, and the Molecular Commons

Recommended Citation Kayal, Ehsan, "The ve olution of the mitochondrial genomes of calcareous sponges and cnidarians" (2012). Graduate Theses and Dissertations. 12621. https://lib.dr.iastate.edu/etd/12621

This Dissertation is brought to you for free and open access by the Iowa State University Capstones, Theses and Dissertations at Iowa State University Digital Repository. It has been accepted for inclusion in Graduate Theses and Dissertations by an authorized administrator of Iowa State University Digital Repository. For more information, please contact [email protected]. The evolution of the mitochondrial genomes of calcareous sponges and cnidarians

by

Ehsan Kayal

A dissertation submitted to the graduate faculty

in partial fulfillment of the requirements for the degree of

DOCTOR OF PHILOSOPHY

Major: Ecology and Evolutionary Biology

Program of Study Committee Dennis V. Lavrov, Major Professor Anne Bronikowski John Downing Eric Henderson Stephan Q. Schneider Jeanne M. Serb

Iowa State University

Ames, Iowa

2012

Copyright 2012, Ehsan Kayal ii

TABLE OF CONTENTS

ABSTRACT ...... V

CHAPTER 1. GENERAL INTRODUCTION...... 1

Introduction ...... 1

Thesis Organization ...... 5

References ...... 6

CHAPTER 2. CALCAREOUS SPONGES REDEFINE THE LIMITS OF MTDNA EVOLUTION IN THE ...... 11

Abstract...... 11

Introduction ...... 12

Materials and methods...... 15

Results ...... 18

Discussion...... 26

Conclusion...... 32

Acknowledgements...... 32

CHAPTER 3. EVOLUTION OF LINEAR MITOCHONDRIAL GENOMES IN MEDUSOZOAN CNIDARIANS...... 51

Abstract...... 51

Introduction ...... 52

Materials and methods...... 55

Results ...... 58

Discussion...... 66

Acknowledgements...... 73

References ...... 80

iii

CHAPTER 4. CNIDARIAN PHYLOGENETIC RELATIONSHIPS AS REVEALED BY MITOGENOMICS...... 84

Abstract...... 84

Background...... 85

Results ...... 89

Discussion...... 93

Conclusions ...... 99

Methods...... 100

Acknowledgements...... 104

CHAPTER 5. THE ERATIC EVOLUTION OF ANIMAL MITOCHONDRIAL DNA...... 118

Abstract...... 118

Introduction ...... 119

Mitochondrial structure and organization...... 121

The size of animal mitochondrial genome is not uniform ...... 123

A wide array of mitochondrial genetic content ...... 125 mtDNA-encoded are prone to a high level of variation in ...... 127

Changing the code is not limited to bilaterian animals ...... 129

Concluding remarks...... 131

References ...... 133

CHAPTER 6. GENERAL CONCLUSIONS...... 146

Conclusions ...... 146

Broader perspectives and future directions...... 147

References ...... 149

APPENDIX A. SUPPLEMENTARY MATERIALS FOR CHAPTER 2 ...... 152

APPENDIX B. SUPPLEMENTARY MATERIALS FOR CHAPTER 3 ...... 161

iv

APPENDIX B. SUPPLEMENTARY MATERIALS FOR CHAPTER 4 ...... 163

v

ABSTRACT

The mitochondrial DNA (mtDNA) in animals (Metazoa) is a favorite molecule for phylogenetic studies given its relative uniformity in both size and organization. Yet, as the depth coverage of representative animal groups increases sharply thanks to recent advances in sequencing technology, some remain stubbornly under sampled, if even represented at all. Difficulties associated with data collection from problematic taxa can arise from highly derived sequences, fragmented genomes, unusual structure, or any combination of these.

Particularly illustrative examples are found in non-bilaterian animals (placozoans, sponges, cnidarians, comb jellies) where the mtDNA is more variable in size and structure. The present dissertation provides several case studies of what is considered “unusual” mtDNA for animals.

First, we describe some unusual characteristics of the mitochondrial genomes found in calcareous sponges (Calcarea, Porifera), where one, potentially two, novel genetic codes are inferred, transfer RNAs (tRNAs) are edited, and ribosomal RNA (rRNA) are in pieces.

We also hypothesize that the mtDNA is linear and multipartite. Then, we explore the evolution of the mtDNA in medusozoan cnidarians (, ). The mtDNA in Medusozoa is linear, and encodes two extra genes (lost in one ) putatively involved in the maintenance and replication of the linear . In addition, secondary segmentalization has occurred independently in some hydras (Hydridae) and box jellies

(Cubozoa). Using the sequences from these mito-genomes, we propose a new phylogeny for

Cnidaria, providing additional support for the clade [Medusozoa + ], rendering

Anthozoa ( + Octocorallia) paraphyletic. Finally, this dissertation concludes by a mini review stating the current state of knowledge of metazoan mtDNA and some of the

vi pitfalls in the field of mitogenomics. In particular, the new findings further challenge the classical idea of a uniform mtDNA organization (frozen genome) in animals, and question any directional explanation of the evolution of the mtDNA in animals.

1

CHAPTER 1. GENERAL INTRODUCTION

Introduction

Deoxyribonucleic acid (DNA) is the main carrier of genetic information in cells. In , the majority of the genetic information is found in the (nDNA), where duplication and take place. But genetic material is not restricted to the nucleus as other compartments can also harbor and transmit DNA in the form of DNA

(plDNA, ), mitochondrial DNA (mtDNA, ), or extra chromosomal similar to what is found in . The endosymbiotic events (Gross & Bhattacharya,

2009) from α-proteobacteria-like ancestor in the case of the mitochondrion (Thrash et al.,

2011), and of a photosynthetic cyanobacterium in the case of plastids (Chan, Gross, Yoon, &

Bhattacharya, 2011; Janouskovec, Horák, Oborník, Lukes, & Keeling, 2010) are considered to be at the origin of eukaryotic . Following the original endosymbiotic event, transfer of genetic material has occurred to various degree between the and either another organelle or the nucleus (Adams & Palmer, 2003; Archibald, 2009; Martin, 2003; Timmis,

Ayliffe, Huang, & Martin, 2004). Consequently, none of the organelles carry genomes similar to those of their prokaryotic ancestor, i.e. a compact circular with a complete set of genes (Burger, Gray, & Lang, 2003; Hikosaka et al., 2010; Nash, Nisbet, Barbrook, &

Howe, 2008; Sesterhenn et al., 2010; Valach et al., 2011; Vlcek, Marande, Teijeiro, Lukes, &

Burger, 2011).

By comparison to other , animal mitochondrial genome is relatively stable in size and composition (Gissi, Iannelli, & Pesole, 2008). Structural variation of animal mtDNA encompass single or multi-partite linear or circular molecules, and recent sequencing efforts

2

provide additional evidence regarding the plasticity of animal mtDNA structure (Armstrong,

Blok, & Phillips, 2000; Doublet, 2010; Doublet et al., 2012; Gibson, Blok, Phillips, et al.,

2007; Gibson, Blok, & Dowton, 2007; Kayal et al., 2012; Kayal & Lavrov, 2008; Marcadé et

al., 2007; R. Shao, Kirkness, & Barker, 2009; Smith et al., 2012; Suga, Mark Welch, Tanaka,

Sakakura, & Hagiwara, 2008; Voigt, Erpenbeck, & Wörheide, 2008; Wei et al., 2012), further

challenging an earlier "frozen genome" view (J L Boore, 1999; Saccone et al., 2002). In

animals ( Metazoa), the mtDNA is a hot spot of translational novelties as illustrated by

the occurrence of frameshifting (Haen, Lang, Pomponi, & Lavrov, 2007; Rosengarten,

Sperling, Moreno, Leys, & Dellaporta, 2008), at least 13 codon reassignments (Abascal,

Posada, & Zardoya, 2012; Knight, Freeland, & Landweber, 2001), and still counting. The

exploration of obscure taxa (e.g. interstitial phyla, , nematomorphs) might bring

additional evidences on the organizational elasticity of animal mtDNA. This also raises some

questions about the directionality of mtDNA evolution in animals as proposed earlier (Lavrov,

2007).

Recent advances in sequencing technology provide increasing amounts of molecular

data at relatively low prices (Glenn, 2011), revolutionizing many research areas in biology

(Egan, Schlueter, & Spooner, 2012; Kamb, 2011; Lerner & Fleischer, 2010; Mardis, 2008,

2011; Morozova & Marra, 2008; Raffan & Semple, 2011; Straub et al., 2011; Weinstock,

2011). As genomic studies are becoming widespread, the urge for acquiring more mtDNA data might start waning away. At the same time, the application of massively parallel next- generation sequencing platforms to mito-genomics (the study of mitochondrial genomes) (Jex,

Littlewood, & Gasser, 2010; Mason, Li, Helgen, & Murphy, 2011; Zaragoza, Fass, Diegoli,

Lin, & Arbustini, 2010) renders traditional protocols (e.g. Burger, Lavrov, Forget, & Lang,

3

2007) obsolete. In fact, for an equivalent effort input, next generation sequencing platforms produce >100 times the amount of mitochondrial genome sequence that traditional methods used to provide. These new technologies can also help deciphering the mtDNA of the so-called obscure animal taxa (see above) for which no mitochondrial sequence is available yet.

One can expect an increasing of sequence data in the coming , some of which will without doubt improve our understanding of the evolution of animal mtDNA. Yet, most of these data will most likely come from a handful of model organisms, hardly contributing to our understanding of mtDNA diversity. As the field of mitogenomics reinvents itself in the light of next generation sequencing, particular is needed on those forgotten groups that either fell out of favor or never achieved to capture the interest of the scientific community. For instance, non-bilaterian animals (sponges, cnidarians, ctenophorans and placozoans) display the highest diversity in mitogenome organization, and are of primary importance in understanding the evolutionary history of early metazoans (Lavrov, 2007).

In that context, this thesis focuses on two groups of non-bilaterian animals for which

scarce (medusozoan cnidarians) or no mitochondrial sequences (calcareous sponges) was

available at the start of the program. Unlike most metazoans, the mtDNA of medusozoan

cnidarians are organized into one, two or eight mitochondrial chromosomes (Kayal et al., 2012;

Kayal & Lavrov, 2008; Park et al., 2012; Z. Shao, Graf, Chaga, & Lavrov, 2006; Smith et al.,

2012; Voigt et al., 2008). In Metazoa, the only other case of linear mtDNA has been found in

some isopods where the mtDNA consists of a circular ~28-kb dimer and a linear

~14-kb monomer (Doublet et al., 2012; Marcadé et al., 2007; Raimond et al., 1999). In to

understand the evolution of linear mtDNA in medusozoan cnidarians, we determined the

4

partial or complete mtDNA sequences for 24 belonging to all four medusozoan classes

using a combination of traditional and next generation sequencing approaches (Chapter 3).

Mitochondrial genomes provide one of the most commonly used molecular markers in

animal , with informative characters at the nucleotide, amino acid and

organization levels (Jeffrey L Boore & Fuerstenberg, 2008; Lavrov, 2007; Moret & Warnow,

2005). Thanks to our recent sampling of the mtDNA from representative species from all medusozoan classes, we were able to reconstruct the phylogenetic relationships within Cnidaria

(Chapter 4). In particular, we tested the of and showed that the grouping

[Medusozoa + Octocorallia] is not an artifact of reconstruction due to bias sampling as suggested earlier (Collins, 2009; Park et al., 2012).

Within the past few years, mitogenome sampling has greatly increased

(Belinky, Rot, Ilan, & Huchon, 2008; Ereskovsky & Lavrov, 2011; D Erpenbeck et al., 2007;

Dirk Erpenbeck, Voigt, Wörheide, & Lavrov, 2009; Gazave et al., 2010; Haen et al., 2007;

Lavrov, 2010; Lavrov, Forget, Kelly, & Lang, 2005; Lukić-Bilela et al., 2008; Rosengarten,

Sperling, Moreno, Leys, & Dellaporta, 2008; Wang & Lavrov, 2007, 2008) and many more are to be expected in the framework of the Assembling the Porifera Tree of project

(www.portol.org/). Yet, one of sponges (Calcarea) remains stubbornly absent from the list. Here we determined partial mitochondrial sequences of three calcareous sponges (Chapter

2). Among their highly unusual we report the use of two (one per order) genetic codes for protein , fragmented ribosomal RNA (rRNA) and transfer RNA (tRNA) editing.

We also propose that the mtDNA is fragmented into several linear molecules. While still incomplete, these sequences shed lights on the particular evolutionary path followed by the mtDNA of these sponges.

5

Thesis Organization

This dissertation consists of this general introduction (Chapter 1), two case studies describing the mtDNA of calcareous sponges (Chapter 2) and medusozoan cnidarian (Chapter

3), a phylogeny of Cnidaria based on mito-genomic data (Chapters 4), a mini review stating the current state of knowledge of metazoan mtDNA (Chapter 5), and a general conclusion that summarizes the major results of the thesis and recommends future developments (Chapter 6).

Chapter 2 describes the partial mtDNA of several calcareous sponges, where two novel genetic codes, tRNA editing, gene and genome fragmentation are reported. Chapter 3 looks at the evolution of linear mtDNA in medusozoan cnidarians. Chapter 4 redefines the phylogenetic relationships within Cnidaria using the extended mito-genomic dataset of Chapter 3. Chapter 3 has been published in Genome Biology and Evolution (2012) and Chapter 4 was submitted to

BMC Evolutionary Biology. Modified versions of Chapter 2 and 5 will be submitted for publication in Molecular Biology and Evolution and Mitochondrial DNA respectively.

6

References

Abascal, F., Posada, D., & Zardoya, R. (2012). The evolution of the mitochondrial genetic code in revisited. Mitochondrial DNA, 23(2), 84-91. doi:10.3109/19401736.2011.653801 Adams, K. L., & Palmer, J. D. (2003). Evolution of mitochondrial gene content: gene loss and transfer to the nucleus. Molecular Phylogenetics and Evolution, 29(3), 380-395. doi:10.1016/S1055-7903(03)00194-5 Archibald, J. M. (2009). The puzzle of plastid evolution. Current biology : CB, 19(2), R81-8. Elsevier Ltd. doi:10.1016/j.cub.2008.11.067 Armstrong, M. R., Blok, V. C., & Phillips, M. S. (2000). A multipartite mitochondrial genome in the potato cyst Globodera pallida. , 154(1), 181-92. Belinky, F., Rot, C., Ilan, M., & Huchon, D. (2008). The complete mitochondrial genome of the magnifica (). Molecular phylogenetics and evolution, 47(3), 1238-43. doi:10.1016/j.ympev.2007.12.004 Boore, J L. (1999). Animal mitochondrial genomes. Nucleic acids research, 27(8), 1767-80. Boore, Jeffrey L, & Fuerstenberg, S. I. (2008). Beyond linear sequence comparisons: the use of genome-level characters for phylogenetic reconstruction. Philosophical transactions of the Royal Society of London. Series B, Biological sciences, 363(1496), 1445-51. doi:10.1098/rstb.2007.2234 Burger, G., Gray, M. W., & Lang, B. F. (2003). Mitochondrial genomes: anything goes. Trends in Genetics, 19(12), 709-716. Elsevier. Burger, G., Lavrov, D. V., Forget, L., & Lang, B. F. (2007). Sequencing complete mitochondrial and plastid genomes. Nature protocols, 2(3), 603-14. doi:10.1038/nprot.2007.59 Chan, C. X., Gross, J., Yoon, H. S., & Bhattacharya, D. (2011). Plastid origin and evolution: new models provide insights into old problems. , 155(4), 1552-60. doi:10.1104/pp.111.173500 Collins, A. G. (2009). Recent Insights into Cnidarian Phylogeny. Smithsonian Contributions to Marine Sciences, 38, 139-149. Doublet, V. (2010). Structure et Evolution du Génome Mitochondrial des Oniscidea (Crustacea, Isopoda). Structure. Université de Poitiers. Doublet, V., Raimond, R., Grandjean, F., Lafitte, A., Souty-grosset, C., & Marcadé, I. (2012). Widespread atypical mitochondrial DNA structure in isopods (Crustacea, Peracarida) related to a constitutive heteroplasmy in terrestrial species, 244, 234-244. doi:10.1139/G2012-008 Egan, a. N., Schlueter, J., & Spooner, D. M. (2012). Applications of next-generation sequencing in plant biology. American Journal of Botany, 99(2), 175-185. doi:10.3732/ajb.1200020 Ereskovsky, A. V., & Lavrov, D. V. (2011). Molecular and morphological description of a new species of Halisarca (Demospongiae: Halisarcida) from Mediterranean and a redescription of the type species Halisarca dujardini. Zootaxa, 2768, 5-31. Erpenbeck, D, Voigt, O., Adamski, M., Adamska, M., Hooper, J. N. a, Wörheide, G., & Degnan, B. M. (2007). Mitochondrial diversity of early-branching metazoa is revealed

7

by the complete mt genome of a haplosclerid demosponge. Molecular Biology and Evolution, 24(1), 19-22. doi:10.1093/molbev/msl154 Erpenbeck, Dirk, Voigt, O., Wörheide, G., & Lavrov, D. V. (2009). The mitochondrial genomes of sponges provide evidence for multiple invasions by Repetitive Hairpin- forming Elements (RHE). BMC genomics, 10, 591. doi:10.1186/1471-2164-10-591 Gazave, E., Lapébie, P., Renard, E., Vacelet, J., Rocher, C., Ereskovsky, A. V., Lavrov, D. V., et al. (2010). Molecular phylogeny restores the supra-generic subdivision of homoscleromorph sponges (Porifera, Homoscleromorpha). PloS one, 5(12), e14290. doi:10.1371/journal.pone.0014290 Gibson, T., Blok, V. C., & Dowton, M. (2007). Sequence and characterization of six mitochondrial subgenomes from Globodera rostochiensis: multipartite structure is conserved among close nematode relatives. Journal of molecular evolution, 65(3), 308- 15. doi:10.1007/s00239-007-9007-y Gibson, T., Blok, V. C., Phillips, M. S., Hong, G., Kumarasinghe, D., Riley, I. T., & Dowton, M. (2007). The mitochondrial subgenomes of the nematode Globodera pallida are mosaics: evidence of recombination in an animal mitochondrial genome. Journal of molecular evolution, 64(4), 463-71. doi:10.1007/s00239-006-0187-7 Gissi, C., Iannelli, F., & Pesole, G. (2008). Evolution of the mitochondrial genome of Metazoa as exemplified by comparison of congeneric species. Heredity, 101(4), 301-20. doi:10.1038/hdy.2008.62 Glenn, T. C. (2011). Field guide to next-generation DNA sequencers. Molecular Ecology Resources, 11, 759-769. doi:10.1111/j.1755-0998.2011.03024.x Gross, J., & Bhattacharya, D. (2009). Mitochondrial and plastid evolution in eukaryotes: an outsiders’ perspective. Nature reviews. Genetics, 10(7), 495-505. doi:10.1038/nrg2610 Haen, K. M., Lang, B. F., Pomponi, S. a, & Lavrov, D. V. (2007). Glass sponges and bilaterian animals share derived mitochondrial genomic features: a common ancestry or parallel evolution? Molecular biology and evolution, 24(7), 1518-27. doi:10.1093/molbev/msm070 Hikosaka, K., Watanabe, Y.-I., Tsuji, N., Kita, K., Kishine, H., Arisue, N., Palacpac, N. M. Q., et al. (2010). Divergence of the mitochondrial genome structure in the apicomplexan parasites, Babesia and Theileria. Molecular biology and evolution, 27(5), 1107-16. doi:10.1093/molbev/msp320 Janouskovec, J., Horák, A., Oborník, M., Lukes, J., & Keeling, P. J. (2010). A common red algal origin of the apicomplexan, , and plastids. Proceedings of the National Academy of Sciences of the United States of America, 107(24), 10949-54. doi:10.1073/pnas.1003335107 Jex, A. R., Littlewood, D. T. J., & Gasser, R. B. (2010). Toward next-generation sequencing of mitochondrial genomes--focus on parasitic of animals and biotechnological implications. Biotechnology advances, 28(1), 151-9. Elsevier Inc. doi:10.1016/j.biotechadv.2009.11.002 Kamb, A. (2011). Next-Generation Sequencing and Its Potential Impact. Chemical research in , 24(8), 1163-8. doi:10.1021/tx200121m Kayal, E., Bentlage, B., Collins, A. G., Kayal, M., Pirro, S., & Lavrov, D. V. (2012). Evolution of linear mitochondrial genomes in medusozoan cnidarians. Genome Biology and Evolution, 4(1), 1-12. doi:10.1093/gbe/evr123

8

Kayal, E., & Lavrov, D. V. (2008). The mitochondrial genome of oligactis (Cnidaria, ) sheds new light on animal mtDNA evolution and cnidarian phylogeny. Gene, 410(1), 177-86. doi:10.1016/j.gene.2007.12.002 Knight, R. D., Freeland, S. J., & Landweber, L. F. (2001). Rewiring the keyboard: evolvability of the genetic code. Nature Reviews Genetics, 2(1), 49-58. doi:10.1038/35047500 Lavrov, D. V. (2007). Key transitions in animal evolution: a mitochondrial DNA perspective. Integrative and Comparative Biology, 47(5), 734-743. doi:10.1093/icb/icm045 Lavrov, D. V. (2010). Rapid proliferation of repetitive palindromic elements in mtDNA of the endemic Baikalian sponge baicalensis. Molecular biology and evolution, 27(4), 757-60. doi:10.1093/molbev/msp317 Lavrov, D. V., Forget, L., Kelly, M., & Lang, B. F. (2005). Mitochondrial genomes of two provide insights into an early stage of animal evolution. Molecular biology and evolution, 22(5), 1231-9. doi:10.1093/molbev/msi108 Lerner, H. R. L., & Fleischer, R. C. (2010). Prospects for the use of Next-Generation Sequencing Methods in Ornithology. The Auk, 127(1), 4-15. Lukić-Bilela, L., Brandt, D., Pojskić, N., Wiens, M., Gamulin, V., & Müller, W. E. G. (2008). Mitochondrial genome of domuncula: palindromes and inverted repeats are abundant in non-coding regions. Gene, 412(1-2), 1-11. doi:10.1016/j.gene.2008.01.001 Marcadé, I., Cordaux, R., Doublet, V., Debenest, C., Bouchon, D., & Raimond, R. (2007). Structure and evolution of the atypical mitochondrial genome of Armadillidium vulgare (Isopoda, Crustacea). Journal of molecular evolution, 65(6), 651-9. doi:10.1007/s00239- 007-9037-5 Mardis, E. R. (2008). The impact of next-generation sequencing technology on genetics. Trends in genetics : TIG, 24(3), 133-41. doi:10.1016/j.tig.2007.12.007 Mardis, E. R. (2011). A decade’s perspective on DNA sequencing technology. Nature, 470(7333), 198-203. Nature Publishing Group. doi:10.1038/nature09796 Martin, W. (2003). Gene transfer from organelles to the nucleus: frequent and in big chunks. Proceedings of the National Academy of Sciences of the United States of America, 100(15), 8612-4. doi:10.1073/pnas.1633606100 Mason, V. C., Li, G., Helgen, K. M., & Murphy, W. J. (2011). Efficient cross-species capture hybridization and next-generation sequencing of mitochondrial genomes from noninvasively sampled museum specimens. Genome Research, (21), 1695-1704. doi:10.1101/gr.120196.111.Capture Moret, B. M. E., & Warnow, T. (2005). Advances in phylogeny reconstruction from gene order and content data. Methods in enzymology, 395(October 2004), 673-700. doi:10.1016/S0076-6879(05)95035-0 Morozova, O., & Marra, M. a. (2008). Applications of next-generation sequencing technologies in functional genomics. Genomics, 92(5), 255-64. Elsevier Inc. doi:10.1016/j.ygeno.2008.07.001 Nash, E. a, Nisbet, R. E. R., Barbrook, A. C., & Howe, C. J. (2008). : a mitochondrial genome all at sea. Trends in genetics : TIG, 24(7), 328-35. doi:10.1016/j.tig.2008.04.001

9

Park, E., Hwang, D.-S., Lee, J.-S., Song, J.-I., Seo, T.-K., & Won, Y.-J. (2012). Estimation of divergence times in cnidarian evolution based on mitochondrial protein-coding genes and the record. Molecular Phylogenetics and Evolution, 62(1), 329-345. Raffan, E., & Semple, R. K. (2011). Next generation sequencing--implications for clinical practice. British medical bulletin, 99, 53-71. doi:10.1093/bmb/ldr029 Raimond, R., Marcadé, I., Bouchon, D., Rigaud, T., Bossy, J. P., & Souty-Grosset, C. (1999). Organization of the large mitochondrial genome in the isopod Armadillidium vulgare. Genetics, 151(1), 203-10. Rosengarten, R. D., Sperling, E. a, Moreno, M. a, Leys, S. P., & Dellaporta, S. L. (2008). The mitochondrial genome of the sponge vastus: evidence for programmed translational frameshifting. BMC genomics, 9(33), 33. doi:10.1186/1471-2164-9-33 Saccone, C., Gissi, C., Reyes, A., Larizza, A., Sbisà, E., & Pesole, G. (2002). Mitochondrial DNA in metazoa: degree of freedom in a frozen event. Gene, 286(1), 3-12. Sesterhenn, T. M., Slaven, B. E., Keely, S. P., Smulian, a G., Lang, B. F., & Cushion, M. T. (2010). Sequence and structure of the linear mitochondrial genome of Pneumocystis carinii. Molecular genetics and genomics : MGG, 283(1), 63-72. doi:10.1007/s00438- 009-0498-7 Shao, R., Kirkness, E. F., & Barker, S. C. (2009). The single mitochondrial chromosome typical of animals has evolved into 18 minichromosomes in the human louse, Pediculus humanus. Genome research, 19(5), 904-12. doi:10.1101/gr.083188.108 Shao, Z., Graf, S., Chaga, O. Y., & Lavrov, D. V. (2006). Mitochondrial genome of the moon jelly Aurelia aurita (Cnidaria, ): A linear DNA molecule encoding a putative DNA-dependent DNA polymerase. Gene, 381, 92-101. doi:10.1016/j.gene.2006.06.021 Smith, D. R., Kayal, E., Yanagihara, A. a, Collins, A. G., Pirro, S., & Keeling, P. J. (2012). First complete mitochondrial genome sequence from a box reveals a highly fragmented, linear architecture and insights into evolution. Genome Biology and Evolution, 4(1), 52-58. doi:10.1093/gbe/evr127 Straub, S. C. K., Parks, M., Weitemier, K., Fishbein, M., Cronn, R. C., & Liston, A. (2011). Navigating the tip of the genomic iceberg: Next-generation sequencing for plant systematics. American journal of botany, 99(2), 349-364. doi:10.3732/ajb.1100335 Suga, K., Mark Welch, D. B., Tanaka, Y., Sakakura, Y., & Hagiwara, A. (2008). Two circular chromosomes of unequal copy number make up the mitochondrial genome of the Brachionus plicatilis. Molecular biology and evolution, 25(6), 1129-37. doi:10.1093/molbev/msn058 Thrash, J. C., Boyd, A., Huggett, M. J., Grote, J., Carini, P., Yoder, R. J., Robbertse, B., et al. (2011). Phylogenomic evidence for a common ancestor of mitochondria and the SAR11 clade. Scientific Reports, 1. doi:10.1038/srep00013 Timmis, J. N., Ayliffe, M. a, Huang, C. Y., & Martin, W. (2004). Endosymbiotic gene transfer: organelle genomes forge eukaryotic chromosomes. Nature reviews. Genetics, 5(2), 123-35. doi:10.1038/nrg1271 Valach, M., Farkas, Z., Fricova, D., Kovac, J., Brejova, B., Vinar, T., Pfeiffer, I., et al. (2011). Evolution of linear chromosomes and multipartite genomes in yeast mitochondria. Nucleic acids research, 1-18. doi:10.1093/nar/gkq1345

10

Vlcek, C., Marande, W., Teijeiro, S., Lukes, J., & Burger, G. (2011). Systematically fragmented genes in a multipartite mitochondrial genome. Nucleic acids research, 39(3), 979-88. doi:10.1093/nar/gkq883 Voigt, O., Erpenbeck, D., & Wörheide, G. (2008). A fragmented metazoan organellar genome: the two mitochondrial chromosomes of Hydra magnipapillata. BMC genomics, 9, 350. doi:10.1186/1471-2164-9-350 Wang, X., & Lavrov, D. V. (2007). Mitochondrial genome of the homoscleromorph (Porifera, Demospongiae) reveals unexpected complexity in the common ancestor of sponges and other animals. Molecular biology and evolution, 24(2), 363-73. doi:10.1093/molbev/msl167 Wang, X., & Lavrov, D. V. (2008). Seventeen New Complete mtDNA Sequences Reveal Extensive Mitochondrial Genome Evolution within the Demospongiae. (M. Hofreiter, Ed.)PLoS ONE, 3(7), 11. Public Library of Science. doi:10.1371/journal.pone.0002723 Wei, D.-D., Shao, R., Yuan, M.-L., Dou, W., Barker, S. C., & Wang, J.-J. (2012). The Multipartite Mitochondrial Genome of Liposcelis bostrychophila: Insights into the Evolution of Mitochondrial Genomes in Bilateral Animals. PloS one, 7(3), e33973. doi:10.1371/journal.pone.0033973 Weinstock, G. M. (2011). The Impact of Next-Generation Sequencing Technologies on Metagenomics. Handbook of Molecular Microbial Ecology I (pp. 143-147). John Wiley & Sons, Inc. doi:10.1002/9781118010518.ch18 Zaragoza, M. V., Fass, J., Diegoli, M., Lin, D., & Arbustini, E. (2010). Mitochondrial DNA variant discovery and evaluation in human Cardiomyopathies through next-generation sequencing. PloS one, 5(8), e12295. doi:10.1371/journal.pone.0012295

11

CHAPTER 2. CALCAREOUS SPONGES REDEFINE THE LIMITS OF

MTDNA EVOLUTION IN THE ANIMAL KINGDOM

A version of this manuscript to be submitted to Molecular Biology and Evolution

Ehsan Kayal, Oliver Voigt, Gert Wörheide, Lise Forget, Franz B. Lang, Dennis V. Lavrov

Abstract

Although animal mitochondrial DNA (mtDNA) is often depicted as a remarkably

uniform molecule, recent sampling particularly from non-bilaterian animals has challenged this view. For instance, mitochondrial genomes from the three major lineages in the phylum

Porifera (Demospongiae, Homoscleromorpha, and Hexactinellida) revealed distinct

evolutionary trajectories. Remarkably, to date no genuine mitochondrial sequence has been

reported from Calcarea, the fourth class within the phylum. Here we report partial

mitochondrial sequences from three calcareous sponges: clathrus () and

Leucetta chagosensis () from the subclass Calcinea, and Petrobiona massiliana

(Lithonida), subclass . Our analysis of calcarean mtDNA revealed a unique

combination of unusual mitochondrial features, including a multi-partite genome with potential

linear organization, at least one, potentially two novel genetic code(s) used in protein synthesis

that involves the first known reassignment of CGN codon in mtDNA, template-

dependent editing of tRNAs, and highly fragmented rRNA genes. In addition, the rate of

mitochondrial sequence evolution in this group is several orders of magnitude higher than in

other animals. The combination of unusual mitochondrial features found in this class of

12 calcareous sponges puts into a new perspective our understanding of the extent of possible mtDNA evolution in Animalia.

Introduction

Calcareous sponges (Calcarea) represent one of the four recognized classes within the phylum Porifera (Gazave et al., 2011; Hooper & Soest, 2002) and the only class of sponges with a made entirely of , a form of (CaCO3). The majority of

Calcarea species are small, colorless, and inconspicuous organisms found mostly in cryptic marine habitats (e.g. beneath rocks and inside cavities). Currently, there are ~675 described species for this class subdivided into two monophyletic subclasses Calcinea and Calcaronea

(Van Soest et al., 2012), which are well supported by the analyses of multiple independent characters (see (M Manuel, 2006)). However, despite several recent revisions conducted at different taxonomic levels (e.g. (Borojevic, Boury-Esnault, & Vacelet, 1990, 2000; Klautau &

Borojevic, 2001; Klautau & Valentine, 2003; M Manuel, 2006)), the phylogenetic relationships within these subclasses are far from being resolved. In addition, the phylogenetic position of

Calcarea relative to other sponges remains highly controversial (reviewed in (Erpenbeck &

Wöorheide, 2007)). Traditionally, calcareous sponges were placed as a to either

Demospongiae (e.g. Cellularia, (Reiswig & Mackie, 1983)) or Demospongiae + Hexactinellida

(e.g. Silicea, (Gray, 1867)). By contrast, most molecular studies using rRNA (C Borchiellini,

Manuel, Alivon, Vacelet, & Parco, 2001; Cavalier-Smith, Allsopp, Chao, Boury-Esnault, &

Vacelet, 1996; Michael Manuel, Borchiellini, Alivon, Le Parco, & Boury-Esnault, 2003;

Medina, Collins, Silberman, & Sogin, 2001; Zrzavy, Mihulka, Kepka, Bezdek, & Tietz, 1998) and nuclear housekeeping genes (E. a Sperling, Peterson, & Pisani, 2009) reconstructed

13

Porifera as a paraphyletic group with calcareous sponges being more closely related to

Eumetazoa than to other sponges. Recent phylogenomic analyses based on ESTs data from a large number of animal species recovered support for the monophyly of the phylum Porifera, where Calcarea is placed as the sister taxa to Homoscleromorpha (Philippe et al., 2009, 2011;

Pick et al., 2010).

The uncertainties in both the phylogenetic position of Calcarea and the phylogenetic relationships within this group are mainly due to the current relative obscurity of the group to the sponge world. In fact, calcareous sponges were subject to numerous studies and monographs during the 19th century, as illustrated by their instrumental role in Haeckel’s development of the Gastrea theory (Haeckel, 1874). However, they fell out of favor during the

20th century, with most biologists becoming oblivious to the “beauty and fragrance of the

calcareous sponges” (Burton 1963 in (Hooper & Soest, 2002)). Consequently, relatively few studies explored the phylogeny of the group, and those that did primarily used nuclear rDNA sequences as molecular markers. Yet, resolving the phylogenetic position of Calcarea is crucial for determining the validity of Porifera as a clade and better understanding early animal evolution (E. a. Sperling, Pisani, & Peterson, 2007). This task calls for the use of several independent molecular markers (nuclear and mitochondrial) for phylogenetic reconstruction.

Animal mtDNA is a favorite molecule for phylogenetic, population genetic, and biogeographic studies. In fact, the relatively easy access to mitogenomes relative to nuclear sequences in most animal groups has propelled sequencing of complete mtDNAs into the frontline of genomic studies. While earlier studies considered animal mtDNA as a remarkably uniform genome (Saccone et al., 2002), recent sampling has uncovered its substantial

14 organizational diversity (Dennis V. Lavrov, 2007). In particular, complete mtDNA sequences from the other sponge lineages (Demospongiae, Hexactinellida, and Homoscleromorpha) have revealed a variety of unusual features such as extra protein genes, additional or missing tRNA genes, novel genetic code, translational frameshifting, and large non-coding regions (Belinky,

Rot, Ilan, & Huchon, 2008; Erpenbeck, Voigt, Wörheide, & Lavrov, 2009; Haen, Lang,

Pomponi, & Lavrov, 2007; Dennis V Lavrov, 2010; Dennis V Lavrov, Forget, Kelly, & Lang,

2005; Lukić-Bilela et al., 2008; Rosengarten, Sperling, Moreno, Leys, & Dellaporta, 2008;

Wang & Lavrov, 2007, 2008). However, to date there is a scarcity of mtDNA sequences from

Calcarea in public databases. Furthermore, six mitochondrial sequences available in the

GenBank database assigned to this group (Accession numbers AF035266, AF035267,

AF362006, AF362010, AF362015, AF362018) appear to be contamination from the demosponges (Figure S1, see also (Carole Borchiellini et al., 2004)).

Here we report partial mitochondrial sequences from four calcareous sponges, two

belonging to the subclass Calcinea: (Clathrinidae) and Leucetta chagosensis

(Leucettidae), and one from the subclass Calcaronea: Petrobiona massiliana (Petrobionidae).

For Clathrina clathrus we PCR-amplified genomic sequence of three fragments of 7.5, 6.8 and

5.6 kbp in size using a Long-PCR approach (Burger, Lavrov, Forget, & Lang, 2007). For

Leucetta chagosensis we identified sequences of four mitochondrial protein genes from an EST database (Philippe et al., 2009). Nearly identical sequences have been PCR amplified from

another specimen of L. chagosensis collected elsewhere. For Petrobiona massiliana, we

identified mitochondrial gene from total genome sequencing reads. We then extended the

mitochondrial sequences of P. massiliana by PCR amplification using specific primers to

15 obtain three contigs of 3.5, 1.8 and 0.8 kbps. Our results provide a starting point for determining additional mt-genomes from calcareous sponges, which should reveal the full range of mtDNA sequence diversity in this group. They also reveal a derived the extensive accumulation of unusual features in mtDNA that puts into a new perspective our understanding of the extent of possible mtDNA evolution in the Animal Kingdom.

Materials and methods

Obtaining mitochondrial sequences

A specimen of Clathrina clathrus was collected by Michael Nickel off Isola del Giglio

(Italy) and preserved in an 8M-guanidium chloride solution. A rock encrusted by a specimen of

Petrobiona massiliana was collected off the coasts of Marseille and preserved in an 8M-

guanidium chloride solution. Total DNA was prepared by phenol-chloroform extraction following proteinase K . For C. clathrus, regions of cob, cox1 and rnl were amplified

and sequenced using conserved primers for animals (Burger et al., 2007). Sequences were

extended using a modified step-out protocol developed in our lab (Burger et al., 2007) and re-

amplified by standard PCR. Long PCR products were sheared and cloned using TOPO TA

Cloning Kit from Invitrogen. Total DNA from P. massiliana was sheared, tagged and

submitted for shallow depth total DNA sequencing on a GAIIX Illumina platform (1/3 for a

lane, 75 cycles). We used two different methods for the identification of mitochondrial protein

gene in P. massiliana. On the one had, short reads were assembled using MIRA v.3 (Chevreux

et al., 2004) and mitochondrial genes were identified by comparison to sequences from C.

clathrus. Sequences were extended by PCR as described above. Amplicons were used as

templates to map short Illumina reads using MIRA v.3. On the other , we used Neighbor

16

Joining (NJ) trees to pin down a list of putative mitochondrial genes. We extended these sequences by several rounds of mapping using Geneious Pro v5.5.6 (Drummond et al., 2011).

Sequencing reactions were processed at the Iowa State University DNA Facility. The EST library for Leucetta chagosensis came from the database reported in Philippe et al. (Philippe et al., 2009). cDNA library construction and ESTs sequencing were performed at the Max Planck

Institute for Molecular Genetics (Germany). Another specimen of L. chagosensis has been collected in the Red Sea (while the ESTs were generated from Australian L. chagosensis) and partial sequences for cob, cox1, cox3 and nad1 were PCR amplified and sequenced (primer sequences given in table S1).

RNA extractions and cDNA synthesis

Total RNA was extracted from a specimen of Clathrina clathrus collected by DVL at the Station marine d'Endoume, Marseille, France using TRIZOL extraction protocol

(Chomczynski & Mackey, 1995). Total RNA was treated with DNase and an aliquot was circularized using T4 RNA ligase. cDNA was produced using SuperScript® III from

Invitrogen using instructions from the manual, and several tRNAs and partial subunits of mitochondrial small and large ribosomal RNAs (rRNAs) were PCR amplified using specific primers designed based on DNA sequences (Table S1). PCR products were either directly sequenced (rRNA regions) or cloned (tRNAs) using the TA Cloning Kit from Invitrogen. At least eight colonies were analyzed per tRNA gene.

Gene annotation, RNA folding and genome description

17

Mitochondrial sequences were assembled with the STADEN software suite (Staden,

1996). Open reading frames in the nucleotide sequences were identified using the standard genetic code and minimally derived code (with TAG = Trp) characteristic for demosponge mitochondria (Dennis V Lavrov et al., 2005). Deviations from these genetic codes were found by analyzing 100% conserved positions in amino acid alignments of atp6, cob, cox1, nad1-4 genes for Clathrina clathrus, cob, cox1, cox3 and nad1 genes for Leucetta chagosensis, and cob, cox1, cox3, nad4-5 for Petrobiona massiliana, with homologues from representatives of

other non-bilaterian taxa. Gene annotation and gene boundaries were assigned by comparing

the translated sequences to homologous sequences from other non-bilaterian animals. rRNA

genes were identified by comparison to other rRNA sequences from sponges and cnidarians

and verified based upon their secondary structure. tRNA-like genes were identified with the

tRNAscan-SE and ARWEN programs (Laslett & Canbäck, 2008; Lowe & Eddy, 1997) and

verified manually.

Phylogenetic inference

We aligned the amino acid sequences of the genes atp6, cob, cox1, cox3, and nad1-5

from the calcarean species Clathrina clathrus, Leucetta chagosensis and Petrobiona

massiliana to homologues from 57 species representating other non-bilaterian groups (Table

S2). Sequences were aligned using the MUSCLE plugin of Geneious Pro v.5.5 (Drummond et

al., 2011) with default parameters. Individual gene alignments were filtered by Gblocks

(Castresana, 2000) with default parameters and allowing all gap positions. The final

concatenated data set was 2835 amino acids in length (Supplementary material). Maximum

likelihood (ML) and Bayesian (BI) analyzes were conducted on the concatenated data sets

18 using RAxML v.7.2.6 (Stamatakis, 2006) and PhyloBayes v. 3.2f (Lartillot, Blanquart, &

Lepage, n.d.). ML runs were performed for 1000 bootstraps iterations under the GTR model of

sequence evolution with the number of categories defined by a gamma (Γ) distribution and the

CAT approximation. BI analyzes consisted of four chains over 25,000 generations using the

CAT model (maxdiff<0.2), and sampled every 10th tree after the first 100 burn-in cycles

(arbitrary values).

Results

Multipartite architecture of calcareous sponge mitochondrial genomes

1- Clathrina clathrus (Calcinea)

We obtained the sequences for three fragments of the mitochondrial DNA (mtDNA) of

Clathrina clathrus (7.5, 6.8 and 5.6 kbp in size) (Figure 1A). We tried to determine whether C. clathrus sequences were parts of a larger single molecule or belonged to different mitochondrial chromosomes. To do so, we performed a Southern Blot (SB) analysis using probes specific to each of the three sequenced fragments. Each probe hybridized to two distinct sized bands. The size estimates for the lower bands were ~1.5 to 3 kbp larger than those for the corresponding determined contigs (Figure 1B), while the upper bands were approximately twice the size of the lower bands. Restriction digestion of the total DNA followed by

SB analysis displayed complex hybridization patterns difficult to interpret that reject neither a linear or circular organization of the mtDNA in C. clathrus.

The largest contig of C. clatrhus mtDNA (7.5kbp sequenced) corresponds to a chromosome ~10-11 kbp in size (“chrcl-cox1” hereafter) and contains four protein genes

19

(nad3, nad2, cox1, atp6) and at least two tRNA genes (trnN, trnR). The second largest contig

(6.8 kbp sequenced) corresponds to a chromosome ~9 kbp in size (“chrcl-cob”) and contains

three protein genes (nad4, nad1, and cob). The smallest contig (5.6 kbp sequenced)

corresponds to the smallest chromosome (“chrcl-rnl”, ~7 kbp) and encodes fragments of rns

and rnl as well as at least two tRNAs (Figure 1A). The remaining genes typically present in

animal mtDNA were not found and might be located either on different chromosomes or have

transferred to the nucleus. Yet, the presence of cox3 in Leucetta chagosensis and Petrobiona

massiliana mtDNAs (see below) suggests that the mtDNA in C. clathrus is organized into at

least four linear molecules. The arrangement of identified genes in Clathrina clathrus mtDNA

is very different from those in other non-bilaterian animals. In fact, we found only three shared

gene boundaries between C. clatrhus and other taxa: rns-rnl with some demosponges, trnQ-rns

with glass sponges, and nad1-cob with octocorals.

All three mitochondrial chromosomes in Clathrina clathrus are similarly compact with

near absence of intergenic region (IGR). We found genes located in one part of the

chromosome, while another part contains large non-coding repetitive sequences regions

(Figure 1A). In particular, a 190 bp-long, GC-rich sequence, which we called rep1, is present at

the one end of cox1 and cob chromosomes. The two copies of this sequence are 92.3%

identical in sequence and have a potential to fold into a complex secondary structure (Figure

2). We also amplified a region at one end of the rnl-chromosome with the combination of rnl-

rep1 primers, suggestion the presence the rep1 structure on this chromosome. Another

sequence, which we called rep2, is found at the other end of the cox1- and rnl-chromosomes.

The two copies of this sequence are 353 nucleotides in length and share 70.3% of sequence

20 identity. No evident secondary structure was found for rep2. We were not able to find sequence

similar to rep2 in the terminal region of the cob-chromosome (Figure 1A); instead we found

one strong GC-rich stem-loop structure. Interestingly, amplifications between rep1-rep2 give a

single band of ~6kb. Such pattern also advocates for fragmented chromosomal organization of

C. clathrus mtDNA.

The partial sequence of Clathrina clathrus mtDNA has a low A+T content (56.7%) and

displays small AT- and GC-skews between the two strands (0.03 and -0.08 respectively). The

AT-skew of the partial rRNA genes sequence is about 0.05 and GC-skew of 0.23. Strong

negative AT and GC-skews are found in C. clathrus protein-coding sequences (-0.11 and -0.25

respectively). In comparison, the coding regions in Leucetta chagosensis are less GC rich than

those in C. clathrus (27 vs. 42 %), with negative AT- and no GC-skews (-0.15 and 0.00

respectively). The presence of GC-rich repeated sequences is most likely responsible for higher

total GC values in the mtDNA of C. clathrus.

2- Petrobiona massiliana (Calcaronea)

Using C. clathrus mtDNA sequences as queries to mine the assemblies obtained from

shallow sequencing of Petrobiona massiliana total DNA, we identified several putative

mitochondrial protein genes that do not harbor internal stop codon according to the minimally

derived genetic code used for non-bilaterians (TGA=tryptophan). We identified a list of eight

sequences for atp6, cob, cox1, cox3, nad1, and nad4-5 (Figure S2). We used several gene-

specific primers for PCR-amplification and between these genes. cob, nad4 and nad5 could be

amplified into a single large PCR amplicon (chrpe-cob, 3.5 kbp). However, when we used this

21 large contig as a reference for assembling the short reads from the total DNA sequencing, no reads mapped in the regions between these genes, suggesting that the amplicon is actually a

PCR-generated artifact. Such artifacts are common features of multipartite genomes with terminal repeats (TRs) where the polymerase jumps between the TRs.

We found cob and nad5 genes on the largest contig (chrpe 1, 2.5 kbp). The other genes

were found on individual contigs: chrpe 2 (1.5 kbps) harbors cox1; chrpe 3 (1.2 kbps) contains

part of nad1; chrpe 4 (1.1 kbps) nad4; chrpe 5 (0.8 kbps) has encodes part of cox3. We were

not able to identify other mitochondrial genes with significant confidence. There is no shared

gene boundary between the mitochondrial gene arrangement in P. massiliana and other non-

bilaterian animals. We found various sized non-coding regions (NCR) at the end of most

contigs, with the largest ones found on chrpe 3 (345 bps) upstream of nad1 and on chrpe 2 (131

bps) downstream of cox1. Some of these regions can be folded into stem-loop structures. The

partial mitochondrial sequences of Petrobiona massiliana are more AT rich (53.3 %), and

display higher AT- (-0.14) and GC-skew (0.14) values between the two strands than their

calcinean counterparts.

Eccentric protein coding sequences in calcareous sponge mtDNA.

We identified seven mitochondrial protein genes (atp6, cob, cox1, nad1-4) in Clathrina

clathrus, four (cob, cox1, cox3, nad1) in Leucetta chagosensis, and six (cob, cox1, cox3, nad1,

nad4-5) in Petrobiona massiliana. These genes are on average of the same size as their

homologues in other non-bilaterian animals but share with them low sequence identities (Table

1). In C. clathrus mtDNA, ATG is inferred as the start-codon for three genes (atp6, nad2 and

22

nad3), while ATT inferred for two (cob and nad1), and TTA and CTT for one each (cox1 and

nad4, respectively). ATG is inferred as the start-codon of nad1 and nad5 in P. massiliana,

TTA inferred for cox1 and cox3. We were not able to infer a start codon for the other genes

(cob and nad4) due to missing data. TAA is the only stop codon inferred for C. clathrus and L.

chagosensis protein-coding genes, but both TAA and TAG are the stop codons inferred for P.

massiliana (see below). We also found an unidentified ORF on the rnl-chromosome of C.

clathrus mtDNA.

Our data suggest that two different genetic codes are used in mitochondrial protein synthesis in calcareous sponges. In both Calcinea species, inferred mitochondrial protein-

coding sequences have inframe UAG codons, which usually code for translation termination in

all animal mtDNAs published so far. We found 33 internal UAG codons in C. clathrus coding

sequences and 23 UAG codons in those of L. chagosensis. We excluded mRNA editing as an

explanation for the presence of these internal termination codons because we found inframe

UAG codons in both PCRs and cDNA-based mitochondrial sequences obtained from L.

chagosensis and in cDNA-based cox1 sequences obtained from C. clathrus. We compared

pattern of codon usage for the positions that are completely conserved in inferred amino acid

sequences of encoded mitochondrial between our sequences and other non-bilaterian

animals. These comparisons revealed that eight UAG codons in our sequences occurred at the

positions conserved as tyrosine in other non-bilaterian animals. In addition these sequences show an unexpected pattern of occurrence for CGN codons, specified as arginine in the standard genetic code. None of the CGN codons in calcareous sponge coding sequences are present in 100% conserved arginine position, but 49 codons in C. clathrus and 18 codons in L.

23

chagosensis are found in 100% conserved positions. Finally, several internal TGA codons occurred at the positions conserved as tryptophan.

In the Calcaronea Petrobiona massiliana, the TGA codon encodes 52% of tryptophan

residue. We did not find any indication of the use of calcinean genetic code. In fact, no internal

TAG codon was found; TAG was the inferred termination codon for cob. In addition, we found

ATA codons in P. massiliana for 6 of the 100% conserved methionine positions for animals (3 in cox1, 1 in nad4 and 2 in nad5). ATA codons encode 60% of the methionine residues in the mitochondrial protein genes of this species (Table 2). These findings suggest that two independent changes of the genetic code occurred in calcareous sponges: the reassignment of

CGN codon family from arginine to glycine, and the UAG codon from termination to tyrosine in Calcinea; the reassignment of ATA codon to methionine in Calcaronea. Calcareans share with all animals as well as multiple outgroups the reassignment of the UGA codon from termination to tryptophan (Knight, Freeland, & Landweber, 2001).

Template-dependent editing in Clathrina clathrus mt-tRNAs

Using tRNAscan and ARWEN searches we identified several tRNA-like sequences in

the mtDNA of C. clathrus (Figure 1A). The inferred secondary structure of these motifs

resemble mtDNA-encoded tRNAs in bilaterian animals by the lack of conservation in the DHU

and TΨC arms and multiple mismatches in the amino-acyl acceptor arm. Six of these tRNAs

have well-conserved sequences of anticodon, D- T-arm loops, but multiple mismatches in the

stems (Figure 3). We investigated the nature of these tRNA-like sequences by a RT-PCR

approach (D V Lavrov, Brown, & Boore, 2000). For four of them ( , ,

24

and ), we found what appears to be mature tRNAs with the CCA sequence added at the 3’ end (Chens, Joyce, Wolfe, Steffen, & Martinn, 1992), and with both the T-arm and the 3’ end of the acceptor stem edited to form perfect Crick-Watson

complementary pairs (Figure 3). We were unable to RT-PCR amplify the remaining tRNA-like

structures and the nature of these sequences needs further investigation. The process of tRNA

editing in C. clathrus mitochondria resembles those described in the Lithobius

forficatus (D V Lavrov et al., 2000) and onychophorans (Segovia, Pett, Trewick, & Lavrov,

2011), in that it involved all four nucleotides. Similar mt-tRNA editing cases were also suggested in the oregonensis (Masta & Boore, 2004) and gall midges

(Beckenbach & Joy, 2009). However, unlike bilaterian mt-tRNA editing cases, tRNA genes are

not truncated in C. clathrus mtDNA, which advocates for an alternative mechanism of editing.

Fragmented rRNA genes

We identified several fragment of the large and small subunits of ribosomal RNA (rns

and rnl) in the mitochondrial sequences of C. clathrus (Figure 1A). These fragments are

distributed on the rnl-chromosome of C. clathrus mtDNA and, when assembled, harbor

conserved structures of animal mitochondrial rRNA genes, including conserved stem loops

required for rRNA functions. We found three segments of rns that are organized in a non-

sequential order and interrupted by (Figure 4). Similarly, the two inferred

fragments of rnl (rnl-p1 and rnl-p2) are interrupted by (Figure S3). As shown

above, both and appear to code for functional tRNAs. Although it is

possible that these rRNA fragments are pseudogenes or partially duplicated sequences, the

25 pattern of their conservation (with conserved sequences in functionally important regions), their presence in cDNA libraries, and the ability to form functional rRNA subunits all together argue for their being functional genes.

Phylogenetic inferences using Calcarea mtDNA.

We created an alignment of 2835 amino acid sequences of ten mitochondrial protein- coding genes from 57 species representing all non-bilaterian groups, but excluding the ctenophore sequences. We then conducted both Bayesian and Maximum Likelihood

phylogenetic analyzes under the GTR and CAT models of sequence evolution. Both Cnidaria

and Porifera were polyphyletic in all our phylogenetic trees (Figure 5). We found the groupings

[[ (G1) + Myxospongiae (G2)] + [marine (G3) + Democlavia (G4)]] in

Demospongia and [Cubozoa + ] + [Hydrozoa + ]] in Medusozoa under

both CAT and GTR models of sequence evolution. Interestingly, the CAT model appears to

overestimate the number of substitutions (illustrated by longer branches) for Calcarea and

Hexactinellida, compared to the GTR model (Figure 5). Comparatively, the GTR model was

more sensitive to missing data, where Leucetta chagosensis with the fewer number of

sequences was the longer branch in Calcarea (Figure 5A). The tree topologies were identical

under both the GTR+Γ and GTR+CAT models of sequence evolution, with lower bootstrap

values (BV) in the former (Figure 5A, Figure S4). In these trees, we found the long branch taxa

(Hexactinellida and Calcarea) regrouping together, and Hexacorallia the sister taxon to

[Demospongia + Homoscleromorpha] (BV=93/96 under GTR+Γ and GTR+CAT,

respectively). Under the CAT model we found [Octocorallia + Medusozoa] (PP=75), and a

composite group made of monophyletic Demospongia (PP=0.91), Hexacorallia (PP=0.6),

26

Calcarea (PP=0.99), and Hexactinellida (PP=0.99), and polyphyletic Homoscleromorpha

(Figure 5B).

Discussion

A unique combination of unusual mitochondrial features in Calcarea

For a long time sampling of animal mitochondrial genomes has been biased by two different factors: a prevailing scientific interest in a few taxa and technical difficulties associated with working with others. While the first of these biases is slowly lifting thanks to a better understanding of animal diversity, the second, often caused by atypical mtDNA architecture and/or sequence evolution, is only now starting to be addressed. Thus, one can expect that the few remaining metazoan lineages with no available mitochondrial sequence to have the most unusual mt-genomes. The mitochondrial genomic sequences from the calcareous

sponges Clathrina clathrus, Leucetta chagosensis, and Petrobiona massiliana determined in

this study are among the most remarkable in animals and comparable to the extreme

mitogenomes found in some (Burger, Jackson, & Waller, 2012). Here we discuss four

uncommon mitochondrial features revealed by these sequences: apparent multipartite and

linear genome architecture, novel genetic codes, edited tRNAs, and highly fragmented rRNAs.

Animal mtDNA is commonly depicted as a single, small, circular molecule, yet derived

forms of mitogenomes have been described in several taxa. For instance, it has been known for

more than 20 years that Medusozoa, one of the two major lineages in the phylum Cnidaria, has

linear and, in some cases, multipartite genomes (Kayal et al., 2012; Smith et al., 2012). In

bilaterian animals, some terrestrial isopods harbor two mtDNA “isoforms” composed of ~14

27 kb linear monomers and ~28 kb -to-head circular dimers (Doublet, 2010; Doublet et al.,

2012; Marcadé et al., 2007; Raimond et al., 1999). Our study suggests that the mtDNA of C. clathrus is also multipartite, potentially linear, composed of at least three chromosomes of various sizes, each harboring a subset of mt genes and repeated identical regions. The results of the SB analyses, while difficult to interpret, might represent the two conformations

(supercoiled and relaxed forms) of the circular mtDNA molecules. This is reinforced by our inability to assemble P. massiliana sequences into a single contig. The presence of homologous sequences on various mitochondrial chromosomes was also reported in the multipartite mtDNA of the rotifer Brachionus plicatilis (Suga, Mark Welch, Tanaka, Sakakura, &

Hagiwara, 2008), the potato cyst nematode Globodera pallida (Armstrong, Blok, & Phillips,

2000), the human body louse Pediculus humanus (Shao, Kirkness, & Barker, 2009), and the booklice Liposcelis bostrychophila (Wei et al., 2012). In medusozoan cnidarians, inverted repeats at the ends of the linear molecules are thought to play a role in the integrity of the mitochondrial chromosomes (Smith et al., 2012). These similarities suggest that multipartite and potentially linear mitochondrial genome organization has also evolved in C. clathrus and, possibly, in all calcareous sponges. Unfortunately, our failure to obtain the complete mitochondrial sequences using traditional approaches hindered any attempt to definitively settle on the linear or circular nature of these mtDNA molecules.

The second unusual feature of the calcareous sponge mitochondria reported here is the presence of at least one, potentially two different and novel genetic codes inferred for mitochondrial protein synthesis. These include unique reassignments of the UAG codons from termination to tyrosine and of the CGN codon family from arginine to glycine in calcineans

28

Clathrina clathrus and Leucetta chagosensis, and putatively of the ATA codon from isoleucine

to methionine in the calcaronean Petrobiona massiliana. We also found the reassignment of the

UGA codon from termination to tryptophan in both subclasses that is shared with other animals

and some fungi (Knight et al., 2001). While the CGN codons were previously identified as

“mutable in their meaning” (Söll & RajBhandary, 2006) and unassigned in yeast mitochondria

(Clark-Walker & Weiller, 1994), our study is the first to report their actual reassignment.

Similarly, the reassignment of the UAG codon to tyrosine is novel, although changes of the

UAG codon to leucine/alanine and glutamine have been found in mtDNA of green and

nuclear genomes of , respectively (Knight et al., 2001). Interestingly, a convergent

change for another termination codon (UAA) to tyrosine has been recently reported in a

nematode (Jacob, Vanholme, Van Leeuwen, & Gheysen, 2009). The reassignment of ATA

codon to methionine is typical of bilaterians and some fungi (Knight et al., 2001). If confirmed,

the change of the ATA codon becomes a second convergence of codon change between

Bilateria and sponges, similar to the change in AGR codons in glass sponges (Haen et al.,

2007). But, unlike what has been suggested earlier (Haen et al., 2007), given the position of

sponges in the most recent animal phylogenies (Pick et al., 2010), these changes do not appear

to “represent an intermediate state in the reassignment of” these codons “in animal mtDNA”

(Haen et al., 2007), but rather episodic events in the erratic evolution of the mtDNA in

metazoans. Changes in the genetic code are often associated with the presence of reassigned

tRNAs in the mitochondria that are either encoded by the mtDNA and/or imported from the

cytoplasm (Lang, Lavrov, Beck, & Steinberg, 2012). Our data does not allow ruling out either

possibility.

29

Another unusual feature of calcareous mitochondrial genomes is fragmented rRNA genes. Fragmented rRNA genes are seldom in animal mtDNA and so far have been found only in oysters Crassostera gigas and C. virginica, where lrRNA is encoded in two pieces (C. A.

Milbury & Gaffney, 2005; C. a Milbury, Lee, Cannone, Gaffney, & Gutell, 2010), and in

Placozoa (Signorovitch, Buss, & Dellaporta, 2007) where lrRNA is encoded in two or tree pieces separated by as much as 20 kbp (Burger, Yan, Javadi, & Lang, 2009; Signorovitch et al.,

2007). Outside of Metazoa, fragmented mitochondrial rRNAs have been described in bacteria, green and some other lineages (Boer & Gray, 1988; Fontaine, Rousvoal, LeBlanc,

Kloareg, & Goër, 1995; Nedelcu, 1997). Extreme rRNA has been documented for mitochondrial rRNA genes of apicomplexan and dinoflagellates (Jackson et al., 2007;

Waller & Jackson, 2009). Yet, the high degree of rRNA fragmentation in C. clathrus mitochondria is unique in animals. Assuming that these rRNA-like fragments generate functional rRNA molecules, it would be very interesting to explore the mechanisms involved in their processing, particularly for the recognition of individual pieces, and their interaction inside the ribosome.

Our calcarean mitochondrial sequences encode truncated tRNA molecules, which appear to undergo a post-transcriptional RNA editing. Several different types of mitochondrial tRNA editing has been reported to date, including C-to-U editing in mitochondria of (Börner, Mörl, Janke, & Pääbo, 1996; Janke & Pääbo, 1993), trypanosomes

(Alfonzo, Blanc, Estévez, Rubio, & Simpson, 1999) and plants (e.g. (Fey et al., 2002;

Maréchal-Drouard, Ramamonjisoa, Cosset, Weil, & Dietrich, 1993)); insertion editing in slime molds (Antes, Costandy, Mahendran, & Miller, 1998); 5’-end editing in amoebozoans (K.

30

Lonergan & Gray, 1993; K. M. Lonergan & Gray, 1993) and lower fungi (Laforest, Roewer, &

Lang, 1997); or a combination of these (Gott, Somerlot, & Gray, 2010). The 3’-end editing similar to that found in our study has been reported in bilaterian animals (D V Lavrov et al.,

2000; Reichert, Rothbauer, & Mörl, 1998; Segovia et al., 2011; Tomita, Ueda, & Watanabe,

1996; Yokobori & Pääbo, 1995, 1997) and a jacobid flagellate (Leigh & Lang, 2004). Further

studies are needed to determine whether the tRNA editing found in Clathrina clathrus is a

recent acquisitions that is specific for this species/group of species, as is common in animals

(e.g. (Brennicke, Marchfelder, & Binder, 1999)) or it is a feature common to all calcareous

sponges.

Finally, the mitochondrial sequences of calcareous sponges determined for this study

display rapid and unique evolutionary patterns that evocate the very unusual mtDNAs found in

some dinoflagellates (Jackson et al., 2007; Waller & Jackson, 2009). These sequences also

exhibit an extremely high rate of sequence evolution, hampering our attempts to recover the

position of Calcarea within Metazoa based on mtDNA sequence data. In previous studies, the phylogenetic positions of the highly derived mtDNA sequences from (Kohn et al.,

2011; Pett et al., 2011) have been problematic, resulting in LBA artifacts between these species and bilaterian animals. Similarly, we also found the artifactual topology [Ctenophora +

Calcarea] the sister taxon to (data not shown). Therefore, we decided to focus our efforts on the position of Calcarea within non-bilaterian animals, and particularly within sponges, by removing sequences from bilaterian and ctenophore species. Under the GTR model of sequence evolution, we recovered [[Demospongia + Homoscleromorpha] +

Hexacorallia], a result already found in previous studies (e.g. (Dennis V Lavrov, Wang, &

31

Kelly, 2008)). GTR analyses also showed the grouping of the long branch clades Calcarea and

Hexactinellida (Figure 5A). Given the limitations of the GTR model, notably a homogenous distribution of amino acids among positions, we assume such topology is the result of LBA artifacts (Lartillot, Brinkmann, & Philippe, 2007). The CAT model performed better regarding

LBA artifacts, where the long branch Calcarea and Hexactinellida did not group together, but resulted in a polytomy between Porifera and Hexacorallia (Figure 5B). The use of more complex models of sequence evolution, such as the site-heterogeneous CATGTR+G model, might overcome some of the systematic errors present in the mitochondrial sequence data, but their implementation requires more sequences data from a wider range of taxa in order to accurately estimate all needed parameters (Philippe & Roure, 2011).

The lack of phylogenetic signal is not surprising given the highly derived nature of the mitochondrial sequences for Calcarea. Put into a broader perspective, our analyses provide additional evidences regarding the limitations of mtDNA data for resolving deep phylogenetic

nodes within early diverging animals. By comparison, recent EST-based studies have

recovered monophyletic Porifera and Cnidaria, where Calcarea is the sister group to

Homoscleromorpha (Philippe et al., 2009; Pick et al., 2010). Yet, the relatively high rates of sequence evolution in calcareous sponges suggest that mitochondrial sequences can be valuable tools for the study of higher taxonomic ranks (e.g. (Voigt, Eichmann, & Wörheide,

2011)). In addition, if calcareous and homoscleromorph sponges are indeed related, these taxa would represent an interesting system for the study of mitochondrial genome evolution in sister taxa as they display very different trends in mt-genome evolution.

32

Conclusion

In the second half of 19th century, calcareous sponges were a subject of numerous

studies, resulting in several monographs on their and development. Although, this

sponge group fell out of favor during the 20th century, we expect that the new genomic era will

reignite the interest in “these remarkable animals” (Hooper & Soest, 2002). Given that the

mitochondrial genomes presented here are incomplete, and that we only explored two

representatives from the subclass Calcinea and one from Calcaronea out of the ~675 described

species in Calcarea, we expect that future studies of calcareous sponge mtDNA, will uncover

additional unusual features in their mtDNA organization and evolution. Such studies will shed

light on the evolution of the mitochondrial genome in Calcarea, and in animals in general.

Supplementary Material

Supplementary figures S1-4 and supplementary tables 1 and 2 are available at

Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).

Funding

This work was supported by the National Science Foundation (DEB-0828783 to DVL)

and the German Science Foundation (WO896/3 and WO896/6 to GW).

Acknowledgements

We thank Micheal Nickel and Dan Jackson for collecting and providing Clathrina

clathrus and Leucetta specimens used for mtDNA sequencing and cDNA work, respectively.

We also thank Mohsen Kayal for his insightful comments on an earlier version of the

33 manuscript. GW and OV thank the German Science Foundation (DFG) for funding (WO896/3, and through the DFG Priority Programme SPP1174 "Deep Metazoan Phylogeny", project

WO896/6). E.K. wishes to acknowledge the funding through the Smithsonian Predoctoral

Fellow grant, as well as people from the Wendell lab for funding part of this project.

34

Table 1: Amino acid similarity for the mitochondrial protein coding genes atp6, cob, cox1, cox3, nad1-5 between the calcareous sponges used in this study (Clathrina clathrus, Leucetta

chagosensis and Petrobiona massiliana) and complete mitochondrial genomes from

demosponges, homoscleromorphs, glass-sponges and cnidarians.

Homoscleromorpha Demospongia Hexactinellida Cnidaria

Clathrina 33.9±0.4 34.3 ±0.7 30.7 ±0.7 33.5±1.0 30.0 ±0.5 clathrus

Leucetta 46.4 ±0.7 46.4 ±0.8 44.4 ±0.5 45.9±1.8 43.2 ±0.1 chagosensis

Petrobiona 24.7±0.4 24.7±0.3 23.4±0.1 24.9±1.1 22.3±0.1 massiliana

Table 2: Codon usage table used in the mtDNA of the calcareous sponges Clathrina clathrus

(C.c.), Leucetta chagosensis (L.c.), and Petrobiona massiliana (P.m.). Inferred derivations

from the standard genetic code are highlighted in bold. TGA1: from termination codon (TER) to tryptophan (Trp) in all metazoan groups; TAG2: from termination codon to tyrosine (Tyr) in calcineans; CGN3: from arginine to glycine in calcineans; ATA4: from isoleucine (Ile) to

methionine (Met) in the calcaronean P. massiliana.

35

19 16 45 41 17 11 29 13 27 24 42 62 39 26 64 77 P.m

7 1 7 7 1 2 1 3 2 33 22 27 25 31 30 12 L.c.

8 5 4 12 51 17 26 68 11 37 36 36 20 21 33 11 C.c.

3

3 3 1 3

TGT TGC TGG AGT GGT AGC GGC AGA AGG GGA GGG TGA CGT CGC CGA CGG

Ser Trp Gly Arg Arg Cys TER

2 1 27 35 32 11 14 18 33 15 19 19 21 13 25 20 P.m.

9 4 5 5 6 5 4 40 23 28 15 63 40 27 28 11 L.c.

7 49 53 33 31 16 29 13 33 51 33 46 27 17 35 33 C.c.

2

TAT TAC CAT TAA AAT GAT CAC CAA CAG AAC GAC AAA AAG GAA GAG TAG

His Tyr Gln Lys Glu Asn Asp TER

35 26 39 33 33 23 20 15 27 28 31 13 35 40 50 24 P.m.

2 2 8 1 8 1 39 13 18 23 11 16 26 26 24 11 L.c.

4 4 3 0 51 60 4 43 18 32 36 59 25 13 41 67 32 C.c.

TCT TCC CCT TCA TCG ACT GCT CCC CCA CCG ACC GCC ACA ACG GCA GCG

Ser Pro Thr Ala

3 33 76 65 44 29 6 58 38 34 61 40 37 16 60 58 102 P.m.

6 1 3 1 24 14 35 38 76 22 88 19 37 18 120 144 L.c.

69 35 38 63 29 94 44 61 35 49 56 25 141 107 159 137 C.c.

4

TA

TTT TTC CTT TTA TTG ATT GTT CTC CTA CTG ATC GTC ATG G GTG ATA

Ile Val Phe Leu Leu Met

36

chrcl-cox1 rep1

nad2 cox1 atp6 rep2

NR nad3

chrcl-cob rep1 cob-loop

nad1 cob nad4

chrcl-rnl rep1 r n s 1 - 2 r n s 1 - orf rep2

P S rns3 M rns1 rnl-p2 rnl-p1 rns2

1 2 3

37

Figure 1: A. Genetic map of the partial mitochondrial “chromosomes” of the calcareous sponge Clathrina clathrus (Calcinea). All genes above the contig are transcribed from left to right and those under from right to left. tRNA-like genes are identified by the one-letter code for their corresponding amino acid. Underlined tRNAs and rRNA pieces correspond to those confirmed at the RNA level. Abbreviations are as follows: atp6, subunits 6 of ATP synthase; cob, apocytochrome b; cox1, subunit one of cytochrome oxidase; nad1-4, NADH dehydrogenase subunits 1 to 4; rnl and rns, large and small ribosomal RNAs respectively; orf is an unidentified open reading frame. rep1 and rep2 correspond to repeated regions present in the mt-genome, with a subjective orientation (black arrow); cob-loop is a large stem-loop structure found on chrcl-cob. Smaller stem-loops can be folded into structures resembling

tRNAs. B. Southern blot analysis of the mtDNA of Clathrina clathrus. Lanes 1, 2 and 3 were

hybridized with probes corresponding to cob, rnl and cox1 respectively. Lower bands have size

estimates similar to the sequenced contigs (see text).

38

G T G rep1 T T A A C-G A-T A-T C G T A T T GTGTTG C AGCAGAAG .||||| G T ||||||| TACAAC C G CGTCTTC G G A T T-A T.G T A G A-T A-T G.T G-C C-G G-C G.T A-T C-G C-G A C G-C T T A G G T A A A G A C-G G-C T-A C-G C-G T-A C-G T T C c T C T C C A-T C G A T-A T-A G G T-A A-T trnN end A-T C A-T A G-CAGTCCAAACTCCTTTTCC GCCCCTATGAAGATTCTTGATACAAATTCCAAT

A T A C G G C C T T loop2 C C G T G A-T C G T A rep1 C-G T T G-C A A C-G C-G A T T A-T G T C C-G A-T C G T GTGTTG C A-T AGCAGAAG T .||||| G C-G T ||||||| TACAAC A A-T G CGTCTTC G T-A C-G T.G A T G- T A C G-C A-T A-T C-G G.T G-C A A loop1 C-G G-C C-G G.T A-T A-T C-G C-G C-G A A C A C G-C T CGG A G C-G A T T A A C-G C ||| A G T A GC-G A T-A G AGCC T A-T T G-C T-A T-A C-G GC-G C-G C-G G-C T-A C-G T-A T-A T-A G-C T-A T C-G T-A T C T-A T-A T C T T.G T-A C T A-T GT.G C C T-A C A-T T T G-C A A G-C A T-A T-A C T T-A G-C G G C T-A A-T T-A A-T C-G T-A A-T C T T-A C G-C T A-T C G-C A-T A A A C-G C A A G T-A T-A A- T G-C A G C C T-A G- C A-T T-A T- A A A T-A T- A nad4 end cob end T-A G-C 134 nucleotides T- A C-G T.G T- A C A TACACGC G-C TTTTTTGTGTT [...... ] CCTAGTAATA- T CCTTTTCCAG [...... ] TGGGAGACCTTTT-ATCCTACGTGTGAA

Figure 2: Repeated elements in the mitochondrial genome of Clathrina clathrus. A. Linear map of the partial mitochondrial sequences of C. clathrus displaying the repeated elements rep1 and rep2 and some putative stem-loops. A black arrow illustrates the suggested

orientation of these elements. B. Putative stem-loop structures and repeats found in the non-

coding regions of C. clathrus mtDNA. Bold sequences correspond to the repeated region rep1

found in the stem-loop structure at one end of both cob- and cox1-contigs. Arrows in the

39 complex stem-loops structures are discrepancies (either substitution or insertion) between the repeats. Adjacent gene boundaries are given with their orientation.

A A C-G GxA TxT Asparagine C-G Arginine (N) C (R) C TxC A C TxT A C AxC A T-A A CxC C-G CxA G-C CxA T-A C-G C-G C-G AxA T TACCC T TA T-A T C T-A A-T T GGACC A T-A A ||||| G A ||x|| G C C-G C-G ATGGG T T CCGGG C C C-G T T T C-G G C-G G A-T T T T C T TACCC A T GGCCC A A A A ||||| A G T TTTA G A A T ||||| G ATGGG C C TTG CCGGG C .||| C T T G ||| T T G GAAT G T A A G TAAC G C T G T A T A T-A GxA A C-G T-A G-C A-T T-A G-C A-T CxT C A C G T A T A GTT TCT

T T AxA Methionine A-T Proline CxT (M) CxC (P) GxA TxG C C GxT A C TxT A C AxG A G-C A G-C A-T A-T A-T A-T TxG T A C-G C-G A AATCC A T GAGTC T CA T-A G G-C |x| || A G-C A x|||| A T-A TGAGG ATCAG C G-C A T T C A-T T T T G-C TG A-T A-T A T-A T A T C A A A AC TCC T TAGTC A A A G || ||| A A A A A G CG T TTTG ||||| TGAGG T C ATCAG C G || A T G .||| T T T G GC T G T A GAAC G A T A A A A A TG G.T C G.T C-G G-C A-T T-A G-C G-C A-T T-A T C T A T A T G CAT TGG

Figure 3: Inferred secondary structures for Clathrina clathrus , ,

and at both DNA (on the left) and RNA level (on the right). Grey

letters correspond to the DNA sequence and black letters are changes found at the RNA level.

Signatures for mature (functional) tRNAs (ACC on the acceptor arm) are shown.

40

. . U A U U A ...... A C U ...... C U U U C G G G A ...... A G A AA A A G C U C U . G . G C U C . . G C A U ...... A U . . A U . . . . . A U G G G G U . . U A G G A U A U . . A A G . . A U G C rns G . . p2 C G C . . U A . . C G A C G G G . . . G C A C U C A U . A U G U U C C U G A C . . A U G G . A U U . A G C G A G G A U U U G . A U A A A . . A U C U . A U U U G rnl . U A p1 . . UG U A C . . A A G C . . G C G G U C . . A U U U G C U U C . U G G . G U G C A trnM . A . G U A A . G G C A G . G A G G U G A U U A C U G C U G A C G U U A A A U A A A A G C G A C G U A U C CA C C G G C U C U A C G A U U U U A G G G U A C G U A G U G C A A U C U C U A A C A C . . G C U A G A A A A U A . . A G C C U G C U C U A A A . . A A A A U G C A G C G A U C U C A G G rns . . U C U A G U A . U A U p1 . . G A G G C U T . U . G U A C G . C U G U G G . . G C G C G C A U U A . . C G A U U . . G . U A G A . G G C A U C . . . A C C UA A . A . . U C G G A G C G G C U A UC . A U . . . C G U G A A . . . . . A G C A G A A G A C U U G A UGC . . A A A UA . . U C G G . A A U . . . A U A G . . . U C . . A U U G GU . . . U A A G U . G A A GG . . . A A A A . G U A A U U . . . . . UCC AG C G U U U G . . U U C U G G U G U U U A C U A U C A A . . . . U G A C . . U A A . AGG U C U A G C . . . . U G U U G C A G A U G A U A G A U . . AA U C A U C A U A U U A G . . U C A G C G U . G G U U . G U G U . . . G G G G G A G . A . . A U C U U C . . G A . G G A U A A . . . AU G G . . A A . . o 5' A . . . . U U ...... U G . U G C A U G G U C A U A G U A G U . U A C A G . C G A A A ...... A C U G G U G U C G U C A C C C G A . . . . C G . A G C . U U A . U G C A U U C G A UG A C G A . U A G C A GA U C AA o . A A 3' . . . G U . . U A . C U U A G G . . . UU A A A G G . . C C C A G A G rnsp3 . . U GA U U U A U . U . . G G G U C U CG A A . . A U A G A U G A G . . . . G . . . G C . . . . U G G G ...... G G . . . U A ...... A A ...... G G . . C G . . . . AG . A . . . . C G . . . . . U C . . U A ...... U A . . U A . . C G ...... G U . . U A . . G C . . . . U . . . A U ...... G C . . . U G . . . . G A . . . A ...... M S P ...... 2 . . . . B . . . . 1 3 2 1 2 ...... rns rnl . .

Figure 4: A. Inferred secondary structure of the partial small mitochondrial ribosomal RNA subunit (rns) in Clathrina clathrus based on the skeleton of Oscarella sp rns. Sequences correspond to Clathrina; black dots are based on Oscarella rns. rnsp1, rnsp2 and rnsp3 are partial

sequences of rRNA named in the same order than folded complete ribosomal RNA. Black

41 dashed lines denote breaks in the coding sequences compared to the "complete" RNA structure.

B. Order of the different pieces of rRNA found at DNA level. Underlined tRNAs correspond to those confirmed at the RNA level. Note that the rns sequence is incomplete and that additional sequence may be present in the unsequenced portion of the genome.

42

A) * Placozoa Linuche unguiculata Coronatae moseri Cubozoa * sanjuanensis Lucernaria janetae Staurozoa Cubaia aphrodite * Clava multicornis Laomedea flexuosa * Millepora platyphylla Hydrozoa tiarella * Ectopleura Hydra magnipapillata * capillata * sp * noctiluca Aurelia aurita Discomedusae * andromeda mosaicus Renilla mulleri * Keratoisidinae sp Briareum asbestinum Dendronephthya gigantea Octocorallia Sarcophyton glaucum savaglia * Metridium senile Chrysopathes formosa Acropora tenuis * Ricordea florida Hexacorallia * Lophelia pertusa * Astrangia sp Madracis mirabilis Igernella notabilis strobilina Keratosa (G1) fulva * Halisarca dujardini Myxospongiae (G2) * plicifera muta marine Haplosclerida (G3) * muelleri * Lubomirskia baicalensis corrugata neptuni ophiraphidites Democlavia (G4) * actinia Oscarella carmela * Corticium candelabrum * Plakina monolopha Homoscleromorpha (G0) Plakinastrella cf. onkodes * Aphrocallistes vastus * Iphiteon panicea Hexactinellida Sympagella nux * Clathrina clathrus * Leucetta chagosensis Petrobiona massiliana * Placozoa B) 0.88 Linuche unguiculata Coronatae Alatina moseri Cubozoa * Haliclystus sanjuanensis 0.59 Lucernaria janetae Staurozoa Cubaia aphrodite 0.99 0.98 Clava multicornis 0.99 Laomedea flexuosa 0.93 Millepora platyphylla 0.99 Hydrozoa Pennaria tiarella 0.99 0.99 Ectopleura larynx 0.99 Hydra magnipapillata 0.75 0.99 Cyanea capillata Chrysaora sp 0.99 0.99 Aurelia aurita Discomedusae 0.99 Cassiopea andromeda 0.99 Catostylus mosaicus

0.99 Renilla mulleri Keratoisidinae sp 0.99 Briareum asbestinum Octocorallia 0.88 Dendronephthya gigantea 0.99 Sarcophyton glaucum

0.6 Metridium senile Chrysopathes formosa 0.61 0.99 Acropora tenuis 0.82 Ricordea florida Hexacorallia 0.95 Lophelia pertusa 0.99 Astrangia sp 0.99 Madracis mirabilis 0.99 Clathrina clathrus 0.99 Leucetta chagosensis Petrobiona massiliana Corticium candelabrum * Plakina monolopha * Plakinastrella cf. onkodes Homoscleromorpha (G0)

0.99 Aphrocallistes vastus 0.59 Iphiteon panicea Hexactinellida 0.99 Sympagella nux 0.99 Igernella notabilis Keratosa (G1) 0.94 Aplysina fulva 0.99 Halisarca dujardini Myxospongiae (G2) 0.91 0.99 Xestospongia muta marine Haplosclerida (G3) 0.97 0.99 0.98 Lubomirskia baicalensis Axinella corrugata 0.92 Topsentia ophiraphidites 0.99 Democlavia (G4) Geodia neptuni 0.96 0.99 Suberites domuncula Tethya actinia Oscarella carmela Homoscleromorpha (G0)

43

Figure 5: Phylogenetic analysis of concatenated amino acid sequences from representative non-bilaterian species for atp6, cob, cox1, cox3, and nad1-5 genes under the GTR+Γ (A) and

CAT+Γ (B) models of sequence evolution using RAxML v.7.2.6 and PhyloBayes v. 3.2f

respectively. A star shows maximum support for that node. Calcareous species have been

underlined.

44

References Alfonzo, J. D., Blanc, V., Estévez, A. M., Rubio, M. A., & Simpson, L. (1999). C to U editing of the anticodon of imported mitochondrial tRNA(Trp) allows decoding of the UGA stop codon in tarentolae. The EMBO journal, 18(24), 7056-7062. doi:10.1093/emboj/18.24.7056 Antes, T., Costandy, H., Mahendran, R., & Miller, D. (1998). Insertional Editing of Mitochondrial tRNAs of Physarum polycephalum and Didymium nigripes. Molecular and Cellular Biology, 18(12), 7521-7527. Armstrong, M. R., Blok, V. C., & Phillips, M. S. (2000). A multipartite mitochondrial genome in the potato cyst nematode Globodera pallida. Genetics, 154(1), 181-192. Beckenbach, A. T., & Joy, J. B. (2009). Evolution of the mitochondrial genomes of gall midges (Diptera: Cecidomyiidae): rearrangement and severe truncation of tRNA genes. Genome biology and evolution, 1(White 1973), 278-87. doi:10.1093/gbe/evp027 Belinky, F., Rot, C., Ilan, M., & Huchon, D. (2008). The complete mitochondrial genome of the demosponge Negombata magnifica (Poecilosclerida). Molecular phylogenetics and evolution, 47(3), 1238-43. doi:10.1016/j.ympev.2007.12.004 Boer, P. H., & Gray, M. W. (1988). Scrambled ribosomal RNA gene pieces in Chlamydomonas reinhardtii mitochondrial DNA. Cell, 55(3), 399-411. Borchiellini, C, Manuel, M., Alivon, E., Vacelet, J., & Parco, Y. L. E. (2001). Sponge and the origin of Metazoa. Journal of Evolutionary Biology, 14(1), 171-179. doi:10.1046/j.1420-9101.2001.00244.x Borchiellini, Carole, Chombard, C., Manuel, M., Alivon, E., Vacelet, J., & Boury-Esnault, N. (2004). Molecular phylogeny of Demospongiae: implications for classification and scenarios of character evolution. Molecular phylogenetics and evolution, 32(3), 823-37. doi:10.1016/j.ympev.2004.02.021 Borojevic, R., Boury-Esnault, N., & Vacelet, J. (1990). A revision of the intraspecific classification of the subclass Calcinea (Porifera, class Calcarea). Bulletin du Muséum national d’histoire naturelle. Section A, Zoologie, biologie et écologie animales, 12(2), 243-276. Muséum national d’histoire naturelle. Borojevic, R., Boury-Esnault, N., & Vacelet, J. (2000). A revision of the supraspecific classification of the subclass Calcaronea (Porifera, class Calcarea). Zoosystema, 22(2), 203-263. Brennicke, A., Marchfelder, A., & Binder, S. (1999). RNA editing. FEMS microbiology reviews, 23(3), 297-316. Burger, G., Jackson, C. J., & Waller, R. F. (2012). Unusual Mitochondrial Genomes and Genes. In C. E. Bullerwell (Ed.), Organelle Genetics (pp. 41-77). Springer Berlin Heidelberg. Burger, G., Lavrov, D. V., Forget, L., & Lang, B. F. (2007). Sequencing complete mitochondrial and plastid genomes. Nature protocols, 2(3), 603-14. doi:10.1038/nprot.2007.59 Burger, G., Yan, Y., Javadi, P., & Lang, B. F. (2009). Group I-intron trans-splicing and mRNA editing in the mitochondria of placozoan animals. Trends in genetics : TIG, 25(9), 377-81. doi:10.1016/j.tig.2009.07.004

45

Börner, G. V., Mörl, M., Janke, A., & Pääbo, S. (1996). RNA editing changes the identity of a mitochondrial tRNA in marsupials. EMBO Journal, 15(21), 5949-5957. Castresana, J. (2000). Selection of conserved blocks from multiple alignments for their use in phylogenetic analysis. Molecular biology and evolution, 17(4), 540-52. Cavalier-Smith, T., Allsopp, M. T. E. P., Chao, E. E., Boury-Esnault, N., & Vacelet, J. (1996). Sponge phylogeny, animal monophyly, and the origin of the : 18S rRNA evidence. Canadian Journal of , 74(11), 2031-2045. doi:10.1139/z96-231 Chens, J. Y., Joyce, P. B. M., Wolfe, C. L., Steffen, M. C., & Martinn, N. C. (1992). Cytoplasmic and mitochondrial tRNA nucleotidyltransferase activities are derived from the same gene in the yeast Saccharomyces cerevisiae. The Journal of Biological Chemistry, 267(21), 14879-14883. Chevreux, B., Pfisterer, T., Drescher, B., Driesel, A. J., Müller, W. E. G., Wetter, T., & Suhai, S. (2004). Using the miraEST assembler for reliable and automated mRNA transcript assembly and SNP detection in sequenced ESTs. Genome research, 14(6), 1147-59. doi:10.1101/gr.1917404 Chomczynski, P., & Mackey, K. (1995). Short technical reports. Modification of the TRI reagent procedure for isolation of RNA from polysaccharide- and proteoglycan-rich sources. Biotechniques, 19(6), 942-5. Clark-Walker, G. D., & Weiller, G. F. (1994). The structure of the small mitochondrial DNA of Kluyveromyces thermotolerans is likely to reflect the ancestral gene order in fungi. Journal of Molecular Evolution, 38(6), 593-601. Springer New York. Doublet, V. (2010). Structure et Evolution du Génome Mitochondrial des Oniscidea (Crustacea, Isopoda). Structure. Université de Poitiers. Doublet, V., Raimond, R., Grandjean, F., Lafitte, A., Souty-grosset, C., & Marcadé, I. (2012). Widespread atypical mitochondrial DNA structure in isopods (Crustacea, Peracarida) related to a constitutive heteroplasmy in terrestrial species, 244, 234-244. doi:10.1139/G2012-008 Drummond, A., Ashton, B., Buxton, S., Cheung, M., Cooper, A., Duran, C., Field, M., et al. (2011). Geneious Pro v5.5.6. Erpenbeck, D., Voigt, O., Wörheide, G., & Lavrov, D. V. (2009). The mitochondrial genomes of sponges provide evidence for multiple invasions by Repetitive Hairpin- forming Elements (RHE). BMC genomics, 10, 591. doi:10.1186/1471-2164-10-591 Erpenbeck, D., & Wöorheide, G. (2007). On the molecular phylogeny of sponges (Porifera)*. Zootaxa, 1668, 107 - 126. Fey, J., Weil, J. H., Tomita, K., Cosset, A., Dietrich, A., Small, I., & Maréchal-Drouard, L. (2002). Role of editing in plant mitochondrial transfer RNAs. Gene, 286(1), 21-4. Fontaine, J. M., Rousvoal, S., LeBlanc, C., Kloareg, B., & Goër, S. L.-de. (1995). The Mitochondrial LSU rDNA of the Brown AlgaPylaiella littoralisReveals α- Proteobacterial Features and is Split by Four Group IIB Introns with an Atypical Phylogeny. Journal of Molecular Biology, 251(3), 378-389. Gazave, E., Lapébie, P., Ereskovsky, A. V., Vacelet, J., Renard, E., Cárdenas, P., & Borchiellini, C. (2011). No longer Demospongiae: Homoscleromorpha formal nomination as a fourth class of Porifera. Hydrobiologia. doi:10.1007/s10750-011-0842- x

46

Gott, J. M., Somerlot, B. H., & Gray, M. W. (2010). Two forms of RNA editing are required for tRNA maturation in Physarum mitochondria. RNA (New York, N.Y.), 16(3), 482-8. doi:10.1261/.1958810 Gray, J. E. (1867). Notes on the arrangement of sponges, with the description of some new genera. Proceedings of the Zoological Society of London, 2, 492–558. Haeckel, E. (1874). Die Gastrea Theorie, die phylogenetische Classification des Thierreiches und die Homologie der Keimblatter. Jenaische Zeischrift für Naturwissenschaft, 8, 1-5. Haen, K. M., Lang, B. F., Pomponi, S. a, & Lavrov, D. V. (2007). Glass sponges and bilaterian animals share derived mitochondrial genomic features: a common ancestry or parallel evolution? Molecular biology and evolution, 24(7), 1518-27. doi:10.1093/molbev/msm070 Hooper, J. N. A., & Soest, van R. W. M. (2002). Systema Porifera, a guide to the classification of the sponges (in 2 volumes). Jackson, C. J., Norman, J. E., Schnare, M. N., Gray, M. W., Keeling, P. J., & Waller, R. F. (2007). Broad genomic and transcriptional analysis reveals a highly derived genome in dinoflagellate mitochondria. BMC biology, 5, 41. doi:10.1186/1741-7007-5-41 Jacob, J. E. M., Vanholme, B., Van Leeuwen, T., & Gheysen, G. (2009). A unique genetic code change in the mitochondrial genome of the parasitic nematode Radopholus similis. BMC research notes, 2, 192. doi:10.1186/1756-0500-2-192 Janke, A., & Pääbo, S. (1993). Editing of a tRNA anticodon in mitochondria changes its codon recognition. Nucleic acids research, 21(7), 1523-5. Kayal, E., Bentlage, B., Collins, A. G., Kayal, M., Pirro, S., & Lavrov, D. V. (2012). Evolution of linear mitochondrial genomes in medusozoan cnidarians. Genome Biology and Evolution, 4(1), 1-12. doi:10.1093/gbe/evr123 Klautau, M., & Borojevic, R. (2001). Sponges of the Clathrina Gray, 1867 from Arraial do Cabo , Brazil. Zoosystema, 23(3), 395-410. Klautau, M., & Valentine, C. (2003). Revision of the genus Clathrina (Porifera, Calcarea). Zoological Journal of the Linnean Society, 139(1), 1-62. doi:10.1046/j.0024- 4082.2003.00063.x Knight, R. D., Freeland, S. J., & Landweber, L. F. (2001). Rewiring the keyboard: evolvability of the genetic code. Nature Reviews Genetics, 2(1), 49-58. doi:10.1038/35047500 Kohn, A. B., Citarella, M. R., Kocot, K. M., Bobkova, Y. V., Halanych, K. M., & Moroz, L. L. (2011). Rapid evolution of the compact and unusual mitochondrial genome in the ctenophore, Pleurobrachia bachei. Molecular Phylogenetics and Evolution, 63(1), 203- 207. Elsevier Inc. doi:10.1016/j.ympev.2011.12.009 Laforest, M. J., Roewer, I., & Lang, B. F. (1997). Mitochondrial tRNAs in the lower Spizellomyces punctatus: tRNA editing and UAG “stop” codons recognized as leucine. Nucleic acids research, 25(3), 626-32. Lang, B. F., Lavrov, D., Beck, N., & Steinberg, S. V. (2012). Mitochondrial tRNA Structure, Identity, and Evolution of the Genetic Code. In C. E. Bullerwell (Ed.), Organelle Genetics (pp. 431-474). Berlin, Heidelberg: Springer Berlin Heidelberg. doi:10.1007/978-3-642-22380-8 Lartillot, N., Blanquart, S., & Lepage, T. (n.d.). PhyloBayes 3.2: a Bayesian software for phylogenetic reconstruction and molecular dating using mixture models.

47

Lartillot, N., Brinkmann, H., & Philippe, H. (2007). Suppression of long-branch attraction artefacts in the animal phylogeny using a site-heterogeneous model. BMC evolutionary biology, 7 Suppl 1, S4. doi:10.1186/1471-2148-7-S1-S4 Laslett, D., & Canbäck, B. (2008). ARWEN: a program to detect tRNA genes in metazoan mitochondrial nucleotide sequences. Bioinformatics (Oxford, England), 24(2), 172-5. doi:10.1093/bioinformatics/btm573 Lavrov, D V, Brown, W. M., & Boore, J. L. (2000). A novel type of RNA editing occurs in the mitochondrial tRNAs of the centipede Lithobius forficatus. Proceedings of the National Academy of Sciences of the United States of America, 97(25), 13738-13742. doi:10.1073/pnas.250402997 Lavrov, Dennis V. (2007). Key transitions in animal evolution: a mitochondrial DNA perspective. Integrative and Comparative Biology, 47(5), 734-743. doi:10.1093/icb/icm045 Lavrov, Dennis V. (2010). Rapid proliferation of repetitive palindromic elements in mtDNA of the endemic Baikalian sponge Lubomirskia baicalensis. Molecular Biology and Evolution, 27(4), 757-760. doi:10.1093/molbev/msp317 Lavrov, Dennis V, Forget, L., Kelly, M., & Lang, B. F. (2005). Mitochondrial genomes of two demosponges provide insights into an early stage of animal evolution. Molecular biology and evolution, 22(5), 1231-9. doi:10.1093/molbev/msi108 Lavrov, Dennis V, Wang, X., & Kelly, M. (2008). Reconstructing ordinal relationships in the Demospongiae using mitochondrial genomic data. Molecular phylogenetics and evolution, 49(1), 111-24. doi:10.1016/j.ympev.2008.05.014 Leigh, J., & Lang, B. F. (2004). Mitochondrial 3’ tRNA editing in the Seculamonas ecuadoriensis: A novel mechanism and implications for tRNA processing. RNA, 10(4), 615-621. doi:10.1261/rna.5195504 Lonergan, K., & Gray, M. (1993). Predicted editing of additional transfer RNAs in Acathamoeba castellanii mitochondria. Nucleic Acids Research, 21(18), 4402. doi:10.1093/nar/21.18.4402 Lonergan, K. M., & Gray, M. W. (1993). Editing of transfer RNAs in Acanthamoeba castellanii mitochondria. Science, 259(5096), 812-6. Lowe, T. M., & Eddy, S. R. (1997). tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic acids research, 25(5), 955-64. Lukić-Bilela, L., Brandt, D., Pojskić, N., Wiens, M., Gamulin, V., & Müller, W. E. G. (2008). Mitochondrial genome of Suberites domuncula: palindromes and inverted repeats are abundant in non-coding regions. Gene, 412(1-2), 1-11. doi:10.1016/j.gene.2008.01.001 Manuel, M. (2006). Phylogeny and evolution of calcareous sponges. Canadian Journal of Zoology, 241, 225-241. doi:10.1139/Z06-005 Manuel, Michael, Borchiellini, C., Alivon, E., Le Parco, Y., & Boury-Esnault, J. V. N. (2003). Phylogeny and Evolution of Calcareous Sponges: Monophyly of Calcinea and Calcaronea, High Level of Morphological Homoplasy, and the Primitive Nature of Axial . Systematic Biology, 52(3), 311-333. doi:10.1080/10635150390196966 Marcadé, I., Cordaux, R., Doublet, V., Debenest, C., Bouchon, D., & Raimond, R. (2007). Structure and evolution of the atypical mitochondrial genome of Armadillidium vulgare

48

(Isopoda, Crustacea). Journal of molecular evolution, 65(6), 651-9. doi:10.1007/s00239- 007-9037-5 Maréchal-Drouard, L., Ramamonjisoa, D., Cosset, A., Weil, J. H., & Dietrich, A. (1993). Editing corrects mispairing in the acceptor stem of bean and potato mitochondrial phenylalanine transfer RNAs. Nucleic acids research, 21(21), 4909-14. Masta, S. E., & Boore, J. L. (2004). The complete mitochondrial genome sequence of the spider Habronattus oregonensis reveals rearranged and extremely truncated tRNAs. Molecular biology and evolution, 21(5), 893-902. doi:10.1093/molbev/msh096 Medina, M., Collins, a G., Silberman, J. D., & Sogin, M. L. (2001). Evaluating hypotheses of animal phylogeny using complete sequences of large and small subunit rRNA. Proceedings of the National Academy of Sciences of the United States of America, 98(17), 9707-12. doi:10.1073/pnas.171316998 Milbury, C. A., & Gaffney, P. M. (2005). Complete mitochondrial DNA sequence of the eastern oyster Crassostrea virginica. Marine biotechnology (New York, N.Y.), 7(6), 697- 712. doi:10.1007/s10126-005-0004-0 Milbury, C. a, Lee, J. C., Cannone, J. J., Gaffney, P. M., & Gutell, R. R. (2010). Fragmentation of the large subunit ribosomal RNA gene in oyster mitochondrial genomes. BMC genomics, 11, 485. doi:10.1186/1471-2164-11-485 Nedelcu, a M. (1997). Fragmented and scrambled mitochondrial ribosomal RNA coding regions among : a model for their origin and evolution. Molecular biology and evolution, 14(5), 506-17. Pett, W., Ryan, J. F., Pang, K., Mullikin, J. C., Martindale, M. Q., Baxevanis, A. D., & Lavrov, D. V. (2011). Extreme mitochondrial evolution in the ctenophore leidyi: Insight from mtDNA and the nuclear genome. Mitochondrial DNA, (515), 1-13. doi:10.3109/19401736.2011.624611 Philippe, H., Brinkmann, H., Lavrov, D. V., Littlewood, D. T. J., Manuel, M., Wörheide, G., Baurain, D., et al. (2011). Resolving Difficult Phylogenetic Questions : Why More Sequences Are Not Enough. PLoS Biology, 9(3), e1000602. doi:10.1371/journal.pbio.1000602 Philippe, H., Derelle, R., Lopez, P., Pick, K., Borchiellini, C., Boury-Esnault, N., Vacelet, J., et al. (2009). Phylogenomics revives traditional views on deep animal relationships. Current biology : CB, 19(8), 706-12. doi:10.1016/j.cub.2009.02.052 Philippe, H., & Roure, B. (2011). Difficult phylogenetic questions: more data, maybe; better methods, certainly. BMC biology, 9, 91. doi:10.1186/1741-7007-9-91 Pick, K. S., Philippe, H., Schreiber, F., Erpenbeck, D., Jackson, D. J., Wrede, P., Wiens, M., et al. (2010). Improved phylogenomic taxon sampling noticeably affects nonbilaterian relationships. Molecular Biology and Evolution, 27(9), 1983-1987. doi:10.1093/molbev/msq089 Raimond, R., Marcadé, I., Bouchon, D., Rigaud, T., Bossy, J. P., & Souty-Grosset, C. (1999). Organization of the large mitochondrial genome in the isopod Armadillidium vulgare. Genetics, 151(1), 203-10. Reichert, a, Rothbauer, U., & Mörl, M. (1998). Processing and editing of overlapping tRNAs in human mitochondria. The Journal of biological chemistry, 273(48), 31977-84.

49

Reiswig, H. M., & Mackie, G. O. (1983). Studies on Hexactinellid Sponges. III. The Taxonomic Status of Hexactinellida Within the Porifera. Philosophical Transactions of the Royal Society of London. Series B, Biological Sciences, 301(1107), 419-428. Rosengarten, R. D., Sperling, E. a, Moreno, M. a, Leys, S. P., & Dellaporta, S. L. (2008). The mitochondrial genome of the hexactinellid sponge Aphrocallistes vastus: evidence for programmed translational frameshifting. BMC genomics, 9(33), 33. doi:10.1186/1471-2164-9-33 Saccone, C., Gissi, C., Reyes, A., Larizza, A., Sbisà, E., & Pesole, G. (2002). Mitochondrial DNA in metazoa: degree of freedom in a frozen event. Gene, 286(1), 3-12. Segovia, R., Pett, W., Trewick, S., & Lavrov, D. V. (2011). Extensive and Evolutionarily Persistent Mitochondrial tRNA Editing in Velvet Worms (Phylum ). Molecular Biology and Evolution, 28(10), 2873–2881. doi:10.1093/molbev/msr113 Shao, R., Kirkness, E. F., & Barker, S. C. (2009). The single mitochondrial chromosome typical of animals has evolved into 18 minichromosomes in the human body louse, Pediculus humanus. Genome Research, 19(5), 904-912. doi:10.1101/gr.083188.108 Signorovitch, A. Y., Buss, L. W., & Dellaporta, S. L. (2007). Comparative Genomics of Large Mitochondria in Placozoans. (G. McVean, Ed.)PLoS Genetics, 3(1), 7. Public Library of Science. Smith, D. R., Kayal, E., Yanagihara, A. a, Collins, A. G., Pirro, S., & Keeling, P. J. (2012). First complete mitochondrial genome sequence from a reveals a highly fragmented, linear architecture and insights into telomere evolution. Genome Biology and Evolution, 4(1), 52-58. doi:10.1093/gbe/evr127 Sperling, E. a., Pisani, D., & Peterson, K. J. (2007). Poriferan paraphyly and its implications for palaeobiology. Geological Society, London, Special Publications, 286(1), 355-368. doi:10.1144/SP286.25 Sperling, E. a, Peterson, K. J., & Pisani, D. (2009). Phylogenetic-signal dissection of nuclear housekeeping genes supports the paraphyly of sponges and the monophyly of . Molecular biology and evolution, 26(10), 2261-74. doi:10.1093/molbev/msp148 Staden, R. (1996). The Staden sequence analysis package. Molecular Biotechnology, 5(3), 233-241. Stamatakis, A. (2006). RAxML-VI-HPC: maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models. Bioinformatics (Oxford, England), 22(21), 2688-90. doi:10.1093/bioinformatics/btl446 Suga, K., Mark Welch, D. B., Tanaka, Y., Sakakura, Y., & Hagiwara, A. (2008). Two circular chromosomes of unequal copy number make up the mitochondrial genome of the rotifer Brachionus plicatilis. Molecular biology and evolution, 25(6), 1129-37. doi:10.1093/molbev/msn058 Söll, D., & RajBhandary, U. L. (2006). The genetic code - Thawing the “frozen accident.” Journal of Biosciences, 31(4), 459-463. doi:10.1038/hdy.2008.7 Tomita, K., Ueda, T., & Watanabe, K. (1996). RNA editing in the acceptor stem of mitochondrial tRNA(Tyr). Nucleic acids research, 24(24), 4987-91. Van Soest, R. W. M., Boury-Esnault, N., Vacelet, J., Dohrmann, M., Erpenbeck, D., De Voogd, N. J., Santodomingo, N., et al. (2012). Global Diversity of Sponges (Porifera). (J. M. Roberts, Ed.)PLoS ONE, 7(4), e35105. doi:10.1371/journal.pone.0035105

50

Voigt, O., Eichmann, V., & Wörheide, G. (2011). First evaluation of mitochondrial DNA as a marker for phylogeographic studies of Calcarea: a case study from Leucetta chagosensis. Hydrobiologia. doi:10.1007/s10750-011-0800-7 Waller, R. F., & Jackson, C. J. (2009). Dinoflagellate mitochondrial genomes: stretching the rules of molecular biology. BioEssays : news and reviews in molecular, cellular and developmental biology, 31(2), 237-45. doi:10.1002/bies.200800164 Wang, X., & Lavrov, D. V. (2007). Mitochondrial genome of the homoscleromorph Oscarella carmela (Porifera, Demospongiae) reveals unexpected complexity in the common ancestor of sponges and other animals. Molecular biology and evolution, 24(2), 363-73. doi:10.1093/molbev/msl167 Wang, X., & Lavrov, D. V. (2008). Seventeen New Complete mtDNA Sequences Reveal Extensive Mitochondrial Genome Evolution within the Demospongiae. (M. Hofreiter, Ed.)PLoS ONE, 3(7), e2723. Public Library of Science. doi:10.1371/journal.pone.0002723 Wei, D.-D., Shao, R., Yuan, M.-L., Dou, W., Barker, S. C., & Wang, J.-J. (2012). The Multipartite Mitochondrial Genome of Liposcelis bostrychophila: Insights into the Evolution of Mitochondrial Genomes in Bilateral Animals. PloS one, 7(3), e33973. doi:10.1371/journal.pone.0033973 Yokobori, S., & Pääbo, S. (1995). tRNA editing in metazoans. Nature, 377(6549), 490. doi:10.1038/377490a0 Yokobori, S., & Pääbo, S. (1997). Polyadenylation creates the discriminator nucleotide of chicken mitochondrial tRNA(Tyr). Journal of molecular biology, 265(2), 95-9. Zrzavy, J., Mihulka, S., Kepka, P., Bezdek, A., & Tietz, D. (1998). Phylogeny of the Metazoa Based on Morphological and 18S Ribosomal DNA Evidence. , 14, 249-285.

51

CHAPTER 3. EVOLUTION OF LINEAR MITOCHONDRIAL

GENOMES IN MEDUSOZOAN CNIDARIANS

A paper published in Genome Biology and Evolution

Ehsan Kayal, Bastian Bentlage, Allen G. Collins, Mohsen Kayal, Stacy Pirro, and Dennis V.

Lavrov

Abstract

In nearly all animals, mitochondrial DNA (mtDNA) consists of a single circular molecule that encodes several subunits of the protein complexes involved in oxidative phosphorylation as well as part of the machinery for their expression. By contrast, mtDNA in species belonging to Medusozoa (one of the two major lineages in the phylum Cnidaria) comprises one to several linear molecules. Many questions remain on the ubiquity of linear mtDNA in medusozoans and the mechanisms responsible for its evolution, replication, and transcription. To address some of these questions, we determined the sequences of nearly complete linear mtDNA from 24 species representing all four medusozoan classes: Cubozoa,

Hydrozoa, Scyphozoa, and Staurozoa. All newly determined medusozoan mitochondrial genomes harbor the 17 genes typical for cnidarians and map as linear molecules with high degree of gene order conservation relative to anthozoans. In addition, two ORFs, polB and

ORF314, are identified in cubozoan, schyphozoan, staurozoan, and trachyline hydrozoan mtDNA. polB belongs to the B-type DNA polymerase gene family, while the product of

52

ORF314 may act as a terminal protein (TP) that binds . We posit that these two ORFs

are remnants of a linear plasmid that invaded the mitochondrial genomes of the last common

ancestor of Medusozoa, and is responsible for its linearity. Hydroidolinan hydrozoans have lost

the two ORFs and instead have duplicated cox1 at each end of their mitochondrial

chromosome(s). Fragmentation of mtDNA occurred independently in Cubozoa and Hydridae

(Hydrozoa, ). Our broad sampling allows us to reconstruct the evolutionary

history of linear mtDNA in medusozoans.

Data deposition:

Sequence data from this article have been deposited in GenBank under the accession numbers: JN700934-JN700988

Introduction

The mitochondrial genome is one of the most commonly used molecular markers in

animal phylogenetics, with informative characters at the nucleotide, amino acid and gene organization levels (Moret and Warnow 2005; Lavrov 2007). As a consequence, the metazoan

mitogenomic dataset has benefited from a wide sampling effort across the animal kingdom. In

most animals, mitochondrial DNA (mtDNA) is a single, small (usually less than 20 kbp)

circular molecule that contains a nearly constant set of genes. With a few exceptions, these genes are orthologous across animals (Wolstenholme 1992; Boore 1999). Most animal

mitochondrial genomes encode 12-14 protein genes and up to 28 RNA genes (Gissi et al.

2008). While complete or nearly complete mitochondrial genomes for more than 2400 animal

species are available on the Organelle Genome Resources database in GenBank

(ncbi.nlm.nih.gov/genomes/OrganelleResource.cgi?taxid=33208), only 97 represent non-

53 bilaterians: 5 for Placozoa, 42 for Porifera, 49 for Cnidaria and one for Ctenophora. This paucity of completely sequenced non-bilaterian mtDNAs is unfortunate because non-bilaterian mitochondrial genomes represent a large portion of the variability in both sequence and genome organization among animals (Lavrov 2007; Gissi et al. 2008).

The phylum Cnidaria represents a “hot spot” of animal mitochondrial genomic diversity, with mtDNA displaying variation in both gene content and genome organization.

Variation in the gene content is illustrated by the loss of all but one or two tRNA genes, a feature also found in some demosponges and in all chaetognaths (Helfenbein et al. 2004;

Papillon et al. 2004; Faure and Casanova 2006; Wang and Lavrov 2008; Miyamoto et al.

2010). In addition, introns (e.g. Beagley et al., 1998; Chen et al. 2008), duplicated genes (Kayal and Lavrov 2008; Voigt et al. 2008) and extra protein genes (mutS in octocorals, polB in

Scyphozoa, unidentified ORFs in several hexacorals) have been found in several cnidarian classes (Pont-Kingdon et al. 1998; Medina et al. 2006; Shao et al. 2006). Furthermore, the structure of cnidarian mtDNA is also variable: a single circular molecule is found in Anthozoa

(hexa- and octocorals), while medusozoan mtDNA consists of one or several exclusively linear molecules (Warrior and Gall 1985; Bridge et al. 1992; Ender and Schierwater 2003). Working with linear mtDNA presents unique challenges for both PCR amplification and sequencing. As a potential reflection of these challenges, only six medusozoan linear mitochondrial genomes have been determined to date, those of the Aurelia aurita and Chrysaora quinquecirrha (Scyphozoa), and the three hydra species Hydra magnipapillata, H. oligactis, and H. vulgaris (Hydrozoa) (Shao et al. 2006; Kayal and Lavrov 2008; Voigt et al. 2008; Park et al. in press), while 43 complete circular mitochondrial genomes are available for anthozoans.

54

Linear mtDNA is common outside animals and has been extensively studied in fungi, plants, and several groups of unicellular eukaryotes (Bendich 1993; Burger et al 2003; Nosek and Tomáska 2003; Hikosaka et al 2010; Valach et al 2011). The linear form of these molecules is thought to affect their stability, mechanisms of DNA replication, and expression of the genes they harbor. Unlike circular DNA molecules, linear chromosomes are less stable due to their higher susceptibility to exonuclease activity. In addition, replication of linear genomes has to overcome what is known as "the challenge of the ends", namely the shortening of the linear molecule after each replication (Nosek et al. 1998). In the nucleus, the ends of the linear chromosomes are capped by repetitive sequences known as telomeres that both protect the molecule from degradation and allow its replication in a faithful manner (i.e. without the loss of essential sequences). Similarly, linear mtDNAs in algae and fungi have overcome linear instability with an array of telomere structures (Nosek et al. 1998). Recent studies of several yeast species from the genus Candida have suggested the concomitant involvement of the product of (1) a B-type DNA polymerase gene encoded by the mtDNA for

5’ end DNA extension and (2) terminal proteins (TPs) that bind the single stranded end of the linear molecules (Fricova et al 2010 and reference therein). Thus far, the small number of available linear mitochondrial genomes for medusozoans has hindered attempts to assign any of the available models for the maintenance of chromosome ends proposed for non-animal groups (Nosek et al. 1998; Nosek and Tomaska 2002), or to reconstruct the evolutionary history of linear mtDNA in medusozoans.

Here we present a description of 24 nearly complete mitochondrial genomes from a phylogenetically diverse set of species representing each of the four medusozoan classes, as

55

well as their primary lineages: Cubozoa ( and ), Hydrozoa

(Hydroidolina and Trachylina), Scyphozoa (Coronatae and Discomedusae), and Staurozoa

(Cleistocarpida and Eleutherocarpida). Our study represents the most in-depth exploration of

linear mtDNA in Cnidaria, and includes the first mitochondrial genomes from the classes

Cubozoa and Staurozoa.

Materials and methods

Animal sampling, DNA extractions, and PCR amplification

We collected several specimens of Cassiopea andromeda (Scyphozoa) and Millepora

platyphylla (Hydrozoa) on the site of Tiahura in Moorea (French Polynesia). We collected

specimens of xaymacana (Cubozoa), Cassiopea frondosa and Chrysaora sp.

(Scyphozoa), and Cubaia aphrodite (Hydrozoa) at Bocas del Toro (Panama). The staurozoan

Craterolophus convolvulus was collected off the island of Helgoland (Germany) and

Haliclystus “sanjuanensis” was collected off the coast of Washington state (USA). The

cubozoans barnesi and fleckeri were collected at Cairns and Weipa respectively (Queensland, Australia); Alatina moseri was collected in Waikiki (Hawai'i) and

Chiropsalmus quadrumanus was collected on the east coast of the USA, near the border between North and South Carolina. We purchased specimens of Cyanea capillata (Scyphozoa),

Clava multicornis, Ectopleura larynx, Laomedea flexuosa, and Pennaria tiarella (Hydrozoa)

from the Marine Biological Laboratory at Woods Hole (www.mbl.edu), and Nemopsis bachei

from the Gulf Specimen Marine Lab (www.gulfspecimen.org). Tissues from Catostylus

mosaicus, Linuche unguiculata, and pulmo (Scyphozoa) were kindly provided by

Dr. Michael Dawson, individuals of longissima (Hydrozoa) by Dr. Annette F.

56

Govindarajan, a specimen of Pelagia noctiluca (Scyphozoa) collected at the Station Marine d'Endoume (Marseille, France) by Dr. Alexander Ereskovsky, and a DNA aliquot for

Lucernaria janetae from the East Pacific Rise by Dr. Janet Voight. The identities of hydrozoan species were confirmed morphologically by Dr. Peter Schuchert and assessed for all species by comparing partial cox1 and rnl sequences to the NCBI database. We extracted total DNA using phenol-chloroform extraction followed by proteinase K digestion.

We amplified and sequenced parts of cob, cox1, cox2, nad5, rnl, and rns using conserved primers for animals (Burger et al. 2007). We used the sequences obtained from these amplifications to design species-specific primers for long PCR amplifications. We amplified nearly complete mitochondrial genomes either in a single piece between cob-cox1 or cob-cox2, or in two parts between rnl-nad5 and rns-cox1. We extended our sequences using modified step-out protocol (Burger et al. 2007) and re-amplified the contigs using standard PCR. We then sheared long PCR products and processed them for multiplex sequencing (Meyer et al.

2008) using either 454 high-throughput or Illumina Sequencing platforms, carried out at the

Center for Genomics & Bioinformatics of Indiana University and the Office of Biotechnology

DNA Facility of Iowa State University, respectively. For C. convolvulus, H. “sanjuanensis”, and L. unguiculata, we constructed random clone libraries from the purified PCR product using the TOPO Shotgun Subcloning Kit from Invitrogen (see Burger et al. 2007 for details) and sequenced at the Office of Biotechnology DNA Facility of the Iowa State University.

We identified mitochondrial sequences from partial genomic reads of Alatina moseri

generated by the Iridian Genomes project. We used specific primers and step-out protocol on a

tissue sample from another individual of A. moseri to extend our sequences toward the end of

57 the mitochondrial chromosomes. Because of the high degree of segmentation of cubozoan

mtDNA, we PCR amplified nearly complete coding sequences using standard PCRs (for rns-

nad1 and cox2-cox3) and step-out protocols (for cob, cox1, nad4, nad2-nad5, ORF314-polB,

and rnl), and cloned the products into TOPO vectors. Plasmid preparation and sequencing were

processed at the Iowa State University Office of Biotechnology DNA facility. We used

enzymatic digestion of total DNA with the methylation-sensitive restriction HpaII to

confirm that PCR products amplified in this study are genuine mitochondrial sequences

(Kumar and Bendich 2011).

Genome assembly, gene annotation, and analysis

We assembled all sequences using the STADEN software suite (Staden 1996). We used

both the tRNAscan-SE and ARWEN programs (Lowe and Eddy 1997; Laslett and Canback

2008) to identify tRNA genes. We reconstructed the secondary structures of and

by comparison with their homologues in other cnidarian species. Other genes were

identified by similarity searches in local databases using the FASTA program (Pearson 1994).

We predicted the boundaries of rRNA genes by aligning them to rRNA sequences from

published cnidarians and checking the alignments manually. We also aligned individual protein

genes with ClustalW 1.82 (Thompson et al. 1994) using default parameters. We manually

verified all alignments based on either amino-acid alignments (for protein coding genes), or

inferred rRNA or tRNA secondary structures (for RNA-coding genes) to Aurelia aurita and

Hydra oligactis (Shao et al. 2006; Kayal and Lavrov 2008). For the analysis of sequence

similarities, we calculated sequence identities using local scripts based on BioPerl modules

(Stajich et al. 2002). We estimated codon usage with the CUSP program in the EMBOSS

package (Rice et al. 2000).

58

Results

The linear mitochondrial genomes of Medusozoa

We amplified and sequenced mitochondrial (mt) genomes of eight species representing the two clades of Scyphozoa (Coronatae: Linuche unguiculata; Discomedusae: Cassiopea

andromeda, C. frondosa, Catostylus mosaicus, Chrysaora sp, Cyanea capillata, Pelagia

noctiluca and (partial)), eight species representing the two clades of

Hydrozoa (Hydroidolina: Clava multicornis, Ectopleura larynx, Laomedea flexuosa, Millepora platyphylla, Nemopsis bachei, Obelia longissima, and Pennaria tiarella; Trachylina: Cubaia

aphrodite), and three species representing both orders of Staurozoa (Cleistocarpida:

Cratolophus convolvulus; Eleutherocarpida: Haliclystus “sanjuanensis” and Lucernaria janetae). In addition, we obtained substantial mitochondrial sequences from five species belonging to the two major cubozoan clades (Carybdeida: Alatina moseri, , and Carybdea xaymacana; Chirodropida: and quadrumanus).

The mtDNA of all scyphozoan, staurozoan, and hydrozoan species sampled for this

study mapped into single linear molecules (Figure 2A). Cubozoan mtDNA was composed of

eight mitochondrial chromosomes, each encoding one to five mitochondrial genes with the

same transcriptional orientation. This is in agreement with earlier DNA hybridization

experiments, which suggested that cubozoan mtDNA was composed of several up to 4 kb

linear chromosomes (Ender and Schierwater 2003). All cubozoan mitochondrial genes were

similar in size to their homologues in other medusozoans, and contain neither internal stop

codons (TAA or TAG) nor frameshifts. In addition, incubation of Alatina moseri total DNA

with methylation-sensitive enzyme HpaII (method described in Kumar and Bendich 2011)

inhibited PCR amplifications of the contigs harboring the restriction sites (Supplementary Fig.

59

S1). Together, this evidence strongly suggested that our cubozoan contigs were genuine mitochondrial sequences, as opposed to nuclear pseudogenes ().

All medusozoan mtDNA contained a set of genes for 13 protein subunits conserved in animals and involved in the electron transport chain of oxidative phosphorylation (atp6-8, cob, cox1-3, nad1-6), two ribosomal RNAs (rRNA) (rnl and rns), and two transfer RNAs (tRNA)

(trnM(cau) and trnW(uca)). trnW(uca) was missing from the partial mitochondrial genome of the coronate Linuche unguiculata and no tRNA genes were found in the partial sequences from cubozoans (see below). We also found some level of class-specific conservation at the ends of the linear mitochondrial chromosomes: duplicated cox1 in most hydrozoans (Hydroidolina), polB and an extra ORF in cubozoans, scyphozoans, staurozoans, and the trachyline hydrozoan

Cubaia aphrodite (see below).

Medusozoan mitochondrial genomes displayed a compact organization with few, if any intergenic regions, and several pairs of overlapping genes. We found overlapping nad3-nad6 genes in the mtDNA of all hydrozoans, scyphozoans, and staurozoan species (11±3, 35±17, and 9±2 nucleotides respectively). In addition, nad4L-nad3 overlapped by 10 nucleotides in all scyphozoan species, nad1-nad4 overlapped by 10±3 nucleotides in all hydrozoans, atp6-atp8 overlapped by 7 nucleotides, and nad2-nad5 by 13 nucleotides in all staurozoans. Similarly, nad2-nad5 overlapped by 26 and 44 nucleotides in the staurozoans Haliclystus “sanjuanensis” and Lucernaria janetae, respectively. Finally, in the aplanulate Ectopleura larynx we found cob-cox1 overlapping by 44 nucleotides. The same two genes overlapped by 8 nucleotides in

Hydra magnipaillata and H. vulgaris (Voigt et al. 2008), which are also part of the Aplanulata clade (Collins et al. 2005).

60

We compared the variation in the mtDNA nucleotide composition among different cnidarian classes (Table 1). The linear mtDNA in medusozoans displayed GC-skew values close to zero: cubozoans, hydrozoans, and scyphozoans all shared a small positive GC-skew

(on average 0.04 SD=0.03) between the two strands of the coding sequences, while

staurozoans had a slightly negative GC-skew (-0.05 SD=0.02). In comparison, the circular

mtDNA in anthozoans displayed a positive GC-skew (0.20 SD=0.07 in hexacorals and 0.09

SD=0.01 in octocorals). AT-skews in all studied medusozoan species were negative for the

coding strand (between -0.11 and -0.21) and similar to those in anthozoans. The A+T content

of coding regions varied substantially between different medusozoan classes. Hydrozoan

mtDNA had the highest A+T content (74% SD=4), and cubozoan mtDNA had the lowest (61%

SD=3). Scyphozoans and staurozoans had intermediate %A+T values (67% SD=4 and 67%

SD=2 respectively) similar to those in anthozoans. We found a strong correlation (R2=0.84)

between nucleotide composition and the proportion of GC-rich amino acids (the “GARP”

amino acids: glycine, alanine, arginine, and proline) (Figure 1). This result was consistent with

previous studies, which found that nucleotide composition can serve as a proxy for amino acid

compositions of protein sequences and can be an important source of systematic errors in

phylogenetic inference (Delsuc et al 2005 and references therein).

Gene arrangements in medusozoan mtDNA

Analysis of the newly determined mitochondrial sequences revealed a well-conserved

mitochondrial gene order within Medusozoa (Figure 2A). Eleven representatives of both

discomedusan Scyphozoa and Staurozoa sampled for this study had a completely conserved

gene order, identical to that found in Aurelia aurita (Shao et al 2006). In these groups, the

cluster formed by cox1-rnl-ORF314-polB had a different transcriptional orientation to the rest

61 of the genes (Figure 2A). The two clusters were separated by a relatively large non-coding

region of about 80 nucleotides, potentially involved in the control of mitochondrial

transcription (see below). In the coronate Linuche unguiculata, trnW was not found either in its

conserved position between cox2 and atp8 or in the rest of the genome. We were also unable to

identify sequences for polB and ORF314 in this species. However, given that the sequence of

L. unguiculata is incomplete, it is possible that these genes are present in the mitochondrial

genome, downstream of cox1. The gene arrangement was also conserved in all cubozoan

species, despite the presence of multiple chromosomes (Figure 2A). Single gene chromosomes

harbored cob, cox1, nad4, and rnl genes. Other chromosomes contained gene pairs nad2-nad5

and ORF314-polB, and gene clusters cox2-atp8-atp6-cox3 and rns-nad6-nad3-nad4L-nad1.

When placed together, the gene order across the eight chromosomes in cubozoan mtDNA is

nearly identical to that in most scyphozoans and staurozoans except for the absence of tRNA

genes.

The mitochondrial genome organization within Hydrozoa was more variable, with at

least four different gene orders. The genome organization in the trachyline Cubaia aphrodite

was identical to that in discomedusan Scyphozoa and Staurozoa (Figure 2A). All species from

the clades Capitata (Millepora platyphylla and Pennaria tiarella), III (Clava

multicornis) and VI (Nemopsis bachei), and (Laomedea flexuosa and Obelia

longissima) had a genome organization resembling that of C. aphrodite, with the exception of a

duplicated cox1 downstream of cob and the loss of ORF314-polB (Figure 2A; see Cartwright et

al. 2008 for description of hydrozoan subclades). The genome organization in the aplanulate

Ectopleura larynx was similar to H. oligactis, where rnl was inverted and both tRNA genes

(trnM and trnW) moved compared to other hydrozoans. Interestingly, all non-trachyline

62 hydrozoan species sequenced in this study had two identical copies of cox1, one at each end of the mitochondrial chromosome. In comparison, the three previously published Hydridae

mitochondrial genomes (Hydra magnipapillata, H. oligactis, and H. vulgaris) were reported to

have only one complete copy of cox1 downstream of cob, with the other copy(ies) being

partially degenerated and lacking the 5’ end of the gene (Kayal and Lavrov 2008; Voigt et al.

2008). Furthermore, in H. magnipapillata and H. vulgaris, the mtDNA was split into two

nearly equal sized chromosomes, a feature also reported for several other species belonging to

the family Hydridae (Warrior and Gall 1985; Bridge et al. 1992; Ender and Schierwater 2003).

Overall, the mitochondrial gene order in medusozoans is well conserved, with only four

unique gene orders found so far (ignoring genome fragmentation and cox1 degeneration in

Hydridae). We inferred the evolution of mitochondrial genome organization in medusozoans

using the currently accepted cnidarian phylogeny (Figure 2B; Collins et al. 2006; Cartwright et

al. 2008). Gene order similar to the scyphozoan A. aurita is found in all four of the recognized

medusozoan classes. Given the reportedly low rate of genome rearrangement in animals, it is

most parsimonious to reconstruct the mitochondrial ancestral medusozoan gene order (AMGO)

as identical to the gene order found in A. aurita (Figure 2B).

Protein gene similarity in medusozoan mtDNA

Sizes of orthologous protein genes in medusozoan mtDNA showed little variation (0-

4%), as compared to anthozoans (0-11%). By contrast, the overall amino acid pairwise identity

of protein genes for Medusozoa was highly variable, ranging between 24-96% (average 49%

SD=12) (Table 1). Such variability indicates that mtDNA proteins can be good markers for

reconstructing the evolutionary history of medusozoans at lower taxonomic levels.

Interestingly, the amino acid pairwise identity of protein genes between coronates and other

63 scyphozoans (50% SD=2) were similar to the values for all other cnidarian classes (50% and

51% with hexa- and octocorals, 38% with cubozoans, 47% with hydrozoans, and 48% with staurozoans), with a mean of 49% SD=4 to all cnidarians.

The minimally derived genetic code (TGA=tryptophan) was inferred for mitochondrial protein synthesis in all the medusozoan genomes analyzed in this study. Codon usage bias estimated as the effective number of codons Nc (Wright 1990) was highly correlated with the

GC content of the mtDNAs (R2=0.97), where the AT-rich hydrozoan mtDNA displayed the

highest codon usage bias (35 SD=4), and the lowest values were found in cubozoans and

staurozoans (47 SD=4 and 47 SD=1 respectively). Within a given codon family, codons ending

with A or T were favored over those ending with C or G. CGN codons were nearly absent from

the mtDNA of cubozoans, hydrozoans, and scyphozoans, where they represented <15% of

arginine residues with no CGG codon used in hydrozoans. In contrast, CGN codons encoded

about 50% of arginine residues in staurozoans.

ATG was inferred as the initiation codon for most mitochondrial coding sequences, yet,

we inferred GTG as a start codon for two genes in Hydrozoa (cob in L. flexuosa and Obelia longissima, and cox1 in E. larynx) and five genes in Scyphozoa (atp8 in L. unguiculata, nad1 in P. noctiluca, nad3 in all but two species, nad4L in C. mosaicus, and nad5 in Chrysaora sp)

(Table 1). GTG was also the inferred initiation codon in at least 5 out of the 13 protein genes in

Cubozoa, and mostly present in the order Chirodropida. Both stop codons TAA and TAG were

found in all medusozoan classes (80% and 20% of the genes respectively, where the former

value represents both TAA and incomplete TA stop codons). Incomplete stop codons TA were

inferred for several protein genes in medusozoan mtDNA, predominantly in atp6, atp8, nad3, and nad4L (Table 1). In the aplanulatan Ectopleura larynx the nearest stop codon for nad5 is

64 located 30 nucleotides past the 5’ end of rns. Interestingly, no stop codon was inferred for nad5 in the scyphozoan P. noctiluca, as the nearest stop codon is found 77 nucleotides within rns.

RNA genes in medusozoan mtDNA

All complete medusozoan mt-genomes analyzed in this study contained four RNA

genes typical for Cnidaria: two genes coding for large and small subunit rRNA (rnl and rns), and two for methionine and tryptophan tRNAs (trnM(cau) and trnW(uca)). The genes for the small and large ribosomal RNA subunits were on average 17% smaller in size and displayed less size variation in medusozoans compared to their homologues from anthozoans (Table 1).

The nucleotide composition of RNA genes was slightly more AT-rich than that of the protein coding regions.

Both medusozoan mt-tRNA genes were more variable in primary sequence and

inferred secondary structure than their homologues in anthozoans and demosponges (Wang and

Lavrov 2008). The initiator was on average 70 nucleotides long, ranging from 68 in

staurozoans to 71 in hydrozoans and scyphozoans. The shorter in Staurozoa lacked

two nucleotides at positions 16 and 17 of the D-loop (Figure 3). The anticodon arm in all

hydrozoan and staurozoan had an extra base pair typical to type 7 tRNA (Watanabe

et al. 1994; Steinberg and Cedergren 1995; Steinberg et al. 1997), resulting in a shorter variable

loop (three nucleotides) in the hydrozoans, but a standard four-nucleotides long variable loop

in staurozoans. The of all medusozoans but Hydra species had two nucleotides (8

and 9) in the connector region between the acceptor and the D-arm, which makes the loss of

nucleotide 9 in mitochondrial a synapomorphy for the Hydridae family. Finally, all

hydrozoans but Cubaia aphrodite and all staurozoans lacked nucleotide 16 in the D-loop of

65

. The partial cubozoans mitochondrial sequences determined for this study harbored no tRNA genes.

Intergenic and putative control regions in medusozoan mtDNA

We found a single, relatively large (60 to 94 nucleotides) and AT-rich (>70%)

intergenic region (IGR) between cox1 and cox2 in all scyphozoan and staurozoan species, and in the trachyline hydrozoan Cubaia aphrodite mtDNA (Figure 2A). This IGR delimits two sets of genes with opposite transcriptional polarity. A large portion of this region can be folded into a conserved stem loop motif with up to 13-base pairs in the stem, most of them (23 bases) completely conserved in all scyphozoans sampled in this study and some (12 bases) also conserved in staurozoans and in the hydrozoan C. aphrodite (Figure 4A). In most other hydrozoan groups (Filifera, Capitata, and Leptothecata), the largest IGR was located between cox2 and rnl, two genes with opposite transcriptional polarities (Figure 2A). This IGR also harbored a 20 nucleotide-long sequence that, while displaying low sequence conservation between hydrozoan species, could be folded into a strong stem-loop element (Figure 4A).

In aplanulatan hydrozoans, we found a relatively large (42-52 nucleotides) non-coding

region between cox3 and nad2 that displayed up to 70% similarity with trnW sequences.

Finally we did not find major intergenic regions in cubozoan mtDNA.

The “ends” of medusozoan mtDNA: extra ORFs versus duplicated genes

All species of Staurozoa, discomedusan Scyphozoa (the sequence of the corresponding

region in the coronate scyphozoan Linuche unguiculata remains undetermined), and the

trachyline hydrozoan Cubaia aphrodite contained two ORFs at one end of the molecule. In

Cubozoa, analogous ORFs were found on a separate mitochondrial chromosome. The

sequences of the larger ORF were similar with polB described in Aurelia aurita (Shao et al.

66

2006) and formed a monophyletic group in a phylogenetic analysis that included polB from other organisms (Supplementary Fig. S2). The sequence of the smaller ORF (ORF314) was poorly conserved within Medusozoa, with only 2 identical amino acid residues between the three classes. This ORF encoded about 100 amino acids and was always located upstream of polB. While preliminary analysis using 3D modeling of the tertiary structure of the putative product of ORF314 suggests similarities with elongation factors and a potential DNA-binding activity (data not shown), further studies are needed to investigate the role of this gene.

The mtDNA of all hydrozoans but the trachyline Cubaia aphrodite lacked the two extra

ORFs (polB and ORF314) (Figure 2A). Furthermore, we were not able to identify copies of

these genes in the nuclear genome of H. magnipapillata (Chapman JA et al. 2010). Instead, we

found a complete copy of cox1 at each end of the mitochondrial chromosome (Figure 2A). The two copies of cox1 had opposite transcriptional polarities and formed part of inverted terminal

repeats (ITRs, Figure 2A). In addition, we found a relatively conserved 116 base pair sequence

downstream of cox1 in Aurelia aurita and Cassiopea spp mtDNAs that could be folded into a

complex stemloop structure (Figure 4B). The presence of inverted terminal repeats (ITRs) is a

hallmark of linear mitochondrial genomes and has been reported in several distantly related

organisms (Valach et al 2011 and references therein).

Discussion

Linear mitochondrial DNA has been well documented in several non-metazoan groups

such as plants and fungi. In yeast, where linear mtDNA has been extensively studied, the

distribution of linear and circular mitochondrial genomes has no apparent phylogenetic signal,

with closely related species having either linear or circular mtDNA (Nosek et al 1998; Nosek

67 and Tomaska 2008). In animals, strictly linear mtDNA has been described only in Medusozoa

(Warrior and Gall 1985; Bridge et al. 1992), for which only six genomes have been published

to date (Shao et al. 2006; Kayal and Lavrov 2008; Voigt et al. 2008; Park et al. in press). The

only other report of linear mtDNA in animals comes from the crustacean isopod Armadillidium

vulgare, where the mtDNA consists of a circular ~28-kb dimer and a linear ~14-kb monomer

(Raimond et al 1999; Marcadé et al 2007). Our study represents the first thorough survey of the

evolution of linear mtDNA in Medusozoa. In order to better understand the evolutionary

history of mtDNA linearity in this group, we surveyed species belonging to all medusozoan

classes, as well as their primary subclades.

We analyzed medusozoan mitochondrial genomes in a phylogenetic framework and

propose the following hypotheses for the evolution of linear mtDNA in this group: (1) a single

origin for the linear mtDNA in the lineage leading to Medusozoa, most likely by the integration

of a linear plasmid; (2) loss of polB and ORF314, and duplication of cox1 at each end of the

linear mt chromosome after the divergence of Trachylina from the rest of hydrozoans

(Hydroidolina); (3) gene rearrangement at the stem of Aplanulata; (4) partial loss of cox1

sequence in Hydridae and split of the mitochondrial chromosome in several species within this

clade; (5) high level of mtDNA fragmentation in the branch leading to Cubozoa; (6)

displacement of trnW in the coronate Linuche unguiculata (Figure 2B). It is worth noting that

fragmentation of the mitochondrial molecules appears to be an irreversible process as no

rearranged mitochondrial genomes have been observed in taxa where mtDNA is fragmented, as

one would expect if different pieces would fuse randomly.

Fukuhara and collaborators suggested a positive correlation between mitochondrial

gene order conservation and linear as opposed to circular structure of yeast mtDNA (Fukuhura

68 et al. 1993). Our study revealed a similar trend. If we ignore genome fragmentation and cox1 duplication, only four unique gene orders can be distinguished in medusozoan mtDNA: the

AMGO is found in all four medusozoan classes, with modified gene orders found in the

coronate Linuche unguiculata, Hydroidolina, and Aplanulata (Fig. 2A). By contrast, at least

nine unique gene orders (more if presence/absence of ORFs is considered) have been reported

in the circular mtDNA of anthozoans (Uda et al. 2011). Furthermore, most of the differences in

the medusozoan mitochondrial gene orders can be explained by a few simple gene

rearrangement events, while multiple and/or complex rearrangements events are needed to

explain differences in anthozoan gene orders. These observations suggest that linearization has

favored stabilization of the mitochondrial genome organization in Medusozoa. On the other

hand, this result may to some extent reflect the difference in the species richness in Anthozoa

vs. Medusozoa (about twice as many species in the former; Daly et al. 2007) and/or the number

of mt-genomes that have been studied in each group (43 vs. 30).

One other difference between anthozoan and medusozoan mtDNA is the amount and

distribution of non-coding, intergenic DNA. Anthozoan mtDNA usually has several relatively large intergenic regions that sometimes contain repetitive sequences and/or ORFs (Park et al.

2011). By contrast, medusozoan mitochondrial genomes are compactly arranged with few if any intergenic nucleotides and, sometimes, overlapping genes. Most mitochondrial genomes investigated in this study contained only a single moderately large (<100 bp) IGR separating two regions of mtDNA with opposite transcriptional polarities. The conservation of sequence and inferred secondary structure of this region suggests that it may be involved in the control of mtDNA transcription and/or replication in Scyphozoa, Staurozoa, and trachyline Hydrozoa.

Similarly, the largest IGR in other hydrozoans (situated between trnM and the cox1 in

69

Aplanulata, and between cox2 and rnl in other hydroidolinans) may be involved in the control of transcription as suggested earlier (Voigt et al. 2008). Compactly arranged mtDNA with a single large intergenic region involved in the initiation of replication (Desjardin and Morais

1990) and transcription (L'Abbé et al. 1991) is also a feature of bilaterian animals.

Interestingly, the maturation of a polycistronic pre-mRNA in bilaterian mitochondria involves excision of tRNA sequences located between most coding sequences in the genome, resulting in monocistronic mRNAs (Ojala et al. 1980). Based on the near absence of intergenic regions, one can assume that the transcription of medusozoan mtDNA is also polycistronic. However, the loss of all but two tRNA genes necessitates a different mechanism for mRNA maturation in medusozoan mitochondria. Finally, the absence of large IGRs and the conservation of unidirectional transcriptional orientation of all genes in cubozoan mtDNA suggest that the initiation of replication and transcription occurs at the end(s) of each of the mitochondrial chromosomes.

We found two distinctive genetic elements at the end of medusozoan mitochondrial chromosomes: duplicated cox1 in the hydroidolinan hydrozoans, and two extra ORFs (polB and ORF314) in all the other medusozoan classes (Cubozoa, discomedusan Scyphozoa,

Staurozoa, and trachyline Hydrozoa). In an earlier study, Shao and collaborators showed that polB from the scyphozoan Aurelia aurita mtDNA shares several conserved motifs characteristic of the polymerase in family B DNA polymerases (Shao et al. 2006). The authors also attributed the origin of the two extra protein genes to an ancient invasion event by a linear plasmid that resulted in the linearization of the mtDNA in Medusozoa. Similar polB- like sequences are found in the linear mtDNA of several non-animal groups such as fungi and algae, where they are hypothetically associated with the integration of linear in the

70 mitochondrial genome (Mouhamadou et al. 2004). The grouping of medusozoan polB sequences into a monophyletic clade suggests a single invasion event early in the evolutionary history of the group that coincided with the origin of its linear mtDNA at the stem of the medusozoan tree. Yet the low level of sequence conservation in polB-like sequences between medusozoan and non-animal species hinders our attempts to predict the source of the invading element. Nevertheless, the conservation of both sequence length and position of the two ORFs suggests some level of selection pressure for their maintenance in the mtDNA of most medusozoans.

Short inverted terminal repeats (ITRs) have been previously reported in the mtDNA of

A. aurita (Shao et al. 2006) and the hydrozoans Hydra magnipapillata, H. oligactis, and H.

vulgaris (Kayal and Lavrov 2008; Voigt et al. 2008). ITRs are one out of the few telomeric

structures found in linear mtDNAs (Nosek et al. 1998; Nosek and Tomaska 2003), and

involved in the maintenance and replication of linear chromosomes. In yeast, short ITRs are

associated with type III linear mtDNA, where a terminal protein (TP) covalently binds at the

single-stranded 5’ end of the linear molecules (Nosek and Tomaska 2008). The linear

molecules generally encode TPs, either as part of a DNA polymerase gene, or by a yet

unidentified ORF (Fricova et al. 2010 and references therein). In medusozoan mtDNA, the

putative product of ORF314, if shown to have DNA-binding properties, may act as a TP.

Consequently, assuming polB and ORF314 are functional genes, we predict that the mtDNA in

cubozoans, scyphozoans, staurozoans and trachyline hydrozoans uses mechanisms of

maintenance and replication similar to type III linear mtDNA as found in yeasts, linear

plasmids, and adenoviral DNA. This assertion is further supported by the similarity of polB

sequences between medusozoan and linear plasmids. As described above, both polB and

71

ORF314 are absent from non-trachyline hydrozoan mtDNAs, and no homologues could be

found in the completely sequenced nuclear genome of Hydra magnipapillata (Hydridae,

Hydrozoa). Thus, unlike what has been suggested by an earlier study (Voigt et al. 2008),

hydrozoans appear to employ an alternative mechanism for the maintenance and replication of

their mtDNA. However, recruitment of a nuclear encoded TP for the maintenance of linear

mtDNA in these hydrozoans cannot be ruled out. In addition, the duplication of cox1 at each end of the mitochondrial chromosome(s) is limited to the lineage leading to Hydroidolina, suggesting that a novel mechanism evolved after the divergence of Trachylina and the rest of hydrozoans. Future studies that explore the end conformations of medusozoan mtDNA will shed light on the maintenance and processing of these linear molecules.

Despite our efforts, we were not able to verify the expression of polB and ORF314 at

the RNA level, most likely as a consequence of poor RNA conservation in the preserved tissue

samples. We were also unable to investigate the processes involved in transcription and

replication of the mtDNA in medusozoans. The analysis of a dataset containing ESTs from the

scyphozoan Cyanea capillata was also unsuccessful, highlighting the difficulties associated

with working with RNA in this group. Interestingly, in EST assemblies obtained from the

staurozoan Haliclystus “sanjuanensis” we found a single mitochondrial hit that spans from the

3’-end of cox1 to the 5’-end of rnl, suggesting that these genes are transcribed in a

polycistronic manner. Yet, additional studies are needed to shed light on the role of the two

extra ORFs (polB and ORF314) and their putative products in the maintenance of linear

mitochondrial chromosomes in medusozoans, and on the mechanisms of mitochondrial gene

expression in the group.

72

Conclusion

Medusozoan mtDNA is typically a small compact genome composed of one or several linear chromosomes with a highly conserved gene order, and inverted terminal repeats (ITRs) at the end of the molecule(s). The linearity of medusozoan mtDNA is most likely the product of a unique invasion event by a linear plasmid-type element harboring a polB-like and a putative terminal protein (TP) gene, with secondary loss of the two extra ORFs within hydrozoans. Medusozoan mitochondrial genomes lack large IGRs, contain several overlapping genes, and harbor shorter protein and ribosomal genes than representatives from other non- bilaterian clades (Signorovitch et al. 2007; Wang and Lavrov 2008). Besides their strictly linear mitochondrial genomes, a unique feature in animals, the medusozoan mtDNA display characteristics that can be described as intermediary between bilaterian and non-bilaterian animals. On the one hand, medusozoans use a minimally derived genetic code for mitochondrial protein synthesis and show relatively low evolution rate of their mitochondrial genes, as found in most non-bilaterian animals. On the other hand, the compactness of their genomes and a relatively stable gene order both mirror the evolution of bilaterian mtDNA. Put into a broader perspective, our results show that the evolution of animal mitochondrial genomes does not follow a single linear pathway as proposed in an earlier study (Lavrov 2007).

This is well illustrated in the cnidarian clade where both bilaterian-like and non-bilaterian-like mitochondrial features are found. Future studies that focus on undersampled animal taxa may further expose the rather unpredictable evolutionary pattern of genome evolution in animal mitochondria, revealing a more complex picture of mtDNA evolution in Metazoa. In addition, we predict that a larger survey of linear mtDNA in medusozoans may uncover original/new

solutions to the "the challenge of the ends" (Nosek et al. 1998).

73

Funding

This work was supported by a grant from the National Science Foundation to DVL

[DEB-0829783] and by funding from Iowa State University.

Acknowledgements

We would like to thank Dr. Michael Dawson and Dr. Alexander Ereskovsky for their contributions to the sampling effort of Scyphozoa, Dr. Janet Voight for providing us with DNA from the staurozoan Lucernaria janetae, and Dr. Annette F. Govindarajan for providing us with the hydrozoan Obelia longissima. We also thank Dr. Casey Dunn and the Cnidarian Tree of Life project (DEB-0531779 to Paulyn Cartwright and AGC) for granting access to EST data from the scyphozoan Cyanea capillata and Haliclystus “sanjuanensis”. We are very grateful to

Dr. Peter Schuchert for the identification of our hydrozoan samples. BB wishes to acknowledge funding through US NSF Doctoral Dissertation Grant DEB 0910237. EK wishes to acknowledge funding through the Smithsonian Predoctoral Fellow grant.

Table 1: Gene properties in the mtDNA of Medusozoa. The average size in nucleotides (and standard deviation), AT composition (and standard deviation), and putative start and stop codons are reported for each of the Medusozoa subclasses.

74

Cubozoa Hydrozoa Scyphozoa Staurozoa

size1 %AT2 start3 end4 size %AT start end size %AT start end size %AT start end 708 704 75 704 708 63 atp6 63 ±2 AG * A A* 69 ±4 A A* A A ±7 ±1 ±4 ±3 ±0 ±1 210 206 83 208 204 62 atp8 64 ±4 AG AG* A A* 73 ±4 AG AG* A A ±2 ±3 ±5 ±7 ±0 ±4 1148 73 1146 60 cob 1149 62 ±2 A G AG A 66 ±2 A AG 1068 A ? ±12 ±3 ±8 ±2 67 1580 1578 61 cox1 1569 58 ±3 A A 1569 AG AG 64 ±3 A AG A A ±3 ±7 ±0 ±1 737 744 73 746 747 62 cox2 61 ±2 A AG* A AG 67 ±4 A AG* A AG ±2 ±10 ±4 ±8 ±0 ±1 786 786 72 786 786 61 cox3 59 ±3 A AG A AG 64 ±3 A AG A AG ±0 ±0 ±4 ±0 ±0 ±1 987 989 73 972 987 59 nad1 62 ±3 AG AG A AG 66 ±4 AG A* A A ±8 ±4 ±4 ±5 ±0 ±0 1328 79 1323 1346 59 nad2 1341 63 ±7 A G* A AG* 70 ±5 A AG A A ±32 ±5 ±13 ±2 ±4 351 355 77 357 354 65 nad3 62 ±4 AG * A * 69 ±4 AG A* A A* ±0 ±4 ±4 ±6 ±0 ±4 1458 76 1441 1461 61 nad4 1446 59 A G A AG* 68 ±5 A AG* A AG* ±2 ±4 ±2 ±0 ±3 290 299 79 303 299 64 nad4L 67 ±3 A G* A * 72 ±5 AG A* A * ±2 ±2 ±4 ±1 ±2 ±1 1832 76 1830 1860 60 nad5 1824 62 ±1 AG G A AG* 68 ±5 AG A* A AG ±2 ±4 ±19 ±13 ±2 542 556 79 564 553 62 nad6 64 ±3 A G* A AG* 70 ±5 A AG* A A ±4 ±8 ±5 ±12 ±2 ±2 313 ORF314 315 64 A A 291 78 A G 73 ±8 A A 288 62 A A ±7 polB 873 58 G A ? ? ? ? 969 70 ±8 A A 1119 58 ATG ? 1746 76 1818 57 rnl 769 57 NA NA NA NA 69 ±5 NA NA 1830 NA NA ±9 ±4 ±34 ±1 910 74 950 914 57 rns 672 62 NA NA NA NA 69 ±3 NA NA NA NA ±21 ±2 ±10 ±1 ±1 71 69 69 53 trnM - - NA NA NA NA 71 ±0 64 ±5 NA NA NA NA ±1 ±2 ±0 ±2 70 65 71 52 trnW - - NA NA NA NA 70 ±0 64 ±5 NA NA NA NA ±1 ±3 ±0 ±2

1Average size in nucleotides, with standard deviation

2AT composition, with standard deviation

3Start A and G stand for ATG and GTG start codons respectively

4Stop A and G stand for TAA and TAG stop codons, respectively; a star corresponds to an incomplete stop codon (see text).

75

Compositional properties of cnidarian mitochondrial coding sequences 26

24

R2 = 0.84 22

20

Hexacorallia Octocorallia 18 Scyphozoa

% G+A+R+P Staurozoa 16 Hydrozoa Cubozoa

14

12

10 15 20 25 30 35 40 45 50 % GC

Figure 1: Compositional properties of cnidarian mitochondrial coding sequences. The total G-C content of mitochondrial protein genes is plotted against the percentage of amino

acids encoded by G- and C-rich codons (glycine, alanine, arginine, and proline [G + A + R +

P]). The trend line and correlation coefficient are displayed. Hydrozoa are characterized by

very A-T rich genomes and relatively fewer [G + A + R + P] codons. Staurozoan have

relatively higher GC content similar to anthozoans.

76

A polB orf1 rnl cox1 cox2 atp8 atp6 cox3 nad2 nad5 rns nad6 nad3 nad4L nad1 nad4 cob Cubozoa

cox1 cox2 atp8 atp6 cox3 M nad2 nad5 rns nad6 nad3 nad4L nad1 nad4 cob Linuche (Coronatae)

polB orf1 rnl cox1 cox2 W atp8 atp6 cox3 M nad2 nad5 rns nad6 nad3 nad4L nad1 nad4 cob other Scyphozoa + Staurozoa

polB orf1 rnl cox1 cox2 W atp8 atp6 cox3 M nad2 nad5 rns nad6 nad3 nad4L nad1 nad4 cob AMGO

polB orf1 rnl cox1 cox2 W atp8 atp6 cox3 M nad2 nad5 rns nad6 nad3 nad4L nad1 nad4 cob Cubaia (Trachylina)

Filifera + Capitata cox1 rnl cox2 W atp8 atp6 cox3 M nad2 nad5 rns nad6 nad3 nad4L nad1 nad4 cob cox1 + Leptothecata H y d r o z a

cox1 M rnl W cox2 atp8 atp6 cox3 nad2 nad5 rns nad6 nad3 nad4L nad1 nad4 cob cox1 A p l a n u t H y d r i a e cox1-p M rnl W cox2 atp8 atp6 cox3 nad2 nad5 rns nad6 nad3 nad4L nad1 nad4 cob cox1

Some Hydridae

cox1-p M rnl W cox2 atp8 atp6 cox3 nad2 nad5 cox1-p cox1-p M rns nad6 nad3 nad4L nad1 nad4 cob cox1

Hydrozoa B Acraspeda Aplanulata Scyphozoa

rachyolintaher Hydroidolina Anthozoa Staurozoa Coronatae Cubozoa T Hydridae

77

Figure 2: Evolution of mitochondrial genomes in Medusozoa (Cnidaria) based on phylogenetic relationships according to Collins et al. (2006), Collins et al. (2008), and

Cartwright et al. (2008). A) Comparison of mitochondrial gene orders between various medusozoan clades and the putative mitochondrial Ancestral Medusozoan Gene Order

(AMGO). Breaks in contigs mark the ends of linear mitochondrial chromosomes. Genes are transcribed from left to right unless underlined. cox1-p corresponds to the partial copy of cox1

at the end of the mtDNA molecule(s) in the Hydridae family. polB is an inferred member of the

B DNA polymerase gene family and ORF314 is an unidentified ORF located upstream of it.

Lightning bolts represent breaks in the mitochondrial chromosomes. Black arrows depict

directions of inferred genome rearrangements. B) Evolution of genome organization of the

linear mtDNA in Medusozoa. The phylogeny is based on Collins et al. (2006). Major genomic

rearrangements are labeled. (1) Linearization of the mtDNA by insertion of a linear plasmid of

unknown source; (2) loss of polB and ORF314, and duplication of cox1 at each end of the

linear chromosome within Hydrozoa after the divergence of Trachylina; (3) displacement of

tRNAs and inversion of rnl in Aplanulata; (4) genome segmentalization of the mtDNA in

several Hydridae species with consequent trnM and cox1 duplication; (5) high level

segmentalization of the mtDNA in cubozoan; (6) displacement of trnW in the coronate Linuche

unguiculata.

78

Methionine w Tryptophan G h-W R-Y (M) R-Y (M) R-Y H-D R-Y R-Y G-Y R-Y R-Y R-Y R-Y R-Y B-V T YHBYCYNA T YDCVVTDA N A N ||||| R A A M ||||| R N TCAR A TTRR C RDNRGTT N RHBGGTT G |||| D |||| T G R G A AGTT A W A RAYY H R V N R D D Y-R r.y Y-R R-Y A-T W-W G-C D-Y R-Y R-Y Y H Y-R T R C A CAT T A TCA

Figure 3: Consensus secondary structure of tRNAs encoded by the mtDNA of Hydrozoa,

Scyphozoa, and Staurozoa. Bold letters with black arrows correspond to position that are missing in some taxa: in , two nucleotides at positions 16 and 17 of the D-loop in

Staurozoa, and positions 1 and 71 in the acceptor stem of the hydrozoan Nemopsis bachei; in

, position 9 of the connector region between the acceptor and the D-arm in the

Hydridae family, position 16 of the D-loop in staurozoan and all hydrozoan but the trachyline

Cubaia aphrodite, and position 42 of the variable loop in all hydrozoan. Bold letters without arrow correspond to an extra base pair in the anticodon arm of in Hydrozoa and

Staurozoa, typical to type 7 tRNAs.

79

A A T-A .. G-C . .. W W . . A-T . . T D . . W R A T-A AA N-M W W AT C-B W-W A A-T y-a T-D A-T W-T T-A A-T T-H T-A A-T W-W C-G A-T t-a A-T A.A T-A W-T T-A T-A R-T G-C R-T t y T-A T-A W-A A-T c-G S-B A-T n(N)T-A (N)n n(N) (N)n n(N) (N)n

Scyphozoa + Staurozoa + Trachylina Capitata + Filifera Leptothecata

B Scyphozoa ...... G-Y T-A T-A T-A T-A T-A . . T . G-C C...... C . . T R Y W-R T CA T-A R T R T A-T T-A A-T T-A A-T T-A T-A A-T A-T W-T A-T T-A A-T

Figure 4: Putative control region and terminal stem-loop found in medusozoan mtDNA.

A) Consensus sequence of the putative control regions (CR) found in the mtDNA of Hydrozoa,

Scyphozoa, and Staurozoa. This CR corresponds to the largest intergenic region (IGR) that

delimits two sets of genes with opposite transcriptional polarity. B) Putative stem-loop

structure found at the end of scyphozoan mtDNA.

80

References

Beagley CT, Okimoto R, Wolstenholme DR. 1998. The mitochondrial genome of the sea Metridium senile (Cnidaria): introns, a paucity of tRNA genes, and a near- standard genetic code. Genetics 148: 1091-1108. Bendich AJ. 1993. Reaching for the ring: the study of mitochondrial genome structure. Curr 24: 279-290. Boore JL. 1999. Animal mitochondrial genomes. Nucleic Acids Res 27: 1767-1780. Bridge D, Cunningham CW, Schierwater B, DeSalle R, Buss LW. 1992. Class-level relationships in the phylum Cnidaria: evidence from mitochondrial genome structure. PNAS USA 89: 8750-8753. Burger G, Gray MW, Lang BF. 2003. Mitochondrial genomes: anything goes. Trends in Genetics 19: 709-716. Burger G, Lavrov DV, Forget L, Lang BF. 2007. Sequencing complete mitochondrial and plastid genomes. Nat Protoc 2: 603-614. Cartwright P et al. 2008. Phylogenetics of Hydroidolina (Hydrozoa: Cnidaria). J Mar Biol Assoc UK 88: 1663-1672. Chapman JA et al. 2010. The dynamic genome of Hydra. Nature 464: 592-596. Chen C, Chiou C-Y, Dai C-F, Chen CA. 2008. Unique mitogenomic features in the scleractinian family pocilloporidae (: astrocoeniina). Mar Biotechnol (NY) 10: 538-553. Collins AG et al. 2005. Phylogeny of Capitata and (Cnidaria, Hydrozoa) in light of mitochondrial 16S rDNA data. Zool Scripta 34: 91-99. Collins AG et al. 2006. Medusozoan phylogeny and character evolution clarified by new large and small subunit rDNA data and an assessment of the utility of phylogenetic mixture models. Syst Biol 55: 97-115. Daly M et al. 2007. The phylum Cnidaria: A review of phylogenetic patterns and diversity 300 years after Linnaeus*. Zootaxa 182: 127-128. Delsuc F, Brinkmann H, Philippe H. 2005. Phylogenomics and the reconstruction of the tree of life. Nat Rev Genet 6: 361-375. Desjardins P, Morais R. 1990. Sequence and gene organization of the chicken mitochondrial genome. A novel gene order in higher . J Mol Biol 212: 599-634. Ender A, Schierwater B. 2003. Placozoa Are Not Derived Cnidarians: Evidence from Molecular . Mol Biol Evol 20: 130-134. Faure E, Casanova J-P. 2006. Comparison of chaetognath mitochondrial genomes and phylogenetical implications. Mitochondrion 6: 258-262. Fricova D et al. 2010. The mitochondrial genome of the pathogenic yeast Candida subhashii: GC-rich linear DNA with a protein covalently attached to the 5’ termini. Microbiology 156: 2153-2163. Fukuhara H et al. 1993. Linear mitochondrial of yeasts: frequency of occurrence and general features. Mol Cell Biol 13: 2309-2314. Gissi C, Iannelli F, Pesole G. 2008. Evolution of the mitochondrial genome of Metazoa as exemplified by comparison of congeneric species. Heredity 101: 301-320.

81

Helfenbein KG, Fourcade HM, Vanjani RG, Boore JL. 2004. The mitochondrial genome of Paraspadella gotoi is highly reduced and reveals that chaetognaths are a sister group to . PNAS USA 101: 10639-10643. Hikosaka K et al. 2010. Divergence of the mitochondrial genome structure in the apicomplexan parasites, Babesia and Theileria. Mol Biol Evol 27: 1107-1116. Kayal E, Lavrov DV. 2008. The mitochondrial genome of (Cnidaria, Hydrozoa) sheds new light on animal mtDNA evolution and cnidarian phylogeny. Gene 410: 177-86. Kumar RA, Bendich AJ. 2011. Distinguishing authentic mitochondrial and plastid DNAs from similar DNA sequences in the nucleus using the polymerase chain reaction. Curr Genet 57(4): 287-295. L’Abbé D, Duhaime JF, Lang BF, and Morais R. 1991. The transcription of DNA in chicken mitochondria initiates from one major bidirectional promoter. J Biol Chem 266: 10844- 10850. Laslett D, Canbäck B. 2008. ARWEN: a program to detect tRNA genes in metazoan mitochondrial nucleotide sequences. Bioinformatics 24: 172-175. Lavrov DV. 2007. Key transitions in animal evolution: a mitochondrial DNA perspective. Integr Comp Biol 47: 734-743.

Lowe TM, Eddy SR. 1997. tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res 25: 955-964. Marcadé I et al. 2007. Structure and evolution of the atypical mitochondrial genome of Armadillidium vulgare (Isopoda, Crustacea). J Mol Evol 65: 651-659. Medina M, Collins AG, Takaoka TL, Kuehl JV, Boore JL. 2006. Naked : skeleton loss in Scleractinia. PNAS USA 103: 9096-9100. Meyer M, Stenzel U, Hofreiter M. 2008. Parallel tagged sequencing on the 454 platform. Nat Protoc 3: 267-278. Miyamoto H, Machida RJ, Nishida S. 2010. Complete mitochondrial genome sequences of the three pelagic chaetognaths Sagitta nagae, Sagitta decipiens and Sagitta enflata. Comp Biochem Physiol Part D Genomics Proteomics 5: 65-72. Moret BME, Warnow T. 2005. Advances in phylogeny reconstruction from gene order and content data. Methods Enzymol 395: 673-700. Mouhamadou B, Barroso G, Labarère J. 2004. Molecular evolution of a mitochondrial polB gene, encoding a family B DNA polymerase, towards the elimination from Agrocybe mitochondrial genomes. Mol Genet Genomics 272: 257-263. Nosek J, Tomáska L, Fukuhara H, Suyama Y, Kovác L. 1998. Linear mitochondrial genomes: 30 years down the line. Trends Genet : TIG 14: 184-188. Nosek J, Tomaska L. 2002. Mitochondrial telomeres: Alternative solutions to the end- replication problem. In: Telomeres, and (ed. G Krupp and R Parwaresch), pp. 396–441. Kluwer/Plenum, New York. Nosek J, Tomáska L. 2003. Mitochondrial genome diversity: evolution of the molecular architecture and replication strategy. Curr Genet 44: 73-84. Nosek J, Tomaska L. 2008. Mitochondrial telomeres: An evolutionary paradigm for the emergence of telomeric structures and their replication strategies. In: Origin and

82

Evolution of Telomeres (ed. J Nosek and L Tomaska), pp. 162–171. Landes Bioscience; Austin, Texas, USA. Ojala D, Montoya J, Attardi G. 1981. tRNA punctuation model of RNA processing in human mitochondria. Nature 290: 470-474. Papillon D, Perez Y, Caubit X, Le Parco Y. 2004. Identification of chaetognaths as protostomes is supported by the analysis of their mitochondrial genome. Mol Biol Evol 21: 2122-2129. Park E, Song JI, Won YJ. 2011. The complete mitochondrial genome of Calicogorgia granulosa (Anthozoa: Octocorallia): Potential gene novelty in unidentified ORFs formed by repeat expansion and segmental duplication. Gene 486(1-2): 81-87. Park E, Hwang DS, Lee JS, Song JI, Seo TK, Won YJ. 2011. Estimation of divergence times in cnidarian evolution based on mitochondrial protein-coding genes and the fossil record. Mol Phyl Evol, in press. Pearson WR. 1994. Using the FASTA program to search protein and DNA sequence databases. Methods Mol Biol 24: 307-331. Pont-Kingdon G et al. 1998. Mitochondrial DNA of the Sarcophyton glaucum contains a gene for a homologue of bacterial MutS: a possible case of gene transfer from the nucleus to the mitochondrion. Journal of Molecular Evolution 46: 419-431. Raimond R et al. 1999. Organization of the large mitochondrial genome in the isopod Armadillidium vulgare. Genetics 151: 203-210. Rice P, Longden I, Bleasby A, 2000. EMBOSS: the European Molecular Biology Open Software Suite. Trends Genet 16: 276-277. Shao Z, Graf S, Chaga OY, Lavrov DV. 2006. Mitochondrial genome of the moon jelly Aurelia aurita (Cnidaria, Scyphozoa): A linear DNA molecule encoding a putative DNA-dependent DNA polymerase. Gene 381: 92-101. Signorovitch AY, Buss LW, Dellaporta SL. 2007. Comparative Genomics of Large Mitochondria in Placozoans. PLoS Genetics 3: e13. Staden R. 1996. The Staden sequence analysis package. Mol Biotechnol 5: 233-241. Stajich JE et al. 2002. The Bioperl toolkit: Perl modules for the life sciences. Genome Res 12: 1611-1618. Steinberg S, Cedergren R. 1995. A correlation between N2-dimethylguanosine presence and alternate tRNA conformers. RNA 1: 886-891. Steinberg S, Leclerc F, Cedergren R. 1997. Structural rules and conformational compensations in the tRNA L-form. J Mol Biol 266: 269-282. Thompson JD, Higgins DG, Gibson TJ. 1994. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res 22: 4673-4680. Uda K et al. 2011. Complete mitochondrial genomes of two Japanese precious corals, Paracorallium japonicum and Corallium konojoi (Cnidaria, Octocorallia, Coralliidae): Notable differences in gene arrangement. Gene 476: 27-37. Valach M et al. 2011. Evolution of linear chromosomes and multipartite genomes in yeast mitochondria. Nucleic Acids Res 1-18. Voigt O, Erpenbeck D, Wörheide G. 2008. A fragmented metazoan organellar genome: the two mitochondrial chromosomes of Hydra magnipapillata. BMC Genomics 9: 350.

83

Watanabe Y et al. 1994. Higher-order structure of bovine mitochondrial tRNA(SerUGA): chemical modification and computer modeling. Nucleic Acids Res 22: 5378-5384. Wang X, Lavrov DV. 2008. Seventeen New Complete mtDNA Sequences Reveal Extensive Mitochondrial Genome Evolution within the Demospongiae ed. M. Hofreiter. PLoS ONE 3: e2723. Warrior R, Gall J. 1985. The mitochondrial DNA of Hydra attenuata and Hydra littoralis consists of two linear molecules. Arch Sci Genève 38: 439-445. Wolstenholme DR. 1992. Animal mitochondrial DNA: structure and evolution. Int Rev Cytol 141: 173-216. Wright F. 1990. The “effective number of codons” used in a gene. Gene 87: 23-29.

84

CHAPTER 4. CNIDARIAN PHYLOGENETIC RELATIONSHIPS AS

REVEALED BY MITOGENOMICS

A paper submitted to BMC Evolutionary Biology

Ehsan Kayal, Béatrice Roure, Hervé Philippe, Allen G. Collins, Dennis V. Lavrov

Abstract

Background

Cnidaria (corals, sea , hydroids, jellyfish) is a phylum of relatively simple aquatic animals characterized by the presence of one particular type of cell (cnidocyst). Species within Cnidaria have life cycles that involve one or both of two distinct body forms, a typically benthic , which may or may not be colonial, and a typically pelagic medusa. The currently accepted taxonomic scheme subdivides Cnidaria into two main assemblages:

Anthozoa (Hexacorallia + Octocorallia) – cnidarians with a reproductive polyp and the absence of a medusa stage – and Medusozoa (Cubozoa, Hydrozoa, Scyphozoa, Staurozoa) – cnidarians that usually possess a reproductive medusa stage. However, the monophyly of Anthozoa has been questioned by mitogenomic-based studies, where Octocorallia emerges as the sister group

to Medusozoa.

Results

We expanded the sampling of cnidarian mitochondrial genomes, particularly from

Medusozoa, to reevaluate phylogenetic relationships within Cnidaria. Our phylogenetic

85 analyses based on a mito-genomic dataset reject the monophyly of Anthozoa and indicate that the Octocorallia + Medusozoa relationship is not the result of sampling bias, as proposed earlier. Our analyses also reject Acraspeda [Cubozoa + Coronatae + Discomedusae] and

Scyphozoa [Discomedusae + Coronatae]. Furthermore, we use the new dataset to evaluate previously proposed phylogenetic relationships within medusozoan classes and to infer the evolution of key morphological characters within Cnidaria.

Conclusions

Cnidarian mitochondrial genomic data contain a strong phylogenetic signal, which can be used to understand the evolution in this phylum. Mitogenomic-based phylogenies reject the

monophyly of Anthozoa, providing further evidence for the polyp first hypothesis. They also

reject traditional Acraspeda and Scyphozoa hypotheses, suggesting that the shared

morphological characters in these groups are plesiomorphic, originating in the branch leading

to Medusozoa. We expect that the increase in mitogenomic data along with improvements in

phylogenetic inference methods and use of additional nuclear markers will further enhance our

understanding of the phylogenetic relationships within Cnidaria.

Background

The phylum Cnidaria currently encompasses five classes: Anthozoa subdivided into the

subclasses Hexacorallia (hard corals and sea anemones) and Octocorallia (soft corals, sea pens,

and gorgonians), Staurozoa (stalked jellyfish), Scyphozoa (true jellyfish), Cubozoa (box jellies

or sea wasps), and Hydrozoa (hydras, hydroids, hydromedusae, and siphonophores). Non-

anthozoan classes are grouped into the Medusozoa [1, 2]. Cubozoa (Werner 1975)

and Staurozoa (Marques and Collins 2004) were originally included in the class Scyphozoa

86

(Götte 1887), but have each been promoted to the taxonomic status of class, leaving three

orders Coronatae, Rhizostomeae and in Scyphozoa [1]. The relationships

among and within medusozoan classes remain contentious. Although traditionally box jellies

were considered to be closely related to true jellyfish (Acraspeda; Gegenbaur 1856), some

studies have suggested the grouping of [Cubozoa + Staurozoa] and [Hydrozoa +

[Rhizostomeae + Semaeostomea (Discomedusae)]] [3, 4].

Recent research efforts, in particular the Cnidarian Tree of Life (CnidToL) project

(cnidtol.com), have generated large molecular datasets, primarily ribosomal DNA (rDNA)

sequences, from a diverse sampling of cnidarian species. Analyses of these datasets lead to

substantial progress in understanding phylogenetic relationships within the cnidarian classes

but also challenged some traditional clades. For example, some studies have nested the

monophyletic rhizostome jellyfish within a paraphyletic Semaeostomeae [5–8], a view supported by the most recent phylogenetic study using 18S and 28S sequences [9]. Other relevant findings include the sister group relationships between Trachylina and Hydroidolina within Hydrozoa [6], paraphyletic “Filifera” (Kühn 1913) within Hydroidolina [10, 11]; monophyletic stony corals (Scleractinia) within Hexacorallia [12, 13] in opposition to earlier studies [14]; and, two robust clades Carybdeida and Chirodropida composing box jellyfish

(Cubozoa) [15].

Resolving phylogenetic relationships for the phylum Cnidaria is a prerequisite for the reconstruction of the evolutionary history of key morphological novelties in this group, e.g. life history and morphological characters [6, 10, 16], medusan morphospace and swimming ability

[17], the evolution of in cubozoans [15], the origin and evolution of building corals [12, 13], and mitochondrial genome structures [18]. One critical character in cnidarian

87 evolution is the ancestral state of the adult life stage, polyp or medusa, which has been a matter of controversy for over a century [6, 19]. All anthozoans (hexacorals and octocorals) lack a

free-living medusa stage, while cubozoans and most scyphozoans contain both polyp and

medusa stages. Hydrozoans display the widest range of diversity in their life cycle, with the

absence of a sessile polyp in (which may be diphyletic [20]) and many species

of , and a highly reduced or absent medusa in other lineages (e.g. some clades

within Leptothecata and Aplanulata, including the freshwater hydras (Hydridae))[1, 5, 10, 21,

22]. While some earlier studies have proposed loss of the medusa form in Anthozoa, the

current view holds the medusa is an apomorphy (derived character) for Medusozoa [1, 6]).

Consequently, today it is generally considered that (1) a sessile polyp-like form was the

ancestral life form of the phylum Cnidaria and, (2) an adult pelagic medusa phase evolved (one

or several times) in Medusozoa [1, 6]. However, the latter view has only slight phylogenetic

support in the currently accepted cnidarian phylogeny, where the most parsimonious scenario

involves the gain of the polyp form in the ancestral Cnidaria and of the medusa form in

Medusozoa. Alternatively, both the polyp and the medusa forms could have been acquired in

the ancestral cnidarian, and the medusa form subsequently lost in the branch leading to

Anthozoa. The multiple losses of the medusa stage in different medusozoan lineages suggest

that this character is indeed evolutionary labile.

Medusozoa (sometimes referred to as Tesserazoa) is supported by a combination of

characters both morphological (e.g. presence of an adult pelagic stage, cnidocils (cilia of

lacking basal rootlets), and microbasic eurytele nematocysts) and molecular (e.g.

rDNA, mitochondrial protein genes, linear mitochondrial DNA) [1, 6, 19]. Similarly, several

morphological characters have been suggested as synapomorphies for Anthozoa, including the

88 presence of an actinopharynx (a tube leading from the into the coelenteron), mesenteries and possibly siphonoglyphs (ciliated grooves longitudinally extending along the ), although the latter are absent in some anthozoan lineages [1]. Anthozoa [23–26] has also received support in molecular based rRNA data, although only one of them (Collins (2002) included a dense sampling of both anthozoan and medusozoan taxa (see Figure 1A). By contrast, studies based on mitochondrial DNA data do not support Anthozoa and instead group octocorals with medusozoans [27–30] (Figure 1B). This alternative phylogenetic hypothesis, if true, would necessitate reinterpretation of morphological characters shared by anthozoans as symplesiomorphies (retained from the common ancestor) rather than synapomorphies. In addition, rDNA-based studies have also exposed some disparities between classical taxonomy and molecular phylogenies for some groups [31] and were unable to resolve phylogenetic relationships within others, e.g. Hydroidolina [6, 11] and alcyonacean octocorals [31]. Thus, additional markers are necessary to achieve a better understanding of cnidarian relationships.

Mitochondrial DNA (mtDNA) has traditionally been a popular molecular marker for understanding the phylogenetic relationships in animals. Recent technological advances in sequencing complete mtDNA sequences have provided easier access to mitogenomic data for phylogenomic studies [32–34]. Some suggested advantages of mtDNA over nuclear DNA

(nDNA) in phylogenetics are the orthology of all genes [35] and the small genome structure being relatively conserved, which provides organizational characters such as gene order

(considered as Rare Genetic Changes or RGC) [36, 37]. In non-bilaterian animals, recent increase in the number of completely sequenced mtDNAs has helped to resolve deep phylogenetic nodes within sponges [28, 38] and Hexacorallia within Cnidaria [14, 39]. Yet, the very poor sample size of medusozoan taxa in previous studies raises some question about the

89 validity of phylogenetic results based on them. Indeed, it is known that inadequate taxon sampling and systematic errors can override genuine phylogenetic signal, resulting in flawed phylogenetic reconstructions [40, 42].

We assembled a more taxonomically balanced mitogenomic dataset to investigate the

evolutionary history of cnidarians. Our database contains newly published mitochondrial

sequences from 24 representative species of all medusozoan classes, including three orders of

Scyphozoa, both orders of Cubozoa, and six out of the nine orders of Hydrozoa [18]. We also

included sequences from two octocoral orders Penatulacea and Helioporacea, and two

hexacoral orders Antipatharia and Ceriantharia. Our analyses suggest that the paraphyly of

Anthozoa does not result from poor taxon sampling. We also found the groupings

[Discomedusae + Hydrozoa] and [Coronatae + [Cubozoa + Staurozoa]], contradicting the

current rDNA-based phylogenetic hypothesis within Medusozoa.

Results

The impact of sequence alignment and models of sequence evolution on phylogenetic analyses

Two different strategies were used for data alignment and filtering. The first alignment

(AliS; 3111 amino acids) was created using a combination of the ClustalW and SOAP

programs. The second alignment (AliMG; 3485 aa) was produced by MUSCLE and Gblocks

programs. Both alignments were used for phylogenetic inferences (ML and BI) to assess the

impact of the aligning procedure and yielded nearly identical results. The few discrepancies

included the phylogenetic position of the blue coral Heliopora coerulea, which was impacted

by alignment method but with low support. Given the results from the AliS alignment were

90 more in agreement with traditional phylogenies within Cnidaria, we decided to concentrate on this dataset in this paper.

We also evaluated how the site-heterogeneous CAT and CATGTR models perform compared to the reference site-homogeneous model GTR under BI by using cross-validation.

According to pair-wise difference of log-likelihood scores, we found that the CATGTR model

more accurately explained our data for both alignments (293.87 +/- 22.923 for AliS and 330.78

+/- 40.071 for AliMG); the CAT model was the worst of the three models for our alignments (-

97.65 +/- 51.928 for AliS and -111.61 +/- 63.337 for AliMG).

Phylogenetic relationships among cnidarian classes

Phylogenetic relationships among cnidarian classes were analyzed under both Bayesian

(BI) and Maximum Likelihood (ML) frameworks using the AliS alignment. We also used two

different models of sequence evolution (GTR, and CATGTR) for Bayesian inferences. We

found maximum support (bootstrap values of 100, posterior probabilities of 1) for the

monophyly of Medusozoa, Cubozoa, Staurozoa, Hydrozoa, and Discomedusae in all analyses.

A few differences in topology emerged when using two different models of sequence

evolution, most of them limited to poorly supported branches. For instance, we found a

monophyletic Cnidaria (posterior probability PP=0.99) and a monophyletic Hexacorallia but

with no support (PP=0.47), and the placement of the coronate Linuche unguiculata as the sister

taxa to the clade [Cubozoa + Staurozoa] (PP=0.84) under the CATGTR model (Figure 2). On

the other hand, we found paraphyletic Cnidaria and Hexacorallia in all GTR trees, where the

position of Ceriantheoptsis americanus was unstable. We also observed that L. unguiculata

was the first diverging medusozoan clade in GTR (BI) analyses (PP=0.89) but the sister taxa to

[Cubozoa + Staurozoa] in GTR (ML) without support (bootstraps BV=12). Removing the

91 coronate L. unguiculata and those species with missing data (the blue coral Heliopora coerulea and the tube anemone C. americanus) did not impact cnidarian paraphyly under GTR (ML) model (Figure S3).

All analyses supported the clade [Octocorallia + Medusozoa] (CATGTR: PP=0.99;

GTR: PP=0.72 BV=42), although support values were lower when GTR model was used.

Within Medusozoa, we recovered the monophyly of Cubozoa, Staurozoa, and Hydrozoa with maximum support (BV=100 PP=1), but Scyphozoa [Coronatae + Discomedusae] was not recovered as a monophyletic group in any analyses (Table 1). The class Hydrozoa always grouped with Discomedusae (CATGTR: PP=1; GTR: PP=1 BV=73), and Cubozoa was the sister taxon to Staurozoa (CATGTR: PP=0.56; GTR: PP=1 BV=81). As we removed the single species coronate L. unguiculata, the support values for the clade [Cubozoa + Staurozoa] slightly increased (BV=85), while support values for the clade [Hydrozoa + Discomedusae] decreased (BV=66).

Intra-class relationships

Within Discomedusae, our data reject the monophyly of Semaeostomeae (Table 1) with the rhizostome Rhizostoma pulmo as the sister taxon to the semaeostome Aurelia aurita, and

Cyaneidae grouping with . Cubozoa is divided into the two monophyletic clades

Carybdeida and Chirodropida as found earlier [15], but our data does not resolve the relationships between the three carybdeids. In Hydrozoa, our analyses supported the dichotomy

Trachylina (Cubaia aphrodite) – Hydroidolina (the rest of the hydrozoans) with high support values (PPs=1; BVs=100; Figure 2). The aplanulatan Hydra spp and Ectopleura larynx and the capitates Millepora platyphylla and Pennaria disticha formed a monophyletic clade (PPs=1;

BV=90). The species Clava multicornis (clade Filifera III) and Nemopsis bachei (clade Filifera

92

IV) also formed a monophyletic clade as the sister group to the rest of Hydroidolina with maximum support values. The position of the leptothecate Laomedea flexuosa was different between GTR and CATGTR analyses. The GTR trees placed Leptothecata as the sister taxon to the clade [Filifera III + Filifera IV] (PPs=1; BV=88) similar to ML analyses using nuclear and mitochondrial rRNA genes with the GTRMIX model of sequence evolution [10]. By contrast, Bayesian analysis using the CATGTR model strongly supported the leptothecate hydrozoan as the sister taxon to the clade [Aplanulata + Capitata] (PP=0.99).

Our data suggest Ceriantharia (tube anemones) is the sister taxon to the rest of

Hexacorallia, but this relationship was not well supported (Figure 2). Ceriantharia aside,

Zoanthidea (zoanthids) was the earliest diverging lineage, followed by Actiniaria (sea anemones), Antipatharia (black corals), and the clade [Corallimorpharia + Scleractinia]. The order Scleractinia (stony corals) was monophyletic in CATGTR analyses with low support values (PP=0.79), but paraphyletic under GTR framework, where one clade of scleractinians was the sister taxa to corallimorphs (PP=1;BV=84). Finally, despite the addition of two new sequences from Pennatulacea (Renilla muelleri and Stylatula elongata) and partial data from H. coerulea, our analyses did not resolve the phylogenetic relationships within Octocorallia.

Testing evolutionary hypotheses

We used several phylogenetic tests to evaluate the support for traditional taxonomic hypotheses of our datasets under the ML framework. We used the Approximately Unbiased

(AU) test, the Kishino-Hasegawa (KH) test, and the Shimodaira-Hasegawa (SH) test with the

GTR+Γ model (Table 2). We found that the order Semaeostomeae is significantly rejected

(AU=0; KH=0; SH=0.2), but not the inclusion of staurozoan species as part of an extensive

“Scyphozoa” clade (AU=0.1; KH=0.1; SH=0.2). We also found that despite the absence of

93 support for the monophyly of both Hexacorallia and Scleractinia in GTR trees, hypothesis tests do not reject these clades (Table 2). GTR-based analyses do not significantly reject the validity of Anthozoa (AU=0.7; KH=0.4; SH=1), Scyphozoa (AU=0.2; KH=0.1; SH=0.5/0.4), or the clade Acraspeda [Cubozoa + Scyphozoa] (AU=0.8; KH=0.6; SH=1), even after removal of the long-branch coronate L. unguiculata (AU=0.9; KH=0.8; SH=1). By comparison, under BI we found no support for the validity of Anthozoa (PPs=0; BV=0), Acraspeda (PPs=0; BV=0),

Scyphozoa (PPs=0; BV=3/0), and Semaeostomeae (PPs=0; BV=0) in any of our trees under both GTR and CATGTR models in a Bayesian framework.

Discussion

We reevaluated cnidarian phylogenetic relationships using a dataset of mitochondrial protein genes from a systematically more balanced sample of species that encompasses classes, including cubozoans and staurozoans [18], which were absent in previous works [3, 29]. We decided to exclude sequences from bilaterian animals from our analyses because they form long branches in mitochondrial phylogenomic trees that attract other long branches, resulting in

Long Branch Attraction artifacts [43, 44]. In the future, the inclusion of bilaterians in mtDNA- based phylogenies can be tested when better models of sequence evolution are available. Our cnidarian-rich dataset reduced one major source of potential error that could derive from biased and insufficient taxon sampling for the groups of interest. We also tested our choice of the model of sequence evolution. Our findings suggest that CATGTR is the best model to describe mitochondrial protein data under a Bayesian framework. In fact, this site-heterogeneous model recovered the monophyly of Cnidaria and Hexacorallia, both supported by other molecular datasets and morphological characters [3, 45–49]. By contrast, these clades were paraphyletic

94 in both Maximum Likelihood and Bayesian analyses based on GTR model of sequence evolution. This is not surprising given that a main assumption underlying the GTR model, i.e. homogeneity of the substitution pattern across sites, is violated by most molecular data, rendering the correct capture the phylogenetic signal present in our alignments more difficult

[50]. The resulting topologies from CATGTR analyses differed significantly from the current

consensus view of cnidarian phylogeny [1, 2]. Using the best estimate as a working hypothesis

for cnidarian relationships, we can propose a putative reconstruction of character evolution for

the group.

The mitogenomic point of view of cnidarian systematics

The current systematics view based on nuclear genes under the GTR model subdivides

the phylum Cnidaria into Anthozoa and Medusozoa [1, 2], but mitochondrial protein genes

have consistently supported anthozoan paraphyly with Octocorallia being the sister taxon to

Medusozoa [27–30]. It has been argued that the paraphyly of Anthozoa reported in earlier

mitogenomic studies resulted from poor taxon sampling [2, 29]. Here we show that

paraphyletic Anthozoa does not result from unbalanced taxon sampling. In addition, a principal

component analysis of the amino acid composition (Figure S4) suggests that compositional

bias is also an unlikely explanation. The amino acid composition of Hexacorallia is rather

divergent, but not similar to those of the two outgroups (Porifera and Placozoa); in fact, the

composition similarity between Octocorallia and the outgroups would favor the alternative

possibility of Anthozoa paraphyly [Medusozoa + Hexacorallia], which is not observed in our

analyses. To further test the possible impact of compositional bias, we analyzed our alignments

using the CATGTR+G model with a Dayhoff recoding strategy, despite the fact that it implies

a loss of signal, increasing the stochastic error. Interestingly, Octocorallia remained sister-

95 group of Medusozoa, even if the statistical support is reduced (data not shown). Consequently, according to our analyses, mitochondrial protein genes support the clade [Octocorallia +

Medusozoa] (Figure 2). In contrast, the amino acid composition of Hydrozoa and

Discomedusae is similar (Figure S4) and the Dayhoff recoding recovered Discomedusae as sister-group of Cubozoa (a reduced version of Acraspeda), although with low support,

suggesting that the monophyly of the clade [Hydrozoa + Discomedusae] might be due to an

amino acid composition artifact. In fact, preliminary analyses including the very fast evolving

and compositionally biased Bilateria using the site-heterogeneous CATGTR+G model

supported the unlikely grouping of Bilateria and Cubozoa (data not shown). This suggests that

the use of a complex model of sequence evolution and of a rich taxon sampling is not sufficient

to overcome all the systematic errors in the mitochondrial genomes. With the current increase

in genome sequencing efforts, it will be soon possible to corroborate, or not, our phylogenetic

tree (Figure 2) with large nuclear DNA datasets.

Within Medusozoa, Staurozoa is considered the first diverging clade, and the sister

taxon to [Acraspeda (Cubozoa + Scyphozoa) + Hydrozoa] (reviewed in [2]). While the position

of Linuche unguiculata is still ambiguous in our trees, we found no support for either the

inclusion of coronates in Scyphozoa (PP=0; BV=3), or for Acraspeda [Cubozoa + Scyphozoa]

(PP=0; BV=0). Instead, we found high support for the clades [Hydrozoa + Discomedusae]

(CATGTR: PP=1; GTR: PP=1 BV=73) and [Cubozoa + Staurozoa] (0.56/1/81). This suggests

that Scyphozoa is polyphyletic, although both Scyphozoa and Hydrozoa display similar amino

acid composition. Additional sequences from coronates can test the phylogenetic relationships

presented here.

96

Recent molecular studies have refined our understanding of hydrozoan relationships, particularly the sister group relationship between Hydroidolina and Trachylina [5, 6, 10, 11, 21,

51]. However, relationships within Hydroidolina have been very difficult to resolve based on either nuclear or mitochondrial rDNA genes. Here we have been able to sample five important hydroidolinan clades: Aplanulata, Capitata, Filifera III and IV, Leptothecata, and

Limnomedusae. Mitochondrial protein genes provide good resolution for order-level

relationships within Hydrozoa, although our sampling was limited in scope (only 10 species

representing 6 out of 11 orders; Figure 2). We recovered monophyletic Hydroidolina as the

sister group to our single representative from Trachylina. We also found monophyletic

Aplanulata and Capitata, and paraphyletic as suggested earlier [6, 10]. On the

other hand, we found a consistent grouping of capitate and aplanulate hydrozoans in all our

trees (PPs=1; BV=90), which is in contradiction with earlier rDNA-based studies [5, 10, 11,

20, 21, 49, 51]. The high support values here suggest that even better resolution of

Hydroidolina may be achieved with an increase in the number of complete mtDNAs for

representative taxa within this difficult clade.

Based on mitochondrial genome data, the monophyly of stony corals (Scleractinia) has

recently been put into question [14], but more thorough studies employing alternative datasets

have rejected this hypothesis [13, 52]. Our analyses support the monophyly of Scleractinia

under the preferred CATGTR model (PP=0.80; AU=0.2; KH=0.1; SH=0.5). Our phylogenetic

analyses did not resolve relationships within octocorals. This was predictable given the low-

level of variation of mitochondrial genes in octocorals (see [53]), a pattern attributed to the

activity of the mtMutS gene they encoded. It is noteworthy mentioning that our alignments do

not encompass sequences from the mtMutS gene, absent in all other cnidarian and animal taxa,

97 and which displays a comparatively higher rate of sequence evolution to other genes [54].

While future molecular studies of the evolutionary history of octocorals may be tempted to

limit their investigation to only a portion of the mtDNA [55], complete mitogenomes provide

additional genomic features such as gene order and the composition of intergenic regions

(IGRs) that could be valuable to systematic studies [29]. Furthermore, the combination of

mitogenomic sequences with nuclear data will likely provide even better phylogenetic

resolution for this group.

Morphological evolution in cnidarians

Unlike the dichotomous Anthozoa-Medusozoa, our strongly supported finding that

Anthozoa is paraphyletic further supports the idea that bilateral symmetry, a step of foremost

importance in metazoan evolution as it is exhibited as part of most animal body plans

(bauplan), was anciently acquired prior to the divergence between Cnidaria and Bilateria. In

fact in Cnidaria, increasing evidence supports the presence of bilateral symmetry in corals and

sea anemones, where it is represented by the siphonoglyph [56, 57], while most medusozoans

do not exhibit any bilateral symmetry [19, 58], with siphonophores being an exception [59].

The tree topologies provided here strengthen the view that the ancestral cnidarian displayed a

bilateral symmetry from which the radial tetrameral symmetry (body divided into four identical

parts) of most medusae and many polyps of medusozoans derived (Figure 3). Previous studies

have already suggested occurrences of derivation from a bilateral bauplan in several animal

groups, such as siphonophores [59], myxozoans [60] and some bilaterians [61, 62]. Bilateriality

has evolved very early in animal evolution near the root of the metazoan tree [63], and our data

support the view that such a step was taken before the divergence of Cnidaria. Future studies

98 that resolve the position of cnidarians within the metazoan tree of life will shed light on the origin of bilateral symmetry in animals.

Based upon the basal position of Staurozoa in previous phylogenies, Collins and collaborators [6] proposed that the medusa of Hydrozoa, Cubozoa and Scyphozoa derived from a stauromedusa-like ancestor. However, if the mitogenomic hypothesis of Cnidaria is true

(Figure 3), then the free-living medusa form likely evolved in the branch leading to monophyletic Medusozoa, with subsequent independent losses of this life stage occurring in the lineages leading to Staurozoa and several hydrozoan clades (Figure 3). This scenario is consistent with earlier studies that originated the name "stalked jellyfish" for staurozoans, and concluded that these species represented "degenerated" jellyfish descended from ancestors with a pelagic medusa [6, 8, 64, 65]. Simplification or losses of the medusa form has also been documented in several Hydrozoa clades [10]. The acquisition of a pelagic form is significant given that a free-swimming medusa allows a higher degree of offspring dispersion than by gametes and larvae alone [66].

Earlier studies have suggested several synapomorphies for the extended Acraspeda clade (Cubozoa + Scyphozoa + Staurozoa), namely radial tetrameral symmetry: medusa formation located at the apical end of the polyp, polyps with canal system and gastrodermal musculature organized in bunches of ectodermal origin (4 inter-radial within the mesoglea in

Staurozoa and Scyphozoa, but not limited to 4 in Cubozoa), the presence of rhopalia or rhopalioid-like structures, medusa with gastric filaments and septa in the gastrovascular cavity

[19, 58]. Paraphyletic Acraspeda as suggested by our analyses implies that either these characters were acquired several times independently in Cubozoa, Coronatae and

Discomedusae, or that they have been inherited from the ancestral medusozoan and lost in

99

Hydrozoa. Given the complexity of these characters, it is unlikely that they have re-derived

independently in various lineages. Rather, the most parsimonious scenario is their presence in

the last common ancestor of medusozoans, with subsequent loss(es) in hydrozoans. Similarly,

Scyphozoa is defined by the presence of ephyrae and simple rhopalia, and production of

medusae through strobilation of the polyp [1, 6, 19, 58]. If Scyphozoa is paraphyletic as

suggested by our trees, these characters can be either convergent or plesiomorphic. For

instance, rhopalia-like structures are present in all medusozoan clades but Hydrozoa.

According to the mitogenomic view of cnidarian phylogeny, the most parsimonious scenario

suggests that the ancestral medusozoan possessed some sort of rhopalium, maybe similar to

those present in discomedusans and coronates. According to this scenario, simplification of

rhopalia must have accompanied the degeneration of the medusae in Staurozoa, while they

were completely lost in Hydrozoa. By contrast, the rhopalia in Cubozoa have evolved into

complex structures with multiple, complex . Marques and Collins (2004) have suggested a

clade formed by [Cubozoa + Staurozoa] based on cladistic analysis of a set of morphological

and life history characters [19]. They suggested the presence of y-shaped septa and a quadrate

or square symmetry of horizontal cross-section as synapomorphies for this clade. Mitogenomic

data also support such a grouping, providing additional evidence for the validity of these

characters as synapomorphies for the clade [Cubozoa + Staurozoa].

Conclusions

We used an extended dataset of mitochondrial protein genes to reevaluate the

phylogeny of Cnidaria, paying attention to common biases in phylogenetic reconstructions

resulting from insufficient taxon sampling and using simplistic models of sequence evolution.

100

Our phylogenetic analyses suggest the grouping of Octocorallia and Medusozoa to the exclusion of Hexacorallia, resulting in Paraphyletic Anthozoa. We also recovered the

[Discomedusae + Hydrozoa] and [Coronatae + [Cubozoa + Staurozoa]] relationships within

Medusozoa. It should be noted, however, that although our data provide little or no support for the clades Anthozoa, Acraspeda and Scyphozoa, they are not rejected with statistically significant support in the likelihood framework. Using the new mito-phylogenomic view, we reconstructed the evolution of several morphological characters in medusozoans. In particular, we provided additional evidence for the “polyp first” theory, where the ancestral cnidarian was a bilateral polyp-like , and that a radially symmetrical and vagile medusae evolved in the branch leading to Medusozoa. Our analyses support the view that the ancestor of cnidarians and bilaterians (UrEumetazoa) possessed bilateral symmetry [67]. According to our working hypothesis, synapomorphies traditionally associated with Acraspeda such as the presence of gastric filaments in the medusae and gastrodermal musculature organized in bunches of ectodermal origin were most likely acquired early in the evolution of Medusozoa, and later lost in the branch leading to Hydrozoa.

Methods

Mitochondrial sequence acquisition and alignment

We determined complete or nearly complete mitochondrial sequences from 24 medusozoan species [18] and analyzed them with 51 sequences previously available in

GenBank, four placozoan and 22 poriferan species (Figure S1). Our dataset covered most of the cnidarian diversity with representatives from all medusozoan classes, including representatives from the three orders of Scyphozoa, both orders of Cubozoa, and six out of the

101

eleven clades of Hydrozoa, and three species from both Staurozoa clades [18]. In addition, we also incorporated sequences from the Cirripathes lutkeni (JX023266), the sea pens

Renilla muelleri (JX023273) and Stylatula elongata (JX023275), the alcyonarian Sinularia sp

(JX023274), as well as partial sequences for the cerianthid Ceriantheopsis americanus

(JX023261-JX023265) and the octocoral Heliopora coerulea (JX023267-JX023272) as described previously [18]. The resulting dataset contains 75 cnidarian species.

Sequence alignments

Different alignment programs use different algorithms for retaining or removing sites in alignments. First, we created preliminary alignments to correct potential frameshifts in our sequences. We then created two sets of alignments (AliS and AliMG) using two commonly used approaches to test the impact of alignment algorithm on phylogenetic analyses for our dataset. We aligned individual genes using ClustalW v1.82 [68] and MUSCLE plugin in

Geneious Pro v5.5.6 [69]. For ClustalW we created three alignments with different

combinations of opening/extension gap penalties: 10/0.2, 12/4 and 5/1. Individual gene

alignments were concatenated and the three resulting alignments were compared using SOAP

[70] and only positions that were identical among the three alignments were retained in the

final alignment (AliS). For MUSCLE alignment, individual genes were aligned using default

parameters and the aligned genes were concatenated. We removed poorly aligned regions with

Gblocks online (Castresana Lab, molevol.cmima.csic.es/castresana/) with the options allowing

gap for all positions and 85% of the number of sequences for flanking positions (AliMG). We

quickly read both alignments and brought minor changes where sequences shown signs of

framshifts (both alignments are available as additional materials). The final SOAP (AliS) and

MUSCLE (AliMG) alignments comprised 3111 and 3485 amino acids, respectively.

102

We used the program Net from the MUST package [71] to estimate the amino-acid

composition per species in each of the alignments by assembling a 20 X 106 matrix containing

the frequency of each amino acid. This matrix was then displayed as a two-dimensional plot in

a principal component analysis, as implemented in the R package.

Phylogenetic inferences

We conducted phylogenetic analyses under Maximum Likelihood (ML) and Bayesian

(PB) frameworks using RAxML v7.2.6 and PhyloBayes v3.3, respectively [50, 72–77]. PB

analyses consisted of two chains over more than 11,000 cycles (maxdiff<0.2) using CAT,

GTR, and CAT+GTR models, and sampled every 10th tree after the first 100, 50 and 300 burn- in cycles, respectively for CAT, GTR and CAT+GTR. ML runs were performed for 1000 bootstrap iterations under the GTR model of sequence evolution with two parameters for the number of categories defined by a gamma (Γ) distribution and the CAT approximation. Under the ML framework, both analyses using the CAT approximation and Γ distribution of the rates across sites models yield nearly identical trees, suggesting that the GTR+CAT approximation does not interfere with the outcome of the phylogenetic runs for our dataset. In order to save computing time and power, we therefore opted for the CAT approximation with the GTR model for further tree search analyses under ML. We assessed the effect of missing data on cnidarian phylogenetic relationships in our trees by removing the partial sequences of C. americanus and H. coerulea. We also removed the coronate Linuche unguiculata given its

problematic position and that it is the only representative of its clade, which could introduce a

systematic bias. We then performed additional GTR analyses under the ML framework on the

reduced, 103 taxa alignment.

Evaluating phylogenetic inferences

103

We tested several relationships that had been earlier hypothesized under ML by

calculating the per-site log likelihood values with RAxML v7.2.6 under the GTR model. We

then used three commonly used tests (the Approximately Unbiased (AU) test, the Kishino-

Hasegawa (KH) test, and the Shimodaira-Hasegawa (SH) test, see [78]) to assess the validity of

several key phylogenetic relationships according to our data. To do so, we compared whether

our data significantly rejected the best trees conforming to each of the a priori hypotheses

(Table 2) using the CONSEL software [79].

For Bayesian inferences, a cross-validation was performed to find the model with the

best fit to the data. Each alignment is randomly split in two unequal parts: a “learning dataset”

with nine-tenth of the original positions and a “testing dataset” with one-tenth; ten replicates

are performed. The parameters of the model are estimated on the learning datasets for each

model, where a fixed topology is inferred with each model (the number of cycles is variable

between various models) and with a burnin corresponding to 1/10th of the total number of cycles required for convergence (maxdiff > 0.2). The estimated parameters are then used on the testing dataset to compute the likelihood of the topology, averaged over the ten replicates.

Finally, we sampled ten morphological characters we considered important from

previously published morphological matrices [10, 19, 58, 80] and mapped them on the

CATGTR based tree. To do so we used PAUP 4.0b10 for Unix [81] using DELTRAN and

ACCRAN character-state optimization models. The scoring of each character is detailed in

additional materials (Supplementary material).

Authors' contributions

EK participated in the design of the study, carried out ML phylogenetic analyses and

statistical tests, reconstructed the evolution of morphological characters, and drafted the

104 manuscript. BR carried out BI phylogenetic analyses and extracted the support values, performed the cross validation. HP and DL helped in the design of the study. AGC provided guidance on some analyses and the interpretation of the morphological characters. All authors contributed to the final version of the manuscript.

Acknowledgements

We would like to thank Matthew Palen for his wonderful drawings of cnidarians. We would also like to thank Niamh Redmond for their help and Henner Brinkmann for his highly insightful comments on this manuscript. This work was supported by a grant from the National

Science Foundation to DVL [DEB-0829783] and by funding from Iowa State University. EK wishes to acknowledge funding through the Smithsonian Predoctoral Fellow grant. Some samples used in this study came from work supported by the NSF-funded Cnidarian Tree of

Life Project (EF-0531779) to P. Cartwright, D. Fautin and AGC.

105

References 1. Daly M, Brugler MR, Cartwright P, Collins AG, Dawson MN, Fautin DG, France SC, Mcfadden CS, Opresko DM, Rodriguez E, Romano SL, Stake JL: The phylum Cnidaria: A review of phylogenetic patterns and diversity 300 years after Linnaeus*. Zootaxa 2007, 182:127-128. 2. Collins AG: Recent Insights into Cnidarian Phylogeny. Smithsonian Contributions to Marine Sciences 2009, 38:139-149. 3. Kim J, Kim W, Cunningham CW: A New Perspective on Lower Metazoan Relationships from 18S rDNA Sequences. Molecular Biology and Evolution 1999, 16:423-427. 4. Dawson MN: Some implications of molecular phylogenetics for understanding in jellyfishes, with emphasis on Scyphozoa. Hydrobiologia 2004, 530- 531:249-260. 5. Collins AG: Phylogeny of Medusozoa and the evolution of cnidarian life cycles. Journal of Evolutionary Biology 2002, 15:418-432. 6. Collins AG, Schuchert P, Marques AC, Jankowski T, Medina M, Schierwater B: Medusozoan phylogeny and character evolution clarified by new large and small subunit rDNA data and an assessment of the utility of phylogenetic mixture models. Systematic biology 2006, 55:97-115. 7. Mayer AG: Medusae of the world. Washington, D.C., Carnegie institution of Washington; 1910:386. 8. Thiel H: The evolution of Scyphozoa: A review. In Cnidaria and their Evolution. London: Academic Press; 1966. 9. Bayha KM, Dawson MN, Collins a. G, Barbeitos MS, Haddock SHD: Evolutionary Relationships Among Scyphozoan Jellyfish Families Based on Complete Taxon Sampling and Phylogenetic Analyses of 18S and 28S Ribosomal DNA. Integrative and Comparative Biology 2010, 50:436-455. 10. Cartwright P, Nawrocki a. M: Character Evolution in Hydrozoa (phylum Cnidaria). Integrative and Comparative Biology 2010:1-17. 11. Cartwright P, Evans NM, Dunn CW, Marques AC, Miglietta MP, Schuchert P, Collins AG: Phylogenetics of Hydroidolina (Hydrozoa: Cnidaria). Journal of the Marine Biological Association of the United Kingdom 2008, 88:1663-1672. 12. Kitahara MV, Cairns SD, Stolarski J, Blair D, Miller DJ: A comprehensive phylogenetic analysis of the Scleractinia (Cnidaria, Anthozoa) based on mitochondrial CO1 sequence data. PloS one 2010, 5:e11490. 13. Fukami H, Chen CA, Budd AF, Collins A, Wallace C, Chuang Y-Y, Chen C, Dai C-F, Iwao K, Sheppard C, Knowlton N: Mitochondrial and nuclear genes suggest that stony corals are monophyletic but most families of stony corals are not (Order Scleractinia, Class Anthozoa, Phylum Cnidaria). PloS one 2008, 3:e3222. 14. Medina M, Collins AG, Takaoka TL, Kuehl JV, Boore JL: Naked corals: skeleton loss in Scleractinia. Proceedings of the National Academy of Sciences of the United States of America 2006, 103:9096-100.

106

15. Bentlage B, Cartwright P, Yanagihara AA, Lewis C, Richards GS, Collins AG: Evolution of box jellyfish (Cnidaria: Cubozoa), a group of highly toxic . Proceedings. Biological sciences / The Royal Society 2010, 277:493-501. 16. Barbeitos MS, Romano SL, Lasker HR: Repeated loss of coloniality and in scleractinian corals. Proceedings of the National Academy of Sciences of the United States of America 2010, 107:11877-82. 17. Costello JH, Colin SP, Dabiri JO: Medusan morphospace: phylogenetic constraints, biomechanical solutions, and ecological consequences. Biology 2008, 127:265-290. 18. Kayal E, Bentlage B, Collins AG, Kayal M, Pirro S, Lavrov DV: Evolution of linear mitochondrial genomes in medusozoan cnidarians. Genome Biology and Evolution 2012, 4:1-12. 19. Marques AC, Collins AG: Cladistic analysis of Medusozoa and cnidarian evolution. Invertebrate Biology 2004, 123:23-42. 20. Collins AG, Bentlage B, Lindner A, Lindsay D, Haddock SHD, Jarms G, Norenburg JL, Jankowski T, Cartwright P: Phylogenetics of Trachylina (Cnidaria: Hydrozoa) with new insights on the evolution of some problematical taxa. Journal of the Marine Biological Association of the United Kingdom 2008, 88:1673. 21. Leclère L, Schuchert P, Cruaud C, Couloux A, Manual M, Leclère L, Manuel M: Molecular phylogenetics of Thecata (Hydrozoa, Cnidaria) reveals long-term maintenance of life history traits despite high frequency of recent character changes. Systematic Biology 2009, 58:509-26. 22. Leclère L, Schuchert P, Manual M: Phylogeny of the (Hydrozoa, Leptothecata): evolution of colonial organisation and life cycle. Zoologica Scripta 2007, 36:371-394. 23. Cavalier-Smith T, Allsopp MTEP, Chao EE, Boury-Esnault N, Vacelet J: Sponge phylogeny, animal monophyly, and the origin of the nervous system: 18S rRNA evidence. Canadian Journal of Zoology 1996, 74:2031-2045. 24. France SC, Rosel PE, Agenbroad JE, Mullineaux LS, Kocher T.: DNA sequence variation of mitochondrial large-subunit rRNA provides support for a two-subclass organization of the Anthozoa (Cnidaria). Molecular Marine Biology and Biotechnology 1996, 5:15-28. 25. Odorico DM, Miller DJ: Internal and external relationships of the Cnidaria: implications of primary and predicted secondary structure of the 5’-end of the 23S- like rDNA. Proceedings. Biological sciences / The Royal Society 1997, 264:77-82. 26. Berntson E a, France SC, Mullineaux LS: Phylogenetic relationships within the class Anthozoa (phylum Cnidaria) based on nuclear 18S rDNA sequences. Molecular phylogenetics and evolution 1999, 13:417-33. 27. Kayal E, Lavrov DV: The mitochondrial genome of Hydra oligactis (Cnidaria, Hydrozoa) sheds new light on animal mtDNA evolution and cnidarian phylogeny. Gene 2008, 410:177-186. 28. Lavrov DV, Wang X, Kelly M: Reconstructing ordinal relationships in the Demospongiae using mitochondrial genomic data. Molecular phylogenetics and evolution 2008, 49:111-24.

107

29. Park E, Hwang D-S, Lee J-S, Song J-I, Seo T-K, Won Y-J: Estimation of divergence times in cnidarian evolution based on mitochondrial protein-coding genes and the fossil record. Molecular Phylogenetics and Evolution 2012, 62:329-345. 30. Shao Z, Graf S, Chaga OY, Lavrov DV: Mitochondrial genome of the moon jelly Aurelia aurita (Cnidaria, Scyphozoa): A linear DNA molecule encoding a putative DNA-dependent DNA polymerase. Gene 2006, 381:92-101. 31. McFadden CS, Sanchez J a., France SC: Molecular Phylogenetic Insights into the Evolution of Octocorallia: A Review. Integrative and Comparative Biology 2010, 50:389-410. 32. Kan X-Z, Yang J-K, Li X-F, Chen L, Lei Z-P, Wang M, Qian C-J, Gao H, Yang Z-Y: Phylogeny of major lineages of galliform (Aves: Galliformes) based on complete mitochondrial genomes. Genetics and molecular research : GMR 2010, 9:1625-33. 33. Podsiadlowski L, Braband A, Struck TH, von Döhren J, Bartolomaeus T: Phylogeny and mitochondrial gene order variation in in the light of new mitogenomic data from . BMC genomics 2009, 10:364. 34. Rota-Stabelli O, Kayal E, Gleeson D, Daub J, Boore JL, Telford MJ, Pisani D, Blaxter M, Lavrov DV: Ecdysozoan Mitogenomics: Evidence for a Common Origin of the Legged Invertebrates, the . Genome Biology and Evolution 2010, 2:425-440. 35. Hoelzer GA: Inferring Phylogenies From mtDNA Variation: Mitochondrial-Gene Trees Versus Nuclear-Gene Trees Revisited. Evolution, 1997, 51:622-626. 36. Boore JL, Fuerstenberg SI: Beyond linear sequence comparisons: the use of genome- level characters for phylogenetic reconstruction. Philosophical transactions of the Royal Society of London. Series B, Biological sciences 2008, 363:1445-51. 37. Lavrov DV, Forget L, Kelly M, Lang BF: Mitochondrial genomes of two demosponges provide insights into an early stage of animal evolution. Molecular biology and evolution 2005, 22:1231-9. 38. Gazave E, Lapébie P, Renard E, Vacelet J, Rocher C, Ereskovsky AV, Lavrov DV, Borchiellini C: Molecular phylogeny restores the supra-generic subdivision of homoscleromorph sponges (Porifera, Homoscleromorpha). PloS one 2010, 5:e14290. 39. Brugler MR, France SC: The complete mitochondrial genome of the black coral Chrysopathes formosa (Cnidaria:Anthozoa:Antipatharia) supports classification of antipatharians within the subclass Hexacorallia. Molecular Phylogenetics and Evolution 2007, 42:776-788. 40. Pick KS, Philippe H, Schreiber F, Erpenbeck D, Jackson DJ, Wrede P, Wiens M, Alié A, Morgenstern B, Manuel M, Wörheide G: Improved phylogenomic taxon sampling noticeably affects nonbilaterian relationships. Molecular Biology and Evolution 2010, 27:1983-1987. 41. Philippe H, Roure B: Difficult phylogenetic questions: more data, maybe; better methods, certainly. BMC biology 2011, 9:91. 42. Philippe H, Brinkmann H, Lavrov DV, Littlewood DTJ, Manuel M, Wörheide G, Baurain D, Baurin D: Resolving Difficult Phylogenetic Questions : Why More Sequences Are Not Enough. PLoS Biology 2011, 9:e1000602.

108

43. Philippe H, Zhou Y, Brinkmann H, Rodrigue N, Delsuc F: Heterotachy and long- branch attraction in phylogenetics. BMC evolutionary biology 2005, 5:50. 44. Roure BB, Philippe HH: Site-specific time heterogeneity of the substitution process and its impact on phylogenetic inference. BMC evolutionary biology 2011, 11:17. 45. Zrzavy J, Mihulka S, Kepka P, Bezdek A, Tietz D: Phylogeny of the Metazoa Based on Morphological and 18S Ribosomal DNA Evidence. Cladistics 1998, 14:249-285. 46. Siddall ME, Martin DS, Bridge D, Desser SS, David K, Martint DS, Bridget D, Conell DK: The Demise of a Phylum of Protists: Phylogeny of Myxozoa and Other Parasitic Cnidaria. The Journal of Parasitology 1995, 81:961-967. 47. Schuchert P: Phylogenetic analysis of the Cnidaria. Issue Journal of Zoological Systematics and Evolutionary Research 1993, 31:161-173. 48. Evans NM, Holder MT, Barbeitos MS, Okamura B, Cartwright P: The Phylogenetic Position of Myxozoa: Exploring Conflicting Signals in Phylogenomic and Ribosomal Datasets. Molecular Biology 2010. 49. Evans NM, Lindner A, Raikova EV, Collins AG, Cartwright P: Phylogenetic placement of the enigmatic parasite, hydriforme, within the Phylum Cnidaria. BMC evolutionary biology 2008, 8:139. 50. Lartillot N, Brinkmann H, Philippe H: Suppression of long-branch attraction artefacts in the animal phylogeny using a site-heterogeneous model. BMC evolutionary biology 2007, 7 Suppl 1:S4. 51. Collins AG, Winkelmann S, Hadrys H, Schierwater B: Phylogeny of Capitata and Corynidae (Cnidaria, Hydrozoa) in light of mitochondrial 16S rDNA data. Zoologica Scripta 2005, 34:91-99. 52. Budd a. F, Romano SL, Smith ND, Barbeitos MS: Rethinking the Phylogeny of Scleractinian Corals: A Review of Morphological and Molecular Data. Integrative and Comparative Biology 2010, 50:411-427. 53. McFadden CS, Benayahu Y, Pante E, Thoma JN, Nevarez PA, France SC: Limitations of mitochondrial gene barcoding in Octocorallia. Molecular Ecology Resources 2011, 11:19-31. 54. Bilewitch JP, Degnan SM: A unique event has provided the octocoral mitochondrial genome with an active mismatch repair gene that has potential for an unusual self-contained function. BMC Evolutionary Biology 2011, 11:228. 55. McFadden CS, France SC, Sánchez J a, Alderslade P: A molecular phylogenetic analysis of the Octocorallia (Cnidaria: Anthozoa) based on mitochondrial protein- coding sequences. Molecular phylogenetics and evolution 2006, 41:513-27. 56. Finnerty JR, Pang K, Burton P, Paulson D, Martindale MQ: Origins of bilateral symmetry: Hox and dpp expression in a . Science 2004, 304:1335-1337. 57. Matus DQ, Pang K, Marlow H, Dunn CW, Thomsen GH, Martindale MQ: Molecular evidence for deep evolutionary roots of bilaterality in animal development. Proceedings of the National Academy of Sciences of the United States of America 2006, 103:11195-200. 58. Van Iten H, de Moraes Leme J, Simões MG, Marques AC, Collins AG: Reassessment of the phylogenetic position of conulariids (?) within the

109

subphylum medusozoa (phylum cnidaria). Journal of Systematic Palaeontology 2006, 4:109-118. 59. Dunn CW: Complex colony-level organization of the deep-sea siphonophore Bargmannia elongata (Cnidaria, Hydrozoa) is directionally asymmetric and arises by the subdivision of pro-buds. Developmental dynamics : an official publication of the American Association of Anatomists 2005, 234:835-45. 60. Jiménez-Guri E, Philippe H, Okamura B, Holland PWH: Buddenbrockia is a cnidarian . Science (New York, N.Y.) 2007, 317:116-8. 61. Palmer a R: Symmetry breaking and the evolution of development. Science (New York, N.Y.) 2004, 306:828-33. 62. Levin M: Left-right asymmetry in : a comprehensive review. Mechanisms of development 2005, 122:3-25. 63. Nielsen C: Six major steps in animal evolution: are we derived sponge larvae? Evolution & development 2008, 10:241-57. 64. Uchida T: The systematic position of the . Publishings of the Seto Marine Biology Laboratory 1972, 20:133-139. 65. Werner B: New investigations on systematics and evolution of the class Scyphozoa and the phylum Cnidaria. Publishings of the Seto Marine Biology Laboratory 1973, 20:35-61. 66. Gibbons MJ, Janson L a., Ismail A, Samaai T: Life cycle strategy, species richness and distribution in marine Hydrozoa (Cnidaria: Medusozoa). Journal of Biogeography 2010, 37:441-448. 67. Boero F, Schierwater B, Piraino S: Cnidarian milestones in metazoan evolution. Integrative and comparative biology 2007, 47:693-700. 68. Thompson JD, Higgins DG, Gibson TJ: CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position- specific gap penalties and weight matrix choice. Nucleic Acids Research 1994, 22:4673-80. 69. Drummond A, Ashton B, Buxton S, Cheung M, Cooper A, Duran C, Field M, Heled J, Kearse M, Markowitz S, Moir R, Stones-Havas S, Sturrock S, Thierer T, Wilson A: Geneious Pro v5.5.6. 2011. 70. Löytynoja A, Milinkovitch MC: SOAP, cleaning multiple alignments from unstable blocks. Bioinformatics 2001, 17:573-574. 71. Philippe H: MUST, a computer package of Management Utilities for Sequences and Trees. Nucleic Acids Research 1993, 21:5264-5272. 72. Lartillot N, Philippe H: Computing Bayes factors using thermodynamic integration. Systematic biology 2006, 55:195-207. 73. Lartillot N, Philippe H: A Bayesian mixture model for across-site heterogeneities in the amino-acid replacement process. Molecular biology and evolution 2004, 21:1095- 109. 74. Ott M, Zola J, Stamatakis A: Large-scale maximum likelihood-based phylogenetic analysis on the IBM BlueGene/L. In Proceedings of ACM/IEEE Supercomputing Conference 2007. 2007.

110

75. Stamatakis A: RAxML-VI-HPC: maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models. Bioinformatics (Oxford, England) 2006, 22:2688-90. 76. Stamatakis a.: Phylogenetic models of rate heterogeneity: a high performance computing perspective. Proceedings 20th IEEE International Parallel & Distributed Processing Symposium 2006:8 pp. 77. Stamatakis A, Hoover P, Rougemont J: A rapid bootstrap algorithm for the RAxML Web servers. Systematic biology 2008, 57:758-71. 78. Shimodaira H: An approximately unbiased test of selection. Systematic biology 2002, 51:492-508. 79. Shimodaira H, Hasegawa M: CONSEL: for assessing the confidence of phylogenetic tree selection. Bioinformatics (Oxford, England) 2001, 17:1246-7. 80. Won J, Rho B, Song J: A phylogenetic study of the Anthozoa (phylum Cnidaria) based on morphological and molecular characters. Coral Reefs 2001, 20:39-50. 81. Swofford DL: PAUP* 4.0. 2002, 42:1-144.

111

A B

Anthozoa Medusozoa Anthozoa Medusozoa H e x a c o r l i O c o t r a l i S t a u r o z C u b o z a S c y p h o z a H y d r o z a H e x a c o r l i O c o t r a l i S c y p h o z a H y d r o z a

Figure 1 - Alternative hypotheses of the cnidarian tree of life.

A. Current view of cnidarian evolutionary history based on rRNA phylogenies. B. Hypothesis of cnidarian relationships obtained using mitochondrial protein coding genes.

112

* Placozoa 0.93/1/99 Porifera * * * Actiniaria 1/1/79 * Antipatharia 1/1/89 * H * Corallimorpharia 1/1/96 e * * x 1/1/84 * a * * c

* o

0.47/0.24/16 r

a 0.79/-/16 * l * Scleractinia l

i

1/1/99 * a * *

* * 1/1/92 1/1/70 1/1/99 Ceriantharia 1/1/95 0.99/-/7 0.88/-/22 Pennatulacea O 1/1/95 Helioporacea * * c 0.62/0.93/32 t * o

c

0.99/0.39/20 o 1/1/98 r Alcyonacea a *

l

1/1/99 l 0.9/0.99/55 i

a 1/1/98 0.99/0.72/42 0.48/-/- 0.99/1/84 0.84/0.10/12 Coronatae * Stauromedusae 0.99/0.95/56 Staurozoa 0.56/1/81 0.99/1/96 Carybdeida * 0.58/-/- Cubozoa * Chirodropida H

* Limnomedusae y * 0.99/1/100 Filifera III

Filifera IV d * Leptothecata 1/1/99 r

0.99/-/- Capitata o 1/1/90

* z

e a s u d e m o c s i * D

1/1/73 * Aplanulata o a

* * 1/1/99 Semaeostomeae * * 0.77/0.99/88 * * Rhizostomeae 0.5 1/1/99

113

Figure 2 - Cnidarian phylogenetic hypothesis based on mitochondrial protein genes.

Phylogenetic analyses of cnidarian protein coding genes under the CATGTR model with

PhyloBayes for the AliS alignments (3111 positions, 106 taxa) created using ClustalW+SOAP.

Support values correspond to the posterior probabilities for the GTR(BI) and bootstraps for

GTR(ML) analyses. Stars denote maximum support values. A dash denotes discrepancy between the results obtained when assuming different models.

114

Zoantharia Hexacorallia

Actiniaria

1r Antipatharia

Corallimorpharia

Scleractinia

10n Ceriantharia

1b Alcyonacea 10g Helioporacea Pennatulacea Octocorallia

4 6 7 Coronatae Coronatae

2- Stauromedusae Staurozoa 9 Carybdeida Cubozoa 7? Chirodropida 1r Limnomedusae 2+ Hydrozoa 5+ 3+ Filifera III 10e 5- 10n Filifera IV

Leptothecata 7 Capitata 8

Aplanulata

Semaeostomeae 4 6 Rhizostomeae Discomedusae

115

Figure 3 - Morphological character reconstruction in Cnidaria.

Reconstruction of key morphological characters in Cnidaria based on the phylogenetic hypothesis based on mitogenomic data. (1) symmetry: b = bilateral, r = radial or biradial; (2) gain (+) and loss (-) of free-swimming medusoid stage (many hydrozoans have secondary lost the medusoid phase); (3) velum (lost in some leptothecate hydrozoans); (4) strobilation; (5) gastric filaments; (6) ephyrae; (7) radial canal (scored uncertain in Cubozoa); (8) circular canal;

(9) square symmetry of horizontal cross section; (10) gastrodermal muscles: e = in bunches of ectodermal origin, g = in bunches of gastrodermal origin, n = not organized in bunches.

Morphological characters are taken from Marques and Collins (2004) and Cartwright and

Nawrocki (2010) studies. Drawings were provided by Matthew Palen.

116

Table 1 - Support values for topologies in the mtDNA-based phylogeny of cnidarians.

Bootstraps and BI values for several clades traditionally recognized in Cnidaria.

AliS AliMG Taxon GTR-ML GTR-BI CATGTR GTR-ML GTR-BI CATGTR Cnidaria 7 0 0.99 12 0 1 Anthozoa 0 0 0 0 0 0

Hexacorallia 16 0.24 0.48 12 0.22 0.57 Antipatharia 100 1 1 100 1 1 Actiniaria 100 1 1 100 1 1 Corallimorpharia 100 1 1 100 1 1 Scleractinia 16 0 0.80 12 0 0.51 Zoantharia 100 1 1 100 1 1

Octocorallia 100 1 1 100 1 1 Alcyonacea 3 0.02 0 7 0 0.01 Pennatulacea 95 1 1 96 1 1

Medusozoa 100 1 1 100 1 1 Acraspeda 0 0 0 0 0 0 Cubozoa 100 1 1 100 1 1 Carybdeida 96 1 0.99 99 1 1 Chirodropida 100 1 1 100 1 1 Hydrozoa 100 1 1 100 1 1 Aplanulata 100 1 1 100 1 1 Capitata 99 1 1 95 1 1 Scyphozoa 3 0 0 0 0 0 Discomedusae 100 1 1 100 1 1 Rhizostomeae 12 0.01 0.22 11 0 0.14 Semaeostomeae 0 0 0 0 0 0 Staurozoa 100 1 1 100 1 1

Placozoa 100 1 1 100 1 1 Porifera 99 1 0.94 100 1 0.88 Homoscleromorpha 100 1 1 100 1 1 Demospongiae 100 1 1 100 1 1 Keratosa 100 1 1 100 1 1 Myxospongiae 100 1 1 100 1 1 marine Haplosclerida 100 1 1 100 1 1 Democlavia 100 1 1 100 1 1

117

Table 2 - The use of several statistical tests verifying the validity of some groups in cnidarians.

Probability values for the AU, KH and SH tests and BI values for several clades traditionally recognized in Cnidaria.

AliS AliMG AU KH SH AU KH SH Cnidaria 0.46 0.23 0.91 0.27 0.15 0.86 Anthozoa 0.69 0.42 0.98 0.63 0.41 0.93 Hexacorallia 0.39 0.21 0.84 0.48 0.32 0.86 Scleractinia 0.16 0.09 0.52 0.25 0.17 0.53 Acraspeda* 0.84 0.58 1.00 0.80 0.59 0.99 Rhisostoma 0.16 0.91 0.55 0.24 0.17 0.64 Scyphozoa 0.20 0.14 0.50 0.10 0.08 0.42 Scyozoap + Cubozoa + Hydrozoa* 0.01 0.01 0.42 0.12 0.14 0.41 Scyphozoa + Cubozoa + Staurozoa* 0.02 0.02 0.20 0.04 0.01 0.27 Scyphozoa + Staurozoa* 0.06 0.06 0.24 0.01 0.04 0.25 Semaeostoma 0.01 0.01 0.18 0.01 0.02 0.18

* similar results have been obtained for a smaller dataset that does not contain the sequences from the coronate Linuche unguiculata, the tube anemone Ceriantheopsis americanus and the blue octocoral Heliopora coerulea.

118

CHAPTER 5. THE ERATIC EVOLUTION OF ANIMAL

MITOCHONDRIAL DNA

A paper to be submitted to Mitochondrial DNA

Ehsan Kayal

Abstract

Since the original endocytosis event of the α-proteobacteria ancestor, the endosymbiotic mitochondrion has experienced various degrees of genome reduction in different eukaryotic lineages. In animals, the general tendency appears to be a small and fairly uniform sized mitochondrial DNA (mtDNA), with nearly absent intergenic regions and non- coding DNA, a relatively stable number of RNA and protein genes. Such pattern has led to the minimalistic view of “frozen” mitogenome in metazoans. Yet, an increasing amount of studies, particularly from disregarded taxa, have documented exceptional patterns of genome evolution in several animal groups. Here, I summarize major deviations from the “typical” animal mtDNA with respect to their size, gene content, and genome organization. The overall picture of such survey appears to be of a more plastic genome than previously suggested. This also hinders any future attempt to define a single evolutionary trend, also known as a “unifying theory”, for the evolution of animal mtDNA.

119

Introduction

Mitochondria are -like organelles found in nearly all eukaryotes and involved in a variety of cellular functions, the most ubiquitous being oxidative phosphorylation. The mitochondrial ancestor is thought to have possessed a wide array of genes encoding the range of metabolic pathways necessary for a free-living α-proteobacteria- like organism. The endosymbiosis of ancestral mitochondria has then triggered the loss and/or transfer to the nucleus of most prokaryotic genes, with the subsequent import of the proteins needed for mitochondrial (mt) functions [1]. Yet, the mitochondria in all extent organisms possess a genome, the mitochondrial DNA (mtDNA), that encodes at least for part of the regulatory genes involved in its expression, and of several subunits of the complexes of molecules involved in the oxidative phosphorylation chain [2]. Various eukaryotic lineages display different trends in the evolution of their mtDNA with respect to their size, gene content, genome architecture and sequence evolution. The size of the mt genome ranges from a few kb in some protists to >2Mbp in some plants [3]. Genome compactness accounts for most of the disparity in mtDNA size variation, influenced by the distribution of non-coding DNA such as intergenic regions, introns, and repeated sequences [4]. On one side of the size spectrum stands the "typical" animal mtDNA (see below) with a nearly constant and small number of genes, virtually absent intergenic regions, and the lack of introns and repeats. On the other side of the size spectrum stand plant mtDNAs with a higher number of intron-rich protein and rRNA genes separated by long stretches of non-coding DNA.

In addition, the number of protein genes encoded by the mtDNA varies from three

(smallest number found to date) in the apicomplexan human parasites spp. [5] and dinoflagellates [6, 7], to <100 in the jakobid protist Reclinomonas americana [8] which has

120 been suggested to resemble the ancestral mtDNA. Finally, eukaryotes mtDNAs display a large assortment of architectures, ranging from a single circular prokaryotic-like genome in many plants, protists and most animals, to linear chromosomes of various sizes in ciliates, green algae, yeast, one cnidarian clade and some crustacean isopods [2, 9–18], with intermediary

“lasso-like” structures as found in the large grass Pennisetum typhoides [19], to the complex kinetoplasts DNA (kDNA) found in Kinetoplastida [20].

In animals, the typical mtDNA is a small (~17kb) single circular molecule that encodes a total of 37 genes, 13 of which are protein genes involved in the oxidative phosphorylation pathway, and 24 are RNAs involved in mitochondrial protein synthesis. The 13 protein genes are subunits 6 and 8 of ATP synthase (atp6 and atp8), apocytochrome b (cob), three subunits of cytochrome oxidase (cox1-3), and subunits 1-6 and 4L of the NADH dehydrogenase complex

(nad1-6 and nad4L). RNA genes encode the small and large ribosomal RNA (rRNA) subunits

(rns and rnl), and 22 transfer RNA (tRNA) genes (one per amino acids plus two additional ones for both leucine and serine) [21, 22]. Such apparent conservation in size and organization of animal mtDNA is in part responsible for propelling it into the front line of animal phylogenetics. In fact, the relatively easy access to complete genomes in most metazoan groups, higher rate of sequence evolution with respect to nuclear sequences [23], overall conservation of the genome structure providing organizational characters such as gene order

(considered as Rare Genetic Changes or RGC) [24, 25] have all favored the sequencing of complete mitogenomes from more than 2500 animal species, 2/3 of which belong to vertebrates and more than 400 to arthropods (Figure 1).

The current understanding of animal mtDNA is highly influenced by vertebrates and arthropods mtDNAs, together representing ~83% of the total number of published

121 mitogenomes to date. As a consequence, the reference animal mtDNA resembles , and more particularly mammalian mtDNA. Such bias can be traced back to the history of mitogenomics where the first completely sequenced mtDNAs were from humans [26]. Yet, recent sampling outside Deuterostomia, and in particular from non-bilaterian animals, has revealed additional levels of complexity in the structure and organization of animal mtDNA

[23]. These studies suggest that animal mtDNA is not "frozen genome" [27] and mammalian mitogenomes are not good representatives of the diversity of mtDNA organizations in animals.

Here I review the current understanding of animal mitogenomes with respect to genome structure and organization, and gene content. Overall, the pattern of mtDNA evolution shows no clear correlation with animal evolutionary history, even less with size and tissue complexity.

In addition, given the incomplete picture of animal mtDNA diversity, more deviations from the typical animal mitogenome shall be expected as additional metazoan taxa are explored with higher scrutiny.

Mitochondrial genome structure and organization

Mitochondrial genomes are commonly mapped into a single circular plasmid (i.e. a typically circular double stranded DNA molecule independent from genomic DNA) structure.

Yet an increasing amount of studies have been reporting deviation from such monochromosomal “plasmidic” view of the mtDNA in various organisms across the tree of life

[9, 28]. In animals, multi chromosomal mtDNA has been described in the rotifer Brachionus plicatilis [29], in some medusozoan cnidarians [13, 15, 30], and some nematode species [31–

33]. In , mitochondrial mini-circles have also been reported in species that possess monochromosomal mtDNAs, a potential intermediary state toward multipartite mtDNA [34].

122

In arthropods, several parasitic lice species (order Psocodea) have also evolved multipartite

mtDNA composed of up to 18 minicircles [35, 36]. Recently, researchers have shown that the

mitochondrial genome of the booklice Liposcelis bostrychophila is in two circular molecule of

nearly identical size but of unequimolar representation: one chromosome is twice as abundant

as the other one [37].

In the phylum Cnidaria, the mtDNA of medusozoans (Cubozoa, Hydrozoa,

“Scyphozoa”, and Staurozoa) is organized into one or several linear molecules [13, 15, 17, 18,

38–43]. To date, the only other case of linear mtDNA in animals has been found in a group of

crustacean isopods where the mtDNA consists of a circular ~28-kb dimer and a linear ~14-kb

monomer [10–12, 14]. Finally, preliminary data from calcareous sponges also hint for the

possibility of multipartite, and most likely linear mtDNA organization in this group (this

thesis). Outside animals, linear mtDNA has been documented in a wide array of organisms

such as fungi, algae, and ciliates [44–50].

Unlike circular molecules, linear mtDNA has to overcome the “challenges of the ends”

[47], namely a higher susceptibility of the linear molecules to exonuclease activity and their

shortening after each replication cycle. In the nucleus, the ends of the linear chromosomes are

capped by telomeres that both protect the molecule from degradation and allow its replication

in a faithful manner. Given the lack of -dependent telomeres in the mitochondrion,

various organisms have innovated alternative mechanisms that nearly always involves repeated

sequences with opposite orientation at each end of the linear molecule [9, 47, 51]. One interesting mechanism reported in some yeast species, is the concomitant presence of a terminal protein binding the 5’ends of the linear mitochondrial molecule(s) and a B-type DNA

polymerase (polB-like) gene encoded in the genome (e.g. [45]). polB has been linked to the

123 incorporation by the mtDNA of a linear fungal plasmid, where the invasion of by the linear plasmid linear resulted in the linearization of the mitochondrial molecule [52]. Recent discovery of polB-like genes in the mito-genomes of some medusozoan species [13, 15, 17, 18] provides further support for the horizontal transfer hypothesis, potentially after infection by a fungi that possessed the linear plasmid.

The size of animal mitochondrial genome is not uniform

The most recent review of animal mitochondrial genomes emphasized an overall conservation in mtDNA size for most taxa [22]. Animal mtDNA size ranges from just above

10 kbp in ctenophorans to >43 kbp in the placozoan adhaerens [53]. It has been suggested that "key transitions in animal evolution" (multicellularity and bilateral symmetry) are reflected by the evolution of their mitogenomes [23], where the advent of bilateral symmetry has been accompanied with several genetic novelties in the mtDNA, such as changes in the genetic code and loss of tRNA genes. For instance, bilaterian mtDNAs are generally smaller in size compared to non-bilaterian counterparts, due to the more compact nature of their genome and smaller sized genes. In contrast, non-bilaterians mitogenomes are considered more permissive to non-coding DNA, where larger-sized genes are interspaced by intergenic regions (IGRs) and introns that can account for up to 34% of the genome in sponges [54, 55] and 45% in placozoans [56]. However, this pattern does not hold in the light of the most recent studies of non-bilaterians. The recently sequenced mitogenomes from the ctenophorans

Mnemiopsis leidyi [57] and Pleurobrachia bachei [58], typical non-bilaterian animals, are the smallest for metazoan to date. Likewise, all published mitogenomes from medusozoans (one of the two classes within Cnidaria) resemble bilaterians, with the quasi absence of non-coding

124 regions and small-sized genes [13, 15–18, 30]. Presently, it is not clear whether these small mtDNAs are evolutionary curiosities among an otherwise homogenous set of sponge-like non- bilaterian mitogenomes. Yet an increasing number of evidence suggests that the typically large mtDNA is not a good representative for all non-bilaterian animals. Similarly, tiny mtDNAs found in chaetognaths [59–62] and at least one nematode [63] also suggest some level of size- plasticity in bilaterian.

In contrast, large size mtDNAs have been documented amid several bilaterian taxa. In mollusks, large mitogenomes resulted from partial segmental duplication of coding regions

[64–67]. In nematodes, large non-coding regions found in several species were generated after, and as a result of, the segmentation of their mtDNA (see below) [31–33, 36]. Some instances of high variation in the size of bilaterian mtDNA appear to be artifacts from either PCR amplification or sequencing. Recent examples of such experimental errors include the apparent

“tandem duplication-random loss” mechanism proposed for explaining the partial duplication of genes in the oyster Crassostrea hongkongensis [68, 69] and the particularly large mitogenome of the species Metaseiulus occidentalis [70]. In nematodes, authors have verified the validity the duplication responsible for the large size mtDNA of the enoplean

Romanomermis culicivorax [71]. Overall, mitochondrial genome “gigantism” resulting from partial duplication is rather rare in animals, and suspicious cases such as the

Lingula anatine [72], and other enoplean nematodes [73, 74] necessitate to be investigated thoroughly. Overall, animal mtDNA does not necessary appear to tend toward the size variation of. Instead, the big picture might most likely resemble a randomly distributed size range that does not match the metazoan tree of life, and most importantly, the size does not decrease with the increase in animal body and organization complexity.

125

A wide array of mitochondrial genetic content

As mentioned earlier, the "typical" (mammalian-based) animal mtDNA harbors 13 intron-less protein genes, 22 tRNA genes and 2 rRNA genes. Such generalization does not apply to non-bilaterian animals where additional genetic material (such as additional protein genes and ORFs, introns, and tRNAs) can be found. In the published mtDNAs of Placozoa and sponges, an additional gene for subunit 9 of ATP synthase (apt9) is also present [54–56, 75–

85], while a subunit for twin-arginine translocase (tatC) has been found in some homoscleromorphs [77, 85], and a lineage-specific ORF has been found in the mtDNA of some marine mussel species [86]. atp9 is found in the unicellular relatives of animals, the choanoflagellate Monosiga brevicollis and the ichthyosporean Amoebidium parasiticum [2], while it has been transferred to the nucleus in all other animal groups. Consequently, the presence of atp9 in the mtDNA of sponges and placozoans is considered an ancestral state for metazoan. Another ancestral feature of animal mtDNA is the presence of the self-splicing group I intron in cox1 that usually harbors a putative homing endonuclease responsible for its mobility, so far reported only in placozoan [56, 87], and some representatives from demosponges and hexacorals [80, 85, 88, 89]. No group I intron has been found in glass sponges [76, 79] or any other animal group, suggesting a tendency to be lost. This is illustrated by loss of the intron in several hexacorals mtDNA where an unidentified ORF can be found upstream or downstream of cox1. Another intron is present within nad5 in all hexacorals (e.g.

[90–92]) and placozoans [56, 87]. This feature is shared with the choanoflagellate M. brevicolis, but has been lost in all other cnidarians, sponges, and other animal groups. Two occurrences of a second type of intron (group II intron), have been recently reported in the

126 Nephtys sp [93] and the placozoan Trichoplax adhaerens [53] that putatively originated from a horizontal transfer from either a viral or bacterial vector.

The mtDNA of non-bilaterian animals can also contain additional protein genes of extra-mitochondrial origin. In octocorals, some studies have suggested a nuclear origin for the putative mismatch repair protein mutS [43, 94, 95], illustrating the first potential transfer of nuclear gene into animal mtDNA. A DNA-dependent DNA polymerase polB has also been described in one strain of placozoans and the jellyfish Aurelia aurita [13, 15, 17, 18, 56].

Overall non-bilaterian mtDNA seems to have been at least to some extent porous to invasion of alien materials.

The loss of genetic information is by far the most widespread deviation from a typical animal mtDNA. Subunits of the ATP synthase (atp8 and in a lesser extend atp6) have been lost independently in animal species, most of which are characterized by a high rate of mtDNA sequence evolution. atp8 was lost in all three groups parasitic (, Monogena and ) [22, 96, 97], the acoelomorph Symsagittifera roscoffensis [98], and the free living planarians Dugesia japonica (NC_016439) and Schmidtea mediterranea (unpublished data), the bryozoan Tubulipora flabellaris [99], several bivalve species [100, 101], and in all nematodes with the notable exception of spiralis [31, 32, 34, 102–106]. atp8 appears to be a disposable gene in animal mtDNA, with recurrent losses across the metazoan tree of life (Figure 2).

Other instances of protein gene loss from the NADH dehydrogenase complex have been reported in three vertebrate species (reviewed in [107]) and the turbellarian Dugesia japonica (for nad4L), but additional investigations are needed to rule out cases of PCR artifacts

[68]. Analysis of complete nuclear DNA (nDNA) will provide additional insights on the

127

potential transfer on these genes from the mitochondrion to the nucleus. Two extreme cases of

mtDNA reduction are found in ctenophorans and in chaetognaths, where both atp6 and atp8 genes have been lost [57–62]. Not surprisingly, atp6 has been detected in the nDNA of both ctenophorans [57, 58]. mtDNA-encoded RNAs are prone to a high level of variation in animals

Protein translation in animal mitochondria utilizes a repertoire of RNAs that are typically encoded by their mtDNA [108]. Such repertoire is constituted of two genes for the large and small subunits of the ribosomal RNAs (rnl and rns), and up to 24 tRNA genes.

Unusual rRNA genes are found in placozoans and bivalves, where rnl may be encoded by two or more DNA fragments [56, 68, 109]. High levels of gene segmentation of both subunits of rRNA have been recently demonstrated in the calcareous sponge Clathrina clathrus (this thesis). It is noteworthy mentioning that fragmented rRNA genes have been documented in many non-animal groups, such as bacteria, green algae and some protists [7, 110–115].

Putative functional small rRNA genes have been inferred in parasitic flatworms (i.e. [97])

where despite a decrease in gene sizes, the secondary structure is well conserved. Extremely

small rRNAs were inferred in the recently sequenced mtDNA from the ctenophoran

Mnemiopsis leidyi [57, 58]. However, it is unclear whether these RNA genes are functional or

pseudogenes. For instance, fragmented rRNA genes could explain the lack of conserved

secondary structure in these sequences. Empirical evidences from ESTs and RT-PCR based

techniques are likely to shed light on the function of these reduced putative RNA genes.

The typical animal mtDNA contains a whole set of tRNAs (one tRNA per codon

family), each of which corresponding to a given amino acid residue, with the exception of two

128

tRNAs for two codon families encoding leucine and serine. Yet, variation in the number of

mitochondrial encoded tRNAs has been reported in several animal groups. In , Gissi

and collaborators reviewed six instances of tRNA loss and at least 20 duplicated tRNA genes

[107]. Loss of tRNA genes is most widespread in non-bilaterian animals, as reported in some

demosponges and all cnidarians mtDNAs (e.g. [13, 15, 22, 80]). Recent analysis of the ctenophoran M. leidyi mtDNA suggests the total absence of tRNA genes accompanied by apparent parallel losses of all nuclear encoded mitochondrial aminoacyl-tRNA synthetases (mt-

aaRS) [57]. Yet, a second completely sequenced mtDNA from the ctenophoran Pleurobrachia

bachei encodes sequences similar to two tRNAs [58]. It is still unclear whether the absence of

mt-aaRS is a genuine feature of ctenophoran genomes or the reflection of what is so far an

unfinished analysis of the nuclear genome. Future analysis of the expression and the tRNA-like

genes in P. bachei and of the alternative import of nuclear tRNAs in the mitochondrion shall

shed light on the evolution of ctenophoran mtDNA.

Extreme cases of tRNA loss has also been found in the bilaterian clade ,

with only one tRNA gene (trnM) present in all five species sequenced to date [60, 61].

Interestingly, in the first description of Spadella cephaloptera mtDNA, the authors suggested

the complete loss of all tRNAs in [59]. But a more recent study reanalyzed the three mtDNAs

from chaetognaths available at the time and showed the presence of trnM in S. cephaloptera

mtDNA [62]. Similarly, whether the apparent loss of all tRNAs in ctenophoran is real or an

artifact resulting from the limitations of the purely bioinformatics methods used still have to be

determined.

Duplicated tRNA genes have also been reported in nematodes, bivalves and the

cirripede Pollicipes polymerus [22]. In addition to tRNA gene loss, animal mitochondrial

129 tRNAs show some atypical features not found in non-metazoan taxa. Whilst most mt-tRNAs have very similar inferred secondary structure to their nuclear counterparts, some tRNAs are truncated of part or a complete arm. For instance, tRNA-Ser I displays a shorter sequence with the loss of the dihydrouridine (DHU) arm in most animals. The TΨC arm is lost for several tRNAs in the highly derived mtDNA of nematodes [116, 117], the acanthocephalan

Leptorhynchoides thecatus [118], and some arthropods.

Edited tRNAs have been reported in several animal taxa [119]: for example in arthropods (e.g. [120]), mollusks (e.g. [121, 122]), and marsupials (e.g. [123, 124]). An extreme case of degenerated tRNAs is found in onychophorans (velvet worms) mtDNA where some authors have suggested the loss of most mitochondrial tRNA genes [125–127]. Yet a more comprehensive survey of these and other mtDNAs from velvet worms revealed that all tRNAs are present but incomplete, with the TΨC arm lacking at DNA level, which are edited at RNA level to become functional [128, 129]. Finally, recent survey of one calcareous sponge mtDNA suggested template-dependent tRNA editing (this thesis).

Changing the code is not limited to bilaterian animals

The standard genetic code, also known as the "canonical code", used for protein synthesis in the nucleus and organelles has been altered in various living organisms. Animal mtDNA displays the highest variation in their genetic code (Figure 2), with at least 13 instances documented to date [76, 79, 130–133]. The loss of tRNA genes in bilaterians has been tentatively associated with changes in the genetic code in bilaterian animals [23]. Yet, compensatory changes in the genetic code are not paralleled with most instances of tRNA loss, suggesting independence between the two events. For instance, cnidarians use the minimally

130 derived genetic code G4 (TGA Stop  Trp), yet they have lost nearly all their mitochondrial encoded tRNA genes. In parasitic flatworms, previous studies have shown at least two changes in the genetic code (AUA Met  Ile and AAA Lys  Asn), however the complete set of 22 tRNA genes are found in their mtDNA. Similarly, the minimally derived genetic code G4 was inferred for mitochondrial protein synthesis in ctenophoran where most [58] or all [57] tRNAs are missing from the mtDNA. Finally in chaetognaths, the mtDNA encodes only methionine, yet the genetic code for protein synthesis is the invertebrate mt-code G5 (AGR Arg  Ser;

AUA Met  Ile; TGA Stop  Trp) [60–62].

Changes in the genetic code seem to be correlated with a relative increase in the rate of sequence evolution in animals. Such pattern appears particularly evident when comparing the three major classes (Demospongia, Hexactinellida, and Calcarea) within the phylum Porifera

(sponges). The relatively slow evolving Demospongia (demosponges) use the minimally derived genetic code and their mtDNA typically encodes for 24 tRNAs (as described by the canonical code codon families). In Hexactinellida (glass sponges), recently sequenced mt- genomes showed a change in the mitochondrial genetic code (AGR Arg  Ser) convergent between these sponges and bilaterian animals [76, 79]. Recent survey of two calcarean species strongly suggests the use of at least one, potentially two novel genetic code(s) for mitochondrial protein synthesis, where UAG Stop  Tyr and CGN Arg  Gly in the order

Calcinea and putatively AUA Ile  Met in Calcaronea (this thesis). Interestingly, mt sequences from both glass sponges and calcarean display very high rate of sequence evolution similar (for the former) or higher (for the latter) than bilaterian animals.

131

Concluding remarks

Here I surveyed the range of available mtDNA sequences from animal, highlighting major deviations from the typical animal mitochondrial genome. The overall picture displays a wide range of evolutionary trajectories that challenges the classical view of a "frozen genome"

[27]. This is exemplified by the presence of various mtDNA organizations in closely related species. Recent studies of under represented taxa have revealed an assortment of novel genomic features of animal mtDNA. Future studies that explore "small" phyla such may provide insights into the evolutionary potentials of the mitochondrial genome in animals. The combination of the new sequencing technologies both at DNA and RNA levels provides unique tools for dealing with such task. One safe prediction is that no single theory can explain the pattern of mtDNA evolution in animals.

This work also underscored the need for verifying findings of unusual features (e.g. tRNA losses in onychophorans and potentially ctenophorans). In such cases, it is important to propose satisfactory mechanisms that would explain the evolution and maintenance of such pattern. In addition, with our current technological advances, limiting the analysis of lost genetic features to solely bioinformatics searches is unsatisfactory, particularly when a single case is considered, and empirical evidences shall be required before making any evolutionary statement. For instance, a recent study has proposed the parallel loss of nuclear-encoded mt aminoacyl-tRNA synthetases (aaRS) with their respective mt-encoded tRNAs in cnidarians

[134]. Yet the authors failed to explain the presence of a mitochondrial tRNA-Met while the aaRS seems to be lost in the nuclear DNA. This case is illustrative of the current trend in the field of molecular biology. With the ongoing increase in the number of genome sequencing studies, a real shift is underway from empirical experimentation toward more speculative

132 bioinformatics-based predictions. As a consequence, all newly sequenced genomes are poorly annotated. Today more than ever there is a need for independent corroboration of computation- based hypotheses.

133

References

1. Gray MW, Burger G, Lang B: The origin and early evolution of mitochondria. Genome Biology 2001, 2:1-5. 2. Burger G, Forget L, Zhu Y, Gray MW, Lang BF: Unique mitochondrial genome architecture in unicellular relatives of animals. Proceedings of the National Academy of Sciences of the United States of America 2003, 100:892-7. 3. Palmer JD, Adams KL, Cho Y, Parkinson CL, Qiu YL, Song K: Dynamic evolution of plant mitochondrial genomes: mobile genes and introns and highly variable rates. Proceedings of the National Academy of Sciences of the United States of America 2000, 97:6960-6. 4. Gray MW, Burger G, Lang BF: Mitochondrial evolution. Science (New York, N.Y.) 1999, 283:1476-81. 5. Wilson RJ, Williamson DH: Extrachromosomal DNA in the . Microbiology and molecular biology reviews : MMBR 1997, 61:1-16. 6. Nash E a, Nisbet RER, Barbrook AC, Howe CJ: Dinoflagellates: a mitochondrial genome all at sea. Trends in genetics : TIG 2008, 24:328-35. 7. Waller RF, Jackson CJ: Dinoflagellate mitochondrial genomes: stretching the rules of molecular biology. BioEssays : news and reviews in molecular, cellular and developmental biology 2009, 31:237-45. 8. Lang BF, Burger G, O’Kelly CJ, Cedergren R, Golding GB, Lemieux C, Sankoff D, Turmel M, Gray MW: An ancestral mitochondrial DNA resembling a eubacterial genome in miniature. Nature 1997, 387:493–497. 9. Nosek J, Tomáska L: Mitochondrial genome diversity: evolution of the molecular architecture and replication strategy. Current Genetics 2003, 44:73-84. 10. Marcadé I, Cordaux R, Doublet V, Debenest C, Bouchon D, Raimond R: Structure and evolution of the atypical mitochondrial genome of Armadillidium vulgare (Isopoda, Crustacea). Journal of molecular evolution 2007, 65:651-9. 11. Doublet V, Raimond R, Grandjean F, Lafitte A, Souty-grosset C, Marcadé I: Widespread atypical mitochondrial DNA structure in isopods (Crustacea, Peracarida) related to a constitutive heteroplasmy in terrestrial species. 2012, 244:234-244. 12. Doublet V: Structure et Evolution du Génome Mitochondrial des Oniscidea (Crustacea, Isopoda). Structure 2010:1-189. 13. Smith DR, Kayal E, Yanagihara A a, Collins AG, Pirro S, Keeling PJ: First complete mitochondrial genome sequence from a box jellyfish reveals a highly fragmented, linear architecture and insights into telomere evolution. Genome Biology and Evolution 2012, 4:52-58. 14. Raimond R, Marcadé I, Bouchon D, Rigaud T, Bossy JP, Souty-Grosset C: Organization of the large mitochondrial genome in the isopod Armadillidium vulgare. Genetics 1999, 151:203-10.

134

15. Kayal E, Bentlage B, Collins AG, Kayal M, Pirro S, Lavrov DV: Evolution of linear mitochondrial genomes in medusozoan cnidarians. Genome Biology and Evolution 2012, 4:1-12. 16. Kayal E, Lavrov DV: The mitochondrial genome of Hydra oligactis (Cnidaria, Hydrozoa) sheds new light on animal mtDNA evolution and cnidarian phylogeny. Gene 2008, 410:177-86. 17. Park E, Hwang D-S, Lee J-S, Song J-I, Seo T-K, Won Y-J: Estimation of divergence times in cnidarian evolution based on mitochondrial protein-coding genes and the fossil record. Molecular Phylogenetics and Evolution 2012, 62:329-345. 18. Shao Z, Graf S, Chaga OY, Lavrov DV: Mitochondrial genome of the moon jelly Aurelia aurita (Cnidaria, Scyphozoa): A linear DNA molecule encoding a putative DNA-dependent DNA polymerase. Gene 2006, 381:92-101. 19. Kim BD, Lee KJ, Debusk AG: Linear and “ lasso-like ” structures of the mitochondrial DNA from Pennisetum typhoides. Federation of European Biochemical Societies 1982, 147:231-234. 20. Lukes J, Guilbride DL, Voty J, Zíkova A, Benne R, Englund PT: Kinetoplast DNA Network : Evolution of an Improbable Structure. Eukaryotic Cell 2002, 1:495-502. 21. Boore JL: Animal mitochondrial genomes. Nucleic acids research 1999, 27:1767-80. 22. Gissi C, Iannelli F, Pesole G: Evolution of the mitochondrial genome of Metazoa as exemplified by comparison of congeneric species. Heredity 2008, 101:301-20. 23. Lavrov DV: Key transitions in animal evolution: a mitochondrial DNA perspective. Integrative and Comparative Biology 2007, 47:734-743. 24. Lavrov DV, Forget L, Kelly M, Lang BF: Mitochondrial genomes of two demosponges provide insights into an early stage of animal evolution. Molecular biology and evolution 2005, 22:1231-9. 25. Boore JL, Fuerstenberg SI: Beyond linear sequence comparisons: the use of genome- level characters for phylogenetic reconstruction. Philosophical transactions of the Royal Society of London. Series B, Biological sciences 2008, 363:1445-51. 26. Anderson S, Bankier AT, Barrell BG, De Bruijn MH, Coulson AR, Drouin J, Eperon IC, Nierlich DP, Roe BA, Sanger F, Schreier PH, Smith AJ, Staden R, Young IG: Sequence and organization of the human mitochondrial genome. Nature 1981, 290:457 - 465. 27. Saccone C, Gissi C, Reyes A, Larizza A, Sbisà E, Pesole G: Mitochondrial DNA in metazoa: degree of freedom in a frozen event. Gene 2002, 286:3-12. 28. Burger G, Gray MW, Lang BF: Mitochondrial genomes: anything goes. Trends in Genetics 2003, 19:709-716. 29. Suga K, Mark Welch DB, Tanaka Y, Sakakura Y, Hagiwara A: Two circular chromosomes of unequal copy number make up the mitochondrial genome of the rotifer Brachionus plicatilis. Molecular biology and evolution 2008, 25:1129-37. 30. Voigt O, Erpenbeck D, Wörheide G: A fragmented metazoan organellar genome: the two mitochondrial chromosomes of Hydra magnipapillata. BMC genomics 2008, 9:350. 31. Gibson T, Blok VC, Phillips MS, Hong G, Kumarasinghe D, Riley IT, Dowton M: The mitochondrial subgenomes of the nematode Globodera pallida are mosaics: evidence of recombination in an animal mitochondrial genome. Journal of molecular evolution 2007, 64:463-71.

135

32. Gibson T, Blok VC, Dowton M: Sequence and characterization of six mitochondrial subgenomes from Globodera rostochiensis: multipartite structure is conserved among close nematode relatives. Journal of molecular evolution 2007, 65:308-15. 33. Armstrong MR, Blok VC, Phillips MS: A multipartite mitochondrial genome in the potato cyst nematode Globodera pallida. Genetics 2000, 154:181-92. 34. Gibson T, Farrugia D, Barrett J, Chitwood DJ, Rowe J, Subbotin S, Dowton M: The mitochondrial genome of the soybean cyst nematode, Heterodera . Genome 2011, 54:565-574. 35. Cameron SL, Yoshizawa K, Mizukoshi A, Whiting MF, Johnson KP: Mitochondrial genome deletions and minicircles are common in lice (Insecta: Phthiraptera). BMC genomics 2011, 12:394. 36. Shao R, Kirkness EF, Barker SC: The single mitochondrial chromosome typical of animals has evolved into 18 minichromosomes in the human body louse, Pediculus humanus. Genome research 2009, 19:904-12. 37. Wei D-D, Shao R, Yuan M-L, Dou W, Barker SC, Wang J-J: The Multipartite Mitochondrial Genome of Liposcelis bostrychophila: Insights into the Evolution of Mitochondrial Genomes in Bilateral Animals. PloS one 2012, 7:e33973. 38. Ender A, Schierwater B: Placozoa Are Not Derived Cnidarians: Evidence from Molecular Morphology. Molecular Biology and Evolution 2003, 20:130-134. 39. Bridge D, Cunningham CW, DeSalle R, Buss LW: Class-level relationships in the phylum Cnidaria: molecular and morphological evidence. Molecular Biology and Evolution 1995, 12:679-689. 40. Bridge D, Cunningham CW, Schierwater B, DeSalle R, Buss LW: Class-level relationships in the phylum Cnidaria: evidence from mitochondrial genome structure. Proceedings of the National Academy of Sciences of the United States of America 1992, 89:8750-3. 41. Warrior R, Gall J: The mitochondrial DNA of Hydra attenuata and Hydra littoralis consists of two linear molecules. Archives des Sciences, Genève 1985, 38:439-445. 42. Voigt O, Erpenbeck D, Wörheide G: Molecular evolution of rDNA in early diverging Metazoa: first comparative analysis and phylogenetic application of complete SSU rRNA secondary structures in Porifera. BMC evolutionary biology 2008, 8:69. 43. Pont-kingdon G, Vassort CG, Warrior R, Okimoto R, Beagley CT, Wolstenholme DR: Mitochondrial DNA of Hydra attenuata (Cnidaria): A Sequence That Includes an End of One Linear Molecule and the Genes for l-rRNA, tRNAf-Met, tRNATrp, COII, and ATPase8. Journal of Molecular Evolution 2000, 51:404-415. 44. Fan J, Lee RW: Mitochondrial genome of the colorless green alga Polytomella parva: two linear DNA molecules with homologous inverted repeat Termini. Molecular biology and evolution 2002, 19:999-1007. 45. Fricova D, Valach M, Farkas Z, Pfeiffer I, Kucsera J, Tomaska L, Nosek J: The mitochondrial genome of the pathogenic yeast Candida subhashii: GC-rich linear DNA with a protein covalently attached to the 5’ termini. Microbiology (Reading, England) 2010, 156:2153-63. 46. Hikosaka K, Watanabe Y-I, Tsuji N, Kita K, Kishine H, Arisue N, Palacpac NMQ, Kawazu S-I, Sawai H, Horii T, Igarashi I, Tanabe K: Divergence of the mitochondrial

136

genome structure in the apicomplexan parasites, Babesia and Theileria. Molecular biology and evolution 2010, 27:1107-16. 47. Nosek J, Tomáska L, Fukuhara H, Suyama Y, Kovác L: Linear mitochondrial genomes: 30 years down the line. Trends in genetics : TIG 1998, 14:184-8. 48. Pérez-Brocal V, Shahar-Golan R, Clark CG: A linear molecule with two large inverted repeats: the mitochondrial genome of the stramenopile Proteromonas lacertae. Genome Biology and Evolution 2010, 2:257-66. 49. Smith DR, Lee RW: Mitochondrial genome of the colorless green alga Polytomella capuana: a linear molecule with an unprecedented GC content. Molecular biology and evolution 2008, 25:487-96. 50. Valach M, Farkas Z, Fricova D, Kovac J, Brejova B, Vinar T, Pfeiffer I, Kucsera J, Tomaska L, Lang BF, Nosek J: Evolution of linear chromosomes and multipartite genomes in yeast mitochondria. Nucleic acids research 2011:1-18. 51. Nosek J, Tomáska Ľ: Mitochondrial Telomeres: An Evolutionary Paradigm for the Emergence of Telomeric Structures and Their Replication Strategies. In Origin and Evolution of Telomeres. edited by Nosek J, Tomáska Ľ Austin, Texas: Landes Bioscience; 2008:163-171. 52. Mouhamadou B, Barroso G, Labarère J: Molecular evolution of a mitochondrial polB gene, encoding a family B DNA polymerase, towards the elimination from Agrocybe mitochondrial genomes. Molecular genetics and genomics MGG 2004, 272:257-263. 53. Dellaporta SL, Xu A, Sagasser S, Jakob W, Moreno MA, Buss LW, Schierwater B: Mitochondrial genome of Trichoplax adhaerens supports Placozoa as the basal lower metazoan phylum. Proceedings of the National Academy of Sciences 2006, 103:8751– 8756. 54. Lavrov DV: Rapid proliferation of repetitive palindromic elements in mtDNA of the endemic Baikalian sponge Lubomirskia baicalensis. Molecular biology and evolution 2010, 27:757-60. 55. Lukić-Bilela L, Brandt D, Pojskić N, Wiens M, Gamulin V, Müller WEG: Mitochondrial genome of Suberites domuncula: palindromes and inverted repeats are abundant in non-coding regions. Gene 2008, 412:1-11. 56. Signorovitch AY, Buss LW, Dellaporta SL: Comparative Genomics of Large Mitochondria in Placozoans. PLoS Genetics 2007, 3:7. 57. Pett W, Ryan JF, Pang K, Mullikin JC, Martindale MQ, Baxevanis AD, Lavrov DV: Extreme mitochondrial evolution in the ctenophore Mnemiopsis leidyi: Insight from mtDNA and the nuclear genome. Mitochondrial DNA 2011:1-13. 58. Kohn AB, Citarella MR, Kocot KM, Bobkova YV, Halanych KM, Moroz LL: Rapid evolution of the compact and unusual mitochondrial genome in the ctenophore, Pleurobrachia bachei. Molecular phylogenetics and evolution 2011, 63:203-207. 59. Papillon D, Perez Y, Caubit X, Le Parco Y: Identification of chaetognaths as protostomes is supported by the analysis of their mitochondrial genome. Molecular Biology and Evolution 2004, 21:2122-2129. 60. Helfenbein KG, Fourcade HM, Vanjani RG, Boore JL: The mitochondrial genome of Paraspadella gotoi is highly reduced and reveals that chaetognaths are a sister

137

group to protostomes. Proceedings of the National Academy of Sciences of the United States of America 2004, 101:10639-10643. 61. Miyamoto H, Machida RJ, Nishida S: Complete mitochondrial genome sequences of the three pelagic chaetognaths Sagitta nagae, Sagitta decipiens and Sagitta enflata. Comparative and physiology. Part D, Genomics & proteomics 2010, 5:65- 72. 62. Faure E, Casanova J-P: Comparison of chaetognath mitochondrial genomes and phylogenetical implications. Mitochondrion 2006, 6:258-262. 63. He Y, Jones J, Armstrong M, Lamberti F, Moens M: The mitochondrial genome of Xiphinema americanum sensu stricto (Nematoda: Enoplea): considerable economization in the length and structural features of encoded genes. Journal of molecular evolution 2005, 61:819-33. 64. Yokobori S-ichi, Fukuda N, Nakamura M, Aoyama T, Oshima T: Long-term conservation of six duplicated structural genes in mitochondrial genomes. Molecular Biology and Evolution 2004, 21:2034-46. 65. Passamonti M, Ricci A, Milani L, Ghiselli F: Mitochondrial genomes and Doubly Uniparental Inheritance: new insights from Musculista senhousia -linked mitochondrial DNAs ( Mytilidae). BMC genomics 2011, 12:442. 66. Xu X, Wu X, Yu Z: Comparative studies of the complete mitochondrial genomes of four Paphia clams and reconsideration of subgenus Neotapes (Bivalvia: Veneridae). Gene 2012, 494:17-23. 67. Akasaki T, Nikaido M, Tsuchiya K, Segawa S, Hasegawa M, Okada N: Extensive mitochondrial gene arrangements in coleoid Cephalopoda and their phylogenetic implications. Molecular phylogenetics and evolution 2006, 38:648-58. 68. Ren J, Liu X, Zhang G, Liu B, Guo X: “Tandem duplication-random loss” is not a real feature of oyster mitochondrial genomes. BMC genomics 2009, 10:84. 69. Chaudhuri K, Chen K, Mihaescu R, Rao S: On the tandem duplication-random loss model of genome rearrangement. Proceedings of the seventeenth annual ACM-SIAM symposium on Discrete algorithm - SODA ’06 2006:564-570. 70. Dermauw W, Vanholme B, Tirry L, Leeuwen TV: Mitochondrial genome analysis of the predatory mite Phytoseiulus persimilis and a revisit of the Metaseiulus occidentalis mitochondrial genome. Genome 2010, 53:285-301. 71. Azevedo JLB, Hyman BC: Molecular Characterization of Lengthy Mitochondrial DNA Duplications From the Parasitic Nematode Romanomermis culicivorax. Genetics 1993, 133:933-942. 72. Endo K, Noguchi Y, Ueshima R, Jacobs HT: Novel repetitive structures, deviant protein-encoding sequences and unidentified ORFs in the mitochondrial genome of the brachiopod Lingula anatina. Journal of Molecular Evolution 2005, 61:36-53. 73. Powers TO, Harris TS, Hyman BC: Mitochondrial DNA Sequence Divergence among Meloidogyne incognita, Romanomermis culicivorax, Ascaris suum, and Caenorhabditis elegans. Journal of nematology 1993, 25:564-72. 74. Tang S, Hyman BC: Rolling circle amplification of complete nematode mitochondrial genomes. Journal of nematology 2005, 37:236-41. 75. Erpenbeck D, Voigt O, Adamski M, Adamska M, Hooper JN a, Wörheide G, Degnan BM: Mitochondrial diversity of early-branching metazoa is revealed by the

138

complete mt genome of a haplosclerid demosponge. Molecular Biology and Evolution 2007, 24:19-22. 76. Haen KM, Lang BF, Pomponi S a, Lavrov DV: Glass sponges and bilaterian animals share derived mitochondrial genomic features: a common ancestry or parallel evolution? Molecular biology and evolution 2007, 24:1518-27. 77. Wang X, Lavrov DV: Mitochondrial genome of the homoscleromorph Oscarella carmela (Porifera, Demospongiae) reveals unexpected complexity in the common ancestor of sponges and other animals. Molecular biology and evolution 2007, 24:363-73. 78. Belinky F, Rot C, Ilan M, Huchon D: The complete mitochondrial genome of the demosponge Negombata magnifica (Poecilosclerida). Molecular phylogenetics and evolution 2008, 47:1238-43. 79. Rosengarten RD, Sperling E a, Moreno M a, Leys SP, Dellaporta SL: The mitochondrial genome of the hexactinellid sponge Aphrocallistes vastus: evidence for programmed translational frameshifting. BMC genomics 2008, 9:33. 80. Wang X, Lavrov DV: Seventeen New Complete mtDNA Sequences Reveal Extensive Mitochondrial Genome Evolution within the Demospongiae. PLoS ONE 2008, 3:11. 81. Erpenbeck D, Voigt O, Wörheide G, Lavrov DV: The mitochondrial genomes of sponges provide evidence for multiple invasions by Repetitive Hairpin-forming Elements (RHE). BMC genomics 2009, 10:591. 82. Ereskovsky AV, Lavrov DV: Molecular and morphological description of a new species of Halisarca (Demospongiae: Halisarcida) from and a redescription of the type species Halisarca dujardini. Zootaxa 2011, 2768:5-31. 83. Pleše B, Lukić-Bilela L, Bruvo-Mađarić B, Harcet M, Imešek M, Bilandžija H, Ćetković H: The mitochondrial genome of stygobitic sponge subterraneus: mtDNA is highly conserved in freshwater sponges. Hydrobiologia 2011, 687:49-59. 84. Sperling E a., Rosengarten RD, Moreno M a., Dellaporta SL: The complete mitochondrial genome of the verongid sponge Aplysina cauliformis: implications for DNA barcoding in demosponges. Hydrobiologia 2011, 687:61-69. 85. Gazave E, Lapébie P, Renard E, Vacelet J, Rocher C, Ereskovsky AV, Lavrov DV, Borchiellini C: Molecular phylogeny restores the supra-generic subdivision of homoscleromorph sponges (Porifera, Homoscleromorpha). PloS one 2010, 5:e14290. 86. Breton S, Ghiselli F, Passamonti M, Milani L, Stewart DT, Hoeh WR: Evidence for a fourteenth mtDNA-encoded protein in the female-transmitted mtDNA of marine Mussels (Bivalvia: Mytilidae). PloS One 2011, 6:e19365. 87. Burger G, Yan Y, Javadi P, Lang BF: Group I-intron trans-splicing and mRNA editing in the mitochondria of placozoan animals. Trends in genetics : TIG 2009, 25:377-81. 88. Goddard MR, Burt a: Recurrent invasion and of a selfish gene. Proceedings of the National Academy of Sciences of the United States of America 1999, 96:13880-5. 89. Goddard MR, Leigh J, Roger AJ, Pemberton AJ: Invasion and persistence of a selfish gene in the Cnidaria. PloS one 2006, 1:e3.

139

90. Sinniger F, Chevaldonné P, Pawlowski J: Mitochondrial genome of Savalia savaglia (Cnidaria, Hexacorallia) and early metazoan phylogeny. Journal of molecular evolution 2007, 64:196-203. 91. Beagley CT, Okimoto R, Wolstenholme DR: The mitochondrial genome of the sea anemone Metridium senile (Cnidaria): introns, a paucity of tRNA genes, and a near-standard genetic code. Genetics 1998, 148:1091-1108. 92. Brugler MR, France SC: The complete mitochondrial genome of the black coral Chrysopathes formosa (Cnidaria:Anthozoa:Antipatharia) supports classification of antipatharians within the subclass Hexacorallia. Molecular Phylogenetics and Evolution 2007, 42:776-788. 93. Vallès Y, Halanych KM, Boore JL: Group II introns break new boundaries: presence in a bilaterian’s genome. PloS one 2008, 3:e1488. 94. Pont-Kingdon G, Okada NA, Macfarlane JL, Beagley CT, Watkins-Sims CD, Cavalier- Smith T, Clark-Walker GD, Wolstenholme DR: Mitochondrial DNA of the coral Sarcophyton glaucum contains a gene for a homologue of bacterial MutS: a possible case of gene transfer from the nucleus to the mitochondrion. Journal of Molecular Evolution 1998, 46:419-431. 95. Bilewitch JP, Degnan SM: A unique horizontal gene transfer event has provided the octocoral mitochondrial genome with an active mismatch repair gene that has potential for an unusual self-contained function. BMC Evolutionary Biology 2011, 11:228. 96. Yamasaki H, Ohmae H, Kuramochi T: Complete mitochondrial genomes of Diplogonoporus balaenopterae and Diplogonoporus grandis (Cestoda: Diphyllobothriidae) and clarification of their taxonomic relationships. Parasitology international 2012, 61:260-6. 97. Le TH, Blair D, McManus DP: Mitochondrial genomes of parasitic flatworms. Trends in parasitology 2002, 18:206-13. 98. Mwinyi A, Bailly X, Bourlat SJ, Jondelius U, Littlewood DTJ, Podsiadlowski L: The phylogenetic position of as revealed by the complete mitochondrial genome of Symsagittifera roscoffensis. BMC evolutionary biology 2010, 10:309. 99. Sun M, Shen X, Liu H, Liu X, Wu Z, Liu B: Complete mitochondrial genome of Tubulipora flabellaris (: ): the first representative from the class Stenolaemata with unique gene order. Marine genomics 2011, 4:159-65. 100. Vallès Y, Boore JL: Lophotrochozoan mitochondrial genomes. Integrative and comparative biology 2006, 46:544-57. 101. Yuan Y, Li Q, Yu H, Kong L: The complete mitochondrial genomes of six heterodont bivalves (tellinoidea and solenoidea): variable gene arrangements and phylogenetic implications. PloS one 2012, 7:e32353. 102. Lavrov DV, Brown WM: mtDNA: a nematode mitochondrial genome that encodes a putative ATP8 and normally structured tRNAS and has a gene arrangement relatable to those of coelomate metazoans. Genetics 2001, 157:621-37. 103. Yatawara L, Wickramasinghe S, Rajapakse RPVJ, Agatsuma T: The complete mitochondrial genome of Setaria digitata (Nematoda: ): Mitochondrial

140

gene content, arrangement and composition compared with other nematodes. Molecular and biochemical parasitology 2010, 173:32-8. 104. Ramesh A, Small ST, Kloos ZA, Kazura JW, Nutman TB, Serre D, Zimmerman PA: The complete mitochondrial genome sequence of the filarial nematode Wuchereria bancrofti from three geographic isolates provides evidence of complex demographic history. Molecular and Biochemical Parasitology 2012, 183:32-41. 105. Xie Y, Zhang Z, Wang C, Lan J, Li Y, Chen Z, Fu Y, Nie H, Yan N, Gu X, Wang S, Peng X, Yang G: Complete mitochondrial genomes of Baylisascaris schroederi, Baylisascaris ailuri and Baylisascaris transfuga from giant panda, red panda and polar bear. Gene 2011, 482:59-67. 106. Liu G-H, Wu C-Y, Song H-Q, Wei S-J, Xu M-J, Lin R-Q, Zhao G-H, Huang S-Y, Zhu X-Q: Comparative analyses of the complete mitochondrial genomes of Ascaris lumbricoides and Ascaris suum from humans and pigs. Gene 2012, 492:110-6. 107. Gissi C, Griggio F, Iannelli F, Biomolecolari S, Milano U: Evolutionary mitogenomics of Chordata: the strange case of ascidians and vertebrates. Invertebrate Survival Journal 2009, 6:S21-28. 108. Wolstenholme DR: Animal mitochondrial DNA: structure and evolution. International Review Of Cytology 1992, 141:173-216. 109. Milbury CA, Gaffney PM: Complete mitochondrial DNA sequence of the eastern oyster Crassostrea virginica. Marine biotechnology (New York, N.Y.) 2005, 7:697-712. 110. Boer PH, Gray MW: Scrambled ribosomal RNA gene pieces in Chlamydomonas reinhardtii mitochondrial DNA. Cell 1988, 55:399-411. 111. Denovan-Wright EM, Sankoff D, Spencer DF, Lee RW: Evolution of fragmented mitochondrial ribosomal RNA genes in Chlamydomonas. Journal of molecular evolution 1996, 42:382-91. 112. Feagin JE, Mericle BL, Werner E, Morris M: Identification of additional rRNA fragments encoded by the Plasmodium falciparum 6 kb element. Nucleic acids research 1997, 25:438-46. 113. Evguenieva-Hackenberg E: Bacterial ribosomal RNA in pieces. Molecular microbiology 2005, 57:318-25. 114. Nedelcu a M: Fragmented and scrambled mitochondrial ribosomal RNA coding regions among green algae: a model for their origin and evolution. Molecular biology and evolution 1997, 14:506-17. 115. Jackson CJ, Norman JE, Schnare MN, Gray MW, Keeling PJ, Waller RF: Broad genomic and transcriptional analysis reveals a highly derived genome in dinoflagellate mitochondria. BMC biology 2007, 5:41. 116. Ohtsuki T, Sato A, Watanabe Y-ichi, Watanabe K: A unique serine-specific elongation factor Tu found in nematode mitochondria. Nature structural biology 2002, 9:669- 73. 117. Watanabe Y, Kawai G, Yokogawa T, Hayashi N, Kumazawa Y, Ueda T, Nishikawa K, Hirao I, Miura K, Watanabe K: Higher-order structure of bovine mitochondrial tRNA(SerUGA): chemical modification and computer modeling. Nucleic acids research 1994, 22:5378-84. 118. Steinauer ML, Nickol BB, Broughton R, Ortí G: First sequenced mitochondrial genome from the phylum (Leptorhynchoides thecatus) and its

141

phylogenetic position within Metazoa. Journal of molecular evolution 2005, 60:706- 15. 119. Yokobori S, Pääbo S: tRNA editing in metazoans. Nature 1995, 377:490. 120. Lavrov DV, Brown WM, Boore JL: A novel type of RNA editing occurs in the mitochondrial tRNAs of the centipede Lithobius forficatus. Proceedings of the National Academy of Sciences of the United States of America 2000, 97:13738-42. 121. Yokobori S-I, Paabo S: Transfer RNA editing in mitochondria. Proceedings of the National Academy of Sciences 1995, 92:10432-10435. 122. Tomita K, Ueda T, Watanabe K: RNA editing in the acceptor stem of squid mitochondrial tRNA(Tyr). Nucleic acids research 1996, 24:4987-91. 123. Janke A, Pääbo S: Editing of a tRNA anticodon in marsupial mitochondria changes its codon recognition. Nucleic acids research 1993, 21:1523-5. 124. Janke A, Feldmaier-fuchs G, Thomas WK, Von-Haeseler A, Paabo S: The Marsupial Mitochondrial Genome and the Evolution of Placental . Genetics 1994, 137:243-256. 125. Podsiadlowski L, Braband A, Mayer G: The complete mitochondrial genome of the onychophoran Epiperipatus biolleyi reveals a unique transfer RNA set and provides further support for the hypothesis. Molecular biology and evolution 2008, 25:42-51. 126. Braband A, Cameron SL, Podsiadlowski L, Daniels SR, Mayer G: The mitochondrial genome of the onychophoran Opisthopatus cinctipes (Peripatopsidae) reflects the ancestral mitochondrial gene arrangement of Panarthropoda and Ecdysozoa. Molecular phylogenetics and evolution 2010, 57:285-92. 127. Braband A, Podsiadlowski L, Cameron SL, Daniels S, Mayer G: Extensive duplication events account for multiple control regions and pseudo-genes in the mitochondrial genome of the velvet worm Metaperipatus inae (Onychophora, Peripatopsidae). Molecular phylogenetics and evolution 2010, 57:293-300. 128. Rota-Stabelli O, Kayal E, Gleeson D, Daub J, Boore JL, Telford MJ, Pisani D, Blaxter M, Lavrov DV: Ecdysozoan Mitogenomics: Evidence for a Common Origin of the Legged Invertebrates, the Panarthropoda. Genome Biology and Evolution 2010, 2:425-440. 129. Segovia R, Pett W, Trewick S, Lavrov DV: Extensive and Evolutionarily Persistent Mitochondrial tRNA Editing in Velvet Worms (Phylum Onychophora). Molecular biology and evolution 2011:1-9. 130. Abascal F, Posada D, Knight RD, Zardoya R: Parallel evolution of the genetic code in mitochondrial genomes. PLoS biology 2006, 4:e127. 131. Abascal F, Posada D, Zardoya R: The evolution of the mitochondrial genetic code in arthropods revisited. Mitochondrial DNA 2012, 23:84-91. 132. Knight RD, Freeland SJ, Landweber LF: Rewiring the keyboard: evolvability of the genetic code. Nature Reviews Genetics 2001, 2:49-58. 133. Jacob JEM, Vanholme B, Van Leeuwen T, Gheysen G: A unique genetic code change in the mitochondrial genome of the parasitic nematode Radopholus similis. BMC research notes 2009, 2:192.

142

134. Haen KM, Pett W, Lavrov DV: Parallel loss of nuclear-encoded mitochondrial aminoacyl-tRNA synthetases and mtDNA-encoded tRNAs in Cnidaria. Molecular Biology and Evolution 2010, 27:2216-9.

143

Brachiopoda Bryozoa "Worms" Nemertea

Panarthropoda Porifera Placozoa Cnidaria Ctenophora Platyhelminthes Nematoda Acanthocephala Vertebrata Rotifera Chaetognatha Acoela Xenoturbellida Tunicata Echinodermata Hemichordata Cephalochordata

Figure 1: Distribution of sequenced metazoan mitochondrial (mt) genomes. The list of complete or nearly complete mt-genomes were gathered at the Genome Resources of the

National Center for Biotechnology Information (NCBI) website

(www.ncbi.nlm.nih.gov/genomes/ORGANELLES/mztax_short.htmlat) on April the 8th 2012.

Each phylum is represented by the percentage of sequenced mt-genomes relative to the total

number of genomes available (2579).

144

i+

t+ e g+ g- f Annelida Mollusca c4,c12 t- g- Platyhelminthes t-? m Rotifera g- Bryozoa e Onychophora Tardigrada c10 t+ e m l Arthropoda t- t+ m Nematoda g- c2,c5 t- i- Acanthocephala t- g- Chaetognatha c12 c4 Echinodermata c10,c11 Hemichordata Xenoturbellida g-

c6 C c1 Urochordata h o r d a t c9 c7 Cephalochordata c8 t- e g- Vertebrata i+ f Placozoa t- m l g+ g- i- C Medusozoa n i d a r t- t- g+ Octocorallia t+ g+ i- Hexacorallia t- g- i- Ctenophora t- i- Demospongia P

c5 i- o r i f e a Hexactinellida g+ g- i- Homoscleromorpha c2,c3?,c13 e m? l? i-? f Calcarea

145

Figure 2: Evolution of the mtDNA in Metazoa. The phylogeny is based on Dunn et. al (2008),

Philippe et. al (2009), Pick et. al (2010), Rota-Stabelli et. al (2010), and Philippe et. al (2011). c corresponds to changes in the genetic code where codons have been reassigned (based on

Knight et al. 2001): c1 UGA Stop  Trp; c2 UAG Stop  Tyr; c3 AUA Ile  Met; c4 AUA

Met  Ile; c5 AGR Arg  Ser; c6 AGR Ser  Gly; c7 AGR Ser  ?; c8 AGR ?  Stop; c9

AGA ?  Ser; c10 AGG Ser  Lys; c11 AAA Lys  ?; c12 AAA Lys  Asn; c13 CGN

Arg  Gly. t- corresponds to loss(es) of mt-tRNAs; t+ are duplicated tRNAs; e corresponds to edited tRNAs. m corresponds to multi-partite mtDNA. l corresponds to linear mt molecule. g- protein gene loss; g+ horizontal transfer of protein gene. i- loss of group I intron; i+ gain of group II intron. f fragmented rRNA genes.

146

CHAPTER 6. General Conclusions

Conclusions

The increase in completely sequenced mito-genomes has further challenged the view the animal mitochondrial DNA (mtDNA) is a frozen genome. However, most sequencing efforts are directed toward bilaterian, and particularly vertebrate and arthropod species. This is unfortunate given that a large part of mtDNA diversity has been documented in non-bilaterian animals (e.g. [1–9]). For instance, entirely linear mtDNA has been documented only in medusozoan cnidarians [6, 7, 10–13], while some arthropods possess both linear and circular mitochondrial molecules [14–16]. In this dissertation we presented our investigation of linear mtDNA in Medusozoa. We showed that all medusozoan classes possess linear mt molecules that harbor two genes (except in Hydroidolina where these have been secondary lost): polB which is similar to B-type DNA polymerase gene family and an unidentified open reading frames (ORF314) [17]. In addition, we found that Cubozoa species have experienced high level of segmentalization of their mtDNA into eight small and linear chromosomes. While multipartite mtDNA has been documented in nematodes [18–20], [21] and arthropods

[22], linear multipartite mitogenomes present additional challenges for the maintenance and replication of the molecules [23, 24]. We also found that unlike what has been suggested in yeast [25], the linear mtDNA in medusozoan appear unable to return into a circular conformation.

Mitogenomic data is a useful marker for inferring animal phylogenetic relationships

(e.g. [26, 27]). Recent mitogenomic-based phylogenetic studies (phylomitogenomics) have consistently rejected Anthozoa [Hexacorallia + Octocorallia] in favor of the clade [Medusozoa

147

+ Hexacorallia] [6, 28, 29]. New analyses with a dataset that contains more balanced taxon

sampling have confirmed anthozoan paraphyly. While independent analyses using genomic

data are needed to confirm, or reject, the mitogenomic view of deep cnidarian phylogeny, the

relatively high rate of sequence evolution of medusozoan mtDNA makes it a powerful too for

resolving higher relationships.

Recently sequenced mtDNAs from three classes of sponges have unveiled an array of

unusual features, including the presence of an extra protein gene in Homoscleromorpha [30,

31], invasive repetitive elements [2] and tRNA loss in Demospongia [5], frameshifting and the

use of a novel genetic code in Hexactinellida [4, 32]. Yet, no mtDNA sequence from the fourth class (Calcarea) was available. Our investigation of partial mt sequences from several representatives of calcareous sponges suggests that calcarean mtDNA is multipartite and linear, encodes fragmented ribosomal RNA (rRNA) and edited transfer RNA (tRNA) genes, uses at least one, potentially two novel genetic codes for protein synthesis, and displays an exceptionally high rate of sequence evolution. These findings reveal a unique combination of unusual features never found in animal mitogenomes, and only documented in some protozoans (e.g. [33]).

Broader perspectives and future directions

Thanks to recent improvements in sequencing technology, the field of animal mito- genomics has reached its apotheosis. But as the cost in sequencing decreases [34] and better methods of genome assembly are becoming available [35], there is more incentives to shift

away from organelles in favor of nuclear genomes. However, as illustrated in this dissertation,

massively parallel sequencing technologies are invaluable tools for exploring genomes that

148 resist traditional approaches. For instance, sequencing the complete genome of the box jellyfish

Alatina moseri has given access to the complete sequences of its octo-chromosomal mtDNA

[8]. Similarly, mt sequences from two out of three calcarean species presented in this dissertation were first identified from ESTs (Leucetta chagosensis) and total DNA (Petrobiona massiliana) reads. Given that several “minor” animal phyla have been completely neglected to date, one can predict that future studies that investigate the mtDNA of clades such as

Gastrotricha or using novel sequencing technologies will uncover even more unusual and interesting features.

Barcoding is an easy and inexpensive tool for species identification. In Metazoa, a region of the cytochrome c oxidase I (COI) gene is amplified and sequenced using conserved primers [36, 37]. With the steep decrease in sequencing costs, the whole mtDNA can become the favorite barcode marker for animals.

149

References

1. Gissi C, Iannelli F, Pesole G: Evolution of the mitochondrial genome of Metazoa as exemplified by comparison of congeneric species. Heredity 2008, 101:301-20. 2. Lavrov DV: Rapid proliferation of repetitive palindromic elements in mtDNA of the endemic Baikalian sponge Lubomirskia baicalensis. Molecular biology and evolution 2010, 27:757-60. 3. Lavrov DV: Key transitions in animal evolution: a mitochondrial DNA perspective. Integrative and Comparative Biology 2007, 47:734-743. 4. Haen KM, Lang BF, Pomponi S a, Lavrov DV: Glass sponges and bilaterian animals share derived mitochondrial genomic features: a common ancestry or parallel evolution? Molecular biology and evolution 2007, 24:1518-27. 5. Wang X, Lavrov DV: Seventeen New Complete mtDNA Sequences Reveal Extensive Mitochondrial Genome Evolution within the Demospongiae. PLoS ONE 2008, 3:11. 6. Kayal E, Lavrov DV: The mitochondrial genome of Hydra oligactis (Cnidaria, Hydrozoa) sheds new light on animal mtDNA evolution and cnidarian phylogeny. Gene 2008, 410:177-86. 7. Voigt O, Erpenbeck D, Wörheide G: A fragmented metazoan organellar genome: the two mitochondrial chromosomes of Hydra magnipapillata. BMC genomics 2008, 9:350. 8. Smith DR, Kayal E, Yanagihara A a, Collins AG, Pirro S, Keeling PJ: First complete mitochondrial genome sequence from a box jellyfish reveals a highly fragmented, linear architecture and insights into telomere evolution. Genome Biology and Evolution 2012, 4:52-58. 9. Signorovitch AY, Buss LW, Dellaporta SL: Comparative Genomics of Large Mitochondria in Placozoans. PLoS Genetics 2007, 3:7. 10. Bridge D, Cunningham CW, DeSalle R, Buss LW: Class-level relationships in the phylum Cnidaria: molecular and morphological evidence. Molecular Biology and Evolution 1995, 12:679-689. 11. Bridge D, Cunningham CW, Schierwater B, DeSalle R, Buss LW: Class-level relationships in the phylum Cnidaria: evidence from mitochondrial genome structure. Proceedings of the National Academy of Sciences of the United States of America 1992, 89:8750-3. 12. Warrior R, Gall J: The mitochondrial DNA of Hydra attenuata and Hydra littoralis consists of two linear molecules. Archives des Sciences, Genève 1985, 38:439-445. 13. Ender A, Schierwater B: Placozoa Are Not Derived Cnidarians: Evidence from Molecular Morphology. Molecular Biology and Evolution 2003, 20:130-134. 14. Marcadé I, Cordaux R, Doublet V, Debenest C, Bouchon D, Raimond R: Structure and evolution of the atypical mitochondrial genome of Armadillidium vulgare (Isopoda, Crustacea). Journal of molecular evolution 2007, 65:651-9. 15. Raimond R, Marcadé I, Bouchon D, Rigaud T, Bossy JP, Souty-Grosset C: Organization of the large mitochondrial genome in the isopod Armadillidium vulgare. Genetics 1999, 151:203-10.

150

16. Doublet V, Raimond R, Grandjean F, Lafitte A, Souty-grosset C, Marcadé I: Widespread atypical mitochondrial DNA structure in isopods (Crustacea, Peracarida) related to a constitutive heteroplasmy in terrestrial species. 2012, 244:234-244. 17. Kayal E, Bentlage B, Collins AG, Kayal M, Pirro S, Lavrov DV: Evolution of linear mitochondrial genomes in medusozoan cnidarians. Genome Biology and Evolution 2012, 4:1-12. 18. Gibson T, Blok VC, Dowton M: Sequence and characterization of six mitochondrial subgenomes from Globodera rostochiensis: multipartite structure is conserved among close nematode relatives. Journal of molecular evolution 2007, 65:308-15. 19. Armstrong MR, Blok VC, Phillips MS: A multipartite mitochondrial genome in the potato cyst nematode Globodera pallida. Genetics 2000, 154:181-92. 20. Gibson T, Blok VC, Phillips MS, Hong G, Kumarasinghe D, Riley IT, Dowton M: The mitochondrial subgenomes of the nematode Globodera pallida are mosaics: evidence of recombination in an animal mitochondrial genome. Journal of molecular evolution 2007, 64:463-71. 21. Suga K, Mark Welch DB, Tanaka Y, Sakakura Y, Hagiwara A: Two circular chromosomes of unequal copy number make up the mitochondrial genome of the rotifer Brachionus plicatilis. Molecular biology and evolution 2008, 25:1129-37. 22. Wei D-D, Shao R, Yuan M-L, Dou W, Barker SC, Wang J-J: The Multipartite Mitochondrial Genome of Liposcelis bostrychophila: Insights into the Evolution of Mitochondrial Genomes in Bilateral Animals. PloS one 2012, 7:e33973. 23. Nosek J, Tomáska L, Fukuhara H, Suyama Y, Kovác L: Linear mitochondrial genomes: 30 years down the line. Trends in genetics : TIG 1998, 14:184-8. 24. Nosek J, Tomáska Ľ: Mitochondrial Telomeres: An Evolutionary Paradigm for the Emergence of Telomeric Structures and Their Replication Strategies. In Origin and Evolution of Telomeres. edited by Nosek J, Tomáska Ľ Austin, Texas: Landes Bioscience; 2008:163-171. 25. Valach M, Farkas Z, Fricova D, Kovac J, Brejova B, Vinar T, Pfeiffer I, Kucsera J, Tomaska L, Lang BF, Nosek J: Evolution of linear chromosomes and multipartite genomes in yeast mitochondria. Nucleic acids research 2011:1-18. 26. Philippe H, Brinkmann H, Copley RR, Moroz LL, Nakano H, Poustka AJ, Wallberg A, Peterson KJ, Telford MJ: Acoelomorph flatworms are related to . Nature 2011, 470:255-8. 27. Rota-Stabelli O, Kayal E, Gleeson D, Daub J, Boore JL, Telford MJ, Pisani D, Blaxter M, Lavrov DV: Ecdysozoan Mitogenomics: Evidence for a Common Origin of the Legged Invertebrates, the Panarthropoda. Genome Biology and Evolution 2010, 2:425-440. 28. Park E, Hwang D-S, Lee J-S, Song J-I, Seo T-K, Won Y-J: Estimation of divergence times in cnidarian evolution based on mitochondrial protein-coding genes and the fossil record. Molecular Phylogenetics and Evolution 2012, 62:329-345. 29. Shao Z, Graf S, Chaga OY, Lavrov DV: Mitochondrial genome of the moon jelly Aurelia aurita (Cnidaria, Scyphozoa): A linear DNA molecule encoding a putative DNA-dependent DNA polymerase. Gene 2006, 381:92-101.

151

30. Wang X, Lavrov DV: Mitochondrial genome of the homoscleromorph Oscarella carmela (Porifera, Demospongiae) reveals unexpected complexity in the common ancestor of sponges and other animals. Molecular biology and evolution 2007, 24:363-73. 31. Gazave E, Lapébie P, Renard E, Vacelet J, Rocher C, Ereskovsky AV, Lavrov DV, Borchiellini C: Molecular phylogeny restores the supra-generic subdivision of homoscleromorph sponges (Porifera, Homoscleromorpha). PloS one 2010, 5:e14290. 32. Rosengarten RD, Sperling E a, Moreno M a, Leys SP, Dellaporta SL: The mitochondrial genome of the hexactinellid sponge Aphrocallistes vastus: evidence for programmed translational frameshifting. BMC genomics 2008, 9:33. 33. Nash E a, Nisbet RER, Barbrook AC, Howe CJ: Dinoflagellates: a mitochondrial genome all at sea. Trends in genetics : TIG 2008, 24:328-35. 34. Glenn TC: Field guide to next-generation DNA sequencers. Molecular Ecology Resources 2011, 11:759-769. 35. Baker M: De novo genome assembly: what every biologist should know. Nature Methods 2012, 9:333-337. 36. Hebert PDN, Cywinska A, Ball SL, deWaard JR: Biological identifications through DNA barcodes. Proceedings. Biological sciences / The Royal Society 2003, 270:313-21. 37. Savolainen V, Cowan RS, Vogler AP, Roderick GK, Lane R: Towards writing the encyclopedia of life: an introduction to DNA barcoding. Philosophical transactions of the Royal Society of London. Series B, Biological sciences 2005, 360:1805-11.

152

APPENDIX A. Supplementary Materials for Chapter 2

Table S1: List of primers used for long and short PCR amplification, sequencing and first strand cDNA synthesis for Clathrina clathrus and Leucetta chagosensis. f and r correspond to forward and reverse primers respectively.

153

Primer for PCR-amplification of partial Leucetta chagosensis mt genes Gene f/r Primer name Sequence (5'3') Reference cob f Leuc_cha_cob_fw AGCTACAGTTATTATTAATTTWCTWAG this work cob r Leuc_cha_cob_rv GTATCTATGGYTCTTCAATAGGTA this work cox1 f C1-J2165-fwd GAAGTTTATATTTTAATTTTACCDGG this work cox1 r COX1_1157rv-HDTYYV ACIACRTARTAIGTRTCRTG Erpenbeck et al. 2002. cox3 f Leuc_cha_cox3_fw CTATTTATCTTATCTGAAGT this work cox3 r Leuc_cha_cox3_rv AAATAATCATACAACATCAAC this work nad1 f Leuc_cha_nd1_fw ATCTATCCAACATCATC this work nad1 r Leuc_cha_nd1_rv GAAACTAATTCAGATTCTC this work Primer for PCR-amplification of partial Clathrina clathrus mt genes cob f cl_cob-f1 ACAAATCCTCTTTCAATCAGGTC this work cob r cl_cob-r1 GAAGAGTACGAACTCTAGGAGC this work cox1 f cl_cox1-f1 GAACCTATTCTGTTGATCCATGC this work cox1 r cl_cox1-r1 TAGAATGTAGACTTCAGGATGAC this work nad4 f cl-nad4-f1 GTTTCATCTTACATGAGTTAGGC this work nad4 r cl-nad4-r1 AAACTATTCCCATATGAGC this work rnl r cl-rnl-r1 CCGTTTATCTGATCTTGTAGCAG this work Primer for cDNA synthesis of Clathrina clathrus RNA genes rnl f this work cl_rnl-p1-f1 ACTGATAGAGGAACTGGAATGAG rnl f this work cl_rnl-p2-f1 ATGTCTTTAGTAGCGCACTAGCC rnl r this work cl_rnl-r0 TACGTATACCCAAGAGTCATCCG rnl r this work cl_rnl-r2 GTTCCGTTACCCAGATTC rnl r this work cl_rnl-r3 CCGTTTCATGAGCTTCAC rnl r this work cl_rnl-r4 ATCGTCATCCCATGATCC f this work rns cl_rns-f0 TAGGAATCTTGATTCATGACCTG f this work rns cl_rns-f2 TATAACGTATGGCTGTGGAGCAC f this work rns cl_rns-f3 TCGTAACATGGTCATAGTAGTGG rns f this work cl_rns-f4 ATGGTAATCCGAGTATTCTGCAC f this work rns cl_rns-f5 TAGAGTGGAGGCTACCTTCGTG r this work rns cl_rns-r1 TCCTACGAAAGTGTCCTTCCAAG f this work trnM cl_trnM-f2 ATGAGGTTCAAATCCTCATCTCC r this work trnM cl_trnM-r1 GAGTATGAATCTGCTGCTATCCC f this work trnP cl_trnP-f1 CACTTGGTATCAGTTCAACTCTG r cl_trnP-r1 this work trnP GTTCCCAAAACACCTGTTCTACC f cl_trnR-f1 this work trnR ACTGTAGCCTTCTAGTCTAAAAG r cl_trnR-r1 this work trnR AGGCTACAGTTATACCGTTAAAC trnN f cl_trnN-f1 TACTGTTAATACGATGGCATGGG this work trnN r cl_trnN-r1 CGTATTAACAGTACGAGATTCTTC this work

154

Table S2: Accession numbers of the sequences used in this study. The sequences for Cantharellus cibarius and Capsaspora owczarzaki were downloaded from http://megasun.bch.umontreal.ca/People/lang/FMGP/proteins.html.

Table S2: Accession numbers of the155 sequences used in this study.

Species name GenBank Accession Number

BZ10101 NC_008832 BZ243 NC_008834

BZ49 NC_008833

Trichoplax adhaerens NC_008151

Alatina moseri JN700951-JN700958

Haliclystus sanjuanensis JN700944

Lucernaria janetae JN700946

Linuche unguiculata JN700939

Cassiopea andromeda JN700934

Catostylus mosaicus JN700940

Aurelia aurita NC_008446

Chrysaora sp JN700941

Cyanea capillata JN700937

Pelagia noctiluca JN700949

Acropora tenuis NC_003522

Astrangia sp NC_008161

Chrysopathes formosa NC_008411

Lophelia pertusa NC_015143

Madracis mirabilis NC_011160

Metridium senile NC_000933

Ricordea florida NC_008159

Savalia savaglia NC_008827

Keratoisidinae sp NC_010764

Sarcophyton glaucum AF064823 and AF063191

Briareum asbestinum NC_008073

Renilla mulleri JX023273

Dendronephthya gigantea NC_013573

Hydra magnipapillata NC_011220-NC_011221

Ectopleura larynx JN700938

Cubaia aphrodite JN700942

Millepora platyphylla JN700943

Pennaria tiarella JN700950

Clava multicornis JN700935

Laomedea flexuosa JN700945

Axinella corrugata NC_006894

Callyspongia plicifera NC_010206

Lubomirskia baicalensis NC_013760

Ephydatia muelleri NC_010202

Geodia neptuni NC_006990

Halisarca dujardini NC_010212

Ircinia strobilina NC_013662

156

Igernella notabilis NC_010216

Aplysina fulva NC_010203

Oscarella carmela NC_009090

Plakina monolopha NC_014884

Plakinastrella cf. onkodes NC_010217

Corticium candelabrum NC_014872

Tethya actinia NC_006991

Xestospongia muta NC_010211

Suberites domuncula NC_010496

Topsentia ophiraphidites NC_010204

Iphiteon panicea EF537576

Sympagella nux EF537577

Aphrocallistes vastus NC_010769

Clathrina clathrus This study

Leucetta chagosensis This study

Petrobiona massiliana This study

Figure S1: Phylogenetic analyses of the calcareous sponge sequences available in GenBank.

(A) Partial rnl sequences from Leucilla nuttingi, sp and Clathrina sp sequences

(Accession numbers AF035266, AF035267 and AF362006). (B) Partial atp6, cox2 and cox3 sequence for Clathrina sp (Accession numbers AF362015, AF362018 and AF362010).

Support values for the nodes have been shown for both ML and Bayesian analyses using

Treefinder and PhyloBayes respectively. Stars correspond to posterior probability > 0.95 and bootstraps > 95.

157

Leucilla nuttingi

0.64/82 Clathrina sp Grantia sp Callyspongia plicifera 0.87/- Xestospongia muta Topsentia ophiraphidites 1/- Ephydatia muelleri Plakortis angulospiculatus * Oscarella carmela 1/- 0.53/- Tethya actinia birotulata 0.94/80 *

* walpersi Demospongiae Ectyoplasia ferox 1/- * Geodia neptuni Cinachyrella kuekanthali

* Halisarca dujardini * * nucula Aplysina fulva compressa schmidti Cnidaria Hexactinellida Outgroups 0.5

Ephydatia muelleri Topsentia ophiraphidites 0.93/56 * Ptilocaulis walpersi * 0.65/69 Ectyoplasia ferox

* Geodia neptuni G4 Cinachyrella kuekanthali 100/86 0.81/75 Tethya actinia 0.76/86 Demospongiae 0.92/87 Clathrina sp Xestospongia muta * Callyspongia plicifera 0.94/73 G3

* Halisarca dujardini * G2 0.54/- Aplysina fulva Sympagella nux * Hexactinellida Iphiteon panicea Plakortis angulospiculatus 0.55/100 Oscarella carmela G0

Outgroups

0.2

158

Figure S2: Partial mitochondrial genome organization of the Calcaronea Petrobiona massiliana. The numbers represent the length of each contig in nucleotides. Genes are transcribed from left to right. Dark bars are non-coding regions.

Figure S3: Inferred secondary structure of the partial 3'-end of the large mitochondrial ribosomal RNA subunit (rnl) in Clathrina clathrus. The helices are named as in other sponge rRNA sequences (Wang and Lavrov, 2007). rnlp1 and rnlp2 are two pieces of the 3'-end of rRNA found in "correct" transcriptional order in Clathrina mtDNA interrupted by at DNA level (highlighted in grey). Note that additional sequences for rnl can be present in the partial genome of C. clathrus but were not found.

159

U A A C G U G G A G C G U A U U A U U 8 4 C U C C U A G A G U A C G C C G C U A G U U A G G C C A U A A U U A C 8 5 U U U G U G t r n S 2 U C A 7 9 A A U G A A A A G G G A G U U A U G U U U G U U A C G G C A A C A A U U A U A 7 6 S 2 A G G U U G U G C A A U A A C A A U A A A A A A A A A U C G G A G A A U A C C C U G G C U A C A A U U G G A C C U G U U 8 3 A G G C G A C A 7 A A A G G A G C A U G A G U G G G C A S 3 G U G G C G G A U G A U A G A G 8 1 7 5 A U A U G A G S 1 A G U G U C G G G A A G U C 8 0 U U G C A G C C G U A G U A G A C C C G U A U C C A U U C U U A U U C A G U A U U U U G C C A G U U G C C G A A C C A G G C A G U G A A G U U U A A G G C U U A U U U S 4 G A U A 7 8 A A G U G C A A G G U A A U G U C U C A G G C A U G A A A U G A G G U U C C C U G C G A G G N U A U A G G G A A A G U C C N A N G N G N N G U U C G U A U U U A U A C G G A G C A C G A A A A 7 0 G 6 9 A U U U N U C G 7 2 C A C A G C A A A G G A A U C A U A C U A C U A U A C C A G G G U C U A G G G A G 7 1 G U U A C A G C A A A G C C U A U U A A U G A G A A A C G U C A A G C U C U U C A U C U C A U A A U A C C G G G G G A U G A C G G A U G U G G A G U U A G G A G A A G G G C G U U G U C U G U G C C G A U G U U U G A U A C U U G A A U A A C A G U A A G G A C A G A U G U 6 8 6 7 C A G U U U G A A A A A A G A A A G U C A C A C A C G G U C U C C U U U C U C A A U C C A U C C G A A G U A U G C U G C G G C U U G U A G U U C U A C U C G C U A G G G G A A A G G A A U U C G G U U A G G G G G G U A A A C A G C A A U U U C U U A A U U A 6 1 U 6 6 4 A A U G G G C G C G A G A U C G A 6 5 U A U A G G C C A U U G C A A A 6 3 _ 1 A G A A U C A G A A C A C A A C U G C 6 2 C A U C G A A A G A A U G U A U C U A U U C U A C A A U G C U U C A G U C A A U U C G C C G U C C A G G G U U C U G A A T 4 C A C C A G U G A G U A A U C A U U G U c 5 U A G A G A G T 1 G U G A C U G U C U U U A U A A U U U G A C A U A C G C G U A G A G G U A U G G G A G A C U U U T 3 U A U U U U C A U G A A A C A G G G A A G G A G G U G C U G G U G A A U U G G A A U U C U A G A A A T 2 U A U G G C G U U U c 1 A G A t r n P A A A C U A C c 4 A G c 2 U U U U A U A U G U C A C U A G A C A U A A A U A U G U G A U G U C C A G U A G C A A A C A C C A U A C G c 3 A A A A C U U G G C G U U U N G G A A G U U C A G U A U A N A A C A G A A C A U G A A C A A U A U U A C C G A 5 9 . 1 G G A A G A U U A A A A G G A U A U A U U G A A A G G A N U G A G G U A A U A C U A A U C C G C U U A U U A C C G A A G U G G G A C A U G U G 4 9 U G 4 8 A C A A A A A A G C A A A G G A G A A C U C 4 9 _ 1 U G A 5 2 C C A U N A U C A A G C A A G 5 0 G U C A G A A U C U A U U U G U U C A G G A C A A C U G G G G G U A C A A U U C U A G U U U A A C U G G A A G A C U G C N A C U A G A 4 6 A U G A A U G U U G G U A U A A G G G A A G C G A G A A A C C C U A A U A A A C G U U G U A A C A U G A

160

Figure S4: Phylogenetic analysis of concatenated amino acid sequences from representative non-bilaterian species for atp6, cob, cox1, cox3, and nad1-5 genes under the GTR+CAT models of sequence evolution using RAxML v.7.2.6. A star shows maximum support for that node. Calcareous species have been underlined.

* Placozoa

Linuche unguiculata Coronatae Alatina moseri Cubozoa * Haliclystus sanjuanensis Lucernaria janetae Staurozoa Cubaia aphrodite * Clava multicornis Laomedea flexuosa * Millepora platyphylla Hydrozoa Pennaria tiarella * Ectopleura larynx Hydra magnipapillata * Cyanea capillata * Chrysaora sp * Pelagia noctiluca * Aurelia aurita Discomedusae * Cassiopea andromeda Catostylus mosaicus Renilla mulleri * Keratoisidinae sp Briareum asbestinum Dendronephthya gigantea Octocorallia Sarcophyton glaucum Savalia savaglia * Metridium senile Chrysopathes formosa Acropora tenuis * Ricordea florida Hexacorallia * Lophelia pertusa * Astrangia sp Madracis mirabilis * Igernella notabilis Ircinia strobilina Keratosa (G1) Aplysina fulva Halisarca dujardini Myxospongiae (G2) * Callyspongia plicifera Xestospongia muta marine Haplosclerida (G3) * * Ephydatia muelleri Lubomirskia baicalensis Axinella corrugata Geodia neptuni Topsentia ophiraphidites Democlavia (G4) * Suberites domuncula Tethya actinia Oscarella carmela * Corticium candelabrum * Plakina monolopha Homoscleromorpha (G0) Plakinastrella cf. onkodes * Aphrocallistes vastus * Iphiteon panicea Hexactinellida Sympagella nux Clathrina clathrus * Leucetta chagosensis Petrobiona massiliana

161

APPENDIX B. SUPPLEMENTARY MATERIALS FOR CHAPTER 3

Supplementary Fig. S1: Enzymatic digestion of Alatina moseri DNA with the methylation-sensitive restriction enzyme HpaII. Digestion of A. moseri total DNA with HpaII followed by PCR amplification of two mitochondrial contigs containing up to four restriction sites. Lane A and B have been digested for three days with HpaII prior to PCR; lane C and D were obtained with control non-digested DNA. Digestion by HpaII prevents PCR amplification (lane A and B), while DNA methylation in the nucleus shall prohibit digestion of the DNA by the enzyme, resulting in successful PCR amplifications (lane C and D).

162

Supplementary Fig. S2: Phylogenetic relationships of polB sequences found in medusozoan mtDNAs. Phylogenetic reconstruction of polB sequences under Maximum Likelihood framework with TREEFINDER v. March 2011 using mtREV+GI model of sequence evolution. All medusozoan polB sequences (in blue) form a monophyletic clade. Node supports are bootstrap values for 5000 replicates.

163

APPENDIX B. SUPPLEMENTARY MATERIALS FOR CHAPTER 4

Additional figure 1 – Species list

List of species and accession numbers used for this studies. Bold species are from this study.

164

Phylum Subphylum Class Subclass Order species accession number Cnidairia Anthozoa Hexacorallia Actiniaria Metridium senile NC_000933 Nematostella sp NC_008164 Antipatharia Chrysopathes formosa NC_008411 Cirripathes lutkeni JX023266 glaberrima FJ597643-FJ597644 Corallimorpharia Discosoma sp NC_008071 Rhodactis sp NC_008158 Ricordea florida NC_008159 Scleractinia Acropora tenuis NC_003522 Agaricia humilis NC_008160 Anacropora matthai NC_006898 Astrangia sp NC_008161 Colpophyllia natans NC_008162 Lophelia pertusa NC_015143 Madracis mirabilis NC_011160 Montastraea annularis NC_007224 Montastraea faveolata NC_007226 Montastraea franksi NC_007225 Montipora cactus NC_006902 Mussa angulosa NC_008163 Pavona clavus NC_008165 Pocillopora damicornis NC_009797 Pocillopora eydouxi NC_009798 Porites porites NC_008166 Seriatopora caliendrum NC_010245 Seriatopora hystrix NC_010244 Siderastrea radians NC_008167 Stylophora pistillata NC_011162 Zoantharia Palythoa sp DQ640650 Savalia savaglia NC_008827 Ceriantharia Ceriantheopsis americanus JX023261-JX023265 Octocorallia Alcyonacea Acanella eburnea NC_011016 Briareum asbestinum NC_008073 Corallium konojoi NC_015406 Dendronephthya gigantea NC_013573 Dendronephthya castanea GU047877 Dendronephthya mollis HQ694725 Dendronephthya putteri HQ694726 Dendronephthya suensoni GU047878 Echinogorgia complexa HQ694727 Euplexaura crassa HQ694728 Keratoisidinae sp NC_010764 Paracorallium japonicum NC_015405 Pseudopterogorgia NC_008157 Sarcophyton glaucum AF064823 & AF063191 Scleronephthya gracillimum GU047879 Sinularia sp JX023274 Helioporacea Heliopora coerulea JX023267-JX023272 Pennatulacea Renilla mulleri JX023273 Stylatula elongata JX023275 Medusozoa Cubozoa Carybdeida Alatina moseri JN700951-JN700958 Carukia barnesi JN700959-JN700962 Carybdea xaymacana JN700977-JN700983 Chirodropida Chironex fleckeri JN700963-JN700968 JN700969-JN700974 Hydrozoa Trachylina Limnomedusae Cubaia aphrodite JN700942 Hydroidolina Aplanulata Ectopleura larynx JN700938 Hydra magnipapillata NC_011220-NC_011221 Hydra oligactis NC_010214 Hydra vulgaris BN001179-BN001180 Capitata Millepora platyphylla JN700943 Pennaria tiarella JN700950

165

Filifera III Clava multicornis JN700935 Filifera IV Nemopsis bachei JN700947 Leptothecata Laomedea flexuosa JN700945 Scyphozoa Coronatae Linuche unguiculata JN700939 Discomedusae Rhizostomeae Cassiopea andromeda JN700934 Cassiopea frondosa JN700936 Catostylus mosaicus JN700940 Rhizostoma pulmo JN700987 Semaeostomeae Aurelia aurita A NC_008446 Aurelia aurita B HQ694729 Chrysaora sp JN700941 Chrysaora quinquecirrha HQ694730 Cyanea capillata JN700937 Pelagia noctiluca JN700949 Staurozoa Stauromedusae convolvulus JN700975 & JN700976 Haliclystus sanjuanensis JN700944 Lucernaria janetae JN700946 Placozoa BZ10101 NC_008832 BZ243 NC_008834 BZ49 NC_008833 Trichoplax adhaerens NC_008151 Porifera Homoscleromorpha Corticium candelabrum NC_014872 Oscarella carmela NC_009090 Plakina monolopha NC_014884 Plakinastrella cf. onkodes NC_010217 Demospongiae G1=Keratosa Igernella notabilis NC_010216 Ircinia strobilina NC_013662 G2=Myxospongiae Aplysina fulva NC_010203 Chondrilla aff. nucula NC_010208 Halisarca dujardini NC_010212 G3=marine Haplosclerida Amphimedon compressa NC_010201 Callyspongia plicifera NC_010206 Xestospongia muta NC_010211 G4=Democlavia Agelas schmidti NC_010213 Axinella corrugata NC_006894 Cinachyrella kuekanthali NC_010198 Ephydatia muelleri NC_010202 Geodia neptuni NC_006990 Lubomirskia baicalensis NC_013760 Suberites domuncula NC_010496 Tethya actinia NC_006991 Topsentia ophiraphidites NC_010204

Additional Figure 2 – Cnidarian phylogeny of mitochondrial protein genes using the alternative AliMG alignment.

Phylogenetic analyses of cnidarian protein coding genes under the CATGTR model with

PhyloBayes for the AliMG alignments (3485 positions, 106 taxa) created using the combination MUSCLE+Gblocks. Support values correspond to the posterior probabilities for

166 the GTR(BI) and bootstraps for GTR(ML) analyses. Stars denote maximum support values. A dash denotes discrepancy between the results obtained by different models.

167

* Placozoa 0.99/1/100 Porifera * Zoantharia

* * Actiniaria 1/1/78 * Antipatharia 1/1/91 * H * Corallimorpharia 1/1/91 e

* x

* * a * * * c

* o

0.57/0.22/16 r

a 0.51/-/16 * l * * Scleractinia l

i

* a * *

* * 1/1/96 1/1/94 * Ceriantharia Helioporacea 0.99/-/12 Alcyonacea *

O 1/1/96 Pennatulacea

0.63/-/45 * c

t

* * o 0.67/-/-

c

o 1/1/99

r

0.47/-/- a * Alcyonacea

1/1/99 l 1/1/74 l

i

0.74/0.89/49 a 1/0.98/54 1/1/99 0.95/1/92 0.55/-/20 Coronatae * 0.87/-/43 Stauromedusae Staurozoa 0.83/1/58 0.99/1/99 0.37/0.62/65 Carybdeida * Cubozoa * Chirodropida * Limnomedusae H * * Filifera III Filifera IV y

* Leptothecata d r

0.99/-/- 1/1/95 Capitata o z 1/1/70 * o

* Aplanulata a

1/0.99/89 *

e a s u d e m o c s i D

* * * Semaeostomeae * * 0.85/1/89 * * 0.5 * Rhizostomeae

168

Additional Figure 3 – Cnidarian phylogeny of mitochondrial protein genes using the reduced alignments AliS and AliMG with 103 species.

Phylogenetic analyses of cnidarian protein coding genes under the GTR model and CAT approximation with RAxML for the AliS alignments created using CluatalW+SOAP, where the coronate Linuche unguiculata, the tube anemone Ceriantheopsis americanus and the blue octocoral Heliopora coerulea were removed. Support values correspond to the bootstraps for both AliS and AliMG alignments. Stars denote maximum support values. A dash denotes discrepancy between the two alignments.

169

* Placozoa

99/98

* Porifera 97/86

* * 79/78 *

95/91 * *

97/90 * 84/88 * * 99/100 * Hexacorallia * *

99/100 * * * * *

* *

91/96 73/97 99/100

89/90

37/- * * 77/67 * Octocorallia 98/99 33/- 99/100

97/98

50/73

45/-

95/97 51/74 84/88

* 43/40 Staurozoa 85/90 37/58 97/98 * Cubozoa *

* * * 91/88 Hydrozoa * 98/95

82/87 * 66/77 * *

* * 0.2 99/100

* * 86/87 Discomedusae * * 99/100

Additional Figure 4 – Composition of the amino acid alignment used in this study.

Principal component analysis of the amino acid composition per species for each of the alignments (AliS and AliMG) used in this study. Species have been color-coded per group corresponding to each of the main cnidarian clades (Coronatae, Cubozoa, Discomedusae,

Hexacorallia, Hydrozoa, Octocorallia, and Staurozoa), Porifera and Placozoa. The axes explain

33 and 23 per cent of the data for AliS and 33 and 22 percent for AliMG.

170

AliS

AliMG

171

Additional figure 5 – The use of several statistical tests verifying the validity of some groups in cnidarians for the reduced alignments.

Probability values for the AU, KH and SH tests and BI values for several clades traditionally recognized in Cnidaria for the reduced alignments AliS and AliMG containing 103 species, where the coronate Linuche unguiculata, the tube anemone Ceriantheopsis americanus and the blue octocoral Heliopora coerulea were removed.

!"#$%&'() !"#-.%&'() $*+, -/01"2343.5"*160 !" #$ %$ !" #$ %$ !&'()*+,( -./0 -./1 -.02 -.23 -.44 -.0- %&5676(*898:;<676(898$5,'676( -.10 -.11 -.=/ -.=> -.== -.>0 %&5*?676(898:;<676(898%@(;'676( -.-1 -.-= -.1> -.-/ -.1/ -.=4 %&5*?676(898%@(;'676( -.-0 -.-/ -.11 -.-= -.-= -.=-

Additional figure 6 – Table of morphological characters mapped on the best tree.

Mapping of morphological characters under DELTRAN and ACCTRAN models differed only for characters (5) and (7). (1) symmetry: medusozoan taxa have been scored radial, but

Marques and Collins (2004) subdivided it into radial, biradial, or radial tetramerous;

Octocorallia, Zoantharia and Ceriantharia are scored bilateral (Won et al. 2001), while the remnant of Hexacorallia was scored radial (Marques and Collins 2004); bilateral symmetry is inferred the ancestral state for Cnidaria. (2) free-swimming medusoid stage: Hexa-,

Octocorallia and Staurozoa all lack a free-living adult form (Marques and Collins 2004); the medusoid stage is also lacking in some Hydrozoa species but Collins (2002) and Cartwright and Nawrocki (2010) have inferred that the ancestral Hydrozoa had a medusoid stage. (3)

172 velum: only present in Hydrozoa (Marques and Collins 2004). (4) strobilation: only described in Coronatae and Discomedusae (Marques and Collins 2004). (5) gastric filaments: we scored them as present in Cubozoa (Brigde et al. 1995), Staurozoa, Coronatae and Discomedusae

(Marques and Collins 2004), and absent in Hydrozoa (Brigde et al. 1995), although suggested to be present in some Aplanulata (Bouillon et al. 2004); we decided to opt for the most parsimonious scenario. (6) ephyrae: only described in Coronatae and Discomedusae (Marques and Collins 2004). (7) radial canals: described in Coronatae, Discomedusae, and some

Hydrozoa, absent in Staurozoa and unresolved in Cubozoa (Marques and Collins 2004); we choose two independent gains in Coronatae and Discomedusae as the most parsimonious scenario. (8) circular canals: present in Discomedusae and Hydrozoa, absent in the rest of

Medusozoa. Marques and Collins (2004) have further distinguished the level of development of these structures that are partial in Discomedusae and full in Hydrozoa. (9) quadrate symmetry of horizontal cross section: present only in Cubozoa and Staurozoa (Marques and

Collins 2004). (10) organization of the gastrodermal muscles of the polyp: organized in bunches of gastrodermic origin in all Hexacorallia but Ceriantharia, and inferred as the ancestral state for Cnidaria; organized in bunches of epidermic origin in all Medusozoa but

Hydrozoa. In Hydrozoa and Ceriantharia, gastrodermal muscle are not organized in bunches

(Marques and Collins 2004).

173

! " # $ % & ' ( ) !* Actiniaria 1 0 0 0 0 0 0 0 0 1 Antipatharia 1 0 0 0 0 0 0 0 0 1 Corallimorpharia 1 0 0 0 0 0 0 0 0 1 Scleractinia 1 0 0 0 0 0 0 0 0 1 Zoantharia 0 0 0 0 0 0 0 0 0 1 Ceriantharia 0 0 0 0 0 0 0 0 0 2 Alcyonacea 0 0 0 0 0 0 0 0 0 1 Helioporacea 0 0 0 0 0 0 0 0 0 1 Pennatulacea 0 0 0 0 0 0 0 0 0 1 Carybdeida 1 1 0 0 1 0 ? 0 1 0 Chirodropida 1 1 0 0 1 0 ? 0 1 0 Limnomedusae 1 1 1 0 0 0 1 1 0 2 Aplanulata 1 1 1 0 0 0 1 1 0 2 Capitata 1 1 1 0 0 0 1 1 0 2 Filifera III 1 1 1 0 0 0 1 1 0 2 Filifera IV 1 1 1 0 0 0 1 1 0 2 Leptothecata 1 1 1 0 0 0 1 1 0 2 Coronatae 1 1 0 1 1 1 1 0 0 0 Rhizostomeae 1 1 0 1 1 1 1 1 0 0 Semaeostomeae 1 1 0 1 1 1 1 1 0 0 Stauromedusae 1 0 0 0 1 0 0 0 1 0

174

ACKNOWLEDGMENTS

First of all, I would like to thank Dennis V. Lavrov for having admitted me as his student, and who, despite all our ups and downs, restrained himself from throwing me out of his lab. That was not an easy task. I also thank my POS Committee members Anne Bronikowski, John Downing, Eric Henderson, Stephan Q. Schneider, and Jeanne Serb for their support. I would like to acknowledge the past and present members of the Lavrov Lab (Karri Haen, Xiujuan Wang, Romulo Segovia, and Walker Pett) for also coping with me all these years. I am particularly indebted to Andrea Bekic, Philip Lange, Selena Leonard, and Matthew Palen for their help in the lab and more. I would also like to mention some of my fellow graduate students Arun Sethuraman, Matthew Karnatz, Alvin Mercado Alejandrino, and many others, and tell them "Good luck!" I would like to thank the Wendel Lab, and particularly Corrinne Grover and Kara Grupp for their help in acquiring data and moral support during the hard periods. I also thank Dr. Jonathan Wendel and the EEOB Department for supporting for most of my time at ISU. I would also like to acknowledge the Porifera Tree of Life project (www.portol.org) and particularly Dr. Allen G. Collins, Dr. Robert W. Thacker and Dr. Cristina Diaz for their support in my work. I thank the Smithsonian Institution for having provided me with an opportunity to work at the National Museum of , and all the people from the LAB group, particularly Matthieu and Natalia. I also want to thank Hervé Philippe, Béatrice Roure and Henner Brinkmann for their help regarding phylogenetics and de- growth! I would not have been able to survive Ames for four years without the friendship of Zee, Rishi, Michael, Scott, Lando, the food-and-movie addicts (Paula, Diego, Tomoko, Kaya, Basak and Fatih, Giorgi, did I forget anyone?), the French collection (among other Boris and Jane, Estelle and Etienne, Jean-Pierre, Sophie, Charlène, Fabienne, Camille, Stéphanie), Casino Tehran members (Mostafa, the Alis, Fardad, the Mohammads, Dr Mina and family, Reza and Jami, Mirzad and Mina, Elham and Pedram, Omid, Hamed, and many more) and many more. Finally, this PhD work could not have been achieved without the help and support of my family (thanks bro!), Hannah and her family, and many friends back in France and other parts of the World who believe in me.