<<

Iowa State University Capstones, Theses and Graduate Theses and Dissertations Dissertations

2010 Mitochondrial genome evolution in the Metazoa: Insights from Hexactinellida ( Porifera) and Phylum Karri Michelle Haen Iowa State University

Follow this and additional works at: https://lib.dr.iastate.edu/etd Part of the Ecology and Evolutionary Biology Commons

Recommended Citation Haen, Karri Michelle, "Mitochondrial genome evolution in the Metazoa: Insights from Class Hexactinellida (Phylum Porifera) and Phylum Cnidaria" (2010). Graduate Theses and Dissertations. 11764. https://lib.dr.iastate.edu/etd/11764

This Dissertation is brought to you for free and open access by the Iowa State University Capstones, Theses and Dissertations at Iowa State University Digital Repository. It has been accepted for inclusion in Graduate Theses and Dissertations by an authorized administrator of Iowa State University Digital Repository. For more information, please contact [email protected]. Mitochondrial genome evolution in the Metazoa: Insights from Class Hexactinellida

(Phylum Porifera) and Phylum Cnidaria

by

Karri M. Haen

A dissertation submitted to the graduate faculty

in partial fulfillment of the requirements for the degree of

DOCTOR OF PHILOSOPHY

Major: Genetics

Program of Study Committee Dennis V. Lavrov, Major Professor Allan Miller John Nason Jeanne Serb Robert Wallace

Iowa State University

Ames, Iowa

2010 ii

TABLE OF CONTENTS

ABSTRACT………………………………………………………………………………….iv

CHAPTER 1. General introduction…………………………………………………..……….1 Introduction………………………………………………………………………..…..1 Dissertation organization…………………………………………………………….12 References……………………………………………………………………………12

CHAPTER 2. Glass and bilaterian share derived mitochondrial genomic features: a common ancestry or parallel evolution?...... 17 Abstract…………………………………………………………………………...... 17 Introduction………………………………………………………………………...... 18 Materials and Methods……………………………………………………………….19 Results……………………………………………………………………………...... 22 Discussion……………………………………………………………………………29 Acknowledgments……………………………………………………………………31 Literature Cited………………………………………………………………………32

CHAPTER 3. Mitochondrial genome evolution in Hexactinellida (Phylum Porifera)……...43 Abstract………………………………………………………………………………43 Introduction………………………………………………………………………...... 44 Materials and Methods……………………………………………………………….47 Results……………………………………………………………………………...... 48 Discussion……………………………………………………………………………56 Literature Cited………………………………………………………………………61

CHAPTER 4. Unusual phylogenetic distribution of +1 translational frameshifts in protein coding genes………………………………………………………...... 70 Abstract…………………………………………………………………………...... 70 Introduction...... 71 Results and Discussion………………………………………………………………78 Acknowledgments……………………………………………………………………88 Literature Cited………………………………………………………………………89

CHAPTER 5. Parallel loss of nuclear-encoded mitochondrial aminoacyl-tRNA synthetases and mtDNA-encoded tRNAs in Cnidaria…………………………………………………..103 Abstract……………………………………………………………………………..103 Introduction…………………………………………………………………………104 Results and Discussion……………………………………………………………..105 iii

Acknowledgments…………………………………………………………………..111 Literature Cited……………………………………………………………………..112

CHAPTER 6. General conclusions………………………………………………………....116 Conclusions………………………………………………………………………....116 Future directions……………………………………………………………………118 Literature Cited……………………………………………………………………..120

APPENDIX A. Supplemental materials for Chapter 3…………………………………….121

APPENDIX B. Supplemental materials for Chapter 5………………………………….....127

ACKNOWLEDGMENTS………………………………………………………………….140

iv

ABSTRACT

Metazoan mitochondrial genomes provide a model system for evolutionary genomic studies. Due to their relatively small size, sequences for complete mitochondrial DNAs

(mtDNAs) are now available for animals over a large phylogenetic spectrum, including recent additions for the three classes of Porifera and various cnidarians. This wealth of data gives us a unique opportunity to infer the mechanisms underlying the evolution of genomes.

In this dissertation, I will address the example of extensive parallel evolution between two distantly related groups, the and Class Hexactinellida (Phylum Porifera; glass sponges), citing the specific examples of degenerated tRNA secondary structures and changes in the genetic code. Much of the plasticity of metazoan mtDNA is linked to the variation of tRNA genes, and I will hypothesize upon the mechanisms that bring about this variation. I will also discuss possible mechanisms involved in the mitochondrial genome reduction of Nematostella vectensis (Class , Phylum Cnidaria), where all but two tRNAs are lost from the mtDNA. Finally, mitochondrial genome data from the

Hexactinellida indicates +1 translational frameshifting in protein coding genes, which has only rarely been reported in the mt-genomes of animals. The application of molecular phylogenies both resolves questionable phylogenetic relationships inside Hexactinellida, but also suggests a pattern of evolution for various unusual mt-genomic features.

CHAPTER 1. GENERAL INTRODUCTION

Introduction

Mitochondria, or their reductive vestiges, hydrogenosomes and mitosomes, are essential organelles that are present in all eukaryotic cells (Hjort et al. 2010). The endosymbiotic theory, an idea first proposed by a Russian scientist Mereschkowsky (Martin and Kowallik 1999) that was revived and popularized by Margulis (1970), has been confirmed by molecular evidence, showing that mitochondria originated from the endosymbiotic fusion of two prokaryotes. Although the progenitor of the mitochondrial genome was mostly likely a free-living !-proteobacterium, the molecular phylogenetic picture arising from the analysis of complete mitochondrial proteomes has revealed diverse origins for these proteins (Szklarczyk and Huynen 2010). Regardless, all contemporary mitochondria possess vastly fewer genes than the genomes of extant !-proteobacteria. During the course of mitochondrial evolution, most of the genes essential to organellar biogenesis have either been transferred to the nucleus or supplanted by preexisting nuclear genes of similar function (Duchene et al. 2009; Adams and Palmer 2003).

The primary function of the is ATP production via the oxidative phosphorylation (OXPHOS) pathway. Despite this common function in most , the evolution of mitochondrial genomes is highly irregular among independent phylogenetic lineages, encompassing a broad range of mitochondrial genome sizes, gene contents, and evolutionary rates (Hjort et al. 2010). In particular, mitochondria revealed a diversity of mt-genomic features (Wolstenholme 1992), and the sampling of new mt-genomes, which has recently surged due to high-throughput technologies and expanded interest in Tree of

Life (ToL) projects, has challenged the former ideal of the maximally streamlined, uniform metazoan mtDNA molecule (Lavrov 2007).

The analysis of mitochondrial (mt) genomes has been useful for investigating animal relationships (Lavrov & Lang 2005). Complete mtgenome sequencing not only provides a wealth of data including sequences for protein coding genes and RNAs, but this data may also may contain rare genomic changes (RGCs) that are additionally informative for phylogenetic inferences (Rokas & Holland 2000). Gene rearrangements, tRNA and protein coding gene deletions, variations in the genetic code, differences in the secondary structures of encoded transfer and ribosomal RNAs, overlapping genes, and the presence of ribosomal frameshifting sites in protein-coding sequences are all considered RGCs that may be exploited for phylogeny reconstruction. Additionally, the use of protein coding genes from eukaryotic mtDNAs simplifies the question of orthology when selecting genes for molecular phylogenetic analyses (Lavrov and Lang 2005). In nearly all cases where gene transfer has been reported in animal mtDNA, the transfer occurred between very closely related

(especially in sister species) and was directly linked to interspecies hybridization in zones of sympatry or parapatry (eg. Chan and Levin 2005; Bachtrog et al. 2006).

Hexactinellids (“glass sponges,” Class Hexactinellia, Phylum Porifera) have represented one of the final frontiers for mitochondrial genomic data in the context of the early branching Metazoa. Glass sponges are an enigmatic group of marine, predominantly deepwater animals that contains 500600 described extant species, which play an important role in benthic communities. Although a rudimentary analysis may find that glass sponges appear similar to other sponges, they possess characteristics that clearly distinguish them from other members of the animal , including syncytial tissues that arise by fusion of early embryonic cells (Leys et al. 2006) and a nerve independent impulse conduction system (Leys et al. 2007). Our analyses of two glass mt genomes (Chapter 1) revealed a multitude of parallel evolutionary events that have occurred between

Hexactinellida and the Bilateria. A parallel change in the genetic code, a peculiar secondary structure of one of the serine mttRNAs, highly derived tRNA and rRNA genes, and a high degree of support for an erroneous placement of the Hexactinellida as the sister group to the

Bilateria in the molecular phylogeny (Haen et al. 2007) confound traditional ideas upon the placement of glass sponges at the base of the metazoan tree of life. Furthermore, these data call into question the notion that rarely occurring character states are not susceptible to extensive convergent or parallel evolution, an argument elaborated upon by Rokas & Holand

(2000) in a highly cited Trends in Ecology & Evolution review.

In this dissertation I aim to investigate two basic questions. Primarily, can the mtDNA based phylogeny of Class Hexactinellida be reconciled with traditional taxonomic views?

This question is directly related to the mission of the Porifera Tree of Life Project (PorToL), which aims to survey molecular diversity in several lineages of the Porifera. An obvious possible outcome from increased molecular data availability for Hexactinellida is the phylogenetic placement of Hexactinellida both among the rest of the Metazoa (Ch. 1) as well as the analysis of its inter-class molecular phylogenetic relationships (Ch. 2). Secondarily, I aim to address a more general question about the evolution of animal mitochondrial DNA: what evolutionary processes drive the multitude of unusual molecular genetic changes we see inside metazoan mitochondrial genomes? For these analyses, I will survey two major groups of mitochondrial genes, tRNA and protein-coding genes, for unusual features with respect to the rest of the Metazoa. With respect to the protein coding genes, I will address the unusual case of +1 translational frameshifting in Hexactinellida (Ch. 3). For the tRNA component, I will analyze the evolution of mt-tRNAs with unusual secondary structures inside the

Hexactinellida (Ch. 2), as well as what forces drive tRNA gene loss from mt-genomes, as particularly evidenced in Phylum Cnidiaria, by analyzing both the mt and nuclear genomes of the cnidarian Nematostella vectensis (Ch. 4).

Conflicting evolutionary hypotheses from the mtDNA-based phylogeny of the

Hexactinellida

According to the record, glass sponges (class Hexactinellida) are quite possibly the most ancient metazoan group, with the oldest full body of reticulata found in the fauna of the dating to approximately 550

MYA (Gehling and Rigby 1996). Because of their particularly deep divergence, the details of the glass sponge phylogeny are expected to provide a unique perspective upon the rise of multicellularity and molecular evolution in the animal lineage.

The Hexactinellida is divided into two subclasses: the Amphidiscophora and

Hexasterophora: the former having amphidisc microscleres or some variant of them, and the latter having hexaster microscleres (Schulze 1899). Cladistic analyses of morphological characters, especially regarding the configuration of the siliceous , place glass sponges into six recent orders. Subclass Amphidiscophora contains one recent with three families, which are distinguished by their choanosomal spicules. Subclass

Hexasterophora contains five recent orders and several unclassified species (Reiswig 2002).

Although glass sponges have been traditionally placed with calcareous and in the phylum Porifera (Thomson 1869; Schmidt 1870), this association is based on the overall morphological similarity of sponges rather than any specific synapomorphies (Hooper & Van Soest 2002). Furthermore, most sequencebased phylogenetic studies fail to recover Porifera as a monophyletic group (eg. Kruse et al. 1998;

Adams et al. 1999; Borchiellini et al. 2001; Medina et al. 2001; Rokas et al. 2005). Some of these studies place the Hexactinellida as a sister group to the rest of the Metazoa (Kruse et al.

1998; Borchiellini et al. 2001), while others support their affinity with the Demospongiae, but exclude Calcarea from the group (Adams et al. 1999; Medina et al. 2001). Our recent analysis of two glass sponge mitochondrial genomes found striking similarities between hexactinellid and bilaterian mtDNA (Ch 1). Despite this, statistical support for the alternative hypotheses of poriferan relationships has generally been low.

The ultimate goal of many multiplespecies phylogenetic studies is the resolution of the overall phylogenetic picture for all . The early origin of sponges is indisputable given their fossil record, and sponges are placed at the base of the metazoan radiation by phylogenetic studies using morphological characters (for information see Leys et al. 2007).

However, the exact of the relationships among sponges and other nonbilaterian animals remains controversial. It is our longterm goal to assess the phylogenetic position of glass sponges among other nonbilaterian animals. Increased sampling of the glass sponges and the use of more sophisticated phylogenetic models are needed to resolve current difficulties in earlydiverging animal phylogenies.

Mitochondrial genome evolution in Class Hexactinellida (Phylum Porifera)

Until recently (Haen et al. 2007, Rosengarten et al. 2008, Dohrmann et al. 2008,

Dohrmann et al. 2009) there was virtually no molecular data available for Class

Hexactinellida. The number of hexactinellid partial and complete mt genome sequences has since increased from the three previously published genomes to 10, including individuals from the seven taxonomic families presented in Chapter 2 of this dissertation. This chapter reexamines the prevalence of parallel changes in Class Hexactinellida and the Bilateria, including the in-depth descriptions of the topological features of their mtDNAs, such as a bilaterianlike nucleotide composition of protein coding genes, very little or absent noncoding DNA with the exception of a single, large control region, an arginine (AGR) to serine (AGN) change in the genetic code, the presence of a particular mttRNA that has an absent dihydrouridine (DHU) stem replaced by a loop, and other highly derived tRNA and rRNA genes.

Given the genes determined for a particular strand of DNA from the topological analysis of a mitochondrial chromosome, we can infer the ordering and directionality of the genes, and represent the chromosome by an ordering of oriented genes (Moret et al. 2001).

The comparison of inferred mitochondrial gene arrangements may be useful for phylogenetic inferences among the metazoan clades where hypotheses based on primary sequence data are equivocal (Lavrov & Lang, 2005), or they may lend additional support to other molecular studies, especially when they call for reclassification. The large number of possible gene arrangements makes convergence in the gene order unlikely, and slow rates of gene rearrangements in animal mtDNA may preserve phylogenetic signal from mutational saturation, which may be problematic in the case of hexactinellid mtDNA.

In Chapter 2 of this dissertation, I will examine the distribution of the previously mentioned RGCs among the various hexactinellid lineages of this study. Particularly, my aim is to reexamine tRNAs of divergent structure, with emphasis on investigating the evolution of the previously reported seryl tRNA (Ch. 1), mttRNA(Ser)(UCU), that appears to be intermediate in structure between the tRNAs found in other nonbilaterians and the tRNAs found in bilaterians. This structure contains an extended loop rather than the prototypical

DHU stemloop configuration, similar to the severely truncated loop (with an absent stem) that functions in bilaterian mtDNA (Steinberg et al. 1994). We have hypothesized this unusual tRNA structure may be related to a corresponding change in the genetic code, whereby the tRNA decodes AGR codons rather than the standard arginyltRNA; however, this hypothesis can be disproved depending upon the distribution of the tRNA structural deviation in Hexactinellida.

Extended sampling of hexactinellid rDNAs, including genera from the Orders

Lyssacinosida, Hexactinosida and Amphidiscophora have largely been congruent with the traditional classification system (Dohrmann et al. 2008; Dohrmann et al. 2009); however, contrary to morphology, current molecular evidence suggests the dictyonal (fused) skeleton is a convergent feature in glass sponges (Dorhmann et al. 2008). Based upon the extended sampling of mt-genomes in this section of the dissertation, I will also determine the inter- class phylogeny of Hexactinellida with molecular protein coding gene and gene order analyses. These results will either support or refute the traditional placement of sampled dictyonine glass sponges within Order Hexactinosida.

Analysis of putative +1 translational frameshift sites in hexactinellid mtDNA

Programmed translational frameshifting is a ubiquitous but illdefined mechanism of gene expression that can correct that lead to a change in reading frame without altering the sequence of the mRNA transcript. The frameshift event is thought to occur by ribosomal bypass of nucleotides and stands in contrast to RNA editing, which alters a gene sequence at the transcriptional level by direct extraction of nucleotides from the RNA sequence (Lavrov et al. 2000). The phenomenon of translational frameshifting was first discovered and has been extensively studied in prokaryotes and lower eukaryotes (reviewed in Baranov et al. 2002; Farabaugh 1996), but incidences of possible frameshifting in animal mitochondria have been formally dismissed as sequencing errors, PCR artifacts or nuclear pseudogenes (numts) (Harlid et al. 1997; Desjardins & Marais 1990). More recently, a few studies have reported single base insertions in what appear to be fully functional protein coding genes (Mindell et al. 1998; Beckenbach et al. 2005; Milbury & Gaffney 2005; Parham et al. 2006; Russel & Bechenbach 2008) suggesting translational framshifting may act as either a means to cope with or as a mechanism for regulating gene expression in animal mitochondria.

Our analysis of mitochondrial genomes from the glass sponges Iphiteon panicea,

Aphrocallistes beatrix (Order Hexactinosida) and Sympagella nux (Order )

(Chapter 1) revealed five putative cases of translational frameshifting: in the cox1 gene of I. panicea, the nad2 genes of both I. panicea and S. nux, and two incidents of shifting in the cox3 gene of A. beatrix. The base insertion in cox1 results in a deduced protein with a carboxy end that has little similarity to other COX1 sequences. In the cases of nad2 and cox3, base insertions create truncated ORFs with inframe stop codons. In all of these cases +1 frameshifts should result in translated proteins that are functional. The objectives of Chapter

3 of this dissertation were to 1.) verify the expression of genes containing single base insertions in glass sponge mitochondria, and 2.) determine the evolutionary relationships among the frameshifting sites from several genera and families, particularly by applying molecular dating to the phylogeny in order to determine the relative ages of frameshifts.

Interestingly, all reported cases of suspected translational frameshifts in animal mitochondria include single base insertions (+1 frameshifts) incorporated into what appear to be very short frameshifting signals. In the case of glass sponges, a tetranucleotide sequence

TGGA seems to be one signal necessary for frameshifting. This signal is identical to the sequence reported for the cob gene from Genus Polyrhachis (a large group of ants) mitochondria (Bechenbach et al. 2005). Current analysis of glass sponge putative frameshifting signals leads us to suspect the mechanism in animal mitochondria may by

Tylike. Particularly, for Ty3like +1 frameshifting, there must be a cooccurrence of three events: a peptidyltRNA in the ribosomal P site must form a weakened interaction with the mRNA; second, a slow recognition of the next inframe codon must cause a translational pause; and, third, an abundant aminoacyltRNA must recognize the first +1 frame codon to cause a shift in the reading frame.

Parallel losses of tRNAs & aminoacyl tRNA synthetases in the evolution of the metazoan mitochondrial translational machinery

Mitochondria, which originated from the endosymbiotic fusion of an Archea-type host with an !proteobacterium–like ancestor, all possess vastly fewer genes than the genomes of extant !proteobacteria (Adams & Palmer 2003). During the course of mitochondrial evolution, most of the genes essential to organellar biogenesis have either been transferred to the nucleus or supplanted by preexisting nuclear genes of similar function

(Duchene et al. 2009). The process of deletion and functional replacement of mitochondrial genes by nuclear copies is presumed to have essentially stopped in animals, for which the vast majority of fully sequenced mt genomes only encode 1213 proteins, two ribosomal

RNAs, and 22 tRNAs (Lavrov 2007). Perhaps the most notable exceptions to this rule are found in lineages of the Porifera and Cnidaria, where as many as 20 of the mitochondrial tRNAs are absent (Wang et al. 2008). Extreme variations in gene loss suggest the possibility of rampant convergence leading to presentday mitochondrial gene content, and indicate a polyphyletic origin, and potentially alternative mechanisms, for the targeting and import of nuclear encoded genes into the mitochondrion (reviewed in Mirande 2007).

The fidelity of cellular protein synthesis is accomplished by the specific attachment of amino acids to their cognate tRNAs and by the specific interaction between the anticodons of tRNAs and codons in mRNA. The first of these processes is governed by a diverse family of called the aminoacyl-tRNA synthetases (aaRS). In animals and other non- photosynthetic eukaryotes, aaRS are subdivided into two structurally distinct nuclear- encoded sets: one for translation in the cytosol and one for translation in mitochondria. The nuclear integration of some mt-aaRS has included the de novo acquisition of an mt-targeting signal to redirect the integrated nuclear protein to the mitochondria (Peeters et al. 2000).

Similarly, targeting signal acquisition is necessary for a cytosolic to become mt- directed, which occurs when an ancestral mt-aaRS is lost (Duchêne et al. 2009; Wang et al.

2003). Dual targeting of the same aaRS to different cellular compartments is not an unusual phenomenon, and the acquisition of a mt targeting signal by a cytosolic aaRS should serve as a strong indicator of mt-aaRS gene loss (Duchêne et al. 2005).

The research in Chapter 4 of this dissertation attempts to address three fundamental questions regarding the evolution of mt-tRNA gene loss from mitochondrial genomes, starting with why are tRNA genes lost in some animal lineages but not in others? The first step in answering this question is to know which genes may be lost and how often losses occur. It is relatively well known that cnidarians (with the exception of soft ; Subclass

Octocorallia) contain genes for only the tryptophanyl and methionyl tRNAs in their mtgenomes. The ability of mtaaRS, which are generally thought to be critical for mttranslation, to detect and charge cytosolic tRNAs has been a proposed explanation for the loss of mttRNA genes in eukaryotes (Schneider 2001). Interestingly, our analysis of complete genome data and ESTs from the cnidarians Nematostella vectensis and Hydra magnipapillata revealed only the mtaaRS for tryptophan and phenylalanine were present.

Our data further indicate the loss of mtaaRS has been a common process in the evolution of the Metazoa, and that it may be generally correlated with the loss of mttRNAs.

We also asked how does the process of deletion and functional replacement of mt genes by nuclear copies result in protein targeting to the mitochondrion in cnidarians?

Acquisition of mitochondrial targeting seems to generally involve recombination into preexisting promoter and/or transit peptidecoding regions. Our own annotation of genomic data from N. vectensis revealed cytosolic aaRS with alternative translational start sites that could produce Nterminal residues with MITOPROT predictions between 0.98.99, suggesting these were mt targeting sequences. It appears that in N. vectensis the cytosolic aaRS supplant the function of mtaaRS in these instances.

Finally, based upon an increased sampling of aaRS from nonbilaterian animals, what are the evolutionary trends seen in this family of proteins? Our previous phylogenetic analysis of nonbilaterian mt seryltRNA synthetases revealed this enzyme formed distinct clades for bilaterian and nonbilaterian animals, which corresponded directly with the presence/absence of certain identity elements in the cognate tRNA (Haen & Lavrov, unpublished data). High substitution rates and the dynamic nature of tRNA structure in animal mitochondria may have led to other distinctions between bilaterian and nonbilaterian aaRS that may only be elucidated with adequate sampling.

Dissertation organization

This dissertation consists of this general introduction, four journal papers (Chapters 2-

5), and general conclusions (Chapter 6). Chapter 2, published in the Molecular Biology and

Evolution (2007), presents the first ever study of the mtDNA of a hexactinellid sponges and their phylogenetic relationships with other animals. Chapter 3, prepared for submission to

Mitochondrion, extends the Chapter 2 research to encompass Hexactinellids comprising six taxonomic orders and seven families. Chapter 4, prepared for subission to Molecular

Biology and Evolution, is an in-depth examination of +1 translational frameshifts in hexactinellid protein coding genes, and includes molecular dating for Hexactinellida. Chapter

5, published in Molecular Biology and Evolution (2010), examines the parallel loss of mitochondrial aminoacyl tRNA synthetases from the nuclear genome with the loss of mitochondrial tRNAs of Nematostella vectensis (Phylum Cnidaria). Under the advisement of my major professor, I conceived the research, analyzed the data and wrote manuscripts for all chapters. I was assisted in research and writing by my colleage, Walker Pett, for the Chapter

5 publication. Chapter 6 contains the summaries of the thesis and recommendations for future research.

References

Adams CL, McInerney JO, Kelly M. 1999. Indications of relationships between poriferan classes using fulllength 18S rRNA gene sequences. Mem Queensl Mus 44:3343.

Adams KL, Palmer JD. 2003. Evolution of mitochondrial gene content: gene loss and transfer to the nucleus. Molecular and Evolution, 29 (2003) 380–395

Andersson, S.F. & C. Kurland. 1995. Genome evolution drives the evolution of the translation system. Biochem. Biol. 73:775787.

Baranov, P.V., R.F. Gesteland & J.F. Atkins. 2002. Recoding: translational bifurcations in gene expression. Gene 286:187201.

Beckenbach, A.T., S.K.A. Robson, & R.H. Crozier. 2005. Single Nucleotide +1 Frameshifts in an Apparently Functional Mitochondrial Cytochrome b Gene in Ants of the Genus Polyrhachis. J. Mol. Evol. 60:14152.

Borchiellini C, Manuel M, Alivon E, BouryEsnault N, Vacelet J, Le Parco Y. 2001. Sponge paraphyly and the origin of Metazoa. J Evol Biol 14:171179.

Brasier, M.D. & G.O. Shields. 1997. Ediacaran clusters from Southwestern Mongolia and the origins of fauna. Geology 25:303306.

Buchan, J.R. & I. Stansfield. 2007. Halting a cellular production line: responses to ribosomal pausing during translation. Biol. Cell 99: 475487

Chimnaronk, S., M. Gravers Jeppesen, T. Suzuki, J. Nyborg & K. Watanabe. 2005. Dual mode recognition of noncanonical tRNAsSER by seryltRNA synthetase in mammalian mitochondria. EMBO J. 24:33693379.

Desjardins, P. & R. Morais. 1990. Sequence and gene organization of the chicken mitochondrial genome. A novel gene order in higher . J. Mol. Biol. 212:599 634.

Dohrmann, M., A. Collins & G. Worheide. 2009. New insights into the phylogeny of glass sponges (Porifera, Hexactinellida): Monophyly of Lyssacinosida and Euplectellinae, and the phylogenetic position of . Mol Phy Evol. 52: 257-262.

Dohrmann, M, D Janussen, J. Reitner, A. Collins & G. Worheide. 2008. Phylogeny and Evolution of Glass Sponges (Porifera, Hexacinellida). Systematic Biol 57: 388405.

Duchêne A, Pujol C, MarchalDrouard L. 2009. Import of tRNAs and aminoacyltRNA synthetases into mitochondria. Current Genetics, 5(1): 118.

Duchêne AM, Giritch A, HoVmann B, Cognat V, Lancelin D, Peeters NM, Zaepfel M, MarchalDrouard L, Small ID. 2005. Dual targeting is the rule for organellar aminoacyltRNA synthetases in . Proc Natl Acad Sci USA 102:16484– 16489.

Farabaugh, P.J. 1996. Programmed Translational Frameshifting. Annu. Rev. Genet. 30:50728.

Gehling, J.G.; Rigby, J.K. 1996. Long Expected Sponges from the Neoproterozoic Ediacara Fauna of South Australia. Journal of Paleontology, 2, pp. 185195.

Haen, K. M., Lang, B. F., Pomponi, S. A. & Lavrov, D. V. 2007. Glass sponges and bilaterian animals share derived mitochondrial genomic features: a common ancestry or parallel evolution? Mol. Biol. Evol. 24, 1518–1527.

Harlid, A., A. Janke, & U. Arnason. 1997. The mtDNA sequence of the ostrich and the divergence between paleognathous and neognathous . Mol. Biol. Evol. 14:754761.

Hjort K., Goldberg A. V., Tsaousis A. D., Hirt R. P., Embley T. M. 2010. Diversity and reductive evolution of mitochondria among microbial eukaryotes. Phil. Trans. R. Soc. B 365, 713–727.

Kruse, M, SP Leys, IM Muller and WEG Muller. 1998. Phylogenetic position of the Hexactinellida within the phylum Porifera based on amino acid sequence of the protein C from Rhabdocalyptus dawsoni. Journal of Mol Evol 46: 721728.

Lavrov, D. 2007. Key transitions in animal evolution: a mitochondrial DNA perspective. Integrative and Comparative Biology 47:734743.

Lavrov, D.V., W.M. Brown & J.L. Boore. 2000. A novel type of tRNA editing occurs in the mitochondrial tRNAs of the centipede Lithobius forficatus. PNAS 97: 1373813742.

Lavrov, D. & F. Lang. 2005. Poriferan mtDNA and animal phylogeny based on mitochondrial gene arrangements. Systematic Biology 54:651-659.

Lavrov, D. & F. Lang. 2005b. Transfer RNA gene recruitment in mitochondrial DNA. Trends in Genet. 21:129133.

Lavrov, D., X. Wang, M. Kelley. 2008. Reconstructing ordinal relationships in the Demospongiae using mitochondrial genomic data. Mol. Phyl. Evol. 49: 111124.

Laslett, D. & Canb.ck B. (2008) ARWEN, a program to detect tRNA genes in metazoan mitochondrial nucleotide sequences. Bioinformatics 24;172175.

Leys, SP, G.O. Mackie & H.M. Reiswig. 2007. The Biology of Glass Sponges. Advances in

Marine Biology 52: 1145.

Leys SP. 2003. The Significance of Syncytial Tissues for the Position of the Hexactinellida in the Metazoa. Integr Comp Biol 43:1927

Leys, SP, EC Cheung and N BouryEsnault. 2006. Embyrogenesis in the glass sponge Oopsacas minuta: Formation of syncytia by fusion of blastomeres. Integrative and Comparative Biol 46: 104117.

Mackie, G.O. 1981. Plugged syncytial interconnections in hexactinellid sponges. Journal of Cell Biol 21: 103a.

Mackie, G.O. & C.L. Singla. 1983. Studies on hexactinellid sponges. I. Histology of Rhabdocalyptus dawsoni (Lambe 1873). Philos Trans R Soc Lond Series B 301: 401418.

Margulis, Lynn. 1970. Origin of Eukaryotic Cells. Yale University Press. New Haven, CT. 349 pp.

Martin W and KV Kowallik. 1999. Annotated English translation of Mereschowsky’s 1905 paper Uber Natur und Ursprung der Chromatophoren im Pflanzenreiche. Eur J Phycol 34:287-295.

Medina M, Collins AG, Silberman JD, Sogin ML. 2001. Evaluating hypotheses of basal animal phylogeny using complete sequences of large and small subunit rRNA. Proc Natl Acad Sci USA 98:97079712.

Milbury, C. & P.M. Gaffney. 2005. Complete mitochondrial DNA sequence of the eastern oyster Crassostrea virginica. Marine Biotechnology 7:697712.

Mindell, D.P., M.D. Sorenson, & D.E. Dimcheff. 1998. An Extra Nucleotide is not Translated in Mitochondrial ND3 of Some Birds and Turtles. Mol. Biol. Evol. 11:1568 1571.

Parham, J.F., J.R. Macey, T.J. Papenfuss, C.R. Feldman, O. Turkozan, R. Polymeni, & J. Boore. 2006. The phylogeny of Mediterranean tortoises and their close relatives based on complete mitochondrial genome sequences from museum specimens. Mol. Phyl. Evol. 38:5064.

Peeters NM, Chapron A, Giritch A, Grandjean O, Lancelin D, Lhomme T, Vivrel A and Small ID. 2000. Duplication and Quadruplication of Arabidopsis thaliana Cysteinyl and AsparaginyltRNA Synthetase Genes of Organellar Origin. J Mol Evol 50: 413423.

Reiswig HM. 2002. Class Hexactinellida Schmidt, 1870. In: Hooper JNA, Van Soest RWM, editors. Systema Porifera: a guide to the classification of sponges. New York: Kluwer Academic/Plenum Publishers. p 12011202.

Rokas A, Kruger D, Carroll SB. 2005. Animal evolution and the molecular signature of radiations compressed in time. Science 310:19331938.

Rokas, A & PW Holland. 2000. Rare genomic changes as a tool for phylogenetics. Trends Ecol Evol. Nov 1;15(11):454459.

Russell, R. David and Andrew T. Beckenbach 2008. Recoding of Translation in Turtle Mitochondrial Genomes: Programmed Frameshift Mutations and Evidence of a Modified Genetic Code. J Mol Evol. 2008 December; 67(6): 682–695.

Schultz, D., M. Yarus. 1994. Transfer RNA mutation and the malleability of the genetic code. J.Mol. Biol. 235:13771380.

Schultze, F.E. 1899. Zur Histologie der Hexactinelliden. Siztungsberichte der Deutches Akademie con Wissenschaft 14: 198209.

Wang X and Lavrov DV. 2008. Seventeen new complete mtDNA sequences reveal extensive mitochondrial genome evolution within the Demospongiae. PLoS ONE, 3(7): e2723.

Wang, C.C. Chang, K.J. Tang, H.L. Hsieh, C.J. Schimmel, P. 2003. Mitochondrial Form of a tRNA Synthetase Can Be Made Bifunctional by Manipulating Its Leader Peptide. Biochemistry 42: 16461651.

Wolstenholme DR. 1992. Genetic novelties in mitochondrial genomes of multicellular animals. Curr Opin Genet Dev 2:918925.

CHAPTER 2. GLASS SPONGES AND BILATERIAN ANIMALS SHARE DERIVED

MITOCHONDRIAL GENOMIC FEATURES: A COMMON ANCESTRY OR

PARALLEL EVOLUTION?

A paper published in Molecular Biology and Evolution

Karri M. Haen, B. Franz Lang, Shirley A. Pomponi, and Dennis V. Lavrov

!"#$%&'$(

Glass sponges (Hexactinellida) are a group of deep-water benthic animals that have a unique syncytial organization and possess a characteristic siliceous skeleton. Although hexactinellids are traditionally grouped with calcareous and demosponges in the phylum

Porifera, the monophyly of sponges and the phylogenetic position of the Hexactinellida remain contentious. We determined and analyzed the nearly complete mitochondrial genome sequences of the hexactinellid sponges Iphiteon panicea and Sympagella nux. Unexpectedly, our analysis revealed several mitochondrial genomic features shared between glass sponges and bilaterian animals, including an Arg / Ser change in the genetic code, a characteristic secondary structure of one of the serine tRNAs, highly derived tRNA and rRNA genes, and the presence of a single large noncoding region. At the same time, glass sponge mtDNA contains atp9, a gene previously found only in the mtDNA of demosponges (among animals), and encodes a tRNAPro UGG with an atypical A11–U24 pair that is also found in demo-sponges and placozoans. Most of our sequence-based phylogenetic analyses place

Class Hexactinellida as the sister group to the Bilateria; however, these results are suspect given accelerated rates of mitochondrial sequence evolution in these groups. Thus, it remains an open question whether shared mitochondrial genomic features in glass sponges and bilaterian animals reflect their close phylogenetic affinity or provide a remarkable example of parallel evolution.

Introduction

Glass sponges (class Hexactinellida) are an exclusively marine and predominantly deep-water group of animals that contains about 500 described extant species and plays an important role in benthic communities. The group is defined by the presence of siliceous spicules of triaxonic (cubic) symmetry or their derivatives that are often fused to form a rigid framework (Reiswig 2002) and sometimes display remarkable optical characteristics (Sundar et al. 2003; Aizenberg et al. 2004). In addition to their characteristic spicules, hexactinellids differ from other sponges in their unique syncytial organization (reviewed in Leys 2003) and their unusual capability of impulse conduction (Lawn, Mackie, and Silver 1981; Mackie,

Lawn, and Pavans de Ceccatty 1983; Leys, Mackie, and Meech 1999).

Although glass sponges have been traditionally placed with calcareous and demosponges in the phylum Porifera (Thomson 1869; Schmidt 1870), this association is based on the overall morphological similarity of sponges rather than any specific synapomorphies (Hooper, Van Soest, and Debrenne 2002). Furthermore, most sequence- based phylogenetic studies fail to recover Porifera as a monophyletic group (Collins 1998;

Kruse et al. 1998; Adams, Mclnerney, and Kelly 1999; Borchiellini et al. 2001; Medina et al.

2001; Rokas, Kruger, and Carroll 2005; Muller, Muller, and Schroder 2006). Some of these studies place the Hexactinellida as a sister group to the rest of the Metazoa (Kruse et al.

1998; Borchiellini et al. 2001; Muller, Muller, and Schroder 2006), whereas others support their affinity with the Demospongiae, but exclude Calcarea from the group (Adams,

McInerney, and Kelly 1999; Medina et al. 2001). However, statistical support for these alternative hypotheses of poriferan relationships is generally low, and a recent attempt to use multiple genes to elucidate animal phylogeny found no resolution for the relationships among the 3 classes of sponges and other nonbilaterian animals (Rokas, Kruger, and Carroll 2005).

Comparisons of mitochondrial genomes have been useful for investigating ancient animal relationships (Boore, Lavrov, and Brown 1998; Reyes et al. 2004; Lavrov and Lang

2005). In addition to the large amount of sequence data, which minimizes sampling error in sequence-based phylogenetic analysis, mtDNA harbors additional rare genomic characters that are valuable for phylogenetic inference, including indels in the coding sequences, variations in the genetic code, changes in the secondary structures of encoded transfer and ribosomal RNAs, and gene rearrangements (Rokas and Holland 2000). To gain a better understanding of the phylogenetic position of the Hexactinellida and to fill a gap in our sampling of animal mitochondrial genomes, we determined the nearly complete mito- chondrial genome sequences of 2 glass sponges, Iphiteon panicea and Sympagella nux. Here, we describe these genomes and analyze their features in relationship to the phylogenetic position of glass sponges and animal mitochondrial evolution.

Materials and Methods

Specimen Collection, DNA Extraction, Amplification, Cloning, and Sequencing

Specimens of I. panicea (Bowerbank, 1869) (Class Hexactinellida: Order

Hexactinosida: Family Dactylocalycidae) and S. nux (Schmidt 1870) (Class Hexactinellida:

Order Lyssacinosida: Family ) were collected by the ‘‘Johnson-Sea-Link’’ manned submersible near Turks & Caicos and processed as described in Adams et al. (1999).

Small fragments of cox1, cox2, cox3, and rnl were amplified from total DNA using sponge- specific primers, checked against the GenBank database to minimize the possibility of contamination, and used to design specific primers for these regions (table 1). Complete mtDNA of I. panicea and partial mtDNA of S. nux mtDNA were amplified in overlapping fragments using the TAKARA LA-PCR kit. Regions of the S. nux mtDNA upstream of cox1 and downstream of cox3 were amplified using a modified step-out polymerase chain reaction

(PCR) approach (Wesley UV and Wesley CS 1997). Random clone libraries were constructed from purified PCR products by nebulizing them into fragments 1–3 kbp in size and cloning them into the pCR 4Blunt-TOPO vector (Invitrogen, Carlsbad, CA) as described previously (Wang and Lavrov 2007), sequenced at the Iowa State University DNA Facility on a ABI 3730 DNA Analyzer, and assembled using the STADEN software suite (Staden

1996). The region down-stream of cob and upstream of cox1 was problematic for PCR amplification, cloning, and sequencing, and only partial sequence from this region has been determined.

Most tRNA genes were identified by the tRNAscan-SE program (Lowe and Eddy

1997); other genes were identified by comparisons between the two genomes, similarity searches in local databases using the FASTA program (Pearson 1994) and in GenBank using the Blast network service (Benson et al. 2003). The secondary structures of tRNA genes were drawn with the Macromedia FreeHand program.

Phylogenetic Analysis

The inferred amino acid sequence for Cantharellus cibarius mtDNA was downloaded from http://megasun.bch.umontreal.ca/People/lang/FMGP/proteins.html. Other sequences were derived from the GenBank files: Katharina tunicata NC_001636, Limulus polyphemus

NC_003057, Asterina pectinifera NC_001627, Mustelus manazo NC_000890, Acropora tenuis NC_003522, Astrangia sp. NC_008161, Briareum asbestinum NC_008073, Metridium senile NC_000933, Montastraea annularis NC_007224, Nematostella sp. NC_008164,

Ricordea florida NC_008159, Sarcophyton glaucum AF064823 and AF063191, Aurelia aurita NC_008446, corrugata NC_006894, neptuni NC_006990, EF081250, actinia NC_006991, adhaerens NC_008151,

Amoebidium parasiticum AF538042–AF538052, Monosiga brevicollis NC_004309,

Allomyces macrogynus NC_001715, Cryptococcus neoformans NC_004336, Mortierella verticillata NC_006838, Penicillium marneffei NC_005256, Podospora anserina

NC_001329, Rhizopus oryzae NC_006836, and Spizellomyces punctatus NC_003052.

A concatenated alignment of protein-coding genes was prepared as described previously (Lavrov et al. 2005). Searches for the maximum likelihood (ML) topologies were

Performed with the TreeFinder program(Jobb, von Haeseler, and Strimmer 2004) using the mtArt + ! (Abascal, Posada, and Zardoya 2006), cpREV + ! (Adachi et al. 2000), mtREV +

! (Adachi and Hasegawa 1996), and JTT + ! (Jones, Taylor, and Thornton 1992) models of amino acid substitutions and with the RAxML program (Stamatakis 2006) using the cpREV

+ F + ! model. Bayesian inferences were conducted with the MrBayes 3.1.2 (Ronquist and

Huelsenbeck 2003) and Phylobayes 2.1c programs. For the MrBayes analysis, we used the cpREV model of amino acid substitutions with ! 4 + I distributed rates and conducted two simultaneous runs, each with 4 Markov chain Monte Carlo chains, for 1,100,000 generations.

Trees were sampled every 1,000th cycle after the 1st 100,000 burn-in cycles. The results of the 2 runs were compared with the AWTY program (Wilgenbusch, Warren, and Swofford

2004). For the PhyloBayes analysis, we used the CAT + " model and ran 3 chains for 35,000 cycles. Trees were sampled every 10th cycle after the 1st 5000 burn-in cycles (the maximum difference in the frequency of bipartitions between the 2 runs was 0.03). Likelihood-based tests of alternative topologies were conducted as follows: ML branch lengths of alternative topologies were calculated with Tree-Puzzle (Schmidt et al. 2002) using the mtREV24 + F +

"8 model of amino acid substitutions; log-likelihood values for individual sites in the alignment were computed with Codeml (Yang 1997) using the cpREV + F + "8 substitution model; and the probability values for different likelihood-based tests were determined with

Consel (Shimodaira and Hasegawa 2001).

Results

Mitochondrial genome organization is similar in glass sponges and bilaterian animals

The sequenced portions of the mitochondrial genomes of I. panicea and S. nux are

19.0 and 16.3 kb in size and contain 37 and 35 genes, including 13 and 12 protein-coding genes, 2 rRNA genes, and 22 and 20 tRNA genes, respectively (fig. 1). Identified protein- coding genes include all those typical for bilaterian animals with the exception of atp8 (and nad6 in S. nux), as well as the gene for subunit 9 of adenosine triphosphate (ATP) synthase

(atp9), previously found only in the mtDNA of demosponges among animals. Identified mt- tRNA genes include only those typical for bilaterian animals; none of the additional mt- tRNAs present in demosponges and the placozoan T. adhaerens have been found in glass sponge mtDNA. In addition, 2 large (>200 bp) open reading frames (ORFs) of 582 and 909 bp in size are present in the sequenced portion of the I. panicea mitochondrial genome.

All genes have the same transcriptional orientation in both mtDNAs. The arrangements of protein and rRNA coding genes in the two genomes are identical with the exception of the position of cox2 and the absence of the nad6 gene in S. nux (fig. 1). In contrast, only 6 of the tRNA genes share the same relative position in the 2 genomes. A maximum of 5 gene boundaries are shared with the G. neptuni: 3 of these gene boundaries (+nad2+nad5, +atp6+cox3, and +nad3+trnR) are well conserved in the mitochondrial DNA of demosponges and cnidarians, and, except for the +nad2+nad5 boundary, have also been reconstructed in ancestral bilaterian mtDNA (Lavrov and

Lang 2005).

The A + T content of I. panicea and S. nux is 65.3% and 70.4%, respectively, which is within the range of other animals and similar to that of demosponges. The coding strand of

I. panicea mtDNA has an AT skew of 0.36 and a GC skew of -0.36. In S. nux, the AT skew is 0.19 and the GC skew is -0.28. The polarity of each skew holds true regardless of the gene type analyzed: protein, tRNA, and rRNA (table 2). However, the AT-skew is reversed at the

2nd codon position in coding sequences, consistent with the fact that most codons specifying nonpolar amino acids, which are prevalent in mitochondrial (and other transmembrane) proteins, have thymidine (uridine) at the 2nd position (table 2). The compositional biases observed in the coding strands of glass sponge mtDNA are similar to those of many bilaterian animals rather than demosponges, cnidarians, and the choano-flagellate M. brevicollis, in which coding strands have an overall positive GC skew and negative AT skew

(Beagley, Okimoto, and Wolstenholme 1998; Burger et al. 2003; Lavrov et al. 2005;

Dellaporta et al. 2006).

Although most genes in hexactinellid DNA are compactly arrayed, a single large noncoding region is present in both genomes downstream of nad4. This region has been only partially sequenced and contains at least 645 bp (41.3%) and 348 bp (33.3%) of the noncoding nucleotides in I. panicea and S. nux mtDNA, respectively. Like in many bilaterian animals, this large noncoding region includes multiple repeated sequences and was problematic for PCR amplification, cloning, and sequencing. Other noncoding regions are substantially smaller, and several genes in S. nux mtDNA appear to overlap. Thus, with the exception of the presence of atp9, hexactinellid mtDNA resembles the mtDNA of bilaterian animals in gene content, nucleotide composition, and general organization of mtDNA. A more detailed analysis revealed several specific mitochondrial features that are potentially informative for understanding the phylogenetic position of glass sponges.

Glass sponges and bilaterian animals share an arginine to serine reassignment of mitochondrial AGR codons

The analysis of mitochondrial coding sequences in hexactinellids both manually and using genetic code predictio software (Abascal, Zardoya, and Posada 2006) showed that glass sponges use a modified genetic code for mitochondrial protein synthesis with UGA coding for tryptophan and AGR coding for serine. Specifically, nearly all tryptophan residues in encoded proteins are specified by the UGA codon (the UGG codon is extremely rare in both genomes and is potentially involved in translational frameshifting (Haen KM, Lavrov DV, unpublished data), whereas 42% of AGR codons in evolutionarily conserved positions in each genome code for serine, and none of them code for arginine. Furthermore, the inference that AGR codons specify serine is supported by the presence of only one tRNA for AGN codons tRNASer (UCG) in both genomes. Whereas the reassigned UGA codon is found in the mtDNA of all animals, the only other known reassignments of AGR codons occurred in the mtDNA of bilaterian animals, first in the lineage leading to this group (Arg/Ser), then in the lineage leading to (Ser / ?), and finally in two lineages: Urochordata

(?/Gly) and Vertebrata (? / stop) (Knight, Freeland, and Landweber 2001).

Interestingly, although the AUA codon appears to have a standard meaning

(isoleucine) in glass sponges, the sequenced portions of glass sponge mtDNA do not code for the tRNAIle (LAU) that usually translates this codon in eubacteria and eubacteria-derived organelles (Muramatsu et al. 1988; Weber et al. 1990; Soma et al. 2003). This lack of a specific tRNA for the AUA codon is shared with several groups of bilaterian animals

(, , and flat worms), where the AUA codon also specifies isoleucine but is translated by a modified tRNAIle (GAU) together with 2 other isoleucine codons, AUC and AUU (Castresana, Feldmaier-Fuchs, and Paabo 1998; Tomita, Ueda, and

Watanabe 1999). It is not clear whether these changes occurred convergently in these groups or if they represent an intermediate state in the reassignment of the AUA codon from isoleucine to methionine in animal mtDNA.

Mitochondrial genomes of glass sponges encode highly derived rRNA and tRNA structures

The analysis of tRNA and rRNA genes in glass sponge mtDNA revealed that both the primary and secondary structures of encoded RNAs are highly derived and resemble those encoded by bilaterian rather than nonbilaterian animals. The changes in tRNA structures are particularly striking (fig. 2). Canonical tRNAs in nuclear, bacterial, and most organellar genomes have several highly conserved features: a 7-bp aminoacyl arm, a dihydrouridine

(DHU) stem varying in size from 3–4 bp, a DHU loop of 8–10 nt, a 5-bp anticodon stem with a 7-nt anticodon loop, and a 5-bp T"C stem with a 7-nt loop (Marck and Grosjean 2002).

Furthermore, several nucleotides are nearly universally conserved in standard tRNAs, including 2 guanines within the DHU-loop that bind to well-conserved nucleotides in the

T"C loop, stabilizing the 3-dimensional L-shaped structure of tRNAs. Conventional tRNAs are found in demosponge, cnidarian, and Trichoplax mtDNA, but in most bilaterian animals there are multiple deviations from this pattern of conservation (Wolstenholme 1992). Similar to bilaterian animals, glass sponge mt-tRNAs display a great degree of variation in both the size and nucleotide sequence of the DHU and T"C arms, including a highly variable sequence of the T"C loop and a typical absence of guanine residues in the DHU loops (fig.

2). The authors are not aware of any other group of organisms that encode tRNAs with similar characteristics.

Unusual structure of tRNA(Ser) and an unusual R11-Y24 base pair in tRNA(Pro)

Two tRNAs encoded by glass sponge mtDNA have specific features shared with corresponding tRNAs in other animals and are potentially phylogenetically informative.

First, hexactinellid tRNASer (UCU) appears to have a very unusual secondary structure with the DHU arm replaced by a loop (fig 3). Remarkably, a similar ‘‘truncated cloverleaf’’ secondary structure is a characteristic feature of mitochondrial tRNASer GCU/UCU of bilaterian animals. Although this structure appears to be disadvantageous in translation, at least in vitro (Hanada et al. 2001), it is conserved throughout the evolution of the Bilateria

(Garey and Wolstenholme 1989; Wolstenholme 1992; Ruiz-Trillo et al. 2004).

Second, an atypical R11–Y24 pair is present in tRNAPro UGG/CGG of glass sponges as well as all demosponges and the placozoan Trichoplax but is not found in the outgroup species M. brevicollis and A. parasiticum or in bilaterian animals (Wang and Lavrov 2007).

Because the R11–Y24 base pair is an important recognition element for initiator tRNA, it is usually strongly counter-selected in elongator tRNAs (Marck and Grosjean 2002), and its presence in the proline tRNA of sponges and T. adhaerens may indicate the monophyly of nonbilaterian animals or at least the monophyly of the Porifera and (this character is not available for Cnidaria because they lack this and most other tRNA genes in their mtDNA).

Phylogenetic analyses based on sequence data give a mixed message about the phylogenetic position of the Hexactinellida

Phylogenetic analyses based on deduced amino acid sequences using traditional empirical substitution models provide strong support for the sister group relationship between glass sponges and bilaterian animals (fig. 4A). These analyses separate animals into two groups: the Diploblastica (excluding the Hexactinellida) and the Bilateria, with glass sponges grouping with the latter group. The inferred phylogeny is stable with respect to the substitution matrix: identical ML topologies were found by the TreeFinder program when using the JTT+!, mtREV+!, cpREV +!, or mtART+! models of sequence evolution—and with respect to the phylogenetic program used—both TreeFinder and RAxML (using the best available cpREV +F +! model) found the same ML topology with similar bootstrap support values. Alternative positions of the Hexactinellida (e.g., as the sister group to other Metazoa and as the sister group to the Demospongiae) were rejected by likelihood tests (fig. 4C). In addition, the ML topology received .98% posterior probability in the MrBayes analysis.

We have previously proposed (Lavrov et al. 2005) that the clustering of nonbilaterian animals in phylogenetic analyses based on mtDNA data might be explained by elevated rates of mitochondrial sequence evolution in Bilateria, which would pull the latter group toward the base of metazoan tree due to a long-branch attraction (LBA) artifact (Felsenstein 1978;

Hendy and Penny 1989). The unusual position of the Hexactinellida can be rationalized in the same way. In fact, the LBA artifact in the present analysis should be exacerbated by the similar nucleotide compositions of coding strands in the Hexactinellida and Bilateria, which is a feature that may reflect phylogenetic relationships but is also prone to convergence

(Steel, Lockhart, and Penny 1993). Several strategies have been suggested to alleviate the effect of the LBA artifact on phylogenetic inference, which involve either better detection of multiple substitutions or minimizing their number (Baurain, Brinkmann, and Philippe 2007).

To minimize the number of multiple substitutions, we redid our analyses without bilaterian animals (one of the long branches on the tree) and also conducted a Bayesian analysis on the amino acid data recoded into six Dayhoff groups (Hrdy et al. 2004). Neither of these modifications changed the relative phylogenetic position of the Hexactinellida on the tree

(not shown). We also used a potentially better model of sequence evolution—the category

(CAT) +! model (Lartillot and Philippe 2004) that explicitly handles the heterogeneity of the substitution process across amino acid positions. The results of this analysis placed glass sponges as an unresolved with the demosponges A. corrugata, G. neptuni, and T. actinia (fig 4B). Although this position of glass sponges is more reasonable from a traditional perspective, it appears to be sensitive to the choice of taxa: glass sponges are placed as a sister group to G. neptuni when bilaterian animals are excluded from the analysis, but as a sister group to the jellyfish A. aurita when additional unpublished demosponge sequences are added to the data set (not shown).

Discussion

Genomic analyses of hexactinellid mtDNA revealed several derived mitochondrial features shared with bilaterian animals, including changes in the genetic code, gene content, nucleotide composition, tRNAand rRNA structures, and the presence of a single large noncoding region. These unexpected results can be explained either by a sister group relationship between glass sponges and bilaterian animals or by parallel mitochondrial evolution in these two groups. In our view, the strongest support for the sister group relationship between glass sponges and bilaterian animals comes from the reassignment of the mitochondrial AGR codons from serine to arginine and from the changes in the secondary structure of the tRNASer (UCU) in both of these groups (fig. 3). Changes in the genetic code are complex and rare events; therefore, they are usually regarded as reliable indicators of phylogenetic relationships (Telford et al. 2000). Similarly, changes in the secondary structure of mitochondrial tRNAs and, in particular, tRNASer GCU/UCU also appear to be rare and complex events that, so far, are known to occur only in the Bilateria and

Hexactinellida. It should be noted, however, that because tRNASer GCU/UCU translates

AGR codons that changed their meaning in both groups, the modifications in the genetic code and tRNASer GCU/UCU structure may be not independent. Other features shared between these two groups are known to be prone to convergence.

Not all mitochondrial genomic features support the affinity of glass sponges and bilaterian animals. MtDNA of glass sponges and demosponges share the presence of atp9, a gene for subunit 9 of ATP synthase that is absent in the mtDNA of all other animals.

Although the presence of atp9 is clearly a plesiomorphic feature for these taxa that says nothing about sponge monophyly, the sister group relationship between hexactinellids and bilaterians would imply multiple independent mitochondria-to-nucleus transfers of atp9 within the Metazoa. It should be noted, however, that such independent transfers of organellar genes are quite common (Martin et al. 1998), and a specific example of an independent atp9 transfer to the nucleus has been reported recently in the demosponge

Amphimedon queenslandica (Erpenbeck et al. 2007). The 2nd mitochondrial genomic feature that contradicts the grouping of Hexactinellida and Bilateria is the unusual R11–Y24 pair in the proline tRNA. This unusual base pair is an apomorphy for sponges and placozoans and to our knowledge has not been found in the mt-tRNAPro of either bilaterian animals or any outgroups. Because the R11–Y24 base pair is an important recognition element for initiator tRNA, it is usually strongly counterselected in elongator tRNAs, and its presence in tRNAPro may be phylogenetically informative.

Interestingly, phylogenetic analyses of deduced amino acid sequences also produced ambiguous results for the position of the Hexactinellida. The use of traditional empirical models of amino acid evolution resulted in strong support (with high bootstrap and posterior probability numbers) for the sister group relationship between glass sponges and bilaterian animals. By contrast, phylogenetic inference based on the newly developed CAT model placed glass sponges within the Demospongiae. However, the alternative positioning proposed by CAT is not stable upon taxonomic resampling, suggesting the influence of other artifacts, such as compositional biases, that are not accounted for by the CAT model

(Lartillot N, personal communication).

To conclude, most of the mitochondrial genomic characters support the sister group relationship between glass sponges and bilaterian animals. If confirmed, this position of glass sponges would prompt the reassessment of mainstream ideas about early animal evolution in general and the evolution of the sponge body plan in particular. If, conversely, this relationship is refuted, then all the mitochondrial features shared between glass sponges and bilaterian animals—a change in the genetic code, changes in secondary structures of the encoded tRNAs, and the loss of the DHU-arm in tRNAs, among others—will serve as remarkable examples of parallel evolution in mitochondrial genomes and may provide insights into the origin of unusual features in animal mtDNA. Further studies are needed to distinguish between these two possibilities.

Supplementary Materials

Mitochondrial genome sequences of I. panicea and S. nux have been deposited in the

GenBank database under the accession numbers EF537576 and EF537577, respectively.

Acknowledgments

We thank Herve Philippe, Xiujuan Wang, Jonathan Wendel, and 3 anonymous reviewers for valuable comments on an earlier version of this manuscript, Nicolas Lartillot for his help with the PhyloBayes program, and the Canadian Institutes of Health Research and the

College of Liberal Arts and Sciences at Iowa State University for funding.

Literature Cited

Abascal F, Posada D, Zardoya R. 2006. MtArt: a new model of amino acid replacement for Arthropoda. Mol Biol Evol. 24:1–5.

Abascal F, Zardoya R, Posada D. 2006. GenDecoder: genetic code prediction for metazoan mitochondria. Nucleic Acids Res. 34:W389–W393.

Adachi J, Hasegawa M. 1996. Model of amino acid substitution in proteins encoded by mitochondrial DNA. J Mol Evol. 42:459–468.

Adachi J, Waddell PJ, Martin W, Hasegawa M. 2000. Plastid genome phylogeny and a model of amino acid substitution for proteins encoded by chloroplast DNA. J Mol Evol. 50: 348–358.

Adams CL, McInerney JO, Kelly M. 1999. Indications of relationships between poriferan classes using full-length 18S rRNA gene sequences. Mem Queensl Mus. 44:33–43.

Aizenberg J, Sundar VC, Yablon AD, Weaver JC, Chen G. 2004. Biological glass fibers: correlation between optical and structural properties. Proc Natl Acad Sci USA. 101: 3358– 3363.

Baurain D, Brinkmann H, Philippe H. 2007. Lack of resolution in the animal phylogeny: closely spaced cladogeneses or undetected systematic errors? Mol Biol Evol. 24:6–9.

Beagley CT, Okimoto R, Wolstenholme DR. 1998. The mitochondrial genome of the sea anemone Metridium senile (Cnidaria): introns, a paucity of tRNA genes, and a near standard genetic code. Genetics. 148:1091–1108.

Benson DA, Karsch-Mizrachi I, Lipman DJ, Ostell J, Wheeler DL. 2003. GenBank. Nucleic Acids Res. 31:23–27.

Boore JL, Lavrov DV, Brown WM. 1998. Gene translocation links insects and . Nature. 392:667–668.

Borchiellini C, Manuel M, Alivon E, Boury-Esnault N, Vacelet J, Le Parco Y. 2001. Sponge paraphyly and the origin of Metazoa. J Evol Biol. 14:171–179.

Burger G, Forget L, Zhu Y, Gray MW, Lang BF. 2003. Unique mitochondrial genome architecture in unicellular relatives of animals. Proc Natl Acad Sci USA. 100:892–897.

Castresana J, Feldmaier-Fuchs G, Paabo S. 1998. Codon reassignment and amino acid composition in mitochondria. Proc Natl Acad Sci USA. 95:3703–3707.

Collins AG. 1998. Evaluating multiple alternative hypotheses for the origin of Bilateria: an analysis of 18S rRNA molecular evidence. Proc Natl Acad Sci USA. 95:15458–15463.

Dellaporta SL, Xu A, Sagasser S, Jakob W, Moreno MA, Buss LW, Schierwater B. 2006. Mitochondrial genome of Trichoplax adhaerens supports Placozoa as the basal lower metazoan phylum. Proc Natl Acad Sci USA. 103:8751–8756.

Erpenbeck D, Voigt O, Adamski M, Adamska M, Hooper J, Worheide G, Degnan B. 2007. Mitochondrial diversity of early-branching Metazoa is revealed by the complete mt genome of a haplosclerid demosponge. Mol Biol Evol. 24: 19–22.

Felsenstein J. 1978. Cases in which parsimony or compatibility methods will be positively misleading. Syst Zool. 27:401–410.

Garey JR, Wolstenholme DR. 1989. Platyhelminth mitochondrial DNA: evidence for early evolutionary origin of a tRNA(ser-AGN) that contains a dihydrouridine arm replacement loop, and of serine-specifying AGA and AGG codons. J Mol Evol. 28:374–387.

Hanada T, Suzuki T, Yokogawa T, Takemoto-Hori C, Sprinzl M, Watanabe K. 2001. Translation ability of mitochondrial tRNAsSer with unusual secondary structures in an in vitro translation system of bovine mitochondria. Genes Cells. 6:1019–1030.

Hendy MD, Penny D. 1989. A framework for the quantitative study of evolutionary trees. Syst Zool. 38:297–309.

Hooper JNA, Van Soest RWM, Debrenne F. 2002. Phylum Porifera grant, 1836. In: Hooper JNA, Van Soest RWM, editors. Systema Porifera: a guide to the classification of sponges. New York: Kluwer Academic/Plenum Publishers. p. 9–13.

Hrdy I, Hirt RP, Dolezal P, Bardonova L, Foster PG, Tachezy J, Embley TM. 2004. Trichomonas hydrogenosomes contain the NADH dehydrogenase module of mitochondrial complex I. Nature. 432:618–622.

Jobb G, von Haeseler A, Strimmer K. 2004. TREEFINDER: a powerful graphical analysis environment for molecular phylogenetics. BMC Evol Biol. 4:18.

Jones DT, Taylor WR, Thornton JM. 1992. The rapid generation of mutation data matrices from protein sequences. Comput Appl Biosci. 8:275–282.

Knight RD, Freeland SJ, Landweber LF. 2001. Rewiring the keyboard: evolvability of the genetic code. Nat Rev Genet. 2:49–58.

Kruse M, Leys SP, Muller IM, Muller WE. 1998. Phylogenetic position of the Hexactinellida within the phylum Porifera based on the amino acid sequence of the protein kinase C from Rhabdocalyptus dawsoni. J Mol Evol. 46:721–728.

Lartillot N, Philippe H. 2004. A Bayesian mixture model for across-site heterogeneities in the amino-acid replacement process. Mol Biol Evol. 21:1095–1109.

Lavrov DV, Forget L, Kelly M, Lang BF. 2005. Mitochondrial genomes of two demosponges provide insights into an early stage of animal evolution. Mol Biol Evol. 22:1231–1239.

Lavrov DV, Lang BF. 2005. Poriferan mtDNA and animal phylogeny based on mitochondrial gene arrangements. Syst Biol. 54:651–659.

Lawn ID, Mackie GO, Silver G. 1981. Conduction system in a sponge. Science. 211:1169– 1171.

Leys SP. 2003. The significance of syncytial tissues for the position of the Hexactinellida in the Metazoa. Integr Comp Biol. 43:19–27.

Leys SP, Mackie GO, Meech RW. 1999. Impulse conduction in a sponge. J Exp Biol. 202:1139–1150.

Lowe TM, Eddy SR. 1997. tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res. 25:955–964.

Mackie GO, Lawn ID, Pavans de Ceccatty M. 1983. Studies on hexactinellid sponges. II. Excitability, conduction and coordination of responses in Rhabdocalyptus dawsoni (Lambe, 1873). Phil Trans R Soc B. 301(1107):401–418.

Marck C, Grosjean H. 2002. tRNomics: analysis of tRNA genes from 50 genomes of Eukarya, Archaea, and Bacteria reveals anticodon-sparing strategies and -specific features. RNA. 8:1189–1232.

Martin W, Stoebe B, Goremykin V, Hapsmann S, Hasegawa M, Kowallik KV. 1998. Gene transfer to the nucleus and the evolution of chloroplasts. Nature. 393:162–165.

Medina M, Collins AG, Silberman JD, Sogin ML. 2001. Evaluating hypotheses of basal animal phylogeny using complete sequences of large and small subunit rRNA. Proc Natl Acad Sci USA. 98:9707–9712.

Muller WEG, Muller IM, Schroder HC. 2006. Evolutionary relationship of Porifera within the eukaryotes. Hydrobiologia. 568:167–176.

Muramatsu T, Nishikawa K, Nemoto F, Kuchino Y, Nishimura S, Miyazawa T, Yokoyama S. 1988. Codon and amino-acid specificities of a transfer RNA are both converted by a single post-transcriptional modification. Nature. 336:179–181.

Pearson WR. 1994. Using the FASTA program to search protein and DNA sequence databases. Methods Mol Biol. 25:365–389.

Reiswig HM. 2002. Class Hexactinellida schmidt, 1870. In: Hooper JNA, Van Soest RWM, editors. Systema Porifera: a guide to the classification of sponges. New York: Kluwer Academic/Plenum Publishers. p. 1201–1202.

Reyes A, Gissi C, Catzeflis F, Nevo E, Pesole G, Saccone C. 2004. Congruent mammalian trees from mitochondrial and nuclear genes using Bayesian methods. Mol Biol Evol. 21: 397–403.

Rokas A, Holland PWH. 2000. Rare genomic changes as a tool for phylogenetics. Trends Ecol Evol. 15:454–459.

Rokas A, Kruger D, Carroll SB. 2005. Animal evolution and the molecular signature of radiations compressed in time. Science. 310:1933–1938.

Ronquist F, Huelsenbeck JP. 2003. MrBayes 3: Bayesian phylogenetic inference under mixed models. Bioinformatics. 19:1572–1574.

Ruiz-Trillo I, Riutort M, Fourcade HM, Baguna J, Boore JL. 2004. Mitochondrial genome data support the basal position of and the polyphyly of the Platyhelminthes. Mol Phylogenet Evol. 33:321–332.

Schmidt O. 1870. Grundzuge einer Spongien-fauna des atlantischen Gebietes. Leipzig (Germany): Engelmann.

Schmidt HA, Strimmer K, Vingron M, von Haeseler A. 2002. TREE-PUZZLE: maximum likelihood phylogenetic analysis using quartets and parallel computing. Bioinformatics. 18:502–504.

Shimodaira H, Hasegawa M. 2001. CONSEL: for assessing the confidence of selection. Bioinformatics. 17:1246–1247.

Soma A, Ikeuchi Y, Kanemasa S, Kobayashi K, Ogasawara N, Ote T, Kato J, Watanabe K, Sekine Y, Suzuki T. 2003. An RNA-modifying enzyme that governs both the codon and amino acid specificities of isoleucine tRNA. Mol Cell. 12: 689–698.

Staden R. 1996. The Staden sequence analysis package. Mol Biotechnol. 5:233–241.

Stamatakis A. 2006. RAxML-VI-HPC: maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models. Bioinformatics. 22:2688–2690.

Steel MA, Lockhart PJ, Penny D. 1993. Confidence in evolutionary trees from biological sequence data. Nature. 364:440–442.

Sundar VC, Yablon AD, Grazul JL, Ilan M, Aizenberg J. 2003. Fibre-optical features of a glass sponge. Nature. 424:899–900.

Telford MJ, Herniou EA, Russell RB, Littlewood DT. 2000. Changes in mitochondrial genetic codes as phylogenetic characters: two examples from the . Proc Natl Acad Sci USA. 97:11359–11364.

Thomson CW. 1869. On Holtenia, a genus of vitreous sponges. Phil Trans R Soc Lond. 159:701–720.

Tomita K, Ueda T, Watanabe K. 1999. The presence of pseudouridine in the anticodon alters the genetic code: a possible mechanism for assignment of the AAA lysine codon as asparagine in mitochondria. Nucleic Acids Res. 27:1683–1689.

Wang X, Lavrov DV. 2007. Mitochondrial genome of the homoscleromorph Oscarella carmela (Porifera, Demospongiae) reveals unexpected complexity in the common ancestor of sponges and other animals. Mol Biol Evol. 24: 363–373.

Weber F, Dietrich A, Weil JH, Mare´chal-Drouard L. 1990. A potato mitochondrial isoleucine tRNA is coded for by a mitochondrial gene possessing a methionine anticodon. Nucleic Acids Res. 18:5027–5030.

Wesley UV, Wesley CS. 1997. Rapid directional walk within DNA clones by step-out PCR. Methods Mol Biol. 67:279–285.

Wilgenbusch JC, Warren DL, Swofford DL. 2004. AWTY: a system for graphical exploration of MCMC convergence in Bayesian phylogenetic inference. [cited 2007 May 23]. Available from: http://ceb.csit.fsu.edu/awty.

Wolstenholme DR. 1992. Genetic novelties in mitochondrial genomes of multicellular animals. Curr Opin Genet Dev. 2:918–925.

Yang Z. 1997. PAML: a program package for phylogenetic analysis by maximum likelihood. Comput Appl Biosci. 13:555–556.

Figure 1. Gene maps of hexactinellid mtDNAs. Protein and ribosomal RNA genes (lighter gray) are atp6 and 9, subunits 6 and 9 of F0 ATP synthase; cob, apocytochrome b; cox1–3, subunits 1–3; nad1–6 and nad4L, NADH dehydrogenase subunits 1–6 and 4L; rns and rnl, small and large subunit rRNAs. The tRNA genes (black) are identified by the 1-letter code for their corresponding amino acid. Large ORFs (darker gray) are named according to their lengths. All genes are transcribed clockwise. Curly lines indicate the lack of sequence data from the region.

Figure 2. Inferred secondary structures of mitochondrial tRNAs in Iphiteon panicea

(I.p.) and Sympagella nux (S.n.). 36 out of 42 tRNA genes were identified by the tRNAscan-SE program (Lowe and Eddy 1997); others were identified by similarity searches between the two genomes. Two missing tRNA genes in S. nux mtDNA are drawn with asterisks instead of letters and indicated with questions marks.

Figure 3. Evolution of tRNA(S) for AGY/AGN codons in animal mitochondria.

Secondary structures were derived manually from mtDNA sequences of representative organisms in each group of animals as well as the choanoflagellate Monosiga brevicollis. H. sapiens – Homo sapiens (Chordata), D. yakuba – Drosophila yakuba (Arthropoda), P. rubra

– Paratomella rubra (Acoelomorpha), G. neptuni – Geodia neptuni (Demospongiae), T. adhaerens – Trichoplax adhaerens (Placozoa).

Figure 4. Phylogenetic position of the glass sponges based on the analysis of mitochondrial sequence data. A) ML tree obtained from the analysis of 2678 aligned amino acid positions using the mtART substitution matrix plus four-category gamma rate correction

(lnL = -75759.61) in the TreeFinder program. Identical ML topologies and very similar bootstrap values were recovered using the JTT+" (lnL = -77651.98), mtREV+" (lnL =

-77071.95), cpREV+" (lnL = -76696.31) models in TreeFinder and using the cpREV+F+" model in RaXml. An identical topology received >98% posterior probability in a MrBayes analysis using the cpREV+"4+I model of sequence evolution. The first number at each node indicates the percentage of bootstrap support in ML analysis; the second number is the posterior probability in Bayesian analysis. An asterisk indicates 100% bootstrap support and

Bayesian posterior probability equal to 1.0. B) Posterior consensus tree under the CAT+F+" model. Only the part of the tree that differs from A is shown. C) Results of likelihood-based tests of alternative positions of the Hexactinellida. Five alternative positions of the group

(including the one from fig. 4B) have been tested using Approximately Unbiased (AU),

Weighted Shimodaira-Hasegawa (WSH) tests and Resampling of Estimated Log-Likelihood

Bootstrap Percentage (RELL BP). B, bilaterians; G, glass sponges; D, demosponges; C, cnidarians; P, placozoan.

Table 1. Specific primers used for mtDNA amplification Iphiteon panicea IP-cox1-f1 5’- CCA GAC ATG GCC TTT CCA CGA C -3’ IP-cox1-r1 5’- AGA GAT TCC TGC TAA GTG CAG TC -3’ IP-cob-f1 5’- CAA CCC TAA ACC GAT TCT ATA GAC -3’ IP-cob-r1 5’- TGG AGG GCT AAT ATA TGC ATT CC -3’ IP-cox3-f1 5’- TCC TCT CTA TCT GCT GAC TAC GA -3’ IP-cox3-r1 5’- TAT CAG GCT CTT GTT TCG AAT CC -3’ IP-rnl-f1 5’- AGG GAT AAC AGC GTT ATA TCG TC -3’ IP-rnl-r1 5’- ACA TCG AGG TCG CAA ACA TCG TC -3’ Sympagella nux SN-cox1-f1 5’-GGA ATG GAT GTT GAT ACT CGA GC -3’ SN-cox1-r1 5’- GTA ATA GCA CCA GCT AAT ACA GG -3’ SN-rnl-f1 5’- CAA GAC GAT AAG ACC CTA AGA AC -3’ SN-rnl-r1 5’- GAA ACG ATA TAA CGC TGT TAT CC -3’ SN-cox3-f1 5’- GGA GCA CAT GTA ATC ATA GGT AC -3’ SN-cox3-r1 5’- ATG CTC TGG TTT CGA ATC CTA TG -3’

Table 2. Nucleotide composition of the coding strands of Iphiteon panicea (IP) and Sympagella nux (SN) mtDNA

CHAPTER 3. MITOCHONDRIAL GENOME EVOLUTION IN HEXACTINELLIDA

(PHYLUM PORIFERA)

A paper to be submitted to Mitochondrion

Karri M. Haen and Dennis V. Lavrov

Abstract

Hexactinellida or glass sponges are a group of exclusively deep marine animals, most closely related to demosponges. Yet, previous analyses revealed several Bilaterian-like features in their mitochondrial genomes including changes in the genetic code, an unusual inferred secondary structure for a seryl mt-tRNA, and a generally compact genome organization.

Thus parallel evolution in mitochondrial genome architecture is likely in the two groups, but the extent and history of these parallelisms remain unclear. Here, we report the analysis of six additional complete or nearly complete mitochondrial genomes, and one partial mitochondrial genome of hexactinellid sponges, extending the mitochondrial genome sampling for Class Hexactinellida to encompass three taxonomic orders and seven families.

Mitochondrial genome architecture appears to be conserved inside the Hexactinellida with a high degree of protein conservation and with most gene rearrangements occurring among tRNA genes. Furthermore, mitochondrial protein coding data appear to serve as a high fidelity genetic marker inside the group, although phylogenetic analyses cannot confidently place Hexactinellida among the rest of the Metazoa.

1. Introduction

Mitochondrial genomes of bilaterian animals display a number of unusual features that distinguish them from the mtDNA of other eukaryotes. The majority of sequenced animal mtDNAs shows a tendency toward genome compaction and are relatively small (13–

20 kb). The genes are closely arrayed with absent, or very short, noncoding spacers, or, more rarely, genes may overlap (Lavrov 2007). Although the set of mitochondrial genes is nearly identical throughout the Metazoa, specifying 12–14 proteins involved in electron transport and oxidative phosphorylation, 2 ribosomal and 22-23 transfer RNAs (small and large subunit rRNAs and tRNAs), RNA genes may be highly truncated with respect to those found in nuclear and prokaryotic genomes (Wolstenholme 1992). For instance, only animal mitochondrial genomes have tRNAs that do not utilize a canonical cloverleaf secondary structure. In addition, the genetic codes of metazoan mitochondria may be highly modified.

Extreme codon usage biases and strand compositional asymmetries appear to be related to the evolution of translational frameshifting within some groups (eg. Haen et al. 2007; Haen et al. unpublished). Increased sampling of animal mitochondrial genomes has revealed additional unusual genomic features. This is particularly the case for non-bilaterian animals, which show a large diversity of mtDNA organization (Lavrov 2007). Still, our sampling of non-bilaterian animals remains limited, and it is probable that additional unusual features, and, potentially, even new evolutionary processes, are yet to be discovered.

Glass sponges, Class Hexactinellida (Schmidt 1870), compose a group of deep marine animals that is largely defined by their siliceous spicules, which are of hexactinic or triaxonic symmetry. Evidence from the fossil record indicates Class Hexactinellida evolved at least as early as the Ediacaran (630-542 MYA): Hexactinellid spicules have been characterized from the terminal Proterozoic of Mongolia (Brasier et al. 1997), China (Steiner et al. 1993), and Iraq (Brasier 1992), and whole body impressions of Phragmodictya reticulata have been characterized in Southern Australia (Gehling and Rigby 1996), suggesting Hexactinellida may possibly the oldest lineage of extant animals on Earth.

Concurrent with the fossil record, glass sponges, along with most other sponges, are typically placed as the sister group to the rest of the animals (). This placement is supported by their primitive bauplan: the absence of organs and true tissues, lack of neurons and muscle cells and a striking cytological similarity between some sponge cells

() and a group of unicellular eukaryotes (choanoflagellates). However, hexactinellids possess additional morphological features that separate them, not only from other sponges, but also most other animals. For instance, glass sponges do not contain calcareous or spongin skeletal components, characteristics separating them from Class

Calcarea (calcareous sponges) and most of Class Demospongiae (demosponges), respectively. Additionally, nearly three-quarters of the adult hexactinellid cellular body mass consists of a multinucleate trabecular , which not only allows for plant-like symplastic nutrient transport but is also capable of impulse conduction, analogous to the nervous systems of other animals (Leys et al. 2007).

Hexactinellida is divided into two subclasses, Amphidiscophora and Hexasterophora, based upon their spicule composition: the former having amphidisc microscleres, and the latter having hexaster microscleres (Schulze 1886; Reiswig 2002). Subclass

Amphidiscophora contains one order with three families, and Subclass Hexasterophora contains five orders with 15 families (Reiswig 2002; Leys et al. 2007). Recent phylogenetic analyses of ribosomal DNA (rDNA) have largely been congruent with the traditional classification system (Dohrmann et al. 2008, 2009), which is unusual for sponges (Erpenbeck and Wörheide 2007). However, these studies suggested the dictyonal order Hexactinosida does not form a monophyletic group (Dohrmann et al. 2008, 2009).

Previous analysis of glass sponge mt-genomes (Haen et al. 2007; Rosengarten et al

2008) showed that these genomes have similarities to both demosponges (Class

Demospongiae) as well as bilaterian animals. For instance, hexactinellid mt-genomes share the presence of mitochondrial atp9 with demosponges, the only two animal groups to have this gene as well as the sporadic presence of apparent noncoding regions that contain open reading frames (ORFs) of various sizes. On the other hand, glass sponges share with bilaterian animals an arginine to serine reassignment of AGR codons, as well as nonstandard tRNA and truncated rRNA structures. We addressed whether these features are common among all hexactinellid groups, or if apparent similarities to Bilaterian mt-genomic features were, instead, convergent features among them. Further, by increasing the number of sampled hexactinellid families from 3 to 7, we may be able to determine the level of conservation of gene order among families.

Here, we present the mt-genomic analyses of 10 complete or nearly complete mitochondrial genomes of glass sponges, including new data for Hertwigia falcifera,

Euretidae sp., Rossellidae sp., Oopsacas minuta, beatrix, Vazella pourtalesi, and Hyalonematidae sp. and place them in the evolutionary perspective. The analysis of inter-class relationships reveal the extent of protein and gene order conservation among these groups and may either confirm or refute traditional morphology-based, and nuclear rDNA based, phylogenies.

2. Materials and Methods

2.1. Specimens collection.

Hexactinellid specimens encompassing three orders, Hexactinosida (Schrammen 1903),

Lyssacinosida (Zittel 1877) and (Schrammen 1924), and eight families were collected by the Harbor Branch Oceanographic Institute, Florida Atlantic University (Ft.

Pierce, FL), using the Johnson-Sea-Link manned-submersible. Voucher numbers, sequence accession numbers, and current taxonomic status of the species used in this study are summarized in Haen et al. (unpublished).

2.2. Complete mtDNA amplification.

Small fragments of mitochondrial rRNA or protein-coding genes were amplified from total

DNA using glass sponge conserved primers from Haen et al. 2010 (unpublished), checked against the GenBank database to minimize the possibility of contamination, and used to design specific primers. Complete mtDNA was amplified in several fragments using LA-

PCR (Takara) under conditions described previously (Burger et al. 2007) except annealing temperatures for touchdown PCR necessitated annealing temperatures as low as 40oC for glass sponge DNA amplification. PCR reactions for each species were combined in equimolar concentration, sheared and barcoded as in (Haen et al. 2010 unpublished), and sequenced at Indiana University. The STADEN package v. 1.6.0 was used to assemble the sequences. Gaps and uncertainties in the assembly were resolved by primer-walking using conventional Sanger sequencing.

2.3. Amplification of individual genes from Hyalonematidae sp.

Due to the necessity for extremely low annealing temperatures for DNA amplification within the amphidiscophoran group, and problems with amplification of contaminations with these temperatures, reverse transcription of short RNA sequences was used to obtain initial species-specific sequences. For Hyalonematidae sp. conserved regions of protein coding genes were reverse transcribed from TRiZol extracted RNA using the iScript Select cDNA amplification kit (BioRad), and used for second strand synthesis with conserved primers and iTaq polymerase (BioRad). The resulting sequences for nad5 and nad4 were used to construct specific primers for the LA-PCR of TRiZol extracted DNA and initial sequencing of the 5 kb fragment. Primer walking was used to fill in missing sequence gaps.

2.4. Genome Annotation and Phylogenetic Inference.

Protein coding genes for complete mt-genome data were annotated manually based on sequence similarity with known genes. tRNA genes were annotated by tRNA Scan SE (Lowe and Eddy 1997) or ARWEN (Laslett and Canback 2008). Concatenated alignment of protein- coding genes was prepared as described previously (Lavrov et al. 2005). Searches for the maximum likelihood (ML) topologies were performed with the TreeFinder program (Jobb et al. 2004) using the mtREV+! (Adachi and Hasegawa 1996) model of amino-acid substitution.

3. Results

3.1. Genome Organization

We amplified and sequenced complete or nearly complete mitochondrial genomes of seven species of glass sponges: Hertwigia falcifera, Euretidae sp., Rossellidae sp., Oopsacas minuta, Aphrocallistes beatrix, Vazella pourtalesi, and one Hyalonematidae species. The

20.3 kilobase (kb) V. pourtalesi mitochondrial genome was the only mtDNA through which the entirety of the control region has been sequenced and, is only the second complete mt available for glass sponges (the first in Rosengarten et al. 2008). For the other genomes we were unable to determine the complete sequence of the large non-coding genomic regions, as well as some of the genes (in particular nad6) adjacent to it. Difficulties in the PCR amplification and sequencing of this, and adjacent, DNA regions may be due to the presence of numerous repeats and a high degree of secondary structure (see section 3.4). Despite this, completely sequenced hexactinellid mitochondrial genomes, have a similar gene content to many bilaterian animals: 2 rRNAs, a set of 22 tRNAs, and 13 protein coding genes, which, like other sponges, (but unlike bilaterian animals) includes atp9, yet unlike other sponges, excludes atp8, the latter also having been lost from the mt-genomes of some lineages of bilaterian animals, including secernentean (Okimoto et al. 1992; Lavrov and

Brown 2001), bivalve mollusks (Serb and Lydeard 2003), chaetognaths (Papillon et al.

2004), and platyhelminths (Le et al. 2000).

With the exception of tRNA genes, which appear to be fairly mobile in glass sponge mt genomes, gene orders are well conserved within Hexactinellida. In particular, representatives of the same family have a nearly complete congruence of gene order, including family Rossellidae, represented by V. pourtalesi, Rossellidae sp., and S. nux (Fig.

1). The only exception to gene order collinearity inside this group is an apparent duplication of the trnD and trnM genes of S. nux and their insertion at the 3’ end of the atp9 gene.

Additionally, the high degree of gene order conservation can be extended to a closely related family, Leucopsacidae (for molecular phylogenetic relationships see Fig. 6, section 3.5), as evidenced by a single deletion of trnW in O. minuta. Similarly, a collinear gene order was found in family , represented here by the partial A. beatrix mt genome and the genome of A. vastus, with the only exception being the absence of trnS(ucu) in the A. beatrix partial sequence. Euretidae n. gen. sp., diverged from the common ancestor of crown group (which includes A. beatrix and A. vastus) approximately 250 MYA

(Haen et al. unpublished). Despite this large time lag, the Euretidae sp. mt-genome shares much of its gene order with the Aphrocallistidae, including the order of all “major” (rRNA and protein coding) genes with the exception of rns. Inside Order Lyssacinosida, we sequenced the mt-genome of H. falcifera (Family ). According to our previous molecular dating analyses (Haen et al. unpublished), the Euplectellidae shared a common ancestor with other groups of Lyssacinosida approximately 150 MYA. Gene order differences between H. falcifera and other members of Lyssacinosida include the movement of multiple tRNA genes, as well as the rearrangement of cox2 and atp9. Iphiteon panicea, a sponge with a dictyonal (fused) skeletal framework has been traditionally placed within order

Hexactinosida; however, it has several gene order differences with the rest of the members of that group. Notwithstanding the movement of multiple tRNA genes, these rearrangements include the nad3, nad6, nad4L, atp9 and rns genes. On the other hand, I. panicea only has two protein-coding/rRNA gene rearrangements with respect to the gene order of H. falcifera: atp9 and nad6, and two rearrangements with respect to the basic Rossellid gene order: nad6 and cox2, supporting its phylogenetic affinity for the Lyssacinosida and re-demonstrating the possibility that the dictyonal framework is a homoplasious feature amongst the

Hexactinellida. Maximum parsimony analysis of gene order data further places I. panicea within the Lyssacinosida with a reasonable degree of statistical support (Fig. 2).

All analyzed mitochondrial genomes were relatively uniform in the overall nucleotide composition with an average A+T content of 68.8% (65.2–70.1%) (Sup. Fig. 1) and, on average, displayed positive AT- and negative GC-skews of the coding strand (Sup. Figs. 2 and 3). Protein coding genes had a highly negative GC skew at third codon positions (-0.75,

! = .12) (Sup. Fig. 4), which is manifested in a strong bias against the use of codons ending in G in the hexactinellid proteome, where several codons ending in G are rarely used, including the TGG and CGG codons implicated in +1 translational frameshifting (see Haen et al. 2010, unpublished).

3.2. Protein Coding Genes

Encoded proteins in hexactinellid mitochondrial genomes are generally well- conserved both with respect to protein size (Sup. Fig. 5) and amino acid primary sequence

(Sup. Fig. 6). The standard deviation of protein sizes was 5.37 amino acids on average with a range from 0.4 for nad3 to 22.5 for nad2, which has an ambiguous 3’ end due to undetermined stop codons in 6 species (Supplementary Table 1). Average percent pairwise distance to orthologous protein coding genes in the Demospongiae ranged from 75% for those few hexactinellids which have a sequenced atp9 gene (sd=2.16) to 25.5% for nad2

(sd=1.60), with an average distance of 48% (sd=14.6) for all protein coding genes.

Occasionally, apparently functional protein coding genes in hexactinellids have single base insertions, which, if uncorrected, shift the reading frame by +1 (Haen et al. 2010, unpublished). These insertions occur in the cox1 gene of I. panicea, nad2 of I. panicea, H. falcifera, and Euretidae sp., cox 3 of A. beatrix and A vastus, cob of A. beatrix, and nad6 of

A. vastus. We previously hypothesized these mutations could be corrected by an out of frame pairing +1 translational frameshifting mechanism, which is well-supported by both nucleotide and protein compositional analyses, as well as the verification of the base insertions at the RNA level. Putative frameshifts occur at or near the TGG and CGG (in H. falcifera) hungry codons when these codons are followed by adenine.

Among the sampled glass sponge mitochondrial genomes those from Euretidae sp.,

V. pourtalesi, and I. panicea have additional open reading frames (ORF), which have no similarity to known protein-coding sequences. Two of the ORFs, including the previously published ORF909 of I. panicea (Haen et al. 2007), share some sequence similarity to each other, although they are different lengths. Euretidae sp. ORF909 (353 aa) and I. panicea

ORF909 (303 aa) share 25% pairwise identity. Two additional, unrelated ORFs in V. pourtalesi encoding 104 and 194 aa had no amino acid similarity to the others. Conserved domain searches using NCBI’s CD-Search found no conserved protein domains for any of the putative proteins.

The sequence of start codons glass sponge protein coding genes is overwhelmingly

ATG; however, it appears that Euretidae sp. may use additional codons for start, including:

ACC for cox2, TTG for nad4L, and ATA for nad5. Glass sponges primarily utilize TAA as the stop codon, except cox2 and nad1 of Euretidae sp. and nad6 in V. pourtalesi which appears to end with TAG. However, in the latter case, a TAA stop codon is located 9 bp downstream from TAG. In addition, A. beatrix, I. panacea, O. minuta, and S. nux do not appear to have in-frame stop codons for cox1 and/or nad5, and in those cases genes were annotated at the start of the next gene on the mt chromosome, as long as this was reasonable given the protein’s alignment. Interestingly, all nad2 genes from hexactinellid mt-genomes, with the exception of those from Rossellidae sp. and V. pourtalesi, also lack an in-frame stop codon. However, in all of these cases, when the gene is terminated immediately before the start of the next ORF, there is a +1 frameshifted TAA codon.

3.3. tRNA Genes

Data from complete mt-genomes suggests glass sponges have a set of 22 tRNAs encoded on coding strands, including one tRNA for each amino acid with the exception of those having two codon groups: trnS and trnL. Previously we had suggested the loss of the dihydrouridine (DHU) arm from trnS(ucu) of I. panicea and S. nux (Haen et al. 2007).

Further analysis of other mt-genomes shows this tRNA has a canonical structure although there appear to be sporadic, but not conserved, mismatches in both the DHU and T"C arms.

Interestingly, the Euretidae sp. trnS(uga) has an 8 bp DHU replacement loop similar to that of nematodes (Lavrov & Brown 2001). There is no indication of a gradual evolutionary loss of the DHU arm for this tRNA when examining other hexactinellid trnS(uga) sequences, although there is a consistent 28:42 anticodon stem mismatch that is also extended to the

29:41 basepair in several species (A. vastus, O. minuta, I. panicea, A. beatrix, V. pourtalesi,

Rossellidae sp.), and there is also an additional mismatch in the anticodon stem of S. nux at the 27:43 basepair for this tRNA, which are all unusual with respect to canonical tRNA secondary structure.

Interestingly, searches for tRNA genes with Arwen revealed additional tRNA-like structures on the complimentary strand in several genomes (Fig 4). Several of these are canonical cloverleaf tRNAs; however, some also contain dihydrouridine (DHU) and, more rarely, T"C a replacement loops. Most of the tRNA-like genes are well matched in their acceptor, anticodon and T"C stems (where applicable). Interestingly, most of these genes are located inside the complementary strand to protein coding genes: in atp6, nad2 and nad1, and nad4 in the H. falcifera mt-genome; in cox1 nad4L, nad5, and nad1 of O. minuta; in cox1, nad4L, nad2, and cob of Euretidae sp.; and in nad5 and cob of Rossillidae sp. We also identified trnK(uuu) on the complimentary strand of Hyalonematidae sp., which overlaps the coding strand trnK(uuu) by 14 nucleotides (nt), and a potential mt-trnQ(uug) that was the compliment of the trnA(ugc) gene of A. beatrix. Neighbor joining analysis of these sequences suggests these sequences are not homologous to coding strand tRNAs (Fig. 3). The only exception was the case of O. minuta trnS(ucu) with a truncated T"C stem, found on the complimentary strand of the nad1 gene, which grouped with other trnS(ucu) in our analysis

(Fig. 4), and share 62.5% nucleotide identity. Analyses of protein conservation in regions with tRNA pseudogenes show there appear to be no unusual amino acid changes, either regarding polarity or charge, in the location of the encoded tRNA-like structures, including the case of O. minuta nad1, although these regions may have stretches of sequence enriched in amino acids with hydrophobic side groups that may not be well-conserved among glass sponges (Fig. 4).

3.4. Noncoding DNA

Because most hexactinellid mitochondrial genomes have not been sequenced through the entire large non-coding region, it is difficult to say what their average size is. V. pourtalesi and A. vastus, two complete mitochondrial genomes, show substantial differences in the amount of noncoding DNA inside their genomes. A. vastus (Rosengarten et al. 2008) has only a relatively short “control region,” of 1,003 nt out of a total of 1,408 noncoding nucleotides (71.2%) in the entire genome, located just after trnL(taa) and before the atp9 gene. On the other hand, V. pourtalesi, sequenced for the present study, has a control region between nad6 and trnE(ttc), which is 1,853 nt, nearly twice as large as the comparable region in A. vastus. Depending upon whether the unknown ORF in the V. pourtalesi genome is counted, the entire content of its noncoding DNA ranges from 4,041-4,356 nt, which is about

20% of the total mt-genome sequence. Other nearly complete hexactinellid mt-genomes have a significant amount of noncoding DNA, as well, with an average of 1, 236 nt (ranging from only 692 nt in O. minuta to 1,817 nt in Rossellidae sp., where A. beatrix (475 nt) was excluded due to its comparably large amount of missing sequence data).

Analyses of the putative hexactinellid mitochondrial control region indicate it varies not only in size but in nucleotide sequence. Although both regions have a negative GC skew

(-0.59 in V. pourtalesi and -0.23 in A. vastus) and a positive AT skew (.29 and .15 respectively) like gene encoding regions of these mt-genomes, and, additionally, both contain

~75% AT content. The V. pourtalesi control region contains a total of 10 interspersed 11 and

41 bp repeats (Fig. 5), the former of which, in one case, is followed by a long stretch of GT dinucleotides. On the other hand, the comparable region in A. vastus is characterized by mononucleotide runs of about 7 nucleotides, interspersed throughout the sequence. All of the aforementioned features of the primary sequence appear to contribute to the high degree of

DNA secondary structure within this region (Supplementary Figs. 1-4). Additional analysis of the V. pourtalesi putative control region revealed both the 11 (p-value 2.1e-09) and 41 bp

(p-value 4.81e-24) repeats have a high degree of sequence similarity to experimentally validated promoter sequences for eukaryotic RNA polymerase II (p-value 1e-05 - 1e-07 for individual motifs). For instance, both sequence motifs are distributed throughout the D. melanogaster RSF1 promoter region, as well as the same region for Rattus norvegicus pyruvate kinase (among others). Some RNA polymerase II promoters can also be used by the mitochondrial DNA polymerase # (Marczynski et al. 1989).

3.5. Molecular Phylogeny

Our previous study showed mtDNA based phylogenies of global Metazoan relationships may be susceptible to the long branch attraction artefact (LBA), resulting in well-supported, but most likely erroneous, topologies (Haen et al. 2007). By contrast, we have found the mtDNA based phylogeny within the group to be extremely stable, with consistently the same results using Maximum Likelihood (ML) and Bayesian methods and utilizing nucleotide and amino acid sequences (Fig. 6). The topology reconstructed in these analyses is highly congruent with that obtained in nuclear rRNA studies and shows good agreement with morphological studies. The only notable exception is the placement of I. panicea (Order Hexactinosida) within the Lyssacinosida, which is consistent with previous molecular but not morphological studies (Dohrmann et al. 2008).

4. Discussion

The current analysis of mitochondrial genome data from seven families of the

Hexactinellida show both remarkable similarity and unexpected differences among these groups. Features shared among all the specimens examined include a set of protein coding genes which includes atp9 but excludes atp8, 2 rRNA genes, and a core set of 22, but possibly more, tRNA genes, which may encode tRNAs of divergent structure. Gene rearrangments are few among the “major” (protein coding and rRNA) genes, and are more common among the tRNA genes; however, gene orders are generally stable among the closely related taxonomic groups. As in most animals, the mt genomes are generally characterized by a compact organization, with the exception of a large noncoding “control” region that contains sequences with similarity to eukaryotic promoters, which have not been apparent in other sponges (Wang and Lavrov 2008). Unlike bilaterian animals, these mt genomes may also contain interspersed noncoding regions up to 1 kilobase (kb) in size, and the presence of ORFs that potentially encode proteins of unknown function.

Unlike the situation in other groups of sponges, previous studies on the molecular relationships inside Hexactinellida have found a high degree of agreement between morphology based systematics and molecular phylogenies (Dohrmann et al. 2008, 2009).

However, these analyses also tend to suggest one of the two most basic morphological categories in hexactinellid systematics, that of the dictyonine, or rigid, skeleton, which stands in contrast to the lyssacine, or loose, skeleton, is actually a convergent feature, even within the Hexactinosida. This discrepancy was first indicated in the initial molecular phylogenies of the group using rDNA (Dohrmann et al. 2008), and both the mitochondrial gene order analysis and mtDNA-based molecular phylogeny reported in the present study support the conclusion that some species grouped within Hexactinosida based upon skeletal morphology

(in this case I. panicea (Family Dactylocalycidae)) are misclassified. Even a rudimentary perusal of the gene orders found in Hexactinosida versus Lyssacinosida indicate the affinity of I. panicea for the Lyssacinosidan group, where several fewer gene rearrangements would be required to make the I. panicea gene order collinear with the basic arrangement seen in that group, as opposed to the Hexactinosidan gene order (Fig. 1).

Relaxation of selective constraint on mitochondrial DNA is probably important regarding both the maintenance of gene content as well as gene sequence conservation among the mt-genomes of Metazoan groups (Kumazawa and Nishida 1993). Mitochondrial gene content has often been used as an indicator of phylogenetic relationships (see Rokas et al. 2000); however, it is also known that gene loss is a relatively common feature in mt- genomes of the Metazoa, especially with respect to tRNA content (see Wang and Lavrov

2008; Haen et al. 2010). Although protein-coding genes are less often gained or lost from these genomes, it appears some genes may be more dispensable than others. As found in a previous study, several purportedly rare genomic changes (RGCs) in hexactinellid mitochondrial genomes indicate a phylogenetic affinity for the Bilateria, while others indicate a close relationship with the Demospongiae (Haen et al. 2007). For instance, the atp9 gene has only been found in the mitochondrial genomes of sponges: in Hexactinellida and Demospongiae (Wang and Lavrov 2008; Rosengarten et al. 2008; Haen et al. 2007). Yet the loss of atp8 has also been observed in several lineages of Bilaterian animals (eg. Okimoto et al. 1992; Lavrov and Brown 2001). The atp8 gene encodes subunit 8 of the Fo portion of the enzyme ATP synthase and, in metazoans, is the smallest mitochondrial encoded protein, only around 50–65 aa long, with only about a half dozen of these amino acid residues being well conserved across animal mtDNAs (Helfenbein et al. 2004). In the lineages that have lost the gene, it is not clear whether atp8 moved to the nucleus, became completely dispensable, or was functionally coopted by one of the other ATPase subunits (Boore 1999).

Loss of tRNA genes or variation in tRNA structure, in terms of differences among the tRNA alloacceptors, as well as among taxonomic groups, is common in metazoan mtDNA

(Boore 1999; Westenholme 1992). Previously, an arginine to serine change in the genetic code, as well as an apparently associated change in tRNA structure, were found in hexactinellid mitochondria (Haen et al. 2007). The current study found differences in the structure of the aforementioned mt-trnS(ucu), where the mt-genomes of I. panicea and S. nux have relatively little capability for base-pairing in the DHU arm, yet other hexactinellid mt- genomes contain a tRNA with a normally paired DHU stem. On the contrary, this study also found the other seryl tRNA alloacceptor, trnSer(uga), has an 8 basepair DHU truncation in

Euretidae sp. Pressure to minimize the size of the mitochondrial genome may result in the fixation of truncated tRNAs (Anderssen and Kurland 1991; Yamazaki et al. 1997); however, the mechanism by which arms are lost: whether by replication slippage (Macey et al. 1997), gradual loss of basepairing through point mutations in tRNA stem regions, or some other mechanism, remains unclear.

In the literature, there is very little discussion of tRNA-like structures within protein coding genes and only a single bioinformatic study that pays specific attention to it (Morrison

2010). Most cases of tRNA-like structures inside other mitochondrial genes were found to be misannotations. A small fraction of these cases were putatively considered functional tRNAs, invoking the possibility of dual encoding. However, there is no known biochemical mechanism that could explain the maintenance of dual-encoded tRNAs and proteins, and, further, our own laboratory has shown that, at least in the putative case of

(velvet worms), tRNA editing plays a major role in the synthesis of tRNAs from highly derived DNA templates (Segovia Ugarte et al. 2010, unpublished). A simple explanation for tRNA-like structures inside protein coding genes could be that internal basepairing may protect the mRNA from certain types of oxidative damage inside the mitochondrion, and that these structures simply evolve to be tRNA-like by chance. Interestingly, in the case of the O. minuta trnS(ucu) tRNA-like structure from the complimentary strand of nad1, we found a phylogenetic affinity between the putative pseudogene and its corresponding coding strand tRNA, as well as with coding strand mt-trnS(ucu) genes from other species, in a neighbor joining analysis (Fig. 3). Because this structure is not encoded within an intron (Fig. 4), it is difficult to hypothesize about its origin. However, it does beg the question whether inter- or intramolecular recombination could play a role in the evolution of some unusual tRNA structures, which may explain their discontinuous phylogenetic distribution.

In conclusion, extended sampling of hexactinellid mt-genomes shows a cohesion of mt genomic features across the families of Hexactinellida: maintenance of atp9, a loss of atp8, a general conservation of gene orders and protein coding sequences with rare exceptions, and the presence of at least a single large noncoding control region.

Interestingly, we also found unusual tRNA structures with little conservation when taking into account their phylogenetic distribution. It is unclear what the evolutionary relationship is, if any, among the coding strand tRNAs and the several tRNA-like structures found in protein coding genes. Finally, because both the gene order and mt protein based phylogenies of this study agree with other molecular studies that the dictyonine skeleton is a homoplasious feature within Hexactinellida, some taxonomic reclassification may be advised, at least in the case of family Dactylocalycidae, which includes I. panicea.

Literature Cited

Adachi J, Hasegawa M. 1996. Model of amino acid substitution in proteins encoded by mitochondrial DNA. J Mol Evol. 42: 459–468.

Andersson, S.F. & C. Kurland. 1995. Genome evolution drives the evolution of the translation system. Biochem. Cell Biol. 73:775-787.

Boore, J. L., 1999. Animal mitochondrial genomes. Nucleic Acids Res. 27:1767-1780.

Burger G, Lavrov DV, Forget L and Lang BF. 2007. Sequencing complete mitochondrial and plastid genomes. Nat Protoc 2:603-614

Brasier, M.D. & G.O. Shields. 1997. Ediacaran sponge spicule clusters from Southwestern Mongolia and the origins of Cambrian fauna. Geology 25:303-306.

Dohrmann, M., A. Collins & G. Worheide. 2009. New insights into the phylogeny of glass sponges (Porifera, Hexactinellida): Monophyly of Lyssacinosida and Euplectellinae, and the phylogenetic position of Euretidae. Mol Phyl Evol. 52: 257-262.

Dohrmann, M, D Janussen, J. Reitner, A. Collins & G. Worheide. 2010. Phylogeny and Evolution of Glass Sponges (Porifera, Hexacinellida). Systematic Biol 57: 388-405.

Erpenbeck D & G Wörheide. 2007. On the molecular phylogeny of sponges (Porifera). Zootaxa 1668: 107-126.

Gehling, J.G.; Rigby, J.K. 1996. Long Expected Sponges from the Neoproterozoic Ediacara Fauna of South Australia. Journal of Paleontology, 2, pp. 185195.

Haen, K. M., Lang, B. F., Pomponi, S. A. & Lavrov, D. V. 2007. Glass sponges and bilaterian animals share derived mitochondrial genomic features: a common ancestry or parallel evolution? Mol. Biol. Evol. 24, 1518–1527.

Haen, K.M., W. Pett & D.V. Lavrov. 2010. Parallel loss of nuclear-encoded mitochondrial aminoacyl tRNAsynthetases and mtDNA-encoded tRNAs in Cnidaria. Mol Biol Evol. 27(10): 2216-2219.

Haen, K.M. and D. Lavrov. 2010. Unusual phylogenetic distribution of +1 translational frameshifts in hexactinellid mitochondrial genomes. Unpublished.

Helfenbein, K.G., H.M. Fourcade, R.G. Vanjani, and J.L. Boore. 2004.The mitochondrial genome of Paraspadella gotoi is highly reduced and reveals that chaetognaths are a sister group to . Proc Natl Acad Sci USA 101(29): 10639-43.

Jobb G, von Haeseler A and Strimmer K. 2004. TREEFINDER: a powerful graphical analysis environment for molecular phylogenetics. BMC Evolutionary Biology 4:18

Kumazawa, Y., and M. Nishida. 1993. Sequence evolution of mitochondrial tRNA genes and deep-branch animal phylogenetics.J. Mol. Evol. 37:380–398.

Laslett, D. & Canbäck B. (2008) ARWEN, a program to detect tRNA genes in metazoan mitochondrial nucleotide sequences. Bioinformatics 24: 172-175.

Lavrov D.V. and W.M. Brown. 2001. Trichinella spiralis mtDNA: A mitochondrial genome that encodes a putative ATP8 and normally structured tRNAs and has a gene arrangement relatable to those of coelomate metazoans. Genetics 157:621-637.

Lavrov, D. 2007. Key transitions in animal evolution: a mitochondrial DNA perspective. Integrative and Comparative Biology 47:734743.

Lavrov, D. & F. Lang. 2005. Poriferan mtDNA and animal phylogeny based on mitochondrial gene arrangements. Systematic Biology 54:651-659.

Le, T. H., Blair, D., Agatsuma, T., Humair, P. F., Campbell, N. J., Iwagami, M., Littlewood, D. T., Peacock, B., Johnston, D. A., Bartley, J., et al. 2000. Mol. Biol. Evol. 17: 1123–1125.

Leys, SP, G.O. Mackie & H.M. Reiswig. 2007. The Biology of Glass Sponges. Advances in Marine Biology 52: 1-145.

Lowe TM, Eddy SR. 1997. tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res. 25: 955-964.

Macey, J. R., A. Larson, N. B. Ananjeva, and T. J. Papenfuss. 1997. Replication slippage may cause parallel evolution in the secondary structures of mitochondrial transfer RNAs. Mol. Biol. Evol. 14:30–39.

Marczynski, GT, PW Schultz, and JA Jaehning. 1989. Use of yeast nuclear DNA sequences to define the mitochondrial RNA polymerase promoter in vitro. Mol. Cell. Biol. 9: 3193- 3202

Morrison, D. 2010. How and where to look for tRNAs in Metazoan mitochondrial genomes, and what you might find when you get there. arXiv:1001.3813v1 [q-bio.GN]

Okimoto, R., Macfarlane, J., Clary, D. & Wolstenholme, D. R. 1992. Genetics 130: 471–498.

Papillon, D., Y. Perez, X. Caubit and Y. Le Parco. 2004. Identification of chaetognaths as protostomes is supported by the analysis of their mitochondrial genome. Molecular Biology & Evolution 21(11): 2122-2129.

Reiswig HM. 2002. Class Hexactinellida Schmidt, 1870. In: Hooper JNA, Van Soest RWM, editors. Systema Porifera: a guide to the classification of sponges. New York: Kluwer Academic/Plenum Publishers. p 1201-1202.

Rigby, J., Pisera, A., Wrzolek, T., Racki, G. 2001. Upper sponges from the holy cross mountains, central Poland. Palaeontology 44:447–488.

Rosengarten RD, Sperling EA, Moreno MA, Leys SP, Dellaporta SL (2008) The mitochondrial genome of the hexactinellid sponge Aphrocallistes vastus: evidence for programmed translational frameshifting. BMC Genomics 9:33.

Rokas, A & PW Holland. 2000. Rare genomic changes as a tool for phylogenetics. Trends Ecol Evol. 15(11): 454459.

Serb, J. M., and C. Lydeard. 2003. Complete mtDNA sequence of the North American freshwater mussel, Lampsilis ornata (Unionidae): an examination of the evolution and phylogenetic utility of mitochondrial genome organization in (). Mol. Biol. Evol. 20:1854–1866.

Schrammen, A. 1903. Zur Systematik der Kieselspongien. Mitteilungen aus dem Roemer Museum, Hildesheim 19: 1-21.

Schrammen, A. 1924. Die Kieselspongien der overen Kreide von Nordwestdeutschland. III. Und letzter Teil. Monographien zur Geologie und Palaontologie (1) Heft 2: 1-159, I-XVII.

Staden, R. 1996. The Staden sequence analysis package. Mol. Biotechnol. 5:233-241.

Steiner, M., Mehl, D., Reitner, J. and Erdtmann, B. 1993. Oldest entirely preserved sponges and other fossils from the Lowermost Cambrian and a new facies reconstruction of the Yangtze platform (China). Berliner Geowissenschftliche Abhandlungen Reihe E (Palaobiologie) 9: 293-329.

Wang X and Lavrov DV. 2008. Seventeen new complete mtDNA sequences reveal extensive mitochondrial genome evolution within the Demospongiae. PLoS ONE 3(7): e2723.

Wolstenholme DR. 1992. Genetic novelties in mitochondrial genomes of multicellular animals. Curr Opin Genet Dev 2:918-925.

Yamazaki, N., R. Ueshima, J. A. Terrett et al. 1997. Evolution of pulmonate gastropod mitochondrial genomes: comparisons of gene organizations of Euhadra, Cepaea and Albinaria and implications of unusual tRNA secondary structures. Genetics 145: 749–758.

Zittel, K.A. 1877. Studien uber fossile Spongien, I. Hexactinellidae. Abhandlungen der Mathematisch-Physikalischen Classe der Koniglich-Bayerischen Akademie der Wisseneschaften 13(1): 1-63.

Figure 1. Mitochondrial gene arrangements in glass sponges. Complete and nearly complete mtDNA molecules pictured illustrate the protein coding genes surveys for frameshift insertions: atp 6, 8-9 (subunits 6, 8 and 9 of the F0 ATPase); cox1-3 (cytochrome c oxidase subunits 1-3); nad1-6, nad4L (NADH dehydrogenase subunits 1-6 and 4L). Rns and rnl (the large and small subunit rRNAs) as well as the tRNAs (smaller boxes) as also shown.

Gray boxes indicate unsequenced regions.

Figure 2. Hexactinellid phylogeny based on mitochondrial gene order data. Maximum parsimony analysis of mitochondrial gene orders reveal the monophyly of Order

Lyssacinsosida, which includes I. panicea (Family Dactylocalycidae), classified as a

Hexactinosidan glass sponge, according to morphology-based .

Figure 3. Phylogeny of hexactinellid tRNA genes and pseudogenes. Neighbor-joining analysis reveals the monophyletic grouping of glass sponge mt-tRNA genes. tRNA-like structures identified in protein coding genes do not group with coding strand tRNAs, with the exception of the O. minuta trnS(ucu)-like structure located on the complementary strand of nad1. AB = A. beatrix, EU = Euretidae sp., HF = H. falcifera, RO = Rossillidae sp., OM =

O. minuta, IP = I. panicea, SN = S. nux, VP = V. pourtalesi. The two letter extension of –ps after a tRNA name means the structure is an implied pseudogene.

Figure 4. O. minuta trnS(ucu)-like structure located inside the nad1 gene. The protein alignment indicates the region of nad1 that dual encodes the tRNA-like structure, which is further delineated by the light blue triangle. The coding strand nucleotide sequence for the tRNA-like structure is indicated above the protein translation, along with the implied secondary structure of the trns(ucu)-like pseudogene.

Figure 5. Putative mitochondrial control region of V. pourtalesi. Promoter-like nucleotide sequence motifs found by the MEME and MAST programs and their locations in the largest (1,853 nt) noncoding region are shown. A. and B. show the 2 prevalent nucleotide motifs, each found 5 times, in the V. pourtalesi putative control region. Letter height indicates the prevalence of that nucleotide in the alignment of the repeated motif. C. shows the overall arrangement of the two motifs in the noncoding sequence.

Figure 6. MtDNA-based molecular phylogeny of glass sponges. Maximum likelihood and Bayesian analyses using the mtREV model of amino acid substitution both converge on the same topology, with high statistical support (all nodes have a bootstrap support of 100 or a posterior probability of 1.0). The ML and Bayesian trees further support the grouping of I. panicea with Order Lyssacinosida, but otherwise conform to traditional morphology-based classifications.

CHAPTER 4. UNUSUAL PHYLOGENETIC DISTRIBUTION OF +1

TRANSLATIONAL FRAMESHIFTS IN HEXACTINELLID PROTEIN CODING

GENES

A paper to be submitted to Molecular Biology and Evolution

Karri M. Haen and Dennis V. Lavrov

Abstract

The presence of extra nucleotides in mitochondrial (mt) coding sequences has typically been construed as evidence for either the nuclear transfer of mitochondrial genes

(numts) or RNA editing. Recently, a growing number of these insertions, many of them found in mtDNA of bilaterian animals, have been reinterpreted as programmed translational frameshifts. +1 translational frameshifting is a mechanism by which an extra nucleotide in a protein coding sequence is left untranslated through a process of ribosomal bypass. This mechanism often requires the use of a rare codon in the original reading frame, which acts as a frameshifting signal. We had previously reported several instances of single nucleotide insertions in the mitochondrial protein coding genes of two glass sponges (Phylum Porifera,

Class Hexactinellida). More recently, we expanded the breadth of our taxon sampling to include members of three orders and eight families of the Hexactinellida. We found that base insertions occur widely in hexactinellid mt protein coding genes and appear to conform to the out of frame pairing model of translational frameshifting due to a high degree of sequence conservation, the presence of certain tRNAs, and a distinct pattern of codon usage. Here we describe putative translational frameshifting in the cob, nad2, nad6, cox3 and cox1 protein coding genes of Hertwigia falcifera, Sympagella nux, Iphiteon panicea, Euretidae sp.,

Farreidae sp., Regadrella sp., and two species of family Aphrocallistidae, and place these results in an evolutionary context.

Introduction

Mitochondria, or their remnants, hydrogenosomes and mitosomes, are essential organelles that are present in the cells of all eukaryotic organisms (Hjort et al. 2010). The primary function of the mitochondrion is ATP production via the oxidative phosphorylation

(OXPHOS) pathway. Additionally, mitochondria perform crucial roles in many other metabolic, regulatory and developmental processes including cell growth and ageing and the regulation of the cell cycle (Detmer and Chan 2007). Despite their common functions in most eukaryotes, the evolution of mitochondrial genomes follows remarkably different paths among independent phylogenetic lineages (eg., Lang et al. 1999). Furthermore, recent advances in the sampling of animal mitochondrial (mt) genomes have revealed a diversity of mt-genomic features within mitochondrial animals that challenge the previous view of the small, simple and uniform metazoan mtDNA molecule (Lavrov 2007).

Among the unusual features reported in animal mtDNA are single base insertions that are occasionally found in coding sequences (Russell and Beckenbach 2008; Rosengarten et al. 2008; Haen et al. 2007; Parham et al. 2006; Beckenbach et al. 2005; Milbury & Gaffney

2005; Zardoya and Meyer 1998; Mindell et al. 1998; Harlid et al. 1997). Because metazoan mitochondrial genes that encode proteins involved in OXPHOS are essential to survival (but see Danovaro et al. 2010 for an exception) and, because in vertebrates, frameshift mutations in these genes lead to a broad spectrum of diseases and disorders including various cancers, mitochondrial myopathies, diabetes mellitus and deafness (Taylor and Turnbull 2007), a mechanism that corrects potential changes in reading frame should exist. Two such mechanisms have been implicated: post-transcriptional RNA editing, which alters the gene sequence at the RNA level (Lavrov et al. 2000) and programmed translational frameshifting, which bypasses mutations during translation without physically correcting them (Mindell et al. 1998). One in vitro study of mouse cell culture lines also indicated the nuclear suppression of a mitochondrial frameshift mutation in nad6 (Deng et al. 2006), however that mechanism remains unknown. While mitochondrial mRNA editing has been found in only one species of animal (Trichoplax adharens; Burger et al. 2009), translational frameshifting has been inferred in most of these cases.

The diverse phenomenon of programmed translational frameshifting has been studied extensively in prokaryotes and unicellular eukaryotes (Farabaugh 1996) but has only recently been experimentally validated in animals () (Temperley et al. 2010). In programmed frameshifting, genes have evolved short sequence motifs (frameshifting sites or signals) that predispose ribosomes to shift reading frames from the initial frame into one of the two alternative frames. Oftentimes the frameshifting site consists of an extra or missing nucleotide in close proximity to a rarely used “hungry” codon (Weiss and Gallant 1986), which, once shifted in the +1 or -1 direction, encodes two nucleotides of a very frequently used codon (Fig. 1). In mitochondria, the rarely used AGR triplets were thought to code for stop codons. However, recently, these codons in mitochondrial cox1 and nad6 were demonstrated to be the site of -1 translational frameshifts that produce standard

UAA and UAG stop codons, necessitating the use of only a single mitochondrial release factor (Temperley et al. 2010). All other putative animal mitochondrial frameshifts reported to date involve single base insertions that would require the ribosome to advance +1 nucleotide to reestablish the preexisting reading frame.

Two models have been proposed for +1 frameshifting. The “pause and slip model” includes a P-site tRNA that can slip on the mRNA template to pair with the +1 codon. The

“out of frame pairing” model does not require the P-site tRNA to slip, but, instead, the extra nucleotide is excluded from the A-site pairing interaction. In either case, a weakened nucleotide interaction at the ribosomal P-site in combination with low levels of the tRNA complementary to the zero frame A-site codon stimulate the binding of a more abundant, or slippage prone, tRNA in the +1 frame (for a review see Farabaugh 2006; Buchnan and

Stansfield 2007).

Frameshift mutations can have a functional role, either producing an alternative form of a protein and/or serving as an autoregulatory mechanism for the control of gene expression

(Sonenberg et al. 2000). For example, the cytoplasmic expression of the ornithine decarboxylase (ODC) antizyme, which regulates polyamine synthesis in response to cellular polyamine concentration, is partially controlled through a frameshifting mechanism in a wide range of organisms, from yeast to mammals (Mangold 2006; Ivanov et al. 2000).

Polyamines, small polycationic molecules, are essential for a variety of eukaryotic and prokaryotic cellular processes including the binding and stabilization of DNA and RNA, protein translation, RNA processing, cell proliferation and programmed cell death (Coffino

2001). The functional significance of this frameshift is reflected in the high degree of evolutionary conservation of the frameshifting site in the phylogenetically diverse lineages of fungi and animals that use it as a control mechanism for ODC antizyme expression.

Alternatively, there may be no adaptive significance associated with a translational frameshifting site. In this case, a frameshift may be tolerated by the genome if it allows sufficient expression of a gene product such that it may still perform its function (Farabaugh et al. 2006). In this scenario, spontaneous frameshift mutations would be free to accumulate in the genome as long as there was a mechanism to correct them during protein synthesis.

Due to the relative absence of functional constraint, neutral frameshift sites should also be easily lost through compensatory mutations that restore an uninterrupted open reading frame

(ORF). Thus, in the relatively highly mutagenic mitochondria of most animals, we suspect neutral frameshifting sites, where present, will be frequent subject to back mutation during the divergence of closely related taxa.

Class Hexactinellida, or glass sponges, is a small group in the phylum Porifera that has nevertheless attracted substantial attention from evolutionary biologists due to their uncertain but important phylogenetic position in the Metazoa, the syncytial nature of most of their adult body tissues, the unusual features of their mtDNA, and the fiber optic properties of their siliceous spicules (Haen et al. 2007; Leys et al. 2007; Sundar et al. 2003). Like their putative sister group, the Demospongiae (demosponges), glass sponges undoubtedly diverged early in animal evolution, with evidence from fossil body impressions dating back to the

Ediacaran Period of the Neoproterozoic (ca. 635-542 MYA) (Gehling and Rigby 1996).

Current molecular estimates for sponge stem lineages differ by as much as 200 million years depending upon the taxa, genes and method used (eg. Sperling et al. 2010; Berney and

Pawlowski 2006; Peterson and Butterfield 2005). Our previous study (Haen et al. 2007), and a subsequent work (Rosengarten et al. 2008), revealed the presence of putative translational frameshift sites in glass sponge mitochondria. Here, we investigate the pattern of frameshift site evolution among different lineages of glass sponges. Our results indicate single base insertions commonly occur in glass sponge mitochondrial coding sequences and appear to be dynamically gained and lost from reasonably closely related lineages, consistent with the expectation of there being no adaptive significance for these sites.

Materials and Methods

Specimen collection. Hexactinellid specimen encompassing three orders, Hexactinosida

(Schrammen 1903), Lyssacinosida (Zittel 1877) and Amphidiscosida (Schrammen 1924), and eight families were collected by the Harbor Branch Oceanographic Institute, Florida

Atlantic University (Ft. Pierce, FL), using the Johnson-Sea-Link manned-submersible.

Voucher numbers, sequence accession numbers, and current taxonomic status of the species used in this study are summarized in Supplementary Table 1.

Complete mtDNA amplification. Small fragments of mitochondrial rRNA or protein-coding genes were amplified from total DNA using glass sponge conserved primers (Supplementary

Table 2), checked against the GenBank database to minimize the possibility of contamination, and used to design specific primers. Complete mtDNA was amplified in several fragments using LA-PCR (Takara) under conditions described previously {Burger et al., 2007, #15128}. PCR reactions for each species were combined in equimolar concentration, sheared and barcoded as in [46]. Barcoded PCR fragments were used for the

GS FLX Titanium library preparation (454 Life Sciences). Pyrosequencing was carried out on a Genome Sequencer FLX Instrument (454 Life Sciences) at the Indiana University

Center for Genomics and Bioinformatics. The STADEN package v. 1.6.0 was used to assemble the sequences. Gaps and uncertainties (mainly mononucleotide runs) in the assembly were filled/resolved by primer-walking using conventional Sanger sequencing.

Sequencing coverage ranged from 10X to 100X for the genes in question, and heteroplasmy was not detected for any putative frameshift site in this study.

Amplification of individual genes. Amino acid alignments of several glass sponge, demosponge, and cnidarian sequences were surveyed for protein motifs that could be used to make hexactinellid-specific conserved primers for direct PCR amplification. For the amplification of the nad5-nad1-cob-nad4 fragment from Hyalonematidae sp, conserved regions of nad5 and nad4 were reverse transcribed from TRiZol extracted RNA using the iScript Select cDNA amplification kit (BioRad), and used for second strand synthesis with conserved primers and iTaq polymerase (BioRad). The resulting sequences for nad5 and nad4 were used to construct specific primers for the LA-PCR amplification of Trizol extracted DNA and the initial sequencing of the 5 kb fragment. Primer walking was used to fill in missing sequence gaps.

Verification of base insertions in mRNA. Total RNA extraction was performed from

RNALater (Ambion) preserved tissues according to the manufacturer’s instructions. To verify the transcription of the single nucleotide insertions, we conducted reverse transcription of Hertwigia falcifera nad2, Aphrocallistes beatrix cox3, and Aphrocallistes vastus cox3, using the Bio-Rad iScript cDNA Synthesis Kit. Second strand cDNA products were sequenced by the Sanger method at the Iowa State University DNA Facility.

Genome Annotation and Phylogenetic Inference. Protein coding genes for complete mt- genome data were annotated manually based on sequence similarity with known genes. tRNA genes were annotated by tRNA Scan SE (Lowe and Eddy 1997) or ARWEN (Laslett and Canback 2008). ClustalW (Thompson et al. 1994) alignments were primarily used to determine potential frameshifting sites. For inference of ancestral frameshifting sites, a concatenated alignment of protein-coding genes was prepared as described previously

(Lavrov et al. 2005). Searches for the maximum likelihood (ML) topologies were performed with the TreeFinder program (Jobb et al. 2004) using the mtREV+! (Adachi and Hasegawa

1996) model of amino-acid substitution.

Molecular Estimates of Divergence Time. A concatenated alignment of 3,507 amino acids, representing the mitochondrial protein coding genes of 11 hexactinellid species, was analyzed using the mtREV+! model of amino acid substitution and a random local clock in the BEAST v1.6.0 software package (Drummond and Suchard 2010; Drummond and

Rambaut 2007). The molecular clock for this analysis was calibrated at 5 nodes of the hexactinellid tree that have fossil evidence: at the appearance of triaxonic spicules 542-488

MYA for Class Hexactinellida (Brasier et al. 1997); the appearance of hexasters at $380

MYA for Order Hexasterophora (Rigby et al. 2001; Mostler 1986); the appearance of sceptrules $237 MYA for clade Sceptrulophora (Krainer & Mostler 1991).; $83.5 MYA for

Family Aphrocallistidae (Schrammen 1912); and, $89.3 MYA for Family Rossellidae

(Brückner & Janussen 2005). We used lognormal rate distributions for calibrated nodes, employing soft maxima for clade ages, in which all calibrations were treated as distributed priors with 95% of their density lying between accepted uniform maxima and minima derived from fossil estimates. Terminal nodes that did not have bracketing fossil dates were set such that 95% of the distribution density was between the minimum fossil date and its corresponding 50 MY mark. Additional priors and operators were kept at default settings.

Three MCMC chains of 10,000,000 generations were sampled every 200 cycles, with the first 10% of cycles discarded as burn-in. The results of the three chains were merged with the LogCombiner program, and the resultant combined log file was statistically analyzed with Tracer. The resulting tree files were merged into a consensus tree with the

TreeAnnotator program and visualized with FigTree v 1.2.3.

Results and Discussion

Mitochondrial genomes of glass sponges

We sequenced complete or partial mitochondrial genomes from seven glass sponges:

Aphrocallistes beatrix, Oopsacas minuta, Vazella pourtalesi, Hertwigia falcifera, Euretidae sp., one unclassified Rossellidae species and one unclassified Hyalonematidae sp., as well as cox1 from a sp., and cox1, cox3, nad2 and nad5 from a Regadrella sp. and analyzed them with three previously published glass sponge mitochondrial genomes

(Iphiteon panicea and Sympagella nux from Haen et al. 2007 and Aphrocallistes vastus from

Rosengarten et al. 2008) (Supplementary Table 1). This range of sampling includes species from 2 subclasses, 3 orders, and 8 families of Hexactinellida. Hexactinellid mtDNA sequences used for this study ranged from 13.6 to 20.3 kbp, contained 10-13 protein coding genes, 2 rRNA genes, and 10–22 tRNA genes (excluding the partial genome from

Hyalonematidae sp.). Gene orders were highly conserved among the glass sponge mt- genomes, with the number of shared rRNA and protein coding gene boundaries ranging from

7 to 13, out of a maximum 14 possible boundaries. Most gene rearrangements involved tRNA genes (the gene boundary range was from 7 to 30 when including tRNA genes). The average AT content for the protein-coding genes used in this study was 68% (! = 3.9) with average AT and GC skews of 0.21 (! = 0.09) and -0.34 (! = 0.13), respectively. Bias against the usage of G at 3rd codon positions was highly negative (-0.75, ! = .12).

Sporadic occurrence of +1 programmed frameshifting in mitochondrial genes of Class

Hexactinellida, Phylum Porifera.

Our analysis of hexactinellid mitochondrial coding sequences has revealed 12 incidents of single nucleotide insertion that appear to be frameshift mutations. Two independent insertions have been found in cox1 of Iphiteon panicea and a Farreidae species; two insertions have been identified in nad2 (at one position in Iphiteon panicea and at another in Sympagella nux, Hertwigia falcifera, a Regadrella and an Euretidae species); three insertions at two distinct sites in cox3 from two members of the family

Aphrocallistidae; a single insertion in cob of Aphrocallistes beatrix; and, finally, a single insertion in nad6 of Aphrocallistes vastus. The insertion sites for I. panicea, S. nux and A. vastus have been reported in previous studies (Haen et al. 2007; Rosengarten et al. 2008).

Failure to complete translational frameshifting in these genes would either result in long stretches of protein that have no sequence similarity to other mitochondrial proteins, or in severely truncated proteins. In all of these incidents a +1 frameshift at the target site results in a translated protein product that is well-conserved in both size and sequence compared to its homologues from other organisms (Fig. 2).

To reject the possibility that the amplified regions contain a pseudogene (nuclear mitochondrial DNA, a “numt”) (Erpenbeck et al. 2010) or are posttranscriptionally edited, we analyzed mRNA sequences of the cox3 and nad2 genes from Aphrocallistes vastus,

Aphrocallites beatrix and Hertwigia falcifera. All these sequences contained an extra nucleotide. Furthermore, we analyzed codon position specific nucleotide compositional skews in genes with frameshifts before and after manual correction of the extra nucleotide

(Fig. 3). Nonfunctional genes are liberated from the selective pressures that constrain nucleotide sequence evolution; therefore, changes in compositional skews and codon usage should be expected if a single nucleotide insertion actually shifts the reading frame and/or creates a pseudogene. However, no such changes were detected in this study. By contrast, the strongly negative GC skew found at 3rd codon positions of the nad2 gene (-0.79, ! =

0.08), is reduced to -0.34, the overall GC-skew measurement for coding sequences, when the insertion of the extra nucleotide at the 5’ end of the gene is not manually corrected before sequence compositional analyses.

All of the putative frameshift sites from hexactinellid protein coding genes share common features that generally conform to the “out of frame pairing” model of +1 programmed translational frameshifting (Fig. 1). In this model, 1) a peptidyl-tRNA in the ribosomal P site forms a weakened interaction with the mRNA; 2) slow recognition of the next in-frame codon causes a translational pause; and, 3) exact Watson-Crick pairing by a canonical aminoacyl-tRNA at the first +1 frame codon causes a shift in the reading frame.

Comparisons among homologous hexactinellid sequences indicate a high degree of sequence conservation for the glycine codon GGA at sites of potential frameshifts (in-frame GGA codons in glass sponges without frameshifting and +1 shifted GGA triplets at frameshifting sites). In the case of frameshift sites, these +1 shifted GGA codons are preceded by a T or C nucleotide (the latter in Hertwigia falcifera only), creating in-frame TGG or CGG codons.

Aside from their usage in frameshift sites, TGG and CGG are two of the least used codons in hexactinellid mitochondrial proteomes, with TGG use ranging from 0 (in A. beatrix, A. vastus, H. falcifera, O. minuta, I. panicea, and Euretidae sp.), to 3 (S. nux) or 4 times (V. pourtalesi and the Rossellidae sp.), and CGG use being restricted to a single time in the H. falcifera mt-genome (Fig. 4). In the case of the species whose mt-genomes use the TGG codon for tryptophan, V. pourtalesi, Rossellidae sp., and S. nux, the in-frame TGG codons occur a total of 11 times, but these codons are always followed by a codon starting with G.

Thus, in these instances, a +1 frameshift would shift the reading frame to a glycine encoded by GGG, which is another relatively rarely used codon with no Watson-Crick matched cognate tRNA. According to the out of frame pairing model, no frameshifting should occur under these circumstances, and this is evidenced by the preservation of the reading frame.

Phylogenetic distribution of frameshifting sites and molecular dating in Hexactinellida

In order to investigate the evolutionary history of the occurrence of translational frameshifting in hexactinellid genes, we constructed a phylogeny of phylum Porifera based upon available mitochondrial genomic data (Fig 5). Traditionally, class Hexactinellida is divided into two subclasses, containing six recent orders. The current study includes species from the two subclasses, three orders and eight families of Hexactinellida. This range of sampling allows us to reconstruct some of the most ancient as well as more recent divergences within the hexactinellid crown group (the extant and extinct lineages dating to the most common ancestor of the Hexactinellida). To determine molecular divergence dates for hexactinellid mitochondrial genes, we utilized Bayesian random local molecular clock methods as implemented in the BEAST package (Drummond and Suchard 2010). The results of our analysis indicate that Subclass Amphidiscophora (Schulze 1886), represented in our analysis by a single Hyalonematidae species, diverged from Subclass Hexasterophra

(Schulze 1886) from 512 to 600 million years ago. The addition of this specimen (for which we only have incomplete data) to the analysis increased the molecular age for crown group

Hexactinellida by more than 100 MY as opposed to another recent report (Sperling et al.

2010); however, the age for the node between the two largest orders of this study (Orders

Lyssascinosida and Hexactinosida) remained similar at 400±30 MYA. Further, mitochondrial data date the emergence of crown group Lyssacinsida at 295±50 MYA, crown group

Sceptrulophora at 277±50 MYA, Aphrocallistidae (A. beatrix and A. vastus) at 85±10 MYA, family Euplectellidae (subfamily Corbitellinae Gray, 1872: H. falcifera and Regadrella) at

100±20 MYA, and family Rossellidae at 93±30 MYA.

One of the most striking features revealed by our phylogenetic analysis (Fig. 5) is the lack of conservation of most frameshift sites, despite their common occurrence in sampled genomes. The only frameshift in common between the two major hexactinellid orders of this study (Orders Lyssacinosida and Hexactinosida) is a 5-prime terminal shift in the nad2 gene.

This shift site, present in Euretidae sp., S. nux, H. falcifera, and Regadrella sp. is the most commonly found translational frameshifting site among hexactinellid protein coding genes.

Based on our analysis, it appears this frameshift site may have persisted for up to 400 MY in some lineages, although it has been lost up to 4 times among the sampled taxa. The assertion of frequent frameshift site gain and loss is further supported within genus Aphrocallistes, where we find the conservation of a single frameshift site in the cox3 gene; however, a unique secondary cox3 shift site is also present in A. beatrix, as well as a site in cob, both of which appear to have been either acquired in in A. beatrix or lost in A. vastus after it diverged from the aphrocallistid common ancestor. It is unknown whether the nad6 site in A. vastus is conserved between the two sponges since this gene was not amplified from the A. beatrix mt- genome sequence. Additionally, independently acquired frameshifting sites are found in the cox1 and nad2 genes of I. panicea, and in the cox1 gene of a Farreidae species.

tRNA complement of mt-genomes

Frameshift mutations can be alleviated by tRNAs with specific modifications that allow programmed translational frameshifting (Bossi and Smith 1984; Curran and Yarus

1987; Huttenhofer et al. 1990), and the phylogenetic distribution of frameshift-related tRNA isoacceptors, either with a particular wobble anticodon or another distinguishing feature that destabilizes the P-site pairing interaction, mirrors the phylogeny of frameshift sites

(Farabaugh et al. 2006). Nucleotide changes associated with unconventional decoding can occur in the anticodon loop as well as the acceptor stem, and alterations in one or the other have been noted to be enough to cause frameshift suppression in the yeast model system

(Huttenhofer et al. 1990). Thus we investigated the structures of the glycine tRNA in glass sponges, because this is the tRNA that would translate the C/TGGA sequence, resulting in +1 translational frameshifting (Fig. 6). In the case of the I. panicea genome, we found the glycine tRNA has a mismatch at the most distal base pair in the anticodon stem, creating a 9 nt anticodon loop, which could allow the pairing of a 4 bp anticodon, fully complimentary to the UCC-A frameshifting site (Bossi and Smith 1984; Curran and Yarus 1987). All other hexactinellid glycine mt-tRNA found in our study had a 28:42 mismatch in the acceptor stem similar to a change in a serine mt-tRNA determined to cause frameshift suppression in

Saccharomyces mitochondrial DNA (Huttenhofer et al. 1990). Other studies of frameshifting in animal mt-DNA have also found tRNAs capable of forming a frameshift-suppressing anticodon (Bechenbach et al. 2005). However, because animal mt-tRNA structures are highly derived, it is exceedingly difficult to attribute structural changes to functionality, and several studies have not been able to identify particular corresponding tRNA structural changes for frameshifting sites. Further, it is not surprising that the potential frameshifting tRNAs are not extremely unusual because they must retain the ability to translate non-frameshifting codons for their respective amino acids.

Evolution of translational frameshift sites in hexactinellid coding sequences

Unlike other frameshift mutations, which would likely lead to a non-functional protein, insertions upstream of GGA codons appear to be tolerated in glass sponge coding sequences when GGA is preceded by a pyrimidine. Such insertions result in a rare in-frame

C/TGG codon overlapping with a common +1 GGA codon, the inferred site for translational frameshifting. Based on this reasoning one would expect that 1) the frameshift site evolves at pre-exisiting GGA codons; 2) their number may correlate with the frequency of GGA codons in coding sequences; and 3) the closer an insertion occurs to the GGA codon, the less deleterious effect it would have. Indeed, our analysis of homologous sequences across sampled species indicates that frameshifts occur either immediately or a few nucleotides upstream from a conserved GGA triplet. Furthermore, we found a positive correlation

(R=.73,p=.04) between the number of frameshifts in mt-genomes (0-3) and the number of

GGA codons in the coding sequences (111-140 GGA codons).

Phylogenetic distribution of frameshifting sites in Metazoa

In addition to glass sponges, putative frameshift mutations in mitochondrial protein- coding genes have been reported in several phyla of bilaterian animals: Mollusca (in

Crassostrea virginica), Arthropoda (class Insecta), and Chordata (class Aves, class Reptilia, class Mammalia) (reviewed in Russell and Beckenbach 2008; Temperley et al. 2010).

Interestingly, frameshifting has also been reported in the mt genome of the dinoflagellate

Perkinsus marinus, which parasitizes C. virginica (Masuda et al. 2010). In the case of the

ND-174 site of the nad3 genes of some birds and turtles (Russell and Bechenbach 2008;

Mindel et al. 1998), the frameshifting sites are remarkably conserved. By contrast, an alternative shift site in the nad3 gene of the African helmeted turtle (Zardoya and Meyer

1998), and in the nad4 and nad4L genes of other turtles appear to not be conserved, at least among the sampled species, and, further, only a single frameshift mutation has been reported for all of the available mt-genomes from the phylum Mollusca.

Parallel evolution of frameshift mechanisms in Hexactinellida and Bilateria

Translational frameshifting is a complex and relatively unusual feature of animal mitochondrial genomes, having been found only in the Hexactinellida and Bilateria, but not in their sister groups, Demospongiae and Cnidaria, respectively. Our previous study (Haen et al. 2007) reported several other mitochondrial genomic features that appeared to evolve in parallel in Hexactinellida and Bilateria, including strand specific nucleotide composition, changes in the genetic code, the secondary structure of encoded RNA molecules, and overall genome organization. Some of these changes may predispose the two groups to the evolution of a translational frameshifting strategy. In particular, the positive AT-skew and the negative

GC-skew of the main coding strand may result in the loss of UGG codons, while highly derived tRNA structures may facilitate translational frameshifting according to the out of frame pairing model. By contrast, the opposite nucleotide skews and the higher fidelity interactions of standard clover leaf tRNAs found in demosponges and cnidarians with mRNAs and/or ribosomes may be less conducive to changes in reading frame, preventing the fixation of frame-altering insertional mutations.

Life history traits and extensive frameshifting in Hexactinellida

In contrast to what is known about the evolution of nuclear programmed translational frameshifts, which may be conserved for hundreds of millions of years (Ivanov & Atkins

2007; Farabaugh 2006), frameshifts in metazoan mtDNA display a scarcity of interspecies conservation. This is especially the case in glass sponges, where we find very little conservation in frameshifting sites either in position or in time. This scattered incidence of frameshifting throughout the hexactinellid tree strongly suggests a random gain-loss scenario of frameshift sites in hexactinellid protein coding genes. Furthermore, in contrast to other groups of animals, the evolution of frameshift sites appears to be more widespread in glass sponges, which can be possibly explained by their biology. Although proteins involved in aerobic respiration are essential for nearly all animals, glass sponges may be able to better tolerate frameshift mutations in their genes due to their exceptionally low metabolism and low oxygen environment (Dayton 1979, Yahel et al. 2007, Whitney et al. 2005), which would necessitate a relatively low base level of mt gene expression as opposed to many other animals. This low metabolic rate may allow frameshift accumulation in mitochondrial coding sequences even if the frameshifting mechanism is rather inefficient, as long as there was leaky translation of OXPHOS proteins. Because growth rate appears to be inversely associated with body size in glass sponges (Dayton 1979), we hypothesized larger species which have slower growth rates may accumulate more frameshifts in their mt-genomes.

Indeed, holotype body length measurements are positively correlated with frameshift abundance (r=.94, p=.005), although published body length data is scant for glass sponges with sequenced mt-genomes (N=6; Fig. 7). Thus, sponges with a larger body size and a lower growth rate and basal metabolism appear to be more likely to accumulate frameshift mutations in their protein coding genes.

In conclusion, our findings of +1 translational frameshifts in Hexactinellida are by far the most numerous and diverse accounts of frameshifting in the mitochondrial proteins of any group known to date. The pattern of their occurrence, along with the knowledge of unusual features of hexactinellid mt-genomes, suggests the following model of frameshift accumulation in this group: 1) GGA codons are conserved at certain positions of hexactinellid protein coding genes, 2) base insertions upstream the GGA codon occur, shifting the reading frame +1, 3) wobble-matched tryptophan tRNAs and Watson-Crick matched glycine tRNAs compete for the A-site pairing interaction, and, finally, 4) when glycine tRNAs are bound at the A-site, a +1 frameshift occurs. In this scenario it seems to follow that frameshifting efficiency could be stoichiometric, potentially based upon modifications of tRNAs (Gustilo et al. 2008) or their differential stability inside the mitochondrion, and, additionally, it could be relatively inefficient, based upon the mutant glycine tRNA’s ability to correctly decode the frameshifting site. Overall, the data amassed in current studies of mitochondrial frameshifting in animals warrants the reanalysis of metazoan mitochondrial protein coding genes with single nucleotide insertions, particularly in cases where these insertions exist in regions with strongly biased codon usage that may indicate +1 programmed translational frameshifting. Disparity of frameshift conservation may not necessarily indicate numts, but instead, many frameshift sites in Metazoa appear to be poorly conserved by means of their apparently neutral gain-loss process. Nucleotide compositional skews and certain tRNA structural changes may additionally be informative regarding frameshift presence and gene functioning. Although phylogenies suggest a neutral evolution of acquired mitochondrial frameshift mutations in animal mitochondria, ultimately, functional studies will be necessary to truly address the possibility of their physiological significance.

Acknowledgments. We would like to thank staff members of the Harbor Branch

Oceanographic Institute, Florida Atlantic University, for their generosity in helping KH to acquire the specimen used in this study, especially Dr. Amy Wright and John Reed. We also wish to thank Jena Hopkins for technical assistance on this project, Amy Sojka, Biological

Illustration major at Iowa State University, for composing Figure 1 for this paper, and Martin

Dohrmann for discussions about molecular dating. This research was supported by the

Porifera Tree of Life Grant (PorToL) and an Ecology, Evolution and Organismal Biology

(EEOB) graduate student research grant.

Literature Cited

Adachi J, Hasegawa M. 1996. Model of amino acid substitution in proteins encoded by mitochondrial DNA. J Mol Evol: 42:459-468.

Beckenbach AT, Robson SKA, Crozier RH. 2005. Single nucleotide +1 frameshifts in an apparently functional mitochondrial cytochrome b gene in ants of the genus Polyrhachis. J Mol Evol 60:141–152.

Berney, C. and Pawlowski J. 2006. A molecular time-scale for evolution recalibrated with the continuous record. Proc Biol Sci. 273:1867-1872.

Bossi, L., Smith, D.M. 1984. Suppressor sufJ: A novel type of tRNA mutant that induces translational frameshifting. Proc. Natl. Acad. Sci. 81:6105–6109.

Brasier, M.D. & G.O. Shields. 1997. Ediacaran sponge spicule clusters from Southwestern Mongolia and the origins of Cambrian fauna. Geology 25:303-306.

Brückner, A., Janussen, D. 2005. The first entirely preserved fossil sponge species of the genus Rossella (Hexactinellida) from the Upper of Bornholm, Denmark. J. Paleont. 79:21–28.

Buchan JR, Stansfield I. 2007. Halting a cellular production line: responses to ribosomal pausing during translation. Biol Cell. Sep: 99(9):475-87.

Burger, G., Y. Yan, P. Javadi and B.F. Lang. 2009. Group I-intron trans-splicing and mRNA editing in the mitochondria of placozoan animals. Trends in Genetics. 25(9): 381-386.

Peterson, K., J. and N. Butterfield. 2005. Origin of the Eumetazoa: testing ecological predictions of molecular clocks against the Proterozoic fossil record. Proc Natl Acad Sci USA 102, 9547-9552.

Coffino P. 2001. Regulation of cellular polyamines by antizyme. Nat Rev Mol Cell Biol. 2(3):188-94.

Curran, J.F., Yarus, M. 1987. Reading frame selection and transfer RNA anticodon loop stacking. Science 238:1545–1550.

Danovaro R, Dell'Anno A, Pusceddu A, Gambi C, Heiner I, Kristensen RM. 2010. The first metazoa living in permanently anoxic conditions. BMC Biology; 8:30.

Dayton, PK. 1979. Observations of growth, dispersal and population dynamics of some sponges in McMurdo Sound, Antarctica, Colloques internationaux du C.N.R.S. 291: 271– 282.

Deng JH, Li Y, Park JS, Wu J, Hu P, Lechleiter J, Bai Y. 2006. Nuclear suppression of mitochondrial defects in cells without the ND6 subunit. Mol Cell Biol. 26(3):1077-86.

Detmer, S.A., and Chan, D.C. 2007. Functions and dysfunctions of mitochondrial dynamics. Nat Rev Mol Cell Biol 8, 870-879.

Drummond AJ, Rambaut A. 2007. BEAST: Bayesian evolutionary analysis by sampling trees. BMC Evolutionary Biology 7:214.

Drummond AJ, Suchard MA. 2010. Bayesian random local clocks, or one rate to rule them all. BMC Biol 8: 114.

Erpenbeck D, Voigt O, Adamski M, Woodcroft BJ, Hooper JN, Wörheide G, Degnan BM. 2010. NUMTs in the sponge genome reveal conserved transposition mechanisms in metazoans. Mol Biol Evol. Aug 18. [Epub ahead of print]

Farabaugh PJ. 1996. Programmed translational frameshifting. Microbiol Rev 60:103–134.

Farabaugh PJ, Kramer E, Vallabhaneni H, Raman A. 2006. Evolution of +1 programmed frameshifting signals and frameshift-regulating tRNAs in the order Saccharomycetales. J Mol Evol 63:545–561

Gehling, J. G. and J. K. Rigby. 1996. Long expected sponges from the neoproterozoic ediacara fauna of South Australia. Journal of Paleontology, 2: 185-195.

Gray, JE. 1872. Notes on the classification of the sponges. Annals and Magazine of Natural History (4) 9(54): 442-461.

Gustilo EM, Vendeix FA, Agris PF. 2008. tRNA's modifications bring order to gene expression. Curr Opin Microbiol 11: 134–140.

Haen, K.M., F.B. Lang, S.A. Pomponi & D.V. Lavrov. 2007. Glass Sponges and Bilaterian Animals Share Derived Mitochondrial Genomic Features: A Common Ancestry or Parallel Evolution? Mol Biol Evol 24: 1518-1527.

Harlid, A., A. Janke, & U. Arnason. 1997. The mtDNA sequence of the ostrich and the divergence between paleognathous and neognathous birds. Mol. Biol. Evol. 14:754-761.

Hjort K., Goldberg A. V., Tsaousis A. D., Hirt R. P., Embley T. M. 2010. Diversity and reductive evolution of mitochondria among microbial eukaryotes. Phil. Trans. R. Soc. B 365, 713–727.

Hooper JNA, Van Soest RWM, Debrenne F. 2002. Phylum Porifera Grant, 1836. In: Hooper JNA, Van Soest RWM, editors. Systema Porifera: a guide to the classification of sponges. New York: Kluwer Academic/Plenum Publishers. p 9-13.

Huttenhofer, A, B. Weis-Brummer, G. Dirheimer, R. Martin. 1990. A novel type of +1 frameshift suppressor: a base substitution in the anticodon stem of a yeast mitochondrial serine-tRNA causes frameshift suppression. EMBO J 9(2):551-558.

Ivanov, I.P., Matsufuji, S., Murakami, Y., Gesteland, R.F., Atkins, J.F. (2000) Conservation of polyamine regulation by translational frameshifting from yeast to mammals. EMBO J. 19:1907–1917.

Ivanov, I. and John F. Atkins. 2007. Ribosomal frameshifting in decoding antizyme mRNAs from yeast and protists to humans: close to 300 cases reveal remarkable diversity despite underlying conservation. Nucleic Acids Res. 35(6): 1842–1858.

Jobb G, von Haeseler A and Strimmer K. 2004. TREEFINDER: a powerful graphical analysis environment for molecular phylogenetics. BMC Evolutionary Biology 4:18

Krainer, K., Mostler, H. 1991. Neue Hexactinellide Poriferen aus der südalpinen Mitteltrias der Karawanken (Kärnten, Österreich). Geol. Paläont. Mitt. Innsbruck 18:131–150.

Lang BF, Gray MW, Burger G. 1999. Mitochondrial genome evolution and the origin of eukaryotes. Ann Rev Genet. 33:351-97.

Laslett, D. & Canbäck B. 2008. ARWEN, a program to detect tRNA genes in metazoan mitochondrial nucleotide sequences. Bioinformatics 24: 172-175.

Lavrov DV, Forget L, Kelly M, Lang BF. 2005. Mitochondrial genomes of two demosponges provide insights into an early stage of animal evolution. Mol Biol Evol 22:1231-1239.

Lavrov DV. 2007. Key transitions in animal evolution: a mitochondrial DNA perspective. Integr Comp Biol. 47:734–743.

Lavrov DV, Brown WM, Boore JL. 2000. A novel type of RNA editing occurs in the mitochondrial tRNAs of the centipede Lithobius forficatus. Proc Natl Acad Sci USA. 97(25):13738-42.

Leys, S. P., Mackie, G. O. & Reiswig, H. M. 2007. The biology of glass sponges. Adv. Mar. Biol. 52, 1–145.

Lowe TM, Eddy SR. 1997. tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res. 25:955-964.

Mangold, U. 2006. Antizyme inhibitor: mysterious modulator of cell proliferation. Cell. Mol. Life Sci. 63:2095-2101.

Masuda, I., M. Matsuzaki, K. Kita. 2010. Extensive frameshift at all AGG and CCC codons in the mitochondrial cytochrome c oxidase subunit 1 gene of Perkinsus marinus (Alveolata; Dinoflagellata). Nucleic Acids Res. Advance access.

Milbury CA, Gaffney PM (2005) Complete mitochondrial DNA sequence of the eastern oyster Crassostrea virginica. Mar Biotechnol (NY) 7:697–712

Mindell DP, Sorenson MD, Dimcheff DE. 1998. An extra nucleotide is not translated in mitochondrial ND3 of some birds and turtles. Mol Biol Evol 15:1568–1571

Mostler, H. 1986. Beitrag zur stratigraphischen Verbreitung und phylogenetischen Stellung der Amphidiscophora und Hexasterophora (Hexactinellida, Porifera). Mitt. Osterr. Geol. Ges. 78:319–359.

Parham JF, Macey JR, Papenfuss TJ, Feldman CR, Türkozan O, Polymeni R, Boore J. 2006. The phylogeny of Mediterranean tortoises and their close relatives based on complete mitochondrial genome sequences from museum specimens. Mol Phylogenet Evol 38:50–64

Rigby, J.K., Pisera, A., Wrzolek, T., Racki, G. 2001. Upper Devonian sponges from the Holy Cross mountains, central Poland. Palaeontology 44:447–488.

Rosengarten RD, Sperling EA, Moreno MA, Leys SP, Dellaporta SL (2008) The mitochondrial genome of the hexactinellid sponge Aphrocallistes vastus: evidence for programmed translational frameshifting. BMC Genomics 9:33.

Salomon, D. 1990. Ein neuer lyssakiner Kieselschwamm, Regadrella leptotoichica (Hexasterophora, Hexactinellida) aus dem Untercenoman von Baddeckenstedt (Nordwestdeutschland). N. Jb. Geol. Paläont. Mh. 1990:342–352.

Schrammen, A. 1903. Zur Systematik der Kieselspongien. Mitteilungen aus dem Roemer- Museum, Hildesheim 19: 1-21.

Schrammen, A. 1912. Die Kieselspongien der oberen Kreide von Nordwestdeutschland. II. Teil: Triaxonia (Hexactinellida). Paläontographica Suppl. 5:177–385.

Schrammen, A. 1924. Die Kieselspongien der overen Kreide von Nordwestdeutschland. III. und letzter Teil. Monographien zur Geologie und Palaontologie (1) Heft 2: 1-159, I-XVII.

Sonenberg, N., J.W.B. Hershey & M.B. Mathews, Eds. 2000. Translational Control of Gene Expression. Monograph 39. Cold Spring Harbor Laboratory Press.

Sperling, E. A. Robinson, J. M., Pisani, D., Peterson, K. J. 2010. Where's the glass? Biomarkers, molecular clocks, and microRNAs suggest a 200-Myr missing fossil record of spicules. Geobiology 8: 24-36.

Sundar, V.C., A.D. Yablon, J.L. Grazul, M. Ilan, J. Aizenberg. 2003. Fibre-optical features of a glass sponge. Nature 424, 899-900.

Temperley R, Richter R, Dennerlein S, Lightowlers RN, Chrzanowska-Lightowlers ZM. 2010. Hungry codons promote frameshifting in human mitochondrial ribosomes. Science. 327(5963):301

Thompson JD, Higgins DG, Gibson TJ. 1994. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Research 22(22) :4673-80

Weiss, RB and Gallant JA. 1986. Frameshift suppression in aminoacyl-tRNA limited cells. Genetics. April 112: 727-739.

Whitney, F., K. Conway, R. Thomson, V. Barrie, M. Krautter, G. Mungov. 2005. Oceanographic habitat of sponge reefs on the Western Canadian Continental Shelf. Continental Shelf Research 25(2): 211-226.

Yahel, G., Whitney, F., Reiswig, H. M., Eerkes-Medrano, D. I. and Leys, S. P. (2007). In situ feeding and metabolism of glass sponges (Hexactinellida, Porifera) studied in a deep temperate fjord with a remotely operated submersible. Limnology and Oceanography 52(1): 428-440.

Zardoya R, Meyer A. 1998. Complete mitochondrial genome suggests diapsid affinities of turtles. Proc Natl Acad Sci USA 95:14226–14231.

Zittel, K.A. 1877. Studien uber fossile Spongien, I. Hexactinellidae. Abhandlungen der Mathematisch-Physikalischen Classe der Koniglich-Bayerischen Akademie der Wisseneschaften 13(1): 1-63.

.

Figure 1. Illustration of ribosomal bypass at hexactinellid frameshifting sites. The presented data is from the Sympagella nux nad2 gene. The tRNA in the P-site is a D-loop truncated seryl mt-tRNA. According to the out of frame pairing model, ribosomes become stalled at the UGG hungry codon. Competition between the rare wobble tRNA for tryptophan and a mutant Watson-Crick paired mt-tRNA for glycine putatively dictates whether the extra uracil in the UGGA frameshift site will be bypassed, restoring the reading frame of the coding sequence.

Figure 2. Amino acid alignments of frameshifted region in A) uncorrected and B) corrected protein translations for the cox1 gene of Iphiteon panicea, where a high degree of protein conservation can be observed in the restored reading frame (B). Note that in most cases the frameshifted protein ends abruptly at a stop codon in the new reading frame.

Figure 3. Nucleotide compositional skews at 1st, 2nd and 3rd codon positions of frameshifting and non-frameshifting nad2 genes from Hexactinellida. Combined results for species with a 5’ terminal frameshift (Euretidae sp., H. falcifera, Regadrella sp. and S. nux) are shown both uncorrected and manually corrected. On the far right of the graph, combined results are shown for wildtype nad2 genes (O. minuta, A. beatrix, A. vastus, Rossellidae sp. and V. pourtalesi).

Figure 4. Overall log codon usage from the coding sequences used in this study. Codons ending in T are colored orange, ending in C are dark blue, ending in A are light blue and ending in G are green. Note the general disuse of codons ending in guanine, including the

TGG and CGG hungry codons.

Figure 5. Timing of the crown group hexactinellid radiation according to the Bayesian random local molecular clock. The phylogenetic tree for eleven hexactinellid taxa was determined by both Bayesian and Maximum Likelihood phylogenetic analyses using the mtREV + G model of amino acid sequence evolution. All nodes have a posterior probability of 1 or bootstrap support of 100. The scale at the bottom indicates time in millions of years

(as MYA). The matrix on the right indicates the presence of frameshifts in mitochondrial protein coding genes. Dark and light gray boxes indicate alternate frameshift sites for the same gene; the split box for A. beatrix cox3 indicates two frameshift sites in the same gene.

Taxonomic orders are listed to the far right; Amphi* = Order Amphidiscosida. Nad6 was not sequenced from several species; see Table 1.

Figure 6. Glycine mt-tRNA structures from two representative glass sponges. A. Iphiteon panicea mt-tRNA Gly contains a 9 base-pair anticodon loop. B. Sympagella nux and other glass sponges from this study contain a 28:42 basepair mismatch in the mt-tRNA Gly acceptor stem. Unusual features are highlighted in red.

Figure 7. Correlation of holotype body length and mt frameshift presence. R= .94, P = .005.

Letters indicate the species of the measured holotype specimen: OM (O. minuta), VP (V. pourtalesi), SN (S. nux), HF (H. falcifera), IP (I. panicea), and AB (A. beatrix).

Table 1. For nearly complete mitochondrial genome data, protein-coding genes used in the current analyses are indicated by exception, with the implication that all 13 protein-coding genes typically found in the mt genomes of sponges, with the exception of atp8 (atp 6, 9

(subunits 6, 8 and 9 of the F0 ATPase); cox1-3 (cytochrome c oxidase subunits 1-3); nad1-6, nad4L (NADH dehydrogenase subunits 1-6 and 4L)) should be present.

!"#$%&#'( )*+,',&%$( )*+,',&%$( 6,-$7#2( 9*:*(:5"#( !-.$/*00( 4*&%/5( 8-&.#2( )*+,',&%$( 123#2( !"#$%&'&() ($-./) *(+,&*"#() 0(12 3"'(1#"++()4-5) ,678/),679/)0(1:/)0(1;) <6-4(,(4)=&0>$() 0(12 ?@=-('"++()0>7) 0(12 A(B"++()-6>#$(+"4&) 3644"++&1(")4-5) 0(12 C-D&$"60)-(0&,"() E>#"$&1(")4-5) 0(19/) ($-./)0(1F/)0(12 G(##"&1(")4-5) ,678) H-D#6,(++&4$"4) 0(12/) I"($#&7) ($-./)0(19 H-D#6,(++&4$"4) J(4$>4) !@(+60"=($&1(") ,678/)0(18/),6I/)0(1;/) 4-5) 0(1F)

Table 2. Glass sponge specific primers used in this study. Additional primer sequences available from authors.

;/*00(!",'<#(!"#$%=%$(>2%( >2%(!#?-#'$#(@AB(:,(CBD(

CHAPTER 5. PARALLEL LOSS OF NUCLEAR-ENCODED MITOCHONDRIAL

AMINOACYL-tRNA SYNTHETASES AND mtDNA-ENCODED tRNAs IN

CNIDARIA

A paper published in Molecular Biology and Evolution

Karri M. Haen, Walker Pett, and Dennis V. Lavrov

Abstract

Unlike most animal mitochondrial genomes, which encode a set of 22 tRNAs sufficient for mitochondrial protein synthesis, those of cnidarians have only retained one or two tRNA genes. Whether the missing cnidarian mt-tRNA genes relocated outside the main mitochondrial chromosome or were lost remains unclear. It is also unknown what impact the loss of tRNA genes had on other components of mitochondrial translational machinery. Here we explored the nuclear genome of the cnidarian Nematostella vectensis for the presence of mitochondrial tRNA genes and their corresponding mitochondrial aminoacyl-tRNA synthetases (mt-aaRS). We detected no candidates for mitochondrial tRNA genes and only two mt-aaRS orthologs. At the same time, we found that all but one cytosolic aaRS appear to be targeted to mitochondria. These results indicate the loss of mt-tRNAs in Cnidaria is genuine and occurred in parallel with the loss of nuclear-encoded mt-aaRS. Our phylogenetic analyses of individual aaRS revealed that while the nearly total loss of mt-aaRS is rare, aaRS gene deletion and replacement has occurred throughout the evolution of Metazoa.

Introduction

In contrast to other eukaryotic groups where mitochondrial (mt) genomes vary substantially in size and gene content, animal mtDNA is remarkably conserved. The typical mitochondrial genome of bilaterian animals contains 37 genes, which encode 13 proteins, two ribosomal RNAs and 22 transfer RNAs (tRNAs), while the mtDNA of non-bilaterian animals often contains a few extra genes (Lavrov 2007). Exceptions to these generalities do occur and often involve the loss of tRNA genes. Extreme examples of mt-tRNA gene loss may be found in the mitochondrial genomes of the Phylum Cnidaria (Beagley et al. 1995;

Shao et al. 2006; Kayal and Lavrov 2008), the (G1) group of demosponges (Wang and Lavrov 2008), and in (arrow worms) (Helfenbein et al. 2004; Papillon et al. 2004), where no more than two mt-tRNAs appear to be retained.

While the loss of mitochondrial genes is uncommon within animals, it has been a general trend throughout most of mitochondrial genome evolution prior to the origin of

Metazoa. In particular, nearly all genes necessary for mitochondrial protein synthesis have either been transferred to the nucleus or supplanted by preexisting nuclear genes of similar function, resulting in a metazoan mt protein translation machinery that is co-encoded by the nucleus and the mitochondrion (Adams and Palmer 2003). Although, in principle, the functional transfer of mt-tRNAs to the nucleus could explain their absence from mt-genomes, there has been no evidence so far of such import (Lithgow and Schneider 2010). Instead, imported cytosolic counterparts presumably have replaced lost mt-tRNAs (Duchêne et al.

2009).

If the absence of mt-tRNA genes from mt-genomes represents a true loss, it should have an effect on the other components of the mitochondrial translational machinery with which mt-tRNAs coevolved. Aminoacyl-tRNA synthetases (aaRS), the enzymes that catalyze the specific attachment of amino acids to the 3’-ends of cognate tRNAs, should be especially affected by tRNA loss. In opisthokonts (animals, fungi, and their unicellular relatives), aaRS may be subdivided into three categories based on their functional localization: cytosol-specific (cy-aaRS), involved in cytosolic protein synthesis; mitochondria-specific (mt-aaRS), involved in mitochondrial protein synthesis; and bifunctional, operating in both compartments (Antonellis and Green 2008). Throughout metazoan evolution, the members of these categories have undergone structural modifications in parallel with changes in tRNA identity determinants. In particular, there is clear evidence for coevolution between mitochondrial aaRS and mt-tRNA structures

(Chimnaronk et al. 2005; Sissler et al. 2005). Because of this coevolution, as well as their potentially distinct origins (e.g., Brindefalk et al. 2007), aaRS operating in one compartment may not be able to recognize tRNAs specific for another. Hence we hypothesized the actual loss of mt-tRNA genes would render mt-aaRS disposable by removing the evolutionary pressure to maintain them, while mt-tRNA transfer to the nucleus or to another mitochondrial chromosome would not affect them. In order to test this hypothesis, we surveyed the nuclear genome of the anthozoan cnidarian Nematostella vectensis (Putnam et al. 2007) for evidence of mt-tRNA gene transfer, as well as mt-aaRS gene loss.

Results and Discussion

We extracted tRNA-like sequences from the N. vectensis genome using tRNAscan-

SE (Lowe and Eddy 1997) and used Neighbor-Joining (NJ) analysis (Saitou and Nei 1987;

Studier and Keppler 1988) based on uncorrected (p) distances to reconstruct their phylogenetic relationships with mitochondrial tRNA genes from four species of demosponges. Both cytosolic and mitochondrial tRNAs can be detected and annotated using tRNAscan-SE in Cnidaria because of their conserved secondary structures. In total, 12,781 nuclear tRNA-like sequences were found in the N. vectensis genome after removing 20 bacterial contaminants and one mt-tRNAMet (present on a mitochondrial contig). We clustered all tRNA genes into 415 groups with >90% similarity and used one representative from each of these groups along with two tRNA genes from the previously published

Nematostella sp. JVK-2006 mtDNA (Medina et al. 2006) and 101 mt-tRNA sequences from four species of demosponges for the phylogenetic analysis. The results of the NJ analysis revealed no close affinity of any N. vectensis nuclear tRNA genes to mt-tRNA from demosponges. By contrast, both mitochondrial tRNAs of Nematostella sp. were closely related to their demosponge counterparts (Supplementary Fig S8; online only). Thus, there is no genomic evidence for a functional transfer of mt-tRNA genes to the nucleus in N. vectensis, consistent with previous studies on other organisms (Lithgow and Schneider

2010).

To assess whether the loss of mt-tRNA genes co-occurred with the loss of mt-aaRS genes, we applied a Reciprocal Best Hits (RBH) approach (Rivera et al. 1998) to search the complete genome of N. vectensis for the presence of aaRS orthologs using the functionally verified aaRS sequences from (Fig. 1). As a control, we performed the same search in Homo sapiens. We identified sequences orthologous to all yeast cy-aaRS in both N. vectensis and H. sapiens except for GluRS, which is known to be fused with ProRS in Metazoa forming a bifunctional Glu-ProRS enzyme (Berthonneau and

Mirande 2000). Furthermore, of the 14 distinctly mitochondrial (not bifunctional, see Fig. 1) genes from yeast, our RBH search could identify 11 corresponding orthologs in human.

Reversing the RBH search (using human sequences to query yeast) revealed that two of the remaining mitochondrial genes (mt-LysRS and mt-ArgRS) are orthologous between yeast and human, but underwent duplication and sublocalization in yeast. By contrast, while we found all 19 expected cy-aaRS orthologs in N. vectensis genome, we were able to identify only two putative mt-aaRS orthologs (mt-PheRS and mt-TrpRS), suggesting most mt-aaRS genes have been lost. This result contradicts the hypothesis that mt-tRNA loss would be precipitated in situations where mt-aaRS were able to aminoacylate imported cytosolic tRNAs (Schneider 2001; Lithgow and Schneider 2010).

The retention of two mt-aaRS in the nuclear genome of N. vectensis appears to be non-random. The reassignment of the UGA termination codon to tryptophan in animal mtDNA likely explains the retention of the nuclear encoded mt-TrpRS as well as the

Trp mtDNA-encoded mt- tRNAUCA. In particular, the cy-TrpRS specifically recognizes the

Trp Trp anticodon of tRNACCA as a tRNA identity element, and thus mt- tRNAUCA cannot be aminoacylated! by the cytosolic enzyme, necessitating the distinct mitochondrial enzyme

(Charriére! et al. 2006). The retention of mt-PheRS may be explained! by structural differences between the mt-PheRS and its cytosolic counterpart. Unlike the majority of aminoacyl-tRNA synthetases that are monomers or homo-oligomers, both prokaryotic and cy-PheRS have

!2"2 heterotetrameric structures, with the active site located in the !-subunit and tRNA binding sites in both subunits. Conversely, the gene encoding metazoan mt-PheRS exists as a chimera of the !-subunit and the C-terminal tRNA anticodon binding domain of the "- subunit (Sanni et al. 1991). We suspect a hydrophobic domain of the N-terminal "-subunit may complicate its import into the mitochondrion, promoting the use of the chimeric protein

(unpublished).

The loss of mt-aaRS should necessitate the import of cy-aaRS into mitochondria, which would require the acquisition of mt-targeting signals (Wang et al. 2003). Dual- targeting of the same aaRS to different cellular compartments is not an unusual phenomenon, and the acquisition of a mt-targeting signal by a cy-aaRS can serve as an indicator of mt- aaRS gene loss (Duchêne et al. 2005). Thus to substantiate our inference about mt-aaRS loss in N. vectensis, we surveyed the genomic and EST sequences of cy-aaRS for the presence of upstream alternative start sites that could produce N-terminal targeting presequences. Indeed, we located putative mt-targeting signals for all the cy-aaRS expected to function in both the cytosol and mitochondria with scores within the range observed for experimentally verified targeting signals (Supplementary Table S1). As expected, we did not identify a putative mt- targeting sequence in the N. vectensis cy-PheRS, but interestingly we were able to identify such a sequence in cy-TrpRS, despite the maintenance of mt-TrpRS. Furthermore, MitoProt scores for both TrpRS were highly significant and virtually identical (~0.98). This suggests both tryptophanyl enzymes can be directed to the mitochondrion. The reason for this is unknown, but in dicotyledonous plants where two GlyRS are also directed to mitochondria, the second aaRS without mt-tRNA aminoacylation activity was hypothesized to be involved in the import of tRNA isoacceptors (Duchêne et al. 2001). Furthermore, we detected targeting signals for 10 out of 17 cy-aaRS in H. sapiens. Although the latter observations weaken the inference that can be made regarding the fate of mt-aaRS from the presence of mt-targeting signals in cy-aaRS, they potentially illustrate the first necessary step for the loss of mt-tRNAs and mt-aaRS. Further research detailing protein and tRNA targeting and import mechanisms will help clarify the interactions of these tryptophanyl translational elements.

Although the loss of nearly all mt-aaRS from cnidarian nuclear genomes represents an unusual case in the evolution of animal mitochondrial translation systems, the loss of either a mitochondrial or cytosolic gene, accompanied by the duplication and neolocalization of the remaining aaRS to a different cellular compartment, has occurred several times throughout eukaryotic evolution. As an example, a previous study deduced the presence of eight bifunctional synthetases in the common ancestor of animals and fungi, suggesting either the original mitochondrial or cytosolic gene was lost (Brindefalk et al. 2007, Fig. 1B).

Our phylogenetic analyses of individual aaRS genes revealed that there was at least one additional round of duplication and differential sublocalization for six of these aaRS genes within the Fungi-Metazoa group (Fig. 1B; Supplementary Figs S1-S7). This pattern of gene deletion and replacement explains why we were unable to find yeast orthologs in H. sapiens for mt-ThrRS using the RBH approach (Supplementary Fig S5).

To conclude, our study suggests the reported absence of mt-tRNAs in Cnidaria represents their genuine loss and demonstrates that mt-tRNA gene loss was accompanied by the loss of nuclear-encoded mt-aaRS and the acquisition of mt-targeting signals by cy-aaRS.

Although it is theoretically possible that all of the remaining mt-aaRS and/or mt-tRNA genes are contained within the estimated 5% of unassembled protein-coding content of the N. vectensis genome (Putnam et al. 2007), we find it highly unlikely. We propose the following sequence of events in the loss of mitochondrial tRNA and aaRS genes: 1) both cytosolic and mt-aaRS are targeted to mitochondria, 2) cytosolic tRNAs are imported into mitochondria, potentially together with cy-aaRS via the protein import pathway (e.g., Chacinska et al.

2009), 3) because cytosolic tRNAs are essential for cytosolic translation while mitochondrial tRNAs may be replaceable by their nuclear analogs, a ratchet-like process leads to the loss of mitochondrial tRNAs 4) the loss of mitochondrial tRNAs leads to the disuse and eventual loss of mt-aaRS. This model is in agreement with cross aminoacylation studies in humans and yeast that showed cy-aaRS are generally inefficient for the aminoacylation of mt-tRNAs

(Sissler et al. 2005). Our model also shows that neolocalization of cy-aaRS is a necessary, but not sufficient, condition for the deletion of mt-tRNA, since even in the simplest scenario an import signal should also evolve in cytosolic tRNA genes (Salinas et al. 2008).

Furthermore, we speculate the more frequent loss of mt-tRNAs observed in non-bilaterian animals (Wang and Lavrov 2008) and non-metazoan eukaryotes (reviewed in Duchêne et al.

2009) than in bilaterian animals is due to the more conserved structures of tRNAs encoded by their mtDNA, which are more similar to cytosolic tRNAs (Wolstenholme 1992) and thus can be more easily replaced by them. It is unclear whether the loss of nearly all mt-tRNAs and mt-aaRS observed in Cnidaria is simply an extreme example of the process that occurs in other metazoans or if it is a distinct phenomenon facilitated by additional factors. To this end, it would be very interesting to investigate whether a parallel loss of mt-aaRS occurred in

Keratosa and Chaetognatha, two groups of animals that experienced a loss of mt-tRNA genes very similar to that in Cnidaria.

Supplementary Material

Supplementary methods, fig. S1-S8, and tables S1-10 are available at Molecular Biology and

Evolution online (http://www.mbe.oxfordjournals.org/).

Funding sources

This work was supported by the National Science Foundation [DEB-0828783] and an EEOB departmental graduate student research grant.

Acknowledgements

We would like to thank Lesley Collins and Xiujuan Wang for helpful discussions related to this research, and Martin Dohrmann, Ehsan Kayal, the associate editor of the journal, and two anonymous reviewers for insightful suggestions on earlier versions of this manuscript.

Literature cited:

Adams KL, Palmer JD. 2003. Evolution of mitochondrial gene content: gene loss and transfer to the nucleus. Mol Phylogenet Evol. 29:380-395. Antonellis A, Green ED. 2008. The role of aminoacyl-tRNA synthetases in genetic diseases. Annu Rev Genomics Hum Genet. 9:87-107. Beagley CT, Macfarlane JL, Pont-Kingdon GA, Okimoto R, Okada N, Wolstenholme DR. 1995. Mitochondrial genomes of Anthozoa (Cnidaria). Progress in Cell Research. 5:149-153. Berthonneau E, Mirande M. 2000. A gene fusion event in the evolution of aminoacyl-tRNA synthetases. FEBS Lett. 470:300-304. Brindefalk B, Viklund J, Larsson D, Thollesson M, Andersson SG. 2007. Origin and evolution of the mitochondrial aminoacyl-tRNA synthetases. Mol Biol Evol. 24:743- 756. Chacinska A, Koehler CM, Milenkovic D, Lithgow T, Pfanner N. 2009. Importing mitochondrial proteins: machineries and mechanisms. Cell. 138:628-644. Charriére F, Helgadottir S, Horn EK, Söll D, Schneider A. 2006. Dual targeting of a single tRNA(Trp) requires two different tryptophanyl-tRNA synthetases in Trypanosoma brucei. Proc Natl Acad Sci U S A. 103:6847-6852. Chimnaronk S, Gravers Jeppesen M, Suzuki T, Nyborg J, Watanabe K. 2005. Dual-mode recognition of noncanonical tRNAs(Ser) by seryl-tRNA synthetase in mammalian mitochondria. EMBO J. 24:3369-3379. Duchêne AM, Giritch A, Hoffmann B, Cognat V, Lancelin D, Peeters NM, Zaepfel M, Maréchal-Drouard L, Small ID. 2005. Dual targeting is the rule for organellar aminoacyl-tRNA synthetases in Arabidopsis thaliana. Proc Natl Acad Sci U S A. 102:16484-16489. Duchêne AM, Peeters N, Dietrich A, Cosset A, Small ID, Wintz H. 2001. Overlapping destinations for two dual targeted glycyl-tRNA synthetases in Arabidopsis thaliana and Phaseolus vulgaris. J Biol Chem. 276:15275-15283. Duchêne AM, Pujol C, Maréchal-Drouard L. 2009. Import of tRNAs and aminoacyl-tRNA synthetases into mitochondria. Curr Genet. 55:1-18.

Helfenbein KG, Fourcade HM, Vanjani RG, Boore JL. 2004. The mitochondrial genome of Paraspadella gotoi is highly reduced and reveals that chaetognaths are a sister group to protostomes. Proc Natl Acad Sci U S A. 101:10639-10643. Kayal E, Lavrov DV. 2008. The mitochondrial genome of Hydra oligactis (Cnidaria, Hydrozoa) sheds new light on animal mtDNA evolution and cnidarian phylogeny. Gene. 410:177-186. Lavrov DV. 2007. Key transitions in animal evolution: a mitochondrial DNA perspective. Integr Comp Biol. 47:734-743. Lithgow T, Schneider A. 2010. Evolution of macromolecular import pathways in mitochondria, hydrogenosomes and mitosomes. Philos Trans R Soc Lond B Biol Sci. 365:799-817. Lowe TM, Eddy SR. 1997. tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res. 25:955-964. Medina M, Collins AG, Takaoka TL, Kuehl JV, Boore JL. 2006. Naked corals: skeleton loss in Scleractinia. Proc Natl Acad Sci USA. 103:9096-9100. Papillon D, Perez Y, Caubit X, Le Parco Y. 2004. Identification of chaetognaths as protostomes is supported by the analysis of their mitochondrial genome. Mol Biol Evol. 21:2122-2129. Putnam NH et al. 2007. Sea anemone genome reveals ancestral eumetazoan gene repertoire and genomic organization. Science. 317:86-94. Rivera MC, Jain R, Moore JE, Lake JA. 1998. Genomic evidence for two functionally distinct gene classes. Proc Natl Acad Sci U S A. 95:6239-6244. Saitou N, Nei M. 1987. The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol Biol Evol. 4:406-425. Salinas T, Duchene AM, Marechal-Drouard L. 2008. Recent advances in tRNA mitochondrial import. Trends Biochem Sci. 33:320-329. Sanni A, Walter P, Boulanger Y, Ebel JP, Fasiolo F. 1991. Evolution of aminoacyl-tRNA synthetase quaternary structure and activity: Saccharomyces cerevisiae mitochondrial phenylalanyl-tRNA synthetase. Proc Natl Acad Sci U S A. 88:8387-8391.

Schneider A. 2001. Does the evolutionary history of aminoacyl-tRNA synthetases explain the loss of mitochondrial tRNA genes? Trends Genet. 17:557-559. Shao Z, Graf S, Chaga OY, Lavrov DV. 2006. Mitochondrial genome of the moon jelly Aurelia aurita (Cnidaria, Scyphozoa): A linear DNA molecule encoding a putative DNA-dependent DNA polymerase. Gene. 381:92-101. Sissler M, Putz J, Fasiolo R, Florentz C. 2005. Mitochondrial aminoacyl-tRNA synthetases. In: editors. The aminoacyl-tRNA synthetases. Georgetown, Tex., U.S.A: Landes Bioscience : Eurekah.com. p 269-281. Studier JA, Keppler KJ. 1988. A note on the neighbor-joining algorithm of Saitou and Nei. Mol Biol Evol. 5:729-731. Wang CC, Chang KJ, Tang HL, Hsieh CJ, Schimmel P. 2003. Mitochondrial form of a tRNA synthetase can be made bifunctional by manipulating its leader peptide. Biochemistry. 42:1646-1651. Wang X, Lavrov DV. 2008. Seventeen new complete mtDNA sequences reveal extensive mitochondrial genome evolution within the Demospongiae. PLoS ONE. 3:e2723. Wolstenholme DR. 1992. Animal mitochondrial DNA: structure and evolution. Int Rev Cytol. 141:173-216.

Figure 1. Evolutionary relationships among cytosolic and mt-aaRS in H. sapiens (HS),

N. vectensis (NV), and Saccharomyces cerevisiae (SC). Row labels indicate aaRS specificity, and circles indicate presence/absence of aaRS. Filled, empty, and half-filled circles denote cytosolic, mt-, and bifuntional aaRS, respectively. Dashed circles indicate inferred losses of aaRS. Small circles indicate secondary duplication events. (A) aaRS for which distinct ancestral cy-aaRS (Cy) and mt-aaRS (Mt) can be inferred in Metazoa. (B) aaRS for which only a single ancestral bifunctional aaRS (Cy/Mt) was inferred.

CHAPTER 6. GENERAL CONCLUSIONS

Conclusions

Extended sampling of nonbilaterian mitochondrial genomes has challenged our ideas about what characteristics may describe the metazoan mtDNA molecule. With the addition of data from the Porifera and Cnidaria, we now know these mtDNAs may be linear (Bridge et al. 1992), contain extra genes or have missing genes (eg. Wang & Lavrov 2008), have unusual tRNA secondary structures (Wolstenholme 1992), or may show extreme parallel evolutionary features with distantly related lineages (Haen et al. 2007). Glass sponges, Class

Hexactinellida, Phylum Porifera, have been one of the final frontiers for mitochondrial genomic sequencing among the Metazoa. In this dissertation, we have sequenced the very first hexactinellid mitochondrial genomes, extended our range of sampling to three orders and seven families as part of the Porifera Tree of Life Project, and have attempted to gain insight into the mechanisms that drive the evolution of various peculiarities of these, as well as other, metazoan mt genomes.

Studies upon hexactinellid mitochondrial DNAs have found a plethora of unexpected characters that were formerly attributed to mt genomes within the Bilateria: noncanonical tRNA structures and genetic code changes, derived rRNA and tRNA secondary structures, and the loss of the atp8 gene. Contrarily, other features, like the presence of the atp9 gene and the sporadic presence of noncoding regions and/or ORFs that potentially encode proteins of unknown function, show glass sponges’ affinity for the Porifera. Although mtDNA based molecular phylogenies were not able to confidently place Hexactinellida within the Metazoa, the phylogentic relationships within the group were generally found to agree with morphology based systematics (Reiswig 2002), with the exception of the placement of all dictyonine sponges within Orders Hexactinosida or Lychniscosida. At least in the case of the

Dactylocalycidae, Iphiteon panicea appears to be misplaced within the Hexactinosida.

Our analysis of hexactinellid mt genomes found the most extensive case of translational frameshifting known in metazoan mitochondria. Although frameshifting in mitochondrial genes has been identified a few other times (Temperley et al. 2010; Russell et al. 2008; Bechenbach et al. 2005; Mindel et al. 1998) these reports concentrated upon conserved frameshift sites. In Hexactinellida, it appears that frameshifts are free to be accumulated and lost at certain positions inside the protein coding genes. We have proposed a mechanism for the maintenance of these frameshift sites based upon the codon usage and tRNAs present within these genomes. Additionally, we were able to use molecular data to assign putative dates to several hexactinellid crown groups, allowing us to determine how old certain frameshift sites may be.

The evolution of tRNAs within metazoan mitochondria is particularly interesting because at no other place in the Tree of Life do tRNA structures show such a broad range of diversity. Apparently, many metazoan mitochondrial translation systems have evolved to accommodate tRNAs of divergent structure, including their respective nucleus encoded aminoacyl-tRNA synthetases. The same features that allow metazoan mitochondria to tolerate tRNA structural changes may also allow functional tRNA import from the nucleus to be an option for increasing the pool of available tRNAs for mitochondrial protein translation.

We found this in the cnidarian, Nematostella vectensis, where we confirmed the loss of mt- tRNAs from both the mitochondrial and nuclear genomes and showed a parallel loss of all but two of the corresponding mitochondrial aminoacyl-tRNA synthetases (Haen et al. 2010).

Future directions

Several potential extensions of this research are possible, and I will only elaborate upon a few of them. Primarily, with respect to studying hexactinellid mitochondrial genomes, two future outcomes would be desirable: 1.) the complete sequencing of a mitochondrial genome from Subclass Amphidiscophora and 2.) the resolution of

Hexactinellida’s place among the Metazoa using mitochondrial genomic data. Although the condition of available DNAs made PCR and sequencing difficult for the amphidiscophoran

Hyalonematidae sp. sampled in this study, new specimens recently acquired by our laboratory may make complete mt genome sequencing possible. The molecular phylogenetic resolution of Hexactinellida’s place in the Metazoa will prove to be more difficult. This research found that extended sampling did not help to resolve Hexactinellida’s phylogenetic position (data not shown), including the addition of the partial mt genome sequence from

Hyalonematidae sp. It seems clear that in order to resolve deep branching phylogenies that newer phylogenetic models will be required. These models may not only take into account heterotachy (unequal evolutionary rates in branches) among lineages, but will also consider the possibility of different amino acid replacement matrices among these lineages.

Another interesting future direction would be to determine whether mitochondrial genomes that share a parallel loss of tRNA genes with cnidarians, such as within certain groups of demosponges, have also lost the same mitochondrial aminoacyl-tRNA synthetases from their nuclear genomes. Such a study would emphasize the connection between mt tRNA and mt aaRS gene loss but may also be a starting point for understanding what functional reasons certain mt aaRS are retained while others are lost. Further, a recent study

(Brandão and Silva-Filo 2010) has prompted me to continue to consider my hypothesis that aaRS genes are prone to lateral gene transfer from prokaryotes (unpublished). Because of the prevalence of bacteria-like aaRS in the nuclear genomes I have analyzed, it would be interesting to see whether some of these genes, formerly dismissed as contaminations, may actually contribute to the functional aaRS repertoire of animals.

Literature Cited

Beckenbach AT, Robson SKA, Crozier RH. 2005. Single nucleotide +1 frameshifts in an apparently functional mitochondrial cytochrome b gene in ants of the genus Polyrhachis. J Mol Evol 60:141–152

Brandão, M.M. and M.C. Silva-Filho. 2010. Evolutionary history of Arabidopsis thaliana aminoacyl-tRNA synthetase dual-targeted proteins. Mol Biol Evol. Advance Access.

Bridge, D., C. W. Cunningham, B. Schierwater, R. Desalle, and L. W. Buss. 1992. Class level relationships in the phylum Cnidaria: evidence from mitochondrial genome structure. Proc. Natl. Acad. Sci. USA 89:8750-8753

Haen, K.M., F.B. Lang, S.A. Pomponi & D.V. Lavrov. 2007. Glass Sponges and Bilaterian Animals Share Derived Mitochondrial Genomic Features: A Common Ancestry or Parallel Evolution? Mol Biol Evol 24: 1518-1527.

Haen, K.M., W. Pett & D.V. Lavrov. 2010. Parallel loss of nuclear-encoded mitochondrial aminoacyl-tRNA synthetases and mtDNA-encoded tRNAs in Cnidaria. Mol Biol Evol 27(10):2216-2219.

Mindell DP, Sorenson MD, Dimcheff DE (1998) An extra nucleotide is not translated in mitochondrial ND3 of some birds and turtles. Mol Biol Evol 15:1568–1571

Reiswig HM. 2002. Class Hexactinellida Schmidt, 1870. In: Hooper JNA, Van Soest RWM, editors. Systema Porifera: a guide to the classification of sponges. New York: Kluwer Academic/Plenum Publishers. p 1201-1202.

Russell, R. David and Andrew T. Beckenbach 2008. Recoding of Translation in Turtle Mitochondrial Genomes: Programmed Frameshift Mutations and Evidence of a Modified Genetic Code. J Mol Evol. 2008 December; 67(6): 682–695.

Temperley R, Richter R, Dennerlein S, Lightowlers RN, Chrzanowska-Lightowlers ZMA. 2010. Hungry codons promote frameshifting in human mitoribosomes. Science 327: 301

Wang X and Lavrov DV. 2008. Seventeen new complete mtDNA sequences reveal extensive mitochondrial genome evolution within the Demospongiae. PLoS ONE, 3(7): e2723.

Wolstenholme DR. 1992. Genetic novelties in mitochondrial genomes of multicellular animals. Curr Opin Genet Dev 2:918925.

APPENDIX A. Supplementary materials for Chapter 3

Supplementary Figure 1. Mean and range of AT composition in hexactinellid mitochondrial genes. AT composition is relatively stable regardless of the type of gene analyzed. AB = A. beatrix, AV= A. vastus, EU = Euretidae sp., HF = H. falcifera, RO =

Rossillidae sp., OM = O. minuta, IP = I. panicea, SN = S. nux, VP = V. pourtalesi.

Supplementary Figure 2. Mean and range of AT skew for hexactinellid mitochondrial genes. The average AT skew is positive, regardless of the type of gene analyzed, although

AT skews are generally stronger in rRNA genes. . AB = A. beatrix, AV= A. vastus, EU =

Euretidae sp., HF = H. falcifera, RO = Rossillidae sp., OM = O. minuta, IP = I. panicea, SN

= S. nux, VP = V. pourtalesi.

Supplementary Figure 3. Mean and range of GC skew for hexactinellid mitochondrial genes. The average GC skew is negative, regardless of the type of gene analyzed, although

GC skews are most pronounced in protein-coding genes. . AB = A. beatrix, AV= A. vastus,

EU = Euretidae sp., HF = H. falcifera, RO = Rossillidae sp., OM = O. minuta, IP = I. panicea, SN = S. nux, VP = V. pourtalesi.

Supplementary Figure 4. Mean and range of GC skew at 3rd codon positions for hexactinellid, as well as demosponge, protein coding genes. The negative GC skew of protein coding genes is exaggerated at 3rd positions, showing a clear preference for codons ending in C over G. Interestingly, AT and GC skews are opposite in glass and demosponges.

AB = A. beatrix, AV= A. vastus, EU = Euretidae sp., HF = H. falcifera, RO = Rossillidae sp.,

OM = O. minuta, IP = I. panicea, SN = S. nux, VP = V. pourtalesi, TW = Tethya wilhelma

(Demospongiae), XM = muta (Demospongiae). The latter-most genome was published in Wang and Lavrov 2008.

Supplementary Figure 5. Mean and range of hexactinellid mitochondrial protein coding gene lengths. Gene length is highly conserved, with the exception of nad2, nad5 and cob. AB = A. beatrix, AV= A. vastus, EU = Euretidae sp., HF = H. falcifera, RO =

Rossillidae sp., OM = O. minuta, IP = I. panicea, SN = S. nux, VP = V. pourtalesi.

Supplementary Figure 6. Mean and range of hexactinellid mitochondrial protein coding gene pairwise distances to demosponges. Pairwise distances were calculated with respect to demosponges representing each major lineage of Demospongiae (Wang and

Lavrov 2008): Oscarella carmela (G0; ), Xestospongia muta (G3),

Aplysina fulva (G2), Tethya actinia (G4), and Igernella notabilis (G1).

APPENDIX B. Supplementary materials for Chapter 5

Figure S1-7. Evolution of aaRS in Metazoa and Fungi. Maximum likelihood trees for

AlaRS, CysRS, HisRS, LysRS, ArgRS, ThrRS and ValRS respectively. AlaRS, CysRS,

HisRS, LysRS and ThrRS were aligned using Probcons. ArgRS and ValRS were aligned using ClustalW. Numbers above nodes indicate bootstrap support < 95 from 1000 replicates.

Pink highlighted lineages indicate those for strictly mitochondrial aaRS genes, blue for cytosolic. Cy = cytosolic, Mt = mitochondrial, Cy/Mt = dual targeted.

Figure S1-7. Evolution of aaRS in Metazoa and Fungi. Maximum likelihood trees for

AlaRS, CysRS, HisRS, LysRS, ArgRS, ThrRS and ValRS respectively. AlaRS, CysRS,

HisRS, LysRS and ThrRS were aligned using Probcons. ArgRS and ValRS were aligned using ClustalW. Numbers above nodes indicate bootstrap support < 95 from 1000 replicates.

Pink highlighted lineages indicate those for strictly mitochondrial aaRS genes, blue for cytosolic. Cy = cytosolic, Mt = mitochondrial, Cy/Mt = dual targeted.

Figure S1-7. Evolution of aaRS in Metazoa and Fungi. Maximum likelihood trees for

AlaRS, CysRS, HisRS, LysRS, ArgRS, ThrRS and ValRS respectively. AlaRS, CysRS,

HisRS, LysRS and ThrRS were aligned using Probcons. ArgRS and ValRS were aligned using ClustalW. Numbers above nodes indicate bootstrap support < 95 from 1000 replicates.

Pink highlighted lineages indicate those for strictly mitochondrial aaRS genes, blue for cytosolic. Cy = cytosolic, Mt = mitochondrial, Cy/Mt = dual targeted.

Figure S1-7. Evolution of aaRS in Metazoa and Fungi. Maximum likelihood trees for

AlaRS, CysRS, HisRS, LysRS, ArgRS, ThrRS and ValRS respectively. AlaRS, CysRS,

HisRS, LysRS and ThrRS were aligned using Probcons. ArgRS and ValRS were aligned using ClustalW. Numbers above nodes indicate bootstrap support < 95 from 1000 replicates.

Pink highlighted lineages indicate those for strictly mitochondrial aaRS genes, blue for cytosolic. Cy = cytosolic, Mt = mitochondrial, Cy/Mt = dual targeted.

Figure S1-7. Evolution of aaRS in Metazoa and Fungi. Maximum likelihood trees for

AlaRS, CysRS, HisRS, LysRS, ArgRS, ThrRS and ValRS respectively. AlaRS, CysRS,

HisRS, LysRS and ThrRS were aligned using Probcons. ArgRS and ValRS were aligned using ClustalW. Numbers above nodes indicate bootstrap support < 95 from 1000 replicates.

Pink highlighted lineages indicate those for strictly mitochondrial aaRS genes, blue for cytosolic. Cy = cytosolic, Mt = mitochondrial, Cy/Mt = dual targeted.

Figure S1-7. Evolution of aaRS in Metazoa and Fungi. Maximum likelihood trees for

AlaRS, CysRS, HisRS, LysRS, ArgRS, ThrRS and ValRS respectively. AlaRS, CysRS,

HisRS, LysRS and ThrRS were aligned using Probcons. ArgRS and ValRS were aligned using ClustalW. Numbers above nodes indicate bootstrap support < 95 from 1000 replicates.

Pink highlighted lineages indicate those for strictly mitochondrial aaRS genes, blue for cytosolic. Cy = cytosolic, Mt = mitochondrial, Cy/Mt = dual targeted.

Figure S1-7. Evolution of aaRS in Metazoa and Fungi. Maximum likelihood trees for

AlaRS, CysRS, HisRS, LysRS, ArgRS, ThrRS and ValRS respectively. AlaRS, CysRS,

HisRS, LysRS and ThrRS were aligned using Probcons. ArgRS and ValRS were aligned using ClustalW. Numbers above nodes indicate bootstrap support < 95 from 1000 replicates.

Pink highlighted lineages indicate those for strictly mitochondrial aaRS genes, blue for cytosolic. Cy = cytosolic, Mt = mitochondrial, Cy/Mt = dual targeted.

ACKNOWLEDGEMENTS

I would like to acknowledge my major professor, Dennis Lavrov, former and present lab mates, especially Xiujuan Wang and Will Pett, and departmental colleagues for contributing insightful discussions and technical assistance regarding the research in this dissertation. I’m indebted to the members of my POS Committee, which included Jeanne Serb, Allen Miller,

Robert Wallace, and John Nason for their guidance and inspiration. I would also like to thank Barbara Pleasants for her constant mentoring, which has set an excellent example for me regarding how to teach and think about evolution. Finally, I would like to thank my mother, Kristina Rouk, who supported my educational endeavors in countless ways, despite her never gaining a baccalaureat degree herself, and my husband, Christopher Whitmer, for his patience and relentless encouragment.