REVIEW
Horizontal Gene Transfer in Choanoflagellates
RICHARD P. TUCKER*
Department of Cell Biology and Human Anatomy, University of California, Davis, Davis, California

ABSTRACT Horizontal gene transfer (HGT), also known as lateral gene transfer, results in the rapid acquisition of genes from another organism. HGT has long been known to be a driving force in speciation in prokaryotes, and there is evidence for HGT from symbiotic and infectious to metazoans, as well as from to bacteria. Recently, it has become clear that as many as 1,000 genes in the genome of the choanoflagellate Monosiga brevicollis may have been acquired by HGT. Interestingly, these genes reportedly come from , bacteria, and other choanoflagellate prey. Some of these genes appear to have allowed an ancestral choanoflagellate to exploit nutrient-poor environments and were not passed on to metazoan descendants. However, some of these genes are also found in genomes, suggesting that HGT into a common ancestor of choanozoans and may have contributed to metazoan . J. Exp. Zool. (Mol. Dev. Evol.) 320B:1–9, 2013. © 2012 Wiley Periodicals, Inc.

How to cite this article: Tucker RP. 2013. Horizontal gene transfer in choanoflagellates. J. Exp. Zool. (Mol. Dev. Evol.) 320B:1–9.

A key event that can contribute to the evolution of a is the acquisition of new genetic material. This can take a number of different forms ranging from point mutations to chromosomal recombination and rearrangements to whole genome duplications; the inheritance of, and selection for, the novel genetic material derived from pre-existing genes can lead to distinctive traits that are the hallmark of a novel species. In prokaryotes it has been recognized for decades that new genetic material in the form of complete, novel genes can be acquired through horizontal gene transfer (HGT; also referred to as lateral gene transfer; Lawrence, '99; de la Cruz and Davies, 2000). This can be a relatively common event. Early studies predicted that between 8% and 18% of the Escherichia coli genome was acquired by HGT (Ochman and Lawrence, '96), and more recent studies comparing the genomes of different E. coli strains suggest that HGT may be even more common (e.g., Brzuszkiewicz et al., 2006; reviewed by Dobrindt et al., 2010).

HGT is not limited to prokaryotes. Recently, HGT from bacteria to and nematodes, as well as from bacteria to hydrozoans, has been described (Dunning Hotopp, 2011). For example, approximately 30% of the genome of the proteobacterium Wolbachia, a parasitic microbe of arthropods, is found in the genome of the beetle Callosobruchus chinensis (Nikoh et al., 2008). There is even evidence of HGT from to bacteria: approximately 100 genes in Legionella pneumophila, the bacterium behind Legionnaires' disease, appear to have a origin (Lurie-Weinberger et al., 2010). The number of -derived genes in L. pneumophila is likely to be unusually high given that these are intracellular that have been infecting protists for a very long time. Nevertheless, it points to the potential for the two-way acquisition of potentially beneficial genes across kingdoms.

Choanoflagellates are now widely considered to be the most closely related group to metazoans (Fig. 1A; King and Carroll, 2001; Philippe et al., 2004; Abedin and King, 2008; King et al., 2008, 2009). Intriguingly, examination of choanoflagellate genes indicates that HGT is particularly common in these single-celled organisms, with genes appearing to arise from prokaryotic and eukaryotic prey caught in the mucous net arrayed between microvilli at the base of their flagellum (Fig. 1B–D). Here, I will review the recent (and rapidly growing) literature supporting HGT in the choanoflagellate Monosiga brevicollis and present novel analyses of these genes, taking into consideration the number of novel genomes that have become available since the original research was done.

Conflict of interest: none to declare.
*Correspondence to: Richard P. Tucker, Department of Cell Biology and Human Anatomy, University of California, Davis, Davis, CA 95616.
Received 20 December 2011; Revised 25 May 2012; Accepted 20 August 2012. Published online 19 September 2012 in Wiley Online Library. DOI: 10.1002/jez.b.22480


Figure 1. A schematic tree of life showing plausible phylogenetic relationships between choanoflagellates, fungi, and metazoans (A). Choanoflagellates are characterized by a microvilli collar found at the apical end of the cell body (B). A flagellum creates currents that can drive algal and prokaryotic prey into a mucous net arrayed between the microvilli (C). The prey are then phagocytosed, creating the potential for horizontal gene transfer (D).

PHOSPHOFRUCTOKINASE: THE FIRST EXAMPLE OF HORIZONTAL GENE TRANSFER INTO CHOANOZOANS
The first description of HGT between a prokaryote and a choanoflagellate is found in Bapteste et al. (2003). This article describes the complex evolutionary history of phosphofructokinase (PFK), a highly conserved enzyme that is central to the canonical glycolytic pathway. Construction of a phylogenetic tree based on the PFK sequences from a number of species resulted in puzzling relationships, not the least of which was the grouping of a PFK from Monosiga ovata with PFKs from the spirochetes Borrelia burgdorferi and Treponema sp. Such groupings are typically considered to be evidence of HGT. The authors concluded that this unusual relationship argued for HGT between the Gram-negative bacteria and the choanoflagellate, and that this was one of many cases where PFK and genes encoding related enzymes may have passed from one genome to another by HGT. Expression of this PFK in M. ovata is supported by the presence of ESTs (Table 1).

Reanalysis of the possible origins of PFK was undertaken using the default parameters of tblastn (http://blast.ncbi.nlm. and searching the NCBI nucleotide collection. As reported by Bapteste et al. (2003), the M. ovata PFK sequence is most similar to PFK sequences encoded in prokaryotic genomes (e.g., it shares 62% amino acid identity with PFK from the photosynthetic acidobacterium Chloracidobacterium thermophilum, AEP13132), and among eukaryotic sequences it is most similar to PFK from plants (e.g., 45% amino acid identity with PFK from Arabidopsis thaliana, AAL90928). Interestingly, the PFK from M. brevicollis (XP_001742461) is more similar to the PFKs found in fungi and metazoa than the PFK from M. ovata is. These and other representative sequences were aligned with ClustalW (pairwise alignment parameters: gap open penalty, 10.0; gap extension penalty, 0.01; multiple alignment parameters: gap open penalty, 10.0; gap extension penalty, 0.05; no weight transition; hydrophilic residues, GPSNDQERK), and the sequences aligning with the first 243 amino acids of the M. ovata enzyme (which align with few gaps) were subjected to simple phylogenetic tree analysis (UPGMA; from the ClustalW site). It is clear from this analysis that the gene in M. ovata that came from a prokaryote via HGT does not sort with the homolog from M. brevicollis, and that a gene related to the one in M. brevicollis, and not the gene that arose by HGT, was likely to be the gene that ultimately was contributed to the metazoan genome (Table 2; Fig. 2A).
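The UPGMA clustering step used in this reanalysis can be illustrated with a short Python sketch. This is only a minimal illustration under stated assumptions, not the ClustalW implementation: the function name `upgma`, the taxon labels, and the pairwise distances are hypothetical toy values, chosen so that a horizontally acquired gene groups with its putative donor rather than with its expected relatives.

```python
from itertools import combinations


def upgma(distances):
    """Cluster taxa by UPGMA and return a nested, Newick-like string.

    distances: dict mapping (label_a, label_b) tuples to a pairwise
    distance; every pair of labels must be present exactly once.
    """
    # Store distances under unordered keys so lookup order does not matter.
    d = {frozenset(pair): value for pair, value in distances.items()}
    # Each current cluster label maps to the number of taxa it contains.
    sizes = {}
    for pair in distances:
        for label in pair:
            sizes[label] = 1

    while len(sizes) > 1:
        # Join the two clusters separated by the smallest distance.
        a, b = min(combinations(sizes, 2), key=lambda p: d[frozenset(p)])
        size_a, size_b = sizes.pop(a), sizes.pop(b)
        merged = f"({a},{b})"
        # The distance to every other cluster is the size-weighted average.
        for other in sizes:
            d[frozenset((merged, other))] = (
                size_a * d[frozenset((a, other))]
                + size_b * d[frozenset((b, other))]
            ) / (size_a + size_b)
        sizes[merged] = size_a + size_b

    return next(iter(sizes))


# Hypothetical toy distances (NOT values from the article).
toy = {
    ("M_ovata", "bacterium"): 0.4,
    ("M_brevicollis", "fungus"): 0.5,
    ("M_ovata", "M_brevicollis"): 0.9,
    ("M_ovata", "fungus"): 0.9,
    ("bacterium", "M_brevicollis"): 0.9,
    ("bacterium", "fungus"): 0.9,
}
print(upgma(toy))  # ((M_ovata,bacterium),(M_brevicollis,fungus))
```

In this toy case the M_ovata sequence pairs with the bacterial sequence, while the M_brevicollis sequence pairs with the fungal one, which is the kind of topology read as a signature of HGT in the analysis above. UPGMA assumes roughly constant rates of change across lineages, so such a tree is best treated as a first-pass grouping.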

Table 1. Representative genes reported to be acquired by choanoflagellates via horizontal gene transfer.

Protein name                         RefSeq/JGI ID   Putative origin   EST? (a)   Refs.
Phosphofructokinase                  AAQ55175 (b)    Bacteria          Yes        Bapteste et al. (2003)
Spo11-3                              EDQ87716        Algae             No         Malik et al. (2007)
Top6B                                XP_001744261    Algae             No         Malik et al. (2007)
Uroporphyrin III methyltransferase   XP_001742170                      No         Maruyama et al. (2009)
Cobalamin synthesis protein          XP_001746731    Cyanobacteria     No         Maruyama et al. (2009)
Amino acid aminotransferase          XP_001749475    Cyanobacteria     Yes        Maruyama et al. (2009)
Glyoxalase I family-like protein     XP_001750995    Cyanobacteria     No         Maruyama et al. (2009)
lysA                                 XP_001746610    Bacteria          Yes        Torruella et al. (2009)
Ammonium transporter 1               XP_001743145    Algae             No         Nedelcu et al. (2009)
Glutamine synthetase                 32625           Algae             Yes        Nedelcu et al. (2009)
Glutamate synthase                   15256           Algae             No         Nedelcu et al. (2009)
thrA                                 34304           Algae             Yes        Sun and Huang (2011)
ilvD                                 32689           Algae             Yes        Sun and Huang (2011)
ilvE                                 32828                             Yes        Sun and Huang (2011)
Teneurin                             XP_001749414    Diatom/bacteria   Yes        Tucker et al. (2012)

(a) As determined from reference or by tblastn of Monosiga sequence against non-human, non-mouse ESTs limited to Monosiga.
(b) Monosiga ovata (RefSeq/JGI ID numbers not marked with (b) are from Monosiga brevicollis).

Table 2. Analysis of genes reported to be acquired by choanoflagellates via horizontal gene transfer (HGT).

Column headings: Protein name; Likely origin: From HGT; Likely origin: Prokaryotic; HGT gene: Replaces homolog; HGT gene: Found with homolog; HGT gene: Novel (a); HGT gene: Into metazoa; Pre-existing homolog into metazoa

Phosphofructokinase (b): Yes, Yes, Yes (f), Yes
Spo11-3: Yes, Yes, Yes, Yes
Top6B: Yes, Yes, Yes
Uroporphyrin III methyltransferase: Yes, ? (c), ? (e), Yes
Cobalamin synthesis protein (b): Yes, Yes, Yes
Amino acid aminotransferase: Yes, Yes, ?
Glyoxalase I family-like protein: Yes, ? (e)
lysA: Yes, Yes, ?, ?, ?
Ammonium transporter 1: Yes, Yes, Yes, Yes
Glutamine synthetase (b): Yes, Yes, Yes, Yes
Glutamate synthase: Yes, Yes, Yes, Yes
thrA: Yes, ?, ?, Yes
ilvD: Yes, Yes, Yes, Yes, Yes, Yes
ilvE: Yes, Yes, ?
Teneurin (b): Yes, Yes (d), Yes, Yes

(a) Homologous gene not found in or fungi, suggesting that gene appeared in unikonts via HGT.
(b) Phylogenetic trees are provided elsewhere.
(c) Analysis inconclusive.
(d) A fusion of domains acquired by HGT and a pre-existing transmembrane protein.
(e) Possible origin from an .
(f) Replaces homolog in Monosiga ovata, but pre-existing gene is found in M. brevicollis.

Figure 2. Phylogenetic tree analysis of representative proteins proposed to have been acquired in choanozoans by horizontal gene transfer (see text for details). A phosphofructokinase (A) in Monosiga ovata (underlined) clusters with prokaryotic sequences, a hallmark of horizontal gene transfer. However, the phosphofructokinase found in M. brevicollis is found in the same clade as the homologous enzyme in fungi and metazoans, as one would expect if horizontal gene transfer did not take place. This demonstrates that horizontal gene transfer can often be quite specific in choanozoan lineages. Three genes that make up the glutamate synthase cycle have been proposed to have been acquired in choanoflagellates by horizontal gene transfer. Glutamine synthetase (B) is representative of this group. It appears to have been acquired from algal prey, as it clusters at the bottom of this tree with and plants. The second clade at the top of the tree contains fungi and metazoans. One possible explanation for this arrangement is that the gene was acquired by a choanoflagellate after the evolution of metazoa. M. brevicollis has two cobalamin synthesis proteins (C). One is encoded by a gene that was likely to have been acquired by horizontal gene transfer (“1”; underlined), as it is found in the same clade as proteins from a microscopic alga and a diatom. The second cobalamin synthesis protein (“2”) clusters with fungi and metazoans. In this case, the pre-existing gene was not replaced by the gene that was likely to have been acquired by horizontal gene transfer, and the pre-existing gene was transmitted into metazoa. A fourth protein in M. brevicollis, teneurin (D), has distinct properties. The C-terminal sequences cluster with YD proteins from proteobacteria. The N-terminus of the M. brevicollis teneurin has EGF repeats, which are unique to eukaryotes. Thus, the gene encoding teneurin appears to be a hybrid of prokaryotic and pre-existing genes, and the resulting gene was transmitted into metazoa. Full species names and accession numbers can be found in the text.


REPORTS OF THE TRANSFER OF GENETIC MATERIAL FROM PLANTS TO CHOANOFLAGELLATES
The possibility of HGT between and Monosiga was first suggested by Malik et al. (2007), who studied phylogenetic relationships between Spo11, a protein that generates breaks in DNA during meiosis, and its paralogs. They found Spo11-1 in , plants and many prokaryotes, and Spo11-2 in plants and many prokaryotes. Spo11-3 and the related topoisomerase Top6B, in contrast, are found not only in plants and prokaryotes, but also in M. brevicollis. Both proteins are missing from other opisthokonts. The authors conclude that HGT between algal prey and choanoflagellate predator is the most likely explanation for the presence of these genes in M. brevicollis. This was confirmed by our analysis: Spo11-3 from M. brevicollis is most similar to homologs from the green alga Volvox carteri (60% amino acid identity, XP_002947507) and the marine diatom Thalassiosira pseudonana (60% amino acid identity, XP_002294481), and the M. brevicollis Top6B is most similar to the Top6B subunits of A. thaliana (47% amino acid identity, NP_188714) and V. carteri (46% amino acid identity, XP_002952053). Neither protein has homologs in amoebozoa, fungi, or metazoa (though, as noted above, the paralog Spo11-1 is found widely).

Three enzymes found in M. brevicollis that form the glutamate synthase cycle, namely ammonium transporter (AMT), glutamine synthetase and glutamate synthase, are all reported to be of algal origin (Fig. 3; Nedelcu et al., 2009). This is remarkable for a number of reasons. First, an entire enzymatic pathway is proposed to have been incorporated into the Monosiga genome, and the fact that the genes are not clustered in any way suggests that they were probably incorporated following separate HGT events. Second, such a pathway already existed in an ancestral choanoflagellate, so the authors speculate that the acquired genes were functionally superior and may have allowed a choanoflagellate to exploit nitrogen-poor environments. The search, alignment and tree analyses described above confirm this report (Table 2; Fig. 2B). For example, the M. brevicollis glutamine synthetase is most similar to this enzyme in the diatom (58% amino acid identity, DAA12505) as well as to various plants. A homolog is found in fungi (e.g., 54% amino acid identity, Clavispora lusitaniae, XP_002619296) and in cnidarians, , and (e.g., 50% amino acid identity, Hydra magnipapillata, XP_002161673). Phylogenetic tree analysis suggests the following: a glutamine synthetase gene was acquired by an ancestor of M. brevicollis from a diatom, and it replaced the existing gene. However, it was the glutamine synthetase gene found in fungi that persisted in the that evolved into metazoa, not the gene acquired by HGT (Fig. 2B).

Amino acid biosynthesis pathways in M. brevicollis are reported to have evolved from HGT as well (Sun and Huang, 2011). An enzyme common to the threonine/methionine pathway, thrA, appears to have an algal origin, as does ilvD, an enzyme used for isoleucine biosynthesis. Another isoleucine synthesis enzyme, ilvE, appears to have come from a diatom. Expression of all of these enzymes is supported by ESTs (Table 1). Our tblastn, alignment and phylogenetic tree analyses support these conclusions, with a few notable exceptions. One such exception is the possible origin of the M. brevicollis thrA, which is most similar to a thrA from a chloroflex bacterium (44% amino acid identity, Caldilinea aerophila, BAL98984), and clusters with this prokaryotic enzyme by phylogenetic tree analysis. However, one must note that the choanozoan thrA is also quite similar to plant homologs, so an origin from plants remains possible. In addition, the ilvD from M. brevicollis appears to have a prokaryotic origin: it shares 47% amino acid identity with the ilvD from Staphylococcus saprophyticus (BAE19438) and other bacteria, and phylogenetic tree analysis shows only distant relationships with plants. The ilvD from M. brevicollis is particularly remarkable in that it is most similar to metazoan sequences (e.g., 53% amino acid identity with the homolog from the sponge Amphimedon queenslandica, XP_003384343). It appears to be a rare example of a gene acquired by HGT persisting and perhaps playing a role in the evolution of metazoa.
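The percent amino acid identity values quoted throughout come from BLAST alignments. As a rough illustration of what such a figure measures, the Python sketch below computes identity over the ungapped columns of a pairwise alignment. The function name `percent_identity` and the example sequences are hypothetical, and conventions differ between tools (BLAST, for instance, reports identities over the full alignment length, gaps included), so this should not be read as the calculation behind the numbers in the text.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over aligned columns where neither sequence has a gap.

    aligned_a, aligned_b: two rows of a pairwise alignment, equal in length,
    with gaps written as the `gap` character.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue  # gapped columns are not counted in this convention
        compared += 1
        if x == y:
            matches += 1

    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical six-column alignment: four matches in five ungapped columns.
print(percent_identity("MKT-LV", "MKSALV"))  # 80.0
```

A high identity to a distantly related donor group, combined with low identity to the expected relatives, is what motivates the tree-based checks described in the text; identity alone does not establish HGT.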

Figure 3. Key enzymes in the glutamate synthase cycle (ammonium transporter 1, glutamine synthetase and glutamate synthase) were all acquired by Monosiga brevicollis through horizontal gene transfer from algal prey (adapted from Nedelcu et al., 2009). These enzymes may have functioned more efficiently than innate enzymes, thus allowing an ancestral choanoflagellate to exploit nitrogen-poor environments.

REPORTS OF THE TRANSFER OF GENETIC MATERIAL FROM PROKARYOTES TO CHOANOFLAGELLATES
Torruella et al. (2009) make a strong argument that the lysA gene in M. brevicollis was acquired from bacteria. Lysine is an essential amino acid that must be part of the animal diet, but some organisms have pathways for lysine biosynthesis: the DAP pathway requires the enzyme lysA, and the AAA pathway requires alpha-aminoadipate reductase (AAR). The DAP pathway is found in bacteria and plants, whereas the AAA pathway is found in fungi. Torruella et al. (2009) report that M. brevicollis has both an AAR gene and a lysA gene, and phylogenomics suggests that the lysA gene was acquired by HGT from proteobacteria. Its expression is also confirmed by ESTs. Analysis here confirms these conclusions. The gene encoding LysA in many prokaryotes is fused with the gene encoding LysC. tblastn searches of the nucleotide collection with the lysA sequence from M. brevicollis reveal that the choanozoan sequence is most similar to the part of the fused prokaryotic protein that corresponds to LysA (e.g., 46% identity, Legionella longbeachae, CBJ11456). Phylogenetic tree analysis also shows the M. brevicollis sequence clustering with the bacterial sequences. Numerous enzymes sharing structural similarity to lysA are found in fungi and metazoans, and the phylogenetic relationships between them are complex (e.g., see Sandmeier et al., '94). This complicates this particular analysis.

Note that a recent study describing phylogenomic screening of the M. brevicollis genome using PhyloGenie and Darkhorse has clarified the possible extent of HGT in choanoflagellates (Sun et al., 2010). These programs predicted that 1,324 genes in M. brevicollis were likely to have been acquired from other organisms: 53 of these genes appear to have arisen from HGT from diatoms or , and the remainder came from algae and bacteria. The authors conclude that HGT in Monosiga from its prey is widespread, and that the mechanism of prey gene integration into the genome (which is currently unknown) merits further investigation.

FROM CYANOBACTERIA OR PLANTS? FOUR STORIES FROM FOUR ENZYMES
An analysis of the genomes of eukaryotes lacking plastids for genes of cyanobacterial ancestry led Maruyama et al. (2009) to propose four prokaryote-derived genes in M. brevicollis: uroporphyrin III C-methyltransferase, cobalamin synthesis protein, an amino acid aminotransferase and a glyoxalase I family-like protein (Table 1). These four genes were further analyzed here as described above, with surprising results.

For example, the M. brevicollis uroporphyrin III C-methyltransferase gene is most similar to a homologous gene from the oomycete Phytophthora infestans (50% amino acid identity, XP_002998208). Phylogenetic tree analysis shows the choanozoan enzyme clustering with both the P. infestans sequence and the sequences from several genera of cyanobacteria, while another clade contains fungal sequences (e.g., 32% amino acid identity, Schizosaccharomyces japonicus, XP_002172446) and a homologous enzyme from the anthozoan Nematostella vectensis (but not other metazoans). These complicated relationships (Table 2) make it difficult to ascertain the origins of this enzyme in M. brevicollis: did it come from HGT following a meal, or did it arise from a cyanobacterial gene that is closely related to the gene in a heterokont? Analysis is further hampered by the challenging of these organisms. In any event, the uroporphyrin III C-methyltransferase gene in M. brevicollis probably did arise by HGT, replaced the existing gene, and was not transmitted into metazoa.

A different story emerges from the analysis of cobalamin synthesis protein. This protein in M. brevicollis is most similar to a homolog in V. carteri (58% amino acid identity, XP_002957516), microscopic marine algae (e.g., Micromonas sp. and Ostreococcus tauri) and diatoms (Phaeodactylum tricornutum), suggesting an origin not from cyanobacteria, but via HGT from algae. Interestingly, limiting the tblastn search to opisthokonts reveals a second cobalamin synthesis protein in Monosiga brevicollis (40% amino acid identity, XP_001746159), as well as homologs in fungi and metazoa. Phylogenetic tree construction (Fig. 2C) shows that this second protein has more traditional evolutionary origins and is probably encoded by the gene that was eventually transmitted into animals.

Like the cobalamin synthesis protein described above, the amino acid aminotransferase of M. brevicollis also appears to have arisen by HGT from plants: it is most similar to homologs in the primitive lycophyte Selaginella moellendorffii (47% amino acid identity, XP_002982481) and the green alga V. carteri (45% amino acid identity, XP_002948580), and less similar to this protein in cyanobacteria (e.g., 36% amino acid identity, Cyanothece sp., ACL43948). Limiting the search to opisthokonta reveals a distant homolog in fungi but no homolog in metazoa. Tree analysis clusters the M. brevicollis protein with the plant proteins and puts the fungal proteins into a different clade. Thus, the amino acid aminotransferase appears to have evolved in choanozoa by HGT from a eukaryote, and while the gene encoding this protein persisted into fungi (and perhaps into choanozoa, only to be replaced by the novel gene), it did not find its way into the metazoan genomes that are available for analysis.

Phylogenomic analysis of the glyoxalase I family-like protein reveals that it is most similar to a homolog from the oomycete Phytophthora infestans (48% amino acid identity, XP_002901811), as well as to a homolog from the percolozoan Naegleria gruberi (46% amino acid identity, XP_002683444). More dissimilar homologs are found in plants and prokaryotes, but no homologs are found in amoebozoa, fungi or metazoa. This suggests that while the gene was likely to have been acquired by HGT, its origins are difficult to determine. Perhaps, like uroporphyrin III C-methyltransferase, the gene originated from an oomycete or related algae.

TENEURIN: A GENE ENCODING A TRANSMEMBRANE PROTEIN REPORTED TO BE A FUSION OF SEQUENCES ACQUIRED BY HGT AND AN EXISTING GENE
Teneurins are multi- transmembrane proteins necessary for the normal development of the nervous system and pattern formation in both protostomes and deuterostomes (Fig. 4A; Tucker and Chiquet-Ehrismann, 2006; Young and Leamey, 2009). Recently, a teneurin was identified in the genome of M. brevicollis


Figure 4. Teneurins are transmembrane cell–cell adhesion proteins necessary for the normal development of the nervous system in a broad range of metazoans. In vertebrates, teneurins have a transmembrane domain (indicated by two short vertical bars), eight EGF repeats, a C-rich region, NHL domains, a series of YD repeats, and an RHS protein domain (A). Monosiga brevicollis has a teneurin gene encoding a protein with the same domain organization as vertebrate teneurins (B). Most of the extracellular domain of the M. brevicollis teneurin is encoded on a single huge exon. This is characteristic of a gene acquired by horizontal gene transfer. The extracellular domain of the M. brevicollis teneurin is most similar to the C-terminal half of a YD protein from the aquatic bacterium Desulfurivibrio alkaliphilus (C). Other YD proteins from proteobacteria that are also similar to teneurins have different N-terminal domains, demonstrating that the C-terminal sequences are used like a cassette to create novel genes in prokaryotes. Some of the YD proteins, like teneurins, are transmembrane proteins. The acquisition of a teneurin by an ancestral choanoflagellate may have increased its ability to catch prokaryotic prey through teneurin–YD protein interactions.

