JOURNAL OF BACTERIOLOGY, May 2006, p. 3682–3696 Vol. 188, No. 10 0021-9193/06/$08.00ϩ0 doi:10.1128/JB.188.10.3682–3696.2006 Copyright © 2006, American Society for Microbiology. All Rights Reserved.

Living with Instability: the Adaptation of Phytoplasmas to Diverse Environments of Their Insect and Plant Hosts†† Xiaodong Bai,1 Jianhua Zhang,1,2† Adam Ewing,1‡ Sally A. Miller,2 Agnes Jancso Radek,3§ Dmitriy V. Shevchenko,3 Kiryl Tsukerman,3 Theresa Walunas,3 Alla Lapidus,3¶ 3 1

John W. Campbell, ࿣ and Saskia A. Hogenhout * Downloaded from Department of Entomology1 and Department of Plant Pathology,2 The Ohio State University, Ohio Agricultural Research and Development Center, Wooster, Ohio 44691, and Integrated Genomics, Chicago, Illinois 606123

Received 30 September 2005/Accepted 23 January 2006

Phytoplasmas (“Candidatus Phytoplasma,” class Mollicutes) cause disease in hundreds of economically important plants and are obligately transmitted by sap-feeding insects of the order Hemiptera, mainly leafhoppers and psyllids. The 706,569-bp chromosome and four plasmids of aster yellows phytoplasma strain witches’ broom (AY-WB) were sequenced and compared to the onion yellows phytoplasma strain M (OY-M) genome. The phyto- http://jb.asm.org/ plasmas have small repeat-rich . This comparative analysis revealed that the repeated are organized into large clusters of potential mobile units (PMUs), which contain tra5 insertion sequences (ISs) and genes for specialized sigma factors and membrane proteins. So far, these PMUs appear to be unique to phytoplasmas. Compared to mycoplasmas, phytoplasmas lack several recombination and DNA modification functions, and there- fore, phytoplasmas may use different mechanisms of recombination, likely involving PMUs, for the creation of variability, allowing phytoplasmas to adjust to the diverse environments of plants and insects. The irregular GC skews and the presence of ISs and large repeated sequences in the AY-WB and OY-M genomes are indicative of high

genomic plasticity. Nevertheless, segments of ϳ250 kb located between the lplA and glnQ genes are syntenic between on October 14, 2015 by University of Queensland Library the two phytoplasmas and contain the majority of the metabolic genes and no ISs. AY-WB appears to be further along in the reductive evolution process than OY-M. The AY-WB genome is ϳ154 kb smaller than the OY-M genome, primarily as a result of fewer multicopy sequences, including PMUs. Furthermore, AY-WB lacks genes that are truncated and are part of incomplete pathways in OY-M.

Phytoplasmas cause disease in over 200 economically impor- cells of insects and plants, organisms belonging to two king- tant plants and are obligately transmitted by phloem-feeding doms. Phytoplasmas are members of the class Mollicutes. Mol- insects of the order Hemiptera, mainly leafhoppers and psyl- licutes are soft-skinned (mollis, soft; cutis, skin [in Latin]) bac- lids. They are unique bacteria, as they can efficiently invade teria due to the lack of an outer cell wall and usually have small genomes, a low GϩC content, few rRNA operons, few tRNA genes, and limited metabolic activities (18). Mollicutes repre- * Corresponding author. Mailing address: Department of Entomology, sent a branch of the phylogenetic tree of the gram-positive The Ohio State University, Ohio Agricultural Research and Development Center, 1680 Madison Avenue, Wooster, OH 44691. Phone: (330) 263- eubacteria and are most closely related to low-GC gram-pos- 3730. Fax: (330) 263-3686. E-mail: [email protected]. itive bacteria such as Bacillus, Clostridium, and Streptococcus † Present address: Potato Research Center, Agriculture and Agri- spp. (94, 96). Food Canada, Fredericton, NB E3B 4Z7, Canada. The phylogenetic tree of mollicutes is composed of two ‡ Present address: GCB Graduate Group, University of Pennsylva- nia, Philadelphia, PA 19104. major clades that diverged early in evolution (51). One clade § Present address: Epicentre Technologies Corp., Madison, WI 53713. contains the orders Acholeplasmatales and Anaeroplasmatales ¶ Present address: Microbial Genomics, DOE Joint Genome Insti- (AAA clade mollicutes), and the other clade contains the or- tute, Walnut Creek, CA 94598. ders Mycoplasmatales and Entomoplasmatales (SEM clade molli- ࿣ Present address: Scarab Genomics, LLC, Madison, WI 53713. †† Authors’ contributions: X.B., performance of the majority of the cutes) (9). Phytoplasmas, formerly known as mycoplasma-like bioinformatics analysis (annotation, comparative genome analyses, de- organisms of plants, form a monophyletic group in the order fining metabolic pathways of AY-WB, and submission of sequences to Acholeplasmatales (51) and were recently assigned to a novel GenBank), and writing of manuscript; J.Z., development of DNA genus, “Candidatus Phytoplasma” (41). Approximately 20 phy- isolation method, DNA isolation, gap closure, and annotation of se- lected sequences; A.E., annotation of selected sequences and charac- toplasma phylogenetic groups have been proposed based on terization of PMUs; S.A.M., project initiation, project support, and 16S rRNA gene sequences, and new branches are continuously providing materials and resources; A.J.R., sequencing and gap closure; being discovered (69, 85). Members of the order Acholeplas- D.V.S., construction of AY-WB genomic libraries, sequencing, and matales are distinct from other mollicutes in several ways. For gap closure; K.T., bioinformatics (sequence assembly and gap closure); T.W., bioinformatics (maintenance of annotation database and auto- instance, whereas most mollicutes use UGA as a tryptophan mated annotation); A.L., project manager for construction of AY-WB codon instead of a stop codon, a feature they share with mi- genomic libraries, sequencing, and gap closure; J.W.C., project man- tochondria, the acholeplasmas and phytoplasmas retained ager for bioinformatics (sequence assemply, gap closure, maintenance UGA as a stop codon (80). of annotation database, and assembly); S.A.H., project initiation, over- all project management (experimental work, annotation, and all other Mollicutes have been extensively studied because of their bioinformatics analyses), and writing of manuscript. economic importance. They are disease agents and obligate

3682 VOL. 188, 2006 COMPARATIVE ANALYSIS OF TWO PHYTOPLASMA GENOMES 3683 inhabitants of humans, mammals, reptiles, fish, arthropods, genomes of at least nine SEM clade mollicutes and one AAA and plants. Phytoplasmas are generally associated with arthro- clade mollicute (OY-M phytoplasma) (76) have been fully pods and plants, whereas mycoplasmas (Entomoplasmatales sequenced. Here, we report the full sequence of the small and Mycoplasmatales) and ureaplasmas (Mycoplasmatales) are genome of AY-WB. Comparative genome analysis revealed pathogens that cause infections of the respiratory and urogen- the presence of 14 to 23% repetitive DNA organized in po- ital tracts, eyes, alimentary canals, glands, and joints of humans tential mobile units (PMUs) in the phytoplasma genomes and and animals. Interestingly, three spiroplasmas, Spiroplasma differences in standard metabolic and nonmetabolic pathways kunkelii, Spiroplasma citri, and Spiroplasma phoeniceum, are between phytoplasmas and SEM clade mollicutes.

also insect-transmitted plant pathogens but belong to the order Downloaded from Entomoplasmatales (34) and hence are distantly related to the MATERIALS AND METHODS phytoplasmas. Dual phytoplasma and spiroplasma infections of DNA isolation. The AY-WB strain was collected from diseased lettuce plants in Celeryville, Ohio (41.00°N, 82.45°W), in 1998 (99). AY-WB was isolated from insects and plants occur frequently (40). lettuce plants about 2 weeks after the appearance of symptoms. The stems of Severalmycoplasmas,ureaplasmas,spiroplasmas,andachole- lettuce plants were cut at several places with a sharp razor blade, and phloem sap plasmas have been cultured outside their hosts in artificial oozing from the cut area was collected. On average, 1.6 ml sap was collected from culture media. Culture media are complex, because mollicutes each symptomatic lettuce plant. For preparation of gel plugs, 200 ␮l sap was ␮ ϫ suffered extensive gene losses and consequently lack genes of immediately mixed with 800 l precooled 30% glucose–1 Tris-EDTA (pH 8.0) buffer, followed by centrifugation at 16,000 ϫ g for 20 min at 4°C. The pellet was many basic metabolic pathways. However, to date, phytoplas- ␮ ϫ

mixed with 80 l 1% premelted low-melting agarose (45°C) in 0.5 Tris-borate- http://jb.asm.org/ mas have not been cultured in cell-free medium, indicating EDTA (pH 8.0) and incubated at 4°C. Solidified plugs were subjected to pro- that phytoplasmas have a different metabolism and are likely to teinase K digestion at 50°C for 48 h and then rinsed with 1ϫ Tris-EDTA buffer have more highly reduced genomes than other mollicutes. (pH 8.0) three times before subjection to pulsed-field gel electrophoresis (PFGE). PFGE was conducted in a 1% agarose gel with a running time of 18 h, The aster yellows phytoplasma (AYP) strain witches’ broom a 60- to 120-s switch time ramp, a voltage of 6 V/cm, and an included angle of (AY-WB) (“Ca. Phytoplasma asteris”; class Mollicutes) gener- 120° (CHEF-DR III; Bio-Rad, Hercules, CA). The AY-WB chromosome pro- ally spreads systemically in lettuce (Lactuca sativa L.) and duced a single band of ϳ700 kb in the PFGE gel. The identity of the band was China aster (Callistephus chinensis Nees), inducing a variety of confirmed by Southern blot hybridizations and PCR using phytoplasma-specific probes and primers, respectively. The 700-kb fragment was excised from the gel, symptoms, including vein clearing, yellowing, stunting, witches’ on October 14, 2015 by University of Queensland Library and the gel blocks were placed directly into the Elutrap (Schleicher & Schuell) broom, pigment loss or sterility of flowers, and necrosis (99). collection chamber for elution of DNA at 106 V at 4°C for 15 h. DNA was The extreme malformations of plants suggest that phytoplas- ethanol precipitated using standard procedures and resuspended in deionized mas interfere with plant hormone metabolism (51). AY-WB distilled water. The concentration of the purified genomic DNA was assessed also spreads systemically in Arabidopsis thaliana and Nicotiana using a PicoGreen kit (Molecular Probes). Sequencing strategy. The shotgun library was constructed at Integrated benthamiana, inducing yellowing, stunting, and witches’ broom Genomics Inc. (IG). Five micrograms of DNA was sheared using a computer- in both (X. Bai, V. Correa, and S. A. Hogenhout, unpublished controlled shearing device (GeneMachines, San Carlos, CA) to produce DNA results). AY-WB was classified into the 16SrI-A subgroup of fragments of 2 kb on average. Sheared DNA was loaded onto 0.7% agarose gels, “Ca. Phytoplasma asteris” based on the restriction fragment and DNA fractions corresponding to 2 to 2.5 kb were extracted from the agarose length polymorphism banding pattern of a 1.2-kb 16S rRNA gel. Single-stranded ends of the DNA were removed by T4 polymerase and then filled in with Klenow fragment. Size-selected 2- to 2.5-kb DNA fragments were gene PCR fragment (99). In contrast, onion yellows (OY) cloned into the pGEM-3Z vector (Promega, Madison, WI), introduced into phytoplasma strain M (OY-M), the only other phytoplasma for DH10B, and sequenced with the DYEnamic ET Dye Terminator which a complete genome sequence is available (74), belongs kit (Amersham Biosciences, Piscataway, NJ). Sequence quality assessment and to the 16SrI-B subgroup (51). “Ca. Phytoplasma asteris,” pre- subsequent assembly were performed with the Phred/Cross_match/Phrap pack- age (29, 30) and Paracel Genome Assembler. Sequencing and physical gaps in viously known as AYP or group I phytoplasma (52), is the the assembly were closed by multiplex PCR (92) and primer walking. largest of the phytoplasmas and associates with more than 100 Annotation. The sequence data of AY-WB were submitted to the IG database economically important diseases worldwide (51, 62). Plant and software suite, ERGO, for sequence annotation. CRITICA (8), Glimmer2 hosts include broad-leaf, herbaceous plants and several woody (25), and IG proprietary tools were used for open reading frame (ORF) iden- fruit crops (62). tification. ORF function annotation was conducted by a number of IG propri- etary algorithms that automatically predict the function of ORFs based on AY-WB is transmitted by the polyphagous leafhopper Mac- comparative analysis with orthologue clusters in ERGO. In addition, the pre- rosteles quadrilineatus (Forbes). Phytoplasma interactions with dicted proteins were searched, using the BLAST algorithm (6), against a non- insects are complex and involve intra- and extracellular repli- redundant database at the National Center for Biotechnology Information cation in gut and salivary glands, epithelial and muscle tissues, (NCBI). Protein functional domains were analyzed by searching against the NCBI conserved-domain database (60) and the Pfam database (12). The Kyoto and other organs and tissues. Whereas there is evidence that encyclopedia of genes and genomes was used for the reconstruction of the some phytoplasmas are vertically transmitted to the progeny of metabolic pathways. The assignment of Enzyme Commission (EC) number was their insect vectors (37), the predominant means of survival of done according to the BRENDA database (86). phytoplasmas is through transmission between insects and Nucleotide sequence accession numbers. Sequences of the AY-WB genome plants. They appear to manipulate their insect and plant hosts have been deposited in the GenBank database under accession numbers CP000061 (chromosome), CP000062 (plasmid AYWB-pI), CP000063 (plasmid to enhance their own transmission efficiencies. For example, AYWB-pII), CP000064 (plasmid AYWB-pIII), and CP000065 (AYWB-pIV). AYPs can increase fecundity and longevity of their insect vec- More detailed information on the AY-WB genome is available on our website tor, Macrosteles quadrilineatus (13). (http://www.oardc.ohio-state.edu/phytoplasma). Because of their small genomes and economic importance, mollicutes have been targeted for genome sequencing projects RESULTS for some time. Mycoplasma genitalium was the second bacte- General genomic features. The AY-WB genome is com- rium to be sequenced to completion because of its minimal posed of one circular chromosome of 706,569 bp (Fig. 1A) and gene complement for a cultivable organism (33). Thus far, contains two rRNA gene operons, 31 tRNA genes, and 671 3684 BAI ET AL. J. BACTERIOL. Downloaded from http://jb.asm.org/ on October 14, 2015 by University of Queensland Library

FIG. 1. (A) Genome maps of the 706,569-bp circular chromosome of “Candidatus Phytoplasma asteris” strain AY-WB. Rings present from the inside to outside are as follows: ring 1, rrn operons in red and tRNA in green; ring 2, GC skew over a 2-kb window and 200-bp steps with red denoting G Ͼ C and blue denoting C Ͼ G; ring 3, predicted ORFs in sense orientation in yellow and antisense orientation in blue; ring 4, location of tra5 ISs presented as angular brackets with yellow indicating sense orientation and blue indicating antisense orientation; ring 5, ORFs present in all sequenced mollicutes in blue and unique to phytoplasmas within the class Mollicutes in red; ring 6, ORFs of predicted secreted proteins in green, secreted membrane proteins in red, and membrane proteins in blue; ring 7, base pair indicator with the first nucleotide of dnaA as nucleotide 1. oriC is most likely located immediately upstream of dnaA as predicted by Oriloc software (32) and by the opposite direction of ORFs surrounding the putative oriC. (B) The four plasmids of AY-WB. ORFs are presented as block arrows with names of the ORFs on the outside of the rings. Numbers on the inside of the rings indicate the locations in base pairs, with the first nucleotide of the repA and rep genes as nucleotide 1. ORFs indicated with an asterisk are predicted to encode membrane-targeted proteins. In the GenBank database, the plasmids are referred to as pAYWB-I through pAYWB-IV. (C) Three chromosomal segments containing ORFs with similarity to plasmid ORFs. The chromosome is presented as a black line. The numbers below the black lines indicate the positions of the first and last nucleotides of the sequence on the AY-WB chromosome in base pairs. ORFs are represented as block arrows. Arrows of paralogous genes on plasmids and chromosomes have the same color, with the exception of the gray arrows, which represent unique genes. The names of the ORFs with predicted functions are indicated above the arrows. RepA, plasmid replication-associated protein with significant similarity to RepA of geminiviruses. rep encodes the phytoplasma-specific plasmid replication protein encodes; ssb encodes the single-stranded DNA-binding protein.

TABLE 1. General features of the chromosomes of predicted ORFs (Table 1). UGA was used as a stop codon for AY-WB and OY-M the prediction of the ORFs. This is consistent with other re- Feature AY-WB OY-Ma ports showing that acholeplasmas and phytoplasmas retained UGA as a stop codon, unlike SEM branch mollicutes, which Length (bp) 706,569 860,631 ϩ use UGA as a tryptophan codon instead of a stop codon (80). G C content (%) 27 28 This is also in agreement with annotations conducted for OY-M (76). Our results were not in agreement with a previous Protein-coding region (%) 72 73 report that stated that UGA should be considered as a tryp- No. of protein-coding genes with 450 446 tophan codon in phytoplasmas, as in mycoplasmas (64). The assigned function b average guanine (G) and cytosine (C) content of the AY-WB No. of conserved hypothetical genes 149 51 No. of hypothetical genes 72 257 chromosome is 27%. The genome has an irregular GC-skew pattern that is different from most prokaryotic genomes, which Total no. of genes 671 754 usually consist of two major shifts near the origin of replication and the terminus of replication (35). Irregular GC-skew pat- Avg length of protein-coding genes (bp) 779 785 terns were also found in the genomes of some other bacteria, No. of tRNA genes 31c 32c such as Wolbachia pipientis (97) and Mycoplasma mycoides No. of rRNA operons 2 2 (95). Because the location of the origin of replication (oriC) a Numbers taken from data reported previously by Oshima et al. (69). was not clear, the first nucleotide of the dnaA gene was as- b Includes proteins with similarity (blastp, Ͻ10Ϫ5) to OY-M proteins. signed as bp 1. However, oriC is most likely located upstream c tRNAs corresponding to all amino acids are represented. VOL. 188, 2006 COMPARATIVE ANALYSIS OF TWO PHYTOPLASMA GENOMES 3685

TABLE 2. General features of the plasmids of AY-WB and OY-M

Feature AYWB-pI AYWB-pII AYWB-pIII AYWB-pIV EcOYMa pOYMa Length (bp) 3,972 4,009 5,104 4,316 5,025 3,932 GϩC content (%) 25.6 23.9 21.8 25.5 25 24

Protein-coding region (%) 75 71 65 76 71 75 No. of protein-coding genes with assigned function 2 2 2 2 2 2 No. of conserved hypothetical genes 3b 2b 6b 3b No. of hypothetical genes 1 1 4 3 Downloaded from

Total no. of genes 5 4 7 6 6 5

Avg length of protein-coding genes (bp) 594 569 472 546 597 588

a Numbers taken from data reported previously by Oshima et al. (69). b Includes proteins with similarity (blastp, Ͻ10Ϫ5) to OY-M proteins. http://jb.asm.org/ of dnaA as predicted by Oriloc software (32) and by the op- covering 71,979 bp (10.2%) of the chromosome, are organized posite direction of ORFs surrounding the putative oriC (Fig. as clusters, consisting of genes encoding transposases (tra5), 1A) (35). DNA primases (dnaG), DNA helicases (dnaB), thymidylate In addition to the chromosome, four small circular plasmids kinases (tmk), Zn-dependent proteases (hflB), DNA-binding were identified (Fig. 1B and Table 2). This was surprising, proteins HU (himA), single-stranded DNA-binding proteins because the DNA isolation procedure should not allow the (ssb), and specialized sigma factors (sigF) and a number of isolation of small DNAs. One explanation for this discrepancy other genes with unknown function (Fig. 3). Many of these is that the plasmids are present at high copy numbers in the hypothetical proteins are predicted to target phytoplasma on October 14, 2015 by University of Queensland Library phytoplasma cell. As a consequence, some plasmid DNA was membranes (Fig. 1 and 3 and Table 3) and are therefore likely copurified from the PFGE gel along with the AY-WB chro- involved in AY-WB interactions with plant and insect hosts. mosomal DNA. The plasmids contain a total of 22 putative The phytoplasma tra5 insertion sequences (ISs) belong to ORFs, and their average GC contents ranged from 21.8% to the IS150 group of the IS3 family (53, 58). The presence of tra5 25.6%. Each plasmid has genes for a replication initiation ISs and other genes involved in recombination and repair, such protein (Rep) and a single-stranded DNA-binding protein as himA, suggests that these cluster are mobile elements and, (SSB) that are involved in rolling-circle amplification (45), hence, were named PMUs. PMU1 is flanked by a complete tra5 whereas the functions of the other genes are not known. How- IS on one side and a truncated tra5 IS at the other side as well ever, most of the plasmid genes were predicted to encode as inverted repeats (IRs) of 327 bp (Fig. 3A). Sequences highly secreted or membrane proteins (Fig. 1B), and except for ORF similar to the PMU1 inverted repeats were also found adjacent pIII02 of AYWB-pIII and pIV06 of AYWB-pIV, all genes are to the tra5 ISs of the other three PMUs (Fig. 3A). Another similar to OY-M phytoplasma sequences (Table 2). It is strik- striking observation is that all PMUs contain copies of dnaG, ing that whereas the plasmids encode different Rep proteins, dnaB, ssb, and tmk that are involved in DNA replication, sug- they contain paralogous genes in similar orders (Fig. 1B). Two gesting that the PMUs may transpose in a replicative fashion. AY-WB plasmids (AYWB-pI and AYWB-pIII) contain repA The AY-WB genome also contained several clusters that genes similar to geminiviruses repA, whereas the rep genes of look like derivatives of PMUs, as they contained truncated the other two plasmids (AYWB-pII and AYWB-pIV) were unique to AY-WB and OY phytoplasmas. versions of PMU ORFs with gene orders similar to those of The AY-WB plasmids seem prone to mutation. First, ORFs PMUs. It is likely that these PMU-like clusters are in the pIII04 and pIII05 of AYWB-pIII are similar to the 5Ј and 3Ј process of being eliminated. Based on the positions of the tra5 portions, respectively, of paralogous genes on the other three insertion sequences, the PMUs or PMU-like ORF clusters are plasmids, suggesting that a mutation to a stop codon produced present in at least seven locations in the AY-WB chromosome two ORFs in AYWB-pIII. Furthermore, the sequence between (Fig. 1A). At three locations in the AY-WB genome, PMUs pII03 and ssb of AYWB-pII is similar to genes pI04, pIII06, are located adjacent to each other. The largest PMU-rich re- ϳ and pIV04 of the other three plasmids but was not annotated gion of the AY-WB chromosome is 75,000 bp (Fig. 1A), as an ORF because of the presence of a premature stop codon. including PMU1 and PMU2 (Fig. 3A). In addition, plasmids apparently recombine with the chromo- Not all dnaG, dnaB, tmk, hflB, himA, and ssb genes are part some, as the latter contains three truncated ORFs similar to of PMUs or PMU-like clusters. As discussed above, several ssb the geminivirus-like repA plasmid genes and one truncated genes are located on plasmids or in plasmid-derived sequences copy similar to the rep gene (Fig. 1C). within the chromosome (Fig. 1B and C). The AY-WB chro- Repetitive and mobile DNA in the AY-WB genome. The mosome also contains single copies of dnaG, dnaB, tmk, himA, AY-WB genome contains long repeating units of DNA. Of the and hflB homologs, which are clearly different in sequence 671 predicted ORFs of AY-WB, 191 (28%) ORFs, covering from the PMU genes. Furthermore, AY-WB contains several 97,374 bp (13.8%) of the AY-WB chromosome, are present as multicopy sequences that are not part of PMUs, including one multiple copies (Fig. 2A). Of these 191 ORFs, 134 (20%), complete copy and several truncated copies of uvrD and dam. 3686 BAI ET AL. J. BACTERIOL.

Comparative genome analysis of phytoplasmas. The AY-WB chromosome is 154,062 bp smaller than that of OY-M, and AY-WB has 83 fewer ORFs than OY-M (Table 1). This dif- ference in genome size is the result of a lower number of multicopy genes in AY-WB compared to OY-M (Fig. 2A). OY-M multicopy genes are also organized in PMUs. The AY-WB genome contains 97,374-bp (13.8%; 191 ORFs) mul- ticopy sequences compared to 195,035-bp (22.7%; 268 ORFs)

multicopy sequences for OY-M, and the majority are clustered Downloaded from in PMUs with 71,979 bp (10.2%; 134 ORFs) for AY-WB and 121,226 bp (14.1%; 175 ORFs) for OY-M. Thus, compared to OY-M, the 154,062-bp-smaller genome of AY-WB is due to 97,661 bp fewer multicopy genes. The percentages of noncod- ing DNA are similar between AY-WB and OY-M, but because the OY-M genome is larger, OY-M noncoding DNA absorbs an additional 55,728-bp genome size difference between AY-WB and OY-M (Fig. 2A). As expected based on these observations, http://jb.asm.org/ the numbers of single-copy ORFs are similar between the phy- toplasmas, with 432,553 bp (61.2%; 482 ORFs) for AY-WB and 433,226 bp (50.3%; 486 ORFs) for OY-M (Fig. 2A). The alignment of the AY-WB and OY-M genomes has an X-shaped pattern, illustrating synteny of the majority of AY-WB and OY-M sequences but an inverse orientation of large genome segments (Fig. 2C). In both AY-WB and OY-M,

the largest aligned region is ϳ250 kb and starts with the lplA on October 14, 2015 by University of Queensland Library gene at 423,992 bp in AY-WB and 354,087 bp in OY-M and ends with glnQ at 660,824 bp in AY-WB and 103,752 bp in OY-M (Fig. 2C, arrowheads). This region is upstream of the putative oriC in AY-WB but downstream of the putative oriC in OY-M. In both AY-WB and OY-M, these ϳ250-kb regions contain the majority of the metabolic genes and do not contain tra5 insertion sequences (Fig. 1A). The PMUs tend to congregate, as evidenced by the groups of ISs, and are frequently located on opposite strands, as can be noticed by the correlation of GC-skew inflection points and the boundaries of sense-antisense regions as well as tra5 insertion sequences in the AY-WB chromosome (Fig. 1A). The align- ment of the AY-WB and OY-M chromosomes revealed that PMUs or PMU-like sequences at six locations in the AY-WB chromosome are also present at the same locations in the OY-M chromosome. However, at three locations, the se- quences in AY-WB or OY-M have undergone excessive dele- tion and mutation events. PMU sequences at one location in the AY-WB chromosome and four locations in the OY-M chromosome are unique to each of the phytoplasmas. Like

and those in the reverse orientation are represented as green lines. The arrowheads indicate lplA and glnQ, which flank ϳ250 kb of sequences mostly conserved among mollicutes. (D) The number of ORFs unique to phytoplasmas or shared with sequenced SEM clade mollicutes based on blastp analysis of AY-WB and OY-M protein sequences against a database composed of deduced protein sequences of all fully sequenced mollicute genomes (E value, Ͻ10Ϫ5). GenBank acces- FIG. 2. Comparative analyses of the AY-WB genome with the ge- sion numbers are as follows: Mesoplasma florum L1, AE017263; My- nomes of OY-M and other mollicutes. (A) The AY-WB and OY-M coplasma gallisepticum R, AE015450; M. genitalium G-37, L43967; genomes are repeat rich. (B) Venn diagram showing the number of M. hyopneumoniae 232, AE017332; M. mobile 163K, AE017308; M. shared and unique genes between AY-WB and OY-M. (C) Dot plot mycoides subsp. mycoides SC strain PG1, BX293980; Mycoplasma comparison of AY-WB and OY-M chromosomes. The numbers on the penetrans HF-2, BA000026; Mycoplasma pneumoniae M129, U00089; x and y axes indicate the nucleotides in base pairs. AY-WB and OY-M M. pulmonis UAB CTIP, AL445566; OY-M phytoplasma, AP006628; genome segments in the same orientation are represented as red lines, Ureaplasma urealyticum serovar 3 strain ATCC 700970, AF222894. VOL. 188, 2006 COMPARATIVE ANALYSIS OF TWO PHYTOPLASMA GENOMES 3687 Downloaded from http://jb.asm.org/

FIG. 3. PMUs of the AY-WB chromosome. The chromosome is presented as a black line. The numbers between parentheses at the left indicate the positions of the first and last nucleotides of the PMU on the AY-WB chromosome. ORFs are represented as block arrows. Arrows of paralogous genes have the same color, with the exception of the gray arrows, which represent unique genes. The names of the ORFs with predicted functions are indicated above the arrows, with ORFs of predicted membrane-targeted proteins indicated with *. The ORF numbers below the arrows correspond to annotations listed in Table 3, with # indicating genes that contain mutations separating them into two truncated ORFs. However, the tra5 ORFs of PMU4 contains separate A and B ORFs that may produce a full-length transposase upon a single frameshifting event (58). on October 14, 2015 by University of Queensland Library

AY-WB, the OY-M genome contains several genes that are fer from the distantly related SEM clade mollicutes, ORF not part of PMUs, including two full-length and several trun- sequences of the AY-WB and OY-M phytoplasmas were com- cated copies of dam and three full-length and several truncated pared to those of nine Mycoplasma and Ureaplasma spp. copies of uvrD. Our observations are consistent with those of (blastp; E value, Ͻ10Ϫ5). More than half of the phytoplasma others, as Oshima et al. (76) previously reported that the OY-M ORFs had similarities to those of SEM clade mollicutes, and genome contains multiple copies of uvrD, hflB, tmk, dam, and AY-WB and OY-M had an equal number of unique phyto- ssb, constituting 18% of the total genes. plasma ORFs (318 ORFs) (Fig. 2D). Relative to OY-M, Besides the PMUs and other multicopy sequences, other AY-WB contained fewer ORFs that were present in several differences between AY-WB and OY-M were found. Strik- but not all SEM branch mollicutes (146 ORFs for AY-WB ingly, AY-WB lacks most sequences that are truncated in versus 214 ORFs for OY-M) (Fig. 2D). The ϳ250-kb segment OY-M (Fig. 2B), including hsdR and hsdM of the type I re- between the lplA and glnQ genes that is syntenic between the striction modification system, three adjacent fragments with AY-WB and OY-M phytoplasmas (Fig. 2C) contained the similarities to recA, and two adjacent sequences of the sucP majority of the ORFs conserved among mollicutes (Fig. 1, blue gene for sucrose phosphorylase (EC 2.4.1.7). AY-WB also patches in ring 5), while the less syntenic region (the first 400 lacks genes that are part of incomplete pathways in OY-M, kb of the AY-WB genome) (Fig. 2C) are repeat rich (Fig. 1 [IS including rfaG (EC 2.4.1.157) of the glycerolipid metabolism element ring 4] and 2C) and are more enriched with phyto- pathway and pdxK (EC 2.7.1.35) of the vitamin B6 pathway. plasma-specific ORFs (Fig. 1, red patches of ring 5). Finally, whereas AY-WB lacks folC (EC 6.3.2.17) and has Of the 318 ORFs that are unique for phytoplasmas in the truncated versions of folK (EC 2.7.6.3) and folP (EC 2.5.1.15), class Mollicutes, 40 had functional annotations and were closely OY-M has full-length copies of these genes that belong to the examined (Table 6), since these may be part of metabolic folate biosynthesis pathway. Only a few AY-WB ORFs with pathways absent from SEM branch mollicutes. These 40 ORFs functional annotations were absent from OY-M (Fig. 2B). include sfcA for NAD-specific malic enzyme (EC 1.1.1.38) and These include cbiQ and evbH of the cobalt and multidrug two copies of the malate/citrate-sodium symporter gene citS. ATP-binding cassette (ABC) transporter systems, respectively Phytoplasmas have a maltose ABC transporter system, includ- (Table 4). However, OY-M has chromosome fragments with ing a maltose-binding protein (MalE) (Table 4) and several similarities to cbiQ and evbH, but ORFs were not assigned. other transporters that are not present in the SEM clade mol- Except for these sequences, a high degree of gene content licutes (Table 6). These include several components of the art conservation was observed between the genomes of AY-WB and gln ABC transporter systems that might be important for and OY-M, including major metabolic pathways and ABC and the import of glutamine and arginine, respectively, and several P-type ATPase transporters (76) (Tables 4 and 5). solute-binding proteins, including ArtI, which is predicted to Comparative genomics of phytoplasmas and other molli- bind arginine (39); the dipeptide binding protein and D-ami- cutes. To determine to what extent phytoplasma genomes dif- nopeptidase DppA (20); and NlpA lipoprotein (98), for which 3688 BAI ET AL. J. BACTERIOL.

TABLE 3. Features of the four PMUs of AY-WB

ORF ID (length [nt]) ORFa Annotationg PMU1 PMU2 PMU3 PMU4 1 tra5 (210)c Truncated transposase, group IS150, family IS3 2 sigF (624) sigF (603) Specialized sigma factor 3 ssb (312) ssb (333) ssb (312) Single-stranded DNA-binding protein

4 himA (330) himA (288) himA (366) DNA-binding factor HU Downloaded from 5 AYWB_191 (438) AYWB_273 (441) Cons hyp protein 6b,f AYWB_190 (279) AYWB_274 (294) Hyp protein 7b AYWB_189 (792) Cons hyp protein 8b AYWB_188 (858) Cons hyp protein 9b hflB (2,106) hflB (2,304) hflB (2,145) Zn-dependent protease 10b,f AYWB_186 (270) Cons hyp protein 11b AYWB_185 (855) Cons hyp protein 12 AYWB_184 (2,253) AYWB_226 (618) AYWB_277 (1,155); Cons hyp protein AYWB_278 (1,110)d b 13 AYWB_183 (987) AYWB_225 (690) AYWB_279 (804) Cons hyp protein http://jb.asm.org/ 14 AYWB_182 (636) AYWB_224 (372) AYWB_281 (366) Cons hyp protein 15 tmk-a (630) tmk-a (630) tmk-a (627) Thymidylate kinase 16 AYWB_180 (609) AYWB_221 (603) AYWB_283 (609) AYWB_618 (744) Cons hyp protein 17 dnaB (1,494) dnaB (1,494) dnaB (1,500) dnaB (1,413) DNA helicase 18 dnaG (1,323) dnaG (1,323) dnaG (1,323) dnaG (1,107) DNA primase 19b AYWB_177 (855) AYWB_218 (162) AYWB_615 (834) Cons hyp protein 20 AYWB_176 (624) AYWB_217 (312); AYWB_286 (750) AYWB_614 (564) Cons hyp protein AYWB_216 (360)d e

21 tra5 (963) tra5 (939) tra5 (963) tra5 (396, 519) Transposase, group IS150, on October 14, 2015 by University of Queensland Library family IS3 22f AYWB_231 (171) Hyp protein 23b AYWB_228 (873) AYWB_276 (600) Cons hyp protein 24 AYWB_227 (411) Cons hyp protein 25b AYWB_223 (627) Cons hyp protein 26b AYWB_280 (261) Cons hyp protein 27 mgs1 (1,242) ATPase, AAA family 28 tra5 (963) Transposase, group IS150, family IS3

a ORF numbers correspond to numbers in Fig. 3. b Deduced proteins predicted to target the membrane (secreted or membrane proteins). c ORF IDs, with lengths in nucleotides shown in parentheses, are indicated for all PMU ORFs. d Genes contain mutations separating them into two truncated ORFs (Fig. 3). e Contains separate A and B ORFs that may produce a full-length transposase upon a single frameshift event (58). f Sequences unique to AY-WB. g Abbreviations: Cons, conserved; hyp, hypothetical. the gene is located between methionine ABC transporter and pnp encoding polyribonucleotide nucleotidyltransferase genes and which hence may be a methionine binding protein (EC 2.7.7.8). Both are involved in the regulation of mRNA (Table 4). Phytoplasmas also have mntB and znuA of the man- stability. Interestingly, the pnp gene is present in the genome of ganese (Mn) and zinc (Zn) ABC transporter system (15) (Ta- S. kunkelii (9), which is also an insect-transmitted plant-patho- ble 6). All the solute-binding proteins were predicted to have genic mollicute. Polyribonucleotide nucleotidyltransferase may signal peptides (SignalP v3.0) (14) and are likely extracellular be involved in the persistent infection of insects and/or adap- lipoproteins (38). Two ABC transporters have adjacent genes tation to diverse hosts and habitats of phytoplasmas and spi- for thermostable carboxypeptidase 1 (EC 3.4.17.19) and oligo- roplasmas (9). The adjoining phytoplasma genes pmbA and endopeptidase F (EC 3.4.24.Ϫ) that can process imported pep- tldD were not identified in SEM branch mollicutes either. tides and that were not present in the genomes of SEM branch PmbA and TldD regulate DNA gyrase function and are in- mollicutes (Table 6). Finally, three AY-WB genes were anno- volved in protein maturation (3, 70, 83). tated as norM that encodes a Naϩ-driven multidrug efflux Compared to other mollicutes, phytoplasmas lack several pump. One norM gene had similarity to genes of SEM molli- essential transporters and pathways. AY-WB and OY-M lack cutes, whereas the other two did not. These two are located phosphoenolpyruvate:sugar phosphotransferase (PTS) systems adjacent to each other and are transcribed in opposite direc- for the import of sugars essential for glycolysis. AY-WB and tions in both the AY-WB and OY-M genomes. OY-M also lack F-type ATP synthases. This is in contrast to Other genes present in AY-WB and OY-M but absent from mycoplasmas and ureaplasmas that have ATP synthase com- SEM branch mollicutes are pssA and psd (Table 6) of the plexes, including the A, B, and C subunits for the transmem- phosphatidylethanolamine pathway (63). Furthermore, myco- brane channel and the five-subunit (alpha, beta, gamma, delta, plasmas lack pcnB encoding poly(A) polymerase (EC 2.7.7.19) and epsilon) catalytic core for ATP synthesis, and can use the Downloaded from http://jb.asm.org/ on October 14, 2015 by University of Queensland Library

TABLE 4. ABC transporter genes in AY-WB and OY-M phytoplasma genomes

AY-WB OY-M Substrate ATP-binding protein Membrane protein Solute-binding protein ATP-binding protein Membrane protein Solute-binding protein Amino acid uptake Amino acid glnQ (AYWB_634) AYWB635 (AYWB_635) glnQ (39938565) artM (39938563), artM (39938564) D-methionine metN (AYWB_589) AYWB587 (AYWB_587) nlpA (AYWB_588) abc (39938618) PAM134 (39938620) nlpA (39938619) Amino acid AYWB_314 (fragment) glnP (AYWB_315) artM (39938942), artI (arginine) (39938943), artM (39938950) Amino acid artP (AYWB_264) artQ (AYWB_265), artI (AYWB_263) glnQ (39938974) artM (39938973), artI (glutamine) artM (AYWB_262) (39938975), artM (39938976) Amino acid artM (AYWB_125) artM (39939074) Amino acid artM (39938980), artM (39938981) Amino acid artM (39939125), mdoB (39939127)

Dipeptide/oligopeptide uptake Dipeptide or dppF (AYWB_527), dppB (AYWB_530), dppA (AYWB_529) dppD (39938678) dppC (39938675), dppB oppA (39938677) oligopeptide dppD (AYWB_528) dppC (AYWB_531) (39938676) Oligopeptide dppD (39938511), dppB (39938508), PAM024 (39938510) oppF (39938512) PAM023 (39938509)

Sugar uptake Maltose, trehalose, malK (AYWB_670) malG (AYWB_668), malE (AYWB_667) malK (39939238) ugpE (39939236), ugpA ugpB (39939235) malF COMPARATIVE ANALYSIS OF TWO PHYTOPLASMA GENOMES 3689 sucrose, or (AYWB_669) (39939237) palatinose

Inorganic ion uptake Cobalt cbiO (AYWB_014) cbiQ (AYWB_015) cbiO (39938506) PAM19 (39938505) Cobalt cbiO (AYWB_540), cbiO cbiQ (AYWB_539) cbiO (39938665) cibQ (39938666) (AYWB_541) Mn/Zn mntA (AYWB_623) mntB (AYWB_622), znuA (AYWB_624) znuC (39938579) znuB (39938580) znuA (39938578) mntB (AYWB_621)

Multidrug resistance Multidrug evbG/mdlB (AYWB_028) mdlB (39938545) Multidrug evbH (AYWB_029)

Spermidine/putrescine uptake Spermidine or potA (AYWB_095) potB (AYWB_094), potD (AYWB_092) potA (39939145) potB (39939146), potC potD (39939148) putrescine potC (AYWB_093) (39939147)

Uncharacterized Lipoprotein phnL (AYWB_619) phnL (39938582) nlpA (39938583) Unknown phnL (AYWB_135) phnL (39939085) . 188, 2006 OL V 3690 BAI ET AL. J. BACTERIOL.

TABLE 5. Predicted P-type ATPases of AY-WB and OY-M

AYWB OY-M

Gene (length [aa], Gene (length [aaa], ORF) Possible substrate Possible substrate GenBank accession no.) mgtA (920, AYWB_018) Cation mgtA (920, 39938516) Sodium/potassium mgtA (817, AYWB_469) Cation mgtA (918, 39938672) Calcium mgtA (952, AYWB_533) Cation mgtA (1,056, 39938738) Cation mgtB (892, AYWB_242) Magnesium mgtA (892, 39939071) Magnesium

zntA (666, AYWB_650) Lead, cadmium, zinc, mercury zntA (666, 39939219) Cadmium Downloaded from

a aa, amino acids. transmembrane potential for ATP synthesis (80). However, mollicutes. All mollicutes sequenced so far lack recB, recC, phytoplasmas have five genes encoding P-type ATPases (Table recD, recG, and ruvC of the recombination pathway and recN, 5) that may generate electrochemical gradients over the mem- recO, recQ, and recR of the SOS response, although some brane. mycoplasmas carry recR and recO. Thus, SEM branch molli- Phytoplasmas have fewer genes in the standard recombina- cutes have recA, recU, ssb, polA, gyrA, gyrB, ruvA, and ruvB,a http://jb.asm.org/ tion pathway and SOS response in comparison to SEM branch rudimentary set of genes that permit homologous recombina-

TABLE 6. Proteins with functional annotations unique to AY-WB and OY-M within the class Mollicutes

OY-M Other organisms AY-WB ORF ID Gene ID Annotation GenBank b GenBank b a E value E value

accession no. accession no. on October 14, 2015 by University of Queensland Library Transcription AYWB_654 rpoZ EC 2.7.7.6 39939222 9eϪ19 58337597 2eϪ07 Translation AYWB_504 rpmD LSU ribosomal protein L30P 39938704 5eϪ28 50590420 3eϪ09 Membrane transport AYWB_052 citS Malate-sodium symporter 39939206 eϪ166 42528200 5eϪ36 AYWB_435 citS Malate-sodium symporter 39938772 eϪ174 15672883 2eϪ30 AYWB_125 artM ABC-type permease protein ArtM 39939074 0 48866203 2eϪ35 AYWB_263 artI ABC type solute-binding protein ArtI 39938975 6eϪ60 58336459 1eϪ21 AYWB_265 artQ ABC-type permease protein ArtQ 39938973 9eϪ79 24376619 2eϪ17 AYWB_315 glnP ABC-type permease protein GlnP 39938942 0 15022937 2eϪ30 AYWB_587 ABC-type Met ATP-binding protein 39938620 eϪ101 29377647 1eϪ13 AYWB_588 nlpA ABC-type Met binding protein 39938619 eϪ136 25010851 3eϪ05 AYWB_621 mntB ABC-type membrane protein 39938580 eϪ177 42526732 2eϪ53 AYWB_622 mntB ABC-type membrane protein 39938580 eϪ162 53685687 1eϪ47 AYWB_624 znuA ABC-type Mn/Zn-binding protein 39938578 eϪ168 1335912 1eϪ41 AYWB_667 malEa ABC-type maltose-binding protein 39939235 0 52858068 2eϪ18 AYWB_439 norM Naϩ-driven multidrug efflux pump 39938768 0 NA AYWB_441 norM Naϩ-driven multidrug efflux pump 39938766 0 NA AYWB_467 secE a SecE 40786355 1eϪ37 NA Metabolic enzymes AYWB_051 sfcA EC 1.1.1.38 39939207 0 28202548 eϪ129 AYWB_120 pssA EC 2.7.8.8 39939099 8eϪ92 15023686 2eϪ16 AYWB_121 psd EC 4.1.1.65 39939098 eϪ178 15023687 9eϪ50 AYWB_326 sodA EC 1.15.1.1 39938928 eϪ107 15672390 1eϪ60 AYWB_415 pmt EC 2.1.1- 39938792 0 45682627 4eϪ06 AYWB_470 pnpa EC 2.7.7.8 39938737 0 48824146 eϪ178 AYWB_532 EC 3.4.17.19 39938673 0 52698549 eϪ146 AYWB_598 qns a EC 6.3.5.1 39938607 0 16804107 eϪ158 AYWB_607 pcnB EC 2.7.7.19 39938586 2eϪ14 NA NA Other AYWB_017 ibpA Hsp20 39938514 9eϪ64 4884483 7eϪ14 AYWB_302 mutT Phosphohydrolase 39938962 8eϪ92 15673603 2eϪ24 AYWB_331 tldD TldD 39938933 0 15024804 eϪ107 AYWB_332 pmbA PmbA 39938933 0 18143998 2eϪ54 AYWB_561 hlyC a Hemolysin III 39938644 eϪ110 18145579 5eϪ26 AYWB_599 Immunodominant protein precursor 39938608 2eϪ10 NA NA AYWB_630 Rhodanese-related sulfurtransferase 39938571 eϪ180 23098027 eϪ109 AYWB_646 pduL PduL 39939215 eϪ102 49235943 5eϪ44

a Genes with identical annotations but no sequence similarities in SEM branch mollicute. b The E values were obtained by searching against the GenBank nonredundant database with 2,506,223 sequences consisting of 849,940,114 letters on a local Linux workstation. Results from the GenBank search were verified using the mollicute database MolliGen (http://cbi.labri.fr/outils/molligen/) (11). NA, not available. VOL. 188, 2006 COMPARATIVE ANALYSIS OF TWO PHYTOPLASMA GENOMES 3691 tion. Of these, phytoplasmas do not have recA, ruvA, and ruvB. sec-dependent protein translocation system and that the N- Hence, phytoplasmas have a deficient homologous recombina- terminal signal peptides of proteins are cleaved. Since the tion machinery. closest walled relatives of phytoplasmas are Clostridium, Bacil- AY-WB virulence. The AY-WB genome was analyzed for lus, and Streptococcus spp. (phylum Firmicutes), it is possible similarities to known bacterial virulence factors. Several puta- that, similarly to Streptococcus pyogenes (84), phytoplasmas se- tive hemolysins of AY-WB were identified based on annota- crete virulence-related proteins via the sec-dependent pathway. tion. These include a protein annotated as HlyC, a putative Both phytoplasma genomes contain several ABC transport- hemolysin III. This protein belongs to the integral membrane ers (Table 4). ABC transporters import peptides, amino acids,

protein family (Pfam domain number PF03006), which in- and nutrients into the cell. They can be virulence factors, and Downloaded from cludes a protein with hemolytic activity from Bacillus cereus. they can deplete essential nutrients from the host and secrete However, other proteins in this family play a role in lipid and toxins and antimicrobial compounds such as hemolysins (23). phosphate metabolic pathways. Another putative hemolysin- Furthermore, solute-binding proteins of ABC transporters are related protein of AY-WB was annotated as TlyC, which car- usually secreted lipoproteins that bind external substrate to the ries resemblance to cluster of orthologous group 1253 of he- cell and deliver the substrate to the ABC transporters and may molysins and related proteins containing CBS domains. also be involved in adherence to cell surfaces (4). For instance, Indeed, AY-WB TlyC contains a CBS domain (Pfam domain the ABC transporter-related solute-binding protein Sc76 of number PF00571). However, the AY-WB TlyC protein has an Spiroplasma citri was shown to be involved in the penetration http://jb.asm.org/ N-terminal transmembrane region (Pfam domain number of or multiplication in the salivary gland (17). The AY-WB PF01595) not found in TlyC proteins and a C-terminal domain genome contains genes for five solute-binding proteins with that is present in the C terminus of Naϩ/Hϩ antiporters, in- specific solute-binding activities (Table 4). All five solute-bind- cluding CorC, which is involved in magnesium and cobalt efflux ing proteins have N-terminal cleavable signal peptide se- (Pfam domain number PF03471). Thus, it is not clear whether quences, as predicted with SignalP v3 software (14), and there- HlyIII and TlyC of AY-WB are hemolysins. fore are secreted via the sec-dependent pathway. Hence, these Two AY-WB proteins, AYWB_084 and AYWB_352, are five solute-binding proteins are putative virulence factors of similar to the Legionella pneumophila virulence factor IcmE (E phytoplasmas. on October 14, 2015 by University of Queensland Library values of 5eϪ21 and 5eϪ05, respectively), which is part of the type IVB secretion system apparatus that translocates bacterial DISCUSSION proteins into host cells (87). Proteins with similarities to IcmE were also identified in the OY-M genome (76). IcmE has It is intriguing that phytoplasmas have small genomes that sequence similarity to plasmid genes involved in conjugation lack many standard metabolic functions but are repeat rich. (87). In both AY-WB and OY-M, the majority of the icmE-like The repeated DNAs are mostly multicopy genes organized in sequences were located upstream of the ATP-dependent heli- PMUs. Thus, phytoplasmas are different from other bacterial case gene uvrD. UvrD belongs to the Rep family of helicases endosymbionts of insects, e.g., Buchnera and Blochmannia and catalyzes ATP-dependent mediated unwinding of double- spp., which also have small genomes lacking many standard stranded DNA into single-stranded DNA and has a role in the metabolic functions but have low levels of repeated DNAs (1, recF recombination pathway, methyl-directed mismatch repair, 91). On the other hand, the majority of the mollicutes have and UvrABC-mediated nucleotide excision repair and replica- repeat-rich genomes. All mollicutes are under pressure for tion (36, 67). Similarly to the other repeated sequences, the genome minimization, and the presence of numerous repeats OY phytoplasma genome contains multiple copies of icmE- is therefore highly significant (82). Indeed, it has been shown like sequences and full-length uvrD, whereas the AY-WB phy- for several mycoplasmas that repeats engage in recombination toplasma contains only one full-length icmE-like sequence and events resulting in changes of mosaics of antigenic structures at uvrD and multiple truncated copies of these sequences. Fur- cell surfaces, essential for evasion of the host immune system ther research should reveal whether the icmE-like sequences and for adaptation to new environments (82). Thus, similarly of phytoplasmas mediate conjugation or are somehow involved to mycoplasmas, the repeated DNAs of phytoplasmas probably in the recombination pathway. No other similarities of phyto- allow adaptations to different environments. Adaptation is par- plasma sequences to type III and type IV secretion systems ticularly important for phytoplasmas, as their host environ- were observed. This may not be surprising, as translocation of ments are extremely variable, including the intracellular envi- virulence factors via type III and type IV secretion systems is ronments of phloem tissues of plants and guts, salivary glands, more specific for gram-negative bacteria. and other organs and tissues of insect hosts. Also, phytoplas- AY-WB and OY-M share the genes of the protein export mas have a broad plant host range. AY-WB alone can infect and targeting components of the sec-dependent pathway, in- China aster, lettuce, tomato, Nicotiana benthamiana, and Ara- cluding secA, secY, yidC, ffh, ftsY, dnaJ, dnaK, grpE, groES, and bidopsis thaliana. Phytoplasma genomes are different from my- groEL and, like SEM branch mollicutes, lack several subunits coplasma genomes in several aspects. First, phytoplasmas do and the signal peptidases of the protein maturation compo- not have recA, ruvA, and ruvB and hence appear to lack a nent, including secB, secG, secF, secE, secD, and signal pepti- functional recombination system. Second, thus far, the organi- dase I (80). Despite the absence of several components, OY-M zation of repeated DNAs in PMUs (Fig. 3 and Table 3) is phytoplasma has a functional sec-dependent protein transloca- unique to phytoplasmas among the mollicutes. tion system (43). It is possible that some of the many hypo- PMUs. The PMUs contain tra5 ISs, which belong to the thetical proteins have peptidase activities. This confirms pre- IS150 group and the IS3 family (53, 58). IS3-type mobile units vious findings (10, 44) that phytoplasmas have a functional are found in a number of other mollicutes, for example, IS1138 3692 BAI ET AL. J. BACTERIOL. in Mycoplasma pulmonis,IS1221 in Mycoplasma hyorhinis and have region 2 domains (Pfam domain number PF04542) con- Mycoplasma hyopneumoniae,IS1297 in M. mycoides subsp. my- taining both the Ϫ10 promoter recognition helix and the pri- coides, ISMi1 in Mycoplasma fermentans, and one IS3 element mary core RNA polymerase binding determinant. However, in the spiroplasma virus DNA SPV1-C74 sequence of S. citri the C-terminal 100 amino acids of the SigF proteins do not (58, 66). All of these elements belong to the IS150 subgroup, have similarities to other proteins or domains, including the and it has been demonstrated that some of these elements region 4 domains (Pfam domain number PF04545) containing undergo autonomous transposition (16). the Ϫ35 promoter-binding element. AY-WB SigF proteins PMU1 of AY-WB is the longest, appears to be the most showed the greatest similarity (E value, 10Ϫ6) to the stress

complete, and has several striking features characteristic of response sigma factor [sigma(H)] of Streptococcus coelicolor Downloaded from composite transposons (Fig. 3). First, the right and left borders (50) and the flagellar biosynthesis sigma factor FliA of Pseudo- of PMU1 contain long (327-bp) IRs. Furthermore, whereas the monas putida (46). Expression of SigF and other PMU genes ORF to the right is a truncated tra5 sequence, the tra5 se- might occur under specific environmental conditions. quence at the left can produce a full-length ORFAB fused- Since PMUs contain several genes predicted to encode frame transposase (58). IS150 can generate circles by joining membrane-targeted sequences, one would expect that expres- IRs upon production of the fused-frame transposase (90), and sion of PMU genes would result in a change of the phyto- particularly, composite transposons that carry single inverted plasma membrane surface. In this regard, it is intriguing that repeats at the left and right borders form stable circles (48). the PMUs contain hflB (or ftsH) genes that encode membrane- http://jb.asm.org/ PMU1 also carries a gene for DNA protein HU (himA), which associated ATP-dependent Zn proteases of ϳ700 amino acids. is a nonspecific binder of DNA but prefers binding to bent, These proteins are conserved among bacteria and are involved kinked, or altered DNA sequences (31) and has a role in in membrane-associated processes such as protein secretion recombination through the joining of distant recombination (26) and membrane protein assembly (2) as well as adaptations sites (5). Thus, with the help of transposase and DNA protein to nutritional conditions and osmotic stress (26, 57). HU, the IRs could join to form a circle and induce transposi- Genomic plasticity. The irregular GC skews and the pres- tion of PMU1. It is striking that all the genes on PMUs are ence of large repeated sequences (PMUs) in the AY-WB and oriented in the same direction, with sigF, encoding a special- OY-M genomes are indicative of high genomic plasticity. The on October 14, 2015 by University of Queensland Library ized transcription factor, as the first gene and located down- correlation between an irregular GC skew and the presence of stream of the . In IS3 family members, the ISs in mollicute genomes is quite striking. For instance, M. adjoined IRs, which are formed on circularization, create a mycoides has an irregular GC skew, and 13% of the genome strong hybrid promoter that drives high levels of transposase size consists of ISs (95), whereas Mycoplasma mobile has a expression (58). Hence, it is possible the adjoined 327-bp re- regular GC skew and no ISs (42). It should be noted, however, peats upon circulation of PMU1 create a strong promoter that that although AY-WB doesn’t have a significant consistent GC drives the transcription of at least part of the PMU genes. skew, it may have another kind of significant skew or excess, The AY-WB and OY-M genomes also contain evidence that including AT skew and purine excess or keto excess (89). at least some PMUs transpose in a replicative fashion. First, Phytoplasma genomic plasticity is also evidenced by the dif- there are multiple copies of PMUs and PMU-like clusters. ferences in genome sizes and compositions between members Second, the PMUs contain full-length dnaB, dnaG, and ssb of “Ca. Phytoplasma asteris,” ranging from 660 to 1,130 kb and genes that are involved in DNA replication. DnaB initiates consisting of several fragments of 500 kb and larger (61; our DNA replication (19). It moves along the lagging strand and personal observation). Since PMUs can form large clusters unwinds the DNA helix for the propagating fork and attracts that may locate in different sections of the chromosome, it is DnaG for lagging-strand synthesis (93). SSB plays an essential likely that they are also capable of splitting a single chromo- role in DNA replication by stabilizing single-stranded DNA some into two smaller chromosomes. Furthermore, results re- (56). Most PMUs also contain a tmk gene encoding thymidy- ported herein show that AY-WB and OY-M differ by ϳ154 kb late kinase that synthesizes dTDP from dTMP for DNA syn- in genome size, mainly because of a difference in PMUs and thesis. Similarly to AY-WB, the OY-M phytoplasma genome other multicopy sequences (Fig. 2A). contains at least two tmk homologs, tmk-a and tmk-b, with Despite the phytoplasma genomic plasticity, the majority of tmk-a being present as multiple copies (68). We revealed that the AY-WB and OY-M genomes are syntenic (Fig. 3C). Scat- the tmk-a genes are part of PMUs. However TMK-b but not ter plots of conserved sequences between the AY-WB and TMK-a was shown to have thymidylate kinase activity (68). OY-M genomes show an X-shaped pattern with symmetry Hence, the function of TMK-a is not yet clear. around the tentative oriC and two other locations at approxi- Several sigma factor genes were identified in the AY-WB mately opposite ends of oriC (Fig. 2C). This X-shaped pattern genome. These genes are rpoD, which encodes the standard or X alignment is common in genome comparisons of closely 465-amino-acid ␴70 protein and is present as a single copy on related bacterial species and is most likely due to the occur- the AY-WB chromosome, and multiple copies of sigF that are rence of large inversions that rotate around oriC and the ter- located on PMUs or PMU-like gene clusters and have deduced minus of replication (28). The breakpoints of the inversions proteins of ϳ200 amino acids in length. PMU3 contains a between the AY-WB and OY-M genomes are, as expected, at sequence with similarity to sigF immediately upstream of the PMU-like regions and repeated uvrD sequences. ssb gene, but because of the presence of a premature stop There are probably two reasons for the good alignment of codon, this sequence was not predicted to be an ORF. The the AY-WB and OY-M genomes. First, we already observed OY-M genome also has multiple copies of sigF that are part of that the PMUs tend to congregate. This is consistent with PMUs. The N-terminal 100 amino acids of the SigF proteins findings that IS150 frequently transposes into target regions VOL. 188, 2006 COMPARATIVE ANALYSIS OF TWO PHYTOPLASMA GENOMES 3693 resembling its IR (58, 74). Thus, transposition will predomi- 73). Each AY-WB plasmid contains two genes involved in nantly affect certain areas of the phytoplasma genomes, and rolling-circle amplification and two to six ORFs with unknown hence, the synteny in the rest of the genome can be main- function, several of which were predicted to target the AY-WB tained. Second, because of the absence of recA, ruvA, and ruvB, membrane, suggesting that the plasmids are involved in rearrangements between PMUs through homologous recom- AY-WB association with the plant and insect hosts. Indeed, bination are likely to occur at lower frequencies than in ge- the RepA proteins of OY-M phytoplasmas were detected in nomes with RecA-dependent homologous recombination ma- infected plants (71), indicating that the plasmid genes are chineries (78, 79). expressed during infection of the plant. Furthermore, sponta-

Variations in the presence of recA are common among in- neous OY-M mutants that lack ORFs on a plasmid and are not Downloaded from sect-associated mollicutes (65). Truncated recA genes were insect transmissible were isolated (72). found in six Spiroplasma citri strains, which, like phytoplasmas, Interestingly, two AY-WB plasmids (AYWB-pI and are insect-transmitted plant pathogens, and five Spiroplasma AYWB-pIII) contain repA genes similar to geminivirus repA, melliferum strains, which are pathogens of bees (59). In S. citri, whereas the rep genes of the other plasmids were unique to only the first 390 nucleotides at the 5Ј end of recA are present, AY-WB and OY-M phytoplasmas. Geminivirus-like repA whereas in S. melliferum, the full-length recA gene is inter- genes in OY-M (75) and more distantly related phytoplasmas rupted by a TAA stop codon. Intriguingly, truncated and full- (55, 81) were also identified. Like phytoplasmas, geminiviruses length RecA polypeptides were observed in a proteomic study are insect-transmitted plant pathogens and have to pass http://jb.asm.org/ of S. melliferum (21). These finding suggest that recA sequence through the gut epithelium, hemolymph, and salivary gland variation among insect-associated mollicutes is of biological cells of the insect vectors before returning to the plant (22). significance. RecA has an important function in mycoplasmas. Phytoplasmas and geminiviruses have overlapping plant and Deletion of recA is lethal for M. pulmonis (80). RecA is prob- insect host ranges. Hence, it is possible that phytoplasmas ably essential for homologous recombination between re- acquired the repA genes from geminiviruses through horizontal peated lipoproteins, and adhesin genes result in a change of exchange. On the other hand, it has been hypothesized that mosaics of antigenic structures at the bacterial surface, with geminiviruses originated from bacterial plasmids (49). Plas- subsequent evasion of the host immune response (80, 82). mids with similar repA genes are generally incompatible, and on October 14, 2015 by University of Queensland Library Thus, it seems that phytoplasmas and spiroplasmas can adapt therefore, it is likely that the four plasmids are not present in to their hosts with less efficient homologous recombination one AY-WB cell but represent the plasmid content of the systems, and the loss of RecA function might then be beneficial AY-WB population present in plants from which the AY-WB for increasing genome stability. This is supported by the ob- DNA was isolated. servations that, like phytoplasmas, spiroplasmas have highly The variation among the AY-WB plasmids suggests that repeat-rich genomes mainly due to phage-derived sequences they are prone to frequent mutations. This is consistent with (80). On the other hand, M. mycoides, which also has a repeat- other findings. OY-M has plasmids ranging from ϳ3toϳ7kb rich genome and is a human pathogen, has a full-length recA (47). in size (Fig. 1B) (73), and the plasmids of beet leafhopper- Reductive evolution. In general, AY-WB seems further transmitted virescence phytoplasma range from ϳ2.5 to ϳ11 along in the reductive evolution process than OY-M. First, kb (55). There is high variability of the occurrence of ORFs in AY-WB phytoplasma contained fewer PMUs insertions, and the plasmids of 30 beet leafhopper-transmitted virescence phy- the ORFs in AY-WB PMUs are more frequently truncated or toplasma strains (55). There is also evidence of intramolecular deleted. Second, AY-WB lacks genes that are truncated in recombination among phytoplasma plasmids (55, 73). We OY-M, including asnB, hsdR, hsdM, recA, and sucP. Third, show that they can also recombine with the chromosome (Fig. AY-WB lacks genes of incomplete pathways in OY-M, includ- 1C). ing rfaG of the glycerolipid metabolism pathway and pdxK of Phytoplasma metabolism. Except for a few exceptions de- the vitamin B6 pathways. Furthermore, unlike OY-M, AY-WB scribed above, the AY-WB metabolic pathways are similar to does not have folC, and OY-M has full-length folK and folP those of OY-M that have been described elsewhere (76) and genes that are truncated in AY-WB. The folK and folP genes will not be discussed in detail here, although a few findings were also identified as pseudogenes in clover phyllody (CPh) need more emphasis. The phytoplasma metabolism is in sev- phytoplasma (“Ca. Phytoplasma asteris”) (24), suggesting that eral ways different from those of SEM branch mollicutes. This OY-M may be capable of de novo folate synthesis, whereas was expected, because phytoplasmas have not yet been grown AY-WB and CPh have to import folate from host cells. Simi- in cell-free culture media, including mycoplasma culture me- larly to CPh (24), the folK and folP sequences of AY-WB and dia. Unlike SEM branch mollicutes, phytoplasmas do not have OY-M are flanked by gcp, which encodes a glycoprotease, and PTS systems to import sugars and to generate glucose-6-phos- two ORFs encoding a DegV family protein and a 24-kDa phate to feed the glycolysis pathway. Thus, phytoplasmas are lipoprotein (AYWB_245) (24). Hence, the gene organizations clearly different from the insect-transmitted plant-pathogenic of this part of the genome are conserved among “Ca. Phyto- S. citri and S. kunkelii, which have three PTS systems for the plasma asteris” members. Final evidence that AY-WB is fur- import of fructose, glucose, and trehalose (7). In contrast, ther down the reductive evolutionary path is provided by the phytoplasmas possess ABC transporters for the import of mal- observation that relative to OY-M, AY-WB contains fewer tose. The maltose-binding protein (MalE) (Table 4) may have ORFs that are shared by several but not all mollicutes (146 affinity to maltose, trehalose, sucrose, and palatinose (88). Af- ORFs for AY-WB versus 214 ORFs for OY-M) (Fig. 2D). finity of MalE to trehalose is likely, as trehalose is a major Plasmids. We identified four plasmids in AY-WB. Plasmids sugar in the insect hemolymph. The fate of these sugars after have been detected in a number of other phytoplasmas (55, import is not clear, because enzymes required for the conver- 3694 BAI ET AL. J. BACTERIOL. sion of these sugars to glucose-6-phosphate for glycolysis were circularization and replicative transposition. In addition, ge- not found in the phytoplasma genomes, and the sucrose phos- nome rearrangements through expansions and deletions of phorylase gene, which is important for sucrose degradation, is PMUs might increase the chance of phytoplasma adaptation to fragmented in the OY-M phytoplasma genome (76) and is diverse hosts and can be a major evolutionary factor allowing completely absent from the AY-WB phytoplasma genome (Ta- phytoplasmas to occupy broad plant host ranges or to adapt to ble 6). Generally, the genomes of AY-WB and OY-M phyto- different insect vectors. Few genes have similarities to known plasmas harbor significantly fewer carbohydrate transport and bacterial virulence factors. Like the related gram-positive bac- metabolism genes than their mycoplasma counterparts. Even teria, phytoplasmas may secrete virulence-related proteins via

in the 580-kb genome of M. genitalium, 26 carbohydrate trans- the Sec-dependent pathway. Hence, all the proteins with signal Downloaded from port and metabolism genes were identified (33). In contrast, peptides are potential virulence factors, including the five sol- only 19 genes are present in the 860-kb OY-M phytoplasma ute-binding proteins of the ABC transporters and proteins genome (76), and 16 genes are present in the 706-kb AY-WB derived from plasmids and PMUs. Finally, phytoplasmas have phytoplasma genome. ABC transporters for the import of maltose (or trehalose, Unlike SEM branch mollicutes, phytoplasmas have a NAD- sucrose, and palatinose), utilize malate, and can make phos- specific malic enzyme (EC 1.1.1.38) and malate/citrate-sodium pholipids. In contrast, SEM branch mollicutes have PTSs for symporter genes. Thus, like symbiotic Rhizobium spp. (77) but the import of fructose, glucose, and trehalose, utilize lactate, unlike sequenced SEM branch mollicutes, phytoplasmas may use and are phospholipid auxotrophs. http://jb.asm.org/ malate as a carbon source. The use of malate is advantageous, because it is readily available in the cytoplasm of host cells, and ACKNOWLEDGMENTS it can serve as the sole energy source for bacteria by conversion This work was supported by the National Research Institute of the to oxaloacetate and pyruvate (27, 77). Furthermore, metabo- USDA Cooperative State Research, Education, and Extension Ser- lism of malate saves energy (27), which is important, because vice, grant 2002-35600-12752, and the Ohio Agricultural Research and phytoplasmas lack ATP synthases, and hence, the capacity to Development Center competitive grants program. We thank former members of the bioinformatics and genome anal- generate energy in phytoplasmas seems limited to glycolysis ysis group at Integrated Genomics, including Svetlana Gerdes, Eugene (starting with glucose-6-phosphate). Goltsman, Viktor Joukov, Vinayak Kapatral, Yakov Kogan, Nikos on October 14, 2015 by University of Queensland Library Unlike SEM clade mollicutes, phytoplasmas appear to be Kyrpides, Andrei Osterman, Olga Ostrovskaya, and Ross Overbeek. capable of biosynthesis of their own membrane phospholipids. We also acknowledge Angela D. Strock, Melanie L. Lewis Ivey, and The genomes of AY-WB, OY-M (76), and Western X-disease Jhony Mera for excellent technical assistance. phytoplasma (54) contain the pssA and psd genes (Table 6) REFERENCES encoding CDP-diacylglycerol-serine-O-phosphatidyltransferase (EC 1. Achaz, G., E. Coissac, P. Netter, and E. P. Rocha. 2003. Associations be- 2.7.8.8) and phosphatidylserine decarboxylase (EC 4.1.1.65), tween inverted repeats and the structural evolution of bacterial genomes. respectively. Both are part of the phosphatidylethanolamine Genetics 164:1279–1289. 2. Akiyama, Y., Y. K. Shirai, and K. Ito. 1994. Involvement of FtsH in protein pathway (63). Furthermore, the AY-WB and OY-M genomes assembly into and through the membrane. II. Dominant mutations affecting contain a candidate pmt gene for phospholipid N-methyltrans- FtsH functions. J. Biol. Chem. 269:5225–5229. 3. Allali, N., H. Afif, M. Couturier, and L. van Melderen. 2002. The highly ferase (Table 6) that is involved in phosphatidylcholine syn- conserved TldD and TldE proteins of Escherichia coli are involved in mi- thesis in conjunction with PssA and Psd (63). This confirms crocin B17 processing and in CcdA degradation. J. Bacteriol. 184:3224–3231. that phytoplasmas are phylogenetically more related to acho- 4. Allen, B. L., and M. Hook. 2002. Isolation of a putative laminin binding protein from Streptococcus anginosus. Microb. Pathol. 33:23–31. leplasmas (4), which do not require exogenous phospholipids, 5. Alonso, J. C., F. Weise, and F. Rojo. 1995. The Bacillus subtilis histone-like whereas SEM branch mollicutes are sterol and fatty acid auxo- protein Hbsu is required for DNA resolution and DNA inversion mediated ␤ trophs (80). AY-WB and OY-M also have all enzymes that link by the recombinase of plasmid pSM10935. J. Biol. Chem. 270:2938–2945. 6. Altschul, S. F., T. L. Madden, A. A. Schaffer, J. Zhang, Z. Zhang, W. Miller, the glycolysis pathway to the glycerolipid pathway (76) and an and D. J. Lipman. 1997. Gapped BLAST and PSI-BLAST: a new generation ABC transporter gene, phnL, involved in lipoprotein release of protein database search programs. Nucleic Acids Res. 25:3389–3402. 7. Andre´, A., W. Maccheroni, F. Doignon, M. Garnier, and J. Renaudin. 2003. (Table 4). Glucose and trehalose PTS permeases of Spiroplasma citri probably share a Summary. Phytoplasmas have intriguing genomes that are single IIA domain, enabling the spiroplasma to quickly adapt to carbohy- small and contain many multicopy sequences mainly organized drate changes in its environment. Microbiology 149:2687–2696. ϳ 8. Badger, J. H., and G. J. Olsen. 1999. CRITICA: coding region identification as PMUs. The AY-WB genome is 154 kb smaller than the tool invoking comparative analysis. Mol. Biol. Evol. 16:512–524. OY-M genome, primarily as a result of fewer multicopy se- 9. Bai, X., J. Zhang, I. R. Holford, and S. A. Hogenhout. 2004. Comparative quences. Thus, expansions or reductions of PMUs play a major genomics identifies genes shared by distantly related insect-transmitted plant pathogenic mollicutes. FEMS Microbiol. Lett. 235:249–258. role in phytoplasma genome evolution. At least one PMU, 10. Barbara, D. J., A. Morton, M. F. Clark, and D. L. Davies. 2002. Immuno- PMU1, has the characteristics of a replicative composite trans- dominant membrane proteins from two phytoplasmas in the aster yellows clade (chlorante aster yellows and clover phyllody) are highly divergent in poson. PMUs contain genes for specialized sigma factors and the major hydrophilic region. Microbiology 148:157–167. membrane proteins, providing evidence that PMUs are impor- 11. Barre, A., A. de Daruvar, and A. Blanchard. 2004. MolliGen, a database tant for phytoplasma interactions with the environment. Since dedicated to the comparative genomics of Mollicutes. Nucleic Acids Res. 32:D307–D310. phytoplasmas lack recA and other standard homologous re- 12. Bateman, A., L. Coin, R. Durbin, R. D. Finn, V. Hollich, S. Griffiths-Jones, combination functions, it is unlikely that phytoplasmas gener- A. Khana, M. Marshall, S. Moxon, E. L. L. Sonnhammer, D. J. Studholme, ate antigenic variation of membrane proteins through RecA- C. Yeats, and S. R. Eddy. 2004. The Pfam protein families database. Nucleic Acids Res. 32:D138–D141. dependent homologous recombination. We propose that the 13. Beanland, L., C. W. Hoy, S. A. Miller, and L. R. Nault. 2000. Influence of regulation of expression of PMU genes is one of the strategies aster yellows phytoplasma on the fitness of aster leafhopper (Homoptera: Cicadellidae). Ann. Entomol. Soc. Am. 93:271–276. phytoplasmas use to adapt to different environments. Expres- 14. Bendtsen, J. D., H. Nielsen, G. von Heijne, and S. Brunak. 2004. Improved sion of PMU genes might occur through a process that involves prediction of signal peptides: SignalP 3.0. J. Mol. Biol. 340:783–795. VOL. 188, 2006 COMPARATIVE ANALYSIS OF TWO PHYTOPLASMA GENOMES 3695

15. Berducci, G., A. P. Mazzetti, G. Rotilio, and A. Battistoni. 2004. Periplasmic that colonize plant phloem and insects. Int. J. Syst. Evol. Mi- competition for zinc uptake between the metallochaperone ZnuA and crobiol. 54:1243–1255. Cu,Zn superoxide dismutase. FEBS Lett. 569:289–292. 42. Jaffe, J. D., N. Stange-Thomann, C. Smith, D. DeCaprio, S. Fisher, J. Butler, 16. Bhugra, B., and K. Dybvig. 1993. Identification and characterization of S. Calvo, T. Elkins, M. G. FitzGerald, N. Hafez, C. D. Kodira, J. Major, S. IS1138, a from Mycoplasma pulmonis that belongs to Wang, J. Wilkinson, R. Nicol, C. Nusbaum, B. Birren, H. C. Berg, and G. M. the IS3 family. Mol. Microbiol. 7:577–584. Church. 2004. The complete genome and proteome of Mycoplasma mobile. 17. Boutareaud, A., J. L. Danet, M. Garnier, and C. Saillard. 2004. Disruption Genome Res. 14:1447–1461. of a gene predicted to encode a solute binding protein of an ABC transporter 43. Kakizawa, S., K. Oshima, T. Kuboyama, H. Nishigawa, H. Jung, T. Saway- reduces transmission of Spiroplasma citri by the leafhopper Circulifer haema- anagi, T. Tsuchizaki, S. Miyata, M. Ugaki, and S. Namba. 2001. Cloning and toceps. Appl. Environ. Microbiol. 70:3960–3967. expression analysis of phytoplasma protein translocation genes. Mol. Plant- 18. Bove´, J. M. 1997. Spiroplasmas: infectious agents of plants, arthropods and Microbe Interact. 14:1043–1050. vertebrates. Wien. Klin. Wochenschr. 109:604–612. 44. Kakizawa, S., K. Oshima, H. Nishigawa, H. Y. Jung, W. Wei, S. Suzuki, M. 19. Carr, K. M., and J. M. Kaguni. 2002. Escherichia coli DnaA protein loads a Tanaka, S. Miyata, M. Ugaki, and S. Namba. 2004. Secretion of immuno- Downloaded from single DnaB helicase at a DnaA box hairpin. J. Biol. Chem. 277:39815– dominant membrane protein from onion yellows phytoplasma through the 39822. Sec protein-translocation system in Escherichia coli. Microbiology 150:135– 20. Cheggour, A., L. Fanuel, C. Duez, B. Joris, F. Bouillenne, B. Devreese, G. 142. Van Driessche, J. Van Beeumen, J. M. Frere, and C. Goffin. 2000. The dppA 45. Khan, S. A. 1997. Rolling-circle replication of bacterial plasmids. Microbiol. gene of Bacillus subtilis encodes a new D-aminopeptidase. Mol. Microbiol. Mol. Biol. Rev. 61:442–455. 38:504–513. 46. Kieboom, L., R. Bruinenberg, I. Keizer-Gunnink, and J. A. de Bont. 2001. 21. Cordwell, S. J., D. J. Basseal, B. Bjellqvist, D. C. Shaw, and I. Humphery- Transposon mutations in the flagella biosynthetic pathway of the solvent- Smith. 1997. Characterisation of basic proteins from Spiroplasma melliferum tolerant Pseudomonas putida S12 result in a decreased expression of solvent using novel immobilized pH gradients. Electrophoresis 18:1393–1398. efflux genes. FEMS Microbiol. Lett. 198:117–122. 22. Czosnek, H., M. Ghanim, S. Morin, G. Rubinstein, V. Fridman, and M. 47. King, K. W., A. Woodard, and K. Dybvig. 1994. Cloning and characterization Zeidan. 2001. Whiteflies: vectors, and victims (?), of geminiviruses. Adv. of the recA genes from Mycoplasma pulmonis and M. mycoides subsp. http://jb.asm.org/ Virus Res. 57:291–322. mycoides. Gene. 139:111–115. 23. Davidson, A. L., and J. Chen. 2004. ATP-binding cassette transporters in 48. Kiss, J., and F. Olasz. 1999. Formation and transposition of the covalently bacteria. Annu. Rev. Biochem. 73:241–268. closed IS30 circle: the relation between tandem dimers and monomeric 24. Davis, R. E., R. Jomantiene, Y. Zhao, and E. L. Dally. 2003. Folate biosyn- circles. Mol. Microbiol. 34:37–52. thesis pseudogenes, PsifolP and PsifolK, and an O-sialoglycoprotein endo- 49. Koonin, E. V., and T. V. Ilyina. 1992. Geminivirus replication proteins are peptidase gene homolog in the phytoplasma genome. DNA Cell Biol. 22: related to prokaryotic plasmid rolling circle DNA replication initiator pro- 697–706. teins. J. Gen. Virol. 73:2763–2766. 25. Delcher, A. L., D. Harmon, S. Kasif, O. White, and S. L. Salzberg. 1999. 50. Kormanec, J., B. Seccikova, N. Halgasova, R. Knirschova, and B. Rezu- Improved microbial gene identification with GLIMMER. Nucleic Acids Res. chova. 2000. Identification and transcriptional characterization of the gene 27:4636–4641. encoding the stress-response sigma factor sigma(H) in Streptomyces coeli- on October 14, 2015 by University of Queensland Library 26. Deuerling, E., A. Mogk, C. Richter, M. Purucker, and W. Schumann. 1997. color A3(2). FEMS Microbiol. Lett. 189:31–38. The ftsH gene of Bacillus subtilis is involved in major cellular processes such 51. Lee, I. M., R. E. Davis, and D. E. Gundersen-Rindal. 2000. Phytoplasma: as sporulation, stress adaptation and secretion. Mol. Microbiol. 23:921–933. phytopathogenic mollicutes. Annu. Rev. Microbiol. 54:221–255. 27. Dimroth, P., and B. Schink. 1998. Energy conservation in the decarboxyl- 52. Lee, I.-M., D. E. Gundersen-Rindal, R. E. Davis, K. D. Bottner, C. Marcone, ation of dicarboxylic acids by fermenting bacteria. Arch. Microbiol. 170:69– and E. Seemuller. 2004. ‘Candidatus Phytoplasma asteris’, a novel phyto- 77. plasma taxon associated with aster yellows and related diseases. Int. J. Syst. 28. Eisen, J. A., J. F. Heidelberg, O. White, and S. L. Salzberg. 2000. Evidence for Evol. Microbiol. 54:1037–1048. symmetric chromosomal inversions around the replication origin in bacteria. 53. Lee, I. M., Y. Zhao, and K. D. Bottner. 2005. Novel insertion sequence-like Genome Biol. 1:research0011.1–0011.9. [Online.] http://genomebiology.com elements in phytoplasma strains of the aster yellows group are putative new /2000/1/6/RESEARCH/0011. members of the IS3 family. FEMS Microbiol. Lett. 242:353–360. 29. Ewing, B., L. Hillier, M. C. Wendl, and P. Green. 1998. Base-calling of 54. Liefting, L. W., and B. C. Kirkpatrick. 2003. Cosmid cloning and sample automated sequencer traces using phred. I. Accuracy assessment. Genome sequencing of the genome of the uncultivable mollicute, Western X-disease Res. 8:175–185. phytoplasma, using DNA purified by pulsed-field gel electrophoresis. FEMS 30. Ewing, B., and P. Green. 1998. Base-calling of automated sequencer traces Microbiol. Lett. 221:203–211. using phred. II. Error probabilities. Genome Res. 8:186–194. 55. Liefting, L. W., M. E. Shaw, and B. C. Kirkpatrick. 2004. Sequence analysis 31. Fernandez, S., F. Rojo, and J. C. Alonso. 1997. The Bacillus subtilis chroma- of two plasmids from the beet leafhopper-transmitted virescence agent. tin-associated protein Hbsu is involved in DNA repair and recombination. Microbiology 150:1809–1817. Mol. Microbiol. 23:1169–1179. 56. Lohman, T. M., and M. E. Ferrari. 1994. Escherichia coli single-stranded 32. Frank, A. C., and J. R. Lobry. 2000. Oriloc: prediction of replication bound- DNA-binding protein: multiple DNA-binding modes and cooperativities. aries in unannotated bacterial chromosomes. Bioinformatics 16:560–561. Annu. Rev. Biochem. 63:526–570. 33. Fraser, C. M., J. D. Gocayne, O. White, M. D. Adams, R. A. Clayton, R. D. 57. Lysenko, E., T. Ogura, and S. M. Cutting. 1997. Characterization of the ftsH Fleischmann, C. J. Bult, A. R. Kerlavage, G. Sutton, J. M. Kelley, R. D. gene of Bacillus subtilis. Microbiology 143:971–978. Fritchman, J. F. Weidman, K. V. Small, M. Sandusky, J. Fuhrmann, 58. Mahillon, J., and M. Chandler. 1998. Insertion sequences. Microbiol. Mol. D. Nguyen, T. R. Utterback, D. M. Saudek, C. A. Phillips, J. M. Merrick, Biol. Rev. 62:725–774. J. F. Tomb, B. A. Dougherty, K. F. Bott, P. C. Hu, T. S. Lucier, S. N. 59. Marais, A., J. M. Bove, and J. Renaudin. 1996. Characterization of the recA Peterson, H. O. Smith, C. A. Hutchison III, and J. C. Venter. 1995. The gene regions of Spiroplasma citri and Spiroplasma melliferum. J. Bacteriol. minimal gene complement of Mycoplasma genitalium. Science 270:397–403. 178:7003–7009. 34. Gasparich, G. E. 2002. Spiroplasmas: evolution, adaptation and diversity. 60. Marchler-Bauer, A., J. B. Anderson, C. DeWeese-Scott, N. D. Fedorova, L. Y. Front. Biosci. 7:d619–d640. Geer, S. He, D. I. Hurwitz, J. D. Jackson, A. R. Jacobs, C. J. Lanczycki, C. A. 35. Guy, L., and C. A. Roten. 2004. Genometric analyses of the organization of Liebert, C. Liu, T. Madej, G. H. Marchler, R. Mazumder, A. N. Nikolskaya, circular chromosomes: a universal pressure determines the direction of ri- A. R. Panchenko, B. S. Rao, B. A. Shoemaker, V. Simonyan, J. S. Song, P. A. bosomal RNA genes transcription relative to chromosome replication. Gene Thiessen, S. Vasudevan, Y. Wang, R. A. Yanashita, J. J. Yin, and S. H. 340:45–52. Bryant. 2003. CDD: a curated Entrez database of conserved domain align- 36. Hall, M. C., J. R. Jordan, and S. W. Matson. 1998. Evidence for a physical ments. Nucleic Acids Res. 31:383–387. interaction between the Escherichia coli methyl-directed mismatch repair 61. Marcone, C., H. Neimark, A. Ragozzino, U. Lauer, and E. Seemu¨ller. 1999. proteins MutL and UvrD. EMBO J. 17:1535–1541. Chromosome sizes of phytoplasmas composing major phylogenetic groups 37. Hanboonsong, Y., C. Choosai, S. Panyim, and S. Damak. 2002. Transovarial and subgroups. Phytopathology 89:805–810. transmission of sugarcane white leaf phytoplasma in the insect vector Mat- 62. Marcone, C., I. M. Lee, R. E. Davis, A. Ragozzino, and E. Seemuller. 2000. sumuratettix hiroglyphicus (Matsumura). Insect Mol. Biol. 11:97–103. Classification of aster yellows-group phytoplasmas based on combined anal- 38. Higgins, C. F. 2001. ABC transporters: physiology, structure and mecha- yses of rRNA and tuf gene sequences. Int. J. Syst. Evol. Microbiol. 50:1703– nism—an overview. Res. Microbiol. 152:205–210. 1713. 39. Hosie, A. H. F., and P. S. Poole. 2001. Bacterial ABC transporters of amino 63. Martinez-Morales, F., M. Schobert, I. M. Lopez-Lara, and O. Geiger. 2003. acids. Res. Microbiol. 152:259–270. Pathways for phophatidylcholine biosynthesis in bacteria. Microbiology 149: 40. Hruska, A. J., and M. Gomez Peralta. 1997. Maize response to corn leaf- 3461–3471. hopper (Homoptera: Cicadellidae) infestation and Achaparramiento dis- 64. Melamed, S., E. Tanne, R. Ben-Haim, O. Edelbaum, D. Yogev, and I. Sela. ease. J. Econ. Entomol. 90:604–610. 2003. Identification and characterization of phytoplasmal genes, employing a 41. IRPCM Phytoplasma/Spiroplasma Working Team-Phytoplasma Taxonomy novel method of isolating phytoplasmal genomic DNA. J. Bacteriol. 185: Group. 2004. ‘Candidatus Phytoplasma’, a taxon for the wall-less, non-helical 6513–6521. 3696 BAI ET AL. J. BACTERIOL.

65. Melcher, U., and J. Fletcher. 1999. Genetic variation in Spiroplasma citri. 82. Rocha, E. P., and A. Blanchard. 2002. Genomic repeats, genome plasticity Eur. J. Plant Pathol. 105:519–533. and the dynamics of Mycoplasma evolution. Nucleic Acids Res. 30:2031– 66. Melcher, U., Y. Sha, F. Ye, and J. Fletcher. 1999. Mechanisms of spiroplasma 2042. genome variation associated with SpV1-like viral DNA inferred from se- 83. Rodriguez-Sainz, M. C., C. Hernandez-Chico, and F. Moreno. 1990. Molec- quence comparisons. Microb. Comp. Genomics 4:29–46. ular characterization of pmbA,anEscherichia coli chromosomal gene re- 67. Mendonca, V. M., and S. W. Matson. 1995. Genetic analysis of ⌬helD and quired for the production of the antibiotic peptide MccB17. Mol. Microbiol. ⌬uvrD mutations in combination with other genes in the RecF recombina- 4:1921–1932. tion pathway in Escherichia coli: suppression of ruvB mutation by a uvrD 84. Rosch, J., and M. Caparon. 2004. A microdomain for protein secretion in deletion. Genetics 141:443–452. gram-positive bacteria. Science 304:1513–1515. 68. Miyata, S., K. Oshima, S. Kakizawa, H. Nishigawa, H. Y. Jung, T. 85. Schneider, B., E. Torres, M. P. Martin, M. Schroder, H. D. Behnke, and E. Kuboyama, M. Ugaki, and S. Namba. 2003. Two different thymidylate kinase Seemuller. 2005. ‘Candidatus Phytoplasma pini’, a novel taxon from Pinus gene homologues, including one that has catalytic activity, are encoded in the silvestris and Pinus halepensis. Int. J. Syst. Evol. Microbiol. 55:303–307. onion yellows phytoplasma genome. Microbiology 149:2243–2250. 86. Schomburg, I., A. Chang, C. Ebeling, M. Gremse, C. Heldt, G. Huhn, and D. Downloaded from 69. Montano, H. G., R. E. Davis, E. L. Dally, S. Hogenhout, J. P. Pimentel, and Schomburg. 2004. BRENDA, the enzyme database: updates and major new P. S. T. Brioso. 2001. ‘Candidatus Phytoplasma brasiliense’, a new phyto- developments. Nucleic Acids Res. 32:D431–D433. plasma taxon associated with hibiscus witches’ broom disease. Int. J. Syst. 87. Segal, G., M. Feldman, and T. Zusman. 2005. The Icm/Dot type-IV secretion Bacteriol. 51:1109–1118. systems of Legionella pneumophila and Coxiella burnetii. FEMS Microbiol. 70. Murayama, N., H. Shimizu, S. Takiguchi, Y. Baba, H. Amino, T. Horiuchi, Rev. 29:65–81. K. Sekimizu, and T. Miki. 1996. Evidence for involvement of Escherichia coli 88. Silva, Z., M. M. Sampaio, A. Henne, A. Bohm, R. Gutzat, W. Boos, M. S. da genes pmbA, csrA and a previously unrecognized gene tldD, in the control Costa, and H. Santos. 2005. The high-affinity maltose/trehalose ABC trans- of DNA gyrase by letD (ccdB) of sex factor F. J. Mol. Biol. 256:483–502. porter in the extremely thermophilic bacterium Thermus thermophilus HB27 71. Nishigawa, H., S. Miyata, K. Oshima, T. Sawayanagi, A. Komoto, T. also recognizes sucrose and palatinose. J. Bacteriol. 187:1210–1218. Kuboyama, I. Matsuda, T. Tsuchizaki, and S. Namba. 2001. In planta ex- 89. Song, J., A. Ware, and S. L. Liu. 2003. Wavelet to predict bacterial ori and pression of a protein encoded by the extrachromosomal DNA of a phyto- ter: a tendency towards a physical balance. BMC Genomics 4:17. http://jb.asm.org/ plasma and related to geminivirus replication proteins. Microbiology 147: 90. Szevere´nyi, I., Z. Nagy, T. Farkas, F. Olasz, and J. Kiss. 2003. Detection and 507–513. analysis of transpositionally active head-to-tail dimers in three additional 72. Nishigawa, H., K. Oshima, S. Kakizawa, H. Y. Jung, T. Kuboyama, S. Escherichia coli IS elements. Microbiology 149:1297–1310. Miyata, M. Ugaki, and S. Namba. 2002. A plasmid from a non-insect- 91. Tamas, I., L. Klasson, B. Canback, A. K. Naslund, A. S. Eriksson, J. J. transmissible line of a phytoplasma lacks two open reading frames that exist Wernegreen, J. P. Sandstrom, N. A. Moran, and S. G. Andersson. 2002. 50 in the plasmid from the wild-type line. Gene 298:195–201. million years of genomic stasis in endosymbiotic bacteria. Science 296:2376– 73. Nishigawa, H., K. Oshima, S. Kakizawa, H. Y. Jung, T. Kuboyama, S. 2379. Miyata, M. Ugaki, and S. Namba. 2002. Evidence of intermolecular recom- 92. Tettelin, H., D. Radune, S. Kasif, H. Khouri, and S. L. Salzberg. 1999. bination between extrachromosomal DNAs in phytoplasma: a trigger for the Optimized multiplex PCR: efficiently closing a whole-genome shotgun se-

biological diversity of phytoplasma? Microbiology 148:1389–1396. quencing project. Genomics 62:500–507. on October 14, 2015 by University of Queensland Library 74. Olasz, F., T. Farkas, J. Kiss, A. Arini, and W. Arber. 1997. Terminal inverted 93. Tougu, K., and K. J. Marians. 1996. The interaction between helicase and repeats of insertion sequence IS30 serve as targets for transposition. J. primase sets the replication fork clock. J. Biol. Chem. 35:21398–21405. Bacteriol. 179:7551–7558. 94. Weisburg, W. G., J. G. Tully, D. L. Rose, J. P. Petzel, H. Oyaizu, D. Yang, L. 75. Oshima, K., S. Kakizawa, H. Nishigawa, T. Kuboyama, S. Miyata, M. Ugaki, Mandelco, J. Sechrest, T. G. Lawrence, J. Van Etten, J. Maniloff, and C. R. and S. Namba. 2001. A plasmid of phytoplasma encodes a unique replication Woese. 1989. A phylogenetic analysis of the mycoplasmas: basis for their protein having both plasmid- and virus-like domains: clue to viral ancestry or classification. J. Bacteriol. 171:6455–6467. result of virus/plasmid recombination? Virology 285:270–277. 95. Westberg, J., A. Persson, A. Holmberg, A. Goesmann, J. Lundeberg, K.-E. 76. Oshima, K., S. Kakizawa, H. Nishigawa, H. Y. Jung, W. Wei, S. Suzuki, R. Johanssen, B. Petterson, and M. Uhle´n. 2004. The genome sequence of Arashida, D. Nakata, S. Miyata, M. Ugaki, and S. Namba. 2004. Reductive Mycoplasma mycoides subsp. mycoides type strain PG1T, the causative agent evolution suggested from the complete genome sequence of a plant-patho- of contagious bovine pleuropneumonia (CPBB). Genome Res. 14:221–227. genic phytoplasma. Nat. Genet. 36:27–29. 96. Woese, C. R. 1987. Bacterial evolution. Microbiol. Rev. 51:221–227. 77. Poole, P., and D. Allaway. 2000. Carbon and nitrogen metabolism in Rhizo- 97. Wu, M., L. V. Sun, J. Vamathevan, M. Riegler, R. Deboy, J. C. Brownlie, bium. Adv. Microb. Physiol. 43:117–163. E. A. McGraw, W. Martin, C. Esser, N. Ahmadinejad, C. Wiegand, R. 78. Pre´vost, C., and M. Takahashi. 2004. Geometry of the DNA strands within Madupu, M. J. Beanan, L. M. Brinkac, S. C. Daugherty, A. S. Durkin, J. F. the RecA nucleofolament: role in homologous recombination. Q. Rev. Bio- Kolonay, W. C. Nelson, Y. Mohamoud, P. Lee, K. Berry, M. B. Young, phys. 36:429–453. T. Utterback, J. Weidman, W. C. Nierman, I. T. Paulsen, K. E. Nelson, H. 79. Ray, K. C., Z.-C. Tu, R. Grogono-Thomas, D. G. Newell, S. A. Thompson, Tettelin, S. L. O’Neill, and J. A. Eisen. 2004. Phylogenomics of the repro- and M. J. Blaser. 2000. Campylobacter fetus inversion occurs in the absence ductive parasite Wolbachia pipientis wMel: a streamlined genome overrun of RecA function. Infect. Immun. 68:5663–5667. by . PLOS Biol. 2:e69. 80. Razin, S., D. Yogev, and Y. Naot. 1998. Molecular biology and pathogenicity 98. Yu, F., S. Inouye, and M. Inouye. 1986. Lipoprotein-28, a cytoplasmic mem- of mycoplasmas. Microbiol. Mol. Biol. Rev. 62:1094–1156. brane lipoprotein from Escherichia coli. Cloning, DNA sequence, and ex- 81. Rekab, D., L. Carraro, B. Schneider, E. Seemuller, J. Chen, C. J. Chang, R. pression of its gene. J. Biol. Chem. 261:2284–2288. Locci, and G. Firrao. 1999. Geminivirus-related extrachromosomal DNAs of 99. Zhang, J., S. A. Hogenhout, L. R. Nault, C. W. Hoy, and S. A. Miller. 2004. the X-clade phytoplasmas share high sequence similarity. Microbiology 145: Molecular and symptom analyses of phytoplasma strains from lettuce reveal 1453–1459. a diverse population. Phytopathology 94:842–849.