
The Evolution of a Gene Cluster Containing a Plant-Like Protein In


EVOLUTION OF A PLANT-LIKE GENE ANCIENTLY ACQUIRED AS PART OF A GENOMIC ISLAND IN

A DISSERTATION SUBMITTED TO THE GRADUATE DIVISION OF THE UNIVERSITY OF HAWAI‘I AT MĀNOA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF

DOCTOR OF PHILOSOPHY IN MOLECULAR BIOSCIENCES AND BIOENGINEERING

May 2012

BY

KEVIN SCHNEIDER

DISSERTATION COMMITTEE

GERNOT PRESTING, CHAIRPERSON

ANNE ALVAREZ

YANGRAE CHO

GUYLAINE POISSON

SEAN CALLAHAN

Dedicated to my Parents!


Acknowledgments

I want to give my biggest thanks to Dr Gernot Presting for providing me with so many opportunities during my career at UH Manoa. These ranged from the teaching assistantship I received on unexpectedly short notice, which began my PhD, to working and publishing on exciting and interesting topics, from corn centromeres to bacterial genomes. I am forever grateful for the time, patience, and energy he has spent mentoring me.

This work would not have been possible without Dr Anne Alvarez. She has provided not only her knowledge of , but also her collection of bacterial strains that the majority of my research required. Also, I thank Asoka Da Silva, who has provided his expertise and skills to culture and purify the hundreds of strains used in this study.

The analysis in this work would not have begun without the initial phylogenomic analysis of Arabidopsis completed by Aren Ewing. His work laid the foundation to stick with studying bacterial genomic evolution in light of all of the wonderful work to study the genomic evolution of the centromeres of Zea mays in our lab. I also thank all of my lab mates Anupma Sharma, Thomas Wolfgruber, Jamie Allison, Jeffrey Lai, Megan Nakashima, Ronghui Xu, Zidian Xie, Grace Kwan, Margaret Ruzicka, Krystle Salazar and Erin Mitsunaga from the past and the present for their advice, help, discussions and their friendship and casual chit-chat. The work presented in this dissertation was supported by USDA #2005-34135-15972.

My special thanks go to my supportive father Bruce, mother Judy, brother Robbie and sister Carrie who have been there throughout my career as a graduate student. I also thank the many friends and colleagues that have provided an unforgettable graduate school experience.


Abstract

In this thesis, I determine that XAC3314, a plant-like gene, was anciently transferred to Xanthomonas axonopodis pv. citri strain 306 (Xac_306). An XAC3314 homolog has also been reported in the sequenced strain Xanthomonas axonopodis pv. vesicatoria strain 85-10, but is absent from three sequenced strains of X. campestris and three sequenced strains of X. oryzae. XAC3314 may have been lost in these . The genetic diversity of XAC3314 and the evolution of this gene in Xanthomonas were characterized. Furthermore, I show that XAC3314 was acquired by an ancestral genome and transferred to Xanthomonas as part of a genomic island. Anciently acquired genes are expected to have ameliorated to the native genome; therefore, these genes cannot be identified using the characteristics common to a recent horizontal transfer event. I compared Xac_306 to other sequenced genomes to identify potential gene clusters in the vicinity of XAC3314. Analysis of these genomes revealed that XAC3314 was likely acquired as a small gene cluster, which inserted in Xanthomonas after the split of X. albilineans. and have unique gene clusters inserted at the same region as those in X. axonopodis. XAC3314 homologs were also identified in the sequenced strains of X. vasicola and X. gardneri. The diversity of XAC3314 was analyzed in Xanthomonas strains from the Pacific Bacterial Collection. A total of 307 Xanthomonas strains were classified with the RIF marker. Trees constructed from Xanthomonas RIF sequences are similar in structure to a completed multi-locus analysis of the [Young et al., 2008]. XAC3314 homologs are present in 98 of the 307 strains in the Pacific Bacterial Collection. The XAC3314 sequences revealed that HGT may have occurred among some strains of X. axonopodis. This work expands our understanding of bacterial genome evolution and anciently acquired gene clusters.


Table of Contents

Acknowledgments
Abstract
List of Tables
List of Figures
Chapter 1: General Introduction
Chapter 2: Literature Review
Chapter 3: Analysis of XAC3314 within sequenced genomes
3.1 Introduction
3.2 Hypothesis
3.3 Methods
3.4 Results
3.5 Discussion
3.5 Conclusion
Chapter 4: Classification of Xanthomonas strains using RIF, a computationally derived DNA marker
4.1 Introduction
4.2 Hypothesis
4.3 Methods
4.4 Results
4.5 Discussion
4.5 Conclusion
Chapter 5: XAC3314 diversity in the PBC
5.1 Introduction
5.3 Methods
5.4 Results
5.5 Discussion
5.5 Conclusion
Chapter 6: Summary
Appendix A: Design of RIF primers for other plant associated
Appendix B: Identification of Clavibacter with RIF
Appendix C: Identification of Ralstonia with RIF
Appendix D: Identification of with RIF
Appendix E: Supplemental Figures
Appendix F: Supplemental Tables
References


List of Tables

Table 2.1. Sequenced strains used in this study
Table 3.1. The gene cluster in Xac_306 that includes XAC3314
Table 4.1. Genes in Xoo_311018 and Xcc_33913 containing regions that match criteria for potential markers
Table 4.2. In silico comparison of Xanthomonas with the RIF marker resolves one pair of closely related strains that is unresolved with four other housekeeping genes and the ITS
Table 4.3. Abbreviations for Xanthomonas and Stenotrophomonas strains
Table A.1. Additional primer pairs to amplify RIF designed for other genera
Supplemental Table S5.1. Copy number and location of the dnaA gene in 1,067 sequenced bacterial genomes
Supplemental Table S5.2. Xanthomonas strains from the PBC used in this study and ITS/RIF/XAC3314 amplicon and sequencing success
Supplemental Table SB.1. Clavibacter strains from the PBC used in this study and ITS/RIF amplicon and sequencing success
Supplemental Table SC.1. Ralstonia strains from the PBC used in this study and ITS/RIF/egl amplicon and sequencing success
Supplemental Table SD.1. Enterobacteriaceae strains from the PBC used in this study and ITS/RIF/ADE amplicon and sequencing success


List of Figures

Figure 3.1. Gene cluster insertion sites in the
Figure 3.2. Potential HGT events in Xac_306
Figure 3.3. XAC3314 has been lost in several of the sequenced Xanthomonas strains
Figure 3.4. Genes shared by Xanthomonadaceae strains
Figure 3.5. Genes were lost by Stenotrophomonas and rearranged in Xylella
Figure 3.6. A large gene cluster inserted into an ancestral X. albilineans genome after the split of X. albilineans from other Xanthomonas species
Figure 3.7. XAC3314 inserted into an ancestral Xanthomonas genome as a gene cluster that arose after the split of X. albilineans
Figure 3.8. XAC3314 was lost by X. campestris, X. gardneri and X. vesicatoria
Figure 3.9. XAC3314 has been retained in X. vasicola
Figure 3.10. X. oryzae strains have lost XAC3314
Figure 3.11. XAC3314 has been retained and lost in different X. axonopodis strains
Figure 4.1. Alignment of the dnaA gene from six Xanthomonas strains
Figure 4.2. Neighbor-joining cladograms for Xanthomonas using ITS and RIF Sequences
Figure 4.3. RIF distinguishes more Xanthomonas strains than ITS
Figure 4.4. Maximum-likelihood of RIF sequences that agree with all four tree methods
Figure 4.5. RIFdb: Online RIF sequence framework for the identification of plant
Figure 5.1. XAC3314 is present after the split of X. albilineans in the Xanthomonas tree
Figure 5.2. Diversity of XAC3314 Xanthomonas strains in the PBC
Figure 5.3. XAC3314 likely underwent horizontal transfer at least twice in Xanthomonas
Figure 5.4. Diversity of the three different intergenic regions amplified in Xanthomonas Strains
Figure A.1. Multiple sequence alignment of nucleotides 311 to 1311 of the dnaA genes of six genera
Figure B.1. RIF sequences distinguish fewer strains of Clavibacter but produce a more robust tree
Figure B.2. The RIF marker separates Clavibacter michiganensis strains into the three Subspecies
Figure C.1. RIF sequences distinguish more Ralstonia strains than ITS
Figure C.2. Ralstonia solanacearum strains classified based on RIF genotypes
Figure D.1. Neighbor-joining cladograms for the family Enterobacteriaceae using ITS and RIF sequences
Figure D.2. RIF sequences distinguish more Dickeya strains than ITS
Figure D.3. RIF tree of plant pathogenic Enterobacteriaceae
Supplemental Figure S4.1. Neighbor-joining tree of the RIF marker for Xanthomonas
Supplemental Figure S4.2. Minimum-evolution tree of the RIF marker for Xanthomonas
Supplemental Figure S4.3. Maximum-parsimony tree of the RIF marker for Xanthomonas
Supplemental Figure S4.4. Maximum-likelihood tree of the RIF marker for Xanthomonas


Chapter 1: General Introduction

Cells in all domains of life inherit DNA. Most commonly DNA is acquired from a parental cell by its offspring, known as vertical inheritance (Snel et al., 2002). Exogenous DNA may also be acquired by a cell (Syvanen 1994). Exogenous DNA that cannot persist as a plasmid must insert by homologous or non-homologous recombination directly into a native chromosome or plasmid. This is known as horizontal or lateral inheritance (Lorenz and Wackernagel 1994). The horizontally inherited DNA can then be passed on vertically. A horizontal gene transfer (HGT) event may occur from any one organism to any other organism. A cluster of genes transferred to an organism is known as a genomic island (Hacker and Kaper 1999). The region of the genome into which the gene cluster inserts is known as the insertion site, a region that may also exhibit rearrangement. Horizontally acquired DNA may exhibit skewed genomic and proteomic characteristics (Lawrence 2002; Huynen and Bork 1998; Karlin 2001; Comas et al., 2006; Hsiao et al., 2003; Garcia-Vallve et al., 2003), such as G/C content or dinucleotide bias, compared to the native genome. The differences are most pronounced when DNA is acquired from a distant genome (e.g., a bacterium acquiring a plant gene) and are difficult to discern when the acquired DNA is similar to the native genome. Nembaware et al., (2004) identified XAC3314 in Xanthomonas axonopodis pv. citri (Xac) strain 306 (Xac_306). XAC3314 may have been horizontally transferred from a plant to non-plant organisms (Richards et al., 2009). This gene may have been acquired by Xanthomonas as a larger gene cluster from an unknown ancestral organism. I am interested in the evolution of XAC3314. Was the XAC3314 gene transferred to Xanthomonas directly or as part of a larger gene cluster? How has the gene cluster evolved in each Xanthomonas strain? Was XAC3314 transferred more than once from a plant to a non-plant? Has XAC3314 been transferred since acquisition by Xanthomonas?


Chapter 2: Literature Review

2.1 Identification of plant-like genes in Xanthomonas

Nembaware et al., (2004) identified two genes of plant origin in Xanthomonas axonopodis pv. citri (also named ) strain Xac_306. One gene encodes a plant natriuretic-like protein, XACPNP, which modifies the plant host response when Xac infects a plant (Gottig et al., 2008). This is likely to benefit the (Nembaware et al., 2004; Gottig et al., 2008). Gottig et al., (2008) have shown that XACPNP, whether added directly to a plant or expressed in planta, causes the plant to open its stomata, similar to the effect of some native plant natriuretic-like proteins. These open stomata likely allow bacteria access into the substomatal cavity. Nembaware et al., (2004) also identified XAC3314, a protein that contains a plant-like domain of unknown function 239 (DUF239, named by Pfam [http://pfam.sanger.ac.uk]), in Xac_306. Ewing (2008) subsequently identified the gene in a phylogenomic analysis of Arabidopsis. The XAC3314 gene is activated by the transcriptional activator of the type three secretion system and the protein is secreted by the type two secretion system when the bacterium is in an environment similar to the apoplast. This protein may provide a pathogenicity or virulence benefit. DUF239 domain-containing proteins were identified in Xanthomonas citri, Xanthomonas euvesicatoria (also named Xanthomonas axonopodis pv. vesicatoria), the bacterium Methylocella, the fungus Laccaria bicolor and plants (Richards et al., 2009; Ewing 2008). Based on the whole protein phylogenetic tree, Richards et al., (2009) were unable to determine whether the gene was transferred on separate occasions to fungi and , or transferred from a plant to Laccaria bicolor through Methylocella as an intermediate. Richards et al., (2009) observed long branches between plant proteins and non-plant proteins with a DUF239 domain. These branch lengths resemble those of proteins that were anciently transferred to an ancestral genome (Kurland et al., 2003). No functional analysis of XAC3314 has been reported. To better study the evolution of the XAC3314 gene it is important to understand bacterial genome evolution and bacterial secretion systems in general and in the context of Xanthomonas.

2.2 Bacterial Genome Evolution

Snel et al., (2002) state that many different processes can shape a prokaryotic genome and suggest that the most prominent process driving bacterial genome evolution is the vertical inheritance of a “core” set of genes, or housekeeping genes. Non-coding regions and horizontally transferred genes (that are not required for survival post-acquisition) are not under the same selective pressure as housekeeping genes and may undergo spontaneous loss or mutation more readily.

2.2.1 Bacterial Classification

Methods developed for bacterial classification use both phenotypic and genotypic traits. Phenotype-based classification methods include biochemical and serological tests. DNA-based classification methods are faster than most phenotypic tests and more commonly used. These include fingerprinting, DNA-DNA hybridization, primer-based sequencing and whole genome sequencing.

DNA classification techniques provide greater resolution than biochemical and serological tests. DNA fingerprinting methods use the unique banding patterns obtained from separating genomic DNA on an agarose gel after PCR amplification of repetitive regions using specific primer pairs or restriction digestion (Louws et al., 1999; Alvarez 2004). The maximum variability (resolution) provided by the DNA fingerprint is determined by the number of unique bands produced: n bands that are each scored as present or absent resolve up to 2^n types (e.g., two unique bands resolve four different types and three unique bands resolve eight different types). A fingerprinting agarose gel may require a 14-hour electrophoresis run for proper separation and resolution of the DNA bands. Reference strains must be included with other samples to obtain reference fingerprints for proper comparison and classification. Also, some fingerprinting methods require specific culturing and DNA extraction methods to create a reproducible fingerprint (Louws et al., 1999). The similarities and differences between DNA fingerprints can be used to construct a phylogenetic tree among strains of interest.

DNA-DNA hybridization techniques utilize the annealing temperature between genomes to measure renaturation rates and calculate the percent homology. The percent homology between strains can be used to create distance matrices and construct phylogenetic trees (see section 2.2.4). To create a complete distance matrix all strains of interest should be cross hybridized. DNA-DNA hybridization of type strains was performed by Vauterin et al., (1995) to reclassify the species of Xanthomonas (see section 2.2.4).
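The relationship between band number and fingerprint resolution described above can be illustrated with a short Python sketch. It assumes that each band is scored simply as present (1) or absent (0); the function names and the three example strains are hypothetical and are not taken from the cited studies. The pairwise values it produces are the kind of input a distance-based tree method would use.

```python
from itertools import combinations


def max_fingerprint_types(n_bands: int) -> int:
    """Upper bound on distinguishable types when each band is scored present/absent."""
    if n_bands < 0:
        raise ValueError("n_bands must be non-negative")
    return 2 ** n_bands


def band_distance(a, b) -> float:
    """Fraction of band positions at which two presence/absence fingerprints differ."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("fingerprints must be non-empty and of equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def fingerprint_distances(fingerprints: dict) -> dict:
    """Pairwise band distances for every pair of strains (keys sorted by name)."""
    return {
        (p, q): band_distance(fingerprints[p], fingerprints[q])
        for p, q in combinations(sorted(fingerprints), 2)
    }


# Hypothetical fingerprints: three bands scored as present (1) or absent (0).
example = {
    "strain_1": (1, 0, 1),
    "strain_2": (1, 1, 1),
    "strain_3": (0, 1, 0),
}

if __name__ == "__main__":
    print(max_fingerprint_types(2), max_fingerprint_types(3))
    for pair, d in fingerprint_distances(example).items():
        print(pair, round(d, 3))
```

In practice, band scoring on a gel is noisier than a clean 0/1 vector, so the 2^n figure is best read as an upper bound rather than an expected resolution.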

The low cost of sequencing a PCR amplicon makes DNA markers a viable classification method (DNA barcoding), replacing assays that do not use sequence data. Commonly the universal ribosomal RNA (rRNA) genes are used to classify bacteria. Ribosomal RNA genes are amplified in the majority of strains using universal primers (Normand et al., 1996). Ribosomal DNA sequences have been determined for a large number of samples, including more than one million 16S rDNA sequences from individual strains and environmental samples (Cole et al., 2009, http://rdp.cme.msu.edu), and over 18,000 GenBank (Benson et al., 1999) sequences of the internal transcribed spacer (ITS), which lies between the 16S and 23S rRNA genes. These DNA sequences have been used to classify a large number of bacteria and study population diversity (Garcia-Martinez et al., 1999; Gurtler and Stanisich 1996). However, 16S rDNA sequences do not always resolve species within a genus, and even the more variable ITS sequence resolves many genera only to the species level (Garcia-Martinez et al., 1999). Also, Kang et al., (2010) have shown that multiple copies of the 16S and 23S rRNA genes exist in over 80% (639 of 782 strains) of Gram+ and Gram- bacteria studied. Direct sequencing of rDNA amplicons from a genome containing different alleles may result in poor sequence quality (as observed in 415 of 782 strains by Kang et al., [2010]). This is currently overcome by isolating the amplicon of interest with one of two time-consuming procedures and then sequencing the product: excising individual amplicons from an agarose gel (Wilton et al., 1997), if the alleles differ in length, or cloning the amplicons.

The CO1 protein coding gene has been suggested as the universal animal DNA barcode (Hebert et al., 2003) and the universal plastid amplicon (UPA) has been suggested as the universal plant barcode (Presting 2006). DNA markers designed from housekeeping genes in bacteria have been used singly or in combination to determine their phylogenetic relationships (Ma et al., 2007; Young and Park 2007; Prior and Fegan 2005; Castillo and Greenburg 2007; Young et al., 2008; Parkinson et al., 2007; Parkinson et al., 2009; Vitorino et al., 2006). The most commonly used protein coding genes are housekeeping genes. These are under stabilizing selection and are expected to more accurately portray the genetic relationships among strains than genes that are under positive selection, such as pathogenicity genes (Urwin and Maiden 2003). Genes under stabilizing selection may also be less prone to lateral transfer than other genes, such as those involved in pathogenicity. Nonetheless, even housekeeping genes may be transferred laterally as evidenced by recombination events within the gyrB gene of Vibrio species (Pascual et al., 2010).

An ideal DNA marker for classification of bacteria would be more variable than the rDNA regions, present in all target organisms as a single copy per genome, unlikely to be transferred by horizontal gene transfer, and amplifiable with universal primers. Attempts to design universal primers for bacteria from sequences other than the rDNA regions have been unsuccessful (Santos and Ochman 2004) due to the great diversity and rapidly evolving nature of bacterial genomes. No ideal marker for strain identification or phylogenetic analysis has been identified for bacteria to date. Instead, genus-specific primers to amplify genic regions are commonly used singly or in combination as a multiple locus sequence analysis/typing (MLSA/MLST). The characteristics inherent in housekeeping genes make them ideal for marker development (Urwin and Maiden 2003). Choosing the best housekeeping gene can be difficult as some genes will provide greater resolution than others (Urwin and Maiden 2003). Any single marker that is to be used for classification and identification needs to produce a phylogenetic tree that is similar to one obtained in a multi-locus analysis or when the complete genome is used.

2.2.2 Phylogenetic Analysis

Aligned DNA sequence data, fingerprinting data, DNA-DNA hybridization and phenetic information from two or more strains can be used to produce a distance matrix that represents the differences between pair-wise comparisons of different strains (Salemi and Vandamme 2003). The matrix is analyzed and used to construct a cladogram to identify and classify organisms. The distance matrices can be directly compared using the neighbor-joining (does not assume that lineages evolve at the same rate) or UPGMA (all paths from the root to the tip are equal) methods to calculate the best fit tree (Salemi and Vandamme 2003). Other methods used to analyze aligned DNA sequences include: maximum parsimony to construct a tree with the fewest changes between nodes, maximum likelihood to construct a tree similar to maximum parsimony but with a defined model for nucleotide or amino acid evolution, and minimum evolution to construct a tree with the shortest overall branch length possible. A branching pattern that is recovered from the same sequences by several different methods is the most probable. To test the branches within a cladogram, random subsamples of the aligned sequences are drawn and a tree is constructed from each subsample (Salemi and Vandamme 2003). The trees constructed from these subsamples are compared to the original cladogram to retrieve a bootstrap score for each branch. A branch of the original cladogram gains support each time a subsampled tree contains the same branching. Higher bootstrap values give more confidence to the branch in the tree. Strains can be compared to each other or to unknowns using these methods.
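As a minimal sketch of the distance-based approach described above, the following Python code computes a pairwise distance matrix from aligned sequences and clusters it with UPGMA. Several choices here are assumptions made for illustration only: the uncorrected p-distance (proportion of differing sites) is used as the distance measure, gap positions are skipped, the tree is returned as a Newick-style string without branch lengths, and the three short sequences are invented. Published analyses would normally use dedicated phylogenetics software and a substitution model.

```python
def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned sequences (gaps skipped)."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper()) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)


def upgma(seqs: dict) -> str:
    """Cluster aligned sequences with UPGMA; return the topology as a Newick-style string."""
    sizes = {name: 1 for name in seqs}          # cluster label -> number of leaves
    names = list(seqs)
    dist = {}                                   # frozenset({label, label}) -> distance
    for i, p in enumerate(names):
        for q in names[i + 1:]:
            dist[frozenset((p, q))] = p_distance(seqs[p], seqs[q])

    while len(sizes) > 1:
        # Join the two closest clusters.
        closest = min(dist, key=dist.get)
        p, q = sorted(closest)
        merged = f"({p},{q})"
        new_size = sizes[p] + sizes[q]
        # Distance to every other cluster is the size-weighted average (UPGMA rule).
        for r in sizes:
            if r in closest:
                continue
            d_pr = dist[frozenset((p, r))]
            d_qr = dist[frozenset((q, r))]
            dist[frozenset((merged, r))] = (sizes[p] * d_pr + sizes[q] * d_qr) / new_size
        # Drop all entries that involve the two clusters just merged.
        dist = {k: v for k, v in dist.items() if p not in k and q not in k}
        del sizes[p], sizes[q]
        sizes[merged] = new_size

    return next(iter(sizes)) + ";"


# Invented example alignment (not real marker sequences).
example = {
    "A": "ACGTACGT",
    "B": "ACGTACGA",
    "C": "TTGTACCA",
}

if __name__ == "__main__":
    print(upgma(example))
```

Because UPGMA assumes equal rates along all lineages, the neighbor-joining method mentioned above is often preferred when rates are expected to differ; the distance matrix step is the same in both cases.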


2.2.3 Horizontal Gene Transfer

The horizontal inheritance of genes, compared with vertical inheritance, is a rare event (Syvanen 1994), but has thoroughly shaped the genomes of modern prokaryotes after billions of years of evolution through DNA acquisition, loss and rearrangements (Syvanen 1994; Huynen and Bork 1998). HGT is the movement and incorporation of DNA from one organism into another that is not the offspring of the donor (Syvanen 1994). Therefore, HGT occurs commonly between organisms that share a similar environment (Ochman et al., 2000). A cluster of horizontally transferred genes is termed a genomic island (GI). Hacker and Kaper (1999) have shown that pathogenicity islands, first named based on virulence-associated genomic regions in isolates, were also GIs. XAC3314, a gene associated with other virulence factors (the bacterial type two secretion system [Yamazaki et al., 2008]), is horizontally transferred (Richards et al., 2009) and may be present within a gene cluster.

Chromosomes, plasmids or short DNA fragments may be acquired from the environment through transformation, transduction or conjugation (Lorenz and Wackernagel 1994). Foreign DNA that cannot replicate and persist on its own must recombine into the native chromosome or plasmid to persist. The incorporation of foreign DNA into the host occurs through recombination (Lorenz and Wackernagel 1994). DNA fragments from related organisms share sequence homology that allows recombination into the chromosome or plasmid (Lorenz and Wackernagel 1994; Lawrence and Roth 1996; Lawrence 2002). DNA from distantly related organisms inserts by illegitimate recombination. Illegitimate recombination is the only mechanism that explains HGT events across distant taxa, unless the strains share sequence homology by chance (Lawrence 2002).

Insertion sequence (IS) elements are involved in the movement of genes and are similar to other transposon-like elements (Siguier et al., 2006). IS elements are classified into families based on related transposases, catalytic domains and flanking inverted repeats. A scar of direct repeats is commonly associated with IS elements at the insertion site (Siguier et al., 2006). Siguier et al., (2006) explain that these IS elements can cause gene inactivation (if inserted into a gene), gene loss (if two similar IS elements recombine and excise out the middle DNA) or genome rearrangement (if two similar IS elements undergo double recombination). Most bacteria contain IS elements as revealed by Touchon and Rocha (2007); however, of the 262 sequenced genomes available, 24% contained no IS elements and 48% had fewer than ten IS elements. They observed an overall correlation between genome size and the number of IS elements, but some small genomes have many IS elements. For example, three sequenced pv. oryzae strains each contained more than 200 IS elements (Lee et al., 2005; Salzberg et al., 2008). The number of IS elements may expand or contract within a genome during vertical inheritance (Siguier et al., 2006; Touchon and Rocha 2007); nonetheless, HGT may play a major role in the spread of IS elements to new genomes and the evolution of a GI after acquisition (Touchon and Rocha 2007).

HGT has been identified between all domains of life and has been observed experimentally between prokaryotes (Comas et al., 2006; Doolittle 1999; Gogarten et al., 2002; Van Sluys et al., 2002; Karlin 2001; Beiko et al., 2005), e.g., the in-planta transmission of toluene phyto-remediation genes between Burkholderia strains (Taghavi et al., 2005); from prokaryote to eukaryote, e.g., entire and partial chromosomal movements from the endosymbiont Wolbachia into several different host genomes (Dunning Hotopp et al., 2007); and from eukaryote to prokaryote, e.g., the alpha-amylase gene of M. degradens (Da Lage et al., 2003). This has led several biologists to note that the construction of a single tree of life for all organisms is not possible (Comas et al., 2006). They state that many genes in a genome may be transferred by HGT and Doolittle (1999) notes that “the universal tree is hopelessly compromised by methodological artifacts and [H]GT”. Doolittle suggests a net-like tree of life to describe organisms. In fact, genes acquired by HGT over evolutionary time can account for a high proportion of a genome. This may mask the vertical signal of the “core” set of genes (Doolittle 1999; Gogarten et al., 2002). The ability to discern genomic regions acquired by vertical descent increases in difficulty as the proportion of the genome derived from HGT events increases (Gogarten et al., 2002).

2.2.4 Characteristics of horizontally transferred DNA

A gene cluster recently acquired through HGT from an unrelated strain will exhibit specific characteristics at the genomic and proteomic level (Lawrence 2002; Huynen and Bork 1998; Karlin 2001; Comas et al., 2006; Hsiao et al., 2003; Garcia-Vallve et al., 2003). Transferred DNA exhibits genetic synteny to the DNA of the donor genome (Lawrence 2002; Huynen and Bork 1998) and G+C%, dinucleotide, codon and amino acid anomalies compared to the native genome. GIs also commonly insert near tRNA genes (Hacker and Kaper 2000), have direct repeats that flank the insert and may encode insertion sequences (Karlin 2001; Comas et al., 2006; Hsiao et al., 2003; Garcia-Vallve et al., 2003). The genomic region that the transferred DNA inserts into is named the genomic hotspot. In a study of 24 sequenced prokaryotic genomes, Garcia-Vallve et al., (2000) used G+C content, codon usage, amino acid usage and gene synteny to reveal that between 1.5% and 14.5% of genes in a genome were likely derived by horizontal instead of vertical inheritance. Formulas have been developed to determine the aforementioned genomic properties (Karlin 2001) and two online resources are available with HGT data on published genomes (Hsiao et al., 2003; Garcia-Vallve et al., 2003).
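Two of the compositional signals listed above, G+C content and dinucleotide bias, can be sketched in Python as follows. This is an illustrative simplification and not the procedure of any cited study: the dinucleotide odds ratio is computed on a single strand as f_xy / (f_x * f_y), whereas published formulations typically also account for the complementary strand; the window size, step and threshold are arbitrary parameters, and the toy genome is invented.

```python
from collections import Counter

BASES = "ACGT"


def gc_content(seq: str) -> float:
    """Fraction of G and C among the unambiguous bases of a sequence."""
    seq = seq.upper()
    total = sum(seq.count(b) for b in BASES)
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return (seq.count("G") + seq.count("C")) / total


def dinucleotide_odds(seq: str) -> dict:
    """Single-strand odds ratio f_xy / (f_x * f_y) for each observed base pair xy."""
    seq = seq.upper()
    mono = Counter(b for b in seq if b in BASES)
    di = Counter(
        seq[i:i + 2]
        for i in range(len(seq) - 1)
        if seq[i] in BASES and seq[i + 1] in BASES
    )
    n_mono, n_di = sum(mono.values()), sum(di.values())
    if n_di == 0:
        raise ValueError("sequence too short for dinucleotide counts")
    odds = {}
    for x in BASES:
        for y in BASES:
            fx, fy = mono[x] / n_mono, mono[y] / n_mono
            if fx > 0 and fy > 0:
                odds[x + y] = (di[x + y] / n_di) / (fx * fy)
    return odds


def gc_outlier_windows(genome: str, window: int, step: int, threshold: float) -> list:
    """Return (start, gc) for windows whose G+C differs from the genome-wide value."""
    overall = gc_content(genome)
    hits = []
    for start in range(0, len(genome) - window + 1, step):
        gc = gc_content(genome[start:start + window])
        if abs(gc - overall) > threshold:
            hits.append((start, gc))
    return hits


# Invented toy genome: a GC-rich background with one AT-rich insert at position 100.
toy_genome = "GC" * 50 + "AT" * 10 + "GC" * 50

if __name__ == "__main__":
    print(gc_outlier_windows(toy_genome, window=20, step=20, threshold=0.5))
```

Such compositional screens are expected to work best for recent transfers from distant donors; as the surrounding text notes, anciently acquired DNA that has ameliorated to the native genome would not stand out in this way.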

2.2.5 The Selfish Operon Model to Describe HGT

Lawrence and Roth (1996) have suggested the “Selfish Operon Model” to explain the formation of operons in bacterial genomes by HGT. An operon is a cluster of genes controlled by a single regulator that provides a selective advantage for the organism. Similarly, a cluster of genes with varied regulatory control, which provides a selective advantage to the organism as a gene cluster, would follow similar inheritance patterns as an operon. This model explains that gene clusters are more likely to be horizontally transferred than non-clustered genes, if the genes provide some form of selective advantage. Lawrence and Roth (1996) explain that their model fits the transcription control and gene diversity that operons (and sometimes GIs) exhibit in bacterial genomes better than the Natal model proposed by Horowitz in 1945 (gene clusters originate from duplication and divergence), the Fisher model proposed by Fisher in 1930 (gene clusters form due to the recombination of genes, as clustered genes are less likely to be disrupted by recombination), and the co-regulation model proposed by Pardee et al., in 1958, Jacob et al., in 1960 and Jacob and Monod in 1962 (gene clusters originate due to coordinate expression and regulation). The “Selfish Operon Model” provides a selective advantage for the transfer of clustered genes over the transfer of non-clustered genes, as almost all genes in the cluster are required to provide the weakly selected function. In addition, if one or more genes in the cluster no longer provide a selective function, then gene loss or cluster loss may occur (these regions would no longer be transferred vertically or horizontally). These gene clusters can propagate to related organisms through homologous recombination and to distantly related organisms by illegitimate recombination (Lawrence and Roth 1996; Huynen 1998). These operons and GIs can encode various functions such as specific metabolic pathways, secretion systems, pathogenicity factors and more (Lorenz 1994; Huynen and Bork 1998; Lawrence and Roth 1996).

2.3 The bacterial secretion systems and pathogenicity

Proteins bound for one of the bacterial secretion systems are called pre-proteins. Pre-proteins are first directed into the periplasm by the general secretion pathway (Pallen et al., 2003). Virulence proteins are generally secreted via the type II secretion system (T2SS), type III secretion system (T3SS), or the type V secretion system (T5SS).


The T2SS directs proteins that stay in the periplasm, are bound in the outer membrane or are secreted outside of the bacterial cell (Sandkvist 2001; Filloux 2004). The T2SS was first discovered in Klebsiella by d’Enfert et al., (1987). Proteins secreted include lipases, esterases, cellulases, phosphatases, proteases, xylanases, chitinases and other degrading enzymes (Sandkvist 2001). Sandkvist (2001b) reviewed the machinery (Gsp proteins) of the T2SS in detail.

Pre-proteins fold after arrival in the periplasm and the GspDQ protein recognizes and cleaves the N-terminal signal peptide (Filloux 2004). Filloux (2004) and Cianciotto (2005) have shown that single amino acid substitutions to the signal peptide of secreted proteins of the T2SS may yield non-secreted proteins. Translocation out of the cell occurs through a pseudo-pilus (made of Gsp proteins) that is similar to Pil proteins that form the pilus of the type four secretion system (Sandkvist 2001; Filloux 2004). Sandkvist (2001b) also revealed homology between the T2SS machinery of distant strains, likely due to the horizontal transfer of these genes on pathogenicity islands (Huynen and Bork 1998).

The T5SS is similar to the T2SS, except T5SS proteins also contain a C-terminal domain (Henderson et al., 1998) that forms a pore in the membrane, allowing translocation outside of the cell. To release the mature protein from the membrane the C-terminal domain is auto-catalytically cleaved (Henderson et al., 1998). Therefore, the C-terminal end of a protein secreted by the T2SS may evolve to be auto-transported by the T5SS. This may explain why proteins secreted by the T2SS in one organism may be secreted in a different organism that lacks the T2SS (Szczesny et al., 2010).

Proteins secreted by the T2SS include pathogen-associated molecular patterns (PAMPs) that are recognized by the plant and activate the plant defense response. Bacteria recognize the defense response and respond with the activator of the T3SS (Bent and Mackey 2007). The T3SS injects proteins directly into a compatible host cell to interact with host defenses. Host defenses include proteins encoded by plant resistance (R) genes. Plant defenses may be evaded by the pathogen to cause a disease response on the plant. A resistant plant may produce the hypersensitivity response (HR) or show other degrees of resistance ranging from a decrease in disease onset to complete resistance. A pathogen that infects an incompatible plant will elicit no host response; however, the bacterium may grow as an endophyte and the plant may act as a reservoir of bacteria.

The T3SS machinery is encoded by the HR and pathogenicity (hrp) genes and secretes effector proteins that are encoded by avirulence (avr) genes (reviewed in Bent and Mackey 2007; Gurlebeck et al., 2005). These authors have shown that the number of effectors can differ from twenty to over one hundred different proteins in bacterial strains. Effectors are controlled by the transcription factors HrpX and HrpG, which recognize plant-inducible-promoter boxes (PIP) upstream of some virulence related genes (Tsuge et al., 2004; Koebnik et al., 2006). Tsuge et al., (2004) have shown that one or two base substitutions to the PIP boxes controlled by HrpX do not cause a complete loss of transcription, and that the PIP sequence may not be very stringent (review of PIP in Xanthomonas is discussed below). Therefore, not all genes that are controlled by hrp transcription factors may have been identified to date. Yamazaki et al., (2008) have shown that hrp transcription factors induce eleven T2SS proteins in Xanthomonas axonopodis pv. citri.

2.4 The Xanthomonadaceae family

The Xanthomonadaceae family contains several genera of plant-associated bacteria. Agricultural pathogens include Xanthomonas, meaning “yellow entity”, and Xylella (Meyer and Bogdanove 2009). Xylella has a reduced genome, similar to X. albilineans (Pieretti et al., 2010), is xylem-limited and is transmitted by an insect vector (Meyer and Bogdanove 2009). Stenotrophomonas, a related genus, contains plant- and soil-associated, as well as opportunistic, bacteria (Palleroni and Bradbury 1993). This genus was originally named Pseudomonas maltophilia, which was subsequently renamed Xanthomonas maltophilia and later renamed to several species of Stenotrophomonas, meaning “one that feeds on few substrates” (Brenner et al., 2005). This name better represented the differences observed between Xanthomonas and Stenotrophomonas and restored Xanthomonas to the original definition before the addition of Stenotrophomonas (Brenner et al., 2005).

2.4.1 Historical Xanthomonas nomenclature

Xanthomonas strains infect over 392 hosts (Hayward 1993). Xanthomonas species were originally named based on their pathogenicity on specific hosts. In 1965, Xanthomonas contained well over 100 species, many of which had not been tested on a wide range of hosts (Hayward 1993). Nomenclature based on host specificity requires a lot of time and space to grow and test host plants. Most problematic with this nomenclature, however, is the fact that pathogenicity is the phenotypic manifestation of only one gene or a very small number of genes, which are non-essential and thus easily lost or transferred by HGT.

The majority of Xanthomonas species were collapsed to a single species, X. campestris, by Dye and Lelliott (1974). The species X. albilineans, X. fragariae, X. oryzae, and X. populi were not collapsed into X. campestris. A trinomial pathovar designation was used to name strains. Pathovars were named based on the host range of the strain and the symptoms produced on the host. Similar to the previous naming of Xanthomonas species, strains with a given pathovar designation were difficult to test on a wide range of host plants. The existence of pathogenicity islands may make naming a strain with a trinomial pathovar designation difficult (Lawrence 2002).

2.4.2 Current Nomenclature of Xanthomonas

A comprehensive reclassification of Xanthomonas species based on DNA-DNA hybridization was completed by Vauterin et al., (1995). This analysis led the authors to increase the number of Xanthomonas species from five to twenty and to reorganize approximately 200 pathovars of X. campestris into new species. Sixty-six strains that had pathovar designations were not placed into any specific species. Some species were further updated using three different DNA fingerprinting methods by Rademaker et al., (2005) and DNA-DNA hybridization methods by Jones et al., (2004) and Schaad et al., (2006). Some pathovars within these species were proposed to be renamed to subspecies. Young et al., (2008) completed a robust MLSA using four housekeeping genes, which was in agreement with the naming scheme proposed by Vauterin et al., (1995).

The proposed naming schemes have caused several changes to Xanthomonas species, subspecies and pathovar designations. For example, several former pathovars of X. campestris, which were originally named based on their pathogenicity to specific hosts or tissues of the same host, were placed into subgroups of the new species X. axonopodis, based on genomic and phenotypic analysis by Vauterin et al., (1995). One important pathovar, the causal agent of citrus canker, Xac (formerly X. campestris pv. citri, pathotype A), affects all varieties of citrus under most field conditions (Brunings and Gabriel 2003). Xac strains with a limited host range in the field have been named to pathotypes B and C of Xac; however, all strains can affect all varieties of citrus under laboratory conditions (Brunings and Gabriel 2003). Schaad et al., (2006) have renamed these two pathotypes to the species X. fuscans. This has led to several different taxonomic designations that are currently in use to name Xanthomonas strains.

2.4.3 Taxa of sequenced Xanthomonadaceae strains

The genomes of nineteen Xanthomonas, one Pseudoxanthomonas, six Xylella and four Stenotrophomonas strains have been completed or are currently in draft. The list of all sequenced Xanthomonadaceae strains used in this study is given in Table 2.1.

The genus Xanthomonas contains the greatest number of strains that have been completely sequenced in the Xanthomonadaceae family. Xac subtype A was one of the first sequenced Xanthomonas strains. This strain, possibly introduced on fruit imported from Asia, caused devastation to the Florida citrus industry costing the state over $200 million in damage (Brunings and Gabriel 2003; Gottwald et al., 2001; Gottwald et al., 2002; Brown 2001) and is currently listed as a select agent by the USDA (Hawks 2005). Xanthomonas axonopodis pv. citri subtypes B and C (X. fuscans subsp. aurantifolia pathotypes B and C, respectively), have also been sequenced and provide additional genetic information to understand the citrus canker disease (Moreira et al., 2010).

Table 2.1 (Next Page). Sequenced strains used in this study.

X. campestris pv. campestris (Xcc) strain ATCC33913 was sequenced and compared to Xac_306 by da Silva et al., (2002), revealing strain specific genes that help to define the different phenotypes of the two pathogens. X. campestris pv. campestris causes black rot of crucifers and the other pathovars (raphani, aberrans and armoraciae) cause leaf spotting diseases (Hayward 1993; Vauterin et al., 1995; Brenner et al., 2005). Xcc strain 8004 was sequenced and compared to Xcc strain ATCC33913 by Qian et al., (2005), revealing that IS elements constitute the major genetic differences between the two strains, as exemplified by rearrangements, gene gain and loss. The genome sequence of Xcc strain B100 was completed in 2008 and used to further understand the xanthan biosynthesis pathway in the three sequenced Xcc strains (Vorholter et al., 2008).

The genome of X. euvesicatoria (Xe, also named X. axonopodis pv. vesicatoria [Xav]) strain 85-10 was compared with Xcc strain ATCC33913 and Xac_306 by Thieme et al., (2005), who found a large number of unique coding sequences and overall genome plasticity. Xav is the causal agent of bacterial wilt and leaf spot on tomato and pepper.

Xanthomonas oryzae pv. oryzae (Xoo), listed as a select agent (Hawks 2005), and Xanthomonas oryzae pv. oryzicola are two important pathogens of rice. Xoo strain KACC10331 was sequenced in 2005 (Lee et al., 2005). Lee et al., (2005) identified that twenty percent of the genes in this organism were unique compared with other sequenced Xanthomonas species. Xoo strain MAFF311018 was sequenced; however, no study was published on this genome. After Xoo strain PXO99A was sequenced, the strain was compared with both of the previously sequenced Xoo genomes (Salzberg et al., 2008), revealing that Xoo strain PXO99A exhibited extensive genome plasticity: 87 genes were unique to this genome compared to the two other Xoo strains.

The genome of Xanthomonas albilineans, the causal agent of sugar cane leaf scald, was completed in 2009 (Pieretti et al.) and compared to the genome of the xylem-limited Xylella fastidiosa pathogen. Pieretti et al., (2009) concluded that the reduced genomes of X. albilineans and X. fastidiosa arose independently and that the occupation of the xylem, a shared niche, may have favored a reduced genome. Other sequenced Xanthomonas genomes are not completely assembled.

A strain of X. axonopodis pv. dieffenbachiae that contains the LUX cassette was sequenced using 454 sequencing and is currently being assembled by Glorimar Marrero in our laboratory (personal communication). X. axonopodis pv. dieffenbachiae infects aroids including Anthurium, thus the pathogen is important both globally and to Hawaiian agriculture (Hayward 1993).

The genomes of X. gardneri and X. vesicatoria (Potnis et al., 2011) have been sequenced and compared to Xav_85-10 to identify pathogenicity factors to aid in creating tomato cultivars. X. vasicola subsps. vasculorum (pathogenic on sugarcane and maize) and musacearum (pathogenic on banana) have been sequenced and compared to help understand the factors involved in causing banana wilt disease (Studholme et al., 2010).

Other genera within the Xanthomonadaceae have been completed or published in draft form. These genomes will help elucidate the precursor gene cluster that contains XAC3314. Of six proposed species of Stenotrophomonas (Ryan et al., 2009), two have been sequenced. Stenotrophomonas maltophilia strains K279 and R551 were isolated from an immunocompromised patient and from the environment, respectively (Crossman et al., 2008; Ryan et al., 2009). S. rhizophila strain Dr-63A, isolated from cabbage seed in the 1980s, was sequenced using 454 pyrosequencing and assembled by Grace Kwan in our laboratory (2010). Six completed Xylella fastidiosa genomes have been analyzed and are also publicly available (Simpson et al., 2000; Van Sluys et al., 2002b; Bhattacharyya et al., 2002; Doddapaneni et al., 2006; Chen et al., 2010).

2.5 The secretion systems of sequenced Xanthomonas strains

Xanthomonas species contain the T2SS and T3SS (da Silva et al., 2002; Qian et al., 2005; Vorholter et al., 2008; Lee et al., 2005; Salzberg et al., 2008; Studholme et al., 2010; Potnis et al., 2011; Moreira et al., 2010; Pieretti et al., 2009). The proteins secreted via the T2SS of Xanthomonas include amylases, proteases, cellulases, pectate lyases, and xylanases, as well as “hypothetical proteins” (Szczesny et al., 2010; Yamazaki et al., 2008).

Proteins secreted via the T2SS in Xac (Yamazaki et al., 2008) and Xav (Szczesny et al., 2010) were analyzed using a HrpG+ strain that induces the T3SS under nutrient-rich conditions. Both the Xac and Xav HrpG transcription factor induced some proteins that are secreted with the T2SS. Yamazaki et al., (2008) revealed that eleven proteins were secreted outside of the Xac cell, including six proteases and three conserved hypothetical proteins (including XAC3314). In addition, two proteins dependent on the T2SS in Xac, XAC2853 and XAC0661, had homologous proteins in Xav (Szczesny et al., 2010). Interestingly, the authors found that the homologous proteins in Xav were not dependent on the T2SS. This may be true for other secreted proteins in Xav (based on homologies with other T2SS proteins from other Xanthomonas sp.). Although the proteins were no longer controlled by the T2SS in Xav, these proteins were still activated by HrpG and HrpX mutant strains (Szczesny et al., 2010). Koebnik et al., (2006) expanded on the previously identified PIP upstream of proteins activated by HrpX in Ralstonia (TTCG-N16-TTCG) with X. euvesicatoria (TTCGC-N15-TTCGC and TTCGC-N8-TTCGT). Xanthomonas species likely share PIP sequences as Xav PIP sequences are a subset of the Ralstonia PIP. Some proteins secreted by the T2SS in Xac and Xcc have homologous proteins in Xav that are not dependent on the T2SS (Szczesny et al., 2010). This indicates that Xanthomonas strains may use different mechanisms to recognize a type II secreted protein (Szczesny et al., 2010). The mechanisms that control secretion of these proteins in Xav, now independent of the T2SS, are unknown (Szczesny et al., 2010), but may have been modified to be auto-transported through the T5SS (see above).
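
The PIP consensus sequences quoted above can be searched for in an upstream region with a short script. The following is a minimal sketch and not the method used by the cited authors: it scans only the forward strand, treats N as any of A, C, G or T, and the function name find_pip_boxes and the example sequence are hypothetical.

```python
import re

# PIP-box consensus patterns as quoted in the text (Koebnik et al., 2006).
# A lookahead is used so that overlapping matches are also reported.
PIP_PATTERNS = {
    "TTCG-N16-TTCG": re.compile(r"(?=(TTCG[ACGT]{16}TTCG))"),
    "TTCGC-N15-TTCGC": re.compile(r"(?=(TTCGC[ACGT]{15}TTCGC))"),
    "TTCGC-N8-TTCGT": re.compile(r"(?=(TTCGC[ACGT]{8}TTCGT))"),
}


def find_pip_boxes(seq):
    """Return (pattern name, 0-based start, matched sequence) for each hit.

    Assumption: only the forward strand is scanned; a real analysis would
    also scan the reverse complement.
    """
    seq = seq.upper()
    hits = []
    for name, pattern in PIP_PATTERNS.items():
        for match in pattern.finditer(seq):
            hits.append((name, match.start(), match.group(1)))
    return sorted(hits, key=lambda hit: (hit[1], hit[0]))


# Hypothetical upstream region carrying one TTCGC-N15-TTCGC box.
example_upstream = "GGA" + "TTCGC" + "A" * 15 + "TTCGC" + "GG"
for hit in find_pip_boxes(example_upstream):
    print(hit)
```

In this made-up example the TTCGC-N15-TTCGC box is also reported under TTCG-N16-TTCG, which illustrates why the Xav pattern can be viewed as a subset of the Ralstonia pattern.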

A genome may contain more than one type of the T2SS and/or T3SS. Xcc, Xav and Xac have two T2SS operons, named the Xcs and Xps operons (Sandkvist 2001; Szczesny et al., 2010; Yamazaki et al., 2008; Lu et al., 2008). Xanthomonas oryzae strains have the Xps operon and no Xcs operon. Lu et al., (2008) believed that the Xps operon may have been acquired more recently in the Xanthomonas lineage than the Xcs operon, and that the hrp genes may have been acquired more recently than the Xps genes. Lu et al., (2008) and Szczesny et al., (2010) have shown that the Xps and not the Xcs operon of X. euvesicatoria are involved in the secretion of proteins in the T2SS. However, they revealed that there is some gene complementation between the Xps operon and homologous proteins in the Xcs operon when a gene is lost in the Xps operon (homologous proteins with a greater percentage of identity provided greater complementation).

Currently the function of XAC3314 is not known (Ewing 2008; Richards et al., 2009; Yamazaki et al., 2008). XAC3314 is regulated through positive and negative competition between HrpG and HrpX (Yamazaki et al., 2008). Yamazaki et al., (2008) revealed that the greatest expression of XAC3314 occurred in a HrpX- mutant that constitutively expresses HrpG. For unknown reasons Szczesny et al., (2010) did not include the XAC3314 homolog in Xav, XAV3432, in their study. Therefore, it is unknown if XAV3432 is dependent on the T2SS.

2.6 Gene transfer activity of XAC3314 in Xanthomonas strains

In order to study the evolution of XAC3314 I used available data, such as genome sequence data that has been deposited in GenBank. I also obtained a diverse set of bacterial strains from the Pacific Bacterial Collection (PBC), which contains over 2,500 Xanthomonas accessions from a variety of hosts, locations, species and pathovars. These were classified with the RIF marker and their XAC3314 homologs were compared to determine the diversity of XAC3314 and their PIP boxes in the population and to determine if the region has undergone HGT since acquisition.

Chapter 3: Analysis of XAC3314 within sequenced genomes

3.1 Introduction

Genes that have originated from a plant have previously been identified in bacteria and fungi (Nembaware et al., 2004; Ewing 2008; Richards et al., 2009). XACPNP, horizontally acquired by Xac_306, has been shown to mimic a plant host response that causes stomates to open (Gottig et al., 2008). These plant-like proteins have been hypothesized to use molecular mimicry to help an organism evade defense mechanisms, create suitable growing environments or increase the fitness of the organism (Gottig et al., 2008). XAC3314, another gene of plant origin in Xac_306, may also provide a selective function through molecular mimicry. These plant-like genes were acquired through HGT and may persist through further HGT events or by vertical inheritance.

The ancestral Xanthomonas strain that acquired XAC3314 and the evolution of XAC3314 thereafter are currently unknown. Xac_306 may have acquired XAC3314 as a gene cluster from another organism or directly from the plant. An anciently acquired genomic island may be difficult to identify using common HGT characteristics such as GC%, dinucleotide frequency and codon usage (Huynen and Bork 1998; Karlin 2001). I have analyzed thirty sequenced Xanthomonadaceae genomes in regions surrounding the XAC3314 homologs to identify the ancestral strain that acquired the gene, whether the gene was part of a larger cluster and how the region has evolved in each strain (i.e., gene gain, loss and rearrangement).
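
Two of the compositional signals named above, GC% and dinucleotide frequency, can be computed directly from sequence. The sketch below is illustrative only and is not the procedure used by any of the cited databases; the function names, window size and step are assumptions chosen for the example. It reports GC% per window together with its deviation from the genome-wide value. For an anciently acquired island these deviations may be small, which is the difficulty described above.

```python
from collections import Counter


def gc_percent(seq):
    """GC content as a percentage of unambiguous (A, C, G, T) bases."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        return 0.0
    return 100.0 * (seq.count("G") + seq.count("C")) / acgt


def dinucleotide_freqs(seq):
    """Relative frequency of each overlapping dinucleotide."""
    seq = seq.upper()
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    pairs = [pair for pair in pairs if set(pair) <= set("ACGT")]
    counts = Counter(pairs)
    total = len(pairs)
    return {pair: n / total for pair, n in counts.items()} if total else {}


def windowed_gc_deviation(genome, window=1000, step=500):
    """Return (start, window GC%, window GC% minus genome GC%) per window.

    Assumption: window and step sizes are arbitrary illustrative defaults.
    """
    overall = gc_percent(genome)
    rows = []
    for start in range(0, max(len(genome) - window + 1, 1), step):
        gc = gc_percent(genome[start:start + window])
        rows.append((start, gc, gc - overall))
    return rows
```

A window whose GC% deviates strongly from the genome average is a candidate for recent acquisition; a small deviation does not rule out an ancient one.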

3.2 Hypothesis

XAC3314 is part of a gene cluster acquired by an ancestral Xanthomonas strain

Objectives:
a) Analyze syntenic regions between XAC3314 in Xac_306 and other genera to identify location of possible gene clusters
b) Identify the inheritance of genes that flank XAC3314
c) Reconstruct the evolution of the genomic island for each strain

3.3 Methods

3.3.1 XAC3314 Protein Annotation

The XAC3314 protein is 49 amino acids shorter than the homolog in Xav, as annotated on GenBank. SignalP (Peterson et al., 2011), an online search tool to identify signal peptides in eukaryotic and prokaryotic proteins, was used to identify a potential N-terminal signal peptide. The DNA region encoding XAC3314 was extended beyond the start codon to the potential start codon used to annotate the XAC3314 homolog in Xav, XAV3432. The extended gene was translated and analyzed with SignalP. The identification of an upstream start codon with no in-frame stop codon between it and the annotated start codon was used to retranslate the gene.
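
The search for an alternative upstream start codon described above can be sketched as follows. This is an illustration under stated assumptions, not the procedure actually used: the function name upstream_starts is hypothetical, the input is assumed to be the coding strand, and ATG, GTG and TTG are assumed to be the acceptable bacterial start codons. The function walks upstream from the annotated start one codon at a time and stops at the first in-frame stop codon.

```python
# Assumed codon sets for a bacterial gene on the coding strand.
START_CODONS = {"ATG", "GTG", "TTG"}
STOP_CODONS = {"TAA", "TAG", "TGA"}


def upstream_starts(dna, annotated_start):
    """Return 0-based positions of in-frame start codons upstream of
    annotated_start, nearest first, stopping at the first in-frame stop.
    """
    dna = dna.upper()
    candidates = []
    pos = annotated_start - 3
    while pos >= 0:
        codon = dna[pos:pos + 3]
        if codon in STOP_CODONS:
            break  # an in-frame stop ends the possible extension
        if codon in START_CODONS:
            candidates.append(pos)
        pos -= 3
    return candidates


def extension_in_codons(annotated_start, new_start):
    """Number of codons added by moving the start upstream."""
    return (annotated_start - new_start) // 3
```

Under this scheme, an extension of 49 amino acids corresponds to a candidate start codon 147 nucleotides upstream of the annotated one.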

3.3.2 Analysis of syntenic regions in Xac_306 that flank XAC3314 in other sequenced Xanthomonadaceae strains

Sequenced Xanthomonadaceae genomes (Table 2.1) were downloaded from the complete or in progress genome page on NCBI. Genomes not available on NCBI were downloaded from the corresponding sequencing project homepage (J. Craig Venter Institute) or retrieved from Glorimar Marrero (Xad_LUX) and Grace Kwan (Sr_DR63a).

The XAC3314 region of the Xac_306 genome was compared to a representative strain from each genus (Xanthomonas, Pseudoxanthomonas, Stenotrophomonas and Xylella) using Mauve (Darling et al., 2010). Genes in these clusters were compared to the HGT information on the HGT-DB (Garcia-Vallve et al., 2003) and IslandPath (Hsiao et al., 2003), online databases to identify genes that may have been acquired by HGT using common genomic and proteomic characteristics. Gene sequences were BLASTed against the NR database to identify the nearest class of bacterial or eukaryotic strains that contain a homolog.

3.3.3 Analysis of the potential evolution of the genomic island

The region flanking the potential genomic hotspot of Xac_306 was compared to the other genomes using Mauve (Darling et al., 2010). Mauve is an alignment tool that allows complete genomes to be compared and identifies locally collinear blocks (regions of synteny) and the homology between the two blocks. The genome for each strain was analyzed to confirm gene loss or gain, or rearrangements within the island. CVtree (Zhao et al., 2009), a tree construction tool that utilizes a k-mer approach to compare entire genomes and proteomes, was used to create a cladogram from the whole proteomes of thirty sequenced Xanthomonadaceae strains and the outgroup strain Leifsonia xyli, using a k-mer length of six. Gene acquisition, loss and rearrangement were visualized on this tree. The tree was analyzed to study the evolution of these regions in the ancestral and sequenced genomes.
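
The idea of comparing whole proteomes by their k-mer content can be illustrated with a deliberately simplified sketch. It is not the CVtree algorithm: CVtree is understood to additionally correct the k-mer frequencies for a background expectation before comparing strains, a step that is omitted here. The function names and the plain cosine distance are assumptions made for the example; only the k-mer length of six is taken from the text.

```python
from collections import Counter
from math import sqrt


def kmer_counts(proteins, k=6):
    """Count overlapping k-mers over all protein sequences of one proteome."""
    counts = Counter()
    for protein in proteins:
        for i in range(len(protein) - k + 1):
            counts[protein[i:i + k]] += 1
    return counts


def cosine_distance(a, b):
    """1 minus the cosine similarity of two k-mer count vectors."""
    shared = a.keys() & b.keys()
    dot = sum(a[kmer] * b[kmer] for kmer in shared)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # no k-mers to compare; treat as maximally distant
    return 1.0 - dot / (norm_a * norm_b)


def distance_matrix(proteomes, k=6):
    """Pairwise distances for a dict of {strain name: list of proteins}."""
    vectors = {name: kmer_counts(seqs, k) for name, seqs in proteomes.items()}
    names = sorted(vectors)
    return {(x, y): cosine_distance(vectors[x], vectors[y])
            for x in names for y in names}
```

A distance matrix of this kind could then be passed to any standard tree-building method such as neighbor joining to obtain a cladogram.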

3.4 Results

3.4.1 The XAC3314 homologs in NCBI

The XAC3314 protein has a homolog in Xav_85-10 (XAV3432), X. vasicola pv. musacearum, X. vasicola pv. vasculorum, X. gardneri, X. perforans and X. fuscans. XAC3314 homologs contain a domain of unknown function, which was incorrectly identified in version 25 of Pfam as glucoamylase and has since been fixed in version 26. The XAC3314 and XAV3432 proteins, as translated in GenBank, differ by 49 amino acids on the N-terminal side. The XAV3432 signal peptide receives a higher SignalP (Peterson et al., 2011) score than the XAC3314 signal peptide (0.853 vs. 0.097). The XAC3314 gene was re-translated for this analysis to include the 49 missing amino acids. The re-translated protein has a greater SignalP score than the GenBank predicted protein (0.828 vs. 0.097), similar to XAV3432. All XAC3314 homologs were confirmed to begin translation in the same region. Xanthomonas vasicola pv. musacearum also has a nonsense mutation at nucleotide 715 (C -> T) that terminates the protein prematurely. This gene may or may not be transcribed, translated or functional. The strain has retained the XAC3314 homolog; therefore, the mutation may not be detrimental to the protein. Further tests would have to be performed with this strain to determine if the XAC3314 homolog is secreted and functional.

3.4.2 Acquisition of ancient gene clusters in Xanthomonas

The Xac_306 genome was analyzed forty kilobases upstream and downstream of XAC3314 using Mauve. Mauve allows genome to genome comparisons to identify syntenic regions shared between two or more strains. This region was compared to Xanthomonas albilineans (Figure 3.1A), Stenotrophomonas rhizophila DR63a (Figure 3.1B), Pseudoxanthomonas (Figure 3.1C), and Xylella fastidiosa 9a5c (Figure 3.1D). An XAC3314 homolog is not present in these four other genomes. This analysis revealed genes that are shared between two or more strains, those that are unique to a strain and those that may have been lost in one or more strains. A gene unique to a strain may have been acquired by HGT by that strain, or lost in all other lineages.

Figure 3.1. Gene cluster insertion sites in the Xanthomonadaceae. Forty kilobase regions upstream and downstream of XAC3314 in Xac_306 compared with X. albilineans (A), Sr_Dr63a (B), Pseudoxanthomonas (C) and Xylella (D) using Mauve. Collinear blocks between the two genomes in the region of XAC3314 (red box between genes C and D) are shown. Genes conserved between the two genomes are labeled in the order of the genes in Xac_306 (from gene A to gene J). Possible gene cluster insertions are shown with blue brackets. Xylella and Pseudoxanthomonas have likely undergone rearrangements in this region compared with the other genomes.

Strains were compared to Xac_306 starting upstream of XAC3314. The first gene conserved upstream of XAC3314 in any of the four genomes compared to Xac_306 is gene C (susB) in Pseudoxanthomonas (Figure 3.1C). Gene C may have been acquired by the ancestral genome of Pseudoxanthomonas and Xanthomonas or may have been acquired in the ancestral genome of the Xanthomonadaceae and subsequently lost in Stenotrophomonas, X. albilineans and Xylella (Figure 3.1). The next conserved gene upstream from gene C is gene B (salR). SalR is present in all genomes except Xylella.

Strains were compared to Xac_306 starting downstream of XAC3314. The first gene downstream of XAC3314 conserved in another genome is gene D (tRNA methyltransferase). All genomes have gene D except Xylella. Xac_306 and Pseudoxanthomonas also contain gene E (acrF), which was likely lost in the three other genomes. The genes conserved between species and genera help reveal the genomic island(s) that may have inserted before and after the last common ancestor of these strains.

Stenotrophomonas, Pseudoxanthomonas and Xylella contain no unique gene cluster in this region; however, Stenotrophomonas contains a unique gene between genes D and E that may have been lost in the other lineages or acquired after the split of Stenotrophomonas and Xanthomonas. Xac_306 contains a small unique cluster of two genes between genes B and C and two genes (including XAC3314) between genes C and D. X. albilineans has a unique region that may have inserted as one or two gene clusters between genes B and C or C and D or both. This cluster was likely acquired in X. albilineans after the split with other Xanthomonas species, but may have inserted before or after the loss of gene B.

The two gene clusters in Xac_306 between genes B and C and genes C and D do not match any genes in the X. albilineans gene cluster. Comparison of the five genomes indicates that genes E-I were likely in the last common ancestor of all Xanthomonadaceae. Genes from E to F were lost in X. albilineans and genes from G to H were lost in Stenotrophomonas and Xylella. The gene clusters present after gene D in Xac_306 and X. albilineans likely inserted in between genes D and E before the loss of genes in X. albilineans. This region, in relation to the dnaA gene, has undergone a rearrangement in Pseudoxanthomonas. The Xac_306 genome may have up to three different gene clusters inserted into this region, one of which contains XAC3314.

These comparisons have revealed three probable gene clusters acquired by an ancestral strain of Xanthomonas after the split from X. albilineans (Table 3.1). Genomic islands may have inserted into the genome over several events, or gene clusters may have been larger and acquired in two HGT events (Figure 3.2).

Table 3.1. The gene cluster in Xac_306 that includes XAC3314.

Gene #  Locus    Name  Full Name                            Closest Non-Xanthomonadaceae Homolog
A       XAC3309  -     Aminopeptidase N                     Gamma-proteobacteria
B       XAC3310  salR  LacI family transcription regulator  Alpha-proteobacteria
1       XAC3311  iroN  TonB-dependent receptor              Alpha-proteobacteria
2       XAC3312  -     glycosyl hydrolase                   Alpha-proteobacteria
C       XAC3313  susB  alpha-glucosidase                    Alpha-proteobacteria
3       XAC3314  -     hypothetical protein                 Eukaryote
4       XAC3315  -     Carboxylesterase                     Alpha-proteobacteria
D       XAC3316  -     tRNA/rRNA methyltransferase          Alpha-proteobacteria
5       XAC3317  -     Acetyltransferase                    Beta-proteobacteria
6       XAC3318  pepN  aminopeptidase N                     Beta-proteobacteria
7       XAC3319  -     hypothetical protein                 Beta-proteobacteria
8       XAC3320  -     ISxac3 transposase                   Beta-proteobacteria
9       XAC3321  -     ISxac3 transposase                   Beta-proteobacteria
10      XAC3322  -     hypothetical protein                 Beta-proteobacteria
11      XAC3323  -     acidic amino acid rich protein       Beta-proteobacteria
12      XAC3324  -     hypothetical protein                 Beta-proteobacteria
13      XAC3325  -     hypothetical protein                 Beta-proteobacteria
E       XAC3326  acrF  acriflavin resistance protein        Beta/Gamma-proteobacteria
F       XAC3327  -     RND                                  Gamma-proteobacteria
G       XAC3328  nodQ  -                                    Gamma-proteobacteria

The three potential gene clusters (numbers) that inserted into the genomes between shared genes (letters) within the Xanthomonadaceae. A protein BLAST was used to determine the closest non-Xanthomonadaceae homolog in the NR database.

Figure 3.2. Potential HGT events in Xac_306. The gene clusters present in Xac_306 may have been acquired in two HGT events (1) or over several HGT events (2). Gene colors and annotations are derived from Table 3.1. The origin of each gene cluster is annotated above the cluster. Arrows represent the insertion site of the gene cluster, i.e., genes 1 and 2 inserted between genes B and C in event (2).

3.4.3 Xanthomonas acquisition of the XAC3314 gene cluster

The gene clusters in Xac_306 were checked in the HGT-DB and IslandPath database (Hsiao et al., 2003; Garcia-Vallve et al., 2003). Genes in the clusters identified above were not present in either of these databases. The gene clusters were likely acquired by the ancestral Xanthomonas genome after the split of X. albilineans and Xanthomonas axonopodis/vasicola/gardneri based on initial Mauve and BLAST comparisons. The third gene cluster in Xac_306 contains the IS3 family of insertion sequences that produce inverted repeats of 20-40 nt (http://www-is.biotoul.fr/). The cluster in Xac_306 does not have an inverted repeat that is greater than seventeen nt. These insertions are likely ancient and their repeats may have degenerated. These will not be useful for determining if the insertion of the clusters was mediated by an IS.

The proteins encoded in the gene clusters (Table 3.1) were compared to non-Xanthomonas proteins in the NR database to identify the possible origin of each gene in the cluster. The first and second gene clusters are most similar to alpha-proteobacteria, except for XAC3314. XAC3314 may have been lost in the genome that initially acquired the gene cluster or the genome that acquired the gene cluster may not have been sequenced yet. The gene cluster that inserted after tRNA methyltransferase (genes 5-13) is most similar to beta-proteobacteria. The second best hit to acrF is a gamma-proteobacterium. This gene provides resistance and was likely horizontally transferred multiple times. After this gene the next genes are most similar to other gamma-proteobacteria, as expected from a region that was acquired by vertical inheritance, since Xanthomonas is a gamma-proteobacterium. The first two clusters may have been acquired as a larger gene cluster or both came from one or more alpha-proteobacteria and the third cluster was likely acquired from a beta-proteobacterium.

3.4.4 Gene gain and loss in the Xanthomonadaceae

The Xac_306 genome was compared to an additional twenty-five sequenced Xanthomonadaceae genomes using Mauve. The genes shared, gained and lost by these strains were analyzed and visualized in a tree with a matrix displaying the presence and absence of different genes in each cluster (Figure 3.3). The cladogram was constructed from the whole proteomes of the thirty sequenced strains using CVtree (Zhao et al., 2009) set to a k-mer of size six. The gain and loss of these genes were displayed on the nodes of the tree (Figures 3.4-3.11). The ancestral genomes of all Xanthomonadaceae likely contained genes A, F and G (Table 3.1). The smaller gene cluster acquisitions that are visualized on the tree hereafter are one of several possibilities (such as the possibilities indicated in Figure 3.2).
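
How a presence and absence pattern can be turned into a minimum number of loss events on a tree is sketched below. This is an illustrative reconstruction under a strong assumption, namely that the gene was present in the common ancestor and can only be lost afterwards; it is not the procedure used to annotate the trees in this chapter. The tree is written as nested tuples and the strain names are hypothetical.

```python
def min_losses(tree, present):
    """Minimum number of loss events needed to explain which leaves carry
    a gene, assuming the gene was present at the root and is never regained.

    tree    -- nested tuples of leaf names, e.g. (("a", "b"), "c")
    present -- set of leaf names that carry the gene
    """
    def visit(node):
        # Returns (gene present somewhere below this node, losses counted).
        if isinstance(node, str):
            return (node in present, 0)
        results = [visit(child) for child in node]
        if not any(has_gene for has_gene, _ in results):
            # Whole clade lacks the gene; the parent counts it as one loss.
            return (False, 0)
        losses = sum(count for _, count in results)
        losses += sum(1 for has_gene, _ in results if not has_gene)
        return (True, losses)

    has_gene, losses = visit(tree)
    # If no leaf carries the gene, a single loss at the root explains it.
    return losses if has_gene else 1


# Hypothetical five-strain tree and presence pattern.
example_tree = ((("strain_A", "strain_B"), "strain_C"), ("strain_D", "strain_E"))
print(min_losses(example_tree, {"strain_A", "strain_D", "strain_E"}))
```

Each clade in which every strain lacks the gene is counted as a single loss on the branch leading to that clade, which is why closely related strains that all lack a gene add only one event.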

The cluster of genes most similar to alpha-proteobacteria (Table 3.1) may have been acquired in a single event and undergone loss in other lineages. Similarly, the cluster of genes most similar to beta-proteobacteria may have been acquired in a single event. Xylella, which underwent a reduction in genome size, lost genes A-F and has undergone a genome rearrangement in the collinear block at gene F (Figure 3.5). Stenotrophomonas has lost genes C (Sm_R551 and S_SKA14) and F (Sr_Dr63a). Not depicted on the tree is the acquisition of four genes in a cluster by Stenotrophomonas that likely occurred before the loss of gene C. The cluster inserted between genes B and C (transmembrane protein, a hypothetical protein, trmF and an amino acid carrier protein). Subsequently Steno_SKA14 has lost trmF, Sm_K279 has lost a hypothetical protein and trmF and Sr_Dr63a has retained the amino acid carrier protein. One gene cluster acquired by X. albilineans (Figure 3.6) likely inserted between genes B and C or genes C and D. Gene C was likely lost after this cluster inserted. Therefore, it is not possible to tell if genes B and C or genes C and D were the insertion sites for the gene cluster in X. albilineans. This gene cluster includes hemagglutinin (which normally causes aggregation of blood cells), hemolysin activation, transposases and hypothetical proteins. The second gene cluster of two genes (a polysaccharide deacetylase and a hypothetical gene) inserted in X. albilineans between genes D and E.

Figure 3.3. XAC3314 has been lost in several of the sequenced Xanthomonas strains. The letters and color scheme represent the loci in Table 3.1 and Figure 3.2. The CVtree was constructed from the whole proteome using a k-mer of 6. Arrows into the tree represent a node that acquired the genes based on Figure 3.2. The matrix next to the tree represents the presence and absence of different genes in the gene cluster described in Table 3.1 if labeled with the appropriate alpha-numeric and color. Other color boxes represent genes that are not present in Xac_306. The other genomes contain annotated proteins (gray boxes), hypothetical proteins (black boxes) or transposases (peach boxes). The green boxes in the XAC3314 column represent strains that have lost the gene. The least possible gene loss events are depicted on the appropriate node (green star).

Figure 3.4. Genes shared by Xanthomonadaceae strains. The letters and color scheme represent the loci in Table 3.1 and Figure 3.2. The CVtree was constructed from the whole proteome using a k-mer of 6. Arrows into the tree represent a node that contains the genes or acquired the genes, arrows into a gene represent gene loss and blue arrows over one or more genes represent a rearrangement. Arrows from a gene cluster in between two genes represent the insertion point of the cluster, i.e., genes B and C inserted in between genes A and F.

Figure 3.5. Genes were lost by Stenotrophomonas and rearranged in Xylella. The letters and color scheme represent the loci in Table 3.1 and Figure 3.2. The CVtree was constructed from the whole proteome using a k-mer of 6. Arrows into the tree represent a node that contains the genes or acquired the genes, arrows into a gene represent gene loss and blue arrows over one or more genes represent a rearrangement.

3.4.4 Gene cluster evolution in Xanthomonas

The Xanthomonas species that evolved after the split from X. albilineans (X. campestris, X. axonopodis [X. euvesicatoria, X. citri, X. perforans, X. fuscans], X. vasicola, X. gardneri and X. vesicatoria) had two or more gene clusters (Figure 3.7 and Table 3.1) insert between genes A and F (Figures 3.8-3.11). Gene acquisitions visualized in these trees are unique to the individual clade or strain (unless otherwise noted). However, these acquisitions may have occurred with the initial genomic cluster and were subsequently lost in different lineages. All lineages except Xanthomonas vasicola, X. gardneri and X. axonopodis have lost XAC3314. In addition, X. vesicatoria (Figure 3.8) has lost gene D, all X. oryzae strains (Figure 3.10) have lost genes B, 1, 2 and C (Xoo_311018 has also lost the acetyltransferase gene), Xad_LUX (Figure 3.11) has lost XAC3314 and X. perforans had genes F and G and the subsequent nodP gene rearranged within the genome. X. gardneri, X. vesicatoria and X. campestris contain a cellobiosidase gene (used to break down sugars) and hypothetical genes that inserted between gene B and gene 1 (Figure 3.8). These genes may have been acquired with the gene cluster and lost in other lineages, or may derive from more recent insertions.

Figure 3.6. A large gene cluster inserted into an ancestral X. albilineans genome after the split of X. albilineans from other Xanthomonas species. The letters, numbers and color scheme represent the loci in Table 3.1 and Figure 3.2. The CVtree was constructed from the whole proteome using a k-mer of 6. Arrows into the tree represent a node that contains the genes or acquired the genes, arrows into a gene represent gene loss and blue arrows over one or more genes represent a rearrangement. Arrows from a gene cluster in between two genes represent the insertion point of the cluster, i.e., the gene cluster containing polysaccharide deacetylase and a hypothetical protein inserted in between genes D and E. Genes unique to an organism (not listed in Table 3.1) are labeled below the gene (red arrows represent annotated proteins, peach arrows represent transposases and black arrows represent hypothetical proteins). The large gene cluster in X. albilineans may have inserted between genes A and B, genes B and C or may have inserted as two smaller clusters before the loss of gene B.

Figure 3.7. XAC3314 inserted into an ancestral Xanthomonas genome as a gene cluster that arose after the split of X. albilineans. The letters, numbers and color scheme represent the loci in Table 3.1 and Figure 3.2. The CVtree was constructed from the whole proteome using a k-mer of 6. Arrows into the tree represent a node that contains the genes or acquired the genes, arrows into a gene represent gene loss and blue arrows over one or more genes represent a rearrangement. Arrows from a gene cluster in between two genes represent the insertion point of the cluster, i.e., genes 5 and 6 inserted in between genes D and E. Genes unique to an organism (not listed in Table 3.1) are labeled below the gene (red arrows represent annotated proteins, peach arrows represent transposases and black arrows represent hypothetical proteins).

X. vasicola contains a hypothetical gene and a cytochrome P-450 gene (involved in metabolic processes) that likely inserted between genes B and 1. X. vasicola pv. vasculorum also has undergone a rearrangement from gene 3 to the middle of gene 6 (Figure 3.9), reverse complementing this section in the N-terminal region of gene 6 (pepN). The partial pepN gene has been retained in this genome and may still be active, or this may be a more recent rearrangement and the genome has not yet lost the gene.

The alanine carrier protein between genes D and E in X. campestris likely inserted with the third gene cluster and was lost by other lineages; otherwise, the alanine carrier protein is a recent acquisition and should exhibit genomic and proteomic characteristics of a recent HGT event. The genes in the Xac_306 gene cluster were not identified by IslandPath or HGT-DB as being horizontally transferred (Hsiao et al., 2003; Garcia-Vallve et al., 2003), which suggests that none of the genes (Table 3.1) were recently acquired.

Xanthomonas oryzae strains contain several transposases that likely inserted in these regions recently and may have mediated the loss of the gene cluster between genes B and D. The transposases (in Xac_306) and hypothetical proteins (in Xac_306 and Xe_85-10) retained in these strains in the last gene cluster (between genes D and E) may be from a genome rearrangement, as these genes are present in other strains in different locations and the transposases are not unique within the genome.

Figure 3.8. XAC3314 was lost by X. campestris, X. gardneri and X. vesicatoria. The letters, numbers and color scheme represent the loci in Table 3.1 and Figure 3.2. The CVtree was constructed from the whole proteome using a k-mer of 6. Arrows into the tree represent a node that contains the genes or acquired the genes, arrows into a gene represent gene loss and blue arrows over one or more genes represent a rearrangement. Arrows from a gene cluster in between two genes represent the insertion point of the cluster, i.e., the alanine carrier gene inserted in between genes C and D. Genes unique to an organism (not listed in Table 3.1) are labeled below the gene (red arrows represent annotated proteins, peach arrows represent transposases and black arrows represent hypothetical proteins). The alanine carrier protein in this clade may have been acquired with the original gene cluster and lost in other lineages.

Figure 3.9. XAC3314 has been retained in X. vasicola. The letters, numbers and color scheme represent the loci in Table 3.1 and Figure 3.2. The CVtree was constructed from the whole proteome using a k-mer of 6. Arrows into the tree represent a node that contains the genes or acquired the genes, arrows into a gene represent gene loss and the arrow over the gene cluster represents a rearrangement. Arrows from a gene cluster in between two genes represent the insertion point of the cluster, i.e., the cytochrome P450 and two hypothetical genes inserted in between genes B and 1. Genes unique to an organism (not listed in Table 3.1) are labeled below the gene (red arrows represent annotated proteins, peach arrows represent transposases and black arrows represent hypothetical proteins). The Cytochrome P450 gene in this clade may have been acquired with the original gene cluster and lost in other lineages. X. vasicola pv. vasculorum has also undergone a rearrangement flipping the order of genes 3, 4, C, 5 and 6, cutting the N-terminal end of gene 6 (pepN).

Figure 3.10. X. oryzae strains have lost XAC3314. The letters, numbers and color scheme represent the loci in Table 3.1 and Figure 3.2. The CVtree was constructed from the whole proteome using a k-mer of 6. Arrows into the tree represent a node that contains the genes or acquired the genes, arrows into a gene represent gene loss and blue arrows over one or more genes represent a rearrangement. Arrows from a gene cluster in between two genes represent the insertion point of the cluster, i.e., a transposase gene inserted in between genes F and G. Genes unique to an organism (not listed in Table 3.1) are labeled below the gene (red arrows represent annotated proteins, peach arrows represent transposases and black arrows represent hypothetical proteins).

Figure 3.11. XAC3314 has been retained and lost in different X. axonopodis strains. The letters, numbers and color scheme represent the loci in Table 3.1 and Figure 3.2. The CVtree was constructed from the whole proteome using a k-mer of 6. Arrows into the tree represent a node that contains the genes or acquired the genes, arrows into a gene represent gene loss and blue arrows over one or more genes represent a rearrangement. Arrows from a gene cluster in between two genes represent the insertion point of the cluster, i.e., a hypothetical gene inserted before gene 6. Genes unique to an organism (not listed in Table 3.1) are labeled below the gene (red arrows represent annotated proteins, peach arrows represent transposases and black arrows represent hypothetical proteins). X. perforans has undergone a genome rearrangement at the end of the third gene cluster. Three of the six sequenced strains have lost the XAC3314 homolog. The red arrow pointing up represents a region in the Xfa10535 genome that has many gaps in the cluster that the arrow originates from.

3.5 Discussion

3.5.1 The XAC3314 gene cluster likely originated after the rise of plants

The DUF239 domain is found in “land plants”, both non-vascular (Physcomitrella, a moss) and vascular (monocots and dicots), and likely originated in the last common ancestor of these groups. Using sequence data and fossil records, Sanderson et al. (2004) estimated that the rise of the land plants occurred 483-490 mya with a penalized likelihood approach and 435-425 mya under a molecular clock assumption. Some unknown organism, likely of alpha-proteobacterial origin (Table 3.1), may have acquired the gene at this time.

X. campestris and X. axonopodis are estimated to have split from a common ancestor approximately 100-200 mya and the more distant Xylella genus shared a common ancestor approximately 600-700 mya (Battistuzzi et al., 2004). Stenotrophomonas, Pseudoxanthomonas and X. albilineans likely shared a common ancestor with X. axonopodis between 200 and 600 mya. The gene cluster containing the XAC3314 homolog was likely acquired by HGT from the plant-associated organism, which had acquired the original XAC3314 gene from a plant, after the last common ancestor of X. albilineans and the other Xanthomonas species.

3.5.2 Horizontal Gene Transfer and the gene clusters in Xanthomonas

XAC3314 was likely an ancient acquisition (Richards 2009) and proteins with DUF239 domains have been identified in other bacteria and fungi. The gene clusters identified in this study were not present in any current HGT database (Hsiao et al., 2003; Garcia-Vallve et al., 2003). The databases use G/C%, dinucleotide bias, codon usage and amino acid content to identify genes that have been acquired through horizontal transfer. Genes flagged in this way are likely to have been acquired from another organism whose DNA characteristics differ from those of the native genome. If a gene is not in these databases it may still have been acquired horizontally, but the characteristics of the gene are too similar to the native genome to be computationally identified. Regions that were anciently acquired and have had time to adapt to the native genome, or those acquired from a related strain, will not be picked up by these methods.

The gene clusters identified in Xac_306 may have been acquired from two different proteobacteria lineages. This is not unusual for a Xanthomonas genome and many genes exhibit a closer relationship outside of the gamma-proteobacteria class (Comas et al., 2006). Genetic regions necessary for survival of the gamma-proteobacteria were likely vertically inherited.

3.5.3 Evolution of the XAC3314 gene cluster

The potential gene clusters visualized in Figures 3.6-3.12 depict regions that are shared, gained, lost and rearranged in different Xanthomonas species and strains. Gene clusters may have inserted as a single acquisition, in which case any gene missing from a strain reflects a gene loss. Gene clusters depicted on the trees are split to help visualize genes that are unique to one lineage or another. These depictions of the gene clusters may or may not be correct, as a unique gene may have been lost in all other lineages. This would occur if the gene provides a specialized selective function. An ancestral genome may have acquired these gene clusters in large segments. If this occurred, then genes unique to a lineage would either have been acquired recently or lost in all other lineages. The genes in Xac_306 show no evidence of recent acquisition; however, it is possible that genes in the other strains may exhibit HGT characteristics. The smaller gene clusters are more likely to be acquired as described by Lawrence and Roth (1996) with their “Selfish Operon Model.” The gene cluster that likely contains XAC3314 is small, and small clusters are more efficiently acquired horizontally (Lawrence 2002). Genes in the gene clusters undergo gene loss, as several strains have lost different genes (Moran et al., 2011). This is most likely to occur if the encoded protein no longer provides a useful function in a strain. Evidence of recent transfer events of the island between strains, as opposed to smaller clusters within the island, would provide further evidence that the islands present in these Xanthomonas strains are from a single insertion event. An analysis of a greater number of sequenced genomes may help reveal which genes were more likely acquired by HGT.

3.6 Conclusion

In this analysis I have expanded the number of XAC3314 homologs based on the completely sequenced strains of Xanthomonas. I have used other sequenced Xanthomonadaceae strains to identify regions unique and shared between the lineages, which revealed two different genomic islands that have inserted into the same insertion site (one in X. albilineans and the other in Xac_306). Further analysis of the gene clusters in Xac_306 revealed that there were at least two independent HGT events, one from one or more alpha-proteobacteria and one from a beta-proteobacterium. The evolution of the original gene cluster among the Xanthomonas strains has revealed gene loss and gene rearrangement in this region. A larger set of Xanthomonas strains will be analyzed (Chapter 5) to confirm the loss of XAC3314 in the different lineages and to determine whether XAC3314 has undergone HGT since acquisition.

Chapter 4: Classification of Xanthomonas strains using RIF, a computationally derived DNA marker

4.1 Introduction

A robust, single-marker classification framework for Xanthomonas is required to study the diversity of XAC3314 within a large collection of Xanthomonas strains that were isolated from hundreds of plants over the past 37 years, during which the taxonomy of the genus Xanthomonas has changed dramatically. An ideal DNA marker for Xanthomonas classification is more variable than the rDNA regions, present in all target organisms as a single copy per genome, unlikely to be transferred by horizontal gene transfer, and amplifiable with Xanthomonas-specific primers. I have used six completely sequenced Xanthomonas genomes (da Silva et al., 2002; Qian et al., 2005; Lee et al., 2005; Thieme et al., 2005) representing three species and four pathovars to computationally identify RIF, a region of the single-copy dnaA gene, as the best marker to distinguish closely related Xanthomonas strains. RIF sequences were then used to classify a large number of Xanthomonas reference strains from a diverse subset of the Pacific Bacterial Collection (PBC) and the International Collection of Microorganisms from Plants (ICMP). In 2010, the PBC at the University of Hawaii consisted of over 2,500 accessions of Xanthomonas; the ICMP at Landcare Research of New Zealand contained over 13,000 microbial strains, including 1,331 belonging to the genus Xanthomonas. RIF frameworks were used to study XAC3314 in Chapter 5.

4.2 Hypothesis

A single, useful DNA marker that has greater resolution than rDNA and a similar phylogeny to that of a multi-locus analysis can be computationally identified.

Objectives:
a) Computationally derive the useful DNA marker, RIF
b) Create sequence frameworks using RIF with Xanthomonas strains in the PBC

4.3 Methods

4.3.1 Computational identification of suitable marker regions

The computational identification of a DNA barcode region that distinguishes closely related Xanthomonas strains was performed with completely sequenced Xanthomonas genomes (Xcc_8004, Xcc_33913, Xe_85-10, Xccit_306, Xoo_10331 and Xoo_311018 in Table 2.1) as follows. MUMmer (Kurtz et al., 2004), a tool used to find perfect unique and multi-hit matches between two strains, was used for this analysis. First, SNPs were identified between the two Xanthomonas oryzae pv. oryzae genomes Xoo_10331 and Xoo_311018 using MUMmer (Kurtz et al., 2004) to match all conserved regions of 20 nucleotides or greater. A similar analysis was performed for two strains of X. campestris pv. campestris (Xcc_33913 and Xcc_8004). Nucleotides of Xoo_311018 and Xcc_33913 that had no MUMmer matches in the other strain of the same pathovar were masked with a Perl script. Second, single-copy oligos in Xoo_311018 that were perfectly conserved in the five other Xanthomonas genomes (Xoo_10331, Xcc_33913, Xcc_8004, Xccit_306 and Xe_85-10) were identified using MUMmer (Kurtz et al., 2004) in sequential comparisons (options: -mum -l 20 -b), and will be referred to as conserved 20+-mers. In this analysis, the regions conserved between Xoo_311018 and genome two were used as a query with genome three and so on. Third, a Perl script was used to identify regions in all six Xanthomonas genomes that a) were flanked by conserved 20+-mers, b) were separated by >550 nucleotides, c) produced an amplicon of <1,000 nucleotides, and d) contained at least 20 masked-out nucleotides (i.e. containing 2 or more SNPs) in Xoo_311018. The same analysis was performed using the masked genome of Xcc_33913. Fourth, another Perl script was used to align the extracted regions from the six Xanthomonas genomes using ClustalW (Chenna et al., 2003) and output a distance matrix of nucleotide differences to confirm that orthologous regions from any two Xanthomonas strains contained one or more SNPs. Finally, a Perl script was used to extract every gene from the identified regions in Xcc_33913 and Xoo_311018, and confirm the presence of the gene in nine completely sequenced genomes of Ralstonia (NC_008313, NC_007347, NC_007973), Clavibacter (NC_009480, NC_0104087), Pectobacterium (NC_012917, NC_013421), Erwinia (NC_012214, NC_010694) and Dickeya (CP002038, NC_012912, NC_012880, NC_013592) with a default BLAST (Altschul et al., 1990) comparison. The BLAST results were checked by hand to confirm the presence of the complete gene in the sequenced genomes; only one gene (dnaA) was present in full in all genomes.
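The third step (selecting regions flanked by conserved 20+-mers that satisfy the spacing, amplicon length and SNP criteria) can be sketched as follows. This is an illustrative Python reconstruction, not the Perl script used in this work; the function name, the half-open coordinate convention and the use of a direct SNP count in place of the masked-nucleotide count are all assumptions.

```python
from bisect import bisect_left


def candidate_amplicons(anchors, snp_positions,
                        min_gap=550, max_amplicon=1000, min_snps=2):
    """Return (start, end, n_snps) for anchor pairs that meet all criteria.

    anchors: (start, end) half-open coordinates of conserved 20+-mers.
    snp_positions: coordinates of SNPs between the two strains of a pathovar.
    """
    anchors = sorted(anchors)
    snps = sorted(snp_positions)
    hits = []
    for i, (l_start, l_end) in enumerate(anchors):
        for r_start, r_end in anchors[i + 1:]:
            if r_start - l_start >= max_amplicon:
                break  # anchors are sorted by start; later ones are farther away
            gap = r_start - l_end        # sequence between the two anchors
            amplicon = r_end - l_start   # anchors included
            if gap <= min_gap or amplicon >= max_amplicon:
                continue
            # Count SNPs that fall strictly between the two anchors.
            n_snps = bisect_left(snps, r_start) - bisect_left(snps, l_end)
            if n_snps >= min_snps:
                hits.append((l_start, r_end, n_snps))
    return hits
```

Each returned tuple corresponds to one potential amplicon; candidate regions from the two masked genomes could be produced by calling the function once per pathovar comparison.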

4.3.2 Copy number determination of bacterial dnaA genes

A set of 3,020 bacterial DnaA protein sequences was downloaded by querying the NCBI protein database (limited to bacterial organisms) with “dnaA” and “350:1000” limited to gene and SLEN, respectively. Of these, only 1,188 protein sequences whose headers contained the terms “dnaA”, “chromosomal replication initiator” or “chromosomal replication initiation” were kept. Separately, all 1,159 completely sequenced bacterial and archaeal genomes were downloaded from NCBI (ftp://ftp.ncbi.nih.gov/genbank/genomes/bacteria) using a Perl script. The directory containing the genome “Escherichia coli strain RS218” was left out as it improperly contained Enterobacteria phage CUS-3. The subset of 1,067 bacterial genomes on the “List of prokaryotic names with standing in nomenclature” (http://www.bacterio.cict.fr) and the list of cyanobacterial genera (http://www.cyanodb.cz/valid_genera) were used in this analysis. Subsequently, a protein set was generated for each of the 1,067 bacterial genomes. Twenty-one proteins from these sequenced genomes that were annotated as “dnaA”, “chromosomal replication initiator” or “chromosomal replication initiation” but whose accession number was not in the DnaA protein set were added to the protein set for a total of 1,209 sequences. Another Perl script was used to identify copies of the DnaA protein in the genome of each strain using BLAST (-p tblastn -m8 -X 300 -e 1e-10) and the 1,209 DnaA proteins as a query. Any gene in the 1,067 bacterial genomes with a match to any protein query at >40% identity over >350 amino acids was counted as a DnaA protein. That query was then used to identify additional copies of DnaA in the genome. The dnaA gene of strain D106004 could not be identified in this way because of a frame-shift mutation (Table S5.1), so bl2seq was used with the dnaA gene from Y. pestis Angola. The dnaA gene of Lysinibacillus sphaericus strain C3 41a began at position 4,639,741 of 4,639,821 nt of the circular chromosome instead of position 1. Genomes containing more than one copy of the dnaA gene were compared with themselves using bl2seq to identify possible gene duplication or horizontal transfer events.
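The copy-number criterion (>40% identity over >350 amino acids) can be applied to tabular BLAST output along the following lines. This is a hedged Python sketch rather than the Perl script used here: it assumes the standard twelve-column tabular layout (query, subject, % identity, alignment length, ..., subject start, subject end, ...), and it assumes that overlapping hits on the same subject sequence represent a single dnaA locus. The function name is hypothetical.

```python
from collections import defaultdict


def count_dnaa_loci(blast_lines, min_identity=40.0, min_length=350):
    """Count distinct genomic loci hit at >min_identity over >min_length residues."""
    intervals = defaultdict(list)
    for line in blast_lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        identity, length = float(fields[2]), int(fields[3])
        if identity > min_identity and length > min_length:
            # tblastn reports reverse-strand hits with start > end; normalise.
            s_start, s_end = sorted((int(fields[8]), int(fields[9])))
            intervals[fields[1]].append((s_start, s_end))

    loci = 0
    for spans in intervals.values():
        spans.sort()
        current_end = None
        for start, end in spans:
            if current_end is None or start > current_end:
                loci += 1            # a new, non-overlapping locus
                current_end = end
            else:
                current_end = max(current_end, end)  # same locus, extend it
    return loci
```

Merging overlapping hits matters because many of the 1,209 query proteins are expected to match the same gene, and each such match would otherwise be counted as a separate copy.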

4.3.3 RIF primer design on the dnaA gene

The dnaA genes from the six sequenced Xanthomonas strains were first used for primer development. ClustalW (Chenna et al., 2003), which creates a multiple sequence alignment from a set of FASTA sequences, was used to produce the alignments needed to compare sequences in a cladogram. All sequences from Xanthomonas were aligned with ClustalW and 18–20 nt primers were designed manually on conserved regions to produce the largest amplicon, shorter than 1,000 nucleotides, that covered the greatest number of SNPs. The primers were required to have a melting temperature of 60±2°C, 50–60% G+C content, end in a G or C and have no complementary bases at the ends. Only one of many possible regions covering all SNPs between the two Xcc and two Xoo strains was chosen.
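The numerical primer criteria can be checked with a short script such as the sketch below. It is illustrative only: the primers in this work were designed manually, the melting-temperature formula used here (Tm = 64.9 + 41 × (G+C − 16.4) / N) is an assumed basic approximation because the text does not state how Tm was calculated, and the end-complementarity criterion is not modelled because it is not defined precisely. The example sequence in the usage note is hypothetical.

```python
def gc_fraction(primer):
    """Fraction of G and C bases in the primer."""
    primer = primer.upper()
    return (primer.count("G") + primer.count("C")) / len(primer)


def basic_tm(primer):
    """Approximate Tm in degrees C (assumed formula, not taken from the text)."""
    primer = primer.upper()
    gc = primer.count("G") + primer.count("C")
    return 64.9 + 41.0 * (gc - 16.4) / len(primer)


def check_primer(primer, tm_target=60.0, tm_tol=2.0):
    """Report which of the stated design criteria a primer satisfies."""
    primer = primer.upper()
    return {
        "length_18_20": 18 <= len(primer) <= 20,
        "gc_50_60": 0.50 <= gc_fraction(primer) <= 0.60,
        "ends_in_g_or_c": primer[-1] in "GC",
        "tm_in_range": abs(basic_tm(primer) - tm_target) <= tm_tol,
    }
```

For example, `check_primer("ACGTACGTACGTACGTAGGC")` reports the length, G+C and 3′-end criteria as met but the Tm criterion as unmet under the assumed formula; a different Tm model may give a different verdict.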

4.3.4 Bacterial culturing and DNA extraction

A subset of 366 bacterial clones from the PBC was chosen for DNA sequencing (Supplemental Table S5.2). The strains were plated on TZC medium to confirm that no contamination was present and transferred into sterile deionized water for genomic DNA isolation and to LB containing 15–20% glycerol for long-term storage at −80°C. DNA was isolated by adding 300 µl of water culture to 100 µl of a 40% mixture of deionized water-Chelex-100® resin (BioRad) and incubating at 60°C for 60 minutes (da Lamballarie 1992). DNA from 84 Xanthomonas strains, extracted with the REDExtract-N-Amp™ PCR ReadyMix™ (Sigma-Aldrich), was provided by John Young and Duck-Chul Park from the ICMP in New Zealand.

4.3.5 DNA marker amplification and sequencing

RIF and ITS were PCR amplified using an Eppendorf Mastercycler® Ep Gradient machine. All PCR reactions contained: 25 µl of JumpStart™ REDTaq® ReadyMix™ (Sigma-Aldrich) for high throughput PCR with 5 µl of 10 mM primer, 5 µl of template and 15 µl of deionized water. The cycling conditions included an initial denaturation step at 94°C for ten minutes, followed by 35 cycles of 94°C for 30 seconds, 61°C for one minute and 72°C for 30 seconds, followed by a ten minute extension at 72°C and a hold at 4°C. ITS marker amplifications were performed as described by Norman et al. (1996). All PCR products were separated on a 1.5% agarose gel to confirm quantity and quality, and DNA was purified and sequenced from all PCR attempts, even if no PCR product was visible on the gel. PCR products were purified using a Qiaquick 96 PCR Purification Kit (Qiagen), and the product was sequenced at the University of Hawaii sequencing facilities using forward and reverse primers.

4.3.6 DNA marker analysis

DNA markers were computationally assembled using the forward and reverse chromatograms with the chromatogram converter phred and the assembly tool phrap (Ewing and Green 1998a; Ewing et al., 1998b). Chromatograms of sequenced amplicons were automatically converted to FASTA files using phred (Ewing and Green 1998a) with default settings and no trimming. Forward and reverse sequences were automatically assembled with phrap (Ewing et al., 1998b), set to a minimum match of 100 and a minimum quality score of 20. A Perl script was used to identify, from the ace file, the longest stretch of high-quality bases containing no more than five consecutive low-quality bases. Low-quality bases were edited manually with BioEdit (Hall et al., 1999) and either retained, removed, or converted into degenerate bases. Assembled sequences spanning fewer than 550 high-quality bases and sequence reads that could not be assembled were removed. RIF sequences that did not match the expected strain annotation were re-isolated and re-sequenced.
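The quality-trimming rule (the longest stretch of high-quality bases interrupted by no more than five consecutive low-quality bases) can be expressed as a single pass over the per-base quality scores. The sketch below is a Python illustration, not the Perl script used here; the quality threshold of 20 and the choice to begin and end each stretch on a high-quality base are assumptions.

```python
def longest_high_quality_stretch(quals, min_q=20, max_low_run=5):
    """Return the half-open interval [start, end) of the longest acceptable stretch.

    A stretch starts and ends on a base with quality >= min_q and may contain
    runs of at most max_low_run consecutive bases below min_q.
    """
    best = (0, 0)
    start = None       # first high-quality base of the current stretch
    last_good = None   # most recent high-quality base of the current stretch
    low_run = 0        # length of the current run of low-quality bases

    def longer(candidate, current):
        return candidate[1] - candidate[0] > current[1] - current[0]

    for i, q in enumerate(quals):
        if q >= min_q:
            if start is None:
                start = i
            last_good = i
            low_run = 0
        elif start is not None:
            low_run += 1
            if low_run > max_low_run:
                # The low-quality run is too long: close the stretch before it.
                if longer((start, last_good + 1), best):
                    best = (start, last_good + 1)
                start, last_good, low_run = None, None, 0

    if start is not None and longer((start, last_good + 1), best):
        best = (start, last_good + 1)
    return best
```

A sequence would then be kept only if the returned interval spans at least 550 bases, matching the length filter described above.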

RIF sequences of each genus were aligned using MEGA (Kumar et al., 2006), a GUI-based tool to align sequences and construct cladograms, and trimmed to the same length. Distance matrices and neighbor-joining (N-J) phylogenetic trees with bootstrap scores from 5,000 replicates were produced with MEGA. Neighbor-joining, Minimum Evolution, Maximum Parsimony and Maximum Likelihood trees were produced using a single representative for each different sequence. Both the ITS and RIF markers were re-amplified and sequenced for strains that branched in unexpected positions to confirm their RIF sequence. N-J trees were rooted with the orthologous regions obtained via BLAST (Altschul et al., 1990) from a closely related sequenced strain in NCBI.
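The two bookkeeping steps in this paragraph (a matrix of pairwise nucleotide differences, and one representative per distinct sequence) can be sketched in a few lines. This Python example is illustrative and does not reproduce MEGA; it assumes the sequences are already aligned and trimmed to equal length, and it ignores positions where either sequence has a gap or an ambiguous base, which is a design choice of this sketch rather than a documented setting.

```python
from itertools import combinations


def nucleotide_differences(a, b):
    """Count positions where two aligned sequences carry different A/C/G/T bases."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned and trimmed to the same length")
    return sum(
        1
        for x, y in zip(a.upper(), b.upper())
        if x != y and x in "ACGT" and y in "ACGT"
    )


def distance_matrix(aligned):
    """Map each pair of strain names to its number of nucleotide differences."""
    return {
        (m, n): nucleotide_differences(aligned[m], aligned[n])
        for m, n in combinations(sorted(aligned), 2)
    }


def unique_representatives(aligned):
    """Keep the alphabetically first strain name for each distinct sequence."""
    reps = {}
    for name in sorted(aligned):
        reps.setdefault(aligned[name].upper(), name)
    return sorted(reps.values())
```

The list returned by `unique_representatives` corresponds to the reduced set of sequences from which the four kinds of tree would be built.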

4.3.7 Construction of a consensus tree

Phylogenetic trees constructed from the RIF marker in section 4.3.6 (using Neighbor-joining, Minimum Evolution, Maximum Parsimony and Maximum Likelihood methods) were compared. Strains that were placed in the same position with all four methods were used to construct a final RIF sequence framework.

4.4 Results

4.4.1 Identification of the dnaA marker

Six Xanthomonas strains (Xcc_8004, Xcc_33913, Xoo_10331, Xoo_311018, Xccit_306, Xe_85-10 in Table 2.1) were used to computationally identify polymorphic regions that distinguish the two closely related strains of each pathovar (see Methods). These strains were not resolved by the ITS marker.

In brief, the two Xoo genomes (Xoo_10331 and Xoo_311018) were compared to identify all orthologous regions that contain at least two SNPs and are flanked by unique 20+-mers that are conserved in all six Xanthomonas strains, are separated by at least 550 nucleotides and would produce an amplicon of <1000 nt. This revealed a total of 16,362 unique 20+-mers conserved in all six genomes, 72 of which (from eight different Xoo_311018 protein-coding genes) had the potential to produce amplicons of the desired length and contained at least two SNPs between the two Xoo strains (Table 4.1). BLAST homologies in all fifteen sequenced genomes were detected for four of these genes. The Xcc orthologs for three of these genes do not contain sequence polymorphisms, but the Xcc orthologs of gene XOO4370 (Table 4.1) differ by one SNP. However, no conserved 20+-mers flanking potential amplicons that span this SNP met the criteria for primer design (see Methods). A similar analysis of the two Xcc genomes (Xcc_8004 and Xcc_33913) yielded 58 regions in Xcc_33913 from seven other genes (i.e. no overlap with the list of eight genes obtained in the Xoo comparison; see Table 4.1 for the complete list). Initial automated BLAST searches revealed that two of these seven genes had a homologous region in all fifteen completed genomes. The two Xoo orthologs of genes XCC0001 and XCC3789 (from Xcc_33913, see Table 4.1) are distinguishable by one and multiple SNPs, respectively. However, a manual inspection of the BLAST results revealed gene XCC3789 to be absent from the sequenced Clavibacter genomes. No single gene was identified that contains two or more SNPs in both the Xcc and Xoo comparisons, is flanked by 20+-mers conserved in all six Xanthomonas genomes and is present in all six genera of plant pathogens. However, the Xcc and Xoo orthologs of XCC0001 are distinguishable by five SNPs and one SNP, respectively. The conserved 20+-mers found in gene XCC0001 have the potential to form 26 different amplicons that would meet the length criteria, eight of which contain all five of the Xcc SNPs and the single Xoo SNP (Figure 4.1 and Table 4.1). XCC0001 is the dnaA gene. A primer pair to amplify a region of the dnaA gene, known as the replication initiation factor (RIF) marker, in Xanthomonas was manually selected (5′-CAGCACGGTGGTGTGGTC-3′ and 5′-CCTGGATTCGCATTACACC-3′) from the two conserved 20+-mers that yielded the longest amplicon of <1000 nt that covers all six SNPs (Figure 4.1).

Figure 4.1. Alignment of the dnaA gene from six Xanthomonas strains. The dnaA genes of X. campestris pv. campestris strains differ by 5 SNPs (pink highlight) and those of two X. oryzae pv. oryzae strains (green highlight) differ by 1 SNP. Conserved 20+-mers in all six Xanthomonas genomes are highlighted in blue. The primer set chosen is highlighted in yellow.

Table 4.1. Genes in Xoo_311018 and Xcc_33913 containing regions that match criteria for potential markers.

a- Xoo_10331 was compared to Xoo_311018 or Xcc_8004 was compared to Xcc_33913.
b- There is no overlap of genes between the two sets of strains analyzed.
c- Strains are listed in Table 1 with an asterisk.
d- Gene XCC0001 met all criteria for use as a potential marker.
e- Orthologs differ by a single SNP.

4.4.2 Assessment of dnaA copy number in 1,067 bacterial genomes

The copy number of the dnaA gene was determined for over 1,000 completed bacterial genomes. A total of 1,067/1,159 sequenced genomes downloaded from GenBank on June 9, 2010 were from genera present in the “List of prokaryotic names with standing in nomenclature” or in the “List of valid cyanobacterial genera” (see Methods). The dnaA gene was present once in 1,016 (95.2%) bacterial genomes (including all phytopathogens), twice in 40 (3.7%) bacterial genomes and zero times in eleven (1.0%) genomes (Supplemental Table S4.1).

4.4.3 Vertical transfer of dnaA

Phylogenetic trees constructed from the DnaA proteins and 16S rDNA regions of the 1,067 bacterial genomes revealed that genera that share a clade in the DnaA tree also share a clade with the 16S marker, consistent with a gene that has undergone vertical transfer at the genus level (data not shown). However, recent transfers between lower taxa may still be possible. The phylogenetic tree constructed from the RIF marker of six Xanthomonas strains (RIF in Figure 4.2) was also consistent with the cladogram of 229 Xanthomonas genes that represent proteins unlikely to be horizontally transferred (Ewing 2008).

4.4.4 In silico analysis of the RIF and ITS markers in pathovar, subspecies and species comparisons

The RIF marker resolves the two strains each of Xcc and Xoo that are not resolved by ITS (blue and red boxes in RIF tree in Figure 4.2). Specifically, RIF distinguishes strain Xcc_8004 from Xcc_33913 and strain Xoo_311018 from Xoo_10331. This was expected, as the RIF marker was selected on the basis of increased resolution over the ITS marker using these particular strains. However, although RIF distinguishes the Japanese race 1 strain (Xoo_311018) from the Korean race 1 strain (Xoo_10331), it does not distinguish the latter from the race 6 strain Xoo_PXO99A from the Philippines.


RIF sequences contain more nucleotide polymorphisms than ITS sequences for strains of Xanthomonas. The average distance between Xanthomonas species was three times greater with RIF than with ITS (33.6 vs. 11.3 nucleotide differences). Also, in the case of Xcc_8004 and Xcc_33913, the five SNPs obtained with the RIF marker sequence (Figure 4.1) resolve these two strains, which were not resolved by the four other genes used in the MLSA of Young et al. (2008) (Table 4.2).
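As an illustration of how such distances are counted, the following minimal Python sketch computes the number of nucleotide differences between every pair of aligned sequences and their mean. It assumes the sequences are already aligned to equal length and that positions with a gap in either sequence are skipped (pairwise deletion); the function names and the toy alignment are hypothetical and are not data from this study.

```python
from itertools import combinations

def count_differences(seq_a, seq_b, gap="-"):
    """Count mismatched positions between two aligned sequences,
    ignoring positions where either sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a != b and a != gap and b != gap
    )

def pairwise_differences(aligned):
    """Return {(name_1, name_2): differences} for every pair of sequences."""
    return {
        (x, y): count_differences(aligned[x], aligned[y])
        for x, y in combinations(aligned, 2)
    }

def mean_distance(aligned):
    """Average number of differences over all pairs."""
    distances = pairwise_differences(aligned)
    return sum(distances.values()) / len(distances)

# Toy alignment with hypothetical names and sequences.
toy = {"a": "ATGCATGCAT", "b": "ATGCATGCAA", "c": "TTGCATGC-T"}
print(pairwise_differences(toy))
print(mean_distance(toy))
```

Applying the same function to the RIF alignment and to the ITS alignment of the same strains gives one distance matrix per marker, which can then be compared row by row.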

4.4.5 Practicability of ITS and RIF markers as assessed by de novo sequencing of 120 Xanthomonas accessions

We extended our comparative analysis of RIF and ITS to the laboratory by attempting to generate sequences for both markers from 120 accessions of Xanthomonas from the PBC representing a diverse range of species, hosts and geographic origins (Supplemental Table S4.2). Marker sequences were derived from PCR amplicons generated with universal (ITS) or genus-specific (RIF) primers. These amplicons were sequenced in both directions without cloning, and the resulting sequences were base-called (Ewing and Green 1998a) and assembled (Ewing et al., 1998b) automatically. The ITS sequences of the three genera ranged in size from 557 to 590 bp, resulting in assembled sequence pairs that end before the 16S and 23S rRNA genes, whereas the RIF sequence was 700 bp. The success rate of high-quality direct sequence generation (i.e. overlapping sequence from both ends) was 1.25 times greater for RIF (110 sequences = 92.4%) than for ITS (87 sequences = 73.1%). Seven additional Xanthomonas accessions yielded a single RIF read.

The 84 strains for which both the ITS and RIF markers were successfully sequenced were compared using an unrooted tree (Figure 4.3). A greater number of different sequences was obtained with RIF (26) than with ITS (18). The majority of polymorphisms in the RIF marker occur in the wobble base (data not shown). Two X. campestris strains are placed outside of the X. campestris clade in the ITS but not the RIF tree (asterisk in Figure 4.3). Similarly, one X. axonopodis strain is placed outside of the main X. axonopodis clade. Longer branch lengths and a higher sequencing success rate as compared to ITS make RIF a useful marker for classifying strains of Xanthomonas.
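Counting the number of different sequences per marker amounts to collapsing identical sequences and recording which strains share each one. A minimal Python sketch follows; the function names and the toy input are hypothetical, and sequences are assumed to have been trimmed to the same region beforehand.

```python
from collections import defaultdict

def collapse_identical(sequences):
    """Group strain names by their (case-insensitive) marker sequence."""
    groups = defaultdict(list)
    for strain, seq in sequences.items():
        groups[seq.upper()].append(strain)
    return dict(groups)

def count_unique(sequences):
    """Number of distinct sequences among the strains."""
    return len(collapse_identical(sequences))

# Toy data with hypothetical strain names and sequences.
marker_a = {"s1": "ACGT", "s2": "ACGA", "s3": "acgt"}
marker_b = {"s1": "GGCC", "s2": "GGCC", "s3": "GGCC"}

print(count_unique(marker_a), count_unique(marker_b))
for seq, strains in collapse_identical(marker_a).items():
    print(seq, len(strains), strains)
```

The list length stored for each unique sequence corresponds to the number of sequenced strains indicated on each leaf of a tree in which identical sequences are represented only once.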

Figure 4.2. Neighbor-joining cladograms for Xanthomonas using ITS and RIF sequences. RIF resolves closely related strains of Xanthomonas that remain unresolved with ITS. The neighbor-joining cladograms were constructed using the ITS and RIF markers from fully sequenced strains in GenBank (Table 2.1). Bootstrap values >50% (shown at the node) are expressed as a percentage of 5,000 replicates. Red boxes contain two X. oryzae pv. oryzae strains resolved by one SNP between RIF sequences and zero SNPs between ITS sequences. Likewise, blue boxes contain two X. campestris pv. campestris strains resolved by five SNPs between RIF sequences and zero SNPs between ITS sequences. The RIF and ITS sequences of Xoo strains 10331 and PXO99A are identical. Xylella fastidiosa is included as the outgroup.


Table 4.2. In silico comparison of Xanthomonas with the RIF marker resolves one pair of closely related strains that is unresolved with four other housekeeping genes and the ITS.

Strain1 Strain2 dnaA rpoD dnaK fyuA gyrB ITS
Xcc_33913 Xcc_8004 5 0 0 0 0 0
Xcc_33913 Xcc_B100 12 41 2 1 1 1
Xcc_33913 Xe_85-10 66 60 40 88 106 18
Xcc_33913 Xccit_306 71 64 36 89 101 15
Xcc_33913 Xoo_PXO99 79 66 40 89 98 6
Xcc_33913 Xoo_10331 79 66 40 90 99 5
Xcc_33913 Xoo_311018 80 66 39 89 103 5
Xcc_33913 Sm_279 118 181 97 243 82
Xcc_33913 Sm_R551 122 181 101 237 85
Xcc_8004 Xcc_B100 17 41 2 1 1 1
Xcc_8004 Xe_85-10 69 60 40 88 106 18
Xcc_8004 Xccit_306 74 64 36 89 101 15
Xcc_8004 Xoo_PXO99 82 66 40 89 98 6
Xcc_8004 Xoo_10331 82 66 40 90 99 5
Xcc_8004 Xoo_311018 83 66 39 89 103 5
Xcc_8004 Sm_279 123 181 97 243 82
Xcc_8004 Sm_R551 127 181 101 237 85
Xcc_B100 Xe_85-10 66 47 42 87 107 19
Xcc_B100 Xccit_306 72 52 38 88 102 14
Xcc_B100 Xoo_PXO99 79 52 42 88 99 5
Xcc_B100 Xoo_10331 79 52 41 89 104 4
Xcc_B100 Xoo_311018 80 52 41 88 104 4
Xcc_B100 Sm_279 117 188 99 244 83
Xcc_B100 Sm_R551 123 183 103 238 86
Xe_85-10 Xccit_306 35 21 19 28 26 7
Xe_85-10 Xoo_PXO99 55 39 35 49 58 21
Xe_85-10 Xoo_10331 55 39 35 50 57 20
Xe_85-10 Xoo_311018 56 39 34 49 63 20
Xe_85-10 Sm_279 96 187 100 232 88
Xe_85-10 Sm_R551 103 182 107 234 90
Xccit_306 Xoo_PXO99 65 37 34 58 56 15
Xccit_306 Xoo_10331 65 37 34 59 55 14
Xccit_306 Xoo_311018 66 37 33 58 60 14
Xccit_306 Sm_279 107 186 99 227 88
Xccit_306 Sm_R551 113 183 103 238 89
Xoo_PXO99 Xoo_10331 0 0 2 1 1 0
Xoo_PXO99 Xoo_311018 1 0 1 0 10 0
Xoo_PXO99 Sm_279 115 184 100 235 85
Xoo_PXO99 Sm_R551 119 176 102 239 86
Xoo_10331 Xoo_311018 1 0 1 1 11 0
Xoo_10331 Sm_279 115 184 100 236 84
Xoo_10331 Sm_R551 119 176 101 242 85
Xoo_311018 Sm_279 116 184 99 238 84
Xoo_311018 Sm_R551 120 176 101 242 85
Sm_279 Sm_R551 34 61 25 63 13


4.4.6 Building RIF frameworks from sequences of characterized PBC accessions of Xanthomonas

Characterized strains from the PBC were used to construct RIF frameworks. An additional 197 RIF sequences (see above) were obtained from PBC strains collected over a span of 37 years. Sequences were generated as described above to construct RIF frameworks that can be used for strain comparisons between collections and for classifying strains to further examine the inheritance of XAC3314. Over 83.8% of the 366 Xanthomonas strains from the PBC yielded usable RIF sequence. In total, RIF sequences were obtained for 307 PBC strains and 43 ICMP strains.

After automated alignment and trimming to 558 nt (see Methods), 77 different RIF sequences were obtained for Xanthomonas. These sequences were analyzed with the neighbor-joining, minimum-evolution, maximum-parsimony and maximum-likelihood tree methods (Supplemental Figures S4.1-S4.4). Each tree formed six major clades, each of which contains a dominant species: X. axonopodis; X. oryzae; X. campestris; X. (Stenotrophomonas) maltophilia and X. albilineans. Strains of Xanthomonas axonopodis pv. citri and three other strains of X. euvesicatoria branch basal to the two other subclades of X. axonopodis, all within the same clade. X. axonopodis pv. dieffenbachiae is found throughout the RIF tree. Ten RIF sequences are shared by two or more species and/or pathovars. Some sequences were placed in different positions by each tree-building method. The subset of sixty of the seventy-seven RIF sequences that were placed in the same position in all four trees was used to construct a final phylogenetic tree (Figure 4.4). This tree supports the clades formed with the larger set of strains and provides the most robust framework for further phylogenetic inference to study other genes in the genome.


Figure 4.3. RIF distinguishes more Xanthomonas strains than ITS. Unrooted neighbor-joining trees for the RIF and ITS markers were constructed from eighty-four Xanthomonas strains from the PBC (Supplemental Table S4.2) and eight reference strains from GenBank (see Table 2.1 for strain names). Identical sequences are represented only once and the number of sequenced strains is indicated on each leaf. Bootstrap values >50% (shown at the node) are expressed as a percentage of 5,000 replicates. Two X. campestris strains and one X. axonopodis strain localize to the appropriate clade with RIF but not ITS (asterisk). Xc - X. campestris, Xa - X. axonopodis, Xe - X. euvesicatoria, Xcit - X. citri, Xo - X. oryzae, Xm - X. (Stenotrophomonas) maltophilia.



Table 4.3. Abbreviations for Xanthomonas and Stenotrophomonas strains.

Abbreviation Genus Species Pathovar/Species
Sr Stenotrophomonas rhizophila
Xal Xanthomonas albilineans
Xacitru Xanthomonas alfalfae citrumelonis
Xara arboricola
Xarc Xanthomonas arboricola corylina
Xarp Xanthomonas arboricola populi
Xaa Xanthomonas axonopodis allii
Xaax Xanthomonas axonopodis axonopodis
Xad Xanthomonas axonopodis dieffenbachiae
Xaman Xanthomonas axonopodis manihotis
Xap Xanthomonas axonopodis phaseoli
Xapa Xanthomonas axonopodis panax
Xavi Xanthomonas axonopodis syngonii
Xb
Xcaber aberrans
Xca Xanthomonas campestris armoraciae
Xcc Xanthomonas campestris campestris
Xccit Xanthomonas axonopodis (citri) citri
Xci Xanthomonas campestris incanae
Xcr Xanthomonas campestris raphani
Xcas Xanthomonas cassavae
Xcod Xanthomonas codiaei
Xcuc
Xcyn axonopodis
Xe Xanthomonas (euvesicatoria) vesicatoria
Xf
Xfa Xanthomonas fuscans aurantifolii
Xg Xanthomonas gardneri
Xhc carotae
Xhh Xanthomonas hortorum hederae
Xhp Xanthomonas hortorum pelargonii
Xht Xanthomonas hortorum taraxaci
Xhv Xanthomonas hortorum vitians
Xme Xanthomonas melonis
Xoo Xanthomonas oryzae oryzae
Xooryz Xanthomonas oryzae oryzicola
Xp
Xpi
Xsps Xanthomonas species syngonii
Xspd Xanthomonas species from Dysoxylum
Xva Xanthomonas vasicola
Xv Xanthomonas vesicatoria


Figure 4.4 (Previous Page). Maximum-likelihood tree of RIF sequences that are placed consistently by all four tree methods. This tree provides the best inference for classifying strains with only the RIF marker. The rooted maximum-likelihood tree for the RIF marker was constructed from sixty unique sequences and eight reference strains from GenBank (see Table 2.1 for strain names) with Xylella as the outgroup. Bootstrap values (shown at the node) are expressed as a percentage of 1,000 replicates. Abbreviations are in Table 4.3 and species within a clade are displayed on the tree.


4.5 Discussion

4.5.1 RIF and other markers for Xanthomonas classification

Identification of the RIF marker and sequencing of strains began in 2006. Since then other publications, such as those by Young et al. (2008) and Parkinson et al. (2007; 2009), have also used one or more markers for classification of Xanthomonas. I have shown that RIF provides greater resolution than the markers used in the MLSA for some strains, and that other markers provide better resolution than RIF for others (Table 4.2). Whole genome sequence data provide the greatest information for construction of a phylogenetic tree. Sequenced genomes are not always available; therefore, one or more markers are commonly used for identification (Young et al., 2008; Parkinson et al., 2009). RIF was the only region identified using the methods described; however, it may be of interest to identify additional markers and increase the resolution of the required sequence framework. The methods used here to identify regions for primer development could be altered, for example by allowing mismatches in the primer region, to reveal a greater number of markers. The advent of new sequencing and labeling techniques and the decreasing cost of sequencing a strain may allow laboratories to sequence entire collections, cultured samples and environmental data in the future. Currently, a single 101 nt paired-end Illumina run using 384 different sequence tags for 384 different strains would produce data that provide 19.5X coverage for each strain. Not counting laboratory supplies, at a cost of $20,000 per run, each strain would cost ~$52 to sequence, which is about ten times the cost of sequencing a single DNA marker. However, these technologies and tools are still expensive for the average laboratory and even more so for the casual farmer. In-field handheld devices for bacterial detection, such as those in development by Jenkins et al. (2011), will likely prove to be a valuable tool for a farmer.
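The per-strain cost quoted above follows from dividing the run cost by the number of multiplexed strains. The short Python sketch below reproduces that arithmetic; the function name is hypothetical, and the implied single-marker cost is derived only from the "about ten times" statement, not from an independent price.

```python
def cost_per_strain(run_cost_usd, n_strains):
    """Cost of one multiplexed strain, ignoring laboratory supplies."""
    return run_cost_usd / n_strains

RUN_COST_USD = 20_000   # cost of one run, as stated in the text
N_STRAINS = 384         # one sequence tag per strain

genome_cost = cost_per_strain(RUN_COST_USD, N_STRAINS)
# Implied marker cost, assuming the genome is ~10x the cost of one marker.
implied_marker_cost = genome_cost / 10

print(f"genome: ${genome_cost:.2f} per strain")
print(f"single marker (implied): ${implied_marker_cost:.2f}")
```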


4.5.2 DnaA structure in relation to the RIF marker region

The DnaA protein is an essential core metabolic protein in bacteria known to regulate the initiation of chromosomal replication by binding DnaA boxes on the replicating chromosome (Kaguni et al., 2006). The absence of the dnaA gene from eleven sequenced genomes of insect endosymbionts has been suggested to reflect the extreme dependency of these symbionts on their host (Mackiewicz et al., 2004). The DnaA protein contains several conserved domains, including an AAA+ superfamily domain near the N-terminus and a helix-turn-helix DNA binding domain at the C-terminus of the protein (Kaguni et al., 2006). The primers were designed near, but not in, the AAA+ domain and on the edge of the helix-turn-helix DNA binding domain. Thus, the conserved region at the 5’ primer may function at the nucleotide rather than the protein level, possibly in gene transcription or mRNA stability.

4.5.3 RIF can be used as an identification marker for the majority of bacteria

The fact that dnaA is a single copy gene in the vast majority of complete bacterial genomes examined and does not appear to frequently undergo horizontal gene transfer makes it a valuable marker for bacterial strain classification. The ease with which we were able to produce sequence frameworks for Xanthomonas illustrates the practicability of this marker. Since previous efforts to identify universal primers that amplify protein-coding regions were unsuccessful (Santos et al., 2004), the need for a genus-specific primer was expected.

4.5.4 RIF classification is consistent with other methods of classification

The cladograms constructed using the RIF marker were consistent with existing trees based on single or multiple markers (where available). RIF sequences supported previous genetic studies of Xanthomonas and the close relationship between Stenotrophomonas and Xanthomonas (Garrity et al., 2005). RIF genotypes did not correspond to the current nomenclature for pathovars of X. campestris and X. axonopodis. The X. campestris clade had strains of X. campestris pvs. armoraciae, aberrans, raphani and campestris that share the same RIF sequence; therefore, these pathovar designations are not useful. Similarly, the X. axonopodis clade had other Xanthomonas species that share the same RIF sequence with X. axonopodis pv. dieffenbachiae. The polyphyletic X. axonopodis clade (Figure 4.4) contained six species, five of which have recently been renamed from X. axonopodis to X. alfalfae, X. euvesicatoria, X. perforans, X. citri and X. fuscans (Jones et al., 2004; Schaad et al., 2006). These six species fell into subclades that agree with the groups of X. axonopodis observed by Vauterin et al. (1995).

4.5.5 The RIF database

In order to enable comparison of strains isolated worldwide with those characterized in this study, we have created an online database, RIFdb (Figure 4.5). For ease of use, the database can be queried with unprocessed single or paired chromatograms, or FASTA sequences, to search for the best reference strain match. The ability to input unprocessed chromatograms should enable any worker from a diagnostic laboratory that has access to a PCR machine and a sequencing facility to compare the sequence of an unknown strain against the RIF reference database. Chromatograms submitted to the website are base-called, and paired chromatograms are assembled (see Methods). Querying the database with a single read reduces the cost of identification if a user decides to sequence only one end. Queries are automatically aligned with existing sequences and visualized in a neighbor-joining tree using ArchaeopteryxE (Han et al., 2009). The RIF sequences in the database are automatically re-trimmed if a partial sequence is provided as a query, although excessive trimming will decrease the resolution of this marker.


Clearly the utility of the RIF database will increase as more sequences from as yet unrepresented accessions from around the world are added. Sequences of strains from diverse bacterial collections will increase global representation. Deposition of high quality chromatogram pairs along with key characteristics of strains will enable the expansion of RIFdb with strains from international collections and increase its utility to diagnosticians worldwide.

Figure 4.5. RIFdb: Online RIF sequence framework for the identification of plant pathogens. The website accepts raw sequence chromatograms as a single file or paired files. Sequences may also be input in FASTA format. A search of the database loads a results page that displays the query RIF sequence on the RIF cladogram.


4.6 Conclusion

I have computationally identified RIF, a useful DNA marker that produced a reliable tree similar to those constructed in other studies (Young et al., 2008; Parkinson et al., 2009). I used this marker to sequence over 300 strains of Xanthomonas and have shown that the RIF marker can be used worldwide to identify unknown Xanthomonas strains. I have also constructed an online, searchable database that can be queried by anyone with raw or formatted sequence data. RIF and RIFdb have been presented at several national APS meetings and this work has been published in PLoS ONE (Schneider et al., 2011). The RIF framework was used to determine whether any HGT events have occurred since acquisition of the XAC3314 homolog (Chapter 5).


Chapter 5: XAC3314 diversity in the PBC

5.1 Introduction

An analysis of sequenced Xanthomonadaceae strains revealed XAC3314 homologs in four species (see Chapter 3). XAC3314 was likely an ancient acquisition (Richards et al., 2009), transferred with at least one other gene in a small genomic island (Chapter 3). The island was acquired by an ancestral strain of Xanthomonas, most likely after the split of X. albilineans. Based on the sequenced strains analyzed, other Xanthomonas lineages have lost the XAC3314 homolog. These may have been recent losses in the sequenced strains, or a more ancient loss from a shared ancestral genome, in which case the gene is not present in any strain of the lineage. Initial analysis of sixteen PBC strains by Ewing (2008) only amplified XAC3314 homologs in Xac (big amplicon, B; 2800 nt). Strains that did not have an XAC3314 homolog produced an amplicon that was approximately 900 nt (medium amplicon, M, in X. campestris) or 600 nt (small amplicon, S, in X. axonopodis pv. dieffenbachiae), or the strain was missing one or more of the flanking genes and failed to amplify (no amplicon, N, in X. oryzae). The analysis of thirty sequenced strains identified an additional five XAC3314 homologs in Xanthomonas since the study by Richards et al. (2009).

I have classified over 300 strains of Xanthomonas from the Pacific Bacterial Collection with RIF (Chapter 4) to study XAC3314 homologs in these strains. Amplicons were sequenced from these Xanthomonas strains to study diversity, identify potential promoter elements and reveal evidence of possible horizontal transfer events that occurred since acquisition. In addition to the XAC3314 homologs identified in sequenced strains, amplification of strains in the PBC revealed additional species and pathovars (X. hortorum, Xcc and Xad) containing the gene. These strains contain a plant inducible promoter upstream of the XAC3314 start codon. The XAC3314 gene was likely acquired by an ancestral Xanthomonas strain after the split of X. albilineans and may only transfer among strains in the same niche if there is HGT of the XAC3314 gene.


5.2 Hypotheses

1) XAC3314 has undergone frequent loss and has been horizontally transferred since acquisition

Objectives:
a) Amplify XAC3314 in Xanthomonas strains in the PBC.
b) Sequence XAC3314 with sequencing primers and assemble the sequence.
c) Identify inconsistencies between RIF and XAC3314 genotypes.
d) Construct a cladogram of the intergenic region between the alpha-glucosidase and carboxylesterase genes to identify the extent of any horizontal transfer event.


5.3 Methods

5.3.1 Bacterial culture and DNA extraction

A subset of 366 Xanthomonas clones from the PBC that were analyzed in Chapter 4 (Supplemental Table S4.2) was selected for further analysis of their XAC3314 homologs. Strains were grown and DNA was isolated as described in Chapter 4.

5.3.2 XAC3314 amplification

XAC3314 flanking primers were used to amplify DNA (Ewing 2008) using an Eppendorf Mastercycler® Ep Gradient machine. All PCR reactions contained: 25 µl of JumpStart™ REDTaq® ReadyMix™ [Sigma-Aldrich] for high throughput PCR with 5 µl of 10mM primer, 5 µl of template and 15 µl of deionized water. The cycling conditions included an initial denaturation step at 94°C for ten minutes, followed by 35 cycles of 94°C for 30 seconds, 61°C for one minute and 72°C for 120 seconds, followed by a ten minute extension at 72°C and a hold at 4°C. All PCR products were separated on a 1.5% agarose gel to confirm quantity and quality. Amplicons were labeled B for the 2.4 kb amplicon, M for the 0.9 kb amplicon, S for the 0.6 kb amplicon, N for no amplicon, and A for an odd amplicon such as a smear.
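The labeling scheme above can be expressed as a simple lookup on the estimated band size. The following Python sketch is illustrative only: amplicons in this study were scored by eye from the gel, and the function name and the 15% size tolerance are assumptions introduced for the example.

```python
# Expected amplicon sizes (bp) for each label, taken from the text.
EXPECTED_SIZES_BP = {"B": 2400, "M": 900, "S": 600}

def classify_amplicon(size_bp, tolerance=0.15):
    """Return B, M, S, N or A for one estimated band size.

    size_bp   -- estimated band size in bp, or None if no band was seen
    tolerance -- assumed relative deviation allowed from the expected size
    """
    if size_bp is None:
        return "N"  # no amplicon
    for label, expected in EXPECTED_SIZES_BP.items():
        if abs(size_bp - expected) <= tolerance * expected:
            return label
    return "A"      # odd amplicon (unexpected size, smear, etc.)

for size in (2400, 950, 600, None, 1500):
    print(size, classify_amplicon(size))
```

A lane with two bands or a smear does not reduce to a single size and would still need to be scored manually as an odd amplicon.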

5.3.3 XAC3314 primer design for sequencing

The XAC3314 primers were developed based on the completely sequenced Xac_306 and Xe_85-10 genomes. To obtain the highest quality sequence coverage over the length of the gene and the intergenic regions upstream and downstream, six primers were developed in addition to the two flanking primers. PCR products from the previous PCR reaction were purified using a Qiaquick 96 PCR Purification Kit [Qiagen], and the product was sequenced at the University of Hawaii sequencing facilities using the flanking primers for a small or medium amplicon or all eight primers for a big amplicon (Flanking Forward: 5’-AGCGGCAAGGACTGGTATGT-3’; Intergenic Forward-1: 5’-GATCAGCCACGCGATTGC-3’; Intergenic Forward-2: 5’-GATCAGCCACGCGATTGC-3’; Intergenic Forward-3: 5’-CAACAGCGGTGTAGCCAGC-3’; Intergenic Reverse-1: 5’-GGTGTGCGAAGCAAGGGTG-3’; Intergenic Reverse-2: 5’-CCTGTGACCGATGTGCCG-3’; Intergenic Reverse-3: 5’-GTGACCATCGCCGCATCG-3’; Flanking Reverse: 5’-GCATTACGTGTTCGATACCG-3’). When needed, strains were regrown and their DNA was re-isolated and resequenced to confirm placement in the tree.

5.3.4 XAC3314 and intergenic analysis

Chromatograms of sequenced amplicons were automatically converted to FASTA files using phred (Ewing and Green 1998a) with default settings and no trimming. Forward and reverse sequences were automatically assembled with phrap (Ewing et al., 1998b), set to a minimum match of 100 and a minimum quality score of 20. Low quality bases were edited manually with BioEdit (Hall et al., 1999) and either retained, removed, or converted into degenerate bases. Phylogenetic trees with bootstrap scores from 1,000 replicates were produced with clustalW2 using pairwise deletions and the number of nucleotide differences to indicate branch length. The aligned sequences were separated into XAC3314 homologs and intergenic regions. The intergenic regions were those between genes XAC3313 and XAC3315. If the strain contained a XAC3314 homolog, the homolog was removed after alignment. An artificial sequence was produced from the alignment to root the cladogram. The sequence consisted of 45% of the most frequent base, 45% of the least frequent base and 10% of a non-matching base for each nucleotide in the alignment. Sequences were grouped by XAC3314 homolog to identify HGT events.
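One possible reading of the artificial rooting sequence described above is sketched below in Python. It is not the script used in this study. It assumes that, for each alignment column, a base is drawn at random with weights 45:45:10 for the most frequent base, the least frequent base and a base absent from that column. The function name, the fixed random seed, the handling of all-gap columns and the fallback when all four bases occur in a column are assumptions made for the example.

```python
import random
from collections import Counter

BASES = "ACGT"

def artificial_outgroup(aligned_seqs, seed=0):
    """Build one artificial sequence from a list of aligned sequences."""
    length = len(aligned_seqs[0])
    if any(len(s) != length for s in aligned_seqs):
        raise ValueError("sequences must be aligned to the same length")
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    out = []
    for i in range(length):
        column = [s[i].upper() for s in aligned_seqs if s[i].upper() in BASES]
        if not column:
            out.append("-")    # assumption: keep all-gap columns as gaps
            continue
        ranked = Counter(column).most_common()
        most, least = ranked[0][0], ranked[-1][0]
        absent = [b for b in BASES if b not in column]
        # Assumption: if every base occurs in the column, reuse the rarest.
        non_matching = rng.choice(absent) if absent else least
        out.append(
            rng.choices([most, least, non_matching], weights=[45, 45, 10])[0]
        )
    return "".join(out)

# Toy alignment (hypothetical sequences).
print(artificial_outgroup(["ACGT", "ACGA", "ACGA"]))
```

In a fully conserved column the most and least frequent base coincide, so the artificial sequence matches the alignment there about 90% of the time under this reading.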


5.4 Results

5.4.1 X. axonopodis pv. dieffenbachiae, X. hortorum and X. campestris contain XAC3314 homologs

Xanthomonas strains in the PBC were amplified with XAC3314 flanking primers. XAC3314 homologs were amplified in 96 of 366 Xanthomonas strains (Supplemental Table S4.2). Amplicons from 109 strains were of medium size, amplicons from 49 strains were of small size, and 90 strains had no amplicon (Figure 5.1). Two strains, both Stenotrophomonas, produced a band that was larger than expected and could not be sequenced with internal primers, which produced no quality sequence data for these strains. This may be due to a rearrangement within these two genomes or to different flanking genes upstream and downstream of XAC3314. Also, nine strains had amplicons with two bands (big/medium, medium/small) and eleven strains had amplicons that exhibited an abnormal smear. Both may be attributed to contamination. X. axonopodis strains produced amplicons that indicate that these strains have the insert, are missing a flanking gene, or have no XAC3314 gene and a small intergenic region between the flanking genes. X. campestris strains produced amplicons that indicate that these strains have a XAC3314 homolog, or have no XAC3314 gene and a medium intergenic region between the flanking genes. X. hortorum strains produced amplicons that indicate that these strains either contain an XAC3314 homolog or are missing a flanking gene and produce no amplicon.

5.4.2 Diversity of XAC3314 in the PBC

XAC3314 homologs were sequenced with eight primers over the entire amplicon. Sequences were base-called with phred and assembled with phrap. XAC3314 sequences were 1488 nt, except for strain K330, which has a nonsense mutation at nucleotide 991 (C->T) that may cause early termination of the protein, similar to the sequenced X. vasicola strain. A total of 60 strains had their XAC3314 homolog successfully sequenced in high quality. A cladogram was constructed from the alignment of these proteins (Figure 5.2). There are 15 unique XAC3314 sequences forming four groups. XAC3314 sequences were not much more diverse than their RIF counterparts (11 vs. 10 unique sequences from strains in the PBC) (Figure 5.3).

Figure 5.1 (Next Page). XAC3314 has been lost by several Xanthomonas strains. Rooted maximum-likelihood tree for the fifty unique sequences from the PBC and eight reference strains from GenBank (see Table 2.1 for strain names) with Xylella as the outgroup. Bootstrap values (shown at the node) are expressed as a percentage of 1,000 replicates. Abbreviations are in Table 4.3. The types of amplicons produced after PCR using the flanking primers for XAC3314 are given for each leaf: a big (B), medium (M) or small (S) amplicon when a PCR product was present. If a gene flanking XAC3314 was missing, then the primers produced no (N) amplicon. Stenotrophomonas produced an aberrant amplicon that was very big (BB) and may represent the gene with a different size intergenic region or a rearrangement in the genome. XAC3314 is present in five clades and has undergone several loss events (green star).


Figure 5.2 (Previous Page). Diversity of XAC3314 in Xanthomonas strains in the PBC. Rooted maximum-likelihood tree for the fifteen unique XAC3314 sequences plus proteins that contain a DUF239 domain from GenBank and an artificial sequence produced from the alignment as an outgroup (see Methods). Bootstrap values (shown at the node) are expressed as a percentage of 1,000 replicates. The four major clades in this tree were considered different types to be shown on the RIF tree (Groups 1-4). Group 1 was further divided into 1a, b and c. XAC3314 sequences in group 1 were each colored differently for further visualization on the RIF tree.

5.4.3 XAC3314 contains a plant inducible promoter box

Putative PIP sequences were located 133-137 nt upstream of the XAC3314 start codon. The T3SS transcriptional activators, HrpG and HrpX, recognize the PIP and activate or repress the gene controlled by it. The XAC3314 gene in Xac_306 is activated by HrpG and repressed by HrpX and has a PIP of TTGCC-N14-TTGCA. Group I PIP sequences include the Xac_306 PIP and two additional PIPs, TTGCC-N13-TTGCA and TTGCC-N14-TTGCG. The group II PIP sequences had an additional nucleotide change (T->C) in the upstream region of the PIP sequence (CTGCC-N14-TTGCA). The sequenced X. gardneri genome had a probable PIP sequence of TTGCC-N14-TTAGA at the same location as the other PIP sequences. The sequenced X. vasicola strains had a possible PIP 177 nt upstream of the start codon (TTGCA-N14-TTACG), but this PIP box is divergent from the other PIP boxes identified and may not be a PIP.
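A search for the group I and group II PIP variants listed above can be sketched with a regular expression. The PIP boxes in this study were not necessarily located this way; the pattern covers only the four variants TTGCC-N14-TTGCA, TTGCC-N13-TTGCA, TTGCC-N14-TTGCG and CTGCC-N14-TTGCA, and the function name, the toy sequence and the convention of measuring distance from the first base of the motif to the start codon are assumptions.

```python
import re

# [TC]TGCC, a 13-14 nt spacer, then TTGCA or TTGCG.
PIP_PATTERN = re.compile(r"[TC]TGCC[ACGT]{13,14}TTGC[AG]")

def find_pip_boxes(upstream):
    """Return (distance_to_start_codon, motif) for each match.

    upstream -- sequence 5'->3' that ends immediately before the start
                codon; distance is counted from the first motif base.
    """
    upstream = upstream.upper()
    return [
        (len(upstream) - m.start(), m.group())
        for m in PIP_PATTERN.finditer(upstream)
    ]

# Toy upstream region (hypothetical sequence, not from a real strain).
toy_upstream = "G" * 20 + "TTGCC" + "A" * 14 + "TTGCA" + "G" * 100
print(find_pip_boxes(toy_upstream))
```

More divergent candidates, such as those reported for X. gardneri and X. vasicola, fall outside this pattern by design and would require a looser expression or a position weight matrix.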

5.4.4 XAC3314 loss and HGT in Xanthomonas

The intergenic regions between genes XAC3313 and XAC3315 were aligned after removing the XAC3314 homolog from strains with the gene, and a cladogram was constructed (Figure 5.4). The small, medium and big amplicons all contain distinct intergenic regions. The intergenic sequences share small stretches with each other, which may be due to the loss of the gene early in the split of the lineage. Intergenic regions that contain XAC3314 have a promoter upstream (see above) that likely requires greater conservation in this region. For strains with a XAC3314 homolog, the relationships inferred from the intergenic region were similar to those inferred from the XAC3314 homolog itself. The XAC3314 genotype of each strain was overlaid onto the RIF cladogram to identify recent HGT events (Figure 5.3). There was no evidence of HGT between species since acquisition of the XAC3314 gene cluster or loss of the XAC3314 gene. However, HGT of XAC3314 among strains with the same or similar RIF genotype is difficult to identify. There is evidence of HGT of XAC3314 among similar strains, as revealed by strains in group 1a. XAC3314 may have transferred between strain K0332 and K0815/K0816 and also K0583 and K0588 with Xav. These strains have the same XAC3314 homolog and a different RIF sequence. Similar strains that inhabit the same niche may horizontally transfer DNA to each other. The acquired DNA may be from an XAC3314+ strain, replacing a similar copy of XAC3314 or restoring a copy of XAC3314 that was lost since initial acquisition. These HGT events may not affect the strain relationships revealed in the phylogenetic trees of either the intergenic or genic region of XAC3314.
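The comparison of genotypes described above, in which strains that share an XAC3314 sequence but differ in their RIF sequence are flagged as candidates for transfer, can be sketched as a grouping step. The Python example below is illustrative only; the function name, strain names and genotype labels are hypothetical and do not represent data from this study.

```python
from collections import defaultdict

def candidate_hgt_groups(genotypes):
    """Find XAC3314 genotypes shared by strains with different RIF genotypes.

    genotypes -- {strain: (rif_genotype, xac3314_genotype or None)}
    Returns {xac3314_genotype: {rif_genotype: [strains]}} for each XAC3314
    genotype that occurs on more than one RIF background.
    """
    by_gene = defaultdict(lambda: defaultdict(list))
    for strain, (rif, gene) in sorted(genotypes.items()):
        if gene is None:       # strain lacks an XAC3314 homolog
            continue
        by_gene[gene][rif].append(strain)
    return {
        gene: dict(rif_groups)
        for gene, rif_groups in by_gene.items()
        if len(rif_groups) > 1
    }

# Hypothetical strains and genotype labels.
toy = {
    "s1": ("rif1", "x1"),
    "s2": ("rif2", "x1"),
    "s3": ("rif2", "x2"),
    "s4": ("rif2", "x2"),
    "s5": ("rif3", None),
}
print(candidate_hgt_groups(toy))
```

A flagged group is only a candidate: identical XAC3314 sequences on different RIF backgrounds can also arise when RIF has diverged while XAC3314 has not, so each case still needs to be examined on the trees.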


Figure 5.3. HGT of XAC3314 in Xanthomonas. Rooted maximum-likelihood tree constructed from intergenic sequences with Xylella fastidiosa as the outgroup. Bootstrap values (shown at the node) are expressed as a percentage of 1,000 replicates. The XAC3314 groups are overlaid on the RIF tree. The colored sequences in group 1a from Figure 5.2 are also colored the same in this figure. XAC3314 has not been transferred between species of Xanthomonas. Within group 1a, XAC3314 may have transferred between strain K0332 and K0815/K0816 and also K0583 and K0588 with Xav.

Figure 5.4 (Next Page). Diversity of the three different intergenic regions amplified in Xanthomonas strains. The un-rooted maximum-likelihood tree constructed from intergenic sequences of Xanthomonas.


5.5 Discussion

5.5.1 Distribution of XAC3314 in Xanthomonas species

Initial analysis of the sequenced genomes identified XAC3314 homologs in X. axonopodis, X. gardneri and X. vasicola strains (Richards et al., 2009; Ewing 2008). The analysis of the PBC has shown that X. campestris and X. hortorum also contain a XAC3314 homolog. Furthermore, X. axonopodis pv. dieffenbachiae strains also contain a XAC3314 homolog. I propose that the sequenced X. campestris strains analyzed in Chapter 3 likely lost their XAC3314 homolog after acquisition, as visualized in Figure 3.7, once the gene was no longer providing a useful function. The loss of XAC3314 may have occurred at any time between the last common ancestor that contained a XAC3314 homolog and the isolation of DNA from the strain for sequencing. Gene loss among strains of similar bacteria was observed with other symbionts (Moran et al., 2011). There is a possibility that strains in culture are more likely to lose the gene than those recently isolated from a plant. The subset of over 300 Xanthomonas strains analyzed here represents a small portion of the diversity of XAC3314 homologs likely present in Xanthomonas. Expanding the analysis to the 3,000+ Xanthomonas strains in the PBC and other international collections may increase the known diversity of XAC3314 and show whether there has been any HGT activity since acquisition. The species that contain a XAC3314 homolog infect a variety of crops. This has led me to speculate that this gene may provide a generalized function to aid in the pathogen-host interaction. I suspect that XAC3314 genes are likely lost in bacterial strains in which the gene provides no selective advantage on a specific host or tissue, or if the strain is not residing on a plant.

5.5.2 XAC3314 diversity in the PBC

There are four genotypes of XAC3314 present in the PBC, each unique to a species of Xanthomonas. This is the pattern expected of a gene that has been inherited vertically since its acquisition by the ancestral genome. Although there is no evidence that XAC3314 was recently acquired by the different species of Xanthomonas in the PBC, similar strains may have acquired XAC3314 by HGT since the initial acquisition. Such HGT events are difficult to decipher, because the same sequence patterns may instead be the result of random mutations in the gene.

5.5.3 Transcriptional activity of XAC3314 homologs

Yamazaki et al. (2008) have shown that the XAC3314 gene in Xac_306 is positively regulated by HrpG and negatively regulated by HrpX. The gene has a probable PIP box that is positively regulated by HrpG. The PIP box previously identified in Xav was regulated by HrpX (Koebnik 2006). PIP boxes that are more similar to the HrpX-regulated PIP box may therefore be regulated by the HrpX transcription factor rather than by HrpG, which activates the gene in Xac_306. It is also possible that PIP sequences that do not perfectly match the PIP sequence in Xac_306 are recognized less efficiently, so that the corresponding homolog is not expressed and secreted at the same level as the XAC3314 homolog in Xac_306. Genes with a more divergent PIP sequence may also be controlled by a different transcription factor. Strains of differing pathogenicity that carry a XAC3314 homolog (such as Xac_306 and Xfa_11122) may transcribe and secrete the encoded protein at different levels, and the protein may still affect pathogenicity. Analysis of the secreted protein in different strains during infection may provide evidence that XAC3314 homologs affect pathogenicity or virulence by some as yet unknown mechanism.


5.6 Conclusion

I have amplified and sequenced the XAC3314 homolog and the intergenic region between the flanking genes for all strains with a RIF sequence (Chapter 4). The intergenic region was sequenced even when no XAC3314 homolog was present. I identified two additional species of Xanthomonas and one additional pathovar of X. axonopodis that contain a XAC3314 homolog. The PIP boxes associated with XAC3314 suggest that different strains may regulate XAC3314 using different transcription factors. This analysis has shown that the inheritance patterns of XAC3314 and the intergenic region are those of a gene that was anciently acquired and that may have undergone HGT among similar strains of X. axonopodis.


Chapter 6: Summary

I have analyzed the Xanthomonadaceae family to study the evolution of a plant-like gene that was anciently acquired (Richards et al., 2009), after the rise of the land plants, as part of a gene cluster in Xanthomonas. This and two other gene clusters in the vicinity of XAC3314 were likely inserted from non-gamma proteobacterial strains into an ancestral genome of Xanthomonas after the split of X. albilineans. HGT events in the Xanthomonas genome have previously been analyzed (Hsiao et al., 2003; Garcia-Vallve et al., 2003), but these genes were not identified in those studies, likely because the HGT occurred so long ago. According to the “Selfish Operon Model” (Lawrence and Roth 1996), this gene would be more likely to propagate by HGT if it is clustered with other genes that also provide some function (together or separately). The carboxylesterase gene acquired with the XAC3314 gene cluster likely provides the plant-associated bacteria with an additional nutrient source, and the XAC3314 protein provides some unknown useful function. The ancient ancestral genome that acquired the XAC3314 gene cluster has undergone gene loss and rearrangement as different strains evolved to the current state of the isolates analyzed. No evidence of recent HGT of the XAC3314 gene has been identified among species of Xanthomonas; however, horizontal transfer among similar strains of X. axonopodis may have occurred. A functional analysis of XAC3314 should provide insight into why Xanthomonas and other non-plant organisms have acquired a gene encoding a DUF239 domain.


Appendix

Appendix A: Design of RIF primers for other plant associated bacteria

A.1 Extension of the RIF marker to other genera

Primer pairs to amplify and sequence the RIF marker (Table A.1) for Ralstonia and Clavibacter were designed within 60 nt of the Xanthomonas primers (Figure A.1). To further evaluate RIF, we analyzed Dickeya, Pectobacterium and Pantoea. Initial Enterobacteriaceae RIF primer sets developed from sequenced strains of Dickeya (Dd_3937), Pectobacterium (Pa_1043) and Erwinia (NC_013971.1) were used for PCR amplification and sequencing (Table A.1). New primers designed from the more recently sequenced Dickeya (Dd_703 and Dd_586) and Pectobacterium (Pcc_PC1 and Pw_Wpp163) genomes have improved amplification rates and genus specificity (data not shown).

A.2 In silico classification of Xylella and Pseudomonas with RIF

The potential of RIF to classify bacteria from additional plant associated genera was further examined with an in silico analysis of Pseudomonas and Xylella strains available in GenBank. All but one of the completely sequenced strains contained a single copy of the dnaA gene and more than one copy of the ITS region (P. aeruginosa strain 39016 contains a single copy of both ITS and RIF). RIF primers for X. fastidiosa were developed from five strains; primers for Pseudomonas were developed from twenty-three strains representing eight species (Table A.1). The five Xylella strains were classified into three RIF genotypes, whereas every Pseudomonas strain had a different RIF sequence.
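The genotype counts reported above amount to grouping strains whose marker sequences are identical. A minimal sketch of that grouping step is given here; the strain labels and sequences are invented placeholders, not RIF data from GenBank or from this study, and the function name `group_genotypes` is hypothetical.

```python
from collections import defaultdict


def group_genotypes(marker_by_strain):
    """Group strains whose marker sequences are identical (one group = one genotype)."""
    groups = defaultdict(list)
    for strain, seq in marker_by_strain.items():
        # Compare case-insensitively so "acgt" and "ACGT" count as the same sequence.
        groups[seq.upper()].append(strain)
    # Largest genotypes first, ties broken by strain name, for a stable result.
    return sorted((sorted(members) for members in groups.values()),
                  key=lambda g: (-len(g), g))


# Hypothetical toy sequences (assumption: NOT real RIF sequences).
toy = {
    "strain_1": "ACGTACGT",
    "strain_2": "ACGTACGT",
    "strain_3": "ACGTTCGT",
    "strain_4": "acgttcgt",
    "strain_5": "ACGAACGT",
}
genotypes = group_genotypes(toy)
print(len(genotypes), genotypes)
```

With these toy inputs, five strains collapse into three genotypes, which mirrors the shape of the Xylella result without reproducing its data.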


Table A.1. Additional primer pairs to amplify RIF designed for other genera.

Genus a           Forward Primer                Reverse Primer
Clavibacter       5'-TACGGCTTCGACACCTTCG-3'     5'-CGGTGATCTTCTTGTTGGCG-3'
Dickeya b         5'-CCTATCGYTCGAACGTGAA-3'     5'-CTGCTCGATTTTGCGGCAG-3'
Dickeya c         5'-CACACYTATCGYTCCAAYGT-3'    5'-TGTCGTGACTTTCYTCRCGC-3'
Pectobacterium b  5'-TACCGTTCCAATGTGAACCC-3'    5'-AAATCTTCTTTGATGTCGTGG-3'
Pectobacterium c  5'-ATGTGAACCCSAAACATACGT-3'   5'-TTCACGCAACTGCTCAATCTT-3'
Ralstonia         5'-TCRCGSCTGAACCSATCCT-3'     5'-TTGAGCTGSGCGTCCTTGC-3'
Xanthomonas       5'-CAGCACGGTGGTGTGGTC-3'      5'-CCTGGATTCGCATTACACC-3'
Pseudomonas d     5'-CTBAAGCACACCAGYTAYC-3'     5'-TCGCATTAGTTTATCCCAGTC-3'
Xylella e         5'-GAGGGACGAAGCAATCAACT-3'    5'-CAGCAGGTTCTTGTAGTCCT-3'

a No genome sequence was available for Pantoea agglomerans when these primers were developed.

b Primer pairs for three genera of Enterobacteriaceae were initially developed by comparison of the dnaA gene of Dickeya, Pectobacterium and Erwinia available in 2007. Primers were developed using the criteria applied for selection of the Clavibacter and Ralstonia primers and alignment of the respective dnaA gene (see methods). The initial Pectobacterium and Dickeya primers failed to produce amplicons for two strains of Dickeya (one D. chrysanthemi and one D. paradisiaca), three strains of Pantoea (one that also failed to amplify with the ITS marker) and nine strains of Pectobacterium (eight of which failed to amplify with the ITS marker). Pectobacterium strains K0509, K0522 and K0574 and Pantoea strains required an annealing temperature of 51°C.

c Primer sets were derived using newly sequenced strains of Dickeya (Dz_1591, Dd_703) and Pectobacterium (Pc_PCC1 and Pw_Wpp163). The Dickeya primers amplified all nine reference Dickeya strains. The new Pectobacterium primers no longer amplified any Pantoea strains at 51°C and they amplified strain K0522, K0574 and one additional Pectobacterium strain, K0510, but still did not amplify strain K0509 at 61°C.

d Primers developed from Pseudomonas accessions NZ_CH482384, NZ_CM001020, NC_011770, NC_009656, NC_002516, NC_008463, NC_008027, NC_004129, NC_012660, NZ_CM001025, NC_009439, NC_009512, NC_010322, NC_002947, NC_010501, NZ_GG774590, NZ_DS997060, NC_005773, NC_007005, NZ_GG700508, NC_004578, NZ_GG699506 and NC_009434.

e Primers developed from Xylella fastidiosa accessions CP002165, CP001011, AE009442, CP000941 and AE003849.
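Several primers in Table A.1 contain IUPAC ambiguity codes (Y, R, S, B), so each degenerate primer stands for a small pool of oligonucleotides. The sketch below, which is an illustration and not part of the primer design procedure used in this work, counts the size of that pool and builds a regular expression that finds a binding site in a template. The two primer sequences are taken from Table A.1; the template string and the function names `degeneracy` and `primer_pattern` are hypothetical.

```python
import re

# Standard IUPAC nucleotide codes mapped to the bases they represent.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer):
    """Number of distinct non-degenerate oligos encoded by a degenerate primer."""
    count = 1
    for base in primer.upper():
        count *= len(IUPAC[base])
    return count


def primer_pattern(primer):
    """Compile a regular expression matching every sequence the primer encodes."""
    parts = []
    for base in primer.upper():
        options = IUPAC[base]
        parts.append(options if len(options) == 1 else "[" + options + "]")
    return re.compile("".join(parts))


ralstonia_forward = "TCRCGSCTGAACCSATCCT"    # Table A.1
pseudomonas_forward = "CTBAAGCACACCAGYTAYC"  # Table A.1

# Hypothetical template, invented so that it contains exactly one binding site.
template = "GG" + "TCACGGCTGAACCCATCCT" + "AA"
match = primer_pattern(ralstonia_forward).search(template)

print(degeneracy(ralstonia_forward), degeneracy(pseudomonas_forward), match.start())
```

Exact matching is a simplification: real primers tolerate some mismatches, so a search of this kind can only suggest candidate sites.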


Figure A.1 (Next Page). Multiple sequence alignment of nucleotides 311 to 1311 of the dnaA genes of six genera. Primer regions are shown for Clavibacter, Xanthomonas, Ralstonia, Erwinia, Dickeya and Pectobacterium. Primer binding regions are shown in red with black background. The AAA+ domain (green) and the C-terminal domain (pink) (Kaguni et al., 2004) are highlighted.


Appendix B: Identification of Clavibacter with RIF

B.1 Comparison of the RIF and ITS markers in sequenced reference genomes of Clavibacter

A comparison of two sequenced subspecies of C. michiganensis revealed twice as many polymorphic nucleotides in the RIF marker as in the ITS marker (31 vs. 14 nucleotide differences). RIF and ITS markers formed similar groupings for C. michiganensis (Figure B.1).
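Counts of polymorphic nucleotides such as those above come from comparing two aligned sequences column by column. The following sketch shows one way to do this; whether alignment gaps were counted in the comparison reported here is not stated, so the `ignore_gaps` switch is an assumption, and the two sequences are invented placeholders.

```python
def count_differences(seq_a, seq_b, ignore_gaps=True):
    """Count columns at which two aligned sequences differ.

    Both sequences must come from the same alignment and therefore have
    the same length. Columns containing a gap ("-") are skipped when
    ignore_gaps is True.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if ignore_gaps and "-" in (a, b):
            continue
        if a != b:
            differences += 1
    return differences


# Hypothetical aligned fragments (assumption: not Clavibacter data).
marker_1 = "ACGT-ACGTA"
marker_2 = "ACCTTACGAA"
print(count_differences(marker_1, marker_2),
      count_differences(marker_1, marker_2, ignore_gaps=False))
```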

B.2 RIF frameworks for Clavibacter

RIF sequences were generated for 124 of 125 Clavibacter michiganensis strains (99% success rate), including 121 strains belonging to C. michiganensis subsp. michiganensis that had been collected from every continent except Australia and Antarctica (Supplemental Table SB.1 and Figure B.2). Ten different sequences remained after the alignment was trimmed to 660 nt. The resulting neighbor-joining tree contains three clades representing the three subspecies. Notably, there was no correlation between the RIF sequence of C. michiganensis subsp. michiganensis strains and their geographic origin, i.e. strains from distant geographic locations shared the same RIF sequence. Worldwide distribution of this seed-borne pathogen on infected tomato seed may account for this lack of correlation.
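Trimming an alignment before counting distinct sequences keeps ragged read ends from being mistaken for real differences. The sketch below assumes that trimming means keeping only the column range covered by every sequence; the exact trimming rule used for the 660 nt alignment is not described here, and the toy alignment and the name `trim_alignment` are hypothetical.

```python
def trim_alignment(aligned):
    """Trim an alignment to the column range in which every sequence has data.

    `aligned` maps strain names to equal-length aligned sequences in which
    missing ends are written as runs of "-".
    """
    lengths = {len(seq) for seq in aligned.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    # First column after the longest leading gap run.
    start = max(len(seq) - len(seq.lstrip("-")) for seq in aligned.values())
    # Column just past the last base of the sequence that ends earliest.
    end = min(len(seq.rstrip("-")) for seq in aligned.values())
    return {name: seq[start:end] for name, seq in aligned.items()}


# Hypothetical toy alignment (assumption: not Clavibacter data).
toy_alignment = {
    "strain_a": "--ACGTACGT--",
    "strain_b": "ACACGTACGTAC",
    "strain_c": "-CACGTTCGTA-",
}
trimmed = trim_alignment(toy_alignment)
distinct = set(trimmed.values())
print(len(next(iter(trimmed.values()))), len(distinct))
```

Here strain_a and strain_b become identical once the ragged ends are removed, so the three toy strains yield two distinct sequences.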


Figure B.1. RIF sequences distinguish fewer strains of Clavibacter but produce a more robust tree. Unrooted neighbor-joining trees for the RIF and ITS markers were constructed from nineteen Clavibacter strains from the PBC and two reference strains from GenBank. Identical sequences are represented only once and the number of sequenced strains is indicated. Bootstrap values >50% (shown at the node) are expressed as a percentage of 5,000 replicates.


Figure B.2. The RIF marker separates Clavibacter michiganensis strains into the three subspecies. The rooted neighbor-joining cladogram was constructed from 124 characterized Clavibacter strains from the PBC, two reference strains from GenBank and the outgroup Leifsonia xyli. Identical sequences are represented only once and the number of sequenced strains is indicated on each leaf. Bootstrap values >50% (shown at the node) are expressed as a percentage of 5,000 replicates. Cm† is a non-pathogenic strain isolated from tomato that is most similar to C. michiganensis subsp. insidiosus. Cmm: C. michiganensis subsp. michiganensis. Cmi: C. michiganensis subsp. insidiosus. Cms: C. michiganensis subsp. sepedonicus.


Appendix C: Identification of Ralstonia with RIF

C.1 Comparison of the RIF and ITS markers in sequenced reference genomes of Ralstonia

The average distance between the six Ralstonia strains representing four species was similar with both RIF and ITS markers (50.5 and 54.8 nucleotide differences, respectively). Nevertheless, RIF separated R. solanacearum (Rs) strains GMI1000 and UW551 from R. pickettii (Rp) strain 12J while the ITS did not. ITS resulted in interspecies (Rs-Rp) distances (35 nt) similar to intraspecies (Rs-Rs) distances (26 nt) while RIF showed a clear separation between interspecies distances (42 nt) and intraspecies distances (18 nt).

In addition, although the Rs ITS clades were separated by a greater distance than the RIF clades, the sequence diversity within an ITS clade was less than that within a RIF clade (Figure C.1). R. pickettii was placed outside the Rs clade in the RIF but not the ITS tree.
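The comparison of interspecies and intraspecies distances in this section can be illustrated with a short calculation over all strain pairs. This is a sketch under stated assumptions: distances are plain counts of differing positions between equal-length aligned sequences, and the strain labels, species labels and sequences are invented rather than taken from the Ralstonia genomes.

```python
from itertools import combinations


def mean_intra_inter(sequences, species):
    """Return (mean intraspecies, mean interspecies) nucleotide differences.

    `sequences` maps strain names to equal-length aligned sequences and
    `species` maps the same strain names to a species label.
    """
    intra, inter = [], []
    for (name_1, seq_1), (name_2, seq_2) in combinations(sequences.items(), 2):
        distance = sum(1 for a, b in zip(seq_1, seq_2) if a != b)
        if species[name_1] == species[name_2]:
            intra.append(distance)
        else:
            inter.append(distance)

    def mean(values):
        return sum(values) / len(values) if values else float("nan")

    return mean(intra), mean(inter)


# Hypothetical toy data (assumption: not real RIF or ITS sequences).
toy_sequences = {"Rs_x": "AAAAAAAA", "Rs_y": "AAAAAAAT", "Rp_z": "AAAATTTT"}
toy_species = {"Rs_x": "Rs", "Rs_y": "Rs", "Rp_z": "Rp"}
intra_mean, inter_mean = mean_intra_inter(toy_sequences, toy_species)
print(intra_mean, inter_mean)
```

A marker separates species cleanly when the interspecies mean is well above the intraspecies mean, as reported for RIF (42 nt vs. 18 nt).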


Figure C.1. RIF sequences distinguish more Ralstonia strains than ITS. Unrooted neighbor-joining trees for the RIF and ITS markers were constructed from ninety-seven Ralstonia strains from the PBC and three reference strains from GenBank. Identical sequences are represented only once and the number of sequenced strains is indicated on each leaf. Bootstrap values >50% (shown on the node) are expressed as a percentage of 5,000 replicates. Rs strains grouped differently with the two markers, as illustrated by strains K0157, K0024, K0190 and K0018, which were re-sequenced and are marked with an asterisk. Although the average nucleotide difference between groups of Rs with the ITS marker is high, there is little sequence variation within each individual group, and fewer strains are resolved than with the RIF marker. Also, ITS sequence from Ralstonia pickettii strain 12J is placed within Rs clade B on the ITS tree, while RIF sequence from the same strain is placed outside Rs clade B on the RIF tree.


C.2 Building Ralstonia RIF frameworks

The RIF marker was successfully sequenced for 210 (87.8%) of 239 characterized strains of R. solanacearum (Supplemental Table SC.1) representing four races, blood disease bacterium (BDB), and the atypical strain R. solanacearum UW433 (ACH0732). Seventeen different sequences were obtained after the alignment was trimmed to 617 nucleotides (Figure C.2). In order to compare the sequencing efficiency and resolution of RIF to that of the egl marker, which has been used to classify Rs into sequevars (Prior and Fegan 2005), we attempted to generate egl sequence from 191 characterized Rs strains. RIF was sequenced with a 12.6% higher success rate than egl. Strains for which both RIF sequence and an egl sequevar had been obtained were compared in a neighbor-joining tree (Figure C.2). The egl sequevars did not group into a single clade in the RIF tree, e.g. egl sequevars 13–18 are present in both clades B and D. The different groupings of strains with the RIF and egl markers are in agreement with Fegan's observation that Rs strain ACH0732 groups differently with the ITS, egl and polygalacturonase markers (Fegan et al., 1998), and may again be indicative of HGT. Thus, although greater resolution was achieved with the egl marker (22 different sequences) than with the RIF marker (17 different sequences), the RIF marker, which is under stabilizing selection (Urwin et al., 2003), should more accurately portray genetic relationships than the positively selected, pathogenicity-related egl gene.


Figure C.2. Ralstonia solanacearum strains classified based on RIF genotypes. The rooted neighbor-joining cladogram was constructed from 210 characterized Ralstonia strains from the PBC, six reference strains from GenBank and the outgroup . Identical sequences are represented only once and the number of sequenced strains is indicated on each leaf. Bootstrap values >50% (shown at the node) are expressed as a percentage of 5,000 replicates. All Rs strains (clade A) are clearly separated from other Ralstonia species. Rs1, Rs2, Rs3 and Rs4 are the four races of R. solanacearum, BDB is the blood disease bacterium and Rs† is R. solanacearum UW433 (ACH0732) which is an atypical race 1 biovar 2 strain (Fegan and Prior 2005). Numbers in boxes next to tree leaves are egl sequevars obtained for those strains.


Appendix D: Identification of Enterobacteriaceae with RIF

D.1 Comparison of the RIF and ITS markers in sequenced reference genomes

In silico analysis of the RIF and ITS markers of the nine completely sequenced Dickeya, Pectobacterium and Erwinia genomes revealed 2–3 times more polymorphic nucleotides in RIF than ITS with Dickeya (80 vs. 40 nt) and Erwinia (58 vs. 21 nt) (Figure D.1). These distances are similar to those observed for different species of Xanthomonas and illustrate the higher resolution of the RIF marker. In contrast, Pectobacterium strain PccPC1 was separated from other Pectobacterium accessions by longer branches in the ITS than the RIF tree (Figure D.1). This is due to the numerous nucleotides affected by several indels in the ITS sequence of PccPC1 (27 nucleotides from multiple insertions and six nucleotides from a deletion relative to the two other Pectobacterium strains).

The efficacy of the RIF and ITS markers was directly compared for 96 Enterobacteriaceae accessions (Supplemental Table SD.1). RIF was easier to amplify and sequence in this family than ITS as illustrated by the higher success rate for the direct sequencing of RIF (53) as compared to ITS (35) from 64 strains of Dickeya (all verified to be Dickeya with the ADE primers [Nassar et al., 1996]). A greater number of different sequences were obtained with RIF (12) than with ITS (11) for the same set of Dickeya strains (Figure D.2). We were able to obtain RIF but no ITS sequence for fourteen Pectobacterium strains. The difficulty in sequencing the ITS amplicon was likely due to the presence of multiple rDNA copies that differ in sequence and length. All Pantoea strains and the Pectobacterium strains K0509, K0574 and K0522 required a lower stringency annealing temperature (51°C vs. 61°C) for RIF amplification.
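The Dickeya counts above are easier to compare as percentages. The short calculation below uses only the figures given in the text (53 RIF and 35 ITS sequences from 64 strains); the function name `success_rate` is a hypothetical helper.

```python
def success_rate(sequenced, attempted):
    """Percentage of attempted strains that yielded a usable sequence."""
    if attempted <= 0:
        raise ValueError("attempted must be positive")
    return 100.0 * sequenced / attempted


dickeya_strains = 64                           # strains attempted (from the text)
rif_rate = success_rate(53, dickeya_strains)   # RIF sequences obtained
its_rate = success_rate(35, dickeya_strains)   # ITS sequences obtained

print(f"RIF {rif_rate:.1f}%  ITS {its_rate:.1f}%  "
      f"difference {rif_rate - its_rate:.1f} percentage points")
```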


Figure D.1. Neighbor-joining cladograms for the family Enterobacteriaceae using ITS and RIF sequences. Sequences for Dickeya, Pectobacterium, Pantoea and Erwinia were extracted from fully sequenced strains in GenBank. Bootstrap values >50% (shown at the node) are expressed as a percentage of 5,000 replicates. Note the longer branch lengths for the Erwinia and Dickeya species in the RIF tree.


Figure D.2. RIF sequences distinguish more Dickeya strains than ITS. Unrooted neighbor-joining trees for the RIF and ITS markers were constructed from twenty-nine Dickeya strains from the PBC and three reference strains (with strain names) from GenBank. Identical sequences are represented only once and the number of sequenced strains is indicated. Bootstrap values >50% (shown at the node) are expressed as a percentage of 5,000 replicates.


D.2 Building Enterobacteriaceae RIF frameworks

Twenty-nine different RIF sequences were obtained for Enterobacteriaceae strains after the alignment was trimmed to 722 nt (Figure D.3). Dickeya, Pectobacterium, Pantoea, Erwinia and Yersinia formed distinct clades in the neighbor-joining tree. Fifty strains of Pectobacterium and Dickeya from the PBC that could not be identified to species based on biochemical tests were placed in the RIF framework. Two strains of Pectobacterium sp. isolated from Aglaonema in Hawaii and two strains isolated from water grouped with the P. carotovorum clade near RIF genotypes from known reference strains (green text in Figure D.3). The other forty-six strains grouped with distinct Dickeya reference strains (red text in Figure D.3). Notably, four D. dadantii strains were located in three separate sub-clades within the Dickeya clade. The sequenced strain P. wasabiae Wpp163 (formerly named P. carotovorum) and the potato strain P. carotovorum K0574 (classified based on bacteriological tests) differed from each other by only three nt. Similarly, the distance between several different P. atrosepticum strains in clades I and K (Figure D.3) was greater (54.5 nt) than that between the species P. atrosepticum and P. carotovorum (50.8 nt and 48.6 nt, respectively, in clades J and K).
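Placing an unidentified strain in a RIF framework was done here with a neighbor-joining tree. As a simpler stand-in, and explicitly not the method used in this work, the sketch below assigns a query sequence to the reference genotype with the fewest nucleotide differences. The reference labels, all sequences and the function name `nearest_reference` are hypothetical.

```python
def nearest_reference(query, references):
    """Return (name, distance) of the reference sequence closest to `query`.

    Distance is the number of differing positions between equal-length
    aligned sequences. Ties are resolved by reference name so the result
    is deterministic.
    """
    best_name, best_distance = None, None
    for name in sorted(references):
        reference = references[name]
        if len(reference) != len(query):
            raise ValueError("query and references must be aligned to equal length")
        distance = sum(1 for a, b in zip(query, reference) if a != b)
        if best_distance is None or distance < best_distance:
            best_name, best_distance = name, distance
    return best_name, best_distance


# Hypothetical reference genotypes and query (assumption: not real RIF data).
toy_references = {
    "Dickeya_ref": "ACGTACGTAC",
    "Pectobacterium_ref": "ACGTTTGTAA",
}
assignment = nearest_reference("ACGTACGTAA", toy_references)
print(assignment)
```

A nearest-match lookup gives no measure of support, which is one reason a bootstrapped tree is the more informative choice for the actual framework.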


Figure D.3. RIF tree of plant pathogenic Enterobacteriaceae. The rooted neighbor-joining cladogram was constructed from 74 Enterobacteriaceae strains, including 24 characterized strains of Dickeya, Pectobacterium and Pantoea from the PBC, eight reference strains from GenBank and the outgroup Yersinia pestis. Identical sequences are represented only once and the number of sequenced strains is indicated on each leaf. Bootstrap values >50% (shown at the node) are expressed as a percentage of 5,000 replicates. The true Erwinia species E. tasmaniensis separates from the renamed Erwinia species that branch into three different genera: Dickeya (clade A), Pectobacterium (clades G and H) and Pantoea (clade M). Red text indicates uncharacterized strains of Dickeya and green text indicates strains of Pectobacterium only characterized to genus in the PBC.


Appendix E: Supplemental Figures

Supplemental Figure S4.1. Neighbor-joining tree of the RIF marker for Xanthomonas. Rooted neighbor-joining tree for the RIF marker of seventy-seven unique sequences and eight reference strains from GenBank (see Table 2.1 for strain names) with Xylella as the outgroup. Bootstrap values (shown at the node) are expressed as a percentage of 1,000 replicates. Abbreviations are in Table 4.3.


Supplemental Figure S4.2. Minimum-evolution tree of the RIF marker for Xanthomonas. Rooted minimum-evolution tree for the RIF marker of seventy-seven unique sequences and eight reference strains from GenBank (see Table 2.1 for strain names) with Xylella as the outgroup. Bootstrap values (shown at the node) are expressed as a percentage of 1,000 replicates. Abbreviations are in Table 4.3.


Supplemental Figure S4.3 (Previous Page). Maximum-parsimony tree of the RIF marker for Xanthomonas. Rooted maximum-parsimony tree for the RIF marker of seventy-seven unique sequences and eight reference strains from GenBank (see Table 2.1 for strain names) with Xylella as the outgroup. Bootstrap values (shown at the node) are expressed as a percentage of 1,000 replicates. Abbreviations are in Table 4.3.

Supplemental Figure S4.4 (Next Page). Maximum-likelihood tree of the RIF marker for Xanthomonas. Rooted maximum-likelihood tree for the RIF marker of seventy-seven unique sequences and eight reference strains from GenBank (see Table 2.1 for strain names) with Xylella as the outgroup. Bootstrap values (shown at the node) are expressed as a percentage of 1,000 replicates. Abbreviations are in Table 4.3.


Appendix F: Supplemental Tables

Supplemental Table S4.1. Copy number and location of the dnaA gene in 1,067 sequenced bacterial genomes.

Strain Name  NCBI Accession Number  Start of dnaA  End of dnaA  Length
Blattabacterium sp. (Blattella germanica) Bge  CP001487.1  N/A  N/A
Blattabacterium sp. (Periplaneta americana) BPLAN  CP001429.2  N/A  N/A
Candidatus Blochmannia floridanus  BX248583.1  N/A  N/A
Candidatus Blochmannia pennsylvanicus BPEN  CP000016.1  N/A  N/A
Candidatus Carsonella ruddii  AP009180.1  N/A  N/A
Candidatus Hodgkinia cicadicola  CP001266.1  N/A  N/A
Candidatus Riesia pediculicola  CP001085.1  N/A  N/A
Candidatus Sulcia muelleri DMIN  CP001981.1  N/A  N/A
Candidatus Sulcia muelleri SMDSEM  CP001605.1  N/A  N/A
Candidatus Sulcia muelleri GWSS  CP000770.1  N/A  N/A
Wigglesworthia brevipalpis  BA000021.3  N/A  N/A
Chlamydia muridarum  AE002160.2  630061  631431  1371
  AE002160.2  656551  657918  1368
Chlamydia trachomatis 434 Bu  AM884176.1  598341  599711  1371
  AM884176.1  624880  626247  1368
Chlamydia trachomatis A HAR-13  CP000051.1  282141  283511  1371
  CP000051.1  308684  310051  1368
Chlamydia trachomatis B Jali20 OT  FM872308.1  282056  283426  1371


  FM872308.1  308598  309965  1368
Chlamydia trachomatis B TZ1A828 OT  FM872307.1  281987  283357  1371
  FM872307.1  308526  309893  1368
Chlamydia trachomatis D UW 3 CX  AE001273.1  279889  281259  1371
  AE001273.1  306433  307800  1368
Chlamydia trachomatis E 11023 uid43141  CP001890.1  280292  281662  1371
  CP001890.1  306832  308199  1368
Chlamydia trachomatis E 150 uid43143  CP001886.1  280276  281646  1371
  CP001886.1  306817  308184  1368
Chlamydia trachomatis G 11074 uid43149  CP001889.1  280391  281761  1371
  CP001889.1  306938  308305  1368
Chlamydia trachomatis G 11222 uid43147  CP001888.1  280068  281438  1371
  CP001888.1  306610  307977  1368
Chlamydia trachomatis G 9301 uid45851  CP001930.1  280391  281761  1371
  CP001930.1  306938  308305  1368
Chlamydia trachomatis G 9768 uid43145  CP001887.1  280391  281761  1371
  CP001887.1  306937  308304  1368
Chlamydia trachomatis L2b UCH 1 proctitis  AM884177.1  598382  599752  1371
  AM884177.1  624921  626288  1368
Chlamydophila abortus S26 3  CR848038.1  529040  530422  1383
  CR848038.1  412801  414153  1353
Chlamydophila caviae  AE015925.1  535600  536982  1383
  AE015925.1  419450  420802  1353
Chlamydophila felis Fe C-56  AP006861.1  627434  628816  1383
  AP006861.1  742836  744188  1353
Chlamydophila pneumoniae AR39  AE002161.1  490309  491691  1383
  AE002161.1  370232  371635  1404


Chlamydophila pneumoniae CWL029  AE001363.1  349592  350974  1383
  AE001363.1  469612  470964  1353
Chlamydophila pneumoniae J138  BA000008.3  349214  350596  1383
  BA000008.3  469322  470674  1353
Chlamydophila pneumoniae LPCoLN uid17947  CP001713.1  492191  493573  1383
  CP001713.1  371695  373047  1353
Chlamydophila pneumoniae TW 183  AE009440.1  347006  348388  1383
  AE009440.1  467113  468465  1353
Desulfohalobium retbaense DSM 5692  CP001734.1  2830406  2831761  1356
  CP001734.1  932647  934029  1383
Desulfomicrobium baculatum DSM 4028  CP001629.1  3507040  3508320  1281
  CP001629.1  331  1680  1350
Desulfovibrio desulfuricans ATCC 27774  CP001358.1  2640  4013  1374
  CP001358.1  978500  979948  1449
Desulfovibrio desulfuricans G20  CP000112.1  189  1499  1311
  CP000112.1  2337250  2338635  1386
Desulfovibrio magneticus RS 1  AP010904.1  126  1466  1341
  AP010904.1  3486420  3487817  1398
Desulfovibrio salexigens DSM 2638  CP001649.1  4233747  4235051  1305
  CP001649.1  3548083  3549444  1362
Desulfovibrio vulgaris Miyazaki F  CP001197.1  2588354  2589688  1335
  CP001197.1  4  1500  1497
Desulfovibrio vulgaris DP4  CP000527.1  9812  11113  1302
  CP000527.1  1217610  1219124  1515
Desulfovibrio vulgaris Hildenborough  AE017285.1  151  1464  1314
  AE017285.1  2347717  2349192  1476
Fibrobacter succinogenes S85 uid32617  CP001792.1  3396130  3397593  1464


  CP001792.1  128  1453  1326
Lawsonia intracellularis PHE MN1-00  AM180252.1  979314  980636  1323
  AM180252.1  358  1794  1437
Mycobacterium bovis BCG Pasteur 1173P2  AM408590.1  1  1524  1524
  AM408590.1  29668  31191  1524
Mycoplasma mycoides capri GM12 uid19245  CP001621.1  1  1353  1353
  CP001621.1  1078308  1079660  1353
Parachlamydia sp UWE25  BX908798.1  1295440  1296795  1356
  BX908798.1  424280  425665  1386
Pedobacter heparinus DSM 2366  CP001681.1  4313026  4314456  1431
  CP001681.1  2753213  2754643  1431
Pirellula sp  BX119912.1  6246565  6247827  1263
  BX119912.1  889863  891638  1776
Pirellula staleyi DSM 6068 uid29845  CP001848.1  1279958  1281154  1197
  CP001848.1  234  1946  1713
Salinibacter ruber DSM 13855  CP000159.1  1718032  1719726  1695
  CP000159.1  2  1576  1575
Thermincola JR uid41467  CP002028.1  1137843  1138862  1020
  CP002028.1  98  1453  1356
Acaryochloris marina MBIC11017  CP000828.1  3124142  3125509  1368
Acetobacter pasteurianus IFO 3283 1  AP011121.1  1291083  1292516  1434
Acetobacter pasteurianus IFO 3283 1 42C uid31141  AP011163.1  1291083  1292516  1434
Acetobacter pasteurianus IFO 3283 12 uid32203  AP011170.1  1291077  1292510  1434
Acetobacter pasteurianus IFO 3283 22 uid31135  AP011142.1  1292094  1293527  1434
Acetobacter pasteurianus IFO 3283 26 uid31137  AP011149.1  1292094  1293527  1434
Acetobacter pasteurianus IFO 3283 3 uid31131  AP011128.1  1292094  1293527  1434
Acetobacter pasteurianus IFO 3283 32 uid31139  AP011156.1  1291077  1292510  1434


Acetobacter pasteurianus IFO 3283 7 uid31133  AP011135.1  1291077  1292510  1434
Acholeplasma laidlawii PG 8A  CP000896.1  289  1647  1359
Acidaminococcus fermentans DSM 20731 uid33685  CP001859.1  34  1194  1161
Acidaminococcus fermentans DSM 20731 uid33685  CP001859.1  2751  4139  1389
Acidimicrobium ferrooxidans DSM 10331  CP001631.1  87  1463  1377
Acidiphilium cryptum JF-5  CP000697.1  261  1709  1449
Acidithiobacillus ferrooxidans ATCC 23270  CP001219.1  2980841  2982199  1359
Acidithiobacillus ferrooxidans ATCC 53993  CP001132.1  252  1610  1359
Acidobacterium capsulatum ATCC 51196  CP001472.1  2480464  2481900  1437
Acidothermus cellulolyticus 11B  CP000481.1  75  1517  1443
Acidovorax avenae citrulli AAC00-1  CP000512.1  71  1498  1428
Acidovorax JS42  CP000539.1  4429148  4430569  1422
Acinetobacter baumannii AB0057  CP001182.1  21687  23084  1398
Acinetobacter baumannii AB307 294  CP001172.1  95  1492  1398
Acinetobacter baumannii ACICU  CP000863.1  5370  6767  1398
Acinetobacter baumannii ATCC 17978  CP000521.1  95  1492  1398
Acinetobacter baumannii AYE  CU459141.1  170  1567  1398
Acinetobacter baumannii SDF  CU468230.2  367  1764  1398
Acinetobacter sp ADP1  CR543861.1  201  1598  1398
Actinobacillus pleuropneumoniae L20  CP000569.1  63  1547  1485
Actinobacillus pleuropneumoniae serovar 3 JL03  CP000687.1  2  1486  1485
Actinobacillus pleuropneumoniae serovar 7 AP76  CP001091.1  2  1486  1485
Actinobacillus succinogenes 130Z  CP000746.1  252  1619  1368
Actinosynnema mirum DSM 43827  CP001630.1  98  1831  1734
Aeromonas hydrophila ATCC 7966  CP000462.1  223  1593  1371
Aeromonas salmonicida A449  CP000644.1  201  1571  1371
Aggregatibacter actinomycetemcomitans D11S 1 uid40107  CP001733.1  1895609  1896970  1362


Aggregatibacter aphrophilus NJ8700  CP001607.1  53733  55094  1362
Agrobacterium radiobacter K84  CP000628.1  496696  498246  1551
Agrobacterium tumefaciens C58 Cereon  AE007869.2  317547  319109  1563
Agrobacterium vitis S4  CP000633.1  336131  337705  1575
Akkermansia muciniphila ATCC BAA 835  CP001071.1  1882  3288  1407
Alcanivorax borkumensis SK2  AM286690.1  1  1425  1425
Alicyclobacillus acidocaldarius DSM 446  CP001727.1  112  1476  1365
Aliivibrio salmonicida LFI1238  FM178379.1  7427  8833  1407
Alkalilimnicola ehrlichei MLHE-1  CP000453.1  75  1430  1356
Alkaliphilus metalliredigens QYMF  CP000724.1  50  1396  1347
Alkaliphilus oremlandii OhILAs  CP000853.1  49  1404  1356
Allochromatium vinosum DSM 180 uid32547  CP001896.1  232  1593  1362
alpha proteobacterium IMCC1322 uid28081  CP001751.1  1721041  1722573  1533
Alteromonas macleodii Deep ecotype  CP001103.1  759  2360  1602
Aminobacterium colombiense DSM 12261 uid32587  CP001997.1  141  1469  1329
Ammonifex degensii KC4 uid12390  CP001785.1  188  1522  1335
Anabaena variabilis ATCC 29413  CP000117.1  119  1501  1383
Anaerocellum thermophilum DSM 6725  CP001393.1  45  1409  1365
Anaerococcus prevotii DSM 20548  CP001708.1  254  1633  1380
Anaeromyxobacter dehalogenans 2CP 1  CP001359.1  248  1624  1377
Anaeromyxobacter dehalogenans 2CP-C  CP000251.1  22  1395  1374
Anaeromyxobacter Fw109-5  CP000769.1  48  1430  1383
Anaeromyxobacter K  CP001131.1  36  1412  1377
Anaplasma centrale Israel uid32765  CP001759.1  850199  851614  1416
Anaplasma marginale Florida  CP001079.1  387758  389173  1416
Anaplasma marginale St Maries  CP000030.1  389525  390940  1416
Anaplasma phagocytophilum HZ  CP000235.1  501807  503186  1380


Anoxybacillus flavithermus WK1  CP000922.1  401  1744  1344
Aquifex aeolicus  AE000657.1  208642  209841  1200
Arcanobacterium haemolyticum DSM 20595 uid37925  CP002045.1  77  1711  1635
Arcobacter butzleri RM4018  CP000361.1  1  1317  1317
Arcobacter nitrofigilis DSM 7299 uid32593  CP001999.1  71  1384  1314
Aromatoleum aromaticum EbN1  CR555306.1  1693960  1695405  1446
Arthrobacter aurescens TC1  CP000474.1  1  1419  1419
Arthrobacter chlorophenolicus A6  CP001341.1  177  1598  1422
Arthrobacter FB24  CP000454.1  5  1429  1425
Atopobium parvulum DSM 20469  CP001721.1  1200  2729  1530
Azoarcus BH72  AM406670.1  1  1443  1443
Azorhizobium caulinodans ORS 571  AP009384.1  1138239  1139573  1335
Azospirillum B510 uid32551  AP010946.1  3055827  3057344  1518
Azotobacter vinelandii DJ  CP001157.1  101  1537  1437
Bacillus amyloliquefaciens FZB42  CP000560.1  412  1752  1341
Bacillus anthracis A0248  CP001598.1  307  1647  1341
Bacillus anthracis Ames  AE017334.2  407  1747  1341
Bacillus anthracis Ames 581  AE017334.2  407  1747  1341
Bacillus anthracis CDC 684  CP001215.1  281  1621  1341
Bacillus anthracis str Sterne  AE017225.1  408  1748  1341
Bacillus cereus 03BB102  CP001407.1  292  1632  1341
Bacillus cereus AH187  CP001177.1  409  1749  1341
Bacillus cereus AH820  CP001283.1  408  1748  1341
Bacillus cereus ATCC 10987  AE017194.1  408  1748  1341
Bacillus cereus ATCC14579  AE016877.1  281  1621  1341
Bacillus cereus B4264  CP001176.1  407  1747  1341
Bacillus cereus cytotoxis NVH 391-98  CP000764.1  521  1861  1341


Bacillus cereus G9842  CP001186.1  246  1586  1341
Bacillus cereus Q1  CP000227.1  408  1748  1341
Bacillus cereus ZK  CP000001.1  408  1748  1341
Bacillus clausii KSM-K16  AP006627.1  183  1538  1356
Bacillus halodurans  BA000004.3  584  1933  1350
Bacillus licheniformis ATCC 14580  CP000002.3  507  1847  1341
Bacillus licheniformis DSM 13  AE017333.1  311  1651  1341
Bacillus megaterium DSM319 uid42425  CP001982.1  1  1344  1344
Bacillus megaterium QM B1551 uid30165  CP001983.1  1  1344  1344
Bacillus pseudofirmus OF4 uid28811  CP001878.1  1573534  1574886  1353
Bacillus pumilus SAFR-032  CP000813.1  1  1341  1341
Bacillus selenitireducens MLS10 uid13376  CP001791.1  654  2009  1356
Bacillus subtilis  AL009126.3  410  1750  1341
Bacillus thuringiensis Al Hakam  CP000485.1  351  1748  1398
Bacillus thuringiensis BMB171 uid43631  CP001903.1  283  1623  1341
Bacillus thuringiensis konkukian  AE017355.1  409  1749  1341
Bacillus tusciae DSM 2912 uid31345  CP002017.1  369  1721  1353
Bacillus weihenstephanensis KBAB4  CP000903.1  106  1446  1341
Bacteroides fragilis NCTC 9434  CR626927.1  4268926  4270356  1431
Bacteroides fragilis YCH46  AP006841.1  4362998  4364428  1431
Bacteroides thetaiotaomicron VPI-5482  AE015928.1  2697457  2698869  1413
Bacteroides vulgatus ATCC 8482  CP000139.1  1  1416  1416
Bartonella bacilliformis KC583  CP000524.1  1324320  1325894  1575
Bartonella grahamii as4aup  CP001562.1  153867  155438  1572
Bartonella henselae Houston-1  BX897699.1  158771  160342  1572
Bartonella quintana Toulouse  BX897700.1  146196  147767  1572
Bartonella tribocorum CIP 105476  AM260525.1  156702  158273  1572

Bdellovibrio bacteriovorus BX842601.2 1 1416 1416
Beijerinckia indica ATCC 9039 CP001016.1 1509 3041 1533
Beutenbergia cavernae DSM 12333 CP001618.1 322 1791 1470
Bifidobacterium adolescentis ATCC 15703 AP009256.1 1 1500 1500
Bifidobacterium animalis lactis AD011 CP001213.1 165 1916 1752
Bifidobacterium animalis lactis BB 12 uid42883 CP001853.1 635770 637533 1764
Bifidobacterium animalis lactis Bl 4 CP001515.1 134 1885 1752
Bifidobacterium animalis lactis DSM 10140 CP001606.1 134 1885 1752
Bifidobacterium animalis lactis V9 uid32515 CP001892.1 134 1885 1752
Bifidobacterium dentium Bd1 uid17583 CP001750.1 1 1503 1503
Bifidobacterium longum AE014295.3 1624263 1625765 1503
Bifidobacterium longum DJO10A CP000605.1 1665929 1667431 1503
Bifidobacterium longum infantis ATCC 15697 CP001095.1 210 1712 1503
Bifidobacterium longum JDM301 uid47579 CP002010.1 210 1712 1503
avium 197N AM167904.1 3709832 3711283 1452
Bordetella bronchiseptica BX470250.1 5310125 5311567 1443
BX470249.1 4744528 4745937 1410
BX470248.1 504563 505972 1410
Bordetella petrii AM902716.1 1 1437 1437
Borrelia afzelii PKo CP000395.1 460469 461926 1458
Borrelia burgdorferi CP001205.1 456576 458036 1461
Borrelia burgdorferi ZS7 CP001205.1 452671 454134 1464
Borrelia duttonii Ly CP000976.1 468262 469716 1455
Borrelia garinii PBi CP000013.1 458789 460243 1455
Borrelia hermsii DAH CP000048.1 460217 461671 1455
Borrelia recurrentis A1 CP000993.1 474985 476439 1455
Borrelia turicatae 91E135 CP000049.1 457651 459105 1455

Brachybacterium faecium DSM 4810 CP001643.1 44 1741 1698
Brachyspira hyodysenteriae WA1 CP001357.1 239539 240924 1386
Brachyspira murdochii DSM 12563 uid29543 CP001959.1 132 1511 1380
Bradyrhizobium BTAi1 CP000494.1 571 1998 1428
Bradyrhizobium japonicum BA000040.2 893712 895124 1413
Bradyrhizobium ORS278 CU234118.1 936 2366 1431
Brevibacillus brevis NBRC 100599 AP008955.1 465 1826 1362
bv 1 9 941 AE017223.1 784 2274 1491
Brucella abortus S19 CP000887.1 634 2274 1641
Brucella canis ATCC 23365 CP000872.1 634 2274 1641
Brucella melitensis CP001488.1 2001077 2002660 1584
Brucella melitensis ATCC 23457 CP001488.1 634 2274 1641
Brucella melitensis biovar Abortus AM040264.1 784 2274 1491
Brucella microti CCM 4915 CP001578.1 784 2274 1491
Brucella ovis CP000708.1 785 2275 1491
Brucella suis 1330 AE014291.4 784 2274 1491
Brucella suis ATCC 23445 CP000911.1 634 2274 1641
Buchnera aphidicola CP001161.1 13114 14487 1374
Buchnera aphidicola 5A Acyrthosiphon pisum CP001161.1 12554 13918 1365
Buchnera aphidicola Cc Cinara cedri CP000263.1 5647 6960 1314
Buchnera aphidicola Sg AE013218.1 12397 13761 1365
Buchnera aphidicola Tuc7 Acyrthosiphon pisum CP001158.1 12554 13918 1365
Buchnera sp BA000003.2 12554 13918 1365
Burkholderia 383 CP000151.1 76 1653 1578
Burkholderia ambifaria AMMD CP000440.1 203 1780 1578
Burkholderia ambifaria MC40 6 CP001025.1 302 1879 1578
Burkholderia CCGE1002 uid37719 CP002013.1 363 1961 1599

Burkholderia cenocepacia AU 1054 CP000378.1 2802249 2803934 1686
Burkholderia cenocepacia HI2424 CP000458.1 178 1755 1578
Burkholderia cenocepacia J2315 AM747720.1 467298 468875 1578
Burkholderia cenocepacia MC0 3 CP000958.1 303 1880 1578
Burkholderia glumae BGR1 CP001503.1 100 1704 1605
Burkholderia mallei ATCC 23344 CP000010.1 70 1626 1557
Burkholderia mallei NCTC 10229 CP000546.1 2267135 2268736 1602
Burkholderia mallei NCTC 10247 CP000548.1 25 1626 1602
Burkholderia mallei SAVP1 CP000526.1 2826231 2827832 1602
Burkholderia multivorans ATCC 17616 JGI CP000868.1 74 1648 1575
Burkholderia multivorans ATCC 17616 Tohoku AP009385.1 74658 76232 1575
Burkholderia phymatum STM815 CP001043.1 362 1918 1557
Burkholderia phytofirmans PsJN CP001052.1 299 1936 1638
Burkholderia pseudomallei 1106a CP000572.1 101194 102795 1602
Burkholderia pseudomallei 1710b CP000124.1 310638 312239 1602
Burkholderia pseudomallei 668 CP000570.1 85283 86884 1602
Burkholderia pseudomallei K96243 BX571965.1 85351 86952 1602
Burkholderia thailandensis E264 CP000086.1 3690347 3691957 1611
Burkholderia vietnamiensis G4 CP000614.1 294 1871 1578
Burkholderia xenovorans LB400 CP000270.1 192 1826 1635
Caldicellulosiruptor saccharolyticus DSM 8903 CP000679.1 642 2006 1365
Campylobacter concisus 13826 CP000792.1 1 1311 1311
Campylobacter curvus 525 92 CP000767.1 1 1311 1311
Campylobacter fetus 82-40 CP000487.1 1 1311 1311
Campylobacter hominis ATCC BAA-381 CP000776.1 1 1314 1314
CP000025.1 1 1323 1323
Campylobacter jejuni 81116 CP000814.1 1 1323 1323

Campylobacter jejuni 81-176 CP000538.1 1 1323 1323
Campylobacter jejuni doylei 269 97 CP000768.1 1 1323 1323
Campylobacter jejuni IA3902 uid28907 CP001876.1 1 1323 1323
Campylobacter jejuni RM1221 CP000025.1 1 1323 1323
Campylobacter lari RM2100 CP000932.1 1 1329 1329
Candidatus Accumulibacter phosphatis clade IIA UW 1 CP001715.1 239 1660 1422
Candidatus Amoebophilus asiaticus 5a2 CP001102.1 513 1946 1434
Candidatus Azobacteroides pseudotrichonymphae genomovar CFP2 AP010656.1 1 1368 1368
Candidatus Desulfococcus oleovorans Hxd3 CP000859.1 50 1432 1383
Candidatus Desulforudis audaxviator MP104C CP000860.1 103 1440 1338
Candidatus Hamiltonella defensa 5AT Acyrthosiphon pisum CP001277.1 1579736 1581100 1365
Candidatus Koribacter versatilis Ellin345 CP000360.1 33 1418 1386
Candidatus Liberibacter asiaticus psy62 CP001677.2 643376 644884 1509
Candidatus Pelagibacter ubique HTCC1062 CP000084.1 398915 400249 1335
Candidatus Phytoplasma australiense AM422018.1 1 1368 1368
Candidatus Phytoplasma mali CU469464.1 250608 251975 1368
Candidatus Ruthia magnifica Cm Calyptogena magnifica CP000488.1 39 1331 1293
Candidatus Vesicomyosocius okutanii HA AP009247.1 1 1293 1293
Capnocytophaga ochracea DSM 7271 CP001632.1 2323139 2324560 1422
Carboxydothermus hydrogenoformans Z-2901 CP000141.1 2399937 2401301 1365
Catenulispora acidiphila DSM 44928 CP001700.1 370 2082 1713
Caulobacter crescentus CP001340.1 5636 7108 1473
Caulobacter crescentus NA1000 CP001340.1 5636 7108 1473
Caulobacter K31 CP000927.1 389 1852 1464
Caulobacter segnis ATCC 21756 uid37277 CP002008.1 20 1495 1476
Cellulomonas flavigena DSM 20109 uid19707 CP001964.1 68 1672 1605
Cellvibrio japonicus Ueda107 CP000934.1 1 1605 1605

Chitinophaga pinensis DSM 2588 CP001699.1 126 1559 1434
Chlorobaculum parvum NCIB 8327 CP001099.1 167 1648 1482
Chlorobium chlorochromatii CaD3 CP000108.1 346 1824 1479
Chlorobium limicola DSM 245 CP001097.1 162 1637 1476
Chlorobium luteolum DSM 273 CP000096.1 2363373 2364842 1470
Chlorobium phaeobacteroides BS1 CP001101.1 1 1476 1476
Chlorobium phaeobacteroides DSM 266 CP000492.1 1 1473 1473
Chlorobium tepidum TLS AE006470.1 1398 2879 1482
Chloroflexus aggregans DSM 9485 CP001337.1 208 1644 1437
Chloroflexus aurantiacus J 10 fl CP000909.1 336 1775 1440
Chloroflexus Y 400 fl CP001364.1 294 1733 1440
Chloroherpeton thalassium ATCC 35110 CP001100.1 1 1491 1491
Chromobacterium violaceum AE016825.1 245 1648 1404
Chromohalobacter salexigens DSM 3043 CP000285.1 93 1580 1488
ATCC BAA-895 CP000822.1 46474 47808 1335
Citrobacter rodentium ICC168 uid34685 FN543502.1 4262871 4264277 1407
Clavibacter michiganensis NCPPB 382 AM711867.1 1 1434 1434
Clavibacter michiganensis sepedonicus AM849034.1 1 1431 1431
Clostridiales genomosp BVAB3 UPII9 5 uid42555 CP001850.1 96822 98138 1317
Clostridium acetobutylicum AE001437.1 467 1807 1341
Clostridium beijerinckii NCIMB 8052 CP000721.1 280 1629 1350
Clostridium botulinum A CP000962.1 1 1347 1347
Clostridium botulinum A ATCC 19397 CP000726.1 1 1347 1347
Clostridium botulinum A Hall CP000727.1 1 1347 1347
Clostridium botulinum A2 Kyoto CP001581.1 157 1494 1338
Clostridium botulinum A3 Loch Maree CP000962.1 14 1351 1338
Clostridium botulinum B Eklund 17B CP001056.1 175 1545 1371

Clostridium botulinum B1 Okra CP000939.1 38 1375 1338
Clostridium botulinum Ba4 657 CP001083.1 164 1510 1347
Clostridium botulinum E3 Alaska E43 CP001078.1 127 1497 1371
Clostridium botulinum F 230613 uid47575 CP002011.1 1 1338 1338
Clostridium botulinum F Langeland CP000728.1 1 1338 1338
Clostridium cellulolyticum H10 CP001348.1 27 1349 1323
Clostridium difficile 630 AM180355.1 1 1320 1320
Clostridium difficile R20291 FN545816.1 1 1320 1320
Clostridium kluyveri DSM 555 CP000673.1 1 1356 1356
Clostridium kluyveri NBRC 12016 AP009049.1 1 1356 1356
Clostridium novyi NT CP000382.1 2546147 2547493 1347
Clostridium perfringens CP000246.1 410 1783 1374
Clostridium perfringens ATCC 13124 CP000246.1 411 1784 1374
Clostridium perfringens SM101 uid12521 CP000312.1 236 1609 1374
Clostridium phytofermentans ISDg CP000885.1 75 1436 1362
Clostridium tetani E88 AE015927.1 51225 52253 1029
Clostridium thermocellum ATCC 27405 CP000568.1 2834293 2835624 1332
Colwellia psychrerythraea 34H CP000083.1 72 1457 1386
Comamonas testosteroni CNB 1 uid29203 CP001220.1 1 1407 1407
Conexibacter woesei DSM 14684 uid20745 CP001854.1 80 1441 1362
Coprothermobacter proteolyticus DSM 5265 CP001145.1 57624 59255 1632
Coraliomargarita akajimensis DSM 45221 uid33365 CP001998.1 490 1899 1410
aurimucosum ATCC 700975 CP001601.1 1 1641 1641
Corynebacterium diphtheriae BX248353.1 19 1677 1659
Corynebacterium efficiens YS-314 BA000035.2 1 1722 1722
Corynebacterium glutamicum ATCC 13032 Bielefeld BX927147.1 1 1575 1575
Corynebacterium glutamicum ATCC 13032 Kitasato BA000036.3 1 1575 1575

Corynebacterium glutamicum R AP009044.1 1 1575 1575
K411 CR931997.1 1 1752 1752
Corynebacterium kroppenstedtii DSM 44385 CP001620.1 1 1818 1818
Corynebacterium urealyticum DSM 7109 AM942444.1 1 1761 1761
CP001019.1 140 1495 1356
Coxiella burnetii CbuG Q212 CP001019.1 140 1495 1356
Coxiella burnetii CbuK Q154 CP001020.1 140 1495 1356
Coxiella burnetii Dugway 7E9-12 CP000733.1 140 1495 1356
Coxiella burnetii RSA 331 CP000890.1 140 1495 1356
Cronobacter turicensis uid39965 FN543093.1 34550 36010 1461
Cryptobacterium curtum DSM 15641 CP001682.1 130 1650 1521
Cupriavidus metallidurans CH34 uid250 CP000352.1 104 1843 1740
Cupriavidus taiwanensis CU633749.1 707 2476 1770
bacterium Yellowstone A-Prime CP000239.1 71 1474 1404
Cyanobacteria bacterium Yellowstone B-Prime CP000240.1 64 1467 1404
Cyanothece ATCC 51142 CP000806.1 3796973 3798340 1368
Cyanothece PCC 7424 CP001291.1 90 1451 1362
Cyanothece PCC 7425 CP001344.1 48 1430 1383
Cyanothece PCC 8801 CP001287.1 1254 2615 1362
Cyanothece PCC 8802 CP001701.1 98 1459 1362
Cytophaga hutchinsonii ATCC 33406 CP000383.1 3450532 3451947 1416
Dechloromonas aromatica RCB CP000089.1 151 1548 1398
Deferribacter desulfuricans SSM1 uid37285 AP011529.1 2230975 2232306 1332
Deferribacter desulfuricans SSM1 uid37285 AP011529.1 234637 236244 1608
Dehalococcoides BAV1 CP000688.1 245 1579 1335
Dehalococcoides CBDB1 AJ965256.1 261 1595 1335
Dehalococcoides ethenogenes 195 CP000027.1 261 1598 1338

Dehalococcoides GT uid36645 CP001924.1 245 1579 1335
Dehalococcoides VS uid18811 CP001827.1 259 1593 1335
Deinococcus deserti VCD115 CP001114.1 42 1421 1380
Deinococcus geothermalis DSM 11300 CP000359.1 222 1634 1413
Deinococcus radiodurans AE000513.1 1904 3268 1365
Delftia acidovorans SPH-1 CP000884.1 16 1422 1407
Denitrovibrio acetiphilus DSM 12809 uid29431 CP001968.1 36 1376 1341
Desulfatibacillum alkenivorans AK 1 CP001322.1 423 1787 1365
Desulfitobacterium hafniense DCB 2 CP001336.1 93 1442 1350
Desulfitobacterium hafniense Y51 AP008230.1 1 1350 1350
Desulfobacterium autotrophicum HRM2 CP001087.1 2112000 2113409 1410
Desulfotalea psychrophila LSv54 CR522870.1 301842 303305 1464
Desulfotomaculum acetoxidans DSM 771 CP001720.1 31 1371 1341
Desulfotomaculum reducens MI-1 CP000612.1 365 1690 1326
Desulfurivibrio alkaliphilus AHT2 uid33629 CP001940.1 505 1860 1356
Diaphorobacter TPSY uid29975 CP001392.1 70 1491 1422
Dichelobacter nodosus VCS1703A CP000513.1 130 1458 1329
Dickeya dadantii Ech586 uid33667 CP001836.1 270 1658 1389
Dickeya dadantii Ech703 CP001654.1 188 1576 1389
Dickeya zeae Ech1591 CP001655.1 35 1429 1395
Dictyoglomus thermophilum H 6 12 CP001146.1 1662563 1663888 1326
Dictyoglomus turgidum DSM 6724 CP001251.1 75 1406 1332
Dinoroseobacter shibae DFL 12 CP000830.1 3544704 3546086 1383
Dyadobacter fermentans DSM 18053 CP001619.1 2 1444 1443
Edwardsiella ictaluri 93 146 CP001600.1 60 1451 1392
Edwardsiella tarda EIB202 uid28539 CP001135.1 1 1323 1323
Eggerthella lenta DSM 2243 CP001726.1 1306 2868 1563

Ehrlichia canis Jake CP000107.1 413518 414876 1359
Arkansas CP000236.1 822485 823879 1395
Ehrlichia ruminantium Gardel CR925677.1 483666 485060 1395
Ehrlichia ruminantium str. Welgevonden CIRAD CR925678.1 486450 487844 1395
Ehrlichia ruminantium Welgevonden UPSA CR767821.1 506593 507987 1395
Elusimicrobium minutum Pei191 CP001055.1 556 1914 1359
Enterobacter 638 CP000653.1 518 1969 1452
ATCC 13047 uid45793 CP001918.1 82 1410 1329
Enterobacter sakazakii ATCC BAA-894 CP000783.1 3924054 3925370 1317
Enterococcus faecalis V583 AE016830.1 59 1402 1344
Erwinia amylovora ATCC 49946 uid43757 FN666575.1 3772963 3774354 1392
Erwinia amylovora CFBP1430 uid46805 FN434113.1 3772705 3774096 1392
Erwinia carotovora atroseptica SCRI1043 BX950851.1 4983699 4985096 1398
Erwinia pyrifoliae DSM 12163 uid37877 FN392235.1 3995572 3996999 1428
Erwinia pyrifoliae Ep1 96 uid34779 FP236842.1 3995609 3997000 1392
Erwinia tasmaniensis CU468135.1 3850468 3851859 1392
Erythrobacter litoralis HTCC2594 CP000157.1 2142392 2143858 1467
Escherichia coli 127 H6 E2348 69 FM180568.1 4179650 4181053 1404
Escherichia coli 42 uid40647 FN554766.1 4317269 4318672 1404
Escherichia coli 536 CP000247.1 4066823 4068157 1335
Escherichia coli 55989 CU928145.2 4257487 4258890 1404
Escherichia coli APEC O1 CP000468.1 4176691 4178106 1416
Escherichia coli B REL606 CP000819.1 3842687 3844090 1404
Escherichia coli BL21 DE3 CP001509.3 347 1750 1404
Escherichia coli BL21 DE3 uid20713 CP001509.3 3771122 3772525 1404
Escherichia coli BL21 DE3 uid28965 AM946981.1 3769200 3770603 1404
Escherichia coli BW2952 CP001396.1 3768682 3770085 1404

Escherichia coli C ATCC 8739 CP000946.1 48 1451 1404
Escherichia coli CFT073 AE014075.1 2875039 2875785 747
Escherichia coli DH1 uid30031 CP001637.1 33 1436 1404
Escherichia coli E24377A CP000800.1 4199088 4200491 1404
Escherichia coli ED1a CU928162.2 4319836 4321239 1404
Escherichia coli HS CP000802.1 3909497 3910900 1404
Escherichia coli IAI1 CU928160.2 3953613 3955016 1404
Escherichia coli IAI39 CU928164.2 4482664 4484067 1404
Escherichia coli IHE3034 uid43693 CP001969.1 4256793 4258196 1404
Escherichia coli K 12 substr DH10B CP000948.1 3977933 3979336 1404
Escherichia coli K 12 substr MG1655 U00096.2 3880349 3881752 1404
Escherichia coli K 12 substr W3110 AP009048.1 3756686 3758089 1404
Escherichia coli O103 H2 12009 uid32511 AP010958.1 4580256 4581659 1404
Escherichia coli O111 H 11128 uid32513 AP010960.1 4551702 4553105 1404
Escherichia coli O157 H7 EC4115 CP001164.1 4771663 4773066 1404
Escherichia coli O157 H7 TW14359 CP001368.1 4727706 4729109 1404
Escherichia coli O157H7 AE005174.2 4668559 4669962 1404
Escherichia coli O157H7 EDL933 AE005174.2 4737542 4738945 1404
Escherichia coli O26 H11 11368 uid32509 AP010953.1 4904905 4906308 1404
Escherichia coli O55 H7 CB9615 uid42729 CP001846.1 4556053 4557456 1404
Escherichia coli S88 CU928161.2 4091404 4092807 1404
Escherichia coli SE11 AP009240.1 4131342 4132745 1404
Escherichia coli SE15 uid19053 AP009378.1 3840836 3842239 1404
Escherichia coli SMS 3 5 CP000970.1 4158994 4160397 1404
Escherichia coli UMN026 CU928163.2 4366052 4367455 1404
Escherichia coli UTI89 CP000243.1 4143894 4145297 1404
Escherichia fergusonii ATCC 35469 CU928158.2 4102156 4103568 1413

Eubacterium eligens ATCC 27750 CP001104.1 1 1356 1356
Eubacterium rectale ATCC 33656 CP001107.1 1 1362 1362
Exiguobacterium AT1b CP001615.1 1699111 1700490 1380
Exiguobacterium sibiricum 255 15 CP001022.1 427 1821 1395
Fervidobacterium nodosum Rt17-B1 CP000771.1 43 1377 1335
Finegoldia magna ATCC 29328 AP008971.1 365 1822 1458
bacterium 3519 10 CP001673.1 543627 545093 1467
Flavobacterium johnsoniae UW101 CP000685.1 3174 4601 1428
Flavobacterium psychrophilum JIP02 86 AM398681.1 1399584 1401020 1437
Francisella philomiragia ATCC 25017 CP000937.1 896082 897557 1476
FSC 198 AM286280.1 1 1521 1521
Francisella tularensis holarctica CP000803.1 1 1476 1476
Francisella tularensis holarctica CP000803.1 275 1750 1476
Francisella tularensis holarctica FTNF002 0 CP000803.1 1 1476 1476
Francisella tularensis holarctica OSU18 CP000437.1 1 1476 1476
Francisella tularensis mediasiatica FSC147 CP000915.1 130 1521 1392
Francisella tularensis NE061598 uid38289 CP001633.1 1 1521 1521
Francisella tularensis novicida U112 CP000439.1 145 1620 1476
Francisella tularensis tularensis AJ749949.2 1 1521 1521
Francisella tularensis WY96-3418 CP000608.1 46 1521 1476
Frankia alni ACN14a CT573213.2 424 2028 1605
Frankia CcI3 CP000249.1 35 1723 1689
Frankia EAN1pec CP000820.1 71 1654 1584
Fusobacterium nucleatum AE009951.2 639955 641868 1914
Gardnerella vaginalis 409 5 uid31001 CP001849.1 53866 55485 1620
Gemmatimonas aurantiaca T 27 AP009153.1 101 1519 1419
Geobacillus C56 T3 uid41701 CP002050.1 219 1571 1353

Geobacillus kaustophilus HTA426 BA000043.1 88 1440 1353
Geobacillus thermodenitrificans NG80-2 CP000557.1 209 1561 1353
Geobacillus WCH70 CP001638.1 27 1379 1353
Geobacillus Y412MC10 uid27777 CP001793.1 166 1512 1347
Geobacillus Y412MC61 uid30537 CP001794.1 123 1475 1353
Geobacter bemidjiensis Bem CP001124.1 166 1542 1377
Geobacter FRC 32 CP001390.1 185 1522 1338
Geobacter lovleyi SZ CP001089.1 261 1640 1380
Geobacter M21 CP001661.1 119 1501 1383
Geobacter metallireducens GS-15 CP000148.1 261 1613 1353
Geobacter sulfurreducens AE017180.1 30 1367 1338
Geobacter uraniumreducens Rf4 CP000698.1 46 1395 1350
Geodermatophilus obscurus DSM 43160 uid29547 CP001867.1 314 2068 1755
Gloeobacter violaceus BA000045.2 1583759 1585087 1329
Gluconacetobacter diazotrophicus PAl 5 FAPERJ AM889285.1 1798138 1799601 1464
Gluconacetobacter diazotrophicus PAl 5 JGI CP001189.1 5 1438 1434
Gluconobacter oxydans 621H CP000009.1 155 1594 1440
Gordonia bronchialis DSM 43247 uid29549 CP001802.1 333 1862 1530
Gramella forsetii KT0803 CU207366.1 699578 700087 510
ducreyi 35000HP AE017143.1 678427 679773 1347
CP000671.1 1055909 1057273 1365
Haemophilus influenzae PittEE CP000671.1 1389072 1390436 1365
Haemophilus influenzae PittGG CP000672.1 1584551 1585915 1365
Haemophilus parasuis SH0165 CP001321.1 1478942 1480276 1335
Haemophilus somnus 129PT CP000436.1 122211 123569 1359
Haemophilus somnus 2336 CP000947.1 279 1646 1368
Hahella chejuensis KCTC 2396 CP000155.1 378 1781 1404

Haliangium ochraceum DSM 14365 uid28711 CP001804.1 109 1605 1497
Halorhodospira halophila SL1 CP000544.1 1333575 1334924 1350
Halothermothrix orenii H 168 CP001098.1 593 1984 1392
Halothiobacillus neapolitanus c2 uid31049 CP001801.1 150 1559 1410
Helicobacter acinonychis Sheeba AM260522.1 1 1350 1350
Helicobacter hepaticus AE017125.1 1080619 1081989 1371
Helicobacter mustelae 12198 uid40677 FN555004.1 1 1290 1290
26695 AE000511.1 1607624 1608997 1374
Helicobacter pylori 51 uid9627 CP000012.1 1506576 1507943 1368
Helicobacter pylori B38 uid39685 FM991728.1 1488636 1490003 1368
Helicobacter pylori G27 CP001173.1 1571935 1573302 1368
Helicobacter pylori HPAG1 CP000241.1 1435690 1437057 1368
Helicobacter pylori J99 AE001439.1 1557789 1559162 1374
Helicobacter pylori P12 CP001217.1 1592687 1594054 1368
Helicobacter pylori Shi470 CP001072.2 1527250 1528617 1368
Helicobacter pylori uid39507 CP001680.1 1485277 1486644 1368
Heliobacterium modesticaldum Ice1 CP000930.2 1088217 1089548 1332
Herminiimonas arsenicoxydans CU207211.1 278 1660 1383
Herpetosiphon aurantiacus ATCC 23779 CP000875.1 77 1495 1419
Hirschia baltica ATCC 49814 CP001678.1 3337218 3338720 1503
Hydrogenobacter thermophilus TK 6 uid34131 AP011112.1 100 1299 1200
Hydrogenobaculum Y04AAS1 CP001130.1 265 1599 1335
Hyphomonas neptunium ATCC 15444 CP000158.1 561363 562736 1374
Idiomarina loihiensis L2TR AE017340.1 362 1726 1365
Jannaschia CCS1 CP000264.1 48 1475 1428
Janthinobacterium Marseille CP000269.1 1 1383 1383
Jonesia denitrificans DSM 20603 CP001706.1 159 1589 1431

Kangiella koreensis DSM 16069 CP001707.1 363 1748 1386
Kineococcus radiotolerans SRS30216 CP000750.2 2318319 2319875 1557
342 CP000964.1 19 1422 1404
Klebsiella pneumoniae MGH 78578 CP000647.1 4495248 4496582 1335
Klebsiella pneumoniae NTUH K2044 AP006725.1 5212742 5214145 1404
Klebsiella variicola At 22 uid37701 CP001891.1 3051 4454 1404
Kocuria rhizophila DC2201 AP009152.1 1 1695 1695
Kosmotoga olearia TBF 19 5 1 CP001634.1 103 1446 1344
Kribbella flavida DSM 17836 uid21089 CP001736.1 342 2156 1815
Kytococcus sedentarius DSM 20547 CP001686.1 209 1729 1521
acidophilus NCFM CP000033.3 31 1398 1368
Lactobacillus brevis ATCC 367 CP000416.1 113 1471 1359
Lactobacillus casei CP000423.1 81 1430 1350
Lactobacillus casei ATCC 334 CP000423.1 81 1430 1350
Lactobacillus crispatus ST1 uid46813 FN692037.1 1 1368 1368
Lactobacillus delbrueckii bulgaricus CR954253.1 323 1687 1365
Lactobacillus delbrueckii bulgaricus ATCC BAA-365 CP000412.1 104 1468 1365
Lactobacillus fermentum IFO 3956 AP008937.1 1 1317 1317
Lactobacillus gasseri ATCC 33323 CP000413.1 102 1496 1395
Lactobacillus helveticus DPC 4571 CP000517.1 187 1554 1368
Lactobacillus johnsonii FI9785 uid36575 FN298497.1 1 1365 1365
Lactobacillus johnsonii NCC 533 AE017198.1 1 1365 1365
Lactobacillus plantarum CP001617.1 1 1368 1368
Lactobacillus plantarum JDM1 CP001617.1 1 1368 1368
Lactobacillus reuteri DSM 20016 CP000705.1 381 1703 1323
Lactobacillus reuteri F275 Kitasato AP007281.1 1 1323 1323
Lactobacillus rhamnosus GG AP011548.1 1 1350 1350

Lactobacillus rhamnosus GG uid40637 AP011548.1 1 1350 1350
Lactobacillus rhamnosus Lc 705 FM179323.1 1 1350 1350
Lactobacillus sakei 23K CR936503.1 210 1556 1347
Lactobacillus salivarius UCC118 CP000233.1 1 1365 1365
Lactococcus lactis AM406671.1 358 1725 1368
Lactococcus lactis cremoris MG1363 AM406671.1 1 1365 1365
Lactococcus lactis cremoris SK11 CP000425.1 144 1508 1365
Lactococcus lactis KF147 uid41115 CP001834.1 575 1942 1368
Laribacter hongkongensis HLHK9 CP001154.1 3146262 3147707 1446
NSW150 uid39579 FN650140.1 203 1579 1377
2300 99 Alcoy uid18743 CP001828.1 204 1562 1359
Legionella pneumophila Corby CP000675.2 655 2013 1359
Legionella pneumophila Lens CR628337.1 204 1562 1359
Legionella pneumophila Paris CR628336.1 204 1562 1359
Legionella pneumophila Philadelphia 1 AE017354.1 654 2012 1359
Leifsonia xyli xyli CTCB0 AE016822.1 225 1646 1422
Leptospira biflexa serovar Patoc Patoc 1 Ames CP000777.1 4188 5513 1326
Leptospira biflexa serovar Patoc Patoc 1 Paris CP000786.1 141 1466 1326
Leptospira borgpetersenii serovar Hardjo-bovis JB197 CP000350.1 4254 5567 1314
Leptospira borgpetersenii serovar Hardjo-bovis L550 CP000348.1 4254 5567 1314
Leptospira interrogans serovar Copenhageni AE016823.1 292 1623 1332
Leptospira interrogans serovar Lai AE010300.2 234 1565 1332
Leptothrix cholodnii SP 6 CP001013.1 72 1604 1533
Leptotrichia buccalis DSM 1135 CP001685.1 687 2042 1356
Leuconostoc citreum KM20 DQ489736.1 347 1693 1347
Leuconostoc kimchii IMSNU11154 uid40837 CP001758.1 593324 594670 1347
Leuconostoc mesenteroides ATCC 8293 CP000414.1 116 1462 1347

Listeria innocua AL592022.1 319 1674 1356
Listeria monocytogenes AL591824.1 305 1673 1369
Listeria monocytogenes 4b F2365 AE017262.2 319 1674 1356
Listeria monocytogenes 8 5923 uid36363 CP001604.1 2997382 2998737 1356
Listeria monocytogenes Clip81459 FM242711.1 319 1674 1356
Listeria monocytogenes HCC23 CP001175.1 2652349 2653704 1356
Listeria monocytogenes uid36361 CP001602.1 3030616 3031971 1356
Listeria seeligeri serovar 1 2b SLCC3954 uid41123 FN557490.1 319 1674 1356
Listeria welshimeri serovar 6b SLCC5334 AM263198.1 318 1673 1356
Lysinibacillus sphaericus C3 41a CP000817.1 4639741 1198 -5E+06
Macrococcus caseolyticus JCSC5402 AP009484.1 348 1685 1338
Magnetococcus MC-1 CP000471.1 181 1524 1344
Magnetospirillum magneticum AMB-1 AP007255.1 683815 685248 1434
Mannheimia succiniciproducens MBEL55E AE016827.1 447961 449334 1374
Maricaulis maris MCS10 CP000449.1 14 1462 1449
Marinobacter aquaeolei VT8 CP000514.1 466 1929 1464
Marinomonas MWYL1 CP000749.1 262 1812 1551
Meiothermus ruber DSM 1279 uid28827 CP001743.1 37 1356 1320
Mesoplasma florum L1 AE017263.1 1 1332 1332
Mesorhizobium BNC1 CP000390.1 85 1524 1440
Mesorhizobium loti BA000012.4 4477398 4478942 1545
Methylibium petroleiphilum PM1 CP000555.1 186 1589 1404
Methylobacillus flagellatus KT CP000284.1 46 1458 1413
Methylobacterium 4 46 CP000943.1 176 1660 1485
Methylobacterium chloromethanicum CM4 CP001298.1 1438 2943 1506
Methylobacterium extorquens AM1 CP001510.1 712 2217 1506
Methylobacterium extorquens DM4 FP103042.2 710 2215 1506

Methylobacterium extorquens PA1 CP000908.1 44 1549 1506
Methylobacterium nodulans ORS 2060 CP001349.1 1 1500 1500
Methylobacterium populi BJ001 CP001029.1 1161 2666 1506
Methylobacterium radiotolerans JCM 2831 CP001001.1 769 2265 1497
Methylocella silvestris BL2 CP001280.1 362 1882 1521
Methylococcus capsulatus Bath AE017282.2 3219779 3221107 1329
Methylotenera 301 uid39983 CP002056.1 94 1509 1416
Methylotenera mobilis JLW8 CP001672.1 202 1620 1419
Methylovorus SIP3 4 CP001674.1 138 1547 1410
Micrococcus luteus NCTC 2665 CP001628.1 256 1803 1548
Microcystis aeruginosa NIES 843 AP009552.1 2048563 2049897 1335
Moorella thermoacetica ATCC 39073 CP000232.1 450 1778 1329
RH4 uid46869 CP002005.1 993 2396 1404
ATCC 19977 CU458896.1 1 1476 1476
Mycobacterium avium 104 CP000479.1 33 1529 1497
Mycobacterium avium AE016958.1 1 1530 1530
BX248333.1 1 1524 1524
Mycobacterium bovis BCG Tokyo 172 AP010918.1 1 1524 1524
Mycobacterium gilvum PYR-GCK CP000656.1 858447 859925 1479
Mycobacterium JLS CP000580.1 252 1739 1488
Mycobacterium KMS CP000518.1 5935 7422 1488
AL450380.1 1 1566 1566
Mycobacterium leprae Br4923 FM211192.1 1 1566 1566
M CP000854.1 1 1533 1533
Mycobacterium MCS CP000384.1 20 1507 1488
Mycobacterium smegmatis MC2 155 CP000480.1 6986600 6988114 1515
Mycobacterium CDC1551 AE000516.2 1 1524 1524

Mycobacterium tuberculosis F11 CP000717.1 102 1625 1524
Mycobacterium tuberculosis H37Ra CP000611.1 1 1524 1524
Mycobacterium tuberculosis H37Rv AL123456.2 1 1524 1524
Mycobacterium tuberculosis KZN 1435 CP001658.1 103 1626 1524
Agy99 CP000325.1 1 1533 1533
Mycobacterium vanbaalenii PYR-1 CP000511.1 249 1733 1485
Mycoplasma agalactiae PG2 CU179680.1 1 1401 1401
Mycoplasma arthritidis 158L3 1 CP001047.1 107 1471 1365
Mycoplasma capricolum ATCC 27343 CP000123.1 1 1353 1353
Mycoplasma conjunctivae HRC 581 uid32285 FM864216.2 101 1513 1413
Mycoplasma crocodyli MP145 uid29021 CP001991.1 1 1305 1305
Mycoplasma gallisepticum CP001873.1 3163 4548 1386
Mycoplasma gallisepticum F uid43301 CP001873.1 3159 4544 1386
Mycoplasma gallisepticum R high uid43299 CP001872.1 3163 4548 1386
Mycoplasma genitalium L43967.2 577268 578581 1314
Mycoplasma hyopneumoniae 232 AE017332.1 1 1392 1392
Mycoplasma hyopneumoniae 7448 AE017244.1 207 1598 1392
Mycoplasma hyopneumoniae J AE017243.1 207 1598 1392
Mycoplasma mobile 163K AE017308.1 215 1603 1389
Mycoplasma mycoides BX293980.2 1 1440 1440
Mycoplasma mycoides capri GM12 uid39245 CP001668.1 1 1353 1353
Mycoplasma penetrans BA000026.2 1 1359 1359
Mycoplasma pneumoniae U00089.2 813468 814787 1320
Mycoplasma pulmonis AL445566.1 222 1607 1386
Mycoplasma synoviae 53 AE017245.1 57 1427 1371
Myxococcus xanthus DK 1622 CP000113.1 87 1439 1353
Nakamurella multipartita DSM 44233 CP001737.1 176 1849 1674

Natranaerobius thermophilus JW NM WN LF CP001034.1 146 1507 1362
Nautilia profundicola AmH CP001279.1 101 1411 1311
FA 1090 AE004969.1 159 1715 1557
Neisseria gonorrhoeae NCCP11945 CP001050.1 206 1714 1509
53442 CP000381.1 321373 322779 1407
Neisseria meningitidis 8013 uid34687 FM999788.1 2010063 2011619 1557
Neisseria meningitidis alpha14 AM889136.1 281876 283432 1557
Neisseria meningitidis FAM18 AM421808.1 321483 323039 1557
Neisseria meningitidis MC58 AE002098.2 2004389 2005945 1557
Neisseria meningitidis Z2491 AL157959.1 527602 529158 1557
Neorickettsia risticii Illinois CP001431.1 231140 232936 1797
Neorickettsia sennetsu Miyayama CP000237.1 220315 221739 1425
Nitratiruptor SB155-2 AP009178.1 139 1470 1332
Nitrobacter hamburgensis X14 CP000319.1 107 1534 1428
Nitrobacter winogradskyi Nb-255 CP000115.1 567 1994 1428
Nitrosococcus halophilus Nc4 uid36589 CP001798.1 244 1596 1353
Nitrosococcus oceani ATCC 19707 CP000127.1 175 1524 1350
Nitrosomonas europaea AL954747.1 211 1590 1380
Nitrosomonas eutropha C71 CP000450.1 35 1414 1380
Nitrosospira multiformis ATCC 25196 CP000103.1 69 1499 1431
Nocardia farcinica IFM10152 AP006618.1 1 1989 1989
Nocardioides JS614 CP000509.1 72 1658 1587
Nostoc punctiforme PCC 73102 CP001037.1 167 1546 1380
Nostoc sp BA000019.2 2403018 2404397 1380
Novosphingobium aromaticivorans DSM 12444 CP000248.1 24 1520 1497
Oceanobacillus iheyensis BA000028.3 300 1643 1344
Ochrobactrum anthropi ATCC 49188 CP000758.1 1439 3001 1563

Oenococcus oeni PSU-1 CP000411.1 1 1353 1353
Oligotropha carboxidovorans OM5 CP001196.1 527180 528613 1434
Opitutus terrae PB90 1 CP001032.1 20 1423 1404
Boryong AM494475.1 22 1431 1410
Orientia tsutsugamushi Ikeda AP008981.1 1601042 1602454 1413
Paenibacillus JDR 2 CP001656.1 27 1379 1353
Pantoea ananatis LMG 20103 uid43085 CP001875.1 95034 96257 1224
Parabacteroides distasonis ATCC 8503 CP000140.1 1 1398 1398
Paracoccus denitrificans PD1222 CP000490.1 405 1751 1347
Parvibaculum lavamentivorans DS-1 CP000774.1 1568 3127 1560
AE004439.1 1357003 1358358 1356
Pectobacterium carotovorum PC1 CP001657.1 72 1469 1398
Pectobacterium wasabiae WPP163 uid31293 CP001790.1 53 1450 1398
Pediococcus pentosaceus ATCC 25745 CP000422.1 50 1390 1341
Pelobacter carbinolicus CP000142.2 72 1430 1359
Pelobacter propionicus DSM 2379 CP000482.1 1 1353 1353
Pelodictyon phaeoclathratiforme BU 1 CP001110.1 1 1464 1464
Pelotomaculum thermopropionicum SI AP009389.1 1 1344 1344
Persephonella marina EX H1 CP001230.1 867511 868824 1314
Petrotoga mobilis SJ95 CP000879.1 53 1417 1365
Phenylobacterium zucineum HLK1 CP000747.1 3882661 3884115 1455
Photobacterium profundum SS9 CR354531.1 7364 8788 1425
Photorhabdus asymbiotica FM162591.1 133 1452 1320
Photorhabdus luminescens BX470251.1 234 1622 1389
Planctomyces limnophilus DSM 3776 uid29411 CP001744.1 614 3166 2553
Polaromonas JS666 CP000316.1 40 1434 1395
Polaromonas naphthalenivorans CJ2 CP000529.1 4402850 4404247 1398

Polynucleobacter necessarius asymbioticus QLW P1DMWA 1 CP000655.1 33 1460 1428
Polynucleobacter necessarius STIR1 CP001010.1 113 1537 1425
Porphyromonas gingivalis ATCC 33277 AP009380.1 1 1422 1422
Porphyromonas gingivalis W83 AE015924.1 1 1422 1422
Prevotella ruminicola 23 uid10619 CP002006.1 1529107 1530528 1422
Prochlorococcus marinus AS9601 CP000551.1 541765 543159 1395
Prochlorococcus marinus CCMP1375 AE017126.1 537957 539342 1386
Prochlorococcus marinus MED4 BX548174.1 531716 533107 1392
Prochlorococcus marinus MIT 9211 CP000878.1 530649 532013 1365
Prochlorococcus marinus MIT 9215 CP000825.1 567695 569089 1395
Prochlorococcus marinus MIT 9301 CP000576.1 516187 517581 1395
Prochlorococcus marinus MIT 9303 CP000554.1 755354 756745 1392
Prochlorococcus marinus MIT 9312 CP000111.1 526327 527721 1395
Prochlorococcus marinus MIT 9515 CP000552.1 562236 563627 1392
Prochlorococcus marinus MIT9313 BX548175.1 1277265 1278662 1398
Prochlorococcus marinus NATL1A CP000553.1 562863 564257 1395
Prochlorococcus marinus NATL2A CP000095.2 550495 551895 1401
Propionibacterium acnes KPA171202 AE017283.1 245 1747 1503
Propionibacterium acnes SK137 uid31005 CP001977.1 241 1743 1503
Prosthecochloris aestuarii DSM 271 CP001108.1 1 1479 1479
Prosthecochloris vibrioformis DSM 265 CP000607.1 1 1464 1464
AM942759.1 3445820 3447220 1401
Pseudoalteromonas atlantica T6c CP000388.1 96 1520 1425
Pseudoalteromonas haloplanktis TAC125 CR954246.1 118 1542 1425
FM209186.1 483 2027 1545
Pseudomonas aeruginosa LESB58 FM209186.1 483 2027 1545
Pseudomonas aeruginosa PA7 CP000744.1 488 2026 1539


Pseudomonas aeruginosa UCBPP-PA14 CP000438.1 483 2027 1545
Pseudomonas entomophila L48 CT573326.1 552 2078 1527
Pseudomonas fluorescens Pf0 1 CP000094.2 565 2088 1524
Pseudomonas fluorescens Pf-5 CP000076.1 101 1642 1542
Pseudomonas fluorescens SBW25 AM181176.4 1 1506 1506
Pseudomonas mendocina ymp CP000680.1 82 1566 1485
Pseudomonas putida F1 CP000712.1 385 1902 1518
Pseudomonas putida GB 1 CP000926.1 1152 2684 1533
Pseudomonas putida KT2440 AE015451.1 9542 11062 1521
Pseudomonas putida W619 CP000949.1 386 1921 1536
Pseudomonas stutzeri A1501 CP000304.1 380 1924 1545
Pseudomonas syringae phaseolicola 1448A CP000058.1 238 1773 1536
Pseudomonas syringae pv B728a CP000075.1 1 1536 1536
Pseudomonas syringae tomato DC3000 AE016853.1 339 1874 1536
Psychrobacter arcticum 273-4 CP000082.1 520 1965 1446
Psychrobacter cryohalolentis K5 CP000323.1 1365 2810 1446
Psychrobacter PRwf-1 CP000713.1 1311 2723 1413
Psychromonas ingrahamii 37 CP000510.1 4524210 4525586 1377
Ralstonia eutropha H16 AM260479.1 689 2455 1767
Ralstonia eutropha JMP134 CP000090.1 184 2019 1836
Ralstonia pickettii 12D CP001644.1 613 2202 1590
Ralstonia pickettii 12J CP001068.1 328 1917 1590
Ralstonia solanacearum AL646052.1 3714708 3716276 1569
Renibacterium salmoninarum ATCC 33209 CP000910.1 144 1547 1404
Rhizobium etli CFN 42 CP000133.1 370494 372044 1551
Rhizobium etli CIAT 652 CP001074.1 406075 407625 1551
Rhizobium leguminosarum bv trifolii WSM1325 CP001622.1 55 1605 1551


Rhizobium leguminosarum bv trifolii WSM2304 CP001191.1 620 2170 1551
Rhizobium leguminosarum bv viciae 3841 AM236080.1 410392 411840 1449
Rhizobium NGR234 CP001389.1 202 1644 1443
Rhodobacter capsulatus SB1003 uid55 CP001312.1 351 1724 1374
Rhodobacter sphaeroides 2 4 1 CP000143.1 3110775 3112142 1368
Rhodobacter sphaeroides ATCC 17025 CP000661.1 842 2227 1386
Rhodobacter sphaeroides ATCC 17029 CP000577.1 9930 11297 1368
Rhodobacter sphaeroides KD131 CP001150.1 2784388 2785731 1344
Rhodococcus erythropolis PR4 AP008957.1 768 2330 1563
Rhodococcus jostii RHA1 CP000431.1 3872441 3874027 1587
Rhodococcus opacus B4 uid34839 AP011115.1 3776104 3777690 1587
Rhodoferax ferrireducens T118 CP000267.1 7 1419 1413
Rhodopseudomonas palustris BisA53 CP000463.1 641 2065 1425
Rhodopseudomonas palustris BisB18 CP000301.1 758 2179 1422
Rhodopseudomonas palustris BisB5 CP000283.1 649 2067 1419
Rhodopseudomonas palustris CGA009 BX571963.1 679 2097 1419
Rhodopseudomonas palustris HaA2 CP000250.1 676 2094 1419
Rhodopseudomonas palustris TIE 1 CP001096.1 220 1638 1419
Rhodospirillum centenum SW CP000613.2 3127693 3129216 1524
Rhodospirillum rubrum ATCC 11170 CP000230.1 228 1757 1530
Rhodothermus marinus DSM 4252 uid29281 CP001807.1 165 1676 1512
africae ESF 5 CP001612.1 871344 872735 1392
Hartford CP000847.1 846716 848107 1392
Rickettsia bellii OSU 85-389 CP000849.1 498632 500023 1392
Rickettsia bellii RML369-C CP000087.1 932808 934199 1392
Rickettsia canadensis McKiel CP000409.1 404050 405441 1392
AE006914.1 862914 864305 1392


Rickettsia felis URRWXCal2 CP000053.1 382683 384074 1392
Rickettsia massiliae MTU5 CP000683.1 945604 946995 1392
Rickettsia peacockii Rustic CP001227.1 694602 695993 1392
CP001584.1 756866 758257 1392
Rickettsia prowazekii Rp22 uid19813 CP001584.1 756779 758170 1392
Iowa CP000766.1 866869 868260 1392
Rickettsia rickettsii Sheila Smith CP000848.1 855505 856896 1392
wilmington AE017197.1 762664 764055 1392
Robiginitalea biformata HTCC2501 CP001712.1 2781545 2782969 1425
Roseiflexus castenholzii DSM 13941 CP000804.1 370 1812 1443
Roseiflexus RS-1 CP000686.1 415 1860 1446
Roseobacter denitrificans OCh 114 CP000362.1 204958 206322 1365
Rothia mucilaginosa uid38547 AP011540.1 1 1695 1695
Rubrobacter xylanophilus DSM 9941 CP000386.1 2 1354 1353
Ruegeria pomeroyi DSS 3 CP000031.1 164331 165737 1407
Saccharomonospora viridis DSM 43017 CP001683.1 99 1892 1794
Saccharophagus degradans 14642 CP000282.1 383 1957 1575
Saccharopolyspora erythraea NRRL 2338 AM420293.1 1 1779 1779
Salinispora arenicola CNS-205 CP000850.1 148 1923 1776
Salinispora tropica CNB-440 CP000667.1 625 2388 1764
arizonae serovar 62 z4 z23 CP000880.1 3740840 3742249 1410
Salmonella enterica Choleraesuis AE017220.1 3982617 3984017 1401
Salmonella enterica Paratypi ATCC 9150 CP000026.1 3825575 3826975 1401
Salmonella enterica serovar Agona SL483 CP001138.1 3964318 3965718 1401
Salmonella enterica serovar Dublin CT 2021853 CP001144.1 4076670 4078070 1401
Salmonella enterica serovar Enteritidis P125109 AM933172.1 3918280 3919680 1401
Salmonella enterica serovar Gallinarum 287 91 AM933173.1 3779210 3780610 1401


Salmonella enterica serovar Heidelberg SL476 CP001120.1 4057833 4059233 1401
Salmonella enterica serovar Newport SL254 CP001113.1 4014207 4015607 1401
Salmonella enterica serovar Paratyphi A AKU 12601 FM200053.1 3821254 3822654 1401
Salmonella enterica serovar Paratyphi B SPB7 CP000886.1 3992516 3993916 1401
Salmonella enterica serovar Paratyphi C RKS4594 CP000857.1 3979370 3980782 1413
Salmonella enterica serovar Schwarzengrund CVM19633 CP001127.1 3923952 3925352 1401
Salmonella enterica serovar Typhi Ty2 AE014613.1 3790618 3792018 1401
Salmonella enterica serovar Typhimurium 14028S uid33067 CP001363.1 4057317 4058717 1401
Salmonella enterica serovar Typhimurium uid40625 FN424405.1 4066500 4067900 1401
Salmonella typhi AE006468.1 3805127 3806527 1401
Salmonella typhimurium LT2 AE006468.1 4043624 4045024 1401
Sanguibacter keddieii DSM 10542 uid19711 CP001819.1 137 1609 1473
Sebaldella termitidis ATCC 33386 uid29539 CP001739.1 361 1719 1359
Segniliparus rotundus DSM 44985 uid37711 CP001958.1 130 1593 1464
Serratia proteamaculans 568 CP000826.1 30606 31997 1392
Shewanella amazonensis SB2B CP000507.1 10775 12148 1374
Shewanella ANA-3 CP000469.1 9651 11033 1383
Shewanella baltica OS155 CP000563.1 41 1429 1389
Shewanella baltica OS185 CP000753.1 382 1770 1389
Shewanella baltica OS195 CP000891.1 382 1770 1389
Shewanella baltica OS223 CP001252.1 77 1465 1389
Shewanella denitrificans OS217 CP000302.1 10 1398 1389
Shewanella frigidimarina NCIMB 400 CP000447.1 23 1408 1386
Shewanella halifaxensis HAW EB4 CP000931.1 81 1466 1386
Shewanella loihica PV-4 CP000606.1 111 1490 1380
Shewanella MR-4 CP000446.1 79 1461 1383
Shewanella MR-7 CP000444.1 81 1463 1383


Shewanella oneidensis AE014299.1 6873 8255 1383
Shewanella pealeana ATCC 700345 CP000851.1 223 1608 1386
Shewanella piezotolerans WP3 CP000472.1 16217 17602 1386
Shewanella putrefaciens CN-32 CP000681.1 147 1532 1386
Shewanella sediminis HAW-EB3 CP000821.1 4812 6200 1389
Shewanella violacea DSS12 uid34739 AP011177.1 30205 31593 1389
Shewanella W3-18-1 CP000503.1 150 1535 1386
Shewanella woodyi ATCC 51908 CP000961.1 397 1785 1389
CDC 3083 94 CP001063.1 3946927 3948330 1404
Shigella boydii Sb227 CP000036.1 3689193 3690596 1404
CP000034.1 3918205 3919608 1404
2002017 uid33639 CP001383.1 3893394 3894797 1404
Shigella flexneri 2a AE014073.1 3869099 3870325 1227
Shigella flexneri 2a 2457T AE014073.1 3904162 3905565 1404
Shigella flexneri 5 8401 CP000266.1 3903225 3904556 1332
Ss046 CP000038.1 3823237 3824640 1404
Sideroxydans lithotrophicus ES 1 uid33161 CP001965.1 72 1415 1344
Silicibacter TM1040 CP000377.1 45 1457 1413
Sinorhizobium medicae WSM419 CP000738.1 1405 2850 1446
Sinorhizobium meliloti AL591688.1 399330 400853 1524
Slackia heliotrinireducens DSM 20476 CP001684.1 42 1547 1506
Sodalis glossinidius morsitans AP008232.1 1 1395 1395
Sorangium cellulosum So ce 56 AM746676.1 1 1416 1416
Sphaerobacter thermophilus DSM 20745 uid21087 CP001823.1 47 1462 1416
Sphingobium japonicum UT26S uid19949 AP010803.1 835602 837011 1410
Sphingomonas wittichii RW1 CP000699.1 245 1678 1434
Sphingopyxis alaskensis RB2256 CP000356.1 173 1534 1362


Spirosoma linguale DSM 74 uid28817 CP001769.1 60 1469 1410
Stackebrandtia nassauensis DSM 44728 uid19713 CP001778.1 64 1764 1701
aureus 4 2981 uid34809 CP001844.1 517 1878 1362
aureus MRSA252 BX571856.1 517 1878 1362
Staphylococcus aureus aureus MSSA476 BX571857.1 517 1878 1362
Staphylococcus aureus COL CP000046.1 544 1905 1362
Staphylococcus aureus ED98 uid39547 CP001781.1 517 1878 1362
Staphylococcus aureus JH1 CP000736.1 641 2002 1362
Staphylococcus aureus JH9 CP000703.1 572 1933 1362
Staphylococcus aureus Mu3 AP009324.1 517 1878 1362
Staphylococcus aureus Mu50 BA000017.4 517 1878 1362
Staphylococcus aureus MW2 BA000033.2 517 1878 1362
Staphylococcus aureus N315 BA000018.3 517 1878 1362
Staphylococcus aureus NCTC 8325 CP000253.1 517 1878 1362
Staphylococcus aureus Newman AP009351.1 517 1878 1362
Staphylococcus aureus RF122 AJ938182.1 517 1878 1362
Staphylococcus aureus ST398 uid29427 AM990992.1 517 1878 1362
Staphylococcus aureus TW20 uid36647 FN433596.1 517 1878 1362
Staphylococcus aureus USA300 FPR3757 CP000255.1 544 1905 1362
Staphylococcus aureus USA300 TCH1516 CP000730.1 544 1905 1362
Staphylococcus carnosus TM300 AM295250.1 2564508 2565869 1362
Staphylococcus epidermidis ATCC 12228 AE015929.1 362 1717 1356
Staphylococcus epidermidis RP62A CP000029.1 2614976 2616331 1356
Staphylococcus haemolyticus AP006716.1 508 1863 1356
Staphylococcus lugdunensis HKU09 1 uid42395 CP001837.1 2625430 2626797 1368
Staphylococcus saprophyticus AP008934.1 509 1876 1368
Stenotrophomonas maltophilia K279a AM743169.1 1 1332 1332


Stenotrophomonas maltophilia R551 3 CP001111.1 215 1546 1332
Streptobacillus moniliformis DSM 12112 uid29309 CP001779.1 66 1391 1326
Streptococcus agalactiae 2603 AE009948.1 102 1463 1362
Streptococcus agalactiae A909 CP000114.1 101 1462 1362
Streptococcus agalactiae NEM316 AL732656.1 176 1537 1362
Streptococcus dysgalactiae equisimilis GGS 124 AP010935.1 225 1580 1356
Streptococcus equi 4047 FM204883.1 1 1353 1353
Streptococcus equi zooepidemicus FM204884.1 1 1353 1353
Streptococcus equi zooepidemicus MGCS10565 CP001129.1 233 1585 1353
Streptococcus gallolyticus UCN34 uid34729 FN597254.1 170 1525 1356
Streptococcus gordonii Challis substr CH1 CP000725.1 166 1518 1353
Streptococcus mitis B6 uid16302 FN568063.1 1 1362 1362
Streptococcus mutans AE014133.1 194 1552 1359
Streptococcus mutans NN2025 uid28997 AP010655.1 194 1552 1359
Streptococcus pneumoniae 70585 CP000918.1 197 1558 1362
Streptococcus pneumoniae ATCC 700669 FM211187.1 186 1547 1362
Streptococcus pneumoniae CGSP14 CP001033.1 197 1558 1362
Streptococcus pneumoniae D39 CP000410.1 1 1362 1362
Streptococcus pneumoniae G54 CP001015.1 197 1558 1362
Streptococcus pneumoniae Hungary19A 6 CP000936.1 197 1558 1362
Streptococcus pneumoniae JJA CP000919.1 197 1558 1362
Streptococcus pneumoniae P1031 CP000920.1 197 1558 1362
Streptococcus pneumoniae R6 AE007317.1 1 1362 1362
Streptococcus pneumoniae Taiwan19F 14 CP000921.1 197 1558 1362
Streptococcus pneumoniae TIGR4 AE005672.3 197 1558 1362
Streptococcus pyogenes M1 GAS AE004092.1 232 1587 1356
Streptococcus pyogenes Manfredo AM295007.1 202 1557 1356


Streptococcus pyogenes MGAS10270 CP000260.1 202 1557 1356
Streptococcus pyogenes MGAS10394 CP000003.1 202 1557 1356
Streptococcus pyogenes MGAS10750 CP000262.1 202 1557 1356
Streptococcus pyogenes MGAS2096 CP000261.1 202 1557 1356
Streptococcus pyogenes MGAS315 AE014074.1 232 1587 1356
Streptococcus pyogenes MGAS5005 CP000017.1 202 1557 1356
Streptococcus pyogenes MGAS6180 CP000056.1 202 1557 1356
Streptococcus pyogenes MGAS8232 AE009949.1 202 1557 1356
Streptococcus pyogenes MGAS9429 CP000259.1 202 1557 1356
Streptococcus pyogenes NZ131 CP000829.1 232 1587 1356
Streptococcus pyogenes SSI-1 BA000034.2 232 1587 1356
Streptococcus sanguinis SK36 CP000387.1 214 1566 1353
Streptococcus suis 05ZYH33 CP000407.1 1 1374 1374
Streptococcus suis 98HAH33 CP000408.1 1 1374 1374
Streptococcus suis BM407 FM252032.1 38 1372 1335
Streptococcus suis GZ1 uid18737 CP000837.1 1 1374 1374
Streptococcus suis P1 7 uid352 AM946016.1 1 1374 1374
Streptococcus suis SC84 FM252031.1 1 1374 1374
Streptococcus thermophilus CNRZ1066 CP000024.1 186 1550 1365
Streptococcus thermophilus LMD-9 CP000419.1 101 1465 1365
Streptococcus thermophilus LMG 18311 CP000023.1 186 1550 1365
Streptococcus uberis 0140J AM946015.1 1 1356 1356
Streptomyces avermitilis BA000030.3 5285972 5287933 1962
Streptomyces bingchenggensis BCW 1 uid46847 CP002047.1 6602155 6604011 1857
Streptomyces coelicolor AL645882.2 4270778 4272748 1971
Streptomyces griseus NBRC 13350 AP009493.1 4322502 4324370 1869
Streptomyces scabiei 87 22 uid40749 FN554889.1 5139761 5141785 2025


Streptosporangium roseum DSM 43021 uid21083 CP001814.1 361 2109 1749
Sulfurihydrogenibium azorense Az Fu1 CP001229.1 696482 697330 849
Sulfurihydrogenibium YO3AOP1 CP001080.1 129 1508 1380
Sulfurospirillum deleyianum DSM 6946 uid29529 CP001816.1 1277 2602 1326
Sulfurovum NBC37-1 AP009179.1 65 1393 1329
Symbiobacterium thermophilum IAM14863 AP006840.1 1 1377 1377
Synechococcus CC9311 CP000435.1 1723679 1725085 1407
Synechococcus CC9605 CP000110.1 927763 929115 1353
Synechococcus CC9902 CP000097.1 881683 883071 1389
Synechococcus elongatus PCC 6301 AP008231.1 502252 503700 1449
Synechococcus elongatus PCC 7942 CP000100.1 1117103 1118551 1449
Synechococcus PCC 7002 CP000951.1 1 1350 1350
Synechococcus RCC307 CT978603.1 756450 757835 1386
Synechococcus sp WH8102 BX548020.1 1479573 1480970 1398
Synechococcus WH 7803 CT971583.1 697164 698558 1395
Synechocystis PCC6803 BA000022.2 1350236 1351579 1344
Syntrophobacter fumaroxidans MPOB CP000478.1 346 1698 1353
Syntrophomonas wolfei Goettingen CP000448.1 83 1393 1311
Syntrophothermus lipocalidus DSM 12680 uid37873 CP002048.1 108 1418 1311
Syntrophus aciditrophicus SB CP000252.1 155 1528 1374
Teredinibacter turnerae T7901 CP001614.2 541 2322 1782
Thauera MZ1T CP001281.2 250 1752 1503
Thermanaerovibrio acidaminovorans DSM 6589 uid29531 CP001818.1 97 1440 1344
Thermoanaerobacter italicus Ab9 uid33157 CP001936.1 424 1755 1332
Thermoanaerobacter mathranii A3 uid33329 CP002032.1 99 1430 1332
Thermoanaerobacter pseudethanolicus ATCC 33223 CP000924.1 79 1410 1332
Thermoanaerobacter tengcongensis AE008691.1 365 1696 1332


Thermoanaerobacter X514 CP000923.1 195 1526 1332
Thermobaculum terrenum ATCC BAA 798 uid29523 CP001825.1 85 1455 1371
Thermobifida fusca YX CP000088.1 195 2042 1848
Thermobispora bispora DSM 43833 uid20737 CP001874.1 270 1853 1584
Thermocrinis albus DSM 14484 uid37275 CP001931.1 164163 165347 1185
Thermodesulfovibrio yellowstonii DSM 11347 CP001147.1 13985 15301 1317
Thermomicrobium roseum DSM 5159 CP001275.1 1157859 1159256 1398
Thermomonospora curvata DSM 43183 uid20825 CP001738.1 61 2226 2166
Thermosipho africanus TCF52B CP001185.1 219924 221240 1317
Thermosipho melanesiensis BI429 CP000716.1 162 1475 1314
Thermosynechococcus elongatus BA000039.2 609990 611351 1362
Thermotoga lettingae TMO CP000812.1 137 1471 1335
Thermotoga maritima AE000512.1 943332 944273 942
Thermotoga naphthophila RKU 10 uid33663 CP001839.1 38 1360 1323
Thermotoga neapolitana DSM 4359 CP000916.1 1617713 1619065 1353
Thermotoga petrophila RKU-1 CP000702.1 72 1394 1323
Thermotoga RQ2 CP000969.1 62 1384 1323
Thermus thermophilus HB27 AE017221.1 1523000 1524340 1341
Thermus thermophilus HB8 AP008226.1 1848185 1849495 1311
Thioalkalivibrio HL EbGR7 CP001339.1 139 1506 1368
Thioalkalivibrio K90mix uid30759 CP001905.1 118 1485 1368
Thiobacillus denitrificans ATCC 25259 CP000116.1 137 1510 1374
Thiomicrospira crunogena XCL-2 CP000109.2 1 1404 1404
Thiomicrospira denitrificans ATCC 33889 CP000153.1 15 1322 1308
Thiomonas intermedia K12 uid33641 CP002021.1 433 1887 1455
Tolumonas auensis DSM 9187 CP001616.1 12 1397 1386
Treponema denticola ATCC 35405 AE017226.1 166 1575 1410


Treponema pallidum CP001752.1 4 1398 1395
Treponema pallidum Chicago uid39981 CP001752.1 4 1398 1395
Treponema pallidum SS14 CP000805.1 4 1398 1395
Trichodesmium erythraeum IMS101 CP000393.1 27 1397 1371
TW08 27 BX072543.1 1 1437 1437
Tropheryma whipplei Twist AE014184.1 1 1437 1437
Truepera radiovictrix DSM 17093 uid38371 CP002049.1 166 1500 1335
Tsukamurella paurometabola DSM 20162 uid29399 CP001966.1 190 1662 1473
Ureaplasma parvum serovar 3 ATCC 27815 CP000942.1 27 1400 1374
Ureaplasma urealyticum CP001184.1 1 1374 1374
Ureaplasma urealyticum serovar 10 ATCC 33699 CP001184.1 17 1390 1374
Variovorax paradoxus S110 CP001635.1 37 1416 1380
Veillonella parvula DSM 2008 uid21091 CP001820.1 248 1822 1575
Verminephrobacter eiseniae EF01-2 CP000542.1 493 1944 1452
CP001485.1 7397 8815 1419
Vibrio cholerae M66 2 CP001233.1 7397 8815 1419
Vibrio cholerae MJ 1236 CP001485.1 517742 519145 1404
Vibrio cholerae O395 CP001235.1 2660512 2661930 1419
Vibrio cholerae O395 uid32853 CP001235.1 167860 169278 1419
Vibrio Ex25 uid40507 CP001805.1 418744 420150 1407
Vibrio fischeri ES114 CP000020.2 7401 8810 1410
Vibrio fischeri MJ11 CP001139.1 7492 8832 1341
Vibrio harveyi ATCC BAA-1116 CP000789.1 423785 425191 1407
BA000031.2 7680 9086 1407
Vibrio splendidus LGP32 FM954972.2 7498 8919 1422
YJ016 BA000037.2 7450 8856 1407
Wolbachia endosymbiont of Brugia malayi TRS AE017321.1 361157 362539 1383


Wolbachia endosymbiont of Culex quinquefasciatus Pel AM999887.1 152 1534 1383
Wolbachia endosymbiont of Drosophila melanogaster AE017196.1 153 1535 1383
Wolbachia wRi CP001391.1 153 1535 1383
Wolinella succinogenes BX571656.1 1 1314 1314
Xanthobacter autotrophicus Py2 CP000781.1 330 1850 1521
Xanthomonas campestris 8004 CP000050.1 42 1370 1329
Xanthomonas campestris ATCC 33913 AE008922.1 42 1370 1329
Xanthomonas campestris B100 AM920689.1 1 1329 1329
Xanthomonas campestris vesicatoria 85-10 AM039952.1 1 1329 1329
Xanthomonas citri AE008923.1 42 1370 1329
Xanthomonas oryzae KACC10331 AE013598.1 42 1373 1332
Xanthomonas oryzae MAFF 311018 AP008229.1 41 1369 1329
Xanthomonas oryzae PXO99A CP000967.1 45 1373 1329
Xenorhabdus bovienii SS 2004 uid13399 FN667741.1 159 1547 1389
Xylanimonas cellulosilytica DSM 15894 uid19715 CP001821.1 150 1556 1407
Xylella fastidiosa AE009442.1 143 1462 1320
Xylella fastidiosa M12 CP000941.1 20 1351 1332
Xylella fastidiosa M23 CP001011.1 25 1356 1332
Xylella fastidiosa Temecula1 AE009442.1 146 1465 1320
8081 AM286415.1 4572805 4574193 1389
Yersinia pestis Angola CP000901.1 4468868 4470256 1389
Yersinia pestis Antiqua CP000308.1 4666903 4668291 1389
Yersinia pestis biovar Microtus 91001 AE017042.1 4559438 4560826 1389
Yersinia pestis CO92 AL590842.1 4618341 4619729 1389
Yersinia pestis D106004 uid36507b CP001585.1 4605331 4606721 1391
Yersinia pestis D182038 uid36545 CP001589.1 4591558 4592946 1389
Yersinia pestis KIM 10 uid288 AE009952.1 4565119 4566519 1401


Yersinia pestis Nepal516 CP000305.1 4499203 4500591 1389
Yersinia pestis Pestoides F CP000668.1 34 1422 1389
Yersinia pestis Z176003 uid36547 CP001593.1 4518199 4519587 1389
Yersinia pseudotuberculosis IP 31758 CP000720.1 4688891 4690279 1389
Yersinia pseudotuberculosis IP32953 BX936398.1 4709259 4710647 1389
Yersinia pseudotuberculosis PB1 CP001048.1 37 1389 1353
Yersinia pseudotuberculosis YPIII CP000950.1 37 1389 1353
SM A87 uid38641 CP001650.1 3877760 3879187 1428
Zymomonas mobilis NCIMB 11163 uid34821 CP001722.1 274 1728 1455
Zymomonas mobilis ZM4 AE008692.2 1370644 1372098 1455

The table shows sequenced genomes with and without the dnaA gene. Eleven strains lacking a dnaA gene in their chromosomes are shown first, followed by forty strains that have two copies of the dnaA gene (the second copy is listed after the first) and 1,016 strains that have a single copy of the dnaA gene. The strains lacking a dnaA gene are all endosymbionts of insects with reduced genomes. Strains containing two copies of the dnaA gene are pathogens of animals, extremophiles and environmental cultures from water. All plant pathogens contain a single copy of the dnaA gene.

a The dnaA gene of strain Lysinibacillus sphaericus C3 41 begins at position 4,639,741 of 4,639,821 of the circular chromosome instead of position 1.

b The dnaA gene of strain Yersinia pestis D106004 contains a frameshift mutation due to a single nucleotide insertion at position 498. The dnaA gene was identified for this genome with bl2seq using Yersinia pestis Angola (CP000901.1).
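The length column of Table S4.1 can be checked against the start and end coordinates, and the same arithmetic shows why the Yersinia pestis D106004 entry stands out. The sketch below is illustrative only and not part of the original analysis. It assumes 1-based inclusive coordinates on a gene that does not wrap around the origin, and the helper name `gene_length` is hypothetical. The two rows are copied from the table.

```python
def gene_length(start: int, end: int) -> int:
    """Length in bp for 1-based, inclusive coordinates (assumes no origin wrap)."""
    return end - start + 1


# (start, end, listed length) copied from Table S4.1
rows = {
    "Yersinia pestis Angola CP000901.1": (4468868, 4470256, 1389),
    "Yersinia pestis D106004 CP001585.1": (4605331, 4606721, 1391),
}

for name, (start, end, listed) in rows.items():
    length = gene_length(start, end)
    # An intact open reading frame spans a whole number of codons.
    whole_codons = length % 3 == 0
    print(f"{name}: computed={length} listed={listed} multiple_of_3={whole_codons}")
```

For the Angola reference the span is a whole number of codons, whereas the D106004 span is not a multiple of three, which is consistent with the disrupted reading frame described in note b.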


Supplemental Table S4.2. Xanthomonas strains from the PBC used in this study and ITS/RIF/XAC3314 amplicon and sequencing success.

Columns: K Number; Genus; Species; Subspecies/Pathovar; Host; ITS sequence (Obtained/Not Obtained); RIF sequence (Obtained/Not Obtained); XAC3314 Amplicon Type

K0025 Xanthomonas axonopodis dieffenbachiae Anthurium NotObtained Obtained S
K0026 Xanthomonas axonopodis dieffenbachiae Anthurium Obtained Obtained S
K0027 Xanthomonas axonopodis dieffenbachiae Anthurium Obtained Obtained S
K0028 Xanthomonas axonopodis dieffenbachiae Anthurium Obtained Obtained S
K0029 Xanthomonas campestris campestris Cabbage NotObtained Obtained M
K0030 Xanthomonas campestris campestris Cabbage Obtained Obtained M
K0031 Xanthomonas campestris campestris Cabbage NotObtained Obtained M
K0032 Xanthomonas campestris campestris Brassica oleracea Obtained Obtained M
K0033 Xanthomonas campestris campestris Cabbage Obtained Obtained M
K0034 Xanthomonas campestris campestris Cabbage Obtained Obtained M
K0035 Xanthomonas euvesicatoria Pepper NotObtained NotObtained B
K0036 Xanthomonas citri citri Citrus NotObtained Obtained N
K0037 Xanthomonas euvesicatoria Tomato NotObtained NotObtained B
K0038 Xanthomonas euvesicatoria Tomato Obtained Obtained B
K0039 Xanthomonas citri citri Citrus Obtained Obtained B
K0040 Xanthomonas fuscans aurantifolii Lemon Obtained Obtained B
K0041 Xanthomonas fuscans aurantifolii Lime Obtained Obtained S
K0042 Xanthomonas citri citri Citrus Obtained Obtained B
K0043 Xanthomonas citri citri Citrus NotObtained Obtained B
K0044 Xanthomonas oryzae oryzae Rice Obtained Obtained N


K0045 Xanthomonas oryzae oryzae Rice NotObtained Obtained N
K0046 Xanthomonas oryzae oryzae Rice Obtained Obtained N
K0047 Xanthomonas oryzae oryzae Rice NotObtained Obtained N
K0048 Xanthomonas oryzae oryzae Rice Obtained Obtained N
K0193 Xanthomonas campestris armoraciae Cabbage Obtained Obtained M
K0194 Xanthomonas campestris armoraciae Cabbage Obtained Obtained M
K0195 Xanthomonas campestris armoraciae Cabbage Obtained Obtained M
K0196 Xanthomonas campestris armoraciae Cabbage Obtained Obtained M
K0197 Xanthomonas campestris armoraciae Cabbage Obtained Obtained M
K0198 Xanthomonas campestris armoraciae Cabbage Obtained Obtained M
K0199 Xanthomonas campestris armoraciae Cabbage Obtained Obtained M
K0200 Xanthomonas campestris armoraciae Cabbage Obtained Obtained M
K0201 Xanthomonas campestris armoraciae Cabbage NotObtained Obtained M
K0202 Xanthomonas campestris armoraciae Cabbage Obtained Obtained M
K0203 Xanthomonas campestris armoraciae Cabbage Obtained Obtained M
K0204 Xanthomonas campestris raphani Radish Obtained Obtained M
K0205 Xanthomonas campestris raphani Radish NotObtained Obtained M
K0206 Xanthomonas campestris raphani Radish Obtained NotObtained M
K0207 Xanthomonas campestris raphani Radish Obtained Obtained S
K0208 Xanthomonas campestris armoraciae Candytuft Obtained Obtained M
K0209 Xanthomonas campestris armoraciae Candytuft Obtained Obtained M
K0210 Xanthomonas campestris armoraciae Cabbage Obtained Obtained M
K0211 Xanthomonas campestris armoraciae Broccoli Obtained Obtained M
K0212 Xanthomonas campestris armoraciae Cabbage seed Obtained Obtained M
K0213 Xanthomonas campestris armoraciae Cabbage seed Obtained Obtained M
K0214 Xanthomonas campestris armoraciae Cabbage Obtained Obtained M


K0215 Xanthomonas campestris armoraciae Cabbage Obtained Obtained M
K0216 Xanthomonas campestris armoraciae Cabbage NotObtained Obtained B
K0217 Xanthomonas campestris campestris Cabbage Obtained Obtained M
K0218 Xanthomonas campestris campestris Cabbage Obtained Obtained M
K0219 Xanthomonas campestris campestris Cabbage Obtained Obtained M
K0220 Xanthomonas campestris campestris Cabbage NotObtained Obtained M
K0221 Xanthomonas campestris campestris Cabbage Obtained Obtained M
K0222 Xanthomonas campestris campestris Brassica oleracea NotObtained Obtained B
K0223 Xanthomonas campestris campestris Cabbage Obtained Obtained M
K0224 Xanthomonas campestris campestris Cabbage NotObtained Obtained M
K0225 Xanthomonas campestris campestris Cabbage Obtained Obtained M
K0226 Xanthomonas campestris campestris Cabbage Obtained Obtained M
K0227 Xanthomonas campestris campestris Cabbage Obtained Obtained M
K0228 Xanthomonas campestris campestris Brussels sprouts Obtained Obtained M
K0229 Stenotrophomonas rhizophila Cabbage seed Obtained Obtained N
K0230 Stenotrophomonas rhizophila Cabbage seed Obtained Obtained N
K0231 Xanthomonas campestris campestris Cabbage Obtained NotObtained A
K0232 Xanthomonas campestris campestris Cabbage NotObtained Obtained N
K0233 Xanthomonas campestris campestris Cabbage NotObtained Obtained M
K0234 Xanthomonas campestris campestris Cabbage Obtained Obtained M
K0235 Xanthomonas campestris campestris Cabbage Obtained Obtained M
K0236 Xanthomonas campestris campestris Cabbage Obtained Obtained M
K0237 Xanthomonas campestris campestris Cabbage Obtained Obtained M
K0238 Xanthomonas campestris campestris Cabbage Obtained Obtained B
K0239 Xanthomonas campestris campestris Cabbage NotObtained Obtained M


K0240 Xanthomonas campestris campestris Cabbage Obtained Obtained M
K0241 Xanthomonas campestris campestris Cabbage Obtained Obtained M
K0242 Xanthomonas campestris campestris Cabbage Obtained Obtained A
K0243 Xanthomonas axonopodis dieffenbachiae Anthurium Obtained Obtained S
K0244 Xanthomonas axonopodis dieffenbachiae Colocasia Obtained Obtained A
K0245 Xanthomonas campestris campestris Cabbage Obtained Obtained M
K0246 Xanthomonas campestris campestris Cabbage Obtained Obtained M
K0247 Xanthomonas campestris campestris Cabbage Obtained Obtained M
K0248 Xanthomonas campestris campestris Cabbage NotObtained Obtained M
K0249 Xanthomonas campestris campestris Cabbage NotObtained Obtained M
K0250 Xanthomonas campestris campestris Cabbage NotObtained Obtained M
K0251 Xanthomonas campestris campestris Cabbage NotObtained Obtained M
K0252 Xanthomonas campestris campestris Cabbage NotObtained Obtained B
K0253 Xanthomonas campestris campestris Cabbage Obtained Obtained A
K0254 Xanthomonas campestris campestris Cabbage NotObtained Obtained M
K0255 Xanthomonas campestris campestris Cabbage NotObtained Obtained M
K0256 Xanthomonas campestris campestris Broccoli NotObtained Obtained M
K0257 Xanthomonas campestris campestris Broccoli Obtained Obtained M
K0258 Xanthomonas axonopodis dieffenbachiae Aglaonema Obtained NotObtained N
K0259 Xanthomonas axonopodis dieffenbachiae Anthurium NotObtained Obtained S
K0260 Xanthomonas axonopodis dieffenbachiae Anthurium Obtained Obtained S
K0261 Xanthomonas axonopodis dieffenbachiae Anthurium Obtained Obtained S
K0262 Xanthomonas axonopodis dieffenbachiae Aglaonema Obtained Obtained S
K0263 Xanthomonas axonopodis dieffenbachiae Anthurium Obtained Obtained S
K0264 Xanthomonas axonopodis dieffenbachiae Anthurium Obtained Obtained S
K0265 Xanthomonas axonopodis dieffenbachiae Syngonium Obtained Obtained S
K0266 Xanthomonas axonopodis dieffenbachiae Epipremnum Obtained Obtained N


K0267 Xanthomonas axonopodis dieffenbachiae Anthurium NotObtained Obtained A
K0268 Xanthomonas axonopodis dieffenbachiae Anthurium Obtained Obtained S
K0269 Xanthomonas axonopodis dieffenbachiae Anthurium Obtained Obtained S
K0270 Xanthomonas axonopodis dieffenbachiae Anthurium Obtained Obtained S
K0271 Xanthomonas axonopodis dieffenbachiae Anthurium Obtained Obtained S
K0272 Xanthomonas axonopodis dieffenbachiae Anthurium Obtained Obtained S
K0273 Xanthomonas axonopodis dieffenbachiae Anthurium Obtained Obtained S
K0274 Xanthomonas axonopodis dieffenbachiae Spathiphyllum Obtained NotObtained N
K0275 Xanthomonas axonopodis dieffenbachiae Xanthosoma NotObtained NotObtained S
K0276 Xanthomonas axonopodis dieffenbachiae Anthurium Obtained Obtained S
K0277 Xanthomonas axonopodis dieffenbachiae Anthurium NotObtained Obtained S
K0278 Xanthomonas axonopodis dieffenbachiae Anthurium NotObtained Obtained A
K0279 Xanthomonas axonopodis dieffenbachiae Anthurium NotObtained Obtained S
K0280 Xanthomonas axonopodis dieffenbachiae Anthurium Obtained Obtained S
K0281 Xanthomonas axonopodis dieffenbachiae Anthurium Obtained Obtained S
K0282 Xanthomonas campestris aberrans Cabbage Obtained Obtained M
K0283 Xanthomonas campestris incanae Matthiola incana Obtained Obtained M
K0284 Xanthomonas campestris raphani Radish Obtained NotObtained M
K0285 Xanthomonas campestris campestris Cabbage Obtained Obtained B
K0286 Xanthomonas campestris campestris Cabbage Obtained Obtained B
K0287 Xanthomonas campestris campestris Cabbage Obtained Obtained B
K0288 Xanthomonas campestris campestris Cabbage Obtained Obtained B
K0289 Xanthomonas campestris campestris Broccoli Obtained Obtained B
K0290 Xanthomonas campestris campestris Brussels sprouts Obtained B
K0291 Xanthomonas campestris campestris Brussels sprouts Obtained B


K0292 Xanthomonas campestris campestris crucifer Obtained M
K0293 Xanthomonas campestris campestris crucifer Obtained M
K0294 Xanthomonas campestris campestris Cabbage Obtained M
K0295 Xanthomonas axonopodis vitians Lettuce Obtained N
K0296 Xanthomonas campestris campestris Cabbage Obtained B
K0297 Xanthomonas campestris campestris Cabbage Obtained B
K0298 Xanthomonas campestris campestris Cabbage Obtained B
K0299 Xanthomonas campestris campestris Cabbage Obtained B
K0300 Xanthomonas campestris campestris Cabbage Obtained B
K0301 Xanthomonas campestris campestris Cabbage Obtained B
K0302 Xanthomonas campestris campestris Cabbage Obtained B
K0303 Xanthomonas campestris campestris Cabbage Obtained B
K0304 Xanthomonas campestris campestris Cabbage Obtained B
K0305 Xanthomonas campestris campestris Cabbage Obtained B
K0306 Xanthomonas campestris campestris Cabbage Obtained B
K0307 Xanthomonas campestris campestris Cabbage Obtained N
K0308 Xanthomonas campestris campestris Cabbage Obtained N
K0309 Xanthomonas campestris campestris Cabbage Obtained B
K0310 Xanthomonas campestris campestris Cabbage Obtained B
K0311 Xanthomonas oryzae oryzae Rice Obtained N
K0312 Xanthomonas oryzae oryzae Rice NotObtained N
K0313 Xanthomonas oryzae oryzae Rice NotObtained N
K0314 Xanthomonas oryzae oryzae Rice NotObtained N
K0315 Xanthomonas oryzae oryzae Rice Obtained N
K0316 Xanthomonas oryzae oryzae Rice Obtained N
K0317 Xanthomonas oryzae oryzae Rice NotObtained N
K0318 Xanthomonas oryzae oryzae Rice Obtained N


K0319 Xanthomonas oryzae oryzae Rice Obtained N
K0320 Xanthomonas oryzae oryzae Rice NotObtained N
K0321 Xanthomonas oryzae oryzae Rice Obtained N
K0322 Xanthomonas oryzae oryzae Rice Obtained N
K0323 Xanthomonas oryzae oryzae Rice NotObtained N
K0324 Xanthomonas oryzae oryzae Rice NotObtained N
K0325 Xanthomonas oryzae oryzae Rice Obtained N
K0326 Xanthomonas oryzae oryzae Rice NotObtained M/S
K0327 Xanthomonas oryzae oryzae Rice Obtained N
K0328 Xanthomonas oryzae oryzae Rice Obtained N
K0329 Xanthomonas citri citri Citrus Obtained B
K0330 Xanthomonas citri citri Citrus NotObtained B
K0331 Xanthomonas alfalfae citrumelonis Citrus Obtained N
K0332 Xanthomonas alfalfae citrumelonis Citrus Obtained B
K0333 Xanthomonas citri citri Citrus NotObtained B
K0334 Xanthomonas euvesicatoria Pepper Obtained N
K0335 Xanthomonas euvesicatoria Pepper Obtained N
K0336 Xanthomonas euvesicatoria Pepper Obtained N
K0337 Xanthomonas euvesicatoria Pepper Obtained N
K0338 Xanthomonas euvesicatoria Pepper NotObtained N
K0339 Xanthomonas euvesicatoria Pepper Obtained N
K0340 Xanthomonas citri citri Citrus NotObtained N
K0341 Xanthomonas citri citri Citrus NotObtained N
K0342 Xanthomonas euvesicatoria Tomato Obtained B
K0343 Xanthomonas euvesicatoria Tomato Obtained B
K0344 Xanthomonas euvesicatoria Tomato Obtained B
K0345 Xanthomonas euvesicatoria Tomato Obtained B


K0346 Xanthomonas euvesicatoria Tomato Obtained B
K0347 Xanthomonas euvesicatoria Tomato Obtained B
K0348 Xanthomonas euvesicatoria Tomato Obtained B
K0349 Xanthomonas euvesicatoria Tomato Obtained B
K0350 Xanthomonas euvesicatoria Tomato Obtained B
K0351 Xanthomonas euvesicatoria Tomato NotObtained N
K0352 Xanthomonas euvesicatoria Tomato Obtained B
K0353 Xanthomonas euvesicatoria Tomato Obtained N
K0354 Xanthomonas campestris campestris Cabbage Obtained B
K0355 Xanthomonas campestris campestris Cabbage Obtained B
K0356 Xanthomonas campestris campestris Cabbage Obtained B
K0357 Xanthomonas campestris campestris Cabbage Obtained B
K0358 Xanthomonas campestris campestris Cabbage Obtained B
K0359 Xanthomonas campestris campestris Cabbage Obtained M
K0360 Xanthomonas campestris campestris Cabbage Obtained M
K0361 Xanthomonas campestris campestris Cabbage Obtained M
K0362 Xanthomonas campestris campestris Cabbage Obtained M
K0363 Xanthomonas campestris campestris Cabbage Obtained M
K0364 Xanthomonas campestris campestris Cabbage Obtained M
K0365 Xanthomonas campestris campestris Cabbage Obtained M
K0366 Xanthomonas campestris campestris Cabbage Obtained M
K0367 Xanthomonas campestris campestris Cabbage Obtained M
K0368 Xanthomonas campestris campestris Cabbage Obtained M
K0369 Xanthomonas campestris campestris Cabbage Obtained M
K0370 Xanthomonas campestris campestris Cabbage Obtained M
K0371 Xanthomonas campestris campestris Cabbage Obtained B
K0372 Xanthomonas campestris campestris Cabbage Obtained B


K0373 Xanthomonas campestris campestris Cabbage Obtained B
K0374 Xanthomonas campestris campestris Cabbage Obtained B
K0375 Xanthomonas campestris campestris Cabbage Obtained M
K0376 Xanthomonas campestris aberrans Cabbage Obtained M
K0377 Xanthomonas campestris aberrans Cabbage Obtained M
K0378 Xanthomonas campestris campestris Cabbage NotObtained N
K0379 Xanthomonas campestris campestris Cabbage Obtained M
K0380 Xanthomonas campestris campestris Cabbage Obtained B
K0381 Xanthomonas campestris campestris Cabbage Obtained M
K0382 Xanthomonas campestris campestris Cabbage NotObtained N
K0383 Xanthomonas campestris campestris Cabbage Obtained M
K0384 Xanthomonas campestris campestris Cabbage Obtained B
K0577 Stenotrophomonas rhizophila Cabbage seed NotObtained N
K0578 Xanthomonas campestris campestris Cabbage Obtained B
K0579 Xanthomonas campestris campestris Cabbage Obtained B
K0580 Xanthomonas campestris campestris Cabbage Obtained B
K0581 Xanthomonas campestris campestris Cabbage NotObtained N
K0582 Xanthomonas axonopodis allii Onion NotObtained N
K0583 Xanthomonas axonopodis allii Onion Obtained B
K0584 Xanthomonas axonopodis allii Onion Obtained N
K0585 Xanthomonas axonopodis allii Onion Obtained N
K0586 Xanthomonas axonopodis allii Onion Obtained N
K0587 Xanthomonas axonopodis allii Onion Obtained N
K0588 Xanthomonas axonopodis allii Allium cepa Obtained B
K0589 Xanthomonas axonopodis allii Allium cepa Obtained N
K0590 Xanthomonas axonopodis allii Onion Obtained N
K0591 Xanthomonas axonopodis allii Onion Obtained N


K0592 Xanthomonas axonopodis manihotis Cassava Obtained S
K0593 Xanthomonas axonopodis allii Onion Obtained N
K0594 Xanthomonas axonopodis allii Onion Obtained N
K0595 Xanthomonas axonopodis allii Onion Obtained N
K0596 Xanthomonas axonopodis begoniae Begonia NotObtained B
K0597 Xanthomonas axonopodis vitians Lettuce Obtained N
K0598 Stenotrophomonas rhizophila Cabbage seed Obtained N
K0599 Stenotrophomonas rhizophila Cabbage seed Obtained BB
K0600 Stenotrophomonas rhizophila Cabbage seed Obtained N
K0601 Stenotrophomonas rhizophila Cabbage seed Obtained N
K0602 Stenotrophomonas rhizophila Cabbage seed NotObtained N
K0603 Stenotrophomonas rhizophila Cabbage seed Obtained N
K0604 Stenotrophomonas rhizophila Cabbage seed Obtained N
K0605 Stenotrophomonas rhizophila Cabbage seed Obtained N
K0606 Stenotrophomonas rhizophila Cabbage seed Obtained BB
K0607 Stenotrophomonas rhizophila Cabbage seed Obtained N
K0608 Stenotrophomonas rhizophila Cabbage seed Obtained N
K0609 Stenotrophomonas rhizophila Cabbage seed Obtained N
K0610 Stenotrophomonas rhizophila Cabbage seed NotObtained N
K0611 Xanthomonas euvesicatoria Tomato NotObtained M
K0612 Xanthomonas euvesicatoria Tomato NotObtained M/B


K0613 Xanthomonas euvesicatoria Tomato Obtained M/B
K0614 Xanthomonas euvesicatoria Tomato Obtained B
K0615 Xanthomonas euvesicatoria Tomato Obtained M/B
K0616 Xanthomonas euvesicatoria Tomato Obtained A
K0617 Xanthomonas euvesicatoria Tomato Obtained B
K0618 Xanthomonas euvesicatoria Tomato Obtained B
K0619 Xanthomonas euvesicatoria Tomato NotObtained M/B
K0620 Xanthomonas euvesicatoria Tomato NotObtained M/B
K0621 Xanthomonas euvesicatoria Tomato NotObtained M/B
K0622 Xanthomonas euvesicatoria Tomato Obtained B
K0623 Xanthomonas euvesicatoria Tomato Obtained B
K0624 Xanthomonas euvesicatoria Tomato Obtained B
K0625 Xanthomonas euvesicatoria Tomato Obtained B
K0626 Xanthomonas euvesicatoria Tomato Obtained B
K0627 Xanthomonas euvesicatoria Tomato Obtained B
K0628 Xanthomonas euvesicatoria Tomato Obtained B
K0629 Xanthomonas euvesicatoria Tomato Obtained B
K0630 Xanthomonas euvesicatoria Tomato Obtained B
K0631 Xanthomonas hortorum carotae Carrot Obtained B
K0632 Xanthomonas hortorum carotae Carrot Obtained B
K0633 Xanthomonas hortorum carotae Carrot Obtained A
K0634 Xanthomonas campestris raphani crucifer Obtained M
K0635 Xanthomonas campestris raphani crucifer Obtained M
K0636 Xanthomonas campestris raphani crucifer Obtained N
K0637 Xanthomonas campestris raphani crucifer Obtained A
K0638 Xanthomonas campestris raphani crucifer Obtained M
K0639 Xanthomonas campestris armoraciae crucifer Obtained M


K0640 Xanthomonas campestris armoraciae crucifer Obtained M
K0641 Xanthomonas euvesicatoria Tomato Obtained B
K0642 Xanthomonas axonopodis manihotis Cassava Obtained M/S
K0643 Xanthomonas axonopodis manihotis Cassava NotObtained N
K0644 Xanthomonas axonopodis manihotis Cassava NotObtained BB
K0645 Xanthomonas euvesicatoria Tomato Obtained B
K0646 Xanthomonas euvesicatoria Pepper Obtained B
K0647 Xanthomonas oryzae oryzae Rice Obtained A
K0648 Xanthomonas axonopodis poinsettiicola Poinsettia NotObtained N
K0649 Xanthomonas species urticae Nettle NotObtained N
K0650 Xanthomonas axonopodis vitians Lettuce NotObtained S
K0651 Xanthomonas axonopodis begoniae Begonia NotObtained M
K0652 Xanthomonas euvesicatoria Pepper NotObtained B
K0653 Xanthomonas hortorum carotae Carrot Obtained B
K0654 Xanthomonas hortorum carotae Carrot Obtained A
K0655 Xanthomonas campestris armoraciae Cabbage Obtained M
K0656 Xanthomonas albilineans Sugarcane NotObtained N
K0657 Xanthomonas albilineans Sugarcane Obtained N
K0658 Xanthomonas albilineans Sugarcane Obtained N
K0659 Xanthomonas albilineans Sugarcane Obtained N
K0660 Xanthomonas fragariae Strawberry NotObtained N
K0661 Xanthomonas fragariae Strawberry NotObtained S
K0662 Xanthomonas oryzae oryzae Rice Obtained N
K0663 Xanthomonas euvesicatoria Tomato NotObtained M/B
K0664 Xanthomonas axonopodis panax Panax Obtained N
K0665 Xanthomonas axonopodis vitians Lettuce Obtained N
K0666 Xanthomonas euvesicatoria tomato/pepper Obtained B


K0667 Xanthomonas euvesicatoria tomato/pepper Obtained B
K0668 Xanthomonas euvesicatoria tomato/pepper Obtained B
K0669 Xanthomonas euvesicatoria tomato/pepper Obtained B
K0670 Xanthomonas axonopodis syngonii Syngonium Obtained B
K0671 Xanthomonas axonopodis syngonii Syngonium Obtained B
K0672 Xanthomonas axonopodis syngonii Syngonium Obtained B
K0810 Xanthomonas campestris campestris Broccoli Obtained M
K0811 Xanthomonas axonopodis dieffenbachiae aroid Obtained B
K0812 Xanthomonas axonopodis dieffenbachiae aroid Obtained B
K0813 Xanthomonas axonopodis dieffenbachiae aroid Obtained S
K0814 Xanthomonas axonopodis dieffenbachiae aroid Obtained S
K0815 Xanthomonas axonopodis dieffenbachiae aroid Obtained B
K0816 Xanthomonas axonopodis dieffenbachiae aroid Obtained B
K0817 Xanthomonas axonopodis dieffenbachiae aroid Obtained N
K0818 Xanthomonas axonopodis dieffenbachiae aroid Obtained B
K0819 Xanthomonas axonopodis dieffenbachiae aroid Obtained S
K0820 Xanthomonas axonopodis dieffenbachiae Dieffenbachia Obtained S
K0821 Xanthomonas axonopodis dieffenbachiae aroid NotObtained N
K0822 Xanthomonas axonopodis dieffenbachiae aroid Obtained B
K0823 Xanthomonas axonopodis dieffenbachiae aroid Obtained S
K0824 Xanthomonas axonopodis dieffenbachiae aroid Obtained S
K0825 Xanthomonas campestris campestris Cabbage Obtained M
K0826 Xanthomonas campestris campestris Cabbage Obtained M
K0827 Xanthomonas campestris campestris Cabbage Obtained M
K0828 Xanthomonas axonopodis dieffenbachiae Anthurium Obtained S


K0829 Xanthomonas campestris campestris Cabbage Obtained M
K0830 Xanthomonas campestris aberrans Cabbage Obtained M
K0831 Xanthomonas campestris campestris Cabbage Obtained M
K0832 Xanthomonas campestris armoraciae Cabbage Obtained M
K0833 Xanthomonas campestris armoraciae Cabbage Obtained M
K0834 Xanthomonas campestris incanae Matthiola incana Obtained M
K0835 Xanthomonas campestris aberrans Cabbage Obtained M
K0836 Xanthomonas campestris raphani Radish Obtained N
K0837 Xanthomonas campestris campestris Cabbage Obtained M
K0838 Xanthomonas campestris aberrans crucifer Obtained M
K0839 Xanthomonas campestris raphani Radish Obtained M
K0840 Xanthomonas campestris armoraciae crucifer Obtained M
K0841 Xanthomonas campestris campestris crucifer seed Obtained M
K0842 Xanthomonas campestris campestris crucifer seed Obtained M
K0843 Xanthomonas campestris campestris Swine cress Obtained N
K0844 Xanthomonas axonopodis dieffenbachiae Anthurium Obtained N
K0845 Xanthomonas axonopodis dieffenbachiae Syngonium Obtained S
K0846 Xanthomonas axonopodis dieffenbachiae Anthurium Obtained S
K0847 Xanthomonas axonopodis dieffenbachiae Dieffenbachia Obtained M
K0848 Xanthomonas axonopodis dieffenbachiae Anthurium Obtained S
K0849 Xanthomonas axonopodis dieffenbachiae Anthurium Obtained S
K0850 Xanthomonas axonopodis dieffenbachiae Anthurium NotObtained N
K0851 Xanthomonas axonopodis dieffenbachiae Anthurium Obtained S
K0852 Xanthomonas axonopodis dieffenbachiae Anthurium Obtained S
K0853 Xanthomonas axonopodis dieffenbachiae Anthurium Obtained N


K0854 Xanthomonas axonopodis dieffenbachiae Anthurium Obtained S
K0855 Xanthomonas axonopodis dieffenbachiae Anthurium Obtained S
K0856 Xanthomonas axonopodis dieffenbachiae Anthurium Obtained S
K0857 Xanthomonas axonopodis dieffenbachiae Anthurium Obtained S
K0858 Xanthomonas axonopodis dieffenbachiae Colocasia Obtained S
K0860 Xanthomonas axonopodis dieffenbachiae Anthurium Obtained S
K0861 Xanthomonas axonopodis dieffenbachiae Anthurium Obtained S
K0862 Xanthomonas campestris campestris Cabbage Obtained M
K0863 Xanthomonas hortorum pelargonii Geranium Obtained N
K0864 Xanthomonas euvesicatoria Pepper NotObtained B
K0865 Xanthomonas axonopodis citri Obtained
K0866 Xanthomonas vasicola Obtained
K0867 Xanthomonas vesicatoria Obtained
K0868 Xanthomonas perforans Obtained
K0869 Xanthomonas axonopodis citri NotObtained
K0870 Xanthomonas hortorum hederae Obtained
K0871 Xanthomonas cucurbitae Obtained
K0872 Xanthomonas fragariae Obtained
K0873 Xanthomonas melonis Obtained
K0874 Xanthomonas cynarae Obtained
K0875 Xanthomonas strains from Dysoxylum Obtained
K0876 Xanthomonas pisi Obtained
K0877 Xanthomonas vasicola Obtained
K0878 Xanthomonas arboricola corylina Obtained
K0879 Xanthomonas axonopodis vesicatoria Obtained
K0880 Xanthomonas melonis Obtained
K0881 Xanthomonas cynarae Obtained


K0882 Xanthomonas strains from Dysoxylum Obtained
K0883 Xanthomonas arboricola arboricola Obtained
K0884 Xanthomonas campestris incanae Obtained
K0885 Xanthomonas oryzae oryzae Obtained
K0886 Xanthomonas vesicatoria NotObtained
K0887 Xanthomonas melonis Obtained
K0888 Xanthomonas vasicola Obtained
K0889 Xanthomonas fragariae Obtained
K0890 Xanthomonas strains from Dysoxylum Obtained
K0891 Xanthomonas axonopodis axonopodis Obtained
K0892 Xanthomonas hortorum taraxaci Obtained
K0893 Xanthomonas vasicola Obtained
K0894 Xanthomonas arboricola populi Obtained
K0895 Xanthomonas oryzae oryzicola Obtained
K0896 Xanthomonas strains from Dysoxylum Obtained
K0897 Xanthomonas cucurbitae Obtained
K0898 Xanthomonas hortorum pelargonii Obtained
K0899 Xanthomonas axonopodis phaseoli Obtained
K0900 Xanthomonas strains from Dysoxylum Obtained
K0901 Obtained
K0902 Xanthomonas campestris raphani Obtained
K0903 Xanthomonas axonopodis manihotis Obtained
K0904 Xanthomonas fragariae Obtained
K0905 Xanthomonas albilineans Obtained
K0906 Xanthomonas codiaei Obtained


K0907 Xanthomonas bromi Obtained
K0908 Xanthomonas strains from Dysoxylum Obtained
K0909 Xanthomonas axonopodis vasculorum NotObtained
K0910 Xanthomonas oryzae oryzicola Obtained
K0911 Xanthomonas gardneri Obtained
K0912 Xanthomonas albilineans NotObtained
K0913 Xanthomonas albilineans NotObtained
K0914 Xanthomonas campestris NotObtained
K0915 Xanthomonas axonopodis NotObtained
K0916 Xanthomonas oryzae oryzae NotObtained
K0917 Xanthomonas species NotObtained
K0918 Xanthomonas arboricola pruni NotObtained
K0919 Xanthomonas axonopodis axonopodis NotObtained
K0920 Xanthomonas axonopodis citri NotObtained
K0921 Xanthomonas axonopodis citri NotObtained
K0922 Xanthomonas axonopodis glycines NotObtained
K0923 Xanthomonas axonopodis malvacearum NotObtained
K0924 Xanthomonas axonopodis vesicatoria NotObtained
K0925 Xanthomonas axonopodis vesicatoria NotObtained
K0926 Xanthomonas axonopodis vignicola NotObtained
K0927 Xanthomonas cassavae NotObtained
K0928 Xanthomonas cassavae NotObtained
K0929 Xanthomonas campestris campestris NotObtained
K0930 Xanthomonas codiaei NotObtained
K0931 Xanthomonas cucurbitae NotObtained
K0932 Xanthomonas fragariae NotObtained
K0933 Xanthomonas hortorum pelargonii NotObtained


K0934 NotObtained
K0935 Xanthomonas hyacinthi NotObtained
K0936 Xanthomonas melonis NotObtained
K0937 Xanthomonas hyacinthi NotObtained
K0938 Xanthomonas hyacinthi NotObtained
K0939 Xanthomonas melonis NotObtained
K0940 NotObtained
K0941 Xanthomonas populi NotObtained
K0942 Xanthomonas theicola NotObtained
K0943 graminis NotObtained
K0944 Xanthomonas translucens translucens NotObtained
K0945 Xanthomonas translucens undulosa NotObtained
K0946 Xanthomonas theicola NotObtained
K0947 Xanthomonas vesicatoria NotObtained
K0948 Xanthomonas theicola NotObtained


Supplemental Table SB.1. Clavibacter strains from the PBC used in this study and ITS/RIF amplicon and sequencing success.

Columns: K Number; Genus; Species; Subspecies; Host; Location; ITS sequence (Obtained/Not Obtained); RIF sequence (Obtained/Not Obtained)

K0073 Clavibacter michiganensis michiganensis Tomato Idaho, USA Obtained Obtained
K0074 Clavibacter michiganensis michiganensis Tomato China Obtained Obtained
K0075 Clavibacter michiganensis michiganensis Tomato Morocco Obtained Obtained
K0076 Clavibacter michiganensis michiganensis Tomato China Obtained Obtained
K0077 Clavibacter michiganensis michiganensis Tomato Portugal Obtained Obtained
K0078 Clavibacter michiganensis michiganensis Tomato Hawaii, USA Obtained Obtained
K0079 Clavibacter michiganensis michiganensis Tomato Chile Obtained Obtained
K0080 Clavibacter michiganensis michiganensis Pepper Ohio, USA Obtained Obtained
K0081 Clavibacter michiganensis michiganensis Tomato Oregon, USA Obtained Obtained
K0082 Clavibacter michiganensis michiganensis Tomato China Obtained Obtained
K0083 Clavibacter michiganensis michiganensis Tomato Hawaii, USA Obtained Obtained
K0084 Clavibacter michiganensis michiganensis Tomato Kenya NotObtained Obtained
K0085 Clavibacter michiganensis michiganensis Tomato California, USA Obtained Obtained
K0086 Clavibacter michiganensis michiganensis Tomato Washington, USA NotObtained Obtained
K0087 Clavibacter michiganensis michiganensis Tomato Ohio, USA Obtained Obtained
K0088 Clavibacter michiganensis michiganensis Tomato Netherlands Obtained Obtained
K0089 Clavibacter michiganensis michiganensis Tomato Oregon, USA Obtained Obtained
K0090 Clavibacter michiganensis sepedonicus Potato USA Obtained Obtained
K0091 Clavibacter michiganensis insidiosus Alfalfa Kansas, USA Obtained Obtained
K0093 Clavibacter michiganensis michiganensis Tomato South Africa NotObtained Obtained


K0094 Clavibacter michiganensis michiganensis Tomato China Obtained Obtained
K0095 Clavibacter michiganensis michiganensis Tomato California, USA NotObtained Obtained
K0096 Clavibacter michiganensis michiganensis Tomato California, USA NotObtained Obtained
K0385 Clavibacter michiganensis michiganensis Tomato Hawaii, USA Obtained
K0386 Clavibacter michiganensis michiganensis Tomato California, USA Obtained
K0387 Clavibacter michiganensis michiganensis Tomato Ohio, USA Obtained
K0388 Clavibacter michiganensis michiganensis Tomato Ohio, USA Obtained
K0389 Clavibacter michiganensis michiganensis Tomato Ohio, USA Obtained
K0390 Clavibacter michiganensis michiganensis Tomato Ohio, USA Obtained
K0391 Clavibacter michiganensis michiganensis Tomato Ohio, USA Obtained
K0392 Clavibacter michiganensis michiganensis Tomato Ohio, USA Obtained
K0393 Clavibacter michiganensis michiganensis Tomato California, USA Obtained
K0394 Clavibacter michiganensis michiganensis Tomato California, USA Obtained
K0395 Clavibacter michiganensis michiganensis Tomato California, USA Obtained
K0396 Clavibacter michiganensis michiganensis Tomato California, USA Obtained
K0397 Clavibacter michiganensis michiganensis Tomato California, USA Obtained
K0398 Clavibacter michiganensis michiganensis Tomato California, USA Obtained
K0399 Clavibacter michiganensis michiganensis Tomato North Carolina, USA Obtained
K0400 Clavibacter michiganensis michiganensis Tomato North Carolina, USA Obtained
K0401 Clavibacter michiganensis michiganensis Tomato North Carolina, USA Obtained
K0402 Clavibacter michiganensis michiganensis Tomato North Carolina, USA Obtained
K0403 Clavibacter michiganensis michiganensis Tomato North Carolina, USA Obtained
K0404 Clavibacter michiganensis michiganensis Tomato Iowa, USA Obtained
K0405 Clavibacter michiganensis michiganensis Tomato Iowa, USA Obtained
K0406 Clavibacter michiganensis michiganensis Tomato Ohio, USA Obtained
K0407 Clavibacter michiganensis michiganensis Tomato Ohio, USA Obtained
K0408 Clavibacter michiganensis michiganensis Tomato Ohio, USA Obtained


K0409 Clavibacter michiganensis michiganensis Tomato Ohio, USA Obtained
K0410 Clavibacter michiganensis michiganensis Tomato Ohio, USA Obtained
K0411 Clavibacter michiganensis michiganensis Tomato Ohio, USA Obtained
K0412 Clavibacter michiganensis michiganensis Tomato Ohio, USA Obtained
K0413 Clavibacter michiganensis michiganensis Tomato Ohio, USA Obtained
K0414 Clavibacter michiganensis michiganensis Tomato Ohio, USA Obtained
K0415 Clavibacter michiganensis michiganensis Tomato Ohio, USA Obtained
K0416 Clavibacter michiganensis michiganensis Tomato Ohio, USA Obtained
K0417 Clavibacter michiganensis michiganensis Tomato Ohio, USA Obtained
K0418 Clavibacter michiganensis michiganensis Tomato Ohio, USA Obtained
K0419 Clavibacter michiganensis michiganensis Tomato Ohio, USA Obtained
K0420 Clavibacter michiganensis michiganensis Tomato Ohio, USA Obtained
K0421 Clavibacter michiganensis michiganensis Tomato Ohio, USA Obtained
K0422 Clavibacter michiganensis michiganensis Tomato Ohio, USA Obtained
K0423 Clavibacter michiganensis michiganensis Tomato Ohio, USA Obtained
K0424 Clavibacter michiganensis michiganensis Tomato Ohio, USA Obtained
K0425 Clavibacter michiganensis michiganensis Tomato Ohio, USA Obtained
K0426 Clavibacter michiganensis michiganensis Tomato Ohio, USA Obtained
K0427 Clavibacter michiganensis michiganensis Tomato Ohio, USA Obtained
K0428 Clavibacter michiganensis michiganensis Tomato Ohio, USA Obtained
K0429 Clavibacter michiganensis michiganensis Tomato Ohio, USA Obtained
K0430 Clavibacter michiganensis michiganensis Tomato Ohio, USA Obtained
K0431 Clavibacter michiganensis michiganensis Tomato Ohio, USA Obtained
K0432 Clavibacter michiganensis michiganensis Tomato Ohio, USA Obtained
K0433 Clavibacter michiganensis michiganensis Tomato Ohio, USA Obtained
K0434 Clavibacter michiganensis michiganensis Tomato Ohio, USA Obtained
K0435 Clavibacter michiganensis michiganensis Tomato Ohio, USA Obtained


K0436 Clavibacter michiganensis michiganensis Tomato Ohio, USA Obtained
K0437 Clavibacter michiganensis michiganensis Tomato Ohio, USA Obtained
K0438 Clavibacter michiganensis michiganensis Tomato Ohio, USA Obtained
K0439 Clavibacter michiganensis michiganensis Tomato Iowa, USA Obtained
K0440 Clavibacter michiganensis michiganensis Tomato Iowa, USA Obtained
K0441 Clavibacter michiganensis michiganensis Tomato Ohio, USA Obtained
K0442 Clavibacter michiganensis michiganensis Tomato Ohio, USA Obtained
K0443 Clavibacter michiganensis michiganensis Tomato Ohio, USA Obtained
K0444 Clavibacter michiganensis michiganensis Tomato Ohio, USA Obtained
K0445 Clavibacter michiganensis michiganensis Tomato Ohio, USA Obtained
K0446 Clavibacter michiganensis michiganensis Tomato Ohio, USA Obtained
K0447 Clavibacter michiganensis michiganensis Tomato Ohio, USA Obtained
K0448 Clavibacter michiganensis michiganensis Tomato Ohio, USA Obtained
K0449 Clavibacter michiganensis michiganensis Tomato Ohio, USA Obtained
K0450 Clavibacter michiganensis michiganensis Tomato Ohio, USA Obtained
K0451 Clavibacter michiganensis michiganensis Tomato Ohio, USA Obtained
K0452 Clavibacter michiganensis michiganensis Tomato Ohio, USA Obtained
K0453 Clavibacter michiganensis michiganensis Tomato Ohio, USA Obtained
K0454 Clavibacter michiganensis michiganensis Tomato Ohio, USA Obtained
K0455 Clavibacter michiganensis michiganensis Tomato Ohio, USA Obtained
K0456 Clavibacter michiganensis michiganensis Tomato Ohio, USA Obtained
K0457 Clavibacter michiganensis michiganensis Tomato Ohio, USA Obtained
K0458 Clavibacter michiganensis michiganensis Tomato Ohio, USA Obtained
K0459 Clavibacter michiganensis michiganensis Tomato Ohio, USA Obtained
K0460 Clavibacter michiganensis michiganensis Tomato Washington, USA Obtained
K0461 Clavibacter michiganensis michiganensis Tomato Washington, USA Obtained
K0462 Clavibacter michiganensis michiganensis Tomato Washington, USA Obtained


K0463 Clavibacter michiganensis michiganensis Tomato Washington, USA Obtained
K0464 Clavibacter michiganensis michiganensis Tomato Washington, USA Obtained
K0465 Clavibacter michiganensis michiganensis Tomato Portugal Obtained
K0466 Clavibacter michiganensis michiganensis Tomato Portugal NotObtained
K0467 Clavibacter michiganensis michiganensis Tomato UK Obtained
K0468 Clavibacter michiganensis michiganensis Tomato Hungary Obtained
K0469 Clavibacter michiganensis michiganensis Tomato UK Obtained
K0470 Clavibacter michiganensis michiganensis Tomato Italy Obtained
K0471 Clavibacter michiganensis michiganensis Tomato Hungary Obtained
K0472 Clavibacter michiganensis michiganensis Tomato Netherlands Obtained
K0473 Clavibacter michiganensis michiganensis Tomato Chile Obtained
K0474 Clavibacter michiganensis michiganensis Tomato California, USA Obtained
K0475 Clavibacter michiganensis michiganensis Tomato California, USA Obtained
K0476 Clavibacter michiganensis michiganensis Tomato Chile Obtained
K0477 Clavibacter michiganensis michiganensis Tomato Chile Obtained
K0478 Clavibacter michiganensis michiganensis Tomato China Obtained
K0479 Clavibacter michiganensis michiganensis Tomato Chile Obtained
K0480 Clavibacter michiganensis michiganensis Tomato Chile Obtained
K0950 Clavibacter Canada Obtained
K0951 Clavibacter Canada Obtained
K0952 Clavibacter Presidio County, Texas, USA Obtained
K0953 Clavibacter Presidio County, Texas, USA Obtained
K0954 Clavibacter Presidio County, Texas, USA Obtained
K0955 Clavibacter Marfa, Texas, USA Obtained
K0956 Clavibacter Presidio County, Texas, USA Obtained
K0957 Clavibacter Fort Davis, Texas, USA Obtained
K0958 Clavibacter Presidio County, Texas, USA Obtained
K0959 Clavibacter Marfa, Texas, USA Obtained


Supplemental Table SC.1. Ralstonia strains from the PBC used in this study and ITS/RIF/egl amplicon and sequencing success.

Columns: K #; Genus; Species; Race, Biovar; Host; ITS sequence (Obtained/Not Obtained); egl sequence (Obtained/Not Obtained/Not Applicable); RIF sequence (Obtained/Not Obtained)

K0001 Ralstonia solanacearum Race 1, Biovar 1 Tomato Obtained Obtained Obtained
K0002 Ralstonia solanacearum Race 1, Biovar 1 Tomato NotObtained N/A Obtained
K0003 Ralstonia solanacearum Race 1, Biovar 1 Tomato Obtained Obtained Obtained
K0004 Ralstonia solanacearum Race 1, Biovar 1 Tomato Obtained Obtained Obtained
K0005 Ralstonia solanacearum Race 1, Biovar 1 Tomato Obtained Obtained Obtained
K0006 Ralstonia solanacearum Race 2, Biovar 1 Banana Obtained Obtained Obtained
K0007 Ralstonia solanacearum Race 2, Biovar 1 Banana Obtained N/A NotObtained
K0008 Ralstonia solanacearum Race 2, Biovar 1 Banana Obtained Obtained Obtained
K0009 Ralstonia solanacearum Race 2, F or D Banana Obtained Obtained Obtained
K0010 Ralstonia solanacearum Blood Disease Banana Obtained Obtained Obtained
K0011 Ralstonia solanacearum Blood Disease Banana Obtained Obtained Obtained
K0012 Ralstonia solanacearum Blood Disease Banana Obtained NotObtained Obtained
K0013 Ralstonia solanacearum Race 2, Biovar 1 Heliconia Obtained Obtained Obtained
K0014 Ralstonia solanacearum Race 2, Biovar 1 Heliconia Obtained Obtained NotObtained
K0015 Ralstonia solanacearum Race 2, Biovar 1 Heliconia Obtained Obtained Obtained
K0016 Ralstonia solanacearum Race 3, Biovar 2 Geranium Obtained Obtained Obtained
K0017 Ralstonia solanacearum unknown Race Geranium Obtained Obtained Obtained
K0018 Ralstonia solanacearum Race 3 Biovar 2 Potato Obtained Obtained Obtained
K0019 Ralstonia solanacearum Race 3, Biovar 2 Potato Obtained Obtained Obtained
K0020 Ralstonia solanacearum Race 3, Biovar 2 Potato Obtained Obtained Obtained
K0021 Ralstonia solanacearum Race 1 Ginger Obtained Obtained Obtained


K0022 Ralstonia solanacearum Race 4 Ginger Obtained Obtained Obtained
K0023 Ralstonia solanacearum Race 4 Ginger NotObtained N/A Obtained
K0024 Ralstonia solanacearum Race 4 Ginger Obtained NotObtained Obtained
K0097 Ralstonia solanacearum Race 1 Pepper Obtained NotObtained Obtained
K0098 Ralstonia solanacearum Race 1 Tomato Obtained NotObtained Obtained
K0099 Ralstonia solanacearum Race 1 Tomato Obtained NotObtained Obtained
K0100 Ralstonia solanacearum Race 1 Tomato Obtained NotObtained Obtained
K0101 Ralstonia solanacearum Race 1 Peanut Obtained NotObtained Obtained
K0102 Ralstonia solanacearum Race 1 Peanut Obtained Obtained Obtained
K0103 Ralstonia solanacearum Race 1 Peanut NotObtained N/A NotObtained
K0104 Ralstonia solanacearum Race 1 Peanut Obtained Obtained Obtained
K0105 Ralstonia solanacearum Race 1 Peanut Obtained Obtained Obtained
K0106 Ralstonia solanacearum Race 1 Peanut Obtained Obtained Obtained
K0107 Ralstonia solanacearum Race 1 Peanut Obtained Obtained Obtained
K0108 Ralstonia solanacearum Race 1 Bell Pepper Obtained NotObtained Obtained
K0109 Ralstonia solanacearum Race 3, Biovar 2 Potato Obtained N/A NotObtained
K0110 Ralstonia solanacearum Race 1, Biovar 3 Potato NotObtained NotObtained Obtained
K0111 Ralstonia solanacearum Race 1, Biovar 4 Olive Obtained NotObtained Obtained
K0112 Ralstonia solanacearum Race 1, Biovar 1 Tobacco Obtained Obtained Obtained
K0113 Ralstonia solanacearum Race 1, Biovar 4 Ginger Obtained Obtained Obtained
K0114 Ralstonia solanacearum Race 1, Biovar 4 Ginger NotObtained Obtained Obtained
K0115 Ralstonia solanacearum Race 1, Biovar 1 Potato Obtained N/A Obtained
K0116 Ralstonia solanacearum Race 1, Biovar 3 Tobacco NotObtained N/A Obtained
K0117 Ralstonia solanacearum Race 3, Biovar 2 Potato Obtained NotObtained NotObtained
K0118 Ralstonia solanacearum Race 1, Biovar 3 Potato NotObtained Obtained Obtained
K0119 Ralstonia solanacearum Race 1, Biovar 4 Tobacco Obtained Obtained Obtained
K0120 Ralstonia solanacearum Race 1, Biovar 1 Tobacco Obtained N/A Obtained


K0121 Ralstonia solanacearum Race 1, Biovar 3 Tomato Obtained NotObtained Obtained
K0122 Ralstonia solanacearum Race 1, Biovar 3 Potato Obtained NotObtained Obtained
K0123 Ralstonia solanacearum Race 1, Biovar 3 Eupatorium odoratum Obtained Obtained Obtained
K0124 Ralstonia solanacearum Race 1, Biovar 5 Mulberry Obtained NotObtained Obtained
K0125 Ralstonia solanacearum Race 1, Biovar 1 Melampodium perfoliatum Obtained Obtained Obtained
K0126 Ralstonia solanacearum Race 1, Biovar 4 Mulberry Obtained Obtained Obtained
K0127 Ralstonia solanacearum Race 1, Biovar 1 Tomato Obtained Obtained Obtained
K0128 Ralstonia solanacearum Race 1, Biovar 4 Ginger Obtained Obtained Obtained
K0129 Ralstonia solanacearum Race 1, Biovar 3 Olive NotObtained N/A NotObtained
K0130 Ralstonia solanacearum Race 1, Biovar 1 Tobacco NotObtained Obtained Obtained
K0131 Ralstonia solanacearum Race 1, Biovar 4 Potato Obtained N/A Obtained
K0132 Ralstonia solanacearum Race 1, Biovar 5 Mulberry NotObtained Obtained Obtained
K0133 Ralstonia solanacearum Race 1 Peanut Obtained NotObtained Obtained
K0134 Ralstonia solanacearum Race 1 Peanut Obtained Obtained Obtained
K0135 Ralstonia solanacearum Race 1 Peanut Obtained Obtained Obtained
K0136 Ralstonia solanacearum Race 1 Peanut Obtained Obtained Obtained
K0137 Ralstonia solanacearum Race 1 Peanut Obtained Obtained Obtained
K0138 Ralstonia solanacearum Race 1 Peanut Obtained Obtained Obtained
K0139 Ralstonia solanacearum Race 1 Peanut NotObtained Obtained Obtained
K0140 Ralstonia solanacearum Race 1 Peanut NotObtained Obtained Obtained
K0141 Ralstonia solanacearum Race 1 Peanut NotObtained Obtained Obtained
K0142 Ralstonia solanacearum Race 1 Peanut Obtained Obtained Obtained
K0143 Ralstonia solanacearum Race 1 Sweet Potato Obtained NotObtained Obtained
K0144 Ralstonia solanacearum Race 1 Ampalaya Obtained Obtained Obtained
K0145 Ralstonia solanacearum Race 1 Tomato Obtained NotObtained Obtained
K0146 Ralstonia solanacearum Race 1 Peanut Obtained Obtained NotObtained


K0147 Ralstonia solanacearum Race 2 Heliconia Obtained Obtained Obtained
K0148 Ralstonia solanacearum Race 1 Potato Obtained Obtained Obtained
K0149 Ralstonia solanacearum Race 1 Squash Obtained Obtained Obtained
K0150 Ralstonia solanacearum Race 2 Abaca Obtained Obtained Obtained
K0151 Ralstonia solanacearum Race 2 Heliconia NotObtained Obtained Obtained
K0152 Ralstonia solanacearum Race 2 Heliconia Obtained Obtained Obtained
K0153 Ralstonia solanacearum Race 2 Heliconia NotObtained Obtained Obtained
K0154 Ralstonia solanacearum Race 2 Heliconia NotObtained Obtained Obtained
K0155 Ralstonia solanacearum Race 2 Heliconia Obtained Obtained Obtained
K0156 Ralstonia solanacearum Race 2 Heliconia Obtained Obtained Obtained
K0157 Ralstonia solanacearum Race 2 Heliconia Obtained N/A Obtained
K0158 Ralstonia solanacearum Race 4 Ginger Obtained Obtained NotObtained
K0159 Ralstonia solanacearum Race 2 Heliconia Obtained Obtained Obtained
K0160 Ralstonia solanacearum Race 2 Banana Obtained Obtained Obtained
K0161 Ralstonia solanacearum Race 2 Banana Obtained Obtained Obtained
K0162 Ralstonia solanacearum Race 2 Banana Obtained Obtained Obtained
K0163 Ralstonia solanacearum Race 2 Banana Obtained N/A Obtained
K0164 Ralstonia solanacearum Race 2 Banana Obtained Obtained Obtained
K0165 Ralstonia solanacearum Race 2 Banana Obtained Obtained Obtained
K0166 Ralstonia solanacearum Race 2 Banana Obtained NotObtained Obtained
K0167 Ralstonia solanacearum Race 2 Banana Obtained N/A Obtained
K0168 Ralstonia solanacearum Race 2 Banana Obtained Obtained Obtained
K0169 Ralstonia solanacearum Blood Disease Banana Obtained Obtained Obtained
K0170 Ralstonia solanacearum Race 2 Heliconia Obtained Obtained Obtained
K0171 Ralstonia solanacearum Race 2, SFR Banana (Cavendish) Obtained Obtained Obtained
K0172 Ralstonia solanacearum Race 2 Banana Obtained Obtained Obtained
K0173 Ralstonia solanacearum Blood Disease Banana Obtained Obtained Obtained


K0174 Ralstonia solanacearum Race 4 Ginger Obtained Obtained Obtained
K0175 Ralstonia solanacearum Race 2 Plantain Obtained Obtained Obtained
K0176 Ralstonia solanacearum Race 2 Banana Obtained Obtained Obtained
K0177 Ralstonia solanacearum Race 2 Banana Obtained Obtained Obtained
K0178 Ralstonia solanacearum Race 2 Banana Obtained Obtained Obtained
K0179 Ralstonia solanacearum Race 2 Banana NotObtained Obtained NotObtained
K0180 Ralstonia solanacearum Race 2 Banana NotObtained Obtained NotObtained
K0181 Ralstonia solanacearum Race 2 Banana Obtained Obtained Obtained
K0182 Ralstonia solanacearum Race 2 Banana Obtained N/A Obtained
K0183 Ralstonia solanacearum Race 2 Banana NotObtained NotObtained Obtained
K0184 Ralstonia solanacearum Race 2 Banana Obtained NotObtained Obtained
K0185 Ralstonia solanacearum Race 2 Banana Obtained Obtained Obtained
K0186 Ralstonia solanacearum Race 2 Banana Obtained Obtained Obtained
K0187 Ralstonia solanacearum Blood Disease Banana Obtained Obtained Obtained
K0188 Ralstonia solanacearum Blood Disease Banana Obtained Obtained Obtained
K0189 Ralstonia solanacearum Blood Disease Banana Obtained Obtained Obtained
K0190 Ralstonia solanacearum Blood Disease Banana Obtained Obtained Obtained
K0191 Ralstonia solanacearum Blood Disease Banana Obtained Obtained Obtained
K0192 Ralstonia solanacearum Blood Disease Banana Obtained NotObtained Obtained
K0673 Ralstonia solanacearum Race 3, Biovar 2 Potato NotObtained Obtained
K0674 Ralstonia solanacearum Race 2 Abaca Obtained Obtained
K0675 Ralstonia solanacearum unknown Race Geranium Obtained Obtained
K0676 Ralstonia solanacearum unknown Race Geranium Obtained Obtained
K0677 Ralstonia solanacearum unknown Race Geranium Obtained Obtained
K0678 Ralstonia solanacearum Blood Disease Banana Obtained Obtained
K0679 Ralstonia solanacearum Race 4 Ginger Obtained Obtained
K0680 Ralstonia solanacearum Race 4 Ginger Obtained Obtained

K0681  Ralstonia solanacearum  Race 4  Ginger  N/A  Obtained
K0682  Ralstonia solanacearum  Race 4  Ginger  N/A  Not Obtained
K0683  Ralstonia solanacearum  Race 4  Ginger  Obtained  Obtained
K0684  Ralstonia solanacearum  Race 4  Ginger  N/A  Obtained
K0685  Ralstonia solanacearum  Race 4  Ginger  N/A  Obtained
K0686  Ralstonia solanacearum  Race 4  Ginger  Not Obtained  Obtained
K0687  Ralstonia solanacearum  Race 4  Ginger  N/A  Not Obtained
K0688  Ralstonia solanacearum  Race 4  Ginger  Obtained  Obtained
K0689  Ralstonia solanacearum  Race 4  Ginger  Not Obtained  Obtained
K0690  Ralstonia solanacearum  Race 1  Tomato  Not Obtained  Obtained
K0691  Ralstonia solanacearum  Race 4  Ginger  Obtained  Obtained
K0692  Ralstonia solanacearum  Race 4  Ginger  Obtained  Obtained
K0693  Ralstonia solanacearum  Race 4  Ginger  Obtained  Obtained
K0694  Ralstonia solanacearum  Race 4  Ginger  Obtained  Obtained
K0695  Ralstonia solanacearum  Race 4  Ginger  N/A  Obtained
K0696  Ralstonia solanacearum  Race 4  Ginger  Obtained  Obtained
K0697  Ralstonia solanacearum  Race 4  Ginger  Obtained  Obtained
K0698  Ralstonia solanacearum  Race 4  Ginger  Obtained  Obtained
K0699  Ralstonia solanacearum  Race 4  Ginger  Obtained  Obtained
K0700  Ralstonia solanacearum  Race 4  Ginger  Obtained  Obtained
K0701  Ralstonia solanacearum  Race 4  Ginger  Not Obtained  Obtained
K0702  Ralstonia solanacearum  Race 4  Ginger  Obtained  Obtained
K0703  Ralstonia solanacearum  Race 4  Ginger  Obtained  Obtained
K0704  Ralstonia solanacearum  Race 4  Ginger  Not Obtained  Obtained
K0705  Ralstonia solanacearum  Race 4  Ginger  Obtained  Obtained
K0708  Ralstonia solanacearum  Race 2  Banana (Chato)  Obtained  Obtained
K0709  Ralstonia solanacearum  Race 2  Banana (Cavendish)  Not Obtained  Obtained
K0710  Ralstonia solanacearum  Race 2  Banana (Cavendish)  N/A  Not Obtained
K0711  Ralstonia solanacearum  Race 2  Banana (Cavendish)  Not Obtained  Obtained
K0712  Ralstonia solanacearum  Race 2  Banana (Cavendish)  N/A  Not Obtained
K0713  Ralstonia solanacearum  Race 2  Banana (Cavendish)  N/A  Not Obtained
K0714  Ralstonia solanacearum  Blood Disease  Banana  Obtained  Obtained
K0715  Ralstonia solanacearum  Blood Disease  Banana  Obtained  Obtained
K0716  Ralstonia solanacearum  Race 4  Ginger  Obtained  Obtained
K0717  Ralstonia solanacearum  Race 4  Ginger  Obtained  Obtained
K0718  Ralstonia solanacearum  Race 4  Ginger  Obtained  Obtained
K0719  Ralstonia solanacearum  Race 4  Ginger  Obtained  Obtained
K0720  Ralstonia solanacearum  Race 4  Ginger  Obtained  Obtained
K0721  Ralstonia solanacearum  Race 4  Ginger  Obtained  Obtained
K0722  Ralstonia solanacearum  Race 4  Ginger  Obtained  Obtained
K0723  Ralstonia solanacearum  Race 4  Ginger  Obtained  Obtained
K0724  Ralstonia solanacearum  Race 4  Ginger  Obtained  Obtained
K0725  Ralstonia solanacearum  Race 4  Ginger  Obtained  Obtained
K0726  Ralstonia solanacearum  Race 4  Ginger  Obtained  Obtained
K0727  Ralstonia solanacearum  Race 4  Ginger  Obtained  Obtained
K0728  Ralstonia solanacearum  Race 4  Ginger  Obtained  Obtained
K0729  Ralstonia solanacearum  Race 4  Ginger  Not Obtained  Obtained
K0730  Ralstonia solanacearum  Race 4  Ginger  N/A  Obtained
K0731  Ralstonia solanacearum  Race 4  Ginger  Obtained  Obtained
K0732  Ralstonia solanacearum  Race 4  Ginger  Obtained  Obtained
K0734  Ralstonia solanacearum  Race 2  Heliconia  Obtained  Obtained

K0735  Ralstonia solanacearum  Race 4  Ginger  Obtained  Obtained
K0736  Ralstonia solanacearum  unknown Race  Geranium  Obtained  Obtained
K0737  Ralstonia solanacearum  Blood Disease  Banana  Obtained  Obtained
K0739  Ralstonia solanacearum  Race 1  Tomato  N/A  Obtained
K0740  Ralstonia solanacearum  Race 1  Tomato  Obtained  Obtained
K0741  Ralstonia solanacearum  Race 2  Heliconia  Obtained  Obtained
K0742  Ralstonia solanacearum  Race 2  Banana (Cavendish)  Obtained  Obtained
K0743  Ralstonia solanacearum  Race 2  Banana (Cavendish)  Obtained  Obtained
K0744  Ralstonia solanacearum  Race 2  Banana (Cavendish)  Obtained  Obtained
K0745  Ralstonia solanacearum  Race 3, Biovar 2  Potato  Obtained  Obtained
K0746  Ralstonia solanacearum  Race 4  Ginger (Kahili)  N/A  Not Obtained
K0747  Ralstonia solanacearum  Blood Disease  Banana-goroho  Obtained  Obtained
K0748  Ralstonia solanacearum  Blood Disease  Banana raja pseudostem  Obtained  Obtained
K0749  Ralstonia solanacearum  Blood Disease  Banana kapok  N/A  Not Obtained
K0750  Ralstonia solanacearum  Blood Disease  Banana kapok  Obtained  Obtained
K0751  Ralstonia solanacearum  Race 1  Eggplant  Obtained  Not Obtained
K0752  Ralstonia solanacearum  Race 1  Eggplant  N/A  Obtained
K0753  Ralstonia solanacearum  Race 4  Ginger  N/A  Not Obtained
K0754  Ralstonia solanacearum  Race 1  Tomato  N/A  Obtained
K0755  Ralstonia solanacearum  Race 1  Chili  N/A  Not Obtained
K0756  Ralstonia solanacearum  Race 1  Eggplant  N/A  Not Obtained
K0757  Ralstonia solanacearum  Race 1  Chilli  Obtained  Obtained
K0758  Ralstonia solanacearum  Race 1  Chilli  Obtained  Obtained
K0759  Ralstonia solanacearum  Blood Disease  Banana  Obtained  Obtained
K0760  Ralstonia solanacearum  Blood Disease  Banana kapok  Obtained  Obtained

K0761  Ralstonia solanacearum  unknown Race  Geranium  Obtained  Obtained
K0762  Ralstonia solanacearum  unknown Race  Geranium  Obtained  Obtained
K0763  Ralstonia solanacearum  unknown Race  Geranium  Obtained  Obtained
K0764  Ralstonia solanacearum  unknown Race  Geranium  Obtained  Obtained
K0765  Ralstonia solanacearum  unknown Race  Geranium  Obtained  Obtained
K0766  Ralstonia solanacearum  Race 4  Ginger  Not Obtained  Obtained
K0767  Ralstonia solanacearum  Race 1  Tomato  Obtained  Obtained
K0768  Ralstonia solanacearum  Race 1  Tomato  Obtained  Obtained
K0769  Ralstonia solanacearum  Race 3, Biovar 2  Potato  Not Obtained  Obtained
K0770  Ralstonia solanacearum  Race 3, Biovar 2  Potato  Obtained  Obtained
K0771  Ralstonia solanacearum  Race 3, Biovar 2  Potato  Obtained  Obtained
K0772  Ralstonia solanacearum  Race 3, Biovar 2  Potato  Obtained  Obtained
K0773  Ralstonia solanacearum  Race 3, Biovar 2  Potato  Obtained  Obtained
K0774  Ralstonia solanacearum  Race 3, Biovar 2  Potato  Obtained  Obtained
K0775  Ralstonia solanacearum  Race 3, Biovar 2  Potato  N/A  Not Obtained
K0776  Ralstonia solanacearum  Race 3, Biovar 2  Potato  Obtained  Obtained
K0777  Ralstonia solanacearum  Race 3, Biovar 2  Potato  Obtained  Obtained
K0778  Ralstonia solanacearum  Race 3, Biovar 2  Potato  Obtained  Obtained
K0779  Ralstonia solanacearum  Race 3, Biovar 2  Solanum nigrum  Obtained  Not Obtained
K0780  Ralstonia solanacearum  Race 3, Biovar 2  Potato  Obtained  Obtained
K0781  Ralstonia solanacearum  Race 3, Biovar 2  Potato  N/A  Obtained
K0782  Ralstonia solanacearum  Race 3, Biovar 2  Potato  N/A  Obtained
K0783  Ralstonia solanacearum  Race 3, Biovar 2  Potato  N/A  Not Obtained
K0784  Ralstonia solanacearum  Race 3, Biovar 2  Potato  N/A  Obtained
K0785  Ralstonia solanacearum  Race 3, Biovar 2  Potato  N/A  Obtained
K0786  Ralstonia solanacearum  Aberrant Biovar 2  Tomato  N/A  Obtained

K0787  Ralstonia solanacearum  Race 3, Biovar 2  Tomato  N/A  Obtained
K0788  Ralstonia solanacearum  Race 3, Biovar 2  Potato  N/A  Obtained
K0789  Ralstonia solanacearum  Race 3, Biovar 2  Potato  N/A  Obtained
K0790  Ralstonia solanacearum  Race 3, Biovar 2  Potato  N/A  Obtained
K0791  Ralstonia solanacearum  Race 3, Biovar 2  Potato  N/A  Obtained
K0792  Ralstonia solanacearum  Race 3, Biovar 2  Potato  N/A  Not Obtained
K0793  Ralstonia solanacearum  Race 3, Biovar 2  Potato  N/A  Obtained
K0794  Ralstonia solanacearum  Race 3, Biovar 2  Potato  N/A  Obtained
K0795  Ralstonia solanacearum  Race 3, Biovar 2  Tomato  N/A  Obtained
K0796  Ralstonia solanacearum  Race 3, Biovar 2  Potato  N/A  Obtained
K0797  Ralstonia solanacearum  Race 3, Biovar 2  Potato  N/A  Obtained
K0798  Ralstonia solanacearum  Race 3, Biovar 2  Potato  N/A  Obtained
K0799  Ralstonia solanacearum  Race 3, Biovar 2  Potato  N/A  Not Obtained
K0800  Ralstonia solanacearum  Race 3, Biovar 2  Potato  N/A  Obtained
K0801  Ralstonia solanacearum  Race 3, Biovar 2  Potato  N/A  Not Obtained
K0802  Ralstonia solanacearum  Race 3, Biovar 2  Tomato  N/A  Not Obtained
K0803  Ralstonia solanacearum  Race 3, Biovar 2  Tomato  N/A  Obtained
K0804  Ralstonia solanacearum  Race 3, Biovar 2  Solanum phureja  N/A  Obtained
K0805  Ralstonia solanacearum  Race 3, Biovar 2  Potato  N/A  Obtained
K0806  Ralstonia solanacearum  Race 3, Biovar 2  Geranium  N/A  Obtained
K0807  Ralstonia solanacearum  Blood Disease  Banana  N/A  Obtained
K0808  Ralstonia solanacearum  Blood Disease  Banana  N/A  Obtained
K0809  Ralstonia solanacearum  Blood Disease  Banana  N/A  Obtained

Supplemental Table SD.1. Enterobacteriaceae strains from the PBC used in this study and ITS/RIF/ADE amplicon and sequencing success.

ITS RIF sequence ADE Sequence RIF PCR (Obtained/ PCR (Obtained/ at 51°C Not (Present/ Not (Present/ K Number Genus Species Host Location Obtained) Absent) Obtained) Absent) K0481 Dickeya sp. Philodendron Hawaii, USA P Obtained K0483 Dickeya sp. Pineapple Hawaii, USA Obtained P Obtained K0484 Dickeya sp. Pineapple Hawaii, USA Obtained P Obtained K0485 Dickeya sp. Pineapple Hawaii, USA Obtained P Obtained K0486 Dickeya sp. Pineapple Hawaii, USA P Obtained K0487 Dickeya sp. Pineapple Hawaii, USA Obtained P Obtained K0488 Dickeya sp. Pineapple Hawaii, USA P Obtained K0489 Dickeya sp. Pineapple Hawaii, USA Obtained P Obtained K0490 Dickeya sp. Pineapple Hawaii, USA Obtained P Obtained K0491 Dickeya sp. Pineapple Hawaii, USA Obtained P Obtained K0492 Dickeya sp. Pineapple Hawaii, USA Obtained P Obtained Not

K0493 Dickeya sp. Pineapple Hawaii, USA Obtained P Obtained K0494 Dickeya sp. Irrigation water Hawaii, USA P Obtained K0495 Dickeya sp. Pineapple Hawaii, USA Obtained P Obtained K0496 Dickeya sp. Pineapple Hawaii, USA Obtained P Obtained K0497 Dickeya sp. Pineapple Hawaii, USA Obtained P Obtained K0498 Dickeya sp. Pineapple Hawaii, USA Obtained P Obtained K0499 Dickeya sp. Pineapple Hawaii, USA Obtained P Obtained K0500 Dickeya sp. Pineapple Hawaii, USA Obtained A Obtained

K0501 Dickeya sp. Pineapple Hawaii, USA Obtained P Obtained K0502 Dickeya sp. Pineapple Hawaii, USA Obtained P Obtained K0503 Dickeya sp. Pineapple Hawaii, USA P Obtained K0504 Dickeya sp. Pineapple Hawaii, USA Obtained P Obtained Not

K0505 Dickeya sp. Maize Missouri, USA Obtained P Obtained Not

K0519 Dickeya sp. Aglaonema sp. Hawaii, USA A Obtained A Not

K0520 Dickeya sp. Aglaonema sp. Hawaii, USA A Obtained A Not

K0521 Dickeya sp. Aglaonema sp. Hawaii, USA A Obtained A K0523 Dickeya sp. Aglaonema sp. Hawaii, USA A Obtained K0531 Dickeya sp. Pineapple Hawaii, USA P Obtained K0532 Dickeya sp. Pineapple Hawaii, USA Obtained P Obtained K0533 Dickeya sp. Pineapple Hawaii, USA P Obtained K0534 Dickeya sp. Pineapple Hawaii, USA P Obtained K0535 Dickeya sp. Pineapple Hawaii, USA Obtained P Obtained K0536 Dickeya sp. Pineapple Hawaii, USA P Obtained K0537 Dickeya sp. Pineapple Hawaii, USA P Obtained K0538 Dickeya sp. Pineapple Hawaii, USA Obtained P Obtained K0539 Dickeya sp. Pineapple Hawaii, USA Obtained P Obtained K0540 Dickeya sp. Pineapple Hawaii, USA P Obtained K0541 Dickeya sp. Pineapple Hawaii, USA P Obtained K0542 Dickeya sp. Pineapple Hawaii, USA Obtained P Obtained Chrysanthemum Not

K0543 Dickeya chrysanthemi morifolium USA P Obtained P Pelargonium

K0544 Dickeya dadantii capitatum Comoros P Obtained K0545 Dickeya sp. Pineapple Malaysia Obtained P Obtained

Carnation (Dianthus K0546 Dickeya dianthicola caryophyllus) UK P Obtained K0547 Dickeya dieffenbachiae Dieffenbachia sp. USA P Obtained Not

K0548 Dickeya paradisiaca Musa paradisiaca Colombia A Obtained A K0549 Dickeya sp. Pineapple Malaysia Obtained P Obtained K0550 Dickeya zeae Maize USA Obtained P Obtained K0551 Dickeya zeae Pineapple Martinique P Obtained K0552 Dickeya sp. Irrigation water Hawaii, USA P Obtained K0553 Dickeya sp. Irrigation water Hawaii, USA Obtained P Obtained K0554 Dickeya sp. Irrigation water Hawaii, USA Obtained P Obtained K0555 Dickeya sp. Irrigation water Hawaii, USA P Obtained K0556 Dickeya sp. Irrigation water Hawaii, USA Obtained P Obtained K0557 Dickeya sp. Irrigation water Hawaii, USA P Obtained K0558 Dickeya sp. Irrigation water Hawaii, USA Obtained P Obtained K0559 Dickeya sp. Irrigation water Hawaii, USA P Obtained K0560 Dickeya sp. Irrigation water Hawaii, USA P Obtained K0561 Dickeya sp. Irrigation water Hawaii, USA Obtained P Obtained K0562 Dickeya sp. Irrigation water Hawaii, USA Obtained P Obtained K0563 Dickeya sp. Irrigation water Hawaii, USA P Obtained Not

K0568 Dickeya sp. Lettuce Hawaii, USA P Obtained Not

K0575 Dickeya sp. Ornamental Florida, USA P Obtained K0576 Dickeya sp. Ornamental Florida, USA P Obtained Papaya (Carica Not

K0524 Pantoea agglomerans papaya) Hawaii, USA P Obtained P Papaya (Carica Not

K0525 Pantoea agglomerans papaya) Hawaii, USA A Obtained P K0526 Pantoea agglomerans Papaya (Carica Hawaii, USA P Not A

papaya) Obtained Papaya (Carica

K0527 Pantoea agglomerans papaya) Hawaii, USA A Obtained P Papaya (Carica Not

K0528 Pantoea agglomerans papaya) Hawaii, USA A Obtained P Not

K0529 Pantoea agglomerans Cabbage Hawaii, USA A Obtained P Not

K0530 Pantoea agglomerans Rice Belgium A Obtained A Not

K0564 Pantoea agglomerans Ornamental Hawaii, USA A Obtained P Not

K0565 Pantoea agglomerans Ornamental Hawaii, USA A Obtained K0482 Pectobacterium carotovorum Aglaonema sp. Hawaii, USA Obtained P Obtained K0506 Pectobacterium carotovorum Aglaonema sp. Hawaii, USA A Obtained K0507 Pectobacterium sp. Irrigation water Hawaii, USA A Obtained K0508 Pectobacterium sp. Irrigation water Hawaii, USA A Obtained Not

K0509 Pectobacterium carotovorum Pineapple Hawaii, USA A Obtained P Not

K0510 Pectobacterium atrosepticum Potato Netherlands A Obtained A Not

K0511 Pectobacterium atrosepticum Potato Netherlands A Obtained A Papaya (Carica Not

K0512 Pectobacterium carotovorum papaya) Hawaii, USA A Obtained A Not

K0513 Pectobacterium carotovorum Aglaonema sp. Hawaii, USA A Obtained A Not

K0514 Pectobacterium carotovorum Aglaonema sp. Hawaii, USA A Obtained A Not

K0515 Pectobacterium carotovorum Aglaonema sp. Hawaii, USA A Obtained A K0516 Pectobacterium carotovorum Aglaonema sp. Hawaii, USA A Obtained P K0517 Pectobacterium carotovorum Aglaonema sp. Hawaii, USA A Obtained K0518 Pectobacterium sp. Aglaonema sp. Hawaii, USA A Obtained

Not

K0522 Pectobacterium sp. Aglaonema sp. Hawaii, USA A Obtained P Not

K0566 Pectobacterium atrosepticum Potato Ohio, USA A Obtained A California, Not

K0567 Pectobacterium carotovorum Pepper USA A Obtained A Colorado,

K0569 Pectobacterium atrosepticum Potato USA A Obtained Colorado,

K0570 Pectobacterium atrosepticum Potato USA A Obtained Colorado, Not

K0571 Pectobacterium carotovorum Potato USA A Obtained Colorado,

K0572 Pectobacterium atrosepticum Potato USA A Obtained Colorado,

K0573 Pectobacterium atrosepticum Potato USA A Obtained Colorado, Not

K0574 Pectobacterium carotovorum Potato USA A Obtained P

References

Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ (1990) Basic local alignment search tool. J Mol Biol 215:403-410.

Alvarez AM (2004) Integrated approaches for detection of plant pathogenic bacteria and diagnosis of bacterial diseases. Annu Rev Phytopathol 42:339-366.

Battistuzzi FU, Feijo A, Hedges SB (2004) A genomic timescale of prokaryote evolution: insights into the origin of methanogenesis, phototrophy, and the colonization of land. BMC Evol Biol 4:44. doi:10.1186/1471-2148-4-44.

Beiko RG, Harlow TJ, Ragan MA (2005) Highways of gene sharing in prokaryotes. PNAS 102:14332-14337.

Benson DA, Boguski MS, Lipman DJ, Ostell J, Ouellette BF, et al., (1999) GenBank. Nucleic Acids Res 27:12-17.

Bent AF, Mackey D (2007) Elicitors, Effectors and R Genes: The new paradigm and a lifetime supply of questions. Annu Rev Phytopathol 45:399-436.

Bhattacharyya A, Stilwagen S, Ivanova N, D'Souza M, Bernal A, et al., (2002) Whole-genome comparative analysis of three phytopathogenic Xylella fastidiosa strains. PNAS 99:12403-12408.

Brenner DJ, Krieg NR, Staley JT, Garrity GM, et al., (2005) Bergey's Manual of Systematic Bacteriology, 2nd ed., vol. 2, The Proteobacteria. Part C, The Alpha-, Beta-, Delta-, and Epsilon Proteobacteria. New York: Springer. 996 p.

Brown K (2001) Florida fights to stop citrus canker. Science 292:2275-2276.

Brunings AM, Gabriel D (2003) Xanthomonas citri: breaking the surface. Molecular Plant Pathology 4:141-157.

Castillo JA, Greenberg JT (2007) Evolutionary dynamics of Ralstonia solanacearum. Appl Environ Microbiol 73:1225-1238.

Chen J, Xie G, Han S, Chertkov O, Sims D, et al., (2010) Whole genome sequences of two Xylella fastidiosa strains (M12 and M23) causing almond leaf scorch disease in California. J Bacteriol 192:4543.

Chenna R, Sugawara H, Koike T, Lopez R, Gibson TJ, Higgins DG, et al., (2003) Multiple sequence alignment with the Clustal series of programs. Nucleic Acids Res 31:3497-3500.

Cianciotto NP (2005) Type II secretion: a protein secretion system for all seasons. Trends in Microbiology 13:581-588.

Cole JR, Wang Q, Cardenas E, Fish J, Chai B, et al., (2009) The Ribosomal Database Project: improved alignments and new tools for rRNA analysis. Nucleic Acids Res 37:D141-D145.

Comas I, Moya A, Azad RK, Lawrence JG, Gonzalez-Candelas F (2006) The Evolutionary Origin of Xanthomonadales Genomes and the Nature of the Horizontal Gene Transfer Process. Mol Biol Evol 23:2049-2057.

Crossman LC, Gould VC, Dow JM, Vernikos GS, Okazaki A, et al., (2008) The complete genome, comparative and functional analysis of Stenotrophomonas maltophilia reveals an organism heavily shielded by drug resistance determinants. Genome Biol 9:R74.

d’Enfert C, Chapon C, Pugsley AP (1987) Export and secretion of the lipoprotein pullulanase by Klebsiella pneumoniae. Mol Microbiol 1:107-116.

Darling AE, Mau B, Perna NT (2010) progressiveMauve: Multiple Genome Alignment with Gene Gain, Loss, and Rearrangement. PLoS One 5(6):e11147.

Da Lage JL, Feller G, Janecek S (2003) Horizontal gene transfer from Eukarya to Bacteria and domain shuffling: the alpha-amylase model. Cell and mol life sciences 61:97-109.

Doddapaneni H, Yao J, Lin H, Walker MA, Civerolo EL (2006) Analysis of the genome-wide variations among multiple strains of the plant pathogenic bacterium Xylella fastidiosa. BMC Genomics 7:225.

Doolittle WF (1999) Phylogenetic Classification and the Universal Tree. Science 284:2124-2128.

Dunning Hotopp JC, Clark ME, Oliveira DCSG, Foster JM, Fischer P, et al., (2007) Widespread lateral gene transfer from intracellular bacteria to multicellular eukaryotes. Science 317:1753-1756.

Dye DW, Lelliott RA (1974) Genus 11. Xanthomonas. Dowson 1939, 187, p. 243-249. In R. E. Buchanan and N. E. Gibbons (ed.), Bergey’s manual of determinative bacteriology, 8th ed. The Williams & Wilkins Co., Baltimore.

Ewing B, Green P (1998a) Basecalling of automated sequencer traces using phred. II. Error probabilities. Genome Res 8:186-194.

Ewing B, Hillier L, Wendl M, Green P (1998b) Basecalling of automated sequencer traces using phred. I. Accuracy assessment. Genome Res 8:175-185.

Ewing A (2008) A study on the phylogenetics of gene transfer: from pathways to kingdoms. University of Hawaii at Manoa, Molecular Biosciences and Bioengineering. MS Thesis.

Fegan M, Taghavi M, Sly LI, Hayward AC (1998) Phylogeny, diversity, and molecular diagnostics of Ralstonia solanacearum. In: Prior P, Allen C, Elphinstone JG, editors. Bacterial Wilt Disease: Molecular and Ecological Aspects. Berlin: Springer-Verlag. pp. 19–33.

Filloux A (2004) The underlying mechanisms of type II protein secretion. Biochim Biophys Acta 1694:163–179.

García-Martínez J, Acinas SG, Antón AI, Rodríguez-Valera F (1999) Use of the 16S–23S ribosomal genes spacer region in studies of prokaryotic diversity. J Microbiol Methods 36:55-64.

Garcia-Vallve S, Guzman E, Montero MA, Romeu A (2003) HGT-DB: a database of putative horizontally transferred genes in prokaryotic complete genomes. Nucleic Acids Res 31:187-189.

Garrity GM, Bell JA, Lilburn T (2005) Phylum XIV. Proteobacteria phyl. nov. In: Brenner DJ, Krieg NR, Staley JT, editors. Bergey's manual of systematic bacteriology. New York: Springer. pp. 1–912.

Gogarten JP, Doolittle WF, Lawrence JG (2002) Prokaryotic evolution in the light of gene transfer. Mol Biol Evol 19:2226-2238.

Gottig N, Garavaglia BS, Daurelio LD, Valentine A, Gehring C, et al., (2008) Xanthomonas axonopodis pv. citri uses a plant natriuretic peptide-like protein to modify host homeostasis. PNAS 105:18631-18636.

Gottwald T, Hughes G, Graham JH, Sun X, Riley T (2001) The citrus canker epidemic in Florida: the scientific basis of regulatory eradication policy for an invasive species. Phytopathol 91:30-34.

Gottwald TR, Graham JH, Schubert TS (2002) Citrus canker: the pathogen and its impact. Online. Plant Health Progress doi: 10.1094/PHP-2002-0812-01-RV.

Gurlebeck D, Thieme F, Bonas U (2005) Type III effector proteins from the plant pathogen Xanthomonas and their role in the interaction with the host plant. Jour of plant phys 163:233-255.

Gurtler V, Stanisich VA (1996) New approaches to typing and identification of bacteria using the 16S-23S rDNA spacer region. Microbiol 142:3-16.

Hacker J, Kaper JB (1999) The Concept of Pathogenicity Islands. In: Kaper JB, Hacker J, editors. Pathogenicity Islands and Other Mobile Virulence Elements. Amer Soc Microbiol. pp. 1–12.

Hacker J, Kaper JB (2000) Pathogenicity islands and the evolution of microbes. Annu Rev Microbiol 54:641-679.

Han MV, Zmasek CM (2009) phyloXML: XML for evolutionary biology and comparative genomics. BMC Bioinformatics 10:356.

Hall T (1999) BioEdit 7. Nucleic Acids Symp Ser 41:95–98.

Hawks B (2005) Agricultural Bioterrorism Protection Act of 2002: Possession, use, and transfer of biological agents and toxins; final rule. (7 CFR Part 331 and 9 CFR Part 121). Federal Register 70:13241-13292.

Hayward AC (1993) The hosts of Xanthomonas. In: Swings JG, Civerolo EL, editors. Xanthomonas. Chapman and Hall. pp. 1–17.

Hebert PDN, Cywinska A, Ball SL, DeWaard JR (2003) Biological identifications through DNA barcodes. Proc R Soc Lond B 270:313–321.

Henderson IR, Navarro-Garcia F, Nataro JP (1998) The great escape: structure and function of the autotransporter proteins. Trends in Microbiol 6:370-378.

Hsiao W, Wan I, Jones SJ, Brinkman FSL (2003) IslandPath: aiding detection of genomic islands in prokaryotes. Bioinformatics 19:418-420.

Huynen MA, Bork P (1998) Measuring genome evolution. PNAS 95:5849-5856.

Jones JB, Lacy GH, Bouzar H, Stall RE, Schaad NW (2004) Reclassification of the xanthomonads associated with bacterial spot disease of tomato and pepper. Syst Appl Microbiol 27:755-762.

Kang YJ, Cheng J, Mai LJ, Hu J, Piao Z (2010) Multiple copies of 16S rRNA gene affect the restriction patterns and DGGE profile revealed by analysis of genome database. Microbiol 79: 655–662.

Kaguni J (2006) DnaA: Controlling the Initiation of Bacterial DNA Replication and More. Annu Rev Microbiol 60:351–371.

Karlin S (2001) Detecting anomalous gene clusters and pathogenicity islands in diverse bacterial genomes. Trends Microbiol 9:335-343.

Koebnik R, Kruger A, Thieme F, Urban A, Bonas U (2006) Specific binding of the Xanthomonas campestris pv. vesicatoria AraC-Type transcriptional activator HrpX to plant-inducible promoter boxes. J Bacteriol 188:7652-7660.

Kumar S, Dudley J, Nei M, Tamura K (2008) MEGA: A biologist-centric software for evolutionary analysis of DNA and protein sequences. Briefings in Bioinformatics 9: 299-306.

Kurland CG, Canback B, Berg OG (2003) Horizontal gene transfer: A critical view. PNAS 100:9658-9662.

Kurtz S, Phillippy A, Delcher AL, Smoot M, Shumway M, et al., (2004) Versatile and open software for comparing large genomes. Genome Biol 5:R12.

Kwan G (2010) The Complete Genome Sequence of Stenotrophomonas rhizophila DR-63A: Insights Into Bacterial Genome Evolution. M.S. Thesis, University of Hawaii.

de Lamballerie X, Zandotti C, Vignoli C, Bollet C, de Micco P (1992) A one-step microbial DNA extraction method using "Chelex 100" suitable for gene amplification. Res Microbiol 143:785-790.

Lawrence JG, Roth JR (1996) Selfish Operons: Horizontal transfer may drive the evolution of gene clusters. Genetics 143:1843-1860.

Lawrence JG (2002) Gene transfer in Bacteria: Speciation without species? Theor Pop Biol 61:449-460.

Lee BM, Park YJ, Park DS, Kang HW, Kim JG, et al., (2005) The genome sequence of Xanthomonas oryzae pathovar oryzae KACC10331, the bacterial blight pathogen of rice. Nucleic Acids Res 33:577-86.

Lorenz MG, Wackernagel W (1994) Bacterial gene transfer by natural genetic transformation in the environment. Microbiol Rev 58:563-602.

Louws FJ, Rademaker JLW, de Bruijn FJ (1999) The three Ds of PCR-based genomic analysis of phytobacteria: diversity, detection, and disease diagnosis. Annu Rev Phytopathol 37:81-125.

Lu H, Patil P, Van Sluys M-A, White FF, Ryan RP, et al., (2008) Acquisition and Evolution of Plant Pathogenesis–Associated Gene Clusters and Candidate Determinants of Tissue-Specificity in Xanthomonas. PLoS ONE 3:e3828. doi:10.1371/journal.pone.0003828.

Jenkins DM, Kubota R, Dong J, Li Y, Higashiguchi D (2011) Handheld device for real-time, quantitative LAMP-based detection of Salmonella enterica using assimilating probes. Biosens and Bioelec doi:10.1016/j.bios.2011.09.020.

Ma B, Hibbing ME, Kim H, Reedy RM, Yedidia I, et al., (2007) Host range and molecular phylogenies of the soft rot enterobacterial genera Pectobacterium and Dickeya. Phytopathol 97:1150-1163.

Mackiewicz P, Zakrzewska-Czerwinska J, Zawilak A, Dudek MR, Cebrat S (2004) Where does bacterial replication start? Rules for predicting the oriC region. Nucleic Acids Res 32:3781-3791.

Meyer DF, Bogdanove AJ (2009) Genomics-driven advances in Xanthomonas biology. In: Plant Pathogenic Bacteria: Genomics and Molecular Biology. pp. 147-159.

Moran NA, McLaughlin HJ, Sorek R (2009) The dynamics and time scale of ongoing genomic erosion in symbiotic bacteria. Science 323:379-382.

Moreira LM, Almeida NF Jr, Potnis N, Digiampietri LA, Adi SS, et al., (2010) Novel insights into the genomic basis of citrus canker based on the genome sequences of two strains of Xanthomonas fuscans subsp. aurantifolii. BMC Genomics 11:238. doi:10.1186/1471-2164-11-238.

Nassar A, Darrasse A, Lemattre M, Kotoujansky A, Dervin C, et al., (1996) Characterization of Erwinia chrysanthemi by pectinolytic isozyme polymorphism and restriction fragment length polymorphism analysis of PCR-amplified fragments of pel genes. Appl Environ Microbiol 62:2228–2235.

Nembaware V, Seoighe C, Sayed M, Gehring C (2004) A plant natriuretic peptide-like gene in the bacterial pathogen Xanthomonas axonopodis may induce hyper-hydration in the plant host: a hypothesis of molecular mimicry. BMC Evol Biol 4:10.

Normand P, Ponsonnet C, Nesme X, Neyra M, Simonet P (1996) ITS analysis of prokaryotes. In: Akkermans ADL, van Elsas JD, de Bruijn FJ, editors. Molecular Microbial Ecology Manual. Kluwer Academic Publishers. pp. 1-12.

Ochman H, Lawrence JG, Groisman EA (2000) Lateral gene transfer and the nature of bacterial innovation. Nature 405:299-304.

Pallen MJ, Chaudhuri RR, Henderson IR (2003) Genomic analysis of secretion systems. Curr Opinion in Microbiol 6:519-527.

Palleroni NJ, Bradbury JF (1993) Stenotrophomonas, a New Bacterial Genus for Xanthomonas maltophilia (Hugh 1980) Swings et al., 1983. Int J Syst Bacteriol 43:606-609.

Parkinson N, Aritua V, Heeney J, Cowie C, Bew J et al., (2007) Phylogenetic analysis of Xanthomonas species by comparison of partial gyrase B gene sequences. Int J Syst Evol Microbiol 57:2881–2887.

Parkinson N, Cowie C, Heeney J, Stead D (2009) Phylogenetic structure of Xanthomonas determined by comparison of gyrB sequences. Int J Syst Evol Microbiol 59:264–274.

Pascual J, Macián MC, Arahal DR, Garay E, Pujalte MJ (2010) Multilocus sequence analysis of the central clade of the genus Vibrio using the 16S rRNA, recA, pyrH, rpoD, gyrB, rctB and toxR genes. Int J Syst Evol Microbiol 60:154–165.

Petersen TN, Brunak S, von Heijne G, Nielsen H (2011) SignalP 4.0: discriminating signal peptides from transmembrane regions. Nature Methods 8:785-786.

Pieretti I, Royer M, Barbe V, Carrere S, Koebnik R, et al., (2009) The complete genome sequence of Xanthomonas albilineans provides new insights into the reductive genome evolution of xylem-limited Xanthomonadaceae. BMC Genomics 10:616. doi:10.1186/1471-2164-10-616.

Potnis N, Krasileva K, Chow V, Almeida NF, Patil PB, et al., (2011) Comparative genomics reveals diversity among xanthomonads infecting tomato and pepper. BMC Genomics 12:146.

Presting GG (2006) Identification of conserved regions in the plastid genome: implications for DNA barcoding and biological function. Can J Bot 84:1434–1443.

Prior P, Fegan M (2005) Recent developments in the phylogeny and classification of Ralstonia solanacearum. Acta Hort 695:127-136.

Qian W, Jia Y, Ren SX, He YQ, Feng JX, et al., (2005) Comparative and functional genomic analyses of the pathogenicity of phytopathogen Xanthomonas campestris pv. campestris. Genome Res 15:757-767.

Rademaker JLW, Louws FJ, Schultz MH, Rossbach U, Vauterin L, et al., (2005) A comprehensive species to strain taxonomic framework for Xanthomonas. Phytopathol 95:1098-1111.

Richards TA, Soanes DM, Foster PG, Leonard G, Thornton CR, et al., (2009) Phylogenomic analysis demonstrates a pattern of rare and ancient horizontal gene transfer between plants and fungi. The Plant Cell 21:1897-1911.

Ryan RP, Monchy S, Cardinale M, Taghavi S, Crossman L, et al., (2009) The versatility and adaptation of bacteria from the genus Stenotrophomonas. Nat Rev Microbiol 7:514-525.

Salanoubat M, Genin S, Artiguenave F, Gouzy J, Mangenot S, et al., (2002) Genome sequence of the plant pathogen Ralstonia solanacearum. Nature 415:497-502.

Salemi M, Vandamme A-M (2003) The phylogenetic handbook: a practical approach to DNA and protein phylogeny. United Kingdom: Cambridge University Press.

Salzberg SL, Sommer DD, Schatz MC, Phillippy AM, Rabinowicz PD, et al., (2008) Genome sequence and rapid evolution of the rice pathogen Xanthomonas oryzae pv. oryzae PXO99A. BMC Genomics 9:204. doi:10.1186/1471-2164-9-204.

Sanderson MJ, Thorne JL, Wikstrom N, Bremer K (2004) Molecular evidence on plant divergence times. Amer Jour of Bot 91:1656-1665.

Sandkvist M (2001) Biology of type II secretion. Mol Microbiol 40:271-283.

Sandkvist M (2001b) Type II Secretion and Pathogenesis. Infection and Immunity 69:3523-3535.

Santos S, Ochman H (2004) Identification and phylogenetic sorting of bacterial lineages with universally conserved genes and proteins. Environ Microbiol 6:754–759.

Schaad NW, Postnikova E, Lacy G, Sechler A, Agarkova I, et al., (2006) Emended classification of xanthomonad pathogens on citrus. Syst Appl Microbiol 29:690–695.

Schneider KL, Marrero G, Alvarez A, Presting GG (2011) Classification of Plant Associated Bacteria Using RIF, a Computationally Derived DNA Marker. PLoS ONE 6(4):e18496.

Schnoes AM, Brown SD, Dodevski I, Babbitt PC (2009) Annotation error in public databases: misannotation of molecular function in enzyme superfamilies. PLoS Comp Biol doi:10.1371/journal.pcbi.1000605.

da Silva ACR, Ferro JA, Reinach FC, Farah CS, Furlan LR, et al., (2002) Comparison of the genomes of two Xanthomonas pathogens with differing host specificities. Nature 417:459–463.

Siguier P, Filee J, Chandler M (2006) Insertion sequences in prokaryotic genomes. Curr Opinion in Microbiol 9:526-531.

Simpson AJG, Reinach FC, Arruda P, Abreu FA, Acencio M (2000) The genome sequence of the plant pathogen Xylella fastidiosa. Nature 406:151-159.

Snel B, Bork P, Huynen MA (2002) Genomes in flux: The evolution of Archaeal and Proteobacterial gene content. Genome Research 12:17-25.

Studholme DJ, Kemen E, MacLean D, Schornack S, Aritua V, et al., (2010) Genome-wide sequencing data reveals virulence factors implicated in banana Xanthomonas wilt. FEMS Microbiol Lett 310:182-192.

Syvanen M (1994) Horizontal gene transfer: Evidence and Possible Consequences. Annu Rev Genet 28:237-261.

Szczesny R, Jordan M, Schramm C, Schulz S, Cogez V, et al., (2010) Functional characterization of the Xcs and Xps type II secretion systems from the plant pathogenic bacterium Xanthomonas campestris pv vesicatoria. New Phytologist 187:983-1002.

Taghavi S, Barac T, Greenberg B, Borremans B, Vangronsveld J, et al., (2005) Horizontal gene transfer to endogenous endophytic bacteria from Poplar improves phytoremediation of Toluene. Appl Environ Microbiol 71:8500-8505.

Thieme F, Koebnik R, Bekel T, Berger C, Boch J, et al., (2005) Insights into genome plasticity and pathogenicity of the plant pathogenic bacterium Xanthomonas campestris pv. vesicatoria revealed by the complete genome sequence. J Bacteriol 187:7254–7266.

Touchon M, Rocha EPC (2007) Causes of insertion sequences abundance in prokaryotic genomes. Molec Biol and Evol 24:969-981.

Tsuge S, Terashima S, Furutani A, Ochiai H, Oku T, et al., (2005) Effects on promoter activity of base substitutions in the cis-acting regulatory element of HrpXo regulons in Xanthomonas oryzae pv. oryzae. J Bacteriol 187:2308-2314.

Urwin R, Maiden MCJ (2003) Multi-locus sequence typing: a tool for global epidemiology. Trends Microbiol 11:479-487.

Van Sluys MA, Monteiro-Vitorello CB, Camargo LEA, Menck CFM, da Silva ACR, et al., (2002) Comparative genomic analysis of plant-associated bacteria. Annu Rev Phytopathol 40:169-189.

Van Sluys MA, de Oliveira MC, Monteiro-Vitorello CB, Miyaki CY, Furlan LR, et al., (2002) Comparative analysis of the complete genome sequence of Pierce’s disease and citrus variegated chlorosis strains of Xylella fastidiosa. J Bacteriol 185:1018-1026.

Vauterin L, Hoste B, Kersters K, Swings J (1995) Reclassification of Xanthomonas. Int J Syst Bacteriol 45:472–489.

Vitorino L, Chelo I, Bacellar F, Ze-Ze L (2007) Rickettsiae phylogeny: a multigenic approach. Microbiology 153:160-168.

Vorholter F-J, Schneiker S, Goesmann A, Krause L, Bekel T, et al., (2008) The genome of Xanthomonas campestris pv. campestris B100 and its use for the reconstruction of metabolic pathways involved in xanthan biosynthesis. Journal of Biotechnology 134:33-45.

Wilton SD, Lim L, Dye D, Laing N (1997) Bandstab: a PCR-based alternative to cloning PCR products. BioTechniques 22:642–645.

Yamazaki A, Hirata H, Tsuyumu S (2008) HrpG regulates type II secretory proteins in Xanthomonas axonopodis pv. citri. J Gen Plant Pathol 74:138-150.

Young JM, Park DC (2007) Relationships of plant pathogenic enterobacteria based on partial atpD, carA, and recA as individual and concatenated nucleotide and peptide sequences. Syst Appl Microbiol 30:343-354.

Young JM, Park DC, Shearman HM, Fargier E (2008) A multilocus sequence analysis of the genus Xanthomonas. Syst Appl Microbiol 31:366-77.

Xu Z, Hao B (2009) CVTree update: a newly designed phylogenetic study platform using composition vectors and whole genomes. Nucleic Acids Res 37 (Web Server issue):W174-W178.
