Hindawi Publishing Corporation International Journal of Genomics Volume 2015, Article ID 198560, 15 pages http://dx.doi.org/10.1155/2015/198560

Research Article Phylogenomic and Molecular Demarcation of the Core Members of the Polyphyletic Genera , ,andPasteurella

Sohail Naushad, Mobolaji Adeolu, Nisha Goel, Bijendra Khadka, Aqeel Al-Dahwi, and Radhey S. Gupta Department of Biochemistry and Biomedical Sciences, McMaster University, Hamilton, ON, Canada L8N 3Z5

Correspondence should be addressed to Radhey S. Gupta; [email protected]

Received 5 November 2014; Revised 19 January 2015; Accepted 26 January 2015

Academic Editor: John Parkinson

Copyright © 2015 Sohail Naushad et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

The genera Actinobacillus, Haemophilus, and exhibit extensive polyphyletic branching in phylogenetic trees and do not represent coherent clusters of . In this study, we have utilized molecular signatures identified through comparative genomic analyses in conjunction with based and multilocus sequence based phylogenetic analyses to clarify the phylogenetic and taxonomic boundary of these genera. We have identified large clusters of Actinobacillus, Haemophilus, and Pasteurella species which represent the “sensu stricto” members of these genera. We have identified 3, 7, and 6 conserved signature indels (CSIs), which are specifically shared by sensu stricto members of Actinobacillus, Haemophilus, and Pasteurella, respectively. We have also identified two different sets of CSIs that are unique characteristics of the containing genera and , respectively. It is now possible to demarcate the genera Actinobacillus sensu stricto, Haemophilus sensu stricto, and Pasteurella sensu stricto on the basis of discrete molecular signatures. The other members of the genera Actinobacillus, Haemophilus, and Pasteurella that do not fall within the “sensu stricto” clades and do not contain these molecular signatures should be reclassified as other genera. The CSIs identified here also provide useful diagnostic targets for the identification of current and novel members of theindicated genera.

1. Introduction Classification of organisms into these three genera was primarily based on DNA G-C content and a handful of The family Pasteurellaceae, the single constituent family of the phenotypic traits [11]. The phenotypic traits were later found order Pasteurellales, represents a diverse group of commensal nottobecharacteristicofanysinglegenus[12]. Consequently, and pathogenic within the class Gammaproteobacte- the genera Actinobacillus, Haemophilus, and Pasteurella each ria. The family currently contains 19 genera, some of which exhibit extensive polyphyly in subsequent 16S rRNA based are particularly important human and animal [1, phylogenies [12, 13]. Additional studies based on individ- 2]. The genera Haemophilus contains species responsible for ual or concatenated gene sets and DNA-DNA/rRNA-DNA human bacteremia, , acute bacterial , hybridization also support the presence of extensive poly- and the sexually transmitted disease [3–5]; Aggre- phyly within the genera Actinobacillus, Haemophilus, and gatibacter species have been implicated in juvenile periodon- Pasteurella [14–20]. titis [6]; members of the genera Mannheimia, Pasteurella, Extensive work has been undertaken to amend the and Actinobacillus have been implicated in the causation of classification of the genera Actinobacillus, Haemophilus, and shipping fever in cattle, fowl , and pleuropneumonia Pasteurella [1, 5, 9, 18]. New genera have been created to in pigs, respectively [7–9]. house phylogenetically coherent clusters of Actinobacillus, The family Pasteurellaceae was originally proposed as a Haemophilus, and Pasteurella.Thespecies[Actinobacillus] higher level taxonomic grouping of the related pathogenic actinomycetemcomitans,[Haemophilus] aphrophilus,[Hae- genera Actinobacillus, Haemophilus, and Pasteurella [10]. mophilus] paraphrophilus, and [Haemophilus] segnis have 2 International Journal of Genomics been transferred to the genus Aggregatibacter [21]; the using widely available nucleotide sequences of the 16S rDNA, species [Haemophilus] paragallinarum,[Pasteurella] galli- infB (translation initiation factor IF-2), recN (DNA repair narum,[Pasteurella] avium, and [Pasteurella] volantium have protein), and rpoB (DNA-directed RNA polymerase subunit been transferred to the genus [22]; the species beta) genes which have been used, individually or as part [Haemophilus] somnus and [Haemophilus] agni have been of a set, in a number of previous phylogenetic analyses of transferred to the genus Histophilus [23]; and the species the family Pasteurellaceae [15–17, 29, 30]. Gene sequences for [Pasteurella] haemolytica and [Pasteurella] granulomatis have these four genes were obtained for 52 Pasteurellaceae strains, been transferred to the genus Mannheimia [7]. Additionally, representing a large majority of the known Pasteurellaceae some individual species within the genera Actinobacillus, species, and 2 members of from the NCBI Haemophilus, and Pasteurella that do not cluster with other nucleotide database [37]. Species which were missing one of members of their genus in phylogenetic trees have been these four genes or which did not have a gene sequence that moved or proposed to be moved to novel or neighbouring was at least 50% of the length of the full gene were excluded genera (namely, the transfer of the species [Haemophilus] from the analysis. The four genes were individually aligned pleuropneumoniae to the genus Actinobacillus [24], the trans- using MUSCLE [38] and manually concatenated to create a ferofthespecies[Pasteurella] anatis to the genus Gallibac- combined dataset that contained 10 183 nucleotide long align- terium [25],thetransferofthespecies[Pasteurella] trehalosi ments. A maximum-likelihood tree based on 100 bootstrap to the genus [26],thetransferofthespecies[Pas- replicates of this alignment was constructed using MEGA teurella] ureae to the genus Actinobacillus [27], and the pro- 6.0 [39] while employing maximum composite likelihood posed transfer of the species [Haemophilus] ducreyi to a novel substitution model. genus [28]). However, despite these changes, the classification of the genera Actinobacillus, Haemophilus, and Pasteurella 2.2.PasteurellaceaeCoreGenomePhylogeneticTree. A phylo- is still problematic and each genus continues to contain genetic tree of 76 Pasteurellaceae strains, rooted using 7 mem- members which exhibit polyphyletic branching [5, 17–20]. bers of the family , based on the core genome Multiple studies have attempted to define a core group of of the family Pasteurellaceae was created for this study. The species which cluster around the nomenclatural type species core set of Pasteurellaceae proteins were identified using of Actinobacillus, Haemophilus, or Pasteurella as the only true the UCLUST algorithm [55] to identify widely distributed members of these genera (i.e., sensu stricto)[13, 15–17, 29–31], protein families with at least 30% sequence identity and but the and phylogeny of these bacteria continue to 50% sequence length. Proteins families which were present remain inconclusive [20, 32, 33]. Several methods have been in less than 50% of the input were excluded from employed for the demarcation of these genera; however, no further analysis. Potentially paralogous sequences (additional simple method or criterion is available that can clearly delimit proteins from the same organism in a single protein family) these genera. It has been suggested that genome based studies within the remaining protein families were also excluded may provide reliable means of clarifying the evolutionary from further analysis. Each protein family was individually relationships of these bacteria [33]. aligned using MAFFT 7 [56]. Aligned amino acid positions Since the availability of the first complete genome which contained gaps in more than 50% of organisms were sequence of the [34], a large number excluded from further analysis. The remaining amino acid of genomes for the members of the family Pasteurellaceae positions were concatenated to create a combined dataset that have become available in public databases [35, 36]. The avail- contained 128 080 amino acid long alignments. An approx- ability of these genomes provides us with an opportunity to imately maximum-likelihood tree based on this alignment complete comprehensive genome scale phylogenetic analyses was constructed using FastTree 2 [57]whileemployingthe of the family Pasteurellaceae. These genome sequences have Whelan and Goldman substitution model [58]. also been utilized to carry out comparative genomic analyses to identify molecular signatures (namely, conserved signature indels (CSIs) in various proteins), commonly shared by all 2.3. Identification of Molecular Signatures (CSIs) for Different or closely related subsets of species within the family Pas- Genera of the Family Pasteurellaceae. The detailed outline teurellaceae. On the basis of the molecular signatures iden- of the process of identifying CSIs has been recently pub- tified from comparative analyses of Pasteurellaceae genomes lished [59]. In brief, Blastp searches were performed on inconjunctionwithcoregenomebasedandmultilocus all proteins from the genome of Haemophilus influenzae sequence based phylogenetic analyses, we have identified F3047 [47].Tentofifteenhighscoringhomologuesthat sensu stricto clades of Actinobacillus, Haemophilus, and Pas- were present in Haemophilus, other Pasteurellaceae,and teurella that are supported by 3, 7, and 6 unique molecular species were retrieved, and their mul- signatures, respectively. We also report sets of molecular tiplesequencealignmentswereconstructedusingClustalX signatures that are unique characteristics of the pathogen 1.83 [60]. The alignments were visually inspected to identify containing genera Aggregatibacter and Mannheimia. any conserved inserts or deletions (indels) that are restricted to the particular clades of the family Pasteurellaceae, which 2. Methods are flanked on each side by at least 5-6 identical/conserved residues in the neighbouring 30–40 amino acids. The selected 2.1. Multilocus Sequence Analysis. Multilocus sequence anal- sequences containing the indels and their flanking conserved ysis was completed for members of the family Pasteurellaceae regions were further evaluated by detailed Blastp searches International Journal of Genomics 3 to determine species distribution and group specificity. The however, this division is not readily apparent in phylogenies results of these Blast searches were processed using Sig Create based on the 16S rRNA gene [9, 66]. and Seq Style to construct signature files [59]. Due to space A majority of the known genera within the family constraints, the sequence alignment files presented here Pasteurellaceae form well-defined and coherent clusters in containsequenceinformationforalimitednumberofspecies phylogenetic trees (Figure 1)[9, 17, 19, 20, 66]. The genera within the order Pasteurellaceae and a representative selection Actinobacillus, Haemophilus, and Pasteurella,whichwere of outgroup species. However, in each case, all members of the described before the advent of genetic characterization, order and outgroups exhibited similar sequence characteris- exhibit polyphyletic branching in all gene and protein based tics to the representatives. phylogenetic trees, including the core genome based and multilocussequencebasedphylogenetictreescreatedinthis 3. Results and Discussion work (Figure 1). However, there are large clusters of Acti- nobacillus, Haemophilus, and Pasteurella species identifiable 3.1. Phylogenetic Analysis of the Pasteurellaceae. Elucidating inthephylogenetictreeswhichrepresentthecoreor“sensu an accurate phylogeny of the members of the family Pasteurel- stricto” members of each genera. The clusters of species laceae has been a long standing challenge in Pasteurellaceae that represent Actinobacillus sensu stricto, Haemophilus sensu research [10–12, 18, 19]. Early 16S rRNA based studies revealed stricto, and Pasteurella sensu stricto are indicated in Figure 1. that the established taxonomy of the family Pasteurellaceae Members of each genera which fall outside of the sensu stricto was not consistent with their genetically inferred phylogeny clusters, indicated in our phylogenetic trees by the presence of [12, 14]. This has led to a long series of taxonomic revisions square brackets around their genus name (e.g., [Pasteurella] within the family Pasteurellaceae,aprocesswhichisstilltak- pneumotropica), are only distantly related to the sensu stricto ing place today [7, 18, 22, 28]. However, it was subsequently members of their genus and will require reclassification in discovered that phylogenetic trees of Pasteurellaceae species order to make their taxonomy and phylogeny concordant. based on different genes did not completely agree with each other [15, 16, 31]. In particular, phylogenetic trees based on 3.2. The Usefulness of Conserved Signature Indels as Phylo- the 16S rRNA gene, often considered the gold standard in genetic and Taxonomic Markers. Wholegenomesequences bacterial taxonomy and phylogeny [61, 62], disagreed with are a rich resource for the discovery of molecular signatures highly robust multilocus sequence and concatenated protein which are unique to a group of organisms [67–69]. One useful sequence based phylogenetic trees [9, 17, 19, 20, 53, 63]. class of shared molecular signatures are conserved signa- Phylogenetic trees based on concatenated sequences for ture indels (CSIs), which are insertions/deletions uniquely a large number of unlinked and conserved loci are more present in protein sequences from a group of evolutionarily reliable and robust than phylogenetic trees based on any related organisms [59, 70, 71]. The unique, shared presence single gene or protein [64, 65]. Due to a rapid increase of multiple CSIs by a group of related species is most in the availability of genomic sequence data, we are now parsimoniously explained by the occurrence of the genetic able to complete genome scale phylogenetic analyses of the changes that resulted in these CSIs in a common ancestor family Pasteurellaceae which cover a vast majority of the ofthegroup,followedbyverticaltransmissionoftheseCSIs diversity within the family. In this work we have produced to various descendant species [59, 71–73]. Hence, these CSIs a phylogenetic tree for 74 genome sequenced members of the represent molecular synapomorphies (markers of common family Pasteurellaceae based on 128 080 aligned amino acid evolutionary decent) which can be used to identify and positions (Figure 1(a)). The branching patterns of the core demarcate specific bacterial groups in molecular terms and genome phylogenetic tree produced in this work largely agree for understanding their interrelationships independently of with a previous genome based phylogenetic tree produced phylogenetic trees [59, 70–72]. CSIs have recently been foralimitednumberofPasteurellaceae species [19]anda used to propose important taxonomic changes for a num- concatenated protein based phylogenetic tree of the family ber of bacterial groups (namely, , , Pasteurellaceae produced by our lab in a previous study [20]. , Xanthomonadales,andBorrelia) at different Additionally, we have also produced a multilocus sequence taxonomic ranks [69, 74–77].Inthepresentwork,wehave basedphylogenetictreeusingthe16SrDNA,infB, recN,and completed comprehensive comparative analysis of Pasteurel- rpoB genes which are commonly used in the phylogenetic laceae genomes (Table 1) in order to identify CSIs that analysis of the family Pasteurellaceae (Figure 1(b))[15–17, are primarily restricted to the different genera within the 29, 30]. This tree also showed broadly similar branching family Pasteurellaceae.Wehaveidentified3,7,and6unique patterns to past multilocus sequence based phylogenetic trees molecular signatures which are shared by Actinobacillus [17, 18] and to our core genome based phylogenetic tree. sensu stricto, Haemophilus sensu stricto, and Pasteurella sensu Bothourcoregenomebasedandmultilocussequencebased stricto, respectively. Information regarding these CSIs and phylogenetic trees provide evidence for a division of the their evolutionary significances is discussed below. Pasteurellaceae into at least two higher taxonomic groups (families) which are broadly similar to the two clades of Pasteurellales identifiedinourpreviouswork[20]. A similar 3.3. Molecular Signatures Specific for Actinobacillus sensu division of the family Pasteurellaceae into two or more stricto. The genus Actinobacillus was originally defined as largegroupsisseeninmanyotherrobustmultilocusor a group of growth factor independent host-associated rods concatenated protein based phylogenetic trees [17, 19, 53, 63]; which shared phenotypic or biochemical similarity with 4 International Journal of Genomics

Haemophilus influenzae Rd KW20 Haemophilus influenzae R2866 Haemophilus influenzae 224-21 Haemophilus influenzae 10810 Haemophilus influenzae 7P49H1 Haemophilus aegyptius CCUG 25716 Haemophilus influenzae 86-028NP Haemophilus sensu Haemophilus influenzae 6P18H1 M21639 Haemophilus influenzae PittHH stricto Haemophilus influenzae 10810 HK 85 Haemophilus aegyptius ATCC 11116 Haemophilus influenzae F3031 Haemophilus parainfluenzae T3T1 Haemophilus influenzae RdAW [Pasteurella] pneumotropica DSM 21403 Haemophilus influenzae 221-21 Haemophilus influenzae PittAA Haemophilus sensu [Actinobacillus] porcinus NM319 Haemophilus influenzae KR494 stricto koalarum ATCC 700131 Haemophilus haemolyticus M19501 Haemophilus haemolyticus M19107 Avibacterium paragallinarum JF4211 Haemophilus haemolyticus HK386 Avibacterium gallinarum NCTC 11188 Haemophilus sp.oral taxon 851 str. F0397 [Pasteurella] pneumotropica DSM 21403 [Haemophilus] felis ATCC 49733 Haemophilus parainfluenzae HK2019 ATCC 43325 Haemophilus parainfluenzae ATCC 33392 Haemophilus pittmaniae HK 85 ATCC 43137 Pasteurella sensu Aggregatibacter actinomycetemcomitans D11S-1 Pasteurella multocida HN06 stricto Aggregatibacter actinomycetemcomitans ANH9381 Aggregatibacter actinomycetemcomitans I23C Pasteurella multocida 36950 Aggregatibacter actinomycetemcomitans RhAA1 Aggregatibacter actinomycetemcomitans D11S-1 Aggregatibacter actinomycetemcomitans SC1083 Aggregatibacter Aggregatibacter Aggregatibacter aphrophilus ATCC 33389 Aggregatibacter aphrophilus NJ8700 Aggregatibacter aphrophilus F0387 [Actinobacillus] succinogenes 130Z Aggregatibacter segnis ATCC 33393 Aggregatibacter sp. oral taxon 458 str. W10330 Necropsobacter rosorum CCUG 28028 Avibacterium paragallinarum 72 Basfia succiniciproducens JF4016 Avibacterium paragallinarum 221 2336 Basfia succiniciproducens MBEL55E Histophilus somni 129PT [Pasteurella] bettyae CCUG 2042 Pasteurella dagmatis ATCC 43325 Pasteurella multocida 1500E Bisgaardia hudsonensis M327/99/2 Pasteurella multocida P1059 [Haemophilus] sputorum HK 2154 Pasteurella multocida 1500C Pasteurella sensu stricto Pasteurella multocida 671 90 Volucribacter psittacicida CCUG 47536 Pasteurella multocida 3480 [Actinobacillus] indolicus 46KC2 [Actinobacillus] succinogenes 130Z Basfia succiniciproducens MBEL55E [Haemophilus] parasuis KL0318 [Pasteurella] bettyae CCUG 2042 semolina JF2465 [Haemophilus] parasuis 29755 [Haemophilus] parasuis 84-15995 Bibersteinia trehalosi USDA-ARS-USMARC-192 [Haemophilus] parasuis SH0165 [Actinobacillus] muris NCTC 12432 [Haemophilus] parasuis gx033 [Haemophilus] parasuis ZJ0906 Otariodibacter oris Baika1 [Haemophilus] parasuis MN-H [Haemophilus] parasuis 12939 Vespertiliibacter pulmonis E127/08 [Haemophilus] parasuis SW114 [Haemophilus] paraphrohaemolyticus HK385 [Haemophilus] parasuis D74 [Haemophilus] paraphrohaemolyticus HK411 Bibersteinia trehalosi USDA-ARS-USMARC-190 Bibersteinia trehalosi USDA-ARS-USMARC-188 Histophilus somni 2336 [Haemophilus] paraphrohaemolyticus HK411 [Haemophilus] paraphrohaemolyticus HK385 [Actinobacillus] minor NM 305 [Haemophilus] sp. HK 2154 Actinobacillus lignieresii NCTC 4189 [Haemophilus] parainfluenzae CCUG 13788 [Actinobacillus] minor NM305 Actinobacillus pleuropneumoniae L20 [Actinobacillus] minor 202 Actinobacillus arthritidis CCUG 24862 Actinobacillus sensu Actinobacillus ureae ATCC 25976 H91-0380 Actinobacillus capsulatus DSM 19761 stricto Actinobacillus capsulatus DSM 19761 Actinobacillus suis ATCC 33415 Actinobacillus pleuropneumoniae 1096 Actinobacillus sensu Actinobacillus pleuropneumoniae L20 stricto Actinobacillus ureae CCUG 2139 Actinobacillus pleuropneumoniae 56153 Mannheimia haemolytica D171 Actinobacillus pleuropneumoniae S8 [Haemophilus] ducreyi 35000HP USDA-ARS-USMARC-1261 Mannheimia Mannheimia varigena USDA-ARS-USMARC-1388 Mannheimia granulomatis DSM 19156 Mannheimia varigena USDA-ARS-USMARC-1296 Mannheimia varigena USDA-ARS-USMARC-1312 [Haemophilus] ducreyi 3500HP Mannheimia varigena USDA-ARS-USMARC-1261 Mannheimia Gallibacterium anatis F 149 Mannheimia haemolytica D35 Mannheimia haemolytica D153 Gallibacterium melopsittaci F450 Mannheimia haemolytica PKL10 Chelonobacter oris 1662 Gallibacterium anatis 1265612 Gallibacterium anatis F 149 [Pasteurella] testudinis CCUG 19802

0.05 0.1 (a) (b)

Figure 1: (a) A maximum-likelihood core genome phylogenetic tree of sequenced members of the family Pasteurellaceae. (b) A maximum- likelihood phylogenetic tree based on concatenated nucleotide sequence alignments of the 16S rDNA, infB, recN,andrpoB genes. Both trees are rooted using members of the Vibrionaceae (not shown). Nodes with >80% bootstrap support are indicated by diamond shaped symbols at the node. Clusters of species representing Actinobacillus sensu stricto, Haemophilus sensu stricto, Pasteurella sensu stricto, Aggregatibacter, and Mannheimia are indicated by brackets. Members of the genera Actinobacillus, Haemophilus, and Pasteurella which do not fall into their respective “sensu stricto” clades are indicated by the presence of square brackets around their generic name (ex. [Pasteurella] pneumotropica). International Journal of Genomics 5

Table 1: Genome characteristics of the sequenced Pasteurellaceae included in our analyses. Organism name BioProject Size (Mb) Proteins G-C (%) References Actinobacillus pleuropneumoniae L20 CP000569 2.27 2013 41.3 [40] Actinobacillus pleuropneumoniae serovar 3 str. JL03 CP000687 2.24 2036 41.2 [41] Actinobacillus pleuropneumoniae serovar7 str. AP76 CP001091 2.35 2142 41.2 STHHb Actinobacillus ureae ATCC 25976a AEVG0 2.30 2475 — BCMg Actinobacillus minor 202a ACFT0 2.13 2050 39.3 McGill University Actinobacillus minor NM305a ACQL0 2.43 2411 39.3 McGill University Actinobacillus succinogenes 130Z CP000746 2.32 2079 44.9 Joint Genome Institute Aggregatibacter aphrophilus NJ8700 CP001607 2.31 2219 42.2 [19, 42] Aggregatibacter actinomycetemcomitans D11S-1 CP001733 2.20 2280 44.3 [43] Aggregatibacter actinomycetemcomitans D7S-1 CP003496 2.31 2250 44.3 [44] Aggregatibacter segnis ATCC 33393a AEPS0 1.99 1956 — BCMg Gallibacterium anatis UMN179 CP002667 2.69 2500 39.9 [45] Haemophilus aegyptius ATCC 11116a AFBC0 1.92 2020 — BCMg 35000HP AE017143 1.70 1717 38.2 Ohio State University Haemophilus haemolyticus M21621a AFQQ0 2.09 1894 — [46] Haemophilus influenzae 10810 FQ312006 1.98 1903 38.1 WTSIh Haemophilus influenzae F3031 FQ670178 1.99 1770 38.2 [47] Haemophilus influenzae F3047 FQ670204 2.01 1786 38.2 [47] Haemophilus influenzae 22.1-21a AAZD0 1.89 2224 38.0 [48] Haemophilus influenzae 3655 AAZF0 1.88 1929 38.0 [48] Haemophilus influenzae 6P18H1a ABWW0 1.91 1893 38.2 CGS, ASRIe Haemophilus influenzae 7P49H1a ABWV0 1.83 1752 37.9 CGS, ASRIe Haemophilus influenzae NT127a ACSL0 1.87 1809 38.0 BIGSPc Haemophilus influenzae PittAAa AAZG0 1.88 1981 38.1 [48] Haemophilus influenzae PittIIa AAZI0 1.95 2028 38.0 [48] Haemophilus influenzae PittHHa AAZH0 1.84 1977 38.0 [48] Haemophilus influenzae R3021a AAZJ0 1.88 2307 37.9 [48] Haemophilus influenzae RdAWa ACSM0 1.80 1718 38.0 BIGSPc Haemophilus influenzae 86-028NP CP000057 1.91 1792 38.2 [49] Haemophilus influenzae PittEE CP000671 1.81 1613 38.0 [48] Haemophilus influenzae PittGG CP000672 1.89 1661 38.0 [48] Haemophilus influenzae Rd KW20 L42023 1.83 1657 38.2 [34] Haemophilus influenzae R2846 CP002276 1.82 1636 38.0 UW-SBRId Haemophilus influenzae R2866 CP002277 1.93 1795 38.1 UW-SBRId Haemophilus parainfluenzae ATCC 33392 a AEWU0 2.11 2010 — BCMg Haemophilus parainfluenzae T3T1 FQ312002 2.09 1975 39.6 WTSIh Haemophilus parasuis 29755a ABKM0 2.22 2244 39.8 Iowa State University Haemophilus parasuis SH0165 CP001321 2.27 2021 40.0 [50] Haemophilus pittmaniae HK 85a AFUV0 2.18 2390 — J. Craig Venter Institute Haemophilus sputorum CCUG13788a AFNK0 2.14 2073 — Aarhus University Hospital Haemophilus parahaemolyticus HK385a AJSW0 1.81 1764 — J. Craig Venter Institute Haemophilus paraphrohaemolyticus HK411a AJMU0 2.02 2025 — J. Craig Venter Institute Haemophilus sp. oral taxon 851 str.F0397a AGRK0 1.84 1809 — GCG-WUf Histophilus somni 2336 CP000947 2.26 1980 37.4 Joint Genome Institute Histophilus somni 129PT CP000436 2.01 1798 37.2 [51] Mannheimia succiniciproducens MBEL55E AE016827 2.31 2370 42.5 [52] Mannheimia haemolytica PHL213a AASA0 2.57 2695 41.1 [53] Pasteurella multocida subsp.multocidastr.Pm70 AE004439 2.26 2012 40.4 [54] Pasteurella dagmatis ATCC 43325a ACZR0 2.25 2053 37.4 BCMg aThe genomes of these species/strains are currently under scaffolds/contigs status. bStiftung Tieraerztliche Hochschule Hannover (STHH). cThe Broad Institute Genome Sequencing Platform (BIGSP). dUniversity of Washington; Seattle Biomedical Research Institute (UW-SBRI). eCenter for Genomic Sciences, Allegheny-Singer Research Institute (CGS, ASRI). fGenome Sequencing Center (GSC) at Washington University (WashU) School of Medicine. gBaylor College of Medicine (BCM). hWellcome Trust Sanger Institute (WTSI). 6 International Journal of Genomics

367 402 Act. pleuropneumonia ser. 5b str. L20 126208128 QDDPTIQIVNQAQKAYVEN V VVKGLPELTGLPVLSA Act. pleuropneumoniae serovar 6 str. Femo 306860341 ------Act. pleuropneumoniae serovar 1 str. 4074 306853561 ------Act. pleuropneumoniae serovar 9 str. CVJ13261 306862597 ------Act. pleuropneumoniae serovar 11 str. 56153 306866939 ------Act. pleuropneumoniae serovar 12 str. 1096 306869187 ------Act. pleuropneumonia ser. 3 str. JL03 165976058 ------A------Act. pleuropneumonia ser. 7 str. AP76 190149956 ------A------Actinobacillus Act. pleuropneumoniae serovar 13 str. N273 306871346 ------A------sensu stricto Act. pleuropneumoniae serovar 2 str. S1536 306855886 ------A------Act. pleuropneumoniae serovar 2 str. 4226 302647429 ------A------Act. pleuropneumoniae serovar 4 str. M62 306858147 ------A------Act. pleuropneumoniae serovar 10 str. D13039 306864716 ------A------Actinobacillus capsulatus 517480365 ------A------Actinobacillus suis H91-0380 407692352 ------A------Actinobacillus suis ATCC 33415 672592002 ------A------Actinobacillus ureae 491832514 ------S----A------Actinobacillus minor 492353747 -----V------I-----Q-A---I--- Haemophilus parasuis 75992966 -----V------I-----Q-A---I--- Haemophilus paraphrohaemolyticus 491992285 ------K -APSVA-M----I--- Haemophilus parahaemolyticus 491987878 ------K -ASSVA-MA---I--- Bibersteinia trehalosi USDA-AR 470167188 ------A------A--A--D-A---I--- Mannheimia haemolytica USDA-AR 472333619 ------A---I--- Mannheimia haemolytica M42548 482886678 ------A---I--- Basfia succiniciproducens 52424119 ------T-- ---N----A------Other Actinobacillus succinogenes 13 152977811 ------R-----A-K IM-N--K-AK------Pasteurellaceae 492137838 ------A-- I--N----A---I--- Pasteurella dagmatis 492150287 -----V------A-- ---N----A------Pasteurella multocida HN06 383310907 -----V------R--A-- ---N----A------Pasteurella multocida 36950 378774943 -----V------R--A-- ---N----A------Aggregatibacter actinomycetemcomitans 491743308 ------APSV-AMA---I--- Aggregatibacter aphrophilus NJ 251792866 ------K -APSV-AMA---I--- Aggregatibacter segnis 493770251 ------K -APSV-AMA---I--- Haemophilus haemolyticus 491849990 ------K -ASSVA-MA---I--- Haemophilus influenzae 491951884 ------K -APS-AAMA---I--- 499151755 -----V----N------H F-Q-D-D-D-I----- bercovieri 491414840 -----V----N--R--T-H FIQ-D-D-A------Serratia odorifera 491094352 -----V----N------H YIQ-D-D-AD------ SCF1 311281241 -----V-V--M------H FIQ-D-D-AN------251788420 -----V----N------H FIQ-D-D-AE------Other Dickeya zeae Ech1591 489927089 -----V-V--N------H FIQ-D-D-AK------Gammaproteobacteria 491253659 -----V-V--N------H YIQ-D-D-AK------0/250 Cronobacter turicensis z3032 260599446 -----V-V--N------H FIQ-D-D-AT------ YJ016 37678593 -----V----L---D---R FIQ-D-D-A------Klebsiella variicola At-22 288937493 -----V-V--M------H FIQ-D-D-AK------Pantoea ananatis LMG 5342 378765502 -----V----N--R----H FIQ-D-D-AT------ 446511916 -----V-V--N------H YIQ-D-D-AK------󸀠 Figure 2: A partial sequence alignment of a 3 -nucleotidase showing a 1-amino-acid insertion identified in all members of Actinobacillus sensu stricto. This insertion was not found in the homologues from any member of the genus Actinobacillus that was not part of the “sensu stricto” clade or any other member of the Gammaproteobacteria. Sequence information for a representative subset of the family Pasteurellaceae and the class Gammaproteobacteria is shown, but unless otherwise indicated, similar CSIs were detected in all members of the indicated group and not detected in any other bacterial species in the top 250 BLAST hits. The dashes (-) in the alignments indicate identity with the residue inthe top sequence. GenBank identification (GI) numbers for each sequence are indicated in the second column. Sequence information for other CSIs specific to Actinobacillus sensu stricto are presented in Supplemental Figures 1-2 and their characteristics are summarized in Table 2(A).

Actinobacillus lignieresii, the type species of the genus [24, 78]. values, 16S rRNA sequence similarity, and other single gene However, the original classification scheme for the genus sequence comparisons [13, 18]. There are currently no known Actinobacillus led to the inclusion of a highly heterogeneous discrete characteristics which are unique to Actinobacillus and polyphyletic grouping of species within the genus [12– that define the genus. In this work, we have completed 14]. An assemblage of Actinobacillus species closely related a comprehensive comparative analysis of Pasteurellaceae to Actinobacillus lignieresii has been recognized as Acti- genomes in order to identify unique, defining molecular nobacillus sensu stricto (i.e.,thecoremembersofthegenus signatures for different genera within the family Pasteurel- Actinobacillus) in both our phylogenetic analysis (Figure 1) laceae. We have identified 3 CSIs which are unique, defining and past phylogenetic analyses [12–14, 17]. Differentiation of molecular signatures for the sequenced members of Acti- Actinobacillus sensu stricto from other Actinobacillus species nobacillus sensu stricto (namely, Actinobacillus capsulatus, andthemoderncriteriaforplacingnovelspecieswithinthe A. pleuropneumoniae, A. suis, and A. ureae). An example genus Actinobacillus sensu stricto is heavily reliant on genetic of a CSI specific for Actinobacillus sensu stricto is shown and genomic criteria, namely, DNA-DNA hybridization in Figure 2. The CSI consists of a 1-amino-acid insertion International Journal of Genomics 7

Table 2: Conserved signature indels specific for genera within the family Pasteurellaceae.

Protein name Gene name GenBank Figure number Indel size Indel identifier positiona (A) CSIs specific for Actinobacillus sensu stricto 󸀠 3 -nucleotidase surE 126208128 Figure 2 1aains 367–402 GTP pyrophosphokinase relA 126207889 Sup. Figure 1 1aains 368–412 Anaerobic glycerol-3-phosphate dehydrogenase subunit glpA 491834528 Sup. Figure 2 1aains 359–400 (B) CSIs specific for Haemophilus sensu stricto Biotin-protein ligase birA 144979005 Figure 3 6aadel 138–178 Aspartate ammonia-lyase aspA 145630289 Sup. Figure 3 1aains 34–75 NAD(P) transhydrogenase subunit alpha pntA 145631394 Sup. Figure 4 1aadel 352–378 Fumarate reductase subunit C frdC 301169552 Sup. Figure 5 3aains 31–89 Hypothetical tRNA/rRNA methyltransferase — 145636352 Sup. Figure 6 1aadel 17–58 Gamma-glutamyl kinase proB 145629980 Sup. Figure 7 1aains 197–253 ACP phosphodiesterase acpD 68250119 Sup. Figure 8 2aadel 119–159 (C) CSIs specific for Pasteurella sensu stricto Menaquinone-specific isochorismate synthase menF 386834899 Figure 4 4aains 29–86 tRNA s(4)U8 sulfurtransferase thiI 15602400 Sup. Figure 9 2aadel 412–446 FKBP-type peptidyl-prolyl cis-trans isomerase slyD 378775595 Sup. Figure 10 2aadel 151–188 Aspartate-semialdehyde dehydrogenase asd 383311492 Sup. Figure 11 1aadel 173–245 Lactate permease family transporter lldP 492154065 Sup. Figure 12 2aains 390–427 Cell division protein ftsA ftsA 492155843 Sup. Figure 13 1aains 357–387 (D) CSIs specific for Aggregatibacter nhaC family sodium:proton antiporter nhaC 493769836 Figure 5(a) 3aains 396–437 Outer membrane protein omp 261866907 Sup. Figure 14 4aadel 25–64 Multidrug transporter murJ murJ 365966332 Sup. Figure 15 1aadel 190–220 NADH dehydrogenase nuoE 387120244 Sup. Figure 16 1aains 372–412 (E) CSIs specific for Mannheimia Methyl-galactoside ABC transporter substrate-binding protein — 472335016 Figure 5(b) 1aadel 33–73 UDP-N-acetylmuramoylalanyl-D-glutamate–2,6-diaminopimelate ligase murE 472333011 Sup. Figure 17 2aadel 418–473 Glutathione-regulated potassium-efflux protein kefC 472333189 Sup. Figure 18 1aains 504–531 Glycerol-3-phosphate acyltransferase plsB 472334521 Sup. Figure 19 2aadel 214–252

󸀠 in a conserved region of a 3 -nucleotidase which is present and reclassifications [21–24, 28]. However, the genus remains in all sequenced members of Actinobacillus sensu stricto highly polyphyletic (Figure 1)[17, 19, 28]. The core mem- and absent in all other sequenced Gammaproteobacteria. bers of the genus Haemophilus (namely, Haemophilus sensu Sequence information for 2 other CSIs which are also unique stricto) consist of Haemophilusinfluenzae,H.aegyptius,and characteristics of the Actinobacillus sensu stricto clade is H. haemolyticus based on 16S rRNA sequence analysis [12– presented in Supplemental Figures 1-2 available online at 14, 32]. However, phylogenetic analysis based on DNA-DNA http://dx.doi.org/10.1155/2015/198560 and their characteris- hybridization and multilocus sequence analysis suggests that tics are briefly summarized in Table 2(A). H. parainfluenzae and H. pittmaniae are also members of Haemophilus sensu stricto [30, 80]. Phylogenetic analysis of rpoB, infB, and concatenated gene sets also suggest that 3.4. Molecular Signatures Specific for Haemophilus sensu [Pasteurella] pneumotropica and related isolates are closely stricto. The classification of novel species into the genus related to Haemophilus sensu stricto [15, 16]. Haemophilus wasinitiallybasedonphenotypicandbio- Our comparative analysis of Pasteurellaceae genomes chemical properties, most importantly, the dependence of has led to the identification of 7 CSIs that are unique growth on the presence of factor V and factor X in blood characteristics of Haemophilus sensu stricto which consists [13, 78, 79]. As with Actinobacillus, the classification of of Haemophilusinfluenzae,H.aegyptius,H.haemolyticus,H. Haemophilus on the basis of phenotypic and biochemical parainfluenzae, H. pittmaniae,and[Pasteurella] pneumotrop- properties has led to the genus containing an extremely ica (Figure 1). One example of a CSI specific for the members heterogeneous group of species [12–14, 32]. Species from the of Haemophilus sensu stricto,showninFigure 3,consistsof genus Haemophilus have undergone a number of transfers a 4-amino-acid deletion in a biotin-protein ligase which is 8 International Journal of Genomics

138 173 Haemophilus influenzae 22.1-21 144979005 LSLVIGLAIAEVL NVQVKWPNDILFDERKLGGILVE Haemophilus influenzae PittHH 145269679 ------Haemophilus influenzae R3021 144983393 ------Haemophilus influenzae R2866 386263583 ------Haemophilus influenzae PittAA 145267548 ------Haemophilus influenzae PittII 145271085 ------Haemophilus influenzae R2846 386265397 ------Haemophilus influenzae 10810 378696350 ------Haemophilus influenzae 7P49H1 229810402 ------Haemophilus influenzae PittEE 148715656 ------Haemophilus influenzae 86-028NP 68057028 ------Haemophilus influenzae F3031 317432060 ------Haemophilus influenzae CGSHiCZ412602 646229376 ------Haemophilus influenzae 7P49H1 229810402 ------Haemophilus Haemophilus influenzae 3655 144986658 ------sensu stricto Haemophilus influenzae 6P18H1 229812060 ------Haemophilus influenzae NT127 260094107 ------Haemophilus influenzae KR494 540365110 ------Haemophilus influenzae Rd KW20 16272182 ------Haemophilus influenzae PittGG 501001793 ------Haemophilus haemolyticus M21621 341954888 ------G------Haemophilus haemolyticus HK386 386907988 ------G------Haemophilus haemolyticus M19107 341948169 ------G------Haemophilus haemolyticus M21127 341948545 ------K------G------Haemophilus haemolyticus M19501 341948213 ------M--EG------Pasteurella pneumotropica 517167265 ----V------AF G------G------I- Haemophilus sp. oral taxon 851 696223133 ------G------Haemophilus parainfluenzae ATCC 33392 325159690 ------A- --K------LSG------Haemophilus parainfluenzae T3T1 301156028 ------A- --K------LSG------Haemophilus parainfluenzae HK262 385192842 ------A- -AK------V-LSG------Haemophilus pittmaniae 343517642 ---A------TF GA------V-YQDA-MA--SLD Haemophilus sputorum 359299234 ----VA-VL--SF TELGIS DI-IK----VYYQGK------I- Other Haemophilus paraphrohaemolyticus 386390324 ----VS-I---A- QAQNVQ DI-I------YYQGK-M----I- Haemophilus Haemophilus parahaemolyticus 387773709 ----VS-I---A- QAQNVQ DI-I-----VYYQGK-M----I- Haemophilus ducreyi 33151348 ----VA-I---T- QAQQVE HI-I------YYQGK-M----I- Haemophilus parasuis 219871466 ----VSVL---TF QTLNVP HI-I------YYQGK-M----I- Histophilus somni 113460565 --S-L---T---- NEMGA E-KL-----L-LFG---A----- Aggregatibacter segnis 315634470 --T------VQA- VELDMY GF------VN----A----- Aggregatibacter actinomycetemcomitans 261867631 --S------VQA- AELDMY GF------VND---A----- Aggregatibacter aphrophilus 251792305 --S------VQS- VELDMY GF------VND---A----- Actinobacillus pleuropneumoniae 190151209 -----S-I---S- QAQQVE -I-I------YYCSK-M----I- 322515677 -----S-I---S- QAQQVE -I-I------Y-CGK-M----I- Other Actinobacillus ureae Actinobacillus minor 223041563 ----VA-I---S- QAQGVK DI-I-----VY-QGK-M----I- Pasteurellaceae Actinobacillus succinogenes 152978568 ---TV-M--HRAI RKLGSD QTKL-----L-LHG---A---I- Basfia succiniciproducens 161510992 ------M---DAI KSAGGK EINL-----L-LNG---A---I- Mannheimia haemolytica 254361412 ----VA-I---S- TVQNVK DI-I------YYQGK-M----L- Pasteurella multocida 15602161 ----V-M----T- KQAGAL -IGL-----V-LHG---A----- Pasteurella dagmatis 260913039 ----V-M---DT- RRTGVR --KL-----V-LNG---A----- Gallibacterium anatis 332289774 ---AV-M-V-QA- TELNLH S--L------WLNGK------I- Cronobacter sakazakii 156935825 ------IVM---- HELGAA Q-R------LYLHD---A----- Edwardsiella ictaluri 238918134 ------IVM---- QALGAA G-K------LYLND---A----- Enterobacter cloacae 311281472 ------IVM---- QRLGAE G-R------LYLQD---A----- Other Erwinia tasmaniensis 188532305 ------IV---A- QQQGAP DIR------YLND---A----- Gammaproteobacteria Escherichia coli 218702608 ------IVM---- RKLGAD K-R------LYLQD---A----- 0/250 152972835 ------IV----- QQLGAE Q-R------YLQD---S----- Pantoea vagans 308188898 ------IVM--T- RALGAD D-R------YLND---A----- Photorhabdus luminescens 37528549 ----V-IV----- HRFGAG RIR------LYL-DK--A----- Serratia odorifera 270265458 ------IVM---- QRLGAE E-R------LYLND---A-----

Figure 3: A partial sequence alignment of 1,4-dihydroxy-2-naphthoate octaprenyltransferase showing a 2-amino-acid insertion identified in all members of Haemophilus sensu stricto. This insertion was not found in the homologues from any member of the genus Haemophilus that was not part of the “sensu stricto” clade or any other member of the Gammaproteobacteria. Sequence information for other CSIs specific to Haemophilus sensu stricto is presented in Supplemental Figures 3–8 and their characteristics are summarized in Table 2(B). uniquely found in homologs from Haemophilus sensu stricto influenzae, H. aegyptius, H. haemolyticus, H. parainfluenzae, and absent in all other sequenced Gammaproteobacteria. H. pittmaniae,and[Pasteurella] pneumotropica share a close Sequence information for 6 additional CSIs which are also evolutionaryrelationshipandshouldallbeconsideredmem- unique characteristics of Haemophilus sensu stricto is pre- bers of Haemophilus sensu stricto. Additionally, these results sented in Supplemental Figures 3–8 and their characteris- also suggest that [Pasteurella] pneumotropica is incorrectly tics are briefly summarized in Table 2(B).TheseCSIsand classified as a member of the genus Pasteurella and should be our phylogenetic trees (Figure 1)suggestthatHaemophilus reclassified as “Haemophilus pneumotropica.” International Journal of Genomics 9

29 80 Pasteurella multocida 3480 386834899 WYAGTLGVMGPAYADFCVTIRSAFIE QAEN DSQLCVFAGAGIVEGSIPLLEW Pasteurella multocida HN06 383310853 ------T------Pasteurella multocida Pm70 15601918 ------Pasteurella multocida Anand1 338217984 ------Pasteurella multocida X73 404383748 ------Pasteurella multocida 2000 512753744 ------Pasteurella multocida 93002 512754797 ------Pasteurella multocida 671/90 512760432 ------Pasteurella Pasteurella multocida HB03 512755642 ------sensu stricto Pasteurella multocida P1933 512755642 ------Pasteurella multocida P52VAC 401690557 ------Pasteurella multocida 404384736 ------Pasteurella multocida RIIF 512761090 ------Pasteurella multocida 1500C 512763080 ------Pasteurella multocida PMTB 544580815 ------Pasteurella multocida 36950 378774883 ------Pasteurella dagmatis 492154802 ------F-SKMQ------MD- Q-K------Pasteurella bettyae 492145910 ------F-SQVKSE----L----V- QNRIRI------A------Basfia succiniciproducens 52425850 ------F-NR-R-E----L----V- QNRIR------A--V----- Actinobacillus succinogenes 13 152978460 ------TKEHSE------SNKIR------A--V----- Aggregatibacter actinomycetemcomitans 387121592 ------FFNR-R-E------ADKIR------V----- Aggregatibacter segnis 493768552 ------FFNQQQ-E---A-----V- ADKIH------V----- Gallibacterium anatis UMN179 332289482 ----AI--ISH-F-E---GL---KLT HQ--HL------KE-QADE-- Histophilus somni 129PT 113460763 ----A--I-TETESE------S--- QDYIRI------Haemophilus parainfluenzae T3T 345428676 ------SQNLSE------EN-VR------Q-VE-- 494450864 ------L-SREQ-E------QQ-IR------A--D--A-- Other Haemophilus pittmaniae Haemophilus influenzae Rd KW20 16272240 ------SDVCSE---A------GHRIR------A--Q--E-- Pasteurellaceae Haemophilus sputorum 494790952 ----A--FVS-ERSE----L---QVH GNK-I-Y------A--E-QA-- Haemophilus parasuis SH0165 219871425 ----A--YFT-EQ-E---ML---L-Q AN-ITFY------K-D-QS-- Haemophilus ducreyi 35000HP 33151806 ------YFHTDH-E-T--L---K-D HN--TLY------AE-QADS-- Actinobacillus minor 492367157 ------IL-EDE-E----L---Q-K QN-VTLY------QD-E--S-- Actinobacillus ureae 491835912 ------YLQ-DE-E---AL---Q-- QNCITLY------E-E-QS-- Actinobacillus suis H91-0380 407691822 -----F-YLQ-DE-E---AL---Q-- QNCITLY------E-E-QS-- Actinobacillus pleuropneumonia 190150367 ------YLQ-DE-E---AL---Q-- RNRITLY------E-E-QS-- Mannheimia haemolytica USDA-AR 472333297 ------NEE-E----L---L-S QNSITLY------S--D-QS-- Bibersteinia trehalosi USDA-AR 470166611 ------YFS--Q-E----L---LVR AKEML-Y------AE-E-ES-- Moritella sp. PE36 492903543 --S-A--YV-QQKSE---A----R-L ENE-QL------P--D-MS-- Grimontia sp. AK16 488492100 --S-SV-YLS-EQSE---A----L-V -NKVHL------P--VAES-- Plesiomonas shigelloides 499151062 ----SV-HISRER-E-T-A----L-Q QN-VHL------P--D-EA-- Photobacterium damselae 358410570 --S-AV-YLSRQHSE---A----L-A GEE-HL------P--D-SS-- 446418461 ----SA-YLSLQQSE---SL---K-S -NVVRLY------R--D-EQ-- Other Escherichia sp. TW09231 Aliivibrio salmonicida LFI1238 209694656 --S-AV-FLSQQRSE---A----LVM GNK-HL------P--E-SS-- Gammaproteobacteria Vibrio fischeri ES114 59712279 --S-AV-FLSQQRSE---A----LVM GNK-HL------P--E-SS-- 0/250 Yokenella regensburgei 493874499 ----SA-YLSRQQSE---AL---MVS GET-RLY------R--D-E--- 491418549 ----SA-YLSRQQSE-S--L---WL- -QWVNLY------A--D-E--- Aliivibrio fischeri 491562394 --S-AV-FLSQQRSE---A----LVM GNK-HL------P--E-SS-- Photobacterium sp. AK15 494734248 --S-AV-FLSRQRSE---A----L-M GEE-HL------P--TADS-- 490215845 ----SA-YLSL-QSE---SL---KVQ QHT-RLY------S--D-EQ-- Figure 4: A partial sequence alignment of Menaquinone-specific isochorismate synthase showing a 4-amino-acid insertion identified in all members of Pasteurella sensu stricto. This insertion was not found in the homologues from any member of the genus Pasteurella that was not part of the “sensu stricto” clade or any other member of the Gammaproteobacteria. Sequence information for other CSIs specific to Pasteurella sensu stricto is presented in Supplemental Figures 9–13 and their characteristics are summarized in Table 2(C).

3.5. Molecular Signatures Specific for Pasteurella sensu stricto. found in the sequenced members of Pasteurella sensu stricto, The genus Pasteurella is highly heterogeneous and poly- consisting of a 4-amino-acid insertion in a conserved region phyletic (Figure 1)[13]. Similar to the members of Acti- of Menaquinone-specific isochorismate synthase, is shown nobacillus, bacterial isolates were originally classified as in Figure 4. This CSI is only found in the sequenced members of the genus Pasteurella based on growth fac- members of Pasteurella sensu stricto and is absent from tor independent growth and phenotypic or biochemical all other sequenced Gammaproteobacteria.Partialsequence similarity to Pasteurella multocida, the type species of the alignments for 5 additional CSIs which are also unique genus [78, 81]. The monophyletic clusters of Pasteurella characteristics of Pasteurella sensu stricto are presented in species that branch with Pasteurella multocida are consid- Supplemental Figures 9–13 and their characteristics are ered the core members of the genus (namely, Pasteurella briefly summarized in Table 2(C). sensu stricto)[9, 13, 16, 17]. Our comparative analysis of Pasteurellaceae genomes has led to the identification of 6 3.6. Molecular Signatures Specific for the Genera Aggregatibac- CSIs which are unique characteristics for the sequenced terorMannheimia. The genus Aggregatibacter was proposed members of Pasteurella sensu stricto (namely, Pasteurella as a novel taxonomic classification for a monophyletic cluster multocida and P. d ag mati s ). An example of a CSI uniquely of Actinobacillus and Haemophilus species which branched 10 International Journal of Genomics

396 437 Aggregatibacter segnis ATCC 33393 315476949 TSWGTFGIMLPIAAAIASHAMP GSV EFMLPCLSAVMAGAVCG Aggregatibacter aphrophilus NJ 251792630 ------A--AA --- -L------Aggregatibacter aphrophilus F0387 353346368 ------A--AA --- -L------Agg. actinomycetemcomitans D7S-1 387119969 ------H ------Agg. actinomycetemcomitans serotype d str. I63B 347995553 ------H ------Agg. actinomycetemcomitans serotype b str. SCC1398 348005459 ------H ------Agg. actinomycetemcomitans serotype c str. SCC2302 348008621 ------H ------Agg. actinomycetemcomitans serotype c str. AAS4A 443542862 ------H ------Agg. actinomycetemcomitans serotype b str. I23C 348010852 ------H ------Aggregatibacter Agg. actinomycetemcomitans serotype b str. SCC4092 443544948 ------H ------Agg. actinomycetemcomitans serotype str. D18P1 348000709 ------H ------Agg. actinomycetemcomitans serotype c str. D17P-2 348656398 ------H ------Agg. actinomycetemcomitans serotype e str. SC1083 347993209 ------H ------Agg. actinomycetemcomitans serotype e str. SCC393 348000829 ------H ------Agg. actinomycetemcomitans RhAA1 359757150 ------H ------Agg. actinomycetemcomitans D11S-1 261868126 ------H ------Agg. actinomycetemcomitans ANH9381 365967906 ------H ------Agg. actinomycetemcomitans D17P-3 348654113 ------H ------Agg. actinomycetemcomitans Y4 429151349 ------H ------Avibacterium paragallinarum 523674050 ------AN-A- -L------Gallibacterium anatis UMN179 332288121 ------M-AN-D- AL------Pasteurella pneumotropica 517167962 ------SM-TN-A- -L--A------Pasteurella dagmatis 492147403 ------AN-A- -L------Actinobacillus succinogenes 13 152978710 ------TN-A- -LL------517482329 ------GM-INVDT GLLI--M------Actinobacillus capsulatus ------G-SM-VNSD- NLII------Other Actinobacillus minor 492354008 Pasteurella multocida 36950 378774582 ------ANTA- -L------Pasteurellaceae Pasteurella dagmatis 492147403 ------AN-A- -L------Pasteurella bettyae 492137490 ------VN-A- -LL------Actinobacillus ureae 491832364 ------GM-INVDT GLLI--M------Bibersteinia trehalosi USDA-AR 470166919 ------M-AN-E- ALL--S------Haemophilus parasuis SH0165 219871308 ------M-AN-E- ALL------Haemophilus sputorum 494789693 ------A-----G-SM-M-SE- ALII---A---S----- Succinivibrionaceae bacterium 498141043 ------T--IN-NS -LLI------Psychromonas sp. CNPT3 470479953 ------GDM-GATDV A----M----L--S-F- Other 91793888 ------DM-MGSDS TM---M----L----F- Shewanella denitrificans OS217 ------GDL-GATDI AL---M----L----F- Gammaproteobacteria Grimontia hollisae 491648712 13655509 ------GDM-GATDV AL---M----L----F- 0/250 Vibrio harveyi Aeromonas molluscorum 492636195 ------L-GDM-AASEI SM---M----L----F- Vibrio sp. Ex25 262394622 ------GDM-GATDL AL---M----L----F- (a) 33 73 Mannheimia haemolytica USDA-ARS-USMARC-183 472335016 YKYDDNFMALMRKEIEKEGKTQK VELLMNDSQNTQSIQNDQ Mannheimia haemolytica D171 525657229 ------Mannheimia haemolytica serotype A2 str. OVINE 261308718 ------Mannheimia haemolytica D35 528822364 ------Mannheimia haemolytica USMARC_2286 526467663 ------Mannheimia haemolytica serotype 6 str. H23 452088105 ------Mannheimia haemolytica M42548 481064088 ------Mannheimia haemolytica D174 523435054 ------523448251 ------Mannheimia Mannheimia haemolytica D153 Mannheimia haemolytica D38 528824940 ------Mannheimia haemolytica MhBrain2012 528874222 ------Mannheimia haemolytica MhSwine2000 528877290 ------Mannheimia haemolytica D193 528878523 ------Mannheimia varigena USDA-ARS-USMARC-1312 575444678 ------QA-- -Q------Mannheimia varigena USDA-ARS-USMARC-1296 575442543 ------QA-- -Q------Mannheimia varigena USDA-ARS-USMARC-1261 575442169 ------QA-- -Q------Mannheimia varigena USDA-ARS-USMARC-1388 575448350 ------QA-- -Q------Mannheimia granulomatis 652755709 ------AN-- -N------A------Bibersteinia trehalosi USDA-AR 470166988 -N------S------VQ-AARM- E -N--L-----A-----N- Haemophilus parasuis SH0165 219870684 ------S---Q-----V-VVG G -D------A------Haemophilus paraphrohaemolyticus 491990832 ------AAQH- D ------Haemophilus sputorum 497813944 ------S------N--A-ALN D ------A------Actinobacillus minor 492352703 ------A-NL- D ------Other Actinobacillus suis H91-0380 407693344 ------D--ANQL- D -K------Pasteurellaceae Mannheimia succiniciproducens 52424698 ------D--ATNL- D -Q------A------Gallibacterium anatis UMN179 332289959 ------S------N-DAEKVE G IK------A------Aggregatibacter actinomycetemcomitans 491699858 ------S------Q-NAEQLQ N -K------A------Haemophilus parainfluenzae T3T 345429694 ------S------D--A-ALG G I------A------Haemophilus influenzae 86-028N 68249417 ------S------D--A-VVG G IK------A------490274190 ------SMV--D----A-NSP G -Q------S--T---- Erwinia amylovora ------SVV--A---D--SAP D -Q------D--K---- Other Klebsiella oxytoca 490205456 Yokenella regensburgei 493874411 ------SVV--A---DA-ASP D IQ------D--K---- Gammaproteobacteria Tolumonas auensis DSM 9187 237809379 ------SAV--A-----DQYP D IK------D--K---- 0/250 Citrobacter rodentium ICC168 283785958 ------SVV--A--ADA-AAP D -Q------D--K---- Escherichia coli 487492822 ------SVV--A--QDA-AAP D -Q------D--K---- (b)

Figure 5: A partial sequence alignment of (a) a nhaC family sodium:proton antiporter containing a 3-amino-acid insertion specific for all sequenced species of the genus Aggregatibacter (b) a methyl-galactoside ABC transporter substrate-binding protein containing a 1- amino-acid deletion specific for all sequenced species of the genus Mannheimia. In each case, the identified CSIs were only found in the sequenced members of the genera Aggregatibacter or Mannheimia and were absent from all other sequenced Gammaproteobacteria. Sequence information for other CSIs specific to Aggregatibacter or Mannheimia is presented in Supplemental Figures 14–19 and their characteristics are summarized in Tables 2(D) and 2(E). International Journal of Genomics 11 distinctly from the “sensu stricto” members of their respective 7 CSIs specific for clades [21]. Similarly, the genus Mannheimia was proposed Haemophilus sensu stricto (Table 2(B)) as a novel classification for the Pasteurella Haemolytica Haemophilus sensu complex which did not branch with Pasteurella sensu stricto stricto in phylogenetic trees [7]. Currently other than branching in phylogenetic trees or relatedness in DNA-DNA hybridiza- 4 CSIs specific for tion studies, the members of the genera Aggregatibacter or Aggregatibacter (Table 2(D)) Mannheimia do not share any single unique or defining Aggregatibacter biochemical or molecular characteristic that can differentiate them from all other bacteria [5, 82]. Inthisstudywehaveidentified4CSIsthatareunique 6 CSIs specific for molecular characteristics shared by all sequenced species Pasteurella sensu stricto (Table 2(C)) of the genus Aggregatibacter and another 4 CSIs which are Pasteurella sensu uniquely found in all sequenced members of the genus stricto Mannheimia.ExamplesofCSIsspecifictothesequenced members of Aggregatibacter and Mannheimia are shown in 3 CSIs specific for Figure 5. A partial sequence alignment of a nhaC family Actinobacillus sensu stricto (Table 2(A)) sodium:proton antiporter containing a 3-amino-acid inser- Actinobacillus sensu tion specific for all sequenced species of the genus Aggregati- stricto bacter is shown in Figure 5(a) and a partial sequence align- ment of a methyl-galactoside ABC transporter substrate- binding protein containing a 1-amino-acid deletion specific for all sequenced species of the genus Mannheimia is shown in Figure 5(b). In each case, the identified CSIs were only Mannheimia 4 CSIs specific for found in the sequenced members of the genera Aggre- Mannheimia gatibacter or Mannheimia and were absent from all other (Table 2(E)) sequenced Gammaproteobacteria. Partial sequence align- ments additional CSIs specific for the genera Aggregatibacter Figure 6: A summary diagram depicting the distribution of identi- fied CSIs for genera within the family Pasteurellaceae. or Mannheimia are provided in Supplemental Figures 14–19 and their characteristics are summarized in Tables 2(D)- 2 (E). These CSIs are the first discrete molecular charac- teristicswhichareuniqueforthegeneraAggregatibacter fall into the “sensu stricto” clusters should not be considered and Mannheimia and support their observed monophyly members of their respective genus. in phylogenetic trees. Additionally, these CSIs could be useful targets for the development of PCR based diagnostic Examinations of phenotypic and biochemical character- assays for the genera Aggregatibacter and Mannheimia which istics do not provide a reliable means of assigning a novel amplify the CSI containing DNA segment using the con- isolate to the genera Actinobacillus, Haemophilus, and Pas- served flanking regions of the CSIs83 [ , 84]. teurella [18].However,basedupontheCSIsdescribedinthis work, it is now possible to demarcate the genera Actinobacil- 4. Conclusion lus sensu stricto, Haemophilus sensu stricto, and Pasteurella sensu stricto on the basis of the presence or absence of unique The genera Actinobacillus, Haemophilus, and Pasteurella, molecular signatures. It is important to note that the current within the family Pasteurellaceae, are known to exhibit analysis of CSIs is limited to the currently available genomic extensive polyphyletic branching. We have utilized molecular sequence data and may show slight variance as additional signaturesandphylogeneticanalysestoclarifythetaxonomic bacterial genomes are sequenced. However, earlier work on boundary of these genera. We have been able to identify CSIs for other groups of bacteria provides evidence that the large clusters of Actinobacillus, Haemophilus, and Pasteurella identifiedCSIshavestrongpredictivevalueandwilllikelybe species which represent the “sensu stricto” members of these found in other members of these groups as more species are genera.Wehaveidentified3,7,and6uniquemolecular sequenced and novel species are isolated [74, 77, 85, 86]. The signatures which are specifically shared by the members of conserved nature of the sequence regions that contain these the genera Actinobacillus sensu stricto, Haemophilus sensu CSIs, in conjunction with their strong predictive value, makes stricto, and Pasteurella sensu stricto,respectively.Thegroup CSIs promising targets for the development of highly specific specificity of the molecular signatures we have identified in diagnostic assays for Actinobacillus sensu stricto, Haemophilus this work is summarized in Figure 6 and their characteristics sensu stricto, Pasteurella sensu stricto, Aggregatibacter, and are briefly summarized in Table 2.Ourcomparativegenomic Mannheimia [83, 84]. Additionally, further analysis of these analyses have not come across any CSIs that were unique genus specific CSIs should lead to the discovery of their characteristics of all sequenced members of the genera Acti- functional role in their respective organisms and may provide nobacillus, Haemophilus, or Pasteurella as currently defined, important insights into novel distinguishing features of these suggesting that the members of these genera that do not groups of organisms. 12 International Journal of Genomics

Conflict of Interests [12]F.E.Dewhirst,B.J.Paster,I.Olsen,andG.J.Fraser,“Phylogeny of 54 representative strains of species in the family Pasteurel- The authors declare that there is no conflict of interests laceae as determined by comparison of 16S rRNA sequences,” regarding the publication of this paper. Journal of Bacteriology,vol.174,no.6,pp.2002–2013,1992. [13]I.Olsen,F.E.Dewhirst,B.J.Paster,andH.J.Busse,“FamilyI. Pasteurellaceae,” i n Bergey’s Manual of Systematic Bacteriology, Authors’ Contribution D. J. Brenner, N. R. Krieg, G. M. Garrity, and J. T. Staley, Eds., Sohail Naushad and Mobolaji Adeolu contributed equally to vol.2,pp.851–856,Springer,NewYork,NY,USA,2ndedition, 2005. this work. [14]F.E.Dewhirst,B.J.Paster,I.Olsen,andG.J.Fraser,“Phy- logeny of the Pasteurellaceae as determined by comparison References of 16S ribosomal ribonucleic acid sequences,” Zentralblatt fur¨ Bakteriologie,vol.279,no.1,pp.35–44,1993. [1] A. C. Parte, “LPSN—list of prokaryotic names with standing in [15]H.Christensen,P.Kuhnert,J.E.Olsen,andM.Bisgaard, nomenclature,” Nucleic Acids Research,vol.42,no.1,pp.D613– “Comparative phylogenies of the housekeeping genes atpD, infB D616, 2014. and rpoB and the 16S rRNA gene within the Pasteurellaceae,” [2]K.Muehldorfer,S.Speck,andG.Wibbelt,“ProposalofVesper- International Journal of Systematic and Evolutionary Microbiol- tiliibacter pulmonis gen. nov., sp. nov. and two genomospecies as ogy,vol.54,no.5,pp.1601–1609,2004. new members of the family Pasteurellaceae isolated from Euro- [16] B. Korczak, H. Christensen, S. Emler, J. Frey, and P. Kuhn- pean bats,” International Journal of Systematic and Evolutionary ert, “Phylogeny of the family Pasteurellaceae based on rpoB Microbiology,vol.64,no.7,pp.2424–2430,2014. sequences,” International Journal of Systematic and Evolutionary [3]S.M.Spinola,M.E.Bauer,andR.S.MunsonJr.,“Immu- Microbiology,vol.54,no.4,pp.1393–1399,2004. nopathogenesis of Haemophilus ducreyi infection (chancroid),” [17] P. Kuhnert and B. M. Korczak, “Prediction of whole-genome Infection and Immunity, vol. 70, no. 4, pp. 1667–1676, 2002. DNA-DNA similarity, determination of G + C content and [4] H. Christensen and M. Bisgaard, “Molecular classification and phylogenetic analysis within the family Pasteurellaceae by mul- its impact on diagnostics and understanding the phylogeny tilocus sequence analysis (MLSA),” Microbiology,vol.152,no.9, and epidemiology of selected members of Pasteurellaceoe of pp. 2537–2548, 2006. veterinary importance,” Berliner und Munchener¨ Tierarztliche¨ [18] H. Christensen, P. Kuhnert, H. J. Busse, W. C. Frederiksen, and Wochenschrift,vol.123,no.1-2,pp.20–30,2010. M. Bisgaard, “Proposed minimal standards for the description [5] N. Nørskov-Lauritsen, “Classification, identification, and clin- of genera, species and subspecies of the Pasteurellaceae,” Inter- ical significance of Haemophilus and Aggregatibacter species national Journal of Systematic and Evolutionary Microbiology, with host specificity for humans,” Clinical Microbiology Reviews, vol. 57, no. 1, pp. 166–178, 2007. vol.27,no.2,pp.214–240,2014. [19] M. P. D. Bonaventura, E. K. Lee, R. DeSalle, and P. J. Planet, [6]B.Henderson,J.M.Ward,andD.Ready,“Aggregatibacter “A whole-genome phylogeny of the family Pasteurellaceae,” (Actinobacillus) actinomycetemcomitans:atripleA∗ periodon- Molecular Phylogenetics and Evolution,vol.54,no.3,pp.950– topathogen?” Periodontology 2000,vol.54,no.1,pp.78–105, 956, 2010. 2010. [20] H. S. Naushad and R. S. Gupta, “Molecular signatures (con- served indels) in protein sequences that are specific for the order [7]Ø.Angen,R.Mutters,D.A.Caugant,J.E.Olsen,andM.Bis- Pasteurellales and distinguish two of its main clades,” Antonie gaard, “Taxonomic relationships of the [Pasteurella] haemolyt- van Leeuwenhoek,vol.101,no.1,pp.105–124,2012. ica complex as evaluated by DNA-DNA hybridizations and 16s rRNA sequencing with proposal of Mannheimia haemolytica [21] N. Nørskov-Lauritsen and M. Kilian, “Reclassification of Acti- gen. nov., comb. nov., Mannheimia granulomatis comb. nov., nobacillus actinomycetemcomitans, , Mannheimia glucosida sp. nov., sp. nov. Haemophilus paraphrophilus and as Aggre- and Mannheimia varigena sp. nov.,” International Journal of gatibacter actinomycetemcomitans gen. nov., comb. nov., Aggre- Systematic Bacteriology,vol.49,no.1,pp.67–86,1999. gatibacter aphrophilus comb.nov.andAggregatibacter seg- nis comb. nov., and emended description of Aggregatibacter [8] J. T. Bosse,H.Janson,B.J.Sheehanetal.,“´ Actinobacillus pleu- aphrophilus to include V factor-dependent and V factor- ropneumoniae: pathobiology and pathogenesis of infection,” independent isolates,” International Journal of Systematic and Microbes and Infection,vol.4,no.2,pp.225–235,2002. Evolutionary Microbiology,vol.56,no.9,pp.2135–2146,2006. [9]B.A.WilsonandM.Ho,“Pasteurella multocida:fromzoonosis [22]P.J.Blackall,H.Christensen,T.Beckenham,L.L.Blackall, to cellular microbiology,” Clinical Microbiology Reviews,vol.26, and M. Bisgaard, “Reclassification of Pasteurella gallinarum, no. 3, pp. 631–655, 2013. [Haemophilus] paragallinarum, Pasteurella avium and Pas- [10] S. Pohl, Reklassifizierung der gattung Actinobacillus Brumpt 1910, teurella volantium as Avibacterium gallinarum gen. nov., comb. HaemophilusWinslowetal.1971undPasteurellaTrevisan1887 nov., Avibacterium paragallinarum comb. nov., Avibacterium anhand phanotypischer¨ und molekularer daten, insbesondere der avium comb. nov. and Avibacterium volantium comb. nov,” DNS-verwandtschaften bei DNS: DNS-hybridisierung in vitro International Journal of Systematic and Evolutionary Microbiol- und vorschlag einer neuen familie, Pasteurellaceae [Inaug. Diss.], ogy, vol. 55, no. 1, pp. 353–362, 2005. Philipps-Universitat,Marburg,Germany,1979.¨ [23]Ø.Angen,P.Ahrens,P.Kuhnert,H.Christensen,andR. [11] W. Mannheim, S. Pohl, and R. Hollander,¨ “On the taxonomy of Mutters, “Proposal of Histophilus somni gen. nov., sp. nov. Actinobacillus, Haemophilus,andPasteurella:DNAbasecom- for the three species ‘Haemophilus somnus’, position, respiratory quinones, and biochemical reactions of ‘Haemophilus agni’and‘Histophilus ovis’,” International Journal representative collection cultures (author’s transl),” Zentralblatt of Systematic and Evolutionary Microbiology,vol.53,no.5,pp. fur¨ Bakteriologie A,vol.246,no.4,pp.512–540,1979. 1449–1456, 2003. International Journal of Genomics 13

[24] S. Pohl, H. U. Bertschinger, W.Frederiksen, and W.Mannheim, [38] R. C. Edgar, “MUSCLE: multiple sequence alignment with high “Transfer of Haemophilus pleuropneumoniae and the Pas- accuracy and high throughput,” Nucleic Acids Research,vol.32, teurella haemolytica-like organism causing porcine necrotic no.5,pp.1792–1797,2004. pleuropneumonia to the genus Actinobacillus (Actinobacillus [39] K. Tamura, G. Stecher, D. Peterson, A. Filipski, and S. Kumar, pleuropneumoniae comb. nov.) on the basis of phenotypic “MEGA6: molecular evolutionary genetics analysis version 6.0,” and deoxyribonucleic acid relatedness,” International Journal of Molecular Biology and Evolution,vol.30,no.12,pp.2725–2729, Systematic Bacteriology,vol.33,no.3,pp.510–514,1983. 2013. [25] H. Christensen, M. Bisgaard, A. M. Bojesen, R. Mutters, and J. [40] S. J. Foote, J. T. Bosse,´ A. B. Bouevitch, P. R. Langford, N. M. E. Olsen, “Genetic relationship among avian isolates classified Young, and J. H. E. Nash, “The complete genome sequence of as Pasteurella haemolytica.’Actinobacillus salpingitidis’orPas- Actinobacillus pleuropneumoniae L20 (serotype 5b),” Journal of teurella anatis with proposal of Gallibacterium anatis gen. nov., Bacteriology,vol.190,no.4,pp.1495–1496,2008. comb. nov. and description of additional genomospecies with [41] Z. Xu, Y. Zhou, L. Li et al., “Genome biology of Actinobacillus Gallibacterium gen. nov,” International Journal of Systematic and pleuropneumoniae JL03, an isolate of serotype 3 prevalent in Evolutionary Microbiology,vol.53,no.1,pp.275–287,2003. China,” PLoS ONE,vol.3,no.1,ArticleIDe1450,2008. [26] P. J. Blackall, A. M. Bojesen, H. Christensen, and M. Bisgaard, [42] M. P. Di Bonaventura, R. DeSalle, M. Pop et al., “Com- “Reclassification ofPasteurella [ ] trehalosi as Bibersteinia tre- plete genome sequence of Aggregatibacter (Haemophilus) halosi gen. nov., comb. nov,” International Journal of Systematic aphrophilus NJ8700,” Journal of Bacteriology,vol.191,no.14,pp. and Evolutionary Microbiology,vol.57,no.4,pp.666–674,2007. 4693–4694, 2009. [27] R. Mutters, S. Pohl, and W. Mannheim, “Transfer of Pasteurella [43]C.Chen,W.Kittichotirat,Y.Si,andR.Bumgarner,“Genome ureae Jones 1962 to the genus Actinobacillus Brumpt 1910: Acti- sequence of Aggregatibacter actinomycetemcomitans serotype c nobacillus ureae comb. nov.,” International Journal of Systematic strain D11S-1,” Journal of Bacteriology,vol.191,no.23,pp.7378– Bacteriology,vol.36,no.2,pp.343–344,1986. 7379, 2009. [28] H. Christensen and P. Kuhnert, “International committee on [44]C.Chen,W.Kittichotirat,W.Chen,J.S.Downey,Y.Si, systematics of . Subcommittee on the taxonomy and R. Bumgarner, “Genome sequence of naturally competent of Pasteurellaceae. Minutes of the meetings, 25 August 2011, Aggregatibacter actinomycetemcomitans serotype a strain D7S- Elsinore, Denmark,” International Journal of Systematic and 1,” Journal of Bacteriology,vol.192,no.10,pp.2643–2644,2010. Evolutionary Microbiology,vol.62,no.1,pp.257–258,2012. [45] T.J. Johnson, C. Fernandez-Alarcon, A. M. Bojesen, L. K. Nolan, [29] J. Hedegaard, H. Okkels, B. Bruun, M. Kilian, K. K. D. W. Trampel, and T. Seemann, “Complete genome sequence Mortensen, and N. Nørskov-Lauritsen, “Phylogeny of the of Gallibacterium anatis strain UMN179, isolated from a laying genus Haemophilus as determined by comparison of partial hen with peritonitis,” Journal of Bacteriology,vol.193,no.14,pp. infB sequences,” Microbiology,vol.147,no.9,pp.2599–2609, 3676–3677, 2011. 2001. [46] I. K. Jordan, A. B. Conley, I. V. Antonov et al., “Genome [30] N. Nørskov-Lauritsen, B. Bruun, and M. Kilian, “Multilocus sequences for five strains of the emerging pathogen sequence phylogenetic study of the genus Haemophilus with Haemophilus haemolyticus,” Journal of Bacteriology,vol. description of Haemophilus pittmaniae sp. nov,” International 193,no.20,pp.5879–5880,2011. Journal of Systematic and Evolutionary Microbiology,vol.55,no. [47] F. R. Strouts, P. Power, N. J. Croucher et al., “Lineage-specific 1,pp.449–456,2005. virulence determinants of Haemophilus influenzae biogroup [31]V.Cattoir,O.Lemenand,J.-L.Avril,andO.Gaillot,“The aegyptius,” Emerging Infectious Diseases,vol.18,no.3,pp.449– sodA gene as a target for phylogenetic dissection of the genus 457, 2012. Haemophilus and accurate identification of human clinical [48]J.S.Hogg,F.Z.Hu,B.Jantoetal.,“Characterizationandmod- isolates,” International Journal of Medical Microbiology,vol.296, eling of the Haemophilus influenzae core and supragenomes no. 8, pp. 531–540, 2006. based on the complete genomic sequences of Rd and 12 clinical [32] M. Kilian, “Genus III. Haemophilus,” i n Bergey’s Manual of nontypeable strains,” Genome Biology,vol.8,no.6,articleR103, Systematic Bacteriology,D.J.Brenner,N.R.Krieg,G.M.Garrity, 2007. andJ.T.Staley,Eds.,vol.2,pp.883–904,Springer,NewYork, [49] A. Harrison, D. W. Dyer, A. Gillaspy et al., “Genomic sequence NY, USA, 2nd edition, 2005. of an otitis media isolate of nontypeable Haemophilus influen- [33] N. Nørskov-Lauritsen, B. Bruun, C. Andersen, and M. Kilian, zae: comparative study with H. influenzae serotyped, strain “Identification of haemolytic Haemophilus species isolated from KW20,” Journal of Bacteriology,vol.187,no.13,pp.4627–4636, human clinical specimens and description of Haemophilus 2005. sputorum sp. nov,” International Journal of Medical Microbiol- [50] M. Yue, F. Yang, J. Yang et al., “Complete genome sequence of ogy,vol.302,no.2,pp.78–83,2012. Haemophilus parasuisSH0165,” Journal of Bacteriology,vol.191, [34] R. D. Fleischmann, M. D. Adams, O. White et al., “Whole- no. 4, pp. 1359–1360, 2009. genome random sequencing and assembly of Haemophilus [51] J. F. Challacombe, A. J. Duncan, T. S. Brettin et al., “Complete influenzae Rd,” Science,vol.269,no.5223,pp.496–512,1995. genome sequence of Haemophilus somnus (Histophilus somni) [35] A. R. Wattam, D. Abraham, O. Dalay et al., “PATRIC, the bac- strain 129Pt and comparison to Haemophilus ducreyi 35000HP terial bioinformatics database and analysis resource,” Nucleic and Haemophilus influenzae Rd,” JournalofBacteriology,vol. Acids Research,vol.42,no.1,pp.D581–D591,2014. 189, no. 5, pp. 1890–1898, 2007. [36] NCBI, NCBI Genome Database,2014,http://www.ncbi.nlm.nih [52] S. H. Hong, J. S. Kim, S. Y. Lee et al., “The genome sequence .gov/genome/. of the capnophilic rumen bacterium Mannheimia succinicipro- [37] NCBI, NCBI Nucleotide Database, 2014, http://www.ncbi.nlm ducens,” Nature Biotechnology,vol.22,no.10,pp.1275–1281, .nih.gov/nuccore/. 2004. 14 International Journal of Genomics

[53] J. Gioia, X. Qin, H. Jiang et al., “The genome sequence of [70] R. S. Gupta, “Applications of conserved indels for understand- Mannheimia haemolytica A1: insights into virulence, natural ing microbial phylogeny,” in Molecular Phylogeny of Microor- competence, and Pasteurellaceae phylogeny,” Journal of Bacte- ganisms, A. Oren and R. T. Papke, Eds., pp. 135–150, Caister riology, vol. 188, no. 20, pp. 7257–7266, 2006. Academic Press, Norfolk, UK, 2010. [54] B. J. May, Q. Zhang, L. L. Li, M. L. Paustian, T.S. Whittam, and V. [71] H. S. Naushad, B. Lee, and R. S. Gupta, “Conserved signature Kapur, “Complete genomic sequence of Pasteurella multocida, indels and signature proteins as novel tools for understanding Pm70,” Proceedings of the National Academy of Sciences of the microbial phylogeny and systematics: Identification of molecu- United States of America, vol. 98, no. 6, pp. 3460–3465, 2001. lar signatures that are specific for the phytopathogenic genera Dickeya, Pectobacterium and Brenneria,” International Journal [55] R. C. Edgar, “Search and clustering orders of magnitude faster of Systematic and Evolutionary Microbiology,vol.64,no.2,pp. than BLAST,” Bioinformatics,vol.26,no.19,ArticleIDbtq461, 366–383, 2014. pp.2460–2461,2010. [72] R. S. Gupta, “Protein phylogenies and signature sequences: a [56] K. Katoh and D. M. Standley, “MAFFT multiple sequence reappraisal of evolutionary relationships among archaebacteria, alignment software version 7: improvements in performance eubacteria, and eukaryotes,” Microbiology and Molecular Biol- and usability,” Molecular Biology and Evolution,vol.30,no.4, ogy Reviews,vol.62,no.4,pp.1435–1491,1998. pp. 772–780, 2013. [73] A. Rokas and P. W.H. Holland, “Rare genomic changes as a tool [57] M. N. Price, P. S. Dehal, and A. P. Arkin, “FastTree 2— for phylogenetics,” Trends in Ecology and Evolution,vol.15,no. approximately maximum-likelihood trees for large alignments,” 11,pp.454–459,2000. PLoS ONE,vol.5,no.3,ArticleIDe9490,2010. [74] R. S. Gupta and R. Lali, “Molecular signatures for the phylum [58] S. Whelan and N. Goldman, “A general empirical model Aquificae and its different clades: Proposal for division of of protein evolution derived from multiple protein families the phylum Aquificae into the emended order Aquificales, using a maximum-likelihood approach,” Molecular Biology and containing the families Aquificaceae and Hydrogenothermaceae, Evolution,vol.18,no.5,pp.691–699,2001. and a new order Desulfurobacteriales ord. nov., containing the [59] R. S. Gupta, “Identification of conserved indels that are useful family Desulfurobacteriaceae,” Antonie van Leeuwenhoek,vol. for classification and evolutionary studies,” in Methods in 104, no. 3, pp. 349–368, 2013. Microbiology,vol.41,pp.153–182,AcademicPress,2014. [75] R. S. Gupta, S. Mahmood, and M. Adeolu, “A phylogenomic [60] F. Jeanmougin, J. D. Thompson, M. Gouy, D. G. Higgins, and T. and molecular signature based approach for characterization J. Gibson, “Multiple sequence alignment with Clustal X,” Trends of the phylum spirochaetes and its major clades: proposal for in Biochemical Sciences, vol. 23, no. 10, pp. 403–405, 1998. a taxonomic revision of the phylum,” Frontiers in Microbiology, vol. 4, article 217, 2013. [61] E. Stackebrandt and J. Ebers, “Taxonomic parameters revisited: tarnishedgoldstandards,”Microbiology Today,vol.33,no.4,p. [76]M.AdeoluandR.S.Gupta,“Aphylogenomicandmolecular 152, 2006. markerbasedproposalforthedivisionofthegenusBorrelia into two genera: the emended genus Borrelia containing only [62] K. T. Konstantinidis and E. Stackebrandt, “Defining taxonomic the members of the relapsing fever Borrelia,andthegenus ranks,” in The Prokaryotes, pp. 229–254, Springer, 2013. Borreliella gen. nov. containing the members of the Lyme [63] R. J. Redfield, W. A. Findlay, J. Bosse,´ J. S. Kroll, A. D. S. disease Borrelia (Borrelia burgdorferi sensu lato complex),” Cameron, and J. H. E. Nash, “Evolution of competence and Antonie van Leeuwenhoek, vol. 105, no. 6, pp. 1049–1072, 2014. DNAuptakespecificityinthePasteurellaceae,” BMC Evolution- [77] V. Bhandari and R. S. Gupta, “Molecular signatures for the ary Biology,vol.6,article82,2006. phylum (class) Thermotogae and a proposal for its division [64] A. Rokas, B. I. Williams, N. King, and S. B. Carroll, “Genome- into three orders (Thermotogales, Kosmotogales ord. Nov. and scale approaches to resolving incongruence in molecular phy- Petrotogales ord. Nov.) containing four families (Thermoto- logenies,” Nature,vol.425,no.6960,pp.798–804,2003. gaceae, Fervidobacteriaceae fam. Nov., Kosmotogaceae fam. Nov. and Petrotogaceae fam. Nov.) and a new genus Pseudother- [65] D. Wu, P. Hugenholtz, K. Mavromatis et al., “A phylogeny- motoga gen. Nov. with five new combinations,” Antonie van driven genomic encyclopaedia of Bacteria and Archaea,” Leeuwenhoek, International Journal of General and Molecular Nature,vol.462,no.7276,pp.1056–1060,2009. Microbiology,vol.105,no.1,pp.143–168,2014. [66] P. Yilmaz, L. W. Parfrey, P. Yarza et al., “The SILVA and [78] I. Olsen, “Recent approaches to the chemotaxonomy of ’sll-species living tree project (LTP)’ taxonomic frameworks,” the Actinobacillus-Haemophilus-Pasteurella group (family Pas- Nucleic Acids Research,2013. teurellaceae),” Oral Microbiology and Immunology,vol.8,no.6, [67] B.Gao,R.Mohan,andR.S.Gupta,“Phylogenomicsandprotein pp. 327–336, 1993. signatures elucidating the evolutionary relationships among the [79] N. Hayashimoto, M. Ueno, A. Tkakura, and T. Itoh, “Biochem- Gammaproteobacteria,” International Journal of Systematic and ical characterization and phylogenetic analysis based on 16S Evolutionary Microbiology,vol.59,no.2,pp.234–247,2009. rRNA sequences for V-factor dependent members of Pasteurel- [68] A. M. Cutino-Jim˜ enez,´ M. Martins-Pinheiro, W. C. Lima, A. laceae derived from laboratory rats,” Current Microbiology,vol. Mart´ın-Tornet, O. G. Morales, and C. F. M. Menck, “Evolution- 54,no.6,pp.419–423,2007. ary placement of Xanthomonadales based on conserved protein [80] R. Mutters, W. Frederiksen, and W. Mannheim, “Taxonomy of signature sequences,” Molecular Phylogenetics and Evolution, the group,” in Pasturella and ,C.AdlamandJ.M. vol. 54, no. 2, pp. 524–534, 2010. Rutter, Eds., pp. 3–34, Academic Press, London, UK, 1989. [69] H. S. Naushad and R. S. Gupta, “Phylogenomics and molecular [81] K. P. Snipes and E. L. Biberstein, “Pasteurella testudinis sp. nov.: signatures for species from the plant pathogen-containing order a parasite of desert tortoises (Gopherus agassizi),” International Xanthomonadales,” PLoS ONE,vol.8,no.2,ArticleIDe55216, Journal of Systematic Bacteriology,vol.32,no.2,pp.201–210, 2013. 1982. International Journal of Genomics 15

[82] Ø. Angen, P. Ahrens, and M. Bisgaard, “Phenotypic and geno- typic characterization of Mannheimia (Pasteurella) haemolyt- ica-like strains isolated from diseased animals in Denmark,” Veterinary Microbiology,vol.84,no.1-2,pp.103–114,2002. [83] N. Z. Ahmod, R. S. Gupta, and H. N. Shah, “Identification of a anthracis specific indel in the yeaC gene and development of a rapid pyrosequencing assay for distinguishing B. anthracis from the B. cereus group,” Journal of Microbiological Methods,vol.87,no.3,pp.278–285,2011. [84]S.Y.Wong,A.Paschos,R.S.Gupta,andH.E.Schell- horn, “Insertion/deletion-based approach for the detection of Escherichia coli O157:H7 in freshwater environments,” Environ- mental Science & Technology,vol.48,no.19,pp.11462–11470, 2014. [85] B. Gao and R. S. Gupta, “Phylogenetic framework and molecu- lar signatures for the main clades of the phylum ,” Microbiology and Molecular Biology Reviews,vol.76,no.1,pp. 66–112, 2012. [86] M. Howard-Azzeh, L. Shamseer, H. E. Schellhorn, and R. S. Gupta, “Phylogenetic analysis and molecular signatures defin- ing a monophyletic clade of heterocystous and identifying its closest relatives,” Photosynthesis Research,vol. 122, no. 2, pp. 171–185, 2014.