<<

USOO81 68417B2

(12) United States Patent (10) Patent No.: US 8,168,417 B2 Berka et al. (45) Date of Patent: May 1, 2012

(54) BACILLUS LICHENIFORMIS CHROMOSOME OTHER PUBLICATIONS Kunst et al., 1997, PIR database Accession No. D69904, “butyrate (75) Inventors: Randy Berka, Davis, CA (US); Michael acetoacetate CoA- (EC 2.8.3.9.) large chain homolog Rey, Davis, CA (US); Preethi Ramaiya, yodR-Bacillus subtilis”.* Walnut Creek, CA (US); Jens Tonne Claus, D. and Berkeley, R.C.W. (1986) in Bergey's Manual of Sys Andersen, Naerum (DK); Michael tematic Bacteriology, vol. 2., eds. Sneath, P.H.A. etal. (Williams and Dolberg Rasmussen, Vallensbaek (DK); Wilkins Co., Baltimore, MD.), pp. 1105-1139. Peter Bjarke Olsen, Copenhagen Ø Eveleigh, D.E. (1981) Scientific American 245, 155-178. (DK) Erickson, R.J. (1976) in Microbiology, ed. Schlesinger, D. (Am. Soc. Microbiol. Washington, DC), pp. 406-419. (73) Assignees: Novozymes A/S, Bagsvaerd (DK); Logan, N. A. and Berkeley, R.C.W. (1981), in The Aerobic Novozymes, Inc., Davis, CA (US) Endospore-Forming . Classification and Identification, eds. Berkeley, R.C.W. and Goodfellow, M. (Academic Press, Inc., Lon (*) Notice: Subject to any disclaimer, the term of this don), pp. 106-140. patent is extended or adjusted under 35 O'Donnell, A.G., Norris, J.R., Berkeley, R.C.W., Claus, D., Kanero, U.S.C. 154(b) by 0 days. T., Logan, N. A., and Nozaki, R. (1980) Internat. J. Systematic Bacteriol. 30, 448-459. (21) Appl. No.: 12/972,306 Lapidus et al., (2002), Co-linear scaffold of the Bacillus licheniformis and Bacillus subtilis genomes and its use to compare (22) Filed: Dec. 17, 2010 their competence genes, FEMS Microbiology Letters, 209, pp. 23-30. (65) Prior Publication Data Kunst et al., Nature, 1997, vol. 390, pp. 249-266. NCBI submission of B. subtilis DNA, 180 kb region of repliation US 2011 FOO864O7 A1 Apr. 14, 2011 origin, ID BAC 180K, publicly available Jun. 2, 1999, at ncbi.nlm. nih.gov. Related U.S. Application Data Avita et al., Temporal Secretion of a multicellulolytic system in Division of application No. 12/322.974, filed on Feb. Myxobacter sp. AL-1 Molecular cloning and heterologous expres (62) sion of cel9 encoding a modular endocellulase clustered in an operon 9, 2009, now Pat. No. 7,863,032, which is a division of with ce148, an exocellobiohdrolase gene, Eur, J. Biochem., 2000, application No. 10/983,128, filedon Nov. 5, 2004, now v.267, 7058-7064. Pat. No. 7,494,798. Liu et al., 2004, Curr Microbiol 49, 234-238. (60) Provisional application No. 60/535.988, filed on Jan. Lloberas et al., 1991, Eur J. Biochem 197,337-343. Moriya et al., 2005, NCBI Access No. D26.185. 9, 2004, provisional application No. 60/561,059, filed O'Donnell et al., 2007, Geneseqp Access No. ADJ79377. on Apr. 8, 2004, provisional application No. Rey et al., 2004, Gen Biol 5(10), R77-1. 60/572,403, filed on May 18, 2004. Sanchez et al., 2003, Eur J Biochem 270(13), 2913-2919. Sneath et al., 1977, 1104-1139. (51) Int. C. Sook et al., 2002, J Microbiol Biotechnol 12(5), 773-779. CI2N 9/10 (2006.01) Veith et al., 2004, J Mol Microbiol Biotechnol 7(4), 204-211. CI2N 5/54 (2006.01) Xu et al., 2003, EMBL Access No. AF478085. (52) U.S. Cl...... 435/193: 536/23.2 Xu et al., 2003, Intl Sys Evo Micro 53(3), 695-704. (58) Field of Classification Search ...... None Sinchaikul et al 2002, J Chromat 771,261-287. See application file for complete search history. * cited by examiner (56) References Cited Primary Examiner—Jon PWeber Assistant Examiner — William W. Moore U.S. PATENT DOCUMENTS (74) Attorney, Agent, or Firm — Eric J. Fechter; Robert L. 5,589,381 A 12/1996 Neyra et al. Starnes 5,665,354 A 9/1997 Neyra et al. 6,060.241 A * 5/2000 Corthesy-Theulaz ...... 435/6 6,506,581 B1* 1/2003 Fleischmann et al...... 435/69.1 (57) ABSTRACT 6,528,289 B1* 3/2003 Fleischmann et al...... 435,9141 The present invention relates to an isolated polynucleotide of

6,593,114 B1* 7/2003 Kunsch et al...... 435,9141 6,846,651 B2 * 1/2005 Fleischmann et al ... 435/69.1 the complete chromosome of Bacillus licheniformis. The 7,018,794 B2* 3/2006 Berka et al...... 435/6 present invention also relates to isolated genes of the chro 7.691,574 B2 * 4/2010 Berka et al...... 435/6 mosome of Bacillus licheniformis which encode biologically 2002fO146721 Al 10/2002 Berka et al...... 435/6 active Substances and to nucleic acid constructs, vectors, and FOREIGN PATENT DOCUMENTS host cells comprising the genes as well as methods for pro ducing biologically active substances encoded by the genes WO WOO2,29113 A2 4/2002 WO WOO229113 * 4, 2002 and to methods of using the isolated genes of the complete WO WOO3,OOO941 1, 2003 chromosome of Bacillus licheniformis. WO WOO3/O54163 A2 T 2003 WO WOO3,087.149 10, 2003 2 Claims, No Drawings US 8,168,417 B2 1. 2 BACILLUS LICHENIFORMS CHROMOSOME in the environment. Unlike most other bacilli that are pre dominantly aerobic, Bacillus licheniformis is a facultative CROSS-REFERENCE TO RELATED anaerobe which may allow it to grow in additional ecological APPLICATIONS niches. This species produces a diverse assortment of extra cellular that are believed to contribute to the process This application is a divisional of U.S. application Ser. No. of nutrient cycling in nature (Claus, D. and Berkeley, R. C.W., 12/322,974, filed Feb. 9, 2009, now U.S. Pat. No. 7,863,032, 1986. In Bergey's Manual of Systematic Bacteriology, Vol. 2., which is a divisional of U.S. application Ser. No. 10/983,128, eds. Sneath, P. H. A. et al., Williams and Wilkins Co., Balti filed Nov. 5, 2004, now U.S. Pat. No. 7,494,798, which claims more, Md., pp. 1105-1139). Certain Bacillus licheniformis the benefit of U.S. Provisional Application No. 60/535.988, 10 isolates are capable of denitrification, however, the relevance filed Jan. 9, 2004, U.S. Provisional Application No. 60/561, of this characteristic to environmental denitrification may be 059, filed Apr. 8, 2004, and U.S. Provisional Application No. Small since the species generally persists in soil as endospores 60/572.403, filed May 18, 2004, which applications are incor porated herein by reference. (Alexander, M., 1977, Introduction to Soil Microbiology. 15 John Wiley and Sons, Inc., New York). There are numerous industrial and agricultural uses for REFERENCE TO ASEQUENCE LISTING Bacillus licheniformis and its extracellular products. The spe This application contains a Sequence Listing in computer cies has been used for decades in the manufacture of indus readable form, which is incorporated herein by reference. trial enzymes including several proteases, C.-amylase, peni cillinase, pentosanase, cycloglucosyltransferase, BACKGROUND OF THE INVENTION B-mannanase, and several pectinolytic enzymes, owing largely to its ability to secrete sizeable amounts of degradative 1. Field of the Invention enzymes. Bacillus licheniformis is also used to produce pep The present invention relates to an isolated polynucleotide tide antibiotics such as bacitracin and proticin, in addition to molecule comprising the complete chromosome of Bacillus 25 a number of specialty chemicals such as citric acid, , licheniformis. The present invention also relates to features , and poly-Y-. The proteases from (genes) of the complete chromosomal DNA molecule of Bacillus licheniformis are used in the detergent industry as Bacillus licheniformis which encode biologically active sub well as for dehairing and batting of leather (Eveleigh, D. E., stances and to nucleic acid constructs, vectors, and host cells 1981, Scientific American 245, 155-178). Amylases from comprising the features as well as methods for producing 30 Bacillus licheniformis are deployed for the hydrolysis of biologically active Substances encoded by the features and to starch, desizing of textiles, and sizing of paper (Erickson, R. methods of using the isolated features derived from the com J., 1976. In Microbiology, ed. Schlesinger, D. (Am. Soc. plete chromosomal DNA molecule of Bacillus licheniformis. Microbiol. Washington, D.C.), pp. 406-419). Certain strains 2. Description of the Related Art of Bacillus licheniformis have shown efficacy to destroy fun Microbes, which make up most of the earth's biomass, 35 gal pathogens affecting maize, grasses, and vegetable crops have evolved for some 3.8 billion years. They are found in (U.S. Pat. No. 5,589,381; U.S. Pat. No. 5,665,354). As an virtually every environment, Surviving and thriving in endospore-forming bacterium, the ability of the organism to extremes of heat, cold, radiation, pressure, salt, acidity, and Survive under unfavorable environmental conditions may darkness. Often in these environments, no other forms of life enhance its potential as a natural control agent. are found and the only nutrients come from inorganic matter. 40 Bacillus licheniformis can be differentiated from other The diversity and range of their environmental adaptations bacillion the basis of metabolic and physiological tests (Lo indicate that microbes long ago 'solved’ many problems for gan, N. A. and Berkeley, R. C. W., 1981. In The Aerobic which scientists are still actively seeking Solutions. The value Endospore-Forming Bacteria Classification and Identifica in determining the complete genome sequence of microbes is tion, eds. Berkeley, R. C. W. and Goodfellow, M., Academic that it provides a detailed blueprint for the organism revealing 45 Press, Inc., London, pp. 106-140; O'Donnell, A.G., Norris, J. all of the biochemical pathways, Substrates, intermediates, R., Berkeley, R. C. W., Claus, D., Kanero, T., Logan, N.A., and end products as well as regulatory networks, and evolu and Nozaki, R., 1980, Internat. J. Systematic Bacteriol. 30: tionary relationships to other microbes. A complete manifest 448-459). However, biochemical and phenotypic character of proteins, both structural and catalytic, is encoded as a list of istics may be ambiguous among closely related species. Lapi features in the DNA molecule comprising the genome, as well 50 dus et al. (Lapidus, A., Galleron, N., Andersen, J. T., Jor as their likely cellular location. gensen, P. L. Ehrlich, S. D., and Sorokin, A., 2002, FEMS Knowledge about the enormous range of microbial capaci Microbiol. Lett. 209: 23-30) recently constructed a physical ties has broad and far-reaching implications for environmen map of the Bacillus licheniformis chromosome using a PCR tal, energy, health, and industrial applications, such as approach, and established a number of regions of co-linearity cleanup of toxic-waste, production of novel therapeutic and 55 where gene content and organization were ostensibly con preventive agents (drugs and vaccines), energy generation served with the Bacillus subtilis chromosome. and development of renewable energy sources, production of It would be advantageous to the art to have available the chemical catalysts, reagents, and enzymes to improve effi complete primary structure of the chromosomal DNA mol ciency of industrial processes, management of environmental ecule of the Bacillus licheniformis type strain ATCC 14580. carbon, nitrogen and nutrient cycling, detection of disease 60 With the complete chromosome data in hand, it should be causing organisms and monitoring of the safety of food and possible to do comparative genomics and proteomics studies water Supplies, use of genetically altered bacteria as living that can lead to improved industrial strains as well as to a sensors (biosensors) to detect harmful chemicals in soil, air, better understanding of genome evolution among closely or water, and understanding of specialized systems used by related bacilli in the subtilis-licheniformis group. microbial cells to live in natural environments. 65 It is an object of the present invention to provide an isolated Bacillus licheniformis is a gram positive spore-forming polynucleotide with the sequence of the complete chromo bacterium that is widely distributed as a saprophytic organism some of Bacillus licheniformis. US 8,168,417 B2 3 4 SUMMARY OF THE INVENTION labeled with a first reporter and the second nucleic acid probes are labeled with a second reporter; The present invention relates to an isolated polynucleotide (b) examining the array under conditions wherein the rela of the complete chromosomal DNA molecule of Bacillus tive expression of the genes of the microbial strain is deter licheniformis ATCC 14580 having the sequence of 5 mined by the observed hybridization reporter signal of each SEQID NO: 1. spot on the array in which (i) the Bacillus licheniformis genes The present invention also relates to isolated features on the array that hybridize to the first nucleic acid probes (genes) of the complete chromosomal DNA molecule of produce a distinct first hybridization reporter signal or the Bacillus licheniformis ATCC 14580 encoding biologically second nucleic acid probes produce a distinct second hybrid active Substances, selected from the group consisting of 10 ization reporter signal, and (ii) the Bacillus licheniformis (a) a gene comprising a nucleotide sequence having at least genes on the array that hybridize to both the first and second 60% identity with any of the polynucleotides of SEQIDNOs: nucleic acid probes produce a distinct combined hybridiza 2-4198; and tion reporter signal; and (b) a gene comprising a nucleotide sequence which hybrid (c) isolating a gene from the microbial strain that encodes izes under at least medium stringency conditions with any of 15 an that degrades or converts the . the polynucleotides of SEQ ID NOS: 2-4198, or a comple The present invention also relates to genes isolated by Such mentary strand thereof. methods and nucleic acid constructs, vectors, and host cells The present invention also relates to biologically active containing the genes. Substances encoded by the isolated genes, and nucleic acid constructs, vectors, and host cells comprising the genes. DEFINITIONS The present invention also relates to methods for producing Such Substances having biological activity comprising (a) Biologically active substance: The term “biologically cultivating a recombinant host cell comprising a nucleic acid active Substance' is defined herein as any substance which is construct comprising a gene encoding the biologically active encoded by a single gene or a series of genes (contiguous or substance under conditions suitable for production of the 25 non-contiguous) composing a biosynthetic or metabolic biologically active Substance; and (b) recovering the biologi pathway or operon or may be the director indirect result of the cally active Substance. of a single gene or products of a series of genes of the The present invention also relates to methods for monitor Bacillus licheniformis chromosome. Such substances ing differential expression of a plurality of genes in a first include, but are not limited to, biopolymers, metabolites, and bacterial cell relative to expression of the same genes in one or 30 cellular structures and components (e.g., ribosome, flagella, more second bacterial cells, comprising: etc.). For purposes of the present invention, biological activ (a) adding a mixture of detection reporter-labeled nucleic ity is determined according to procedures known in the art acids isolated from the bacterial cells to a Substrate containing such as those described by Carpenter and Sabatini, 2004, an array of Bacillus licheniformis genes selected from the Nature 5: 11-22: Sordie et al., 2003, Proceedings of the group consisting of SEQ ID NOS: 2-4198, 35 National Academy of Sciences USA 100: 11964-11969; complementary strands of SEQ ID NOS: 2-4198, or frag Braun and LaBaer, 2003, TRENDS in Biotechnology 21: 383 ments of SEQID NOS: 2-4198, under conditions where the 388; and Kaberdin and McDowall, 2003, Genome Research detection reporter-labeled nucleic acids hybridize to comple 13: 1961-1965. mentary sequences of the Bacillus licheniformis genes on the In the methods of the present invention, the biopolymer array, wherein the nucleic acids from the first bacterial cell 40 may be any biopolymer. The term “biopolymer is defined and the one or more second bacterial cells are labeled with a hereinas a chain (or polymer) of identical, similar, or dissimi first detection reporter and one or more different second lar subunits (monomers). The biopolymer may be, but is not detection reporters, respectively; and limited to, a nucleic acid, polyamine, polyol, polypeptide (or (b) examining the array under conditions wherein the rela polyamide), or polysaccharide. tive expression of the genes in the bacterial cells is determined 45 In a preferred aspect, the biopolymer is a polypeptide. The by the observed detection signal of each spot on the array in polypeptide may be any polypeptide having a biological which (i) the Bacillus licheniformis genes on the array that activity of interest. The term “polypeptide' is not meant hybridize to the nucleic acids obtained from either the first or hereinto refer to a specific length of the encoded product and, the one or more second bacterial cells produce a distinct first therefore, encompasses peptides, oligopeptides, and proteins. detection signal or one or more second detection signals, 50 The term “polypeptide' also encompasses naturally occur respectively, and (ii) the Bacillus licheniformis genes on the ring allelic variations. array that hybridize to the nucleic acids obtained from both In a preferred aspect, the polypeptide is an antibody, anti the first and one or more second bacterial produce a distinct gen, antimicrobial peptide, enzyme, growth factor, hormone, combined detection signal. immunodilator, neurotransmitter, , reporter protein, The present invention also relates to methods for isolating 55 structural protein, transcription factor, and transporter. a gene encoding an enzyme, comprising: In a more preferred aspect, the polypeptide is an oxi (a) adding a mixture of labeled first nucleic acid probes, doreductase, transferase, , , , or isolated from a microbial strain cultured on medium without . In a most preferred aspect, the polypeptide is an alpha an inducing Substrate, and labeled second nucleic acid glucosidase, aminopeptidase, amylase, carbohydrase, car probes, isolated from the microbial strain cultured on medium 60 boxypeptidase, catalase, cellulase, chitinase, cutinase, cyclo with the inducing Substrate, to an array of Bacillus licheni dextrin glycosyltransferase, deoxyribonuclease, esterase, formis genes selected from the group consisting of nucle alpha-galactosidase, beta-galactosidase, glucoamylase, glu otides SEQID NOS: 2-4198, complementary strands of SEQ cocerebrosidase, alpha-glucosidase, beta-glucosidase, inver IDNOs: 2-4198, or fragments of SEQID NOs: 2-4198, under tase, laccase, lipase, mannosidase, mutanase, oxidase, pecti conditions where the labeled nucleic acid probes hybridize to 65 nolytic enzyme, peroxidase, phospholipase, phytase, complementary sequences of the Bacillus licheniformis polyphenoloxidase, proteolytic enzyme, ribonuclease, trans genes on the array, wherein the first nucleic acid probes are glutaminase, urokinase, Xylanase. US 8,168,417 B2 5 6 In another preferred aspect, the polypeptide is a collagen or For purposes of the present invention, the degree of identity gelatin. between two sequences is determined by the In another preferred aspect, the biopolymer is a polysac Smith-Waterman Protein method for the Genematcher2, as charide. The polysaccharide may be any polysaccharide, implemented by Paracel Inc. (Pasadena, Calif.), or the including, but not limited to, a mucopolysaccharide (e.g., BLASTP method as described by Altschulet al., 1990, Jour heparin and hyaluronic acid) and a nitrogen-containing nal of Molecular Biology 215: 403-410. polysaccharide (e.g., chitin). In a more preferred aspect, the For purposes of the present invention, the degree of identity polysaccharide is hyaluronic acid (hyaluronan). between two nucleotide sequences is determined by the In the methods of the present invention, the metabolite may Smith Waterman nucleotide method for the Genematcher2 or 10 BLASTN for the BlastMachine as implemented by Paracel be any metabolite. The metabolite may be encoded by one or Inc. more genes, such as a biosynthetic or metabolic pathway. The Polypeptide Fragment: The term “polypeptide fragment” term “metabolite' encompasses both primary and secondary is defined herein as a polypeptide, which retains biological metabolites. Primary metabolites are products of primary or activity, having one or more amino acids deleted from the general metabolism of a cell, which are concerned with 15 amino and/or carboxyl terminus of a polypeptide encoded by energy metabolism, growth, and structure. Secondary any of the genes of the present invention, i.e., polypeptides of metabolites are products of secondary metabolism (see, for SEQ ID NOs: 4199-8395. Preferably, a fragment contains at example, R. B. Herbert, The Biosynthesis of Secondary least 80%, preferably at least 85%, more preferably at least Metabolites, Chapman and Hall, New York, 1981). 90%, even more preferably at least 95%, and most preferably The primary metabolite may be, but is not limited to, an at least 97% of the amino acid residues of the mature encoded amino acid, fatty acid, , nucleotide, Sugar, triglyc polypeptide product. eride, vitamin. Subsequence: The term “subsequence' is defined hereinas The secondary metabolite may be, but is not limited to, an a polynucleotide comprising a nucleotide sequence of any of alkaloid, coumarin, flavonoid, polyketide, quinine, Steroid, SEQ ID NOS: 2-4198 except that one or more nucleotides peptide, or terpene. In a preferred aspect, the secondary 25 have been deleted from the 5' and/or 3' end. Preferably, a metabolite is an antibiotic, antifeedant, attractant, bacterio subsequence contains at least 80%, preferably at least 85%, cide, fungicide, hormone, insecticide, or rodenticide. more preferably at least 90%, even more preferably at least Isolated biologically active substance: The term "isolated 95%, and most preferably at least 97% of the nucleotides of biologically active Substance' is defined hereinas a Substance any of the genes of the present invention. which is at least about 20% pure, preferably at least about 30 Allelic variant: The term “allelic variant denotes herein 40% pure, more preferably about 60% pure, even more pref any of two or more alternative forms of a gene occupying the erably about 80% pure, most preferably about 90% pure, and same chromosomal locus. Allelic variation arises naturally even most preferably about 95% pure, as determined by SDS through mutation, and may result in polymorphism within PAGE, HPLC, capillary electrophoresis, or any other method populations. Gene mutations can be silent (no change in the used in the art. 35 encoded polypeptide) or may encode polypeptides having Substantially pure biologically active Substance or pure altered amino acid sequences. An allelic variant of a polypep biologically active substance: The term “substantially pure tide is a polypeptide encoded by an allelic variant of a gene. biologically active Substance' is defined herein as a biologi Substantially pure polynucleotide or pure polynucleotide: cally active Substance preparation which contains at most The term “substantially pure polynucleotide' as used herein 10%, preferably at most 8%, more preferably at most 6%, 40 refers to a polynucleotide preparation free of other extraneous more preferably at most 5% by weight, more preferably at or unwanted nucleotides and is in a form Suitable for use most 4%, at most 3%, even more preferably at most 2%, most within genetically engineered production systems. Thus, a preferably at most 1%, and even most preferably at most 0.5% Substantially pure polynucleotide contains at most 10%, pref by weight of other material with which it is natively associ erably at most 8%, more preferably at most 6%, more pref ated. It is, therefore, preferred that the substantially pure 45 erably at most 5%, more preferably at most 4%, more pref biologically active substance is at least 92% pure, preferably erably at most 3%, even more preferably at most 2%, most at least 94% pure, more preferably at least 95% pure, more preferably at most 1%, and even most preferably at most 0.5% preferably at least 96% pure, more preferably at least 96% by weight of other polynucleotide material with which it is pure, more preferably at least 97% pure, even more preferably natively associated. A Substantially pure polynucleotide may, at least 98% pure, most preferably at least 99%, and even most 50 however, include naturally occurring 5' and 3' untranslated preferably at least 99.5% pure by weight of the total material regions. Such as promoters and terminators. It is preferred that present in the preparation. The term “pure biologically active the substantially pure polynucleotide is at least 92% pure, Substance' is defined as a biologically active Substance prepa preferably at least 94% pure, more preferably at least 95% ration which contains no other material with which it is pure, moire preferably at least 96% pure, more preferably at natively associated. 55 least 96% pure, more preferably at least 97% pure, more The biologically active substances of the present invention preferably at least 98% pure, most preferably at least 99%, are preferably in a substantially pure form. In particular, it is and even most preferably at least 99.5% pure by weight. The preferred that the biologically active substances are in “essen polynucleotides of the present invention are preferably in a tially pure form', i.e., that the biologically active substance substantially pure form. In particular, it is preferred that the preparation is essentially free of other material with which it 60 polynucleotides disclosed herein are in “essentially pure is natively associated. This can be accomplished, for form', i.e., that the polynucleotide preparation is essentially example, by preparing the biologically active Substance by free of other polynucleotide material with which it is natively means of well-known recombinant methods or by classical associated. The term “pure polynucleotide' is defined as a purification methods. polynucleotide preparation which contains no other material Identity: The relatedness between two amino acid 65 with which it is natively associated. sequences or between two nucleotide sequences is described Nucleic acid construct: The term “nucleic acid construct’ by the parameter “identity”. as used herein refers to a nucleic acid molecule, either single US 8,168,417 B2 7 8 or double-stranded, which is isolated from a naturally occur The Bacillus licheniformis chromosome possesses regions ring gene or which has been modified to contain segments of that are markedly co-linear with the chromosomes of Bacillus nucleic acids in a manner that would not otherwise exist in subtilis and Bacillus halodurans, and approximately 80% of nature. The term nucleic acid construct is synonymous with the predicted genes have Bacillus subtilis orthologues. the term “expression cassette' when the nucleic acid con 5 The present invention also relates to isolated features struct contains the control sequences required for expression (genes) of the complete chromosomal DNA molecule of of a coding sequence of the present invention. Bacillus licheniformis ATCC 14580 encoding biologically Control sequence: The term “control sequences” is defined active Substances, selected from the group consisting of herein to include all components, which are necessary or (a) a gene comprising a nucleotide sequence having at least advantageous for the expression of a biologically active Sub 10 60% identity with any of the nucleotide sequences of SEQID stance of the present invention. Each control sequence may be NOS: 2-4198; and native or foreign to the polynucleotide encoding the Sub (b) a gene comprising a nucleotide sequence which hybrid stance. Such control sequences include, but are not limited to, a leader, propeptide sequence, promoter, signal peptide izes under at least medium stringency conditions with any of sequence, and transcription terminator. At a minimum, the 15 the genes of SEQID NOS: 2-4198, or a complementary strand control sequences include a promoter, and transcriptional and thereof. translational stop signals. The control sequences may be pro In a first aspect, the present invention relates to isolated vided with linkers for the purpose of introducing specific genes, which have a degree of identity to the nucleotide restriction sites facilitating ligation of the control sequences sequences of any of SEQID NOS: 2-4198 of at least about with the coding region of the polynucleotide encoding a bio 60%, preferably at least about 65%, more preferably at least logically active Substance. about 70%, more preferably at least about 75%, more prefer Operably linked: The term “operably linked as used ably at least about 80%, more preferably at least about 85%, herein refers to a configuration in which a control sequence is even more preferably at least about 90%, most preferably at placed at an appropriate position relative to the coding least about 95%, and even most preferably at least about 97%, sequence of the DNA sequence, Such that the control 25 which encode biologically active Substances having a particu sequence directs the expression of a biologically active Sub lar biological activity (hereinafter “homologous biologically Stance. active Substances”). Coding sequence: When used herein the term “coding In a second aspect, the present invention relates to isolated sequence' is intended to cover a nucleotide sequence, which genes comprising nucleotide sequences which hybridize directly specifies the amino acid sequence of its protein prod 30 under very low stringency conditions, preferably low strin uct. The boundaries of the coding sequence are generally gency conditions, more preferably medium stringency con determined by an open reading frame, which usually begins with the ATG start codon or alternative start codons such as ditions, more preferably medium-high Stringency conditions, GTG and TTG. even more preferably high Stringency conditions, and most Expression: The term “expression' includes any step 35 preferably very high Stringency conditions with any of (i) the involved in the production of a biologically active substance genes of SEQID NOS: 2-4198, or subsequences thereof, or including, but not limited to, transcription, post-transcrip (ii) complementary strands thereof (J. Sambrook, E. F. tional modification, translation, post-translational modifica Fritsch, and T. Maniatus, 1989, Molecular Cloning, A Labo tion, and secretion. ratory Manual, 2d edition, Cold Spring Harbor, New York). Expression vector: The term “expression vector herein 40 Subsequences of SEQ ID NOs: 2-4198 may be at least 100 covers a DNA molecule, linear or circular, that comprises a nucleotides or preferably at least 200 nucleotides. Moreover, segment encoding a biologically active Substance of the the Subsequences may encode fragments of a gene product invention, and which is operably linked to additional seg which have biological activity. The biologically active sub ments that provide for its transcription. stances may also be biologically active allelic variants of the Host cell: The term “host cell, as used herein, includes any 45 biologically active Substances. cell type which is Susceptible to transformation, transfection, The nucleotide sequences of SEQID NOS: 2-4198 or sub conjugation, electroporation, etc. with a nucleic acid con sequences thereof, as well as the amino acid sequences of struct, plasmid, or vector. SEQ ID NOs: 4199-8395 or fragments thereof, may be used to design nucleic acid probes to identify and clone DNA DETAILED DESCRIPTION OF THE INVENTION 50 encoding biologically active Substances from Strains of dif ferent genera or species according to methods well known in Bacillus licheniformis Chromosome and Features (Genes) the art. In particular, such probes can be used for hybridiza Thereof tion with the genomic DNA of the genus or species of interest, The present invention relates to an isolated polynucleotide following standard Southern blotting procedures, in order to of the complete chromosomal DNA molecule of Bacillus 55 identify and isolate the corresponding gene therein. Such licheniformis ATCC 14580 having the nucleotide sequence of probes can be considerably shorter than the entire sequence, SEQID NO: 1. Bacillus licheniformis ATCC 14580, consists but should be at least 14, preferably at least 25, more prefer of a circular molecule of 4.222,336 base pairs with a mean ably at least 35 nucleotides in length, such as at least 70 G+C content of 46.2%. The chromosome contains 4208 pre nucleotides in length. It is preferred, however, that the nucleic dicted protein-coding genes (SEQID NOS: 2-4198) with an 60 acid probes are at least 100 nucleotides in length. For average size of 873 bp, 7 rRNA operons, and 72 tRNA genes. example, the nucleic acid probes may be at least 200 nucle The deduced amino acid sequences of the 4208 predicted otides, at least 300 nucleotides, at least 400 nucleotides, or at protein-coding genes are shown in SEQID NOs: 4199-8395. least 500 nucleotides in length. Even longer probes may be SEQ ID NO: 4210 corresponds to SEQ ID NO: 2, SEQ ID used, e.g., nucleic acid probes which are at least 600 nucle NO: 4211 corresponds to SEQID NO:3, SEQID NO: 4212 65 otides, at least 700 nucleotides, at least 800 nucleotides, or at corresponds to SEQID NO: 4, etc. The predicted functions of least 900 nucleotides in length. Both DNA and RNA probes the 4208 gene products are shown in Table 1. can be used. The probes are typically labeled for detecting the US 8,168,417 B2 10 corresponding gene (for example, with P.H.S, biotin, or between the probe and the filter bound DNA for successful avidin). Such probes are encompassed by the present inven hybridization. The effective T may be determined using the tion. formula below to determine the degree of identity required for A genomic DNA library prepared from Such other organ two DNAs to hybridize under various stringency conditions. isms may, therefore, be screened for DNA which hybridizes 5 with the probes described above and which encodes a bio Effective T-81.5+16.6(log MNa")+0.41 (% G+C)- logically active substance. Genomic DNA from such other 0.72(% formamide) organisms may be separated by agarose or polyacrylamide The % G+C content of any of the genes of SEQID NOs: gel electrophoresis, or other separation techniques. DNA 2-4198 can easily be determined. For medium stringency, for from the libraries or the separated DNA may be transferred to 10 example, the concentration of formamide is 35% and the Na" and immobilized on nitrocellulose or other suitable carrier concentration for 5xSSPE is 0.75M. Applying this formulato material. In order to identify a clone or DNA which is these values, the Effective T in C. can be calculated. homologous with any of SEQ ID NOS: 2-4198 or subse Another relevant relationship is that a 1% mismatch of two quences thereof, the carrier material is used in a Southern DNAs lowers the T 1.4° C. To determine the degree of blot. 15 identity required for two DNAs to hybridize under medium For purposes of the present invention, hybridization indi stringency conditions at 42°C., the following formula is used: cates that a polynucleotide hybridizes to a labeled gene hav ing the nucleotide sequence shown in any of SEQ ID NOs: % Homology=100-(Effective T-Hybridization 2-4198, complementary Strands thereof, or Subsequences Temperature), 1.41 thereof, under very low to very high stringency conditions. Applying this formula, the degree of identity required for Molecules to which the nucleic acid probe hybridizes under two DNAs to hybridize under medium stringency conditions these conditions can be detected using X-ray film. at 42°C. can be calculated. In a preferred aspect, the nucleic acid probe is any of the Similar calculations can be made under other stringency genes of SEQID NOS: 2-4198, or subsequences thereof. In conditions, as defined herein. another preferred aspect, the nucleic acid probe is the mature 25 The present invention also relates to isolated polynucle coding region of any of the genes of SEQID NOS: 2-4198. In otides obtained by (a) hybridizing a population of DNA under another preferred aspect, the nucleic acid probe is the gene of very low, low, medium, medium-high, high, or very high any of SEQID NOS: 2-4198 contained in Bacillus lichenifor stringency conditions with any of (i) the genes of SEQ ID mis ATCC 14580. In another preferred aspect, the nucleic NOs: 2-4198, or subsequences thereof, or (ii) complementary acid probe is the mature coding region of any of the genes of 30 Strands thereof, and (b) isolating the hybridizing polynucle SEQ ID NOS: 2-4198 contained in Bacillus licheniformis otide from the population of DNA. In a preferred aspect, the ATCC 1458O. hybridizing polynucleotide encodes a polypeptide of any of For long probes of at least 100 nucleotides in length, very SEQID NOS: 2-4198, or homologous polypeptides thereof. low to very high stringency conditions are defined as prehy In a third aspect, the present invention relates to isolated bridization and hybridization at 42° C. in 5xSSPE, 0.3% 35 polypeptides having amino acid sequences which have a SDS, 200 ug/ml sheared and denatured salmon sperm DNA, degree of identity to any of SEQ ID NOs: 4199-8395 of at and either 25% formamide for very low and low stringencies, least about 60%, preferably at least about 65%, more prefer 35% formamide for medium and medium-high Stringencies, ably at least about 70%, more preferably at least about 75%, or 50% formamide for high and very high stringencies, fol more preferably at least about 80%, more preferably at least lowing standard Southern blotting procedures. 40 about 85%, even more preferably at least about 90%, most For long probes of at least 100 nucleotides in length, the preferably at least about 95%, and even most preferably at carrier material is finally washed three times each for 15 least about 97%, which have biological activity (hereinafter minutes using 2xSSC, 0.2% SDS preferably at least at 45° C. “homologous polypeptides’). In a preferred aspect, the (very low stringency), more preferably at least at 50°C. (low homologous polypeptides have an amino acid sequence stringency), more preferably at least at 55°C. (medium strin 45 which differs by ten amino acids, preferably by five amino gency), more preferably at least at 60° C. (medium-high acids, more preferably by four amino acids, even more pref stringency), even more preferably at least at 65° C. (high erably by three amino acids, most preferably by two amino stringency), and most preferably at least at 70° C. (very high acids, and even most preferably by one amino acid from the Stringency). amino acid sequences of SEQID NOs: 4199-8395. For short probes which are about 14 nucleotides to about 70 50 The polypeptides of the present invention preferably com nucleotides in length, Stringency conditions are defined as prise the amino acid sequence of any of SEQID NOs: 4199 prehybridization, hybridization, and washing post-hybridiza 8395 or an allelic variant thereof; or a fragment thereof that tion at about 5° C. to about 10° C. below the calculated T. has biological activity. In a more preferred aspect, the using the calculation according to Bolton and McCarthy polypeptides of the present invention comprise the amino (1962, Proceedings of the National Academy of Sciences USA 55 acid sequence of any of SEQID NOs: 4199-8395. In another 48: 1390) in 0.9 M. NaCl, 0.09 M Tris-HCl pH 7.6, 6 mM preferred aspect, the polypeptides of the present invention EDTA, 0.5% NP-40, 1xDenhardt's solution, 1 mM sodium comprise the mature polypeptide region of any of SEQ ID pyrophosphate, 1 mMSodium monobasic phosphate, 0.1 mM NOs: 4199-8395, or an allelic variant thereof; or a fragment ATP and 0.2 mg of yeast RNA per ml following standard thereof that has biological activity. In another preferred Southern blotting procedures. 60 aspect, the polypeptides of the present invention comprise the For short probes which are about 14 nucleotides to about 70 mature polypeptide region of any of SEQ ID NOs: 4199 nucleotides in length, the carrier material is washed once in 8395. In another preferred aspect, the polypeptides of the 6xSCC plus 0.1% SDS for 15 minutes and twice each for 15 present invention consist of the amino acid sequence of any of minutes using 6xSSC at 5'C to 10° C. below the calculated SEQ ID NOs: 4199-8395 or an allelic variant thereof; or a Tn. 65 fragment thereof that has biological activity. In another pre Under salt-containing hybridization conditions, the effec ferred aspect, the polypeptides of the present invention con tive T is what controls the degree of identity required sist of the amino acid sequence of any of SEQ ID NOs: US 8,168,417 B2 11 12 4199-8395. In another preferred aspect, the polypeptides con moters are described in “Useful proteins from recombinant sist of the mature polypeptide region of any of SEQID NOs: bacteria” in Scientific American, 1980, 242: 74-94; and in 4199-8395 oran allelic variant thereof; or a fragment thereof Sambrook et al., 1989, supra. that has biological activity. In another preferred aspect, the The control sequence may also be a suitable transcription polypeptides consist of the mature polypeptide region of any terminator sequence, a sequence recognized by a host cell to of SEC) ID NOs: 4199-8395. terminate transcription. The terminator sequence is operably In a fourth aspect, the present invention relates to isolated linked to the 3' terminus of the gene encoding the biologically Substances having biological activity which are encoded by active Substance. Any terminator which is functional in the polynucleotides which hybridize, as described above, under host cell of choice may be used in the present invention. very low stringency conditions, preferably low stringency 10 The control sequence may also be a signal peptide coding conditions, more preferably medium stringency conditions, region that codes for an amino acid sequence linked to the more preferably medium-high Stringency conditions, even amino terminus of a polypeptide and directs the encoded more preferably high Stringency conditions, and most prefer polypeptide into the cells secretory pathway. The 5' end of ably very high stringency conditions with which hybridize 15 the coding sequence of the nucleotide sequence may inher under very low stringency conditions, preferably low strin ently contain a signal peptide coding region naturally linked gency conditions, more preferably medium stringency con in translation reading frame with the segment of the coding ditions, more preferably medium-high Stringency conditions, region which encodes the secreted polypeptide. Alternatively, even more preferably high Stringency conditions, and most the 5' end of the coding sequence may contain a signal peptide preferably very high Stringency conditions with any of (i) the coding region which is foreign to the coding sequence. The genes of SEQID NOS: 2-4198, or subsequences thereof, or foreign signal peptide coding region may be required where (ii) complementary strands thereof. A Subsequence of any of the coding sequence does not naturally contain a signal pep SEQ ID NOS: 2-4198 may be at least 100 nucleotides or tide coding region. Alternatively, the foreign signal peptide preferably at least 200 nucleotides. Moreover, the subse coding region may simply replace the natural signal peptide quence may encode a fragment, e.g., a polypeptide fragment, 25 coding region in order to enhance Secretion of the polypep which has biological activity. tide. However, any signal peptide coding region which directs Nucleic Acid Constructs the expressed polypeptide into the secretory pathway of a host The present invention also relates to nucleic acid constructs cell of choice may be used in the present invention. comprising an isolated gene or isolated genes (e.g., operon) of Effective signal peptide coding regions for bacterial host the present invention operably linked to one or more control 30 cells are the signal peptide coding regions obtained from the sequences which direct the expression of the coding sequence genes for Bacillus NCIB 11837 maltogenic amylase, Bacillus in a suitable host cell under conditions compatible with the Stearothermophilus alpha-amylase, Bacillus licheniformis control sequences. subtilisin, Bacillus licheniformis beta-lactamase, Bacillus An isolated gene(s) of the present invention may be Stearothermophilus neutral proteases (nprT, nprS, nprM), and manipulated in a variety of ways to provide for production of 35 Bacillus subtilis prSA. Further signal peptides are described a biologically active substance encoded directly or indirectly by Simonen and Palva, 1993, Microbiological Reviews 57: by the gene(s). Manipulation of the nucleotide sequence prior 109-137. to its insertion into a vector may be desirable or necessary The control sequence may also be a propeptide coding depending on the expression vector. The techniques for modi region that codes for an amino acid sequence positioned at the fying nucleotide sequences utilizing recombinant DNA 40 amino terminus of a polypeptide. The resultant polypeptide is methods are well known in the art. known as a proenzyme or propolypeptide (or a Zymogen in The control sequence may be an appropriate promoter Some cases). A propolypeptide is generally inactive and can sequence, a nucleotide sequence which is recognized by a be converted to a mature active polypeptide by catalytic or host cell for expression of the gene(s) encoding the biologi autocatalytic cleavage of the propeptide from the propolypep cally active Substance. The promoter sequence contains tran 45 tide. The propeptide coding region may be obtained from the Scriptional control sequences which mediate the expression genes for Bacillus subtilis alkaline protease (aprE) and Bacil of the biologically active Substance. The promoter may be any lus subtilis neutral protease (nprT). nucleotide sequence which shows transcriptional activity in Where both signal peptide and propeptide regions are the host cell of choice including mutant, truncated, and hybrid present at the amino terminus of a polypeptide, the propeptide promoters, and may be obtained from genes encoding extra 50 region is positioned next to the amino terminus of a polypep cellular or intracellular polypeptides or biologically active tide and the signal peptide region is positioned next to the Substances either homologous or heterologous to the host amino terminus of the propeptide region. cell. It may also be desirable to add regulatory sequences which Examples of suitable promoters for directing the transcrip allow the regulation of the expression of a biologically active tion of the nucleic acid constructs of the present invention, 55 substance relative to the growth of the host cell. Examples of especially in a bacterial host cell, are the promoters obtained regulatory systems are those which cause the expression of from the E. colilac operon, Streptomyces coelicolor agarase the gene to be turned on or offin response to a chemical or gene (dagA), Bacillus subtilis levanSucrase gene (sacB), physical stimulus, including the presence of a regulatory Bacillus licheniformis alpha-amylase gene (amyl), Bacillus compound. Regulatory systems in prokaryotic systems Stearothermophilus maltogenic amylase gene (amyM), 60 include the lac, tac, and trp operator systems. Other examples Bacillus amyloliquefaciens alpha-amylase gene (amyO), ofregulatory sequences are those which allow for gene ampli Bacillus licheniformis penicillinase gene (penP), Bacillus fication. In eukaryotic systems, these include the dihydro subtilis XylA and XylB genes, and prokaryotic beta-lactamase folate reductase gene which is amplified in the presence of gene (Villa-Kamaroff et al., 1978, Proceedings of the , and the metallothionein genes which are ampli National Academy of Sciences USA 75: 3727-3731), as well 65 fied with heavy metals. In these cases, the nucleotide as the tac promoter (DeBoer et al., 1983, Proceedings of the sequence encoding the biologically active Substance would National Academy of Sciences USA 80: 21-25). Further pro be operably linked with the regulatory sequence. US 8,168,417 B2 13 14 Expression Vectors the integrational elements may be non-encoding or encoding The present invention also relates to recombinant expres nucleotide sequences. On the other hand, the vector may be sion vectors comprising an isolated gene of the present inven integrated into the genome of the host cell by non-homolo tion, a promoter, and transcriptional and translational stop gous recombination. signals. The various nucleic acid and control sequences 5 For autonomous replication, the vector may further com described above may be joined together to produce a recom prise an origin of replication enabling the vector to replicate binant expression vector which may include one or more autonomously in the host cell in question. The origin of rep convenient restriction sites to allow for insertion or substitu lication may be any plasmid replicator mediating autono tion of the nucleotide sequence encoding the polypeptide at mous replication which functions in a cell. The term “origin Such sites. Alternatively, a gene of the present invention may 10 of replication” or “plasmid replicator is defined herein as a be expressed by inserting the nucleotide sequence or a nucleic sequence that enables a plasmid or vector to replicate in vivo. acid construct comprising the sequence into an appropriate Examples of bacterial origins of replication are the origins of vector for expression. In creating the expression vector, the replication of plasmids pBR322, puC19, p.ACYC177, and coding sequence is located in the vector so that the coding pACYC184 permitting replication in E. coli, and puB110, sequence is operably linked with the appropriate control 15 pE 194, pTA1060, and pAME31 permitting replication in sequences for expression. Bacillus. The recombinant expression vector may be any vector More than one copy of a gene of the present invention may (e.g., a plasmidor ) which can be conveniently subjected be inserted into the host cell to increase production of the gene to recombinant DNA procedures and can bring about the product. An increase in the copy number of the gene can be expression of a gene of the present invention. The choice of obtained by integrating at least one additional copy of the the vector will typically depend on the compatibility of the sequence into the host cell genome or by including an ampli vector with the host cell into which the vector is to be intro fiable selectable marker gene with a gene of the present inven duced. The vectors may be linear or closed circular plasmids. tion where cells containing amplified copies of the selectable The vector may be an autonomously replicating vector, i.e., marker gene, and thereby additional copies of the gene of the a vector which exists as an extrachromosomal entity, the 25 present invention, can be selected for by cultivating the cells replication of which is independent of chromosomal replica in the presence of the appropriate selectable agent. tion, e.g., a plasmid, an extrachromosomal element, a min The procedures used to ligate the elements described above ichromosome, or an artificial chromosome. The vector may to construct the recombinant expression vectors of the present contain any means for assuring self-replication. Alternatively, invention are well known to one skilled in the art (see, e.g., the vector may be one which, when introduced into the host 30 Sambrook et al., 1989, supra). cell, is integrated into the genome and replicated together Host Cells with the chromosome(s) into which it has been integrated. The present invention also relates to recombinant host Furthermore, a single vector or plasmid or two or more vec cells, comprising an isolated gene of the present invention, tors or plasmids which together contain the total DNA to be where the host cells are advantageously used in the recombi introduced into the genome of the host cell, or a transposon 35 nant production of a biologically active substance encoded by may be used. the gene. A vector comprising a gene of the present invention The vectors of the present invention preferably contain one is introduced into a host cell so that the vector is maintained or more selectable markers which permit easy selection of as a chromosomal integrant or as a self-replicating extra transformed cells. A selectable marker is a gene the product of chromosomal vector as described earlier. The term "host cell’ which provides for biocide or viral resistance, resistance to 40 encompasses any progeny of a parent cell that is not identical heavy metals, prototrophy to auxotrophs, and the like. to the parent cell due to mutations that occur during replica Examples of bacterial selectable markers are the dal genes tion. The choice of a host cell will to a large extent depend from Bacillus subtilis or Bacillus licheniformis, or markers upon the gene encoding the biologically active Substance and which confer antibiotic resistance Such as amplicillin, kana its source. mycin, chloramphenicol or tetracycline resistance. 45 The host cell may be any unicellular microorganism, e.g., The vectors of the present invention preferably contain an a prokaryote. element(s) that permits integration of the vector into the host Useful unicellular cells are bacterial cells such as gram cells genome or autonomous replication of the vector in the positive bacteria including, but not limited to, a Bacillus cell, cell independent of the genome. e.g., Bacillus alkalophilus, Bacillus amyloliquefaciens, For integration into the host cell genome, the vector may 50 Bacillus brevis, Bacillus cereus, Bacillus circulans, Bacillus rely on portions of the sequence of the gene or any other clausii, Bacillus coagulans, Bacillus fastidiosus, Bacillus fir element of the vector for integration of the vector into the mus, Bacillus lautus, Bacillus lentus, Bacillus licheniformis, genome by homologous or nonhomologous recombination. Bacillus macerans, Bacillus megaterium, Bacillus methan Alternatively, the vector may contain additional nucleotide olicus, Bacillus purnilus, Bacillus sphaericus, Bacillus sequences for directing integration by homologous recombi 55 Stearothermophilus, Bacillus subtilis, and Bacillus thuring nation into the genome of the host cell. The additional nucle iensis; or a Streptomyces cell, e.g., Streptomyces lividans and otide sequences enable the vector to be integrated into the Streptomyces murinus, or gram negative bacteria Such as E. host cell genome at a precise location(s) in the chromoso coli and Pseudomonas sp. In a preferred aspect, the bacterial me(s). To increase the likelihood of integration at a precise host cell is a Bacillus lentus, Bacillus licheniformis, Bacillus location, the integrational elements should preferably contain 60 Stearothermophilus, or Bacillus subtilis cell. In another pre a sufficient number of nucleotides, such as 100 to 10,000 base ferred aspect, the Bacillus cell is an alkalophilic Bacillus. pairs, preferably 400 to 10,000 base pairs, and most prefer The introduction of a vector into a bacterial host cell may, ably 800 to 10,000 base pairs, which are highly homologous for instance, be effected by protoplast transformation (see, with the corresponding target sequence to enhance the prob e.g., Chang and Cohen, 1979, Molecular General Genetics ability of homologous recombination. The integrational ele 65 168: 111-115), using competent cells (see, e.g., Young and ments may be any sequence that is homologous with the Spizizen, 1961, Journal of Bacteriology 81:823-829, or Dub target sequence in the genome of the host cell. Furthermore, nau and Davidoff-Abelson, 1971, Journal of Molecular Biol US 8,168,417 B2 15 16 ogy 56: 209-221), electroporation (see, e.g., Shigekawa and Protein Purification, J.-C. Janson and Lars Ryden, editors, Dower, 1988, Biotechniques 6: 742-751), or conjugation (see, VCH Publishers, New York, 1989). e.g., Koehler and Thorne, 1987, Journal of Bacteriology 169: Plants 5771-5278). The present invention also relates to a transgenic plant, Methods of Production 5 plant part, or plant cell which has been transformed with a The present invention also relates to methods for producing gene encoding a biologically active substance of the present a biologically active substance of the present invention com invention so as to express and produce the biologically active prising (a) cultivating a strain, which in its wild-type form is Substance in recoverable quantities. The biologically active capable of producing the biologically active Substance, under Substance may be recovered from the plant or plant part. conditions conducive for production of the biologically active 10 Alternatively, the plant or plant part containing the recombi Substance; and (b) recovering the biologically active Sub nant biologically active Substance may be used as Such for stance. Preferably, the strain is of the genus Bacillus, and improving the quality of a food or feed, e.g., improving nutri more preferably Bacillus licheniformis. tional value, palatability, and rheological properties, or to The present invention also relates to methods for producing 15 destroy an antinutritive factor. a biologically active substance of the present invention com The transgenic plant can be dicotyledonous (a dicot) or prising (a) cultivating a host cell under conditions conducive monocotyledonous (a monocot). Examples of monocot for production of the biologically active substance; and (b) plants are grasses, such as meadow grass (blue grass, Poa), recovering the biologically active Substance. forage grass such as festuca, lolium, temperate grass, such as The present invention also relates to methods for producing 20 Agrostis, and cereals, e.g., wheat, oats, rye, barley, rice, Sor a biologically active substance of the present invention com ghum, and maize (corn). prising (a) cultivating a host cell under conditions conducive Examples of dicot plants are tobacco, legumes, such as for production of the biologically active substance, wherein lupins, potato, Sugar beet, pea, bean and soybean, and crucif the host cell comprises a mutant polynucleotide comprising at erous plants (family Brassicaceae). Such as cauliflower, rape least one mutation in the coding region of any of SEQ ID 25 seed, and the closely related model organism Arabidopsis NOs: 2-4198, wherein the mutant polynucleotide encodes a thaliana. biologically active substance which consists of SEQID NOs: Examples of plant parts are stem, callus, leaves, root, fruits, 4199-8395, respectively, and (b) recovering the biologically seeds, and tubers as well as the individual tissues comprising active Substance. these parts, e.g., epidermis, mesophyll, parenchyma, vascular In the production methods of the present invention, the 30 tissues, meristems. In the present context, also specific plant cells are cultivated in a nutrient medium suitable for produc cell compartments, such as chloroplast, apoplast, mitochon tion of the biologically active substance using methods dria, vacuole, peroxisomes and cytoplasm are considered to known in the art. For example, the cell may be cultivated by be a plant part. Furthermore, any plant cell, whatever the shake flask cultivation, and Small-scale or large-scale fermen tissue origin, is considered to be a plant part. Likewise, plant tation (including continuous, batch, fed-batch, or Solid State 35 parts such as specific tissues and cells isolated to facilitate the fermentations) in laboratory or industrial fermentors per utilisation of the invention are also considered plant parts e.g. formed in a suitable medium and under conditions allowing embryos, endosperms, aleurone and seeds coats. the biologically active Substance to be expressed and/or iso Also included within the scope of the present invention are lated. The cultivation takes place in a suitable nutrient the progeny of Such plants, plant parts, and plant cells. medium comprising carbon and nitrogen Sources and inor- 40 The transgenic plant or plant cell expressing a biologically ganic salts, using procedures known in the art. Suitable media active Substance of the present invention may be constructed are available from commercial Suppliers or may be prepared in accordance with methods known in the art. Briefly, the according to published compositions (e.g., in catalogues of plant or plant cell is constructed by incorporating one or more the American Type Culture Collection). If the biologically expression constructs encoding a biologically active Sub active Substance is secreted into the nutrient medium, the 45 stance of the present invention into the plant host genome and biologically active substance can be recovered directly from propagating the resulting modified plant or plant cell into a the medium. If the biologically active substance is not transgenic plant or plant cell. secreted, it can be recovered from cell lysates. The expression construct is conveniently a nucleic acid The biologically active Substances may be detected using construct which comprises a gene encoding a biologically methods known in the art that are specific for the polypep- 50 active substance of the present invention operably linked with tides. These detection methods may include use of specific appropriate regulatory sequences required for expression of antibodies, formation of an enzyme product, or disappear the nucleotide sequence in the plant or plant part of choice. ance of an enzyme Substrate. For example, an enzyme assay Furthermore, the expression construct may comprise a select may be used to determine the activity of an enzyme. able marker useful for identifying host cells into which the The resulting biologically active Substances may be recov- 55 expression construct has been integrated and DNA sequences ered by methods known in the art. For example, the biologi necessary for introduction of the construct into the plant in cally active substances may be recovered from the nutrient question (the latter depends on the DNA introduction method medium by conventional procedures including, but not lim to be used). ited to, centrifugation, filtration, extraction, spray-drying, The choice of regulatory sequences, such as promoter and evaporation, or precipitation. 60 terminator sequences and optionally signal or transit The biologically active substances of the present invention sequences is determined, for example, on the basis of when, may be purified by a variety of procedures known in the art where, and how the biologically active substance is desired to including, but not limited to, chromatography (e.g., ion be expressed. For instance, the expression of the gene encod exchange, affinity, hydrophobic, chromatofocusing, and size ing a biologically active Substance of the present invention exclusion), electrophoretic procedures (e.g., preparative iso- 65 may be constitutive or inducible, or may be developmental, electric focusing), differential solubility (e.g., ammonium stage or tissue specific, and the gene product may be targeted sulfate precipitation), SDS-PAGE, or extraction (see, e.g., to a specific tissue or plant part Such as seeds or leaves. US 8,168,417 B2 17 18 Regulatory sequences are, for example, described by Tague et Following transformation, the transformants having incor al., 1988, Plant Physiology 86: 506. porated therein the expression construct are selected and For constitutive expression the 35S-CaMV, the maize ubiq regenerated into whole plants according to methods well uitin 1 and the rice actin 1 promoter may be used (Franck et known in the art. Often the transformation procedure is al., 1980. Cell 21: 285-294, Christensen A H, Sharrock RA designed for the selective elimination of selection genes and Quail, 1992, Plant Mo. Biol. 18: 675-689; Zhang W. either during regeneration or in the following generations by McElroy D. and Wu R., 1991, Plant Cell 3: 1155-1165). using, for example, co-transformation with two separate Organ-specific promoters may be, for example, a promoter T-DNA constructs or site specific excision of the selection from storage sink tissues such as seeds, potato tubers, and gene by a specific recombinase. fruits (Edwards and Coruzzi, 1990, Ann. Rev. Genet. 24: 10 The present invention also relates to methods for producing 275-303), or from metabolic sink tissues such as meristems a biologically active Substance of the present invention com (Ito et al., 1994, Plant Mol. Biol. 24: 863-878), a seed specific prising (a) cultivating a transgenic plant or a plant cell com promoter Such as the glutelin, prolamin, globulin, or albumin prising a gene encoding a biologically active Substance of the promoter from rice (Wu et al., 1998, Plant and Cell Physiol present invention under conditions conducive for production ogy 39: 885-889), a Vicia faba promoter from the legumin B4 15 of the biologically active Substance; and (b) recovering the and the unknown seed protein gene from Vicia faba (Conrad biologically active Substance. et al., 1998, Journal of Plant Physiology 152: 708–711), a Removal or Reduction of Biologically Active Substance promoter from a seed oil body protein (Chen et al., 1998, The present invention also relates to methods for producing Plant and Cell Physiology 39: 935-941), the storage protein a mutant of a parent cell, which comprises disrupting or nap A promoter from Brassica napus, or any other seed spe deleting all or a portion of a gene encoding a biologically cific promoter known in the art, e.g., as described in WO active substance of the present invention, which results in the 91/14772. Furthermore, the promoter may be a leaf specific mutant cell producing less of the biologically active Sub promoter Such as the rbcs promoter from rice or tomato (Kyo stance than the parent cell when cultivated under the same Zuka et al., 1993, Plant Physiology 102:991-1000, the chlo conditions. rella virus methyltransferase gene promoter (Mitra 25 The mutant cell may be constructed by reducing or elimi and Higgins, 1994, Plant Molecular Biology 26: 85-93), or nating expression of a gene encoding or regulatory synthesis the aldP gene promoter from rice (Kagaya et al., 1995, of a biologically active Substance of the present invention Molecular and General Genetics 248: 668-674), or a wound using methods well known in the art, for example, insertions, inducible promoter Such as the potato pint promoter (Xuetal. disruptions, replacements, or deletions. The gene to be modi 1993, Plant Molecular Biology 22:573-588). Likewise, the 30 fied or inactivated may be, for example, the coding region or promoter may inducible by abiotic treatments such as tem a part thereof essential for activity, or a regulatory element of perature, drought or alterations in salinity or induced by exog the gene required for the expression of the coding region. An enously applied Substances that activate the promoter, e.g., example of Such a regulatory or control sequence may be a , oestrogens, plant hormones like ethylene, abscisic promoter sequence or a functional part thereof, i.e., a part that acid and gibberellic acid and heavy metals. 35 is sufficient for affecting expression of the gene. Other control A promoter enhancer element may also be used to achieve sequences for possible modification include, but are not lim higher expression of the enzyme in the plant. For instance, the ited to, a leader, propeptide sequence, signal peptide promoter enhancer element may be an intron which is placed sequence, transcription terminator, and transcriptional acti between the promoter and the nucleotide sequence encoding VatOr. a biologically active substance of the present invention. For 40 Modification or inactivation of the gene may be performed instance, Xu et al., 1993, supra disclose the use of the first by Subjecting the parent cell to mutagenesis and selecting for intron of the rice actin 1 gene to enhance expression. mutant cells in which expression of the gene has been reduced The selectable marker gene and any other parts of the or eliminated. The mutagenesis, which may be specific or expression construct may be chosen from those available in random, may be performed, for example, by use of a Suitable the art. 45 physical or chemical mutagenizing agent, by use of a Suitable The nucleic acid construct is incorporated into the plant oligonucleotide, or by subjecting the DNA sequence to PCR genome according to conventional techniques known in the generated mutagenesis. Furthermore, the mutagenesis may art, including Agrobacterium-mediated transformation, be performed by use of any combination of these mutageniz virus-mediated transformation, microinjection, particle bom ing agents. bardment, biolistic transformation, and electroporation (Gas 50 Examples of a physical or chemical mutagenizing agent ser et al., 1990, Science 244: 1293; Potrykus, 1990, Bio/ suitable for the present purpose include ultraviolet (UV) irra Technology 8:535: Shimamoto et al., 1989, Nature 338:274). diation, hydroxylamine, N-methyl-N'-nitro-N-nitrosoguani Presently, Agrobacterium tumefaciens-mediated gene dine (MNNG), O-methyl hydroxylamine, nitrous acid, ethyl transfer is the method of choice for generating transgenic methane sulphonate (EMS), sodium bisulphite, formic acid, dicots (for a review, see Hooykas and Schilperoort, 1992, 55 and nucleotide analogues. Plant Molecular Biology 19: 15-38). However it can also be When Such agents are used, the mutagenesis is typically used for transforming monocots, although other transforma performed by incubating the parent cell to be mutagenized in tion methods are generally preferred for these plants. Pres the presence of the mutagenizing agent of choice under Suit ently, the method of choice for generating transgenic mono able conditions, and screening and/or selecting for mutant cots is particle bombardment (microscopic gold or tungsten 60 cells exhibiting reduced or no expression of the gene. particles coated with the transforming DNA) of embryonic Modification or inactivation of the nucleotide sequence calli or developing embryos (Christou, 1992, Plant Journal 2: may be accomplished by introduction, Substitution, or 275-281; Shimamoto, 1994, Current Opinion Biotechnology removal of one or more nucleotides in the gene or a regulatory 5: 158-162; Vasil et al., 1992, Bio/Technology 10: 667-674). element required for the transcription or translation thereof. An alternative method for transformation of monocots is 65 For example, nucleotides may be inserted or removed so as to based on protoplast transformation as described by Omirulleh result in the introduction of a stop codon, the removal of the et al., 1993, Plant Molecular Biology 21: 415-428. start codon, or a change in the open reading frame. Such US 8,168,417 B2 19 20 modification or inactivation may be accomplished by site preferably at least 85%, still more preferably at least 95%, and directed mutagenesis or PCR generated mutagenesis in most preferably at least 99% of the biologically active sub accordance with methods known in the art. Although, in stance. Complete removal of biologically active Substance principle, the modification may be performed in vivo, i.e., may be obtained by use of this method. directly on the cell expressing the nucleotide sequence to be The methods used for cultivation and purification of the modified, it is preferred that the modification be performed in product of interest may be performed by methods known in vitro as exemplified below. the art. An example of a convenient way to eliminate or reduce The methods of the present invention for producing an expression of a nucleotide sequence by a cell of choice is essentially biologically active substance-free product is of based on techniques of gene replacement, gene deletion, or 10 gene disruption. For example, in the gene disruption method, particular interest in the production of prokaryotic polypep a nucleic acid sequence corresponding to the endogenous tides, in particular bacterial proteins such as enzymes. The nucleotide sequence is mutagenized in vitro to produce a enzyme may be selected from, e.g., an amylolytic enzyme, defective nucleic acid sequence which is then transformed lipolytic enzyme, proteolytic enzyme, cellulytic enzyme, oxi into the parent cell to produce a defective gene. By homolo 15 doreductase, or plant cell-wall degrading enzyme. Examples gous recombination, the defective nucleic acid sequence of Such enzymes include an aminopeptidase, amylase, amy replaces the endogenous nucleotide sequence. It may be loglucosidase, carbohydrase, carboxypeptidase, catalase, desirable that the defective nucleotide sequence also encodes cellulase, chitinase, cutinase, cyclodextrin glycosyltrans a marker that may be used for selection of transformants in ferase, deoxyribonuclease, esterase, galactosidase, beta-ga which the nucleotide sequence has been modified or lactosidase, glucoamylase, glucose oxidase, glucosidase, destroyed. In a particularly preferred aspect, the nucleotide haloperoxidase, hemicellulase, invertase, isomerase, laccase, sequence is disrupted with a selectable marker Such as those ligase, lipase, lyase, mannosidase, oxidase, pectinolytic described herein. enzyme, peroxidase, phytase, phenoloxidase, polyphenoloxi Alternatively, modification or inactivation of the nucle dase, proteolytic enzyme, ribonuclease, transferase, trans otide sequence may be performed by established anti-sense 25 glutaminase, or Xylanase. The biologically active Substance techniques using a sequence complementary to the nucleotide deficient cells may also be used to express heterologous sequence. More specifically, expression of the nucleotide proteins of pharmaceutical interest Such as hormones, growth sequence by a cell may be reduced or eliminated by introduc factors, receptors, and the like. ing a sequence complementary to the nucleic acid sequence It will be understood that the term “prokaryotic polypep of the gene that may be transcribed in the cell and is capable 30 tides’ includes not only native polypeptides, but also those of hybridizing to the mRNA produced in the cell. Under polypeptides, e.g., enzymes, which have been modified by conditions allowing the complementary anti-sense nucleotide amino acid substitutions, deletions or additions, or other such sequence to hybridize to the mRNA, the amount of protein modifications to enhance activity, thermostability, pH toler translated is thus reduced or eliminated. ance and the like. The present invention further relates to a mutant cell of a 35 In a further aspect, the present invention relates to a product parent cell which comprises a disruption or deletion of a of a protein or Substance essentially free of a biologically nucleotide sequence encoding the biologically active Sub active substance of the invention, produced by a method of the stance or a control sequence thereof, which results in the present invention. mutant cell producing less of the biologically active Sub Compositions stance than the parent cell. 40 The present invention also relates to compositions com The biologically active substance-deficient mutant cells so prising a biologically active Substance of the present inven created are particularly useful as host cells for the expression tion. Preferably, the compositions are enriched in the biologi of homologous and/or heterologous Substances, such as cally active substance. The term “enriched indicates that the polypeptides. Therefore, the present invention further relates biologically active Substance of the composition has been to methods for producing a homologous or heterologous Sub 45 increased, e.g., with an enrichment factor of 1.1. stance comprising (a) cultivating the mutant cell under con The composition may comprise a biologically active Sub ditions conducive for production of the substance; and (b) stance of the invention as the major component, e.g., a mono recovering the substance. The term "heterologous sub component composition. Alternatively, the composition may stances is defined herein as Substances which are not native comprise multiple biologically active Substances, for to the host cell, a native Substance in which modifications 50 example, multiple enzymes, such as an aminopeptidase, amy have been made to alter the native sequence, or a native lase, carbohydrase, carboxypeptidase, catalase, cellulase, Substance whose expression is quantitatively altered as a chitinase, cutinase, cyclodextrin glycosyltransferase, deox result of a manipulation of the host cell by recombinant DNA yribonuclease, esterase, alpha-galactosidase, beta-galactosi techniques. dase, glucoamylase, alpha-glucosidase, beta-glucosidase, In a further aspect, the present invention relates to a method 55 haloperoxidase, invertase, laccase, lipase, mannosidase, oxi for producing a protein product essentially free of a biologi dase, pectinolytic enzyme, peptidoglutaminase, peroxidase, cally active substance by fermentation of a cell which pro phytase, polyphenoloxidase, proteolytic enzyme, ribonu duces both a biologically active substance of the present clease, transglutaminase, or Xylanase. invention as well as the protein product of interest by adding The compositions may be prepared in accordance with an effective amount of an agent capable of inhibiting activity 60 methods known in the art and may be in the form of a liquid of the biologically active substance to the fermentation broth ora dry composition. For instance, the composition may be in before, during, or after the fermentation has been completed, the form of a granulate or a microgranulate. The biologically recovering the product of interest from the fermentation active Substance to be included in the composition may be broth, and optionally subjecting the recovered product to stabilized in accordance with methods known in the art. further purification. 65 Methods for Using the Bacillus licheniformis Chromosome In accordance with this aspect of the invention, it is pos The present invention also relates to methods for using the sible to remove at least 60%, preferably at least 75%, more Bacillus licheniformis chromosome. US 8,168,417 B2 21 22 The chromosome of Bacillus licheniformis serves as a res Methods for Isolating Genes ervoir of new genes/proteins that have likely environmental, The present invention also relates to methods for isolating energy, health, and industrial applications (e.g., enzymes, a gene encoding a biologically active Substance from a micro antibiotics, biochemicals). A clear extension of this is that the bial strain. The method comprises first the addition of a mix newly discovered molecules can be used as starting points for ture offirst labeled nucleic acid probes, isolated from a micro further improvements via well-established gene shuffling, bial strain cultured on medium without an inducing Substrate, directed evolution, and protein engineering methods. Addi and a mixture of second labeled nucleic acid probes, isolated tionally, regions or motifs (e.g., signal peptides, active sites, from the microbial strain cultured on medium with the induc Substrate-binding regions) from the newly discovered mol ing Substrate, to an array of Bacillus licheniformis genes ecules may be employed to derive novel chimeras with indus 10 selected from the group consisting of nucleotides SEQ ID trially advantageous properties. NOs: 2-4198, complementary strands of SEQ ID NOs: 2-4198, or fragments of SEQID NOS: 2-4198, under condi The genes encoded in the chromosome may be used for tions where the labeled nucleic acid probes hybridize to monitoring global gene expression during the life cycle of the complementary sequences of the Bacillus licheniformis organism or during industrial fermentations (e.g., imple 15 genes on the array. The first nucleic acid probes are labeled mented on DNA microarrays). By monitoring global gene with a first reporter and the second nucleic acid probes are expression, for example, improved processes for industrial labeled with a second reporter. The array is then examined fermentation can be implemented with greater efficiency and under conditions wherein the relative expression of the genes economy. of the microbial strain is determined by the observed hybrid The chromosome is useful in comparative evolutionary ization reporter signal of each spot on the array in which (i) and ecological studies. For example, dozens of Bacillus the Bacillus licheniformis genes on the array that hybridize to licheniformis isolates can be readily compared on a global the first nucleic acid probes produce a distinct first hybridiza scale by hybridization of their genomic DNAS to a microarray tion reporter signal or to the second nucleic acid probes pro fabricated from the reference strain presented in this case duce a distinct second hybridization reporter signal, and (ii) (so-called comparative genomic hybridization). Using this 25 the Bacillus licheniformis genes on the array that hybridize to method, one can compare various isolates to look for simi both the first and second nucleic acid probes produce a dis larities/differences among geographical and environmental tinct combined hybridization reporter signal. The probe is niches or among biocontrol strains versus saprophytic iso then sequenced to isolate from the microbial strain the corre lates. sponding gene that encodes an enzyme that degrades or con The chromosome sequence may be used to construct the 30 verts the substrate. metabolic blueprint for Bacillus licheniformis that includes Enzymes. The gene of interest may encode any enzyme all catabolic and anabolic pathways, signaling pathways, including an , transferase, hydrolase, lyase, regulatory networks, growth Substrates, biochemical inter isomerase, or ligase. In a preferred aspect, the enzyme is an mediates, end products, electron donors/acceptors and others. acylase, alpha-glucosidase, amidase, aminopeptidase, amy In doing so, it is possible to modify the metabolic machinery 35 lase, carbohydrase, carboxypeptidase, catalase, cellulase, of the organism by deleting unwanted pathways and/or add chitinase, cutinase, cyclodextrin glycosyltransferase, deox ing enzymes/pathways from other organisms to generate use yribonuclease, dextrinase, endoglucanase, esterase, galacta ful chemicals and intermediates. nase, alpha-galactosidase, beta-galactosidase, glucoamylase, The pathways and components that contribute to produc glucanase, glucocerebrosidase, alpha-glucosidase, beta-glu tion of extracellular and surface proteins in Bacillus licheni 40 cosidase, hemicellulase, invertase, laccase, lignase, lipase, formis can be extracted from the chromosomal sequence. lysin, mannosidase, mutanase, oxidase, pectinolytic enzyme, This affords opportunities for improved production of extra peroxidase, phosphatase, phospholipase, phytase, polyphe cellular proteins by genetic manipulation of the Secretion noloxidase, proteolytic enzyme, pullulanase, ribonuclease, machinery. transglutaminase, urokinase, or Xylanase. The chromosome data allows deduction of the essential 45 Inducing Substrate. The inducing Substrate may be any genes for Bacillus licheniformis (either by comparison to Substrate that is Subject to the action of an enzyme, i.e., that related bacteria such as Bacillus subtilis or by systematic degrades or converts the Substrate. In a preferred aspect, the gene-by-gene knock outs). Thus it has become possible to inducing Substrate is lignin oralignin-containing material. In design custom-made strains which contain only the genes that a more preferred aspect, the lignin-containing material is are essential for production of specific proteins or metabolites 50 lignocellulose. In another preferred aspect, the inducing Sub (so-called cell factory concept). strate is cellulose. In another preferred aspect, the inducing The chromosome data may be used to construct interspe substrate is hemicellulose. In another preferred aspect, the cies hybrids between Bacillus licheniformis and other bacte inducing Substrate is pectin. In another preferred aspect, the ria. Venteret al., 2003, Proc. Nat. Acad. Sci. USA 100, 15440 inducing Substrate is a lipid. In another preferred aspect, the 15445 have shown that it is possible to construct an entire 55 inducing Substrate is phospholipid. In another preferred virus genome from Smaller DNA segments. Thus, segments aspect, the inducing Substrate is . In another pre of the Bacillus licheniformis chromosome may be employed ferred aspect, the inducing Substrate is protein. In another to derive novel chromosomal segments or even entire chi preferred aspect, the inducing Substrate is a starch. In another meric chromosomes for specific applications. preferred aspect, the inducing Substrate is a medium that is In a preferred aspect, methods for using the Bacillus 60 low in nutrients such as amino acids, carbon, nitrogen, phos licheniformis chromosome include host improvement, e.g., phate, or iron. secretion of a protein or metabolite, genome shuffling, con In a more preferred aspect, the protein substrate is blood, struction of new genomes, metabolic engineering and path casein, egg, gelatin, gluten, milk protein, or soy protein. In way reconstruction, carrier for heterologous expression vec another more preferred aspect, the lignin-containing material tors, microarrays as described herein, identification of 65 is hardwood thermomechanical pulp. In another more pre polypeptides in proteomics analyses, and comparative ferred aspect, the lignocellulose is corn Stover. In another genomics with other Bacillus species or related organisms. more preferred aspect, the lignocellulose is white poplar. In US 8,168,417 B2 23 24 anothermore preferred aspect, the lignocellulose is rice straw. dimensions, e.g., diameters, in the range of between about 10 In another more preferred aspect, the lignocellulose is Switch to about 250 um, preferably in the range of between about 10 graSS. to about 200 um, more preferably in the range of between Microbial Strains. In the methods of the present invention, about 20 to about 150 um, even more preferably in the range the microbial strain may be any microbial strain. The strain is of between about 20 to about 100 um, most preferably in the cultured on a suitable nutrient medium with and without a range of between about 50 to about 100 um, and even most substrate of interest. The strain cultured on medium without preferably in the range of between about 80 to about 100 um, the substrate is used as a reference for identifying differences and are separated from other gene elements in the microarray in expression of the same or similar complement of genes in by about the same distance. the strain cultured on medium with substrate. The strain may 10 Methods and instruments for forming microarrays on the be a wild-type, mutant, or recombinant strain. surface of a solid support are well known in the art. See, for In the methods of the present invention, the microbial strain example, U.S. Pat. No. 5,807,522; U.S. Pat. No. 5,700,637; is preferably a bacterium. In a more preferred aspect, the and U.S. Pat. No. 5,770,151. The instrument may be an auto bacterium is a Bacillus, Pseudomonas, or Streptomyces strain mated device such as described in U.S. Pat. No. 5,807,522. or E. coli. 15 The term “a substrate containing an array of Bacillus The Bacillus strain may be any Bacillus strain. In a pre licheniformis genes' is defined herein as a solid Support hav ferred aspect, the Bacillus strain is Bacillus alkalophilus, ing deposited on the Surface of the Support one or more of a Bacillus amyloliquefaciens, Bacillus brevis, Bacillus cereus, plurality of Bacillus licheniformis genes, as described herein, Bacillus circulans, Bacillus clausii, Bacillus coagulans, for use in detecting binding of labeled nucleic acids to the Bacillus fastidiosus, Bacillus firmus, Bacillus lautus, Bacillus Bacillus licheniformis genes. lentos, Bacillus licheniformis, Bacillus macerans, Bacillus The Substrate may, in one aspect, be a glass Support (e.g., megaterium, Bacillus methanolicus, Bacillus pumilus, Bacil glass slide) having a hydrophilic or hydrophobic coating on lus sphaericus, Bacillus Stearothermophilus, Bacillus subti the Surface of the Support, and an array of distinct random lis, or Bacillus thuringiensis. It will be understood that the nucleic acid fragments bound to the coating, where each term “Bacillus’ also encompasses relatives of Bacillus such 25 distinct random nucleic acid fragment is disposed at a sepa as Paenibacillus, Oceanobacillus, and the like. rate, defined position. The Pseudomonas strain may be any Pseudomonas strain. Each microarray in the Substrate preferably contains at In a preferred aspect, the Pseudomonas Strain is Pseudomo least 10 distinct Bacillus licheniformis in a surface area of nas acidovorans, Pseudomonas aeruginosa, Pseudomonas less than about 5 or 6 cm. Each distinct Bacillus lichenifor alcaligenes, Pseudomonas anguilliseptica, Pseudomonas 30 mis gene (i) is disposed at a separate, defined position on the abtimicrobica, Pseudomonas aurantiaca, Pseudomonas array, (ii) has a length of at least 50 bp, and (iii) is present in aureofaciens, Pseudomonas beijerinckii, Pseudomonas bore a defined amount between about 0.1 femtomoles and 100 opolis, Pseudomonas chlororaphis, Pseudomonas citronello nanomoles or higher if necessary. lis, Pseudomonas Cocovenenans, Pseudomonas diminuta, For a hydrophilic coating, the glass slide is coated by Pseudomonas doudorofii, Pseudomonas echinoides, 35 placing a film of a polycationic polymer with a uniform Pseudomonas elongata, Pseudomonas fluorescens, thickness on the surface of the slide and drying the film to Pseudomonas fragi, Pseudomonas halophobica, Pseudomo form a dried coating. The amount of polycationic polymer nas huttiensis, Pseudomonas indigofera, Pseudomonas lan added should be sufficient to form at least a monolayer of ceolata, Pseudomonas lemoignei, Pseudomonas lundensis, polymers on the glass Surface. The polymer film is bound to Pseudomonas mendocina, Pseudomonas mephitica, 40 the Surface via electrostatic binding between negative silyl Pseudomonas mucidolens, Pseudomonas Oleovorans, OH groups on the Surface and charged cationic groups in the Pseudomonas phenazinium, Pseudomonas pictorium, polymers. Such polycationic polymers include, but are not Pseudomonas putida, Pseudomonas resinovorans, limited to, polylysine and polyarginine. Pseudomonas saccharophila, Pseudomonas Stanieri, Another coating strategy employs reactive to Pseudomonas Stutzeri, Pseudomonas taetrolens, or 45 couple DNA to the slides (Schena et al., 1996, Proceedings of Pseudomonas vesicularis. the National Academy of Science USA 93: 10614-10619; The Streptomyces strain may be any Streptomyces strain. In Heller at al., 1997, Proceedings of the National Academy of a preferred aspect, the Streptomyces strain is Streptomyces Science USA 94: 2150-2155). lividans. In another preferred aspect, the Streptomyces strain Alternatively, the surface may have a relatively hydropho is Streptomyces murinus. 50 bic character, i.e., one that causes aqueous medium deposited Microarrays. The term “an array of Bacillus licheniformis on the surface to bead. A variety of known hydrophobic genes' is defined herein as a linear or two-dimensional array polymers, such as polystyrene, polypropylene, or polyethyl of preferably discrete elements of an array of Bacillus licheni ene, have desirable hydrophobic properties, as do glass and a formis genes selected from the group consisting of nucle variety of lubricant or other hydrophobic films that may be otides SEQID NOS: 2-4198, complementary strands of SEQ 55 applied to the Support Surface. A Support Surface is “hydro ID NOS: 2-4198, or fragments of SEQID NOs: 2-4198 (e.g., phobic' if an aqueous droplet applied to the Surface does not synthetic oligonucleotides of, for example, 40-60 nucle spread out substantially beyond the area size of the applied otides), wherein each discrete element has a finite area, droplet, wherein the Surface acts to prevent spreading of the formed on the surface of a solid support. It shall be understood droplet applied to the surface by hydrophobic interaction with that the term “Bacillus licheniformis genes' encompasses 60 the droplet. nucleotides SEQID NOS: 2-4198, complementary strands of In another aspect, the Substrate may be a multi-cell Sub SEQID NOS: 2-4198, or fragments of SEQID NOS: 2-4198. strate where each cell contains a microarray of Bacillus The term “microarray' is defined herein as an array of licheniformis and preferably an identical microarray, formed Bacillus licheniformis gene elements having a density of dis on a porous Surface. For example, a 96-cell array may typi crete of Bacillus licheniformis gene elements of at least about 65 cally have array dimensions between about 12 and 244 mm in 100/cm, and preferably at least about 1000/cm. The Bacil width and 8 and 400 mm in length, with the cells in the array lus licheniformis gene elements in a microarray have typical having width and length dimension of /12 and /8 the array US 8,168,417 B2 25 26 width and length dimensions, respectively, i.e., between which acts to limit the total bead area on the surface, and by about 1 and 20 in width and 1 and 50 mm in length. the surface tension of the droplet, which tends toward a given The Solid Support may include a water-impermeable back bead curvature. At this point, a given bead volume will have ing such as a glass slide or rigid polymer sheet, or other formed, and continued contact of the dispenser tip with the non-porous material. Formed on the Surface of the backing is bead, as the dispenser tip is being withdrawn, will have little a water-permeable film, which is formed of porous material. or no effect on bead volume. Such porous materials include, but are not limited to, nitro The desired deposition volume, i.e., bead volume, formed cellulose membrane nylon, polypropylene, and polyvi is preferably in the range 2 pl (picoliters) to 2nl (nanoliters), nylidene difluoride (PVDF) polymer. The thickness of the although Volumes as high as 100 ml or more may be dispensed. film is preferably between about 10 and 1000 um. The film 10 may be applied to the backing by spraying or coating, or by It will be appreciated that the selected dispensed volume will applying a preformed membrane to the backing. depend on (i) the “footprint of the dispenser tip(s), i.e., the Alternatively, the Solid Support may be simply a filter com size of the area spanned by the tip(s), (ii) the hydrophobicity posed of nitrocellulose, nylon, polypropylene, or polyvi of the Support Surface, and (iii) the time of contact with and nylidene difluoride (PVDF) polymer, or, for that matter, any 15 rate of withdrawal of the tip(s) from the support surface. In material suitable for use. addition, bead size may be reduced by increasing the Viscos The film surface may be partitioned into a desirable array ity of the medium, effectively reducing the flow time of liquid of cells by water-impermeable grid lines typically at a dis from the dispensing device onto the Support Surface. The drop tance of about 100 to 2000 um above the film surface. The grid size may be further constrained by depositing the drop in a lines can be formed on the surface of the film by laying down hydrophilic region Surrounded by a hydrophobic grid pattern an uncured flowable resin or elastomer Solution in an array on the Support Surface. grid, allowing the material to infiltrate the porous film down At a given tip size, bead volume can be reduced in a to the backing, and then curing the grid lines to form the controlled fashion by increasing Surface hydrophobicity, cell-array Substrate. reducing time of contact of the tip with the Surface, increasing The barrier material of the grid lines may be a flowable 25 rate of movement of the tip away from the surface, and/or silicone, wax-based material, thermoset material (e.g., increasing the Viscosity of the medium. Once these param epoxy), or any other useful material. The grid lines may be eters are fixed, a selected deposition volume in the desired applied to the Solid Support using a narrow Syringe, printing picoliter to nanoliter range can be achieved in a repeatable techniques, heat-seal stamping, or any other useful method fashion. known in the art. 30 After depositing a liquid droplet of a Bacillus licheniformis Each well preferably contains a microarray of distinct gene sample at one selected location on a Support, the tip may Bacillus licheniformis genes. “Distinct Bacillus licheniformis be moved to a corresponding position on a second support, the genes' as applied to the genes forming a microarray is defined Bacillus licheniformis gene sample is deposited at that posi herein as an array member which is distinct from other array tion, and this process is repeated until the random nucleic acid members on the basis of a different Bacillus licheniformis 35 fragment sample has been deposited at a selected position on gene sequence or oligo sequence thereof, and/or different a plurality of Supports. concentrations of the same or distinct Bacillus licheniformis This deposition process may then be repeated with another genes and/or different mixtures of distinct Bacillus licheni random nucleic acid fragment sample at another microarray formis genes or different-concentrations of Bacillus licheni position on each of the Supports. formis genes. Thus an array of “distinct Bacillus lichenifor 40 The diameter of each Bacillus licheniformis gene region is mis genes' may be an array containing, as its members, (i) preferably between about 20-200 um. The spacing between distinct Bacillus licheniformis genes which may have a each region and its closest (non-diagonal) neighbor, mea defined amount in each member, (ii) different, graded con Sured from center-to-center, is preferably in the range of centrations of a specific Bacillus licheniformis gene, and/or about 20-400 um. Thus, for example, an array having a center (iii) different-composition mixtures of two or more distinct 45 to-center spacing of about 250 um contains about 40 regions/ Bacillus licheniformis genes. cm or 1,600 regions/cm. After formation of the array, the It will be understood, however, that in the methods of the support is treated to evaporate the liquid of the droplet form present invention, any type of Substrate known in the art may ing each region, to leave a desired array of dried, relatively flat be used. Bacillus licheniformis gene or oligo thereof regions. This The delivery of a known amount of a selected Bacillus 50 drying may be done by heating or under vacuum. The DNA licheniformis gene to a specific position on the Support Sur can also be UV-crosslinked to the polymer coating. face is preferably performed with a dispensing device Nucleic Add Probes. In the methods of the present inven equipped with one or more tips for insuring reproducible tion, the strains are cultivated in a nutrient medium with and deposition and location of the Bacillus licheniformis genes without a substrate using methods well known in the art for and for preparing multiple arrays. Any dispensing device 55 isolation of nucleic acids to be used as probes. For example, known in the art may be used in the methods of the present the strains may be cultivated by shake flask cultivation, small invention. See, for example, U.S. Pat. No. 5,807,522. scale or large-scale fermentation (including continuous, For liquid-dispensing on a hydrophilic Surface, the liquid batch, fed-batch, or solid state fermentations) in laboratory or will have less of a tendency to bead, and the dispensed volume industrial fermentors performed in a suitable medium. The will be more sensitive to the total dwell time of the dispenser 60 cultivation takes place in a Suitable nutrient medium compris tip in the immediate vicinity of the Support Surface. ing carbon and nitrogen sources and inorganic salts, using For liquid-dispensing on a hydrophobic surface, flow of procedures known in the art. Suitable media are available fluid from the tip onto the support surface will continue from from commercial Suppliers or may be prepared according to the dispenser onto the Support Surface until it forms a liquid published compositions (e.g., in catalogues of the American bead. At a given bead size, i.e., Volume, the tendency of liquid 65 Type Culture Collection). to flow onto the surface will be balanced by the hydrophobic The nucleic acid probes from the microbial strains cultured surface interaction of the bead with the support surface, on medium with and without Substrate may be any nucleic US 8,168,417 B2 27 28 acid including genomic DNA, cDNA, and RNA, and may be formamide, 4.1 xDenhardt's solution, 4.4xSSC, and 100 isolated using standard methods known in the art. ug/ml of herring sperm DNA. Arrays are washed after The populations of isolated nucleic acid probes may be removal of the coverslip in 4xSSC by immersion into 1 xSSC, labeled with detection reporters such as colorimetric, radio 0.1% SDS for 10 minutes, 0.1XSSC, 0.1% SDS twice for 10 active for example, P. P. or S), fluorescent reporters, or minutes, and 0.1xSSC twice for 10 minutes. other reporters using methods known in the art (Chen et al., For shorter nucleic acid probes which are about 50 nucle 1998, Genomics 51: 313-324; DeRisi at al., 1997, Science otides to about 100 nucleotides in length, conventional strin 278: 680-686; U.S. Pat. No. 5,770.367). gency conditions may be used. Such stringency conditions In a preferred aspect, the probes are labeled with fluores are defined as prehybridization, hybridization, and washing cent reporters. For example, the DNA probes may be labeled 10 during reverse transcription from the respective RNA pools post-hybridization at 5° C. to 10° C. below the calculated T. by incorporation of fluorophores as dye-labeled nucleotides using the calculation according to Bolton and McCarthy (DeRisi et al., 1997, supra), e.g., Cy5-labeled deoxyuridine (1962, Proceedings of the National Academy of Sciences USA triphosphate, or the isolated cDNAs may be directly labeled 48: 1390) in 0.9 M. NaCl, 0.09 M Tris-HCl pH 7.6, 6 mM with different fluorescent functional groups. Fluorescent-la 15 EDTA, 0.5% NP-40, 1xDenhardt's solution, 1 mM sodium beled nucleotides include, but are not limited to, fluorescein pyrophosphate, 1 mMSodium monobasic phosphate, 0.1 mM conjugated nucleotide analogs (green fluorescence), lissa ATP and 0.2 mg of yeast RNA per ml following standard mine nucleotide analogs (red fluorescence). Fluorescent Southern blotting procedures. functional groups include, but are not limited to, Cy3 (agreen The carrier material is finally washed once in 6xSSC plus fluorescent dye) and Cy5 (red fluorescent dye). 0.1% SDS for 15 minutes and twice each for 15 minutes using Array Hybridization. The labeled nucleic acids from the 6xSSC at 5° C. to 10° C. below the calculated T. two strains cultivated with and without substrate are then The choice of hybridization conditions will depend on the added to an array of Bacillus licheniformis genes under con degree of homology between the Bacillus licheniformis genes ditions where the nucleic acid pools from the two strains and the nucleic acid probes obtained from the strain cultured hybridize to complementary sequences of the Bacillus 25 with and without inducing substrate. For example, where the licheniformis genes on the array. For purposes of the present nucleic acid probes and the Bacillus licheniformis genes are invention, hybridization indicates that the labeled nucleic obtained from identical Strains, high Stringency conditions acids from the two strains hybridize to the Bacillus licheni may be most suitable. Where the strains are from a genus or formis genes under very low to very high Stringency condi species different from which the Bacillus licheniformis genes tions. 30 were obtained, low or medium stringency conditions may be A small volume of the labeled nucleic acids mixture is more suitable. loaded onto the substrate. The solution will spread to cover In a preferred aspect, the hybridization is conducted under the entire microarray. In the case of a multi-cell Substrate, one low stringency conditions. In a more preferred aspect, the or more solutions are loaded into each cell which stop at the hybridization is conducted under medium stringency condi barrier elements. 35 tions. In a most preferred aspect, the hybridization is con For nucleic add probes of at least about 100 nucleotides in ducted under high Stringency conditions. length, microarray hybridization conditions described by The entire solid support is than reacted with detection Eisen and Brown, 1999, Methods of Enzymology 303: 179 reagents if needed and analyzed using standard colorimetric, 205, may be used. Hybridization is conducted under a cover radioactive, or fluorescent detection means. All processing slip at 65° C. in 3xSSC for 4-16 hours followed by post 40 and detection steps are performed simultaneously to all of the hybridization at room temperature after removal of the cover microarrays on the solid Support ensuring uniform assay con slip in 2xSSC, 0.1% SDS by washing the array two or three ditions for all of the microarrays on the solid support. times in the solution, followed by successive washes in Detection. The most common detection method is laser 1xSSC for 2 minutes and 0.2xSSC wash for two or more induced fluorescence detection using confocal optics (Che minutes. 45 ung et al., 1998, Nat. Genet. 18: 225-230). The array is exam Conventional conditions of very low to very high strin ined under fluorescence excitation conditions such that (i) the gency conditions may also be used. Very low to very high Bacillus licheniformis genes on the array that hybridize to the stringency conditions are defined as prehybridization and first nucleic acid probes obtained from the strain cultured hybridization at 42° C. in 5xSSPE, 0.3% SDS, 200 ug/ml without inducing Substrate and to the second nucleic acid sheared and denatured salmon sperm DNA, and either 25% 50 probes obtained from the strain cultured with inducing sub formamide for very low and low stringencies, 35% forma strate produce a distinct first fluorescence emission color and mide for medium and medium-high Stringencies, or 50% a distinct second fluorescence emission color, respectively, formamide for high and very high Stringencies, following and (ii) the Bacillus licheniformis genes on the array that standard Southern blotting procedures. hybridize to substantially equal numbers of nucleic acid The carrier material is finally washed three times each for 55 probes obtained from the strain cultured without inducing 15 minutes using 2xSSC, 0.2% SDS preferably at least at 45° substrate and from the strain cultured with inducing substrate C. (very low stringency), more preferably at east at 50° C. produce a distinct combined fluorescence emission color; (low stringency), more preferably at least at 55° C. (medium wherein the relative expression of the genes in the strains can stringency), more preferably at least at 60°C. (medium-high be determined by the observed fluorescence emission color of stringency), even more preferably at least at 65° C. (high 60 each spot on the array. stringency), and most preferably at least at 70° C. (very high The fluorescence excitation conditions are based on the Stringency). selection of the fluorescence reporters. For example, Cy3 and For shorter nucleic acid probes which are less than 50 Cy5 reporters are detected with solid state lasers operating at nucleotides, microarray hybridization conditions described 532 nm and 632 nm, respectively. by Kane et al., 2000, Nucleic Acids Research 28:4552-4557, 65 However, other methods of detection well known in the art may be used. Hybridization is conducted under a Supported may be used Such as standard photometric, calorimetric, or coverslip at 42°C. for 16-18 hours at high humidity in 50% radioactive detection means, as described earlier. US 8,168,417 B2 29 30 Data Analysis. The data obtained from the scanned image the medium are characterized by determining the sequence of may then be analyzed using any of the commercially available the probe. Based on the sequence, the gene can then be iso image analysis software. The software preferably identifies lated using methods well known in the art. array elements, Subtracts backgrounds, deconvolutes multi The techniques used to isolate or clone a gene include color images, flags or removes artifacts, verifies that controls isolation from genomic DNA, preparation from cDNA, or a have performed properly, and normalizes the signals (Chenet combination thereof. The cloning of the gene from Such al., 1997, Journal of Biomedical Optics 2: 364-374). genomic DNA can be effected, e.g., by using the well known Several computational methods have been described for polymerase chain reaction (PCR) or antibody screening of the analysis and interpretation of microarray-based expres expression libraries to detect cloned DNA fragments with sion profiles including cluster analysis (Eisen at al., 1998, 10 shared structural features. See, e.g., Innis at al., 1990, PCR. A Proc. Nat Acad. Sci. USA 95: 14863-14868), parametric Guide to Methods and Application, Academic Press, New ordering of genes (Spellman et al., 1998, Mol. Biol. Cell 9: York. Other nucleic acid amplification procedures such as 3273-3297), and supervised clustering methods based on rep ligase chain reaction (LCR), ligated activated transcription resentative hand-picked or computer-generated expression (LAT) and nucleic acid sequence-based amplification profiles (Chu at al., 1998. Science 282: 699-705). Preferred 15 (NASBA) may be used. The gene may be cloned from the methods for evaluating the results of the microarrays employ strain of interest, or another or related organism and thus, for statistical analysis to determine the significance of the differ example, may be an allelic or species variant of the gene. ences in expression levels. In the methods of the present Methods for Monitoring Differential Expression of a Plural invention, the difference in the detected expression level is at ity of Genes least about 10% or greater, preferably at least about 20% or The present invention also relates to methods for monitor greater, more preferably at least about 50% or greater, even ing differential expression of a plurality of genes in a first more preferably at least about 75% or greater; and most bacterial cell relative to expression of the same genes in one or preferably at least about 100% or greater. more second bacterial cells, comprising: One such preferred system is the Significance Analysis of (a) adding a mixture of detection reporter-labeled nucleic Microarrays (SAM) (Tusher et al., 2001, Proc. Natl. Acad. 25 acids isolated from the bacterial cells to a Substrate containing Sci. USA 98: 5116-5121). Statistical analysis allows the deter an array of Bacillus licheniformis genes selected from the mination of significantly altered expression of levels of about group consisting of nucleotides SEQ ID NOS: 2-4198, 50% or even less. The PAM (or predictive analysis for complementary strands of SEQ ID NOS: 2-4198, or frag microarrays) represents another approach for analyzing the ments of SEQID NOS: 2-4198, under conditions where the results of the microarrays (Tibshirani at al., 2002, Proc. Natl. 30 detection reporter-labeled nucleic acids hybridize to comple Acad. Sci. USA99: 6567-6572). mentary sequences of the Bacillus licheniformis genes on the Cluster algorithms may also be used to analyze microarray array, wherein the nucleic acids from the first bacterial cell expression data. From the analysis of the expression profiles and the one or more second bacterial cells are labeled with a it is possible to identify co-regulated genes that perform com first detection reporter and one or more different second mon metabolic or biosynthetic functions. Hierarchical clus 35 detection reporters, respectively; and tering has been employed in the analysis of microarray (b) examining the array under conditions wherein the rela expression data in order to place genes into clusters based on tive expression of the genes in the bacterial cells is determined sharing similar patterns of expression (Eisen at al., 1998, by the observed detection signal of each spot on the array in Supra). This method yields a graphical display that resembles which (i) the Bacillus licheniformis genes on the array that a kind of phylogenetic tree where the relatedness of the 40 hybridize to the nucleic acids obtained from either the first or expression behavior of each gene to every other gene is the one or more second bacterial cells produce a distinct first depicted by branch lengths. The programs Cluster and Tree detection signal or one or more second detection signals, View, both written by Michael Eisen (Eisen et al., 1998 Proc. respectively, and (ii) the Bacillus licheniformis genes on the Nat. Acad. Sci. USA 95: 14863-14868) are freely available. array that hybridize to the nucleic acids obtained from both Genespring is a commercial program available for Such 45 the first and one or more second bacterial produce a distinct analysis (Silicon Genetics, Redwood City, Calif.). combined detection signal. Self-organizing maps (SOMs), a non-hierarchical method, The methods of the present invention may be used to moni have also been used to analyze microarray expression data tor global expression of a plurality of genes from a Bacillus (Tamayo et al., 1999, Proc. Natl. Acad. Sri. USA 96: 2907 cell, discover new genes, identify possible functions of 2912). This method involves selecting a geometry of nodes, 50 unknown open reading frames, and monitor gene copy num where the number of nodes defines the number of clusters. ber variation and stability. For example, the global view of Then, the number of genes analyzed and the number of changes in expression of genes may be used to provide a experimental conditions that were used to provide the expres picture of the way in which Bacillus cells adapt to changes in sion values of these genes are subjected to an iterative process culture conditions, environmental stress, or other physiologi (20,000-50,000 iterations) that maps the nodes and data 55 cal provocation. Other possibilities for monitoring global points into multidimensional gene expression space. After the expression include spore morphogenesis, recombination, identification of significantly regulated genes, the expression metabolic or catabolic pathway engineering. level of each gene is normalized across experiments. As a The methods of the present invention are particularly result, the expression profile of the genome is highlighted in advantageous since one spot on an array equals one gene or a manner that is relatively independent of each gene's expres 60 open reading frame because extensive follow-up character sion magnitude. Software for the “GENECLUSTER SOM ization is unnecessary since sequence information is avail program for microarray expression analysis can be obtained able, and the Bacillus licheniformis microarrays can be orga from the Whitehead/MIT Center for Genome Research. nized based on function of the gene products. SOMs can also be constructed using the GeneSpring software Microarrays. Methods for preparing the microarrays are package. 65 described herein. Isolation of Genes. Probes containing genes or portions Bacterial Cells. In the methods of the present invention, the thereofidentified to be induced by the present of substrate in two or more Bacillus cells may be any Bacillus cell where one US 8,168,417 B2 31 32 of the cells is used as a reference for identifying differences in It will be understood that the term 'Bacillus’ also encom expression of the same or similar complement of genes in the passes relatives of Bacillus Such as Paenibacillus, Oceano other cell(s). In one aspect, the two or more cells are the same bacillus, and the like. cell. For example, they may be compared under different In the methods of the present invention, the cells are culti growth conditions, e.g., limitation, nutrition, and/or vated in a nutrient medium Suitable for growth using methods physiology. In another aspect, one or more cells are mutants well known in the art for isolation of the nucleic acids to be of the reference cell. For example, the mutant(s) may have a used as probes. For example, the cells may be cultivated by different phenotype. In a further aspect, the two or more cells shake flask cultivation, Small-scale or large-scale fermenta are of different species (e.g., Bacillus clausii and Bacillus tion (including continuous, batch, fed-batch, or Solid State 10 fermentations) in laboratory or industrial fermentors per subtilis). In another further aspect, the two or more cells are of formed in a suitable medium. The cultivation takes place in a different genera. In an even further aspect, one or more cells Suitable nutrient medium comprising carbon and nitrogen are transformants of the reference cell, wherein the one or Sources and inorganic salts, using procedures known in the more transformants exhibit a different property. For example, art. Suitable media are available from commercial suppliers the transformants may have an improved phenotype relative 15 or may be prepared according to published compositions to the reference cell and/or one of the other transformants. (e.g., in catalogues of the AmericanType Culture Collection). The term "phenotype' is defined herein as an observable or Nucleic Acid Probes. The nucleic acid probes from the two outward characteristic of a cell determined by its genotype or more Bacillus cells may be any nucleic acid including and modulated by its environment. Such improved pheno genomic DNA, cDNA, and RNA, and may be isolated using types may include, but are not limited to, improved secretion standard methods known in the art, as described herein. The or production of a protein or compound, reduced or no secre populations of isolated nucleic acid probes may be labeled tion or production of a protein or compound, improved or with colorimetric, radioactive, fluorescent reporters, or other reduced expression of a gene, desirable morphology, an reporters using methods described herein. altered growth rate under desired conditions, relief of over In a preferred aspect, the probes are labeled with fluores expression mediated growth inhibition, or tolerance to low 25 cent reporters, e.g., Cy3 (a green fluorescent dye) and Cy5 oxygen conditions. (red fluorescent dye), as described herein. The Bacillus cells may be any Bacillus cells, but preferably Array Hybridization. The labeled nucleic acids from the Bacillus alkalophilus, Bacillus amyloliquefaciens, Bacillus two or more Bacillus cells are then added to a substrate brevis, Bacillus cereus, Bacillus circulans, Bacillus clausii, containing an array of Bacillus licheniformis genes under Bacillus coagulans, Bacillus fastidiosus, Bacillus firmus, 30 conditions, as described herein, where the nucleic acid pools Bacillus lautus, Bacillus lentus, Bacillus licheniformis, from the two or more Bacillus cells hybridize to complemen Bacillus macerans, Bacillus megaterium, Bacillus methan tary sequences of the Bacillus licheniformis genes on the olicus, Bacillus pumilus, Bacillus sphaericus, Bacillus array. Stearothermophilus, Bacillus subtilis, or Bacillus thuringien Detection and Data Analysis. The same methods as sis cells. 35 described herein are used for detection and data analysis. In a preferred aspect, the Bacillus cells are Bacillus alkalo Computer Readable Media and Computer-Based Systems philus cells. In another preferred aspect. The Bacillus cells are The Bacillus licheniformis chromosome and its genes Bacillus amyloliquefaciens cells. In another preferred aspect, described herein may be “provided in a variety of media to the Bacillus cells are Bacillus brevis cells. In another pre facilitate their use. The term “provided’ refers to a manufac ferred aspect, the Bacillus cells are Bacillus cereus cells. In 40 ture comprising an array of Bacillus licheniformis genes. another preferred aspect, the Bacillus cells are Bacillus cir Such manufactures provide the Bacillus licheniformis genes culans cells. In another preferred aspect, the Bacillus cells are in a form which allows one skilled in the art to examine the Bacillus clausii cells. In another preferred aspect, the Bacillus manufacture using means not directly applicable to examin cells are Bacillus coagulans cells. In another preferred aspect, ing the chromosome or a Subset thereofas it exists in nature or the Bacillus cells are Bacillus fastidiosus cells. In another 45 in purified form. preferred aspect, the Bacillus cells are Bacillus firmus cells. Thus, the present invention also relates to such a manufac In another preferred aspect, the Bacillus cells are Bacillus ture in the form of a computer readable medium comprising lautus cells. In another preferred aspect, the Bacillus cells are an array of Bacillus licheniformis genes selected from the Bacillus lentus cells. In another preferred aspect, the Bacillus group consisting of nucleotides SEQ ID NOS: 2-4198, cells are Bacillus licheniformis cells. In another preferred 50 complementary strands of SEQ ID NOS: 2-4198, or frag aspect, the Bacillus cells are Bacillus macerans cells. In ments of SEQID NOS: 2-4198. another preferred aspect, the Bacillus cells are Bacillus mega In one application of this aspect, the Bacillus licheniformis terium cells. In another preferred aspect, the Bacillus cells are genes of the present invention can be recorded on computer Bacillus methanolicus cells. In another preferred aspect, the readable media. The term “computer readable media' is Bacillus cells are Bacillus pumilus cells. In another preferred 55 defined hereinas any medium which can be read and accessed aspect, the Bacillus cells are Bacillus sphaericus cells. In by a computer. Such computer readable media include, but another preferred aspect, the Bacillus cells are Bacillus are not limited to, magnetic storage media, e.g., floppy discs, Stearothermophilus cells. In another preferred aspect, the hard disc storage medium, and magnetic tape; optical storage Bacillus cells are Bacillus subtilis cells. In another preferred media, e.g., CD-ROM, DVD; electrical storage media, e.g., aspect, the Bacillus cells are Bacillus thuringiensis cells. 60 RAM and ROM; and hybrids of these categories, e.g., mag In a more preferred aspect, the Bacillus cells are Bacillus netic/optical storage media. One skilled in the art can readily licheniformis cells. In a most preferred aspect, the Bacillus appreciate how any of the presently known computer read licheniformis cells are Bacillus licheniformis ATCC 14580 able media can be used to create a manufacture comprising cells. computer readable medium having recorded thereon a nucle In another more preferred aspect, the Bacillus cells are 65 otide sequence of the present invention. Likewise, it will be Bacillus clausii cells. In another most preferred aspect, the clear to those of skill how additional computer readable Bacillus clausii cells are Bacillus clausii NCIB 10309 cells. media that may be developed also can be used to create US 8,168,417 B2 33 34 analogous manufactures having recorded thereon a nucle As Stated above, the computer-based systems of the present otide sequence of the present invention. invention comprise a data storage means having stored As used herein, “recorded refers to a process for storing therein a nucleotide sequence of the present invention and the information on computer readable medium. One skilled in the necessary hardware means and software means for Support art can readily adopt any of the presently known methods for 5 ing and implementing a search means. recording information on computer readable medium togen The term “data storage means' is defined herein as erate manufactures comprising the nucleotide sequence memory which can store nucleotide sequence information of information of the present invention. the present invention, or a memory access means which can A variety of data storage structures are available for creat access manufactures having recorded thereon the nucleotide ing a computer readable medium having recorded thereon a 10 sequence information of the present invention. nucleotide sequence of the present invention. The choice of The term "search means’ refers is defined herein as one or the data storage structure will generally be based on the more programs which are implemented on the computer means chosen to access the stored information. In addition, a based system to compare a target sequence or target structural variety of data processor programs and formats can be used to motif with the sequence information stored within the data store the nucleotide sequence information of the present 15 storage means. Search means are used to identify fragments invention on computer readable medium. The sequence infor or regions of the present genomic sequences which match a mation can be represented in a word processing text file, particular target sequence or target motif. A variety of known formatted in commercially-available software such as Word algorithms are disclosed publicly and a variety of commer Perfect and Microsoft Word, or represented in the form of an cially available Software for conducting search means are and ASCII file, stored in a database application, such as DB2, can be used in the computer-based systems of the present Sybase, Oracle, or the like. A skilled artisan can readily adapt invention. Examples of Such software includes, but is not any number of data-processor structuring formats (e.g., text limited to, MacPattern (Fuchs, 1991, Comput. Appl. Biosci. 7: file or database) in order to obtain computer readable medium 105-106), BLASTN and BLASTX National Center for Bio having recorded thereon the nucleotide sequence information technology Information (NCBI). One skilled in the art can of the present invention. 25 readily recognize that any one of the available algorithms or Various computer Software programs are publicly available implementing software packages for conducting homology that allow a skilled artisan to access sequence information searches can be adapted for use in the present computer-based provided in a computer readable medium. Thus, by providing systems. in computer readable form an array of Bacillus licheniformis The term “target sequence' is defined here as any DNA genes selected from the group consisting of nucleotides SEQ 30 (genomic DNA, cDNA) or amino acid sequence of six or ID NOS: 2-4198, complementary strands of SEQ ID NOs: more nucleotides or two or more amino acids. One skilled in 2-4198, or fragments of SEQID NOS: 2-4198, enables one the art can readily recognize that the longer a target sequence skilled in the art to routinely access the provided sequence is, the less likely a target sequence will be present as a random information for a wide variety of purposes. occurrence in the database. The most preferred sequence Software utilizing the BLAST (Altschul et al., 1990, 35 length of a target sequence is from about 10 to 100 amino supra), BLAZE (Brutlag et al., 1993, Comp. Chem. 17:203 acids or from about 30 to 300 nucleotide residues. However, 207), GENEMARK (Lukashin and Borodovsky, 1998, it is well recognized that searches for commercially important Nucleic Acids Research 26: 1107-1115), GENSCAN (Burge fragments. Such as sequence fragments involved in gene and Karlin, 1997, Journal of Molecular Biology 268: 78-94), expression and protein processing, may be of shorter length. GLIMMER (Salzberg et al., 1998, Nucleic Acids Research 40 The term “a target structural motif or “target motif is 26: 544–548), and GRAIL (Xu et al., 1994, Comput. Appl. defined herein as any rationally selected sequence or combi Biosci. 10: 613-623) search algorithms may be used to iden nation of sequences in which the sequence(s) are chosen tify open reading frames (ORFS) within a genome of interest, based on a three-dimensional configuration which is formed which contain homology to ORFs or proteins from both upon the folding of the target motif. There are a variety of Bacillus licheniformis and Bacillus clausii and from other 45 target motifs known in the art. Protein target motifs include, organisms. Among the ORFs discussed herein are protein but are not limited to, enzyme active sites and signal encoding fragments of the Bacillus licheniformis and Bacil sequences. Substrate and binding domains, trans lus clausiigenomes useful in producing commercially impor membrane domains, and sites for post-translational modifi tant proteins, such as enzymes used in fermentation reactions cations. Nucleic acid target motifs include, but are not limited and in the production of commercially useful metabolites. 50 to, promoter sequences, hairpin structures and inducible The present invention further provides systems, particu expression elements (protein binding sequences), repeats, larly computer-based systems, which contain the sequence palindromes, dyad symmetries, and transcription and trans information described herein. Such systems are designed to lation start and stop sites. identify, among other things, genes and gene products— A variety of structural formats for the input and output many of which could be products themselves or used to 55 means can be used to input and output the information in the genetically modify an industrial expression host through computer-based systems of the present invention. A preferred increased or decreased expression of a specific gene format for an output means ranks fragments of the Bacillus sequence(s). licheniformis or Bacillus clausiigenomic sequences possess The term “a computer-based system” is herein defined as ing varying degrees of homology to the target sequence or the hardware means, software means, and data storage means 60 target motif. Such presentation provides one skilled in the art used to analyze the nucleotide sequence information of the with a ranking of sequences which contain various amounts present invention. The minimum hardware means of the com of the target sequence or target motif and identifies the degree puter-based systems of the present invention comprises a of homology contained in the identified fragment. central processing unit (CPU), input means, output means, A variety of comparing means can be used to compare a and data storage means. One skilled in the art can readily 65 target sequence or target motif with the data storage means to appreciate that any currently available computer-based sys identify sequence fragments of the Bacillus licheniformis and tem is suitable for use in the present invention. Bacillus clausii genomes. For example, implementing soft US 8,168,417 B2 35 36 ware which utilize the BLAST and BLAZE algorithms, Harbor, N.Y.), pp.397-454, and fosmid end sequencing (Kim, described in Altschul at al., 1990, supra, may be used to U.J., Shizuya, H., de Jong, P. J., Birren, B. and Simon, M.I., identify open reading frames within the Bacillus lichenifor 1992, Nucleic Acids Res. 20: 1083-1085; Longmire, J. L. and mis or Bacillus clausiigenome or the genomes of other organ Brown, N. C., 2003, Biotechniques 35: 50-54; Zhao, S., isms. A skilled artisan can readily recognize that any one of 5 Malek, J., Mahairas, G., Fu, L., Nierman, W., Venter, J. C., the publicly available homology search programs can be used and Adams, M. D., 2000, Genomics 63: 321-332). Genomic DNA of Bacillus licheniformis ATCC 14580 was as the search means for the computer-based systems of the isolated using the following method: A single colony was present invention. Suitable proprietary systems that may be used to inoculate 20 ml of LB broth (Davis, R. W. Botstein, known to those of skill also may be employed in this regard. D., and Roth, J. R. 1980, Advanced Bacterial Genetics, Cold Codon Usage Tables 10 Spring Harbor Press, Cold Spring Harbor, N.Y.) in a sterile The present invention further relates to methods for pre 125 ml Erlenmeyer flask. The culture was incubated at 37°C. paring a synthetic gene, comprising (a) generating a codon overnight with agitation at 240 rpm. The resulting cells were usage table based on codons used in one or more open reading collected by centrifugation in a 45 ml Oak Ridge tube for 10 frames or portions thereof of SEQID NO: 1, (b) constructing minutes at 6000xg, and the cell pellet was resuspended in 5 ml a synthetic gene or portion thereof that contains in place of 15 of Tris-glucose buffer (50 mM Tris-HCl, pH 8.0, 50 mM one or more native codons one or more preferred codons from glucose, 10 mM EDTA). Lysozyme was added to a final the codon usage table, and (c) recovering the synthetic gene. concentration of 50 ug/ml and the Suspension was incubated In a preferred aspect, the codon usage table is Table 4 and/or in a 37° C. water bath for 25 minutes. Next, 200 ul of 10% Table 5. SDS was added and the tubes were gently inverted several The Bacillus licheniformis chromosomal sequence of SEQ times. Five milliliters of a second detergent mixture (1% Brij, ID NO: 1 or portions thereof can be used to generate codon 1% deoxycholate, 50 mM EDTA, pH 7.5) was added, and the usage tables to design synthetic genes for their efficient het tubes were inverted several times while incubating for 20 erologous expression in Bacillus licheniformis host cells. The minutes at room temperature. An equal Volume of phenol: chloroform (1:1 V/v) was added and the tubes were inverted codon usage tables can be based on (1) the codon used in all gently at room temperature for 20-30 minutes. The tubes were the open reading frames, (2) selected open reading frames, (3) 25 centrifuged for 20 minutes at 12,000xg, 4°C. The top aque fragments of the open reading frames, or (4) fragments of ous layer was carefully removed with a wide-bore pipette and selected open reading frames. With a codon usage table, placed in a clean 45 ml Oak Ridge tube. The phenol:chloro synthetic genes can be designed with only the most preferred form extraction was repeated and /10 volume of 3 M sodium codon for each amino acid; with a number of common codons acetate pH 5.2 was added to the aqueous layer. Two volumes for each amino acid; or with the same or a similar statistical 30 of cold ethanol were carefully layered on top and the DNA average of codon usages found in the table of choice. was spooled from the Solution onto a sterile glass rod. The synthetic gene can be constructed using any method Spooled DNA was carefully rinsed in 70% ethanol and resus Such as site-directed mutagenesis or PCR generated mutagen pended in a suitable amount of TE buffer (10 mM Tris-HCl, esis in accordance with methods known in the art. Although, pH 8.0, 1 mM EDTA). in principle, the modification may be performed in Vivo, i.e., 35 Plasmid libraries were constructed using randomly directly on the cell expressing the nucleotide sequence to be sheared and BamHI-digested genomic DNA that was modified, it is preferred that the modification is performed in enriched for 2-3 kb fragments by preparative agarose gel vitro. electrophoresis (Berka, R. M., Schneider, P., Golightly, E. J., The synthetic gene can be further modified by operably Brown, S. H., Madden, M., Brown, K. M., Halkier, T., Mon linking the synthetic gene to one or more control sequences 40 dorf, K., and Xu, F., 1997, Appl. Environ. Microbiol. 63: which direct the expression of the coding sequence in a Suit 3151-3157). Approximately 49,000 random clones were able host cell under conditions compatible with the control sequenced using dye-terminator chemistry (Applied Biosys tems, Foster City, Calif.) with ABI 377 and ABI 3700 auto sequences using the methods described herein. Nucleic acid mated sequencers yielding approximately 6x coverage of the constructs, recombinant expression vectors, and recombinant genome. A combination of methods was employed for gap host cells comprising the synthetic gene can also be prepared 45 closure including sequencing on fosmids (Kim, U.J., Shi using the methods described herein. Zuya, H., de Jong, P. J., Birren, B., and Simon, M.I., 1992, The present invention also relates to methods for producing Nucleic Acids Res. 20: 1083-1085), primer walking on a polypeptide encoded by Such a synthetic gene comprising selected clones, and PCR-amplified DNA fragments. Fosmid (a) cultivating a host cell comprising the synthetic gene under libraries were constructed using a commercial kit from Epi conditions conducive for production of the polypeptide; and 50 centre (Madison, Wis.). Data from both ends of approxi (b) recovering the polypeptide. mately 1975 fosmid clones with an average insert size of 40 The present invention is further described by the following kb were incorporated to aid invalidation of the final assembly. examples which should not be construed as limiting the scope In total, the number of input reads was 62,685 with 78.6% of of the invention. these incorporated into the final assembly. Sequences were 55 base called using TraceTuner 2.0 (Paracel, Inc., Pasadena, EXAMPLES Calif.) and assembled using the Paracel Genome Assembler (Paracel, Inc., Pasadena, Calif.) with optimized parameters Example 1 and the quality score set to >20. Phrap, Crossmatch, and Consed were used for sequence finishing (Gordon D., Aba Shotgun DNA Sequencing and Genome Assembly 60 jian C., and Green P, 1998, Genome Res. 8: 195-202). Example 2 The genome of the type strain Bacillus licheniformis ATCC 14580 was sequenced by a combination of the whole Identification and Annotation of Open Reading genome shotgun method described by Wilson, R. K. and Frames (ORFs) Mardis, E. R., 1997. In Genome Analysis: A Laboratory 65 Manual, Vol. 1, eds. Birren, B., Green, E. D., Meyers, R. M., Protein coding regions in the assembled genome sequence and Roskams, J. (Cold Spring Harbor Press, Cold Spring data were identified using EasyGene (Larsen, T. S., and US 8,168,417 B2 37 38 Krogh, A., 2003, BMC Bioinformatics 4: 21), Glimmer The likely origin of replication was identified by similari (Delcher, A. L., Harmon, D., Kasif, S., White, O. and ties to several features of Bacillus subtilis origin (Moriya, S., Salzberg, S. L., 1999, Nucleic Acids Res. 27,4636-4641), and Fukuoka, T., Ogasawara, N., and Yoshikawa, H., 1988, FrameD (Schiex, T., Gouzy, J., Moisan, A. and de Oliveira, Y., EMBOJournal 7: 2911-2917: Ogasawara, N., Nakai, S., and 2003, Nucleic Acids Res. 31, 3738-3741). Only EasyGene Yoshikawa, H., 1994, DNA Res. 1, 1-14; Kadoya, R., Hassan, gene models with an R-value of less than 2 and log-odds score A. K. Kasahara, Y., Ogasawara, N., and Moriya, S., 2002, greater than -10 were used. Predicted proteins were com Mol. Microbiol. 45: 73-87; Tosato, V., Gjuracic, K., Vlahov pared to the non-redundant database PIR-NREF (Wu, C. H., icek, K., Pongor, S., Danchin, A., and Bruschi, C. V., 2003, Huang, H., Arminski, L., Castro-Alvear, J., Chen, Y. Hu, Z. FEMS Microbiol, Lett. 218: 23-30). These included (a) colo 10 calization of four genes (rpmH. dnaA, dnaN, and recF) found Z. Ledley, R. S. Lewis, K. C., Mewes, H. W., Orcutt, B.C., near the origin of the Bacillus subtilis chromosome, (b) GC 2002, Nucleic Acids Research 30: 35-37) and the Bacillus nucleotide skew (G-C)/(G+C) analysis, and (c) the pres subtilis genome (Subtill list) using BLASTP with an E-value ence of multiple dnaA-boxes (Pedersen, A.G., Jensen, L. J., threshold of 10. InterProScan was used to predict putative Brunak, S., Staerfeldt, H. H., and Ussery, D. W., 2000, Mol. function. (Zdobnov, E. M. and Apweiler, R., 2001, Bioinfor 15 Biol. 299: 907-930; Christensen, B. B., Atlung, T., and matics 17, 847-848). The InterPro analysis included compari Hansen, F. G., 1999, J. Bacterial. 181: 2683-2688; Majka, J., son to Pfam (Bateman, A., Coin, L. Durbin, R. Finn, R. D., Jakimowicz, D., Messer, W., Schrempf, H., Lisowski, M., and Hollich, V., Griffiths-Jones, S., Khanna, A., Sonnhammer, E. Zakrzewska-Czerwinska, J., 1999, Eur: J. Biochem. 260:325 L., et al., 2004, Nucleic Acids Res. 32, D138-D141), TIGR 335) and AT-rich sequences in the region immediately fam (Haft, D.J., Selengut, J. D. and White, O., 2003, Nucleic upstream of the dnaA gene. On the basis of these observa Acids Res. 31:371-373), Interpro (Apweiler, R., Attwood, T. tions, a residue of the BstBI restriction site was K. Bairock, A., Bateman, A., Birney, E., Biswas, M., Bucher, assigned between the rpmH and dnaA genes to be the first P. Cerutti, L., Corpet, F. Croning, M.D., et al., 2001, Nucleic nucleotide of the Bacillus licheniformis genome. The repli Acids Res. 29:37-40), signal peptide prediction using SignalP cation termination site was localized near 2.02 Mb by GC (Nielsen, H., Engelbrecht, J., Brunak, S., and von Heijne, G., 25 skew analysis. This region lies roughly opposite the origin of 1997, Protein Engineering 10: 1-6), and trans-membrane replication. domain prediction using TMHMM (Krogh, A., Larsson, B., Unlike Bacillus subtilis, no apparent gene encoding a rep von Heijne, G. and Sonnhammer, E. L. L., 2000, J. Mol. Biol. lication terminator protein (rtp) was found in Bacillus licheni 305, 567-580). formis. The Bacillus halodurans genome also lacks an irtp These ORFs were assigned to functional categories based 30 function (Takami, H., Nakasone, K., Takaki, Y. Maeno, G., on the Cluster of Orthologous Groups (COG) database with Sasaki, R., Masui, N., Fuji, F., Hirama, C., Nakamura, Y., manual verification as described (Tatusov, R. L., Koonin, E. Ogasawara, N. et al., 2000, Nucleic Acids Res. 28: 4317 V. and Lipman. D.J., 1997, Science 278: 631-637: Koonin, E. 4331), and it seems likely that Bacillus subtilis acquired the V. and Galperin, M.Y., 2002, Sequence-Evolution-Function: rtp gene following its divergence from Bacillus halodurans Computational Approaches in Comparative Genomics (Klu 35 and Bacillus licheniformis. wer, Boston)). Transfer RNA genes were identified using Transposable elements and prophages. The genome of tRNAscan-SE (Lowe, T. M. and Eddy, S. R., 1997, Nucleic Bacillus licheniformis ATCC 14580 was determined to con Acids Res. 25: 955-964). tain nine identical copies of a 1285 bp insertion sequence element termed IS3Bli1 (Lapidus, A., Galleron, N., Ander Example 3 40 sen, J. T. Jorgensen, P. L. Ehrlich, S. D., and Sorokin, A., 2002, FEMS Microbiol. Lett. 209: 23-30). This sequence General Features of the Bacillus licheniformis shares a number of features with other IS3 family elements Genome including direct repeats of three to five bp, a ten bp left inverted repeat, and a nine by right inverted repeat. IS3Bli1 The genome of Bacillus licheniformis ATCC 14580 was 45 encodes two predicted overlapping ORFs, designated orfA determined to consist of a circular molecule of 4.222.336 bp and orfB in relative translational reading frames of 0 and -1. with an average GC content of 46.2% (Table 2). No plasmids The presence of a “slippery heptamer motif, AAAAAAG, were found during the genome analysis, and none were found before the stop codon in orfA Suggests that programmed by agarose gel electrophoresis. translational frameshifting occurs between these two ORFs, The genome contains 4208 predicted protein-coding genes 50 resulting in a single gene product (Farabaugh, P., 1996, with an average size of 873 bp, seven rRNA operons, and 81 Microbiol. Rev. 60: 103-134). The orfB gene product harbors tRNA genes. Using a combination of several gene-finding the DD35E7K motif, a highly conserved pattern among algorithms 4208 protein coding ORFs were predicted. These insertion sequences. Eight of the IS3Bli1 elements lie in ORFs constitute 87% of the genome and have an average intergenic regions, and one interrupts the comP gene. In addi length of 873 bp. Approximately 48% of the ORFs are 55 tion to these insertion sequences, the genome encodes a puta encoded on one DNA strand and 52% on the other strand. tive transposase that is most closely related (E=1.8e-11) to Among the protein coding ORFs, 3948 (94%) have signifi one identified in the Thermoanaerobacter tengcongensis cant similarity to proteins in PIR, and 3.187 of these gene genome (Bao, Q., Tian, Y. Li, W., Xu, Z., Xuan, Z., Hu, S., models contain Interpro motifs and 2895 contain protein Dong, W., Yang, J., Chen, Y., Xue, Y., et al., 2002, Genome motifs found in PFAM. The number of hypothetical and con 60 Res. 12: 689-700), however, similar genes are also found in served hypothetical proteins in the Bacillus licheniformis the chromosomes of Bacillus halodurans (Takaki, H., Naka genome with hits in the FIR database was 1318 (212 con sone, K., Takaki, Y., Maeno, G., Sasaki, R., Masui, N., Fuji, F., served hypothetical ORFs). Among the list of hypothetical Hirama, C., Nakamura, Y., Ogasawara, N. et al., 2000, and conserved hypothetical ORFs, 683 (52%) have protein Nucleic Acids Res. 28: 4317-4331), Oceanobacillusiheyensis motifs contained in PFAM (148 conserved hypothetical 65 (Takagi, H. Takaki, Y., and Uchiyama, I., 2002, Nucleic Acids ORFs). There are 72 tRNA genes representing all 20 amino Res. 30: 3927-3935), Streptococcus agalactiae (Takahashi, acids and 7 rRNA operons. S., Detrick, S. Whiting, A. A. Blaschke-Bonkowksy, A. J., US 8,168,417 B2 39 40 Aoyagi, Y. Adderson, E. E., and Bohnsack, J. F., 2002, J. plexes. Thus, it is reasonable to expect that the central features Infect. Dis. 186: 1034-1038), and Streptococcus pyogenes of the secretory apparatus are conserved in Bacillus subtilis (Smoot, J. C., Barbian, K. D., Van Gompel, J. J. Smoot, L. and Bacillus licheniformis. M. Chaussee, M.S., Sylva, G. L., Sturdevant, D. E., Ricklefs, From the list of 139 sporulation genes tabulated by Kunstet S. M., Porcella, S. F., Parkins, L. D., et al., 2002, Proc. Natl. al. (Kunst, F., Ogasawara, N, Mozser, I., Albertini, A. M., Acad. Sci. U.S.A. 99: 4668-4673). Alloni, G., Azebedo, V., Bertero, M.G., Bessieres, P., Bolotin, The presence of several bacteriophage lysogens or proph A., and Borchert, S. et al., 1997, Nature 390:249-256), all but age-like elements was revealed by Smith-Waterman compari six have obvious counterparts in Bacillus licheniformis. Sons to other bacterial genomes and by their AT-rich signa These six exceptions (spsABCEFG) comprise an operon 10 involved in synthesis of spore coat polysaccharide in Bacillus tures. Prophage sequences, designated NZP1 and NZP3 subtilis. Additionally, the response regulator gene family (similar to PBSX and cp-105, respectively), were uncovered (phraCEFGI) appears to have a low level of sequence con by noting the presence of nearby genes encoding the large servation between Bacillus subtilis and Bacillus lichenifor Subunit of terminase, a signature protein that is highly con i.S. served among prophages (Casjens, S., 2003, Mol. Microbiol. 15 Natural competence (the ability to take up and process 49: 277-300). A terminase gene was not observed in a third exogenous DNA in specific growth conditions) is a feature of putative prophage, termed NZP2 (similarity to SPP1), how few Bacillus licheniformis strains (Gwinn, D.D. and Thorne, ever, its absence may be the result of genome deterioration C. B., 1964, J. Bacteriol. 87: 519–526). The reasons for vari during evolution. Regions were observed in which the GC ability in competence phenotype have not been explored at content is less than 39% usually encoded proteins that have no the genetic level, but the genome data offered several possible Bacillus subtilis orthologue and share identity only to hypo explanations. Although the type strain genome encodes all of thetical and conserved hypothetical genes. Two of these AT the late competence functions ascribed in Bacillus subtilis rich segments correspond to the NZP2 and NZP3 prophages. (e.g., comC, comEFG operons, comK, mecA), it lacks an Anisochore plot also revealed the presence of a region with obvious comS gene, and the comP gene is punctuated by an anatypically high (62%) G+C content. This segment contains 25 insertion sequence element, Suggesting that the early stages two hypothetical ORFs whose sizes (3831 and 2865 bp) of competence development have been preempted in Bacillus greatly exceed the size of an average gene in Bacillus licheni licheniformis ATCC 14580. Whether these early functions formis. The first protein encodes a protein of 1277 amino can be restored by introducing the corresponding genes from acids for which Interpro predicted 16 collagen triple helix Bacillus subtilis is unknown. In addition to an apparent defi repeats, and the amino acid pattern TGATGPT is repeated 75 30 ciency in DNA uptake, two Type I restriction-modification systems were discovered that may also contribute diminished times within the polypeptide. The second ORF is smaller, and transformation efficiencies. These are distinct from the ydi encodes a protein with 11 collagen triple helix repeats, and OPS genes of Bacillus subtilis, and could participate in deg the same TGATGPT motif recurs 56 times. Interestingly, the radation of improperly modified DNA from heterologous chromosomal region (19 kb) adjacent to these genes is clearly 35 hosts used during construction of recombinant expression non-colinear with the Bacillus subtilis genome, and virtually vectors. Lastly, the synthesis of a glutamyl polypeptide cap all of the predicted ORFs are hypothetical or conserved hypo Sule has also been implicated as a potential barrier to trans thetical proteins. There are a number of bacterial proteins formation of Bacillus licheniformis strains (Thorne, C. B. and listed in PIR that contain collagen triple helix repeat regions Stull, H. B., 1966, J. Bacteriol. 91: 1012-1020). Six genes including two from Mesorhizobium loti (accession numbers 40 were predicted (ywtABDEF andywsC orthologues) that may NF00607049 and NF00607035) and three from Bacillus be involved in the synthesis of this capsular material. cereus (accession numbers NF01692528, NF01269899, and Antibiotics and secondary metabolites. Bacitracin is a NF01694666). These putative orthologs share 53-76% amino cyclic peptide antibiotic that is synthesized non-ribosomally acid sequence identity with their counterparts in Bacillus in Bacillus licheniformis (Katz, E. and Demain, A. L., 1977, licheniformis, although their functions are unknown. 45 Bacteriol. Rev. 41: 449-474). While there is variation in the Extracellular enzymes. In the Bacillus licheniformis prevalence of bacitracin synthase genes in laboratory strains genome, 689 of the 4208 gene models have signal peptides as of this species, one study suggested that up to 50% may forecasted by SignalP (Identification of prokaryotic and harbor the bac operon (Ishihara, H., Takoh, M. Nishibayashi, eukaryotic signal peptides and prediction of their cleavage R., and Sato, A., 2002, Curr. Microbiol. 45:18-23). The bac sites, Henrik Nielsen, Jacob Engelbrecht, Søren Brunak and 50 operon was determined not to be present in the type strain Gunnar von Heijne, 1997, Protein Engineering 10: 1-6). Of (ATCC 14580) genome. Seemingly, the only non-ribosomal these, 309 have no trans-membrane domain as predicted with peptide synthase operon encoded by the Bacillus lichenifor TMHMM (A. Krogh, B. Larsson, G. von Heijne, and E. L. L. mis type strain genome is that which is responsible for Sonnhammer, 2000, Journal of Molecular Biology 305:567 lichenysin biosynthesis. Lichenysin structurally resembles 580) and 134 are hypothetical or conserved hypothetical 55 surfactin from Bacillus subtilis (Grangemard, I., Wallach, J., genes. Based on a manual examination of the remaining 175 Maget-Dana, R., and Peypoux, F., 2001, Appl. Biochem. Bio ORFs, at least 82 were determined to likely encode secreted technol. 90: 199-210), and their respective biosynthetic oper proteins and enzymes. The sequence ID numbers for each of ons are highly similar. No Bacillus licheniformis counterparts these genes are listed in Table 3. were found for the pps (plipastatin synthase) and polyketide Protein secretion, sporulation, and competence pathways. 60 synthase (pks) operons of Bacillus subtilis. Collectively, Kunstetal. (Kunst, F., Ogasawara, N, Mozser, I., Albertini, A. these two regions represent sizeable portions (80 kb and 38 M., Alloni, G., AZebedo, V., Bertero, M. G., Bessieres, P., kb, respectively) of the chromosome in Bacillus subtilis, Bolotin, A., and Borchert, S., 1997, Nature 390: 249-256) although they are reportedly dispensable (Westers, H., listed 18 genes that play a major role in the secretion of Dorenbos, R., van Dijl, J. M., Kable, J., Flanagan, T., Devine, extracellular enzymes by Bacillus subtilis 168. This list 65 K. M., Jude, F., Seror, S.J., Beekman, A. C., Darmon, E., includes several chaperonins, signal peptidases, components 2003, Mol. Biol. Evol. 20: 2076-2090). Unexpectedly, a gene of the signal recognition particle and proteintranslocase com cluster was found encoding a lantibiotic and associated pro US 8,168,417 B2 41 42 cessing and transport functions. This peptide of 69 amino genome organization exists between Bacillus licheniformis acids was designated as lichenicidin, and its closest known and Bacillus halodurans, and inversion of one or more large orthologue is mersacidin from Bacillus sp. strain HIL-Y85/ genomic segments is evident. Clearly this Supports previous 54728 (Altena, K., Guder, A., Cramer, C., and Bierbaum, G., findings (Xu, D. and Coté. J. C., 2003, Internat. J. Syst. Evol. Microbiol. 53: 695-704) that Bacillus subtilis and Bacillus 2000, Appl. Environ. Microbiol. 66: 2565-2571). Lantibiotics 5 licheniformis are phylogenetically and evolutionarily closer are ribosomally synthesized peptides that are modified post than either species is to Bacillus halodurans. However, a translationally so that the final molecules contain rare thioet number of important differences were also observed, both in her amino acids such as lanthionine and/or methyl-lanthion the numbers and locations of prophages and transposable ine (Pag, U. and Sahl, H. G., 2002, Curr: Pharm. Des. 8: elements and in a number of biochemical pathways, which 815-833). These antimicrobial compounds have attracted 10 distinguish Bacillus licheniformis from Bacillus subtilis, much attention in recent years as models for the design of new including a region of more than 80 kb that comprises a cluster antibiotics (Hoffmann, A., Pag, U. Wiedemann, I., and Sahl, of polyketide synthase genes that are absent in Bacillus H. G., 2002, Farmaco. 57: 685-691). licheniformis. Essential Genes. The gene models were also compared to the list of essential genes in Bacillus subtilis (Kobayashi, K., 15 Example 5 Ehrlich, S. D., Albertini, A., Amati, G., Andersen, K. K., Arnaud, M., Asai, K., Ashikaga, S., Aymerch, S., Bessieres, Codon Usage Tables P., 2003, Proc. Natl Acad. Sci. USA 100: 4678-4683). All the essential genes in Bacillus subtilis have orthologues in Bacil The evolution of codon bias, the unequal usage of synono lus licheniformis, and most are present in a wide range of mous codons, is thought to be due to natural selection for the bacterial taxa (Pedersen, P. B., Bjørnvad, M. E., Rasmussen, use of preferred codons that match the most abundant species M.D., and Petersen, J. N., 2002, Reg. Toxicol. Pharmacol. 36: of isoaccepting tPNAS, resulting in increased translational 155-161). efficiency and accuracy. The practical applications for utiliz ing codon bias information include optimizing expression of Example 4 25 heterologous and mutant genes (Jiang and Mannervik, 1999, Protein Expression and Purification 15: 92-98), site-directed Comparison of Bacillus licheniformis Genome with mutagenesis to derive variant polypeptides from a given gene Other Bacilli (Wong at al., 1995. J. Immunol. 154: 3351-3358; Kaji, H. et al., 1999, J. Biochem. 126: 769-775), design and synthesis of VisualGenome software (Rational Genomics, San Fran- 30 synthetic genes (Libertiniand DiDonato, 1992, Protein Engi cisco, Calif.) was used for GC-skew analysis and global neering 5: 821-825: Feng et al., 2000, Biochem. 39: 15399 homology comparisons of the Bacillus licheniformis, Bacil 15409), and fine-tuning or reducing of translation efficiency lus subtilis, and Bacillus halodurans genomes with pre-com of specific genes by introduction of non-preferred codons puted BLAST results stored in a local database. In pairwise (Crombie, T. et al., 1992, J. Mol. Biol. 228: 7-12: Carlini and comparisons (E-score threshold of 10) 66% (2771/4208) of 35 Stephan, 2003, Genetics 163: 239-243). the predicted Bacillus licheniformis ORFs have orthologs in A codon usage table (Table 4) was generated from SEQID Bacillus subtilis, and 55% (2321/4208) of the gene models NO: 1 with CUSP, a software component of the EMBOSS are represented by orthologous sequences in Bacillus halo package (Rice, Longden, and Bleasby, 2000, EMBOSS: The durans. Using a reciprocal BLASTP analysis 1719 orthologs European Molecular Biology Open Software Suite. Trends in were found that are common to all three species (E-score 40 Genetics 16: 276-277) on all the predicted protein-coding threshold of 10). genes of the Bacillus licheniformis chromosome. CUSP read As noted by Lapidus et al. (Lapidus, A., Galleron, N., the coding sequences and calculated the codon frequency Andersen, J. T., Jorgensen, P. L. Ehrlich, S. D., and Sorokin, table shown in Table 4. A., 2002, FEMS Microbiol. Lett. 209:23–30), there are broad A codon usage table (Table 5) was also generated based on regions of colinearity between the genomes of Bacillus 45 the signal peptides of the 82 extracellular proteins described licheniformis and Bacillus subtilis. Less conservation of in Example 3. TABLE 1.

Predicted functions

Bacilius subtiis SEQ homolog ID (Gene NO. Description UniRefAccession No. Organism Name) 2 Chromosomal replication initiator UniRef100 PO5648 Bacilius subtiis DnaA protein dnaA Bacilius subtilis 3 DNA polymerase III, beta chain UniRef100 PO5649 Bacilius subtiis DnaN Bacillus subtilis 4 5 DNA replication and repair protein recF UniReflOO PO5651 Bacilius subtiis RecR Bacillus subtilis 6 7 DNA gyrase subunit B Bacillus subtilis UniReflOO PO5652 Bacilius subtiis GyrB 8 DNA gyrase subunit A Bacillus subtilis UniReflOO PO5653 Bacilius subtiis Gyra 9 YaaC 10 Inosine-5'-monophosphate UniReflOO P21879 Bacilius subtiis GuaB dehydrogenase Bacilius Stibilis US 8,168,417 B2 43 44 TABLE 1-continued

Predicted functions

Bacilius subtiis SEQ homolog D (Gene NO. Description OniRef Accession No. Organism Name) 1 D-alanyl-D-alanine carboxypeptidase UniRef100 PO8750 Bacilius subtiis DacA precursor Bacillus subtilis 2 Pyridoxine biosynthesis protein pdx1 UniRef100 P37527 Bacilius subtiis YaaD Bacilius subtilis 3 Hypothetical UPF0030 protein yaaE UniRef100 P37528 Bacilius subtiis Yaa Bacilius subtilis 4 Seryl-tRNA synthetase Bacilius UniRef100 P37464 Bacilius subtiis SerS subtilis 5 Glycerate kinase Bacilius subtilis UniRef100 P42100 Bacilius subtiis YxaA 6 H+gluconate symporter Vibrio UniRef100 Q7MHW6 Vibrio vulnificus YF vulnificus 7 Sugar diacid utilization regulator Vibrio UniRef100 Q8DBZ9 Vibrio vulnificus YsfB vulnificus 8 Hypothetical protein Bacilius UniRef100 Q6HH43 Bacilius thuringiensis hiringiensis 9 Hypothetical protein yaaF Bacilius UniRef100 P37529 Bacilius subtiis Dck subtilis 20 Hypothetical protein yaa.G. Bacilius UniRef100 P3.7530 Bacilius subtiis Dgk ubtilis 21 Hypothetical protein yaaH Bacilius UniRef100 P37531 Bacilius subtiis Yaa H subtilis 22 Hypothetical protein yaal Bacilius UniRef100 P37532 Bacilius subtiis Yaa subtilis 23 Yaa 24 Dnax 25 Hypothetical UPFO133 protein yaaK UniRef100 P24281 Bacilius subtiis Yaak Bacilius subtiis 26 Recombination protein recRBacilius UniRef100 P24277 Bacilius subtiis RecR subtilis 27 Hypothetical protein yaa.L. Bacilius UniRef100 P37533 Bacilius subtiis subtilis 28 Sigma-K factor processing regulatory UniRef100 P24282 Bacilius subtiis protein BOFA Bacilius subtilis 29 CsfB protein Bacilius subtilis UniRef100 P37534 Bacilius subtiis 30 XpaC protein Bacillus subtilis UniRef100 P37467 Bacilius subtiis XpaC 31 Hypothetical protein yaaN Bacilius UniRef100 P37535 Bacilius subtiis YaaN subtilis 32 Yala.O 33 Thymidylate kinase Bacilius subtilis UniRef100 P3.7537 Bacilius subtiis Tmk 34 Hypothetical protein yaaO Bacilius UniRef100 P3.7538 Bacilius subtiis YaaCR subtilis 35 36 DNA polymerase III, delta' subunit UniRef100 P37540 Bacilius subtiis HoB Bacilius subtiis 37 Hypothetical protein yaaT Bacilius UniRef100 P37541 Bacilius subtiis YaaT subtilis 38 Hypothetical protein yabA Bacilius UniRef100 P37542 Bacilius subtiis YabA subtilis 39 Hypothetical protein yabB Bacilius UniRef100 P37543 Bacilius subtiis YabB subtilis 40 Hypothetical UPF0213 protein yaZA UniRef100 O31414 Bacilius subtiis Bacilius subtiis 41 Hypothetical UPF0011 protein yabC UniRef100 P37544 Bacilius subtiis YabC Bacilius subtiis 42 Transition state regulatory protein abrB UniRef100 PO8874 Bacilius subtiis Bacilius subtiis 43 Methionyl-tRNA synthetase Bacilius UniRef100 P37465 Bacilius subtiis MetS subtilis 44 Putative deoxyribonuclease yabD UniRef100 P37545 Bacilius subtiis Yab) Bacilius subtiis 45 YabE 46 Hypothetical protein yabF Bacilius UniRef100 P37547 Bacilius subtiis RinmV subtilis 47 Dimethyladenosine transferase (EC UniRef100 P37468 Bacilius subtiis KsgA 2.1.1.-) (S-adenosylmethionine-6-N',N'- adenosyl(rRNA) dimethyltransferase) Bacilius subtilis 48 Hypothetical protein yabG Bacilius UniRef100 P37548 Bacilius subtiis YabC subtilis 49 Veg protein Bacilius subtilis UniRef100 P37466 Bacilius subtiis 50 SspF protein Bacillus subtilis UniRef100 P37549 Bacilius subtiis 51 IspE US 8,168,417 B2 45 46 TABLE 1-continued

Predicted functions

Bacilius subtiis SEQ homolog ID (Gene NO. Description UniRef Accession No. Organism Name) 52 Pur operon repressor Bacilius subtilis UniRef100 P37551 Bacilius subtiis PrR 53 UPF0076 protein yably Bacillus subtilis) UniRef100 P37552 Bacilius subtiis Yab 54 Stage V sporulation protein G Bacilius UniRef100 P28O15 Bacilius subtiis subtilis 55 Bifunctional gcad protein (TMS protein) UniRef100 P14.192 Includes: UDP- Gcal) Includes: UDP-N-acetylglucosamine N pyrophosphorylase (EC 2.7.7.23) (N- acetylglucosamine acetylglucosamine-1-phosphate pyrophosphorylase uridyltransferase); Glucosamine-1- (EC 2.7.7.23) phosphate N-acetyltransferase (EC (N- 2.3.1.157) Bacilius subtilis acetylglucosamine 1-phosphate uridyltransferase); Glucosamine-1- phosphate N acetyltransferase (EC 2.3.1.157) 56 -phosphate pyrophosphokinase UniRef100 P14193 Bacilius subtiis Prs Bacilius subtilis 57 General stress protein cte Bacilius UniRef100 P14194 Bacilius subtiis Ctc. subtiis 58 Peptidyl-tRNA hydrolase Bacilius UniRef100 P37470 Bacilius subtiis SpoVC subtiis 59 Hypothetical protein yabK Bacilius UniRef100 P37553 Bacilius subtiis subtiis 60 Transcription-repair coupling factor UniRef100 P3.7474 Bacilius subtiis Mfd Bacilius subtiis 61 Stage V sporulation protein TBacilius UniRef100 P37554 Bacilius subtiis SpoVT subtiis 62 Hypothetical protein yabM Bacilius UniRef100 P37555 Bacilius subtiis YabM subtiis 63 Hypothetical protein yabN Bacilius UniRef100 P37.556 Bacilius subtiis YabN tibiis 64 Hypothetical protein yabO Bacilius UniRef100 P37557 Bacilius subtiis tibiis 65 Hypothetical protein yabP Bacilius UniRef100 P37558 Bacilius subtiis YabP tibiis 66 Hypothetical protein yabQ Bacilius UniRef100 P37559 Bacilius subtiis YabO subtiis 67 Cell division protein divlC Bacilius UniRef100 P374.71 Bacilius subtiis DivlC subtiis 68 Hypothetical protein yabRBacilius UniRef100 P37560 Bacilius subtiis YabR. subtiis 69 Stage II sporulation protein E. Bacilius UniRef100 P37475 Bacilius subtiis SpoIIE subtiis 70 Hypothetical protein yabS Bacilius UniRef100 P37561 Bacilius subtiis YabS subtiis 71 Probable /-protein UniRef100 P37562 Bacilius subtiis YabT kinase yabTBacilius subtilis 72 Hypothetical UPF0072 protein yacA UniRef100 P37563 Bacilius subtiis YacA Bacilius subtiis 73 - UniRef100 P37472 Bacilius subtiis HprT phosphoribosyltransferase Bacilius subtilis 74 Cell division protein ftsH homolog UniRef100 P37476 Bacilius subtiis FtSH Bacilius subtilis 75 Putative 32 kDa replication protein UniRef100 Q9F985 Bacilius YacB Bacilius Stearothermophilus Stearothermophilus 76 33 kDa chaperonin Bacilius subtilis UniRef100 P37565 Bacilius subtiis YacC 77 YacD 78 CysK 79 Para-aminobenzoate synthase UniRef100 P28820 Bacilius subtiis PabB component I Bacilius subtilis 80 Para-aminobenzoatefanthranilate UniRef100 P288.19 Includes: Para- PabA synthase amidotransferase aminobenzoate component II Includes; Para synthase aminobenzoate synthase glutamine glutamine amidotransferase component II (EC amidotransferase 6.3.5.8) (ADC synthase); Anthranilate component II synthase component II (EC 4.1.3.27) (EC 6.3.5.8) Bacilius subtil (ADC synthase); US 8,168,417 B2 47 48 TABLE 1-continued

Predicted functions

Bacilius subtiis SEQ homolog ID (Gene NO. Description UniRef Accession No. Organism Name) component II (EC 4.1.3.27) 81 Aminodeoxychorismate lyase Bacilius UniRef100 P28821 Bacilius subtiis PabC subtilis 82 Dihydropteroate synthase Bacilius UniRef100 P28822 Bacilius subtiis Su subtilis 83 Dihydroneopterin aldolase Bacilius UniRef100 P28823 Bacilius subtiis FoB subtilis 84 2-amino-4-hydroxy-6- UniRef100 P29252 Bacilius subtiis FolK hydroxymethyldihydropteridine pyrophosphokinase Bacilius subtilis 85 YazB protein Bacilius subtilis UniRef100 O31417 Bacilius subtiis 86 Probable tRNA-dihydrouridine synthase UniRef100 P37567 Bacilius subtiis YacF Bacillus subtilis 87 Lysyl-tRNA synthetase Bacilius UniRef100 P37477 Bacilius subtiis Lyss subtiis 88 Transcriptional regulator ctsRBacilius UniRef100 P37568 Bacilius subtiis CtSR subtiis 89 Hypothetical protein yach Bacilius UniRef100 P37569 Bacilius subtiis McSA subtiis 90 Hypothetical ATP:guanido UniRef100 P37570 Bacilius subtiis McSB phosphotransferase yacIBacilius subtiis 91 Negative regulator of genetic UniRef100 P37571 Bacilius subtiis ClpC competence clipC/mech3 Bacilius subtiis 92 RadA 93 Hypothetical protein yacKBacilius UniRef100 P37573 Bacilius subtiis YacK subtiis 94 Hypothetical protein yacL Bacilius UniRef100 Q06754 Bacilius subtiis YacL subtiis 95 2-C-methyl-D-erythritol 4-phosphate UniRef100 Q06755 Bacilius subtiis YacM cytidylyltransferase Bacilius subtilis 96 2-C-methyl-D-erythritol 2,4- UniRef100 Q06756 Bacilius subtiis YacN cyclodiphosphate synthase Bacilius subtiis 97 Glutamyl-tRNA synthetase Bacillus UniRef100 P22250 Bacilius subtiis GltX subtiis 98 Serine acetyltransferase Bacilius UniRef100 Q06750 Bacilius subtiis CysE subtiis 99 Cysteinyl-tRNA synthetase Bacilius UniRef100 Q06752 Bacilius subtiis CysS subtiis 00 YazC protein Bacilius subtilis UniRef100 O31418 Bacilius subtiis YazC O1 Hypothetical tRNA, rRNA UniRef100 Q06753 Bacilius subtiis YacO methyltransferase yacOBacilius subtiis 02 Hypothetical protein yacP Bacilius UniRef100 P37574 Bacilius subtiis YacP subtiis 03 RNA polymerase sigma-H factor UniRef100 P17869 Bacilius subtiis Sigh Bacilius subtilis 04 Preprotein secE. Subunit UniRef100 Q06799 Bacilius subtiis Bacilius subtilis 05 NuSG 06 50S ribosomal protein L11 Bacilius UniRef100 Q06796 Bacilius subtiis RplK subtilis 07 50S ribosomal protein L1 Bacilius UniRef100 Q06797 Bacilius subtiis RplA subtilis 08 50S ribosomal protein L10 Bacilius UniRef100 P42923 Bacilius subtiis RplJ subtilis 09 50S ribosomal protein L7/L12 Bacilius UniRef100 PO2394 Bacilius subtiis RplL subtilis 10 Hypothetical protein ybxB Bacilius UniRef100 P3.7872 Bacilius subtiis YbXB subtilis 11 DNA-directed RNA polymerase beta UniRef100 P37870 Bacilius subtiis RpoB chain Bacilius subtilis 12 DNA-directed RNA polymerase beta' UniRef100 P37871 Bacilius subtiis RpoC chain Bacilius subtilis 13 Putative ribosomal protein L7Ae-like UniRef100 P46350 Bacilius subtiis Bacilius subtilis US 8,168,417 B2 49 50 TABLE 1-continued

Predicted functions

Bacilius subtiis SEQ homolog ID (Gene NO. Description OniRef Accession No. Organism Name) 14 30S ribosomal protein S12 Bacilius UniRef100 P21472 Bacilius subtiis RpsL subtilis 15 30S ribosomal protein S7 Bacilius UniRef100 P21469 Bacilius subtiis RSG subtilis 16 Elongation factor G Bacilius subtilis UniRef100 P80868 Bacilius subtiis FusA 17 Elongation factor Tu Bacilius subtilis UniRef100 P33166 Bacilius subtiis TufA 18 30S ribosomal protein S10 Bacilius UniRef100 Q979L5 Bacilius Rps halodurans halodurans 19 50S ribosomal protein L3 Bacilius UniRef100 P42920 Bacilius subtiis RC subtiis 20 50S ribosomal protein L4Bacilius UniRef100 P42921 Bacilius subtiis RD subtiis 21 50S ribosomal protein L23 Bacilius UniRef100 P42924 Bacilius subtiis subtiis 22 50S ribosomal protein L2 Bacilius UniRef100 P42919 Bacilius subtiis RB subtiis 23 30S ribosomal protein S19 Bacilius UniRef100 P21476 Bacilius subtiis subtiis 24 50S ribosomal protein L22 Bacilius UniRef100 P42060 Bacilius subtiis RIV subtiis 25 30S ribosomal protein S3 Bacilius UniRef100 P21465 Bacilius subtiis RSC subtiis 26 50S ribosomal protein L16 Bacilius UniRef100 P14577 Bacilius subtiis RIP subtiis 27 50S ribosomal protein L29 Bacilius UniRef100 P12873 Bacilius subtiis subtiis 28 30S ribosomal protein S17Bacilius UniRef100 P12874 Bacilius subtiis subtiis 29 30 50S ribosomal protein L24 Bacilius UniRef100 P12876 Bacilius subtiis RibX subtiis 31 50S ribosomal protein L5 Bacilius UniRef100 P12877 Bacilius subtiis RplE subtiis 32 Ribosomal protein S14 Bacilius cereus UniRef100 Q63H77 Bacilius cereus ZK ZK) 33 30S ribosomal protein S8 Bacilius UniRef100 P12879 Bacilius subtiis RSH subtiis 34 50S ribosomal protein L6 Bacilius UniRef100 P46898 Bacilius subtiis RIF subtiis 35 50S ribosomal protein L18 Bacilius UniRef100 P46899 Bacilius subtiis RplR subtiis 36 30S ribosomal protein S5 Bacilius UniRef100 P21467 Bacilius subtiis RSE subtiis 37 50S ribosomal protein L30 Bacilius UniRef100 P19947 Bacilius subtiis subtiis 38 50S ribosomal protein L15 Bacilius UniRef100 P19946 Bacilius subtiis RplO subtiis 39 SecY 40 Adenylate kinase Bacilius subtilis UniRef100 P16304 Bacilius subtiis Adk 41 Methionine aminopeptidase Bacilius UniRef100 P19994 Bacilius subtiis Map subtiis 42 C-125 initiation factor IF-I, RNA UniRef100 O50629 Bacilius halodurans polymerase alpha Subunit and ribosomal proteins, partial and complete cols Bacilius halodurans 43 Translation initiation factor IF-1 UniRef100 P20458 Bacilius subtiis Bacilius subtilis 44 45 RpsM 46 30S ribosomal protein S11 Bacilius UniRef100 PO4969 Bacilius subtiis RpsK subtilis 47 DNA-directed RNA polymerase alpha UniRef100 P20429 Bacilius subtiis RpoA chain Bacilius subtilis 48 50S ribosomal protein L17 Bacilius UniRef100 P20277 Bacilius subtiis RplO subtilis 49 Hypothetical ABC transporter ATP- UniRef100 P4O735 Bacilius subtiis YbXA binding proteinybXA Bacilius subtilis 50 Hypothetical protein orf5 Bacilius UniRef100 P70970 Bacilius subtiis Yba. subtilis 51 YbaF protein Bacilius subtilis UniRef100 P70972 Bacilius subtiis YbaF 52 tRNA pseudouridine synthase A UniRef100 P70973 Bacilius subtiis TruA Bacilius subtilis US 8,168,417 B2 51 52 TABLE 1-continued

Predicted functions

Bacilius subtiis SEQ homolog ID (Gene NO. Description OniRef Accession No. Organism Name) 53 50S ribosomal protein L13 Bacilius UniRef100 P70974 Bacilius subtiis RplM subtilis 54 30S ribosomal protein S9 Bacilius UniRef100 P21470 Bacilius subtiis RpsI subtilis 55 Hypothetical protein Bacillus cereus UniRef100 Q737T6 Bacilius cereus YZA 56 Hypothetical protein ybaK Bacilius UniRef100 PSO862 Bacilius subtiis YbaK subtilis 57 Germination-specific N- UniRef100 PSO864 Bacilius subtiis CwO acetylmuramoyl-L-alanine amidase Bacilius subtilis 58 Yba. 59 Spore germination protein gerD UniRef100 P16450 Bacilius subtiis GerD precursor Bacillus subtilis 60 KinB signaling pathway activation UniRef100 P16449 Bacilius subtiis KbaA protein Bacilius subtilis 61 Hypothetical protein ybaN precursor UniRef100 PSO865 Bacilius subtiis YbaN Bacilius subtiis 62 Penicillin-binding protein Bacilius UniRef100 O31773 Bacilius subtiis PbpX subtilis 63 64 Hypothetical protein ybaS Bacilius UniRef100 P55190 Bacilius subtiis YbaS subtilis 65 Hypothetical protein yxa Bacillus UniRef100 P42109 Bacilius subtiis Yxa subtilis 66 Phenazine biosynthetic protein UniRef100 Q9HHG6 Haiobacterium YfhB Halobacterium sp. sp. 67 Hypothetical protein ybbC precursor UniRef100 P40407 Bacilius subtiis YbbC Bacilius subtiis 68 Hypothetical lipoprotein ybbD precursor UniRef100 P40406 Bacilius subtiis YbbD Bacilius subtiis 69 Hypothetical UPF0214 protein ybbE UniRef100 OO5213 Bacilius subtiis YbbE precursor Bacillus subtilis 70 YbbF protein Bacillus subtilis) UniRef100 Q797 S1 Bacilius subtiis YbbF 71 Putative HTH-type transcriptional UniRef100 Q45581 Bacilius subtiis YbbH regulatorybbH Bacilius subtilis 72 Hypothetical protein ybbI Bacillus UniRef100 Q45582 Bacilius subtiis Ybb subtilis 73 Hypothetical protein ybbK Bacilius UniRef100 Q45584 Bacilius subtiis YbbK subtilis 74 Arginase Bacilius Caido velox UniRef100 P53608 Bacilius RocF caidoveiox 75 RNA polymerase sigma factor sigW UniRef100 Q45585 Bacilius subtiis SigW Bacilius subtilis 76 YbbM protein Bacilius subtilis UniRef100 Q45588 Bacilius subtiis YbbM 77 YbbP protein Bacillus subtilis) UniRef100 Q45589 Bacilius subtiis YbbP 78 YbbR protein Bacilius subtilis UniRef100 O34659 Bacilius subtiis YbbR 79 YbbT protein Bacillus subtilis) UniRef100 O34824 Bacilius subtiis YbbT 80 Glucosamine--fructose-6-phosphate UniRef100 P39754 isomerizing GlmS aminotransferase isomerizing Bacilius subtilis 81 82 Hypothetical protein Bacilius UniRef100 Q6HH47 Bacilius thuringiensis thiringiensis 83 UPIOOOO3CBF92 UniRef100 entry UniRef100 UPIOOOO3CBF92 YtrB 84 Transcriptional regulator Bacilius UniRef100 Q9KF35 Bacilius Ytra halodurans halodurans 85 86 YbcL protein Bacillus subtilis UniRef100 O34663 Bacilius subtiis YbcL 87 BHO186 protein Bacilius halodurans UniRef100 Q9KGB8 Bacilius halodurans 88 ABC transporter Bacilius halodurans UniRef100 Q9KEY6 Bacilius YvcC halodurans 89 Hypothetical protein ywbO Bacilius UniRef100 P39598 Bacilius subtiis YwbO subtilis 90 91 BHO695 protein Bacilius halodurans UniRef100 Q9KFO4 Bacilius halodurans 92 Aminoglycoside 6-adenylyltranserase UniRef100 Q6V4U6 Enterococcits Aadk Enterococciis faecium faecium 93 Hypothetical conserved protein UniRef100 Q8ERX2 Oceanobacilius iheyensis Oceanobacilius iheyensis 94 YfnBBacillus subtilis UniRef100 OO6480 Bacilius subtiis YfB 95 Hypothetical transport proteinybXG UniRef100 P54.425 Bacilius subtiis YbxG Bacilius subtilis US 8,168,417 B2 53 54 TABLE 1-continued

Predicted functions

Bacilius subtiis SEQ homolog ID (Gene NO. Description OniRef Accession No. Organism Name) 196 Mg(2+)/citrate complex secondary UniRef100 P55069 Bacilius subtiis City transporter Bacilius subtilis 197 YfiP protein Bacillus subtilis) UniRef100 O34439 Bacilius subtiis YfP 198 CitT 199 Sensor protein citS Bacilius subtilis UniRef100 O34427 Bacilius subtiis CitS 200 Transcriptional regulator UniRef100 Q8RB37 Thermoanaerobacter YwfK Thermoanaerobacter tengcongensis tengcongensis 201 Complete genome; segment 8/17 UniRef100 Q7N4W7 Photorhabdus YwfE Photorhabdus luminescens luminescens 202 Multidrug resistance protein B Bacilius UniRef100 Q81AFO Bacilius cereus YgiV cereus 203 Sigma-G-dependent sporulation UniRef100 P54379 Bacilius subtiis specific SASP protein Bacilius subtilis 204 Hypothetical protein ybXH Bacilius UniRef100 P54426 Bacilius subtiis subtilis 205 YbyB protein Bacillus subtilis UniRef100 O31441 Bacilius subtiis 206 Hypothetical protein yyaL. Bacilius UniRef100 P37512 Bacilius subtiis YyaL subtilis 207 Hypothetical protein yyaC) Bacilius UniRef100 P3.7509 Bacilius subtiis subtilis 208 Threonyl-tRNA synthetase 2 Bacilius UniRef100 P18256 Bacilius subtiis Thr2. subtilis 209 210 YttB 211 Hypothetical protein Acinetobacter sp. UniRef100 Q6FDNO Acinetobacter sp. 212 Hypothetical UPF0053 protein yrkA UniRef100 P54428 Bacilius subtiis YrkA Bacilius subtilis 213 YdeG 214 Yba. 215 216 YgeW 217 YubF protein Bacilius subtilis UniRef100 O32082 Bacilius subtiis 218 YbfF 219 Hypothetical protein Bacteroides UniRef100 Q64UG6 Bacteroides fragilis fragilis 220 Hypothetical transport protein ybhF UniRef100 O31448 Bacilius subtiis Ybf Bacilius subtilis 221 YbfI protein Bacillus subtilis UniRef100 O31449 Bacilius subtiis YbfI 222 223 UPIOOOO3CBC98 UniRefl00 entry UniRef100 UPIOOOO3CBC98 224 225 Oxidoreductase Lactococcus lactis UniRef100 Q9CFC5 Lactococci is lactis 226 Methyl-accepting chemotaxis protein UniRef100 Q8EST3 Oceanobacilius YvaO Oceanobacilius iheyensis iheyensis 227 D-xylose-binding protein UniRef100 O68456 Thermoanaerobacter RbsB Thermoanaerobacter ethanolicus eihanoicus 228 Hypothetical protein OB0544 UniRef100 Q8ESS4 Oceanobacilius Yom Oceanobacilius iheyensis iheyensis 229 Hypothetical protein Bacillus cereus UniRef100 Q81 IR4 Bacilius cereus 230 Streptogramin Blactonase UniRef100 O87275 Staphylococcus cohnii Staphylococcus cohnii 231 Hypothetical protein yebP Bacilius UniRef100 P42248 Bacilius subtiis YcbP subtilis 232 YbgF protein Bacilius subtilis UniRef100 O31462 Bacilius subtiis YbgF 233 YbgG protein Bacilius subtilis UniRef100 O31463 Bacilius subtiis YbgG 234 235 PTS system, n-acetylglucosamine- UniRef100 Q9KF24 Bacilius NagP specific enzyme II, ABC component halodurans Bacilius halodurans 236 UPIOOOO3CBDBC UniRef100 entry UniRef100 UPIOOOO3CBDBC Mta 237 Hypothetical protein Bacilius UniRef100 Q6HIWO Bacilius YdfS thiringiensis thiringiensis 238 Carbonic anhydrase Methanosarcina UniRef100 Q8PSJ1 Meihanosarcina Yto A mazei mazei 239 Hypothetical protein yig A Bacilius UniRef100 Q6HHU2 Bacilius YigA thiringiensis thiringiensis 240 Amino acid carrier protein Bacilius UniRef100 Q6HGU4 Bacilius YbgH thiringiensis thiringiensis 241 Probable glutaminaseybgJ Bacillus UniRef100 O31465 Bacilius subtiis Ybg. subtilis 242 Two-component sensor kinase yebA UniRef100 Q81 BN8 Bacilius cereus YcbA Bacilius cereus US 8,168,417 B2 55 56 TABLE 1-continued

Predicted functions

Bacilius subtiis SEQ homolog ID (Gene NO. Description UniRef Accession No. Organism Name) 243 Hypothetical sensory transduction UniRef100 P4O759 Bacilius subtiis YcbB protein yebB Bacillus subtilis 244 YoaA 245 YbXI 246 247 Beta-lactamase precursor (EC 3.5.2.6) UniRef100 PO0808 Contains: Large PenP (Penicillinase) Contains: Large exopenicillinase; exopenicillinase; Small Small exopenicillinase Bacilius licheniformis exopenicillinase 248 Alkaline phosphatase D precursor UniRef100 P42251 Bacilius subtiis POD Bacilius subtilis 249 250 Hypothetical protein yebTBacilius UniRef100 P42252 Bacilius subtiis TatCD subtilis 251 YcbC 252 Probable dehydrogenase UniRef100 P42236 Bacilius subtiis YcbD ycbD Bacilius subtilis 253 Probable glucarate transporter Bacilius UniRef100 P42237 Bacilius subtiis YcbE subtilis 254 Probable glucarate dehydratase UniRef100 P42238 Bacilius subtiis YcbF Bacilius subtilis 255 Hypothetical transcriptional regulator UniRef100 P42239 Bacilius subtiis YcbG ycbG Bacilius subtilis 256 Probable D-galactarate dehydratase UniRef100 P42240 Bacilius subtiis YcbH Bacilius subtilis 257 Hypothetical sensory transduction UniRef100 P42244 Bacilius subtiis YcbL protein yebL Bacilius subtilis 258 Sensor kinase Bacilius cereus UniRef100 Q633Q6 Bacilius cereus ZK ZK) 259 Hypothetical ABC transporter ATP- UniRef100 P42246 Bacilius subtiis YcbN binding protein yebN Bacilius subtilis 260 Hypothetical protein yebO Bacilius UniRef100 P42247 Bacilius subtiis YcbO subtilis 261 YcbO 262 YfnK Bacilius subtilis UniRef100 OO6490 Bacilius subtiis YetN 263 Hypothetical protein ybdO Bacilius UniRef100 O31437 Bacilius subtiis YbdO subtilis 264 Hypothetical protein yebJ Bacillus UniRef100 P42242 Bacilius subtiis Ycb subtilis 265 Hypothetical protein ywhA Bacilius UniRef100 P70993 Bacilius subtiis YwhA subtilis 266 YdaB 267 AII0778 protein Anabaena sp. UniRef100 Q8YYR8 Anabaena sp. 268 BH1298 protein Bacilius halodurans UniRef100 Q9KDB5 Bacilius YbdN halodurans 269 BH1299 protein Bacilius halodurans UniRef100 Q9KDB4 Bacilius halodurans 270 Transcriptional activator tip.A, putative UniRef100 Q81XN3 Bacilius Mta Bacilius anthracis anthracis 271 272 Transcriptional regulator, PadR family UniRef100 Q81 BZO Bacilius cereus Bacilius cereus 273 Hypothetical protein Bacillus cereus UniRef100 Q639U7 Bacilius cereus ZK ZK) 274 Tryptophan RNA-binding attenuator UniRef100 O31466 Bacilius subtiis protein-inhibitory protein Bacillus subtilis 275 Hypothetical transport protein yebK UniRef100 P42243 Bacilius subtiis YcbK Bacilius subtilis 276 YczC 277 Hypothetical protein yecP Bacilius UniRef100 O34478 Bacilius subtiis YccF subtilis 278 SpaF Bacillus subtilis UniRef100 Q454.04 Bacilius subtiis YCH 279 SpaE Bacilius subtilis UniRef100 OS2853 Bacilius subtiis 280 Putative Spag Bacilius subtilis UniRef100 Q93GG9 Bacilius subtiis 281 Subtilin biosynthesis regulatory protein UniRef100 P33112 Bacilius subtiis YycF spaRBacilius subtilis 282 Putative histidine kinase Bacilius UniRef100 Q93GG7 Bacilius subtiis ResE subtilis 283 UPIOOOO3CA401 UniRef100 entry UniRef100 UPIOOOO3CA401 YI 284 YusO protein Bacilius subtilis UniRef100 O321.83 Bacilius subtiis YusO 285 Complete genome; segment 6/17 UniRef100 Q7N6K9 Photorhabdus YuSR Photorhabdus luminescens luminescens US 8,168,417 B2 57 58 TABLE 1-continued

Predicted functions

Bacilius subtiis SEQ homolog ID (Gene NO. Description OniRef Accession No. Organism Name) 286 Methyltransferase Bacilius UniRef100 Q6HK82 Bacilius thuringiensis thiringiensis 287 Hypothetical lipoprotein yed A precursor UniRef100 O34538 Bacilius subtiis YcdA Bacilius subtilis 288 UPIOOOO3CB481 UniRefl00 entry UniRef100 UPIOOOO3CB481 YcgA 289 Hypothetical conserved protein UniRef100 Q8ESQ2 Oceanobacilius YL Oceanobacilius iheyensis iheyensis 290 Hypothetical conserved protein UniRef100 Q8ET83 Oceanobacilius YvbK Oceanobacilius iheyensis iheyensis 291 GatA 292 Oligoendopeptidase F, putative UniRef100 Q81JJ8 Bacilius YbO Bacilius anthracis anthracis 293 294 UPIOOOO3CC42D UniRef100 entry UniRef100 UPIOOOO3CC42D PbpX 295 Hypothetical protein ysfD Bacilius UniRef100 P94534 Bacilius subtiis YsfD subtilis 296 Hypothetical protein ySfc Bacilius UniRef100 P94535 Bacilius subtiis YsfC subtilis 297 PTS system, cellobiose-specific UniRef100 Q9KEE3 Bacilius halodurans enzyme II, B component Bacilius halodurans 298 PTS system, cellobiose-specific UniRef100 Q9KEE2 Bacilius LicA enzyme II, A component Bacilius halodurans halodurans 299 PTS system, cellobiose-specific UniRef100 Q8EP43 Oceanobacilius YwbA enzyme II, C component iheyensis Oceanobacilius iheyensis 300 6-phospho-beta-glucosidase Bacilius UniRef100 Q9KEEO Bacilius Lich halodurans halodurans 301 Transcriptional regulator Bacilius UniRef100 Q9KED8 Bacilius YbgA halodurans halodurans 3O2 303 YvbX 304 Chitinase precursor Streptomyces UniRef100 Q9L3E8 Streptomyces olivaceoviridis olivaceoviridis 305 Extracellular metalloprotease precursor UniRef100 P39790 Bacilius subtiis Mpr Bacilius subtilis 306 Glucose 1-dehydrogenase II Bacilius UniRef100 P80869 Bacilius subtiis YcdF subtilis 307 Hypothetical protein OB0244 UniRef100 Q8ETL6 Oceanobacilius iheyensis Oceanobacilius iheyensis 308 Hypothetical protein yedC Bacilius UniRef100 O34772 Bacilius subtiis YCCC subtilis 309 310 Cell wall hydrolase cwlJ Bacilius UniRef100 P42249 Bacilius subtiis Cwl subtilis 311 Hypothetical protein yeeB Bacilius UniRef100 O34504 Bacilius subtiis YceB subtilis 312 Hypothetical protein yvcE Bacilius UniRef100 P4O767 Bacilius subtiis YvcE subtilis 313 ABC transporter, permease Bacilius UniRef100 Q637N2 Bacilius cereus YM cereus ZK) ZK. 314 ABC transporter, ATP-binding protein UniRef100 Q733P8 Bacilius cereus YfL Bacilius cereus 315 UPIOOOO3CC482 UniRefl00 entry UniRef100 UPIOOOO3CC482 Ysia 316 Hypothetical protein OB3113 UniRef100 Q8ELV2 Oceanobacilius iheyensis Oceanobacilius iheyensis 317 BH3953 protein Bacilius halodurans UniRef100 Q9K5Y3 Bacilius halodurans 318 BH3951 protein Bacilius halodurans UniRef100 Q9K5Y4 Bacilius PadR halodurans 319 Stress response protein SCP2 Bacilius UniRef100 P81100 Bacilius subtiis YceC subtilis 320 General stress protein 16U Bacilius UniRef100 P80875 Bacilius subtiis YceD subtilis 321 Hypothetical protein yeeE Bacilius UniRef100 O34384 Bacilius subtiis YceE subtilis 322 Hypothetical protein yeeF Bacilius UniRef100 O34447 Bacilius subtiis YceF subtilis 323 Hypothetical protein Bacillus cereus UniRef100 Q72YDO Bacilius cereus 324 YeeG Bacilius subtilis UniRef100 O34.809 Bacilius subtiis YceG 325 Hypothetical protein yeeH Bacilius UniRef100 O34833 Bacilius subtiis YceH subtilis US 8,168,417 B2 59 60 TABLE 1-continued

Predicted functions

Bacilius subtiis SEQ homolog ID (Gene NO. Description OniRef Accession No. Organism Name) 326 UPIOOOO3CB694 UniRefl00 entry UniRef100 UPIOOOO3CB694 CcdA 327 Hypothetical conserved protein UniRef100 Q8ELB4 Oceanobacilius YL Oceanobacilius iheyensis iheyensis 328 Mta 329 Nitrate transporter Bacilius subtilis UniRef100 P42432 Bacilius subtiis NasA 330 L-lactate dehydrogenase Bacilius UniRef100 P13714 Bacilius subtiis Ldh subtilis 331 L-lactate permease Bacilius subtilis UniRef100 P55910 Bacilius subtiis LctP 332 YcgF protein Bacilius subtilis UniRef100 P94381 Bacilius subtiis YcgF 333 Homologue of aromatic amino acids UniRef100 P94383 Bacilius subtiis YcgH transport protein of E. coli Bacilius subtilis 334 NH(3)-dependent NAD(+) synthetase UniRef100 PO8164 Bacilius subtiis NadE Bacilius subtilis 335 Shikimate kinase Bacillus subtilis UniRef100 P37944 Bacilius subtiis AroK 336 YogL protein Bacilius subtilis UniRef100 P94389 Bacilius subtiis YcgL 337 Proline dehydrogenase Bacilius UniRef100 Q8RL79 Bacilius subtiis YcgM subtilis 338 1-pyrroline-5-carboxylate UniRef100 P94391 Bacilius subtiis YcgN dehydrogenase 2 Bacilius Subtilis 339 Homologue of proline permease of E. coli UniRef100 P94392 Bacilius subtiis YcgO Bacilius subtilis 340 Hypothetical protein yogP Bacilius UniRef100 P94393 Bacilius subtiis YcgP subtilis 341 YegO protein Bacillus subtilis UniRef100 P94394 Bacilius subtiis YcgQ 342 YcgR 343 Cephalosporin-C deacetylase Bacilius UniRef100 Q59233 Bacilius subtiis Cah subtilis 344 Transcriptional regulator Bacilius UniRef100 Q9KF41 Bacilius YdgG halodurans halodurans 345 346 347 Probable amino-acid ABC transporter UniRef100 P42399 Bacilius subtiis YckA permease protein yokA Bacilius subtilis 348 Probable ABC transporter extracellular UniRef100 P42400 Bacilius subtiis YckB binding protein yokB precursor Bacilius subtilis 349 NAD(P)H dehydrogenase (Quinone); UniRef100 Q63FG6 Bacilius cereus YwrO possible modulator of drug activity B ZK. Bacilius cereus ZK) 350 BHO315 protein Bacilius halodurans UniRef100 Q9KGO1 Bacilius halodurans 351 Probable beta-glucosidase Bacilius UniRef100 P42403 Bacilius subtiis YckE subtilis 352 Nin Bacillus amyloiquefaciens UniRef100 Q7OKK3 Bacilius Nin amyloiquefaciens 353 DNA-entry nuclease Bacilius subtilis UniRef100 P12667 Bacilius subtiis NucA 3S4 YbCM 355 Methyl-accepting chemotaxis protein UniRef100 P39209 Bacilius subtiis TlpC lpC Bacilius subtilis 356 Hypothetical protein yacG Bacilius UniRef100 P45942 Bacilius subtiis YacG subtilis 357 Hydantoin utilization protein A UniRef100 Q7WCS3 Bordeteila bronchiseptica Bordeteila bronchiseptical 358 Hydantoin utilization protein B UniRef100 Q987J6 Rhizobium ioti Rhizobium ioti 359 UPIO00027D233 UniRefl00 entry UniRef100 UPIOOOO27D233 YvfK 360 358aa long hypothetical transporter UniRef100 Q9YB65 Aeropyrim MSmX ATP-binding protein Aeropyrim pernix pernix 361 Atta2-like ABC transporter, permease UniRef100 Q92ZHO Rhizobium Yes protein Rhizobium meliloti meioti 362 YM 363 SrfAA 364 SrfAB 365 SrfAC 366 SrfAD 367 YCXC 368 YCxD 369 4'-phosphopantetheinyl transferase sfp UniRef100 P39135 Bacilius subtiis Sfp Bacilius subtilis US 8,168,417 B2 61 62 TABLE 1-continued

Predicted functions

Bacilius subtiis SEQ homolog ID (Gene NO. Description OniRef Accession No. Organism Name) 370 Predicted esterase of alpha/beta UniRef100 Q97HP7 Cliostridium YbbA hydrolase Superfamily, YBBA. B. subtilis acetobiitvictim ortholog Clostridium acetobutyllicum 371 Transcriptional regulator Clostridium UniRef100 Q97LX8 Cliostridium YdleE acetobiitvictim acetobiitvictim 372 YczE Bacillus subtilis) UniRef100 Q9F4F8 Bacilius subtiis YCZE 373 Hypothetical protein Symbiobacterium UniRef100 Q67MZ7 Symbiobacterium thermophilum thermophilum 374 YfL 375 376 Hypothetical protein Symbiobacterium UniRef100 Q67MZ9 Symbiobacterium thermophilum thermophilum 377 378 379 Methyl-accepting chemotaxis protein UniRef100 Q6HJV7 Bacilius TlpB Bacilius thuringiensis thiringiensis 380 YekI Bacilius subtilis UniRef100 Q9F4F9 Bacilius subtiis YckI 381 YekJ Bacillus subtilis UniRef100 Q9F4GO Bacilius subtiis Yck 382 YckK 383 RocR 384 Ornithine aminotransferase Bacillus UniRef100 P38021 Bacilius subtiis Roc) subtilis 385 Amino-acid permease rocE Bacilius UniRef100 P39137 Bacilius subtiis Roc subtilis 386 Arginase Bacilius subtilis UniRef100 P39138 Bacilius subtiis RocF 387 Homologue of als operon regulatory UniRef100 P94403 Bacilius subtiis YCA protein AlsR of B. subtilis Bacilius subtilis 388 Probable aromatic acid decarboxylase UniRef100 P94.404 Bacilius subtiis YCIB Bacilius subtilis 389 Hypothetical protein yelC Bacilius UniRef100 P94405 Bacilius subtiis YcC subtilis 390 YelD protein Bacilius subtilis UniRef100 P944O6 Bacilius subtiis YcID 391 392 393 394 Hypothetical protein Bacilius UniRef100 Q6HMQ9 Bacilius thuringiensis thiringiensis 395 Hypothetical protein OB2810 UniRef100 Q8EMN2 Oceanobacilius iheyensis Oceanobacilius iheyensis 396 397 YXD 398 399 YXB 400 Sugar ABC transporter Bacillus UniRef100 Q9K7B8 Bacilius RbsB halodurans halodurans 401 Two-component sensor histidine kinase UniRef100 Q9K7B9 Bacilius YesN Bacilius halodurans halodurans 4O2 YesN 403 Multiple Sugar transport system UniRef100 Q9K7C2 Bacilius RbsB Bacilius halodurans halodurans 404 L-arabinose transport ATP-binding UniRef100 Q9S472 Bacilius RbSA protein araO Bacilius Stearothermophilus Stearothermophilus 405 L-arabinose membrane permease UniRef100 Q9S471 Bacilius RbSC Bacilius Stearothermophilus Stearothermophilus 406 Transporter Bacilius halodurans UniRef100 Q9K9F3 Bacilius YbfB halodurans 407 YbfA protein Bacilius subtilis UniRef100 O31443 Bacilius subtiis YbfA 408 Beta-D-galactosidase Bacilius UniRef100 Q45093 Bacilius circuians LacA circulans 409 Phy 410 Hypothetical transporteryclF Bacilius UniRef100 P94408 Bacilius subtiis YcIF subtilis 411 YelG protein Bacilius subtilis UniRef100 P94409 Bacilius subtiis YcG 412 413 GerKA 414 Spore germination protein KC precursor UniRef100 P49941 Bacilius subtiis GerKC Bacilius subtilis 415 Spore germination protein KB Bacilius UniRef100 P49940 Bacilius subtiis GerKB subtilis 416 Mta US 8,168,417 B2 63 64 TABLE 1-continued

Predicted functions

Bacilius subtiis SEQ homolog ID (Gene NO. Description OniRef Accession No. Organism Name) 417 Hypothetical protein yelH Bacilius UniRef100 P94411 Bacilius subtiis YCH subtilis 418 Hypothetical protein yell Bacilius UniRef100 P94412 Bacilius subtiis YcII subtilis 419 Hypothetical sensory transduction UniRef100 P94413 Bacilius subtiis YcT protein yell Bacillus subtilis 420 Hypothetical sensor-like histidine UniRef100 P94414 Bacilius subtiis YcK kinase yelK Bacillus subtilis 421 Methyl-accepting chemotaxis protein UniRef100 Q9K632 Bacilius Tlp A Bacilius halodurans halodurans 422 Probable aspartokinase Bacilius UniRef100 P94417 Bacilius subtiis YCIM subtilis 423 Homologue of ferric anguibactin UniRef100 P94418 Bacilius subtiis YcIN transport system perimerase protein FatD of V anguillarum Bacilius subtilis 424 Homologue of ferric anguibactin UniRef100 P94419 Bacilius subtiis YCO transport system perimerase protein FatC of V anguillarum Bacilius subtilis 425 Homologue of iron dicitrate transport UniRef100 P94420 Bacilius subtiis YcIP ATP-binding protein Fech of E. coli Bacilius subtilis 426 Ferric anguibactin-binding protein UniRef100 P94421 Bacilius subtiis YcIQ precusor FatB of V anguillarum Bacilius subtilis 427 Homologue of multidrug resistance UniRef100 P94422 Bacilius subtiis YCB protein B, EmrB, of E. coli Bacilius subtilis 428 YenC protein Bacillus subtilis UniRef100 P94423 Bacilius subtiis YCC 429 Hypothetical protein Racillus cereus UniRef100 Q636RO Bacilius cereus YgiQ ZK) ZK. 430 Hypothetical oxidoreductaseycnD UniRef100 P94424 Bacilius subtiis YCD Bacilius subtilis 431 Hypothetical protein yenE Bacilius UniRef100 P94.425 Bacilius subtiis subtilis 432 YczG protein Bacilius subtilis UniRef100 O31480 Bacilius subtiis 433 Homologue of regulatory protein MocR UniRef100 P94426 Bacilius subtiis GabR of R. meliloti Bacilius subtilis 434 Probable 4-aminobutyrate UniRef100 P94427 Bacilius subtiis GabT aminotransferase (EC 2.6.1.19) ((S)-3- amino-2-methylpropionate transaminase) Bacilitis subtilis 435 Cationic amino acid transporter UniRef100 Q8ESX7 Oceanobacilius YG Oceanobacilius iheyensis iheyensis 436 Homologue of Succinate semialdehyde UniRef100 P94428 Bacilius subtiis Gab) dehydrogenase GabD of E. coli Bacilius subtilis 437 YwfM 438 439 YcnI protein Bacilius subtilis UniRef100 P94431 Bacilius subtiis YCI 440 Homologue of copper export protein UniRef100 P94432 Bacilius subtiis Ycil Pcod of E. coli Bacilius subtilis 441 YcnK protein Bacilius subtilis UniRef100 P94433 Bacilius subtiis YcnK 442 Assimilatory nitrate reductase electron UniRef100 P42433 Bacilius subtiis Nas transfer subunit Bacilius subtilis 443 Assimilatory nitrate reductase catalytic UniRef100 P42434 Bacilius subtiis NaSC subunit Bacilius subtilis 444 Nitrite reductase NAD(P)H) Bacilius UniRef100 P42435 NAD(P)H Nas) subtilis 445 Assimilatory nitrite reductase UniRef100 P42436 NAD(P)H Nas NAD(P)H Small subunit Bacilius subtilis 446 Uroporphyrin-III C-methyltransferase UniRef100 P42437 Bacilius subtiis Nast Bacilius subtilis 447 Hypothetical transcriptional regulator UniRef100 OO5494 Bacilius subtiis YdhC ydhC Bacilius subtilis 448 Sodium-dependent transporter UniRef100 Q8ENE3 Oceanobacilius YfS Oceanobacilius iheyensis iheyensis 449 Hypothetical protein yesD Bacilius UniRef100 P42961 Bacilius subtiis subtilis 450 UPIOOOO3CC424 UniRefl00 entry UniRef100 UPIOOOO3CC424 451 Possible transcriptional antiterminator, UniRef100 Q63A16 Bacilius cereus MtR bglG family Bacilius cereus ZK) US 8,168,417 B2 65 66 TABLE 1-continued

Predicted functions

Bacilius subtiis SEQ homolog ID (Gene NO. Description OniRef Accession No. Organism Name) 452 Putative Sugar-specific PTS component UniRef100 Q7X1N9 Lactococci is raffinolactis EIIB Lactococcus raffinolactis 453 SgaT protein Mannheimia UniRef100 Q65WA2 Mannheimia succiniciproducens MBEL55E succiniciproducens MBEL55E 454 Transketolase, N-terminal subunit UniRef100 Q8E202 Streptococci is Tkt Streptococci is agaiactiae agalaciae 455 Putative transketolase Salmonella UniRef100 Q87ND5 Saimoneiia DxS typhimurium typhimurium 456 Hypothetical protein yes}E Bacilius UniRef100 P42962 Bacilius subtiis YCSE subtilis 457 Hypothetical UPF0271 protein yes} UniRef100 P42963 Bacilius subtiis YCSF Bacilius subtiis 458 Hypothetical protein yesG Bacilius UniRef100 P42964 Bacilius subtiis YCSG subtilis 459 Hypothetical UPF0317 protein yes.I UniRef100 P42966 Bacilius subtiis Ycs Bacilius subtiis 460 Kinase A inhibitor Bacilius subtilis UniRef100 P60495 Bacilius subtiis Kip 461 Kiplantagonist Bacilius subtilis UniRef100 Q7WY77 Bacilius subtiis Kip A 462 HTH-type transcriptional regulator kipR UniRef100 P42968 Bacilius subtiis KipR Bacilius subtiis 463 Hypothetical protein yesk Bacilius UniRef100 P42969 Bacilius subtiis YcSK subtilis 464 PTS system, mannitol-specific IIABC UniRef100 P42956 Bacilius subtiis MtA component Bacilius subtilis 465 Mannitol-1-phosphate 5- UniRef100 P42.957 Bacilius subtiis MtD dehydrogenase Bacilius subtilis 466 YdaA protein Bacilius subtilis UniRef100 P96574 Bacilius subtiis MtR 467 General stress protein 39 Bacilius UniRef100 P80873 Bacilius subtiis Yda) subtilis 468 Hypothetical protein yolaE Bacilius UniRef100 P96578 Bacilius subtiis YdaE subtilis 469 Hypothetical protein Bacilius anthracis UniRef100 Q81U55 Bacilius anthracis 470 General stress protein 26 Bacilius UniRef100 P802.38 Bacilius subtiis Yda subtilis 471 YdaH protein Bacilius subtilis UniRef100 P96581 Bacilius subtiis Yda 472 LinC)463 protein Listeria innocua UniRef100 Q92EJ6 Listeria innoctia Yvh 473 YazA protein Bacilius subtilis UniRef100 O31485 Bacilius subtiis 474 BHO424 protein Bacilius halodurans UniRef100 Q9KFQ4 Bacilius halodurans 475 HTH-type transcriptional regulator IrpC UniRef100 P96582 Bacilius subtiis LC Bacilius subtilis 476 PROBABLE DNATOPOISOMERASE UniRef100 P96583 Bacilius subtiis TopB III Bacilius subtilis 477 478 479 480 YdaO 481 YdaO 482 YdaP protein Bacilius subtilis UniRef100 P96591 Bacilius subtiis YdalP 483 484 UPIOOOO3CCO69 UniRefl00 entry UniRef100 UPIOOOO3CCO69 485 IS1627s 1-related, transposase Bacilius UniRef100 Q7CMDO Bacilius anthracis str. A2012 anthracis str. A2012 486 487 Similar to ribosomal-protein-serine N- UniRef100 Q99WN5 Staphylococcits Yda acetyltransferase Staphylococci is (iiietS attreus 488 Manganese transport protein minth UniRef100 P96593 Bacilius subtiis MintEH Bacilius subtilis 489 490 AnsB 491 YojK 492 YdaT protein Bacilius subtilis UniRef100 P96595 Bacilius subtiis Yda.T 493 Hypothetical protein yabA Bacilius UniRef100 P96596 Bacilius subtiis YdbA subtilis 494 Na+ H+ antiporter Nhac Bacilius UniRef100 Q81 FX8 Bacilius cereus NhaC cereus 495 YdbB protein Bacilius subtilis UniRef100 P96597 Bacilius subtiis YbB 496 Glucose starvation-inducible protein B UniRef100 P26907 Bacilius subtiis GsiB Bacilius subtilis 497 Hypothetical UPF0118 protein yobi UniRef100 P96604 Bacilius subtiis YdbI Bacilius subtilis 498 GltT US 8,168,417 B2 67 68 TABLE 1-continued

Predicted functions

Bacilius subtiis SEQ homolog ID (Gene NO. Description OniRef Accession No. Organism Name) 499 YdbJ protein Bacilius subtilis UniRef100 P96605 Bacilius subtiis Ydb 500 YdbK protein Bacilius subtilis UniRef100 P96606 Bacilius subtiis YdbK 501 Hypothetical protein yabL Bacilius UniRef100 P966O7 Bacilius subtiis YdbL subtilis 502 YdbM protein Bacilius subtilis UniRef100 P96608 Bacilius subtiis YdbM 503 SO4 505 YabP protein Bacillus subtilis) UniRef100 P96611 Bacilius subtiis YdbP 506 D-alanine--D-alanine ligase Bacilius UniRef100 P96612 Bacilius subtiis Dd subtilis 507 MurF 508 Esterase Oceanobacillusiheyensis UniRef100 Q8ESMO Oceanobacilius YvaK iheyensis 509 YalbR protein Bacillus subtilis UniRef100 P96614 Bacilius subtiis YdbR. 510 YodbS protein Bacilius subtilis UniRef100 P96615 Bacilius subtiis YdbS 511 YalbT protein Bacillus subtilis UniRef100 P96616 Bacilius subtiis YdbT 512 YdcA protein Bacillus subtilis UniRef100 P96617 Bacilius subtiis YdcA 513 YdcC S1.4 Alr 515 516 YdcE protein Bacilius subtilis UniRef100 P96622 Bacilius subtiis YdcE 517 RsbR 518 RSbS 519 RsbT 520 RSb 521 RsbV 522 RsbW 523 SigB 524 RsbX 525 YdcI protein Bacilius subtilis UniRef100 O31489 Bacilius subtiis YdcI 526 Transcriptional regulator, TetR family UniRef100 Q6HGY5 Bacilius YxbF Bacilius thuringiensis thiringiensis 527 Lin1189 protein Listeria innocua UniRef100 Q92CI2 Listeria innoctia YdgH 528 529 Protein sprT-like Bacillus subtilis UniRef100 P96628 Bacilius subtiis YdcK 530 Possible transporter, EamA family UniRef100 Q638K5 Bacilius cereus ZK Bacilius cereus ZK) 531 532 Hypothetical protein ORFO.0034 UniRef100 O87235 Lactococci is lactis Lactococci is lactis 533 DeltaS acyl-lipid desaturase Bacilius UniRef100 Q81CO2 Bacilius cereus Des cereus 534 Cold shock protein cspC Bacilius UniRef100 P39158 Bacilius subtiis subtilis 535 YogA 536 Membrane protein, putative Bacilius UniRef100 Q734YO Bacilius cereus YyaS cereus 537 Transcriptional regulator, MarR family UniRef100 Q734X9 Bacilius cereus YybA Bacilius cereus 538 Acetyltransferase, GNAT family UniRef100 Q734X8 Bacilius cereus PaiA Bacilius cereus 539 Protease synthase and sporulation UniRef100 Q734X7 Bacilius cereus PaB negative regulatory protein PAI 2 Bacilius cereus 540 BHO654 protein Bacilius halodurans UniRef100 Q9KF33 Bacilius RocF halodurans S41 542 YdeO protein Bacillus subtilis UniRef100 P96672 Bacilius subtiis YdeO 543 Transporter, LysE family Bacilius UniRef100 Q81D17 Bacilius cereus YrhP cereus 544 YwqM 545 546 Permease, putative Bacillus anthracis UniRef100 Q81QY7 Bacilius Yvd.J anthracis 547 Putative cyclase Rhodopseudomonas UniRef100 Q6N497 Rhodopseudomonas palustris palustris 548 Hypothetical transport protein yogF UniRef100 P96704 Bacilius subtiis YdgF Bacilius subtilis S49 550 YkW 551 RNA polymerase sigma factor sigV UniRef100 OO5404 Bacilius subtiis SigV Bacilius subtilis US 8,168,417 B2 69 70 TABLE 1-continued

Predicted functions

Bacilius subtiis SEQ homolog ID (Gene NO. Description UniRef Accession No. Organism Name) 552 Putative anti-SigV factor Bacilius UniRef100 OO5403 Bacilius subtiis YrhM subtilis 553 Hypothetical protein yrh.L. Bacilius UniRef100 OO5402 Bacilius subtiis Yrh, subtilis 554 YdgK 555 YwpD 556 Lytt 557 Collagen adhesion protein Bacilius UniRef100 Q63OP2 Bacilius cereus ZK cereus ZK) 558 LinC929 protein Listeria innocua UniRef100 Q92D88 Listeria innoctia 559 560 Metabolite transport protein Bacilius UniRef100 O34718 Bacilius subtiis YdiK subtilis 561 Thiamine-monophosphate kinase UniRef100 OO5514 Bacilius subtiis ThiL Bacilius subtilis 562 Hypothetical UPF0079 protein yoliB UniRef100 OO5515 Bacilius subtiis YdB Bacilius subtilis 563 YodiC protein Bacilius subtilis UniRef100 OO5516 Bacilius subtiis YdiC 564 YdiD protein Bacilius subtilis UniRef100 OO5517 Bacilius subtiis YD 565 Probable O-sialoglycoprotein UniRef100 OO5518 Bacilius subtiis Gcp endopeptidase Bacilius subtilis 566 YdF 567 YdF 568 Molybdenum cofactor biosynthesis UniRef100 OO5520 Bacilius subtiis YG protein C Bacilius subtilis 569 Redox-sensing transcriptional repressor UniRef100 OO5521 Bacilius subtiis YH rex Bacilius subtilis 570 YdiI Bacilius halodurans UniRef100 Q979P5 Bacilius halodurans 571 TatCY 572 Hypothetical lipoprotein yoliK precursor UniRef100 OO5524 Bacilius subtiis Bacilius subtilis 573 YL 574 10 kDa chaperonin Bacilius subtilis UniRef100 P28599 Bacilius subtiis 575 60 kDa chaperonin Bacilius subtilis UniRef100 P28598 Bacilius subtiis GroEL 576 577 578 Hypothetical protein yolD UniRef100 O64030 Bacteriophage SPBc2 Bacteriophage SPBc2 579 S8O 581 YoaRBacillus subtilis UniRef100 O34611 Bacilius subtiis YoaR 582 Hypothetical protein yfmO Bacilius UniRef100 OO6475 Bacilius subtiis YfmO subtilis 583 584 YoqW protein Bacteriophage SPBc2 UniRef100 O64131 Bacteriophage YoqW SPBC2 585 Lin2076 protein Listeria innocua UniRef100 Q92A46 Listeria innoctia YerO 586 PEP synthase Bacillus subtilis UniRef100 O34309 Bacilius subtiis Pps 587 Hypothetical protein yoaF Bacilius UniRef100 O31829 Bacilius subtiis subtilis 588 Hypothetical protein Staphylococcus UniRef100 Q6GK76 Staphylococcusatiretts attreus 589 Short-chain dehydrodenase UniRef100 Q97LM1 Cliostridium DItE Clostridium acetobutyllicum acetobiitvictim 590 Type B carboxylesterase Bacillus sp. UniRef100 Q9L378 Bacilius sp. BP-7 PnbA BP-7) 591 transport protein UniRef100 Q8ESX2 Oceanobacilius IOIF Oceanobacilius iheyensis iheyensis 592 Phage shock protein A homolog UniRef100 P54617 Bacilius subtiis PSA Bacilius subtilis 593 YoC protein Bacilius subtilis UniRef100 O34434 Bacilius subtiis YdiG 594 YdjH protein Bacilius subtilis UniRef100 O35004 Bacilius subtiis YdjH 595 Yd I protein Bacilius subtilis UniRef100 O34789 Bacilius subtiis YdjI 596 Putative oxidoreductase UniRef100 Q67S08 Symbiobacterium YO Symbiobacterium thermophilum thermophilum 597 YrhC Bacillus subtilis) UniRef100 OO5405 Bacilius subtiis YrhC) 598 YrhP 599 Helix-turn-helix domain protein Bacilius UniRef100 Q73COO Bacilius cereus cereus 600 Stage V sporulation protein E Bacilius UniRef100 Q9K7T4 Bacilius SpoVE halodurans halodurans US 8,168,417 B2 71 72 TABLE 1-continued

Predicted functions

Bacilius subtiis SEQ homolog ID (Gene NO. Description UniRef Accession No. Organism Name) 601 Stage V sporulation protein E Bacilius UniRef100 Q9K7T3 Bacilius FtSW halodurans halodurans 602 603 Hypothetical protein Bacilius cereus UniRef100 Q72Z89 Bacilius cereus 604 BH1889 protein Bacilius halodurans UniRef100 Q9KBN6 Bacilius YobV halodurans 60S YeA 606 YeA 607 Trea 608 Putative HTH-type transcriptional UniRef100 OO6987 Bacilius subtiis YvoE regulatoryvdE Bacilius subtilis 609 Hypothetical protein yvdF Bacilius UniRef100 OO6988 Bacilius subtiis YvoF subtilis 610 Hypothetical protein yvdG Bacilius UniRef100 OO6989 Bacilius subtiis YvdG subtilis 611 Hypothetical protein yvd H. Bacilius UniRef100 OO6990 Bacilius subtiis YvEH subtilis 612 Hypothetical protein yvdI Bacillus UniRef100 OO6991 Bacilius subtiis YvoI subtilis 613 Hypothetical protein yvd.J. Bacilius UniRef100 OO6992 Bacilius subtiis Yvo subtilis 614 Hypothetical glycosylhydrolaseyvdK UniRef100 OO6993 Bacilius subtiis YvK Bacilius subtilis 615 Oligo-1,6-glucosidase Bacilius subtilis UniRef100 OO6994 Bacilius subtiis MalL 616 Putative beta-phosphoglucomutase UniRef100 OO6995 Bacilius subtiis PgcM Bacilius subtilis 617 Tyrosyl-tRNA synthetase 2 Bacilius UniRef100 P25151 Bacilius subtiis Tyrz subtilis 618 Putative HTH-type transcriptional UniRef100 P25150 Bacilius subtiis Ywas regulatory waE Bacillus subtilis 619 620 Hypothetical protein SE2399 UniRef100 Q8CQM7 Staphylococci is epidermidis Staphylococci is epidermidis 621 Hypothetical protein yajM Bacilius UniRef100 P4O775 Bacilius subtiis YdjM subtilis 622 YdN protein Bacilius subtilis UniRef100 O34353 Bacilius subtiis YdiN 623 Hypothetical protein Bacilius cereus UniRef100 Q81AT6 Bacilius cereus YeaA 624 625 Signal peptidase I Bacilius cereus UniRef100 Q73C25 Bacilius cereus SipS 626 627 Hypothetical protein yhfK Bacilius UniRef100 OO7609 Bacilius subtiis YK subtilis 628 Transcriptional regulator Bacilius UniRef100 Q9K766 Bacilius YdleE halodurans halodurans 629 Spore coat protein A Bacilius subtilis UniRef100 PO7788 Bacilius subtiis CotA 630 YkrP 631 Extracellular protein Lactobacilius UniRef100 Q88T27 Lactobacilius YcdA plantari in pianiartin 632 Hypothetical UPF0018 protein yeaB UniRef100 P46348 Bacilius subtiis Yea Bacilius subtilis 633 YeaC Bacilius subtilis UniRef100 P94474 Bacilius subtiis YeaC 634 Hypothetical protein Bacilius cereus UniRef100 Q739D8 Bacilius cereus Yea) 635 YebA Bacillus subtilis) UniRef100 P94476 Bacilius subtiis Yeb.A 636 GMP synthase glutamine-hydrolyzing UniRef100 P29727 glutamine- GuaA Bacilius subtilis hydrolyzing 637 Hypoxanthine guanine permease UniRef100 O34987 Bacilius subtiis PblG Bacilius subtilis 638 YebC 639 Hypothetical UPF0316 protein yebE UniRef100 O34624 Bacilius subtiis YebE Bacilius subtilis 640 YebG protein Bacilius subtilis UniRef100 O34700 Bacilius subtiis 641 Phosphoribosylaminoimidazole UniRef100 P12044 Bacilius subtiis PurE carboxylase catalytic Subunit Bacilius subtilis 642 Purk 643 Adenylosuccinate lyase Bacilius UniRef100 P12047 Bacilius subtiis PrB subtilis 644 Phosphoribosylaminoimidazole- UniRef100 P12046 Bacilius subtiis PrC Succinocarboxamide synthase Bacilius subtilis 645 Hypothetical UPF0062 protein yeXA UniRef100 P12049 Bacilius subtiis Bacilius subtilis US 8,168,417 B2 73 74 TABLE 1-continued

Predicted functions

Bacilius subtiis SEQ homolog ID (Gene NO. Description UniRef Accession No. Organism Name) 646 Phosphoribosylformylglycinamidine UniRef100 P12041 Bacilius subtiis PurO synthase I Bacilius subtilis 647 Phosphoribosylformylglycinamidine UniRef100 P12042 Bacilius subtiis PurL synthase II Bacilius subtilis 648 Amidophosphoribosyltransferase UniRef100 POO497 Bacilius subtiis PurF precursor Bacillus subtilis 649 PrM 650 Phosphoribosylglycinamide UniRef100 P12040 Bacilius subtiis PrN ormyltransferase Bacilius subtilis 651 Bifunctional biosynthesis protein UniRef100 P12048 Includes: PurEH purH Includes: Phosphoribosylaminoimidazole Phosphoribosylaminoimidazolecarboxamide carboxamide formyltransferase ormyltransferase (EC 2.1.2.3) (EC 2.1.2.3) (AICAR transformylase): IMP (AICAR cyclohydrolase (EC 3.5.4.10) transformylase); (Inosinicase) (IMP synthetase) (ATIC) IMP Bacilius subtilis cyclohydrolase (EC 3.5.4.10) (Inosinicase) (IMP synthetase) (ATIC) 652 Phosphoribosylamine-glycine ligase UniRef100 P12039 Bacilius subtiis Pur) Bacilius subtilis 653 YxbF 654 Putative cytochrome P450 yjiB Bacilius UniRef100 O34374 Bacilius subtiis YiB subtilis 655 656 Hypothetical lipoprotein yybP precursor UniRef100 P37488 Bacilius subtiis YybP Bacilius subtiis 657 Transposase Thermoanaerobacter UniRef100 Q8RCM3 Thermoanaerobacter tengcongensis tengcongensis 658 Hypothetical protein Bacilius anthracis UniRef100 Q81ZG4 Bacilius anthracis 659 Putative adenine deaminase yerA UniRef100 O34909 Bacilius subtiis Yer A Bacilius subtiis 660 YerB protein Bacilius subtilis UniRef100 O34968 Bacilius subtiis YerB 661 YecD Bacilius subtilis UniRef100 Q7BVT7 Bacilius subtiis YerC 662 PerB protein homolog Bacillus subtilis UniRef100 O34790 Bacilius subtiis PcrB 663 ATP-dependent DNA helicase pcrA UniRef100 O34580 Bacilius subtiis PcrA Bacilius subtiis 664 DNA ligase Bacilius subtilis UniRef100 O31498 Bacilius subtiis LigA 665 YerH protein Bacillus subtilis UniRef100 O34629 Bacilius subtiis YerH 666 BHO586 protein Bacilius halodurans UniRef100 Q9KF99 Bacilius halodurans 667 668 669 BHO589 protein Bacilius halodurans UniRef100 Q9KF96 Bacilius halodurans 670 Phosphotriesterase homology protein UniRef100 P45548 Escherichia coi 671 Similar to unknown protein YhfS of UniRef100 Q7NSF2 Photorhabdus Csd Escherichia coi Photorhabdus luminescens luminescens 672 Phosphopentomutase UniRef100 Q67SO6 Symbiobacterium Drm Symbiobacterium thermophilum thermophilum 673 Putative alanine racemase UniRef100 Q67S05 Symbiobacterium Symbiobacterium thermophilum thermophilum 674 Glucosamine-6-phosphate deaminase UniRef100 Q8ESL6 Oceanobacilius NagB Oceanobacilius iheyensis iheyensis 675 SapB protein Bacilius subtilis UniRef100 Q45514 Bacilius subtiis SapB 676 OpuE 677 Glutamyl-tRNA(Gln) amidotransferase UniRef100 OO6492 Bacilius subtiis subunit C Bacilius subtilis 678 Glutamyl-tRNA(Gln) amidotransferase UniRef100 OO6491 Bacilius subtiis GatA subunit A Bacilius subtilis 679 Gat 680 Hypothetical protein Bacilius UniRef100 Q848Y2 Bacilius negaterium negaterium 681 682 YdhT 683 684 Putative HTH-type transcriptional UniRef100 O31500 Bacilius subtiis YerO regulatoryerO Bacilius subtilis 685 Swarming motility protein swrC Bacilius UniRef100 O31501 Bacilius subtiis YerP subtilis US 8,168,417 B2 75 76 TABLE 1-continued

Predicted functions

Bacilius subtiis SEQ homolog ID (Gene NO. Description OniRef Accession No. Organism Name) 686 Yick 687 Inosine- preferring nucleoside UniRef100 Q81DM6 Bacilius cereus hydrolase Bacilius cereus 688 YerQ protein Bacillus subtilis UniRef100 O31502 Bacilius subtiis YerQ 689 Hypothetical RNA methyltransferase UniRef100 O31503 Bacilius subtiis YefA yefA Bacilius subtilis 690 Type I restriction-modification system UniRef100 Q817 S1 Bacilius cereus specificity subunit Bacilius ceretts 691 Type I restriction-modification system UniRef100 Q817 S2 Bacilius cereus methylation subunit Bacilius cereus 692 Type IC specificity subunit UniRef100 Q9RNWO Streptococcus thermophilus Streptococcus thermophilus 693 Type I restriction-modification system UniRef100 Q817 S4 Bacilius cereus restriction subunit Bacilius cereus 694 695 Beta-glucosides PTS, EIIBCA UniRef100 Q88T54 Lactobacilius BglP Lactobacilius plantarum pianiartin 696 6-phospho-beta-glucosidase UniRef100 Q88T55 Lactobacilius BglH Lactobacilius plantarum pianiartin 697 LicT 698 Response regulator aspartate UniRef100 O34930 Bacilius subtiis RapK phosphatase KBacilius subtilis 699 700 Methyltransferase Bacilius cereus ZK) UniRef100 Q639N2 Bacilius cereus ZK 701 Hypothetical UPF0082 protein UniRef100 P62032 Bacilius cereus Yee BCEO595 Bacilius cereus) 702 Putative HTH-type transcriptional UniRef100 O28.646 Archaeoglobits filgidus regulator AF1627 Archaeoglobus fulgidus 703 704 YfmT Bacillus subtilis) UniRef100 OO6478 Bacilius subtiis YfmT 705 YfmS Bacillus subtilis) UniRef100 OO6477 Bacilius subtiis YfmS 706 YfS 707 YfmRBacilius subtilis UniRef100 OO6476 Bacilius subtiis YfmR 708 709 YeiA protein Bacilius subtilis UniRef100 P94398 Bacilius subtiis YciA 710 UPIOOOO2BDF65 UniRef100 entry UniRef100 UPIOOOO2BDF65 YpdA 711 Ferrusion transporter protein UniRef100 Q6U5S9 Klebsiella pneumoniae Klebsiella pneumoniae 712 713 YeiC protein Bacilius UniRef100 Q7OKK8 Bacilius YCC amyloiquefaciens amyloiquefaciens 71.4 BioW 715 Adenosylmethionine-8-amino-7- UniRef100 P53555 Bacilius subtiis BioA oxononanoate aminotransferase Bacilius subtilis 716 8-amino-7-oxononanoate synthase UniRef100 P53556 Bacilius subtiis BioF Bacilius subtilis 717 BioD protein Bacilius UniRef100 Q70JZO Bacilius BioD amyloiquefaciens amyloiquefaciens 718 BioB protein Bacilius UniRef100 Q70JZ1 Bacilius BioB amyloiquefaciens amyloiquefaciens 719 BioI protein Bacilius amyloiquefaciens UniRef100 Q70JZ2 Bacilius BioI amyloiquefaciens 720 BH15O1 protein Bacilius halodurans UniRef100 Q9KCR8 Bacilius halodurans 721 YfmP 722 Multidrug efflux protein yfmO Bacilius UniRef100 OO6473 Bacilius subtiis YfmO subtilis 723 YfmM protein Bacilius subtilis UniRef100 O34512 Bacilius subtiis YfmM 724 YfmL protein Bacilius subtilis UniRef100 O34750 Bacilius subtiis YfmL 725 Yfm J protein Bacilius subtilis UniRef100 O34812 Bacilius subtiis Yfm 726 Hypothetical protein yfmB Bacillus UniRef100 O34626 Bacilius subtiis YfmB subtilis 727 General stress protein 17M Bacilius UniRef100 P80241 Bacilius subtiis YFT subtilis 728 YfS 729 YfS 730 Putative permease Clostridium tetani UniRef100 Q895AO Cliostridium tetani 731 Possible Zn-dependent hydrolase, UniRef100 Q6HKH5 Bacilius YdgX beta-lactamase Superfamily Bacilius thiringiensis thiringiensis 732 YfiN protein Bacillus subtilis UniRef100 O34409 Bacilius subtiis YfN US 8,168,417 B2 77 78 TABLE 1-continued

Predicted functions

Bacilius subtiis SEQ homolog ID (Gene NO. Description OniRef Accession No. Organism Name) 733 Zink-carboxypeptidase Clostridium UniRef100 Q898E1 Cliostridium tetani tetani 734 Nitric oxide synthase oxygenase UniRef100 O34453 Bacilius subtiis YM Bacilius subtilis 735 Membrane protein, putative Bacilius UniRef100 Q72YN4 Bacilius cereus YvaZ cereus 736 Transcriptional regulator, ArsR family UniRef100 Q632K6 Bacilius cereus ZK Bacilius cereus ZK) 737 Putative acylphosphatase Bacilius UniRef100 O35031 Bacilius subtiis subtilis 738 YflK protein Bacilius subtilis UniRef100 O34542 Bacilius subtiis YK 739 740 741 YfiG protein Bacilius subtilis UniRef100 O34484 Bacilius subtiis YG 742 YfE 743 Hypothetical protein yflB Bacilius UniRef100 O34887 Bacilius subtiis subtilis 744 YA 745 Probable PTS system, trehalose- UniRef100 P39794 Bacilius subtiis TreP specific IIBC component Bacilius subtilis 746 Alpha-glucosidase Bacilius sp. UniRef100 Q9L872 Bacilius sp. Trea DGO303 DGO303 747 Trehalose operon transcriptional UniRef100 P39796 Bacilius subtiis TreR repressor Bacillus subtilis 748 Acetyltransferases UniRef100 Q8RBZ9 Thermoanaerobacter YvfD Thermoanaerobacter tengcongensis tengcongensis 749 750 751 Predicted - UniRef100 Q8RBY7 Thermoanaerobacter SpsC dependent enzyme apparently involved tengcongensis in regulation of cell wall biogenesis Thermoanaerobacter tengcongensis 752 YkuQ 753 UPIO00029FB28 UniRefl00 entry UniRef100 UPIOOOO29FB28 YtcB 754 Hypothetical protein MA2181 UniRef100 Q8TNU7 Meihanosarcina acetivorans Methanosarcina acetivorans 755 756 Beta 14 glucosyltransferase Bacilius UniRef100 Q81GJ1 Bacilius cereus Yol cereus 757 Predicted pyridoxal phosphate- UniRef100 Q8RCO2 Thermoanaerobacter SpsC dependent enzyme apparently involved tengcongensis in regulation of cell wall biogenesis Thermoanaerobacter tengcongensis 758 Predicted dehydrogenases and related UniRef100 Q8RCOO Thermoanaerobacter YrbE proteins Thermoanaerobacter tengcongensis tengcongensis 759 UDP-glucose: GDP-mannose UniRef100 Q8CXB6 Oceanobacilius TualD dehydrogenase Oceanobacilitis iheyensis iheyensis 760 Hypothetical protein Bacillus cereus UniRef100 Q635H5 Bacilius cereus Yak) ZK) ZK. 761 Hypothetical UPF0087 protein yoleP UniRef100 P96673 Bacilius subtiis YdeP Bacilius subtilis 762 Putative NAD(P)H nitroreductaseyfkO UniRef100 O34475 Bacilius subtiis YfkO Bacilius subtilis 763 YfkN 764 General stress protein 18 Bacilius UniRef100 P80876 Bacilius subtiis YfkM subtilis 765 YfkK protein Bacilius subtilis UniRef100 O35019 Bacilius subtiis 766 Amino acid transporter Bacilius UniRef100 Q9K5Q5 Bacilius YA halodurans halodurans 767 YfkJ protein Bacilius subtilis UniRef100 O35016 Bacilius subtiis Yfk 768 Hypothetical protein yfkI precursor UniRef100 O34418 Bacilius subtiis YfkI Bacilius subtilis 769 YfkH protein Bacilius subtilis UniRef100 O34437 Bacilius subtiis YfkH 770 YfkF protein Bacillus subtilis UniRef100 O34929 Bacilius subtiis YfkF 771 Hypothetical conserved protein UniRef100 Q8ELS1 Oceanobacilius iheyensis Oceanobacilius iheyensis 772 YfkE protein Bacilius subtilis UniRef100 O34840 Bacilius subtiis YfkE 773 YfkD protein Bacillus subtilis) UniRef100 O34579 Bacilius subtiis YfkD US 8,168,417 B2 79 80 TABLE 1-continued

Predicted functions

Bacilius subtiis SEQ homolog ID (Gene NO. Description OniRef Accession No. Organism Name) 774. Thioredoxin-like UniRef100 Q81 IC7 Bacilius cereus YfkA Bacilius cereus 775 YfT protein Bacillus subtilis UniRef100 O35041 Bacilius subtiis 776 Yfs 777 UPIO00029390C UniRef100 entry UniRef100 UPIOOOO2939OC AraM 778 779 YfiO protein Bacilius subtilis UniRef100 O31543 Bacilius subtiis YfjQ 780 Yf P protein Bacilius subtilis UniRef100 O31544 Bacilius subtiis YfiP 781 YfiO 782 Yf M protein Bacilius subtilis UniRef100 O31547 Bacilius subtiis Yf M 783 784 Hypothetical protein yfL Bacilius UniRef100 P4O773 Bacilius subtiis YfL subtilis 785 UPIOOOO3CB259 UniRefl00 entry UniRef100 UPIOOOO3CB259 YvkB 786 YahE protein Bacillus subtilis UniRef100 OO5496 Bacilius subtiis YdhE 787 Hypothetical protein yokD precursor UniRef100 P42402 Bacilius subtiis Bacilius subtilis 788 Hypothetical metabolite transport UniRef100 O34691 Bacilius subtiis Yce protein yeeIBacilius subtilis 789 SacX 790 LevanSucrase and Sucrase synthesis UniRef100 P15401 Bacilius subtiis Sacy operon antiterminator Bacilius subtilis 791 Hypothetical protein Bacilius anthracis UniRef100 Q81NL1 Bacilius YbcF anthracis 792 Hypothetical protein ybcD Bacilius UniRef100 Q639F7 Bacilius cereus YbCD cereus ZK) ZK. 793 Potential NADH-quinone UniRef100 P39755 Bacilius subtiis NF oxidoreductase subunit 5 Bacilius subtilis 794 YbcI protein Bacilius subtilis UniRef100 O34380 Bacilius subtiis YbcI 795 YraA 796 TPP-dependent acetoin dehydrogenase UniRef100 Q81PM6 Bacilius AcoA E1 alpha-subunit Bacilius anthracis anthracis 797 TPP-dependent acetoin dehydrogenase UniRef100 Q736 U7 Bacilius cereus AcoB E1 beta-subunit Bacilius cereus 798 Dihydrolipoyllysine-residue UniRef100 O31550 Bacilius subtiis AcoC acetyltransferase component of acetoin cleaving system Bacilius Subtilis 799 Dihydrolipoyl dehydrogenase Bacilius UniRef100 O34324 Bacilius subtiis AcoL subtilis 800 Acetoin operon transcriptional activator, UniRef100 Q736V6 Bacilius cereus AcoR putative Bacillus cereus 801 Hypothetical UPF0060 protein yfjF UniRef100 O31553 Bacilius subtiis YfjF Bacilius subtiis 802 Maltose-6'-phosphate glucosidase UniRef100 P54716 Bacilius subtiis MalA Bacilius subtiis 803 HTH-type transcriptional regulator glvR UniRef100 P54717 Bacilius subtiis YA Bacilius subtiis 804 PTS system, arbutin-like IIBC UniRef100 P54715 Bacilius subtiis MaP component Bacilius subtilis 805 UPIOOOO3651CF UniRefl00 entry UniRef100 UPIOOOO3651CF 806 UPIOOOO34AA1D UniRef100 entry UniRef100 UPIOOOO34AA1D 807 Transcriptional regulator, MarR family UniRef100 Q638I2 Bacilius cereus YvaP Bacilius cereus ZK) ZK. 808 Yf) 809 Hypothetical protein yfiE Bacilius UniRef100 P54721 Bacilius subtiis YE subtilis 810 Xylosidasefarabinosidase Bacteroides UniRef100 Q8A036 Bacteroides XynB thetaiotaomicron theiaioiaomicron 811 Xylan beta-1,4-xylosidase Bacilius UniRef100 Q9K6P5 Bacilius XynB halodurans halodurans 812 Trancriptional regulator of AraC family UniRef100 Q97FW8 Cliostridium YbfI Clostridium acetobutyllicum acetobiitvictim 813 YtcQ 814 NAD(P)H dehydrogenase, quinone UniRef100 Q638S8 Bacilius cereus ZK family Bacilius cereus ZK) 815 Mutator MutT protein Bacilius UniRef100 Q9K8B7 Bacilius YhB halodurans halodurans 816 YfiT protein Bacilius subtilis UniRef100 O31562 Bacilius subtiis YfiT 817 YfiX Bacilius subtilis UniRef100 O52961 Bacilius subtiis YfiX 818 Hypothetical protein yfhB Bacilius UniRef100 O31570 Bacilius subtiis YfhB subtilis US 8,168,417 B2 81 82 TABLE 1-continued

Predicted functions

Bacilius subtiis SEQ homolog ID (Gene NO. Description OniRef Accession No. Organism Name) 819 YfhC protein Bacilius subtilis UniRef100 O31571 Bacilius subtiis YfhC 820 Hypothetical protein yfhD Bacilius UniRef100 O31572 Bacilius subtiis subtilis 821 822 BHO923 homolog Bacilius cereus UniRef100 Q81 IAO Bacilius cereus 823 Hypothetical UPFO105 protein yfhF UniRef100 O31574 Bacilius subtiis YfhF Bacilius subtilis 824 Regulatory protein recX Bacilius UniRef100 O31575 Bacilius subtiis YfhCG subtilis 825 YfhH protein Bacilius subtilis UniRef100 O31576 Bacilius subtiis YfhEH 826 827 Yfh.J protein Bacilius subtilis UniRef100 O31578 Bacilius subtiis 828 CsbB protein Bacilius subtilis UniRef100 Q45539 Bacilius subtiis CsbB 829 Hypothetical protein SE1997 UniRef100 Q8CR87 Staphylococci is epidermidis Staphylococci is epidermidis 830 YfhC) protein Bacilius subtilis UniRef100 O31582 Bacilius subtiis YfhO 831 YfhP protein Bacillus subtilis UniRef100 O31583 Bacilius subtiis YfhP 832 YfhCR protein Bacilius subtilis UniRef100 O31584 Bacilius subtiis Yfhq 833 YfhS protein Bacillus subtilis UniRef100 O31585 Bacilius subtiis 834 Unidentified dehydrogenase Bacilius UniRef100 P71079 Bacilius subtiis FabL subtilis 835 836 Hypothetical protein ygaB Bacillus UniRef100 P71080 Bacilius subtiis subtilis 837 YgaC protein Bacilius subtilis UniRef100 Q796Z1 Bacilius subtiis YgaC 838 Unidentified transporter-ATP binding UniRef100 P71082 Bacilius subtiis YgaD Bacilius subtilis 839 Oligopeptide ABC transporter Bacilius UniRef100 Q9K6TO Bacilius AppD halodurans halodurans 840 Oligopeptide ABC transporter Bacilius UniRef100 Q9K6T1 Bacilius AppF halodurans halodurans 841 Dipeptide transporter protein DppA UniRef100 P94310 Bacilius firmus OppA Bacilius firmus 842 Dipeptide ABC transporter Bacillus UniRef100 Q9K6T3 Bacilius AppB halodurans halodurans 843 Dipeptide transport system permease UniRef100 P94312 Bacilius AppC protein dippC Bacilius pseudofirmits pseudofirmus 844 Hypothetical 40.7 kd protein Bacillus UniRef100 P71083 Bacilius subtiis YgaE subtilis 845 Glutamate-1-semialdehyde 2,1- UniRef100 P71084 Bacilius subtiis GSaB aminomutase 2 Bacilius subtilis 846 YgaF protein Bacilius subtilis UniRef100 Q796Y8 Bacilius subtiis YgaF 847 Peroxide operon regulator Bacilius UniRef100 P71086 Bacilius subtiis PerR subtilis 848 849 Hypothetical protein ygxA Bacilius UniRef100 Q04385 Bacilius subtiis YgXA subtilis 8SO RapD 851 852 853 854 855 856 Hypothetical protein Bacilius UniRef100 Q6HH72 Bacilius thuringiensis thiringiensis 857 YXD 858 Hypothetical protein uncultured UniRef100 Q64DF2 uncultured archaeon archaeon GZfos18F2 GZfos18F2 859 860 YXD 861 RapE 862 Putative membrane protein UniRef100 Q82NMO Streptomyces avermitiis Streptomyces avermitilis 863 3-dehydroquinate dehydratase Listeria UniRef100 Q8Y9N4 Listeria AroC monocytogenes monocytogenes 864 ThiC 865 Putative aliphatic sulfonates transport UniRef100 P97027 Bacilius subtiis SSuB ATP-binding protein SSuB Bacilius subtilis 866 Putative aliphatic sulfonates binding UniRef100 P40400 Bacilius subtiis SSuA protein precursor Bacillus subtilis US 8,168,417 B2 83 84 TABLE 1-continued

Predicted functions

Bacilius subtiis SEQ homolog ID (Gene NO. Description UniRef Accession No. Organism Name) 867 Putative aliphatic sulfonates transport UniRef100 P4O401 Bacilius subtiis SSC permease protein SSuC Bacilius subtilis 868 SSD 869 SSD 870 Hypothetical lipoprotein ygaO precursor UniRef100 P97O29 Bacilius subtiis YgaO Bacilius subtilis 871 DNA-binding protein Bacilius anthracis UniRef100 Q81V18 Bacilius anthracis 872 873 ABC-type multidrug transport system, UniRef100 Q8RBHO Thermoanaerobacter Yhao ATPase component tengcongensis Thermoanaerobacter tengcongensis 874 875 876 877 878 30S ribosomal protein S14-2 Bacilius UniRef100 O31587 Bacilius subtiis subtilis 879 Hypothetical protein yhzB Bacilius UniRef100 O31588 Bacilius subtiis YZB subtilis 880 Hypothetical 48.5 kd protein Bacillus UniRef100 P97030 Bacilius subtiis Yba subtilis 881 Hypothetical 35.8 kd protein Bacillus UniRef100 P97031 Bacilius subtiis YbB subtilis 882 CspR Bacillus subtilis UniRef100 Q45512 Bacilius subtiis CspR 883 Hypothetical 27 kd protein Bacilius UniRef100 P97032 Bacilius subtiis YhbD subtilis 884. Hypothetical Cytosolic Protein Bacilius UniRef100 Q813H1 Bacilius cereus YhbE cereus 885 Hypothetical protein Bacilius anthracis UniRef100 Q81 PD2 Bacilius YhbF anthracis 886 887 PrkA protein Bacillus subtilis UniRef100 P39134 Bacilius subtiis PrkA 888 Stress response UPF0229 protein yhbH UniRef100 P45.742 Bacilius subtiis YbB Bacilius subtilis 889 YhbJ protein Bacillus subtilis UniRef100 O31593 Bacilius subtiis Yhb 890 Hypothetical transport protein yhcA UniRef100 P54585 Bacilius subtiis YhcA Bacilius subtilis 891 Hypothetical protein yhcBBacilius UniRef100 P54586 Bacilius subtiis YCB subtilis 892 YCC 893 894 Hypothetical ABC transporter ATP- UniRef100 P54591 Bacilius subtiis binding protein yhcG Bacilius subtilis 895 Hypothetical ABC transporter ATP- UniRef100 P54592 Bacilius subtiis YCH binding protein yhcH Bacilius subtilis 896 Hypothetical protein yhcI Bacilius UniRef100 P54593 Bacilius subtiis YCI subtilis 897 Cold shock protein cspBBacilius UniRef100 P32081 Bacilius subtiis subtilis 898 ABC transporter, ATP-binding protein UniRef100 Q6HP89 Bacilius YSC Bacilius thuringiensis thiringiensis 899 MWO417 protein Staphylococcus UniRef100 Q8NY20 Staphylococcits YSB attreus (iiietS 900 ABC transporter Substrate-binding UniRef100 Q81 IN6 Bacilius cereus Yic protein Bacilius cereus 901 Hypothetical SymporteryhcL Bacilius UniRef100 P54596 Bacilius subtiis YCL subtilis 902 Hypothetical protein yhcM Bacilius UniRef100 P54597 Bacilius subtiis YcM subtilis 903 Acylamino-acid-releasing enzyme UniRef100 Q8CXN6 Oceanobacilius YuxL Oceanobacilius iheyensis iheyensis 904 Lipoprotein yhcN precursor Bacilius UniRef100 P54598 Bacilius subtiis YcN subtilis 905 Hypothetical protein yhcP Bacilius UniRef100 P54600 Bacilius subtiis YhcP subtilis 906 Hypothetical protein yhcQ Bacilius UniRef100 P546O1 Bacilius subtiis YhcQ subtilis 907 Two-component sensor histidine kinase UniRef100 O31661 Bacilius subtiis KinE Bacilius subtilis 908 Hypothetical protein yhcR precursor UniRef100 P54602 Bacilius subtiis YCR Bacilius subtilis US 8,168,417 B2 85 86 TABLE 1-continued

Predicted functions

Bacilius subtiis SEQ homolog ID (Gene NO. Description OniRef Accession No. Organism Name) 909 Hypothetical protein yhcS Bacilius UniRef100 P54603 Bacilius subtiis YCS subtilis 910 Hypothetical pseudouridine synthase UniRef100 P54604 Bacilius subtiis YhcT yhcT Bacilius subtilis 911 Hypothetical protein yhcU Bacilius UniRef100 P54605 Bacilius subtiis YcU subtilis 912 Hypothetical protein yhcV Bacilius UniRef100 P54606 Bacilius subtiis YhcV subtilis 913 Hypothetical protein yhcW Bacilius UniRef100 P54607 Bacilius subtiis YhcW subtilis 914 Hypothetical UPF0012 protein yhcX UniRef100 P54608 Bacilius subtiis YCX Bacilius subtiis 915 ABC transporter Bacilius halodurans UniRef100 Q9KBA3 Bacilius YdF halodurans 916 Hypothetical transport protein yrhG UniRef100 OO5399 Bacilius subtiis YrhG Bacilius subtiis 917 LinC826 protein Listeria innocua UniRef100 Q92DI7 Listeria innoctia YwkB 918 dehydrogenase Bacilius UniRef100 Q81CT4 Bacilius cereus YogA cereus 919 Glycerol uptake operon antiterminator UniRef100 P3O3OO Bacilius subtiis GlpP regulatory protein Bacilius Stibilis 920 Glycerol uptake facilitator protein UniRef100 P18156 Bacilius subtiis GlpF Bacilius subtilis 921 Glycerol kinase Bacilius subtilis UniRef100 P18157 Bacilius subtiis GlpK 922 Aerobic glycerol-3-phosphate UniRef100 P18158 Bacilius subtiis GlpD dehydrogenase Bacilius subtilis 923 Alpha-phosphoglucomutase Bacilius UniRef100 Q68VA2 Bacilius subtiis YXB subtilis subsp. subtilis Subsp. subtilis 924 Hypothetical conserved protein UniRef100 Q8ELS7 Oceanobacilius YcgE Oceanobacilius iheyensis iheyensis 925 Transcriptional regulator UniRef100 Q8ELS8 Oceanobacilius iheyensis Oceanobacilius iheyensis 926 YhcY 927 Hypothetical protein yhcZ Bacilius UniRef100 OO7528 Bacilius subtiis YCZ subtilis 928 Hypothetical protein yhdA Bacilius UniRef100 OO7529 Bacilius subtiis YhdA subtilis 929 930 931 Hypothetical UPF0074 protein yhdE UniRef100 OO7573 Bacilius subtiis YhdE Bacilius subtilis 932 Flavohemoprotein Bacilius halodurans UniRef100 Q9RC40 Bacilius Hmp halodurans 933 Stage V sporulation protein RBacilius UniRef100 P3 7875 Bacilius subtiis SpoVR subtilis 934 Probable endopeptidase IytE precursor UniRef100 P54421 Bacilius subtiis LytE Bacilius subtilis 935 CitR 936 CitA 937 Glucose dehydrogenase-B Bacilius UniRef100 Q979R3 Bacilius halodurans halodurans 938 Hypothetical oxidoreductase yhdF UniRef100 OO7575 Bacilius subtiis YhdF Bacilius subtilis 939 Hypothetical protein yhdH Bacilius UniRef100 OO7577 Bacilius subtiis YH subtilis 940 2,4-diaminobutyrate decarboxylase UniRef100 Q9KFB9 Bacilius halodurans Bacilius halodurans 941 YaleE protein Bacilius subtilis UniRef100 P9666.2 Bacilius subtiis YdleE 942 YdeL protein Bacilius subtilis UniRef100 P96669 Bacilius subtiis YdeL 943 BH1582 protein Bacilius halodurans UniRef100 Q9KCI9 Bacilius Yhd halodurans 944 YhdK Bacilius subtilis Subsp. spizizeni UniRef100 Q7X2K9 Bacilius subtilis Subsp. spizizeni 945 YhdL Bacilius subtilis subsp. spizizeni UniRef100 Q7X2LO Bacilius subtiis YL Subsp. spizizeni 946 Hypothetical protein yhdM (RNA UniRef100 OO7582 Bacilius subtiis SigM polymerase ECF (Extracytoplasmic function)-type sigma factor) Bacilius subtilis 947 YrkC 948 Hypothetical protein yhdO Bacilius UniRef100 O07584 Bacilius subtiis YO subtilis US 8,168,417 B2 87 88 TABLE 1-continued

Predicted functions

Bacilius subtiis SEQ homolog ID (Gene NO. Description OniRef Accession No. Organism Name) 949 950 Acyl-CoA thioesterase 1 Clostridium UniRef100 Q97DR5 Clostridium acetobutyllicum acetobiitvictim 951 UPIOOOO3CB259 UniRefl00 entry UniRef100 UPIOOOO3CB259 YvkB 952 YhdP 953 HTH-type transcriptional regulator cueR UniRef100 OO7586 Bacilius subtiis YhdO Bacilius subtilis 954 YhdT 955 956 BH3511 protein Bacilius halodurans UniRef100 Q9K762 Bacilius halodurans 957 Sporulation specific N-acetylmuramoyl- UniRef100 Q8CX69 Oceanobacilius CwC L-alanine amidase Oceanobacilius iheyensis iheyensis 958 Protein crchB homolog 1 Bacilius UniRef100 OO7590 Bacilius subtiis YhdU subtilis 959 Protein crchB homolog2 Bacilius UniRef100 OO7591 Bacilius subtiis YhdV subtilis 96.O YhdW 961 962 Hypothetical UPF0003 protein yhdY UniRef100 OO7594 Bacilius subtiis YhdY Bacilius subtiis 963 NAD-dependent deacetylase Bacilius UniRef100 OO7595 Bacilius subtiis YZ subtiis 964 965 Hypothetical protein yheNBacilius UniRef100 OO7596 Bacilius subtiis YeN subtiis 966 Dat 967 Na(+)/H(+) antiporter Bacilius subtilis UniRef100 OO7553 Bacilius subtiis NhaC 968 Hypothetical protein yoxA LBacilius UniRef100 P39840 Bacilius subtiis YoxA subtiis 969 Hypothetical protein yahH Bacilius UniRef100 OO5500 Bacilius subtiis YH subtiis 970 Hypothetical protein Bacillus cereus UniRef100 Q63EB4 Bacilius cereus ZK ZK) 971 Hypothetical protein Bacillus cereus UniRef100 Q81GF4 Bacilius cereus 972 Stress response protein inhaXBacillus UniRef100 OO7552 Bacilius subtiis NhaX subtiis 973 Hypothetical protein yhel Bacilius UniRef100 OO7550 Bacilius subtiis Yhe tibiis 974. Hypothetical protein yheH Bacilius UniRef100 OO7549 Bacilius subtiis Yhe tibiis 975 Hypothetical protein yhed Bacilius UniRef100 OO7548 Bacilius subtiis YeG subtiis 976 Small, acid-soluble spore protein B UniRef100 PO4832 Bacilius subtiis Bacilius subtiis 977 BH1139 protein Bacilius halodurans UniRef100 Q9KDS2 Bacilius halodurans 978 Sugar ABC transporter ATP-binding UniRef100 Q8CUH3 Oceanobacilius MSmX protein Oceanobacilius iheyensis iheyensis 979 Fis-type helix-turn-helix domain protein UniRef100 Q73A84 Bacilius cereus YxkF Bacilius cereus 980 Hypothetical protein yheE Bacilius UniRef100 OO7546 Bacilius subtiis subtiis 981 Hypothetical protein yhed Bacilius UniRef100 OO7545 Bacilius subtiis Yle) subtiis 982 Hypothetical protein yhec Bacilius UniRef100 OO7544 Bacilius subtiis YeC subtiis 983 Hypothetical protein yheBBacilius UniRef100 OO7543 Bacilius subtiis YeB subtiis 984 Hypothetical protein yhe A Bacilius UniRef100 OO7542 Bacilius subtiis Yhe A subtiis 985 Stress response protein yhaXBacillus UniRef100 OO7539 Bacilius subtiis Yax subtiis 986 Hemz, 987 YhaR protein Bacillus subtilis UniRef100 OO7533 Bacilius subtiis YaR 988 Response regulator aspartate UniRef100 P96649 Bacilius subtiis RapI phosphatase I Bacilius subtilis 989 990 Hypothetical protein yhaO Bacilius UniRef100 OO7524 Bacilius subtiis Yhao subtilis 991 Hypothetical protein yhaP Bacilius UniRef100 OO7523 Bacilius subtiis YaP subtilis 992 YaO US 8,168,417 B2 89 90 TABLE 1-continued

Predicted functions

Bacilius subtiis SEQ homolog ID (Gene NO. Description OniRef Accession No. Organism Name) 993 Hypothetical protein yhaN Bacilius UniRef100 OO8455 Bacilius subtiis YaN subtilis 994 YaM 995 Hypothetical protein yhaL. Bacilius UniRef100 OO7520 Bacilius subtiis subtilis 996 Foldase protein prSA precursor UniRef100 P24327 Bacilius subtiis PrSA Bacilius subtiis 997 998 Hypothetical protein yhaK Bacilius UniRef100 OO7519 Bacilius subtiis subtilis 999 Hypothetical protein yhaI Bacilius UniRef100 OO7517 Bacilius subtiis Ya subtilis 000 Protease production regulatory protein UniRef100 P11065 Bacilius subtiis Hpr hpr Bacillus subtilis 001 Hypothetical protein yhaH Bacilius UniRef100 OO7516 Bacilius subtiis YaH subtilis 002 Probable tryptophan transport protein UniRef100 OO7515 Bacilius subtiis YaG Bacilius subtiis 003 Phosphoserine aminotransferase UniRef100 P80862 Bacilius subtiis SerC Bacilius subtiis 004 Hit protein Bacilius subtilis UniRef100 OO7513 Bacilius subtiis Hit 005 006 ABC-type transporter ATP-binding UniRef100 P55339 Bacilius subtiis Ecs.A protein ecsA Bacilius subtilis 007 Protein ecsB Bacillus subtilis UniRef100 P55340 Bacilius subtiis EcSB 008 Protein ecsC Bacillus subtilis UniRef100 P55341 Bacilius subtiis EcSC 009 YhaA protein Bacilius subtilis UniRef100 OO7598 Bacilius subtiis YhaA 010 Hypothetical protein yhfA Bacilius UniRef100 OO7599 Bacilius subtiis YA subtilis 011 Hypothetical protein yhgC Bacilius UniRef100 P38049 Bacilius subtiis YhgC subtilis 012 Penicillin-binding protein 1F Bacilius UniRef100 P38050 Bacilius subtiis PbpF subtilis 013 Uroporphyrinogen decarboxylase UniRef100 P32395 Bacilius subtiis Hem. Bacilius subtiis 014 Ferrochelatase Bacillus subtilis UniRef100 P32396 Bacilius subtiis Hem 015 Protoporphyrinogen oxidase Bacilius UniRef100 P32397 Bacilius subtiis HemY subtiis O16 YhgD 017 Hypothetical protein yhgE Bacilius UniRef100 P32399 Bacilius subtiis YhgE subtiis 018 3-oxoacyl-[acyl-carrier-protein UniRef100 OO7600 acyl-carrier- FabHB synthase III protein 2 Bacilius subtilis protein 019 Hypothetical protein yhf E Bacilius UniRef100 OO7603 Bacilius subtiis YhfE subtiis O20 021 Hypothetical protein yhfG Bacilius UniRef100 OO7605 Bacilius subtiis GltT subtiis 022 Hypothetical protein yhfI Bacilius UniRef100 OO7607 Bacilius subtiis YhfI tibiis 023 Hypothetical protein yhf Bacilius UniRef100 OO7608 Bacilius subtiis Yhf tibiis 024 Hypothetical protein yhfL Bacilius UniRef100 OO7610 Bacilius subtiis YhfL subtiis 025 Hypothetical protein yhfM precursor UniRef100 OO7611 Bacilius subtiis YhfM Bacilius subtiis 026 BH2909 protein Bacilius halodurans UniRef100 Q9K8U3 Bacilius halodurans 027 Branched-chain amino acid transporter UniRef100 Q9K8U2 Bacilius AZIC Bacilius halodurans halodurans 028 BH2911 protein Bacilius halodurans UniRef100 Q9K8U1 Bacilius halodurans 029 Putative metalloprotease yhfN Bacilius UniRef100 P4O769 Bacilius subtiis YhfN subtilis O3O AprE 031 Transporter, drug? metabolite exporter UniRef100 Q63D40 Bacilius cereus Yde) amily Bacilius cereus ZK) ZK. 032 Hypothetical protein yhfG Bacilius UniRef100 O07616 Bacilius subtiis YhfC) subtilis 033 YfmD protein Bacilius subtilis UniRef100 O34933 Bacilius subtiis Yfm) 034 YfmE protein Bacilius subtilis UniRef100 O34832 Bacilius subtiis YfmE 035 Hypothetical protein yhfR Bacilius UniRef100 OO7617 Bacilius subtiis YhfR subtilis US 8,168,417 B2 91 92 TABLE 1-continued

Predicted functions

Bacilius subtiis SEQ homolog ID (Gene NO. Description UniRef Accession No. Organism Name) O36 Heme-based aerotactic transducer UniRef100 OO7621 Bacilius subtiis HemAT hemAT Bacillus subtilis 037 Rieske 2Fe-2S iron-sulfur protein, UniRef100 Q73E94 Bacilius cereus YhfW putative Bacillus cereus O38 YXC 039 Hypothetical protein yhzC Bacilius UniRef100 O31594 Bacilius subtiis subtilis O40 ComK O41 YhD O42 YhE 043 Signal peptidase IV Bacilius subtilis UniRef100 OO7560 Bacilius subtiis SipV 044 Minor extracellular protease epr UniRef100 P16396 Bacilius subtiis Epr precursor Bacillus subtilis O45 Putative permease Klebsiella UniRef100 Q765R6 Kiebsiella YybO pneumoniae pneumoniae O46 Hypothetical protein Enterococcus UniRef100 Q82ZQ4 Enterococcits PucR faecalis faecalis O47 Putative allantoinase Staphylococcus UniRef100 Q9EV52 Staphylococcits PCEH cytosis cytosis 048 Peptidase, M20/M25/M40 family UniRef100 Q82ZQ2 Enterococcits YEH Enterococci is faecalis faecalis 049 Hypothetical protein STYO574 UniRef100 Q8XFX7 Salmonella typhi Salmonella typhi 050 Ureidoglycolate dehydrogenase UniRef100 Q838P9 Enterococcits YimC Enterococci is faecalis faecalis 051 Ureidoglycolate dehydrogenase UniRef100 Q838P9 Enterococcits YimC Enterococci is faecalis faecalis 052 Hypothetical protein Enterococcus UniRef100 Q838Q3 Enterococcits SucD faecalis faecalis 053 Hypothetical protein Enterococcus UniRef100 Q838Q3 Enterococci is faecalis faecalis OS4 055 Carbamate kinase Clostridium tetani UniRef100 Q890W1 Cliostridium tetani 056 Major facilitator family transporter UniRef100 Q838Q1 Enterococcits YcbE Enterococciis faecalis faecalis 057 Hypothetical protein yifA precursor UniRef100 O34554 Bacilius subtiis Bacilius subtilis 058 Response regulator aspartate UniRef100 O32294 Bacilius subtiis RapG phosphatase G Bacilius subtilis 059 O60 O61 062 Lacl-family transcription regulator UniRef100 O34829 Bacilius subtiis MSmR Bacilius subtilis 063 Multiple Sugar-binding protein Bacilius UniRef100 O34335 Bacilius subtiis MSmE subtiis O64 Sugar transporter Bacilius subtilis UniRef100 O347O6 Bacilius subtiis AmyD O65 Sugar transporter Bacilius subtilis UniRef100 O34518 Bacilius subtiis AmyC 066 Alpha-galactosidase Bacilius subtilis UniRef100 O34645 Bacilius subtiis MelA 067 Hypothetical protein yhjN Bacilius UniRef100 OO7568 Bacilius subtiis YhiN subtiis O68 Spore coat-associated protein JA UniRef100 Q63FK5 Bacilius cereus ZK Bacilius cereus ZK) 069 CotB protein Bacilius subtilis UniRef100 Q45537 Bacilius subtiis 070 CotC protein Bacilius subtilis UniRef100 Q45538 Bacilius subtiis CotC O71 Long-chain fatty-acid-CoA ligase UniRef100 Q9KDTO Bacilius Yng Bacilius halodurans halodurans 072 Hypothetical protein yhjO Bacilius UniRef100 OO7569 Bacilius subtiis YhiO subtiis 073 Hypothetical protein Bacilius cereus UniRef100 Q63FZ3 Bacilius cereus LytB ZK) ZK. 074 Sensor histidine kinase Bacilius UniRef100 Q6HNG3 Bacilius PhOR thiringiensis thiringiensis 075 Two-component response regulator UniRef100 Q81I36 Bacilius cereus YcT Bacilius cereus O76 YhjR 077 Putative molybdate binding protein, UniRef100 O32208 Bacilius subtiis YvgL Yvg.L. Bacilius subtilis 078 Putative molybdate transport protein, UniRef100 O32209 Bacilius subtiis YvgM YvgM Bacilius subtilis 079 ATP-dependent nuclease subunit B UniRef100 P23477 Bacilius subtiis AddB Bacilius subtilis US 8,168,417 B2 93 94 TABLE 1-continued

Predicted functions

Bacilius subtiis SEQ homolog ID (Gene NO. Description OniRef Accession No. Organism Name) 080 ATP-dependent nuclease subunit A UniRef100 P23478 Bacilius subtiis AddA Bacilius subtilis 081 Exonuclease sbcDhomolog Bacilius UniRef100 P23479 Bacilius subtiis SbcD subtilis 082 Nuclease sbcCD subunit C Bacilius UniRef100 OO6714 Bacilius subtiis Yiry subtilis 083 Probable spore germination protein UniRef100 OO6716 Bacilius subtiis gerPF Bacilius subtilis 084 Probable spore germination protein UniRef100 OO6717 Bacilius subtiis GerPE gerPE Bacillus subtilis 085 Probable spore germination protein UniRef100 OO6718 Bacilius subtiis gerPD Bacilius subtilis 086 Probable spore germination protein UniRef100 OO6719 Bacilius subtiis GerPC gerPC Bacilius subtilis 087 Probable spore germination protein UniRef100 OO6720 Bacilius subtiis gerPB Bacilius subtilis 088 Probable spore germination protein UniRef100 OO6721 Bacilius subtiis gerPA Bacillus subtilis 089 Hypothetical protein yitRBacilius UniRef100 OO6753 Bacilius subtiis subtilis O90 091 Spore coat protein H Bacillus cereus UniRef100 Q81EE9 Bacilius cereus Cot O92 CotC 093 YisK Bacilius subtilis UniRef100 OO6724 Bacilius subtiis YisK 094 Hypothetical protein yloQ. Bacilius UniRef100 Q63CF2 Bacilius cereus YloQ cereus ZK) ZK. 095 Yis.L. Bacillus subtilis UniRef100 OO6725 Bacilius subtiis YSL 096 Hypothetical protein yisNBacilius UniRef100 OO6727 Bacilius subtiis YSN subtilis 097 Asparagine synthetase glutamine- UniRef100 OO5272 glutamine- ASnO hydrolyzing 3 Bacilius subtilis hydrolyzing O98 NrgA 099 Nitrogen regulatory PII protein UniRef100 Q8ERT8 Oceanobacilius iheyensis Oceanobacilius iheyensis 00 YisO Bacilius subtilis UniRef100 OO7940 Bacilius subtiis YisO 01 Putative HTH-type transcriptional UniRef100 P40331 Bacilius subtiis YSR regulatoryisRBacilius subtilis O2 Acetyltransferase, GNAT family UniRef100 Q63C80 Bacilius cereus YokL Bacilius cereus ZK) ZK. 03 HTH-type transcriptional regulator degA UniRef100 P37947 Bacilius subtiis DegA Bacilius subtiis 04 Hypothetical oxidoreductase yiss UniRef100 P40332 Bacilius subtiis YSS Bacilius subtiis 05 YisV protein Bacilius subtilis UniRef100 Q796Q6 Bacilius subtiis YisV 06 Diaminobutyrate--pyruvate UniRef100 Q9K9M1 Bacilius GabT transaminase Bacilius halodhirans halodurans 07 L-2,4-diaminobutyrate decarboxylase UniRef100 Q8YZR2 Anabaena sp. Anabaena sp. 08 AIIO394 protein Anabaena sp. UniRef100 Q8YZR3 Anabaena sp. 09 BH2621 protein Bacilius halodurans UniRef100 Q9K9M4 Bacilius halodurans 10 BH2620 protein Bacilius halodurans UniRef100 Q9K9M5 Bacilius halodurans 11 BH2618 protein Bacilius halodurans UniRef100 Q9K9M7 Bacilius halodurans 12 YitI protein Bacilius subtilis UniRef100 OO6744 Bacilius subtiis Yit 13 Glr2355 protein Gloeobacter violaceus UniRef100 Q7N129 Gioeobacter YcdF violacetis 14 BHO411 protein Bacilius halodurans UniRef100 Q9KFR6 Bacilius YobV halodurans 15 5-methyltetrahydrofolate S- UniRef100 Q9KCE1 Bacilius Yit homocysteine methyltransferase halodurans Bacilius halodurans 16 Yit Bacilius subtilis UniRef100 OO6745 Bacilius subtiis Yit 17 Hypothetical UPF0234 protein yitk UniRef100 OO6746 Bacilius subtiis YitK Bacilius subtiis 18 YitL protein Bacilius subtilis UniRef100 OO6747 Bacilius subtiis YitL 19 2O 21 Hypothetical UPF0230 protein yitS UniRef100 P70945 Bacilius subtiis YitS Bacilius subtiis 22 Hypothetical protein yitT Bacilius UniRef100 P398O3 Bacilius subtiis YitT subtilis 23 Intracellular proteinase inhibitor UniRef100 P39804 Bacilius subtiis Ipi Bacilius subtiis US 8,168,417 B2 95 96 TABLE 1-continued

Predicted functions

Bacilius subtiis SEQ homolog ID (Gene NO. Description UniRef Accession No. Organism Name) 1124 GMP reductase Bacilius subtilis UniRef100 OO5269 Bacilius subtiis GuaC 112S 1126 1127 Putative orf protein Bacilius subtilis UniRef100 P70947 Bacilius subtiis Yit 1128 Putative orf protein Bacilius subtilis UniRef100 P70948 Bacilius subtiis YitV 1129 YitW 1130 N-acetyl-gamma-glutamyl-phosphate UniRef100 P23715 Bacilius subtiis ArgC reductase Bacilius subtilis 1131 biosynthesis bifunctional UniRef100 Q97.J14 Includes: Arg protein arg.J Includes: Glutamate N- Glutamate N acetyltransferase (EC 2.3.1.35) acetyltransferase (Ornithine acetyltransferase) (Ornithine (EC 2.3.1.35) transacetylase) (OATase); Amino-acid (Ornithine acetyltransferase (EC 2.3.1.1) (N- acetyltransferase) acetylglutamate synthase) (AGS) (Ornithine Contai transacetylase) (OATase); Amino-acid acetyltransferase (EC 2.3.1.1) (N- acetylglutamate synthase) (AGS) 32 Acetylglutamate kinase Bacilius UniRef100 P36840 Bacilius subtiis ArgB subtilis 33 Acetylornithine aminotransferase UniRef100 P36839 Bacilius subtiis Arg) Bacilius subtilis 34 Carbamoyl-phosphate synthase, UniRef100 P36838 Bacilius subtiis CarA arginine-specific, Small chain Bacilius subtiis 35 Carbamoyl-phosphate synthase, UniRef100 P18185 Bacilius subtiis CarE arginine-specific, large chain Bacilius subtiis 36 Ornithine carbamoyltransferase UniRef100 P1818.6 Bacilius subtiis ArgF Bacilius subtilis 37 Undecaprenyl-diphosphatase 1 UniRef100 Q81HV4 Bacilius cereus YbB Bacilius cereus 38 YizC protein Bacilius subtilis UniRef100 O34585 Bacilius subtiis 39 40 Hypothetical protein yjal J Bacilius UniRef100 O35001 Bacilius subtiis YaU subtiis 41 ArgF and med genes, partial and UniRef100 O32435 Bacilius subtiis YaV complete cols Bacilius subtilis 42 Transcriptional activator protein med UniRef100 O32436 Bacilius subtiis Med precursor Bacillus subtilis 43 ComG operon repressor Bacillus UniRef100 O32437 Bacilius subtiis subtiis 44 Hypothetical protein yjzB Bacilius UniRef100 O34891 Bacilius subtiis subtiis 45 3-oxoacyl-acyl-carrier-protein UniRef100 O34746 acyl-carrier- FabHA synthase III protein 1 Bacilius subtilis protein 46 Beta-ketoacyl-acyl carrier protein UniRef100 O34340 Bacilius subtiis FabF synthase II Bacilius subtilis 47 YaZ protein Bacilius subtilis UniRef100 O31596 Bacilius subtiis YaZ 48 Oligopeptide transport ATP-binding UniRef100 P42064 Bacilius subtiis AppD protein appD Bacilius subtilis 49 Oligopeptide transport ATP-binding UniRef100 P42065 Bacilius subtiis AppF protein appF Bacilius subtilis 50 Oligopeptide-binding protein app A UniRef100 P42061 Bacilius subtiis AppA precursor Bacillus subtilis 51 Oligopeptide transport system UniRef100 P42062 Bacilius subtiis AppB permease protein appBBacilius subtilis 52 Oligopeptide transport system UniRef100 P42063 Bacilius subtiis AppC permease protein appC Bacilius subtilis 53 Permease, putative Bacillus cereus UniRef100 Q734Y3 Bacilius cereus Yvd.J S4 YbA 55 Tryptophanyl-tRNA synthetase Bacilius UniRef100 P21656 Bacilius subtiis TrpS subtilis 56 Oligopeptide-binding protein oppA UniRef100 P24141 Bacilius subtiis OppA precursor Bacilius subtilis US 8,168,417 B2 97 98 TABLE 1-continued

Predicted functions

Bacilius subtiis SEQ homolog D (Gene NO. Description OniRef Accession No. Organism Name) 57 Oligopeptide transport system UniRef100 P24138 Bacilius subtiis OppB permease protein oppBBacilius subtilis 58 Oligopeptide transport system UniRef100 P24139 Bacilius subtiis OppC permease protein oppo Bacilius subtilis 59 Oligopeptide transport ATP-binding UniRef100 P24136 Bacilius subtiis OppD protein oppD Bacilius subtilis 60 Oligopeptide transport ATP-binding UniRef100 P24137 Bacilius subtiis OppF protein oppF Bacilius subtilis 61 YbC protein Bacilius subtilis UniRef100 O31601 Bacilius subtiis YbC 62 Regulatory protein spx Bacilius subtilis UniRef100 O31602 Bacilius subtiis YbD 63 YibE protein Bacilius subtilis UniRef100 O31603 Bacilius subtiis YbE 64 Adapter protein mecA 1 Bacilius UniRef100 P37958 Bacilius subtiis MecA subtilis 65 Hypothetical conserved protein UniRef100 Q8ELH8 Oceanobacilius YfP Oceanobacilius iheyensis iheyensis 66 Hypothetical protein OB3248 UniRef100 Q8ELH9 Oceanobacilius iheyensis Oceanobacilius iheyensis 67 Hypothetical conserved protein UniRef100 Q8ELIO Oceanobacilius iheyensis Oceanobacilius iheyensis 68 Response regulator of citrate/malate UniRef100 Q7ML23 Vibrio vulnificus CitT metabolism Vibrio vulnificus 69 Sensor protein citS Bacilius UniRef100 Q9RC53 Bacilius YfL halodurans halodurans 70 YbF protein Bacilius subtilis UniRef100 O31604 Bacilius subtiis YibF 71 YbO 72 73 YbH protein LBacillus subtilis UniRef100 O31606 Bacilius subtiis YbH 74 YbI protein Bacilius subtilis UniRef100 O31607 Bacilius subtiis YbI 75 Yb.J protein Bacilius subtilis UniRef100 O31608 Bacilius subtiis YibJ 76 YbK protein Bacillus subtilis UniRef100 O31609 Bacilius subtiis YbK 77 YbL protein Bacilius subtilis UniRef100 O31610 Bacilius subtiis YibL 78 YbM protein Bacilius subtilis UniRef100 O31611 Bacilius subtiis YbM 79 YibN 80 Hypothetical pseudouridine synthase UniRef100 O31613 Bacilius subtiis YbO ybO Bacilius subtilis 81 YbP protein Bacilius subtilis UniRef100 O31614 Bacilius subtiis YibP 82 YbQ protein Bacilius subtilis UniRef100 O31615 Bacilius subtiis YbQ 83 Transcriptional activator ten A Bacilius UniRef100 P25052 Bacilius subtiis TenA subtilis 84 Regulatory protein ten I Bacilius UniRef100 P25053 Bacilius subtiis Ten subtilis 85 Glycine oxidase Bacilius subtilis UniRef100 O31616 Bacilius subtiis GoxB 86 This protein Bacillus subtilis UniRef100 O31617 Bacilius subtiis 87 Thiazole biosynthesis protein thiG UniRef100 O31618 Bacilius subtiis ThiC Bacilius subtilis 88 ThiF protein Bacillus subtilis UniRef100 O31619 Bacilius subtiis ThiF 89 YbV protein Bacilius subtilis UniRef100 O31620 Bacilius subtiis YibV 90 Enoyl-[acyl-carrier-protein reductase UniRef100 P54616 acyl-carrier- Fab NADH Bacilius subtilis protein 91 YbX protein Bacillus subtilis UniRef100 O31622 Bacilius subtiis YibX 92 Spore coat protein Z Bacilius subtilis UniRef100 Q08312 Bacilius subtiis Cotz 93 Spore coat protein Y Bacilius subtilis UniRef100 Q08311 Bacilius subtiis Coty 94 Spore coat protein X Bacillus subtilis UniRef100 Q08313 Bacilius subtiis CotX 95 Spore coat protein W Bacilius subtilis UniRef100 Q08310 Bacilius subtiis CotW 96 Spore coat protein V Bacilius subtilis UniRef100 Q08309 Bacilius subtiis CotV 97 YicA protein Bacilius subtilis UniRef100 O31623 Bacilius subtiis YicA 98 99 200 YcC protein Bacilius subtilis UniRef100 O31625 Bacilius subtiis 2O1 YicD 2O2 YngC 2O3 GalE 204 YngB protein Bacilius subtilis UniRef100 O31822 Bacilius subtiis YngB 205 YngA protein Bacilius UniRef100 Q70JY6 Bacilius YngA amyloiquefaciens amyloiquefaciens 206 Yiclf protein Bacillus subtilis UniRef100 O31628 Bacilius subtiis YicF 207 YcG protein Bacilius subtilis UniRef100 O31629 Bacilius subtiis YicG 208 Yich protein Bacilius subtilis UniRef100 O31630 Bacilius subtiis Yich 209 Hypothetical protein Bacilius cereus UniRef100 Q739H9 Bacilius cereus US 8,168,417 B2 99 100 TABLE 1-continued

Predicted functions

Bacilius subtiis SEQ homolog ID (Gene NO. Description OniRef Accession No. Organism Name) 210 BH1889 protein Bacilius halodurans UniRef100 Q9KBN6 Bacilius YobV halodurans 211 YicI protein Bacilius subtilis UniRef100 O31631 Bacilius subtiis YicI 212 Yic protein Bacilius subtilis UniRef100 O31632 Bacilius subtiis Yic.J 213 YicL protein Bacilius subtilis UniRef100 O31634 Bacilius subtiis YicL 214 Transcriptional regulator, MarR/EmrR UniRef100 Q97DR6 Clostridium acetobutyllicum family Clostridium acetobutyllicum 215 Penicillin-binding protein 4* Bacilius UniRef100 P32.959 Bacilius subtiis PbpE subtilis 216 Abn A 217 218 Maltose transacetylase Bacilius UniRef100 Q75TH6 Bacilius Maa. Stearothermophilus Stearothermophilus 219 220 Putative HTH-type transcriptional UniRef100 P39647 Bacilius subtiis YwfK regulatorywfK Bacillus subtilis 221 Sulfite reductase Bacilius halodurans UniRef100 Q9KF76 Bacilius YvgR halodurans 222 Sulfite reductase Bacilius halodurans UniRef100 Q9KF75 Bacilius YvgO halodurans 223 Putative HTH-type transcriptional UniRef100 O34701 Bacilius subtiis YoaU regulator yoalJ Bacilius subtilis 224 Hypothetical transport protein yoaV UniRef100 O34416 Bacilius subtiis Yoav Bacilius subtilis 225 Hypothetical protein VPAO302 Vibrio UniRef100 Q87JF1 Vibrio YyaH parahaemolyticits parahaemolyticus 226 Hypothetical protein yoeB precursor UniRef100 O34841 Bacilius subtiis YoeB Bacilius subtilis 227 Yoc 228 Permease, general substrate UniRef100 Q6HMC3 Bacilius LmrB transporter Bacilius thuringiensis thiringiensis 229 Putative HTH-type transcriptional UniRef100 P42105 Bacilius subtiis YxaF regulatoryxaF Bacilius subtilis 230 231 YnZD 232 Yee 233 YiqB 234 Phage-like element PBSX protein xkdA UniRef100 P3978O Bacilius subtiis XkdA Bacilius subtiis 235 HTH-type transcriptional regulator Xre UniRef100 P23789 Bacilius subtiis Xre Bacilius subtiis 236 237 238 Phage-like element PBSX protein xkdB UniRef100 P39781 Bacilius subtiis XkdB Bacilius subtiis 239 Phage-like element PBSX protein xkdC UniRef100 P39782 Bacilius subtiis XkdC Bacilius subtiis 240 Phage-like element PBSX protein xkdD UniRef100 P39783 Bacilius subtiis XkdD Bacilius subtiis 241 Phage-like element PBSX protein Xtra UniRef100 P54344 Bacilius subtiis Bacilius subtiis 242 Positive control factor Bacilius subtilis UniRef100 P39784 Bacilius subtiis Xpf 243 PBSX phage terminase small subunit UniRef100 P39785 Bacilius subtiis XtmA Bacilius subtiis 244 PBSX phage terminase large subunit UniRef100 P39786 Bacilius subtiis XtmB Bacilius subtiis 245 Phage-like element PBSX protein xkdE UniRef100 P54325 Bacilius subtiis XkdE Bacilius subtiis 246 Phage-like element PBSX protein xkdF UniRef100 P54326 Bacilius subtiis XkdF Bacilius subtiis 247 Phage-like element PBSX protein xkdG UniRef100 P54327 Bacilius subtiis XkdG Bacilius subtiis 248 Hypothetical protein yabG Bacilius UniRef100 P45923 Bacilius subtiis YgbG subtilis 249 Hypothetical protein yabh Bacilius UniRef100 P45924 Bacilius subtiis YgbH subtilis 250 Phage-like element PBSX protein xkdI UniRef100 P54329 Bacilius subtiis XkdI Bacilius subtiis 251 Phage-like element PBSX protein xkdJ UniRef100 P54330 Bacilius subtiis Xkd Bacilius subtiis 252 Lin1277 protein Listeria innocua UniRef100 Q92CB2 Listeria innoctia US 8,168,417 B2 101 102 TABLE 1-continued

Predicted functions

Bacilius subtiis SEQ homolog ID (Gene NO. Description OniRef Accession No. Organism Name) 253 Phage-like element PBSX protein xkdK UniRef100 P54331 Bacilius subtiis XkdK Bacilius subtiis 254 Phage-like element PBSX protein xkdM UniRef100 P54332 Bacilius subtiis XkdM Bacilius subtiis 255 Phage-like element PBSX protein xkdN UniRef100 P54333 Bacilius subtiis XkdN Bacilius subtiis 256 Phage-like element PBSX protein xkdO UniRef100 P54334 Bacilius subtiis XkdO Bacilius subtiis 257 Phage-like element PBSX protein xkdP UniRef100 P54335 Bacilius subtiis YgbP Bacilius subtiis 258 Hypothetical protein yabQ Bacilius UniRef100 P45950 Bacilius subtiis YabO subtilis 259 Hypothetical protein yabR Bacilius UniRef100 P45933 Bacilius subtiis YabR subtilis 260 Phage-like element PBSX protein xkdS UniRef100 P54338 Bacilius subtiis XkdS Bacilius subtiis 261 Hypothetical protein yabTBacilius UniRef100 P45935 Bacilius subtiis YgbT subtilis 262 Phage-like element PBSX protein xkdU UniRef100 P54340 Bacilius subtiis XkdU Bacilius subtiis 263 264 XkdV 26S YomR 266 267 268 BlyA 269 Regulatory protein Bacillus UniRef100 Q97.FL9 Bacilius YdhC Stearothermophilus Stearothermophilus 270 -containing alcohol dehydrogenase UniRef100 O35045 Bacilius subtiis YD Bacilius subtilis 271 Mannonate dehydratase 1 Bacilius UniRef100 Q9KDZ8 Bacilius UxuA halodurans halodurans 272 D-mannonate oxidoreductase Bacilius UniRef100 Q9KDZ4 Bacilius YF halodurans halodurans 273 UPIOOOO2F2634 UniRefl00 entry UniRef100 UPIOOOO2F2634 YD 274 Hexuronate transporter Bacillus UniRef100 O34456 Bacilius subtiis ExuT subtilis 275 Stage II sporulation protein SB Bacilius UniRef100 O34800 Bacilius subtiis subtilis 276 Stage II sporulation protein SA Bacilius UniRef100 O34853 Bacilius subtiis SpoIISA subtilis 277 UPIOOOO3CC121 UniRefl00 entry UniRef100 UPIOOOO3CC121 Pit 278 Hypothetical UPF0111 protein ykaA UniRef100 O34454 Bacilius subtiis YkaA Bacilius subtilis 279 Gigt 280 YesL protein Bacilius subtilis UniRef100 O31515 Bacilius subtiis YeSL 281 YesM protein Bacilius subtilis UniRef100 O31516 Bacilius subtiis YesN 282 YesN protein Bacilius subtilis UniRef100 O31517 Bacilius subtiis YesN 283 YesO protein Bacilius subtilis UniRef100 O31518 Bacilius subtiis YeSO 284 Probable ABC transporter permease UniRef100 O31519 Bacilius subtiis Yes protein yesP Bacilius subtilis 285 Probable ABC transporter permease UniRef100 O31520 Bacilius subtiis YesQ protein yesO Bacilius subtilis 286 YesR protein Bacilius subtiis UniRef100 O31521 Bacilius subtiis YeSR 287 Yess protein Bacilius subtiis UniRef100 O31522 Bacilius subtiis YeSS 288 YesT protein Bacilius subtilis UniRef100 O31523 Bacilius subtiis YesT 289 Yes 290 YesV protein Bacilius subtiis UniRef100 O31525 Bacilius subtiis YesV 291 Yes W protein Bacilius subtilis UniRef100 O31526 Bacilius subtiis YesW 292 293 YesT 294 YesX protein Bacilius subtiis UniRef100 O31527 Bacilius subtiis YeSX 295 Putative ion-channel protein UniRef100 Q874X6 Salmonella typhi YccK Salmonella typhi 296 YesY protein Bacilius Subiiis UniRef100 O31528 Bacilius subtiis Yes Y 297 YesZ protein Bacilius subtilis UniRef100 O31529 Bacilius subtiis Yes2. 298 YetA 299 Lipoprotein pla precursor Bacillus UniRef100 P37966 Bacilius subtiis LplA subtilis 300 LplB protein Bacilius subtilis UniRef100 P39128 Bacilius subtiis LplB 301 LplC protein Bacilius subtilis UniRef100 P39129 Bacilius subtiis LplC 302 YkbA protein Bacilius subtilis UniRef100 O34739 Bacilius subtiis YkbA US 8,168,417 B2 103 104 TABLE 1-continued

Predicted functions

Bacilius subtiis SEQ homolog ID (Gene NO. Description OniRef Accession No. Organism Name) 303 YkcA protein Bacilius subtilis UniRef100 O34689 Bacilius subtiis YkcA 304 Hypothetical protein Bacilius cereus UniRef100 Q81CP9 Bacilius cereus 305 Probable serine protease do-like htra UniRef100 O34358 Bacilius subtiis Htra Bacilius subtilis 306 Pyrroline-5-carboxylate reductase 3 UniRef100 Q00777 Bacilius subtiis ProG Bacilius subtilis 307 D-aminopeptidase Bacilius subtilis UniRef100 P26902 Bacilius subtiis DppA 308 Dipeptide transport system permease UniRef100 P26903 Bacilius subtiis DB protein dippBBacilius subtilis 309 Dipeptide transport system permease UniRef100 P26904 Bacilius subtiis DC protein dippC Bacilius subtilis 310 Dipeptide transport ATP-binding protein UniRef100 P26905 Bacilius subtiis DD dppD Bacillus subtilis 311 Dipeptide-binding protein dippE UniRef100 P26906 Bacilius subtiis DppE precursor Bacillus subtilis 312 Hypothetical protein ykfA Bacilius UniRef100 O34851 Bacilius subtiis YkfA subtilis 313 YkfB Bacilius Subiiis UniRef100 O34508 Bacilius subtiis YkfB 314 YkfC Bacilius Subiiis UniRef100 O35010 Bacilius subtiis Ykf 315 YkfDBacilius Sibiis UniRef100 O34480 Bacilius subtiis YkfD 316 BH1779 protein Bacilius halodurans UniRef100 Q9KBZ5 Bacilius YkgA halodurans 317 Putative acyl-CoA thioester hydrolase UniRef100 P49851 Bacilius subtiis YkhA ykh A Bacillus subtilis 3.18 YkiA 319 Pectate lyase 47 precursor Bacillus sp. UniRef100 Q9AJM4 Bacilius sp. TS- Pel TS-47 47 320 Transcriptional regulator, PadR family UniRef100 Q6HED5 Bacilius thuringiensis Bacillus thuringiensis 321 Hypothetical protein OB0568 UniRef100 Q8ESQ3 Oceanobacilius iheyensis Oceanobacilius iheyensis 322 BH1312 protein Bacilius halodurans UniRef100 Q9KDA2 Bacilius halodurans 323 Hypothetical protein ykkA Bacilius UniRef100 P49854 Bacilius subtiis YkkA subtilis 324 Hypothetical protein ykkC Bacilius UniRef100 P49856 Bacilius subtiis YkkC subtilis 325 326 YkkE Bacilius subtilis UniRef100 O34990 Bacilius subtiis YkkE 327 Glutamate 5-kinase 1 Bacilius subtilis UniRef100 P3982O Bacilius subtiis ProB 328 Gamma-glutamyl phosphate reductase UniRef100 P398.21 Bacilius subtiis ProA Bacilius subtilis 329 Organic hydroperoxide resistance UniRef100 O34762 Bacilius subtiis YkA protein ohrA Bacilius subtilis 330 Organic hydroperoxide resistance UniRef100 O34777 Bacilius subtiis YkmA transcriptional regulator Bacilius subtilis 331 Organic hydroperoxide resistance UniRef100 P80242 Bacilius subtiis YkZA protein ohrB Bacilius subtilis 332 333 Guanine deaminase Bacilius subtilis UniRef100 O34598 Bacilius subtiis GualD 334 Phosphoglycerate mutase Bacillus UniRef100 Q9ALU0 Bacilius YhfR Stearothermophilus Stearothermophilus 335 336 337 5-methyltetrahydropteroyltriglutamate- UniRef100 P80877 Bacilius subtiis Met homocysteine methyltransferase Bacilius subtilis 338 Intracellular serine protease Bacilius UniRef100 Q69DB4 Bacilius sp. Isp A sp. WRD-2 WRD-2 339 340 YkoK Bacilius subtilis UniRef100 O34442 Bacilius subtiis YkOK 341 342 Integrase Oceanobacilius iheyensis UniRef100 Q8ETV2 Oceanobacilius YdcL iheyensis 343 YdaB 344 Hypothetical protein Bacilius anthracis UniRef100 Q81UEO Bacilius anthracis 345 YoS 346 347 Putative HTH-type transcriptional UniRef100 P45902 Bacilius subtiis YdaE regulatoryqaE. Bacilius subtilis US 8,168,417 B2 105 106 TABLE 1-continued

Predicted functions

Bacilius subtiis SEQ homolog ID (Gene NO. Description UniRef Accession No. Organism Name) 348 Transcriptional regulator UniRef100 Q7 P886 Fusobacterium nucleatum Fusobacterium nucleatin Subsp. Subsp. vincentii ATCC 49256 vincentii ATCC 49256) 349 350 351 Lin1236 protein Listeria innocua UniRef100 Q92CD6 Listeria innoctia 352 353 3S4 355 Ydal 356 35 protein Bacteriophage SPP1) UniRef100 Q38143 Bacteriophage YdaK SPP1 357 YdaL 358 YdaM 359 Hypothetical protein yoaO Bacilius UniRef100 P45912 Bacilius subtiis subtilis 360 Hypothetical protein yopY UniRef100 O64108 Bacteriophage SPBc2 Bacteriophage SPBc2 361 362 363 364 Hypothetical protein CTCO2137 UniRef100 Q892G2 Cliostridium tetani Clostridium tetani 365 Hypothetical protein MW1918 UniRef100 Q8NVN5 Staphylococcits YdaN Staphylococcusatiretts (iiietS 366 367 Single-strand binding protein 2 Listeria UniRef100 Q8Y4X1 Listeria monocytogenes monocytogenes 368 369 370 Hypothetical protein yoaQ Bacilius UniRef100 P45948 Bacilius subtiis YdaQ subtilis 371 372 Hypothetical protein yaaS Bacilius UniRef100 P45915 Bacilius subtiis YaaS subtilis 373 Hypothetical protein yaaT Facilius UniRef100 P45916 Bacilius subtiis YdaT subtilis 374 Hypothetical phage associated protein UniRef100 Q8K610 Streptococci is pyogenes SpyM3 1326 Streptococcus pyogenes 375 Minor head structural component GP7 UniRef100 Q38442 Bacteriophage SPP1 Bacteriophage SPP1 376 377 378 379 Hypothetical protein CTCO1553 UniRef100 Q894JO Cliostridium tetani Clostridium tetani 380 Major capsid protein Bacteriophage UniRef100 Q9T1B7 Bacteriophage A118 A118) 381 ORF28 Bacteriophage phi-105 UniRef100 Q97XF5 Bacteriophage phi-105 382 383 15 protein Bacteriophage SPP1 UniRef100 Q38584 Bacteriophage SPP1 384 Complete nucleotide sequence UniRef100 O484.46 Bacteriophage SPP1 Bacteriophage SPP 385 386 Complete nucleotide sequence UniRef100 O48448 Bacteriophage SPP1 Bacteriophage SPP 387 Complete nucleotide sequence UniRef100 O48449 Bacteriophage SPP1 Bacteriophage SPP 388 389 Complete nucleotide sequence UniRef100 O48453 Bacteriophage SPP1 Bacteriophage SPP 390 391 Complete nucleotide sequence UniRef100 O48455 Bacteriophage XkdO Bacteriophage SPP SPP1 392 Complete nucleotide sequence UniRef100 O48459 Bacteriophage SPP1 Bacteriophage SPP 393 Complete nucleotide sequence UniRef100 O48463 Bacteriophage SPP1 Bacteriophage SPP 394 395 396 LycA Clostridium botulinum UniRef100 Q6R100 Cliostridium bointinum US 8,168,417 B2 107 108 TABLE 1-continued

Predicted functions

Bacilius subtiis SEQ homolog ID (Gene NO. Description OniRef Accession No. Organism Name) 397 Hypothetical protein yrkC Bacilius UniRef100 P54430 Bacilius subtiis YrkC subtilis 398 YhjR 399 YolfS protein Bacillus subtilis) UniRef100 P96697 Bacilius subtiis YdfS 400 HTH-type transcriptional regulator thrA UniRef100 Q45666 Bacilius subtiis TnrA Bacilius subtilis 401 Hypothetical protein ykZBBacilius UniRef100 O34923 Bacilius subtiis subtilis 402 403 YkoM Bacilius subtilis UniRef100 O34949 Bacilius subtiis YkCM 404 YkoU protein Bacilius subtilis UniRef100 O34398 Bacilius subtiis YkoU 405 YkoV protein Bacilius subtilis UniRef100 O34859 Bacilius subtiis YkoW 406 Signaling protein ykoW Bacilius UniRef100 O34311 Bacilius subtiis YkoW subtilis 407 YkoX protein Bacilius subtilis UniRef100 O34908 Bacilius subtiis Ykox 408 YkoY protein Bacilius subtilis UniRef100 O34997 Bacilius subtiis YkoY 409 RNA polymerase sigma factor Bacilius UniRef100 O31654 Bacilius subtiis SigI subtilis 410 YkrI 411 Small, acid-soluble spore protein C3 UniRef100 P10572 Bacilius negaterium Bacilius negaterium 412 YkrK protein Bacillus subtilis UniRef100 O31656 Bacilius subtiis YkrK 413 Probable protease htpX homolog UniRef100 O31657 Bacilius subtiis YkrL Bacilius subtilis 414 YkrM protein Bacilius subtilis UniRef100 O31658 Bacilius subtiis YkrM 415 Penicillin-binding protein 3 Bacilius UniRef100 P42971 Bacilius subtiis PbpC subtilis 416 Hypothetical protein Bacilius cereus UniRef100 Q73BI4 Bacilius cereus 417 418 YkrP protein Bacilius subtilis UniRef100 O31660 Bacilius subtiis YkrP 419 Two-component sensor histidine kinase UniRef100 O31661 Bacilius subtiis KinE Bacilius subtilis 420 Methylated-DNA-protein-cysteine UniRef100 P11742 Bacilius subtiis Ogt methyltransferase Bacilius subtilis 421 422 Methylthioribose-1-phosphate UniRef100 O31662 Bacilius subtiis YkrS isomerase Bacillus subtilis 423 Methylthioribose kinase Bacilius UniRef100 O31663 Bacilius subtiis YkrT subtilis 424 YkrU protein Bacillus subtilis UniRef100 O31664 Bacilius subtiis YkrU 425 Transaminase mtnE Bacilius subtilis UniRef100 O31665 Bacilius subtiis YkrV 426 2,3-diketo-5-methylthiopentyl-1- UniRef100 O31666 Bacilius subtiis YkrW phosphate enolase Bacilius subtilis 427 Methylthioribulose-1-phosphate UniRef100 O31668 Bacilius subtiis Ykry dehydratase Bacilius subtilis 428 1,2-dihydroxy-3-keto-5- UniRef100 O31669 Bacilius subtiis Ykrz, methylthiopentene dioxygenase Bacilius subtilis 429 Metallothiol transferase fosB UniRef100 Q8CXK5 Oceanobacilius YN Oceanobacilius iheyensis iheyensis 430 YkVA protein Bacilius subtilis UniRef100 O31670 Bacilius subtiis 431 Stage Osporulation regulatory protein UniRef100 P05043 Bacilius subtiis Bacilius subtilis 432 Two-component sensor histidine kinase UniRef100 O31671 Bacilius subtiis KinD Bacilius subtilis 433 YkyE 434 Chemotaxis motR protein Bacilius UniRef100 P28612 Bacilius subtiis Mot subtilis 435 Chemotaxis motA protein Bacilius UniRef100 P28611 Bacilius subtiis MotA subtilis 436 ATP-dependent Clp protease-like UniRef100 O31673 Bacilius subtiis ClpB. Bacilius subtilis 437 YkvI protein Bacilius subtilis UniRef100 O31674 Bacilius subtiis Yky 438 YkvJ protein Bacilius subtilis UniRef100 O31675 Bacilius subtiis Yky 439 YkvK protein Bacilius subtilis UniRef100 O31676 Bacilius subtiis YkvK 440 YkvL protein Bacillus subtilis UniRef100 O31677 Bacilius subtiis YkyL 441 YkvM protein Bacilius subtilis UniRef100 O31678 Bacilius subtiis YkyM 442 DNA integration/recombination protein UniRef100 Q894H7 Cliostridium tetani CodW Clostridium tetani 443 Integrase/recombinase Bacilius ceretts UniRef100 Q633V7 Bacilius cereus RipX ZK) ZK. US 8,168,417 B2 109 110 TABLE 1-continued

Predicted functions

Bacilius subtiis SEQ homolog ID (Gene NO. Description UniRef Accession No. Organism Name) 444 445 446 447 YoqV protein Bacteriophage SPBc2 UniRef100 O64130 Bacteriophage LigB SPBC2 448 449 450 UPIOOOO3CC586 UniRefl00 entry UniRef100 UPIOOOO3CC586 451 452 453 454 455 Prophage LambdaBaO2, HNH UniRef100 Q81W86 Bacilius anthracis endonuclease family protein Bacilius anthracis 456 Terminase small subunit UniRef100 Q6GAL5 Staphylococcusatiretts Staphylococcusatiretts 457 Prophage LambdaBa02, terminase, UniRef100 Q6HUD2 Bacilius anthracis arge subunit, putative Bacilius anthracis 458 Hypothetical protein Bacilius anthracis UniRef100 Q81W89 Bacilius anthracis 459 ClipP family serine protease, possible UniRef100 Q97HW4 Cliostridium CipP phage related Clostridium acetobiitvictim acetobiitvictim 460 Prophage LambdaEa.02, major capsid UniRef100 Q81W91 Bacilius anthracis protein, putative Bacillus anthracis 461 Precursor polypeptide (AA-37 to 1647) UniRef100 Q03658 unidentified bacterium precursor unidentified bacterium 462 Gp7 protein Bacteriophage phi3626) UniRef100 Q8SBP7 Bacteriophage phi3626 463 Uncharacterized phage related protein UniRef100 Q97HW7 Clostridium acetobutyllicum Clostridium acetobutyllicum 464. Hypothetical protein CAC1887 UniRef100 Q97HW9 Clostridium acetobutyllicum Clostridium acetobutyllicum 465 Prophage LambdaBa02, major tail UniRef100 Q81W97 Bacilius anthracis protein, putative Bacillus anthracis 466 467 468 Prophage LambdaBa02, tape measure UniRef100 Q81WAO Bacilius YgbO protein, putative Bacillus anthracis anthracis 469 Lin2382 protein Listeria innocua UniRef100 Q928Z8 Listeria innoctia 470 Protein gp18 Listeria monocytogenes UniRef100 Q8Y4Z4 Listeria monocytogenes 471 YcG 472 XkdV 473 XkW 474 YomP protein Bacteriophage SPBc2 UniRef100 O64052 Bacteriophage SPBc2 475 Glycerophosphoryl diester UniRef100 Q737E6 Bacilius cereus GlpQ phosphodiesterase, putative Bacilius cereus 476 Protein bhlA Bacteriophage SPBc2 UniRef100 O64039 Bacteriophage SPBc2 477 ORF46 Bacteriophage phi-105 UniRef100 Q97XD7 Bacteriophage XlyB phi-105 478 479 480 481 Transcriptional regulator, DeoR family UniRef100 Q816D5 Bacilius cereus Bacilius cereus 482 Hypothetical protein yolD UniRef100 O64030 Bacteriophage SPBc2 Bacteriophage SPBc2 483 DNA integration/recombination invertion UniRef100 Q81GD4 Bacilius cereus YdcL protein Bacilius ceretts 484 YkvM protein Bacilius subtilis UniRef100 O31678 Bacilius subtiis 485 Response regulator aspartate UniRef100 P4O771 Bacilius subtiis RapH phosphatase H Bacilius subtilis 486 YoaT Bacilius subtilis UniRef100 O34535 Bacilius subtiis Yoa.T 487 YozG protein Bacillus subtilis UniRef100 O31834 Bacilius subtiis 488 YoaS protein Bacilius subtilis UniRef100 O31833 Bacilius subtiis YoaS 489 490 491 492 493 494 YkvS protein Bacillus subtilis UniRef100 O31684 Bacilius subtiis US 8,168,417 B2 111 112 TABLE 1-continued

Predicted functions

Bacilius subtiis SEQ homolog ID (Gene NO. Description OniRef Accession No. Organism Name) 495 BH2327 protein Bacilius halodurans UniRef100 Q9KAGO Bacilius halodurans 496 YkvT protein Bacillus subtilis UniRef100 O31685 Bacilius subtiis YkyT 497 YkvU protein Bacilius subtilis UniRef100 O31686 Bacilius subtiis YkyU 498 YkvV protein Bacilius subtilis UniRef100 O31687 Bacilius subtiis YkwV 499 YkyW 500 YkvY protein Bacillus subtilis UniRef100 O31689 Bacilius subtiis YkwY 501 Necrosis and ethylene inducing protein UniRef100 Q9KFT2 Bacilius halodurans Bacilius halodurans 502 Putative HTH-type transcriptional UniRef100 O31690 Bacilius subtiis Ykw2. regulatorykvZ Bacilius subtilis 503 Transcription antiterminator Bacilius UniRef100 OO6710 Bacilius subtiis GCT subtilis SO4 PtSG 505 Phosphocarrier protein HPr Bacilius UniRef100 PO8877 Bacilius subtiis subtilis 506 Phosphoenolpyruvate-protein UniRef100 PO8838 Bacilius subtiis Pts phosphotransferase Bacillus subtilis 507 SplA Bacilius amyloiquefaciens UniRef100 O54358 Bacilius amyloiiquefaciens 508 Bacilius UniRef100 O54359 Bacilius SplB amyloiquefaciens amyloiquefaciens 509 Hypothetical protein orf1 Bacilius UniRef100 OO5187 Bacilius subtiis YkwB subtilis 510 Methyl-accepting chemotaxis protein UniRef100 P54576 Bacilius subtiis McpC mcpC Bacilius subtilis 511 3-oxoacyl-(Acyl carrier protein) UniRef100 Q8EMP9 Oceanobacilius FabG reductase Oceanobacilius iheyensis iheyensis 512 Hypothetical oxidoreductaseykwC UniRef100 O34948 Bacilius subtiis YkwC Bacilius subtilis 513 YkwD protein LBacillus subtilis UniRef100 O31694 Bacilius subtiis YkwD 514 Yku.A protein Bacilius subtilis UniRef100 O31399 Bacilius subtiis YkuA 515 Sporulation kinase A Bacilius subtilis UniRef100 P16497 Bacilius subtiis KinA 516 Putative aminotransferase A Bacilius UniRef100 P16524 Bacilius subtiis PatA subtilis 517 518 Hypothetical protein yxaI Bacilius UniRef100 P42108 Bacilius subtiis YxaI subtilis 519 YxiO 520 Chemotaxis protein cheV Bacilius UniRef100 P37599 Bacilius subtiis CheW subtilis 521 Hypothetical protein ykyB Bacilius UniRef100 P42430 Bacilius subtiis YkyB subtilis 522 YkC 523 YkuD protein Bacilius subtilis UniRef100 O34816 Bacilius subtiis YkD 524 YkE 525 Hypothetical oxidoreductaseykuF UniRef100 O34717 Bacilius subtiis YkF Bacilius subtilis 526 YkuI protein Bacilius subtilis UniRef100 O35O14 Bacilius subtiis Yku 527 528 YkuJ protein Bacilius subtilis UniRef100 O34588 Bacilius subtiis 529 YkuK protein Bacilius subtilis UniRef100 O34776 Bacilius subtiis YkuK 530 Hypothetical protein ykZF Bacilius UniRef100 O31697 Bacilius subtiis subtilis 531 YkuL protein Bacillus subtilis UniRef100 O31698 Bacilius subtiis YkuL. 532 Putative HTH-type transcriptional UniRef100 O34827 Bacilius subtiis CcpC regulatorykuM Bacilius subtilis 533 Probable flavodoxin 1 Bacillus subtilis UniRef100 O34.737 Bacilius subtiis YkN 534 YkuO protein Bacilius subtilis UniRef100 O34879 Bacilius subtiis YKO 535 Probable flavodoxin 2 Bacillus subtilis UniRef100 O34589 Bacilius subtiis YkP 536 YkuQ protein Bacilius subtilis UniRef100 O34981 Bacilius subtiis YkuQ 537 YkuR protein Bacilius Subiiis UniRef100 O34916 Bacilius subtiis YkuR 538 Hypothetical UPF0180 protein ykuS UniRef100 O34783 Bacilius subtiis Bacilius subtilis 539 Ykul protein Bacilius Subiiiis UniRef100 O34564 Bacilius subtiis YkU 540 YkuV protein Bacilius subtilis UniRef100 O31403 Bacilius subtiis YkV 541 Repressor rok Bacilius subtilis UniRef100 O34857 Bacilius subtiis Rok 542 YknT protein Bacillus subtilis UniRef100 O31700 Bacilius subtiis YkT 543 MobA 544 Molybdopterin biosynthesis protein UniRef100 O31702 Bacilius subtiis MoeB MoeBBacilius subtilis 545 Molybdopterin biosynthesis protein UniRef100 O31703 Bacilius subtiis MoeA MoeA Bacilius subtilis US 8,168,417 B2 113 114 TABLE 1-continued

Predicted functions

Bacilius subtiis SEQ homolog ID (Gene NO. Description OniRef Accession No. Organism Name) 546 Molybdopterin-guanine dinucleotide UniRef100 O31704 Bacilius subtiis MobB biosynthesis protein B Bacillus subtilis 547 Molybdopterin converting factor subunit UniRef100 O31705 Bacilius subtiis Moa 2 Bacilius subtilis 548 Molybdopterin converting factor, UniRef100 O31706 Bacilius subtiis subunit 1 Bacillus subtilis 549 550 YknU protein Bacilius subtilis UniRef100 O31707 Bacilius subtiis YkU 551 YknV protein Bacilius subtilis UniRef100 O31708 Bacilius subtiis YkV 552 Hypothetical protein yknW Bacilius UniRef100 O31709 Bacilius subtiis YkW subtilis 553 YknX protein Bacilius subtilis UniRef100 O31710 Bacilius subtiis YknX 554 YknY protein Bacilius subtilis UniRef100 O31711 Bacilius subtiis YknY 555 Hypothetical protein yknZ Bacilius UniRef100 O31712 Bacilius subtiis YknZ. subtilis 556 FruR 557 1-phosphofructokinase Bacilius UniRef100 O31714 Bacilius subtiis Fruk subtilis 558 Phosphotransferase system (PTS) UniRef100 P71012 Bacilius subtiis FruA ructose-specific enzyme IIABC component Bacilius subtilis 559 Signal peptidase IT Bacilius subtilis UniRef100 P71013 Bacilius subtiis SipT 560 Hypothetical protein ykoA Bacilius UniRef100 O31715 Bacilius subtiis subtilis S61 562 YkpA protein Bacilius subtilis UniRef100 O31716 Bacilius subtiis YkpA 563 BH1921 protein Bacilius halodurans UniRef100 Q9KBK5 Bacilius halodurans 564 Aminopeptidase ampS Bacilius subtilis UniRef100 P39762 Bacilius subtiis AmpS 565 566 MreBH protein Bacillus subtilis UniRef100 P39763 Bacilius subtiis MireBH 567 568 Sporulation kinase C Bacilius subtilis UniRef100 P39764 Bacilius subtiis KinC 569 Hypothetical protein ykgB Bacilius UniRef100 P39760 Bacilius subtiis YkqB subtiis 570 Adenine deaminase Bacilius subtilis UniRef100 P39761 Bacilius subtiis AdeC 571 YkqC 572 YkZG protein Bacilius subtilis UniRef100 O31718 Bacilius subtiis 573 Hypothetical protein ykrA Bacilius UniRef100 Q45494 Bacilius subtiis YkrA subtiis 574 YkrB 575 576 Hypothetical protein yky A Bacilius UniRef100 P21884 Bacilius subtiis Yky A subtiis 577 Pyruvate dehydrogenase E1 UniRef100 P21881 Bacilius subtiis PchA component, alpha Subunit Bacilius subtiis 578 Pyruvate dehydrogenase E1 UniRef100 P21882 Bacilius subtiis PB component, beta Subunit Bacilius subtiis 579 Dihydrolipoyllysine-residue UniRef100 P21883 Bacilius subtiis PC acetyltransferase component of pyruvate dehydrogenase complex Bacilius subtilis 580 Dihydrolipoyl dehydrogenase Bacilius UniRef100 P21880 Bacilius subtiis PhD subtilis 581 UPIOOOO3CCO69 UniRefl00 entry UniRef100 UPIOOOO3CCO69 582 IS1627s 1-related, transposase Bacilius UniRef100 Q7CMDO Bacilius anthracis str. A2012 anthracis str. A2012 583 S84 585 Bacilius UniRef100 P21885 Bacilius subtiis SpeA subtilis 586 Hypothetical UPF0223 protein yktA UniRef100 Q45497 Bacilius subtiis Bacilius subtilis 587 Hypothetical protein yktB Bacilius UniRef100 Q45498 Bacilius subtiis YktB subtilis 588 YkZI protein Bacillus subtilis UniRef100 O31719 Bacilius subtiis 589 Inositol-1-monophosphatase Bacilius UniRef100 Q45499 Bacilius subtiis YktC subtilis 590 Hypothetical protein ykZC Bacilius UniRef100 O31720 Bacilius subtiis YkzC subtilis US 8,168,417 B2 1. 15 116 TABLE 1-continued

Predicted functions

Bacilius subtiis SEQ homolog ID (Gene NO. Description OniRef Accession No. Organism Name) 591 Hypothetical protein ylaA Bacilius UniRef100 OO7625 Bacilius subtiis YaA subtilis 592 Hypothetical protein ylaB Bacilius UniRef100 OO7626 Bacilius subtiis subtilis 593 YlaC protein Bacilius subtilis UniRef100 OO7627 Bacilius subtiis YaC 594 Hypothetical protein ylad Bacilius UniRef100 OO7628 Bacilius subtiis subtilis 595 Hypothetical protein ylaF Bacilius UniRef100 OO7630 Bacilius subtiis subtilis 596 GTP-binding protein typAbipA homolog UniRef100 OO7631 Bacilius subtiis YaG Bacilius subtiis 597 YlaH protein Bacilius subtilis UniRef100 OO7632 Bacilius subtiis YaH 598 YhzA homolog Bacilius subtilis UniRef100 OO7562 Bacilius subtiis YhiH 599 Hypothetical protein yhjG Bacilius UniRef100 OO7561 Bacilius subtiis YhiG subtilis 600 601 Hypothetical lipoprotein yla precursor UniRef100 O07634 Bacilius subtiis Yal Bacilius subtiis 602 YlaK protein Bacilius subtilis UniRef100 O07635 Bacilius subtiis YaK 603 UPIOOOO3CB7B1 UniRefl00 entry UniRef100 UPIOOOO3CB7B1 YaL 604 Probable glutaminase ylaM Bacilius UniRef100 OO7637 Bacilius subtiis YaM subtilis 605 YlaN protein Bacilius subtilis UniRef100 OO7638 Bacilius subtiis 606 Hypothetical protein ylaO Bacilius UniRef100 OO7639 Bacilius subtiis FtSW subtilis 607 PycA 608 Cytochrome AA3 controlling protein UniRef100 P12946 Bacilius subtiis CtaA Bacilius subtiis 609 Protoheme IX farnesyltransferase UniRef100 P24009 Bacilius subtiis CtaB Bacilius subtiis 610 Cytochrome c oxidase polypeptide II UniRef100 P24011 Bacilius subtiis CtaC precursor (EC 1.9.3.1) (Cytochrome aa3 subunit 2) (Caa-3605 Subunit 2) (Oxidase aa(3) Subunit 2) Bacilius subtilis 611 Cytochrome c oxidase polypeptide I UniRef100 P24010 Bacilius subtiis CtaD (EC 1.9.3.1) (Cytochrome aa3 subunit 1) (Caa-3605 subunit 1) (Oxidase aa(3) subunit 1) Bacillus subtilis 612 Cytochrome c oxidase polypeptide III UniRef100 P24012 Bacilius subtiis CtaE (EC 1.9.3.1) (Cytochrome aa3 subunit 3) (Caa-3605 subunit 3) (Oxidase aa(3) subunit 3) Bacillus subtilis 613 Cytochrome c oxidase polypeptide IVB UniRef100 P24013 Bacilius subtiis CtaF Bacilius subtilis 614 CtaG protein Bacilius subtilis UniRef100 O34329 Bacilius subtiis CtaG 615 YTbA protein Bacilius subtiis UniRef100 O34743 Bacilius subtiis YbA 616 YbB protein Bacilius Stibilis UniRef100 O34682 Bacilius subtiis YbB 617 YbC protein Bacilius Stibilis UniRef100 O34586 Bacilius subtiis YbC 618 YlbD protein Bacilius subtilis UniRef100 O34880 Bacilius subtiis YbD 619 YIbE protein Bacilius Subiiis UniRef100 O34958 Bacilius subtiis 620 Regulatory protein ylbF Bacilius UniRef100 O34412 Bacilius subtiis YbF subtilis 621 Hypothetical UPF0298 protein ylbO UniRef100 O34658 Bacilius subtiis Bacilius subtilis 622 YIbH protein Bacilius subtilis UniRef100 O34331 Bacilius subtiis YbBI 623 Phosphopantetheline UniRef100 O34797 Bacilius subtiis Yb adenylyltransferase Bacilius subtilis 624 Yb 625 YlbL protein Bacilius subtilis UniRef100 O34470 Bacilius subtiis YbL 626 YlbM protein Bacilius subtilis UniRef100 O34513 Bacilius subtiis YbM 627 YTbN protein Bacilius subtilis UniRef100 O34445 Bacilius subtiis YbN 628 50S ribosomal protein L32 Bacilius UniRef100 O34687 Bacilius subtiis subtilis 629 Hypothetical protein ylbO Bacilius UniRef100 O34549 Bacilius subtiis YbO subtilis 630 YlbP protein Bacillus subtilis) UniRef100 O34468 Bacilius subtiis YbP 631 Probable 2-dehydropantoate 2- UniRef100 O34661 Bacilius subtiis YbQ reductase Bacilius subtilis 632 YA 633 Protein mraZ Bacilius subtilis UniRef100 P55343 Bacilius subtiis YIIB US 8,168,417 B2 117 118 TABLE 1-continued

Predicted functions

Bacilius subtiis SEQ homolog ID (Gene NO. Description OniRef Accession No. Organism Name) 634 S-adenosyl-methyltransferase miraW UniRef100 Q07876 Bacilius subtiis YXA Bacilius subtilis 635 FtSL 636 Penicillin-binding protein 2B Bacilius UniRef100 Q07868 Bacilius subtiis PbpB subtilis 637 Stage V sporulation protein D Bacilius UniRef100 Q03524 Bacilius subtiis SpoVD subtilis 638 UDP-N-acetylmuramoylalanyl-D- UniRef100 Q03523 Bacilius subtiis MurE glutamate-2,6-diaminopimelate ligase Bacilius subtilis 639 Phospho-N-acetylmuramoyl- UniRef100 Q03521 Bacilius subtiis Mray pentapeptide-transferase Bacilius subtilis 640 UDP-N-acetylmuramoylalanine--D- UniRef100 Q03522 Bacilius subtiis Mur) glutamate ligase Bacilius subtilis 641 Stage V sporulation protein E. Bacilius UniRef100 PO7373 Bacilius subtiis SpoVE subtilis 642 UDP-N-acetylglucosamine-N- UniRef100 P37585 Bacilius subtiis MurG acetylmuramyl-(pentapeptide)pyrophosphoryl undecaprenol N acetylglucosamine transferase Bacilius subtilis 643 UDP-N-acetylenolpyruvoylglucosamine UniRef100 P18579 Bacilius subtiis MurE reductase Bacilius subtilis 644 Div 645 YxW 646 YXX 647 Sbp 648 FtSA 649 Cell division protein ftsZ. Bacilius UniRef100 P17865 Bacilius subtiis FtsZ. subtilis 6SO Bpr 651 Bacillopeptidase F precursor Bacillus UniRef100 P16397 Bacilius subtiis Bpr subtilis 652 Sporulation sigma-E factor processing UniRef100 P13801 Bacilius subtiis SpoIIGA peptidase Bacilius subtilis 653 RNA polymerase sigma-E factor UniRef100 PO6222 Bacilius subtiis SigE precursor Bacilius subtilis 654 RNA polymerase sigma-G factor UniRef100 P19940 Bacilius subtiis SigG Bacilius subtilis 655 YlmA protein Bacilius subtilis UniRef100 O31723 Bacilius subtiis YlmA 656 657 YlmC protein Bacilius subtilis UniRef100 O31725 Bacilius subtiis 658 Hypothetical UPFO124 protein ylmD UniRef100 O31726 Bacilius subtiis YD Bacilius subtilis 659 YE 660 YlmF protein Bacillus subtilis UniRef100 O31728 Bacilius subtiis YF 661 YlmG protein Bacilius subtilis UniRef100 O31729 Bacilius subtiis 662 Minicell-associated protein Bacillus UniRef100 P71020 Bacilius subtiis YH subtilis 663 Minicell-associated protein DiviVA UniRef100 P71021 Bacilius subtiis DiwVA Bacilius subtilis 664 Isoleucyl-tRNA synthetase Bacilius UniRef100 Q45477 Bacilius subtiis IeS subtilis 665 Yly A 666 LSpA 667 Hypothetical pseudouridine synthase UniRef100 Q45480 Bacilius subtiis YlyB ylyB Bacilius subtilis 668 PyrR bifunctional protein Includes: UniRef100 P39765 Includes: PyrR operon regulatory protein; Pyrimidine phosphoribosyltransferase (EC operon 2.4.2.9) (UPRTase) Bacillus subtilis regulatory protein; Uracil phosphoribosyltransferase (EC 2.4.2.9) (UPRTase) 1669 Uracil permease Bacillus subtilis UniRef100 P39766 Bacilius subtiis PyrP 1670 Aspartate carbamoyltransferase UniRef100 PO5654 Bacilius subtiis PyrB Bacilius subtilis 1671 Dihydroorotase Bacillus subtilis UniRef100 P25995 Bacilius subtiis PyrC US 8,168,417 B2 119 120 TABLE 1-continued

Predicted functions

Bacilius subtiis SEQ homolog ID (Gene NO. Description OniRef Accession No. Organism Name) 672 Carbamoyl-phosphate synthase, Oni OO P25993 Bacilius subtiis Pyra A pyrimidine-specific, Small chain Bacilius subtilis 673 Carbamoyl-phosphate synthase, Oni OO P25994 Bacilius subtiis Pyr AB pyrimidine-specific, large chain Bacilius subtilis 674 Dihydroorotate dehydrogenase electron Oni OO P25983 Bacilius subtiis PyrK transfer subunit Bacilius subtilis 675 Dihydroorotate dehydrogenase, Oni OO P25996 Bacilius subtiis PyrD catalytic subunit Bacilius subtilis 676 Orotidine 5'-phosphate decarboxylase Oni OO P25971 Bacilius subtiis PyrF Bacilius subtilis 677 Orotate phosphoribosyltransferase Oni OO P25972 Bacilius subtiis PyrE Bacilius subtilis 678 679 CysH 680 YInA protein Bacilius subtilis Oni OO O34734 Bacilius subtiis CysP 681 Sulfate adenylyltransferase Bacilius Oni OO O34764 Bacilius subtiis Sat subtilis 682 Probable adenylyl-sulfate kinase Oni OO O34577 Bacilius subtiis CysC Bacilius subtilis 683 Putative S-adenosyl L-methionine: Oni OO O34744 Bacilius subtiis YD uroporphyrinogen III methyltransferase Bacilius subtilis 684 Sirohydrochlorin ferrochelatase Oni OO O34632 Bacilius subtiis YE Bacilius subtilis 685 YInF protein Bacillus subtilis Oni OO O34813 Bacilius subtiis YnF 686 Putative fibronectin-binding protein Oni OO O34693 Bacilius subtiis YoA Bacilius subtilis 687 YloB protein Bacilius subtilis Oni OO O34431 Bacilius subtiis YoB 688 YloC protein Bacilius subtilis Oni OO O34441 Bacilius subtiis YoC 689 Hypothetical UPF0296 protein ylzA Oni OO Q7WY72 Bacilius subtiis Bacilius subtilis 690 Guanylate kinase Bacilius subtilis Oni OO O34328 Bacilius subtiis Gmk 691 DNA-directed RNA polymerase omega Oni OO O35O11 Bacilius subtiis chain Bacilius subtilis 692 YloI protein Bacilius subtilis Oni OO O35033 Bacilius subtiis Yo 693 Primosomal protein N'Bacilius subtilis Oni OO P94461 Bacilius subtiis PriA 694 Peptide deformylase 1 Bacilius subtilis Oni OO P94462 Bacilius subtiis Def 695 Methionyl-tRNA formyltransferase Oni OO P94463 Bacilius subtiis Fmt Bacilius subtilis 696 Ribosomal RNA small subunit Oni OO P94464 Bacilius subtiis YoM methyltransferase B (EC 2.1.1.—) (rRNA (cytosine-C(5)-)-methyltransferase) Bacilius subtilis 697 Hypothetical UPF0063 protein yloN Oni OO O34617 Bacilius subtiis YoN Bacilius subtilis 698 Protein phosphatase Bacilius subtilis Oni OO O34779 Bacilius subtiis PrpC 699 Probable serine/threonine-protein Oni OO O34507 Bacilius subtiis PrkC kinaseyloP Bacilius subtilis 700 Probable GTPase engC Bacilius Oni OO O34530 Bacilius subtiis YloQ subtilis 701 Ribulose-phosphate 3-epimerase Oni OO O34557 Bacilius subtiis Ribe Bacilius subtilis 702 YloS protein Bacilius subtilis Oni OO O34664 Bacilius subtiis YOS 703 704 50S ribosomal protein L28 Bacilius Oni OO P37807 Bacilius subtiis subtilis 705 Hypothetical protein yloU Bacilius Oni OO O34318 Bacilius subtiis YoU subtilis 706 YoV 707 Probable L-serine dehydratase, beta Oni OO O34635 Bacilius subtiis ScalAB chain Bacilius subtilis 708 Probable L-serine dehydratase, alpha Oni OO O34607 Bacilius subtiis SolaAA chain Bacilius subtilis 709 ATP-dependent DNA helicase recC Oni OO O34942 Bacilius subtiis RecC Bacilius subtilis 710 Transcription factor fapR Bacilius Oni OO O34835 Bacilius subtiis YlpC subtilis 711 Fatty acid?phospholipid synthesis Oni OOP71018 Bacilius subtiis PSX protein plsX Bacilius subtilis US 8,168,417 B2 121 122 TABLE 1-continued

Predicted functions

Bacilius subtiis SEQ homolog ID (Gene NO. Description OniRef Accession No. Organism Name) 712 Malonyl CoA-acyl carrier protein UniRef100 P71019 Bacilius subtiis Fab) transacylase Bacilius subtilis 713 3-oxoacyl-[acyl-carrier-protein UniRef100 P51831 acyl-carrier- FabG reductase Bacilius subtilis protein 714. Acyl carrier protein Bacillus subtilis UniRef100 P80643 Bacilius subtiis 715 Rinc 716 Chromosome partition protein Smc UniRef100 P51834 Bacilius subtiis Smc Bacilius subtilis 717 FSY 718 719 Signal recognition particle protein UniRef100 P37105 Bacilius subtiis Ff Bacilius subtilis 720 30S ribosomal protein S16 Bacilius UniRef100 P21474 Bacilius subtiis subtilis 721 722 YldD protein Bacilius subtilis UniRef100 O31739 Bacilius subtiis YldD 723 RimM 724 TrmD 725 50S ribosomal protein L19 Bacilius UniRef100 O31742 Bacilius subtiis RplS subtilis 726 YldF protein Bacilius subtilis UniRef100 O31743 Bacilius subtiis YlqF 727 Ribonuclease HII Bacilius subtilis UniRef100 O31744 Bacilius subtiis RinhB 728 YlgG protein Bacilius subtilis UniRef100 O31745 Bacilius subtiis YlgG 729 YldH protein Bacilius subtilis UniRef100 O34867 Bacilius subtiis 730 Succinyl-CoA synthetase beta chain UniRef100 P80886 Bacilius subtiis SucC Bacilius subtilis 731 Succinyl-CoA synthetase alpha chain UniRef100 P80865 Bacilius subtiis SucD Bacilius subtilis 732 Smf 733 Top A 734 Protein gid Bacilius subtilis UniRef100 P39815 Bacilius subtiis Gid 735 Tyrosine recombinase xerC Bacilius UniRef100 P39776 Bacilius subtiis CodW subtilis 736 ATP-dependent protease hslV UniRef100 P39070 Bacilius subtiis ClipO precursor Bacillus subtilis 737 ATP-dependenths protease ATP- UniRef100 P39778 Bacilius subtiis ClpY binding subunithslUBacilius subtilis 738 GTP-sensing transcriptional pleiotropic UniRef100 P39779 Bacilius subtiis Cody repressor codY Bacilius subtilis 739 Flagellar basal-body rod protein figE UniRef100 P24500 Bacilius subtiis FIgE Bacilius subtilis 740 Flagellar basal-body rod protein figC UniRef100 P245O1 Bacilius subtiis FlgC Bacilius subtilis 741 Flagellar hook-basal body complex UniRef100 P245O2 Bacilius subtiis FE protein fliE Bacilius subtilis 742 Flagellar M-ring protein Bacilius UniRef100 P23447 Bacilius subtiis FIF subtilis 743 Flagellar motor Switch protein fliG UniRef100 P23448 Bacilius subtiis FG Bacilius subtilis 744 Probable flagellar assembly protein fliH UniRef100 P23449 Bacilius subtiis FIH Bacilius subtilis 745 Flagellum-specific ATP synthase UniRef100 P23445 Bacilius subtiis FI Bacilius subtilis 746 Flagellar fliJ protein Bacilius subtilis UniRef100 P20487 Bacilius subtiis Fi 747 FlaA locus 22.9 kDa protein Bacilius UniRef100 P23454 Bacilius subtiis YXF subtilis 748 Probable flagellar hook-length control UniRef100 P23451 Bacilius subtiis FiK protein Bacilius subtilis 749 FlaA locus hypothetical proteinylxG UniRef100 P23455 Bacilius subtiis YxG Bacilius subtilis 750 FIgE 751 BH2448 protein Bacillus halodurans UniRef100 Q9KA42 Bacilius halodurans 752 Flagellar fliL protein Bacilius subtilis UniRef100 P23452 Bacilius subtiis FIL 753 Flagellar motor Switch protein fliM UniRef100 P23453 Bacilius subtiis FM Bacilius subtiis 754. Flagellar motor Switch protein fliY UniRef100 P24O73 Bacilius subtiis FIY Bacilius subtiis 755 Chemotaxis protein cheY homolog UniRef100 P24072 Bacilius subtiis CheY Bacilius subtiis 756 Flagellar biosynthetic protein fliZ UniRef100 P35536 Bacilius subtiis FIZ. precursor Bacillus subtilis US 8,168,417 B2 123 124 TABLE 1-continued

Predicted functions

Bacilius subtiis SEQ homolog ID (Gene NO. Description OniRef Accession No. Organism Name) 757 Flagellar biosynthetic protein fliP UniRef100 P35528 Bacilius subtiis FIP Bacilius subtilis 758 Flagellar biosynthetic protein fliO UniRef100 P35535 Bacilius subtiis Bacilius subtilis 759 Flagellar biosynthetic protein fliR UniRef100 P35537 Bacilius subtiis FIR Bacilius subtilis 760 Flagellar biosynthetic protein flhB UniRef100 P35538 Bacilius subtiis FIB Bacilius subtilis 761 Flagellar biosynthesis protein flhA UniRef100 P35620 Bacilius subtiis FIhA Bacilius subtilis 762 Flagellar biosynthesis protein flhF UniRef100 Q01960 Bacilius subtiis FIhF Bacilius subtilis 763 Hypothetical protein yixH Bacilius UniRef100 P4O742 Bacilius subtiis YXH subtiis 764 Chemotaxis response regulator protein- UniRef100 Q05522 Bacilius subtiis Ches glutamate methylesterase Bacilius subtiis 765 Chemotaxis protein chea Bacilius UniRef100 P29072 Bacilius subtiis Chea subtiis 766 Chemotaxis protein cheW Bacilius UniRef100 P398O2 Bacilius subtiis Chew subtiis 767 Chemotaxis protein chec Bacilius UniRef100 P40403 Bacilius subtiis CheC subtiis 768 Chemotaxis protein ched Bacilius UniRef100 P4O404 Bacilius subtiis Che) subtiis 769 RNA polymerase sigma-D factor UniRef100 P10726 Bacilius subtiis SigD Bacilius subtilis 770 Swarming motility protein swrB Bacilius UniRef100 P40405 Bacilius subtiis YXL subtiis 771 30S ribosomal protein S2 Bacilius UniRef100 P21464 Bacilius subtiis RSB subtiis 772 Translation elongation factor Ts UniRef100 Q65JJ8 Bacilius cereus TSf Bacilius cereus 773 PyrH 774 Frr 775 Undecaprenyl pyrophosphate UniRef100 O31751 Bacilius subtiis UppS synthetase Bacilius subtilis 776 Phosphatidate cytidylyltransferase UniRef100 O31752 Bacilius subtiis ColsA Bacilius subtilis 777 1-deoxy-D-xylulose 5-phosphate UniRef100 O31753 Bacilius subtiis Dxr reductoisomerase Bacillus subtilis 778 Hypothetical zinc metalloprotease y luc UniRef100 O31754 Bacilius subtiis YC Bacilius subtilis 779 Prolyl-tRNA synthetase Bacilius UniRef100 O31755 Bacilius subtiis ProS subtilis 780 DNA polymerase III polC-type Bacilius UniRef100 P13267 Bacilius subtiis POC subtilis 781 782 Cellulose 1,4-beta-cellobiosidase UniRef100 Q8KKF7 Paenibacillus sp. BP-23 precursor Paenibacilius sp. BP-23) 783 Endoglucanase B precursor UniRef100 P23550 Paenibacilius iaitius Paenibacilius lautus 784 Beta-mannosidase Thermotoga UniRef100 Q9RIK7 Thermotoga neapolitana neapolitana 785 Hypothetical UPF0090 protein ylxS UniRef100 P32726 Bacilius subtiis YxS Bacilius subtilis 786 Transcription elongation protein nuSA UniRef100 P32727 Bacilius subtiis NuSA Bacilius subtilis 787 Hypothetical protein yixRBacilius UniRef100 P32728 Bacilius subtiis subtilis 788 Probable ribosomal protein ylxO UniRef100 P32729 Bacilius subtiis Bacilius subtilis 789 Translation initiation factor IF-2 UniRef100 P17889 Bacilius subtiis InfB Bacilius subtilis 790 Hypothetical protein ylxP Bacilius UniRef100 P32730 Bacilius subtiis subtilis 791 Ribosome-binding factor A Bacilius UniRef100 P32731 Bacilius subtiis Rbf A subtilis 792 TruE US 8,168,417 B2 125 126 TABLE 1-continued

Predicted functions

Bacilius subtiis SEQ homolog ID (Gene NO. Description UniRef Accession No. Organism Name) 1793 Riboflavin biosynthesis protein ribC UniRef100 P54575 Includes: RbC Includes: Riboflavin kinase (EC Riboflavin kinase 2.7.1.26) (Flavokinase): FMN (EC 2.7.1.26) adenylyltransferase (EC 2.7.7.2) (FAD (Flavokinase); pyrophosphorylase) (FAD synthetase) FMN Bacilius subtilis adenylyltransferase (EC 2.7.7.2) (FAD pyrophosphorylase) (FAD synthetase) 794 30S ribosomal protein S15 Bacilius UniRef100 P21473 Bacilius subtiis subtilis 795 Polyribonucleotide UniRef100 P50849 Bacilius subtiis PnpA nucleotidyltransferase Bacilius subtilis 796 Hypothetical protein yixY precursor UniRef100 P50850 Bacilius subtiis YxY Bacilius subtilis 797 Hypothetical zinc proteaseymxG UniRef100 Q04805 Bacilius subtiis Mlp A Bacilius subtilis 798 Hypothetical protein ymxH Bacilius UniRef100 Q04811 Bacilius subtiis subtilis 799 Dipicolinate synthase, A chain Bacillus UniRef100 Q04809 Bacilius subtiis SpoVFA subtilis 800 Dipicolinate synthase, B chain Bacillus UniRef100 Q04810 Bacilius subtiis SpoVFB subtilis 801 Aspartate-semialdehyde UniRef100 Q04797 Bacilius subtiis ASd dehydrogenase Bacilius subtilis 802 Aspartokinase 1 (EC 2.7.2.4) UniRef100 Q04795 Contains: DapG (Aspartokinase I) (Aspartate kinase 1) Aspartokinase I Contains: Aspartokinase I alpha alpha Subunit; Subunit; AspartOkinase I beta Subunit Aspartokinase I Bacilius subtilis beta subunit 803 Dihydrodipicolinate synthase Bacilius UniRef100 Q04796 Bacilius subtiis Dap A subtilis 804 YmfA protein Bacilius subtilis UniRef100 O31760 Bacilius subtiis YmfA 805 Translocation-enhancing protein tep A UniRef100 Q99171 Bacilius subtiis TepA Bacilius subtilis 806 807 DNA translocase ftsK Bacilius subtilis UniRef100 P21458 Bacilius subtiis SpoIIIE 808 Hypothetical transcriptional regulator UniRef100 O31761 Bacilius subtiis YnfC ymfc Bacilius subtilis 809 Multidrug resistance protein Bacilius UniRef100 Q9K7Q2 Bacilius YitG halodurans halodurans 810 Hypothetical protein Bacilius anthracis UniRef100 Q81WP6 Bacilius Ylf anthracis 811 YmfH protein Bacilius subtilis UniRef100 O31766 Bacilius subtiis Ylf 812 813 814 Ymfy protein Bacilius subtilis UniRef100 O31768 Bacilius subtiis 815 Hypothetical protein Bacilius cereus UniRef100 Q636P2 Bacilius cereus YnfK ZK) ZK. 816 YnfM 817 CDP-diacylglycerol-glycerol-3- UniRef100 P46322 Bacilius subtiis PgSA phosphate 3-phosphatidyltransferase Bacilius subtilis 818 CinA-like protein Bacilius subtilis UniRef100 P46323 Bacilius subtiis CinA 819 RecA protein Bacilius UniRef100 Q8GJG2 Bacilius RecA amyloiquefaciens amyloiquefaciens 820 Hypothetical UPF0144 protein ymdA UniRef100 O31774 Bacilius subtiis YmdA Bacilius subtilis 821 YmdB protein Bacillus subtilis UniRef100 O31775 Bacilius subtiis YndB 822 Stage V sporulation protein S Bacilius UniRef100 P45693 Bacilius subtiis subtilis 823 824 825 L-threonine 3-dehydrogenase Bacilius UniRef100 O31776 Bacilius subtiis Tdh subtilis 826 2-amino-3-ketobutyrate coenzyme A UniRef100 O31777 Bacilius subtiis Kbl ligase Bacilius subtilis 827 Hypothetical UPF0004 protein ymcB UniRef100 O31778 Bacilius subtiis YncB Bacilius subtilis 828 YmcA protein Bacillus subtilis UniRef100 O31779 Bacilius subtiis YmcA US 8,168,417 B2 127 128 TABLE 1-continued

Predicted functions

Bacilius subtiis SEQ homolog ID (Gene NO. Description OniRef Accession No. Organism Name) 829 Spore coat protein E Bacilius subtilis UniRef100 P14016 Bacilius subtiis Cot 830 MutS 831 DNA mismatch repair protein mutL UniRef100 P49850 Bacilius subtiis MutL Bacilius subtilis 832 YicS protein Bacilius subtilis UniRef100 O31641 Bacilius subtiis 833 YXD 834 835 836 All 1751 protein Anabaena sp. UniRef100 Q8YW65 Anabaena sp. YCC 837 BHO367 protein Bacilius halodurans UniRef100 Q9KFV4 Bacilius halodurans 838 Phosphinothricin N-acetyltransferase UniRef100 Q9KFP5 Bacilius YWH Bacilius halodurans halodurans 839 UPIOOOO3CCOD8 UniRef100 entry UniRef100 UPIOOOO3CCOD8 840 841 Putative L-amino acid oxidase Bacilius UniRef100 O34363 Bacilius subtiis YobN subtilis 842 Yoak 843 Na+ H+ antiporter Bacilius halodurans UniRef100 Q9K5Q0 Bacilius YvgP halodurans 844 Aromatic hydrocarbon catabolism UniRef100 Q8CV32 Oceanobacilius iheyensis protein Oceanobacilius iheyensis 845 Hypothetical UPF0145 protein VP1283 UniRef100 Q87Q67 Vibrio parahaemolyticus Vibrio parahaemolyticus 846 Hypothetical protein yaeD Bacilius UniRef100 P54449 Bacilius subtiis YgeD subtilis 847 Penicillin-binding protein, putative UniRef100 Q738U9 Bacilius cereus PbpE Bacilius cereus 848 Hypothetical glycosyltransferase UniRef100 O34539 Bacilius subtiis YiC LBacilius subtilis 849 Asparate-proton Symporter Bacilius UniRef100 OO7002 Bacilius subtiis YveA subtilis 850 Spore coat protein Bacilius halodurans UniRef100 Q9KEV6 Bacilius halodurans 851 852 YD 853 Putative HTH-type transcriptional UniRef100 Q7WY76 Bacilius subtiis YezE regulatoryezE Bacilius subtilis 854 Hypothetical protein yes}E Bacilius UniRef100 O31511 Bacilius subtiis YesE subtilis 855 YesR protein Bacilius subtilis UniRef100 O31512 Bacilius subtiis YesE 856 UPIOOOO3CBA3B UniRef100 entry UniRef100 UPIOOOO3CBA3B 857 YmaD protein Bacillus subtilis UniRef100 O31790 Bacilius subtiis YmaD 858 Multidrug resistance protein ebrB UniRef100 O31791 Bacilius subtiis EbrB Bacilius subtiis 859 Multidrug resistance protein ebra UniRef100 O31792 Bacilius subtiis Ebra Bacilius subtiis 860 861 Hypothetical protein ymaF Bacilius UniRef100 O31794 Bacilius subtiis Yma subtilis 862 tRNA delta(2)- UniRef100 O31795 Bacilius subtiis Mia-A isopentenylpyrophosphate transferase Bacilius subtiis 863 Hfq protein Bacilius subtilis UniRef100 O31796 Bacilius subtiis 864. Hypothetical protein ymzA Bacilius UniRef100 O31798 Bacilius subtiis subtilis 865 NrdI protein Bacillus subtilis UniRef100 P50618 Bacilius subtiis YmaA. 866 Ribonucleoside-diphosphate reductase UniRef100 P50620 Bacilius subtiis NrdE alpha chain Bacilius Subtilis 867 Ribonucleoside-diphosphate reductase UniRef100 P50621 Bacilius subtiis NrdF beta chain Bacillus subtilis 868 Hypothetical protein ymaB Bacilius UniRef100 P50619 Bacilius subtiis Ymas. subtilis 869 Blró966 protein Bradyrhizobium UniRef100 Q89EV4 Bradyrhizobium YnP japonictim japonictim 870 Nitrogen fixation protein Bacilius UniRef100 Q9KFV2 Bacilius YurV halodurans halodurans 871 Transcription regulator Fur family-like UniRef100 Q8CNQ7 Staphylococcits PerR protein Staphylococci is epidermidis epidermidis 872 UPIOOOO3CB681 UniRefl00 entry UniRef100 UPIOOOO3CB681 YdhC 873 Hypothetical protein Bacilius cereus UniRef100 Q81C60 Bacilius cereus YIA 874 CwlC 875 Membrane protein, putative Listeria UniRef100 Q72OL9 Listeria monocytogenes monocytogenes US 8,168,417 B2 129 130 TABLE 1-continued

Predicted functions

Bacilius subtiis SEQ homolog ID (Gene NO. Description OniRef Accession No. Organism Name) 876 Lin1174 protein Listeria innocua UniRef100 Q92CJ7 Listeria innoctia 877 878 879 Transcriptional regulator Aquifex UniRef100 O66635 Aquifex aeolicits YdgC aeolicits 880 Hypothetical membrane-anchored UniRef100 Q92VA1 Rhizobium meioti protein Rhizobium meliloti 881 Cytosine permease Bacillus UniRef100 Q9KBP3 Bacilius YXA halodurans halodurans 882 AgaF Agrobacterium tumefaciens UniRef100 OSO265 Agrobacterium timefaciens 883 884 Hydantoin utilization protein B UniRef100 Q88H51 Pseudomonas puttida Pseudomonas puttida 885 Hypothetical protein SMb20139 UniRef100 Q92X23 Rhizobium meioti Rhizobium meliloti 886 BH2340 protein Bacilius halodurans UniRef100 Q9KAE7 Bacilius halodurans 887 888 889 Stage V sporulation protein K Bacilius UniRef100 P27643 Bacilius subtiis SpoVK subtilis 890 YnbA Bacilius subtilis UniRef100 P94478 Bacilius subtiis YnbA 891 YnbB Bacillus subtilis) UniRef100 P94479 Bacilius subtiis YbB 892 HTH-type transcriptional regulator glnR UniRef100 P37582 Bacilius subtiis GlnR Bacilius subtilis 893 Glutamine synthetase Bacilius subtilis UniRef100 P12425 Bacilius subtiis GlnA 894 895 Hypothetical protein CAC3435 UniRef100 Q97DN7 Clostridium acetobutyllicum Clostridium acetobutyllicum 896 897 898 Hypothetical protein CACO350 UniRef100 Q97M50 Clostridium acetobutyllicum Clostridium acetobutyllicum 899 Hypothetical HIT-like protein MJO866 UniRef100 Q58276 Meihanococcus Hit Methanococcus jannaschii iannaschii 900 Hypothetical protein Bacilius anthracis UniRef100 Q6I2B3 Bacilius anthracis 901 Hypothetical protein Bacilius cereus UniRef100 Q81CM1 Bacilius cereus YdjC 902 903 Methyltransferase Bacilius cereus UniRef100 Q81CJ2 Bacilius cereus 904 YoaC) 905 Acetyltransferase, GNAT family UniRef100 Q737B4 Bacilius cereus Bacilius cereus 906 Repressor rok Bacilius subtilis UniRef100 O34857 Bacilius subtiis Rok 907 908 Hypothetical protein Bacteriophage T5) UniRef100 Q6OGK2 Bacteriophage T5 909 Hypothetical protein yvdT Bacilius UniRef100 OO7001 Bacilius subtiis YvoT subtilis 910 Hypothetical protein yvdS Bacilius UniRef100 O32262 Bacilius subtiis YvoS subtilis 911 Hypothetical protein yvdR Bacilius UniRef100 OO6999 Bacilius subtiis YvR. subtilis 912 Spermidine N1-acetyltransferase UniRef100 Q72YO3 Bacilius cereus YoaA Bacilius cereus 913 Hypothetical Membrane Associated UniRef100 Q812L6 Bacilius cereus Protein Bacilius cereus 914 Hypothetical protein yoaW precursor UniRef100 O34541 Bacilius subtiis YoaW Bacilius subtilis 915 Thiol-disulfide oxidoreductase res.A UniRef100 Q9KCJ4. Bacilius ResA Bacilius halodurans halodurans 916 Manganese-containing catalase UniRef100 Q9KAU6 Bacilius YdbD Bacilius halodurans halodurans 917 BH1562 protein Bacilius halodurans UniRef100 Q9KCK9 Bacilius halodurans 918 Acetyltransferase, GNAT family UniRef100 Q739KO Bacilius cereus Yick Bacilius cereus 919 Hypothetical conserved protein UniRef100 Q8ELR7 Oceanobacilius iheyensis Oceanobacilius iheyensis 920 921 Transcriptional regulator, MarR family UniRef100 Q81 BM5 Bacilius cereus YkyE Bacilius cereus 922 Putative NAD(P)H nitroreductaseydfN UniRef100 P96692 Bacilius subtiis YdfN Bacilius subtilis 923 YdfC) protein Bacilius subtilis UniRef100 P96693 Bacilius subtiis Ydf) US 8,168,417 B2 131 132 TABLE 1-continued

Predicted functions

Bacilius subtiis SEQ homolog ID (Gene NO. Description OniRef Accession No. Organism Name) 924 Hypothetical protein Bacilius cereus UniRef100 Q630S7 Bacilius cereus YwrF ZK) ZK. 925 BH1010 protein Bacilius halodurans UniRef100 Q9KE48 Bacilius YCB halodurans 926 ORF28 Staphylococcus phage K. UniRef100 Q6Y7T8 Staphylococci is phage K 927 928 Thymidylate synthase Bacteriophage UniRef100 PO7606 Bacteriophage Thy A phi-3T phi-3T 929 Sporulation-specific extracellular UniRef100 P42983 Bacilius subtiis NucB nuclease precursor Bacilius subtilis 930 LexA repressor Bacilius subtilis UniRef100 P31080 Bacilius subtiis Lex A 931 YneA Bacilius subtilis UniRef100 Q45056 Bacilius subtiis YneA 932 YneB Bacillus subtilis) UniRef100 Q45057 Bacilius subtiis YneB 933 Hypothetical UPF0291 protein ynzC UniRef100 O31818 Bacilius subtiis Bacilius subtilis 934 Transketolase Bacilius subtilis UniRef100 P45694 Bacilius subtiis Tkt 935 Hypothetical protein yneE Bacilius UniRef100 P45707 Bacilius subtiis YneE subtilis 936 Hypothetical UPFO154 protein yneF UniRef100 P45708 Bacilius subtiis Bacilius subtilis 937 938 Cytochrome c-type biogenesis protein UniRef100 P45706 Bacilius subtiis CcdA ccdA Bacilius subtilis 939 CCdB protein Bacilius subtilis UniRef100 P45709 Bacilius subtiis Yne 940 CedC protein Bacilius subtilis UniRef100 P45710 Bacilius subtiis Yne 941 Hypothetical protein yneK Bacilius UniRef100 P45711 Bacilius subtiis YneK subtilis 942 Spore coat protein M Bacilius subtilis UniRef100 Q45058 Bacilius subtiis CotM 943 944 945 CitB 946 YneN protein Bacilius subtilis UniRef100 O31820 Bacilius subtiis YneN 947 948 949 Small, acid-soluble spore protein tip UniRef100 Q45060 Bacilius subtiis Bacilius subtilis 950 YneP Bacillus subtilis UniRef100 Q45061 Bacilius subtiis YneP 951 YneQ Bacilius subtilis UniRef100 Q45062 Bacilius subtiis 952 Hypothetical protein Bacilius cereus UniRef100 Q815P1 Bacilius cereus 953 Conserved domain protein Bacilius UniRef100 Q72YR7 Bacilius cereus cereus 954 YneR Bacilius subtilis UniRef100 Q45063 Bacilius subtiis 955 Hypothetical UPF0078 protein yneS UniRef100 Q45064 Bacilius subtiis YneS Bacilius subtilis 956 YneT Bacillus subtilis UniRef100 Q45065 Bacilius subtiis YneT 957 Topoisomerase IV subunit BBacilius UniRef100 Q59192 Bacilius subtiis ParE subtilis 958 Topoisomerase IV subunit A Bacilius UniRef100 Q45066 Bacilius subtiis ParC subtilis 959 AraR 960 Hypothetical conserved protein UniRef100 Q8EMP2 Oceanobacilius XylB Oceanobacilius iheyensis iheyensis 961 L-ribulose-5-phosphate 4-epimerase UniRef100 Q8EMP3 Oceanobacilius Ara) Oceanobacilius iheyensis iheyensis 962 L-arabinose isomerase Oceanobacilius UniRef100 Q8EMP4 Oceanobacilius Ara.A iheyensis iheyensis 963 YwtG 964 FabG 965 Hypothetical protein ynfC Bacilius UniRef100 Q45067 Bacilius subtiis YnfC subtilis 966 Amino acid carrier protein alsT Bacilius UniRef100 Q45068 Bacilius subtiis AlsT subtilis 967 Nar 968 Nar 969 NarEH 970 NarG 971 972 Hypothetical protein yafB Lactococcus UniRef100 Q9CF70 Lactococci is lactis lactis 973 AlbA 974 ArfM 975 YwiC US 8,168,417 B2 133 134 TABLE 1-continued

Predicted functions

Bacilius subtiis SEQ homolog ID (Gene NO. Description UniRef Accession No. Organism Name) 976 Transcriptional regulator of anaerobic UniRef100 Q9KG81 Bacilius Finr genes Bacilius halodurans halodurans 977 Nitrite extrusion protein Bacilius UniRef100 P46907 Bacilius subtiis NarK subtilis 978 979 CAMP-binding domains-Catabolite UniRef100 Q8R5P4 Thermoanaerobacter Finr gene activator and regulatory Subunit of tengcongensis cAMP-dependent protein kinases Thermoanaerobacter tengcongensis 980 Putative nitric oxide reductase UniRef100 Q6GK48 Staphylococcusatiretts Staphylococcusatiretts 981 982 Yng|L 983 BglC 984 Hypothetical protein ynfE Bacilius UniRef100 Q45069 Bacilius subtiis subtilis 985 Hypothetical protein Bacilius UniRef100 Q97.F48 Bacilius YkkB negaterium negaterium 986 Hypothetical protein Bacilius UniRef100 Q7OKO6 Bacilius amyloiiquefaciens amyloiquefaciens 987 Alkyl hydroperoxide reductase large UniRef100 Q979W3 Bacilius AhpF subunit Bacilius halodurans halodurans 988 Methyltransferase Methanosarcina UniRef100 Q8PU82 Meihanosarcina mazei mazei 989 Similar to B. subtilis ywgB gene Bacilius UniRef100 Q979W2 Bacilius YwgB halodurans halodurans 990 Hypothetical protein ywoF Bacilius UniRef100 P94576 Bacilius subtiis YWOF subtilis 991 Branched-chain amino acid transport UniRef100 P94499 Bacilius subtiis Brno system carrier protein brnO Bacilius subtilis 992 NADP-dependent alcohol UniRef100 OO6007 Bacilius subtiis Adha dehydrogenase Bacilius subtilis 993 Transcriptional regulator, MerR family UniRef100 Q7217.3 Listeria YraB Listeria monocytogenes monocytogenes 994 HPr-like protein crh Bacilius subtilis UniRef100 OO6976 Bacilius subtiis 995 BH2089 protein Bacilius halodurans UniRef100 Q979R4 Bacilius YddR halodurans 996 997 Enoyl-CoA hydratasetisomerase family UniRef100 Q738LO Bacilius cereus YngF protein Bacilius ceretts 998 Hypothetical protein ysiB Bacilius UniRef100 P94549 Bacilius subtiis YSEB subtilis 999 Methylmalonic acid semialdehyde UniRef100 Q63BLO Bacilius cereus Mms.A dehydrogenase Bacilius cereus ZK ZK. 2000 3-hydroxyisobutyrate dehydrogenase UniRef100 Q63BL1 Bacilius cereus YkwC Bacilius cereus ZK) ZK. 2001 Acyl-CoA dehydrogenase Bacilius UniRef100 Q81DR7 Bacilius cereus YuSJ cereus 2002 Mannose-6-phosphate isomerase UniRef100 O31646 Bacilius subtiis ManA Bacilius subtilis 2003 Phosphotransferase system (PTS) UniRef100 O31645 Bacilius subtiis ManP mannose-specific enzyme IIBCA component Bacilius subtilis 2004 2005 2006 Hypothetical protein Bacilius cereus UniRef100 Q72YT6 Bacilius cereus 2007 Transcriptional regulator Bacilius UniRef100 O31644 Bacilius subtiis ManR subtilis 2008 2009 UPIOOOO3CC220 UniRefl00 entry UniRef100 UPIOOOO3CC220 YtrB 2010 Transcriptional regulator Bacilius UniRef100 Q9KF35 Bacilius Ytra halodurans halodurans 2011 Probable oxidoreductase Clostridium UniRef100 Q8XP17 Cliostridium YimF perfingens perfingens 2012 Mannonate dehydratase Clostridium UniRef100 Q8XP15 Cliostridium UxuA perfingens perfingens 2013 Glucosidase Bacilius halodurans UniRef100 Q9KEZ5 Bacilius halodurans 2014 C4-dicarboxylate transport system UniRef100 Q9KEZ6 Bacilius halodurans Bacilius halodurans US 8,168,417 B2 135 136 TABLE 1-continued

Predicted functions

Bacilius subtiis SEQ homolog ID (Gene NO. Description OniRef Accession No. Organism Name) 2015 C4-dicarboxylate transport system UniRef100 Q8EMM5 Oceanobacilius iheyensis permease Small protein Oceanobacilius iheyensis 2016 C4-dicarboxylate transport system UniRef100 Q9KEZ8 Bacilius DctB Bacilius halodurans halodurans 2017 Transcriptional regulator UniRef100 Q8EL22 Oceanobacilius CcpA Oceanobacilius iheyensis iheyensis 2018 Arsenate reductase Bacilius subtilis UniRef100 P45947 Bacilius subtiis ArsC 2019 YdfA protein Bacilius subtilis UniRef100 P96678 Bacilius subtiis YdfA 2020 YdeT protein Bacilius subtilis UniRef100 P96677 Bacilius subtiis Yde.T 2021 YoleIBacilius halodurans UniRef100 Q979R5 Bacilius Yde halodurans 2022 Putative secreted protein UniRef100 Q9S1Z5 Streptomyces coelicolor Streptomyces coelicolor 2023 Probable glucose uptake protein gleU UniRef100 P40419 Bacilius GlcU Bacilius negaterium negaterium 2024 YngK protein Bacillus subtilis UniRef100 O35O15 Bacilius subtiis YngK 2025 YngD protein Bacillus subtilis UniRef100 O31824 Bacilius subtiis YngD 2026 Pyruvate formate-lyase-activating UniRef100 Q73DZ6 Bacilius cereus YkyL enzyme Bacilius Ceretts 2027 Formate acetyltransferase Bacilius UniRef100 Q81YX1 Bacilius anthracis anthracis 2028 DacC 2029 NADH dehydrogenase-like protein yID UniRef100 P80861 Bacilius subtiis YID Bacilius subtilis 2030 Hypothetical protein yIC Bacilius UniRef100 O34633 Bacilius subtiis YIC subtilis 2O31 2032 Hypothetical protein LBacillus cereus UniRef100 Q81IJ8 Bacilius cereus 2033 Hypothetical protein ykZH Bacilius UniRef100 O31653 Bacilius subtiis subtilis 2034 Acetyl-CoA synthetase Bacilius UniRef100 Q9KDS4 Bacilius AcSA halodurans halodurans 2035 YngE protein Bacillus subtilis UniRef100 O31825 Bacilius subtiis YngE 2036 Hydroxybutyryl-dehydratase Bacilius UniRef100 Q9L7W1 Bacilius subtiis YngF subtilis 2037 YngG protein Bacillus subtilis UniRef100 O34873 Bacilius subtiis YngG 2038 YngXX Bacilius subtilis UniRef100 Q9R9I3 Bacilius subtiis 2039 YngH Bacillus subtilis UniRef100 Q9R9I4 Bacilius subtiis YngH 2040 Yng Bacilius subtilis UniRef100 Q9R9I5 Bacilius subtiis Yng 2041 Yng protein Bacilius subtilis UniRef100 O34421 Bacilius subtiis Yng 2042 NAD(P)H oxidoreductase YRKL UniRef100 Q7P6PO Fusobacterium YrkL Fusobacterium nucleatin Subsp. nucleathin vincentii ATCC 49256) Subsp. vincentii ATCC 492S6 2043 Transcriptional regulator, MarR family UniRef100 Q8RE85 Fusobacterium nucleatum Fusobacterium nucleatum 2044 Glutamate-5-semialdehyde UniRef100 Q6HHC2 Bacilius ProA dehydrogenase Bacilius thiringiensis thiringiensis 2045 Glutamate 5-kinase 2 Bacilius subtilis UniRef100 OO7509 Bacilius subtiis Pro 2046 Pyrroline-5-carboxylate reductase 1 UniRef100 P14383 Bacilius subtiis ProH Bacilius subtilis 2047 UPIOOOO3CB6CD UniRef100 entry UniRef100 UPIOOOO3CB6CD YetF 2048 Sodium-dependent phosphate UniRef100 Q9KCT1 Bacilius CysP transporter Bacilius halodurans halodurans 2049 Probable phosphoadenosine UniRef100 OO6737 Bacilius subtiis YitB phosphosulfate reductase Bacilius subtilis 2050 Phosphosulfolactate synthase (EC UniRef100 OO6739 Bacilius subtiis Yit) 4.4.1.19) ((2R)-phospho-3-sulfolactate synthase) Bacilius subtilis 2051 YitE Bacillus subtilis) UniRef100 OO6740 Bacilius subtiis YitE 2052 YitF Bacilius subtilis UniRef100 OO6741 Bacilius subtiis YitF 2053 YitG Bacillus subtilis) UniRef100 OO6742 Bacilius subtiis YitG 2054 Putative glycosyltransferaseykoT UniRef100 O34755 Bacilius subtiis YkoT Bacilius subtilis 2055 YkoR Bacillus subtilis UniRef100 O34830 Bacilius subtiis YkoS 2056 Glutamate synthase NADPH Small UniRef100 O34399 NADPH GltB chain Bacilius subtilis 2057 GltA 2058 HTH-type transcriptional regulator gltC UniRef100 P20668 Bacilius subtiis GltC Bacilius subtilis US 8,168,417 B2 137 138 TABLE 1-continued

Predicted functions

Bacilius subtiis SEQ homolog ID (Gene NO. Description OniRef Accession No. Organism Name) 2059 AI17121 protein Anabaena sp. UniRef100 Q8YL17 Anabaena sp. 2060 2061 Limo0606 protein Listeria UniRef100 Q8Y9C6 Listeria monocytogenes monocytogenes 2062 ABC transporter ATP-binding protein UniRef100 Q67MU2 Symbiobacterium YfB Symbiobacterium thermophilum thermophilum 2063 ABC transporter ATP-binding protein UniRef100 Q67MU3 Symbiobacterium YfiC Symbiobacterium thermophilum thermophilum 2064 Cytochrome P450 109 Bacilius subtilis UniRef100 P27632 Bacilius subtiis YiB 2065 Hypothetical oxidoreductase yoxD UniRef100 P14802 Bacilius subtiis YoxD Bacilius subtilis 2066 Pps 2067 Permease, general substrate UniRef100 Q6HMC3 Bacilius LmrB transporter Bacilius thiringiensis thiringiensis 2068 Putative formate dehydrogenase UniRef100 O34323 Bacilius subtiis Yoa Bacilius subtilis 2069 Transcriptional regulatory protein UniRef100 Q89KD1 Bradyrhizobium YcgE Bradyrhizobium japonicum japonictim 2070 Drug resistance transporter, UniRef100 Q73615 Bacilius cereus Mdr EmrB/QacA family Bacilius cereus 2071 YndE protein Bacillus subtilis UniRef100 O31809 Bacilius subtiis YE 2072 YndF protein Bacilius subtilis UniRef100 O31810 Bacilius subtiis YF 2O73 YE 2074 YE 2075 UPIOOOO3CB22E UniRef100 entry UniRef100 UPIOOOO3CB22E YD 2O76 2077 Nucleotide binding protein expZ. UniRef100 P39115 Bacilius subtiis ExpZ. Bacilius subtilis 2O78 TlpB 2079 CsaA protein Bacilius subtilis UniRef100 P37584 Bacilius subtiis CsaA 2080 Alkaline serine protease Bacilius UniRef100 O31788 Bacilius subtiis AprX subtilis 2081 Hypothetical protein Bacilius anthracis UniRef100 Q81SD4 Bacilius YnZE anthracis 2082 Transcriptional regulator, TetR family UniRef100 Q63D70 Bacilius cereus YrhI Bacilius cereus ZK) ZK. 2083 Putative HTH-type transcriptional UniRef100 P4O762 Bacilius subtiis YynB regulatoryvmB Bacilius subtilis 2084 Hydrolase Bacillus cereus UniRef100 Q81D79 Bacilius cereus YgeK 2085 Hypothetical conserved protein UniRef100 Q8ETG3 Oceanobacilius iheyensis Oceanobacilius iheyensis 2086 YndJ protein Bacilius subtilis UniRef100 O31813 Bacilius subtiis Ynd 2087 YndH protein Bacillus subtilis UniRef100 O31812 Bacilius subtiis YH 2088 Hypothetical protein Bacilius cereus UniRef100 Q73A97 Bacilius cereus YndG 2089 UPIOOOO3CBA97 UniRef100 entry UniRef100 UPIOOOO3CBA97 YobS 2090 UPIOOOO3CBA98 UniRef100 entry UniRef100 UPIOOOO3CBA98 YobT 2091 DNA-binding protein YobU Bacilius UniRef100 O34637 Bacilius subtiis YobU subtilis 2092 Hypothetical protein Bacteroides UniRef100 Q64RP1 Bacteroides fragilis fragilis 2093 Transcription regulator Bacilius subtilis UniRef100 O34920 Bacilius subtiis YobV 2094 YobW 2095 YoZA protein Bacilius subtilis UniRef100 O31844 Bacilius subtiis 2096 Possible metallo-beta-lactamase family UniRef100 Q638G1 Bacilius cereus Ymas. protein Bacilius cereus ZK) ZK. 2097 2098 YfmA 2099 YoCA 2100 UPIOOOO3CBE6E UniRef100 entry UniRef100 UPIOOOO3CBE6E YveM 2101 Hypothetical protein Bacilius cereus UniRef100 Q817C2 Bacilius cereus 2102 2103 Glycosyltransferase Bacilius UniRef100 Q9K7I1 Bacilius YtcC halodurans halodurans 2104 YozB protein Bacilius subtilis UniRef100 O31845 Bacilius subtiis YozB 2105 Limo2079 protein Listeria UniRef100 Q8YSI3 Listeria monocytogenes monocytogenes 2106 YocC Bacillus subtilis UniRef100 O35042 Bacilius subtiis YocC 2107 Na+/myo-inositol cotransporter Bacilius UniRef100 Q9KAR5 Bacilius YcgO halodurans halodurans 2108 Yoch Bacilius subtilis UniRef100 O34669 Bacilius subtiis Yoc 2109 RecQ homolog Bacilius subtilis UniRef100 O34748 Bacilius subtiis YocI US 8,168,417 B2 139 140 TABLE 1-continued

Predicted functions

Bacilius subtiis SEQ homolog D (Gene NO. Description OniRef Accession No. Organism Name) 2110 Hypothetical protein yabC Bacilius UniRef100 P45919 Bacilius subtiis YgbC subtilis 2111 Hypothetical protein yfB Bacilius UniRef100 O34438 Bacilius subtiis ubtilis 2112 Hypothetical protein yyaCR Bacilius UniRef100 P37507 Bacilius subtiis YyaO ubtilis 2113 Hypothetical protein OB2103 UniRef100 Q8EPJ8 Oceanobacilius iheyensis Oceanobacilius iheyensis 2114 2115 Hypothetical protein yigD Bacilius UniRef100 O34681 Bacilius subtiis YigD subtilis 2116 YigC 2117 YigC 2118 Hypothetical protein OB3361 UniRef100 Q8EL70 Oceanobacilius iheyensis Oceanobacilius iheyensis 2119 Hypothetical protein ypfA Bacilius UniRef100 P38491 Bacilius subtiis YpfA subtilis 2120 21.21 2122 YXB 2123 2124 2125 2126 YF 2127 2128 Hypothetical protein Bacilius cereus UniRef100 Q816C2 Bacilius cereus 2129 BHO185 protein Bacilius halodurans UniRef100 Q9KGB9 Bacilius halodurans 2130 21.31 Yisy 2132 Hypothetical protein yoqH UniRef100 O64117 Bacteriophage YoqH Bacteriophage SPBc2 SPBC2 21.33 2134 BHO429 protein Bacilius halodurans UniRef100 Q9KFP9 Bacilius YrhP halodurans 2135 30S ribosomal protein S14 UniRef100 Q8ETXO Oceanobacilius iheyensis Oceanobacilius iheyensis 2136 UPIOOOO2DEBB5 UniRef100 entry UniRef100 UPIOOOO2DEBB5 MutT 2137 Pyruvate water dikinase UniRef100 Q8TN35 Meihanosarcina Pps Methanosarcina acetivorans acetivorans 2138 Transcriptional regulator UniRef100 Q8ESJ8 Oceanobacilius YxbF Oceanobacilius iheyensis iheyensis 2139 2140 YndM protein Bacilius subtilis UniRef100 O31816 Bacilius subtiis YM 2141 Hypothetical protein yisT Bacilius UniRef100 OO7939 Bacilius subtiis YisT subtilis 2142 Putative acyl carrier protein UniRef100 O35022 Bacilius subtiis Yoc phosphodiesterase 1 Bacilius subtilis 2143 General stress protein 16O Bacilius UniRef100 P80872 Bacilius subtiis YocK subtilis 2144 2145 2146 Aldehyde dehydrogenase Bacilius UniRef100 O34660 Bacilius subtiis DhaS subtilis 2147 YbB protein Bacilius subtilis UniRef100 O31600 Bacilius subtiis YbB 2148 Aminoglycoside N6'-acetyltransferase UniRef100 Q81AT3 Bacilius cereus Yick Bacilius cereus 2149 Blasticidin S deaminase, putative UniRef100 Q81Y61 Bacilius anthracis Bacilius anthracis 21SO SqhC 2151 Probable Superoxide dismutase Fe UniRef100 O35023 Fe SodF Bacilius subtilis 2152 Stress response protein yvgO precursor UniRef100 O32211 Bacilius subtiis YvgO Bacilius subtilis 2153 Putative transporter Bacilius subtilis UniRef100 O34383 Bacilius subtiis YocR 2154 Putative transporter Bacilius subtilis UniRef100 O34524 Bacilius subtiis YocS 2155 Dihydrolipoyllysine-residue UniRef100 P16263 Bacilius subtiis OdhB Succinyltransferase component of 2 oxoglutarate dehydrogenase complex Bacilius subtilis 2156 2-oxoglutarate dehydrogenase E1 UniRef100 P23129 Bacilius subtiis Odha component Bacilius subtilis 2157 YojO protein Bacilius subtilis UniRef100 O31849 Bacilius subtiis Yojo US 8,168,417 B2 141 142 TABLE 1-continued

Predicted functions

Bacilius subtiis SEQ homolog D (Gene NO. Description OniRef Accession No. Organism Name) 2158 YoN protein Bacilius subtilis UniRef100 O31850 Bacilius subtiis YoN 2159 Hypothetical Superoxide dismutase-like UniRef100 O31851 Bacilius subtiis YoM protein yoM precursor Bacilius subtilis 2160 Hypothetical protein yojL precursor UniRef100 O31852 Bacilius subtiis LytF Bacilius subtilis 2161 Probable multidrug resistance protein UniRef100 O31855 Bacilius subtiis YojI norM (Na(+)/drug antiporter) Bacilius subtilis 2162 Hypothetical protein Bacillus cereus UniRef100 Q637Z8 Bacilius cereus YoC ZK) ZK. 2163 YojF protein Bacilius subtilis UniRef100 O31858 Bacilius subtiis YojF 2164 YoE Bacilius subtilis UniRef100 O68260 Bacilius subtiis 21 65 YoE protein Bacilius subtilis UniRef100 O31859 Bacilius subtiis YoE 2166 Hypothetical protein yozRBacilius UniRef100 Q7WY67 Bacilius subtiis YozR subtilis 2167 Yoa Bacilius subtilis UniRef100 O34918 Bacilius subtiis Yoa 21 68 2169 UPIOOOO3CBE9B UniRef100 entry UniRef100 UPIOOOO3CBE9E3 YwfD 2170 Hypothetical UPF0087 protein yodB UniRef100 O34844 Bacilius subtiis YodB Bacilius subtilis 2171 Putative NAD(P)H nitroreductase 12C UniRef100 P81102 Bacilius subtiis YoC Bacilius subtilis 2172 YolF Bacillus subtilis) UniRef100 O34842 Bacilius subtiis YodD 2173 UPIO000315ACC UniRef100 entry UniRef100 UPIOOOO315ACC 2174 YodF protein Bacilius subtilis UniRef100 O34745 Bacilius subtiis YodF 2175 IS1627s 1-related, transposase Bacilius UniRef100 Q7CMDO Bacilius anthracis str. A2012 anthracis str. A2012 2176 UPIOOOO3CCO69 UniRefl00 entry UniRef100 UPIOOOO3CCO69 2177 OrfRM1 Bacillus subtilis) UniRef100 O34666 Bacilius subtiis CtpA 2178 YolB Bacillus subtilis) UniRef100 O34954 Bacilius subtiis YooH 2179 2180 Carboxypeptidase Bacilius subtilis UniRef100 O34.866 Bacilius subtiis Yod 2181 Purine nucleoside phosphorylase II UniRef100 O34925 Bacilius subtiis DeoD Bacilius subtilis 2182 Hypothetical Membrane Spanning UniRef100 Q813PO Bacilius cereus YcgR Protein Bacilius cereus 2183 YegO protein Bacilius subtilis UniRef100 P94394 Bacilius subtiis YcgQ 21.84 2185 Hypothetical protein yodL Bacilius UniRef100 O30472 Bacilius subtiis YodL subtilis 2186 YodM 2187. Hypothetical protein yozD Bacilius UniRef100 O31863 Bacilius subtiis subtilis 2188 Hypothetical protein yodN Bacilius UniRef100 O34414 Bacilius subtiis YodN subtilis 21.89 2190 YokU Bacilius subtilis UniRef100 O30470 Bacilius subtiis 2191 Hypothetical UPF0069 protein yodo UniRef100 O34676 Bacilius subtiis KamA Bacilius subtiis 2192 YodP Bacillus subtilis) UniRef100 O34895 Bacilius subtiis YodP 2193 Acetylornitine deacetylase Bacilius UniRef100 O34984 Bacilius subtiis Yodo subtilis 2194 Butirate-acetoacetate CoA transferase UniRef100 O34466 Bacilius subtiis YodR Bacilius subtiis 2195 Butyrate acetoacetate-CoA transferase UniRef100 O34.317 Bacilius subtiis YodS Bacilius subtiis 2196. Probable aminotransferase yodT UniRef100 O34662 Bacilius subtiis YodT Bacilius subtiis 2197 Multidrug resistance protein: possible UniRef100 Q6HK46 Bacilius YkC etracycline resistance determinant thiringiensis Bacilius thuringiensis 2198 Protein cgeE Bacilius subtilis UniRef100 P42093 Bacilius subtiis CgeE 2199 Peptide methionine sulfoxide reductase UniRef100 P54155 Bacilius subtiis YppO msrB Bacilius subtilis 2200 MSrA 2201 Putative HTH-type transcriptional UniRef100 P54182 Bacilius subtiis YpoP regulatorypoP Bacilius subtilis 22O2 2203 Hypothetical protein yhcKBacilius UniRef100 P54595 Bacilius subtiis YCK subtilis 2204 Hypothetical protein ypnP Bacilius UniRef100 P54181 Bacilius subtiis YpnP subtilis US 8,168,417 B2 143 144 TABLE 1-continued

Predicted functions

Bacilius subtiis SEQ homolog ID (Gene NO. Description OniRef Accession No. Organism Name) 2205 Hypothetical conserved protein UniRef100 Q746K9 Thermus thermophilus Thermus thermophilus 22O6 2207 Hypothetical protein ypmS Bacilius UniRef100 P54179 Bacilius subtiis YpmS subtiis 2208 Hypothetical protein ypmR Bacilius UniRef100 P4O766 Bacilius subtiis YpmR subtiis 2209 Hypothetical protein ypmC Bacilius UniRef100 P54178 Bacilius subtiis YpmO subtiis 2210 DegV family protein Bacillus cereus UniRef100 Q63BU6 Bacilius cereus YviA ZK) ZK. 2211 Hypothetical protein ypmP Bacilius UniRef100 P54177 Bacilius subtiis subtiis 2212 Threonine dehydratase biosynthetic UniRef100 P37946 Bacilius subtiis IlwA Bacilius subtilis 2213 Putative sigma L-dependent UniRef100 P54156 Bacilius subtiis YbIP transcriptional regulatorypIP Bacilius subtiis 2214 Hemolysin III homolog Bacilius subtilis UniRef100 P54175 Bacilius subtiis Yplo 2215 Hypothetical protein ypkP Bacilius UniRef100 P54174 Bacilius subtiis YpkP subtiis 2216 Dihydrofolate reductase Bacilius UniRef100 P11045 Bacilius subtiis DfrA subtiis 2217 Hypothetical protein ypiO Bacilius UniRef100 P54173 Bacilius subtiis YpjQ subtiis 2218 YpiP 2219 YpiP 2220 Hypothetical protein yphP Bacilius UniRef100 P54170 Bacilius subtiis YphP subtiis 2221 Dihydroxy-acid dehydratase Bacilius UniRef100 P51785 Bacilius subtiis Iv) subtiis 2222 YpgR 2223 Hypothetical protein ypgQ Bacilius UniRef100 P54168 Bacilius subtiis YpgQ subtiis 2224 Glutathione peroxidase homolog bsa.A UniRef100 P52035 Bacilius subtiis BSaA Bacilius subtiis 2225 UPIOOOO3CBAOF UniRef100 entry UniRef100 UPIOOOO3CBAOF 2226 Homoserine O-succinyltransferase UniRef100 P54167 Bacilius subtiis MetA Bacilius subtiis 2227 Putative glycosyltransferaseypfP UniRef100 P54166 Bacilius subtiis UgtP Bacilius subtiis 2228 2229 Cold shock protein cspD Bacilius UniRef100 P51777 Bacilius subtiis subtiis 2230 Regulatory protein degR Bacilius UniRef100 PO6563 Bacilius subtiis subtiis 2231 Hypothetical protein ypZA Bacilius UniRef100 O32007 Bacilius subtiis subtiis 2232 2233 Hypothetical protein ypeP Bacilius UniRef100 P54164 Bacilius subtiis YpeP subtiis 2234 Hypothetical protein ypdP Bacilius UniRef100 P54163 Bacilius subtiis YpdP subtiis 2235 14.7kDa ribonuclease H-like protein UniRef100 P54162 Bacilius subtiis YpdO Bacilius subtiis 2236 Probable 5'-3' exonuclease Bacilius UniRef100 P54161 Bacilius subtiis YpcP subtiis 2237 2238 Hypothetical protein ypbS Bacilius UniRef100 P54160 Bacilius subtiis subtiis 2239 Hypothetical protein ypbRBacilius UniRef100 P54159 Bacilius subtiis YpbR subtiis 2240 2241 Hypothetical protein ypbQ Bacilius UniRef100 P54158 Bacilius subtiis YpbQ subtiis 2242 BcSA 2243 Predicted acetyltransferase Clostridium UniRef100 Q97GO3 Cliostridium YokL acetobiitvictim acetobiitvictim 2244 permease Bacillus subtilis UniRef100 P42086 Bacilius subtiis Pblx 224.5 Xanthine phosphoribosyltransferase UniRef100 P42085 Bacilius subtiis Xpt Bacilius subtilis US 8,168,417 B2 145 146 TABLE 1-continued

Predicted functions

Bacilius subtiis SEQ homolog ID (Gene NO. Description OniRef Accession No. Organism Name) 2246 Hypothetical metalloprotease ypwA UniRef100 P50848 Bacilius subtiis Yw A Bacilius subtilis 2247 YowA 2248 Hypothetical protein yptA precursor UniRef100 PSO841 Bacilius subtiis Bacilius subtilis 2249 2250 Hypothetical UPF0020 protein ypsC UniRef100 PSO840 Bacilius subtiis YSC precursor Bacilius subtilis 2251 Hypothetical protein ypsB Bacilius UniRef100 P50839 Bacilius subtiis subtilis 2252 Hypothetical protein ypsA Bacilius UniRef100 P50838 Bacilius subtiis YSA subtilis 2253 Spore coat protein D Bacillus subtilis UniRef100 PO7791 Bacilius subtiis 2254 2255 YbB 2256 YA 2257 Putative PTS system IIA component UniRef100 P50829 Bacilius subtiis YpgE ypgE Bacillus subtilis 2258 Hypothetical protein ypcA precursor UniRef100 P50836 Bacilius subtiis YpqA Bacilius subtilis 2259 Hypothetical protein yppG Bacilius UniRef100 P50835 Bacilius subtiis YppG subtilis 2260 2261 YppE 2262 2263 2264 Hypothetical protein yppC Bacilius UniRef100 P39791 Bacilius subtiis YppC subtilis 2265 Recombination protein ULBacilius UniRef100 P39792 Bacilius subtiis Rec subtilis 2266 Penicillin-binding protein 1A, 1B (PBP1) UniRef100 P39793 Includes: PonA Includes: Penicillin-insensitive Penicillin transglycosylase (EC 2.4.2.—) insensitive (Peptidoglycan TGase); Penicillin- transglycosylase sensitive transpeptidase (EC 3.4.——) (EC 2.4.2.—) (DD-transpeptidase) Bacilius subtilis (Peptidoglycan TGase); Penicillin sensitive transpeptidase (EC 3.4——) (DD transpeptidase) 2267 Hypothetical protein ypoC Bacilius UniRef100 P39789 Bacilius subtiis YpoC subtilis 2268 Probable endonuclease III (EC UniRef100 P39788 Bacilius subtiis Nth 4.2.99.18) (DNA-(apurinic or apyrimidinic site) lyase) Bacilitis subtilis 2269 DNA replication protein dinaD Bacilius UniRef100 P39787 Bacilius subtiis Dna) subtilis 2270 Asparaginyl-tRNA synthetase Bacilius UniRef100 P39772 Bacilius subtiis ASnS subtilis 2271 Aspartate aminotransferase Bacilius UniRef100 P53001 Bacilius subtiis AspB subtilis 2272 Hypothetical protein ypmB Bacilius UniRef100 P54396 Bacilius subtiis YpmB subtilis 2273 Hypothetical protein ypmA Bacilius UniRef100 P54395 Bacilius subtiis subtilis 2274 Probable ATP-dependent helicase dinG UniRef100 P54394 Bacilius subtiis DinG homolog Bacilius subtilis 2275 Aspartate 1-decarboxylase precursor UniRef100 P52999 Bacilius subtiis PanD Bacilius subtilis 2276 Pantoate--beta-alanine ligase Bacilius UniRef100 P52998 Bacilius subtiis PanC subtilis 2277 3-methyl-2-oxobutanoate UniRef100 P52996 Bacilius subtiis Pan B hydroxymethyltransferase Bacilius subtilis 2278 BirA bifunctional protein Includes: UniRef100 P42.975 Includes: Biotin BirA Biotin operon repressor; Biotin--acetyl operon CoA-carboxylase synthetase (EC repressor; Biotin-- 6.3.4.15) (Biotin-protein ligase) Bacilius subtilis US 8,168,417 B2 147 148 TABLE 1-continued

Predicted functions

Bacilius subtiis SEQ homolog ID (Gene NO. Description OniRef Accession No. Organism Name) 2279 Poly(A) polymerase Bacilius subtilis UniRef100 P42.977 Bacilius subtiis Cca 2280 Putative glycosyltransferaseypH UniRef100 P42982 Bacilius subtiis YpH Bacilius subtilis 2281 Hypothetical protein ypiG Bacilius UniRef100 P42.981 Bacilius subtiis YpG subtilis 2282 Methylglyoxal synthase Bacilius UniRef100 P42.980 Bacilius subtiis MgSA subtilis 2283 Dihydrodipicolinate reductase Bacilius UniRef100 P42976 Bacilius subtiis DapR subtilis 2284 Hypothetical protein yp D Bacilius UniRef100 P42979 Bacilius subtiis YpD subtilis 2285 YpC 2286 Hypothetical protein ypjB precursor UniRef100 P54393 Bacilius subtiis Yp B Bacilius subtilis 2287 YpjA 2288 QcrC 2289 Menaquinol-cytochrome c reductase UniRef100 P46912 Bacilius subtiis QcrB cytochrome b subunit Bacillus subtilis 2290 Menaquinol-cytochrome c reductase UniRef100 P46911 Bacilius subtiis QcrA iron-sulfur subunit Bacilius subtilis 2291 Hypothetical protein ypiF Bacilius UniRef100 P54391 Bacilius subtiis YpiF subtilis 2292 Hypothetical UPF0302 protein ypiB UniRef100 P54390 Bacilius subtiis YpiB Bacilius subtilis 2293 Hypothetical protein ypiA Bacilius UniRef100 P54389 Bacilius subtiis YpiA subtilis 2294 3-phosphoshikimate 1- UniRef100 P20691 Bacilius subtiis AroE carboxyvinyltransferase Bacilius subtilis 2295 Prephenate dehydrogenase Bacilius UniRef100 P20692 Bacilius subtiis Tyra subtilis 2296 HisC 2297 Tryptophan synthase alpha chain UniRef100 PO7601 Bacilius subtiis Trp-A Bacilius subtiis 2298 Tryptophan synthase beta chain UniRef100 PO7600 Bacilius subtiis TrpB Bacilius subtiis 2299 N-(5'-phosphoribosyl)anthranilate UniRef100 P20167 Bacilius subtiis TrpF isomerase Bacillus subtilis 2300 Indole-3-glycerol phosphate synthase UniRef100 PO3964 Bacilius subtiis TrpC Bacilius subtiis 2301 Anthranilate phosphoribosyltransferase UniRef100 PO3947 Bacilius subtiis TrpD Bacilius subtiis 2302 Anthranilate synthase component I UniRef100 PO3963 Bacilius subtiis TrpE Bacilius subtiis 2303 Chorismate mutase Bacillus subtilis UniRef100 P1908O Bacilius subtiis AroH 2304 3-dehydroquinate synthase Bacillus UniRef100 P31 102 Bacilius subtiis AroB subtilis 2305 Chorismate synthase Bacillus subtilis UniRef100 P31 104 Bacilius subtiis AroF 2306 Chemotaxis protein methyltransferase UniRef100 P31 105 Bacilius subtiis CheR Bacilius subtilis 2307 Nucleoside diphosphate kinase UniRef100 P31 103 Bacilius subtiis Ndk Bacilius subtilis 2308 Heptaprenyl diphosphate synthase UniRef100 P31114 Bacilius subtiis HepT component II Bacilius subtilis 2309 Menaquinone biosynthesis UniRef100 P31 113 Bacilius subtiis MenEH methyltransferase ubiE Bacilius subtilis 2310 Heptaprenyl diphosphate synthase UniRef100 P31 112 Bacilius subtiis HepS component I Bacilius subtilis 2311 Transcription attenuation protein mtrB UniRef100 P1946.6 Bacilius subtiis Bacilius subtilis 2312 GTP cyclohydrolase I Bacilius subtilis UniRef100 P19465 Bacilius subtiis Mtra 2313 DNA-binding protein HU 1 Bacilius UniRef100 PO8821 Bacilius subtiis subtilis 2314 Stage IV sporulation protein A Bacilius UniRef100 P35149 Bacilius subtiis SpoIVA subtilis 2315 Hypothetical protein yphF Bacilius UniRef100 P39911 Bacilius subtiis YphF subtilis 2316 Hypothetical protein yphE Bacilius UniRef100 P50744 Bacilius subtiis subtilis US 8,168,417 B2 149 150 TABLE 1-continued

Predicted functions

Bacilius subtiis SEQ homolog ID (Gene NO. Description UniRef Accession No. Organism Name) 2317 Glycerol-3-phosphate dehydrogenase UniRef100 P46919 NAD(P)+ GpSA NAD(P)+ (EC 1.1.1.94) (NAD(P)H- dependent glycerol-3-phosphate dehydrogenase) (NAD(P)H-dependent dihydroxyacetone-phosphate reductase) Bacilius subtilis 2318 YphC 2319 Hypothetical protein OB1798 UniRef100 Q8EQA7 Oceanobacilius iheyensis Oceanobacilius iheyensis 2320 Hypothetical protein yphB Bacilius UniRef100 P50742 Bacilius subtiis Sea.A subtilis 2321 Hypothetical protein yphA Bacilius UniRef100 P50741 Bacilius subtiis YphA subtilis 2322 2323 YpgA 2324 30S ribosomal protein S1 homolog UniRef100 P38494 Bacilius subtiis YpfD Bacilius subtilis 2325 Cytidylate kinase Bacilius subtilis UniRef100 P38493 Bacilius subtiis Cmk 2326 Hypothetical protein ypfB Bacilius UniRef100 P38492 Bacilius subtiis subtilis 2327 Sporulation protein ype.B Bacilius UniRef100 P38490 Bacilius subtiis YeB subtilis 2328 Spore cortex-lytic enzyme precursor UniRef100 P50739 Bacilius subtiis SeB Bacilius subtiis 2329 YC 2330 YccC 2331 Hypothetical protein ypdA Bacilius UniRef100 P50736 Bacilius subtiis YpdA subtiis 2332 NAD-specific glutamate dehydrogenase UniRef100 P50735 Bacilius subtiis GudB Bacilius subtiis 2333 Adapter protein mecA2 Bacilius UniRef100 P50734 Bacilius subtiis YbbH subtiis 2334 Hypothetical protein ypbG precursor UniRef100 P50733 Bacilius subtiis YbC Bacilius subtiis 2335 YbbF 2336 Hypothetical protein ypbE Bacilius UniRef100 P50731 Bacilius subtiis YbbE subtiis 2337 Hypothetical protein ypbD Bacilius UniRef100 P50730 Bacilius subtiis YbbD subtiis 2338 ATP-dependent DNA helicase recCR UniRef100 P50729 Bacilius subtiis RecQ Bacilius subtiis 2339 Hypothetical protein ypbB Bacilius UniRef100 P50728 Bacilius subtiis YbB subtiis 2340 Ferredoxin Bacilius subtilis UniRef100 P50727 Bacilius subtiis 2341 Hypothetical protein ypaA Bacilius UniRef100 P50726 Bacilius subtiis YaA subtiis 2342 2343 D-3-phosphoglycerate dehydrogenase UniRef100 P35136 Bacilius subtiis SerA Bacilius subtiis 2344 BH1600 protein Bacilius halodurans UniRef100 Q9KCH1 Bacilius halodurans 2345 Sigma-X negative effector Bacilius UniRef100 P35166 Bacilius subtiis RSIX subtilis 2346 RNA polymerase sigma factor sigX UniRef100 P35165 Bacilius subtiis SigX Bacilius subtiis 2347 Transcriptional regulator Bacilius UniRef100 Q72XJ3 Bacilius cereus LytR cereus 2348 Endo-1,4-beta-xylanase Bacilius UniRef100 Q9K630 Bacilius YeN halodurans halodurans 2349 2350 Alkaline phosphatase synthesis sensor UniRef100 Q898N3 Cliostridium tetani YcK protein phoR Clostridium tetani 2351 Response regulators consisting of a UniRef100 Q8R9H7 Thermoanaerobacter YycF Chey-like receiver domain and a HTH tengcongensis DNA-binding domain Thermoanaerobacter tengcongensis 2352 Sensor protein resE Bacilius subtilis UniRef100 P35164 Bacilius subtiis ResE 2353 Transcriptional regulatory protein resD UniRef100 P35163 Bacilius subtiis ResD Bacilius subtilis 2354 Protein resC Bacilius subtilis UniRef100 P35162 Bacilius subtiis ReSC 2355 ResB protein Bacilius subtilis UniRef100 P35161 Bacilius subtiis ResB 2356 Thiol-disulfide oxidoreductase res.A UniRef100 P35160 Bacilius subtiis ResA Bacilius subtilis US 8,168,417 B2 151 152 TABLE 1-continued

Predicted functions

Bacilius subtiis SEQ homolog ID (Gene NO. Description OniRef Accession No. Organism Name) 2357 Ribosomal large subunit pseudouridine UniRef100 P351.59 Bacilius subtiis RuB synthase B Bacilius subtilis 2358 Spore maturation protein B Bacillus UniRef100 P35158 Bacilius subtiis SpmB subtilis 2359 Spore maturation protein A Bacillus UniRef100 P35157 Bacilius subtiis SpmA subtilis 2360 Penicillin-binding protein 5* precursor UniRef100 P35150 Bacilius subtiis DacB Bacilius subtilis 2361 Hypothetical protein ypul Bacilius UniRef100 P351.56 Bacilius subtiis YpuI subtilis 2362 Segregation and condensation protein UniRef100 P35155 Bacilius subtiis YpuH B Bacilius subtilis 2363 Segregation and condensation protein UniRef100 P35154 Bacilius subtiis YpuG A Bacilius subtilis 2364 2365 Hypothetical protein ypuF Bacilius UniRef100 P17617 Bacilius subtiis YpuF subtilis 2366 RibT protein Bacilius subtilis UniRef100 P17622 Bacilius subtiis RibT 2367 6,7-dimethyl-8-ribityllumazine synthase UniRef100 Q44681 Bacilius RibH Bacilius amyloiquefaciens amyloiquefaciens 2368 Riboflavin biosynthesis protein ribA UniRef100 P17620 Includes: GTP RibA Includes: GTP cyclohydrolase II (EC cyclohydrolase II 3.5.4.25): 3,4-dihydroxy-2-butanone 4- (EC 3.5.4.25); phosphate synthase (DHBP synthase) 3,4-dihydroxy-2- Bacilius subtilis butanone 4 phosphate synthase (DHBP synthase) 2369 Riboflavin synthase alpha chain UniRef100 P16440 Bacilius subtiis RibE Bacilius subtilis 2370 Riboflavin biosynthesis protein ribD UniRef100 P17618 Includes: RibD Includes: Diaminohydroxyphosphoribosyl Diaminohydroxyphosphoribosylamino- aminopyrimidine deaminase pyrimidine deaminase (EC 3.5.4.26) (EC 3.5.4.26) (Riboflavin-specific deaminase); 5- (Riboflavin amino-6-(5-phosphoribosylamino)uracil specific reductase (EC 1.1.1.193) (HTP deaminase); 5 reductase) Bacilius subtilis amino-6-(5- phosphoribosylamino)uracil reductase (EC 1.1.1.193) (HTP reductase) 2371 Hypothetical protein ypuD Bacilius UniRef100 P17616 Bacilius subtiis YpulD subtilis 2372 Putative serine/threonine protein UniRef100 Q7 VFC1 Helicobacter YibP phosphatase Helicobacter hepaticits hepaticits 2373 Stress response homolog Hsp Bacilius UniRef100 Q9X3Z5 Bacilius subtiis subtilis 2374 Response regulator aspartate UniRef100 Q00828 Bacilius subtiis Rap A phosphatase A Bacilius subtilis 2375 Peptidyl-prolyl cis-trans isomerase B UniRef100 P35137 Bacilius subtiis PpiB Bacilius subtilis 2376 YpuA 2377 IS1627s 1-related, transposase Bacilius UniRef100 Q7CMDO Bacilius anthracis str. A2012 anthracis str. A2012 2378 UPIOOOO3CCO69 UniRefl00 entry UniRef100 UPIOOOO3CCO69 23.79 YndL 2380 Diaminopimelate decarboxylase UniRef100 P23630 Bacilius subtiis LySA Bacilius subtilis 2381 2382 Stage V sporulation protein AFBacilius UniRef100 P31845 Bacilius subtiis SpoVAF subtilis 2383 Stage V sporulation protein AE Bacilius UniRef100 P40870 Bacilius subtiis SpoVAE subtilis 2384 Stage V sporulation protein AE Bacilius UniRef100 P40870 Bacilius subtiis SpoVAE subtilis 2385 Stage V sporulation protein AD UniRef100 P40869 Bacilius subtiis SpoVAD Bacilius subtilis 2386 SpoVAC 2387 Stage V sporulation protein ABBacilius UniRef100 P40867 Bacilius subtiis SpoVAB subtilis US 8,168,417 B2 153 154 TABLE 1-continued

Predicted functions

Bacilius subtiis SEQ homolog ID (Gene NO. Description UniRef Accession No. Organism Name) 2388 Stage V sporulation protein AA Bacilius UniRef100 P40866 Bacilius subtiis SpoVAA subtilis 2389 SigF 2390 SpoIIAB 2391 SpoIAA 2392 Penicillin-binding protein dacF UniRef100 P38422 Bacilius subtiis DacF precursor Bacillus subtilis 2393 Purine nucleoside phosphorylase I UniRef100 P46354 Bacilius subtiis PunA Bacilius subtilis 2394 Phosphopentomutase Bacilius subtilis UniRef100 P46353 Bacilius subtiis Drm 2395 Tyrosine recombinase xerD Bacilius UniRef100 P46352 Bacilius subtiis RipX subtilis 2396 2397 Ferric uptake regulation protein UniRef100 P54574 Bacilius subtiis Fur Bacilius subtilis 2398 Stage II sporulation protein M Bacilius UniRef100 P37873 Bacilius subtiis SpoIIM subtiis 2399 Limo2763 protein Listeria UniRef100 Q926Y4 Listeria YO monocytogenes monocytogenes 2400 UPIOOOO3CA2F1 UniRef100 entry UniRef100 UPIOOOO3CA2F1 2401 Probable allantoin permease Bacilius UniRef100 P94575 Bacilius subtiis YWOE subtiis 24O2 Yda 2403 Hypothetical protein yokK Bacilius UniRef100 P54573 Bacilius subtiis subtiis 2404 YybD 2405 Hypothetical protein Bacilius pumilus UniRef100 Q93PN4 Bacilius pumilus YoxK 2406 ADP-ribose pyrophosphatase Bacillus UniRef100 P54570 Bacilius subtiis NudF subtiis 24O7 2408 YdgC protein Bacilius subtilis UniRef100 P96701 Bacilius subtiis YdgC 2409 Hypothetical protein yogD Bacilius UniRef100 P96702 Bacilius subtiis YdgD subtiis 2410 Hypothetical oxidoreductase yokF UniRef100 P54569 Bacilius subtiis YokF Bacilius subtiis 2411 2412 Yok) 2413 Hypothetical protein yokC Bacilius UniRef100 P54566 Bacilius subtiis subtilis 2414 Hypothetical protein yokB Bacilius UniRef100 P54565 Bacilius subtiis YokB subtilis 2415 UPIOOOO3CBE2B UniRef100 entry UniRef100 UPIOOOO3CBE2B 2416 Hypothetical UPFO157 protein yokA UniRef100 P54564 Bacilius subtiis YokA Bacilius subtiis 2417 Hypothetical protein yd Z. Bacilius UniRef100 P54563 Bacilius subtiis YdZ subtilis 2418 Lipase precursor Bacilius subtilis UniRef100 P37957 Bacilius subtiis Lip 2419 2420 Nickel ABC transporter Bacilius UniRef100 Q9KBX8 Bacilius AppA halodurans halodurans 2421 Nickel ABC transporter Bacilius UniRef100 Q9KBX7 Bacilius AppB halodurans halodurans 2422 Nickel ABC transporter Bacilius UniRef100 Q9KBX6 Bacilius AppC halodurans halodurans 2423 Putative oligopeptide ABC transporter UniRef100 Q895 A4 Cliostridium tetani DppD Clostridium tetani 2424 Putative oligopeptide ABC transporter UniRef100 Q895A5 Cliostridium tetani OppF Clostridium tetani 2425 DNA-damage-inducible protein Bacilius UniRef100 Q9KDR1 Bacilius YoI halodurans halodurans 2426 Fibronectin-binding protein Bacilius UniRef100 Q63CW1 Bacilius cereus ZK cereus ZK) 2427 Hypothetical protein yolD UniRef100 O64030 Bacteriophage YoD Bacteriophage SPBc2 SPBC2 2428 UvrX 2429 Hypothetical protein yaZH Bacilius UniRef100 O32014 Bacilius subtiis subtilis 2430 Hypothetical transport protein yoV UniRef100 P54559 Bacilius subtiis YgiV Bacilius subtilis 2431 Hypothetical protein yoTBacilius UniRef100 P54557 Bacilius subtiis YcT subtilis 2432 CoaA US 8,168,417 B2 155 156 TABLE 1-continued

Predicted functions

Bacilius subtiis SEQ homolog ID (Gene NO. Description OniRef Accession No. Organism Name) 2433 LacG Lactococcus lactis UniRef100 Q9RAU9 Lactococci is lactis 2434 LacF Lactococcus lactis UniRef100 Q9RAV2 Lactococcits Ydb iaciis 2435 2436 Hypothetical protein Bacilius UniRef100 Q7OKO7 Bacilius amyloiiquefaciens amyloiquefaciens 2437 Hypothetical protein Bacilius cereus UniRef100 Q737G3 Bacilius cereus 2438 YmaC protein Bacilius subtilis UniRef100 O31789 Bacilius subtiis YmaC 2439 Hypothetical oxidoreductase yajQ UniRef100 P54554 Bacilius subtiis YajQ Bacilius subtiis 2440 YoP 2441 Pro 2442 Hypothetical protein yoN Bacilius UniRef100 P54551 Bacilius subtiis YgiN subtilis 2443 Probable NADH-dependent flavin UniRef100 P54550 Bacilius subtiis YgM oxidoreductase yajM Bacilius subtilis 2444 Hypothetical protein yajL. Bacilius UniRef100 P54549 Bacilius subtiis YajL subtilis 2445 2446 Ribonuclease Z. Bacilius subtilis UniRef100 P54548 Bacilius subtiis YaK 2447 Glucose-6-phosphate 1-dehydrogenase UniRef100 P54547 Bacilius subtiis Zwf Bacilius subtiis 2448 Hypothetical conserved protein UniRef100 Q8ELZ9 Oceanobacilius Cit Oceanobacilius iheyensis iheyensis 2449 Hypothetical protein OB3065 UniRef100 Q8ELZ8 Oceanobacilius iheyensis Oceanobacilius iheyensis 2450 Immunogenic protein Oceanobacillus UniRef100 Q8ELZ7 Oceanobacilius SSuA iheyensis iheyensis 2451 Yaho protein LBacillus subtilis UniRef100 OO5509 Bacilius subtiis Ydhq 2452 Beta-glucosidase Bacilius halodurans UniRef100 Q9K615 Bacilius YdhP halodurans 2453 Putative cellobiose-specific enzyme IIC UniRef100 Q8KP28 Bacilius pumilus YO Bacilius pumilus 2454 YdhO protein Bacillus subtilis UniRef100 OO5507 Bacilius subtiis YO 2455 YdhN protein Bacillus subtilis UniRef100 OO5506 Bacilius subtiis YdhN 2456 PTS system, cellobiose-specific UniRef100 Q9K613 Bacilius halodurans enzyme II, B component Bacilius halodurans 2457 Alkaline phosphatase Bacilius cereus UniRef100 Q639W1 Bacilius cereus PhoB ZK) ZK. 2458 Glucose 1-dehydrogenase A Bacilius UniRef100 P10528 Bacilius Gdh negaterium negaterium 2459 6-phosphogluconate dehydrogenase, UniRef100 P80859 Bacilius subtiis Yd I decarboxylating II Bacilius subtilis 2460 DNA polymerase IV 1 Bacilius subtilis UniRef100 P54545 Bacilius subtiis YaH 2461 Hypothetical protein yaZJ Bacillus UniRef100 Q7WY64 Bacilius subtiis subtiis 2462 YajG 2463 Hypothetical protein yd E. Bacilius UniRef100 P54542 Bacilius subtiis YgE subtiis 2464 Methylmalonyl-CoA decarboxylase UniRef100 Q9K8P6 Bacilius YgiD alpha subunit Bacilius halodurans halodurans 2465 Hypothetical protein yoA Bacilius UniRef100 P54538 Bacilius subtiis Yoja subtiis 2466 Probable amino-acid ABC transporter UniRef100 P54537 Bacilius subtiis YZ ATP-binding protein yai Z. Bacilius subtiis 2467 Probable amino-acid ABC transporter UniRef100 P54536 Bacilius subtiis YoY permease protein ydiY Bacilius subtilis 2468 Probable amino-acid ABC transporter UniRef100 P54535 Bacilius subtiis YX extracellular binding protein yaiX precursor Bacillus subtilis 2469 Hypothetical protein yai W Bacilius UniRef100 P54534 Bacilius subtiis YW subtiis 2470 Protein bmrU Bacilius subtilis UniRef100 P39074 Bacilius subtiis Bimr 2471 Lipoamide acyltransferase component UniRef100 P37942 Bacilius subtiis BkdB of branched-chain alpha-keto acid dehydrogenase complex (EC 2.3.1.168) (Dihydrolipoyllysine-residue (2- methylpropanoyl)transferase) Bacilius subtilis US 8,168,417 B2 157 158 TABLE 1-continued

Predicted functions

Bacilius subtiis SEQ homolog ID (Gene NO. Description OniRef Accession No. Organism Name) 2472 2-oxoisovalerate dehydrogenase beta Oni OO P37941 Bacilius subtiis BkdAB subunit Bacilius subtilis 2473 2-oxoisovalerate dehydrogenase alpha Oni OO P37940 Bacilius subtiis BkoilAA subunit Bacilius subtilis 2474 Dihydrolipoyl dehydrogenase Bacilius Oni OO P54533 Bacilius subtiis LpdV subtiis 2475 Probable butyrate kinase Bacilius Oni OO P54532 Bacilius subtiis Buk subtiis 2476 Leucine dehydrogenase Bacilius Oni OO P54531 Bacilius subtiis Bcd subtiis 2477 Probable phosphate butyryltransferase Oni OO P54530 Bacilius subtiis Ptb Bacilius subtilis 2478 Putative sigma L-dependent Oni OO P54529 Bacilius subtiis BkdR transcriptional regulatoryqiRBacilius subtiis 2479 Hypothetical protein yaZF Bacilius Oni OO O32015 Bacilius subtiis subtiis 2480 Hypothetical protein ydiK Bacilius Oni OO P54527 Bacilius subtiis YgiK subtiis 2481 Hypothetical protein ydi I Bacilius Oni OO P54525 Bacilius subtiis YgiI subtiis 2482 YgiH 2483 Serine O-acetyltransferase Oni OO Q8PSY4 Meihanosarcina CysE Methanosarcina mazei mazei 2484 2485 Stage Osporulation protein A Bacilius Oni OO PO6534 Bacilius subtiis Spo0A subtilis 2486 SpoIVB 2487 DNA repair protein recNLBacilius Oni 00 Q659H4 Bacilius RecN amyloiquefaciens amyloiquefaciens 2488 Arginine repressor Bacilius subtilis Oni OO P17893 Bacilius subtiis AhrC 2489 Hypothetical protein yoxC Bacilius Oni OO P19672 Bacilius subtiis YaxC subtilis 2490 DxS 2491 Geranyltranstransferase Bacilius Oni OO P54383 Bacilius subtiis YgiD subtilis 2492 Probable exodeoxyribonuclease VII Oni OO P54522 Bacilius subtiis small subunit Bacilius subtilis 2493 Probable exodeoxyribonuclease VII Oni OO P54521 Bacilius subtiis YgiB large subunit Bacilius Subtilis 2494 FolD bifunctional protein Includes: Oni OO P54382 Includes: FoD Methylenetetrahydrofolate Methylenetetrahydrofolate dehydrogenase (EC 1.5.1.5); dehydrogenase Methenyltetrahydrofolate (EC 1.5.1.5); cyclohydrolase (EC 3.5.4.9) Bacilius Methenyltetrahydrofolate subtilis cyclohydrolase (EC 3.5.4.9) 2495 N utilization substance protein B Oni P54520 Bacilius subtiis NuSB homolog Bacilius subtilis 2496 Hypothetical protein yahY Bacilius Oni P54519 Bacilius subtiis YahY subtilis 2497 AccC 2498 Biotin carboxyl carrier protein of acetyl- Oni P49786 Bacilius subtiis AccB CoA carboxylase Bacilius subtilis 2499 Stage III sporulation protein AH Oni P49785 Bacilius subtiis SpoIIIAH Bacilius subtilis 2500 Stage III sporulation protein AG Oni P49784 Bacilius subtiis SpoIILAG Bacilius subtilis 2501 Stage III sporulation protein AF Oni P49783 Bacilius subtiis SpoIIIAF Bacilius subtilis 2502 Stage III sporulation protein AE Oni P49782 Bacilius subtiis SpoIIIAE Bacilius subtilis 2503 Stage III sporulation protein AD Oni P49781 Bacilius subtiis SpoIIIAD Bacilius subtilis 2504 Stage III sporulation protein AC Oni P4978O Bacilius subtiis Bacilius subtilis 2505 Stage III sporulation protein AB Oni OO Q01368 Bacilius subtiis SpoIIIAB Bacilius subtilis 2SO6 SpoIIAA 2507 Hypothetical protein yahV Bacilius Oni OO P49779 Bacilius subtiis subtilis 2508 Elongation factor P Bacilius subtilis Oni OO P49778 Bacilius subtiis Efp US 8,168,417 B2 159 160 TABLE 1-continued

Predicted functions

Bacilius subtiis SEQ homolog ID (Gene NO. Description OniRef Accession No. Organism Name) 2509 Putative peptidase yahT Bacilius UniRef100 P54518 Bacilius subtiis YT subtiis 2510 3-dehydroquinate dehydratase Bacilius UniRef100 P54517 Bacilius subtiis YS subtiis 2511 Hypothetical protein yahRBacilius UniRef100 P54516 Bacilius subtiis YOR subtiis 2512 Hypothetical protein yaho Bacilius UniRef100 P54515 Bacilius subtiis Yaho subtiis 2513 Hypothetical protein yahP Bacilius UniRef100 P54514 Bacilius subtiis YP subtiis 2514 Hypothetical protein yaho Bacilius UniRef100 P54513 Bacilius subtiis YO subtiis 2515 Transcriptional regulator mintRBacillus UniRef100 P54512 Bacilius subtiis MntR subtiis 2516 Hypothetical protein yahM Bacilius UniRef100 P54511 Bacilius subtiis YM subtiis 2517 Hypothetical protein yah L. Bacilius UniRef100 P54510 Bacilius subtiis Ygh subtiis 2518 Glycine betaine-binding protein UniRef100 P46922 Bacilius subtiis OpuaC precursor Bacillus subtilis 2519 Glycine betaine transport system UniRef100 P46921 Bacilius subtiis Opua B permease protein opuABBacilius subtiis 2520 Glycine betaine transport ATP-binding UniRef100 P46920 Bacilius subtiis Opua. A protein opuAA Bacilius subtilis 2521 Probable glycine dehydrogenase UniRef100 P54377 decarboxylating GcyPB decarboxylating Subunit 2 Bacilius subtiis 2522 Probable glycine dehydrogenase UniRef100 P54376 decarboxylating GowPA decarboxylating Subunit 1 Bacilius subtiis subtiis 2523 Aminomethyltransferase Bacilius UniRef100 P54378 Bacilius subtiis GowT subtiis 2524 Hypothetical helicase yohH Bacilius UniRef100 P54509 Bacilius subtiis YghH subtiis 2525 Hypothetical protein yaho Bacilius UniRef100 P54508 Bacilius subtiis Yaho subtiis 2S26 2527 SinR 2528 Spore coat-associated protein N UniRef100 P54507 Bacilius subtiis TaSA Bacilius subtilis 2529 Signal peptidase IW Bacillus subtilis UniRef100 P54506 Bacilius subtiis SipW 2530 Hypothetical protein yoxM Bacilius UniRef100 P4O949 Bacilius subtiis YaxM subtilis 2531 YazG protein Bacilius subtilis UniRef100 O32019 Bacilius subtiis YazG 2532 YdZE protein Bacilius subtilis UniRef100 O32020 Bacilius subtiis 2533 ComGG 2S34 ComGF 2535 ComG operon protein 5 precursor UniRef100 P25957 Bacilius subtiis ComGE Bacilius subtiis 2536 ComG operon protein 4 precursor UniRef100 P25956 Bacilius subtiis ComGD Bacilius subtiis 2537 ComG operon protein 3 precursor UniRef100 P25955 Bacilius subtiis Bacilius subtiis 2.538 ComGB 2539 ComG operon protein 1 Bacilius UniRef100 P25953 Bacilius subtiis ComGA subtilis 2540 Hypothetical protein yahA Bacilius UniRef100 P54504 Bacilius subtiis YahA subtilis 2541 Hypothetical protein yogZ Bacilius UniRef100 P54503 Bacilius subtiis YogZ subtilis 2542 Hypothetical protein yogY Bacilius UniRef100 P545O2 Bacilius subtiis subtilis 2543 Hypothetical protein yagX Bacilius UniRef100 P545O1 Bacilius subtiis YdgX subtilis 2544 Hypothetical protein yogW Bacilius UniRef100 P54500 Bacilius subtiis subtilis 2545 Hypothetical protein Staphylococcus UniRef100 Q8CSE8 Staphylococci is epidermidis epidermidis US 8,168,417 B2 161 162 TABLE 1-continued

Predicted functions

Bacilius subtiis SEQ homolog ID (Gene NO. Description OniRef Accession No. Organism Name) 2547 Hypothetical protein yogT Bacilius UniRef100 P54497 Bacilius subtiis YdgT subtiis 2548 Ferrichrome-binding protein precursor UniRef100 P37580 Bacilius subtiis FhuD Bacilius subtiis 2549 Hypothetical protein yogS Bacilius UniRef100 P54496 Bacilius subtiis YdgS subtiis 2550 Glucokinase Bacilius subtilis UniRef100 P54495 Bacilius subtiis GlcK 2551 Hypothetical protein yogO Bacilius UniRef100 P54494 Bacilius subtiis subtiis 2552 Stage V sporulation protein AF UniRef100 Q8EQ08 Oceanobacilius SpoVAF Oceanobacilius iheyensis iheyensis 2553 Hypothetical protein yogP Bacilius UniRef100 P54493 Bacilius subtiis YdgP subtiis 2554 Hypothetical protein yogO Bacilius UniRef100 P54492 Bacilius subtiis tibiis 2555 Hypothetical protein yogN Bacilius UniRef100 P54491 Bacilius subtiis YdgN subtiis 2556 2557 YdgM 2558 Hypothetical protein yogL Bacilius UniRef100 P54489 Bacilius subtiis YogL subtiis 2559 YZD 2560 YazC protein Bacillus subtilis UniRef100 O32023 Bacilius subtiis YZ.C 2561 Phosphate import ATP-binding protein UniRef100 P46342 Bacilius subtiis PstBB pstB 1 Bacilius subtilis 2562 Phosphate import ATP-binding protein UniRef100 P46341 Bacilius subtiis PstBA pstB 2 Bacilius subtilis 2563 Probable ABC transporter permease UniRef100 P46340 Bacilius subtiis PstA protein yaqI Bacillus subtilis 2564 Probable ABC transporter permease UniRef100 P46339 Bacilius subtiis PstC protein yogH Bacilius subtilis 2565 Probable ABC transporter binding UniRef100 P46338 Bacilius subtiis PstS protein yogG precursor Bacilius subtilis 2566 Hypothetical protein yogF Bacilius UniRef100 P54488 Bacilius subtiis PbbA subtilis 2567 Hypothetical protein yagE Bacilius UniRef100 P54487 Bacilius subtiis YdgE subtilis 2S68 SodA 2569 Hypothetical protein yogC Bacilius UniRef100 P54486 Bacilius subtiis YdgC subtilis 2570 Hypothetical protein yogB Bacilius UniRef100 P54485 Bacilius subtiis YdgB subtilis 2571 Hypothetical protein ydf7, Bacilius UniRef100 P54483 Bacilius subtiis subtilis 2572 4-hydroxy-3-methylbut-2-en-1-yl UniRef100 P54482 Bacilius subtiis Ydify diphosphate synthase (EC 1.17.4.3) (1- hydroxy-2-methyl-2-(E)-butenyl 4 diphosphate synthase) Bacilius subtilis 2573 Hypothetical protein yafX Bacilius UniRef100 P54481 Bacilius subtiis Ydfx subtilis 2574 Putative nucleotidase ydfW Bacilius UniRef100 P54480 Bacilius subtiis YdfW subtilis 2575 Zinc-specific metalloregulatory protein UniRef100 P54479 Bacilius subtiis Zur Bacilius subtilis 2576 Metal (Zinc) transport protein Listeria UniRef100 Q926D9 Listeria innoctia YceA innoctia 2577 YccI 2578 Hypothetical protein ydfl J Bacilius UniRef100 P54.478 Bacilius subtiis Yafu subtilis 2579 2580 Probable endonuclease IV Bacilius UniRef100 P54476 Bacilius subtiis YafS subtilis 2581 Probable RNA helicase yafR Bacillus UniRef100 P54475 Bacilius subtiis YafR subtilis 2582 Hypothetical protein yafG Bacilius UniRef100 P54474 Bacilius subtiis Yafo subtilis 2583 4-hydroxy-3-methylbut-2-enyl UniRef100 P54473 Bacilius subtiis YafP diphosphate reductase Bacilius subtilis Yafo US 8,168,417 B2 163 164 TABLE 1-continued

Predicted functions

Bacilius subtiis SEQ homolog ID (Gene NO. Description UniRef Accession No. Organism Name) 2585 Hypothetical protein ydfN Bacilius UniRef100 P54471 Bacilius subtiis YafN subtilis 2586 YwqL protein Bacilius subtilis UniRef100 P96724 Bacilius subtiis YwqL 2587 2S88 2589 2590 2591 Hypothetical protein CACO336 UniRef100 Q97M62 Clostridium acetobutyllicum Clostridium acetobutyllicum 2592 Hypothetical protein Bacilius UniRef100 Q6HGF2 Bacilius thuringiensis thiringiensis 2593 2594 YwqJ protein Bacillus subtilis UniRef100 P96722 Bacilius subtiis YwqJ 2595 Hypothetical protein ywd I Bacilius UniRef100 P96721 Bacilius subtiis subtilis 2596 YwqH 2597 Cytochrome c-550 Bacilius subtilis UniRef100 P24469 Bacilius subtiis CccA 2598 SigA 2599 DNA primase Bacillus subtilis UniRef100 PO5096 Bacilius subtiis DnaG 2600 Hypothetical UPFO178 protein yaxD UniRef100 P17868 Bacilius subtiis YaxD Bacilius subtiis 2601 Hypothetical UPF0085 protein yafL UniRef100 P54470 Bacilius subtiis YafL Bacilius subtiis 2602 YdZB protein Bacillus subtilis UniRef100 O34994 Bacilius subtiis YazB 2603 Glycyl-tRNA synthetase beta chain UniRef100 P54381 Bacilius subtiis GlyS Bacilius subtiis 2604 Glycyl-tRNA synthetase alpha chain UniRef100 P54380 Bacilius subtiis Glyo Bacilius subtiis 2605 DNA repair protein recCLBacilius UniRef100 P42095 Bacilius subtiis RecC) subtilis 2606 2607 GTP-binding protein era homolog UniRef100 P42182 Bacilius subtiis Era Bacilius subtiis 2608 deaminase Bacilius subtilis UniRef100 P19079 Bacilius subtiis Cdd 2609 2610 Hypothetical UPF0054 protein yafG UniRef100 P46347 Bacilius subtiis Yof Bacilius subtiis 2611 Hypothetical protein ydfF Bacilius UniRef100 P46344 Bacilius subtiis YdfF subtilis 2612 PhoH-like protein Bacilius subtilis UniRef100 P46343 Bacilius subtiis POEH 2613 Yof D 2614 Hypothetical protein ydfC Bacilius UniRef100 P54468 Bacilius subtiis subtilis 2615 Hypothetical protein ydfB Bacilius UniRef100 P54467 Bacilius subtiis YofE ubtilis 2616 Hypothetical protein ydfA Bacilius UniRef100 P54466 Bacilius subtiis YafA subtilis 26.17 YdeZ 2618 Hypothetical protein ydeY Bacilius UniRef100 P54464 Bacilius subtiis YeY subtilis 2619 2620 Hypothetical protein ydeW Bacilius UniRef100 P54463 Bacilius subtiis YeW subtilis 2621 Deoxyribose-phosphate aldolase UniRef100 Q92A19 Listeria innoctia Dra Listeria innoctia 2622 Hypothetical UPF0004 protein yaeV UniRef100 P54.462 Bacilius subtiis YeV Bacilius subtiis 2623 Hypothetical UPF0088 protein yaeU UniRef100 P54461 Bacilius subtiis YdeU Bacilius subtiis 2624 Ribosomal protein L11 UniRef100 P54460 Bacilius subtiis Yde.T methyltransferase Bacilius subtilis 2625 Chaperone protein dna Bacilius UniRef100 P17631 Bacilius subtiis DnaJ subtilis 2626 DnaK 2627 2628 2629 Heat-inducible transcription repressor UniRef100 P25499 Bacilius subtiis HrcA hircA Bacilius subtilis 2630 Probable oxygen-independent UniRef100 PS4304 Bacilius subtiis HemN coproporphyrinogen III oxidase Bacilius subtilis 2631 LepA US 8,168,417 B2 165 166 TABLE 1-continued

Predicted functions

Bacilius subtiis SEQ homolog ID (Gene NO. Description OniRef Accession No. Organism Name) 2632 Hypothetical protein yoxA Bacilius UniRef100 P38425 Bacilius subtiis YaxA subtiis 2633 SpoIIP 2634 Germination protease precursor UniRef100 P22322 Bacilius subtiis Gpr Bacilius subtilis 2635 30S ribosomal protein S20 Bacilius UniRef100 P21477 Bacilius subtiis subtiis 2636 Hypothetical protein ydeNBacilius UniRef100 P54459 Bacilius subtiis YgeN subtiis 2637 2638 ComE operon protein 3 Bacilius UniRef100 P39695 Bacilius subtiis ComEC subtiis 2639 ComE operon protein 2 Bacilius UniRef100 P32393 Bacilius subtiis ComEB subtiis 2640 ComE operon protein 1 Bacilius UniRef100 P39694 Bacilius subtiis ComEA subtiis 2641 ComE operon protein 4 Bacilius UniRef100 P39696 Bacilius subtiis ComER subtiis 2642 Hypothetical protein ydeMLBacilius UniRef100 P54458 Bacilius subtiis YgeM subtiis 2643 Hypothetical protein ydeL Bacilius UniRef100 P54457 Bacilius subtiis YdeL subtiis 2644 Hypothetical protein yaeK Bacilius UniRef100 P54456 Bacilius subtiis YgeK subtiis 2645 Nicotinate-nucleotide UniRef100 P54455 Bacilius subtiis Yge.J. adenylyltransferase (EC 2.7.7.18) (Deamido-NAD(+) pyrophosphorylase) (Deamido-NAD(+) diphosphorylase) Bacilius subtilis 2646 Hypothetical UPF0044 protein yaeI UniRef100 P54454 Bacilius subtiis Bacilius subtilis 2647 Shikimate dehydrogenase Bacilius UniRef100 P54374 Bacilius subtiis AroD subtilis 2648 Hypothetical protein yde.H. Bacilius UniRef100 P54453 Bacilius subtiis YgeH ubtilis 2649 Hypothetical protein ydeG Bacilius UniRef100 P54452 Bacilius subtiis YgeO subtilis 26SO 2651 2652 Hypothetical lipoprotein yaeF precursor UniRef100 P54451 Bacilius subtiis YgeF Bacilius subtilis 2653 Acetyltransferase, GNAT family UniRef100 Q81 KW 8 Bacilius YdfB Bacilius anthracis anthracis 2654. Hypothetical protein yrhF Bacilius UniRef100 OO5398 Bacilius subtiis YrhE subtilis 2655 Formate dehydrogenase chain A UniRef100 OO5397 Bacilius subtiis YrhE Bacilius subtilis 2656 Hypothetical protein yrhD Bacilius UniRef100 OO5396 Bacilius subtiis YrhD subtilis 2657 2658 RNA polymerase sigma-K factor UniRef100 P12254 Bacilius subtiis SpoIIIC precursor Bacillus subtilis 2659 YCB 2660 BH2157 protein Bacilius halodurans UniRef100 Q9KAX9 Bacilius Yua halodurans 2661 2662 2663 Alanyl-tRNA synthetase family protein UniRef100 Q81Y73 Bacilius AlaS Bacilius anthracis anthracis 2664 METAL-ACTIVATED PYRIDOXAL UniRef100 Q8YCI2 Bruceiia neitensis ENZYME Brucella melitensis) 2665 Probable translation initiation inhibitor UniRef100 Q6LKM3 Photobacterium Yab Photobacterium profundum) profindi.in) 26.66 YccC 2667 Putative threonine synthase UniRef100 Q82IF6 Streptomyces ThrC Streptomyces avermitilis avermitiis 26.68 Yab 2669 Hypothetical protein UniRef100 Q8RBAO Thermoanaerobacter Thermoanaerobacter tengcongensis tengcongensis 2670 2671 UPIO00032CE59 UniRef100 entry UniRef100 UPIOOOO32CE59 US 8,168,417 B2 167 168 TABLE 1-continued

Predicted functions

Bacilius subtiis SEQ homolog ID (Gene NO. Description OniRef Accession No. Organism Name) 2672 Multidrug-efflux transporter 1 regulator UniRef100 P39075 Bacilius subtiis BmR Bacilius subtilis 2673 Metallo-beta-lactamaserhodanese-like UniRef100 Q81Q95 Bacilius YrkH domain protein Bacilius anthracis anthracis 2674 Hypothetical protein Bacilius cereus UniRef100 Q63B51 Bacilius cereus YumEB ZK) ZK. 2675 NreC Staphylococcus carnosus UniRef100 Q7WZY4 Staphylococcits DegU C(iiOS:S 2676 Two-component sensor histidine kinase UniRef100 Q67JE7 Symbiobacterium DegS Symbiobacterium thermophilum thermophilum 2677 YdfCR protein Bacilius subtilis UniRef100 P96695 Bacilius subtiis YdfC) 2678 Hypothetical Membrane Spanning UniRef100 Q813Y5 Bacilius cereus Yrk Protein Bacilius cereus 2679 Hypothetical UPF0033 protein yrkI UniRef100 P54436 Bacilius subtiis Bacilius subtilis 2680 UPIOOOO3CB3C6 UniRefl00 entry UniRef100 UPIOOOO3CB3C6 YrkH 2681 Molybdopterin biosynthesis MoeB UniRef100 Q81 HL2 Bacilius cereus YrkF protein Bacilius cereus 2682 Hypothetical protein yrkE Bacilius UniRef100 P54432 Bacilius subtiis YrkE subtilis 2683 Hypothetical conserved protein UniRef100 Q8EN37 Oceanobacilius iheyensis Oceanobacilius iheyensis 2684 S-adenosylmethionine-dependent UniRef100 Q97FB3 Cliostridium YcgJ methyltransferase Clostridium acetobiitvictim acetobiitvictim 2685 Hypothetical protein Bacilius anthracis UniRef100 Q81N81 Bacilius anthracis 2686 YeiB protein Bacilius subtilis UniRef100 P94399 Bacilius subtiis 2687 Acetylxylan esterase related enzyme UniRef100 Q97LM8 Clostridium acetobutyllicum Clostridium acetobutylicum 2688 Hypothetical UPFO161 protein UniRef100 P61464 Bacilius cereus BCE4947 Bacilius cereus 2689 Delta-aminolevulinic acid dehydratase UniRef100 Q9K8G2 Bacilius halodurans Bacilius halodurans 2690 6-phospho-3-hexuloisomerase Bacilius UniRef100 Q6TV53 Bacilius HXB methanolicus methanoicus 2691 Probable hexulose-6-phosphate UniRef100 P42405 Bacilius subtiis HxIA synthase Bacilius subtilis 2692 Transcriptional regulator Bacilius UniRef100 Q7OKJ9 Bacilius HxIR amyloiquefaciens amyloiquefaciens 2693 Fatty acid desaturase Bacillus subtilis UniRef100 O34653 Bacilius subtiis Des 2694 Sensor kinase Bacillus subtilis UniRef100 O34757 Bacilius subtiis YocF 2695 Sensor regulator Bacilius subtilis UniRef100 O34723 Bacilius subtiis YocG 2696 UPIOOOO3CC1E4 UniRef100 entry UniRef100 UPIOOOO3CC1E4 YcgT 2697 Nickel transport system Bacilius UniRef100 Q9KFB8 Bacilius AppA halodurans halodurans 2698 Nickel transport system Bacilius UniRef100 Q9KFB7 Bacilius AppB halodurans halodurans 2699 Nickel transport system Bacilius UniRef100 Q9KFB6 Bacilius AppC halodurans halodurans 2700 Oligopeptide ABC transporter Bacilius UniRef100 Q9KFB5 Bacilius DppD halodurans halodurans 2701 Oligopeptide ABC transporter Bacilius UniRef100 Q9KFB4 Bacilius AppF halodurans halodurans 2702 UPIOOOO3CB880 UniRefl00 entry UniRef100 UPIOOOO3CB880 YdfL 2703 UPIOOOO3CA374 UniRef100 entry UniRef100 UPIOOOO3CA374 YoeA 2704 PROBABLE TRANSCRIPTION UniRef100 Q8XS91 Ralstonia Solanacearum REGULATOR PROTEIN Ralstonia Soianacearum 2705 Short chain dehydrogenase family UniRef100 Q834I5 Enterococcits YvaG protein Enterococci is faecalis faecalis 2706 Uncharacterized protein, containing UniRef100 Q97L24 Clostridium acetobutyllicum predicted phosphatase domain Clostridium acetobutyllicum 2707 UPIO00025758C UniRef100 entry UniRef100 UPIOOOO25758C PnbA 2708 Cytochrome P450 Bacilius subtilis UniRef100 OO8469 Bacilius subtiis CypA 2709 YtnM Bacillus subtilis UniRef100 O34430 Bacilius subtiis YM 2710 Hypothetical protein yndA precursor UniRef100 O31805 Bacilius subtiis YndA Bacilius subtilis 2711 YvaG protein Bacilius subtilis UniRef100 O32229 Bacilius subtiis YvaG 2712 Levanase precursor Bacillus subtilis UniRef100 PO5656 Bacilius subtiis SacC 2713 PTS system, fructose-specific IID UniRef100 P26382 Bacilius subtiis LewG component Bacilius subtilis US 8,168,417 B2 169 170 TABLE 1-continued

Predicted functions

Bacilius subtiis SEQ homolog ID (Gene NO. Description OniRef Accession No. Organism Name) 2714 PTS system, fructose-specific IIC UniRef100 P26381 Bacilius subtiis LewF component Bacilius subtilis 2715 PTS system, fructose-specific IIB UniRef100 P26380 Bacilius subtiis LewE component Bacilius subtilis 2716 PTS system, fructose-specific ILA UniRef100 P26379 Bacilius subtiis LewD component Bacilius subtilis 2717 Transcriptional regulatory protein levR UniRef100 P23914 Bacilius subtiis LewR Bacilius subtilis 2718 Hypothetical protein Bacilius cereus UniRef100 Q734Q5 Bacilius cereus 2719 Hypothetical protein yrhK Bacilius UniRef100 OO5401 Bacilius subtiis subtilis 2720 UPIOOOO3CB785 UniRefl00 entry UniRef100 UPIOOOO3CB785 2721 Ada A 2722 Methylated-DNA-protein-cysteine S- UniRef100 Q732Y 7 Bacilius cereus AdaB methyltransferase Bacilius cereus 2723 Oxidoreductase, aldo/keto reductase UniRef100 Q6HBJ5 Bacilius YtbE amily Bacilius thuringiensis thiringiensis 2724 YtbD 2725 Hypothetical UPF0087 protein ytcD UniRef100 O34533 Bacilius subtiis YtcD Bacilius subtilis 2726 2727 Hypothetical protein yjA Bacilius UniRef100 O34394 Bacilius subtiis YA subtilis 2728 UPIO00028298B UniRef100 entry UniRef100 UPIOOOO28298B 2729 YxeB 2730 Putative HTH-type transcriptional UniRef100 P37499 Bacilius subtiis YybE regulatoryybE Bacilius subtilis 2731 Hypothetical transport protein yybF UniRef100 P37498 Bacilius subtiis YybF Bacilius subtilis 2732 Hypothetical protein Bacilius cereus UniRef100 Q732JO Bacilius cereus 2733 Probable bifunctional P-450:NADPH- UniRef100 OO8336 Includes: Yrh, P450 reductase 2 Includes: Cytochrome Cytochrome P450 102 (EC 1.14.14.1): P450 102 (EC NADPH-cytochrome P450 reductase 1.14.14.1); (EC 1.6.2.4). Bacillus subtilis NADPH-i- cytochrome P450 reductase (EC 1.6.2.4) 2734 Regulatory protein Bacillus subtilis UniRef100 OO8335 Bacilius subtiis YrhI 2735 WprA 2736 YrhH Bacillus subtilis UniRef100 OO5400 Bacilius subtiis YrhEH 2737 2738 2739 2740 CyStathionine gamma-lyase Bacilius UniRef100 OO5394 Bacilius subtiis YrhEB subtilis 2741 Cysteine synthase Bacilius subtilis UniRef100 OO5393 Bacilius subtiis YrhA 2742 MTA/SAH nucleosidase Bacilius UniRef100 O32028 Bacilius subtiis Mtn subtilis 2743 YrrT protein Bacilius subtilis UniRef100 O32029 Bacilius subtiis YrrT 2744 Hypothetical protein yrzA Bacilius UniRef100 O32030 Bacilius subtiis subtilis 2745 YrrS 2746 YrrR protein Bacilius subtilis UniRef100 O32032 Bacilius subtiis YrrR 2747 Transcription elongation factor grea UniRef100 P80240 Bacilius subtiis GreA Bacilius subtilis 2748 Uridine kinase Bacillus subtilis UniRef100 O32033 Bacilius subtiis Udk 2749 YrrO protein Bacilius subtilis UniRef100 O32034 Bacilius subtiis YrrO 2750 YrrN protein Bacillus subtilis) UniRef100 O32035 Bacilius subtiis YrrN 2751 YrrM protein Bacilius subtilis UniRef100 O32036 Bacilius subtiis YrrM 2752 YrrL protein Bacilius subtilis UniRef100 O34758 Bacilius subtiis YrrL 2753 YrzB protein Bacilius subtilis) UniRef100 O34828 Bacilius subtiis 2754 Putative Holliday junction resolvase UniRef100 O34634 Bacilius subtiis YrrK Bacilius subtilis 2755 Hypothetical UPF0297 protein yrzL UniRef100 Q7WY61 Bacilius subtiis Bacilius subtilis 2756 Alanyl-tRNA synthetase Bacilius UniRef100 O34526 Bacilius subtiis AlaS subtilis 2757 Hypothetical UPF0118 protein yrrI UniRef100 O34472 Bacilius subtiis YrrI Bacilius subtilis 2758 US 8,168,417 B2 171 172 TABLE 1-continued

Predicted functions

Bacilius subtiis SEQ homolog ID (Gene NO. Description OniRef Accession No. Organism Name) 2759 Hypothetical protein Bacilius cereus UniRef100 Q634F2 Bacilius cereus ZK ZK) 2760 YrrD protein Bacilius subtilis UniRef100 O34402 Bacilius subtiis YrrD 2761 YrrC protein Bacilius subtilis UniRef100 O34481 Bacilius subtiis YrrC 2762 YrrB protein Bacilius subtilis UniRef100 O34452 Bacilius subtiis YrrB 2763 Probable tRNA (5-methylaminomethyl- UniRef100 O35020 Bacilius subtiis Trim 2-thiouridylate)-methyltransferase Bacilius subtilis 2764 YrvO protein Bacilius subtilis UniRef100 O34599 Bacilius subtiis YrvO 2765 BH1259 protein Bacilius halodurans UniRef100 Q9KDF4 Bacilius halodurans 2766 YrvN protein Bacilius subtilis UniRef100 O34528 Bacilius subtiis YrvN 2767 YrvM protein Bacillus subtilis) UniRef100 O32037 Bacilius subtiis YrwM 2768 Aspartyl-tRNA synthetase Bacilius UniRef100 O32038 Bacilius subtiis AspS subtilis 2769 Histidyl-tRNA synthetase Bacilius UniRef100 O32039 Bacilius subtiis HisS subtilis 2770 2771 Yry 2772 Putative D-tyrosyl-tRNA(Tyr) UniRef100 O32042 Bacilius subtiis YrvI deacylase-like protein Bacilius subtilis 2773 GTP pyrophosphokinase (EC 2.7.6.5) UniRef100 O544.08 Bacilius subtiis RelA. (ATP:GTP3'-pyrophosphotransferase) (ppGpp synthetase I) (P)ppGpp synthetase) Bacilius subtilis 2774 Adenine phosphoribosyltransferase UniRef100 O34443 Bacilius subtiis Apt Bacilius subtilis 2775 YrvE protein Bacilius subtilis) UniRef100 O32044 Bacilius subtiis YrvE 2776 YrvD protein Bacillus subtilis) UniRef100 O32045 Bacilius subtiis 2777 SecDF protein LBacillus subtilis UniRef100 O32047 Bacilius subtiis SecDF 2778 YrzD protein Bacilius subtilis UniRef100 O32049 Bacilius subtiis 2779 SpoVB 2780 Hypothetical protein yrbG Bacilius UniRef100 O32050 Bacilius subtiis YrbG subtilis 2781 YrzE protein Bacilius subtilis UniRef100 O32051 Bacilius subtiis YrzE 2782 Hypothetical UPF0092 protein yrbF UniRef100 O32052 Bacilius subtiis Bacilius subtilis 2783 Queuine tRNA-ribosyltransferase UniRef100 O32053 Bacilius subtiis Tgt Bacilius subtilis 2784 S-adenosylmethionine:tRNA UniRef100 O32054 Bacilius subtiis QueA ribosyltransferase-isomerase Bacilius subtilis 2785 2786 Holliday junction DNA helicase ruvB UniRef100 O32055 Bacilius subtiis RuvB Bacilius subtilis 2787 Holliday junction DNA helicase ruvA UniRef100 O05392 Bacilius subtiis RuvA Bacilius subtilis 2788 BofC protein precursor Bacilius subtilis UniRef100 O05391 Bacilius subtiis BofC 2789 Hypothetical conserved protein UniRef100 Q8ERL7 Oceanobacilius YrzF Oceanobacilius iheyensis iheyensis 2790 2791 Small, acid-soluble spore protein H UniRef100 Q9KB75 Bacilius halodurans Bacilius halodurans 2792 Hypothetical protein yoA Bacilius UniRef100 O34334 Bacilius subtiis YoA subtilis 2793 YmaC protein Bacilius subtilis UniRef100 O31789 Bacilius subtiis YmaC 2794 2.795 2796 Hypothetical UPF0082 protein yrbC UniRef100 P94447 Bacilius subtiis YrbC Bacilius subtilis 2797 Sporulation cortex protein coXA UniRef100 P94446 Bacilius subtiis Cox A Bacilius subtilis 2798 Morphogenetic protein associated with UniRef100 O32062 Bacilius subtiis SafA SpoVID Bacilius subtilis 2799 Quinolinate synthetase A Bacillus UniRef100 Q9KWZ1 Bacilius subtiis NadA subtilis 2800 Probable nicotinate-nucleotide UniRef100 P39666 carboxylating NadC pyrophosphorylase carboxylating Bacilius subtilis 2801 L-aspartate oxidase Bacilius subtilis UniRef100 P38032 Bacilius subtiis NadB 2802 Probable cysteine desulfurase Bacilius UniRef100 P38033 Bacilius subtiis NifS subtilis 2803 YrxA US 8,168,417 B2 173 174 TABLE 1-continued

Predicted functions

Bacilius subtiis SEQ homolog ID (Gene NO. Description OniRef Accession No. Organism Name) 2804 Prephenate dehydratase Bacilius UniRef100 P21203 Bacilius subtiis Phea subtilis 28.05 ACT domain protein pheB Bacilius UniRef100 P21204 Bacilius subtiis Phe subtilis 2806 SpooB-associated GTP-binding protein UniRef100 Q659J4 Bacilius Obg Bacilius amyloiquefaciens amyloiquefaciens 2807 Sporulation initiation UniRef100 PO6535 Bacilius subtiis SpoOB phosphotransferase B Bacilius subtilis 2808 50S ribosomal protein L27 Bacilius UniRef100 PO5657 Bacilius subtiis subtilis 2809 2810 50S ribosomal protein L21 Bacilius UniRef100 P26908 Bacilius subtiis RplU subtilis 2811 Stage IV sporulation protein FB UniRef100 P26937 Bacilius subtiis SpoIVFB Bacilius subtiis 2812 Stage IV sporulation protein FA UniRef100 P26936 Bacilius subtiis SpoIVFA Bacilius subtiis 2813 Hypothetical protein Bacilius cereus UniRef100 Q816V6 Bacilius cereus YndB 2814 Transcriptional regulator, ArsR family UniRef100 Q632YO Bacilius cereus ZK Bacilius cereus ZK) 2815 Septum site-determining protein minD UniRef100 Q01464 Bacilius subtiis MiniD Bacilius subtiis 2816 MinC 2817 Rod shape-determining protein mreD UniRef100 Q01467 Bacilius subtiis MireD Bacilius subtiis 2818 Rod shape-determining protein mreC UniRef100 Q01466 Bacilius subtiis MireC Bacilius subtiis 2819 Rod shape-determining protein mreB UniRef100 Q01465 Bacilius subtiis MireB Bacilius subtiis 2820 DNA repair protein radChomolog UniRef100 Q02170 Bacilius subtiis RadC Bacilius subtiis 2821 Septum formation protein MafBacilius UniRef100 Q02169 Bacilius subtiis Maf subtilis 2822 Stage II sporulation protein B Bacilius UniRef100 P37575 Bacilius subtiis SpoIIB subtilis 2823 Type 4 prepilin-like proteins leader UniRef100 P15378 Includes: Leader ComC peptide processing enzyme (Late peptidase (EC competence protein comC) Includes: 3.4.23.43) Leader peptidase (EC 3.4.23.43) (Prepilin (Prepilin peptidase); N- peptidase); N methyltransferase (EC 2.1.1.-) Bacilius methyltransferase subtilis (EC 2.1.1.—) 2824 FoC 2825 Valyl-tRNA synthetase Bacilius subtilis UniRef100 Q05873 Bacilius subtiis VaS 2826 Hypothetical protein OB2062 UniRef100 Q8EPN1 Oceanobacilius iheyensis Oceanobacilius iheyensis 2827 Hypothetical protein ySXE Bacilius UniRef100 P37964 Bacilius subtiis YSXE subtilis 2828 Stage VI sporulation protein D Bacilius UniRef100 P37963 Bacilius subtiis SpoVID subtilis 2829 Glutamate-1-semialdehyde 2,1- UniRef100 P30949 Bacilius subtiis HemL aminomutase Bacilius subtilis 2830 Delta-aminolevulinic acid dehydratase UniRef100 P3O950 Bacilius subtiis Hem Bacilius subtilis 2831 Uroporphyrinogen-III synthase Bacilius UniRef100 P21248 Bacilius subtiis HemD subtilis 2832 Porphobilinogen deaminase Bacilius UniRef100 P16616 Bacilius subtiis HemC subtilis 2833 Protein hemiX Bacilius subtilis UniRef100 P16645 Bacilius subtiis HemX 2834 Glutamyl-tRNA reductase Bacilius UniRef100 P16618 Bacilius subtiis HemA subtilis 2835 Hypothetical protein ySXD Bacilius UniRef100 P4O736 Bacilius subtiis YSXD subtilis 2836 Probable GTP-binding protein engB UniRef100 P38424 Bacilius subtiis YSXC Bacilius subtilis 2837 ATP-dependent protease La 1 Bacilius UniRef100 P37945 Bacilius subtiis LonA subtilis 2838 ATP-dependent protease Lahomolog UniRef100 P42425 Bacilius subtiis LonB Bacilius subtilis 2839 ATP-dependent Clp protease ATP- UniRef100 PSO866 Bacilius subtiis ClpX binding subunit clpX Bacillus subtilis 2840 Trigger factor Bacillus subtilis UniRef100 P80698 Bacilius subtiis Tig US 8,168,417 B2 175 176 TABLE 1-continued

Predicted functions

Bacilius subtiis SEQ homolog ID (Gene NO. Description OniRef Accession No. Organism Name) 2841 Hypothetical protein ySOA Bacilius UniRef100 P94569 Bacilius subtiis YSOA subtilis 2842 3-isopropylmalate dehydratase small UniRef100 P94568 Bacilius subtiis Leu) subunit Bacilius subtilis 2843 3-isopropylmalate dehydratase large UniRef100 P80858 Bacilius subtiis Leuc subunit Bacilius subtilis 2844 LeuB 2845 2-isopropylmalate synthase Bacilius UniRef100 P94565 Bacilius subtiis LeuA subtilis 2846 Ketol-acid reductoisomerase Bacilius UniRef100 P37253 Bacilius subtiis wC subtilis 2847 Acetolactate synthase Small subunit UniRef100 P37252 Bacilius subtiis w Bacilius subtilis 2848 Acetolactate synthase large subunit UniRef100 P37251 Bacilius subtiis wB Bacilius subtilis 2849 Branched-chain amino acid UniRef100 Q6HLF7 Bacilius Dat aminotransferase Bacilius thiringiensis thiringiensis 28SO 2851 RocG 2852 BH3337 protein Bacilius halodurans UniRef100 Q9K7M4 Bacilius halodurans 2853 YxeD 2854 Hypothetical protein yabA Bacilius UniRef100 P45917 Bacilius subtiis subtilis 2855 Hypothetical protein ydaT Bacilius UniRef100 P45916 Bacilius subtiis YdaT subtilis 2856 Lin1266 protein Listeria innocua UniRef100 Q92CC3 Listeria innoctia 2857 Lin1733 protein Listeria innocua UniRef100 Q92B18 Listeria innoctia 2858 2859 2860 2861 MtbP 2862 BH3535 protein Bacilius halodurans UniRef100 Q9K738 Bacilius halodurans 2863 Yee 2864 YXD 2865 RapI 2866 2867 2868 2869 Hypothetical UPF0025 protein ySnB UniRef100 P94559 Bacilius subtiis YSB Bacilius subtilis 2870 HAM1 protein homolog Bacilius UniRef100 P94558 Bacilius subtiis YSA subtiis 2871 Rph 2872 Germination protein gerM Bacilius UniRef100 P39072 Bacilius subtiis GerM subtiis 2873 RacE 2874 Hypothetical protein ySmB Bacilius UniRef100 P97247 Bacilius subtiis YSmB subtiis 2875 Germination protein gerE Bacilius UniRef100 P11470 Bacilius subtiis subtiis 2876 Oxidoreductase Clostridium UniRef100 Q97TP7 Cliostridium YajQ acetobiitvictim acetobiitvictim 2877 Hypothetical protein ySmA Bacilius UniRef100 Q6L874 Bacilius subtiis YsmA subtiis 2878. Succinate dehydrogenase iron-sulfur UniRef100 PO8066 Bacilius subtiis SdhB protein Bacilius subtilis 2879 SdhA 288O 2881 Succinate dehydrogenase cytochrome UniRef100 PO8.064 Bacilius subtiis SdhC B-558 subunit Bacilius subtilis 2882 Hypothetical protein ySIB Bacilius UniRef100 P42.955 Bacilius subtiis YSIB subtilis 2883 Aspartokinase 2 (EC 2.7.2.4) UniRef100 PO8495 Contains: LysC (Aspartokinase II) (Aspartate kinase 2) Aspartokinase II Contains: Aspartokinase II alpha alpha Subunit; Subunit; AspartOkinase II beta subunit Aspartokinase II Bacilius subtilis beta subunit 2884 UvrABC system protein C Bacillus UniRef100 P14951 Bacilius subtiis UvrC subtilis 2885 Thioredoxin Bacilius subtilis UniRef100 P14949 Bacilius subtiis TrxA US 8,168,417 B2 177 178 TABLE 1-continued

Predicted functions

Bacilius subtiis SEQ homolog ID (Gene NO. Description OniRef Accession No. Organism Name) 2886 Electron transfer flavoprotein alpha- UniRef100 P94551 Bacilius subtiis EtfA subunit Bacilius subtilis 2887 Electron transfer flavoprotein beta- UniRef100 P94550 Bacilius subtiis EtfB subunit Bacilius subtilis 2888 Hypothetical protein ysiB Bacilius UniRef100 P94549 Bacilius subtiis YSEB subtilis 2889 Hypothetical protein ysia Bacilius UniRef100 P94548 Bacilius subtiis Ysia subtilis 2890 Long-chain-fatty-acid-CoA ligase UniRef100 P94547 Bacilius subtiis LcfA Bacilius subtilis 2891 Hypothetical protein yshE Bacilius UniRef100 P94546 Bacilius subtiis YSE subtilis 2892 MutS2 protein Bacilius subtilis UniRef100 P94545 Bacilius subtiis MutSB 2893 Hypothetical protein yshC Bacilius UniRef100 P94544 Bacilius subtiis YSC subtilis 2894 YSB 2895 2896 Ribonuclease HIII Bacilius subtilis UniRef100 P94541 Bacilius subtiis RinhC 2897 2898 2899 29OO 2901 YxIF 2902 2903 Phenylalanyl-tRNA synthetase beta UniRef100 P17922 Bacilius subtiis PheT chain Bacilius subtilis 2904 Phenylalanyl-tRNA synthetase alpha UniRef100 Q659J3 Bacilius PeS chain Bacilius amyloiquefaciens amyloiquefaciens 2905 Hypothetical protein ysgALBacilius UniRef100 P94538 Bacilius subtiis YsgA subtilis 2906 Small, acid-soluble spore protein I UniRef100 P94537 Bacilius subtiis Bacilius subtilis 2907 Carbon starvation protein Ahomolog UniRef100 P94532 Bacilius subtiis CstA Bacilius subtilis 2908 Alpha-N-arabinofuranosidase Bacilius UniRef100 Q9XBQ3 Bacilius Abfa Stearothermophilus Stearothermophilus 2909 L-arabinose transport system permease UniRef100 P94530 Bacilius subtiis AraC protein araO Bacillus subtilis 291.0 L-arabinose transport system permease UniRef100 P94529 Bacilius subtiis Arap protein araP Bacilius subtilis 2911 Probable arabinose-binding protein UniRef100 P94528 Bacilius subtiis AraN precursor Bacillus subtilis 2912 Arabinose operon protein araM UniRef100 P94527 Bacilius subtiis AraM Bacilius subtilis 2913 L-ribulose-5-phosphate 4-epimerase UniRef100 P94525 Bacilius subtiis Ara) Bacilius subtilis 2914 Ribulokinase Bacilius subtilis UniRef100 P94524 Bacilius subtiis AraB 2915 L-arabinose isomerase Bacilius UniRef100 P94523 Bacilius subtiis Ara.A subtiis 2916 Abn A 2917 Hypothetical protein ysdC Bacilius UniRef100 P94521 Bacilius subtiis YSC subtiis 2918 Hypothetical protein ysdB Bacilius UniRef100 P94520 Bacilius subtiis YSCB subtiis 2919 Hypothetical protein ysdA Bacilius UniRef100 P94519 Bacilius subtiis subtiis 2920 50S ribosomal protein L20 Bacilius UniRef100 P55873 Bacilius subtiis RplT subtiis 2921 50S ribosomal protein L35 Bacilius UniRef100 P55874 Bacilius subtiis subtiis 2922 Info 2923 Antiholin-like protein IrgB Bacilius UniRef100 P94516 Bacilius subtiis YsbB subtiis 2924 Antiholin-like protein IrgA Bacilius UniRef100 P94515 Bacilius subtiis YsbA subtiis 2925 POD 2926 Sensory transduction protein lytT UniRef100 P94514 Bacilius subtiis Lytt Bacilius subtilis 2927 Sensor protein lytS Bacilius subtilis UniRef100 P94513 Bacilius subtiis LytS 2928 Hypothetical protein Bacilius anthracis UniRef100 Q81NOO Bacilius anthracis 2929 Hypothetical protein ysaA Bacilius UniRef100 P94512 Bacilius subtiis YsaA subtiis US 8,168,417 B2 179 180 TABLE 1-continued

Predicted functions

Bacilius subtiis SEQ homolog ID (Gene NO. Description UniRef Accession No. Organism Name) 2930 Threonyl-tRNA synthetase 1 Bacilius UniRef100 P18255 Bacilius subtiis ThrS subtilis 2931 Hypothetical protein ytXC Bacilius UniRef100 PO6569 Bacilius subtiis YXC subtilis 2932 Hypothetical UPF0043 protein ytxB UniRef100 PO6568 Bacilius subtiis YXB Bacilius subtilis 2933 Primosomal protein dinal Bacilius UniRef100 PO6567 Bacilius subtiis Dna subtilis 2934 Replication initiation and membrane UniRef100 PO7908 Bacilius subtiis DnaB attachment protein Bacilius subtilis 2935 Hypothetical UPFO168 protein ytcG UniRef100 Q45549 Bacilius subtiis YtcG Bacilius subtilis 2936 2937 SpeD 2938 GapB 2939 Pectin lyase Bacilius subtilis UniRef100 P94449 Bacilius subtiis PeB 2940 Dephospho-CoA kinase Bacilius UniRef100 O34932 Bacilius subtiis YtaG subtilis 2941 YaF 2942 Formamidopyrimidine-DNA glycosylase UniRef100 O34403 Bacilius subtiis MutM Bacilius subtilis 2943 DNA polymerase I Bacilius subtilis UniRef100 O34996 Bacilius subtiis Pola 2944 Alkaline phosphatase synthesis sensor UniRef100 P23545 Bacilius subtiis PhOR protein phoRBacilius subtilis 2945 Alkaline phosphatase synthesis UniRef100 P13792 Bacilius subtiis PhoP transcriptional regulatory protein phoP Bacilius subtilis 2946 Malate dehydrogenase Bacilius UniRef100 P49814 Bacilius subtiis Mdh subtilis 2947 Isocitrate dehydrogenase NADP UniRef100 P39126 NADP Ico Bacilius subtilis 2948 Citrate synthase II Bacilius subtilis UniRef100 P39120 Bacilius subtiis Citz, 2949 YtwI Bacillus subtilis) UniRef100 O34811 Bacilius subtiis Ytw 2950 Hypothetical UPF0118 protein ytv I UniRef100 O34991 Bacilius subtiis YWI Bacilius subtilis 2951 YtZA protein Bacilius subtilis UniRef100 O32064 Bacilius subtiis YZA 2952 Pyk 2953 6-phosphofructokinase Bacilius UniRef100 O34529 Bacilius subtiis PfkA subtilis 2954 Acetyl-coenzyme A carboxylase UniRef100 O34847 Bacilius subtiis AccA carboxyl transferase subunit alpha Bacilius subtilis 2955 Acetyl-CoA carboxylase subunit UniRef100 O34571 Bacilius subtiis AccD Bacilius subtilis 2956 Yts 2957 DNA polymerase III alpha subunit UniRef100 O34623 Bacilius subtiis DnaB. Bacilius subtilis 2958 Hypothetical Membrane Spanning UniRef100 Q812P3 Bacilius cereus Protein Bacilius cereus 2959 YtrI Bacillus subtilis UniRef100 O34460 Bacilius subtiis Yr 2960 BH3172 protein Bacilius halodurans UniRef100 Q9K835 Bacilius halodurans 2961 YtdI Bacillus subtilis UniRef100 O34600 Bacilius subtiis YtdI 2962 YtpI Bacillus subtilis UniRef100 O34922 Bacilius subtiis 2963 YoI 2964 PadR 2965 YtkL protein Bacilius subtilis UniRef100 Q795U4 Bacilius subtiis YtkL 2966 YtkK 2967 2968 Argininosuccinate lyase Bacilius UniRef100 O34858 Bacilius subtiis ArgH subtilis 2969 Argininosuccinate synthase Bacilius UniRef100 O34347 Bacilius subtiis ArgG subtilis 2970 Molybdenum cofactor biosynthesis UniRef100 O34457 Bacilius subtiis Moa B protein B Bacilius subtilis 2971 Ack A 2972 Hypothetical protein ytXK Bacilius UniRef100 P37876 Bacilius subtiis YXK subtilis 2973 Probable thiol peroxidase Bacilius UniRef100 P80864 Bacilius subtiis Tpx subtilis 2974 YtfT Bacilius subtilis UniRef100 O34806 Bacilius subtiis Ytf 2975 Ytf 2976 Ytel Bacillus subtilis) UniRef100 O34424 Bacilius subtiis Yte US 8,168,417 B2 181 182 TABLE 1-continued

Predicted functions

Bacilius subtiis SEQ homolog ID (Gene NO. Description OniRef Accession No. Organism Name) 2977 Putative signal peptide peptidase spp.A UniRef100 O34525 Bacilius subtiis SppA Bacilius subtilis 2978 Probable inorganic polyphosphate? ATP- UniRef100 O34934 Bacilius subtiis YtcI NAD kinase 2 (EC 2.7.1.23) (Poly(P)/ATP NAD kinase 2) Bacilius subtilis 2979 YhbJ protein Bacillus subtilis) UniRef100 O31593 Bacilius subtiis Yhb 2980 Multidrug resistance protein UniRef100 Q8CQB1 Staphylococcits YbD Staphylococci is epidermidis epidermidis 2981 Putative HTH-type transcriptional UniRef100 P42103 Bacilius subtiis YxaD regulatoryxa D Bacilius subtilis 2982 YtcI Bacilius subtilis UniRef100 O34613 Bacilius subtiis YtcI 2983 Small, acid-soluble spore protein 1 UniRef100 PO6552 Bacilius Stearothermophilus Bacilius Stearothermophilus 2984 Probable thiamine biosynthesis protein UniRef100 O34595 Bacilius subtiis Ytb hil Bacilius subtilis 2985 NifS2 Bacillus subtilis UniRef100 O34874 Bacilius subtiis NZ 2986 Branched-chain amino acid transport UniRef100 O34545 Bacilius subtiis BraB system carrier protein braB Bacillus subtilis 2987 IS1627s 1-related, transposase Bacilius UniRef100 Q7CMDO Bacilius anthracis str. A2012 anthracis str. A2012 2988 UPIOOOO3CCO69 UniRefl00 entry UniRef100 UPIOOOO3CCO69 2989 Septation ring formation regulator ezrA UniRef100 O34894 Bacilius subtiis Ezra Bacilius subtilis 2990 Histidinol-phosphatase Bacilius UniRef100 O34411 Bacilius subtiis His subtilis 2991 Probable HTH-type transcriptional UniRef100 O34970 Bacilius subtiis YttP regulatoryttP LBacillus subtilis 2992 Hypothetical conserved protein UniRef100 Q8EPBO Oceanobacilius iheyensis Oceanobacilius iheyensis 2993 YtrP Bacillus subtilis) UniRef100 O34.325 Bacilius subtiis YtrP 2994 30S ribosomal protein S4 Bacilius UniRef100 P21466 Bacilius subtiis RpsD subtilis 2995 2996 2997 YddR 2998 HTH-type transcriptional regulator Irp A UniRef100 P96652 Bacilius subtiis LrpA Bacilius subtilis 2999 3000 Tyrosyl-tRNA synthetase 1 Bacilius UniRef100 P22326 Bacilius subtiis TyrS subtilis 3001 Acetyl-coenzyme A synthetase Bacilius UniRef100 P39062 Bacilius subtiis AcSA subtilis 3002 Acetoin utilization protein acuA Bacilius UniRef100 P39065 Bacilius subtiis AcuA subtilis 3003 Acetoin utilization acuB protein Bacilius UniRef100 P39066 Bacilius subtiis AcuB subtilis 3004 Acetoin utilization protein acuC UniRef100 P39067 Bacilius subtiis AcuC Bacilius subtilis 3005 Hypothetical protein ytxE Bacilius UniRef100 P39064 Bacilius subtiis YXE subtilis 3006 Hypothetical protein ytXD Bacilius UniRef100 P39063 Bacilius subtiis YXD subtilis 3007 Catabolite control protein A Bacilius UniRef100 P25144 Bacilius subtiis CcpA subtilis 3008 AroA(G) protein Includes: Phospho-2- UniRef100 P39912 Includes: AroA dehydro-3-deoxyheptonate aldolase Phospho-2- (EC 2.5.1.54) (Phospho-2-keto-3- dehydro-3- deoxyheptonate aldolase) (DAHP deoxyheptonate synthetase) (3-deoxy-D-arabino- aldolase (EC heptulosonate 7-phosphate synthase); 2.5.1.54) Chorismate mutase (EC 5.4.99.5) (Phospho-2-keto Bacilius subtilis 3 deoxyheptonate aldolase) (DAHP synthetase) (3- deoxy-D-arabino heptulosonate 7 phosphate synthase); US 8,168,417 B2 183 184 TABLE 1-continued

Predicted functions

Bacilius subtiis SEQ homolog ID (Gene NO. Description OniRef Accession No. Organism Name) Chorismate mutase (EC 5.499.5) 3009 Similar to hypothetical repeat UniRef100 Q7N3B8 Photorhabdus iuminescens containing protein Photorhabdits luminescens 3010 Hypothetical protein ytXJ Bacilius UniRef100 P39914 Bacilius subtiis YXJ subtilis 3011 Hypothetical protein ytXH Bacilius UniRef100 P4O780 Bacilius subtiis YtxEH subtilis 3012 Hypothetical protein ytXG Bacilius UniRef100 P4O779 Bacilius subtiis YXG subtilis 3013 MurC 3.014 YtpT Bacilius subtilis UniRef100 O34749 Bacilius subtiis YipT 3015 YtpR Bacillus subtilis) UniRef100 O34943 Bacilius subtiis YR 3016 YtpO Bacillus subtilis UniRef100 O34496 Bacilius subtiis YtpO 3017 Putative thioredoxin Bacilius subtilis UniRef100 O34357 Bacilius subtiis YpP 3018 YtoQ Bacillus subtilis UniRef100 O34305 Bacilius subtiis YtoQ 3019 YtoP Bacillus subtilis) UniRef100 O34924 Bacilius subtiis YoP 3020 YtzB protein Bacilius subtilis UniRef100 O32065 Bacilius subtiis YZB 3021 Probable NAD-dependent malic UniRef100 O34389 Bacilius subtiis MalS enzyme 3 Bacilius subtilis 3022 YtnP Bacillus subtilis) UniRef100 O34760 Bacilius subtiis YnP 3023 tRNA (guanine-N(7)-)- UniRef100 O34522 Bacilius subtiis YtmO methyltransferase (EC 2.1.1.33) (tRNA(m7G46)-methyltransferase) Bacilius subtilis 3024 YtzH protein Bacilius subtilis UniRef100 O32066 Bacilius subtiis 3025 YtmPLBacillus subtilis) UniRef100 O34935 Bacilius subtiis YEmP 3026 Amyx protein Bacilius subtilis UniRef100 O34587 Bacilius subtiis AmyX 3027 YtlRBacillus subtilis) UniRef100 O34799 Bacilius subtiis YIR 3028 YtlO Bacillus subtilis) UniRef100 O34471 Bacilius subtiis YtlO 3029 Hypothetical UPF0097 protein ytlP UniRef100 O34570 Bacilius subtiis YP Bacilius subtilis 3030 Probable cysteine synthase (EC UniRef100 O34476 Bacilius subtiis YkP 2.5.1.47) (O-acetylserine sulfhydrylase) (O-acetylserine (Thiol)-lyase) Bacilius subtilis 3031 Hypothetical protein Bacilius cereus UniRef100 Q81 BR8 Bacilius cereus YncE 3032 Putative peptidase Bacilius subtilis UniRef100 O34944 Bacilius subtiis YtiP 3033 YtiP Bacillus subtilis) UniRef100 O34978 Bacilius subtiis YtiP 3034 YtzE protein Bacilius subtilis UniRef100 O32067 Bacilius subtiis 3035 Ribosomal Small subunit pseudouridine UniRef100 Q816W1 Bacilius cereus YZF synthase A Bacilius cereus 3036 YtgP Bacillus subtilis) UniRef100 O34674 Bacilius subtiis YtgP 3037 YtfP Bacillus subtilis) UniRef100 O30505 Bacilius subtiis YtfB 3O38 Opuld 3039 Protein cse(50 Bacillus subtilis UniRef100 P94496 Bacilius subtiis 3040 Rhodanese-like domain protein UniRef100 Q72YZ9 Bacilius cereus Bacilius cereus 3O41 Rap A 3042 Hypothetical protein Bacilius UniRef100 Q6HI31 Bacilius thuringiensis thiringiensis 3043 Ytel J Bacillus subtilis) UniRef100 O34378 Bacilius subtiis Yte 3044 YeT 3O45 YteS 3.046 YteR Bacillus subtilis) UniRef100 O34559 Bacilius subtiis YteR 3047 Transmembrane lipoprotein Bacilius UniRef100 Q9KFJ5 Bacilius LplB halodurans halodurans 3048 YtdP protein Bacilius subtilis UniRef100 O32071 Bacilius subtiis YoP 3049 YtcQ protein Bacilius subtilis UniRef100 Q795R2 Bacilius subtiis YtcQ 3050 YtcP 3051 Hypothetical protein ytbQ Bacilius UniRef100 P53560 Bacilius subtiis YtbQ subtilis 3052 YtaP Bacillus subtilis) UniRef100 O34973 Bacilius subtiis YaP 3053 Amino acid polyamine transporter; UniRef100 Q6LYX9 Meihanococcus YecA family I Methanococcus maripaludis maripaliidis 3054 Transcriptional regulator, LysR family UniRef100 Q97DX1 Cliostridium YwqM Clostridium acetobutyllicum acetobiitvictim 3055 Prolyl endopeptidase Bacilius cereus UniRef100 Q81C54 Bacilius cereus YycE 3056 Leucyl-tRNA synthetase Bacilius UniRef100 P36430 Bacilius subtiis LeuS subtilis US 8,168,417 B2 185 186 TABLE 1-continued

Predicted functions

Bacilius subtiis SEQ homolog ID (Gene NO. Description OniRef Accession No. Organism Name) 3057 YtyB 3058 YttB Bacillus subtilis) UniRef100 O34546 Bacilius subtiis YttB 3059 Lipoprotein Oceanobacilius iheyensis UniRef100 Q8EPK3 Oceanobacilius YuSA iheyensis 3060 YttA Bacillus subtilis) UniRef100 O30500 Bacilius subtiis YttA 3061 YtrF Bacillus subtilis) UniRef100 O35005 Bacilius subtiis YtrF 3062 Hypothetical ABC transporter ATP- UniRef100 O34392 Bacilius subtiis YtrE binding proteinytrE Bacillus subtilis 3063 YtrC Bacillus subtilis) UniRef100 O34898 Bacilius subtiis YtrC 3064 Transporter Bacilius subtilis UniRef100 O34641 Bacilius subtiis YtrB 3065 Transcription regulator Bacilius subtilis UniRef100 O34712 Bacilius subtiis Ytra 3066 Hypothetical protein ytzC Bacilius UniRef100 O32073 Bacilius subtiis subtilis 3067 YtdA Bacillus subtilis UniRef100 O35008 Bacilius subtiis YtgA 3068 YtgB 3069 Proton glutamate symport protein UniRef100 P39817 Bacilius subtiis GltP Bacilius subtilis 3070 Hypothetical protein ytpBBacilius UniRef100 O34707 Bacilius subtiis YbB subtilis 3071 Probable lysophospholipase Bacilius UniRef100 O34705 Bacilius subtiis YA subtilis 3072 YtoA Bacillus subtilis) UniRef100 O34696 Bacilius subtiis Yto A 3073 YWOA 3074 Glycosyltransferase, group 1 family UniRef100 Q6HCB9 Bacilius TuaC Bacilius thuringiensis thiringiensis 3075 Asparagine synthetase glutamine- UniRef100 P54420 glutamine- ASnB hydrolyzing 1 Bacilius subtilis hydrolyzing 3076 S-adenosylmethionine synthetase UniRef100 P54419 Bacilius subtiis MetK Bacilius subtilis 3077 Phosphoenolpyruvate carboxykinase UniRef100 P54418 ATP PckA ATP Bacillus subtilis 3078 Sodium:dicarboxylate symporter UniRef100 Q8EP16 Oceanobacilius DctP Oceanobacilius iheyensis iheyensis 3079 Hypothetical protein ytmB Bacilius UniRef100 O34365 Bacilius subtiis subtilis 3080 Putative peptidase Bacilius subtilis UniRef100 O34493 Bacilius subtiis YtmA 308.1 ABC transporter substrate-binding UniRef100 Q816P5 Bacilius cereus YtA protein Bacilius cereus 3082 Putative transporter Bacilius subtilis UniRef100 O34314 Bacilius subtiis YtlC 3O83 YD 3084 YtkD Bacillus subtilis) UniRef100 O35013 Bacilius subtiis YkD 3085 Hypothetical protein Bacilius UniRef100 Q6HC91 Bacilius thuringiensis thiringiensis 3086 Hypothetical protein ytkC Bacilius UniRef100 O34883 Bacilius subtiis YkC subtilis 3087 General stress protein 20U Bacilius UniRef100 P80879 Bacilius subtiis Dps subtilis 3088 Hypothetical protein ytkA Bacilius UniRef100 P4O768 Bacilius subtiis YtkA subtilis 3089 S-ribosylhomocysteinase Bacilius UniRef100 O34667 Bacilius subtiis LuxS subtilis 3090 Hypothetical UPFO161 protein ytjA UniRef100 O346O1 Bacilius subtiis Bacilius subtilis 3091 YtiB Bacillus subtilis) UniRef100 O34872 Bacilius subtiis YB 3092 Low-affinity Zinc transport protein UniRef100 Q81F90 Bacilius cereus YCC Bacilius cereus 3093 High-affinity Zinc uptake system protein UniRef100 Q81 EF8 Bacilius cereus YCCH ZnuA Bacilius ceretts 3094 50S ribosomal protein L31 type B UniRef100 O34967 Bacilius subtiis Bacilius subtilis 3095 Ytha Bacillus subtilis) UniRef100 O34655 Bacilius subtiis Ytha 3.096 YthB Bacillus subtilis) UniRef100 O34505 Bacilius subtiis YthB 3097 Hypothetical protein Bacilius cereus UniRef100 Q737J1 Bacilius cereus 3098 3099 O-succinylbenzoate synthase (EC UniRef100 O34514 Bacilius subtiis MenC 4.2.1.—) (OSB synthase) (OSBS) (4-(2'- carboxyphenyl)-4-oxybutyric acid synthase) Bacilius subtilis 3100 O-succinylbenzoate--CoA ligase UniRef100 P23971 Bacilius subtiis MenE Bacilius subtilis 3101 Bacillus subtilis UniRef100 P23966 Bacilius subtiis MenB 31 O2 YXM US 8,168,417 B2 187 188 TABLE 1-continued

Predicted functions

Bacilius subtiis SEQ homolog ID (Gene NO. Description UniRef Accession No. Organism Name) 3103 Menaquinone biosynthesis protein UniRef100 P23970 Includes: 2- MenD menD Includes: 2-succinyl-6-hydroxy- Succinyl-6- 24-cyclohexadiene-1-carboxylate hydroxy-2,4- synthase (EC 2.5.1.64) (SHCHC cyclohexadiene synthase); 2-oxoglutarate 1-carboxylate decarboxylase (EC 4.1.1.71) (Alpha- synthase (EC ketoglutarate decarboxylase) (KDC) 2.5.1.64) Bacilius subtilis (SHCHC synthase); 2 oxoglutarate decarboxylase (EC 4.1.1.71) (Alpha ketoglutarate decarboxylase) (KDC) 3104 Menaquinone-specific isochorismate UniRef100 P23973 Bacilius subtiis MenF synthase Bacilius subtilis 3105 Probable 1,4-dihydroxy-2-naphthoate UniRef100 P39582 Bacilius subtiis MenA octaprenyltransferase Bacilius subtilis 3106 Hypothetical protein yteA Bacilius UniRef100 P424.08 Bacilius subtiis YteA subtilis 3107 Glycogen phosphorylase Bacilius UniRef100 P39123 Bacilius subtiis GlgP subtilis 3108 Glycogen synthase Bacillus subtilis UniRef100 P39125 Bacilius subtiis GlgA 3109 Glycogen biosynthesis protein glgD UniRef100 P39124 Bacilius subtiis GlgD Bacilius subtilis 3110 Glucose-1-phosphate UniRef100 P39122 Bacilius subtiis GlgC adenylyltransferase LBacillus subtilis 3111 14-alpha-glucan branching enzyme UniRef100 P39118 Bacilius subtiis GlgB Bacilius subtilis 3112 AraR 3113 Yual protein Bacilius subtilis UniRef100 O32074 Bacilius subtiis Yual 3114 BH4010 protein Bacilius halodurans UniRef100 Q9K5S8 Bacilius YCS halodurans 3115 BH4011 protein Bacilius halodurans UniRef100 Q9K5S7 Bacilius halodurans 31.16 3117 RapD 3118 Pyrrollidone-carboxylate peptidase UniRef100 P46107 Bacilius Pep Bacilius amyloiquefaciens amyloiquefaciens 3119 BHO597 protein Bacilius halodurans UniRef100 Q9KF88 Bacilius YuaA halodurans 312O YbC 3121 YxxF 3122 YuaE protein Bacilius subtilis UniRef100 O32078 Bacilius subtiis Yua 3123 YuaD protein Bacilius subtilis UniRef100 O32079 Bacilius subtiis YualD 3124 Alcohol dehydrogenase Bacilius UniRef100 P71017 Bacilius subtiis GbsB subtilis 3125 Betaine aldehyde dehydrogenase UniRef100 P71016 Bacilius subtiis GbSA Bacilius subtilis 3126 Hypothetical protein yuaC Bacilius UniRef100 P71015 Bacilius subtiis YuaC subtilis 3127 UPIOOOO2D3D35 UniRef100 entry UniRef100 UPIOOOO2D3D35 OpuE 3128 Hypothetical protein yktD Bacilius UniRef100 Q45500 Bacilius subtiis YktD subtilis 3129 Alanine racemase 2 Bacilius subtilis UniRef100 P94494 Bacilius subtiis YncD 3130 Bacilius UniRef100 Q81GZ6 Bacilius cereus YoaN cereus 3131 Hypothetical protein CACO135 UniRef100 Q97MQ7 Clostridium acetobutyllicum Clostridium acetobutyllicum 3132 Hypothetical protein Bacilius UniRef100 Q6HGC9 Bacilius thuringiensis thiringiensis 3133 Hypothetical protein Bacilius UniRef100 Q6HGC8 Bacilius thuringiensis thiringiensis 3134 3135 Hypothetical protein Bacilius UniRef100 Q6HGC6 Bacilius thuringiensis thiringiensis 3136 Hypothetical protein Bacilius UniRef100 Q6HGC5 Bacilius thuringiensis thiringiensis 3137 Hypothetical conserved protein UniRef100 Q8ETF5 Oceanobacilius iheyensis Oceanobacilius iheyensis 3138 US 8,168,417 B2 189 190 TABLE 1-continued

Predicted functions

Bacilius subtiis SEQ homolog D (Gene NO. Description UniRef Accession No. Organism Name) 3139 3140 YkoN Bacillus subtilis UniRef100 O34625 Bacilius subtiis YkoN 3141 Hypothetical protein ykoP Bacilius UniRef100 O34495 Bacilius subtiis YkoP subtilis 3142 Hypothetical UPFO151 protein ykoQ UniRef100 O35040 Bacilius subtiis YkoQ Bacilius subtilis 3143 Undecaprenyl-diphosphatase Bacilius UniRef100 P94507 Bacilius subtiis YbB subtilis 3144 Hypothetical UPF0118 protein yubA UniRef100 O32086 Bacilius subtiis Yub A Bacilius subtilis 3145 Hypothetical oxidoreductase yulF UniRef100 OO5265 Bacilius subtiis Yul Bacilius subtilis 3146 Limo2256 protein Listeria UniRef100 Q929B9 Listeria YraA monocytogenes monocytogenes 3147 McpA 3148 McpA 3149 McpA 3150 Protein-glutamine gamma- UniRef100 P4O746 Bacilius subtiis Tgl glutamyltransferase Bacilius subtilis 3151 2-nitropropane dioxygenase Bacilius UniRef100 OO5413 Bacilius subtiis YrpB subtilis 3152 Hypothetical UPF0047 protein yugU UniRef100 OO5243 Bacilius subtiis YugU Bacilius subtiis 3153 Hypothetical protein yugT Bacilius UniRef100 OO5242 Bacilius subtiis YugT subtilis 3154 Transcriptional regulator, TetR family UniRef100 Q81GX6 Bacilius cereus YfiR Bacilius cereus 3155 Hypothetical protein ydeB Bacilius UniRef100 P54447 Bacilius subtiis YgeB subtilis 3156 Beta (1,4)-glucan glucanohydrolase UniRef100 Q6D3B7 Erwinia carotovora Erwinia carotovora 3157 Hypothetical UPF0053 protein yugS UniRef100 O05241 Bacilius subtiis YugS Bacilius subtiis 3158 Hypothetical protein yugP Bacilius UniRef100 OO5248 Bacilius subtiis YugP subtilis 3159 Yugo protein Bacillus subtilis UniRef100 Q795 M8 Bacilius subtiis 3160 Hypothetical protein yugN Bacilius UniRef100 O05246 Bacilius subtiis YugN subtilis 3161 Hypothetical protein Bacilius UniRef100 Q6HIW1 Bacilius YdfR thiringiensis thiringiensis 3162 YtaB protein Bacilius subtilis UniRef100 O34694 Bacilius subtiis YtaB 3163 Predicted acetyltransferase Clostridium UniRef100 Q97IT3 Cliostridium YkkB acetobiitvictim acetobiitvictim 3164 Glucose-6-phosphate isomerase UniRef100 P80860 Bacilius subtiis Pgi Bacilius subtilis 3165 Probable NADH-dependent butanol UniRef100 OO5239 Bacilius subtiis Yug dehydrogenase 1 Bacilius Subtilis 3166 YuZA protein Bacilius subtilis UniRef100 O32087 Bacilius subtiis 3167 General stress protein 13 Bacilius UniRef100 P80870 Bacilius subtiis YugI subtiis 3168 Alanine transaminase Bacillus subtilis UniRef100 Q795 M6 Bacilius subtiis AlaT 3169 Transcriptional regulator Bacilius UniRef100 OO5236 Bacilius subtiis AlaR subtiis 3170 Hypothetical protein yugF Bacilius UniRef100 OO5235 Bacilius subtiis YugF subtiis 3171 Hypothetical protein yugE Bacilius UniRef100 OO5234 Bacilius subtiis subtiis 3172 Hypothetical protein SMU.305 UniRef100 Q9X669 Streptococci is mutans Streptococci is mutans 3173 Putative aminotransferase B Bacilius UniRef100 Q08432 Bacilius subtiis Pat subtiis 3174 3175 Kinase-associated lipoprotein B UniRef100 Q08429 Bacilius subtiis KapB precursor Bacillus subtilis 3176 Hypothetical protein yugB Bacilius UniRef100 OO5231 Bacilius subtiis KapD subtiis 3177 Yx 31.78 PbpD 3179 Hypothetical protein yuxKBacilius UniRef100 P40761 Bacilius subtiis YuxK subtiis 3180 Hypothetical protein yufK Bacilius UniRef100 OO5249 Bacilius subtiis YfK subtiis US 8,168,417 B2 191 192 TABLE 1-continued

Predicted functions

Bacilius subtiis SEQ homolog D (Gene NO. Description OniRef Accession No. Organism Name) 3181 Hypothetical protein yufL Bacilius UniRef100 OO5250 Bacilius subtiis YfL subtilis 3182 Hypothetical protein yufM Bacilius UniRef100 OO5251 Bacilius subtiis YfM subtilis 31.83 3184 UPIOOOO3CB938 UniRefl00 entry UniRef100 UPIOOOO3CB938 PSSA 3185. Hypothetical protein ybfM Bacilius UniRef100 O31453 Bacilius subtiis YbfM subtilis 3186 Phosphatidylserine decarboxylase UniRef100 Q6HDI5 Bacilius Ps Bacilius thuringiensis thiringiensis 3.187 UPIOOOO3CCO69 UniRefl00 entry UniRef100 UPIOOOO3CCO69 3188 IS1627s 1-related, transposase Bacilius UniRef100 Q7CMDO Bacilius anthracis str. A2012 anthracis str. A2012 3189 Na(+)-malate symporter Bacilius UniRef100 OO5256 Bacilius subtiis MaleN subtilis 3190 3191 Na(+)/H(+) antiporter subunit A UniRef100 Q9K2S2 Bacilius subtiis MrpA Bacilius subtilis 3192 Na(+)/H(+) antiporter subunit B UniRef100 OO5259 Bacilius subtiis MrpB Bacilius subtilis 3193 Na(+)/H(+) antiporter subunit C UniRef100 OO5260 Bacilius subtiis MrpC Bacilius subtilis 3194 Na(+)/H(+) antiporter subunit D UniRef100 OO5229 Bacilius subtiis MrpD Bacilius subtilis 3195 Na(+)/H(+) antiporter subunit E UniRef100 Q7WY60 Bacilius subtiis MrpE Bacilius subtilis 3196 Na(+)/H(+) antiporter subunit F UniRef100 OO5228 Bacilius subtiis Bacilius subtilis 3197 Na(+)/H(+) antiporter subunit G UniRef100 OO5227 Bacilius subtiis MrpG Bacilius subtilis 31.98 YXO 31.99 ComA 3200 ComP 3201 IS1627s 1-related, transposase Bacilius UniRef100 Q7CMDO Bacilius anthracis str. A2012 anthracis str. A2012 3202 UPIOOOO3CCO69 UniRefl00 entry UniRef100 UPIOOOO3CCO69 32O3 ComP 3204 3205 ComO Bacillus subtilis UniRef100 Q9K5L3 Bacilius subtiis ComO 32O6 3207 Hypothetical protein yuzC Bacilius UniRef100 O32089 Bacilius subtiis YZC subtilis 3208 Hypothetical protein yuxH Bacilius UniRef100 P14203 Bacilius subtiis YxH subtilis 3209 YueK protein Bacilius subtilis UniRef100 O32090 Bacilius subtiis YueK 3210 Yue J protein Bacilius subtilis UniRef100 O32091 Bacilius subtiis Yue 3211 Yue 3212 3213 Hypothetical protein yueC Bacilius UniRef100 O32094 Bacilius subtiis subtilis 3214 YeF 3215 RRF2 family protein Bacillus cereus UniRef100 Q81EX1 Bacilius cereus YwnA 3216 Probable lipase? esterase UniRef100 Q7UUA2 Rhodopinellula YuxL Rhodopinellula baitical baitica 3217 BH1896 protein Bacilius halodurans UniRef100 Q9KBNO Bacilius halodurans 3218 YueB protein Bacilius subtilis UniRef100 O32098 Bacilius subtiis YueF 3219 Yued protein Bacilius subtilis UniRef100 O32099 Bacilius subtiis Yue) 3220 Hypothetical protein yueC Bacilius UniRef100 O32100 Bacilius subtiis YueC subtilis 3221 YueB protein Bacilius subtilis UniRef100 O32101 Bacilius subtiis YueB 3222 YukA protein Bacillus subtilis UniRef100 P71068 Bacilius subtiis YukA 3223 YukC protein Bacilius subtilis UniRef100 P71070 Bacilius subtiis YkC 3224 YukD protein Bacillus subtilis UniRef100 P71071 Bacilius subtiis 3225 LinC)049 protein Listeria innocua UniRef100 Q92FQ4 Listeria innoctia 3226 YA 3227 YukF protein Bacilius subtilis UniRef100 P71073 Bacilius subtiis YkF 3228 Alanine dehydrogenase Bacilius UniRef100 Q08352 Bacilius subtiis Ald subtilis 3229 3230 YuiH protein Bacilius subtilis UniRef100 O32103 Bacilius subtiis YH 3231 YuiG protein Bacilius subtilis UniRef100 O32104 Bacilius subtiis YG 3232 YuiF protein Bacilius subtilis UniRef100 O32105 Bacilius subtiis YF US 8,168,417 B2 193 194 TABLE 1-continued

Predicted functions

Bacilius subtiis SEQ homolog ID (Gene NO. Description OniRef Accession No. Organism Name) 3233 Probable cytosol aminopeptidase UniRef100 O32106 Bacilius subtiis YE Bacilius subtilis 3234 YuiD protein Bacilius subtilis UniRef100 O32107 Bacilius subtiis YD 3235 YuiC protein Bacilius subtilis UniRef100 O32108 Bacilius subtiis YC 3236 YuiB protein Bacilius subtilis UniRef100 O32109 Bacilius subtiis YB 3237 3238 YumEB 3239 Thioredoxine reductase Bacilius UniRef100 OO5268 Bacilius subtiis YumC subtilis 3240 3241 YdjO protein Bacilius subtilis UniRef100 O34759 Bacilius subtiis 3242 YxbD 3243 Hypothetical protein yutM Bacilius UniRef100 O32113 Bacilius subtiis YtM subtilis 3244 Diaminopimelate epimerase Bacilius UniRef100 O32114 Bacilius subtiis DapF subtilis 3245 YutK protein Bacilius subtilis UniRef100 O32115 Bacilius subtiis YutK 3246 Yuzb protein Bacilius subtilis UniRef100 O32116 Bacilius subtiis 3247 Yut protein Bacilius subtilis UniRef100 O32117 Bacilius subtiis Yt 3248 YdhO protein Bacillus subtilis UniRef100 OO5499 Bacilius subtiis YG 3249 Response regulator aspartate UniRef100 Q9KBE1 Bacilius RapI phosphatase Bacilius halodurans halodurans 3250 Phenolic acid decarboxylase Bacilius UniRef100 OO7006 Bacilius subtiis PadC subtilis 3251 BH2266 protein Bacilius halodurans UniRef100 Q9KAM1 Bacilius halodurans 3252 Yuzd protein Bacilius subtilis UniRef100 O32118 Bacilius subtiis YZD 3253 YutI protein Bacilius subtilis UniRef100 O32119 Bacilius subtiis 3254 Probable peptidase yuxL Bacilius UniRef100 P39839 Bacilius subtiis YuxL subtilis 3255 Homoserine kinase Bacillus subtilis UniRef100 PO4948 Bacilius subtiis ThrB 3256 Threonine synthase Bacilius subtilis UniRef100 PO4990 Bacilius subtiis ThrC 3257 Homoserine dehydrogenase Bacilius UniRef100 P19582 Bacilius subtiis Hom subtilis 3258 Glycerate dehydrogenase UniRef100 Q8ENW9 Oceanobacilius YvcT Oceanobacilius iheyensis iheyensis 3259 Yuth protein Bacilius subtilis UniRef100 O32123 Bacilius subtiis YtEH 3260 Hypothetical protein yutG Bacilius UniRef100 O32124 Bacilius subtiis Yt subtilis 3261 YutF protein Bacilius subtilis UniRef100 O32125 Bacilius subtiis YutF 3262 YutE protein Bacilius subtilis UniRef100 O32126 Bacilius subtiis YtE 3263 YutD protein Bacilius subtilis UniRef100 O32127 Bacilius subtiis 3264 YutC protein Bacilius subtilis UniRef100 O32128 Bacilius subtiis YtC 326S Lip A 3266 YunA protein Bacillus subtilis UniRef100 O32130 Bacilius subtiis YunA, 3267 3268 Sodium-dependent transporter Bacilius UniRef100 Q9K7C5 Bacilius YocR halodurans halodurans 3269 YunB protein Bacilius subtilis UniRef100 O32131 Bacilius subtiis YB 3270 YunC protein Bacilius subtilis UniRef100 O32132 Bacilius subtiis YunC 3271 Yun) protein Bacillus subtilis UniRef100 O32133 Bacilius subtiis Yun) 3272 YunE protein Bacilius subtilis UniRef100 O32134 Bacilius subtiis YE 3273 YunF protein Bacilius subtilis UniRef100 O32135 Bacilius subtiis YF 3274 YmcC 3275 TetR family transcriptional regulator? UniRef100 Q67KA4 Symbiobacterium PkSA Symbiobacterium thermophilum thermophilum 3.276 3277 Purine catabolism protein pucC UniRef100 O32148 Bacilius subtiis YrG Bacilius subtilis 3278 Allantoate amidohydrolase Bacilius UniRef100 O32149 Bacilius subtiis YEH subtilis 3279 Purine catabolism regulatory protein UniRef100 O32138 Bacilius subtiis PucR Bacilius subtilis 3280 Multidrug resistance protein B Bacilius UniRef100 Q63FH7 Bacilius cereus Blt cereus ZK) ZK. 3281 BH2308 protein Bacilius halodurans UniRef100 Q9KAH9 Bacilius YcgA halodurans 3282 TrpD 3283 Anthranilate phosphoribosyltransferase UniRef100 Q8U089 Pyrococcus furiosus Pyrococcus furiosus 3284 Extracellular ribonuclease precursor UniRef100 O32150 Bacilius subtiis Yr Bacilius subtilis 3285 BH1977 protein Bacilius halodurans UniRef100 Q9KBF1 Bacilius halodurans US 8,168,417 B2 195 196 TABLE 1-continued

Predicted functions

Bacilius subtiis SEQ homolog ID (Gene NO. Description OniRef Accession No. Organism Name) 3286 YrR 3287 Putative membrane protein Bordeteila UniRef100 Q7WGW7 Bordeteila bronchiseptica bronchiseptical 3288 UPIOOOO3CB453 UniRefl00 entry UniRef100 UPIOOOO3CB453 3289 Response regulator aspartate UniRef100 P96649 Bacilius subtiis RapI phosphatase I Bacilius subtilis 3290 YurU protein Bacilius subtilis UniRef100 O32162 Bacilius subtiis Yr 3291 NiflJ-like protein Bacilius subtilis UniRef100 O32163 Bacilius subtiis YurV 3292 Probable cysteine desulfurase Bacilius UniRef100 O32164 Bacilius subtiis Csd subtilis 3293 YurX protein Bacilius subtilis UniRef100 O32165 Bacilius subtiis YX 3294 Vegetative protein 296 Bacilius subtilis UniRef100 P80866 Bacilius subtiis Yury 3295 Limo2575 protein Listeria UniRef100 Q8Y480 Listeria CzcD monocytogenes monocytogenes 3296 3297 BH3473 protein Bacilius halodurans UniRef100 Q9K796 Bacilius YZ halodurans 3298 YuSA protein Bacilius subtilis UniRef100 O32167 Bacilius subtiis YuSA 3299 Yusb protein Bacilius subtilis UniRef100 O32168 Bacilius subtiis YSB 3300 YusC protein Bacilius subtilis UniRef100 O321.69 Bacilius subtiis YSC 3301 Hypothetical protein yus.O Bacilius UniRef100 O32170 Bacilius subtiis YuSD subtilis 3302 Yush protein Bacilius subtilis UniRef100 O32171 Bacilius subtiis 3303 Yus protein Bacilius subtilis UniRef100 O32172 Bacilius subtiis YSF 3304 3305 Glycine cleavage system H protein UniRef100 O32174 Bacilius subtiis GcvH Bacilius subtilis 3306 Hypothetical protein yusI Bacilius UniRef100 O32175 Bacilius subtiis YuSI subtilis 3307 Yus protein Bacilius subtilis UniRef100 O32176 Bacilius subtiis YuSJ 3308 YusK protein Bacilius subtilis UniRef100 O32177 Bacilius subtiis YuSK 3309 YusL protein Bacilius subtilis UniRef100 O32178 Bacilius subtiis YuSL 3310 3311 3312 YusN protein Bacilius subtilis UniRef100 O3218O Bacilius subtiis YSN 3313 Hypothetical protein yusU Bacilius UniRef100 O321.87 Bacilius subtiis subtilis 3314 BH1040 protein Bacilius halodurans UniRef100 Q9KE18 Bacilius halodurans 3315 YusV protein Bacilius subtilis UniRef100 O32188 Bacilius subtiis YSV 3.316 YA 3317 YfiZ protein Bacilius subtilis UniRef100 O31568 Bacilius subtiis YZ 3318 YfiY protein Bacillus subtilis UniRef100 O31567 Bacilius subtiis YfiY 3319 Hypothetical protein yusW precursor UniRef100 O321.89 Bacilius subtiis YSW Bacilius subtilis 3320 YusX protein Bacilius subtilis UniRef100 O32190 Bacilius subtiis YuSX 3321 D-alanyl-D-alanine carboxypeptidase UniRef100 Q8ERGO Oceanobacilius DacB Oceanobacilius iheyensis iheyensis 3322 Hypothetical oxidoreductase yusz UniRef100 P37959 Bacilius subtiis YuSZ Bacilius subtilis 3323 Metalloregulation DNA-binding stress UniRef100 P37960 Bacilius subtiis MrgA protein Bacilius subtilis 3324 Probable serine proteaseyvtA Bacilius UniRef100 Q9R9I1 Bacilius subtiis YytA subtiis 3325 Transcriptional regulatory protein cssR UniRef100 O32192 Bacilius subtiis CSSR Bacilius subtilis 3326 Sensor protein cssS Bacilius subtilis UniRef100 O321.93 Bacilius subtiis CSSS 3327 YirB Bacillus subtilis) UniRef100 O323O2 Bacilius subtiis 3328 Putative HTH-type transcriptional UniRef100 P4O950 Bacilius subtiis YXN regulatoryuxN Bacilius subtilis 3329 Fumarate hydratase class II Bacilius UniRef100 PO7343 Bacilius subtiis CitG subtiis 3330 3331 Spore germination protein A1 Bacilius UniRef100 PO7868 Bacilius subtiis GerAA subtiis 3332 Spore germination protein A2 Bacilius UniRef100 PO7869 Bacilius subtiis GerAE subtiis 3333 Spore germination protein A3 precursor UniRef100 PO7870 Bacilius subtiis GerAC Bacilius subtilis 3334 3335 YvaC 3336 YvdE protein Bacillus subtilis UniRef100 O32198 Bacilius subtiis YvdE 3337 YvoF protein Bacilius subtilis UniRef100 O32199 Bacilius subtiis YvaF US 8,168,417 B2 197 198 TABLE 1-continued

Predicted functions

Bacilius subtiis SEQ homolog ID (Gene NO. Description OniRef Accession No. Organism Name) 3338 YvoG protein Bacillus subtilis UniRef100 O32200 Bacilius subtiis YvaG 3339 Hypothetical protein yvgH Bacilius UniRef100 O322O1 Bacilius subtiis Yvah subtilis 3340 Hypothetical protein yvg|IBacillus UniRef100 O322O2 Bacilius subtiis YvdI subtilis 3341 Pectate lyase P358 Bacilius sp. P-358 UniRef100 Q8RR73 Bacillus sp. P-358 3342 YvaK protein Bacillus subtilis UniRef100 O34899 Bacilius subtiis YvaK 3343 UPIOOOO2E3648 UniRef100 entry UniRef100 UPIOOOO2E3648 FabG 3344 3345 GbsB 3346 Dap A 3347 Putative metal binding protein, YvrA UniRef100 O34631 Bacilius subtiis Yvra Bacilius subtilis 3348 Putative hemin permease, YvrB UniRef100 O34451 Bacilius subtiis YvrB Bacilius subtilis 3349 Putative metal binding protein, YvrC UniRef100 O34805 Bacilius subtiis YvrC Bacilius subtilis 3350 Transcriptional regulator, GntR family UniRef100 Q81SA7 Bacilius YdhC Bacilius anthracis anthracis 3351 Putative ketoreductase, YvrD Bacilius UniRef100 O34782 Bacilius subtiis YvrD subtilis 3352 UPIOOOO3CC410 UniRefl00 entry UniRef100 UPIOOOO3CC410 YK 3353 Transcriptional regulators, LysR family UniRef100 Q81 DJ6 Bacilius cereus AlsR Bacilius cereus 3354 Exo-poly-alpha-D-galacturonosidase, UniRef100 Q9WYR8 Thermotoga maritima putative Thermotoga maritima 3355 Altronate hydrolase Bacilius subtilis UniRef100 O34673 Bacilius subtiis UXaA 3356 Altronate oxidoreductase Bacilius UniRef100 O34354 Bacilius subtiis UxaB subtilis 3357 LacI repressor-like protein Bacilius UniRef100 Q9JMQ1 Bacilius subtiis ExuR subtilis 3358 Hypothetical symporteryjmB Bacilius UniRef100 O34961 Bacilius subtiis YimB subtilis 3359 Uronate isomerase Bacilius subtilis UniRef100 O34808 Bacilius subtiis UxaC 3360 Putative sensory protein kinase, YvrG UniRef100 O34989 Bacilius subtiis YvrG Bacilius subtilis 3361 Putative DNA binding response UniRef100 P945.04 Bacilius subtiis YvrH regulator, YvrH Bacillus subtilis 3362 Ferrichrome transport ATP-binding UniRef100 P49938 Bacilius subtiis FhuC protein fhuC Bacilius subtilis 3363 Ferrichrome transport system UniRef100 P49.937 Bacilius subtiis FhuG permease protein fhuG Bacillus subtilis 3364. Ferrichrome transport system UniRef100 P49936 Bacilius subtiis FhuB permease protein fhuB Bacilius subtilis 3365 Putative arginine ornithine antiporter, UniRef100 O32204 Bacilius subtiis YvsH YvsH Bacilius subtilis 3366 Hypothetical protein yvsG precursor UniRef100 O32205 Bacilius subtiis YvsG Bacilius subtiis 3367 Putative molybdate binding protein, UniRef100 O32206 Bacilius subtiis Yvg.) YvgJ Bacillus subtilis 3368 YCB 3369 Putative reductase protein, YvgN UniRef100 O32210 Bacilius subtiis YvgN Bacilius subtiis 3370 Fructokinase Listeria monocytogenes UniRef100 Q722A5 Listeria YR monocytogenes 3371 Hypothetical protein yebU precursor UniRef100 P42253 Bacilius subtiis YcbU Bacilius subtiis 3372 Hypothetical protein CPEO889 UniRef100 Q8XMO1 Clostridium perfingens Clostridium perfingens 3373 YvgS protein Bacilius subtilis UniRef100 O32215 Bacilius subtiis YvgS 3374 Hypothetical UPFO126 protein yvgT UniRef100 O32216 Bacilius subtiis YvgT Bacilius subtiis 3375 Glutamate-rich protein grpBBacilius UniRef100 Q81CT5 Bacilius cereus YakA cereus 3376 Acetyltransferase, GNAT family UniRef100 Q6HJN8 Bacilius Yua Bacilius thuringiensis thiringiensis 3377 Disulfide bond formation protein C UniRef100 O32217 Bacilius subtiis BbC Bacilius subtilis 3378 Disulfide bond formation protein D UniRef100 O32218 Bacilius subtiis BdbD precursor Bacillus subtilis