US 20140296161A1

(19) United States
(12) Patent Application Publication
     Qian et al.
(10) Pub. No.: US 2014/0296161 A1
(43) Pub. Date: Oct. 2, 2014

(54) DIDEMNIN BIOSYNTHETIC GENE CLUSTER IN TISTRELLA MOBILIS
(71) Applicant: King Abdullah University of Science and Technology, Thuwal (SA)
(72) Inventors: Pei-Yuan Qian, Hong Kong (CN); Ying Sharon Xu, Hong Kong (CN); Pok-Yui Lai, Hong Kong (CN)
(73) Assignee: King Abdullah University of Science and Technology, Thuwal (SA)
(21) Appl. No.: 14/346,068
(22) PCT Filed: Sep. 21, 2012
(86) PCT No.: PCT/IB2012/002361
     § 371 (c)(1), (2), (4) Date: Mar. 20, 2014

Related U.S. Application Data
(60) Provisional application No. 61/537,416, filed on Sep. 21, 2011.

Publication Classification
(51) Int. Cl.
     C07K 11/02 (2006.01)
     C07K 14/195 (2006.01)
(52) U.S. Cl.
     CPC ...... C07K 11/02 (2013.01); C07K 14/195 (2013.01)
     USPC ...... 514/21.1; 536/23.1; 53.59. [garbled in source]; 530/324; 435/252.3; 435/69.1

(57) ABSTRACT
A novel Tistrella mobilis strain having Accession Deposit Number NRRL B-50531 is provided. A method of producing a didemnin precursor, didemnin or didemnin derivative by using the Tistrella mobilis strain, and the therapeutic composition comprising at least one didemnin or didemnin derivative produced from the strain or modified strain thereof are also provided.

[Sheet 1 of 16: FIG. 1A. Structures of didemnin B and its monomers, and of didemnin X.]

[Sheet 2 of 16: FIG. 1A (Cont'd). Structures of nordidemnin B and dehydrodidemnin B (Aplidine).]

[Sheet 3 of 16: FIG. 1B. Structures of didemnins A, B, and C, which differ in the substituent R (didemnin A: R = H; the R groups for didemnins B and C are not recoverable from the source text).]

[Sheets 4 to 16 of 16: drawings not recoverable from the source text.]

DIDEMNIN BIOSYNTHETIC GENE CLUSTER IN TISTRELLA MOBILIS

[0001] This application claims priority to U.S. Provisional Patent Application Ser. No. 61/537,416, filed Sep. 21, 2011, which application is incorporated by reference herein in its entirety.

TECHNICAL FIELD

[0002] The present invention generally concerns the fields of cell biology, molecular biology, bacteriology, and medicine. In particular aspects, the invention concerns direct or indirect production of anti-cancer compounds in [...].

BACKGROUND OF THE INVENTION

[0003] The didemnins (FIG. 1) are a group of cyclic depsipeptides with extraordinary biological activities, including antitumor, antiviral and immunosuppressive activities (Vera and Joullié, 2002; Rawat et al., 2006). Rinehart and his coworkers (1981) first isolated didemnins A, B and C from a Caribbean tunicate of the family Didemnidae. Since then, more than 20 other didemnins have been isolated from tunicates collected from different geographical locations. Owing to its potent antitumor activity, didemnin B was the first marine natural product selected to enter clinical trials. Although didemnin B failed in the mid-1990s, a closely related didemnin compound, dehydrodidemnin B (Aplidin), is currently in human phase II clinical trials for solid and haematological malignant neoplasias such as T cell lymphoma and myelofibrosis, and in phase III clinical trials for multiple myeloma (Soto-Matos et al., 2011). Until now, didemnins have been isolated exclusively from eukaryotic marine tunicates. However, since cyclic peptides are usually synthesized by non-ribosomal peptide synthetases (NRPS) from microorganisms, a microorganism associated with the tunicates would be a useful source for producing didemnins. Until the present invention, careful scrutiny of tunicate symbiotic microorganisms had not succeeded in locating such a microorganism, and chemical synthesis remained the only route to obtain sufficient amounts of didemnins. The chemical reactions to synthesize didemnin A, the simplest compound of the didemnin family, are too complicated to be finished within 10 steps (Jou et al., 1997), and consequently the yield is undoubtedly very low. Therefore, the discovery of a didemnin biosynthetic gene cluster from a microbe will be beneficial for producing didemnins more economically. More importantly, novel didemnin analogues can be generated through [...] of the biosynthetic pathways and be investigated for their biological activities, which may in turn provide more drug leads.

BRIEF SUMMARY OF THE INVENTION

[0004] The present invention is directed to a system, method, and compositions related to production of anti-cancer compounds or precursors or derivatives thereto from a bacterium. In particular aspects, the bacterium is from the Tistrella genus, and in specific aspects the bacterium is Tistrella mobilis.

[0005] This invention relates to the production of didemnins or didemnin derivatives through the use of novel didemnin-producing bacterium and/or didemnins or didemnin derivatives produced by these species; it also relates to polynucleotide and amino acids from the novel bacterium producing the didemnins. Furthermore, the invention relates to therapeutic compositions containing the didemnins and to uses of the therapeutic compositions.

[0006] It is therefore an object of the present invention to provide novel species of didemnin-producing Tistrella bacteria. It is also an object of the invention to provide a novel Tistrella mobilis bacterium or one or more strains thereof. In specific cases, the novel didemnin-producing Tistrella mobilis bacterium was deposited on Jul. 27, 2011 as Accession No. NRRL B-50531 with the depository Agricultural Research Service Culture Collection, National Center for Agricultural Utilization Research, Agricultural Research Service, U.S. Department of Agriculture, 1815 North University Street, Peoria, Ill. 61604, U.S.A.

[0007] An isolated or biologically pure culture of the bacterium is encompassed in the invention. A further object of the present invention is to provide didemnins, didemnin precursors, and/or didemnin derivatives produced by novel strains of Tistrella. In additional aspects of the present invention, there are novel [...] and polynucleotide sequences of Tistrella mobilis, and exemplary sequences include SEQ ID NOS:10-61.

[0008] Any Tistrella species may be utilized in the invention. In specific embodiments, the following bacteria are utilized: Tistrella mobilis (Shi et al., 2002); Tistrella bauzanensis (Zhang et al., 2010), Tistrella sp. BZ78, Tistrella sp. D1-34, Tistrella sp. D1-36, Tistrella sp. D6-30, Tistrella sp. f-1-2, Tistrella sp. JW16.1a, Tistrella sp. MARC2PPND, Tistrella sp. PhS5A, Tistrella sp. S67-5, Tistrella sp. S73-3, and Tistrella sp. Zp5.

[0009] Specific embodiments of the invention include the isolation of didemnin B and nordidemnin B from a Gram-negative marine-derived bacterium, Tistrella mobilis. The complete genome sequence of this bacterium revealed the biosynthetic gene cluster responsible for didemnin synthesis. Bioinformatic analysis was used to predict the function of the [...] in the cluster. Culture conditions were optimized in order to increase the yield of the target didemnin compounds.

[0010] In particular aspects, there is a process for producing a didemnin, which comprises the steps of: a) culturing at least one Tistrella strain in growth-supporting nutrient medium capable of promoting growth and reproduction of said bacteria, wherein said culturing is effected for a time sufficient to allow production of a didemnin; and b) recovering the didemnin from said bacteria or medium of step a).

[0011] One embodiment of the present invention is to provide a plurality of bacteria for the mass production of a didemnin or precursor or derivative thereof.

[0012] A particular embodiment of the present invention is to provide a novel process for the production of didemnins, didemnin precursors, or didemnin derivatives from bacteria. The industrial application of this process would provide renewable sources of didemnins, didemnin precursors, or didemnin derivatives for the [...]. In certain aspects, there is a biotransformation process in which bacteria-derived didemnins, didemnin precursors, or didemnin derivatives are converted into substances that are useful as therapeutic compounds or for the production of other therapeutic compounds. In certain aspects of the invention, didemnin B and/or nordidemnin B (for example) are produced endogenously by Tistrella mobilis and derivatives are made
therefrom one or the other; however, in other cases other didemnins are produced endogenously by other Tistrella bacteria.

[0013] Aspects of the invention include methods and compositions wherein bacteria capable of producing didemnin precursors or derivatives, wherein [...] are added to their culture medium to effect production of a didemnin or the didemnin derivatives.

[0014] In accordance with the present invention there is also provided a process for improving production of didemnins, didemnin precursors, or didemnin derivatives in bacteria comprising the steps of a) culturing Tistrella bacteria in the presence of a mutagenic agent for a period of time sufficient to allow mutagenesis; and b) selecting said mutants by a change of the phenotype that results in an increased production of didemnins, didemnin precursors, or didemnin derivatives. The mutagenic agent may be a chemical agent, such as daunorubicin and nitrosoguanidine; a physical agent, such as gamma radiation or ultraviolet radiation; or a biological agent, such as a transposon, for example. Exemplary modifications include those to the side chain region, to the Hip-isostatine region, to the tetrapeptide region, and/or to the macrocyclic backbone.

[0015] In certain embodiments, there is provided a process for improving biotransformation of didemnins into didemnin derivative-producing bacteria comprising the steps of a) culturing bacteria in the presence of a mutagenic agent for a time sufficient to allow mutagenesis; and b) selecting said mutants by a change of the phenotype that results in an increased biotransformation of didemnins into didemnin derivative-producing bacteria.

[0016] In some aspects of the invention, an anti-cancer compound is produced from Tistrella bacteria. The compound may be isolated directly from the bacteria, or another compound may be isolated from the bacteria from which the anti-cancer compound is then synthesized, either directly or indirectly through one or more other compounds.

[0017] Embodiments of the invention include methods and compositions regarding fermentation to produce didemnins and their derivative compounds. Some embodiments include the gene cluster in Tistrella for producing didemnins. In specific cases the didemnin gene cluster includes Tistrella didA, didB, didC, didD, didE, didF, didG, didH, and didI; however, in particular embodiments the didemnin gene cluster includes one or more of Tistrella didA, didB, didC, didD, didE, didF, didG, didH, and didI. In at least certain aspects, one or more members of the didemnin gene cluster may be transformed into bacteria from another genus, such as Escherichia; in particular, one or more of the didemnin gene cluster members may be transferred into E. coli for the production of a didemnin or didemnin derivative from E. coli, either directly or indirectly.

[0018] In particular aspects of the invention, there is included production of didemnins, didemnin precursors, or didemnin derivatives by the Tistrella mobilis JAM 14872T (Shi et al., 2002). Tistrella mobilis JAM 14872T was cultured and extracted in the same way as the Tistrella mobilis strain described herein, and the UPLC-HRMS profile shows that this type strain also produces didemnin B and nordidemnin B.

[0019] Anti-cancer compounds of the present invention may be useful for any type of cancer, including at least the following: breast, lung, prostate, colon, pancreatic, blood, brain, liver, spleen, esophageal, ovarian, cervical, kidney, thyroid, rectal, bone, gallbladder, stomach, and so forth. The invention may be employed for any type of mammal, including humans, dogs, cats, horses, pigs, sheep, and goats.

[0020] In alternative embodiments, the compositions of the invention relate to anti-viral and/or immunosuppressive compounds.

[0021] In some aspects of the invention, one or more polynucleotides in Tistrella (such as one of plasmids 1, 2, 3, or 4 of Tistrella mobilis, for example) are able to conjugate with other bacteria. In specific cases, the plasmid transfers to E. coli or another bacterium.

[0022] In an embodiment of the invention, there is an isolated Tistrella mobilis bacterium having Accession Deposit Number NRRL B-50531 with the depository Agricultural Research Service Culture Collection, National Center for Agricultural Utilization Research, Agricultural Research Service, U.S. Department of Agriculture. In some embodiments, there is an isolated polynucleotide selected from the group consisting of SEQ ID NO:10-25, 42-51, 62-66, and a mixture thereof. In particular embodiments, there is an isolated polypeptide selected from the group consisting of SEQ ID NO:26-41, 52-61, and a mixture thereof.

[0023] In specific embodiments, a bacterium comprises at least one genetic modification compared to wild-type, and in certain aspects the genetic modification is in a gene in the didemnin gene cluster. In specific cases, at least one genetic modification is in a condensation, adenylation, thiolation, ketoreductase, ketosynthase, methyltransferase, or thioesterase domain of a gene in the didemnin gene cluster. In certain embodiments, at least one genetic modification is in the ketoreductase or adenylation domains.

[0024] In one embodiment of the invention, there is a method of producing a didemnin precursor, didemnin, or didemnin derivative, comprising the steps of a) culturing a host cell harboring bacterial didemnin synthesis genes; and b) recovering said didemnin precursor, didemnin, or didemnin derivative from said host cell. In some cases, the host cell is further defined as a Tistrella bacterium, a genetically modified Tistrella bacterium compared to wild-type, or E. coli. The Tistrella bacterium may be Tistrella mobilis, in some cases. In particular aspects, the didemnin is selected from the group consisting of didemnin A, didemnin B, didemnin C, didemnin D, didemnin E, didemnin G, didemnin X, didemnin Y, nordidemnin, or a combination thereof. In some cases, the method further comprises modifying the recovered didemnin precursor, didemnin, or didemnin derivative.

[0025] In an embodiment of the invention, there is a therapeutic composition comprising in a suitable carrier: at least one isolated didemnin produced by a culture comprising bacteria having Accession Deposit Number NRRL B-50531 with the depository Agricultural Research Service Culture Collection, National Center for Agricultural Utilization Research, Agricultural Research Service, U.S. Department of Agriculture; at least one didemnin or didemnin derivative produced from a modified bacteria of the bacterial strain having Accession Deposit Number NRRL B-50531 with the depository Agricultural Research Service Culture Collection, National Center for Agricultural Utilization Research, Agricultural Research Service, U.S. Department of Agriculture; or a mixture thereof.

[0026] The foregoing has outlined rather broadly the features and technical advantages of the present invention in order that the detailed description of the invention that follows may be better understood. Additional features and advantages of the invention will be described hereinafter
which form the subject of the claims of the invention. It should be appreciated by those skilled in the art that the conception and specific embodiment disclosed may be readily utilized as a basis for modifying or designing other structures for carrying out the same purposes of the present invention. It should also be realized by those skilled in the art that such equivalent constructions do not depart from the spirit and scope of the invention as set forth in the appended claims. The novel features which are believed to be characteristic of the invention, both as to its organization and method of operation, together with further objects and advantages will be better understood from the following description when considered in connection with the accompanying figures. It is to be expressly understood, however, that each of the figures is provided for the purpose of illustration and description only and is not intended as a definition of the limits of the present invention.

BRIEF DESCRIPTION OF THE DRAWINGS

[0027] For a more complete understanding of the present invention, reference is now made to the following descriptions taken in conjunction with the accompanying drawings.

[0028] FIGS. 1A and 1B illustrate structures of didemnin B and its exemplary monomers and other representative didemnins.

[0029] FIG. 2 provides detection of didemnin B (upper panel) and nordidemnin B (lower panel) in one of the fractions of Tistrella mobilis crude extract using UPLC-HRMS analysis.

[0030] FIG. 3 shows organization of the didemnin NRPS-PKS system.

[0031] FIG. 4 illustrates an exemplary incorporation of the two monomers Ist and Hip into didemnin B. Reactions are simplified by taking out the thiol-tethered peptidyl intermediates.

[0032] FIG. 5 shows an exemplary biosynthetic pathway of didemnin B. C: condensation domain. A: adenylation domain. T: thiolation domain. KR: ketoreductase domain. KS: ketosynthase domain. MT: methyltransferase domain. TE: thioesterase domain.

[0033] FIG. 6 shows detection of didemnin A from Tistrella mobilis.

[0034] FIGS. 7A and 7B show detection of didemnin N and nordidemnin A from Tistrella mobilis.

[0035] FIGS. 8A and 8B show detection of two new didemnins from Tistrella mobilis.

DETAILED DESCRIPTION OF THE INVENTION

[0036] As used herein the specification, "a" or "an" may mean one or more. As used herein in the claim(s), when used in conjunction with the word "comprising", the words "a" or "an" may mean one or more than one. As used herein "another" may mean at least a second or more. In specific embodiments, aspects of the invention may "consist essentially of" or "consist of" one or more sequences of the invention, for example. Some embodiments of the invention may consist of or consist essentially of one or more elements, method steps, and/or methods of the invention. It is contemplated that any method or composition described herein can be implemented with respect to any other method or composition described herein. Embodiments discussed in the context of methods and/or compositions of the invention may be employed with respect to any other method or composition described herein. Thus, an embodiment pertaining to one method or composition may be applied to other methods and compositions of the invention as well.

I. DEFINITIONS

[0037] The definitions provided in the entire disclosure supersede any conflicting definition in any of the references incorporated by reference herein. The fact that certain terms are defined, however, should not be considered as indicative that any term that is undefined is indefinite. Rather, all terms used are believed to describe the invention in terms such that one of ordinary skill can appreciate the scope and practice the present invention.

[0038] The term "didemnin" as used herein refers to a group of cyclic depsipeptides. In specific embodiments, the didemnin has antitumor, antiviral, and/or immunosuppressive activity.

[0039] The term "didemnin derivatives" as used herein refers to any compound constructed based on modification of a didemnin compound. In specific embodiments, the didemnin derivatives have antitumor, antiviral, and/or immunosuppressive activity.

[0040] The term "didemnin gene cluster" as used herein refers to a cluster of genes responsible for the biosynthesis of didemnins.

[0041] "Effective amount," "therapeutically effective amount" or "pharmaceutically effective amount" means that amount which, when administered to a subject or patient for treating a disease, is sufficient to effect such treatment for the disease, including to improve at least one symptom of the disease.

[0042] The term "growth supporting nutrient medium" is intended to mean any culture media which include, without limitation, carbon sources, nitrogen sources, amino acids, vitamins and minerals.

[0043] As used herein, the term "patient" or "subject" or "individual" refers to a living mammalian organism, such as a human, monkey, cow, sheep, goat, dog, cat, mouse, rat, guinea pig, or transgenic species thereof. In certain embodiments, the patient or subject is a primate. Non-limiting examples of human subjects are adults, juveniles, infants and fetuses.

[0044] "Prevention" or "preventing" includes: (1) inhibiting the onset of a disease in a subject or patient that may be at risk and/or predisposed to the disease but does not yet experience or display any or all of the pathology or symptomatology of the disease, and/or (2) slowing the onset of the pathology or symptoms of a disease in a subject or patient that may be at risk and/or predisposed to the disease but does not yet experience or display any or all of the pathology or symptoms of the disease. An individual at risk for cancer, for example, may be an individual with a family or personal history or that exhibits one or more known risk factors, such as certain lifestyle habits (for example, smoking) or particular gene-associated mutations (for example, BRCA1 or BRCA2 for breast or ovarian cancer) or particular elevations of a metabolite (elevated prostate-specific antigen (PSA) for prostate cancer).

II. GENERAL EMBODIMENTS OF THE INVENTION

[0045] In embodiments of the invention, there are bacteria that produce didemnins or didemnin derivatives, and in particular aspects the bacteria are from the family Rhodospirillales, although in specific aspects the bacteria are from the genus Tistrella.

[0046] The present invention at least provides a novel Tistrella mobilis bacterium; amino acid and polynucleotide sequences of the bacterium; didemnins, didemnin precursors, and/or didemnin derivatives produced therefrom; therapeutic compositions containing the didemnins, didemnin precursors, and/or didemnin derivatives; and/or methods for producing or using the didemnins, didemnin precursors, and/or didemnin derivative compositions.

III. TISTRELLA MOBILIS SEQUENCES AND THE DIDEMNIN GENE CLUSTER

[0047] Embodiments of the invention include isolated polynucleotide and polypeptide sequences from Tistrella mobilis, such as those included in SEQ ID NOS:10-61. In some cases, the isolated polynucleotide and polypeptide sequences from Tistrella mobilis are involved in synthesis of one or more didemnins, whereas in other cases the isolated polynucleotide and polypeptide sequences from Tistrella mobilis are not involved in synthesis of one or more didemnins. The isolated polynucleotide or polypeptide sequences may be modified and utilized in a cell or in vitro. In certain cases, the modified polynucleotide or polypeptide sequences are utilized in a Tistrella mobilis or other bacterial or [...] cell. Isolated or modified polynucleotide sequences may be employed in an expression vector.

[0048] Tistrella mobilis gene cluster nucleotide sequences include ORF1 (SEQ ID NO:10), ORF2 (SEQ ID NO:11), ORF3 (SEQ ID NO:12), ORF4 (SEQ ID NO:13), ORF5 (SEQ ID NO:14), ORF6 (SEQ ID NO:15), ORF7 (SEQ ID NO:16), ORF8 (SEQ ID NO:17), ORF9 (SEQ ID NO:18), ORF10 (SEQ ID NO:19), ORF11 (SEQ ID NO:20), ORF12 (SEQ ID NO:21), ORF13 (SEQ ID NO:22), ORF14 (SEQ ID NO:23), ORF15 (SEQ ID NO:24), and ORF16 (SEQ ID NO:25). Corresponding Tistrella mobilis gene cluster protein sequences for the respective ORFs are as follows: ORF1 (SEQ ID NO:26), ORF2 (SEQ ID NO:27), ORF3 (SEQ ID NO:28), ORF4 (SEQ ID NO:29), ORF5 (SEQ ID NO:30), ORF6 (SEQ ID NO:31), ORF7 (SEQ ID NO:32), ORF8 (SEQ ID NO:33), ORF9 (SEQ ID NO:34), ORF10 (SEQ ID NO:35), ORF11 (SEQ ID NO:36), ORF12 (SEQ ID NO:37), ORF13 (SEQ ID NO:38), ORF14 (SEQ ID NO:39), ORF15 (SEQ ID NO:40), and ORF16 (SEQ ID NO:41).

[0049] Tistrella mobilis didemnin gene cluster nucleotide sequences include one or more of the following: DidA (SEQ ID NO:42), DidB (SEQ ID NO:43), DidC (SEQ ID NO:44), DidD (SEQ ID NO:45), DidE (SEQ ID NO:46), DidF (SEQ ID NO:47), DidG (SEQ ID NO:48), DidH (SEQ ID NO:49), DidI (SEQ ID NO:50), and DidJ (SEQ ID NO:51). Corresponding Tistrella mobilis didemnin gene cluster protein sequences include one or more of the following: DidA (SEQ ID NO:52), DidB (SEQ ID NO:53), DidC (SEQ ID NO:54), DidD (SEQ ID NO:55), DidE (SEQ ID NO:56), DidF (SEQ ID NO:57), DidG (SEQ ID NO:58), DidH (SEQ ID NO:59), DidI (SEQ ID NO:60), and DidJ (SEQ ID NO:61).

[0050] The whole Tistrella mobilis genome is provided by a chromosomal sequence (SEQ ID NO:62) and four minichromosomes (which may be referred to as plasmids) including plasmid 1 (SEQ ID NO:63), plasmid 2 (SEQ ID NO:64), plasmid 3 (SEQ ID NO:65), and plasmid 4 (SEQ ID NO:66).

[0051] In aspects of the invention, the endogenous Tistrella mobilis didemnin gene cluster includes polypeptides that are modular according to the particular domains they employ. They may have one or more of the following domains: the adenylation (A) domain, the thiolation (T) domain, the condensation (C) domain, the ketoreductase (KR) domain, and/or the methyltransferase (MT) domain, for example. For example, didI has a C domain, an A domain, and a T domain, whereas didB has a C, A, T, and KR domain (see FIG. 3). Marahiel et al. (1997) provide a review of peptide synthetases, which is incorporated by reference herein in its entirety.

IV. DIDEMNINS

[0052] Didemnins are cyclic depsipeptide compounds (a depsipeptide is a peptide in which one or more of the amide (-CONHR-) bonds are replaced by ester (COOR) bonds). Although more than nine didemnins (didemnins A-E, G, X and Y) have been isolated from the extract of the exemplary tunicate Trididemnum solidum, in the prior art didemnin B is the one that possesses the most potent biological activities. It is a strong antiviral agent against both DNA and RNA viruses such as herpes simplex virus type 1, a strong immunosuppressant that shows some potential in skin graft, and is also very cytotoxic. Although didemnin B shows strong activity against murine leukemia cells and had completed phase II human clinical trials against adenocarcinoma of the kidney, advanced epithelial ovarian cancer, and metastatic breast cancer, it exhibits high toxicity in human subjects. For review see Vera and Joullié, 2002.

[0053] Didemnin precursors, didemnins, and didemnin derivatives are encompassed in methods and compositions of the invention. Exemplary didemnins that may be generated in the T. mobilis bacterium of the invention may be of any kind, but in specific embodiments they are didemnins A-I, M, X and Y, nordidemnin B, dehydrodidemnin B (Aplidine®), or are one or more of the didemnins described in U.S. Pat. No. 5,294,603, incorporated by reference herein in its entirety. In specific embodiments, didemnin derivatives include N-acyl congeners of didemnin A (DA); several DDB-type analogues of DA in which either pyruvic acid has been replaced (with phenylpyruvic acid or alpha-ketobutyric acid) or proline at position 8 has been replaced with L-azetidine-2-carboxylic acid (Azt), L-pipecolic acid (Pip), 1-amino-1-carboxylic cyclopentane (acc), D-Pro or sarcosine (Sar); and the didemnins X [(R)-3-hydroxy-decanoyl-(Gln)3-Lac-Pro-didemnin A]; Y [(R)-3-hydroxy-decanoyl-(Gln)4-Lac-Pro-didemnin A]; M (pGlu-Gln-Lac-Pro-didemnin A); N ([Tyr]didemnin B); nordidemnin N ([Tyr]nordidemnin B); and epididemnin A ([2S,4R-Hip]didemnin A) (residue-position superscripts are lost in the source text). Others include Isodidemnin A, Didemnin N, Nordidemnin N, Epididemnin A, Acyclodidemnin A, Dihydrodidemnin N, Dihydroepididemnin A, N-Acetyl didemnin A, N,O-Diacetyldidemnin A, N-(D-Prolyl) didemnin A, N-(benzyloxycarbonyl-D-prolyl) didemnin A, N-(D-Prolyl) didemnin A, N-(L-Prolyl) didemnin A, Acetyl-didemnin B, Propionyl-didemnin B, Isobutyryl-didemnin B, L-Ala-didemnin B, O-Benzyl L-Ala didemnin B, L-Ala didemnin B, D-Pro didemnin B, O-Benzyl-D-Pro didemnin B, D-Pro didemnin B, N (CHONsu)didemnin A, O-Acetyldidemnin A, [Hexahydro-Me2Tyr]didemnin A, [Hexahydro-N-MePhe]didemnin A, [Hexahydro-Me2Tyr]didemnin B, [Hexahydro-N-MePhe]didemnin B, and pyroglutaminyl didemnin B.

[0054] Glutaminyl derivatives as described in U.S. Pat. No. 6,841,530, incorporated by reference herein in its entirety, are also encompassed in the invention. Examples of derivatives include pyroglutaminyl didemnin B, O-Benzyldidemnin B, Benzyloxycarbonyl-L-Glutaminyldidemnin B, (Benzyloxycarbonyl-L-Glutaminyl) Didemnin, Benzyloxycarbonyl didemnin M, Benzyloxycarbonyl-L-Pyroglutaminyldidemnin B, Pyroglutaminyldidemnin B, Prolyldidemnin A, L-(N-Benzyloxycarbonyl-pyroglutaminyl)-L-glutaminyl didemnin B, N-Benzyloxycarbonyl-L-pyroglutaminyl didemnin B, Boc-L-prolyl-didemnin A, and L-Prolyl-didemnin B (see also US2001/0007855, incorporated by reference herein in its entirety).

[0055] In specific aspects, the didemnin is an N-acylated Didemnin A. In particular aspects of the invention, the acyl group of the N-acyl Didemnin A comprises a C to Cs group [chain-length subscripts illegible in the source text]. In certain embodiments, the didemnin is a derivative of Didemnin A (optionally a synthetic derivative), modified at a position selected from the group consisting of position 1, 5, 6, and combinations thereof, by the incorporation of a D-amino acid (for example). In other aspects, the didemnin is a synthetic derivative of Didemnin A, modified to include a Dehydrodidemnin B (DDB) moiety in the linear peptide chain. In particular aspects, the didemnin is selected from the group consisting of phenylpyruv-Pro didemnin A, Pyruv-Sar didemnin A, alpha-ketobutyryl-Pro didemnin A, Pyruv-Azt didemnin A, or Pyruv-D-Pro didemnin A.

[0056] In some aspects of the invention, one increases the lipophilicity of a didemnin, such as didemnin A, in its modifications, for example to raise its solubility in the plasma membrane and in at least certain cases increase its activity. In specific aspects, the N-amine of the N-methyl D-leucine position is useful for the addition of hydrophobic groups to didemnin A, for example. From here, one can synthesize a series of N-acylated analogues of didemnin A comprising alkyl chains with 2, 3, 4, 5, 7, 11, 15 or 17 carbon atoms.

[0057] Also included are dehydrodidemnin B derivatives in which either pyruvic acid has been replaced (with phenylpyruvic acid or alpha-ketobutyric acid, for example) or proline at position 8 has been replaced (with L-azetidine-2-carboxylic acid (Azt), L-pipecolic acid (Pip), 1-amino-1-carboxylic cyclopentane (acc5), D-Pro or sarcosine (Sar), for example).

[0058] Exemplary hydrophobic derivatives of didemnin A may be synthesized by incorporating acyl chains into didemnin A, ranging from 4 to 18 carbons, for example. Such compounds include N-butyryl didemnin A, N-pentanoyl didemnin A, N-hexanoyl didemnin A, N-octanoyl didemnin A, N-lauroyl didemnin A, N-palmitoyl didemnin A, or N-stearoyl didemnin A.

[0059] Three other exemplary derivatives of didemnin A include those in which the amino acids at positions 1, 5, and 6 were replaced with their corresponding D-amino acids. Exemplary compounds include D-Thr didemnin A, D-Pro didemnin A, and D-MeTyr(Me) didemnin A.

[0060] Five didemnin A derivatives related to dehydrodidemnin B may be synthesized by introducing DDB-type modifications into their linear peptide chain moieties. Such DDB-type compounds include Phenylpyruv-Pro didemnin A, Pyruv-Sar didemnin A, alpha-ketobutyryl-Pro didemnin A, Pyruv-Azt didemnin A, and Pyruv-D-Pro didemnin A.

[0061] Synthesis of N-acyl analogues may be carried out in solution. [...] complete acylation. Propionic, butyric and pentanoic acids may be introduced using a 3-5 fold excess of their symmetrical anhydride.

[0062] The symmetrical anhydrides of fatty acids may be prepared in the conventional manner using EDC. Acylation of didemnin A may be carried out in the presence of a catalytic amount of dimethylamino pyridine (DMAP). Derivatives may be purified on a silica gel column using methanol/chloroform as an eluant. They and others may be characterized using 1H NMR spectroscopy and HRFABMS. Exemplary methods of synthesis of didemnin analogs are described in U.S. Pat. No. 5,294,603, which is incorporated by reference herein in its entirety.

[0063] Structure elucidation, chemical conversion, biological activities including cytotoxicity, antiviral and immunosuppressive activities, and structure-activity relationships may be performed for any didemnin or didemnin derivative of the invention.

V. PRODUCTION OF MODIFIED ORGANISMS FOR DIDEMNIN PRECURSORS, DIDEMNINS, AND/OR DIDEMNIN DERIVATIVES, AND COMPOSITIONS PRODUCED THEREBY

[0064] In some embodiments of the invention, one or more host cells (which may be organisms) are modified to produce one or more didemnin precursors, didemnins, or didemnin derivatives. In some cases the organism is from the Tistrella genus, such as Tistrella mobilis, for example, although in other cases an organism or other cells are modified to produce didemnin precursors, didemnins, or didemnin derivatives.

A. Tistrella

[0065] In certain embodiments of the invention, a Tistrella bacterium is modified to affect the synthesis of one or more endogenous didemnins produced in the native bacterium. In some cases, the modified bacterium is genetically modified (such as by using recombinant technology to mutate or knock out one or more endogenous bacteria genes, including by transforming the bacteria with a polynucleotide construct, for example), although it may also or otherwise be chemically modified (daunorubicin or nitrosoguanidine, for example) and/or physically modified (gamma or ultraviolet radiation, for example).

[0066] For genetic modification of Tistrella (or other organisms), nucleic acid (comprising a mutation of interest or means to generate a mutation of interest) of the present invention can be expressed separately, i.e., inserted into separate vectors for expression. Such vectors are known or can be constructed by those skilled in the art and generally contain all expression elements (e.g., promoters, terminator fragments, enhancer elements, marker genes and other elements as appropriate) necessary to achieve the desired transcription of the sequences. Examples of vectors include viruses such as bacteriophages, baculoviruses, and retroviruses, DNA viruses, cosmids, plasmids, phagemids and other recombination vectors. The vectors can also contain elements for use in either prokaryotic or eukaryotic host systems. One of ordinary skill in the art will know which host systems are compatible with a particular vector. The vectors can be introduced
Acyl groups may be introduced at the N-methyl-D- into cells or tissues and expressed by any one of a variety of leucine unit of didemnin A using a symmetrical anhydride known methods within the art. Such methods can be found procedure with C to Cs fatty acids, and an excess of sym generally described in Sambrook et al. (1989, 1992) Molecu metrical anhydride may be required in order to achieve com lar Cloning: A Laboratory Manual, Cold Springs Harbor US 2014/0296.161 A1 Oct. 2, 2014

Laboratory, New York; Ausubel et al. (1989) Current Protocols in Molecular Biology, John Wiley and Sons, Baltimore, Md.; Chang, et al. (1995) Somatic Gene Therapy, CRC Press, Ann Arbor, Mich.; Vega, et al. (1995) Gene Targeting, CRC Press, Ann Arbor, Mich.; Vectors: A Survey of Molecular Cloning Vectors and Their Uses, Butterworths, Boston, Mass. (1988); and include, for example, stable or transient transfection, lipofection, electroporation and infection with recombinant viral vectors. Introduction of nucleic acids by infection offers several advantages over other listed methods. Higher efficiencies can be obtained due to their infectious nature. Moreover, viruses are very specialized and typically infect and propagate in specific cell types. Thus, their natural specificity can be used to target the vectors to specific cell types in vivo or within a tissue or mixed culture of cells. The viral vectors can also be modified with specific receptors or ligands to alter target specificity through receptor mediated events.

B. Non-Tistrella Cells

[0067] Host cells suitable for introduction and expression of the nucleic acids of the invention may be bacterial; however, yeast (e.g., Pichia, Saccharomyces, etc.), mammalian, or insect host cells are also contemplated, as is a cell-free expression system. In particular embodiments, the host cell or culture is bacterial. Exemplary bacterial host cells include E. coli as well as Bacillus sp.

[0068] In cases wherein Tistrella is not used as a fermentation or other system for production of modified didemnin and didemnin-related compounds, the host cell may be modified to include one or more proteins suitable for production of at least one didemnin precursor, didemnin, and/or didemnin derivative. In particular, the host cell may be modified to include part or all of a didemnin gene cluster, including one or more of didA-didJ and/or ORF1-16, for example. In specific embodiments, the cell includes ORF1, ORF3, ORF6, ORF7, ORF8 and/or ORF16. In particular aspects, one or more of ORF1, ORF3, ORF6, ORF7, ORF8 and/or ORF16 are involved in the regulation, transport, resistance and other functions of the didemnin gene cluster. In some cases, the host cell may be modified to include a mutated did gene (compared to Tistrella wild-type sequence) such that a didemnin derivative may be produced, and it optionally may also include one or more of the endogenous Tistrella wild-type did genes. The didemnin derivative may be produced directly from the modified host cell, or a didemnin precursor may be produced and additional synthesis steps may be required following purification of the precursor, for example.

C. Exemplary Modifications of Host Cells

[0069] In particular aspects of the invention, a host cell (Tistrella or E. coli, for example) is modified compared to its respective wild-type counterpart. The modification may be such that one or more endogenous polynucleotides in the host cell become mutated compared to their respective wild-type counterparts. The mutation may be of any kind, including a knockout, knockdown, point, frameshift, inversion, deletion, and so forth. The mutation may be of any kind so as to effect an altered synthesis of one or more of a didemnin precursor, didemnin, or didemnin derivative.

[0070] In some cases, one or more mutations are generated in one or more didemnin gene cluster members, including one or more of didA-didJ, and/or in one or more of ORF1-ORF16. In certain cases, a particular region of one or more didemnin gene cluster members is mutated. The region may be of any kind so as to effect an altered synthesis of one or more of a didemnin precursor, didemnin, or didemnin derivative. The affected region may be in the sequence that encodes (one or more) adenylation (A) domain, the thiolation (T) domain, the condensation (C) domain, the ketoreductase (KR) domain, and/or the methyltransferase (MT) domain, for example.

[0071] In particular aspects the A domain is modified, for example such that the A domain specificity and/or placement within the non-ribosomal peptide synthetase assembly affects the sequence of the monomers in the nonribosomal peptide. In specific cases, a particular DidA-DidJ gene is modified such that the specificity is different. For example, the DidA A domain may be altered to have substrate specificity other than glutamine, DidC may be altered to have substrate specificity other than proline, and so forth (see Table 3 of predicted normal substrate specificities).

[0072] In particular cases, one can genetically modify one or more genes of the didemnin gene cluster such that the pathway generates a didemnin derivative or didemnin precursor that can then be modified following purification of the precursor. For example, one can knock out one or more genes or one or more regions of one or more genes. For example, one can knock out the ketoreductase in DidB so that the pathway produces dehydrodidemnin B. In other exemplary measures, one can replace one of the A domains with another A domain in any of the members of the gene cluster, for example. For example, one can replace one of the A domains with another A domain (for example, one that activates valine) such that a different didemnin with a valine residue will be produced.

[0073] In some cases, the mutation is generated in the host cell. Following this, the mutant strain can be cultured to produce the desired didemnin precursor, didemnin, or didemnin derivative, and the compound can then be purified (for example, by one or more of extraction, fractionation by normal or reverse phase liquid chromatography, size-exclusion chromatography, reverse-phase HPLC, and so forth). If appropriate, further modifications to the compound may be produced. In some cases, one can culture the mutant strain and wild-type strain and extract them respectively; then, one can compare their respective metabolites, for example by LC-HRMS (high resolution mass spectrometry) and/or NMR to identify or verify production of the expected metabolite.

[0074] In alternative embodiments, one can mutagenize a plurality of host cells (including Tistrella or E. coli), for example, and perform a high throughput assay to obtain the mutant. The candidate mutants producing didemnin precursors, didemnins, or didemnin derivatives may be screened for a compound not otherwise obtained from the corresponding wild-type host cell or obtainable to much higher levels than in the corresponding wild-type host cell.

VI. DETERMINATION OF DIDEMNIN OR DIDEMNIN DERIVATIVE STRUCTURE

[0075] In embodiments of the invention, the didemnin or didemnin derivative is isolated and the structure is determined. Although one of skill in the art recognizes routine methods of purifying and determining a chemical structure, in specific cases one may use extraction (including, for example, methanol-toluene (or ethyl acetate or chloroform) extraction), silica gel, preparative thin-layer chromatography,

nuclear magnetic resonance imaging, acid hydrolysis, mass spectrometry, gas chromatography, X-ray crystallography, and a combination thereof.

VII. EXEMPLARY SYNTHESIS EMBODIMENTS

[0076] In certain cases, a didemnin precursor, didemnin, and/or didemnin derivative compound is obtained by bacteria or other fermentation methods of the invention and is then processed. The processing may include further purification, although in some embodiments the compound is subject to one or more further synthesis steps to obtain the desired molecule. Any suitable further synthesis steps may be employed in the art and are known to the skilled artisan (for representative examples, see Mayer et al., (1994); Jou et al. (1997)).

[0077] In specific aspects of the invention, there is in vitro conversion of didemnin B from T. mobilis to dehydrodidemnin B, for example using an oxidizing agent to oxidize the didemnin B to dehydrodidemnin B (see Faulkner D.J. Marine pharmacology. (2000) Antonie van Leeuwenhoek 77: 135-145, incorporated by reference herein in its entirety).

[0078] Thus, compounds of the present disclosure may be made using the methods known in the art. These methods can be further modified and optimized using the principles and techniques of organic chemistry as applied by a person skilled in the art. Such principles and techniques are taught, for example, in March's Advanced Organic Chemistry: Reactions, Mechanisms, and Structure (2007), which is incorporated by reference herein in its entirety.

[0079] Compounds encompassed in methods of the invention or produced directly or indirectly thereby may contain one or more asymmetrically-substituted carbon or nitrogen atoms, and may be isolated in optically active or racemic form. Thus, all chiral, diastereomeric, racemic form, epimeric form, and all geometric isomeric forms of a structure are intended, unless the specific stereochemistry or isomeric form is specifically indicated. Compounds may occur as racemates and racemic mixtures, single enantiomers, diastereomeric mixtures and individual diastereomers. In some embodiments, a single diastereomer is obtained. The chiral centers of the compounds of the present invention can have the S or the R configuration.

[0080] Compounds of the invention may also have the advantage that they may be more efficacious than, be less toxic than, be longer acting than, be more potent than, produce fewer side effects than, be more easily absorbed than, and/or have a better pharmacokinetic profile (e.g., higher oral bioavailability and/or lower clearance) than, and/or have other useful pharmacological, physical, or chemical properties over, compounds known in the prior art, whether for use in the indications stated herein or otherwise.

[0081] In addition, atoms making up the compounds of the present invention are intended to include all isotopic forms of such atoms. Isotopes, as used herein, include those atoms having the same atomic number but different mass numbers. By way of general example and without limitation, isotopes of hydrogen include tritium and deuterium, and isotopes of carbon include ¹³C and ¹⁴C.

[0082] It should be recognized that the particular anion or cation forming a part of any salt of this invention is not critical, so long as the salt, as a whole, is pharmacologically acceptable. Additional examples of pharmaceutically acceptable salts and their methods of preparation and use are presented in Handbook of Pharmaceutical Salts: Properties, and Use (2002), which is incorporated herein by reference.

[0083] When used in the context of a chemical group, "hydrogen" means -H; "hydroxy" means -OH; "oxo" means =O; "halo" means independently -F, -Cl, -Br or -I; "amino" means -NH₂ (see below for definitions of groups containing the term amino, e.g., alkylamino); "hydroxyamino" means -NHOH; "nitro" means -NO₂; imino means =NH (see below for definitions of groups containing the term imino, e.g., alkylimino); "cyano" means -CN; "isocyanate" means -N=C=O; "azido" means -N₃; in a monovalent context "phosphate" means -OP(O)(OH)₂ or a deprotonated form thereof; in a divalent context "phosphate" means -OP(O)(OH)O- or a deprotonated form thereof; "mercapto" means -SH; "thio" means =S; "thioether" means -S-; "sulfonamido" means -NHS(O)₂- (see below for definitions of groups containing the term sulfonamido, e.g., alkylsulfonamido); "sulfonyl" means -S(O)₂- (see below for definitions of groups containing the term sulfonyl, e.g., alkylsulfonyl); and "sulfinyl" means -S(O)- (see below for definitions of groups containing the term sulfinyl, e.g., alkylsulfinyl).

[0084] In the context of chemical formulas, the symbol "-" means a single bond, "=" means a double bond, and "≡" means triple bond. The symbol [dashed line] represents an optional bond, which if present is either single or double. The symbol "====" represents a single bond or a double bond. Thus, for example, the structure

[structure drawing]

includes the structures

[structure drawings]

As will be understood by a person of skill in the art, no one such ring atom forms part of more than one double bond. The symbol [wavy line], when drawn perpendicularly across a bond, indicates a point of attachment of the group. It is noted that the point of attachment is typically only identified in this manner for larger groups in order to assist the reader in rapidly and unambiguously identifying a point of attachment. The symbol [solid wedge] means a single bond where the group attached to the thick end of the wedge is "out of the page." The symbol [hashed wedge] means a single bond where the group attached to the thick end of the wedge is "into the page". The symbol [wavy bond] means a single bond where the conformation (e.g., either R or S) or the geometry is undefined (e.g., either E or Z).

[0085] Any undefined valency on an atom of a structure shown in this application implicitly represents a hydrogen atom bonded to the atom. When a group "R" is depicted as a "floating group" on a ring system, for example, in the formula:

[structure drawing]

[0086] then R may replace any hydrogen atom attached to any of the ring atoms, including a depicted, implied, or expressly defined hydrogen, so long as a stable structure is formed. When a group "R" is depicted as a "floating group" on a fused ring system, as for example in the formula:

[structure drawing]

[0087] then R may replace any hydrogen attached to any of the ring atoms of either of the fused rings unless specified otherwise. Replaceable hydrogens include depicted hydrogens (e.g., the hydrogen attached to the nitrogen in the formula above), implied hydrogens (e.g., a hydrogen of the formula above that is not shown but understood to be present), expressly defined hydrogens, and optional hydrogens whose presence depends on the identity of a ring atom (e.g., a hydrogen attached to group X, when X equals -CH-), so long as a stable structure is formed. In the example depicted, R may reside on either the 5-membered or the 6-membered ring of the fused ring system. In the formula above, the subscript letter "y" immediately following the group "R" enclosed in parentheses, represents a numeric variable. Unless specified otherwise, this variable can be 0, 1, 2, or any integer greater than 2, only limited by the maximum number of replaceable hydrogen atoms of the ring or ring system.

[0088] For the groups and classes below, the following parenthetical subscripts further define the group/class as follows: "(Cn)" defines the exact number (n) of carbon atoms in the group/class. "(C≤n)" defines the maximum number (n) of carbon atoms that can be in the group/class, with the minimum number as small as possible for the group in question, e.g., it is understood that the minimum number of carbon atoms in the group "alkenyl(C≤8)" or the class "alkene(C≤8)" is two. For example, "alkoxy(C≤10)" designates those alkoxy groups having from 1 to 10 carbon atoms (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10, or any range derivable therein (e.g., 3 to 10 carbon atoms)). "(Cn-n′)" defines both the minimum (n) and maximum number (n′) of carbon atoms in the group. Similarly, "alkyl(C2-10)" designates those alkyl groups having from 2 to 10 carbon atoms (e.g., 2, 3, 4, 5, 6, 7, 8, 9, or 10, or any range derivable therein (e.g., 3 to 10 carbon atoms)).

[0089] The term "saturated" as used herein means the compound or group so modified has no carbon-carbon double and no carbon-carbon triple bonds, except as noted below. The term does not preclude carbon-heteroatom multiple bonds, for example a carbon oxygen double bond or a carbon nitrogen double bond. Moreover, it does not preclude a carbon-carbon double bond that may occur as part of keto-enol tautomerism or imine/enamine tautomerism.

[0090] The term "aliphatic" when used without the "substituted" modifier signifies that the compound/group so modified is an acyclic or cyclic, but non-aromatic hydrocarbon compound or group. In aliphatic compounds/groups, the carbon atoms can be joined together in straight chains, branched chains, or non-aromatic rings (alicyclic). Aliphatic compounds/groups can be saturated, that is joined by single bonds (alkanes/alkyl), or unsaturated, with one or more double bonds (alkenes/alkenyl) or with one or more triple bonds (alkynes/alkynyl). When the term "aliphatic" is used without the "substituted" modifier only carbon and hydrogen atoms are present. When the term is used with the "substituted" modifier one or more hydrogen atom has been independently replaced by one of the following exemplary non-limiting functional groups: -OH, -F, -Cl, -Br, -I, -NH₂, -NO₂, -CO₂H, -CO₂CH₃, -CN, -SH, -OCH₃, -OCH₂CH₃, -C(O)CH₃, -N(CH₃)₂, -C(O)NH₂, -B(OH)₂, -P(O)(OCH₃)₂, or -OC(O)CH₃.

[0091] The term "alkyl" when used without the "substituted" modifier refers to a monovalent saturated aliphatic group with a carbon atom as the point of attachment, a linear or branched, cyclo, cyclic or acyclic structure, and no atoms other than carbon and hydrogen. Thus, as used herein cycloalkyl is a subset of alkyl. The groups -CH₃ (Me), -CH₂CH₃ (Et), -CH₂CH₂CH₃ (n-Pr), -CH(CH₃)₂ (iso-Pr), -CH(CH₂)₂ (cyclopropyl), -CH₂CH₂CH₂CH₃ (n-Bu), -CH(CH₃)CH₂CH₃ (sec-butyl), -CH₂CH(CH₃)₂ (iso-butyl), -C(CH₃)₃ (tert-butyl), -CH₂C(CH₃)₃ (neo-pentyl), cyclobutyl, cyclopentyl, cyclohexyl, and cyclohexylmethyl are non-limiting examples of alkyl groups. The term "alkanediyl" when used without the "substituted" modifier refers to a divalent saturated aliphatic group, with one or two saturated carbon atom(s) as the point(s) of attachment, a linear or branched, cyclo, cyclic or acyclic structure, no carbon-carbon double or triple bonds, and no atoms other than carbon and hydrogen. The groups, -CH₂- (methylene), -CH₂CH₂-, -CH₂C(CH₃)₂CH₂-, -CH₂CH₂CH₂-, and

[structure drawing]

are non-limiting examples of alkanediyl groups. The term "alkylidene" when used without the "substituted" modifier refers to the divalent group =CRR′ in which R and R′ are independently hydrogen, alkyl, or R and R′ are taken together to represent an alkanediyl having at least two carbon atoms. Non-limiting examples of alkylidene groups include: =CH₂, =CH(CH₂CH₃), and =C(CH₃)₂. When the term is used with the "substituted" modifier one or more hydrogen atom has been independently replaced by one of the following exemplary non-limiting functional groups: -OH, -F, -Cl, -Br, -I, -NH₂, -NO₂, -CO₂H, -CO₂CH₃, -CN, -SH, -OCH₃, -OCH₂CH₃, -C(O)CH₃, -N(CH₃)₂, -C(O)NH₂, -B(OH)₂, -P(O)(OCH₃)₂, or -OC(O)CH₃. The following groups are non-limiting examples of substituted alkyl groups: -CH₂OH, -CH₂Cl, -CF₃, -CH₂CN, -CH₂C(O)OH, -CH₂C(O)OCH₃, -CH₂C(O)NH₂, -CH₂C(O)CH₃, -CH₂OCH₃, -CH₂OC(O)CH₃, -CH₂NH₂, -CH₂N(CH₃)₂, and -CH₂CH₂Cl. The term "fluoroalkyl" is a subset of substituted alkyl, in which one or more hydrogen has been substituted with a fluoro group and no other atoms aside from carbon, hydrogen and fluorine are present. The groups, -CH₂F, -CF₃, and -CH₂CF₃ are non-limiting examples of fluoroalkyl groups. An "alkane" refers to the compound H-R, wherein R is alkyl.

[0092] The term "aryl" when used without the "substituted" modifier refers to a monovalent unsaturated aromatic group with an aromatic carbon atom as the point of attachment, said carbon atom forming part of a one or more six-membered aromatic ring structure, wherein the ring atoms are all carbon, and wherein the group consists of no atoms other than carbon and hydrogen. If more than one ring is present, the rings may be fused or unfused. As used herein, the term does not preclude the presence of one or more alkyl group (carbon number limitation permitting) attached to the first aromatic ring or any additional aromatic ring present. Non-limiting examples of aryl groups include phenyl (Ph), methylphenyl, (dimethyl)phenyl, -C₆H₄CH₂CH₃ (ethylphenyl), naphthyl, and the monovalent group derived from biphenyl. The term "arenediyl" when used without the "substituted" modifier refers to a divalent aromatic group, with two aromatic carbon atoms as points of attachment, said carbon atoms forming part of one or more six-membered aromatic ring structure(s) wherein the ring atoms are all carbon, and wherein the monovalent group consists of no atoms other than carbon and hydrogen. As used herein, the term does not preclude the presence of one or more alkyl group (carbon number limitation permitting) attached to the first aromatic ring or any additional aromatic ring present. If more than one ring is present, the rings may be fused or unfused. Non-limiting examples of arenediyl groups include:

[structure drawings]

[0093] When the term "aryl" is used with the "substituted" modifier one or more hydrogen atom has been independently replaced by one of the following exemplary non-limiting functional groups: -OH, -F, -Cl, -Br, -I, -NH₂, -NO₂, -CO₂H, -CO₂CH₃, -CN, -SH, -OCH₃, -OCH₂CH₃, -C(O)CH₃, -N(CH₃)₂, -C(O)NH₂, -B(OH)₂, -P(O)(OCH₃)₂ or -OC(O)CH₃. An "arene" refers to the compound H-R, wherein R is aryl.

[0094] The term "aralkyl" when used without the "substituted" modifier refers to the monovalent group -alkanediyl-aryl, in which the terms alkanediyl and aryl are each used in a manner consistent with the definitions provided above. Non-limiting examples of aralkyls are: phenylmethyl (benzyl, Bn) and 2-phenyl-ethyl. When the term is used with the "substituted" modifier one or more hydrogen atom has been independently replaced by one of the following exemplary non-limiting functional groups: -OH, -F, -Cl, -Br, -I, -NH₂, -NO₂, -CO₂H, -CO₂CH₃, -CN, -SH, -OCH₃, -OCH₂CH₃, -C(O)CH₃, -N(CH₃)₂, -C(O)NH₂, -B(OH)₂, -P(O)(OCH₃)₂, or -OC(O)CH₃. Non-limiting examples of substituted aralkyls are: (3-chlorophenyl)-methyl, and 2-chloro-2-phenyl-eth-1-yl.

[0095] The term "heteroaryl" when used without the "substituted" modifier refers to a monovalent aromatic group with an aromatic carbon atom or nitrogen atom as the point of attachment, said carbon atom or nitrogen atom forming part of an aromatic ring structure wherein at least one of the ring atoms is nitrogen, oxygen or sulfur, and wherein the group consists of no atoms other than carbon, hydrogen, aromatic nitrogen, aromatic oxygen and aromatic sulfur. As used herein, the term does not preclude the presence of one or more alkyl group (carbon number limitation permitting) attached to the aromatic ring or any additional aromatic ring present. Non-limiting examples of heteroaryl groups include furanyl, imidazolyl, indolyl, indazolyl (Im), methylpyridyl, oxazolyl, pyridyl, pyrrolyl, pyrimidyl, pyrazinyl, quinolyl, quinazolyl, quinoxalinyl, thienyl, and triazinyl. The term "heteroarenediyl" when used without the "substituted" modifier refers to a divalent aromatic group, with two aromatic carbon atoms, two aromatic nitrogen atoms, or one aromatic carbon atom and one aromatic nitrogen atom as the two points of attachment, said atoms forming part of one or more aromatic ring structure(s) wherein at least one of the ring atoms is nitrogen, oxygen or sulfur, and wherein the divalent group consists of no atoms other than carbon, hydrogen, aromatic nitrogen, aromatic oxygen and aromatic sulfur. As used herein, the term does not preclude the presence of one or more alkyl group (carbon number limitation permitting) attached to the first aromatic ring or any additional aromatic ring present. If more than one ring is present, the rings may be fused or unfused. Non-limiting examples of heteroarenediyl groups include:

[structure drawings]

0096. When the term is used with the “substituted modi respectively. A non-limiting example of an arylamino group fier one or more hydrogen atom has been independently is —NHCHs. The term "amido” (acylamino), when used replaced by one of the following exemplary non-limiting without the “substituted modifier, refers to the group functional groups: —OH, -F, -Cl. —Br. —I, —NH —NHR, in which R is acyl, as that term is defined above. A —NO, —COH, -COCH, —CN. —SH, —OCH, non-limiting example of an amido group is —NHC(O)CH. —OCHCH. —C(O)CH, N(CH), —C(O)NH2. The term “alkylimino' when used without the “substituted –B(OH), -P(O)(OCH) or - OC(O)CH. modifier refers to the divalent group —NR, in which R is an 0097. The term “acyl” when used without the “substi alkyl, as that term is defined above. When the term is used tuted modifier refers to the group —C(O)R, in which R is a with the “substituted” modifier one or more hydrogen atom hydrogen, alkyl, aryl, aralkyl or heteroaryl, as those terms are has been independently replaced by one of the following defined above. The groups, —CHO,-CO)CH (acetyl. Ac), exemplary non-limiting functional groups: —OH, -F, -Cl, —C(O)CHCH –C(O)CHCHCH, C(O)CH(CH), Br. —I, NH, NO. —COH, -COCH, —CN, —C(O)CH(CH), —C(O)CHs —C(O)CH, CH, —SH, —OCH —OCHCH, —C(O)CH, N(CH), —C(O)CHCHs. -C(O)(imidazolyl) are non-limiting - C(O)NH2. —B(OH), -P(O)(OCH) or - OC(O)CH. examples of acyl groups. A “thioacyl is defined in an analo The groups - NHC(O)OCH and - NHC(O)NHCH are gous manner, except that the oxygen atom of the group non-limiting examples of Substituted amido groups. —C(O)R has been replaced with a sulfur atom, —C(S)R. 0100. 
When the term is used with the “substituted” modifier one or more hydrogen atom has been independently replaced by one of the following exemplary non-limiting functional groups: -OH, -F, -Cl, -Br, -I, -NH₂, -NO₂, -CO₂H, -CO₂CH₃, -CN, -SH, -OCH₃, -OCH₂CH₃, -C(O)CH₃, -N(CH₃)₂, -C(O)NH₂, -B(OH)₂, -P(O)(OCH₃)₂ or -OC(O)CH₃. The groups, -C(O)CH₂CF₃, -CO₂H (carboxyl), -CO₂CH₃ (methylcarboxyl), -CO₂CH₂CH₃, -C(O)NH₂ (carbamoyl), and -CON(CH₃)₂, are non-limiting examples of substituted acyl groups.

[0098] The term “alkoxy” when used without the “substituted” modifier refers to the group -OR, in which R is an alkyl, as that term is defined above. Non-limiting examples of alkoxy groups include: -OCH₃, -OCH₂CH₃, -OCH₂CH₂CH₃, -OCH(CH₃)₂, -OCH(CH₂)₂, -O-cyclopentyl, and -O-cyclohexyl. The terms “alkenyloxy”, “alkynyloxy”, “aryloxy”, “aralkoxy”, “heteroaryloxy”, and “acyloxy”, when used without the “substituted” modifier, refer to groups, defined as -OR, in which R is alkenyl, alkynyl, aryl, aralkyl, heteroaryl, and acyl, respectively. Similarly, the term “alkylthio” when used without the “substituted” modifier refers to the group -SR, in which R is an alkyl, as that term is defined above. When the term is used with the “substituted” modifier one or more hydrogen atom has been independently replaced by one of the following exemplary non-limiting functional groups: -OH, -F, -Cl, -Br, -I, -NH₂, -NO₂, -CO₂H, -CO₂CH₃, -CN, -SH, -OCH₃, -OCH₂CH₃, -C(O)CH₃, -N(CH₃)₂, -C(O)NH₂, -B(OH)₂, -P(O)(OCH₃)₂ or -OC(O)CH₃. The term “[...]” corresponds to an alkane, as defined above, wherein at least one of the hydrogen atoms has been replaced with a [...].

[0099] The term “alkylamino” when used without the “substituted” modifier refers to the group -NHR, in which R is an alkyl, as that term is defined above. Non-limiting examples of alkylamino groups include: -NHCH₃ and -NHCH₂CH₃. The term “dialkylamino” when used without the “substituted” modifier refers to the group -NRR', in which R and R' can be the same or different alkyl groups, or R and R' can be taken together to represent an alkanediyl. Non-limiting examples of dialkylamino groups include: -N(CH₃)₂, -N(CH₃)(CH₂CH₃), and N-pyrrolidinyl. The terms “alkoxyamino”, “alkenylamino”, “alkynylamino”, “arylamino”, “aralkylamino”, “heteroarylamino”, and “alkylsulfonylamino” when used without the “substituted” modifier, refer to groups, defined as -NHR, in which R is alkoxy, alkenyl, alkynyl, aryl, aralkyl, heteroaryl, and alkylsulfonyl,

The term “heterocyclic” or “heterocycle” when used without the “substituted” modifier signifies that the compound/group so modified comprises at least one ring in which at least one ring atom is an element other than carbon. Examples of the non-carbon ring atoms include but are not limited to nitrogen, oxygen, sulfur, boron, phosphorus, arsenic, antimony, germanium, bismuth, silicon and/or tin. Examples of heterocyclic structures include but are not limited to aziridine, azirine, oxirane, epoxide, oxirene, thiirane, episulfides, thiirene, diazirine, oxaziridine, dioxirane, azetidine, azete, oxetane, oxete, thietane, thiete, diazetidine, dioxetane, dioxete, dithietane, dithiete, pyrrolidine, pyrrole, oxolane, furane, thiolane, thiophene, borolane, borole, phospholane, phosphole, arsolane, arsole, stibolane, stibole, bismolane, bismole, silolane, silole, stannolane, stannole, imidazolidine, imidazole, pyrazolidine, pyrazole, imidazoline, pyrazoline, oxazolidine, oxazole, oxazoline, isoxazolidine, isoxazole, thiazolidine, thiazole, thiazoline, isothiazolidine, isothiazole, dioxolane, dithiolane, triazole, furazan, oxadiazole, thiadiazole, dithiazole, tetrazole, piperidine, pyridine, oxane, pyran, thiane, thiopyran, silinane, siline, germinane, germine, stanninane, stannine, borinane, borinine, phosphinane, phosphinine, arsinane, arsinine, piperazine, diazine, morpholine, oxazine, thiomorpholine, thiazine, dioxane, dioxine, dithiane, dithiine, triazine, trioxane, tetrazine, azepane, azepine, oxepane, oxepine, thiepane, thiepine, homopiperazine, diazepine, thiazepine, azocane, azocine, oxecane, or thiocane. When the term “heterocyclic” is used with the “substituted” modifier one or more hydrogen atom has been independently replaced by one of the following exemplary non-limiting functional groups: -OH, -F, -Cl, -Br, -I, -NH₂, -NO₂, -CO₂H, -CO₂CH₃, -CN, -SH, -OCH₃, -OCH₂CH₃, -C(O)CH₃, -N(CH₃)₂, -C(O)NH₂, or -OC(O)CH₃.

[0101] As generally used herein “pharmaceutically acceptable” refers to those compounds, materials, compositions, and/or dosage forms which are, within the scope of sound medical judgment, suitable for use in contact with the tissues, organs, and/or bodily fluids of human beings and animals without excessive toxicity, irritation, allergic response, or other problems or complications commensurate with a reasonable benefit/risk ratio.

[0102] “Pharmaceutically acceptable salts” means salts of compounds of the present invention which are pharmaceutically acceptable, as defined above, and which possess the desired pharmacological activity. Such salts include acid addition salts formed with inorganic acids such as hydrochloric acid, hydrobromic acid, sulfuric acid, nitric acid, phos

phoric acid, and the like; or with organic acids such as 1,2-ethanedisulfonic acid, 2-hydroxyethanesulfonic acid, 2-naphthalenesulfonic acid, 3-phenylpropionic acid, 4,4'-methylenebis(3-hydroxy-2-ene-1-carboxylic acid), 4-methylbicyclo[2.2.2]oct-2-ene-1-carboxylic acid, [...], aliphatic mono- and dicarboxylic acids, aliphatic sulfuric acids, aromatic sulfuric acids, benzenesulfonic acid, benzoic acid, camphorsulfonic acid, carbonic acid, cinnamic acid, citric acid, cyclopentanepropionic acid, ethanesulfonic acid, fumaric acid, glucoheptonic acid, gluconic acid, glutamic acid, glycolic acid, heptanoic acid, hexanoic acid, hydroxynaphthoic acid, lactic acid, laurylsulfuric acid, maleic acid, malic acid, [...], mandelic acid, methanesulfonic acid, muconic acid, o-(4-hydroxybenzoyl)benzoic acid, oxalic acid, p-chlorobenzenesulfonic acid, phenyl-substituted alkanoic acids, propionic acid, p-toluenesulfonic acid, pyruvic acid, salicylic acid, stearic acid, succinic acid, tartaric acid, tertiarybutylacetic acid, trimethylacetic acid, trifluoroacetic acid, trifluoromethylsulfonic (triflic) acid and the like. Pharmaceutically acceptable salts also include base addition salts which may be formed when acidic protons present are capable of reacting with inorganic or organic bases. Acceptable inorganic bases include sodium hydroxide, sodium carbonate, potassium hydroxide, aluminum hydroxide and calcium hydroxide. Acceptable organic bases include, but are not limited to ethanolamine, diethanolamine, triethanolamine, tromethamine, N-methylglucamine and the like. It should be recognized that the particular anion or cation forming a part of any salt of this invention is not critical, so long as the salt, as a whole, is pharmacologically acceptable. Additional examples of pharmaceutically acceptable salts and their methods of preparation and use are presented in Handbook of Pharmaceutical Salts: Properties, and Use (P. H. Stahl & C. G. Wermuth eds., Verlag Helvetica Chimica Acta, 2002).

VIII. PHARMACEUTICAL PREPARATIONS

[0103] Pharmaceutical compositions of the present invention comprise an effective amount of one or more didemnin or didemnin derivative compositions of the invention dissolved or dispersed in a pharmaceutically acceptable carrier. The phrases “pharmaceutical or pharmacologically acceptable” refer to molecular entities and compositions that do not produce an adverse, allergic or other untoward reaction when administered to an animal, such as, for example, a human, as appropriate. The preparation of a pharmaceutical composition that contains at least one composition of the invention or additional active ingredient will be known to those of skill in the art in light of the present disclosure, as exemplified by Remington’s Pharmaceutical Sciences, 18th Ed. Mack Printing Company, 1990, incorporated herein by reference. Moreover, for animal (e.g., human) administration, it will be understood that preparations should meet sterility, pyrogenicity, general safety and purity standards as required by FDA Office of Biological Standards.

[0104] As used herein, “pharmaceutically acceptable carrier” includes any and all solvents, dispersion media, coatings, surfactants, antioxidants, preservatives (e.g., antibacterial agents, antifungal agents), isotonic agents, absorption delaying agents, salts, preservatives, drugs, drug stabilizers, gels, binders, excipients, disintegration agents, lubricants, sweetening agents, flavoring agents, dyes, such like materials and combinations thereof, as would be known to one of ordinary skill in the art (see, for example, Remington’s Pharmaceutical Sciences, 18th Ed. Mack Printing Company, 1990, pp. 1289-1329, incorporated herein by reference). Except insofar as any conventional carrier is incompatible with the active ingredient, its use in the pharmaceutical compositions is contemplated.

[0105] The didemnin or didemnin derivative may comprise different types of carriers depending on whether it is to be administered in solid, liquid or aerosol form, and whether it needs to be sterile for such routes of administration as injection. The present invention can be administered intravenously, intradermally, transdermally, intrathecally, intraarterially, intraperitoneally, intranasally, intravaginally, intrarectally, topically, intramuscularly, subcutaneously, mucosally, orally, topically, locally, by inhalation (e.g., aerosol inhalation), injection, infusion, continuous infusion, localized perfusion bathing target cells directly, via a catheter, via a lavage, in cremes, in lipid compositions (e.g., liposomes), or by other method or any combination of the foregoing as would be known to one of ordinary skill in the art (see, for example, Remington’s Pharmaceutical Sciences, 18th Ed. Mack Printing Company, 1990, incorporated herein by reference).

[0106] The didemnin or didemnin derivative may be formulated into a composition in a free base, neutral or salt form. Pharmaceutically acceptable salts include the acid addition salts, e.g., those formed with the free amino groups of a proteinaceous composition, or which are formed with inorganic acids such as, for example, hydrochloric or phosphoric acids, or such organic acids as acetic, oxalic, tartaric or mandelic acid. Salts formed with the free carboxyl groups can also be derived from inorganic bases such as, for example, sodium, potassium, ammonium, calcium or ferric hydroxides; or such organic bases as isopropylamine, trimethylamine, histidine or procaine. Upon formulation, solutions will be administered in a manner compatible with the dosage formulation and in such amount as is therapeutically effective. The formulations are easily administered in a variety of dosage forms such as formulated for parenteral administrations such as injectable solutions, or aerosols for delivery to the lungs, or formulated for alimentary administrations such as drug release capsules and the like.

[0107] Further in accordance with the present invention, the composition of the present invention suitable for administration is provided in a pharmaceutically acceptable carrier with or without an inert diluent. The carrier should be assimilable and includes liquid, semi-solid, i.e., pastes, or solid carriers. Except insofar as any conventional media, agent, diluent or carrier is detrimental to the recipient or to the therapeutic effectiveness of the composition contained therein, its use in administrable composition for use in practicing the methods of the present invention is appropriate. Examples of carriers or diluents include [...], oils, water, saline solutions, lipids, liposomes, resins, binders, fillers and the like, or combinations thereof. The composition may also comprise various antioxidants to retard oxidation of one or more component. Additionally, the prevention of the action of microorganisms can be brought about by preservatives such as various antibacterial and antifungal agents, including but not limited to parabens (e.g., methylparabens, propylparabens), chlorobutanol, phenol, sorbic acid, thimerosal or combinations thereof.

[0108] In accordance with the present invention, the composition is combined with the carrier in any convenient and practical manner, i.e., by solution, suspension, emulsification, admixture, encapsulation, absorption and the like. Such procedures are routine for those skilled in the art.

[0109] In a specific embodiment of the present invention, the composition is combined or mixed thoroughly with a semi-solid or solid carrier. The mixing can be carried out in any convenient manner such as grinding. Stabilizing agents can be also added in the mixing process in order to protect the composition from loss of therapeutic activity, i.e., denaturation in the stomach. Examples of stabilizers for use in the composition include buffers, amino acids such as glycine and lysine, carbohydrates such as dextrose, mannose, galactose, fructose, lactose, sucrose, maltose, sorbitol, mannitol, etc.

[0110] In further embodiments, the present invention may concern the use of pharmaceutical lipid vehicle compositions that include a didemnin or didemnin derivative, one or more lipids, and an aqueous solvent. As used herein, the term “lipid” will be defined to include any of a broad range of substances that is characteristically insoluble in water and extractable with an organic solvent. This broad class of compounds is well known to those of skill in the art, and as the term “lipid” is used herein, it is not limited to any particular structure. Examples include compounds which contain long chain aliphatic hydrocarbons and their derivatives. A lipid may be naturally occurring or synthetic (i.e., designed or produced by man). However, a lipid is usually a biological substance. Biological lipids are well known in the art, and include for example, neutral fats, phospholipids, phosphoglycerides, steroids, terpenes, lysolipids, glycosphingolipids, glycolipids, sulphatides, lipids with ether and ester-linked fatty acids and polymerizable lipids, and combinations thereof. Of course, compounds other than those specifically described herein that are understood by one of skill in the art as lipids are also encompassed by the compositions and methods of the present invention.

[0111] One of ordinary skill in the art would be familiar with the range of techniques that can be employed for dispersing a composition in a lipid vehicle. For example, the didemnin or didemnin derivative may be dispersed in a solution containing a lipid, dissolved with a lipid, emulsified with a lipid, mixed with a lipid, combined with a lipid, covalently bonded to a lipid, contained as a suspension in a lipid, contained or complexed with a micelle or liposome, or otherwise associated with a lipid or lipid structure by any means known to those of ordinary skill in the art. The dispersion may or may not result in the formation of liposomes.

[0112] The actual dosage amount of a composition of the present invention administered to an animal patient can be determined by physical and physiological factors such as body weight, severity of condition, the type of disease being treated, previous or concurrent therapeutic interventions, idiopathy of the patient and on the route of administration. Depending upon the dosage and the route of administration, the number of administrations of a preferred dosage and/or an effective amount may vary according to the response of the subject. The practitioner responsible for administration will, in any event, determine the concentration of active ingredient(s) in a composition and appropriate dose(s) for the individual subject.

[0113] In certain embodiments, pharmaceutical compositions may comprise, for example, at least about 0.1% of an active compound. In other embodiments, the active compound may comprise between about 2% to about 75% of the weight of the unit, or between about 25% to about 60%, for example, and any range derivable therein. Naturally, the amount of active compound(s) in each therapeutically useful composition may be prepared in such a way that a suitable dosage will be obtained in any given unit dose of the compound. Factors such as solubility, bioavailability, biological half-life, route of administration, product shelf life, as well as other pharmacological considerations will be contemplated by one skilled in the art of preparing such pharmaceutical formulations, and as such, a variety of dosages and treatment regimens may be desirable.

[0114] In other non-limiting examples, a dose may also comprise from about 1 microgram/kg/body weight, about 5 microgram/kg/body weight, about 10 microgram/kg/body weight, about 50 microgram/kg/body weight, about 100 microgram/kg/body weight, about 200 microgram/kg/body weight, about 350 microgram/kg/body weight, about 500 microgram/kg/body weight, about 1 milligram/kg/body weight, about 5 milligram/kg/body weight, about 10 milligram/kg/body weight, about 50 milligram/kg/body weight, about 100 milligram/kg/body weight, about 200 milligram/kg/body weight, about 350 milligram/kg/body weight, about 500 milligram/kg/body weight, to about 1000 mg/kg/body weight or more per administration, and any range derivable therein. In non-limiting examples of a derivable range from the numbers listed herein, a range of about 5 mg/kg/body weight to about 100 mg/kg/body weight, about 5 microgram/kg/body weight to about 500 milligram/kg/body weight, etc., can be administered, based on the numbers described above.

D. Alimentary Compositions and Formulations

[0115] In preferred embodiments of the present invention, the composition(s) are formulated to be administered via an alimentary route. Alimentary routes include all possible routes of administration in which the composition is in direct contact with the alimentary tract. Specifically, the pharmaceutical compositions disclosed herein may be administered orally, buccally, rectally, or sublingually. As such, these compositions may be formulated with an inert diluent or with an assimilable edible carrier, or they may be enclosed in hard- or soft-shell gelatin capsule, or they may be compressed into tablets, or they may be incorporated directly with the food of the diet.

[0116] In certain embodiments, the active compounds may be incorporated with excipients and used in the form of ingestible tablets, buccal tablets, troches, capsules, elixirs, suspensions, syrups, wafers, and the like (Mathiowitz et al., 1997; Hwang et al., 1998; U.S. Pat. Nos. 5,641,515; 5,580,579 and 5,792,451, each specifically incorporated herein by reference in its entirety). The tablets, troches, pills, capsules and the like may also contain the following: a binder, such as, for example, gum tragacanth, acacia, cornstarch, gelatin or combinations thereof; an excipient, such as, for example, dicalcium phosphate, mannitol, lactose, starch, magnesium stearate, sodium saccharine, cellulose, magnesium carbonate or combinations thereof; a disintegrating agent, such as, for example, corn starch, potato starch, alginic acid or combinations thereof; a lubricant, such as, for example, magnesium stearate; a sweetening agent, such as, for example, sucrose, lactose, saccharin or combinations thereof; a flavoring agent, such as, for example peppermint, oil of wintergreen, cherry flavoring, orange flavoring, etc. When the dosage unit form is a capsule, it may contain, in addition to materials of the above type, a liquid carrier. Various other materials may be present as coatings or to otherwise modify the physical form of the dosage unit. For instance, tablets, pills, or capsules may be

coated with shellac, sugar, or both. When the dosage form is a capsule, it may contain, in addition to materials of the above type, carriers such as a liquid carrier. Gelatin capsules, tablets, or pills may be enterically coated. Enteric coatings prevent denaturation of the composition in the stomach or upper bowel where the pH is acidic. See, e.g., U.S. Pat. No. 5,629,001. Upon reaching the small intestines, the basic pH therein dissolves the coating and permits the composition to be released and absorbed by specialized cells, e.g., epithelial enterocytes and Peyer's patch M cells. A syrup or elixir may contain the active compound, sucrose as a sweetening agent, methyl and propylparabens as preservatives, a dye and flavoring, such as cherry or orange flavor. Of course, any material used in preparing any dosage unit form should be pharmaceutically pure and substantially non-toxic in the amounts employed. In addition, the active compounds may be incorporated into sustained-release preparation and formulations.

[0117] For oral administration the compositions of the present invention may alternatively be incorporated with one or more excipients in the form of a mouthwash, dentifrice, buccal tablet, oral spray, or sublingual orally-administered formulation. For example, a mouthwash may be prepared incorporating the active ingredient in the required amount in an appropriate solvent, such as a sodium borate solution (Dobell's Solution). Alternatively, the active ingredient may be incorporated into an oral solution such as one containing sodium borate, glycerin and potassium bicarbonate, or dispersed in a dentifrice, or added in a therapeutically-effective amount to a composition that may include water, binders, abrasives, flavoring agents, foaming agents, and humectants. Alternatively the compositions may be fashioned into a tablet or solution form that may be placed under the tongue or otherwise dissolved in the mouth.

[0118] Additional formulations which are suitable for other modes of alimentary administration include suppositories. Suppositories are solid dosage forms of various weights and shapes, usually medicated, for insertion into the rectum. After insertion, suppositories soften, melt or dissolve in the cavity fluids. In general, for suppositories, traditional carriers may include, for example, polyalkylene glycols, triglycerides or combinations thereof. In certain embodiments, suppositories may be formed from mixtures containing, for example, the active ingredient in the range of about 0.5% to about 10%, and preferably about 1% to about 2%.

E. Parenteral Compositions and Formulations

[0119] In further embodiments, the composition may be administered via a parenteral route. As used herein, the term “parenteral” includes routes that bypass the alimentary tract. Specifically, the pharmaceutical compositions disclosed herein may be administered for example, but not limited to intravenously, intradermally, intramuscularly, intraarterially, intrathecally, subcutaneously, or intraperitoneally (U.S. Pat. Nos. 6,7537,514, 6,613,308, 5,466,468, 5,543,158; 5,641,515; and 5,399,363, each specifically incorporated herein by reference in its entirety).

[0120] Solutions of the active compounds as free base or pharmacologically acceptable salts may be prepared in water suitably mixed with a surfactant, such as hydroxypropylcellulose. Dispersions may also be prepared in glycerol, liquid polyethylene glycols, and mixtures thereof and in oils. Under ordinary conditions of storage and use, these preparations contain a preservative to prevent the growth of microorganisms. The pharmaceutical forms suitable for injectable use include sterile aqueous solutions or dispersions and sterile powders for the extemporaneous preparation of sterile injectable solutions or dispersions (U.S. Pat. No. 5,466,468, specifically incorporated herein by reference in its entirety). In all cases the form must be sterile and must be fluid to the extent that easy injectability exists. It must be stable under the conditions of manufacture and storage and must be preserved against the contaminating action of microorganisms, such as bacteria and fungi. The carrier can be a solvent or dispersion medium containing, for example, water, ethanol, polyol (i.e., glycerol, propylene glycol, and liquid polyethylene glycol, and the like), suitable mixtures thereof, and/or vegetable oils. Proper fluidity may be maintained, for example, by the use of a coating, such as lecithin, by the maintenance of the required particle size in the case of dispersion and by the use of surfactants. The prevention of the action of microorganisms can be brought about by various antibacterial and antifungal agents, for example, parabens, chlorobutanol, phenol, sorbic acid, thimerosal, and the like. In many cases, it will be preferable to include isotonic agents, for example, sugars or sodium chloride. Prolonged absorption of the injectable compositions can be brought about by the use in the compositions of agents delaying absorption, for example, aluminum monostearate and gelatin.

[0121] For parenteral administration in an aqueous solution, for example, the solution should be suitably buffered if necessary and the liquid diluent first rendered isotonic with sufficient saline or glucose. These particular aqueous solutions are especially suitable for intravenous, intramuscular, subcutaneous, and intraperitoneal administration. In this connection, sterile aqueous media that can be employed will be known to those of skill in the art in light of the present disclosure. For example, one dosage may be dissolved in isotonic NaCl solution and either added to hypodermoclysis fluid or injected at the proposed site of infusion (see for example, “Remington's Pharmaceutical Sciences” 15th Edition, pages 1035-1038 and 1570-1580). Some variation in dosage will necessarily occur depending on the condition of the subject being treated. The person responsible for administration will, in any event, determine the appropriate dose for the individual subject. Moreover, for human administration, preparations should meet sterility, pyrogenicity, general safety and purity standards as required by FDA Office of Biologics standards.

[0122] Sterile injectable solutions are prepared by incorporating the active compounds in the required amount in the appropriate solvent with various of the other ingredients enumerated above, as required, followed by filtered sterilization. Generally, dispersions are prepared by incorporating the various sterilized active ingredients into a sterile vehicle which contains the basic dispersion medium and the required other ingredients from those enumerated above. In the case of sterile powders for the preparation of sterile injectable solutions, the preferred methods of preparation are vacuum-drying and freeze-drying techniques which yield a powder of the active ingredient plus any additional desired ingredient from a previously sterile-filtered solution thereof. A powdered composition is combined with a liquid carrier such as, e.g., water or a saline solution, with or without a stabilizing agent.

F. Miscellaneous Pharmaceutical Compositions and Formulations

[0123] In other preferred embodiments of the invention, the active compound may be formulated for administration via

various miscellaneous routes, for example, topical (i.e., transdermal) administration, mucosal administration (intranasal, vaginal, etc.) and/or inhalation.

[0124] Pharmaceutical compositions for topical administration may include the active compound formulated for a medicated application such as an ointment, paste, cream or powder. Ointments include all oleaginous, adsorption, emulsion and water-solubly based compositions for topical application, while creams and lotions are those compositions that include an emulsion base only. Topically administered medications may contain a penetration enhancer to facilitate adsorption of the active ingredients through the skin. Suitable penetration enhancers include glycerin, [...], alkyl methyl sulfoxides, pyrrolidones and laurocapram. Possible bases for compositions for topical application include polyethylene glycol, lanolin, cold cream and petrolatum as well as any other suitable absorption, emulsion or water-soluble ointment base. Topical preparations may also include emulsifiers, gelling agents, and preservatives as necessary to preserve the active ingredient and provide for a homogenous mixture. Transdermal administration of the present invention may also comprise the use of a “patch”. For example, the patch may supply one or more active substances at a predetermined rate and in a continuous manner over a fixed period of time.

[0125] In certain embodiments, the pharmaceutical compositions may be delivered by eye drops, intranasal sprays, inhalation, and/or other aerosol delivery vehicles. Methods for delivering compositions directly to the lungs via nasal aerosol sprays have been described e.g., in U.S. Pat. Nos. 5,756,353 and 5,804,212 (each specifically incorporated herein by reference in its entirety). Likewise, the delivery of drugs using intranasal microparticle resins (Takenaga et al., 1998) and lysophosphatidyl-glycerol compounds (U.S. Pat. No. 5,725,871, specifically incorporated herein by reference in its entirety) are also well-known in the pharmaceutical arts. Likewise, transmucosal drug delivery in the form of a polytetrafluoroethylene support matrix is described in U.S. Pat. No. 5,780,045 (specifically incorporated herein by reference in its entirety).

[0126] The term aerosol refers to a colloidal system of finely divided solid or liquid particles dispersed in a liquefied or pressurized gas propellant. The typical aerosol of the present invention for inhalation will consist of a suspension of active ingredients in liquid propellant or a mixture of liquid propellant and a suitable solvent. Suitable propellants include hydrocarbons and hydrocarbon ethers. Suitable containers will vary according to the pressure requirements of the propellant. Administration of the aerosol will vary according to the subject's age, weight and the severity and response of the symptoms.

IX. COMBINATION THERAPY

[0127] In some embodiments, in order to increase the effectiveness of a didemnin or didemnin derivative, it may be desirable to combine these compositions with other agents effective in the treatment of hyperproliferative disease, such as anti-cancer agents. An “anti-cancer” agent is capable of negatively affecting cancer in a subject, for example, by killing cancer cells, inducing apoptosis in cancer cells, reducing the growth rate of cancer cells, reducing the incidence or number of metastases, reducing tumor size, inhibiting tumor growth, reducing the blood supply to a tumor or cancer cells, promoting an immune response against cancer cells or a tumor, preventing or inhibiting the progression of cancer, or increasing the lifespan of a subject with cancer. More generally, these other compositions would be provided in a combined amount effective to kill or inhibit proliferation of the cell. This process may involve contacting the cells with the didemnin or didemnin derivative and the agent(s) or multiple factor(s) at the same time. This may be achieved by contacting the cell with a single composition or pharmacological formulation that includes both agents, or by contacting the cell with two distinct compositions or formulations, at the same time, wherein one composition includes the expression construct and the other includes the second agent(s).

[0128] Tumor cell resistance to chemotherapy and radiotherapy agents represents a major problem in clinical oncology. One goal of current cancer research is to find ways to improve the efficacy of chemo- and radiotherapy by combining it with other therapy. In the context of the present invention, it is contemplated that a didemnin or didemnin derivative could be used in conjunction with chemotherapeutic, radiotherapeutic, or immunotherapeutic intervention, for example.

[0129] The didemnin or didemnin derivative therapy may precede or follow the other agent treatment by intervals ranging from minutes to weeks. In embodiments where the other agent and expression construct are applied separately to the cell, one would generally ensure that a significant period of time did not expire between the time of each delivery, such that the agent and didemnin or didemnin derivative would still be able to exert an advantageously combined effect on the cell. In such instances, it is contemplated that one may contact the cell with both modalities within about 12-24 h of each other and, more preferably, within about 6-12 h of each other. In some situations, it may be desirable to extend the time period for treatment significantly, however, where several d (2, 3, 4, 5, 6 or 7) to several wk (1, 2, 3, 4, 5, 6, 7 or 8) lapse between the respective administrations.

[0130] Various combinations may be employed, didemnin or didemnin derivative therapy is “A” and the secondary agent, such as radio- or chemotherapy (for example), is “B”:

[0131] A/B/A B/A/B B/B/A A/A/B A/B/B B/A/A A/B/B/B B/A/B/B

[0132] B/B/B/A B/B/A/B A/A/B/B A/B/A/B A/B/B/A B/B/A/A

[0133] B/A/B/A B/A/A/B A/A/A/B B/A/A/A A/B/A/A A/A/B/A

[0134] Administration of the therapeutic didemnin or didemnin derivative of the present invention to a patient will follow general protocols for the administration of chemotherapeutics, taking into account the toxicity, if any, of the vector. It is expected that the treatment cycles would be repeated as necessary. It also is contemplated that various standard therapies, as well as surgical intervention, may be applied in combination with the described hyperproliferative cell therapy.

[0135] A. Chemotherapy

[0136] Cancer therapies also include a variety of combination therapies with both chemical and radiation based treatments. Combination chemotherapies include, for example, cisplatin (CDDP), carboplatin, procarbazine, mechlorethamine, cyclophosphamide, camptothecin, ifosfamide, melphalan, chlorambucil, busulfan, nitrosourea, dactinomycin, daunorubicin, doxorubicin, bleomycin, plicamycin, mitomycin, etoposide (VP16), tamoxifen, raloxifene, estrogen receptor binding agents, taxol, gemcitabine, navelbine, farnesyl

protein transferase inhibitors, transplatinum, 5-fluorouracil, vincristine, vinblastine and methotrexate, or any analog or derivative variant of the foregoing.

[0137] B. Radiotherapy

[0138] Other factors that cause DNA damage and have been used extensively include what are commonly known as γ-rays, X-rays, and/or the directed delivery of radioisotopes to tumor cells. Other forms of DNA damaging factors are also contemplated such as microwaves and UV-irradiation. It is most likely that all of these factors effect a broad range of damage on DNA, on the precursors of DNA, on the replication and repair of DNA, and on the assembly and maintenance of chromosomes. Dosage ranges for X-rays range from daily doses of 50 to 200 roentgens for prolonged periods of time (3 to 4 wk), to single doses of 2000 to 6000 roentgens. Dosage ranges for radioisotopes vary widely, and depend on the half-life of the isotope, the strength and type of radiation emitted, and the uptake by the neoplastic cells.

[0139] The terms “contacted” and “exposed,” when applied to a cell, are used herein to describe the process by which a therapeutic construct and a chemotherapeutic or radiotherapeutic agent are delivered to a target cell or are placed in direct juxtaposition with the target cell. To achieve cell killing or stasis, both agents are delivered to a cell in a combined amount effective to kill the cell or prevent it from dividing.

[0140] C. Immunotherapy

[0141] Immunotherapeutics, generally, rely on the use of immune effector cells and molecules to target and destroy cancer cells. The immune effector may be, for example, an antibody specific for some marker on the surface of a tumor cell. The antibody alone may serve as an effector of therapy or it may recruit other cells to actually effect cell killing. The antibody also may be conjugated to a drug or toxin (chemotherapeutic, radionuclide, ricin A chain, cholera toxin, pertussis toxin, etc.) and serve merely as a targeting agent. Alternatively, the effector may be a lymphocyte carrying a surface molecule that interacts, either directly or indirectly, with a tumor cell target. Various effector cells include cytotoxic T cells and NK cells.

[0142] Immunotherapy, thus, could be used as part of a combined therapy, in conjunction with didemnin or didemnin derivative therapy. The general approach for combined therapy is discussed below. Generally, the tumor cell must bear some marker that is amenable to targeting, i.e., is not present on the majority of other cells. Many tumor markers exist and any of these may be suitable for targeting in the context of the present invention. Common tumor markers include carcinoembryonic antigen, prostate specific antigen, urinary tumor associated antigen, fetal antigen, tyrosinase (p97), gp68, TAG-72, HMFG, Sialyl Lewis Antigen, MucA, MucB, PLAP, estrogen receptor, laminin receptor, erb B and p155.

[0143] D.

[0145] E. Surgery

[0146] Approximately 60% of persons with cancer will undergo surgery of some type, which includes preventative, diagnostic or staging, curative and palliative surgery. Curative surgery is a cancer treatment that may be used in conjunction with other therapies, such as the treatment of the present invention, chemotherapy, radiotherapy, hormonal therapy, gene therapy, immunotherapy and/or alternative therapies.

[0147] Curative surgery includes resection in which all or part of cancerous tissue is physically removed, excised, and/or destroyed. Tumor resection refers to physical removal of at least part of a tumor. In addition to tumor resection, treatment by surgery includes laser surgery, cryosurgery, electrosurgery, and microscopically controlled surgery (Mohs surgery). It is further contemplated that the present invention may be used in conjunction with removal of superficial cancers, precancers, or incidental amounts of normal tissue.

[0148] Upon excision of part or all of cancerous cells, tissue, or tumor, a cavity may be formed in the body. Treatment may be accomplished by perfusion, direct injection or local application of the area with an additional anti-cancer therapy. Such treatment may be repeated, for example, every 1, 2, 3, 4, 5, 6, or 7 days, or every 1, 2, 3, 4, and 5 weeks or every 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, or 12 months. These treatments may be of varying dosages as well.

[0149] F. Other Agents

[0150] It is contemplated that other agents may be used in combination with the present invention to improve the therapeutic efficacy of treatment. These additional agents include immunomodulatory agents, agents that affect the upregulation of cell surface receptors and GAP junctions, cytostatic and differentiation agents, inhibitors of cell adhesion, or agents that increase the sensitivity of the hyperproliferative cells to apoptotic inducers. Immunomodulatory agents include tumor necrosis factor, interferon alpha, beta, and gamma; IL-2 and other cytokines; F42K and other cytokine analogs; or MIP-1, MIP-1beta, MCP-1, RANTES, and other chemokines. It is further contemplated that the upregulation of cell surface receptors or their ligands such as Fas/Fas ligand, DR4 or DR5/TRAIL would potentiate the apoptotic inducing abilities of the present invention by establishment of an autocrine or paracrine effect on hyperproliferative cells. Increased intercellular signaling by elevating the number of GAP junctions would increase the anti-hyperproliferative effects on the neighboring hyperproliferative cell population. In other embodiments, cytostatic or differentiation agents can be used in combination with the present invention to improve the anti-hyperproliferative efficacy of the treatments. Inhibitors of cell adhesion are contemplated to improve the effi
Genes cacy of the present invention. Examples of cell adhesion 0144. In yet another embodiment, the secondary treatment inhibitors are focal adhesion kinase (FAKs) inhibitors and is a gene therapy in which a therapeutic polynucleotide is Lovastatin. It is further contemplated that other agents that administered before, after, or at the same time as the didemnin increase the sensitivity of a hyperproliferative cell to apopto or didemnin derivative. Delivery of didemnin or a didemnin sis, Such as the antibody c225, could be used in combination derivative with a vector encoding one of a therapeutic gene with the present invention to improve the treatment efficacy. product will have a combined anti-hyperproliferative effect 0151 Hormonal therapy may also be used in conjunction on target tissues. Alternatively, a single vector encoding both with the present invention or in combination with any other genes may be used. A variety of proteins are encompassed cancer therapy previously described. The use of hormones within the invention, including inducers of cellular prolifera may be employed in the treatment of certain cancers such as tion, inhibitors of cellular proliferation, or regulators of pro breast, prostate, ovarian, or cervical cancer to lower the level grammed cell death, for example. or block the effects of certain hormones such as testosterone US 2014/0296.161 A1 Oct. 2, 2014

or estrogen. This treatment is often used in combination with at least one other cancer therapy as a treatment option or to reduce the risk of metastases.

X. KITS OF THE INVENTION

[0152] Any of the compositions described herein may be comprised in a kit, including, for example, one or more bacteria, media reagents, and so forth. In some embodiments, one or more reagents to assist in fermentation of didemnin in bacteria is provided in a kit. In certain aspects, one or more compounds for use in preparing a didemnin derivative from an endogenously produced Tistrella didemnin is included in the kit.

[0153] The kits may comprise a suitably aliquoted composition of the present invention. The components of the kits may be packaged either in aqueous media or in lyophilized form, where appropriate. The container means of the kits will generally include at least one vial, test tube, flask, bottle, syringe or other container means, into which a component may be placed, and preferably, suitably aliquoted. Where there is more than one component in the kit, the kit also will generally contain a second, third or other additional container into which the additional components may be separately placed. However, various combinations of components may be comprised in a vial. The kits of the present invention also will typically include a means for containing the kit component(s) in close confinement for commercial sale. Such containers may include injection or blow molded plastic containers into which the desired vials are retained.

[0154] However, the components of the kit may be provided as dried powder(s). When reagents and/or components are provided as a dry powder, the powder can be reconstituted by the addition of a suitable solvent. It is envisioned that the solvent may also be provided in another container means.

EXAMPLES

[0155] The following examples are included to demonstrate preferred embodiments of the invention. It should be appreciated by those of skill in the art that the techniques disclosed in the examples which follow represent techniques discovered by the inventor to function well in the practice of the invention, and thus can be considered to constitute preferred modes for its practice. However, those of skill in the art should, in light of the present disclosure, appreciate that many changes can be made in the specific embodiments which are disclosed and still obtain a like or similar result without departing from the spirit and scope of the invention.

Example 1

Exemplary Methods

Strain, Fermentation and Isolation of Didemnin Compounds

[0156] Tistrella mobilis was isolated from seawater collected from the Red Sea during a 2009 research cruise. Its crude extract showed remarkable cytotoxicity on HeLa cells in a bioactive compound screening of Red Sea bacteria. T. mobilis was grown on GYP medium [10 g of glucose/4 g of yeast extract/2 g of peptone/17 g of sea salts/1 liter of deionized water] using 50-liter stirred fermenters at 25° C. for 72 h. At the end of fermentation, ethyl acetate was added to the culture to extract the metabolites. The crude extract was then fractionated by reverse-phase C18 liquid chromatography and eluted with increasing amounts of methanol in water. The active fraction was further purified by semi-preparative reverse-phase HPLC with 63% acetonitrile in water at a flow rate of 3 milliliters per minute. Didemnin B and nordidemnin B were eluted at 25 and 22 minutes, respectively.

Genome Sequencing, Annotation, and Analysis

[0157] The nucleotide sequence of the T. mobilis genome was determined by using a massively parallel pyrosequencing technology (Roche 454 GS FLX). 112 contigs (>500 bp) with a total size of 6.4 Mb were assembled from 315,496 reads (average length of 334 bp) using Newbler software of the 454 Suite package, providing an 18.4-fold coverage. In addition, 2,625,640 mate-pair sequences with an average length of 115 bp produced by the Illumina sequencing system were mapped to the genome sequence to promote sequence quality and construct a scaffold. All the contig relationships within scaffolds were validated by PCR, and the relationship among scaffolds was determined by multiplex PCR. Gaps were filled by sequencing PCR products. The final sequence assembly was carried out using the phred/phrap/consed package (see the World Wide website of the Genome Center at the University of Washington), and the low sequence quality region was resequenced. The final error rate of the genome sequence was 0.28 per 100,000 bases.

[0158] Protein-coding sequences (CDS) were determined by combining the prediction results of the Glimmer 3.02 and Z-Curve programs. Functional annotation of CDS was performed by searching the NCBI non-redundant protein database and the KEGG protein database. tRNA genes were predicted with tRNAScan-SE (v1.23). Protein domain prediction and COG assignment were performed by RPS-BLAST using the NCBI CDD library.

In Silico Analysis of Didemnin Biosynthetic Gene Cluster and Other NRPS in T. mobilis Genome

[0159] The roles of the proteins, in embodiments of the invention, in the didemnin gene cluster were assigned using protein-protein BLAST and Pfam analysis. The NRPS A domain specificity was predicted using the online program NRPSpredictor (Rausch 2005). The nucleotide sequences of the gene cluster can be deposited at GenBank®, for example.

Example 2

General Genome Feature of Tistrella mobilis and Associated NRPS Gene Clusters

[0160] Isolation of Didemnins from Tistrella mobilis

[0161] The bacterium was noted for its astonishing cytotoxic activity during a screening program of bacteria isolated from the Red Sea for cytotoxic activity. Its ethyl acetate crude extract of culture broth could kill HeLa cells at a concentration lower than 1 ng/ml. The bacterium was then fermented to 70 L and the ethyl acetate extract was subjected to reverse phase column chromatography using 15%, 30%, 45%, 60%, 70%, 90% and 100% methanol in water, respectively. Fractions eluted using 60% and 70% showed strong cytotoxic activity using the HCT-116 human colon carcinoma cell. One active fraction was identified with peptide signals, using 1H-NMR analysis and MS analysis to reveal two separate peaks with [M+H]+ of 1098.7 and 1112.7, respectively (FIG. 2). We purified both compounds using semi-preparative HPLC and both of them showed strong cytotoxic activity at a concentration of 0.074 ng/ml. Using NMR, MS and the AntiMarin database, we identified these two compounds to be didemnin B and nordidemnin B. A pure form of didemnin B was acquired from Prof. Chris Ireland at the University of California, Santa Cruz, who, with coworkers, isolated it from the tunicate Trididemnum solidum. This sample was used to compare its NMR spectral data with the present data and to further confirm that the compound isolated from T. mobilis was didemnin B. Moreover, other fractions from T. mobilis also exhibited similar bioactivity and LC-MS analysis showed the presence of other didemnins, including didemnin A and didemnin C, at trace amounts. This is consistent with the fact that didemnin B and nordidemnin B were the most abundant in the Trididemnum tunicates.

Biosynthetic Pathways of Didemnins

[0162] The putative biosynthetic gene cluster for didemnins contains 26 ORFs (Table 2).

TABLE 2
Deduced functions of the proteins in the didemnin biosynthetic gene cluster

Gene | Amino acids (aa) | Sequence similarity, organism | Proposed function | Identity/similarity | GenBank accession no.
orf1 | 261 | thioesterase, Actinomadura kijaniata | Thioesterase | 41%, 54% | ACB46473
orf2 | 483 | band 7 protein, Methylobacter tundripaludum SV96 |  | 50%, 68% | ZP_07653046
orf3 | 560 | cyclic peptide transporter, Methylobacter tundripaludum SV96 | transporter | 48%, 68% | ZP_07653047
orf4 | 70 | No hits |  |  | 
orf5 | 190 | GTPase domain-containing protein, Methylobacter tundripaludum SV96 |  | 39%, 57% | ZP_07653048
orf6 | 987 | hydrophobic amphiphilic exporter-1, Azospirillum sp. B510 | Resistance | 38%, 56% | YP_003450508
orf7 | 377 | secretion protein, Azospirillum sp. B510 | Resistance | 28%, 47% | YP_003450507
didA | 2123 | OciA protein, Planktothrix rubescens NIVA-CYA 98 | NRPS (CAT CAT) | 29%, 40% | CAQ48254
didB | 1796 | linear gramicidin synthetase subunit D, Stigmatella aurantiaca DW4/3-1 | NRPS (CA KR T) | 36%, 48% | ZP_01459555
didC | 1330 | NRPS, Myxococcus xanthus DK 1622 | NRPS (CAT C*) | 41%, 52% | YP_632257
didD | 3853 | amino acid adenylation domain protein, Streptomyces violaceusniger Tu 4113 | NRPS (CAMT C* CAT CAT) | 39%, 50% | ZP_07603194
didE | 1705 | amino acid adenylation domain protein, Acetivibrio cellulolyticus CD2 | NRPS/PKS (KS KR T C) | 36%, 53% | ZP_07325073
didF | 1613 | HctF, Lyngbya majuscula | NRPS (A* A KR T) | 35%, 52% | AAY42398
didG | 1413 | NRPS/PKS, Amycolatopsis mediterranei U32 | PKS (KS MT) | 46%, 56% | YP_003765866
didH | 1286 | NRPS/PKS, Myxococcus xanthus DK 1622 | NRPS (CAT C*) | 39%, 53% | YP_631961
didI | 873 | NRPS, Myxococcus xanthus DK 1622 | NRPS (CAT) | 39%, 52% | YP_632257
didJ | 2163 | amino acid adenylation domain protein, Lyngbya majuscula 3L | NRPS (CAMMT TE) | 36%, 53% | ZP_08431746
orf8 | 77 | MbtH domain-containing protein, Herpetosiphon aurantiacus ATCC 23779 | MbtH-like protein | 80%, 89% | YP_001542806
orf9 | 68 | hypothetical protein, Acidovorax sp. JS42 |  | 79%, 91% | YP_986866
orf10 | 45 | No hits |  |  | 
orf11 | 60 | No hits |  |  | 
orf12 | 190 | hypothetical protein, Acidovorax sp. JS42 |  | 94%, 97% | YP_986861
orf13 | 75 | hypothetical protein, Acidovorax sp. JS42 |  | 98%, 100% | YP_004387524
orf14 | 324 | CAAX amino terminal protease family, Synechococcus sp. PCC 7335 |  | 30%, 49% | ZP_05035401
orf15 | 398 | cyanate transport system protein, Pseudomonas syringae pv. syringae 642 |  | 40%, 52% | ZP_07265073
orf16 | 255 | GntR family transcriptional regulator, Chromobacterium violaceum ATCC 12472 | Regulation | 39%, 55% | NP_903400

[0163] Ten of them, didA to didJ, are most likely to be involved in the biosynthesis of didemnins (FIG. 3). All of them encode non-ribosomal peptide synthetases (NRPS) except didE and didG. Most NRPS are modular and use an assembly line strategy to synthesize nonribosomal peptides. A typical module in the NRPS assembly line usually has a minimum of three domains: the adenylation (A) domain activates a specific amino acid via the formation of an acyl-AMP; the activated form of the amino acid is then loaded onto an adjacent thiolation (T) domain; and the condensation (C) domain catalyzes peptide bond formation. In addition to these three basic domains, different tailoring domains, such as ketoreductase (KR) domains or methyltransferase (MT) domains, are sometimes present in NRPS modules. These tailoring domains modify the amino acids and they are one of the main reasons that many non-ribosomal peptides contain non-proteinogenic amino acid monomers. Most NRPS generally follow a colinearity principle wherein the A domain specificity and placement within the NRPS assembly line dictates the sequence of the monomers in the nonribosomal peptide (Marahiel et al. 1997; Fischbach and Walsh 2006). Analysis of the substrate specificity of the A domains by NRPSpredictor can substantially help match the A domains with the corresponding amino acid substrate. There are a total of 11 A domains in the eight NRPS in the didemnin gene cluster and it is the specificity of these A domains that leads us to postulate that this gene cluster is responsible for didemnin synthesis (Table 3).

TABLE 3

Protein | Adenylation domain | 10 amino acid code | Substrate | Identity, %
DidA | A1 | DAWOFGLIDK (SEQ ID NO: 1) | Glutamine | 100
DidA | A2 | DAWOFGLIDK (SEQ ID NO: 1) | Glutamine | 100
DidB | A3 | OHPWIAETWK (SEQ ID NO: 2) | (pyruvic acid) | 
DidC | A4 | DVOFAAQVVK (SEQ ID NO: 3) | Proline | 90
DidD | A5 | DAWFLGHWWK (SEQ ID NO: 4) | Leucine | 90
DidD | A6 | DFWNIGMWHK (SEQ ID NO: 5) | Threonine | 100
DidD | A7 | DAFFLGITFK (SEQ ID NO: 6) | Ile | 90
DidF | A8 | No code | (2-oxoisovaleric acid) | 
DidH | A9 | DAWFLGNWWK (SEQ ID NO: 7) | Leucine | 100
DidI | A10 | DVOFAAOVVK (SEQ ID NO: 8) | Proline | 90
DidJ | A11 | DASTLAAWCK (SEQ ID NO: 9) | Tyrosine | 90

[0164] For example, the inventors found that in at least certain aspects the A domain of DidC activates proline while the first two A domains in DidD activate leucine and threonine, respectively. A fragment of didemnin B happens to contain these three amino acids in the same sequence as they are activated by the corresponding A domains. Moreover, the leucine in didemnin B has an N-terminal methyl group and this matches well with the fact that there is also a methylation domain right next to the A domain activating leucine. In a similar way, the inventors found that the A domains in DidH, DidI and DidJ in certain embodiments activate leucine, proline and tyrosine. DidI contains an MT tailoring domain that will convert proline into N-methyl proline while DidJ contains two MT domains, one N-methyltransferase and one O-methyltransferase, which may respectively add a methyl group on the amine and hydroxyl group of the tyrosine. Leucine, N-methyl proline and N,O-dimethyl tyrosine together make another fragment of didemnin B.

[0165] The remaining monomers of didemnin B are a lactic acid (Lac), an isostatine (Ist) and an α-(α-hydroxyisovaleryl)-propionic acid (Hip). According to the colinearity rule, DidB, which contains four domains (C-A-KR-T), should be the module that feeds the lactic acid into didemnin B. Although NRPSpredictor fails to predict a reliable substrate for the DidB A domain, a BLAST search of its sequence shows that it is highly similar to one of the A domains found in the valinomycin gene cluster. According to Chen (2006), this A domain may activate pyruvic acid. The pyruvic acid is then reduced to lactic acid, which is also a monomer of valinomycin. Therefore, the A domain in DidB may first activate pyruvic acid and the KR domain then reduces the pyruvic acid to lactic acid, which is exactly the same scenario as in the valinomycin gene cluster. The third A domain in DidD is most likely to activate isoleucine according to the code specificity analysis. Downstream of DidD is a hybrid polyketide synthetase/nonribosomal peptide synthetase (PKS/NRPS) protein, DidE, comprising a ketosynthase (KS) domain, a ketoreductase (KR) domain, a thiolation (T) domain and a condensation (C) domain. It can be hypothesized that after the isoleucine is incorporated, the intermediate molecule bearing the isoleucine goes through a round of PKS reaction which contains two steps: 1) addition of a 2-carbon unit as a common PKS chain elongation step, possibly using malonyl-CoA as a substrate, and 2) reduction of the carbonyl group of the isoleucine to a hydroxyl group. This series of reactions can then explain the existence of the monomer Ist in the didemnin B molecule. However, the above PKS system features as an AT-less PKS system, as no acyltransferase (AT) domain can be found in this module. Moreover, no stand-alone AT genes can be found in the entire didemnin gene cluster and its vicinity, which is a very rare phenomenon in AT-less PKS systems, as normally at least one stand-alone AT gene can be found in the vicinity that functions in trans to catalyze the activation of the substrate, which most of the time would be malonyl-CoA. Nevertheless, in certain aspects the Ist monomer is added through the joint effect of an NRPS module and a hybrid PKS/NRPS module. It is unlikely that a single step in which an A domain specifically activates Ist catalyzes the reaction as predicted by Salomon et al. (2004). Downstream of the didD gene are didE and didF; the former is an NRPS gene while the latter is a PKS gene. The A domain in DidE does not have a code specified for any known substrate, yet its amino acid sequence shares 43% identity and 60% similarity with that of the first adenylation domain of HctF of the cyanobacterium Lyngbya majuscula that produces the cyclic peptide hectochlorin. Ramaswamy et al. (2007) used the ATP-PPi exchange activity assay to demonstrate that the first A domain of HctF can activate 2-oxo-isovaleric acid. Therefore the A domain in DidE may first activate the 2-oxo-isovaleric acid, which is then reduced in situ by a KR domain in DidE to 2-hydroxyisovaleric acid (2-Hiv). The KS domain of DidF then adds a 2-carbon unit to the C-terminus of Hiv and the MT domain in DidF catalyzes the addition of a methyl group. The net outcome of this cascade of reactions is the incorporation of the monomer Hip into didemnin molecules (FIG. 4). This is very similar to the scenario of the addition of Ist, in which a joint effort of two proteins eventually catalyzes the addition of the necessary monomer into the final molecule. Finally, in some aspects the thioesterase domain of DidJ catalyzes the macrocyclization between the hydroxyl group on the side chain of the threonine residue and the carboxyl group of the T domain-bound tyrosine residue, resulting in the release of the mature cyclic depsipeptide didemnin compounds. Therefore, in certain embodiments of the invention didB to didJ are responsible for the biosynthesis of didemnin B (FIG. 5).

[0166] Apart from didemnin B, the inventors also isolated nordidemnin B, which contains the monomer norstatine, a very similar derivative of isostatine. Since the A domains usually have some flexibility in activating similar substrates, valine, instead of isoleucine, may also be activated by the third A domain of DidD and, together with DidE, norstatine may therefore be incorporated to produce nordidemnin B, in some embodiments. Upstream of didB, didA encodes two modules of NRPS with the CATCAT domain organization. Both A domains have specified codes to activate glutamine. There are some other didemnins which have multiple glutamine residues preceding the lactic residue, such as didemnin X (FIG. 1) and didemnin Y. In some cases with didA, this gene cluster produces didemnin X or didemnin Y. Similar to plipastatin, didemnin X and didemnin Y are both N-acylated. It has been reported that the first C domain in the initiation module may catalyze the N-acylation of the first amino acid in some NRPS systems (Imker et al. 2010). However, further studies are performed to characterize this. Apart from didemnin B and nordidemnin B, the inventors also used UPLC-HRMS to detect many other didemnin derivatives, such as didemnin A and didemnin C, by growing T. mobilis in different culture media or with different culture durations.

Additional ORFs in the Didemnin Gene Cluster

[0167] Besides the synthesis of didemnins, other genes that are possibly involved in the regulation, resistance and transport of the didemnins are also found in the didemnin gene cluster. Orf1, which encodes a type II thioesterase, may regenerate the misprimed NRPS which are inactive to ensure the NRPS assembly line is functional. Orf3, orf6 and orf7 all encode proteins related to the transport and secretion of the cyclic peptide and they may be responsible for the secretion of didemnins. As secretion and self-resistance are often related, they may also function as self-resistance genes to protect the producer. Orf8 encodes an MbtH-like protein that is a common resident of various NRPS systems. Although the precise function of MbtH-like proteins in NRPS systems has not been revealed, in specific aspects they play an important role in the production of many non-ribosomal peptides as indicated by Lautru et al. (2007). Orf16 encodes a GntR family transcription factor and this is the only transcription-related gene that can be found in the vicinity of the didemnin biosynthetic gene cluster. As GntR family transcription factors usually, though not always, act as repressive regulators (Chen et al., 2010), orf16 may participate in the regulation of the biosynthesis of didemnins. It is also reasonable, as overproduction of complicated didemnins could cause a huge waste of energy for the bacterium. Although didemnins have been reported for their potent cytotoxic, antiviral and immunosuppressive activities, the real ecological function of these compounds for the producing bacterium is unknown. To synthesize such complicated molecules, the bacterium has to invest considerable energy with good reason.

Manipulation of the Didemnin Gene Cluster

[0168] There are huge incentives to manipulate the didemnin gene cluster as didemnins other than didemnin B or nordidemnin B could be produced from this gene cluster. For example, there might be a way to produce dehydrodidemnin B (Aplidine), which is a promising antitumor drug currently in clinical trials (Le Tourneau et al. 2010, Mateos et al. 2010). The only difference between dehydrodidemnin B and didemnin B is that the former has a pyruvic acid monomer while the latter has a lactic acid. If the ketoreductase domain in didB can be knocked out, then the incorporated pyruvic acid will not be reduced to lactic acid and consequently, dehydrodidemnin B may be expected to be produced. Upon construction of the corresponding deletion mutant, there may be an alternative way to produce dehydrodidemnin B other than chemical synthesis, and the cost could be substantially reduced. In addition, as more than 10 mg/L didemnin B and nordidemnin B can be harvested from fermentation of Tistrella mobilis using a simple medium in as short as 3 days, improvement of the culture conditions may further increase the production of the didemnins. Because of their large molecular weight and high lipophilicity, purification of the didemnins is quite simple by chromatographic separation, for example, which in turn can provide the starting material to semisynthesize dehydrodidemnin B. Furthermore, genetic engineering of the didemnin gene cluster provides novel didemnin compounds with improved biological activities.

Example 3

Other Didemnins Produced by Tistrella mobilis

[0169] In particular aspects of the invention, there are didemnins other than didemnin B and nordidemnin B from Tistrella mobilis. As shown in FIGS. 6, 7A, and 7B using UPLC-HRMS, there was detection of some other didemnins such as didemnin A, nordidemnin A and didemnin N. In certain aspects, there are new didemnins with similar molecular weights but different from all reported didemnins (see FIGS. 8A and 8B). They also have similar UV absorption patterns to didemnins.
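
The colinearity reasoning of Example 2 and the proposed ketoreductase knockout in didB can be illustrated with a small sketch. This is only a toy model under stated assumptions, not the inventors' analysis: the names Module, TAILORING_RULES, monomer and knock_out are hypothetical and introduced here for illustration, and the three modules listed are a simplified excerpt using the substrate predictions discussed in the text (pyruvic acid for DidB, proline for DidC, leucine for the first A domain of DidD).

```python
from dataclasses import dataclass

# Toy model of NRPS colinearity: each module contributes one monomer, and
# tailoring domains modify the substrate activated by the A domain.
# All names here are hypothetical and exist only for illustration.

@dataclass(frozen=True)
class Module:
    protein: str           # protein carrying the module, e.g. "DidB"
    substrate: str         # substrate predicted for the A domain
    tailoring: tuple = ()  # tailoring domains present in the module

# Effect of a tailoring domain on a substrate, as described in the text.
TAILORING_RULES = {
    ("KR", "pyruvic acid"): "lactic acid",
    ("MT", "leucine"): "N-methyl leucine",
    ("MT", "proline"): "N-methyl proline",
}

def monomer(module: Module) -> str:
    """Return the monomer a module contributes after tailoring."""
    product = module.substrate
    for domain in module.tailoring:
        product = TAILORING_RULES.get((domain, product), product)
    return product

def knock_out(module: Module, domain: str) -> Module:
    """Return a copy of the module with one tailoring domain removed."""
    kept = tuple(d for d in module.tailoring if d != domain)
    return Module(module.protein, module.substrate, kept)

# Simplified excerpt of the assembly line: DidB, DidC, first module of DidD.
assembly_line = [
    Module("DidB", "pyruvic acid", ("KR",)),
    Module("DidC", "proline"),
    Module("DidD", "leucine", ("MT",)),
]

wild_type = [monomer(m) for m in assembly_line]
mutant = [
    monomer(knock_out(m, "KR")) if m.protein == "DidB" else monomer(m)
    for m in assembly_line
]

print("wild type:", wild_type)
print("didB KR knockout:", mutant)
```

In this sketch the only monomer that changes in the knockout is the first one, which stays as pyruvic acid instead of being reduced to lactic acid; that single difference is what the text gives as the distinction between dehydrodidemnin B and didemnin B.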

Name | Formula | Molecular weight | Reference | Source
Didemnin A | C49H78N6O12 | 943.177 | Crampton, S. L. et al., Cancer Res., 44 (1984) 1796; M. B. Hossain et al., Internat. J. Peptide Protein Res. 47 (1996) 20-27 | Trididemnum sp.
Didemnin B | C57H89N7O15 | 1,112.35 | Crampton, S. L. et al., Cancer Res., 44 (1984) 1796 | Trididemnum sp.
Didemnin C | C52H82N6O14 | 1,015.24 | Rinehart, K. L. et al., Science, 212 (1981) 933 | Trididemnum spp.
Prolyldidemnin A | C54H85N7O13 | 1,040.29 | Schmidt, U. et al., Tetrahedron Lett., 29, 4407-8 1988 | 
didemnin D | C77H118N14O23 | 1,607.84 | Rinehart KL, Kishore V, Bible KC, Sakai R, Sullins DW, Li K-M, J. Nat. Prod. 1988 51 1-21 | Chordata Trididemnum solidum
didemnin E | C72H110N12O21 | 1,479.71 | Rinehart KL, Kishore V, Bible KC, Sakai R, Sullins DW, Li K-M, J. Nat. Prod. 1988 51 1-21 | Chordata Trididemnum solidum
didemnin G | C58H89N7O16 | 1,140.36 | K. L. Rinehart, 2nd Euroconf. Marine Nat. Prod., 1999 Santiago d. Comp. | Trididemnum solidum
isodidemnin-1 | C57H89N7O15 | 1,112.35 | Guyot M, Davoust D, Morel E, C. R. Acad. Sci. P Ser. II 1987 305 681-686 | Chordata Trididemnum cyanophorum
nordidemnin B | C56H87N7O15 | 1,098.33 | McKee TC, Ireland CM, Lindquist N, Fenical W, Tetrahedron Lett. 1989 30 3053-3056 | Chordata Trididemnum solidum
didemnin M | C67H102N10O19 | 1,351.58 | Sakai R, Stroh JG, Sullins DW, Rinehart KL, J. Am. Chem. Soc. 1995 117 3734-3748 | Chordata Trididemnum solidum
didemnin N | C55H85N7O15 | 1,084.30 | Sakai R, Stroh JG, Sullins DW, Rinehart KL, J. Am. Chem. Soc. 1995 117 3734-3748 | Chordata Trididemnum solidum
didemnin X | C82H131N13O23 | 1,666.99 | Sakai R, Stroh JG, Sullins DW, Rinehart KL, J. Am. Chem. Soc. 1995 117 3734-3748 | Chordata Trididemnum solidum
didemnin Y | C87H139N15O25 | 1,795.12 | Sakai R, Stroh JG, Sullins DW, Rinehart KL, J. Am. Chem. Soc. 1995 117 3734-3748 | Chordata Trididemnum solidum
nordidemnin N | C54H83N7O15 | 1,070.27 | Sakai R, Stroh JG, Sullins DW, Rinehart KL, J. Am. Chem. Soc. 1995 117 3734-3748 | Chordata Trididemnum solidum
epididemnin A1 | C49H78N6O12 | 943.177 | Sakai R, Stroh JG, Sullins DW, Rinehart KL, J. Am. Chem. Soc. 1995 117 3734-3748 | Chordata Trididemnum solidum
acyclodidemnin A | C49H80N6O13 | 961.192 | Sakai R, Stroh JG, Sullins DW, Rinehart KL, J. Am. Chem. Soc. 1995 117 3734-3748 | Chordata Trididemnum solidum
[Tyr5]didemnin B | C55H85N7O15 | 1,084.30 | Abou-Mansour E, Boulanger A, Badre A, Bonnard I, Banaigs B, Combaut G, Francisco C, Tetrahedron 1995 51 12591-12600 | Chordata Trididemnum cyanophorum
[D-Pro4]didemnin B | C57H89N7O15 | 1,112.35 | Abou-Mansour E, Boulanger A, Badre A, Bonnard I, Banaigs B, Combaut G, Francisco C, Tetrahedron 1995 51 12591-12600 | Chordata Trididemnum cyanophorum
didemnin H | C67H102N10O19 | 1,351.58 | Boulanger A, Abou-Mansour E, Badre A, Banaigs B, Combaut G, Francisco C, Tetrahedron Lett. 1994 35 4345-4348 | Chordata Trididemnum cyanophorum
dehydrodidemnin B | C57H87N7O15 | 1,110.34 | Rinehart KL, Lithgow-Bertelloni AM, Ruffles GK, PCT Int. Appl. 1991 00 | Chordata Aplidium albicans
[Hysp2]didemnin B | C58H91N7O15 | 1,126.38 | Banaigs B, Mansour EA, Bonnard I, Boulanger A, Francisco C, Tetrahedron 1999 55 9559-9574 | Chordata Aplidium albicans, Chordata Trididemnum cyanophorum, Chordata Trididemnum solidum
[Hap2]didemnin B | C54H83N7O15 | 1,070.27 | Banaigs B, Mansour EA, Bonnard I, Boulanger A, Francisco C, Tetrahedron 1999 55 9559-9574 | Chordata Aplidium albicans, Chordata Trididemnum cyanophorum, Chordata Trididemnum solidum
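
The formulas in this table can be cross-checked against the [M+H]+ peaks of 1098.7 and 1112.7 reported in Example 2. The sketch below is an illustrative calculation, not part of the original disclosure: the function names monoisotopic_mass and m_plus_h are hypothetical, and the isotope masses are standard reference values for the most abundant isotopes. A mass spectrometer reports monoisotopic masses, so the results differ slightly from the average molecular weights listed in the table.

```python
import re

# Monoisotopic masses (in u) of the most abundant isotopes of C, H, N and O.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.0030740,
    "O": 15.9949146,
}
PROTON = 1.00727647  # mass of a proton in u

def monoisotopic_mass(formula: str) -> float:
    """Sum isotope masses for a simple formula such as 'C57H89N7O15'."""
    total = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += MONOISOTOPIC[element] * (int(count) if count else 1)
    return total

def m_plus_h(formula: str) -> float:
    """Expected m/z of the singly protonated ion [M+H]+."""
    return monoisotopic_mass(formula) + PROTON

# Formulas taken from the table above.
compounds = {
    "didemnin B": "C57H89N7O15",
    "nordidemnin B": "C56H87N7O15",
    "dehydrodidemnin B": "C57H87N7O15",
}

for name, formula in compounds.items():
    print(f"{name}: [M+H]+ = {m_plus_h(formula):.2f}")
```

Under these assumptions the calculated values for didemnin B and nordidemnin B fall within about 0.1 of the reported peaks, and dehydrodidemnin B comes out two hydrogen masses lighter than didemnin B, as expected for a pyruvyl unit in place of a lactyl unit.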

Example 4

Sequence Information

[0170] The following table provides annotation of the Tistrella mobilis genome (including plasmids 1-4 (P1-P4)).

LOCUS | Gene | Product | Start | End | Strand | Length
P1.orf0001 | lb | Amino acid adenylation | 1 | 3501 |  | 3501
P1.orf0002 |  | HNH endonuclease domain-containing protein | 3566 | 4135 | 2 | 570
P1.orf0003 |  | conserved hypothetical protein | 4297 | 4632 |  | 336
P1.orf0004 |  | conserved hypothetical protein | 4675 | 4920 |  | 246
P1.orf0005 |  | conserved hypothetical protein | 4910 | 5203 | 2 | 294
P1.orf0006 |  | Tetratricopeptide TPR_4 | 5622 | 7493 | 3 | 1872
P1.orf0007 |  | hypothetical protein | 7712 | 7870 | 2 | 159
P1.orf0008 |  | 2-nitropropane dioxygenase NPD | 10299 | 9328 |  | 972
P1.orf0009 |  | cytochrome B561 | 10869 | 10303 |  | 567
P1.orf0010 |  | TPR repeat-containing protein | 11116 | 12300 |  | 1185
P1.orf0011 | arcB | ornithine cyclodeaminase | 13363 | 12311 | -2 | 1053
P1.orf0012 | arcB | arginase | 14343 | 13360 |  | 984
P1.orf0013 |  | hypothetical protein | 15282 | 14491 |  | 792
P1.orf0014 |  | transcriptional regulator | 15452 | 15904 | 2 | 453
P1.orf0015 |  | D-aminoacylase | 16007 | 17512 | 2 | 1506
P1.orf0016 | cobW | Cobalamin biosynthesis protein CobW | 17509 | 18474 |  | 966
P1.orf0017 | kipR | transcriptional regulator, IclR family | 18495 | 19313 | 3 | 819
P1.orf0018 |  | transposase | 19680 | 19988 | 3 | 309
P1.orf0019 |  | transposase, IS4 | 20131 | 20361 |  | 231
P1.orf0020 | recF | DNA replication and repair protein recF | 20617 | 22251 |  | 1635
P1.orf0021 | yagA | integrase catalytic subunit | 24237 | 23488 |  | 750
P1.orf0022 | dadA | D-amino-acid dehydrogenase (DadA-like) | 25991 | 24768 | -3 | 1224
P1.orf0023 | gsiA | putative ABC transporter, ATP-binding protein | 27732 | 26053 |  | 1680
P1.orf0024 |  | putative ABC transporter, permease protein | 28582 | 27737 | -2 | 846
P1.orf0025 |  | putative ABC transporter, permease protein | 29522 | 28803 | -3 | 720


O O26 binding-protein-dependent transport systems 288O 28582 -1 228 inner membrane component O O27 putative ABC transporter, periplasmic binding 3121 1 29595 -3 1617 protein O baeS two component sensor histidine kinase, AdeS 3265 7 3.1479 -3 1179 O putative transcriptional regulatory.cf27 3341 5 3.2654 -2 762 O conserved hypothetical protein 3368 8 34.317 1 630 O conserved hypothetical protein 3440 O 3S098 2 699 O conserved hypothetical protein 36OO 7 351.86 -2 822 O hypothetical protein 3638 6 36081 -3 306 O Sulfatase-modifying factor 3655 3 37887 1 1335 O Meiotically up-regulated gene 158 protein 3791 5 38916 1 10O2 O putative transcriptional regulator 3903 O 39296 3 267 O conserved hypothetical protein 3929 3 39718 2 426 O conserved hypothetical protein 4011 O 39703 -1 408 O conserved hypothetical protein 40969 4O127 -2 843 O hypothetical protein 41084 40953 -3 132 O conserved hypothetical protein 4111 3 41679 1 567 O conserved hypothetical protein 4171 8 42971 3 254 O yciE conserved hypothetical protein 42989 44128 2 140 O TPR repeat 45084 44143 -1 942 O conserved hypothetical protein 45365 45081 -3 285 O aspartyl asparaginyl beta-hydroxylase 4639 6 45362 -2 O35 O asnB Asparagine (glutamine-hydrolyzing) 48329 46389 -3 941 O hypothetical protein 4851 2 48342 -3 171 O transcriptional regulator, ASnC family 4922 1 48742 -1 480 O conserved hypothetical protein 49373 5O158 2 786 O L-carnitine dehydratase/bile acid-inducible 5130 9 5O179 -1 131 protein F O exoD putative exopolysaccharide synthesis protein 5153 O 52174 2 645 O sensory box/GGDEF family protein 5432 O S2131 -3 2190 O PEBP family protein 54541 56367 1 827 O hypothetical protein 5651 4 56780 3 267 O inositol-1-monophosphatase 56949 57.809 3 861 O hypothetical protein 5807 2 57815 -2 258 O feaR AraC family transcriptional regulator 5825 6 S9380 2 125 O hypothetical protein 5958 2 59920 2 339 O hypothetical protein 6023 3 59958 -3 276 O conserved hypothetical protein 60341 605.35 2 195 O hypothetical 
protein 6181 9 60983 -2 837 O conserved hypothetical protein 6252O 61816 -1 705 O hypothetical protein 63.25 7 62658 -3 600 O hypothetical protein 6398 8 632S4 -2 735 O conserved hypothetical protein 6S13 7 64061 -2 O77 O conserved hypothetical protein 6616 8 65134 -1 O35 O vgrO protein 68924 66195 -3 2730 O conserved hypothetical protein 6957O 69058 -1 513 O conserved hypothetical protein 712 9 69567 -3 653 O ABC transporter, permease protein 7243 8 71212 -1 227 O macB Macrollide export ATP-binding permease 73142 72438 -3 705 protein O ppkA-related protein 751 5 73139 -2 977 O /threonine kinase 76542 751.15 -1 428 O Ser/Thr protein phosphatase 7730 5 76535 -2 771 O conserved hypothetical protein 780 8 77302 -1 717 O conserved hypothetical protein 81725 78OOO -3 3726 O conserved hypothetical protein 83263 81725 -2 539 O conserved hypothetical protein 846 O 83267 -2 344 O type VI secretion lipoprotein, VC AO 113 family 8517 8 84669 -3 510 O FHA domain-containing protein 866 7 85211 -2 407 O type VI secretion ATPase, ClpV1 family 89264 86661 -3 2604 O type VI secretion system protein ImpH 9036 5 89.334 -3 O32 O type VI secretion system protein ImpG 922 6 90375 -3 842 O type VI secretion system protein ImpF 92.73 1 92.213 -2 519 O type VI secretion system protein ImpE 9359 7 92.728 -1 870 O type VI secretion system protein ImpC 9506 3 93663 -3 401 O type VI secretion system protein ImpC 96.57 6 95.08O -1 497 O type VI secretion protein 9711 5 96.582 -3 534 O impA-related N-terminal protein 98.31 7 971.75 -2 143 O peptidase S1 and S6, chymotrypsin/Hap 99.08 3 98.634 -3 450 O hcp1 conserved hypothetical protein 9941 8 998.91 1 474 O hypothetical protein 1OOO7 O 1OO681 2 612 O hypothetical protein 1 OO695 101.285 3 591 O VgrG protein 10129 O 103272 1 1983 US 2014/0296.161 A1 Oct. 2, 2014 22


LOCS Gene Product Start End Strand Length orf)096 hypothetical protein O3322 O4083 2 762 orf)097 conserved hypothetical protein O4130 O4S43 3 414 orf)098 rhsC putative rhs-related transmembrane protein O4S48 O901.7 1 4470 orf)099 conserved hypothetical protein O9034 O9873 2 840 orf)100 conserved hypothetical protein O9983 10342 3 360 orf)101 Ribosomal protein L7/L12 C-terminal domain. 10467 10853 1 387 orf)102 hypothetical protein 10962 11171 1 210 orf)103 hypothetical protein 11187 11699 1 513 orf)104 conserved hypothetical protein 11745 12572 1 828 orf)105 conserved hypothetical protein 12700 13104 2 40S orf)106 conserved hypothetical protein 13104 13838 1 735 orf)107 treY malto-oligosyltrehalose synthase 13999 16740 2 2742 orf)108 ipolytic 17759 16767 -1 993 orf)109 paiB transcriptional repressor of sporulation and 17861 18490 3 630 degradative enzyme production orf)110 Metallophosphoesterase 1931.3 18522 -1 792 orf)111 putative transporter protein 19527 20717 1 1.191 orf)112 AraC family transcriptional regulator 2O779 21609 2 831 orf)113 conserved hypothetical protein 21817 21 620 -3 198 orf)114 conserved hypothetical protein 222O7 21902 -3 306 orf)115 putative addiction module antidote protein, 22S4O 22764 2 225 CopGArc/MetJ family orf)116 plasmid stabilization system protein 22765 23076 2 312 conserved hypothetical protein 231 61 24144 2 984 C orf)118 ywA ABC transporter, nucleotide binding ATPase 2582O 24108 -1 1713 protein orf)119 fhuB transport system permease protein 264.04 25895 -3 510 orf)12O fhuB transport system permease protein 27864 26395 -2 1470 orf)121 fhuD ron(3+)-hydroxamate-binding protein fhuD 28711 278.57 -3 855 orf)122 fhuC iron-hydroxamate transporter ATP-binding 29528 28731 -1 798 Subunit of)123 fhuA errichrome receptor precursor protein 31773 29539 -2 2235 msmR transcriptional regulator, AraC family 32821 31928 -3 894 Regulator of cell morphogenesis and NO 33622 32918 -3 705 signaling P1.orf)126 GCN5-related N- 34591 33734 -3 858 
P1.orf)127 appF putative dipeptide ABC transporter, ATP 35640 34.654 -2 987 binding protein Oligopeptide? dipeptide transporter domain 36449 35637 -1 813 amily protein P1.orf)129 didpC ABC transporter permease protein 1 37342 36446 -3 897 P1.orf) 130 didpB binding-protein dependent transport system 38.361 37339 -2 1023 inner membrane protein P1.orf) 131 hbpA extracellular solute-binding 5 40053 38434 -2 162O P1.orf) 132 SoxC DSZC-like desulfurization enzyme 41285 4O104 -1 1182 P1.orf) 133 SOXA flavin-dependent 42712 41282 -3 1431 P1.orf) 134 hcaR transcriptional regulator, LysR family 44315 43398 -1 918 P1.orf)13S short-chain dehydrogenase/reductase SDR 44,452 45222 2 771 P1.orf) 136 cat) carboxylesterase 46146 45223 -2 924 P1.orf) 137 conserved hypothetical protein 46.388 47218 3 831 P1.orf)138 acyl-CoA dehydrogenase 47267 48451 3 118S P1.orf) 139 hypothetical protein 46396 46217 -3 18O P1.orf)140 acyl-CoA dehydrogenase family protein 48483 49682 1 1200 P1orf)141 hypothetical protein SO210 49764 -1 447 P1.orf)142 parA Cobyrinic acid ac-diamide synthase S106.1 502O7 -3 855 P1.orf)143 hypothetical protein 51289 S1116 -3 174 P1orf)144 transcriptional regulator, CopG family S1688 S1413 -3 276 P1.orf)145 mtnC 2,3-diketo-5-methylthio-1-phosphopentane 52572 51757 -2 816 phosphatase mtnD acireductOne dioxygenase ARD 53107 52562 -3 S46 mtnB methylthioribulose-1-phosphate dehydratase 53.733 53104 -2 630 replication protein A 56147 57436 3 1290 hypothetical protein 56098 55670 -3 429 hypothetical protein 58775 59.935 3 1161 conserved hypothetical protein 59979 66170 1 6.192 dctB two-component sensor histidine kinase protein 66326 681.58 3 1833 dctD two component, sigmaS4 specific, 68155 69513 2 1359 transcriptional regulator, Fis family TRAP transporter solute receptor, TAXI family 696.76 2 987 protein TRAP transporter, 4TM/12TM fusion protein 70765 72939 2 2175 3-methylcrotonoyl-CoA carboxylase beta 74655 73048 -2 1608 Subunit US 2014/0296.161 A1 Oct. 2, 2014 23


O 57 Pcca putative acyl-CoA carboxylase, Biotiinflipoyl 75.185 74652 -1 534 carrier domain O 58 pycA acetylpropionyl CoA carboxylase alpha 76739 75264 -1 1476 Subunit O 59 citE Citryl-CoA 77666 76746 -1 921 O 60 dehydratase 78133 77663 -3 471 O 61 short-chain dehydrogenase/reductase SDR 783O4 791.37 2 834 O 62 VD ISOValeryl-CoA dehydrogenase 80424 79222 -2 1203 O 63 wF ABC transporter related protein 81.236 8.0448 -1 789 O 64 putative branched-chain amino acid transport 82524 81244 -2 1281 system substrate-binding protein O 65 bra inner-membrane translocator 83.668 82595 -3 1074 O 66 bra) inner-membrane translocator 84561 83674 -2 888 O 67 AMP-dependent synthetase and 86533 84563 -3 1971 O 68 braF ABC transporter related protein 87335 86523 -1 813 O 69 adR TetR family transcriptional regulator 885.19 878O3 -3 717 O 70 hypothetical protein 88784 89266 3 483 O 71 etR Tetracycline repressor protein class H 8998.7 89301 -1 687 O 72 eutQ Ethanolamine utilization protein eutQ 901.33 90S13 2 381 O 73 ordL FAD dependent oxidoreductase 90S10 91829 1 132O O 74 deoxycytidine triphosphate (dCTP) deaminase 91903 92943 2 1041 O 75 TRAP-type C4-dicarboxylate transport system 92995 935O1 2 507 Small permease component O 76 sia.T putative DctM (C4-dicarboxylate permease, 93503 94792 3 1290 large subunit) O 77 hypothetical protein 9SOO2 95.325 2 324 O 78 conserved hypothetical protein 95.425 95892 2 468 O 79 TetR family transcriptional regulator 96.517 95846 -3 672 O 8O yesh oxidoreductase yes 96619 97461 2 843 O 81 calB aldehyde dehydrogenase 98848 97430 -3 1419 O 82 TetR family transcriptional regulator 99.683 99045 -1 639 O 83 conserved hypothetical protein 99899 200906 3 1008 O 84 putative pyridoxine 5'-phosphate oxidase 2O1927 201238 -1 690 O 85 LysR family transcriptional regulator 2O2O44 2O2982 3 939 O 86 conserved hypothetical protein 2O4OOS 202992 -3 1014 O 87 conserved hypothetical protein 2O4591 2O3998 -1 594 O 88 cydB cytochrome dubiquinol oxidase, Subunit II 2O6024 
2O4870 -3 1155 O 89 cydA cytochrome Dubiquinol oxidase, Subunit I 2O7644 2O6031 -3 1614 O 90 cydC transport ATP-binding protein CYDC 209417 2O7732 -3 1686 O 91 cydD ABC transporter, Cyd DC cysteine exporter 211018 209444 -2 1575 (Cyd DC-E) family, permeasef ATP-binding protein CydD O 92 hypothetical protein 211079 211204 2 126 O 93 methyl-accepting chemotaxis protein 211329 213422 3 2094 O 94 putative araC-like transcription regulator 214364 21343S -3 930 O 95 short-chain dehydrogenase/reductase SDR 214465 21.5220 1 756 O 96 Methyltransferase type 11 216,154 21S240 -2 915 O 97 major facilitator transporter 217467 216211 -1 1257 O 98 TonB-dependent receptor 21974O 217473 -3 2268 O 99 regulatory protein Pchr 22O674 219814 -1 861 O 2OO hypothetical protein 221164 22O736 -2 429 O 2O1 hypothetical protein 221470 221 198 -2 273 O 2O2 transposase, IS4 222251 221904 -3 348 O 2O3 PilT domain-containing protein 222699 222268 -1 432 O 204 conserved hypothetical protein 22290S 222696 -3 210 O 205 conserved hypothetical protein 2236O4 223110 -3 495 O 2O6 dhkJ multi-sensor hybrid histidine kinase 228426 223648 -1 4779 O 2O7 ybaR Sulphate transporter 229028 230536 2 1509 O 208 putative addiction module antidote protein, 230716 23.0997 1 282 CopGArc/MetJ family O 209 hypothetical protein 231751 231569 -2 183 O 210 hypothetical protein 232640 232413 -3 228 O 211 malO glycoside family protein 234541 232637 -2 1905 O 212 glgX glycogen debranching enzyme GlgX 238,557 234628 -1 3930 O 213 glycogen branching enzyme 24O732 2.38567 -1 21 66 O 214 treS trehalose synthase 244194 240808 -1 3387 O 215 aam1 alpha amylase catalytic region 2472O2 2442OO -3 3OO3 O 216 hypothetical protein 24.7741 247244 -2 498 O 217 hypothetical protein 247736 247966 2 231 O 218 cyclasef dehydrase 248O34 248840 3 807 O 219 otsB trehalose-phosphatase 248837 249631 2 795 O 220 otSA alpha,alpha-trehalose-phosphate synthase 249734 251275 2 1542 (UDP-forming) US 2014/0296.161 A1 Oct. 2, 2014 24


O 221 TRAP dicarboxylate transporter, DctM subunit 252686 251367 -3 132O O 222 TRAP dicarboxylate transporter, DctO subunit 253240 2S2683 -2 558 O 223 yiaO TRAP dicarboxylate family transporter, DctP 254661 253585 -1 1077 Subunit O 224 mvaA hydroxymethylglutaryl-CoA reductase, 256O27 254723 -2 1305 degradative O 225 ydcR GntR family transcriptional regulator 256219 257616 1 1398 O 226 AraC family transcriptional regulator 2588.98 257852 -2 1047 O 227 Extracellular ligand-binding receptor 259239 26042O 3 1182 O 228 IvE inner-membrane translocator 260507 26.1364 2 858 O 229 IiwV inner-membrane translocator 261376 262317 1 942 O 230 braF ABC transporter related protein 262314 26.3063 3 750 O 231 IvE ABC transporter related protein 263060 263761 2 702 O 232 acetyl-CoA acyltransferase 263771 26SOO3 2 1233 O 233 lcfB Acyl-CoA synthetase (AMP-forming)/AMP-acid 26SO13 266563 2 1551 ligase II O 234 fad) AMP-binding enzyme 266573 268O81 2 1509 O 235 SCP2 lipid-transfer protein 268.098 2692.79 3 1182 O 236 3-oxoacyl-acyl-carrier protein reductase 269276 270085 2 810 O 237 conserved hypothetical protein 270216 270575 3 360 O 238 F440524 38 putative alcohol dehydrogenase 27.08.19 27O637 -1 183 O 239 conserved hypothetical protein 270867 271124 3 258 O 240 addiction module toxin, RelE/StbE family 271121 271423 2 303 O 241 putative Alcohol dehydrogenase, (ADH) 271793 271398 -3 396 O 242 hypothetical protein 271927 271802 -2 126 O 243 dhkJ multi-sensor hybrid histidine kinase 272O12 276670 2 4659 O 244 adhA Alcohol dehydrogenase GroES domain protein 277765 276710 -2 1056 O 245 inner-membrane translocator 2789.08 277865 -2 1044 O 246 ytfT Sugar ABC transporter permease protein 2799.21 278908 -1 1014 O 247 ytfR Sugar ABC transporter ATP-binding protein 281SO3 279995 -2 1509 O 248 ytfC) periplasmic binding protein LacI transcriptional 282S45 281586 -3 960 regulator O 249 Galm aldose 1-epimerase 283693 28.2674 -2 102O O 250 yvrE Smp-30/Cgr1 family protein 284592 283690 -1 903 O 251 
aldH dehydrogenase 28SS62 284582 -2 981 O 252 aldH NAD-dependent aldehyde dehydrogenase 28.6131 285.559 -1 573 O 253 gal putative d-galactose 1-dehydrogenase protein 287110 2861.63 -2 948 O 2S4 dihydroxy-acid dehydratase 288839 287097 -3 1743 O 255 umarylacetoacetate (FAA) hydrolase 289859 288858 -3 10O2 O 2S6 inner membrane permease of D-xylose ABC 291069 2898.64 -1 12O6 transporter O 257 araG ABC transporter related protein 2926O3 291041 -2 1S63 O 258 chw. putative Sugar uptake ABC transporter 293736 292666 -1 1071 periplasmic solute-binding protein precursor O 259 gbpR transcriptional regulator protein, LysR family 293961 294938 3 978 (possibly activator of the expression of chvE protein) O 260 conserved hypothetical protein 295O16 2.95489 2 474 O 261 LysR family transcriptional regulator 296.609 297565 2 957 O 262 Alcohol dehydrogenase zinc-binding domain 296.536 295.508 -2 1029 protein O 263 conserved hypothetical protein 297888 297619 -1 270 O 264 conserved hypothetical protein 298075 297848 -2 228 O 26S TetR family transcriptional regulator 2988O3 298.174 -1 630 O 266 mdmC O-methyltransferase mdmC 298.931 2996O2 2 672 O 267 major facilitator Superfamily transporter 299655 30O839 3 118S O 268 conserved hypothetical protein 3OO897 301634 3 738 O 269 acrF acriflavin resistance protein 3OS336 302289 -3 3048 O 270 efflux transporter, RND family. 
MFP subunit 3O6451 3OS336 -2 1116 O 271 Imra transcriptional regulator 306S4O 307115 3 576 O 272 wdlC short chain dehydrogenase 3O7162 307977 1 816 O 273 TRR1 FAD-dependent pyridine nucleotide-disulphide 3O891S 307995 -3 921 oxidoreductase O 274 transcriptional regulator, BadMRrf2 family 3O9078 309569 3 492 O 275 putative effector of murein hydrolase 310317 309583 -1 735 O 276 hypothetical protein 310717 310310 -2 408 O 277 pmrA major facilitator transporter 311875 31 O790 -2 1086 O 278 foE GTP cyclohydrolase I 312630 31 2004 -1 627 O 279 conserved hypothetical protein 313.395 312655 -1 741 O 28O conserved hypothetical protein 31.3853 314236 2 384 O 281 conserved hypothetical protein 314242 315081 1 840 O 282 arylmalonate decarboxylase 315223 315966 1 744 O 283 transcriptional regulator, GintR family 3.15981 316685 3 705 US 2014/0296.161 A1 Oct. 2, 2014 25


O 284 putative acyl-CoA transferase? carnitine 316686 317882 3 1197 dehydratase O 285 yngG Hydroxymethylglutaryl-CoA lyase 3.17894 318838 2 945 O 286 gbas conserved hypothetical protein 31883S 319149 315 O 287 isochorismatase hydrolase 3.19.182 31983S 3 654 O 288 mauR LysR family transcriptional regulator 3.19929 32O825 3 897 O 289 yajO putative oxidoreductase 321960 32O899 1062 O 290 conserved hypothetical protein 32249S 322088 -2 408 O 291 WD40-like repeat 323353 322676 -2 678 O 292 conserved hypothetical protein 32468O 323.472 -3 1209 O 293 hypothetical protein 32S2OO 32SO18 183 O 294 braC branched chain amino acid ABC transporter 32S4O2 326541 1140 periplasmic ligand-binding protein O 295 bra) ABC branched chain amino acid family 326S63 327480 918 transporter, inner membrane subunit O 296 braE High-affinity branched-chain amino acid 327562 32.8494 933 transport system permease protein braE O 297 braF branched-chain amino acid ABC transporter 3284.91 3293.18 3 828 ATPase O 298 braG High-affinity branched-chain amino acid 3293.23 330018 696 transport ATP-binding protein braC O 299 putative hydrolase 330O37 330909 873 O 3OO yfdE L-carnitine dehydratase/bile acid-inducible 331052 332284 2 1233 protein F O 301 hypothetical protein 332281 333 126 846 O 3O2 namA putative FMN oxidoreductase 333.123 334307 3 118S O 303 MscS mechanosensitive ion channel 335475 334.321 1155 O 3O4 yagR aldehyde oxidase and Xanthine 337817 33552O -3 2298 dehydrogenase molybdopterin binding O 305 yagS molybdopterin dehydrogenase FAD-binding 33883O 337814 -2 1017 O 306 yagT oxidoreductase 339306 338830 477 O 307 hypothetical protein 339700 339341 -2 360 O 3O8 conserved hypothetical protein 339834 34O160 3 327 O 309 pyridoxal phosphate biosynthetic protein Pdx.J. 
340527 340315 213 O 310 cold-shock DNA-binding domain-containing 34O993 34O799 -2 195 protein O 311 DNA ligase D 343739 341301 -3 2439 O 312 Ku domain-containing protein 344647 343745 -2 903 O 313 conserved hypothetical protein 3448O3 345087 285 O 314 mintFH NADPH-dependent FMN reductase 345091 345735 645 O 315 yhdF Short-chain dehydrogenase/reductase SDR 346645 345794 -2 852 O 316 hypothetical protein 346993 347313 321 O 317 cmoB putative SAM-dependent methyltransferase 34.8091 34.7378 -2 714 O 318 topoisomerase IB 349134 348.118 1017 O 319 NAD-dependent epimerase? dehydratase 35O294 34919.1 -3 1104 O 32O conserved hypothetical protein 35146S 350305 1161 O 321 conserved hypothetical protein 352556 351462 -3 109S O 322 conserved hypothetical protein 353707 35.2553 -2 1155 O 323 , group 1 354876 353704 1173 O 324 radical SAM domain-containing protein 356.162 3S4864 -3 1299 O 325 rfbE NAD-dependent epimerase? dehydratase 3582O7 356159 -2 2049 O 326 NAD-dependent epimerase? dehydratase 359322 3582O4 1119 O 327 ECF subfamily RNA polymerase sigma-24 36O114 359527 S88 actor C O 328 conserved hypothetical protein 361.269 36O760 510 O 329 cc4 cytochrome c, class I 362430 361.306 112S O 330 Cytochrome c oxidase caa3-type, assembly 363092 36.2427 -3 666 actor CtaG-related protein O 331 transmembrane prediction 363451 363O89 -2 363 O 332 ctaD cytochrome c oxidase, Subunit I 365985 3634.66 252O O 333 ctaC cytochrome c oxidase, Subunit II 367010 365982 -3 1029 O 334 gcd glucose dehydrogenase 369412 36 7016 -2 2397 O 335 hypothetical protein 3694.79 369637 2 159 O 336 conserved hypothetical protein 370254 370700 3 447 O 337 conserved hypothetical protein 371128 371805 1 678 O 338 conserved hypothetical protein 373405 373O43 -2 363 O 339 prevent-host-death family protein 373714 373454 -2 261 O 340 gabl) Aldehyde Dehydrogenase 375058 3738O2 -2 1257 O 341 gsiA oligopeptide ABC Superfamily ATP binding 376993 3753O8 -2 1686 cassette transporter, ABC protein O 342 Putative peptide 
transport system permease 377814 376990 -1 825 protein O 343 binding-protein-dependent transport systems 378769 377825 -2 945 inner membrane component US 2014/0296.161 A1 Oct. 2, 2014 26


O 344 acyl-: 6-aminopenicillanic-acid 379943 378822 -3 1122 acyltransferase precursor O 345 beta-alanine-pyruvate transaminase 38142O 380041 1380 O 346 yM GntR family transcriptional regulator 382417 381SOO -2 918 O 347 bam indoleacetamide hydrolase 382SO1 383991 1491 O 348 poly(aspartic acid) hydrolase 384OO6 384872 3 867 O 349 ABC Superfamily ATP binding cassette 3850O8 38.6591 3 1584 transporter Substrate-binding protein O 350 transposase, mutator type 3874OS 387115 291 O 351 transposase, mutator type 388310 3874OS -3 906 O 352 conserved hypothetical protein 38.8987 38.9457 471 O 353 conserved hypothetical protein 390709 389.783 -2 927 O 3S4 conserved hypothetical protein 391494 3907O6 789 O 355 plasmid maintenance system antidote protein, 392861 393 004 2 144 XRE family O 356 type II restriction enzyme, methylase subunit 393.192 192 O 357 hsdR type I restriction-modification system 396231 3O3O restriction subunit O 358 putative restriction modification system 396.266 397510 2 1245 specificity Subunit O 359 prrC anticodon nuclease 397507 398736 1230 C O 360 type I restriction-modification system, M 398733 40O3SS 3 1623 Subunit O 361 integrase, catalytic region 40O892 4O1353 2 462 O 362 xerC phage integrase family protein 4O2O15 4O2977 3 963 O 363 conserved hypothetical protein 4O3S09 402967 543 O 364 conserved hypothetical protein 403679 405373 2 1695 O 365 conserved hypothetical protein 405379 4O7313 1935 O 366 conserved hypothetical protein 4O7310 408947 3 1638 O 367 Cold-shock DNA-binding domain protein 408944 411208 2 2265 O 368 conserved hypo hetical protein 41120S 412182 978 O 369 conserved hypo hetical protein 416O11 412664 -2 3348 O 370 al2 CRISPR-associa ed protein 416875 418866 1992 O 371 CRISPR-associa ed protein 418867 41918.4 3.18 O 372 Selenide, water ikinase 41986S 42O392 3 528 O 373 ybfL transposase, is4 amily 42O664 421710 1047 O 374 parA partition protein Para 422SO9 423213 705 O 375 hypothetical pro ein 423411 423.569 3 159 O 376 hypothetical 
pro ein 423954 424O79 3 126 O 377 gluA F213463. 1 cellobiase CelA precursor 424.429 4263OO 1872 O 378 maleate cis-trans protein 427582 426,830 -2 753 O 379 icaR putative TetR family transcriptional regulator 427725 428483 3 759 O 380 conserved hypothetica protein 428775 428515 261 O 381 glyoxalasebleomycin resistance 4294.79 428775 -3 705 protein dioxygenase C O 382 conserved hypothetica protein 430769 4295O1 -3 1269 O 383 conserved hypothetica protein 431298 430777 522 O 384 vioD putative monooxygenase, FAD, NAD(P)- 432452 431295 -3 1158 binding domain O 385 dhbE AMP-dependent synth etase and ligase 4341.72 432484 1689 O 386 icaR putative transcriptiona regulator 434976 434275 702 O 387 yiaO Bacterial extracellular Solute-binding protein, 43S118 436,176 1059 amily 7 O 388 conserved hypothetica protein 436161 436919 3 759 C O 389 sia.T TRAP transporter, DctM-like membrane 436879 438.048 1170 protein O 390 putative Aminoglycoside phosphotransferase 439093 438071 -2 1023 O 391 tan trans-aconitate methyl ransferase 439.218 439982 3 765 O 392 gst glutathione S-transferase protein 44OOO1 44O609 3 609 O 393 gloA actoylglutathione lyase 4.41036 44O632 -1 40S C O 394 conserved hypothetica protein 44.1639 44.1223 -1 417 O 395 acrR TetR family transcriptional regulator 441721 442362 1 642 O 396 ifcA umarate reductase 443799 442378 -1 1422 O 397 namA NADH:flavin oxidoreductase,NADH oxidase 443958 444935 3 978 O 398 yVOA transcriptional regulator 445O26 445901 3 876 O 399 branched-chain amino acid transport system 446OOO 447223 2 1224 Substrate-binding protein O 400 bra) branched-chain amino acid ABC transporter 4473.25 448167 1 843 permease protein O amino acid ABC transporter permease protein 448167 4491SO 3 984 O lptB ABC transporter ATP-binding protein 4491SO 449893 2 744 O IvE ABC transporter ATP-binding protein 449890 450609 1 720 O MmgE/PrpD family protein 450622 452040 1 1419 O isocitrate lyase family protein 452042 45292O 2 879 US 2014/0296.161 A1 Oct. 2, 2014 27


O 4O6 automerase 452966 453457 2 492 O 407 ifcA putative fumarate reductase, succinate 453463 454821 1 1359 dehydrogenase flavoprotein O 4.08 hyuE hydantoin racemase 454841 455599 2 759 O 409 hyuE hydantoin racemase 455614 4563O3 1 690 O 410 Irg1 MmgE/PrpD family protein 4563OO 457646 3 1347 O 411 hypothetical protein 457718 457864 2 147 O 412 hypothetical protein 457895 4.58059 2 16S O 413 conserved hypothetical protein 458269 459588 1 132O O 414 putative amidase 461101 4596.68 -2 1434 O 415 braG branched chain amino acid ABC transporter 461821 461129 -2 693 ATP-binding protein O 416 braF putative ABC transport system, ATP-bidning 462606 461815 -1 792 protein O 417 IiwV putative ABC transport system, membrane 4636O1 462603 -3 999 protein O 418 bra) inner-membrane translocator 464465 4636OS -3 861 O 419 branched-chain amino acid ABC transporter, 46.5776 464556 -3 1221 periplasmic amino acid-binding protein O 420 nanR regulatory protein GntRHTH 466752 465961 792 O 421 hypothetical protein 467173 467316 144 O 422 hypothetical protein 467699 467899 2 2O1 O 423 hypothetical protein 467955 468.107 3 153 O 424 hypothetical protein 468152 468298 2 147 O 425 conserved hypothetical protein 468358 46930S 948 O 426 algi membrane bound O-acyl transferase, MBOAT 469322 47O677 2 356 O 427 hypothetical protein 471709 47O678 -2 O32 O 428 hypothetical protein 472O81 471,875 -2 2O7 O 429 hypothetical protein 472147 472290 144 O 430 hypothetical protein 473,393 472362 -3 O32 O 431 hypothetical protein 473SO1 473692 2 192 O 432 conserved hypothetical protein 473712 473978 3 267 O 433 cinA competence damage-inducible protein Cin A 473929 474507 579 O 434 catl putative acetyl-CoA hydrolase/transferase 474600 476111 3 512 amily protein; putative Succinyl CoA: coenzyme A transferase O 435 alkJ putative choline dehydrogenase lipoprotein 476491 478116 626 oxidoreductase O 436 siaP TRAP dicarboxylate transporter- DctP subunit 478397 479.503 2 107 O 437 tripartite ATP-independent periplasmic 
479565 480071 3 507 transporter DctO O 438 TRAP dicarboxylate transporter, DctM subunit 48OO68 481.345 2 278 O 439 yxeO MmgE/PrpD family protein 481377 482729 3 353 CC O 440 maoC conserved hypothetical protein 48.2756 483.256 2 5O1 O 441 mmgC Butyryl-CoA dehydrogenase 483253 484428 1 176 O 442 maoC conserved hypothetical protein 484439 484921 2 483 O 443 yfdE L-carnitine dehydratase/bile acid-inducible 484.918 486153 1 236 protein F O 444 Citryl-CoA lyase 4861SO 487073 3 924 O 445 kdgR transcriptional regulator, IcIR family 487.146 487898 3 753 O 446 major facilitator transporter 487911 489089 3 179 O 447 Ech1 enoyl-CoA hydratase 489947 48.9102 -3 846 O 448 hypothetical protein 490444 490049 -2 396 O 449 hypothetical protein 490967 490S1S -3 453 O 450 hypothetical protein 4.91546 491067 -3 480 O 451 gst3 Glutathione S-transferase domain protein 492356 4.91676 -3 681 O 452 mmgC acyl-CoA dehydrogenase 493553 4924.08 -3 1146 O 453 Carbamoyl-phosphate synthase L chain ATP 495592 49.3598 -2 1995 binding O 454 propionyl-CoA carboxylase 497235 495.673 -1 1S63 O 455 hypothetical protein 497323 497616 1 294 O 456 mirca glycosyltransferase family protein 498482 497871 -3 612 O 457 transcriptional regulator 4991.99 498.924 -3 276 O 458 hypothetical protein 499669 499484 -2 186 O 459 frpC bacteriocin SO1632 499881 -3 1752 O 460 hypothetical protein SO1804 SO1619 -1 186 O 461 hypothetical protein 5O2315 SO1893 -2 423 O 462 butative ENOYL-COA HYDRATASE SO2S66 SO3336 3 771 O 463 araC transcriptional regulator, AraC family SO3355 SO4209 3 855 O 464 lcfA ong-chain-fatty-acid-CoA ligase SO4260 506095 2 1836 O 465 rhG short-chain dehydrogenase/reductase SDR 506873 SO 6106 -3 768 O 466 pleC non-motile and phage-resistance protein SO8451 507000 -3 1452 O 467 membrane protein 508592 50891.5 2 324 O 468 conserved hypothetical protein S10548 SO8890 -3 1659 US 2014/0296.161 A1 Oct. 2, 2014 28


O 469 hypothetical protein 510922 510668 -2 255
O 470 sensor histidine kinase 511186 514770 1 3585
O 471 two component transcriptional regulator 514767 515732 3 966
O 472 conserved hypothetical protein 515942 516124 2 183
O 473 hypothetical protein 516354 516656 3 303
O 474 ynaD GCN5-related N-acetyltransferase 516833 517366 2 534
O 475 conserved hypothetical protein 521851 517388 -2 4464
O 476 conserved hypothetical protein 529416 521851 -1 7566
O 477 hypothetical protein 530907 529468 -1 1440
O 478 hemagglutinin protein 534719 530907 -3 3813
O 479 conserved hypothetical protein 537877 534716 -2 3162
O 480 conserved hypothetical protein 539720 537975 -3 1746
O 481 major facilitator transporter 541177 539945 -2 1233
O 482 ArsR family transcriptional regulator/protein tyrosine phosphatase 541322 542170 2 849
O 483 gap3 glyceraldehyde-3-phosphate dehydrogenase, type I 542175 543197 3 1023
O 484 major facilitator transporter 543191 544441 2 1251
O 485 beta-lactamase 544538 545719 2 1182
O 486 Cephalosporin hydroxylase 546109 546777 1 669
O 487 macrocin-O-methyltransferase 546777 547604 3 828
O 488 conserved hypothetical protein 547960 548265 1 306
O 489 hypothetical protein 548277 549212 3 936
O 490 O-linked N-acetylglucosamine transferase 549206 550879 2 1674
O 491 hypothetical protein 550831 551082 1 252
O 492 dehydratase, MaoC family protein 551432 551587 2 156
O 493 citE HpcH/HpaI aldolase/citrate lyase family 551584 552501 1 918
O 494 conserved hypothetical protein 552513 553436 3 924
O 495 IiwV ABC transporter permease protein 555566 554586 -3 981
O 496 IvE inner-membrane translocator 556600 555731 -2 870
O 497 IvE ABC transporter related protein 557327 556614 -3 714
O 498 braF ABC transporter related protein 558115 557324 -2 792
O 499 IivK putative leucine, isoleucine/valine-binding protein precursor 559395 558163 -1 1233
O 500 baiB AMP-dependent synthetase and ligase 559600 561189 1 1590
O 501 mfeB MaoC-like dehydratase 561274 562128 1 855
O 502 short-chain dehydrogenase 562184 563104 2 921
O 503 paaG crotonase 563120 563935 2 816
O 504 badR putative MarR family transcriptional regulator 564533 563985 -3 549
O 505 fadH short-chain dehydrogenase/reductase SDR 565439 564561 -3 879
O 506 ymfI short-chain dehydrogenase/reductase SDR 566278 565538 -2 741
O 507 putative acyl-CoA hydratase 567137 566289 -3 849
O 508 2,5-dichloro-2,5-cyclohexadiene-1,4-diol dehydrogenase (2,5-ddol dehydrogenase) 567916 567161 -2 756
O 509 menE long-chain-fatty-acid-CoA ligase 568030 569610 1 1581
O 510 Acyl-CoA dehydrogenase, middle domain protein 570905 569721 -3 1185
O 511 pimeloyl-CoA dehydrogenase 572014 570902 -2 1113
O 512 lcfB AMP-dependent synthetase and ligase 573590 572049 -3 1542
O 513 dctP putative dicarboxylate-binding periplasmic protein 573957 575021 3 1065
O 514 TRAP transporter, DctQ-like membrane protein 575066 575599 2 534
O 515 TRAP dicarboxylate transporter, DctM subunit 575642 576946 2 1305
O 516 Long-chain specific acyl-CoA dehydrogenase 578341 577181 -2 1161
O 517 acetyl-CoA acetyltransferase 579540 578338 -1 1203
O 518 conserved hypothetical protein 581018 579570 -3 1449
O 519 L-carnitine dehydratase/bile acid-inducible protein F 581258 582325 2 1068
O 520 bcd Acyl-CoA dehydrogenase, C-terminal domain protein 582357 583523 3 1167
O 521 Enoyl-CoA hydratase/isomerase 583556 584341 2 786
O 522 alkR transcriptional regulator, AraC family protein 585281 584475 -3 807
O 523 alkyl hydroperoxide reductase/thiol specific antioxidant/Mal allergen 585348 585887 3 540
O 524 acSA acetyl-CoA synthetase/AMP-(fatty) acid ligase Fadox 587610 585964 -1 1647
O 525 cit enoyl-CoA hydratase 587849 588775 2 927
O 526 OPR3 NADH:flavin oxidoreductase, NADH oxidase 588859 589962 1 1104
O 527 hypothetical protein 590078 590392 2 315
O 528 mliC Membrane-bound lysozyme inhibitor of C-type lysozyme 590488 590883 1 396
O 529 csgA putative short-chain dehydrogenase, oxidoreductase 591228 590935 -1 294


O 530 sdh short-chain dehydrogenase/reductase SDR 591650 591237 -3 414
O 531 LysR family transcriptional regulator 591776 592696 2 921
O 532 hexapeptide repeat-containing transferase 592801 593412 612
O 533 xerD possible integrase-like protein 594709 593429 -2 1281
O 534 yXM putative sensor histidine kinase 595024 597258 2235
O 535 devR response regulator 597287 598000 2 714
O 536 conserved hypothetical protein 598984 598019 -2 966
O 537 Putative TRAP transporter large permease protein 600291 598999 1293
O 538 putative DctQ (C4-dicarboxylate permease, small subunit) -3 531
O 539 dctP C4-dicarboxylate-binding periplasmic protein 601989 600982 1008
O 540 cit enoyl-CoA hydratase 602255 603037 2 783
O 541 tagA ToxR-activated gene A lipoprotein 603169 605823 2655
O 542 ying AMP-dependent synthetase and ligase 607530 605857 1674
O 543 conserved hypothetical protein 608158 607595 -2 564
O 544 conserved hypothetical protein 608700 608179 522
O 545 conserved hypothetical protein 609437 608847 -3 591
O 546 Sulfotransferase 610594 609725 -2 870
O 547 Hpr(Ser) kinase/phosphatase 611454 610591 864
O 548 Sulfotransferase 611695 612348 654
O 549 putative N-acetylglucosaminyltransferase 614343 613219 1125
O 550 pcaK 3-hydroxyphenylpropionic acid 615773 614421 -3 1353
O 551 gentisate 1,2-dioxygenase 616945 615887 -2 1059
O 552 doXA rieske (2Fe-2S) domain protein 617264 616950 -3 315
O 553 aromatic-ring-hydroxylating dioxygenase, beta subunit 617775 617281 495
O 554 bphA aromatic-ring-hydroxylating dioxygenase, alpha subunit 618980 617772 -3 1209
O 555 thcD FAD-dependent pyridine nucleotide-disulphide oxidoreductase 619193 2 1248
O 556 iclR transcriptional regulator 621218 620400 -3 819
O 557 maiA maleylacetoacetate isomerase 621321 621962 3 642
O 558 fumarylacetoacetate hydrolase 622007 622696 2 690
O 559 mauR LysR family transcriptional regulator 623860 622940 -2 921
O 560 methylmalonate-semialdehyde dehydrogenase 624014 625513 2 1500
O 561 hypothetical protein 625931 628021 2 2091
O 562 hypothetical protein 628036 628410 1 375
O 563 peptidase M48, Ste24p 628432 630216 1 1785
O 564 TRAP transporter solute receptor, TAXI family 630217 631218 1 1002
O 565 hypothetical protein 631582 631779 1 198
O 566 badR MarR family transcriptional regulator 633132 633674 3 543
O 567 hipO peptidase M20D, amidohydrolase 633059 631887 -3 1173
O 568 conserved hypothetical protein 633807 634955 3 1149
O 569 putative zinc-binding dehydrogenase 635212 636231 1 1020
O 570 AMP-binding domain protein 636228 638114 3 1887
O 571 conserved hypothetical protein 638111 638965 2 855
O 572 TRAP-T family transporter, DctQ (4 TMs) subunit 638962 639525 1 564
O 573 TRAP-T family protein transporter, DctM (12 TMs) subunit 639522 640826 3 1305
O 574 phenylacetic acid degradation-related protein 640842 641276 3 435
O 575 hypothetical protein 641795 641316 -3 480
O 576 heme receptor 644452 641879 -2 2574
O 577 fecR anti-FecI sigma factor, FecR 645589 644660 -2 930
O 578 ECF subfamily RNA polymerase sigma-24 factor 646092 645589 504
O 579 hypothetical protein 646794 646489 306
O 580 HPr kinase 647747 646791 -3 957
O 581 asnEH asparagine synthase 649648 647747 -2 1902
O 582 peptidase S45 penicillin amidase 649895 652276 2 2382
O 583 enoyl-CoA hydratase 652483 653304 822
O 584 transcriptional regulator, TetR family 654155 653328 -3 828
O 585 conserved hypothetical protein 654646 654152 -2 495
O 586 hypothetical protein 655893 654871 1023
O 587 algi membrane bound O-acyl transferase MBOAT family protein 657349 655880 -2 1470
O 588 acpM putative 657482 657718 2 237
O 589 conserved hypothetical protein 658665 657730 936
O 590 FkbH like protein 658770 660806 3 2037
O 591 yafC transcriptional regulator 661741 660836 -2 906
O 592 addiction module antitoxin 662033 661743 -3 291
O 593 din addiction module antitoxin 662319 662020 300
O 594 hypothetical protein 662702 662454 -3 249


LOCS Gene Product Start End Strand Length P1.or 595 conserved hypothetical protein 662739 666431 3 3693 P1.or 596 hypothetical protein 666428 667438 2 1011 P1.or 597 conserved hypothetical protein 667452 671111 3 3660 P1.or 598 HipA-like 672445 671177 -2 1269 P1.or 599 XRE family transcriptional regulator 672,684 672442 -1 243 P1.or 600 katE catalase 675 002 672882 -3 21.21 P1.or 6O1 yhcV Inosine-5'-monophosphate dehydrogenase 675652 675164 -2 489 related protein P1.or 602 phage SPO1 DNA polymerase-related protein 675846 676.556 3 711 P1.or 603 tycC non-ribosomal peptide synthetase 676765 682599 1 5.835 P1.or 604 tycC Amino acid adenylation 682572 691364 3 8793 P1.or 60S lgrC Amino acid adenylation 691352 6928.30 2 1479 P2 or OO1 fh alcohol dehydrogenase 1 1170 1 1170 P2 or OO2 ail protein D 2205 1204 -1 10O2 P2 or OO3 phage Tail Protein X 2432 2205 -3 228 P2 or OO4 conserved hypothetical protein 3679 2429 -2 1251 P2 or 005 hypothetical protein S381 3663 -3 1719 P2 or OO6 bacteriophage gpE S834 5529 -3 306 P2 or OO7 FII similar to probable bacteriophage protein 6419 S916 -3 SO4 P2 or OO8 FI phage tail sheath protein 7686 6511 -1 1176 P2 or O09 hypothetical protein 8166 7690 -1 477 P2 or O10 conserved hypothetical protein 8989 8.177 -2 813 P2 or O11 conserved hypothetical protein 1112S 8993 -2 21.33 P2 or O12 gp15 protein 11987 11139 -3 849 P2 or O13 phage baseplate J-like protein 12876 11980 -1 897 P2 or O14 Baseplate assembly protein W 13256 12873 -3 384 P2 or O15 PAAR 13555 13262 -2 294 P2 or O16 phage baseplate assembly protein V 14012 13560 -3 453 P2 or O17 hypothetical protein 14632 14009 -2 624 P2 or O18 hypothetical protein 15097 14816 -2 282 P2 or O19 amtB ammonium transporter 16723 15371 -2 1353 P2.Or O20 tau) Taurine dioxygenase 17879 17037 -3 843 P2 or O21 class II aldolase/adducin family protein 18717 17953 -1 765 P2 or O22 binding-protein-dependent transport systems 19509 18739 -1 771 inner membrane component P2 or binding-protein-dependent transport 
systems 19491 -3 783 inner membrane component P2 or O24 ABC transporter related protein 21163 2O270 -2 894 P2 or O25 NMT1/THIS like domain protein 22221 21181 -1 O41 P2 or O26 metal-dependent phosphohydrolase 22657 23268 1 612 P2 or O27 lutR transcriptional regulatory protein 234O1 24O90 1 690 P2 or O28 dhkJ Signal transduction histidine kinase 24694 29.427 1 4734 P2 or O29 rpfG two component system, transcriptional 29424 30527 3 104 regulatory protein P2 or ydeR major facilitator Superfamily protein 31739 30543 -3 197 P2 or yeaT transcriptional regulator 31835 3.2716 2 882 P2 or yvdB Sulphate transporter 34447 32732 -2 716 P2 or XRE family transcriptional regulator 34788 34528 -1 261 P2 or conserved hypothetical protein 35O12 35473 2 462 P2 or Imra transcriptional regulator, TetR family 35497 36090 1 594 P2 or ABC transporter related protein 361.76 38O86 2 911 P2 or hypothetical protein 38126 38839 2 714 P2 or ABC Superfamily ATP binding cassette 4OSO1 38924 -2 578 transporter Substrate-binding protein P2 or poly(aspartic acid) hydrolase 41621 4O668 -3 954 P2 or gsiA ABC Superfamily ATP binding cassette 43229 41646 -3 S84 transporter, ABC protein P2 or Dipeptide transport system permease protein 44.158 43292 -2 867 dppC P2 or Putative peptide transport system permease 45096 441SS 942 protein P2 or acyl-coenzyme A: 6-aminopenicillanic-acid 46564 45401 -2 1164 acyltransferase P2 or O44 bphR GntR-family transcriptional regulator 467OS 4788O 1176 P2 or O45 hypothetical protein 48044 478.68 -3 177 P2 or O46 paiB Protease synthase and sporulation protein PAI 48SOS 481.SS -2 351 P2 or O47 amidase family protein 49974 4852O 1455 P2 or O48 secretion protein HlyD family protein 50955 50059 897 P2 or O49 conserved hypothetical protein 51167 50952 -3 216 P2 or 050 ydhK usaric acid resistance protein region 53257 S1164 -2 2094 P2 or 051 yeaM AraC family transcriptional regulator 53335 S4138 804 P2 or 052 conserved hypothetical protein S4665 55303 2 639 P2 or 053 conserved 
hypothetical protein 564O2 5.5323 -3 1080 P2 or OS4 conserved hypothetical protein 56446 S7084 639 US 2014/0296.161 A1 Oct. 2, 2014 31


LOCS Gene Product Start End Strand Length P2 or 055 conserved hypothetical protein 57088 57966 1 879 P2 or OS6 pheC extracellular solute-binding protein 58113 58955 3 843 P2 or 057 glnP polar amino acid ABC transporter, inner 59028 S9690 3 663 membrane subunit P2 or OS8 teyC ABC transporter ATP-binding protein 59687 6O472 2 786 P2 or 059 Chain A, Crystal Structure Of A Muconate 60482 61603 2 1122 Cycloisomerase From Azorhizobium Cauinodans P2 or dadA FAD dependent oxidoreductase 61661 62932 2 1272 P2 or ilw8 hiamine pyrophosphate protein central region 62971 64722 1 1752 P2 or gcVA LysR family transcriptional regulator 65673 64735 -1 939 P2 or conserved hypothetical protein 66090 65689 -1 402 P2 or Beta-lactamase 67350 66.328 -1 1023 P2 or Chloramphenicol acetyltransferase 68.054 674.04 -3 651 P2 or yoe A MATE efflux family protein 696O2 6.8235 -3 1368 P2 or inccB RND family efflux transporter MFP subunit 69931 71175 1 1245 P2 or noG acriflavin resistance protein 71172 74234 3 3063 P2 or GCN5-related N-acetyltransferase 74.551 74243 -2 309 P2 or yagA SHne2, transposase 74674 75075 1 402 P2 or putative periplasmic protein 75622 76551 1 930 P2 or GCN5-related N-acetyltransferase 78513 78055 -1 459 P2 or mannitol transporter 78868 79398 1 531 P2 or DctMS 79403 80728 2 1326 P2 or dictP putative extracellular solute-binding protein, 808O1 81931 2 1131 amily 7 P2 or menE 81954 83537 3 1584 P2 or enoyl CoA hydratase 83534 84376 2 843 P2 or marA Multiple resistance protein marA 84846 84418 -1 429 P2 or conserved hypothetical protein 85298 84885 -3 414 P2 or fct TonB-dependent siderophore receptor 85.472 87691 2 2220 P2 or fes esterase 87688 89 295 1 1608 P2.Or ytnP metallo-beta-lactamase 90248 89301 -3 948 P2 or nahR Transcriptional regulator, LysR family 90263 91285 2 1023 P2 or putative GMP synthase glutamine 91310 91993 2 684 hydrolyzing P2 or conserved hypothetical protein 928.30 92006 -2 825 P2 or transcriptional regulator, GintR family 93.052 93756 705 P2 or dap A 
Dihydrodipicolinate synthase 93875 94.837 2 963 P2 or hypothetical protein 94888 95079 192 P2 or artI extracellular solute-binding protein 95092 95901 810 P2 or yecS ABC transporter, membrane spanning protein 95.990 96655 2 666 (amino acid) P2 or gltK putative amino acid ABC transporter 966.67 97320 654 permease protein P2 or glnO putative amino acid ABC transporter ATP 97.313 2 723 binding protein P2 or dadA D-amino-acid dehydrogenase 98O3S 99.276 1242 P2 or proline racemase 99.290 OO291 2 10O2 P2 or Tripartite ATP-independent periplasmic OO453 OO992 S4O P2 or sia.T TRAP dicarboxylate transporter, DctM subunit OO994 O2262 2 1269 P2 or TRAP-type large permease component O2297 O3304 3 1008 P2 or Sugar ABC transporter OS174 O343S 1740 P2 or conserved hypothetical protein O5572 O5285 -3 288 P2 or yurM binding-protein-dependent transport systems O6468 O5587 -2 882 inner membrane component P2 or O1 Sugar ABC transporter O7336 O6470 -3 867 P2 or O2 ABC transporter related protein O8411 O7341 -1 1071 P2 or O3 Smok ABC transporter related protein O9522 O8431 -2 1092 P2 or O4 glpD FAD dependent oxidoreductase 11127 O9586 -2 1542 P2 or 05 conserved hypothetical protein 12676 11435 -3 1242 P2 or O6 hypothetical protein 13264 13103 -3 162 P2 or O7 cydB cytochrome dubiquinol oxidase, Subunit II 14258 13248 -1 1011 P2 or O8 cydA cytochrome bcd ubiquinol oxidase, Subunit I 15738 14263 -2 1476 P2 or 09 lcfB AMP-dependent synthetase and ligase 16321 17895 2 1575 P2 or 10 putative extracellular solute-binding protein 17996 19045 3 1OSO P2 or 11 conserved hypothetical protein 190SO 19565 1 S16 P2 or 12 sia.T TRAP-type C4-dicarboxylate transport system, 19562 2O851 3 1290 large permease component P2 or 13 todF alpha/beta hydrolase fold protein 20931 21773 1 843 P2 or 14 conserved hypothetical protein 21793 221.13 2 321 P2 or 15 mhpB protocatechuate 4,5-dioxygenase subunit beta 22110 22937 1 828 P2 or 16 bedB CinA3 22951 23274 2 324 P2 or 17 hcaE Aromatic-ring-hydroxylating 
dioxygenase, 23351 24739 3 1389 alpha subunit-like protein US 2014/0296.161 A1 Oct. 2, 2014 32


LOCS Gene Product Start End Strand Length P2 or 18 3-phenylpropionate dioxygenase subunit beta 24751 25293 2 543 P2 or 19 bphB 2,3-dihydroxy-2,3-dihydrophenylpropionate 25327 26139 2 813 dehydrogenase P2 or 2O 2-nitropropane dioxygenase, NPD 26169 27.191 1 1023 P2 or 21 tod 2-oxopent-4-enoate hydratase 27188 27979 3 792 P2 or 22 acetaldehyde dehydrogenase 27985 28923 2 939 P2 or 23 4-hydroxy-2-oxovalerate aldolase 28927 29985 2 1059 P2 or 24 tetR regulatory protein TetR 30566 29976 -1 591 P2 or 25 thcD ferredoxin--NAD+ reductase 3O857 32035 3 1179 P2 or 26 conserved hypothetical protein 32167 3.2940 2 774 P2 or 27 diguanylate cyclase 33040 34215 2 1176 P2 or 28 yhiN conserved hypothetical protein 35416 34226 -3 1.191 P2 or 29 gacS multi-sensor hybrid histidine kinase 35604 39797 1 4194 P2 or 30 isopenicillin N Synthase 4O642 398O3 -3 840 P2 or 31 hypothetical protein 41315 42607 3 1293 P2 or 32 hypothetical protein 42771 42646 -2 126 P2 or 33 transposase, IS4 43O23 42853 -2 171 P2 or 34 transposase 43487 43020 -1 468 P2 or 35 peptidase M48 Ste24p 45079 4.4252 -3 828 P2 or 36 rhtB RhtB family transporter 45828 45223 -2 606 P2 or 37 ASN1 putative asparagine synthase (glutamine 47157 45841 -2 1317 hydrolyzing) P2 or 38 ubiF Generic methyltransferase 48478 47132 -3 1347 P2 or 39 transcriptional regulator, LysR family 49523 48594 -1 930 P2 or 40 ATP-binding protein SO384 49590 -1 795 P2 or 41 dap A Dihydrodipicolinate synthase 51442 SO381 -3 1062 P2 or 42 conserved hypothetical protein 52393 51569 -3 825 P2 or 43 conserved hypothetical protein 53423 S238O -1 1044 P2 or 44 putative ABC transporter permease protein 54241 S342O -3 822 P2 or 45 mtrR transcriptional regulatory protein 55062 54388 -2 675 P2 or 46 Pyrimidine precursor biosynthesis enzyme 55238 S6239 3 10O2 P2.Or 47 putative transmembrane protein 56889 56278 -2 612 P2 or 48 transcriptional regulator, TetR family 57618 57037 -2 582 P2 or 49 Pirin-related protein 57819 58520 1 702 P2 or 50 dihydroxy-acid 
dehydratase 58781 60622 3 1842 P2 or 51 NMT1/THI5-like domain-containing protein 60851 61813 3 963 P2 or 52 ABC transporter, membrane spanning protein 61910 62695 3 786 P2 or 53 ABC transporter, nucleotide binding ATPase 62692 63480 2 789 protein P2 or S4 din addiction module antitoxin 63579 63881 1 303 P2 or 55 addiction module toxin, RelE/StbE family 63878 6416S 3 288 P2 or 56 ydaP thiamine pyrophosphate protein domain 65971 64190 -3 1782 protein TPP-binding P2 or 57 putative beta hydroxylase 661.36 67419 2 1284 P2 or 58 Superoxide dismutase protein 67454 67984 3 531 P2 or 59 yid J arylsulfatase A and related enzyme 68829 70619 1 1791 P2 or 60 pcaC carboxymuconolactone decarboxylase 71893 70712 -3 1182 P2 or 61 transcriptional regulator, LysR family 72948 72043 -2 906 P2 or 62 zinc-containing alcohol dehydrogenase 73105 74121 2 1017 Superfamily protein P2 or 63 fabG short chain dehydrogenase 74920 74168 -3 753 P2 or 64 ampR MPR RHOCA RecName: Full = HTH-type 76285 75293 -3 993 transcriptional activator AmpR emb P2 or 65 LAC RHOCA RecName: Full = Beta 76386 77288 1 903 actamase; AltName: Full = Penicillinase: Flags: Precursor emb P2 or 66 TetR family transcriptional regulator 77876 77292 -1 585 P2 or 67 3-oxoacyl-(acyl-carrier-protein) reductase 77978 78733 3 756 P2 or 68 putative transcriptional regulator, AbrB family 79015 78761 -3 255 P2 or 69 DNA-binding protein 793.85 79083 -1 303 P2 or 70 conserved hypothetical protein 80464 795.62 -3 903 P2 or 71 mingR Mannosyl-D-glycerate transport/metabolism 81469 80681 -3 789 system repressor mingR P2 or 72 cat2 putative 4-hydroxybutyrate coenzyme A 82704 81442 -2 1263 transferase P2 or 73 MmgE/PrpD family protein 82794 841S2 1 1359 P2 or 74 paak putative aerobic phenylacetate-CoA ligase 842O8 8562O 2 1413 P2 or 75 gatA glutamyl-tRNA (Gln) amidotransferase, A 85687 871.56 2 1470 Subunit P2 or 76 ami) Putative amidase 88.946 87531 -1 1416 P2 or 77 NMT1/THIS like domain protein 892O3 90225 2 1023 P2 or 78 short chain 
dehydrogenase 90353 91204 3 852 P2 or 79 hisC aspartate aminotransferase 912O1 92.334 2 1134 P2 or 8O ytXM abhydrolase 1 92352 93.233 1 882 US 2014/0296.161 A1 Oct. 2, 2014 33


LOCS Gene Product Start End Strand Length P2 or 81 allC allantoate amidohydrolase 193332 1946OO 3 269 P2 or 82 ytbD major facilitator Superfamily MFS 1 195827 194634 -3 194 P2 or 83 hypothetical protein 196724 197392 2 669 P2 or 84 hypothetical protein 1979OO 1975.95 -3 306 P2 or 85 SlmA Transcriptional regulator 198971 1981.89 -3 783 P2 or 86 branched-chain amino acid ABC Superfamily 200283 199069 -1 215 ATP binding cassette transporter, binding protein P2 or 87 pyruvate carboxylase 2O3732 -3 3342 P2 or 88 caiC AMP-dependent synthetase and ligase 2O5403 -3 668 P2 or 89 ycdT diguanylate cyclase 205708 1 617 P2 or 90 2-oxoisovalerate dehydrogenase (alpha 2O7731 2 233 Subunit) P2 or 91 2-oxoisovalerate dehydrogenase (beta 2O8968 209981 3 O14 Subunit) P2 or 92 branched-chain alpha-keto acid 20998S 211337 3 353 dehydrogenase subunit E2 P2 or 93 lpdV dihydrolipoamide dehydrogenase 211342 212739 1 398 P2 or 94 ior A isoquinoline 1-oxidoreductase subunit alpha 213057 213515 3 459 P2 or 95 iorB isoquinoline 1-oxidoreductase, beta Subunit 213533 215710 2 2178 P2 or 96 QbsM like protein 215917 217O68 1 152 P2 or 97 TetR family transcriptional regulator 217122 217727 3 606 P2 or 98 rutE3 putative hydrolase 218483 217788 -3 696 P2 or 99 ytlA nitratef sulfonate?taurine bicarbonate ABC 218671 219723 1 O53 Superfamily ATP binding cassette transporter, binding protein P2 or aurine ABC Superfamily ATP binding cassette 219761 22O600 2 840 transporter, ABC protein P2 or putative ABC transport proteins, inner 22O590 221444 3 855 membrane component P2 or putative membrane protein, DUF81 221536 222270 1 735 P2.Or gloA glyoxalase? 
bleomycin resistance 222718 222281 -2 438 protein dioxygenase P2 or 204 DoxX family protein 2231.98 2228.09 -2 390 P2 or 205 TetR family transcriptional regulator 22387O 223322 -2 549 P2 or 2O6 alcohol dehydrogenase 223948 224979 1 1032 P2 or 2O7 FAD-binding 9 siderophore-interacting domain 225757 224999 -2 759 protein P2 or 208 yeaM transcriptional regulatory protein 226721 225882 -3 840 P2 or 209 cyaA erredoxin 228506 226734 -3 1773 P2 or 210 inte1 transcriptional regulator, Crp, Fnr family 2286.13 229.329 717 P2 or 211 patatin 23O390 229314 -3 1077 P2 or 212 hypothetical protein 23O826 23O365 462 P2 or 213 hypothetical protein 231299 23O823 -3 477 P2 or 214 oruR PA2556 232461 231394 1068 P2 or 215 ycfCR TetR family transcriptional regulator 233166 232S61 606 P2 or 216 yig putative short-chain 233301 234O71 3 771 dehydrogenase/oxidoreductase P2 or 217 conserved hypothetical protein 235O12 234062 -2 951 P2 or 218 fhuF erric iron reductase 23S830 235087 744 P2 or 219 fhuB transporter 237833 235833 -3 2001 P2 or 220 fhuC ABC-type hydroxamate-dependent iron 238.609 237830 -2 780 transport system, ATPase component P2 or 221 fhuD periplasmic component, ABC-type 239544 238606 939 hydroxamate-dependent iron transport system P2 or 222 fct TonB-dependent siderophore receptor 241934 239553 -3 2382 P2 or 223 ompR two component transcriptional regulator, 2428O8 242O68 741 winged helix family P2 or 224 MltA-interacting MipA family protein 243592 242822 -2 771 P2 or 225 envz. two component sensor kinase 243801 245,078 3 1278 P2 or 226 putative AraC family transcriptional regulator 246O18 245083 936 P2 or 227 NAD-dependent epimerase? 
dehydratase 246.162 247058 3 897 P2 or 228 ydbC putative oxidoreductase 247979 247098 -3 882 P2 or 229 sensor histidine kinase 251372 248049 -3 3324 P2 or 230 response regulator receiver protein 252450 251737 714 P2 or 231 transcriptional regulatoriantitoxin, MazE 252681 252908 3 228 P2 or 232 conserved hypothetical protein 254762 252999 -3 1764 P2 or 233 transcriptional regulatory protein 255618 2S4920 699 P2 or 234 fadk putative acid-CoA ligase 257373 255661 1713 P2 or 235 IvE ABC branched chain amino acid transporter, 258272 257553 -3 720 ATPase subunit P2 or 236 Iivo ABC branched chain amino acid transporter, 259047 258265 783 ATPase subunit US 2014/0296.161 A1 Oct. 2, 2014 34


LOCS Gene Product Start End Strand Length P2 or 237 ABC branched chain amino acid transporter, 26OO72 259044 -3 1029 inner membrane subunit P2 or 238 bra) ABC branched chain amino acid transporter, 260962 -2 894 inner membrane subunit P2 or 239 ABC branched chain amino acid transporter, 262424 26113S -3 1290 periplasmic ligandbinding protein P2 or 240 lcfB AMP-dependent synthetase and ligase 264169 262496 -2 1674 P2 or 241 cit enoyl-CoA hydrataseisomerase 265037 264.222 -3 816 P2 or 242 ying acyl-CoA dehydrogenase domain-containing 266230 26S109 -2 1122 protein P2 or 243 IVD acyl-CoA dehydrogenase domain-containing 267644 26.6403 -3 1242 protein P2 or 244 conserved hypothetical protein 268137 267889 -1 249 P2 or 245 old putative ATP-dependent endonuclease of the 268626 270371 3 1746 OLD family protein P2 or 246 conserved hypothetical protein 271176 270910 -1 267 P2 or 247 addiction module antitoxin 270868 27O617 -2 252 P2 or 248 hypothetical protein 271312 271533 1 222 P2 or 249 short-chain dehydrogenase/reductase SDR 2725.18 271616 -2 903 P2 or 250 clR family transcriptional regulator 273492 27 2716 -1 777 P2 or 251 TRAP-T family transporter, DctM (12 TMs) 2748.23 273498 -3 326 Subunit P2 or 252 hypothetical protein 27S340 274.813 -1 528 P2 or 253 conserved hypothetical protein 276464 275328 -3 137 P2 or 2S4 conserved hypothetical protein 276581 277462 2 882 P2 or 255 aSoaritate racenase 277466 278.194 2 729 P2 or 2S6 OsmO family protein 278769 279.194 3 426 P2 or 257 methyl-accepting chemotaxis protein 28.0931 279228 -3 704 P2 or 258 conserved hypothetical protein 281540 28.1079 -3 462 P2 or 259 citE citrate lyase beta chain 28.2469 281537 -2 933 P2 or 260 pcaR transcriptional regulator 283242 282466 -1 777 P2.Or 261 acyltransferase 3 283484 28468O 2 197 P2 or 262 indoleacetamide hydrolase 286128 284692 -1 437 P2 or 263 paiB putative regulatory protein 286853 286.185 -3 669 P2 or 264 86S putative acetyl esterase 2878.33 286868 -2 966 P2 or 26S gsiA putative ABC 
transporter ATP-binding protein 289497 287845 -1 653 P2 or 266 gsiC binding-protein-dependent transport systems 291379 29O399 -2 981 inner membrane component P2 or 267 appC binding-protein-dependent transport systems 290351 2895OO -3 852 inner membrane component P2 or 268 app A twin-arginine translocation pathway signal 292974 291.382 -1 1593 P2 or 269 ghrA putative 2-hydroxyacid dehydrogenase family 29.4110 2931.78 -3 933 protein. P2 or 270 yeaM AraC family transcriptional regulator 294233 295117 2 885 P2 or 271 putative transcriptional regulators containing 295330 295127 -2 204 he CopG/Arc? MetJ DNA-binding domain P2 or 272 PRC-barrel domain-containing protein 295879 295.508 -2 372 P2 or 273 etra transcriptional regulator, Crp, Fnr family protein 2961.21 296909 3 789 P2 or 274 conserved hypothetical protein 297O16 298.263 1 248 P2 or 275 conserved hypothetical protein 298.260 298787 3 528 P2 or 276 DNA ligase III-like protein 298784 299476 2 693 P2 or 277 HwnChalovibrin 29969S 3OO918 1 224 P2 or 278 conserved hypothetical protein 3O1464 3O1009 -1 456 P2 or 279 transposase, is4 family 3O1727 3O2773 2 O47 P2 or 28O hioesterase-like protein 3O3254 3O2817 -3 438 P2 or 281 major facilitator Superfamily MFS 1 3O448S 3O3271 -1 215 P2 or 282 conserved hypothetical protein 3OS396 3064.09 2 O14 P2 or 283 conserved hypothetical protein 3O64O6 306660 1 255 P2 or 284 SerS RNA synthetase class II (G H P and S) 306703 307707 1 005 P2 or 285 yrpE 2-nitropropane dioxygenase NPD 3O7806 3O8786 3 981 P2 or 286 ASMT O-methyltransferase family 2 3O881O 309811 2 OO2 P2 or 287 aldo/keto reductase family oxidoreductase 310797 309829 -1 969 P2 or 288 short-chain dehydrogenase/reductase SDR 311726 310845 -3 882 P2 or 289 sia.T TRAP transporter, DctM subunit 313045 31.1756 -2 290 P2 or 290 hypothetical protein 313660 313 058 -2 603 P2 or 291 dictP C4-dicarboxylate-binding periplasmic protein 3.14806 313769 -2 O38 P2 or 292 conserved hypothetical protein 317076 314899 -1 2178 P2 or 293 
acyl-CoA dehydrogenase domain-containing 318311 31728O -3 O32 protein P2 or 294 IVD1 acyl-CoA dehydrogenase domain protein 3.19462 3183O8 -2 155 P2 or 295 iclR Transcriptional regulator kdgR 319747 320523 1 777 P2 or 296 paaF putative enoyl-CoA hydratase 32O572 321369 1 798 P2 or 297 transcriptional regulator 321550 321849 1 300 P2 or 298 plasmid stabilization system protein 321846 322172 3 327 US 2014/0296.161 A1 Oct. 2, 2014 35


LOCS Gene Product Start End Strand Length P2 or 299 ImrE3 Lincomycin resistanc e protein 32346S 322224 -3 1242 P2 or 3OO hypothetical protein 326989 323492 -2 3498 P2 or 301 alss acetolactate synthase 32.7307 328953 1647 P2 or 3O2 rfbU glycosyltransferase group 1 330096 329038 1059 P2 or 303 div histidine kinase response regulator hybrid 331511 330273 -3 1239 protein P2 or 3O4 hypothetical protein 332022 331756 267 P2 or 305 hypothetical protein 331765 33 1646 -2 120 P2 or 306 polC polymerase epsilon Subunit 333137 332193 -3 945 P2 or 307 conserved hypothetical protein 3340O8 33.31.51 858 P2 or 3O8 putative iron reductase 3341.99 334945 2 747 P2 or 309 RNA polymerase ECF-Subfamily sigma-70 334903 335475 573 actor P2 or 310 fecR transmembrane sensor 335636 336613 2 978 P2 or 311 acetyltransferase 336764 337423 2 660 P2 or 312 aatB aspartate aminotransferase 338676 337426 1251 P2 or 313 NAD-dependent aldehyde dehydrogenase 34.0317 338788 1530 P2 or 314 gcVA transcriptional regulator 341860 342789 930 P2 or 315 conserved hypothetical protein 341.730 34O687 104.4 P2 or 316 SpoOM family protei 343262 343699 2 438 P2 or 317 pca 3-oxoacid CoA-transferase, A Subunit 343828 344538 711 P2 or 318 hypothetical protein 343269 343153 117 P2 or 319 pca 3-oxoadipate CoA-transferase 34453S 345218 3 684 P2 or 32O mbtEH MbtH-like protein 345567 345812 3 246 P2 or 321 putative SyrP-like regulatory protein 346227 346856 3 630 P2 or 322 putative non-ribosom all peptide synthetase 346919 348349 2 1431 P2 or 323 lgrC amino acid adenylation domain protein 348429 367838 3 1941O P2 or 324 lgrC Pvd 367789 369903 1 2115 P2 or 325 lgrD GRC BREPA RecName: Full = Linear 369900 381584 3 11685 gramicidin synthase subunit C: Includes: RecName: Full = ATP-dependent valine adenylase; Short = ValA; AltName: Full = Valine activase: Includes: RecName: Full = ATP dependent D-valine adenylase; Short = D-ValA: AltName: Full = D-valine P2 or 326 iucB putative acetylase 381581 382612 2 1032 P2 or 327 pvdA 
L-ornithine 5-monooxygenase 382609 383940 1 1332 P2 or 328 fhuF ferric iron reductase protein 38.3944 384714 1 771 P2 or 329 hypothetical protein 384,753 385127 3 375 P2 or 330 pvdO acyl-homoserine lactone acylase 385517 387O64 2 1548 P2 or 331 fct ferrichrome-iron transporter 387288 38.9717 3 2430 P2 or 332 fhuC ABC transporter related protein 389797 390591 1 795 P2 or 333 fhuD periplasmic binding protein 3906.13 391.512 1 900 P2 or 334 fhuB iron-hydroxamate transporter permease 391.509 393557 3 2049 Subunit P2 or 335 pyoverdine ABC export system, 393612 395.282 3 1671 permease/ATP-binding protein P2 or 336 nitrile hydratase-like protein 395682 395365 -1 318 P2 or 337 conserved hypothetical protein 396O44 395733 -3 312 P2 or 338 nitrile hydratase 396434 3961OS -3 330 P2 or 339 conserved hypothetical protein 396919 396644 -2 276 P2 or 340 glutathione-dependent formaldehyde 3973.72 397764 1 393 activating GFA P2 or 341 aceE Pyruvate dehydrogenase (acetyl-transferring) 40O2.19 397808 -2 2412 P2 or 342 transcriptional regula O 40O360 40O839 1 480 P2 or 343 conserved hypothetical protein 402074 4O1058 -3 O17 P2 or 344 StB high-affinity phospha e ABC transporter ATP 403461 402658 -1 804 binding protein P2 or 345 bstA phosphate ABC trans porter permease protein 404321 403458 -3 864 P2 or 346 StC phosphate ABC trans porter, inner membrane 4OS346 404336 -2 O11 subunit PstC P2 or 347 StS phosphate ABC trans porter, periplasmic 405432 -3 005 phosphate-binding protein P2 or 348 PEPM phosphoenolpyruvate phosphomutase 4O7744 -3 915 P2 or 349 aspC aminotransferase class I and II 408984 -1 2O3 P2 or 350 ABC Superfamily AT P binding cassette 41OO75 -3 999 transporter, binding protein P2 or 351 inner membrane com ponent of ABC 411775 410132 -2 644 transporter P2 or 352 ABC transporter rela ed protein 412855 411779 -2 O77 P2 or 353 transcriptional regula or, LysR family 413003 41.3887 2 885 P2 or 3S4 class II aldolase/adducin family protein 414043 414819 777 P2 or 355 
4-oxalocrotonate tautomerase 4153O2 414910 393 P2 or 356 LysR family transcriptional regulator 415370 416272 903 US 2014/0296.161 A1 Oct. 2, 2014 36


LOCS Gene Product Start End Strand Length P2 or 357 fecR putative FecR 417310 416303 -2 O08 P2 or 358 fec RNA polymerase sigma-70 family protein 417962 417426 -3 537 P2 or 359 Sphingosine kinase and enzymes related to 419046 418102 945 eukaryotic diacylglycerol kinase P2 or 360 conserved hypothetical protein 421.188 41936S 824 P2 or 361 acyl-CoA dehydrogenase domain protein 421331 422548 2 218 P2 or 362 Putative HPrkinase/phosphorylase 422644 423597 954 P2 or 363 sia.T TRAP-T family protein transporter, DctM (12 424975 423671 -2 305 TMs) subunit P2 or 364 conserved hypothetical protein 424972 549 P2 or 365 Bacterial extracellular solute-binding protein, 425599 173 family 7 P2 or 366 lcfB AMP-dependent synthetase and ligase 428473 426866 -2 608 P2 or 367 paaG enoyl-CoA hydrataseisomerase family protein 429389 428.544 -3 846 P2 or 368 acyl-CoA dehydrogenase-like 430S63 429403 161 P2 or 369 GntR family transcriptional regulator 43O895 431545 2 651 P2 or 370 pleC two-component hybrid sensor and regulator 431729 433306 2 578 P2 or 371 degC transcriptional regulator 433932 433294 639 P2 or 372 ugpC lactose transport ATP-binding protein LacK 434169 435335 3 167 P2 or 373 binding-protein-dependent transport systems 435328 436269 942 inner membrane component P2 or 374 yeSQ binding-protein-dependent transport systems 436266 437108 3 843 inner membrane component P2 or 375 Sugar binding protein of ABC transporter 437160 438434 3 1275 P2 or 376 icc conserved hypothetical protein 438434 439282 2 849 P2 or 377 conserved hypothetical protein 439858 439S41 -2 3.18 P2 or 378 conserved hypothetical protein 44.1210 43987O -1 1341 P2 or 379 dppD ABC transporter ATP-binding protein 44.2939 44.1290 -2 16SO P2 or 380 inner membrane component of binding 4438O1 44.2944 -3 858 protein-dependent transport system P2 or 381 inner membrane component of binding 444744 443806 -1 939 protein-dependent transport System P2 or 382 extracellular solute-binding protein 446347 444758 -2 1590 P2 or 383 
ybhD HTH-type transcriptional regulator 446504 447463 2 960 P2 or 384 acy I gamma-glutamyltranspeptidase 4476O1 449319 1 1719 P2 or 385 putative amidotransferase 449371 4SO792 1 1422 P2 or 386 arylmalonate decarboxylase 451589 450840 -3 750 P2 or 387 citE HpcHHpaI aldolase 451750 452613 1 864 P2 or 388 maoC MaoC-like dehydratase 452668 453186 1 519 P2 or 389 conserved hypothetical protein 453403 453849 1 447 P2 or 390 hypothetical protein 453982 454.575 1 594 P2 or 391 acyl-CoA dehydrogenase domain protein 455977 4S4583 -2 1395 P2 or 392 yajO aldo/keto reductase 456051 457085 3 1035 P2 or 393 GNAT family acetyltransferase 4576.00 457082 -2 519 P2 or 394 nitrile hydratase beta-like protein 458,132 457743 -3 390 P2 or 395 nth A alpha subunit of nitrile hydratase 458767 458,156 -2 612 P2 or 396 nitrile hydratase beta subunit 459468 458782 -1 687 P2 or 397 conserved hypothetical protein 46O174 459506 -2 669 P2 or 398 amiC amino acid ABC transporter 46O454 461557 2 104 P2 or 399 MarR family transcriptional regulator 461618 462O76 2 459 P2 or 400 conserved hypothetical protein 462391 462104 -2 288 P2 or 4O1 acrR TetR family transcriptional regulator 463O14 462388 -1 627 P2 or 4O2 Phytoene dehydrogenase and related protein 4646.30 463O11 -3 62O P2 or 403 aldehyde dehydrogenase 464849 466.288 2 440 P2 or 404 Amidase 4.67917 466403 -2 515 P2 or 40S IvE ABC Superfamily ATP binding cassette 468812 468099 -3 714 transporter, ABC protein P2 or 4O6 braF urea or short-chain amide ABC transporter 469557 468814 -1 744 P2 or 407 urea or short-chain amide ABC transporter 470626 469559 -2 O68 P2 or 4.08 IvE urea or short-chain amide ABC transporter 471496 47O630 -2 867 P2 or 409 amiC putative Leu/Ile? 
Val/Thr/Ala-binding protein 472766 471513 -3 1254
P2 or 410 amiC ABC transporter substrate-binding protein 474013 472976 -2 1038
P2 or 411 yhbI transcriptional regulatory protein 474233 474652 2 420
P2 or 412 gamma-aminobutyraldehyde dehydrogenase 476115 474658 -1 1458
P2 or 413 putative LysR-family regulatory protein 477038 476112 -3 927
P2 or 414 puuC aldehyde dehydrogenase protein 478605 477100 -1 1506
P2 or 415 conserved hypothetical protein 478912 478616 -2 297
P2 or 416 adh alcohol dehydrogenase protein 479205 480242 3 1038
P2 or 417 sdpR ArsR family transcriptional regulator 480297 480611 3 315
P2 or 418 conserved hypothetical protein 480608 480940 2 333
P2 or 419 ATP-dependent Clp protease proteolytic subunit 480987 481583 3 597
P2 or 420 yddV diguanylate cyclase 481691 483121 2 1431
P2 or 421 gabD Aldehyde dehydrogenase (NAD(+)) 483197 484633 2 1437


LOCS Gene Product Start End Strand Length P2 or 422 bphR transcriptional regulator protein 485453 484707 -3 747 P2 or 423 sia.T TRAP dicarboxylate transporter, DctO subunit 485577 486.194 3 618 P2 or 424 yiaN TRAP transporter, DctM subunit subfamily 486.191 487462 2 1272 P2 or 425 putative hydroxlacyl-CoA dehydrogenase 487462 488415 1 954 P2 or 426 trap dicarboxylate transporter-dctp Subunit 488412 4.88996 3 585 P2 or 427 trap dicarboxylate transporter-dctp Subunit 489025 489384 1 360 P2 or 428 catA oxidoreductase protein 4894.63 49.0338 1 876 P2 or 429 bphC glyoxalasebleomycin resistance 490335 491.192 3 858 protein dioxygenase P2 or 430 decarboxylase protein 491197 491901 1 705 P2 or 431 2-hydroxymuconic semialdehyde 491907 493.361 3 1455 dehydrogenase P2 or 432 hpcG 2-oxo-hepta-3-ene-1,7-dioic acid hydratase 493381 494163 1 783 P2 or 433 hpch HpcHHpaI aldolase 494,173 494.940 1 768 P2 or 434 enoyl-CoA hydrataseisomerase family protein 495939 496742 3 804 P2 or 435 phenylacetic acid degradation operon negative 4958.06 495.021 -3 786 regulatory protein P2 or 436 menE eruloyl-CoA synthetase 496838 4984.03 2 1566 P2 or 437 bphR GntR family transcriptional regulator 4994O6 498528 -3 879 P2 or 438 gentisate 1,2-dioxygenase 499618 500733 1 1116 P2 or 439 5-carboxymethyl-2-hydroxymuconate Delta SOO863 5O1768 1 906 isomerase P2 or 440 extracellular ligand-binding receptor 5O1956 SO3143 2 1188 P2 or 441 bra) ABC Superfamily ATP binding cassette 503251 SO4168 1 918 transporter, membrane protein P2 or 442 IiwV ABC Superfamily ATP binding cassette 505202 3 1023 transporter, permease protein P2 or 443 braF ABC transporter related protein 505195 505962 1 768 P2 or 444 IvE branched-chain amino acid ABC Superfamily 505962 506657 3 696 ATP binding cassette transporter, ABC protein P2 or 445 NIT4 nitrilase SO6670 507617 3 948 P2.Or 446 conserved hypothetical protein SO7630 508274 3 645 P2 or 447 putative esteraselipase 508271 509 167 2 897 P2 or 448 hypothetical protein SO9275 
509907 1 633 P2 or 449 transcriptional activator FtrB S106.30 SO9926 -1 705 P2 or 450 narK nitrite transporter 510877 513570 1 2694 P2 or 451 narG nitrate reductase, alpha Subunit 513598 517338 1 3741 P2 or 452 nary nitrate reductase, beta Subunit 517335 518873 3 1539 P2 or 453 nar nitrate reductase molybdenum S18878 S19603 1 726 assembly chaperone P2 or 454 nar respiratory nitrate reductase, gamma Subunit S19603 S2O349 3 747 P2 or 455 nifM PpiC-type peptidyl-prolyl cis-trans isomerase S2O362 521,252 3 891 P2 or 456 Hemerythrin HHE cation binding domain S21300 S21854 2 555 Subfamily P2 or 457 moe.A molybdopterin biosynthesis protein 521851 523119 269 P2 or 458 molybdenum ABC transporter ATP-binding 523116 S236O4 3 489 protein P2 or 459 moa-A molybdenum cofactor biosynthesis protein A S23641 524687 3 O47 P2 or 460 mog molybdopterin binding domain protein 524716 525.222 507 P2 or 461 NnrS family protein S2S243 526451 3 209 P2 or 462 Cellulose synthesis regulatory protein 526976 526461 -3 S16 P2 or 463 diguanylate cyclase 527952 526945 O08 P2 or 464 aapP General L-amino acid transport ATP-binding S28809 S28063 -3 747 protein P2 or 465 amino acid ABC transporter permease 531132 528817 2316 P2 or 466 aap general L-amino acid transport system 532175 531132 -3 O44 Substrate-binding protein P2 or 467 dap A dihydrodipicolinate synthase 533301 532,393 909 P2 or 468 glutathione S-transferase S34O2S 533423 -2 603 P2 or 469 dadA D-amino-acid dehydrogenase 535329 S34022 3O8 P2 or 470 ord Glycine, D-amino acid oxidase S36660 535332 -3 329 P2 or 471 OC transcriptional regulator, LysR family 537755 S36838 -3 918 P2 or 472 8. 
urea ABC transporter, urea binding protein 537992 539263 2 1272
P2 or 473 bra : urea ABC transporter, permease protein UrtB 539469 540398 3 930
P2 or 474 urea ABC transporter, permease protein UrtC 540403 541587 1 1185
P2 or 475 bra urea ABC transporter, ATP-binding protein UrtD 541602 542384 3 783
P2 or 476 IvE putative ABC transporter ATP-binding component 542458 543147 1 690
P2 or 477 fim formamidase 543426 544661 3 1236
P2 or 478 fim FmdB family regulatory protein 544730 545059 2 330
P2 or 479 conserved hypothetical protein 545144 545584 2 441
P2 or 480 conserved hypothetical protein 545695 546876 1 1182
P2 or 481 conserved hypothetical protein 546881 548323 2 1443


LOCS Gene Product Start End Strand Length P2 or 482 conserved hypothetical protein 5484.42 S49665 3 1224 P2 or 483 ABC transporter related protein S49662 551452 2 1791 P2 or 484 transcriptional regulator, MarR family 551531 5521.30 2 600 P2 or 485 dipZ. hiol-disulfide isomerase and thioredoxins 55218O 55.2671 3 492 P2 or 486 SuhR Patatin S54481 55.2685 -1 1797 P2 or 487 gcVA transcriptional regulator 555545 554538 -3 1008 P2 or 488 glyoxalasebleomycin resistance 555678 SS6124 3 447 protein dioxygenase Superfamily protein P2 or 489 cya calcium binding hemolysin protein 5563O2 SS8818 3 2517 P2 or 490 yodM phosphoesterase PA-phosphatase related S594.94 SS8808 -1 687 protein P2 or 491 dppC binding-protein-dependent transport systems 560476 559592 -2 885 inner membrane component P2 or 492 putative oligopeptide ABC transporter S61429 560476 -1 954 (permease) P2 or 493 gsiB extracellular solute-binding protein family 5 S63119 561512 -2 1608 P2 or 494 ykflD oligopeptide? dipeptide ABC transporter, ATP S64228 S63212 -1 1017 binding protein-like P2 or 495 oligopeptide? 
dipeptide ABC transporter, ATP 565220 S64225 -3 996 binding protein-like P2 or 496 caiC AMP-dependent synthetase and ligase 566971 565217 -2 1755 P2 or 497 conserved hypothetical protein 567938 567177 -3 762 P2 or 498 B3f4 domain protein 568O82 568798 2 717 P2 or 499 hypothetical protein 569 114 569269 2 156 P2 or 500 conserved hypothetical protein 569625 569290 -1 336 P2 or 5O1 hypothetical protein S69983 S69669 -2 315 P2 or 502 phytanoyl-CoA dioxygenase (PhyH) family S70940 570065 -2 876 protein P2 or 503 oruR AraC family transcriptional regulator S71040 57208O 2 1041 P2 or SO4 glpR glycerol-3-phosphate regulon repressor 572197 572,976 1 780 P2 or 505 mhpD umarylacetoacetate (FAA) hydrolase 5738O1 572995 -1 807 P2.Or 506 opdE major facilitator transporter 575026 573917 -2 1110 P2 or 507 major facilitator Superfamily MFS 1 575433 575083 -1 351 P2 or SO8 regulatory protein 575710 576.603 1 894 P2 or 509 transcriptional regulator 5771.16 576613 -1 SO4 P2 or 510 short chain dehydrogenase/reductase family 577227 577991 3 765 oxidoreductase P2 or 511 2-hydroxychromene-2-carboxylate isomerase S78006 S78614 2 609 P2 or 512 conserved hypothetical protein S7881O 578622 -3 189 P2 or 513 AraC family transcriptional regulator 579873 5788.93 -1 981 P2 or S1.4 conserved hypothetical protein S80238 579909 -3 330 P2 or 515 dke1 acetylacetone-cleaving enzyme 580708 58O235 -2 474 P2 or S16 Glo1 Lactoylglutathione lyase S80954 S81478 1 525 P2 or 517 GntR family transcriptional regulator S81586 S82263 3 678 P2 or 518 coaA pantothenate kinase 582287 S83222 2 936 P2 or 519 alkJ GMC family oxidoreductase 583334 S84968 2 1635 P2 or 520 LysR family transcriptional regulator S85483 S84983 -1 5O1 P2 or 521 yha LysR family transcriptional regulator S85890 S85480 -3 411 P2 or 522 TrpR binding protein WrbA 585995 S86621 2 627 P2 or 523 TPR domain-containing protein S88934 S87846 -2 1089 P2 or 524 caiC AMP-dependent synthetase and ligase 5906.79 S89033 -1 1647 P2 or 525 putative monooxygenase 
592736 590.775 -3 1962 P2 or 526 Beta-lactamase S93O31 S94248 3 1218 P2 or 527 TRAP-T family transporter, DctP (periplasmic S943SO S95498 2 1149 binding) Subunit P2 or 528 TRAP-T family transporter, DctO (4 TMs) 595.572 596105 3 534 Subunit P2 or 529 sia.T TRAP-T family transporter, DctM (12 TMs) 596,102 S974.09 2 1308 Subunit P2 or 530 regulatory protein, TetR S97461 598.171 2 711 P2 or 531 putative transcriptional regulators 598216 S98461 1 246 P2 or 532 ybiO mscS family protein ybiO 600759 5.98477 -1 2283 P2 or 533 Beta-lactamase 600931 6022OS 1 1275 P2 or 534 putative Peroxiredoxin 6O2334 6O2972 3 639 P2 or 535 conserved hypothetical protein 603028 603453 1 426 P2 or 536 conserved hypothetical protein 603785 603438 -3 348 P2 or 537 rob putative transcriptional regulator protein, AraC 6O4467 605.375 3 909 family P2 or 538 yebQ major facilitator Superfamily MFS 1 6067O6 60S393 -2 1314 P2 or 539 glx A AraC family transcriptional regulator 608808 6O7756 -1 1053 P2 or S4O SoxB sarcosine oxidase beta Subunit family protein 608908 61O161 1 1254 P2 or S41 SoxD sarcosine oxidase delta Subunit family protein 610172 610471 2 300 P2 or S42 SOXA sarcosine oxidase alpha Subunit family protein 610468 6.13470 1 3OO3 P2 or 543 SoxG sarcosine oxidase, gamma Subunit 613489 614040 1 552 US 2014/0296.161 A1 Oct. 2, 2014 39


LOCS Gene Product Start End Strand Length P2 or 544 hypothetical protein 614270 614O79 -3 192 P2 or 545 conserved hypothetical protein 614962 614510 -2 453 P2 or S46 leu) 3-isopropylmalate dehydratase, Small subunit 615722 615069 -3 654 P2 or 547 leuc adenosylcobinamide 617121 615727 -1 1395 kinasef adenosylcobinamide-phosphate guanylyltransferase P2 or S48 L-carnitine dehydratase/bile acid-inducible 618356 617118 -3 1239 protein F P2 or S49 GCDH glutaryl-CoA dehydrogenase 619608 618421 1188 P2 or 550 isochorismatase hydrolase 62O279 619656 -3 624 P2 or 551 hyuA hydantoin utilization protein A 6223.63 62O231 -2 21.33 P2 or 552 5-oxoprolinase (ATP-hydrolyzing) 624O71 622392 -3 1680 P2 or 553 GntR family transcriptional regulator 62SOO2 6241.57 846 P2 or 554 Conserved hypothetical protein 626913 62S399 1515 P2 or 555 conserved hypothetical protein 628844 627255 -3 1590 P2 or 556 glyoxalasebleomycin resistance 629129 6296OS 2 477 protein dioxygenase P2 or 557 sdpR Arsk family transcriptional regulator 6296O2 629943 342 P2 or 558 activator of HSP90 ATPase 1 family protein 629940 630437 3 498 P2 or 559 uhp A LuxR family transcriptional regulator 630987 630421 567 P2 or S60 acetyl-CoA acetyltransferase 631151 63.2326 2 1176 P2 or 561 menE putative long-chain-fatty-acid CoA ligase 632336 633.874 2 1539 P2 or S62 hutC transcriptional regulator, GintR family 634557 633829 729 P2 or 563 conserved hypothetical protein 634942 634649 -2 294 P2 or S64 coal polysaccharide deacetylase 63S146 636O12 867 P2 or 565 TRAP transporter, transmembrane protein 636092 636685 2 594 P2 or 566 Putative TRAP transporter large permease 636682 637965 1284 protein P2 or 567 yiaO putative exported protein (TRAP-type transport 638073 639077 3 1OOS system, periplasmic component) P2 or 568 IRG1 639094 64OSO3 1410 P2.Or 569 InmgC acyl-CoA dehydrogenase 64O689 64-1858 3 1170 P2 or 570 L-carnitine dehydratase/bile acid-inducible 64-1855 643117 2 1263 protein F P2 or 571 yfdE L-carnitine dehydratase/bile 
acid-inducible 643436 644209 2 774 protein F P2 or 572 pil J putative methyl accepting chemotaxis protein 644,419 646494 1 2O76 P2 or 573 conserved hypothetical protein 647362 646472 -2 891 P2 or 574 peptide deformylase 647567 6481.51 2 585 P2 or 575 short chain dehydrogenase 649O16 6481.83 -3 834 P2 or 576 ycaN LysR family transcriptional regulator 6491.14 6SOO16 1 903 P2 or 577 ycfCR TetR family transcriptional regulator 650709 6SOO32 -1 678 P2 or 578 yig short-chain dehydrogenase/reductase SDR 650818 651558 1 741 P2 or 579 hypothetical protein 652126 652260 1 135 P2 or S8O putative Ketopantoate reductase PanE/ApbA 652662 652390 -1 273 P2 or 581 leuA 2-isopropylmalate synthase 654186 652687 -1 1SOO P2 or 582 hypothetical protein 654661 654545 -2 117 P2 or 583 conserved hypothetical protein 655211 654687 -3 525 P2 or S84 pleC Sensor protein 655466 656.701 2 1236 P2 or 585 conserved hypothetical protein 657778 6568O1 -2 978 P2 or S86 sporulation initiation inhibitor protein So 658949 65998.9 2 1041 P2 or 587 chromosome partitioning protein ParB 659989 660891 903 P2 or 588 GCN5-related N-acetyltransferase 661556 661014 -3 543 P2 or 589 padR transcriptional regulator 662176 6627O6 531 P2 or 590 conserved hypothetical protein 663.239 662793 -3 447 P2 or 591 fadH NADH:flavin oxidoreductase,NADH oxidase 665393 663.282 -3 2112 P2 or 592 triacylglycerol lipase Superfamily protein 666816 66S485 1332 P2 or 593 glx A AraC family transcription regulator 667905 6669SS 951 P2 or 594 conserved hypothetical protein 668O34 668.333 3 300 P2 or 595 conserved hypothetical protein 668330 668.737 2 408 P2 or 596 regulatory protein TetR 66942S 668724 -3 702 P2 or 597 putative acyltransferase 669523 670626 1104 P2 or 598 putative outer membrane autotransporter 672954 67O636 2319 barrel P2 or 599 hypothetical protein 687767 686256 -3 1512 P2 or 600 Invasion protein B homolog 688434 68786S 570 P2 or 6O1 universal stress protein Uspa 689603 688716 -3 888 P2 or 602 conserved hypothetical 
protein 689981 689637 -3 345
P3.or 001 istA Integrase, catalytic region 1 1509 1509
P3.or 002 istB IstB domain protein ATP-binding protein 1496 2287 2 792
P3.or 003 CzcC family heavy metal RND efflux outer membrane protein 3597 2500 1098
P3.or 004 transposase, IS4 family protein 4330 4806 477
P3.or 005 alsR regulatory protein BphR 7279 6389 -2 891


LOCS Gene Product Start End Strand Length P3.or copK Copper resistance protein K 7696 7511 -2 186 P3.or copD putative copper resistance D transmembrane 88SO 7924 -1 927 protein P3.or copC copper resistance protein CopC 9241 8855 387 P3.or SIP heavy metal translocating P-type ATPase 11634 93.82 2253 P3.or conserved hypothetical protein 12523 12227 297 P3.or copB copper resistance protein B 13520 12750 771 P3.or pcoA CopA family copper resistance protein 15354 13756 599 P3.or cusR two component response regulator 15899 16603 705 P3.or cusS sensor histidine kinase 16857 18O11 155 P3.or Arsk family transcriptional regulator 18941 19288 348 P3.or yacK actoylglutathione lyase 193O2 19772 471 P3.or arsC arsenate reductase 19785 20282 498 P3.or putative sodium bile acid symporter family 2O293 21378 O86 protein P3.or arsenate reductase 21393 21815 423 P3.or transcriptional regulator, PadR family 21914 22216 303 P3.or chromate transporter 22213 234.42 230 P3.or yhbS acetyltransferase, gnat family 23448 23978 531 P3.or yubM ParB domain protein nuclease 25667 241.86 482 P3.or bepE transporter, HAE1 family 26377 25976 402 P3.or bepF multidrug efflux RND transporter, membrane 27438 26470 969 usion protein MexE P3.or relaxase 29882 27960 923 P3.or conserved hypothetical protein 301.23 30749 627 P3.or RES domain superfamily 30762 31394 633 P3.or conserved hypothetical protein 3.1479 31844 366 P3.or conserved hypothetical protein 33376 3.1856 521 P3.or conserved hypothetical protein 33749 33390 360 P3.or conserved hypothetical protein 3S140 33746 395 P3.or conserved hypothetical protein 36100 35150 951 P3.Or conserved hypothetical protein 36489 36097 393 P3.or DNA repair protein RadC 372O2 36708 495 P3.or disba oxidoreductase 37806 37378 429 P3.or conserved hypothetical protein 4106.1 3817O 2892 P3.or conserved hypothetical protein 41510 41061 450 P3.or conserved hypothetical protein 42909 41491 1419 P3.or putative Secreted protein 43810 42899 912 P3.or putative Secreted protein 
44499 43807 693 P3.or conserved hypothetical protein 44894 44496 399 P3.or conserved hypothetical protein 4526S 44906 360 P3.or conserved hypothetical protein 45515 45282 234 P3.or plasmid conserved hypothetical protein, 45895 45512 384 RAQPRD family P3.or excisionase/Xis, DNA-binding 46099 46569 471 P3.or conserved hypothetical protein 46638 47141 SO4 P3.or AAA ATPase, central region 47159 48073 915 P3.or conserved hypothetical protein 48537 49037 5O1 P3.or conserved hypothetical protein 49037 499.39 903 P3.or F440523 63 hypothetical protein SO110 50703 594 P3.or conserved hypothetical protein 50770 53337 2S68 P3.or conserved hypothetical protein S4127 53378 750 P3.or conserved hypothetical protein S6190 54124 2O67 P3.or conserved hypothetical protein 56638 S6321 3.18 P3.or lytic transglycosylase, catalytic 57435 56866 570 P3.or conserved hypothetical protein 58175 574.38 738 P3.or conserved hypothetical protein S8832 S8.188 645 P3.or conserved hypothetical protein S9428 S8829 600 P3.or helicase-like protein 61846 595.67 228O P3.or conserved hypothetical protein 62288 61983 306 P3.or conserved hypothetical protein 62698 62378 321 P3.or conserved hypothetical protein 63858 62749 1110 P3.or conserved hypothetical protein 64,570 63923 648 P3.or conserved hypothetical protein 64907 64647 261 P3.or conserved hypothetical protein 65331 64924 408 P3.or conserved hypothetical protein 66560 65871 690 P3.or conserved hypothetical protein 67482 666SS 828 P3.or conserved hypothetical protein 68533 67628 906 P3.or conserved hypothetical protein 691.28 68844 285 P3.or conserved hypothetical protein 69696 69436 261 P3.or Transposase Tn3 704O7 69766 642 P3.or merA mercuric reductase 72470 70782 1689 P3.or merP mercuric transport protein periplasmic protein 72786 72481 306 P3.or merT putative mercuric transport protein 73149 72799 351 US 2014/0296.161 A1 Oct. 2, 2014 41


LOCS Gene Product Start End Strand Length P3.or merR MerR family transcriptional regulator 73221 73.628 3 408 P3.or conserved hypothetical protein 74714 73890 -3 825 P3.or conserved hypothetical protein 75281 75003 -3 279 P3.or conserved hypothetical protein 76115 75378 -3 738 P3.or conserved hypothetical protein 76721 76329 -3 393 P3.or conserved hypothetical protein 77461 77294 -2 168 P3.or lsp A signal peptidase II 78322 77942 -2 381 P3.or cadA heavy metal translocating P-type ATPase 81247 78446 -2 28O2 P3.or ZntR MerR family transcriptional regulator 81.450 81848 3 399 P3.or sterol desaturase-like protein 82S12 823O3 -1 210 P3.or cation efflux protein 82924 83.559 1 636 P3.or topB DNA topoisomerase III 863 13 84.283 -1 2O31 P3.or ssb single-stranded DNA-binding protein 87037 8.6597 -2 441 P3.or conserved hypothetical protein 87638 87111 -3 528 P3.or conserved hypothetical protein 88426 87635 -2 792 P3.or conserved hypothetical protein 899.02 88.856 -2 1047 P3.or conserved hypothetical protein 90658 90098 -2 S61 P3.or conserved hypothetical protein 92352 90673 -1 1680 P3.or conserved hypothetical protein 92.614 92345 -2 270 P3.or cobyrinic acid a,c-diamide synthase 93473 92598 -3 876 P3.or phage-related protein 93728 93516 -3 213 P3.or conserved hypothetical protein 94593 93.847 -1 747 P3.or conserved hypothetical protein 9S428 95544 1 117 P3.or hypothetical protein 95.955 961.97 3 243 P3.or iap Alkaline phosphatase isozyme conversion 98224 96.218 -2 2007 protein P3.or Pyridoxamine 5'-phosphate oxidase-like, FMN 99.403 98414 -2 990 binding P3.or O2 trpB tryptophan synthase subunit beta OO4O6 OOO89 -3 3.18 P3.or O3 trpB tryptophan synthase, beta Subunit O1314 OO4O6 -2 909 P3.or O4 conserved hypothetical protein O2640 O1558 -2 1083 P3.Or 05 conserved hypothetical protein O32O1 O2659 -2 543 P3.or O6 rpoD RNA polymerase sigma factor O5716 O3623 -3 2094 P3.or O7 dnaG DNA primase O7778 OS883 -1 1896 P3.or O8 GatB/Yaey domain Superfamily O82S6 O7789 -2 468 P3.or 09 
car.A carbamoyl-phosphate synthase Small chain O8790 O9962 1 1173 P3.or 10 carE Carbamoylphosphate synthase large subunit O9955 132OO 2 3246 P3.or 11 grea transcription elongation factor GreA 13454 13912 3 459 P3.or 12 conserved hypothetical protein 15711 13996 -2 1716 P3.or 13 abrE3 membrane protein AbrB duplication 16368 15796 -2 573 P3.or 14 transcriptional regulator 17507 17001 507 P3.or 15 nucleoside-diphosphate-Sugar epimerase 17815 18876 2 1062 P3.or 16 trixB hioredoxin reductase (NADPH) 18974 19945 3 972 P3.or 17 yafC -sensitive transcriptional activator 2O106 20996 891 P3.or 18 diguanylate cyclase 21882 21061 -2 822 P3.or 19 conserved hypothetical protein 223O8 21898 -2 411 P3.or 2O conserved hypothetical protein 22847 22305 543 P3.or 21 Serine phosphatase RsbO, regulator of sigma 2456S 22895 -3 1671 Subunit P3.or 22 amino acid ABC transporter periplasmic 25412 246OO 813 protein P3.or 23 yedY Sulfoxide reductase catalytic subunityed Y 276.68 26661 1008 P3.or 24 aatA aspartate aminotransferase 29.076 2.7874 -2 1203 P3.or 25 uvrB excinuclease ABC subunit B 29371 31581 2 2211 P3.or 26 mdtA RND family efflux transporter MFP subunit 31.809 32912 1104 P3.or 27 AcrB/AcrD/AcrF family protein 32944 36144 2 32O1 P3.or 28 fabG 3-oxoacyl-acyl-carrier-protein reductase 36284 37027 3 744 P3.or 29 folB dihydroneopterin aldolase family protein 37128 37535 408 P3.or 30 uwrC Nuclease subunit of the excinuclease complex 37920 40.361 2442 P3.or 31 pgSA Phosphatidylglycerophosphate synthase 40437 4O988 552 P3.or 32 moaE molybdenum cofactor biosynthesis protein E 41003 41761 3 759 P3.or 33 Calcium-binding EF-hand protein 41891 42391 3 5O1 P3.or 34 sigW RNA polymerase sigma factor 42388 42996 2 609 P3.or 35 conserved hypothetical protein 42996 43559 S64 P3.or 36 conserved hypothetical protein 43556 43957 3 402 P3.or 37 azoB NmirA family protein 44927 44049 879 P3.or 38 transcriptional regulator 4SO36 45386 351 P3.or 39 ilw branched-chain amino acid aminotransferase 46392 45484 
-2 909 P3.or 40 petP transcriptional regulator, MarR family 46478 4.6960 3 483 P3.or 41 petR Response regulator receiver: Transcriptional 4701.7 47763 2 747 regulatory protein, C-terminal P3.or 42 envz. Signal transduction histidine kinase 47784 491.45 1 1362 P3.or 43 hypothetical protein 49529 49149 -1 381 P3.or 44 diguanylate cyclase SO423 49653 -1 771



LOCS Gene Product Start End Strand Length P3.or 211 Hadh 3-hydroxyacyl-CoA dehydrogenase 2246.36 225592 2 957 P3.or 212 ybfI transcriptional activator 226298 226915 2 618 P3.or 213 pamO steroid monooxygenase 227173 228822 1 16SO P3.or 214 metX esterase 228849 229904 3 1056 P3.or 215 conserved hypothetical protein 230906 23OO31 -3 876 P3.or 216 TetR family transcriptional regulator 231603 230980 -1 624 P3.or 217 ybhR multidrug ABC transporter permease 2328O1 231 680 -2 1122 component P3.or 218 ybhS multidrug ABC transporter permease 233368 2328OS -2 S64 component P3.or 219 putative ABC transporter permease protein 233931 23336S -1 567 P3.or 220 ABC transporter related protein 23S682 233928 -3 1755 P3.or 221 membrane protein 2367O6 235687 -1 102O P3.or 222 ybiH TetR family transcriptional regulator 237326 236,703 -3 624 P3.or 223 conserved hypothetical protein 238615 23.7500 -2 1116 P3.or 224 paa A phenylacetic acid degradation protein (similar 239551 238799 -2 753 opaaA) P3.or 225 hypothetical protein 240236 239.757 -3 480 P3.or 226 Radical SAM domain protein 241498 24O233 -2 1266 P3.or 227 Rieske (2Fe-2S) domain-containing protein 242879 241698 -3 1182 P3.or 228 hioesterase Superfamily protein 243659 24318.9 -3 471 P3.or 229 MaoC-like dehydratase 244141 243656 -2 486 P3.or 230 NAD-dependent DNA ligase 246344 244233 -3 2112 P3.or 231 DNA repair protein RecN 248.114 246438 -3 1677 P3.or 232 DNA uptake lipoprotein 249069 248.191 -1 879 P3.or 233 DNA-binding protein 249926 250372 2 447 P3.or 234 UDP-3-O-3-hydroxymyristoyl N 251.433 2SO468 -1 966 acetylglucosamine deacetylase P3.or 235 tSZ cell division protein FtsZ. 253,351 251786 -2 566 P3.or 236 tSA F492457 3 cell division protein FtsA 2546.67 2S3432 -1 236 P3.or 237 Cell division protein FtsO 255726 254782 -1 945 P3.Or 238 did D-alanine-D-alanine ligase 256628 255714 -3 915 P3.or 239 murB UDP-N-acetylenolpyruvoylglucosamine 257635 256625 -2 O11 reductase P3.or 240 mur(G. 
UDP-N-acetylmuramate-L-alanine ligase 259059 257635 -1 425 P3.or 241 mur(G. UDP-N-acetylglucosamine-N-acetylmuramyl 260279 259056 -3 224 (pentapeptide) pyrophosphoryl-undecaprenol N-acetylglucosamine transferase P3.or 242 ftSW putative cell division protein ftsW 261397 26O276 -2 122 P3.or 243 mur) UDP-N-acetylmuramoyl-L-alanyl-D-glutamate 262860 261394 -1 467 synthetase P3.or 244 mira.Y Glycosyltransferase, family 4 263966 26.2878 -3 O89 P3.or 245 murF UDP-N-acetylmuramoyl-tripeptide--D-alanyl-D- 265477 264O2O -2 458 alanine ligase subfamily P3.or 246 murE UDP-N-acetylmuramoylalanyl-D-glutamate-- 266946 26S474 -1 473 2,6-diaminopimelate ligase P3.or 247 penA peptidoglycan synthetase FtsI 268828 266933 -2 896 P3.or 248 periplasmic protein 269292 2688.25 -1 468 P3.or 249 mraW S-adenosyl-methyltransferase miraW 270293 269289 -3 005 P3.or 250 mraZ cell division protein MraZ 270811 27O290 -2 522 P3.or 251 amp) negative regulator of AmpC, Amp) 272679 271888 -1 792 P3.or 252 integral membrane protein 273634 27 2702 -2 933 P3.or 253 hfaC ABC transporter related protein 275499 273631 -1 869 P3.or 2S4 curculin domain-containing protein 276723 275686 -1 O38 P3.or 255 dIA DnaJ-like protein DIA 277594 276848 -2 747 P3.or 2S6 bacterial extracellular solute-binding protein, 277977 278957 3 981 family 7 P3.or 257 tripartite ATP-independent periplasmic 279.099 279656 3 558 transporter, DctO component P3.or 258 TRAP dicarboxylate transporter, DctM subunit 279676 280986 1 1311 P3.or 259 OXP1 5-oxoprolinase (ATP-hydrolyzing) 281063 284,704 2 3642 P3.or 260 conserved hypothetical protein 28471S 2853O2 3 S88 P3.or 261 conserved hypothetical protein 28S447 28S947 3 5O1 P3.or 262 conserved hypothetical protein 28S944 287074 2 1131 P3.or 263 conserved hypothetical protein 287088 2873O3 3 216 P3.or 264 yne putative transcriptional regulator protein, LysR 287.445 2883O8 3 864 family P3.or 26S putative transmembrane protein 288326 2891OS 2 780 P3.or 266 pyruvate, phosphate dikinase 289409 
291.367 2 1959 P3.or 267 XRE family transcriptional regulator 292040 291459 -3 582 P3.or 268 Gluconate 2-dehydrogenase subunit 3 292191 292.955 3 765 P3.or 269 Gluconate 2-dehydrogenase (acceptor) 293O16 294779 3 1764 P3.or 270 diheme cytochrome c SoxE 294776 2.95147 2 372 P3.or 271 Ketopantoate reductase 295209 296.13S 3 927 US 2014/0296.161 A1 Oct. 2, 2014 44


LOCS Gene Product Start End Strand Length P3.or 272 ydeK transporterydeK 296,132 296998 2 867 P3.or 273 hxIR transcriptional regulator 297307 297.005 -2 303 P3.or 274 gstB glutathione S-transferase 297503 298117 2 615 P3.or 275 conserved hypothetical protein 298114 298491 1 378 P3.or 276 4-oxalocrotonate tautomerase family protein 298.504 298710 1 2O7 P3.or 277 thcD FAD-dependent pyridine nucleotide-disulphide 3OO150 298.870 -1 1281 oxidoreductase P3.or 278 hirb rubredoxin reductase 3OO460 3OO2S1 -2 210 P3.or 279 conserved hypothetical protein 3O1498 3OOSO9 -2 990 P3.or 28O gly A glycine hydroxymethyltransferase 3O3294 302002 -1 1293 P3.or 281 yfeJ glutamine amidotransferase class-I 3O4426 3O3725 -2 702 P3.or 282 ptsJ GntR family transcriptional regulator 3O471S 306O28 2 1314 P3.or 283 cupin 2 domain-containing protein 3066.17 306231 -3 387 P3.or 284 yedZ Ferric reductase domain protein protein 306862 307848 1 987 transmembrane component domain protein P3.or 285 ydeR Major facilitator Superfamily MFS 1 3O8978 3O7812 -3 1167 P3.or 286 ycaN LysR family transcriptional regulator 310241 31.1167 2 927 P3.or 287 yajO aldo/keto reductase 310081 309035 -2 1047 P3.or 288 menE citrate synthase 31.1990 313657 2 1668 P3.or 289 putative glutathione S-transferase-related 313654 314640 1 987 protein P3.or 290 short-chain dehydrogenase/reductase SDR 314637 315410 3 774 P3.or 291 crea CreA family protein 315498 315989 3 492 P3.or 292 LemA family protein 316110 316691 3 582 P3.or 293 htpX HtpX-2 peptidase 316696 3.17859 1 1164 P3.or 294 wirR conserved hypothetical protein 318375 31 7875 -1 5O1 P3.or 295 fixR short-chain dehydrogenase 31854O 319295 3 756 P3.or 296 hypothetical protein 32OOOS 32O121 1 117 P3.or 297 yfL inner membrane protein YifL 32O245 320667 1 423 P3.or 298 conserved hypothetical protein 32O745 321326 3 582 P3.or 299 ygiC glutathionylspermidine synthase 321341 322552 2 1212 P3.Or 300 conserved hypothetical protein 323,212 322577 -2 636 P3.or 301 cqSS CAI-1 
autoinducer sensor kinase/phosphatase 324242 323349 -3 894 cqsS P3.or pleC response regulator receiver sensor signal 325,762 324248 -2 1515 transduction histidine kinase P3.or 303 ior A indolepyruvate ferredoxin oxidoreductase, 3.29497 325985 -2 3513 alpha subunit P3.or 3O4 putative TetR family protein receptor protein 330225 329839 -1 387 P3.or 305 conserved hypothetical protein 330266 330535 2 270 P3.or 306 conserved hypothetical protein 330489 330956 3 468 P3.or 307 ZraR two component, sigmaS4 specific, Fis family 332800 331 O2S -2 776 transcriptional regulator P3.or 3O8 ybO Oligoendopeptidase F homolog 332996 334807 2 812 P3.or 309 aarF domain-containing kinase 334862 33626S 2 404 P3.or 310 hypothetical protein 336S46 337256 3 711 P3.or 311 diguanylate cyclase/phosphodiesterase 339938 337284 -3 2655 P3.or 312 hipO amidohydrolase 341077 339941 -2 137 P3.or 313 hypothetical protein 341.375 341542 2 168 P3.or 314 ilvG hiamine pyrophosphate protein central region 341626 343.383 1 758 P3.or 315 conserved hypothetical protein 343422 345497 3 2O76 P3.or 316 conserved hypothetical protein 345494 347092 2 599 P3.or 317 conserved hypothetical protein 347500 347114 -2 387 P3.or 318 conserved hypothetical protein 347766 3475.18 -1 249 P3.or 319 ttu) hydroxypyruvate reductase 3491SO 347876 -2 275 P3.or 32O Creatininase 3493SO 3SOO93 3 744 P3.or 321 N-formylglutamate amidohydrolase 35O106 350867 3 762 P3.or 322 conserved hypothetical protein 350920 3S1222 1 303 P3.or 323 amidohydrolase 3 353140 3S1260 -2 881 P3.or 324 conserved hypothetical protein 353483 3531.93 -3 291 P3.or 325 transcriptional regulator, AraC family 353634 354530 3 897 P3.or 326 conserved hypothetical protein 354659 3S4910 2 252 P3.or 327 conserved hypothetical protein 35SO41 355385 3 345 P3.or 328 hypothetical protein 355851 355.375 -1 477 P3.or 329 conserved hypothetical protein 356637 355870 -1 768 P3.or 330 conserved hypothetical protein 35.7494 356634 -3 861 P3.or 331 conserved hypothetical protein 
357888 357541 -1 348 P3.or 332 CAI-1 autoinducer synthase 358328 359608 2 1281 P3.or 333 Threo-3-hydroxyaspartate ammonia-lyase 360652 359672 -2 981 P3.or 334 transcriptional regulator, GintR family with 360821 3623O2 2 1482 aminotransferase domain P3.or 335 L-lactate permease 362445 364148 3 1704 P3.or 336 transcriptional regulator 364.222 364,560 1 339 P3.or 337 arsenical-resistance protein 364569 36S630 3 1062 US 2014/0296.161 A1 Oct. 2, 2014 45


LOCS Gene Product Start End Strand Length P3.or 338 NADPH-dependent FMN reductase 365627 366391 2 765 P3.or 339 ggt gamma-glutamyltranspeptidase 366,507 368312 3 806 P3.or 340 epoxide hydrolase protein 368348 369241 2 894 P3.or 341 AraC family transcriptional regulator 369241 37O3O8 O68 P3.or 342 putative regulator PrlF 370305 370499 3 195 P3.or 343 conserved hypothetical protein 371729 370581 -3 149 P3.or 344 conserved hypothetical protein 372148 371726 -2 423 P3.or 345 conserved hypothetical protein 372478 373455 978 P3.or 346 didpA ABC transporter substrate-binding protein 373624 375249 626 P3.or 347 dppB ABC transporter permease protein 2 375326 376348 2 O23 P3.or 348 didpC ABC transporter permease protein 1 376.345 3771.93 849 P3.or 349 gsiA ABC transporter ATP-binding protein 3771.90 378911 3 722 P3.or 350 mcbR Transcriptional Regulator, GintR family protein 37898O 379708 2 729 P3.or 351 aminotransferase class-III 3799.90 381.345 356 P3.or 352 hydantoin utilization protein 381360 38.3396 3 2037 P3.or 353 putative N-methylhydantoinase B 383401 385O14 614 P3.or 3S4 ygaY major facilitator transporter 386.318 385119 -3 200 P3.or 355 HXIR family transcriptional regulator 386421 386861 3 441 P3.or 356 GntR family transcriptional regulator 387573 386902 672 P3.or 357 bcpA isocitrate lyase and phosphorylmutase 387704 388567 2 864 P3.or 358 isopropylmalate isomerase large subunit 3.88564 389985 422 P3.or 359 leu) isopropylmalate isomerase Small subunit 389985 39062O 3 636 P3.or 360 yraM conserved hypothetical protein 390701 39.1888 2 188 P3.or 361 dictP c4-dicarboxylate-binding periplasmic protein 391.939 392961 O23 P3.or 362 TRAP-type C4-dicarboxylate transport system 392966 393532 2 567 Small permease P3.or 363 C4-dicarboxylate TRAP-T family tripartite ATP 393.529 394812 284 independent periplasmic transporter, membrane protein, large subunit P3.or 364 citH proline:glycine betaine transporter, Major 394.995 396.287 3 293 facilitator Superfamily P3.Or 365 InqSR 
Motility quorum-sensing regulator masR 396479 3.96775 2 297 P3.or 366 ygiT transcriptional regulator, XRE family 3.96777 3971.78 3 402 P3.or 367 hypothetical protein 397787 398224 2 438 P3.or 368 lcfA long-chain-fatty-acid-CoA ligase 4OOO68 39832O -1 1749 P3.or 369 ygff short-chain dehydrogenase/reductase SDR 4OO154 400912 2 759 P3.or 370 bioC methyltransferase 4OO934 4O1659 2 726 P3.or 371 F377340 3 AraC 4O2662 4O1670 -3 993 P3.or 372 dienelactone hydrolase family protein 402712 4O3S24 1 813 P3.or 373 conserved hypothetical protein 403624 404664 1 O41 P3.or 374 pleC sensory transduction histidine kinase 4O6997 4O4673 -3 2325 P3.or 375 dht dihydropyrimidinase 408.628 4071.68 -2 461 P3.or 376 pucI cytosinepurines, uracil, thiamine, allantoin 410219 408690 -3 530 transporter P3.or 377 yeiA dihydroorotate dehydrogenase family protein 411740 410436 -3 305 P3.or 378 yeiT putative oxidoreductase 413141 411768 -3 374 P3.or 379 hyuC amidase, hydantoinase? carbamoylase family 414563 413307 -3 257 P3.or 380 mmSA methylmalonate-semialdehyde dehydrogenase 416162 414663 -3 500 P3.or 381 rutR transcriptional regulator, TetR family protein 416868 417575 3 708 P3.or 382 suhB putative inositol monophosphatase 417572 418438 2 867 P3.or 383 CRISPR-associated protein Cas2 418894 418604 -2 291 P3.or 384 conserved hypothetical protein 42OO31 41894O -2 O92 P3.or 385 putative crispr-associated protein cas4 42O690 420O28 663 P3.or 386 CRISPR-associated Csh2 family protein 4216O2 42O730 873 P3.or 387 CRISPR-associated protein, Csd1 family 423567 421660 908 P3.or 388 CRISPR-associated protein Cass family 424163 423S64 -3 600 P3.or 389 helicase 426840 424.429 2412 P3.or 390 conserved hypothetical protein 427477 427037 -2 441 P3.or 391 Pirin domain protein 428530 4276O1 -2 930 P3.or 392 ybfI transcriptional regulator, AraC family 429531 428647 885 P3.or 393 mdeA CyStathionine gamma-synthase 429596 43O8O1 2 12O6 P3.or 394 ydeL transcriptional regulator, GintR family with 432223 43O811 -2 1413 
aminotransferase domain protein
P3.or 395 FMN-binding negative transcriptional regulator 432301 432936 1 636
P3.or 396 putative nitronate monooxygenase 434092 433112 -2 981
P3.or 397 HxlR family transcriptional regulator 434310 434867 3 558
P3.or 398 conserved hypothetical protein 435492 434926 -1 567
P3.or 399 D-isomer specific 2-hydroxyacid dehydrogenase NAD-binding 435713 436675 2 963
P3.or 400 hypothetical protein 438321 436699 -1 1623
P3.or 401 2OG-Fe(II) oxygenase family protein 438775 439785 1 1011
P3.or 402 hypothetical protein 438564 438355 -1 210
P3.or 403 Glyoxalase/bleomycin resistance protein/dioxygenase 439832 440221 2 390


LOCS Gene Product Start End Strand Length P3.or 404 med Bmp family membrane protein 44O3OO 44.1394 2 095 P3.or 40S yufC) ABC transporter related protein 441436 44.2977 1 S42 P3.or 4O6 inner-membrane translocator 442970 4441OO 2 131 P3.or 407 branched chain amino acid ABC transporter, 444097 445O29 1 933 permease protein P3.or 4.08 AMP nucleosidase 44512O 446628 1 509 P3.or 409 hypothetical protein 446711 447028 2 3.18 P3.or 410 HicB family protein 4471.93 447.432 1 240 P3.or 411 conserved hypothetical protein 447SO2 448788 1 287 P3.or 412 SupD 44878S 449744 3 960 P3.or 413 yakc aldo/keto reductase 449848 4SO882 1 O35 P3.or 414 yhgD TetR family transcriptional regulator 450910 4S1506 1 597 P3.or 415 AraC family transcriptional regulator 451595 452S63 2 969 P3.or 416 major facilitator transporter 452652 453845 3 194 P3.or 417 fyuA TonB-dependent receptor 453931 456OS1 1 21.21 P3.or 418 O-methyltransferase family protein 456051 457061 3 O11 P3.or 419 yafC transcriptional regulator protein 457981 457088 -2 894 P3.or 420 transporter protein 458,114 45962S 2 512 P3.or 421 NQO1 NAD(P)H quinone oxidoreductase 460322 459654 -3 669 P3.or 422 nahR LysR Substrate binding domain protein 46O421 461374 2 954 P3.or 423 yebE nner membrane protein yebE 4621.76 461334 -3 843 P3.or 424 acetyl-CoA carboxylase 462381 463,019 3 639 P3.or 425 ybB Na Pi-cotransporter II-related protein 464741 463O29 -3 1713 P3.or 426 hypothetical protein 465315 464797 -1 519 P3.or 427 hypothetical protein 465715 465299 -2 417 P3.or 428 NADH-ubiquinone oxidoreductase 466049 465705 -3 345 P3.or 429 PED1 Acetyl-CoA C-acyltransferase 467453 466269 -3 118S P3.or 430 braG putative high-affinity branched-chain amino 468.382 46766O -2 723 acid transport ATP-binding proteinputative P3.or 431 braF ABC Superfamily ATP binding cassette 4691.52 468.379 -1 774 transporter, ABC protein P3.Or 432 branched-chain amino acid ABC transporter 470993 469149 -3 1845 P3.or 433 putative branched-chain amino acid ABC 472.322 4711OS 
-3 1218 transporter, periplasmic Substrate-binding protein P3.or 434 dehydrogenase 473394 4724.80 915 P3.or 435 rSmA Methyltransferase type 12 474041 473445 -3 597 P3.or 436 iorB Aldehyde oxidase and Xanthine 476400 474082 2319 dehydrogenase, molybdopterin binding protein P3.or 437 ior A Membrane-bound aldehyde dehydrogenase 476874 476410 465 iron-sulfur protein P3.or 438 ychC transcriptional regulator protein 477114 478010 3 897 P3.or 439 apl putative alkaline phosphatase protein 478111 478704 594 P3.or 440 sbcD nuclease SbcCD, D subunit 478813 48OO60 248 P3.or 441 sbcC exonuclease SbcC 48OO60 48.3848 3 3789 P3.or 442 aminotransferase 485304 483934 371 P3.or 443 ydeL transcriptional regulator, GintR family 485526 486989 3 464 P3.or 444 mocR transcriptional regulator 4.885.15 487031 -2 485 P3.or 445 Indoleacetamide hydrolase 488833 490233 401 P3.or 446 amidohydrolase-like 490463 492109 2 647 P3.or 447 cobalamin synthesis protein, P47K 492 09 493O38 930 P3.or 448 ALD4 aldehyde dehydrogenase 5 493O82 494-605 2 524 P3.or 449 conserved hypothetical protein 494-617 4953S4 738 P3.or 450 app A extracellular solute-binding protein family 5 49S476 497071 2 596 P3.or 451 dppB binding-protein dependent transport system 497 78 498.197 3 O20 inner membrane protein P3.or 452 appC binding-protein-dependent transport systems 498 94 499.108 2 915 inner membrane component P3.or 453 oligopeptide? dipeptide ABC transporter, 499 SOO118 1 O11 ATPase subunit P3.or 454 appF oligopeptide? 
dipeptide ABC transporter, ATPase subunit 500115 501104 3 990
P3.or 455 acdS 1-aminocyclopropane-1-carboxylate deaminase 75 -3 O17
P3.or 456 leucine-responsive regulatory protein 502373 502843 2 471
P3.or 457 NADH dehydrogenase protein 504282 502864 -1 1419
P3.or 458 putative addiction module antidote protein, CopG/Arc/MetJ family 505113 505406 3 294
P3.or 459 conserved hypothetical protein 505091 504606 -3 486
P3.or 460 yoaH chemotaxis sensory transducer 506986 505490 -2 1497
P3.or 461 glnO polar amino acid transport system ATP binding protein 507949 507197 -2 753
P3.or 462 polar amino acid ABC transporter, inner membrane subunit 508586 507936 -3 651


LOCS Gene Product Start End Strand Length P3.or 463 yecS glutamine ABC Superfamily ATP binding SO9304 508579 -1 726 cassette transporter, membrane protein P3.or 464 glnH amino acid ABC transporter Substrate-binding S10214 SO9390 -2 825 protein P3.or 465 lutR GntR domain-containing protein S104.87 511170 1 684 P3.or 466 alc1 allantoicase S11212 512225 3 O14 P3.or 467 allA ureidoglycolate hydrolase S12234 512761 2 528 P3.or 468 hypothetical protein 514441 S12849 -2 593 P3.or 469 hypothetical protein 514877 515026 2 150 P3.or 470 hypothetical protein S14651 514505 -2 147 P3.or 471 putative B12-dependent ribonucleoside 516753 515536 -1 218 diphosphate-triphosphate reductase (nird ike) P3.or 472 hypothetical protein S17886 S18038 2 153 P3.or 473 ydcR Bifunctional -- Transcriptional Regulator, GntR S19443 S18043 -3 401 amily Aminotransferase, class I and II P3.or 474 MFS transporter 51973O S21019 1 290 P3.or 475 transcriptional regulator, BadMRrf2 family S21021 S214O1 2 381 P3.or 476 HPrkinase S214O1 522249 1 849 P3.or 477 Sulfotransferase 522246 523112 3 867 P3.or 478 glCF Glycolate oxidase iron-sulfur subunit 525337 523952 -2 386 P3.or 479 glCE glycolate oxidase, Subunit gloE 526566 525334 -1 233 P3.or 480 glCD FAD linked oxidase-like protein S28058 526571 -2 488 P3.or 481 rpm 50S ribosomal protein L36 S28391 S28266 -2 126 P3.or 482 hypothetical protein S28619 529317 1 699 P3.or 483 Putative phosphoethanolamine N 530285 529314 -3 972 methyltransferase P3.or 484 lcfA acyl-CoA synthetase 532444 S30696 -2 1749 P3.or 485 biotin dependent acyl-CoA carboxylase 532,599 534.194 3 1596 P3.or 486 des atty acid desaturase 534,570 535637 3 1068 P3.or 487 conserved hypothetical protein S36061 535885 177 P3.Or 488 XRE family transcriptional regulator 53,7253 536912 -2 342 P3.or 489 hypothetical protein 537835 53,7951 117 P3.or 490 ad dependent oxidoreductase; protein 5391.65 537972 -3 1194 P3.or 491 valyl-tRNA synthetase S41083 539137 1947 P3.or 492 betB Aldehyde Dehydrogenase 
S42856 S41297 1S60 P3.or 493 6-phosphogluconate dehydrogenase NAD S43742 S42873 -2 870 binding P3.or 494 cynR LysR family transcriptional regulator S43784 S44932 1149 P3.or 495 acetyltransferase (GNAT) family protein 545.532 S4S140 393 P3.or 496 conserved hypothetical protein S45943 S45662 282 P3.or 497 ABC transporter related protein S46858 S46079 780 P3.or 498 binding-protein dependent transport system 547711 546917 -2 795 inner membrane protein P3.or 499 SSuA putative Sulfonate/nitrate transport system 54.8718 547735 984 Substrate-binding protein P3.or 500 agmatinase 549766 548759 -2 O08 P3.or 5O1 cmpR putative LysR-family regulatory protein S49871 550782 912 P3.or 502 putative membrane protein 550993 551823 831 P3.or 503 conserved hypothetical protein 551829 552332 3 SO4 P3.or SO4 SolaA L-serine dehydratase 1 552433 553821 389 P3.or 505 ytnL putative hydrolase 553879 555078 200 P3.or SO6 dgdR transcriptional regulator, LysR family 555974 555093 -3 882 P3.or 507 icod isocitrate dehydrogenase 556111 55.7196 O86 P3.or SO8 conserved hypothetical protein 557266 557913 648 P3.or 509 methyl-accepting chemotaxis protein 559595 557919 -3 677 P3.or 510 methyl-accepting chemotaxis protein 561525 SS984O 686 P3.or 511 mcp3 methyl-accepting chemotaxis sensory S63423 561738 -3 686 transducer P3.or 512 anti-Sigma factor, ChrR S63646 S64329 3 684 P3.or 513 yhbS GCN5-related N-acetyltransferase S64359 S64871 2 513 P3.or S1.4 Aass Alpha-aminoadipic semialdehyde synthase 565.472 566539 2 O68 P3.or 515 putative transcriptional regulator, ASnC family S6S324 S64887 -2 438 protein P3.or S16 CRISPR-associated protein Cas2 S67419 567090 -3 330 P3.or 517 conserved hypothetical protein 568244 56,7429 -3 816 P3.or 518 conserved hypothetical protein S71430 S68281 -3 3150 P3.or 519 hypothetical protein 571876 571736 -2 141 P3.or 520 putative ABC transporter-binding protein 572O58 573.365 3 1308 P3.or 521 binding-protein-dependent transport systems 573.362 574309 2 948 inner membrane 
component
P3.or 522 yurM maltose/maltodextrin ABC transporter permease protein MalG 574306 575139 1 834
P3.or 523 Smok maltose 575149 576219 1 1071


LOCS Gene Product Start End Strand Length P3.or 524 rpfC GAF sensor hybrid histidine kinase 576303 578552 3 2250 P3.or 525 gstA conserved hypothetical protein 578676 579287 3 612 P3.or 526 dotA Sodium: dicarboxylate symporter S8O834 S794.19 -2 416 P3.or 527 dctB histidine kinase 581,119 S83O20 1 902 P3.or 528 dctD C4-dicarboxylate transport transcriptional 583017 S84396 3 380 regulatory protein dctD P3.or 529 MGLL hydrolase 585291 584,416 -1 876 P3.or 530 ydaM diguanylate cyclase S871.86 585414 -3 773 P3.or 531 tnaA tryptophanase 587561 589075 2 515 P3.or 532 conserved hypothetical protein 5892O7 S8.9812 1 606 P3.or 533 sodB Superoxide dismutase 589927 590535 1 609 P3.or 534 trpI LysR family transcriptional regulator S91413 S90499 -3 915 P3.or 535 pdxA 4-hydroxythreonine-4-phosphate 591591 S926.13 3 O23 dehydrogenase P3.or 536 ydbB Cupin domain protein S926.18 593.676 1 059 P3.or 537 putative oxidoreductase protein 594448 S93699 -2 750 P3.or 538 yhjC transcriptional regulator protein S94536 595471 2 936 P3.or 539 transporter 596,621 595.425 -3 197 P3.or S4O pat Phosphinothricin acetyltransferase 5972O2 S96618 -2 585 P3.or S41 cmpR LysR family transcriptional regulator 597267 598265 3 999 P3.or S42 Rrf2-linked NADH-flavin reductase S994.86 598.875 -3 612 P3.or 543 HTH-type transcriptional regulatorytfH 599657 6OOOSS 2 399 P3.or 544 TetR family transcriptional regulator 6OO669 6OOO43 -1 627 P3.or 545 putative NAD(P)H dehydrogenase 6OO774 601.382 3 609 P3.or S46 cyclase family protein 6O1466 602404 2 939 P3.or 547 hiopurine S-methyltransferase 603073 6O2429 -2 645 P3.or S48 moaR transcriptional regulator, LuxR family 603210 603926 3 717 P3.or S49 amine oxidase 6OS247 603.910 -1 1338 P3.or 550 short-chain dehydrogenase/reductase family 606O76 6OS336 -2 741 oxidoreductase P3.or 551 AraC family transcriptional regulator 606190 607110 1 921 P3.Or 552 major facilitator Superfamily MFS 1 6O71.97 6084SO 3 1254 P3.or 553 tohA TonB-dependent heme/hemoglobin receptor 610963 
608438 -2 2S26 family protein P3.or 554 FecR protein 611984 611058 -3 927 P3.or 555 FecI-like protein 6.12481 611981 -2 5O1 P3.or 556 ycdO conserved hypothetical protein 613577 612753 -3 825 P3.or 557 ycdB Dyp-type peroxidase family protein 614927 613623 -3 1305 P3.or 558 ycdO conserved hypothetical protein 616107 614929 -1 1179 P3.or 559 efeU iron permease FTR1 616954 616109 -2 846 P3.or S60 conserved hypothetical protein 623392 623982 1 591 P3.or 561 pleC sensor histidine kinase (non-motile and phage 624O21 625847 3 1827 resistance protein) P3.or S62 pleC Signal transduction histidine kinase 627293 625860 -3 1434 P3.or 563 purU formyltetrahydrofolate deformylase 627659 628516 2 858 P3.or S64 TRAP transporter solute receptor TAXI family 628659 629678 3 102O protein P3.or 565 TRAP transporter, 4TM/12TM fusion protein 629779 631911 1 21.33 P3.or 566 conserved hypothetical protein 631908 63.2378 3 471 P3.or 567 membrane protein-like protein 63.2378 633676 2 1299 P3.or 568 cupin 2 domain-containing protein 633719 634216 2 498 P3.or 569 fabG short-chain dehydrogenase/reductase SDR 634978 634232 -2 747 P3.or 570 hypothetical protein 635089 63S322 1 234 P3.or 571 two-component hybrid sensor and regulator 63S442 638924 3 3483 P3.or 572 agmR two-component response regulator 639596 638943 -3 654 P3.or 573 proP major facilitator Superfamily 641507 639789 -3 1719 P3.or 574 putative hydrolase/carboxylic esterase 642770 641817 -3 954 P3.or 575 Leu/Ile? 
Val-binding protein 644023 642791 -2 1233
P3.or 576 mexR MarR family transcriptional regulator 644239 644751 1 513
P3.or 577 conserved hypothetical protein 644833 645165 1 333
P3.or 578 conserved hypothetical protein 645504 645773 3 270
P3.or 579 transcriptional regulator, XRE family 645764 646111 2 348
P3.or 580 transcriptional regulator, GntR family 647380 648111 1 732
P3.or 581 amidohydrolase 2 647292 646474 -1 819
P3.or 582 conserved hypothetical protein 648226 649182 1 957
P3.or 583 conserved hypothetical protein 649184 649621 2 438
P3.or 584 integral membrane protein 649637 651130 2 1494
P3.or 585 conserved hypothetical protein 651713 651135 -3 579
P3.or 586 mnmA tRNA (5-methylaminomethyl-2-thiouridylate)-methyltransferase 653006 651852 -3 1155
P3.or 587 Transcriptional activator, TenA family protein 653783 653061 -3 723
P3.or 588 fixB 2Fe-2S ferredoxin 654279 653947 -1 333
P3.or 589 Iron-sulfur assembly protein 654862 654530 -2 333
P3.or 590 mdtA Secretion protein 655170 656342 3 1173


LOCS Gene Product Start End Strand Length P3.or 591 hydrophobic amphiphilic exporter-1 656364 659462 3 3.099 P3.or 592 ybaL potassium efflux system protein 661960 66O227 -2 1734 P3.or 593 hypothetical protein 6621.98 662629 2 432 P3.or 594 SecD protein-export membrane protein SECD 662767 665103 1 2337 P3.or 595 grST Type II thioesterase 6651.86 665971 2 786 P3.or 596 phbA Membrane protease subunit 667.426 665975 -2 1452 stomatin/prohibitin-like protein P3.or 597 yoI Cyclic peptide transporter 6691. OS 667.423 -1 1683 P3.or 598 hypothetical protein 669427 66921S -2 213 P3.or 599 GTPase domain-containing protein 6701.23 669551 -2 573 P3.or 600 cation multidrug efflux pump protein 673337 670374 -3 2964 P3.or 6O1 putative HlyD-like secretion protein 674600 673467 -3 1134 P3.or 602 lgrC OciB protein 674941 681312 1 6372 P3.or 603 pks inear gramicidin synthetase subunit D 681309 686699 3 5391 P3.or 604 lgrC non-ribosomal peptide synthetase 686696 690688 2 3993 P3.or 60S lgrC putative nonribosomal peptide synthetases 690664 702225 1 11562 (NPRS) P3.or 606 lgrC erythronolide synthase 702250 707367 1 S118 P3.or 607 pks HctF 707383 712224 1 4842 P3.or 608 pks non-ribosomal peptide synthetase/polyketide 712221 71.6462 3 4242 syntinase P3.or 609 lgrC non-ribosomal peptide synthetase 716459 72O319 2 3861 P3.or 610 lgrC non-ribosomal peptide synthetase 720291 722912 3 2622 P3.or 611 lgrC peptide synthetase 722909 7294OO 2 6492 P3.or 612 mbtEH MbtH domain-containing protein 729464 729697 2 234 P3.or 613 conserved hypothetical protein 73.1518 73.1312 -2 2O7 P3.or 614 hypothetical protein 732225 732362 3 138 P3.or 615 hypothetical protein 732366 732548 3 183 P3.or 616 conserved hypothetical protein 733333 732761 -2 573 P3.or 617 conserved hypothetical protein 734,754 734527 -1 228 P3.or 618 CAAX amino terminal protease family 736217 735243 -3 975 P3.Or 619 cyanate transport System protein 737535 736339 -1 1197 P3.or 62O transcriptional regulator, GintR family 738299 737532 -3 768 
P3.or 621 putative SURF1 family protein 739095 738403 -1 693 P3.or 622 cyoD cytochrome o ubiquinol oxidase subunit IV 739619 7391.94 -3 426 P3.or 623 cyOC cytochrome o ubiquinol oxidase subunit III 74O278 739616 -2 663 P3.or 624 cyoB cytochrome o ubiquinol oxidase, Subunit I 742.286 74O283 -3 2004 (ubiquinol oxidase chain A) P3.or 625 cyoA ubiquinol oxidase, subunit II 743480 742305 -3 1176 P3.or 626 type III effector protein 744O90 744644 3 555 P3.or 627 hypothetical protein 7446SS 745092 1 438 P3.or 628 DNA methylase N-4/N-6 745248 748043 3 2796 P3.or 629 type III restriction enzyme, res subunit 748063 751170 1 3.108 P3.or 630 UvrD/REP type DNA helicase 753318 751,198 -1 2121 P3.or 631 pyridoxal phosphate biosynthetic protein Pdx.J. 754149 753397 -1 753 P3.or 632 hypothetical protein 754142 754318 2 177 P3.or 633 ggpS alpha,alpha-trehalose-phosphate synthase 75588O 754360 -1 1521 P3.or 634 ggpS alpha,alpha-trehalose-phosphate synthase 756.681 755959 -1 723 P3.or 635 gpSA glycerol-3-phosphate dehydrogenase 757862 7S 6681 -3 1182 P3.or 636 ag|A putative alpha-glucosidase AglA 758072 759703 2 1632 P3.or 637 glutathione-dependent formaldehyde 759.738 76O196 3 459 activating GFA P3.or 638 TetR family transcriptional regulator 760355 760999 2 645 P3.or 639 aminotransferase, class IV 761003 761953 2 951 P3.or 640 conserved hypothetical protein 761953 763626 1 1674 P3.or 641 cit 3-hydroxybutyryl-CoA dehydratase 764644 763850 -2 795 P3.or 642 oruR AraC family transcriptional regulator 764810 765886 2 1077 P3.or 643 FAD/FMN-containing dehydrogenases 767279 765852 -3 1428 P3.or 644 lutR transcriptional regulator, GintR 767533 769188 1 1656 family amidohydrolase family protein P3.or 645 yiaO putative periplasmic Substrate-binding 769391 770386 2 996 transport protein P3.or 646 sia.T TRAP-type C4-dicarboxylate transport system, 770492 770998 2 507 Small permease component P3.or 647 ygiK C4-dicarboxylate transport system (permease 770995 772278 1 1284 large protein) P3.or 648 
cysB cystathionine beta-synthase 772491 773513 3 1023
P3.or 649 2-isopropylmalate synthase 773510 774922 2 1413
P3.or 650 conserved hypothetical protein 774877 775938 1 1062
P3.or 651 conserved hypothetical protein 776020 777183 1 1164
P3.or 652 Argininosuccinate lyase 2 777193 778479 1 1287
P3.or 653 conserved hypothetical protein 778476 779720 3 1245
P3.or 654 hypothetical protein 779737 781011 1 1275
P3.or 655 short chain dehydrogenase 782717 781806 -3 912


LOCS Gene Product Start End Strand Length P3.or 656 transcriptional regulator, TetR family 782822 7834O3 2 582 P3.or 657 iclR clR family transcriptional regulator 784283 7834.17 -3 867 P3.or 658 gentisate 1,2-dioxygenase 784491 785555 3 106S P3.or 659 umarylacetoacetat hydroxylase 785.552 786268 2 717 P3.or 660 TRAP dicarboxylate transporter- DctP subunit 78.6340 787383 1 1044 P3.or 661 hypothetical protein 787380 787946 3 567 P3.or 662 trap dicarboxylate transporter, dctm Subunit 787953 7892.57 3 1305 P3.or 663 maleylacetoacetate isomerase 789271 7899.45 1 675 P3.or 664 HTH-type transcriptional regulator 790355 790053 -3 303 P3.or 665 Cysteine desulfurase 791652 79.0423 -1 1230 P3.or 666 Cysteine sulfinate desulfinase/cysteine 792881 7.91745 -3 1137 desulfurase and related enzyme P3.or 667 transcriptional regulator, BadMRrf2 family 793.399 79.2878 -2 522 P3.or 668 serine O-acetyltransferase 794181 793441 -1 741 P3.or 669 hydrolase of the alpha/beta Superfamily 7943.79 795O17 3 639 P3.or 670 hreonine synthase-like protein 796344 795.088 -1 1257 P3.or 671 TRAP transporter, 4TM/12TM fusion protein 798272 796416 -3 1857 P3.or 672 TRAP transporter solute receptor, TAXI family 799414 798.431 -2 984 P3.or 673 clR family transcriptional regulator 80O322 799.525 -1 798 P3.or 674 peptidase M24 8O1581 80O361 -3 1221 P3.or 675 heavy metal translocating P-type ATPase 8040O2 8018O1 -3 22O2 P3.or 676 transcriptional regulator, MerR family protein 804141 804641 3 5O1 P3.or 677 phospholipid N-methyltransferase protein 8.04744 805394 3 651 P3.or 678 anhydro-N-acetylmuramic acid kinase 807521 805377 -3 2145 P3.or 679 tyrosyl-tRNA synthetase 807709 8O896S 1257 P3.or 68O YceI family protein 809682 809047 636 P3.or 681 MarR family transcrip ional regulator 8098.91 81 0358 2 468 P3.or 682 major facilitator transporter 81.0360 811934 3 1575 P3.or 683 secretion protein HlyD 811942 812994 1053 P3.or 684 conserved hypothetical protein 816400 813OO2 -2 3399 P3.or 685 glutamate-ammonia-ligase 
adenylyltransferase 816637 819669 3O33 P3.Or 686 Redoxin 819666 82O127 3 462 P3.or 687 conserved hypothetical protein 82O143 821033 3 891 P3.or 688 GCN5-related N-acetyltransferase 821069 821593 2 525 P3.or 689 conserved hypothetical protein 821947 822381 435 P3.or 690 conserved hypothetical protein 824286 823750 537 P3.or 691 conserved hypothetical protein 823753 8.22395 -2 1359 P3.or 692 conserved hypothetical protein 824.445 825173 3 729 P3.or 693 major facilitator Superfamily MFS 1 826398 825151 1248 P3.or 694 ydeM putative acyl dehydratase 826905 826429 477 P3.or 695 ubiquinonemenaquinone biosynthesis 827735 826902 -3 834 methyltransferase pro el P3.or 696 radical SAM domain protein 830039 827772 -3 2268 P3.or 697 rhiR LuxR family transcriptional regulator 83O445 831,188 3 744 P3.or 698 rhII N-acyl-L-homoserine lactone synthetase MsaI 831.221 831901 2 681 P3.or 699 ydeM MaoC domain protein dehydratase 831898 832386 1 489 P3.or 700 alpha/beta hydrolase fold protein 833473 83241S -2 1059 P3.or 701 prfB peptide chain release actor RF-2 834438 833473 -1 966 P3.or 702 mirca penicillin binding pro ein 1A 837211 834671 -2 2541 P3.or 703 amiC N-acetylmuramoyl-L-alanine amidase 838844 83.7453 -3 1392 P3.or 704 le Ribonuclease E and G 839847 843209 3 33.63 P3.or 705 aspC putative aminotransferase protein 844504 843335 -2 1170 P3.or 706 GGT1 gamma-glutamyltranspeptidase precursor 845878 84.4637 -2 1242 P3.or 707 putative Zn-dependen protease 845982 847358 3 1377 P3.or 708 outer membrane protein 847424 848.182 2 759 P3.or 709 gph phosphoglycolate phosphatase 848873 84.8187 -3 687 P3.or 710 conserved hypothetical protein 849082 849SS8 1 477 P3.or 711 vagC SpoVT/AbrB-like pro ein 849653 84989S 2 243 P3.or 712 PilT protein domain protein 84.9994 850296 1 303 P3.or 713 conserved hypothetical protein 850860 850378 -1 483 P3.or 71.4 nahR LysR family transcrip ional regulator 850859 851845 2 987 P3.or 715 bdIA methyl-accepting chemotaxis sensory 852122 853828 2 1707 transducer 
with Pas/Pac sensor
P3.or 716 fumarylacetoacetate hydrolase family protein 854906 853923 -3 984
P3.or 717 homogentisate 1,2-dioxygenase 856060 854909 -2 1152
P3.or 718 wY 4-hydroxyphenylpyruvate dioxygenase 857153 856053 -3 1101
P3.or 719 transcriptional regulator, MarR family protein 857344 857895 1 552
P3.or 720 methyl-accepting chemotaxis sensory transducer 858317 859390 2 1074
P3.or 721 cya adenylate cyclase protein 859494 861215 3 1722
P3.or 722 Argininosuccinate lyase 2 862511 861219 -3 1293
P3.or 723 conserved hypothetical protein 863485 862508 -2 978
P3.or 724 fatty acid desaturase 864225 863482 -1 744
P3.or 725 conserved hypothetical protein 865333 864473 -2 861


LOCS Gene Product Start End Strand Length P3.or 726 ioS aldo/keto reductase 866343 865.357 -1 987 P3.or 727 lgrD amino acid adenylation domain protein 869822 866.340 -3 3483 P3.or 728 gshB glutathione synthase 871153 87019.1 -2 963 P3.or 729 GST glutathione S-transferase 872O56 871313 -2 744 P3.or 730 thiG bifunctional sulfur carrier proteinthiazole 8731 OO 872105 -2 996 synthase protein P3.or 731 3-dehydroquinate dehydratase 873.253 873732 1 480 P3.or 732 accB acetyl-CoA carboxylase, biotin carboxyl carrier 873,725 8742O7 2 483 protein P3.or 733 accC acetyl-CoA carboxylase 874221 875564 3 1344 P3.or 734 D-serine dehydratase 875663 876799 2 1137 P3.or 735 conserved hypothetical protein 876931 878.382 1 1452 P3.or 736 membrane protein 878.379 879446 3 1068 P3.or 737 eucyl phenylalanyl-tRNA-protein transferase 879579 88O3O1 3 723 P3.or 738 conserved hypothetical protein 880751 88O3O8 -3 444 P3.or 739 ABC Superfamily ATP binding cassette 881269 88.0748 -2 522 transporter Substrate binding protein P3.or 740 NADH dehydrogenase 881697 881350 -1 348 P3.or 741 nrd Ribonucleotide reductase large subunit 882342 886O10 3 3669 P3.or 742 putative aminopeptidase protein 88613S 886.36S 1 231 P3.or 743 domain of unknown function DUF1814 886.362 887063 3 702 P3.or 744 nuclease 888130 887O69 -2 1062 P3.or 745 conserved hypothetical protein 8892.57 888 127 -1 1131 P3.or 746 conserved hypothetical protein 890O82 8892S8 -1 825 P3.or 747 putative transmembrane protein 891.508 89OO66 -2 1443 P3.or 748 conserved hypothetical protein 892O71 89.1652 -1 420 P3.or 749 Iivo ABC transporter related protein 892.719 893525 3 807 P3.or 750 IvE ABC transporter related protein 893545 894.264 1 720 P3.or 751 IvE hydrophobic amino acid ABC transporter 894.261 895124 3 864 {{8Se. 
P3.or 752 braE inner-membrane translocator 895124 896110 2 987 P3.Or 753 nepI major facilitator family transporter 897320 896.151 -3 1170 P3.or 754 yafC Transcriptional regulator, LysR family 898325 897.456 -3 870 P3.or 755 mauR LysR family transcriptional regulator 899.245 8983O4 -2 942 P3.or 756 Sodium sulphate symporter 899.382 900788 3 1407 P3.or 757 Glyoxalasebleomycin resistance 9.01172 900798 -3 375 protein dioxygenase P3.or 758 conserved hypothetical protein 9 O2S18 901.277 -2 1242 P3.or 759 sigma-54 interacting transcription regulator 902714 904321 2 1608 protein P3.or 760 nsrR HTH-type transcriptional regulatornsrR 904774 904325 -2 450 P3.or 761 braC extracellular ligand-binding receptor 906172 904952 -2 1221 P3.or 762 hypothetical protein 906686 906889 2 204 P3.or 763 xdhA aldehyde oxidase and Xanthine 907429 910212 1 2784 dehydrogenase molybdopterin binding P3.or 764 hcrC putative deshydrogenase/oxidoreductase; 910209 910781 3 573 P3.or 765 Gluconate 2-dehydrogenase (acceptor) 910778 912O58 2 281 P3.or 766 AMP-dependent synthetase and ligase 91.3887 912O82 -1 806 P3.or 767 IvE ABC transporter related protein 914711 913.884 -3 828 P3.or 768 IivK Extracellular ligand-binding receptor 916OO6 91478O -2 227 P3.or 769 IiwV inner-membrane translocator 917130 916087 -1 O44 P3.or 770 IvE inner-membrane translocator 918063 917,185 -1 879 P3.or 771 braF ABC transporter related protein 918916 91.8149 -2 768 P3.or 772 yoP Beta-lactamase-like 92O316 919276 -1 O4 P3.or 773 acyl-CoA dehydrogenase 922322 92O526 -3 797 P3.or 774 hypothetical protein 923O29 922643 -2 387 P3.or 775 asnO asparagine synthase 92S222 923261 -2 962 P3.or 776 conserved hypothetical protein 92.5873 925373 -2 50 P3.or 777 sir1 Sulfite reductase (ferredoxin) 92.7872 92.5857 -3 2016 P3.or 778 cya adenylate cyclase 1 9293.31 928.132 -1 200 P3.or 779 conserved hypothetical protein 92.9415 929771 3 357 P3.or 78O 2OG-Fe(II) oxygenase 93O4O7 929.787 -3 62 P3.or 781 conserved hypothetical protein 93O792 
932603 3 1812
P3.or 782 mmgC acyl-CoA dehydrogenase 934470 932680 -1 1791
P3.or 783 fadN 3-hydroxyacyl-CoA dehydrogenase 936829 934496 -2 2334
P3.or 784 fadA 3-ketoacyl-CoA 938014 936875 -2 1140
P3.or 785 transcriptional regulator, MerR family protein 938540 938148 -3 393
P3.or 786 lcfA long-chain-fatty-acid-CoA ligase 940431 938701 -1 1731
P3.or 787 Helix-turn-helix motif protein 941092 940724 -2 369
P3.or 788 plasmid maintenance system killer 941296 941174 -2 123
P3.or 789 hypothetical protein 942032 941577 -3 456
P3.or 790 conserved hypothetical protein 942675 942043 -1 633
P3.or 791 conserved hypothetical protein 943429 942773 -2 657
P3.or 792 ArgK protein 944530 943568 -2 963


Locus Gene Product Start End Strand Length
P3.or 793 ppdK pyruvate phosphate dikinase 947228 944574 -3 2655
P3.or 794 glyS glycyl-tRNA synthetase beta chain 949324 947249 -2 2076
P3.or 795 hypothetical protein 950163 950405 3 243
P3.or 796 glyQ Glycyl-tRNA synthetase alpha chain 950135 949329 -3 807
P3.or 797 hypothetical protein 950700 950449 -1 252
P3.or 798 peptidase S49 951588 950713 -1 876
P3.or 799 methyltransferase small 952466 951648 -3 819
P3.or 800 conserved hypothetical protein 952771 952463 -2 309
P3.or 801 ispB Polyprenyl synthetase 953124 954125 3 1002
P3.or 802 pleC integral membrane sensor hybrid histidine kinase 954513 955805 3 1293
P3.or 803 ADS1 fatty-acid desaturase 955953 956948 3 996
P3.or 804 ppsA Beta-ketoacyl synthase 957140 957436 2 297
P3.or 805 cyclopropane-fatty-acyl-phospholipid synthase 957464 958363 2 900
P3.or 806 amino acid adenylation 958379 960163 2 1785
P3.or 807 acyl-CoA dehydrogenase domain protein 960167 961330 2 1164
P3.or 808 cit F148266 1 unknown 961327 962502 1176
P3.or 809 cmoA methyltransferase type 12 963218 962508 -3 711
P3.or 810 conserved hypothetical protein 963792 963313 480
P3.or 811 conserved hypothetical protein 964293 963898 396
P3.or 812 HDDC2 protein 964561 965175 615
P3.or 813 ycfQ transcriptional regulator, TetR family 965820 965188 633
P3.or 814 estB beta-lactamase 965903 967150 2 1248
P3.or 815 TetR-family transcriptional regulator 967743 967138 606
P3.or 816 gst3 glutathione S-transferase domain-containing protein 967843 968511 669
P3.or 817 conserved hypothetical protein 969588 968770 819
P3.or 818 phenazine biosynthesis protein 971074 970151 -2 924
P3.or 819 GntR family transcriptional regulator with aminotransferase domain 971121 972512 3 1392
P3.or 820 putative flagellin 972583 974163 1581
P3.or 821 conserved hypothetical protein 974616 974173 444
P3.or 822 Dimethylaniline monooxygenase N-oxide forming 5 976226 974622 -3 1605
P3.or 823 biotin/lipoyl attachment domain-containing protein 976537 976319 -2 219
P3.or 824 accC carbamoyl-phosphate synthase L chain ATP-binding 977933 976590 -3 1344
P3.or 825 conserved hypothetical protein 978049 978747 1 699
P3.or 826 transcriptional regulator, XRE family 978834 979655 3 822
P3.or 827 putative phosphatidylethanolamine-binding protein 980209 979643 -2 567
P3.or 828 AraC protein 981184 980306 -2 879
P3.or 829 conserved hypothetical protein 981345 982268 3 924
P3.or 830 rhtB amino acid efflux protein 982913 982272 -3 642
P3.or 831 mutB methylmalonyl-CoA mutase 985204 983015 -2 2190
P3.or 832 mutA methylmalonyl-CoA mutase 987113 985209 -3 1905
P3.or 833 Enoyl-CoA hydratase/isomerase 988058 987369 -3 690
P3.or 834 hypothetical protein 988638 988207 -1 432
P3.or 835 hypothetical protein 989373 988879 -1 495
P3.or 836 hmuR TonB-dependent receptor plug 989660 991966 2 2307
P3.or 837 yddA putative ABC transport system, ATP-binding protein 991971 993755 3 1785
P3.or 838 putative oxidoreductase 994690 993740 -2 951
P3.or 839 3-ketoacyl-(acyl-carrier-protein) reductase 995724 994966 -1 759
P3.or 840 NHL repeat-containing protein 996113 997576 2 1464
P3.or 841 GCN5-related N-acetyltransferase 997593 998120 3 528
P3.or 842 npdA NAD-dependent deacetylase 998882 998127 -3 756
P3.or 843 nitroreductase 999578 998934 -3 645
P3.or 844 hypothetical protein 999914 999597 -3 318
P3.or 845 pleC multi-sensor signal transduction histidine kinase 1002236 1000074 -3 2163
P3.or 846 ATP-dependent protease HslVU (ClpYQ), ATPase subunit 1003513 1002365 -2 1149
P3.or 847 cytochrome c family protein 1003815 1004216 3 402
P3.or 848 conserved hypothetical protein 1004257 1004955 1 699
P3.or 849 acoR transcriptional regulator 1005170 1007215 2 2046
P3.or 850 acoX ATP-NAD AcoX kinase 1007579 1008661 2 1083
P3.or 851 acoA acetoin dehydrogenase complex, E1 component, alpha subunit 1008740 1009744 2 1005
P3.or 852 acoB pyruvate dehydrogenase E1 component, beta subunit 1009779 1010786 3 1008
P3.or 853 acoC branched-chain alpha-keto acid dehydrogenase subunit E2 1010805 1011917 3 1113


LOCS Gene Product Start End Length P3.or 854 budC short-chain dehydrogenase/reductase SDR O11922 O12716 795 P3.or 855 hypothetical protein O13102 O13245 144 P3.or 856 Lysine-specific demethylase O13381 O145SO 1170 P3.or 857 aspartyl asparaginyl beta-hydroxylase O15527 O14553 975 P3.or 858 conserved hypothetical protein O15809 O15531 279 P3.or 859 prolyl 4-hydroxylase, alpha Subunit O15971 O16600 630 P3.or 860 conserved hypothetical protein O16930 O16616 315 P3.or 861 asnO Asparagine synthase (glutamine-hydrolyzing) O17430 O19361 1932 P3.or 862 Sulfotransferase O2O238 O19384 855 P3.or 863 HPrkinase O2.1071 O2O238 834 P3.or 864 wiuB siderophore-interacting protein O22343 O21240 1104 P3.or 865 Solute-binding periplasmic protein of O23S21 O2238S 1137 ironsiderophore ABC transporter P3.or 866 fhuA OMR family ferrichrome outer membrane O23536 2121 transporter P3.or 867 TetR family transcriptional regulator O25958 O26647 690 P3.or 868 alpha/beta hydrolase fold O27782 O26658 1125 P3.or 869 paaG enoyl-CoA hydratase O28657 O27845 813 P3.or 870 Acetyl-CoA C-acetyltransferase O29949 028765 1185 P3.or 871 ynch3 oxidoreductase, zinc-binding dehydrogenase O3O187 O31251 1065 family P3.or 872 pleC two-component sensor histidine kinase O31517 1218 P3.or 873 putative lipoprotein O33471 642 P3.or 874 hypothetical protein O33737 237 P3.or 875 putative 2-hydroxychromene-2-carboxylate O34622 615 isomerase P3.or 876 5-carboxymethyl-2-hydroxymuconate delta O34781 843 isomerase P3.or 877 ycC conserved hypothetical protein O35829 O37511 1683 P3.or 878 conserved hypothetical protein O38OSO O37535 S16 P3.or 879 transcriptional regulator, XRE family O38394 O38999 606 P3.Or 88O Sulfonate/nitrate?taurine transport System ATP O39664 O40437 774 binding protein P3.or 881 Sulfonate/nitrate?taurine transport system O40430 O41365 936 permease protein P3.or 882 Sulfonate/nitrate?taurine transport system O41362 O42516 1155 permease protein P3.or 883 Sulfonate/nitrate?taurine transport system O42687 1014 
Substrate-binding protein P3.or 884 diguanylate cyclase/phosphodiesterase with O43849 O45435 1587 PAS/PAC sensor(s) P3.or 885 ytsP conserved hypothetical protein O4S456 O45956 5O1 P3.or 886 major facilitator Superfamily MFS 1 O47664 O46O12 1653 P3.or 887 LysR-family transcriptional regulator O48839 O47913 927 P3.or 888 EmrB/QacA family drug resistance transporter 050376 O48943 1434 P3.or 889 regulatory protein, MarR OSO906 050373 534 P3.or 890 kipR clR family regulatory protein OS1954 OS1049 906 P3.or 891 acoA Thiamine pyrophosphate-dependent OS2403 OS3386 984 dehydrogenase, E1 component alpha Subunit P3.or 892 pdhB putative pyruvate dehydrogenase E1 beta OS3383 OS4354 972 Subunit P3.or 893 outer membrane efflux protein OS6161 OS4611 1551 P3.or 894 Stan VCBS 059657 OS6286 3372 P3.or 895 Fata. Putative Ig domain family O76476 059644 16833 P3.or 896 conserved hypothetical protein O8214.5 O76527 S619 P3.or 897 conserved hypothetical protein O82623 O82946 324 P3.or 898 Tail Collar domain protein O82954 O83649 696 P3.or 899 ail collar domain-containing protein O83750 O84343 594 P3.or 900 phage tail collar domain-containing protein O84375 O84974 600 P3.or 901 phage tail collar domain-containing protein O85296 O85622 327 P3.or 902 peptidase M50 O88761 O86569 21.93 P3.or 903 mdtA Multidrug resistance protein mdtA O90330 O8876S 1566 P3.or 904 conserved hypothetical protein O91271 O90327 945 P3.or 905 conserved hypothetical protein O923O2 091397 906 P3.or 906 S8 S-adenosylmethionine uptake transporter O92S38 093491 954 P3.or 907 yodO acetylornithine deacetylase or Succinyl O93542 O94795 1254 diaminopimelate desuccinylase P3.or 908 conserved hypothetical protein O95826 O951.46 681 P3.or 909 aldo/keto reductase O96797 O95826 972 P3.or 910 membrane protein O97557 096910 648 P3.or 911 hypothetical protein 097741 O97571 171 P3.or 912 hypothetical protein O981.83 O97731 453 P3.or 913 fabG short-chain dehydrogenase/reductase SDR O990SO O98265 786 P3.or 914 linx short-chain 
dehydrogenase/reductase SDR 099871 099077 795


LOCS Gene Product Start End Strand Length P3.or 915 yiaN DctM9 O1186 O99900 -1 1287 P3.or 916 DctO9 O1722 O11.83 -3 S4O P3.or 917 siaP TRAP dicarboxylate transporter, DctP subunit O2788 O1802 -1 987 P3.or 918 hypothetical protein O32O1 O3O16 -3 186 P3.or 919 fixI cation transport ATPase O5679 O3229 -3 2451 P3.or 920 nitrogen fixation protein fixH O6284 05700 -2 585 P3.or 921 fixG nitrogen fixation protein fixG O7868 O6360 -2 1509 P3.or 922 petJ cytochrome-c oxidase fixP O8809 O7940 -1 870 P3.or 923 cytochrome-c oxidase FixO2 O9728 09003 -2 726 P3.or 924 fixN cbb3-type cytochrome c oxidase subunit I 11223 O9733 -3 1491 P3.or 925 hypothetical protein 11513 112S6 -2 258 P3.or 926 conserved hypothetical protein 11787 1262O 2 834 P3.or 927 hypothetical protein 12860 12642 -2 219 P3.or 928 IvE ABC transporter ATP-binding protein 13652 12957 -2 696 P3.or 929 Iivo ABC transporter ATP-binding protein 144O1 13649 -1 753 P3.or 930 ABC transporter permease protein 15462 14398 -3 106S P3.or 931 IvE inner-membrane translocator 16426 15491 -1 936 P3.or 932 Extracellular ligand-binding receptor 17676 16489 -3 1188 P3.or 933 transcriptional regulator, ASnC family protein 18010 17768 -1 243 P3.or 934 conserved hypothetical protein 19103 19681 1 579 P3.or 935 GCN5-related N-acetyltransferase 19806 21011 2 12O6 P3.or 936 lysX RimK domain protein ATP-grasp 21016 22521 3 1506 P3.or 937 yncD putative TonB-dependent receptor protein 22565 24988 1 2424 P3.or 938 transcriptional regulator 2S283 24993 -2 291 P3.or 939 addiction module killer protein 25581 25276 -3 306 P3.or 940 hioesterase Superfamily protein 2613S 2S626 -2 510 P3.or 941 menE O-Succinylbenzoate--CoA ligase 26640 26.176 -3 465 P3.or 942 fad) AMP-dependent synthetase and ligase 26818 2666O -1 159 P4.or OO1 hcpC Putative beta-lactamase hcpC 1233 1 1233 P4.or OO2 rifampin ADP-ribosylating transferase 1724 1275 -3 450 P4.or OO3 conserved hypothetical protein 2205 1702 -1 SO4 P4.or 004 ThiJ/PfpI 3391 2255 -2 1137 P4.or 005 glx A 
transcriptional regulator, AraC family 349S 4505 3 1011 P4.or OO6 pleC Signal transduction histidine kinase 6703 4697 -2 2007 P4.or OO7 putative endonuclease involved in 7431 68.23 -1 609 recombination P4.or gatC aspartylglutamyl-tRNA(ASn/Gln) 7647 7934 3 288 amidotransferase subunit C P4.or gatA aspartylglutamyl-tRNA(ASn/Gln) 7931 9433 2 1503 amidotransferase subunit A P4.or O10 gath3 aspartyl-tRNA(ASn), glutamyl-tRNA (Gln) 10881 1 1452 amidotransferase subunit B P4.or O11 hypothetical protein 11112 11450 3 339 P4.or O12 Putative lipoprotein 11505 12014 3 510 P4.or O13 yafP N-acetyltransferase yafP 12042 12SO3 3 462 P4.or O14 CopG family DNA-binding protein 12S28 128O3 3 276 P4.or O15 hypothetical protein 1312O 12971 -2 150 P4.or O16 oxin complex protein 13164 2O768 3 7605 P4.or O17 metI binding-protein-dependent transport systems 21511 2O828 -2 684 inner membrane component P4.or O18 metN D-methionine transport system ATP-binding 22577 21495 -3 1083 protein P4.or ygcF 7-carboxy-7-deazaguanine synthase homolog 22800 23S43 3 744 P4.or queID putative 6-pyruvoyltetrahydropterin synthase 23S43 23929 2 387 P4.or queC exSB protein 23926 24669 1 744 P4.or 23S rRNA methyltransferase 24686 2S2O4 2 519 P4.or menE AMP-dependent synthetase and ligase 26973 2S48O -1 1494 P4.or fadR transcriptional regulator AcrR family 27670 27053 -2 618 P4.or conserved hypothetical protein 28OOO 28827 1 828 P4.or putative Rossmann fold nucleotide-binding 29784 3O428 3 645 protein P4.or glpR transcriptional regulator 29699 28854 -3 846 P4.or phnV ABC transporter permease protein 30599 3.1381 2 783 P4.or potA ABC transporter ATP-binding protein 31392 32417 3 1026 P4.or spermidine putrescine transport system 32495 33592 2 1098 Substrate-binding protein P4.or potB Spermidine, putrescine transport system 33705 34526 3 822 permease protein P4.or siaP TRAP-type bacterial extracellular solute 35753 34626 -3 1128 binding protein P4.or sia.T TrapT family protein 37113 35791 -1 1323 P4.or Tripartite 
ATP-independent periplasmic 37598 37110 -3 489
P4.or glpR putative transcriptional regulator 38445 37690 -1 756
P4.or oxidoreductase, FAD-binding protein 38634 40082 3 1449


LOCS Gene Product Start End Strand Length P4.or OOXA Opine oxidase subunit A 401(OO 41329 2 1230 P4.or glpK glycerol kinase 41326 42756 1 1431 P4.or conserved hypothetical protein 436O7 42732 -3 876 P4.or hypothetical protein 4363S 4388O 3 246 P4.or citE Citrate lyase 44886 439.18 -1 969 P4.or buuR XRE family-like protein 45473 44883 -3 591 P4.or art ABC-type amino acid transport 45870 46616 3 747 P4.or glnM amino acid ABC transporter permease protein 46777 47400 1 624 P4.or cyC amino acid ABC transporter ATP-binding 47397 481.28 3 732 protein P4.or buuB FAD dependent oxidoreductase 48489 49658 3 1170 P4.or putative bleomycin resistance protein SOO18 49659 -3 360 P4.or ABC transporter related protein SO845 SOO63 -2 783 P4.or pyoverdine biosynthesis protein PvdE 52485 SO845 1641 P4.or FAD-binding 9 siderophore-interacting S3240 52482 -3 759 domain-containing protein P4.or 051 transport system permease protein 55216 53237 -2 1980 P4.or 052 iron complex transport system substrate S6103 55213 891 binding protein P4.or 053 oxA putative TonB-dependent receptor S8628 S6106 -3 2523 P4.or OS4 ecR FecR protein 59723 58.737 -3 987 P4.or 055 sigW DNA-directed RNA polymerase specialized 60330 59815 S16 sigma Subunit P4.or conserved hypothetical protein 60860 60522 -3 339 P4.or conserved hypothetical protein 61234 62583 350 P4.or mrp protein 63861 626SO 212 P4.or HflK protein 64296 65366 3 O71 P4.or membrane protease subunit HfiC 65385 66287 3 903 P4.or conserved hypothetical protein 66495 66776 3 282 P4.or conserved hypothetical protein 67990 68175 186 P4.or degP Peptidase S1C, Do 68.267 697.39 2 473 P4.or tas aldo, keto reductase 67819 66809 -2 O11 P4.or phosphoserine phosphatase SerB 70742 6982S -3 918 P4.or mia.A tRNA delta(2)-isopentenylpyrophosphate 70814 71782 2 969 transferase P4.or ilw acetolactate synthase, large subunit 72021 73772 3 752 P4.or w acetolactate synthase 3 regulatory Subunit 73882 74418 1 537 P4.or ilvC ketol-acid reductoisomerase 7448O 756O7 2 128 P4.or 
ribonuclease BN 75627 76763 3 137 P4.or conserved hypothetical protein 77386 76790 -2 597 P4.or conserved hypothetical protein 77913 77488 -1 426 dnaA chromosomal replication initiation protein 1578 1 578 dnaN DNA polymerase III subunit beta 1786 2907 1 122 TM.or rect recombination protein F 2904 4169 2 266 TM.or DNA gyrase subunit B 4183 6687 1 2505 TM.or putative transcription regulator protein 6772 6981 1 210 TM.or conserved hypothetical protein 6978 7244 2 267 TM.or ygiD Catalytic LigB Subunit of aromatic ring-opening 8093 7278 -2 816 dioxygenase TM.or pbpG penicillin binding protein 8240 10417 3 2178 TM.or bcr drug resistance transporter, BcriCfA subfamily 11649 10429 -1 1221 TM.or citA Citrate (Si)-synthase 121.69 13266 1 1098 TM.or conserved hypothetical protein 14238 13351 -1 888 TM.or 2-amino-3-carboxymuconate-6-semialdehyde 15546 14329 -1 1218 decarboxylase TM.or O13 leuA pyruvate carboxyltransferase 16189 17394 1 12O6 TM.or O14 PcO6g.00060 17400 18092 2 693 TM.or O15 CADD protein 182O3 18883 3 681 TM.or O16 potC binding-protein-dependent transport systems 1973O 18945 -2 786 inner membrane component potH binding-protein-dependent transport systems 20912 19737 -2 1176 inner membrane component potA putative spermidine? putrescine ABC 21979 20909 -3 1071 transporter, ATP-binding protein spermidine putrescine-binding periplasmic 23147 22092 -2 1056 protein TM.or ABC-2 type transporter 24.256 23381 -3 876 TM.or nodI ABC transporter related protein 25005 242S3 -1 753 TM.or conserved hypothetical protein 25155 25655 2 5O1 TM.or O23 regB two-component sensor histidine kinase 25652 27001 3 1350 TM.or O24 regA photosynthetic apparatus regulatory protein 2.7055 27645 1 591 RegA US 2014/0296.161 A1 Oct. 2, 2014 56


Locus Gene Product Start End Strand Length TM.orf)025 yneP thioesterase family protein 27784 283O8 525 TM.orf)026 mprA two component transcriptional regulator, 29239 2832S -3 915 winged helix family TM.orf)027 argD acetylornithine and Succinylornithine 29533 30732 1200 aminotransferase TM.orf)028 argF ornithine carbamoyltransferase 30729 31.256 2 528 TM.orf)029 argF ornithine carbamoyltransferase 31.267 31704 438 TM.orf)030 hsO molecular chaperone Hsp33 31.701 32687 2 987 TM.orf)O31 Acyl-CoA synthetases (AMP-forming). AMP- 32743 33.399 657 acid II TM.orf)032 gy IR clR family regulatory protein 33.543 34448 2 906 TM.orf)033 menC N-acylamino acid racemase 35.583 34474 1110 TM.orf)O34 beta-lactamase 37011 35704 1308 TM.orf)035 nudF adp-ribose pyrophosphatase 3762O 371.56 465 TM.orf)O36 TetR family transcriptional regulator 38271 37663 609 TM.orf)037 crp transcriptional regulator 388OS 3948S 2 681 TM.orf)O38 fixR halohydrin epoxidase A 39570 40316 2 747 TM.orf)O39 hypothetical protein 4O712 40S12 -2 2O1 TM.orf)040 Secretion activator protein 41274 4O729 S46 TM.orf)041 hypothetical protein 41.399 41271 -2 129 TM.orf)042 putative tail fiber protein 43942 41588 -3 2355 TM.orf)O43 conserved hypothetical protein 46922 43947 -2 2976 TM.orf)044 conserved hypothetical protein 492O1 46919 -3 2283 TM.orf)O45 conserved hypothetical protein S2092 49198 -1 2895 TM.orf)046 hypothetical protein 52646 S2092 -2 555 TM.orf)047 conserved hypothetical protein S458O 52646 -3 1935 TM.orf)048 conserved hypothetical protein 55198 S4584 -3 615 TM.orf)049 hypothetical protein 5552O 55203 -2 3.18 TM.orf)OSO conserved hypothetical protein 564.04 55.535 -3 870 TM.orf)051 hypothetical protein 57586 56708 -3 879 TM.orf)052 hypothetical protein 57893 576.06 -2 288 TM.Orf)053 conserved hypothetical protein 59500 57890 -3 1611 TM.orf)054 conserved hypothetical protein 60285 59713 -1 573 TM.orf)OSS conserved hypothetical protein 61524 6O295 -1 1230 TM.orf)0S6 conserved hypothetical protein 62.297 61770 
-2 528 TM.orf)057 DNA primase 64465 62915 -3 1551 TM.orf)058 conserved hypothetical protein 65855 64749 -2 1107 TM.orf)059 hypothetical protein 65984 66289 3 306 TM.orf)060 hypothetical protein 66645 66322 324 TM.orf)061 conserved hypothetical protein 66920 66645 -2 276 TM.orf)062 hypothetical protein 67428 67204 225 TM.orf)063 conserved hypothetical protein 67775 67425 -2 351 TM.orf)064 hypothetical protein 684.54 68.194 261 TM.orf)06S conserved hypothetical protein 68687 68451 -2 237 TM.orf)066 hypothetical protein 69642 698.63 2 222 TM.orf)067 hypothetical protein 69856 70269 414 TM.orf)068 conserved hypothetical protein 7O675 71391 717 TM.orf)069 hypothetical protein 71388 71807 2 420 TM.orf)070 hypothetical protein 71804 72O85 3 282 TM.orf)071 hypothetical protein 72105 72434 2 330 TM.orf)072 hypothetical protein 72421 72678 258 TM.orf)073 hypothetical protein 72678 72995 2 3.18 TM.orf)074 phage integrase family protein 72995 742O3 3 1209 TM.orf)075 hypothetical protein 74694 74437 258 TM.orf)076 hypothetical protein 75004 75219 216 TM.orf)077 hypothetical protein 75230 75640 3 411 TM.orf)078 conserved hypothetical protein 7S646 75903 258 TM.orf)079 L-lactate bermease 77675 76167 -2 1509 TM.orf)08O hypothetical protein 77831 793OO 3 1470 TM.orf)081 fil umarate hydratase, class II 8O821 794.15 -3 1407 TM.orf)082 HemY protein 82392 80995 1398 TM.orf)083 conserved hypothetical protein 83888 82389 -2 1SOO TM.orf)084 hemD uroporphyrinogen III synthase HEM4 84784 840O8 -3 777 TM.orf)085 HEMC Porphobilinogen deaminase, chloroplastic 85834 8484.5 -3 990 TM.orf)086 gcp O-Sialoglycoprotein endopeptidase 85940 87097 3 1158 TM.orf)097 conserved hypothetical protein 88.155 87109 -1 1047 TM.orf)088 conserved hypothetical protein 89.324 881.52 -2 1173 TM.orf)089 conserved hypothetical protein 90S12 89349 -2 1164 TM.orf)090 transmembrane protein 91469 90630 -2 840 TM.orf)091 NmirA-like protein 92583 916OO -1 984 TM.orf)092 yeil YCII-related protein 92841 93.125 2 285 
TM.orf)093 Thymocyte nuclear protein 93147 93566 2 420
TM.orf)094 petC Rieske 93568 93927 1 360
TM.orf)095 acsA acetyl-coenzyme A synthetase 94040 95998 3 1959


Locus Gene Product Start End Strand Length O96 GMC oxidoreductase 97714 96.806 -3 909 O97 hypothetical protein 98517 98398 -1 120 TM.or O98 conserved hypothetical protein 99271 98888 -3 384 TM.or O99 conserved hypothetical protein 996OS 99.426 -2 18O TM.or OO htpX heat shock protein HtpX 99762 OO673 2 912 TM.or O1 rSmB Sun protein OO681 O2306 1 1626 TM.or O2 ribulose-phosphate 3-epimerase O2358 O3O83 1 726 TM.or O3 conserved hypothetical protein O3162 OS234 1 2O73 TM.or O4 rf2 FeS assembly SUF system regulator O5385 OS822 1 438 TM.or 05 FeS assembly protein SufB OS949 O7394 1 1446 TM.or O6 suf FeS assembly ATPase SufC O7488 O8246 1 759 TM.or O7 FeS assembly protein Suf) O82S1 O9600 3 1350 TM.or O8 cysteine desulfurase, SufS subfamily O9597 10844 1 1248 TM.or 09 Nifl J family SUF system FeS assembly protein 10856 11359 2 SO4 TM.or 10 conserved hypothetical protein 11435 11776 2 342 TM.or 11 conserved hypothetical protein 11898 12584 1 687 TM.or 12 transcriptional regulator, MarR family 1266S 13168 2 SO4 TM.or 13 conserved hypothetical protein 13153 13515 3 363 TM.or 14 transcriptional regulatory protein 14.188 13553 -2 636 TM.or 15 choline? 
carnitine/betaine transporter 16182 142O6 -3 1977 TM.or 16 hypothetical protein 16447 16590 3 144 TM.or 17 ybbK conserved hypothetical protein 17109 16603 -3 507 TM.or 18 meaA methylmalonyl-CoA mutase 19139 17109 2O31 TM.or 19 crotonyl-CoA reductase 1956.1 2O847 3 1287 TM.or 2O ying Acyl-CoA dehydrogenase 21098 22777 2 1680 TM.or 21 hypothetical protein 22896 23093 198 TM.or 22 ATP: cob(I)alamin adenosyltransferase 23.188 23763 3 576 TM.or 23 etfB Electron transfer flavoprotein, beta subunit 24053 248O2 2 750 TM.or 24 etfA electron transfer flavoprotein alpha subunit 248OS 25737 3 933 TM.or 25 hbdA 3-hydroxyacyl-CoA dehydrogenase 25872 26744 873 TM.or 26 Thiol: disulfide interchange protein TlpA 27403 26807 -2 597 TM.Or 27 argH argininoSuccinate lyase 27533 28948 2 1416 TM.or 28 lySA diaminopimelate decarboxylase 291.93 3O452 1260 TM.or 29 hypoxanthine phosphoribosyltransferase 31087 30545 -2 543 TM.or 30 conserved hypothetical protein 31816 31157 -2 660 TM.or 31 ftSE putative ATPase involved in cell division 32046 32810 765 TM.or 32 cell division protein 32807 33694 2 888 TM.or 33 conserved hypothetical protein 33763 34383 3 621 TM.or 34 phospholipid glycerol acyltransferase 3438O 3S102 723 TM.or 35 yig hioesterase Superfamily protein 35712 35278 -3 435 TM.or 36 hioesterase Superfamily protein 361.97 35709 489 TM.or 37 chaC Cation transport protein chaC 36860 36246 615 TM.or 38 conserved hypothetical protein 37089 38.096 1008 TM.or 39 tyrC cyclohexadienyl dehydrogenase 39029 381.33 897 TM.or 40 hisC histidinol-phosphate aminotransferase 40208 39108 1101 TM.or 41 phe A chorismate mutase 41267 40347 921 TM.or 42 metX homoserine O-acetyltransferase 41696 42784 2 1089 TM.or 43 putative methionine biosynthesis protein 42781 43473 3 693 (MetW) TM.or 44 pleC non-motile and phage-resistance protein 44628 43483 -3 1146 TM.or 45 TRAP dicarboxylate transporter, DctP subunit 45134 46123 2 990 TM.or 46 tripartite ATP-independent periplasmic 46223 46735 2 513 transporter DctO 
TM.or 47 TRAP dicarboxylate transporter, DctM subunit 46752 48O17 1 1266 TM.or 48 SAM-dependent methyltransferases 4876S 47989 -3 777 TM.or 49 gloB metallo-beta-lactamase family protein 48.96S 49744 2 780 TM.or 50 Glutathione S-transferase 49898 50503 2 606 TM.or 51 hypothetical protein 50954 SOS44 -1 411 TM.or 52 conserved hypothetical protein 51812 S1060 -1 753 TM.or 53 phbB acetoacetyl-CoA reductase 52634 S1909 -1 726 TM.or S4 phaA acetyl-CoA acetyltransferase 53993 S2821 -1 1173 TM.or 55 phbC polyhydroxyalkanoate synthase SS4O1 S4136 -3 1266 TM.or 56 polyhydroxyalkanoate synthesis repressor 56093 56713 2 621 PhaR TM.or 57 conserved hypothetical protein 57735 S6914 -3 822 TM.or 58 arcB PAS/PAC sensor hybrid histidine kinase 61874 57891 -1 3984 TM.or 59 hypothetical protein 621.27 62324 198 TM.or 60 ABC Superfamily ATP binding cassette 62639 63628 990 transporter, binding protein 61 ABC-type transport system, permease 64038 64925 1 888 component TM.or 62 Cobalt import ATP-binding protein chiO2 64944 65.738 1 795 63 NUDIX hydrolase 6S845 66438 3 594 64 NHL repeat-containing protein 666.21 68297 1 1677 US 2014/0296.161 A1 Oct. 2, 2014 58


Locus Gene Product Start End Strand Length TM.orf)165 ispZ. intracellular septation protein 68936 68310 -1 627 TM.orf)166 ftsy Signal recognition particle GTPase 69885 68941 -3 945 TM.orf) 167 MiaB-like tRNA modifying enzyme 71300 69882 -1 1419 TM.orf)168 dapF diaminopimelate epimerase 72243 71332 -3 912 TM.orf)169 divL sensor protein divL 74357 72282 -1 2O76 TM.orf) 17O acy S-adenosyl-L-homocysteine hydrolase 75792 74497 -3 1296 TM.orf)171 ptsH phosphocarrier protein HPr 76294 75995 -2 300 TM.orf)172 manX PTS system, IIA component 76722 76315 -3 408 TM.orf)173 3448 RHORT RecName: Full = UPF0042 77782 76838 -2 945 nucleotide-binding protein Rru A3448 TM.orf)174 hprK HPrkinase 78240 77779 -3 462 TM.orf)175 ptsN phosphotransferase system mannitol fructose- 7876O 78.296 -2 465 specific IIA domain-containing protein TM.orf) 176 sigma 54 modulation protein 79526 78927 600 TM.orf)177 RNA polymerase factor sigma-54 81269 79695 1575 TM.orf)178 ABC transporter, ATP-binding protein 82O8O 81.28O -3 8O1 TM.orf)179 conserved hypothetical protein 82789 82148 -2 642 TM.orf) 18O lipopolysaccharide export system protein LiptC 83433 82804 -3 630 TM.orf) 181 Predicted Sugar phosphate isomerase 84530 83538 993 involved in capsule formation TM.orf)182 rind 3'-5' exonuclease 85288 84677 -2 612 TM.orf)183 NADH-ubiquinone oxidoreductase 39 kDa 85536 86507 972 Subunit precursor TM.orf)184 gltD putative oxidoreductase 871.93 88.638 3 1446 TM.orf) 185 gltB glutamate synthase(NADPH) large subunit 88662 93.212 4SS1 TM.orf)186 godha Glu/Leu/Phe(Valdehydrogenase 93479 947.59 2 1281 TM.orf)187 RarD protein 94918 95850 3 933 TM.orf)188 glutathione S-transferase 95862 96530 669 TM.orf)189 yeS putative iron-sulfur cluster binding protein 96S11 97725 3 1215 TM.orf) 190 yeeZ NAD-dependent epimerase? 
dehydratase 97722 98.687 966 TM.orf)191 ABC transporter, ATP-binding protein 200274 98.694 1581 TM.of)192 hydrolase 2O1221 2004O3 -2 819 TM.orf) 193 Glycosyltransferase 2O1550 202644 109S TM.orf)194 rfaF ADP-heptose of LPSheptosyltransferase 2O2641 2O3612 2 972 TM.orf) 195 inf translation initiation factor IF-3 2O3868 204446 2 579 TM.orf)196 rpm.I 50S ribosomal protein L35 2O471S 2O4912 198 TM.orf)197 rplT ribosomal protein L20 2O4999 205367 2 369 TM.orf) 198 pheS phenylalanyl-tRNA synthetase alpha chain 2O5622 2O6701 3 1080 TM.orf) 199 pheT phenylalanyl-tRNA synthetase beta chain 2O6718 209126 2 2409 TM.orf)200 cyaA putative adenylate cyclase 2O9411 210817 3 1407 TM.orf)201 lepA GTP-binding protein 210967 212766 1 1800 TM.orf)2O2 hypothetical protein 213156 212833 -1 324 TM.orf)2O3 major facilitator Superfamily MFS 1 214904 213621 -2 1284 TM.orf)2O4 regulatory protein ArsR 21S2O6 214874 -3 333 TM.orf)205 icfA Putative carbonic anhydrase precursor 21618 215435 -3 747 TM.orf)206 yxbA conserved hypothetical protein 21638O 216661 3 282 TM.orf)2O7 GCN5-related N-acetyltransferase 216787 217368 582 TM.orf)2O8 conserved hypothetical protein 21766 217350 -2 312 TM.orf)209 microcystin-dependent protein 218298 2188SS 2 558 TM.orf)210 putative microcystin dependent protein 218.946 219497 2 552 TM.orf)211 tail collar domain-containing protein 21951 22OO71 S61 TM.orf)212 cytochrome B561 22O28 220988 2 708 TM.orf)213 cytochrome C 221508 221068 441 TM.orf)214 mcp4 Pmethyl-accepting chemotaxis 2221.75 2236O2 1428 receptor sensory transducer TM.orf)215 dadA D-amino acid dehydrogenase Small Subunit 22498 223 725 -2 1257 TM.orf)216 dadB putative alanine racemase, catabolic 226230 225.064 1167 TM.orf)217 Leucine-responsive regulatory protein 226382 226843 3 462 TM.orf)218 linC UcpA protein 226937 227698 3 762 TM.orf)219 ydeE transcriptional regulatory protein 22856 227701 861 TM.orf)220 conserved hypothetical protein 228712 229140 429 TM.orf)221 tris methyl-accepting chemotaxis 
protein 23087 229171 1701 TM.orf)222 ydhc drug resistance transporter, BcriCfA subfamily 2322O7 230993 -3 1215 TM.orf)223 short-chain dehydrogenase/reductase SDR 233060 232257 -2 804 TM.orf)224 2-oxoglutarate and iron-dependent oxygenase 23385 233171 -3 681 domain-containing protein 1 TM.orf)225 mmgC putative acyl-CoA dehydrogenase 23453 235685 2 1155 TM.orf)226 CfB Acyl-CoA synthetase/AMP-acid ligase II 23S682 237250 3 1569 TM.orf)227 Sam S-adenosylmethionine uptake transporter 238,234 237323 -3 912 TM.orf)228 CmI2 possible acetyltransferase 2388OO 238231 -1 570 TM.orf)229 chemotaxis sensory transducer 238,976 241060 3 2O85 TM.orf)230 ile:S Isoleucyl-tRNA synthetase 241.390 242SS3 1 1164 TM.orf)231 gev A transcriptional regulator 242S61 243562 3 10O2 TM.orf)232 GDSL-like lipase/acylhydrolase 244208 243579 -2 630 US 2014/0296.161 A1 Oct. 2, 2014 59


Locus Gene Product Start End Strand Length TM.orf)233 intCA cAMP-binding protein - catabolite protein 244888 2442OS -3 684 activator and regulatory subunit of cAMP dependent protein kinase TM.orf)234 conserved hypothetical protein 2450S6 245658 1 603 TM.orf)235 class II aldolase/adducin family protein 245663 246508 3 846 TM.orf)236 Carboxymethylenebutenolidase 246645 24.7532 2 888 TM.orf)237 Monofunctional biosynthetic peptidoglycan 2482OO 24.7553 -3 648 transglycosylase TM.orf)238 Extracellular serine protease 248.526 2552OO 2 6675 TM.orf)239 aromatic compounds catabolic protein 255761 2SS240 -2 522 TM.orf240 mhdR MarR family transcriptional regulator 255862 2S629O 1 429 TM.orf)241 putative lysine decarboxylase 256878 256297 -1 582 TM.orf)242 dctP putative C4 dicarboxylate binding protein 2572O3 2S8.192 1 990 TM.orf)243 SiaT Tripartite ATP-independent periplasmic 258225 258797 2 573 transporter DctO component TM.orf)244 siaT TRAP dicarboxylate transporter, DctM subunit 258794 260O89 3 1296 TM.orf)245 conserved hypothetical protein 260423 26O172 -2 252 TM.orf)246 gcv A LysR family transcriptional regulator 260662 26.1588 1 927 TM.orf)247 conserved hypothetical protein 261623 262S4O 3 918 TM.orf)248 cysA Sulfate ABC transporter, ATPase subunit 262652 263767 3 1116 TM.orf)249 yhaZ conserved hypothetical protein 264557 2638O8 -2 750 TM.orf)250 yetL regulatory protein, MarR 26SO45 264554 -3 492 TM.orf)2S1 bchO ethyl ferulate-hydrolyzing esterase 26.6038 26SO88 -3 951 TM.orf)2S2 FAD dependent oxidoreductase 267555 26.6038 -1 1518 TM.orf)2S3 TetR family transcriptional regulator 267715 268.386 1 672 TM.orf)254 ybaN nner membrane protein ybaN 268811 268383 -2 429 TM.orf)255 kata catalase 27OSO4 26902O -1 1485 TM.orf)2S6 2-polyprenyl-6-methoxyphenol hydroxylase 271822 27O644 -3 1179 TM.orf)257 Major facilitator Superfamily domain-containing 273069 271819 -1 1251 protein 3 TM.orf)258 fyuA TonB-dependent receptor 275101 273074 -3 2028 TM.Orf)259 chR transcriptional 
regulator, arac family 275,300 276286 3 987 TM.orf)260 cation diffusion facilitator family transporter 277241. 276246 -2 996 TM.orf)261 tonP O-methyltransferase domain-containing 2781.63 277291 -1 873 protein TM.orf)262 oxyR hydrogen peroxide-inducible genes activator 2791.96 278279 -3 918 TM.orf)263 hypothetical protein 27.9330 279716 2 387 TM.orf)264 proS prolyl-tRNA synthetase 279842 281365 3 1524 TM.orf)26S conserved hypothetical protein 281575 282O3O 1 456 TM.orf)266 zinc-finger protein 2821.78 282603 1 426 TM.orf)267 ohrR transcriptional regulator, MarR family 282856 2833O2 1 447 TM.orf)268 ydhc transcriptional regulator 284OSO 283334 -3 717 TM.orf)269 class II aldolase/adducin family protein 284230 285009 1 780 TM.orf)270 gsiB ABC transporter substrate binding protein 28S 156 286703 2 1548 TM.orf)271 appB oligopeptide ABC transporter 28682S 2878O2 1 978 TM.orf)272 appC ABC transporter permease protein 287799 288638 2 840 TM.orf)273 gsiA ABC transporter ATP-binding protein 2886SO 29O3OS 3 1656 TM.orf)274 ghra D-isomer specific 2-hydroxyacid 2903O2 2.91273 1 972 dehydrogenase NAD-binding TM.orf)275 aph A histone deacetylase family protein 291.313 292341 1 1029 TM.orf)276 aminotransferase 292397 293764 3 1368 TM.orf)277 exoT putative Succinoglycan transport protein 294288 295754 2 1467 TM.orf)278 conserved hypothetical protein 296735 297.058 3 324 TM.orf)279 hypothetical protein 2973 14 297114 -2 2O1 TM.orf)28O conserved hypothetical protein 298793 298524 -2 270 TM.orf)281 lutR transcriptional regulator-like 299671 298.943 -3 729 TM.orf)282 TRAP dicarboxylate transporter, DctO subunit 3OO331 299.783 -3 549 TM.orf)283 C4-dicarboxylate transporter 3O1644 30O337 -1 1308 TM.orf)284 detB possible TrapT family, dctP subunit, C4- 3O2889 3O1747 -1 1143 dicarboxylate periplasmic binding protein TM.orf)285 aminoglycoside phosphotransferase 304023 3O2923 -1 1101 TM.orf)286 crt enoyl-CoA hydratase 304211 305059 3 849 TM.orf)287 potI binding-protein dependent transport 
system 3.06353 305532 -2 822 inner membrane protein TM.orf)288 potH binding-protein dependent transport system 307310 306372 -2 939 inner membrane protein TM.orf)289 potG putrescine transport ATP-binding protein PotC 3O8592 307378 -1 1215 TM.orf)290 potF PotF 3O9829 3O8690 -3 1140 TM.orf)291 puuA glutamine synthetase 3.11504 31 OO74 -2 1431 TM.orf)292 dippF putative oligopeptide ABC transporter (ATP 313018 31, 1990 -3 1029 binding protein) TM.orf)293 applD oligopeptide? dipeptide ABC transporter, 314052 31301S -1 1038 ATPase subunit US 2014/0296.161 A1 Oct. 2, 2014 60


Locus Gene Product Start End Strand Length 294 dppC Dipeptide transport system permease protein 31SOO2 314124 -2 879 dppC 295 binding-protein-dependent transport systems 315975 315019 -1 957 inner membrane component TM.or 296 hbpA extracellular solute-binding protein 317578 316088 -3 1491 TM.or 297 peptidase S58 DmpA 318724 317585 -3 1140 TM.or 298 2-hydroxy-3-oxopropionate reductase 3.19943 318963 -2 981 TM.or 299 pyridoxamine 5'-phosphate oxidase family 320O88 32O702 2 615 protein TM.or 3OO conserved hypothetical protein 321,132 321632 2 5O1 TM.or 301 proC pyrroline-5-carboxylate reductase 321639 322481 2 843 TM.or 3O2 conserved hypothetical protein 322663 323.526 1 864 TM.or 303 Carboxymethylenebutenolidase 324313 323,603 -3 711 TM.or 3O4 conserved hypothetical protein 3246.11 325,759 3 1149 TM.or 305 conserved hypothetical protein 325814 326329 3 S16 TM.or 306 engD Predicted GTPase, probable translation factor 327522 326422 -1 1101 TM.or 307 peptidyl-tRNA hydrolase 328.177 3275.27 -3 651 TM.or 3O8 rp|Y 50S ribosomal protein L25/general stress 328797 3281.83 -1 615 protein Ctc TM.or 309 prs ribose-phosphate pyrophosphokinase 329931 328.942 -1 990 TM.or 310 conserved hypothetical protein 330683 330189 -2 495 TM.or 311 ade adenine deaminase 332662 330908 -3 1755 TM.or 312 conserved hypothetical protein 33.3599 332781 -2 819 TM.or 313 midA conserved hypothetical protein 3348O1 333623 -3 1179 TM.or 314 Prolipoprotein diacylglyceryl transferase 335616 334798 -1 819 TM.or 315 protein conserved in bacteria 335984 336340 3 357 TM.or 316 putative membrane protein 336346 33.7263 1 918 TM.or 317 sdpR transcriptional regulator, Arsk family 337414 338223 1 810 TM.or 318 rpmB ribosomal protein L28 338503 338814 1 312 TM.or 319 conserved hypothetical protein 339013 33956.1 1 549 TM.or 32O hyfR putative PAS/PAC sensor protein 341 SOS 339562 -1 1944 TM.Or 321 nrtC putative nitrate transporter component, nrtA 341772 343175 2 1404 TM.or 322 nritB possible nitrate transport system 
permease 3432OO 344030 2 831 protein 323 cmpD nitrate ABC transporter, ATPase subunits C 344049 34.4954 2 906 and D 364 putative transmembrane efflux precursor 38.216S 383358 1 194 protein TM.or 365 conserved hypothetical protein 38341S 383909 2 495 TM.or 366 conserved hypothetical protein 38.3926 385074 1 149 TM.or 367 52.8 kDa protein in TAR-IttuC'3'region 386612 38508O -2 533 TM.or 368 Ericarboxylic transport 387,097 386630 -3 468 TM.or 369 yfiP conserved hypothetical protein 388109 3871O2 -2 O08 TM.or 370 tctD two component transcriptional regulator, 388416 38.9090 2 675 winged helix family TM.or 371 hypothetical protein 388417 3882S3 -3 16S TM.or 372 histidine kinase 3891 O1 39.0489 1 389 TM.or 373 extracellular solute-binding protein family 1 39.0486 391586 2 101 TM.or 374 ubiG 3-demethylubiquinone-9 3-methyltransferase 392,398 3916O7 -3 792 TM.or 375 lysC aspartate kinase 392S8O 393800 2 221 TM.or 376 ptsP Signal transduction protein containing GAF 393993 396.263 2 2271 and PtsIdomains TM.or 377 conserved hypothetical protein 396535 397770 1 236 TM.or 378 ispG 4-hydroxy-3-methylbut-2-en-1-yl diphosphate 397921 399.060 1 140 synthase TM.or 379 hiss Histidyl-tRNA synthetase, class IIa 40O370 2 245 TM.or 380 prfA peptide chain release factor RF-1 401469 1 O89 TM.or 381 hemik HemK family modification methylase 402362 2 897 TM.or 382 conserved hypothetical protein 403709 2 170 TM.or 383 mal MOSC domain containing protein 4O3811 -3 771 TM.or 384 clpB ATP-dependent Clp protease ATP-binding 3 2598 Subunit TM.or 385 yebA Peptidase M23B 408960 4O7533 -1 428 TM.or 386 outer membrane autotransporter barrel 41SSO6 418865 2 3360 domain-containing protein 387 glpR glycerol-3-phosphate transcriptional regulator 418973 419752 3 780 protein TM.or 388 CO Multicopper oxidase family 421157 419772 -2 1386 TM.or 389 pchR PcR 421349 422161 3 813 TM.or 390 fhuA outer membrane ferripyochelin receptor 422306 424453 3 2148 TM.or 391 membrane protein 424.458 42572O 2 1263 TM.or 392 
PepSY-associated TM helix domain protein 425720 426814 3 1095
TM.or 393 (Acyl-carrier protein) phosphodiesterase 427469 426789 -2 681
TM.or 394 gcvA LysR family transcriptional regulator 427576 428472 1 897


Locus Gene Product Start End Strand Length 395 LysR family transcriptional regulator 428469 429377 2 909 396 conserved hypothetical protein 429.458 430402 3 945 TM.or 397 ordL FAD dependent oxidoreductase 430538 431851 3 1314 TM.or 398 TRAP transporter, 4TM/12TM fusion protein 43.3867 431906 -3 1962 TM.or 399 TRAP transporter solute receptor TAXI family 435070 434003 -3 1068 protein TM.or 400 ybiO MscS Mechanosensitive ion channel 43S428 438028 3 26O1 TM.or 4O1 yadH Inner membrane transport permease yadH 4381.83 438.971 2 789 TM.or 4O2 conserved hypothetical protein 439111 439530 420 TM.or 403 ftSK cell divisionFtsK/SpoIIE 4395.27 44.1980 2 2454 TM.or 404 Glutathione peroxidase 442138 442779 642 TM.or 40S conserved hypothetical protein 443279 442806 -2 474 TM.or 4O6 histone deacetylase 4441.78 443276 -3 903 TM.or 407 Outer membrane protein and related 444712 445557 846 peptidoglycan-associated lipoprotein TM.or 4.08 asma ASmA protein 4458.10 447792 983 TM.or 409 conserved hypothetical protein 447945 44823S 2 291 TM.or 410 hupC cytochrome B561 448239 44-8796 2 558 TM.or 411 on transport 2 domain protein 448999 44.9613 615 TM.or 412 yajO putative Oxidoreductase 450653 4496.28 -2 O26 TM.or 413 protein tyrosine phosphatase 4512S4 4SO781 474 TM.or 414 ppc phosphoenolpyruvate carboxylase 451412 4S42O7 3 2796 TM.or 415 dgdR transcriptional regulator, LysR family 454.237 455127 891 TM.or 416 indhE putative NADH-quinone oxidoreductase 45S220 45.6689 2 470 subunit 5 TM.or 417 conserved hypothetical protein 456682 4591 S6 2475 TM.or 418 nitrogen regulatory protein P-II 4592O3 459559 3 357 TM.or 419 transcriptional regulator 460693 459683 -3 O11 TM.or 420 fruB phosphotransferase system, enzyme I 460971 463514 2 2544 TM.or 421 fruK -phosphofructokinase 463S11 464491 3 981 TM.or 422 fruA PTS system, fructose-specific IIC component 464526 466307 2 782 TM.Or 423 ywoC Amidases related to nicotinamidase 467048 466344 -2 705 TM.or 424 foxA TonB-dependent siderophore receptor 467333 
4694O2 3 2070 TM.or 425 bio A adenosylmethionine-8-amino-7-Oxononanoate 471910 470576 -3 335 transaminase TM.or 426 bioD ithiobiotin synthetase 472545 471907 -1 639 TM.or 427 8-amino-7-oxononanoate synthase BioF 473678 472542 -2 137 TM.or 428 bioB biotin synthase 474574 473675 -3 900 TM.or 429 transcriptional regulatory protein 474814 475455 1 642 TM.or 430 oxidoreductase alpha (molybdopterin) subunit 475,529 4778.59 3 2331 TM.or 431 gno dehydrogenase 478626 477841 -1 786 TM.or 432 conserved hypothetical protein 478688 48O127 3 1440 TM.or 433 potA ABC transporter related protein 480236 48.1363 3 1128 TM.or 434 ABC transporter, periplasmic solute-binding 481456 482.583 1 1128 protein 435 potB binding-protein-dependent transport systems 48268O 48,3660 1 981 inner membrane componen 436 potC binding-protein-dependent transport systems 483.657 484.469 2 813 inner membrane componen TM.or 437 histidinol dehydrogenase 484499 485821 3 1323 TM.or 438 transcriptional regulator, AraC family protein 486830 48586S -2 966 TM.or 439 outer membrane protein W precursor 487875 487210 -1 666 TM.or 440 oxygen-independent coproporphyrinogen III 488.18O 489589 3 1410 oxidase TM.or 441 haloalkane dehalogenase 490541 489624 -2 918 TM.or 442 arginine exporter protein ArgO 491.285 49.0656 -2 630 TM.or 443 glycerol-3-phosphate dehydrogenase 491733 49326S 2 1533 TM.or 444 glycerol kinase 493445 494989 3 1545 TM.or 445 TPR repeat-containing protein 495112 495.654 1 543 TM.or 446 conserved hypothetical protein 495786 496427 2 642 TM.or 447 hypothetical protein 496544 496660 3 117 TM.or 448 potassium-transporting ATPase subunit A 496804 498.504 1 1701 TM.or 449 potassium-translocating P-type ATPase B 498519 500567 2 2049 Subunit TM.or 450 potassium-transporting ATPase subunit C 500,585 501175 3 591 TM.or 451 two-component system, Ompk family, sensor SO1204 503951 2 2748 histidine kinase TM.or 452 two-component system, response regulator 503.973 SO4704 2 732 TM.or 453 conserved hypothetical 
protein 505112 504765 -2 348
TM.or 454 nitrile hydratase beta subunit 505831 505157 -3 675
TM.or 455 nitrile hydratase alpha subunit 506502 505828 -1 675
TM.or 456 gcvA transcriptional regulator 507496 506543 -3 954
TM.or 457 hypothetical protein 508111 507761 -3 351
TM.or 458 globin 508687 508247 -3 441


Locus Gene Product Start End Strand Length TM.orf)459 ErfKFYbiSYCfS.YG 508807 SO9469 1 663 TM.orf)460 putative lipoprotein SO9478 510302 2 825 TM.orf)461 arSC protein tyrosine phosphatase S10431 S10949 3 519 TM.orf)462 pdhR GntR domain-containing protein S11748 510972 -2 777 TM.orf)463 FMN-dependent alpha-hydroxy acid 511872 S13029 2 1158 dehydrogenase TM.orf)464 isocitrate lyase and phosphorylmutase 513,155 S14018 3 864 TM.orf)46S hypothetical protein S1401S 514371 1 357 TM.orf)466 ABC transporter substrate-binding protein 514469 515752 3 1284 TM.orf)467 conserved hypothetical protein 517165 S16215 -3 951 TM.orf)468 peptidase 5172O1 517578 1 378 TM.orf)469 qseB two-component transcriptional regulator FeuP. 517575 S18240 2 666 winged helix family TM.orf)470 phoQ two-component sensor histidine kinase 518237 S19682 3 1446 TM.orf)471 iorE Twin-arginine translocation pathway signal S21914 519695 -3 2220 TM.orf)472 ior A putative aldehyde dehydrogenase subunit III 52.2370 S21918 -3 453 TM.orf)473 transcriptional regulator, AraC family 522591 S23S44 2 954 TM.orf)474 cfA putative ligase 5251.13 S23S48 -2 1566 TM.orf)475 Extracellular ligand-binding receptor 5264OO 525171 -2 1230 TM.orf)476 petP transcriptional regulator 526590 527138 2 549 TM.orf)477 conserved hypothetical protein 52.7847 S27146 702 TM.orf)478 panc pantoate--beta-alanine ligase 528817 527900 -3 918 TM.orf)479 hypothetical protein 529043 S3O3O8 3 1266 TM.orf)48O hypothetical protein 530716 S3O312 -3 40S TM.orf)481 ribH 6,7-dimethyl-8-ribityllumazine synthase S31291 530773 519 TM.orf)482 gmhB Histidinol phosphatase and related 531698 S32249 3 552 phosphatases TM.orf)483 paag enoyl-CoA hydratase S33O46 532252 795 TM.orf)484 radical SAM domain-containing protein 533444 534793 3 1350 TM.orf)485 bchE Radical SAM domain protein S34908 S36443 3 1536 TM.orf)486 AMP-dependent synthetase and ligase S36464 S38260 1797 TM.orf)487 ery A 538.257 545873 2 7617 TM.orf)488 hioesterase domain-containing protein S45926 
S46360 435 TM.orf)489 acpP acyl carrier protein S46431 S46682 3 252 TM.orf)490 fab 3-oxoacyl-acyl-carrier-protein synthase II S467OO S48O34 1335 (Beta-ketoacyl-ACP synthase II) (KAS II) TM.orf)491 phage tail Collar S48247 548765 2 519 TM.orf)492 Tail Collar domain protein 54.8785 S49333 549 TM.orf)493 phage tail Collar S49347 S4988O 3 534 TM.orf)494 Nitrilase/cyanide hydratase and apolipoprotein 551667 S49940 1728 N-acyltransferase TM.orf)495 braG putative branched-chain amino acid ABC SS2409 551714 -3 696 transporter, ATP-binding protein TM.orf)496 braF branched chain amino acid ABC transporter 553173 SS2409 765 ATP-binding protein TM.orf)497 IwV putative branched-chain amino acid ABC SS4282 553170 -2 113 transporter, permease protein TM.orf)498 bra) putative permease component of ABC 555.185 SS4292 -2 894 transporter TM.orf)499 amiC Aliphatic amidase expression-regulating 556.529 555276 -2 254 protein TM.orf)500 amiR two component response regulator 557586 556978 -1 609 TM.orf)5O1 amiC two component sensor kinase 558743 557613 -2 131 TM.orf)5O2 major facilitator transporter 559050 56O237 2 188 TM.orf)5O3 exoD exopolysaccharide synthesis, ExoD S60901 560248 -1 654 TM.orf)SO4 ABC-type transport system, periplasmic S61238 561855 1 618 component TM.orf)SOS membrane protein S61833 S62960 3 128 TM.orf)506 dokA Signal transduction histidine kinase S63O84 S64592 3 509 TM.orf)SO7 putative phosphohistidine phosphatase, Six A S64654 565271 2 618 TM.orf)5O8 critK TspO and MBR like protein 565318 565920 1 603 TM.orf)509 dhkA PAS fold family 565990 S67486 1 497 TM.orf)510 GCDH acyl-CoA dehydrogenase domain-containing 568.693 567503 -3 191 protein TM.orf)511 nitroreductase 568851 56.9519 2 669 TM.orf)512 Leu/Ile? 
Val-binding protein 570792 569566 -1 1227
TM.orf0513 CfB AMP-dependent synthetase and ligase 571032 572663 2 1632
TM.orf0514 ybcL putative efflux transporter 573873 572689 -1 1185
TM.orf0515 nahR transcriptional regulator, LysR family 574012 574938 1 927
TM.orf0516 ybaK ybaK/ebsC protein 575000 575491 3 492
TM.orf0517 Phospholipase/Carboxylesterase 576117 575497 -1 621
TM.orf0518 thioredoxin reductase 577220 576114 -2 1107
TM.orf0519 FMO5 monooxygenase 578828 577506 -2 1323
TM.orf0520 NAD-dependent epimerase/dehydratase 579816 578839 -1 978


Locus Gene Product Start End Strand Length TM.orf)521 acnR putative TetR-family transcriptional regulator 579957 S80628 2 672 TM.orf)522 acyl-CoA dehydrogenase domain protein S80669 S81832 1 1164 TM.orf)523 yfdE L-carnitine dehydratase/bile acid-inducible S81847 582971 2 112S protein F TM.orf)524 htd2 Hydroxyacyl-thioester dehydratase type 2 S82968 583837 3 870 TM.orf)525 yeiC cobalamin synthesis protein P47K 58.3974 585.140 2 1167 CH42 BURCC RecName: Full = GTP TM.orf)526 cyclohydrolase folE22 S85216 S86118 2 903 TM.orf)527 carbonic anhydrasefacetyltransferase S861S6 S86716 1 S61 isoleucine patch Superfamily TM.orf)528 thrS hreonyl-tRNA synthetase 586723 58.8711 1 1989 TM.orf)529 as dihydroorotase S88708 S90063 2 1356 TM.orf)530 Imra regulatory protein, TetR S90631 S90041 -1 591 TM.orf)531 pksS cytochrome P450 590886 S921.21 2 1236 TM.orf)532 Enoyl-CoA hydrataseisomerase 592172 S93O83 3 912 TM.orf)533 alanyl-tRNA synthetase-like protein S93148 593.792 2 645 TM.orf)534 hypothetical protein 593.789 S93926 3 138 TM.orf)535 metal dependent phosphohydrolase S94581 S93940 -2 642 TM.orf)536 AraC family transcriptional regulator S94749 595756 3 1008 TM.orf)537 phinC acetyltransferase protein 5961.83 595 719 -2 465 TM.orf).538 hiolase 597626 596370 -2 1257 TM.orf)539 pab para-aminobenzoate synthase 597796 S99931 1 2136 TM.orf)540 conserved hypothetical protein 6OOO61 600906 1 846 TM.orf)541 qor alcohol dehydrogenase 6O2007 601 01S -1 993 TM.orf)542 mauR transcriptional regulator protein 6O21SO 603052 3 903 TM.orf)543 cata putative dioxygenase 603116 6037O6 3 591 TM.orf)544 yeaM AraC family transcriptional regulator 6O4467 603676 -1 792 TM.orf)545 conserved hypothetical protein 604608 60SO18 2 411 TM.orf)546 ior A indolepyruvate ferredoxin oxidoreductase 6O7828 605153 -3 2676 subunit alpha/beta TM.orf)547 ADH4 putative iron-containing alcohol 6O9219 610397 2 1179 dehydrogenase TM.orf)548 yncG glutathione S-transferase domain-containing 61O399 6111OO 1 702 protein TM.orf)549 
thioesterase family protein 611097 611534 2 438 TM.orf)SSO conserved hypothetical protein 611534 611995 3 462 TM.orf)SS1 cfB putative AMP-dependent synthetase and 612106 613716 1 611 ligase TM.orf)SS2 chemotaxis transducer 613861 61 SS43 1 683 TM.orf)SS3 conserved hypothetical protein 61.64O1 615544 -1 858 TM.orf)554 acoA Thiamine pyrophosphate-dependent 616817 617848 3 O32 dehydrogenase, E1 component alpha Subunit TM.orf)SSS putative pyruvate dehydrogenase E1 beta 617854 618828 1 975 Subunit TM.orf)556 yiaO 2,3-diketo-L-gulonate-binding periplasmic 61.8991 62OO31 1 O41 protein yiaO TM.orf)557 yiaM Tripartite ATP-independent periplasmic 62O150 62O704 3 555 transporter DctO component TM.orf)558 putative membrane permease 62O762 622O84 3 323 TM.orf)559 vB putative acetolactate synthase large subunit 622226 623998 3 773 TM.orf)560 conserved hypothetical protein 624099 624686 2 S88 TM.orf)561 prf peptide chain release factor 3 6264.04 6248O3 -3 6O2 TM.orf)562 virulence-associated protein VapB-like protein 62668O 626970 291 TM.orf)563 hydrolase-like protein protein of the 62.7843 627025 819 alpha/beta-hydrolase fold family TM.orf)564 conserved hypothetical protein 627993 628.298 2 306 TM.orf)565 exo nuclease (SNase-like) 628791 628339 453 TM.orf)566 katG catalase/peroxidase HP 634943 632781 -2 21 63 TM.orf)567 cysteine desulfuration protein SufE 63S136 635567 2 432 TM.orf)568 class II aldolase/adducin domain protein 635598 636389 2 792 TM.orf)569 OsmO family protein 636514 636900 387 TM.orf)570 Amidase 636948 63.8387 2 1440 TM.orf)571 conserved hypothetical protein 6388SS 640O81 3 1227 TM.orf)572 hypothetical protein 64O122 640562 2 441 TM.orf)573 paa A phenylacetic acid degradation protein (similar 64O657 641412 756 to paaA) TM.orf)574 so chromosome partitioning protein 642261 643097 2 837 TM.orf)575 conserved hypothetical protein 643.191 643934 2 744 TM.orf)576 Heat shock protein DnaJ-like protein 643951 644727 777 TM.orf)577 exON UTP-glucose-1-phosphate 
uridylyltransferase 645077 645976 3 900
TM.orf0578 rkpK putative UDPglucose dehydrogenase 646078 647397 1320
TM.orf0579 exoC phosphomannomutase phosphoglucomutase 647463 648872 2 1410
TM.orf0580 ybB. putative transmembrane protein 649638 648958 681
TM.orf0581 putative methyl-accepting chemotaxis protein 649939 651363 1425


Locus Gene Product Start End Strand Length TM.orf)582 ribokinase 652335 6S 1364 -1 972 TM.orf).583 hypothetical protein 652572 652417 -1 156 TM.orf)584 dacF D-alanyl-D-alanine carboxypeptidase 654331 652757 -3 1575 TM.orf)585 conserved hypothetical protein 654712 655467 1 756 TM.orf)586 conserved hypothetical protein 655702 656541 1 840 TM.orf)587 clpS ATP-dependent Clp protease adaptor protein 656820 65.7164 2 345 ClpS TM.orf)588 ATP-dependent Clp protease ATP-binding 65.7164 659569 3 24O6 subunit ClpA TM.orf)589 conserved hypothetical protein 661060 660392 -3 669 TM.orf)590 hypothetical protein 661403 661134 -2 270 TM.orf)591 hypothetical protein 661698 66142O 279 TM.orf)592 conserved hypothetical protein 662968 661796 -3 1173 TM.orf)593 conserved hypothetical protein 663406 663008 -3 399 TM.orf)594 conserved hypothetical protein 663.738 663403 336 TM.orf)595 chbF Syringomycin synthetase 667650 663781 3870 TM.orf)596 conserved hypothetical protein 668332 667784 -3 549 TM.orf)597 Chain A, Crystal Structure Of Cmls, A Flavin- 670077 668353 1725 Dependent Halogenase TM.orf)598 putative acetyltransferase 670754 670074 -2 681 TM.orf)599 cfB Long-chain-fatty-acid-CoA ligase 672127 670751 -3 1377 TM.orf)600 fabG short-chain dehydrogenase/reductase SDR 672963 672124 840 TM.orf)601 htpX putative protease httpX homolog 674367 6731.89 1179 TM.orf)6O2 conserved hypothetical protein 675506 674370 -2 1137 TM.orf)603 iv acetolactate synthase 2 catalytic Subunit 677.353 675629 -3 1725 TM.orf)6O4 translation initiation inhibitor 677895 677428 468 TM.orf)605 dhaS betaine-aldehyde dehydrogenase 6794.19 677941 1479 TM.orf)606 nemA GTN reductase 68O812 679697 -3 1116 TM.orf)607 lirp eucine-responsive regulatory protein 681.490 681008 -3 483 TM.orf)608 putA delta-1-pyrroline-5-carboxylate dehydrogenase 681671 684.853 3 31.83 TM.orf)609 metA Homoserine O-Succinyltransferase 685094 686O3S 3 942 TM.Orf)610 dhaR transcriptional regulator, TetR family 686626 686O42 -3 585 TM.orf)611 ybfB 
major facilitator Superfamily MFS-1 686797 688068 1 1272 TM.orf)612 PRX1 BcpB protein 688671 688087 -1 585 TM.orf)613 amino acid transporter LysE 689327 6886.68 -2 660 TM.orf)614 conserved hypothetical protein 690148 689324 -3 825 TM.orf)615 cmoA methyltransferase domain protein 690891 690145 -1 747 TM.orf)616 ydeG MFS-type transporterydeG 692297 691077 -2 1221 TM.orf)617 inosine-uridine preferring nucleoside hydrolase 692498 693S2O 3 1023 family protein TM.orf)618 hlyB putative chemotaxis protein 693705 695,978 2 2274 TM.orf)619 hypothetical protein 696163 696546 384 TM.orf)620 bfr bacterioferritin 697119 696.637 483 TM.orf)621 hypothetical protein 697540 697331 -3 210 TM.orf)622 acetyl-CoA synthetase 699297 697648 16SO TM.orf)623 ydbL conserved hypothetical protein 6998O1 699409 393 TM.orf)624 ynbE conserved hypothetical protein 6.9999S 6998O1 -2 195 TM.orf)625 conserved hypothetical protein 702253 700028 -3 2226 TM.orf)626 prtG calcium binding hemolysin protein 704.409 702337 2O73 TM.orf)627 cya calcium binding hemolysin protein 707645 7044O6 -2 3240 TM.orf)628 hbd 3-hydroxybutyryl-CoA dehydrogenase 7O8626 707766 -2 861 TM.orf) 629 acyl-CoA dehydrogenase family protein 710498 708687 -2 1812 TM.orf)630 yng) acyl-CoA dehydrogenase 711807 710659 1149 TM.orf)631 yng) pimeloyl-CoA dehydrogenase, large subunit 713O33 711837 -2 1197 TM.orf)632 peroxisomal bifunctional enzyme 715404 713272 21.33 TM.orf)633 Iv ABC transporter related protein 716251 715547 -3 705 TM.orf)634 braF ABC transporter related protein 717000 716248 753 TM.orf)63S IwV inner-membrane translocator 7183.19 717OOO -2 132O TM.orf)636 Iw inner-membrane translocator 719266 718316 -3 951 TM.orf)637 ABC transporter, periplasmic branched chain 720794 719529 -2 1266 amino acid binding protein TM.orf)638 enoyl-CoA hydratase 721,169 721 630 3 462 TM.orf)639 alkK Acyl-CoA synthetases (AMP-forming). 
AMP-acid ligase II 723334 721679 -3 1656
TM.orf0640 stationary-phase survival protein SurE 724368 723562 -1 807
TM.orf0641 cpnA dehydrogenase 724537 725313 1 777
TM.orf0642 short-chain dehydrogenase/reductase SDR 725393 726169 3 777
TM.orf0643 aminoglycoside phosphotransferase 726257 727384 3 1128
TM.orf0644 conserved hypothetical protein 729404 727473 -2 1932
TM.orf0645 Helix-turn-helix motif 730001 729531 -2 471
TM.orf0646 putative acyl-CoA dehydrogenase 731374 730163 -3 1212
TM.orf0647 acnR TetR family transcriptional regulator 731751 732479 2 729
TM.orf0648 oxidoreductase, zinc-binding dehydrogenase family 732536 733510 3 975


Locus Gene Product Start End Strand Length TM.orf)649 methyl-accepting chemotaxis sensory 733.792 735945 1 2154 transducer TM.orf)6SO conserved hypothetical protein 736107 736688 2 582 TM.orf)651 cutL carbon-monoxide dehydrogenase (acceptor) 739092 736771 -1 2322 TM.orf)652 chromate transporter 739344 740543 2 1200 TM.orf)653 transcriptional regulatory protein ZraR 741,144 740554 -1 591 TM.orf)654 rpoH RNA polymerase sigma-32 factor 741640 742587 1 948 TM.orf)6SS Soluble lytic murein transglycosylase and 743595 742660 -1 936 related regulatory protein TM.orf)656 hypothetical protein 744009 743890 -1 120 TM.orf)657 phhA phenylalanine 4-monooxygenase 745351 744503 -3 849 TM.orf)658 pat GCN5-related N-acetyltransferase 746049 74S474 -1 576 TM.orf)659 N-formylglutamate amidohydrolase 7471.43 7461.87 -2 957 TM.orf)660 conserved hypothetical protein 747698 748045 3 348 TM.orf)661 conserved hypothetical protein 74-8108 748539 1 432 TM.orf)662 ybiC malate dehydrogenase 749609 748530 -2 1080 TM.orf)663 HTH-type transcriptional regulator in instable 749726 750433 3 708 DNA locus TM.orf)664 dotP c4-dicarboxylate-binding periplasmic protein 750531 75.1577 2 1047 TM.orf)665 Tripartite ATP-independent periplasmic 751608 752219 2 612 transporter DctO component TM.orf)666 SiaT C4-dicarboxylate TRAP-T family tripartite ATP- 752216 7S3484 3 1269 independent periplasmic transporter, membrane protein, large subunit TM.orf)667 conserved hypothetical protein 754396 753.521 -3 876 TM.orf)668 hypothetical protein 754660 755205 1 S46 TM.orf) 669 Sodium hydrogen exchanger family protein 755322 757847 2 2S26 TM.orf)670 conserved hypothetical protein 758235 757852 -1 384 TM.orf)671 conserved hypothetical protein 758543 758232 -2 312 TM.orf)672 coda cytosine deaminase 759.953 758661 -2 1293 TM.orf)673 yufC) inner-membrane translocator 760898 75995.7 -2 942 TM.orf)674 yufP inner-membrane translocator 76.1966 760917 -2 1050 TM.orf)675 yufO ABC transporter component 763503 761956 1548 TM.Orf676 
med Transcriptional activator protein med 76469S 763700 -3 996 TM.orf)677 ytXM Putative esteraseytXM 765825 764917 909 TM.orf)678 apl dedA family protein 766513 765920 -3 594 TM.orf)679 Arylformamidase 766698 767585 2 888 TM.orf)680 hldE rfaE bifunctional protein 769 104 767590 1515 TM.orf)681 gmhA Phosphoheptose isomerase 769759 769145 -3 615 TM.orf)682 cupin 2 domain-containing protein 769946 770377 3 432 TM.orf)683 hldD nucleoside-diphosphate-Sugar epimerase 771374 770388 -2 987 TM.orf)684 hypothetical protein 771640 771888 249 TM.orf)685 gfcB lipoproteingfcB 772.654 77 1977 -3 678 TM.orf)686 Sulfate adenylyltransferase subunit 2 sulfate 772921 773,724 804 adenylatetransferase sat ATP-Sulfurylase Small subunit TM.orf)687 tuf Sulfate adenylyltransferase, large subunit 773,744 774697 3 954 TM.orf)688 cysC NodO bifunctional enzyme 774694 775674 981 TM.orf)689 cfA 4-coumarate-CoA ligase 777 276 775735 S42 TM.orf)690 peroxisomal 2,4-dienoyl-CoA reductase 7781.27 777318 -2 810 TM.orf)691 hiolase 779269 778124 -3 146 TM.orf)692 DUF35 779685 779266 420 TM.orf)693 peroxisomal multifunctional enzyme type 2 780596 779682 -2 915 TM.orf)694 alpha-methylacyl-CoA racemase 781834 780686 -3 149 TM.orf)695 sly A transcriptional regulator, MarR family 782310 78.1861 450 TM.orf)696 FOX2 peroxisomal multifunctional enzyme type 2 78.3339 782434 906 TM.orf)697 acyl-CoA dehydrogenase 78.3507 784658 2 152 TM.orf)698 cfA acyl-CoA synthase 784655 786238 3 S84 TM.orf)699 enoyl-CoA hydrataseisomerase 786235 787038 804 TM.orf)700 SiaT trap dicarboxylate transporter, dctm Subunit 788453 787113 -2 341 TM.orf)701 tripartite ATP-independent periplasmic 788.963 7884.57 -2 507 transporter DctO TM.orf)702 dictB Bacterial extracellular solute-binding protein, 790096 789.038 -3 059 amily 7 TM.orf)703 cysW Sulfate ABC transporter permease CysW 791154 7903OO -1 855 TM.orf)704 cysT Sulfate transport system permease protein 792007 791159 -3 849 TM.orf)705 sbp Sulfate ABC transporter, periplasmic sulfate- 
binding protein 793059 792013 -1 1047
TM.orf0706 MUM4 (MUCILAGE-MODIFIED 4); UDP-4-keto-6-deoxy-glucose-3,5-epimerase/UDP-4-keto-rhamnose-4-keto-reductase/UDP-L-rhamnose synthase/UDP-glucose 4,6-dehydratase/catalytic 794335 793313 -3 1023


Locus Gene Product Start End Strand Length TM.orf)707 hd) NAD-dependent epimerase? dehydratase 795138 794335 804 TM.orf)708 rifbP undecaprenyl-phosphate galactose 79.5619 7963O2 684 phosphotransferase TM.orf)709 capI NAD-dependent epimerase? dehydratase 796462 797448 987 TM.orf)710 capL UDP-N-acetyl-D-mannosaminuronate 7974.55 798807 1353 dehydrogenase TM.orf)711 kpsD polysaccharide biosynthesis export protein 798.962 8O1733 3 2772 TM.orf)712 gfcD ipoproteingfcD 8O1746 804310 3 2565 TM.orf)713 rifbF glucose-1-phosphate cytidylyltransferase 804589 805395 807 TM.orf)714 rifbN glycosyltransferase, group 2 family protein 805747 806772 1026 TM.orf)715 algA GDP-mannose pyrophosphorylase EpsCR 8O6941 8O8362 1422 TM.orf)716 hypothetical protein 8091.91 810693 1503 TM.orf)717 transposase, IS4 811033 810656 -3 378 TM.orf)718 transposase 811329 811 O3O 300 TM.orf)719 transposase 812O31 811762 270 TM.orf)720 putative transposase number 1 of insertion 812834 812286 -2 549 sequence NGRIS-2b TM.orf)721 nitroreductase 813882 814241 2 360 TM.orf)722 glycosyltransferase 814291 815379 1089 TM.orf)723 conserved hypothetical protein 815651 815971 3 321 TM.orf)724 phinA putative polybetahydroxybutyrate synthesis 816051 816371 2 321 PhnA protein TM.orf)725 conserved hypothetical protein 816451 8172O3 753 TM.orf)726 transcriptional regulator, TetR family protein 817819 817235 -3 585 TM.orf)727 yxaH putative membrane protein 817946 819001 3 1056 TM.orf)728 boy putative biotin transporter bioY 819599 819009 -2 591 TM.orf)729 yd I transcriptional regulator, PadR-like family 819792 820448 2 657 TM.orf)730 conserved hypothetical protein 82O445 82O891 3 447 TM.orf)731 ynfM nner membrane transport protein ynfM 822157 82O892 -3 1266 TM.orf)732 alsR transcriptional regulator, LysR family 822266 82316S 3 900 TM.orf)733 glyoxalase family protein 823242 823730 2 489 TM.orf)734 ybaA conserved hypothetical protein 823730 824089 3 360 TM.orf)735 titr Putative acetyltransferase protein 824625 82408O -1 
S46 TM.orf)736 XRE family transcriptional regulator 825205 824636 -3 570 TM.orf)737 pecS transcriptional regulator protein 825751 82S248 -3 SO4 TM.orf)738 pecM putative transmembrane protein 82S858 826,748 2 891 TM.orf)739 yfm.J. NADP-dependent dehydrogenase 826767 8278O1 2 1035 TM.orf)740 ywfM transporterywfM 828741 827788 -1 954 TM.orf)741 ornithine cyclodeaminase/mu-crystallin 829797 8288.44 -1 954 TM.orf)742 conserved hypothetical protein 83O365 829883 -3 483 TM.orf)743 hoSA MarR family transcriptional regulator 83O469 830942 2 474 TM.orf)744 conserved hypothetical protein 831065 832447 3 1383 TM.orf)745 arsenate reductase and related 832953 832576 -1 378 TM.orf)746 peptidase M14, carboxypeptidase A 8341S2 832953 -2 1200 TM.orf)747 metC Cys/Met metabolism pyridoxal-phosphate- 834281 83546S 3 118S dependent enzymes TM.orf)748 IE DEAD/DEAH box helicase domain-containing 836O34 83762O 2 1587 protein TM.orf)749 bcIA Methyl-accepting chemotaxis protein 8394.19 837737 -3 1683 TM.orf)7SO phenazine biosynthesis protein PhzF family 840341 839514 -2 828 TM.orf)751 Multidrug resistance protein B 84.1645 840386 -3 1260 TM.orf)752 peptidase C45, acyl-coenzyme A: 6- 84.2814 841633 -1 1182 aminopenicillanic acid acyl-transferase TM.orf)753 lysX S6 modification enzyme RimK 843734 84.2811 -2 924 TM.orf)754 norM MATE efflux family protein 845343 843844 -1 1SOO TM.orf)755 conserved hypothetical protein 845638 84-6540 1 903 TM.orf)756 conserved hypothetical protein 846561 847322 2 762 TM.orf)757 GGDEF domain containing protein 847396 848.274 1 879 TM.orf)758 diguanylate cyclase with PAS/PAC sensor 84828O 84863O 2 351 TM.orf)759 hypothetical protein 848932 8491OS 1 174 TM.orf)760 leuA 2-isopropylmalate synthase 849.191 850897 3 1707 TM.orf761 mreB rod shape-determining protein MreB and 851 190 852230 2 1041 related proteins TM.orf762 mreC Rod shape-determining protein MreC 852430 85.3335 1 906 TM.orf)763 Cell shape-determining protein 853.352 85.3867 3 S16 TM.orf)764 mrda 
penicillin binding protein 2 853953 855959 2 2007
TM.orf0765 mrdB Bacterial cell division membrane protein 855956 857107 3 1152
TM.orf0766 hypothetical protein 857429 857650 3 222
TM.orf0767 plcC phospholipase D/transphosphatidylase 859194 857668 -1 1527
TM.orf0768 yaB GCN5-related N-acetyltransferase 859390 859845 1 456
TM.orf0769 tE flavin reductase domain-containing protein 860411 859860 -2 552
TM.orf0770 yhgD transcriptional regulator, TetR family 860595 861254 2 660
TM.orf0771 eutG maleylacetate reductase 862448 861276 -2 1173
TM.orf0772 conserved hypothetical protein 862698 863018 2 321


Locus Gene Product Start End Strand Length TM.orf)773 Bbp2 863O33 863611 3 579 TM.orf)774 hypothetical protein 863608 863.973 1 366 TM.orf)775 C repressor protein 864710 863964 -2 747 TM.orf)776 conserved hypothetical protein 864876 8653O4 2 429 TM.orf)777 conserved hypothetical protein 865831 865328 -3 SO4 TM.orf)778 hypothetical protein 866147 865857 -2 291 TM.orf)779 exo putative nuclease 867100 866357 -3 744 TM.orf)780 thrS hreonyl-tRNA synthetase 86,7419 8693O2 3 1884 TM.orf)781 Outer membrane lipoprotein-like 869483 87OO10 3 528 TM.orf)782 conserved hypothetical protein 870.055 87O642 1 S88 TM.orf)783 filamentation induced by cAMP protein Fic 87O812 871078 3 267 TM.orf)784 hypothetical protein 871209 872570 2 1362 TM.orf)785 hypothetical protein 872.707 873117 1 411 TM.orf)786 arSB membraneanion transport protein 873236 874462 3 1227 TM.orf)787 conserved hypothetical protein 874.551 876O17 2 1467 TM.orf)788 transcriptional regulator 876446 876O36 -2 411 TM.orf)789 cry2 Deoxyribodipyrimidine photolyase 876900 878381 2 1482 TM.orf)790 icc metallophosphoesterase 878427 879224 2 798 TM.orf)791 glxA transcriptional regulator, AraC family 88O193 879255 -2 939 TM.orf)792 Thi.JPfpI family protein 880283 88.0981 3 699 TM.orf)793 extracellular ligand-binding receptor 882269 880995 -2 1275 TM.orf)794 oxidoreductase, short chain 88.3289 8824.83 -2 807 dehydrogenase/reductase family TM.orf)795 Short-chain dehydrogenase/reductase SDR 88.3557 884315 2 759 TM.orf)796 Rhomboid family protein 884613 885,104 2 492 TM.orf)797 conserved hypothetical protein 885466 885,122 -3 345 TM.orf)798 conserved hypothetical protein 885588 886211 2 624 TM.orf)799 acetyltransferase 886436 886909 3 474 TM.orf)800 possible periplasmic Substrate-binding protein, 887349 888.662 2 314 ABC-type amino acid ABC transporter TM.orf)801 bra) inner-membrane translocator 888777 889673 2 897 TM.of)8O2 inner-membrane translocator 889688 890713 3 O26 TM.orf)803 IvO ABC transporter related protein 89.0710 
89.1459 1 750 TM.orf)804 Iw ABC transporter related protein 89.1479 89.2138 3 660 TM.orf)805 cfB putative acyl coenzyme A synthetase, long- 892181 89.3356 3 176 chain-fatty-acid-CoA ligase TM.orf)806 istB IstB domain protein ATP-binding protein 8942OS 893414 -3 792 TM.orf)807 istA Integrase, catalytic region 895700 894.192 -2 509 TM.orf)808 intZ site-specific recombinase, phage integrase 89.7748 895811 -3 938 family TM.orf)809 hypothetical protein 8981OS 897965 -3 141 TM.orf)810 YGGT family protein 898.781 898.464 -2 3.18 TM.orf)811 Na+ solute symporter (SSf family) 9 OO627 898861 -1 767 TM.orf)812 membrane protein 900934 900641 -3 294 TM.orf)813 hypothetical protein 901151 901759 3 609 TM.orf)814 dnaO DNA polymerase III, epsilon subunit 901740 903149 2 410 TM.orf)815 Predicted signal-transduction protein 903243 904697 2 455 containing cAMP-binding and CBS domains TM.orf)816 plc B alpha/beta hydrolase fold protein 9047O6 905701 3 996 TM.orf)817 short-chain dehydrogenase/reductase SDR 906481 905702 -3 780 TM.orf)818 ybfI AraC family transcriptional regulator 907367 906504 -2 864 TM.orf)819 calE phenylacetic acid degradation protein PaaY 907543 908148 606 TM.orf)82O putative ABC transporter ATP-binding protein 9083O8 909990 1683 TM.orf)821 virulence-associated protein D 910204 910410 2O7 TM.orf)822 mlac toluene tolerance transporter 911,169 910477 693 TM.orf)823 Surface lipoprotein 912117 911266 852 TM.orf)824 transcriptional regulator 912352 913419 1068 TM.orf)825 metF 5,10-methylenetetrahydrofolate reductase 913416 914360 2 945 TM.orf)826 metH 5-methyltetrahydrofolate-homocysteine 914413 917922 3510 methyltransferase TM.orf)827 NAD(P)(+) transhydrogenase (AB-specific) 91839S 919534 3 1140 TM.orf)828 NAD(P)(+) transhydrogenase (AB-specific) 919531 9199SO 420 TM.orf)829 pntB NAD(P) transhydrogenase, beta subunit 91996.1 92.1364 3 1404 TM.orf)830 talD putative alpha-ketoglutarate-dependent taurine 922702 92.1779 -3 924 dioxygenase TM.orf)831 Transcriptional regulator, 
AraC family protein 922852 923916 1065
TM.orf0832 conserved hypothetical protein 924188 923874 -2 315
TM.orf0833 30S ribosomal protein S21 924597 924391 207
TM.orf0834 transcriptional activator, TenA family 924910 925581 672
TM.orf0835 COQ9 Ubiquinone biosynthesis protein 925664 926437 3 774
TM.orf0836 purK phosphoribosylaminoimidazole carboxylase ATPase subunit 927569 926454 -2 1116
TM.orf0837 purE phosphoribosylcarboxyaminoimidazole mutase 928135 927611 -3 525
TM.orf0838 conserved hypothetical protein 928319 928906 3 588


Locus Gene Product Start End Strand Length TM.orf)839 Sulfatase 929009 930739 3 1731 TM.orf)840 2-nitropropane dioxygenase 930.795 93.1772 2 978 TM.orf)841 putative transcriptional regulator family 93.1811 93.23O2 3 492 TM.orf)842 conserved hypothetical protein 932637 932819 2 183 TM.orf)843 ubiX aromatic acid decarboxylase 933975 933.316 -1 660 TM.orf)844 conserved hypothetical protein 934O76 934243 3 168 TM.orf)845 alkK Acyl-CoA synthetase (AMP-forming)/AMP-acid 93606S 934,428 -2 1638 ligase II TM.orf)846 acSA AMP-dependent synthetase and ligase 938073 936157 -1 1917 TM.orf)847 hypothetical protein 938.444 93.8578 3 135 TM.orf)848 ybfB MFS-type transporterybfB 9386.18 939883 3 1266 TM.orf)849 AAA ATPase 93.9969 942092 2 2124 TM.orf)850 conserved hypothetical protein 942089 943507 3 1419 TM.orf)851 conserved hypothetical protein 944.176 943.583 -3 594 TM.orf)852 quinone oxidoreductase 94538O 94.4355 -2 1026 TM.orf)853 rinz beta-lactamase Superfamily hydrolase 946.345 945536 -3 810 TM.orf)854 rinz beta-lactamase domain protein 946664 947518 3 855 TM.orf)855 cyclase family protein 94.8746 947757 -2 990 TM.orf)856 3-hydroxybutyrate dehydrogenase 94922S 9SOOO4 1 780 TM.orf)857 alpha/beta hydrolase fold 95OOS4 950875 3 822 TM.orf)858 putative excisonase protein 950962 951405 1 444 TM.orf)859 tR TetR family transcriptional regulator 952O74 951400 -1 675 TM.orf)860 yufC) inner-membrane translocator 953O23 95.2079 -3 945 TM.orf)861 yufP permease protein of Sugar ABC transporter 954.108 953O20 -1 1089 TM.orf)862 yufC) ABC transporter, ATP-binding protein 955661 9S4105 -2 1557 TM.orf)863 med Basic membrane lipoprotein 956,701 955721 -3 981 TM.orf)864 putative iron ascorbate oxidoreductase 956986 958.041 1 1056 TM.orf)865 sulfite reductase 9594.99 958.093 -1 1407 TM.orf)866 hypothetical protein 95.993.7 959536 -1 402 TM.orf)867 putative cache sensor protein 96O487 95.9996 -3 492 TM.orf)868 glycine betaine proline transport system 96.1483 96.O599 -3 885 Substrate-binding 
protein TM.orf)869 binding-protein-dependent transport systems 962426 96.1581 -2 846 inner membrane component TM.orf)870 proV glycine betaine L-proline ABC transporter, 963712 962426 -3 1287 ATPase subunit TM.orf)871 putative addiction module antidote protein, 964193 964423 3 231 CopGArc/MetJ family TM.orf)872 hspB Small heat shock protein 96SO45 964605 -2 441 TM.orf)873 ybfL transposase, is4 family 965331 966446 2 1116 TM.orf)874 rpmE arge subunit ribosomal protein L31 96.6850 9666.29 -3 222 TM.orf)875 ytxE OmpAMotR domain protein 968.218 966.944 -3 1275 TM.orf)876 conserved hypothetical protein 969468 968293 -1 1176 TM.orf)877 hypothetical protein 969997 969587 -3 411 TM.orf)878 SuhB ructose-1 6-bisphosphatase 971114 97.0272 -2 843 TM.orf)879 eff elongation factor P 971848 971282 -3 567 TM.orf)880 thE hiamine-phosphate pyrophosphorylase 97.2731 972O63 -2 669 TM.orf)881 fbaB ructose-bisphosphate aldolase 973648 972728 -3 921 TM.orf)882 hypothetical protein 974.192 973854 -2 339 TM.orf)883 pgk phosphoglycerate kinase 97 S422 974.226 -2 1197 TM.orf)884 gap glyceraldehyde-3-phosphate dehydrogenase 976536 975532 1OOS TM.orf)885 hypothetical protein 976579 976710 132 TM.orf)886 cbbT transketolase 978670 9766.67 -3 2004 TM.orf)887 glnT putative amino acid carrier 97.9078 98.0508 1431 protein(sodiumalanine symporter) TM.orf)888 asp.A aspartate ammonia-lyase 98.0586 98.1959 2 1374 TM.orf)889 anSA cytoplasmic asparaginase I 98288.1 982O36 846 TM.orf)890 Aspartate racemase 983627 98288.1 -2 747 TM.orf)891 putative transcriptional regulator 98.3797 98.4603 807 TM.orf)892 hypothetical protein 984949 9852S4 306 TM.orf)893 conserved hypothetical protein 985283 985633 3 351 TM.orf)894 5-formyltetrahydrofolate cyclo-ligase 98S909 986610 702 TM.orf)895 ymdB metallophosphoesterase 986721 98.7059 2 339 TM.orf)896 ymdB conserved hypothetical protein 987053 987598 3 S46 TM.orf)897 conserved hypothetical protein 987733 988482 750 TM.orf)898 ruvC crossover junction endodeoxyribonuclease 
988521  989021  2  501
TM.orf0899  ruvA  Holliday junction ATP-dependent DNA helicase  989054  989701  3  648
TM.orf0900  ruvB  Holliday junction DNA helicase B  989698  990759  1062
TM.orf0901  tol-pal system-associated acyl-CoA thioesterase  990862  991296  435
TM.orf0902  gatA  Glutamyl-tRNA(Gln) amidotransferase subunit A  992736  991339  1398
TM.orf0903  tolQ  biopolymer transport protein  993294  994028  2  735

Locus Gene Product Start End Strand Length TM.orf)904 toR biopolymer transport protein 994,032 994.490 2 459 TM.orf)905 hypothetical protein 995062 994.682 -3 381 TM.orf)906 ToA protein 99.5410 99.5754 1 345 TM.orf)907 to B translocation protein TolB 99.5751 997 130 2 380 TM.orf)908 Outer membrane protein and related 997284 997796 2 513 peptidoglycan-associated lipoprotein TM.orf)909 ol-pal system protein YbgF 998O13 999008 2 996 TM.orf)910 tiS putative PP-loop Superfamily ATPase 999013 OOO494 1 482 TM.orf)911 ftSEH cell division protease OOO491 OO2434 2 944 TM.orf)912 foP dihydropteroate synthase OO2461 OO3666 3 2O6 TM.orf)913 glmM phosphoglucosamine mutase OO3826 OOS181 3 356 TM.orf)914 thiD phosphomethylpyrimidine kinase OOS334 OO6173 1 840 TM.orf)915 phosphomethylpyrimidine kinase OO6170 OO7015 2 846 TM.orf)916 isomerase OO7384 OO8172 3 789 TM.orf)917 GGDEF domain-containing protein OO9326 OO8169 -1 158 TM.orf)918 SerC phosphoserine aminotransferase OO958O O10746 3 167 TM.orf)919 SerA D-3-phosphoglycerate dehydrogenase O10864 O12441 3 578 TM.orf)920 hisZ. 
ATP phosphoribosyltransferase regulatory O12760 O13935 3 176 Subunit TM.orf)921 purA adenylosuccinate synthetase O13949 O1S232 2 284 TM.orf)922 conserved hypothetical protein O15398 O17425 2 2028 TM.orf)923 conserved hypothetical protein O17441 O17908 2 468 TM.orf)924 Iw branched-chain amino acid ABC transporter, O18685 O17936 -2 750 ATP-binding protein TM.orf)925 braF branched-chain amino acid ABC transporter, O19443 O18685 -3 759 ATP-binding protein TM.orf)926 braE inner-membrane translocator O2O607 O19456 -3 1152 TM.orf)927 Iw inner-membrane translocator O21634 O2O609 -2 1026 TM.orf)928 ABC-type branched-chain amino acid O231 O1 O21761 -2 1341 transporter TM.orf)929 transcriptional regulator, XRE family protein O23300 O24766 2 1467 TM.Orf)930 arR protein containing response regulator domain, O24837 O25208 1 372 but no DNA binding domain TM.orf)931 hypothetical protein O2S230 O2S514 1 285 TM.orf)932 Na+ solute symporter O25519 O28239 3 2721 TM.orf)933 transcriptional regulator MazE O28523 O28.293 -1 231 TM.orf)934 rpoH RNA polymerase factor sigma-32 O29554 O28631 -2 924 TM.orf)935 r) Pseudouridylate , 23S RNA-specific O313.85 O3O168 -1 1218 TM.orf)936 hypothetical protein O31422 O31904 1 483 TM.orf)937 Mov34, MPNPAD-1 O31901 O32383 2 483 TM.orf)938 (Uracil-5)-methyltransferase O32463 O34O70 1 1608 TM.orf)939 ItalE putative low specificity L-threonine aldolase O34067 O3S119 2 1053 TM.orf)940 l.cfB malonyl-CoA synthase O35551 O37O68 3 1518 TM.orf)941 ywbO hiol oxidoreductase FrnE O37195 O37962 3 768 TM.orf)942 mdcF malonate transporter O38916 O37981 -3 936 TM.orf)943 N-acetyl-gamma-glutamyl-phosphate O3942O O40418 1 999 reductase TM.orf)944 puuR HTH-type transcriptional regulator puuR O41265 O4O717 -3 549 TM.orf)945 omega-amino acid-pyruvate aminotransferase O41612 O42934 2 323 TM.orf)946 ald alanine dehydrogenase O43O8O O44204 1 125 TM.orf)947 addiction module antitoxin, RelB/DinJ family O44340 O44486 1 147 TM.orf)948 putative multidrug-efflux transporter O45981 
O44515 -3 467 TM.orf)949 pleC PAS/PAC sensor signal transduction histidine O46185 O47351 1 167 kinase TM.orf)950 dadA glycine, D-amino acid oxidase O48626 O47364 -1 263 TM.orf)951 ygdD inner membrane protein ygdD O48788 O491.89 2 402 TM.orf)952 hypothetical protein O492.17 O49336 2 120 TM.orf)953 AAA ATPase central domain protein O49427 OSO3O8 2 882 TM.orf)954 conserved hypothetical protein OSO366 OS1544 2 179 TM.orf)955 hypothetical protein OS 1833 OS2648 2 816 TM.orf)956 mach3 ABC transporter related protein 054578 OS2626 -1 953 TM.orf)957 macA RND family efflux transporter MFP subunit O55.783 054575 -2 209 TM.orf)958 conserved hypothetical protein OSS843 O56796 3 954 TM.orf)959 membrane protein OS6813 OS818O 2 368 TM.orf)960 aprE Hly D family secretion protein O59S43 O58170 -1 374 TM.orf)961 aprD ABC transporter, ATP-binding protein O61372 OS954O -2 833 TM.orf)962 yhjN Putative ammonia monooxygenase O62612 O61521 -1 O92 TM.orf)963 histone deacetylase-like amidohydrolase O62890 O63828 3 939 TM.orf)964 xSeB Exonuclease VII Small subunit O63936 O64193 1 258 TM.orf)965 isp A geranyltranstransferase O642SO O6S176 2 927 TM.orf)966 dxS deoxyxylulose-5-phosphate synthase O65466 O67403 1 938 TM.orf)967 yaxC Predicted rRNA methylase O6742O O68229 3 810 TM.orf)968 finr transcriptional regulator, Crp O68283 O68984 1 702 TM.orf)969 phbC PHB de-polymerase O701.15 O68994 -1 122 US 2014/0296.161 A1 Oct. 2, 2014 70

Locus Gene Product Start End Strand Length TM.orf)970 conserved hypothetical protein O7O686 O70255 -3 432 TM.orf)971 rhodanese-related sulfurtransferase O70835 O71209 2 375 TM.orf)972 hypothetical protein O71363 O71,199 -1 16S TM.orf)973 CfB putative long-chain-fatty-acid CoA ligase O72995 O71436 -1 1S60 TM.orf)974 fadR TetR family transcriptional regulator O73737 O73117 -3 621 TM.orf)975 SCP2 lipid-transfer protein O74051 O75232 2 1182 TM.orf)976 conserved hypothetical protein O75263 O75730 2 468 TM.orf)977 FOX2 MaoC-like dehydratase O757 27 O76113 3 387 TM.orf)978 short-chain dehydrogenase/reductase SDR O76176 O77003 1 828 TM.orf)979 acyl-CoA dehydrogenase domain protein O7708O O78240 3 1161 TM.orf)980 conserved hypothetical protein O78598 O78299 -2 300 TM.orf)981 putative epimerase PhzC/PhZF-like protein O79828 O78869 -2 960 TM.orf)982 ynjA conserved hypothetical protein O80524 O79934 -2 591 TM.orf)983 aroC chorismate synthase O81676 O80603 -2 1074 TM.orf)984 short-chain dehydrogenase/reductase SDR O82535 O81723 813 TM.orf)985 yfkH YihY family protein O83690 O82737 954 TM.orf)986 curved DNA-binding protein O84583 O83687 -2 897 TM.orf)987 Omp 17 kDa surface antigen precursor O851.87 O84711 477 TM.orf)988 pdxH pyridoxamine 5'-phosphate oxidase O85540 O86133 3 594 TM.orf)989 fieF cation efflux protein O86130 O87056 927 TM.orf)990 apt adenine phosphoribosyltransferase O871.98 O87722 525 TM.orf)991 tag DNA-3-methyladenine glycosylase I O883.43 O87750 594 TM.orf)992 Predicted aminomethyltransferase related to O8926S O88336 -3 930 GowT TM.orf)993 glycosyltransferase family protein O8943O 090422 993 TM.orf)994 ytcC glycosyltransferase group 1 O90419 O91657 2 1239 TM.orf)995 lpsD Glycosyltransferase 091647 O92765 1119 TM.orf)996 asnB asparagine synthase O92798 O94612 2 1815 TM.orf)997 ypfG conserved hypothetical protein O95734 O94628 -2 1107 TM.orf)998 pleC sensory transduction histidine kinase O95958 O9836O 2403 TM.orf)999 ded deoxycytidine triphosphate deaminase O99261 
O99815 555 TM.orf1000 murA UDP-N-acetylglucosamine enolpyruvyl O99904 O1193 3 1290 transferase TM.orf1001 hisG ATP phosphoribosyltransferase catalytic O1245 O1928 3 684 Subunit TM.orf1002 hisD histidinol dehydrogenase O1912 O3213 2 1302 TM.orf10O3 conserved hypothetical protein O3327 O3848 3 522 TM.orf1004 arSC low molecular weight phosphotyrosine protein O3908 O4330 1 423 phosphatase TM.orf1005 infA translation initiation factor 1 O4S28 O4746 2 219 TM.orf1 OO6 mafprotein O4866 OS48O 3 615 TM.orf1007 ring ribonuclease G 05477 O7108 1 1632 TM.orf1008 zinc-binding protein O7142 O7381 1 240 TM.orf1009 hypothetical protein O7832 O8107 1 276 TM.orf1010 hypothetical protein O8278 O8138 -1 141 TM.orf1011 amidohydrolase family protein O8294 O8461 1 168 TM.orf1012 conserved hypothetical protein O8729 O928O 1 552 TM.orf1013 metG Methionyl-tRNA synthetase 10472 O9360 -3 1113 TM.orf1014 conserved hypothetical protein 10999 11595 2 597 TM.orf1015 metQ lipoprotein, YaeC family 1246S 11677 -2 789 TM.orf1016 cysD O-acetylhomoserine/O-acetylserine 14130 12832 -2 1299 sulfhydrylase TM.orf1017 leuA pyruvate carboxyltransferase 15574 14237 -1 1338 TM.orf1018 oxyR transcriptional regulator, LysR family 15785 16711 1 927 TM.orf1019 nulo NADH dehydrogenase (ubiquinone) 24 kDa 16819 17316 2 498 Subunit TM.orf102O putative NAD-dependent formate 17313 18869 3 1557 dehydrogenase beta Subunit protein TM.orf1021 yigC formate dehydrogenase, alpha Subunit 18881 21721 1 2841 TM.orf1022 flD formate dehydrogenase accessory protein 21738 22535 3 798 TM.orf1023 putative NAD-dependent formate 22552 22764 2 213 dehydrogenase TM.orf1024 nylB 6-aminohexanoate-dimer hydrolase 23952 22780 -2 1173 TM.orf1025 gsiA ABC transporter ATP-binding protein 25784 23964 -3 1821 TM.orf1026 puuC aldehyde dehydrogenase (acceptor) 273.13 25781 -1 1533 TM.orf1027 SoxB Sarcosine oxidase subunit beta 2872O 27365 -1 1356 TM.orf1028 acetyltransferase 291.44 28707 -3 438 TM.orf1029 FAD dependent oxidoreductase 29418 30S63 3 
1146
TM.orf1030  appA  ABC superfamily ATP binding cassette transporter substrate-binding protein  30651  32309  3  1659
TM.orf1031  appB  oligopeptide ABC superfamily ATP binding cassette transporter, membrane protein  32328  33278  3  951
TM.orf1032  appC  ABC transporter permease  33284  34153  1  870
TM.orf1033  gcvA  transcriptional regulatory protein  34178  35080  1  903

Locus Gene Product Start End Strand Length TM.orf1034 major facilitator Superfamily MFS 1 3638O 3S142 -3 1239 TM.orf1035 ywlC Suas/YciO/YrdC/YwlC family protein 37113 36391 -2 723 TM.orf1036 hypothetical protein 37943 37362 -3 582 TM.orf1037 hypothetical protein 38346 38122 -2 225 TM.orf1038 conserved hypothetical protein 38468 38.971 1 SO4 TM.orf1039 hypothetical protein 3.8968 394.62 2 495 TM.orf1040 yefC) transcriptional regulator, TetR family protein 39539 4O16S 1 627 TM.orf1041 fec RNA polymerase sigma-70 family protein 40SO4 41028 2 525 TM.orf1042 putative FecK 41025 41987 3 963 TM.orf1043 norA oxidoreductase 43149 42049 -2 1101 TM.orf1044 yeaN transcriptional regulator LysR family 43262 44149 1 888 TM.orf1045 foXA TonB-dependent siderophore receptor 44309 46759 1 2451 TM.orf1046 PepSY-associated TM helix domain-containing 46770 47993 3 1224 protein TM.orf1047 ntalE Flavin reductase like domain protein 48069 48584 3 S16 TM.orf1048 iron-sulfur cluster repair di-iron protein 48678 49136 3 459 TM.orf1049 signal peptide protein 49672 4.9154 519 TM.orf1 OSO phosphoglycerate mutase family protein 49860 SO432 3 573 TM.orf1051 ybfB major facilitator Superfamily MFS 1 SO696 51955 1260 TM.orf1052 hypothetical protein 51979 52560 2 582 TM.orf1053 ubiG 3-demethylubiquinone-9 3-O- S3440 S2544 897 methyltransferase TM.orf1054 peptidase MSO 54597 53437 -2 1161 TM.orf1 OSS Lantibiotic dehydratase domain protein 57020 S4594 -3 2427 TM.orf1056 hypothetical protein 57871 57017 855 TM.orf1057 conserved hypothetical protein 58788 57871 -2 918 TM.orf1058 eamB Lysine exporter protein (LYSE/YGGA) S9483 S8854 -3 630 TM.orf1059 radical SAM domain-containing protein 6.1690 595.67 2124 TM.orf1060 hypothetical protein 621.16 61901 216 TM.orf1061 conserved hypothetical protein 62425 62712 2 288 TM.orf1062 yfkN alkaline phosphatase 62922 67649 3 4728 TM.orf1063 prtC Secreted protease C 67664 69685 2022 TM.orf1064 TolC family type I secretion Outer membrane 69792 71222 3 1431 protein 
TM.orf1065 paxB putative exotoxin translocation ATP-binding 71219 73030 1812 protein PaxB TM.orf1066 cyaD type I Secretion membrane fusion protein, 73O20 74321 3 1302 Hly D family TM.orf1067 conserved hypothetical protein 75412 74330 1083 TM.orf1068 pleC Signal transduction histidine kinase 757.91 77161 1371 TM.orf1069 aco) acetaldehyde dehydrogenase 77358 78.905 3 1548 TM.orf1070 conserved hypothetical protein 78941 79459 519 TM.orf1071 conserved hypothetical protein 79570 80430 2 861 TM.orf1072 pyridoxamine 5'-phosphate oxidase-related 80529 80996 3 468 FMN-binding protein TM.orf1073 conserved hypothetical protein 81468 80980 -2 489 TM.orf1074 fiS flagellar protein FliS 8.2138 81662 477 TM.orf1075 f) flagellar hook-associated protein 2 (FliD, 84293 821.70 -3 2124 filament cap protein) precursor TM.orf1076 flagellar protein Flag 84798 84376 -2 423 TM.orf1077 flagellin 862OO 85037 1164 TM.orf1078 flagellin domain protein 87778 866.15 1164 TM.orf1079 alkylhydroperoxidase like protein, Ahp) family 88.104 88586 3 483 TM.orf1080 conserved hypothetical protein 89248 88577 672 TM.orf1081 conserved hypothetical protein 91 698 892S4 -2 2445 TM.orf 1082 cely cellulase 92993 91689 -3 1305 TM.orf1083 cellulose synthase subunit B 95779 92990 2790 TM.orf 1084 bcSA putative cellulose synthase catalytic Subunit 98.133 95797 -2 2337 TM.orf1085 fB Lysine-N-methylase 99.528 98.353 -2 1176 TM.orf1086 fW Flagellar assembly factor fliW 200072 995S4 519 TM.orf1087 flagellar hook-associated protein Flg.L. 
201094  200192  -2  903
TM.orf1088  flgK  flagellar hook-associated protein  202984  201173  -2  1812
TM.orf1089  hypothetical protein  203537  203016  522
TM.orf1090  chemotactic signal-response protein CheL  203952  203569  -3  384
TM.orf1091  flgI  flagellar P-ring protein precursor  205127  203964  1164
TM.orf1092  flagellar assembly regulator FliX  205467  205895  429
TM.orf1093  dksA  DnaK suppressor protein  206017  206433  3  417
TM.orf1094  pleC  putative sensory box histidine kinase  206708  209452  2  2745
TM.orf1095  flagellar basal body L-ring protein  210260  209466  795
TM.orf1096  flgA  flagellar basal body P-ring biosynthesis protein FlgA  211316  210324  993
TM.orf1097  flgG  flagellar basal-body rod protein FlgG  212149  211364  -2  786
TM.orf1098  flgF  flagellar basal-body rod protein  212950  212198  -2  753
TM.orf1099  fliL  flagellar basal body-associated protein  213326  213868  2  543

Locus Gene Product Start End Strand Length TM.orf1 100 fM flagellar motor Switch protein FliM 21.3924 21SO3O 1 1107 TM.orf1 101 ATPase involved in DNA repair 215027 21562O 2 594 TM.orf1 102 FlaA locus 229 kDa protein 215704 21 6519 3 816 TM.orf1103 cheY Response regulator receiver 21 6543 21 6929 1 387 TM.orf1 104 chemotaxis protein CheZ 216979 217641 3 663 TM.orf1 105 PPE-repeat protein (cell mobility) 217692 218354 1 663 TM.orf1 106 ytxE chemotaxis MotR protein 218351 219 100 2 750 TM.orf1 107 conserved hypothetical protein 219097 222261 3 316S TM.orf1108 rlmI methyltransferase 222402 223379 1 978 TM.orf1109 ysgA rRNA methylase 223383 224210 1 828 TM.orf1110 hypothetical protein 224235 224495 1 261 TM.orf1111 dhaR TetR/AcrR family transcriptional regulator 224,589 225215 1 627 TM.orf1112 hypothetical protein 2246.17 2244.92 -2 126 TM.orf1113 linC 2,5-dichloro-2,5-cyclohexadiene-1,4- 225281 226045 2 765 ioldehydrogenase TM.orf1114 DMT family permease 226948 226052 -2 897 TM.orf1115 rhA LysR family transcriptional regulator 227106 227975 1 870 TM.orf1116 fiP flagellar biosynthetic protein FliP 2294.94 228724 -3 771 TM.orf1117 hypothetical protein 229949 229491 -1 459 TM.orf1118 hypothetical protein 23O4O2 23.0043 -1 360 TM.orf1 119 ylqH flagellar biosynthetic protein FIhB 230732 23O448 -1 285 TM.orf1120 csirA carbon storage regulator 231119 23O856 -1 264 TM.orf1121 figE flagellar basal body rod protein FlgB 231481 231903 3 423 TM.orf1122 figC flagellar basal-body rod protein FlgC 231952 232359 3 408 TM.orf1123 fB flagellar hook-basal body complex protein FliE 23242O 232734 3 315 TM.orf1124 yscS flagellar biosynthetic protein FNQ 232807 233O82 3 276 TM.orf1125 fR flagellar biosynthesis protein FliR 2331 O2 233857 2 756 TM.orf1126 fB flagellar biosynthetic protein FIhB 233875 2349S4 3 O8O TM.orf1127 cell cycle histidine kinase CokA 23SO20 236945 1 926 TM.orf1128 hypothetical protein 237585 2371OO -3 486 TM.orf1129 recA RecA protein 23.7929 239029 2 101 TM.Orf1130 
alaS alanyl-tRNA synthetase 239345 24-1999 2 2655 TM.orf1131 conserved hypothetical protein 243214 242069 -2 146 TM.orf1132 rhsD nematicidal protein 2 248,254 243.263 -2 4992 TM.orf1133 putative acetyltransferase 249003 248464 -3 S4O TM.orf1134 yiA cobalamin synthesis protein CobW 2SO103 249000 -1 104 TM.orf113S 3-hydroxyacyl-CoA dehydrogenase (hdb-1) 2SO3OO 251 193 3 894 TM.orf1136 ico isocitrate dehydrogenase 2S2543 251296 -3 248 TM.orf1 137 hypothetical protein 252.775 2S3668 3 894 TM.orf1138 ygaU protein containing LysM domain 253779 255026 1 248 TM.orf1139 ywfA MFS-type transporterywfA 2S6268 255027 -1 242 TM.orf1140 hypothetical protein 259653 259456 -3 198 TM.orf1141 hypothetical protein 261070 259721 -2 350 TM.orf1142 yezE TetR family transcriptional regulator 261874 261266 -2 609 TM.orf1143 conserved hypothetical protein 261993 262910 1 918 TM.orf1144 yfcG glutathione S-transferase domain-containing 262966 263670 3 705 protein TM.orf1145 yfcG Glutathione S-transferase domain protein 2636.79 264389 1 711 TM.orf1146 rbSC methyl-galactoside transport system permease 26S469 264411 -1 1059 protein TM.orf1147 ABC transporter ATP-binding protein 266995 26S466 -2 1530 TM.orf1 148 torT ABC transporter substrate-binding protein 2681.59 267098 -2 1062 TM.orf1149 kip A urea amidolyase related protein 269275 2683O4 -2 972 TM.orf11SO allophanate hydrolase subunit 1 27O140 26926S -1 876 TM.orf1151 accC biotin carboxylase 271514 27O150 -1 1365 TM.orf1 152 acetyl-CoA carboxylase biotin carboxyl carrier 271 758 271522 -3 237 protein TM.orf1153 conserved hypothetical protein 272647 271865 -2 783 TM.orf1154 cynR transcriptional regulator 272913 273836 1 924 TM.orf115S cupin Superfamily protein 274097 275260 2 1164 TM.orf1156 pfpI proteinase 275836 275.285 -2 552 TM.orf1157 conserved hypothetical protein 276516 275911 -3 606 TM.orf1158 ynjF phosphatidylglycerophosphate synthase 276,707 277315 2 609 TM.orf1159 conserved hypothetical protein 277454. 
278356  2  903
TM.orf1160  ydjF  transcriptional regulator, DeoR family  278360  279133  2  774
TM.orf1161  ytfG  oxidoreductase ytfG  280016  279138  -1  879
TM.orf1162  ytcD  putative HTH-type transcriptional regulator  280138  280497  3  360
TM.orf1163  motC  AcrE3  283531  280535  -2  2997
TM.orf1164  Membrane-fusion protein  284902  283676  -2  1227
TM.orf1165  padR  transcriptional regulator, PadR family protein  285456  284899  -3  558
TM.orf1166  dTDP-glucose 4,6-dehydratase  286620  285598  -3  1023
TM.orf1167  NAD-dependent epimerase/dehydratase  287306  286620  -1  687
TM.orf1168  methyltransferase, UbiE  288319  287462  -2  858
TM.orf1169  glmU  molybdopterin binding domain-containing  290137  288446  -2  1692

Locus Gene Product Start End Strand Length protein TM.orf1170 pucA Xanthine dehydrogenase accessory factor 290925 290.194 -3 732 TM.orf1171 ydeB XdhCCoxI family protein 29.1280 290942 -2 339 TM.orf1172 VWA containing CoxE family protein 2925.45 291.286 -3 1260 TM.orf1173 ATPase associated with various cellular 293426 292S42 -1 885 activities TM.orf1174 cutM carbon-monoxide dehydrogenase (acceptor) 294473 293673 -1 8O1 TM.orf1175 cutL carbon monoxide dehydrogenase large chain 296922 294535 -3 2388 TM.orf1176 cutS carbon monoxide dehydrogenase Small chain 297463 296993 -2 471 TM.orf1177 ghra D-isomer specific 2-hydroxyacid 298852 297899 -2 954 dehydrogenase, NAD-binding TM.orf1178 hypothetical protein 2991.90 298.951 -3 240 TM.orf1179 hypothetical protein 299598 299329 -3 270 TM.orf1 180 marP two-component response regulator 29.9787 3OO494 1 708 TM.orf1 181 appC ABC transporter permease protein 301407 3OO499 -3 909 TM.orf1 182 gsiC binding-protein-dependent transport systems 3O2342 3O1404 -1 939 inner membrane component TM.orf1 183 oligopeptide? 
dipeptide ABC transporter, 3O3335 3O2349 -1 987 ATPase subunit TM.orf1184 ABC transporter ATP binding protein 3O4318 3O3389 -2 930 TM.orf1 185 dippA oligopeptide ABC transpoter oligopeptide- 305889 304315 -3 575 binding protein TM.orf1186 conserved hypothetical protein 306808 305963 -2 846 TM.orf1187 TetR family transcriptional regulator 306998 307657 2 660 TM.orf1 188 conserved hypothetical protein 307758 3O8891 1 134 TM.orf1 189 allophanate hydrolase 309273 3O8911 -3 363 TM.orf1190 rutE isochorismatase family protein 3.0995O 309270 -1 681 TM.orf1191 yufC) putative ABC transporter ATP-binding protein 3.11546 3.09969 -1 578 TM.orf1.192 isochorismatase family protein 312235 3.11543 -2 693 TM.orf1193 yufC) ABC transporter permease protein 313199 312273 -1 927 TM.orf1194 putative simple Sugar transport system 314322 313.204 -3 119 permease protein TM.orf1195 med simple Sugar transport system Substrate- 315523 314414 -2 110 binding protein TM.orf1196 ydfH transcriptional regulator, GintR family 31.5916 316635 3 720 TM.orf1197 Urea amidolyase 316648 3.18471 3 824 TM.orf1 198 anti-Sigma-factor antagonist 318884 318495 -1 390 TM.orf1199 mobA molybdopterin-guanine dinucleotide 319614 3.18931 -3 684 biosynthesis protein A TM.orf1200 sugE Small multidrug resistance protein 3.19848 32O168 1 321 TM.orf12O1 cold-shock DNA-binding domain-containing 32O334 32O687 1 3S4 protein TM.orf1202 yiaO TRAP dicarboxylate transporter- DctP subunit 32O940 322001 1 1062 TM.orf12O3 hypothetical protein 322006 322SOO 3 495 TM.orf1204 TRAP dicarboxylate transporter, DctM subunit 322497 323801 1 1305 TM.orf1205 ong-chain-fatty-acid-CoA ligase 323806 324267 3 462 TM.orf1206 cfB AMP-dependent synthetase and ligase 324240 32S4OO 1 1161 TM.orf12O7 hypothetical protein 3255.30 32S82O 1 291 TM.orf1208 ygaZ AZIC family protein 32.5859 326626 2 768 TM.orf1209 branched-chain amino acid transport 326626 326946 3 321 TM.orf1210 pepN membranealanyl aminopeptidase 327O64 329727 3 2664 TM.orf1211 
4-chloro-3-hydroxybutyrate hydrolase  330982  329804  -2  1179
TM.orf1212  transcriptional regulator, AraC family  331259  331786  2  528
TM.orf1213  thcR  transcriptional regulator, AraC family  331747  332259  3  513
TM.orf1214  Methyl-accepting chemotaxis protein  333700  332351  -2  1350
TM.orf1215  yebC  conserved hypothetical protein  333892  334713  3  822
TM.orf1216  cph2  Phytochrome-like protein cph2  334862  337087  2  2226
TM.orf1217  YbaK prolyl-tRNA synthetase associated region  337299  337096  -3  204
TM.orf1218  ppnK  Predicted sugar kinase  338173  337343  -2  831
TM.orf1219  moaA  molybdenum cofactor biosynthesis protein A  339224  338181  -1  1044
TM.orf1220  conserved hypothetical protein  340486  339377  -2  1110
TM.orf1221  conserved hypothetical protein  341012  340800  -1  213
TM.orf1222  hypothetical protein  341287  341955  3  669
TM.orf1223  cheA  CheA signal transduction histidine kinases (STHK)  342075  344816  1  2742
TM.orf1224  cheW  Chemotaxis signal transduction protein  344813  345310  2  498
TM.orf1225  cheY  FOG: CheY-like receiver  345407  345772  2  366
TM.orf1226  cheB  chemotaxis-specific methylesterase  345820  347007  3  1188
TM.orf1227  chemotaxis protein methyltransferase CheR  347007  347816  1  810
TM.orf1228  ctrA  two-component response regulator  348673  347942  -2  732
TM.orf1229  fliI  flagellum-specific ATP synthase  348922  350259  3  1338
TM.orf1230  hypothetical protein  350266  350706  3  441
TM.orf1231  conserved hypothetical protein  351202  352383  3  1182

Locus Gene Product Start End Strand Length TM.orf1232 yfiL ABC transporter related protein 352388 353380 2 993 TM.orf1233 ABC transporter permease protein 353373 3S4323 1 951 TM.orf1234 ybfB major facilitator Superfamily MFS 1 355611 3S4358 -3 1254 TM.orf1235 ydC transcriptional regulator, TetR family protein 356283 355675 -3 609 TM.orf1236 conserved hypothetical protein 3568O3 356336 -2 468 TM.orf1237 conserved hypothetical protein 357954 356872 -3 1083 TM.orf1238 ylxH FeN 361632 360847 -3 786 TM.orf1239 FIhF 362951 361629 -1 1323 TM.orf1240 flA flagellar biosynthesis pathway, component 36S160 363037 -3 2124 FIhA TM.orf1241 fibD Response regulator containing Chey-like 366774 365245 -3 1530 receiver, AAA-type ATPase, and DNA-binding domains TM.orf1242 pomA chemotaxis protein 3.67656 366889 -3 768 TM.orf1243 fiN flagellar motor Switch protein 3681 11 367758 -1 3S4 TM.orf1244 flagellar assembly protein H 368888 368181 -1 708 TM.orf1245 fiG flagellar motor Switch protein G 369955 368.933 -2 1023 TM.orf1246 fif flagellar MS-ring protein 371668 369962 -2 1707 TM.orf1247 conserved hypothetical protein 372410 372048 -1 363 TM.orf1248 figE Flagellar hook protein figE 374276 372573 -1 1704 TM.orf1249 figD flagellar basal-body rod modification protein 375223 37.4444 -2 780 Flg.) 
TM.orf12SO flagellar hook-length control protein 378290 375246 -1 3O45 TM.orf1251 pleC PAS/PAC sensor signal transduction histidine 379682 3784OS -1 1278 kinase TM.orf1252 chaC ChaC-like protein 380628 379810 -3 819 TM.orf12S3 acetoacetyl-CoA synthetase 38O890 382917 3 2028 TM.orf1254 cph2 GGDEF family protein 38306S 384003 3 939 TM.orf1255 SoxR LysR family transcriptional regulator 38.4044 3.84946 2 903 TM.orf1256 ccpA cytochrome c551 peroxidase precursor 385079 3861.13 2 1035 TM.orf1257 conserved hypothetical protein 387,205 386129 -2 1077 TM.Of 1258 anthranilate synthase 389.504 387.342 -1 21 63 TM.orf1259 bigR Arsk family transcriptional regulator 389867 3902O8 2 342 TM.orf1260 YeeE/YedE family protein 3902OS 390642 3 438 TM.orf1261 membrane protein 390709 391170 3 462 TM.orf1262 metallo-beta-lactamase family protein 391175 392083 2 909 TM.orf1263 conserved hypothetical protein 3921.35 392353 2 219 TM.orf1264 Sulfate transporter 392,393 394147 2 755 TM.orf1265 conserved hypothetical protein 39428S 395940 3 656 TM.orf1266 conserved hypothetical protein 396591 39S983 -3 609 TM.orf1267 conserved hypothetical protein 397805 396.579 -1 227 TM.orf1268 ydeM Anaerobic sulfatase-maturating enzyme 399284 397824 -1 461 homologydeM TM.orf1269 hypothetical protein 3996.13 399281 -2 333 TM.orf1270 bcIA methyl-accepting chemotaxis protein 401211 399727 -3 485 TM.orf1271 conserved hypothetical protein 4O1687 402514 2 828 TM.orf1272 dictP TRAP dicarboxylate transporter- DctP subunit 402599 403675 2 O77 TM.orf1273 conserved hypothetical protein 403734 404342 1 609 TM.orf1274 siaT TRAP dicarboxylate transporter, DctM subunit 404412 40S698 1 287 TM.orf1275 conserved hypothetical protein 405788 406468 2 681 TM.orf1276 gatA Glutamyl-tRNA(Gln) amidotransferase subunit A 4O6534 4O7892 3 359 TM.orf1277 amidase 407916 408734 1 819 TM.orf1278 namA NADH:flavin oxidoreductase,NADH oxidase 4.08859 410016 3 158 TM.orf1279 hypothetical protein 41 O145 410306 1 162 TM.orf1280 mintB ABC 
transporter component  410303  411064  2  762
TM.orf1281  mntC  ABC transporter permease protein  411083  411967  2  885
TM.orf1282  periplasmic solute binding protein  411964  412929  3  966
TM.orf1283  helix-turn-helix type 11 domain-containing protein  413590  412898  -2  693
TM.orf1284  conserved hypothetical protein  413958  413596  -3  363
TM.orf1285  Peci  enoyl-CoA hydratase/isomerase  415250  414009  -1  1242
TM.orf1286  lcfB  AMP-binding domain protein  417269  415335  -1  1935
TM.orf1287  Short-chain alcohol dehydrogenase of unknown specificity  417502  418254  3  753
TM.orf1288  rutR  transcriptional regulator, TetR family protein  418422  419123  1  702
TM.orf1289  alsB  D-allose-binding periplasmic protein  419446  420483  3  1038
TM.orf1290  rbsA  ABC transporter related protein  420601  422166  3  1566
TM.orf1291  rbsC  inner-membrane translocator  422163  423200  1  1038
TM.orf1292  sorC  transcriptional regulator, DeoR family  423261  424268  1  1008
TM.orf1293  glpD  Glycerol-3-phosphate dehydrogenase  424428  426194  1  1767
TM.orf1294  aldolase protein  426249  427091  1  843
TM.orf1295  lyx  L-xylulose kinase protein  427157  428653  2  1497
TM.orf1296  putative dehydrogenase  429452  428658  -1  795
TM.orf1297  conserved hypothetical protein  430636  429497  -2  1140

Locus | Gene | Product | Start | End | Strand | Length
TM.orf1298 | | TetR family transcriptional regulator | 430704 | 431333 | 1 | 630
TM.orf1299 | | conserved hypothetical protein | 433151 | 431346 | -1 | 1806
TM.orf1300 | fec | ECF subfamily RNA polymerase sigma factor | 433359 | 433928 | 1 | 570
TM.orf1301 | | sigma factor regulatory protein FecR/PupR family | 433928 | 434968 | 2 | 1041
TM.orf1302 | irgA | TonB-dependent receptor | 435150 | 437471 | 1 | 2322
TM.orf1303 | | hypothetical protein | 437570 | 437866 | 2 | 297
TM.orf1304 | | conserved hypothetical protein | 437863 | 439428 | 3 | 1566
TM.orf1305 | | hypothetical protein | 439425 | 439766 | 1 | 342
TM.orf1306 | | transporter ywfM | 440649 | 439717 | -3 | 933
TM.orf1307 | | TetR family transcriptional regulator | 441344 | 440688 | -1 | 657
TM.orf1308 | | hypothetical protein | 442035 | 442223 | 1 | 189
TM.orf1309 | | conserved hypothetical protein | 442292 | 442507 | 2 | 216
TM.orf1310 | | hypothetical protein | 442531 | 442734 | 3 | 204
TM.orf1311 | | Coenzyme PQQ synthesis protein E | 442877 | 444391 | 2 | 1515
TM.orf1312 | | TonB-dependent receptor | 444961 | 447168 | 3 | 2208
TM.orf1313 | | rhizobiocin/RTX toxin and hemolysin-type calcium binding protein | 448826 | 447252 | -1 | 1575
TM.orf1314 | cya | calcium binding hemolysin protein | | 449004 | -1 | 3585
TM.orf1315 | | hemolysin-type calcium-binding repeat family protein | | 452965 | -3 | 1386
TM.orf1316 | cya | proprotein convertase P | 457577 | 454641 | -1 | 2937
TM.orf1317 | araQ | ABC superfamily ATP binding cassette transporter, membrane protein | 458683 | 457808 | -2 | 876
TM.orf1318 | | inner membrane component of binding protein-dependent transport system | 459558 | 458680 | -3 | 879
TM.orf1319 | ugpB | extracellular solute-binding protein | 460942 | 459650 | -2 | 1293
TM.orf1320 | ugpC | ABC superfamily ATP binding cassette transporter, ABC protein | 462072 | 460972 | -3 | 1101
TM.orf1321 | glpR | transcriptional regulator, DeoR family | 462836 | 462069 | -1 | 768
TM.orf1322 | ypeA | acetyltransferase, GNAT family | 463231 | 463707 | 3 | 477
TM.orf1323 | aprE | peptidase S8 and S53, subtilisin, kexin, sedolisin | 463916 | 466324 | 2 | 2409
TM.orf1324 | | WD-40 repeat protein | 467482 | 466412 | -2 | 1071
TM.orf1325 | | cobalamin synthesis protein, P47K | 468521 | 467466 | -1 | 1056
TM.orf1326 | | conserved hypothetical protein | 468785 | 469552 | 2 | 768
TM.orf1327 | | hypothetical protein | 469561 | 469698 | 3 | 138
TM.orf1328 | | lysine exporter protein LysE/YggA | 470401 | 469760 | -2 | 642
TM.orf1329 | | AsnC family transcriptional regulator | 470889 | 470398 | -3 | 492
TM.orf1330 | hxuC | TonB-dependent hemoglobin | 471277 | 473346 | 3 | 2070
TM.orf1331 | hemP | hemin uptake protein HemP | 473394 | 473591 | 1 | 198
TM.orf1332 | hmuS | hemin transport protein | 473602 | 474663 | 3 | 1062
TM.orf1333 | hmuT | periplasmic binding protein | 474663 | 475559 | 1 | 897
TM.orf1334 | hmuU | putative ABC-type Fe3+-siderophore transport system, permease component | 475564 | 476712 | 3 | 1149
TM.orf1335 | hmuV | ABC-type hemin transport system, ATPase component | 476730 | 477518 | 1 | 789
TM.orf1336 | bioI | cytochrome P450 hydroxylase | 478788 | 477532 | -3 | 1257
TM.orf1337 | mscL | large conductance mechanosensitive channel protein | 479044 | 479484 | 3 | 441
TM.orf1338 | | conserved hypothetical protein | 481042 | 479507 | -2 | 1536
TM.orf1339 | | outer membrane protein | 481600 | 482310 | 3 | 711
TM.orf1340 | | conserved hypothetical protein | 482576 | 483103 | 2 | 528
TM.orf1341 | tolQ | MotA/TolQ/ExbB proton channel | 483635 | 484510 | 2 | 876
TM.orf1342 | | hypothetical protein | 484536 | 484949 | 1 | 414
TM.orf1343 | | hypothetical protein | 484946 | 485374 | 2 | 429
TM.orf1344 | tonB | TonB-like protein | 485371 | 486483 | 3 | 1113
TM.orf1345 | | glycosyltransferase family 2 | 489295 | 486617 | -2 | 2679
TM.orf1346 | | hypothetical protein | 489715 | 489996 | 3 | 282
TM.orf1347 | | triacylglycerol lipase | 491806 | 490001 | -2 | 1806
TM.orf1348 | | glucokinase | 492005 | 493099 | 2 | 1095
TM.orf1349 | | agmatine deiminase | 493144 | 494274 | 3 | 1131
TM.orf1350 | | pirin | 495175 | 494279 | -2 | 897
TM.orf1351 | ykvY | peptidase M24 | 496516 | 495320 | -2 | 1197
TM.orf1352 | | major facilitator transporter | 497829 | 496525 | -3 | 1305
TM.orf1353 | pchR | AraC family transcriptional regulator | 498011 | 499015 | 2 | 1005
TM.orf1354 | | cation efflux protein | 499913 | 499005 | -1 | 909
TM.orf1355 | BGL2 | glycoside hydrolase, family 17 | 501794 | 500118 | -1 | 1677
TM.orf1356 | cobC | putative threonine-phosphate decarboxylase | 502952 | 501879 | -1 | 1074
TM.orf1357 | cobQ | adenosylcobyric acid synthase | 504421 | 502949 | -2 | 1473
TM.orf1358 | btuB | TonB-dependent receptor | 506335 | 504440 | -2 | 1896
TM.orf1359 | | PRC-barrel domain-containing protein | 507420 | 506827 | -3 | 594
TM.orf1360 | | conserved hypothetical protein | 507743 | 508477 | 2 | 735
TM.orf1361 | | HrpA-like helicase | 508518 | 511061 | 1 | 2544

Locus | Gene | Product | Start | End | Strand | Length
TM.orf1362 | | rhodanese-related sulfurtransferase | 511485 | 511084 | -3 | 402
TM.orf1363 | gcvA | LysR family transcriptional regulator | 511592 | 512470 | 2 | 879
TM.orf1364 | | glutamyl-tRNA(Gln) amidotransferase subunit A | 512585 | 513982 | 2 | 1398
TM.orf1365 | | insulin-cleaving metalloproteinase outer membrane protein | 514113 | 515390 | 1 | 1278
TM.orf1366 | | conserved hypothetical protein | 515482 | 517041 | 3 | 1560
TM.orf1367 | | conserved hypothetical protein | 517041 | 518117 | 1 | 1077
TM.orf1368 | | twin-arginine translocation pathway signal | 518133 | 519266 | 1 | 1134
TM.orf1369 | yenE | Putative monooxygenase yenE | 519633 | 519343 | -3 | 291
TM.orf1370 | | Haloacid dehalogenase domain protein hydrolase | 520433 | 519759 | -1 | 675
TM.orf1371 | mmsB | 3-hydroxyisobutyrate dehydrogenase | 521512 | 520613 | -2 | 900
TM.orf1372 | | acyl-CoA dehydrogenase domain-containing protein | 522771 | 521620 | -3 | 1152
TM.orf1373 | mmsA | methylmalonate-semialdehyde dehydrogenase | 524365 | 522869 | -2 | 1497
TM.orf1374 | mauR | transcriptional regulator | 524594 | 525505 | 2 | 912
TM.orf1375 | | NupC family protein | 525694 | 526542 | 3 | 849
TM.orf1376 | yutK | Na+ dependent nucleoside transporter | 526520 | 526921 | 2 | 402
TM.orf1377 | | radical SAM/B12 binding domain protein | 527265 | 529292 | 1 | 2028
TM.orf1378 | | aminoglycoside phosphotransferase | 529454 | 530494 | 2 | 1041
TM.orf1379 | gno | short-chain dehydrogenase/reductase SDR | 530546 | 531322 | 2 | 777
TM.orf1380 | Mecr | nuclear receptor binding factor | 531420 | 532424 | 1 | 1005
TM.orf1381 | yhfT | fatty-acyl-CoA synthase | 532466 | 534115 | 2 | 1650
TM.orf1382 | | TetR family transcriptional regulator | 534242 | 534850 | 2 | 609
TM.orf1383 | motA | RND family efflux transporter MFP subunit | 534942 | 536090 | 1 | 1149
TM.orf1384 | noG | acriflavin resistance protein | 536146 | 539379 | 3 | 3234
TM.orf1385 | znuA | periplasmic solute binding protein | 540543 | 539479 | -3 | 1065
TM.orf1386 | zur | ferric uptake regulator, Fur family | 540644 | 541201 | 2 | 558
TM.orf1387 | znuC | high-affinity zinc uptake system ATP-binding protein ZnuC | 541198 | 542064 | 3 | 867
TM.orf1388 | znuB | permease of ABC zinc transporter ZnuB | 542083 | 542883 | 3 | 801
TM.orf1389 | | conserved hypothetical protein | 543006 | 544046 | 1 | 1041
TM.orf1390 | potI | binding-protein-dependent transport systems inner membrane component | 544920 | 544096 | -3 | 825
TM.orf1391 | potB | binding-protein-dependent transport systems inner membrane component | 545792 | 544917 | -1 | 876
TM.orf1392 | potA | spermidine/putrescine transport system ATP-binding protein | 546891 | 545794 | -3 | 1098
TM.orf1393 | | twin-arginine translocation pathway signal | 548148 | 546973 | -3 | 1176
TM.orf1394 | | UspA domain-containing protein | 548687 | 549520 | 2 | 834
TM.orf1395 | | major facilitator family protein | 550818 | 549541 | -3 | 1278
TM.orf1396 | | phosphatase | 551026 | 551763 | 3 | 738
TM.orf1397 | | hypothetical protein | 552567 | 551773 | -3 | 795
TM.orf1398 | | lipoprotein, putative | 553277 | 552606 | -1 | 672
TM.orf1399 | | ABC transport system substrate-binding protein | 554268 | 553270 | -3 | 999
TM.orf1400 | | ABC transporter related protein | 555138 | 554287 | -3 | 852
TM.orf1401 | | ABC transport system permease protein | 556336 | 555143 | -2 | 1194
TM.orf1402 | cfB | malonyl-CoA synthase | 557997 | 556462 | -3 | 1536
TM.orf1403 | crt | carnitinyl-CoA dehydratase | 558856 | 558041 | -2 | 816
TM.orf1404 | ydfH | GntR family transcriptional regulator | 559600 | 558914 | -2 | 687
TM.orf1405 | | Tripartite ATP-independent periplasmic transporter DctQ component | 559821 | 560336 | 1 | 516
TM.orf1406 | | TRAP dicarboxylate transporter, DctM subunit | 560333 | 561619 | 2 | 1287
TM.orf1407 | yiiZ | TRAP dicarboxylate transporter, DctP subunit | 561698 | 562741 | 2 | 1044
TM.orf1408 | | acyl-CoA synthase | 562725 | 563960 | 1 | 1236
TM.orf1409 | caiD | enoyl-CoA hydratase | 564010 | 564804 | 3 | 795
TM.orf1410 | lip2 | lipolytic enzyme | 564915 | 565889 | 1 | 975
TM.orf1411 | | hypothetical protein | 565897 | 566424 | 3 | 528
TM.orf1412 | gatA | Glutamyl-tRNA(Gln) amidotransferase subunit A | 566459 | 567871 | 2 | 1413
TM.orf1413 | | conserved hypothetical protein | 568030 | 568968 | 3 | 939
TM.orf1414 | nahD | 2-hydroxychromene-2-carboxylate isomerase | 569070 | 569678 | 1 | 609
TM.orf1415 | dkgB | oxidoreductase | 569801 | 570634 | 2 | 834
TM.orf1416 | | nitroreductase | 571352 | 570693 | -1 | 660
TM.orf1417 | pleC | two-component sensor histidine kinase | 571579 | 573522 | 3 | 1944
TM.orf1418 | dppA | extracellular solute-binding protein | 573651 | 575285 | 1 | 1635
TM.orf1419 | dppB | ABC transporter membrane spanning protein (oligopeptide) | 575282 | 576328 | 2 | 1047
TM.orf1420 | dppC | binding-protein-dependent transport systems inner membrane component | 576325 | 577242 | 3 | 918
TM.orf1421 | dppD | ABC transporter related protein | 577239 | 578087 | 1 | 849
TM.orf1422 | | molybdopterin oxidoreductase | 581154 | 578839 | -3 | 2316
TM.orf1423 | | conserved hypothetical protein | 582223 | 581204 | -2 | 1020
TM.orf1424 | | TRAP transporter, 4TM/12TM fusion protein | 584193 | 582220 | -3 | 1974

Locus | Gene | Product | Start | End | Strand | Length
TM.orf1425 | | 31 kDa immunogenic protein | 585241 | 584273 | -2 | 969
TM.orf1426 | yagI | Transcriptional regulator, IclR family | 586146 | 585379 | -3 | 768
TM.orf1427 | | conserved hypothetical protein | 586343 | 587263 | 2 | 921
TM.orf1428 | | acetone carboxylase, gamma subunit | 587801 | 587295 | | 507
TM.orf1429 | | acetone carboxylase alpha subunit | 590220 | 587902 | -3 | 2319
TM.orf1430 | hyuA | Acetone carboxylase beta subunit; AcxA | 592448 | 590271 | | 2178
TM.orf1431 | stc | Fis family GAF modulated sigma54 specific transcriptional regulator | 592804 | 594699 | 3 | 1896
TM.orf1432 | | conserved hypothetical protein | 595182 | 594706 | -3 | 477
TM.orf1433 | | cation efflux protein precursor | 595344 | 596237 | | 894
TM.orf1434 | cfA | AMP-dependent synthetase and ligase | 596385 | 598199 | | 1815
TM.orf1435 | grsT | thioesterase domain protein | 599028 | 598234 | -3 | 795
TM.orf1436 | ppx | Exopolyphosphatase | 599825 | 600829 | 2 | 1005
TM.orf1437 | rlmE | 23S rRNA methylase | 600879 | 601694 | | 816
TM.orf1438 | guaB | inosine-5'-monophosphate dehydrogenase | 601944 | 603401 | | 1458
TM.orf1439 | rsmB | tRNA and rRNA cytosine-C5-methylase | 603530 | 604882 | 2 | 1353
TM.orf1440 | | hypothetical protein | 604974 | 605618 | | 645
TM.orf1441 | yhiD | MgtC/SapB transporter | 605701 | 606150 | 3 | 450
TM.orf1442 | cpx | Xanthine/uracil/vitamin C permease | 606354 | 608045 | | 1692
TM.orf1443 | ydjP | AB hydrolase superfamily protein ydjP | 608128 | 608985 | 3 | 858
TM.orf1444 | yetK | transporter yetK | 609088 | 610164 | 3 | 1077
TM.orf1445 | guaA | bifunctional GMP synthase, glutamine amidotransferase protein | 610289 | 611836 | 2 | 1548
TM.orf1446 | | TetR family transcriptional regulator | 612122 | 612697 | 2 | 576
TM.orf1447 | | 3-beta hydroxysteroid dehydrogenase/isomerase | 612702 | 613676 | 1 | 975
TM.orf1448 | ydjG | putative oxidoreductase | 614840 | 613758 | -1 | 1083
TM.orf1449 | yeaN | LysR family transcriptional regulator | 615063 | 615968 | 1 | 906
TM.orf1450 | ndvA | Beta-(1->2)glucan export ATP-binding permease protein ndvA | 619031 | 616164 | -1 | 2868
TM.orf1451 | | bacteriocin/lantibiotic ABC transporter | 621187 | 619037 | -2 | 2151
TM.orf1452 | | membrane-fusion protein | 622477 | 621188 | -2 | 1290
TM.orf1453 | | putative secreted protein | 623100 | 622522 | -3 | 579
TM.orf1454 | silP | cation transport ATPase | 625423 | 623330 | -2 | 2094
TM.orf1455 | | conserved hypothetical protein | 626219 | 627070 | 2 | 852
TM.orf1456 | | PEBP family protein | 627587 | 627084 | -1 | 504
TM.orf1457 | yeaM | AraC family transcriptional regulator | 628424 | 627657 | -1 | 768
TM.orf1458 | | conserved hypothetical protein | 628546 | 629337 | 3 | 792
TM.orf1459 | | conserved hypothetical protein | 629444 | 630667 | 2 | 1224
TM.orf1460 | | conserved hypothetical protein | 630670 | 631500 | 3 | 831
TM.orf1461 | | conserved hypothetical protein | 631513 | 632250 | 3 | 738
TM.orf1462 | yuxN | TetR family regulatory protein | 632939 | 632262 | -1 | 678
TM.orf1463 | | glycosyltransferase family protein | 633072 | 634304 | 1 | 1233
TM.orf1464 | | putative integral membrane protein | 635052 | 634318 | -3 | 735
TM.orf1465 | | hypothetical protein | 635191 | 635475 | 3 | 285
TM.orf1466 | godha | membrane-bound PQQ-dependent dehydrogenase, glucose/quinate/shikimate family | 635644 | 636300 | 3 | 657
TM.orf1467 | | Sodium/hydrogen exchanger | 637571 | 636267 | -1 | 1305
TM.orf1468 | | chaperone protein HtpG | 637747 | 638415 | 3 | 669
TM.orf1469 | | molybdopterin oxidoreductase, iron-sulfur binding subunit | 638399 | 640249 | 2 | 1851
TM.orf1470 | hmeA | molybdopterin oxidoreductase, iron-sulfur binding subunit | 640246 | 641451 | 3 | 1206
TM.orf1471 | | Polysulphide reductase NrfD | 641448 | 642797 | | 1350
TM.orf1472 | | transmembrane prediction | 642790 | 643350 | 3 | 561
TM.orf1473 | | conserved hypothetical protein | 643347 | 643880 | | 534
TM.orf1474 | | conserved hypothetical protein | 643877 | 644626 | 2 | 750
TM.orf1475 | | conserved hypothetical protein | 644598 | 644987 | | 390
TM.orf1476 | | hypothetical protein | 644984 | 645415 | 2 | 432
TM.orf1477 | | electron transport protein SCO1/SenC | 645412 | 646215 | 3 | 804
TM.orf1478 | ctaC | cytochrome c oxidase, subunit II | 646212 | 647162 | | 951
TM.orf1479 | ctaD | cytochrome c oxidase, subunit I | 647159 | 648784 | 2 | 1626
TM.orf1480 | cta | cytochrome c oxidase subunit III | 648777 | 649412 | | 636
TM.orf1481 | | caa(3)-type oxidase, subunit IV | 649418 | 649690 | 2 | 273
TM.orf1482 | | Sulfide:quinone oxidoreductase | 651323 | 649695 | | 1629
TM.orf1483 | | transmembrane protein | 652416 | 651634 | -3 | 783
TM.orf1484 | bigR | transcriptional regulator, ArsR family | 652914 | 652513 | -3 | 402
TM.orf1485 | | peroxiredoxin | 653579 | 652911 | | 669
TM.orf1486 | aZB | alanine catabolic operon transcriptional regulator protein | 654154 | 653687 | -2 | 468
TM.orf1487 | eamB | amino acid transporter LysE | 654270 | 654920 | | 651
TM.orf1488 | | conserved hypothetical protein | 655033 | 656364 | 3 | 1332
TM.orf1489 | yfkF | major facilitator superfamily MFS-1 | 656361 | 657545 | | 1185

Locus | Gene | Product | Start | End | Strand | Length
TM.orf1490 | | GATS-like protein 1 | 657939 | 657559 | -3 | 381
TM.orf1491 | | peroxidase-related enzyme | 658554 | 657970 | -3 | 585
TM.orf1492 | yhdH | zinc-binding alcohol dehydrogenase | 659600 | 658608 | -1 | 993
TM.orf1493 | dhaR | TetR family transcriptional regulator | 660309 | 659629 | -3 | 681
TM.orf1494 | nahR | LysR family transcriptional regulator | 661260 | 660334 | -3 | 927
TM.orf1495 | | conserved hypothetical protein | 661351 | 661824 | 3 | 474
TM.orf1496 | | Alcohol dehydrogenase zinc-binding domain protein | 661821 | 662735 | | 915
TM.orf1497 | | hypothetical protein | 662843 | 663415 | | 573
TM.orf1498 | | E family transcriptional regulator | 663415 | 663621 | | 207
TM.orf1499 | | glutathione S-transferase domain-containing protein | 664340 | 663633 | | 708
TM.orf1500 | | conserved hypothetical protein | 665756 | 664476 | | 1281
TM.orf1501 | | transposase, IS4 family | 665938 | 667053 | | 1116
TM.orf1502 | | ATP-dependent metalloprotease FtsH | 667172 | 668986 | | 1815
TM.orf1503 | adhA | zinc-binding alcohol dehydrogenase family protein | 669071 | 670054 | | 984
TM.orf1504 | | dihydroxy-acid dehydratase | 670168 | 671841 | | 1674
TM.orf1505 | xthA | exodeoxyribonuclease III Xth | 672701 | 671868 | | 834
TM.orf1506 | erpA | iron-sulfur cluster insertion protein | 673106 | 672717 | | 390
TM.orf1507 | | dGTP triphosphohydrolase | 673308 | 674513 | | 1206
TM.orf1508 | | arginyl-tRNA synthetase | 674515 | 676260 | | 1746
TM.orf1509 | | protein TonB, putative | 676260 | 677294 | | 1035
TM.orf1510 | nagZ | Beta-N-acetylhexosaminidase | 677291 | 678337 | | 1047
TM.orf1511 | scpA | condensin subunit ScpA | 678359 | 679288 | | 930
TM.orf1512 | scpB | transcription regulator | 679278 | 679955 | |
TM.orf1513 | | membrane-fusion protein | 680119 | 681375 | |
TM.orf1514 | lagD | ATP-binding cassette subfamily C | 681372 | 683636 | |
TM.orf1515 | hlyB | ATP-binding cassette subfamily B | 683633 | 686644 | |
TM.orf1516 | | conserved hypothetical protein | 687259 | 686669 | |
TM.orf1517 | bepC | type I secretion outer membrane protein, TolC family | 688680 | 687286 | |
TM.orf1518 | | hypothetical protein | 688979 | 688677 | |
TM.orf1519 | | conserved hypothetical protein | 689357 | 688986 | |
TM.orf1520 | arcA | Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA binding domain | 689771 | 690445 | |
TM.orf1521 | | hypothetical protein | 690822 | 691073 | | 252
TM.orf1522 | PMP3 | Plasma membrane proteolipid | 691208 | 691393 | | 186
TM.orf1523 | | putative proteasome-type protease | 692190 | 691471 | | 720
TM.orf1524 | | domain protein | 693098 | 692262 | | 837
TM.orf1525 | | conserved hypothetical protein | 694101 | 693154 | | 948
TM.orf1526 | | conserved hypothetical protein | 695737 | 694187 | | 1551
TM.orf1527 | | putative anti-sigma regulatory factor, serine/threonine protein kinase | 696399 | 695950 | | 450
TM.orf1528 | btrV | anti-anti-sigma regulatory factor | 696803 | 696468 | | 336
TM.orf1529 | ygeW | Na+/Pi cotransporter | 697053 | 698705 | | 1653
TM.orf1530 | yoaE | TerC-like membrane protein | 698858 | 699607 | | 750
TM.orf1531 | | hemolysin | 699604 | 700917 | | 1314
TM.orf1532 | | hypothetical protein | 701280 | 700933 | | 348
TM.orf1533 | | extracellular solute-binding protein, family 1 | 701653 | 702708 | | 1056
TM.orf1534 | | Fe(3+)-transport system permease protein SfuB | 702922 | 704649 | | 1728
TM.orf1535 | | ABC transporter related protein | 704649 | 705710 | | 1062
TM.orf1536 | tatA | Sec-independent protein translocase protein TatA/E homolog | 705796 | 706044 | | 249
TM.orf1537 | tatB | Sec-independent protein translocase protein TatB | | 706607 | | 507
TM.orf1538 | tatC | Sec-independent protein translocase protein | 706614 | 707447 | | 834
TM.orf1539 | serS | seryl-tRNA synthetase | 707547 | 708836 | | 1290
TM.orf1540 | surE | acid phosphatase | 708986 | 709741 | | 756
TM.orf1541 | pcm | Protein-L-isoaspartate(D-aspartate) O-methyltransferase | 709744 | 710409 | | 666
TM.orf1542 | ygeR | Membrane protein | 710498 | 711802 | | 1305
TM.orf1543 | fab | 3-oxoacyl-(acyl-carrier-protein) synthase II protein | 713106 | 711841 | | 1266
TM.orf1544 | | transcriptional regulator, TetR family | 713711 | 713118 | | 594
TM.orf1545 | | conserved hypothetical protein | 714692 | 713799 | | 894
TM.orf1546 | yajC | preprotein translocase subunit YajC | 714946 | 715341 | | 396
TM.orf1547 | secD | Preprotein translocase subunit SecD | 715451 | 717019 | | 1569
TM.orf1548 | secF | Preprotein translocase subunit SecF | 717039 | 718007 | | 969
TM.orf1549 | | NADH dehydrogenase ubiquinone 1 alpha subcomplex assembly factor | 718056 | 718439 | | 384
TM.orf1550 | sodB | Superoxide dismutase | 719123 | 718524 | | 600

Locus | Gene | Product | Start | End | Strand | Length
TM.orf1551 | | conserved hypothetical protein | 720699 | 719497 | -3 | 1203
TM.orf1552 | crtB | fusion protein of y4aC and y4aD | 722519 | 720735 | -1 | 1785
TM.orf1553 | | squalene | 722679 | 723536 | 1 | 858
TM.orf1554 | | glucose-inhibited division protein A | 724921 | 723554 | -2 | 1368
TM.orf1555 | | hypothetical protein | 725496 | 725029 | -3 | 468
TM.orf1556 | uvrA | excinuclease ABC subunit A | 728736 | 725845 | -3 | 2892
TM.orf1557 | | Methyl-accepting chemotaxis protein signaling domain | 730575 | 728926 | -3 | 1650
TM.orf1558 | | conserved hypothetical protein | 731388 | 730708 | -3 | 681
TM.orf1559 | yodQ | acetylornithine deacetylase | 732898 | 731552 | -2 | 1347
TM.orf1560 | cmpR | LysR family transcriptional regulator | 733016 | 733912 | 2 | 897
TM.orf1561 | cysW | sulfate ABC transporter, permease protein CysW | 734802 | 733906 | -3 | 897
TM.orf1562 | cysU | Sulfate ABC transporter, permease protein CysT | 735666 | 734827 | -3 | 840
TM.orf1563 | cysP | Sulfate ABC transporter, sulfate-binding protein | 736684 | 735674 | -2 | 1011
TM.orf1564 | | hypothetical protein | 736794 | 736681 | -3 | 114
TM.orf1565 | | beta-lactamase domain-containing protein | 738310 | 736901 | -2 | 1410
TM.orf1566 | ssb | single-strand DNA binding protein | 738492 | 739133 | 1 | 642
TM.orf1567 | | aldo/keto reductase | 740203 | 739235 | -2 | 969
TM.orf1568 | ohrR | transcriptional regulator, MarR family | 740407 | 740874 | 3 | 468
TM.orf1569 | ohr | organic hydroperoxide resistance protein | 741000 | 741413 | 1 | 414
TM.orf1570 | gyrA | Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), A subunit | 741674 | 744610 | 2 | 2937
TM.orf1571 | coaD | phosphopantetheine adenylyltransferase | 744632 | 745171 | 2 | 540
TM.orf1572 | queA | Queuosine biosynthesis protein | 745280 | 746341 | 2 | 1062
TM.orf1573 | tgt | Queuine/archaeosine tRNA-ribosyltransferase | 746338 | 747522 | 3 | 1185
TM.orf1574 | | AMP-dependent synthetase and ligase | 749280 | 747559 | -3 | 1722
TM.orf1575 | mcp2 | methyl-accepting chemotaxis protein | 750175 | 751881 | 3 | 1707
TM.orf1576 | | hypothetical protein | 752014 | 752364 | 3 | 351
TM.orf1577 | | similar to protein conserved in bacteria with a cystatin-like fold | 752687 | 753322 | 2 | 636
TM.orf1578 | | Aldehyde Dehydrogenase | 754996 | 753404 | -2 | 1593
TM.orf1579 | | Hydroxypyruvate isomerase | 755861 | 755070 | -1 | 792
TM.orf1580 | | SprT protein | 755977 | 756609 | 3 | 633
TM.orf1581 | dc | Orn/DAP/Arg decarboxylase 2 | 757015 | 758154 | 3 | 1140
TM.orf1582 | | haloacid dehalogenase, type II | 758963 | 758262 | -1 | 702
TM.orf1583 | st | Soluble lytic murein transglycosylase precursor | 761198 | 759060 | -1 | 2139
TM.orf1584 | dapA | dihydrodipicolinate synthase | 761583 | 762494 | 1 | 912
TM.orf1585 | smpB | SsrA-binding protein | 762550 | 763026 | 3 | 477
TM.orf1586 | yuiH | oxidoreductase | 763031 | 763738 | 2 | 708
TM.orf1587 | | putative uracil-DNA glycosylase | 764448 | 763729 | -3 | 720
TM.orf1588 | | conserved hypothetical protein | 765409 | 764588 | -2 | 822
TM.orf1589 | folK | 7,8-Dihydro-6-hydroxymethylpterin-pyrophosphokinase, HPPK | 765659 | 766156 | 2 | 498
TM.orf1590 | rpoZ | DNA-directed RNA polymerase, omega subunit | 766263 | 766664 | 1 | 402
TM.orf1591 | rsh | Guanosine polyphosphate pyrophosphohydrolases, synthetases | 766844 | 769000 | 2 | 2157
TM.orf1592 | pyrE | orotate phosphoribosyltransferase | 769087 | 769689 | 3 | 603
TM.orf1593 | pdxJ | Pyridoxal phosphate biosynthesis protein | 769762 | 770553 | 3 | 792
TM.orf1594 | acpS | 4'-phosphopantetheinyl transferase | 770558 | 770989 | 2 | 432
TM.orf1595 | lepB | signal peptidase I | 770986 | 771786 | 3 | 801
TM.orf1596 | rnc | RNAse III | 771856 | 772584 | 3 | 729
TM.orf1597 | era | GTP-binding protein Era | 772581 | 773660 | 1 | 1080
TM.orf1598 | | hypothetical protein | 773946 | 774569 | 1 | 624
TM.orf1599 | | 3040 | 774559 | 775173 | 3 | 615
TM.orf1600 | | hypothetical protein | 775189 | 775713 | 3 | 525
TM.orf1601 | | conserved hypothetical protein | 776669 | 775686 | -1 | 984
TM.orf1602 | | transcriptional regulator protein | 776764 | 777684 | 3 | 921
TM.orf1603 | recO | DNA repair protein RecO | 777681 | 778469 | 1 | 789
TM.orf1604 | parC | Gram negative topoisomerase IV, subunit A | 778466 | 780784 | 2 | 2319
TM.orf1605 | | TRAP dicarboxylate transporter, DctM subunit | 782490 | 780913 | -3 | 1578
TM.orf1606 | | tripartite ATP-independent periplasmic transporter DctQ | 782764 | 782495 | -2 | 270
TM.orf1607 | | tripartite ATP-independent periplasmic transporter DctQ | 783036 | 782752 | -3 | 285
TM.orf1608 | | Bacterial extracellular solute-binding protein, family 7 | 784565 | 783456 | -1 | 1110
TM.orf1609 | | TRAP-type transport system extracellular solute binding protein | 786091 | 784991 | -2 | 1101
TM.orf1610 | | Bacterial extracellular solute-binding protein, family 7 | 787389 | 786277 | -3 | 1113

Locus | Gene | Product | Start | End | Strand | Length
TM.orf1611 | ate | Arginyltransferase | 788423 | 787695 | -1 | 729
TM.orf1612 | tdcB | threonine dehydratase | 789781 | 788525 | -2 | 1257
TM.orf1613 | quiP | peptidase S45 penicillin amidase | 792263 | 789819 | -1 | 2445
TM.orf1614 | Odc1 | lysine | 792435 | 793697 | 1 | 1263
TM.orf1615 | mhpD | fumarylacetoacetate (FAA) hydrolase | 794703 | 793690 | -3 | 1014
TM.orf1616 | hemB | Delta-aminolevulinic acid dehydratase | 795758 | 794754 | -1 | 1005
TM.orf1617 | | putative 3-hydroxyisobutyryl-Coenzyme A hydrolase | 796022 | 797101 | 2 | 1080
TM.orf1618 | expG | transcriptional regulator | 797343 | 797798 | 1 | 456
TM.orf1619 | speE | spermidine synthase | 798767 | 797889 | -1 | 879
TM.orf1620 | speH | S-adenosylmethionine decarboxylase proenzyme | 799218 | 798817 | -3 | 402
TM.orf1621 | | Haloacetate dehalogenase H-1 | 799576 | 800466 | 3 | 891
TM.orf1622 | | conserved hypothetical protein | 800503 | 801288 | 3 | 786
TM.orf1623 | | hypothetical protein | 801610 | 802128 | 3 | 519
TM.orf1624 | rnk | nucleoside diphosphate kinase regulator | 802160 | 802576 | 2 | 417
TM.orf1625 | recC | ATP-dependent DNA helicase | 804676 | 802577 | -2 | 2100
TM.orf1626 | EMIS | Succinate dehydrogenase assembly factor | 804861 | 805169 | 1 | 309
TM.orf1627 | mfd | transcription-repair coupling factor superfamily I helicase | 805326 | 808889 | 1 | 3564
TM.orf1628 | | conserved hypothetical protein | 809061 | 809393 | 1 | 333
TM.orf1629 | | hypothetical protein | 809447 | 809929 | 2 | 483
TM.orf1630 | pleD | response regulator/GGDEF domain protein | 811242 | 809914 | -3 | 1329
TM.orf1631 | ytcI | AMP-dependent synthetase and ligase | 812810 | 811293 | -1 | 1518
TM.orf1632 | | methyl-accepting chemotaxis protein | 815215 | 813104 | -2 | 2112
TM.orf1633 | | tRNA/rRNA methyltransferase (SpoU) | 815583 | 816182 | 1 | 600
TM.orf1634 | pilJ | methyl-accepting chemotaxis protein | 818315 | 816198 | -1 | 2118
TM.orf1635 | | Transmembrane protein | 819276 | 818431 | -3 | 846
TM.orf1636 | | Glycosyltransferase | 820451 | 819339 | -1 | 1113
TM.orf1637 | | UDP-2,3-diacylglucosamine hydrolase | 821447 | 820455 | -1 | 993
TM.orf1638 | | cold-shock DNA-binding domain-containing protein | 821901 | 821695 | -3 | 207
TM.orf1639 | | cold-shock DNA-binding domain protein | 822480 | 822274 | -3 | 207
TM.orf1640 | kipR | regulatory protein, IclR | 823933 | 823052 | -2 | 882
TM.orf1641 | bcpA | 2,3-dimethylmalate lyase | 824793 | 823930 | -3 | 864
TM.orf1642 | | putative branched chain amino acid ABC transporter substrate-binding protein | 826112 | 824880 | -1 | 1233
TM.orf1643 | Iw | ABC transporter related protein | 826920 | 826201 | -3 | 720
TM.orf1644 | braF | High-affinity branched-chain amino acid transport ATP-binding protein braF | 827752 | 826913 | -2 | 840
TM.orf1645 | IiwV | branched chain amino acid ABC transporter inner membrane protein | 828717 | 827749 | -3 | 969
TM.orf1646 | Iiw | putative ABC transporter, permease protein | 829599 | 828733 | -3 | 867
TM.orf1647 | leuD | 3-isopropylmalate dehydratase, small subunit | 830134 | 829604 | -2 | 531
TM.orf1648 | | 3-isopropylmalate dehydratase large subunit | 831449 | 830193 | -1 | 1257
TM.orf1649 | | F0F1 ATP synthase subunit beta | 831718 | 833145 | 3 | 1428
TM.orf1650 | | alternate F1F0 ATPase, F1 subunit epsilon | 833142 | 833570 | 1 | 429
TM.orf1651 | | F0F1-ATPase subunit | 833587 | 833877 | 3 | 291
TM.orf1652 | | hypothetical protein | 833874 | 834194 | 1 | 321
TM.orf1653 | | F0F1 ATP synthase subunit A | 834191 | 834925 | 2 | 735
TM.orf1654 | | F0F1 ATP synthase subunit C | 834922 | 835170 | 3 | 249
TM.orf1655 | | H+ transporting two-sector ATPase B/B' subunit | 835184 | 835927 | 2 | 744
TM.orf1656 | | F0F1 ATP synthase subunit alpha | 835914 | 837476 | 1 | 1563
TM.orf1657 | atpG | H+ transporting two-sector ATPase gamma subunit | 837473 | 838354 | 2 | 882
TM.orf1658 | | hypothetical protein | 838654 | 838367 | -2 | 288
TM.orf1659 | | major facilitator superfamily MFS 1 | 840067 | 838769 | -2 | 1299
TM.orf1660 | | TfoX domain-containing protein | 840283 | 840675 | 3 | 393
TM.orf1661 | | conserved hypothetical protein | 840779 | 842725 | 2 | 1947
TM.orf1662 | ppdK | Pyruvate, phosphate dikinase | 842737 | 844335 | 3 | 1599
TM.orf1663 | yehW | binding-protein-dependent transport systems inner membrane component | 845088 | 844321 | -3 | 768
TM.orf1664 | yehK | ABC transporter related protein | 846020 | 845085 | -1 | 936
TM.orf1665 | yehY | binding-protein-dependent transport systems inner membrane component | 847249 | 846017 | -2 | 1233
TM.orf1666 | osmR | Substrate-binding region of ABC-type glycine betaine transport system | 848202 | 847246 | -3 | 957
TM.orf1667 | | putative membrane protein | 848844 | 848488 | -3 | 357
TM.orf1668 | | response regulator | 849103 | 850494 | 3 | 1392
TM.orf1669 | | lipolytic enzyme | 851479 | 850499 | -2 | 981
TM.orf1670 | vapI | XRE family plasmid maintenance system antidote protein | 851988 | 852281 | 1 | 294
TM.orf1671 | | conserved hypothetical protein | 852615 | 852304 | -3 | 312

Locus | Gene | Product | Start | End | Strand | Length
TM.orf1672 | | transposase, mutator type | 852717 | 853367 | 1 | 651
TM.orf1673 | | transposase, mutator type | 853497 | 853646 | 1 | 150
TM.orf1674 | mltB | Lytic murein transglycosylase | 855361 | 856512 | 3 | 1152
TM.orf1675 | rlpA | Lipoproteins | 856642 | 857619 | 3 | 978
TM.orf1676 | dacA | serine-type D-Ala-D-Ala carboxypeptidase | 857693 | 858931 | 2 | 1239
TM.orf1677 | tmk | thymidylate kinase | 858943 | 859620 | 3 | 678
TM.orf1678 | holB | ATPase involved in DNA replication | 859617 | 860744 | 1 | 1128
TM.orf1679 | metG | methionyl-tRNA synthetase | 860830 | 862380 | 3 | 1551
TM.orf1680 | yefH | TatD-related deoxyribonuclease | 862436 | 863242 | 2 | 807
TM.orf1681 | lipB | metallo-beta-lactamase superfamily | 863242 | 864087 | 3 | 846
TM.orf1682 | | hypothetical protein | 864096 | 864536 | 1 | 441
TM.orf1683 | | regulatory protein TetR | 865132 | 864539 | -2 | 594
TM.orf1684 | ydgK | drug resistance transporter, Bcr/CflA subfamily | 866381 | 865194 | -1 | 1188
TM.orf1685 | | conserved hypothetical protein | 866575 | 866916 | 3 | 342
TM.orf1686 | rpoE | RNA polymerase, sigma-24 subunit, ECF subfamily | 866987 | 867547 | 2 | 561
TM.orf1687 | | conserved hypothetical protein | 867531 | 868301 | 1 | 771
TM.orf1688 | | conserved hypothetical protein | 868393 | 869109 | 3 | 717
TM.orf1689 | | conserved hypothetical protein | 869258 | 869860 | 2 | 603
TM.orf1690 | | acetyltransferase | 870904 | 870125 | -2 | 780
TM.orf1691 | dnaB | replicative DNA helicase | 871863 | 871147 | -3 | 717
TM.orf1692 | yvbT | conserved hypothetical protein | 872145 | 873167 | 1 | 1023
TM.orf1693 | mazG | MazG family protein | 873366 | 874289 | 1 | 924
TM.orf1694 | ydfG | short-chain dehydrogenase/reductase SDR | 874289 | 874606 | 2 | 318
TM.orf1695 | ydfG | short-chain dehydrogenase/reductase SDR | 874603 | 875043 | 3 | 441
TM.orf1696 | groL | unnamed protein product | 876885 | 875239 | -3 | 1647
TM.orf1697 | groS | chaperonin | 877294 | 877004 | -2 | 291
TM.orf1698 | hflX | GTP-binding protein HflX | 878805 | 877438 | -3 | 1368
TM.orf1699 | hfq | RNA-binding protein Hfq | 879268 | 878990 | -2 | 279
TM.orf1700 | | HAD-superfamily hydrolase, subfamily IA, variant 1 | 880150 | 879434 | -2 | 717
TM.orf1701 | trkA | Trk system potassium uptake protein | 881552 | 880176 | -1 | 1377
TM.orf1702 | ntrX | response regulator containing CheY-like receiver | 882971 | 881586 | -1 | 1386
TM.orf1703 | | Nitrogen regulation protein, NtrY, signal transduction histidine kinase | 885423 | 882976 | -3 | 2448
TM.orf1704 | ntrC | two-component response regulator, nitrogen regulation response | 886904 | 885420 | -1 | 1485
TM.orf1705 | | nitrogen regulation protein | 888063 | 886921 | -3 | 1143
TM.orf1706 | dus | tRNA-dihydrouridine synthase | 889211 | 888213 | -1 | 999
TM.orf1707 | ispD/ispF | bifunctional enzyme | 889399 | 890658 | 3 | 1260
TM.orf1708 | pgpA | phosphatidylglycerophosphatase A and related protein | 890655 | 891176 | 1 | 522
TM.orf1709 | ygaD | protein (competence- and mitomycin-induced) | 891199 | 891690 | 3 | 492
TM.orf1710 | | sigma factor, sigma 70 type, group 4 (ECF) | 892690 | 891716 | -2 | 975
TM.orf1711 | | conserved hypothetical protein | 893151 | 892687 | -3 | 465
TM.orf1712 | | oligoketide cyclase/lipid transport protein | 893770 | 893270 | -2 | 501
TM.orf1713 | lipA | lipoic acid synthetase | 894835 | 893879 | -2 | 957
TM.orf1714 | lpd | dihydrolipoamide dehydrogenase | 896793 | 894970 | -3 | 1824
TM.orf1715 | | pyruvate dehydrogenase complex dihydrolipoamide acetyltransferase | 898249 | 896852 | -2 | 1398
TM.orf1716 | pdhB | Pyruvate dehydrogenase E1 component, beta subunit | 899277 | 898291 | -3 | 987
TM.orf1717 | pahA | 2-dehydro-3-deoxyphosphooctonate aldolase | 900396 | 899374 | -3 | 1023
TM.orf1718 | yybE | transcriptional regulator, LysR family | 901444 | 900551 | -2 | 894
TM.orf1719 | | putative enoyl-CoA hydratase echA8 | 901579 | 902382 | 3 | 804
TM.orf1720 | | Peptidoglycan-binding domain 1 protein | 903739 | 902756 | -2 | 984
TM.orf1721 | | Septum formation initiator | 904226 | 903840 | | 387
TM.orf1722 | amiC | ABC-type branched-chain amino acid transport system, periplasmic component | 904609 | 905880 | 3 | 1272
TM.orf1723 | Iv | High-affinity branched-chain amino acid transport system permease protein Iiv | 906153 | 907892 | | 1740
TM.orf1724 | IwV | branched chain amino acid ABC transporter [illegible] | 907897 | 909099 | 3 | 1203
TM.orf1725 | braF | putative ATP-binding component of ABC transporter | 909096 | 909899 | | 804
TM.orf1726 | Iw | ABC transporter, nucleotide binding ATPase protein (urea amide) | 909915 | 910610 | | 696
TM.orf1727 | ureD | urease accessory protein UreD | 910629 | 911585 | | 957
TM.orf1728 | ureA | urease, gamma subunit | 911653 | 911955 | 3 | 303
TM.orf1729 | ureB | urease subunit beta | 911966 | 912364 | 2 | 399
TM.orf1730 | ureC | Urease alpha subunit | 912369 | 914078 | | 1710
TM.orf1731 | ureE | urease accessory protein | 914190 | 915035 | | 846
TM.orf1732 | ureF | Urease accessory protein UreF | 915043 | 915798 | 3 | 756

Locus | Gene | Product | Start | End | Strand | Length
TM.orf1733 | ureG | urease accessory protein | 915852 | 916511 | 1 | 660
TM.orf1734 | | conserved hypothetical protein | 917954 | 916518 | -1 | 1437
TM.orf1735 | | glycosyltransferase WecB/TagA/CpsF | 918745 | 917951 | -2 | 795
TM.orf1736 | | polysaccharide export protein | 919529 | 918765 | -1 | 765
TM.orf1737 | cpsD | lipopolysaccharide biosynthesis | 921541 | 919529 | -2 | 2013
TM.orf1738 | | conserved hypothetical protein | 922273 | 923637 | 3 | 1365
TM.orf1739 | wica | Sugar transferase | 923693 | 925129 | 2 | 1437
TM.orf1740 | | glycoside hydrolase family 18 | 925129 | 926148 | 3 | 1020
TM.orf1741 | | FkbM family methyltransferase | 927010 | 926171 | -2 | 840
TM.orf1742 | | hypothetical protein | 927033 | 927158 | 1 | 126
TM.orf1743 | | conserved hypothetical protein | 927269 | 927676 | 2 | 408
TM.orf1744 | dgdR | LysR family transcriptional regulator | 928536 | 927688 | -3 | 849
TM.orf1745 | | Membrane protein | 928691 | 929641 | 2 | 951
TM.orf1746 | eno | enolase | 930955 | 929666 | -2 | 1290
TM.orf1747 | kdsA | 2-dehydro-3-deoxyphosphooctonate aldolase | 931957 | 931103 | -2 | 855
TM.orf1748 | pyrG | CTP synthase | 933636 | 932002 | -3 | 1635
TM.orf1749 | | hypothetical protein | 934265 | 933840 | -1 | 426
TM.orf1750 | tpiA | triose-phosphate isomerase | 935114 | 934356 | -1 | 759
TM.orf1751 | ppiD | peptidyl-prolyl cis-trans isomerase | 935472 | 937385 | 1 | 1914
TM.orf1752 | trpE | Anthranilate synthase component I | 937456 | 938973 | 3 | 1518
TM.orf1753 | guaA | putative GMP synthase glutamine-hydrolyzing | 938970 | 939551 | 1 | 582
TM.orf1754 | copZ | heavy metal transport/detoxification protein | 939783 | 939574 | -3 | 210
TM.orf1755 | yibQ | conserved hypothetical protein | 941474 | 939789 | -1 | 1686
TM.orf1756 | hmrR | Cu(I)-responsive transcriptional regulator | 941993 | 941529 | -1 | 465
TM.orf1757 | copA | copper-translocating P-type ATPase | 944254 | 941990 | -2 | 2265
TM.orf1758 | trpG | anthranilate synthase component II | 944525 | 945148 | 2 | 624
TM.orf1759 | trpD | anthranilate phosphoribosyltransferase | 945145 | 946185 | 3 | 1041
TM.orf1760 | trpC | indole-3-glycerol phosphate synthase | 946190 | 947005 | 2 | 816
TM.orf1761 | moaC | molybdenum cofactor biosynthesis protein MoaC | 947011 | 947508 | 3 | 498
TM.orf1762 | moeA | molybdopterin biosynthesis protein | 947547 | 948782 | 1 | 1236
TM.orf1763 | lexA | LexA repressor | 949031 | 949879 | 2 | 849
TM.orf1764 | | ComEC/Rec2-related protein | 952111 | 949964 | -2 | 2148
TM.orf1765 | | Glutamyl-tRNA synthetase, class Ic | 952232 | 953629 | 2 | 1398
TM.orf1766 | plpC | NLPA lipoprotein | 954013 | 954648 | 3 | 636
TM.orf1767 | | conserved hypothetical protein | 954721 | 956109 | 3 | 1389
TM.orf1768 | lpxB | lipid-A-disaccharide synthase | 957339 | 956149 | -3 | 1191
TM.orf1769 | | conserved hypothetical protein | 958160 | 957336 | -1 | 825
TM.orf1770 | lpxA | UDP-N-acetylglucosamine acyltransferase | 958996 | 958157 | -2 | 840
TM.orf1771 | fabZ | beta-hydroxyacyl-(acyl-carrier-protein) dehydratase FabZ | 959469 | 958993 | -3 | 477
TM.orf1772 | lpxD | UDP-3-O-3-hydroxymyristoylglucosamine N-acyltransferase | 960565 | 959534 | -2 | 1032
TM.orf1773 | | outer membrane protein | 961264 | 960677 | -2 | 588
TM.orf1774 | yaeT | outer membrane protein | 963560 | 961269 | -1 | 2292
TM.orf1775 | | Zinc metalloprotease Atu1380 | 964840 | 963683 | -2 | 1158
TM.orf1776 | dxr | 1-deoxy-D-xylulose 5-phosphate reductoisomerase | 965945 | 964890 | -1 | 1056
TM.orf1777 | cdsA | CDP-diglyceride synthetase | 966871 | 966098 | -2 | 774
TM.orf1778 | uppS | undecaprenyl pyrophosphate synthetase | 967679 | 966966 | -1 | 714
TM.orf1779 | frr | ribosome recycling factor | 968277 | 967717 | -3 | 561
TM.orf1780 | pyrH | uridylate kinase | 969009 | 968281 | -3 | 729
TM.orf1781 | tsf | elongation factor Ts | 970195 | 969272 | -2 | 924
TM.orf1782 | rpsB | 30S ribosomal protein S2 | 971239 | 970433 | -2 | 807
TM.orf1783 | | DNA polymerase III alpha subunit | 975030 | 971530 | -3 | 3501
TM.orf1784 | | conserved hypothetical protein | 975621 | 975127 | -3 | 495
TM.orf1785 | | conserved hypothetical protein | 976125 | 975841 | -3 | 285
TM.orf1786 | ptxR | LysR family transcriptional regulator | 977125 | 976223 | -2 | 903
TM.orf1787 | nylA | 6-aminohexanoate-cyclic-dimer hydrolase | 978624 | 977197 | -3 | 1428
TM.orf1788 | nreC | transcriptional regulator, LuxR family | 979695 | 978703 | -3 | 993
TM.orf1789 | rpsI | 30S ribosomal protein S9 | 980333 | 979839 | -1 | 495
TM.orf1790 | rplM | Ribosomal protein L13 | 980799 | 980341 | -3 | 459
TM.orf1791 | | Enoyl-CoA hydratase/carnithine racemase | 981104 | 981913 | 2 | 810
TM.orf1792 | yect J | O-acetylhomoserine/O-acetylserine sulfhydrylase | 981918 | 982421 | 1 | 504
TM.orf1793 | cysD | O-acetylhomoserine sulfhydrylase | 982418 | 983734 | 2 | 1317
TM.orf1794 | phoE | phosphoglycerate mutase family protein | 983773 | 984378 | 3 | 606
TM.orf1795 | azoR | FMN-dependent NADH-azoreductase | 985030 | 984401 | -2 | 630
TM.orf1796 | | transcriptional regulator, LysR family | 985165 | 986124 | 3 | 960
TM.orf1797 | | flavin reductase-like, FMN-binding | 986115 | 986681 | 1 | 567
TM.orf1798 | dgdR | transcriptional regulator, LysR family protein | 987558 | 986707 | -3 | 852
TM.orf1799 | dgdA | aminotransferase class-III | 987691 | 988995 | 3 | 1305
TM.orf1800 | | conserved hypothetical protein | 989035 | 989901 | 3 | 867

Locus | Gene | Product | Start | End | Strand | Length
TM.orf1801 | | glutamine amidotransferase, class II/dipeptidase | 1989950 | 1991113 | 3 | 1164
TM.orf1802 | | conserved hypothetical protein | 1991182 | 1992288 | | 1107
TM.orf1803 | | TRAP-T family transporter, DctQ (4 TMs) subunit | 1992301 | 1992771 | | 471
TM.orf1804 | | TRAP-T family protein transporter, DctM (12 TMs) subunit | 1992768 | 1994075 | 2 | 1308
TM.orf1805 | | conserved hypothetical protein | 1995126 | 1994083 | | 1044
TM.orf1806 | | HD superfamily metal-dependent phosphohydrolase | 1996391 | 1995123 | -2 | 1269
TM.orf1807 | | divalent cation tolerance protein | 1997241 | 1996522 | | 720
TM.orf1808 | | transcriptional regulator, MerR family | 1999060 | 1998188 | -3 | 873
TM.orf1809 | ihfA | integration host factor subunit alpha | 1999445 | 1999101 | -2 | 345
TM.orf1810 | fabH | 3-oxoacyl-acyl-carrier-protein synthase III | 2000583 | 1999606 | | 978
TM.orf1811 | plsX | putative glycerol-3-phosphate acyltransferase PlsX | 2001644 | 2000580 | -2 | 1065
TM.orf1812 | rpmF | Ribosomal protein L32 | 2001904 | 2001719 | -3 | 186
TM.orf1813 | | conserved hypothetical protein | 2002635 | 2002018 | | 618
TM.orf1814 | uqcc | ubiquinol-cytochrome C chaperone | 2003360 | 2002755 | -2 | 606
TM.orf1815 | | conserved hypothetical protein | 2003577 | 2004143 | 2 | 567
TM.orf1816 | | hypothetical protein | 2004690 | 2004232 | | 459
TM.orf1817 | | conserved hypothetical protein | 2005305 | 2004736 | | 570
TM.orf1818 | thiL | thiamine-monophosphate kinase | 2006528 | 2005488 | -2 | 1041
TM.orf1819 | nusB | transcription antitermination protein NusB | 2007118 | 2006525 | -3 | 594
TM.orf1820 | | 6,7-dimethyl-8-ribityllumazine synthase | 2007584 | 2007135 | -2 | 450
TM.orf1821 | | GTP cyclohydrolase II | 2008743 | 2007592 | | 1152
TM.orf1822 | ribE | riboflavin synthase, alpha subunit | 2009425 | 2008793 | -3 | 633
TM.orf1823 | ribD | riboflavin biosynthesis protein ribD | 2010538 | 2009504 | -3 | 1035
TM.orf1824 | | Transcriptional repressor nrdR | 2011124 | 2010648 | -2 | 477
TM.orf1825 | glyA | serine hydroxymethyltransferase | 2012503 | 2011205 | -3 | 1299
TM.orf1826 | ywlF | ribose 5-phosphate isomerase RpiB | 2013101 | 2012658 | -2 | 444
TM.orf1827 | | transcriptional regulatory protein MucR | 2013596 | 2014069 | 3 | 474
TM.orf1828 | | hypothetical protein | 2014465 | 2014229 | -3 | 237
TM.orf1829 | | hypothetical protein | 2015327 | 2014479 | -2 | 849
TM.orf1830 | | hypothetical protein | 2016262 | 2015324 | -3 | 939
TM.orf1831 | | hypothetical protein | 2016565 | 2016266 | -3 | 300
TM.orf1832 | | conserved hypothetical protein | 2016880 | 2016608 | -3 | 273
TM.orf1833 | aspS | Aspartyl-tRNA synthetase | 2018935 | 2017109 | -3 | 1827
TM.orf1834 | rnd | Ribonuclease D | 2019195 | 2020463 | 2 | 1269
TM.orf1835 | | putative glyoxalase/bleomycin resistance protein/dioxygenase family protein | 2021259 | 2020492 | -1 | 768
TM.orf1836 | | chromosomal replication initiator | 2021912 | 2021256 | -2 | 657
TM.orf1837 | | [illegible] | 2023120 | 2021981 | -3 | 1140
TM.orf1838 | | CDP-alcohol phosphatidyltransferase | 2023757 | 2023140 | -2 | 618
TM.orf1839 | | conserved hypothetical protein | 2025005 | 2023761 | -2 | 1245
TM.orf1840 | purM | Phosphoribosylformylglycinamidine cyclo-ligase | 2025286 | 2026419 | 1 | 1134
TM.orf1841 | GART | phosphoribosylglycinamide formyltransferase | 2026392 | 2027057 | 2 | 666
TM.orf1842 | ndk | Nucleoside diphosphate kinase | 2027572 | 2027150 | -3 | 423
TM.orf1843 | yheS | ABC transporter, ATP-binding protein | 2027829 | 2029742 | 2 | 1914
TM.orf1844 | walK | PAS/PAC sensor hybrid histidine kinase | 2030545 | 2032674 | 1 | 2130
TM.orf1845 | | DNA polymerase III subunit chi | 2033165 | 2032686 | -2 | 480
TM.orf1846 | pepA | leucyl aminopeptidase | 2034667 | 2033183 | -3 | 1485
TM.orf1847 | | Predicted permeases | 2034999 | 2036126 | 2 | 1128
TM.orf1848 | | putative permease | 2036123 | 2037277 | 3 | 1155
TM.orf1849 | lptD | Organic solvent tolerance protein OstA | 2037280 | 2039472 | 1 | 2193
TM.orf1850 | surA | Parvulin-like peptidyl-prolyl isomerase | 2039504 | 2040784 | 3 | 1281
TM.orf1851 | pdxA | dimethyladenosine transferase | 2040771 | 2041805 | 2 | 1035
TM.orf1852 | rsmA | dimethyladenosine transferase | 2041853 | 2042728 | 3 | 876
TM.orf1853 | yrbG | Sodium/calcium exchanger | 2042829 | 2043818 | 2 | 990
TM.orf1854 | gmk | Guanylate kinase (GMP kinase) | 2044522 | 2043848 | -3 | 675
TM.orf1855 | yicC | stress-induced protein | 2045441 | 2044551 | -2 | 891
TM.orf1856 | | Short-chain dehydrogenase/reductase (SDR) superfamily | 2045603 | 2046376 | 3 | 774
TM.orf1857 | btuB | TonB-dependent receptor | 2046799 | 2048769 | 1 | 1971
TM.orf1858 | | periplasmic binding protein | 2048773 | 2049651 | 1 | 879
TM.orf1859 | yvrB | transport system permease protein | 2049648 | 2050643 | 2 | 996
TM.orf1860 | fecE | iron(III) dicitrate transport ATP-binding protein FecE | 2050647 | 2051492 | 2 | 846
TM.orf1861 | | conserved hypothetical protein | 2051447 | 2052490 | 3 | 1044
TM.orf1862 | | periplasmic solute-binding protein | 2053517 | 2052495 | -2 | 1023
TM.orf1863 | fab | 3-oxoacyl-acyl-carrier-protein synthase II | 2054827 | 2053568 | -3 | 1260
TM.orf1864 | acpP | transit peptide-acyl carrier protein fusion protein | 2055348 | 2055112 | -1 | 237