bioRxiv preprint doi: https://doi.org/10.1101/2020.06.11.147629; this version posted June 12, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

Research Article for mBio

A pan-genome method to determine core regions of the Bacillus subtilis and

Escherichia coli genomes

Running Title

A pan-genome method for core region determination in bacteria

Authors

Granger Suttona, Gary B. Fogelb, Bradley Abramsona, Lauren Brinkac, Todd Michaela, Enoch S.

Liub, Sterling Thomasc#

a J. Craig Venter Institute

b Natural Selection, Inc.

c Noblis, Inc.

Corresponding author: Sterling Thomas, [email protected]

bioRxiv preprint doi: https://doi.org/10.1101/2020.06.11.147629; this version posted June 12, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

Abstract

Synthetic engineering of bacteria to produce industrial products is a burgeoning field of research

and application. In order to optimize genome design, designers need to understand which

are essential, which are optimal for growth, and locations in the genome that will be tolerated by

the organism when inserting engineered cassettes. We present a pan-genome based method for

the identification of core regions in a genome that are strongly conserved at the species level. We

show that these core regions are very likely to contain all or almost all essential genes. We assert

that synthetic engineers should avoid deleting or inserting into these core regions unless they

understand and are manipulating the function of the genes in that region. Similarly, if the

designer wishes to streamline the genome, non-core regions and in particular low penetrance

genes would be good targets for deletion. Care should be taken to remove entire cassettes with

similar penetrance of the genes within cassettes as they may harbor toxin/antitoxin genes which

need to be removed in tandem. The bioinformatic approach introduced here saves considerable

time and effort relative to knockout studies on single isolates of a given species and captures a

broad understanding of the conservation of genes that are core to a species.

bioRxiv preprint doi: https://doi.org/10.1101/2020.06.11.147629; this version posted June 12, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

Importance

The pan-genome approach presented in this paper can be used to determine core regions of a

genome and has many possible applications. Synthetic engineering design can be informed by

which genes/regions are more conserved (core) versus less conserved. The level of conservation

of adjacent non-core genes tends to define cassettes of genes which may be part of a pathway or

system that can inform researchers about possible functional significance. The pattern of

presence across the different genomes of a species can inform the understanding of evolution and

horizontal gene acquisition. The approach saves considerable time and effort relative to

laboratory methods used to identify essential genes in species.

bioRxiv preprint doi: https://doi.org/10.1101/2020.06.11.147629; this version posted June 12, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

Introduction

Over the last decade, considerable interest has been directed towards the determination of a

minimal bacterial cell, making use of a short genome consisting of only essential genes for

viability. The Mycoplasma mycoides JCVI-syn3.0 is a case example of synthetic engineering to

design and build a genome that contains a streamlined gene set essential for cell viability and cell

replication (1). However, the identification of “essential” genes - those genes that are critical for

cell viability and replication - takes considerable time and effort in a laboratory setting and is

usually determined with respect to one reference genome under one set of specific growth

conditions. For instance, Kobayashi et al. (2) and Koo et al. (3) experimentally and

computationally determined the minimal gene set in the gram-positive bacterium Bacillus

subtilis. Of the ~4,100 genes of the type strain, a total of 271 genes for Kobayashi and 257 genes

for Koo were shown to be essential. These essential genes were further categorized in terms of

cell metabolism and enzymatic capability. Additionally for the ~4400 genes in the gram-negative

bacteria , Goodall et al. (4), Baba et al. (5), and Yamazaki et al. (6), it was

determined that 414 genes were essential to strain K-12.

Experimental studies to determine such essential genes are time consuming and often

restricted to a single environmental condition using a single strain of the species. In addition,

these approaches also knock out one gene at a time. As such, genes with multiple copies with

redundant functions are often not considered as essential following knockout, as their additional

copy is able to maintain cellular function. In other words, a viable organism would not result

from deleting all but the experimentally determined essential genes from the genome. Another

peculiarity of the single knockout essential genes is that pairs or cassettes of genes which can be bioRxiv preprint doi: https://doi.org/10.1101/2020.06.11.147629; this version posted June 12, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

removed and still have a viable organism are labeled essential because removal of just one gene

is lethal. For example, removing the methylation gene(s) without removing the restriction

digestion enzyme genes from the restriction mechanism results in cell death but the cell survives

if the entire system is removed. This is likewise true for toxin/antitoxin systems.

While it is possible to define “essential” genes relative to viability, another larger

question remains; which genes define a species? While specific phenotypes can vary across

strains, in general a species seems to require some minimal set of genes to not only survive in the

laboratory but to thrive in its natural environment. In contrast, some strains may have retained or

acquired some genes which improve survival for specific niches. Pan-genome analysis of large

sets of diverse strains from the same species provide the structure for determining what “core”

genes are present in all or almost all strains versus what genes are only present in a limited subset

of genomes. These core genes should determine the baseline phenotype (capabilities and traits)

of a species. We assert that truly essential genes should typically be a subset of core genes given

that core genes presumably confer fitness advantages that are not necessarily essential but have

been retained for a reason over evolutionary time. Previous pan-genome studies (7) have shown

that species tend to only tolerate the placement of noncore genes between what we call “core

regions” and not within those core regions. Here we define a pan-genome “core gene” as a gene

which is present in at least 95% of the strains of a species. A “core region” on the genome of a

given strain is the minimal region containing a set of adjacent core genes (and no noncore genes)

which are adjacent in all or almost all strains in a species. The reason an organism might

constrain a core region rather than just core genes is that the region may include regulatory

mechanisms such as operons, which allows for co-expression of multiple functionally associated

genes, or regulons which would be disrupted with the insertion of other genes. We believe that bioRxiv preprint doi: https://doi.org/10.1101/2020.06.11.147629; this version posted June 12, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

conservation of core regions in species indicates resistance to insertion of genes through

evolution or through human-mediated genetic engineering.

Here we present a pan-genome based calculation of core regions for B. subtilis spp.

subtilis and for E. coli. These core regions are compared to previous experimentally determined

essential genes from the literature. Our bioinformatics approach saves considerable time and

effort in determining core regions and can broaden our understanding of core genomic loci for

these specific species. The pan-genome approach outlined here provides a method to determine a

set of core regions and their associated metabolic functions which help define a bacterial

species/subspecies. This goes beyond determining which genes are essential for survival in a

defined laboratory setting and addresses what genes are conserved for a specific species. We

would expect that all truly essential genes for the species/subspecies would be a subset of the

core genes/regions since core genes would encompass genes responsible for providing a fitness

advantage in environmental conditions as well as being essential for viability. This approach

automates computational prediction of core genes/regions as an alternative to laborious

experimental methods to determine core genes through knockout studies. The proper

identification of core genes/regions will greatly improve the effectiveness of synthetic biology

and improve our understanding of prokaryotic diversity and phylogeny. bioRxiv preprint doi: https://doi.org/10.1101/2020.06.11.147629; this version posted June 12, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

Results

For B. subtlis, both Kobayashi et al. (2) and Koo et al. (3) used similar single knockout methods

to determine “essential” genes when grown in LB at 37ºC. Koo identified 257 essential genes

while Kobayashi identified 271 essential genes. The union of these two sets results in 305

essential genes (Table 1). The 305 genes were mapped to the pan genome graph (PGG) clusters

using the RefSeq genome NC_000964.3 (BioSample SAMEA3138188). Interestingly through

this mapping, 16 of the essential genes were not identified as core genes. We believe only 9 of

these 16 genes are truly essential. Gene wapI/yxxG (cluster 4769 present in 39 of 108 genomes)

is an antitoxin for the wapA toxin gene which is adjacent to it (present in 85 of 108 genomes) (8).

Gene rttF/yqcF (cluster 4590 present in 46 of 108 genomes) and gene rtbE/yxxD (cluster 4772

present in 53 of 108 genomes) are also the antitoxin of a cognate toxin-antitoxin pair (9). Gene

yezG (cluster 4411 present in 43 of 108 genomes) is also the toxin for a cognate toxin–antitoxin

pair (10). Gene sknR/yqaE (cluster 4643 present in 34 of 108 genomes) is part of a phage-like

region which, if removed would still allow B. subtilis to remain viable (11) possibly because it is

another antitoxin or similar mechanism. Genes bsuMA/ydiO (cluster 4838 present in 24 of 108

genomes) and bsuMB/ydiP (cluster 4839 present in 24 of 108 genomes) are part of a prophage

region of about 15 genes in 48 genomes which includes ydiR and ydiS which are type-2

restriction enzymes. These are not essential genes but they are essential if the restriction enzymes

are present (12).

Another 8 genes are involved in wall teichoic acid (WTA) biosynthesis: Genes tuaB

(cluster 4729 present in 85 of 108 genomes), mnaA/yvyH (cluster 4735 present in 84 of 108

genomes), tagH (cluster 4744 present in 84 of 108 genomes), tagG (cluster 4745 present in 35 of

108 genomes), tagF (cluster 4746 present in 35 of 108 genomes), tagD (cluster 4748 present in bioRxiv preprint doi: https://doi.org/10.1101/2020.06.11.147629; this version posted June 12, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

35 of 108 genomes), tagA (cluster 4749 present in 35 of 108 genomes) and tagB (cluster 4750

present in 35 of 108 genomes). The WTA genes are involved in production of anionic

glycopolymers required for consistent cell shape and division (13). The WTA genes are part of a

31 gene region which has been shown to be dispensable (14) but results in misformed cells with

poor growth properties. Gene rodA (cluster 3994 present in 97 of 108 genomes) appears to be the

exception as it is asserted to be essential for maintaining a rod shape and preventing spherical

cells which lyse (15). Kobayashi et al. (2) stated: “Ten essential genes are involved in cell shape

and division. Septum formation requires seven (ftsA, L, W, and Z, divIB and C, and pbpB; ref.

21), whereas cell shape requires three (rodA, and mreB and C).” Interestingly, genes ftsZ (cluster

1675 present in 105 of 108 genomes) and pbpB (cluster 1662 present in 107 of 108 genomes)

while considered core, using our 95% of genomes definition are the only core genes not present

in all 108 genomes. We investigated these 11 genes further to understand why essential genes did

not appear to be core genes. By examining the PGG we discovered that alternate genes with

homology to the essential genes had replaced the essential genes. Gene pbpB (cluster 1662 in

107 genomes) is replaced in the one remaining genome by cluster 7120 which is also annotated

as pbpB. Gene ftsZ (cluster 1675 in 105 genomes) is replaced in three genones by a four gene

insertion of clusters 8068, 8300, 8069, and 8070 where both 8068 and 8070 are annotated as ftsZ.

Gene rodA (cluster 3994 in 97 genomes) is replaced by either: cluster 8718 (2 genomes) or

clusters 10492, 6436, and 6437 (1 genome) or clusters 6436 and 6437 (8 genomes) where 8718

and 6436 are annotated as rodA. For B. subtilis spp. spizizenii strain W23, poly(ribitol phosphate)

is the main teichoic acid (16) and this was thought to distinguish spp. spizizenii from spp. subtilis

whose type strain 168 has poly(glycerol phosphate) as the main teichoic acid. Further study

found that the ribitol/glycerol distinction does not distinguish between spizizenii and subtilis bioRxiv preprint doi: https://doi.org/10.1101/2020.06.11.147629; this version posted June 12, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

subspecies (17) but rather either subspecies can contain one or the other. Our PGG confirms this

and in fact finds six distinct vairieties of the WTA region. For example the tagD gene (cluster

4748 in 35 genomes) has been replaced by mutlitple orthologs with the same annotation: cluster

3746 (23 genomes), cluster 5431 (43 genomes), cluster 6915 (2 genomes), cluster 7624 (3

genomes), and cluster 8731 (1 genome).

This shows that most essential genes in B. subtilis ssp. subtilis are encompassed by core

genes/regions. There are 258 core regions for B. subtilis (Table 2). The 289 essential genes

which are core genes are contained in only 63 of these regions. These 289 essential genes are not

evenly distributed in these 63 regions (e.g., 46 are in core region 3). Similarly, the 16 essential

genes in non-core regions (the regions between core regions) are contained in only 7 non-core

regions with 8 WTA genes in the non-core region between core regions 206 and 207. A table of

all B. subtilis genes is provided in Supplementary Table 1.

For E. coli, Goodall et al. (4) determined E. coli essential genes using an analysis of

transposon insertion events (TraDIS). The results of their study and two other studies, the Keio

collection (5) and the Profiling of the E. coli Chromosome (PEC) (6) were captured in Table S2

of Goodall et al. Of the 414 genes with overlap between these studies, the 248 essential genes in

common for all three studies are all core genes (Table 3). This set of 248 essential genes should

be the highest quality predictions as determined by all three studies and confirms our assertion

that essential genes should almost always be core genes. The next highest quality set of essential

gene predictions is the 45 essential genes where two of the three studies agree which 41 are core

genes: for Keio-PEC 15 of 16 are core genes, for TraDIS-Keio 8 of 11 are core genes, and for

TraDIS-PEC 18 of 18 are core genes (Table 3). The lowest quality set of essential gene

predictions is the 121 essential genes where only one study agrees which 89 are core genes: for bioRxiv preprint doi: https://doi.org/10.1101/2020.06.11.147629; this version posted June 12, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

Keio only 12 of 22 are core genes, for PEC only 18 of 18 are core genes, and for TraDIS only 59

of 81 are core genes (Table 3). One of the noncore essential genes present in two studies

(TraDIS-Keio), racR, is probably a toxin suppressor which is not essential in the absence of the

toxins. Bindal et al. (18) noted, “We further show that both YdaS and YdaT can act

independently as toxins and that RacR serves to counteract the toxicity by tightly downregulating

the expression of these toxins.” The racR gene is found in only 106 of the 971 genomes in the E.

coli PGG, whereas ydaS and ydaT are found in 106 and 150 genomes respectively, perhaps

arguing that ydaS is the key toxin gene. This recapitulates the pattern we observed in B. subtilis

where toxin suppressor genes are only essential in the presence of toxin genes. Similarly the dicA

gene (TraDIS-Keio) can be deleted if the dicB gene is also deleted. Kato et al. (19) noted: “The

dicA gene encoding a repressor of a cell division inhibitor was deleted in our study with the dicB,

the inhibitor gene.” There are 521 core regions for E. coli (Table 4). The 378 essential genes

which are core genes are contained in only 133 of these regions. These 378 essential genes are

not evenly distributed in these 133 regions (e.g., 27 are in core region 362). Similarly, the 36

essential genes in non-core regions (the regions between core regions) are contained in only 23

non-core regions with 4 in the non-core region between core regions 152 and 153. A table of all

E. coli genes is provided in Supplementary Table 3.

bioRxiv preprint doi: https://doi.org/10.1101/2020.06.11.147629; this version posted June 12, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

Discussion

For the purpose of synthetic engineering, determining the set of core regions for a given species

is critical as changes to these regions should be expected to reduce fitness. Core regions indicate

parts of the genome that are conserved across evolution within a species. These regions are not

necessarily required for survival but presumably define the characteristic core genotype which

produces the core phenotype (lifestyle). Since most essential gene studies are carried out under

specific laboratory growth conditions, genes which would normally be essential for a species

across a diverse set of environmental conditions might not be discovered (e.g., due to fluctuating

temperatures). Correspondingly, genes required to out compete rival organisms through

increased fitness or to evade immune responses might not be found under laboratory conditions.

Core regions, therefore, should be a superset of essential genes in most cases but exceptions

might occur for genes which are not needed in a species’ natural niche but are required in a

laboratory setting. Another exception would be for genes which are essential for a particular

strain but not for other strains due to the presence of compensating non-core genes.

Verification of core genes or regions requires experimental studies on specific genomes.

However, it is cost prohibitive to do knockout studies on all strains of a pan-genome. One has to

carefully choose a single genome as a representative of the entire pan-genome for the purpose of

verifying the essentiality of core regions and/or the non-essentiality of noncore regions by

experimental validation. However, given the diversity of most bacterial species it is unlikely that

any one strain completely captures the capabilities of the species in all environmental conditions.

Further, while there are clearly core genes/regions associated with viability for a species, other

core regions probably contribute to a lesser degree to cell viability. For example, for the purpose

of synthetic engineering, changes in these locations may reduce fitness by slowing cell growth. bioRxiv preprint doi: https://doi.org/10.1101/2020.06.11.147629; this version posted June 12, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

The use of a PGG for identifying core regions of a bacterium is an automatable, low-cost,

rapid, and effective way to evaluate both gram-negative and gram-positive bacteria. This method

compliments and expands upon the experimental knockout approach by including environmental

diversity as a measure of what regions and genes are conserved across the species. The approach

also overcomes the limitations of knockout studies that are specific to the strains and conditions

used among other issues. Core region analysis should inform the design of synthetic engineering.

The B. subtilis WTA region provides a cautionary note for relying entirely upon core

regions to determine what is safe to remove. While most non-core regions involve cassettes of

genes which are entirely absent from som strains such as phage regions, sometimes orthologous

replacement possibly due to homologous recombination can have functionally equivalent genes

appearing to be non-core. A closer examination of the PGG can determine if a region is simply

missing from some strains versus being replaced in which case further study may be needed

before removal of the region. Of course in some cases the orthologous replacement does not need

to occur at the same location in the genome but that was the case for all instances in B. subtilis.

bioRxiv preprint doi: https://doi.org/10.1101/2020.06.11.147629; this version posted June 12, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

Materials and Methods

For B. subtilis ssp. subtilis and E. coli we selected strains with complete genomes in RefSeq (20).

We restricted our analysis to complete genomes to ensure that missing genes due to incomplete

genome sequencing/assembly did not affect the approach or results. We limited our choice to

RefSeq for two reasons: RefSeq performs a series of quality checks to remove dubious genome

assemblies, and the initial pan-genome construction depends upon reasonably consistent

annotation which RefSeq provides. We extracted the genomes based on organism name: Bacillus

subtilis (we did not specify subspecies, since for many RefSeq genomes a subspecies is not

given) and Esherichia coli (we also specified Shigella since all Shigella species are actually

considered to be the same species as Escherichia coli) (21,22). For each pan-genome we then

compared the genomes using a fast Average Nucleotide Identity (ANI) estimate generated using

MASH (23). We used type strains and ANI to determine which of these genomes were actually

the desired organism. We also used ANI to remove very closely related strains to reduce

oversampling bias (for example for the B. subtilis type strain, 168, has at least 8 genomes in

RefSeq). We used GGRASP (24) to choose a single medoid sequence from any complete linkage

ANI cluster with a threshold of 0.01% or 1/10,000 base pair difference. The strain 168 medoid

genome is the Entrez reference genome for the B. subtilis type strain (GenBank sequence

AL009126.3, BioSample SAMEA3138188, Assembly ASM904v1/GCA_000009045.1) which

can be used to map the Kobayashi and Koo results.

Using this approach, for B. subtilis 143 genomes were downloaded from RefSeq. Of

these 132 genomes were determined to be B. subtilis spp. subtilis based on type strains and ANI.

The minimum ANI between any pair of the 132 B. subtilis spp. subtilis genomes was 97.28% bioRxiv preprint doi: https://doi.org/10.1101/2020.06.11.147629; this version posted June 12, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

whereas the maximum ANI of any of the 11 other genomes to the 132 genomes was 95.73%,

providing good separation between the other subspecies. The 132 genomes were reduced to 109

genomes after removing redundant strains. Finally we removed strain delta6 (BioSample

SAMN05150066) because it is known to have been engineered to remove multiple genes. Thus

we were left with 108 B. subtilis genomes (Supplemental Table 2). For E. coli (and Shigella) we

downloaded 1097 complete genomes from RefSeq. Of these, 1096 were determined using ANI

to actually be E. coli. The non E. coli genome was clearly mislabeled as its maximum ANI to

any other genome was 82.27%. The minimum pairwise ANI of any of the 1096 genomes was

95.53% which is not as tight as for B. subtilis spp. subtilis which is to be expected given that E.

coli is a species grouping not a subspecies grouping. One could arbitrarily try to choose a tighter

grouping around the K-12 reference genome but the pairwise ANI values of the other genomes

compared to the K-12 reference genome vary continuously from 96.22% to 100% with no

punctate break in the values. After removing redundancy 969 E. coli genomes remained. We

added back in two redundant genomes: The K-12 Entrez E. coli reference strain MG1655

(BioSample SAMN02604091) and the K-12 strain BW25113 (GenBank sequence accession

CP009273.1, GenBank Assembly accession ASM75055v1/GCA_000750555.1, GenBank

BioSample accession SAMN03013572) used by Goodall. By using a 95% threshold for the

number of genomes a gene must be in to be considered core, some small number of the 971

genomes (Supplemental Table 4) could be engineered to remove what are normally core genes

and not affect the assignment of core genes.

For B. subtilis ssp. subtilis and E. coli initial pan-genomes were based on the RefSeq

annotation of these genomes. The pan-genome was generated using the pan-genome pipeline at

the J. Craig Venter Institute (JCVI) at the nucleotide level using default parameters (25). This bioRxiv preprint doi: https://doi.org/10.1101/2020.06.11.147629; this version posted June 12, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

produced ortholog clusters using gene context (26) as well as a Pan-Genome Graph (PGG) (7).

The PGG has two main components: nodes representing genes, and edges representing the

sequence between genes and the order and orientation of the genes in the genomes. We updated

the code repository for the JCVI pan-genome pipeline with a script: iterate_pgg_graph.pl, which

calls pgg_annotate.pl for the genomes in the existing PGG in order to ensure consistent

annotation of the genomes and iterates until the PGG stabilizes. The script pgg_annotate.pl uses

an existing PGG to assign regions of a genome to nodes of the graph. This is done by blasting the

medoid sequence for the gene cluster the node represents against the genome and then uses

Needleman-Wunsch (27) to extend the alignment if needed. If there are conflicting blast matches

then the matches are resolved based on which matches are consistent with the structure of the

PGG which encapsulates gene context across the entire pan-genome. Once the nodes of the PGG

are mapped to each of the genomes in the pan-genome a new version of the PGG is intrinsic and

then explicitly extracted. This process is iterated to stability. This ensures that each genome is

consistently annotated so that genes missing from the original annotation of some genomes will

be consistently annotated across all genomes.

The original and refined PGG statistics for B. subtilis and E. coli are provided in Table 5.

The major goal of this reannotation and iteration until stabilization was to achieve consistent

annotation across all genomes in the PGG leading to a more comprehensive and cohesive PGG.

While the RefSeq annotations of these genomes tends to be highly consistent, many small genes

are often arbitrarily called from genome to genome and even some common longer genes can

occasionally be missed. There are three obvious points of improvement in the refined PGG for

both the cluster and edge stats: the number of size 1 clusters/edges significantly decreased due to

some dubious RefSeq gene calls being eliminated and some becoming shared with other bioRxiv preprint doi: https://doi.org/10.1101/2020.06.11.147629; this version posted June 12, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

genomes; the number of core clusters/edges significantly increased showing an improvement in

the consistency of annotation across all genomes; and the number of genes/edges in clusters/edge

instances greatly increased again indicating a much more consistent annotation.

Core regions were determined based on the PGG. Nodes in the PGG were orthologous

clusters of genes. Edges in the PGG represented adjacency of genes (contained in the clusters) in

the underlying genomes. The definition of which genes were or were not considered “core” was

determined relative to a threshold criterion. We used a criterion for core such that 95% or more

of the underlying genome had to contain the cluster or edge. Considering that we used only

complete genomes it might have been possible to use a 100% threshold. However, we opted for a

95% threshold based on prior experience and an abundance of caution to not under call core

genes/edges. Each core region began with a core cluster followed by a core edge (if possible –

otherwise the core region comprises a single cluster) to another core cluster and so on until a

core edge cannot be found to continue the core region. A core region is just a path in the

PGGwhich was then mapped onto any particular genome to determine the core region

coordinates. When the core threshold was below 100% any particular genome may be missing a

cluster (gene) or edge along this path which results in the path being broken into its remaining

constituent parts.

In order to compare core regions to experimentally determined essential genes we needed

a common base of reference. For each of the experimental studies, the genes are specified based

on a reference strain that was used for the experiments and has a complete genome in RefSeq.

For Kobayashi et al., only gene symbols/names were given which we mapped to Entrez GeneIDs

using Entrez search. GeneIDs with no matches were manually curated to estimate the best

matching gene symbol listed in the literature. For Koo, locus IDs were provided giving direct bioRxiv preprint doi: https://doi.org/10.1101/2020.06.11.147629; this version posted June 12, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

access to the gene coordinates for RefSeq accession NC_000964.3 (BioSample

SAMEA3138188, Assembly GCF_000009045.1). For Goodall, we used the data from three

studies in Table S2 from Goodall et al. Gene symbols/names again were all that was available

but these were consistent with the GenBank annotation downloadable in gff format for the K-12

BW25113 reference genome (GenBank accession CP009273.1) used by Goodall (BioSample

SAMN03013572). This gave us coordinates for all essential genes on RefSeq genomes which

were annotated with a PGG which produces a file with coordinates for clusters and edges

mapped to the genome. These coordinates allow us to affiliate essential genes to gene clusters.

bioRxiv preprint doi: https://doi.org/10.1101/2020.06.11.147629; this version posted June 12, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

Acknowledgements

This research is based upon work supported [in part] by the Office of the Director of National

Intelligence (ODNI), Intelligence Advanced Research Projects Activity (IARPA) under Finding

Engineering Linked Indicators (FELIX) program contract #N6600118C-4506. The views and

conclusions contained herein are those of the authors and should not be interpreted as necessarily

representing the official policies, either expressed or implied, of ODNI, IARPA, or the U.S.

Government. The U.S. Government is authorized to reproduce and distribute reprints for

governmental purposes notwithstanding any copyright annotation therein.

The authors would like to thank Derren Barken for his assistance in table generation.

bioRxiv preprint doi: https://doi.org/10.1101/2020.06.11.147629; this version posted June 12, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

References

1. Hutchison CA, Chuang RY, Noskov VN, Assad-Garcia N, Deerinck TJ, Ellisman MH,

Gill J, Kannan K, Karas BJ, Ma L, Pelletier JF, Qi ZQ, Richter RA, Strychalski EA, Sun

L, Suzuki Y, Tsvetanova B, Wise KS, Smith HO, Glass JI, Merryman C, Gibson DG,

Venter JC. 2016. Design and synthesis of a minimal bacterial genome. Science 351:

aad6253.

2. Kobayashi K, Ehrlich SD, Albertini A, Amati G, Andersen KK, Arnaud M, Asai K,

Ashikaga S, Aymerich S, Bessieres P, Boland F, Brignell SC, Bron S, Bunai K, Chapuis

J, Christiansen LC, Danchin A, Débarbouille M, Dervyn E, Deuerling E, Devine K,

Devine SK, Dreesen O, Errington J, Fillinger S, Foster SJ, Fujita Y, Galizzi A, Gardan R,

Eschevins C, Fukushima T, Haga K, Harwood CR, Hecker M, Hosoya D, Hullo MF,

Kakeshita H, Karamata D, Kasahara Y, Kawamura F, Koga K, Koski P, Kuwana R,

Imamura D, Ishimaru M, Ishikawa S, Ishio I, Le Coq D, Masson A, Mauël C, Meima R,

Mellado RP, Moir A, Moriya S, Nagakawa E, Nanamiya H, Nakai S, Nygaard P, Ogura

M, Ohanan T, O'Reilly M, O'Rourke M, Pragai Z, Pooley HM, Rapoport G, Rawlins JP,

Rivas LA, Rivolta C, Sadaie A, Sadaie Y, Sarvas M, Sato T, Saxild HH, Scanlan E,

Schumann W, Seegers JF, Sekiguchi J, Sekowska A, Séror SJ, Simon M, Stragier P,

Studer R, Takamatsu H, Tanaka T, Takeuchi M, Thomaides HB, Vagner V, van Dijl JM,

Watabe K, Wipat A, Yamamoto H, Yamamoto M, Yamamoto Y, Yamane K, Yata K,

Yoshida K, Yoshikawa H, Zuber U, Ogasawara N. 2003. Essential Bacillus subtilis

genes. Proc Natl Acad Sci U S A. 100: 4678-83. bioRxiv preprint doi: https://doi.org/10.1101/2020.06.11.147629; this version posted June 12, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

3. Koo BM, Kritikos G, Farelli JD, Todor H, Tong K, Kimsey H, Wapinski I, Galardini M,

Cabal A, Peters JM, Hachmann AB, Rudner DZ, Allen KN, Typas A, Gross CA. 2017.

Construction and Analysis of Two Genome-Scale Deletion Libraries for Bacillus subtilis.

Cell Syst. 4:291-305.

4. Goodall ECA, Robinson A, Johnston IG, Jabbari S, Turner KA, Cunningham AF, Lund

PA, Cole JA, Henderson IR. 2018. The Essential Genome of Escherichia coli K-12.

mBio. 20: e02096-17.

5. Baba T, Ara T, Hasegawa M, Takai Y, Okumura Y, Baba M, Datsenko KA, Tomita M,

Wanner BL, Mori H. 2006. Construction of Escherichia coli K-12 in-frame, single-gene

knockout mutants: the Keio collection. Mol Syst Biol. 2:2006.0008.

6. Yamazaki Y, Niki H, Kato J. 2008. Profiling of Escherichia coli Chromosome database.

Methods Mol Biol. 416:385–389.

7. Chan AP, Sutton G, DePew J, Krishnakumar R, Choi Y, Huang XZ, Beck E, Harkins

DM, Kim M, Lesho EP, Nikolich MP, Fouts DE. 2015. A novel method of consensus

pan-chromosome assembly and large-scale comparative analysis reveal the highly

flexible pan-genome of Acinetobacter baumannii. Genome Biol. 16:143.

8. Koskiniemi S, Lamoureux JG, Nikolakakis KC, t'Kint de Roodenbeke C, Kaplan MD,

Low DA, Hayes CS. 2013. Rhs proteins from diverse bacteria mediate intercellular

competition. Proc Natl Acad Sci U S A. 110:7032-7.

9. Holberger LE, Garza-Sánchez F, Lamoureux J, Low DA, Hayes CS. 2012. A novel

family of toxin/antitoxin proteins in Bacillus species. FEBS Lett. 586(2):132-6.

10. Brantl S, Müller P. 2019. Toxin-Antitoxin Systems in Bacillus subtilis. Toxins 11: pii:

E262. bioRxiv preprint doi: https://doi.org/10.1101/2020.06.11.147629; this version posted June 12, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

11. Westers H, Dorenbos R, van Dijl JM, Kabel J, Flanagan T, Devine KM, Jude F, Seror SJ,

Beekman AC, Darmon E, Eschevins C, de Jong A, Bron S, Kuipers OP, Albertini AM,

Antelmann H, Hecker M, Zamboni N, Sauer U, Bruand C, Ehrlich DS, Alonso JC, Salas

M, Quax WJ. 2003. Genome engineering reveals large dispensable regions in Bacillus

subtilis. Mol Biol Evol. 20:2076-90.

12. Ohshima H, Matsuoka S, Asai K, Sadaie Y. 2002. Molecular organization of intrinsic

restriction and modification genes BsuM of Bacillus subtilis Marburg. J Bacteriol.

184:381-9.

13. Brown S, Santa Maria Jr, JP, Walker S. 2013. Wall teichoic acids of gram-positive

bacteria. Annu Rev Microbiol. 67:313-36.

14. D'Elia MA, Millar KE, Beveridge TJ, Brown ED. 2006. Wall teichoic acid polymers are

dispensable for cell viability in Bacillus subtilis. J Bacteriol. 188:8313-6.

15. Henriques AO, Glaser P, Piggot PJ, Moran CP Jr. 1998. Control of cell shape and

elongation by the rodA gene in Bacillus subtilis. Mol Microbiol. 28:235-47.

16. Lazarevic V. Abellan F-X, Möller SB, Karamata D, Mauël C. 2002. Comparison of

ribitol and glycerol teichoic acid genes in Bacillius subtilisW23 and 168: Identical

function, similar divergent organization, but different regulation. Microbiology. 148:815-

24.

17. Ahn S, Jun S, Ro H-J, Kim JH, Kim S. 2018. Complete genome of Bacillus subtilis

subsp. subtilis KCTC 3135T and variation in cell wall genes of B. subtilis strains. J

Microbiol Biotechnol. 28:1760-68. bioRxiv preprint doi: https://doi.org/10.1101/2020.06.11.147629; this version posted June 12, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

18. Bindal G, Krishnamurthi R, Seshasayee ASN, Rath D. 2017. CRISPR-Cas-mediated gene

silencing reveals RacR to be a negative regulator of YdaS and YdaT toxins in

Escherichia coli K-12. mSphere. 2:e00483-17.

19. Kato J, Hashimoto M. Construction of consecutive deletions of the Escherichia coli

chromosome. 2007. Mol Syst Biol 3:132.

20. O'Leary NA, Wright MW, Brister JR, Ciufo S, Haddad D, McVeigh R, Rajput B,

Robbertse B, Smith-White B, Ako-Adjei D, Astashyn A, Badretdin A, Bao Y, Blinkova

O, Brover V, Chetvernin V, Choi J, Cox E, Ermolaeva O, Farrell CM, Goldfarb T, Gupta

T, Haft D, Hatcher E, Hlavina W, Joardar VS, Kodali VK, Li W, Maglott D, Masterson

P, McGarvey KM, Murphy MR, O'Neill K, Pujar S, Rangwala SH, Rausch D, Riddick

LD, Schoch C, Shkeda A, Storz SS, Sun H, Thibaud-Nissen F, Tolstoy I, Tully RE,

Vatsan AR, Wallin C, Webb D, Wu W, Landrum MJ, Kimchi A, Tatusova T, DiCuccio

M, Kitts P, Murphy TD, Pruitt KD. 2016. Reference sequence (RefSeq) database at

NCBI: current status, taxonomic expansion, and functional annotation. Nucleic Acids

Res. 44:D733-45.

21. Lan R, Reeves PR. 2002. Escherichia coli in disguise: molecular origins of Shigella.

Microbes and Infection. 4: 1125-1132.

22. Meier-Kolthoff JP, Hahnke RL, Petersen J, Scheuner C, Michael V, Fiebig A, Rohde C,

Rohde M, Fartmann B, Goodwin LA, Chertkov O, Reddy T, Pati A, Ivanova NN,

Markowitz V, Kyrpides NC, Woyke T, Göker M, Klenk HP. 2014. Complete genome

sequence of DSM 30083(T), the type strain (U5/41(T)) of Escherichia coli, and a

proposal for delineating subspecies in microbial taxonomy. Stand Genomic Sci. 8:9:2. bioRxiv preprint doi: https://doi.org/10.1101/2020.06.11.147629; this version posted June 12, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

23. Ondov BD, Treangen TJ, Melsted P, Mallonee AB, Bergman NH, Koren S, Phillippy

AM. 2016. Mash: fast genome and metagenome distance estimation using MinHash.

Genome Biol. 17:132.

24. Clarke TH, Brinkac LM, Sutton G, Fouts DE. 2018. GGRaSP: a R-package for selecting

representative genomes using Gaussian mixture models. Bioinformatics. 34:3032-3034.

25. Inman JM, Sutton GG, Beck E, Brinkac LM, Clarke TH, Fouts DE. 2019. Large-scale

comparative analysis of microbial pan-genomes using PanOCT. Bioinformatics. 35:1049-

1050.

26. Fouts DE, Brinkac L, Beck E, Inman J, Sutton G. 2012. PanOCT: automated clustering of

orthologs using conserved gene neighborhood for pan-genomic analysis of bacterial

strains and closely related species. Nucleic Acids Res. 40:e172.

27. Needleman SB, Wunsch CD. 1970. A general method applicable to the search for

similarities in the amino acid sequence of two proteins. J Mol Biol. 48:443-53.

bioRxiv preprint doi: https://doi.org/10.1101/2020.06.11.147629; this version posted June 12, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

Table 1. The 305 B. subtilis genes deemed essential by either Kobayashi et al. or Koo et al. These genes are compared to the PGG

based annotation of core regions for the type strain genome used by Kobayashi et al. and Koo et al. (strain 168, GenBank sequence

AL009126.3, BioSample SAMEA3138188, Assembly ASM904v1/GCF_000009045.1). Columns 1-5 are the start, stop, strand,

cluster, and cluster size for the PGG annotation. Columns 6-11 are the gene type, start, stop, strand, locus tag, and gene symbol/name

for the GenBank annotation. Column 12 is a list of synonyms for B. subtilis genes associated with the Koo and Kobayashi genes.

Column 13 is the Koo et al. gene symbol/name. Columns 14-15 are the Kobayashi et al. gene symbol/name and evidence type (from

Supporting Table 4 “RB, reference to study with Bacillus subtilis; RO, reference to study with other bacteria; TW, this work; TW*,

inactivation failed but IPTG mutant could not be made”). Columns 16-17 are the GenBank protein product accession and name.

Column 18 is the PGG core or non-core region the gene is contained in.

PGG PGG PGG PGG Genbank Genbank Genbank GenBank Koo Kobayashi Kobayashi PGG Stop Cluster Strand Locus Tag Synonyms Protein Product Protein Name Region Start Strand Cluster Type Start Stop Gene Gene Gene Evidence Size

chromosomal replication initiator 410 1750 + 2 108 gene 410 1750 + BSU_00010 dnaA dnaH dnaA dnaA RB NP_387882.1 core 1 informational ATPase

1939 3075 + 3 108 gene 1939 3075 + BSU_00020 dnaN dnaG dnaN dnaN RB NP_387883.1 DNA III (beta subunit) core 1

4867 6783 + 7 108 gene 4867 6783 + BSU_00060 gyrB novA gyrB gyrB RO NP_387887.1 DNA gyrase (subunit B) core 1

6994 9459 + 8 108 gene 6994 9459 + BSU_00070 gyrA cafB, nalA gyrA gyrA RO NP_387888.1 DNA gyrase (subunit A) core 1

inosine-monophosphate 15915 17381 + 15 108 gene 15915 17381 + BSU_00090 guaB gnaB guaB TW NP_387890.1 core 1 dehydrogenase bioRxiv preprint doi: https://doi.org/10.1101/2020.06.11.147629; this version posted June 12, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

glutamine amidotransferase for pyridoxal phosphate synthesis; 19062 19946 + 17 108 gene 19062 19946 + BSU_00110 pdxS yaaD map TW* NP_387892.1 core 1 pyridoxal 5\'-phosphate synthase complex, synthase subunit

20880 22157 + 19 108 gene 20880 22157 + BSU_00130 serS serS serS RO NP_387894.1 seryl-tRNA synthetase core 1

DNA polymerase III subunit tau 26814 28505 + 27 108 gene 26814 28505 + BSU_00190 dnaX dnaH dnaX dnaX TW NP_387900.2 core 1 subunit

39159 39797 + 41 108 gene 39159 39797 + BSU_00280 tmk yaaP tmk tmk TW NP_387909.2 thymidylate kinase core 1

DNA polymerase III clamp loader 40665 41654 + 44 108 gene 40665 41654 + BSU_00310 holB yaaS holB holB TW NP_387912.1 core 1 delta\' subunit

45633 47627 + 51 108 gene 45633 47627 + BSU_00380 metS metG metS metS RO NP_387919.1 methionyl-tRNA synthetase core 1

4-(cytidine 5\'-diphospho)-2-C- 53516 54385 + 59 108 gene 53516 54385 + BSU_00460 ispE ipk, yabH ispE ispE TW NP_387927.1 core 1 methyl-D-erythritol kinase

bifunctional glucosamine-1- tms, tms- phosphate N-acetyltransferase/UDP- 56352 57722 + 63 108 gene 56352 57722 + BSU_00500 glmU gcaD gcaD TW NP_387931.1 core 1 26 N-acetylglucosamine pyrophosphorylase

phosphoribosylpyrophosphate 57745 58698 + 64 108 gene 57745 58698 + BSU_00510 prs prs prs TW NP_387932.1 core 1 synthetase

59504 60070 + 66 108 gene 59504 60070 + BSU_00530 pth pth spoVC RB, TW NP_387934.1 peptidyl-tRNA hydrolase core 1

69168 69545 + 75 108 gene 69168 69545 + BSU_00620 divIC divIB, dds divIC divIC TW NP_387943.1 cell-division initiation protein core 1

74929 76347 + 82 108 gene 74929 76347 + BSU_00670 tilS tilS yacA TW NP_387948.1 tRNA(ile2) lysidine synthetase core 1

85737 86594 + 92 108 gene 85737 86594 + BSU_00770 folP sul NP_387958.1 dihydropteroate synthase core 1

86587 86949 + 93 108 gene 86587 86949 + BSU_00780 folB folA, yacE folB hprT TW NP_387959.1 dihydroneopterin aldolase core 1

7,8-dihydro-6-hydroxymethylpterin 86946 87449 + 94 108 gene 86946 87449 + BSU_00790 folK folK NP_387960.1 core 1 pyrophosphokinase

88727 90226 + 97 108 gene 88727 90226 + BSU_00820 lysS lysS lysS TW NP_387963.1 lysyl-tRNA synthetase core 1 bioRxiv preprint doi: https://doi.org/10.1101/2020.06.11.147629; this version posted June 12, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

2-C-methyl-D-erythritol 4-phosphate 109789 110487 + 120 108 gene 109789 110487 + BSU_00900 ispD ispD yacM TW NP_387971.1 cytidylyltransferase, nonmevalonate core 3

isoprenoid pathway

2-C-methyl-D-erythritol-2,4- 110480 110956 + 121 108 gene 110480 110956 + BSU_00910 ispF ispF yacN TW NP_387972.1 core 3 cyclodiphosphate synthase

111047 112498 + 122 108 gene 111047 112498 + BSU_00920 gltX gltX gltX RO NP_387973.1 glutamyl-tRNA synthetase core 3

dual cysteinyl-tRNA synthetase; 113450 114850 + 124 108 gene 113450 114850 + BSU_00940 cysS spnA cysS cysS RO NP_387975.1 core 3 cysteine persulfide synthase

rpmG, 117349 117498 + 129 108 gene 117349 117498 + BSU_00990 rpmGB rpmGB RO NP_387980.1 ribosomal protein L33 core 3 rpmG2

117532 117711 + 130 108 gene 117532 117711 + BSU_01000 secE secE secE RB NP_387981.1 preprotein translocase subunit core 3

119111 119809 + 133 108 gene 119111 119809 + BSU_01030 rplA rplA RO NP_387984.2 ribosomal protein L1 (BL1) core 3

120061 120561 + 134 108 gene 120061 120561 + BSU_01040 rplJ rplJ rplJ RO NP_387985.2 ribosomal protein L10 (BL5) core 3

120607 120978 + 135 108 gene 120607 120978 + BSU_01050 rplL rplL rplL RO NP_387986.1 ribosomal protein L12 (BL9) core 3

121919 125500 + 137 108 gene 121919 125500 + BSU_01070 rpoB crsE, rfm rpoB rpoB RO NP_387988.2 RNA polymerase (beta subunit) core 3

125562 129161 + 138 108 gene 125562 129161 + BSU_01080 rpoC lpm, std rpoC rpoC TW NP_387989.2 RNA polymerase (beta\' subunit) core 3

psL, fun, 129702 130118 + 140 108 gene 129702 130118 + BSU_01100 rpsL rpsL rpsL RO NP_387991.2 ribosomal protein S12 (BS12) core 3 strA

130160 130630 + 141 108 gene 130160 130630 + BSU_01110 rpsG rpsG rpsG RO NP_387992.2 ribosomal protein S7 (BS7) core 3

130684 132762 + 142 108 gene 130684 132762 + BSU_01120 fusA fus fusA fusA TW NP_387993.2 elongation factor G core 3

132882 134072 + 143 108 gene 132882 134072 + BSU_01130 tufA tuf tufA tufA TW* NP_387994.1 elongation factor Tu core 3

ribosomal protein S10 (BS13); 135364 135672 + 145 108 gene 135364 135672 + BSU_01150 rpsJ tetA rpsJ rpsJ RO NP_387996.1 core 3 transcription antitermination factor

135712 136341 + 146 108 gene 135712 136341 + BSU_01160 rplC rplC rplC RO NP_387997.1 ribosomal protein L3 (BL3) core 3

136369 136992 + 147 108 gene 136369 136992 + BSU_01170 rplD rplD rplD RO NP_387998.1 ribosomal protein L4 core 3

136992 137279 + 148 108 gene 136992 137279 + BSU_01180 rplW rplW rplW RO NP_387999.2 ribosomal protein L23 core 3

137311 138144 + 149 108 gene 137311 138144 + BSU_01190 rplB rplB rplB RO NP_388000.2 ribosomal protein L2 (BL2) core 3 bioRxiv preprint doi: https://doi.org/10.1101/2020.06.11.147629; this version posted June 12, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

138202 138480 + 150 108 gene 138202 138480 + BSU_01200 rpsS rpsS rpsS RO NP_388001.1 ribosomal protein S19 (BS19) core 3

138497 138838 + 151 108 gene 138497 138838 + BSU_01210 rplV rplV rplV RO NP_388002.1 ribosomal protein L22 (BL17) core 3

138842 139498 + 152 108 gene 138842 139498 + BSU_01220 rpsC rpsC rpsC RO NP_388003.2 ribosomal protein S3 (BS3) core 3

139500 139934 + 153 108 gene 139500 139934 + BSU_01230 rplP rplP rplP RO NP_388004.1 ribosomal protein L16 core 3

139924 140124 + 154 108 gene 139924 140124 + BSU_01240 rpmC rpmC RO NP_388005.1 ribosomal protein L29 core 3

140147 140410 + 155 108 gene 140147 140410 + BSU_01250 rpsQ rpsQ rpsQ RO NP_388006.1 ribosomal protein S17 (BS16) core 3

140451 140819 + 156 108 gene 140451 140819 + BSU_01260 rplNA rplNA rplN RO NP_388007.1 ribosomal protein L14 core 3

140857 141168 + 157 108 gene 140857 141168 + BSU_01270 rplX rplX rplX RO NP_388008.1 ribosomal protein L24 (BL23) core 3

141195 141734 + 158 108 gene 141195 141734 + BSU_01280 rplE rplE rplE RO NP_388009.1 ribosomal protein L5 (BL6) core 3

141757 141942 + 159 108 gene 141757 141942 + BSU_01290 rpsNA rpsZ rpsNA rpsN RO NP_388010.1 ribosomal protein S14 core 3

141974 142372 + 160 108 gene 141974 142372 + BSU_01300 rpsH rpsH rpsH RO NP_388011.2 ribosomal protein S8 (BS8) core 3

142402 142941 + 161 108 gene 142402 142941 + BSU_01310 rplF rplF rplF RO NP_388012.1 ribosomal protein L6 (BL8) core 3

142974 143336 + 162 108 gene 142974 143336 + BSU_01320 rplR rplR rplR RO NP_388013.2 ribosomal protein L18 core 3

143361 143861 + 163 108 gene 143361 143861 + BSU_01330 rpsE spcA rpsE rpsE RO NP_388014.1 ribosomal protein S5 core 3

143875 144054 + 164 108 gene 143875 144054 + BSU_01340 rpmD rpmD rpmD RO NP_388015.1 ribosomal protein L30 (BL27) core 3

144085 144525 + 165 108 gene 144085 144525 + BSU_01350 rplO rplO RO NP_388016.1 ribosomal protein L15 core 3

144527 145822 + 166 108 gene 144527 145822 + BSU_01360 secY secY secY RB NP_388017.1 preprotein translocase subunit core 3

145877 146530 + 167 108 gene 145877 146530 + BSU_01370 adk adk adk TW* NP_388018.1 adenylate kinase core 3

147585 147803 + 170 108 gene 147585 147803 + BSU_01390 infA infA infA TW NP_388020.1 initiation factor IF-I core 3

ribosomal protein L36 (ribosomal 147837 147950 + 171 108 gene 147837 147950 + BSU_01400 rpmJ rpmJ RO NP_388021.1 core 3 protein B)

147973 148338 + 172 108 gene 147973 148338 + BSU_01410 rpsM rpsM rpsM RO NP_388022.2 ribosomal protein S13 core 3

148359 148754 + 173 108 gene 148359 148754 + BSU_01420 rpsK rpsK rpsK RO NP_388023.1 ribosomal protein S11 (BS11) core 3

148931 149875 + 174 108 gene 148931 149875 + BSU_01430 rpoA rpoA rpoA RO NP_388024.1 RNA polymerase (alpha subunit) core 3

149953 150315 + 175 108 gene 149953 150315 + BSU_01440 rplQ rplQ rplQ RO NP_388025.1 ribosomal protein L17 (BL15) core 3

153842 154279 + 180 108 gene 153842 154279 + BSU_01490 rplM rplM rplM RO NP_388030.2 ribosomal protein L13 core 3

154300 154692 + 181 108 gene 154300 154692 + BSU_01500 rpsI rpsI rpsI RO NP_388031.1 ribosomal protein S9 core 3

198497 199843 + 228 108 gene 198497 199843 + BSU_01770 glmM glmM ybbT TW NP_388058.1 phosphoglucosamine mutase core 7 bioRxiv preprint doi: https://doi.org/10.1101/2020.06.11.147629; this version posted June 12, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

L-glutamine-D-fructose-6-phosphate 200277 202079 + 229 108 gene 200277 202079 + BSU_01780 glmS gcaA, ybxD glmS glmS TW NP_388059.1 core 7 amidotransferase

ammonium-dependent NAD+ 338288 339106 + 358 108 gene 338288 339106 + BSU_03130 nadE outB nadE nadE RB NP_388195.1 core 19 synthetase

340025 340585 + 360 108 gene 340025 340585 + BSU_03150 aroK aroI aroK NP_388197.2 shikimate kinase core 19

508227 509312 + 510 108 gene 508248 509312 + BSU_04560 ddlA ddlA ddl ddl TW NP_388337.1 D-alanyl-D-alanine ligase A core 29

UDP-N-acetylmuramoylalanyl-D- 509384 510757 + 511 108 gene 509384 510757 + BSU_04570 murF ydbQ murF murF TW NP_388338.1 glutamyl-2, 6-diaminopimelate-D- core 29 alanyl-D-alanine ligase

holo-acyl carrier protein synthase 515710 516075 + 516 108 gene 515710 516075 + BSU_04620 acpS ydcB acpS acpS TW NP_388343.1 core 30 (phosphopantetheinyl transferase)

517372 518541 + 518 108 gene 517372 518541 + BSU_04640 alrA dal alrA alr TW NP_388345.1 D-alanine racemase core 30

tRNA(NNU) t(6)A37 641654 642130 + 670 108 gene 641654 642130 + BSU_05910 tsaE ydiB TW NP_388472.1 threonylcarbamoyladenosine core 45

modification; ADP binding protein

tRNA(NNU) t(6)A37 threonylcarbamoyladenosine 642111 642800 + 671 108 gene 642111 642800 + BSU_05920 tsaB ydiC ydiC ydiC TW NP_388473.1 core 45 modification; protease involved in TsaD function

tRNA(NNU) t(6)A37 threonylcarbamoyladenosine 643258 644298 + 673 108 gene 643258 644298 + BSU_05940 tsaD gcp, ydiE gcp gcp TW NP_388475.1 core 45 modification; glycation binding protein

649903 650187 + 681 108 gene 649903 650187 + BSU_06020 groES mopB groES groES RB NP_388483.2 chaperonin small subunit core 46

650234 651868 + 682 108 gene 650234 651868 + BSU_06030 groEL mopA groEL groEL RB NP_388484.1 chaperonin large subunit core 46

non- DNA-methyltransferase (cytosine- 655223 656506 + 4838 24 gene 655223 656506 + BSU_06060 bsuMA ydiO ydiO RB, TW NP_388487.1 core 46- specific); prophage 3 region 47 bioRxiv preprint doi: https://doi.org/10.1101/2020.06.11.147629; this version posted June 12, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

non- DNA-methyltransferase (cytosine- 656528 657697 + 4839 24 gene 656528 657697 + BSU_06070 bsuMB ydiP ydiP RB, TW NP_388488.1 core 46- specific); defective prophage 3 47

719370 721589 + 738 108 gene 719370 721589 + BSU_06610 pcrA yerF pcrA pcrA RB, TW NP_388543.1 ATP-dependent DNA core 50

721613 723619 + 739 108 gene 721613 723619 + BSU_06620 ligA lig, yerG ligA ligA TW NP_388544.1 DNA ligase (NAD-dependent) core 50

glutamyl-tRNA(Gln) 728732 729022 + 745 108 gene 728732 729022 + BSU_06670 gatC yedA, yerL gatC gatC RO NP_388549.1 core 51 amidotransferase (subunit C)

glutamyl-tRNA(Gln) 729038 730495 + 746 108 gene 729038 730495 + BSU_06680 gatA yedB, yerM gatA gatA TW NP_388550.1 core 51 amidotransferase (subunit A)

glutamyl-tRNA(Gln) 730509 731939 + 747 108 gene 730509 731939 + BSU_06690 gatB yerN gatB gatB TW NP_388551.2 core 51 amidotransferase (subunit B)

736436 737347 + 750 108 gene 736436 737347 + BSU_06720 dagK dgkB yerQ TW NP_388554.1 diacylglycerol kinase core 51

non- conserved hypothetical protein; HGT 747079 747534 - 4411 43 gene 747079 747534 - BSU_06811 yezG yeeE yezG YP_003097692.1 core 52- island 53

tRNA (cytidine(34)-2\'-O)- 970135 970617 + 976 108 gene 970135 970617 + BSU_08930 trmL ygaR cspR TW NP_388774.2 core 74 methyltransferase

negative regulator of the activity of 1028511 1029587 - 1054 108 gene 1028511 1029587 - BSU_09510 asiMA yhdL yhdL TW NP_388832.1 core 77 sigma-M

1-acylglycerol-phosphate (1-acyl- 1031395 1031994 + 1057 108 gene 1031395 1031994 + BSU_09540 plsC plsC yhdO TW NP_388835.1 core 77 G3P) acyltransferase

1070364 1071242 - 1098 108 gene 1070364 1071242 - BSU_09950 prsA prsA prsA TW NP_388876.1 molecular chaperone lipoprotein core 78

beta-ketoacyl-acyl carrier protein 1209183 1210424 + 1230 108 gene 1209183 1210424 + BSU_11340 fabF yjaY fabF fabF TW* NP_389016.1 synthase II (involved in pimelate core 88 synthesis)

1218113 1219105 - 1238 108 gene 1218113 1219105 - BSU_11420 trpS trpS trpS RO NP_389024.1 tryptophanyl-tRNA synthetase core 89 bioRxiv preprint doi: https://doi.org/10.1101/2020.06.11.147629; this version posted June 12, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

inorganic polyphosphate/ATP-NAD 1237660 1238460 + 1260 108 gene 1237660 1238460 + BSU_11610 ppnKA nadF, yjbN ppnKA ppnK TW NP_389043.1 core 89 kinase (quinolate activated)

1321329 1321670 - 1381 108 gene 1321329 1321670 - BSU_12510 xre xre NP_389133.1 phage PBSX transcriptional regulator core 96

core 1396013 1397368 + 1468 108 gene 1396013 1397368 + BSU_13300 mgtE ykoK mgtE NP_389213.1 magnesium transporter 100

N-acetyl-L,L-diaminopimelate core 1471857 1473038 - 1541 108 gene 1471857 1473038 - BSU_14000 dapX uat patA NP_389283.2 aminotransferase 103

tetrahydrodipicolinate N- core 1488973 1489683 + 1559 108 gene 1488973 1489683 + BSU_14180 dapH dapH ykuQ TW NP_389301.2 acetyltransferase 104

N-acetyl-diaminopimelate core 1489753 1490877 + 1560 108 gene 1489753 1490877 + BSU_14190 dapI dapL ykuR TW NP_389302.1 deacetylase 104

core 1523118 1524785 - 1595 108 gene 1523118 1524785 - BSU_14530 rnjA rnjA rnjA ykqC TW NP_389336.1 ribonuclease J1 106

pyruvate dehydrogenase (E1 alpha core 1528326 1529441 + 1601 108 gene 1528326 1529441 + BSU_14580 pdhA aceA pdhA RB NP_389341.1 subunit) 106

core 1552412 1552693 + 1629 108 gene 1552412 1552693 + BSU_14840 ylaN ylaN ylaN TW NP_389367.1 conserved hypothetical protein 108

cell-division protein; transporter of core 1552899 1554110 + 1630 108 gene 1552899 1554110 + BSU_14850 ftsW ylaO ftsW ftsW TW NP_389368.1 lipid-linked cell wall precursors 108

phosphopantetheine core 1570078 1570563 + 1647 108 gene 1570078 1570563 + BSU_15020 coaD ylbI coaD NP_389385.1 adenylyltransferase 109

core 1575804 1575983 + 1654 108 gene 1575804 1575983 + BSU_15080 rpmF rpmF RO NP_389391.1 ribosomal protein L32 110

core 1581597 1581950 + 1661 108 gene 1581597 1581950 + BSU_15150 ftsL yllD, ylxB ftsL ftsL RB NP_389398.1 cell-division protein 110

core 1581947 1584097 + 1662 107 gene 1581947 1584097 + BSU_15160 pbpB pbpB pbpB RB NP_389399.2 penicillin-binding protein 2B 110

UDP-N-acetylmuramoylalanyl-D- core 1586330 1587814 + 1664 108 gene 1586330 1587814 + BSU_15180 murE murE murE RO NP_389401.1 glutamate-2, 6-diaminopimelate 110 ligase

phospho-N-acetylmuramoyl- core 1587926 1588900 + 1665 108 gene 1587926 1588900 + BSU_15190 mraY mraY mraY RO NP_389402.2 pentapeptide undecaprenyl 110 phosphate (C55P) transferase bioRxiv preprint doi: https://doi.org/10.1101/2020.06.11.147629; this version posted June 12, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

UDP-N-acetylmuramoylalanyl-D- core 1588901 1590256 + 1666 108 gene 1588901 1590256 + BSU_15200 murD murD murD RB NP_389403.1 glutamate ligase 110

UDP-N-acetylglucosamine-N- acetylmuramyl- core 1591540 1592631 + 1668 108 gene 1591540 1592631 + BSU_15220 murG murG murG RB NP_389405.2 (pentapeptide)pyrophosphoryl- 110 undecaprenol N-acetylglucosamine transferase

UDP-N- core 1592663 1593574 + 1669 108 gene 1592663 1593574 + BSU_15230 murB ylxC murB murB RB NP_389406.1 acetylenolpyruvoylglucosamine 110 reductase

core 1593704 1594495 + 1670 108 gene 1593704 1594495 + BSU_15240 divIB dds divIB divIB RB NP_389407.1 cell-division initiation protein 110

cell-division protein essential for Z- core 1596474 1597796 + 1674 108 gene 1596474 1597796 + BSU_15280 ftsA ftsA ftsA RB NP_389411.2 ring assembly 110

core 1597832 1598980 + 1675 105 gene 1597832 1598980 + BSU_15290 ftsZ ftsZ ftsZ RB NP_389412.2 cell-division initiation protein 110

core 1613357 1616122 + 1689 108 gene 1613357 1616122 + BSU_15430 ileS ileS ileS RO NP_389426.2 isoleucyl-tRNA synthetase 110 core 1641949 1642563 + 1714 108 gene 1641949 1642563 + BSU_15680 gmk yloD gmk gmk TW NP_389450.2 guanylate kinase 111

coenzyme A biosynthesis bifunctional protein CoaBC; core 1642851 1644071 + 1716 108 gene 1642851 1644071 + BSU_15700 coaBC yloI coaBC NP_389452.1 phosphopantothenoylcysteine 111 synthetase/decarboxylase

primosomal replication factor Y core 1644068 1646485 + 1717 108 gene 1644068 1646485 + BSU_15710 priA yloJ priA priA RB, TW NP_389453.1 (primosomal protein N\') 111

core 1646999 1647952 + 1719 108 gene 1646999 1647952 + BSU_15730 fmt yloL fmt TW NP_389455.1 methionyl-tRNA formyltransferase 111

GTPase involved in ribosome core 1653103 1653999 + 1724 108 gene 1653103 1653999 + BSU_15780 rsgA cpgA, engC yloQ TW NP_389460.1 biogenesis 111

core 1655599 1655787 - 1728 108 gene 1655599 1655787 - BSU_15820 rpmB yloT rpmB RO NP_389464.1 ribosomal protein L28 111

core 1662547 1663548 + 1735 108 gene 1662547 1663548 + BSU_15890 plsX ylpD plsX plsX TW NP_389471.1 phosphate:acyl-ACP acyltransferase 111

malonyl CoA:acyl carrier protein core 1663567 1664520 + 1736 108 gene 1663567 1664520 + BSU_15900 fabD ylpE fabD fabD TW NP_389472.1 transacylase 111 bioRxiv preprint doi: https://doi.org/10.1101/2020.06.11.147629; this version posted June 12, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

beta-ketoacyl-acyl carrier protein core 1664513 1665253 + 1737 108 gene 1664513 1665253 + BSU_15910 fabG ylpF fabG fabG TW NP_389473.1 reductase 111

core 1665337 1665570 + 1738 108 gene 1665337 1665570 + BSU_15920 acpA acpP acpA acpA RO NP_389474.1 acyl carrier protein 111

core 1665710 1666459 + 1739 108 gene 1665710 1666459 + BSU_15930 rnc rnc rnc RB NP_389475.1 ribonuclease III rnc, rncS 111

chromosome condensation and core 1666560 1670120 + 1740 108 gene 1666560 1670120 + BSU_15940 smc ylqA smc smc RB, TW NP_389476.2 segregation SMC ATPase 111

signal recognition particle (docking core 1670140 1671129 + 1741 108 gene 1670140 1671129 + BSU_15950 ftsY srb ftsY ftsY RB NP_389477.1 protein) 111

signal recognition particle-like (SRP) core 1672174 1673514 + 1744 108 gene 1672174 1673514 + BSU_15980 ffh ffh ffh RB NP_389480.1 GTPase 112

core 1673620 1673892 + 1745 108 gene 1673620 1673892 + BSU_15990 rpsP rpsP rpsP RO NP_389481.1 ribosomal protein S16 (BS17) 112

core 1675171 1675902 + 1749 108 gene 1675171 1675902 + BSU_16030 trmD trmD trmD TW NP_389485.1 tRNA(m1G37)methyltransferase 112

core 1676042 1676389 + 1750 108 gene 1676042 1676389 + BSU_16040 rplS rplS rplS RO NP_389486.2 ribosomal protein L19 112

core 1676532 1677380 + 1751 108 gene 1676532 1677380 + BSU_16050 rbgA rbgA ylqF RB, TW NP_389487.1 ribosome biogenesis GTPase A 112

core 1683661 1685736 + 1758 108 gene 1683661 1685736 + BSU_16120 topA topI topA TW NP_389494.1 DNA I 112

core 1717933 1718673 + 1796 108 gene 1717933 1718673 + BSU_16490 rpsB rpsB rpsB RO NP_389531.1 ribosomal protein S2 112

core 1718775 1719656 + 1797 108 gene 1718775 1719656 + BSU_16500 tsf tsf tsf TW NP_389532.1 elongation factor Ts 112 core 1719802 1720524 + 1798 108 gene 1719802 1720524 + BSU_16510 pyrH smbA pyrH NP_389533.2 uridylate kinase 112 core 1720526 1721083 + 1799 108 gene 1720526 1721083 + BSU_16520 frr frr frr TW NP_389534.1 ribosome recycling factor 112

undecaprenyl pyrophosphate core 1721214 1721996 + 1800 108 gene 1721214 1721996 + BSU_16530 uppS yluA uppS NP_389535.1 synthase 112

phosphatidate cytidylyltransferase core 1722000 1722809 + 1801 108 gene 1722000 1722809 + BSU_16540 cdsA cdsA cdsA TW NP_389536.1 (CDP-diglyceride synthase) 112

1-deoxy-D-xylulose-5-phosphate core 1722871 1724022 + 1802 108 gene 1722871 1724022 + BSU_16550 dxr yluB dxr dxr TW NP_389537.2 reductoisomerase 112

core 1725330 1727024 + 1804 108 gene 1725330 1727024 + BSU_16570 proS proS proS RO NP_389539.1 prolyl-tRNA synthetase 112

core 1727133 1731446 + 1805 108 gene 1727133 1731446 + BSU_16580 polC dnaF, mutI polC polC RB, TW NP_389540.1 DNA polymerase III (alpha subunit) 112 bioRxiv preprint doi: https://doi.org/10.1101/2020.06.11.147629; this version posted June 12, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

transcription translation coupling core 1732281 1733396 + 1807 108 gene 1732281 1733396 + BSU_16600 nusA nusA nusA RB NP_389542.1 factor involved in Rho-dependent 113 transcription termination

core 1734009 1736159 + 1810 108 gene 1734009 1736159 + BSU_16630 infB infB infB TW NP_389545.1 initiation factor IF-2 113

bifunctional riboflavin kinase FAD core 1737834 1738784 + 1814 108 gene 1737834 1738784 + BSU_16670 ribC ribC NP_389549.1 synthase 113

core 1738941 1739210 + 1815 108 gene 1738941 1739210 + BSU_16680 rpsO rpsO rpsO RO NP_389550.1 ribosomal protein S15 (BS18) 113

aspartate-semialdehyde core 1745991 1747031 + 1822 108 gene 1745991 1747031 + BSU_16750 asd asd asd TW NP_389557.1 dehydrogenase 113

aspartokinase I (alpha and beta core 1747123 1748337 + 1823 108 gene 1747123 1748337 + BSU_16760 dapG lssD dapG NP_389558.2 subunits) 113

4-hydroxy-tetrahydrodipicolinate core 1748368 1749240 + 1824 108 gene 1748368 1749240 + BSU_16770 dapA dapA dapA TW NP_389559.1 synthase 113

CDP-diacylglycerol-glycerol-3- core 1762623 1763204 + 1837 108 gene 1762623 1763204 + BSU_16920 pgsA ymfN pgsA pgsA TW NP_389574.2 phosphate 3- 113 phosphatidyltransferase

core 1767310 1768872 + 1841 108 gene 1767310 1768872 + BSU_16960 rny ymdA TW NP_389578.1 endoribonuclease Y 113

co-factor of ribonucleotide core 1868617 1869009 + 1884 108 gene 1868617 1869009 + BSU_17370 nrdI nrdI ymaA TW NP_389619.1 diphosphate reductase 115

ribonucleoside-diphosphate core 1868969 1871071 + 1885 108 gene 1868969 1871071 + BSU_17380 nrdE nrdA nrdE nrdE RB NP_389620.1 reductase (major subunit) 115

ribonucleoside-diphosphate core 1871089 1872078 + 1886 108 gene 1871089 1872078 + BSU_17390 nrdF nrdF nrdF RB NP_389621.1 reductase (minor subunit) 115

core 1919861 1921864 + 1996 108 gene 1919861 1921864 + BSU_17890 tktA tkt TW NP_389672.1 transketolase 127 core 1922549 1922767 + 1998 108 gene 1922549 1922767 + BSU_17910 yneF yoxG yneF NP_389674.1 putative acyltransferase 127

acylphosphate:glycerol-3-phosphate core 1931920 1932501 - 2015 108 gene 1931920 1932501 - BSU_18070 plsY plsY yneS TW NP_389689.1 O-acyltransferase 128

subunit B of DNA topoisomerase IV core 1933477 1935444 + 2017 108 gene 1933477 1935444 + BSU_18090 parE grlB parE parE RB NP_389691.2 (ATP-dependent) 128 bioRxiv preprint doi: https://doi.org/10.1101/2020.06.11.147629; this version posted June 12, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

subunit A of DNA topoisomerase IV core 1935448 1937868 + 2018 108 gene 1935448 1937868 + BSU_18100 parC grlA parC parC RB NP_389692.2 (ATP-dependent) 128

2-oxoglutarate dehydrogenase core 2107505 2108758 - 2115 108 gene 2107505 2108758 - BSU_19360 odhB citM odhB TW NP_389818.2 complex (dihydrolipoamide 137 transsuccinylase, E2 subunit)

core 2296603 2297109 - 2181 108 gene 2296603 2297109 - BSU_21810 dfrA dfrA dfrA TW NP_390064.1 dihydrofolate reductase 140

DNA-remodelling primosomal core 2345433 2346131 - 2240 108 gene 2345433 2346131 - BSU_22350 dnaD dnaD dnaD RB NP_390116.1 protein 142

core 2346224 2347516 - 2241 108 gene 2346224 2347516 - BSU_22360 asnS asnS RO NP_390117.1 asparaginyl-tRNA synthetase 142

biotin acetyl-CoA-carboxylase ligase core 2354918 2355895 - 2249 108 gene 2354918 2355895 - BSU_22440 birA birA birA TW NP_390125.1 and biotin regulon repressor (BirA- 142 biotinoyl-5\'-AMP)

core 2355880 2357073 - 2250 108 gene 2355880 2357073 - BSU_22450 cca papS, ypjI cca cca TW NP_390126.1 tRNA nucleotidyltransferase 142

(4S)-4-hydroxy-2,3,4, 5-tetrahydro- core 2359340 2360143 - 2254 108 gene 2359340 2360143 - BSU_22490 dapB dapB dapB TW NP_390130.1 (2S)-dipicolinic acid (HTPA) 142 dehydratase reductase

3-phosphoshikimate 1- carboxyvinyltransferase (5- core 2367954 2369240 - 2265 108 gene 2367954 2369240 - BSU_22600 aroA aroE NP_390141.1 enolpyruvoylshikimate-3-phosphate 142 synthase)

core 2379100 2380272 - 2276 108 gene 2379100 2380272 - BSU_22710 aroF aroF aroF NP_390152.3 chorismate synthase 142

gerC3, heptaprenyl diphosphate synthase core 2381919 2382965 - 2279 108 gene 2381919 2382965 - BSU_22740 hepT gerCC, hepT RB NP_390155.1 component II 142 hepB

gerC1, heptaprenyl diphosphate synthase core 2383615 2384370 - 2281 108 gene 2383615 2384370 - BSU_22760 hepS gerCA, hepS RB NP_390157.1 component I 142 hepA

core 2384783 2385355 - 2283 108 gene 2384783 2385355 - BSU_22780 folEA folE, mtrA folEA NP_390159.1 GTP cyclohydrolase I 142 bioRxiv preprint doi: https://doi.org/10.1101/2020.06.11.147629; this version posted June 12, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

hupA, non-specific DNA-binding protein core 2385543 2385821 - 2284 108 gene 2385543 2385821 - BSU_22790 hbs hbs hbs RB NP_390160.1 dbpA, hbsU HBsu 142

NADPH-dependent glycerol-3- core 2389151 2390188 - 2288 108 gene 2389151 2390188 - BSU_22830 gpsA glyC gpsA RB NP_390164.1 phosphate dehydrogenase 142

GTPase essential for ribosome 50S core 2390206 2391516 - 2289 108 gene 2390206 2391516 - BSU_22840 der engA yphC RB, TW NP_390165.1 subunit assembly (maturation of the 142 50S subunit central protoberance)

core 2396045 2396719 - 2296 108 gene 2396045 2396719 - BSU_22890 cmk jofC, ypfC, cmk cmk RB NP_390170.1 cytidylate kinase 143

factor required for cytochrome c core 2417984 2419159 - 2319 108 gene 2417984 2419159 - BSU_23130 resC ypxC resC RB NP_390194.2 synthesis 143

factor required for cytochrome c core 2419179 2420807 - 2320 108 gene 2419179 2420807 - BSU_23140 resB ypxB resB RB NP_390195.1 synthesis 143

extracytoplasmic thioredoxin core 2420804 2421343 - 2321 108 gene 2420804 2421343 - BSU_23150 resA ypxA resA RB NP_390196.2 involved in cytochrome c maturation 143 (lipoprotein)

chromosome condensation and core 2425248 2425841 - 2327 108 gene 2425248 2425841 - BSU_23210 spcB ypuH RB, TW NP_390202.1 segregation factor 143

chromosome condensation and core 2425831 2426586 - 2328 108 gene 2425831 2426586 - BSU_23220 scpA scpA ypuG RB, TW NP_390203.1 partitioning factor 143

core 2478006 2478929 - 2411 108 gene 2478006 2478929 - BSU_23840 rnz rnz yqjK TW NP_390265.1 ribonuclease Z 146

1-deoxyxylulose-5-phosphate core 2523713 2525614 - 2456 108 gene 2523713 2525614 - BSU_24270 dxs yqiE dxs dxs TW NP_390307.1 synthase 146

core 2525789 2526679 - 2457 108 gene 2525789 2526679 - BSU_24280 ispA yqiD TW NP_390308.2 farnesyl diphosphate synthase 146

methylenetetrahydrofolate dehydrogenase; core 2528404 2529255 - 2460 108 gene 2528404 2529255 - BSU_24310 folD yqiA folD TW NP_390311.1 methenyltetrahydrofolate 146 cyclohydrolase

acetyl-CoA carboxylase subunit core 2530354 2531706 - 2463 108 gene 2530354 2531706 - BSU_24340 accC accC1, accC accC TW NP_390314.2 (biotin carboxylase subunit) 146 yqhX bioRxiv preprint doi: https://doi.org/10.1101/2020.06.11.147629; this version posted June 12, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

acetyl-CoA carboxylase subunit core 2531718 2532197 - 2464 108 gene 2531718 2532197 - BSU_24350 accB fabE, yqhW accB accB TW NP_390315.1 (biotin carboxyl carrier subunit) 146

core 2574408 2574557 - 2522 108 gene 2574408 2574557 - BSU_24900 rpmGA rpmG1 rpmGA RO NP_570908.2 ribosomal protein L33 148

4-hydroxy-3-methylbut-2-en-1-yl diphosphate synthase (1-hydroxy-2- core 2589123 2590256 + 2538 108 gene 2589123 2590256 + BSU_25070 ispG yqfY ispG yqfY TW NP_390386.1 methyl-2-(E)-butenyl 4-diphosphate 150 synthase)

1-hydroxy-2-methyl-2-(E)-butenyl 4- core 2596535 2597479 + 2547 108 gene 2596535 2597479 + BSU_25160 ispH ispH yqfP TW NP_390395.2 diphosphate reductase 150

RNA polymerase major sigma-43 core 2600214 2601329 - 2551 108 gene 2600214 2601329 - BSU_25200 sigA rpoD sigA sigA TW NP_390399.2 factor (sigma-A) 150

core 2601528 2603339 - 2552 108 gene 2601528 2603339 - BSU_25210 dnaG dnaE dnaG dnaG RB NP_390400.2 DNA 150

glycyl-tRNA synthetase (beta core 2605730 2607769 - 2556 108 gene 2605730 2607769 - BSU_25260 glyS yqfK glyS glyS RO NP_390404.2 subunit) 150

glycyl-tRNA synthetase (alpha core 2607762 2608649 - 2557 108 gene 2607762 2608649 - BSU_25270 glyQ yqfJ glyQ glyQ RO NP_390405.1 subunit) 150

maturation of 16S RNA and assembly core 2610041 2610946 - 2560 108 gene 2610041 2610946 - BSU_25290 era bex, yqfH era era RB, TW NP_390407.1 of 30S ribosomal subunit GTPase 150

core 2620371 2620544 - 2572 108 gene 2620371 2620544 - BSU_25410 rpsU yqeX rpsU RO NP_390419.1 ribosomal protein S21 150

core 2635815 2636081 + 2586 108 gene 2635815 2636081 + BSU_25550 rpsT yqeO rpsT RO NP_390433.2 ribosomal protein S20 (BS20) 150

DNA polymerase clamp loader delta core 2636096 2637139 - 2587 108 gene 2636096 2637139 - BSU_25560 holA holA yqeN TW NP_390434.1 subunit 150

nicotinate-nucleotide core 2643765 2644334 - 2597 108 gene 2643765 2644334 - BSU_25640 nadD nadD yqeJ TW NP_390442.1 adenylyltransferase 150

core 2644346 2644636 - 2598 108 gene 2644346 2644636 - BSU_25650 yqeI yqeI TW NP_390443.1 50S RNA-binding protein 150

potassium-dependent GTPase core 2645490 2646590 - 2600 108 gene 2645490 2646590 - BSU_25670 rgpH yqeH RB, TW NP_390445.1 involved in ribosome 30S assembly 150 bioRxiv preprint doi: https://doi.org/10.1101/2020.06.11.147629; this version posted June 12, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

phosphatase (active on GMP and core 2646594 2647112 - 2601 108 gene 2646594 2647112 - BSU_25680 yqeG yqeG NP_390446.1 Glc-6-P) 150

non- antitoxin factor of ribonuclease toxin 2662712 2663290 + 4590 46 gene 2662712 2663290 + BSU_25870 rttF yqcF NP_390464.1 core RttG; skin element 150-151

skin element; transcriptional non- 2698893 2699243 + 4643 34 gene 2698893 2699243 + BSU_26350 sknR yqaE NP_390512.1 repressor of yqaF-yqaN operon (Xre core

family) 150-151

core 2798174 2800810 - 2726 108 gene 2798174 2800810 - BSU_27410 alaS alaS alaS RO NP_390618.1 alanyl-tRNA synthetase 162

core 2809413 2810528 - 2737 108 gene 2809413 2810528 - BSU_27500 mnmA yrrA trmU trmU TW NP_390627.3 tRNA-specific 2-thiouridylase 163

cysteine desulfurase involved in U34 core 2810559 2811698 - 2738 108 gene 2810559 2811698 - BSU_27510 iscSA iscS yrvO TW NP_390629.2 tRNA thiolation 163

aspartyl-tRNA synthetase, core 2814743 2816521 - 2743 108 gene 2814743 2816521 - BSU_27550 aspS aspS aspS RO NP_390633.1 promiscuous (also recognizes 163 tRNAasn)

core 2816535 2817809 - 2744 108 gene 2816535 2817809 - BSU_27560 hisS hisS hisS TW NP_390634.1 histidyl-tRNA synthetase 163

ppGpp-binding GTPase involved in core 2852661 2853947 - 2782 108 gene 2852661 2853947 - BSU_27920 obgE obg obg RB, TW NP_390670.1 cell portioning, DNA repair and 165 ribosome assembly

core 2854880 2855164 - 2785 108 gene 2854880 2855164 - BSU_27940 rpmA rpmA rpmA RO NP_390672.1 ribosomal protein L27 (BL24) 165

ribosomal protein L27 specific N- core 2855177 2855515 - 2786 108 gene 2855177 2855515 - BSU_27950 rppA ysxB NP_390673.2 terminal end cysteine protease 165

core 2855518 2855826 - 2787 108 gene 2855518 2855826 - BSU_27960 rplU rplU rplU RO NP_390674.1 ribosomal protein L21 (BL20) 165

core 2859317 2859835 - 2792 108 gene 2859317 2859835 - BSU_28010 mreD rodB mreD NP_390679.1 cell-shape determining protein 165

core 2859832 2860704 - 2793 108 gene 2859832 2860704 - BSU_28020 mreC mreC mreC TW NP_390680.2 cell-shape determining protein 165

core 2860735 2861748 - 2794 108 gene 2860735 2861748 - BSU_28030 mreB mreB mreB TW NP_390681.2 cell-shape determining protein 165

core 2865312 2866604 - 2799 108 gene 2865312 2866604 - BSU_28080 folC folC NP_390686.1 folyl-polyglutamate synthase 165

core 2866664 2869306 - 2800 108 gene 2866664 2869306 - BSU_28090 valS valS valS RO NP_390687.2 valyl-tRNA synthetase 165 bioRxiv preprint doi: https://doi.org/10.1101/2020.06.11.147629; this version posted June 12, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

GTPase involved in ribosome 50S core 2879882 2880469 - 2811 108 gene 2879882 2880469 - BSU_28190 engB ysxC ysxC RB, TW NP_390697.1 subunit assembly (maturation of the 165 central 50S protuberance)

core 2903217 2904035 - 2841 108 gene 2903217 2904035 - BSU_28390 rcmE glr, murI racE NP_390717.2 glutamate racemase 168 core 2913024 2913338 - 2851 108 gene 2913024 2913338 - BSU_28500 trxA trx trxA trxA TW NP_390728.1 thioredoxin 168

phenylalanyl-tRNA synthetase (beta core 2927008 2929422 - 2864 108 gene 2927008 2929422 - BSU_28630 pheT pheT pheT RO NP_390741.1 subunit) 168

phenylalanyl-tRNA synthetase (alpha core 2929438 2930472 - 2865 108 gene 2929438 2930472 - BSU_28640 pheS pheS pheS RO NP_390742.1 subunit) 168

core 2952224 2952583 - 2886 108 gene 2952224 2952583 - BSU_28850 rplT rplT rplT RO NP_390763.1 ribosomal protein L20 168

core 2952615 2952815 - 2887 108 gene 2952615 2952815 - BSU_28860 rpmI rpmI RO NP_390764.1 ribosomal protein L35 168

core 2952828 2953349 - 2888 108 gene 2952828 2953349 - BSU_28870 infC infC infC TW* NP_390765.1 initiation factor IF-3 168 core 2963185 2964120 - 2898 108 gene 2963185 2964120 - BSU_28980 dnaI ytxA dnaI dnaI RB NP_390776.1 helicase loader 168

helicase loading protein; replication core 2964148 2965566 - 2899 108 gene 2964148 2965566 - BSU_28990 dnaB dnaB dnaB RB NP_390777.1 initiation membrane attachment 168 protein

core 2970922 2971515 - 2906 108 gene 2970922 2971515 - BSU_29060 coaE coaE ytaG TW NP_390784.1 dephosphocoenzyme A kinase 168

core 2986588 2987547 - 2919 108 gene 2986588 2987547 - BSU_29190 pfkA pfk pfkA TW NP_390797.1 6-phosphofructokinase 169

acetyl-CoA carboxylase core 2987731 2988708 - 2920 108 gene 2987731 2988708 - BSU_29200 accA accA accA TW NP_390798.1 (carboxyltransferase alpha subunit) 169

acetyl-CoA carboxylase core 2988693 2989565 - 2921 108 gene 2988693 2989565 - BSU_29210 accD yttI accD accD TW NP_390799.2 (carboxyltransferase beta subunit) 169

DNA polymerase III (alpha subunit), core 2991269 2994616 - 2923 108 gene 2991269 2994616 - BSU_29230 dnaEC dnaE dnaE RB, TW NP_390801.1 DnaE3 169

core 3035730 3036332 + 2958 108 gene 3035730 3036332 + BSU_29660 rpsD rpsD rpsD RO NP_390844.1 ribosomal protein S4 (BS4) 170

core 3036603 3037871 - 2959 108 gene 3036603 3037871 - BSU_29670 tyrS tyrS1 tyrS tyrS TW* NP_390845.1 tyrosyl-tRNA synthetase 170

UDP-N-acetyl muramate-alanine core 3048177 3049475 - 2971 108 gene 3048177 3049475 - BSU_29790 murC ytxF murC murC TW NP_390857.1 ligase 170

core 3102629 3105043 - 3021 108 gene 3102629 3105043 - BSU_30320 leuS leuS leuS RO NP_390910.1 leucyl-tRNA synthetase 170 bioRxiv preprint doi: https://doi.org/10.1101/2020.06.11.147629; this version posted June 12, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

core 3127825 3129027 - 3045 108 gene 3127825 3129027 - BSU_30550 metK metE metK metK TW NP_390933.1 S-adenosylmethionine synthetase 173

core 3146238 3147353 - 3068 108 gene 3146238 3147353 - BSU_30780 menC ytfD menC menC TW NP_390956.1 O-succinylbenzoate-CoA synthase 173

core 3147350 3148810 - 3069 108 gene 3147350 3148810 - BSU_30790 menE menE menE TW NP_390957.1 O-succinylbenzoic acid-CoA ligase 173

core 3148901 3149716 - 3070 108 gene 3148901 3149716 - BSU_30800 menB menB menB TW NP_390958.1 dihydroxynapthoic acid synthetase 173

2-succinyl-6-hydroxy-2, 4- ytfB, core 3149751 3150575 - 3071 108 gene 3149751 3150575 - BSU_30810 menH menH RB NP_390959.1 cyclohexadiene-1-carboxylate ytxM 173 synthase

2-oxoglutarate decarboxylase and 2- succinyl-5-enolpyruvyl-6-hydroxy-3- core 3150563 3152305 - 3072 108 gene 3150563 3152305 - BSU_30820 menD menD menD TW NP_390960.1 cyclohexene-1-carboxylic-acid 173 synthase

ntrA, shaA, sodium transporter component of a core 3246598 3249003 + 3180 108 gene 3246598 3249003 + BSU_31600 mrpA mrpA mrpA TW NP_391038.2 yufT Na+/H+ antiporter 177

core 3248996 3249427 + 3181 108 gene 3248996 3249427 + BSU_31610 mrpB yufU mrpB mrpB TW NP_391039.1 Na+/H+ antiporter complex 177

core 3249427 3249768 + 3182 108 gene 3249427 3249768 + BSU_31620 mrpC yufV mrpC mrpC TW NP_391040.2 component of Na+/H+ antiporter 177

proton transporter component of core 3249761 3251242 + 3183 108 gene 3249761 3251242 + BSU_31630 mrpD yufD mrpD mrpD TW NP_391041.3 Na+/H+ antiporter 177

non essential component of Na+/H+ core 3251248 3251724 + 3184 108 gene 3251248 3251724 + BSU_31640 mrpE mrpE YP_054591.2 antiporter 177

efflux transporter for Na+ and core 3251724 3252008 + 3185 108 gene 3251724 3252008 + BSU_31650 mrpF yufC mrpF mrpF TW NP_391043.2 cholate 177

non essential component of Na+/H+ core 3251992 3252366 + 3186 108 gene 3251992 3252366 + BSU_31660 mrpG yufB mrpG NP_391044.1 antiporter 177

nicotinate core 3259403 3260875 - 3196 108 gene 3259403 3260875 - BSU_31750 pncB pncB yueK TW NP_391053.1 phosphoribosyltransferase 178 bioRxiv preprint doi: https://doi.org/10.1101/2020.06.11.147629; this version posted June 12, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

ferredoxin-NADP+ reductase core 3301586 3302584 + 3231 108 gene 3301586 3302584 + BSU_32110 trxBB yumC yumC TW NP_391091.2 (flavodoxin) 179

core 3306040 3306894 - 3259 108 gene 3306040 3306894 - BSU_32170 dapF yutL dapF dapF TW NP_391097.1 diaminopimelate epimerase 180

FeS cluster formation scaffold core 3355593 3356990 - 3312 108 gene 3355593 3356990 - BSU_32670 sufB sufB yurU TW NP_391146.1 protein 185

iron-sulfur cluster assembly sulfur- core 3357011 3357454 - 3313 108 gene 3357011 3357454 - BSU_32680 sufU nifU iscU yurV TW NP_391147.1 transfer protein (Zn(2+)-dependent) 185

core 3357444 3358664 - 3314 108 gene 3357444 3358664 - BSU_32690 sufS yurW sufS csd TW NP_391148.1 cysteine desulfurase 185

core 3358664 3359977 - 3315 108 gene 3358664 3359977 - BSU_32700 sufD sufD yurX TW NP_391149.1 Fe-S cluster assembly protein SufD 185

sulfur mobilizing ABC protein, core 3359995 3360780 - 3316 108 gene 3359995 3360780 - BSU_32710 sufC sufC yurY TW NP_391150.1 ATPase 185

core 3476555 3477847 - 3499 108 gene 3476555 3477847 - BSU_33900 eno eno eno TW NP_391270.1 enolase 193 core 3477877 3479412 - 3500 108 gene 3477877 3479412 - BSU_33910 pgm gpmI pgm pgm TW NP_391271.1 phosphoglycerate mutase 193

core 3479405 3480166 - 3501 108 gene 3479405 3480166 - BSU_33920 tpiA tpi tpiA TW NP_391272.1 triose phosphate isomerase 193

core 3480197 3481381 - 3502 108 gene 3480197 3481381 - BSU_33930 pgk pgk TW NP_391273.1 phosphoglycerate kinase 193

glyceraldehyde-3-phosphate core 3481698 3482705 - 3503 108 gene 3481698 3482705 - BSU_33940 gapA gap gapA NP_391274.1 dehydrogenase (NAD-dependent, 193 glycolytic)

promiscuous aminoacid racemase core 3533419 3534102 - 3555 108 gene 3533419 3534102 - BSU_34430 racX racE TW NP_391323.1 (prefers arginine, lysine and 196 ornithine)

core 3573207 3574157 - 3642 108 gene 3573207 3574157 - BSU_34790 trxB yvcH trxB trxB TW NP_391359.1 thioredoxin reductase 199

core 3627139 3628240 - 3689 108 gene 3627139 3628240 - BSU_35290 prfB prfB prfB TW NP_391409.1 peptide chain release factor 2 203

core 3628310 3630835 - 3690 108 gene 3628310 3630835 - BSU_35300 secA div+ secA secA RB NP_391410.1 translocase binding subunit (ATPase) 203

UDP-N- core 3648654 3649730 - 3721 108 gene 3648654 3649730 - BSU_35530 tagO yvhI tagO tagO TW NP_391433.1 acetylglucosamine:undecaprenyl-P 206 N-acetylglucosaminyl-1-P transferase bioRxiv preprint doi: https://doi.org/10.1101/2020.06.11.147629; this version posted June 12, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

non- putative exporter involved in 3656752 3658203 - 4729 85 gene 3656752 3658203 - BSU_35600 tuaB yvhB tuaB NP_391440.1 core biosynthesis of teichuronic acid 206-207

non- UDP-N-acetylmannosamine 2- 3664241 3665383 - 4735 84 gene 3664241 3665383 - BSU_35660 mnaA mnaA yvyH TW NP_391446.1 core epimerase 206-207

non- ATP-binding teichoic acid precursor 3673564 3675147 - 4744 84 gene 3673564 3675147 - BSU_35700 tagH tagH tagH RB NP_391451.1 core transporter component 206-207

non- 3675167 3675994 - 4745 35 gene 3675167 3675994 - BSU_35710 tagG tagG tagG RB NP_391452.1 teichoic acid precursors permease core

206-207

CDP-glycerol:polyglycerol phosphate non- glycero-phosphotransferase 3676159 3678399 - 4746 35 gene 3676159 3678399 - BSU_35720 tagF rodC, tag3 tagF tagF TW NP_391453.1 core (poly(glycerol phosphate) 206-207 polymerase)

non- glycerol-3-phosphate 3680581 3680970 - 4748 35 gene 3680581 3680970 - BSU_35740 tagD tagD tagD TW NP_391455.1 core cytidylyltransferase 206-207

N-acetylmannosamine (ManNAc) C4 hydroxyl of a membrane-anchored non- 3681370 3682140 + 4749 35 gene 3681370 3682140 + BSU_35750 tagA tagA tagA TW NP_391456.1 N-acetylglucosaminyl diphospholipid core

(GlcNAc-pp-undecaprenyl, lipid I) 206-207 glycosyltransferase

teichoic acid primase, CDP- glycerol:N-acetyl-beta-d- non- tagF, rodC, 3682173 3683318 + 4750 35 gene 3682173 3683318 + BSU_35760 tagB tagB tagB TW NP_391457.1 mannosaminyl-1, 4-N-acetyl-d- core tag3 glucosaminyldiphosphoundecaprenyl 206-207 glycerophosphotransferase

core 3740206 3740547 - 3807 108 gene 3740206 3740547 - BSU_36310 ssbB ywpH ssb RO NP_391512.2 single-strand DNA-binding protein 214 core 3743732 3744157 - 3814 108 gene 3743732 3744157 - BSU_36370 fabZ ywpB fabZ NP_391518.2 beta-hydroxyacyl 215 bioRxiv preprint doi: https://doi.org/10.1101/2020.06.11.147629; this version posted June 12, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

core 3747254 3748255 - 3818 108 gene 3747254 3748255 - BSU_36410 mbl mbl NP_391522.2 MreB-like morphogen 215

UDP-N-acetylglucosamine 1- core 3777949 3779259 - 3857 108 gene 3777949 3779259 - BSU_36760 murAA murA murAA murAA TW NP_391557.1 carboxyvinyltransferase 217

glyC, ipc- core 3789190 3790437 - 3871 108 gene 3789190 3790437 - BSU_36900 glyA glyA TW NP_391571.1 serine hydroxymethyltransferase 34d 217

tRNA(NNU) t(6)A37 threonylcarbamoyladenosine core 3792969 3794009 - 3876 108 gene 3792969 3794009 - BSU_36950 tsaC ipc-29d ywlC TW NP_391576.1 modification; threonine-dependent 217 ADP-forming ATPase

core 3797085 3798155 - 3882 108 gene 3797085 3798155 - BSU_37010 prfA prfA prfA TW NP_391582.1 peptide chain release factor 1 217

core 3803081 3803281 - 3888 108 gene 3803081 3803281 - BSU_37070 rpmEA rpmE RO NP_391588.1 ribosomal protein L31 217

fba, fba1, core 3808512 3809369 - 3894 108 gene 3808512 3809369 - BSU_37120 fbaA fbaA TW NP_391593.1 fructose-1,6-bisphosphate aldolase tsr 217

core 3810693 3812300 - 3897 108 gene 3810693 3812300 - BSU_37150 pyrG ctrA pyrG pyrG TW NP_391596.1 CTP synthetase 218 core 3833650 3835320 - 3916 108 gene 3833650 3835320 - BSU_37330 argS argS argS RO NP_391614.1 arginyl-tRNA synthetase 219

glycosyltransferase involved in non- ywcF, ipa- 3912332 3913513 - 3994 97 gene 3912332 3913513 - BSU_38120 rodA rodA rodA RB NP_391691.1 extension of the lateral walls of the core 42d cell 224-225

ywaB, ipa- 1,4-dihydroxy-2-naphthoate core 3950726 3951661 - 4034 108 gene 3950726 3951661 - BSU_38490 menA menA menA TW NP_391728.1 6d octaprenyltransferase 229

core 3969999 3970319 - 4056 108 gene 3969999 3970319 - BSU_38690 yxlC yxlC NP_391748.1 sigma-Y antisigma factor 229 non- 4023054 4023482 - 4769 39 gene 4023054 4023482 - BSU_39220 wapI N17H yxxG NP_391801.1 antitoxin of WapA tRNase core

234-235

non- antitoxin factor of the RttD-RttE 4036344 4036787 - 4772 53 gene 4036344 4036787 - BSU_39290 rtbE N17A yxxD NP_391808.1 core toxin-antitoxin system 236-237

two-component sensor histidine core 4151853 4153688 - 4257 108 gene 4151853 4153688 - BSU_40400 walK walK yycG RB, TW NP_391920.1 kinase 250

core 4153696 4154403 - 4258 108 gene 4153696 4154403 - BSU_40410 walR walR yycF RB, TW NP_391921.1 two-component response regulator 250

core 4157471 4158835 - 4265 108 gene 4157471 4158835 - BSU_40440 dnaC dnaC dnaC RB NP_391924.2 replicative DNA helicase 250

core 4163197 4163646 - 4271 108 gene 4163197 4163646 - BSU_40500 rplI rplI RO NP_391930.1 ribosomal protein L9 250 bioRxiv preprint doi: https://doi.org/10.1101/2020.06.11.147629; this version posted June 12, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

inorganic pyrophosphatase (Mn2+- core 4168204 4169133 + 4277 108 gene 4168204 4169133 + BSU_40550 ppaC ppaC ppaC TW NP_391935.1 dependent) 251

core 4198603 4198842 - 4303 108 gene 4198603 4198842 - BSU_40890 rpsR rpsR rpsR RO NP_391969.2 ribosomal protein S18 257

core 4198886 4199404 - 4304 108 gene 4198886 4199404 - BSU_40900 ssbA ssbA NP_391970.1 single-strand DNA-binding protein 257

core 4199445 4199732 - 4305 108 gene 4199445 4199732 - BSU_40910 rpsF rpsF RO NP_391971.1 ribosomal protein S6 (BS9) 257

protein component of ribonuclease P core 4214753 4215103 - 4320 108 gene 4214753 4215103 - BSU_41050 rnpA rnpA rnpA RO NP_391985.1 (RNase P) (substrate specificity) 258

core 4215255 4215389 - 4321 108 gene 4215255 4215389 - BSU_41060 rpmH rpmH RO NP_391986.1 ribosomal protein L34 258 bioRxiv preprint doi: https://doi.org/10.1101/2020.06.11.147629; this version posted June 12, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

Table 2. The 258 B. subtilis core regions identified through the pan-genome graph.

Core Region Start Stop Length (bp) 1 410 92089 91680 2 92253 95352 3100 3 98109 165706 67598 4 165830 165902 73 5 166253 166328 76 6 166501 168053 1553 7 177083 202079 24997 8 205409 210946 5538 9 222971 227954 4984 10 228066 238129 10064 11 241917 243661 1745 12 243892 250053 6162 13 252514 282897 30384 14 283003 317603 34601 15 317725 318927 1203 16 319180 320352 1173 17 320421 327377 6957 18 327618 332396 4779 19 335329 340585 5257 20 340613 364142 23530 21 365170 365754 585 22 365841 369217 3377 23 369773 370105 333 24 370259 406007 35749 25 406131 435989 29859 26 436036 455738 19703 27 455771 478741 22971 28 488314 490554 2241 29 490777 514764 23988 30 515016 524249 9234 31 525743 529419 3677 32 559264 560612 1349 33 561514 566101 4588 34 568345 569949 1605 35 570371 575562 5192 36 579541 582479 2939 37 582536 585037 2502 38 585155 587336 2182 bioRxiv preprint doi: https://doi.org/10.1101/2020.06.11.147629; this version posted June 12, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

39 587744 591891 4148 40 593407 599067 5661 41 599107 600105 999 42 601019 602427 1409 43 603012 604576 1565 44 604736 631811 27076 45 632774 648704 15931 46 648742 651868 3127 47 664775 680873 16099 48 680907 694281 13375 49 694425 696998 2574 50 697157 724825 27669 51 724987 737347 12361 52 737603 738982 1380 53 749562 749763 202 54 750959 753131 2173 55 753817 754401 585 56 755907 764765 8859 57 764781 770113 5333 58 772142 788636 16495 59 791462 795867 4406 60 796001 813818 17818 61 814384 817243 2860 62 827993 844645 16653 63 850367 860263 9897 64 860303 875055 14753 65 875178 878051 2874 66 879002 886173 7172 67 888143 893798 5656 68 893904 898867 4964 69 905010 909125 4116 70 909198 911928 2731 71 913924 928658 14735 72 928803 953294 24492 73 953373 957667 4295 74 959311 988377 29067 75 988417 988872 456 76 989712 1013851 24140 77 1013958 1050234 36277 78 1050443 1092728 42286 79 1092770 1108704 15935 80 1108736 1125153 16418 81 1125123 1135225 10103 bioRxiv preprint doi: https://doi.org/10.1101/2020.06.11.147629; this version posted June 12, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

82 1135255 1149935 14681 83 1150206 1156473 6268 84 1156504 1159081 2578 85 1160776 1169849 9074 86 1178218 1182395 4178 87 1187700 1204420 16721 88 1204506 1213449 8944 89 1217326 1250504 33179 90 1252177 1262861 10685 91 1268275 1268592 318 92 1270426 1276291 5866 93 1276337 1289101 12765 94 1290018 1313615 23598 95 1313840 1314304 465 96 1315869 1334247 18379 97 1339632 1343227 3596 98 1345631 1382430 36800 99 1382677 1383147 471 100 1383320 1417416 34097 101 1417561 1441788 24228 102 1447251 1466599 19349 103 1466638 1478836 12199 104 1482248 1494362 12115 105 1494403 1514963 20561 106 1514997 1534239 19243 107 1534279 1538692 4414 108 1538770 1557631 18862 109 1558034 1573790 15757 110 1575051 1616122 41072 111 1616267 1671129 54863 112 1671166 1731446 60281 113 1731457 1780220 48764 114 1780404 1782523 2120 115 1860014 1879759 19746 116 1879787 1880377 591 117 1887352 1888743 1392 118 1888774 1895165 6392 119 1896424 1899125 2702 120 1899589 1900514 926 121 1901117 1901377 261 122 1901429 1903058 1630 123 1903511 1904749 1239 124 1905370 1906207 838 bioRxiv preprint doi: https://doi.org/10.1101/2020.06.11.147629; this version posted June 12, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

125 1906272 1906706 435 126 1907013 1907201 189 127 1910167 1924962 14796 128 1924993 1942455 17463 129 1942714 1957360 14647 130 1958027 1960087 2061 131 1998340 2003182 4843 132 2003276 2015681 12406 133 2017886 2021144 3259 134 2021223 2022467 1245 135 2073658 2075779 2122 136 2079096 2085980 6885 137 2086070 2128421 42352 138 2128464 2139457 10994 139 2140898 2151256 10359 140 2287097 2297109 10013 141 2297984 2330022 32039 142 2330075 2393559 63485 143 2393602 2433522 39921 144 2435012 2435224 213 145 2435833 2476859 41027 146 2476969 2532197 55229 147 2532353 2564883 32531 148 2564923 2581617 16695 149 2584035 2587573 3539 150 2588701 2653340 64640 151 2701362 2701754 393 152 2716035 2718802 2768 153 2718959 2724827 5869 154 2727160 2730226 3067 155 2732747 2732881 135 156 2732980 2739105 6126 157 2739486 2740193 708 158 2741357 2743889 2533 159 2747984 2751651 3668 160 2752167 2773222 21056 161 2773783 2784652 10870 162 2784688 2805313 20626 163 2805348 2832367 27020 164 2832424 2840925 8502 165 2841307 2897432 56126 166 2898292 2898929 638 167 2898931 2899752 822 bioRxiv preprint doi: https://doi.org/10.1101/2020.06.11.147629; this version posted June 12, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

168 2899816 2982269 82454 169 2982603 2998620 16018 170 3009915 3106995 97081 171 3107232 3109360 2129 172 3109397 3115974 6578 173 3117976 3183195 65220 174 3183201 3188048 4848 175 3188173 3208154 19982 176 3210445 3213829 3385 177 3214372 3253448 39077 178 3257092 3266655 9564 179 3269914 3305261 35348 180 3305599 3308231 2633 181 3308368 3318002 9635 182 3318029 3334990 16962 183 3335414 3344979 9566 184 3345013 3354487 9475 185 3355593 3382499 26907 186 3382633 3419421 36789 187 3419656 3436651 16996 188 3436849 3440961 4113 189 3441121 3450788 9668 190 3450709 3457941 7233 191 3459848 3464067 4220 192 3467564 3474070 6507 193 3476555 3488042 11488 194 3487974 3495701 7728 195 3495876 3499257 3382 196 3499541 3535473 35933 197 3536012 3542691 6680 198 3545889 3568135 22247 199 3568527 3580038 11512 200 3582936 3591268 8333 201 3591288 3601909 10622 202 3606762 3609221 2460 203 3610064 3630835 20772 204 3631003 3634754 3752 205 3636046 3647406 11361 206 3648654 3651068 2415 207 3684826 3688547 3722 208 3691372 3693906 2535 209 3695363 3696223 861 210 3696257 3714679 18423 bioRxiv preprint doi: https://doi.org/10.1101/2020.06.11.147629; this version posted June 12, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

211 3716009 3722541 6533 212 3724420 3725136 717 213 3727427 3727687 261 214 3728511 3741592 13082 215 3742351 3750677 8327 216 3750768 3761700 10933 217 3761987 3810611 48625 218 3810693 3819178 8486 219 3833650 3845998 12349 220 3847853 3851893 4041 221 3852186 3856447 4262 222 3856639 3874180 17542 223 3874332 3899182 24851 224 3900963 3912226 11264 225 3914009 3926399 12391 226 3926682 3937515 10834 227 3938310 3949922 11613 228 3949952 3950584 633 229 3950726 3973278 22553 230 3973364 3982915 9552 231 3983187 3984026 840 232 3984132 3989084 4953 233 3989948 3990967 1020 234 3992671 4018656 25986 235 4030710 4031645 936 236 4031797 4033755 1959 237 4038513 4039159 647 238 4039466 4040875 1410 239 4049009 4062143 13135 240 4062543 4084383 21841 241 4084799 4086540 1742 242 4088002 4092666 4665 243 4106245 4121056 14812 244 4121166 4123903 2738 245 4128119 4130044 1926 246 4134436 4139537 5102 247 4140260 4141474 1215 248 4141711 4145506 3796 249 4145747 4147132 1386 250 4147419 4165622 18204 251 4166815 4171359 4545 252 4176900 4178145 1246 253 4183445 4185681 2237 bioRxiv preprint doi: https://doi.org/10.1101/2020.06.11.147629; this version posted June 12, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

254 4186092 4187174 1083 255 4191198 4195744 4547 256 4196350 4196730 381 257 4197780 4205517 7738 258 4205556 4215389 9834

bioRxiv preprint doi: https://doi.org/10.1101/2020.06.11.147629; this version posted June 12, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

Table 3. The 414 E. coli genes deemed essential by Goodall et al., Baba et al., or Yamazaki et al. These genes are compared to the

PGG based annotation of core regions for the K-12 BW25113 strain used by Goodall (GenBank sequence CP009273.1, Assembly

ASM75055v1/GCA_000750555.1, BioSample SAMN03013572). Columns 1-5 are the start, stop, strand, cluster, and cluster size for

the PGG annotation. Columns 6-11 are the gene type, start, stop, strand, locus tag, and gene symbol/name for the GenBank annotation.

Column 12 is a list of gene synonyms for the gene from GenBank. Columns 13-21 are from Goodall et al.: 13-15 from Table S1

(normal essentiality), 16-18 from Table S4 (essentiality after outgrowth), 19-20 from Table S3 (outlier discrepancies), and 21 from

Table S2 (comparison of data sets). Column 22 is the PGG core or non-core region the gene is contained in.

PGG Log PGG PGG Gene Insertion Log Likelihood Insertion Venn Cause of TraDIS/Keio/PE PGG Start PGG Stop Cluster Type Start End Strand Locus Tag Synonyms Class Likelihood Class4 Region Strand Cluster GenBank Index Score Ratio Index Score2 group discrepancy C Size Ratio3

ECK0408, 0.00815217 428911 430014 + 9603 957 gene 428911 430014 + BW25113_0414 ribD JW0404, -17.08601447 Essential 0.004528986 -9.88843503 Essential All 3 core 39 4 ribG, ybaE

ECK1197, - hemM, 0.00480769 1258333 1258956 - 4835 963 gene 1258333 1258956 - BW25113_1209 lolB -20.77735583 Essential 0.001602564 13.5832067 Essential All 3 core 136 JW1200, 2 3 ychC

- ECK2407, 0.01114488 2523606 2524592 - 14903 963 gene 2523606 2524592 - BW25113_2412 zipA -14.84018637 Essential 0.006079027 8.68145130 Essential All 3 core 279 JW2404 3 4

- ECK0411, 0.00511247 431090 432067 + 9600 965 gene 431090 432067 + BW25113_0417 thiL -20.35269783 Essential 0.00204499 12.7704580 Essential All 3 core 39 JW0407 4 8

- ECK3798, 0.02564102 3982448 3983188 - 7451 965 gene 3982448 3983188 - BW25113_3804 hemD -8.469391606 Essential 0.012145749 5.30527025 Essential All 3 core 421 JW3776 6 1

dnaZ, - 0.00517598 487548 489479 + 9544 966 gene 487548 489479 + BW25113_0470 dnaX ECK0464, -20.26726714 Essential 0.001552795 13.6867207 Essential All 3 core 44 3 JW0459 3

asuA, - ECK1192, 0.01709401 1253385 1253969 - 4840 968 gene 1253385 1253969 - BW25113_1204 pth -11.65807531 Essential 0.005128205 9.39115688 Essential All 3 core 136 JW1195, 7 2 rap

ECK1199, - JW1202, 0.00554016 1260468 1261550 + 4833 968 gene 1260468 1261550 + BW25113_1211 prfA -19.7959005 Essential 0.001846722 13.1133758 Essential All 3 core 136 sueB, uar, 6 5 ups

- ECK3448, 0.00200803 3596110 3597603 - 161 968 gene 3596110 3597603 - BW25113_3464 ftsY -26.71851386 Essential 0.004016064 10.3546376 Essential All 3 core 374 JW3429 2 8

ECK0628, 0.00525762 661772 663673 - 9389 969 gene 661772 663673 - BW25113_0635 mrdA JW0630, -20.15890546 Essential 0.003154574 -11.2536292 Essential All 3 core 66 4 pbpA bioRxiv preprint doi: https://doi.org/10.1101/2020.06.11.147629; this version posted June 12, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

dnaR, - ECK1195, 0.00316455 1256384 1257331 - 4837 969 gene 1256384 1257331 - BW25113_1207 prs -23.64112411 Essential 0 173.551740 Essential All 3 core 136 JW1198, 7 1 prsA

ECK1196, - ipk, 0.00352112 1257482 1258333 - 4836 969 gene 1257482 1258333 - BW25113_1208 ispE -22.91372984 Essential 0 173.551740 Essential All 3 core 136 JW1199, 7 1 ychB

ECK1198, - 0.00556881 1259170 1260426 + 4834 969 gene 1259170 1260426 + BW25113_1210 hemA gtrA, -19.760084 Essential 0.000795545 15.8149304 Essential All 3 core 136 5 JW1201 9

ECK1200, - 0.00239808 1261550 1262383 + 4832 969 gene 1261550 1262383 + BW25113_1212 prmC hemK, -25.52115633 Essential 0 173.551740 Essential All 3 core 136 2 JW1203 1

- ECK1203, 0.00233918 1263621 1264475 + 4829 969 gene 1263621 1264475 + BW25113_1215 kdsA -25.68914942 Essential 0.002339181 12.3111970 Essential All 3 core 136 JW1206 1 1

ECK3964, - 0.00971817 4161985 4163013 + 7623 969 gene 4161985 4163013 + BW25113_3972 murB JW3940, -15.83087441 Essential 0.002915452 11.5377719 Essential All 3 core 454 3 yijB 4

bioR, dhbB, 0.03105590 4163010 4163975 + 7624 969 gene 4163010 4163975 + BW25113_3973 birA ECK3965, -6.881075138 Essential 0.014492754 -4.26610639 Essential All 3 core 454 1 JW3941

ECK3972, 0.01041666 4167286 4167669 + 7633 969 gene 4167286 4167669 + BW25113_3981 secE JW3944, -15.33033826 Essential 0.002604167 -11.9378332 Essential All 3 core 454 7 mbrC, prlG

- ECK3973, 0.01465201 4167671 4168216 + 7634 969 gene 4167671 4168216 + BW25113_3982 nusG -12.82250177 Essential 0.010989011 5.85555921 Essential All 3 core 454 JW3945 5 1

ECK3976, 0.00602409 4169924 4170421 + 7638 969 gene 4169924 4170421 + BW25113_3985 rplJ -19.21326655 Essential 0.002008032 -12.8321173 Essential All 3 core 454 JW3948 6

- ECK3977, 4170488 4170853 + 7639 969 gene 4170488 4170853 + BW25113_3986 rplL 0 -172.1619602 Essential 0 173.551740 Essential All 3 core 454 JW3949 1

ECK3978, ftsR, groN, JW3950, - mbrD, nitB, 0.00173740 4171173 4175201 + 7640 969 gene 4171173 4175201 + BW25113_3987 rpoB -27.69214838 Essential 0.001489203 13.8233951 Essential All 3 core 454 rif, ron, 4 4 sdgB, stl, stv, tabD, tabG

ECK3979, - 0.01017992 4175278 4179501 + 7641 969 gene 4175278 4179501 + BW25113_3988 rpoC JW3951, -15.49644362 Essential 0.00686553 8.15015416 Essential All 3 core 454 4 tabB 4

ECK0366, - 0.01230769 384209 385183 - 3382 970 gene 384209 385183 - BW25113_0369 hemB JW0361, -14.11477855 Essential 0.009230769 6.75713700 Essential All 3 core 35 2 ncf 1

ECK0517, 0.00691562 548674 549396 - 9489 970 gene 548674 549396 - BW25113_0524 lpxH JW0513, -18.24746005 Essential 0.004149378 -10.2293169 Essential All 3 core 53 9 ybbF

- ECK0519, 550067 551452 + 9487 970 gene 550067 551452 + BW25113_0526 cysS 0 -172.1619602 Essential 0 173.551740 Essential All 3 core 53 JW0515 1

ads, - 0.01730103 552331 553197 - 9484 970 gene 552331 553197 - BW25113_0529 folD ECK0522, -11.56617533 Essential 0.005767013 8.90518528 Essential All 3 core 53 8 JW0518 8

ECK0627, - 0.00449236 660657 661769 - 9390 970 gene 660657 661769 - BW25113_0634 mrdB JW0629, -21.24478909 Essential 0.000898473 15.4354928 Essential All 3 core 66 3 rodA 5

ECK0632, - fusB, 0.01401869 665387 666028 - 9385 970 gene 665387 666028 - BW25113_0639 nadD -13.15232618 Essential 0.007788162 7.57565243 Essential All 3 core 66 JW0634, 2 2 ybeN bioRxiv preprint doi: https://doi.org/10.1101/2020.06.11.147629; this version posted June 12, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

- ECK0633, 0.01453488 666030 667061 - 9384 970 gene 666030 667061 - BW25113_0640 holA -12.88253731 Essential 0.010658915 6.01827523 Essential All 3 core 66 JW0635 4 2

ECK0634, 0.01030927 667061 667642 - 9383 970 gene 667061 667642 - BW25113_0641 lptE JW0636, -15.40525238 Essential 0.005154639 -9.37021669 Essential All 3 core 66 8 rlpB

- ECK0635, 0.00193573 667657 670239 - 9382 970 gene 667657 670239 - BW25113_0642 leuS -26.96535804 Essential 0.000387147 18.0163890 Essential All 3 core 66 JW0637 4 9

cutE, 0.01299545 684799 686337 - 9367 970 gene 684799 686337 - BW25113_0657 lnt ECK0649, -13.71433021 Essential 0.009096816 -6.8297358 Essential All 3 core 68 2 JW0654

ECK0668, 0.00480480 701549 703213 + 9347 970 gene 701549 703213 + BW25113_0680 glnS -20.78150036 Essential 0.002402402 -12.2189965 Essential All 3 core 71 JW0666 5

- ECK0672, 0.00564971 706391 706921 - 9342 970 gene 706391 706921 - BW25113_0684 fldA -19.65987986 Essential 0 173.551740 Essential All 3 core 72 JW0671 8 1

ECK1054, JW1056, 1123295 1124830 + 4959 970 gene 1123295 1124830 + BW25113_1069 murJ 0.00390625 -22.2042982 Essential 0.001302083 -14.2584822 Essential All 3 core 126 mviN, mviS, yceN

ECK1078, - 0.00752688 1145184 1146113 + 4936 970 gene 1145184 1146113 + BW25113_1092 fabD JW1078, -17.65096927 Essential 0.002150538 12.5995238 Essential All 3 core 127 2 tfpA 1

- ECK1079, 1146126 1146860 + 4935 970 gene 1146126 1146860 + BW25113_1093 fabG 0 -172.1619602 Essential 0 173.551740 Essential All 3 core 127 JW1079 1

- ECK1080, 0.00421940 1147071 1147307 + 4934 970 gene 1147071 1147307 + BW25113_1094 acpP -21.67561417 Essential 0 173.551740 Essential All 3 core 127 JW1080 9 1

ECK1084, - 0.00467289 1150580 1151221 + 4930 970 gene 1150580 1151221 + BW25113_1098 tmk JW1084, -20.97345916 Essential 0.007788162 7.57565243 Essential All 3 core 127 7 ycfG 2

- ECK1085, 0.01194029 1151218 1152222 + 4929 970 gene 1151218 1152222 + BW25113_1099 holB -14.3370201 Essential 0.005970149 8.75860454 Essential All 3 core 127 JW1085 9 6

ECK1102, - 0.00083333 1170883 1172082 + 4911 970 gene 1170883 1172082 + BW25113_1116 lolC JW5161, -32.60579817 Essential 0 173.551740 Essential All 3 core 128 3 ycfU 1

ECK1103, - 0.00284900 1172075 1172776 + 4910 970 gene 1172075 1172776 + BW25113_1117 lolD JW5162, -24.3546829 Essential 0.001424501 13.9679477 Essential All 3 core 128 3 ycfV 2

ECK1104, - 0.00321285 1172776 1174020 + 4909 970 gene 1172776 1174020 + BW25113_1118 lolE JW1104, -23.53807484 Essential 0.002409639 12.2085736 Essential All 3 core 128 1 ycfW 8

asuA, - ECK1268, 1325305 1327902 + 4764 970 gene 1325305 1327902 + BW25113_1274 topA 0.01308699 -13.6624738 Essential 0.003464203 10.9104474 Essential All 3 core 143 JW1266, 4 supX

- ECK1272, 0.00169204 1332827 1333417 - 4758 970 gene 1332827 1333417 - BW25113_1277 ribA -27.86982087 Essential 0.001692047 13.4040922 Essential All 3 core 145 JW1269 7 6

ECK1283, - envM, gts, 0.00126742 1344508 1345296 - 4745 970 gene 1344508 1345296 - BW25113_1288 fabI -29.80636683 Essential 0 173.551740 Essential All 3 core 145 JW1281, 7 1 qmeA

ECK1805, - 0.00574712 1884829 1885524 - 11026 970 gene 1884829 1885524 - BW25113_1807 tsaB JW1796, -19.54102957 Essential 0.001436782 13.9400616 Essential All 3 core 206 6 rpf, yeaZ 8

ECK3385, igaA, - 0.03043071 3519828 3521963 + 231 970 gene 3519828 3521963 + BW25113_3398 yrfF JW3361, -7.052653118 Essential 0.012640449 5.07836899 Essential All 3 core 364 2 mucM, 5 umoB bioRxiv preprint doi: https://doi.org/10.1101/2020.06.11.147629; this version posted June 12, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

ECK3445, - fam, hin, 0.01052631 3593289 3594143 - 164 970 gene 3593289 3594143 - BW25113_3461 rpoH -15.2545729 Essential 0.003508772 10.8630819 Essential All 3 core 374 htpR, 6 1 JW3426

coaBC, - 0.00819000 3806091 3807311 + 7277 970 gene 3806091 3807311 + BW25113_3639 dfp ECK3629, -17.05314352 Essential 0.004095004 10.2800541 Essential All 3 core 398 8 JW5642 2

ECK4254, 0.00908265 4476035 4477135 + 6561 970 gene 4476035 4477135 + BW25113_4261 lptF JW4218, -16.31591652 Essential 0.006357856 -8.4880772 Essential All 3 core 506 2 yjgP

ECK4255, - 0.01108033 4477135 4478217 + 6560 970 gene 4477135 4478217 + BW25113_4262 lptG JW5760, -14.88243706 Essential 0.006463527 8.41629275 Essential All 3 core 506 2 yjgQ 5

ECK0026, - 0.00530785 21407 22348 + 14167 971 gene 21407 22348 + BW25113_0025 ribF JW0023, -20.0930286 Essential 0.003184713 11.2190667 Essential All 3 core 6 6 yaaC 8

- ECK0028, 0.01212121 25207 25701 + 14165 971 gene 25207 25701 + BW25113_0027 lspA -14.22682379 Essential 0.006060606 8.69443780 Essential All 3 core 6 JW0025 2 4

ECK0030, - 0.00841219 26277 27227 + 14163 971 gene 26277 27227 + BW25113_0029 ispH JW0027, -16.86290727 Essential 0.001051525 14.9404977 Essential All 3 core 6 8 lytB, yaaE 4

ECK0032, 0.00121654 28374 29195 + 14161 971 gene 28374 29195 + BW25113_0031 dapB -30.08042176 Essential 0.001216545 -14.4766248 Essential All 3 core 7 JW0029 5

ECK0049, - 0.00416666 49823 50302 + 14143 971 gene 49823 50302 + BW25113_0048 folA JW0047, -21.76194665 Essential 0 173.551740 Essential All 3 core 8 7 tmrA 1

ECK0055, - imp, 0.00721868 54755 57109 - 14137 971 gene 54755 57109 - BW25113_0054 lptD -17.94579327 Essential 0.003397028 10.9827581 Essential All 3 core 9 JW0053, 4 1 ostA, yabG

ECK0084, 0.04098360 0.44230867 87519 87884 + 14107 971 gene 87519 87884 + BW25113_0083 ftsL JW0081, -4.457809192 Essential 0.027322404 Unclear All 3 core 14 7 7 mraR, yabD

ECK0085, - 0.00792303 87900 89666 + 14106 971 gene 87900 89666 + BW25113_0084 ftsI JW0082, -17.28819005 Essential 0.004527448 9.88977210 Essential All 3 core 14 3 pbpB, sep 2

- ECK0086, 0.00403225 89653 91140 + 14105 971 gene 89653 91140 + BW25113_0085 murE -21.98681522 Essential 0.000672043 16.3367849 Essential All 3 core 14 JW0083 8 5

ECK0087, - 0.00662251 91137 92495 + 14104 971 gene 91137 92495 + BW25113_0086 murF JW0084, -18.55131361 Essential 0.004415011 9.98845141 Essential All 3 core 14 7 mra 6

ECK0088, - 0.00554016 92489 93571 + 14103 971 gene 92489 93571 + BW25113_0087 mraY JW0085, -19.7959005 Essential 0.002770083 11.7199630 Essential All 3 core 14 6 murX 4

- ECK0089, 0.00607441 93574 94890 + 14102 971 gene 93574 94890 + BW25113_0088 murD -19.15526311 Essential 0.004555809 9.86517039 Essential All 3 core 14 JW0086 2 5

ECK0090, 0.01847389 94890 96134 + 14101 971 gene 94890 96134 + BW25113_0089 ftsW -11.06277524 Essential 0.008835341 -6.97329652 Essential All 3 core 14 JW0087 6

- ECK0091, 0.00374531 96131 97198 + 14100 971 gene 96131 97198 + BW25113_0090 murG -22.49213431 Essential 0.002808989 11.6704448 Essential All 3 core 14 JW0088 8 2

- ECK0092, 0.00542005 97252 98727 + 14099 971 gene 97252 98727 + BW25113_0091 murC -19.94800949 Essential 0.003387534 10.9930693 Essential All 3 core 14 JW0089 4 4

- ECK0094, 0.01805054 99642 100472 + 14097 971 gene 99642 100472 + BW25113_0093 ftsQ -11.24122024 Essential 0.014440433 4.28832916 Essential All 3 core 14 JW0091 2 9 bioRxiv preprint doi: https://doi.org/10.1101/2020.06.11.147629; this version posted June 12, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

divA, - 0.00633412 100469 101731 + 14096 971 gene 100469 101731 + BW25113_0094 ftsA ECK0095, -18.86292727 Essential 0.002375297 12.2582710 Essential All 3 core 14 5 JW0092 9

ECK0096, 0.00173611 101792 102943 + 14095 971 gene 101792 102943 + BW25113_0095 ftsZ JW0093, -27.69714975 Essential 0.002604167 -11.9378332 Essential All 3 core 14 1 sfiB, sulB

asmB, - ECK0097, 0.01307189 103044 103961 + 14094 971 gene 103044 103961 + BW25113_0096 lpxC -13.67100273 Essential 0.002178649 12.5552184 Essential All 3 core 14 envA, 5 8 JW0094

azi, - ECK0099, 0.01515151 104766 107471 + 14092 971 gene 104766 107471 + BW25113_0098 secA -12.57114779 Essential 0.00849963 7.16140253 Essential All 3 core 14 JW0096, 5 2 pea, prlD

ECK0125, - 0.01055806 138495 139157 - 14063 971 gene 138495 139157 - BW25113_0126 can JW0122, -15.23276688 Essential 0.004524887 9.89199927 Essential All 3 core 18 9 yadF 7

ECK0153, - gsa, 0.01092896 170089 171369 - 14032 971 gene 170089 171369 - BW25113_0154 hemL -14.9823987 Essential 0.004683841 9.75550270 Essential All 3 core 23 JW0150, 2 8 popC

ECK0155, - 0.01449275 173097 173441 + 14030 971 gene 173097 173441 + BW25113_0156 erpA JW0152, -12.90423574 Essential 0.011594203 5.56399998 Essential All 3 core 23 4 yadR 9

ECK0164, - 0.00848484 181610 182434 - 14020 971 gene 181610 182434 - BW25113_0166 dapD JW0161, -16.80171855 Essential 0.006060606 8.69443780 Essential All 3 core 24 8 ssa 4

ECK0166, - 0.02138364 185199 185993 - 14018 971 gene 185199 185993 - BW25113_0168 map JW0163, -9.922770224 Essential 0.00754717 7.72130788 Essential All 3 core 24 8 pepM 3

- ECK0168, 0.00275482 186361 187086 + 14016 971 gene 186361 187086 + BW25113_0169 rpsB -24.58263598 Essential 0.00137741 14.0769283 Essential All 3 core 24 JW0164 1 8

- ECK0169, 0.00352112 187344 188195 + 14014 971 gene 187344 188195 + BW25113_0170 tsf -22.91372984 Essential 0.001173709 14.5911938 Essential All 3 core 25 JW0165 7 1

ECK0170, - 0.00550964 188342 189067 + 14013 971 gene 188342 189067 + BW25113_0171 pyrH JW0166, -19.83425577 Essential 0 173.551740 Essential All 3 core 25 2 smbA, umk 1

- ECK0171, 0.01254480 189359 189916 + 14012 971 gene 189359 189916 + BW25113_0172 frr -13.97449934 Essential 0.001792115 13.2134730 Essential All 3 core 25 JW0167, rrf 3 4

ECK0172, - ispC, 0.00668337 190008 191204 + 14011 971 gene 190008 191204 + BW25113_0173 dxr -18.48719616 Essential 0.003341688 11.0431880 Essential All 3 core 25 JW0168, 5 3 yaeM

ECK0173, - JW0169, 191393 192151 + 14010 971 gene 191390 192151 + BW25113_0174 ispU 0.00656168 -18.61596728 Essential 0.005249344 9.29582153 Essential All 3 core 25 rth, uppS, 3 yaeS

- ECK0174, 0.00582750 192164 193021 + 14009 971 gene 192164 193021 + BW25113_0175 cdsA -19.44438964 Essential 0.004662005 9.77404839 Essential All 3 core 25 JW5810 6 3

ecfK, ECK0176, - JW0172, 0.00328812 194415 196847 + 14007 971 gene 194415 196847 + BW25113_0177 bamA -23.38042638 Essential 0 173.551740 Essential All 3 core 25 omp85, 2 1 yaeT, yzzN, yzzY ECK0178, - firA, 0.00292397 197458 198483 + 14005 971 gene 197458 198483 + BW25113_0179 lpxD -24.17841327 Essential 0.001949318 12.9321301 Essential All 3 core 25 JW0174, 7 2 omsA, ssc

ECK0179, - JW0175, 0.00657894 198588 199043 + 14004 971 gene 198588 199043 + BW25113_0180 fabZ -18.59755968 Essential 0.002192982 12.5328170 Essential All 3 core 25 sabA, sefA, 7 6 sfhC, yaeA bioRxiv preprint doi: https://doi.org/10.1101/2020.06.11.147629; this version posted June 12, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

- ECK0180, 0.00253485 199047 199835 + 14003 971 gene 199047 199835 + BW25113_0181 lpxA -25.14614224 Essential 0.001267427 14.3452400 Essential All 3 core 25 JW0176 4 8

ECK0181, - 0.00696257 199835 200983 + 14002 971 gene 199835 200983 + BW25113_0182 lpxB JW0177, -18.19992375 Essential 0.003481288 10.8922337 Essential All 3 core 25 6 pgsB 6

ECK0183, - 0.00229687 201613 205095 + 14000 971 gene 201613 205095 + BW25113_0184 dnaE JW0179, -25.81239972 Essential 0.000861326 15.5674884 Essential All 3 core 25 1 polC, sdgC 1

- ECK0184, 205108 206067 + 13999 971 gene 205108 206067 + BW25113_0185 accA 0.00625 -18.95635603 Essential 0 173.551740 Essential All 3 core 25 JW0180 1

ECK0187, - 0.01847575 208818 210116 + 13996 971 gene 208818 210116 + BW25113_0188 tilS JW0183, -11.06200112 Essential 0.005388761 9.18802938 Essential All 3 core 25 1 mesJ, yaeN 3

drpA, - 0.00639906 213544 215262 - 13989 971 gene 213544 215262 - BW25113_0194 proS ECK0194, -18.79160058 Essential 0.002326934 12.3293018 Essential All 3 core 26 9 JW0190 3

ECK0409, - 430103 430573 + 9602 971 gene 430103 430573 + BW25113_0415 ribE JW0405, 0 -172.1619602 Essential 0 173.551740 Essential All 3 core 39

ribH, ybaF 1

ECK0414, - 0.00214707 433771 435633 - 9597 971 gene 433771 435633 - BW25113_0420 dxs JW0410, -26.26739609 Essential 0.001610306 13.5673628 Essential All 3 core 41 5 yajP 2

- ECK0415, 0.00666666 435658 436557 - 9596 971 gene 435658 436557 - BW25113_0421 ispA -18.50474397 Essential 0.004444444 9.96244001 Essential All 3 core 41 JW0411 7 1

dnaW, ECK0468, 0.00310077 492631 493275 + 9539 971 gene 492631 493275 + BW25113_0474 adk -23.77959053 Essential 0.003100775 -11.3159925 Essential All 3 core 44 JW0463, 5 plsA

ECK0469, - 0.00726895 493511 494473 + 9538 971 gene 493511 494473 + BW25113_0475 hemH JW0464, -17.89691102 Essential 0.003115265 11.2991102 Essential All 3 core 44 1 popA, visA 8

bypA1, - 921681 921899 - 5153 971 gene 921681 921899 - BW25113_0884 infA ECK0875, 0.00456621 -21.1325582 Essential 0.00456621 9.85617660 Essential All 3 core 100

JW0867 2

ECK0882, - 0.00326797 932828 933439 + 5145 971 gene 932828 933439 + BW25113_0891 lolA JW0874, -23.42227814 Essential 0.003267974 11.1249350 Essential All 3 core 100 4 lplA, yzzV 5

- ECK0884, 0.00464037 934884 936176 + 5143 971 gene 934884 936176 + BW25113_0893 serS -21.02159143 Essential 0.000773395 15.9026058 Essential All 3 core 100 JW0876 1 4

ECK0902, - 0.01553166 957451 959124 + 5125 971 gene 957451 959124 + BW25113_0911 rpsA JW0894, -12.38469926 Essential 0.002389486 12.2376644 Essential All 3 core 105 1 ssyF 7

- ECK0905, 0.00114351 962077 963825 + 5122 971 gene 962077 963825 + BW25113_0914 msbA -30.49428949 Essential 0.000571755 16.8326940 Essential All 3 core 105 JW0897 1 7

ECK0906, - 0.00506585 963822 964808 + 5121 971 gene 963822 964808 + BW25113_0915 lpxK JW0898, -20.41605543 Essential 0.003039514 11.3880818 Essential All 3 core 105 6 ycaH 3

- ECK0909, 966308 967054 + 5118 971 gene 966308 967054 + BW25113_0918 kdsB 0 -172.1619602 Essential 0.001338688 14.1690982 Essential All 3 core 105 JW0901 8

ECK0913, - 0.00377928 969775 971097 + 5114 971 gene 969775 971097 + BW25113_0922 mukF JW0905, -22.43039243 Essential 0.001511716 13.7744210 Essential All 3 core 105 9 kicB 2

ECK0914, - 0.00851063 971078 971782 + 5113 971 gene 971078 971782 + BW25113_0923 mukE JW0906, -16.78011426 Essential 0.004255319 10.1318955 Essential All 3 core 105 8 kicA, ycbA 4 bioRxiv preprint doi: https://doi.org/10.1101/2020.06.11.147629; this version posted June 12, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

- ECK0915, 971782 976242 + 5112 971 gene 971782 976242 + BW25113_0924 mukB 0.00403497 -21.98220695 Essential 0.00044833 17.5727951 Essential All 3 core 105 JW0907 7

ECK0921, - 0.00142755 983041 984441 - 5105 971 gene 983041 984441 - BW25113_0930 asnS JW0913, -29.0098859 Essential 0 173.551740 Essential All 3 core 106 2 lcs, tss 1

- ECK0945, 0.00963391 1011408 1011926 - 5080 971 gene 1011408 1011926 - BW25113_0954 fabA -15.89347357 Essential 0.003853565 10.5117850 Essential All 3 core 107 JW0937 1 3

- ECK1633, 1710205 1711479 - 11205 971 gene 1710205 1711479 - BW25113_1637 tyrS 0 -172.1619602 Essential 0 173.551740 Essential All 3 core 183 JW1629 1

ECK1658, - 0.01869158 1736858 1737499 - 11178 971 gene 1736858 1737499 - BW25113_1662 ribC JW1654, -10.97237971 Essential 0.00623053 8.57565038 Essential All 3 core 185 9 ribE 2

- ECK1711, 0.00167504 1789814 1792201 - 11124 971 gene 1789814 1792201 - BW25113_1713 pheT -27.93764461 Essential 0.000837521 15.6549137 Essential All 3 core 193 JW1703 2 2

ECK1712, - 1792216 1793199 - 11123 971 gene 1792216 1793199 - BW25113_1714 pheS JW5277, 0.00101626 -31.28221971 Essential 0 173.551740 Essential All 3 core 193

phe-act 1

ECK1714, - 1793650 1794006 - 11122 971 gene 1793650 1794006 - BW25113_1716 rplT JW1706, 0 -172.1619602 Essential 0 173.551740 Essential All 3 core 194

pdzA 1

ECK1716, - fit, 1794353 1794895 - 11120 971 gene 1794353 1794895 - BW25113_1718 infC 0 -172.1619602 Essential 0 173.551740 Essential All 3 core 194 JW5829, 1 srjA

- ECK1717, 0.00207361 1794899 1796827 - 11119 971 gene 1794899 1796827 - BW25113_1719 thrS -26.50204125 Essential 0.001036807 14.9850771 Essential All 3 core 194 JW1709 3 7

ECK1738, - efg, 0.00241545 1816715 1817542 + 11098 971 gene 1816715 1817542 + BW25113_1740 nadE -25.47236625 Essential 0 173.551740 Essential All 3 core 198 JW1729, 9 1 ntrL

- ECK1777, 0.00200803 1857028 1858023 + 11056 971 gene 1857028 1858023 + BW25113_1779 gapA -26.71851386 Essential 0 173.551740 Essential All 3 core 201 JW1768 2 1

ECK1867, 0.00564015 1943007 1944779 - 10961 971 gene 1943007 1944779 - BW25113_1866 aspS -19.67164917 Essential 0.002256063 -12.4356925 Essential All 3 core 211 JW1855, tls 8

ECK1877, - 0.00115340 1954319 1956052 + 10950 971 gene 1954319 1956052 + BW25113_1876 argS JW1865, -30.43672668 Essential 0 173.551740 Essential All 3 core 214 3 lov 1

- ECK1911, 1985750 1986298 - 10915 971 gene 1985750 1986298 - BW25113_1912 pgsA 0 -172.1619602 Essential 0 173.551740 Essential All 3 core 220 JW1897 1

- ECK2107, 0.01671583 2187779 2189812 + 13297 971 gene 2187779 2189812 + BW25113_2114 metG -11.82848048 Essential 0.006882989 8.13883231 Essential All 3 core 245 JW2101 1 9

- ECK2146, 0.00448430 2236463 2237131 - 13256 971 gene 2236463 2237131 - BW25113_2153 folE -21.25714257 Essential 0 173.551740 Essential All 3 core 252 JW2140 5 1

ECK2223, hisW, - 0.00570776 2330272 2332899 - 13173 971 gene 2330272 2332899 - BW25113_2231 gyrA JW2225, -19.58882502 Essential 0.003424658 10.9528795 Essential All 3 core 258 3 nalA, nfxA, 4 norA, parD

dnaF, - 0.00087489 2338344 2340629 + 13169 971 gene 2338344 2340629 + BW25113_2234 nrdA ECK2226, -32.28141399 Essential 0 173.551740 Essential All 3 core 259 1 JW2228 1

ECK2227, - 0.00088417 2340863 2341993 + 13168 971 gene 2340863 2341993 + BW25113_2235 nrdB ftsB, -32.21105382 Essential 0.000884173 15.4856889 Essential All 3 core 259 3 JW2229 8 bioRxiv preprint doi: https://doi.org/10.1101/2020.06.11.147629; this version posted June 12, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

dedC, - 0.01103230 2425153 2426421 - 13088 971 gene 2425153 2426421 - BW25113_2315 folC ECK2309, -14.91401514 Essential 0.001576044 13.6380021 Essential All 3 core 267 9 JW2312 2

dedB, - ECK2310, 0.00983606 2426491 2427405 - 13087 971 gene 2426491 2427405 - BW25113_2316 accD -15.74412319 Essential 0.005464481 9.13031414 Essential All 3 core 267 JW2313, 6 6 usg

ECK2317, - 2433864 2435084 - 13079 971 gene 2433864 2435084 - BW25113_2323 fabB fabC, 0 -172.1619602 Essential 0.000819001 15.7245607 Essential All 3 core 269

JW2320 1

ECK2394, 0.00776836 2512736 2514151 - 14913 971 gene 2512736 2514151 - BW25113_2400 gltX -17.4277685 Essential 0.004237288 -10.1483457 Essential All 3 core 277 JW2395 2

dnaL, ECK2406, - 0.00248015 2521520 2523535 - 14904 971 gene 2521520 2523535 - BW25113_2411 ligA JW2403, -25.29367742 Essential 0.001984127 12.8725259 Essential All 3 core 279 9 lig, lop, 3 pdeC

ECK2467, - 0.00797872 2584966 2586093 + 14852 971 gene 2584966 2586093 + BW25113_2472 dapE JW2456, -17.23855666 Essential 0.005319149 9.24159915 Essential All 3 core 283 3 msgB 5

- ECK2474, 0.00455062 2592241 2593119 - 14845 971 gene 2592241 2593119 - BW25113_2478 dapA -21.15609645 Essential 0.004550626 9.86965785 Essential All 3 core 284 JW2463 6 7

ECK2507, - engA, 0.00067888 2629243 2630715 - 14809 971 gene 2629243 2630715 - BW25113_2511 der -33.9708671 Essential 0 173.551740 Essential All 3 core 287 JW5403, 7 1 yfgK

- ECK2510, 0.00784313 2632660 2633934 - 14806 971 gene 2632660 2633934 - BW25113_2514 hisS -17.35996883 Essential 0.004705882 9.73684750 Essential All 3 core 287 JW2498 7 7

ECK2511, - 0.00446827 2634045 2635163 - 14805 971 gene 2634045 2635163 - BW25113_2515 ispG gcpE, -21.28178126 Essential 0.002680965 11.8355953 Essential All 3 core 287 5 JW2499 3

ECK2530, - 0.01616915 2656801 2657604 + 14787 971 gene 2656801 2657604 + BW25113_2533 suhB JW2517, -12.08083279 Essential 0.008706468 7.04498833 Essential All 3 core 289 4 ssyA 3

dpj, - 0.00262467 2693977 2694357 - 14754 971 gene 2693977 2694357 - BW25113_2563 acpS ECK2561, -24.91048841 Essential 0.002624672 11.9102924 Essential All 3 core 293 2 JW2547 4

ECK2564, - 0.00551876 2695840 2696745 - 14751 971 gene 2695840 2696745 - BW25113_2566 era JW2550, -19.82277225 Essential 0.003311258 11.0767576 Essential All 3 core 293 4 sdgE 6

ECK2566, - 0.02153846 2697694 2698668 - 14748 971 gene 2697694 2698668 - BW25113_2568 lepB JW2552, -9.865876328 Essential 0.01025641 6.22050697 Essential All 3 core 293 2 lep 3

ECK2583, - 0.00368731 2716086 2717441 + 14730 971 gene 2716086 2717441 + BW25113_2585 pssA JW2569, -22.59881556 Essential 0.002949853 11.4957514 Essential All 3 core 294 6 pss 5

ecfD, - ECK2593, 0.01355013 2729505 2730242 + 14719 971 gene 2729505 2730242 + BW25113_2595 bamD -13.40497064 Essential 0.006775068 8.20912280 Essential All 3 core 296 JW2577, 6 6 yfiO

- ECK2603, 0.02586206 2737542 2737889 - 14708 971 gene 2737542 2737889 - BW25113_2606 rplS -8.399510111 Essential 0.00862069 7.09306125 Essential All 3 core 298 JW2587 9 8

- ECK2604, 2737931 2738698 - 14707 971 gene 2737931 2738698 - BW25113_2607 trmD 0.0078125 -17.38767452 Essential 0.00390625 10.4602847 Essential All 3 core 298 JW2588 9

- ECK2606, 0.00803212 2739296 2739544 - 14705 971 gene 2739296 2739544 - BW25113_2609 rpsP -17.19126165 Essential 0.008032129 7.43105781 Essential All 3 core 298 JW2590 9 1

- ECK2607, 2739793 2741154 - 14704 971 gene 2739793 2741154 - BW25113_2610 ffh 0.01174743 -14.45617269 Essential 0.008076358 7.40513855 Essential All 3 core 298 JW5414 1 bioRxiv preprint doi: https://doi.org/10.1101/2020.06.11.147629; this version posted June 12, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

ECK2610, 0.03535353 2743474 2744067 - 14701 971 gene 2743474 2744067 - BW25113_2614 grpE -5.769005313 Essential 0.023569024 -0.82460818 Unclear All 3 core 299 JW2594 5

ECK2611, - 0.00113765 2744190 2745068 + 14699 971 gene 2744190 2745068 + BW25113_2615 nadK JW2596, -30.52859315 Essential 0.001137656 14.6906364 Essential All 3 core 299 6 yfjB, yfjE 3

ECK2691, - 0.03763440 2812320 2812505 - 12413 971 gene 2812320 2812505 - BW25113_2696 csrA JW2666, -5.220254659 Essential 0.005376344 9.19754865 Essential All 3 core 304 9 zfiA 6

ECK2741, - 0.00208333 2864660 2865139 - 12363 971 gene 2864660 2865139 - BW25113_2746 ispF JW2716, -26.47053041 Essential 0 173.551740 Essential All 3 core 309 3 ygbB 1

ECK2742, - 0.01265822 2865139 2865849 - 12362 971 gene 2865139 2865849 - BW25113_2747 ispD JW2717, -13.90823625 Essential 0.011251758 5.72794049 Essential All 3 core 309 8 ygbP 2

ECK2743, - 0.01923076 2865868 2866179 - 12361 971 gene 2865868 2866179 - BW25113_2748 ftsB JW2718, -10.7523091 Essential 0.025641026 0.11735853 Unclear All 3 core 309 9 ygbQ 4

- ECK2773, 0.00846805 2900002 2901300 - 12328 971 gene 2900002 2901300 - BW25113_2779 eno -16.81582141 Essential 0.006158584 8.62567196 Essential All 3 core 313 JW2750 2 8

ECK2774, 0.00244200 2901388 2903025 - 12327 971 gene 2901388 2903025 - BW25113_2780 pyrG -25.39849964 Essential 0.002442002 -12.1622853 Essential All 3 core 313 JW2751 2

ECK2824, - 0.01255707 2958521 2959396 - 12278 971 gene 2958521 2959396 - BW25113_2828 lgt JW2796, -13.96730234 Essential 0.007990868 7.45531785 Essential All 3 core 316 8 umpA 2

ald, - ECK2921, 0.00740740 3063524 3064603 - 12165 971 gene 3063524 3064603 - BW25113_2925 fbaA -17.7638918 Essential 0 173.551740 Essential All 3 core 322 fba, fda, 7 1 JW2892

- ECK2922, 0.00945017 3064818 3065981 - 12164 971 gene 3064818 3065981 - BW25113_2926 pgk -16.03174237 Essential 0.006013746 8.72759575 Essential All 3 core 322 JW2893 2 1

ECK2937, - 0.00865800 3080065 3081219 + 12148 971 gene 3080065 3081219 + BW25113_2942 metK JW2909, -16.65781583 Essential 0.005194805 9.33854570 Essential All 3 core 329 9 metX 4

- ECK2944, 0.02637889 3086859 3087275 + 12141 971 gene 3086859 3087275 + BW25113_2949 yqgF -8.237989554 Essential 0.009592326 6.56421526 Essential All 3 core 329 JW2916 7 5

ECK3009, - 0.00406504 3156103 3156840 - 622 971 gene 3156103 3156840 - BW25113_3018 plsC JW2986, -21.93130923 Essential 0.006775068 8.20912280 Essential All 3 core 333 1 parF 6

ECK3021, - 3166863 3168755 - 610 971 gene 3166863 3168755 - BW25113_3030 parE JW2998, 0.00792393 -17.28738803 Essential 0.003169572 11.2363968 Essential All 3 core 336

nfxD 6

- ECK3046, 0.02017756 3195250 3196488 + 584 971 gene 3195250 3196488 + BW25113_3056 cca -10.37828348 Essential 0.012106538 5.32345784 Essential All 3 core 340 JW3028 3 7

ECK3054, - gcp, 0.00690335 3202889 3203902 - 575 971 gene 3202889 3203902 - BW25113_3064 tsaD -18.25994026 Essential 0.003944773 10.4229663 Essential All 3 core 340 JW3036, 3 3 ygjD

ECK3157, gicD, 0.01159745 3306701 3309373 - 464 971 gene 3306701 3309373 - BW25113_3168 infB -14.55005461 Essential 0.007856341 -7.53496204 Essential All 3 core 352 JW3137, 6 ssyG

- ECK3158, 0.01680107 3309398 3310885 - 463 971 gene 3309398 3310885 - BW25113_3169 nusA -11.78978054 Essential 0.014112903 4.42832956 Essential All 3 core 352 JW3138 5 4

ECK3167, hflB, - 0.04237726 3318360 3320294 - 454 971 gene 3318360 3320294 - BW25113_3178 ftsH JW3145, -4.154008379 Essential 0.009302326 6.71860915 Essential All 3 core 353 1 mrsC, std, 6 tolZ bioRxiv preprint doi: https://doi.org/10.1101/2020.06.11.147629; this version posted June 12, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

cgtA, - ECK3172, 0.02898550 3323941 3325113 - 449 971 gene 3323941 3325113 - BW25113_3183 obgE -7.460166058 Essential 0.011935209 5.40329022 Essential All 3 core 353 JW3150, 7 5 obg, yhbZ

ECK3174, - 0.00387596 3326221 3326478 - 447 971 gene 3326221 3326478 - BW25113_3185 rpmA JW3152, -22.25757105 Essential 0.003875969 10.4898187 Essential All 3 core 353 9 rpz 6

- ECK3175, 0.00641025 3326499 3326810 - 446 971 gene 3326499 3326810 - BW25113_3186 rplU -18.7793833 Essential 0.006410256 8.45238128 Essential All 3 core 353 JW3153 6 5

cel, - ECK3176, 0.01131687 3327069 3328040 + 445 971 gene 3327069 3328040 + BW25113_3187 ispB -14.72869052 Essential 0.005144033 9.37860913 Essential All 3 core 353 JW3154, 2 1 yhbD

ECK3178, - JW3156, 0.00158730 3328594 3329853 - 443 971 gene 3328594 3329853 - BW25113_3189 murA -28.29874364 Essential 0.000793651 15.8223367 Essential All 3 core 353 mrbA, 2 7 murZ

ECK3189, 0.02150537 3336739 3337296 + 432 971 gene 3336739 3337296 + BW25113_3200 lptA JW3167, -9.878006498 Essential 0.021505376 -1.55160095 Unclear All 3 core 353 6 yhbN

- ECK3219, 0.01017811 3371174 3371566 - 405 971 gene 3371174 3371566 - BW25113_3230 rpsI -15.49772501 Essential 0 173.551740 Essential All 3 core 355 JW3199 7 1

- ECK3220, 3371582 3372010 - 404 971 gene 3371582 3372010 - BW25113_3231 rplM 0 -172.1619602 Essential 0 173.551740 Essential All 3 core 355 JW3200 1

- ECK3237, 0.02862985 3391746 3392234 - 387 971 gene 3391746 3392234 - BW25113_3249 mreD -7.562911402 Essential 0.016359918 3.49614786 Unclear All 3 core 356 JW3218 7 1

- ECK3238, 0.03170289 3392234 3393337 - 386 971 gene 3392234 3393337 - BW25113_3250 mreC -6.706333794 Essential 0.023550725 0.83094969 Unclear All 3 core 356 JW3219 9 3

ECK3239, - envB, 0.00670498 3393403 3394446 - 385 971 gene 3393403 3394446 - BW25113_3251 mreB -18.46456543 Essential 0.003831418 10.5335966 Essential All 3 core 356 JW3220, 1 8 mon, rodY

ECK3242, - 0.00636942 3398795 3399265 + 379 971 gene 3398795 3399265 + BW25113_3255 accB fabE, -18.82407043 Essential 0.002123142 12.6431839 Essential All 3 core 356 7 JW3223 9

ECK3243, - 0.00296296 3399276 3400625 + 378 971 gene 3399276 3400625 + BW25113_3256 accC fabG, -24.08848745 Essential 0.000740741 16.0362856 Essential All 3 core 356 3 JW3224 4

ECK3269, - 3424202 3424774 - 350 971 gene 3424202 3424774 - BW25113_3282 tsaC JW3243, 0.02443281 -8.860280933 Essential 0.008726003 7.03408022 Essential All 3 core 361

rimN, yrdC 3

ECK3273, - 0.02941176 3427049 3427558 + 346 971 gene 3427049 3427558 + BW25113_3287 def fms, -7.338332519 Essential 0.009803922 6.45326317 Essential All 3 core 361 5 JW3248 6

ECK3274, - 0.02109704 3427573 3428520 + 345 971 gene 3427573 3428520 + BW25113_3288 fmt JW3249, -10.0290149 Essential 0.012658228 5.07030012 Essential All 3 core 361 6 yhdD 7

- ECK3281, 0.00260416 3432975 3433358 - 338 971 gene 3432975 3433358 - BW25113_3294 rplQ -24.96358498 Essential 0 173.551740 Essential All 3 core 362 JW3256 7 1

ECK3282, - JW3257, 0.00404040 3433399 3434388 - 337 971 gene 3433399 3434388 - BW25113_3295 rpoA -21.97298241 Essential 0.003030303 11.3990227 Essential All 3 core 362 pez, phs, 4 4 sez

ECK3283, - 3434414 3435034 - 336 971 gene 3434414 3435034 - BW25113_3296 rpsD JW3258, 0 -172.1619602 Essential 0 173.551740 Essential All 3 core 362

ramA, sud 1

- ECK3284, 3435068 3435457 - 335 971 gene 3435068 3435457 - BW25113_3297 rpsK 0 -172.1619602 Essential 0 173.551740 Essential All 3 core 362 JW3259 1 bioRxiv preprint doi: https://doi.org/10.1101/2020.06.11.147629; this version posted June 12, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

- ECK3285, 3435474 3435830 - 334 971 gene 3435474 3435830 - BW25113_3298 rpsM 0 -172.1619602 Essential 0 173.551740 Essential All 3 core 362 JW3260 1

ECK3287, - 0.00075075 3436125 3437456 - 332 971 gene 3436125 3437456 - BW25113_3300 secY JW3262, -33.30101992 Essential 0.000750751 15.9947235 Essential All 3 core 362 1 prlA 6

- ECK3288, 3437464 3437898 - 331 971 gene 3437464 3437898 - BW25113_3301 rplO 0 -172.1619602 Essential 0 173.551740 Essential All 3 core 362 JW3263 1

- ECK3289, 3437902 3438081 - 330 971 gene 3437902 3438081 - BW25113_3302 rpmD 0 -172.1619602 Essential 0 173.551740 Essential All 3 core 362 JW3264 1

ECK3290, - eps, 3438085 3438588 - 329 971 gene 3438085 3438588 - BW25113_3303 rpsE 0 -172.1619602 Essential 0 173.551740 Essential All 3 core 362 JW3265, 1 spc, spcA

- ECK3291, 3438603 3438956 - 328 971 gene 3438603 3438956 - BW25113_3304 rplR 0 -172.1619602 Essential 0 173.551740 Essential All 3 core 362 JW3266 1

- ECK3292, 3438966 3439499 - 327 971 gene 3438966 3439499 - BW25113_3305 rplF 0 -172.1619602 Essential 0 173.551740 Essential All 3 core 362 JW3267 1

- ECK3293, 0.00254452 3439512 3439904 - 326 971 gene 3439512 3439904 - BW25113_3306 rpsH -25.12036917 Essential 0 173.551740 Essential All 3 core 362 JW3268 9 1

- ECK3294, 3439938 3440243 - 325 971 gene 3439938 3440243 - BW25113_3307 rpsN 0 -172.1619602 Essential 0 173.551740 Essential All 3 core 362 JW3269 1

- ECK3295, 0.00185185 3440258 3440797 - 324 971 gene 3440258 3440797 - BW25113_3308 rplE -27.26337251 Essential 0.003703704 10.6613233 Essential All 3 core 362 JW3270 2 6

- ECK3296, 3440812 3441126 - 323 971 gene 3440812 3441126 - BW25113_3309 rplX 0 -172.1619602 Essential 0 173.551740 Essential All 3 core 362 JW3271 1

- ECK3297, 3441137 3441508 - 322 971 gene 3441137 3441508 - BW25113_3310 rplN 0 -172.1619602 Essential 0.002688172 11.8261265 Essential All 3 core 362 JW3272 7

ECK3298, - 0.00392156 3441673 3441927 - 321 971 gene 3441673 3441927 - BW25113_3311 rpsQ JW3273, -22.17749955 Essential 0.003921569 10.4454109 Essential All 3 core 362 9 neaA 6

- ECK3299, 0.00520833 3441927 3442118 - 320 971 gene 3441927 3442118 - BW25113_3312 rpmC -20.22413515 Essential 0.005208333 9.32791842 Essential All 3 core 362 JW3274 3 2

- ECK3300, 3442118 3442528 - 319 971 gene 3442118 3442528 - BW25113_3313 rplP 0.00729927 -17.86758046 Essential 0 173.551740 Essential All 3 core 362 JW3275 1

- ECK3301, 0.00427350 3442541 3443242 - 318 971 gene 3442541 3443242 - BW25113_3314 rpsC -21.58814052 Essential 0.001424501 13.9679477 Essential All 3 core 362 JW3276 4 2

ECK3302, - 0.01201201 3443260 3443592 - 317 971 gene 3443260 3443592 - BW25113_3315 rplV eryB, -14.2931597 Essential 0.006006006 8.73308959 Essential All 3 core 362 2 JW3277 3

- ECK3303, 0.00358422 3443607 3443885 - 316 971 gene 3443607 3443885 - BW25113_3316 rpsS -22.79249401 Essential 0 173.551740 Essential All 3 core 362 JW3278 9 1

- ECK3304, 3443902 3444723 - 315 971 gene 3443902 3444723 - BW25113_3317 rplB 0.00486618 -20.69390119 Essential 0.003649635 10.7164342 Essential All 3 core 362 JW3279 1

- ECK3305, 3444741 3445043 - 314 971 gene 3444741 3445043 - BW25113_3318 rplW 0.00330033 -23.35518875 Essential 0.00330033 11.0888733 Essential All 3 core 362 JW3280 3 bioRxiv preprint doi: https://doi.org/10.1101/2020.06.11.147629; this version posted June 12, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

ECK3306, - 3445040 3445645 - 313 971 gene 3445040 3445645 - BW25113_3319 rplD eryA, 0.00330033 -23.35518875 Essential 0 173.551740 Essential All 3 core 362

JW3281 1

- ECK3307, 0.00476190 3445656 3446285 - 312 971 gene 3445656 3446285 - BW25113_3320 rplC -20.84336986 Essential 0 173.551740 Essential All 3 core 362 JW3282 5 1

ECK3308, - 0.00641025 3446318 3446629 - 311 971 gene 3446318 3446629 - BW25113_3321 rpsJ JW3283, -18.7793833 Essential 0 173.551740 Essential All 3 core 362 6 nusE 1

ECK3327, - 0.00520094 3464759 3466873 - 292 971 gene 3464759 3466873 - BW25113_3340 fusA far, fus, -20.23396158 Essential 0.00141844 13.9817902 Essential All 3 core 363 6 JW3302 6

- ECK3328, 0.02407407 3466970 3467440 - 291 971 gene 3466901 3467440 - BW25113_3341 rpsG -8.979386506 Essential 0.014814815 4.13014537 Essential All 3 core 363 JW3303 4 3

asuB, - ECK3329, 3467537 3467911 - 290 971 gene 3467537 3467911 - BW25113_3342 rpsL 0.008 -17.21967899 Essential 0.002666667 11.8544432 Essential All 3 core 363 JW3304, 4 strA

- ECK3371, 3505993 3506997 - 245 971 gene 3505993 3506997 - BW25113_3384 trpS 0.00199005 -26.77908321 Essential 0 173.551740 Essential All 3 core 364 JW3347 1

dap, ECK3419, 0.00271739 3567135 3568238 - 191 971 gene 3567135 3568238 - BW25113_3433 asd -24.67534794 Essential 0.000905797 -15.410073 Essential All 3 core 369 hom, 1 JW3396

cfcA, - ECK3548, 0.00109649 3717767 3718678 - 54 971 gene 3717767 3718678 - BW25113_3560 glyQ -30.77482494 Essential 0.003289474 11.1009409 Essential All 3 core 389 glyS, 1 9 JW3531

ECK3598, 0.01176470 3776002 3777021 - 7247 971 gene 3776002 3777021 - BW25113_3608 gpsA -14.4454279 Essential 0.004901961 -9.57365125 Essential All 3 core 395 JW3583 6

ECK3623, - 0.00938967 3801900 3803177 + 7271 971 gene 3801900 3803177 + BW25113_3633 waaA JW3608, -16.07781367 Essential 0.003912363 10.4543440 Essential All 3 core 398 1 kdtA 5

ECK3624, - 3803185 3803664 + 7272 971 gene 3803185 3803664 + BW25113_3634 coaD JW3609, 0.00625 -18.95635603 Essential 0.004166667 10.2132903 Essential All 3 core 398

kdtB, yicA 5

- ECK3627, 0.00421940 3804798 3805034 - 7275 971 gene 3804798 3805034 - BW25113_3637 rpmB -21.67561417 Essential 0 173.551740 Essential All 3 core 398 JW3612 9 1

dnaS, - ECK3630, 0.01096491 3807292 3807747 + 7278 971 gene 3807292 3807747 + BW25113_3640 dut -14.95854446 Essential 0.013157895 4.84583611 Essential All 3 core 398 JW3615, 2 7 sof

ECK3638, - 0.00320512 3814788 3815411 + 7287 971 gene 3814788 3815411 + BW25113_3648 gmk JW3623, -23.55445252 Essential 0.003205128 11.1958049 Essential All 3 core 400 8 spoR 3

acrB, Cou, ECK3691, himB, hisU, - 0.00455486 3871065 3873479 - 7344 971 gene 3871065 3873479 - BW25113_3699 gyrB hopA, -21.14968616 Essential 0.001656315 13.4745818 Essential All 3 core 410 5 JW5625, 1 nalC, parA, pcbA, pcpA

- ECK3693, 0.00544959 3874581 3875681 - 7346 971 gene 3874581 3875681 - BW25113_3701 dnaN -19.91030829 Essential 0.002724796 11.7783321 Essential All 3 core 410 JW3678 1 1

ECK3694, - 0.00356125 3875686 3877089 - 7347 971 gene 3875686 3877089 - BW25113_3702 dnaA hsm-2, -22.83639392 Essential 0.000712251 16.1575573 Essential All 3 core 410 4 JW3679 9

ECK3695, - 3877696 3877836 + 7349 971 gene 3877696 3877836 + BW25113_3703 rpmH JW3680, 0 -172.1619602 Essential 0 173.551740 Essential All 3 core 411

rimA, ssaF 1 bioRxiv preprint doi: https://doi.org/10.1101/2020.06.11.147629; this version posted June 12, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

ECK3696, 0.04444444 3877853 3878212 + 7350 971 gene 3877853 3878212 + BW25113_3704 rnpA -3.71643435 Essential 0.027777778 0.59191535 Unclear All 3 core 411 JW3681 4

- ECK3698, 0.00910746 3878436 3880082 + 7352 971 gene 3878436 3880082 + BW25113_3705 yidC -16.29639683 Essential 0.004857316 9.61038300 Essential All 3 core 411 JW3683 8 3

- ECK3722, 0.03114754 3905199 3907028 - 7376 971 gene 3905199 3907028 - BW25113_3729 glmS -6.856153186 Essential 0.00273224 11.7686828 Essential All 3 core 414 JW3707 1 7

ECK3723, - 0.01021152 3907190 3908560 - 7377 971 gene 3907190 3908560 - BW25113_3730 glmU JW3708, -15.47406887 Essential 0.005105762 9.40899661 Essential All 3 core 414 4 tms, yieA 8

ECK3799, - 0.01273885 3983185 3984126 - 7452 971 gene 3983185 3984126 - BW25113_3805 hemC JW5932, -13.86145793 Essential 0.005307856 9.25033663 Essential All 3 core 421 4 popE 7

ECK3842, - 0.00366300 4027968 4028513 + 7496 971 gene 4027968 4028513 + BW25113_3850 hemG JW3827, -22.64401575 Essential 0.009157509 6.79676425 Essential All 3 core 427 4 yihB 8

ECK3857, - 0.02369668 4043493 4044125 - 7511 971 gene 4043493 4044125 - BW25113_3865 yihA engB, -9.106272554 Essential 0.006319115 8.51459615 Essential All 3 core 432 2 JW5930 2

dga, ECK3959, - 0.00699300 4155356 4156213 + 7618 971 gene 4155356 4156213 + BW25113_3967 murI glr, -18.16927168 Essential 0.003496503 10.8760730 Essential All 3 core 449 7 JW5550, 6 yijA cyr, - ECK4032, 0.01718213 4242944 4243816 + 7699 971 gene 4242944 4243816 + BW25113_4040 ubiA -11.61884284 Essential 0.005727377 8.93420235 Essential All 3 core 469 JW4000, 1 7 sdgG

- ECK4033, 0.01278877 4243971 4246394 - 7700 971 gene 4243971 4246394 - BW25113_4041 plsB -13.83262504 Essential 0.008663366 7.06910825 Essential All 3 core 469 JW4001 9 7

ECK4035, exrA, 4247043 4247651 + 7702 971 gene 4247043 4247651 + BW25113_4043 lexA JW4003, 0.02134647 -9.936484521 Essential 0.014778325 -4.14547925 Essential All 3 core 469

spr, tsl, umuA ECK4044, - groP, grpA, 0.01129943 4254242 4255657 + 7711 971 gene 4254242 4255657 + BW25113_4052 dnaB -14.73992408 Essential 0.005649718 8.99146437 Essential All 3 core 471 grpD, 5 2 JW4012

ECK4051, - exrB, 0.00186219 4264053 4264589 + 1249 971 gene 4264053 4264589 + BW25113_4059 ssb -27.22590934 Essential 0 173.551740 Essential All 3 core 473 JW4020, 7 1 lexC ECK4136, groES, - 0.01020408 4360505 4360798 + 1343 971 gene 4360505 4360798 + BW25113_4142 groS JW4102, -15.47933255 Essential 0 173.551740 Essential All 3 core 486 2 mopB, 1 TabB

- ECK4156, 0.03508771 4379209 4380177 - 1361 971 gene 4379209 4380177 - BW25113_4160 psd -5.834683906 Essential 0.025799794 0.06399644 Unclear All 3 core 491 JW4121 9 2

ECK4158, - 0.01465201 4381421 4381966 + 1363 971 gene 4381421 4381966 + BW25113_4162 orn JW5740, -12.82250177 Essential 0.003663004 10.7027484 Essential All 3 core 491 5 yjeR 7

ECK4164, - 0.01515151 4385402 4385863 + 1370 971 gene 4385402 4385863 + BW25113_4168 tsaE JW4126, -12.57114779 Essential 0.006493506 8.39607118 Essential All 3 core 492 5 yjeE 1

- ECK4198, 0.03070175 4415656 4415883 + 1404 971 gene 4415656 4415883 + BW25113_4202 rpsR -6.977931647 Essential 0.013157895 4.84583611 Essential All 3 core 495 JW4160 4 7

- ECK4222, 0.00376647 4438939 4439469 - 7908 971 gene 4438939 4439469 - BW25113_4226 ppa -22.45361297 Essential 0.001883239 13.0478780 Essential All 3 core 499 JW4185 8 5 bioRxiv preprint doi: https://doi.org/10.1101/2020.06.11.147629; this version posted June 12, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

ECK4251, - 4470799 4473654 - 6564 971 gene 4470799 4473654 - BW25113_4258 valS JW4215, 0.00245098 -25.37369246 Essential 0.0017507 13.2911948 Essential All 3 core 506

val-act 5

dnaD, 0.00813008 4590055 4590792 - 14235 971 gene 4590055 4590792 - BW25113_4361 dnaC ECK4351, -17.10527512 Essential 0.002710027 -11.7975412 Essential All 3 core 514 1 JW4325

- ECK0141, 0.07916666 Conditionall non-core 20- 153740 154219 - 14045 778 gene 153740 154219 - BW25113_0142 folK 2.160389125 Unclear 0.022916667 1.05179456 Unclear KP Keio-PEC JW0138 7 y essential 21 7

Genes dinH, 0.11729323 Non- 1.17457198 containing a 928680 932669 + 5146 944 gene 928680 932669 + BW25113_0890 ftsK ECK0881, 7.020558521 0.029573935 Unclear KP Keio-PEC core 100 3 essential 7 transposon JW0873 free region

Genes ECK3925, 0.20729166 Non- 32.4349444 Non- containing a 4112308 4113267 - 7581 968 gene 4112308 4113267 - BW25113_3933 ftsN JW3904, 16.12149011 0.148958333 KP Keio-PEC core 441 7 essential 7 essential transposon msgA free region ams, Genes ECK1069, - 0.06403013 containing a 1136638 1139823 - 4944 970 gene 1136638 1139823 - BW25113_1084 rne hmp1, -0.140424844 Unclear 0.023226616 0.94356534 Unclear KP Keio-PEC core 127 2 transposon JW1071, 4 free region smbB

ECK1070, 0.09034267 Non- 8.07735805 Non- Polar 1139958 1140278 + 4943 970 gene 1139958 1140278 + BW25113_1085 yceQ 3.696304238 0.052959502 KP Keio-PEC core 127 JW5154 9 essential 1 essential insertions

Genes ECK0098, 0.08771929 6.89464588 Non- containing a 104192 104704 + 14093 971 gene 104192 104704 + BW25113_0097 secM JW5007, 3.34589178 Unclear 0.048732943 KP Keio-PEC core 14 8 5 essential transposon srrA, yacA free region

ecfE, - ECK0175, 0.05764966 Conditionall 193033 194385 + 14008 971 gene 193033 194385 + BW25113_0176 rseP -1.212948026 Unclear 0.002217295 12.4951030 Essential KP Keio-PEC core 25 JW0171, 7 y essential 8 yaeL

- Errors in ECK0402, 0.05681818 423103 424950 + 9609 971 gene 423103 424950 + BW25113_0408 secD -1.358330908 Unclear 0.013528139 4.68225792 Essential KP library Keio-PEC core 38 JW0398 2 8 construction

- Errors in ECK0403, 0.08539094 424961 425932 + 9608 971 gene 424961 425932 + BW25113_0409 secF 3.029940696 Unclear 0.018518519 2.65338088 Unclear KP library Keio-PEC core 38 JW0399 7 9 construction

Genes ECK2182, 0.09369676 Non- 2.41323318 containing a 2277855 2279615 + 13214 971 gene 2277855 2279615 + BW25113_2188 yejM JW2176, 4.136233447 0.033503691 Unclear KP Keio-PEC core 254 3 essential 4 transposon yejN free region

ECK3032, Genes htrP, 0.05198776 5.20005766 Non- containing a 3177172 3177825 - 599 971 gene 3177172 3177825 - BW25113_3041 ribB -2.23234331 Unclear 0.042813456 KP Keio-PEC core 338 JW3009, 8 4 essential transposon luxH free region

Genes ECK3188, 0.07118055 7.34877528 Non- containing a 3336195 3336770 + 433 971 gene 3336195 3336770 + BW25113_3199 lptC JW3166, 0.983663531 Unclear 0.050347222 KP Keio-PEC core 353 6 7 essential transposon yrbK free region

ECK3224, - 0.14325842 Non- Conditionall 3375559 3376626 + 400 971 gene 3375559 3376626 + BW25113_3235 degS hhoB, htrH, 9.882280933 0.011235955 5.73557023 Essential KP Keio-PEC core 355 7 essential y essential JW3204 2

Genes ECK3446, 0.13408876 Non- 5.92530601 Non- containing a 3594388 3595446 - 163 971 gene 3594388 3595446 - BW25113_3462 ftsX ftsS, 8.901244327 0.045325779 KP Keio-PEC core 374 3 essential 6 essential transposon JW3427 free region

Errors in ECK3447, 0.14499252 Non- 7.06222878 Non- 3595439 3596107 - 162 971 gene 3595439 3596107 - BW25113_3463 ftsE 10.06464909 0.049327354 KP library Keio-PEC core 374 JW3428 6 essential 9 essential construction

Genes ECK3640, 0.10052157 Non- 8.38047380 Non- containing a 3815760 3817868 + 7289 971 gene 3815760 3817868 + BW25113_3650 spoT 5.005815705 0.054054054 KP Keio-PEC core 400 JW3625 4 essential 6 essential transposon free region

ECK1131, - non-core 1197715 1198389 - 24944 47 gene 1197715 1198389 - BW25113_1145 cohE JW1131, 0 -172.1619602 Essential 0 173.551740 Essential KT Unclear TraDIS-Keio 131-132 ymfK 1

cohR, - ECK1354, 0.01257861 non-core 1414022 1414498 - 4312 106 gene 1414022 1414498 - BW25113_1356 racR -13.95468959 Essential 0.008385744 7.22623637 Essential KT Unclear TraDIS-Keio JW1351, 6 152-153 5 ydaR bioRxiv preprint doi: https://doi.org/10.1101/2020.06.11.147629; this version posted June 12, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

ECK1564, - Errors in non-core 1642191 1642598 + 8268 128 gene 1642191 1642598 + BW25113_1570 dicA ftsT, 0 -172.1619602 Essential 0 173.551740 Essential KT library TraDIS-Keio 179-180 JW1562 1 construction

- Errors in ECK0721, 0.01912045 766914 768482 + 9289 971 gene 766914 768482 + BW25113_0733 cydA -10.79690017 Essential 0.008285532 7.28373241 Essential KT library TraDIS-Keio core 77 JW0722 9 5 construction

ECK0877, JW0869, - Errors in 0.03658536 922930 924651 - 5151 971 gene 922930 924651 - BW25113_0886 cydC mdrA, -5.469436285 Essential 0.019163763 2.40966014 Unclear KT library TraDIS-Keio core 100 6 mdrH, 6 construction surB, ycaB

ade, 0.01239970 Conditionall 1186072 1187442 - 4898 971 gene 1186072 1187442 - BW25113_1131 purB ECK1117, -14.0600538 Essential 0.001458789 -13.8906244 Essential KT TraDIS-Keio core 131 8 y essential JW1117

ECK2557, - Errors in 0.02976190 2690713 2691249 - 14761 971 gene 2690713 2691216 - BW25113_2559 tadA JW2543, -7.239296387 Essential 0.003968254 10.4003569 Essential KT library TraDIS-Keio core 292 5 yfhC 9 construction

ECK2571, - Errors in 0.02777777 2702796 2703371 - 14743 971 gene 2702796 2703371 - BW25113_2573 rpoE JW2557, -7.813274253 Essential 0.003472222 10.9018898 Essential KT library TraDIS-Keio core 293 8 sigE 5 construction

ECK3786, - 0.01256467 3971961 3973313 + 7441 971 gene 3971961 3973313 + BW25113_3793 wzyE JW3769, -13.96285363 Essential 0.009608278 6.55580177 Essential KT Unclear TraDIS-Keio core 419 1 rffT 5

aarF, ECK3829, - 0.00609384 Conditionall 4013586 4015226 + 7483 971 gene 4013586 4015226 + BW25113_3835 ubiB JW3812, -19.13298299 Essential 0.002437538 12.1686391 Essential KT TraDIS-Keio core 427 5 y essential yigQ, yigR, 9 yigS

ECK3835, - 0.02208835 Conditionall 4018348 4019841 + 7489 971 gene 4018348 4019841 + BW25113_3843 ubiD JW3819, -9.666521984 Essential 0.008701473 7.04777984 Essential KT TraDIS-Keio core 427 3 y essential yigC, yigY 7

ECK3966, Errors in JW3942, 0.01682439 4164004 4164954 - 7625 969 gene 4164004 4164954 - BW25113_3974 coaA -11.77922323 Essential 0.009463722 -6.63234268 Essential PT library TraDIS-PEC core 454 panK, rts, 5 construction ts-9

ECK0027, - Errors in 0.00248491 22391 25207 + 14166 971 gene 22391 25207 + BW25113_0026 ileS ilvS, -25.28072825 Essential 0.00141995 13.9783366 Essential PT library TraDIS-PEC core 6 3 JW0024 3 construction

ECK0103, - Errors in 0.02093397 109086 109706 - 14088 971 gene 109086 109706 - BW25113_0103 coaE JW0100, -10.09000813 Essential 0.009661836 6.52761286 Essential PT library TraDIS-PEC core 14 7 yacE 3 construction

ECK0410, - groNB, 0.03809523 Polar 430593 431012 + 9601 971 gene 430593 431012 + BW25113_0416 nusB -5.112435103 Essential 0.014285714 4.35427056 Essential PT TraDIS-PEC core 39 JW0406, 8 insertions 7 ssaD, ssyB

ECK2492, - Errors in idaB, 0.01282051 2611434 2612180 - 14827 971 gene 2611434 2612135 - BW25113_2496 hda -13.81435035 Essential 0.004273504 10.1153581 Essential PT library TraDIS-PEC core 285 JW5397, 3 8 construction yfgE

act, ala-act, - Errors in ECK2692, 0.00076016 2812740 2815370 - 12412 971 gene 2812740 2815370 - BW25113_2697 alaS -33.21801291 Essential 0 173.551740 Essential PT library TraDIS-PEC core 304 JW2667, 7 1 construction lovB

ECK2886, - Errors in 3028543 3029641 - 12201 971 gene 3028543 3029641 - BW25113_2891 prfB JW5847, 0.00273224 -24.63841954 Essential 0.000910747 15.3930014 Essential PT library TraDIS-PEC core 318 supK 4 construction

- Errors in ECK3010, 0.00841080 3157074 3159332 - 621 971 gene 3157074 3159332 - BW25113_3019 parC -16.86408869 Essential 0.003098716 11.3183966 Essential PT library TraDIS-PEC core 333 JW2987 1 1 construction

ECK3048, 0.02981029 Conditionall 3197580 3197948 - 582 971 gene 3197580 3197948 - BW25113_3058 folB JW3030, -7.225681066 Essential 0.002710027 -11.7975412 Essential PT TraDIS-PEC core 340 8 y essential ygiG

dnaP, - Errors in ECK3056, 0.00171821 3204466 3206211 + 572 971 gene 3204466 3206211 + BW25113_3066 dnaG -27.76675924 Essential 0.001145475 14.6688261 Essential PT library TraDIS-PEC core 340 JW3038, 3 3 construction parB, sdgA

alt, - Errors in 0.00162866 3206406 3208247 + 570 971 gene 3206406 3208247 + BW25113_3067 rpoD ECK3057, -28.12612808 Essential 0.000542888 16.9909374 Essential PT library TraDIS-PEC core 340 4 JW3039 8 construction bioRxiv preprint doi: https://doi.org/10.1101/2020.06.11.147629; this version posted June 12, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

ECK3165, - Errors in 0.00747384 3316092 3317429 - 456 971 gene 3316092 3317429 - BW25113_3176 glmM JW3143, -17.70089145 Essential 0.004484305 9.92741745 Essential PT library TraDIS-PEC core 353 2 mrsA, yhbF 1 construction

ECK3190, - Errors in 0.00275482 3337303 3338028 + 431 971 gene 3337303 3338028 + BW25113_3201 lptB JW3168, -24.58263598 Essential 0.002754821 11.7395449 Essential PT library TraDIS-PEC core 353 1 yhbG 8 construction

ECK3547, - Errors in 0.00193236 3715688 3717757 - 55 971 gene 3715688 3717757 - BW25113_3559 glyS gly-act, -26.97707354 Essential 0.000966184 15.2075153 Essential PT library TraDIS-PEC core 389 7 JW3530 7 construction

ECK3775, hdf, JW3756, - Errors in 0.00476190 3959777 3961036 + 7429 971 gene 3959777 3961036 + BW25113_3783 rho nitA, nusD, -20.84336986 Essential 0.002380952 12.2500458 Essential PT library TraDIS-PEC core 419 5 psuA, rnsC, 9 construction sbaA, sun, tabC, tsu ECK4137, - Errors in groEL, 4360842 4362488 + 1344 971 gene 4360842 4362488 + BW25113_4143 groL 0.00667881 -18.4919865 Essential 0.004250152 10.1366041 Essential PT library TraDIS-PEC core 486 JW4103, 3 construction mopA

ECK4157, - Errors in engC, 0.02469135 4380274 4381326 - 1362 971 gene 4380274 4381326 - BW25113_4161 rsgA -8.775325975 Essential 0.008547009 7.13458606 Essential PT library TraDIS-PEC core 491 JW4122, 8 8 construction yjeQ

- ECK4352, 0.00925925 4590795 4591334 - 14234 971 gene 4590795 4591334 - BW25113_4362 dnaT -16.17806066 Essential 0.005555556 9.06163917 Essential PT Unclear TraDIS-PEC core 514 JW4326 9 8

ECK0271, Non- 33.6556783 Non- non-core 34- 281106 282488 + 9093 58 gene 281106 282488 + BW25113_0270 yagG 0.23065799 18.21291061 0.154013015 K Unclear Keio only JW0263 essential 5 essential 35

Genes ECK0219, 0.32600732 Non- 49.3084826 Non- containing a non-core 29- 235587 235865 + 1554 72 pseudogene 235593 235865 + BW25113_4503 yafF 26.15277505 0.21978022 K Keio only JW0208 6 essential 5 essential transposon 30 free region

Genes ECK3613, 0.22718808 Non- 34.9093830 Non- containing a non-core 3791599 3792672 - 6615 153 gene 3791599 3792672 - BW25113_3623 waaU JW3598, 17.90698672 0.159217877 K Keio only 2 essential 4 essential transposon 397-398 rfaK, waaK free region

Genes ECK3012, 0.13888888 Non- 19.3721423 Non- containing a non-core 3161210 3161605 - 619 330 gene 3161210 3161605 - BW25113_3021 mqsA JW2989, 9.418401974 0.095959596 K Keio only 9 essential 1 essential transposon 333-334 ygiT free region

ECK1566, 1.63185417 non-core 1642880 1643050 + 11630 354 gene 1642922 1643050 + BW25113_1572 ydfB 0.07751938 1.923662732 Unclear 0.031007752 Unclear K Unclear Keio only JW1564 7 179-180

ECK4077, Errors in Non- 52.2423655 Non- non-core 4296687 4297616 - 1280 500 gene 4296687 4297616 - BW25113_4084 alsK JW5724, 0.31827957 25.53607274 0.232258065 K library Keio only essential 8 essential 479-480 yjcT construction

chpBI, ECK4220, 0.22619047 Non- 38.5976755 Non- non-core 4438264 4438515 + 7910 547 gene 4438264 4438515 + BW25113_4224 chpS 17.8187494 0.174603175 K Anti-Toxin Keio only JW5750, 6 essential 3 essential 498-499 yjfB

ECK2012, 0.07539682 non-core 2082943 2083194 - 13394 735 gene 2082943 2083194 - BW25113_2017 yefM 1.61421362 Unclear 0.027777778 0.59191535 Unclear K Anti-Toxin Keio only JW5835 5 234-235

chpAI, chpR, 0.19277108 Non- 21.4987263 Non- non-core 2904450 2904698 - 12324 818 gene 2904450 2904698 - BW25113_2783 mazE ECK2777, 14.7804794 0.104417671 K Anti-Toxin Keio only 4 essential 2 essential 313-314 JW2754, tasA

Errors in ECK0575, 0.16425120 Non- 23.1678124 Non- non-core 57- 604915 605535 - 9446 889 gene 604915 605535 - BW25113_0583 entD 12.03105967 0.111111111 K library Keio only JW5085 8 essential 3 essential 58 construction

ECK3702, Errors in JW5619, 0.52564102 Non- 104.275686 Non- 3883596 3884843 + 7356 946 gene 3883596 3884843 + BW25113_3709 tnaB 41.14469998 0.457532051 K library Keio only core 412 JW5622, 6 essential 7 essential construction tnaP, trpP

ECK1687, 0.29411764 Non- 31.6361518 Non- 1764872 1765228 + 11150 958 gene 1764872 1765228 + BW25113_1689 ydiL 23.58178253 0.145658263 K Unclear Keio only core 186 JW1679 7 essential 7 essential bioRxiv preprint doi: https://doi.org/10.1101/2020.06.11.147629; this version posted June 12, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

ECK3517, Errors in 0.28846153 Non- 49.6822035 Non- 3683628 3685967 - 86 963 gene 3683628 3685967 - BW25113_3532 bcsB JW3500, 23.11810449 0.221367521 K library Keio only core 383 8 essential 3 essential yhjN construction

ECK3102, Non- 32.9965409 Non- 3253080 3253469 - 522 966 gene 3253080 3253469 - BW25113_3113 tdcF JW5521, 0.21025641 16.3911441 0.151282051 K Unclear Keio only core 349 essential 7 essential yhaR

ECK1162, Errors in 0.04868913 4.72859772 Non- 1219735 1220001 - 4875 969 gene 1219735 1220001 - BW25113_1174 minE JW1163, -2.861471825 Unclear 0.041198502 K library Keio only core 132 9 7 essential minB construction

ECK1163, Errors in 0.16605166 Non- 19.6780233 Non- 1220005 1220817 - 4874 969 gene 1220005 1220817 - BW25113_1175 minD JW1164, 12.20988517 0.097170972 K library Keio only core 132 1 essential 1 essential minB construction

ECK3147, 0.25369738 Non- 45.4993936 Non- 3295848 3296726 + 474 969 gene 3295848 3296726 + BW25113_3159 yhbV 20.2082008 0.203640501 K Unclear Keio only core 351 JW5530 3 essential 3 essential

ECK3134, Errors in 0.20557491 Non- 32.0828343 Non- 3285834 3286694 - 487 970 gene 3285834 3286694 - BW25113_3146 rsmI JW3115, 15.964729 0.147502904 K library Keio only core 351 3 essential 3 essential yraL construction

ECK2565, - 0.06754772 Polar 2696742 2697422 - 14750 971 gene 2696742 2697422 - BW25113_2567 rnc JW2551, 0.421826829 Unclear 0.019089574 2.43750933 Unclear K Keio only core 293 4 insertions ranA 8

ECK3180, Errors in 0.33673469 Non- Non- 3330322 3330615 - 441 971 gene 3330322 3330615 - BW25113_3191 mlaB JW5535, 27.0027312 0.261904762 59.1806718 K library Keio only core 353 4 essential essential yrbB construction

Errors in ECK3455, 0.24324324 Non- 36.3380929 Non- 3602577 3603242 + 154 971 gene 3602577 3603242 + BW25113_3471 yhhQ 19.31024275 0.165165165 K library Keio only core 374 JW3436 3 essential 9 essential construction

ECK3828, 0.15676567 Non- Non- Polar 4012984 4013589 + 7482 971 gene 4012984 4013589 + BW25113_3834 ubiJ JW3811, 11.27885212 0.118811881 25.0749831 K Keio only core 427 7 essential essential insertions yigP

RNA genes not 692419 692503 - 9356 967 tRNA 692419 692503 - BW25113_0672 leuW tRNA-Leu P considered PEC only core 68

in our analysis RNA genes not 4165316 4165391 + 7628 969 tRNA 4165316 4165391 + BW25113_3976 thrU tRNA-Thr P considered PEC only core 454

in our analysis RNA genes not 4165601 4165675 + 7630 969 tRNA 4165601 4165675 + BW25113_3978 glyT tRNA-Gly P considered PEC only core 454

in our analysis RNA genes not 560179 560255 + 9477 970 tRNA 560179 560255 + BW25113_0536 argU tRNA-Arg P considered PEC only core 54

in our analysis RNA genes not 1027081 1027168 - 5062 970 tRNA 1027081 1027168 - BW25113_0971 serT tRNA-Ser P considered PEC only core 108

in our analysis Genes ECK3855, 0.09580193 Non- 6.60849882 Non- containing a 4040326 4043112 + 7510 970 gene 4040326 4043112 + BW25113_3863 polA JW3835, 4.407982507 0.047721564 P PEC only core 432 8 essential 5 essential transposon resA free region 4.5S sRNA RNA genes component not of Signal 471912 472008 + 9561 971 SRP_RNA 471904 472017 + BW25113_0455 ffs P considered PEC only core 44 Recognitio in our n Particle analysis (SRP) RNA genes not 1985296 1985382 - 10918 971 tRNA 1985296 1985382 - BW25113_1909 leuZ tRNA-Leu P considered PEC only core 220

in our analysis RNA genes not 1985395 1985468 - 10917 971 tRNA 1985395 1985468 - BW25113_1910 cysT tRNA-Cys P considered PEC only core 220

in our analysis bioRxiv preprint doi: https://doi.org/10.1101/2020.06.11.147629; this version posted June 12, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

RNA genes not 2811912 2812004 - 12414 971 tRNA 2811912 2812004 - BW25113_2695 serV tRNA-Ser P considered PEC only core 304

in our analysis RNA genes not 3315431 3315517 - 458 971 tRNA 3315431 3315517 - BW25113_3174 leuU tRNA-Leu P considered PEC only core 353

in our analysis

ECK3187, 0.13227513 Non- 17.8467878 Non- Polar 3335632 3336198 + 434 971 gene 3335632 3336198 + BW25113_3198 kdsC JW3165, 8.703671917 0.08994709 P PEC only core 353 2 essential 4 essential insertions yrbI, yrbJ

RNA genes not 3940317 3940392 + 7408 971 tRNA 3940317 3940392 + BW25113_3761 trpT tRNA-Trp P considered PEC only core 418

in our analysis RNA genes not 3975735 3975811 + 7444 971 tRNA 3975735 3975811 + BW25113_3796 argX tRNA-Arg P considered PEC only core 419

in our analysis RNA genes not 3975870 3975945 + 7445 971 tRNA 3975869 3975945 + BW25113_3797 hisR tRNA-His P considered PEC only core 419

in our analysis RNA genes not 3976095 3976171 + 7447 971 tRNA 3976095 3976171 + BW25113_3799 proM tRNA-Pro P considered PEC only core 419

in our analysis

ECK3927, - Errors in 0.09095043 Non- 4114540 4116738 - 7583 971 gene 4114540 4116738 - BW25113_3935 priA JW3906, 3.776675641 0.024556617 0.48490891 Unclear P library PEC only core 442 2 essential srgA 4 construction

- Errors in ECK4141, 0.07407407 4365516 4366082 + 1348 971 gene 4365516 4366082 + BW25113_4147 efp 1.418735065 Unclear 0.012345679 5.21300567 Essential P library PEC only core 488 JW4107 4 6 construction

- non-core 2557882 2558691 + 17731 32 gene 2557882 2558691 + BW25113_2450 yffS ECK2445 0.02962963 -7.276601225 Essential 0.024691358 0.43893872 Unclear TraDIS only 282-283 3

- ECK1124, 0.01843971 non-core 1192989 1193693 - 17523 44 gene 1192989 1193693 - BW25113_1138 ymfE -11.07705114 Essential 0.004255319 10.1318955 Essential TraDIS only JW5166 6 131-132 4

- ECK1451, 0.02898550 non-core 1524179 1524350 + 9732 68 gene 1524179 1524661 + BW25113_1457 ydcD -7.460166058 Essential 0.00621118 8.58906497 Essential TraDIS only JW1452 7 162-163 8

- ECK1355, non-core 1414622 1414918 + 4313 106 gene 1414622 1414918 + BW25113_1357 ydaS 0.03030303 -7.088034047 Essential 0.013468013 4.70866977 Essential TraDIS only JW1352 152-153 7

ECK1932, - JW1918, 0.04088888 non-core 2005030 2005832 - 4291 143 pseudogene 2004704 2005832 - BW25113_4495 yedN -4.478728525 Essential 0.008888889 6.94369228 Essential TraDIS only JW5912, 9 221-222 5 yedM

- ECK1543, non-core 1631304 1631714 + 7818 202 gene 1631304 1631714 + BW25113_1549 ydfO 0.02919708 -7.399517805 Essential 0.01459854 4.22129132 Essential TraDIS only JW5252 179-180 9

- non-core 1459422 1459487 - 8167 207 gene 1459422 1459487 - BW25113_4674 ynbG ECK4489 0 -172.1619602 Essential 0 173.551740 Essential TraDIS only 155-156 1

- ECK1558, 0.01666666 non-core 1639890 1640129 - 8263 247 gene 1639890 1640129 - BW25113_1564 relB -11.85087856 Essential 0.0125 5.14231587 Essential TraDIS only JW1556, RC 7 179-180 8

- ECK1349, 0.02923976 non-core 1412095 1412265 - 4306 257 gene 1412095 1412265 - BW25113_4526 ydaE -7.387323963 Essential 0 173.551740 Essential TraDIS only JW1346 6 152-153 1

- 0.01960784 non-core 1429215 1429265 - 74732 411 pseudogene 1429215 1429265 - BW25113_4638 ttcC ECK4447 -10.60151157 Essential 0 173.551740 Essential TraDIS only 3 152-153 1 bioRxiv preprint doi: https://doi.org/10.1101/2020.06.11.147629; this version posted June 12, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

- 0.02083333 non-core 29- 234744 235223 - 13965 534 gene 234744 235223 - BW25113_4586 ykfM ECK4397 -10.12785226 Essential 0.010416667 6.13946696 Essential TraDIS only 3 30 1

ECK2855, - 0.02542372 non-core 2988673 2988882 - 7873 585 pseudogene 2988673 2989379 - BW25113_2858 ygeN JW5459, -8.538567963 Essential 0.012711864 5.04599278 Essential TraDIS only 9 316-317 ygeM 5

ECK2849, 0.04471544 3.35572773 non-core 2984627 2985118 + 12254 627 gene 2984627 2985118 + BW25113_2851 ygeG -3.660160022 Essential 0.036585366 Unclear TraDIS only JW2819 7 1 316-317

- ECK2848, 0.02235772 non-core 2983913 2984402 + 12255 629 pseudogene 2983913 2984402 + BW25113_2850 ygeF -9.570374812 Essential 0.010162602 6.26827512 Essential TraDIS only JW2818 4 316-317 6

0.02469135 10.4744605 Non- non-core 2983178 2983258 - 12257 641 gene 2983178 2983258 - BW25113_4683 yqeL ECK4498 -8.775325975 Essential 0.061728395 TraDIS only 8 1 essential 316-317

- ECK1449, 0.01408450 non-core 1521197 1521409 + 1832 680 gene 1521197 1521409 + BW25113_1455 yncH -13.1174424 Essential 0.004694836 9.74618861 Essential TraDIS only JW5235 7 162-163 7

ECK1147, - elb1, elbA, 0.04320987 non-core 1207136 1207459 - 4892 683 gene 1207136 1207459 - BW25113_1160 iraM -3.975952967 Essential 0.021604938 1.51594591 Unclear TraDIS only JW1147, 7 131-132 2 ycgW

- 0.02836879 non-core 2898916 2899056 + 12330 685 gene 2898916 2899056 + BW25113_4682 yqcG ECK4497 -7.638979344 Essential 0.021276596 1.63377036 Unclear TraDIS only 4 312-313 9

- ECK1501, non-core 1586433 1586699 - 1885 767 gene 1586433 1586699 - BW25113_1508 hipB 0.02247191 -9.529909302 Essential 0.003745318 10.6193338 Essential TraDIS only JW1501 172-173 3

ECK1466, 0.03942652 non-core 1539995 1540285 - 1848 809 pseudogene 1540007 1540285 - BW25113_1472 yddL -4.80631315 Essential 0.010752688 -5.97177277 Essential TraDIS only JW1468 3 163-164

- 0.01960784 non-core 4258457 4258979 - 7817 828 gene 4258737 4258940 - BW25113_4621 yjbS ECK4429 -10.60151157 Essential 0.009803922 6.45326317 Essential TraDIS only 3 472-473 6

ECK3072, - 0.04076738 non-core 3227087 3227503 - 553 836 gene 3227087 3227503 - BW25113_3082 higA JW3053, -4.505615563 Essential 0.009592326 6.56421526 Essential TraDIS only 6 343-344 ygjM 5

- ECK4475, 0.03535353 1578019 1578216 - 1876 963 gene 1578019 1578216 - BW25113_1500 safA -5.769005313 Essential 0.005050505 9.45316228 Essential TraDIS only core 171 yneN 5 5

ECK1039, - htrB, 0.00760043 1111118 1112038 - 4974 967 gene 1111118 1112038 - BW25113_1054 lpxL -17.58228345 Essential 0.004343105 10.0525496 Essential TraDIS only core 125 JW1041, 4 8 waaM

- ECK0114, 122182 124074 + 14075 968 gene 122182 124074 + BW25113_0115 aceF 0.04437401 -3.731100261 Essential 0.01320655 4.82421008 Essential TraDIS only core 16 JW0111 1

ECK3974, - 0.00932400 4168375 4168803 + 7635 969 gene 4168375 4168803 + BW25113_3983 rplK JW3946, -16.12812595 Essential 0.004662005 9.77404839 Essential TraDIS only core 454 9 relC 3

ECK3975, - 0.03120567 4168807 4169511 + 7636 969 gene 4168807 4169511 + BW25113_3984 rplA JW3947, -6.840373332 Essential 0.012765957 5.02153117 Essential TraDIS only core 454 4 rpy 8

ECK3989, - 0.01126760 4187644 4188708 + 7651 969 gene 4187644 4188708 + BW25113_3997 hemE hemC, -14.76047015 Essential 0.005633803 9.00326720 Essential TraDIS only core 455 6 JW3961 3

- ECK0621, 0.01138716 654707 655672 - 9398 970 gene 654707 655672 - BW25113_0628 lipA -14.68356459 Essential 0.005175983 9.35336491 Essential TraDIS only core 65 JW0623, lip 4 5

- ECK0651, 0.02777777 687330 687797 - 9365 970 gene 687330 687797 - BW25113_0659 ybeY -7.813274253 Essential 0.008547009 7.13458606 Essential TraDIS only core 68 JW0656 8 8 bioRxiv preprint doi: https://doi.org/10.1101/2020.06.11.147629; this version posted June 12, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

ECK1075, 0.02298850 1142823 1142996 + 4939 970 gene 1142823 1142996 + BW25113_1089 rpmF -9.348945992 Essential 0.017241379 -3.14652949 Unclear TraDIS only core 127 JW1075 6

- ECK1077, 0.01257861 1144215 1145168 + 4937 970 gene 1144215 1145168 + BW25113_1091 fabH -13.95468959 Essential 0.002096436 12.6862141 Essential TraDIS only core 127 JW1077 6 5

ECK1246, - exbA, 0.04027777 1305346 1306065 + 4788 970 gene 1305346 1306065 + BW25113_1252 tonB -4.614558523 Essential 0.015277778 3.93711327 Essential TraDIS only core 140 JW5195, 8 2 T1rec

ECK1259, - 0.04444444 1317295 1317339 - 4774 970 gene 1317295 1317339 - BW25113_1265 trpL JW1257, -3.71643435 Essential 0.022222222 1.29625223 Unclear TraDIS only core 141 4 trpEE 1

- ECK1274, 0.03883495 1334500 1334808 + 4756 970 gene 1334500 1334808 + BW25113_1279 yciS -4.941379755 Essential 0.012944984 4.94094551 Essential TraDIS only core 145 JW1271 1 3

0.85079032 1341053 1341157 + 4748 970 gene 1341053 1341157 + BW25113_4672 ymiB ECK4487 0.00952381 -15.97603279 Essential 0.028571429 Unclear TraDIS only core 145 2

1.40792025 1940372 1940437 - 10966 970 gene 1940372 1940437 - BW25113_4677 yobI ECK4492 0.03030303 -7.088034047 Essential 0.03030303 Unclear TraDIS only core 211 8

dus, - ECK2059, 0.03092783 2135115 2135696 - 13344 970 gene 2135115 2135696 - BW25113_2065 dcd -6.915999887 Essential 0 173.551740 Essential TraDIS only core 238 JW2050, 5 1 paxA

ECK3827, - JW5581, 0.04497354 4012215 4012970 + 7481 970 gene 4012215 4012970 + BW25113_3833 ubiE -3.606791384 Essential 0.007936508 7.48739906 Essential TraDIS only core 427 menG, 5 7 menH, yigO

ECK0024, - 20815 21078 - 14169 971 gene 20815 21078 - BW25113_0023 rpsT JW0022, 0 -172.1619602 Essential 0 173.551740 Essential TraDIS only core 6

sup 1

dhl, - ECK0115, 0.00842105 124399 125823 + 14074 971 gene 124399 125823 + BW25113_0116 lpd -16.855423 Essential 0.002105263 12.6719395 Essential TraDIS only core 16 JW0112, 3 8 lpdA

- ECK0714, 0.03604568 754162 756963 + 9297 971 gene 754162 756963 + BW25113_0726 sucA -5.599728351 Essential 0.015346181 3.90882489 Essential TraDIS only core 75 JW0715, lys 2 5

- ECK0715, 0.03119868 756978 758195 + 9296 971 gene 756978 758195 + BW25113_0727 sucB -6.842268967 Essential 0.013957307 4.49538664 Essential TraDIS only core 75 JW0716 6 5

- ECK0722, 0.02894736 768498 769637 + 9288 971 gene 768498 769637 + BW25113_0734 cydB -7.471136099 Essential 0.005263158 9.28505054 Essential TraDIS only core 77 JW0723 8 4

ECK0723, 0.04385964 0.10867049 769652 769765 + 9287 971 gene 769652 769765 + BW25113_4515 cydX JW0724, -3.83871021 Essential 0.026315789 Unclear TraDIS only core 77 9 8 ybgT

ECK0878, 0.04244482 924652 926418 - 5150 971 gene 924652 926418 - BW25113_0887 cydD htrD, -4.139466553 Essential 0.010752688 -5.97177277 Essential TraDIS only core 100 2 JW0870

ECK0908, 0.03825136 Non- 966129 966311 + 5119 971 gene 966129 966311 + BW25113_0917 ycaR -5.076126751 Essential 0.043715847 5.46165912 TraDIS only core 105 JW0900 6 essential

ECK0960, - 0.03636363 1025795 1026124 - 5064 971 gene 1025795 1026124 - BW25113_0969 tusE JW0952, -5.522790325 Essential 0.009090909 6.83295172 Essential TraDIS only core 107 6 yccK 3

asuE, ECK1119, 0.02981029 1188123 1189229 - 4896 971 gene 1188123 1189229 - BW25113_1133 mnmA -7.225681066 Essential 0.002710027 -11.7975412 Essential TraDIS only core 131 JW1119, 8 trmU, ycfB

- ECK1634, 1711608 1712264 - 11204 971 gene 1711608 1712264 - BW25113_1638 pdxH 0.02891933 -7.47920808 Essential 0.00152207 13.7521166 Essential TraDIS only core 183 JW1630 3 bioRxiv preprint doi: https://doi.org/10.1101/2020.06.11.147629; this version posted June 12, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

- ECK1644, 0.04166666 1719938 1720177 - 11194 971 gene 1719938 1720177 - BW25113_1648 ydhL -4.307985423 Essential 0.020833333 1.79394513 Unclear TraDIS only core 183 JW5827 7 8

ECK1648, 0.02623456 1722604 1723251 + 11190 971 gene 1722604 1723251 + BW25113_1652 rnt -8.282835786 Essential 0.00617284 -8.61572878 Essential TraDIS only core 183 JW1644 8

ECK1710, - 0.03666666 1789510 1789809 - 11125 971 gene 1789510 1789809 - BW25113_1712 ihfA hid, himA, -5.449934172 Essential 0.01 6.35166827 Essential TraDIS only core 193 7 JW1702 4

ECK1713, - 1793389 1793523 - 7195 971 gene 1793483 1793527 - BW25113_1715 pheM JW1705, 0 -172.1619602 Essential 0 173.551740 Essential TraDIS only core 194

phtL 1

- ECK1715, 1794059 1794256 - 11121 971 gene 1794059 1794256 - BW25113_1717 rpmI 0.02020202 -10.36882094 Essential 0.01010101 6.29977402 Essential TraDIS only core 194 JW1707 5

- ECK2179, 0.00701754 2275996 2276280 + 13217 971 gene 2275996 2276280 + BW25113_2185 rplY -18.14464769 Essential 0.007017544 8.05220142 Essential TraDIS only core 254 JW2173 4 3

ECK2224, 0.01798063 2333046 2333768 + 13172 971 gene 2333046 2333768 + BW25113_2232 ubiG JW2226, -11.27103137 Essential 0.004149378 -10.2293169 Essential TraDIS only core 258 6 pufX, yfaB

dedF, - 2421536 2422105 - 13092 971 gene 2421536 2422105 - BW25113_2311 ubiX ECK2305, 0.01754386 -11.45958812 Essential 0.005263158 9.28505054 Essential TraDIS only core 266

JW2308 4

ctr, - 2527425 2529152 + 14899 971 gene 2527425 2529152 + BW25113_2416 ptsI ECK2411, 0.03587963 -5.640113382 Essential 0.008101852 7.39023844 Essential TraDIS only core 279

JW2409 6

- ECK2503, 0.03168567 2624317 2625894 - 14813 971 gene 2624317 2625894 - BW25113_2507 guaA -6.71094865 Essential 0.001267427 14.3452400 Essential TraDIS only core 286 JW2491 8 8

- ECK2522, 0.04166666 2650107 2650442 - 14795 971 gene 2650107 2650442 - BW25113_2525 fdx -4.307985423 Essential 0.017857143 2.90689194 Unclear TraDIS only core 289 JW2509 7 3

ECK2523, - 2650444 2652294 - 14794 971 gene 2650444 2652294 - BW25113_2526 hscA hsc, 0.03619665 -5.563133723 Essential 0.015667207 3.77683257 Essential TraDIS only core 289

JW2510 3

ECK2526, hsm-5, - 0.01033591 2653262 2653648 - 14791 971 gene 2653262 2653648 - BW25113_2529 iscU JW2513, -15.38660276 Essential 0.005167959 9.35969425 Essential TraDIS only core 289 7 miaC, nifU, 4 yfhN ECK2527, - JW2514, 0.01234567 2653676 2654890 - 14790 971 gene 2653676 2654890 - BW25113_2530 iscS -14.09214201 Essential 0.003292181 11.0979288 Essential TraDIS only core 289 nuvC, yfhO, 9 8 yzzO

- ECK2548, 0.00239234 2677613 2678866 - 14769 971 gene 2677613 2678866 - BW25113_2551 glyA -25.53734318 Essential 0.003189793 11.2132671 Essential TraDIS only core 292 JW2535 4 8

ECK2592, - 0.02344546 2728390 2729370 - 14720 971 gene 2728390 2729370 - BW25113_2594 rluD JW2576, -9.191663545 Essential 0.013251784 4.80414028 Essential TraDIS only core 296 4 sfhB, yfiI 9

ECK2605, - 0.01092896 2738729 2739277 - 14706 971 gene 2738729 2739277 - BW25113_2608 rimM JW5413, -14.9823987 Essential 0.003642987 10.7232543 Essential TraDIS only core 298 2 yfjA 3

- ECK2823, 2957720 2958514 - 12279 971 gene 2957720 2958514 - BW25113_2827 thyA 0 -172.1619602 Essential 0 173.551740 Essential TraDIS only core 316 JW2795 1

asuD, - ECK2885, 0.01185770 3027016 3028533 - 12202 971 gene 3027016 3028533 - BW25113_2890 lysS -14.38782943 Essential 0.001976285 12.8858740 Essential TraDIS only core 318 herC, 8 9 JW2858

ECK2893, - 0.03363914 3034672 3035652 + 12194 971 gene 3034672 3035652 + BW25113_2898 ygfZ JW2866, -6.199428634 Essential 0.02038736 1.95643848 Unclear TraDIS only core 318 4 yzzW 4 bioRxiv preprint doi: https://doi.org/10.1101/2020.06.11.147629; this version posted June 12, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

acd, - ECK2902, 3045699 3046877 - 12185 971 gene 3045699 3046877 - BW25113_2907 ubiH 0.03307888 -6.343734547 Essential 0.005089059 9.42231040 Essential TraDIS only core 319 JW2875, 7 visB

ECK2930, - 0.04116465 3073003 3074994 - 12155 971 gene 3073003 3074994 - BW25113_2935 tktA JW5478, -4.417920388 Essential 0.019076305 2.44249493 Unclear TraDIS only core 327 9 tkt 4

- ECK3055, 0.01388888 3204140 3204355 + 573 971 gene 3204140 3204355 + BW25113_3065 rpsU -13.22155404 Essential 0.00462963 9.80166353 Essential TraDIS only core 340 JW3037 9 8

ECK3154, - 3304774 3305043 - 467 971 gene 3304774 3305043 - BW25113_3165 rpsO JW3134, 0.02962963 -7.276601225 Essential 0.011111111 5.79605046 Essential TraDIS only core 352

secC 1

ECK3156, - JW3136, 0.04228855 3306136 3306537 - 465 971 gene 3306136 3306537 - BW25113_3167 rbfA -4.173126561 Essential 0.019900498 2.13543370 Unclear TraDIS only core 352 sdr-43, 7 5 yhbB

dhpS, - 0.04122497 3317422 3318270 - 455 971 gene 3317422 3318270 - BW25113_3177 folP ECK3166, -4.404661108 Essential 0.00942285 6.65410708 Essential TraDIS only core 353 1 JW3144 4

dod, - ECK3373, 3507741 3508418 - 243 971 gene 3507741 3508418 - BW25113_3386 rpe 0.03539823 -5.757998822 Essential 0.013274336 4.79414700 Essential TraDIS only core 364 JW3349, 4 yhfD

ECK3804, 1.01900286 3988122 3988946 + 7458 971 gene 3988122 3988946 + BW25113_3809 dapF 0.04 -4.676800152 Essential 0.029090909 Unclear TraDIS only core 422 JW5592 9

ECK4196, - 0.00252525 4414935 4415330 + 1402 971 gene 4414935 4415330 + BW25113_4200 rpsF JW4158, -25.17181345 Essential 0 173.551740 Essential TraDIS only core 495 3 sdgH 1

- ECK4197, 0.03174603 4415337 4415651 + 1403 971 gene 4415337 4415651 + BW25113_4201 priB -6.694783646 Essential 0.00952381 6.60044484 Essential TraDIS only core 495 JW4159 2 7

ECK4363, 0.02415458 4597620 4598033 + 14223 971 gene 4597620 4598033 + BW25113_4372 holD -8.952528053 Essential 0.014492754 -4.26610639 Essential TraDIS only core 516 JW4334 9

bioRxiv preprint doi: https://doi.org/10.1101/2020.06.11.147629; this version posted June 12, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

Table 4. The 521 E. coli core regions identified through the pan-genome graph.

Core Region Start Stop Length (bp) 1 337 5237 4901 2 5683 9893 4211 3 9928 15298 5371 4 16751 16960 210 5 17325 18655 1331 6 20815 28207 7393 7 28374 39115 10742 8 39244 50302 11059 9 50380 58179 7800 10 59687 66546 6860 11 66477 67752 1276 12 67838 72097 4260 13 72131 73786 1656 14 75335 113586 38252 15 113596 118038 4443 16 118579 125823 7245 17 127869 130699 2831 18 130875 143343 12469 19 143455 144357 903 20 144431 146088 1658 21 154216 161021 6806 22 161217 163751 2535 23 166265 180582 14318 24 181610 187086 5477 25 187344 212466 25123 26 212666 220198 7533 27 220258 222062 1805 28 222237 225166 2930 29 225245 233494 8250 30 235906 242292 6387 31 244845 246584 1740 32 247385 248440 1056 33 249196 249648 453 34 250746 258657 7912 35 380688 385435 4748 36 390586 403917 13332 37 404403 419788 15386 38 419793 425932 6140 39 426061 432563 6503 40 432617 433591 975 bioRxiv preprint doi: https://doi.org/10.1101/2020.06.11.147629; this version posted June 12, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

41 433771 442122 8352 42 442271 447066 4796 43 447110 447302 193 44 447526 502022 54497 45 502059 503536 1478 46 504331 509324 4994 47 509449 509856 408 48 509857 518286 8430 49 525588 531153 5566 50 543813 545072 1260 51 545083 546788 1706 52 546983 548643 1661 53 548674 553197 4524 54 560179 560255 77 55 588766 592919 4154 56 594170 600880 6711 57 601721 602839 1119 58 605480 613494 8015 59 613710 619966 6257 60 620341 630832 10492 61 630805 632025 1221 62 633160 637323 4164 63 639013 647312 8300 64 650039 651792 1754 65 652013 655884 3872 66 657093 670956 13864 67 678933 683203 4271 68 684799 692826 8028 69 692969 694633 1665 70 695030 701346 6317 71 701549 703213 1665 72 703790 710654 6865 73 716512 724796 8285 74 734457 743225 8769 75 748641 758195 9555 76 758470 760505 2036 77 766914 776374 9461 78 777033 777108 76 79 777224 793887 16664 80 794042 796031 1990 81 798959 802737 3779 82 802795 803365 571 83 803424 811003 7580 bioRxiv preprint doi: https://doi.org/10.1101/2020.06.11.147629; this version posted June 12, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

84 811195 816044 4850 85 816998 827692 10695 86 828526 832892 4367 87 833121 836987 3867 88 837006 853011 16006 89 854669 868257 13589 90 868435 870783 2349 91 870791 872119 1329 92 872166 883415 11250 93 883435 895101 11667 94 895300 899190 3891 95 900049 910426 10378 96 910808 911503 696 97 911716 918270 6555 98 918369 920996 2628 99 921274 921358 85 100 921340 936176 14837 101 936415 940352 3938 102 940387 942475 2089 103 945796 946536 741 104 946728 955551 8824 105 955720 979753 24034 106 979975 992968 12994 107 1000224 1026874 26651 108 1027081 1037371 10291 109 1046419 1046631 213 110 1046917 1051634 4718 111 1051717 1059231 7515 112 1061041 1063164 2124 113 1063368 1063604 237 114 1065316 1067615 2300 115 1067627 1074602 6976 116 1074761 1076269 1509 117 1076812 1080103 3292 118 1080448 1081512 1065 119 1093021 1096243 3223 120 1096307 1097140 834 121 1097167 1098652 1486 122 1098868 1099304 437 123 1099407 1100358 952 124 1100417 1108862 8446 125 1110976 1114502 3527 126 1114763 1136442 21680 bioRxiv preprint doi: https://doi.org/10.1101/2020.06.11.147629; this version posted June 12, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

127 1136638 1162845 26208 128 1163055 1175797 12743 129 1177161 1179076 1916 130 1179073 1183696 4624 131 1183772 1191829 8058 132 1219409 1222928 3520 133 1223170 1224985 1816 134 1226052 1235405 9354 135 1235791 1240438 4648 136 1252177 1264475 12299 137 1265159 1281982 16824 138 1282994 1289600 6607 139 1290902 1305357 14456 140 1305346 1308037 2692 141 1311479 1318975 7497 142 1321109 1321984 876 143 1322024 1329086 7063 144 1329381 1329715 335 145 1330088 1345296 15209 146 1346085 1351925 5841 147 1361192 1364260 3069 148 1377220 1384983 7764 149 1386248 1387147 900 150 1387484 1390179 2696 151 1391622 1394493 2872 152 1400236 1406205 5970 153 1429442 1431150 1709 154 1431517 1436000 4484 155 1436111 1439947 3837 156 1476512 1481220 4709 157 1483970 1486107 2138 158 1488405 1497382 8978 159 1509019 1511259 2241 160 1511356 1513284 1929 161 1515220 1517322 2103 162 1517356 1520237 2882 163 1533107 1538317 5211 164 1540545 1546248 5704 165 1546655 1546939 285 166 1547085 1550017 2933 167 1550882 1551313 432 168 1561761 1566678 4918 169 1573890 1575047 1158 bioRxiv preprint doi: https://doi.org/10.1101/2020.06.11.147629; this version posted June 12, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

170 1575099 1576781 1683 171 1577183 1580743 3561 172 1581077 1581991 915 173 1603229 1604937 1709 174 1605164 1606111 948 175 1606223 1608960 2738 176 1611285 1613165 1881 177 1613377 1614464 1088 178 1616903 1617171 269 179 1619514 1625225 5712 180 1650441 1651001 561 181 1651004 1652127 1124 182 1657247 1669204 11958 183 1669229 1723251 54023 184 1728011 1728358 348 185 1728692 1762942 34251 186 1763331 1766542 3212 187 1766769 1769701 2933 188 1772647 1773558 912 189 1773661 1774638 978 190 1774658 1778934 4277 191 1778991 1783738 4748 192 1783870 1789409 5540 193 1789510 1793199 3690 194 1793389 1796827 3439 195 1799219 1806582 7364 196 1807678 1810385 2708 197 1810643 1812757 2115 198 1812862 1837971 25110 199 1839256 1844950 5695 200 1845117 1846785 1669 201 1855959 1863099 7141 202 1863212 1864495 1284 203 1866118 1866285 168 204 1867831 1873512 5682 205 1873846 1874205 360 206 1881121 1888139 7019 207 1888330 1916569 28240 208 1916570 1917480 911 209 1917622 1933348 15727 210 1933479 1938456 4978 211 1938603 1944779 6177 212 1946523 1948670 2148 bioRxiv preprint doi: https://doi.org/10.1101/2020.06.11.147629; this version posted June 12, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

213 1948835 1951264 2430 214 1951289 1956052 4764 215 1956837 1960448 3612 216 1960650 1963621 2972 217 1965287 1972101 6815 218 1973234 1975093 1860 219 1975068 1985100 10033 220 1985296 1995270 9975 221 1998784 2003347 4564 222 2006181 2006734 554 223 2016374 2017159 786 224 2017449 2025798 8350 225 2029316 2034599 5284 226 2034856 2035506 651 227 2036949 2038105 1157 228 2048386 2051468 3083 229 2051508 2055816 4309 230 2055872 2059245 3374 231 2072513 2072842 330 232 2074862 2076028 1167 233 2076237 2077664 1428 234 2079185 2082609 3425 235 2083280 2083462 183 236 2083543 2090706 7164 237 2131383 2132966 1584 238 2133240 2144466 11227 239 2145192 2146609 1418 240 2147497 2159002 11506 241 2159149 2160678 1530 242 2161470 2163092 1623 243 2170991 2178911 7921 244 2179003 2180777 1775 245 2186272 2189812 3541 246 2199174 2204579 5406 247 2204704 2208123 3420 248 2209136 2212036 2901 249 2213171 2222544 9374 250 2222917 2223864 948 251 2224103 2227318 3216 252 2227512 2245176 17665 253 2247724 2249493 1770 254 2253198 2279766 26569 255 2283690 2305013 21324 bioRxiv preprint doi: https://doi.org/10.1101/2020.06.11.147629; this version posted June 12, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

256 2306509 2313355 6847 257 2320846 2327881 7036 258 2328435 2333768 5334 259 2338344 2342951 4608 260 2343414 2351282 7869 261 2355910 2367017 11108 262 2367127 2374506 7380 263 2375087 2376004 918 264 2380413 2380916 504 265 2383527 2416080 32554 266 2417215 2424242 7028 267 2424501 2427405 2905 268 2428303 2431330 3028 269 2431310 2437249 5940 270 2437370 2441919 4550 271 2449806 2454749 4944 272 2457731 2459712 1982 273 2459788 2459862 75 274 2472681 2491774 19094 275 2492150 2507723 15574 276 2509122 2512290 3169 277 2512736 2514485 1750 278 2514612 2514687 76 279 2519289 2531070 11782 280 2532031 2537982 5952 281 2539132 2541456 2325 282 2543005 2550656 7652 283 2568337 2590096 21760 284 2590264 2594307 4044 285 2608179 2624224 16046 286 2624317 2628961 4645 287 2629243 2638223 8981 288 2638372 2646698 8327 289 2647215 2661205 13991 290 2666627 2666720 94 291 2666705 2667127 423 292 2667175 2692966 25792 293 2693022 2707589 14568 294 2707798 2719199 11402 295 2719430 2724917 5488 296 2724959 2730242 5284 297 2730272 2735752 5481 298 2735742 2743419 7678 bioRxiv preprint doi: https://doi.org/10.1101/2020.06.11.147629; this version posted June 12, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

299 2743474 2749314 5841 300 2782344 2793834 11491 301 2793879 2807513 13635 302 2807577 2810862 3286 303 2811143 2811219 77 304 2811832 2815998 4167 305 2816067 2817705 1639 306 2817850 2831464 13615 307 2835932 2849775 13844 308 2849812 2853013 3202 309 2859918 2866388 6471 310 2866746 2870977 4232 311 2880669 2885938 5270 312 2898106 2898777 672 313 2900002 2904044 4043 314 2904776 2927047 22272 315 2928943 2940932 11990 316 2941116 2980435 39320 317 2992343 2997327 4985 318 3024593 3038460 13868 319 3039527 3050308 10782 320 3050321 3052684 2364 321 3053112 3054216 1105 322 3059636 3067105 7470 323 3068045 3068554 510 324 3069538 3070815 1278 325 3070830 3072218 1389 326 3072246 3072689 444 327 3073003 3076030 3028 328 3076236 3077156 921 329 3077294 3091762 14469 330 3091917 3103800 11884 331 3129667 3132952 3286 332 3133075 3154505 21431 333 3154616 3159519 4904 334 3162108 3165192 3085 335 3165889 3166815 927 336 3166863 3174940 8078 337 3174978 3176682 1705 338 3177172 3178077 906 339 3185098 3185562 465 340 3188323 3209032 20710 341 3209086 3209850 765 bioRxiv preprint doi: https://doi.org/10.1101/2020.06.11.147629; this version posted June 12, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

342 3210138 3219530 9393 343 3219593 3227042 7450 344 3227976 3232904 4929 345 3233303 3239881 6579 346 3240011 3246028 6018 347 3246677 3250010 3334 348 3250038 3251369 1332 349 3251405 3260957 9553 350 3263575 3270212 6638 351 3285834 3297814 11981 352 3297932 3313339 15408 353 3315431 3353975 38545 354 3362373 3367848 5476 355 3369638 3385387 15750 356 3385817 3400625 14809 357 3400734 3406159 5426 358 3412196 3413425 1230 359 3413493 3416553 3061 360 3416784 3417144 361 361 3420568 3432489 11922 362 3432975 3446629 13655 363 3463504 3492974 29471 364 3505993 3524011 18019 365 3524074 3527799 3726 366 3527875 3536553 8679 367 3537433 3549149 11717 368 3553207 3556878 3672 369 3557494 3569024 11531 370 3569081 3572086 3006 371 3572310 3574986 2677 372 3578441 3585685 7245 373 3586084 3591727 5644 374 3591915 3606274 14360 375 3606329 3612349 6021 376 3619039 3623962 4924 377 3629568 3641148 11581 378 3643597 3644022 426 379 3647165 3648573 1409 380 3648615 3658328 9714 381 3659540 3662548 3009 382 3662952 3678978 16027 383 3682515 3689545 7031 384 3689818 3693447 3630 bioRxiv preprint doi: https://doi.org/10.1101/2020.06.11.147629; this version posted June 12, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

385 3693456 3699150 5695 386 3699458 3701677 2220 387 3701808 3703835 2028 388 3704159 3713960 9802 389 3715688 3720721 5034 390 3720767 3724125 3359 391 3724491 3734942 10452 392 3750036 3755315 5280 393 3763603 3764739 1137 394 3764742 3770387 5646 395 3770759 3782420 11662 396 3782407 3785911 3505 397 3787347 3789335 1989 398 3801900 3808450 6551 399 3808487 3810899 2413 400 3812234 3820651 8418 401 3820820 3825526 4707 402 3825579 3829676 4098 403 3833909 3835099 1191 404 3835310 3844086 8777 405 3844162 3846348 2187 406 3847282 3849667 2386 407 3857972 3863801 5830 408 3868681 3868776 96 409 3869500 3870834 1335 410 3871065 3877089 6025 411 3877383 3881552 4170 412 3881795 3890186 8392 413 3900213 3904885 4673 414 3905199 3918993 13795 415 3919372 3927130 7759 416 3927138 3934687 7550 417 3937055 3939984 2930 418 3940062 3953173 13112 419 3954037 3977554 23518 420 3979792 3979963 172 421 3980046 3987419 7374 422 3987882 3993505 5624 423 3993652 3995736 2085 424 3996648 4001737 5090 425 4001799 4008674 6876 426 4008714 4010552 1839 427 4010452 4028513 18062 bioRxiv preprint doi: https://doi.org/10.1101/2020.06.11.147629; this version posted June 12, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

428 4028891 4030577 1687 429 4030870 4033800 2931 430 4033879 4033994 116 431 4034266 4037404 3139 432 4037559 4049699 12141 433 4049985 4053590 3606 434 4068913 4071798 2886 435 4072657 4072869 213 436 4073659 4080209 6551 437 4086484 4086798 315 438 4087279 4096315 9037 439 4096397 4098442 2046 440 4098597 4108688 10092 441 4108773 4114384 5612 442 4114540 4117153 2614 443 4117214 4117822 609 444 4118006 4122195 4190 445 4122223 4143026 20804 446 4143624 4148151 4528 447 4148201 4150718 2518 448 4151052 4153502 2451 449 4155356 4156527 1172 450 4156587 4158140 1554 451 4158300 4158375 76 452 4158560 4161489 2930 453 4161567 4161682 116 454 4161985 4179501 17517 455 4180663 4188708 8046 456 4188718 4191747 3030 457 4193248 4197599 4352 458 4199702 4199777 76 459 4199962 4202892 2931 460 4202970 4203085 116 461 4203608 4208341 4734 462 4208524 4210260 1737 463 4212732 4217439 4708 464 4217659 4220070 2412 465 4220282 4221559 1278 466 4221812 4230127 8316 467 4230253 4230663 411 468 4232554 4240402 7849 469 4242328 4249049 6722 470 4251597 4254159 2563 bioRxiv preprint doi: https://doi.org/10.1101/2020.06.11.147629; this version posted June 12, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

471 4254242 4256789 2548 472 4257042 4258235 1194 473 4259342 4264589 5248 474 4264688 4266985 2298 475 4266988 4271557 4570 476 4273181 4275141 1961 477 4275341 4285722 10382 478 4286253 4286942 690 479 4287036 4294220 7185 480 4311514 4312239 726 481 4313153 4315558 2406 482 4316216 4323767 7552 483 4325511 4331445 5935 484 4331728 4333083 1356 485 4334746 4341479 6734 486 4352368 4362979 10612 487 4363182 4364179 998 488 4364446 4366516 2071 489 4366692 4373437 6746 490 4373656 4375158 1503 491 4375864 4382252 6389 492 4382400 4399823 17424 493 4399794 4399868 75 494 4402204 4414204 12001 495 4414333 4416374 2042 496 4418752 4421093 2342 497 4421138 4427312 6175 498 4428525 4438253 9729 499 4438939 4440735 1797 500 4440875 4449128 8254 501 4449285 4449672 388 502 4449717 4450181 465 503 4450339 4460138 9800 504 4460344 4462350 2007 505 4468290 4468706 417 506 4470799 4479880 9082 507 4485007 4486306 1300 508 4541193 4545166 3974 509 4547195 4549343 2149 510 4549814 4550497 684 511 4550527 4551301 775 512 4577726 4581096 3371 513 4581474 4583129 1656 bioRxiv preprint doi: https://doi.org/10.1101/2020.06.11.147629; this version posted June 12, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

514 4586967 4592675 5709 515 4593294 4595857 2564 516 4596132 4599254 3123 517 4599231 4604077 4847 518 4604497 4611419 6923 519 4612918 4618364 5447 520 4618672 4620339 1668 521 4620550 4631445 10896

bioRxiv preprint doi: https://doi.org/10.1101/2020.06.11.147629; this version posted June 12, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

Table 5. Pan-genome graph statistics for B. subtilis and E. coli.

B. subtilis B. subtilis E. coli E. coli PGG Statistic original PGG refined PGG original PGG refined PGG Size 1 clusters 4434 3231 87423 27273 Shared (size>1) clusters) 8174 8204 68970 48129 # of genes in shared clusters 463311 487562 5039502 5610683 Core clusters 3558 3778 2968 3631 Size 1 edges 7282 5479 153199 67566 Shared edges 9755 9433 99284 67248 Edge instances in shared edges 460452 485177 4970823 5567497 Core edges 3230 3520 2218 3124

bioRxiv preprint doi: https://doi.org/10.1101/2020.06.11.147629; this version posted June 12, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

Supplementary Table 1. All B. subtilis genes compared to the PGG based annotation of core

regions for the type strain genome used by Kobayashi et al and Koo et al. (strain 168, GenBank

sequence AL009126.3, BioSample SAMEA3138188, Assembly

ASM904v1/GCF_000009045.1). Columns 1-5 are the start, stop, strand, cluster, and cluster size

for the PGG annotation. Columns 6-11 are the gene type, start, stop, strand, locus tag, and gene

symbol/name for the GenBank annotation. Column 12 is the Koo et al. gene symbol/name.

Columns 13-14 are the Kobayashi et al. gene symbol/name and evidence type (from Supporting

Table 4 “RB, reference to study with Bacillus subtilis ; RO, reference to study with other

bacteria; TW, this work; TW*, inactivation failed but IPTG mutant could not be made”).

Columns 15-16 are the GenBank protein product accession and name. Column 17 is the PGG

core or non-core region the gene is contained in.

Supplementary Table 2. The 108 B. subtilis genomes used in the study. Data is from GenBank

RefSeq: BioSample ID, Assembly ID, GenBank Species, GenBank Strain, Genome SIaze, and

whether the genome is a type strain.

Supplementary Table 3. All E. coli genes compared to the PGG based annotation of core

regions for the K-12 BW25113 strain used by Goodall (GenBank sequence CP009273.1,

Assembly ASM75055v1/GCA_000750555.1, BioSample SAMN03013572). Columns 1-5 are

the start, stop, strand, cluster, and cluster size for the PGG annotation. Columns 6-11 are the

gene type, start, stop, strand, locus tag, and gene symbol/name for the GenBank annotation.

Column 12 is a list of gene synonyms for the gene from GenBank. Columns 13-21 are from

Goodall et al.: 13-15 from Table S1 (normal essentiality), 16-18 from Table S4 (essentiality after bioRxiv preprint doi: https://doi.org/10.1101/2020.06.11.147629; this version posted June 12, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

outgrowth), 19-20 from Table S3 (outlier discrepancies), and 21 from Table S2 (comparison of

data sets). Column 22 is the PGG core or non-core region the gene is contained in.

Supplementary Table 4. The 971 E. coli genomes used in the study. Data is from GenBank

RefSeq: BioSample ID, Assembly ID, GenBank Species, GenBank Strain, Genome SIaze, and

whether the genome is a type strain.