bioRxiv preprint doi: https://doi.org/10.1101/2019.12.22.885152; this version posted December 23, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

1 Cryo-EM Structures and Regulation of

2 Arabinofuranosyltransferase AftD from Mycobacteria

3

4 Yong Zi Tan1,2, Lei Zhang3, José Rodrigues4, Ruixiang Blake Zheng5, Sabrina I. Giacometti1, Ana

5 L. Rosário4, Brian Kloss6, Venkata P. Dandey2, Hui Wei2, Richard Brunton5, Ashleigh M.

6 Raczkowski7, Diogo Athayde4, Maria João Catalão8, Madalena Pimentel8, Oliver B. Clarke1,9, Todd

7 L. Lowary5, Margarida Archer4, Michael Niederweis3, Clinton S. Potter2,7,10, Bridget

8 Carragher2,7,10,*, Filippo Mancia1,11,*

9

10 1Department of Physiology and Cellular Biophysics, Columbia University, New York, NY 10032,

11 USA.

12 2National Resource for Automated Molecular Microscopy, Simons Electron Microscopy Center,

13 New York Structural Biology Center, New York, NY 10027, USA.

14 3Department of Microbiology, University of Alabama at Birmingham, Birmingham, AL 35233,

15 USA.

16 4Instituto de Tecnologia Química e Biológica António Xavier, Universidade Nova de Lisboa (ITQB

17 NOVA), 2780-157 Oeiras, Portugal.

18 5Alberta Glycomics Centre and Department of Chemistry, University of Alberta, Edmonton,

19 Alberta T6G 2G2, Canada.

20 6Center on Membrane Protein Production and Analysis, New York Structural Biology Center, New

21 York, NY 10027, USA.

22 7Simons Electron Microscopy Center, New York Structural Biology Center, New York, NY 10027,

23 USA.

1

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.22.885152; this version posted December 23, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

1 8Research Institute for Medicines (iMed.ULisboa), Faculty of Pharmacy, Universidade de Lisboa,

2 1649-003 Lisboa, Portugal.

3 9Department of Anesthesiology, Columbia University, New York, NY 10032, USA.

4 10Department of Biochemistry and Molecular Biophysics, Columbia University, New York, NY

5 10032, USA.

6 11Lead Contact

7 *Correspondence: [email protected] (F.M.); [email protected] (B.C., for cryo-EM)

8

9 Running Title: Cryo-EM Structures and Regulation of Arabinofuranosyltransferase AftD from

10 Mycobacteria

11 Keywords: Glycosyltransferase, , lipomannan, lipoarabinomannan,

12 arabinofuranose, membrane protein, nanodisc, single-particle cryo-electron microscopy, acyl carrier

13 protein

14 47,548 characters; 7 figures, 7 supplemental figures, 3 tables; summary 148 words

15

2

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.22.885152; this version posted December 23, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

1 SUMMARY

2 Mycobacterium tuberculosis causes tuberculosis, a disease that kills over one million people each

3 year. Its cell envelope is a common antibiotic target and has a unique structure due, in part, to two

4 lipidated polysaccharides – arabinogalactan and lipoarabinomannan. Arabinofuranosyltransferase D

5 (AftD) is an essential enzyme involved in assembling these glycolipids. We present the 2.9 Å

6 resolution structure of M. abscessus AftD determined by single particle cryo-electron microscopy.

7 AftD has a conserved GT-C glycosyltransferase fold and three carbohydrate binding modules.

8 Glycan array analysis shows that AftD binds complex arabinose glycans. Additionally, AftD is non-

9 covalently complexed with an acyl carrier protein (ACP). 3.4 and 3.5 Å structures of a mutant with

10 impaired ACP binding reveal a conformational change that suggests the ACP may regulate AftD

11 function. Using a conditional knock-out constructed in M. smegmatis, mutagenesis experiments

12 confirm the essentiality of the putative active site and the ACP binding for AftD function.

13

14

3

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.22.885152; this version posted December 23, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

1 INTRODUCTION

2 Globally, tuberculosis (TB) is the leading cause of death from a single infectious agent, out-ranking

3 even HIV/AIDS (WHO, 2018). The causative agent of TB is Mycobacterium tuberculosis, which

4 belongs to a genus with a unique cell envelope made of lipids and carbohydrates (Figure 1A). The

5 major component of the mycobacterial cell envelope is a large macromolecule consisting of

6 covalently linked peptidoglycan (PG), branched heteropolysaccharide arabinogalactan (AG) and

7 long chain (70 to 90 carbon atoms long) mycolic acids, termed the mycolyl-arabinogalactan-

8 peptidoglycan (mAGP) complex (Alderwick et al., 2015; Grzegorzewicz et al., 2016). This mAGP

9 complex is in turn decorated by a variety of glycolipids called phosphatidyl-myo-inositol mannoside

10 (PIM), lipomannan (LM) and lipoarabinomannan (LAM) (Figure 1A). M. tuberculosis produces

11 between 2,000 to 10,000 distinct lipids (Layre et al., 2014), most of which form an outer membrane

12 of very low permeability (Hoffmann et al., 2008; Nikaido and Jarlier, 1991). This cell envelope is

13 crucial for growth and virulence of M. tuberculosis (Barry III, 2001; Jankute et al., 2015) and is a

14 major contributor to the resistance of M. tuberculosis to common antibiotics. The importance of this

15 unique structure in mycobacteria makes its biosynthetic enzymes attractive drug targets (Abrahams

16 and Besra, 2016; Barry et al., 2007). For example, isoniazid and ethambutol, two of the four TB

17 front-line drugs, target mycobacterial cell wall synthesis (Islam et al., 2017).

18

19 Of the enzymes involved in mAGP biosynthesis, arabinofuranosyltransferases are responsible for

20 the addition of D-arabinofuranose sugar moieties to AG and LAM (Abrahams and Besra, 2016). All

21 mycobacterial arabinofuranosyltransferases are predicted to be multi-pass transmembrane (TM)

22 proteins that use decaprenylphosphoryl-D-arabinofuranose (DPA) to donate an arabinofuranose

23 residue to the mAGP (Wolucka et al., 1994). Arabinofuranosyltransferases add 90 arabinofuranose

24 residues to the galactan core of AG and 55–72 arabinofuranose residues to the mannan core of

4

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.22.885152; this version posted December 23, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

1 LAM. The best characterized arabinofuranosyltransferases are EmbA, EmbB and EmbC (Belanger

2 et al., 1996; Telenti et al., 1997), which are named Emb because they are putative targets of

3 ethambutol (Goude et al., 2009; Takayama and Kilburn, 1989). Other family members include AftA

4 (Alderwick et al., 2006), AftB (Seidel et al., 2007), AftC (Birch et al., 2008), AftD (Škovierová et

5 al., 2009), and possibly other yet to be identified enzymes (Angala et al., 2016). AG synthesis starts

6 when AftA adds a single D-arabinofuranose to the C5 of the D-galactofuranose at positions 8, 10

7 and 12 of the existing galactan. Thereafter, EmbA and EmbB extend the arabinan chain through α-

8 (15) linkages; α-(13) branching is then introduced by AftC and possibly AftD. AftB is the final

9 enzyme to cap off the AG with addition of a terminal β-(12)-linked D-arabinofuranose. For LAM,

10 an unknown enzyme (Angala et al., 2016) primes the mannan chain with an α-(16) glycosidic

11 bond before being extended through α-(15) linkages by EmbC. AftC and possibly AftD then

12 catalyzes α-(13) branching and capping in a β-(12) manner by AftB, as in the synthesis of AG.

13 These pathways are by no means fully characterized (Alderwick et al., 2006).

14

15 Arabinofuranosyltransferase D (AftD) is essential for growth (Škovierová et al., 2009) and, with a

16 theoretical molecular mass of 150 kDa, it is the largest predicted glycosyltransferase encoded in the

17 mycobacterial genome. It is described as being involved in either α-(13) (Škovierová et al., 2009)

18 and/or α-(15) (Alderwick et al., 2018) branching activity, and is thought to add the final few α-

19 linked arabinofuranose residues to AG and LAM (Alderwick et al., 2018) (Figure 1B). Little is

20 known about both the function and structure of AftD, although its large size in comparison to the

21 other arabinofuranosyltransferases (which range from 49 to 118 kDa) has raised the question of

22 whether it has additional functions beyond catalyzing glycosidic bond formation (Škovierová et al.,

23 2009).

24

5

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.22.885152; this version posted December 23, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

1 To better understand the function of this enzyme, we determined the 2.9 Å structure of recombinant

2 AftD from M. abscessus produced in E. coli using single-particle cryogenic electron microscopy

3 (cryo-EM). We analyzed its in vitro carbohydrate binding capability via glycan array analysis and

4 we constructed a conditional aftD deletion mutant in M. smegmatis to interrogate the functionality

5 of AftD mutants in vivo. Unexpectedly, analysis of the structure revealed that AftD is tightly bound

6 to an acyl carrier protein (ACP). The physiological relevance of this interaction was confirmed in

7 M. smegmatis through mass spectrometry. Finally, we purified and determined structures of an

8 AftD mutant designed to abrogate ACP binding to 3.4 and 3.5 Å resolution. These structures

9 revealed ordering of a loop at the putative active site, suggesting that the ACP may regulate AftD

10 function.

11

12 RESULTS

13 Phylogenetic Analysis Reveal Existence of AftD Homologs Across

14 Iterative rounds of PSI-BLAST (Altschul and Koonin, 1998) were used to identify homologs of

15 AftD using the amino acid sequence of the protein from M. tuberculosis as a template. This analysis

16 revealed that AftD is found not just in the Corynebacteriales order, but also in 11 other orders of the

17 Actinobacteria class (Ludwig et al., 2015) including Streptomycetales, Propionibacteriales and

18 Frankiales (Figure S1). The homologs all have a similar predicted topology, computed using the

19 TMHMM server (Krogh et al., 2001), with an N-terminal TM region, followed by a long

20 periplasmic region, and a shorter C-terminal TM region. The average length of all AftD proteins is

21 1381 amino acids. Most sequences (including the mycobacterial ones) cluster around this length,

22 while AftD homologs in corynebacteria are significantly shorter with an average length of 1070

23 residues (Figure S2A). Excluding corynebacteria, the average length of AftD from the remaining

24 genera is 1422 residues.

6

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.22.885152; this version posted December 23, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

1

2 Conditional Deletion of the M. smegmatis aftD Gene Results in Altered Cell Phenotype

3 As AftD has been reported to be an essential enzyme in M. tuberculosis (DeJesus et al., 2017) and

4 in M. smegmatis (Škovierová et al., 2009), a conditional knock-out was designed to interrogate the

5 function of AftD in M. smegmatis and provide a system for structure-based mutagenesis studies.

6 First, we constructed the M. smegmatis ML2217 strain (WT), which constitutively produces the

7 reverse Tet repressor TetR38. Then, the chromosomal aftD gene in M. smegmatis ML2217 was

8 placed under the control of the tetO-4C5G operator (Kim et al., 2013), which is constitutively

9 transcribed in the absence of tetracycline (Figure S2B). In the presence of tetracycline, the TetR38

10 binds to tetO-4C5G, repressing transcription of the aftD gene. To our knowledge this is the first

11 conditional AftD knock-out in mycobacteria that can be readily controlled in a temperature-

12 independent fashion.

13

14 Phenotypic analyses of wild-type M. smegmatis (WT, ML2217) and the conditional aftD knock-out

15 mutant (cKO, ML2218) were performed in medium with and without tetracycline (Figure 1). When

16 grown in 7H9 medium without tetracycline, both WT and cKO strains had similar growth curves,

17 reaching an OD600nm of 2.5 after about 32 hours (Figure 1C). When 20 ng/mL of tetracycline was

18 added to the growth media, OD600nm failed to reach 0.5 after 48 hours (Figure 1C), confirming that

19 aftD is indeed essential for growth of M. smegmatis under those conditions. In the presence of

20 tetracycline to silence aftD expression, cKO cells readily aggregated under low shaking speed,

21 suggesting the properties of the cell surface have changed (Figure 1E). The colony morphology of

22 the conditional aftD mutant was also altered, appearing less rough than the controls (Figure 1F).

23 When imaged by scanning electron microscopy, cells from the cKO strain appeared shorter in the

24 presence of tetracycline (Figure 1G). Indeed, this observation was corroborated by measurements of

7

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.22.885152; this version posted December 23, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

1 the cell length and width, showing a statistically significant reduction in cell length of the cKO

2 strain in the presence (1.9 ± 0.6 μm) versus the absence (2.5 ± 0.8 μm) of tetracycline, while the

3 widths remained the same (Figure 1H).

4

5 Genomic Expansion and Complementation of the M. abscessus aftD Gene

6 To study the AftD function in vitro and determine its structure, we adopted a structural genomics

7 approach (Bruni and Kloss, 2013; Love et al., 2010; Mancia and Love, 2010) to identify homologs

8 that yielded high expression levels in E. coli and were stable in detergents compatible with structure

9 determination (see Methods). High-throughput small-scale expression and purification screens of

10 AftD from 14 mycobacterial species revealed that under our experimental conditions only the M.

11 abscessus homolog was expressed and purified from E. coli in quantities detectable in Coomassie-

12 stained protein gels. This homolog has a high sequence identity of 62% and 70% with the M.

13 tuberculosis and M. smegmatis AftDs, respectively.

14

15 To confirm the in vivo functionality of M. abscessus AftD, we complemented the cKO with the M.

16 abscessus aftD gene. Initial experiments using tetracycline or acetamidase promoters to express the

17 aftD gene resulted in cell aggregation, suggesting altered cell surface properties of M. smegmatis,

18 probably resulting from aftD overexpression. However, an expression plasmid containing the M.

19 abscessus aftD gene transcribed from its endogenous promoter (located in a 200 base pair upstream

20 fragment) fully restored wild-type growth of the cKO strain in the presence of tetracycline (Figure

21 1D). From these results we could conclude that M. abscessus AftD is functional in M. smegmatis,

22 and that the phenotype observed in the conditional M. smegmatis mutant is indeed due to the

23 depletion of the M. smegmatis AftD protein.

24

8

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.22.885152; this version posted December 23, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

1 Structure Determination Using Cryo-EM

2 We determined the structure of the full-length M. abscessus apo-state AftD using single-particle

3 cryo-EM. Despite substantial efforts to optimize the over-expression and purification of AftD in E.

4 coli, final yields after purification were low: less than 100 μg of protein could be isolated from 9.6 L

5 of cell culture. AftD was successfully incorporated into lipid-filled nanodiscs after a two-step

6 purification in detergent (Figures S3A and S3B). To generate multiple cryo-EM grids of the sample

7 at various conditions for screening and data collection despite the low yield, a picoliter dispensing

8 robot (Spotiton) was used to vitrify the sample (Dandey et al., 2018; Jain et al., 2012; Razinkov et

9 al., 2016). Data were collected on a Titan Krios microscope equipped with a Gatan K2 Summit

10 camera fitted with an energy filter. 7,274 micrographs were collected and 889,985 particles were

11 initially processed (Figure S3C and Table S1). After data processing, the final map was produced

12 from 490,616 particles, resulting in a reconstruction with an overall resolution of 2.9 Å (Figure S3).

13 An atomic model for the entire protein was built, with the exception of both termini (residues 1 and

14 1392-1410) and some disordered loop regions (253-259, 270-286, 330-356, 390-401, 1281-1285,

15 1358-1367) (Figure 2). The region represented by residues 950-1090 was flexible; signal

16 subtraction followed by focused classification of this part (Bai et al., 2015) (see Methods) improved

17 the map quality to allow for better model building. The final refined model exhibits good

18 stereochemistry and correlates well with the density map (Figure S4 and Table S1).

19

20 Architecture of AftD

21 AftD consists of a membrane embedded portion with 16 TM helices and a soluble portion that

22 resides on the periplasmic side (Hoffmann et al., 2008) of the inner membrane (Figure 2A). Both

23 termini of AftD project into the cytoplasm. Starting from the N-terminus, the first 11 TM helices

24 form a typical GT-C glycosyltransferase fold (Figures 2A and 2B) (Liu and Mushegian, 2003).

9

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.22.885152; this version posted December 23, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

1 Thereafter, the polypeptide chain enters the periplasm. The soluble, periplasmic portion of AftD is

2 large, accounting for ~60% of the mass of the protein, and is made up of five distinct domains: a

3 of mixed α/β folds that is homologous to other solved GT-C structures (discussed below),

4 three carbohydrate binding modules (CBMs) and one structural β-sandwich domain (Figure 2C).

5 Though these domains are discrete, they have extensive inter-domain interactions that maintain

6 them tightly associated, including some secondary structural elements: β4 from the soluble GT-C

7 domain forms a beta-sheet with β29 and β30 of CBM3, β7 of CBM1 forms a beta-sheet with β16 of

8 CBM2 (Figure 2C). After the soluble domains, the polypeptide chain enters back into the membrane

9 via a single TM helix (TM12) that connects via a long loop to TMs 13-16 and the C-terminus as a

10 separate membrane-bound domain from the N-terminal TM helices (Figure 2C).

11

12 AftD Has a Conserved Glycosyltransferase GT-C Fold

13 Using the Dali server (Holm and Laakso, 2016), atomic structures with structural similarity to AftD

14 were retrieved and top hits corresponded to all full-length GT-C structures deposited in the

15 database: yeast oligosaccharyltransferase STT3 (PDB ID: 6EZN, 6C26) (Bai et al., 2018; Wild et

16 al., 2018), Archaeoglobus fulgidus oligosaccharyltransferase AglB (PDB ID: 3WAJ, 5GMY,

17 3WAK) (Matsumoto et al., 2013; Matsumoto et al., 2017), Campylobacter lari

18 oligosaccharyltransferase PglB (PDB ID: 3RCE, 5OGL, 6GXC) (Lizak et al., 2011; Napiórkowska

19 et al., 2018; Napiórkowska et al., 2017), Cupriavidus metallidurans aminoarabinose-transferase

20 ArnT (PDB ID: 5EZM, 5F15) (Petrou et al., 2016) and recently yeast mannosyltransferase Pmt1-

21 Pmt2 (PDB ID: 6P2R, 6P25) (Bai et al., 2019). The first ~600 residues of AftD correspond to this

22 homologous GT-C fold, and will subsequently be referred to as the GT-C super-domain of AftD

23 (Figures 2C and S5C). This GT-C super-domain comprises a total of 11 TM helices (colored blue in

24 Figure 2B) – two less than all the other solved GT-Cs (STT3, AglB, PglB and ArnT). The two

10

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.22.885152; this version posted December 23, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

1 missing TM helices are also not conserved amongst GT-Cs (Petrou et al., 2016) and correspond to

2 TM helix 8 for STT3 (Figure S5D), helix 8 and 9 for PglB (Figure S5E), and helix 12 and 13 for

3 ArnT (Figure S5F). The remaining 11 TM helices are structurally homologous to the other solved

4 GT-Cs in several aspects: 1) They are in the same arrangement in AftD as seen in the other four

5 solved GT-C structures; 2) There are two periplasmic juxtamembrane (JM) helices (JM1 and JM2)

6 between TM1 and TM2; 3) TM3 is kinked by a proline and broken into two smaller helices; 4) TM7

7 extends beyond the membrane into the periplasm and is joined by a longer loop to TM8. A short

8 alpha helical JM helix (JM3) is present in the loop; 5) A soluble periplasmic domain is always

9 present, composed of an α/β/α arrangement (Figure 2C).

10

11 Putative Active Site and Substrate Binding Pocket

12 The postulated substrates for AftD are the donor DPA (Alderwick et al., 2007) and a poly-arabinose

13 acceptor that is part of the growing AG or LAM (Škovierová et al., 2009) (Figure 1B). Although the

14 structure of AftD was solved in the apo state, the high degree of homology as described above

15 between AftD’s GT-C super-domain and the other solved GT-C glycosyltransferases allows for

16 inference of mechanistic insights. In the GT-C super-domain, a large cavity is present between the

17 periplasmic domain and the periplasm-facing ends of TM4, 6, 7, 9 and 10, with a volume of ~1240

18 Å3 (Figure 3A). Using the unique AftD sequences of Actinobacteria class, sequence conservation

19 was mapped onto the structure of AftD to reveal that this cavity contains a number of highly

20 conserved residues, including E251, D474 and E475, which are the most plausible catalytic residue

21 candidates. When representative eukaryotic (STT3) and bacterial (PglB) structures were aligned

22 with the GT-C super-domain, the active sites of both structures are in the same area as that of AftD

23 (Figure 3B). Other GT-C glycosyltransferases like PglB require the use of divalent metal cations to

24 function (Sharma et al., 1981), and AftD seems to have a putative cation binding pocket (Figures

11

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.22.885152; this version posted December 23, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

1 3A and 3B, red dotted line circle). Interestingly, the charge around the putative active site for AftD

2 is slightly positive, compared to slightly negative for STT3 and PglB. This is consistent with the

3 different acceptor substrates of these proteins: STT3 has preference for peptides with aromatic

4 residues at the -2 position (Wild et al., 2018), PglB has preference for negatively charged side chain

5 at the -2 position (Napiórkowska et al., 2017), while AftD binds to a poly-arabinofuranose glycan

6 moiety (which can form ion-dipole interactions with positively charged residues).

7

8 A shallow groove along TMs 6, 9 and 13 of AftD starts at the cytoplasmic side of the membrane

9 and ends at the putative active site, and is likely to accommodate the putative DPA donor (Figure

10 S5B). A diacyl-phospholipid (modeled as phosphatidylethanolamine) can be resolved in the density

11 map near the cytoplasmic region. There is moderate conservation in this region, and it is also

12 homologous to the polyprenol binding pockets of STT3 (Figure S5D), PglB (Figure S5E) and ArnT

13 (Figure S5F), making this site the likely binding pocket for the lipidic section of DPA. In

14 comparison with the locations of lipidic donor substrates in the other homologs, it seems that

15 although the general binding area is conserved, the exact identities of the helices that form the

16 binding site are likely to differ.

17

18 There are three long loops (N-EL4 between TM 7-JM 3, C-EL4 between JM 3-TM 8 and EL5

19 between TM 9-10) that do not have clear density present, and consist of 7, 17 and 27 residues,

20 respectively (Figures 2C and 3A). Given that the structure of AftD was solved in the apo-state,

21 these unresolved loops could be analogous to those of other glycosyltransferases that span across

22 the active site and become ordered only upon ligand binding. Indeed, superimposition against the

23 structures of STT3 (Figure S5D), PglB (Figure S5E) and ArnT (Figure S5F) suggests that the STT3

24 EL5, PglB EL5 and ArnT PL4 loops are analogous to the AftD EL4 loop, as they all extend from

12

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.22.885152; this version posted December 23, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

1 analogous TM helices. In contrast, AftD EL5 seems to be unique to this enzyme (Figures S5D, S5E

2 and S5F). A number of conserved residues are present on both the loops, suggesting these segments

3 are likely to be important for AftD’s enzymatic function. In particular, W272 is 100% conserved

4 and D351 is moderately conserved (Figure 3A).

5

6 We designed a series of mutants based on the apo structure of AftD to examine structure-based

7 functional hypotheses. Mutations around the putative active site were designed to target highly

8 conserved residues that could be involved in substrate binding or catalysis (Figure 3C). These

9 residues (K39, D41, E251, K361, D474) were mutated to alanine. All mutated AftD proteins were

10 expressed in E. coli and purified using non-ionic detergents to yields comparable to WT, suggesting

11 that the proteins produced were properly folded (Figure S6). All of these aftD mutations were also

12 cloned in the mycobacterial expression vector to examine whether these mutated aftD genes

13 restored WT growth of the conditional M. smegmatis aftD mutant with tetracycline. None of these

14 AftD mutants were capable to complement the growth defect of the conditional M. smegmatis aftD

15 mutant under non-permissive conditions. These results suggest that these residues play essential

16 roles for proper functioning of AftD (Figures 3D and 3E).

17

18 Carbohydrate Binding Modules of the AftD Soluble Domain Help Bind Complex

19 Arabinofuranose Chains

20 In addition to the GT-C super-domain, AftD has a large soluble domain consisting of three

21 carbohydrate binding modules (CBMs) that exhibit beta-sandwich folds (Figure 4A). There is an

22 unusual structural motif of four consecutive prolines (residues 863-866) on CBM2, which appears

23 to serve a structural role by making hydrophobic interactions with six other neighboring prolines

24 from within CBM2 and also from a region between CBM1 and the GT-C soluble domain (Figure

13

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.22.885152; this version posted December 23, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

1 S5A). The structure shows that two CBMs (CBM1 and CBM3) bind an ion that is likely calcium,

2 based on comparison with other structural homologs (PDB ID: 1UYY and 4A45). Notably, CBM3

3 is more flexible than the other two CBMs, as evidenced by the smearing of the cryo-EM density at

4 CBM3 before focused classification.

5

6 Structural homologs for each of the CBMs were identified using the Dali server and the highest

7 scoring structural homolog with a ligand is shown in Figure 4A to illustrate putative glycan binding

8 sites. The putative glycan binding sites of CBM1 and 3 have an extensive network of polar (S, T, N,

9 Q) and charged (D, E, R, K) residues that can form ion-dipole or hydrogen bonds with the

10 carbohydrate ligand (Figure 4B). In addition, CBM1 and CBM3 have aromatic side chains (W757,

11 W818 for CBM1 and F1066 for CBM3) that can facilitate hydrophobic stacking interactions with

12 the face of carbohydrate rings, an important mechanism of binding in CBMs (Boraston et al., 2004).

13 These findings suggest that CBM1 and CBM3 are likely carbohydrate binding modules. CBM2, on

14 the other hand, has only four polar/charged and no aromatic residues at the putative binding site

15 indicating that it binds carbohydrates only weakly or not at all.

16

17 The relative locations of these three CBMs are quite far from each other, between 26 to 42 Å apart

18 (Figure 4C). All three CBMs are in turn 30-36 Å away from the edge of the putative active site, and

19 they fan out in different directions from it. The length of one arabinofuranose residue in an extended

20 conformation is about 3.6 Å. This arrangement of the CBMs suggests that AftD may have evolved

21 to bind long (>10 units) and possibly non-linear arabinan chains.

22

23 To test the above hypothesis, we probed the carbohydrate-binding ability of the protein in vitro by

24 screening purified AftD against an array of mycobacterial glycan fragments (Zheng et al., 2017a).

14

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.22.885152; this version posted December 23, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

1 AftD appears to have specific binding to long and branched arabinofuranose chains (Figure 4D).

2 Notably, the arabinofuranose glycans to which AftD binds have 10 or more arabinofuranose

3 residues. Conversely, AftD appears not to bind to other mycobacterial glycans on the array:

4 glycopeptidolipid, phosphatidyl-myo-inositol mannoside (PIM), phenolic glycolipids (PGL),

5 trehalose mycolates and lipooligosaccharides (LOS), and capsular α glucans, with the exception of

6 tetra-mannose which, surprisingly, appears to bind weakly (Figure 4D). These results corroborate

7 both our structural hypothesis and the previously predicted function of AftD acting as a late

8 arabinosyltransferase in the biosynthesis of LAM/AG (Alderwick et al., 2018; Škovierová et al.,

9 2009).

10

11 AftD is Complexed with Acyl Carrier Protein

12 At the C-terminus of the soluble domains of AftD, the protein chain re-enters into the membrane via

13 TM 12, followed by an 18 residue loop that arches over TMs 5, 6, 7, 9 and 10 to reach TM 13

14 (Figures 2C and 5A). TMs 13-16 then form a four-helical bundle that interfaces with TM 8 and 9,

15 which is unique to AftD in comparison to other Aft/Emb enzymes. This four-helical bundle shows a

16 contiguous density adjoining the cytoplasmic side, consistent with the presence of an associated

17 protein (Figure 5A). Using mass spectrometry, this additional component was identified as the acyl

18 carrier protein (ACP) from E. coli, and the ACP crystal structure (PDB ID: 1T8K) was readily

19 accommodated in the density (Figure 5A). The density map for ACP was of lower resolution (~4-5

20 Å, Figure S3F) than for AftD, likely due to conformational flexibility. Indeed, focused classification

21 of the ACP portion revealed multiple low-resolution conformational states. Notably, there were no

22 classes without density for ACP, consistent with the tight association of the two proteins during

23 expression and purification.

24

15

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.22.885152; this version posted December 23, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

1 The negatively charged surface of E. coli ACP, which has an isoelectric point around 4, interacts

2 with a positive patch of AftD comprising the arginine residues R310, R315 and R1389 (Figures 5B,

3 5C and 5D). ACP is found in its holo-form with 4’-phosphopantetheine covalently linked via a

4 phosphodiester bond to S36 (Majerus et al., 1965). This prosthetic group can either be buried within

5 the ACP hydrophobic tunnel-like cavity in the core of the four helices or extended towards its

6 cognate protein (Cronan, 2014). The coulombic potential map shows that the 16 Å long 4’-

7 phosphopantetheine fills the hydrophobic pocket of the four-helical bundle (TMs 13-16) of AftD

8 (Figures 5C and 5D). Its C-terminal thiol group points towards the interspace between TM15 and

9 TM16. Moreover, 4’-phosphopantetheine not only forms extensive hydrophobic interactions with

10 AftD residues L1332, V1335 and F1324, but also establishes hydrogen bonds between its O33 and

11 the backbone carbonyl of A1386, and its phosphate group with the side-chain of R315, a highly

12 conserved residue among AftDs (Figure 5D).

13

14 To determine the physiological relevance of the ACP-AftD interaction, we purified M. abscessus

15 AftD from a M. smegmatis mycobacterial expression system. Mass spectrometry analysis of the

16 purified protein revealed the presence of M. smegmatis meromycolate extension acyl carrier protein

17 AcpM (UniProtKB accession: ACPM_MYCS2, Table S2), a 99 residue protein that has 39%

18 sequence identity with E. coli ACP. AcpM also has a negatively charged surface (PDB ID: 1KLP,

19 Figure 5B insert) like the E. coli ACP (Wong et al., 2002). M. smegmatis has another ACP

20 (UniProtKB accession: A0QUA2_MYCS2) which shares around 40% sequence identity with

21 AcpM, and it was not detected via mass spectrometry. These results indicate that the presence of

22 ACP in the AftD structure was not an artefact due to the overexpression system in E. coli.

23

16

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.22.885152; this version posted December 23, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

1 Five mutants in the region of AftD that interacts with ACP were also generated with the aim of

2 disrupting ACP binding. These were successfully purified in E. coli (Figure S6) and cloned into the

3 mycobacterial expression vector. To investigate the functional relevance of the ACP-AftD

4 interaction, we examined whether these mutant genes restored WT growth of the conditional M.

5 smegmatis aftD mutant with tetracycline. The insertion of a stop codon at T1285 produced a C-

6 terminal truncation that eliminated the four-helical ACP-binding bundle all together (Figure 5F).

7 This truncated protein resulted in a growth defect akin to the empty plasmid, suggesting that either

8 the four-helical bundle of the enzyme per se, or its interaction with ACP is important for AftD

9 function (Figures 5G and 5H). Mutations were also made to three arginine residues (R310A,

10 R315A, R1389S) to abolish (by mutations to alanine) or reduce (by mutations to serine) the charge

11 interactions between the protein part of ACP and AftD (Figure 5E). R310A and R315A mutants

12 fully restored growth of the conditional M. smegmatis aftD mutant in the presence of tetracycline,

13 indicating these residues were not important for proper AftD function (Figures 5G and 5H). By

14 contrast, R1389S showed a severe growth defect, consistent with a functional role of this residue

15 (Figures 5G and 5H). Another mutation was introduced at G1339 to add a bulky tryptophan to

16 prevent the binding of 4’-phosphopantetheine to AftD’s four-helical bundle (Figure 5E). This

17 mutant only partially rescued growth of the M. smegmatis aftD cKO in the presence of tetracycline,

18 suggesting that G1339 is also important for the function of AftD (Figures 5G and 5H).

19

20 To further characterize the two classes of AftD mutants that were made (targeting putative active

21 site versus ACP binding), we performed microplate appearance (Figure 6A) and colony morphology

22 assays (Figure 6B) on a representative mutant from each class: D474A (putative active site) and

23 R1389S (ACP binding). Both mutants showed similar phenotypes to the cKO (Figures 1E and 1F)

17

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.22.885152; this version posted December 23, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

1 and to each other, suggesting that disruption of ACP binding may be linked to loss of enzymatic

2 function.

3

4 The Structure of AftD-R1389S Mutant Shows Ordering of EL5 Loop with Disordering of

5 ACP

6 To investigate how ACP binding might be correlated to enzyme activity, we determined the

7 structure of AftD-R1389S mutant using single-particle cryo-EM (Figures 6 and S7). The structure

8 determination pipeline was similar to that described for wild-type AftD, with two notable

9 exceptions: 1) TEV cleavage was not performed as the mutation resulted in TEV cleavage being

10 very inefficient (Figures S7A and S7B); and 2) AftD-R1389S was vitrified on gold grids using the

11 Leica GP, using conditions optimized earlier with Spotiton (Figure S7C). 4,886 micrographs were

12 collected and 226,478 particles were initially processed following data collection on a Titan Krios

13 equipped with a Gatan K2 Summit camera fitted with an energy filter (Table S1, Figures S7C and

14 S7D). After data processing, the consensus reconstruction with an overall resolution of 3.4 Å was

15 obtained from 150,978 final particles (Figure S7E). 2D class averages of the dataset revealed a

16 fuzzy but identifiable density for the ACP – in comparison, the WT 2D class average showed a very

17 distinct density with sharp features for the ACP (Figure 6C). This suggests that the mutation did not

18 fully abrogate ACP binding, but caused it to be weakly associated with AftD-R1389S and hence

19 highly flexible. This was corroborated by mass spectrometry analysis of the AftD bands from the

20 SDS-PAGE gel (Figure S7B).

21

22 When inspecting the consensus reconstruction of AftD-R1389S, a distinct density near the active

23 site that was not seen in the wild-type structure was observed (Figure S7E). Signal subtraction

24 followed by focused classification was performed to resolve the heterogeneity, which produced two

18

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.22.885152; this version posted December 23, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

1 high resolution maps of AftD: class 1 at 3.5 Å and class 2 at 3.4 Å (Figures 6D, S7E, S7F and S7G).

2 For both these classes, 2D class averages (Figure 6C) had fuzzy density for the ACP, and both of

3 them also showed worse resolution for the last four TMs at ~5 Å (Figure S7F). When the maps were

4 low pass filtered to 4 Å and compared at similar display thresholds, TM14 was the most disordered

5 helix and its density was hardly seen compared to the other helices (Figure 6D). The linker between

6 TM14 and TM15 was also poorly ordered. No density was observed where the ACP bound to the

7 WT structure, and no density for the 4’-phosphopantetheine was apparent in both AftD-R1389S

8 classes (Figure 6D).

9

10 Between the two classes, the most notable difference was in the putative active site region (Figure

11 6F). Here, the class 2 structure highly resembles the WT structure. However, in class 1, there

12 appears to be a significant conformational change. Firstly, the previously disordered EL5 loop

13 between TM 9 and 10 is now resolved, and physically occludes the putative active site. Secondly,

14 the previously disordered N- and C-EL4 loop between TM 7 and 8 still remain poorly ordered, but

15 the JM3 now appears to be disordered with poor map density (Figure 6E). In comparison, JM3 is

16 still present in class 2 (Figure 6E).

17

18 An atomic model was built into the resolved EL5 loop density, resulting in a U-shaped structure

19 connecting TMs 9 and 10 (Figure 6F). Out of the residues present in the loop, R357 is absolutely

20 conserved. Two prolines, P340 and P355, are responsible for the sharp turns in the loop structure,

21 and both also have a high degree of conservation (85.1% and 93.1% respectively) (Figure 6F).

22

19

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.22.885152; this version posted December 23, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

1 DISCUSSION

2 The Conditional aftD Mutant in M. smegmatis Provides a System to Examine Structure-

3 Function Relationships of AftD

4 In this study, we constructed a tetracycline-inducible conditional aftD mutant in M. smegmatis and

5 showed that AftD is essential in M. smegmatis as reported previously (Škovierová et al., 2009) in

6 contrast to closely related corynebacteria which produce a much shorter AftD protein (Figure S2A)

7 (Alderwick et al., 2018). We observed that AftD depletion increased cell clumping and altered the

8 colony morphology of M. smegmatis (Figure 1). Both phenotypes are consistent with a change of

9 the cell surface properties and the function of AftD in adding to long, branched arabinan as shown

10 in this study (Figure 4D). The branched arabinan chain forms an integral part of the arabinogalactan

11 polymer and of the attached outer membrane (Hoffmann et al., 2008), which is essential in

12 mycobacteria, but not in corynebacteria (Portevin et al., 2004). Presumably, interfering with the

13 arabinogalactan synthesis disturbs the proper assembly of the outer membrane and its properties,

14 thus explaining the observed phenotypes. AftD depletion resulted in shorter M. smegmatis cells in

15 contrast to the phenotype of a previous report which utilized a temperature-sensitive rescue plasmid

16 to deplete AftD from M. smegmatis at 42 °C (Škovierová et al., 2009). This discrepancy might

17 result from the non-physiological temperature in the latter experiment which is known to trigger a

18 heat shock response in mycobacteria altering expression of many genes (Stewart et al., 2002) and

19 protein properties. By contrast, our conditional mutant enables for the first time to examine the

20 function of AftD in mycobacteria under physiological conditions.

21

22 The Full-Length Structure of a Mycobacterial Membrane Glycosyltransferase

23 The GT-C membrane glycosyltransferase family (Liu and Mushegian, 2003) is underrepresented in

24 the protein databank (PDB), with currently 13 structures of just five different GT-C members (Bai

20

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.22.885152; this version posted December 23, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

1 et al., 2019; Bai et al., 2018; Matsumoto et al., 2013; Matsumoto et al., 2017; Napiórkowska et al.,

2 2018; Napiórkowska et al., 2017; Petrou et al., 2016; Wild et al., 2018). None of these previous

3 structures are of mycobacterial GT-C glycosyltransferases. The AftD structure presented herein is

4 the first full-length structure of this family, and the first determined by cryo-EM (Figure 2). The

5 high degree of structural conservation of AftD with other GT-C structures across all three domains

6 of life indicates a remarkable retention of the fold across evolutionary time and also points to the

7 likely existence of a common ancestral GT-C fold (Figure 4C). However, the structural

8 conservation is not uniform – the TM regions are marginally better conserved than the soluble GT-

9 C domain, an observation also predicted previously (Lairson et al., 2008). The variability of the

10 soluble domain is likely due to variations in substrates of each GT-C: for the four GT-C members

11 where atomic structures are available, their substrates are either lipid-linked glycans and peptides

12 (STT3, PglB, AglB and Pmt1-Pmt2), or lipid-linked glycans and lipid A (ArnT). AftD differs from

13 these as its acceptor is thought to be a complex glycan (Figure 4D), providing an explanation for the

14 fact that it has three additional CBM domains attached to its soluble GT-C domain.

15

16 Arabinosyltransferase Function of AftD

17 Structure-based alignment with the other GT-C glycosyltransferases of known structure, allowed us

18 to locate the putative active site of AftD in a cavity between the membrane-bound TM and soluble

19 GT-C domain (Figure 2). We confirmed, by mutagenesis studies, that five of the conserved residues

20 (K39, D41, E251, K361, D474) around the putative active site are essential for function and we thus

21 postulate that these are involved in either catalysis or substrate binding. For substrate binding, these

22 residues need to interact either with the phospho-arabinofuranose moiety of DPA, terminal

23 arabinofuranose(s) of arabinogalactan/lipoarabinomannan, or the divalent cation likely required for

24 catalysis (Sharma et al., 1981). The map shows poorly resolved densities for the flexible EL4 loop

21

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.22.885152; this version posted December 23, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

1 around this site for apo-AftD, and we expect this region to become ordered to assist in substrate

2 binding, as shown to occur in PglB and ArnT (Napiórkowska et al., 2017; Petrou et al., 2016). N-

3 EL4 should become ordered to trap DPA (State 2’ in Figure 7), being closest to the putative DPA

4 binding site (Figures S5B, S5D, S5E and S5F). On the other hand, C-EL4 is well-positioned to

5 become ordered when the growing end of the arabinan acceptor enters the active site (State 2 in

6 Figure 7). In our structure, no divalent cation density around the putative active site was observed.

7 In analogy with PglB, we predict that substrate binding and ordering of the EL4 loop might be

8 required to trap a catalytic cation (Napiórkowska et al., 2017).

9

10 We identified three putative CBMs in the periplasmic domain, and showed the possible glycan

11 binding interfaces and residues by comparison to other structurally homologous CBMs for which

12 ligand-bound structures are available (Figure 4). The sizeable distances between the AftD CBMs

13 (26 to 42 Å, Figure 4C) suggests the binding of a large, complex glycan substrate, which

14 corroborates previous work placing AftD as a late stage glycosyltransferase that acts near the end of

15 the synthesis pathway for the arabinans in both LAM and AG, (Alderwick et al., 2018; Škovierová

16 et al., 2009). As observed previously in nature, an example being cellulase Cel5A (Boraston et al.,

17 2003), which has two CBMs, CBM multivalency carries two advantages: 1) modules work together

18 cooperatively to result in stronger binding than individual CBMs, and 2) each module recognizes

19 different regions of the substrate, thus improving binding specificity. In AftD, these domains, which

20 rise 70 Å up from the membrane (Figure 2A), are likely to bind specifically to different portions of

21 AG/LAM. The glycan array result supports this hypothesis: branched arabinofuranose glycans >10

22 residues in length, which are too large to be bound by a single CBM, appear to show the most

23 convincing binding profiles (Figure 4D). The fact that both elongating AG and LAM are restrained

24 in one dimension, being anchored via a polyprenol-pyrophosphate in the periplasmic membrane

22

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.22.885152; this version posted December 23, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

1 (Alderwick et al., 2015), and to either the periplasmic membrane through phosphoinositol

2 mannoside (Kordulakova et al., 2002; Lea-Smith et al., 2008; Morita et al., 2006) or possibly also to

3 the outer mycolate membrane (Alsteens et al., 2008), respectively, will likely significantly impact

4 binding affinities.

5

6 Combining our results with the knowledge derived from other GT-C membrane

7 glycosyltransferases (Napiórkowska et al., 2017; Petrou et al., 2016) studied both structurally and

8 functionally, we propose the following putative enzymatic cycle (Figure 7). The solved AftD

9 structure represents the apo-state, bearing disordered N-EL4 and C-EL4 regions (State 1 in Figure

10 7). Binding of the substrates will result in ordering of N-EL4 and C-EL4, respectively, stabilization

11 of the substrate-enzyme complex (State 2 or 2’ in Figure 7), and priming of the system for catalysis

12 to occur (State 3 in Figure 7). Thereafter the product is released and the active site is re-primed.

13 This would reset AftD back to the apo-state, readying it for another catalytic cycle.

14

15 Further understanding of the order of substrate binding and of the precise structural rearrangements

16 that occur in the active site will require structures of substrate bound of AftD. Attempts to obtain

17 such structures have been made – we have tried adding various glycan moieties as indicated from

18 our glycan array results, decaprenyl-phosphate and an analog of DPA (Joe and Lowary, 2006) to

19 obtain a substrate bound conformation of AftD. No ligand density was observed in these

20 preliminary structures, possibly due to the transient nature of binding of such natural ligands,

21 technical difficulties in substrate incorporation, or presence of the E. coli (non-native) ACP that

22 might be inhibitory. An alternative strategy might involve the chemical synthesis of substrate

23 analogues or inhibitors of AftD that have higher binding affinity than the endogenous counterparts.

23

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.22.885152; this version posted December 23, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

1 Such an approach has previously yielded positive results for PglB (Napiórkowska et al., 2017) and

2 STT3 (Bai et al., 2018; Wild et al., 2018).

3

4 Role of the Acyl Carrier Protein in AftD

5 In the structure of AftD, an E. coli ACP was so tightly bound to the cytoplasmic face of the enzyme

6 at a C-terminal four-helical bundle, that it remained associated during purification (Figure 5A). The

7 tight binding of the ACP is quite remarkable because complexes of ACP with partner proteins have

8 typically been difficult to observe due to the transient and weak nature of their interactions (Cronan,

9 2014). Indeed, some of these complexes could only be obtained via crosslinking experiments

10 (Tallorin et al., 2016). Neither the ACP nor the four-helical bundle it associates with is present in

11 other GT-C glycosyltransferases, for which structural information is available. This interaction was

12 found to be: 1) physiological, as AftD purified from M. smegmatis also remained associated with

13 the M. smegmatis ACP homolog AcpM (Table S2), and 2) likely essential for function, as AftD

14 variants genetically engineered to disrupt AcpM binding did not rescue the growth defect of AftD-

15 depleted M. smegmatis (Figure 5H).

16

17 Structurally, mycobacterial ACP is very similar to its E. coli counterpart, with helix II (the

18 recognition helix) being the most conserved (Wong et al., 2002). The only notable difference is the

19 presence of a ~35 residue extension at the C-terminus, which is highly flexible and is thought to

20 interact with long chain intermediates of mycolic biosynthesis or mediate specific protein-protein

21 interactions (Figure 5B) (Schaeffer et al., 2001). The highly conserved helix II contains the serine

22 36 residue, which is located within a signature motif of DSL/X-hydrophobic residue and is modified

23 by a 4’-phosphopantetheine that interacts with AftD in our structure. The high degree of

24 conservation of ACP across species is the likely reason why we were able to isolate an ACP-AftD

24

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.22.885152; this version posted December 23, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

1 complex from a heterologous expression system. In contrast to E. coli, the genome of M. smegmatis

2 encodes two different ACPs. The AcpM variant is involved in both fatty acid (Kremer et al., 2001)

3 and mycolic acid synthesis (Zimhony et al., 2015).

4

5 As no further electron density was observed beyond the terminal thiol group of the 4’-

6 phosphopantetheine, it is not clear whether the ACP has no cargo bound, a disordered cargo, or a

7 variety of different cargos in different conformations that average out during cryo-EM analysis. In

8 AftD, the ACP could be serving as a shuttle for starting materials and intermediates throughout the

9 fatty acid biosynthetic pathway. In E. coli, ACP interacts with at least twelve enzymes involved in

10 fatty acid biosynthesis, as well as with seven other enzymes from disparate biosynthetic pathways

11 (Nguyen et al., 2014). In mycobacteria, the ACP also functions as a shuttle for mycolic acid

12 biosynthesis (Takayama et al., 2005). Intriguingly, ACP has been reported to be essential for an E.

13 coli enzyme catalyzing the synthesis of β-(12)-linked glucan in membrane-derived

14 oligosaccharides from UDP-glucose, although the 4’-phosphopantetheine moiety was not needed

15 (Therisod and Kennedy, 1987).

16

17 The elucidation of the structures of mutant AftD-R1389S (Figures 6 and S7) pointed to another

18 hypothesis for the presence of the ACP – as a possible regulator of AftD enzyme activity (Figure 7).

19 ACP regulation of catalytic activity has indeed been observed in other proteins like WaaP (Kreamer

20 et al., 2018) and possibly SpoT-like proteins (Battesti and Bouveret, 2009). When the 4’-

21 phosphopantetheine is removed from the four-helical bundle, the helices become more disordered,

22 as evidenced by the lower resolution, especially for TM14 (Figure 6D and State I1 in Figure 7).

23 This, in turn, results in JM3 moving away and the EL5 loop becoming ordered (Figure 6E and State

24 I2 in Figure 7). The ordered EL5 now appears to block the putative active site, possibly preventing

25

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.22.885152; this version posted December 23, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

1 substrates from entering, and thus effectively halting enzymatic function (Figure 6F and State I3 in

2 Figure 7). Out of the two solved dominant structures of AftD-R1389S, class 2 may represent an

3 intermediate state in this conformational change as the four-helical bundle is disordered but JM3 is

4 still present. Class 1, on the other hand, likely represents the final state of this postulated

5 conformational change, with EL5 ordered and JM3 displaced. The fact that a mutation of the

6 enzyme active site (AftD-D474A) results in M. smegmatis that looks phenotypically similar to that

7 of AftD-R1389S is consistent with the observation that both these mutations could abrogate the

8 same function of AftD – presumably its glycosyltransferase activity (Figures 6A and 6B). As the

9 ACP binds the cytoplasmic face of AftD, this could enable the transmission of intracellular signals

10 across the membrane. Given that AftD is the crucial arabinofuranosyltransferase that adds the final

11 α-linked arabinofuranose sugars on AG (Alderwick et al., 2018), inactivating this enzyme could

12 effectively prevent the attachment of mycolic acids and block the proper assembly of the entire

13 outer membrane.

14

15 AftD as a Potential Drug Target

16 AftD is an essential gene in mycobacteria and is found on the inner membrane (Figure 1A), making

17 it an attractive target for the development of novel anti-tubercular antibiotics (Favrot and Ronning,

18 2012). There are no current drugs in the market that are known to target AftD, although there have

19 been reports of mutations on the aftD gene that are associated with drug resistance and

20 susceptibility. The T797A mutation in M. tuberculosis AftD results in resistance to isoniazid (Jena

21 et al., 2016; Shekar et al., 2014), whereas S1080G in M. tuberculosis AftD results in multiple drug

22 resistant forms of the pathogen (MDR and XDR) to become susceptible to amoxicillin and

23 clavulanic acid (Cohen et al., 2016). Both T797 and S1080 are on poorly conserved regions on the

26

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.22.885152; this version posted December 23, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

1 surface of the soluble region of AftD, making their functional or structural relevance difficult to

2 interpret.

3

4 Two additional factors make AftD an attractive potential drug target. Firstly, AftD is well conserved

5 across many other of the Actinobacteria class, which includes other pathogens that cause

6 diseases like ( diphtheria), (M. leprae or M. lepromatosis) and

7 (various Nocardia species). Secondly, the fact that AftD has two potential sites at which

8 to target drugs (the active site and the ACP binding site) that both appear to be essential for

9 function, further contributes to the potential of this protein for structure-based drug design efforts.

10

11 Conclusions

12 Here, we present the first full-length mycobacterial GT-C structure, determined by single-particle

13 cryo-EM to 2.9 Å resolution. The structure of wild-type AftD reveals a conserved GT-C fold and

14 also pinpoints the location of the putative active site. The residues that are likely involved in

15 substrate binding and catalysis were probed through mutagenesis studies using an engineered AftD

16 cKO strain in M. smegmatis. Furthermore, the structure and glycan array analysis show that AftD

17 binds specifically to large, highly branched arabinofuranose residues that are found in AG/LAM.

18 Finally, to the best of our knowledge this is the first report of a glycosyltransferase structure with a

19 tightly associated ACP. This interaction, when disrupted using a specific single mutant (AftD-

20 R1389S), resulted in a series of conformational changes in AftD, as observed in the structures of

21 this mutant. These changes lead to ordering of a loop that blocks the putative active site and could

22 be a mechanism for transmitting the signal from the cytoplasmic face of AftD to the periplasmic

23 active site.

24

27

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.22.885152; this version posted December 23, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

1 AUTHOR CONTRIBUTIONS

2 F.M. & O.B.C. conceived the study. A.L.R., B.K., and J.R. performed the genomics expansion and

3 small scale screening. J.R. did medium scale expression and purification. Y.Z.T. did the large scale

4 expression, purification and negative stain EM. V.P.D., H.W. and Y.Z.T. vitrified the grids. Y.Z.T.

5 collected, processed and analyzed the cryo-EM data. Y.Z.T. built the model with O.B.C’s help.

6 Y.Z.T. did the phylogenetic analysis. L.Z. constructed and characterized the conditional aftD

7 deletion mutant in M. smegmatis and analyzed the function of AftD mutants in M. smegmatis, with

8 M.N.’s supervision. A.M.R. and Y.Z.T. collected the scanning electron micrographs. L.Z., Y.Z.T.

9 and S.L.G. constructed the AftD mutants. R.B.Z. did the glycan array experiments with T.L.L.’s

10 supervision. J.R. and D.A. pulled down AftD from M. smegmatis, with advice from M.J.C. and

11 M.P., under M.A.’s supervision. B.C. & C.P. supervised EM analysis. F.M. supervised the entire

12 project. Y.Z.T. and F.M. wrote the manuscript, with input from all authors.

13

14 ACKNOWLEDGEMENTS

15 We thank Joel Nott (Iowa State University) and Ricardo Gomes (UniMS, ITQB/IBET, Oeiras) for

16 protein mass spectrometry help. We thank Ed Eng, Bill Rice, Laura Kim, Mikhail Kopylov and

17 Kelsey Jordan (New York Structural Biology Center, Simons Electron Microscopy Center) for help

18 with microscope setup. We thank Sargis Dallakyan, Carl Negro, Shaker Krit and Swapnil Bhatkar

19 (New York Structural Biology Center, Simons Electron Microscopy Center) for computation

20 support. We thank Dirk Schnappinger (Weill Cornell Medicine) for providing the components of

21 the reverse TetR system. We thank Jeremie Vendome (Schrödinger) and Dmitry Lyumkis (Salk

22 Institute) for critical reading of the manuscript. We thank Christina Chen for help with nanodisc

23 incorporation, Khuram Ashraf for help with expression optimization, Vasileios Petrou for helpful

24 discussions and Leora Hamberger for her assistance managing the Mancia laboratory (Columbia

28

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.22.885152; this version posted December 23, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

1 University). This work was supported by grants from the NIH (P41 GM103310 to C.S.P., B.C..;

2 R01 GM111980, R35 GM132120 and R21 AI119672 to F.M.), the Agency for Science, Technology

3 and Research Singapore (to Y.Z.T.), the University of Alabama at Birmingham (to M.N.), the

4 Fundação para a Ciência e Tecnologia, Portugal (PD/BD/128261/2016 to J.R.; PTDC/BIA-

5 BQM/30421/2017 and IF/00656/2014 to M.A.; PTDC/BIA-MIC/31233/2017 to M.J.C.), the

6 Simons Foundation (SF349247 to C.S.P., B.C.), NYSTAR (to C.S.P., B.C.) and the Alberta

7 Glycomics Centre (to T.L.L.). Some of the work was performed at the Center for Membrane Protein

8 Production and Analysis (COMPPÅ; P41 GM116799 to Wayne Hendrickson) and at the National

9 Resource for Automated Molecular Microscopy at the Simons Electron Microscopy Center (P41

10 GM103310), both located at the New York Structural Biology Center. M.A. acknowledges

11 MostMicro Research Unit (financially supported by LISBOA-01-0145-FEDER-007660 funded by

12 FEDER funds through COMPETE2020 and by national funds through FCT), and iNOVA4Health

13 (LISBOA-01-0145-FEDER-007344, co-funded by FEDER under PT2020).

14

15 DECLARATION OF INTERESTS

16 The authors declare no financial conflict of interest.

17

29

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.22.885152; this version posted December 23, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

1 Figure 1. Functional analysis of the role of the arabinofuranosyltransferase AftD in the 2 biosynthesis of the mycobacterial cell envelope

3 (A) Model of the cell envelope of M. tuberculosis. Red boxes highlight the arabinogalactan (AG) 4 and lipoarabinomannan (LAM) components which are synthesized by AftD (yellow). (B) Reaction 5 catalyzed by AftD enzyme. (C) Growth of wild-type M smegmatis (WT) and the conditional aftD 6 mutant ML2218 (cKO) without and with 20 ng/mL tetracycline. (D) Complementation of the cKO 7 with empty plasmid and plasmid carrying the M. abscessus aftD gene. The strains were grown with 8 and without tetracycline. Both growth curves were recorded in biological triplicates. Standard 9 deviations are plotted. (E) Clumping of the cKO cells in the presence of 20 ng/mL tetracycline in 10 liquid culture in microplates. (F) Colony morphology of WT and cKO M smegmatis without and 11 with 20 ng/mL tetracycline grown at 37 °C for 4 and 6 days. (G) Scanning electron microscopy 12 micrographs and quantification of the cell length and width (H) from these images. For the cell 13 lengths and widths (H), one-way ANOVA Kruskal-Wallis parametric test was done as the 14 distribution was not normal. Dunns post-test was done between all pairs of samples.

30

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.22.885152; this version posted December 23, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

1 31

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.22.885152; this version posted December 23, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

1 Figure 2. Architecture of AftD Complex

2 (A) Single-particle cryo-EM structure of AftD complex, rendered in cartoon and colored in rainbow 3 from N terminus (blue) to C terminus (red). The complexed E. coli acyl carrier protein and ligands 4 are colored in brown. Both proteins are rendered as cartoon. Ca2+ ions are rendered as green 5 spheres. Two orthogonal views that are perpendicular to the plane of the membrane are shown. The 6 approximate dimensions of the monomer are 85, 145, and 75 Å (width, height, and depth). 7 Membrane boundaries were derived from the interface between the nanodisc lipids and solvent. (B) 8 Transmembrane helices arrangement of AftD, viewed as a slice and magnified. (C) Two- 9 dimensional topological diagram of AftD. The individual soluble domains are enclosed in separate 10 grey boxes. The topology diagram is colored in rainbow from N terminus (blue) to C terminus (red). 11 Unbuilt parts of the model due to poor map density are indicated by dotted lines. The two bound 12 Ca2+ atoms are shown as green circles, and side chains coordinating them are shown as sticks. The 13 putative catalytic residues previously hypothesized (Škovierová et al., 2009) are shown as red 14 circles.

32

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.22.885152; this version posted December 23, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

1 33

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.22.885152; this version posted December 23, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

1 Figure 3. Sequence conservation and putative active site of AftD

2 (A) Structure of AftD, rendered in cartoon and colored based on sequence conservation. Unique 3 sequences recovered from PSI-BLAST search were first pruned to those that are 95% the length of 4 M. abscessus AftD (see Figure S3B). Thereafter, closely related sequences were removed by CD- 5 HIT, giving a list of 289 sequences across the Actinobacteria class. The putative active site cavity, 6 generated by Voss Volume Voxelator server (Voss and Gerstein, 2010) using probes of 2 and 5 Å 7 radii, is colored in semi-transparent green. Residues with 100% conservation are colored in magenta 8 and have their side chains displayed. The insert shows putative active site cavity with strictly 9 conserved residues around it labeled. The primary sequences of the un-modeled flexible loops 10 around the active site are shown below the insert, colored by conservation. (B) Columbic potential 11 maps of the putative active site region of AftD, aligned PglB and yeast OST (STT3). The putative 12 divalent cation binding pocket is circled with a red dotted line, and is mirrored in (A). The ligands 13 and Mn2+ ion in the structure of PglB are shown as sticks and ball respectively. (C) Residues that 14 have been mutated in the M. abscessus aftD gene have side chains displayed as stick, and the model 15 colored by conservation. These mutant plasmids were then transformed into the cKO to assess the 16 effect of these mutations on the growth rate of M. smegmatis, in the absence (D) or presence of 17 tetracycline (E).

18

19

34

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.22.885152; this version posted December 23, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

1

35

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.22.885152; this version posted December 23, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

1 Figure 4. The Carbohydrate Binding Modules (CBMs) of AftD

2 (A) The three soluble CBMs and a β-sandwich fold domain are colored in the AftD structure. The 3 model is rendered as cartoon. In the first insert on top, the CBMs are zoomed in and the putative 4 active site cavity is colored in semi-transparent green as in Figure 3. In the bottom insert, similar 5 protein structures (PDB IDs indicated) aligned to and replacing each AftD CBM to show putative 6 glycan binding scheme. (B) Putative sugar binding interfaces of the CBMs, with residues that could 7 be involved in binding labelled and the putative location of the sugars circled in red dotted lines. All 8 residues are colored by conservation. (C) Distances among the CBMs (top panel) and to the putative 9 ligand binding cavity displayed in light green (bottom panel). (D) Glycan array analysis of AftD, 10 with top glycan motif hits indicated, where green star represents arabinofuranose and blue sphere 11 glucopyranose. The classes of glycans used are: lipoarabinomannan (LAM), trehalose mycolates 12 and lipooligosaccharides (LOS), capsular α glucans, phenolic glycolipids (PGL), phosphatidyl-myo- 13 inositol mannoside (PIM), glycopeptidolipid and bovine serum albumin (BSA) control. For 14 structures of all glycans on the array see (Zheng et al., 2017a).

36

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.22.885152; this version posted December 23, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

1 2

37

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.22.885152; this version posted December 23, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

1 Figure 5. Acyl carrier protein (ACP) in complex with AftD

2 (A) TM 13, 14, 15 and 16 form a hydrophobic groove where the 4’-phosphopantetheine of ACP 3 binds. AftD is rendered as cartoon with TMs 12-16 being colored in the same colors as in Figure 2. 4 ACP is colored magenta. The insert shows a zoomed view of ACP and 4’-phosphopantetheine, with 5 the corresponding density map shown as mesh. (B) Surface electrostatic potential maps of AftD 6 (top), E. coli ACP (below) and M. smegmatis AcpM (below, insert), with red being negatively 7 charge and blue positively charged. (C) LigPlot (Wallace et al., 1995) on 4’-phosphopantetheine 8 showing its atomic interactions with AftD. (D) Zoom in on the binding interface between AftD and 9 ACP, with side chains of relevant residues displayed as sticks. Residues have been colored by 10 conservation using the same scheme as Figure 3 and 4. The primary sequence of the un-model 11 flexible loop between TM15 and TM16 is shown below the insert, colored by conservation. (E) 12 Residues that have been mutated in the M. abscessus AftD protein have side chains displayed as 13 stick, and the model colored by conservation. If the residue is not resolved in the structure, its 14 putative position is indicated by a dotted lined box (F).These mutations are also listed in a table (G). 15 These mutant plasmids were then transformed into the cKO to assess their effect on the growth rate 16 of M. smegmatis (H), in the absence or presence of tetracycline.

17

38

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.22.885152; this version posted December 23, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

1 2

39

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.22.885152; this version posted December 23, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

1 Figure 6. Phenotype and Structure of AftD-R1389S of M. abscessus AftD

2 (A) Microplate appearance assay and (B) colony forming assay across 4, 6 and 8 days of AftD 3 mutants D474A and R1389S. (C) 2D class averages comparison between the AftD WT particles and 4 mutant R1389S class 1 and 2 particles, with the red arrow pointing at the position of ACP. (D) 5 Superposition of maps from class 1 and 2 from AftD mutant R1389S with the WT map. The maps 6 have been low-pass filtered to 4 Å and displayed at similar thresholds for comparison. (E) 7 Superposition of class 1 and 2 models from AftD-R1389S against class 1 map (on top) and class 2 8 map (on the bottom) at the JM3 region. Colored scheme follows (D). (F) The model of class 1 from 9 AftD-R1389S with the ordered loop (EL5) colored in yellow. The rest of the structure is rendered as 10 cartoon in blue. The inserts are zoomed around EL5 (residues 331 to 363), colored in yellow and by 11 conservation.

12

40

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.22.885152; this version posted December 23, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

1 2

41

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.22.885152; this version posted December 23, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

1 Figure 7. Postulated Mechanism of Action for AftD

2 AftD is rendered as cartoon in blue, orange (JM3), red (TM 13-16) and yellow (ordered EL5). ACP 3 is rendered as cartoon in grey. The catalytic cycle for AftD is indicated by the black arrows from 4 state 1 through 3. The inhibitory cycle is indicated by red arrows from state I1 to I3. The solved WT 5 AftD structure corresponds to state 1, the AftD-R1389S class 2 structure to state I2 and the AftD- 6 R1389S class 1 structure to state I3.

7 8

42

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.22.885152; this version posted December 23, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

1 Figure S1. Multiple sequence alignment of AftD across the Actinobacteria class, Related to 2 Figure 1

3 The sequence alignment was generated using MUSCLE and comprises of the following species 4 (order in brackets): (Corynebacteriales), Mycobacterium tuberculosis 5 (Corynebacteriales), Mycobacterium smegmatis (Corynebacteriales), Actinospica acidiphila 6 (Catenulisporales), Streptomyces mirabilis (Streptomycetales), Actinomadura atramentaria 7 (Streptosporangiales), Glycomyces fuscus (Glycomycetales), Actinoplanes missouriensis 8 (Micromonosporales), Alloactinosynnema album (Pseudonocardiales), Jiangella gansuensis 9 (Jiangellales), Intrasporangium calvum (Micrococcales), flavida (Propionibacteriales), 10 Ilumatobacter nonamiensis (Acidimicrobiales), Frankia inefficax (Frankiales) and 11 Corynebacterium glutamicum (Corynebacteriales). The residues are colored by their 12 physiochemical properties using the Zappo scheme: Aliphatic/hydrophobic residues are in salmon 13 (ILVAM), aromatic residues in orange (FWY), positively charged residues in blue (KRH), 14 negatively charged residues in red (DE), hydrophilic residues in green (STNQ), conformationally 15 special residues in pink (PG) and cysteines in yellow (C). The intensity of the coloring is 16 proportional to the degree of conservation. Secondary structure elements derived from the M. 17 abscessus structure are labelled and aligned below. Gaps and unaligned parts of the sequence have 18 been shrunk and indicated by the blue arrow.

19

43

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.22.885152; this version posted December 23, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

1 44

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.22.885152; this version posted December 23, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

1 Figure S2. Lengths of AftD Homologs and Conditional Knock-out Scheme, Related to Figure 2 1

3 Total number of residues of each AftD homolog against the genus. Corynebacterium genus (pink 4 dots) has unusually short AftD sequence compared to the rest of the genus in the Actinobacteria 5 class (blue dots), including the Mycobacterium genus (green dots). (B) Conditional knock-out 6 scheme for aftD gene in M. smegmatis. The gene locus for WT M. smegmatis around the aftD gene 7 is shown.

8

45

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.22.885152; this version posted December 23, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

1 Figure S3. Purification and single-particle cryo-EM structural determination of wild-type M. 2 abscessus AftD, Related to Figure 2

3 (A) Representative size-exclusion chromatography (SEC) trace of AftD incorporated into MSP- 4 1E3D1 nanodiscs. Fractions corresponding to the peak were used for cryo-EM analysis. (B) SDS- 5 PAGE gel of the concentrated sample from the peak fractions stained with Coomassie blue. The 6 predicted molecular weight of AftD is 150 kDa, while that for MSP-1E3D1 is 32 kDa. (C) 7 Representative square, hole and exposure magnification views of the Spotiton nanowire carbon 8 patterned grids that were imaged. The exposure image had a defocus of -1.7 μm (estimated by 9 CTFFind4 and pixel size of 1.0605 Å/pixel. (D) Representative 2D class averages. (E) Euler angle 10 distribution plot of the final 3D reconstruction from CryoSPARC 2. (F) ResMap display of 11 unsharpen AftD reconstructions in orthogonal views, showing the nanodisc belt as well as variation 12 in local resolution. (G) Fourier shell correlation (FSC) curves for the full AftD map and AftD- 13 CBM3 focused classified map describing the half-map (blue) resolutions (at 0.143 cut-offs), as well 14 as histogram of directional resolutions sampled evenly over the 3DFSC (yellow). The 15 corresponding sphericity value is also indicated. (H) Fourier shell correlation (FSC) curves for the 16 atomic models of AftD, ACP, CBM3 of AftD and AftD with ACP against the whole composite map 17 generated by combining the full AftD map with the AftD-CBM3 focused classified map using 18 phenix.combine_focused_maps at 0.5 cut-off.

19

20

46

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.22.885152; this version posted December 23, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

1 47

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.22.885152; this version posted December 23, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

1 Figure S4. EM Density of AftD, Related to Figure 2

2 The atomic model for the structure of wild-type M. abscessus AftD is colored in rainbow and 3 rendered as cartoon, with the side chains rendered as sticks. The map density is displayed as a mesh. 4 The residues for each segment of the atomic model are indicated below the name of the segment. 5 For all the segments except the acyl carrier protein, the sigma value for display was 8.33, while for 6 the acyl carrier protein, it was 2.44.

48

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.22.885152; this version posted December 23, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

1 49

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.22.885152; this version posted December 23, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

1 Figure S5. Structural Features of AftD, Related to Figures 2,3,4,5

2 (A) Stretch of consecutive prolines in CBM2 and neighboring prolines shown as green sticks. The 3 rest of the structure is rendered as cartoon. (B) Location of the potential binding site (dotted red 4 line) for the lipidic tail of the donor DPA. AftD is rendered as cartoon, with residues colored by 5 conservation. The density map is shown as transparent grey. (C) Structural conservation of the GT- 6 C fold of AftD (residues 1 to 600) against other GT-C glycosyltransferases obtained from a Dali 7 search. Thereafter, (D) eukaryotic STT3 (PDB ID: 6EZN), (E) bacterial PglB (PDB ID: 5OGL) and 8 (F) ArnT (PDB ID: 5EZM) were then superimposed onto full-length AftD and a slice through the 9 transmembrane helices are shown as cartoon. The other glycosyltransferase structures were colored 10 in green, yellow and purple respectively, while full-length AftD was colored in rainbow. Helices of 11 the other glycosyltransferases were labelled, and those that did not have a corresponding 12 homologous helix in AftD were indicated with a dotted circle. Locations of the flexible loop that 13 becomes ordered upon substrate binding (EL5, PL4) are indicated by the dotted lines. The locations 14 of the lipidic sugar donor substrate are indicated by pink boxes.

15

50

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.22.885152; this version posted December 23, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

1

51

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.22.885152; this version posted December 23, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

1 Figure S6. Expression of AftD mutants in E. coli, Related to Figures 3,5

2 SDS-PAGE gel of samples from small scale purifications of WT AftD and AftD mutants, stained 3 with Coomassie blue. The AftD WT and mutants were heterologously expressed in E. coli and 4 purified for this experiment, without TEV cleavage.

5

6 7

8

9

52

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.22.885152; this version posted December 23, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

1 Figure S7. Purification and single-particle cryo-EM structural determination of AftD-R1389S, 2 Related to Figure 6

3 (A) Representative size-exclusion chromatography (SEC) trace of AftD-R1389S incorporated into 4 MSP-1E3D1 nanodiscs. The insert shows a zoom in of the fractions collected for SDS-PAGE 5 analysis. (B) SDS-PAGE gel of the concentrated sample from the peak fractions stained with 6 Coomassie blue. The predicted molecular weight of AftD-R1389S is 150 kDa, while that for MSP- 7 1E3D1 is 32 kDa. Mass spectrometry was used to determine that the higher band at around 150 kDa 8 corresponds to AftD-R1389S with ACP still associated. (C) Representative square, hole and 9 exposure magnification views of the gold grids that were imaged. The exposure image had a 10 defocus of -1.6 μm (estimated by CTFFind4) and pixel size of 1.0605 Å/pixel. (D) Representative 11 2D class averages. (E) Signal subtraction and 3D focused classification scheme. (F) ResMap 12 display of two classes from focused classification of unsharpen AftD R1389S. (G) Fourier shell 13 correlation (FSC) curves for the two classes describing the half-map (blue) resolutions (at 0.143 cut- 14 offs), map-to-model (purple) resolutions (at 0.5 cut-offs), as well as histogram of directional 15 resolutions sampled evenly over the 3DFSC (yellow). The corresponding sphericity value is also 16 indicated.

53

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.22.885152; this version posted December 23, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

1 54

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.22.885152; this version posted December 23, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

1 Table S1. Cryo-EM data collection and modeling statistics WT-AftD, Related to Figure 2 2 AftD-WT-CBM3 AftD-WT Focus Classification EM data collection / processing Microscope FEI Titan Krios Voltage (kV) 300 Camera Gatan K2 Summit with Energy Filter Energy Filter Slit Width (eV) 15 Set defocus range (μm) 1.0-2.0 Defocus mean ± std (μm) 1.6 ± 0.39 Exposure time (s) 13.05 Number of frames 90 Dose rate (e-/pixel/s) 8.2 Total dose (e-/Å2) 95.72 Pixel size (Å) 1.0605 Number of micrographs 7,274 Number of particles (after initial cleanup) 889,985 Number of particles (in final map) 490,616 - Number of particles (after classification) - 50,410 Symmetry C1 C1 Resolution (global) (Å) 2.9 3.0 Local Resolution Range 2.3 – 4.1 2.3 – 4.1 Directional Resolution Range 2.7 – 2.9 2.9 – 3.1 Sphericity of 3DFSC 0.99 0.99 Map sharpening b-factor (Å2) -56.04 -29.62

Model statistics AftD-WT Residue range 2-1391 Map CC 0.828 RMSD [bonds] (Å) 0.010 RMSD [angles] (Å) 1.224 All-atom clashscore 2.27 Ramachandran plot Favored (%) 95.45 Allowed (%) 4.47 Outliers (%) 0.08 Rotamer outliers 0.00 C-β deviations 0 MolProbity Score 1.32 EM-Ringer Score 3.89

3

55

(which wasnotcertifiedbypeerreview)istheauthor/funder,whohasgrantedbioRxivalicensetodisplaypreprintinperpetuity.Itmade bioRxiv preprint

1 Table S2. Proteins identified by Liquid MS/MS from M. smegmatis purified AftD-WT M. abscessus, Related to Figure 5 2 Name Accession UniProt Unused ProtScore Total ProtScore Confidence (95%) Species Number of Peptides (95%)

ATP synthase subunit beta sp|A0R200|ATPB_MYCS2 62,02 62,02 66,1 M. smegmatis 31 doi:

UPF0182 protein MSMEG_1959/MSMEI_ sp|A0QTT7|Y1959_MYCS2 38,08 38,08 25 M. smegmatis 21 https://doi.org/10.1101/2019.12.22.885152 ATP synthase subunit alpha sp|A0R202|ATPA_MYCS2 37,14 37,19 50,4 M. smegmatis 19 AftD Mycobacterium abscessus 1948 32,81 32,81 11,4 M. abscessus 18 F5/8 type C domain protein 60 kDa chaperonin 1 sp|A0QQU5|CH601_MYCS2 28,05 28,05 39,2 M. smegmatis 14 Adenosylhomocysteinase sp|A4ZHR8|SAHH_MYCS2 24,52 24,52 34,2 M. smegmatis 14 available undera Elongation factor Tu sp|A0QS98|EFTU_MYCS2 24,42 24,45 48 M. smegmatis 13 Probable malate:quinone oxidoreductase sp|A0QVL2|MQO_MYCS2 18,94 18,94 24,7 M. smegmatis 10 Cell wall synthesis protein Wag31 sp|A0R006|WAG31_MYCS2 16,6 16,6 34,9 M. smegmatis 8 ATP synthase gamma chain sp|A0R201|ATPG_MYCS2 12 12 30 M. smegmatis 6

Ketol-acid reductoisomerase (NADP(+)) sp|A0QUX8|ILVC_MYCS2 11,91 11,95 22,9 M. smegmatis 6 CC-BY-NC-ND 4.0Internationallicense ;

Glutamine synthetase sp|A0R079|GLN1B_MYCS2 9,83 9,83 12,3 M. smegmatis 5 this versionpostedDecember23,2019. Galactofuranosyltransferase GlfT2 sp|A0R628|GLFT2_MYCS2 8,02 8,02 9,4 M. smegmatis 4 30S ribosomal protein S3 sp|A0QSD7|RS3_MYCS2 8,01 8,01 15,3 M. smegmatis 4 Uncharacterized protein sp|A0R1B5|Y4692_MYCS2 8 8 44 M. smegmatis 4 MSMEG_4692/MSMEI_4575 ATP synthase epsilon chain sp|A0R1Z9|ATPE_MYCS2 8 8 41,3 M. smegmatis 4 30S ribosomal protein S5 sp|A0QSG6|RS5_MYCS2 6 6 25,7 M. smegmatis 3 Acyl carrier protein tr|A0A0D6IL69|A0A0D6IL69_MYCSM 6 6 53,5 M. smegmatis 3 Pyruvate dehydrogenase E1 component sp|A0R0B0|ODP1_MYCS2 5,81 5,87 3,8 M. smegmatis 3

Decaprenylphosphoryl-2-keto-beta-D- . sp|A0R610|DPRE2_MYCS2 5,07 5,13 12,6 M. smegmatis 3 erythro-pentose reductase

Acetamidase sp|Q07838|AMDA_MYCSM 4,01 4,02 5,4 M. smegmatis 2 The copyrightholderforthispreprint 50S ribosomal protein L5 sp|A0QSG1|RL5_MYCS2 4 4 14,4 M. smegmatis 2 ATP synthase subunit b sp|A0R204|ATPF_MYCS2 2,29 2,29 7,6 M. smegmatis 1 3

56

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.22.885152; this version posted December 23, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

1 Table S3. Cryo-EM data collection and modeling statistics AftD R1389S, Related to Figure 6 2 AftD R1389S AftD R1389S

Class 1 Class 2 EM data collection / processing Microscope FEI Titan Krios Voltage (kV) 300 Camera Gatan K2 Summit with Energy Filter Energy Filter Slit Width (eV) 15 Set defocus range (μm) 0.8-1.6 Defocus mean ± std (μm) 1.42 ± 0.35 Exposure time (s) 13.05 Number of frames 90 Dose rate (e-/pixel/s) 8.34 Total dose (e-/Å2) 96.78 Pixel size (Å) 1.0605 Number of micrographs 4,886 Number of particles (after initial cleanup) 226,478 Number of particles (in consensus map) 150,978 Number of particles (after classification) 37,814 68,231 Symmetry C1 C1 Resolution (global) (Å) 3.5 3.4 Local Resolution Range 3.0 – 8.7 3.0 – 7.2 Directional Resolution Range 3.3 – 3.6 3.2 – 3.6 Sphericity of 3DFSC 0.98 0.98 Map sharpening b-factor (Å2) -125.09 -129.19

Model statistics

Residue range 5-1388 5-1388 Map CC 0.8403 0.8045 RMSD [bonds] (Å) 0.012 0.010 RMSD [angles] (Å) 1.364 1.310 All-atom clashscore 4.12 3.23 Ramachandran plot Favored (%) 92.12 93.62 Allowed (%) 7.80 6.30 Outliers (%) 0.08 0.08 Rotamer outliers 0.11 0.32 C-β deviations 0 0 MolProbity Score 1.68 1.54 EM-Ringer Score 3.70 3.21

3

Tan et al. 57

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.22.885152; this version posted December 23, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

1 METHODS AND RESOURCES

2 KEY RESOURCES TABLE

REAGENT or RESOURCE SOURCE IDENTIFIER Chemicals, Peptides, and Recombinant Proteins

Luria Broth medium Fisher Scientific BP1426-2

Terrific Broth medium Fisher Scientific DF043817

Ampicillin Sigma Aldrich A0166-25G

Chloramphenicol Sigma Aldrich C0378-25G

Isopropyl β-D-1- Fisher Scientific BP1755-10 thiogalactopyranoside

DNAse I Roche 10104159001

RNase A Roche 10109169001

EDTA-free cOmplete protease Roche 11873580001 inhibitor cocktail

n-Dodecyl-β-D-Maltopyranoside Anatrace D310 25 GM

E. coli polar lipid extract Avanti 100600P

Middlebrook 7H9 Broth medium Difco DF0713-17-9 USBiological M3895-05A Middlebrook 7H10 Agar Base Fisher Scientific M0303-500G medium Sigma Aldrich M0303-500G

Hygromycin Fisher Scientific 10687010

Tetracycline Alfa Aesar AAB2140822

Kanamycin Fisher Scientific BP906-5

Tyloxapol Sigma Aldrich T8761-50G

Critical Commercial Assays

Ni-NTA Agarose Qiagen 30230

Bio-Beads SM-2 Adsorbents BioRad 152-3920

Superdex 200 Increase 10/300 GL GE Healthcare Life 28990944 size-exclusion column Sciences

Tan et al. 58 bioRxiv preprint doi: https://doi.org/10.1101/2019.12.22.885152; this version posted December 23, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

ANTI-FLAG® M2 Affinity Gel Sigma A2220

Deposited Data

AftD-WT composite density map This paper EMD-XXXX

AftD-WT global refined density This paper EMD-XXXX map

AftD-WT-CBM3 focus classified This paper EMD-XXXX density map

AftD-WT Cryo-EM movie frames, This paper EMPIAR-XXXX aligned micrographs and particle stack

AftD-WT atomic model This paper PDB-XXXX

AftD-R1389S Class 1 density map This paper EMD-XXXX

AftD-R1389S Class 2 density map This paper EMD-XXXX

AftD-R1389S Cryo-EM movie This paper EMPIAR-XXXX frames, aligned micrographs and particle stack

AftD-R1389S Class 1 atomic model This paper PDB-XXXX

AftD-R1389S Class 2 atomic model This paper PDB-XXXX

Experimental Models: Organisms/Strains

BL21 (DE3) pLysS E. coli Promega PR-L1191A

M. smegmatis mc2155 (Wild-type) N/A (Snapper et al., 1990) M. smegmatis ML2253 (M. This paper N/A smegmatis mc2155; L5 attB::pML3621(hygr, loxP- Psmyc::tetR38-loxP) M. smegmatis ML2217 (M. This paper N/A smegmatis mc2155; L5 attB, Psmyc::tetR38) M. smegmatis ML2218 (M. This paper N/A smegmatis mc2155; L5 attB, Psmyc::tetR38; tetO-4C5G::aftD, hygr) M. smegmatis ML2730 (Derivative This paper N/A ML2253, Ms6 attB::pML4214 (aph, pBR322 origin, Ms6 integrase, Ms6 attP))

Tan et al. 59 bioRxiv preprint doi: https://doi.org/10.1101/2019.12.22.885152; this version posted December 23, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

M. smegmatis ML2731 (Derivative This paper N/A ML2218, Ms6 attB::pML4225 (aph, PaftD-Msm::aftDMab)) M. smegmatis ML2732 (Derivative This paper N/A ML2218, Ms6 attB::pML4226 (aph, PaftD-Mab::aftDMab)) M. smegmatis ML2733 (Derivative This paper N/A ML2218, Ms6 attB::pML4214) M. smegmatis ML2734 (Derivative This paper N/A ML2218, Ms6 attB::pML4227 (aph, PaftD-Mab::aftD K39A)) M. smegmatis ML2735 (Derivative This paper N/A ML2218, Ms6 attB::pML4228 (aph, PaftD-Mab::aftD D41A)) M. smegmatis ML2736 (Derivative This paper N/A ML2218, Ms6 attB::pML4229 (aph, PaftD-Mab::aftD E251A)) M. smegmatis ML2737 (Derivative This paper N/A ML2218, Ms6 attB::pML4231 (aph, PaftD-Mab::aftD K361A)) M. smegmatis ML2738 (Derivative This paper N/A ML2218, Ms6 attB::pML4232 (aph, PaftD-Mab::aftD D474A)) M. smegmatis ML2739 (Derivative This paper N/A ML2218, Ms6 attB::pML4233 (aph, PaftD-Mab::aftD G1339W)) M. smegmatis ML2740 (Derivative This paper N/A ML2218, Ms6 attB::pML4235 (aph, PaftD-Mab::aftD T1285Stop)) M. smegmatis ML2741 (Derivative This paper N/A ML2218, Ms6 attB::pML4236 (aph, PaftD-Mab::aftD R310A)) M. smegmatis ML2742 (Derivative This paper N/A ML2218, Ms6 attB::pML4237 (aph, PaftD-Mab::aftD R1389S)) M. smegmatis ML2743 (Derivative This paper N/A ML2218, Ms6 attB::pML4238

(aph, PaftD-Mab::aftD R315A))

Recombinant DNA

M. abscessus AftD-WT in This study (Bruni and Kloss, 2013) pNYCOMPS-N23

M. abscessus AftD-K39A in This study (Bruni and Kloss, 2013) pNYCOMPS-N23

Tan et al. 60 bioRxiv preprint doi: https://doi.org/10.1101/2019.12.22.885152; this version posted December 23, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

M. abscessus AftD-D41A in This study (Bruni and Kloss, 2013) pNYCOMPS-N23

M. abscessus AftD-E251A in This study (Bruni and Kloss, 2013) pNYCOMPS-N23

M. abscessus AftD-R357A in This study (Bruni and Kloss, 2013) pNYCOMPS-N23

M. abscessus AftD-K381A in This study (Bruni and Kloss, 2013) pNYCOMPS-N23

M. abscessus AftD-D474A in This study (Bruni and Kloss, 2013) pNYCOMPS-N23

M. abscessus AftD-G1339W in This study (Bruni and Kloss, 2013) pNYCOMPS-N23

M. abscessus AftD-V1335W in This study (Bruni and Kloss, 2013) pNYCOMPS-N23

M. abscessus AftD-T1385Stop in This study (Bruni and Kloss, 2013) pNYCOMPS-N23

M. abscessus AftD-R310A in This study (Bruni and Kloss, 2013) pNYCOMPS-N23

M. abscessus AftD-R1389S in This study (Bruni and Kloss, 2013) pNYCOMPS-N23

M. abscessus AftD-R315A in This study (Bruni and Kloss, 2013) pNYCOMPS-N23

M. abscessus AftD-WT in pVVAP This study N/A

pML3621 (pUC origin; L5 This study N/A integrase; L5 attP; hygr; loxP-

Psmyc::tetR38-loxP; 6392 bp)

pML4205 (pUC origin; hygr; tetO- This study N/A

4C5G::aftD(Msm)1-1089; 5952 bp)

pML2714 (pUC origin; pAL5000ts; This study (Ofer et al., 2012)

pgroEL-cre; pimc-pamcherry1m; aph; 7613 bp)

pML4214 (pBR322 origin; Ms6 This study N/A integrase; Ms6 attP; aph; 6165 bp)

Tan et al. 61 bioRxiv preprint doi: https://doi.org/10.1101/2019.12.22.885152; this version posted December 23, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

pML4225 (Derivative pML4214; This study N/A pBR322 origin; Ms6 integrase; Ms6 attP; aph; PaftD-Msm::aftDMab; 11143 bp) pML4226 (Derivative pML4214; This study N/A pBR322 origin; Ms6 integrase; Ms6 attP; aph; PaftD-Mab::aftDMab; 11143 bp) pML4227 (Derivative pML4214; This study N/A pBR322 origin; Ms6 integrase; Ms6 attP; aph; PaftD-Mab::aftDK39A; 11143 bp) pML4228 (Derivative pML4214; This study N/A pBR322 origin; Ms6 integrase; Ms6 attP; aph; PaftD-Mab::aftDD41A; 11143 bp) pML4229 (Derivative pML4214; This study N/A pBR322 origin; Ms6 integrase; Ms6 attP; aph; PaftD-Mab::aftDE251A; 11143 bp) pML4230 (Derivative pML4214; This study N/A pBR322 origin; Ms6 integrase; Ms6 attP; aph; PaftD-Mab::aftDR357A; 11143 bp) pML4231 (Derivative pML4214; This study N/A pBR322 origin; Ms6 integrase; Ms6 attP; aph; PaftD-Mab::aftDK361A; 11143 bp) pML4232 (Derivative pML4214; This study N/A pBR322 origin; Ms6 integrase; Ms6 attP; aph; PaftD-Mab::aftDD474A; 11143 bp) pML4233 (Derivative pML4214; This study N/A pBR322 origin; Ms6 integrase; Ms6 attP; aph; PaftD-Mab::aftDG133W; 11143 bp) pML4234 (Derivative pML4214; This study N/A pBR322 origin; Ms6 integrase; Ms6 attP; aph; PaftD-Mab::aftDV1335W; 11143 bp) pML4235 (Derivative pML4214; This study N/A pBR322 origin; Ms6 integrase; Ms6 attP; aph; PaftD-Mab::aftDT1285Stop; 11143 bp) pML4236 (Derivative pML4214; This study N/A pBR322 origin; Ms6 integrase; Ms6 attP; aph; PaftD-Mab::aftDR310A; 11143 bp) pML4237 (Derivative pML4214; This study N/A pBR322 origin; Ms6 integrase; Ms6 attP; aph; PaftD-Mab::aftDR1389S; 11143 bp) Tan et al. 62 bioRxiv preprint doi: https://doi.org/10.1101/2019.12.22.885152; this version posted December 23, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

pML4238 (Derivative pML4214; This study N/A pBR322 origin; Ms6 integrase; Ms6 attP; aph; PaftD-Mab::aftDR315A; 11143 bp) Software and Algorithms

AutoIt AutoIt https://www.autoitscript.com

Pombe Measurer Zhou Zhou, http://www.columbia.edu/~zz21 unpublished 81/Pombe_Measurer.html

GraphPad Prism 5 GraphPad https://www.graphpad.com

Leginon (Suloway et al., http://leginon.org 2005)

Appion (Lander et al., 2009) http://appion.org

MotionCor2 (Zheng et al., 2017b) http://msg.ucsf.edu/em/software /motioncor2.html

CTFFind4 (Rohou and http://grigoriefflab.janelia.org/c Grigorieff, 2015) tffind4

GCTF (Zhang, 2016) https://www.mrc- lmb.cam.ac.uk/kzhang/Gctf

Gautomatch Kai Zhang, https://www.mrc- unpublished lmb.cam.ac.uk/kzhang/Gautom atch

CryoSPARC (Punjani et al., 2017) https://cryosparc.com

Relion (Kimanius et al., https://www2.mrc- 2016; Scheres, 2012) lmb.cam.ac.uk/relion/

cisTEM (Grant et al., 2018) https://cistem.org

pyem Daniel Asarnow, https://github.com/asarnow/pye unpublished m

EMAN (Ludtke et al., 1999) http://blake.bcm.edu/emanwiki/ EMAN

UCSF Chimera (Pettersen et al., http://www.cgl.ucsf.edu/chimer 2004) a

Phenix (Adams et al., 2010; https://www.phenix-online.org Afonine et al., 2018)

Tan et al. 63

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.22.885152; this version posted December 23, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

Buccaneer (Cowtan, 2006) http://www.ysbl.york.ac.uk/~co wtan/buccaneer/buccaneer.html

CCP-EM (Burnley et al., 2017) http://www.ccpem.ac.uk

Coot (Emsley and Cowtan, https://www2.mrc- 2004) lmb.cam.ac.uk/personal/pemsle y/coot

EMRinger (Barad et al., 2015) http://emringer.com

ResMap (Kucukelbir et al., http://resmap.sourceforge.net 2013)

3DFSC (Tan et al., 2017) https://3dfsc.salk.edu

Voss Volume Voxelator server (Voss and Gerstein, http://3vee.molmovdb.org 2010)

Dali server (Holm and Laakso, http://ekhidna2.biocenter.helsin 2016) ki.fi/dali

PyMOL Schrodinger https://pymol.org

PSI-BLAST (Altschul and https://blast.ncbi.nlm.nih.gov/B Koonin, 1998) last.cgi

CD-HIT (Li and Godzik, http://weizhong- 2006) lab.ucsd.edu/cdhit-web- server/cgi-bin/index.cgi

MUSCLE (Edgar, 2004) https://www.ebi.ac.uk/Tools/ms a/muscle

Other

Homemade carbon patterned (Wei et al., 2018) N/A nanowire grids

UltrAuFoil® Holey Gold Films (R Quantifoil Q350AR13A 1.2/1.3)

1

2 CONTACT FOR REAGENT AND RESOURCE SHARING

3 Further information and requests for reagents may be directed to, and will be fulfilled by Dr. Filippo

4 Mancia ([email protected]). Tan et al. 64

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.22.885152; this version posted December 23, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

1 Methods

2 Statistics

3 For calculations of Fourier shell correlations (FSC), the FSC cut-off criterion of 0.143 (Rosenthal

4 and Henderson, 2003) was used. No statistical methods were used to predetermine sample size. The

5 experiments were not randomized. The investigators were not blinded to allocation during

6 experiments and outcome assessment. For calculation of M. smegmatis cell length and width, both

7 D'Agostino & Pearson omnibus normality test and Shapiro-Wilk normality test showed that the

8 distribution was not normal. Hence, one-way ANOVA Kruskal-Wallis parametric test was done.

9 Dunns post-test was then done between all pairs of samples.

10

11 Phylogenetic and Conservation Analysis

12 PSI-BLAST (Altschul and Koonin, 1998) was performed using the amino acid sequence of M.

13 abscessus AftD until no more additional sequences were found, producing 2385 sequences.

14 Duplicate sequences from the same species and truncated variants of the same protein were

15 removed, resulting in 665 unique sequences. Lengths of these sequences were plotted in Figure S1.

16 Membrane topology for these sequences were predicted using the TMHMM server (Krogh et al.,

17 2001).

18

19 In order to compare sequences that are more closely related to Mycobacterium AftD, sequences

20 shorter than 95% the length of M. abscessus AftD (~1400 residues) were excluded. CD-HIT (Li and

21 Godzik, 2006) was then used to provide representative sequences and remove biases in the

22 distribution of the data – notably Mycobacterium sequences dominate the original sequence list

23 likely because of the medical relevance of this genus. The 289 final sequences were then aligned

24 with MUSCLE (Edgar, 2004) and used for conservation analysis. Tan et al. 65

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.22.885152; this version posted December 23, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

1

2 Generation of a M. smegmatis aftD Conditional Knockout Strain

3 To construct a parent strain for aftD conditional knockout (cKO), a L5 integration vector pML3621

R 4 (L5 integrase, Hyg , loxP-Psmyc::tetR38-loxP) was firstly transformed into wild-type M. smegmatis

5 mc2155 cell to create ML2253 strain which can constitutively express a reverse TetR (TetR38). The

6 ML2253 colony was selected on 7H10 plates supplemented with 50 μg/mL hygromycin (Fisher)

7 after 4 days at 37°C. Subsequently, a temperature-sensitive replication vector pML2714

8 (pAL5000ts, KanR, cre recombinase expression) was transformed into ML2253 for unmarking. The

9 unmarked colony (named ML2217) was selected on 7H10 plates supplemented with 30 μg/mL

10 kanamycin (Fisher) after 7 days at 32°C. Next, the unmarking plasmid ML2714 was removed by

11 temperature selection on 7H10 (no antibiotic) plates at 40°C for 4 days.

12

13 To conditionally knockout aftD in M. smegmatis mc2155, a suicide vector pML4205 containing a

14 TetR38-binding region (tetO-4C5G) and an N-terminus of aftD gene (1st to 1089th bp) was

15 transformed into the parent strain ML2217 for single cross-over selection on 7H10 plates

16 supplemented with 50 μg/mL hygromycin at 37°C for 4 days. PCR validation was then performed

17 for the single cross-over colonies. The correct colonies were named ML2218, the aftD conditional

18 KO strain. In the presence of 20 ng/mL tetracycline, TetR38-tetracycline complex binds the

19 upstream region (tetO-4C5G) of aftD and represses the transcription of aftD gene.

20

21 M. abscessus aftD Complementation

22 For the complementation experiment, the M. abscessus aftD gene, as well as the aftD variations,

23 were cloned into an Ms6 integration vector pML4214 (Ms6 integrase, Ms6 attP attachment site,

24 KanR) individually with a 200-bp upstream DNA fragment of M. abscessus aftD gene. The

Tan et al. 66

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.22.885152; this version posted December 23, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

1 pML4214 and the pML4214-derivative plasmids were transformed into ML2218 strain individually.

2 All the colonies were selected on 7H10 plates supplemented with 30 μg/mL kanamycin (Fisher)

3 after 4 days at 37°C.

4

5 Phenotypic Assays for M. smegmatis aftD Conditional Knockout Strain

6 To observe a clear phenotype of aftD conditional knockout mutant, the remaining cellular AftD

7 protein levels were depleted as much as possible. First, the ML2218 seed culture was grown to an

8 OD600 of 1.0 at 37°C with a shaking speed of 200 rpm. For the AftD depletion, ML2218 cells were

9 grown in 10 mL of 7H9/glycerol/tyloxapol liquid media supplemented with 20 ng/mL tetracycline

10 with an initial OD600 of 0.002 at 37°C, 200 rpm for 2-3 days. Subsequently, the culture was used for

11 growth curve assay, microplate appearance assay and colony forming assay.

12

13 For the growth curve assay, all the AftD-depleted strains were inoculated into the fresh 10 mL of

14 7H9/glycerol/tyloxapol liquid media supplemented with or without 20 ng/mL tetracycline with an

15 initial OD600 of 0.005. The cultures were then grown at 37°C, 200 rpm and time points taken

16 approximately every 4-8 hours till stationary phase. All experiments were done in triplicate.

17

18 For the microplate appearance assay, all the AftD-depleted strains were inoculated into 12-well

19 plates (Costar). Each well contains 3 mL of 7H9/glycerol/tyloxapol liquid media supplemented with

20 or without 20 ng/mL tetracycline with an initial OD600 of 0.05. The 12-well plates were shaking at

21 120 rpm at 37°C for 2-3 days.

22

23 For the colony forming assay, all the AftD-depleted cells were streaked on 7H10 plates with or

24 without 20 ng/mL of tetracycline. The plates were kept at 37°C for 6 days.

Tan et al. 67

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.22.885152; this version posted December 23, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

1

2 Scanning Electron Microscopy

3 10mL of M. smegmatis cells are harvested by spinning cells at 8000g for 1 min with a Sorvall ST 16

4 R centrifuge (Thermo Scientific) to give around 0.4-0.5g of cell pellet mass. Supernatant was

5 discarded. Cell pellet was then resuspended with 2.5% glutaraldehyde (Tousimis) in a 15mL Falcon

6 tube and incubated at 4°C for 2 hours. The suspension was then spun down again at 8000g for 1 min

7 and resuspended with PBS for 2 times, then with an ethanol (Fisher) gradient (30%, 50%, 70%,

8 100%) 15 min for dehydration. Each step takes 15 minutes, and will be followed with a spin down

9 and resuspension. After the final dehydration, fixed cells are left to air dry overnight.

10

11 The cells were then smeared onto a scanning electron microscope sample holder using a pipette tip

12 for imaging. The entire surface of the specimen was then sputter coated with a thin layer of

13 gold/palladium. The cells were imaged using secondary electron mode in a FEI Helios Nanolab 650

14 (FEI). Images were recorded using the scanning electron microscope beam at 2 keV and 100 pA

15 with a working distance of 4.0 ± 0.5 mm. Data acquisition occurred manually using an in-house

16 AutoIt script (https://www.autoitscript.com/) written by Bill Rice. The raw images were 3072 pixels

17 by 2048 pixels, with 6.08 nm pixel width, a horizontal field width (HFW) of 18.67 µm and dwell

18 time of 10.00 µs.

19

20 To measure the dimensions of cells, Adobe Photoshop was used to select out each cell. The cell

21 length and width were calculated using ImageJ’s plugin, Pombe Measurer (Zhou Zhou,

22 www.columbia.edu/~zz2181/Pombe_Measurer.html). Both D'Agostino & Pearson omnibus

23 normality test and Shapiro-Wilk normality test showed that the distribution of lengths and widths

Tan et al. 68

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.22.885152; this version posted December 23, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

1 were not normal. Hence, one-way ANOVA Kruskal-Wallis parametric test was done. Dunns post-

2 test was then done between all pairs of samples. Statistical tests were done using GraphPad Prism 5.

3

4 Genomic Expansion and Small-Scale Screening

5 AftD genes were identified from a collection of fourteen Mycobacterium genomes using a

6 bioinformatics approach (Love et al., 2010). Ligation independent cloning (LIC) was used to clone

7 these targets from the genomes into five LIC-adapted expression vectors (pNYCOMPS-Nterm,

8 pNYCOMPS-Cterm, pNYCOMPS-N23, pNYCOMPS-C23 and pMCSG7-10x) that contained a

9 Tobacco Etch Virus (TEV) protease cleave site (ENLYFQSYV) and decahistidine affinity tag.

10 Small and medium scale expression was done in a high throughput manner as described in detail in

11 a previous protocol by Bruni & Kloss (Bruni and Kloss, 2013). Only one construct, M. abscessus

12 AftD in the pNYCOMPS-N23 plasmid, was successfully cloned and expressed in small and

13 medium scale. This construct was thus brought forward for further structural studies.

14

15 M. abscessus AftD Mutagenesis

16 Site-directed mutagenesis on the M. abscessus aftD gene was done with overlapping primers

17 designed per mutant using the Quick Change Lightning Kit (Agilent). The same pairs of primers

18 were used on aftD in pNYCOMPS-N23 plasmid (for E. coli expression) and pML4214-Mab

19 Promoter::aftD M. abscessus (for M. smegmatis expression).

20

21 Small scale expression tests were carried out for the AftD mutants in the pNYCOMPS-N23 plasmid

22 as described in detail in a previous protocol by Bruni & Kloss (Bruni and Kloss, 2013) (Figure S6).

23 No TEV digestion was carried out for the small scale expression tests.

24

Tan et al. 69

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.22.885152; this version posted December 23, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

1 M. abscessus AftD expression, purification and nanodisc reconstitution

2 M. abscessus AftD WT in the pNYCOMPS-N23 plasmid was transformed into BL21 (DE3) pLysS

3 E. coli competent cells and plated onto Luria broth (LB) agar (Fisher) plates supplemented with 100

4 μg/mL ampicillin (Sigma) and 35 μg/mL chloramphenicol (Sigma) and grown overnight at 37°C. In

5 the next day, a colony was picked and used to inoculate a starter culture containing 150 mL of

6 Terrific broth (TB) medium (Fisher) supplemented with 100 μg/mL ampicillin and 35 μg/mL

7 chloramphenicol. The starter culture was grown overnight at 37°C with shaking (240 r.p.m.) in an

8 incubator shaker (New Brunswick Scientific). In next day, 12 2L baffled flasks each with 800 mL

9 of Terrific broth (TB) medium (Fisher) supplemented with 100 μg/mL ampicillin and 35 μg/mL

10 chloramphenicol were inoculated with 10 mL of starter culture. The cultures were then grown at

11 37°C with shaking (240 r.p.m.) until optical density (OD) at 600nm reached 1.0 (approximately 3

12 hours). Temperature was then reduced to 22°C and AftD expression was induced with 0.2 mM

13 isopropyl β-D-1-thiogalactopyranoside (IPTG) (Fisher). The culture was then incubated overnight

14 with shaking (240 r.p.m.). In the next day, the cells were harvested by centrifugation at 4,000 g in

15 H6000A/HBB6 rotor (Sorvall) for 30 min at 4°C. The supernatant was discarded and the pellet was

16 resuspended in chilled 1x phosphate buffered saline (PBS) and centrifuged again at 4,000 g for 30

17 min at 4°C. The supernatant was again discarded and the pellet was resuspended in lysis buffer

18 containing 20 mM HEPES pH 7.5, 200 mM NaCl, 20 mM MgSO4, 10 μg/mL DNase I (Roche), 8

19 μg/mL RNase A (Roche), 1 mM tris(2-carboxyethyl)phosphine hydrochloride (TCEP), 1 mM

20 PMSF, 1 tablet/1.5 L buffer EDTA-free cOmplete protease inhibitor cocktail (Roche). For 9.6 L of

21 culture, there was ~15-20g of wet cell pellet mass, which was resuspended with ~250 mL of lysis

22 buffer. Cells were lysed by putting the suspension through a chilled Emulsiflex C3 homogenizer

23 (Avestin) three times. To isolate the membrane fractions, the lysate was put through

24 ultracentrifugation at 37,000g in Type 45 Ti Rotor (Beckman Coulter) at 4°C for 30 min. The

Tan et al. 70

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.22.885152; this version posted December 23, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

1 supernatant was discarded and the pellet was resuspended in the lysis buffer up to a volume of 240

2 mL and homogenized using a hand-held glass homogenizer (Konte). The membrane fraction was

3 then stored at -80°C until the use.

4

5 The thawed membrane fraction was solubilized by adding n-dodecyl-β-D-maltopyranoside (DDM)

6 to a final concentration of 1% (w/v) detergent for 2 hours at 4°C with gentle rotation. Insoluble

7 material was removed by ultracentrifugation at 40,000g in Type 45 Ti Rotor at 4°C for 30 min. The

8 supernatant was added to six Falcon tubes containing pre-equilibrated Ni2+-NTA resin (Qiagen) in

9 the presence of 40 mM imidazole and incubated with gentle rotation at 4°C for 2 hours. The resin

10 was washed with 10 column volumes of wash buffer containing 20mM HEPES pH 7.5, 200 mM

11 NaCl, 60mM Imidazole, 0.1% DDM and eluted with elution buffer containing 20 mM HEPES pH

12 7.5, 200mM NaCl, 200mM Imidazole, 0.05% DDM. The eluted protein was exchanged into a

13 buffer containing 20 mM HEPES pH 7.5, 200mM Imidazole, 0.05% DDM using a PD-10 desalting

14 column (GE), and concentrated down using a 100-kDa concentrator (Pierce) to ~ 1 mg/mL.

15

16 The protein was then incorporated into lipid nanodisc (Bayburt and Sligar, 2010) with the molar

17 ratio 1:360:6 of E. coli polar lipid extract (Avanti): membrane scaffold protein 1E3D1

18 (MSP1E3D1) and incubated for 2 hours with gentle agitation at 4°C. Reconstitution was initiated by

19 removing detergent with the addition of 150 mg Bio-beads (Bio-Rad) per mL of protein solution for

20 overnight with constant rotation at 4°C. Bio-beads were removed by passing the protein solution

21 through an Ultrafree centrifugal filter unit (Fisher) and the nanodisc reconstitution mixture was re-

22 bound to Ni2+-NTA resin with 15 mM imidazole for 2 hours at 4°C in order to remove free

23 nanodisc. The resin was washed with 10 column volumes of wash buffer consisting of 20 mM

Tan et al. 71

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.22.885152; this version posted December 23, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

1 HEPES pH 7.5, 200 mM NaCl and 20 mM Imidazole, followed by 3 column volumes of elution

2 buffer consisting of 20 mM HEPES pH 7.5, 200 mM NaCl and 200 mM Imidazole.

3

4 TEV protease (Kapust et al., 2002) (~0.5 mg TEV protease added per pellet equivalent from 800

5 mL of initial bacterial culture) was added to the eluted protein, which was then injected into a Slide-

6 A-Lyzer™ dialysis cassette (Thermo Fisher) and dialyzed against 200 mL of 20 mM HEPES pH 7.5

7 and 200 mM NaCl overnight at 4°C. The mixture was then re-bound to Ni2+-NTA resin with 15 mM

8 imidazole for 1 hour at 4°C in order to remove uncleaved AftD, cleaved tag, TEV protease and

9 contaminant proteins. The resin was washed with 3 column volumes of wash buffer consisting of 20

10 mM HEPES pH 7.5, 200 mM NaCl. The flowthrough and wash were concentrated using a 100-kDa

11 concentrator to under 500 μL and loaded onto a Superdex 200 Increase 10/300 GL size-exclusion

12 column (GE Healthcare Life Sciences) in gel filtration buffer (20 mM HEPES pH 7.5 and 200 mM

13 NaCl). Throughout the entire process of purification, 15 μL of samples were taken and added to 5

14 μL of 6X reducing Laemmli SDS sample buffer (Bioland Scientific). The samples were then loaded

15 on a 4–20% Mini-PROTEAN TGX precast protein gel (Bio-rad) for protein gel electrophoresis in a

16 Tris/Glycine/SDS buffer. The gel was developed using InstantBlue (Sigma) protein stain.

17

18 For M. abscessus AftD-R1389S, the expression and purification protocol is largely the same except

19 for the following. Firstly, AftD-R1389S was not able to be cleaved by TEV protease, hence the

20 protein was not put through overnight TEV cleavage. Secondly, during the second nickel affinity

21 purification after reconstitution into nanodiscs, additional washes of 5 column volumes of 20 mM

22 HEPES pH 7.5, 500 mM NaCl and 20 mM Imidazole and 5 column volumes of 20 mM HEPES pH

23 7.5, 500 mM NaCl, 5 mM ATP and 20 mM Imidazole were done in order to increase purity of the

24 protein, given TEV cleavage was no longer done.

Tan et al. 72

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.22.885152; this version posted December 23, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

1

2 Single-Particle Cryo-EM Sample Vitrification

3 Purified AftD WT was concentrated using a 100-kDa concentrator (Pierce) to between 5-20 μL of

4 sample at ~1 mg/mL. Due to the low yields obtained, the samples were vitrified using a semi-

5 automated Spotiton V1.0 robot (Dandey et al., 2018; Jain et al., 2012; Razinkov et al., 2016). This is

6 a device for vitrifying cryo-EM samples that uses piezo-electric dispensing to apply small (50 pl)

7 drops of sample across a ‘self-blotting’ nanowire grid as it flies past en-route to plunging into liquid

8 ethane. Owing to the small amounts of sample required per vitrified grid, only ~3 μL of sample was

9 required to be aspirated for multiple grids. Homemade carbon support nanowire grids with a regular

10 array of holes were used (Wei et al., 2018), and were plasma-cleaned (Gatan Solarus) for 15s using

11 an H2/O2 mixture. The Spotiton V1.0 robot was operating at room temperature and moderate

12 humidity.

13

14 For purified AftD-R1389S, the protein was concentrated using a 100-kDa concentrator (Pierce) to

15 between ~10 μL of sample at ~2 mg/mL. 2.5 μL of sample was added to a plasma-cleaned (Gatan

16 Solarus) 1.2/1.3 µm holey gold grid (Quantifoil UltrAuFoil) and blotted using filter paper on one

17 side for 2s using the Leica GP plunger system before plunging immediately into liquid ethane for

18 vitrification. The plunger was operating at 6°C with >80% humidity to minimize evaporation and

19 sample degradation. Due to conditions previously optimized using Spotiton, Leica GP plunging

20 could be used to obtain good grids for the mutant without expending too much of the protein

21 sample.

22

23 Data Acquisition

Tan et al. 73

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.22.885152; this version posted December 23, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

1 Images were recorded on a Titan Krios electron microscope (FEI) equipped with a K2 summit

2 direct detector with a Quantum energy filter (Gatan) operating at 1.0605 Å per pixel in counting

3 mode using the Leginon software package (Suloway et al., 2005). Pixel size was calibrated in-house

4 using a proteasome test sample. Energy filter slit width of 15 eV was used during the collection and

5 aligned automatically every hour using Leginon. Data collection was performed using a dose of

6 ~95.72 e-/Å2 across 90 frames (150 msec per frame) at a dose rate of ~8.2 e–/pix/sec, using a set

7 defocus range of -1.0 μm to -2.0 μm. 100 μm objective aperture was used. For AftD WT, a total of

8 7,274 micrographs were recorded over a four sessions using an image beam shift data collection

9 strategy (Cheng et al., 2018). For AftD-R1389S, the same strategy over a single session of three

10 days yielded 4,886 micrographs. The dose was marginally different at 96.78 e–/Å2.

11

12 Data Processing

13 For M. abscessus WT AftD, data from the four sessions were processed separately and combined

14 towards the end of the processing pipeline. Movie frames were aligned using MotionCor2 (Zheng et

15 al., 2017b) with 5 by 5 patches and B-factor of 100 through the Appion software package (Lander et

16 al., 2009). Micrograph CTF estimation was performed using both CTFFind4 (Rohou and Grigorieff,

17 2015) and GCTF (Zhang, 2016), and best estimate based on confidence was selected within the

18 Appion software package. Template free particle picking with Gautomatch (Kai Zhang,

19 unpublished, https://www.mrc-lmb.cam.ac.uk/kzhang/Gautomatch/) using an extremely lenient

20 threshold (to avoid missing any particles) was used to pick 2,800,102 particles (extracted 256 box

21 size binned by 2) that were transferred into Relion 2.1 (Kimanius et al., 2016; Scheres, 2012) for 2D

22 classification. 2D class averages that were ice or showed no features were discarded. An initial

23 model was then generated in CryoSPARC ab initio (Punjani et al., 2017) and used for Relion 3D

24 refinement. The Euler angles and shifts were then used to re-extract the particles and re-center them.

Tan et al. 74

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.22.885152; this version posted December 23, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

1 Relion 3D classification was then performed using 7 classes and only classes with clear structural

2 features were selected.

3

4 At this stage, a total of 498,060 selected particles from all four session were combined and re-

5 extracted unbinned. Per-particle CTF using GCTF was then performed before the particles were put

6 through a Relion 3D refinement using the previously obtained initial model. A mask generated

7 around the transmembrane region was then used to performed 3D classification with 8 classes, and

8 particles belonging to classes with high resolution features were selected and re-extracted with a

9 bigger box size of 288. A total of 490,616 particles were left in the final stack.

10

11 CryoSPARC non-uniform refinement produced a 3.2 Å map, which was then put through cisTEM

12 (Grant et al., 2018) CTF refinement to obtain better defocus values. The particles with the refined

13 defocus values were then put through another round of CryoSPARC non-uniform refinement to

14 produce a 3.0 Å map. Signal subtraction was then done in Relion to remove the nanodisc and

15 flexible ACP density, and the signal subtracted particles were then put through CryoSPARC non-

16 uniform refinement. The resulting Euler angles and shifts were then used to reconstruct the non-

17 signal subtracted particles to produce a 2.9 Å map, called the AftD global refined map. This final

18 map was sharpened using phenix.auto_sharpen (Adams et al., 2010; Afonine et al., 2018), which

19 automatically determined that b_iso sharpening to high_resolution cutoff should be the algorithm to

20 use. Overall b_sharpen applied was 56.04 Å2 and final b_iso obtained was 44.30 Å2.

21

22 When examining the map by eye, it was noticed the domain corresponding to CBM3 had a lower

23 resolution (poor density) compared to the rest of the map. In order to better improve the resolution

24 at this flexible portion, signal subtraction was performed to isolate just the CBM3 region, and

Tan et al. 75

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.22.885152; this version posted December 23, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

1 focused classification using ten classes without alignment was done. A total of 50,410 particles

2 were present in the highest resolution class. The resulting Euler angles and shifts were then used to

3 reconstruct the non-signal subtracted particles to produce a 3.0 Å map, called the AftD-CBM3

4 focused classified map. This final map was sharpened using phenix.auto_sharpen, which

5 automatically determined that b_iso sharpening to high_resolution cutoff should be the algorithm to

6 use. Overall b_sharpen applied was 29.62 Å2 and final b_iso obtained was 45.97 Å2.

7

8 For M. abscessus AftD-R1389S, the same data processing strategy was used except for the

9 following differences. Firstly, 226,478 particles were picked initially and a consensus map of

10 150,978 particles was obtained. Secondly, from these 150,978 particles, signal subtraction was done

11 using a spherical mask around the putative active site of AftD where additional density could be

12 seen. Focused 3D classification without alignments was then performed using Relion with five

13 classes. Different T values (10, 20, 30, 40, 50, 60, 70, 80 and 90) were titrated and T=60 was

14 determined by visual inspection and comparison of classes to give the best classification. Classes 3,

15 4 and 5 looked noisy and not significantly different. Classes 1 and 2 had high resolution details and

16 had significant differences between them. A CryoSPARC non-uniform refinement was then

17 performed using particles from each of these classes to give final maps at 3.5 Å and 3.4 Å each

18 respectively. Both final maps were sharpened using phenix.auto_sharpen, using b-factors of -125.09

19 Å2 and -129.19 Å2 respectively.

20

21 All conversions between Relion, CryoSPARC, and cisTEM were performed using Daniel

22 Asarnow’s pyem script (unpublished, https://github.com/asarnow/pyem).

23

24 Model Building and Refinement

Tan et al. 76

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.22.885152; this version posted December 23, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

1 In order to aid in model building against a single structure, phenix.combine_focused_maps (Adams

2 et al., 2010; Afonine et al., 2018) was used to combine the WT AftD global refined map and WT

3 AftD-CBM3 focused classified map into the WT AftD composite map.

4

5 For model building of WT AftD, the Chimera (Pettersen et al., 2004) program was first used to

6 segment out the protein components and exclude the nanodisc. De novo model building for the

7 cryo-EM map its primary sequence was initiated using Buccaneer (Cowtan, 2006) within the CCP-

8 EM software suite (Burnley et al., 2017). After 2 rounds of Buccaneer model building, the partial

9 model was brought into Coot (Emsley and Cowtan, 2004) program for manual model building.

10 After the model was built, it was refined against the cryo-EM map utilizing real space refinement in

11 the Phenix program. Restraints for the lipids were generated using phenix.eLBOW and for the metal

12 ions using phenix.ready_set. Thereafter, model adjustment and refinement were performed

13 iteratively in Coot and Phenix, with the statistics being examined using Molprobity (Chen et al.,

14 2010) until no further improvements were observed. The final map and model were then validated

15 using 1) EMRinger (Barad et al., 2015) to compare map to model, 2) ResMap (Kucukelbir et al.,

16 2013) to calculate map local resolution and 3) 3DFSC program suite (Tan et al., 2017) to calculate

17 degree of directional resolution anisotropy through the 3DFSC. Map-to-model FSCs were also

18 calculated by first converting the model to a map using Chimera molmap function at Nyquist

19 resolution (2.121 Å). A mask was made from this map using Relion (after low-pass filtering to 8 Å,

20 extending by 1 pixel and applying a cosine-edge of 3 pixels), and was then applied to the density

21 map. Map-to-model FSC was calculated using EMAN (Ludtke et al., 1999) proc3d between these

22 maps.

23

Tan et al. 77

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.22.885152; this version posted December 23, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

1 For model building of AftD-R1389S, the WT structure was overlaid and Coot was used to manually

2 remove or build residues. Other than that, the model building and refinement strategy mirrored the

3 WT AftD mentioned above.

4

5 Model Analysis

6 A cavity search using the Solvent Extractor from Voss Volume Voxelator server (Voss and

7 Gerstein, 2010) was performed using an outer probe radius of 5 Å and inner probe radius of 2 Å. In

8 order to search for other PDB structures with similar fold, a Dali server (Holm and Laakso, 2016)

9 search was performed – first globally and then against the different domains of the model.

10

11 Screening, Imaging and Data Analysis of the Glycan Array

12 Glycan array analysis was done with M. abscessus AftD WT protein solubilized in DDM. Slides

13 were prewetted in buffer A (25 mM Tris-HCl pH 7.8, 0.15 mM NaCl, 2 mM CaCl2 and 0.05%

14 Tween 20) for 5 min, rinsed with buffer B (25 mM Tris-HCl pH 7.8, 0.15 mM NaCl, and 2 mM

15 CaCl2) three times, and blocked overnight with buffer C (1% BSA in 25 mM Tris-HCl pH 7.8,

16 0.15mM NaCl, and 2 mM CaCl2) at 4 °C. Aliquots (500 μL) of serial dilutions of protein samples in

17 buffer C were transferred to wells of the slide module immediately after aspiration of the blocking

18 buffer. Wells were sealed with an adhesive seal and incubated for 60 min at 37 °C. Protein was

19 removed by aspiration, and slides were washed 10 times with buffer A and three times with buffer

20 B. Fluorescence was measured directly or after addition of a secondary antibody in buffer C (1:1000

21 dilution). Slides were incubated with a secondary antibody at room temperature for 40 min before

22 being washed repeatedly with buffer A and deionized water.

23

Tan et al. 78

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.22.885152; this version posted December 23, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

1 Before being scanned, slides were dried by centrifugation. Microarrays were scanned at 5 μm

2 resolution with a GenePix 4000B scanner (Molecular Devices, Sunnyvale, CA). The fluorescent

3 signal was detected at 532 nm for Cy3 or Alexa Fluor 555 and 488 nm for Alexa Fluor 488. The

4 laser power was 100%, and the photomultiplier tube gain was 400. The fluorescent signals were

5 analyzed by quantifying the pixel density (intensity) of each spot using GenePix ProMicroarray

6 Image Analysis Software version 6.1. Fluorescence intensity values for each spot and its

7 background were calculated. The local background signal was automatically subtracted from the

8 signal of each separate spot, and the mean signal intensity of each spot was used for data analysis.

9 Averages of triplicate experiments and standard deviations were calculated using Microsoft Excel.

10

11 M. smegmatis Overexpression of M. abscessus AftD

12 M. abscessus AftD, containing N-terminal FLAG and His10 and tags, was cloned into pVVAP

13 (Catalão et al., 2010) plasmid for over-expression. This plasmid contains a pAL5000 replication

14 origin, a kanamycin resistance cassette, and allows expression of Nterminally

15 FLAG+His10tagged AftD. The plasmid was transformed by electroporation into M. smegmatis

16 MC2155 and plated onto Middlebrook 7H10 (Sigma-Aldrich), plates, supplemented with 0.5%

17 glucose and 15 µg mL-1 kanamycin, and left at 37°C until single colonies were visible (2 to 3 days).

18 A single colony was picked to inoculate 30 mL of Middlebrook 7H9 medium (USBiological)

19 supplemented with 0.5% glucose, 0.05% Tween 80 and 15 µg mL-1 kanamycin, to grow overnight

20 at 37°C with constant orbital shaking at 220 rpm. The starter culture was then used the next day to

21 inoculate 5 L of Middlebrook 7H9 medium (USBiological) supplemented with 0.2% succinate,

-1 22 0.05% Tween 80 and 15 µg mL kanamycin, at an initial OD600nm of 0.05, and left to grow

23 overnight at 37°C with constant orbital shaking at 220 rpm. The next morning, AftD expression was

24 induced by adding acetamide to a final concentration of 0.2%, once the culture reached an OD600nm

Tan et al. 79

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.22.885152; this version posted December 23, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

1 of 0.8 to 1.0, roughly 16 hr after inoculation. The culture was grown for 24 hr after induction at

2 37°C with constant orbital shaking at 220 rpm, then harvested at 4°C by centrifugation for 20 min at

3 4000 rpm (JA-10 rotor, Beckman Coulter). Supernatant was then discarded, and the cell pellet was

4 washed once with PBS, centrifuged again at 4°C for 20 min at 4000 rpm, then frozen at -20°C.

5

6 The cells were thawed and resuspended in lysis buffer containing 20 mM HEPES pH 7.5, 200 mM

7 NaCl, 20 mM MgSO4, 10 μg/mL DNase I, 1 mM TCEP, 1 tablet/1.5 L buffer EDTA-free cOmplete

8 protease inhibitor cocktail (Roche). For 5 L of culture, there was ~10 g of wet cell pellet mass,

9 which was resuspended in ~60mL of lysis buffer. Cells were then lysed in a French Press, at 1500

10 bar, after 4 passages. To isolate the membrane fractions, the lysate was put through

11 ultracentrifugation at 42,000g in Type 45 Ti Rotor (Beckman Coulter) at 4°C for 45 min. The

12 supernatant was discarded, the pellet was resuspended in the lysis buffer up to a volume of 15 mL

13 and homogenized using a hand-held glass homogenizer (Wheaton) and then stored at -20°C. The

14 thawed membrane fraction was solubilized by adding DDM to a final concentration of 1% (w/v)

15 detergent for 3 hours at 4°C with gentle rotation. Insoluble material was removed by

16 ultracentrifugation at 42,000g in Type 45 Ti Rotor at 4°C for 45 min. The supernatant was added to

17 a Falcon50 tube containing pre-equilibrated ANTI-FLAG® M2 Affinity Gel (Sigma) and incubated

18 with gentle rotation overnight at 4°C. The gel was washed with 10 column volumes of wash buffer

19 containing 20 mM HEPES pH 7.5, 200 mM NaCl, 0.05% DDM and incubated at 4°C with gentle

20 rotation for 1h with elution buffer containing 20 mM HEPES pH 7.5, 200mM NaCl, 0.05% DDM,

21 100 μg mL-1 FLAG peptide (Sigma), then finally eluted. The eluted fraction was then run on both

22 SDS-PAGE and Native-PAGE gels and sent for subsequent mass spectrometry analysis.

23

24 Mass spectrometry analysis and protein identification

Tan et al. 80

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.22.885152; this version posted December 23, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

1 Samples were subjected to trypsin digestion. Briefly, protein sample was reduced with 10 mM DTT

2 (Sigma) for 40 min at 56 °C followed by alkylation with iodoacetamide (Sigma) 55 mM for 30 min

3 in the dark. Excessive iodoacetamide was quenched by further incubation with DTT (10 mM for 10

4 min in the dark). The resulting sample was digested overnight with trypsin (Proteomics grade from

5 Promega) at 37 °C (1:50 protein/trypsin ratio), dried and resuspended in 8 µL LCMS water 0.1%

6 formic acid (Fisher Chemicals).

7

8 Nano-liquid chromatography-tandem mass spectrometry (nanoLC-MS/MS) analysis was performed

9 on an ekspert™ NanoLC 425 cHiPLC® system coupled with a TripleTOF® 6600 with a

10 NanoSpray® III source (Sciex). Peptides were separated through reversed-phase chromatography

11 (RP-LC) in a trap-and-elute mode. Trapping was performed at 2 µl/min with 100% A (0.1% formic

12 acid in water, Fisher Chemicals, Geel, Belgium), for 10 min, on a Nano cHiPLC Trap column

13 (Sciex 200 µm x 0.5 mm, ChromXP C18-CL, 3 µm, 120 Å). Separation was performed at 300

14 nl/min, on a Nano cHiPLC column (Sciex 75 µm x 15 cm, ChromXP C18-CL, 3 µm, 120 Å). The

15 gradient was as follows: 0-1 min, 5% B (0.1% formic acid in acetonitrile, Fisher Chemicals); 1-61

16 min, 5-30% B; 61-63 min, 30-80% B; 63-71 min, 80% B; 71-73 min, 80-5% B; 73-90 min, 5% B.

17

18 Peptides were sprayed into the MS through an uncoated fused-silica PicoTip™ emitter (360 µm

19 O.D., 20 µm I.D., 10 ± 1.0 µm tip I.D., New Objective). The source parameters were set as follows:

20 15 GS1, 0 GS2, 30 CUR, 2.5 keV ISVF and 100 C IHT. An information dependent acquisition

21 (IDA) method was set with a TOF-MS survey scan of 400-2000 m/z for 250 msec. The 50 most

22 intense precursors were selected for subsequent fragmentation and the MS/MS were acquired in

23 high sensitivity mode (150-1800 m/z for 40 msec each). The selection criteria for parent ions

24 included a charge state between +2 and +5 and counts above a minimum threshold of 125 counts

Tan et al. 81

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.22.885152; this version posted December 23, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

1 per second. Ions were excluded from further MSMS analysis for 12 s. Fragmentation was

2 performed using rolling collision energy with a collision energy spread of 5.

3

4 The obtained spectra were processed and analyzed using ProteinPilot™ software, with the Paragon

5 search engine (version 5.0, Sciex). The following search parameters were set: search against M.

6 smegmatis and E. coli from Uniprot/SwissProt database, and protein sequence of AftD from M.

7 abscessus; Iodoacetamide, as Cys alkylation; Tryspsin, as digestion; TripleTOF 6600, as the

8 Instrument; ID focus as biological modifications and Amino acid substitutions; search effort as

9 thorough; and a FDR analysis. Only the proteins with Unused Protein Score above 1.3 and 95%

10 confidence were considered.

11

12

Tan et al. 82

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.22.885152; this version posted December 23, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

1 REFERENCES

2 Abrahams, K.A., and Besra, G.S. (2016). Mycobacterial cell wall biosynthesis: a multifaceted 3 antibiotic target. Parasitology, 1-18. 4 Adams, P.D., Afonine, P.V., Bunkóczi, G., Chen, V.B., Davis, I.W., Echols, N., Headd, J.J., Hung, 5 L.-W., Kapral, G.J., and Grosse-Kunstleve, R.W. (2010). PHENIX: a comprehensive Python-based 6 system for macromolecular structure solution. Acta Crystallographica Section D: Biological 7 Crystallography 66, 213-221. 8 Afonine, P.V., Poon, B.K., Read, R.J., Sobolev, O.V., Terwilliger, T.C., Urzhumtsev, A., and 9 Adams, P.D. (2018). Real-space refinement in PHENIX for cryo-EM and crystallography. Acta 10 Crystallographica Section D: Structural Biology 74. 11 Alderwick, L., Birch, H., Mishra, A., Eggeling, L., and Besra, G. (2007). Structure, function and 12 biosynthesis of the Mycobacterium tuberculosis cell wall: arabinogalactan and lipoarabinomannan 13 assembly with a view to discovering new drug targets (Portland Press Limited). 14 Alderwick, L.J., Birch, H.L., Krumbach, K., Bott, M., Eggeling, L., and Besra, G.S. (2018). AftD 15 functions as an α1→ 5 arabinofuranosyltransferase involved in the biosynthesis of the 16 mycobacterial cell wall core. The Cell Surface 1, 2-14. 17 Alderwick, L.J., Harrison, J., Lloyd, G.S., and Birch, H.L. (2015). The Mycobacterial cell wall— 18 Peptidoglycan and arabinogalactan. Cold Spring Harbor perspectives in medicine, a021113. 19 Alderwick, L.J., Seidel, M., Sahm, H., Besra, G.S., and Eggeling, L. (2006). Identification of a 20 novel arabinofuranosyltransferase (AftA) involved in cell wall arabinan biosynthesis in 21 Mycobacterium tuberculosis. Journal of Biological Chemistry 281, 15653-15661. 22 Alsteens, D., Verbelen, C., Dague, E., Raze, D., Baulard, A.R., and Dufrêne, Y.F. (2008). 23 Organization of the mycobacterial cell wall: a nanoscale view. Pflügers Archiv-European Journal of 24 Physiology 456, 117-125. 25 Altschul, S.F., and Koonin, E.V. (1998). Iterated profile searches with PSI-BLAST—a tool for 26 discovery in protein databases. Trends in biochemical sciences 23, 444-447. 27 Angala, S.k., McNeil, M.R., Zou, L., Liav, A., Zhang, J., Lowary, T.L., and Jackson, M. (2016). 28 Identification of a novel mycobacterial arabinosyltransferase activity which adds an arabinosyl 29 residue to α-D-mannosyl residues. ACS chemical biology 11, 1518-1524. 30 Bai, L., Kovach, A., You, Q., Kenny, A., and Li, H. (2019). Structure of the eukaryotic protein O- 31 mannosyltransferase Pmt1− Pmt2 complex. Nature structural & molecular biology 26, 704-711. 32 Bai, L., Wang, T., Zhao, G., Kovach, A., and Li, H. (2018). The atomic structure of a eukaryotic 33 oligosaccharyltransferase complex. Nature 555, 328. 34 Bai, X.-c., Rajendra, E., Yang, G., Shi, Y., and Scheres, S.H. (2015). Sampling the conformational 35 space of the catalytic subunit of human γ-secretase. Elife 4, e11182. 36 Barad, B.A., Echols, N., Wang, R.Y.-R., Cheng, Y., DiMaio, F., Adams, P.D., and Fraser, J.S. 37 (2015). EMRinger: side chain–directed model and map validation for 3D cryo-electron microscopy. 38 Nature methods 12, 943. 39 Barry, C.E., Crick, D.C., and McNeil, M.R. (2007). Targeting the formation of the cell wall core of 40 M. tuberculosis. Infectious Disorders-Drug Targets (Formerly Current Drug Targets-Infectious 41 Disorders) 7, 182-202. 42 Barry III, C.E. (2001). Interpreting cell wall ‘virulence factors’ of Mycobacterium tuberculosis. 43 Trends in microbiology 9, 237-241. 44 Battesti, A., and Bouveret, E. (2009). Bacteria possessing two RelA/SpoT-like proteins have 45 evolved a specific stringent response involving the acyl carrier protein-SpoT interaction. Journal of 46 bacteriology 191, 616-624.

Tan et al. 83

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.22.885152; this version posted December 23, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

1 Bayburt, T.H., and Sligar, S.G. (2010). Membrane protein assembly into Nanodiscs. FEBS letters 2 584, 1721-1727. 3 Belanger, A.E., Besra, G.S., Ford, M.E., Mikusová, K., Belisle, J.T., Brennan, P.J., and Inamine, 4 J.M. (1996). The embAB genes of Mycobacterium avium encode an arabinosyl transferase involved 5 in cell wall arabinan biosynthesis that is the target for the antimycobacterial drug ethambutol. 6 Proceedings of the National Academy of Sciences 93, 11919-11924. 7 Birch, H.L., Alderwick, L.J., Bhatt, A., Rittmann, D., Krumbach, K., Singh, A., Bai, Y., Lowary, 8 T.L., Eggeling, L., and Besra, G.S. (2008). Biosynthesis of mycobacterial arabinogalactan: 9 identification of a novel α (1→ 3) arabinofuranosyltransferase. Molecular microbiology 69, 1191- 10 1206. 11 Boraston, A.B., Bolam, D.N., Gilbert, H.J., and Davies, G.J. (2004). Carbohydrate-binding 12 modules: fine-tuning polysaccharide recognition. Biochemical journal 382, 769-781. 13 Boraston, A.B., Kwan, E., Chiu, P., Warren, R.A.J., and Kilburn, D.G. (2003). Recognition and 14 hydrolysis of noncrystalline cellulose. Journal of biological chemistry 278, 6120-6127. 15 Bruni, R., and Kloss, B. (2013). High‐Throughput Cloning and Expression of Integral Membrane 16 Proteins in Escherichia coli. Current protocols in protein science 74, 29.26. 21-29.26. 34. 17 Burnley, T., Palmer, C.M., and Winn, M. (2017). Recent developments in the CCP‐EM software 18 suite. Acta Crystallographica Section D 73, 469-477. 19 Catalão, M.J., Gil, F., Moniz‐Pereira, J., and Pimentel, M. (2010). The mycobacteriophage Ms6 20 encodes a chaperone‐like protein involved in the endolysin delivery to the peptidoglycan. 21 Molecular microbiology 77, 672-686. 22 Chen, V.B., Arendall, W.B., Headd, J.J., Keedy, D.A., Immormino, R.M., Kapral, G.J., Murray, 23 L.W., Richardson, J.S., and Richardson, D.C. (2010). MolProbity: all-atom structure validation for 24 macromolecular crystallography. Acta Crystallographica Section D: Biological Crystallography 66, 25 12-21. 26 Cheng, A., Eng, E.T., Alink, L., Rice, W.J., Jordan, K.D., Kim, L.Y., Potter, C.S., and Carragher, B. 27 (2018). High resolution single particle cryo-electron microscopy using beam-image shift. Journal of 28 Structural Biology. 29 Cohen, K.A., El-Hay, T., Wyres, K.L., Weissbrod, O., Munsamy, V., Yanover, C., Aharonov, R., 30 Shaham, O., Conway, T.C., and Goldschmidt, Y. (2016). Paradoxical hypersusceptibility of drug- 31 resistant mycobacteriumtuberculosis to β-lactam antibiotics. EBioMedicine 9, 170-179. 32 Cowtan, K. (2006). The Buccaneer software for automated model building. 1. Tracing protein 33 chains. Acta Crystallographica Section D: Biological Crystallography 62, 1002-1011. 34 Cronan, J.E. (2014). The chain-flipping mechanism of ACP (acyl carrier protein)-dependent 35 enzymes appears universal. Biochemical Journal 460, 157-163. 36 Dandey, V.P., Wei, H., Zhang, Z., Tan, Y.Z., Acharya, P., Eng, E.T., Rice, W.J., Kahn, P.A., Potter, 37 C.S., and Carragher, B. (2018). Spotiton: New features and applications. Journal of Structural 38 Biology 202, 161-169. 39 DeJesus, M.A., Gerrick, E.R., Xu, W., Park, S.W., Long, J.E., Boutte, C.C., Rubin, E.J., 40 Schnappinger, D., Ehrt, S., and Fortune, S.M. (2017). Comprehensive essentiality analysis of the 41 Mycobacterium tuberculosis genome via saturating transposon mutagenesis. MBio 8, e02133- 42 02116. 43 Edgar, R.C. (2004). MUSCLE: multiple sequence alignment with high accuracy and high 44 throughput. Nucleic acids research 32, 1792-1797. 45 Emsley, P., and Cowtan, K. (2004). Coot: model-building tools for molecular graphics. Acta 46 Crystallographica Section D: Biological Crystallography 60, 2126-2132.

Tan et al. 84

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.22.885152; this version posted December 23, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

1 Favrot, L., and Ronning, D.R. (2012). Targeting the mycobacterial envelope for tuberculosis drug 2 development. Expert review of anti-infective therapy 10, 1023-1036. 3 Goude, R., Amin, A., Chatterjee, D., and Parish, T. (2009). The arabinosyltransferase EmbC is 4 inhibited by ethambutol in Mycobacterium tuberculosis. Antimicrobial agents and chemotherapy 5 53, 4138-4146. 6 Grant, T., Rohou, A., and Grigorieff, N. (2018). cisTEM, user-friendly software for single-particle 7 image processing. Elife 7, e35383. 8 Grzegorzewicz, A.E., De Sousa-d'Auria, C., McNeil, M.R., Huc-Claustre, E., Jones, V., Petit, C., 9 Angala, S.K., Zemanová, J., Wang, Q., and Belardinelli, J.M. (2016). Assembling of the 10 Mycobacterium tuberculosis cell wall core. Journal of Biological Chemistry, jbc. M116. 739227. 11 Hoffmann, C., Leis, A., Niederweis, M., Plitzko, J.M., and Engelhardt, H. (2008). Disclosure of the 12 mycobacterial outer membrane: cryo-electron tomography and vitreous sections reveal the lipid 13 bilayer structure. Proceedings of the National Academy of Sciences 105, 3963-3967. 14 Holm, L., and Laakso, L.M. (2016). Dali server update. Nucleic acids research 44, W351-W355. 15 Islam, M.M., Hameed, H.A., Mugweru, J., Chhotaray, C., Wang, C., Tan, Y., Liu, J., Li, X., Tan, 16 S., and Ojima, I. (2017). Drug resistance mechanisms and novel drug targets for tuberculosis 17 therapy. Journal of genetics and genomics 44, 21-37. 18 Jain, T., Sheehan, P., Crum, J., Carragher, B., and Potter, C.S. (2012). Spotiton: a prototype for an 19 integrated inkjet dispense and vitrification system for cryo-TEM. Journal of structural biology 179, 20 68-75. 21 Jankute, M., Cox, J.A., Harrison, J., and Besra, G.S. (2015). Assembly of the mycobacterial cell 22 wall. Annual review of microbiology 69, 405-423. 23 Jena, L., Nayak, T., Deshmukh, S., Wankhade, G., Waghmare, P., and Harinath, C. (2016). 24 Isoniazid with multiple mode of action on various mycobacterial enzymes resulting in drug 25 resistance. J Infect Dis Ther 4, 2332-0877.1000297. 26 Joe, M., and Lowary, T.L. (2006). Synthesis of 2-deoxy-2-fluoro analogs of polyprenyl β-D- 27 arabinofuranosyl phosphates. Carbohydrate research 341, 2723-2730. 28 Kapust, R.B., Tözsér, J., Copeland, T.D., and Waugh, D.S. (2002). The P1′ specificity of tobacco 29 etch virus protease. Biochemical and biophysical research communications 294, 949-955. 30 Kim, J.-H., O’Brien, K.M., Sharma, R., Boshoff, H.I., Rehren, G., Chakraborty, S., Wallach, J.B., 31 Monteleone, M., Wilson, D.J., and Aldrich, C.C. (2013). A genetic strategy to identify targets for 32 the development of drugs that prevent bacterial persistence. Proceedings of the National Academy 33 of Sciences 110, 19095-19100. 34 Kimanius, D., Forsberg, B.O., Scheres, S.H., and Lindahl, E. (2016). Accelerated cryo-EM structure 35 determination with parallelisation using GPUs in RELION-2. Elife 5, e18722. 36 Kordulakova, J., Gilleron, M., Mikusová , K.n., Puzo, G., Brennan, P.J., Gicquel, B., and Jackson, 37 M. (2002). Definition of the first mannosylation step in phosphatidylinositol mannoside synthesis 38 PimA is essential for growth of mycobacteria. Journal of Biological Chemistry 277, 31335-31344. 39 Kreamer, N.N., Chopra, R., Caughlan, R.E., Fabbro, D., Fang, E., Gee, P., Hunt, I., Li, M., Leon, 40 B.C., and Muller, L. (2018). Acylated-acyl carrier protein stabilizes the Pseudomonas aeruginosa 41 WaaP lipopolysaccharide heptose kinase. Scientific reports 8, 14124. 42 Kremer, L., Nampoothiri, K.M., Lesjean, S., Dover, L.G., Graham, S., Betts, J., Brennan, P.J., 43 Minnikin, D.E., Locht, C., and Besra, G.S. (2001). Biochemical Characterization of Acyl Carrier 44 Protein (AcpM) and Malonyl-CoA: AcpM Transacylase (mtFabD), Two Major Components 45 ofMycobacterium tuberculosis Fatty Acid Synthase II. Journal of Biological Chemistry 276, 27967- 46 27974.

Tan et al. 85

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.22.885152; this version posted December 23, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

1 Krogh, A., Larsson, B., Von Heijne, G., and Sonnhammer, E.L. (2001). Predicting transmembrane 2 protein topology with a hidden Markov model: application to complete genomes. Journal of 3 molecular biology 305, 567-580. 4 Kucukelbir, A., Sigworth, F.J., and Tagare, H.D. (2013). Quantifying the local resolution of cryo- 5 EM density maps. Nature methods 11, 63. 6 Lairson, L., Henrissat, B., Davies, G., and Withers, S. (2008). Glycosyltransferases: structures, 7 functions, and mechanisms. Annual review of biochemistry 77. 8 Lander, G.C., Stagg, S.M., Voss, N.R., Cheng, A., Fellmann, D., Pulokas, J., Yoshioka, C., Irving, 9 C., Mulder, A., Lau, P.W., et al. (2009). Appion: an integrated, database-driven pipeline to facilitate 10 EM image processing. J Struct Biol 166, 95-102. 11 Layre, E., Al-Mubarak, R., Belisle, J.T., and Moody, D.B. (2014). Mycobacterial lipidomics. In 12 Molecular Genetics of Mycobacteria, Second Edition (American Society of Microbiology), pp. 341- 360. 13 360. 14 Lea-Smith, D.J., Martin, K.L., Pyke, J.S., Tull, D., McConville, M.J., Coppel, R.L., and Crellin, 15 P.K. (2008). Analysis of a new mannosyltransferase required for the synthesis of 16 phosphatidylinositol mannosides and lipoarbinomannan reveals two lipomannan pools in 17 corynebacterineae. Journal of Biological Chemistry 283, 6773-6782. 18 Li, W., and Godzik, A. (2006). Cd-hit: a fast program for clustering and comparing large sets of 19 protein or nucleotide sequences. Bioinformatics 22, 1658-1659. 20 Liu, J., and Mushegian, A. (2003). Three monophyletic superfamilies account for the majority of 21 the known glycosyltransferases. Protein Science 12, 1418-1431. 22 Lizak, C., Gerber, S., Numao, S., Aebi, M., and Locher, K.P. (2011). X-ray structure of a bacterial 23 oligosaccharyltransferase. Nature 474, 350. 24 Love, J., Mancia, F., Shapiro, L., Punta, M., Rost, B., Girvin, M., Wang, D.-N., Zhou, M., Hunt, 25 J.F., and Szyperski, T. (2010). The New York Consortium on Membrane Protein Structure 26 (NYCOMPS): a high-throughput platform for structural genomics of integral membrane proteins. 27 Journal of structural and functional genomics 11, 191-199. 28 Ludtke, S.J., Baldwin, P.R., and Chiu, W. (1999). EMAN: semiautomated software for high- 29 resolution single-particle reconstructions. Journal of structural biology 128, 82-97. 30 Ludwig, W., Euzéby, J., Schumann, P., Busse, H.J., Trujillo, M.E., Kämpfer, P., and Whitman, 31 W.B. (2015). Road map of the phylum Actinobacteria. Bergey's Manual of Systematics of Archaea 32 and Bacteria, 1-37. 33 Majerus, P.W., Alberts, A.W., and Vagelos, P.R. (1965). Acyl carrier protein, IV. The identification 34 of 4′-phosphopantetheine as the prosthetic group of the acyl carrier protein. Proceedings of the 35 National Academy of Sciences 53, 410-417. 36 Mancia, F., and Love, J. (2010). High-throughput expression and purification of membrane 37 proteins. Journal of structural biology 172, 85-93. 38 Matsumoto, S., Shimada, A., Nyirenda, J., Igura, M., Kawano, Y., and Kohda, D. (2013). Crystal 39 structures of an archaeal oligosaccharyltransferase provide insights into the catalytic cycle of N- 40 linked protein glycosylation. Proceedings of the National Academy of Sciences 110, 17868-17873. 41 Matsumoto, S., Taguchi, Y., Shimada, A., Igura, M., and Kohda, D. (2017). Tethering an N- 42 glycosylation sequon-containing peptide creates a catalytically competent oligosaccharyltransferase 43 complex. Biochemistry 56, 602-611. 44 Morita, Y.S., Sena, C.B., Waller, R.F., Kurokawa, K., Sernee, M.F., Nakatani, F., Haites, R.E., 45 Billman-Jacobe, H., McConville, M.J., and Maeda, Y. (2006). PimE is a polyprenol-phosphate- 46 mannose-dependent mannosyltransferase that transfers the fifth mannose of phosphatidylinositol 47 mannoside in mycobacteria. Journal of Biological Chemistry.

Tan et al. 86

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.22.885152; this version posted December 23, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

1 Napiórkowska, M., Boilevin, J., Darbre, T., Reymond, J.-L., and Locher, K.P. (2018). Structure of 2 bacterial oligosaccharyltransferase PglB bound to a reactive LLO and an inhibitory peptide. 3 Scientific reports 8, 16297. 4 Napiórkowska, M., Boilevin, J., Sovdat, T., Darbre, T., Reymond, J.-L., Aebi, M., and Locher, K.P. 5 (2017). Molecular basis of lipid-linked oligosaccharide recognition and processing by bacterial 6 oligosaccharyltransferase. Nature structural & molecular biology 24, 1100. 7 Nguyen, C., Haushalter, R.W., Lee, D.J., Markwick, P.R., Bruegger, J., Caldara-Festin, G., Finzel, 8 K., Jackson, D.R., Ishikawa, F., and O’dowd, B. (2014). Trapping the dynamic acyl carrier protein 9 in fatty acid biosynthesis. Nature 505, 427. 10 Nikaido, H., and Jarlier, V. (1991). Permeability of the mycobacterial cell wall. Research in 11 microbiology 142, 437-443. 12 Ofer, N., Wishkautzan, M., Meijler, M., Wang, Y., Speer, A., Niederweis, M., and Gur, E. (2012). 13 Ectoine biosynthesis in Mycobacterium smegmatis. Appl Environ Microbiol 78, 7483-7486. 14 Petrou, V.I., Herrera, C.M., Schultz, K.M., Clarke, O.B., Vendome, J., Tomasek, D., Banerjee, S., 15 Rajashankar, K.R., Dufrisne, M.B., and Kloss, B. (2016). Structures of aminoarabinose transferase 16 ArnT suggest a molecular basis for lipid A glycosylation. Science 351, 608-612. 17 Pettersen, E.F., Goddard, T.D., Huang, C.C., Couch, G.S., Greenblatt, D.M., Meng, E.C., and 18 Ferrin, T.E. (2004). UCSF Chimera—a visualization system for exploratory research and analysis. 19 Journal of computational chemistry 25, 1605-1612. 20 Portevin, D., de Sousa-D'Auria, C., Houssin, C., Grimaldi, C., Chami, M., Daffé, M., and Guilhot, 21 C. (2004). A polyketide synthase catalyzes the last condensation step of mycolic acid biosynthesis 22 in mycobacteria and related organisms. Proceedings of the National Academy of Sciences 101, 314- 319. 23 319. 24 Punjani, A., Rubinstein, J.L., Fleet, D.J., and Brubaker, M.A. (2017). cryoSPARC: algorithms for 25 rapid unsupervised cryo-EM structure determination. Nat Methods 14, 290-296. 26 Razinkov, I., Dandey, V.P., Wei, H., Zhang, Z., Melnekoff, D., Rice, W.J., Wigge, C., Potter, C.S., 27 and Carragher, B. (2016). A new method for vitrifying samples for cryoEM. Journal of structural 28 biology 195, 190-198. 29 Rohou, A., and Grigorieff, N. (2015). CTFFIND4: Fast and accurate defocus estimation from 30 electron micrographs. J Struct Biol 192, 216-221. 31 Rosenthal, P.B., and Henderson, R. (2003). Optimal determination of particle orientation, absolute 32 hand, and contrast loss in single-particle electron cryomicroscopy. J Mol Biol 333, 721-745. 33 Schaeffer, M.L., Agnihotri, G., Kallender, H., Brennan, P.J., and Lonsdale, J.T. (2001). Expression, 34 purification, and characterization of the Mycobacterium tuberculosis acyl carrier protein, AcpM. 35 Biochimica et Biophysica Acta (BBA)-Molecular and Cell Biology of Lipids 1532, 67-78. 36 Scheres, S.H. (2012). RELION: implementation of a Bayesian approach to cryo-EM structure 37 determination. J Struct Biol 180, 519-530. 38 Seidel, M., Alderwick, L.J., Birch, H.L., Sahm, H., Eggeling, L., and Besra, G.S. (2007). 39 Identification of a novel arabinofuranosyltransferase AftB involved in a terminal step of cell wall 40 arabinan biosynthesis in Corynebacterianeae, such as Corynebacterium glutamicum and 41 Mycobacterium tuberculosis. Journal of Biological Chemistry. 42 Sharma, C.B., LEHLE, L., and TANNER, W. (1981). N‐Glycosylation of Yeast Proteins: 43 Characterization of the Solubilized Oligosaccharyl Transferase. European journal of biochemistry 44 116, 101-108. 45 Shekar, S., Yeo, Z.X., Wong, J.C., Chan, M.K., Ong, D.C., Tongyoo, P., Wong, S.-Y., and Lee, 46 A.S. (2014). Detecting novel genetic variants associated with isoniazid-resistant Mycobacterium 47 tuberculosis. PloS one 9, e102383.

Tan et al. 87

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.22.885152; this version posted December 23, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

1 Škovierová, H., Larrouy-Maumus, G., Zhang, J., Kaur, D., Barilone, N., Korduláková, J., Gilleron, 2 M., Guadagnini, S., Belanová, M., and Prevost, M.-C. (2009). AftD, a novel essential 3 arabinofuranosyltransferase from mycobacteria. Glycobiology 19, 1235-1247. 4 Stewart, G.R., Wernisch, L., Stabler, R., Mangan, J.A., Hinds, J., Laing, K.G., Butcher, P.D., and 5 Young, D.B. (2002). The heat shock response of Mycobacterium tuberculosis: linking gene 6 expression, immunology and pathogenesis. International Journal of Genomics 3, 348-351. 7 Suloway, C., Pulokas, J., Fellmann, D., Cheng, A., Guerra, F., Quispe, J., Stagg, S., Potter, C.S., 8 and Carragher, B. (2005). Automated molecular microscopy: the new Leginon system. Journal of 9 structural biology 151, 41-60. 10 Takayama, K., and Kilburn, J.O. (1989). Inhibition of synthesis of arabinogalactan by ethambutol in 11 Mycobacterium smegmatis. Antimicrobial agents and chemotherapy 33, 1493-1499. 12 Takayama, K., Wang, C., and Besra, G.S. (2005). Pathway to synthesis and processing of mycolic 13 acids in Mycobacterium tuberculosis. Clinical microbiology reviews 18, 81-101. 14 Tallorin, L., Finzel, K., Nguyen, Q.G., Beld, J., La Clair, J.J., and Burkart, M.D. (2016). Trapping 15 of the enoyl-acyl carrier protein reductase–acyl carrier protein interaction. Journal of the American 16 Chemical Society 138, 3962-3965. 17 Tan, Y.Z., Baldwin, P.R., Davis, J.H., Williamson, J.R., Potter, C.S., Carragher, B., and Lyumkis, 18 D. (2017). Addressing preferred specimen orientation in single-particle cryo-EM through tilting. 19 Nature methods 14, 793. 20 Telenti, A., Philipp, W.J., Sreevatsan, S., Bernasconi, C., Stockbauer, K.E., Wieles, B., Musser, 21 J.M., and Jacobs Jr, W.R. (1997). The emb operon, a gene cluster of Mycobacterium tuberculosis 22 involved in resistance to ethambutol. Nature medicine 3, 567. 23 Therisod, H., and Kennedy, E.P. (1987). The function of acyl carrier protein in the synthesis of 24 membrane-derived oligosaccharides does not require its phosphopantetheine prosthetic group. 25 Proceedings of the National Academy of Sciences 84, 8235-8238. 26 Voss, N.R., and Gerstein, M. (2010). 3V: cavity, channel and cleft volume calculator and extractor. 27 Nucleic Acids Res 38, W555-562. 28 Wallace, A.C., Laskowski, R.A., and Thornton, J.M. (1995). LIGPLOT: a program to generate 29 schematic diagrams of protein-ligand interactions. Protein engineering, design and selection 8, 127- 134. 30 134. 31 Wei, H., Dandey, V.P., Zhang, Z., Raczkowski, A., Rice, W.J., Carragher, B., and Potter, C.S. 32 (2018). Optimizing “self-wicking” nanowire grids. Journal of structural biology 202, 170-174. 33 WHO (2018). Global tuberculosis report 2018. 34 Wild, R., Kowal, J., Eyring, J., Ngwa, E.M., Aebi, M., and Locher, K.P. (2018). Structure of the 35 yeast oligosaccharyltransferase complex gives insight into eukaryotic N-glycosylation. Science 359, 36 545-550. 37 Wolucka, B.A., McNeil, M.R., de Hoffmann, E., Chojnacki, T., and Brennan, P.J. (1994). 38 Recognition of the lipid intermediate for arabinogalactan/arabinomannan biosynthesis and its 39 relation to the mode of action of ethambutol on mycobacteria. Journal of Biological Chemistry 269, 40 23328-23335. 41 Wong, H.C., Liu, G., Zhang, Y.-M., Rock, C.O., and Zheng, J. (2002). The solution structure of 42 acyl carrier protein from Mycobacterium tuberculosis. Journal of Biological Chemistry 277, 15874- 43 15880. 44 Zhang, K. (2016). Gctf: Real-time CTF determination and correction. J Struct Biol 193, 1-12. 45 Zheng, R.B., Jégouzo, S.A., Joe, M., Bai, Y., Tran, H.-A., Shen, K., Saupe, J.r., Xia, L., Ahmed, 46 M.F., and Liu, Y.-H. (2017a). Insights into interactions of mycobacteria with the host innate 47 immune system from a novel array of synthetic mycobacterial glycans. ACS chemical biology 12, 48 2990-3002. Tan et al. 88

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.22.885152; this version posted December 23, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

1 Zheng, S.Q., Palovcak, E., Armache, J.P., Verba, K.A., Cheng, Y., and Agard, D.A. (2017b). 2 MotionCor2: anisotropic correction of beam-induced motion for improved cryo-electron 3 microscopy. Nat Methods 14, 331-332. 4 Zimhony, O., Schwarz, A., Raitses-Gurevich, M., Peleg, Y., Dym, O., Albeck, S., Burstein, Y., and 5 Shakked, Z. (2015). AcpM, the meromycolate extension acyl carrier protein of Mycobacterium 6 tuberculosis, is activated by the 4′-phosphopantetheinyl transferase PptT, a potential target of the 7 multistep mycolic acid biosynthesis. Biochemistry 54, 2360-2371. 8

Tan et al. 89