SUPPLEMENTARY INFORMATION

Conserved-residue mutations in Wzy affect O-antigen polymerization and Wzz-mediated chain-length regulation in Pseudomonas aeruginosa PAO1

Salim T. Islam1, Steven M. Huszczynski1, Timothy Nugent2, Alexander C. Gold1, Joseph S. Lam1*

1Department of Molecular and Cellular Biology, University of Guelph, Guelph, Ontario, N1G 2W1, Canada

2Bioinformatics Group, Department of Computer Science, University College London, London, WC1E 6BT, United Kingdom

*corresponding author

Email: [email protected] Telephone: 519-824-4120 ext. 53823 Fax: 519-837-1802


Supplementary Figure S1. Wzx/Wzy-dependent synthesis pathway. 1) Cytoplasmic synthesis of UndPP-linked polysaccharide repeat units. 2) UndPP-linked repeat units are flipped across the inner membrane by Wzx. 3) The flipped UndPP-linked repeats are recruited by Wzy. 4) Addition of a single repeat unit at the reducing terminus of the growing chain to extend the polymer. 5) Regulation of the length of the growing chain by Wzz. 6) Release of the growing chain by Wzy, allowing for ligation of the polymer to lipid A–core oligosaccharide by WaaL. Reproduced with permission from Islam and Lam (2013) 1.


Supplementary Figure S2. Catch-and-release mechanism of WzyPa-mediated O-Ag polymerization. PL3 (net-positive charge) is depicted in dark blue. PL5 (net-negative charge) is coloured in light blue. RX10G motifs on each PL are represented by orange circles. 1) RX10G motif-mediated retention of a polymerized O-Ag chain by PL5. 2) Recruitment of a newly flipped UndPP-linked anionic repeat unit by the RX10G motif of cationic PL3. 3) Catalysis of a new glycosidic bond with α stereochemistry linking the repeat units at the reducing terminus of the growing chain. 4) RX10G-mediated re-binding of the polymerized chain by PL5. Modified with permission from Islam et al. (2011) 2.


Supplementary Figure S3. ClustalW2 alignment of PL3 (A162–V203) and PL5 (G272–L319) from WzyPa. Included are three residues from each flanking α-helical TMS (Fig. 1). The solid line represents the extent of each loop outside of the membrane, with black denoting the RX10G motifs and grey denoting the remainder of the loop. Highlighted residues have been coloured based on their conservation score (out of 10) as presented in Jalview 3. The colour key is as follows: red, score = 10; orange, score = 9; yellow, score = 8; green, score = 7. ●, residue targeted via site-directed mutagenesis in the current study; ○, residue previously targeted via site-directed mutagenesis 2. Both loop regions contain identical and structurally equivalent amino acids. Adapted from Islam et al. (2011) 2.

Supplementary Figure S4. Top I-TASSER 3D model predictions for WT and D286A PL5 peptide sequences. The highest-confidence prediction from I-TASSER 4 for the D286A mutant peptide (red) has been superimposed on that of the WT (blue) using PyMOL. The N and C termini are indicated for each prediction. The amino acid side chains at position 286 in the abovementioned constructs are displayed as spheres.

Supplementary Figure S5. Top I-TASSER 3D model predictions for WT and mutant A199 PL3 peptide sequences. The highest-confidence prediction from I-TASSER 4 for each mutant peptide has been superimposed on that of the WT using PyMOL. Colour key: red, WT; green, A199D; blue, A199E; yellow, A199K; cyan, A199S. The amino acid side chains at position 199 in the abovementioned constructs are displayed as spheres.


Supplementary Figure S6. ClustalW2 alignment of putative PL3 and PL5 peptides from representative Wzy proteins. Sequence limits were imposed based on the equivalent positioning in the jackhmmer hit-based MSA of WzyPa PL3 and PL5 sequences. Highlighted residues have been coloured based on their conservation score (out of 10) as presented in Jalview 3. The colour key is as follows: red, score = 10; orange, score = 9; yellow, score = 8; green, score = 7. Both loop regions contain identical and structurally equivalent amino acids across a range of phylogenetically distinct organisms. Key: F.M, Finegoldia magna; L.I, Listeria ivanovii; R.L, Rhizobium leguminosarum; D.T, Dictyoglomus thermophilum; P.B, Planctomyces brasiliensis; P.As, Photorhabdus asymbiotica subsp. asymbiotica.


Supplementary Figure S7. Effect of expression induction of N380A and R385A on LPS complementation phenotypes. LPS from P. aeruginosa PAO1 Δwzy complemented with cytoplasmic domain WzyPa mutants was subjected to Western immunoblot analysis. B-band O-Ag was detected with monoclonal antibody MF15-4 5. Samples were grown in the presence (+) or absence (-) of 0.1% L-arabinose to alter the level of expression of the respective construct.


Supplementary Figure S8. De novo WzyPa tertiary structure model. PSICOV 6 was used to generate a list of predicted contacts from the 30-sequence WzyPa MSA along with precision estimates for each contact. Secondary structure and TMS predictions generated using PSIPRED 7 and MEMSAT-SVM 8 were combined using a simple consensus scoring scheme, ensuring that the predicted topology was enforced. Using the list of predicted contacts with estimated precision > 0.5 and the consensus topology as inputs, 200 models were then generated using the FILM3 de novo structure prediction method, which utilizes correlated mutation analysis 9 from the WzyPa MSA to generate model structures: 100 with Z-coordinate constraints derived from topology prediction and 100 without. The lowest-energy model was then identified using the standard FILM3 objective function, and the 100 lowest-energy models were fitted to it by rigid-body superposition. The mean pairwise TM-score 10 was calculated for all models in the resulting ensemble, allowing the TM-score of the WzyPa model to be predicted using linear regression as 0.42. This value is just below the threshold of 0.5 at which the fold is considered to be probably correct, with the small size of the MSA most likely resulting in non-optimally predicted contacts with which to build the model. The protein is displayed embedded in a mock membrane bilayer corresponding to the IM, denoted by blue lines (cytoplasmic face) and red lines (periplasmic face).
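
The contact-filtering and ensemble-scoring steps described in this legend can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the PSICOV or FILM3 code: the `Contact` record, the example contacts, and the 3 × 3 TM-score matrix are made-up placeholders. The regression that converts the mean pairwise TM-score into a predicted TM-score is omitted because its coefficients are not given here.

```python
from itertools import combinations
from typing import NamedTuple


class Contact(NamedTuple):
    """One predicted residue-residue contact (hypothetical record layout)."""
    res_i: int
    res_j: int
    precision: float  # estimated precision of the prediction


def filter_contacts(contacts, min_precision=0.5):
    """Keep contacts whose estimated precision is strictly above the cutoff."""
    return [c for c in contacts if c.precision > min_precision]


def mean_pairwise_score(scores):
    """Mean of the upper-triangle entries of a symmetric pairwise score matrix."""
    n = len(scores)
    pairs = list(combinations(range(n), 2))
    if not pairs:
        raise ValueError("need at least two models")
    return sum(scores[i][j] for i, j in pairs) / len(pairs)


# Placeholder inputs, for illustration only (not data from this study).
contacts = [Contact(12, 85, 0.91), Contact(40, 133, 0.50), Contact(201, 290, 0.62)]
kept = filter_contacts(contacts)

tm = [
    [1.0, 0.40, 0.50],
    [0.40, 1.0, 0.30],
    [0.50, 0.30, 1.0],
]
mean_tm = mean_pairwise_score(tm)
```

A contact with precision exactly 0.5 is dropped here because the legend states a strict "> 0.5" cutoff.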


Supplementary Figure S9. Total predicted helix-helix interactions for each WzyPa TMS. TMhit 11 was used to analyze the predicted inter-TMS contacts for WzyPa. The number of predicted contacts for each TMS was tallied from the provided list of individual residue contact pairs for the WT, N380A, N380Q, R385A, and R385K amino acid sequences. Arrows denote trend deviations.
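
The per-TMS tally described above could be computed along the following lines. This is a hedged Python sketch, not TMhit output processing as actually performed: the helix boundaries and contact pairs are hypothetical, and it assumes that each inter-TMS residue pair counts once towards both helices involved.

```python
from collections import Counter


def tms_of(residue, tms_ranges):
    """Return the TMS label whose residue range contains `residue`, else None."""
    for label, (start, end) in tms_ranges.items():
        if start <= residue <= end:
            return label
    return None


def tally_contacts(contact_pairs, tms_ranges):
    """Count, for each TMS, the inter-TMS contact pairs it takes part in."""
    counts = Counter({label: 0 for label in tms_ranges})
    for res_a, res_b in contact_pairs:
        tms_a, tms_b = tms_of(res_a, tms_ranges), tms_of(res_b, tms_ranges)
        # Skip pairs outside any TMS or within the same TMS.
        if tms_a is None or tms_b is None or tms_a == tms_b:
            continue
        counts[tms_a] += 1
        counts[tms_b] += 1
    return dict(counts)


# Hypothetical helix boundaries and contact pairs, for illustration only.
tms_ranges = {"TMS1": (10, 30), "TMS2": (40, 60), "TMS3": (70, 90)}
contact_pairs = [(12, 45), (15, 75), (50, 80), (20, 25), (35, 50)]
totals = tally_contacts(contact_pairs, tms_ranges)
```

Running the same tally on the contact list for each sequence variant would give per-TMS totals that can be compared side by side.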


Table S1. Raw WzyPa sequence hits from jackhmmer search

UniRef100 I.D. | Amino Acids Aligned | Organism | Annotation
D4CUC2 | 8–419 | Fusobacterium periodonticum ATCC 33693 | p.u.p.
Q8KN85 | 1–438 | Pseudomonas aeruginosa | ORF_10; potential multiple membrane spanning domains
D9PTL8 | 16–422 | Finegoldia magna ACS-171-V-Col3 | Putative membrane protein
Q89HI6 | 34–425 | Bradyrhizobium japonicum | Blr6005 protein
E5X1M9 | 12–425 | Bacteroides eggerthii 1_2_48FAA | p.u.p.
F0SSG0 | 32–440 | Planctomyces brasiliensis DSM 5305 | p.u.p.
D3IFT7 | 28–452 | Prevotella sp. oral taxon 317 str. F0108 | p.u.p.
A1VRZ1 | 12–415 | Polaromonas naphthalenivorans CJ2 | p.u.p.
A5N235 | 51–425 | Clostridium kluyveri | p.u.p.
F9D0P1 | 13–429 | Prevotella dentalis DSM 3688 | WblL-like protein
C1DTA8 | 6–153 | Sulfurihydrogenibium azorense Az-Fu1 | p.u.p.
D8GRU1 | 14–425 | Clostridium ljungdahlii DSM 13528 | p.u.p.
1WZY_A | 1–438 | QUERY SEQUENCE (Wzy from Pseudomonas aeruginosa PAO1) |
Q7MY86 | 10–400 | Photorhabdus luminescens subsp. laumondii TTO1 | WblL protein
G3XCW3 | 1–438 | Pseudomonas aeruginosa | B-band O-antigen polymerase
F8N5Y1 | 19–434 | Prevotella multisaccharivorax DSM 17128 | p.u.p.
Q51370 | 1–438 | Pseudomonas aeruginosa | O-antigen polymerase
UPI0001DD0B12 | 39–451 | Citromicrobium bathyomarinum JL354 | hypothetical protein CbatJ_00065
E5WG87 | 5–420 | Bacillus sp. 2_A_57_CT2 | p.u.p.
B6VM05 | 11–400 | Photorhabdus asymbiotica subsp. asymbiotica ATCC 43949 | Putative uncharacterized protein wblL
Q7WYR4 | 130–433 | Rhizobium leguminosarum bv. viciae 3841 | Putative transmembrane protein
B5YCR1 | 18–426 | Dictyoglomus thermophilum H-6-12 | Membrane protein, putative
E1KZS4 | 16–422 | Finegoldia magna BVS033A4 | Putative membrane protein
C9M7A7 | 2–251 | Jonquetella anthropi E3_33 E1 | p.u.p.
Q1ILE1 | 14–422 | Candidatus Koribacter versatilis Ellin345 | p.u.p.
D3V6Z8 | 20–425 | Xenorhabdus bovienii SS-2004 | Putative uncharacterized protein
G2Z8P9 | 2–418 | Listeria ivanovii subsp. ivanovii PAM 55 | Putative B-band O-antigen polymerase
A3SJ04 | 2–397 | Roseovarius nubinhibens ISM | WblL protein
D6S899 | 16–422 | Finegoldia magna ATCC 53516 | p.u.p.
B0S0N4 | 16–422 | Finegoldia magna | p.u.p.

p.u.p., putative uncharacterized protein


Table S2. Phylogenetic diversity of non-redundant Wzy sequence hits from jackhmmer search

UniRef100 I.D. | Organism (Genus and Species) | Family; Class; Phylum (as listed)
G3XCW3 (query) | Pseudomonas aeruginosa | Pseudomonadaceae; γ (gamma); Proteobacteria
D4CUC2 | Fusobacterium periodonticum ATCC 33693 | Others
D9PTL8 | Finegoldia magna ACS-171-V-Col3 |
Q89HI6 | Bradyrhizobium japonicum | Others; α (alpha); Proteobacteria
E5X1M9 | Bacteroides eggerthii 1_2_48FAA | Bacteroidetes/Chlorobi
F0SSG0 | Planctomyces brasiliensis DSM 5305 | Others
D3IFT7 | Prevotella sp. oral taxon 317 str. F0108 | Bacteroidetes/Chlorobi
A1VRZ1 | Polaromonas naphthalenivorans CJ2 | Others; β (beta); Proteobacteria
A5N235 | Clostridium kluyveri | Clostridia; Firmicutes
F9D0P1 | Prevotella dentalis DSM 3688 | Bacteroidetes/Chlorobi
C1DTA8 | Sulfurihydrogenibium azorense Az-Fu1 | Others
D8GRU1 | Clostridium ljungdahlii DSM 13528 | Clostridia; Firmicutes
Q7MY86 | Photorhabdus luminescens subsp. laumondii TTO1 | Enterobacteriales; γ (gamma); Proteobacteria
F8N5Y1 | Prevotella multisaccharivorax DSM 17128 | Bacteroidetes/Chlorobi
UPI0001DD0B12 | Citromicrobium bathyomarinum JL354 | Others; α (alpha); Proteobacteria
E5WG87 | Bacillus sp. 2_A_57_CT2 | Firmicutes
B6VM05 | Photorhabdus asymbiotica subsp. asymbiotica ATCC 43949 | Enterobacteriales; γ (gamma); Proteobacteria
Q7WYR4 | Rhizobium leguminosarum bv. viciae 3841 | Rhizobiaceae; α (alpha); Proteobacteria
B5YCR1 | Dictyoglomus thermophilum H-6-12 | Thermotoga
C9M7A7 | Jonquetella anthropi E3_33 E1 | Synergistetes
Q1ILE1 | Candidatus Koribacter versatilis Ellin345 | Others
D3V6Z8 | Xenorhabdus bovienii SS-2004 | Enterobacteriales; γ (gamma); Proteobacteria
G2Z8P9 | Listeria ivanovii subsp. ivanovii PAM 55 | Bacillales; Firmicutes
A3SJ04 | Roseovarius nubinhibens ISM | Others; α (alpha); Proteobacteria


Table S3. Primers used to generate site-directed mutants of Wzy-GFP-His8

Mutation | Sense | DNA Sequence (5′→3′)
Y2A | Forward | CCGTGTCGCAGCAGTATTAGTAGGC
Y2A | Reverse | GCCTACTAATACTGCTGCGACACGG
R6A | Forward | CTTGCTGCAGTCGACAGG
R6A | Reverse | CCTGTCGACTGCAGCAAG
D8A | Forward | CGAGTCGAGAGGTCTATTC
D8A | Reverse | GAATAGACCTCTCGACTCG
R9A | Forward | GAGTCGACGCGTCTATTCTG
R9A | Reverse | CAGAATAGACGCGTCGACTC
N14A | Forward | CTATTCTGCTGGCCACAGTG
N14A | Reverse | CACTGTGGCCAGCAGAATAG
Y32A | Forward | GTGGGTGAATAATAATGCGATCTATCATCTCTATG
Y32A | Reverse | CATAGAGATGATAGATCGCATTATTATTCACCCAC
Y34A | Forward | GAATAATAATTATATCGCTCATCTCTATG
Y34A | Reverse | CATAGAGATGAGCGATATAATTATTATTC
H35A | Forward | CGACAGTGTGGGTGAATAATAATTATATCTATGCTCTCTATGATT
H35A | Reverse | CCCATATAATCATAGAGAGCATAGATATAATTATTATTCACCCACA
Y37A | Forward | TATATCTATCATCTCGCTGATTATATGGGGTC
Y37A | Reverse | GACCCCATATAATCAGCGAGATGATAGATATA
D38A | Forward | CAGTTTTTTTCGCAGACCCCATATAAGCATAGAGATGATAGATA
D38A | Reverse | TAATAATTATATCTATCATCTCTATGCTTATATGGGGTCTGCGAAA
Y39A | Forward | CTATGATGCTATGGGGTCTG
Y39A | Reverse | CAGACCCCATAGCATCATAG
K44A | Forward | TGATTATATGGGGTCTGCGGCAAAAACTGTCGACTTCGGC
K44A | Reverse | GCCGAAGTCGACAGTTTTTGCCGCAGACCCCATATAATCA
K45A | Forward | ATTATATGGGGTCTGCGAAAGCAACTGTCGACTTCGGCTTG
K45A | Reverse | CAAGCCGAAGTCGACAGTTGCTTTCGCAGACCCCATATAAT
C66A | Forward | CATCTGTGCCCTGTTGGCTGGAGGGGCAATTCGC
C66A | Reverse | GCGAATTGCCCCTCCAGCCAACAGGGCACAGATG
R71A | Forward | TGTTGTGTGGAGGGGCAATTGCCAGGCCAGGTG
R71A | Reverse | CACCTGGCCTGGCAATTGCCCCTCCACACAACA
R72A | Forward | GGAGGGGCAATTCGCGCGCCAGGTGATCTGTT
R72A | Reverse | AACAGATCACCTGGCGCGCGAATTGCCCCTCC
P73A | Forward | CGCAGGGCAGGTGATC
P73A | Reverse | GATCACCTGCCCTGCG
N93A | Forward | GGTTCTTGCTGGAGCTAATC
N93A | Reverse | GATTAGCTCCAGCAAGAACC
N96A | Forward | CTTAATGGAGCTGCTCAATATTCTC
N96A | Reverse | GAGAATATTGAGCAGCTCCATTAAG
Q97A | Forward | CTTAATGGAGCTAATGCGTATTCTCCGGATGC
Q97A | Reverse | GCATCCGGAGAATACGCATTAGCTCCATTAAG
Y98A | Forward | CTAATCAAGCTTCTCCGGATG
Y98A | Reverse | CATCCGGAGAAGCTTGATTAG
P100A | Forward | CTAATCAATATTCTGCGGATGCGCAACC
P100A | Reverse | GGTTGCGCATCCGCAGAATATTGATTAG
D101A | Forward | TAATCAATATTCTCCGGCTGCGCAACCATGGGCTG
D101A | Reverse | CAGCCCATGGTTGCGCAGCCGGAGAATATTGATTA
Q103A | Forward | CTCCGGATGCGGCGCCATGGGCTGGC
Q103A | Reverse | GCCAGCCCATGGCGCCGCATCCGGAG
P104A | Forward | GCGCAAGCATGGGCTG
P104A | Reverse | CAGCCCATGCTTGCGC
W105A | Forward | GGATGCGCAACCAGCGGCTGGCGTGCCTC
W105A | Reverse | GAGGCACGCCAGCCGCTGGTTGCGCATCC
K124A | Forward | GTCAATGCGATAAGATTCCATC
K124A | Reverse | GATGGAATCTTATCGCATTGAC
R126A | Forward | TTGATCATCGGCATTGTCAATAAGATAGCATTCCATCCGCTAGG
R126A | Reverse | CCTAGCGGATGGAATGCTATCTTATTGACAATGCCGATGATCAA
F127A | Forward | CAATAAGATAAGAGCCCATCCG
F127A | Reverse | CGGATGGGCTCTTATCTTATTG
H128A | Forward | TCGGCATTGTCAATAAGATAAGATTCGCTCCGCTAGGTGCATT
H128A | Reverse | CAATGCACCTAGCGGAGCGAATCTTATCTTATTGACAATGCCGA
P129A | Forward | GATTCCATGCGCTAGGTGC
P129A | Reverse | GCACCTAGCGCATGGAATC
Q134A | Forward | CTAGGTGCATTGGCGCGAGAAAACC
Q134A | Reverse | GGTTTTCTCGCGCCAATGCACCTAG
R135A | Forward | CGCTAGGTGCATTGCAGGCAGAAAACCAAGGAAGGC
R135A | Reverse | GCCTTCCTTGGTTTTCTGCCTGCAATGCACCTAGCG
E136A | Forward | GTGCATTGCAGCGAGCAAACCAAGGAAGGCG
E136A | Reverse | CGCCTTCCTTGGTTTGCTCGCTGCAATGCAC
N137A | Forward | CGAGAAGCCCAAGGAAGG
N137A | Reverse | CCTTCCTTGGGCTTCTCG
Q138A | Forward | CAGCGAGAAAACGCGGGAAGGCGAATG
Q138A | Reverse | CATTCGCCTTCCCGCGTTTTCTCGCTG
R140A | Forward | ATTGCAGCGAGAAAACCAAGGAGCGCGAATGTTAGTGCTAC
R140A | Reverse | GTAGCACTAACATTCGCGCTCCTTGGTTTTCTCGCTGCAAT
R141A | Forward | GCGAGAAAACCAAGGAAGGGCAATGTTAGTGCTACTGTCA
R141A | Reverse | TGACAGTAGCACTAACATTGCCCTTCCTTGGTTTTCTCGC
F167A | Forward | CCGCTGACTTTGCTGGG
F167A | Reverse | GTCAGCGGAAAAATAACCAGC
F169A | Forward | CTTTGACGCTGCTGGGC
F169A | Reverse | GCCCAGCAGCGTCAAAG
A170D | Forward | GACTTTGATGGGCAGTATGC
A170D | Reverse | GCATACTGCCCATCAAAGTC
Q172A | Forward | CTTTGCTGGGGCGTATGCTCG
Q172A | Reverse | CGAGCATACGCCCCAGCAAAG
Y173A | Forward | GACTTTGCTGGGCAGGCTGCTCGCCGTGCACT
Y173A | Reverse | AGTGCACGGCGAGCAGCCTGCCCAGCAAAGTC
A177V | Forward | CGCCGTGTACTTGCTCG
A177V | Reverse | CGAGCAAGTACACGGCG
E181A | Forward | GTGCACTTGCTCGTGCGGTTTTTGCTGCGGG
E181A | Reverse | CCCGCAGCAAAAACCGCACGAGCAAGTGCAC
F183A | Forward | GTGAGGTTGCTGCTGCG
F183A | Reverse | CGCAGCAGCAACCTCAC
S187A | Forward | CTGCGGGTGCGGCAAACGGCTACTTG
S187A | Reverse | CAAGTAGCCGTTTGCCGCACCCGCAG
A188D | Forward | GGGTTCTGACAACGGCTAC
A188D | Reverse | GTAGCCGTTGTCAGAACCC
N189A | Forward | CTGCAGCCGGCTACTTG
N189A | Reverse | CAAGTAGCCGGCTGCAG
Y191A | Forward | CAAACGGCGCCTTGTCG
Y191A | Reverse | CGACAAGGCGCCGTTTG
Q198A | Forward | CAATCGGTACCGCGGCATTCTTTCC
Q198A | Reverse | GGAAAGAATGCCGCGGTACCGATTG
A199D | Forward | GGTACCCAGGACTTCTTTC
A199D | Reverse | GAAAGAAGTCCTGGGTACC
A199E | Forward | CAATCGGTACCCAGGAGTTCTTTCC
A199E | Reverse | GGAAAGAACTCCTGGGTACCGATTG
A199K | Forward | CAATCGGTACCCAGAAATTCTTTCCTGTG
A199K | Reverse | CACAGGAAAGAATTTCTGGGTACCGATTG
A199S | Forward | CAATCGGTACCCAGAGCTTCTTTCCTGTGTTG
A199S | Reverse | CAACACAGGAAAGAAGCTCTGGGTACCGATTG
F200A | Forward | GGCAGCCTTTCCTGTGTTG
F200A | Reverse | GAAAGGCTGCCTGGGTAC
F237A | Forward | CAGAAGTATCCTGCCGTCGTGTTGTTTC
F237A | Reverse | GAAACAACACGACGGCAGGATACTTCTG
R257A | Forward | GACGATTCGGTCAGGTCGCAGTGTCTTGGGTTGTCT
R257A | Reverse | AGACAACCCAAGACACTGCGACCTGACCGAATCGTC
R257K | Forward | CGGTCAGGTCAAAGTGTCTTGG
R257K | Reverse | CCAAGACACTTTGACCTGACCG
Y281A | Forward | GTGTTTGGCGCTTCATTCTTG
Y281A | Reverse | CAAGAATGAAGCGCCAAACAC
F283A | Forward | GAACATGAGGTGTTTGGCTATTCAGCCTTGAATGATTATTTTCTACGTCG
F283A | Reverse | CGACGTAGAAAATAATCATTCAAGGCTGAATAGCCAAACACCTCATGTTC
N285A | Forward | GAGGTGTTTGGCTATTCATTCTTGGCTGATTATTTTCTACGTCGTGCT
N285A | Reverse | AGCACGACGTAGAAAATAATCAGCCAAGAATGAATAGCCAAACACCTC
D286A | Forward | CATTCTTGAATGCTTATTTTCTACG
D286A | Reverse | CGTAGAAAATAAGCATTCAAGAATG
D286E | Forward | CTATTCATTCTTGAATGAATATTTTCTACGTCGTGC
D286E | Reverse | GCACGACGTAGAAAATATTCATTCAAGAATGAATAG
Y287A | Forward | CTTGAATGATGCTTTTCTACGTC
Y287A | Reverse | GACGTAGAAAAGCATCATTCAAG
F288A | Forward | GAATGATTATGCTCTACGTCG
F288A | Reverse | CGACGTAGAGCATAATCATTC
A292V | Forward | CTACGTCGTGTTTTTATTGTGC
A292V | Reverse | GCACAATAAAAACACGACGTAG
F293A | Forward | GTCGTGCTGCTATTGTGCC
F293A | Reverse | GGCACAATAGCAGCACGAC
P296A | Forward | GCTTTTATTGTGGCTTCCACC
P296A | Reverse | GGTGGAAGCCACAATAAAAGC
A302V | Forward | CTGTTGGGGGTAGTTGATC
A302V | Reverse | GATCAACTACCCCCAACAG
Q305A | Forward | GGGGCAGTTGATGCGTTTGTGTCTC
Q305A | Reverse | GAGACACAAACGCATCAACTGCCCC
F306A | Forward | GTTGATCAGGCTGTGTCTC
F306A | Reverse | GAGACACAGCCTGATCAAC
Q309A | Forward | GTTGATCAGTTTGTGTCTGCGTTCGGATCCAATTATTAC
Q309A | Reverse | GTAATAATTGGATCCGAACGCAGACACAAACTGATCAAC
F310A | Forward | GTCTCAGGCCGGATCC
F310A | Reverse | GGATCCGGCCTGAGAC
N313A | Forward | CGGATCCGCTTATTACAGGG
N313A | Reverse | CCCTGTAATAAGCGGATCCG
Y314A | Forward | CCAATGCTTACAGGGATACC
Y314A | Reverse | CCCTGTAAGCATTGGATCC
Y315A | Forward | CCAATTATGCCAGGGATACC
Y315A | Reverse | GGTATCCCTGGCATAATTGG
E339A | Forward | CTTTCGTCTGGGGACGGCAATTTTCAATAATCCCGA
E339A | Reverse | TCGGGATTATTGAAAATTGCCGTCCCCAGACGAAAG
F341A | Forward | GACGGAAATTGCCAATAATCCC
F341A | Reverse | GGGATTATTGGCAATTTCCGTC
Q359A | Forward | GATAGCCTATATGGCGTTGGGTTATGTG
Q359A | Reverse | CACATAACCCAACGCCATATAGGCTATC
N380A | Forward | GTGTCGTTCTCATGGCTTTCTTATTTTCGAG
N380A | Reverse | CTCGAAAATAAGAAAGCCATGAGAACGACAC
N380Q | Forward | GTGTCGTTCTCATGCAGTTCTTATTTTCGAGG
N380Q | Reverse | CCTCGAAAATAAGAACTGCATGAGAACGACAC
F381A | Forward | CGTTCTCATGAATGCCTTATTTTCGAGG
F381A | Reverse | CCTCGAAAATAAGGCATTCATGAGAACG
F383A | Forward | GAATTTCTTAGCTTCGAGGTATG
F383A | Reverse | CATACCTCGAAGCTAAGAAATTC
R385A | Forward | GTTCTCATGAATTTCTTATTTTCGGCGTATGGTGCATTCATGGC
R385A | Reverse | TGGCCATGAATGCACCATACGCCGAAAATAAGAAATTCATGAGA
R385K | Forward | CTTATTTTCGAAGTATGGTGCATTC
R385K | Reverse | CACCATACTTCGAAAATAAGAAATTC
Y386A | Forward | CTTATTTTCGAGGGCTGGTGCATTCATG
Y386A | Reverse | CATGAATGCACCAGCCCTCGAAAATAAG
F389A | Forward | GTGCAGCCATGGCCATTC
F389A | Reverse | GAATGGCCATGGCTGCAC
P393A | Forward | GGCCATTGCGGTTGCTTTG
P393A | Reverse | CAAAGCAACCGCAATGGCC

All wzy mutations were created in the pHERD26T-wzy-GFP-His8 plasmid (Islam et al., 2010). Bold Red = nucleotide mismatch designed to introduce the desired mutation in the template construct.
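
Many forward and reverse primers listed in Table S3 read as reverse complements of one another, as is typical for primer pairs used in site-directed mutagenesis. A minimal Python sketch for checking this on any pair is given below; the function names are illustrative assumptions, and the two example pairs (Y2A and R6A) are copied from the table. A pair that fails the check may simply be offset or differ in length, so a `False` result is not by itself evidence of an error.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def is_complementary_pair(forward, reverse):
    """True when `reverse` is the exact reverse complement of `forward`."""
    return reverse_complement(forward) == reverse


# Primer pairs copied from Table S3 (both sequences written 5'->3').
primers = {
    "Y2A": ("CCGTGTCGCAGCAGTATTAGTAGGC", "GCCTACTAATACTGCTGCGACACGG"),
    "R6A": ("CTTGCTGCAGTCGACAGG", "CCTGTCGACTGCAGCAAG"),
}
results = {name: is_complementary_pair(f, r) for name, (f, r) in primers.items()}
```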


Supplementary Information References

1. Islam, S.T. & Lam, J.S., Wzx flippase-mediated membrane translocation of sugar polymer precursors in bacteria. Environ. Microbiol. 15 (4), 1001-1015 (2013).
2. Islam, S.T. et al., Dual conserved periplasmic loops possess essential charge characteristics that support a catch-and-release mechanism of O-antigen polymerization by Wzy in Pseudomonas aeruginosa PAO1. J. Biol. Chem. 286 (23), 20600-20605 (2011).
3. Waterhouse, A.M., Procter, J.B., Martin, D.M.A., Clamp, M., & Barton, G.J., Jalview Version 2: a multiple sequence alignment editor and analysis workbench. Bioinformatics 25 (9), 1189-1191 (2009).
4. Roy, A., Kucukural, A., & Zhang, Y., I-TASSER: a unified platform for automated protein structure and function prediction. Nat. Protoc. 5 (4), 725-738 (2010).
5. Lam, J.S., Handelsman, M.Y., Chivers, T.R., & MacDonald, L.A., Monoclonal antibodies as probes to examine serotype-specific and cross-reactive epitopes of lipopolysaccharides from serotypes O2, O5, and O16 of Pseudomonas aeruginosa. J. Bacteriol. 174 (7), 2178-2184 (1992).
6. Jones, D.T., Buchan, D.W.A., Cozzetto, D., & Pontil, M., PSICOV: precise structural contact prediction using sparse inverse covariance estimation on large multiple sequence alignments. Bioinformatics 28 (2), 184-190 (2012).
7. Buchan, D.W.A. et al., Protein annotation and modelling servers at University College London. Nucleic Acids Res. 38 (suppl 2), W563-W568 (2010).
8. Nugent, T. & Jones, D., Detecting pore-lining regions in transmembrane protein sequences. BMC Bioinformatics 13 (1), 169 (2012).
9. Nugent, T. & Jones, D.T., Accurate de novo structure prediction of large transmembrane protein domains using fragment-assembly and correlated mutation analysis. Proc. Natl. Acad. Sci. USA 109 (24), E1540-E1547 (2012).
10. Xu, J. & Zhang, Y., How significant is a protein structure similarity with TM-score = 0.5? Bioinformatics 26 (7), 889-895 (2010).
11. Lo, A. et al., Predicting helix-helix interactions from residue contacts in membrane proteins. Bioinformatics 25 (8), 996-1003 (2009).
