
Cell-specific Bioorthogonal Tagging of Glycoproteins in Co-culture

Supporting Information

Supporting Figures Experimentals

Anna Ciocea,b, Beatriz Callea,b, Andrea Marchesia,b, Ganka Bineva-Toddb, Helen Flynnc, Zhen Lia,b, Omur Y. Tastanb, Chloe Roustand, Tessa Keenane, Peter Bothf, Kun Huangf, Fabio Parmeggianif, Ambrosius P. Snijdersc, Svend Kjaerd, Martin A. Fascionee, Sabine Flitschf, Benjamin Schumanna,b,* aDepartment of Chemistry, Imperial College London, 80 Wood Lane, W12 0BZ, London, United Kingdom. bThe Chemical Glycobiology Laboratory, The Francis Crick Institute, 1 Midland Rd, NW1 1AT London, United Kingdom. cProteomics Science Technology Platform, The Francis Crick Institute, NW1 1AT London, United Kingdom. dStructural Biology Science Technology Platform, The Francis Crick Institute, NW1 1AT London, United Kingdom. eDepartment of Chemistry, University of York, YO10 5DD York, United Kingdom. fManchester Institute of Biotechnology & School of Chemistry, The University of Manchester, M1 7DN Manchester, United Kingdom. *Correspondence should be addressed to: [email protected].

Supporting Figures

Fig. S1: Architectures of human enzymes of the GalNAc salvage pathway. In AGX1, the N-acyl side chain in UDP-GalNAc is in proximity to Phe381 and Phe383. In GALK2, the N-acyl side chain of GalNAc-1-phosphate is in proximity to amino acids forming a hydrogen-bond network (Glu179, Ser147 and Ser148).


Fig. S2: Evaluation of enzymatic turnover of GalN6yne-based metabolites. A, in vitro UDP-sugar formation by AGX1 after 2 h incubation, as assessed by LC-MS. Data are means ± SD from three independent experiments. B, in vitro GalN6yne-1-phosphate formation by human and bacterial GalNAc kinases, as assessed by LC-MS and integrated ion count. Data are individual data points and means from two independent experiments. C, ion pair HPLC traces of in vitro epimerisation assays, using UDP-GalNAc analogue 1 as substrate and either cell lysates from control or GALE-KO K-562 cells or two different concentrations of purified GALE as enzyme sources.1 Data are representative of two independent replicates (lysate samples) or from one experiment (isolated GALE samples). D, selected traces from Fig. 1B using 12.5 nM GALE as an enzyme source, with reference HPLC traces of synthetic standards for UDP-GalNAc and UDP-GlcNAc analogues. Arrowhead depicts epimerization of compound 3. Traces depict relative intensity of absorbance at 260 nm.


Fig. S3: Biosynthesis of chemically tagged UDP-sugars by metabolic engineering. A, structures of two MOE reagents used herein. Ac3GalN6yne-1-P(SATE)2 is a precursor of GalN6yne-1-phosphate. B, biosynthesis of UDP-sugars in cells stably transfected with metabolic enzymes, as assessed by high performance anion exchange chromatography (HPAEC). Chromatograms were normalized on an external standard. Synthetic UDP-sugars served as standards. Data are one representative out of two independent experiments.

Table S1: Proteins selectively enriched by cells expressing NahK and mut-AGX1. Data are from one experiment with both forward and reverse SILAC labelling strategies, and hence treated as two replicates. Glycosylation status of proteins was manually cross-checked by GlyGen2 and Glycodomain viewer.3

Gene names | Protein name | Glycosylation type | O-GalNAcylation
ABCC1 | Multidrug resistance-associated protein 1 | N-linked | -
ADAM10 | Disintegrin and metalloproteinase domain-containing protein 10 | N- and O-linked | yes
AGFG1 | Arf-GAP domain and FG repeat-containing protein 1 | O-linked | yes
ANTXR2 | Anthrax toxin receptor 2 | N-linked | yes
ARID2 | AT-rich interactive domain-containing protein 2 | O-linked | -
ATF7IP | Activating transcription factor 7-interacting protein 1 | O-linked | -
ATP1B3 | Sodium/potassium-transporting ATPase subunit beta-3 | N- and O-linked | yes
ATXN2L | Ataxin-2-like protein | O-linked | yes
BCOR | BCL-6 corepressor | O-linked | -
BPTF | Nucleosome-remodeling factor subunit BPTF | O-linked | -
BSG | Basigin | N- and O-linked | yes
CA1 | Carbonic anhydrase 1 | O-linked | -
CALU | Calumenin | N- and O-linked | yes
CARM1 | Histone-arginine methyltransferase CARM1 | O-linked | yes
CASC4 | Protein CASC4 | N- and O-linked | yes
CD44 | CD44 antigen | N- and O-linked | yes
CD46 | Membrane cofactor protein | N- and O-linked | yes
CD47 | Leukocyte surface antigen CD47 | N-linked | -
CD55 | Complement decay-accelerating factor | N- and O-linked | yes
CD59 | CD59 glycoprotein | N-linked | -
CD63 | CD63 antigen | N-linked | -
CD97 | CD97 antigen;CD97 antigen subunit alpha;CD97 antigen subunit beta | N- and O-linked | yes
CLINT1 | Clathrin interactor 1 | N-linked | -
CPD | Carboxypeptidase D | N- and O-linked | yes
CREB1;CREM;ATF1 | Cyclic AMP-responsive element-binding protein 1 | O-linked | -
CSNK2A1;CSNK2A3 | Casein kinase II subunit alpha | O-linked | -
CTSD | Cathepsin D;Cathepsin D light chain;Cathepsin D heavy chain | N- and O-linked | yes
CTSZ | Cathepsin Z | N-linked | -
ECE1 | Endothelin-converting enzyme 1 | N-linked | yes
ELF1 | ETS-related transcription factor Elf-1 | O-linked | -
EPCAM | Epithelial cell adhesion molecule | N- and O-linked | yes
EWSR1 | RNA-binding protein EWS | O-linked | -
F11R | Junctional adhesion molecule A | N- and O-linked | yes
FCGR2A;FCGR2C | Low affinity immunoglobulin gamma Fc region receptor II-a | N-linked | -
FIP1L1 | Pre-mRNA 3-end-processing factor FIP1 | O-linked | -
FNBP4 | Formin-binding protein 4 | O-linked | -
FRRS1 | Ferric-chelate reductase 1 | N- and O-linked | yes
GABPA | GA-binding protein alpha chain | O-linked | -
GALNT1 | Polypeptide N-acetylgalactosaminyltransferase 1 soluble form | N-linked | -
GALNT5 | Polypeptide N-acetylgalactosaminyltransferase 5 | N-linked | yes
GLB1 | Beta-galactosidase | N-linked | yes
GLG1 | Golgi apparatus protein 1 | N- and O-linked | -
GNS | N-acetylglucosamine-6-sulfatase | N- and O-linked | -
GOLIM4 | Golgi integral membrane protein 4 | N- and O-linked | yes
GRN | Paragranulin | N- and O-linked | yes
HBA1 | Hemoglobin subunit alpha | O-linked | -
HCFC1 | Host cell factor | O-linked | yes
HEMGN | Hemogen | phosphorylated | -
HEXB | Beta-hexosaminidase subunit beta | N- and O-linked | yes
HGS | Hepatocyte growth factor-regulated tyrosine kinase substrate | O-linked | -
HIVEP1 | Zinc finger protein 40 | O-linked | -
HYOU1 | Hypoxia up-regulated protein 1 | N- and O-linked | yes
ICAM1 | Intercellular adhesion molecule 1 | N- and O-linked | yes
ICAM2 | Intercellular adhesion molecule 2 | N-linked | yes
IGF2R | Cation-independent mannose-6-phosphate receptor | N- and O-linked | yes
ITGAV | Integrin alpha-V;Integrin alpha-V heavy chain;Integrin alpha-V light chain | N-linked | -
ITGB1 | Integrin beta-1 | N-linked | yes
JMJD1C | Probable JmjC domain-containing histone demethylation protein 2C | O-linked | -
KDM3B | Lysine-specific demethylase 3B | O-linked | -
LAMP1 | Lysosome-associated membrane glycoprotein 1 | N- and O-linked | yes
LIN54 | Protein lin-54 homolog | O-linked | -
LMAN2 | Vesicular integral-membrane protein VIP36 | N- and O-linked | yes
LRIF1 | Ligand-dependent nuclear receptor-interacting factor 1 | O-linked | -
MAGEC1 | Melanoma-associated antigen C1 | phosphorylated | -
MAP4 | Microtubule-associated protein 4 | O-linked | yes
MAPK1IP1L | MAPK-interacting and spindle-stabilizing protein-like | N-linked | -
MCAM, Muc18 | Cell surface glycoprotein MUC18 | N-linked | yes
MEF2C | Myocyte-specific enhancer factor 2C isoform 2 | O-linked | -
MGA | MAX gene-associated protein | O-linked | -
MGEA5 | Protein O-GlcNAcase | O-linked | -
NCOA6 | Nuclear receptor coactivator 6 | O-linked | -
NDUFA4 | Cytochrome c oxidase subunit NDUFA4 | O-linked | -
NFAT5 | Nuclear factor of activated T-cells 5 | O-linked | -
NFRKB | Nuclear factor related to kappa-B-binding protein | O-linked | -
NFYA | Nuclear transcription factor Y subunit alpha | O-linked | -
NOTCH1 | Notch 1 extracellular truncation;Notch 1 intracellular domain | N- and O-linked | yes
NPC2 | Epididymal secretory protein E1 | N-linked | -
NUP153 | Nuclear pore complex protein Nup153 | O-linked | -
NUP214 | Nuclear pore complex protein Nup214 | O-linked | yes
NUP62 | Nuclear pore glycoprotein p62 | O-linked | -
NUPL1 | Nucleoporin p58/p45 | O-linked | -
OGT | UDP-N-acetylglucosamine-peptide N-acetylglucosaminyltransferase | O-linked | -
PECAM1 | Platelet endothelial cell adhesion molecule | N-linked | -
PLXNB2 | Plexin-B2 | N-linked | -
POM121C | Nuclear envelope pore membrane protein POM 121C | O-linked | -
POU2F1 | POU domain, class 2, transcription factor 1 | O-linked | -
PPP1R12A | Protein phosphatase 1 regulatory subunit 12A | O-linked | -
PPP6R2 | Serine/threonine-protein phosphatase 6 regulatory subunit 2 | O-linked | -
PPT1 | Palmitoyl-protein thioesterase 1 | N- and O-linked | -
PSAP | Prosaposin | N- and O-linked | yes
PTPN23 | Tyrosine-protein phosphatase non-receptor type 23 | O-linked | -
PTPRC | Receptor-type tyrosine-protein phosphatase C | N- and O-linked | yes
PTTG1IP | Pituitary tumor-transforming gene 1 protein-interacting protein | N- and O-linked | -
QSER1 | Glutamine and serine-rich protein 1 | O-linked | yes
QSOX2 | Sulfhydryl oxidase 2 | N- and O-linked | yes
RAD23B | UV excision repair protein RAD23 homolog B | O-linked | yes
RBM26 | RNA-binding protein 26 | O-linked | -
RBM27 | RNA-binding protein 27 | O-linked | -
RFX1 | MHC class II regulatory factor RFX1 | O-linked | -
RNASET2 | Ribonuclease T2 | N- and O-linked | -
ROR2 | Tyrosine-protein kinase transmembrane receptor ROR2 | N- and O-linked | -
RPRD2 | Regulation of nuclear pre-mRNA domain-containing protein 2 | O-linked | -
S100A7 | Protein S100-A7 | O-linked | -
S100A8 | Protein S100-A8 | O-linked | -
S100A9 | Protein S100-A9 | O-linked | yes
SAP30BP | SAP30-binding protein | O-linked | -
SBSN | Suprabasin | O-linked | yes
SCAF4 | Splicing factor, arginine/serine-rich 15 | O-linked | -
SEC23IP | SEC23-interacting protein | O-linked | -
SEC24B | Protein transport protein Sec24B | O-linked | yes
SEC24C | Protein transport protein Sec24C | O-linked | -
SEC31A | Protein transport protein Sec31A | O-linked | -
SF1 | Splicing factor 1 | O-linked | -
SF3A1 | Splicing factor 3A subunit 1 | O-linked | -
SLC12A2 | Solute carrier family 12 member 2 | N-linked | -
SLC1A4 | Neutral amino acid transporter A | N- and O-linked | -
SLC1A5 | Neutral amino acid transporter B(0);Amino acid transporter | N- and O-linked | -
SLC2A1 | Solute carrier family 2, facilitated glucose transporter member 1 | N- and O-linked | -
SLC3A2 | 4F2 cell-surface antigen heavy chain | N-linked | yes
SMARCC2 | SWI/SNF complex subunit SMARCC2 | O-linked | -
SPN, CD43 | Leukosialin | N- and O-linked | yes
SRCAP | Helicase SRCAP | O-linked | -
SRGN | Serglycin | O-linked | yes
SS18 | Protein SSXT | N-linked | -
TAB1 | TGF-beta-activated kinase 1 and MAP3K7-binding protein 1 | O-linked | -
TAF4 | Transcription initiation factor TFIID subunit 4 | O-linked | -
TAF6 | Transcription initiation factor TFIID subunit 6 | O-linked | -
TCERG1 | Transcription elongation regulator 1 | O-linked | -
TFPI | Tissue factor pathway inhibitor | N- and O-linked | -
TFRC (CD71) | Transferrin receptor protein 1 | N- and O-linked | yes
TGOLN2 | Trans-Golgi network integral membrane protein 2 | N- and O-linked | yes
TMEM30A | Cell cycle control protein 50A | N- and O-linked | yes
TOX4 | TOX high mobility group box family member 4 | O-linked | -
TRIM33 | E3 ubiquitin-protein ligase TRIM33 | O-linked | -
TRPV2 | Transient receptor potential cation channel subfamily V member 2 | N-linked | -
UAP1 | UDP-N-acetylgalactosamine pyrophosphorylase | O-linked | -
UBAP2 | Ubiquitin-associated protein 2 | O-linked | yes
UBAP2L | Ubiquitin-associated protein 2-like | N- and O-linked | yes
VEZF1 | Vascular endothelial zinc finger 1 | O-linked | -
WNK1 | Serine/threonine-protein kinase WNK1 | O-linked | yes
YLPM1 | YLP motif-containing protein 1 | O-linked | -
ZNF207 | BUB3-interacting and GLEBS motif-containing protein ZNF207 | O-linked | -
ZNF281 | Zinc finger protein 281 | O-linked | yes
ZNF609 | Zinc finger protein 609 | O-linked | -
ZYX | Zyxin | O-linked | -

Table S2: Proteins selectively enriched by cells expressing DM-GalNAc-Ts, NahK and mut-AGX1. Data are from one experiment with both forward and reverse SILAC labelling strategies, and hence treated as two replicates. Glycosylation status of proteins was manually cross-checked by GlyGen2 and Glycodomain viewer.3

Gene names | Protein name | Glycosylation type | O-GalNAcylation
ADAMTSL4 | ADAMTS-like protein 4 | N- and O-linked | yes
ANTXR2 | Anthrax toxin receptor 2 | N-linked | yes
APLP2 | Amyloid-like protein 2 | N- and O-linked | yes
ARG1 | Arginase-1 | O-linked | -
ATN1 | Atrophin-1 | O-linked | -
ATXN2L | Ataxin-2-like protein | O-linked | yes
BSG | Basigin | N- and O-linked | yes
BST2 | Bone marrow stromal antigen 2 | N- and O-linked | yes
CALU | Calumenin | N- and O-linked | yes
CANX | Calnexin | O-linked | yes
CASC4 | Protein CASC4 | N- and O-linked | yes
CD320 | CD320 antigen | N-linked | yes
CD44 | CD44 antigen | N- and O-linked | yes
CD46 | Membrane cofactor protein | N- and O-linked | yes
CD55 | Complement decay-accelerating factor | N- and O-linked | yes
CIC | Protein capicua homolog | O-linked | -
CLEC11A | C-type lectin domain family 11 member A | O-linked | yes
COL1A1 | Collagen alpha-1(I) chain | N- and O-linked | yes
CPD | Carboxypeptidase D | O-linked | yes
CRELD2 | Cysteine-rich with EGF-like domain protein 2 | N- and O-linked | yes
CTSD | Cathepsin D;Cathepsin D light chain;Cathepsin D heavy chain | N- and O-linked | yes
DIDO1 | Death-inducer obliterator 1 | O-linked | -
ECM1 | Extracellular matrix protein 1 | N- and O-linked | yes
ELF1 | ETS-related transcription factor Elf-1 | N-linked | -
ENPP3 | Ectonucleotide pyrophosphatase/phosphodiesterase family member 3 | N-linked | -
ERP44 | Endoplasmic reticulum resident protein 44 | O-linked | yes
EWSR1 | RNA-binding protein EWS | N-linked | -
FAM3C | Protein FAM3C | O-linked | yes
FCGR2A;FCGR2C | Low affinity immunoglobulin gamma Fc region receptor II-a | O-linked | -
FRRS1 | Ferric-chelate reductase 1 | N- and O-linked | yes
GALNT2 | Polypeptide N-acetylgalactosaminyltransferase 2 | O-linked | yes
GALNT5 | Polypeptide N-acetylgalactosaminyltransferase 5 | N-linked | yes
GLG1 | Golgi apparatus protein 1 | N- and O-linked | -
GOLIM4 | Golgi integral membrane protein 4 | N- and O-linked | yes
GRPEL2 | GrpE protein homolog 2, mitochondrial | O-linked | -
GYPA | Glycophorin-A | N- and O-linked | yes
GYPC | Glycophorin-C | N- and O-linked | yes
HEPH | Hephaestin | O-linked | yes
ICAM1 | Intercellular adhesion molecule 1 | N-linked | yes
ICAM2 | Intercellular adhesion molecule 2 | N- and O-linked | yes
IGF2R | Cation-independent mannose-6-phosphate receptor | N- and O-linked | yes
IRF2BPL | Interferon regulatory factor 2-binding protein-like | O-linked | -
ITGB1 | Integrin beta-1 | N- and O-linked | yes
JMJD1C | Probable JmjC domain-containing histone demethylation protein 2C | N- and O-linked | -
JTB | Protein JTB | O-linked | yes
KANSL3 | KAT8 regulatory NSL complex subunit 3 | O-linked | -
KIAA0319L | Dyslexia-associated protein KIAA0319-like protein | N- and O-linked | yes
KIAA1324L | UPF0577 protein KIAA1324-like | N-linked |
LAMP1 | Lysosome-associated membrane glycoprotein 1 | N- and O-linked | yes
LAMP2 | Lysosome-associated membrane glycoprotein 2 | N- and O-linked | yes
LIN54 | Protein lin-54 homolog | N-linked | -
LRP8 | Low-density lipoprotein receptor-related protein 8 | N- and O-linked | yes
LTBP4 | Latent-transforming growth factor beta-binding protein 4 | N- and O-linked | yes
LYSMD3 | LysM and putative peptidoglycan-binding domain-containing protein 3 | N- and O-linked | yes
M6PR | Cation-dependent mannose-6-phosphate receptor | N- and O-linked | yes
MANSC1 | MANSC domain-containing protein 1 | N- and O-linked | yes
MAPK1IP1L | MAPK-interacting and spindle-stabilizing protein-like | N-linked | -
MCAM, Muc18 | Cell surface glycoprotein MUC18 | N- and O-linked | yes
MEGF9 | Multiple epidermal growth factor-like domains protein 9 | N- and O-linked | yes
MGAT2 | Alpha-1,6-mannosyl-glycoprotein 2-beta-N-acetylglucosaminyltransferase | N- and O-linked | yes
MIA2 | Melanoma inhibitory activity protein 2 | N- and O-linked | -
MIA3 | Melanoma inhibitory activity protein 3 | N- and O-linked | yes
MPZL1 | Myelin protein zero-like protein 1 | N- and O-linked | -
NENF | Neudesin | O-linked | yes
NFRKB | Nuclear factor related to kappa-B-binding protein | O-linked | -
NPTN;DKFZp566H1924 | Neuroplastin | N-linked | -
NUCB1 | Nucleobindin-1 | O-linked | yes
NUCB2 | Nucleobindin-2;Nesfatin-1 | O-linked | yes
NUP153 | Nuclear pore complex protein Nup153 | O-linked | -
NUP54 | Nucleoporin p54 | O-linked | -
NUP62 | Nuclear pore glycoprotein p62 | O-linked | -
NUPL1 | Nucleoporin p58/p45 | O-linked | -
OS9 | Protein OS-9 | N- and O-linked | yes
PHACTR4 | Phosphatase and actin regulator 4 | O-linked | -
PIP | Prolactin-inducible protein | N-linked | -
POM121C | Nuclear envelope pore membrane protein POM 121C | O-linked | -
POU2F1 | POU domain, class 2, transcription factor 1 | O-linked | -
PRCP | Lysosomal Pro-X carboxypeptidase | N- and O-linked | -
PRKCSH | Glucosidase 2 subunit beta | N- and O-linked | yes
PRRC1 | Protein PRRC1 | O-linked | -
PRRC2C | Protein PRRC2C | O-linked | yes
PSAP | Prosaposin | N- and O-linked | yes
PTPRA | Receptor-type tyrosine-protein phosphatase alpha | N- and O-linked | -
PTPRC | Receptor-type tyrosine-protein phosphatase C | N- and O-linked | yes
QSER1 | Glutamine and serine-rich protein 1 | O-linked | yes
RAD23B | UV excision repair protein RAD23 homolog B | O-linked | yes
RBM26 | RNA-binding protein 26 | O-linked | -
RBM27 | RNA-binding protein 27 | O-linked | -
SAP130 | Histone deacetylase complex subunit SAP130 | O-linked | -
SDF2L1 | Stromal cell-derived factor 2-like protein 1 | O-linked | -
SDF4 | 45 kDa calcium-binding protein | N- and O-linked | -
SEC24B | Protein transport protein Sec24B | O-linked | yes
SELPLG | P-selectin glycoprotein ligand 1 | N- and O-linked | yes
SF1 | Splicing factor 1 | O-linked | -
SLC38A10 | Putative sodium-coupled neutral amino acid transporter 10 | O-linked | yes
SLC38A2 | Sodium-coupled neutral amino acid transporter 2 | N-linked | -
SLC3A2 | 4F2 cell-surface antigen heavy chain | N-linked | yes
SPATA5 | Spermatogenesis-associated protein 5 | O-linked | -
SPN, CD43 | Leukosialin | N- and O-linked | yes
SRGN | Serglycin | N- and O-linked | yes
SS18 | Protein SSXT | N-linked | -
ST6GAL1 | Beta-galactoside alpha-2,6-sialyltransferase 1 | N- and O-linked | yes
SUGP1 | SURP and G-patch domain-containing protein 1 | O-linked | -
TFPI | Tissue factor pathway inhibitor | N- and O-linked | -
TFRC (CD71) | Transferrin receptor protein 1 | N- and O-linked | yes
TGOLN2 | Trans-Golgi network integral membrane protein 2 | N- and O-linked | yes
TIMP1 | Metalloproteinase inhibitor 1 | N-linked | -
TMEM30A | Cell cycle control protein 50A | N- and O-linked | yes
TMTC3 | Transmembrane and TPR repeat-containing protein 3 | N- and O-linked | yes
TNFRSF1B | Tumor necrosis factor receptor superfamily member 1B | N- and O-linked | yes
TNFRSF8 | Tumor necrosis factor receptor superfamily member 8 | N- and O-linked | yes
TPM3 | Tropomyosin alpha-3 chain | O-linked | -
TPST2 | Protein-tyrosine sulfotransferase 2 | N- and O-linked | yes
TSEN15 | tRNA-splicing endonuclease subunit Sen15 | O-linked | yes
TTC33 | Tetratricopeptide repeat protein 33 | O-linked | -
TXNDC5 | Thioredoxin domain-containing protein 5 | N- and O-linked | yes
UBAP2 | Ubiquitin-associated protein 2 | O-linked | yes
UBQLN2 | Ubiquilin-2 | O-linked | -
WAC | WW domain-containing adapter protein with coiled-coil | O-linked | -

Experimentals

In vitro phosphorylation of GalN6yne

NahK from B. longum was either recombinantly expressed4 or purchased from Chemily (Peachtree Corners, USA). All other bacterial NahKs were produced by Prozomix Ltd. (Haltwhistle, UK). Human kinases GALK1 and GALK2 were expressed in SF21 insect cells following a standard baculovirus plasmid transfer expression protocol, as described before.5 The enzymes of interest were purified by GST affinity and ÄKTA chromatography.

Reactions were run in a 50 µL volume containing GalN6yne (5 mM), ATP (10 mM), MgCl2 (10 mM), Tris-HCl pH 8 (100 mM) and kinase (20 µg) for 4 hours at 37 °C. Reactions were then diluted with 50 µL methanol, cooled to -20 °C for 2 hours and centrifuged (13000 rpm, 30 min) to remove any precipitated enzyme. Supernatants were analyzed by UPLC-MS on an ACQUITY UPLC BEH Glycan 1.7 µm 2.1x50 mm column (90-65% buffer B over 17 minutes; buffer A: 10 mM ammonium formate pH 4.5, buffer B: 10 mM ammonium formate in 90/10 acetonitrile/water). Conversion was estimated by analysing a 5 mM product standard under the same conditions, extracting the product mass, and comparing its ion count with the ion count of the extracted product mass in the samples.
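The conversion estimate described above amounts to a single-point external-standard calculation. The following is a minimal sketch of that arithmetic, assuming that ion count scales linearly with product concentration and that the standard is diluted and injected exactly like the samples; the function name and the ion counts are hypothetical and for illustration only.

```python
def estimated_conversion(sample_ion_count, standard_ion_count,
                         standard_mM=5.0, substrate_mM=5.0):
    """Estimate fractional conversion from extracted-ion counts.

    Assumes a linear detector response through the origin (single-point
    calibration) and identical dilution of standard and sample.
    """
    if standard_ion_count <= 0:
        raise ValueError("standard ion count must be positive")
    # Product concentration inferred from the 5 mM product standard
    product_mM = standard_mM * sample_ion_count / standard_ion_count
    # Fraction of the 5 mM GalN6yne starting material converted to product
    return product_mM / substrate_mM


if __name__ == "__main__":
    # Hypothetical ion counts, not measured values
    print(f"{estimated_conversion(4.2e6, 6.0e6):.0%}")
```

Because the standard concentration equals the starting substrate concentration (both 5 mM), the estimate reduces to the ratio of the two ion counts.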

In vitro synthesis of UDP-GalN6yne

AGX1 constructs were expressed and purified as reported before.5 Reactions were run in a 15 µL volume containing GalN6yne-1-phosphate (2.5 mM), MgCl2 (5 mM), Tris buffer pH 8 (75 mM), BSA (1 mM), UTP (5 mM), PmPpA (Chemily, 0.045 U) and recombinant WT- or mut-AGX1 (125 nM). Reactions were run for either 2 or 16 hours at 37 °C. 7 µL of each reaction was diluted with 7 µL of acetonitrile, cooled on ice for 30 min and centrifuged (13000 rpm, 30 min) to remove any precipitated enzyme. Supernatants were analyzed by UPLC-MS on an ACQUITY UPLC BEH Glycan 1.7 µm 2.1x50 mm column (90-65% buffer B over 17 minutes; buffer A: 10 mM ammonium formate pH 4.5, buffer B: 10 mM ammonium formate in 90/10 acetonitrile/water).
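The centrifugation steps in the two in vitro protocols are reported in rpm, which corresponds to different relative centrifugal forces on different rotors. As a hedged aid for reproducing them, the sketch below applies the standard conversion RCF = 1.118 x 10^-5 x r x rpm^2 (r in cm). The rotor radius is not given in the text, so the 8 cm value used here is purely an assumption.

```python
def rpm_to_rcf(rpm, rotor_radius_cm):
    """Convert rotor speed (rpm) to relative centrifugal force (x g).

    Standard relation: RCF = 1.118e-5 * radius_cm * rpm**2.
    """
    if rpm < 0 or rotor_radius_cm <= 0:
        raise ValueError("rpm must be >= 0 and radius must be > 0")
    return 1.118e-5 * rotor_radius_cm * rpm ** 2


if __name__ == "__main__":
    # The 8.0 cm radius is an assumed, typical microcentrifuge value
    print(round(rpm_to_rcf(13000, 8.0)))
```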

Plasmids and Cell lines

AGX1WT and AGX1F383A were introduced into the plasmid pSBbi using a previously reported cloning strategy.1 The pSBbi plasmid was a gift from Eric Kowarz (Addgene plasmid #60514; http://n2t.net/addgene:60514; RRID:Addgene_60514).6 pCMV(CAT)T7-SB100 was a gift from Zsuzsanna Izsvak (Addgene plasmid #34879; http://n2t.net/addgene:34879; RRID:Addgene_34879).7 NahK from Bifidobacterium longum (UniProt accession number) was codon-optimised for human expression and inserted into pSBbi-AGX1WT and pSBbi-AGX1F383A by GeneArt (Thermo Fisher, Waltham, USA), containing 2A self-cleaving peptides and a C-terminal HA tag, to give the plasmids pSBbi-AGX1WT-NahK and pSBbi-AGX1F383A-NahK. The full plasmid sequence is provided in Appendix 1. WT or double mutant (DM) versions of GalNAc-T1 or GalNAc-T2 were inserted into these plasmids using an SfiI cloning strategy according to Schumann et al.8 to give the plasmids pSBbi-AGX1WT-NahK-T1WT, pSBbi-AGX1WT-NahK-T1I238A/L295A, pSBbi-AGX1F383A-NahK-T1WT, pSBbi-AGX1F383A-NahK-T1I238A/L295A, pSBbi-AGX1WT-NahK-T2WT, pSBbi-AGX1WT-NahK-DM-T2I253A/L310A, pSBbi-AGX1F383A-NahK-T2WT and pSBbi-AGX1F383A-NahK-T2I253A/L310A. pSBbi-AGX1WT and pSBbi-AGX1F383A were used as templates to prepare the following plasmids: pSBbi-AGX1WT, pSBbi-AGX1F383A, pSBbi-AGX1WT-NahK, pSBbi-AGX1F383A-NahK, pSBbi-AGX1WT-NahK-T1WT, pSBbi-AGX1WT-NahK-T1DM, pSBbi-AGX1F383A-NahK-T1WT, pSBbi-AGX1F383A-NahK-T1DM, pSBbi-AGX1WT-NahK-T2WT, pSBbi-AGX1WT-NahK-T2DM, pSBbi-AGX1F383A-NahK-T2WT and pSBbi-AGX1F383A-NahK-T2DM. AGX1, NahK and GalNAc-T1/T2 constructs were tagged with FLAG, HA and VSV-G tags, respectively.

All cells were screened for contamination by mycoplasma and other cell lines by the Crick Cell Services Science Technology Platform. K-562 cells were propagated in RPMI (Thermo Fisher) with 10% (v/v) FBS, penicillin (100 U/mL) and streptomycin (100 µg/mL). 4T1(GFP) and MLg cells were a gift from Ilaria Malanchi (Francis Crick Institute) and maintained in DMEM (Thermo Fisher) with 10% (v/v) FBS, penicillin (100 U/mL) and streptomycin (100 µg/mL). K-562 and 4T1(GFP) cells stably transfected with pSBbi-AGX1WT or pSBbi-AGX1F383A have been prepared previously.8 K-562 cells were stably transfected with pSBbi-AGX1-NahK or pSBbi-AGX1-NahK_T1/T2 constructs or empty pSBbi-GH using Lipofectamine LTX (Thermo Fisher) according to the manufacturer’s instructions, with a 20:1 (m/m) mixture of pSBbi and pCMV(CAT)T7-SB100 plasmid DNA. After 24 h, cells were harvested and selected in growth medium containing 150 µg/mL hygromycin B (Thermo Fisher) for 7-10 days to obtain stable cells. 4T1(GFP) and MLg cells were stably transfected with the above-mentioned plasmids or empty pSBbi-GH using Lipofectamine 3000 (Thermo Fisher) according to the manufacturer’s specifications, with a 20:1 (m/m) mixture of pSBbi and pCMV(CAT)T7-SB100 plasmid DNA. After 24 h, cells were harvested and selected in growth medium containing 100 µg/mL hygromycin B (Thermo Fisher) for 7-10 days to obtain stable cells.

Analysis of nucleotide-sugar biosynthesis by High Performance Anion Exchange Chromatography

5 million K-562 cells stably transfected with pSBbi-GH, pSBbi-AGX1WT, pSBbi-AGX1F383A, pSBbi-AGX1WT-NahK or pSBbi-AGX1F383A-NahK were fed with 100 µM (from a 100 mM stock solution in DMSO) of the membrane-permeable precursor Ac4GalN6yne, Ac4GalNAlk or Ac3GalN6yne-1-P(SATE)2, or with DMSO. After 16 h, cells were harvested, centrifuged at 500 g for 5 min at 4 °C and resuspended in PBS (1 mL). 0.9 mL of the cell suspension was transferred to O-ring tubes (1.5 mL, Thermo Fisher) and harvested. Zirconia/silica beads (0.1 mm, BioSpec, Bartlesville, USA) were added at a volume similar to the cell pellet, followed by 1:1 acetonitrile/water (1 mL). Cells were lysed using a bead beater (FastPrep-24, MP Biomedicals, Santa Ana, USA) at 6 m/s for 30 s, and the cell lysate was cooled at 4 °C for 10 min. Samples were centrifuged (14000 g, 10 min, 4 °C), and the supernatant was transferred to a fresh tube. The solvent was evaporated by speed vac, and the residue was dissolved in LCMS-grade water (Thermo Fisher, 0.2-0.4 mL) containing 15 µM ADP-α-D-glucose (Sigma-Aldrich, St. Louis, USA). The solution was passed through a centrifuge filter (30 min, 14000 g) using a 3 kDa MWCO Amicon Ultra Centrifugal Filter Unit (Merck). The flow-through was evaporated by speed vac and the residue resuspended in 60 µL of MilliQ water. Lysates were analyzed by anion exchange chromatography using an ICS-6000 equipped with a quaternary pump and a conductivity detector (data collection rate 5.0 Hz, cell temperature 35 °C) on an AS11 2x250 mm column and a 2x50 mm guard column (Thermo Fisher). Solvents were: A = water; B = 1 M NaOH. The gradient was: 0 min 99.9% A, 0.1% B; 3 min 99.9% A, 0.1% B; 8 min 96.9% A, 3.1% B; 13 min 96.4% A, 3.6% B; 38 min 93% A, 7% B; 39 min 90% A, 10% B; 43 min 90% A, 10% B; 48 min 90% A, 10% B. Commercial or synthetic standards (200-500 µM) were used as controls.
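The gradient above is given as a list of time points. As a minimal sketch for checking or re-entering the program, the code below interpolates the percentage of solvent B and the resulting NaOH concentration at any time. Linear ramps between consecutive time points are an assumption (the curve type is not stated in the text), and the function names are illustrative.

```python
# (time in min, % solvent B) taken from the gradient table; B = 1 M NaOH
GRADIENT = [(0, 0.1), (3, 0.1), (8, 3.1), (13, 3.6),
            (38, 7.0), (39, 10.0), (43, 10.0), (48, 10.0)]


def percent_b(t_min, gradient=GRADIENT):
    """Return % B at time t_min, assuming linear ramps between points."""
    if not gradient[0][0] <= t_min <= gradient[-1][0]:
        raise ValueError("time outside the gradient program")
    for (t0, b0), (t1, b1) in zip(gradient, gradient[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)


def naoh_mM(t_min, stock_M=1.0):
    """NaOH concentration (mM) delivered at time t_min."""
    return percent_b(t_min) / 100 * stock_M * 1000


if __name__ == "__main__":
    for t in (0, 5.5, 13, 38.5, 48):
        print(f"{t:>5} min: {percent_b(t):.2f}% B, {naoh_mM(t):.1f} mM NaOH")
```

Under these assumptions the eluent runs from 1 mM NaOH at the start to 100 mM NaOH during the final wash.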

Metabolic cell surface labelling and in-gel fluorescence

K-562 cells (or MCF7) stably transfected with pSBbi-based plasmids were seeded at a density of 250,000 cells/mL into well plates in growth medium without hygromycin. Cells were treated with DMSO, Ac4GalN6yne or Ac4ManNAlk at the indicated concentrations and grown for 20 h. Cells were harvested in a V-shaped 96 well plate and washed twice with 2% (v/v) FBS in PBS (Labelling Buffer, 0.2 mL). Cells were resuspended in Labelling Buffer (35 µL), treated with a solution of 100 µM CuSO4, 600 µM BTTAA (Click Chemistry Tools, Scottsdale, USA), 2.5 mM sodium ascorbate, 2.5 mM aminoguanidinium chloride and 100 µM CF680 picolyl azide (Biotium) in Labelling Buffer (35 µL), and incubated for 7 min at room temperature on an orbital shaker. The click reaction was quenched with 3 mM bathocuproinedisulfonic acid (BCS) in PBS (35 µL). Cells were centrifuged, washed twice with Labelling Buffer and then with PBS, and treated with 100 µL of ice-cold Lysis Buffer (50 mM Tris-HCl pH 8, 150 mM NaCl, 1% (v/v) Triton X-100, 0.5% (v/v) sodium deoxycholate, 0.1% (w/v) sodium dodecyl sulfate (SDS), 1 mM MgCl2 and 100 mU/µL benzonase (Merck)) containing Halt protease inhibitors. Cells were lysed for 20 min at 4 °C on an orbital shaker and centrifuged (1500 g, 20 min, 4 °C). The supernatant was transferred to a new plate and protein concentration was measured with the Pierce™ BCA Protein Assay kit (Thermo Fisher). Loading buffer (a 1:1:1:0.5 (v/v/v/v) mixture of 1 M Tris-HCl pH 6.5, 80% (v/v) glycerol, 10% (w/v) SDS and 1 M dithiothreitol (DTT)) was added, and samples were run on a 10% or 4-20% Criterion™ gel (Bio-Rad, Hercules, USA) for SDS-PAGE and imaged on an Odyssey CLx imager (LI-COR Biosciences, Lincoln, USA). Total protein was stained with Coomassie using Acquastain (Bulldog Bio, Portsmouth, USA). Protein expression was assessed by Western blot with a different set of samples, using antibodies against FLAG tag (rabbit anti-FLAG, 1:2000, ab1162, abcam, Cambridge, UK), HA tag (mouse anti-HA, 1:2000, ab18181, abcam), VSV-G tag (goat anti-VSV-G, 1:550, ab3861, abcam) and GAPDH (rabbit anti-GAPDH, 1:10000, ab9485, abcam).

For enzymatic treatment, lysates from cell surface-labelled cells (5 µg protein) were diluted to 20 µL with 50 mM Tris-HCl pH 7.5 and 150 mM NaCl. Samples were either left untreated, or treated with 2 µL of a 1:10 dilution in PBS of commercial PNGase F (Promega, Madison, USA), 2 µL StcE (96.5 µg/mL solution in PBS)9 or 2 µL of a mixture of SialEXO and OpeRATOR (4 U/µL in PBS, Genovis, Lund, Sweden). Samples were incubated for 2 h at 37 °C, briefly heated to 95 °C and cooled on ice. SDS-PAGE and in-gel fluorescence were performed as described above. In-gel fluorescence and Western blot images were visualized and processed with Image Studio Lite software (LI-COR Biosciences, Lincoln, USA) and cropped in Illustrator (Adobe, San Jose, USA).

SILAC-based quantitative proteomics analysis

K-562 cells stably transfected with pSBbi-based plasmids were individually grown in heavy and light media for 6 doublings, sufficient to achieve a labelling efficiency of >95%, before being fed with either DMSO or 10 μM Ac4GalN6yne. Light media contained RPMI (Thermo Fisher) with 10% (v/v) dialysed FBS, proline (0.1 mg/mL, Thermo Fisher) and 12C/14N light lysine/arginine K0/R0 (0.1 mg/mL, Thermo Fisher) that replace normal Lys and Arg. Heavy media contained RPMI (Thermo Fisher) with 10% (v/v) dialysed FBS, proline (0.1 mg/mL) and 13C6,15N2 heavy lysine/arginine K8/R10 (0.1 mg/mL, Thermo Fisher) that replace normal Lys and Arg. After 20 h, cells were centrifuged (500 g, 5 min) and washed with PBS (2x 200 µL). After being transferred into a V-shaped 96 well plate, cells were lysed with 200 µL of ice-cold Lysis Buffer (50 mM Tris-HCl pH 8, 150 mM NaCl, 1% (v/v) Triton X-100, 0.5% (v/v) sodium deoxycholate, 0.1% (w/v) SDS, 1 mM MgCl2, 100 mU/µL benzonase (Merck, Darmstadt, Germany) containing Halt protease inhibitors, and 50 µM PUGNAc). Cells were lysed for 20 min at 4 °C on an orbital shaker and centrifuged (1500 g, 20 min, 4 °C). The supernatant was transferred to a new plate and protein concentration was measured with the Pierce™ BCA Protein Assay kit (Thermo Fisher). Heavy and light lysates were mixed 1:1 (0.5 mg), adjusted to 250 µL with PBS and incubated for 2 h at RT with 300 µL of Neutravidin bead slurry (Sera-Mag SpeedBeads Neutravidin-Coated Magnetic Beads, Cytiva, Marlborough, USA), previously washed with PBS (2x 200 µL), to remove endogenous biotinylated proteins. The supernatant was collected and diluted to 270 µL with PBS, and 30 µL of 10x CuAAC solution (3 mM CuSO4, 6 mM BTTAA, 1 mM biotin-picolyl azide, 50 mM sodium ascorbate and 50 mM aminoguanidinium chloride) was added. The click reaction was left for 6 hours at RT with shaking.
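Two numbers in this protocol can be checked with simple arithmetic: the labelling efficiency expected after six doublings, and the final reagent concentrations once 30 µL of 10x CuAAC solution is added to 270 µL of sample. The sketch below is illustrative only. It assumes that pre-existing unlabelled protein is simply halved at each doubling (no protein turnover), which makes the estimate a lower bound, and the function names are hypothetical.

```python
def labelled_fraction(doublings):
    """Lower-bound labelled fraction after n doublings in SILAC medium.

    Assumes unlabelled protein is diluted two-fold per doubling and
    ignores turnover, which would only increase incorporation.
    """
    return 1 - 0.5 ** doublings


# 10x CuAAC stock concentrations (mM) as listed in the protocol
STOCK_10X_mM = {
    "CuSO4": 3,
    "BTTAA": 6,
    "biotin-picolyl azide": 1,
    "sodium ascorbate": 50,
    "aminoguanidinium chloride": 50,
}


def final_concentrations_uM(stock_mM, stock_uL=30.0, sample_uL=270.0):
    """Final concentrations (µM) after adding the stock to the sample."""
    factor = stock_uL / (stock_uL + sample_uL)
    return {name: conc * 1000 * factor for name, conc in stock_mM.items()}


if __name__ == "__main__":
    print(f"labelled after 6 doublings: {labelled_fraction(6):.1%}")
    for name, conc in final_concentrations_uM(STOCK_10X_mM).items():
        print(f"{name}: {conc:.0f} µM")
```

Six doublings leave at most 1/64 (about 1.6%) of the original unlabelled protein, consistent with the stated >95% labelling efficiency.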

Samples were treated with 3 mL cold methanol (-20 °C, 10-fold excess) and left for 24 h at -80 °C for protein precipitation.


Samples were then centrifuged (3700 g, 4 °C, 20 min) and the supernatant discarded; pellets were washed with cold methanol (2x 3 mL) and centrifuged between washes (3700 g, 4 °C, 20 min). The supernatant was completely removed (tubes placed upside-down on tissue paper, then left to air dry) and samples were resuspended in 250 µL 0.1% (w/v) RapiGest (Waters) in PBS and sonicated in a water bath for 25 min. Samples were centrifuged (3700 g, 5 min), supernatants were transferred to new tubes and pellets were treated with 250 µL 6 M urea in PBS. Samples were sonicated for 25 min and centrifuged again (3700 g, 5 min). The pellets were resuspended in 250 µL PBS, sonicated for 25 min and centrifuged again. RapiGest, urea and PBS supernatants were then combined and incubated with 350 µL of Neutravidin bead slurry for 2 h at RT. Beads were washed with 1% (w/v) RapiGest (3x 350 µL), 6 M urea in PBS (6x 350 µL), AMBIC (50 mM ammonium bicarbonate, 6x 350 µL) and 40% (v/v) acetonitrile (4x 100 µL). Beads were resuspended in 100 µL of AMBIC containing 10 mM DTT and incubated at 50 °C for 15 min. Beads were washed with AMBIC (2x 350 µL) and 100 µL of 20 mM iodoacetamide in AMBIC was then added. Samples were kept for 30 min in the dark. DTT (10 mM final concentration) was added to the samples. The beads were washed with AMBIC (3x 350 µL) and resuspended in 100 µL of AMBIC, and 300 ng of LysC Mass Spec Grade (Promega) was added to the beads, followed by overnight incubation at 37 °C. The supernatant was transferred to a new tube and 200 ng of Trypsin Gold Mass Spec Grade (Promega) was added. The digestion was left for 8 h at 37 °C. Peptides were desalted by UltraMicroSpin™ (The Nest Group Inc., Ipswich, USA) according to the manufacturer's protocol and vacuum-dried by SpeedVac to remove any traces of organic solvents. Dried peptides were resuspended in 16 µL of 0.1% (v/v) formic acid in water, sonicated for 15 min in a water bath, vortexed briefly and centrifuged for 5 min at 18,000 g.

Peptide mixtures were separated on a 50 cm, 75 µm I.D. Pepmap column over a 60 min gradient run at 40 °C and eluted directly into the mass spectrometer (Orbitrap Fusion Eclipse, Thermo Fisher). Xcalibur software was used to control the data acquisition. A data-dependent acquisition method with HCD fragmentation and MS2 data acquired in the ion trap was used. Raw mass spectrometry files were loaded into MaxQuant software for quantification and identification, using the Homo sapiens FASTA protein sequence database from UniProt for the database search.10 The protein groups table was uploaded into Perseus for data transformation and visualization, and into RStudio for statistical analysis.
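Tables S1 and S2 treat the forward and reverse SILAC experiments as two replicates. One common way to combine such a label-swap pair is to invert the ratio from the reverse experiment so that both replicates report enrichment in the same direction. The sketch below illustrates that idea only. It is not the authors' Perseus/RStudio workflow, and it assumes that the Ac4GalN6yne-fed cells carry the heavy label in the forward experiment and the light label in the reverse experiment, which the text does not specify.

```python
import math


def enrichment_log2(forward_h_over_l, reverse_h_over_l):
    """Combine a forward/reverse SILAC pair into log2 enrichment values.

    Assumption: fed cells are heavy in the forward experiment and light
    in the reverse one, so the reverse H/L ratio is inverted.
    Returns (forward, reverse, mean) as log2(fed/control).
    """
    if forward_h_over_l <= 0 or reverse_h_over_l <= 0:
        raise ValueError("SILAC ratios must be positive")
    fwd = math.log2(forward_h_over_l)
    rev = -math.log2(reverse_h_over_l)  # invert the label-swapped ratio
    return fwd, rev, (fwd + rev) / 2


if __name__ == "__main__":
    # Hypothetical H/L ratios for one protein group
    fwd, rev, mean = enrichment_log2(8.0, 0.25)
    print(fwd, rev, mean)
```

A protein that is truly enriched from fed cells should be positive in both replicates after inversion, whereas label-specific artefacts tend to change sign between them.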

Fluorescence microscopy. For mono-culture samples, non-transfected and stably transfected 4T1(GFP) and MLg cells were seeded into a µ-Plate 24 Well Black (Thistle Scientific Ltd, Glasgow, UK) at a density of 30,000 cells in 350 μL growth medium without hygromycin. Cells were treated with either DMSO, 50 μM Ac4GalN6yne or 25 μM Ac4ManAlk. For co-culture samples, non-transfected and stably transfected MLg cells were seeded into a µ-Plate 24 Well Black (Thistle Scientific Ltd) at a density of 30,000 cells in 350 μL growth medium without hygromycin. After 4 h, 3000 4T1(GFP) cells were plated on top of the MLg cells (1:10 ratio). The co-culture samples were grown for 72 h before feeding with either DMSO, 50 μM Ac4GalN6yne or 50 μM Ac4ManAlk. Cells were incubated for 16 h.

Medium was aspirated, and cells were washed with ice-cold 2% (v/v) FBS in PBS (2x 200 μL). Cells were then treated with 200 μL of a freshly prepared solution containing 300 μM BTTAA, 50 μM CuSO4, 5 mM sodium ascorbate, 5 mM aminoguanidinium chloride and 200 μM biotin-picolyl-azide (Click Chemistry Tools). The reaction was carried out for 3 min at room temperature, the supernatant was aspirated and cells were washed with ice-cold PBS (4x 200 μL). Cells were incubated for 20 min at RT in the dark with 20 μg/mL Streptavidin-Alexa Fluor 647 (BioLegend UK Ltd, Kentish Town, UK) in 1% (w/v) BSA in PBS. After washing with ice-cold PBS (4x 200 μL), cells were fixed for 20 min with cold 4% (v/v) formaldehyde (Thermo Fisher) in 100 mM sodium phosphate buffer pH 7.4, at room temperature in the dark. The reaction was quenched by 5 min incubation with 50 mM ammonium chloride and the cells were washed with PBS (3x 200 μL). Cells were permeabilized with 0.1% (v/v) Triton X-100 in PBS for 10 min at 4 °C and washed with PBS (3x 200 μL). Cells were blocked in a solution of 10% (v/v) normal donkey serum (abcam), 1% (w/v) BSA and 0.1% (v/v) Tween-20 in PBS for 1 h at room temperature. GFP expression was detected by incubation with goat anti-GFP (1:300, ab5450, abcam) in a solution containing 5% (v/v) donkey serum in 1% (w/v) BSA in PBS. Cells were washed with PBS (3x 200 μL) and incubated for 30 min at RT with Alexa Fluor 488 anti-goat secondary antibody (1:500, ab150129, abcam) in a solution containing 1% (v/v) donkey serum in 1% (w/v) BSA in PBS.
Cells were then incubated with Alexa Fluor 568 Phalloidin (Invitrogen A12380, 5 μL of 40X methanol stock solution in each 200 μL of PBS) in 1% (w/v) BSA and 0.1% (v/v) Tween-20, followed by PBS washing (3x 200 μL) and DAPI incubation (1:1000 v/v, Vector Laboratories Ltd, Peterborough, UK) in 1% (w/v) BSA and 0.1% (v/v) Tween-20 in PBS (Thermo Fisher) for 30 min at room temperature. After washing with PBS (3x 200 μL), circular coverslips of 15 mm diameter (Thermo Fisher) were mounted onto each well using ProLong Gold antifade reagent (Invitrogen, Carlsbad, USA). Confocal images were acquired on a Zeiss LSM710 inverted microscope using a Plan Apochromat 40x/1.3 Oil objective. Mono-culture samples were imaged with an acquisition zoom of 1.3, giving a pixel size of 1.59 μm. Three-dimensional images were acquired for the co-culture samples with an acquisition zoom of 0.6, giving a pixel size of 1.59 μm (x, y) and 0.7 μm (z). A sequential scan was used to spectrally separate the fluorescence of DAPI, Alexa Fluor 647, Alexa Fluor 488 and Alexa Fluor 568. In addition, the transmitted light channel was activated to visualize cell morphology. Images were visualized and processed with Fiji and Zen (Zeiss, Oberkochen, Germany) software.11
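The reagent volumes needed to prepare the click labeling solution for a plate can be worked out from the final concentrations given above using C1·V1 = C2·V2. The following is a minimal sketch: the final concentrations and the 200 μL per-well volume are taken from the protocol, whereas all stock concentrations, the number of wells and the use of a single diluent are hypothetical assumptions chosen for illustration and should be replaced with the values actually used.

```python
# Final concentrations in the click labeling solution (mM), from the protocol.
FINAL_MM = {
    "BTTAA": 0.3,
    "CuSO4": 0.05,
    "sodium ascorbate": 5.0,
    "aminoguanidinium chloride": 5.0,
    "biotin-picolyl-azide": 0.2,
}

# Stock concentrations (mM). These are HYPOTHETICAL values, not stated in
# the protocol; adjust them to the stocks available.
STOCK_MM = {
    "BTTAA": 50.0,
    "CuSO4": 20.0,
    "sodium ascorbate": 100.0,
    "aminoguanidinium chloride": 100.0,
    "biotin-picolyl-azide": 10.0,
}

VOLUME_PER_WELL_UL = 200.0  # from the protocol


def click_mix_volumes(n_wells):
    """Return the volume (µL) of each stock, and of diluent, for n_wells."""
    total_ul = n_wells * VOLUME_PER_WELL_UL
    volumes = {}
    for reagent, final_mm in FINAL_MM.items():
        # C1 * V1 = C2 * V2  ->  V1 = C2 * V2 / C1
        volumes[reagent] = final_mm * total_ul / STOCK_MM[reagent]
    # The remainder is made up with diluent (assumed to be a single buffer).
    volumes["diluent"] = total_ul - sum(volumes.values())
    return volumes


def final_fold(stock_fold, stock_ul, total_ul):
    """Final fold-concentration after diluting stock_ul to total_ul."""
    return stock_fold * stock_ul / total_ul


mix = click_mix_volumes(n_wells=24)  # 24 wells is an assumed example
for reagent, volume in mix.items():
    print(f"{reagent}: {volume:.1f} µL")

# Phalloidin: 5 µL of 40X stock per 200 µL corresponds to a 1X working solution.
print(final_fold(stock_fold=40, stock_ul=5, total_ul=200))
```

Because the solution is described as freshly prepared, such a calculation would normally be done immediately before labeling, with some excess volume added to allow for pipetting losses.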

References

1. Debets, M. F. et al. Metabolic precision labeling enables selective probing of O-linked N-acetylgalactosamine glycosylation. Proc. Natl. Acad. Sci. U. S. A. 117, 25293–25301 (2020).
2. York, W. S. et al. GlyGen: Computational and Informatics Resources for Glycoscience. Glycobiology 30, 72–73 (2020).
3. Joshi, H. J. et al. GlycoDomainViewer: a bioinformatics tool for contextual exploration of glycoproteomes. Glycobiology 28, 131–136 (2018).
4. Keenan, T. et al. Profiling Substrate Promiscuity of Wild-Type Sugar Kinases for Multi-fluorinated Monosaccharides. Cell Chem. Biol. 27, 1199-1206.e5 (2020).
5. Cioce, A. et al. Optimization of Metabolic Oligosaccharide Engineering with Ac4GalNAlk and Ac4GlcNAlk by an Engineered Pyrophosphorylase. doi:10.1021/acschembio.1c00034.
6. Kowarz, E., Löscher, D. & Marschalek, R. Optimized Sleeping Beauty transposons rapidly generate stable transgenic cell lines. Biotechnol. J. 10, 647–653 (2015).
7. Mátés, L. et al. Molecular evolution of a novel hyperactive Sleeping Beauty transposase enables robust stable gene transfer in vertebrates. Nat. Genet. 41, 753–761 (2009).
8. Schumann, B. et al. Bump-and-Hole Engineering Identifies Specific Substrates of Glycosyltransferases in Living Cells. Mol. Cell 78, 824-834.e15 (2020).
9. Malaker, S. A. et al. The mucin-selective protease StcE enables molecular and functional analysis of human cancer-associated mucins. Proc. Natl. Acad. Sci. U. S. A. 116, 7278–7287 (2019).
10. Tyanova, S., Temu, T. & Cox, J. The MaxQuant computational platform for mass spectrometry–based shotgun proteomics. (2016) doi:10.1038/nprot.2016.136.
11. Schindelin, J. et al. Fiji: An open-source platform for biological-image analysis. Nat. Methods 9, 676–682 (2012).
