Mining the O-mannose glycoproteome reveals cadherins as major O-mannosylated glycoproteins

Malene B. Vester-Christensen (a,1,2), Adnan Halim (a,1), Hiren Jitendra Joshi (a), Catharina Steentoft (a), Eric P. Bennett (a,b), Steven B. Levery (a), Sergey Y. Vakhrushev (a,3), and Henrik Clausen (a,3)

(a) Copenhagen Center for Glycomics and Department of Cellular and Molecular Medicine and (b) School of Dentistry, Faculty of Health Sciences, University of Copenhagen, DK-2200 Copenhagen N, Denmark

Edited by Stuart A. Kornfeld, Washington University School of Medicine, St. Louis, MO, and approved September 11, 2013 (received for review July 22, 2013)

The metazoan O-mannose (O-Man) glycoproteome is largely unknown. It has been shown that up to 30% of brain O-glycans are of the O-Man type, but essentially only alpha-dystroglycan (α-DG) of the dystrophin–glycoprotein complex is well characterized as an O-Man glycoprotein. Defects in O-Man glycosylation underlie congenital muscular dystrophies and considerable efforts have been devoted to explore this O-glycoproteome without much success. Here, we used our SimpleCell strategy using nuclease-mediated gene editing of a human cell line (MDA-MB-231) to reduce the structural heterogeneity of O-Man glycans and to probe the O-Man glycoproteome. In this breast cancer cell line we found that O-Man glycosylation is primarily found on cadherins and plexins on β-strands in extracellular cadherin and Ig-like, plexin and transcription factor domains. The positions and evolutionary conservation of O-Man glycans in cadherins suggest that they play important functional roles for this large group of cell adhesion glycoproteins, which can now be addressed. The developed O-Man SimpleCell strategy is applicable to most types of cell lines and enables proteome-wide discovery of O-Man protein glycosylation.

POMGnT1 | O-glycosylation | Orbitrap | mass spectrometry | glycoproteomics

Significance

Protein O-mannosylation is believed to be an abundant modification of proteins, but only very few glycoproteins with O-mannose have been identified to date. Here, we present a unique strategy for proteome-wide discovery of O-mannosylated glycoproteins, and using this strategy we find that the important cadherin and plexin families of cell membrane receptors are O-mannosylated. The presented strategy invites the opportunity for wider exploration of the O-mannose glycoproteome and studies of the functions of O-mannose glycans.

Protein O-glycosylation of the O-mannose (O-Man) type found on Ser and Thr residues was initially thought to be restricted to yeast and fungi, and only within the last two decades has it been shown that this glycosylation occurs in metazoans (1, 2). O-Man glycosylation of the basement membrane glycoprotein α-dystroglycan (α-DG) is essential for assembly and function of the dystrophin–glycoprotein complex that links the cytoskeleton with the extracellular matrix, and deficiencies in all of the enzymes involved in the O-Man glycosylation underlie congenital muscular dystrophies (dystroglycanopathies) (3–5).

In yeast, O-mannosylation is found on a wide variety of proteins, although this O-glycoproteome is still rather unexplored. In mammals, early studies identified O-Man oligosaccharides released from isolated rat brain proteoglycans by mild alkaline borohydride treatment (6, 7), and subsequent studies clarified these structures as shown in Fig. 1A (8). However, for a long time, α-DG was the only well-characterized protein known to be O-mannosylated in mammals despite evidence that O-Man glycans constitute a major part of the total O-glycans in the brain (8, 9). A few other proteins have been demonstrated or suggested to contain O-Man glycans, including recombinantly expressed IgG2 (10), RPTPβ/ζ (11), CD24 (12), neurofascin 186 (13), as well as lecticans (14), and gel-based analysis has suggested that O-Man glycoproteins are of high molecular weight (9). More recently, it was demonstrated that the brains of α-DG−/− brain-specific knockout mice had unchanged levels of O-Man glycans (15).

There are multiple types of metazoan protein O-glycosylation, most of which have been demonstrated to play roles in protein structure and function. However, our knowledge of O-glycoproteomes and especially sites of attachment of O-glycans is still very limited for several types of O-glycosylation including O-Man. Technical constraints primarily related to lack of enrichment strategies and predictive consensus sequence motifs, in addition to heterogeneity in structures of O-glycans, have long hampered identification of O-glycan attachment sites (9, 16).

O-mannosylation of proteins is initiated in endoplasmic reticulum (ER) by transfer of mannose from dolichol monophosphate-activated mannose to serine and threonine by homologous protein O-mannosyltransferases (PMTs). Yeast has at least six PMTs grouped into three subfamilies, PMT1, PMT2, and PMT4, while metazoans only have two PMT orthologs, POMT1 and POMT2, grouped in subfamilies PMT4 and PMT2, respectively (17). The specific contribution of each PMT isoform to O-mannosylation is predicted to be different but detailed information on specificities of individual isoenzymes is largely missing (1). Studies have shown, however, that members of the PMT4 subfamily may act selectively on membrane bound protein substrates (18).

In human, O-Man glycans are generally elongated sequentially with N-acetylglucosamine (GlcNAc) and galactose (Gal) and capped by sialic acid (NeuAc), glucuronic acid (GlcA), or fucose (Fig. 1A). The first step in this elongation is the addition of GlcNAc to the O-Man glycans, catalyzed by a single enzyme, protein O-linked-mannose beta-1,2-N-acetylglucosaminyltransferase 1 (POMGnT1) (4). An alternative elongation pathway, which has so far only been identified at one specific O-Man site in α-DG, involves elongation of O-Man by the UDP-GlcNAc: Manα1-O-Ser/Thr β1,4GlcNAc-transferase (POMGnT2/GTDC2) (19). This pathway involves further elongation by β3GalNAc-T2 with subsequent phosphorylation of the mannose residue by POMK (SGK196) which is extended with GlcA and xylose by like-acetylglucosaminyltransferase to form the structure required for DG function (19, 20).

Author contributions: M.B.V.-C., A.H., H.J.J., C.S., E.P.B., S.Y.V., and H.C. designed research; M.B.V.-C., A.H., C.S., and E.P.B. performed research; C.S. contributed new reagents/analytic tools; M.B.V.-C., A.H., H.J.J., S.B.L., S.Y.V., and H.C. analyzed data; and M.B.V.-C., A.H., H.J.J., and H.C. wrote the paper.

The authors declare no conflict of interest.

This article is a PNAS Direct Submission.

See Commentary on page 20858.

1 M.B.V.-C. and A.H. contributed equally to this work.

2 Present address: Department of Mammalian Cell Technology, Novo Nordisk A/S, DK-2760 Måløv, Denmark.

3 To whom correspondence may be addressed. E-mail: [email protected] or [email protected].

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1313446110/-/DCSupplemental.

21018–21023 | PNAS | December 24, 2013 | vol. 110 | no. 52 | www.pnas.org/cgi/doi/10.1073/pnas.1313446110

[Figure 1. Panel A: O-Glycosylation Pathways (O-Man and O-GalNAc). Panel B: O-Man Lectin Enrichment. Symbol key: Man, GlcNAc, GalNAc, Gal, GlcA, Xyl, Sialic Acid.]

Fig. 1. Mining the O-Man glycoproteome by ZFN gene targeting of POMGNT1. (A) Knockout of POMGNT1 and COSMC abrogates elongation of O-Man (Left) and O-GalNAc (Right) type glycosylation, respectively, resulting in truncated homogeneous O-glycan structures limited to O-Man and O-GalNAc (in some cells NeuAcα2–3GalNAc). Note that POMGNT2-mediated elongation of O-Man will not be affected by this gene targeting strategy. (B) Double knockout of POMGNT1 and COSMC in MDA-MB-231 cells allows for enrichment of O-Man glycoproteins from the cell culture supernatant using a short Con A mannose-binding lectin column (captures both O-Man and N-glycoproteins) or for O-GalNAc glycoprotein enrichment on a short VVA αGalNAc-binding lectin column. Enriched eluates from these columns or total cell lysates can then be treated with proteases and PNGase F to remove N-glycans before glycopeptides are isolated by LWAC with a long Con A column and sequenced by nanoflow liquid chromatography-mass spectrometry (nLC-MS/MS).

We previously developed a strategy to analyze GalNAc-type O-glycoproteins on a proteome-wide level using zinc finger nuclease (ZFN) gene engineered human cell lines with simplified O-glycan structures, the so-called “SimpleCell” (16). “Bottom-up” higher-energy collision-induced dissociation–electron-transfer dissociation (HCD/ETD)-based mass spectrometric analysis has allowed us to provide a first draft of the human O-GalNAc glycoproteome where we expanded the knowledge of O-glycoproteins and O-glycosites many fold (21). Here, we designed a similar strategy to simplify O-Man glycans in a human breast cancer cell line, which allowed us to probe the O-Man glycoproteome. The strategy led to identification of over 50 unique O-Man glycoproteins with more than 230 glycosites, and the surprising finding that the large cadherin family of membrane receptors is the major carrier of O-Man glycans.

Results and Discussion

Development of O-Man SimpleCells for O-Glycoproteomics. We used the human MDA-MB-231 breast cancer cell line because we previously identified O-GalNAc glycosites in the mucin-linker domain of α-DG in MDA-MB-231 O-GalNAc SimpleCells (21). α-DG contains both O-Man and O-GalNAc glycans in this region (22–24), and some sites are potentially occupied by either type of glycosylation (25). This dual-occupancy was also proposed in neurofascin 186 (13). We reasoned that we could identify O-Man sites and the interplay between the two types of O-glycosylation on α-DG and other proteins in MDA-MB-231 with both types of glycosylation simplified, making both O-Man and O-GalNAc glycopeptides easily identifiable. O-Man was simplified by ZFN targeting of the POMGNT1 gene, which encodes for POMGnT1 controlling the first step in elongation of O-Man glycans (Fig. 1A) (4). Knockout of POMGnT1 is expected to truncate most O-Man glycosylation, although the function of POMGnT2 recently shown to extend O-Man by β1,4GlcNAc on α-DG could potentially compensate (19) (Fig. 1A). So far this pathway has only been associated with a few O-Man glycosites (Thr317, Thr319, and Thr379) in α-DG (20, 26), and the POMGnT2 enzyme has only been tested with a peptide derived from the site at Thr317. Moreover, we did not identify glycosites in these regions in the present study despite finding many of the expected O-Man glycosites in the N-terminal region of the mucin domain of α-DG (Fig. 2). Nevertheless, mice deficient in POMGnT1 have previously been shown to produce truncated O-glycans, suggesting that the majority of O-Man glycosylation is elongated by POMGnT1 (15). O-GalNAc was simplified by targeting the private chaperone, COSMC, for the core UDP-Gal: GalNAcα1-O-Ser/Thr β1,3galactosyltransferase (C1GalT1) controlling the first step in O-GalNAc elongation (16). We previously developed a workflow for isolation and characterization of GalNAc O-glycopeptides from O-GalNAc SimpleCells (16, 27), which we modified for Man O-glycopeptides using Concanavalin A (Con A) lectin chromatography (Fig. 1B). Total cell lysates as well as secretomes were analyzed after enrichment of the growth medium on a short Con A column. Because Con A binds mannose and also will enrich for N-glycopeptides the

[Figure 2. Domain scheme of α-dystroglycan with boundaries marked at residues 1, 30, 316, 486, and 653: SP, N-terminal, mucin-like, and C-terminal regions. Three sites are marked P. Sequence of the mucin-like domain, residues 316–485:]

ATPTPVTAIGPPTTAIQEPPSRIVPTPTSPAIAPPTETMAPPVRDPVPGKPTVTIRTRGAIIQTPTLGPIQPTRVSEAG
TTVPGQIRPTMTIPGYVEPTAVATPPTTTTKKPRVSTPKPATPSTDSTTTTTRRPTKKP
RTPRPVPRVTTKVSITRLETASPPTRIRTTTS

Fig. 2. Schematic representation of the identified O-Man and O-GalNAc glycosylation sites in the mucin-like domain of human α-dystroglycan. Glycopeptides covered by the nLC-MS/MS analysis are underlined.

total protease digests were treated with PNGase F to remove N-glycans before isolation of O-Man glycopeptides on a long Con A column and subsequent separation and analysis by nanoflow liquid chromatography-mass spectrometry (nLC-MS/MS). Using this strategy O-Man glycopeptides from α-DG and a total of 51 unique O-Man glycoproteins were identified, comprising a total of 235 O-Man glycosites (Table 1 and Dataset S1). We did not identify any of the individual proteins more recently suggested to carry O-Man glycans (10–14), which could be due to a number of reasons including experimental limitations as well as lack of expression of these proteins in MDA-MB-231 cells. Both site and peptide data for the identified O-Man glycosites have been made available on the GlycoDomainViewer (21). The GlycoDomainViewer illustrates identified glycosites in the context of the entire protein amino acid sequence and assigned domain structures. The Viewer shows the O-Man glycosites as well as other glycosylation sites identified in previous studies. The GlycoDomainViewer can be accessed online (http://glycodomain.glycomics.ku.dk/doi/10.1073/pnas.1313446110/).

Identification of O-Man Glycosites on α-DG. For α-DG we obtained the greatest coverage of human O-glycosites so far, and we could confirm that O-Man and O-GalNAc glycosites are distributed in N- and C-terminal regions, respectively, of the mucin domain (Fig. 2) (25). Four O-Man glycopeptides from α-DG were identified allowing assignment of 11 additional unique human O-Man glycosites in the N-terminal region of the mucin domain (Fig. S1). While only a few of these sites have been identified in man, the majority of the sites have previously been identified in mice and rabbits (20, 25, 28–30). No glycopeptides were identified in the region previously shown to carry the phosphoryl-Man glycan (Fig. 2), which was expected if simple O-Man glycans are not available on digested peptide fragments for capture by Con A. These data also support the notion that the phosphoryl modification of O-Man at Thr317, Thr319, and Thr379 is specific for these residues. We previously identified a number of O-GalNAc glycosites using the SimpleCell strategy (Fig. 2) (21), and in the present study no glycopeptides with mixtures of O-Man and O-GalNAc were identified, but one glycopeptide (Pro361-Arg373) with three O-Man glycosites (Thr367, Thr369, and Thr372) in the middle was previously found with two O-GalNAc O-glycosites (Thr367 and Thr369). This confirms that some glycosites serve as acceptors for both O-Man and O-GalNAc glycosylation.

Cadherins Are Major O-Man Glycoproteins. The major surprise was identification of the large cadherin family of cell membrane receptors as the major carrier of O-Man glycans (Table 1 and Dataset S1). We identified 37 distinct members of the cadherin superfamily with O-Man glycosites, and further sequence analysis suggests distinct evolutionary conservation in distribution of glycosites, in particular in the extracellular cadherin (EC) domains EC2-5 of classical type 1 and 2 cadherins (Fig. 3). No O-Man glycosites were found in the EC1 domains of classical cadherins, but we previously identified an O-GalNAc glycosite in EC1 of E- and T-cadherins that appears conserved in several cadherins (Fig. 3). Some of the O-Man sites identified in the cadherin EC2-5 domains were also found in the crystal structure of secreted E- and N-cadherins expressed in human HEK293 cells, although these were not concluded to represent O-Man (31). Although less conserved, the protocadherins exhibit similar distribution and conservation of glycosites. In protocadherins, the majority of sites were found in the largest subgroup of the diverse cadherin superfamily, the clustered protocadherins. The clustered protocadherins are predominantly expressed in the brain (32), and glycosites were mainly found in EC2-3 and to some degree in EC5-6 but not EC4. The function of O-Man glycans in cadherins is clearly unknown at present, but the conserved distribution and orientation of glycans on the surface of the membrane proximal EC domains offer a plausible scenario for positive or negative guidance of trans- and cis-interactions required for the assembly of the network of cadherin ectodomains that is believed to be the basis for the extracellular architecture of adherens junctions (31, 33).

Plexins Are O-Man Glycoproteins. Another large family of cell membrane receptors identified was the plexins, for which we

Table 1. Identified O-Man glycoproteins and glycosites

Superfamily                 No. of proteins   No. of glycosites
Cadherins                   37                133
Classical                   7                 37
Solitary                    1                 2
Desmocollin                 1                 1
Desmoglein                  1                 3
Flamingo                    2                 5
CR-1b FAT4                  1                 16
CR-1b Dachsous              1                 1
Clustered protocadherin     13                38
Nonclustered                7                 20
Nonclustered Delta-1        2                 10
Nonclustered Delta-2        4                 8
CR-2                        1                 3
CR-3 FAT-like               1                 6
Cadherin fragment           1                 1
Plexins                     6                 8
KIAA1549                    1                 64
Others                      8                 30
α-DG                        1                 13
Total                       52                235

[Figure 3. Panel A: domain schematics (EC1–EC5) of type 1 and type 2 cadherins. Panel B: CADH1 and CAD11 compared across Human, Mouse, Chick, Xenla, and Danre. Legend: calcium ion, predicted O-Man sites, predicted O-GalNAc sites, cadherin ectodomain, cytoplasmic domain.]

Fig. 3. Cadherins are the major class of O-mannosylated proteins. (A) Schematic drawing of extracellular domains of classical type 1 and type 2 cadherins. White circles illustrate potential O-Man glycosites predicted based on sequence conservation of Ser/Thr residues in alignments. Green-white circles show glycosites that were ambiguously identified, i.e., a glycopeptide was identified, but localization of the glycan was not identifiable by ETD. The O-GalNAc site identified by the O-GalNAc SimpleCell strategy is shown as a yellow square (21). Predicted O-GalNAc sites based on sequence identity are depicted as white squares. The O-Man sites are located either on or at the border of β-strands B and G. (B) Evolutionary conservation of predicted O-Man glycosites in type 1 and 2 cadherins. Representative analyses of CADH1 and CAD11 are shown with prediction of highly conserved and distinct patterns of distribution of glycosites from human to zebrafish in type 1 and type 2 cadherins. O-Man glycosites identified are depicted as green circles. These also include glycosites found in the crystal structures of mouse CADH1 and CADH2 (31).

found conserved O-Man glycans on Ig-like, plexin and transcription factor (IPT) domains (Fig. S2). Plexins are large transmembrane glycoproteins that function as the receptors for semaphorins serving as axon guidance cues for neural development, but they also have nonneural roles (34). The extracellular domains of plexins have one sema domain, two or three Met-related sequences, and several glycine–proline-rich Ig-domains that are shared by plexins and transcription factors, and all these domains are shared with the Met family tyrosine kinases. The O-Man sites were identified in three out of four IPT domains of the plexins, and the sites are largely conserved in all four plexin subgroups. We also found O-Man sites in the IPT domains of two members of the Scatter Factor Receptor family, HGFR and MST1R.

A Unique O-Man Mucin-Like Membrane Glycoprotein. The study also identified an uncharacterized large membrane protein (KIAA1549) with a high number of O-Man glycosites (Dataset S1). The function of this protein is unknown and it has only been implicated in a tandem duplication event at 7q34 that leads to a fusion between KIAA1549 and BRAF and to expression of a fusion protein with constitutive kinase activity in a majority of pilocytic astrocytomas (35). This protein is predicted to have a 936-aa N-terminal extracellular domain, and we identified 64 O-Man glycosites on 24 glycopeptides covering 268 amino acids evenly spread over the 936-aa extracellular domain. This exceptional pattern of O-Man glycosylation is illustrated using the online resource GlycoDomainViewer in Fig. S3. While this protein has not been identified in our O-GalNAc SimpleCells as a GalNAc O-glycoprotein, the O-GalNAc prediction algorithm NetOGlyc4.0 (21) in fact predicts the entire extracellular domain to be evenly O-glycosylated including all of the glycosites identified in this study with O-Man. Similarly, NetOGlyc4.0 predicts the whole mucin-like region of α-DG to be glycosylated, also predicting identified O-Man sites as O-GalNAc glycosylation sites, highlighting that we currently cannot predictively distinguish O-Man and O-GalNAc glycosylation sites using this predictor. Given that the extracellular domain of KIAA1549 displays an evenly distributed high density of Ser and Thr residues and no other assigned or apparent structural features, it is likely that the entire ectodomain carries O-Man glycans in the regions not identified in the present study. Thus, KIAA1549 may be representative of a mucin-like molecule with dense O-Man glycosylation similar to traditional mucins with heavy O-GalNAc glycosylation.

Finally, we identified one O-Man glycosite in one of the disulphide-isomerases, PDIA3 (Dataset S1), which we have previously

also identified with O-GalNAc (21). All of the protein disulphide-isomerases (PDIs) have C-terminal KDEL-like ER retrieval signals and hence will be exposed to both the ER and Golgi, which offers the possibility of acquiring either O-Man or O-GalNAc glycans. O-Man glycosylation is initiated co- and posttranslationally in ER and can interfere with N-glycosylation and protein folding (36, 37), while O-GalNAc glycosylation is initiated in cis-Golgi (38). However, the initiating polypeptide GalNAc-transferases can function in ER (38) and they can relocate to ER by activation of Src kinases, opening up the possibility for direct competition (39, 40). This may represent another example of common substrate recognition between two forms of O-glycosylation similar to α-DG as discussed above.

Given the large number of O-Man glycosites now available, we sought to identify potential acceptor sequence motifs for O-mannosylation. Among all of the sites identified in cadherins and plexins, there did not appear to be discernible common sequence features suggestive of a glycosylation motif (Fig. S4). Instead, we found that all but one of the O-Man glycosites identified in proteins for which structures are deposited in structural databases are located on β-strands (Figs. S5–S7). O-Man glycosites in α-DG and KIAA1549 are located in disordered regions, and looking at these separately also did not produce any clear sequence motifs. Interestingly, the NetOGlyc4.0 predictor showed overlap with O-Man glycosites only in these proteins and not in cadherins and plexins, which is in agreement with the finding that O-GalNAc glycosites are mainly predicted to be located in disordered regions (21). Humans have only two O-mannosyltransferases, POMT1 and POMT2, and in yeast an ortholog of POMT1 in the PMT4 subfamily was demonstrated to selectively function with membrane proteins (18). It is thus tempting to speculate that POMT1 and POMT2 have different substrate specificities and that POMT1 primarily functions with membrane proteins with a preference for β-strands, while POMT2 primarily recognizes linear sequence motifs in disordered regions similar to polypeptide GalNAc-transferases. Deficiencies in both genes underlie similar congenital muscular dystrophy phenotypes, but this may be associated with a need for POMT2 to function in a heteromeric complex (17, 41).

O-Man glycans are clearly essential for ligand binding to α-DG, as demonstrated by the severe dystroglycanopathies caused by hypoglycosylation (3). Congenital deficiencies in all of the genes involved in O-Man glycosylation produce muscular dystrophy phenotypes (42), but it is premature to identify phenotypic characteristics ascribable to other O-Man glycoproteins such as cadherins and plexins. However, it is clear that both families of proteins serve a range of essential functions that could be part of the phenotypes observed. O-Man glycosylation in yeast has multiple functions including those ascribed to O-GalNAc glycosylation in metazoans, i.e., roles for conformation, stability, and secretion of proteins (1). In yeast, Pmt1p/Pmt2p-directed O-Man glycosylation has been demonstrated to play a role for ER protein quality control (43), and more recently the yeast Pmt1/2 mannosyltransferase complex was demonstrated to serve as a termination sensor of futile protein folding in ER (37).

In conclusion, the proteome-wide analysis of the O-Man glycoproteome presented here provides an entirely unique view of distinct classes of O-Man glycoproteins such as the cadherins and plexins, and the study provides insight into the structural feature (β-strands) governing this type of glycosylation. We applied the O-Man SimpleCell engineering strategy to a breast cancer cell line that we had already engineered for simplification of O-GalNAc glycosylation to be able to compare O-Man and O-GalNAc glycosites on α-DG. The O-Man glycoproteome is therefore limited to proteins expressed and O-mannosylated in this cell line. Given that O-Man glycans appear to be particularly abundant in brain, further exploration of the O-Man glycoproteome is clearly warranted, and the presented SimpleCell engineering strategy for exploration of the O-Man glycoproteome can now be widely applied to other cell lines and transgenic animals.

Materials and Methods

ZFN Gene Targeting. ZFN targeting constructs for COSMC and POMGNT1 were custom produced (Sigma-Aldrich) with the following binding and (cutting) sites: COSMC 5′-CCCAACCAGGTAGT(AGAAGGCT)GTTGTTCAGATATGGCTGTT-3′; and POMGNT1 5′-AGCCAAGGCTCT(GCTGA)GGAGCCTGGGCAGCCAGG-3′. The MDA-MB-231 COSMC knockout O-GalNAc SimpleCell was generated as previously described (21). The MDA-MB-231 COSMC/POMGNT1 double knockout was generated by three sequential transfections with POMGNT1 ZFN pairs in the MDA-MB-231 COSMC knockout cell line. Transfected pools were single cell cloned by limiting dilution and clones identified by size difference in PCR amplified around the ZFN cut site. Mutations were identified by sequencing of TOPO-cloned PCR fragments using the following primer pairs: 5′-AGGGAGGGATGATTTGGAAG-3′ and 5′-TTGTCAGAACCATTTGGAGGT-3′ for COSMC; and 5′-TAGTTCGTGCTCTGTGAGGC-3′ and 5′-CACTGCCACTGGCTCCTATT-3′ for POMGNT1 (Fig. S8).

Lectin Weak Affinity Chromatography Isolation of O-Man Glycopeptides. The lectin weak affinity chromatography (LWAC) protocol for isolation of O-Man glycopeptides was modified from our previously described method for isolation of O-GalNAc glycopeptides (16). In brief, a total of 140 mL conditioned media (secretome) or 0.5 mL packed cells (total cell lysate) was harvested from the MDA-MB-231 COSMC/POMGNT1 double knockout cell line grown in DMEM supplemented with 10% (vol/vol) FBS and 1% glutamine and processed as detailed hereafter.

Secretome. Conditioned media obtained from 4× T175 flasks (4 × 35 mL) cultured for 72–96 h, harvested by centrifugation (2,500 × g, 10 min) were dialyzed (Mw cutoff, 3,500 Da) twice against 5 L 20 mM Tris·HCl, pH 7.4, centrifuged (2,500 × g, 10 min), and diluted in 140 mL 2× Con A binding buffer A (40 mM Tris·HCl, pH 7.4, 300 mM NaCl, 2 mM CaCl2/MgCl2/MnCl2/ZnCl2, 1 M urea) or 2× Vicia villosa agglutinin (VVA) buffer (40 mM Tris·HCl, pH 7.4, 300 mM NaCl, 2 mM CaCl2/MgCl2/MnCl2/ZnCl2, 2 M urea) before being subjected to either Con A or VVA lectin chromatography for enrichment of O-mannosylated and N-glycosylated glycoproteins or O-GalNAc glycoproteins, respectively. Con A and VVA agarose (Vector Laboratories) 0.5–0.7 mL in 2 mL syringes were equilibrated in Con A buffer A or VVA buffer A, respectively. The sample was loaded twice followed by 10–20 column volumes (CVs) wash in Con A buffer A or VVA buffer A, 2–4 CVs in 50 mM ammonium bicarbonate, and enriched glycoproteins were eluted by heating (2× 90 °C 10 min, and 2× wash of the beads) in the presence of 0.05% RapiGest (Waters) in 50 mM ammonium bicarbonate. The eluates were further processed as for cell lysates for digestion and N-glycan deglycosylation described below.

Total cell lysates. Packed cells from 2× T175 flasks at confluency were lysed in 0.1% RapiGest in 50 mM ammonium bicarbonate with a sonic probe and the solution cleared by centrifugation (1,000 × g for 10 min). The cleared lysate and secretome samples were heated for 10 min at 80 °C, followed by reduction (5 mM DTT, 60 °C, 0.5 h) and alkylation (10 mM iodoacetamide, room temperature, 0.5 h), and digestion with trypsin (25 μg, Roche), chymotrypsin (25 μg, Roche) or GluC (20 μg, Promega) (37 °C, overnight). Proteases were heat inactivated (95 °C, 20 min) before N-glycanase treatment with PNGase F (8 U, Roche) (37 °C, overnight), and then incubated 4 h more with additional PNGase F (3 U). N-deglycosylated digests were treated with TFA (8 μL, 37 °C, 20 min), cleared by centrifugation, purified on C18 Sep-Pak (Waters), concentrated by Speedvac, and resuspended in Con A buffer A to 1 mL before loading onto a preequilibrated (Con A buffer A: 20 mM Tris·HCl, pH 7.4, 150 mM NaCl, 1 mM CaCl2/MgCl2/MnCl2/ZnCl2, 0.5 M urea) 2.8-m long Con A lectin agarose column for isolation of O-Man glycopeptides. The column was washed with 10 CVs Con A buffer A (100 μL·min−1) before elution with Con A buffer B (20 mM Tris·HCl, pH 7.4, 300 mM NaCl, 2 mM CaCl2/MgCl2/MnCl2/ZnCl2, 0.5 M methyl-α-D-glucopyranoside/methyl-α-D-mannopyranoside) (5 CVs, 50 μL·min−1, 1-mL fractions); glycopeptide-containing fractions were purified by Stage Tips (Thermo Scientific) for analysis.

nLC-MS/MS Analysis. Mass spectrometric analyses were performed essentially as previously described (27). Samples were analyzed either on a setup composed of an EASY-nLC II (Thermo Fisher Scientific) interfaced via a nanoSpray Flex ion source to an LTQ-Orbitrap XL hybrid spectrometer (Thermo Fisher Scientific) or an EASY-nLC 1000 (Thermo Fisher Scientific) interfaced via a nanoSpray Flex ion source to an LTQ-Orbitrap Velos Pro hybrid spectrometer (Thermo Fisher Scientific). The instruments were equipped with capabilities for both HCD- and ETD-MS2 fragmentation modes.
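The ZFN site notation used under ZFN Gene Targeting, in which the two binding half-sites flank a parenthesized cutting region, can be made explicit with a small parser. The following is only an illustrative sketch: the helper name `split_zfn_site` and the labels `left`, `spacer`, and `right` are hypothetical and not part of the published workflow; the two sequences are copied from the text.

```python
import re

# Target sites as quoted in the ZFN Gene Targeting section (5' to 3').
# Parentheses mark the cutting region between the two binding half-sites.
ZFN_SITES = {
    "COSMC": "CCCAACCAGGTAGT(AGAAGGCT)GTTGTTCAGATATGGCTGTT",
    "POMGNT1": "AGCCAAGGCTCT(GCTGA)GGAGCCTGGGCAGCCAGG",
}

# LEFT(SPACER)RIGHT, DNA letters only.
SITE_PATTERN = re.compile(r"^([ACGT]+)\(([ACGT]+)\)([ACGT]+)$")


def split_zfn_site(site):
    """Split a 'LEFT(SPACER)RIGHT' string into its parts (hypothetical helper)."""
    match = SITE_PATTERN.match(site)
    if match is None:
        raise ValueError(f"not in LEFT(SPACER)RIGHT form: {site!r}")
    left, spacer, right = match.groups()
    return {"left": left, "spacer": spacer, "right": right}


if __name__ == "__main__":
    for gene, site in ZFN_SITES.items():
        parts = split_zfn_site(site)
        # Report the length of each segment for a quick sanity check.
        print(gene, {name: len(seq) for name, seq in parts.items()})
```

Keeping the parenthesized region separate makes it straightforward to locate the expected cut position when comparing PCR amplicon sizes around the site.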

21022 | www.pnas.org/cgi/doi/10.1073/pnas.1313446110

The conditions of LC analysis were essentially as previously described (27). The EASY-nLC II was equipped with a single analytical column setup using PicoFrit Emitters (New Objective, 75-μm inner diameter) packed in-house with Reprosil-Pure-AQ C18 phase (Dr. Maisch, 3-μm particle size). The EASY-nLC 1000 was also operated in a single analytical column setup (28-cm length, 75-μm inner diameter, and 1.9-μm particle size). Briefly, a precursor MS1 scan (m/z 350–1,700 for LTQ-Orbitrap XL or 350–1,500 for LTQ-Orbitrap Velos Pro) of intact peptides was acquired in the Orbitrap at a nominal resolution setting of 30,000, followed by Orbitrap HCD-MS2 and ETD-MS2 (m/z of 100–2,000) of the three (LTQ-Orbitrap XL) or five (LTQ-Orbitrap Velos Pro) most abundant multiply charged precursors in the MS1 spectrum; a minimum MS1 signal threshold of 5,000 ions (LTQ-Orbitrap XL) or 20,000 (LTQ-Orbitrap Velos Pro) was used for triggering data-dependent fragmentation events; MS2 spectra were acquired at a resolution of 7,500 (LTQ-Orbitrap XL) or 15,000 (LTQ-Orbitrap Velos Pro).

Data Analysis. Data processing was carried out using Proteome Discoverer 1.4 software (Thermo Fisher Scientific) as previously described (27) with minor changes in preprocessing and processing procedures. The monosaccharide subtraction routine for correctly interpreting HCD spectra was used as previously described (27), i.e., the exact masses of 1, 2, 3, and 4 hexose (Hex) units were subtracted from the corresponding precursor ion mass, generating four distinct files. The .raw files and subtracted .mgf files were searched against the human-specific UniProt database downloaded on Feb. 13, 2013, using semispecific trypsin, chymotrypsin, or GluC proteolytic cleavage. Carbamidomethyl was set as a fixed modification for cysteine residues; methionine oxidation and asparagine deamidation were set as variable modifications; Hex was allowed as an additional variable modification when considering ETD-MS2 spectra. Fragmentation spectra of candidate-matched glycopeptides associated with each protein were inspected to verify accuracy of sequence and site assignments.

Note Added in Proof. The authors were made aware of an upcoming paper with the related finding of O-mannose residues identified on rabbit cadherins (44).

ACKNOWLEDGMENTS. We thank Lawrence Shapiro and members of his laboratory for discussions and critical review of the manuscript. We also thank Frederic Bard for sharing cell lines. This work was supported by A. P. Møller og Hustru Chastine Mc-Kinney Møllers Fond til Almene Formaal, Kirsten og Freddy Johansen Fonden, The Carlsberg Foundation, The Novo Nordisk Foundation, The Danish Research Councils, the University of Copenhagen Program of Excellence, and the Danish National Research Foundation (DNRF107).
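The monosaccharide subtraction step described under Data Analysis amounts to simple arithmetic on the precursor mass, and a minimal sketch may help make it concrete. The code below is not the authors' implementation (that is described in ref. 27 and not reproduced here). The function name `subtract_hexoses`, its parameters, and the choice to work on precursor m/z plus charge (as stored in an .mgf spectrum header) are assumptions made for illustration only.

```python
# Illustrative sketch only: names and interface are hypothetical, not from the paper.

# Monoisotopic mass of one hexose residue (C6H10O5) in Da, from elemental masses.
HEX_RESIDUE_MASS = 6 * 12.0 + 10 * 1.00782503207 + 5 * 15.99491461956


def subtract_hexoses(precursor_mz, charge, max_hex=4):
    """Return {n: adjusted m/z} for the precursor with n = 1..max_hex Hex units removed.

    Removing a neutral mass M from an ion of charge z lowers its m/z by M / z.
    """
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return {
        n: precursor_mz - n * HEX_RESIDUE_MASS / charge
        for n in range(1, max_hex + 1)
    }


if __name__ == "__main__":
    # One adjusted precursor per subtracted Hex count, mirroring the
    # "four distinct files" (1, 2, 3, and 4 Hex) described in the text.
    for n_hex, mz in subtract_hexoses(1000.0, charge=2).items():
        print(f"{n_hex} Hex removed: precursor m/z {mz:.4f}")
```

In such a scheme, each of the four adjusted precursor values would be written to its own copy of the spectrum file, so that a database search of HCD spectra can match the unmodified peptide mass regardless of how many hexoses the glycopeptide carried.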

1. Gentzsch M, Tanner W (1997) Protein-O-glycosylation in yeast: Protein-specific mannosyltransferases. Glycobiology 7(4):481–486.
2. Endo T (2004) Structure, function and pathology of O-mannosyl glycans. Glycoconj J 21(1-2):3–7.
3. Barresi R, Campbell KP (2006) Dystroglycan: From biosynthesis to pathogenesis of human disease. J Cell Sci 119(Pt 2):199–207.
4. Yoshida A, et al. (2001) Muscular dystrophy and neuronal migration disorder caused by mutations in a glycosyltransferase, POMGnT1. Dev Cell 1(5):717–724.
5. Jae LT, et al. (2013) Deciphering the glycosylome of dystroglycanopathies using haploid screens for lassa virus entry. Science 340(6131):479–483.
6. Krusius T, Finne J, Margolis RK, Margolis RU (1986) Identification of an O-glycosidic mannose-linked sialylated tetrasaccharide and keratan sulfate oligosaccharides in the chondroitin sulfate proteoglycan of brain. J Biol Chem 261(18):8237–8242.
7. Finne J, Krusius T, Margolis RK, Margolis RU (1979) Novel mannitol-containing oligosaccharides obtained by mild alkaline borohydride treatment of a chondroitin sulfate proteoglycan from brain. J Biol Chem 254(20):10295–10300.
8. Chai W, et al. (1999) High prevalence of 2-mono- and 2,6-di-substituted manol-terminating sequences among O-glycans released from brain glycopeptides by reductive alkaline hydrolysis. Eur J Biochem/FEBS 263(3):879–888.
9. Breloy I, Pacharra S, Aust C, Hanisch FG (2012) A sensitive gel-based global O-glycomics approach reveals high levels of mannosyl glycans in the high mass region of the mouse brain proteome. Biol Chem 393(8):709–717.
10. Martinez T, Pace D, Brady L, Gerhart M, Balland A (2007) Characterization of a novel modification on IgG2 light chain. Evidence for the presence of O-linked mannosylation. J Chromatogr A 1156(1-2):183–187.
11. Dwyer CA, Baker E, Hu H, Matthews RT (2012) RPTPζ/phosphacan is abnormally glycosylated in a model of muscle-eye-brain disease lacking functional POMGnT1. Neuroscience 220:47–61.
12. Bleckmann C, et al. (2009) O-glycosylation pattern of CD24 from mouse brain. Biol Chem 390(7):627–645.
13. Pacharra S, Hanisch FG, Breloy I (2012) Neurofascin 186 is O-mannosylated within and outside of the mucin domain. J Proteome Res 11(8):3955–3964.
14. Pacharra S, et al. (2013) The Lecticans of Mammalian Brain Perineural Net Are O-Mannosylated. J Proteome Res 12:1764–1771.
15. Stalnaker SH, et al. (2011) Glycomic analyses of mouse models of congenital muscular dystrophy. J Biol Chem 286(24):21180–21190.
16. Steentoft C, et al. (2011) Mining the O-glycoproteome using zinc-finger nuclease-glycoengineered SimpleCell lines. Nat Methods 8(11):977–982.
17. Lommel M, Strahl S (2009) Protein O-mannosylation: Conserved from bacteria to humans. Glycobiology 19(8):816–828.
18. Hutzler J, Schmid M, Bernard T, Henrissat B, Strahl S (2007) Membrane association is a determinant for substrate recognition by PMT4 protein O-mannosyltransferases. Proc Natl Acad Sci USA 104(19):7827–7832.
19. Yoshida-Moriguchi TWT, et al. (2013) SGK196 is a glycosylation-specific O-mannose kinase required for dystroglycan function. Science 341(6148):896–899.
20. Yoshida-Moriguchi T, et al. (2010) O-mannosyl phosphorylation of alpha-dystroglycan is required for laminin binding. Science 327(5961):88–92.
21. Steentoft C, et al. (2013) Precision mapping of the human O-GalNAc glycoproteome through SimpleCell technology. EMBO J 32(10):1478–1488.
22. Chiba A, et al. (1997) Structures of sialylated O-linked oligosaccharides of bovine peripheral nerve alpha-dystroglycan. The role of a novel O-mannosyl-type oligosaccharide in the binding of alpha-dystroglycan with laminin. J Biol Chem 272(4):2156–2162.
23. Brancaccio A, Schulthess T, Gesemann M, Engel J (1995) Electron microscopic evidence for a mucin-like region in chick muscle alpha-dystroglycan. FEBS Lett 368(1):139–142.
24. Endo T (1999) O-mannosyl glycans in mammals. Biochim Biophys Acta 1473(1):237–246.
25. Gomez Toledo A, et al. (2012) O-Mannose and O-N-acetyl galactosamine glycosylation of mammalian α-dystroglycan is conserved in a region-specific manner. Glycobiology 22(11):1413–1423.
26. Hara Y, et al. (2011) Like-acetylglucosaminyltransferase (LARGE)-dependent modification of dystroglycan at Thr-317/319 is required for laminin binding and arenavirus infection. Proc Natl Acad Sci USA 108(42):17426–17431.
27. Vakhrushev SY, et al. (2013) Enhanced mass spectrometric mapping of the human GalNAc-type O-glycoproteome with SimpleCells. Mol Cell Proteomics 12(4):932–944.
28. Nilsson J, Nilsson J, Larson G, Grahn A (2010) Characterization of site-specific O-glycan structures within the mucin-like domain of alpha-dystroglycan from human skeletal muscle. Glycobiology 20(9):1160–1169.
29. Harrison R, et al. (2012) Glycoproteomic characterization of recombinant mouse α-dystroglycan. Glycobiology 22(5):662–675.
30. Stalnaker SH, et al. (2010) Site mapping and characterization of O-glycan structures on alpha-dystroglycan isolated from rabbit skeletal muscle. J Biol Chem 285(32):24882–24891.
31. Harrison OJ, et al. (2011) The extracellular architecture of adherens junctions revealed by crystal structures of type I cadherins. Structure 19(2):244–256.
32. Yagi T (2008) Clustered protocadherin family. Dev Growth Differ 50(Suppl 1):S131–S140.
33. Brasch J, Harrison OJ, Honig B, Shapiro L (2012) Thinking outside the cell: How cadherins drive adhesion. Trends Cell Biol 22(6):299–310.
34. Perälä N, Sariola H, Immonen T (2012) More than nervous: The emerging roles of plexins. Differentiation 83(1):77–91.
35. Jones DT, et al. (2008) Tandem duplication producing a novel oncogenic BRAF fusion gene defines the majority of pilocytic astrocytomas. Cancer Res 68(21):8673–8677.
36. Ecker M, et al. (2003) O-mannosylation precedes and potentially controls the N-glycosylation of a yeast cell wall glycoprotein. EMBO Rep 4(6):628–632.
37. Xu C, Wang S, Thibault G, Ng DT (2013) Futile protein folding cycles in the ER are terminated by the unfolded protein O-mannosylation pathway. Science 340(6135):978–981.
38. Röttger S, et al. (1998) Localization of three human polypeptide GalNAc-transferases in HeLa cells suggests initiation of O-linked glycosylation throughout the Golgi apparatus. J Cell Sci 111(Pt 1):45–60.
39. Gill DJ, Chia J, Senewiratne J, Bard F (2010) Regulation of O-glycosylation through Golgi-to-ER relocation of initiation enzymes. J Cell Biol 189(5):843–858.
40. Gill DJ, Clausen H, Bard F (2011) Location, location, location: New insights into O-GalNAc protein glycosylation. Trends Cell Biol 21(3):149–158.
41. Lommel M, Willer T, Cruces J, Strahl S (2010) POMT1 is essential for protein O-mannosylation in mammals. Methods Enzymol 479:323–342.
42. Godfrey C, Foley AR, Clement E, Muntoni F (2011) Dystroglycanopathies: Coming into focus. Curr Opin Genet Dev 21(3):278–285.
43. Goder V, Melero A (2011) Protein O-mannosyltransferases participate in ER protein quality control. J Cell Sci 124(Pt 1):144–153.
44. Winterhalter PR, Lommel M, Ruppert T, Strahl S (2013) O-glycosylation of the non-canonical T-cadherin from rabbit skeletal muscle by single mannose residues. FEBS Letters 587(22):3715–3721.

Vester-Christensen et al. PNAS | December 24, 2013 | vol. 110 | no. 52 | 21023