UDP-glucose: glycoprotein glucosyltransferase 1 (UGGT1) substrate characterization

by

Nathan Masashi Doner

A thesis submitted in conformity with the requirements for the degree of Master of Science Department of Molecular Genetics University of Toronto

© Copyright by Nathan Masashi Doner 2018

UDP-glucose: glycoprotein glucosyltransferase 1 (UGGT1) substrate characterization

Nathan Doner

Master of Science

Department of Molecular Genetics University of Toronto

2018

Abstract

UDP-glucose: glycoprotein glucosyltransferase 1 (UGGT1) plays a quality control role in the ER by recognizing misfolded glycoproteins and glucosylating their Man9 N-glycans. To study the substrate specificity of UGGT1 and to produce an acceptor substrate for co-crystallization with UGGT1, I have purified Man9-glycosylated forms of bovine RiboB, RiboS, and S-protein. I have also produced variants of these with the single N-glycan at Asn34 relocated to other sites. Using circular dichroism, I have established that RiboB is more thermostable than S-protein and RiboS. Accordingly, S-protein and RiboS are readily glucosylated and bound by human UGGT1. The Man9 form of the S-protein N34Q/Y76N variant bound to UGGT1 with a KD of 76 μM. This substrate is a candidate for co-crystallization with UGGT1. The methods described here can also be used to study other UGGT1 substrates and will facilitate work aimed at understanding how UGGT1 recognizes its misfolded glycoprotein substrates.


Acknowledgements

I would like to thank my supervisor, Dr. James Rini, for being an integral part of my learning process throughout my research. The countless discussions that we had about science, both pertaining to my research, and about other unanswered biological questions, were always stimulating and have contributed to my growth as a scientist.

I would also like to thank Dr. Zhijie Li, whom I view as an important technical mentor. He has helped me learn to be critical of methodology and to validate and be confident in the results of my own assays. Discussions with him have also stimulated my interest in many fields, both biological and non-biological.

Additionally, the members of the Rini lab (especially Dr. Alan Wong, Aidan Tomlinson, Kristina Han, Dongxia Zhou, and Malathy Satkunarajah) and the members of the MaRS 16th floor have been very helpful and fun along the way and contributed positively to my experiences at the University of Toronto.

Of course, I would like to thank my family and friends for being there for me outside of the lab, particularly my mother and Aislinn Sandre.


Table of Contents

Acknowledgements ...... iii

Table of Contents ...... iv

List of Tables ...... viii

List of Figures ...... ix

List of Abbreviations ...... xi

Introduction ...... 1

1.1 The secretory pathway ...... 1

1.1.1 Signal peptide...... 1

1.1.2 N-linked glycan biosynthesis and transfer ...... 1

1.1.3 Calnexin cycle ...... 4

1.1.4 Golgi apparatus ...... 5

1.1.5 ER-associated degradation (ERAD) ...... 6

1.2 UDP-glucose: glycoprotein glucosyltransferase (UGGT) ...... 7

1.2.1 UGGT discovery ...... 7

1.2.2 UGGT structure ...... 8

1.2.3 UGGT in vitro glucosylation ...... 11

1.2.4 UGGT2 ...... 14

1.2.5 UGGT in vivo glucosylation ...... 14

1.2.6 Binding partner Sep15 (SELENOF) ...... 16

1.3 Rationale of the thesis ...... 16

Materials and Methods ...... 17

2.1 Vector construction ...... 17

2.2 Cell culture ...... 18


2.2.1 Transfection ...... 18

2.2.2 Protein production ...... 19

2.3 Protein purification and chromatography ...... 19

2.3.1 UGGT1 ...... 19

2.3.2 RiboB ...... 19

2.4 RiboS and S-protein preparation and chromatography ...... 20

2.5 Ribonuclease activity assay ...... 20

2.6 UGGT1 MALDI-TOF MS-based activity assay...... 21

2.7 Circular Dichroism...... 22

2.8 BIAcore surface plasmon resonance ...... 22

Results ...... 23

3.1 Protein Expression ...... 23

3.1.1 UGGT1 expression ...... 23

3.1.2 RiboB vectors and expression ...... 23

3.2 Protein Purification ...... 25

3.2.1 UGGT1 purification ...... 25

3.2.2 RiboB, RiboS and S-protein purification ...... 28

3.3 Substrate properties ...... 31

3.3.1 Circular Dichroism...... 31

3.3.2 Ribonuclease activity ...... 34

3.4 UGGT1 activity ...... 35

3.4.1 MALDI-TOF-based activity assay ...... 35

3.4.2 Alternative activity assays ...... 38

3.4.3 Effect of temperature ...... 40

3.4.4 S-protein glycosylation mutant forms ...... 40

3.4.5 Estimation of KM and kcat ...... 41

3.4.6 Effect of metal ions ...... 45

3.5 UGGT1 substrate binding ...... 45

3.5.1 RiboB, RiboS, and S-protein ...... 45

3.5.2 KD estimation ...... 46

3.5.3 Glycan involvement in binding ...... 48

Discussion and Conclusions ...... 50

4.1 Protein expression and purification ...... 50

4.1.1 Protein expression ...... 50

4.1.2 RiboB, RiboS, and S-protein ...... 51

4.1.3 UGGT1 purification ...... 52

4.2 RiboB, RiboS and S-protein properties ...... 52

4.2.1 Circular Dichroism...... 52

4.2.2 Ribonuclease activity ...... 53

4.3 UGGT1 activity ...... 54

4.3.1 MALDI-TOF-based activity assay ...... 54

4.3.2 RiboB, RiboS, and S-protein ...... 55

4.3.3 Estimation of KM and kcat ...... 55

4.3.4 Metal ions...... 57

4.4 UGGT1 substrate binding ...... 57

4.4.1 RiboB, RiboS, and S-protein ...... 57

4.4.2 KD Estimation ...... 58

4.4.3 S-protein variants ...... 59

4.5 UGGT substrate recognition ...... 59

4.5.1 UGGT substrate recognition in the context of S-protein variants ...... 61

Future Directions ...... 62

5.1 X-ray crystal structure...... 62

5.2 Co-crystallization with different protein substrates ...... 63

5.3 UGGT binding partners ...... 64

5.4 Electron microscopy ...... 65

References ...... 66


List of Tables

Table 1. PCR primers used for cloning and site-directed mutagenesis ...... 17

Table 2. Summary of S-protein variants in activity and binding experiments ...... 49

Table 3. Purified UGGT substrates in vitro ...... 63


List of Figures

Figure 1. The mammalian secretory pathway...... 3

Figure 2. Oligosaccharyltransferase donor substrate ...... 4

Figure 3. UGGT domain structure and crystal structure...... 9

Figure 4. Ribonuclease B and Ribonuclease S ...... 13

Figure 5. Expression of UGGT1 and the Protein A-RiboB glycosylation variants ...... 24

Figure 6. The RiboB glycosylation variants ...... 25

Figure 7. UGGT1 purification ...... 27

Figure 8. UGGT1 proteolysis ...... 28

Figure 9. RiboB, RiboS, and S-protein purification ...... 30

Figure 10. SDS-PAGE gels of RiboB and S-protein glycosylation mutants ...... 31

Figure 11. Far- and Near-UV CD spectra of RiboB, RiboS and S-protein ...... 32

Figure 12. CD melts of RiboB, RiboS and S-protein ...... 33

Figure 13. Reversible CD melt of RiboB, RiboS and S-protein ...... 33

Figure 14. RiboB, RiboS and S-protein activity assay ...... 34

Figure 15. RiboB and RiboS activity as a function of temperature ...... 35

Figure 16. MALDI-TOF-based UGGT1 activity assay ...... 37

Figure 17. Alternative activity assays ...... 39

Figure 18. UGGT1 activity across temperature ...... 40

Figure 19. Progress curves of UGGT1 activity towards S-protein glycosylation variants ...... 41

Figure 20. UDP-glucose Michaelis-Menten plot...... 42

Figure 21. RiboS Michaelis-Menten plot...... 43

Figure 22. S-protein Michaelis-Menten plot...... 43

Figure 23. S-proteinN34Q/Y76N Michaelis-Menten plot...... 44

Figure 24. Effect of divalent metal ions on UGGT1 ...... 45

Figure 25. SPR sensorgram showing RiboB, RiboS, and S-protein binding to a UGGT-coupled chip ...... 46

Figure 26. Sensorgrams from S-proteinN34Q/Y76N titration ...... 47

Figure 27. KD binding curves for S-protein forms ...... 48

Figure 28. UGGT mechanisms of glucosylation ...... 60


List of Abbreviations

α1-AT alpha-1-antitrypsin

ATP adenosine triphosphate

CD circular dichroism

CNX calnexin

CRT calreticulin

DMEM Dulbecco’s Modified Eagle Medium

DMSO dimethyl sulfoxide

DTT dithiothreitol

EDTA ethylenediaminetetraacetic acid

EK enterokinase

ER endoplasmic reticulum

ERAD ER-associated protein degradation

FPLC fast protein liquid chromatography

GI glucosidase I

GII glucosidase II

Glc glucose

GlcNAc N-acetylglucosamine

GT glycosyltransferase

HBS HEPES-buffered saline

HEPES 4-(2-hydroxyethyl)-1-piperazineethanesulfonic acid

HUGO Human Genome Organisation

IEX ion exchange


KD dissociation constant

KM Michaelis constant

LC-ESI-MS liquid chromatography-electrospray ionization-mass spectrometry

LLO lipid-linked oligosaccharide

MALDI-TOF matrix-assisted laser desorption/ionization-time of flight

Man mannose

MHC major histocompatibility complex

MS mass spectrometry

MTX methotrexate

MW molecular weight

OST oligosaccharyltransferase

PCR polymerase chain reaction

PDB Protein Data Bank

PLC peptide-loading complex

RU response unit

SDS-PAGE sodium dodecyl sulfate-polyacrylamide gel electrophoresis

SEC size-exclusion chromatography

SPR surface plasmon resonance

TEV tobacco etch virus

UDP uridine diphosphate

UV ultraviolet


Introduction

1.1 The secretory pathway

The eukaryotic secretory pathway is responsible for the biosynthesis and targeting of membrane and secreted proteins. It consists of the endoplasmic reticulum (ER), the Golgi apparatus, and secretory vesicles. A highly simplified model is depicted in Figure 1.

Importantly, the canonical secretory pathway is the site of N-glycosylation, a post-translational modification that has various effects on a protein, namely on folding, signaling, and targeting. It has been estimated that ~50% of all human proteins are glycoproteins, and that most of these are N-glycosylated1.

1.1.1 Signal peptide

Secretory pathway proteins have a sequence at their N-terminus called the signal peptide (some have internal signal sequences). At the start of translation, the signal peptide recruits the signal recognition particle (SRP), a ribonucleoprotein complex that when bound to the nascent signal peptide halts translation. The resulting complex is called the ribosome-nascent chain complex (RNC)2. The SRP targets the RNC to a protein-conducting channel on the rough ER called the Sec61 translocon, a heterotrimer of the Sec61α, Sec61β, and Sec61γ subunits3. The ribosome docks to the translocon and remains outside of the ER. Translation resumes and the nascent protein is translated either fully (soluble proteins) or partially (membrane proteins with transmembrane segments) into the lumen of the ER. During this process, N-terminal signal sequences are cleaved off by signal peptidase.

1.1.2 N-linked glycan biosynthesis and transfer

Proteins translated into the ER that contain the amino acid sequon N-X-S/T (where X is any amino acid except proline) may be glycosylated in a process called N-glycosylation. Figure 2 shows the oligosaccharide before it is transferred onto N-glycosylated proteins. This branched structure is highly conserved across eukaryotes. As shown in Figure 2a, the three branches are termed the a-, b-, and c-arms. Prior to transfer to the protein, the sugar chain is linked to a dolichol pyrophosphate moiety. This membrane-bound molecule is termed the lipid-linked oligosaccharide (LLO)4. The two N-acetylglucosamine (GlcNAc) residues and five of the mannose (Man) residues are assembled in the cytosol by several asparagine-linked glycosylation (ALG) glycosyltransferases. A flippase enzyme then flips the LLO so that the sugar chain is positioned in the lumen of the ER, where ER-resident ALG glycosyltransferases add additional Man and glucose (Glc) residues to create the 14-residue Glc3Man9GlcNAc2 structure (Figure 1, Figure 2). This pathway is highly conserved across eukaryotes4.

The oligosaccharyltransferase (OST) complex is associated with the lumenal side of the ER membrane and it recognizes the N-X-S/T sequon after it has passed through the translocon. It uses its STT3 subunit to transfer the Glc3Man9GlcNAc2 oligosaccharide from the membrane-anchored dolichol group to the asparagine residue in the sequon5. OST associates with the translocon, and this proximity to the nascent protein and membrane-bound LLO facilitates oligosaccharide transfer. After transfer, the protein-linked oligosaccharide is termed an N-linked glycan, or N-glycan. It is estimated that only two-thirds of N-X-S/T sequons are glycosylated by OST, and it has been found that sequences proximal to the sequon can affect site occupancy1,6,7. Co-translational folding may also be a factor in determining whether a given sequon is glycosylated or not5. To facilitate transfer, some OST complex subunits bind to and keep nascent N-X-S/T sequons in a linear conformation5,8. Some N-X-S/T sequons are fully glycosylated, some are completely non-glycosylated, and some have partial glycan occupancy.
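The N-X-S/T rule described above can be illustrated with a short sequence scan. The following is a minimal sketch only: the function name and the toy sequence are hypothetical and do not correspond to any protein or tool used in this work, and a match marks a potential site only, since not every sequon is occupied.

```python
import re

# N-X-S/T sequon: Asn, then any residue except Pro, then Ser or Thr.
# The zero-width lookahead allows overlapping sequons to be reported.
SEQUON = re.compile(r"(?=N[^P][ST])")

def find_sequons(seq):
    """Return the 1-based positions of the Asn in each N-X-S/T sequon."""
    return [m.start() + 1 for m in SEQUON.finditer(seq.upper())]

# Hypothetical toy sequence (not a real protein):
# N3-L-T and N13-G-S are sequons; N8-P-S is excluded because X is Pro.
toy = "MKNLTAANPSGGNGSA"
print(find_sequons(toy))
```

A scan of this kind identifies candidate sites from the primary sequence alone; whether a given site is actually glycosylated must still be determined experimentally.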

Glycosylation alters the biophysical properties of a protein. Having a large polar oligosaccharide chain extending into the solvent provides a thermodynamic advantage by destabilizing folding intermediates while stabilizing the native state9,10. This can accelerate folding and increase the stability of a protein11. Additionally, glycosylated proteins tend to be more soluble, resistant to aggregation, and resistant to proteases than when non-glycosylated12,13. N-glycans frequently occur between segments of secondary structure, especially at turns and bends14.


Figure 1. The mammalian secretory pathway.

The lipid-linked oligosaccharide is synthesized by ALG glycosyltransferases (gray dashed oval) and transferred to a nascent polypeptide by oligosaccharyltransferase (OST). Properly folded glycoproteins exit the ER (green oval), misfolded glycoproteins are temporarily retained in the ER by UGGT glucosylation (yellow oval), and terminally misfolded glycoproteins are cleaved by mannosidases, rendering them substrates for receptors that facilitate export to the cytosol for degradation (red oval).


Figure 2. Oligosaccharyltransferase donor substrate

The fully assembled lipid-linked oligosaccharide (LLO) donor substrate used by OST. The dolichol moiety is shown embedded in the ER membrane; the oligosaccharide moiety is on the lumenal side of the ER membrane.

1.1.3 Calnexin cycle

Once a protein is N-glycosylated, the terminal Glc residue is cleaved off by glucosidase I (GI), leaving a Glc2Man9GlcNAc2 glycan4. The next two Glc residues are cleaved sequentially by glucosidase II (GII) (Figure 2b). After the first of these is removed, the resulting Glc1Man9GlcNAc2-containing glycoprotein can be bound by the membrane-bound calnexin (CNX) or its soluble homolog calreticulin (CRT). CNX/CRT mainly recognize the Glc1Man9GlcNAc2 glycan of their ligands; however, some evidence suggests they may also bind to the protein component4,15,16. Both CNX and CRT are composed of a lectin domain and a long (>100 Å) proline-rich P-domain arm, which extends away from the lectin domain to form a large cavity17. The lectin domain binds the Glc1Man9GlcNAc2 glycan and interactions with the P-domain are thought to shield the glycoprotein from aggregation during folding/biosynthesis.

In addition to the role that the CNX/CRT P-domain might play in the binding/shielding of nascent glycoproteins, it also facilitates their folding by recruiting enzymes that promote folding. Known CNX/CRT binding partners include protein disulfide isomerase ERp57, peptidyl prolyl cis-trans isomerase CypB, and chaperone ERp2915,18. ERp57 (57-kDa ER protein) binds to the tip of the CNX/CRT P-domain and has thiol oxidase, reductase, and isomerase activity19. ERp57 is composed of four tandem thioredoxin-like domains, but only two of them contain the catalytic C-X-X-C motif. It reduces disulfide bonds and allows new ones to form. This may allow incorrect disulfide bonds that might form during folding to be reduced, thereby providing another chance for the formation of the correct disulfide bond. The peptidyl prolyl cis-trans isomerase CypB assists in folding by isomerizing proline residues that are in the wrong conformation20.

Removal of the terminal glucose moiety from the Glc1Man9GlcNAc2 glycan by GII leaves the glycoprotein unable to associate with CNX/CRT. Natively folded glycoproteins continue along the secretory pathway into the Golgi apparatus (Figure 1, green oval). However, if the glycoprotein has not yet reached its final folded form, it can be re-glucosylated by UDP-glucose: glycoprotein glucosyltransferase (UGGT) and again bound by CNX/CRT (Figure 1, yellow oval). UGGT only recognizes partially misfolded glycoproteins, not native or completely unfolded proteins21,22. It recognizes both the misfolded component of the protein as well as the Man9GlcNAc2 N-glycan, and it transfers a glucose moiety to regenerate the Glc1Man9GlcNAc2 glycan. UGGT and GII continuously glucosylate and deglucosylate misfolded substrates, allowing for multiple cycles of CNX/CRT association. The continuous cycling of an immature glycoprotein between CNX/CRT and UGGT until the native fold is reached is referred to as the calnexin/calreticulin cycle.
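The sequence of trimming and re-glucosylation steps in the calnexin/calreticulin cycle can be summarized as a small state model. This is an illustrative sketch under simplifying assumptions: the glycan is represented only by its glucose count, folding status is reduced to a yes/no flag, mannose trimming and ERAD are ignored, and all function names and the `cycles_to_fold` parameter are hypothetical.

```python
# Toy model of the glucose-trimming steps described above.
# A glycan is represented only by its glucose count (3, 2, 1 or 0);
# mannose trimming and ERAD are deliberately left out.

def name(glc):
    """Glycan name for a given number of glucose residues."""
    return ("Glc%d" % glc if glc else "") + "Man9GlcNAc2"

def glucosidase_i(glc):
    """GI removes the terminal glucose: Glc3 -> Glc2."""
    return 2 if glc == 3 else glc

def glucosidase_ii(glc):
    """GII removes one glucose from a Glc2 or Glc1 glycan."""
    return glc - 1 if glc in (1, 2) else glc

def uggt(glc, misfolded):
    """UGGT adds one glucose back to Man9GlcNAc2, only if misfolded."""
    return 1 if glc == 0 and misfolded else glc

def binds_cnx_crt(glc):
    """CNX/CRT bind the monoglucosylated glycan."""
    return glc == 1

def trace(cycles_to_fold):
    """List the glycan states of one glycoprotein.

    cycles_to_fold (hypothetical parameter): number of UGGT
    re-glucosylation rounds needed before the native fold is reached.
    """
    glc = 3
    states = [name(glc)]
    glc = glucosidase_i(glc)
    states.append(name(glc))
    glc = glucosidase_ii(glc)          # Glc1: CNX/CRT can bind
    states.append(name(glc))
    rounds = 0
    while True:
        glc = glucosidase_ii(glc)      # Glc0: released from CNX/CRT
        states.append(name(glc))
        if rounds >= cycles_to_fold:   # native fold: UGGT ignores it
            break
        glc = uggt(glc, misfolded=True)
        states.append(name(glc))
        rounds += 1
    return states

print(" -> ".join(trace(1)))
```

In this sketch a glycoprotein that folds without help from UGGT passes through four glycan states, and each additional round of re-glucosylation adds one Glc1Man9GlcNAc2 state (CNX/CRT-bound) and one Man9GlcNAc2 state (released).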

1.1.4 Golgi apparatus

Once properly folded, glycoproteins are transported to the Golgi apparatus with the aid of the membrane-bound lectins ERGIC-53, VIPL, or VIP3623. These lectins bind to the N-glycans and help to ensure that they are loaded into vesicles for transport to the Golgi. ERGIC-53, for example, has cytoplasmic tails that associate with coat protein (COP) I and COP II, two proteins involved in vesicle formation23. In this way, ERGIC-53 continually shuttles in COP-coated vesicles between the ER and the cis-Golgi, picking up cargo in the ER and releasing it in the Golgi.

After exiting the ER, proteins transit sequentially through the cis-, medial-, and trans-compartments of the Golgi. Each of these compartments possesses enzymes (glycosyltransferases and glycosidases) that further modify the N-glycan24. This spatial separation of the modifying enzymes is one means by which defined N-glycan structures are generated as glycoproteins transit through the secretory pathway. After exiting the Golgi, proteins are targeted to the correct subcellular location (soluble and membrane proteins) or secreted (soluble secreted proteins)25.

1.1.5 ER-associated degradation (ERAD)

The calnexin cycle ensures that misfolded glycoproteins do not leave the ER. However, some glycoproteins never achieve their native fold and these “terminally misfolded proteins” must be removed from the ER. The ER-associated degradation (ERAD) pathway translocates terminally misfolded glycoproteins from the ER to the cytosol where they are polyubiquitinated and degraded by the 26S proteasome (Figure 1, red oval)23,26.

Humans have four ER-resident mannosidases: ER mannosidase I (ERManI) and ER degradation-enhancing mannosidase-like proteins 1-3 (EDEM1-3), all of which are α1,2-mannosidases27,28. ERManI and EDEM2 are thought to be primarily responsible for the removal of the terminal b-arm mannose (Figure 2a), while EDEM1 and EDEM3 are thought to be responsible for removal of the other mannose residues27–29. Glycoproteins that continually cycle between CNX/CRT and UGGT are more likely to be cleaved by these mannosidases, leading to shortened N-glycan structures. In humans, Man5-8GlcNAc2 N-glycan structures are recognized by ER-resident lectins OS-9 and XTP3-B30. The OS-9 crystal structure shows that it binds to the mannoses on the N-glycan31. In vivo studies have also suggested that OS-9 and XTP3-B do not bind natively folded proteins30,32. After binding the misfolded glycoprotein, OS-9 and XTP3-B participate in translocating it out of the ER to the cytosol by binding to the Hrd1 complex30. Hrd1 is a RING-finger ubiquitin ligase that is the central component of the Hrd1 complex, and it forms a protein-conducting channel to move misfolded proteins out of the ER33,34. On the cytosolic side of the ER membrane, the misfolded glycoprotein is polyubiquitinated, a modification that targets it to the 26S proteasome for degradation. The degradation process further involves deglycosylation, deubiquitination, and unfolding of the substrate35–37.

Mannosidase activity is required to produce the shortened N-glycans that result in ERAD-mediated removal. Yet, degraded glycans are still able to participate in the calnexin cycle provided that they still have a full a-arm (the arm that is glucosylated by UGGT). Once the glycan is trimmed down to Man7GlcNAc2, it can be removed by ERAD lectins, although it may continue to associate with CNX/CRT/UGGT. However, the affinities of UGGT and GII for degraded glycans are lower than those for full Man9GlcNAc2 glycans21,38. This means that glycoproteins with shortened N-glycans are less likely to participate in the CNX/CRT cycle, and instead are more likely to be removed by ERAD. If the a-arm is demannosylated, the glycoprotein cannot be re-glucosylated by UGGT and therefore it can only be bound by the lectins leading to ERAD. The idea that increased mannosidase trimming leads to ERAD is called the “timer hypothesis”. Mannosidase-mediated glycan degradation limits the amount of time that any glycoprotein can persist in the ER while undergoing cycles of CNX/CRT/UGGT-mediated folding.

1.2 UDP-glucose: glycoprotein glucosyltransferase (UGGT)

1.2.1 UGGT discovery

UGGT was first discovered when it was observed that high-mannose N-glycans were being re-glucosylated. In 1982, Parodi and Cazzulo reported that incubating the protozoan Trypanosoma cruzi with radiolabeled [14C]glucose for 20 minutes produced glycoproteins containing [14C]glucose39. T. cruzi makes Man9GlcNAc2 dolichol-P-P-linked oligosaccharides (instead of Glc3Man9GlcNAc2 glycans), and these Man9GlcNAc2 N-glycans are transferred onto nascent glycoproteins. When Glc1Man9GlcNAc2, Glc1Man8GlcNAc2, and Glc1Man7GlcNAc2 glycans were found after a short pulse-chase, it was concluded that the glucose must have been transferred by an enzyme that glucosylates the glycoprotein39. The following year, the same group showed that calf thyroid slices incubated in [14C]glucose could incorporate glucose into N-glycan structures40. They also incubated calf thyroid microsomes with UDP-[14C]glucose and found N-glycans with incorporated radiolabeled glucose40. It was soon confirmed that this same process happens in rat liver, and that it is localized only to the rough ER, not the Golgi41. These pulse-chase mammalian studies used glucosidase inhibitors to ensure that they were observing re-glucosylation of Man9GlcNAc2 structures and not Glc3Man9GlcNAc2 structures being trimmed. These studies, along with a few others from the same group, all presented strong evidence that there was a re-glucosylating enzyme in the ER across multiple kingdoms of life.

A breakthrough occurred when it was discovered that rat microsomes incubated with UDP-[14C]glucose and purified thyroglobulin only glucosylated thyroglobulin when it was first denatured with urea42. This was the first indication that UGGT is a misfolded protein sensor. Since then, denatured thyroglobulin has often been used as a model substrate for UGGT activity assays.


UGGT was first purified from rat liver, and different substrates were used to show that it only glucosylates misfolded glycoproteins43,44. UGGT purified from different species showed the same effect45–47, and different substrates that mimic the unfolding states of proteins were used to confirm its specificity for misfolded proteins45,48. This helped build a picture of the specificity that UGGT has and led to the current model for how the calnexin cycle works.

1.2.2 UGGT structure

Human UGGT1 is a 1555 amino acid protein with a 7-domain architecture49. For the purposes of crystallography and structural biology, UGGT from at least three different fungal strains has been expressed and purified: Penicillium chrysogenum50, Chaetomium thermophilum49,51, and Thermomyces dupontii52. These fungal UGGTs share low (~35%) sequence identity with the human enzyme at the amino acid level. The year 2017 has thus far heralded the largest advances in our understanding of the structure of UGGT, with two independent groups publishing papers describing the x-ray crystallography and cryo-electron microscopy (cryo-EM) models of fungal UGGTs49,52. Seven new UGGT crystal structures were deposited in the Protein Data Bank (PDB) in 2017.

Prior to 2017, it was thought that UGGT contained 3 thioredoxin-like (trx) domains and a β-strand-rich domain in the N-terminal region, with a C-terminal catalytic glycosyltransferase domain. This model was suggested by Zhu et al.51 in 2014 based on protein threading and was in line with what our group had predicted using the Phyre253 protein threading server (Figure 3a). Based on this evidence, it was thought that UGGT had a beads-on-a-string structure. When the first UGGT crystal structure was published49, it was discovered that there are in fact 4 thioredoxin-like domains and 2 β-domains (Figure 3b). Moreover, the structure did not assume a beads-on-a-string arrangement; the trx4 and β1 domains contain segments that are not contiguous in the linear sequence. The catalytic domain was indeed found to be a glycosyltransferase GT-A domain belonging to CAZy (Carbohydrate-active enzymes) family GT24, as predicted by the sequence.

1.2.2.1 X-ray crystal structure

The first paper using a structural biology approach to describe UGGT was published in 201451. Two crystal structures of the 160-amino acid trx3 domain alone were determined, with one structure containing a specifically-bound detergent. These crystal structures showed that there is a C-terminal α-helix on the trx3 domain that sits in a pocket enriched in hydrophobic amino acids. In the detergent-bound form, the helix was disordered, allowing the detergent to bind to the hydrophobic pocket. This finding suggests that UGGT recognizes exposed regions of hydrophobicity on its misfolded substrates using a hydrophobic pocket on the UGGT trx3 domain.

Figure 3. UGGT domain structure and crystal structure

(a) The previously predicted UGGT domain structure. Trx, thioredoxin-like domain; β, β-strand rich domains; GT, glycosyltransferase domain. (b) UGGT domain structure adapted from Roversi et al. (2017)49. (c) UGGT "intermediate conformation" crystal structure49, PDB ID: 5MU1.

In August 2017, Roversi et al. published four X-ray crystal structures of full length UGGT from C. thermophilum (Figure 3c)49. They determined the structure of the native protein in an open, intermediate, and closed conformation at 3.5-4.3 Å resolution. In addition, they engineered in a disulfide bond (D611C/G1050C) which locked the structure in the closed conformation, yielding a structure at 2.8 Å resolution. The open, intermediate, and closed forms differ with regard to the position/orientation of the trx2 and trx3 domains, relative to the remainder of the molecule. The trx2 domain is especially flexible and swings in and out to create the open and closed conformation. It was proposed that UGGT uses the flexibility of the trx2 domain to allow it to accommodate substrates of different sizes49.

1.2.2.2 Glycosyltransferase domain

Glycosyltransferases (GTs) are a large family of enzymes responsible for the biosynthesis of glycans54,55. GTs transfer the sugar moiety from a nucleotide sugar or dolichol-linked sugar, called the donor substrate, to another molecule, termed the acceptor substrate. UGGT has a GT-A fold, one of the two main types of GT folds (GT-B being the other), which is composed of two closely associated Rossmann-like folds. UGGT is a member of the GT24 family of glycosyltransferases52. It has a retaining mechanism, meaning that the stereochemical configuration of the donor substrate’s C1 carbon (an α-glycosidic bond) is unchanged after transfer to the glycoprotein (in contrast to GT inverting mechanisms which invert the C1 carbon configuration).

One motif that is commonly found in the GT-A catalytic site is the D-X-D motif. Crystal structures have shown that the aspartic (or glutamic) acids in D-X-D motifs coordinate divalent metal ions (commonly Mn2+ or Mg2+) which, in turn, promote donor substrate binding by coordinating phosphate oxygen atoms of the nucleotide sugar. Additionally, the metal ion serves to counter the charge that develops on the β-phosphate in the transition state, thereby promoting catalysis. In UGGT, a Ca2+ ion is coordinated by the D-X-D motif49 and Ca2+ or Mn2+ is required for activity43. Mutating either aspartic acid of the D-X-D motif abolishes activity in human47 and rat56 UGGT.

To visualize the flexibility of UGGT, Satoh et al. took real-time images of T. dupontii UGGT using high-speed atomic force microscopy, and showed that the GT domain was highly mobile with respect to the rest of the protein (the N-terminal region)52. There appeared to be a hinge between the GT domain and the N-terminal region. However, evidence for this hinge was not found in the preceding crystal structures, as the GT domain was similarly oriented relative to the N-terminal region across all 4 crystal forms49.


1.2.2.3 Primary structure elements

Human UGGT1 has 3 N-X-S/T sequons (potential N-glycosylation sites) and it has been shown by LC-MS/MS that only Asn269 is glycosylated57,58. UGGT1 expressed in human 293T cells contained mainly Hex6HexNAc2 glycans. These are presumably Man6GlcNAc2 glycans, as UGGT1, residing in the ER lumen, is expected to have high-mannose glycans. Mutation of the Asn269 site does not affect activity or expression levels57. Glycosylation of this site is observed in humans, rats, pigs, and cows57. C. thermophilum UGGT contains five N-X-S/T sites and all 5 are glycosylated as evidenced by the presence of Asn-linked GlcNAc residues in the EndoH-treated UGGT crystal structures49.

At the C-terminus of human UGGT1, there is an ER retention signal. This is a four-amino acid peptide, typically K-D-E-L, that ensures ER resident proteins remain in the ER. If a protein with a retention signal is found in the Golgi, it will be transported back to the ER lumen by retrograde transport. Human UGGT1 has the C-terminal sequence R-E-E-L.

1.2.3 UGGT in vitro glucosylation

Using purified UGGT for in vitro activity assays, it has been found that UGGT only efficiently glucosylates proteins that are misfolded. Native proteins do not get efficiently glucosylated by UGGT44,59, nor do completely unfolded proteins22,60. Glycopeptide substrates are poorly glucosylated unless they contain several hydrophobic amino acids48,61,62. It has been speculated that UGGT interacts with its misfolded protein substrates through the recognition of regions of exposed hydrophobic amino acids.

A series of glycosylated chymotrypsin inhibitor 2 (GCI2) fragments (25-64 amino acids in length) were shown to mimic GCI2 folding intermediates60,63. The GCI2 fragments that were properly folded were not recognized or glucosylated by rat UGGT, but GCI2 fragments exposing hydrophobic residues or perturbed with point mutations were. GCI2 fragments that had major folding defects were not recognized by UGGT, an observation consistent with the idea that UGGT only recognizes partially misfolded proteins60,63. Other studies have developed UGGT substrates by chemically modifying cysteines in such a way that the structure is slightly disrupted45,64. In these assays, UGGT recognized the modified substrates but not the native ones.


β-glycanase was used as a glycoprotein substrate to show that rat UGGT recognizes point mutation-induced structural disruptions whether they are near to or far from the glycan. Structural perturbations up to 40 Å away were enough to cause UGGT to glucosylate the glycan65. However, in cell culture, using the highly glycosylated influenza HA protein, UGGT was shown to preferentially glucosylate the N-glycans nearest to the region of mutation-induced misfolding66.

It has been shown that the asparagine-linked GlcNAc residue on urea-denatured, EndoH-deglycosylated RiboB is involved in UGGT binding48. Binding to this GlcNAc has been suggested to be another mechanism through which UGGT recognizes misfolded substrates, as the asparagine-linked GlcNAc is likely to be more exposed in a protein that is misfolded than one that is folded48.

UGGT recognized and glucosylated urea-dissociated soybean agglutinin but not soybean agglutinin in its tetrameric state67. This was shown to be due to the oligomeric state rather than an unfolding of the monomers themselves. This is interesting as it shows that UGGT may not only be involved with folding single polypeptides, but also with the assembly of oligomers or complexes.

In 2005, Totani et al. reported that they had created a molecule with Man9GlcNAc2 conjugated to methotrexate (MTX) and that this turned out to be a substrate for UGGT38. A later study showed that Man9-conjugated hydrophobic dyes such as BODIPY and TAMRA were even better substrates68. BODIPY-Man9 was shown to be the best substrate, and its Michaelis constant (KM) was found to be 69 μM69. They used this substrate to show that UGGT was more active towards 9-mannose glycans than those containing 8, 7, or 6 mannoses38,70. This is in line with previous findings using denatured thyroglobulin44.

1.2.3.1 UGGT in vitro glucosylation of modified RiboB

Modified bovine ribonuclease B (RiboB) has been shown by the Helenius lab to be a UGGT substrate22,71,72. RiboB is N-glycosylated at Asn34, which distinguishes it from the non-glycosylated isoform RiboA; the polypeptide is otherwise identical. The glycoprotein has a high-mannose N-glycan, with Man6GlcNAc2-containing glycan structures being the most abundant73. The structures of RiboA74 and RiboB75 have both been determined by X-ray crystallography. However, the glycan chain in the RiboB structure had poor electron density and was not modeled (Figure 4a). The structure did show that RiboA and RiboB are essentially identical, implying that the glycan has little effect on its final structure75. RiboA and RiboB can both form N- and C-terminally swapped dimers under certain conditions and both can form higher-order oligomers76,77.

RiboA or RiboB can be proteolytically digested with subtilisin to produce a clipped form of the molecule, Ribonuclease S (RiboS)78,79. RiboS is proteolytically cleaved between Ala20 and Ser21 and this generates two fragments termed the S-peptide (amino acids 1-20) and the S-protein (amino acids 21-124) (Figure 4b). The S-peptide sits in a hydrophobic cavity in the S-protein and is bound tightly, although they can be separated under acidic conditions or with denaturants80,81. The S-protein alone is thought to resemble a misfolded protein. RiboB and RiboS have been used in numerous studies examining the kinetics and energetics of protein folding82–86. They are ideal proteins for study as they are small (~15 kDa), soluble, and easily purified in large quantities.

RiboB denatured in urea has been shown to be a substrate for UGGT43,44,48. Later, the S-protein was also shown to be a UGGT substrate22,71,72. Modifying RiboB in different ways and assaying it with UGGT has provided important insights into the specificity that UGGT has for its substrates. Only the “misfolded” S-protein is glucosylated efficiently, while “folded” RiboB is not. Reduced and alkylated (or “unfolded”) S-protein is not glucosylated22. UGGT is also able to glucosylate each of the following: disulphide misoxidized RiboB, RiboB disulphide mutants, and RiboB with loop insertions72. It was also found that UGGT could glucosylate S-protein with the N-glycan in different locations on the glycoprotein72.

Figure 4. Ribonuclease B and Ribonuclease S

(a) Crystal structure of Ribonuclease B (PDB ID: 1RBB)75. The S-peptide is shown in gray, and the glycosylated Asn34 side chain is shown with stick representations. (b) Subtilisin cleavage of RiboB generates RiboS. The S-peptide is removed under denaturing conditions to give the S-protein.


An interesting experiment was performed where partial N-terminal domain-swapping and dimerization was induced by mixing RiboB with S-protein. In the resulting dimer, the N-terminal helix (S-peptide) of RiboB swaps into the S-peptide cavity of the S-protein. However, since the S-protein does not possess the N-terminal helix, a reciprocal swapping cannot occur, as it typically would in a domain-swapped dimer. This results in dimers that have one misfolded domain (i.e. S-protein) and one properly folded domain in each molecule71. It was shown that rat UGGT only glucosylated the domain that was misfolded; the folded domains did not have their N-glycan glucosylated. This suggests that UGGT may only be able to glucosylate glycans that are located on a misfolded domain, and not glycans on an adjacent domain.

1.2.4 UGGT2

In humans, there are two forms of UGGT, UGGT1 and UGGT2. Human UGGT1 has been shown to glucosylate several misfolded model glycoproteins, but the closely related homolog UGGT2 has no identified glycoprotein substrates87. Interestingly, recombinant proteins containing the UGGT2 GT domain with the UGGT1 N-terminal region were found to glucosylate thyroglobulin87. This implies that the UGGT2 GT domain is indeed active, but that the UGGT2 N-terminal domain is either nonfunctional or its role and/or specificity differs from that of the N-terminal domain of UGGT1. C. elegans likewise has 2 UGGT homologs, with no activity detected for uggt-288.

Interestingly, when the non-proteinaceous BODIPY-Man9 substrate was assayed, human UGGT2 did glucosylate it, although at a slower rate than with UGGT170,89. When the human UGGT1 and UGGT2 GT domains were expressed alone, they were both found to be inactive towards denatured thyroglobulin or RiboB87, but active towards BODIPY-Man989. Taken together, these studies suggest that UGGT2 is probably an active protein but with a yet-to-be-determined acceptor substrate specificity.

1.2.5 UGGT in vivo glucosylation

Studying the glucosylation of proteins in the ER has given us insight into UGGT’s role in the cell. Under normal growth conditions, UGGT knockout does not result in a growth phenotype in yeast90,91 or mouse92 cells, but growth is impaired upon induced ER stress. Using a mouse UGGT1 knockout cell line, UGGT has been shown to be required for the proper maturation of T cell receptor (TCR) α subunits93 and it has been shown to continuously re-glucosylate and retain them (and other TCR subunits) in the ER until they are incorporated into the TCR complex94. UGGT is also required for the proper folding of prosaposin, which forms aggresomes in its absence95. UGGT decreases the aggregation of misfolded α1-antitrypsin mutants (NHK and Z allele) in the ER96. In pulse-chase experiments, CNX associated with some, but not all, glycoproteins for a shorter period of time when UGGT was knocked out97.

The class I major histocompatibility complex (MHC) is an immune system-related glycoprotein complex that gets loaded with proteolytic peptide fragments by the peptide loading complex (PLC) in the ER so that they can be presented as antigens at the cell surface. The MHC class I complex associates with calreticulin and then with PLC to load a peptide into the peptide-binding groove on MHC. After peptide loading, if the peptide is not bound to the peptide-binding groove with a strong enough affinity, UGGT will re-glucosylate MHC. This results in re-association with calreticulin, peptide removal, and PLC re-association98,99. Though UGGT is not required for MHC class I complex assembly and presentation at the cell surface, this data suggests that UGGT plays a role in the quality control of peptide loading in the ER. Additionally, a recent study identified the MHC class I chaperone TAPBPR as a human UGGT1 interactor, and implicated the UGGT1-TAPBPR interaction in the proper loading of the MHC class I complex100.

In Arabidopsis thaliana, bri1-9 mutant plants are insensitive to brassinosteroids because the bri1-9 brassinosteroid receptor is retained in the ER and does not get to the cell surface101. Consistent with this observation, UGGT knockout restores brassinosteroid sensitivity to the plants by allowing the mutant bri1-9 receptor to escape ER quality control and get to the cell surface. A similar “receptor-retention” phenotype has been described for the bir1-1 receptor kinase mutant, where UGGT knockout restored receptor function because the mutant receptor was able to reach the cell surface102. Overall, UGGT knockout plants are viable and healthy, but they are sensitive to stress and show an upregulation in the unfolded protein response (UPR)103.

Functional UGGT is required for proper development in mice. UGGT1 knockout mice experience embryonic lethality at stage E13, but mouse embryonic fibroblasts derived from these embryos are viable in cell culture92. Neither human UGGT1 nor UGGT2 has been implicated in any disease phenotypes.


1.2.6 Binding partner Sep15 (SELENOF)

The 15-kDa selenoprotein Sep15 (recently named SELENOF, HUGO Nomenclature Committee104) has been shown to bind to UGGT and to be maintained in the ER through this interaction105. It binds tightly to UGGT in a 1:1 ratio with a reported KD of 20 nM106. Sep15 binds via its N-terminal region, which does not have homology to known domains. The UGGT binding site has recently been mapped to amino acids 262-305 on Drosophila UGGT, a loop between the trx1 and trx4 domains50.

Sep15 is expected to be a thiol-disulphide oxidoreductase; it contains the non-canonical amino acid selenocysteine (U) in a conserved C-X-U motif, which differs from the canonical thioredoxin C-X-X-U/C-X-X-C motif107. A functional role has not been ascribed to Sep15. It is upregulated in some types of ER stress108, and Sep15 knockout mice develop cataracts, protein aggregates in the eye109. These data suggest that Sep15 is playing some role in folding or in quality control, although Sep15 knockout does not produce an obvious phenotype in cell culture. Many have speculated that Sep15 may be using its oxidoreductase activity to confer on UGGT properties, perhaps through disulphide isomerization. However, Sep15 has not yet been shown to have this activity.

1.3 Rationale of the thesis

UGGT recognizes a wide variety of misfolded (but not unfolded) glycoproteins containing a Man9GlcNAc2 glycan. However, the exact requirements for binding are not well established, other than the fact that UGGT prefers substrates with exposed hydrophobic residues. The structural basis for how it binds its diverse substrates is also not clear, but evidence is mounting to suggest that its conformational flexibility (trx and catalytic domains) plays an important role. It is also not known whether different UGGT domains, or combinations of them, are involved in the recognition of different substrates. To shed light on these questions, this thesis deals with the preparation of various forms of RiboB to be used in attempts to crystallize UGGT-substrate complexes. The human isoform, UGGT1, was used for the following experiments and will hereafter be referred to as UGGT1.


Materials and Methods

2.1 Vector construction

All constructs used here were inserted into the same vector, PB-T-PAF110, which is a tetracycline-inducible vector for mammalian cell expression. The vector encodes a fusion protein with a Protein A purification tag at the N-terminus, followed by a linker with a tobacco etch virus (TEV) protease cleavage site, followed by the protein of interest.

A PB-T-PAF vector encoding human UGGT1 was obtained from Xuyao Li111. To generate this vector, UGGT1 cDNA (hORFeome, clone ID: 5492334) was amplified using the primers #1 and #2 shown in Table 1. The cDNA and primers were mixed with dNTPs and Phusion HF polymerase (New England Biolabs, Cat. #M0530S) and PCR was performed with 30 cycles of 30 s at 98 °C, 30 s at 56 °C, and 3 min at 72 °C. The resultant DNA was run on a 1% agarose gel, excised and purified using the NucleoSpin Gel and PCR Clean-up kit (Macherey-Nagel, Cat. #740609.250). Additionally, linearized PB-T-PAF vector was prepared by digestion with AscI and NotI (Thermo Scientific, Cat. #ER1891, Cat. #ER0591) restriction enzymes and subsequent gel purification. The PCR fragments and linearized vector were incubated with the InFusion HD mixture (Clontech, Cat. #638910) which uses recombination to insert the fragment into the vector. This vector was transformed into Stellar competent cells (Clontech, Cat. #636763), and miniprepped with the Presto Mini Plasmid Kit (Geneaid, Cat. #PDH300) to prepare DNA for transfection.

Table 1. PCR primers used for cloning and site-directed mutagenesis

Primer #   Primer Name   Sequence (5’ to 3’)
1          5’-UGGT1      TTTATATTTCCAGGGCGCGCCCGACTCAAAAGCCATT
2          3’-UGGT1      ATCAGTTATCTATGCGGCCGCTCATTCTTCACGTTT
3          5’-RB         CACAAGTTTGTACAGCTAGCCA
4          3’-RB-N34Q    TTGCCGGGACTTCATCATCTG
5          5’-RB-N34Q    ATGAAGTCCCGGCAACTGACAAAGGACAGATGTAAAC
6          3’-RB         GATCAGTTATCTATGCGGCCGCTCACATAA
7          3’-RB-Y76N    ATTAGACTGGTAGCAATTGGT
8          5’-RB-Y76N    TGCTACCAGTCTAATAGTACTATGAGTATCACCGAC
9          3’-RB-G88N    GTTTGTCTCCCTACAGTCGG
10         5’-RB-G88N    TGTAGGGAGACAAACAGCAGCAAATACCCTAACTG

A PB-T-PAF vector containing modified bovine RiboB was obtained from Xuyao Li111. RiboB was modified to insert the sequence DDDDK between Ala20 and Ser21. Using this DDDDK construct, I generated three more RiboB mutants, RiboBN34Q, RiboBN34Q/Y76N, and RiboBN34Q/G88N, that were produced using the primers found in Table 1. The common N34Q mutation was introduced first by amplifying the RiboB sequence flanking the mutation site with Phusion HF polymerase using primer pairs #3/#4 and #5/#6. The InFusion enzyme was used to insert these fragments into an AscI and NheI (Thermo Scientific, Cat. #ER0971) digested PB-T-PAF vector. To perform the second mutation, primer pairs #3/#7 and #8/#6 (for N34Q/Y76N) and primer pairs #3/#9 and #10/#6 (for N34Q/G88N) were used. The InFusion enzyme was used to insert them into an NheI/NotI digested PB-T-PAF vector. The final clones were confirmed by Sanger sequencing at The Centre for Applied Genomics (The Hospital for Sick Children, Toronto, ON).
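The mutagenic primers in Table 1 can be sanity-checked by translating them. The sketch below is illustrative only and was not part of the cloning workflow. It assumes that each forward mutagenic primer is in frame from its first base (an assumption that happens to reproduce the expected RiboB residues), translates it with the standard genetic code, and lists any N-X-S/T sequons. The function names `translate` and `sequons` are hypothetical helpers introduced here.

```python
import re

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate from the first base (assumed frame); drop a trailing partial codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def sequons(protein):
    """Return every N-X-S/T tripeptide, where X is any residue except proline."""
    return re.findall(r"(?=(N[^P][ST]))", protein)


# Forward mutagenic primers copied from Table 1 (primers #5, #8 and #10).
FORWARD_PRIMERS = {
    "5'-RB-N34Q": "ATGAAGTCCCGGCAACTGACAAAGGACAGATGTAAAC",
    "5'-RB-Y76N": "TGCTACCAGTCTAATAGTACTATGAGTATCACCGAC",
    "5'-RB-G88N": "TGTAGGGAGACAAACAGCAGCAAATACCCTAACTG",
}

for name, sequence in FORWARD_PRIMERS.items():
    peptide = translate(sequence)
    print(name, peptide, sequons(peptide))
```

Under this frame assumption, the N34Q primer encodes Gln in place of Asn34 and contains no sequon, while the Y76N and G88N primers each encode a new N-X-S/T sequon (N-S-T and N-S-S, respectively).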

2.2 Cell culture

For adherent culture, 293S or 293F cells (two different cell lines derived from HEK293 cells) were grown in DMEM/F12 media (Wisent, Cat. #319085-CL) supplemented with 3% fetal bovine serum (FBS; Sigma-Aldrich, Cat. #F1051-500ML) and 1% penicillin/streptomycin (Bioshop, Cat. #PEN333.1, Cat. #STP101.100). Both cell lines are deficient in GnTI and do not form complex N-glycans. Cells were incubated at 37 °C, 5% CO2 and were passaged every 3 days at a 1:10 ratio on 10 cm plastic tissue culture plates.

2.2.1 Transfection

Cells in a 6-well dish were allowed to reach 60-80% confluency before transfection. 2 μg of the vector of interest was mixed with 0.5 μg each of helper plasmids PB-RB and PBase M110. The three plasmids were mixed with 4 μl of the jetPRIME reagent (Polyplus Transfection, Cat#114-07) in 200 μl of jetPRIME buffer and incubated for 10 min before dropwise addition to the cell media, as per the suppliers' protocol. The cells were incubated for 4 hours with the transfection reagent, before removing the media and adding fresh media for 2 days. After 2 days, cells were selected with 1 μg/ml puromycin (Wisent, Cat. #400-160-EM) and 1 μg/ml blasticidin (Invitrogen, Cat. #46-1120) in the media for 1-2 weeks. After selection, the stably transfected cells were either scaled up for protein production, or frozen down for later use. To freeze the cells, the cells on a confluent 10 cm dish were resuspended in 3 ml of DMEM/F12, 10% FBS, 10% DMSO, and 1% penicillin/streptomycin. 1 ml was aliquoted into cryo-vials, and cells were placed in a cell freezing apparatus at -80 °C. After cooling to -80 °C, cells were transferred to a liquid nitrogen tank for storage.


2.2.2 Protein production

For larger-scale protein expression, ~4 × 10^7 cells were seeded into a 3 L roller bottle with 250 mL DMEM/F12 media. Protein expression was induced with DMEM/F12 induction media containing 2 μg/ml doxycycline (Biobasic, Cat. #DB0889) and 2 μg/ml aprotinin (Bioshop, Cat. #APR600.B). Expression was carried out for 8 weeks with media changes every 3-4 days.

RiboB was expressed in induction media containing an additional 2 μg/ml of the mannosidase inhibitor kifunensine (GlycoSyn, Cat. #FC-034) in order to produce RiboB with nearly homogeneous Man9GlcNAc2 N-glycans.

2.3 Protein purification and chromatography

For each construct, the cell culture media containing the secreted fusion protein of interest was centrifuged to remove cell debris and filtered through Whatman paper in an Amicon pressure filtration apparatus. 2-5 ml of IgG Sepharose 6 Fast Flow resin (GE Healthcare, Cat. #17-0969-02) was used per litre of filtered media. The bottle was placed on a roller and incubated at 4 °C for 1-4 hours. Beads were collected in a gravity flow column, washed with 100-300 ml wash buffer (10 mM Tris pH 7.6, 150 mM NaCl), and incubated with purified TEV protease (~0.1 mg/ml) at 4 °C for 6 or more hours. The cleaved protein was eluted with wash buffer and dialyzed into low salt buffer for ion exchange chromatography.

2.3.1 UGGT1

The affinity purified UGGT1 was dialyzed into 10 mM HEPES pH 7.0, 50 mM NaCl at 4 °C. The protein was purified by ion exchange chromatography using a HiTrap 1 mL Q HP (GE Healthcare, Cat. #17-1153-01) column with a NaCl gradient from 50 mM to 500 mM. The protein was subsequently purified by size-exclusion chromatography on a Superdex 200 10/300 GL column (GE Healthcare, Cat. #17517501) in 10 mM HEPES pH 7.0, 150 mM NaCl. Purified UGGT1 was stored at -20 °C in 40% glycerol at 0.575 mg/ml in this buffer.

2.3.2 RiboB

The affinity purified RiboB was dialyzed into 10 mM HEPES pH 7.0, 25 mM NaCl at 4 °C. The protein was purified by ion exchange chromatography using a HiTrap 5 mL SP HP (GE Healthcare, Cat. #17-1152-01) column with a NaCl gradient from 25 mM to 500 mM. The protein was again purified by ion exchange chromatography using a 1 ml MonoS 5/50 GL (GE Healthcare, Cat. #17-5168-01) column (25 mM to 500 mM NaCl gradient) which provided separation between the RiboA and RiboB components. The protein was concentrated to 4-5 mg/ml in the ion exchange buffer using an Amicon Ultra-4 centrifugal concentrator with a 3000 MW cutoff (Millipore, Cat #UFC801024).

2.4 RiboS and S-protein preparation and chromatography

4-5 mg/ml RiboB from the MonoS column was mixed 10:1 with 10X EK buffer (200 mM Tris pH 8.0, 500 mM NaCl, 20 mM CaCl2) and incubated at room temperature for 16 hours with 10 units of bovine enterokinase (Applied Biological Materials, Cat. #G699) per ml of RiboB. The cleaved RiboS was diluted into low salt buffer and purified by MonoS chromatography using a 25 mM to 500 mM salt gradient. The RiboS was concentrated to 4 mg/ml in the ion exchange buffer.

To prepare S-protein, 4 mg/ml RiboS from the MonoS column was mixed with a 1:20 volume of 1 M sodium phosphate pH 2.8, which dissociates the S-peptide from the S-protein. This was purified by size-exclusion chromatography on a Superdex 75 10/300 GL (GE Healthcare, Cat. #17517401) column in 50 mM sodium phosphate pH 2.8, 100 mM NaCl. Immediately after elution, the purified S-protein was neutralized with 1 M HEPES pH 7.0, and dialyzed into 10 mM HEPES pH 7.0, 150 mM NaCl.

2.5 Ribonuclease activity assay

Each ribonuclease sample was diluted to 0.01 mg/ml in 10 mM HEPES pH 7, 150 mM NaCl. A 0.5 mg/ml solution of torula yeast type VI RNA (Sigma-Aldrich Cat. #R6625-256) was prepared in 0.1 M sodium acetate pH 5.0. 200 μl of the RNA solution was added to 2 μl of 0.01 mg/ml ribonuclease in each well of the microplate and the plate was promptly inserted into the EnSpire 2300 multilabel reader (Perkin-Elmer). Absorbance at 300 nm (A300) was recorded every 6 s and the first 5 or 10 minutes of reaction were used to calculate the initial rate. The rate of reaction is defined as the ΔA300 per second.
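The initial-rate calculation described above can be sketched as a least-squares slope over the first part of the A300 trace. This is a minimal illustration, not the analysis code used for the thesis: the function names, the default 5 minute window, and the synthetic trace are all assumptions introduced here.

```python
def least_squares_slope(xs, ys):
    """Slope of the ordinary least-squares line through the points (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def initial_rate(a300, interval_s=6.0, window_s=300.0):
    """Return the change in A300 per second over the first `window_s` seconds.

    `a300` holds readings taken every `interval_s` seconds, starting at t = 0.
    """
    n_points = int(window_s // interval_s) + 1
    ys = a300[:n_points]
    xs = [i * interval_s for i in range(len(ys))]
    return least_squares_slope(xs, ys)


# Synthetic 10 minute trace (hypothetical values): A300 changes by -2e-4 per second.
synthetic_trace = [0.5 - 2e-4 * 6.0 * i for i in range(101)]
rate = initial_rate(synthetic_trace)
print(f"initial rate: {rate:.2e} A300 units per second")
```

Passing `window_s=600.0` gives the 10 minute variant mentioned in the text.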

For assaying the reaction at different temperatures, the EnSpire plate reader with the microplate inside was adjusted to the desired temperature and allowed to equilibrate before activity was assayed. The RNA solution was pre-warmed in a water bath for 10 min prior to pipetting it onto the microplate.


2.6 UGGT1 MALDI-TOF MS-based activity assay

UGGT1 enzymatic activity was assayed by incubating the reaction components together and using mass spectrometry at several time points to measure the relative amounts of Man9GlcNAc2-containing protein (substrate) and Glc1Man9GlcNAc2-containing protein (product). Unless otherwise noted, the standard assay was carried out at 37 °C in a volume of 20 μl. It consisted of 3 μl of UGGT1 (final concentration 0.5 μM), 1 μl CaCl2 (final concentration 5 mM), 1 μl UDP-glucose (Toronto Research Chemicals, Cat. #829950; final concentration 1 mM), and 15 μl of RiboS or S-protein (final concentration 50 μM). The reaction mixture contained 8.25 mM HEPES pH 7.0, 124 mM NaCl, and 6% glycerol.
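As a consistency check on the stated final concentrations, the dilution arithmetic can be written out. The sketch below assumes the UGGT1 storage stock described in Section 2.3.1 (0.575 mg/ml in 40% glycerol) and the molecular weight of 173 kDa predicted from the amino acid sequence; the helper names are hypothetical.

```python
def mg_per_ml_to_micromolar(mg_per_ml, mw_da):
    """Convert mg/ml (equivalent to g/L) to micromolar using the molecular weight in Da."""
    return mg_per_ml / mw_da * 1e6


def diluted(stock_value, volume_added_ul, total_volume_ul):
    """Final concentration (same units as the stock) after dilution into the reaction."""
    return stock_value * volume_added_ul / total_volume_ul


TOTAL_UL = 20.0          # standard assay volume
UGGT1_UL = 3.0           # volume of UGGT1 stock added
UGGT1_MW_DA = 173_000    # predicted from the amino acid sequence

uggt1_stock_uM = mg_per_ml_to_micromolar(0.575, UGGT1_MW_DA)
uggt1_final_uM = diluted(uggt1_stock_uM, UGGT1_UL, TOTAL_UL)
glycerol_final_pct = diluted(40.0, UGGT1_UL, TOTAL_UL)

print(f"UGGT1 stock: {uggt1_stock_uM:.2f} uM")
print(f"UGGT1 final: {uggt1_final_uM:.2f} uM")
print(f"glycerol final: {glycerol_final_pct:.1f} %")
```

Under these assumptions the 3 μl addition gives approximately 0.5 μM UGGT1 and 6% glycerol, in agreement with the values quoted for the standard assay.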

2 μl of the reaction was collected at each time point and immediately mixed 1:1 with 0.2% formic acid (47 mM) to stop the reaction. These samples were mixed in a 1:1 ratio with a saturated solution of sinapinic acid (Sigma-Aldrich Cat. #85429-1G) in 95% ethanol. For each time point, 0.8 μl was spotted onto a microScout MSP 96 polished steel target (Bruker, Cat #224989) and allowed to air dry. It was then placed in a MicroFlex LT MALDI-TOF mass spectrometer (Bruker). Data was collected using 150-600 laser shots at laser intensities ranging from 65-90%. The data was exported to mzXML file format and analyzed with mMass 5.5.0. The data was processed using a baseline correction (precision: 80, relative offset: 30), smoothed with the Savitzky-Golay method (2 cycles with window size of 5 m/z), and peaks were chosen from these processed spectra. The major acceptor substrate peak for both RiboS and S-protein was found at approximately 14410 Da, and the product peak was found at approximately 14572 Da. To calculate the product concentration at each time point, equation (1) was used. Essentially, the initial substrate concentration was multiplied by the ratio of the intensity of the product peak to the total intensity of both peaks to give the concentration of the product.

\[
[\text{glucosylated product}] = [\text{acceptor substrate}] \times \frac{\text{intensity of product peak}}{\text{intensity of substrate peak} + \text{intensity of product peak}} \tag{1}
\]

Initial rates were calculated by taking the slope of the initial linear portion of the curve using Excel 2016 (Microsoft).
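Equation (1) can be expressed as a short function. The sketch below is illustrative: the peak intensities and time points are hypothetical values chosen only to show the arithmetic, and the function name is introduced here. The roughly 162 Da difference between the substrate (about 14410 Da) and product (about 14572 Da) peaks is consistent with the addition of a single hexose residue.

```python
def product_concentration(substrate_uM, substrate_intensity, product_intensity):
    """Equation (1): scale the initial acceptor substrate concentration by the
    fraction of total peak intensity found in the product peak."""
    total = substrate_intensity + product_intensity
    if total <= 0:
        raise ValueError("peak intensities must sum to a positive value")
    return substrate_uM * product_intensity / total


INITIAL_SUBSTRATE_UM = 50.0  # standard assay concentration of RiboS or S-protein

# Hypothetical time course: (minutes, substrate peak intensity, product peak intensity)
time_course = [
    (0, 400.0, 0.0),
    (10, 360.0, 40.0),
    (20, 320.0, 80.0),
    (30, 300.0, 100.0),
]

for minutes, substrate_peak, product_peak in time_course:
    conc = product_concentration(INITIAL_SUBSTRATE_UM, substrate_peak, product_peak)
    print(f"{minutes:>3} min: {conc:.1f} uM product")
```

The initial rate is then the slope of the early, linear part of the resulting concentration versus time curve. Note that this approach assumes the substrate and product ionize with similar efficiency, so that relative peak intensity reflects relative abundance.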


2.7 Circular Dichroism

Circular dichroism (CD) was performed on the modified RiboB substrates with a Jasco J-810 spectropolarimeter equipped with a Jasco Peltier PTC-4235 temperature controller. CD scans from 200 nm to 340 nm were taken with a pitch of 1 nm, a bandwidth of 1 nm, a response time of 2 s, a scan speed of 50 nm/min, and at least 3 accumulations. A quartz cuvette with a 0.1 cm path length was used. Protein was assayed at a concentration of 0.5 mg/ml in 10 mM HEPES pH 7.0, 150 mM NaCl.

Thermal melts were performed from 10 °C to 73 °C by taking a CD spectrum every 1.0 °C, using the “Temperature/Wavelength Scan” program. The ellipticity at 240 nm or 274 nm was plotted against temperature with the statistical software R. For the reversible melt, the program “Variable Temperature” was run at 240 nm, and the temperature was adjusted from 20 °C to 95 °C and back in 1.0 °C increments.

2.8 BIAcore surface plasmon resonance

A BIAcore X surface plasmon resonance (SPR) instrument was used to study protein-protein interactions. A CM5 chip (GE Healthcare, Cat. #BR100012) was coupled with purified UGGT1 using non-specific amine coupling according to the manufacturer’s recommendations. Briefly, a new chip was equilibrated with HBS buffer (10 mM HEPES pH 7.0, 150 mM NaCl) at a flow rate of 10 μl/min. In flow cell 2, a 70 μl solution of 0.05 M N-hydroxysuccinimide (Fluka, Cat. #56480) and 0.1 M 1-ethyl-3-(3-dimethylaminopropyl) carbodiimide (Sigma-Aldrich, Cat. #03449-1G) was flowed over the chip, followed by 25 μl of 0.05 mg/ml UGGT1 in 40 mM sodium acetate pH 4.5. The chip was then washed with 70 μl of 1 M ethanolamine pH 8.5. Flow cell 1 was prepared as a control in the same way but with no UGGT1 in the acetate buffer. Both flow cells were washed well with HBS buffer prior to the binding studies.

Ribonuclease substrates were dialyzed into HBS buffer before assay. Two 10 μl injections of substrate were flowed over the UGGT1-coupled chip and the difference between flow cell 2 and flow cell 1 was recorded. The SPR traces were processed using BIAevaluation 4.1 and Scrubber 2.0. The difference in the response units (ΔRU) between the baseline and the plateau for each injection was calculated. The ΔRU value is used as a relative measure of binding.
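The ΔRU calculation can be sketched as follows. This is an illustration of the arithmetic only; the actual traces were processed in BIAevaluation and Scrubber. The sensorgram values and the index windows chosen for the baseline and plateau are hypothetical.

```python
from statistics import mean


def reference_subtract(flow_cell_2, flow_cell_1):
    """Subtract the control trace (flow cell 1) from the UGGT1-coupled trace (flow cell 2)."""
    return [fc2 - fc1 for fc2, fc1 in zip(flow_cell_2, flow_cell_1)]


def delta_ru(trace, baseline_window, plateau_window):
    """Mean response in the plateau window minus mean response in the baseline window.

    Each window is a (start, stop) pair of list indices, with stop exclusive.
    """
    baseline = mean(trace[baseline_window[0]:baseline_window[1]])
    plateau = mean(trace[plateau_window[0]:plateau_window[1]])
    return plateau - baseline


# Hypothetical sensorgrams (response units) sampled at matching time points.
fc2 = [10, 10, 10, 60, 60, 60]   # UGGT1-coupled flow cell
fc1 = [5, 5, 5, 15, 15, 15]      # control flow cell

subtracted = reference_subtract(fc2, fc1)
binding = delta_ru(subtracted, baseline_window=(0, 3), plateau_window=(3, 6))
print(f"delta RU = {binding}")
```

Subtracting the control flow cell first removes bulk refractive index changes and non-specific binding to the chip surface, so that the remaining ΔRU reflects binding to the immobilized UGGT1.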


Results

3.1 Protein Expression

3.1.1 UGGT1 expression

HEK293S (GnTI-) cells stably expressing a Protein A-UGGT1 fusion protein in the PB-T-PAF vector were generated by Xuyao Li111. I thawed frozen aliquots of these cells for human UGGT1 expression. There is a TEV protease cleavage site (ENLYFQ^G) between the Protein A and UGGT1 sequence so that the Protein A tag can be removed. For large-scale expression, cells were grown in roller bottles with 250 mL DMEM/F12 media. The protein A-UGGT1 fusion protein was secreted into the cell culture media. Protein A-UGGT1 was detected as a ~200 kDa band on a western blot using an anti-Protein A antibody (Figure 5a). After three chromatography steps, about 700-800 μg of UGGT1 was purified per litre of cell culture media.

3.1.2 RiboB vectors and expression

A Protein A-RiboB (bovine) construct in the PB-T-PAF vector was also constructed by Xuyao Li. RiboB had been engineered with two modifications to its amino acid sequence. First, the enterokinase (EK) cleavage site, DDDDK, was inserted between Ala20 and Ser21 as described previously112. EK is a site-specific protease that cuts a polypeptide C-terminal to the lysine in DDDDK. Secondly, there is a transglutaminase recognition site (PKPQQFM) at the C-terminus. This was engineered for the purpose of derivatizing RiboB with transglutaminase, but it was not put to use in this study.

I generated three new vectors coding for RiboB glycosylation mutants using site-directed mutagenesis. First, a non-glycosylated variant (RiboBN34Q) was created with the asparagine of its N-X-S/T sequon mutated to glutamine. Based on this RiboBN34Q variant, two additional variants with new N-X-S/T sites at positions 76 and 88 were generated to give RiboBN34Q/Y76N and RiboBN34Q/G88N (Figure 6). These positions were selected because they have glycosylated N-X-S/T sequons in the homologous human RNASE1. In summary, I studied four forms of bovine RiboB: RiboB (wild-type, i.e. N-glycosylated at Asn34), RiboBN34Q/Y76N, RiboBN34Q/G88N, and the non-glycosylated form RiboBN34Q. Three of these molecules are singly glycosylated, and one is not glycosylated.


These vectors were transfected into HEK293F (GnTI-) cells, and stable bulk cell cultures were selected. Cells were grown in roller bottles and protein expression was assessed by the presence of a 45 kDa band on an anti-Protein A Western blot (Figure 5b). Expression levels were similar between the various forms.

Figure 5. Expression of UGGT1 and the Protein A-RiboB glycosylation variants

(a) Anti-Protein A western blot of media containing secreted Protein A-UGGT1. The MW of Protein A-UGGT1 is ~207 kDa. MW markers in the left lane are labeled with their respective MWs. (b) Anti-Protein A western blot of media containing each of the four secreted Protein A-RiboB glycosylation variants. The MW of Protein A-RiboB is 49 kDa.


Figure 6. The RiboB glycosylation variants

(a) A schematic of the RiboB glycosylation variants generated showing the locations of the N-X-S/T sites. Amino acids present in the wild-type RiboB sequence are shown in the top construct (RiboB) and labeled with arrows. Amino acids highlighted in red are part of an N-X-S/T sequon and are glycosylated. (b) The crystal structure of RiboB with the S-peptide shown in gray and residues Asn34, Tyr76, and Gly88 shown in red. PDB ID: 1RBB75.

3.2 Protein Purification

3.2.1 UGGT1 purification

The Protein A-UGGT1 fusion protein was purified by affinity chromatography using IgG-sepharose beads. Two pull-downs with the beads were performed because one did not capture all of the protein, as assessed by western blotting for Protein A (Figure 7a). To remove UGGT1 from the beads, purified TEV protease was added to the Protein A-UGGT1-bound beads. TEV cuts between Protein A and UGGT1 so that soluble UGGT1 can flow out of the column while Protein A remains immobilized on the IgG beads. The eluate contained purified UGGT1 with some TEV protease, and it was dialyzed into low salt buffer for ion exchange (IEX) chromatography.

UGGT1, with a calculated isoelectric point of 5.4, was purified by anion exchange chromatography at pH 7.0. It was run on a HiTrap Q HP column and eluted around 310 mM NaCl (Figure 7b).

The UGGT1-containing fractions were pooled, concentrated to 1 mg/ml, and purified further by size-exclusion chromatography (SEC) on a 24 mL Superdex 200 column. UGGT1 eluted at a retention volume of 10.0 ml (Figure 7c). This corresponds to a molecular weight (MW) of 175 kDa based on a standard curve generated with molecular weight standards. This is in accordance with the MW of 173 kDa predicted from the amino acid sequence of UGGT1. Additionally, the symmetrical peak suggests that the protein is homogeneous in size.

Protein concentration was assessed by absorbance spectrophotometry. The absorbance at 280 nm (A280) for 1 mg/ml UGGT1 was estimated to be 1.2 (estimated by the ExPASy ProtParam tool113, based on sequence). This value was used to calculate the protein concentration.
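The conversion from a measured A280 to a protein concentration can be sketched as below. This is a minimal illustration that assumes a 1 cm path length and a linear (Beer-Lambert) response; the function name, the dilution-factor parameter, and the example reading of 0.69 are hypothetical.

```python
def concentration_mg_per_ml(a280, a280_at_1_mg_per_ml, dilution_factor=1.0, path_cm=1.0):
    """Estimate protein concentration from A280.

    `a280_at_1_mg_per_ml` is the absorbance expected for a 1 mg/ml solution
    in a 1 cm path (1.2 for UGGT1, as estimated from its sequence).
    """
    return a280 / (a280_at_1_mg_per_ml * path_cm) * dilution_factor


UGGT1_A280_AT_1_MG_PER_ML = 1.2

# Hypothetical reading: an undiluted sample with A280 = 0.69 in a 1 cm cuvette.
example = concentration_mg_per_ml(0.69, UGGT1_A280_AT_1_MG_PER_ML)
print(f"{example:.3f} mg/ml")
```

The same function applies to other proteins once the appropriate absorbance for a 1 mg/ml solution is supplied.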

UGGT1 is susceptible to proteolytic degradation. When stored at 4 °C, UGGT1 proteolysis can be detected by SDS-PAGE or MALDI-TOF MS after only two days (Figure 8a, b). The molecular weights of these fragments were estimated by SDS-PAGE (120 kDa, 145 kDa, and 170 kDa; Figure 8a) or by MALDI-TOF (121 kDa, 143 kDa, and 173 kDa; Figure 8b). Storage in 40% glycerol at -20 °C with protease inhibitor cocktail (Roche cOmplete mini EDTA-free) slowed degradation, and very little proteolysis was seen in the first two weeks when stored this way (Figure 8c, d).

The proteolyzed fragments co-purify by SEC and elute at the same retention time as freshly purified intact UGGT1. UGGT1 still maintains activity in proteolyzed samples.


Figure 7. UGGT1 purification

(a) Western blot of Protein A-UGGT1-containing media after zero, one or two IgG pulldowns. Ab: anti-Protein A. (b) Anion exchange chromatography of UGGT1 on a HiTrap 1 mL Q HP column (NaCl gradient from 50 mM to 500 mM). (c) Size-exclusion chromatography of UGGT1 on a Superdex 200 column. Absorbance at 280 nm (blue line), conductance (brown line), FPLC fractions (red inset).


Figure 8. UGGT1 proteolysis

(a) SDS-PAGE gel and (b) MALDI-TOF spectra of UGGT1 after 0-15 days of storage at 4 °C. (c) SDS-PAGE gel and (d) MALDI-TOF spectra of UGGT1 after 0-15 days of storage at -20 °C with protease inhibitors. Approximate MWs are labeled.

3.2.2 RiboB, RiboS and S-protein purification

Protein A-RiboB was purified with IgG-sepharose beads and eluted by cleavage with TEV protease. The TEV-released material was dialyzed and subsequently purified by cation exchange chromatography using a HiTrap SP HP column (Figure 9a). A gradient was run from 25 mM NaCl to 500 mM NaCl, at pH 7.0. There was separation of the glycosylated and non-glycosylated forms and only the glycosylated fractions were pooled. This material was again purified by IEX on a MonoS column, which provided better separation between the glycosylated and non-glycosylated forms (Figure 9b). Since these forms were not baseline-separated, two sequential cation exchange chromatographies were performed with the goal of purifying the glycosylated protein away from the non-glycosylated protein.


Even though an N-X-S/T sequon is present in the RiboB, RiboBN34Q/Y76N, and RiboBN34Q/G88N constructs, these molecules were glycosylated to different extents. As assessed by the ratio of the glycosylated to non-glycosylated peaks during SP purification, the native RiboB construct produced protein that was only ~40% glycosylated. This can also be seen on the SDS-PAGE gel in Figure 10a. On the other hand, the RiboBN34Q/Y76N construct produced protein that was entirely glycosylated, and the RiboBN34Q/G88N construct produced protein that was only ~15% glycosylated. The construct lacking the N-X-S/T sequon, RiboBN34Q, was not glycosylated, as expected (Figure 10a).

RiboS was generated by cutting RiboB with EK followed by MonoS chromatography (Figure 9c). This purification step was required because the EK reaction does not reach 100% completion, and unclipped RiboB must be separated from RiboS. To dissociate RiboS into S-protein and S-peptide, it was run on a Superdex 75 column in 50 mM sodium phosphate pH 2.8, 100 mM NaCl (Figure 9d).

After dialysis, each of the various S-proteins (S-protein, S-proteinN34Q, S-proteinN34Q/Y76N, and S-proteinN34Q/G88N) was concentrated using a centrifugal concentrator with a 3000 MW cutoff. Concentration was assessed by absorbance spectrophotometry at 280 nm. 1 mg/ml RiboB was calculated to have an A280 of ~0.60, and this was used to determine the concentration of each sample. The protein concentrations were then adjusted and, to check that they were similar, each S-protein variant was run on an SDS-PAGE gel alongside uncleaved RiboB for comparison (Figure 10b).
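The A280-based concentration estimate described above can be sketched as follows. This is a minimal illustration that assumes a linear absorbance response and a blank-corrected reading; the function names and the dilution helper are hypothetical and are not taken from the thesis methods.

```python
# Value quoted in the text: 1 mg/ml RiboB has an A280 of ~0.60.
# Assumption: absorbance scales linearly with concentration over the range used.
A280_PER_MG_ML = 0.60


def concentration_mg_per_ml(a280: float, dilution_factor: float = 1.0) -> float:
    """Estimate protein concentration (mg/ml) from a blank-corrected A280 reading."""
    if a280 < 0:
        raise ValueError("A280 must be non-negative after blank subtraction")
    return a280 / A280_PER_MG_ML * dilution_factor


def buffer_to_add_ul(stock_mg_ml: float, target_mg_ml: float, stock_volume_ul: float) -> float:
    """Volume of buffer (ul) needed to dilute a stock to a lower target concentration."""
    if target_mg_ml <= 0 or target_mg_ml > stock_mg_ml:
        raise ValueError("target must be positive and no higher than the stock")
    return stock_volume_ul * (stock_mg_ml / target_mg_ml - 1.0)
```

For example, a reading of 0.30 would correspond to roughly 0.5 mg/ml under this assumption.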

Figure 9. RiboB, RiboS, and S-protein purification

FPLC chromatography profiles of (a, b) RiboB, (c) RiboS, and (d) S-protein. Black line, % buffer B (NaCl gradient from 25 mM to 500 mM); blue line, A280; green line, A215. (a) 5 mL HiTrap SP chromatography of RiboB. (b) 1 mL MonoS chromatography of RiboB. (c) 1 mL MonoS chromatography of RiboS. (d) 25 mL Superdex 75 chromatography of S-protein.

Absorbance at 215 nm (A215 trace, green line) detects the S-peptide and the S-protein, while absorbance at 280 nm (A280 trace, blue line) only detects the S-protein. The A280 and A215 traces are normalized on this graph, though much more signal was obtained at A215.

Figure 10. SDS-PAGE gels of RiboB and S-protein glycosylation mutants

(a) SDS-PAGE gel of native RiboB and the three RiboB variants after affinity purification. The ratio of glycosylated to non-glycosylated protein varies from mutant to mutant. (b) SDS-PAGE gel of the three S-protein variants, wild-type S-protein, and wild-type RiboB.

3.3 Substrate properties

3.3.1 Circular Dichroism

For RiboB, RiboS, and S-protein, circular dichroism (CD) was used to measure unfolding and refolding as a function of temperature. All three proteins were prepared at 0.5 mg/ml in 10 mM HEPES pH 7.0, 150 mM NaCl and subjected to a CD scan from 200 nm – 340 nm. Changes in the far-UV range (~190-250 nm) are representative of changes in protein secondary structure, while changes in the near-UV range (~250-300 nm) represent changes in the protein fold or tertiary structure.

It can be seen in Figure 11a that the 10 °C far-UV spectra for RiboB and RiboS are very similar, while the S-protein curve looks quite different. This can be attributed to the fact that RiboB and RiboS have very similar secondary structures, while S-protein is missing an entire α-helix (the S-peptide). The ellipticity reading in RiboB/RiboS is consistent with the negative ellipticity associated with α-helices. The 10 °C near-UV spectra of all three proteins also show a negative ellipticity that peaks around 274 nm (Figure 11b).

Figure 11. Far- and Near-UV CD spectra of RiboB, RiboS and S-protein

(a) Far-UV spectra of RiboB (blue), RiboS (orange), and S-protein (green) at 10 °C and the same samples at 73 °C (dark blue, dark orange, and dark green). (b) Near-UV spectra of RiboB (blue), RiboS (orange), and S-protein (green) at 10 °C and the same samples at 73 °C (dark blue, dark orange, and dark green).

A CD thermal melt was performed with RiboB, RiboS, and S-protein. A scan from 200 nm – 340 nm was taken every 1 °C from 10 °C to 73 °C. When the ellipticity at 240 nm (far-UV) was plotted against temperature, a cooperative transition was seen in each of the RiboB, RiboS, and S-protein curves (Figure 12a). S-protein experienced a relatively broad transition with a midpoint at 33 °C, RiboS had a sharper transition with a midpoint at 44 °C, and RiboB had the sharpest transition with a midpoint at 55 °C. These clear differences in transition midpoints indicate that the three substrates have different thermostabilities. It can be concluded that RiboB maintains more of its secondary structure at higher temperatures than do RiboS and S-protein.
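One simple way to read a transition midpoint off a melt curve like those described above is to find the temperature at which the ellipticity is halfway between the low-temperature and high-temperature baselines. The sketch below illustrates that idea only. The thesis does not state how the midpoints were estimated, so the half-height criterion, the flat-baseline assumption, and the function name are all assumptions.

```python
def transition_midpoint(temps_c, ellipticity, n_baseline=5):
    """Temperature at which the signal crosses the mean of the two baselines.

    Assumes a single cooperative transition and roughly flat baselines, each
    estimated from the first and last `n_baseline` points of the melt.
    """
    if len(temps_c) != len(ellipticity) or len(temps_c) < 2 * n_baseline + 1:
        raise ValueError("need matching arrays with enough points for both baselines")
    folded = sum(ellipticity[:n_baseline]) / n_baseline
    unfolded = sum(ellipticity[-n_baseline:]) / n_baseline
    half = (folded + unfolded) / 2.0
    for i in range(len(temps_c) - 1):
        y0, y1 = ellipticity[i], ellipticity[i + 1]
        # A sign change between neighbouring points brackets the half-height crossing.
        if (y0 - half) * (y1 - half) <= 0 and y0 != y1:
            frac = (half - y0) / (y1 - y0)
            return temps_c[i] + frac * (temps_c[i + 1] - temps_c[i])
    raise ValueError("signal never crosses the half-height value")
```

With sloping baselines or noisy data, a two-state fit would be more appropriate than this interpolation.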

Similar results were obtained with the near-UV thermal melt performed at 274 nm (Figure 12b). The transition temperatures for S-protein, RiboS, and RiboB were estimated to be 27 °C, 42 °C, and 53 °C, respectively. In agreement with the far-UV thermal melt, the three proteins have distinct transition midpoints in the same order, though slightly lower in each case.

In addition, CD readings were taken at 240 nm from 20 °C to 95 °C and back to 20 °C, and the transition was found to be reversible for RiboB, RiboS, and S-protein (Figure 13).

Figure 12. CD melts of RiboB, RiboS and S-protein

(a) Far-UV CD melt plotting ellipticity at 240 nm against temperature. RiboB (blue), RiboS (orange), and S-protein (green). (b) Near-UV CD melt plotting ellipticity at 274 nm against temperature. RiboB (blue), RiboS (orange), and S-protein (green).

Figure 13. Reversible CD melt of RiboB, RiboS and S-protein

Far-UV CD melt plotting ellipticity at 240 nm against temperature, where temperature was increased from 20 °C to 95 °C (RiboB, blue; RiboS, orange; S-protein, green) and decreased back to 20 °C (RiboB, dark blue; RiboS, dark orange; S-protein, dark green).

3.3.2 Ribonuclease activity

Ribonuclease activity was measured to demonstrate the differences between RiboB, RiboS, and S-protein. A UV absorbance-based assay was employed, which operates on the principle that cleaved RNA absorbs less light at 300 nm than uncleaved RNA114. The A300 of an RNA solution containing ribonuclease was recorded over time. The signal drops over time as the ribonuclease cleaves the RNA, and these values were plotted against time to generate a reaction progress curve. The ΔA300 over the first 5-10 minutes – where the reaction was linear – was used to calculate the initial velocity (Vo). I have used 20 ng of RiboB per 200 μl reaction for the following assays.
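The initial-velocity calculation described above amounts to taking the slope of A300 against time over the early, linear part of the progress curve. A minimal sketch is given below. It assumes an ordinary least-squares line through the points inside a user-chosen window, and the function names and the default window are illustrative only.

```python
def least_squares_slope(x, y):
    """Slope of the ordinary least-squares line through the points (x, y)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


def initial_velocity_a300(times_min, a300, window_min=5.0):
    """Initial velocity as the rate of A300 decrease (absorbance units per minute).

    Only points with time <= window_min are used; the window should be chosen
    so that it stays within the linear part of the progress curve.
    """
    early = [(t, a) for t, a in zip(times_min, a300) if t <= window_min]
    if len(early) < 2:
        raise ValueError("need at least two points inside the linear window")
    slope = least_squares_slope([t for t, _ in early], [a for _, a in early])
    return -slope  # A300 falls as RNA is cleaved, so the velocity is the negated slope
```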

RiboB and RiboS both display activity at 30 °C (Figure 14). According to their respective Vo values, RiboB catalyzes the reaction about 1.6 times faster than RiboS. S-protein showed no measurable activity. This is because S-protein is missing a key catalytic residue, His12, which is located on the S-peptide.

I assayed ribonuclease activity at various temperatures in order to further study the thermostability of RiboB and RiboS. As temperature increased, RiboB activity increased up to 63 °C, though it started to level off at higher temperatures (Figure 15). On the other hand, RiboS activity increased with temperature up to about 42 °C, before sharply dropping off to zero (Figure 15). Even though both proteins contain the exact same amino acid sequence, RiboB has greater thermostability than RiboS, and can retain activity up to a higher temperature – which is in agreement with the conclusions drawn from the CD melts (Figure 12).

Figure 14. RiboB, RiboS and S-protein activity assay

Ribonuclease activity progress curve shows that RiboB (blue) and RiboS (orange) display ribonuclease activity while S-protein (green) does not.

Figure 15. RiboB and RiboS activity as a function of temperature

Ribonuclease Vo as a function of temperature. RiboB (blue) activity increases with temperature, while RiboS (orange) activity drops off around 42 °C. Error bars represent the standard error (N=4).

3.4 UGGT1 activity

3.4.1 MALDI-TOF-based activity assay

To measure glucosylation of RiboB, RiboS, or S-protein by UGGT1, a MALDI-TOF-based time-point assay was developed. The mass spectrometry (MS)-based assay assesses the relative amounts of protein substrate and product as a means of following the reaction progress. As the reaction progresses, UGGT1 adds glucose moieties with a MW of 162 Da to the protein substrate and the new peak is detected by MS. The intensity of the peak is used as a quantitative measure of the glucosylated product. The amount of product is calculated based on the proportion that has been glucosylated (see section 2.6 for equation). Figure 16a shows the raw mass spectra of S-protein (wild-type) at different time points. Figure 16c shows a progress curve plotted from these data. Dynafit4 was used to fit a curve to the data. Unless otherwise noted, UGGT1 activity assays were carried out with 5 mM CaCl2, 1 mM donor substrate (UDP-glucose), and 50 μM acceptor substrate (RiboB, RiboS or S-protein) at 37 °C. The slope of the line through the initial part of the curve was calculated in order to estimate Vo. The initial part of the curve was taken to be the region where 0%-30% of the substrate was glucosylated. This was determined after several assays showed that the reaction velocity does not significantly drop until well after this point.
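The quantification described above can be sketched in a few lines. The equation actually used is given in section 2.6 and is not reproduced here, so the peak-intensity ratio below is an assumption, as are the function names. The 162 Da shift and the 0%-30% window come from the text.

```python
GLUCOSE_SHIFT_DA = 162.0  # mass added per glucose, as stated in the text


def expected_product_mass(substrate_mass_da, n_glucose=1):
    """Expected mass of the glucosylated product peak."""
    return substrate_mass_da + n_glucose * GLUCOSE_SHIFT_DA


def product_concentration_um(i_substrate, i_product, total_um):
    """Product concentration from peak intensities.

    Assumption: substrate and product ionize equally well, so the fraction
    glucosylated is I_product / (I_substrate + I_product).
    """
    total_intensity = i_substrate + i_product
    if total_intensity <= 0:
        raise ValueError("peak intensities must sum to a positive value")
    return total_um * i_product / total_intensity


def initial_velocity_um_per_min(times_min, product_um, total_um, max_fraction=0.30):
    """Least-squares slope through the points with <= max_fraction conversion."""
    early = [(t, p) for t, p in zip(times_min, product_um) if p <= max_fraction * total_um]
    if len(early) < 2:
        raise ValueError("need at least two points in the initial region")
    n = len(early)
    mean_t = sum(t for t, _ in early) / n
    mean_p = sum(p for _, p in early) / n
    stt = sum((t - mean_t) ** 2 for t, _ in early)
    stp = sum((t - mean_t) * (p - mean_p) for t, p in early)
    return stp / stt
```

As a check on the mass shift, a 14410 Da substrate gives an expected product peak at 14572 Da, which matches the masses quoted in the Figure 16 legend.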

The progress curve for RiboS (wild-type) is shown in Figure 16b, and the progress curve for S-protein is shown in Figure 16c. No curve for RiboB is shown as no glucosylation was detected. The initial velocity of S-protein glucosylation is faster than that of RiboS. This result correlates well with the CD melt data that shows that the S-protein is the least thermostable of the three substrates, and RiboB is the most thermostable. The activity assay was run at 37 °C, which is above the thermal melting transition of S-protein, just below the thermal transition of RiboS, and well below the thermal transition of RiboB (33 °C, 44 °C, and 55 °C, respectively). UGGT1 is only expected to glucosylate proteins resembling folding intermediates or misfolded proteins, and only RiboS and S-protein are misfolded/unfolded at this temperature. The fact that UGGT1 does not glucosylate the natively folded RiboB, but glucosylates the misfolded/destabilized RiboS and S-protein, is consistent with its role as a misfolded-protein sensor.

Figure 16. MALDI-TOF-based UGGT1 activity assay

(a) MALDI-TOF mass spectra at 4 different time points showing the peak shifting from the Man9-S-protein substrate (MW 14410 Da) to the Glc1Man9-S-protein product (MW 14572 Da). (b) The concentration of product (Glc1Man9-RiboS) was plotted against time and fitted with Dynafit4 to get a progress curve. The slope of the blue line is the Vo. (c) The concentration of product (Glc1Man9-S-protein) was plotted against time and fitted with Dynafit4 to get a progress curve. Vo is calculated from the slope of the blue line.

3.4.2 Alternative activity assays

In the course of this work, alternative UGGT1 activity assays were also performed and evaluated. Liquid chromatography-electrospray ionization-mass spectrometry (LC-ESI-MS) on a Waters qTOF mass spectrometer was tested as a mass spectrometry-based alternative to the MALDI-TOF approach. Time points were collected in a similar fashion, then desalted using a C18 Zip-Tip and run one-by-one on LC-ESI-MS. At 50 μM, S-protein was glucosylated faster than RiboS and RiboB was not glucosylated (Figure 17a). Too few data points were collected in the linear range of the reaction to calculate a Vo for S-protein, but a Vo of 0.50 μM/min was calculated for RiboS. This is slightly lower than, but comparable to, the Vo calculated with the MALDI-TOF assay. These results are in agreement with the MALDI-TOF-based assay results. Given that the LC-ESI-MS experiment was technically more difficult to perform, I focused on the MALDI-TOF assay and the LC-ESI-MS assay was not pursued further.

UGGT1 activity was also assayed with the commercially available UDP-Glo Glycosyltransferase Activity Assay Kit (Promega). This assay measures the amount of UDP in a reaction vessel, using an enzyme to convert UDP to ATP, which luciferase uses to generate light. The amount of UDP present is proportional to the amount of light generated. The absolute concentration of UDP is calculated by creating a UDP standard curve. This assay measures UGGT1 activity in a very different way than the MS-based assays. The MS-based assays directly detect the substrate and product, whereas the luminescence-based assay is an indirect measure of the UDP product produced. Moreover, UDP can be generated from UDP-glucose by hydrolysis, an outcome that needs to be controlled for. I assayed UGGT1 with 25 μM RiboB, RiboS, or S-protein, and measured the amount of UDP produced at several time points in order to generate reaction progress curves. S-protein is glucosylated faster than RiboS, and RiboB is not measurably glucosylated (Figure 17b). The Vo for S-protein was again not calculated due to the lack of data points in the linear range of the reaction, but the Vo for RiboS was calculated to be 0.10 μM/min. This value cannot be compared to the aforementioned Vo values, as this assay was performed at a lower substrate concentration (25 μM). This assay provides an independent confirmation of UGGT1 activity. It validates the results that I have obtained with the MALDI-TOF-based assay and provides an alternate assay for acceptor substrates that may not be amenable to MS analysis.

Figure 17. Alternative activity assays

(a) Progress curve generated from the LC-ESI-MS-based activity assay. RiboB, blue; RiboS, orange. Error bars represent standard error (N=3). (b) Progress curve generated from the UDP-Glo glycosyltransferase assay kit. Product concentration was estimated from a UDP standard curve. RiboB, blue; RiboS, orange; S-protein, green. Error bars represent standard error (N=2).

3.4.3 Effect of temperature

UGGT1 activity towards S-protein and RiboS was measured using the MALDI-TOF-based activity assay at three different temperatures: 24 °C, 30 °C, and 37 °C (Figure 18). The Vo of the reaction with 50 μM S-protein was lowest at 24 °C and increased with temperature up to 37 °C. Interestingly, no glucosylation of 50 μM RiboS was detected when the reaction was performed at 24 °C or 30 °C; it was only glucosylated at 37 °C. This is consistent with the CD melt and ribonuclease activity melt indicating that RiboS becomes misfolded in this temperature range. At 24 °C and 30 °C, RiboS is predominantly folded, and consequently is not a substrate for UGGT1.

Figure 18. UGGT1 activity across temperature

The effect of temperature on UGGT1 Vo towards 50 μM S-protein, RiboS, and RiboB is shown here. S-protein is glucosylated at each temperature. There was no measurable glucosylation of RiboS in 100 minutes at temperatures below 37 °C, and there was no measurable glucosylation of RiboB at all.

3.4.4 S-protein glycosylation mutant forms

The activity of UGGT1 towards S-protein was also measured for the two glycosylation variants (S-proteinN34Q/Y76N and S-proteinN34Q/G88N) using the MALDI-TOF assay (Figure 19). UGGT1 glucosylation of S-proteinN34Q/Y76N was approximately 2.7-fold faster than S-protein, as measured by the initial velocity of the reaction. The S-proteinN34Q/G88N mutant was glucosylated approximately 1.5-fold faster than S-protein. In each case, the S-protein or S-protein variant was assayed at a concentration of 50 μM and the concentration of UGGT1 was 0.5 μM.

Figure 19. Progress curves of UGGT1 activity towards S-protein glycosylation variants

UGGT1 activity towards (a) S-protein, (b) S-proteinN34Q/Y76N, and (c) S-proteinN34Q/G88N. The concentration of UGGT1 was 0.5 μM, and S-protein, 50 μM.

3.4.5 Estimation of KM and kcat

Attempts to obtain the KM and kcat for the various substrates studied were made using the MALDI-TOF-based activity assay. To measure the KM for the donor substrate UDP-glucose, a series of UDP-glucose concentrations were used to measure Vo while the RiboS (wild-type) concentration was held constant at 50 μM (~0.78 mg/ml). The Vo of each reaction was plotted against UDP-glucose concentration to produce a Michaelis-Menten curve (Figure 20). Three such experiments were performed. The averaged KM for UDP-glucose was 260 ± 150 μM. R2 values are a measure of how well the data fit the curve and are inset on all Michaelis-Menten plots.

To estimate the KM for each acceptor substrate, UGGT1 was incubated with 1 mM UDP-glucose and a range of RiboS or S-protein concentrations. Vo was plotted against substrate concentration to generate Michaelis-Menten curves. When the acceptor substrate is at saturating levels, kcat=Vmax/[E]. Using this equation, the Vmax value estimated by curve fitting was used to calculate kcat.
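The relationship kcat = Vmax/[E] can be illustrated with a short calculation. The sketch below estimates Vmax and KM with a Hanes-Woolf linearization ([S]/Vo = [S]/Vmax + KM/Vmax), which is a simpler stand-in for the nonlinear curve fitting used in this work and weights the data differently. The function names are hypothetical, and velocities are assumed to be non-zero.

```python
def fit_hanes_woolf(substrate_um, v0_um_per_min):
    """Return (Vmax, KM) from a Hanes-Woolf plot of [S]/Vo against [S].

    The fitted line has slope 1/Vmax and intercept KM/Vmax.
    """
    if len(substrate_um) != len(v0_um_per_min) or len(substrate_um) < 2:
        raise ValueError("need at least two matching (substrate, velocity) points")
    x = list(substrate_um)
    y = [s / v for s, v in zip(substrate_um, v0_um_per_min)]  # [S]/Vo
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    vmax = 1.0 / slope
    km = intercept / slope
    return vmax, km


def kcat_per_min(vmax_um_per_min, enzyme_um):
    """kcat = Vmax / [E]; with Vmax in uM/min and [E] in uM the result is in 1/min."""
    if enzyme_um <= 0:
        raise ValueError("enzyme concentration must be positive")
    return vmax_um_per_min / enzyme_um
```

As an illustration of the units, a fitted Vmax of 10 μM/min at 0.5 μM enzyme would correspond to a kcat of 20 per minute.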

Based on the average of three replicates, the KM of RiboS (wild-type) was calculated to be 340 ± 90 μM, and the kcat, 3.3 ± 1.5 min⁻¹ (Figure 21). Based on two replicates, the KM of S-protein was found to be 650 ± 210 μM, and the kcat, 27.6 ± 4.9 min⁻¹ (Figure 22). Based on one experiment, the KM of S-proteinN34Q/Y76N was found to be 360 ± 30 μM, and the kcat, 19.6 ± 0.9 min⁻¹ (Figure 23). The KM of S-proteinN34Q/G88N was not determined, as S-proteinN34Q/G88N was not obtained in large enough amounts to allow me to assay high enough concentrations. This limited my ability to assay activity at concentrations above the KM.

Figure 20. UDP-glucose Michaelis-Menten plot.

Michaelis-Menten curves for UDP-glucose are shown here. Three individual experiments are shown. The RiboS concentration is fixed at 50 μM and UGGT1 at 0.5 μM.

Figure 21. RiboS Michaelis-Menten plot.

Michaelis-Menten curves for RiboS are shown here. Three individual experiments are shown. The UDP-glucose concentration is fixed at 1 mM and UGGT1 at 0.5 μM.

Figure 22. S-protein Michaelis-Menten plot.

Michaelis-Menten curves for S-protein are shown here. Two individual experiments are shown. The UDP-glucose concentration is fixed at 1 mM and UGGT1 at 0.5 μM.

Figure 23. S-proteinN34Q/Y76N Michaelis-Menten plot.

A Michaelis-Menten curve for S-proteinN34Q/Y76N is shown here. Only one trial was performed. The UDP-glucose concentration is fixed at 1 mM and UGGT1 at 0.2 μM.

3.4.6 Effect of metal ions

Calcium has been shown to be a requirement for UGGT1 activity44. UGGT1 has also been shown to use a Mn2+ ion as an alternative47. In agreement with these findings, I have shown that no activity is detected in the absence of metal ions, that Ca2+ and Mn2+ impart activity upon UGGT1, and that Mg2+ does not (Figure 24a). Activity increases with the concentration of each metal ion, but UGGT1 catalysis is much faster in the presence of Ca2+ ions than Mn2+ ions (Figure 24b). At high concentrations of Mn2+ (>1 mM), activity was depressed.

Figure 24. Effect of divalent metal ions on UGGT1

(a) The effect of 1 mM CaCl2, 1 mM MnCl2, 1 mM MgCl2, or H2O control on UGGT1 Vo with 50 μM RiboS. Error bars represent the standard error of the replicates (N=3). (b) Metal ion concentration dependence of CaCl2 and MnCl2. The effect of CaCl2 and MnCl2 concentration on UGGT1 Vo with 50 μM RiboS.

3.5 UGGT1 substrate binding

3.5.1 RiboB, RiboS, and S-protein

To measure UGGT1's affinity for its protein substrates, BIAcore SPR was employed. UGGT1 was amine coupled to a CM5 BIAcore chip and unreacted groups were capped with ethanolamine. A reference channel was prepared in the same way, but with a mock injection devoid of UGGT1. The running buffer contained 5 mM CaCl2 and 1 mM UDP in all cases. 10 μl of each substrate was injected over the chip at 25 °C and response units (RU) were measured.

At 33 μM, RiboB does not display measurable binding, but RiboS and S-protein do (Figure 25). S-protein plateaus at a much higher RU value than RiboS, indicating that a higher proportion of the molecule is bound. ΔRU was calculated by subtracting the RU value at baseline from the RU value at the plateau. This is indicative of the relative levels of binding between molecules when they are of equal concentration.

Figure 25. SPR sensorgram showing RiboB, RiboS, and S-protein binding to a UGGT1-coupled chip

RiboB, RiboS, and S-protein were each injected over a UGGT1-coupled chip at 33 μM. A representative sensorgram for each substrate is shown.

3.5.2 KD estimation

To estimate the KD of binding for each of the four S-protein forms (S-protein, S-proteinN34Q, S-proteinN34Q/Y76N, and S-proteinN34Q/G88N), each form was first serially diluted. Each of these S-protein dilutions was injected over the UGGT1 chip in the presence of 5 mM CaCl2 and 1 mM UDP. Sensorgrams for each concentration of all four S-protein variants are displayed in Figure 26. The ΔRU of each injection was calculated and plotted against substrate concentration to generate a binding curve. All four binding curves from the four substrates are plotted in Figure 27.
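A binding curve of ΔRU against concentration can be turned into a KD estimate by fitting a saturation model. The sketch below assumes a steady-state 1:1 binding model, ΔRU = Rmax·[S]/(KD + [S]), and uses a simple grid search over KD with the best-fitting Rmax solved in closed form at each step. The thesis does not state which model or software was used for the fit, so the model, the search range, and the function names are all assumptions.

```python
def delta_ru(plateau_ru, baseline_ru):
    """Binding response: plateau signal minus the pre-injection baseline."""
    return plateau_ru - baseline_ru


def fit_kd(conc_um, delta_ru_values, kd_min=1.0, kd_max=10000.0, n_grid=2000):
    """Return (KD, Rmax) minimizing squared error for dRU = Rmax * C / (KD + C).

    KD is scanned on a logarithmic grid; for each candidate KD the optimal
    Rmax follows from linear least squares, so no iterative solver is needed.
    """
    if len(conc_um) != len(delta_ru_values) or len(conc_um) < 2:
        raise ValueError("need at least two matching (concentration, dRU) points")
    best = None
    for k in range(n_grid):
        kd = kd_min * (kd_max / kd_min) ** (k / (n_grid - 1))
        occupancy = [c / (kd + c) for c in conc_um]
        denom = sum(f * f for f in occupancy)
        if denom == 0:
            continue  # all concentrations are zero; nothing to fit
        rmax = sum(f * y for f, y in zip(occupancy, delta_ru_values)) / denom
        sse = sum((y - rmax * f) ** 2 for f, y in zip(occupancy, delta_ru_values))
        if best is None or sse < best[0]:
            best = (sse, kd, rmax)
    if best is None:
        raise ValueError("could not fit: no usable concentrations")
    return best[1], best[2]
```

A KD estimated this way is only reliable when the highest injected concentration is well above the KD, which is the limitation noted for the variants that were available in small amounts.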

Figure 27c shows that S-proteinN34Q/Y76N bound with a KD of 76 ± 8 μM, the lowest KD measured. Figure 27a and Figure 27b show the fitted binding curves for S-protein and S-proteinN34Q, both of which bind more weakly, displaying KDs of 450 ± 30 μM and 230 ± 10 μM, respectively. The S-proteinN34Q/G88N was not purified in large enough amounts to perform injections at high concentrations and a KD for this molecule was not determined (Figure 27d).

Figure 26. Sensorgrams from S-protein variant titrations

Sensorgrams from which ΔRU values were calculated for each S-protein variant. Each panel shows multiple concentrations of S-protein injected over a UGGT1-coupled chip. (a) S-protein, (b) S-proteinN34Q, (c) S-proteinN34Q/Y76N, and (d) S-proteinN34Q/G88N.

Figure 27. KD binding curves for S-protein forms

A binding curve for each S-protein form was estimated by SPR. (a) S-protein, (b) S-proteinN34Q, (c) S-proteinN34Q/Y76N, and (d) S-proteinN34Q/G88N. Error bars represent the standard error of two injections.

3.5.3 Glycan involvement in binding

Each of the four S-protein forms (S-protein, S-proteinN34Q, S-proteinN34Q/Y76N, and S-proteinN34Q/G88N) was treated with a 1:100 (v/v) dilution of 0.4 mg/ml of the endoglycosidase EndoH overnight. MALDI-TOF MS was used to confirm that each glycosylated substrate was completely deglycosylated overnight. Using SPR, when each substrate (glycosylated and deglycosylated) was injected at 12.5 μM over the UGGT1 chip, it was found that EndoH treatment only affected binding of the S-proteinN34Q/Y76N form. In this case, the relative RU was reduced by 49%. In the case of the other three variants, the RU value was relatively unchanged after EndoH glycan removal.

The measured characteristics of the four S-protein variants are summarized in Table 2.

Table 2. Summary of S-protein variants in activity and binding experiments

Characteristics of the four S-protein variants. The blue highlighted values result from kinetics or binding studies. The grey highlighted values result from experiments performed at a single fixed concentration of the S-protein variant.

Variant               KM (μM)     kcat (min⁻¹)   KD (μM)    Vo (50 μM       Normalized RU       EndoH treatment
                                                            variant)        (12.5 μM variant)   (12.5 μM variant)
S-protein             650 ± 210   27.6 ± 4.9     450 ± 30   2.0 μM/min      0.48                No change
S-proteinN34Q         -           -              230 ± 10   -               0.97                No change
S-proteinN34Q/Y76N    360 ± 30    19.6 ± 0.9     76 ± 8     5.5 μM/min      1                   ↓
S-proteinN34Q/G88N    -           -              -          3.0 μM/min      0.40                No change

Discussion and Conclusions

4.1 Protein expression and purification

4.1.1 Protein expression

UGGT1 and RiboB were expressed using the PiggyBac expression system developed by our group110. This system can be used to generate stable bulk cell cultures by co-transfecting the gene of interest, a transposase (PBase), and a drug resistance marker (blasticidin/puromycin resistance). The transiently expressed PBase integrates the gene of interest and the resistance marker into the host genome, and blasticidin/puromycin are used for selection. The PB-T-PAF vector was designed with affinity purification in mind, as it contains a signal peptide for secretion, a Protein A tag for purification, and a TEV protease cleavage site for removal of the Protein A tag. The fusion protein is secreted into the media and the Protein A tag and TEV cleavage site allow affinity purification and on-column cleavage of the tag. All constructs used in this work were built with this PB-T-PAF vector.

It is important that these proteins were produced in eukaryotic cells since UGGT1 is a large multi-domain protein that likely relies on eukaryotic chaperones and folding machinery to achieve its proper disulfide bonding pattern and native fold. Perhaps it even requires UGGT1/CNX/CRT interactions involving its N-glycan at Asn269. The HEK293S GnTI- cells used are deficient in Golgi N-acetylglucosaminyltransferase I (GnTI). Therefore these cells do not secrete glycoproteins with complex or hybrid glycans; they produce only high-mannose glycans115. This facilitates subsequent glycan removal for crystallographic purposes using EndoH. Glycans are usually highly flexible and can inhibit crystallization. Since one of the end-goals of this project is to crystallize UGGT1, this cell line was chosen for the production of UGGT1.

For RiboB, eukaryotic cell culture was required to enable N-glycosylation. Since UGGT1 shows a preference for the Man9GlcNAc2 glycan38,44,70, the mannosidase inhibitor kifunensine was used in the culture medium. Kifunensine inhibits mammalian α-mannosidases116,117. Its use in cell culture results in an increased proportion of Man9GlcNAc2 N-glycosylated proteins in a dose-dependent fashion. I have used kifunensine at 2 mg/L in the cell culture media, which produces >95% Man9GlcNAc2-glycosylated RiboB. As with UGGT1, the use of a mammalian expression system also helps to ensure that RiboB possesses the proper disulfide bonding pattern and fold.

The RiboB HEK293F cells used for wild-type RiboB expression produced more than 15 mg of purified protein per litre of media. However, after purification, cleavage, and more purification, only about 2 mg of S-protein is purified per litre of media. In addition, less than half of the protein was glycosylated, an outcome further reducing the yield of usable material.

It is worth noting that the purification yield of UGGT1 was quite low. I obtained only about 0.7-0.8 mg of purified protein per litre of media. By comparing protein A western blots of the RiboB media to the UGGT1 media, it is clear that the UGGT1 construct is expressed at lower levels. The yield is lower than that of most of the other PiggyBac constructs in our lab, which tend to produce >1 mg/L of purified protein, and in the case of RiboB, >15 mg/L purified protein. The reasons for this are not clear, but at 1819 amino acids, the Protein A-UGGT1 fusion does have a particularly long sequence. UGGT1 also has a complex domain topology that may be expected to have reduced folding efficiency relative to a protein with a simpler topology. Lastly, UGGT1 may have been aggregating or getting proteolyzed in the media during expression while incubating at 37 °C for 3-4 days. If this is the case, it would reduce purification yield.

4.1.2 RiboB, RiboS, and S-protein

The RiboB/RiboS/S-protein system was used because it is a widely studied model for protein folding, and previous UGGT1 studies have shown that RiboS/S-protein is glucosylated by UGGT122,72. In addition, RiboB is a singly glycosylated protein, which we reasoned might lead to more homogeneous enzyme-substrate complexes for crystallization trials.

The traditional method for producing RiboS from RiboB uses the non-specific protease subtilisin80. Subtilisin-treated RiboB is clipped between Ala20 and Ser21, producing RiboS. However, with this protease, non-specific degradation makes it difficult to obtain pure RiboS in high yield. The approach of introducing an EK proteolytic cleavage site (DDDDK) between Ala20 and Ser21 was published by Watkins, Arnold, and Raines in 2011118. We have adopted this method because it allows RiboS production with high yield and no non-specific cleavage after treatment with EK.

4.1.3 UGGT1 purification

UGGT1 eluted as a relatively symmetrical peak from the final SEC column. One major problem with UGGT1 is that it is proteolyzed rapidly. By the end of the 3-column purification procedure over 2 days, UGGT1 already shows some proteolysis on an SDS-PAGE gel (Figure 8a). Storage in 40% glycerol at -20 °C with protease inhibitors counters this proteolysis. The protease inhibitor cocktail contains molecules that inhibit the trace levels of proteases that were not purified away from UGGT1, and low temperature slows protease activity.

The proteolysis products of UGGT1 were analyzed with SDS-PAGE and MALDI-TOF MS. Using SDS-PAGE, the UGGT1 proteolysis bands were compared to the MW standards to estimate their MWs as 120 kDa, 145 kDa, and 170 kDa. The three MALDI-TOF measured MWs were 173 kDa, 143 kDa, and 121 kDa. The 173 kDa fragment probably corresponds to full-length UGGT1. It is possible that the 143 kDa and 121 kDa fragments correspond to UGGT1 that has lost its GT or GT+β2-domain, respectively (see Figure 3). UGGT is already thought to have flexibility between the GT domain and the N-terminal region52, so proteolysis of the linker is not surprising. The published full-length crystal structures were missing a 38 amino acid segment between the β2 and GT domains.

4.2 RiboB, RiboS and S-protein properties

4.2.1 Circular Dichroism

Circular dichroism melts showed that RiboB, RiboS, and S-protein differ in terms of their thermostabilities. The far-UV range (200-250 nm) measures the CD of peptide bonds, and is indicative of secondary structure such as α-helices and β-strands119. The near-UV range (250-300 nm) gives information about protein tertiary structure, and in the case of RiboB, has been shown to be representative of tyrosine residues, particularly Tyr73 and Tyr115120. A thermal transition in the far-UV range is indicative of secondary structure unfolding, while a thermal transition in the near-UV range is attributed to unfolding of tertiary structure121.

I have shown that RiboB, RiboS, and S-protein each undergo a similar, cooperative, reversible CD transition in the far-UV range at 240 nm (Figure 12a). This is in agreement with previous work on Ribonuclease A (the non-glycosylated form of RiboB) which shows the same effect122. I have estimated the TM of RiboB, RiboS, and S-protein to be 55 °C, 44 °C, and 33 °C, respectively. This is also in agreement with previous work123,124.

RiboB, RiboS, and S-protein all contain four disulfide bonds. These bonds are not broken by heat and the “unfolded” state observed here is most likely a disulfide-intact protein that has lost elements of its secondary structure. The presence of the disulfide bonds presumably assists with refolding when the temperature is lowered. The proteolytic cleavage introduced into RiboB to generate RiboS caused a decrease in thermostability. Although the overall structure is quite similar, RiboS differs from RiboB in that there are several slight displacements of loop regions and β-strands that result in different biophysical properties124. When either RiboB or RiboS is heated, the S-peptide α-helix unfolds (in addition to other secondary structure elements). Since RiboS is cleaved, when the S-peptide unfolds, it subsequently dissociates from the rest of the protein124,125. When the temperature is again lowered, the S-peptide refolds into a helix and rebinds to the S-protein with high affinity. Thus, RiboS is reconstituted.

Since S-protein does not contain the S-peptide, the S-peptide cannot provide the stabilizing intramolecular interactions that are present in RiboB and RiboS. For this reason, S-protein is less thermally stable, as was observed by the CD melt.

The CD melt showed that RiboB began to melt at a temperature above 37 °C, suggesting that it is mostly natively folded at this temperature. Consistent with this observation, I have shown that UGGT1 does not glucosylate RiboB. At 37 °C, S-protein is above its unfolding transition of 33 °C, while RiboS is approaching its unfolding transition of 44 °C. Therefore, at this temperature, both proteins would be expected to appreciably populate unfolded conformations. Correspondingly, they are both glucosylated by UGGT1 at 37 °C. At 30 °C, the thermal melt indicates that RiboB and RiboS are mostly natively folded, and an appreciable amount of S-protein is misfolded. Accordingly, only S-protein is glucosylated, and to a lesser degree than at 37 °C.

4.2.2 Ribonuclease activity

Enzyme activity varies with temperature. Typically, activity increases with temperature up to a point beyond which it begins to drop. This is due to two competing factors: (a) the added energy in the system causes more frequent and higher-energy intermolecular collisions, resulting in faster rates of chemical reaction, and (b) proteins denature or aggregate at high temperatures, and this typically inactivates them. As a general guideline, it is said that enzyme activity approximately doubles for every 10 °C temperature increase126.
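The doubling guideline can be written as a simple Q10 relation. The following is a minimal sketch, assuming a constant Q10 of 2 and no thermal inactivation; the function name and the reference temperature are illustrative choices and do not come from the assays reported here.

```python
def relative_activity(temp_c, ref_temp_c=25.0, q10=2.0):
    """Activity relative to ref_temp_c under a simple Q10 rule.

    Assumes a constant Q10 and no denaturation, so it only describes
    the rising part of an activity-temperature profile.
    """
    return q10 ** ((temp_c - ref_temp_c) / 10.0)


# A 10 degree increase doubles the predicted activity; 20 degrees quadruples it.
print(relative_activity(35.0))
print(relative_activity(45.0))
```

A real activity profile departs from this curve once the enzyme begins to unfold, which is the behaviour of interest in the activity melts discussed here.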

RiboB and RiboS are enzymes that cleave RNA. In this work, I showed that RiboS experienced a rapid decrease in activity above 42 °C, while RiboB activity continued to increase up to 63 °C. The cleavage between the S-peptide and S-protein segments (between Ala20 and Ser21) therefore reduced the thermostability of RiboS relative to RiboB, and the proteolyzed RiboS becomes enzymatically inactivated at a lower temperature. Dissociation of the S-peptide would abrogate ribonuclease activity, which could explain this difference: although RiboB does eventually become inactivated at high temperatures, the S-peptide of RiboS unfolds and dissociates at a lower temperature. There is a discrepancy between the two assays, in that RiboB retains enzymatic activity at temperatures where the CD melt shows that it has completed its unfolding transition. This may be the result of the different conditions used. In particular, the activity assay was performed in the presence of the substrate, RNA, which could provide some stability to the enzyme.

RiboB and RiboS contain the catalytic His12 and His119 residues127, but S-protein lacks His12 and is therefore inactive. This prevented me from assessing its thermostability with an activity assay.

4.3 UGGT1 activity

4.3.1 MALDI-TOF-based activity assay

The MALDI-TOF-based activity assay assesses the relative quantities of substrate and product at various time points. MALDI-TOF is tolerant to salts and buffers, so I did not need to remove any of the reaction components before mass spectrometry. This makes the assay technically easy to perform. I took an aliquot from the reaction vessel, mixed it with formic acid/sinapinic acid/ethanol, and pipetted it onto a MALDI-TOF target plate. The relative quantities of substrate and product were then measured by MALDI-TOF. RiboB, RiboS, and S-protein are all very amenable to study in this way, as the substrate and product peaks are baseline separated with no overlap. Little optimization and no desalting were required to achieve this separation. Larger proteins (which have more atoms and consequently have broader peaks due to isotope effects) may not be as amenable to this activity assay, since the 162 Da difference may not be well-resolved. For such proteins, desalting the sample and/or optimization of the matrix solution may allow one to better resolve the substrate and product peaks.
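The relative quantification described above amounts to a ratio of two peak intensities. A minimal sketch follows, assuming that the substrate and its glucosylated product ionize with similar efficiency (an assumption, not something established here); the function name and the intensities are hypothetical.

```python
def fraction_glucosylated(substrate_intensity, product_intensity):
    """Fraction of acceptor converted to product from two peak intensities."""
    total = substrate_intensity + product_intensity
    if total <= 0:
        raise ValueError("peak intensities must sum to a positive value")
    return product_intensity / total


# Hypothetical (substrate, product) intensities in arbitrary units,
# keyed by reaction time in minutes.
time_course = {0: (1000.0, 0.0), 10: (750.0, 250.0), 30: (400.0, 600.0)}
for minutes, (substrate, product) in time_course.items():
    print(minutes, round(fraction_glucosylated(substrate, product), 2))
```

Plotting this fraction against time gives the progress curve from which an initial rate can be read.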

4.3.2 RiboB, RiboS, and S-protein

RiboB, RiboS, and S-protein are differentially glucosylated by UGGT1. Native RiboB is not measurably glucosylated within a few hours at 37 °C, while RiboS and S-protein are. S-protein is glucosylated by UGGT1 at 30 °C, showing that UGGT1 is active at 30 °C. The absence of RiboS glucosylation at this temperature therefore supports the idea that RiboS is in a folded conformation at 30 °C. Only when it is warmed to 37 °C does it become a UGGT1 substrate. This is because at this temperature, RiboS has begun its unfolding transition, and an appreciable proportion of it is in the unfolded state. At 24 °C, RiboS resembles a folded protein. Consequently, it is not glucosylated.

Both the CD melt and ribonuclease activity melt data suggest that RiboS has already begun to unfold at 37 °C. Although 37 °C is below the midpoint of the unfolding transition, the folded and unfolded forms are in equilibrium. At 37 °C, an appreciable proportion of RiboS would be expected to be in the misfolded conformation. UGGT1 glucosylates nearly all of the misfolded RiboS molecules because each molecule spends an appreciable amount of time in a misfolded state at 37 °C.
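The equilibrium argument can be illustrated with a two-state unfolding model. This is a minimal sketch, not the analysis used in this thesis: it treats the reported transition temperatures (33 °C for S-protein, 44 °C for RiboS) as midpoints, assumes a van't Hoff enthalpy of 300 kJ/mol (a hypothetical value), and ignores heat-capacity changes.

```python
import math

R = 8.314  # gas constant, J/(mol*K)


def fraction_unfolded(temp_c, tm_c, delta_h=300e3):
    """Two-state fraction unfolded at temp_c for a transition midpoint tm_c.

    delta_h (J/mol) is an assumed van't Hoff enthalpy, not a measured value.
    """
    temp_k = temp_c + 273.15
    tm_k = tm_c + 273.15
    delta_g = delta_h * (1.0 - temp_k / tm_k)  # unfolding free energy
    k_unfold = math.exp(-delta_g / (R * temp_k))
    return k_unfold / (1.0 + k_unfold)


for name, tm_c in (("S-protein", 33.0), ("RiboS", 44.0)):
    print(name, [round(fraction_unfolded(t, tm_c), 3) for t in (30.0, 37.0)])
```

Under these assumptions only a minority of RiboS is unfolded at 37 °C at any instant. Because the folded and unfolded forms interconvert, that minority is continually replenished, which is consistent with nearly complete glucosylation over time.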

4.3.3 Estimation of KM and kcat

Using 50 μM RiboS as a substrate, I have estimated the KM of the donor substrate UDP-glucose to be about 260 ± 150 μM. This is higher than the UDP-glucose KM of 44 μM reported for rat UGGT1 when using 0.5 μM soybean agglutinin as an acceptor substrate56, or 69 μM reported for Drosophila UGGT when using saturating levels of the synthetic dye Man9-TAMRA69. The discrepancy between these numbers and my reported value likely results from different assay conditions: different CaCl2 levels, different pHs, and different UGGT homologs. It is also possible that the acceptor substrate modulates the donor KM. In my work and the published studies, it is difficult to ensure that the acceptor substrate is at high enough concentration to not be limiting (owing to solubility for example). This is also a source of inaccuracy in the measurement of the donor KM since this is a two-substrate enzyme also involving a protein acceptor.

For RiboS, the KM was found to be 340 ± 90 μM. In these experiments, the UDP-glucose concentration was held at 1 mM, a value which is well above the KM reported by other groups, but only 3 to 4 times the UDP-glucose KM that I measured. Moreover, Vmax was not reached when varying the acceptor substrate. In order to accurately measure the KM of RiboS, higher concentrations of both the donor and acceptor substrates must be tested. It is commonly accepted that one should assay the substrate whose KM is being determined at concentrations ranging from 0.1 to 10 times the KM. However, the purified yields of RiboS (and S-protein) were low, and solubility also limited my attempts to achieve the required concentrations. Conceivably, glycerol, detergents, or other additives could be incorporated into the assay to remedy acceptor solubility problems.
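To show what 1 mM UDP-glucose means for donor saturation, the Michaelis-Menten expression v/Vmax = [S]/(KM + [S]) can be evaluated at the two donor KM values discussed above. This is a rough sketch that treats the donor as if it were the only substrate, which is a simplification for a two-substrate enzyme; the function name is illustrative.

```python
def fractional_saturation(conc_um, km_um):
    """v/Vmax for a Michaelis-Menten enzyme at one substrate concentration."""
    return conc_um / (km_um + conc_um)


UDP_GLC_UM = 1000.0  # donor concentration used in the acceptor titrations
for label, km_um in (("KM measured here", 260.0), ("rat UGGT1 KM", 44.0)):
    print(label, round(fractional_saturation(UDP_GLC_UM, km_um), 2))
# prints 0.79 for a KM of 260 uM and 0.96 for a KM of 44 uM
```

With the donor KM measured here, the enzyme would be only about 80% saturated with UDP-glucose, so the acceptor KM and kcat values are apparent values under these conditions.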

KM was similarly estimated for two S-protein variants: S-protein (650 ± 210 μM) and S-proteinN34Q/Y76N (360 ± 30 μM). Again, enzyme saturation was not reached, limiting our confidence in these values. The S-protein KMs would not be expected to be higher than that of RiboS since UGGT1 glucosylates S-protein faster than RiboS (Figure 16b, c), S-protein shows greater binding to a UGGT1-coupled BIAcore chip at a fixed concentration (Figure 25), and S-protein was shown to unfold at a lower temperature (Figure 12).

The observed kcat values of RiboS, S-protein, and S-proteinN34Q/Y76N were 3.3 ± 1.5 min-1, 27.6 ± 4.9 min-1, and 19.6 ± 0.9 min-1, respectively. However, it should again be noted that in no case was enzyme saturation achieved in the various enzyme kinetic experiments. Since the Vmax (and consequently kcat) was estimated by extrapolation of the Michaelis-Menten curves, this is a considerable source of error in the measurement of these kcat values. To explore this, if I alter the RiboS or S-protein dataset by increasing the highest concentration data point by 5%, the resulting Vmax estimated by curve fitting increases by 9-28% and the KM increases by 14-33%. This shows how much the kcat and KM calculations depend on the accuracy of the highest concentration data point. If this point is not accurate, then the calculated values will be highly inaccurate. In order to be more confident with the results, higher substrate concentrations must be assayed.
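The sensitivity test described above can be reproduced on synthetic data. The sketch below is not the fitting procedure used in this thesis: it is a plain least-squares Michaelis-Menten fit written without external libraries, and the "true" parameters and concentrations are hypothetical, chosen so that the highest concentration lies only slightly above the KM.

```python
import math


def fit_michaelis_menten(concs, rates):
    """Least-squares fit of v = Vmax*S/(KM+S); returns (vmax, km).

    For a trial KM the best Vmax follows from linear least squares, so only
    KM is searched (log-spaced grid, then golden-section refinement).
    """

    def best_vmax(km):
        x = [s / (km + s) for s in concs]
        return sum(v * xi for v, xi in zip(rates, x)) / sum(xi * xi for xi in x)

    def sse(log_km):
        km = 10.0 ** log_km
        vmax = best_vmax(km)
        return sum((v - vmax * s / (km + s)) ** 2 for s, v in zip(concs, rates))

    grid = [i * 0.05 for i in range(101)]  # log10(KM) from 0 to 5
    best = min(grid, key=sse)
    lo, hi = best - 0.05, best + 0.05
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(60):  # golden-section search inside the bracket
        a = hi - inv_phi * (hi - lo)
        b = lo + inv_phi * (hi - lo)
        if sse(a) < sse(b):
            hi = b
        else:
            lo = a
    km = 10.0 ** ((lo + hi) / 2.0)
    return best_vmax(km), km


TRUE_VMAX, TRUE_KM = 1.0, 350.0  # hypothetical parameters
concs = [25.0, 50.0, 100.0, 200.0, 400.0]  # top point barely above KM
rates = [TRUE_VMAX * s / (TRUE_KM + s) for s in concs]

vmax0, km0 = fit_michaelis_menten(concs, rates)
bumped = rates[:-1] + [rates[-1] * 1.05]  # raise the top point by 5%
vmax1, km1 = fit_michaelis_menten(concs, bumped)
print(f"Vmax change: {100 * (vmax1 / vmax0 - 1):.0f}%")
print(f"KM change: {100 * (km1 / km0 - 1):.0f}%")
```

When the data stop short of saturation, a small change in the top point shifts both fitted parameters upward by a noticeably larger percentage, which is the behaviour reported for the real datasets.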

The lack of confidence in the KM and kcat values does not imply that the MALDI-TOF assay data cannot be interpreted. I am much more confident in the calculation of the initial velocity, Vo, and these data show that S-protein is a better UGGT substrate than RiboS, a result consistent with the observation that S-protein melts at a lower temperature than RiboS and that UGGT glucosylates misfolded proteins.


4.3.4 Metal ions

I have found that Ca2+ and Mn2+, but not Mg2+, can be used by UGGT1 and that there is no activity without a divalent metal ion. This is in agreement with previous experiments on UGGT147. The recent crystal structure of C. thermophilum UGGT has shown that Ca2+ is indeed bound by the DXD motif49, and this is presumably where the Mn2+ ion can be coordinated as well. The Ca2+ ion is used to coordinate the negatively charged phosphate groups of the nucleotide sugar donor substrate52 (for more see Section 1.2.2.2).

Calcium enters the ER via ATPase transporters. However, there have been no identified Mn2+ transporters in the ER, and it is unlikely that Mn2+ exists in the ER lumen at appreciable concentrations. For this reason, it is unknown if Mn2+ plays a role in the cellular function of UGGT, or if it simply displays activity because of its chemical similarity to Ca2+. Based on the absence of known Mn2+ transporters and the abundance of Ca2+ ions in the ER lumen, it is probable that in vivo, UGGT uses Ca2+ for activity, not Mn2+.

4.4 UGGT1 substrate binding

In addition to the enzyme kinetic work, binding affinities for the various substrates were also measured using SPR. This has allowed me to compare the KD values of my acceptor substrates and to study de-glycosylated or non-glycosylated forms of my acceptors that are no longer competent for glucosylation. By comparing the UGGT1 binding affinity of my acceptors with that of their de-glycosylated or non-glycosylated forms, I was able to gain insight into the respective roles that protein and carbohydrate play in the interaction with UGGT1. Modified acceptors not competent for glucosylation could also have been studied through their ability to inhibit the glucosylation of a competent acceptor, but this approach was not pursued in this thesis.

4.4.1 RiboB, RiboS, and S-protein

The SPR binding experiments were performed at 25 °C and it was found that S-protein and RiboS displayed binding to UGGT1 and RiboB did not. This is consistent with the finding that S-protein and RiboS are appreciably misfolded at this temperature and the idea that UGGT1 selectively recognizes misfolded proteins. Indeed, the binding to S-protein was considerably greater than that of RiboS, an observation consistent with the fact that the former has a lower melting temperature than the latter.


4.4.2 KD Estimation

When several concentrations of S-protein were injected over the UGGT1 chip, each of the ΔRU values was measured and plotted against S-protein concentration to produce binding curves. The tightest binding substrate, S-proteinN34Q/Y76N, displayed a KD of 76 ± 8 μM. This is a higher affinity than the other measured KD values of 450 ± 30 μM and 230 ± 10 μM for S-protein and S-proteinN34Q, respectively. In order to estimate a KD for the S-proteinN34Q/G88N form, higher concentrations of S-proteinN34Q/G88N would be required.

Based on this binding assay, S-proteinN34Q/Y76N binds the most tightly to UGGT1 and the complex should be subjected to crystallization trials. Including S-proteinN34Q/Y76N at a concentration well above 76 μM will ensure that most of the UGGT1 molecules are bound to S-proteinN34Q/Y76N. Since concentrations near 600 μM (nearly 8 times the KD) were attainable and soluble during activity assays, achieving a high enough concentration of S-proteinN34Q/Y76N for crystallography will be possible.
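The occupancy argument can be made explicit with the 1:1 binding expression, fraction bound = [S]/(KD + [S]). This is a minimal sketch, assuming simple 1:1 binding with the substrate in large excess over UGGT1; the function names are illustrative.

```python
def fraction_bound(conc_um, kd_um):
    """Fraction of UGGT1 occupied, assuming 1:1 binding and ligand in excess."""
    return conc_um / (kd_um + conc_um)


def conc_for_occupancy(target, kd_um):
    """Ligand concentration (uM) giving the target fractional occupancy."""
    if not 0.0 < target < 1.0:
        raise ValueError("target occupancy must be between 0 and 1")
    return kd_um * target / (1.0 - target)


KD_UM = 76.0  # reported KD for S-proteinN34Q/Y76N
print(round(fraction_bound(600.0, KD_UM), 2))     # 0.89
print(round(conc_for_occupancy(0.90, KD_UM)))     # 684
```

Under these assumptions, 600 μM substrate would leave roughly one in nine UGGT1 molecules unoccupied, and reaching 90% occupancy would require close to 700 μM.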

The amount of signal measured at saturating substrate concentrations is termed the Bmax. KD is defined as the substrate concentration that produces a signal half that of Bmax. Because S-protein, S-proteinN34Q, and S-proteinN34Q/Y76N are all approximately the same molecular weight, the Bmax was expected to be the same for each of these proteins when measured on the same UGGT1-coupled chip (as was done in these experiments). This was not the case. Using GraphPad to fit the curves, the estimated Bmax values were 1140 RU, 950 RU, and 520 RU for S-protein, S-proteinN34Q, and S-proteinN34Q/Y76N, respectively. This variation could indicate that more data points from concentrations above the KD are needed to saturate UGGT1 and reliably estimate KD and Bmax. Particularly for S-protein and S-proteinN34Q, more data points at higher concentrations are needed to confirm these KD and Bmax values. Unfortunately, when substrates were injected at higher concentrations (>400 μM), the sensorgrams were indicative of effects likely arising from protein aggregation and/or nonspecific interactions. To address this possibility, detergents would have to be tested, as they are often used in the measurement of protein-protein interactions by SPR.

If the Bmax values obtained by curve fitting are to be believed, and they do in fact vary by more than 2-fold, then perhaps this indicates differences between the S-protein variants. For some S-protein variants, two S-protein molecules may bind to UGGT1, either at two locations or by forming dimers before binding. SPR measures all proteins that are in close proximity to the immobilized layer, and dimers would produce twice as much signal. Alternatively, the substrates could have been interacting with the surface of the reference channel of the SPR chip. If some S-protein variants aggregate or interact more strongly with the reference channel, then this will decrease the measured RU. This could also account for the variance in Bmax values.

4.4.3 S-protein variants

Each of the three Man9GlcNAc2-containing S-protein glycosylation variants (S-protein, S-proteinN34Q/Y76N, and S-proteinN34Q/G88N) was glucosylated by UGGT1. This shows that the glycan need not be in a precise location to allow for UGGT glucosylation. This is expected, as UGGT is thought to recognize a wide variety of glycoproteins.

Additionally, the binding assay showed that all four S-protein variants bound to UGGT1, including the non-glycosylated S-proteinN34Q. This is an important result, as it shows that UGGT1 can bind non-glycosylated misfolded substrates. Furthermore, the EndoH de-glycosylated S-protein variants also bound to UGGT1. At a concentration of 12.5 μM, the RU value was the same between the glycosylated and de-glycosylated forms of S-protein and S-proteinN34Q/G88N, and about 2-fold less for S-proteinN34Q/Y76N. This indicates that UGGT1 is primarily recognizing these substrates via their misfolded regions, and not the glycan.

In order to get a deeper understanding of the effect caused by the glycan, one would need to measure the differences in affinity (KD) between the glycosylated and de-glycosylated S-protein forms. In the experiment presented here, I have only tested the proteins at one concentration (12.5 μM).

4.5 UGGT substrate recognition

All known UGGT substrates have two features in common: a misfolded component and a Man9GlcNAc2 glycan. Given that UGGT contains an N-terminal region important for substrate recognition and a C-terminal GT domain for catalysis, it follows that UGGT likely recognizes both the misfolded component and the glycan component. Flexibility between the N-terminal region and the GT domain contributes to conformational changes, allowing for accommodation of different substrates. This flexibility could allow UGGT to form a bipartite interaction with at least some of its substrates. In this mode of substrate recognition, UGGT binds both the misfolded determinant and the glycan moiety simultaneously (Figure 28a). Having two points of contact with the substrate allows for an increase in binding affinity relative to just one or the other.

Figure 28. UGGT mechanisms of glucosylation

UGGT is depicted in yellow with the catalytic GT domain on the right and the N-terminal region (composed of the trx1, trx2, trx3, trx4, β1, and β2-domains) on the left. The misfolded protein binding surface is located on the N-terminal region, and the glycan binding site is on the GT domain. A red misfolded glycoprotein is depicted with an N-glycan. (a) Bipartite mode: the misfolded region and the N-glycan moiety are bound by UGGT simultaneously. (b) Local concentration mode: by interacting with the misfolded region of the protein alone, the local concentration of the N-glycan is increased after dissociation. (c) Protein-independent mode: only the carbohydrate is involved in binding. This mode is not expected to be responsible for an appreciable level of glucosylation.

However, the present data indicate that other modes of substrate recognition may also be operative. Importantly, UGGT was shown to bind some of its substrates even in the absence of their N-glycans. This suggests a mode of interaction that is not bipartite. In this mode, binding is proposed to be mediated by misfolded protein determinants alone and, following dissociation from UGGT, the substrate re-orients to present the N-glycan to the catalytic site for glucosylation (Figure 28b). In this way, glucosylation is promoted by a “local concentration effect” that serves to increase the concentration of the glycan on misfolded proteins in the proximity of the GT domain.

As shown in Figure 28c, it is also possible that UGGT glucosylation might be mediated by interactions with the carbohydrate alone. This mode of interaction would not select for misfolded glycoprotein substrates and is presumably not favored since a carbohydrate-only interaction would be expected to be weak.

Our results suggest a rethinking of how UGGT recognizes its diverse substrates. Perhaps only some substrates are recognized by UGGT in the bipartite fashion that involves recognition of both the glycan and misfolded protein determinants simultaneously. Other substrates might be recognized by their misfolded protein determinants alone. Among UGGT substrates, it can be imagined that there are two subsets of Man9GlcNAc2 glycans: those that use the bipartite mode and those that use the local concentration mode. It follows that both modes might be operative in a multiply glycosylated substrate.

Finally, the ability to bind misfolded proteins in the absence of glycan interactions leads to the interesting possibility that UGGT might play a role in the folding/quality control of non-glycosylated substrates.

4.5.1 UGGT substrate recognition in the context of S-protein variants

Our data suggest that the S-protein variants (S-protein, S-proteinN34Q/Y76N, and S-proteinN34Q/G88N) all interact with UGGT through the local-concentration mode of glucosylation (Figure 28b). This is evidenced by the fact that binding levels were unchanged after glycan removal from S-protein and S-proteinN34Q/G88N and only reduced 2-fold after glycan removal from S-proteinN34Q/Y76N. If a bipartite interaction were formed, then glycan removal would be expected to reduce the measured level of binding substantially. The protein-independent mode can be ruled out, as properly folded RiboB variants were not measurably glucosylated by UGGT.

It is important to consider that S-protein may simply be too small a protein to form a bipartite interaction with UGGT. At 15 kDa, it may not be able to contact both the misfolded binding surface and the GT domain simultaneously. This line of reasoning suggests that in vivo, UGGT might prefer larger substrates.


Future Directions

5.1 X-ray crystal structure

The open, intermediate, and closed conformations of UGGT reported by the published crystal structures may not represent the extent of UGGT’s flexibility49. New crystal structures of UGGT may uncover alternate open and closed conformations. Alternative open and closed conformations could be formed by the movement of the GT domain relative to the rest of the protein. All four full-length UGGT crystal structures reported indicate that the GT domain is in the same position relative to the N-terminal region49. This region does have flexibility, as was demonstrated with high-speed atomic force microscopy52, but this has not yet been observed by crystallography. Obtaining new crystal structures of UGGT may shed light on GT flexibility. Additionally, human UGGT1 structures will be pursued as they are of greater relevance to the medical field.

Though the recent publications have significantly advanced our knowledge of fungal UGGT by solving its domain structure and showing that it is a highly flexible protein, relatively little insight was gained into how it binds its substrates. Does UGGT bind its substrates through one or more trx domains? Do different misfolded substrates bind to different regions on UGGT? Is UGGT flexible enough to significantly change the size of its cavity to accommodate different substrates? These questions could be answered by solving structures of UGGT with acceptor substrates.

The glycosylated S-proteinN34Q/Y76N variant was the acceptor substrate that bound with the tightest affinity as measured by BIAcore SPR (76 ± 8 μM). This is the first variant to try in an attempt to get UGGT-S-protein crystals. However, based on the results presented here, it appears that UGGT also binds S-protein substrates independently of the glycan moiety. The non-glycosylated S-proteinN34Q variant and the de-glycosylated variants all bound to UGGT. It may be fruitful to attempt to crystallize a UGGT-S-proteinN34Q complex, as a highly mobile glycan moiety may hinder crystal formation and this substrate has no glycan. Much will be gained by learning how UGGT interacts with the misfolded component of S-proteinN34Q. The surface of UGGT that binds the misfolded substrate will be elucidated and key residues involved in binding will be identified.

UDP (the donor product) can be included while crystallizing UGGT. This may help structure the loops in the catalytic site. A divalent metal ion is also required to coordinate the UDP, so Ca2+ or Mn2+ should also be included. The donor substrate, UDP-glucose, is not well-suited for crystallizing a UGGT-substrate complex. Instead, crystallization using non-hydrolysable analogues of UDP-glucose will be attempted, namely UDP-phosphono-glucose and UDP-2-deoxy-2-fluoro-glucose, both of which have been used successfully in our lab for crystallography128.

5.2 Co-crystallization with different protein substrates

As reported in the literature, several substrates have been developed to probe the specificity of UGGT. Because of these previous studies, we have many candidate molecules that can be screened for crystallization in complex with UGGT. Several purified substrates have previously been found to be glucosylated by UGGT in in vitro enzyme activity assays, and these are listed in Table 3. If one were to solve a crystal structure of UGGT in complex with several of these substrates individually, our understanding of UGGT1 would be advanced greatly. The size and extent of the misfolded protein binding surface is not yet known. Multiple UGGT-substrate complexes should be pursued in order to map the extent of the binding surface. It is certainly possible that RiboB is not large enough to make a bipartite interaction with UGGT. Since several of these proteins are larger than S-protein, we hope to identify a substrate large enough to interact with UGGT through the bipartite mode of interaction.

Table 3. Purified UGGT substrates in vitro

Substrate | Modification | MW (kDa) | Reference
Bovine thyroglobulin | 8 M urea | 660 | 21,47–49,59,106
Kidney bean phytohemagglutinin | 8 M urea at 60 °C | 30 | 21,46
Soybean agglutinin | 8 M urea at 60 °C | 120 | 21,46,48,67
 | Heat to 100 °C | | 56
Bovine ribonuclease | 8 M urea at 60 °C | 15 | 21,48
 | Digest with subtilisin (forms S-protein) | | 22,71,72
Drosophila laminin | Heat to 65 °C | 800 | 45
Drosophila glutactin | Heat to 65 °C | 155 | 45
Drosophila peroxidasin | 8 M urea, DTT + N-ethylmaleimide sulfhydryl blocking | 150 | 45
Yeast acid phosphatase | Incubate at pH 7.5 | 252 | 56
Barley chymotrypsin inhibitor | C-terminal truncation | 10 | 60,63
Human interleukin-8 | Disulphide mixed, GuHCl denaturation | 11 | 129–132
Plant crambin | Disulphide mixed, disulphide mutant | 5 | 133
E. coli dihydrofolate reductase | Introduce Cys and modify with N-(1-pyrenyl)iodoacetamide | 18 | 64


There are other proteins known to be UGGT substrates based on in vivo assays, where a misfolded glycoprotein is expressed by a cell and measurably glucosylated by UGGT in the ER. These include well-studied model proteins such as α1-AT mutants23,134, TCRα monomers93,94, and VSV G protein tsO45 mutants92. Although they have not yet been shown to be glucosylated in vitro with purified UGGT and purified substrate, they are also potential candidates for UGGT1-substrate complexes. Future structural biology efforts should not be limited to the “pre-validated” substrates found in Table 3, but should also consider substrates from in vivo UGGT studies such as the ones listed above.

5.3 UGGT binding partners

SEP15 is a known binding partner of UGGT105,135. The binding region of Drosophila SEP15 was recently mapped to amino acids 266-306 in the Drosophila UGGT, corresponding to a loop between the trx1 and trx4 domains50. Interestingly, both fungal UGGT crystal structures showed poorly defined electron density for this loop, and neither group modeled it in49,52. Perhaps upon SEP15 binding, the loop becomes structured.

We have cloned human Sep15 and stably transfected it into cells. It expresses well as a protein A fusion protein in 293F cells. Crystallization of UGGT in complex with SEP15 would provide insight into its mechanism of binding and confirm the location of the binding site. It may even provide insight into the functional role of the UGGT-SEP15 complex. The expected binding site is on the external surface of the protein, not the cavity. It will be interesting to see if the SEP15 catalytic site is oriented in such a way that allows it to act upon UGGT1-bound substrates in the cavity. Perhaps the SEP15 catalytic site will be obscured, pointing to a non-catalytic role for SEP15. Additionally, crystallization of the UGGT-S-protein complex can be attempted in both the presence and absence of SEP15.

Another specific binding partner of UGGT that has been recently identified is TAPBPR100. TAPBPR was co-crystallized with the MHC I complex, showing that it is amenable to purification and crystallization136,137. Additionally, TAPBPR binding does not block glucosylation of N-glycans, implying that TAPBPR does not bind in the substrate-binding cavity. Therefore, a UGGT-TAPBPR-S-protein complex is potentially achievable. Including this binding partner may allow us to crystallize and solve the structure of UGGT.


It is possible that its flexibility will make UGGT hard to crystallize. In order to combat this, a disulphide can be introduced to lock it into the less-mobile closed conformation. This strategy was employed for the crystallization of fungal UGGT, and allowed researchers to improve resolution from 3.5 Å to 2.8 Å49. The mutations N796C/G1118C and D611C/G1050C introduce disulphide bonds between the trx3 and β2 domains or the trx2 and β2 domains, respectively. Both proteins had higher melting temperatures than the wild-type protein. Disulphide bonds can be introduced into the human UGGT1 sequence at the same locations (Q645C/T1123C or K885C/D1189C) or at other sites to lock it in a closed conformation. A disulphide-locked UGGT will have particular utility in obtaining structures with binding partners such as TAPBPR and Sep15, where restricting domain motions that might be critical for substrate binding is not expected to be a concern.

5.4 Electron microscopy

For reasons discussed above, obtaining x-ray crystal structures of UGGT in complex with a substrate would teach us much about UGGT. However, crystals may not be obtained, or they may not diffract well. Cryo-EM could be an alternative structural approach to the study of UGGT.

UGGT, being 173 kDa, is large enough to be analyzed by cryo-EM. In fact, I have done a preliminary trial with negative-stained UGGT and showed that it was detectable and that the sample was monodisperse.

Indeed, preliminary cryo-EM analysis of UGGT by two independent groups has been reported. Both groups showed that UGGT segregated into four 3D classes at low resolution49,52. On one hand, this work is useful in uncovering the range of conformations that UGGT can adopt. On the other hand, working with a highly mobile molecule increases the complexity of data analysis and 3D model building, as assigning each 2D projection to the correct 3D class is not trivial. One way to structure UGGT might be to include a binding partner with it. This may reduce the number of 3D classes populated by UGGT and simplify particle alignment and averaging. A UGGT-S-protein complex may have reduced flexibility, as some of the mobile domains (trx2, GT) may be bound to S-protein. Alternatively, SEP15 or TAPBPR could structure UGGT. TAPBPR (a 55 kDa protein) would provide an additional advantage in that it would increase the size of the particle. Using larger substrate molecules could also provide this advantage. Alternatively, a cryo-EM structure of the disulphide-locked UGGT could be attempted.

References

1. Apweiler, R., Hermjakob, H. & Sharon, N. On the frequency of protein glycosylation, as deduced from analysis of the SWISS-PROT database. Biochim. Biophys. Acta - Gen. Subj. 1473, 4–8 (1999).
2. Keenan, R. J., Freymann, D. M., Stroud, R. M. & Walter, P. The Signal Recognition Particle. Annu. Rev. Biochem 70, 755–75 (2001).
3. Johnson, A. E. & van Waes, M. A. The Translocon: A Dynamic Gateway at the ER Membrane. Annu. Rev. Cell Dev. Biol. 15, 799–842 (1999).
4. Helenius, A. & Aebi, M. Roles of N-linked glycans in the endoplasmic reticulum. Annu. Rev. Biochem 73, 1019–49 (2004).
5. Aebi, M. N-linked protein glycosylation in the ER. Biochim. Biophys. Acta - Mol. Cell Res. 1833, 2430–2437 (2013).
6. Zielinska, D. F., Gnad, F., Schropp, K., Wiśniewski, J. R. & Mann, M. Mapping N-Glycosylation Sites across Seven Evolutionarily Distant Species Reveals a Divergent Substrate Proteome Despite a Common Core Machinery. Mol. Cell 46, 542–548 (2012).
7. Murray, A. N. et al. Enhanced Aromatic Sequons Increase Oligosaccharyltransferase Glycosylation Efficiency and Glycan Homogeneity. Chem. Biol. 22, 1052–1062 (2015).
8. Wilson, C. M., Roebuck, Q. & High, S. Ribophorin I regulates substrate delivery to the oligosaccharyltransferase core. Proc. Natl. Acad. Sci. U. S. A. 105, 9534–9 (2008).
9. Shental-Bechor, D. & Levy, Y. Effect of glycosylation on protein folding: A close look at thermodynamic stabilization. Proc. Natl. Acad. Sci. 105, 8256–8261 (2008).
10. Shental-Bechor, D. & Levy, Y. Folding of glycoproteins: toward understanding the biophysics of the glycosylation code. Curr. Opin. Struct. Biol. 19, 524–533 (2009).
11. Hanson, S. R. et al. The core trisaccharide of an N-linked glycoprotein intrinsically accelerates folding and enhances stability. Proc. Natl. Acad. Sci. U. S. A. 106, 3131–6 (2009).
12. Imperiali, B. & O’Connor, S. E. Effect of N-linked glycosylation on glycopeptide and glycoprotein structure. Current Opinion in Chemical Biology 3, 643–649 (1999).
13. Wang, C., Eufemi, M., Turano, C. & Giartosio, A. Influence of the carbohydrate moiety on the stability of glycoproteins. Biochemistry 35, 7299–7307 (1996).
14. Petrescu, A. J., Milac, A. L., Petrescu, S. M., Dwek, R. A. & Wormald, M. R. Statistical analysis of the protein environment of N-glycosylation sites: Implications for occupancy, structure, and folding. Glycobiology 14, 103–114 (2004).
15. Williams, D. B. Beyond lectins: the calnexin/calreticulin chaperone system of the endoplasmic reticulum. J. Cell Sci. 119, 615–623 (2006).
16. Kozlov, G. et al. Structural basis of carbohydrate recognition by calreticulin. J. Biol. Chem. 285, 38612–38620 (2010).
17. Schrag, J. D. et al. The structure of calnexin, an ER chaperone involved in quality control of protein folding. Mol. Cell 8, 633–644 (2001).


18. Kozlov, G., Muñoz-Escobar, J., Castro, K. & Gehring, K. Mapping the ER Interactome: The P Domains of Calnexin and Calreticulin as Plurivalent Adapters for Foldases and Chaperones. Structure 25, 1415–1422 (2017).
19. Leach, M. R., Cohen-Doyle, M. F., Thomas, D. Y. & Williams, D. B. Localization of the lectin, ERp57 binding, and polypeptide binding sites of calnexin and calreticulin. J. Biol. Chem. 277, 29686–29697 (2002).
20. Price, E. R. et al. Human cyclophilin B: a second cyclophilin gene encodes a peptidyl-prolyl isomerase with a signal sequence. Proc. Natl. Acad. Sci. U. S. A. 88, 1903–7 (1991).
21. Sousa, M., Ferrero-Garcia, M. A. & Parodi, A. J. Recognition of the oligosaccharide and protein moieties of glycoproteins by the UDP-Glc:glycoprotein glucosyltransferase. Biochemistry 31, 97–105 (1992).
22. Trombetta, E. S. & Helenius, A. Conformational requirements for glycoprotein reglucosylation in the endoplasmic reticulum. J. Cell Biol. 148, 1123–1129 (2000).
23. Ferris, S. P., Kodali, V. K. & Kaufman, R. J. Glycoprotein folding and quality-control mechanisms in protein-folding diseases. Dis. Model. Mech. 7, 331–341 (2014).
24. Moremen, K. W., Tiemeyer, M. & Nairn, A. V. Vertebrate protein glycosylation: diversity, synthesis and function. 13, 448–462 (2014).
25. Helenius, A. & Aebi, M. Intracellular functions of N-linked glycans. Science 291, 2364–2369 (2001).
26. Ruggiano, A., Foresti, O. & Carvalho, P. ER-associated degradation: Protein quality control and beyond. J. Cell Biol. 204, 869–879 (2014).
27. Kanehara, K., Kawaguchi, S. & Ng, D. T. W. The EDEM and Yos9p families of lectin-like ERAD factors. Semin. Cell Dev. Biol. 18, 743–750 (2007).
28. Slominska-Wojewodzka, M. & Sandvig, K. The role of lectin-carbohydrate interactions in the regulation of ER-associated protein degradation. Molecules 20, 9816–9846 (2015).
29. Avezov, E., Frenkel, Z., Ehrlich, M., Herscovics, A. & Lederkremer, G. Endoplasmic Reticulum (ER) Mannosidase I Is Compartmentalized and Required for N-Glycan Trimming to Man5–6GlcNAc2 in Glycoprotein ER-associated Degradation. Mol. Biol. Cell 19, 216–225 (2008).
30. Christianson, J. C., Shaler, T. A., Tyler, R. E. & Kopito, R. R. OS-9 and GRP94 deliver mutant α1-antitrypsin to the Hrd1-SEL1L ubiquitin ligase complex for ERAD. Nat. Cell Biol 10, 272–282 (2008).
31. Satoh, T. et al. Structural Basis for Oligosaccharide Recognition of Misfolded Glycoproteins by OS-9 in ER-Associated Degradation. Mol. Cell 40, 905–916 (2010).
32. Bernasconi, R., Pertel, T., Luban, J. & Molinari, M. A dual task for the Xbp1-responsive OS-9 variants in the mammalian endoplasmic reticulum: Inhibiting secretion of misfolded protein conformers and enhancing their disposal. J. Biol. Chem. 283, 16446–16454 (2008).
33. Carvalho, P., Stanley, A. M. & Rapoport, T. A. Retrotranslocation of a misfolded luminal ER protein by the ubiquitin-ligase hrd1p. Cell 143, 579–591 (2010).
34. Baldridge, R. D. & Rapoport, T. A. Autoubiquitination of the Hrd1 Ligase Triggers Protein Retrotranslocation in ERAD. Cell 166, 394–407 (2016).

35. Voges, D., Zwickl, P. & Baumeister, W. The 26S Proteasome: A Molecular Machine Designed for Controlled Proteolysis. Annu. Rev. Biochem. 68, 1015–1068 (1999).
36. Gallastegui, N. & Groll, M. The 26S proteasome: assembly and function of a destructive machine. Trends Biochem. Sci. 35, 634–642 (2010).
37. Raasi, S. & Wolf, D. H. Ubiquitin receptors and ERAD: A network of pathways to the proteasome. Semin. Cell Dev. Biol. 18, 780–791 (2007).
38. Totani, K., Ihara, Y., Matsuo, I., Koshino, H. & Ito, Y. Synthetic substrates for an endoplasmic reticulum protein-folding sensor, UDP-glucose: Glycoprotein glucosyltransferase. Angew. Chemie - Int. Ed. 44, 7950–7954 (2005).
39. Parodi, A. & Cazzulo, J. Protein glycosylation in Trypanosoma cruzi. J. Biol. Chem. 257, 7637–7640 (1982).
40. Parodi, A. J., Mendelzon, D. H. & Lederkremer, G. Z. Transient glucosylation of protein-bound Man9GlcNAc2, Man8GlcNAc2, and Man7GlcNAc2 in calf thyroid cells. J. Biol. Chem. 258, 8260–8265 (1983).
41. Parodi, A. J., Mendelzon, D. H., Lederkremer, G. Z. & Martin-Barrientos, J. Evidence That Transient Glucosylation of Protein-linked Man9GlcNAc2, Man8GlcNAc2, and Man7GlcNAc2 occurs in rat liver and Phaseolus vulgaris cells. J. Biol. Chem. 259, 6351–6357 (1984).
42. Trombetta, S. E., Bosch, M. & Parodi, A. J. Glucosylation of Glycoproteins by Mammalian, Plant, Fungal, and Trypanosomatid Protozoa Microsomal Membranes. Biochemistry 28, 8108–8116 (1989).
43. Trombetta, S. E. & Parodi, A. J. Purification to apparent homogeneity and partial characterization of rat liver UDP-glucose:glycoprotein glucosyltransferase. J. Biol. Chem. 267, 9236–9240 (1992).
44. Sousa, M. C., Ferrero-Garcia, M. A. & Parodi, A. J. Recognition of the Oligosaccharide and Protein Moieties of Glycoproteins by the UDP-Glc: Glycoprotein Glucosyltransferase. Biochemistry 31, 97–105 (1992).
45. Parker, C. G., Fessler, L. I., Nelson, R. E. & Fessler, J. H. Drosophila UDP-glucose:glycoprotein glucosyltransferase: sequence and characterization of an enzyme that distinguishes between denatured and native proteins. EMBO J. 14, 1294–1303 (1995).
46. Fernández, F. S., Trombetta, S. E., Hellman, U. & Parodi, A. J. Purification to homogeneity of UDP-glucose:glycoprotein glucosyltransferase from Schizosaccharomyces pombe and apparent absence of the enzyme from Saccharomyces cerevisiae. J. Biol. Chem. 269, 30701–30706 (1994).
47. Arnold, S. M., Fessler, L. I., Fessler, J. H. & Kaufman, R. J. Two homologues encoding human UDP-glucose:glycoprotein glucosyltransferase differ in mRNA expression and enzymatic activity. Biochemistry 39, 2149–2163 (2000).
48. Sousa, M. & Parodi, A. The molecular basis for the recognition of misfolded glycoproteins by the UDP-Glc:glycoprotein glucosyltransferase. EMBO J. 14, 4196–203 (1995).
49. Roversi, P. et al. Interdomain conformational flexibility underpins the activity of UGGT, the eukaryotic glycoprotein secretion checkpoint. Proc. Natl. Acad. Sci. 114, 8544–8549 (2017).
50. Calles-Garcia, D. et al. Single-particle electron microscopy structure of UDP-glucose:glycoprotein glucosyltransferase suggests a selectivity mechanism for misfolded proteins. J. Biol. Chem. 292, 11499–11507 (2017).
51. Zhu, T., Satoh, T. & Kato, K. Structural insight into substrate recognition by the endoplasmic reticulum folding-sensor enzyme: crystal structure of third thioredoxin-like domain of UDP-glucose:glycoprotein glucosyltransferase. Sci. Rep. 4, 7322 (2014).
52. Satoh, T. et al. Visualisation of a flexible modular structure of the ER folding-sensor enzyme UGGT. Sci. Rep. 7, 12142 (2017).
53. Kelly, L. A., Mezulis, S., Yates, C., Wass, M. & Sternberg, M. The Phyre2 web portal for protein modelling, prediction, and analysis. Nat. Protoc. 10, 845–858 (2015).
54. Ünligil, U. M. & Rini, J. M. Glycosyltransferase structure and mechanism. Curr. Opin. Struct. Biol. 10, 510–517 (2000).
55. Lairson, L. L., Henrissat, B., Davies, G. J. & Withers, S. G. Glycosyltransferases: Structures, Functions, and Mechanisms. Annu. Rev. Biochem. 77, 521–555 (2008).
56. Tessier, D. C. et al. Cloning and characterization of mammalian UDP-glucose glycoprotein: glucosyltransferase and the development of a specific substrate for this enzyme. Glycobiology 10, 403–412 (2000).
57. Daikoku, S., Seko, A., Ito, Y. & Kanie, O. Glycan structure and site of glycosylation in the ER-resident glycoprotein, uridine 5-diphosphate-glucose: Glycoprotein glucosyltransferases 1 from rat, porcine, bovine, and human. Biochem. Biophys. Res. Commun. 451, 356–360 (2014).
58. Daikoku, S. et al. The relationship between glycan structures and expression levels of an endoplasmic reticulum-resident glycoprotein, UDP-glucose: Glycoprotein glucosyltransferase 1. Biochem. Biophys. Res. Commun. 462, 58–63 (2015).
59. Trombetta, S. E., Gañan, S. A. & Parodi, A. J. The UDP-Glc:glycoprotein glucosyltransferase is a soluble protein of the endoplasmic reticulum. Glycobiology 1, 155–161 (1991).
60. Caramelo, J. J., Castro, O. A., Alonso, L. G., De Prat-Gay, G. & Parodi, A. J. UDP-Glc:glycoprotein glucosyltransferase recognizes structured and solvent accessible hydrophobic patches in molten globule-like folding intermediates. Proc. Natl. Acad. Sci. U. S. A. 100, 86–91 (2003).
61. Taylor, S. C., Thibault, P., Tessier, D. C., Bergeron, J. J. M. & Thomas, D. Y. Glycopeptide specificity of the secretory protein folding sensor UDP-glucose glycoprotein: glucosyltransferase. EMBO Rep. 4, 405–411 (2003).
62. Kudo, T., Hirano, M., Ishihara, T., Shimura, S. & Totani, K. Glycopeptide probes for understanding peptide specificity of the folding sensor enzyme UGGT. Bioorganic Med. Chem. Lett. 24, 5563–5567 (2014).
63. Caramelo, J. J., Castro, O. A., De Prat-Gay, G. & Parodi, A. J. The endoplasmic reticulum glucosyltransferase recognizes nearly native glycoprotein folding intermediates. J. Biol. Chem. 279, 46280–46285 (2004).
64. Hachisu, M. et al. Hydrophobic Tagged Dihydrofolate Reductase for Creating Misfolded Glycoprotein Mimetics. ChemBioChem 17, 300–303 (2016).
65. Taylor, S. C., Ferguson, A. D., Bergeron, J. J. M. & Thomas, D. Y. The ER protein folding sensor UDP-glucose glycoprotein-glucosyltransferase modifies substrates distant to local changes in glycoprotein conformation. Nat. Struct. Mol. Biol. 11, 128–134 (2004).
66. Pearse, B. R., Gabriel, L., Wang, N. & Hebert, D. N. A cell-based reglucosylation assay demonstrates the role of GT1 in the quality control of a maturing glycoprotein. J. Cell Biol. 181, 309–320 (2008).
67. Keith, N., Parodi, A. J. & Caramelo, J. J. Glycoprotein tertiary and quaternary structures are monitored by the same quality control mechanism. J. Biol. Chem. 280, 18138–18141 (2005).
68. Totani, K., Ihara, Y., Tsujimoto, T., Matsuo, I. & Ito, Y. The recognition motif of the glycoprotein-folding sensor enzyme UDP-Glc: Glycoprotein glucosyltransferase. Biochemistry 48, 2933–2940 (2009).
69. Sakono, M., Seko, A., Takeda, Y., Hachisu, M. & Ito, Y. Biophysical properties of UDP-glucose: Glycoprotein glucosyltransferase, a folding sensor enzyme in the ER, delineated by synthetic probes. Biochem. Biophys. Res. Commun. 426, 504–510 (2012).
70. Takeda, Y. et al. Both isoforms of human UDP-glucose:glycoprotein glucosyltransferase are enzymatically active. Glycobiology 24, 344–350 (2014).
71. Ritter, C. & Helenius, A. Recognition of local glycoprotein misfolding by the ER folding sensor UDP-glucose:glycoprotein glucosyltransferase. Nat. Struct. Biol. 7, 278–280 (2000).
72. Ritter, C., Quirin, K., Kowarik, M. & Helenius, A. Minor folding defects trigger local modification of glycoproteins by the ER folding sensor GT. EMBO J. 24, 1730–8 (2005).
73. Liang, C., Yamashita, K. & Kobata, A. Structural Study of the Carbohydrate Moiety of Bovine Pancreatic Ribonuclease B. J. Biochem. 88, 51–58 (1980).
74. Kartha, G., Bello, J. & Harker, D. Tertiary structure of ribonuclease. Nature 213, 862–5 (1967).
75. Williams, R. L., Greene, S. M. & McPherson, A. The crystal structure of ribonuclease B at 2.5-Å resolution. J. Biol. Chem. 262, 16020–16031 (1987).
76. Gotte, G., Bertoldi, M. & Libonati, M. Structural versatility of bovine ribonuclease A. Distinct conformers of trimeric and tetrameric aggregates of the enzyme. Eur. J. Biochem. 265, 680–687 (1999).
77. Gotte, G., Laurents, D. V. & Libonati, M. Three-dimensional domain-swapped oligomers of ribonuclease A: Identification of a fifth tetramer, pentamers and hexamers, and detection of trace heptameric, octameric and nonameric species. Biochim. Biophys. Acta - Proteins Proteomics 1764, 44–54 (2006).
78. Richards, F. M. On the Enzymic Activity of Subtilisin-Modified Ribonuclease. Proc. Natl. Acad. Sci. 44, 162–166 (1958).
79. Wyckoff, H. et al. The Three-Dimensional Structure of Ribonuclease-S. J. Biol. Chem. 245, 305–328 (1970).
80. Richards, F. & Vithayathil, P. The Preparation of Subtilisin-modified Ribonuclease and the Separation of the Peptide and Protein Components. J. Biol. Chem. 234, 1459–65 (1959).
81. Kim, J. & Raines, R. T. Ribonuclease S‐peptide as a carrier in fusion proteins. Protein Sci. 2, 348–356 (1993).
82. Labhardt, A. M. & Baldwin, R. L. Recombination of S-peptide with S-protein during folding of ribonuclease S. I. Folding pathways of the slow-folding and fast-folding classes of unfolded S-protein. J. Mol. Biol. 135, 231–244 (1979).
83. Labhardt, A. M. & Baldwin, R. L. Recombination of S-peptide with S-protein during folding of ribonuclease S. II. Kinetic characterization of a stable folding intermediate shown by S-protein at pH 1.7. J. Mol. Biol. 135, 245–254 (1979).
84. Kim, P. & Baldwin, R. Specific intermediates in the folding reactions of small proteins and the mechanism of protein folding pathways. Ann. Rev. Biochem. 51, 459–489 (1982).
85. Kim, P. & Baldwin, R. Intermediates In The Folding Reactions Of Small Proteins. Annu. Rev. Biochem. 59, 631–660 (1990).
86. Neira, J. L. & Rico, M. Folding studies on ribonuclease A, a model protein. Fold. Des. 2, R1–R11 (1997).
87. Arnold, S. M. & Kaufman, R. J. The Noncatalytic Portion of Human UDP-glucose:Glycoprotein Glucosyltransferase I Confers UDP-glucose Binding and Transferase Function to the Catalytic Domain. J. Biol. Chem. 278, 43320–43328 (2003).
88. Buzzi, L. I., Simonetta, S. H., Parodi, A. J. & Castro, O. A. The two Caenorhabditis elegans UDP-glucose:Glycoprotein glucosyltransferase homologues have distinct biological functions. PLoS One 6, (2011).
89. Takeda, Y. et al. Effects of domain composition on catalytic activity of human UDP-glucose: Glycoprotein glucosyltransferases. Glycobiology 26, 999–1006 (2016).
90. Fernandez, F., Jannatipour, M., Hellman, U., Rokeach, L. & Parodi, A. A new stress protein: synthesis of Schizosaccharomyces pombe UDP-Glc:glycoprotein glucosyltransferase mRNA is induced by stress conditions but the enzyme is not essential for cell viability. EMBO J. 15, 705–13 (1996).
91. Fanchiotti, S., Fernández, F., Alessio, C. D. & Parodi, A. J. The UDP-Glc:Glycoprotein Glucosyltransferase Is Essential for Schizosaccharomyces pombe Viability under Conditions of Extreme Endoplasmic Reticulum Stress. J. Cell Biol. 143, 625–635 (1998).
92. Molinari, M., Galli, C., Vanoni, O., Arnold, S. M. & Kaufman, R. J. Persistent glycoprotein misfolding activates the glucosidase II/UGT1-driven calnexin cycle to delay aggregation and loss of folding competence. Mol. Cell 20, 503–512 (2005).
93. Van Leeuwen, J. E. & Kearse, K. P. Reglucosylation of N-linked glycans is critical for calnexin assembly with T cell receptor (TCR) alpha proteins but not TCRbeta proteins. J. Biol. Chem. 272, 4179–86 (1997).
94. Gardner, T. G. & Kearse, K. P. Modification of the T Cell Antigen Receptor (TCR) Complex by UDP-glucose:Glycoprotein Glucosyltransferase. J. Biol. Chem. 274, 14094–14099 (1999).
95. Pearse, B. R. et al. The role of UDP-Glc:glycoprotein glucosyltransferase 1 in the maturation of an obligate substrate prosaposin. J. Cell Biol. 189, 829–841 (2010).
96. Ferris, S. P., Jaber, N. S., Molinari, M., Arvan, P. & Kaufman, R. J. UDP-glucose:glycoprotein glucosyltransferase (UGGT1) promotes substrate solubility in the endoplasmic reticulum. Mol. Biol. Cell 24, 2597–608 (2013).
97. Soldà, T., Galli, C., Kaufman, R. J. & Molinari, M. Substrate-Specific Requirements for UGT1-Dependent Release from Calnexin. Mol. Cell 27, 238–249 (2007).
98. Zhang, W., Wearsch, P. A., Zhu, Y., Leonhardt, R. M. & Cresswell, P. A role for UDP-glucose glycoprotein glucosyltransferase in expression and quality control of MHC class I molecules. Proc. Natl. Acad. Sci. 108, 4956–4961 (2011).
99. Wearsch, P. A., Peaper, D. R. & Cresswell, P. Essential glycan-dependent interactions optimize MHC class I peptide loading. Proc. Natl. Acad. Sci. 108, 4950–4955 (2011).
100. Neerincx, A. et al. TAPBPR bridges UDP-glucose: Glycoprotein glucosyltransferase 1 onto MHC class I to provide quality control in the antigen presentation pathway. Elife 6, 1–25 (2017).
101. Jin, H., Yan, Z., Nam, K. H. & Li, J. Allele-Specific Suppression of a Defective Brassinosteroid Receptor Reveals a Physiological Role of UGGT in ER Quality Control. Mol. Cell 26, 821–830 (2007).
102. Zhang, Q., Sun, T. & Zhang, Y. ER quality control components UGGT and STT3a are required for activation of defense responses in bir1-1. PLoS One 10, 1–10 (2015).
103. Blanco-Herrera, F. et al. The UDP-glucose: glycoprotein glucosyltransferase (UGGT), a key enzyme in ER quality control, plays a significant role in plant growth as well as biotic and abiotic stress in Arabidopsis thaliana. BMC Plant Biol. 15, 127 (2015).
104. Gladyshev, V. N. et al. Selenoprotein Gene Nomenclature. J. Biol. Chem. 291, 24036–24040 (2016).
105. Korotkov, K. V., Kumaraswamy, E., Zhou, Y., Hatfield, D. L. & Gladyshev, V. N. Association between the 15-kDa Selenoprotein and UDP-glucose:Glycoprotein Glucosyltransferase in the Endoplasmic Reticulum of Mammalian Cells. J. Biol. Chem. 276, 15330–15336 (2001).
106. Labunskyy, V. M. et al. A Novel Cysteine-rich Domain of Sep15 Mediates the Interaction with UDP-glucose:Glycoprotein Glucosyltransferase. J. Biol. Chem. 280, 37839–37845 (2005).
107. Labunskyy, V., Hatfield, D. & Gladyshev, V. The Sep15 protein family: Roles in disulfide bond formation and quality control in the endoplasmic reticulum. IUBMB Life 59, 1–5 (2007).
108. Labunskyy, V., Yoo, M.-H., Hatfield, D. L. & Gladyshev, V. N. Sep15, a Thioredoxin-like Selenoprotein, is Involved in the Unfolded Protein Response and Differentially Regulated by Adaptive and Acute ER Stresses. Biochemistry 48, 8458–8465 (2009).
109. Kasaikina, M. V. et al. Roles of the 15-kDa selenoprotein (Sep15) in redox homeostasis and cataract development revealed by the analysis of Sep 15 knockout mice. J. Biol. Chem. 286, 33203–33212 (2011).
110. Li, Z., Michael, I. P., Zhou, D., Nagy, A. & Rini, J. M. Simple piggyBac transposon-based mammalian cell expression system for inducible protein production. Proc. Natl. Acad. Sci. U. S. A. 110, 5004–9 (2013).
111. Li, X. & Rini, J. M. Structural and Biochemical Characterization of UDP-glucose: Glycoprotein Glucosyltransferase. (University of Toronto, 2016).
112. Watkins, R. W., Arnold, U. & Raines, R. T. Ribonuclease S redux. Chem. Commun. 47, 973–975 (2011).
113. Gasteiger, E. et al. Protein Identification and Analysis Tools on the ExPASy Server. Proteomics Protoc. Handb. 571–607 (2005).
114. Kunitz, M. A spectrophotometric method for the measurement of ribonuclease activity. J. Biol. Chem. 164, 563–569 (1946).
115. Reeves, P. J., Callewaert, N., Contreras, R. & Khorana, H. G. Structure and function in rhodopsin: High-level expression of rhodopsin with restricted and homogeneous N-glycosylation by a tetracycline-inducible N-acetylglucosaminyltransferase I-negative HEK293S stable mammalian cell line. Proc. Natl. Acad. Sci. 99, 13419–13424 (2002).
116. Elbein, A. D., Tropea, J. E., Mitchell, M. & Kaushal, G. P. Kifunensine, a Potent Inhibitor of the Glycoprotein Processing Mannosidase I. J. Biol. Chem. 265, 15599–15605 (1990).
117. Hering, K. W., Karaveg, K., Moremen, K. W. & Pearson, W. H. A practical synthesis of kifunensine analogues as inhibitors of endoplasmic reticulum α-mannosidase I. J. Org. Chem. 70, 9892–9904 (2005).
118. Watkins, R. W., Arnold, U. & Raines, R. T. Ribonuclease S redux. Chem. Commun. 47, 973–975 (2011).
119. Kelly, S. M. & Price, N. C. The use of circular dichroism in the investigation of protein structure and function. Curr. Protein Pept. Sci. 1, 349–84 (2000).
120. Woody, A. Y. M. & Woody, R. W. Individual Tyrosine Side-Chain Contributions to Circular Dichroism of Ribonuclease. Biopolym. - Biospectroscopy Sect. 72, 500–513 (2003).
121. Ranjbar, B. & Gill, P. Circular dichroism techniques: Biomolecular and nanostructural analyses- A review. Chem. Biol. Drug Des. 74, 101–120 (2009).
122. Qi, P. X., Sosnick, T. R. & Englander, S. W. The burst phase in ribonuclease A folding and solvent dependence of the unfolded state. Nat. Struct. Biol. 5, 882–884 (1998).
123. Labhardt, A. M. Kinetic circular dichroism shows that the S-peptide alpha-helix of ribonuclease S unfolds fast and refolds slowly. Proc. Natl. Acad. Sci. 81, 7674–7678 (1984).
124. Stelea, S. D. & Keiderling, T. A. Pretransitional structural changes in the thermal denaturation of ribonuclease S and S protein. Biophys. J. 83, 2259–2269 (2002).
125. Schreier, A. A. & Baldwin, R. L. Mechanism of Dissociation of S-Peptide from Ribonuclease S. Biochemistry 16, 4203–4209 (1977).
126. Gallouzi, G. G. A. W. Biochemistry. (Nelson Education, 2013).
127. Cuchillo, C. M., Nogués, M. V. & Raines, R. T. Bovine pancreatic ribonuclease: Fifty years of the first enzymatic reaction mechanism. Biochemistry 50, 7835–7841 (2011).
128. Li, Z. et al. Structural basis of Notch O-glucosylation and O-xylosylation by mammalian protein-O-glucosyltransferase 1 (POGLUT1). Nat. Commun. 8, (2017).
129. Izumi, M. et al. Chemical synthesis of intentionally misfolded homogeneous glycoprotein: A unique approach for the study of glycoprotein quality control. J. Am. Chem. Soc. 134, 7238–7241 (2012).
130. Izumi, M. et al. Substrate Recognition of Glycoprotein Folding Sensor UGGT Analyzed by Site-Specifically 15N-Labeled Glycopeptide and Small Glycopeptide Library Prepared by Parallel Native Chemical Ligation. J. Am. Chem. Soc. 139, 11421–11426 (2017).
131. Izumi, M. et al. Synthesis of misfolded glycoprotein dimers through native chemical ligation of a dimeric peptide thioester. Org. Biomol. Chem. 14, 6088–94 (2016).
132. Takeda, Y. et al. Both isoforms of human UDP-glucose:glycoprotein glucosyltransferase are enzymatically active. Glycobiology 24, 344–350 (2014).
133. Dedola, S. et al. Folding of synthetic homogeneous glycoproteins in the presence of a glycoprotein folding sensor enzyme. Angew. Chemie - Int. Ed. 53, 2883–2887 (2014).
134. Tannous, A., Patel, N., Tamura, T. & Hebert, D. N. Reglucosylation by UDP-glucose:glycoprotein glucosyltransferase 1 delays glycoprotein secretion but not degradation. Mol. Biol. Cell 26, 390–405 (2015).
135. Labunskyy, V. M. et al. A novel cysteine-rich domain of Sep15 mediates the interaction with UDP-glucose:glycoprotein glucosyltransferase. J. Biol. Chem. 280, 37839–37845 (2005).
136. Jiang, J. et al. Crystal structure of a TAPBPR–MHC I complex reveals the mechanism of peptide editing in antigen presentation. Science 358, 1064–1068 (2017).
137. Thomas, C. & Tampé, R. Structure of the TAPBPR–MHC I complex defines the mechanism of peptide loading and editing. Science 358, 1060–1064 (2017).