Enzymes for Glycoprotein Synthesis
Total Page:16
File Type:pdf, Size:1020Kb
318 CHIMIA 2009, 63, No. 6 BIOTRANSFORMATIONS doi:10.2533/chimia.2009.318 Chimia 63 (2009) 318–326 © Schweizerische Chemische Gesellschaft Enzymes for Glycoprotein Synthesis Chi-Huey Wong* Abstract: More than 90% of human proteins are glycosylated and since protein glycosylation is understood to play a role in folding, trafficking, stability, immunogenicity, and function there is a need to generate pure protein glycoforms. Pure single protein glycoforms are difficult to obtain because the cellular machinary produces com- plex mixtures for any given protein. In this article an overview is given of various approaches used to generate specific single glycoforms of proteins, studies of the role of glycosylation in protein stability and function, and the imaging and identification of new glycoproteins related to cancer. Keywords: Enzymes · Glycopeptides · Synthesis Introduction Due to the importance of this prob- single glycoform biopharmaceuticals bear- lem, several approaches have been taken ing specific human glycans, it is not ideal A major challenge in the study of glyco- towards generating specific single glyco- for production of O-linked glycoproteins proteins is to characterize the effects of forms of proteins. An early approach was and analogues for biochemical research, glycans on glycoprotein folding and func- in vitro enzymatic remodeling of native since any modification of a specific gly- tion, in pure form. Pure, single protein glycoprotein mixtures with a combination can would require re-engineering of a new glycoforms are difficult to obtain, since of glycosidases to trim outer saccharides, yeast strain. For this purpose, a modular the cellular machinery produces complex and glycosyltransferases to rebuild new chemical approach in which different gly- mixtures of glycoforms for any given pro- glycan structures.[12–14] While conceptu- cans can be swapped out to make a series tein. Given that more than 90% of human ally simple, this approach is difficult in of related glycoforms using the same set of proteins are glycosylated and protein gly- practice because it requires each enzy- reactions, is more attractive. cosylation is understood to play a role in matic step to go to completion in order Since the initial report of native chemi- folding, trafficking, stability, immunoge- to obtain single glycoforms. If individual cal ligation (NCL),[20] many advances to- nicity, and function,[1–8] there is a need for steps do not reach 100% yield, the result wards total chemical synthesis of proteins methods of generating pure protein glyco- will be a mixture of glycoforms that is have been achieved.[21–28] These methods forms. To date, biophysical, structural, and similar in complexity to the native mix- have been applied to answer basic ques- functional characterizations have gener- ture. A new approach, developed collab- tions in protein folding and function, ally been limited to mixed populations of oratively between the Wong and Schultz which were otherwise difficult to answer inseparable glycoforms which have been labs, takes advantage of bacterial expres- with expressed proteins containing native compared to deglycosylated protein which sion of protein with incorporation of an sequences and naturally occurring amino was either enzymatically treated to remove unnatural glycosyl amino acid.[15,16] This acids.[29–31] Earlier versions of NCL were all glycans, expressed in a bacterial host could be followed by in vitro enzymatic limited by the requirement for cysteine at which lacks the protein glycosylation ma- elaboration of the glycan to produce de- the ligation junction, as well as a strong chinery found in eukaryotes, treated with sired structures. Improvements in yield dependence on peptide sequence at the glycosidase inhibitors during expression, are needed to make this process practical, ligation junction, primarily due to steric or mutated to remove specific glycosyla- and once again the need for several enzy- influence on ligation yield. The invention tion sites.[9–11] matic steps to convert the initial monosac- of removable thiol auxiliaries[27,32–35] has charide to a full-length complex glycan expanded the sequence repertoire that is would lead to glycoform mixtures unless accessible to NCL, and recently the new each enzymatic step goes to completion. methods have been applied to synthesis of In addition, it is difficult to incorporate full-length glycoproteins[36] or fragments more than one glycosyl amino acid into thereof.[37,38] However, ligation junction a protein using this method. A more con- sequence dependence remains a serious vergent route toward fully elaborated gly- issue, as these auxiliary-based systems cans, with fewer individual steps, is nec- rely on an amine that is less nucleophilic essary to prevent this problem. and more hindered than a native peptide’s Another approach that holds great N-terminal primary amine. Thus, these li- *Correspondence: Prof. C.-H. Wong promise is engineering of production or- gations are generally slower and proceed President, Academia Sinica 128 Academia Road ganisms, such as yeast, to attach specified in lower yields, and with more racemiza- Section 2, Nankang N-linked glycans in pure form to proteins tion, than traditional cysteine-based NCL Taipei, Taiwan that they are overexpressing.[17–19] The ligations. A recent approach to NCL is ex- and Department of Chemistry The Scripps Research Institute significant commercial potential of this pression of N-terminal cysteine containing 10550 N. Torrey Pines Rd. approach for biopharmaceuticals is exem- protein fragments, with subsequent native La Jolla, CA 92037 plified by the 2006 purchase of GlycoFi, chemical ligation (NCL) of glycopep- Tel.: +886 2 2789 9400 Fax: +886 2 2785 3852 a company developing this technology, tide thioesters. TEV protease specifically E-mail: [email protected] by Merck. While the yeast engineering cleaves its target sequence to produce an or [email protected] approach is promising for production of N-terminal cysteine. This approach was BIOTRANSFORMATIONS CHIMIA 2009, 63, No. 6 319 used to assemble N-linked glycosylated IL-2.[14] Sugar Assisted Glycopeptide Ligation In 2006 our lab reported that a native N-linked or O-linked saccharide attached to asparagine or serine/threonine, respectively, when appended with a thiol handle, was an excellent auxiliary for promoting ligation with a peptide thioester, in what we call sugar-assisted ligation (SAL) (Fig.1).[39–41] In SAL, the ultimate nucleophile is a primary amine, as is the case for NCL (Fig. 1). It is therefore more reactive than auxil- iaries attached directly to the nucleophilic amine, as shown in Fig. 2. The convergent nature and high efficiency of SAL, as well as the improvements to the original method which are outlined below, make it a prom- ising technique for practical synthesis of Fig. 1. Comparison of proposed mechanisms for the original native homogenous glycoproteins. The scope chemical ligation (NCL) and sugar-assisted ligation (SAL). In NCL, and limitations of the first generation SAL an obligatory N-terminal cysteine residue of peptide 2 undergoes a reaction were determined on peptide sub- transthioesterification reaction with a C-terminal thioester of peptide 1. A subsequent intramolecular S-N acyl transfer results in peptide bond strates,[39,40] with a Gly-Gly junction being formation. In SAL, a thiol auxiliary that plays the role of cysteine in NCL the fastest, Ala-Gly half as rapid, and Val- is appended from an asparagine-linked carbohydrate. After acyl transfer Gly an order of magnitude slower. Isolated to form the native glycopeptide, the thiol can be reduced to provide the yields for slower reactions are lower, due native N-acetylglucosamine residue. In SAL, the N-terminal residue is to the competing thioester hydrolysis reac- not limited to cysteine. tion. Therefore, sterically hindered junc- tions such as Ala-Val are not practical OH from a synthetic standpoint: although the O desired product can be detected, yields are HO O HO N O H H O NH poor. One of the goals of subsequent ef- N Peptide 1 Peptide 2 forts has been to improve ligation rates for S R Peptide 1 O S Peptide 2 H N these difficult junctions, and/or suppress O 2 N O H the competing hydrolysis reaction. R2 R O One of the limitations of SAL and oth- er thiol auxiliary-based approaches is the Fig. 2. Comparison of thioester intermediates in peptide ligations need to remove the thiol function to obtain mediated by aryl thiol auxiliaries (i.e. ref. [27]), and SAL. The ultimate amine nucleophile in the former case is an α-branched (except for fully native protein or glycoprotein. When glycine) benzylic secondary amine which suffers considerable steric additional cysteine residues are present hindrance. In the case of SAL, it is a peptide N-terminal primary amine, in the protein, the thiol auxiliary must be as found in NCL. While the transition state ring size for intramolecular removed selectively using reductive de- S-N acyl transfer is larger for SAL, ligation rate and sequence tolerance sulfurization, with orthogonal protection at the ligation junction remains comparable to other ligation methods. of internal cysteine residues. We recently reported a shorter alternative route, the second generation SAL.[42] The thiolacetic these conditions, it was found that ligation exchange-based intramolecular ligations, acid moiety was shifted from the 2-posi- occurred even when the thiol auxiliary was because it is much more concentration- tion of glucosamine to the 3-position, and further removed from the ligation site (ex- dependent. As peptide substrates become was connected through an ester rather than tended SAL).[42,43] Extended SAL greatly larger, it becomes more difficult to achieve an amide linkage. Thus, after ligation, the increased the junction sequence repertoire productive concentrations. Therefore, this thiol auxiliary could be removed by mild that is accessible to ligation. If the amino method may be limited to smaller peptide hydrazinolysis rather than desulfurization acids neighboring the sugar residue were and glycopeptide targets rather than full- (Fig.