
REVIEWS

Rapid and efficient protein synthesis through expansion of the native chemical ligation concept

Sameer S. Kulkarni1, Jessica Sayers1, Bhavesh Premdjee2 and Richard J. Payne1*

Abstract | The growing interest in peptides and proteins, both in fundamental research and in drug discovery, has fuelled demand for efficient synthetic methods to access these biomolecules. Although solid-phase synthesis serves as the workhorse for accessing peptides up to 50 amino acids in length, ligation technologies have underpinned protein synthesis. Native chemical ligation (NCL) represents the most widely used method and relies on the reaction of a peptide bearing an N-terminal cysteine residue with a peptide thioester. While the seminal methodology was limited to reaction at N-terminal cysteine residues, the NCL concept has recently been extended with a view to improving reaction efficiency and scope. Specifically, the discovery that cysteine residues can be desulfurized to alanine has led to the development of a range of thiol-derived variants of the proteinogenic amino acids that can be employed in protein synthesis under a ligation–desulfurization manifold. Furthermore, a number of important technologies have been developed to access larger targets via multi-fragment assembly, including methods for latent thioester activation and orthogonal strategies. Very recently, selenocysteine, together with selenylated variants, has been shown to facilitate rapid ligation with peptide selenoesters. The large rate accelerations of these ligations have enabled access to proteins on unprecedented timescales, while chemoselective deselenization chemistry renders hitherto unobtainable targets accessible. This Review highlights innovative developments that have greatly expanded the NCL concept, allowing it to serve as a rapid and efficient means of conquering more challenging synthetic protein targets in the near future.

1 School of Chemistry, The University of Sydney, Sydney, NSW, Australia.
2 Department of Protein and Peptide Chemistry, Novo Nordisk A/S, Måløv, Denmark.
*e-mail: richard.payne@sydney.edu.au
doi:10.1038/s41570-018-0122
Published online 29 Mar 2018

Peptides and proteins are the most ubiquitous biomolecules in living systems and are responsible for orchestrating a plethora of functional and structural roles in the cell. The final structure and function of a given polypeptide are dictated by the specific sequence of 21 proteinogenic amino acids that is encoded within the genome of the organism. The flow of genetic information from DNA to peptides and proteins (the central dogma of molecular biology) involves the transcription of genetic information in DNA into mRNA, followed by the synthesis of polypeptides at ribosomes based on the genetic information encoded in mRNA (translation). However, the total ensemble of proteins within a cell (the proteome) is far more complex than the genome of an organism alone would allow. For example, in humans, 25,000 genes are thought to lead to a proteome in excess of a million proteins. This enormous diversity arises from the chemical modification of proteins after ribosomal synthesis, leading to further diversification of an otherwise concise proteome. These so-called post-translational modifications (PTMs) encompass a wide range of chemical alterations to the protein structure, such as functionalization of amino acid side chains, and have been shown to modulate the structure and function of several proteins in profound ways1–5. It is widely accepted that nature would not expend energy modifying polypeptides unless the products fulfil a highly important biological role, yet the effect of a given modification on the structure, stability, function and activity of the majority of peptides and proteins across all taxa remains unknown.

The exquisite specificity and potency of peptides and proteins at their targets have led to a renaissance in their use as therapeutics over the past decade6–8. This is reflected in the United States Food and Drug Administration (FDA) approval rates, which are now

NATURE REVIEWS | CHEMISTRY VOLUME 2 | ARTICLE NUMBER 0122 ©2018 Macmillan Publishers Limited, part of Springer Nature. All rights reserved.


more than double those of small molecules. Currently, there are more than 100 approved peptide drugs and over 200 protein therapeutics approved for clinical use9, accounting for more than 10% of the pharmaceutical market (ca. US$40 billion). This number is set to greatly increase in the coming years with hundreds of peptides and proteins currently in clinical trials or undergoing preclinical assessment. Peptide and protein drugs have typically been associated with two key drawbacks that can limit their therapeutic applicability: first, large molecular weight, which hampers bioavailability; and second, the presence of native peptide bonds, which are susceptible to proteolytic degradation, leading to short biological half-lives. In some cases, these shortcomings have been alleviated through the incorporation of native PTMs, for example, glycosylation, or through the installation of ‘bespoke modifications’ such as PEGylation10, lipidation or N-methylation.

While access to large polypeptides and proteins is typically achieved through biological expression in prokaryotic and eukaryotic systems, tailored modifications such as biotinylation, installation of D-amino acids11 or fluorescent tags remain challenging to access using the cellular machinery, despite several advances in unnatural amino acid incorporation12. As such, chemical synthesis has emerged as an attractive avenue to introduce specific modifications site-selectively and homogeneously on a protein of interest. This is in stark contrast to recombinant methods, in which the enzymatic nature of the PTM process results in inseparable heterogeneous mixtures of the target proteins.

Solid-phase peptide synthesis (SPPS) remains the most efficient platform for the chemical preparation of polypeptides up to 40–50 amino acid residues in length13. However, the linear nature of SPPS means that longer syntheses are usually plagued by peptide chain aggregation and steric crowding en bloc, which leads to truncated (uncoupled) sequences, unwanted side products and epimerization. The accumulation of these by-products over iterative steps results in low yields and purities of the final products. The size limit of polypeptide targets that can be generated by SPPS has inspired the development of chemical ligation methods to convergently assemble smaller peptide fragments to generate larger polypeptides and proteins. Early work in this area relied on the condensation of side-chain protected fragments. While this approach has been successfully exploited for the synthesis of peptide therapeutics, including on a production scale14, the poor solubility of crude protected fragments, as well as the susceptibility of the C-terminal amino acid residue to epimerize during activation, has made this approach largely unattractive. A key solution to these problems was provided through the development of peptide ligation technologies that facilitate the formation of native peptide bonds between completely unprotected peptide fragments15–21. These technologies have underpinned the chemical synthesis of numerous peptide and protein targets that were previously inaccessible through recombinant methods or SPPS and have therefore played a key role in addressing a number of important questions in biology and medicine3,22,23. Of the ligation technologies developed to date, the convergent assembly of peptide fragments by native chemical ligation (NCL) represents the most widely employed method15. The power of this methodology is highlighted by its use in the synthesis of hundreds of protein targets to date. This Review focuses on the development of a number of new ligation technologies that have been inspired by the concept of NCL, together with their utility in the total chemical synthesis of large polypeptides and proteins, with and without modifications.

Native chemical ligation

The principle of chemical ligation for peptides was initially explored by Kent and co-workers in the early 1990s (see BOX 1 for the origin of the concept and intellectual framework leading to the development of NCL). The reaction first involves a reversible trans-thioesterification via nucleophilic attack on the thioester by the Cys thiolate moiety, leading to the formation of a thioester linkage between the two peptide segments (FIG. 1a). The thioester intermediate then undergoes a rapid S→N acyl shift to afford a native peptide bond. One of the key features of the reaction is that it operates in purely aqueous media at neutral pH, conditions that aid in the solubility and stability of the unprotected peptide fragments and the protein targets. NCL technology has revolutionized the field of chemical protein synthesis and has been used for the construction of numerous protein targets since the seminal report of the method in 1994 (REF. 24). Examples that highlight the utility of the method include the synthesis of a biologically active variant of the 166‑residue erythropoiesis protein25, the 203‑residue covalent dimer of HIV-1 protease26 and, very recently, the 358‑residue D-Dpo4 enzyme27. The NCL concept has also been successfully applied in a semisynthetic regime using expressed protein ligation as well as for the preparation of head-to-tail cyclic peptides16,28. There are several excellent reviews that highlight applications of the traditional NCL method and, as such, they will not be discussed in any further detail in this article29–31. Instead, the remainder of this Review will highlight the development of new methods inspired by the NCL concept that have greatly expanded the scope of ligation chemistry to provide efficient access to a larger number of synthetic protein targets.

Expanding the scope of NCL beyond cysteine. Despite the empowering nature of NCL in protein synthesis, the need for a Cys residue on the N terminus of one of the peptide fragments limits the possible retrosynthetic disconnections that can be considered when using the method, especially given the paucity of Cys in naturally occurring proteins (1.8% abundance). In recent years, several innovative strategies involving the use of N-terminal auxiliaries32,33 have been devised to enable protein disconnections at alternative ligation junctions and to abrogate reliance on N-terminal Cys residues (see BOX 1 for details). Although auxiliary-mediated ligations have greatly increased the flexibility of ligation chemistry, such methods generally suffer from prolonged reaction times, whereby hydrolysis and epimerization become the dominant competing pathways and often require



Box 1 | Development of NCL and related N-terminal thiol auxiliary approaches

The pursuit of a chemoselective amide-bond-forming transformation between unprotected peptides initially centred on the conjugation of a peptide with a C-terminal thioacid functionality and a peptide with an N-terminal bromoacetamide motif. Importantly, unprotected peptide segments could be solubilized at mM concentrations in 6 M guanidine hydrochloride (GdnHCl) buffer, which enabled efficient conversion into the thioester-linked product (see scheme part a showing the intellectual framework leading to the development of native chemical ligation (NCL))123. Following this study, it was envisioned that the reaction of the C-terminal thioacid with a peptide bearing an N-terminal β-bromoalanine residue would enable the generation of a thioester, which would then undergo an S→N acyl transfer to a native peptide (scheme part a). However, preliminary experiments found aziridine formation to be a notable competing side reaction. Nonetheless, these experiments helped set the intellectual framework for the NCL reaction, which uses a peptide bearing a C-terminal thioester and a peptide containing an N-terminal Cys residue as the coupling partners (scheme part a)15.

The requirement for a cysteine residue makes traditional NCL unsuitable for numerous protein targets, which either do not contain Cys or do not possess this residue at a synthetically useful junction. Early approaches to circumvent this problem involved the use of N-terminal auxiliaries (such as derivatives of ethanethiol or 2-mercaptobenzyl) to mimic the role of the Cys thiol group in facilitating a trans-thioesterification event, therefore enabling proximity-induced acylation of the auxiliary-bound secondary amine (scheme part b shows a general scheme for N-terminal auxiliary-mediated ligation)124. A final cleavage of the ligation auxiliary then affords the native peptide product. Unsurprisingly, additional steric bulk at the N-terminal amine results in a decreased reaction rate and poor sequence tolerance at the ligation junction. Attention later turned to a mechanistically similar method using auxiliaries appended to the side chain of the N-terminal amino acid. An early side-chain auxiliary strategy, developed by Wong and co-workers, employed a thiol-derived β-O-linked carbohydrate moiety and was termed sugar-assisted ligation125–127. Alternative side-chain, backbone and N-terminal auxiliaries have since emerged from the groups of Brik128, Hojo129 and Seitz130.

[Box 1 scheme; structures not reproduced. Part a: thioacid + bromoacetamide gives a thioester peptide analogue (ligation); thioacid + β-bromoalanine gives a native peptide (ligation); thioester + cysteine gives a native peptide (NCL). Part b: N-terminal auxiliary-mediated ligation via trans-thioesterification, S→N acyl shift and auxiliary removal, with auxiliary examples.]

tedious auxiliary removal steps, leading to lower overall yields. Thus, auxiliary-based approaches have been used to access only a small number of protein targets to date.

In 2001, the scope of the NCL methodology was greatly expanded through the introduction of a post-ligation desulfurization concept. In the seminal report by Yan and Dawson34, the product of the NCL reaction was treated with either Raney Ni or Pd on Al2O3 to effect reductive cleavage of the sulfhydryl moiety in the Cys side chain, generating a native Ala residue (FIG. 1a). This method therefore permits the use of Cys as a surrogate for ligation sites containing Ala, a substantially more abundant amino acid (8.9% of residues) in proteins. Ultimately this powerful advance expanded the number of ligation-based disconnections for a given protein target and has been successfully implemented in the synthesis of a number of Cys-free polypeptides and proteins20,35. A limitation of the desulfurization protocol was the requirement for a large excess of nickel or palladium that in some cases led to undesirable side reactions, including demethylthiolization, affording α-aminobutyric acid. A milder, metal-free desulfurization strategy was later developed by Danishefsky and co-workers36. This method relies on the use of a water-soluble radical initiator 2,2ʹ-azobis[2-(2-imidazolin‑2-yl)propane] dihydrochloride (VA‑044) together with tris(2-carboxyethyl)phosphine hydrochloride (TCEP) and a hydrogen atom source (in this case, tBuSH) in aqueous media to effect desulfurization. This radical-promoted desulfurization approach, based on an early report by Hoffmann et al.37, is thought to be initiated by the formation of a thiyl radical at the Cys side chain, which adds reversibly to the phosphine (FIG. 1b). The resulting phosphoranyl radical can



[Figure 1 schemes; structures not reproduced. Panel c toolbox, as labelled: β-thiol Leu, γ-thiol Pro, β-thiol Val, γ-thiol Glu, γ-thiol Val, γ-thiol Gln, thiol Asn, γ-thiol Ile, thiol Thr, β-thiol Asp, β-thiol Phe, β-thiol Arg, γ-thiol Lys, δ-thiol Lys and 2-thiol Trp. hPTH assembly in panel c: thioester 3, hPTH (1–23), and fragment 2, hPTH (25–38) with residue 24 at its N terminus, give 4 by NCL; fragment 5, hPTH (40–59) with residue 39 at its N terminus, and fragment 6, hPTH (61–84) with residue 60 at its N terminus, give 7 by NCL and Thz deprotection; 4 and 7 give hPTH (1) by NCL, desulfurization and folding.]

Figure 1 | Ligation technologies inspired by the NCL concept. a | Native chemical ligation (NCL) followed by the desulfurization of Cys to Ala at the ligation junction. b | Proposed mechanism for the radical desulfurization of Cys to Ala. c | Toolbox of synthetic thiol-derived amino acids compatible with 9-fluorenylmethyloxycarbonyl (Fmoc)-solid-phase peptide synthesis (SPPS) and an exemplar application of β-thiol Leu and γ-thiol Val for the synthesis of human parathyroid hormone (hPTH, 1) using the NCL–desulfurization methodology49.
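The widening of retrosynthetic options described in the text (native Cys, Ala through Cys ligation followed by desulfurization, and the thiolated residues of the FIG. 1c toolbox) can be illustrated with a small bookkeeping sketch. This is only a rough aid under stated assumptions: the function name, the strategy labels and the example sequence are hypothetical, and the sketch ignores practical constraints such as the identity of the C-terminal thioester residue, fragment solubility, or native Cys residues that would also be desulfurized.

```python
from typing import List, Tuple

# One-letter codes of the 13 amino acids that appear as thiolated building
# blocks in the Fig. 1c toolbox (Leu, Pro, Val, Glu, Gln, Asn, Ile, Thr,
# Asp, Phe, Arg, Lys, Trp).
THIOLATED_TOOLBOX = frozenset("LPVEQNITDFRKW")


def ligation_junctions(sequence: str) -> List[Tuple[int, str, str]]:
    """Return (position, residue, strategy) for each candidate junction.

    `position` is the 1-based index of the residue that would sit at the
    N terminus of the C-terminal fragment. The strategy labels are
    illustrative only.
    """
    junctions = []
    for index, residue in enumerate(sequence.upper(), start=1):
        if index == 1:
            continue  # no peptide bond precedes the first residue
        if residue == "C":
            strategy = "NCL at native Cys"
        elif residue == "A":
            strategy = "NCL at Cys surrogate, then desulfurization to Ala"
        elif residue in THIOLATED_TOOLBOX:
            strategy = "ligation at thiolated residue, then desulfurization"
        else:
            continue  # residue has no ligation handle in this sketch
        junctions.append((index, residue, strategy))
    return junctions


if __name__ == "__main__":
    # Hypothetical toy sequence, not a real protein.
    for position, residue, strategy in ligation_junctions("MACLG"):
        print(position, residue, strategy)
```

For a real target, the candidate list would still need to be filtered by hand against the considerations discussed in the text, for example the presence of functionally important native Cys residues.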



then undergo β-scission to produce an alanyl radical and a phosphine sulfide. Rapid hydrogen abstraction by the alanyl radical from an exogenous thiol then generates the native Ala residue. Importantly, these conditions are completely chemoselective in the presence of a range of potentially susceptible functionalities, including thioesters and methionine residues. Very recently, Li and co-workers have shown that rapid and clean desulfurization can be effected without the use of a radical initiator through treatment with sodium borohydride and TCEP38.

Since the seminal report by Yan and Dawson34, the post-ligation desulfurization concept has found enormous application in the total chemical synthesis of proteins via disconnection at Ala residues35. However, the methodology has also served as a catalyst for the concept that thiol-derived variants of other canonical amino acids could be used as Cys surrogates in NCL, followed by desulfurization to native residues. This has fuelled the development of synthetic routes towards suitably protected β-thiol, γ-thiol and δ-thiol derivatives of the proteinogenic amino acids that can be directly incorporated into fragments by SPPS and employed in protein synthesis using a ligation–desulfurization manifold. A decade on from the synthesis of the first thiol-derived amino acid, intensive research efforts by a number of research groups have culminated in a comprehensive toolbox of 9-fluorenylmethyloxycarbonyl (Fmoc)-SPPS-compatible thiolated variants of 13 of the proteinogenic amino acids, which have greatly expanded the repertoire of peptide ligation chemistry35,39,40 (FIG. 1c). The first contribution to the amino acid toolbox was β-thiol Phe, reported independently by Crich41 and Botti42. The β-thiol Phe moiety was shown to successfully mediate ligation reactions with peptide thioesters in good yields when incorporated on the N terminus of peptides. Subsequent removal of the β-thiol auxiliary could be performed using nickel boride to generate the native Phe residue at the ligation junction; however, this desulfurization can also be performed using a radical initiator43. This proof-of-concept study, which demonstrated that ligation–desulfurization chemistry was possible at amino acids other than Cys, sparked interest in other thiol-derived amino acids by the community, some key examples of which are highlighted below.

Val and Leu represent some of the most abundant amino acids found in proteins (6.8% and 9.8%, respectively), and it was therefore not surprising that these were early targets for synthesis. Seitz and co-workers made use of a suitably protected variant of penicillamine (β,β-dimethylcysteine) as a Val surrogate44, while Danishefsky and co-workers developed a γ-thiol valine reagent that led to faster reaction kinetics when reacted with peptide thioesters compared with the homologous reactions at β,β-dimethylcysteine, owing to the improved accessibility of the thiol auxiliary45. However, ligation products from both Val surrogates could be cleanly desulfurized using the metal-free method (TCEP and VA-044) to afford native polypeptide products. Following this, syntheses of β-mercapto Leu derivatives were independently reported by the Brik46 and Danishefsky47 groups, the former report showcasing the methodology through the ligation-based assembly of the HIV-1 Tat protein. Another key reagent is the γ-thiol Lys derivative48, which, owing to its ability to mediate ligation at both α-amino and ε-amino groups through six-membered S→N acyl shift transition states, has found numerous important applications in protein synthesis. More specifically, this dual ligation capability offers synthetic access to peptides and proteins bearing natural PTMs at the Lys side chain, including acetylation, ubiquitylation and methylation.

A powerful example of the benefits offered by the thiol-derived amino acid toolbox is highlighted by the consecutive use of β-thiol Leu, γ-thiol Val and Cys for the assembly of human parathyroid hormone (hPTH, 1), reported by Danishefsky and co-workers49 (FIG. 1c). More specifically, bifunctional fragment 2 bearing an N-terminal β-thiol Leu and a C-terminal alkyl thioester was initially ligated with thiophenyl thioester 3 to yield hPTH (1–38, 4). Separately, N-terminal Thz-protected fragment 5, functionalized as a thiophenyl thioester on the C terminus, was reacted with γ-thiol Val fragment 6. Subsequent Thz deprotection of the resulting ligation product afforded the Cys-bearing fragment hPTH (39–84, 7). A final NCL between hPTH (1–38, 4), possessing a C-terminal thioester, and hPTH (39–84, 7), with an N-terminal Cys residue, yielded the full-length hPTH sequence. Global desulfurization then removed all three thiol auxiliaries to generate the native protein hPTH (1–84, 1) after a folding step.

A final noteworthy addition to the toolbox is β-thiolated Asp, which can be prepared in three steps from a commercially available Asp starting material [Boc-Asp(OtBu)–OH]50. This reagent has proved particularly useful owing to the development of an initiator-free chemoselective desulfurization reaction using TCEP and dithiothreitol (DTT) at pH 3, which enables removal of the β-thiol auxiliary in the presence of free sulfhydryl side chains of native Cys residues. This selective desulfurization technique therefore obviates the need for protecting group manipulation in protein targets containing functionally important Cys residues, as highlighted in the synthesis of the extracellular N-terminal domain of the chemokine receptor CXCR4 bearing two PTMs50. A further application of this chemoselective ligation–desulfurization methodology was recently reported by Becker and co-workers, who prepared a number of differentially PEGylated prion proteins by ligation at β-thiol Asp followed by chemoselective desulfurization in the presence of a native unprotected Cys residue51.

Directional flexibility for iterative assembly of proteins. Protein assembly via iterative ligation reactions in the N→C direction was originally achieved by harnessing the differing reactivity of thioesters in a kinetically controlled ligation. Kent and co-workers first demonstrated this concept by using the reactivity of a thiophenyl thioester on the C terminus of one fragment with a bifunctional fragment possessing Cys on the N terminus and a less reactive alkyl thioester on the C terminus that does not partake in the ligation reaction. This methodology was initially showcased in the six-segment assembly of the small protein crambin52.
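The bookkeeping behind a kinetically controlled N→C assembly can be sketched as a toy model. It encodes only the qualitative rule given above, namely that a more reactive (aryl) thioester reacts in preference to a less reactive alkyl thioester on a bifunctional fragment. The class, the function names, the ordinal reactivity ranks and the residue numbering are all assumptions made for illustration, and the `activate` step simply stands in for conversion of the surviving alkyl thioester into a more reactive one; none of this models real kinetics.

```python
from dataclasses import dataclass, replace
from typing import Optional

# Ordinal reactivity ranks only (assumed for illustration): aryl > alkyl.
THIOESTER_RANK = {"aryl": 2, "alkyl": 1}


@dataclass(frozen=True)
class Fragment:
    first: int                 # number of the first residue
    last: int                  # number of the last residue
    n_terminal_cys: bool       # carries an N-terminal Cys
    thioester: Optional[str]   # "aryl", "alkyl" or None (free C terminus)


def ligate(donor: Fragment, acceptor: Fragment) -> Fragment:
    """Join donor (thioester) and acceptor (N-terminal Cys) in the N->C sense."""
    if donor.thioester is None:
        raise ValueError("donor needs a C-terminal thioester")
    if not acceptor.n_terminal_cys:
        raise ValueError("acceptor needs an N-terminal Cys")
    if donor.last + 1 != acceptor.first:
        raise ValueError("fragments are not contiguous")
    donor_rank = THIOESTER_RANK[donor.thioester]
    acceptor_rank = THIOESTER_RANK.get(acceptor.thioester, 0)
    if donor_rank <= acceptor_rank:
        # Without a reactivity gap the bifunctional acceptor could react
        # with itself instead of with the donor.
        raise ValueError("no kinetic discrimination between thioesters")
    # The product keeps the acceptor's C terminus for the next round.
    return Fragment(donor.first, acceptor.last,
                    donor.n_terminal_cys, acceptor.thioester)


def activate(fragment: Fragment) -> Fragment:
    """Stand-in for converting an alkyl thioester into a more reactive one."""
    if fragment.thioester != "alkyl":
        raise ValueError("only an alkyl thioester is activated in this sketch")
    return replace(fragment, thioester="aryl")


if __name__ == "__main__":
    # Hypothetical three-segment target spanning residues 1-30.
    first = Fragment(1, 10, False, "aryl")
    middle = Fragment(11, 20, True, "alkyl")   # bifunctional fragment
    last = Fragment(21, 30, True, None)
    intermediate = ligate(first, middle)
    print(ligate(activate(intermediate), last))
```

In this model the middle fragment's alkyl thioester survives the first ligation untouched, which is the property that allows the assembly to proceed one junction at a time.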



[Figure 2 schemes; structures not reproduced.
a: TFET thioester 16, madanin-1 (1–28), and bifunctional fragments 12–15, madanin-1 (30–47) with residue 29 at the N terminus (12: R1 = H, R2 = H; 13: R1 = SO2OCH2C(CH3)3, R2 = H; 14: R1 = H, R2 = SO2OCH2C(CH3)3; 15: R1 = R2 = SO2OCH2C(CH3)3), undergo NCL. NCL (TFET) with 17, madanin-1 (49–60) with residue 48 at its N terminus, followed by desulfurization gives the sulfated variants of madanin-1, 8–11 (8: R1 = H, R2 = H; 9: R1 = SO3−, R2 = H; 10: R1 = H, R2 = SO3−; 11: R1 = R2 = SO3−).
b: thioester 19, SUMO-1 (2–50), and fragment 20, SUMO-1 (52–97) with residue 51 at its N terminus and a C-terminal SEAoff group, undergo NCL (MPAA). TCEP (SEAoff → SEAon), N→S acyl shift and NCL (MPAA) with 21 give SUMO-1 peptide conjugate 18.
c: acyl hydrazide 23, α-syn (1–29), is converted into the thioester and ligated (NCL, MPAA) with 24, α-syn (31–68) with residue 30 at its N terminus; the sequence is repeated with 25, α-syn (70–106) with residue 69 at its N terminus, and 26, α-syn (108–140) with residue 107 at its N terminus, followed by desulfurization to give α-synuclein (22). Inset, acyl hydrazide → thioester: activation with NaNO2 gives the acyl azide, and thiolysis with MPAA gives the thioester.]



Figure 2 | Chemical protein synthesis via iterative ligations in the N→C direction. a | One-pot synthesis of a library of sulfated madanin-1 proteins 8–11 via kinetically controlled ligation56. The ligation proceeds regioselectively at the 2,2,2-trifluoroethanethiol (TFET) thioester owing to increased reactivity compared with the other alkyl thioester; this less reactive thioester can be subsequently converted into the TFET thioester to facilitate a second ligation. b | Synthesis of SUMO-1 peptide conjugate 18 using bis(2-sulfanylethyl)amido (SEA) chemistry69. The N-acyl perhydro-1,2,5-dithiazepine moiety (SEAoff) remains inactive during the first ligation and can be converted into the thioester (through a N→S acyl shift) using a reductant and an exogenous thiol additive to execute the second ligation. c | Synthesis of α-synuclein 22 using acyl hydrazides as thioester surrogates68. The hydrazide moiety remains inactive under native chemical ligation (NCL) conditions and can be converted into a thioester (through activation of the hydrazide with NaNO2 followed by thiolysis of the resulting acyl azide with an external thiol additive) for iterative ligation reactions in the N→C direction. In structures 18 and 21, AEGR = Ala-Glu-Gly-Arg and ISAR = Ile-Ser-Ala-Arg. MPAA, mercaptophenylacetic acid; TCEP, tris(2-carboxyethyl)phosphine hydrochloride.

The kinetically controlled ligation concept has been further modified for one-pot protein synthesis by the use of a thiol additive, 2,2,2-trifluoroethanethiol (TFET), which serves to increase the rate of ligation reactions through in situ generation of thioesters with increased reactivity53. Aryl thiol additives, such as mercaptophenylacetic acid (MPAA) (pKa = 6.6) and thiophenol (pKa = 6.6), are more commonly used to accelerate NCL reactions owing to their demonstrated proficiency in thioester exchange reactions with alkyl thioesters as well as their excellent leaving group ability upon reaction with the cysteinyl peptide fragment. Unfortunately, the radical quenching activity of these aryl thiol additives prohibits in situ radical desulfurization of the ligation products, and intermediate purification and lyophilization steps must therefore be carried out before a subsequent ligation reaction can be performed. Alternative methods to extract the aryl thiol species following ligation have been reported, including liquid–liquid extraction54 and solid-phase capture procedures55. However, TFET alleviates the need for these additional steps; the pKa (7.3) leads to highly competent thioester exchange and efficient acylation by the Cys thiol moiety. Furthermore, the volatility of TFET (boiling point = 35–37 °C) permits facile post-ligation removal through simple sparging with an inert gas; however, the alkyl thiol TFET is a poor radical quencher and therefore can remain in the reaction for in situ desulfurization of Cys or thiol amino acids at the ligation junction. It should be noted that commercially available TFET often requires distillation before use (depending on the source and purity) and, as a malodorous and volatile thiol, should be handled inside a fume hood.

The power of the one-pot kinetically controlled ligation–desulfurization strategy was showcased in the efficient assembly of four differentially sulfated variants of madanin-1, a 60‑amino acid Cys-free thrombin inhibitor produced by the hard tick Haemaphysalis longicornis56 (FIG. 2a). The family of proteins (8–11) was assembled through the use of three suitably reactive peptide fragments, with the middle bifunctional fragments (12–15) possessing all possible sulfated variants at two Tyr sulfation sites (Tyr32 and Tyr35). The preformed TFET thioester 16 was initially ligated to bifunctional (sulfo)peptide fragments 12, 13, 14 or 15 bearing an N-terminal β-thiol Asp residue and a C-terminal alkyl thioester. The ligation proceeded regioselectively at the TFET thioester owing to increased reactivity compared with the alkyl thioester on 12–15. Upon complete reaction, the C-terminal Thr alkyl thioester was activated with 2 vol.% TFET and subjected to a second ligation with cysteinyl peptide 17. The resulting full-length product was finally subjected to in situ global desulfurization using VA-044, TCEP and reduced glutathione to convert Cys and β-SH Asp residues into Ala and Asp, respectively, affording native madanin-1 8 and madanin-1 sulfoproteins 9–11 in excellent yields over the multistep sequence. These synthetic proteins enabled the importance of Tyr sulfation for anticoagulant and thrombin inhibitory activity to be determined. Specifically, Tyr sulfation was shown to provide a 2–3 orders of magnitude improvement in thrombin inhibitory activity over the unmodified madanin-1 homologue56.

Several other strategies have also been developed to enable iterative ligation reactions in the N→C direction. The most useful of these fall broadly into the category of thioester precursors and include C-terminal Cys activation57, the bis(2-sulfanylethyl)amido (SEA) auxiliary58, N-sulfanylethylanilide auxiliary59, N-alkyl Cys60,61, 3,4-diaminobenzoic acid (Dbz)62 and o-amino(methyl)aniline (MeDbz)63 linkers, o-aminoanilides64 and peptide acyl hydrazides65–68. The SEA auxiliary, first reported by Melnyk and co-workers, possesses a 1,7‑dithiol structure, which allows rapid interconversion between the inactive N-acyl perhydro-1,2,5-dithiazepine moiety (SEAoff) and the N→S acyl shift-active SEA dithiol form (SEAon) through simple redox manipulations. In the reduced form, the SEA auxiliary is competent in ligation chemistry through conversion into a thioester either by exchange with an exogenous additive, such as 3-mercaptopropionic acid, or through trapping of the N→S acyl-shifted SEA thioester with glyoxylic acid. Importantly, the SEAoff cyclic disulfide is compatible with mild reducing agents (for example, MPAA) commonly employed as thiol catalysts in NCL reactions, allowing the use of NCL and SEA ligations in concert. The orthogonality of the SEA auxiliary with NCL was recently highlighted in the synthesis of functional SUMO-1 peptide conjugate 18 (REF. 69) (FIG. 2b). Initially, SUMO fragment thioester 19 was prepared via activation of the peptide bearing a C-terminal SEA auxiliary (not shown) through exchange with 3-mercaptopropionic acid. The resulting thioester 19 was then subjected to MPAA-catalysed NCL with fragment 20 bearing an N-terminal Cys and a latent C-terminal SEAoff moiety. Subsequent addition of TCEP facilitated the switching of SEAoff→SEAon and the SEA ligation could then be conducted with fragment 21 in a one-pot manner to afford 18.

Peptide acyl hydrazides have also proved to be highly useful thioester surrogates for the N→C assembly of protein targets through ligation chemistry65. Conversion of a given peptide with a C-terminal acyl hydrazide functionality into a thioester is performed through an operationally simple activation of the hydrazide with NaNO2 followed by thiolysis of the resulting acyl azide with an external thiol additive. Crucially, the hydrazide moiety

NATURE REVIEWS | CHEMISTRY VOLUME 2 | ARTICLE NUMBER 0122 | 7 ©2018 Macmillan Publishers Limited, part of Springer Nature. All rights reserved.


remains inactive under NCL conditions, therefore acting as a masked thioester that can be unleashed for iterative ligation reactions in the N→C direction. An elegant example of the acyl hydrazide-based NCL approach is the preparation68 of α-synuclein 22, a protein that has been implicated in the formation of neuronal Lewy bodies and in the progression of several neurodegenerative disorders, including Parkinson’s disease. Liu et al.68 devised a four-segment N→C sequential ligation strategy starting with the activation of acyl hydrazide fragment 23 with NaNO2, followed by thiolysis (FIG. 2c). The resulting peptide thioester could then be ligated to fragment 24 in an MPAA-catalysed NCL reaction. This procedure was then repeated with fragments 25 and 26 to afford the full-length protein. Global radical desulfurization to effect Cys into Ala conversions at each of the three ligation junctions then afforded synthetic α-synuclein 22 in excellent overall yield.

In a manner similar to N→C protein assembly, several effective methods have also been developed for assembling proteins in the C→N direction. The crux of this concept is to precisely control sequential ligation steps through the use of orthogonal protecting groups for N-terminal Cys residues or Cys surrogates. The design and utility of appropriate Cys protecting groups remains a contemporary research focus70; however, several viable strategies have been reported in successful protein syntheses, including Thz71,72 derivatives and acetamidomethyl (Acm)73 protection of Cys. An elegant method from Brik and co-workers employed a Thz-protected δ-thiol Lys residue to facilitate three iterative ligations in the C→N direction at the ε-amino moiety of Lys with ubiquitin chains functionalized as C-terminal thioesters to generate tetraubiquitin 27 (REF. 74) (FIG. 3a). Protein assembly was accomplished using three ubiquitin fragments: 28 containing an N-terminal δ-thiol Lys, 29 bearing an N-terminal Thz-protected δ-thiol Lys and a C-terminal thioester, and peptide thioester 30. Using iterative cycles of benzylmercaptan-catalysed and thiophenol-catalysed ligation reactions and acidic methoxyamine-mediated Thz deprotection steps, four ubiquitin units were assembled to afford the 304-amino acid protein tetramer. Global radical desulfurization of the three δ-thiol Lys residues to native Lys then provided tetraubiquitin 27. Notably, Liu and co-workers have very recently reported the synthesis of hexaubiquitin through iterative acyl hydrazide chemistry, which represents one of the largest proteins to ever be prepared by chemical synthesis75.

Kajihara and co-workers also demonstrated the power of an iterative C→N ligation strategy for the synthesis of a homogeneously glycosylated variant of interferon-β (IFNβ, 31)76. The approach involved disconnection of the 166-amino acid target into three fragments, whereby two Cys residues were introduced for NCL reactions, while the three native Cys residues were protected with Acm groups throughout the protein assembly (FIG. 3b). The synthesis began with NCL between N-terminal cysteinyl fragment 32 and the thioester of glycosylated fragment 33, which possessed a Thz residue on the N terminus. Subsequent methoxyamine-mediated Thz deprotection unmasked the N-terminal Cys residue to afford 34, which could then participate in ligation with N-terminal fragment 35. With the full-length protein assembled, desulfurization of the non-native Cys residues was effected under metal-free conditions. Silver acetate-promoted removal of the Acm protection on Cys and saponification of the benzyl protection on sialic acid followed by folding furnished homogeneously glycosylated IFNβ (31).

The ability to perform ligation–desulfurization reactions in an iterative manner in both the N→C and C→N directions has greatly improved the efficiency of chemical protein synthesis. With these technologies, the community has redefined the targets that can be produced, with substantially larger targets (>120 amino acids) now becoming more routinely accessible.

Extending NCL to selenocysteine

In parallel with the revolutionary advances of the NCL reaction manifold, there has also been considerable research attention focused on tackling some of the inherent limitations of the technology, specifically the lack of chemoselectivity of the desulfurization reaction in the presence of native Cys residues and the prohibitively slow ligation rates at sterically demanding amino acid junctions. In 2001, three independent groups demonstrated that the 21st amino acid (Sec) was competent in NCL-like transformations with peptide thioesters, thus providing access to large selenopeptides and selenoproteins for the first time77–79. Sec was first acknowledged to be biologically vital based on the selenoenzyme glutathione peroxidase, which displayed selenium-based catalytic activity80. Since then, several selenoproteins have been identified with functions ranging from phospholipid , muscle development and calcium mobilization, to modulators of redox-regulated signalling81–89.

Despite being the chalcogenic analogue of Cys, Sec exhibits some strikingly different physicochemical properties. First, Sec exhibits a considerably lower reduction potential (−381 mV) than Cys (−180 mV)90,91. As a result, Sec readily undergoes air oxidation and exists exclusively as a dimeric species (diselenide)92. A reducing agent is therefore required for NCL reactions to proceed efficiently through the generation of the monomeric selenolate78. Second, the pKa of Sec (5.2–5.6) is lower than that of Cys (8.2), meaning that when monomeric, it exists predominantly as selenolate at physiological pH, thus enabling NCL at Sec to be performed at a lower pH and offering higher yields by minimizing thioester hydrolysis.

Since its inception, Sec-mediated NCL has been exploited for the synthesis of a wide range of peptides67,79,93–95 and proteins77,78,96–100. Some of the early examples include a 17-mer fragment of ribonucleotide reductase79 and a bovine pancreatic trypsin inhibitor (BPTI) analogue78, both possessing an intramolecular selenosulfide linkage. In another example, Hilvert and co-workers synthesized a cyclic peptide by macrolactamization of a linear precursor functionalized with a C-terminal thioester and an N-terminal Sec residue by NCL94. In addition, the internal Sec in the ligated cyclic product was shown to be amenable to various synthetic


Figure 3 | Chemical protein synthesis via iterative ligations in the C→N direction. a | Synthesis of the 304-residue tetraubiquitin (27)74. A Thz-protected δ-thiol Lys residue was employed to facilitate three iterative ligations at the ε-amino moiety of Lys with ubiquitin (Ubi) chains functionalized as C-terminal thioesters to generate tetraubiquitin. b | Synthesis of a homogeneously glycosylated variant of human interferon-β (IFNβ, 31)76. Three native Cys residues (that were inappropriately placed for use in native chemical ligation (NCL)) were protected with Acm groups during ligation–desulfurization reactions.

transformations, such as alkylation, oxidative elimination and reductive deselenization94. Sec was also found to be compatible with expressed protein ligation (EPL), demonstrated through the synthesis of RNase A77 and azurin101. In both cases, a synthetic peptide bearing an N-terminal selenocystine, instead of native Cys, was ligated with a large protein thioester derived from recombinant techniques. Very recently, Rozovsky and co-workers have developed a method for the incorporation of Sec into expressed protein fragments by enriching the growth medium for Escherichia coli with Sec, such that it could be subsequently incorporated using the Cys codon102.


Figure 4 | Applications of peptide ligation chemistry at the 21st amino acid (Sec). a | Native chemical ligation (NCL) at Sec followed by chemoselective deselenization in the presence of unprotected Cys. b | Proposed mechanism for the chemoselective deselenization of Sec. c | Synthesis of PHPT1 (40) via a key chemoselective deselenization step97. d | Synthesis of eglin C (41) via ligation–oxidative deselenization96. MPAA, mercaptophenylacetic acid.
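
The ionization argument made in the text for Sec versus Cys can be put in rough numbers. The following is a minimal sketch, assuming simple Henderson–Hasselbalch behaviour of an isolated side chain and using only the pKa values quoted in this Review (5.2–5.6 for Sec, 8.2 for Cys). The function and variable names are illustrative, and any influence of the surrounding peptide sequence or buffer composition on the effective pKa is ignored.

```python
def fraction_deprotonated(pka: float, ph: float) -> float:
    """Fraction of a monoprotic acid present as its conjugate base at a given pH."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


# pKa values as quoted in the text; the labels are illustrative only.
SIDE_CHAIN_PKA = {
    "Sec (lower bound)": 5.2,
    "Sec (upper bound)": 5.6,
    "Cys": 8.2,
}


def ionization_table(ph_values=(5.0, 6.0, 7.0, 7.4)):
    """Map each side chain to {pH: fraction present as selenolate or thiolate}."""
    return {
        name: {ph: fraction_deprotonated(pka, ph) for ph in ph_values}
        for name, pka in SIDE_CHAIN_PKA.items()
    }


if __name__ == "__main__":
    for name, row in ionization_table().items():
        cells = ", ".join(f"pH {ph}: {frac:.1%}" for ph, frac in row.items())
        print(f"{name}: {cells}")
```

Under these assumptions, the selenol is almost fully ionized at neutral pH and remains largely ionized near pH 6, whereas only a minor fraction of the Cys thiol is present as thiolate, which is consistent with the observation that ligation at Sec tolerates more acidic conditions.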

Importantly, the ENLYFQ motif was N-terminally fused to Sec, and the resulting Q–Sec junction in the mature construct could be cleaved using tobacco etch virus (TEV) protease to afford proteins with an N-terminal Sec residue. This enabled the development of expressed protein ligation at Sec with large expressed fragments (Sec-EPL). This work provides the impetus for the site-specific modification of Sec residues within proteins in the future; however, the method currently cannot accommodate Cys residues in the Sec-containing fragment owing to non-selective incorporation.

In 2010, a landmark contribution to Sec-based ligation chemistry came from Dawson and co-workers, who discovered that deselenization of a Sec residue could be achieved under mild conditions using the reducing agent TCEP and a hydrogen donor, such as DTT (FIG. 4a). Importantly, this method does not require a radical initiator and was completely chemoselective in the presence of unprotected Cys residues95. This pivotal finding highlighted the enormous potential of Sec-based NCL as a method for the construction of proteins retaining native Cys residues that may be crucial to the structure and/or function of a given protein target. It is important to note that before the development of this methodology, the synthesis of Cys-containing targets under an NCL–desulfurization regime necessitated the use of protecting groups on the side chains of Cys residues during the desulfurization step. The observed selectivity is proposed to arise from the weaker C–Se bond, favouring formation of the alanyl radical at Sec over Cys. Mechanistically, the deselenization is proposed to proceed through reversible addition of a selenium-centred radical to the phosphine, leading to a phosphorus-centred radical species. The highly thermodynamically favourable production of a phosphine selenide is then proposed to drive the homolysis of the C–Se bond. The resulting β-carbon-centred radical is then capable of hydrogen atom abstraction to generate the native alanine residue (FIG. 4b). Initially, the ligation–chemoselective deselenization approach was applied on small peptidic systems, including a 38-residue fragment of the redox glutaredoxin 3 (Grx3, 1–38)95. However, the power of this methodology


was further exemplified by Metanis et al. in the synthesis of the 125-amino acid human enzyme phosphohistidine phosphatase (PHPT1, 40)97. With three Cys residues located in the C-terminal region of the sequence, a strategy that employed both traditional NCL and Sec-mediated ligation reactions was devised, with three segments undergoing sequential ligations in the C→N direction (FIG. 4c). Bifunctional segment 37 was prepared bearing an N-terminal Sec residue protected as a Sez and functionalized with a C-terminal thioester for a standard NCL reaction with the N-terminal Cys residue of fragment 38. The ligated product was treated with MeONH2 to effect conversion of the Sez moiety into Sec and subjected to Sec-mediated NCL with C-terminal thioester segment 39. Interestingly, while Sez was demonstrated to be stable under Fmoc-SPPS and NCL conditions (similar to Thz), the authors reported that MeONH2-promoted ring opening was faster in the case of Thz (based on a model system). The purified ligation product could then be successfully deselenized to afford PHPT1 (40). Importantly, the deselenization of Sec proceeded smoothly without modifying the three unprotected Cys residues present in the sequence, thus highlighting the selectivity of the protocol. The same group later accomplished the total synthesis of the 122-residue human selenoprotein M (SELM) through the iterative Sec-mediated ligation–deselenization assembly of four fragments in the C→N direction98. Notably, SELM comprises a CXXU motif that is crucial for its biological activity; this would otherwise be inaccessible with traditional Cys-based ligation methods.

A further exploitation of the Sec reactivity was demonstrated by the Payne96 and Metanis103 groups, who independently discovered that treatment of Sec with TCEP in the presence of an exogenous oxidant leads to clean conversion into serine at the ligation junction. The discovery of this oxidative deselenization transformation has further broadened Sec ligation chemistry beyond Ala disconnections96,103,104. While the Metanis group performed deselenization in the presence of oxygen103, Payne and co-workers employed oxone as the oxidant96,104. Notably, the latter approach has been successfully employed for the synthesis of MUC4 and MUC5AC-based glycopeptides96,104. Furthermore, the methodology was used to assemble the Cys-free protein eglin C (41) via a single ligation between C-terminal thioester 42 and selenocystine-bearing fragment 43, followed by oxidative deselenization96 (FIG. 4d).

In a manner similar to the improvement in the scope of NCL through the development of thiol-derived amino acids, expansion of the Sec ligation was first attempted by Danishefsky and co-workers, who developed the synthesis of a trans-γ-selenoproline building block in three steps from orthogonally protected hydroxyproline93. This amino acid was subsequently incorporated into model peptides using Fmoc-based SPPS and was successfully used in ligation–deselenization chemistry with various peptide thioesters. Malins et al. also developed an efficient synthesis of a suitably protected β-seleno amino acid from Garner’s aldehyde, which could also be successfully employed in ligation chemistry followed by chemoselective deselenization in the presence of unprotected Cys67. Taken together, the Sec-based ligation methods, coupled with chemoselective deselenization chemistry, represent powerful new approaches for accessing protein targets without strategically placed Cys residues or where chemoselective removal of the ligation auxiliary in the presence of other sensitive residues (for example, structurally or functionally important Cys residues) is necessary.

Selenoester acyl donors for acceleration of ligation-based protein assembly. The rate of NCL is known to be strongly influenced by the steric and electronic environment of the C-terminal amino acid residue of the thioester component. For instance, peptide thioesters bearing sterically hindered β-branched amino acids at the C terminus (for example, Ile, Thr and Val) suffer from sluggish reaction rates, affording lower ligation yields owing to competing thioester hydrolysis. In the case of C-terminal Pro thioesters, an n→π* electronic donation into the carbonyl carbon leads to reduced electrophilicity of the prolyl thioesters, making Pro-Cys junctions synthetically intractable105. A solution to this problem was reported by Durek and Alewood, who rationalized that replacement of the thioester moiety by an alkyl selenoester would lead to productive ligation chemistry, owing to the superior leaving group ability of the selenolate over the thiolate. Indeed, ligations at model peptides bearing a C-terminal prolyl selenoester were complete in 2 hours, nearly 350 times faster than traditional NCL106 (FIG. 5a). This initial study laid the foundation for the use of selenoesters as acyl donors in the ligation-based assembly of proteins, including several reports describing efficient methods for accessing peptide selenoesters of various lengths both in solution107 and on the solid phase108.

More recently, Mitchell et al. postulated that substantial rate accelerations in peptide ligation chemistry could be achieved by harnessing the superior reactivity of C-terminal selenoesters (in this case, aryl selenoesters) in combination with the improved nucleophilicity of Sec at the N termini of the other reacting peptide fragments. To test this hypothesis, two model peptides were chosen for the initial experiments, one functionalized as a C-terminal Ala phenyl selenoester and the other a diselenide dimer possessing an N-terminal selenocystine residue. The authors initially explored the possibility of implementing electrochemistry for reduction of the diselenide to the ligation-active selenolate in order to circumvent the concomitant deselenization of Sec that occurs in the presence of phosphine reductants. However, in a serendipitous finding, the control experiment (without application of a current) led to the generation of the desired ligation product107. Strikingly, the ligation proceeded cleanly by simply dissolving the peptide fragments in denaturing buffer without the addition of any additives. The additive-free reaction, which was subsequently dubbed diselenide–selenoester ligation (DSL), was also complete within 60 seconds at room temperature, representing a large rate acceleration over the analogous reaction under an NCL manifold.


Figure 5 | Peptide ligation chemistry using selenoesters as the acyl donor. a | Native chemical ligation (NCL) using proline selenoesters. b | Additive-free diselenide–selenoester ligation (DSL)–deselenization technology. c | Application of additive-free DSL together with traditional NCL for the one-pot synthesis of the Mycobacterium tuberculosis protein ESAT-6 (48)107. The synthesis of the target was accomplished in high yield via a one-pot kinetically controlled ligation of three fragments in the N→C direction, exploiting the inherent difference in reactivity of thioesters and selenoesters. Key steps involved chemoselective DSL, followed by 2,2,2-trifluoroethanethiol (TFET)-mediated NCL with concomitant deselenization and subsequent desulfurization.

This rate acceleration was also maintained at sterically hindered selenoesters, where ligations were complete within 10 minutes, comparing favourably with the analogous NCL reactions, which require up to 48 hours. Because selenoesters are considerably more prone to hydrolysis than thioesters (at pH > 7), careful pH adjustment during ligation is crucial. Fortunately, the ability to perform these ligations at acidic pH (5–7) enables competing selenoester hydrolysis to be circumvented. It is important to note that the final ligation product is typically obtained as a mixture of symmetrical diselenide, asymmetrical diselenide and product bearing a selenoester linkage on the Sec used for ligation. However, all of these products coalesce into a single product following in situ deselenization (via treatment with TCEP and DTT) (FIG. 5b). Intrigued by this unique transformation, a series of experimental and computational investigations were undertaken to gain mechanistic insight into this rapid ligation methodology. Given the absence of any reductants or additives, it has been proposed that there is a unique initiation step for the DSL transformation to provide a competent intermediate that can enter a native chemical ligation-like pathway. In addition, based on the data compiled from theoretical and experimental observations, precipitation of diphenyl diselenide (DPDS), a by-product generated during ligation, in aqueous buffer was proposed to be a potential driving force for the reaction. It is worth mentioning that the DPDS produced during ligation acts as a radical quencher and thus needs to be removed through hexane extraction before performing an in situ deselenization reaction. Recently, ligation has also been used in conjunction with oxidative deselenization technology to afford serine at the ligation junction, as showcased in the synthesis of fragments of some human mucin glycoproteins104.

To illustrate the synthetic utility, the additive-free DSL–deselenization methodology was also applied in the construction of two proteins107. First, intracellular chorismate mutase from Mycobacterium tuberculosis was assembled through a one-pot ligation–deselenization approach with 57% yield over two steps. After folding, the full-length enzyme was found to possess structure and catalytic activity similar to those of the wild-type enzyme. The orthogonality of the DSL chemistry with NCL was also exemplified through the synthesis of another M. tuberculosis protein, early secretory antigenic protein 6 (ESAT-6). The synthesis of the target was accomplished in high yield via a one-pot kinetically controlled ligation of three fragments in the N→C direction (FIG. 5c). Specifically, the inherent difference in the reactivity of thioesters and selenoesters was exploited


through chemoselective DSL between bifunctional middle diselenide dimer segment 44, with an N-terminal selenocystine and a C-terminal thioester, and peptide phenylselenoester 45. The ligated product 46 was generated exclusively in minutes and could be subsequently subjected to NCL with C-terminal segment 47 bearing an N-terminal Cys using TFET as an additive. The presence of TCEP in the NCL step also led to the concomitant deselenization of Sec40 used for the initial DSL reaction. Upon desulfurization using VA-044, TCEP and glutathione, ESAT-6 48 was obtained in 43% yield over four steps following a single high-pressure liquid chromatography (HPLC) purification. The speed and efficiency of the additive-free DSL technology make it a valuable addition to the ligation chemistry toolbox for the chemical synthesis of proteins. With salient features such as operational simplicity, unprecedented reaction rates (even at sterically encumbered junctions), broad pH tolerance (pH 3–7) and compatibility with unprotected Cys residues and NCL, it is likely that this methodology will find wide application in the chemical synthesis or semi-synthesis of numerous other important protein targets with or without PTMs and other modifications in the future.

Rapid protein assembly via DSL chemistry at selenol-derived amino acids. Since the first report of DSL in 2015, a number of suitably protected selenylated amino acids, including β-selenoLeu109, β-selenoAsp110 and γ-selenoGlu110, have been developed with a view to broaden the scope of the methodology (FIG. 6). Each of these building blocks is compatible with Fmoc-SPPS and has been successfully employed in DSL–deselenization transformations, including protein synthesis, as highlighted below.

The utility of DSL–deselenization chemistry at β-selenoLeu has been powerfully demonstrated in the synthesis of a library of differentially modified variants of the CCL5 (also known as RANTES) chemokine-binding protein UL22A from human cytomegalovirus, which was predicted to be sulfated at Tyr65 and Tyr69 (REF. 109). As there are no Cys or suitably placed alanine residues in the native sequence of UL22A, the protein could not be assembled through traditional ligation methods. Wang and co-workers therefore chose to disconnect UL22A at a challenging Val-Leu junction, leading to two target fragments, diselenide 49 bearing an N-terminal β-selenoLeu and N-terminal fragments 50–53 with variation in the sulfation state at Tyr65 and Tyr69 (FIG. 6a). The synthesis of 49 was achieved by Fmoc-SPPS, incorporating the suitably protected β-selenoLeu, which was in turn accessed from Garner’s aldehyde in eight steps109. Additive-free DSL reactions between 49 and 50–53 were initially unsuccessful, presumably owing to the sterically demanding nature of the Val-Leu junction. Nonetheless, all ligations with (sulfo)peptide phenylselenoesters 50–53 proceeded smoothly in the presence of TCEP and DPDS as additives, reaching completion in just 1 hour. Following in situ deselenization and HPLC purification, the desired (sulfo)proteins were obtained in excellent yields. The doubly sulfated UL22A was shown to possess a 2.5 orders of magnitude improvement in binding to RANTES over the unmodified protein, thus validating the importance of sulfation for biological activity109.

Although DSL reactions generally reach completion in minutes, the chemistry is followed by a comparatively sluggish deselenization step, normally requiring 6–16 hours. As such, improving the rate of deselenization would provide access to polypeptides and proteins on extraordinarily short timescales. Such an innovation was recently described by Mitchell et al., who demonstrated that the presence of a weak C–Se bond in β-selenoAsp and γ-selenoGlu enables rapid and clean deselenization in less than a minute, orders of magnitude faster than deselenization at Sec110. The exceptional rate increase is thought to be a result of stabilization of the carbon-centred radical, generated during deselenization, by the neighbouring carboxylate functionality in the Asp and Glu derivatives. These selenoamino acids were synthesized in three steps starting from commercially available Boc-Asp(OtBu)–OH and Boc-Glu(OtBu)–OH, respectively, with the key functionality installed through an electrophilic selenylation reaction. These could be readily incorporated into model peptides using the Fmoc-SPPS strategy and were demonstrated to undergo ligation followed by rapid deselenization, furnishing desired ligated products in excellent yields. Based on these promising results, it was reasoned that this rapid deselenization combined with the expedient ligation reaction could provide a means to accelerate chemical protein synthesis. To explore this possibility, a library of tick-derived thrombin inhibitors (hyalomin-2, hyalomin-3 and hyalomin-4)111 was prepared using one-pot ligation–deselenization technology at β-selenoAsp. As exemplified for hyalomin-2 (58) in FIG. 6b, the hyalomins were assembled from two fragments, one functionalized as a C-terminal phenylselenoester (59) and the other as a peptide dimer bearing an N-terminal β-selenoAsp moiety (60)110. A single one-pot DSL–deselenization transformation facilitated the production of each of the hyalomin proteins within minutes. The entire synthetic procedure, HPLC purification, solvent evaporation (via centrifugal concentration), quantification and thrombin inhibition bioassay could be performed within just 3 hours, opening the exciting possibility of generating rapid SAR on small proteins using this technology in the future.

Due to the rapid kinetics of deselenization of β-selenoAsp and γ-selenoGlu, it was hypothesized that this step could be performed chemoselectively in the presence of an unprotected Sec residue, enabling access to native selenoproteins. Accordingly, selenoprotein K (SelK) was selected as a synthetic target to validate this concept. SelK contains a Sec residue close to the C terminus (Sec92) that is responsible for the formation of an intermolecular diselenide in the native homodimeric protein112–114. Biologically, SelK is an endoplasmic reticulum membrane protein, believed to be involved in regulating cellular redox balance in cardiomyocytes88 and stimulating Ca2+ flux to control immunity86. The protein was disconnected between Tyr60 and Asp61,


Figure 6 | Protein synthesis via diselenide–selenoester ligation (DSL)–deselenization at diselenide-derived amino acids. a | Synthesis of a library of differentially sulfated chemokine-binding UL22A proteins (54–57)109. b | β-selenoAsp, γ-selenoGlu and the one-pot synthesis of the tick-derived thrombin inhibitor hyalomin-2 (58)110. Ligation–deselenization, purification, quantification and bioassay could be achieved within 3 hours, owing to the rapid kinetics of deselenization of β-selenoAsp. SPPS, solid-phase peptide synthesis.
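
The size of the UL22A library follows directly from the two predicted sulfation sites, and the reported binding improvement can be converted into a fold change. The sketch below is illustrative only: the site labels are those given in the text, the helper names are hypothetical, and the conversion simply assumes that "2.5 orders of magnitude" denotes a factor of 10^2.5.

```python
from itertools import product

# Sulfation sites named in the text for UL22A.
SULFATION_SITES = ("Tyr65", "Tyr69")


def sulfation_variants(sites=SULFATION_SITES):
    """Enumerate every sulfated/unsulfated pattern over the given sites."""
    return [
        {site: sulfated for site, sulfated in zip(sites, pattern)}
        for pattern in product((False, True), repeat=len(sites))
    ]


def orders_of_magnitude_to_fold(orders: float) -> float:
    """Convert a difference in orders of magnitude into a fold change."""
    return 10.0 ** orders


if __name__ == "__main__":
    for variant in sulfation_variants():
        label = " + ".join(site for site, on in variant.items() if on) or "unsulfated"
        print(label)
    print(f"about {orders_of_magnitude_to_fold(2.5):.0f}-fold")
```

Two sites give 2^2 = 4 sulfation patterns, matching the four variants 54–57, and 10^2.5 corresponds to an improvement of roughly 316-fold.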

necessitating the synthesis of the 59-residue peptide phenylselenoester 61, as well as 62, which possesses an intramolecular diselenide between the native Sec and the N-terminal β-selenoAsp moiety (FIG. 7). Both fragments were made by standard Fmoc-SPPS methods. For 62, the Sec and β-selenoAsp residues were introduced in suitably protected form, and cleavage, deprotection and purification afforded exclusively the intramolecular diselenide product. The purified segments were next reacted under additive-free DSL conditions, providing the ligated product as the intramolecular diselenide in 62% yield. Treatment of this intermediate with TCEP (in the absence of an external hydrogen atom source) for just 2 minutes effected the chemoselective deselenization of β-selenoAsp over Sec to afford SelK. In an attempt to streamline the process, a one-pot ligation–chemoselective deselenization strategy was also employed and provided direct access to SelK (63) in 40% yield110.

Based on early applications of DSL technology, the speed and unrivalled chemoselectivity of the method are expected to find widespread application in providing rapid access to therapeutic peptides and protein libraries in the near future, including those possessing the 21st amino acid (Sec). Moreover, access to selenylated derivatives of other proteinogenic amino acids will further expand the scope of this methodology.
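The yields quoted for SelK can be put side by side with a little arithmetic. The following is a minimal Python sketch, not part of the reported work: the 62% ligation yield and the 40% one-pot yield are taken from the text, whereas the split of the one-pot route into three equally efficient operations is an assumption made purely for illustration, and the helper names `overall_yield` and `mean_step_yield` are hypothetical.

```python
from math import prod


def overall_yield(step_yields):
    """Overall fractional yield of a linear sequence of steps."""
    return prod(step_yields)


def mean_step_yield(overall, n_steps):
    """Geometric-mean per-step yield implied by an overall yield."""
    return overall ** (1.0 / n_steps)


# Values quoted in the text for SelK (63).
ligation_yield = 0.62    # isolated additive-free DSL product
one_pot_overall = 0.40   # one-pot ligation-deselenization route

# Assumption for illustration only: treat the one-pot route as three
# equally efficient operations (the Figure 7 scheme quotes 40% over 3 steps).
n_operations = 3
implied_per_step = mean_step_yield(one_pot_overall, n_operations)

# Combined yield that the steps after a 62% ligation would need
# for a stepwise route to match the one-pot 40%.
required_after_ligation = one_pot_overall / ligation_yield

print(f"implied mean yield per operation: {implied_per_step:.1%}")
print(f"required post-ligation yield:     {required_after_ligation:.1%}")
```

Because overall yield is multiplicative, even modest losses per operation compound quickly, which is one reason one-pot protocols that avoid intermediate purification are attractive.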


Figure 7 | One-pot synthesis of 21 kDa homodimeric SelK. Selenoprotein K (SelK, 63) was prepared by additive-free ligation followed by chemoselective deselenization of β-selenoAsp in the presence of Sec110. In structures 62 and 63, GR = Gly-Arg. DSL, diselenide–selenoester ligation.

Steps shown in the scheme: 1. DSL; 2. hydrazinolysis; 3. chemoselective deselenization (40% over 3 steps), giving the selenoprotein K (SelK, 63) homodimer (21 kDa).

Summary and outlook

With the recent advances in peptide ligation technology, it is clear that chemical synthesis can now be used to produce large polypeptide and small protein therapeutics in a highly robust and efficient manner. As such, it is possible that these methods can provide a viable alternative to traditionally used recombinant expression technologies while providing the additional benefit and flexibility of incorporating PTMs or bespoke modifications in a site-specific manner to attenuate structure, function and stability.

Inspired by the transformational NCL concept, recently developed methodologies have overcome many of the limitations of the seminal approach and have expanded the number of accessible protein targets as well as the efficiency of chemical protein synthesis. For example, with access to a range of synthetic thiolated amino acids, the repertoire of NCL has been greatly broadened and provides an enormous amount of retrosynthetic flexibility for accessing a protein target by total chemical synthesis. The potential to purchase these reagents from commercial vendors in the future should allow further uptake of these technologies by the community. Furthermore, the rapid kinetics of the recently reported DSL technology provides a viable avenue to overcome one of the remaining key challenges of standard peptide ligation chemistry, the sluggish rates of reaction at sterically hindered junctions. The development of multi-component protein assembly in the N→C or C→N direction, coupled with the orthogonality of NCL-based and DSL-based methods, raises the exciting possibility of generating proteins with minimal handling and intermediary purification steps and on unprecedented timescales. While the median size of a human protein is 375 residues, most proteins that have been generated by chemical synthesis to date are half this size. However, the development of orthogonal N-terminal protection strategies and masked C-terminal acyl donors, coupled with NCL and DSL chemistry, now provides a means to target proteins of increasing size and complexity. Indeed, the groups of Kay115, Liu27,116 and Klussmann117 have recently reported the preparation of larger targets by total chemical synthesis. An alternative means of generating larger proteins bearing homogeneous modifications is through EPL techniques118. This methodology is well established for NCL, and recently developed recombinant methods for Sec incorporation open the possibility of EPL under a DSL manifold.

The plethora of enabling methods available for chemical protein synthesis also has the potential to open up new fields of research. For example, the speed and efficiency of the latest ligation techniques offer the exciting possibility of generating protein libraries, thus enabling synthetic protein medicinal chemistry. While such a platform cannot compete with the large number of targets that can be generated through phage display119 or expanded codon methodologies120,121, it has the potential to fuel peptide and protein drug discovery efforts by enabling focused library generation and the establishment of SARs in a manner similar to that of medicinal chemistry programmes with small molecules, where modified, unnatural and/or d-amino acids can be installed site-selectively. The bottleneck of chemical protein synthesis is no longer ligation-based assembly but rather the time-consuming synthesis of the suitably functionalized peptide segments by SPPS, along with laborious HPLC purification and freeze-drying steps. While it is still very difficult to predict the efficiency of the synthesis of a given peptide target by SPPS, a number of recent modifications to the standard method, namely, microwave heating and flow chemistry122, have the potential to accelerate this time-consuming process. Automating the SPPS process with purification would reveal the tantalizing possibility of performing semi-automated protein synthesis in the future.
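To give a feel for the scale of the assembly problem raised above, the following minimal Python sketch estimates how many SPPS-derived segments and ligation steps a target of a given length would require. It assumes a practical segment limit of 50 residues (the figure quoted in the abstract of this Review) and ignores the choice of ligation junctions, which in practice dictates where a sequence can be cut. The function names are hypothetical.

```python
from math import ceil


def segments_needed(protein_length, max_segment_length):
    """Minimum number of peptide segments to cover the sequence."""
    return ceil(protein_length / max_segment_length)


def ligations_needed(protein_length, max_segment_length):
    """A linear assembly of n segments requires n - 1 ligations."""
    return segments_needed(protein_length, max_segment_length) - 1


MAX_SPPS_LENGTH = 50        # assumed practical SPPS limit (from the abstract)
MEDIAN_HUMAN_PROTEIN = 375  # median human protein size quoted in the text

# Compare the median human protein with a target of half that size.
for length in (MEDIAN_HUMAN_PROTEIN, ceil(MEDIAN_HUMAN_PROTEIN / 2)):
    n_seg = segments_needed(length, MAX_SPPS_LENGTH)
    n_lig = ligations_needed(length, MAX_SPPS_LENGTH)
    print(f"{length} residues: {n_seg} segments, {n_lig} ligations")
```

Under these assumptions, a median-sized human protein needs roughly twice as many segments and ligations as the targets typically made to date, which is why multi-fragment, one-pot and orthogonal ligation strategies matter for larger proteins.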

1. Muir, T. W. & Kent, S. B. H. The chemical synthesis of proteins. Curr. Opin. Biotechnol. 4, 420–427 (1993).
2. Gamblin, D. P., Scanlan, E. M. & Davis, B. G. Glycoprotein synthesis: an update. Chem. Rev. 109, 131–163 (2009).
3. Stone, M. J. & Payne, R. J. Homogeneous sulfopeptides and sulfoproteins: synthetic approaches and applications to characterize the effects of tyrosine sulfation on biochemical function. Acc. Chem. Res. 48, 2251–2261 (2015).
4. Wright, T. H., Vallee, M. R. J. & Davis, B. G. From chemical mutagenesis to post-expression mutagenesis: a 50 year Odyssey. Angew. Chem. Int. Ed. 55, 5896–5903 (2016).
5. Huttner, W. B. Sulphation of tyrosine residues — a widespread modification of proteins. Nature 299, 273–276 (1982).
6. Uhlig, T. et al. The emergence of peptides in the pharmaceutical business: from exploration to exploitation. EuPA Open Proteom. 4, 58–69 (2014).
7. Fosgerau, K. & Hoffmann, T. Peptide therapeutics: current status and future directions. Drug Discov. Today 20, 122–128 (2015).
8. Lagassé, H. A. D. et al. Recent advances in (therapeutic protein) drug development [version 1; referees: 2 approved]. F1000Research 6, 113 (2017).
9. Usmani, S. S. et al. THPdb: Database of FDA-approved peptide and protein therapeutics. PLoS ONE 12, e0181748 (2017).
10. Harris, J. M. & Chess, R. B. Effect of pegylation on pharmaceuticals. Nat. Rev. Drug Discov. 2, 214–221 (2003).
11. Kent, S. B. H. et al. Through the looking glass — a new world of proteins enabled by chemical synthesis. J. Pept. Sci. 18, 428–436 (2012).
12. Xie, J. & Schultz, P. G. A chemical toolkit for proteins — an expanded genetic code. Nat. Rev. Mol. Cell Biol. 7, 775–782 (2006).
13. Merrifield, R. B. Solid Phase Peptide Synthesis. I. The Synthesis of a Tetrapeptide. J. Am. Chem. Soc. 85, 2149–2154 (1963).
14. Bray, B. L. Large-scale manufacture of peptide therapeutics by chemical synthesis. Nat. Rev. Drug Discov. 2, 587–593 (2003).


15. Dawson, P. E., Muir, T. W., Clark-Lewis, I. & Kent, S. B. H. Synthesis of proteins by native chemical ligation. Science 266, 776–779 (1994).
This seminal report revolutionized the field of chemical protein synthesis. The method has been employed in the assembly of hundreds of important protein targets.
16. Muir, T. W., Sondhi, D. & Cole, P. A. Expressed protein ligation: a general method for protein engineering. Proc. Natl Acad. Sci. USA 95, 6705–6710 (1998).
17. Bode, J. W., Fox, R. M. & Baucom, K. D. Chemoselective amide ligations by decarboxylative condensations of N-alkylhydroxylamines and α-ketoacids. Angew. Chem. Int. Ed. 45, 1248–1252 (2006).
18. Nilsson, B. L., Kiessling, L. L. & Raines, R. T. Staudinger ligation: a peptide from a thioester and azide. Org. Lett. 2, 1939–1941 (2000).
19. Saxon, E., Armstrong, J. I. & Bertozzi, C. R. A “traceless” Staudinger ligation for the chemoselective synthesis of amide bonds. Org. Lett. 2, 2141–2143 (2000).
20. Hackenberger, C. P. R. & Schwarzer, D. Chemoselective ligation and modification strategies for peptides and proteins. Angew. Chem. Int. Ed. 47, 10030–10074 (2008).
21. Zhang, Y., Xu, C., Lam, H. Y., Lee, C. L. & Li, X. Protein chemical synthesis by serine and threonine ligation. Proc. Natl Acad. Sci. USA 110, 6657–6662 (2013).
22. Unverzagt, C. & Kajihara, Y. Chemical assembly of N-glycoproteins: a refined toolbox to address a ubiquitous posttranslational modification. Chem. Soc. Rev. 42, 4408–4420 (2013).
23. Masania, J., Li, J., Smerdon, S. J. & Macmillan, D. Access to phosphoproteins and glycoproteins through semi-synthesis, native chemical ligation and N to S acyl transfer. Org. Biomol. Chem. 8, 5113–5119 (2010).
24. Agouridas, V., El Mahdi, O., Cargoët, M. & Melnyk, O. A statistical view of protein chemical synthesis using NCL and extended methodologies. Bioorg. Med. Chem. 25, 4938–4945 (2017).
25. Kochendoerfer, G. G. et al. Design and chemical synthesis of a homogeneous polymer-modified erythropoiesis protein. Science 299, 884–887 (2003).
26. Torbeev, V. Y. & Kent, S. B. H. Convergent chemical synthesis and crystal structure of a 203 amino acid “covalent dimer” HIV-1 protease enzyme molecule. Angew. Chem. Int. Ed. 46, 1667–1670 (2007).
27. Jiang, W. et al. Mirror-image polymerase chain reaction. Cell Discov. 3, 17037 (2017).
28. Clark, R. J. & Craik, D. J. Native chemical ligation applied to the synthesis and bioengineering of circular peptides and proteins. Pept. Sci. 94, 414–422 (2010).
29. Dawson, P. E. & Kent, S. B. H. Synthesis of native proteins by chemical ligation. Annu. Rev. Biochem. 69, 923–960 (2000).
30. Kent, S. B. H. Total chemical synthesis of proteins. Chem. Soc. Rev. 38, 338–351 (2009).
31. Kent, S. B. H. Chemical protein synthesis: Inventing synthetic methods to decipher how proteins work. Bioorg. Med. Chem. 25, 4926–4937 (2017).
32. Macmillan, D. Evolving strategies for protein synthesis converge on native chemical ligation. Angew. Chem. Int. Ed. 45, 7668–7672 (2006).
33. Offer, J. Native chemical ligation with Nα acyl transfer auxiliaries. Pept. Sci. 94, 530–541 (2010).
34. Yan, L. Z. & Dawson, P. E. Synthesis of peptides and proteins without cysteine residues by native chemical ligation combined with desulfurization. J. Am. Chem. Soc. 123, 526–533 (2001).
This important work established the conceptual framework for ligation–desulfurization chemistry.
35. Malins, L. R. & Payne, R. J. Recent extensions to native chemical ligation for the chemical synthesis of peptides and proteins. Curr. Opin. Chem. Biol. 22, 70–78 (2014).
36. Wan, Q. & Danishefsky, S. J. Free-radical-based, specific desulfurization of cysteine: a powerful advance in the synthesis of polypeptides and glycopolypeptides. Angew. Chem. Int. Ed. 46, 9248–9252 (2007).
This milder, metal-free desulfurization using TCEP, a water-soluble radical initiator (VA-044) and a hydrogen atom source (such as tBuSH) overcomes some limitations of the original metal-based approach and is demonstrated to be compatible with thioesters, methionine and protected Cys residues.
37. Hoffmann, F. W., Ess, R. J., Simmons, T. C. & Hanzel, R. S. The desulfurization of mercaptans with trialkyl phosphites. J. Am. Chem. Soc. 78, 6414 (1956).
38. Jin, K., Li, T., Chow, H. Y., Liu, H. & Li, X. P–B desulfurization: an enabling method for protein chemical synthesis and site-specific deuteration. Angew. Chem. Int. Ed. 56, 14607–14611 (2017).
39. Malins, L. R. & Payne, R. J. Synthetic amino acids for applications in peptide ligation-desulfurization chemistry. Aust. J. Chem. 68, 521–537 (2015).
40. Premdjee, B. & Payne, R. J. in Chemical Ligation: Tools for Biomolecule Synthesis and Modification (eds D’Andrea, L. D. & Romanelli, A.) 161–218 (John Wiley & Sons, USA, 2017).
41. Crich, D. & Banerjee, A. Native chemical ligation at phenylalanine. J. Am. Chem. Soc. 129, 10064–10065 (2007).
42. Botti, P. & Tchertchian, S. Side-chain extended ligation. Patent WO2006133962 (2006).
43. Malins, L. R., Giltrap, A. M., Dowman, L. J. & Payne, R. J. Synthesis of β-thiol phenylalanine for applications in one-pot ligation–desulfurization chemistry. Org. Lett. 17, 2070–2073 (2015).
44. Haase, C., Rohde, H. & Seitz, O. Native chemical ligation at valine. Angew. Chem. Int. Ed. 47, 6807–6810 (2008).
45. Chen, J., Wan, Q., Yuan, Y., Zhu, J. & Danishefsky, S. J. Native chemical ligation at valine: a contribution to peptide and glycopeptide synthesis. Angew. Chem. Int. Ed. 120, 8649–8652 (2008).
46. Harpaz, Z., Siman, P., Kumar, K. S. A. & Brik, A. Protein synthesis assisted by native chemical ligation at leucine. ChemBioChem 11, 1232–1235 (2010).
47. Tan, Z. P., Shang, S. Y. & Danishefsky, S. J. Insights into the finer issues of native chemical ligation: an approach to cascade ligations. Angew. Chem. Int. Ed. 49, 9500–9503 (2010).
48. Yang, R. L., Pasunooti, K. K., Li, F. P., Liu, X. W. & Liu, C. F. Dual native chemical ligation at lysine. J. Am. Chem. Soc. 131, 13592–13593 (2009).
49. Shang, S. Y., Tan, Z. P. & Danishefsky, S. J. Application of the logic of cysteine-free native chemical ligation to the synthesis of Human Parathyroid Hormone (hPTH). Proc. Natl Acad. Sci. USA 108, 5986–5989 (2011).
This noteworthy article underlines the utility of the toolbox of thiol-derived amino acids to assemble large proteins through a convergent kinetically controlled ligation strategy.
50. Thompson, R. E., Chan, B., Radom, L., Jolliffe, K. A. & Payne, R. J. Chemoselective ligation-desulfurization at aspartate. Angew. Chem. Int. Ed. 52, 9723–9727 (2013).
51. Araman, C. et al. Semisynthetic Prion Protein (PrP) variants carrying glycan mimics at position 181 and 197 do not form fibrils. Chem. Sci. 8, 6626–6632 (2017).
52. Bang, D., Pentelute, B. L. & Kent, S. B. H. Kinetically controlled ligation for the convergent chemical synthesis of proteins. Angew. Chem. Int. Ed. 45, 3985–3988 (2006).
This pivotal work showcases the power of kinetically controlled ligation chemistry by providing directional flexibility to construct proteins.
53. Thompson, R. E. et al. Trifluoroethanethiol: an efficient additive for one-pot ligation-desulfurization chemistry. J. Am. Chem. Soc. 136, 8161–8164 (2014).
This work demonstrates one-pot protein synthesis from three fragments using iterative ligation–desulfurization chemistry. Key to this methodology is the use of TFET, which does not interfere with radical desulfurization.
54. Cergol, K. M., Thompson, R. E., Malins, L. R., Turner, P. & Payne, R. J. One-pot peptide ligation–desulfurization at glutamate. Org. Lett. 16, 290–293 (2013).
55. Moyal, T., Hemantha, H. P., Siman, P., Refua, M. & Brik, A. Highly efficient one-pot ligation and desulfurization. Chem. Sci. 4, 2496–2501 (2013).
56. Thompson, R. E. et al. Tyrosine sulfation modulates activity of tick-derived thrombin inhibitors. Nat. Chem. 9, 909–917 (2017).
57. Macmillan, D., Adams, A. & Premdjee, B. Shifting native chemical ligation into reverse through N to S acyl transfer. Isr. J. Chem. 51, 885–899 (2011).
58. Ollivier, N., Dheur, J., Mhidia, R., Blanpain, A. & Melnyk, O. Bis(2-sulfanylethyl)amino native peptide ligation. Org. Lett. 12, 5238–5241 (2010).
59. Sato, K. et al. N-sulfanylethylanilide peptide as a crypto-thioester peptide. ChemBioChem 12, 1840–1844 (2011).
60. Hojo, H., Onuma, Y., Akimoto, Y., Nakahara, Y. & Nakahara, Y. N-alkyl cysteine-assisted thioesterification of peptides. Tetrahedron Lett. 48, 25–28 (2007).
61. Erlich, L. A., Kumar, K. S. A., Haj-Yahya, M., Dawson, P. E. & Brik, A. N-methylcysteine-mediated total chemical synthesis of ubiquitin thioester. Org. Biomol. Chem. 8, 2392–2396 (2010).
62. Blanco-Canosa, J. B. & Dawson, P. E. An efficient Fmoc-SPPS approach for the generation of thioester peptide precursors for use in native chemical ligation. Angew. Chem. Int. Ed. 47, 6851–6855 (2008).
63. Blanco-Canosa, J. B., Nardone, B., Albericio, F. & Dawson, P. E. Chemical protein synthesis using a second-generation N-acylurea linker for the preparation of peptide-thioester precursors. J. Am. Chem. Soc. 137, 7197–7209 (2015).
64. Wang, J. X. et al. Peptide o-aminoanilides as crypto-thioesters for protein chemical synthesis. Angew. Chem. Int. Ed. 54, 2194–2198 (2015).
65. Fang, G. M. et al. Protein chemical synthesis by ligation of peptide hydrazides. Angew. Chem. Int. Ed. 50, 7645–7649 (2011).
This seminal paper outlines the use of C-terminal acyl hydrazides as thioester surrogates to facilitate iterative ligation-based assembly of proteins in the N→C direction.
66. Fang, G. M., Wang, J. X. & Liu, L. Convergent chemical synthesis of proteins by ligation of peptide hydrazides. Angew. Chem. Int. Ed. 51, 10347–10350 (2012).
67. Malins, L. R. & Payne, R. J. Synthesis and utility of β-selenol-phenylalanine for native chemical ligation–deselenization chemistry. Org. Lett. 14, 3142–3145 (2012).
68. Zheng, J. S., Tang, S., Qi, Y. K., Wang, Z. P. & Liu, L. Chemical synthesis of proteins using peptide hydrazides as thioester surrogates. Nat. Protoc. 8, 2483–2495 (2013).
69. Boll, E. et al. One-pot chemical synthesis of small ubiquitin-like modifier protein–peptide conjugates using bis(2-sulfanylethyl)amido peptide latent thioester surrogates. Nat. Protoc. 10, 269–292 (2015).
70. Jbara, M., Maity, S. K. & Brik, A. Palladium in the chemical synthesis and modification of proteins. Angew. Chem. Int. Ed. 56, 10644–10655 (2017).
71. Huang, Y. C. et al. Synthesis of l- and d-ubiquitin by one-pot ligation and metal-free desulfurization. Chem. Eur. J. 22, 7623–7628 (2016).
72. Bang, D. & Kent, S. B. H. A one-pot total synthesis of crambin. Angew. Chem. Int. Ed. 43, 2534–2538 (2004).
73. Veber, D. F., Milkowski, J. D., Denkewalter, R. G. & Hirschmann, R. The synthesis of peptides in aqueous medium, IV. A novel protecting group for cysteine. Tetrahedron Lett. 26, 3057–3058 (1968).
74. Kumar, K. S. A. et al. Total chemical synthesis of a 304 amino acid K48-linked tetraubiquitin protein. Angew. Chem. Int. Ed. 50, 6137–6141 (2011).
75. Tang, S. et al. Practical chemical synthesis of atypical ubiquitin chains by using an isopeptide-linked Ub isomer. Angew. Chem. Int. Ed. 56, 13333–13337 (2017).
76. Sakamoto, I. et al. Chemical synthesis of homogeneous human glycosyl-interferon-β that exhibits potent antitumor activity in vivo. J. Am. Chem. Soc. 134, 5428–5431 (2012).
77. Hondal, R. J., Nilsson, B. L. & Raines, R. T. Selenocysteine in native chemical ligation and expressed protein ligation. J. Am. Chem. Soc. 123, 5140–5141 (2001).
78. Quaderer, R., Sewing, A. & Hilvert, D. Selenocysteine-mediated native chemical ligation. Helv. Chim. Acta 84, 1197–1206 (2001).
79. Gieselman, M. D., Xie, L. & Van Der Donk, W. A. Synthesis of a selenocysteine-containing peptide by native chemical ligation. Org. Lett. 3, 1331–1334 (2001).
These three independent works (references 77–79) demonstrate that the 21st amino acid (Sec) is competent in NCL-like transformations with peptide thioesters, providing access to large selenopeptides and selenoproteins. This work lays the foundation for the use of selenoamino acids in ligation chemistry.
80. Flohe, L., Guenzler, W. A. & Schock, H. H. Glutathione peroxidase: a selenoenzyme. FEBS Lett. 32, 132–134 (1973).
81. Kryukov, G. V. et al. Characterization of mammalian selenoproteomes. Science 300, 1439–1443 (2003).
82. Reeves, M. A. & Hoffmann, P. R. The human selenoproteome: recent insights into functions and regulation. Cell. Mol. Life Sci. 66, 2457–2478 (2009).


83. Muttenthaler, M. & Alewood, P. F. Selenopeptide chemistry. J. Pept. Sci. 14, 1223–1239 (2008).
84. Lu, J. & Holmgren, A. Selenoproteins. J. Biol. Chem. 284, 723–727 (2009).
85. Johansson, L., Gafvelin, G. & Arner, E. S. J. Selenocysteine in proteins—properties and biotechnological use. Biochim. Biophys. Acta, Gen. Subj. 1726, 1–13 (2005).
86. Verma, S. et al. Selenoprotein K knockout mice exhibit deficient calcium flux in immune cells and impaired immune responses. J. Immunol. 186, 2127–2137 (2011).
87. Shchedrina, V. A. et al. Selenoprotein K binds multiprotein complexes and is involved in the regulation of endoplasmic reticulum homeostasis. J. Biol. Chem. 286, 42937–42948 (2011).
88. Lu, C. et al. Identification and characterization of selenoprotein K: an antioxidant in cardiomyocytes. FEBS Lett. 580, 5189–5197 (2006).
89. Du, S., Zhou, J., Jia, Y. & Huang, K. SelK is a novel ER stress-regulated protein and protects HepG2 cells from ER stress agent-induced apoptosis. Arch. Biochem. Biophys. 502, 137–143 (2010).
90. Besse, D., Siedler, F., Diercks, T., Kessler, H. & Moroder, L. The redox potential of selenocystine in unconstrained cyclic peptides. Angew. Chem. Int. Ed. 36, 883–885 (1997).
91. Nauser, T., Dockheer, S., Kissner, R. & Koppenol, W. H. Catalysis of electron transfer by selenocysteine. Biochemistry 45, 6038–6043 (2006).
92. Guenther, W. H. H. Methods in selenium chemistry. III. Reduction of diselenides with dithiothreitol. J. Org. Chem. 32, 3931–3934 (1967).
93. Townsend, S. D. et al. Advances in proline ligation. J. Am. Chem. Soc. 134, 3912–3916 (2012).
94. Quaderer, R. & Hilvert, D. Selenocysteine-mediated backbone cyclization of unprotected peptides followed by alkylation, oxidative elimination or reduction of the selenol. Chem. Commun. 2620–2621 (2002).
95. Metanis, N., Keinan, E. & Dawson, P. E. Traceless ligation of cysteine peptides using selective deselenization. Angew. Chem. Int. Ed. 49, 7049–7053 (2010).
This landmark paper revealed that deselenization of a Sec residue can be achieved in the absence of a traditional radical initiator using a reducing agent (TCEP) and a hydrogen donor, and is completely chemoselective in the presence of unprotected Cys residues.
96. Malins, L. R., Mitchell, N. J., McGowan, S. & Payne, R. J. Oxidative deselenization of selenocysteine: applications for programmed ligation at serine. Angew. Chem. Int. Ed. 54, 12716–12721 (2015).
97. Sai Reddy, P., Dery, S. & Metanis, N. Chemical synthesis of proteins with non-strategically placed cysteines using selenazolidine and selective deselenization. Angew. Chem. Int. Ed. 55, 992–995 (2016).
98. Dery, L. et al. Accessing human selenoproteins through chemical protein synthesis. Chem. Sci. 8, 1922–1926 (2017).
99. Mousa, R., Dardashti, R. N. & Metanis, N. Selenium and selenocysteine in protein chemistry. Angew. Chem. Int. Ed. 56, 15818–15827 (2017).
100. Mousa, R., Reddy, P. S. & Metanis, N. Chemical protein synthesis through selenocysteine chemistry. Synlett 28, 1389–1393 (2017).
101. Berry, S. M., Gieselman, M. D., Nilges, M. J., van der Donk, W. A. & Lu, Y. An engineered azurin variant containing a selenocysteine copper ligand. J. Am. Chem. Soc. 124, 2084–2085 (2002).
102. Liu, J., Chen, Q. & Rozovsky, S. Utilizing selenocysteine for expressed protein ligation and bioconjugations. J. Am. Chem. Soc. 139, 3430–3437 (2017).
103. Dery, S. et al. Insights into the deselenization of selenocysteine into alanine and serine. Chem. Sci. 6, 6207–6212 (2015).
104. Mitchell, N. J., Kulkarni, S. S., Wang, S., Malins, L. R. & Payne, R. J. One-pot ligation–oxidative deselenization at selenocysteine and selenocystine. Chem. Eur. J. 23, 946–952 (2017).
105. Pollock, S. B. & Kent, S. B. H. An investigation into the origin of the dramatically reduced reactivity of peptide-prolyl-thioesters in native chemical ligation. Chem. Commun. 47, 2342–2344 (2011).
106. Durek, T. & Alewood, P. F. Preformed selenoesters enable rapid native chemical ligation at intractable sites. Angew. Chem. Int. Ed. 50, 12042–12045 (2011).
107. Mitchell, N. J. et al. Rapid additive-free selenocystine–selenoester peptide ligation. J. Am. Chem. Soc. 137, 14011–14014 (2015).
This article discloses a rapid and highly efficient peptide ligation reaction called diselenide–selenoester ligation (DSL). DSL provides unprecedented reaction rates, does not require any additives and can be used in conjunction with NCL, greatly expanding the number of protein targets that can be accessed by chemical synthesis.
108. Hanna, C. C., Kulkarni, S. S., Watson, E. E., Premdjee, B. & Payne, R. J. Solid-phase synthesis of peptide selenoesters via a side-chain anchoring strategy. Chem. Commun. 53, 5424–5427 (2017).
109. Wang, X., Sanchez, J., Stone, M. & Payne, R. J. Sulfation of the human cytomegalovirus protein UL22A enhances binding to the chemokine RANTES. Angew. Chem. Int. Ed. 56, 8490–8494 (2017).
110. Mitchell, N. J. et al. Accelerated protein synthesis via one-pot ligation-deselenization chemistry. Chem 2, 703–715 (2017).
This work demonstrates that the presence of a weaker C–Se bond in β-selenoAsp and γ-selenoGlu enables rapid and clean deselenization in less than a minute and, in combination with DSL, provides a means to expedite access to synthetic proteins. Deselenization at β-selenoAsp and γ-selenoGlu can be performed chemoselectively in the presence of native Sec, providing access to native selenoproteins.
111. Jablonka, W. et al. Identification and mechanistic analysis of a novel tick-derived inhibitor of thrombin. PLoS ONE 10, e0133991 (2015).
112. Liu, J., Srinivasan, P., Pham, D. N. & Rozovsky, S. Expression and purification of the membrane enzyme selenoprotein K. Protein Expr. Purif. 86, 27–34 (2012).
113. Liu, J., Zhang, Z. & Rozovsky, S. Selenoprotein K form an intermolecular diselenide bond with unusually high redox potential. FEBS Lett. 588, 3311–3321 (2014).
114. Liu, J. & Rozovsky, S. Membrane-bound selenoproteins. Antioxid. Redox Signal 23, 795–813 (2015).
115. Weinstock, M. T., Jacobsen, M. T. & Kay, M. S. Synthesis and folding of a mirror-image enzyme reveals ambidextrous chaperone activity. Proc. Natl Acad. Sci. USA 111, 11679–11684 (2014).
116. Xu, W. et al. Total chemical synthesis of a thermostable enzyme capable of polymerase chain reaction. Cell Discov. 3, 17008 (2017).
117. Pech, A. et al. A thermostable D-polymerase for mirror-image PCR. Nucleic Acids Res. 45, 3997–4005 (2017).
118. Muralidharan, V. & Muir, T. W. Protein ligation: an enabling technology for the biophysical analysis of proteins. Nat. Methods 3, 429–438 (2006).
119. Salmond, G. P. C. & Fineran, P. C. A century of the phage: past, present and future. Nat. Rev. Microbiol. 13, 777–786 (2015).
120. Goto, Y., Katoh, T. & Suga, H. Flexizymes for genetic code reprogramming. Nat. Protoc. 6, 779–790 (2011).
121. Murakami, H., Ohta, A., Ashigai, H. & Suga, H. A highly flexible tRNA acylation method for non-natural polypeptide synthesis. Nat. Methods 3, 357–359 (2006).
122. Mijalis, A. J. et al. A fully automated flow-based approach for accelerated peptide synthesis. Nat. Chem. Biol. 13, 464–468 (2017).
123. Schnolzer, M. & Kent, S. B. H. Constructing proteins by dovetailing unprotected synthetic peptides: backbone-engineered HIV protease. Science 256, 221–225 (1992).
124. Canne, L. E., Bark, S. J. & Kent, S. B. H. Extending the applicability of native chemical ligation. J. Am. Chem. Soc. 118, 5891–5896 (1996).
125. Brik, A., Yang, Y. Y., Ficht, S. & Wong, C. H. Sugar-assisted glycopeptide ligation. J. Am. Chem. Soc. 128, 5626–5627 (2006).
126. Ficht, S., Payne, R. J., Brik, A. & Wong, C. H. Second-generation sugar-assisted ligation: a method for the synthesis of cysteine-containing glycopeptides. Angew. Chem. Int. Ed. 46, 5975–5979 (2007).
127. Payne, R. J. et al. Extended sugar-assisted glycopeptide ligations: development, scope, and applications. J. Am. Chem. Soc. 129, 13527–13536 (2007).
128. Lutsky, M. Y., Nepomniaschiy, N. & Brik, A. Peptide ligation via side-chain auxiliary. Chem. Commun. 10, 1229–1231 (2008).
129. Hojo, H. et al. The mercaptomethyl group facilitates an efficient one-pot ligation at Xaa-Ser/Thr for (glyco)peptide synthesis. Angew. Chem. Int. Ed. 49, 5318–5321 (2010).
130. Loibl, S. F., Harpaz, Z. & Seitz, O. A type of auxiliary for native chemical peptide ligation beyond cysteine and glycine junctions. Angew. Chem. Int. Ed. 54, 15055–15059 (2015).

Acknowledgements
We acknowledge financial support from an ARC Linkage grant (S.K., R.J.P.), and the Northcote Scholarship and John A. Lamberton Research Scholarship for PhD funding (J.S.).

Author contributions
All authors contributed to all aspects of article preparation. S.S.K. and J.S. contributed equally to this manuscript.

Competing interests
The authors declare no competing interests.

Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

How to cite this article
Kulkarni, S. S., Sayers, J., Premdjee, B. & Payne, R. J. Rapid and efficient protein synthesis through expansion of the native chemical ligation concept. Nat. Rev. Chem. 2, 0122 (2018).
