Hybrid-Phase Native Chemical Ligation Approaches to Overcome the Limitations of Protein Total Synthesis
DISSERTATION
Presented in Partial Fulfillment of the Requirements for the Degree Doctor of Philosophy in the Graduate School of The Ohio State University
By
Ruixuan Ryan Yu
Graduate Program in the Ohio State Biochemistry Program
The Ohio State University
2016
Committee:
Jennifer J. Ottesen – Advisor
Michael G. Poirier
Michael A. Freitas
Dennis Bong
Copyrighted by
Ruixuan Ryan Yu
2016
Abstract
Total protein synthesis allows the preparation of proteins with chemically diverse modifications. The numerous advantages of total synthesis are sometimes offset by some major limitations. Protein synthesis is a non-trivial task involving many chemical steps, and these steps increase with the size of the protein. Therefore, larger proteins are difficult to synthesize with high yield. We have developed a strategy which we term hybrid-phase native chemical ligation (NCL) to overcome some of the limitations of size and yield.
Hybrid-phase NCL combines ligating peptides on a solid support (solid-phase NCL) and in solution (solution-phase NCL) to maximize synthetic yield. We have successfully used this method to synthesize triple-acetylated histone H4-K5ac,K12ac,K91ac and, for the first time, acetylated centromeric histone CENP-A-K124ac (CpA-K124ac).
In order to improve the yield of CENP-A total synthesis, we have incorporated a convergent ligation element in our hybrid-phase strategy. This new approach reduced the number of purification steps, leading to a synthetic yield that was almost triple that of the original approach.
Finally, we introduce the convergent solid-phase hybrid NCL approach that allows the preparation of a long peptide segment bearing a masked thioester on a solid support. Through a newly developed resin-anchoring strategy, cleavage of the product from the solid phase generated a ligation-compatible segment that could be used directly with no purification. This method has the potential to synthesize large proteins in good yield, effectively overcoming the size and yield limits of protein total synthesis.
Dedication
Alice and Owen.
Acknowledgments
I would like to thank all members of the Ottesen lab for their constant support, assistance, and guidance throughout my graduate career. I would like to thank Dr. Santosh Mahto for mentoring me during the first few years, and pushing me to continue his H4 total synthesis project. Thank you to C.J. Howard for advising me with his broad knowledge of biology.
Thank you to Michael Cotten and Mallory Alexander for helping move my project forward.
Thank you to Kurt Justus for assisting me with all the problems plaguing the histone peptides. Thank you to Dr. John Shimko for giving me his expert opinions on biochemistry, organic chemistry, and nerd culture. Thank you to Ziyong Hong for assisting me with all of the time-consuming experiments. My project would not have moved forward so efficiently without his assistance.
Most of all, I thank my advisor Dr. Jennifer Ottesen for giving me unconditional support both inside and outside the lab. Her confidence in me did not change even during the year when I was starting to drift off from my research. She encouraged me to keep going even during the times when I was ready to give up. Her support even extended to my family, especially after my son was born. In addition, she used all of her available resources to support my decision to attend medical school after graduation. She is a mentor who truly cares about the success of her students.
Vita
2002-2006 ...... Adrian Wilcox High School
2010...... B.A. Molecular and Cell Biology,
University of California, Berkeley
2010...... B.A. Practice of Art, University of
California, Berkeley
2010-2016 ...... Graduate Teaching and Research Associate,
The Ohio State University
2013...... M.S. Biochemistry, The Ohio State
University
Publications
Ruixuan R. Yu, Santosh K. Mahto, Kurt Justus, Mallory Alexander, Cecil J. Howard, Jennifer J. Ottesen, “Hybrid phase ligation for efficient synthesis of histone proteins”. Organic and Biomolecular Chemistry, 2016, 14:2603-2607
Cecil J. Howard, Ruixuan R. Yu, Miranda L. Gardner, John C. Shimko, Jennifer J. Ottesen, “Chemical and Biological Tools for the preparation of modified histone proteins”. Topics in Current Chemistry, 2015, 363:193-226
Fields of Study
Major Field: The Ohio State Biochemistry Program
Table of Contents
Abstract ...... ii
Dedication ...... iv
Acknowledgments ...... v
Vita ...... vi
Publications ...... vi
Fields of Study ...... vii
Table of Contents ...... viii
List of Tables ...... xvii
List of Figures ...... xviii
List of Acronyms ...... xxii
Chapter 1: Introduction ...... 1
Protein Total Synthesis ...... 1
Native Chemical Ligation ...... 2
Applications of total synthesis ...... 7
Histones...... 10
Histone Post-Translational Modification ...... 13
Methods to Prepare Histone PTMs ...... 13
Genetic mimics ...... 14
Expanded genetic code ...... 15
Dehydroalanine ...... 16
Chemical installation through cysteine ...... 17
Disulfide stapling ...... 18
Chemical Synthesis ...... 20
Goals ...... 20
Outline ...... 22
Chapter 2: Solid-Phase Ligation vs. Hybrid-Phase Ligation of Histones ...... 23
Introduction ...... 23
Solution-Phase NCL ...... 23
Solid-Phase NCL ...... 25
Experimental Methods ...... 28
Materials ...... 28
RP-HPLC ...... 29
Mass spectrometry ...... 29
Solid-Phase Peptide Synthesis ...... 30
Synthesis of 3-Fmoc-Dbz-OH ...... 30
Automated Solid-Phase Peptide Synthesis ...... 30
Manual peptide synthesis ...... 32
Manual synthesis of Dbz(Alloc) resin ...... 34
Loading the first amino acid on Dbz(Alloc) ...... 35
Symmetric anhydride coupling on HMBA ...... 35
Alloc Deprotection ...... 36
Nbz Conversion in DCM ...... 36
Nbz conversion in DMF and NMP ...... 37
Peptide cleavage ...... 37
Synthesis and purification of H4 peptides ...... 38
H4-A (acSer1-Leu37)-Nbz ...... 38
H4-A-K5ac,K12ac (Ser1-Leu37)-Nbz(formyl)-Arg ...... 39
H4-B (Thz38-Gly56)-Nbz ...... 39
H4-H (Pen57-H75)-Nbz(formyl)-Arg ...... 40
H4-C-K91ac (Thz76-Gly102-HMBA-Arg-Gly)-Nbz ...... 41
H4-(76-102)-K79ac (Thz76-Gly102) for semi-synthetic H4-K79ac ...... 41
Synthesis and purification of CENP-A peptides ...... 42
CpA-1 (Gly2-Gly34)-Nbz ...... 42
CpA-2 (Thz35-Leu70)-Nbz (formyl) ...... 42
CpA-2 (Thz35-Leu70)-O-Cys(StBu) ...... 43
CpA-3 (Thz71-Ala97)-Nbz-Arg-Arg ...... 45
CpA-4 (Thz98-H115)-Nbz-R ...... 45
CpA-5-K124ac (Thz116-Gly14-HMBA-Arg-Gly)-Nbz ...... 46
SP-NCL ...... 46
Buffers ...... 46
Base resin synthesis for SP-NCL ...... 47
Quantification of product from dry PEGA resin ...... 48
Fmoc deprotection ...... 49
Thz deprotection ...... 49
Ligation ...... 50
Micro-cleavages to assess reaction progress ...... 50
On-resin desulfurization ...... 51
Cleavage from the resin at the HMBA linker ...... 51
Cleavage from the resin at the Rink linker ...... 52
SDS-PAGE ...... 52
Solution-phase NCL of Hybrid-phase ligation ...... 53
Preparation of SDS-PAGE samples by TCA precipitation ...... 53
Ziptip and MALDI-TOF MS ...... 53
Second Solution-phase NCL ...... 54
Desulfurization ...... 55
Purification of Synthetic histones ...... 55
Refolding histone tetramer ...... 56
Refolding histone octamer ...... 57
Nucleosome reconstitution ...... 57
Native PAGE ...... 58
His6-tagged CENP-A expression ...... 58
Expressed Protein Ligation of H4-K79ac ...... 59
H4(1-75)-intein-CBD Expression ...... 59
H4(1-75)-intein-CBD Purification ...... 60
Thiolysis to produce H4(1-75)-SR ...... 61
Ligation ...... 61
Desulfurization ...... 61
Purification ...... 62
Quantification of Histone using UV-Vis spectroscopy ...... 62
Results and Discussion ...... 63
SP-NCL of H4 ...... 63
Dual linker design for SP-NCL ...... 64
Synthetic peptide segments for H4 ...... 66
Synthesis of a peptide segment with an α-thioester ...... 68
Nbz Conversion of H4-A-Dbz ...... 70
Racemization of Histidine ...... 71
Synthesis of H4 peptides ...... 76
Sequential SP-NCL of ac-H4 ...... 78
SP-NCL of modified H4 ...... 80
CENP-A ...... 82
Semi-synthesis of CpA-K124ac ...... 86
SP-NCL of CpA-K124ac ...... 89
Nbz conversion of CpA-2-Dbz ...... 91
CpA-2-thioester through O to S acyl shift ...... 91
Nbz conversion using different solvents ...... 92
Synthesis of CENP-A peptides ...... 95
Sequential SP-NCL of CpA-K124ac ...... 97
Hybrid Phase Ligation of H4-K5ac, K12ac, K91ac ...... 101
SP-NCL of H4-BHC-K91ac ...... 102
Solution-phase NCL of H4-ABHC-K5ac,K12ac,K91ac ...... 105
Hybrid Phase Ligation of CpA-K124ac ...... 108
SP-NCL of CpA-345-K124ac ...... 110
Sequential solution-phase ligation of CpA-12345-K124ac ...... 111
Nucleosome reconstitution using synthetic and semi-synthetic histones ...... 115
Semi-synthesis of H4-K79ac ...... 115
Recombinant expression of His6-tagged CENP-A ...... 117
Refolding and reconstitution of recombinant and semi-synthetic histones ...... 118
Refolding and reconstitution of synthetic histones ...... 119
Conclusions ...... 121
Acknowledgements ...... 122
Chapter 3: Convergent Hybrid-Phase Native Chemical Ligation ...... 123
Introduction ...... 123
Experimental Methods ...... 125
Hydrazinolysis of peptide Nbz and peptide HMBA ...... 125
Preparation of Hydrazide resin using Wang ...... 125
Solution-phase NCL of CpA-12-Dbz ...... 126
Ligation with CpA-1-Nbz ...... 126
Ligation with CpA-1-Dbz ...... 127
Solution-phase NCL of CENP-A using CpA-12-Dbz and CpA-345 ...... 127
Ligation of CpA-12-Dbz and CpA-345 ...... 127
Desulfurization of CpA-12345 ...... 128
Glycolic acid base resin ...... 129
Synthesis of glycolic acid base resin ...... 129
Resin thioesterification ...... 129
Ligation of CpA-2-Dbz-Gly-Lys(Cys) ...... 130
Ligation of CpA-1-Dbz ...... 130
Cleavage of CpA-12-Dbz0 ...... 130
SDS-PAGE of TFA-treated resin ...... 131
Base resin sequences ...... 131
Quantification of product from dry PEGA resin ...... 131
Results and Discussion ...... 133
Hydrazide as a cryptic thioester for convergent ligation ...... 133
Synthesis of CpA-2 on hydrazide resin ...... 134
CpA-2-N2H3 by hydrazinolysis of Nbz ...... 135
CpA-12-N2H3 by hydrazinolysis of HMBA ...... 138
Using Dbz as a cryptic thioester for convergent ligation ...... 142
Convergent Hybrid-Phase NCL of CENP-A ...... 143
Solution-phase ligation of CpA-12-Dbz ...... 144
Solution-Phase NCL of CpA-12345-K124ac ...... 146
Convergent SP-Hybrid NCL of CpA ...... 149
Ligation handle for convergent SP-NCL ...... 149
SP-NCL of CpA-12-Dbz0 ...... 152
Refolding synthetic CENP-A without purification ...... 154
Conclusions ...... 156
Acknowledgements ...... 156
Chapter 4: Conclusions ...... 157
Future Work and Application ...... 159
References ...... 164
Appendix A: Standard Laboratory Solutions ...... 187
15% acrylamide gel ...... 187
Stacking gel (5% acrylamide) ...... 187
6 x SDS loading buffer ...... 187
SDS-PAGE running buffer ...... 188
Coomassie Stain ...... 188
Destain ...... 188
LB growth media ...... 188
SOC growth media ...... 188
List of Tables
Table 1: H4 peptide segments160 ...... 67
Table 2: Histidine coupling conditions on Dbz ...... 74
Table 3: CENP-A peptides160 ...... 90
Table 4: SP-NCL conditions for CENP-A ...... 97
Table 5: SP-NCL of H4-BHC-K91ac ...... 103
Table 6: H1.2 Peptides ...... 160
List of Figures
Figure 1: Mechanism of Native Chemical Ligation ...... 5
Figure 2: Expressed Protein Ligation ...... 6
Figure 3: Nucleosome Structure ...... 11
Figure 4: Post-translational modifications and their genetic mimics ...... 15
Figure 5: Chemistry of Dehydroalanine7 ...... 17
Figure 6: Chemical Modifications of Cysteine7 ...... 18
Figure 7: Introduction of Ubiquitylation through disulfide stapling ...... 19
Figure 8: Solution-Phase NCL Approaches ...... 24
Figure 9: C to N SP-NCL Scheme ...... 26
Figure 10: SP-NCL Ligation Scheme for H4160 ...... 63
Figure 11: Dual linker strategy for SP-NCL ...... 65
Figure 12: Preparation of Thioester through the Dbz ...... 69
Figure 13: Nbz Conversion of H4-A ...... 71
Figure 14: Analysis of H4-H peptide ...... 72
Figure 15: Histidine coupling conditions on Dbz ...... 75
Figure 16: Purified H4 peptides ...... 77
Figure 17: SP-NCL of ac-H4160 ...... 79
Figure 18: SP-NCL of Synthetic H4 constructs ...... 81
Figure 19: Comparison of CENP-A and H3-containing nucleosomes ...... 82
Figure 20: Centromeric Nucleosome PTMs ...... 83
Figure 21: Comparison of CENP-A and H3 nucleosomes ...... 85
Figure 22: CpA-K124ac EPL scheme ...... 87
Figure 23: Expression of CpA(1-115)-intein-CBD ...... 88
Figure 24: CENP-A SP-NCL scheme ...... 90
Figure 25: Nbz conversion of CpA-2 ...... 91
Figure 26: O to S acyl shift approach ...... 92
Figure 27: Nbz conversion of CpA-2 and H4-A peptides using the dry DMF approach . 93
Figure 28: Formylation through Vilsmeier-Haack ...... 94
Figure 29: Nbz conversion of CpA-2 using dry NMP ...... 94
Figure 30: Purified CENP-A peptides ...... 96
Figure 31: SP-NCL of CpA-345-K124ac160 ...... 98
Figure 32: SP-NCL of CpA-1 and CpA-2 ...... 99
Figure 33: Hybrid phase ligation of H4160 ...... 102
Figure 34: SP-NCL of H4-BHC-K91ac160 ...... 104
Figure 35: Solution-phase ligation of H4160 ...... 106
Figure 36: Desulfurization of H4-K5ac,K12ac,K91ac ...... 107
Figure 37: Purified H4-K5ac,K12ac,K91ac160 ...... 107
Figure 38: Hybrid-phase NCL scheme of CpA-K124ac160 ...... 109
Figure 39: SP-NCL of CpA-3450-K124ac160 ...... 110
Figure 40: Solution-phase ligation of CENP-A160 ...... 112
Figure 41: Desulfurization of CpA-K124ac ...... 112
Figure 42: Purified CpA-K124ac160 ...... 113
Figure 43: Nucleosome containing CENP-A ...... 114
Figure 44: H4-K79ac EPL scheme ...... 116
Figure 45: H4-K79ac EPL ...... 117
Figure 46: Refolding and reconstitution of recombinant CENP-A ...... 118
Figure 47: Refolding of synthetic histones160 ...... 120
Figure 48: Convergent ligation of CENP-A ...... 124
Figure 49: Thioester conversion from peptide hydrazide ...... 133
Figure 50: Preparation of hydrazide base resin ...... 134
Figure 51: Synthesis of CpA-2-N2H3 ...... 135
Figure 52: Convergent ligation using hydrazide: hydrazinolysis of Nbz ...... 136
Figure 53: Hydrazinolysis of CpA-2- Nbz(formyl) ...... 137
Figure 54: Hydrazinolysis of CpA-2-Nbz ...... 138
Figure 55: Convergent ligation using hydrazide: hydrazinolysis of HMBA ...... 139
Figure 56: CpA-2-HMBA-Arg-Gly-Dbz ...... 140
Figure 57: Ligation and hydrazinolysis of CpA-12 ...... 141
Figure 58: Preparation of thioester using Dbz ...... 142
Figure 59: Convergent hybrid-phase NCL of CENP-A ...... 144
Figure 60: Solution-phase ligation of CpA-12-Dbz ...... 145
Figure 61: Solution-phase ligation of CpA-12 and CpA-345 ...... 146
Figure 62: Desulfurization of CENP-A ...... 147
Figure 63: Purified CpA-K124ac ...... 148
Figure 64: CENP-A Convergent SP-hybrid NCL scheme ...... 150
Figure 65: CpA-2-Dbz-GK(C) ...... 151
Figure 66: SP-NCL of CpA-12-Dbz0 ...... 153
Figure 67: Refolding synthetic CENP-A with no purification ...... 155
Figure 68: Convergent SP-Hybrid NCL scheme of H1 ...... 161
Figure 69: Comparison of CENP-A, H3, and H4 structures ...... 162
List of Acronyms
Amino acids are referred to by the appropriate one or three letter codes.
6-Cl-HOBt 6-chloro-1-hydroxybenzotriazole
AA Amino acid
Ac Acetylated
ACN Acetonitrile
Ahx ε-aminohexanoic acid
Alloc Allyloxycarbonyl
Boc t-butoxycarbonyl
CBD Chitin binding domain
CpA CENP-A
DBU 1,8-Diazabicycloundec-7-ene
Dbz 3,4-diaminobenzoic acid
DCM Dichloromethane
DIEA N,N-diisopropylethylamine
DIC N,N'-Diisopropylcarbodiimide
DMAP 4-Dimethylaminopyridine
DMF N,N-Dimethylformamide
DmThz 2,2-dimethylthiazolidine
DTT Dithiothreitol
EDT Ethanedithiol
EDTA 2,2',2'',2'''-(ethane-1,2-diyldinitrilo)tetraacetic acid
EPL Expressed protein ligation
Fmoc 9-fluorenylmethoxycarbonyl
FPLC Fast protein liquid chromatography
GuHCl Guanidinium hydrochloride
HATU 2-(7-aza-1-H-benzotriazol-1-yl)-1,1,3,3-tetramethylaminium hexafluorophosphate
HBTU 2-(1-H-benzotriazol-1-yl)-1,1,3,3-tetramethyluronium hexafluorophosphate
HCTU 2-(6-chloro-1-H-benzotriazol-1-yl)-1,1,3,3-tetramethyluronium hexafluorophosphate
HCCA α-Cyano-4-hydroxycinnamic acid
HMBA 4-(Hydroxymethyl)benzoic acid
HEPES 2-[4-(2-hydroxyethyl)piperazin-1-yl]ethanesulfonic acid
HO Histone octamer
MALDI-TOF Matrix assisted laser desorption ionization-time of flight
MBHA 4-methylbenzhydrylamine hydrochloride salt resin
Me Methylated
MESNa Mercaptoethylsulfonate sodium salt
MPAA 4-mercaptophenylacetic acid
MS Mass spectrometry
Nbz N-acyl-benzimidazolinone
NCL Native chemical ligation
Ni-NTA Nickel nitrilotriacetic acid resin
Nle Norleucine
NMP N-Methylpyrrolidone
NMR Nuclear magnetic resonance
NPCF 4-nitrophenyl chloroformate
Pen Penicillamine
PMSF phenylmethylsulfonyl fluoride
PTM Post-translational modification
RP-HPLC Reversed phase high performance liquid chromatography
SDS-PAGE Sodium dodecylsulfate-polyacrylamide gel electrophoresis
SPPS Solid phase peptide synthesis
TCA Trichloroacetic acid
TCEP Tris(2-carboxyethyl)phosphine
TEMED 1,2-bis(dimethylamino)ethane
TFA Trifluoroacetic acid
Thz Thiazolidine
TIS Triisopropyl silane
Tris 2-amino-2-hydroxymethyl-propane-1,3-diol
UV Ultraviolet
VA-044-US 2,2’-azobis[2-(2-imidazolin-2-yl)propane]dihydrochloride
Chapter 1: Introduction
Protein Total Synthesis
Chemical synthesis of proteins has always been of great interest to chemists.1,2 Proteins are some of the most complex molecules in an organism,3 and protein total synthesis allows the artificial preparation of these important macromolecules.4 At first glance, it may be hard to imagine why one would go through the painstaking work of artificially synthesizing a protein. Typically, recombinant techniques are used to prepare proteins since the labor of protein synthesis is often left to cellular machinery, which can prepare these complex molecules efficiently from simple building blocks.5,6 However, making alterations to the protein is relatively restricted using standard molecular biology techniques.3 Using mutagenesis, the changes we make are limited to the amino acid sidechain, which is in turn limited to the 20 natural amino acids encoded by the DNA sequence.7 It is possible to expand the selection through tRNA manipulation and chemical modification of the protein, but the scope is still severely restricted.7,8 These limitations are absent in total synthesis.9,10
Chemical synthesis gives us control over essentially every atom in the protein, allowing us to introduce any functional moieties possible within the bounds of chemistry.4,11 Both side chains and backbones of proteins can be modified.12,13 Chemical synthesis enables many unique modifications, including the insertion of site-specific isotope labels,13,14 introduction of unnatural amino acids and post-translational modifications,15,16 replacement of peptide backbone with non-native backbone,17 and synthesis of D-form proteins.18,19 All of these synthetic proteins have been used in studies to better understand protein function.
Native Chemical Ligation
Protein total synthesis from start to finish is a daunting task. Since every addition of an amino acid involves condensation and deprotection steps, synthesizing a 100-residue protein requires several hundred chemical steps.4 Synthesis begins with the chemical preparation of peptides. During the earlier years of peptide synthesis, product yield was extremely low because all reactions were carried out in solution.20,21 After each step, product needed to be purified from other reagents before performing the next reaction.
Each purification led to significant yield loss. In addition, the side chain protecting groups caused longer peptides to have poor solubility in organic solvents.22 The development of solid phase peptide synthesis (SPPS) was a major breakthrough in peptide chemistry.23,24
The amino acids are coupled onto a solid support,25 and removing excess reagents before subsequent reactions only requires a simple flow-wash.23 The advantages of the solid-phase over the solution-phase strategy are numerous. Simple wash steps replace the laborious intermediate purification steps that led to significant yield loss.22 Solvent can be easily changed in between chemical steps. Reactions can be carried out with excess amino acids to ensure rapid and efficient coupling. All this greatly improved the efficiency of peptide synthesis.23 Despite the advantages, the upper limit of SPPS is around 50 residues.2,26 Side products, which are often caused by the formation of secondary structures, begin to accumulate for longer peptides.22 In addition, although each coupling step typically has an efficiency of >99%,27 the efficiencies multiply over many steps, so the overall efficiency can become very low. Therefore, as powerful as SPPS is, this method by itself cannot be used to synthesize moderate-sized proteins. A ligation technique that can stitch smaller peptides into a single protein is necessary.3
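The compounding of per-step efficiency described above can be illustrated with a short calculation. This is only a sketch under simplifying assumptions that are not taken from the text: every coupling is treated as having the same efficiency, one coupling is counted per residue added after the first, and losses from deprotection, cleavage, and purification are ignored. The function name and the example lengths are hypothetical.

```python
def stepwise_yield(per_step_efficiency: float, n_steps: int) -> float:
    """Theoretical yield after n_steps sequential steps of equal efficiency.

    Assumption (illustrative only): the steps are independent and each
    one succeeds with the same fractional efficiency.
    """
    if not 0.0 <= per_step_efficiency <= 1.0:
        raise ValueError("per_step_efficiency must be between 0 and 1")
    if n_steps < 0:
        raise ValueError("n_steps must be non-negative")
    return per_step_efficiency ** n_steps


if __name__ == "__main__":
    # 0.99 is used as an illustrative lower bound for a ">99%" coupling.
    for length in (25, 50, 100, 150):
        couplings = length - 1  # one coupling per residue after the first
        fraction = stepwise_yield(0.99, couplings)
        print(f"{length:>4} residues: {fraction:.1%} full-length product")
```

Under these assumptions the fraction of full-length product falls steadily with chain length, which is consistent with the practical size limit noted above and with the need for a ligation step to join shorter segments.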
Several chemical ligation techniques were developed to overcome the size limit of total synthesis.28-30 In chemical ligation, proteins are synthesized in short peptide fragments which are then condensed together to form the full-length product.28 The challenge was to find two functional groups that chemoselectively react with each other. One of the methods involved reacting a peptide with a C-terminal thiocarboxylate and a peptide with an N-terminal bromoacetyl group.28 This condenses the two peptides with a thioester linkage, and the method was used for the total synthesis of HIV-1 protease.28 Other early chemical ligation techniques result in the formation of oxime,29 thioether,31 disulfide,32 and thiazolidine33 linkages.
The first ligation method that yielded a native amide linkage through simple, accessible chemistry in the absence of side chain protection was termed native chemical ligation (NCL).30 This powerful approach involves the specific reaction between the C-terminal α-thioester of one peptide and the N-terminal cysteine (or any 1,2-aminothiol) of another peptide (Figure 1). In the transthioesterification step, the thiol attacks the carbonyl of the thioester. The N-terminal amine then displaces the thiol of the cysteine in a rearrangement step termed the S to N acyl shift. This leads to the formation of a native peptide bond with a cysteine at the ligation junction.30 What makes this reaction so efficient and specific is the two-step reaction mechanism. The initial transthioesterification step is reversible, and internal cysteines can also participate. However, the S to N acyl shift is only possible with the primary amine of the N-terminal cysteine. Since the formation of the amide bond is very favorable, this second step is essentially irreversible, driving the forward reaction.
Adding to the advantages of NCL, the reaction conditions are very mild, since the ligation is performed at room temperature and neutral pH. Ligation kinetics vary greatly depending on the C-terminal residue of the thioester.34 The rate typically decreases with the steric bulk of the C-terminal residue side chain. Ligation is complete within 4 h for residues such as Gly and His, while 48 h may be required for residues such as Val and Thr.34 Therefore, it is important to consider the relative kinetics of different residues when deciding the ligation site of a protein. Overall, the development of NCL extended the size limit of protein total synthesis considerably.35
[Scheme: peptide α-thioester and N-terminal Cys peptide; transthioesterification; S to N acyl shift; amide-linked product]
Figure 1: Mechanism of Native Chemical Ligation
NCL has also been modified so that part of the protein is expressed while the other segment containing the desired PTM is synthesized. This expressed protein ligation (EPL) approach can produce the N-terminal protein segment with a C-terminal thioester.36 Preparing a protein through this method is referred to as protein semi-synthesis. When expressing the N-terminal segment, the C-terminus of the segment is fused, using DNA cloning techniques, to a modified intein, a self-splicing protein. After refolding the intein, this fusion protein can be cleaved with an external thiol, which generates the expressed peptide thioester (Figure 2).36 EPL is particularly useful if the desired modification is close to the C-terminus of the protein sequence. This way the synthetic portion of the protein is short and relatively easy to synthesize in good yield.
[Scheme: N-terminal segment fused to intein; N to S acyl shift; thiol exchange to give the N-terminal segment thioester; NCL with the C-terminal segment]
Figure 2: Expressed Protein Ligation
NCL leaves a cysteine at the ligation site. If that site originally contained another residue, we would be effectively making a point mutation. Sometimes this Cys mutation could affect the function of the synthetic protein.16 Desulfurization removes the thiol functionality of Cys and converts the residue into alanine.37,38 Since Ala is a common amino acid in proteins,37 this allows for more flexibility when it comes to choosing split sites in the protein. If a native alanine is chosen as a split site, ligation using an N-terminal Cys followed by desulfurization results in a traceless ligation. As mentioned before, any 1,2-aminothiol moiety can participate in NCL, which allows for the possibility of other, non-Cys ligation sites. For example, ligation and subsequent desulfurization using penicillamine yields valine, which allows for the option to carry out ligation at a valine ligation site.39 Other residues that can be chosen as ligation sites include phenylalanine,40 arginine,41 serine,42 and threonine.42 For ligations at serine and threonine, a special ligation approach was introduced where a peptide ester was used.42
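The site-selection considerations above (a Cys or Ala on the N-terminal side of the junction, and the identity of the C-terminal thioester residue) can be sketched as a simple sequence scan. This is a hypothetical helper, not a procedure from this work: the function name, the rate labels, and the toy sequence are assumptions, and the only rate information used is the pair of examples quoted in the text (Gly and His fast, Val and Thr slow). All other residues are left unclassified rather than guessed.

```python
# Rate classes limited to the examples quoted in the text (illustrative only).
FAST_THIOESTER_RESIDUES = {"G", "H"}  # ligation reported complete within ~4 h
SLOW_THIOESTER_RESIDUES = {"V", "T"}  # ligation may require ~48 h


def candidate_junctions(sequence, allowed_n_terminal=("C", "A")):
    """List possible ligation junctions in a one-letter protein sequence.

    Each result is (position, thioester_residue, n_terminal_residue, rate),
    where position is the 1-based index of the residue that would become
    the N-terminus of the downstream segment (native Cys, or Ala that is
    reached through Cys ligation followed by desulfurization).
    """
    sequence = sequence.upper()
    junctions = []
    for i in range(1, len(sequence)):
        n_terminal = sequence[i]
        if n_terminal not in allowed_n_terminal:
            continue
        thioester = sequence[i - 1]  # C-terminal residue of upstream segment
        if thioester in FAST_THIOESTER_RESIDUES:
            rate = "fast"
        elif thioester in SLOW_THIOESTER_RESIDUES:
            rate = "slow"
        else:
            rate = "unclassified"
        junctions.append((i + 1, thioester, n_terminal, rate))
    return junctions


if __name__ == "__main__":
    toy_sequence = "MKGAVLHCTAEV"  # made-up sequence, not a real protein
    for junction in candidate_junctions(toy_sequence):
        print(junction)
```

The `allowed_n_terminal` argument could be extended, for example with "V" for penicillamine-based ligation, but whether a given junction is practical would still depend on segment length, solubility, and the available thioester chemistry.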
Applications of total synthesis
The ability to carry out total synthesis of a protein has allowed researchers to introduce probes into the native sequence of a protein, and therefore learn about numerous aspects of the protein’s activity. Examples include one study where the active site aspartate residue of the HIV-1 protease was labeled with 13C for NMR, which revealed the enzyme’s catalytic mechanism.13 Two aspartate residues in the active site acted as general acid and base to hydrolyze the substrate. Chemical synthesis of proteins allowed for the detailed study of redox potential in rubredoxins where natural amino acids were replaced by nonstandard amino acids.15 This study found that aromatic residues could regulate metalloprotein reduction potentials, which could be fine-tuned by changing the residue to unnatural amino acids. The histidine residue of human secretory phospholipase A2 was replaced by isosteric thienyl alanine. Even though the difference between the two residues was small, it was enough to inactivate the protein, suggesting that the imidazole ring was important for function.34 This kind of precise change would not be possible using standard mutagenesis techniques. Replacing the same histidine with another natural amino acid might reveal if the residue is essential for function, but various confounding factors would make it difficult to pinpoint the reason for the loss of function.
Unlike mutagenesis, which is limited to the manipulation of the side chain, chemical synthesis allows for the modification of the backbone. One HIV-1 protease was synthesized by replacing the Gly-Gly in the β-turn by a constrained bicyclic ring compound. This change did not affect activity, and it led to increased thermal stability.17 In another backbone engineering study, an amide bond of a serine proteinase was replaced by an ester bond to assess the significance of backbone hydrogen bonding.12 This study found that the elimination of this hydrogen bond reduced binding of the protein to its target by 15-fold.
By making peptides with an aza-Gly backbone, it was found that adding an extra nitrogen in the peptide bond increases collagen stability.43 With total synthesis, any modification compatible with the chemistry of choice can be introduced site-specifically into a protein.
One of the most profound examples of the ramifications of total protein synthesis is the ability to construct a protein entirely out of D amino acids. This was first demonstrated in the synthesis of HIV-1 protease. This enzyme was shown to act only on the enantiomer of the standard substrate, and its kinetics for catalysis were identical to those of the native HIV-1 protease.18 This observation suggested that the three-dimensional fold of the D-protein is the mirror image of the original. Importantly, it supported the hypothesis that D-proteins were capable of supporting lifeforms of the opposite chirality. This was further confirmed by a recent study where a DNA polymerase was synthesized from D amino acids. This polymerase successfully performed replication using an L-DNA template, and two mirrored polymerases could function in a racemic mixture without cross-inhibition.19 Another group used mirror-image DapA to answer the interesting question of whether or not natural GroEL/ES chaperone proteins can fold D-proteins. Interestingly, the chaperone was found to be ambidextrous, and is able to fold proteins of either chirality.44 Recently, Liu and coworkers synthesized a D-polymerase. Mirror image proteins can also be used in racemic X-ray crystallography, where a racemic mixture of L- and D-proteins is used to make crystallization easier. The structures of several proteins were determined using this method.45-47
At times people turn to chemical synthesis for proteins that are difficult to express, such as membrane proteins. One of the earliest examples is the synthesis of the proton channel M2.48 Synthesis of membrane proteins is difficult because of their hydrophobicity, but methods have been introduced to optimize synthesis and handling conditions.49 In Chapter 2, we will discuss this in the context of centromeric protein A (CENP-A).
Glycoproteins are another popular target for chemical synthesis, which allows for the site-specific modification of a protein with another macromolecule. One of the first glycoproteins to be synthesized was diptericin.50 In addition, chemical synthesis or EPL can be used to introduce post-translational modifications (PTMs) common in eukaryotic cells that are not accessible through bacterial expression systems. One example of a protein with extensive PTMs is alpha-synuclein, which is associated with various neurodegenerative diseases such as Parkinson’s disease.51 Synthetic and semi-synthetic alpha-synucleins have been used to understand the various PTMs and how they are involved in the pathology. In our next section, we will discuss another group of proteins with extensive PTMs: the histone proteins.
Histones
Eukaryotic DNA is packaged and compacted into the nucleus by histone proteins.52,53 DNA and histones function as a unit called chromatin.54 There are four major core histone proteins: H2A, H2B, H3, and H4.55 Two copies of each histone form an octamer complex,56 which assembles from one (H3/H4)2 tetramer and two H2A/H2B dimers. DNA is wrapped ~1.7 times (146 bp) around the histone octamer (HO),56 forming a complex known as the nucleosome (Figure 3).
[Structure image. Labeled regions: DNA entry/exit region, dyad region, LRS region. Color key: H2A, green; H2B, gray; H3, blue; H4, red.]
Figure 3: Nucleosome Structure. PDB: 1KX3.56
The nucleosome can be roughly divided into two regions, the tail and the core region.57 Histones have unstructured N-terminal tail regions that are extensively post-translationally modified.52 The core region is the folded octamer region with a defined three-dimensional structure. Within the core region, the entry-exit region is where the DNA begins to wrap/unwrap from HO.58 The dyad region is defined by the pseudo-plane of symmetry of HO that roughly divides the octamer into two halves.59,60 The lateral surface region interacts with the wrapped DNA, and the solvent-accessible face can stack with another nucleosome unit to form higher-order chromatin structure.61,62 In addition, the loss of rDNA silencing (LRS) region is required for transcriptional silencing.63
Because of this close interaction between histones and DNA, histones are regulators of many DNA-dependent processes, such as replication, transcription, and DNA repair.64-66
In addition to the canonical core histones, there are several histone variants, many of which serve distinct functions in particular situations.67,68 CENP-A, for example, is an H3 variant that is present in centromeres,69 and it signals where the kinetochore should assemble during cell division.70-72
In order for DNA-dependent processes to occur, the histone-wrapped DNA is made sterically accessible by several means. Chromatin remodeling proteins can slide the HO across the DNA to reveal accessible segments, while histone chaperone proteins can disassemble histones from the DNA.73-77 Even without these factors, the interaction between histone and DNA is inherently dynamic.78,79 Dynamics can also be affected by histone PTMs, which are discussed in the following section.
Histone Post-Translational Modification
Histones have extensive PTMs, which can be found in any of the nucleosome regions discussed in the previous section.57 Most modifications observed in other proteins can also be found in histones.53 Lysines are commonly mono-, di-, or tri-methylated, acetylated,80 ubiquitinated,81 sumoylated,82 biotinylated,83 formylated,84 ADP-ribosylated,85 or crotonylated.86 Arginine can be methylated87 or converted to citrulline.88 Ser, Thr, Tyr, and His can be phosphorylated,89 and Ser and Thr can be glycosylated.90 PTMs in the tails and unstructured regions usually recruit other proteins to affect downstream pathways.91,92 These modifications are suggested to affect one another, a phenomenon termed “histone cross-talk”.93-95 The fact that countless combinations of tail PTMs act as an epigenetic code to affect various biological activities has led researchers to propose the concept of the “histone code”.52,96 The PTMs are recognized by reader, writer, and eraser proteins, which trigger downstream events such as transcription and silencing.97,98 PTMs buried in the histone core, on the other hand, can directly affect the dynamic interaction between DNA and histone99,100 as well as the structure of the nucleosome itself,101 often changing the steric or electrostatic characteristics of key residues in protein-protein or protein-DNA interfaces.16,59,102,103
Methods to Prepare Histone PTMs
While large numbers of histone PTMs have been discovered,104 and several have been correlated with biological effects, the precise effects of most of these PTMs are still unknown.105 Part of the reason for this slow progress is the difficulty in preparing a homogeneous sample of histones with desired modifications. Various methods have been developed to overcome this challenge.7,106
Genetic mimics
One of the most straightforward methods to study PTMs is mutagenesis, where the unmodified residue is mutated to another natural amino acid that shares similar characteristics with the modified residue (Figure 4). For example, a lysine can be replaced by glutamine as a mimic of acetyllysine, or by arginine as a mimic of constitutively unmodified lysine.107 Glutamate or aspartate has often been used to mimic phosphorylated Ser, Thr, or Tyr.108 An advantage of mutagenesis is that it can be used in a high-throughput manner to quickly screen for possible effects of histone PTMs.109 Since this method only involves simple genetic mutations, any laboratory equipped with recombinant protein expression tools can prepare these mimics. However, in several instances where the effects of mimics have been directly compared to those of the chemically precise modifications, the mimics have not replicated the effects of the exact modification. This might be expected from the structural differences. For example, H3-K115 and K122 acetylation (H3-K115ac,K122ac) reduced the free energy of octamer binding to DNA, while the mimic H3-K115Q,K122Q did not.59 In another study, H3-T118 phosphorylation was found to decrease DNA-histone binding free energy and increase nucleosome mobility, while H3-T118E had no effect on either property.110 For cases like these, it is desirable to make either a more similar analog or the exact modification of interest.
Figure 4: Post-translational modifications and their genetic mimics. PTMs shown: acetyllysine, phosphoserine, phosphothreonine, and phosphotyrosine. Mimics shown: glutamine and glutamate.
Expanded genetic code
Codon suppression is a method that allows for the genetic introduction of modified amino acids. The expanded genetic code was first suggested when a strain of E. coli was found to read through the UAG (amber) stop codon.111 Later, Schultz and coworkers succeeded in introducing a tyrosine at the stop codon in a species of methanogen.112 After discovering another species of methanogen that incorporated pyrrolysine through the amber codon, the Chin group eventually developed the genetic incorporation of acetyllysine (Kac) by artificially evolving a methanogen’s pyrrolysyl-tRNA synthetase and tRNA pair to take acetyllysine as substrate. This codon suppression method was first used to prepare H3-K56ac.113 The system has since been modified to allow incorporation of methyllysines,114,115 Nε(Cys)-lysine,116,117 azidonorleucine,118 and phosphoserine.119
Dehydroalanine
Schultz and coworkers developed the genetic incorporation of several other unnatural amino acids, and one of particular interest for the introduction of PTMs was phenylselenocysteine. This residue can be converted to dehydroalanine (Dha), and Michael addition with a thiol reagent introduces the thioether analog of the desired PTM.120 Instead of genetic incorporation, the Davis group developed the chemical conversion of cysteine to Dha using 2,5-dibromohexanediamide.121 The Dha was then converted into thioether analogs of methyllysine, acetyllysine, phosphoserine, and glycosylated serine. Another group developed the genetic incorporation of Se-alkylselenocysteine, which can also be converted into Dha with an improved expression yield over the phenylselenocysteine method.122 Figure 5 summarizes the versatility of Dha. One potential problem is that conversion of chiral cysteine or selenocysteine to planar Dha leads to a racemic mixture of the resulting modification analog. Some studies suggest that the inherent chirality of the protein will bias the product toward the L-form analog.123
Figure 5: Chemistry of Dehydroalanine7
Chemical installation through cysteine
When preparing PTM analogs, chemical modification is another popular method. The target for these chemical approaches is often cysteine, due to its reactivity and relative rarity, especially in histones. Various reactions have been developed that are specific to the sulfhydryl of cysteines. Cysteine can be introduced in a protein using site-directed mutagenesis, and chemical modification allows for site-specific installation of a PTM. One well-established method is the preparation of methyllysine analogs (MLAs) through cysteine alkylation (Figure 6A). This yields mono-, di-, or tri-MLAs depending on the aminoethyl halide used.124 A methylene in methyllysine is replaced by a sulfide in MLA, causing a slight lengthening of the sidechain by 0.28 Å and a decrease of 1.1 units in the pKa of the sidechain amine.124 The analog is sufficiently similar to be recognized by methyltransferases and methyllysine antibodies, albeit with decreased affinity.124 Cysteines can also be converted to acetyllysine analogs (Figure 6B,C)117,125 and methylarginine analogs (Figure 6D).126 These cysteine derivatives seem to mimic the effects of the corresponding modifications in some cases but not others.127,128
Figure 6: Chemical Modifications of Cysteine7 (A) Generation of MLA. (B) Generation of thio-methyl acetyllysine as acetyllysine analog. (C) Generation of acetyllysine mimic with thiol-ene chemistry. (D) Generation of methylarginine mimic.
Disulfide stapling
Cysteine has another useful chemical property: the ability to form disulfide bonds. This has been exploited to introduce modifications through a technique termed disulfide stapling (Figure 7). For example, a histone ubiquitylation mimic can be prepared by covalently linking ubiquitin to a histone cysteine through a disulfide bond. In this disulfide stapling method, ubiquitin is expressed as a fusion protein with an intein, and thiolysis with a 1,2-aminothiol results in ubiquitin with a thiol terminus. This product is then incubated with a histone containing a single cysteine, yielding a ubiquitylated histone mimic.129 Despite the disulfide linkage, this mimic was recognized by other enzymes, suggesting that the difference in covalent linkage does not change the effect of the modification. The ability to remove the ubiquitin by reduction is both an advantage, allowing for the dynamic study of ubiquitylation by addition of reducing agent,129,130 and a disadvantage, limiting the buffer conditions under which these modifications are stable.
Figure 7: Introduction of Ubiquitylation through disulfide stapling. The scheme compares a native ubiquitylated protein with a ubiquitylated protein prepared using disulfide stapling.
The unique reactivity of cysteines allows for the introduction of various PTM mimics through diverse chemistry. Histones with cysteine mutants can be easily prepared using standard recombinant approaches, allowing for large quantities of modified histones.
However, like the amber codon method, the cysteine modification method is generally limited to the introduction of one type of modification for a given protein.
Chemical Synthesis
As discussed above, there are many ways to introduce PTMs. However, many of these methods can only produce PTM analogs, which are not structurally identical to the native PTM. Although codon suppression enables the introduction of the precise PTM, it is currently difficult to introduce more than one PTM in a single protein.131 These limitations are not seen with chemical synthesis of modified proteins. In terms of the level of control provided by protein total synthesis, no other method comes close.3 There is essentially no limit to the number and type of modifications that can be introduced with synthetic histones.132 Since multiple PTMs can be found distributed throughout the sequence of a single histone in vivo,52,133 total synthesis is currently the only method that can replicate histones typically found in nature. Despite these advantages, total synthesis does have major limitations, and our goal, as will be discussed in the coming chapters, is to overcome those limitations.
Goals
Protein total synthesis has two major limitations. The first is the size limit of total synthesis.
Although the development of NCL increased the upper size limit of synthetic proteins considerably, it is still not practical to synthesize proteins with more than 300 residues.30,134
A larger protein needs to be split into more peptide fragments, which means more chemical steps to produce the full-length product. This leads to a lower yield, which is the second major limitation of total synthesis.135 The yield of synthesis is low, especially compared to protein expression. Total synthesis involves multiple steps. Yield loss can occur at every step from peptide synthesis to ligation and purification. Even making milligram quantities of the protein becomes a challenge.132 Our goal is to develop techniques that can overcome these two limitations, and make total synthesis more efficient and practical. We have used histone proteins as a platform to develop these techniques and to demonstrate the effectiveness of these techniques.
Histone proteins have several properties that make them attractive targets for total synthesis.
They are relatively small proteins, which makes them accessible for total synthesis. They have very few cysteine residues (with H4 having none), such that chemical ligation followed by desulfurization yields a product with minimal mutations. At the same time, histone proteins are relatively challenging to synthesize due to their unique sequences. Some of the histone peptides have poor solubility, making purification challenging, while other peptides do not yield clean chemical conversions using standard reaction conditions. However, we view these challenges not as shortcomings but rather as opportunities to optimize total synthesis methods. Many of the techniques we use to handle these challenging peptides can be applied to the chemical synthesis of other proteins containing similarly challenging sequences.
Outline
In Chapter 2 we introduce hybrid-phase ligation as a new approach to the efficient synthesis of histones requiring multiple ligation steps. This method grew out of our initial studies in which solid-phase NCL (SP-NCL) proved incompatible with histones. We use the hybrid-phase ligation approach to synthesize H4 and CENP-A. A dual linker is used to optimize peptide cleavage yield and produce the native carboxy terminus. We hope to overcome the current yield limitation of total synthesis with this method.
In Chapter 3 we introduce convergent hybrid ligation as a further improvement to total histone synthesis. We assess two alternate approaches: one in which a protein segment prepared with solution-phase ligation is combined with a segment prepared by SP-NCL, and one in which both segments are prepared by SP-NCL. Key to both of these approaches is the versatile linker 3,4-diaminobenzoic acid (Dbz), used as a masked thioester.136 For the second approach, we developed a resin-anchoring strategy that maintains a C-terminal cryptic thioester, such that both segments of the protein can be used without further purification steps. This approach to chemical ligation should be applicable to the efficient total synthesis of a wide variety of larger proteins.
Chapter 2: Solid-Phase Ligation vs. Hybrid-Phase Ligation of Histones
Introduction
The synthesis of a protein from two peptide fragments is a relatively straightforward process involving only one ligation step. However, given the limits of SPPS and the size of most proteins, it is usually necessary to split even moderately sized proteins into three or more fragments. Syntheses that require more than one ligation step can lead to various complications and low yields. The various approaches to ligating multiple peptide fragments are discussed in the following sections.
Solution-Phase NCL
A simple and straightforward approach to ligating multiple peptides is sequential ligation. Peptides are ligated one by one in sequence, in either the N to C or the C to N direction (Figure 8).135 With ligation in the C to N direction, the N-terminal cysteine of the peptide thioester must be protected to prevent cyclization and oligomerization. Further, the protection must be reversible under mild conditions compatible with ligation. While several strategies have been developed, the cysteine is most often protected with acetamidomethyl (Acm)137 or as a ring-closed form of cysteine called thiazolidine (Thz).138,139 After the ligation of two segments, deprotection reveals the cysteine. Typically, the ligation intermediate is purified before the next ligation. H3 has been successfully synthesized by C to N sequential NCL using Thz as the protected cysteine.16 Ligation in the N to C direction typically requires a thioester surrogate that is stable to NCL conditions but can be readily converted to a thioester when desired. Such surrogates include N-alkylcysteine,140 cysteinyl prolyl ester,141 bis(2-sulfanylethyl)amino (SEA) peptide,142 and peptide hydrazide.143
Figure 8: Solution-Phase NCL Approaches. Panels: N to C sequential NCL, C to N sequential NCL, and convergent NCL.
The methods described above require intermediate purification steps, which are necessary for removing excess peptides and reagents before the next round of ligation. For example, methoxylamine used for deprotection of Thz will react with the incoming peptide thioester. However, reversed-phase high-performance liquid chromatography (RP-HPLC) purification of the intermediates results in a significant yield loss. This is especially true for histones, where we see upwards of 70% yield loss from purification.132 Several methods have been developed to minimize the purification steps.35 In the one-pot ligation strategy, sequential ligation and deprotection are performed in the same vessel.139,144 This method eliminates the intermediate purification step, but it is generally limited to a three-peptide ligation. In addition, the ligation kinetics must be carefully controlled using different thioesters for the first and second ligations. Convergent ligation is a useful approach to minimize the number of purification steps when there are four or more peptide fragments.145-147 However, it can be challenging to find the optimal masked thioester required for this method. The various ligation schemes are illustrated in Figure 8.
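Losses from intermediate purifications compound multiplicatively, which is why reducing the number of purification steps matters so much. The short Python sketch below illustrates this under simplifying assumptions that are not taken from measured data: every RP-HPLC purification is assumed to recover the same fraction of material (30%, corresponding to the roughly 70% loss quoted above for histones), and the ligation chemistry itself is treated as lossless. The function name and the step counts are hypothetical.

```python
def recovery_after_purifications(n_purifications, recovery_per_step=0.30):
    """Fraction of material remaining after n identical purification steps.

    Illustrative assumptions: each purification recovers the same fraction,
    and losses from ligation or handling are ignored.
    """
    if n_purifications < 0:
        raise ValueError("n_purifications must be non-negative")
    if not 0.0 <= recovery_per_step <= 1.0:
        raise ValueError("recovery_per_step must be between 0 and 1")
    return recovery_per_step ** n_purifications


# Compare a few hypothetical numbers of intermediate purifications.
for n in range(1, 5):
    fraction = recovery_after_purifications(n)
    print(f"{n} purification(s): {fraction:.1%} of material retained")
```

Under these assumptions, each purification that can be avoided improves the final recovery by more than threefold, which is the motivation for one-pot, convergent, and solid-phase strategies.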
Solid-Phase NCL
Just as using a solid-phase support drastically improved the yield of peptide synthesis, solid-phase NCL (SP-NCL) is an attractive strategy to improve the yield of total synthesis (Figure 9).148,149 With SP-NCL, the first peptide is ligated to a solid support, and the subsequent peptides are ligated sequentially to the immobilized segments. Purification is not required since the soluble components can be washed away with excess buffer. Eliminating the time-consuming purification steps accelerates synthesis. Changing reaction buffers between chemical steps is also straightforward. Excess peptides can be used for ligation, and incomplete reactions can be repeated after a quick wash step.
Figure 9: C to N SP-NCL Scheme. Steps shown: SP-NCL followed by cleavage.
SP-NCL has been attempted by several labs. SP-NCL can be performed in either the N to C or the C to N direction. With either approach, protection strategies must be developed in order to avoid cyclization of the peptide dissolved in solution. Early studies include the development of N to C SP-NCL by the Kent research group using thioacids as masked thioesters. The thioacid was not completely inert during NCL, causing a small percentage of the middle peptide segment to cyclize.148 Another N to C SP-NCL strategy developed by Raibaut and coworkers used the bis(2-sulfanylethyl)amido (SEA) linker. Reduction of the linker’s disulfide bond induces a rearrangement to produce the active thioester.150 Although efficient, this method prevents the addition of reducing agents during ligation, which is sometimes necessary in order to reverse disulfide formation of the N-terminal cysteine, especially for slow ligations.
The Kent group also developed a His6-tag-assisted SP-NCL of Crambin. The C-terminal peptide was synthesized with a His6-tag, which was bound to Ni-NTA resin during the ligation steps. The overall yield of Crambin, however, was 16% compared to the 40% achieved previously with a one-pot strategy, which was stated to be due to less efficient folding.151 Ni-NTA resin is relatively inexpensive, and no specialized linker is required to anchor the peptide to the solid support. However, many metal-binding and highly charged proteins interact with nickel, so the syntheses of those proteins would not be compatible with this resin. Another approach uses the safety-catch acid-labile (SCAL) linker, which was used to synthesize an 8 kDa protein with 20% yield.152 Cleavage of the SCAL linker with trifluoroacetic acid (TFA) produces a C-terminal amide in the product instead of the native carboxyl, and the incorporation of SCAL is nontrivial. More recently, SP-NCL of H2B was performed using a C-terminal Rink amide linked to PEGA resin.153 Rink is cleaved rapidly with TFA,154 but as with the SCAL linker, cleavage produces an amide terminus.
For this work, we introduce a new SP-NCL strategy that we have designed for the total synthesis of H4 and CENP-A, aimed at improving the current yield of synthetic histones. We begin by discussing our initial development of H4 total synthesis by sequential SP-NCL. We discuss the problems we encountered with several of the peptides and the approaches we used to resolve them. We then examine our initial attempts at synthesis of H4 and CENP-A using the sequential SP-NCL strategy. Despite the efficiency of the reactions, the overall yield was unacceptably low. This prompted the development of a new method, hybrid-phase NCL, which combines solid-phase and solution-phase components. This new strategy led to a significant improvement of yield for both H4 and CENP-A. We end this chapter by successfully incorporating the synthetic histones into nucleosomes.
Experimental Methods
Materials
Rink Amide MBHA resin LL (100-200 mesh, 0.36 mmol/g loading) was purchased from Novabiochem. PL-PEGA resin (300-500 µm, 0.2 mmol/g) was purchased from Varian. DMF C3H7NO, DCM CH2Cl2, ACN C2H3N, and diethyl ether (C2H5)2O were purchased from Fisher Scientific. NMP C5H9NO was purchased from AGTC Bioproducts. Piperidine C5H11N, NPCF ClCO2C6H4NO2, MPAA HSC6H4CH2CO2H, TCEP C9H15O6P, DIEA C8H19N, phenylsilane C6H8Si, and tetrakis(triphenylphosphine)palladium(0) (Pd(PPh3)4) were purchased from Sigma-Aldrich. Fmoc-protected amino acids were purchased from AAPPTec and Novabiochem. Fmoc-6-aminohexanoic acid (Fmoc-Ahx-OH), Fmoc-L-norleucine (Fmoc-Nle-OH), and DMAP C7H10N2 were purchased from Novabiochem. HATU, HBTU, HCTU, and 6-Cl-HOBt C6H4ClN3O were purchased from AAPPTec. Acetic anhydride and MESNa C2H5NaO3S2 were purchased from Fluka Analytical. DIC C7H14N2 was purchased from Alfa Aesar. VA-044-US C12H22N6·2HCl was purchased from Wako Chemicals. Ultra-pure guanidine-HCl CH6ClN3 (GuHCl) was purchased from MP Biomedicals and Alfa Aesar. Dbz acid C37H28N2O6 was purchased from Anaspec. Allyl chloroformate (Alloc-Cl) C4H5ClO2 was purchased from Acros Organics. Triisopropylsilane (TIS) C9H22Si was purchased from GFS Chemicals. Boc-(R)-5,5-dimethyl-1,3-thiazolidine-4-carboxylic acid (Boc-dmThz-OH) was purchased from Chem-Impex International. 9-Fluorenylmethyl N-succinimidyl carbonate (Fmoc-OSu) C19H15NO5 was purchased from Novabiochem. HCCA matrix C10H7NO3 was purchased from Bruker Daltonics.
RP-HPLC
Analytical RP-HPLC was run on a Shimadzu or Waters instrument using an analytical column (Supelco C18 15 cm × 4.6 mm × 5 µm, flow rate 0.9 mL/min). Preparative RP-HPLC was run on a Waters instrument using a semi-preparative column (Supelco C18 25 cm × 10 mm × 10 µm, flow rate 5 mL/min), or a preparative column (Supelco C18 25 cm × 21.2 mm × 10 µm, flow rate 18 mL/min). Solvent A was 0.1% TFA in water, and Solvent B was 0.1% TFA in 1:9 water:ACN. Eluate was monitored at 218 nm and 280 nm wavelengths. Only the 218 nm absorbance trace is shown in the figures.
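Because Solvent B contains water as well as ACN, a gradient expressed in %B does not equal the ACN content of the mobile phase. The minimal Python sketch below converts between the two. It assumes the 1:9 water:ACN ratio is by volume, and it ignores the 0.1% TFA and any non-ideal volume change on mixing; the function names are illustrative only.

```python
# Assumption: Solvent B is 1:9 water:ACN by volume, i.e. 90% ACN (v/v).
ACN_FRACTION_IN_B = 0.9


def acn_percent(percent_b):
    """Approximate % ACN (v/v) in the mobile phase at a given %B."""
    if not 0.0 <= percent_b <= 100.0:
        raise ValueError("percent_b must be between 0 and 100")
    return percent_b * ACN_FRACTION_IN_B


def percent_b_for_acn(target_acn_percent):
    """%B required to reach a target % ACN (at most 90% with this Solvent B)."""
    max_acn = 100.0 * ACN_FRACTION_IN_B
    if not 0.0 <= target_acn_percent <= max_acn:
        raise ValueError(f"target must be between 0 and {max_acn:g}% ACN")
    return target_acn_percent / ACN_FRACTION_IN_B


print(f"50% B corresponds to about {acn_percent(50):.0f}% ACN")
```

This conversion is useful when comparing elution positions with methods that report gradients directly in % ACN.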
Mass spectrometry
Peptide masses were confirmed by MALDI-TOF-MS (Bruker Daltonics Microflex) using flexControl 3.3 and flexAnalysis 3.3 software. α-Cyano-4-hydroxycinnamic acid (HCCA) was used for the matrix. Peptide Calibration Standard II (Bruker) was used for calibration of peptides ranging from 0-3 kDa, and Protein Calibration Standard I (Bruker) was used for calibration of proteins ranging from 5-20 kDa. HCCA solution was prepared by resuspending solid HCCA in 1:1 Solvent A:ACN. Typically a mixture of 0.5 µL RP-HPLC purified sample and 0.6 µL HCCA solution was spotted on the MALDI target plate. The expected and observed m/z are the average values. The instrument was calibrated using calibration standards before the analysis of samples.
Solid-Phase Peptide Synthesis
Synthesis of 3-Fmoc-Dbz-OH
3,4-Diaminobenzoic acid (1 g, 6.5 mmol) was resuspended in 125 mL 1:1 ACN:NaHCO3.
Reaction was initiated by the dropwise addition of 9-Fluorenylmethyl N-succinimidyl carbonate (Fmoc-OSu) (2.4 g, 7.1 mmol) in 15 mL 1:1 ACN:NaHCO3 and proceeded for
2 h. HCl was added to a final pH of 1.0, and the mixture was filtered. Filtrate was dissolved in 4 mL DMSO, precipitated with acidified reaction buffer, washed extensively, and dried under vacuum to yield a light brown product. Product identity and purity were validated by NMR spectroscopy.
Automated Solid-Phase Peptide Synthesis
Peptides were synthesized on 100-200 mesh Rink amide MBHA resin using the AAPPTec APEX 396 automated synthesizer. A 40-well reaction vessel block was used. Five types of solutions were prepared beforehand for synthesis. Solutions of 0.3 M Fmoc-AA-OH in NMP were prepared and placed in the monomer rack of the synthesizer. 20% piperidine in NMP, 1 M DIEA in DMF, 0.3 M HCTU in NMP, and capping solution (300 mM 6-Cl-HOBt and 300 mM acetic anhydride in 1:9 DCM:DMF) were each prepared in separate glass bottles and set in the appropriate location in the synthesizer.
For one reaction well, 0.05 mmol of resin, calculated from the theoretical loading in mmol/g, was used. The resin was transferred to the well using DMF, and swollen in DMF by shaking for 15 minutes.
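The mass of resin corresponding to a given synthesis scale follows directly from the loading. As a minimal Python sketch (the function name is illustrative, and the loading is taken to be the nominal value listed in Materials rather than a measured one):

```python
def resin_mass_mg(scale_mmol, loading_mmol_per_g):
    """Mass of dry resin (mg) that carries the requested amount of reactive sites."""
    if scale_mmol <= 0 or loading_mmol_per_g <= 0:
        raise ValueError("scale and loading must be positive")
    grams = scale_mmol / loading_mmol_per_g
    return grams * 1000.0


# 0.05 mmol scale on Rink amide MBHA resin with a nominal loading of 0.36 mmol/g
print(f"{resin_mass_mg(0.05, 0.36):.0f} mg of resin per well")
```

For the 0.05 mmol scale and 0.36 mmol/g loading, this works out to roughly 139 mg of resin per well.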
Fmoc deprotection was performed by shaking the resin in 20% piperidine in NMP for 5 min. The solution was drained, and the deprotection step was repeated two more times. The resin was washed by shaking in DMF for 5 min. The DMF was drained, and the wash step was repeated 4 more times. Solution was drained from the resin before the coupling step.
For coupling, 1 mL Fmoc-AA-OH, 0.9 mL HCTU, and 0.45 mL DIEA were added to the resin. When using 0.05 mmol of resin, the molar equivalents of Fmoc-AA-OH, HCTU, and DIEA were 6, 5.5, and 9, respectively. The concentrations of Fmoc-AA-OH, HCTU, and DIEA were 128 mM, 115 mM, and 191 mM, respectively. The resin was shaken for 30 min, and drained. In cases where double-coupling was needed, Fmoc-AA-OH, HCTU, and DIEA were added again and the coupling step was repeated for another 30 min. The resin was then washed by shaking in DMF. DMF was drained, and the wash step was repeated two more times.
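The equivalents and final concentrations quoted for the automated coupling step can be checked from the stock concentrations and delivered volumes. The Python sketch below does this under the assumption that the volumes are additive and that solvent retained by the swollen resin is negligible; the function and variable names are illustrative. With these inputs the HCTU excess computes to about 5.4 eq, close to the rounded 5.5 eq stated in the text.

```python
def coupling_mixture(scale_mmol, reagents):
    """Return equivalents and final concentration (mM) for each reagent.

    reagents maps a name to (volume in mL, stock concentration in mol/L).
    Assumes additive volumes and ignores solvent held by the resin.
    """
    total_volume_ml = sum(volume for volume, _ in reagents.values())
    summary = {}
    for name, (volume_ml, stock_molar) in reagents.items():
        amount_mmol = volume_ml * stock_molar  # mL x mol/L = mmol
        summary[name] = {
            "equivalents": amount_mmol / scale_mmol,
            "final_mM": amount_mmol / total_volume_ml * 1000.0,
        }
    return summary


# Volumes and stock concentrations as described for the automated synthesizer.
AUTOMATED_COUPLING = {
    "Fmoc-AA-OH": (1.0, 0.3),
    "HCTU": (0.9, 0.3),
    "DIEA": (0.45, 1.0),
}

for name, values in coupling_mixture(0.05, AUTOMATED_COUPLING).items():
    print(f"{name}: {values['equivalents']:.1f} eq, {values['final_mM']:.0f} mM")
```

The same function can be reused for other scales or stock solutions by changing the inputs.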
If performing an acetyl capping step, 6 mL of capping solution was added to the resin, and the resin was shaken for 5 min. The solution was drained, and the step was repeated one more time. The resin was washed three times with DMF. If no capping step was performed, the resin was washed three times with DMF after coupling. DMF was drained, and the next synthesis cycle was performed, starting with Fmoc deprotection. The N-terminal residue of each peptide was coupled as the Boc-AA-OH so that the Boc group could be removed during TFA cleavage.
At the end of synthesis, the resin was transferred from the reaction vessel to a 10 mL Poly-prep chromatography column (Bio-Rad) using DMF. For transferring resin, a thick tip plastic transfer pipette (Samco Scientific) was used. The weight of the empty column before the addition of resin was recorded in order to calculate the dry weight of the resin. The resin was washed with DMF, then with DCM. The resin was partially dried over a vacuum, and the column containing the resin was lyophilized. Resin was then stored at -20 °C. When synthesizing multiple peptides in parallel, the synthesizer was programmed to pause each time one of the peptides finished synthesis, so that the resin could be transferred and lyophilized. After a 15 min swelling step in DMF, synthesis of the remaining peptides was resumed.
Manual peptide synthesis
Manual synthesis was performed using a glass peptide reactor vessel. Dry resin was measured, and the resin was transferred into the reactor using DMF. Resin was swelled for
15 min by agitating the resin using N2 flow through the bottom of the reactor. DMF was then drained using vacuum.
Fmoc deprotection was performed using 20% piperidine in NMP, with mixing by N2 agitation for 3 min. The resin was drained and deprotection was repeated two more times. The resin was flow-washed with DMF. Flow-wash was performed for 30 seconds with the bottle pointed toward one side of the vessel, and 30 seconds toward the other side. The 30 second flow-wash was repeated twice on each side.
For a typical manual coupling cycle, 4.4 eq Fmoc-AA-OH, 4 eq HCTU, and 8.8 eq DIEA were used. The Fmoc-AA-OH and HCTU were dissolved in 1 mL DMF. DIEA was added, and the amino acid was pre-activated by shaking for 5 minutes. When using 0.05 mmol of resin, the concentrations of Fmoc-AA-OH, HCTU, and DIEA were approximately 200 mM, 180 mM, and 400 mM, respectively. The activated amino acid solution was added to the resin, and the resin was mixed by agitation.
After 30 min, reaction completion was assessed by performing a ninhydrin test on 10-20 resin beads. The resin sample was transferred to a 0.8 mL Micro bio-spin column (Bio-Rad), and washed with DMF, then with DCM. The resin was dried over a vacuum, and transferred into a glass test tube. Two drops each of Monitors 1, 2, and 3 (Anaspec) were added to the resin, and the sample was incubated at 85 °C for 2 minutes. The color was assessed by diluting the sample 10-fold with ethanol. Sufficient presence of unprotected primary amine gave rise to a blue/violet color. No detectable primary amine resulted in a clear to light yellow color.
If coupling was complete as assessed by ninhydrin, the resin was flow-washed three times with DMF for the next step. If not, the resin was allowed to couple for 30 more min. If coupling was not complete after 1 h of coupling, the resin was flow-washed three times, and the coupling reaction was repeated one more time.
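The decision rule for the manual coupling cycle can be summarized as a small function. This Python sketch only restates the procedure described above; the function name, argument names, and returned wording are illustrative, and a blue/violet ninhydrin result is taken to mean that free amine is still present.

```python
def next_coupling_action(minutes_coupled, free_amine_detected):
    """Suggest the next step of a manual coupling cycle from the ninhydrin result.

    minutes_coupled: total coupling time so far for this residue.
    free_amine_detected: True for a blue/violet ninhydrin test (incomplete coupling).
    """
    if not free_amine_detected:
        return "flow-wash three times with DMF and proceed to the next step"
    if minutes_coupled < 60:
        return "continue coupling for 30 more min, then repeat the ninhydrin test"
    return "flow-wash three times with DMF and repeat the coupling reaction once"


# Example: incomplete coupling after the first 30 min
print(next_coupling_action(30, True))
```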
Acetyl capping of unreacted amines was achieved using 15:15:70 Acetic anhydride:DIEA:DMF, which was prepared immediately prior to the capping reaction.
Half of the prepared capping solution was added to the resin, and the resin was agitated for
5 minutes. The solution was then drained, and the capping step was repeated using the remaining half of the capping solution. The resin was then flow-washed three times. At the end of synthesis, the resin was transferred to a 10 mL Poly-prep chromatography column.
Manual synthesis of Dbz(Alloc) resin
3-Fmoc-Dbz-OH was loaded on the Rink amide resin using 4.4 eq 3-Fmoc-Dbz-OH, 4 eq HCTU, and 8.8 eq DIEA, in the same way as an Fmoc-AA-OH. When Arg tags were required, Fmoc-Arg(Pbf)-OH was loaded on the resin before the Dbz. Reaction completion was assessed using ninhydrin. The resin was washed with DMF, transferred to a Poly-prep chromatography column, and washed with DCM. The resin was dried briefly using vacuum to remove the DCM.
The column was removed from vacuum, and the bottom was plugged with a yellow cap. The cap was wrapped with parafilm to prevent leakage. Anhydrous DCM (Sigma) was added to the resin to approximately 3/4 the maximum volume of the column. Allyl chloroformate was added to a final concentration of 250 mM. 1 eq DIEA was then added to the column, and the column was shaken at room temperature for 24 h.
Loading the first amino acid on Dbz(Alloc)
Amino acid coupling directly onto 4-Alloc-Dbz resin was accomplished using 16.5 eq
Fmoc-AA-OH, 15 eq HCTU, and 33 eq DIEA. The Fmoc-AA-OH, HCTU, DIEA mixture was preactivated for 5 min before adding to the resin. Coupling was allowed to proceed for
1 h, followed by acetyl capping. The remaining residues were added using standard molar excesses.
Loading Fmoc-His(Trt)-OH on Dbz with minimal racemization
Fmoc-His(Trt)-OH and 6-Cl-HOBt were dissolved in DMF to a final concentration of 330 mM Fmoc-His(Trt)-OH and 300 mM 6-Cl-HOBt, and immediately added, without pre-activation, to the unprotected Dbz resin. DIC was added to the resin to a final concentration of 300 mM, and coupling was allowed to proceed for 45 min. The resin was washed with DMF, then with DCM. Sample resin cleavage and RP-HPLC analysis were performed in order to assess the extent of coupling. Coupling was repeated if incomplete.
Symmetric anhydride coupling on HMBA
Amino acid was loaded on the HMBA linker using the symmetric anhydride method. 10 eq Fmoc-AA-OH was dissolved in DCM. DMF was added dropwise until the amino acid was completely dissolved. 5 eq DIC was added, and the mixture was incubated on ice with occasional shaking for 30 minutes. The amino acid solution was filtered in order to remove the precipitate, and the filtrate was added to the resin. 0.1 eq DMAP was added as a catalyst. The reaction was allowed to proceed for 1 hour, and the coupling was repeated one more time to ensure complete reaction.
Alloc Deprotection
DCM (typically 7 mL) was added to the resin in the Poly-prep chromatography column and the resin was incubated at room temperature for 20 min. 0.35 eq Pd(PPh3)4 and 20 eq phenylsilane were added to the resin. The column was shaken for 45 minutes. The resin was flow-washed with DCM, dried, and lyophilized. Alternatively, the resin could be taken directly to Nbz conversion.
Nbz Conversion in DCM
DCM (typically 7 mL) was added to the resin in the Poly-prep chromatography column and the resin was incubated at room temperature for 20 min. NPCF was added to a final concentration of 50 mM. The column was nutated at room temperature for 30 minutes. The resin was washed with DCM followed by DMF. 0.5 M DIEA in DMF was then added to the resin, and the column was nutated at room temperature for 15 minutes. For successful
Nbz conversions, the DIEA solution typically turned bright yellow immediately upon its addition to resin. The resin was then washed with DMF, then with DCM. The resin was dried and lyophilized.
Nbz conversion in DMF and NMP
For Nbz conversion in DMF, the resin was swollen in dry DMF. DMF was dried by adding molecular sieves that had been baked overnight at 300 °C. After the addition of the molecular sieves, DMF was allowed to dry for at least 5 h before use. The dry DMF must be used within a week of its preparation. Using dry DMF that was more than a week old resulted in incomplete Nbz conversions. Solid NPCF was added directly to the resin/dry
DMF suspension to 50 mM. Reaction was allowed to proceed for 30 min. The remaining step was identical to Nbz conversion in DCM. Un-dried DMF could be used for the DIEA treatment. For Nbz conversion in NMP, the same steps described above were performed using NMP.
Peptide cleavage
For analytical cleavages, peptides were cleaved at the Rink linker in 95:2.5:2.5
TFA:TIS:H2O for 2 hours. For sequences containing Cys, 94:2.5:2.5:1
TFA:H2O:ethanedithiol:TIS was used. TFA was eluted in a 1.7 mL centrifuge tube, and
TFA was concentrated with a stream of nitrogen. Peptide was precipitated by adding cold diethyl ether to the tube. For maximal precipitation of peptide, the volume of diethyl ether should be at least five times the volume of remaining TFA. The sample was centrifuged, and the ether was decanted. Ether wash was repeated two more times. The pellet was allowed to air-dry, and then resuspended in 1:1 Solvent A:ACN. The relative ratio of
Solvent A and ACN varied depending on the solubility of the peptide. This sample could be immediately analyzed using RP-HPLC and MALDI-TOF. The peptide could also be flash-frozen and lyophilized before analysis, and lyophilized peptides could be stored at -80 °C. For preparative cleavages, the procedure was almost identical. However, cleavage was allowed to proceed for at least 3 h instead of 2 h. TFA was eluted into a 50 mL Corning centrifuge tube (with plug-cap). After elution, TFA was added to the resin and the column was shaken for 1 min. The TFA was eluted into the same 50 mL tube. The TFA wash was repeated two more times to ensure maximum peptide extraction.
Synthesis and purification of H4 peptides
H4-A (acSer1-Leu37)-Nbz
ac-SGRGKGGKGLGKGGAKRHRKVLRDNIQGITKPAIRRL-Nbz
H4-A peptide was synthesized with 0.05 mmol of mFmoc-Dbz(Alloc) resin. The N-terminus was acetylated to mimic the constitutive acetylation of eukaryotic H4. Alloc deprotection was followed by Nbz conversion in DCM, which resulted in several additional products.
For purification, 25 mg of lyophilized crude peptide was dissolved in 4 mL of 15% Solvent B. The sample was centrifuged, and the supernatant was loaded on semi-preparative RP-HPLC using a gradient of 15-30 % solvent B over 40 minutes. If the total crude peptide was more than 25 mg, two or more purification runs were performed. No more than 25 mg was loaded for each semi-preparative run. Collected fractions were assessed by MALDI-TOF MS and RP-HPLC. Pure fractions were combined and lyophilized.
H4-A-K5ac,K12ac (Ser1-Leu37)-Nbz(formyl)-Arg
SGRGKacGGKGLGKacGGAKRHRKVLRDNIQGITKPAIRRL-Nbz-R
H4-A-K5ac,K12ac peptide was synthesized with 0.04 mmol of mFmoc-Dbz(Alloc)-R resin. The N-terminal Ser was added as Fmoc-Ser(tBu)-OH. After Alloc deprotection, Nbz conversion was carried out in dry DMF to generate the Nbz(formyl) derivative, and the N-terminal Fmoc was removed by treatment with 1% DBU in DMF for three minutes.
For purification, 25 mg of lyophilized crude peptide was dissolved in 4 mL of 15% Solvent
B. The sample was centrifuged, and the supernatant was loaded on semi-preparative RP-
HPLC using a gradient of 10-30 % solvent B over 40 minutes.
H4-B (Thz38-Gly56)-Nbz
Thz-RRGGVKRISGLIYEETRG-Nbz
H4-B peptide was synthesized with 0.05 mmol of mFmoc-Dbz(Alloc) resin.
For purification, 50 mg of lyophilized crude peptide was dissolved in 4 mL of 20% Solvent
B. The sample was centrifuged, and the supernatant was loaded on preparative RP-HPLC using a gradient of 20-35 % solvent B over 40 minutes. If the crude peptide was more than
50 mg, two or more purification runs were performed. No more than 50 mg was loaded for each semi-preparative run.
H4-H (Pen57-H75)-Nbz(formyl)-Arg
dmThz-LKVFLENVIRDAVTYTEH-Nbz(formyl)-R
H4-H was synthesized with 0.05 mmol of unprotected Dbz-Arg resin. Histidine was loaded using minimal racemization conditions. Alloc deprotection was performed using dry DMF.
For purification, 50 mg lyophilized crude peptide was resuspended in 7 mL Solvent B. The sample was vortexed for 5 min, and 1 mL Solvent A was added. After vortexing for another
5 min, another 1 mL Solvent A was added. This process was repeated until most of the peptide was dissolved. Then 3 mL of Solvent A was added at a time, vortexing in between each addition, until the total volume of the sample reached 20 mL. The sample was centrifuged, and supernatant was loaded on preparative RP-HPLC using a gradient of 35-
50 % solvent B over 40 minutes.
H4-C (Thz76-Gly102-HMBA-Arg-Gly)-Nbz Met84Nle
H4-C peptide was synthesized with 0.05 mmol of Gly-HMBA-Arg-Gly-Dbz(Alloc) resin.
Nle was used in place of Met in order to avoid oxidation.
For purification, 50 mg lyophilized peptide was dissolved in 4 mL of 30% Solvent B. The sample was centrifuged, and supernatant was loaded on preparative RP-HPLC using a gradient of 30-45 % solvent B over 40 minutes.
H4-C-K91ac (Thz76-Gly102-HMBA-Arg-Gly)-Nbz
Thz-KRKTVTANleDVVYALKacRQGRTLYGFGG-HMBA-RG-Nbz
H4-C-K91ac peptide was synthesized with 0.05 mmol of Gly-HMBA-Arg-Gly-Dbz(Alloc) resin.
For purification, 50 mg lyophilized peptide was dissolved in 4 mL of 30% Solvent B. The sample was centrifuged, and supernatant was loaded on preparative RP-HPLC using a gradient of 30-45 % solvent B over 40 minutes.
H4-(76-102)-K79ac (Thz76-Gly102) for semi-synthetic H4-K79ac
Peptide was synthesized using 0.05 mmol of 100-200 mesh Gly-Wang155 resin
(Novabiochem) using standard SPPS methods. The N-terminal residue was added as Boc-
Thz-OH. Cleaved and lyophilized peptide was dissolved in 0.4 M methoxylamine in 1:1
H2O:ACN. The Thz deprotection was allowed to proceed for at least 2 h, and reaction completion was assessed by MALDI-TOF MS.
Solvent B was added to the peptide solution so that the final percentage of ACN was 27%.
The sample was centrifuged, and the volume corresponding to 50 mg of crude peptide was loaded on preparative RP-HPLC using a gradient of 30-45 % solvent B over 40 minutes.
Synthesis and purification of CENP-A peptides
CpA-1 (Gly2-Gly34)-Nbz
GPRRRSRKPEAPRRRSPSPTPTPGPSRRGPSLG-Nbz
CpA-1 peptide was synthesized with 0.05 mmol of mFmoc-Dbz(Alloc) resin. The N-terminal Gly was added as Fmoc-Gly-OH. After Alloc deprotection and Nbz conversion, Fmoc was removed using 1% DBU before cleavage from the resin with TFA.
For purification, 50 mg lyophilized peptide was dissolved in 4 mL of 12% Solvent B. The sample was centrifuged, and supernatant was loaded on preparative RP-HPLC using a gradient of 12-25% solvent B over 40 minutes.
CpA-2 (Thz35-Leu70)-Nbz(formyl)
Thz-SSHQHSRRRQGWLKEIRKLQKSTHLLIRKLPFSRL-Nbz(formyl)
CpA-2 peptide was synthesized with 0.05 mmol of mFmoc-Dbz(Alloc) resin with the automated synthesizer. The N-terminal residue was added as Boc-Thz-OH. Alloc was removed, and Nbz conversion was performed using dry DMF.
For purification, 50 mg lyophilized peptide was resuspended with 1.2 mL of Solvent B.
The sample was vortexed, and 1 mL of Solvent A was added. The sample was vortexed and 1 mL of Solvent A was again added. The sample was vortexed until most peptide dissolved, and the volume of the sample was brought up to 4 mL using Solvent A. The sample was centrifuged, and supernatant was loaded on preparative RP-HPLC using a gradient of 30-60% solvent B over 40 minutes.
CpA-2 (Thz35-Leu70)-O-Cys(StBu)
The O to S resin was prepared manually using 0.05 mmol PEGA resin. Fmoc Rink linker was coupled using 4.4 eq Fmoc Rink linker, 4 eq HCTU, and 7.95 eq DIEA. After Fmoc deprotection, Fmoc-Cys(StBu)-OH was coupled using the same conditions. The resin was deprotected and a ninhydrin test was performed. The ninhydrin resin sample was kept for comparison purposes.
Using a plastic transfer pipette, the resin was transferred into a 20 mL scintillation vial on ice using cold 0.5 M HCl solution. 2 mL of HCl solution was used per gram of wet PEGA measured. While stirring the mixture with a small magnetic stir bar, 1.4 M KNO2 was added dropwise over a period of 20 min. The volume ratio of HCl to KNO2 should be 2:1 so that the final concentration was 0.33 M HCl and 0.47 M KNO2.
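The stated final concentrations follow from simple dilution when the two solutions are combined at a 2:1 volume ratio. The sketch below checks that arithmetic; it assumes additive volumes and ignores both the volume occupied by the wet resin and any reagent consumed during the reaction, and the 2 mL and 1 mL volumes are example values only.

```python
def mixed_concentration(stock_molar, stock_volume_ml, total_volume_ml):
    """Concentration (M) of a component after dilution into the combined volume."""
    return stock_molar * stock_volume_ml / total_volume_ml


# Example volumes in the 2:1 HCl:KNO2 ratio given in the text.
hcl_volume_ml = 2.0
kno2_volume_ml = 1.0
total_volume_ml = hcl_volume_ml + kno2_volume_ml

hcl_final = mixed_concentration(0.5, hcl_volume_ml, total_volume_ml)
kno2_final = mixed_concentration(1.4, kno2_volume_ml, total_volume_ml)

print(f"HCl:  {hcl_final:.2f} M")
print(f"KNO2: {kno2_final:.2f} M")
```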
After adding the KNO2 solution, the vial was lightly capped, and the sample was allowed to stir at room temperature. Reaction completion was assessed by performing a ninhydrin test, and the resulting color was compared to the ninhydrin test before conversion. Reaction was typically complete after 4 h, and an almost clear solution was obtained from the ninhydrin test.
After confirming reaction completion, the resin was transferred back into the glass peptide reactor. The resin was flow-washed three times with water. A saturated sodium bicarbonate solution was prepared by adding sodium bicarbonate to water until no more of the solute dissolved. This saturated solution was used to flow-wash the resin three times. The resin turned red upon the addition of the solution. The resin was incubated in the sodium bicarbonate solution for 15 minutes. The resin was flow-washed again with water, and then flow-washed with DMF.
The symmetric anhydride coupling method was used to load the first amino acid. The resin was then loaded on the synthesizer to couple the remaining residues. The peptide was cleaved with 95:2.5:2.5 TFA:H2O:TIS for 3 h.
For purification the lyophilized crude peptide was dissolved in 4 mL of 30% Solvent B.
The sample was centrifuged, and supernatant was loaded on semi-preparative RP-HPLC using a gradient of 30-60% solvent B over 40 minutes.
CpA-3 (Thz71-Ala97)-Nbz-Arg-Arg
Thz-REISVKFTRGVDFNWQAQALLALQEA-Nbz-RR
CpA-3 peptide was synthesized with 0.05 mmol of mFmoc-Dbz(Alloc)-RR resin with the automated synthesizer. The two Arg tags were added for improved solubility. The N-terminal amino acid was added as Boc-Thz-OH.
For purification, 50 mg lyophilized crude peptide was resuspended in 6 mL Solvent B. The sample was vortexed for 5 min, and 1 mL Solvent A was added. After vortexing for another
5 min, another 1 mL Solvent A was added. This process was repeated until most of the peptide was dissolved. Then 3 mL of Solvent A was added at a time, vortexing in between each addition, until the total volume of the sample reached 20 mL. The sample was centrifuged, and supernatant was loaded on preparative RP-HPLC using a gradient of 30-
45 % solvent B over 40 minutes.
CpA-4 (Thz98-H115)-Nbz-R
Thz-EAFLVHLFEDAYLLTLH-Nbz-R
CpA-4 peptide was synthesized with 0.05 mmol of Dbz-R resin. Fmoc-His(Trt)-OH was coupled manually using the minimal racemization method. The N-terminal amino acid was added as Boc-Thz-OH.
For purification, 50 mg lyophilized crude peptide was resuspended in 7 mL Solvent B. The sample was vortexed for 5 min, and 1 mL Solvent A was added. After vortexing for another
5 min, another 1 mL Solvent A was added. This process was repeated until most of the peptide was dissolved. Then 3 mL of Solvent A was added at a time, vortexing in between each addition, until the total volume of the sample reached 20 mL. The sample was centrifuged, and supernatant was loaded on preparative RP-HPLC using a gradient of 35-
60 % solvent B over 40 minutes.
CpA-5-K124ac (Thz116-Gly140-HMBA-Arg-Gly)-Nbz
Thz-GRVTLFPKacDVQLARRIRGLEEGLG-HMBA-RG-Nbz
CpA-5 peptide was synthesized with 0.05 mmol of Gly-HMBA-Arg-Gly-Dbz(Alloc) resin.
The Gly was coupled to the HMBA using the symmetric anhydride method.
For purification, 50 mg lyophilized crude peptide was dissolved in 4 mL of 30% Solvent
B. The sample was centrifuged, and the supernatant was loaded on preparative RP-HPLC using a gradient of 30-55 % solvent B over 40 minutes.
SP-NCL
Buffers
SP Wash Buffer: 0.1 M Phosphate, 6 M GuHCl, pH 7
SP Deprotection Buffer: 0.1 M Phosphate, 0.4 M Methoxylamine, 6 M GuHCl, pH 4
SP Ligation Buffer: 0.1 M Phosphate, 0.05 M MPAA, 6 M GuHCl, pH 7
Synthesis note: after addition of the H4-C-HMBA-RG and CpA-5-HMBA-RG peptides, the resin should not be stored in methanol due to susceptibility of the HMBA linker to methanolysis. All ligations and deprotections, whether on solid-phase or in solution, were conducted at room temperature unless otherwise indicated.
Base resin synthesis for SP-NCL
All base resins were prepared manually using 0.2 mmol/g PL-PEGA resin. Theoretical loading in methanol was 0.01 mmol/mL. PEGA resin was transferred into a graduated Poly-prep chromatography column using methanol. The starting bed volume of PEGA in methanol was recorded. The bed volume of the resin swollen in methanol was taken as the volume of the resin. The resin was then transferred to a glass peptide reactor using
DMF, and the resin was flow-washed four times in DMF. The resin was then swelled in
DMF for 15 min.
In order to reduce steric crowding, the loading of the resin was reduced 10-fold to 0.02 mmol/g by coupling of the resin with a mixture of 1:9 Fmoc-Gly-OH:Boc-Gly-OH.
Standard coupling excesses were used for the other residues after the loading cut. Rink linker was loaded as Fmoc-Rink linker, and the N-terminal Thz was added as Fmoc-Thz-
OH. After synthesis, the resin was transferred back into the graduated chromatography column, and washed with methanol. The new bed volume in methanol was recorded.
Methanol was added to cover the resin, and the resin was stored at 4 °C. The sequences of the base resins used are listed below along with their theoretical loading in methanol. Gly where the loading cut is performed is indicated in bold.
H4 SP-NCL: Fmoc-Thz-Ala-Rink-Gly-Gly-PEGA (1.66 µmol/mL loading in methanol)
CENP-A SP-NCL: Fmoc-Thz-Ala-Gly-Gly-Rink-Gly-PEGA (1.66 µmol/mL)
H4 and CENP-A Hybrid NCL: Fmoc-Thz-Ala-Ahx-Rink-Gly-PEGA (1.33 µmol/mL)
Theoretical loading was calculated with the following equation:
\[
\text{Loading} = \frac{0.01\ \tfrac{\text{mmol}}{\text{mL}} \times \text{initial volume of PEGA in methanol} \times 0.1}{\text{final volume of PEGA in methanol}}
\]
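The loading calculation can be sketched as follows, reading it as the 0.01 mmol/mL starting loading multiplied by the initial bed volume and by the 10-fold loading cut (0.1), then divided by the final bed volume. The bed volumes used here are hypothetical values chosen only to illustrate the arithmetic, and the function name is an assumption of this sketch.

```python
def theoretical_loading_umol_per_ml(initial_bed_ml, final_bed_ml,
                                    base_loading_mmol_per_ml=0.01,
                                    loading_cut=0.1):
    """Theoretical base-resin loading in methanol, in umol per mL of bed volume."""
    # Total reactive sites remaining after the 10-fold loading cut (mmol).
    mmol_on_resin = base_loading_mmol_per_ml * initial_bed_ml * loading_cut
    # Spread over the bed volume measured after synthesis; convert mmol to umol.
    return 1000.0 * mmol_on_resin / final_bed_ml


# Hypothetical bed volumes (mL) before and after base resin synthesis.
loading = theoretical_loading_umol_per_ml(initial_bed_ml=5.0, final_bed_ml=3.0)
print(f"{loading:.2f} umol/mL")
```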
Quantification of product from dry PEGA resin
If the final dry weight of PEGA resin after ligation is known, theoretical starting weight of the PEGA resin can be calculated using the following equation:
\[
\text{Starting weight} = \frac{\text{Final weight}}{1 + 0.02\ \tfrac{\text{mmol}}{\text{g}} \left(MW_{\text{total}}\right)}
\]
Where MWtotal is total molecular weight of all the components added on the resin, starting from the first Gly to the full SP-NCL peptide. Once the starting weight is calculated, the theoretical yield of the cleaved peptide is calculated using the following equation:
\[
\text{Theoretical yield} = \left(\text{Starting weight}\right)\left(0.02\ \tfrac{\text{mmol}}{\text{g}}\right)\left(MW_{\text{peptide}}\right)
\]
Where MWpeptide is the molecular weight of the cleaved peptide product.
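The two calculations above can be sketched together as shown below. The sketch assumes molecular weights are given in g/mol, so the 0.02 mmol/g loading is converted to mol/g where a dimensionless mass gain is needed. The final resin weight and both molecular weights are hypothetical example values, not measurements from this work.

```python
LOADING_MMOL_PER_G = 0.02  # resin loading after the 10-fold loading cut


def starting_resin_weight_g(final_weight_g, mw_total_g_per_mol):
    """Back-calculate the starting dry resin weight from the final dry weight."""
    # Mass added per gram of starting resin: (mol/g) x (g/mol) is dimensionless.
    added_g_per_g = LOADING_MMOL_PER_G * 1e-3 * mw_total_g_per_mol
    return final_weight_g / (1.0 + added_g_per_g)


def theoretical_yield_mg(start_weight_g, mw_peptide_g_per_mol):
    """Theoretical yield of cleaved peptide; g x (mmol/g) x (mg/mmol) gives mg."""
    return start_weight_g * LOADING_MMOL_PER_G * mw_peptide_g_per_mol


# Hypothetical example: 120 mg of dry resin after ligation.
start_g = starting_resin_weight_g(0.120, mw_total_g_per_mol=12000.0)
yield_mg = theoretical_yield_mg(start_g, mw_peptide_g_per_mol=11300.0)
print(f"starting resin: {start_g * 1000:.1f} mg, theoretical yield: {yield_mg:.1f} mg")
```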
Fmoc deprotection
The base resin was transferred to a Bio-spin column (Bio-Rad) using methanol. The volume of the base resin in methanol was recorded in order to calculate the theoretical yield. The resin was flow-washed with DMF, and it was swelled by incubating in DMF for 20 minutes. The DMF was drained to just above the bed volume of the resin, and one column volume of 20% piperidine in NMP was added. The resin was incubated for 5 minutes. The piperidine was drained, and the resin was flow-washed with one column volume of 20% piperidine. Piperidine was drained to just above the bed volume of the resin, and the deprotection step was repeated two more times. The resin was then washed with 5 column volumes of DMF, followed by 5 column volumes of methanol, and then 5 column volumes of water. The resin was nutated in water for 10 minutes. The resin was then flow-washed with 3 column volumes of SP Wash Buffer, and nutated in SP Wash Buffer for 5 minutes. The flow-wash/nutation step was repeated 3 more times.
Thz deprotection
SP Wash buffer was drained, and the resin was flow-washed with three column volumes of SP Deprotection Buffer. The resin was nutated in SP Deprotection Buffer. After 1 h, the buffer was drained, and SP Deprotection Buffer was added again to the resin. The buffer was replaced at least two more times, with reaction proceeding for a total of 5 hours
(thiazolidine opening to Cys) or 12 hours (dimethyl-Thz opening to penicillamine).
MALDI-TOF MS of micro-cleavages was used to assess reaction completion.
Ligation
The flow-wash/nutation step was performed on the resin at least three times with SP Wash
Buffer. To reverse any disulfide formation, the resin was incubated with SP Wash Buffer
+ 10 mM TCEP for 10 minutes and then washed again with SP Ligation Buffer. The buffer was then drained from the resin. Peptide-Nbz was dissolved in SP Ligation Buffer + 20 mM TCEP, and added to the resin. The concentration of peptide was kept between 1-3 mM for optimal kinetics, and ligation was allowed to proceed at least 6 h.
Micro-cleavages to assess reaction progress
To assess the progress of ligation and deprotection, approximately 1% of the resin was cleaved as follows: a resin sample was taken using a
P200 micro-pipette with a cut-off tip, and transferred into a 0.8 mL Micro bio-spin column.
The resin was washed with SP Wash Buffer + 20 mM TCEP and then with water. TFA was added to the resin and incubated for 30 min; residual water in the resin sample acted as a scavenger. The supernatant was collected by filtration, and the TFA was concentrated using N2 flow. 7:3 H2O:ACN was added to the sample in order to dilute the TFA, and the sample was analyzed by RP-HPLC and MALDI-TOF MS. TCEP was added prior to RP-HPLC analysis. In an alternative method, an ether wash can be performed on the sample. Eluted TFA was diluted with cold diethyl ether, and the sample was spun down using a microcentrifuge. Ether was carefully decanted, and the residual ether was allowed to air-dry. The sample was then resuspended and analyzed on the RP-HPLC. Ether is effective at removing TFA, preventing high concentrations of TFA from damaging the RP-HPLC column.
On-resin desulfurization
After lyophilization, the resin was resuspended in buffer containing 0.1 M Phosphate, 5 M GuHCl, 75 mM MESNa, 300 mM TCEP, pH 7.4 that had been sparged with argon for 30 minutes. VA-044-US was added to 10 mM, and the column was incubated at 42 °C.
Reaction was monitored by micro-cleavages followed by MALDI-TOF MS.
Desulfurization was allowed to proceed overnight.
Cleavage from the resin at the HMBA linker
The resin was washed (flow-wash/nutation cycle) 3 times with SP Wash Buffer, 1 time in
SP Wash Buffer + 10 mM TCEP, followed by 4 repetitions of flow-wash/nutation with water. The resin was then lyophilized. The peptide-resin was resuspended in 0.1 M NaOH, and the reaction was allowed to proceed for 30 minutes. The eluate was collected via filtration. To neutralize the solution, equal volume of 0.1 M HCl was added to the resin, and eluate was combined with first eluate. The resin was washed 3 times with TFA, and the filtrate was collected in a separate vessel. The neutralized base treated sample and the
TFA wash sample were lyophilized separately. The NaCl generated by the neutralization was removed by adding a small amount of water (no more than 50 µL). The sample was spun down, and the supernatant containing the salt was removed. The remaining pellet was resuspended in 1:1 H2O:ACN and lyophilized again to give the salt-free crude peptide.
Cleavage from the resin at the Rink linker
The resin was washed (flow-wash/nutation) 3 times with SP Wash Buffer, 1 time in SP
Wash Buffer + 10 mM TCEP, followed by 4 repetitions of flow-wash/nutation with water.
After lyophilization, the resin was treated with 95:2.5:2.5 TFA:H2O:TIS for 1 hour. The solution was then filtered, and the resin was washed three times with TFA. All eluate was combined and concentrated using a flow of N2. The sample was washed with ether, and the pellet was resuspended using a 7:3 H2O:ACN mixture and lyophilized.
SDS-PAGE
The formula for each of the solutions used in SDS-PAGE is listed in Appendix A. SDS loading buffer was added to the samples so that the final concentration was at least 2x SDS loading buffer. The samples were boiled on a heat block for 5-10 min, and briefly centrifuged. The samples were then loaded into the wells of the gel. A maximum of 15 µL could be loaded per well. The gel was run at 180 V for 40-50 min.
After the run was complete, the gel was transferred to a container containing Coomassie stain. The gel was stained for at least 2 h. The stain was then discarded, and destain was added to the gel. After 1 h, the destain was discarded, and fresh destain was again added.
The gel was allowed to destain for no more than 12 h.
Solution-phase NCL of Hybrid-phase ligation
Peptide-Nbz at a 1.5-fold molar ratio to the cysteinyl peptide was dissolved in SP Ligation Buffer, and TCEP was added to 20 mM. The final concentration of the cysteinyl peptide was 1 mM or more. Ligation proceeded for about 16 hours, and the reaction was monitored by SDS-PAGE and Ziptip followed by MALDI-TOF MS.
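Setting up the solution-phase ligation comes down to two quantities: the largest buffer volume that keeps the cysteinyl peptide at 1 mM or more, and the mass of peptide-Nbz corresponding to 1.5 molar equivalents. The sketch below shows that arithmetic; the peptide mass and both molecular weights are hypothetical placeholders, and the function name is illustrative.

```python
def ligation_setup(cys_peptide_mg, cys_peptide_mw, nbz_peptide_mw,
                   cys_conc_mM=1.0, nbz_equiv=1.5):
    """Return umol of cysteinyl peptide, maximum buffer volume, and peptide-Nbz mass."""
    # mg / (g/mol) gives mmol; multiply by 1000 for umol.
    cys_umol = cys_peptide_mg / cys_peptide_mw * 1000.0
    # 1 mM equals 1 umol/mL, so umol / mM gives mL.
    max_buffer_ml = cys_umol / cys_conc_mM
    # umol x (g/mol) gives ug; divide by 1000 for mg.
    nbz_mg = nbz_equiv * cys_umol * nbz_peptide_mw / 1000.0
    return {"cys_umol": cys_umol, "max_buffer_ml": max_buffer_ml, "nbz_mg": nbz_mg}


# Hypothetical example: 5 mg of a 10 kDa cysteinyl peptide and a 4 kDa peptide-Nbz.
setup = ligation_setup(cys_peptide_mg=5.0, cys_peptide_mw=10000.0, nbz_peptide_mw=4000.0)
print(setup)
```

Because the text specifies 1 mM or more, the returned volume is an upper limit; a smaller volume gives a more concentrated reaction.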
Preparation of SDS-PAGE samples by TCA precipitation
100 µL H2O and 25 µL TCA were added to 1-5 µL sample taken from ligation reactions in GuHCl. The sample was incubated at 4 °C for 10 minutes, and spun down using a microcentrifuge. The supernatant was decanted, and diethyl ether was added to wash the pellet. The sample was spun down again, and the ether was decanted. The sample was allowed to air-dry, and SDS loading buffer was added. No more than 0.5 µL 5 M NaOH was added to neutralize the sample.
Ziptip and MALDI-TOF MS
For crude MALDI-TOF MS analysis of ligation and desulfurization samples, GuHCl, salt, and other buffer components can be removed using C18 ZipTip pipette tips (EMD
Millipore). The tip was washed by pipetting up and discarding 10 µL Solvent A. 2-10 µL of sample was taken with the Ziptip, and the sample was pipetted up and down 5 times in order to assist the binding of the peptide. The tip was then washed 5 times with Solvent A.
2 µL of 30% elution buffer (7:3 Solvent A:ACN) and 2 µL of 70% elution buffer were prepared in separate tubes. The peptide was eluted by first pipetting the Ziptip up and down 5 times in the 30% elution buffer, dispensing back into the same tube. The Ziptip was then used to pipette up and down in the 70% elution buffer. 2 µL HCCA solution was mixed into both elution samples, and the two samples were spotted for MALDI-TOF MS.
Second Solution-phase NCL
For CENP-A where a second ligation in solution was required, methoxylamine was added to the ligation sample to make 0.4 M. pH was adjusted to 4 in order to deprotect the Thz.
The sample was then transferred to a D-tube dialyzer mini, molecular weight cut-off
(MWCO) 3-6 kDa (Novagen) and dialyzed against 200 mL of SP Wash Buffer at 4 °C.
Buffer change was performed after 5 h, where the dialysis tube was transferred to a fresh
200 mL SP Wash Buffer, and the second dialysis was allowed to go overnight.
The sample was transferred into a new tube. The dialysis tube was washed with a small amount of SP Wash Buffer, and added to the dialyzed sample. The pH of the sample was increased to 10 using NaOH in order to cleave the HMBA. Cleavage was allowed to proceed for no more than 30 minutes in order to prevent epimerization of amino acid residues. pH was brought down by the addition of MPAA to 50 mM. After adjusting the pH to 7.4, 1.5 eq CpA-1-Nbz was added to the solution. TCEP was added to the reaction to make 20 mM, and reaction was monitored by SDS-PAGE, Ziptip MALDI-TOF, and
RP-HPLC. Ligation was allowed to proceed for 16 h.
Desulfurization
Before desulfurization, the ligation reaction was dialyzed against 200 mL of SP Wash Buffer at 4 °C. Buffer change was performed after 5 h, and the second dialysis was allowed to go overnight. TCEP and MESNa were added to the dialyzed sample to give the following final composition: 0.1 M Phosphate, 6 M GuHCl, 75 mM MESNa, 250 mM TCEP, pH 7. The sample was sparged with argon for 30 minutes. To start desulfurization, VA-044-US was added to a final concentration of 10 mM. The sample was incubated in a 42 °C water bath for 16 hours and assessed by RP-HPLC and Ziptip MALDI-TOF MS for completion. If protein precipitates during desulfurization, the sample after dialysis should be diluted at least 5-fold with desulfurization buffer before sparging.
Purification of Synthetic histones
ACN was added to the desulfurization sample to 25% of the total volume. The sample was centrifuged and loaded onto an analytical column using a 25-70% B gradient over 50 min for H4, and a 25-90% B gradient (25-40% over 10 min, then 40-90% over 40 min) for CENP-A. Fractions were analyzed by MALDI-TOF MS and SDS-PAGE. Purity was assessed by MALDI-TOF MS, SDS-PAGE, and RP-HPLC. Pure fractions were combined and lyophilized.
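The two gradients can be written as lists of linear segments, which makes the solvent composition at any time point easy to compute. This is only a sketch: it assumes each segment is a linear ramp and that %B is held at the final value after the last segment, neither of which is stated explicitly in the text.

```python
def percent_b(t_min, segments):
    """Percent Solvent B at time t_min for a list of (duration, start %B, end %B)."""
    elapsed = 0.0
    for duration, start, end in segments:
        if t_min <= elapsed + duration:
            fraction = (t_min - elapsed) / duration
            return start + fraction * (end - start)
        elapsed += duration
    # Past the programmed gradient: hold at the final composition (assumption).
    return segments[-1][2]


H4_GRADIENT = [(50, 25, 70)]                    # 25-70% B over 50 min
CENPA_GRADIENT = [(10, 25, 40), (40, 40, 90)]   # 25-40% over 10 min, then 40-90% over 40 min

print(percent_b(25, H4_GRADIENT))     # midpoint of the H4 gradient
print(percent_b(30, CENPA_GRADIENT))  # midpoint of the second CENP-A segment
```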
Refolding histone tetramer
H4-K5ac,K12ac,K91ac was desulfurized for the second time before refolding. Lyophilized protein was dissolved in 100 µL of sparged desulfurization buffer (0.1 M Phosphate, 6 M
GuHCl, 75 mM MESNa, 300 mM TCEP, pH 7). VA-044-US was added to a final concentration of 10 mM, and the reaction was allowed to proceed for 4 hours at 42°C.
Protein quality was assessed by MALDI-TOF MS. This sample was used directly for refolding without purification or dialysis. Equimolar recombinant H3-C110A was added directly to this mixture and placed into a dialysis button. A 6-8 kDa MWCO dialysis tubing containing 50 mL of 25 mM Tris, 6 M GuHCl, 1 mM EDTA, 2 M NaCl, pH 7.5 was prepared, and the button was inserted into the tubing. The tubing was then dialyzed against
25 mM Tris, 1 mM EDTA, 2 M NaCl, pH 7.5.156 Three buffer changes were performed over the course of 3 days.
After dialysis, the sample was purified over a GE Healthcare Superdex 20/300 SEC column in 25 mM Tris, 1 mM EDTA, 2 M NaCl, pH 7.5. All fractions were collected and assessed by SDS-PAGE. Pure tetramer fractions were pooled and concentrated using Amicon Ultra 5 kDa centrifuge filters (EMD Millipore).
Refolding histone octamer
Histone octamer refolding was carried out with standard procedures similar to that of tetramer refolding156: equimolar histones were resuspended in 25 mM Tris, 1 mM EDTA,
6 M GuHCl, pH 7.5 and double-dialyzed extensively against 25 mM Tris, 1 mM EDTA, 2
M NaCl, pH 7.5. Octamer was purified using the same protocol described for tetramer.
Concentration of the octamer was determined by A280 using the total extinction coefficient of the four histones.
Nucleosome reconstitution
1.1 molar equivalent of 601 DNA157 (1:9 cy5-labeled:unlabeled DNA) was added to histone octamer in 10 mM Tris-HCl, 2 M NaCl, 1 mM EDTA, pH 7.4, and was dialyzed against 10 mM Tris-HCl, 1 mM EDTA, pH 7.4 overnight. Samples were loaded on PAGE and visualized using Typhoon imager.
For Nap1-assisted reconstitution, octamer in refolding buffer was added to 7.5 mM Tris, pH 7.4, 0.25 mM EDTA, 0.25 mM DTT, 0.1 mg/mL bovine serum albumin (BSA). His6- tagged yNap1 was added to the solution to a final concentration of 0.7 µM and incubated at 37 °C for 15 min. 1.1 molar equivalent of 1:9 cy5-labeled:unlabeled 601 DNA was then added, and the sample was incubated at 37 °C for 45 minutes.
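The DNA amounts for a reconstitution follow directly from the 1.1 molar equivalents and the 1:9 labeled:unlabeled ratio. A minimal sketch of that split is shown below; the 100 pmol octamer input is a hypothetical example, and the function name is illustrative.

```python
def dna_amounts_pmol(octamer_pmol, dna_equiv=1.1, labeled_parts=1, unlabeled_parts=9):
    """Split the total 601 DNA needed into Cy5-labeled and unlabeled portions (pmol)."""
    total = octamer_pmol * dna_equiv
    labeled = total * labeled_parts / (labeled_parts + unlabeled_parts)
    return {"total": total, "cy5_labeled": labeled, "unlabeled": total - labeled}


# Hypothetical example: 100 pmol of histone octamer.
dna = dna_amounts_pmol(100.0)
print(dna)
```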
Native PAGE
The gel was pre-run for 1 h in 0.3x TBE at a constant 300 V before loading. The wells were flushed immediately before loading. Ficoll was added to the reconstituted sample to 1x.
Samples were loaded with the gel still running. Fluorescence was measured using the
Typhoon imager.
His6-tagged CENP-A expression
His6-tagged CENP-A in pHCE vector was expressed in DH5α cells following the procedure described in Tanaka et al.158 A glycerol stock of DH5α transformed with pHCE was used to inoculate 5 mL LB Amp media, and the culture was grown for 24 h. The culture was then transferred to 500 mL of LB Amp media, and grown for 16 h. The culture was centrifuged at 3000 rpm for 20 min at 4 °C, and the pellet was resuspended in buffer containing 50 mM
Tris-HCl (pH 8) and 500 mM NaCl. The cells were lysed by sonication, and the sample was spun down at 23000 rpm for 20 min at 4 °C. The pellet was resuspended in buffer containing 50 mM Tris-HCl (pH 8), 500 mM NaCl, and 6 M urea and shaken for 2 h to solubilize the CENP-A protein. The sample was again spun down at 23000 rpm for 20 min at 4 °C. The supernatant containing CENP-A was separated from the pellet. The solubilization process was repeated on the remaining pellet. The two fractions of supernatant were added to an Econo-Pac Chromatography column (Bio-Rad) containing Ni-NTA resin. The column was nutated at 4 °C for 1 h. Flowthrough was collected, and the column was washed with buffer containing 50 mM Tris-HCl (pH 8), 500 mM NaCl, 6 M urea, and 20 mM imidazole, and eluted with 300 mM imidazole. Pure fractions were assessed by SDS-PAGE. Pooled fractions were dialyzed against water overnight, and the sample was lyophilized. The product was confirmed by MALDI-TOF MS. When recombinant CENP-A was dissolved in RP-HPLC Solvent A, a TFA adduct was observed on MALDI-TOF MS. In order to eliminate this species, the sample was instead dissolved in 0.1% formic acid in water and lyophilized.
Expressed Protein Ligation of H4-K79ac
H4(1-75)-intein-CBD Expression
5 mL LB Amp media was inoculated using a glycerol stock of BL21 (DE3) containing
H4(1-75)-intein(Mxe GyrA)-CBD in pTXB1 vector (NEB). The overnight culture was grown at 37 °C for 16 h. The 5 mL culture was added to 500 mL of LB Amp media (a typical expression was composed of 5 flasks of 500 mL media for a 2.5 L expression). The cells were induced using 0.2 mM IPTG once the optical density (OD600) reached 0.4. 4 h after induction, the cells were centrifuged at 3000 rpm for 20 min at 4 °C. Media was discarded, and the cell pellet was resuspended in lysis buffer (25 mM 4-(2-hydroxyethyl)-1-piperazineethanesulfonic acid (HEPES), pH 7.5, 1 mM ethylenediaminetetraacetic acid (EDTA), 1 M NaCl, and 1 mM phenylmethylsulfonyl fluoride (PMSF)), and stored at -80 °C.
Cells were thawed at room temperature, and lysed using sonication. The lysate was centrifuged at 23000 rpm for 20 min at 4 °C. The pellet was washed with Triton wash buffer (25 mM HEPES, pH 7.5, 1 mM EDTA, 1 M NaCl, 1% Triton-X), shaken for 1 h, and centrifuged at 23000 rpm for 20 min at 4 °C. Triton wash was repeated and the sample was again centrifuged. The pellet was washed with 25 mM HEPES, pH 7.5, 1 mM EDTA,
1 M NaCl, centrifuged, and decanted.
250 µL DMSO was added to the pellet, and the pellet was minced with a spatula. The pellet was allowed to soak in DMSO at room temperature for 30 min. 30 mL of 25 mM HEPES, pH 7.5, 6 M Urea, 1 mM EDTA, and 0.5 M NaCl was added to the pellet. The pellet was shaken for 1 h, and centrifuged at 23000 rpm for 20 min at 25 °C. Supernatant was collected, and this extraction step was repeated. The first supernatant was combined with the second supernatant.
H4(1-75)-intein-CBD Purification
The sample was split in half, and each half was purified by ion exchange over a 5 mL SP-FF column. Sample was loaded on the column using a peristaltic pump. The column was washed with 100 mM NaCl HEPES Urea buffer (25 mM HEPES, pH 7.5, 6 M Urea, 1 mM EDTA). Product was eluted using salt concentrations of 200 mM, 300 mM, 400 mM, and 500 mM NaCl in HEPES Urea buffer. All eluted fractions were analyzed by SDS-PAGE. Fractions containing H4(1-75)-intein-CBD were combined.
Thiolysis to produce H4(1-75)-SR
The sample was diluted to 50 mL using refolding buffer (25 mM HEPES, pH 7.5, 6 M Urea, 1 mM EDTA, 1 M NaCl), and was dialyzed overnight against 4 L of refolding buffer at 4 °C. MESNa was added to the sample to a final concentration of 100 mM. The sample was nutated at 4 °C for 18-24 h. The amount of H4(1-75) thioester was estimated based on the A280 and SDS-PAGE quantification.132 Below are the extinction coefficients (ε) of the relevant species in M-1cm-1.
H4(1-75)-intein-CBD: 37320
Intein-CBD: 34760
H4(1-75)-SR: 2560
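One way such an estimate might be set up is sketched below. It is an illustration, not the procedure used in this work: it assumes a 1 cm path length, that only the three protein species listed above absorb at 280 nm, and that the fraction of fusion protein cleaved is read from SDS-PAGE band intensities. The function name and the example reading are hypothetical. Because the coefficients of intein-CBD and H4(1-75)-SR sum to that of the intact fusion, A280 does not change on thiolysis and reports the total fusion-equivalent concentration.

```python
# Extinction coefficients at 280 nm in M^-1 cm^-1, copied from the list above.
EPS_FUSION = 37320      # H4(1-75)-intein-CBD
EPS_INTEIN_CBD = 34760  # intein-CBD released by thiolysis
EPS_THIOESTER = 2560    # H4(1-75)-SR


def estimate_thioester_molar(a280, fraction_cleaved, path_cm=1.0):
    """Estimate the molar concentration of H4(1-75)-SR.

    a280             -- measured absorbance at 280 nm
    fraction_cleaved -- fraction of fusion protein cleaved (0 to 1),
                        assumed here to come from SDS-PAGE densitometry
    path_cm          -- cuvette path length in cm (assumed 1 cm)
    """
    if not 0.0 <= fraction_cleaved <= 1.0:
        raise ValueError("fraction_cleaved must be between 0 and 1")
    # Each cleaved fusion gives one intein-CBD plus one thioester, and their
    # coefficients add up to EPS_FUSION, so A280 measures total fusion
    # equivalents whether or not cleavage has occurred.
    total_molar = a280 / (EPS_FUSION * path_cm)
    return fraction_cleaved * total_molar


if __name__ == "__main__":
    # Hypothetical reading: A280 = 0.3732 with half of the fusion cleaved.
    conc = estimate_thioester_molar(0.3732, 0.5)
    print(f"Estimated H4(1-75)-SR: {conc * 1e6:.1f} uM")
```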
Ligation
Ten equivalents of H4(76-102)-K79ac peptide, relative to the calculated amount of H4(1-75)-SR, were added to the thioester solution. The final concentration of the K79ac peptide was 1 mM. Ligation was allowed to proceed overnight. Ligation was assessed by SDS-PAGE and by MALDI-TOF MS of the crude, ZipTip-prepared sample.
Desulfurization
Before desulfurization, the buffer of the sample was changed to 25 mM HEPES, pH 7.5, 6 M GuHCl, 1 mM EDTA, 1 M NaCl using a concentrator (Amicon centrifugal filter, 5 kDa). MESNa was added to 75 mM, and 1 M TCEP, pH 7.4, was added to 300 mM. The sample was sparged with argon for 30 min. 0.5 M VA-044-US was added to 10 mM, and the sample was incubated in a 42 °C water bath. Desulfurization was allowed to proceed overnight. Complete desulfurization was determined by MALDI-TOF MS of ZipTip-prepared sample and by RP-HPLC.
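The stock additions in this step follow the usual C1V1 = C2V2 dilution relation. The short sketch below works through it for a hypothetical 1000 µL final reaction volume; the helper name and the volume are illustrative assumptions, and MESNa is assumed to be added in a form that contributes negligible volume.

```python
def stock_volume_uL(final_conc_mM, stock_conc_mM, final_volume_uL):
    """Volume of stock (uL) that gives final_conc_mM in final_volume_uL."""
    if stock_conc_mM <= 0:
        raise ValueError("stock concentration must be positive")
    if final_conc_mM > stock_conc_mM:
        raise ValueError("final concentration cannot exceed the stock")
    # C1 * V1 = C2 * V2, solved for V1
    return final_conc_mM / stock_conc_mM * final_volume_uL


if __name__ == "__main__":
    final_uL = 1000.0  # hypothetical final reaction volume
    tcep_uL = stock_volume_uL(300, 1000, final_uL)  # 1 M TCEP to 300 mM
    va044_uL = stock_volume_uL(10, 500, final_uL)   # 0.5 M VA-044-US to 10 mM
    protein_uL = final_uL - tcep_uL - va044_uL      # room left for the sample
    print(f"TCEP {tcep_uL:.0f} uL, VA-044-US {va044_uL:.0f} uL, "
          f"protein solution {protein_uL:.0f} uL")
```

Note that adding the stocks dilutes the protein and buffer components accordingly, which this sketch does not correct for.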
Purification
Desulfurized H4-K79ac protein was purified by RP-HPLC using a 25-70 % Solvent B gradient. Fractions were analyzed using RP-HPLC, MALDI-TOF MS, and SDS-PAGE. Pure fractions were combined and lyophilized.
Quantification of Histone using UV-Vis spectroscopy
Lyophilized full-length histone was resuspended in 0.1 M phosphate and 6 M GuHCl, pH 7. Absorbance at 280 nm was measured, and concentration was calculated using the Beer-Lambert law. The extinction coefficient ε was calculated using the following equation:159
$$\varepsilon_{280} = 5690\,N_{\mathrm{Trp}} + 1280\,N_{\mathrm{Tyr}}$$
where $N_{\mathrm{Trp}}$ and $N_{\mathrm{Tyr}}$ are the numbers of Trp and Tyr residues, respectively. Calculated ε280 was 12660 M-1cm-1 for CENP-A and 5120 M-1cm-1 for H4.
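The calculation described above can be checked with a short script. This is a minimal sketch: the segment sequences are those listed for H4 and CENP-A in Tables 1 and 3 of this chapter, the function names are illustrative, and a 1 cm path length is assumed. The CENP-A segments cover residues 2-140, which does not change the Trp and Tyr counts.

```python
EPS_TRP = 5690  # M^-1 cm^-1 per Trp residue
EPS_TYR = 1280  # M^-1 cm^-1 per Tyr residue

# Native segment sequences from Table 1 (H4) and Table 3 (CENP-A).
H4_SEGMENTS = [
    "SGRGKGGKGLGKGGAKRHRKVLRDNIQGITKPAIRRL",
    "ARRGGVKRISGLIYEETRG",
    "VLKVFLENVIRDAVTYTEH",
    "AKRKTVTAMDVVYALKRQGRTLYGFGG",
]
CENPA_SEGMENTS = [
    "GPRRRSRKPEAPRRRSPSPTPTPGPSRRGPSLG",
    "ASSHQHSRRRQGWLKEIRKLQKSTHLLIRKLPFSRL",
    "AREICVKFTRGVDFNWQAQALLALQEA",
    "AEAFLVHLFEDAYLLTLH",
    "AGRVTLFPKDVQLARRIRGLEEGLG",
]


def epsilon_280(sequence):
    """Extinction coefficient at 280 nm from Trp (W) and Tyr (Y) counts."""
    return EPS_TRP * sequence.count("W") + EPS_TYR * sequence.count("Y")


def molar_concentration(a280, epsilon, path_cm=1.0):
    """Beer-Lambert law: c = A / (epsilon * l)."""
    return a280 / (epsilon * path_cm)


if __name__ == "__main__":
    for name, segments in (("H4", H4_SEGMENTS), ("CENP-A", CENPA_SEGMENTS)):
        eps = epsilon_280("".join(segments))
        print(f"{name}: epsilon280 = {eps} M^-1 cm^-1")
```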
Results and Discussion
SP-NCL of H4
Figure 10 outlines the SP-NCL scheme of H4. After four rounds of ligation, H4-ABHC would be desulfurized on the solid-phase and cleaved at the HMBA. The native H4 would be produced with only one purification step overall.
[Figure 10 scheme: ligation handle on the Rink linker; deprotection and ligation of the HMBA-Arg-Gly-bearing H4-C peptide (H4-C0); repeated deprotection and ligation giving H4-HC0 and then ac-H4-ABHC0; desulfurization; HMBA cleavage to give ac-H4.]
Figure 10: SP-NCL Ligation Scheme for H4 (ref. 160)
Dual linker design for SP-NCL
In order to yield a product with a native carboxy terminus while enabling rapid cleavage and analysis of the peptide intermediates, we developed a dual linker strategy (Figure 11). The handle between the peptide and the resin contained two key linkers: 4-hydroxymethylbenzoic acid (HMBA)161 and Rink. H4-C was synthesized with a C-terminal HMBA. The ester bond between the HMBA and the C-terminal residue is relatively stable to SPPS and NCL conditions,162 but can be rapidly cleaved at pH 10 in aqueous solution, producing a carboxy terminus (Figure 11). In turn, we synthesized a short peptide containing an N-terminal Thz anchored to the resin with a Rink linker. After ligation of the H4-C peptide, the anchored peptide could be rapidly and efficiently cleaved with TFA for analysis using RP-HPLC and matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS).
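[Figure 11 scheme: a Thz-containing handle on the Rink linker and the H4-C peptide bearing C-terminal HMBA-Arg-Gly (prepared by Fmoc SPPS and cleavage) are joined by NCL; NaOH cleaves at HMBA to give H4-C with a free carboxy terminus, and TFA cleaves at Rink to give H4-C0. Reagent labels in the scheme: piperidine, methoxylamine, NaOH, TFA.]
Figure 11: Dual linker strategy for SP-NCL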
The solid support has excellent swelling properties in TFA, expanding the resin pores and allowing free diffusion of peptide out of the resin. This combined with the high solubility of peptides in TFA effectively eluted most peptides. In order to denote the difference between the H4-C peptide cleaved at different linkers, H4-C will refer to the peptide cleaved at HMBA, while H4-C0 will refer to the peptide cleaved at Rink. H4-C peptide before ligation will be written as H4-C-HMBA-RG.
We chose PEGA resin as the solid support for its good swelling properties in both organic and aqueous solvents,163 allowing a reasonable pore size for the reagents and peptides to diffuse in and out of the resin. It is also stable in a broad range of pH values, which is important since Thz deprotection (pH 4), ligation (pH 7), and HMBA cleavage (pH 10) are all performed on resin. One downside is that the resin is relatively fragile, and repeated lyophilization can damage the resin, making storage and quantification of the resin relatively difficult. Proper resin handling techniques are described in the experimental methods section.
Steric crowding is known to occur as the length of the peptide increases upon ligation, preventing efficient reactions.152 In order to avoid this issue, we cut the resin loading to one tenth by coupling a 1:9 mixture of Fmoc-Gly and Boc-Gly.164 This reduced the reactive amine to 10% of the initial loading.
Synthetic peptide segments for H4
Sequences of the peptides synthesized for H4 are listed in Table 1. The initial development of H4 SP-NCL was done in collaboration with Dr. Santosh Mahto. For H3 total synthesis, in order to reduce the number of ligation steps in solution, the peptide fragments were long and challenging to synthesize.132 In contrast, none of the peptides for H4 exceeded 40 residues, to ensure efficient synthesis of each peptide. Increasing the number of ligation steps without significant yield loss was part of the benefit of SP-NCL.
Table 1: H4 peptide segments (ref. 160)

Peptide | Residues | Peptide sequence
H4-A | 1-37 | SGRGKGGKGLGKGGAKRHRKVLRDNIQGITKPAIRRL
H4-B | 38-56 | ARRGGVKRISGLIYEETRG
H4-H | 57-75 | VLKVFLENVIRDAVTYTEH
H4-C | 76-102 | AKRKTVTAMDVVYALKRQGRTLYGFGG

Peptide | Residues | Synthesized peptide sequence
H4-A | 1-37 | ac-SGRGKGGKGLGKGGAKRHRKVLRDNIQGITKPAIRRL-Dbz(Alloc)
H4-B | 38-56 | Thz-ARRGGVKRISGLIYEETRG-Dbz(Alloc)
H4-H | 57-75 | dmThz-LKVFLENVIRDAVTYTEH-Dbz-R
H4-C-HMBA-RG | 76-102 | Thz-KRKTVTA-Nle-DVVYALKRQGRTLYGFGG-HMBA-RG-Dbz(Alloc)
We chose only alanine and valine at the split sites since Thz and dimethyl-Thz (dmThz), a precursor to Val, were both commercially available. We avoided Lys as the C-terminal residue of the peptide because the amine sidechain could react with the activated MPAA thioester, forming a cyclized product.
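The split-site rules described above can be expressed as a simple consistency check. The sketch below is illustrative only: it uses the native H4 segment sequences from Table 1, and the function name and rule encoding are assumptions rather than a tool used in this work. It verifies that each segment length matches its residue range, that segments are contiguous, that every internal N-terminal residue is Ala or Val, and that no thioester segment ends in Lys.

```python
# (name, first residue, last residue, native sequence) from Table 1.
H4_SEGMENTS = [
    ("H4-A", 1, 37, "SGRGKGGKGLGKGGAKRHRKVLRDNIQGITKPAIRRL"),
    ("H4-B", 38, 56, "ARRGGVKRISGLIYEETRG"),
    ("H4-H", 57, 75, "VLKVFLENVIRDAVTYTEH"),
    ("H4-C", 76, 102, "AKRKTVTAMDVVYALKRQGRTLYGFGG"),
]


def check_split_sites(segments, allowed_n_terminal=("A", "V")):
    """Return a list of rule violations (an empty list means all rules hold)."""
    problems = []
    last = len(segments) - 1
    for i, (name, start, end, seq) in enumerate(segments):
        if len(seq) != end - start + 1:
            problems.append(f"{name}: length {len(seq)} does not match {start}-{end}")
        if i > 0:
            previous_end = segments[i - 1][2]
            if start != previous_end + 1:
                problems.append(f"{name}: not contiguous with previous segment")
            # Internal N-termini become Cys surrogates (Thz for Ala, dmThz for Val).
            if seq[0] not in allowed_n_terminal:
                problems.append(f"{name}: N-terminal residue {seq[0]} not allowed")
        # Every segment except the last carries a C-terminal thioester.
        if i < last and seq[-1] == "K":
            problems.append(f"{name}: C-terminal Lys on a thioester segment")
    return problems


if __name__ == "__main__":
    issues = check_split_sites(H4_SEGMENTS)
    print("No violations found" if not issues else "\n".join(issues))
```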
Peptide H4-H had marginal solubility, so Arg was introduced as a solubility tag.165 The Met in H4-C was replaced by its isostere norleucine (Nle) in order to avoid Met oxidation.132 H4-B and H4-C were synthesized with N-terminal Thz, and H4-H was synthesized with N-terminal dmThz. In the initial SP-NCL, H4-A was acetylated at the N-terminus.166 All peptides were synthesized with the diaminobenzoic acid (Dbz) linker, a thioester precursor which will be discussed in more detail in the next sections.
Synthesis of a peptide segment with an α-thioester
A C-terminal thioester is a key requirement of NCL, and preparing it is often a major challenge. The two most common types of SPPS are Fmoc and Boc SPPS, named for the N-terminal protecting group of the amino acid. In Boc SPPS, TFA is used to deprotect Boc,167 and HF is required for peptide cleavage from the resin, which requires specialized equipment and experience. It is difficult to introduce some modifications, such as phosphorylation and glycosylation, using Boc chemistry. On the other hand, these modifications are straightforward with Fmoc SPPS, and the chemicals required for Fmoc chemistry are relatively mild.168 Importantly, Fmoc SPPS is easily automatable, allowing the rapid synthesis of multiple peptides in parallel.169 This is especially important for our studies, because it enables the preparation of histone peptide libraries with different PTMs. However, compared to Boc SPPS, it is harder to prepare the peptide thioester necessary for NCL using Fmoc SPPS.50 The thioester moiety is base-labile, and the piperidine treatment during the deprotection step can therefore cleave the peptide from the resin. For this reason, several methods have been introduced for the preparation of peptide thioesters compatible with Fmoc chemistry.170 These methods include the use of a milder deprotection reagent,171 a safety-catch linker,50 thioesterification of the fully protected peptide,172 O to S173 and N to S acyl shift methods,141,174 SEA ligation,142 and, more recently, peptide hydrazides.143
In our case, we prepare thioesters by synthesizing peptides on a 3,4-diaminobenzoic acid (Dbz) linker (Figure 12). The first amino acid is typically loaded on the amino group in the meta (3) position with respect to the carboxy substituent. The other amino group, in the para (4) position, is relatively deactivated, and its reactivity is further reduced after acylation of the meta-amino group due to steric and electronic effects.136 Once the peptide is complete, on-resin treatment with 4-nitrophenyl chloroformate (NPCF) converts the Dbz into N-acyl-benzimidazolinone (Nbz). The peptide Nbz is then cleaved from the resin and purified by RP-HPLC. Nbz is rapidly cleaved by thiols such as 4-mercaptophenylacetic acid (MPAA) to yield the desired peptide thioester.136 In order for the Nbz conversion to occur, the deactivated amino group of the Dbz must not be acylated.
[Figure 12 scheme: peptide-Dbz(Alloc) on resin; Alloc deprotection; Nbz conversion; cleavage; thiolysis to give the peptide thioester.]
Figure 12: Preparation of Thioester through the Dbz
Despite the efficient conversion steps, coupling has been observed on the deactivated amine,175,176 particularly for Gly-rich sequences. Once an amino acid is coupled on the deactivated amine, subsequent addition of amino acids results in branched products. For several histone peptides with Gly-rich sequences, significant branched products are observed.175 To prevent this problem, in some syntheses the para-amine is protected using allyloxycarbonyl (Alloc).175,177 Alloc can be removed after synthesis, before Nbz conversion (Figure 12). In addition, this orthogonal protecting group also allows for acetyl capping, which is typically performed after each coupling step to acetylate the unreacted amine species. This prevents the accumulation of truncated peptides. Acetyl capping is not compatible with unprotected Dbz since acetylation of the 4-amine prevents conversion to Nbz. Of note, a second-generation Dbz linker has recently been developed to address the problem of branched products.178 The Dbz linker allows for the straightforward preparation of thioesters for most of our peptides.
Using solid-phase peptide synthesis (SPPS), we synthesized the peptide fragments that would be ligated together to form the full-length protein. Several of the H4 peptides were challenging to synthesize, and various complications were encountered in their preparation. These challenges are addressed in the coming sections.
Nbz Conversion of H4-A-Dbz
The Dbz of the peptide was converted to Nbz using 4-nitrophenyl chloroformate (NPCF) with dichloromethane (DCM) as solvent, followed by treatment with diisopropylethylamine (DIEA) in dimethylformamide (DMF). Although this conversion was straightforward with most peptides, conversion of H4-A-Dbz gave rise to multiple side products, most of which could not be identified (Figure 13). During the initial studies of H4 SP-NCL, H4-A-Nbz was prepared using this method since it was still possible to isolate the desired product through RP-HPLC purification, albeit with an overall yield averaging only 5%. A modified method to optimize the conversion will be discussed in the CENP-A total synthesis section, where a similar problem was encountered with one of the CENP-A peptides.
[Figure 13: RP-HPLC traces (15-30 % B, 0-30 min) associated with the Dbz and Nbz forms of ac-H4-A; conversion conditions: 1) NPCF in DCM, 2) DIEA in DMF.]
Figure 13: Nbz Conversion of H4-A
Racemization of Histidine
After the Alloc protection of the para-amine of the Dbz, the meta-amine is sterically occluded, and stringent coupling conditions (excess amino acid, strong coupling agent, extended coupling time) were required to load the first amino acid. However, when His was the first amino acid in the sequence, we observed an additional side product in RP-HPLC analysis, with identical mass to the desired product. When coupling an amino acid to unprotected Dbz, two isomers can be generated, with the two species having different retention times in RP-HPLC. However, since our Fmoc-Dbz-OH was generated as only one isomer, as assessed by NMR spectroscopy, this seemed unlikely as an explanation. We therefore analyzed each of the early loading and coupling steps. While cleavage after the first coupling step to generate His-Dbz(Alloc) resulted in only one peak in RP-HPLC analysis, when we analyzed the two-residue sequence, Glu-His-Dbz(Alloc), we observed two peaks with identical mass (Figure 14A). Analysis after coupling 4 more residues generated a single peak (Figure 14B).
[Figure 14: RP-HPLC traces, 0-73 % B gradient. Panel A: EH-Dbz(Alloc). Panel B: VTYTEH-Dbz(Alloc). Panel C: Eh-Dbz(Alloc) and EH-Dbz(Alloc).]
Figure 14: Analysis of H4-H peptide (A) RP-HPLC of EH-Dbz(Alloc). (B) RP-HPLC of the 6-residue segment of H4-H-Dbz(Alloc). (C) RP-HPLC of the EH-Dbz(Alloc) diastereomeric mixture after His coupling under racemizing conditions.
We hypothesized that the minor peak was the diastereomer Glu-his-Dbz(Alloc), where “his” is D-histidine, resulting from racemization during the coupling step to the relatively deactivated Dbz amine. Histidine is more prone to racemization than other amino acids,179 so this was a reasonable hypothesis. To test this, Fmoc-His(Trt)-OH was loaded on Dbz(Alloc) with excess 4-dimethylaminopyridine (DMAP) to generate the racemic mixture.180 RP-HPLC analysis after Glu coupling revealed that the previously observed minor peak had increased to roughly the same height as the major peak (Figure 14C). This indicated that the side product was the result of His racemization, which explained our observations. A single peak was observed for H-Dbz(Alloc) because the two species are enantiomers, which have indistinguishable properties on an achiral column. Adding Glu converts the two species into diastereomers, offering a possible explanation as to why they could then be resolved using an achiral column. It is not clear whether the Dbz and the Alloc contribute to the observable separation of the two species, and the actual reason may be more complex. Only a single peak was observed for the hexapeptide, suggesting that the difference in properties between the two diastereomers was too small to achieve separation by RP-HPLC.
To assess the extent of racemization, various His coupling conditions were tested. Some of the conditions of note are listed in Table 2. RP-HPLC of the diastereomeric mixture generated from the excess DMAP condition was used as a standard for directly comparing the retention times of the various species (Figure 15). When coupling on unprotected Dbz, the resin was Alloc protected after His coupling, followed by Glu coupling. Both coupling efficiency and racemization level were considered when determining the optimal condition for His loading. We found that all coupling conditions tested on Dbz(Alloc)-Arg with reasonable coupling efficiency gave rise to an unacceptable level of racemization. Although initial tests were done on Dbz(Alloc) and Dbz resins, unreacted Dbz(Alloc) and Dbz were eliminated during the ether wash, and therefore were not observed by RP-HPLC. Since this prevented the assessment of coupling efficiency for cases such as condition 1 in Table 2, later tests used Dbz(Alloc)-Arg and Dbz-Arg, which were retained after the ether wash, allowing observation via RP-HPLC.
Table 2: Histidine coupling conditions on Dbz

Condition | Base resin | Pre-activation / coupling time | Reagents | Coupling (%) | Racemization (%)
Standard | Dbz(Alloc) | 30 min / 60 min | 22:10:1 Fmoc-His(Trt)-OH:DIC:DMAP | |
1 | Dbz(Alloc) | 3 min / 60 min | 0.23 M His, 0.23 M HATU, 0.42 M DIEA | N/A | 11.6
2 | Dbz(Alloc)-R | None / 30 min (double) | 0.28 M His, 0.25 M HATU, 0.04 M DIEA | 82 | 4.8
3 | Dbz(Alloc)-R | None / 30 min (triple) | 0.28 M His, 0.25 M HATU, 0.04 M DIEA | 88 | 6.4
4 | Dbz(Alloc)-R | None / 30 min (double) | 0.27 M His, 0.25 M 6-Cl-HOBt, 0.125 M DIC | 25 | 2.3
5 | Dbz-R | None / 45 min | 0.11 M His, 0.1 M HCTU, 0.16 M DIEA | 98 | 6.5
6 | Dbz-R | None / 30 min | 0.11 M His, 0.1 M 6-Cl-HOBt, 0.1 M DIC | 98 | <1
Figure 15: Histidine coupling conditions on Dbz (Left) 0-20 % B RP-HPLC of Glu-His-Dbz(Alloc)-Arg for His coupling conditions corresponding to standard, 1, 3, 4, and 6 found in Table 2. (Right) MALDI-TOF MS of the five observed species a-e in RP-HPLC with structures. For each MALDI-TOF MS, the peak is labeled with the observed m/z, which is within 1 unit of the corresponding species. Note that for condition 1, the species do not contain the Arg tag on the Dbz.
Among the conditions, coupling using diisopropylcarbodiimide (DIC) and 1-hydroxy-6-chloro-benzotriazole (6-Cl-HOBt) on unprotected Dbz was found to be the optimal condition,181 with racemization at almost background level. After the initial His loading, the Dbz could be Alloc protected or left unprotected. Since H4-H lacks Gly residues, we found that Alloc protection was not necessary. The remaining residues were coupled using standard SPPS methods.
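The trade-off between coupling efficiency and racemization in Table 2 can be made explicit with a small selection script. This is a sketch for illustration, not the analysis performed in this work: the 80 % coupling cutoff is an arbitrary assumption, the reported "<1" racemization is stored as an upper bound of 1.0, and the function name is hypothetical. Under these assumptions the selection lands on condition 6, the DIC/6-Cl-HOBt coupling on unprotected Dbz-R.

```python
# (condition, resin, coupling %, racemization %) taken from Table 2.
# Coupling is None where it could not be assessed (condition 1).
CONDITIONS = [
    (1, "Dbz(Alloc)", None, 11.6),
    (2, "Dbz(Alloc)-R", 82, 4.8),
    (3, "Dbz(Alloc)-R", 88, 6.4),
    (4, "Dbz(Alloc)-R", 25, 2.3),
    (5, "Dbz-R", 98, 6.5),
    (6, "Dbz-R", 98, 1.0),  # reported as <1 %; stored as an upper bound
]


def best_condition(conditions, min_coupling=80.0):
    """Pick the lowest-racemization condition that meets the coupling cutoff.

    min_coupling is an assumed, adjustable threshold, not a value from the text.
    Returns None if no condition qualifies.
    """
    usable = [c for c in conditions
              if c[2] is not None and c[2] >= min_coupling]
    if not usable:
        return None
    return min(usable, key=lambda c: c[3])


if __name__ == "__main__":
    number, resin, coupling, racemization = best_condition(CONDITIONS)
    print(f"Condition {number} on {resin}: "
          f"{coupling} % coupling, <= {racemization} % racemization")
```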
For the Nbz conversion of H4-H-Dbz-Arg, a modified Nbz conversion was used. Full conversion was not achieved with NPCF in DCM, whereas conversion was complete in a mixture of DMF and DCM. This condition produced a mixture of H4-H-Nbz-Arg and H4-H-Nbz(formyl)-Arg. The cause of formylation will be discussed in more detail in later sections. The formylated species converted to thioester without issues.
Synthesis of H4 peptides
Synthesis and Nbz conversions of the remaining peptide segments for H4 were straightforward. RP-HPLC and MALDI-TOF MS of purified H4 peptides are in Figure 16.
[Figure 16 data, [M + H]+:
H4-acA-Nbz (20-35 % B): observed m/z 4137, expected m/z 4138
H4-B-Nbz (20-35 % B): observed m/z 2322, expected m/z 2321
H4-H-Nbz-Arg (30-60 % B): observed m/z 2636, expected m/z 2634; a peak labeled H-R (no Nbz) is also annotated
H4-C0-Nbz (25-40 % B): observed m/z 3517, expected m/z 3520
H4-C0-K91ac-Nbz (30-40 % B): observed m/z 3579, expected m/z 3581]
Figure 16: Purified H4 peptides RP-HPLC (left) and MALDI-TOF MS (right) of purified H4 peptides.
Sequential SP-NCL of ac-H4
After we synthesized the required H4 peptides, Dr. Mahto performed the first sequential SP-NCL of H4. Intermediates were analyzed by RP-HPLC and MALDI-TOF MS after TFA cleavage at the Rink linker. Production of full-length H4 with an acetylated N-terminus (ac-H4) was confirmed by RP-HPLC and MALDI-TOF MS (Figure 17). After performing desulfurization on resin, the final product was cleaved at HMBA with 0.1 M NaOH and purified by RP-HPLC. Although the synthesis appeared to be efficient as assessed by the purity of the product, the yield was only about 1%.
[Figure 17 data, [M + H]+:
C0: expected m/z 3507, observed m/z 3505
HC0: expected m/z 5768, observed m/z 5764
BHC0: expected m/z 7899, observed m/z 7895
ac-ABHC0 (25-70 % B): expected m/z 11848, observed m/z 11853]
Figure 17: SP-NCL of ac-H4 (ref. 160) RP-HPLC and MALDI-TOF MS of H4 SP-NCL ligations. Courtesy of Dr. Mahto.
SP-NCL of modified H4
We hypothesized that the yield had been reduced by suboptimal resin handling, which might have led to increased cleavage at the HMBA linker during synthesis. As such, I repeated the synthesis of three H4 proteins in parallel: ac-H4, ac-H4-pS47, and ac-H4-K5ac,K12ac,K91ac (Figure 18). The products were cleaved at the HMBA linker using a solution of 0.1 M NaOH (pH 10). In order to achieve maximal protein extraction, the base-treated resin was further washed extensively with TFA, and the eluate was combined with the base extraction. The major side product arose from the incomplete ligation of H4-H. Aside from this side product, the major peak on the crude RP-HPLC was the desired product, and the syntheses appeared relatively clean. However, we did not see an improvement in yield, suggesting that careful resin handling procedures did not improve the synthesis.
[Figure 18 data, crude RP-HPLC at 25-70 % B over 10-30 min, [M + H]+:
(A) ac-H4: expected m/z 11259, observed m/z 11259
(B) ac-H4-pS47: expected m/z 11339, observed m/z 11341
(C) ac-H4-K5ac,K12ac,K91ac: expected m/z 11403, observed m/z 11407
An ac-ABC species and an asterisk-marked peak are also annotated in the traces.]
Figure 18: SP-NCL of Synthetic H4 constructs Crude RP-HPLC and MALDI-TOF MS of ac-H4 (A), ac-H4-S47ph (B), and ac-H4-K5ac,K12ac,K91ac (C)
These studies demonstrated that H4 SP-NCL consistently resulted in a low yield. However, the origins of these reduced yields remained unclear. While we attempted to resolve this issue, we also carried out the SP-NCL of histone variant CENP-A. In the following section, we will discuss our initial attempt to synthesize CENP-A. The combined results from H4 and CENP-A provided an explanation for our observations, which eventually led to the development of an alternative ligation strategy.
CENP-A
Centromeric protein A (CENP-A) is a histone H3 variant that is found as a core histone in the centromere (Figure 19).182,183 CENP-A acts as an epigenetic marker that distinguishes the centromere from other regions of the chromosome, and it is required for mitotic spindle attachment during cell division.184,185 Over-expression of this protein results in multiple centromeres on a single chromosome and commensurate errors in chromosome replication.186,187 Despite some controversy surrounding the exact structure of the centromeric nucleosome,188-190 growing evidence seems to suggest that the nucleosome is composed of an octameric protein core as with the canonical nucleosome, but with a slightly more open structure with less wrapped DNA.191-193
Figure 19: Comparison of CENP-A and H3-containing nucleosomes Structure of nucleosome containing CENP-A (left, PDB: 3AN2,194 labeled CpA Nucleosome) and H3 (right, PDB: 1KX5,56 labeled H3 Nucleosome). CENP-A and H3 are indicated in blue.
Relatively few PTMs have been discovered on CENP-A, but the modifications that have been identified are thought to play important biological roles.195-198 Our modification of interest is K124 acetylation (CpA-K124ac), which along with H4-K79ac appears to be correlated with the cell cycle and is found primarily during the G1/S phase (Figure 20).199 The two modifications were initially suggested to play a role in the transition between an octameric nucleosome and a hemisome, an alternate nucleosome structure that contains only one copy each of H2A, H2B, H4, and CENP-A, because their timing in the cell cycle correlated with alterations seen in AFM analysis of centromeric nucleosomes.199 However, a growing body of evidence suggests that their role may be more complex.
Figure 20: Centromeric Nucleosome PTMs Crystal structure of CENP-A-containing nucleosome with CpA-K124 and H4-K79 indicated in magenta. PDB: 3AN2.194
CpA-K124ac is found in the dyad region at the histone-DNA interface, and the analogous PTM on H3 (H3-K122ac) was found to destabilize the nucleosome and decrease the affinity of the histone for DNA.59 It is possible that CpA-K124ac has a similar dynamic effect. Molecular dynamics simulation studies of nucleosomes with CENP-A and H3 showed that the CENP-A dimer interface is weaker than the H3 dimer interface, causing the CENP-A nucleosome to adopt a more flexible structure.200 Since CpA-K124 is found in this dimer interface, acetylation is predicted to affect the dynamics in this region. The same K124 can also be ubiquitylated, which has been suggested to be required for CENP-A deposition at the centromere.196 An alternate role proposed for CpA-K124ac is that it may act as a placeholder for ubiquitylation until the M phase.196 In either case, CpA-K124 is a key site for several different PTMs, which is further supported by a recent finding that CpA-K124 is also methylated.201
H4-K79ac is found in the DNA entry/exit region, and has been identified as part of the loss of ribosomal DNA silencing (LRS) region. Acetylation within this region of the histone-DNA interface, studied in the context of H4-K77ac,K79ac, has been found to increase DNA unwrapping,102 but it is not known what effect it has in the context of the centromeric nucleosome. It is hypothesized that H4-K79ac could increase access to the nucleosome by remodelers.199 Interestingly, some of the major sequence differences between H3 and CENP-A are concentrated near H4-K79ac (Figure 21). Because of this, it is possible that the structural and dynamic effects of H4-K79ac in the CENP-A-containing nucleosome will be different from those of the same PTM in the H3-containing nucleosome, mediated by the residue differences surrounding the H4-K79ac sidechain.
CENP-A (45-138)
GWLKEIRKLQKSTHLLIRKLPFSRLAREICVKFTRGVDFNWQAQALLALQEAAEAFLVHLFEDAYLLTLHAGRVTLFPKDVQLARRIRGLEEGLG
H3 (46-134)
VALREIRRYQKSTELLIRKLPFQRLVREIAQDFK--TDLRFQSSAVMALQEACEAYLVGLFEDTNLCAIHAKRVTIMPKDIQLARRIRGERA
Figure 21: Comparison of CENP-A and H3 nucleosomes (Top) Sequence alignment of CENP-A and H3. Significant residue differences are indicated in red. (Bottom) Crystal structures of H3-containing nucleosome (left) and CENP-A-containing nucleosome (right). H4 is in yellow and H4-K79 is indicated in magenta. CENP-A/H3 are in blue (space-filling model) and major residue differences seen in the alignment are colored cyan. PDB: 1KX5,56 3AN2.194
Taken together, we hypothesize that CpA-K124ac and H4-K79ac together alter the stability and dynamics of the nucleosome. The PTMs may play a role in regulating access to the centromeric nucleosome at particular times during the cell cycle. Since CpA-K124ac is located near the proposed binding site for the inner kinetochore protein CENP-C,202 the modification may also regulate kinetochore assembly. By preparing CpA-K124ac and H4-K79ac, we hope to understand the effects of these modifications on the centromeric nucleosome, and thereby shed light on what roles they might play in cell division. Since multiple PTMs can be found at CpA-K124, a robust synthetic strategy for CENP-A would enable the study of not just CpA-K124ac, but of other PTMs as well.
Semi-synthesis of CpA-K124ac
Because the modification is close to the C-terminus of the protein, we initially proposed an expressed protein ligation (EPL) strategy for the production of CpA-K124ac, similar to the strategy that our laboratory developed for the preparation of H3-K122ac,59 and that has been established for H2A, H2B, H3, and H4.59,102,203 We first proposed a ligation-desulfurization approach using a ligation site at position 116. Native human CENP-A contains a single cysteine at position 75. Since ligation and subsequent desulfurization would convert this native Cys into Ala, it was not compatible with our strategy. Our collaborator Dr. Yamini Dalal therefore prepared two recombinant mutants, CpA-C75A and CpA-C75S, that would be compatible with EPL and found that both mutants localized to the centromere, confirming that the Cys was not essential for function.204 We chose to use CpA-C75S for semi-synthesis since the Ser mutation should have minimal effect on the CENP-A structure. In this work, all synthetic and semi-synthetic CENP-A contain the C75S mutation unless otherwise indicated.
We intended to prepare CpA-K124ac by using expressed CpA(1-115)-intein-Chitin Binding Domain (CBD) and synthesized CpA(116-140)-K124ac peptide (Figure 22). The fusion protein would be expressed and purified. Allowing the intein to fold into its native structure would trigger the N to S acyl shift, and cleavage with an external thiol would produce the CpA(1-115) thioester (CpA(1-115)-SR). Ligation would be performed with the addition of the synthetic peptide bearing the K124ac modification. Subsequent desulfurization and purification would produce the desired CpA-K124ac.
[Figure 22 scheme: expressed CpA(1-115)-intein-CBD is cleaved with an external thiol (HS-R) to give the CpA(1-115) thioester; NCL with the CpA(116-140) peptide prepared by SPPS; desulfurization.]
Figure 22: CpA-K124ac EPL scheme
Although the synthesis of the K124ac peptide fragment was straightforward, the expression of the CpA(1-115)-intein-CBD fusion protein was far more challenging than we had anticipated. We tested expression with varying E. coli strains, growth temperature, growth media, inducer concentration, and induction time, but none of the conditions tested resulted in successful expression. SDS-PAGE of the whole cell lysate for the tested expression conditions is shown in Figure 23. Expression of CENP-A is known to be challenging.158 Using the Mfold web server,205 we noted that the mRNA of CENP-A had significant secondary structure that could potentially prevent efficient expression. We prepared a plasmid in which codon substitution was used to disrupt the proposed mRNA structure, but this did not improve expression.
Figure 23: Expression of CpA(1-115)-intein-CBD (A) SDS-PAGE of expression conditions for cells grown in LB media. O.D. at induction refers to the optical density of the culture when inducer (IPTG) was added. (B) SDS-PAGE of expression conditions for cells grown in MMI media. (C) SDS-PAGE of expression conditions with cells inoculated in LB or SOC overnight (O/N) growth media. (D) SDS-PAGE of expression conditions identical to (C), but cells were expressed at 25 °C. All other expression conditions were carried out at 37 °C. For (C) and (D), cells were induced with 1 mM IPTG at O.D. 0.8.
After concluding that preparation of CpA-K124ac through EPL was not time-efficient, we turned to a total synthesis approach. We proposed an SP-NCL strategy involving the ligation of five peptides. Although H4 SP-NCL gave low yield, we hypothesized that the problem resulted not from our ligation strategy, but from the properties of H4 protein. If this were the case, CENP-A SP-NCL would be straightforward.
SP-NCL of CpA-K124ac
Figure 24 illustrates the proposed SP-NCL scheme of CENP-A. Through the total synthesis of CENP-A, we hoped to prepare a homogeneous sample of CpA-K124ac, as well as demonstrate the utility of SP-NCL. The split sites were determined based on the same criteria discussed for H4 SP-NCL, notably peptide length, commercially available reagents, and predicted fragment solubility and ligation kinetics. Five peptides were synthesized for the total synthesis of CENP-A (Table 3). Synthetic CENP-A contained the C75S mutation (in CpA-3), as in CENP-A EPL.
[Figure 24 scheme: ligation handle on the Rink linker with peptide 5 bearing HMBA-Arg-Gly; SP-NCL to assemble peptides 1-5; desulfurization; NaOH cleavage to release the full-length product.]
Figure 24: CENP-A SP-NCL scheme
Table 3: CENP-A peptides160
Name    Residues  Peptide sequence
CpA-1   2-34      GPRRRSRKPEAPRRRSPSPTPTPGPSRRGPSLG
CpA-2   35-70     ASSHQHSRRRQGWLKEIRKLQKSTHLLIRKLPFSRL
CpA-3   71-97     AREICVKFTRGVDFNWQAQALLALQEA
CpA-4   98-115    AEAFLVHLFEDAYLLTLH
CpA-5   116-140   AGRVTLFPKDVQLARRIRGLEEGLG

Name           Residues  Synthesized peptide sequence
CpA-1          2-34      GPRRRSRKPEAPRRRSPSPTPTPGPSRRGPSLG-Dbz(Alloc)
CpA-2          35-70     Thz-SSHQHSRRRQGWLKEIRKLQKSTHLLIRKLPFSRL-Dbz(Alloc)
CpA-3          71-97     Thz-REISVKFTRGVDFNWQAQALLALQEA-Dbz(Alloc)-RR
CpA-4          98-115    Thz-EAFLVHLFEDAYLLTLH-Dbz-R
CpA-5-HMBA-RG  116-140   Thz-GRVTLFPKDVQLARRIRGLEEGLG-HMBA-RG-Dbz(Alloc)
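The fragment boundaries in Table 3 can be checked with a short script. The sketch below copies the native sequences from the table and confirms that each length matches its residue range and that the fragments are contiguous. The function names are illustrative and not part of the original work.

```python
# Native CENP-A fragments copied from Table 3:
# name -> (first residue, last residue, one-letter sequence).
FRAGMENTS = {
    "CpA-1": (2, 34, "GPRRRSRKPEAPRRRSPSPTPTPGPSRRGPSLG"),
    "CpA-2": (35, 70, "ASSHQHSRRRQGWLKEIRKLQKSTHLLIRKLPFSRL"),
    "CpA-3": (71, 97, "AREICVKFTRGVDFNWQAQALLALQEA"),
    "CpA-4": (98, 115, "AEAFLVHLFEDAYLLTLH"),
    "CpA-5": (116, 140, "AGRVTLFPKDVQLARRIRGLEEGLG"),
}


def check_fragments(fragments=FRAGMENTS):
    """Return True if every sequence length matches its residue range and
    consecutive fragments are contiguous; raise ValueError otherwise."""
    previous_end = None
    for name, (start, end, seq) in fragments.items():
        if len(seq) != end - start + 1:
            raise ValueError(f"{name}: length {len(seq)} does not match {start}-{end}")
        if previous_end is not None and start != previous_end + 1:
            raise ValueError(f"{name}: gap or overlap before residue {start}")
        previous_end = end
    return True


def residue_at(position, fragments=FRAGMENTS):
    """Return (fragment name, residue) for a 1-indexed protein position."""
    for name, (start, end, seq) in fragments.items():
        if start <= position <= end:
            return name, seq[position - start]
    raise ValueError(f"position {position} is outside all fragments")


if __name__ == "__main__":
    print(check_fragments())
    print(residue_at(75))   # site of the C75S mutation
    print(residue_at(124))  # site of the K124ac modification
```

With these sequences, residue 75 falls in CpA-3 and residue 124 in CpA-5, and every fragment after CpA-1 begins with Ala in the native sequence, which corresponds to the Thz-protected N-terminal residue in the synthesized peptides.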
Nbz conversion of CpA-2-Dbz
While all peptides could be synthesized efficiently and cleanly as Dbz peptides, CpA-2-Dbz failed to convert efficiently to Nbz, as was the case for the H4-A peptide (Figure 25). We therefore had to find an alternative approach to produce the CpA-2 thioester.
Figure 25: Nbz conversion of CpA-2 (conditions: 1) Alloc deprotection, 2) NPCF in DCM, 3) DIEA in DMF; RP-HPLC gradient 30-60 % B)
CpA-2-thioester through O to S acyl shift
One alternative approach to prepare peptide thioesters by Fmoc chemistry is the O to S acyl shift method (Figure 26).173 In this method, the resin was prepared by first loading Fmoc-Cys(StBu)-OH. Fmoc was removed and the N-terminal amine of Cys was converted into a hydroxyl. The first amino acid was coupled to create an ester linkage with the base resin. The resulting peptide was cleaved with the carboxy ester intact. During ligation the disulfide was reduced, exposing the thiol, which caused rearrangement through the O to S shift to generate the peptide thioester in situ. This thioester could immediately participate in NCL, or it could be displaced by an external thiol in the ligation buffer. Although it was possible to prepare CpA-2 thioester using this method, the overall yield of CpA-2-OCys(StBu) was only 5%.
By assessing yields at different points during SPPS, we determined that the ester linkage was gradually cleaved with every cycle. Although the O to S method was suitable for short peptides, it did not give reasonable yields for the 36-residue CpA-2.
We therefore looked for other alternatives to synthesize CpA-2 thioester in good yield.
Figure 26: O to S acyl shift approach (scheme steps: 1) Fmoc-Cys(StBu)-OH, 2) piperidine; KNO2/HCl; SPPS; 1) cleave/purify, 2) reduce; O to S acyl shift)
Nbz conversion using different solvents
After the unsuccessful alternative approach, we attempted to determine the cause of the Nbz conversion side products. We hypothesized that the multiple products observed upon conversion might be due to intramolecular interactions and structure formation as described by Siman et al.206 Such structure could be disrupted by using solvents with hydrogen bonding capabilities such as DMF or NMP. We tested this hypothesis by performing the NPCF treatment in DMF instead of DCM on both H4-A-Dbz and CpA-2-Dbz. In initial tests, conversion was observed. However, working in collaboration with Dr. John Shimko and Kurt Justus, we determined that dry DMF must be used, possibly due to water-mediated hydrogen bonding. Therefore, DMF was dried over molecular sieves. By performing the conversion in dry DMF, a clean conversion to a reactive N-acylurea species was finally achieved for CpA-2 (Figure 27A). In addition, we performed Nbz conversion of H4-A using the same method, and clean conversion was also observed (Figure 27B).
Figure 27: Nbz conversion of CpA-2 and H4-A peptides using the dry DMF approach
(A) RP-HPLC of CpA-2-Dbz and CpA-2-Nbz(formyl) (30-60 % B). (B) RP-HPLC of H4-A-Dbz and H4-A-Nbz(formyl) (15-30 % B). Conditions: 1) NPCF in dry DMF, 2) DIEA in DMF. Compare to Figure 13.
It is important to note that reaction in DMF resulted in formylated Nbz, arising from a Vilsmeier-Haack reaction (Figure 28).207 Although this did not affect the conversion to thioester, it did prevent further derivatization, for example treatment with hydrazine to generate a peptide hydrazide (Chapter 3). Clean Nbz conversion with no formylation could be achieved using NMP as solvent, since it possesses hydrogen bonding capabilities similar to those of DMF but lacks the formyl group needed to participate in the Vilsmeier-Haack reaction (Figure 29).
Figure 28: Formylation through Vilsmeier-Haack
Figure 29: Nbz conversion of CpA-2 using dry NMP RP-HPLC of CpA-2-Dbz (top) and RP-HPLC and MALDI-TOF MS of CpA-2-Nbz (bottom). Conditions: 1) NPCF in dry NMP, 2) DIEA in DMF. RP-HPLC gradient: 30-60 % B. CpA-2-Nbz [M + H]+: observed m/z 4598, expected m/z 4596.
Synthesis of CENP-A peptides
The syntheses and conversions of the remaining CENP-A peptides were achieved in good yields. With CpA-4, the C-terminal His was loaded using the protocols developed to minimize racemization discussed previously. CpA-3 had relatively poor solubility properties, requiring the installation of an Arg-Arg tag. RP-HPLC and MALDI-TOF MS of purified CENP-A peptides are provided in Figure 30.
Figure 30: Purified CENP-A peptides RP-HPLC (left) and MALDI-TOF MS (right) of purified CENP-A peptides
Peptide              RP-HPLC gradient  [M + H]+ observed m/z  [M + H]+ expected m/z
CpA-1-Nbz            12-30 % B         3765                   3765
CpA-2-Nbz(formyl)    30-60 % B         4625                   4623
CpA-3-Nbz-Arg-Arg    30-60 % B         3577                   3578
CpA-4-Nbz-Arg        30-60 % B         2489                   2489
CpA-50-Nbz           30-55 % B         3343                   3345
(The CpA-3 panel carries an additional peak annotation, "no Nbz".)
Sequential SP-NCL of CpA-K124ac
With all peptides in hand, we carried out the first sequential SP-NCL of CENP-A. The ligation conditions are listed in Table 4. Every ligation was carefully monitored with test cleavages at the Rink linker using TFA (Figure 31). The first three cycles of deprotection and ligation were observed to generate relatively pure product in high yield.
Table 4: SP-NCL conditions for CENP-A
Ligation round  Peptide      MW (g/mol)  Peptide (mg)  Volume (mL)  Concentration (mM)  Molar equivalent  Time (h)
1               CpA-5-124ac  3345        3             0.7          1.2                 2.8               15
2               CpA-4        2462        2.5           0.7          1.2                 2.3               17
3               CpA-3        3577        2.5           0.7          1.0                 2.3               20
4               CpA-2        4628        2.5           0.5          1.1                 2.6               12
5               CpA-1        3765        3             0.7          1.6                 3.8               20
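The concentration column of Table 4 relates peptide mass, molecular weight, and buffer volume. A minimal sketch of that arithmetic follows, assuming the listed mass is pure peptide (no correction for counter-ions or residual water). The function name is illustrative.

```python
def peptide_concentration_mM(mass_mg, mw_g_per_mol, volume_mL):
    """Concentration in mM of a peptide of given mass dissolved in volume_mL.

    mg / (g/mol) gives mmol; mmol / mL gives mol/L, so multiply by 1000 for mM.
    """
    if mw_g_per_mol <= 0 or volume_mL <= 0:
        raise ValueError("molecular weight and volume must be positive")
    mmol = mass_mg / mw_g_per_mol
    return mmol / volume_mL * 1000.0


if __name__ == "__main__":
    # CpA-3 row of Table 4: 2.5 mg, 3577 g/mol, 0.7 mL
    print(round(peptide_concentration_mM(2.5, 3577, 0.7), 2))
    # CpA-2 row of Table 4: 2.5 mg, 4628 g/mol, 0.5 mL
    print(round(peptide_concentration_mM(2.5, 4628, 0.5), 2))
```

For the CpA-3 and CpA-2 rows this simple calculation gives about 1.0 and 1.1 mM, matching the tabulated values. Other rows need not follow the dry-weight arithmetic exactly, and the molar equivalents additionally depend on the resin loading, which is not restated here.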
Figure 31: SP-NCL of CpA-345-K124ac160 Crude MALDI-TOF MS of CpA-5 ligation (A). RP-HPLC of CpA-4 and CpA-3 ligations (B). [M + H]+ values: CpA-50, exp. m/z 3473, obs. m/z 3475; CpA-450, exp. m/z 5589, obs. m/z 5590; CpA-3450, exp. m/z 8665, obs. m/z 8665.
However, after ligation of CpA-2, we were unable to observe any significant product using RP-HPLC or MALDI-TOF (Figure 32). The disappearance of CpA-345 on the RP-HPLC suggested that it had fully converted to another product. We initially hypothesized that CpA-2345 had poor solubility. The last ligation with CpA-1 was then performed. If the issue was due to solubility, we anticipated that the addition of a soluble peptide would make the product observable, since recombinant CENP-A is observable on the RP-HPLC. However, no improvement was seen.
Figure 32: SP-NCL of CpA-1 and CpA-2 RP-HPLC of CpA-2 and CpA-1 ligations (A; traces labeled CpA-23450 and CpA-123450). SDS-PAGE of CENP-A resin (B).
While attempting to assess these results, we considered the possibility that after TFA cleavage, the product was still interacting with the resin. To test this, we heated the TFA-treated resin in SDS loading buffer to 100 °C for 10 min, and loaded the resin on SDS-PAGE. Interestingly, we observed a band with a similar size to recombinant CENP-A but no smaller side products. This suggested that although ligation was successful, the final product could not be eluted from the resin. Attempts to elute the peptide using various buffers and solvents including GuHCl, dimethyl sulfoxide (DMSO), and ionic liquid were unsuccessful.
We hypothesized that full-length CENP-A interacted nonspecifically with the PEGA resin.
Since the yield seemed to drop after the ligation intermediate CpA-345, the N-terminal peptides CpA-1 and CpA-2 could be responsible for the poor recovery of the final product.
To assess sequence dependence, we ligated CpA-1 and CpA-2 directly to the base resin, cleaved at the Rink linker, and observed 30–56% recovered yield. These yields were reduced but did not replicate the extreme losses observed for the equivalent ligation in the protein context. Given that PEGA resin has a large pore size compatible with folded proteins, it seems unlikely that yield reductions are solely size-based. Together, these suggest context-dependent interactions of the larger histone sequences and the solid phase.
Hybrid Phase Ligation of H4-K5ac, K12ac, K91ac
The dramatic yield loss observed for CENP-A provided an important insight into the poor yields of H4 SP-NCL. Re-analysis of our syntheses confirmed that yields through the first 3 rounds of ligation were acceptable, and only the last ligation step reduced yields. We speculated that this problem could be overcome by cleaving the peptide after the third ligation, and performing the problem ligation in solution. We termed this hybrid-phase ligation, which combines solid-phase and solution-phase ligations in order to maximize yield.
We first attempted this approach with triple-modified H4-K5ac, K12ac, K91ac (Figure 33) using the same peptide fragments described in Table 1. The K5ac and K12ac modifications are found in newly synthesized H4,208 and their exact function is not clear. H4-K91 is found in the histone-histone interface, and its acetylation has been found to destabilize the nucleosome.209 Although the three acetylations have not been observed simultaneously on H4, H4-K91 is acetylated before assembly on DNA,210 just like H4-K5,K12. Furthermore, mutations in H4 K5, K12 show hypersensitivity to replication stress and DNA-damaging agents when combined with mutations in H4-K91.211 Therefore, it has been hypothesized that the three acetylations could be working together to serve important functions in the cell.
Figure 33: Hybrid phase ligation of H4160 (scheme labels: Rink linker, ligation handle, HMBA-Arg-Gly; species shown in order: H4-C0-K91ac, H4-HC0-K91ac, H4-BHC0-K91ac, HMBA cleavage, H4-BHC-K91ac, H4-ABHC-K5ac,K12ac,K91ac, H4-K5ac,K12ac,K91ac)
Peptide H4-BHC would be prepared through SP-NCL. After cleaving with NaOH at the HMBA linker, the last ligation with H4-A would be performed in solution. Traceless ligation would be achieved through desulfurization. For the entire synthesis, only a single purification would be required to obtain the homogeneously modified H4 protein.
SP-NCL of H4-BHC-K91ac
SP-NCL of H4-BHC-K91ac was performed using the ligation conditions in Table 5. The ligation reactions were monitored by cleaving resin samples with TFA and analyzing them with RP-HPLC and MALDI-TOF MS (Figure 34). We did not account for the slower kinetics of the deprotection of dmThz, which led to the H4-HC side product observed in RP-HPLC.
For future procedures, we plan to account for the longer deprotection time of dmThz compared to Thz, and test cleave to confirm completion of the reaction.
Table 5: SP-NCL of H4-BHC-K91ac
Ligation round  Peptide           MW (g/mol)  Peptide (mg)  Volume (mL)  Concentration (mM)  Molar equivalent  Time (h)
1               H4-C-Nbz K91ac    3520        6.5           1.5          1.2                 2.8               14
2               H4-H-Nbz-R        2580        4             1.3          1.2                 2.3               15
                H4-H-Nbz-R        2580        4             1.6          1.0                 2.3               18
3               H4-B-Nbz          2321        4             1.6          1.1                 2.6               10
                H4-B-Nbz          2321        6             1.6          1.6                 3.8               20
Figure 34: SP-NCL of H4-BHC-K91ac160 RP-HPLC of ligations with H4-C, H4-H, and H4-B (A). Crude RP-HPLC of H4-BHC cleaved by NaOH (B). [M + H]+ values: H4-BHC0, exp. m/z 8099, obs. m/z 8101; H4-BHC, exp. m/z 7454, obs. m/z 7452. Side-product peaks are labeled H4-HC0 (A) and H4-HC (B).
After three ligation steps, H4-BHC-K91ac was released from the resin by cleavage of the HMBA linker with 0.1 M NaOH. The resin was then washed with TFA for maximum peptide extraction. With the cleavage and extraction combined, we achieved 97% crude yield as assessed by lyophilized weight. The product was sufficiently pure that the solution-phase ligation was carried out with H4-BHC-K91ac without purification.
Solution-phase NCL of H4-ABHC-K5ac,K12ac,K91ac
The solution phase ligation to generate H4-ABHC proceeded to 90% as assessed by SDS-PAGE (Figure 35A). The mixture was dialyzed extensively against thiol-free SP Wash Buffer to remove the aryl thiol MPAA, which is not compatible with desulfurization by the Danishefsky approach.38 Desulfurization was then carried out (Figure 36) prior to RP-HPLC purification to obtain the final product (Figure 37). After four rounds of ligation and desulfurization (9 chemical steps), the isolated H4-K5ac, K12ac, K91ac product was obtained in 16% yield, which is approximately commensurate with yields observed for total synthesis of H4 by other approaches.144
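Because the overall yield of a multistep synthesis is the product of the individual step yields, the 16% figure over 9 chemical steps can be translated into an average per-step yield. The sketch below is only this arithmetic. It assumes, for illustration, that all steps contribute equally, which the text does not claim.

```python
import math


def overall_yield(step_yields):
    """Overall fractional yield as the product of fractional step yields."""
    return math.prod(step_yields)


def average_step_yield(overall, n_steps):
    """Geometric-mean step yield that would give `overall` after n_steps."""
    if not 0 < overall <= 1 or n_steps < 1:
        raise ValueError("overall must be in (0, 1] and n_steps >= 1")
    return overall ** (1.0 / n_steps)


if __name__ == "__main__":
    # 16% isolated yield over 9 chemical steps
    print(f"{average_step_yield(0.16, 9):.3f}")
    # Two consecutive steps at 97% and 90%
    print(f"{overall_yield([0.97, 0.90]):.3f}")
```

Under the equal-step assumption, 16% over 9 steps corresponds to roughly 82% per step, which illustrates why removing even a single handling step can noticeably change the isolated yield.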
Figure 35: Solution-phase ligation of H4160
(A) SDS-PAGE of solution-phase ligation of H4-A (0 h, 2 h, 4 h, 6 h; bands: ABHC, BHC, HC, H4-A). (B) MALDI-TOF MS of crude ligation. Peak assignments: 1, H4-A-K5ac,K12ac-MPAA; 2, H4-HC-K91ac; 3, H4-BHC-K91ac; 4 and 5, H4-ABHC-K5ac,K12ac,K91ac ([M + H]+ exp. m/z 11457, obs. m/z 11452).
Figure 36: Desulfurization of H4-K5ac,K12ac,K91ac MALDI-TOF MS before (top; [M + H]+ exp. m/z 11457, obs. m/z 11352) and after (bottom; [M + H]+ exp. m/z 11361, obs. m/z 11364) desulfurization.
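The expected masses before and after desulfurization differ by one sulfur atom per desulfurized residue. The sketch below estimates the shift using the average atomic mass of sulfur (about 32.06 Da, a standard value). The function names are illustrative.

```python
SULFUR_AVG_MASS = 32.06  # average atomic mass of sulfur in Da


def desulfurized_mz(mz_before, n_sulfur, charge=1):
    """Expected m/z after removing n_sulfur sulfur atoms from an ion."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return mz_before - n_sulfur * SULFUR_AVG_MASS / charge


def sulfur_atoms_lost(mz_before, mz_after, charge=1):
    """Nearest whole number of sulfur atoms accounting for an m/z shift."""
    return round((mz_before - mz_after) * charge / SULFUR_AVG_MASS)


if __name__ == "__main__":
    # Expected [M + H]+ values for H4 before and after desulfurization (Figure 36)
    print(round(desulfurized_mz(11457, 3)))
    print(sulfur_atoms_lost(11457, 11361))
```

The expected values in Figure 36 (11457 and 11361) differ by 96 m/z units, or three sulfur atoms, which is consistent with one thiol per ligation junction in a four-fragment assembly. The corresponding expected values for CpA-K124ac in Figure 41 (16010 and 15882) differ by four sulfur atoms.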
Figure 37: Purified H4-K5ac,K12ac,K91ac160 RP-HPLC (top; 25-70 % B) and MALDI-TOF MS (bottom; [M + H]+ exp. m/z 11361, obs. m/z 11361) of purified H4
Hybrid Phase Ligation of CpA-K124ac
With the successful synthesis of H4 in good yield, we next carried out the total synthesis of CpA-K124ac using the hybrid approach (Figure 38). We again carried out the first three ligations on the solid phase. CpA-3450-K124ac was released from the resin by cleavage of the Rink linker. After cleavage, two sequential solution-phase NCL reactions were required, as opposed to only one for H4. It was possible to perform HMBA cleavage at a number of steps in solution. However, we found that the thiol mercaptophenylacetic acid (MPAA) in the ligation buffer significantly slowed down HMBA cleavage. Therefore, it is recommended to perform HMBA cleavage before CpA-2 ligation, before CpA-1 ligation, or before desulfurization, when no thiols are present. For this synthesis, HMBA was cleaved before the last ligation with CpA-1.
Figure 38: Hybrid-phase NCL scheme of CpA-K124ac160 (scheme labels: Rink linker, ligation handle, HMBA-Arg-Gly; steps shown: SP-NCL of peptides 3-5, Rink cleavage, addition of peptide 2, HMBA cleavage, addition of peptide 1)
SP-NCL of CpA-345-K124ac
Preparation of CpA-345 through SP-NCL was efficient (Figure 39). CpA-3450-K124ac was recovered in 99% yield by lyophilized weight, and in excellent (>90%) purity as assessed by RP-HPLC (Figure 39). This peptide was used directly for the solution-phase ligations without purification.
Figure 39: SP-NCL of CpA-3450-K124ac160 RP-HPLC of CpA-345 SP-NCL and MALDI-TOF MS of product. RP-HPLC gradients: CpA-50, 0-73 % B; CpA-450, 30-70 % B; CpA-3450, 30-90 % B. CpA-3450 [M + H]+: exp. m/z 8651, obs. m/z 8651.
Sequential solution-phase ligation of CpA-12345-K124ac
CpA-2-Nbz was added and efficiently ligated in solution to generate CpA-23450-K124ac as assessed by SDS-PAGE (Figure 40A). Addition of methoxylamine directly to the ligation mixture deprotected the N-terminal Cys. The mixture was dialyzed to remove the methoxylamine and the pH was adjusted to 10 using NaOH, which cleaved the HMBA linker to generate CpA-2345-K124ac. The pH was returned to ligation conditions, and CpA-1-Nbz peptide was added to generate CpA-12345-K124ac (Fig. 24). The ligation mixture was then dialyzed extensively prior to desulfurization. After confirming complete desulfurization by MALDI-TOF MS (Figure 41), the protein was purified by RP-HPLC (Figure 42).
Figure 40: Solution-phase ligation of CENP-A160 SDS-PAGE of solution-phase NCL (A; CpA-2 ligation at 0 h, 2 h, 4 h, 6 h and CpA-1 ligation at 0 h and O/N; labeled species: CpA-12345, CpA-23450, CpA-12, CpA-3450, CpA-1, CpA-2). RP-HPLC of CpA-1 ligation to CpA-2345 (B; 30-90 % B; labeled peaks: CpA-12, CpA-12345, CpA-1345).
Figure 41: Desulfurization of CpA-K124ac MALDI-TOF MS before (top; [M + H]+ exp. m/z 16010, obs. m/z 16012) and after (bottom; [M + H]+ exp. m/z 15882, obs. m/z 15883) desulfurization.
Figure 42: Purified CpA-K124ac160 RP-HPLC (A; 40-90 % B), SDS-PAGE (B), and MALDI-TOF MS (C) of purified CpA-K124ac. Expected and observed m/z:
1 CpA-K124ac: [M + 3H]3+ Exp. m/z 5295, Obs. m/z 5297
2 CpA-1345: [M + 2H]2+ Exp. m/z 5755, Obs. m/z 5755
3 CpA-K124ac: [M + 2H]2+ Exp. m/z 7941, Obs. m/z 7944
4 CpA-1345: [M + H]+ Exp. m/z 11509, Obs. m/z 11510
5 CpA-K124ac: [M + H]+ Exp. m/z 15882, Obs. m/z 15885
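The multiply charged peaks listed for Figure 42 follow from the singly charged values by the usual relation m/z = (M + z·m_H)/z. The sketch below uses a proton mass of about 1.00728 Da (a standard value); the function name is illustrative.

```python
PROTON_MASS = 1.00728  # Da


def mz_for_charge(mz_singly_charged, z):
    """m/z of the [M + zH]z+ ion, given the m/z of the [M + H]+ ion."""
    if z < 1:
        raise ValueError("charge state must be a positive integer")
    neutral_mass = mz_singly_charged - PROTON_MASS
    return (neutral_mass + z * PROTON_MASS) / z


if __name__ == "__main__":
    for label, mz1 in (("CpA-K124ac", 15882), ("CpA-1345", 11509)):
        for z in (1, 2, 3):
            print(label, z, round(mz_for_charge(mz1, z), 1))
```

Starting from the expected [M + H]+ values of 15882 and 11509, this reproduces the expected 2+ and 3+ values in the caption to within about one m/z unit. The small differences presumably reflect rounding of the underlying masses.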
A small amount of CpA-1345 was observed in the purified sample. However, this was acceptable because we expected this species to be eliminated in the octamer refolding process, since it lacks a helix essential for proper folding (Figure 43). We have used this same purification-through-refolding procedure to maximize yields of H359 and H4132 produced by EPL. We find that complete octamer is only formed with the full-length proteins, but that the (H3/H4)2 tetramer will sometimes incorporate a partial protein. In the case of the octamer, partial proteins were removed via size exclusion chromatography.
Figure 43: Nucleosome containing CENP-A PDB: 3AN2194 CpA-2 segment is indicated in light blue. CpA-3, CpA-4, and CpA-5 segments are indicated in red, blue, and green, respectively.
The overall isolated yield of CpA-K124ac was 7% after five rounds of ligation, two cleavage steps, desulfurization, and RP-HPLC purification. The yield was comparable to the current yield for H3 total synthesis using solution-phase NCL, which is 5-7%.132 The similar yields were expected, since the same number of solution-phase ligation steps was performed for both CENP-A and H3.
Nucleosome reconstitution using synthetic and semi-synthetic histones
This section discusses the successful incorporation of synthetic histones in nucleosomes, confirming that the products were functional as histones. We first describe nucleosome reconstitution using semi-synthetic H4-K79ac and recombinant CENP-A, in order to confirm that CENP-A nucleosome could be readily prepared without issues. We then use
H4 and CENP-A synthesized through hybrid-phase NCL to produce histone octamer, tetramer, and nucleosomes.
Semi-synthesis of H4-K79ac
In order to prepare the thioester, H4(1-75)-intein-CBD in pTXB1 plasmid was expressed in the BL21 strain. Expressed protein was solubilized from the inclusion bodies using a denaturant such as urea. Denaturant was gradually removed by dialysis. Once the intein folds into its native structure, it catalyzes the N to S acyl rearrangement, forming a thioester bond between H4(1-75) and intein-CBD. Addition of excess thiol such as sodium 2-sulfanylethanesulfonate (MESNa) displaces the intein-CBD, forming the H4(1-75)-SR (Figure 44). Typically, the separation of cleaved thioester and intein-CBD is achieved using a chitin column. The expressed thioester elutes from the column while the CBD-containing species remain bound. However, in the case of histones, including H3 and H4, the cleaved thioester interacts with chitin, preventing efficient elution. For this reason, we instead exploit the high pI of histone thioesters to separate the products by cation exchange chromatography. It can be difficult to separate H4(1-75)-SR and the uncleaved H4(1-75)-intein-CBD, but the latter species can be removed by RP-HPLC purification.
Figure 44: H4-K79ac EPL scheme (steps shown: N to S acyl shift of H4(1-75)-intein-CBD, thiol exchange to give H4(1-75)-SR, NCL with the 76-102 peptide prepared by SPPS, desulfurization)
After cleavage, the synthesized C-terminal H4 peptide with the K79ac modification was added directly to the solution of H4(1-75)-SR. Ligation was complete in 4 h. After desulfurization and RP-HPLC purification, we obtained a homogeneous sample of H4-K79ac (Figure 45).
Figure 45: H4-K79ac EPL (A) SDS-PAGE of H4(1-75)-SR and K79ac peptide ligation (0 h, 2 h, 4 h, O/N). (B) SDS-PAGE of purified H4-K79ac. (C) RP-HPLC and (D) MALDI-TOF MS of purified H4-K79ac ([M + H]+ observed m/z 11277, expected m/z 11277).
Recombinant expression of His6-tagged CENP-A
We used the semi-synthetic H4-K79ac and the expressed CENP-A constructs for incorporation in nucleosomes. Wild-type CENP-A containing a His6-tag was expressed from a codon-optimized CENP-A sequence in a constitutively expressing pHCE vector using literature protocols.158 A thrombin cleavage sequence between the tag and the protein allowed for removal of the His6-tag. Using site-directed mutagenesis, we successfully prepared the mutant His6-CpA-C75S.
Refolding and reconstitution of recombinant and semi-synthetic histones
Histones could be reconstituted into nucleosomes either from refolded histone octamer cores156 or from separately refolded (H3/H4)2 tetramers and H2A/H2B dimers.212 Even with the His-tag, recombinant CENP-A successfully refolded into an octamer with H4, H2A, and H2B (Figure 46). Semi-synthetic H4-K79ac was also refolded into octamers. We then used these octamer constructs for nucleosome reconstitution on 601 DNA157 using the nucleosome assembly protein Nap1.213 With these successful reconstitutions, we next carried out the reconstitutions of the synthetic histones.
Figure 46: Refolding and reconstitution of recombinant CENP-A (A) SDS-PAGE of histone octamer with recombinant CENP-A. (B) PAGE of CENP-A nucleosomes prepared with Nap1-assisted nucleosome reconstitution.
Refolding and reconstitution of synthetic histones
To confirm that we had synthesized functional histones, we performed refolding using the synthetic H4 and CENP-A. Refolding into the relevant protein complexes was carried out with Cecil (CJ) Howard. Recombinant H3 and synthetic H4-K5ac, K12ac, K91ac were refolded into (H3/H4)2 tetramer (Figure 47A). Recombinant H2A, H2B, H3, and synthetic CpA-K124ac were refolded into octamer (Figure 47A). Of note, after careful MALDI-TOF MS analysis, we found that desulfurization of H4-K5ac, K12ac, K91ac was not complete. We therefore resuspended the lyophilized protein in desulfurization buffer. After confirming the completion of the reaction, we were able to use H4 in desulfurization conditions directly for tetramer refolding. This suggested that sufficiently pure proteins could be refolded immediately following desulfurization without the need for purification.
Figure 47: Refolding of synthetic histones160 (A) SDS-PAGE of CpA-K124ac histone octamer (2), and H4-K5ac,K12ac,K91ac tetramer (3). (B) Salt dialysis reconstitution of nucleosome with synthetic CpA-K124ac. (C) Nucleosome with CENP-A (PDB: 3AN2).194
As predicted, the 5% CpA-1345 deletion product was eliminated through the octamer refolding process, similar to effects observed for semi-synthetic H3 and H4.132 These protein complexes will be taken forward for further study of the effects of these modifications on nucleosome structure and dynamics.
Conclusions
In conclusion, we demonstrated a simple hybrid ligation approach that combines both solid and solution-phase ligation chemistry for optimal yields of challenging synthetic histone protein targets. We maximized product yields through resin cleavage at an external Rink linker, with subsequent cleavage at an internal HMBA linker to generate the native carboxyl terminus. We used this approach for synthesis of a triple-modified H4 histone and, notably, for the challenging target CpA-K124ac, which could not be accessed using more common expression-based approaches. We find that the key step in hybrid ligation is monitoring yields of SP-NCL to determine if there is a turnover point at which reduced release from the resin outweighs the chemical advantage of solid phase reactions.
Acknowledgements
Dr. Santosh Mahto contributed to the initial studies of H4 SP-NCL, including the development of the dual linker strategy and the determination of the optimal ligation sites of H4. The condition to cleanly convert H4-A-Dbz and CpA-2-Dbz to Nbz using dry DMF was developed in collaboration with Dr. John Shimko and Kurt Justus. Mallory Alexander assisted in the analysis of the required CENP-A peptides. Cecil Howard assisted with refolding the synthetic H4-K5ac,K12ac,K91ac and CpA-K124ac into tetramer and octamer, respectively. Thanks to Dr. Kurumizaka and Dr. Dalal for providing the pHCE vector containing the codon-optimized His6-tagged CENP-A.
Chapter 3: Convergent Hybrid-Phase Native Chemical Ligation
Introduction
Our prior chapter demonstrates some of the problems inherent to sequential ligation, including decreased yield due to lengthy peptides and additional purification steps. While solid-phase and hybrid-phase ligation ameliorated some of these issues, improved methods are necessary for high yield of fully synthetic proteins. Two main approaches have been suggested: one-pot and convergent schemes. In one-pot approaches, all ligation reactions are carried out in a single reaction vessel.139,144 Yields are often reduced due to the complex schemes, and careful procedures are required. Recently, a convergent ligation scheme was shown to be more efficient than one-pot ligation for the synthesis of H2B.214
Here, we develop convergent hybrid-phase NCL to further improve our synthetic yield (Figure 48). In Chapter 2, we demonstrated the synthesis of CENP-A with hybrid phase ligation, but the method required two rounds of ligation in solution after the cleavage of CpA-345. In solution, we were required to carry out CpA-2 ligation, Thz deprotection, dialysis, HMBA cleavage, CpA-1 ligation, dialysis, desulfurization, and finally purification. With these multiple handling steps, the yield was expectedly low. We hypothesized that if only one solution-phase ligation was required, we could improve the overall yield significantly, and minimize side products.
Figure 48: Convergent ligation of CENP-A (scheme labels: peptides 1 + 2; SP-NCL and cleavage of peptides 3-5; convergent ligation to give 1-5)
In this chapter, we propose a convergent approach where CpA-1 and CpA-2 are ligated to form CpA-12, which is then ligated to CpA-345 (Fig. 27). This simplifies the solution-phase component of hybrid-phase ligation, but it requires the development of a cryptic thioester at the C-terminus of CpA-2. We will then introduce convergent SP-hybrid NCL, a further refinement to improve CENP-A total synthesis.
Experimental Methods
Hydrazinolysis of peptide Nbz and peptide HMBA
Peptide Nbz was dissolved in 0.1 M Phosphate, 6 M GuHCl, 1% v/v hydrazine, pH 7.
Peptide HMBA was dissolved in 0.1 M Phosphate, 6 M GuHCl, 1% v/v hydrazine.
The reaction was complete in 2 h, and the product was confirmed with RP-HPLC and MALDI-TOF MS.
Preparation of Hydrazide resin using Wang
Wang resin155 (100-200 mesh) (Novabiochem) was swelled in 14 mL DCM and 160 µL N-methylmorpholine (final concentration was approximately 100 mM). NPCF was added to a final concentration of 100 mM at 0 °C. The mixture was brought to room temperature, and the resin was stirred overnight.
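The "approximately 100 mM" figure for N-methylmorpholine can be checked from its volume. The sketch below assumes a density of 0.92 g/mL and a molecular weight of 101.15 g/mol for N-methylmorpholine (typical supplier values, not taken from this work) and ignores the volume occupied by the resin.

```python
def neat_liquid_mM(liquid_uL, density_g_per_mL, mw_g_per_mol, solvent_mL):
    """Approximate concentration (mM) of a neat liquid added to a solvent.

    Assumes volumes are additive and ignores any solid (e.g. resin) volume.
    """
    mass_mg = liquid_uL * density_g_per_mL        # µL x g/mL = mg
    mmol = mass_mg / mw_g_per_mol                 # mg / (g/mol) = mmol
    total_mL = solvent_mL + liquid_uL / 1000.0
    return mmol / total_mL * 1000.0               # mmol/mL -> mM


if __name__ == "__main__":
    # 160 µL N-methylmorpholine in 14 mL DCM (density and MW are assumed values)
    print(round(neat_liquid_mM(160, 0.92, 101.15, 14), 1))
```

Under these assumptions the result is close to 103 mM, in line with the stated value.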
The resin was flow-washed with DCM three times, followed by three flow-washes in DMF. The resin turned bright yellow upon the addition of DMF. The resin was flow-washed three times with methanol, followed by three flow-washes in DCM. The resin was drained and lyophilized.
At 0 °C, 15 mL DMF, 11 mL DCM, and 50 µL hydrazine (30 mM) were added to the lyophilized resin. The resin turned yellow upon the addition of the mixture. The mixture was brought to room temperature, and the resin was nutated overnight.
The resin was flow-washed three times with DCM. The resin at this point was white. The resin was then flow-washed three times in the following sequence: DMF, methanol, DCM.
The resin was drained and lyophilized. CpA-2 was synthesized on the hydrazide resin using standard SPPS protocols.
Solution-phase NCL of CpA-12-Dbz
Ligation with CpA-1-Nbz
CpA-1-Nbz (1.5 molar equivalents relative to CpA-2-Dbz) was dissolved in SP Ligation Buffer (see Experimental Methods in Chapter 2). CpA-2-Dbz was added, and 1 M TCEP pH 7.4 was added to make 20 mM. The final concentration of CpA-2-Dbz should be at least 1 mM to allow for rapid ligation. Ligation was allowed to proceed for at least 5 h, and the reaction was monitored by RP-HPLC and MALDI-TOF MS. The ligated product was purified using RP-HPLC. Pure fractions were confirmed by RP-HPLC and MALDI-TOF MS, and the fractions were combined and lyophilized.
Ligation with CpA-1-Dbz
CpA-1-Dbz was dissolved in 0.1 M Phosphate, 6 M GuHCl, pH 3, and incubated at -15 °C for 15 minutes. The -15 °C ice bath was prepared by mixing 2 kg of ice with 10 g of NaCl. The temperature was adjusted by adding more ice or salt. 200 mM NaNO2 was prepared in water, and this solution was added to the CpA-1-Dbz solution to make 20 mM NaNO2. The peptide solution was mixed by pipetting up and down using a micropipette, and further incubated at -15 °C for 15 minutes. The solution was then taken out of the ice bath to room temperature. CpA-2-Dbz was dissolved in 0.1 M Phosphate, 6 M GuHCl, 0.2 M MPAA, pH 7.4. This solution was added to the CpA-1 solution, and the pH was adjusted to 7.4. After combining the two peptide solutions, the final concentration of MPAA was 0.1 M. The final concentration of CpA-1-Dbz should be at least 1 mM to promote rapid ligation. We found that the best kinetics were achieved at 3 mM peptide. After 1 h, TCEP was added to make 20 mM. Ligation was allowed to proceed for at least 5 h, and reaction completion was assessed by RP-HPLC and MALDI-TOF MS. The ligated product was purified using RP-HPLC, and the pure fractions were combined and lyophilized.
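The reagent additions in this procedure are simple dilution calculations. The sketch below shows how much stock to add to an existing sample to reach a target concentration, and the concentration obtained on mixing two solutions. It assumes volumes are additive and ignores the volume of acid or base used for pH adjustment. The function names are illustrative.

```python
def stock_volume_to_add(sample_volume, stock_conc, final_conc):
    """Volume of stock to add to a sample so the mixture reaches final_conc.

    Solves stock_conc * v = final_conc * (sample_volume + v) for v.
    Volumes share one unit; concentrations share one unit.
    """
    if final_conc >= stock_conc:
        raise ValueError("final concentration must be below the stock concentration")
    return sample_volume * final_conc / (stock_conc - final_conc)


def mixed_concentration(vol_a, conc_a, vol_b, conc_b):
    """Concentration of a solute after mixing two solutions."""
    return (vol_a * conc_a + vol_b * conc_b) / (vol_a + vol_b)


if __name__ == "__main__":
    # 200 mM NaNO2 stock added to a 90 µL peptide solution for 20 mM final
    print(stock_volume_to_add(90, 200, 20))
    # Equal volumes of MPAA-free and 0.2 M MPAA solutions
    print(mixed_concentration(100, 0.0, 100, 0.2))
```

With a 200 mM stock and a 20 mM target, the stock volume is one ninth of the sample volume. The stated final MPAA concentration of 0.1 M corresponds to combining roughly equal volumes of the two peptide solutions; the 90 µL and 100 µL volumes above are illustrative only.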
Solution-phase NCL of CENP-A using CpA-12-Dbz and CpA-345
Ligation of CpA-12-Dbz and CpA-345
The ligation conditions for CpA-12-Dbz and CpA-345 were identical to those for CpA-1-Dbz and CpA-2-Dbz. CpA-12-Dbz (1.5 molar equivalents relative to CpA-345) was dissolved in 0.1 M Phosphate, 6 M GuHCl, pH 3, and incubated at -15 °C for 15 minutes. NaNO2 was added to make 20 mM, and the solution was further incubated at -15 °C for 15 minutes. The solution was taken out of the ice bath to room temperature. CpA-345 was dissolved in 0.1 M Phosphate, 6 M GuHCl, 0.2 M MPAA, pH 7.4. This solution was added to the CpA-12 solution, and the pH was adjusted to 7.4. The final concentration of MPAA should be 0.1 M, and the final concentration of CpA-12-Dbz should be at least 1 mM. After 1 h, TCEP was added to make 20 mM. Ligation was allowed to proceed for at least 5 h, and the reaction was monitored by SDS-PAGE, RP-HPLC, and Ziptip MALDI-TOF MS. Samples for SDS-PAGE were TCA precipitated before loading. Procedures for Ziptip and TCA precipitation are described in Experimental Methods in Chapter 2.
Desulfurization of CpA-12345
The sample was dialyzed in a D-tube dialyzer against 200 mL of SP Wash Buffer at 4 °C. The buffer was changed after 5 h, and the second dialysis was allowed to proceed overnight.
The sample was then transferred to a 2.0 mL tube; the dialysis tube was rinsed with a small volume of SP Wash Buffer, and the rinse was added to the same tube. The sample was diluted 5-fold relative to the original ligation volume in order to minimize precipitation during desulfurization. GuHCl and MESNa were added to a solution of 1 M TCEP to final concentrations of 6 M and 0.2 M, respectively. The buffer was spun down, and the supernatant was added to the CENP-A sample to give 0.25 M TCEP and 50 mM MESNa. The sample was sparged with argon for 30 minutes. A 0.5 M solution of VA-044-US in water was prepared and added to the sample to a final concentration of 30 mM, and the sample was incubated in a 42 °C water bath. The reaction was allowed to proceed for at least 5 h. Complete desulfurization was confirmed by RP-HPLC and by Ziptip followed by crude MALDI-TOF MS.
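The reagent additions above can be planned by solving for the stock volume that brings an existing sample to a target concentration. The following is a minimal sketch under stated assumptions: the sample volume is hypothetical, the sample is assumed to contain none of the reagent before addition, and volumes are assumed to be additive. It confirms that the 1 M TCEP / 0.2 M MESNa buffer reaches 0.25 M TCEP and 50 mM MESNa at the same 1:3 buffer-to-sample ratio.

```python
def volume_to_add(sample_vol, stock_conc, target_conc):
    """Stock volume to add to sample_vol so the mixture reaches target_conc.

    Solves stock_conc * v / (sample_vol + v) = target_conc for v, assuming
    the sample contains none of the reagent and volumes are additive.
    """
    if target_conc >= stock_conc:
        raise ValueError("target must be below the stock concentration")
    return sample_vol * target_conc / (stock_conc - target_conc)


sample_ul = 300.0  # hypothetical diluted sample volume, not from this work

# Desulfurization buffer (1 M TCEP, 0.2 M MESNa) added to reach 0.25 M TCEP.
buffer_ul = volume_to_add(sample_ul, stock_conc=1.0, target_conc=0.25)
total_ul = sample_ul + buffer_ul
mesna_mM = 200.0 * buffer_ul / total_ul  # MESNa follows the same dilution

# 0.5 M VA-044-US stock added to reach 30 mM in the combined sample.
va044_ul = volume_to_add(total_ul, stock_conc=0.5, target_conc=0.030)

print(f"Buffer to add: {buffer_ul:.1f} uL (MESNa = {mesna_mM:.0f} mM)")
print(f"VA-044-US stock to add: {va044_ul:.1f} uL")
```

Note that adding the initiator stock dilutes TCEP and MESNa slightly below their nominal values; the sketch does not correct for this.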
Glycolic acid base resin
Synthesis of glycolic acid base resin
Diglycolic acid-Ala-Ahx-Lys-Gly-Rink-PEGA
PEGA resin swelled in methanol (3 mL) was measured out, corresponding to 0.03 mmol of reactive amine. A one-tenth loading cut was performed at Gly. Fmoc-Ala-OH, Fmoc-Lys(Boc)-OH, and Fmoc-Ahx-OH were coupled using standard manual synthesis conditions. Diglycolic anhydride was loaded using standard coupling conditions with HCTU and DIEA. The final volume of the resin in DMF was 2.7 mL. The loading of the base resin in DMF was estimated to be 0.001 mmol/mL. The resin was stored at 4 °C in DCM.
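The loading estimate can be reproduced from the numbers given above. This is a minimal sketch that assumes every site retained after the one-tenth loading cut is carried through the remaining couplings; the variable names are ours.

```python
reactive_amine_mmol = 0.03   # from 3 mL of methanol-swelled PEGA resin
loading_cut = 0.1            # one-tenth loading cut at Gly
final_volume_ml = 2.7        # resin volume in DMF after synthesis

# Sites remaining after the loading cut, spread over the final resin volume.
loaded_mmol = reactive_amine_mmol * loading_cut
loading_mmol_per_ml = loaded_mmol / final_volume_ml

print(f"Estimated loading: {loading_mmol_per_ml:.4f} mmol/mL")
```

The result, approximately 0.0011 mmol/mL, rounds to the 0.001 mmol/mL quoted for the base resin.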
Resin thioesterification
The resin was swelled in DMF for 15 minutes. A solution containing 0.3 M thiophenol, 0.3 M DIC, and 2 mM DMAP in DMF was added to the resin. After 1 h, a test cleavage was performed to confirm completion of the reaction: a small resin sample was washed with DMF, then with DCM, vacuum-dried, and incubated in TFA for 15 minutes. The TFA was eluted, evaporated under N2, and the residue was diluted with water. The sample was analyzed by RP-HPLC and MALDI-TOF. If unreacted glycolic acid was observed, the resin was washed with DMF and the thioesterification was repeated, with analysis again after 1 h. The reaction was typically complete after the second round.
Ligation of CpA-2-Dbz-Gly-Lys(Cys)
After confirming the completion of thioesterification, the resin was immediately washed with DMF, and then with water. The resin was then washed three times with SP Wash
Buffer and SP Ligation Buffer. Wash here refers to the flow-wash and nutation step discussed in Experimental Methods for Chapter 2. Buffer was drained from the resin, and
CpA-2-Dbz-Gly-Lys(Cys) dissolved in SP Ligation Buffer was added to the resin. TCEP was added to make 20 mM, and the ligation was allowed to proceed for at least 5 h.
Ligation of CpA-1-Dbz
Procedure to oxidize CpA-1-Dbz is identical to that of solution-phase ligation using CpA-
1-Dbz. After adding MPAA and adjusting the pH to 7.4, the solution was added to drained resin washed with SP Wash Buffer + 100 mM MPAA. After 1 h ligation, 1 M TCEP pH
7.4 was added to a final concentration of 20 mM.
Cleavage of CpA-12-Dbz0
After washing the resin 3 times with SP Wash Buffer and 3 times with water, the resin was lyophilized inside the column. The peptide was cleaved with 95:2.5:2.5 TFA:H2O:TIS for 1 h. The resin was washed 3 times with TFA. All TFA eluates were combined and concentrated under a flow of N2. Et2O was added to at least 5 times the volume of TFA. The sample was centrifuged and decanted. The pellet was dissolved in water and ACN, and lyophilized.
SDS-PAGE of TFA-treated resin
Resin was washed 3 times with water and the eluates were collected. The resin was drained or decanted, and transferred to a tube. SDS loading buffer containing 100 mM DTT was added, and the resin was incubated at 95-100 °C for 5 min. The resin was loaded on the gel along with the loading buffer using a cut-off tip.
Base resin sequences
The SP-NCL resin for the synthesis of CpA-345 used here had been modified so that the unligated base resin handle could be observed on the RP-HPLC and MALDI-TOF MS.
SP-NCL resin: Fmoc-Thz-Ala-Ahx-Lys-Gly-Rink-PEGA (1.33 µmol/mL in methanol)
Glycolic resin: Glycolic acid - Ala-Ahx-Lys-Gly-Rink-PEGA (1.0 µmol/mL in DMF)
Quantification of product from dry PEGA resin
If the final dry weight of the PEGA resin after ligation is known, the theoretical starting weight of the PEGA resin can be calculated using the following equation:
\[ \text{Starting weight} = \frac{\text{Final weight}}{1.06 + 0.02\,\tfrac{\text{mmol}}{\text{g}}\,(MW_{\text{total}})} \]
Where MWtotal is total molecular weight of all the components added on the resin, starting from the first Gly to the full SP-NCL peptide. Once the starting weight is calculated, the theoretical yield of the cleaved peptide is calculated using the following equation:
\[ \text{Theoretical yield} = (\text{Start weight})\left(0.02\,\tfrac{\text{mmol}}{\text{g}}\right)(MW_{\text{peptide}}) \]
Where MWpeptide is the molecular weight of the cleaved peptide product.
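The two equations can be applied as in the sketch below. It rests on an assumption of ours: molecular weights are expressed in g/mmol (Da divided by 1000), which makes the denominator of the first equation dimensionless and gives the yield in grams. The input weights are hypothetical and serve only to illustrate the arithmetic.

```python
LOADING_MMOL_PER_G = 0.02   # loading term that appears in both equations
BASE_FACTOR = 1.06          # constant term of the starting-weight equation


def starting_weight(final_weight_g, mw_total_g_per_mmol):
    """Theoretical starting resin weight (g) from the final dry weight (g)."""
    return final_weight_g / (BASE_FACTOR + LOADING_MMOL_PER_G * mw_total_g_per_mmol)


def theoretical_yield(start_weight_g, mw_peptide_g_per_mmol):
    """Theoretical yield (g) of cleaved peptide from the starting resin weight."""
    return start_weight_g * LOADING_MMOL_PER_G * mw_peptide_g_per_mmol


# Hypothetical inputs (not measurements from this work).
final_weight_g = 0.0122   # dry resin weight after ligation
mw_total = 8.0            # g/mmol, everything added to the resin
mw_peptide = 8.0          # g/mmol, cleaved peptide
recovered_g = 0.0004      # crude peptide recovered after cleavage

start_g = starting_weight(final_weight_g, mw_total)
yield_g = theoretical_yield(start_g, mw_peptide)
percent = 100.0 * recovered_g / yield_g

print(f"Starting weight: {start_g * 1000:.1f} mg")
print(f"Theoretical yield: {yield_g * 1000:.2f} mg; crude yield: {percent:.0f}%")
```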
Results and Discussion
Hydrazide as a cryptic thioester for convergent ligation
Convergent ligation requires protection schemes to prevent reaction of the internal thioester while initial ligation steps are carried out. In order to develop an efficient cryptic thioester for CENP-A convergent ligation, we first examined the hydrazide functionality.
Peptide hydrazides are unreactive under ligation conditions, but can be activated through oxidation to peptide azide using NaNO2. The azide can then be displaced by a thiol to form a thioester in situ (Figure 49).143 Further, it has been used successfully for the total synthesis of histones.144,146
[Scheme: peptide hydrazide, oxidized with NaNO2 to the peptide azide, then displaced by thiol to give the thioester]
Figure 49: Thioester conversion from peptide hydrazide
Synthesis of CpA-2 on hydrazide resin
Initially, we attempted to synthesize CpA-2 directly on a resin bearing a hydrazide linker,215 which was prepared from Wang resin (Figure 50). This synthesis was not successful. Although the 16-residue intermediate could still be observed, the expected product could not be detected by the end of the synthesis (Figure 51).
[Scheme: preparation of the hydrazide resin; reagent labels: PNCF, hydrazine]
Figure 50: Preparation of hydrazide base resin
[RP-HPLC traces: CpA-2-N2H3 16-mer, CpA-2-N2H3 29-mer, and full-length CpA-2-N2H3; gradient 0-73% B; x-axis: Time (min), 0-30]
Figure 51: Synthesis of CpA-2-N2H3
RP-HPLC of CpA-2-N2H3 at 16-residue, 29-residue, and full-length (36 residues).
CpA-2-N2H3 by hydrazinolysis of Nbz
Given that the CpA-2 peptide itself can be synthesized on simple linkers, we hypothesized that the hydrazide linker might have posed a problem during synthesis. We therefore proposed an approach in which an internal linker was displaced by hydrazine to allow preparation of the protected derivative. Nbz can be cleaved with hydrazine to yield peptide hydrazide,146 so we attempted to convert CpA-2-Nbz(formyl) to CpA-2-N2H3 using this method. The proposed convergent ligation scheme is illustrated in Figure 52.
[Scheme: hydrazinolysis of CpA-2-Nbz to CpA-2-N2H3, ligation with CpA-1 to give CpA-12-N2H3, NaNO2 activation to the azide, and ligation with CpA-345 in MPAA to give CpA-12345]
Figure 52: Convergent ligation using hydrazide: hydrazinolysis of Nbz
Hydrazinolysis of CpA-2-Nbz(formyl) generated an unknown species with m/z of 4614 (Figure 53). The expected hydrazide product was also observed, but the unknown species appeared to be the major product. We hypothesized that condensation between the formyl group and the hydrazine reagent led to a rearrangement that generated the unknown product.
We therefore repeated the hydrazinolysis using CpA-2-Nbz. As discussed in Chapter 2, Nbz conversion using dry NMP generated CpA-2-Nbz with no formylation. Although CpA-2-N2H3 was observed, we found very little product in the supernatant of the peptide sample as assessed by RP-HPLC (Figure 54). The peptide sample was relatively dilute and we did not observe visible precipitation, but CpA-2-N2H3 appeared to have only marginal solubility even in 6 M GuHCl. This poor solubility would partially explain why synthesis of CpA-2 on a hydrazide resin was so poor.
[Panel A: RP-HPLC of CpA-2-Nbz(formyl), 30-60% B. Panel B: RP-HPLC after hydrazine addition, 30-60% B. Panel C: MALDI-TOF MS with labels m/z 4614 and, for the hydrazide, [M + H]+ observed m/z 4154, expected m/z 4152. x-axis of A and B: Time (min), 0-30]
Figure 53: Hydrazinolysis of CpA-2-Nbz(formyl)
(A) RP-HPLC of CpA-2-Nbz(formyl). (B) RP-HPLC of CpA-2-Nbz(formyl) after the addition of hydrazine. (C) MALDI-TOF MS of the major peak in (B).
[Panel A: RP-HPLC of CpA-2-Nbz. Panel B: RP-HPLC of CpA-2-N2H3. Gradient 30-60% B; x-axis: Time (min), 0-30]
Figure 54: Hydrazinolysis of CpA-2-Nbz
(A) RP-HPLC of CpA-2-Nbz. (B) RP-HPLC of CpA-2-N2H3.
CpA-12-N2H3 by hydrazinolysis of HMBA
Since it was not possible to prepare CpA-2-N2H3 due to solubility issues, we decided to perform hydrazinolysis after ligating CpA-1 to CpA-2. CpA-1 is very soluble, so CpA-12-
N2H3 should have improved solubility over CpA-2-N2H3. To do this, it was necessary to use CpA-2 with a C-terminal HMBA, since using CpA-2-Nbz would lead to cyclization and hydrolysis during ligation. HMBA, like Nbz, can be cleaved with hydrazine to yield peptide hydrazide,216 but unlike Nbz, it is stable under ligation conditions. The convergent ligation scheme using HMBA is shown in Figure 55.
[Scheme: ligation of CpA-1 with CpA-2-HMBA to give CpA-12-HMBA, hydrazinolysis to CpA-12-N2H3, NaNO2 activation to the azide, and ligation with CpA-345 in MPAA to give CpA-12345]
Figure 55: Convergent ligation using hydrazide: hydrazinolysis of HMBA
For this purpose, we synthesized CpA-2-HMBA-Arg-Gly-Dbz (Figure 56). Performing hydrazinolysis after CpA-1 ligation improved solubility, but the presence of a side product made purification difficult. The yield of partially pure CpA-12-N2H3 was less than 15% from the starting CpA-2-HMBA-Arg-Gly-Dbz (Figure 57).
[RP-HPLC: 25-50% B; x-axis: Time (min), 0-30. MALDI-TOF MS: [M + H]+ observed m/z 4903, expected m/z 4906]
Figure 56: CpA-2-HMBA-Arg-Gly-Dbz
RP-HPLC (left) and MALDI-TOF MS (right) of CpA-2-Arg-Gly-Dbz. Note the C-terminal Dbz was originally intended for ligating CpA-2 to a solid support, so that CpA-12 could be prepared in solid-phase. The SP-NCL of CpA-12 is not relevant to the study conducted here, and will be discussed in a later section.
[Panel A: CpA-12-HMBA-RG-Dbz. Panel B: CpA-12-N2H3. Panel C: semi-pure CpA-12-N2H3. Gradient 25-45% B; x-axis: Time (min), 0-30; an asterisk marks a peak in each panel]
Figure 57: Ligation and hydrazinolysis of CpA-12
RP-HPLC of CpA-1 and CpA-2-HMBA-RG-Dbz ligation (A). Hydrazinolysis of CpA-12 (B). Semipure CpA-12-N2H3 (C).
Generating hydrazide through hydrazinolysis of Nbz and HMBA was not compatible with CpA-2. We therefore needed to find another suitable functional group to serve as the masked thioester.
Using Dbz as a cryptic thioester for convergent ligation
While searching for the ideal cryptic thioester, we found that recent advances by the
Dawson217 and Liu218 laboratories demonstrate that peptide-Dbz can be directly converted into a reactive thioester. Dbz is oxidized using NaNO2, and the resulting triazole functionality can be displaced by an external thiol.218 This approach completely bypasses the Nbz conversion step. As illustrated in Figure 58, Dbz is a very versatile linker that can be used for the efficient production of thioester, as well as serve as a cryptic thioester.
[Scheme labels: peptide-Dbz; NPCF/DMF; NPCF/DCM; NaNO2; hydrazine; MPAA; product: MPAA thioester]
Figure 58: Preparation of thioester using Dbz
The ability to use Dbz as a thioester precursor had several advantages over Nbz. Eliminating a chemical step improved the yield of the peptide. Nbz is more labile than Dbz, and even under the acidic conditions of RP-HPLC, hydrolysis products were observed if the peptides were not lyophilized immediately. The peptide was much more stable as the Dbz derivative. However, direct activation of Dbz cannot entirely replace Nbz as the means of producing thioester. Ligation using a peptide-Dbz with an N-terminal Thz generated a ring-opened side product. Therefore, ligation using Dbz was only suitable for peptides lacking Thz.
Convergent Hybrid-Phase NCL of CENP-A
Figure 59 illustrates the scheme for the convergent hybrid-phase NCL approach. CpA-345 is prepared through SP-NCL as described in Chapter 2. CpA-345 is cleaved at the HMBA rather than the Rink linker in order to reduce the number of chemical steps performed in solution. CpA-12-Dbz is prepared through solution-phase NCL using CpA-2-Dbz and
CpA-1-Nbz. CpA-12-Dbz is activated using NaNO2, and convergent ligation with the cleaved CpA-345 generates the full-length product. Compared to the previous hybrid-phase NCL approach, this method eliminates a solution-phase Thz deprotection step, a dialysis step, and a solution-phase NCL step. With this new strategy, we anticipated an improvement in yield of CENP-A total synthesis.
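[Scheme: CpA-2-Dbz is ligated with CpA-1 by solution-phase NCL to give CpA-12-Dbz; CpA-345 is prepared by SP-NCL and released with NaOH; CpA-12-Dbz is activated with NaNO2 and ligated to CpA-345 in MPAA to give CpA-12345]
Figure 59: Convergent hybrid-phase NCL of CENP-A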
Solution-phase ligation of CpA-12-Dbz
CpA-1-Nbz and CpA-2-Dbz were dissolved in MPAA ligation buffer. CpA-1-Nbz was converted to the thioester while CpA-2-Dbz remained inert, and the ligation was efficient (Figure 60). The product was purified by RP-HPLC to remove excess MPAA and unreacted peptides, providing a 40% purified yield.
[Panel A: crude ligation, 15-50% B, peaks labeled MPAA, CpA-12-Dbz, CpA-1, and CpA-2-Dbz. Panel B: purified CpA-12-Dbz, 25-50% B. Panel C: MALDI-TOF MS, [M + H]+ observed m/z 8144, expected m/z 8144. x-axis of A and B: Time (min), 0-30]
Figure 60: Solution-phase ligation of CpA-12-Dbz
Crude RP-HPLC of the ligation (A). RP-HPLC (B) and MALDI-TOF MS (C) of purified CpA-12-Dbz.
Solution-Phase NCL of CpA-12345-K124ac
With both CpA-12-Dbz and CpA-345 in hand, we carried out the convergent ligation in solution-phase. CpA-12-Dbz was converted to the triazole derivative using NaNO2, while
CpA-345 was dissolved in MPAA ligation buffer. The two peptides were combined, generating the CpA-12 thioester in situ. CpA-12 was efficiently ligated to produce the full-length CpA-12345 (Figure 61).
[RP-HPLC trace: peaks labeled CpA-12 and CpA-12345; gradient 30-90% B; x-axis: Time (min), 0-30]
Figure 61: Solution-phase ligation of CpA-12 and CpA-345
RP-HPLC of convergent ligation after 16 h.
Overnight dialysis was carried out in order to remove the MPAA before the free-radical desulfurization. After confirming complete desulfurization by RP-HPLC and MALDI-
TOF MS (Figure 62), the protein was purified with RP-HPLC (Figure 63).
[Panel A: MALDI-TOF MS before desulfurization, [M + H]+ observed m/z 16001, expected m/z 16008; after desulfurization, [M + H]+ observed m/z 15882, expected m/z 15881. Panel B: RP-HPLC, peaks labeled CpA-12 and CpA-12345; gradient 30-90% B; x-axis: Time (min), 0-30]
Figure 62: Desulfurization of CENP-A
(A) MALDI-TOF MS of before (top) and after (bottom) desulfurization of CpA-K124ac. (B) RP-HPLC of CENP-A after desulfurization.
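The m/z values reported in Figure 62 can be sanity-checked with a few lines of arithmetic. The sketch below computes the relative error of each observed mass and expresses the expected mass shift in sulfur equivalents. Reading that shift as a count of Cys-to-Ala conversions is an illustrative interpretation on our part, using the average atomic mass of sulfur (about 32.06 Da); it is not a result stated in the figure.

```python
AVG_MASS_SULFUR = 32.06  # Da; Cys -> Ala conversion removes one sulfur atom


def percent_error(observed, expected):
    """Signed relative error of an observed m/z, in percent."""
    return 100.0 * (observed - expected) / expected


# (observed, expected) [M + H]+ values as reported in Figure 62A.
spectra = {
    "before desulfurization": (16001, 16008),
    "after desulfurization": (15882, 15881),
}

for label, (obs, exp) in spectra.items():
    print(f"{label}: {percent_error(obs, exp):+.3f}% relative to expected")

# Expected mass shift on desulfurization, in sulfur equivalents.
expected_shift = (spectra["before desulfurization"][1]
                  - spectra["after desulfurization"][1])
sulfur_equivalents = expected_shift / AVG_MASS_SULFUR
print(f"Expected shift: {expected_shift} Da = {sulfur_equivalents:.2f} sulfur equivalents")
```

A shift close to four sulfur equivalents is what would be expected if each of the four ligation junctions of a five-peptide assembly carried one cysteine.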
[Panel A: RP-HPLC of CpA-12345, 30-90% B; x-axis: Time (min), 0-30. Panel B: MALDI-TOF MS, [M + H]+ observed m/z 15882, expected m/z 15881]
Figure 63: Purified CpA-K124ac
RP-HPLC (A) and MALDI-TOF MS (B) of purified CpA-K124ac.
Two trials were performed using the convergent hybrid-phase NCL method. The yield, calculated from the resin loading of the CpA-345 SP-NCL, was 18% in both trials. This was substantially higher than the 7% yield that we initially obtained from the CpA hybrid-phase NCL. In addition, no side product was observed in the purified product.
Convergent SP-Hybrid NCL of CpA
Ligating CpA-1 and CpA-2 in solution was an effective way to improve total synthesis yield. However, CpA-12 still requires purification. Importantly, total synthesis of larger proteins may require the N-terminal segment to be assembled from three or more peptides, which is not compatible with this convergent approach. For larger proteins, it would be difficult to implement convergent hybrid-phase NCL, since yield decreases significantly as the number of solution-phase ligations increases. We envisioned synthesizing the N-terminal segment using SP-NCL, enabling the ligation of multiple peptides while eliminating the need for purification. We term this approach convergent SP-hybrid NCL.
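The penalty for additional solution-phase ligations follows from the multiplicative nature of step yields: the overall yield is the product of the recoveries of each ligation and purification cycle. The sketch below illustrates this with a hypothetical per-step recovery of 50%, which is a placeholder rather than a measured value. It also computes the fold improvement between the 7% and 18% overall yields reported for the two CENP-A routes.

```python
from math import prod


def overall_yield(step_yields):
    """Overall fractional yield as the product of per-step fractional yields."""
    return prod(step_yields)


HYPOTHETICAL_STEP_YIELD = 0.5  # placeholder recovery per ligation/purification cycle

for n_steps in range(1, 5):
    total = overall_yield([HYPOTHETICAL_STEP_YIELD] * n_steps)
    print(f"{n_steps} solution-phase step(s): {100 * total:.1f}% overall")

# Fold improvement between the two reported CENP-A overall yields.
fold_improvement = 18 / 7
print(f"Convergent vs. original hybrid-phase route: {fold_improvement:.1f}-fold")
```

Under this assumption each extra solution-phase cycle halves the material in hand, which is why moving ligations onto the solid support, where no intermediate purification is needed, becomes more attractive as the number of peptides grows.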
Ligation handle for convergent SP-NCL
The key to the convergent SP-hybrid NCL of CENP-A was the development of a strategy to anchor CpA-2-Dbz to a solid support with the Dbz linker intact. In our proposed scheme, reverse NCL is performed between the C-terminal cysteine of CpA-2 and a resin bearing a terminal thioester (Figure 64). The cysteine is attached through the side chain of the C-terminal Lys. Ligation with CpA-1 followed by cleavage yields CpA-12 with the Dbz intact.
[Scheme: the glycolic acid base resin is converted to the thioester with thiophenol; CpA-2-Dbz-Gly-Lys(Cys) and then CpA-1 are ligated on resin by SP-NCL; TFA cleavage releases CpA-12 with the Dbz intact; CpA-345 is prepared by SP-NCL and released with NaOH; NaNO2 activation and ligation in MPAA give CpA-12345]
Figure 64: CENP-A Convergent SP-hybrid NCL scheme
Synthesis of CpA-2-Dbz-Gly-Lys(Cys)
In order to synthesize CpA-2 with the required C-terminal handle, Lys was coupled to the
Rink amide resin as Fmoc-Lys(Alloc)-OH. After Alloc deprotection, Boc-Cys(Trt)-OH was coupled to the ε-amine. Gly was added as a spacer residue before the addition of Dbz.
CpA-2 was then synthesized on the resin using standard SPPS procedure (Figure 65). The
Cys remained protected during synthesis until the Boc was deprotected with TFA cleavage.
[Panel A: RP-HPLC of crude CpA-2-Dbz-GK(C); x-axis: Time (min), 0-30. Panel B: RP-HPLC (x-axis: Time (min), 0-30) and MALDI-TOF MS, [M + H]+ observed m/z 5852, expected m/z 5858. One peak is labeled "no Dbz"]
Figure 65: CpA-2-Dbz-GK(C)
(A) RP-HPLC of crude CpA-2-Dbz-GK(C). (B) RP-HPLC (left) and MALDI-TOF MS (right) of CpA-2-Dbz-GK(C).
SP-NCL of CpA-12-Dbz0
Thioesterification of the glycolic acid base resin was performed using thiophenol due to the fast kinetics of aryl thioesters.145 Full conversion was confirmed by MALDI-TOF MS.
The base resin was stored in the diglycolic acid form, and thioesterification was performed immediately before ligation in order to minimize the chance of thioester hydrolysis. CpA-2-Dbz-Gly-Lys(Cys) ligated successfully to the resin and no unreacted base resin was observed. After methoxylamine deprotection, the second ligation with CpA-1 was also successful (Figure 66A). The crude yield of CpA-12-Dbz0, calculated from the dry weight of the resin after cleavage, was 20%, which was lower than expected. When the TFA-treated CpA-12 resin was boiled in SDS and run on SDS-PAGE, we observed a significant band of CpA-12 (Figure 66C). This was consistent with what was observed for CpA-12345 SP-NCL, suggesting that the low yield was due to incomplete elution of the peptide from the resin. We believe that the unique sequence of CpA-12 was the cause of the inefficient cleavage. Nevertheless, SP-NCL of CpA-12 still held a definite advantage over the solution-phase approach, as no purification was required. The lyophilized crude peptide could be used directly for the solution-phase ligation to CpA-345.
[Panel A: RP-HPLC of CpA-2-Dbz0 (0-73% B) and CpA-12-Dbz0 (15-60% B); x-axis: Time (min), 0-30. Panel B: MALDI-TOF MS, [M + H]+ observed m/z 8019, expected m/z 8021. Panel C: SDS-PAGE gel]
Figure 66: SP-NCL of CpA-12-Dbz0
RP-HPLC of CpA-2 and CpA-1 ligations (A). MALDI-TOF MS of CpA-12-Dbz0 (B). SDS-PAGE of TFA-treated CpA-12 resin (C). CpA-12-Dbz0 is indicated by the arrow.
Refolding synthetic CENP-A without purification
With CpA-K124ac synthesized by convergent hybrid-phase NCL, most of the yield loss appeared to come from RP-HPLC purification. The RP-HPLC trace of the desulfurized CENP-A is relatively clean. We hypothesize that the desulfurized product should be sufficiently pure to be used directly for octamer refolding. We have shown previously that synthetic H4 can be refolded directly from desulfurization conditions. In the convergent hybrid-phase NCL of CpA-K124ac, we used purified CpA-12-Dbz and relatively pure CpA-345 for the solution-phase ligation. The RP-HPLC following desulfurization of CpA-12345-K124ac had only two major products: CpA-12 and CpA-K124ac. The CpA-12 peptide should not be able to refold, since the majority of its sequence is the unstructured tail.194 In fact, the segment corresponding to the CpA-1 peptide is not visible in the crystal structure.
By performing refolding directly after desulfurization, the only purification step needed in the entire convergent hybrid-phase NCL scheme will be the purification of CpA-12-Dbz.
We have shown that even this purification step can be eliminated by preparing CpA-12 using SP-NCL. Therefore, using the convergent SP-hybrid NCL approach, it should be possible to prepare CENP-A octamer without a single purification step (Figure 67). This will be an unprecedented feat in histone total synthesis.
[Scheme: CpA-12 and CpA-345 are each prepared by SP-NCL and released with TFA and NaOH, respectively; NaNO2 activation and ligation in MPAA give CpA-12345 together with residual CpA-12; desulfurization and refolding follow, with CpA-12 labeled separately after refolding]
Figure 67: Refolding synthetic CENP-A with no purification
Segment containing CpA-1 was not visible in the crystal structure.194
Conclusions
By incorporating convergent ligation in our hybrid-phase approach, we were able to increase the yield of CENP-A total synthesis significantly. Following this success, we used another convergent approach where CpA-12 was prepared in solid-phase. CpA-12 yield from SP-NCL was lower than expected due to inefficient elution, likely arising from the unique properties of CpA-12. It is possible that this issue is not seen in other proteins.
Through CpA total synthesis, we have demonstrated the viability of convergent SP-hybrid
NCL. In this approach, a protein fragment prepared through SP-NCL can be cleaved and directly converted into a thioester without the need for purification. Large proteins can be ligated from 2 to 3 fragments, each produced by SP-NCL. The convergent SP-NCL approach can potentially overcome the current size limit of protein total synthesis. We also reveal that similarly to H4 and CENP-A, H3 could not be synthesized using sequential SP-
NCL. This provides us with H3 as the next target for our new convergent hybrid approach.
Acknowledgements
Thanks to Cecil (CJ) Howard and Ziyong Hong for the helpful discussions on the anchoring strategy of CpA-2 to resin used in convergent SP-hybrid NCL.
Chapter 4: Conclusions
Total synthesis offers unparalleled control when it comes to installing multiple site-specific modifications on a protein. Synthetic proteins have been used to understand the structure, function, and mechanism of action of many proteins. As powerful as protein total synthesis is, the technique has major limitations. We have developed multiple ligation techniques to overcome those limitations.
The state of the art in sequential solution-phase ligation for the total synthesis of H3 has an overall yield of 7%. Through hybrid-phase NCL, we successfully synthesized H4, prepared from four peptides, with 16% yield. Despite the increased number of ligation steps, the yield was superior to that of the H3 synthesis. Hybrid-phase NCL of CENP-A, with the same number of solution-phase ligation steps as H3, had 7% yield. With the convergent hybrid-phase method, the overall yield of CENP-A was further improved to 18%. This is a significant step toward overcoming the current limitations in total protein synthesis.
We have revealed the limitations of H4, CENP-A, and H3 total syntheses by sequential SP-NCL. Yield reductions were observed after one particular ligation, resulting in a low overall product yield. We therefore developed the hybrid-phase NCL approach, which combines solution-phase and solid-phase ligation techniques to maximize yield. Sequential SP-NCL was performed to prepare the C-terminal segment of the protein. The segment was detached from the solid support, and the ligation responsible for the yield cut was carried out in solution. We employed this strategy to generate H4-K5ac,K12ac,K91ac and CpA-K124ac. The synthetic histones could be refolded into histone octamers and reconstituted into nucleosomes.
We have improved hybrid-phase NCL by incorporating convergent ligation strategies in the context of CENP-A. The N-terminal segment was prepared by solution-phase NCL, and the C-terminal segment was prepared by SP-NCL as previously described. The two segments were combined in solution and ligated to give the full-length protein.
Desulfurization and purification produced CpA-K124ac with a significant yield improvement over the original hybrid-phase NCL method. The key to the convergent approach was use of the Dbz linker as a masked thioester that remained inert until activated by oxidation.
In order to eliminate the need to purify the N-terminal segment before its use in the convergent step, we developed the convergent SP-hybrid NCL strategy to produce the N-terminal segment. After ligation, the peptide was cleaved from the solid support while retaining the Dbz as the masked thioester. The cleaved segment was used directly for convergent ligation with no purification. CpA-K124ac was produced, and only one purification step was performed in the entire NCL process.
Future Work and Application
The convergent SP-hybrid NCL approach allows the N-terminal segment to be ligated from multiple peptides. Therefore, this approach can potentially be used for the efficient synthesis of larger proteins with more than 200 residues. In order to demonstrate this potential, we are currently applying this approach to the synthesis of the 212-residue linker histone H1.2. Ziyong Hong has synthesized the required peptides (Table 6). An initial sequential SP-NCL will be performed as a test to determine where the yield cut occurs, if at all. Depending on the result, we will divide the protein into two or three segments prepared by parallel SP-NCL. After cleavage, the segments will be ligated in solution to produce the full-length protein (Figure 68).
Table 6: H1.2 Peptides
H1.2 residues                     Peptide sequence
H1(1-23)                          SETAPAAPAAAPPAEKAPVKKKA
H1(24-48)                         AKKAGGTPRKASGPPVSELITKAVA
H1(49-66)                         ASKERSGVSLAALKKALA
H1(67-86)                         AAGYDVEKNNSRIKLGLKSL
H1(87-110)                        VSKGTLVQTKGTGASGSFKLNKKA
H1(111-133)                       ASGEAKPKVKKAGGTKPKKPVGA
H1(134-162)                       AKKPKKAAGGATPKKSAKKTPKKAKKPAA
H1(163-188)                       ATVTKKVAKSPKKAKVAKPKKAAKSA
H1(189-212)                       AKAVKPKAAKPKVVKPKKAAPKKK

H1.2 residues                     Peptide sequence
H1(1-23)-Dbz                      SETAPAAPAAAPPAEKAPVKKKA-Dbz
H1(24-48)-Dbz                     Thz-KKAGGTPRKASGPPVSELITKAVA-Dbz
H1(49-66)-Dbz                     Thz-SKERSGVSLAALKKALA-Dbz
H1(67-86)-Dbz                     Thz-AGYDVEKNNSRIKLGLKSL-Dbz
H1(87-110)-Dbz(Alloc)             dmThz-SKGTLVQTKGTGASGSFKLNKKA-Dbz
H1(111-133)-Dbz                   Thz-SGEAKPKVKKAGGTKPKKPVGA-Dbz(Alloc)
H1(134-162)-Dbz                   Thz-KKPKKAAGGATPKKSAKKTPKKAKKPAA-Dbz
H1(163-188)-Dbz                   Thz-TVTKKVAKSPKKAKVAKPKKAAKSA-Dbz
H1(189-212)-HMBA-RG-Dbz(Alloc)    Thz-KAVKPKAAKPKVVKPKKAAPKKK-HMBA-RG-Dbz(Alloc)
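The peptide boundaries in Table 6 can be checked programmatically. The sketch below is a simple consistency check, not part of the synthetic workflow: it verifies that the native sequences tile residues 1-212 without gaps and that each sequence length matches its residue range. The two-piece and three-piece segment boundaries are taken from the segment groupings shown in Figure 68; the function names are ours.

```python
# Native H1.2 peptide sequences keyed by (first residue, last residue), from Table 6.
H1_PEPTIDES = {
    (1, 23): "SETAPAAPAAAPPAEKAPVKKKA",
    (24, 48): "AKKAGGTPRKASGPPVSELITKAVA",
    (49, 66): "ASKERSGVSLAALKKALA",
    (67, 86): "AAGYDVEKNNSRIKLGLKSL",
    (87, 110): "VSKGTLVQTKGTGASGSFKLNKKA",
    (111, 133): "ASGEAKPKVKKAGGTKPKKPVGA",
    (134, 162): "AKKPKKAAGGATPKKSAKKTPKKAKKPAA",
    (163, 188): "ATVTKKVAKSPKKAKVAKPKKAAKSA",
    (189, 212): "AKAVKPKAAKPKVVKPKKAAPKKK",
}


def check_segments(peptides):
    """Return a list of problems: gaps, overlaps, or length mismatches."""
    problems = []
    expected_start = 1
    for (start, end), seq in sorted(peptides.items()):
        if start != expected_start:
            problems.append(f"gap or overlap before residue {start}")
        if len(seq) != end - start + 1:
            problems.append(f"H1({start}-{end}) has {len(seq)} residues")
        expected_start = end + 1
    return problems


def segment_lengths(boundaries):
    """Length in residues of each (start, end) segment."""
    return [end - start + 1 for start, end in boundaries]


problems = check_segments(H1_PEPTIDES)
total = sum(len(seq) for seq in H1_PEPTIDES.values())
two_piece = segment_lengths([(1, 86), (87, 212)])
three_piece = segment_lengths([(1, 66), (67, 133), (134, 212)])

print(f"Problems found: {problems}")
print(f"Total residues: {total}")
print(f"Two-piece segment lengths: {two_piece}")
print(f"Three-piece segment lengths: {three_piece}")
```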
[Scheme, two-piece ligation: segments H1(1-86) and H1(87-212) are each assembled by SP-NCL; reagent labels: 1) TFA / NaOH, 2) NaNO2; then 1) MPAA, 2) desulfurization to give H1(1-212). Three-piece ligation: segments H1(1-66), H1(67-133), and H1(134-212) are each assembled by SP-NCL; H1(1-66) and H1(67-133) are joined first (TFA; 1) TFA, 2) NaNO2; MPAA), and the product is then joined to H1(134-212) (1) TFA / NaOH, 2) NaNO2; then 1) MPAA, 2) desulfurization) to give H1(1-212)]
Figure 68: Convergent SP-Hybrid NCL scheme of H1
In addition, the SP-hybrid NCL approach is also being applied to the total synthesis of H3. In a preliminary test, we carried out a sequential SP-NCL of H3 using five peptide fragments. Micro-cleavages after each ligation revealed a reduction in yield after the fourth ligation, much like H4 and CENP-A. Yield loss when ligation crosses from the relatively hydrophobic α-helical core domain to the unstructured, highly charged tail may be a common trend in histone proteins (Figure 69).
[Structure panels: CpA, H3, H4]
Figure 69: Comparison of CENP-A, H3, and H4 structures
Green segments, from left to right, are CpA-5, H4-C, and H3-C2. Blue segments are CpA-4, H4-H, and H3-C1. Red segments are CpA-3, H3-M2, H4-B. Light blue segments are CpA-2, H3-M1, and H4-A. Yellow green segment is H3-N.56,194
Overall, we have demonstrated the efficient total synthesis of histone proteins by developing a new ligation strategy, which combines three NCL strategies: Solid-phase
NCL, solution-phase NCL, and convergent NCL. In the process we have also developed solutions for the various challenges involved with the efficient preparation of individual peptide thioesters using Fmoc-SPPS. This work demonstrates significant progress in the field of efficient protein total synthesis.
References
Figures with Ref. 7 were reproduced from the indicated reference with permission from
Springer (license number 3911880464769). Figures and Tables with Ref. 160 were reproduced and adapted from the indicated reference with permission from The Royal
Society of Chemistry.
1. Fischer, E. Synthesis in the purine and sugar group. Nobel Lecture (1902).
2. Kent, S. et al. Through the looking glass - a new world of proteins enabled by chemical synthesis. Journal of Peptide Science 18, 428-436 (2012).
3. Kent, S.B.H. Total chemical synthesis of proteins. Chem. Soc. Rev. 38, 338-351 (2009).
4. Nilsson, B.L., Soellner, M.B. & Raines, R.T. Chemical synthesis of proteins. in Annual Review of Biophysics and Biomolecular Structure, Vol. 34 91-118 (2005).
5. Stevens, R.C. Design of high-throughput methods of protein production for structural biology. Structure 8, R177-R185 (2000).
6. Rosano, G.L. & Ceccarelli, E.A. Recombinant protein expression in Escherichia coli: advances and challenges. Frontiers in microbiology 5, 172-172 (2014).
7. Howard, C.J., Yu, R.R., Gardner, M.L., Shimko, J.C. & Ottesen, J.J. Chemical and biological tools for the preparation of modified histone proteins. Topics in current chemistry 363, 193-226 (2015).
8. Müller, M.M. & Muir, T.W. Histones: At the Crossroads of Peptide and Protein Chemistry. Chemical Reviews 115, 2296-2349 (2015).
9. Borgia, J.A. & Fields, G.B. Chemical synthesis of proteins. Trends in Biotechnology 18, 243-251 (2000).
10. Kent, S. Total chemical synthesis of enzymes. Journal of Peptide Science 9, 574-593 (2003).
11. Dirksen, A. & Dawson, P.E. Expanding the scope of chemoselective peptide ligations in chemical biology. Current Opinion in Chemical Biology 12, 760-766 (2008).
12. Lu, W.Y., Qasim, M.A., Laskowski, M. & Kent, S.B.H. Probing intermolecular main chain hydrogen bonding in serine proteinase-protein inhibitor complexes: Chemical synthesis of backbone-engineered turkey ovomucoid third domain. Biochemistry 36, 673-679 (1997).
13. Smith, R., Brereton, I.M., Chai, R.Y. & Kent, S.B.H. Ionization states of the catalytic residues in HIV-1 protease. Nature Structural Biology 3, 946-950 (1996).
14. Ottesen, J.J., Bar-Dagan, M., Giovani, B. & Muir, T.W. An amalgamation of solid phase peptide synthesis and ribosomal peptide synthesis. Biopolymers 90, 406-414 (2008).
15. Low, D.W. & Hill, M.G. Rational fine-tuning of the redox potentials in chemically synthesized rubredoxins. Journal of the American Chemical Society 120, 11536-11537 (1998).
16. Shimko, J.C., North, J.A., Bruns, A.N., Poirier, M.G. & Ottesen, J.J. Preparation of Fully Synthetic Histone H3 Reveals That Acetyl-Lysine 56 Facilitates Protein Binding Within Nucleosomes. Journal of Molecular Biology 408, 187-204 (2011).
17. Baca, M., Alewood, P.F. & Kent, S.B.H. Structural engineering of the HIV-1 protease molecule with a beta-turn mimic of fixed geometry. Protein Science 2, 1085-1091 (1993).
18. Milton, R.C.D., Milton, S.C.F. & Kent, S.B.H. Total chemical synthesis of a D-enzyme: the enantiomers of HIV-1 protease show demonstration of reciprocal chiral substrate specificity. Science 256, 1445-1448 (1992).
19. Wang, Z., Xu, W., Liu, L. & Zhu, T.F. A synthetic molecular system capable of mirror-image genetic replication and transcription. Nature Chemistry, published online ahead of print (2016).
20. du Vigneaud, V. et al. The synthesis of an octapeptide amide with the hormonal activity of oxytocin. Journal of the American Chemical Society 75, 4879-4880 (1953).
21. Hirschmann, R. et al. Studies on total synthesis of an enzyme. V. Preparation of enzymatically active material. Journal of the American Chemical Society 91, 507 (1969).
22. Kent, S.B.H. Chemical synthesis of peptides and proteins. Annual Review of Biochemistry 57, 957-989 (1988).
23. Merrifield, R.B. Solid phase peptide synthesis. 1. Synthesis of a tetrapeptide. Journal of the American Chemical Society 85, 2149 (1963).
24. Mitchell, A.R. Invited Review: Bruce Merrifield and solid-phase peptide synthesis: A historical assessment. Biopolymers 90, 175-184 (2008).
25. Martin, F.G. & Albericio, F. Solid supports for the synthesis of peptides - From the first resin used to the most sophisticated in the market. Chimica Oggi-Chemistry Today 26, 29-34 (2008).
26. Albericio, F. Developments in peptide and amide synthesis. Current Opinion in Chemical Biology 8, 211-221 (2004).
27. Al-Warhi, T.I., Al-Hazimi, H.M.A. & El-Faham, A. Recent development in peptide coupling reagents. Journal of Saudi Chemical Society 16, 97-116 (2012).
28. Schnolzer, M. & Kent, S.B.H. CONSTRUCTING PROTEINS BY DOVETAILING UNPROTECTED SYNTHETIC PEPTIDES - BACKBONE-ENGINEERED HIV PROTEASE. Science 256, 221-225 (1992).
29. Rose, K. FACILE SYNTHESIS OF HOMOGENEOUS ARTIFICIAL PROTEINS. Journal of the American Chemical Society 116, 30-33 (1994).
30. Dawson, P.E., Muir, T.W., Clark-Lewis, I. & Kent, S.B.H. Synthesis of Proteins by Native Chemical Ligation. Science 266, 776-779 (1994).
31. Englebretsen, D.R., Garnham, B.G., Bergman, D.A. & Alewood, P.F. A NOVEL THIOETHER LINKER - CHEMICAL SYNTHESIS OF A HIV-1 PROTEASE ANALOG BY THIOETHER LIGATION. Tetrahedron Letters 36, 8871-8874 (1995).
32. Baca, M., Muir, T.W., Schnolzer, M. & Kent, S.B.H. CHEMICAL LIGATION OF CYSTEINE-CONTAINING PEPTIDES - SYNTHESIS OF A 22-KDA TETHERED DIMER OF HIV-1 PROTEASE. Journal of the American Chemical Society 117, 1881-1887 (1995).
33. Liu, C.F., Rao, C. & Tam, J.P. Orthogonal ligation of unprotected peptide segments through pseudoproline formation for the synthesis of HIV-1 protease. Journal of the American Chemical Society 118, 307-312 (1996).
34. Hackeng, T.M. & Dawson, P.E. Protein synthesis by native chemical ligation: Expanded scope by using straightforward methodology. Proceedings of the National Academy of Sciences of the United States of America 96, 10069-10073 (1999).
35. Bondalapati, S., Jbara, M. & Brik, A. Expanding the chemical toolbox for the synthesis of large and uniquely modified proteins. Nature Chemistry 8, 407-418 (2016).
36. Muir, T.W., Sondhi, D. & Cole, P.A. Expressed protein ligation: A general method for protein engineering. Proceedings of the National Academy of Sciences of the United States of America 95, 6705-6710 (1998).
37. Yan, L.Z. & Dawson, P.E. Synthesis of peptides and proteins without cysteine residues by native chemical ligation combined with desulfurization. Journal of the American Chemical Society 123, 526-533 (2001).
38. Wan, Q. & Danishefsky, S.J. Free-Radical-Based, Specific Desulfurization of Cysteine: A Powerful Advance in the Synthesis of Polypeptides and Glycopolypeptides. Angewandte Chemie International Edition 46, 9248-9252 (2007).
39. Haase, C., Rohde, H. & Seitz, O. Native Chemical Ligation at Valine. Angewandte Chemie International Edition 47, 6807-6810 (2008).
40. Crich, D. & Banerjee, A. Native Chemical Ligation at Phenylalanine. Journal of the American Chemical Society 129, 10064-10065 (2007).
41. Malins, L.R. & Payne, R.J. Modern Extensions of Native Chemical Ligation for Chemical Protein Synthesis. in Protein Ligation and Total Synthesis I, Vol. 362 (ed. Liu, L.) 27-87 (Springer-Verlag Berlin, Berlin, 2015).
42. Zhang, Y., Xu, C., Lam, H.Y., Lee, C.L. & Li, X. Protein chemical synthesis by serine and threonine ligation. Proceedings of the National Academy of Sciences of the United States of America 110, 6657-6662 (2013).
43. Zhang, Y., Malamakal, R.M. & Chenoweth, D.M. Aza-Glycine Induces Collagen Hyperstability. Journal of the American Chemical Society 137, 12422-12425 (2015).
44. Weinstock, M.T., Jacobsen, M.T. & Kay, M.S. Synthesis and folding of a mirror-image enzyme reveals ambidextrous chaperone activity. Proceedings of the National Academy of Sciences of the United States of America 111, 11679-11684 (2014).
45. Zawadzke, L.E. & Berg, J.M. THE STRUCTURE OF A CENTROSYMMETRIC PROTEIN CRYSTAL. Proteins-Structure Function and Genetics 16, 301-305 (1993).
46. Pentelute, B.L. et al. X-ray structure of snow flea antifreeze protein determined by racemic crystallization of synthetic protein enantiomers. Journal of the American Chemical Society 130, 9695-9701 (2008).
47. Mandal, K. et al. Racemic crystallography of synthetic protein enantiomers used to determine the X-ray structure of plectasin by direct methods. Protein Science 18, 1146-1154 (2009).
48. Kochendoerfer, G.G. et al. Total chemical synthesis of the integral membrane protein influenza A virus M2: Role of its C-terminal domain in tetramer assembly. Biochemistry 38, 11905-11913 (1999).
49. Olschewski, D. & Becker, C.F.W. Chemical synthesis and semisynthesis of membrane proteins. Molecular Biosystems 4, 733-740 (2008).
50. Shin, Y. et al. Fmoc-based synthesis of peptide-(alpha)thioesters: Application to the total chemical synthesis of a glycoprotein by native chemical ligation. Journal of the American Chemical Society 121, 11684-11689 (1999).
51. Pratt, M.R., Abeywardana, T. & Marotta, N.P. Synthetic Proteins and Peptides for the Direct Interrogation of alpha-Synuclein Posttranslational Modifications. Biomolecules 5, 1210-1227 (2015).
52. Jenuwein, T. & Allis, C.D. Translating the histone code. Science 293, 1074-1080 (2001).
53. Strahl, B.D. & Allis, C.D. The language of covalent histone modifications. Nature 403, 41-45 (2000).
54. Weisbrod, S. ACTIVE CHROMATIN. Nature 297, 289-295 (1982).
55. Kornberg, R.D. STRUCTURE OF CHROMATIN. Annual Review of Biochemistry 46, 931-954 (1977).
56. Luger, K., Mader, A.W., Richmond, R.K., Sargent, D.F. & Richmond, T.J. Crystal structure of the nucleosome core particle at 2.8 angstrom resolution. Nature 389, 251-260 (1997).
57. Mersfelder, E.L. & Parthun, M.R. The tale beyond the tail: histone core domain modifications and the regulation of chromatin structure. Nucleic Acids Research 34, 2653-2662 (2006).
58. Brehove, M. et al. Histone Core Phosphorylation Regulates DNA Accessibility. Journal of Biological Chemistry 290, 22612-22621 (2015).
59. Manohar, M. et al. Acetylation of Histone H3 at the Nucleosome Dyad Alters DNA-Histone Binding. Journal of Biological Chemistry 284, 23312-23321 (2009).
60. North, J.A. et al. Histone H3 phosphorylation near the nucleosome dyad alters chromatin structure. Nucleic Acids Research 42, 4922-4933 (2014).
61. Riposo, J. & Mozziconacci, J. Nucleosome positioning and nucleosome stacking: two faces of the same coin. Molecular Biosystems 8, 1172-1178 (2012).
62. Scheffer, M.P., Eltsov, M., Bednar, J. & Frangakis, A.S. Nucleosomes stacked with aligned dyad axes are found in native compact chromatin in vitro. Journal of Structural Biology 178, 207-214 (2012).
63. Park, J.H., Cosgrove, M.S., Youngman, E., Wolberger, C. & Boeke, J.D. A core nucleosome surface crucial for transcriptional silencing. Nature Genetics 32, 273-279 (2002).
64. Bannister, A.J. & Kouzarides, T. Regulation of chromatin by histone modifications. Cell Research 21, 381-395 (2011).
65. Zhang, Y. & Reinberg, D. Transcription regulation by histone methylation: interplay between different covalent modifications of the core histone tails. Genes & Development 15, 2343-2360 (2001).
66. Rodriguez, Y., Hinz, J.M. & Smerdon, M.J. Accessing DNA damage in chromatin: Preparing the chromatin landscape for base excision repair. DNA Repair 32, 113-119 (2015).
67. Henikoff, S. & Ahmad, K. Assembly of variant histones into chromatin. in Annual Review of Cell and Developmental Biology, Vol. 21 133-153 (2005).
68. Vardabasso, C. et al. Histone variants: emerging players in cancer biology. Cellular and Molecular Life Sciences 71, 379-404 (2013).
69. Sullivan, K.F., Hechenberger, M. & Masri, K. HUMAN CENP-A CONTAINS A HISTONE H3 RELATED HISTONE FOLD DOMAIN THAT IS REQUIRED FOR TARGETING TO THE CENTROMERE. Journal of Cell Biology 127, 581-592 (1994).
70. Verdaasdonk, J.S. & Bloom, K. Centromeres: unique chromatin structures that drive chromosome segregation. Nature Reviews Molecular Cell Biology 12, 320-332 (2011).
71. Stellfox, M.E., Bailey, A.O. & Foltz, D.R. Putting CENP-A in its place. Cellular and Molecular Life Sciences 70, 387-406 (2013).
72. Rosic, S. & Erhardt, S. No longer a nuisance: long non-coding RNAs join CENP-A in epigenetic centromere regulation. Cellular and Molecular Life Sciences 73, 1387-1398 (2016).
73. Ito, T., Tyler, J.K., Bulger, M., Kobayashi, R. & Kadonaga, J.T. ATP-facilitated chromatin assembly with a nucleoplasmin-like protein from Drosophila melanogaster. Journal of Biological Chemistry 271, 25041-25048 (1996).
74. Loyola, A. & Almouzni, G. Histone chaperones, a supporting role in the limelight. Biochimica Et Biophysica Acta-Gene Structure and Expression 1677, 3-11 (2004).
75. Saha, A., Wittmeyer, J. & Cairns, B.R. Chromatin remodelling: the industrial revolution of DNA around histones. Nature Reviews Molecular Cell Biology 7, 437-447 (2006).
76. Wilson, B.G. & Roberts, C.W.M. SWI/SNF nucleosome remodellers and cancer. Nature Reviews Cancer 11, 481-492 (2011).
77. Javaid, S. et al. Nucleosome remodeling by hMSH2-hMSH6. Molecular Cell 36, 1086-1094 (2009).
78. Polach, K.J. & Widom, J. MECHANISM OF PROTEIN ACCESS TO SPECIFIC DNA-SEQUENCES IN CHROMATIN - A DYNAMIC EQUILIBRIUM-MODEL FOR GENE-REGULATION. Journal of Molecular Biology 254, 130-149 (1995).
79. Li, G., Levitus, M., Bustamante, C. & Widom, J. Rapid spontaneous accessibility of nucleosomal DNA. Nature Structural & Molecular Biology 12, 46-53 (2005).
80. Allfrey, V.G., Faulkner, R. & Mirsky, A.E. ACETYLATION + METHYLATION OF HISTONES + THEIR POSSIBLE ROLE IN REGULATION OF RNA SYNTHESIS. Proceedings of the National Academy of Sciences of the United States of America 51, 786-+ (1964).
81. Goldknopf, I.L. et al. ISOLATION AND CHARACTERIZATION OF PROTEIN-A24, A HISTONE-LIKE NON-HISTONE CHROMOSOMAL PROTEIN. Journal of Biological Chemistry 250, 7182-7187 (1975).
82. Dhall, A. et al. Sumoylated Human Histone H4 Prevents Chromatin Compaction by Inhibiting Long-range Internucleosomal Interactions. Journal of Biological Chemistry 289, 33827-33837 (2014).
83. Singh, M.P., Wijeratne, S.S.K. & Zempleni, J. Biotinylation of lysine 16 in histone H4 contributes toward nucleosome condensation. Archives of Biochemistry and Biophysics 529, 105-111 (2013).
84. Wisniewski, J.R., Zougman, A. & Mann, M. N(epsilon)-Formylation of lysine is a widespread post-translational modification of nuclear proteins occurring at residues involved in regulation of chromatin function. Nucleic Acids Research 36, 570-577 (2008).
85. Nishizuka, Y., Ueda, K., Honjo, T. & Hayaishi, O. ENZYMIC ADENOSINE DIPHOSPHATE RIBOSYLATION OF HISTONE AND POLY ADENOSINE DIPHOSPHATE RIBOSE SYNTHESIS IN RAT LIVER NUCLEI. Journal of Biological Chemistry 243, 3765-& (1968).
86. Tan, M. et al. Identification of 67 Histone Marks and Histone Lysine Crotonylation as a New Type of Histone Modification. Cell 146, 1016-1028 (2011).
87. Wang, H.B. et al. Methylation of histone H4 at arginine 3 facilitating transcriptional activation by nuclear hormone receptor. Science 293, 853-857 (2001).
88. Tanikawa, C. et al. Regulation of histone modification and chromatin structure by the p53-PADI4 pathway. Nature Communications 3 (2012).
89. Ord, M.G. & Stocken, L.A. PHOSPHATE AND THIOL GROUPS IN HISTONE F3 FROM RAT LIVER AND THYMUS NUCLEI. Biochemical Journal 102, 631-& (1967).
90. Sakabe, K., Wang, Z. & Hart, G.W. Beta-N-acetylglucosamine (O-GlcNAc) is part of the histone code. Proceedings of the National Academy of Sciences of the United States of America 107, 19915-19920 (2010).
91. Greenberg, R.A. Histone tails: Directing the chromatin response to DNA damage. Febs Letters 585, 2883-2890 (2011).
92. Voigt, P. & Reinberg, D. Histone tails – ideal motifs for probing epigenetics through chemical biology approaches. ChemBioChem 12, 236-252 (2011).
93. Hung, T. et al. ING4 mediates crosstalk between histone H3 K4 trimethylation and H3 acetylation to attenuate cellular transformation. Molecular Cell 33, 248-256 (2009).
94. Fischle, W., Wang, Y. & Allis, C.D. Histone and chromatin cross-talk. Current Opinion in Cell Biology 15, 172-183 (2003).
95. Schwammle, V., Aspalter, C.M., Sidoli, S. & Jensen, O.N. Large Scale Analysis of Co-existing Post-translational Modifications in Histone Tails Reveals Global Fine Structure of Cross-talk. Molecular & Cellular Proteomics 13, 1855-1865 (2014).
96. Rothbart, S.B. & Strahl, B.D. Interpreting the language of histone and DNA modifications. Biochimica Et Biophysica Acta-Gene Regulatory Mechanisms 1839, 627-643 (2014).
97. Fierz, B. Synthetic Chromatin Approaches To Probe the Writing and Erasing of Histone Modifications. ChemMedChem 9, 495-504 (2014).
98. Torres, I.O. & Fujimori, D.G. Functional coupling between writers, erasers and readers of histone and DNA methylation. Current Opinion in Structural Biology 35, 68-75 (2015).
99. Zentner, G.E. & Henikoff, S. Regulation of nucleosome dynamics by histone modifications. Nature Structural & Molecular Biology 20, 259-266 (2013).
100. Jack, A.P.M. & Hake, S.B. Getting down to the core of histone modifications. Chromosoma 123, 355-371 (2014).
101. North, J.A. et al. Histone H3 phosphorylation near the nucleosome dyad alters chromatin structure. Nucleic Acids Research 42, 4922-4933 (2014).
102. Simon, M. et al. Histone fold modifications control nucleosome unwrapping and disassembly. Proceedings of the National Academy of Sciences of the United States of America 108, 12711-12716 (2011).
103. Chatterjee, N. et al. Histone Acetylation near the Nucleosome Dyad Axis Enhances Nucleosome Disassembly by RSC and SWI/SNF. Molecular and Cellular Biology 35, 4083-4092 (2015).
104. Su, X., Ren, C. & Freitas, M.A. Mass spectrometry-based strategies for characterization of histones and their post-translational modifications. Expert Review of Proteomics 4, 211-225 (2007).
105. Zhao, Y. & Garcia, B.A. Comprehensive Catalog of Currently Documented Histone Modifications. Cold Spring Harbor Perspectives in Biology 7, a025064 (2015).
106. Pick, H., Kilic, S. & Fierz, B. Engineering chromatin states: Chemical and synthetic biology approaches to investigate histone modification function. Biochimica Et Biophysica Acta-Gene Regulatory Mechanisms 1839, 644-656 (2014).
107. Megee, P.C., Morgan, B.A., Mittman, B.A. & Smith, M.M. GENETIC-ANALYSIS OF HISTONE-H4 - ESSENTIAL ROLE OF LYSINES SUBJECT TO REVERSIBLE ACETYLATION. Science 247, 841-845 (1990).
108. Zhao, Y. et al. SITE-DIRECTED MUTAGENESIS OF PHOSPHORYLATION SITES OF THE BRANCHED-CHAIN ALPHA-KETOACID DEHYDROGENASE COMPLEX. Journal of Biological Chemistry 269, 18583-18587 (1994).
109. Matsubara, K., Sano, N., Umehara, T. & Horikoshi, M. Global analysis of functional surfaces of core histones with comprehensive point mutants. Genes to Cells 12, 13-33 (2007).
110. North, J.A. et al. Phosphorylation of histone H3(T118) alters nucleosome dynamics and remodeling. Nucleic Acids Research 39, 6465-6474 (2011).
111. Normanly, J., Kleina, L.G., Masson, J.M., Abelson, J. & Miller, J.H. CONSTRUCTION OF ESCHERICHIA-COLI AMBER SUPPRESSOR TRANSFER-RNA GENES .3. DETERMINATION OF TRANSFER-RNA SPECIFICITY. Journal of Molecular Biology 213, 719-726 (1990).
112. Wang, L., Magliery, T.J., Liu, D.R. & Schultz, P.G. A new functional suppressor tRNA/aminoacyl-tRNA synthetase pair for the in vivo incorporation of unnatural amino acids into proteins. Journal of the American Chemical Society 122, 5010-5011 (2000).
113. Neumann, H. et al. A Method for Genetically Installing Site-Specific Acetylation in Recombinant Histones Defines the Effects of H3 K56 Acetylation. Molecular Cell 36, 153-163 (2009).
114. Nguyen, D.P., Alai, M.M.G., Virdee, S. & Chin, J.W. Genetically Directing ɛ-N,N-Dimethyl-l-Lysine in Recombinant Histones. Chemistry & Biology 17, 1072-1076 (2010).
115. Wang, Y.-S. et al. A genetically encoded photocaged Nε-methyl-l-lysine. Molecular BioSystems 6, 1557 (2010).
116. Yang, R., Pasunooti, K.K., Li, F., Liu, X.-W. & Liu, C.-F. Dual Native Chemical Ligation at Lysine. Journal of the American Chemical Society 131, 13592-13593 (2009).
117. Li, F. et al. A Direct Method for Site-Specific Protein Acetylation. Angewandte Chemie-International Edition 50, 9611-9614 (2011).
118. Yang, R., Bi, X., Li, F., Cao, Y. & Liu, C.-F. Native chemical ubiquitination using a genetically incorporated azidonorleucine. Chemical Communications 50, 7971 (2014).
119. Lee, S. et al. A Facile Strategy for Selective Incorporation of Phosphoserine into Histones. Angewandte Chemie International Edition 52, 5771-5775 (2013).
120. Guo, J., Wang, J., Lee, J.S. & Schultz, P.G. Site-Specific Incorporation of Methyl- and Acetyl-Lysine Analogues into Recombinant Proteins. Angewandte Chemie International Edition 47, 6399-6401 (2008).
121. Chalker, J.M., Lercher, L., Rose, N.R., Schofield, C.J. & Davis, B.G. Conversion of cysteine into dehydroalanine enables access to synthetic histones bearing diverse post-translational modifications. Angewandte Chemie International Edition 51, 1835-1839 (2012).
122. Wang, Z.U. et al. A Facile Method to Synthesize Histones with Posttranslational Modification Mimics. Biochemistry 51, 5232-5234 (2012).
123. Bernardes, G.J.L., Chalker, J.M., Errey, J.C. & Davis, B.G. Facile Conversion of Cysteine and Alkyl Cysteines to Dehydroalanine on Protein Surfaces: Versatile and Switchable Access to Functionalized Proteins. Journal of the American Chemical Society 130, 5052-5053 (2008).
124. Simon, M.D. et al. The Site-Specific Installation of Methyl-Lysine Analogs into Recombinant Histones. Cell 128, 1003-1012 (2007).
125. Huang, R. et al. Site-Specific Introduction of an Acetyl-Lysine Mimic into Peptides and Proteins by Cysteine Alkylation. Journal of the American Chemical Society 132, 9986-9987 (2010).
126. Le, D.D., Cortesi, A.T., Myers, S.A., Burlingame, A.L. & Fujimori, D.G. Site-Specific and Regiospecific Installation of Methylarginine Analogues into Recombinant Histones and Insights into Effector Protein Binding. Journal of the American Chemical Society 135, 2879-2882 (2013).
127. Jia, G. et al. A systematic evaluation of the compatibility of histones containing methyl-lysine analogues with biochemical reactions. Cell Research 19, 1217-1220 (2009).
128. Chen, Z., Grzybowski, A.T. & Ruthenburg, A.J. Traceless Semisynthesis of a Set of Histone 3 Species Bearing Specific Lysine Methylation Marks. ChemBioChem 15, 2071-2075 (2014).
129. Chatterjee, A., McGinty, R.K., Fierz, B. & Muir, T.W. Disulfide-directed histone ubiquitylation reveals plasticity in hDot1L activation. Nature Chemical Biology 6, 267-269 (2010).
130. Whitcomb, S.J. et al. Histone monoubiquitylation position determines specificity and direction of enzymatic cross-talk with histone methyltransferases Dot1L and PRC2. Journal of Biological Chemistry 287, 23718-23725 (2012).
131. Davis, L. & Chin, J.W. Designer proteins: applications of genetic code expansion in cell biology. Nature Reviews Molecular Cell Biology 13, 168-182 (2012).
132. Shimko, J.C., Howard, C.J., Poirier, M.G. & Ottesen, J.J. Preparing Semisynthetic and Fully Synthetic Histones H3 and H4 to Modify the Nucleosome Core. Methods in Molecular Biology 981, 177-192 (2013).
133. Wang, Z. et al. Combinatorial patterns of histone acetylations and methylations in the human genome. Nature Genetics 40, 897-903 (2008).
134. Meledin, R., Mali, S.M. & Brik, A. Pushing the Boundaries of Chemical Protein Synthesis: The Case of Ubiquitin Chains and Polyubiquitinated Peptides and Proteins. Chemical Record 16, 509-519 (2016).
135. Raibaut, L., Ollivier, N. & Melnyk, O. Sequential native peptide ligation strategies for total chemical protein synthesis. Chemical Society Reviews 41, 7001 (2012).
136. Blanco-Canosa, J.B. & Dawson, P.E. An Efficient Fmoc-SPPS Approach for the Generation of Thioester Peptide Precursors for Use in Native Chemical Ligation. Angewandte Chemie International Edition 47, 6851-6855 (2008).
137. Becker, C.F.W. et al. Total chemical synthesis of a functional interacting protein pair: The protooncogene H-Ras and the Ras-binding domain of its effector c-Raf1. Proceedings of the National Academy of Sciences of the United States of America 100, 5075-5080 (2003).
138. Villain, M., Vizzavona, J. & Rose, K. Covalent capture: a new tool for the purification of synthetic and recombinant polypeptides. Chemistry & Biology 8, 673-679 (2001).
139. Bang, D. & Kent, S.B.H. A one-pot total synthesis of crambin. Angewandte Chemie-International Edition 43, 2534-2538 (2004).
140. Hojo, H. et al. Application of a novel thioesterification reaction to the synthesis of chemokine CCL27 by the modified thioester method. Organic & Biomolecular Chemistry 6, 1808-1813 (2008).
141. Kawakami, T. & Aimoto, S. Sequential peptide ligation by using a controlled cysteinyl prolyl ester (CPE) autoactivating unit. Tetrahedron Letters 48, 1903-1905 (2007).
142. Ollivier, N., Dheur, J., Mhidia, R., Blanpain, A. & Melnyk, O. Bis(2-sulfanylethyl)amino Native Peptide Ligation. Organic Letters 12, 5238-5241 (2010).
143. Fang, G.M. et al. Protein Chemical Synthesis by Ligation of Peptide Hydrazides. Angewandte Chemie-International Edition 50, 7645-7649 (2011).
144. Li, J. et al. One-pot native chemical ligation of peptide hydrazides enables total synthesis of modified histones. Organic & Biomolecular Chemistry 12, 5435 (2014).
145. Bang, D., Pentelute, B.L. & Kent, S.B.H. Kinetically controlled ligation for the convergent chemical synthesis of proteins. Angewandte Chemie-International Edition 45, 3985-3988 (2006).
146. Siman, P., Karthikeyan, S.V., Nikolov, M., Fischle, W. & Brik, A. Convergent Chemical Synthesis of Histone H2B Protein for the Site-Specific Ubiquitination at Lys34. Angewandte Chemie-International Edition 52, 8059-8063 (2013).
147. Fang, G.M., Wang, J.X. & Liu, L. Convergent chemical synthesis of proteins by ligation of peptide hydrazides. Angewandte Chemie International Edition 51, 10347-10350 (2012).
148. Canne, L.E. et al. Chemical protein synthesis by solid phase ligation of unprotected peptide segments. Journal of the American Chemical Society 121, 8720-8727 (1999).
149. Camarero, J.A., Cotton, G.J., Adeva, A. & Muir, T.W. Chemical ligation of unprotected peptides directly from a solid support. Journal of Peptide Research 51, 303-316 (1998).
150. Raibaut, L. et al. Highly efficient solid phase synthesis of large polypeptides by iterative ligations of bis(2-sulfanylethyl)amido (SEA) peptide segments. Chemical Science 4, 4061-4066 (2013).
151. Bang, D. & Kent, S.B.H. His(6) tag-assisted chemical protein synthesis. Proceedings of the National Academy of Sciences of the United States of America 102, 5014-5019 (2005).
152. Brik, A., Keinan, E. & Dawson, P.E. Protein synthesis by solid-phase chemical ligation using a safety catch linker. Journal of Organic Chemistry 65, 3829-3835 (2000).
153. Jbara, M., Seenaiah, M. & Brik, A. Solid phase chemical ligation employing a rink amide linker for the synthesis of histone H2B protein. Chemical Communications 50, 12534-12537 (2014).
154. Rink, H. SOLID-PHASE SYNTHESIS OF PROTECTED PEPTIDE-FRAGMENTS USING A TRIALKOXY-DIPHENYL-METHYLESTER RESIN. Tetrahedron Letters 28, 3787-3790 (1987).
155. Wang, S.S. PARA-ALKOXYBENZYL ALCOHOL RESIN AND PARA-ALKOXYBENZYLOXYCARBONYLHYDRAZIDE RESIN FOR SOLID-PHASE SYNTHESIS OF PROTECTED PEPTIDE FRAGMENTS. Journal of the American Chemical Society 95, 1328-1333 (1973).
156. Luger, K., Rechsteiner, T.J. & Richmond, T.J. Preparation of Nucleosome Core Particle from Recombinant Histones. Methods in Enzymology 304, 1-19 (1999).
157. Lowary, P.T. & Widom, J. New DNA sequence rules for high affinity binding to histone octamer and sequence-directed nucleosome positioning. Journal of Molecular Biology 276, 19-42 (1998).
158. Tanaka, Y. et al. Expression and purification of recombinant human histones. Methods 33, 3-11 (2004).
159. Pace, C.N., Vajdos, F., Fee, L., Grimsley, G. & Gray, T. HOW TO MEASURE AND PREDICT THE MOLAR ABSORPTION-COEFFICIENT OF A PROTEIN. Protein Science 4, 2411-2423 (1995).
160. Yu, R.R. et al. Hybrid phase ligation for efficient synthesis of histone proteins. Organic & Biomolecular Chemistry 14, 2603-2607 (2016).
161. Choma, C.T., Robillard, G.T. & Englebretsen, D.R. Synthesis of hydrophobic peptides: An Fmoc "Solubilising Tail" method. Tetrahedron Letters 39, 2417-2420 (1998).
162. Harris, P.W.R. & Brimble, M.A. Toward the Total Chemical Synthesis of the Cancer Protein NY-ESO-1. Biopolymers 94, 542-550 (2010).
163. Meldal, M. PEGA - A FLOW STABLE POLYETHYLENE-GLYCOL DIMETHYL ACRYLAMIDE COPOLYMER FOR SOLID-PHASE SYNTHESIS. Tetrahedron Letters 33, 3077-3080 (1992).
164. Kunys, A.R., Lian, W. & Pei, D. Specificity Profiling of Protein-Binding Domains Using One-Bead-One-Compound Peptide Libraries. Current Protocols in Chemical Biology 4, 331-355 (2012).
165. Johnson, E.C.B. & Kent, S.B.H. Towards the total chemical synthesis of integral membrane proteins: a general method for the synthesis of hydrophobic peptide-(alpha)thioester building blocks. Tetrahedron Letters 48, 1795-1799 (2007).
166. Song, O.K., Wang, X.R., Waterborg, J.H. & Sternglanz, R. An N-alpha-acetyltransferase responsible for acetylation of the N-terminal residues of histones H4 and H2A. Journal of Biological Chemistry 278, 38109-38112 (2003).
167. Schnolzer, M., Alewood, P., Jones, A., Alewood, D. & Kent, S.B.H. INSITU NEUTRALIZATION IN BOC-CHEMISTRY SOLID-PHASE PEPTIDE-SYNTHESIS - RAPID, HIGH-YIELD ASSEMBLY OF DIFFICULT SEQUENCES. International Journal of Peptide and Protein Research 40, 180-193 (1992).
168. Fields, G.B. & Noble, R.L. SOLID-PHASE PEPTIDE-SYNTHESIS UTILIZING 9-FLUORENYLMETHOXYCARBONYL AMINO-ACIDS. International Journal of Peptide and Protein Research 35, 161-214 (1990).
169. Maede, V., Els-Heindl, S. & Beck-Sickinger, A.G. Automated solid-phase peptide synthesis to obtain therapeutic peptides. Beilstein Journal of Organic Chemistry 10, 1197-1212 (2014).
170. Mende, F. & Seitz, O. 9-Fluorenylmethoxycarbonyl-Based Solid-Phase Synthesis of Peptide alpha-Thioesters. Angewandte Chemie-International Edition 50, 1232-1240 (2011).
171. Li, X.Q., Kawakami, T. & Aimoto, S. Direct preparation of peptide thioesters using an Fmoc solid-phase method. Tetrahedron Letters 39, 8669-8672 (1998).
172. Mezo, A.R., Ottesen, J.J. & Imperiali, B. Discovery and characterization of a discretely folded homotrimeric beta beta alpha peptide. Journal of the American Chemical Society 123, 1002-1003 (2001).
173. Botti, P., Villain, M., Manganiello, S. & Gaertner, H. Native chemical ligation through in situ O to S acyl shift. Organic Letters 6, 4861-4864 (2004).
174. Terrier, V.P., Adihou, H., Arnould, M., Delmas, A.F. & Aucagne, V. A straightforward method for automated Fmoc-based synthesis of bio-inspired peptide crypto-thioesters. Chemical Science 7, 339-345 (2016).
175. Mahto, S.K., Howard, C.J., Shimko, J.C. & Ottesen, J.J. A Reversible Protection Strategy To Improve Fmoc-SPPS of Peptide Thioesters by the N-Acylurea Approach. ChemBioChem 12, 2488-2494 (2011).
176. White, P.D. & Behrendt, R. Practical aspects of the use of the Dbz linker for making thioesters by Fmoc SPPS. Journal of Peptide Science 16, 71-72 (2010).
177. Morley, A.D. Allyloxycarbonyl - a useful protecting group for phenolic amino acids and applications on solid support. Tetrahedron Letters 41, 7401-7404 (2000).
178. Blanco-Canosa, J.B., Nardone, B., Albericio, F. & Dawson, P.E. Chemical Protein Synthesis Using a Second-Generation N-Acylurea Linker for the Preparation of Peptide-Thioester Precursors. Journal of the American Chemical Society 137, 7197-7209 (2015).
179. Kovacs, J., Kim, S., Holleran, E. & Gorycki, P. STUDIES ON THE RACEMIZATION AND COUPLING OF N-ALPHA,NIM-PROTECTED HISTIDINE ACTIVE ESTERS. Journal of Organic Chemistry 50, 1497-1504 (1985).
180. Han, Y.X., Albericio, F. & Barany, G. Occurrence and minimization of cysteine racemization during stepwise solid-phase peptide synthesis. Journal of Organic Chemistry 62, 4307-4312 (1997).
181. Windridge, G.C. & Jorgensen, E.C. 1-HYDROXYBENZOTRIAZOLE AS A RACEMIZATION-SUPPRESSING REAGENT FOR INCORPORATION OF IM-BENZYL-L-HISTIDINE INTO PEPTIDES. Journal of the American Chemical Society 93, 6318-& (1971).
182. Palmer, D.K., Oday, K., Wener, M.H., Andrews, B.S. & Margolis, R.L. A 17-KD CENTROMERE PROTEIN (CENP-A) COPURIFIES WITH NUCLEOSOME CORE PARTICLES AND WITH HISTONES. Journal of Cell Biology 104, 805-815 (1987).
183. Yoda, K. et al. Human centromere protein A (CENP-A) can replace histone H3 in nucleosome reconstitution in vitro. Proceedings of the National Academy of Sciences of the United States of America 97, 7266-7271 (2000).
184. Earnshaw, W.C. Discovering centromere proteins: from cold white hands to the A, B, C of CENPs. Nature Reviews Molecular Cell Biology 16, 443-449 (2015).
185. McKinley, K.L. & Cheeseman, I.M. The molecular basis for centromere identity and function. Nature Reviews Molecular Cell Biology 17 (2016).
186. Tomonaga, T. et al. Overexpression and mistargeting of centromere protein-A in human primary colorectal cancer. Cancer Research 63, 3511-3516 (2003).
187. Amato, A., Schillaci, T., Lentini, L. & Di Leonardo, A. CENPA overexpression promotes genome instability in pRb-depleted human cells. Molecular Cancer 8 (2009).
188. Dimitriadis, E.K., Weber, C., Gill, R.K., Diekmann, S. & Dalal, Y. Tetrameric organization of vertebrate centromeric nucleosomes. Proceedings of the National Academy of Sciences of the United States of America 107, 20317-20322 (2010).
189. Henikoff, S. et al. The budding yeast Centromere DNA Element II wraps a stable Cse4 hemisome in either orientation in vivo. eLife 3, 1-23 (2014).
190. Bui, M., Walkiewicz, M.P., Dimitriadis, E.K. & Dalal, Y. The CENP-A nucleosome: A battle between Dr Jekyll and Mr Hyde. Nucleus-Austin 4, 37-42 (2013).
191. Camahort, R. et al. Cse4 Is Part of an Octameric Nucleosome in Budding Yeast. Molecular Cell 35, 794-805 (2009).
192. Padeganeh, A. et al. Octameric CENP-A Nucleosomes Are Present at Human Centromeres throughout the Cell Cycle. Current Biology 23, 764-769 (2013).
193. Hasson, D. et al. The octamer is the major form of CENP-A nucleosomes at human centromeres. Nature Structural & Molecular Biology 20, 687-+ (2013).
194. Tachiwana, H. et al. Crystal structure of the human centromeric nucleosome containing CENP-A. Nature 476, 232-235 (2011).
195. Zeitlin, S.G., Barber, C.M., Allis, C.D. & Sullivan, K.F. Differential regulation of CENP-A and histone H3 phosphorylation in G2/M. Journal of Cell Science 114, 653-661 (2000).
196. Niikura, Y. et al. CENP-A K124 Ubiquitylation Is Required for CENP-A Deposition at the Centromere. Developmental Cell 32, 589-603 (2015).
197. Bailey, A.O. et al. Posttranslational modification of CENP-A influences the conformation of centromeric chromatin. Proceedings of the National Academy of Sciences of the United States of America 110, 11827-11832 (2013).
198. Yu, Z. et al. Dynamic Phosphorylation of CENP-A at Ser68 Orchestrates Its Cell-Cycle-Dependent Deposition at Centromeres. Developmental Cell 32, 68-81 (2015).
199. Bui, M. et al. Cell-Cycle-Dependent Structural Transitions in the Human CENP-A Nucleosome In Vivo. Cell 150, 317-326 (2012).
200. Winogradoff, D., Zhao, H.Q., Dalal, Y. & Papoian, G.A. Shearing of the CENP-A dimerization interface mediates plasticity in the octameric centromeric nucleosome. Scientific Reports 5 (2015).
201. Bui, M. et al. unpublished manuscript. (2016).
202. Carroll, C.W., Milks, K.J. & Straight, A.F. Dual recognition of CENP-A nucleosomes is required for centromere assembly. Journal of Cell Biology 189, 1143-1155 (2010).
203. Fierz, B., Kilic, S., Hieb, A.R., Luger, K. & Muir, T.W. Stability of Nucleosomes Containing Homogenously Ubiquitylated H2A and H2B Prepared Using Semisynthesis. Journal of the American Chemical Society 134, 19548-19551 (2012).
204. Dalal, Y. Unpublished work.
205. Zuker, M. Mfold web server for nucleic acid folding and hybridization prediction. Nucleic Acids Research 31, 3406-3415 (2003).
206. Siman, P. et al. Chemical Synthesis and Expression of the HIV-1 Rev Protein. ChemBioChem 12, 1097-1104 (2011).
207. Vilsmeier, A. & Haack, A. The effect of halogen phosphor on alkyl formanilide - A new method for the characterisation of secondary and tertiary p-alkylamino-benzaldehyde. Berichte Der Deutschen Chemischen Gesellschaft 60, 119-122 (1927).
208. Sobel, R.E., Cook, R.G., Perry, C.A., Annunziato, A.T. & Allis, C.D. CONSERVATION OF DEPOSITION-RELATED ACETYLATION SITES IN NEWLY SYNTHESIZED HISTONES H3 AND H4. Proceedings of the National Academy of Sciences of the United States of America 92, 1237-1241 (1995).
209. Ye, J.X. et al. Histone H4 lysine 91 acetylation: A core domain modification associated with chromatin assembly. Molecular Cell 18, 123-130 (2005).
210. Yang, X.H. et al. HAT4, a Golgi Apparatus-Anchored B-Type Histone Acetyltransferase, Acetylates Free Histone H4 and Facilitates Chromatin Assembly. Molecular Cell 44, 39-50 (2011).
211. Ge, Z.Q. et al. Sites of Acetylation on Newly Synthesized Histone H4 Are Required for Chromatin Assembly and DNA Damage Response Signaling. Molecular and Cellular Biology 33, 3286-3298 (2013).
212. Iwasaki, W. et al. Comprehensive Structural Analysis of Mutant Nucleosomes Containing Lysine to Glutamine (KQ) Substitutions in the H3 and H4 Histone-Fold Domains. Biochemistry 50, 7822-7832 (2011).
213. Park, Y.J., Chodaparambil, J.V., Bao, Y.H., McBryant, S.J. & Luger, K. Nucleosome assembly protein 1 exchanges histone H2A-H2B dimers and assists nucleosome sliding. Journal of Biological Chemistry 280, 1817-1825 (2005).
214. Seenaiah, M., Jbara, M., Mali, S.M. & Brik, A. Convergent Versus Sequential Protein Synthesis: The Case of Ubiquitinated and Glycosylated H2B. Angewandte Chemie-International Edition 54, 12374-12378 (2015).
215. Wang, S.S. Solid-phase synthesis of protected peptide hydrazides - preparation and application of hydroxymethyl resin and 3-(p-benzyloxyphenyl)-1,1-dimethylpropyloxycarbonylhydrazide resin. Journal of Organic Chemistry 40, 1235-1239 (1975).
216. Bello, C., Kikul, F. & Becker, C.F.W. Efficient generation of peptide hydrazides via direct hydrazinolysis of Peptidyl-Wang-TentaGel resins. Journal of Peptide Science 21, 201-207 (2015).
217. Dawson, P.E. Personal Communication. (2015).
218. Wang, J.-X. et al. Peptide o-Aminoanilides as Crypto-Thioesters for Protein Chemical Synthesis. Angewandte Chemie-International Edition 54, 2194-2198 (2015).
Appendix A: Standard Laboratory Solutions
15% acrylamide gel
1.77 mL H2O
1.88 mL 40% Acrylamide/Bisacrylamide Solution (37.5:1)
1.25 mL 1.5 M Tris pH 8.8
50 µL 10% SDS
50 µL 10% ammonium persulfate
2 µL tetramethylethylenediamine (TEMED)
Stacking gel (5% acrylamide)
605 µL H2O
125 µL 40% Acrylamide/Bisacrylamide Solution (37.5:1)
250 µL 0.5 M Tris pH 6.8
10 µL 10% SDS
10 µL 10% ammonium persulfate
1 µL TEMED
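The acrylamide volumes in the two gel recipes above follow from the usual dilution relation C1 x V1 = C2 x V2. The short sketch below checks that arithmetic. The function name is hypothetical, and the total volumes of 5 mL (resolving gel) and 1 mL (stacking gel) are assumptions taken from the approximate sum of the listed components, not values stated in the recipes.

```python
def stock_volume_ml(stock_percent, final_percent, total_volume_ml):
    """Volume of stock (mL) needed so the final mix is at final_percent.

    Uses C1 * V1 = C2 * V2, solved for V1.
    """
    return final_percent * total_volume_ml / stock_percent


# Assumed totals: about 5 mL for the 15% gel and about 1 mL for the 5% stacking gel.
resolving_ml = stock_volume_ml(40.0, 15.0, 5.0)  # close to the 1.88 mL listed
stacking_ml = stock_volume_ml(40.0, 5.0, 1.0)    # matches the 125 µL listed

print(f"15% gel: {resolving_ml:.3f} mL of 40% stock")
print(f"5% stacking gel: {stacking_ml * 1000:.0f} µL of 40% stock")
```

The same helper can be reused to scale either recipe to a different total volume, provided the buffer, SDS, ammonium persulfate, and TEMED volumes are scaled by the same factor.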
6 x SDS loading buffer
To make 10 mL of 6 x SDS loading buffer, the following was measured:
1.2 g Sodium dodecyl sulfate (SDS)
6 mg Bromophenol blue
4.7 mL Glycerol
1.2 mL 0.5 M Tris pH 6.8
2.1 mL H2O
1 M Dithiothreitol (DTT) was added to a 1 mL aliquot of 6 x SDS buffer to a final concentration of 100 mM.
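The recipe does not state the volume of 1 M DTT added to the aliquot. As an illustration only, the sketch below estimates that volume while accounting for the dilution caused by the addition itself; the function name is hypothetical, and in practice a rounded volume (for example 100 µL) may well have been used.

```python
def additive_volume_ml(stock_molar, final_molar, aliquot_ml):
    """Volume of stock (mL) to add to an aliquot to reach final_molar.

    Solves stock * v = final * (aliquot + v) for v, so the added volume
    is included in the final volume.
    """
    if final_molar >= stock_molar:
        raise ValueError("final concentration must be below the stock concentration")
    return final_molar * aliquot_ml / (stock_molar - final_molar)


# 1 M DTT stock, 100 mM target, 1 mL aliquot of 6 x SDS buffer
dtt_ml = additive_volume_ml(1.0, 0.1, 1.0)
print(f"Add about {dtt_ml * 1000:.0f} µL of 1 M DTT")
```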
SDS-PAGE running buffer
5 x Tris Gly Buffer was prepared using the following:
60.4 g Tris base
376 g glycine
Volume was brought up to 4 L with H2O.
1 x Tris Gly was prepared by making a 1/5 dilution of 5 x Tris Gly.
Coomassie Stain
2 g of Coomassie Brilliant Blue was dissolved in 4 L of 45% methanol, 10% glacial acetic acid, and 45% H2O.
Destain
4 L was prepared with 47.5% methanol, 10% acetic acid, and 42.5% H2O.
LB growth media
20 g LB Broth Lennox (Fisher Scientific) was resuspended in 1 L H2O and autoclaved. For LB Amp media, ampicillin was added to the autoclaved LB media to a final concentration of 100 µg/mL.
SOC growth media
0.5% yeast extract
2% Tryptone
10 mM NaCl
2.5 mM KCl
10 mM MgCl2
10 mM MgSO4
20 mM Glucose (added after autoclaving)