ABSTRACT

TORRES, JESSICA. Synthesis of Unnatural Amino Acids for Protein Labeling and Activation. (Under the direction of Alexander Deiters).

Site-specific incorporation of unnatural amino acids into proteins, both in vivo and in vitro, is a promising technology with tremendous potential to advance studies in protein structure and function. The technique allows the incorporation of a vast diversity of functional groups that extends beyond the conventional mutagenesis of the twenty common amino acids. The rapid development of orthogonal PylRS/PylT pairs has resulted in an increasing number of novel unnatural amino acids that can be site-specifically introduced into proteins. This dissertation presents the syntheses of several unnatural amino acids for use in protein labeling and activation.

A bipyridine lysine was synthesized for the assembly of metal-binding proteins, and its genetic encoding is shown. In addition, unnatural amino acids bearing reactive functionalities for the selective labeling of proteins were assembled. These include a variety of lysine analogs that can be applied to carbonyl/aminooxy condensations, thiol-ene reactions, and Diels-Alder cycloadditions. Moreover, bioorthogonal reaction partners for subsequent protein labeling, such as aminooxy dyes, thiol probes, and tetrazine probes, were prepared.

For the photoregulation of protein activity in live cells, several caged tyrosine analogs were synthesized. The assembly of caged phosphoryl tyrosines to study tyrosine phosphorylation on proteins by light activation is presented, as well as the synthesis of tyrosine derivatives bearing ortho-nitrobenzyl caging groups to improve the decaging kinetics and bioavailability of photocaged tyrosines. In addition, an isotope-labeled, photolabile tyrosine was synthesized as a biophysical probe to study protein structure and dynamics by infrared spectroscopy.

Lastly, for the light-triggered regulation of oligonucleotides and gene expression, a caged thymidine phosphoramidite bearing a norbornene was synthesized. This synthetic monomer can be incorporated into oligonucleotides and serves two functions: it allows selective post-synthetic modification of an oligonucleotide, and it provides precise control over oligonucleotide activation by light.

Synthesis of Unnatural Amino Acids for Protein Labeling and Activation

by Jessica Torres

A dissertation submitted to the Graduate Faculty of North Carolina State University in partial fulfillment of the requirements for the degree of Doctor of Philosophy

Chemistry

Raleigh, North Carolina

2014

APPROVED BY:

Dr. Alexander Deiters, Committee Co-chair
Dr. Christian Melander, Committee Co-chair
Dr. Daniel L. Comins, Committee Member
Dr. Gavin Williams, Committee Member

DEDICATION

To my parents, who guided me to where I am today.

DEDICATORIA

A mis padres, quienes me encaminaron a donde estoy hoy.


BIOGRAPHY

Jessica Torres was born on January 20, 1988 to Jose and Gerlinde Torres in Boston, MA. Jessica has an older brother, Alexander, and older and younger sisters, Erika and Jomarie. Her parents moved to Villalba, PR, where she was raised and graduated from Lysander Borrero Terry High School in 2005. She then moved to Cayey, PR to attend the University of Puerto Rico at Cayey, pursuing her Bachelor of Science degree in Chemistry. During this time she worked in the lab of Dr. Elba Reyes as an awardee of a BioMINDS scholarship by AMGEN and won three summer research internships: one at the University of Pittsburgh School of Pharmacy under the supervision of Dr. Wen Xie in 2007, and the other two at North Carolina State University with Dr. Alexander Deiters in 2008 and Dr. Christopher Gorman in 2009. After graduating magna cum laude in the top three of her class in 2009, she moved to Raleigh, NC to continue her studies at North Carolina State University in the Ph.D. program in Chemistry. In 2010, Jessica received a National Science Foundation Graduate Research Fellowship and has worked under the supervision of Prof. Alexander Deiters on the synthesis of novel unnatural amino acids for the expression of proteins with new function.


ACKNOWLEDGEMENTS

I would first like to thank my parents for their love and support, and for always believing in me and motivating me to chase my dreams. I would like to acknowledge my brother and sisters: you guys are my best friends for life, and even at a distance I could always feel your love and support.

I wish to thank Dr. Elba Reyes, Dr. Rene Rodriguez, Dr. Robert Ross, Dr. Maria Oliver-Hoyo, and Alison Wynn, who believed in me and in one way or another contributed to my decision to pursue my PhD. I would like to thank my friend Efrain Rivera Serrano for our journey together from Puerto Rico to Raleigh and for helping me find a little of home in all of our talks!

I wish to give very special thanks to my advisor Alexander Deiters for his guidance and high expectations. All that I have learned from him in the past five years has made me a better scientist. He has been very supportive, and his enthusiasm for research was always a great motivation to keep me going with my projects. I would like to thank the past and present members of the Deiters lab that I have had the pleasure of working with. To Doug, for being an awesome mentor during my undergraduate research summer internship, an experience that contributed to my decision to join the lab. To Yan, Andrew, Meryl, Matt, Subhas, and Qingyang, for being great hood neighbors, for their fun talks, and for their knowledge of chemistry. To Rajendra, for his wise and long talks, and his impressive expertise in chemistry. Thanks to Hank, Alex P., Ji, and Jihe for their contributions to my projects, as well as James and Sander for their talks and knowledge of biology. I would like to thank Ana and Luis for being great summer students, and I wish good luck to Luis, who is now joining the lab! Special thanks to the ladies, Colleen, Jeane, Laura, Jie, and Kalyn, for the great moments we shared inside and outside of lab. I want to add a special shout-out to Colleen and Qingyang for their very appreciated company after the lab moved to Pittsburgh, to Sarah for helping me around with chemistry and lab cleanup, and to Taylor for being a great friend, keeping me sane, and teaching me some English. I would also like to thank Robin Tanner and Dr. Christian Melander for always being friendly faces on the fifth floor and checking up on those of us left at NCSU. Thanks also to Dr. Comins, Dr. Williams, and Dr. Lalush for serving as committee members.

I would like to acknowledge the Chin lab for research collaborations. And finally, I would like to acknowledge the National Science Foundation Graduate Research Fellowship for financial support.


TABLE OF CONTENTS

LIST OF FIGURES
LIST OF SCHEMES
LIST OF ABBREVIATIONS

CHAPTER 1: GENETIC CODE EXPANSION WITH UNNATURAL AMINO ACIDS
1.1 Introduction
1.2 Protein modification via genetic encoding of unnatural amino acids
1.3 Genetic code expansion using orthogonal aminoacyl-tRNA synthetase/aminoacyl-tRNA pairs
1.4 Pyrrolysyl-tRNA synthetase as a genetic code expansion tool
1.5 Summary

CHAPTER 2: SYNTHESIS AND GENETIC ENCODING OF LYSINE ANALOGS FOR PROTEIN LABELING
2.1 Genetic code expansion with a bipyridine lysine
2.1.1 Synthesis of a bipyridine lysine
2.1.2 Genetic encoding of a bipyridine lysine
2.2 Labeling of proteins via bioorthogonal reactions
2.3 Genetic code expansion with unnatural amino acids for protein labeling via aldehyde/ketone condensations
2.3.1 Synthesis of ketone and diol functionalized unnatural amino acids, and hydroxylamine probes
2.3.2 Genetic encoding of ketone and diol functionalized unnatural amino acids
2.4 Genetic code expansion with alkene lysines for protein labeling
2.4.1 Synthesis of alkene lysines
2.4.2 Genetic encoding of alkene lysines
2.4.3 Synthesis of alkene-reactive probes for thiol-ene reactions
2.4.4 Site-specific protein labeling via the thiol-ene reaction
2.4.5 Synthesis of alkene-reactive probes for photo-click reactions
2.5 Genetic code expansion with diene lysines
2.5.1 Synthesis of 1,3-diene lysines
2.5.2 Genetic encoding of 1,3-diene lysines
2.6 Genetic code expansion with norbornene lysine and protein labeling
2.6.1 Synthesis of a norbornene lysine
2.6.2 Genetic encoding of a norbornene lysine
2.6.3 Synthesis of tetrazine probes for protein labeling
2.7 Summary
2.8 Experimental data for synthesized compounds

CHAPTER 3: SYNTHESIS OF CAGED TYROSINES FOR THE PHOTOREGULATION OF PROTEINS
3.1 Introduction to caging groups
3.2 Tyrosine in protein modification
3.3 Tyrosine phosphorylation
3.3.1 Genetic code expansion with caged phosphoryl tyrosine
3.3.2 Synthesis of caged phosphoryl tyrosines
3.4 Genetic code expansion with caged tyrosines
3.4.1 Synthesis of caged tyrosines
3.4.2 Genetic encoding of caged tyrosines by evolved PylRS/PylT pairs
3.5 Genetic code expansion with caged deuterated tyrosine
3.5.1 Synthesis of caged deuterated tyrosine
3.6 Summary
3.7 Experimental data for synthesized compounds

CHAPTER 4: NORBORNENE-CONTAINING CAGED THYMIDINE FOR OLIGONUCLEOTIDE MODIFICATION
4.1 Introduction
4.2 Synthesis of a norbornene-functionalized, photocaged thymidine phosphoramidite
4.3 Summary
4.4 Experimental data for synthesized compounds

REFERENCES


LIST OF FIGURES

Figure 1. Site-specific incorporation of an unnatural amino acid (UAA) into a protein. Adapted from Curr. Opin. Chem. Biol. 2011, 15, 392.

Figure 2. Structures of pyrrolysine (Pyl) and analogs that serve as substrates of PylRS.

Figure 3. Pyrrolysine/lysine analogs that serve as substrates of the wild-type PylRS.

Figure 4. Pyrrolysine/lysine analogs that have been genetically encoded into proteins using engineered PylRS mutants.

Figure 5. Structure of phenylalanine (Phe) and its derivatives that have been genetically encoded into proteins using engineered PylRS mutants.

Figure 6. Structures of metal-chelating UAAs.

Figure 7. Proposed structures of bipyridine lysines 47 and 51.

Figure 8. SDS-PAGE analysis and protein yield for the incorporation of 51 into sfGFP at position Y151 in E. coli by using EV13. Experiments were performed by Jihe Liu.

Figure 9. Representation of bioorthogonal reactions for labeling proteins. Under physiological conditions and in the presence of all functional groups found in living systems, 1) a bioorthogonal functional group (purple) is introduced into a protein of interest, or POI, (green) by UAA mutagenesis. Then, 2) a bioorthogonal chemical reporter (blue) reacts chemo-selectively to label the protein. Figure adapted from Chem. Rev. 2014, 114(9), 4764.

Figure 10. Bioorthogonal reactions for protein labeling.

Figure 11. Ketone-containing UAAs that have been genetically encoded into proteins.

Figure 12. A site-specifically incorporated 1,2-diol-containing UAA is modified post-translationally via periodate oxidative cleavage to reveal an aldehyde on a protein of interest (POI).

Figure 13. SDS-PAGE analysis for the incorporation of the ketone lysine 62 into sfGFP by the PylRS variant EV3. Experiment performed by Alex Prokup.

Figure 14. Incorporation of diol lysine 65 into proteins in E. coli using EV3 PylRS mutant. SDS-PAGE analyses show that A) expression of Myo is dependent on the presence of 65 at position S4, and B) expression of sfGFP is dependent on the presence of 65 at position Y151. Experiments were performed by Ji Luo.

Figure 15. Genetic incorporation of alkene lysine analogs into Myo by the wild-type PylRS/PylT pair in E. coli. A) SDS-PAGE analyses for the incorporation of alkene-bearing lysines into Myo. B) Myo comparative protein yield (%) and ESI-MS results. Experiments were performed by Dr. Chungjung Chou and Jihe Liu.

Figure 16. Alkenyl-sfGFP is fluorescently labeled with a dansyl-thiol. A) sfGFP bearing an alkene functionality reacts photochemically with dansyl-thiol 117 or lysozyme (LYZ). B) SDS-PAGE analysis and in-gel fluorescence demonstrates the labeling of alkenyl-sfGFP with dansyl-thiol 117 after 5 min of UV irradiation via thiol-ene ligation (lanes 5 and 6). wt: wild-type sfGFP; 7 and 101: sfGFP carrying the corresponding UAA; LYZ: lysozyme. -UV: samples were not exposed to UV irradiation. +UV: samples were irradiated at 365 nm for 5 min. Experiments were performed by Dr. Chungjung Chou.

Figure 17. Alkenyl-sfGFP is bioconjugated to lysozyme, assembling a non-linear protein dimer via the thiol-ene reaction. SDS-PAGE analysis shows mobility band shifts of sfGFP-7 and sfGFP-101 from 28 kDa to 44 kDa after samples were irradiated at 365 nm for 10 min (lanes 8 and 9), corresponding to the molecular weight of the sfGFP-LYZ conjugate. wt: wild-type sfGFP; 7 and 101: sfGFP carrying the corresponding UAA; LYZ: lysozyme. -UV: samples were not exposed to UV irradiation. +UV: samples were irradiated at 365 nm for 10 min. Experiments were performed by Dr. Chungjung Chou.

Figure 18. Fluorescent labeling of Myo-7 via photo-click cycloaddition with 128. A) Structure of tetrazole 128. B) SDS-PAGE coomassie stain (top) and in-gel fluorescence (bottom) analyses for the labeling of Myo-7 with 128 upon 302 nm of UV light for 2 or 5 min.

Figure 19. ESI-MS analysis of Myo confirming the genetic incorporation of 155 (expected mass: 18490.07) and 158 (expected mass: 18520.08) at position S4.

Figure 20. Structures of strained-alkene derivatives that have been genetically encoded by PylRS/PylT pairs for protein labeling via inverse electron-demand Diels-Alder reactions.

Figure 21. Genetically encoded 164 by the MbPylRS/PylT pair in E. coli. A) UAA-dependent expression of sfGFP with an amber stop codon at position 150 and Myo with an amber stop codon at position 4. B) ESI-MS analyses confirming the incorporation of 164. i, sfGFP-164-His6, calculated: 27,977.5 Da, found: 27,975.5±15 Da. ii, Myo-164-His6, calculated: 18,532.2 Da, found: 18,532.5±15 Da. Experiments were performed at the Chin lab. Figure adapted from Nat. Chem. 2012, 4, 298.

Figure 22. Specific labeling of a cell surface protein in mammalian cells. Expression of full-length EGFR-EGFP is dependent on 164 or 6 at position 128 as it is visible by green fluorescence (left panel), while specific protein labeling with TAMRA-tetrazine (middle panel) is only visible for EGFR-164-EGFP by red fluorescence; merged images (right panel). Experiments were performed by the Chin lab. Figure adapted from Nat. Chem. 2012, 4, 298.

Figure 23. SDS-PAGE analysis demonstrates specific PEGylation of sfGFP-164. Experiment performed by Alex Prokup.

Figure 24. Representation of the activation of a caged protein with UV light (365 nm).

Figure 25. Modifications of the o-NB structure. (R = H or Me, X = any substrate)

Figure 26. Structures of NPE-type caging groups.

Figure 27. Tyrosine analogs that have been genetically encoded into proteins.

Figure 28. Structure of o-nitrobenzyl tyrosine (218) and analogs 219-221.

Figure 29. Structures of phosphoryl serine (pSer), phosphoryl threonine (pThr), phosphoryl tyrosine (pTyr), aspartic acid (Asp) and glutamic acid (Glu).

Figure 30. Phosphoryl tyrosine mimics. A) Structures of p-carboxymethyl-phenylalanine (222) and p-azidophenylalanine (223). B) Photolysis of light-sensitive 224 to a p-(phosphoamino)-phenylalanine (225) in a protein of interest (POI).

Figure 31. Structures of caged tyrosine analogs 247-249.

Figure 32. Decaging of caged tyrosines 249, 248, and 247 by 365 nm of UV light. A) Concentration versus irradiation time plots and B) kinetic analysis showing rate constants (k) and half-life (t1/2).

Figure 33. Incorporation of 247, 248 and 249 into sfGFP at the position Y151 in E. coli. A) SDS-PAGE and protein yield analyses for the incorporation of 247 and B) 248, using EV20. 218 was used as positive control. C) SDS-PAGE and protein yields analyses for the incorporation of 249 using EV16-1, EV16-4, EV16-5 and EV20. D) ESI-MS results confirming the incorporation of 247, 248 and 249 by EV20 into sfGFP at the position Y151. Experiments performed by Jihe Liu.

Figure 34. Incorporation of 247-249 in mammalian cells. A) Fluorescence micrographs and B) western blot analysis of HEK 293T cells expressing V16-5/PylT pair and mCherry-TAG-EGFP-HA in the presence of 218, 248, 247, and absence of an UAA. C) Fluorescence micrographs of HEK 293T cells expressing AG19/PylT pair and mCherry-TAG-EGFP-HA in the presence or absence of 249. Experiments performed by Ji Luo.

Figure 35. Site-specific incorporation of a deuterated tyrosine by photochemical disguise.

Figure 36. Photo-activation and photo-deactivation of oligonucleotides using photocaged nucleobases or photocleavable linkers. Figure adapted from Acc. Chem. Res. 2014, 47, 45-55.

Figure 37. Structure of alkyne-functionalized caged thymidine phosphoramidite 267.

Figure 38. Inverse electron-demand Diels-Alder bioconjugation of a photocaged thymidine and decaging reaction. (CG = caging group)


LIST OF SCHEMES

Scheme 1. Attempted synthesis of bipyridine lysine 47.
Scheme 2. Synthesis of a bipyridine lysine 51.
Scheme 3. Synthesis of ketone lysine 62.
Scheme 4. Synthesis of diol lysine 65.
Scheme 5. Synthesis of diol phenylalanine 71.
Scheme 6. Synthesis of dansyl aminooxy 75.
Scheme 7. Synthesis of the aminooxy linker 79.
Scheme 8. Synthesis of aminooxy-modified : fluorescein 82 and coumarin 85.
Scheme 9. Synthesis of alkene-bearing lysines 101-105. (a) Yield over two steps.
Scheme 10. Synthesis of alkene-bearing lysines with an amide (108), “inverted” carbamate (111), or urea (113) functionality at the Nɛ-position of lysine.
Scheme 11. Synthesis of disulfide 115 and dansyl thiol 117 for protein labeling.
Scheme 12. Synthesis of alkene-reactive tetrazoles.
Scheme 13. Synthesis of tetrazole 135 bearing a reactive linker and functionalized tetrazoles 136 and 137 carrying a and biotin, respectively.
Scheme 14. Synthesis of the acyclic 1,3-diene 140.
Scheme 15. Attempted synthesis of furan lysine 144 and cyclohexadiene lysine 152.
Scheme 16. Synthesis of furan lysines 155 and 158.
Scheme 17. Synthesis of norbornene lysine 164.
Scheme 18. Synthesis of tetrazines 169-172 bearing various substituents at position 6 of tetrazine and a carboxylic acid functional group at the 3-phenyl for further functionalization.
Scheme 19. Synthesis of the PEG-tetrazines 175 and 176.
Scheme 20. Synthesis of tetrazines containing an linker (181-184) for further functionalization. X = TFA or HCl.
Scheme 21. Synthesis of various tetrazine reagents for fluorescent labeling.
Scheme 22. Norrish type II mechanism for the photolysis of an o-NB substrate.
Scheme 23. Photolysis mechanism for the NPE-caged substrate.
Scheme 24. Synthesis of caged phosphoryl tyrosines 237 (R = Me) and 238 (R = H).
Scheme 25. Synthesis of caged tyrosine esters 244-246.
Scheme 26. Synthesis of caged tyrosine 247.
Scheme 27. Synthesis of caged tyrosine 248 via a Mitsunobu reaction.
Scheme 28. Synthesis of caged tyrosine 249 via a Mitsunobu reaction.
Scheme 29. Synthesis of 266 via N-Boc protection and alkylation.
Scheme 30. Synthesis of a caged deuterated tyrosine 266.
Scheme 31. Synthesis of the caging precursor 272.
Scheme 32. Synthesis of a norbornene-containing caged thymidine phosphoramidite 276.


LIST OF ABBREVIATIONS

μL microliter
13C NMR carbon nuclear magnetic resonance
1H NMR proton nuclear magnetic resonance
aaRS aminoacyl-tRNA synthetase
aaT aminoacyl-tRNA
Ac acetyl
AcCl acetyl chloride
AlMe3 trimethyl aluminum
AME acetoxymethyl ester
AMP adenosine monophosphate
aq aqueous
ATP adenosine triphosphate
Boc tert-butyloxycarbonyl
br broad
Bu butyl
CD3OD deuterated methanol
CDCl3 deuterated chloroform
d doublet
D2O deuterated water
Da dalton
dba dibenzylideneacetone
DBU 1,8-diazabicyclo[5.4.0]undec-7-ene
DCC N,N'-dicyclohexylcarbodiimide
DCE 1,2-dichloroethane
DCM dichloromethane
dd doublet of doublets
DIAD diisopropyl azodicarboxylate
DIPEA diisopropylethylamine
DMAP 4-dimethylaminopyridine
DME dimethoxyethane (glyme)
DMF dimethylformamide
DMSO dimethyl sulfoxide
DMSO-d6 deuterated dimethyl sulfoxide
DMT dimethoxytrityl
DNA deoxyribonucleic acid
DSC N,N'-disuccinimidyl carbonate
EDCI 1-ethyl-3-(3-dimethylaminopropyl)carbodiimide
EEDQ N-ethoxycarbonyl-2-ethoxy-1,2-dihydroquinoline
EGFP enhanced green fluorescent protein
EGFR epidermal growth factor receptor
ESI electrospray ionization
Et ethyl
Et2O diethyl ether
EtOAc ethyl acetate
EtOH ethanol
FITC fluorescein isothiocyanate
g gram
GFP green fluorescent protein
h hour(s)
HATU (1-[bis(dimethylamino)methylene]-1H-1,2,3-triazolo[4,5-b]pyridinium 3-oxid hexafluorophosphate)
HEK human embryonic kidney
Hex hexanes
HOBt hydroxybenzotriazole
HRMS high resolution mass spectrometry
Hz hertz
IB immunoblot
iPr isopropyl
IR infrared
J coupling constant
kDa kilodalton
L liter
LC/MS liquid chromatography mass spectrometry
LRMS low resolution mass spectrometry
LYZ lysozyme
m meta
m multiplet
M molar
Mb Methanosarcina barkeri
Me methyl
MeCN acetonitrile
MeOH methanol
mg milligram
MHz megahertz
min minutes
miRNA microRNA
mL milliliter
mM millimolar
mmol millimole
MOM methoxymethyl
mRNA messenger ribonucleic acid
Myo myoglobin
NHS N-hydroxysuccinimide
nm nanometer
NMO 4-methylmorpholine N-oxide
NPE 2-(o-nitrophenyl)ethanol
o ortho
o-NB ortho-nitrobenzyl
o-NBB ortho-nitrobenzyl bromide
p para
PBS phosphate buffered saline
pdt product
PEG polyethylene glycol
Ph phenyl
PheRS phenylalanyl-tRNA synthetase
POI protein of interest
PPh3 triphenylphosphine
PPi inorganic pyrophosphate
ppm parts per million
PTM post-translational modification
pTSA para-toluenesulfonic acid
Py pyridine
Pyl pyrrolysine
PylRS pyrrolysyl-tRNA synthetase
PylT pyrrolysyl tRNA
RNA ribonucleic acid
rt room temperature
s singlet
SDS-PAGE sodium dodecyl sulfate polyacrylamide gel electrophoresis
sec second(s)
sfGFP superfolder green fluorescent protein
siRNA small interfering ribonucleic acid
s.m. starting material
t triplet
TBAI tetra-N-butylammonium iodide
TBDMS tert-butyldimethylsilyl
tBu tert-butyl
tBuOH tert-butanol
tBuOK potassium tert-butoxide
TEA triethylamine
TES triethylsilane
Tf triflate or trifluoromethanesulfonate
TFA trifluoroacetic acid
THF tetrahydrofuran
TLC thin layer chromatography
tRNA transfer ribonucleic acid
Ts tosyl
TyrRS tyrosyl-tRNA synthetase
TyrT tyrosyl-tRNA
UAA unnatural amino acid
UV ultraviolet


CHAPTER 1: GENETIC CODE EXPANSION WITH UNNATURAL AMINO ACIDS

1.1 Introduction

With only a few exceptions, the proteins of all living organisms are composed of the same 20 canonical amino acids. The genetic code consists of 64 three-base codons: 61 specify the 20 amino acids and three signal translation termination. Deoxyribonucleic acid (DNA) is transcribed into messenger ribonucleic acid (mRNA), which then serves as the template for protein synthesis. RNA-directed synthesis of a protein, called translation, requires transfer RNA (tRNA) to carry amino acids from the cytoplasmic pool of amino acids to the ribosome, where protein synthesis takes place. Each tRNA is charged with the amino acid that corresponds to its anticodon, which base-pairs with a complementary mRNA codon. Although anticodons are common to all organisms, a tRNA that binds to a given mRNA codon is specific to a particular amino acid and will only carry that amino acid into the ribosome. Charging of each amino acid onto its tRNA to form an aminoacyl-tRNA (aaT) is accomplished by a specific enzyme called an aminoacyl-tRNA synthetase (aaRS). Consequently, there are 20 such enzymes in the cell, and they are species-specific.1
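The codon arithmetic described above can be checked with a short Python sketch. It is offered only as an illustration: the variable names are chosen here, and the table is the standard genetic code written in the conventional U, C, A, G ordering. Note that the stop codon UAA in the output is unrelated to the abbreviation UAA (unnatural amino acid) used throughout this text.

```python
from itertools import product

BASES = "UCAG"

# Standard genetic code, one letter per codon in U, C, A, G order
# (third base varies fastest); '*' marks a translation termination codon.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)

# Map every three-base codon to its one-letter amino acid (or '*').
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

stop_codons = sorted(c for c, aa in CODON_TABLE.items() if aa == "*")
sense_codons = [c for c, aa in CODON_TABLE.items() if aa != "*"]
amino_acids = set(CODON_TABLE.values()) - {"*"}

# Total codons, sense codons, distinct amino acids, and the stop codons.
print(len(CODON_TABLE), len(sense_codons), len(amino_acids), stop_codons)
```

The three termination codons found this way (UAA, UAG, UGA) are the candidates for reassignment; UAG is the amber codon.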

Proteins are essential biomacromolecules that are involved in all cellular functions. They participate in biological processes such as cell signaling and signal transduction, serve as building blocks of cell structure, and contribute to cellular maintenance and stability. Alterations to protein synthesis, function, and structure may in turn damage cells or ultimately the organism. Thus, elucidating protein function is of vital importance. However, studying proteins can be difficult because their functions depend on their specific interactions with other molecules. Many proteins undergo post-translational modifications and/or bind to cofactors or other biomolecules to extend their properties. This makes biological processes very complex, as they are naturally regulated with high spatiotemporal resolution.2

Proteins, with their numerous biological functions, are a prominent target for chemical modifications by taking advantage of their diverse amino acid side-chain functionalities.3 The canonical set of amino acids contains a limited number of functional groups that include , amides, acids, aromatics, , alcohols, and . Why exactly nature has provided only 20 amino acids is a subject we are unable to explain; still, all biological processes are efficiently achieved with these remarkable building blocks.

However, for biological research, 20 amino acids are by no means ideal.

1.2 Protein modification via genetic encoding of unnatural amino acids

Approaches to chemically modify proteins have been explored and include 1) selective reactions of residues with reactive side-chains such as lysine and , 2) chemical modification of the N-terminus of a protein, 3) native chemical ligation, 4) peptide tags, and 5) labeling with specific amino acid sequences.4 Moreover, to provide greater chemical versatility and better compatibility to cellular systems, methods for the incorporation of unnatural amino acids (UAAs) into proteins have been developed.

The ability to incorporate amino acids that are not specified by the genetic code (UAAs) into proteins offers an opportunity to manipulate the macromolecule’s function, structure, and properties within living cells or organisms. The residue-specific method of incorporating UAAs into proteins, pioneered by Tirrell and co-workers, involves the replacement of a canonical amino acid residue throughout a protein with an UAA. Since the UAA is incorporated at multiple sites, the protein may exhibit undesired altered properties compared to the natural protein.5 Despite the success of this method, on many occasions site-specificity is necessary to precisely study and manipulate a protein’s function. Over a decade ago, Schultz and co-workers reported a method for the site-specific incorporation of UAAs using the cell’s own translational machinery in Escherichia coli.6

Many UAAs have since been site-specifically incorporated into proteins in cells and animals, providing chemical biologists with essential tools for the study and control of biological processes that are challenging or perhaps impossible to address by more conventional methods. Site-specific methods are ideal for introducing defined mutations into proteins with minimal perturbation. Expansion of the genetic code can thus potentially help resolve the most intriguing questions about proteins, such as protein folding, protein-protein interactions, and protein localization. A wide range of studies are enabled by using UAAs as crosslinkers, photo-reactive groups, mimics of post-translational modifications, handles for bioconjugation reactions, and biophysical probes.7-11 The site-specific incorporation of UAAs into proteins has been applied in hundreds of studies to facilitate important biological insights into structure, function, and dynamic behavior, as well as to create proteins with enhanced or novel activities. For instance, the incorporation of photocaged amino acids has enabled time-resolved studies of signaling and transport processes in cells by activating a protein with light.11 Another example is the incorporation of probes into proteins, which has revealed structural insights such as conformational changes.10


1.3 Genetic code expansion using orthogonal aminoacyl-tRNA synthetase/aminoacyl-tRNA pairs

The site-specific incorporation of UAAs into proteins requires an aaRS/aaT pair that is orthogonal to the host organism. An orthogonal aaRS/aaT pair can be obtained from an organism different from the host and engineered to specifically recognize the UAA, which is then inserted into the growing polypeptide in response to a blank codon on the mRNA template. Most commonly, an amber stop codon (UAG) is introduced into the target gene via site-directed mutagenesis to mark the position at which the tRNA delivers the UAA. However, other stop codons and quadruplet codons have also been employed.12-14

An UAA is designed according to the function the researcher wants the protein to perform and based on its potential to be accepted as a substrate by an orthogonal aaRS. In a typical incorporation experiment, the UAA is fed to the cells, where it is recognized by the synthetase without any cross-reaction with the endogenous translational machinery. With the same orthogonality, the synthetase charges its corresponding tRNA with the UAA. The charged tRNA delivers the UAA to the ribosome, where it is inserted into the growing polypeptide in response to a blank codon on the mRNA, such as UAG. This results in the incorporation of the UAA into the protein of interest at a pre-selected, defined position (Figure 1).
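The outcome of such an incorporation experiment can be illustrated with a toy translation routine in Python. This is a minimal sketch under stated assumptions, not a model of any real system: the function name `translate`, the parameter `amber_suppressor`, the placeholder residue symbol "X" for the UAA, and the short mRNA sequence are all hypothetical and chosen only for illustration.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code in U, C, A, G order; '*' marks a stop codon.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna, amber_suppressor=None):
    """Translate an in-frame mRNA until the first unsuppressed stop codon.

    amber_suppressor is the one-letter symbol delivered at UAG by a charged
    orthogonal tRNA (hypothetical), or None when no such tRNA is available.
    """
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon == "UAG" and amber_suppressor is not None:
            # Amber suppression: the blank codon is read as the UAA.
            protein.append(amber_suppressor)
            continue
        residue = CODON_TABLE[codon]
        if residue == "*":
            break  # termination: release the (possibly truncated) chain
        protein.append(residue)
    return "".join(protein)


# Toy reading frame: Met-Ser-(UAG)-Lys-Gly, ended by an ochre stop codon.
mrna = "AUGUCUUAGAAAGGUUAA"
print(translate(mrna))                        # no UAA supplied: truncated
print(translate(mrna, amber_suppressor="X"))  # UAA supplied: full length
```

In this sketch the chain terminates at the amber codon when the orthogonal pair has no UAA to deliver, which mirrors the truncated product expected in the absence of the UAA.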

Figure 1. Site-specific incorporation of an unnatural amino acid (UAA) into a protein. Adapted from Curr. Opin. Chem. Biol. 2011, 15, 392. (Labels in the figure: orthogonal aaRS/aaT pair, endogenous aaRS/aaT pair, ATP, AMP + PPi, ribosome, mRNA, modified protein.)

For the experiment to be successful, a synthetase that naturally recognizes one of the 20 canonical amino acids needs to be evolved in order to abolish binding of the endogenous amino acid and to selectively accept an UAA as substrate. During a synthetase evolution experiment, an aaRS, usually taken from archaea, is subjected to site-specific mutagenesis of its active site to generate a library of variants.15-17 The library of mutants is introduced into E. coli together with its cognate tRNA and a chloramphenicol acetyltransferase (chloramphenicol resistance) gene that bears an amber stop codon. The E. coli cells are then supplemented with the UAA and chloramphenicol. In this selection step, termed positive selection, the surviving cells encode functional aaRS variants that enable expression of chloramphenicol acetyltransferase via the incorporation of either a canonical amino acid or an UAA. The isolated aaRS variants are then co-transformed with a gene for the toxic protein barnase that bears an amber stop codon, and the E. coli cells are grown in the absence of the UAA. Through this process, dubbed negative selection, the aaRSs that accept canonical amino acids are removed from the library of mutants by cell death, and those that do not use canonical amino acids survive. Multiple rounds of positive and negative selection give aaRS variants with optimal orthogonality that selectively use an UAA. Once a synthetase is identified for a specific UAA, the incorporation of the UAA into proteins can be performed in cells, where the expression of the protein that bears the amber stop codon depends on the presence of the UAA. In the absence of the UAA, only truncated protein is produced and full-length protein is not detected. The production of the modified protein is ultimately confirmed by protein mass spectrometry.
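The logic of the two selection steps can be summarized in a few lines of Python. This is a deliberately simplified sketch: each synthetase variant is reduced to two yes/no properties, and the class name `Variant`, the function names, and the four example variants are hypothetical labels introduced here rather than anything taken from an actual library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """A synthetase variant reduced to its two substrate preferences."""
    name: str
    accepts_uaa: bool
    accepts_canonical: bool


def positive_selection(library):
    # Cells grown with UAA and chloramphenicol: a cell survives if its
    # variant suppresses the amber codon in the resistance gene with
    # any amino acid, canonical or unnatural.
    return [v for v in library if v.accepts_uaa or v.accepts_canonical]


def negative_selection(library):
    # Cells grown without UAA: a variant that charges a canonical amino
    # acid suppresses the amber codon in the barnase gene and kills the cell.
    return [v for v in library if not v.accepts_canonical]


library = [
    Variant("inactive", accepts_uaa=False, accepts_canonical=False),
    Variant("canonical-only", accepts_uaa=False, accepts_canonical=True),
    Variant("promiscuous", accepts_uaa=True, accepts_canonical=True),
    Variant("selective", accepts_uaa=True, accepts_canonical=False),
]

survivors = negative_selection(positive_selection(library))
print([v.name for v in survivors])
```

Only the variant that accepts the UAA and rejects canonical amino acids passes both filters. Real selections are graded rather than binary, which is one reason multiple rounds are used.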

The first aaRS/aaT pair to be engineered was the tyrosyl-tRNA synthetase/tyrosyl-tRNA pair from Methanococcus jannaschii (MjTyrRS/TyrT).6 The directed evolution of this pair was carried out in E. coli to discover an aaRS mutant capable of selectively and site-specifically incorporating O-methyl-L-tyrosine into proteins. Additional pairs that have been developed include E. coli TyrRS/TyrT (EcTyrRS/TyrT) and E. coli leucyl-tRNA synthetase/leucyl-tRNA (EcLeuRS/LeuT).9, 18, 19 While the MjTyrRS/TyrT pair is orthogonal in E. coli but not in eukaryotic cells,20 the EcTyrRS/TyrT and EcLeuRS/LeuT pairs have been shown to be orthogonal in eukaryotic cells but, being derived from E. coli, not in E. coli itself.9, 18, 21-24

1.4 Pyrrolysyl-tRNA synthetase as a genetic code expansion tool

Although useful and elegant approaches to incorporate UAAs into proteins have been developed, nature itself has provided a very competitive system to introduce pyrrolysine (Pyl, Figure 2), the 22nd proteinogenic amino acid. The pyrrolysyl-tRNA synthetase (PylRS) is present in some methanogenic archaea and bacteria and directly charges Pyl onto its cognate pyrrolysyl-tRNA (PylT), which subsequently delivers Pyl to the growing polypeptide in response to an in-frame amber stop codon (UAG) in the mRNA.25-27 Species from the order Methanosarcinales use this endogenous and efficient amber suppression mechanism, and little truncated protein is formed by premature termination. The PylRS/PylT pair was discovered in M. barkeri, where an amber stop codon is recoded to efficiently introduce Pyl into mono-, di-, and trimethylamine methyltransferases.28, 29 Three other species, M. mazei, M. acetivorans, and M. burtonii, as well as the bacterium Desulfitobacterium hafniense, were also found to encode Pyl in methyltransferases.30-34 Even more advantageously, early studies demonstrated that the PylRS/PylT pair is functional and orthogonal to all endogenous aaRS/aaT pairs in E. coli,35 Saccharomyces cerevisiae,36 and mammalian cells.37 More recently, this system has been applied to expand the genetic code of the multicellular organisms Caenorhabditis elegans,38, 39 Drosophila melanogaster,40 and Xenopus laevis oocytes.41


Studies of Pyl and its genetic encoding revealed that the PylRS/PylT pair tolerates several analogs of Pyl. Structural exploratory studies of Pyl established that the Nε-carbonyl group and a heteroatom placed strictly at the position corresponding to the imine nitrogen in Pyl are necessary for substrate binding by the wild-type PylRS (Figure 2, 1-3).42-44 Encouraged by these results, the pyrrolysine mimic 4 with a terminal alkyne was designed and applied for site-selective protein modifications with azides via a Cu(I)-catalyzed click reaction.45 In addition, the incorporation of 5 showed that a carbamate side-chain functionality is tolerated without the need for a heteroatom in the five-membered ring.35

Figure 2. Structures of pyrrolysine (Pyl) and analogs that serve as substrates of PylRS.

Crystallographic studies of the PylRS structure uncovered information on PylRS substrate promiscuity that can lead to the design of additional Pyl mimics.44, 46 The enzyme has a deep hydrophobic pocket where it efficiently locks Pyl through a hydrogen bond network. Key interactions at the PylRS active site include one between the side-chain amide of Asn346 and the oxygen on the carbonyl side-chain functionality of Pyl. Another interaction involves the pyrroline nitrogen of Pyl and the phenolic oxygen of Tyr384.

However, due to the incorporation of 5 it can be inferred that the interactions at the PylRS binding pocket are relatively non-specific and can be redirected to the side-chain carbamate oxygen. In fact, the absence of the pyrroline nitrogen in 5 led to a water-mediated hydrogen bond with Asn346 for substrate recognition.44 It was also found that PylRS, unlike many aaRSs, does not have an editing domain that can hydrolyze misacylated tRNAs.47 With the lack of an editing domain, PylRS can accept slight variations on Pyl, thus slipping unnatural analogs into the translation machinery.

Research toward probing the promiscuity of PylRS to add novel functional groups into proteins led to the incorporation of UAAs that were not limited to a pyrroline ring structure and several bulky and versatile functionalities were tolerated (Figure 3). In general, these UAAs possess a hydrophobic side-chain functionality with a similar size to that of Pyl that in many cases is linked via a carbamate moiety at the Nε-position of pyrrolysine/lysine.

Yokoyama et al. first discovered that the amino acids 6 and 7 are efficient substrates of PylRS capable of strong interactions with the synthetase.48, 49 Efforts between our group and the Chin lab revealed two PylRS substrates (8 and 9) that are capable of undergoing click chemistry to label proteins, as well as the photocrosslinking amino acid 10.50, 51 Additional UAAs bearing terminal alkyne functionalities (11 and 12) were developed, as well as UAAs with a 1,2-aminothiol side chain (13) or a protected aminothiol functionality (14) that undergo selective native chemical ligation or cyanobenzothiazole condensations to modify proteins.52-55 Carell et al. recently demonstrated that PylRS weakly recognizes 15-17 to obtain proteins with lysine propionylation, butyrylation, and crotonylation.56

Figure 3. Pyrrolysine/lysine analogs that serve as substrates of the wild-type PylRS.

Even though PylRS has shown broad substrate promiscuity, this synthetase cannot recognize certain structures, such as those that are larger and more rigid. To improve substrate scope and expand to additional chemistries, efforts have been made to engineer PylRS and identify mutants that selectively recognize other UAAs. So far, over 30 UAAs have been incorporated into proteins using engineered PylRS/PylT pairs.57 Some of their structures are shown in Figure 4. Several UAAs with benzene functionalities have been incorporated and applied for the modification of proteins using PylRS mutants. Examples include an aromatic azide (18) for site-specific click or Staudinger reactions,49 a halogenated benzene derivative (19) to undergo cross-coupling reactions,58 and an o-nitrobenzyl caging group (20) for the control of protein function with light.36, 59-62 The UAAs 21 and 22 have been incorporated into proteins and applied in selective Cu-free click reactions,63, 64 and 23 reacts rapidly with tetrazines in an inverse electron-demand Diels-Alder reaction to site-specifically label proteins.65, 66 Additional UAAs that have been genetically encoded using PylRS variants include a mercapto-Lys precursor (24) to undergo native chemical ligation,67 a post-translationally modified lysine mimic (25),68, 69 a spin-labeled analog (26),70 and a furan-bearing lysine derivative (27) that was applied for the photocrosslinking of protein-nucleotide complexes with red light.71 In addition to incorporating new UAAs, engineered PylRSs have displayed substrate recognition toward UAAs that are also targeted by the native PylRS, such as 7, 13, and 14, with higher activities.54, 55, 72

Figure 4. Pyrrolysine/lysine analogs (18-27) that have been genetically encoded into proteins using engineered PylRS mutants.

Structural studies of PylRS also revealed that the synthetase closely resembles bacterial phenylalanyl-tRNA synthetase (PheRS). The similarly organized binding pockets of PylRS and PheRS prompted the evolution of PylRS for the selective recognition of phenylalanine and its analogs.73-75 Even though several phenylalanine analogs have previously been encoded into proteins in bacteria and eukaryotes by using engineered TyrRS/TyrT pairs, the PylRS/PylT pair provides higher orthogonality in different biological systems.6, 9, 76 Over forty phenylalanine and tyrosine derivatives have since been incorporated into proteins by using engineered PylRS variants, allowing researchers to explore the repertoire of functionalities that can be genetically encoded.77-81 A representative selection of these is shown in Figure 5.

Figure 5. Structures of phenylalanine (Phe) and its derivatives (28-36) that have been genetically encoded into proteins using engineered PylRS mutants.

1.5 Summary

The site-specific genetic encoding of UAAs provides tools for a number of new approaches to study and manipulate biological processes through modified proteins. The discovery and evolution of orthogonal PylRS/PylT pairs have resulted in an increasing number of novel and useful UAAs that can be site-specifically incorporated into proteins. This system has been successfully applied to bacteria, yeast, mammalian cells, and animals for the genetic encoding of more than 70 UAAs. Not only has the substrate scope of PylRS been expanded via directed evolution to include analogs of lysine, phenylalanine, or tyrosine, but the wild-type PylRS can be used to incorporate useful UAAs as well. Thus, the PylRS/PylT pair has substantially facilitated UAA incorporation into proteins to install a large variety of biological probes and chemically reactive functionalities. This technology has provided access to many proteins with varying modifications and functions both in vitro and in vivo, thereby making it possible to study protein regulation and function in fundamental biological processes.


CHAPTER 2: SYNTHESIS AND GENETIC ENCODING OF LYSINE ANALOGS FOR PROTEIN LABELING

2.1 Genetic code expansion with a bipyridine lysine

About a third of all proteins in nature are estimated to use metal ions to perform their function.82, 83 Metals are important structural elements in many proteins, are critical for regulating a number of biological activities, and are key cofactors in enzyme catalysis and electron transfer processes. Proteins that bind to metal ions (metalloproteins) are involved in important biological processes such as photosynthesis, oxygen transport, nitrogen fixation, synthesis and degradation of biomolecules, and water oxidation.83, 84 Due to the critical roles played by metalloproteins, considerable efforts have been devoted to understanding their structures and functions. Moreover, there is an interest in designing metalloproteins with new or enhanced functions, catalytic activities, and structural characteristics in order to manipulate biological processes.85, 86 However, the design of artificial metalloproteins using the common 20 amino acids is limited, as the precise orientation of multiple amino acid side-chains is often required to create a defined metal ion binding site. This is crucial because the localization of the metal complex is critical for the protein’s structure and function.

The genetic incorporation of metal-chelating amino acids into proteins provides a simple and general approach to site-specifically introduce metal ion binding sites into proteins. Unnatural amino acid mutagenesis has been applied for the biosynthesis of proteins that contain the metal-chelating amino acids bipyridylalanine (37),87 8-hydroxyquinolinylalanine (38),88 and 1,10-phenanthrolinylalanine (39)89 by using evolved TyrRS/TyrT pairs in E. coli (Figure 6). The introduction of metal binding sites at defined locations in proteins using 37 has allowed for the construction of metal-binding proteins, including a sequence-specific DNA-cleaving protein.90, 91 In addition, 38 was used as a biophysical probe for protein crystallographic structure determination88 and as an affinity tag for protein purification.92 As an alternative to 37 and 38, the site-specific incorporation of 39 into GFP was demonstrated in E. coli cells.89

Figure 6. Structures of metal-chelating UAAs (37-39).

Since metal-binding UAAs have been successfully applied for the expression of metalloproteins, we expect that the genetic incorporation of a bipyridine lysine using a PylRS/PylT pair would facilitate the design and construction of metalloproteins. Thus far, the genetic encoding of metal-chelating amino acids has not been possible in eukaryotic cells. We therefore aim to use the PylRS/PylT pair to incorporate a bipyridine amino acid into proteins in mammalian cells. We have designed the lysine analogs 47 and 51 (Figure 7) that bear the N,N’-bidentate group 2,2’-bipyridine, which is known to strongly chelate transition-metal ions such as Fe2+, Cu2+, Co2+, Zn2+, and Ru2+ with different levels of affinity.87, 91, 93


Figure 7. Proposed structures of bipyridine lysines 47 and 51.

2.1.1 Synthesis of a bipyridine lysine

The 2,2’-bipyridine lysine 47 was designed with a carbamate moiety at the Nε-position that links the N,N’-bipyridine group. We envisioned that the presence of the carbamate moiety would facilitate the genetic encoding of 47 using a PylRS/PylT pair, since this functionality has been shown to aid PylRS in substrate recognition for several UAAs48-50 (refer to the discussion in Section 1.4). Formation of the 2,2’-bipyridine core was achieved by the Stille cross-coupling reaction between 2-chloro-5-ethylnicotinate (40) and 2-tributylstannylpyridine (41) in the presence of PdCl2(PPh3)2 (Scheme 1).94 The reaction resulted in a 45% yield of 42 when refluxed in xylenes, while an increase to 68% was observed with 1,4-dioxane as the solvent at 100 °C. Reduction of the ethyl ester with lithium aluminum hydride in THF afforded 2,2’-bipyridine-5-methanol (43) according to a reported procedure94 in 93% yield. The activated alcohol 44 was obtained from the reaction with N,N’-disuccinimidyl carbonate (DSC) in the presence of TEA in acetonitrile in low yield (22%). Attempts to activate the alcohol 43 with diphosgene or 4-nitrophenyl chloroformate did not lead to product formation. On continuing the synthesis, we found that compound 44 did not react with lysine to furnish 45 or 46 under any of the reaction conditions explored.

Scheme 1. Attempted synthesis of bipyridine lysine 47.

Alternatively, we designed the 2,2’-bipyridine lysine 51, which was prepared in four steps starting from the Stille cross-coupling product 42 (Scheme 2). The hydrolysis of 42 by sodium hydroxide to give the corresponding carboxylic acid 48 was conducted as previously described95 with some modifications, giving 48 in 99% yield. Next, a DCC coupling reaction between 48 and N-hydroxysuccinimide (NHS) furnished the 2,2’-bipyridine NHS-activated ester 49 in 69% yield. The bipyridyl group was successfully linked to Boc-Lys-OH in DMF to afford 50 in 83% yield, and final Boc-deprotection of the α-amino group by TFA, followed by TFA to HCl salt exchange, resulted in the desired bipyridine lysine 51 in 98% yield.
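The step yields quoted for Scheme 2 can be combined into an overall yield by multiplying them. The following is a minimal sketch, assuming a strictly linear sequence; the function name `overall_yield` is illustrative, and the resulting overall figures are derived here from the reported step yields rather than stated in the source.

```python
from math import prod


def overall_yield(step_yields_percent):
    """Overall yield (%) of a linear sequence from per-step yields (%)."""
    return 100.0 * prod(y / 100.0 for y in step_yields_percent)


# Step yields reported in the text for Scheme 2, starting from 42:
# hydrolysis to 48, NHS activation to 49, coupling to 50, deprotection to 51.
scheme2_from_42 = [99, 69, 83, 98]
from_42 = overall_yield(scheme2_from_42)

# Including the Stille coupling (68% in 1,4-dioxane) that gives 42 from 40.
from_40 = overall_yield([68] + scheme2_from_42)

print(f"42 -> 51: {from_42:.1f}%")
print(f"40 -> 51: {from_40:.1f}%")
```

With the yields quoted in the text, this corresponds to roughly 56% over the four steps from 42 and roughly 38% when the Stille coupling is included, with the NHS activation (69%) being the lowest-yielding step of Scheme 2.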

Scheme 2. Synthesis of a bipyridine lysine 51.

2.1.2 Genetic encoding of a bipyridine lysine

To employ the newly synthesized bipyridine lysine 51 in a genetic encoding experiment, Jihe Liu in the Deiters Lab screened a panel of engineered PylRS mutants to test for substrate specificity and incorporation. The experiments were performed in E. coli, and sfGFP (superfolder green fluorescent protein) was used as a reporter protein. To this end, the reporter plasmid sfGFP-Y151TAG-PylT was co-transformed with engineered PylRS into Top10 cells. sfGFP-Y151TAG-PylT encodes sfGFP with an amber stop codon (TAG) at position Y151 and drives the expression of PylT, the gene for MbPylTCUA. The screening of PylRS mutants led to the identification of one synthetase that drives the efficient incorporation of 51. This evolved PylRS, named EV13, bears the two mutations Y271A and Y349F compared to the wild-type PylRS. After expression in cells grown with or without 51, sfGFP was purified and analyzed by SDS-PAGE (Figure 8). Our results show that the expression of sfGFP, in which 51 is introduced at the TAG position, is dependent on the presence of 51.

[Gel image: EV13 with lanes -51 and +51; molecular weight markers at 37 kDa and 25 kDa. Yield (mg/L): 0.54 (-51), 2.40 (+51).]

Figure 8. SDS-PAGE analysis and protein yield for the incorporation of 51 into sfGFP at position Y151 in E. coli by using EV13. Experiments were performed by Jihe Liu.
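A reporter such as sfGFP-Y151TAG is defined by replacing a single sense codon with the amber codon TAG. The sketch below shows that substitution on a coding sequence; it is illustrative only. The function `introduce_amber` and the five-codon toy sequence are hypothetical, the sequence is not the sfGFP gene, and the residue numbering of a real construct depends on the convention used for that protein.

```python
def introduce_amber(cds, residue_number):
    """Return the coding sequence with the codon for the given 1-based
    residue number replaced by the amber stop codon TAG."""
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    start = 3 * (residue_number - 1)
    if not 0 <= start < len(cds):
        raise ValueError("residue number outside the coding sequence")
    return cds[:start] + "TAG" + cds[start + 3:]


# Hypothetical 5-codon sequence (Met-Ser-Tyr-Lys-Gly), NOT the sfGFP gene.
toy_cds = "ATGAGCTATAAAGGC"
mutant = introduce_amber(toy_cds, 3)  # Y3TAG in this toy example
print(mutant)
```

Without a functional synthetase/tRNA pair and the UAA, translation of such a sequence terminates at the introduced TAG codon, which is why full-length reporter expression serves as the readout for incorporation.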

2.2 Labeling of proteins via bioorthogonal reactions

Covalent attachment of proteins to ligands, polymers, and surfaces creates macromolecules that combine specific biological functions with favorable physical and chemical properties. For example, studying biological processes in their native environment often requires the addition of reporter tags to proteins.96 Current labeling methods include genetic fusions of fluorescent proteins;97, 98 self-labeling proteins, such as HaloTag,99-102 SNAP-tag,103 and CLIP-tag;104 enzyme-mediated labeling (ligases);105-109 and self-labeling tags.110, 111 Although these methods have significantly impacted protein research, they require the addition of protein fusions or additional sequences, which can interfere with the folding and activity of the targeted protein.112 Additionally, the range of probes and the locations at which a probe can be placed by some of these approaches are limited.113, 114

An alternative strategy to label proteins is the introduction of single-residue modifications, which is nearly non-perturbing. Site-specific protein labeling can be achieved by the installation of tags through bioconjugation reactions with reactive handles that are pre-installed into a protein by using an orthogonal aaRS/aaT pair for UAA mutagenesis.76, 115-117 The bioorthogonal groups can be introduced at virtually any position on a protein expressed in pro- and eukaryotic cells, and the choice of probes is nearly limitless.118

There are several criteria that a bioorthogonal reaction must fulfill to successfully take place in living systems. The reactants must be 1) kinetically, thermodynamically, and metabolically stable, 2) non-toxic, 3) capable of forming stable covalent linkages, and, most importantly, 4) bioorthogonal, meaning that they do not cross-react with the broad range of functionalities found in biological systems and react selectively only with each other under physiological conditions117, 118 (Figure 9).

Figure 9. Representation of bioorthogonal reactions for labeling proteins. Under physiological conditions and in the presence of all functional groups found in living systems, 1) a bioorthogonal functional group (purple) is introduced into a protein of interest, or POI, (green) by UAA mutagenesis. Then, 2) a bioorthogonal chemical reporter (blue) reacts chemo-selectively to label the protein. Figure adapted from Chem. Rev. 2014, 114(9), 4764.

Chemical handles that have been exploited in bioorthogonal reactions include aldehydes/ketones, azides, alkenes, alkynes, tetrazines, aryl halides, and aryl boronates. These functionalities have been involved in several bioorthogonal reactions to modify proteins and other biomolecules inside live cells and other complex environments, and their scope and applications have been extensively reviewed.4, 15, 117-122 Some of the most common bioorthogonal reactions are shown in Figure 10. These include aldehyde/ketone condensations, Staudinger ligations, copper-catalyzed or strain-promoted cycloadditions, palladium-catalyzed cross couplings, inverse electron-demand Diels-Alder cycloadditions, ruthenium-catalyzed olefin metathesis, photo-click cycloadditions, and thiol-ene reactions. Selected bioorthogonal reactions are described in more detail in the following sections.

Figure 10. Bioorthogonal reactions for protein labeling. Panels: aldehyde/ketone condensation; Staudinger ligation; Cu(I)-catalyzed alkyne-azide cycloaddition; strain-promoted alkyne-azide cycloaddition; Pd-catalyzed cross coupling; tetrazine and strained alkene Diels-Alder cycloaddition; cross-metathesis; photo-click cycloaddition; photo-induced thiol-ene reaction.

2.3 Genetic code expansion with unnatural amino acids for protein labeling via aldehyde/ketone condensations

Ketones and aldehydes were among the first functionalities to be explored for bioorthogonal reactions.123, 124 Under slightly acidic conditions (pH 4-6), the carbonyl group of ketones and aldehydes reacts with amines to form a Schiff base. This process is reversible, with the equilibrium usually favoring the carbonyl form.125 However, the use of α-effect amines, such as hydrazines or hydroxylamines, shifts the equilibrium in favor of the imine products (Section 2.2, Figure 10).


Although the acidic pH requirement restricts ketone/aldehyde condensations to environments where non-physiological conditions can be applied, the small size and versatility of ketones and aldehydes have made them popular choices as chemical reporters for biomolecule labeling. The reaction has proven useful for in vitro or cell-surface labeling, including the labeling of cell surface glycans124, 126-128 and oligosaccharides in bacterial cell walls.129 The ketone group has also been explored in the form of UAAs by residue-specific130-132 and site-specific methods.

Several ketone-containing UAAs have been genetically encoded into proteins in E. coli and eukaryotic cells by using engineered TyrRS/TyrT pairs (Figure 11, 52-56). The diketone 52 was introduced into proteins in E. coli and subsequently labeled with a hydroxylamine-containing dye.133 Similarly, p-acetylphenylalanine (54) has been incorporated into proteins for labeling with hydrazine- or hydroxylamine-modified fluorophores and EPR (electron paramagnetic resonance) probes.134-136 The similar analog 55 was site-specifically incorporated into proteins in vitro and in E. coli cells, and its selective labeling with hydrazide derivatives was efficiently demonstrated.137 Both 53 and 54 were incorporated into G-protein coupled receptors and labeled with fluorescent probes in vitro,138 and their genetic encoding in mammalian cells21 and yeast9, 139 was also reported. More recently, the site-specific incorporation of 53 and 54 into proteins was demonstrated via engineered PylRS/PylT pairs.78, 81 Another amino acid that has been incorporated into proteins using an evolved PylRS/PylT pair is the aliphatic analog 56. This UAA was genetically encoded in E. coli, and the keto-containing protein was labeled with a biotinylated alkoxyamine and a hydrazide dye at pH 6.3,140 compared to pH 4 for the labeling of 54 with alkoxyamine probes.135, 136

Figure 11. Ketone-containing UAAs that have been genetically encoded into proteins.

Since the reaction of ketones with hydroxylamines or hydrazines requires an acidic pH, it is not ideal for some proteins or certain biological environments. Notably, Liu et al. showed that it is possible to site-specifically label purified GFP bearing the aliphatic ketone 56 in PBS buffer at pH 6.3 with equimolar amounts of an aminooxy dye.140 However, a GFP bearing 54 did not yield a labeled protein under the same conditions. It is possible that the low reactivity of 54 is due to decreased electrophilicity of the carbonyl group caused by conjugation with the aromatic ring. Thus, approaches for the genetic incorporation of aliphatic carbonyl-containing UAAs into proteins need to be explored and further developed.

Aldehydes are potent electrophiles, more so than ketones, mainly due to steric effects.141 Because the high reactivity of aldehydes with amines may result in non-specific bioconjugation and cytotoxicity, methods to prepare and introduce this functionality into biomolecules have involved protected or masked aldehydes. Examples include aldehyde peptide-tags142, 143 and chemical modifications, such as periodate oxidation of N-terminal serine or threonine residues.144 However, these methods do not allow site-specificity. Recently, the site-specific incorporation of m-formyl-Phe into proteins in E. coli cells was demonstrated using an evolved PylRS/PylT pair.145 Although the amino acid did not show any toxicity in E. coli cells, it required supplementation at higher concentrations to promote protein expression. Alternatively, we envisioned the site-specific introduction of aldehydes into proteins by genetically encoding 1,2-diol-containing UAAs as masked aldehydes. The 1,2-diol-modified amino acids can in turn be oxidized in the protein using periodate to give an aldehyde at a pre-determined position146 (Figure 12).

Figure 12. A site-specifically incorporated 1,2-diol-containing UAA is modified post-translationally via periodate oxidative cleavage to reveal an aldehyde on a protein of interest (POI).

2.3.1 Synthesis of ketone and diol functionalized unnatural amino acids, and hydroxylamine probes

In order to expand the diversity of UAAs that can be incorporated by the PylRS/PylT pair and to explore the reactivity of aliphatic ketones toward condensation reactions with hydrazides or alkoxyamines, we aimed to synthesize the ketone lysine 62. Initial attempts to reproduce the synthesis of 62 developed by Dr. Hrvoje Lusic (Deiters lab) via Wacker oxidation (PdCl2, O2, H2O, dimethylacetamide, heat) of the corresponding terminal alkene 102 proved challenging. These attempts resulted either in no reaction or in the formation of 62 together with remaining starting material, and the product could not be isolated. We then changed our route toward 62 to include a ketal protection of the carbonyl group in an effort to avoid side reactions (Scheme 3). To this end, alcohol 59 was synthesized in two steps from 4-acetyl butyrate (57).147 Ketal protection to 58 was performed in the presence of p-toluenesulfonic acid and ethylene glycol to deliver the product in 69% yield. The ester 58 was then reduced to the alcohol 59 in 97% yield with LiAlH4. Then, 59 was reacted with DSC in the presence of pyridine to obtain the activated NHS-ester 60. Purification of 60 by flash column chromatography on silica gel failed due to hydrolysis to the alcohol 59. However, coupling the crude product with Boc-Lys-OH in DMF provided compound 61 in 85% overall yield. Next, Boc-deprotection of the α-amine and removal of the ketal group were attempted by using 1 M aqueous HCl, but the desired product was not obtained due to decomposition. Successful deprotection was achieved in a biphasic mixture of aqueous HCl and diethyl ether to furnish the ketone lysine 62 as a pure white solid in 98% yield.


Scheme 3. Synthesis of ketone lysine 62.

Meanwhile, we also designed and assembled two 1,2-diol-containing UAAs: an aliphatic lysine derivative 65 (Scheme 4) and a phenylalanine analog 71 (Scheme 5). To synthesize 65, 2,2-dimethyl-1,3-dioxolane-4-methanol was activated with DSC to obtain 63 according to a reported procedure.148 Boc-Lys-OH was reacted with the succinimidyl carbonate 63 under aqueous basic conditions, delivering 64 in 79% yield. Boc- and acetal-deprotection of 64 using a heterogeneous mixture of aqueous HCl and diethyl ether resulted in 65 in 91% yield.


Scheme 4. Synthesis of diol lysine 65.

Based on the recent developments on engineered PylRS mutants to incorporate phenylalanine analogs into proteins (see Section 1.4, Figure 5), we designed the diol phenylalanine 71. The UAA 71 was synthesized in five steps from N-Boc-4-iodo-phenylalanine (66), which was first protected as the tert-butyl ester 67 with tert-butyl 2,2,2-trichloroacetimidate in 62% yield. The iodo-phenylalanine 67 was then subjected to palladium-catalyzed cross-coupling reactions with pinacol vinylboronate or vinyltributylstannane. Several conditions were attempted in order to drive the reaction to completion and thereby facilitate purification, as the starting material and product showed the same Rf value as determined by TLC and NMR. To find optimal conditions, several reactions were set up in parallel to select the metal center and palladium catalyst. Several Stille and Suzuki cross-coupling conditions were attempted with the catalysts Pd(PPh3)4/PPh3, Pd(dba)3, or PdCl2(PPh3)2. We identified Suzuki cross-coupling conditions using PdCl2(PPh3)2 in the presence of sodium carbonate as base in heated THF and water that efficiently delivered 68 in 74% yield with no trace of starting material. Next, we explored the oxidation of the terminal alkene 68 to the corresponding 1,2-diol with either osmium tetroxide or potassium permanganate. However, we found that several conditions did not produce the desired product, while others resulted in the formation of the ketol 69 and not the expected diol (structure not shown). Under our best conditions, potassium permanganate oxidation proceeded cleanly to the ketol 69 as the only product in 90% yield. Subsequently, the ketone functionality of 69 was reduced to a secondary alcohol with sodium borohydride to deliver the diol 70 in 94% yield. Finally, treatment with 50% TFA in DCM for 5 h removed both the Boc and tert-butyl groups to give the desired diol phenylalanine 71 in 77% yield.


Scheme 5. Synthesis of diol phenylalanine 71.

The oxime bond that is formed between a ketone/aldehyde and a hydroxylamine is stable under physiological conditions. On the other hand, imines such as hydrazones are often reduced.149 In addition, the equilibrium constants (Keq) for hydrazone linkages are typically in the range of 10⁴-10⁶ M⁻¹, whereas for oxime bonds Keq is >10⁸ M⁻¹.150 Accordingly, we synthesized the dansyl derivative 75 that bears an aminooxy functionality (Scheme 6), which can be applied for protein labeling together with the ketone and diol UAAs described above. For this, we first assembled the dansyl bromide 73 (59%)151 and proceeded with an alkylation of N-Boc-hydroxylamine in the presence of DBU, delivering 74. The Boc-protected aminooxy fluorophore was then treated with TFA to furnish the free dansyl-aminooxy 75 in 54% yield over two steps.
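The practical consequence of the different equilibrium constants can be estimated with a simple equilibrium calculation. This is a minimal sketch, assuming a 1:1 association A + B ⇌ AB with equal starting concentrations and a Keq expressed in M⁻¹ (water treated as constant); the 10 µM concentration and the function name `fraction_conjugated` are hypothetical choices for illustration.

```python
from math import sqrt


def fraction_conjugated(k_eq, conc):
    """Equilibrium fraction of conjugate for A + B <=> AB with equal
    starting concentrations `conc` (M) and association constant `k_eq` (M^-1).

    Solves k_eq = x*conc / ((1 - x)*conc)**2 for the root with 0 <= x < 1.
    """
    kc = k_eq * conc
    return ((2 * kc + 1) - sqrt(4 * kc + 1)) / (2 * kc)


conc = 10e-6  # hypothetical 10 uM of each reaction partner
for label, k_eq in [("hydrazone, low end", 1e4),
                    ("hydrazone, high end", 1e6),
                    ("oxime, lower bound", 1e8)]:
    print(f"{label}: {fraction_conjugated(k_eq, conc):.3f}")
```

Under these assumptions, a larger Keq translates directly into a larger fraction of labeled protein at a given concentration, which is the thermodynamic argument for preferring aminooxy over hydrazide probes at low reagent concentrations.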

Scheme 6. Synthesis of dansyl aminooxy 75.

In order to facilitate a general procedure to modify probes with the aminooxy functionality, we synthesized the amine-functionalized linker 79, which carries an N-Boc-aminooxy group (Scheme 7). The synthesis of 79 was accomplished in three steps based on a reported procedure.152 First, 1,3-dibromopropane was reacted with 76 in the presence of potassium carbonate to deliver 77 in 77% yield. The alkyl bromide of 77 was subsequently reacted with N-Boc-hydroxylamine in 62% yield to install the aminooxy functionality (78) on the linker. Removal of the phthalimide by hydrazine delivered the primary amine 79 in 95% yield, which can be further functionalized.


Scheme 7. Synthesis of the aminooxy linker 79.

Next, we reacted the linker 79 with fluorescein 5-isothiocyanate (FITC, 80) or the activated coumarin carbonate 83 (provided by Dr. Rajendra Uprety) in DMF (Scheme 8). The resulting aminooxy-modified fluorophores 81 and 84 were subsequently deprotected by TFA to yield FITC- and coumarin-aminooxy (82 and 85) in 42% and 66% overall yields, respectively.


Scheme 8. Synthesis of aminooxy-modified fluorophores: fluorescein 82 and coumarin 85.

2.3.2 Genetic encoding of ketone and diol functionalized unnatural amino acids

Dr. Chungjung Chou and Alex Prokup (Deiters lab) identified two PylRS variants, EV3 and EV11, that recognize keto lysine 62 as a substrate. The first variant contains the three mutations L274V, C313V, and M315Q, whereas the second has the five mutations D203N, Y271C, L274V, C313V, and M315Y. We found that EV3 drives the incorporation of 62 more efficiently than EV11. Thus, the reporter plasmid sfGFP-Y151TAG-PylT was co-transformed with EV3 into Top10 cells. The protein was purified, and its SDS-PAGE analysis shows that the expression of sfGFP bearing 62 at Y151 is dependent on the presence of 62 (Figure 13).

[Figure 13: SDS-PAGE gel image; lanes +62 and -62; markers at 30 kDa and 25 kDa.]

Figure 13. SDS-PAGE analysis for the incorporation of the ketone lysine 62 into sfGFP by the PylRS variant EV3. Experiment performed by Alex Prokup.

Protein labeling experiments on sfGFP-62 were attempted with fluorophores 75, 82, and 85 by Alex Prokup in the Deiters lab. However, these efforts did not result in successful protein modification due to off-target labeling. This may be a result of using the labeling reagent in excess, which is often necessary due to the reversible nature and slow kinetics of the ketone/aldehyde-hydrazide/aminooxy reactions,123, 153 as use of the reagent in low concentrations did not yield labeled protein. Alternatively, the addition of aniline as a nucleophilic catalyst could be explored to improve the kinetics of the labeling reaction. Dawson et al.149, 150, 154 demonstrated that aniline accelerates the carbonyl condensation reaction by forming a reactive protonated electrophile with the carbonyl group. This Schiff-base intermediate then undergoes fast transimination to the resulting oxime or hydrazone product. Application of this methodology has resulted in increased reaction rates (~20-fold rate enhancement) in equimolar conditions and at pH values similar to physiological conditions.145, 149, 154
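To give a feel for what a ~20-fold rate enhancement means under equimolar conditions, the sketch below compares half-lives for a simple second-order reaction. It is only an illustration: it treats the ligation as irreversible (the text notes the reaction is in fact reversible), and the rate constant and concentration are hypothetical placeholder values, not measured ones.

```python
def half_life_equimolar(k, c0):
    """Half-life (s) of an irreversible reaction A + B -> P with [A]0 = [B]0 = c0.

    k is the second-order rate constant in M^-1 s^-1 and c0 is in M.
    From 1/[A] = 1/[A]0 + k*t, the half-life is 1 / (k * c0).
    """
    return 1.0 / (k * c0)


K_UNCATALYZED = 0.01    # M^-1 s^-1, hypothetical value for illustration only
C0 = 100e-6             # M, hypothetical equimolar concentration
RATE_ENHANCEMENT = 20   # approximate enhancement quoted in the text

t_uncat = half_life_equimolar(K_UNCATALYZED, C0)
t_cat = half_life_equimolar(K_UNCATALYZED * RATE_ENHANCEMENT, C0)
print(f"uncatalyzed half-life: {t_uncat / 3600:.0f} h")
print(f"catalyzed half-life:   {t_cat / 3600:.1f} h")
```

Whatever the actual rate constant, the half-life in this simple model scales inversely with it, so a 20-fold faster reaction reaches half conversion in one twentieth of the time.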

Meanwhile, the diol lysine 65 was tested for its site-specific incorporation into proteins. In experiments performed by Ji Luo in the Deiters lab, the PylRS variant EV3 was found to drive the incorporation of 65 into myoglobin (Myo) and sfGFP in E. coli. Ji Luo performed double transformations for the incorporation of 65 into Myo or sfGFP using Myo-S4TAG-PylT (encodes myoglobin with an amber stop codon at position S4 and drives the expression of PylT) or sfGFP-Y151TAG-PylT, respectively, with EV3 in Top10 cells. The SDS-PAGE analyses in Figure 14 show that the expression of either protein is dependent on the presence of 65.

[Figure 14: SDS-PAGE gel images; A) Myo, marker at 17 kDa; B) sfGFP, markers at 30 kDa and 25 kDa; lanes include samples without UAA and with 65.]

Figure 14. Incorporation of diol lysine 65 into proteins in E. coli using EV3 PylRS mutant. SDS-PAGE analyses show that A) expression of Myo is dependent on the presence of 65 at position S4, and B) expression of sfGFP is dependent on the presence of 65 at position Y151. Experiments were performed by Ji Luo.

The diol phenylalanine 71 was also tested for its incorporation into proteins using our available panel of evolved synthetases. Unfortunately, none that could recognize 71 and mediate its incorporation into protein were identified. Further studies are necessary to engineer a new PylRS variant through directed evolution.

2.4 Genetic code expansion with alkene lysines for protein labeling

The alkene functionality is currently receiving considerable attention because it is versatile in organic transformations and rarely found in natural proteins,155, 156 allowing for selective modification. A variety of alkene-bearing UAAs have been exploited for site-specific protein modifications using bioorthogonal reactions including olefin cross-metathesis,157 photo-click cycloadditions with tetrazoles,158-160 inverse electron-demand Diels-Alder cycloadditions,65, 66, 161 and thiol-ene reactions72, 162 (see Figure 10). A few examples of alkene-containing UAAs that have been incorporated via a PylRS were shown in Section 1.4 (see 7, 17, and 23).

The thiol-ene reaction involves a radical-mediated addition of a thiol to an alkene that occurs upon UV irradiation (365-405 nm).163, 164 The reaction offers the possibility of using light to control the formation of a stable thioether bond in both space and time. As a result of its specificity for alkenes and compatibility with aqueous environments, the thiol-ene reaction is a bioorthogonal reaction that has been applied in polymer and material synthesis,165-171 and in carbohydrate172, 173 and peptide/protein modification.72, 162, 174-179

An area of interest for which bioorthogonal reactions have been explored is the generation of non-linear protein fusions. In biological systems, proteins often bind to other proteins to gain stability, affinity, and higher specificity to perform specific cellular functions such as signal transduction, transcriptional regulation, and DNA repair.180-182 Here, we describe the site-specific genetic incorporation of alkenes into proteins for the direct, spacer-free generation of non-linear protein fusions.

2.4.1 Synthesis of alkene lysines

Since the wild-type PylRS is capable of accommodating a broad range of unnatural lysine derivatives that bear a carbamate linkage at the ε-amino group (refer to Section 1.4 and Figure 3 for examples), we synthesized a small collection of aliphatic alkene lysines to diversify the structure of bioconjugation handles and to explore the ability of the wild-type PylRS to accommodate long-chain alkenes and lysine linkages other than carbamates.

The synthesis of the alkene lysine analogs began with the activation of the corresponding unsaturated alcohols (86-90) with DSC or diphosgene to deliver 91-94 and 95, respectively, in yields from 65% to 77% (Scheme 9). Subsequent acylation of Boc-Lys-OH proceeded to yield the protected amino acids 96-100 in 86% to 93% yield. Finally, Boc-deprotection of the α-amino group with TFA in DCM in the presence of triethylsilane delivered the free amino acids 101-105 in 87% to 97% yield after recrystallization. The amino acids 101 to 104 were subjected to a TFA to HCl salt exchange in order to improve their solubility in aqueous media.

Scheme 9. Synthesis of alkene-bearing lysines 101 - 105. aYield over two steps.


To further explore the necessity of a carbamate linker for substrate recognition by the wild-type PylRS, the alkene-lysine derivatives 108, 111, and 113 were synthesized (Scheme 10). The alkene lysine 108, which has an amide functionality in place of a carbamate, was obtained from Boc-Lys-OH and the succinimidyl ester 106. Compound 106 was synthesized according to a reported procedure via DCC coupling,183 and the intermediate 107 was obtained in 97% yield. Boc-deprotection of the amino group, followed by TFA to HCl salt exchange, delivered 108 in almost quantitative yield. The amino acid 111 bears an “inverted” carbamate moiety relative to the well-known alloc-Lys 7. The synthesis of 111 began with the acylation of 6-hydroxy-Boc-norleucine-OH (provided by Qingyang Liu) using allyl isocyanate (109) to obtain the corresponding product 110 in 89% yield. Boc-removal of 110 using TFA delivered the free amino acid 111 in 96% yield. Next, acylation of Boc-Lys-OH with allyl isocyanate (109) in DMF gave the N-Boc-protected urea derivative 112 in 57% yield. The free urea analog 113 was obtained after final Boc-removal in 94% yield.


Scheme 10. Synthesis of alkene-bearing lysines with an amide (108), “inverted” carbamate (111), or urea (113) functionality at the Nɛ-position of lysine.

2.4.2 Genetic encoding of alkene lysines

To investigate whether the synthesized alkene lysines are substrates for the wild-type PylRS, Myo4TAG-PylT and pBKpylS (encodes MbPylRS) were co-transformed into E. coli Top10 cells by Dr. Chungjung Chou. Cells were grown in the absence of a UAA and in the presence of 7, 101-105, 108, 111, or 113. Protein expression in E. coli was evaluated by SDS-PAGE, where 7 was used as a positive control (Figure 15, A). Unnatural amino acid incorporation efficiency into Myo was determined by protein yield (%), and ESI-MS analysis of the proteins confirmed UAA incorporation, except for the non-incorporated UAAs 104 and 108 (Figure 15, B).

[Figure 15A: SDS-PAGE gel images; lanes without UAA and with 7, 101-105, 108, 111, or 113; markers at 20 kDa and 17 kDa.]

Figure 15B:

UAA    Yield (%)    Theoretical mass (Da)    Experimental mass (Da)
7      98           18480.09                 18480.45
101    81           18494.10                 18494.00
102    85           18508.27                 18508.19
103    19           18522.13                 18523.55
104    14           18536.15                 -
105    79           18494.10                 18494.03
108    9            18478.11                 -
111    100          18480.09                 18479.80
113    47           18479.10                 18479.20

Figure 15. Genetic incorporation of alkene lysine analogs into Myo by the wild-type PylRS/PylT pair in E. coli. A) SDS-PAGE analyses for the incorporation of alkene-bearing lysines into Myo. B) Myo comparative protein yield (%) and ESI-MS results. Experiments were performed by Dr. Chungjung Chou and Jihe Liu.
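The agreement between the theoretical and experimental masses listed in Figure 15B can be checked with a few lines of arithmetic. The sketch below computes the mass difference in Da and in ppm for each UAA with a reported experimental mass; the helper name `mass_error` is illustrative and not part of the original analysis.

```python
# (theoretical mass in Da, experimental mass in Da) as listed in Figure 15B;
# 104 and 108 are omitted because no experimental mass was reported for them.
masses = {
    "7": (18480.09, 18480.45),
    "101": (18494.10, 18494.00),
    "102": (18508.27, 18508.19),
    "103": (18522.13, 18523.55),
    "105": (18494.10, 18494.03),
    "111": (18480.09, 18479.80),
    "113": (18479.10, 18479.20),
}


def mass_error(theoretical, experimental):
    """Return the mass difference in Da and the relative error in ppm."""
    delta = experimental - theoretical
    ppm = 1e6 * delta / theoretical
    return delta, ppm


errors = {uaa: mass_error(t, e) for uaa, (t, e) in masses.items()}
for uaa, (delta, ppm) in errors.items():
    print(f"{uaa:>4}: {delta:+.2f} Da ({ppm:+.0f} ppm)")
```

In this comparison the largest deviation belongs to 103 (about 1.4 Da), while all other entries agree to within roughly 0.4 Da.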

The amino acids 7, 101, and 102 have been previously described and incorporated into proteins using the wild-type PylRS and/or PylRS mutants.49, 72 Here we found that additional analogs can be efficiently incorporated into Myo by MbPylRS. The amino acid binding pocket of the PylRS exhibited flexibility to accommodate substrates 7, 101, 102, 105, and 111, with amino acids 7 and 111 showing the highest incorporation efficiencies, which could be explained by their smaller size. In contrast, the amino acids 103 and 104 were not efficiently incorporated into protein due to their longer carbon chains. Studies have indicated that the carbamate moiety at the lysine side-chain is an essential discriminator for substrate recognition, in which the oxygen atom adjacent to the carbonyl group of 7 interacts via a water-mediated hydrogen bond with Asn346, a key residue in establishing substrate recognition44, 49 (refer to discussion in Section 1.4). This may explain the unsuccessful incorporation of 108, which lacks the carbamate functionality, while the amino acid 111 with an inverted carbamate was efficiently incorporated, possibly by re-directing the necessary interaction to the Oε-position. However, the similar analog 113 is a poor substrate for MbPylRS. Along with the inefficient incorporation of 108, this suggests that an oxygen atom adjacent to the side-chain carbonyl group allows for the hydrogen-bond network to be established more efficiently.

2.4.3 Synthesis of alkene-reactive probes for thiol-ene reactions

In order to test the thiol-ene reaction in proteins, we first synthesized a biotin disulfide probe (115) and a dansyl-thiol (117) as a fluorescent tag (Scheme 11). The disulfide 115 was synthesized in one step from NHS-biotin and cystamine hydrochloride (114) in the presence of DIPEA in a hot DMF solution in 56% yield. The disulfide can then be reduced to the corresponding thiol before protein labeling. Next, we synthesized dansyl-thiol 117 in one step by reacting dansyl chloride with cysteamine (116) in a solution of TEA in DCM to give 117 in 30% yield.


Scheme 11. Synthesis of biotin disulfide 115 and dansyl thiol 117 for protein labeling.

2.4.4 Site-specific protein labeling via the thiol-ene reaction

With amino acids 7 and 101 showing good incorporation efficiency, we site-specifically incorporated these UAAs into sfGFP as a second model protein in E. coli. To verify that the thiol-ene reaction is suitable for labeling alkene-bearing sfGFP, dansyl-thiol 117 was used as a fluorescent probe. Dr. Chungjung Chou subjected the wild-type sfGFP and modified sfGFPs carrying 7 or 101 at position Y151 to a thiol-ene reaction with 117 by irradiating the reaction mixture with 365 nm UV light for 5 min in the presence of the photoinitiator I2959. The samples were then analyzed by SDS-PAGE and in-gel fluorescence imaging (Figure 16A and 16B). The results show that the alkene-containing sfGFPs modified with 7 and 101 were both selectively labeled with 117 after UV irradiation, while the wild-type sfGFP was not fluorescently labeled. Thus, we demonstrated that a thiol-containing fluorescent probe could be site-specifically conjugated to sfGFP bearing an alkene functional group.

[Figure 16: A) scheme of the photochemical reaction of alkenyl-sfGFP with dansyl-thiol 117 or LYZ; B) gel images (fluorescence and Coomassie); lanes 1-3: wt, 7, 101 (+117, -UV); lanes 4-6: wt, 7, 101 (+117, +UV).]

Figure 16. Alkenyl-sfGFP is fluorescently labeled with a dansyl-thiol. A) sfGFP bearing an alkene functionality reacts photochemically with dansyl-thiol 117 or lysozyme (LYZ). B) SDS-PAGE analysis and in-gel fluorescence demonstrate the labeling of alkenyl-sfGFP with dansyl-thiol 117 after 5 min of UV irradiation via thiol-ene ligation (lanes 5 and 6). wt: wild-type sfGFP; 7 and 101: sfGFP carrying the corresponding UAA; LYZ: lysozyme. -UV: samples were not exposed to UV irradiation. +UV: samples were irradiated at 365 nm for 5 min. Experiments were performed by Dr. Chungjung Chou.

In order to show the potential of the thiol-ene reaction in protein chemistry, we hypothesized that cysteine residues in another protein could also be used as a possible reaction partner, leading to the formation of a non-linear protein heterodimer (Figure 16A). Lysozyme (LYZ) is a small protein containing 8 cysteine residues within 129 amino acids.184 These form 4 disulfide bonds, which can be reduced to release free thiol groups.

Alkenyl-sfGFP bearing 7 or 101 and lysozyme were subjected to a thiol-ene reaction by irradiating with 365 nm UV light for 10 min in the presence of the photoinitiator I2959. Analysis of the bioconjugated proteins by SDS-PAGE revealed bands of the expected molecular weight, as the bands corresponding to sfGFP shifted from 28 kDa to 44 kDa via conjugation to lysozyme after UV exposure (Figure 17). Without UV irradiation, no significant mobility shift was observed. As expected, wild-type sfGFP did not undergo a thiol-ene reaction with lysozyme. Overall, a successful protein-protein heterodimer formation via thiol-ene conjugation of an alkene-containing protein was achieved.
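As a rough plausibility check on the observed band shift, the mass of a 1:1 sfGFP-LYZ conjugate can be estimated from the numbers given in the text. The sketch below assumes an average residue mass of 110 Da, a common rule of thumb rather than a value from this work, and takes the 28 kDa apparent mass of sfGFP as reported.

```python
AVG_RESIDUE_MASS_DA = 110.0  # assumption: rule-of-thumb average residue mass


def approx_protein_mass_kda(n_residues, avg_residue_mass=AVG_RESIDUE_MASS_DA):
    """Approximate protein mass in kDa from its number of residues."""
    return n_residues * avg_residue_mass / 1000.0


SFGFP_KDA = 28.0                              # apparent sfGFP band from the text
lyz_kda = approx_protein_mass_kda(129)        # lysozyme has 129 amino acids
conjugate_kda = SFGFP_KDA + lyz_kda           # assumes a 1:1 heterodimer

print(f"Estimated lysozyme mass:  {lyz_kda:.1f} kDa")
print(f"Estimated conjugate mass: {conjugate_kda:.1f} kDa")
```

The estimate of about 42 kDa is close to the 44 kDa band seen on the gel; apparent masses from SDS-PAGE are themselves approximate, so a difference of this size is not surprising.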

[Figure 17: SDS-PAGE gel image; lanes 1-9 with wt, 7, 101, and LYZ samples.]

Figure 17. Alkenyl-sfGFP is bioconjugated to lysozyme, assembling a non-linear protein dimer via the thiol-ene reaction. SDS-PAGE analysis shows mobility band shifts of sfGFP-7 and sfGFP-101 from 28 kDa to 44 kDa after samples were irradiated at 365 nm for 10 min (lanes 8 and 9), corresponding to the molecular weight of the sfGFP-LYZ conjugate. wt: wild-type sfGFP; 7 and 101: sfGFP carrying the corresponding UAA; LYZ: lysozyme. -UV: samples were not exposed to UV irradiation. +UV: samples were irradiated at 365 nm for 10 min. Experiments were performed by Dr. Chungjung Chou.


2.4.5 Synthesis of alkene-reactive probes for photo-click reactions

The photo-click reaction is a photo-inducible 1,3-dipolar cycloaddition between a nitrile imine and an alkene. The nitrile imine is formed in situ by UV activation of a tetrazole and reacts with the alkene to form a stable, fluorescent pyrazoline cycloadduct (Figure 10). The photo-click reaction has been used to modify proteins in vitro and in living cells.158-160 Thus, we envisioned the application of the alkene lysine analogs in protein labeling via the photo-click cycloaddition. To this end, we synthesized tetrazole 128 based on a literature procedure.185 Compound 128 was selected because it has been shown to have fast reaction kinetics with unactivated alkenes in E. coli cells.186, 187 Since the formed pyrazoline cycloadduct is fluorescent, we applied 128 for the fluorescent labeling of Myo bearing alkene lysine 7 upon UV irradiation at 302 nm (Figure 18).

[Figure 18: A) structure of tetrazole 128; B) gel images of Myo-7 (17 kDa) after 0, 2, or 5 min of irradiation at 302 nm.]

Figure 18. Fluorescent labeling of Myo-7 via photo-click cycloaddition with 128. A) Structure of tetrazole 128. B) SDS-PAGE Coomassie stain (top) and in-gel fluorescence (bottom) analyses for the labeling of Myo-7 with 128 upon irradiation with 302 nm UV light for 2 or 5 min.


The synthesis of 128 and that of other tetrazoles in our study commenced with the assembly of phenylsulfonylhydrazones (123-127) from the corresponding aldehydes (118-122) and p-toluenesulfonylhydrazide in hot ethanol, delivering the products in 70-89% yield (Scheme 12). To prepare 121, 4-hydroxybenzaldehyde (119) was first acetyl-protected in the presence of TEA in THF, yielding 4-formylphenyl acetate (121) in 97% yield. Protection of the hydroxyl group was necessary, as the reaction toward tetrazole formation otherwise failed to deliver the corresponding compound. Construction of the 2,5-diaryl tetrazoles 128-131 was achieved by cyclization when the phenylsulfonylhydrazones (118, 120-122) were reacted with the diazonium salt of aniline or p-anisidine in pyridine. The formation of 128-130 suffered from low yields (27%, 20% and 37%), while 131 was obtained in 71% yield. Deacetylation of 130, as well as hydrolysis of 131, delivered the derivatives 132 and 133 in quantitative yields.


Scheme 12. Synthesis of alkene-reactive tetrazoles.

As observed in Figure 18, low fluorescence was detected after reacting 128 with 7. In addition, no significant improvement was observed when the tetrazoles 128-133 were applied for protein labeling experiments (results not shown). Thus, we opted to introduce a fluorophore on a tetrazole. Since electron-donating functionalities, such as methoxy, on the para-positions of the aryl groups on the tetrazole have been shown to increase the tetrazole’s reaction rate,188 we selected the derivative 132, which can be further functionalized via an ether bond on the C-phenyl side. Accordingly, based on a previously reported procedure,159 a linker at the para position on the C-phenyl side of the tetrazole 132 was introduced to allow attachment of probes (Scheme 13). An amine linker was installed via alkylation of the phenolic hydroxy on 132 using 3-(N-Boc)-propylbromide to afford tetrazole 134 in 88% yield. The tetrazole 135 was then obtained upon Boc-deprotection of 134 in an anhydrous HCl solution in 77% yield. We then proceeded to further functionalize 135 with a dansyl fluorophore (136) or biotin (137). The reactions were carried out in the presence of an organic base in anhydrous, aprotic solvent with dansyl chloride or NHS-biotin to obtain the corresponding products 136 and 137 in 77% and 80% yield, respectively.

Scheme 13. Synthesis of tetrazole 135 bearing a reactive linker and functionalized tetrazoles 136 and 137 carrying a fluorophore and biotin, respectively.

Bioconjugation reactions between alkene-modified proteins and tetrazoles 136 and 137 were attempted by Dr. Chungjung Chou. However, after several efforts these proved to be unsuccessful due to off-target protein labeling under UV irradiation conditions.

2.5 Genetic code expansion with diene lysines

Over thirty years ago, it was discovered that the Diels-Alder cycloaddition is accelerated in aqueous solution compared to organic solvents.189 Further studies have shown that water not only accelerates the reaction but also increases the stereoselectivity of the cycloaddition.190 This knowledge has been used advantageously to form covalent biomolecule modifications over the past ten years.191-194 The reaction between an electron-rich 1,3-diene and an electron-deficient dienophile has been explored for residue-specific chemical modification of proteins to introduce fluorescent labels,192 prepare glycoproteins,195 immobilize proteins,196-200 or chemically activate proteins.201, 202 As an alternative, we envisioned the site-specific modification of proteins by incorporating UAAs bearing conjugated dienes. Summerer et al.71 demonstrated the site-specific incorporation of the UAA 27 (Figure 4) into proteins in E. coli using an evolved PylRS and showed furan crosslinking by red light. Although 1,3-diene structures such as furan have been shown to be useful for the modification of biomolecules, there is no report of site-specific protein labeling via the Diels-Alder reaction of genetically encoded 1,3-dienes. Here, we show the synthesis of 1,3-diene-modified lysine analogs and their site-specific incorporation into protein. The incorporated UAAs may serve as additional bioorthogonal chemical handles to site-specifically label proteins with electron-deficient dienophiles, such as maleimide probes, a variety of which are commercially available.

2.5.1 Synthesis of 1,3-diene lysines

Synthesis of the acyclic 1,3-diene lysine 140 was achieved in three steps starting with the activation of trans,trans-2,4-hexadien-1-ol (138) using diphosgene (Scheme 14). The resulting chloroformate was subsequently reacted with Boc-Lys-OH in a basic aqueous solution to furnish the Boc-protected hexadiene amino acid 139 in 52% yield over two steps. Upon Boc-deprotection with TFA in the presence of triethylsilane, the free amino acid 140 was obtained in quantitative yield.

Scheme 14. Synthesis of acyclic 1,3-diene 140.

Next, we aimed to synthesize furan lysine 144 bearing a side-chain carbamate moiety (Scheme 15). Furfuryl alcohol (141) was activated to the NHS-activated carbonate 142 in 57% yield. This step was followed by the nucleophilic attack of Boc-Lys-OH in DMF to furnish the Boc-protected furan UAA 143 in a yield of 44%. Final Boc-deprotection with either TFA or HCl did not result in lysine 144 due to decomposition. The synthesis of a cyclohexadiene lysine 152 was also attempted. Starting with 3-cyclohexene-1-methanol (145), the primary alcohol was protected with TBDMSCl to give 146 in 96% yield.191 Next, compound 146 was brominated (147), followed by elimination to obtain the cyclohexadiene 148.191 Removal of the TBDMS group delivered the alcohol 149, which was subsequently activated to the NHS-ester 150 by reaction with DSC (70%). Acylation of Boc-Lys-OH with 150 was accomplished in 91% yield. However, similar to furan lysine 144, we were unable to obtain 152 after treating 151 with TFA or HCl.


Scheme 15. Attempted synthesis of furan lysine 144 and cyclohexadiene lysine 152.

We then aimed to synthesize two furan-containing lysine analogs bearing either an amide (155) or an “inverted” carbamate (158) linkage to the side chain of lysine that could potentially be more stable to acidic conditions (Scheme 16). To this end, furancarboxylic acid (153) was stirred in thionyl chloride to convert it to the corresponding acyl chloride, which was subsequently reacted with Boc-Lys-OH in a basic solution to deliver 154 in 75% yield. Final Boc-deprotection of 154 by TFA in the presence of triethylsilane successfully yielded the furyl amino acid 155 in quantitative yield. Next, furfuryl isocyanate (156) was reacted with 6-hydroxy-Boc-L-norleucine-OH in the presence of DIPEA in DCM, giving 157 in a yield of 69%. Final treatment with TFA allowed removal of the Boc group to furnish 158 in 98% yield.

Scheme 16. Synthesis of furan lysines 155 and 158.

Next, the Diels-Alder cycloaddition was tested by reacting the N-Boc-protected acyclic (139) and cyclic (157) 1,3-dienes with N-phenylmaleimide. A 10 mM solution of the diene was prepared in MeOH and H2O, and 20 equivalents of the maleimide were added. The reactions were stirred overnight at room temperature. The samples were analyzed by ESI-MS, and formation of the corresponding products was confirmed.
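The reagent amounts implied by these test conditions can be worked out as follows. This is only a sketch: the 1 mL reaction volume is a hypothetical value chosen for illustration (the text does not state the scale), and the molecular weight of N-phenylmaleimide (C10H7NO2, 173.17 g/mol) is taken from its formula.

```python
N_PHENYLMALEIMIDE_MW = 173.17  # g/mol, calculated for C10H7NO2


def reagent_amounts(diene_mM, equivalents, volume_mL, reagent_mw):
    """Return diene (umol), reagent (umol), reagent (mg), and reagent conc. (mM)."""
    diene_umol = diene_mM * volume_mL                 # mM x mL = umol
    reagent_umol = diene_umol * equivalents
    reagent_mg = reagent_umol * reagent_mw / 1000.0   # umol x g/mol = ug
    reagent_mM = diene_mM * equivalents
    return diene_umol, reagent_umol, reagent_mg, reagent_mM


# 10 mM diene and 20 equivalents from the text; 1 mL volume is hypothetical
diene_umol, maleimide_umol, maleimide_mg, maleimide_mM = reagent_amounts(
    diene_mM=10, equivalents=20, volume_mL=1.0, reagent_mw=N_PHENYLMALEIMIDE_MW
)
print(f"diene: {diene_umol:.0f} umol")
print(f"N-phenylmaleimide: {maleimide_umol:.0f} umol "
      f"({maleimide_mg:.1f} mg, {maleimide_mM:.0f} mM)")
```

At 20 equivalents the dienophile is present at 200 mM, a large excess that favors conversion of the diene in an overnight reaction.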

2.5.2 Genetic encoding of 1,3-diene lysines

To test the incorporation of the newly synthesized UAAs into proteins, a panel of PylRS variants was screened in mammalian cells. Dr. Chungjung Chou performed the PylRS screening in HEK 293T cells using a model protein construct, mCherry-TAG-EGFP, which encodes the mCherry protein, an amber stop codon (TAG), and enhanced green fluorescent protein (EGFP).161 The expression of the full-length fusion protein is dependent on the incorporation of a UAA at the TAG position between both proteins. Thus, suitable synthetases were identified by detecting fluorescence that corresponds to the expression of EGFP in cells via imaging. We found that 140 is marginally incorporated into protein by MbPylRS. Also, the furyl lysine 155 is incorporated into protein by the MbPylRS, as well as the synthetase V8, which bears the mutations C313V and M315Q. While several synthetases were identified for the incorporation of 158, the UAA is especially well tolerated by the PylRS variant V11 (D203N, Y271C, L274V, C313V, and M315Y). The furan lysines 155 and 158 were then incorporated into protein in E. coli. Dr. Chou double transformed Top10 cells using Myo-S4TAG-PylT with the corresponding PylRS construct, EV1 (MbPylRS) or EV11, respectively, in the presence of 155 or 158. The purified proteins were analyzed by ESI-MS, and the incorporation of the furan lysines was confirmed (Figure 19).
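The logic of the mCherry-TAG-EGFP reporter used in the screen can be summarized in a toy model. The sketch below is purely illustrative: it treats the construct as an ordered list of segments and ignores real-world effects such as partial suppression or background read-through; the function names are hypothetical.

```python
def translate_reporter(segments, uaa_incorporated):
    """Return the domains produced from an ordered list of reporter segments.

    'TAG' marks the amber codon. It is read through (inserting the UAA) only
    when a synthetase charges the suppressor tRNA with the UAA; otherwise it
    acts as a stop codon and translation ends there.
    """
    product = []
    for segment in segments:
        if segment == "TAG":
            if not uaa_incorporated:
                break  # amber codon terminates translation
            product.append("UAA")
        else:
            product.append(segment)
    return product


def is_green(product):
    """EGFP fluorescence requires the full-length fusion."""
    return "EGFP" in product


reporter = ["mCherry", "TAG", "EGFP"]
with_uaa = translate_reporter(reporter, uaa_incorporated=True)
without_uaa = translate_reporter(reporter, uaa_incorporated=False)
print(with_uaa, is_green(with_uaa))
print(without_uaa, is_green(without_uaa))
```

In this model mCherry is produced in both cases, while EGFP appears only when the UAA is incorporated, which is why green fluorescence serves as the readout for a suitable synthetase.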

[Figure 19: ESI-MS spectra (mass in Da); A) Myo-155; B) Myo-158.]

Figure 19. ESI-MS analysis of Myo confirming the genetic incorporation of 155 (expected mass: 18490.07) and 158 (expected mass: 18520.08) at position S4.


2.6 Genetic code expansion with norbornene lysine and protein labeling

Recent advancements in bioorthogonal chemistry have demonstrated that strained alkenes can react with tetrazines in an inverse electron-demand Diels-Alder cycloaddition (Figure 10). In contrast to the classic Diels-Alder reaction, in an inverse electron-demand Diels-Alder cycloaddition an electron-rich dienophile reacts with an electron-poor diene, forming a stable cycloadduct. Reactions between strained alkenes or alkynes, such as trans-cyclooctene, norbornene, cyclopropene, or cyclononynes, and tetrazines have been explored for labeling and manipulating biomolecules in their native environment.66, 161, 203, 204 The rates of these reactions are orders of magnitude faster than those for other well-known bioorthogonal reactions.203-205 Another advantage of the reaction is that it can be made fluorogenic: conjugation to the tetrazine quenches the fluorescence of a fluorophore, and the fluorescence is restored upon reaction.161, 204, 206 Thus, the genetic encoding of a component of these reactions is a promising approach for rapid and site-specific protein labeling.

Here, we present a brief description of our published work in collaboration with the Chin lab (MRC-LMB), where we reported the efficient synthesis and site-specific incorporation of a norbornene-bearing lysine analog 164 into proteins in E. coli and on the surface of mammalian cells.161 Since then, a rapidly increasing number of studies have been reported by other researchers, and these efforts have led to the development of new UAAs and probes for a number of applications with proteins. Strained alkene and alkyne functionalities that have been genetically encoded by others using PylRS/PylT pairs include norbornene 159 (ref 207), trans-cyclooctenes 23 and 160 (ref 66), and cyclooctynes 21 (ref 64) and 22 (ref 66) (Figures 4 and 20). In addition, a tetrazine-containing phenylalanine derivative was incorporated into proteins in E. coli via a MjTyrRS variant.208

Figure 20. Structures of strained-alkene derivatives that have been genetically encoded by PylRS/PylT pairs for protein labeling via inverse electron-demand Diels-Alder reactions.

2.6.1 Synthesis of a norbornene lysine

The synthesis of norbornene lysine 164 was achieved in three steps starting from commercially available 5-norbornene-2-ol (161) (Scheme 17). Activation of the norbornene alcohol with DSC in the presence of TEA resulted in the activated carbonate 162 as a white solid in 82% yield. Upon reacting the activated norbornene with Boc-Lys-OH in DMF, the Boc-protected norbornene lysine 163 was obtained as an off-white foam in 95% yield. Boc-deprotection of 163 using a solution of TFA in DCM, followed by HCl salt exchange using 1 M HCl, afforded the desired amino acid 164 as a white solid in quantitative yield.


Scheme 17. Synthesis of norbornene lysine 164.

2.6.2 Genetic encoding of a norbornene lysine

The genetic incorporation of 164 into proteins was investigated by Dr. Chungjung Chou in the Deiters lab and by our collaborators in the Chin lab. The amino acid 164 was investigated as a substrate of the MbPylRS/PylT pair in E. coli. To this end, Top10 cells were co-transformed with psfGFP-150TAG-PylT-His6, which encodes a C-terminally hexahistidine-tagged sfGFP gene with an amber stop codon at position 150 and MbtPylT, and pBKPylS, which encodes MbPylRS. SDS-PAGE analysis demonstrated that sfGFP expression is dependent on the presence of 164 (Figure 21A). Similarly, the construct pMyo-4TAG-PylT-His6 was used for Myo bearing an amber stop codon at position 4, and we showed that protein was produced in the presence, but not in the absence, of 164. The incorporation of 164 was further confirmed by ESI-MS of purified proteins (Figure 21B).

[Figure 21: A) SDS-PAGE gel images; B) ESI-MS spectra; panel labels include 6 and 164.]

Figure 21. Genetically encoded 164 by the MbPylRS/PylT pair in E. coli. A) UAA-dependent expression of sfGFP with an amber stop codon at position 150 and Myo with an amber stop codon at position 4. B) ESI-MS analyses confirming the incorporation of 164. i, sfGFP-164-His6, calculated: 27,977.5 Da, found: 27,975.5±15 Da. ii, Myo-164-His6, calculated: 18,532.2 Da, found: 18,532.5±15 Da. Experiments were performed at the Chin lab. Figure adapted from Nat. Chem. 2012, 4, 298.

Once the genetic encoding of 164 into proteins was established, the Chin lab proceeded to demonstrate the site-specific labeling of mammalian cell surface proteins. An amber stop codon was introduced into an epidermal growth factor receptor-EGFP fusion protein (EGFR-EGFP) at position 128. Expression of the full-length EGFR-EGFP fusion protein was dependent on the presence of UAAs 164 or 6 (Figure 22). To demonstrate the site-specific labeling of the cell surface via EGFR-EGFP containing 164, cells were treated with a tetramethylrhodamine (TAMRA) tetrazine dye. Cells were imaged and red fluorescence arising from the labeling of 164 was only observed for EGFR-164-EGFP and not for EGFR-6-EGFP (Figure 22). This represents, for the first time, the labeling of single, genetically defined sites on proteins on the mammalian cell surface using UAA mutagenesis.

[Figure 22: A) schematic of the EGFR-EGFP construct with TAG at position 128; B) cell images for +164 and +6.]

Figure 22. Specific labeling of a cell surface protein in mammalian cells. Expression of full-length EGFR-EGFP is dependent on 164 or 6 at position 128 as it is visible by green fluorescence (left panel), while specific protein labeling with TAMRA-tetrazine (middle panel) is only visible for EGFR-164-EGFP by red fluorescence; merged images (right panel). Experiments were performed by the Chin lab. Figure adapted from Nat. Chem. 2012, 4, 298.

2.6.3 Synthesis of tetrazine probes for protein labeling

Initial studies on the inverse electron-demand Diels-Alder cycloaddition for bioconjugation reactions identified aryl-1,2,4,5-tetrazines as suitable derivatives due to their biocompatibility.203 Since then, a number of diaryl-tetrazines containing different substitution patterns have been explored.66, 161 Tetrazines with varying degrees of reactivity and stability have been reported.161, 209 In general, more electron-deficient tetrazines react faster, whereas those with electron-donating groups are more stable in biological systems.209 Thus, based on the literature, we prepared several tetrazines with different substituent groups that can exhibit different reactivity and stability. Depending on the application, a tetrazine can be selected based on the desired reactivity or stability.

The tetrazines were assembled with a phenyl group at the tetrazine 3 position, where this phenyl bears a carboxylic acid for further functionalization (Scheme 18). The 6 position was varied by leaving it unsubstituted (169) or by introducing a methyl (170), phenyl (171), or pyrimidyl (172) group. The synthesized tetrazines and the corresponding probes can enable further studies on protein labeling and manipulation using the highly efficient inverse electron-demand Diels-Alder cycloaddition.

Although there is a rapidly growing interest in the use of 1,2,4,5-tetrazines, their synthetic preparation remains a challenge. The most convenient route to tetrazines involves the reaction of aryl nitriles with hydrazine, followed by in situ oxidation. Thus, the tetrazines 169-172 were synthesized in a similar manner from 4-cyanobenzoic acid (165) in an excess of hydrazine, followed by oxidation with sodium nitrite in an HCl solution (Scheme 18). Metal salts have previously been used to assemble tetrazines from unactivated nitriles, either to improve yields or to prepare tetrazines that otherwise have not been synthesized or are obtained only in trace amounts.210 Although the mechanism is still debated, it is possible that the metal acts as a Lewis acid and promotes nucleophilic attack of the hydrazine on the metal-activated nitrile. Accordingly, reaction of 4-cyanobenzoic acid (165) with formamidine (166) in the presence of zinc triflate and hydrazine resulted in the monoaryl tetrazine 169 in 56% yield (see table). A similar yield (55%) was obtained when nickel triflate was used, while a low yield of 19% was observed without catalyst. The result of the uncatalyzed reaction agrees with the reported209 synthesis of 169 (18%). Similarly, the tetrazine 170, which bears a 6-methyl group, was synthesized using acetonitrile and nickel triflate in 63% yield, while a 56% yield was obtained without catalyst, suggesting that the addition of the catalyst is not as necessary in this case as for 169. Although the reported209 yield for the uncatalyzed synthesis of 170 is 11%, we employed modifications in our procedure, including the reaction time and purification method. The tetrazines 171 and 172 were obtained from the corresponding aryl nitriles, benzonitrile and 2-pyrimidinecarbonitrile, in low yields of 20% and 10%, respectively, owing to the formation of symmetric tetrazine products. In the case of tetrazine 172, we followed a reported procedure for its synthesis and purification.211

Scheme 18. Synthesis of tetrazines 169-172 bearing various substituents at position 6 of tetrazine and a carboxylic acid functional group at the 3-phenyl for further functionalization.

For studies on protein stabilization, we aimed to conjugate polyethylene glycol (PEG) groups to tetrazines and subsequently PEGylate a protein via an inverse electron-demand Diels-Alder cycloaddition with a norbornene lysine (164) residue. We synthesized the PEGylated tetrazines 175 and 176 by first activating the carboxylic acid functionality of 169 and 170 as the NHS ester via an EDCI coupling reaction, which gave tetrazines 173 and 174 in 86% and quantitative yield, respectively (Scheme 19). The reagent mPEG5000-NH2 was then reacted with the succinimidyl esters 173 and 174 in a mixture of hot DMF and DCM, giving the corresponding PEG-tetrazines 175 and 176. The loading of the PEG group on the tetrazines was determined by measuring the absorbance and comparing it with that of 169 or 170 at the same concentration. The conjugation yields were determined to be 97% for tetrazine 175 (R = H) and 77% for tetrazine 176 (R = Me).
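The absorbance comparison used to estimate the PEG loading can be sketched as a simple ratio. The snippet below is only an illustrative sketch: it assumes that the PEG chain does not absorb at the measured wavelength and that both samples obey the Beer-Lambert law at the same wavelength, path length, and nominal concentration. The function name and the absorbance readings are hypothetical and are not measured data from this work.

```python
def loading_percent(abs_conjugate: float, abs_reference: float) -> float:
    """Estimate tetrazine loading (%) from two absorbance readings.

    Both readings are assumed to be taken at the same wavelength, path
    length, and nominal concentration, so that their ratio approximates
    the fraction of PEG chains carrying a tetrazine.
    """
    if abs_reference <= 0:
        raise ValueError("reference absorbance must be positive")
    return 100.0 * abs_conjugate / abs_reference


# Hypothetical readings, chosen only to illustrate the calculation.
example = loading_percent(abs_conjugate=0.485, abs_reference=0.500)
print(f"Estimated loading: {example:.0f}%")
```

With these illustrative readings the ratio corresponds to a loading of 97%; the actual absorbance values behind the reported 97% and 77% are not given in the text.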

Scheme 19. Synthesis of the PEG-tetrazines 175 and 176.

Protein PEGylation with the newly synthesized PEG-tetrazines was tested by Alex Prokup (Deiters lab). Purified sfGFPs bearing either alloc lysine (7) or norbornene lysine (164) at position Y151 were incubated overnight with PEG-tetrazine 175. Subsequent SDS-PAGE analysis showed two protein bands for sfGFP-164, one of which was around 5 kDa larger than the original protein (Figure 23). The band intensities indicate roughly over 50% conversion when 100 equivalents of 175 are supplied. The same conditions with sfGFP-7 did not yield a modified protein, thus showing the specificity of 175 toward the strained alkene 164 in protein labeling.

[Figure 23: SDS-PAGE gel image. Lanes: sfGFP-7 and sfGFP-164, each treated with 0, 1, 10, or 100 eq of 175. Marker positions: 58, 46, 30, and 25 kDa. Band labels: PEGylated protein, original protein.]

Figure 23. SDS-PAGE analysis demonstrates specific PEGylation of sfGFP-164. Experiment performed by Alex Prokup.

To conjugate fluorophores to tetrazines 169-172, the carboxylic acid on their 3-phenyl ring was functionalized with a reactive amine group (Scheme 20). In the search for optimal conditions for coupling an amine linker to the carboxylic acid, a screen of several conditions for the formation of 177 identified HATU as the preferred coupling reagent. Similar conditions were employed to obtain the N-Boc-protected tetrazines 178-180 in yields of 54-88%. Upon Boc deprotection, the free amines were obtained as TFA or HCl salts in yields from 75% to quantitative, furnishing the tetrazines 181-184.

Scheme 20. Synthesis of tetrazines containing an amine linker (181-184) for further functionalization. X = TFA or HCl

Next, we tested the attachment of dansyl as a fluorophore by reacting tetrazine 181 with dansyl chloride, delivering dansyl-tetrazine 185 in 71% yield (Scheme 21). Since fluorescein is a more efficient fluorophore for imaging biological samples, we decided to use FITC to introduce the fluorescein probe. Several reaction conditions were screened for 181, as initial attempts failed to give good amounts of the corresponding product 186. We found that the reaction proceeds in moderate yield (62%) in the presence of TEA in a mixture of MeOH with either THF or DMF. Applying the same reaction conditions led to the FITC-tetrazines 187-189 in yields of 50-55%.

Scheme 21. Synthesis of various tetrazine reagents for fluorescence labeling.

2.7 Summary

This chapter presents the syntheses of new UAAs and reactive probes for protein labeling. Several of the synthesized UAAs are incorporated into proteins site-specifically using a PylRS/PylT pair in E. coli or mammalian cells. We successfully synthesized a bipyridine lysine that could be applied to the synthesis of metalloproteins in mammalian cells via UAA mutagenesis. We also focused on the synthesis of UAAs that serve as reactive handles for bioconjugation reactions. These UAAs can be site-specifically incorporated into proteins and then undergo bioorthogonal reactions in biological systems, allowing proteins to be modified post-translationally in a specific manner. We prepared a variety of UAAs that can be applied to carbonyl/aminooxy condensations, thiol-ene reactions, and Diels-Alder cycloadditions, as well as bioorthogonal reaction partners such as aminooxy dyes, thiol, and tetrazine probes. We demonstrated the direct synthesis of a protein heterodimer in E. coli using genetically encoded alkenes and the thiol-ene reaction. In addition, a norbornene lysine was applied in the selective labeling of the mammalian cell surface via an inverse electron-demand Diels-Alder cycloaddition. The presented UAAs provide chemical biologists with tools to investigate protein structure, dynamics, and localization in a specific and selective manner with minimal perturbation to the protein under study.

2.8 Experimental data for synthesized compounds

Unless otherwise stated, all reagents were obtained from commercial sources and used as received. Reactions were stirred magnetically and carried out under nitrogen using flame-dried glassware. DCM, THF, and Et2O were dried using an MB SPS Compact solvent purification system. MeCN, DMF, DCE, DIPEA, TEA, and pyridine were distilled from calcium hydride. MeOH and EtOH were distilled from magnesium and iodine. The distilled solvents were stored under nitrogen over molecular sieves (3 Å for MeOH and EtOH, and 4 Å for all other solvents). Reactions were followed by thin layer chromatography (TLC) using glass-backed silica gel plates (Sorbent Technologies, 250 µm thickness) and visualized under a UV lamp and/or by staining with a KMnO4 solution. Flash chromatography was performed on silica gel (60 Å, 40-63 μm, 230-400 mesh, Sorbtech) as the stationary phase. Melting points were determined using a capillary melting point apparatus. The 1H NMR and 13C NMR spectra were recorded on a 300 MHz or 400 MHz Varian NMR spectrometer. HRMS was performed at the University of Pittsburgh.

Ethyl [2,2'-bipyridine]-5-carboxylate (42). Triphenylphosphine (84 mg, 0.32 mmol) and Pd(PPh3)2Cl2 (228 mg, 0.32 mmol) were added to a solution of ethyl 6-chloronicotinate (1.2 g, 6.46 mmol) and 2-(tributylstannyl)pyridine (2.17 mL, 6.78 mmol) in dry 1,4-dioxane (30 mL). The reaction mixture was stirred at 100 °C for 18 h and then allowed to cool to rt before concentrating it under reduced pressure. The resulting residue was taken up in DCM and filtered through a bed of SiO2 and Celite® (1:1). The filtrate was concentrated and the residue was purified by flash column chromatography on alumina (basic, I) eluting with hexanes, then Hex/EtOAc (30:1 to 10:1) to afford 42 (1.0 g, 68%) as a light beige powder. 1H NMR and HRMS-ESI data matched literature values.95
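The isolated yields reported in this section follow from the mass of product and the amount of limiting reagent. As a hedged illustration, the sketch below recomputes the yield of 42. It assumes that ethyl 6-chloronicotinate is the limiting reagent and uses a molar mass calculated here from the formula C13H12N2O2 implied by the compound name, which is not stated in the text; the function names are hypothetical.

```python
# Standard atomic masses (g/mol) for the elements needed here.
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}


def molar_mass(formula: dict) -> float:
    """Sum atomic masses for a formula given as {element: count}."""
    return sum(ATOMIC_MASS[element] * count for element, count in formula.items())


def percent_yield(product_g: float, product_mw: float, limiting_mmol: float) -> float:
    """Isolated yield (%) relative to the limiting reagent."""
    theoretical_g = limiting_mmol * product_mw / 1000.0  # mmol * g/mol -> g
    return 100.0 * product_g / theoretical_g


# Compound 42: C13H12N2O2 (formula inferred from the name, an assumption).
mw_42 = molar_mass({"C": 13, "H": 12, "N": 2, "O": 2})
yield_42 = percent_yield(product_g=1.0, product_mw=mw_42, limiting_mmol=6.46)
print(f"MW = {mw_42:.2f} g/mol, yield = {yield_42:.1f}%")
```

Under these assumptions the calculated value rounds to the reported 68%.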

[2,2'-Bipyridin]-5-ylmethyl (2,5-dioxopyrrolidin-1-yl) carbonate (44). N,N’-Disuccinimidyl carbonate (192 mg, 0.75 mmol) was added to a solution of alcohol 43 (ref. 94; 70 mg, 0.37 mmol) and TEA (157 µL, 1.12 mmol) in dry MeCN (2 mL) at rt. The resulting mixture was stirred at rt overnight and then concentrated under vacuum. The product was purified by column chromatography on alumina (basic, I), eluted with 6:1 Hex/EtOAc, to deliver 44 (27 mg, 22%) as a white solid. 1H NMR (300 MHz, CDCl3) δ 8.70 (m, 2H), 8.44 (t, J = 8.4 Hz, 2H), 8.02 (dd, J = 2.1, 6.3 Hz, 1H), 7.84 (m, 1H), 7.33 (m, 1H), 5.28 (s, 2H), 2.67 (s, 4H) ppm; 13C NMR (75 MHz, CDCl3) δ 171.1, 157.3, 155.7, 150.2, 149.4, 138.7, 137.1, 129.2, 121.5, 121.0, 75.9, 25.6 ppm.

[2,2'-Bipyridine]-5-carboxylic acid (48). The ethyl ester 42 (1.0 g, 4.38 mmol) was dissolved in hot methanol (5 mL) and 1 M NaOH (5 mL) was added. The reaction mixture was stirred at rt for 5 h or until the starting material was consumed (as judged by TLC). The volatiles were removed under reduced pressure and the residue was diluted with water (10 mL) and washed with EtOAc (15 mL). The aqueous layer was acidified to pH 3-4 with 5% citric acid and extracted with EtOAc (3 × 20 mL). The combined organic layers were washed with water (20 mL) and brine (10 mL), dried over Na2SO4, filtered, and concentrated to dryness to afford 48 (870 mg, 99%) as a white solid. 1H NMR spectral data matched literature values.94, 95

2,5-Dioxopyrrolidin-1-yl [2,2'-bipyridine]-5-carboxylate (49). Compound 48 (870 mg, 4.34 mmol) was dissolved in dry DMF (10 mL), and N-hydroxysuccinimide (0.5 g, 4.34 mmol) and N,N'-dicyclohexylcarbodiimide (0.9 g, 4.34 mmol) were added. The reaction mixture was stirred overnight at rt. After filtration, the filtrate was concentrated under reduced pressure. The remaining residue was dissolved in a minimal amount of DCM and precipitated into hexanes; this process was repeated two more times. The precipitate was collected and dried under high vacuum to obtain 49 (0.89 g, 69%) as a white solid. 1H NMR and HRMS-ESI data matched literature values.212

(S)-6-([2,2'-Bipyridine]-5-carboxamido)-2-((tert-butoxycarbonyl)amino)hexanoic acid (50). Compound 49 (0.45 g, 1.51 mmol) and Boc-Lys-OH (0.45 g, 1.82 mmol) were dissolved in dry DMF (7 mL) and the reaction mixture was stirred overnight at rt. Water (20 mL) and EtOAc (15 mL) were added and the aqueous layer was extracted with EtOAc (3 × 15 mL). The combined organic extracts were washed with water (3 × 40 mL) and brine (20 mL), dried over Na2SO4, filtered, and concentrated under reduced pressure to dryness to afford 50 (537 mg, 83%) as an off-white solid; mp 152-154 °C. 1H NMR (400 MHz, CDCl3) δ 9.10 (s, 1H), 8.70 (d, J = 4.4 Hz, 1H), 8.33-8.23 (m, 3H), 7.84 (t, J = 7.6 Hz, 1H), 7.36 (t, J = 6.4 Hz, 1H), 5.44 (d, J = 8.0 Hz, 1H), 4.31 (m, 1H), 3.44 (m, 2H), 1.85 (m, 2H), 1.76 (t, J = 8.8 Hz, 2H), 1.65 (t, J = 6.8 Hz, 2H), 1.40 (s, 9H) ppm; 13C NMR (100 MHz, CDCl3) δ 175.5, 166.1, 157.8, 156.0, 155.0, 149.2, 148.3, 137.9, 136.4, 130.2, 124.8, 122.5, 121.3, 80.2, 53.4, 40.1, 32.3, 29.0, 28.5, 22.7 ppm; HRMS-ESI (m/z): [M-H]− calcd for C22H28N4O5 427.19760, found 427.20020.
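The agreement between calculated and found HRMS values is commonly expressed as a mass error in parts per million. The following sketch shows that arithmetic for the values reported for 50; the function name is hypothetical and no acceptance threshold is implied, since none is stated in the text.

```python
def ppm_error(calculated_mz: float, found_mz: float) -> float:
    """Relative mass error in parts per million (found vs. calculated m/z)."""
    return (found_mz - calculated_mz) / calculated_mz * 1e6


# Values reported above for compound 50.
error_50 = ppm_error(calculated_mz=427.19760, found_mz=427.20020)
print(f"Mass error for 50: {error_50:.1f} ppm")
```

For 50 this gives an error of approximately 6 ppm. The same function can be applied to the other calcd/found pairs listed in this section.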

(S)-6-(6-(Pyridin-2-yl)pyridine-3-carboxamido)-2-aminohexanoic acid HCl salt (51). To a solution of 50 (0.5 g, 1.17 mmol) and TES (0.37 mL, 2.33 mmol) in dry DCM (16.5 mL), TFA (0.87 mL, 11.7 mmol) was added dropwise and the reaction mixture was allowed to stir at rt overnight. The volatiles were removed under reduced pressure and the residue was dissolved in a solution of 4 N HCl in 1,4-dioxane (5 mL) and DCM (15 mL), allowed to stir for 10 min at rt and then concentrated. The latter process was repeated two more times to ensure complete TFA to HCl salt exchange. The concentrated residue was dissolved in a minimal amount of MeOH and was precipitated into ice-cold Et2O. The mixture was pelleted by centrifugation, the supernatant decanted, and the solid was washed with Et2O before drying it under vacuum, affording the amino acid 51 (0.5 g, 98%) as an off-white solid; mp 172-174 °C. 1H NMR (400 MHz, CD3OD) δ 9.18 (s, 1H), 8.82 (m, 1H), 8.64 (d, J = 7.2 Hz, 1H), 8.52-8.34 (m, 3H), 7.82 (t, J = 4.5 Hz, 1H), 4.02 (t, J = 4.5 Hz, 1H), 3.50 (t, J = 7.2 Hz, 2H), 2.01 (m, 2H), 1.76 (t, J = 7.5 Hz, 2H), 1.58 (m, 2H) ppm; 13C NMR (100 MHz, CD3OD) δ 171.8, 166.5, 149.9, 149.8, 148.9, 148.0, 144.5, 139.3, 134.0, 128.8, 126.0, 123.7, 53.8, 40.6, 31.1, 29.8, 23.4 ppm; HRMS-ESI (m/z): [M-H]− calcd for C17H20N4O3 327.14517, found 327.14614.

Ethyl 3-(2-methyl-1,3-dioxolan-2-yl)propanoate (58). In a 250 mL round-bottom flask, equipped with a Dean-Stark apparatus and a condenser, anhydrous ethylene glycol (5.7 mL, 102 mmol) and pTSA (0.26 g, 1.3 mmol) were dissolved in dry benzene (100 mL). To this solution, 4-acetyl butyrate (10 mL, 68 mmol) was added and the reaction mixture was refluxed (80 °C) for 18 h with water being trapped in the Dean-Stark apparatus. After cooling to rt, the reaction mixture was washed with saturated NaHCO3 (2 × 100 mL), water (100 mL), and brine (50 mL). The organic layer was dried over MgSO4, filtered, and concentrated. The remaining residue was purified by flash column chromatography on silica gel, eluting with 4:1 Hex/EtOAc and 1% TEA to deliver 58 (8.82 g, 69%) as a clear oil. 1H NMR spectral data matched literature values.147

3-(2-Methyl-1,3-dioxolan-2-yl)propan-1-ol (59). The ethyl ester 58 (0.6 g, 3.18 mmol) was dissolved in dry Et2O (10 mL) and the solution was cooled to 0 °C. Then, LiAlH4 (0.14 g, 3.51 mmol) was added and the resulting slurry was stirred for 1 h at the same temperature. The reaction mixture was quenched with a piece of ice, diluted with Et2O (10 mL), and allowed to warm to rt. Then, it was filtered through Celite® and the filtrate was dried over MgSO4, filtered, and concentrated. Compound 59 (0.45 g, 97%) was obtained as a colorless oil and was used for the next step without further purification. 1H NMR spectral data matched literature values.147

(S)-2-((tert-Butoxycarbonyl)amino)-6-(((3-(2-methyl-1,3-dioxolan-2-yl)propoxy)carbonyl)amino)hexanoic acid carbonate (61). N,N’-Disuccinimidyl carbonate (3.0 g, 12.3 mmol) was added to a solution of alcohol 59 (0.9 g, 6.16 mmol) and pyridine (1.5 mL, 18.5 mmol) in dry MeCN (10 mL) at rt. The reaction mixture was stirred overnight and then concentrated. The residue was re-dissolved in MeCN (10 mL) to remove excess pyridine by co-evaporating under reduced pressure three times. The obtained residue containing compound 60 was dried in vacuo and then dissolved in dry DMF (8 mL) and Boc-Lys-OH (1.82 g, 7.39 mmol) was added to the solution. The reaction mixture was stirred overnight at rt and then diluted with water (20 mL) and the aqueous mixture was extracted with EtOAc (3 × 20 mL). The combined organic extracts were washed with water (3 × 40 mL) and brine (30 mL), dried over Na2SO4, filtered, and concentrated to dryness to furnish 61 (2.18 g, 85%) as a colorless oil. 1H NMR (400 MHz, CDCl3) δ 10.20 (s, br, 1H), 6.15 (s, br, 0.5H), 5.36 (d, J = 8.0 Hz, 1H), 5.05 (s, br, 0.5H), 4.21 (s, br, 1H), 4.04 (m, br, 2H), 3.89 (m, 4H), 3.10 (m, br, 2H), 1.76-1.17 (m, 22H) ppm; 13C NMR (100 MHz, CDCl3) δ 175.4, 157.0, 155.8, 109.8, 79.9, 64.8, 64.6, 53.1, 40.5, 35.4, 32.1, 29.3, 28.3, 23.9, 23.7, 22.3 ppm; HRMS-ESI (m/z): [M-H]− calcd for C19H33N2O8 417.22314, found 417.22337.

(S)-2-Amino-6-((((4-oxopentyl)oxy)carbonyl)amino)hexanoic acid HCl salt (62). Compound 61 (0.7 g, 1.67 mmol) was dissolved in Et2O (35 mL) and the solution was cooled to 0 °C. Next, ice-cold aqueous 4 M HCl (17.5 mL) was added and the reaction mixture was allowed to stir vigorously at rt for 12 h. The layers were allowed to separate and the aqueous layer was collected and washed with Et2O (20 mL). The aqueous phase was finally concentrated and the remaining solid was taken to dryness under high vacuum to obtain 62 (443 mg, 98%) as a white solid; mp 112-114 °C. 1H NMR (300 MHz, D2O) δ 4.03 (m, 2H), 3.63 (s, 1H), 3.09 (t, J = 6.9 Hz, 2H), 2.61 (t, J = 6.9 Hz, 2H), 2.18 (s, 3H), 1.91 (m, 4H), 1.50 (m, 4H) ppm; 13C NMR (100 MHz, D2O) δ 215.8, 172.1, 158.7, 64.7, 62.6, 52.8, 39.8, 29.5, 28.5, 23.1, 21.5 ppm; HRMS-ESI (m/z): [M-H]− calcd for C12H21N2O5 273.14450; found 273.14549.

(2S)-2-Amino-6-(((2,3-dihydroxypropoxy)carbonyl)amino)hexanoic acid (64). NHS-carbonate 63 (ref. 148; 2.0 g, 7.35 mmol) was dissolved in THF (20 mL) and cooled to 0 °C. Boc-Lys-OH (2.2 g, 8.82 mmol) was added, followed by slow addition of saturated NaHCO3 (5 mL) and the reaction mixture was allowed to warm to rt and was stirred overnight. Next, the THF was removed under reduced pressure and the residue was diluted with water (10 mL) and the aqueous mixture was washed with EtOAc (2 × 15 mL). The aqueous layer was collected and acidified with 5% citric acid to pH 3-4 and extracted with EtOAc (3 × 15 mL). The combined organic extracts were washed with water (30 mL) and brine (15 mL), dried over Na2SO4, filtered, and concentrated to dryness to furnish 64 (2.29 g, 79%) as a colorless oil. 1H NMR (300 MHz, CDCl3) δ 10.50 (s, br, 1H), 5.35 (d, J = 7.8 Hz, 1H), 5.13 (m, br, 1H), 4.31-4.21 (m, 2H), 4.16-4.11 (m, 1H), 4.05-3.95 (m, 2H), 3.70 (t, J = 6.3 Hz, 1H), 3.13 (m, 2H), 1.78-1.46 (m, 6H), 1.39 (s, 9H), 1.31 (s, 6H) ppm; 13C NMR (75 MHz, CDCl3) δ 175.6, 156.9, 155.8, 110.1, 80.0, 74.0, 66.1, 65.3, 53.1, 40.6, 31.9, 29.3, 28.3, 26.7, 25.3, 22.4 ppm; HRMS-ESI (m/z): [M-H]− calcd for C18H31N2O8 403.20749; found 403.20980.

(S)-2-Amino-6-((((4-oxopentyl)oxy)carbonyl)amino)hexanoic acid HCl salt (65). Compound 64 (0.7 g, 1.67 mmol) was dissolved in Et2O (35 mL) and the solution was cooled to 0 °C. Next, ice-cold aqueous 4 M HCl (17.5 mL) was added and the reaction mixture was allowed to stir vigorously at rt for 12 h. The layers were allowed to separate and the aqueous layer was collected and washed with Et2O (20 mL) before concentrating it under reduced pressure to dryness to obtain 65 (443 mg, 98%) as a white solid. 1H NMR (400 MHz, D2O) δ 4.09-3.95 (m, 3H), 3.87-3.83 (m, 1H), 3.60-3.45 (m, 2H), 3.13 (t, J = 8.0 Hz, 2H), 1.97-1.82 (m, 2H), 1.52-1.35 (m, 4H) ppm; 13C NMR (100 MHz, D2O) δ 172.1, 158.5, 78.1, 69.9, 65.7, 62.2, 52.5, 40.0, 29.4, 28.4, 21.5 ppm; HRMS-ESI (m/z): [M-H]− calcd for C10H19N2O6 263.12486; found 263.12456.

(S)-tert-Butyl 2-((tert-butoxycarbonyl)amino)-3-(4-iodophenyl)propanoate (67). Boc-4-iodo-L-phenylalanine (1.0 g, 2.56 mmol) was dissolved in a 2:1 solution of anhydrous DCM and THF (18 mL) and was cooled to 0 °C under a nitrogen atmosphere. tert-Butyl 2,2,2-trichloroacetimidate (0.91 mL, 5.11 mmol) was added via syringe and the reaction was slowly warmed to rt and stirred overnight. The volatiles were removed under reduced pressure and the remaining residue was dissolved in DCM (20 mL). The organic layer was washed with a 2.5% NaHCO3 aqueous solution (2 × 20 mL) and brine (10 mL), dried over Na2SO4, and concentrated under reduced pressure. The obtained residue was purified by column chromatography on silica gel using Hex/EtOAc (8:1 to 5:1) to afford the protected ester 67 (709.5 mg, 62%) as a white amorphous solid. 1H NMR spectral data matched literature values.213

(S)-tert-Butyl-2-((tert-butoxycarbonyl)amino)-3-(4-vinylphenyl)propanoate (68). The protected amino acid 67 (0.5 g, 1.19 mmol) was dissolved in DMF (9 mL). Vinyl boronic acid pinacol ester (0.284 mL, 1.68 mmol) and PdCl2(PPh3)2 (39 mg, 0.056 mmol) were added to the solution, followed by water (3 mL) and Na2CO3 (355 mg, 3.35 mmol). The reaction mixture was heated to 90 °C overnight. The mixture was allowed to cool to rt, diluted with water (20 mL) and extracted with EtOAc (3 × 20 mL). The combined organic layers were washed with a saturated aqueous solution of NaHCO3 (2 × 20 mL) and brine (10 mL), dried over Na2SO4, and concentrated under reduced pressure. The obtained residue was purified by column chromatography on silica gel using Hex/EtOAc (8:1) to furnish compound 68 (286 mg, 74%) as a yellow solid. 1H NMR spectral data matched literature values.214

(2S)-tert-Butyl-2-((tert-butoxycarbonyl)amino)-3-(4-(1,2-dihydroxyethyl)phenyl)propanoate (70). A cold solution of KMnO4 (126 mg, 0.80 mmol) and MgSO4 (96 mg, 0.80 mmol) in half-saturated aqueous NaHCO3 (3 mL) was added dropwise to a solution of the alkene 68 (185 mg, 0.53 mmol) in acetone (6 mL) stirring at −10 °C. The mixture was allowed to stir at this temperature for 1 h before it was quenched with 20% NaHSO3 (15 mL). Once the mixture reached rt it was extracted with EtOAc (2 × 15 mL) and the combined organic layers were washed with brine (10 mL), dried over Na2SO4, and concentrated under reduced pressure to dryness, furnishing the corresponding ketol 69 (181 mg, 90%) as a colorless oil. 1H NMR (300 MHz, CDCl3) δ 7.86 (d, J = 8.1 Hz, 2H), 7.33 (d, J = 8.1 Hz), 5.06 (m, 1H), 4.85 (s, 2H), 4.56 (m, 1H), 3.21-3.06 (m, 2H), 1.14 (s, 18H) ppm. Without further purification, the ketol 69 was then dissolved in dry THF (5 mL) and cooled to 0 °C before sodium borohydride (18 mg, 0.47 mmol) was added. The reaction mixture was stirred at this temperature for an hour before it was quenched with water (0.1 mL), slowly warmed to rt and concentrated. The residue was dissolved in EtOAc (5 mL), washed with water (5 mL) and brine (5 mL), dried over Na2SO4, and concentrated under reduced pressure to provide the product 70 (170 mg, 94%) as a colorless oil. 1H NMR (300 MHz, CDCl3) δ 7.26 (d, J = 8.1 Hz, 2H), 7.12 (d, J = 8.1 Hz, 2H), 5.09 (m, br, 1H), 4.72 (m, br, 1H), 4.40 (m, br, 1H), 3.82-3.55 (m, 2H), 3.00 (d, J = 5.7 Hz, 2H), 1.39 (s, 18H) ppm; 13C NMR (100 MHz, CDCl3) δ 171.1, 155.2, 139.4, 136.0, 129.6, 126.2, 82.2, 79.8, 74.5, 68.1, 60.5, 54.9, 38.2, 28.4, 28.0 ppm.

(S)-2-Amino-3-(4-(1,2-dihydroxyethyl)phenyl)propanoic acid TFA salt (71). To an ice-cold solution of 70 (240 mg, 0.63 mmol) and TES (0.2 mL, 1.26 mmol) in DCM (936 µL), TFA (936 µL, 12.6 mmol) was slowly added. The reaction mixture was allowed to warm to rt and stirred for 5 h. The volatiles were removed under reduced pressure and the residue was dissolved in MeOH (3 mL) to remove the residual amount of TFA by co-evaporation. The process was subsequently repeated two more times to finally afford 71 (155.9 mg, 77%) as a colorless oil. 1H NMR (300 MHz, D2O) δ 7.36 (d, J = 7.8 Hz, 2H), 7.28 (d, J = 7.8 Hz, 2H), 4.72 (m, 1H), 3.97 (m, br, 1H), 3.68 (d, J = 5.7 Hz, 2H), 3.35-3.02 (m, 2H) ppm; HRMS-ESI (m/z): [M+H]+ calcd for C11H15NO4: 226.1074, found 226.1087.

N-(3-Bromopropyl)-5-(dimethylamino)naphthalene-1-sulfonamide (73). To a stirred solution of dansyl chloride (0.15 g, 0.56 mmol) and TEA (0.23 mL, 1.67 mmol) in dry DCM (5 mL) at 0 °C, 3-bromopropylamine HBr (365 mg, 1.67 mmol) was added. The reaction was stirred at rt for 16 h. Then, it was washed with saturated NaHCO3 (2 × 5 mL) and brine (5 mL). The organic layer was dried over MgSO4, filtered, and concentrated under reduced pressure. The remaining residue was purified by flash column chromatography on silica gel and eluted with 8:1 DCM/Hex to deliver 73 as a yellow oil. 1H NMR spectral data matched literature values.151

N-(3-(Aminooxy)propyl)-5-(dimethylamino)naphthalene-1-sulfonamide (75). A solution of dansyl alkyl bromide 73 (27 mg, 0.075 mmol) in dry MeCN (0.2 mL) was added dropwise to an ice-cold solution of N-Boc-hydroxylamine (10 mg, 0.075 mmol) and DBU (11 μL, 0.075 mmol) in MeCN (0.3 mL). The reaction was stirred at 0 °C for 1 h and then at rt overnight. The reaction mixture was concentrated, dissolved in EtOAc (4 mL), and washed with saturated NaHCO3 (2 × 4 mL) and brine (4 mL). The organic layer was dried over Na2SO4 and concentrated under reduced pressure. The remaining residue was purified by flash column chromatography on silica gel, eluting with 5-10% Et2O in DCM, to deliver 74 as a yellow oil. Compound 74 was then dissolved in dry DCE (0.75 mL) and TFA (0.25 mL) was added at rt. The reaction was stirred at this temperature for 2 h and then taken up in DCM and water. The aqueous layer was made basic by the addition of 1 M NaOH until pH 9 was reached and was then extracted with DCM (3 × 5 mL). The combined organic phases were washed with water (10 mL) and brine (5 mL), dried over Na2SO4, filtered, and concentrated. The obtained residue was taken to dryness in vacuo, resulting in 75 (13 mg, 54%) as a yellow film. 1H NMR (300 MHz, CDCl3) δ 8.55 (d, J = 7.5 Hz, 1H), 8.30-8.23 (m, 2H), 7.56-7.52 (m, 2H), 7.19 (d, J = 7.5 Hz, 1H), 3.61 (t, J = 5.4 Hz, 2H), 2.98 (m, 2H), 2.88 (s, 6H), 1.69 (t, J = 6.0 Hz, 2H) ppm.

2-(3-Bromopropyl)isoindoline-1,3-dione (77). Phthalimide (0.2 g, 1.3 mmol) was dissolved in dry DMF (3 mL), and K2CO3 (188 mg, 1.3 mmol) and 1,3-dibromopropane (0.41 mL, 4.1 mmol) were added. The reaction mixture was stirred at rt for 18 h. It was then diluted with saturated NaHCO3 (15 mL) and extracted with EtOAc (2 × 10 mL). The combined organic layers were washed with water (20 mL) and brine (10 mL), dried over Na2SO4, filtered, and concentrated. The remaining residue was purified by flash column chromatography on silica gel, eluting with Hex/EtOAc (5:1), to obtain 77 (0.28 g, 77%) as a white solid. 1H NMR spectral data matched literature values.215

tert-Butyl (3-(1,3-dioxoisoindolin-2-yl)propoxy)carbamate (78). N-Boc-hydroxylamine (0.3 g, 2.2 mmol) was dissolved in dry MeCN (15 mL) and DBU (0.375 mL, 2.5 mmol) was added. The reaction mixture was stirred at rt for 30 min before cooling in an ice bath and adding compound 77 (0.9 g, 3.4 mmol). The reaction was allowed to proceed at rt for 18 h. Then, it was concentrated, dissolved in DCM (20 mL), and washed with 5% citric acid (20 mL) and brine (10 mL). The organic layer was dried over MgSO4, filtered, and concentrated. The remaining residue was purified by flash column chromatography, eluting with Hex/EtOAc (4:1 to 2:1). Compound 78 (0.72 g, 62%) was obtained as a white solid. 1H NMR spectral data matched literature values.152

tert-Butyl (3-aminopropoxy)carbamate (79). Compound 78 (0.44 g, 1.37 mmol) was dissolved in dry MeOH (7 mL) and the solution was cooled to 0 °C. Anhydrous hydrazine (0.17 mL, 5.5 mmol) was added and the reaction was stirred overnight at rt. The solvent was removed under reduced pressure and the residue was diluted with chloroform (10 mL). The white solid was filtered off and the filtrate was concentrated. The latter process was repeated once more, or until no more white precipitate formed, to give 79 (0.25 g, 95%) as a yellow oil. 1H NMR spectral data matched literature values.152

5-(3-(3-(Aminooxy)propyl)thioureido)-2-(6-hydroxy-3-oxo-3H-xanthen-9-yl)benzoic acid HCl salt (82). The linker 79 (47 mg, 0.25 mmol) and fluorescein 5-isothiocyanate (64 mg, 0.16 mmol) were dissolved in an ice-cold solution of DIPEA (64 μL, 0.49 mmol) in dry DMF (0.4 mL). The solution was stirred at rt overnight. Water (3 mL) was added and the mixture was extracted with EtOAc (3 × 3 mL); the combined organic layers were washed with water (3 × 9 mL) and brine (5 mL). The organic layer was dried over Na2SO4, filtered, and concentrated. The obtained residue was purified by flash column chromatography on silica gel, eluting with 10% MeOH in DCM, to give 81. Compound 81 was then dissolved in dry DCE (0.32 mL) and TFA (64 μL) was added. The solution was stirred at rt for 2 h and then concentrated. The residue was redissolved in a solution of 4 N HCl in 1,4-dioxane (0.12 mL) and DCM (0.38 mL) and stirred for 10 min. The volatiles were removed under reduced pressure and the latter process was repeated two more times to complete the TFA to HCl salt exchange. The remaining residue was dissolved in MeOH (0.2 mL) and precipitated into ice-cold Et2O (2 × 10 mL). The resulting orange-yellow solid was collected and dried under high vacuum to deliver 82 (35 mg, 42%). 1H NMR (300 MHz, CD3OD) δ 8.27 (s, 1H), 7.83 (d, J = 8.4 Hz, 1H), 7.13 (d, J = 8.1 Hz, 1H), 6.85-6.67 (m, 6H), 4.20 (t, J = 6.4 Hz, 2H), 3.79 (t, J = 6.4 Hz, 2H), 2.12 (t, J = 6.4 Hz, 2H) ppm; HRMS-ESI (m/z): [M-H]− calcd for C24H20N3O6S 478.10783, found 478.10840.

3-(Aminooxy)propyl (2-(7-hydroxy-2-oxo-2H-chromen-4-yl)ethyl) carbonate TFA salt (85). The linker 79 (10 mg, 0.053 mmol) and the coumarin carbonate 83 (20 mg, 0.048 mmol) were dissolved in dry DMF (0.3 mL). The solution was stirred at rt overnight. Water (3 mL) was added and the mixture was extracted with EtOAc (3 × 3 mL); the combined organic layers were washed with water (3 × 9 mL) and brine (5 mL). The organic layer was dried over Na2SO4, filtered, and concentrated. The obtained residue was purified by flash column chromatography on silica gel, eluting with 2:1 EtOAc/Hex, to give 84. Compound 84 was dissolved in a solution of TES (11 μL, 0.066 mmol) in dry DCM (0.2 mL). TFA (0.2 mL) was added and the solution was stirred at rt for 45 min. The volatiles were removed under reduced pressure and the residue was dissolved in MeOH (2 mL) and concentrated to remove residual TFA by co-evaporation. The process was repeated a total of three times to afford 85 (13.3 mg, 66%) as a colorless oil. 1H NMR (400 MHz, CD3OD) δ 7.69 (d, J = 8.4 Hz, 1H), 6.84-6.81 (dd, J = 2.4, 8.8 Hz, 1H), 6.71 (d, J = 8.4 Hz, 1H), 6.14 (s, 1H), 4.36 (t, J = 6.4 Hz, 2H), 4.00 (t, J = 5.6 Hz, 2H), 3.18-3.09 (m, 4H), 1.78 (t, J = 6.4 Hz, 2H) ppm; 13C NMR (100 MHz, CD3OD) δ 163.6, 163.0, 156.9, 156.7, 155.7, 127.4, 114.4, 113.0, 111.7, 103.7, 71.5, 63.6, 38.9, 32.6, 30.4 ppm; HRMS-ESI (m/z): [M+H]+ calcd for C15H19N2O6 323.12376, found 323.12311.

But-3-en-1-yl (2,5-dioxopyrrolidin-1-yl) carbonate (91). N,N’-Disuccinimidyl carbonate (353 mg, 1.38 mmol) was added to a solution of 3-buten-1-ol (60 µL, 0.69 mmol) and TEA (288 µL, 2.07 mmol) in dry MeCN (5 mL) at rt. The resulting mixture was stirred at rt overnight and then concentrated under vacuum. The product was purified by column chromatography on silica gel, eluted with 2% acetone and 1% TEA in DCM, to deliver 91 (96 mg, 65%) as a colorless oil. 1H NMR (400 MHz, CDCl3) δ 5.73 (m, 1H), 5.15-5.08 (m, 2H), 4.31 (t, J = 6.4 Hz, 2H), 2.78 (s, 4H), 2.46 (m, 2H) ppm; 13C NMR (100 MHz, CDCl3) δ 168.8, 151.5, 132.4, 118.4, 70.2, 32.7, 25.4 ppm. LRMS-ESI was not successful due to compound instability.

2,5-Dioxopyrrolidin-1-yl pent-4-en-1-yl carbonate (92). Compound 92 (193 mg, 73%) was obtained as a colorless oil from 4-penten-1-ol (0.12 mL, 1.16 mmol) by following the procedure described for 91. 1H NMR (400 MHz, CDCl3) δ 5.78 (m, 1H), 5.09-5.01 (m, 2H), 4.34 (t, J = 6.4 Hz, 2H), 2.83 (s, 4H), 2.17 (m, 2H), 1.85 (m, 2H) ppm; 13C NMR (100 MHz, CDCl3) δ 168.6, 151.4, 136.5, 115.8, 70.6, 29.2, 27.3, 25.3 ppm. LRMS-ESI was not successful due to compound instability.

2,5-Dioxopyrrolidin-1-yl hex-5-en-1-yl carbonate (93). Compound 93 (232 mg, 77%) was obtained as a colorless oil from 5-hexen-1-ol (0.15 mL, 1.25 mmol) by following the procedure described for 91. 1H NMR (400 MHz, CDCl3) δ 5.73 (m, 1H), 5.05-4.91 (m, 2H), 4.27 (t, J = 6.4 Hz, 2H), 2.77 (s, 4H), 2.04 (m, 2H), 1.70 (m, 2H), 1.45 (m, 2H) ppm; 13C NMR (100 MHz, CDCl3) δ 169.2, 151.9, 138.2, 115.5, 71.7, 33.3, 28.0, 25.7, 24.9 ppm. LRMS-ESI was not successful due to compound instability.

2,5-Dioxopyrrolidin-1-yl hept-6-en-1-yl carbonate (94). Compound 94 (202 mg, 67%) was obtained as a colorless oil from 6-hepten-1-ol (0.16 mL, 1.17 mmol) by following the procedure described for 91. 1H NMR (400 MHz, CDCl3) δ 5.77 (m, 1H), 5.00-4.91 (m, 2H), 4.28 (t, J = 6.8 Hz, 2H), 2.80 (s, 4H), 2.05 (m, 2H), 1.72 (m, 2H), 1.39 (m, 4H) ppm; 13C NMR (100 MHz, CDCl3) δ 183.6, 168.9, 151.6, 138.5, 114.7, 71.6, 33.5, 28.3, 25.5, 24.9 ppm. LRMS-ESI was not successful due to compound instability.


(S)-6-(((But-3-en-1-yloxy)carbonyl)amino)-2-((tert-butoxycarbonyl)amino)hexanoic acid (96). Boc-L-Lys-OH (218 mg, 0.88 mmol) was added to a stirred solution of 91 (157 mg, 0.74 mmol) in dry DMF (2 mL). The reaction was allowed to continue overnight at rt. The mixture was diluted with water (10 mL) and extracted with EtOAc (3 × 10 mL). The combined organic layers were washed with water (3 × 20 mL) and brine (10 mL). The resulting organic layer was dried over Na2SO4, filtered, and concentrated in vacuo to dryness to furnish 96 (219 mg, 86%) as an off-white foam. 1H NMR (400 MHz, CDCl3) δ 11.15 (s, br, 1H), 6.26 (s, br, 0.5H), 5.75 (m, 1H), 5.37 (m, br, 1H), 5.09-5.01 (m, 2H), 4.87 (s, br, 0.5H), 4.26 (s, br, 1H), 4.08 (m, br, 2H), 3.13 (m, 2H), 2.34 (m, 2H), 1.81-1.40 (m, 15H) ppm; 13C NMR (100 MHz, CDCl3) δ 176.5, 157.2, 156.0, 138.5, 114.9, 80.1, 65.0, 53.3, 40.6, 33.5, 32.2, 29.5, 28.5, 22.5 ppm.

(S)-2-((tert-Butoxycarbonyl)amino)-6-(((pent-4-en-1-yloxy)carbonyl)amino)hexanoic acid (97). Compound 97 (517 mg, 92%) was obtained as an off-white foam from 92 (335 mg, 1.47 mmol) by following the procedure described for 96. 1H NMR (300 MHz, CDCl3) δ 6.30 (m, br, 0.5H), 5.80 (m, 1H), 5.27 (m, br, 1H), 5.06-4.96 (m, 2H), 4.80 (s, br, 0.5H), 4.30 (s, br, 1H), 4.06 (m, br, 2H), 3.17 (m, 2H), 2.10 (m, 2H), 1.84-1.65 (m, 4H), 1.58-1.35 (m, 13H) ppm; 13C NMR (100 MHz, CDCl3) δ 176.2, 157.1, 155.9, 137.7, 115.2, 80.3, 64.7, 53.5, 40.9, 32.5, 30.3, 29.7, 28.7, 22.8 ppm; HRMS-ESI (m/z): [M+K]+ calcd for C17H30N2O6 397.1741, found 397.1726.


(S)-2-((tert-Butoxycarbonyl)amino)-6-(((hex-5-en-1-yloxy)carbonyl)amino)hexanoic acid (98). Compound 98 (177 mg, 93%) was obtained as an off-white foam from 93 (123 mg, 0.51 mmol) by following the procedure described for 96. 1H NMR (400 MHz, CDCl3) δ 8.40 (s, br, 1H), 6.29 (s, br, 0.5H), 5.78 (m, 1H), 5.31 (m, br, 1H), 5.02-4.93 (m, 2H), 4.90 (s, br, 0.5H), 4.35 (s, br, 1H), 4.05 (m, br, 2H), 3.16 (m, 2H), 2.06 (m, 2H), 1.82-1.43 (m, 19H) ppm; 13C NMR (100 MHz, CDCl3) δ 176.3, 157.9, 155.7, 138.5, 114.8, 80.1, 64.9, 53.2, 40.6, 33.4, 32.1, 29.4, 28.5, 28.4, 25.2, 22.5 ppm; HRMS-ESI (m/z): [M+Na]+ calcd for C18H32N2O6 395.2158, found 395.2145.
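The HRMS entries in this section pair a calculated adduct m/z with a measured value. As a hedged illustration of how such a pair can be checked, the Python sketch below computes a monoisotopic [M+Na]+ value for C18H32N2O6 and the ppm deviation between the calculated and found values reported for 98. The isotope masses are rounded reference values and the function names are hypothetical. The sketch subtracts the electron mass for the cation; calculated values that omit this correction differ by about 0.0005.

```python
# Monoisotopic masses (u) of the most abundant isotopes, rounded reference values.
MONO = {"C": 12.0, "H": 1.00782503, "N": 14.0030740,
        "O": 15.9949146, "Na": 22.9897693}
ELECTRON = 0.00054858  # electron mass in u


def adduct_mz(composition, adduct):
    """m/z of a singly charged [M+adduct]+ ion for a neutral composition."""
    neutral = sum(MONO[element] * count for element, count in composition.items())
    return neutral + MONO[adduct] - ELECTRON


def ppm_error(found, calcd):
    """Deviation of the measured m/z from the calculated m/z, in ppm."""
    return (found - calcd) / calcd * 1e6


mz_98 = adduct_mz({"C": 18, "H": 32, "N": 2, "O": 6}, "Na")
err_98 = ppm_error(found=395.2145, calcd=395.2158)
print(f"[M+Na]+ = {mz_98:.4f}, deviation = {err_98:.1f} ppm")
```

Passing "H" instead of "Na" as the adduct gives the [M+H]+ value for the same composition.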

(S)-2-((tert-Butoxycarbonyl)amino)-6-(((hept-6-en-1-yloxy)carbonyl)amino)hexanoic acid (99). Compound 99 (148 mg, 93%) was obtained as an off-white foam from 94 (105 mg, 0.41 mmol) by following the procedure described for 96. 1H NMR (300 MHz, CDCl3) δ 8.48 (s, br, 1H), 6.33 (s, br, 0.5H), 5.78 (m, 1H), 5.30 (m, br, 1H), 5.01-4.88 (m, 2.5H), 4.29 (s, br, 1H), 4.05 (m, br, 2H), 3.16 (m, 2H), 2.03 (m, 2H), 1.78-1.18 (m, 21H) ppm; 13C NMR (75 MHz, CDCl3) δ 176.5, 157.3, 155.8, 138.9, 114.7, 80.3, 65.2, 53.3, 40.6, 33.8, 32.1, 29.6, 29.0, 28.7, 28.5, 25.5, 22.5 ppm; HRMS-ESI (m/z): [M+H]+ calcd for C13H24N2O4 273.1809, found 273.1876.

(S)-2-Amino-6-(((but-3-en-1-yloxy)carbonyl)amino)hexanoic acid HCl salt (101). To a solution of 96 (110 mg, 0.32 mmol) and TES (0.1 mL, 0.64 mmol) in dry DCM (4.5 mL), TFA (0.24 mL, 3.2 mmol) was added dropwise, and the reaction mixture was allowed to stir at rt overnight. The volatiles were removed under reduced pressure and the residue was dissolved in a solution of 4 N HCl in 1,4-dioxane (0.25 mL) and DCM (0.75 mL), stirred for 10 min at rt, and concentrated. This process was repeated two more times to ensure complete TFA to HCl salt exchange. The concentrated residue was dissolved in a minimal amount of MeOH and precipitated into ice-cold Et2O. The precipitate was pelleted by centrifugation, the supernatant was decanted, and the solid was washed with Et2O before drying under vacuum, affording the amino acid 101 (82 mg, 92%) as a white solid. 1H NMR (400 MHz, DMSO-d6) δ 8.45 (s, br, 3H), 7.09 (s, br, 1H), 5.75 (m, 1H), 5.10-5.01 (m, 2H), 3.94 (t, J = 6.8 Hz, 2H), 3.77 (t, J = 6.4 Hz, 1H), 2.92 (m, 2H), 2.23 (m, 2H), 1.75 (m, 2H), 1.36-1.26 (m, br, 4H) ppm; 13C NMR (100 MHz, DMSO-d6) δ 170.9, 156.2, 134.8, 117.0, 62.7, 51.8, 33.2, 29.6, 28.9, 21.6 ppm; HRMS-ESI (m/z): [M+H]+ calcd for C11H20N2O4 245.1496, found 245.1490.
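The deprotection above uses roughly 2 equivalents of TES and 10 equivalents of TFA relative to the Boc-protected substrate (0.64 and 3.2 mmol for 0.32 mmol of 96). The Python sketch below shows how such equivalents translate into volumes of neat liquid reagents. It is illustrative only: the molar masses and densities are approximate handbook values supplied here as assumptions, and the function and variable names are hypothetical.

```python
def volume_ml(mmol, mw, density):
    """Volume (mL) of a neat liquid: mmol * g/mol = mg, and mg / (g/mL * 1000) = mL."""
    return mmol * mw / (density * 1000.0)


# (equivalents, molar mass in g/mol, density in g/mL); approximate values, assumed.
REAGENTS = {
    "TES": (2.0, 116.28, 0.728),   # triethylsilane
    "TFA": (10.0, 114.02, 1.489),  # trifluoroacetic acid
}

substrate_mmol = 0.32  # amount of Boc-protected amino acid 96

volumes = {
    name: volume_ml(substrate_mmol * equiv, mw, density)
    for name, (equiv, mw, density) in REAGENTS.items()
}

for name, vol in volumes.items():
    print(f"{name}: {vol:.2f} mL")
```

With these inputs the estimate is on the order of 0.1 mL of TES and 0.24-0.25 mL of TFA, and the substrate amount can be changed to scale the procedure for the other Boc-protected amino acids.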

(S)-2-Amino-6-(((pent-4-en-1-yloxy)carbonyl)amino)hexanoic acid HCl salt (102). Deprotection of 97 (0.5 g, 1.39 mmol) was performed as described for 101 to obtain 102 (0.40 g, 97%) as a white solid. 1H NMR (400 MHz, D2O) δ 5.86 (m, 1H), 5.07-4.98 (m, 2H), 4.05-4.00 (m, 3H), 3.11 (t, J = 5.2 Hz, 2H), 2.10 (m, 2H), 1.97-1.87 (m, 2H), 1.69 (t, J = 6.4 Hz, 2H), 1.56-1.39 (m, 4H) ppm; 13C NMR (75 MHz, D2O) δ 172.0, 158.9, 138.6, 114.9, 65.0, 52.7, 39.9, 29.6, 29.4, 28.5, 27.5, 21.6 ppm; HRMS-ESI (m/z): [M+H]+ calcd for C12H22N2O4 259.1652, found 259.1653.

(S)-2-Amino-6-(((hex-5-en-1-yloxy)carbonyl)amino)hexanoic acid hydrochloride (103). Deprotection of 98 (145 mg, 0.39 mmol) was performed as described for 101 to obtain 103 (108.4 mg, 90%) as a white solid. 1H NMR (400 MHz, DMSO-d6) δ 8.28 (s, br, 3H), 7.08 (s, br, 1H), 5.79 (m, 1H), 5.02-4.93 (m, 2H), 3.91 (t, J = 6.8 Hz, 2H), 3.67 (s, br, 2H), 2.94 (m, 2H), 2.03 (m, 2H), 1.76 (m, br, 2H), 1.52 (t, J = 6.8 Hz, 2H), 1.38 (m, br, 6H) ppm; 13C NMR (100 MHz, DMSO-d6) δ 171.1, 156.3, 138.5, 115.0, 63.4, 52.3, 32.8, 29.8, 29.0, 28.2, 24.6, 21.7 ppm; HRMS-ESI (m/z): [M+H]+ calcd for C13H24N2O4 273.1809, found 273.1803.

(S)-2-Amino-6-(((hept-6-en-1-yloxy)carbonyl)amino)hexanoic acid HCl salt (104). Deprotection of 99 (130 mg, 0.336 mmol) was performed as described for 101 to obtain 104 (104.4 mg, 96%) as a white solid. 1H NMR (400 MHz, DMSO-d6) δ 8.45 (s, br, 3H), 7.09 (m, br, 1H), 5.78 (m, 1H), 5.01-4.92 (m, 2H), 3.90 (t, J = 6.4 Hz, 2H), 3.82 (s, br, 1H), 2.93 (m, 2H), 2.01 (m, 2H), 1.77 (m, br, 2H), 1.52 (m, 2H), 1.45-1.28 (m, 10H) ppm; 13C NMR (100 MHz, DMSO-d6) δ 171.0, 156.3, 138.7, 114.8, 63.5, 51.8, 33.1, 29.6, 28.9, 28.6, 28.0, 24.9, 21.6 ppm; HRMS-ESI (m/z): [M+H]+ calcd for C14H26N2O4 287.1965, found 287.1957.

(S)-6-(((But-2-en-1-yloxy)carbonyl)amino)-2-((tert-butoxycarbonyl)amino)hexanoic acid (100). Diphosgene (0.26 mL, 2.16 mmol) was added dropwise to an ice-cold mixture of 2-buten-1-ol (cis:trans isomers, ~1:19) (0.12 mL, 1.66 mmol) and K2CO3 (0.69 g, 4.98 mmol) in dry Et2O (5 mL). The resulting mixture was allowed to stir overnight at rt, filtered, and concentrated under reduced pressure. The chloroformate 95 was obtained as a clear liquid and, without further purification, was added dropwise to an ice-cold solution of Boc-L-Lys-OH (495 mg, 2.0 mmol) and aqueous 1 M NaOH (1 mL) in THF (4 mL). The reaction was allowed to run overnight at rt. The volatiles were removed under reduced pressure and the residue was diluted with water (10 mL) and then washed with EtOAc (10 mL). The aqueous layer was acidified with 5% citric acid to pH 3-4 and extracted with EtOAc (3 × 10 mL). The combined organic layers were washed with water (20 mL) and brine (10 mL). The resulting organic layer was dried over Na2SO4, filtered, and concentrated in vacuo to dryness to furnish 100 (343 mg, 60%) as an off-white foam. 1H NMR (400 MHz, CDCl3) δ 8.40 (s, br, 1H), 6.29 (s, br, 0.5H), 5.78 (m, 1H), 5.31 (m, br, 1H), 5.02-4.90 (m, 2.5H), 4.29 (s, br, 1H), 4.05 (m, br, 2H), 3.15 (m, 2H), 2.07 (m, 2H), 1.81-1.40 (m, 15H) ppm; 13C NMR (75 MHz, CDCl3) δ 176.4, 156.9, 156.0, 131.0, 125.9, 80.1, 65.7, 53.3, 40.6, 32.2, 29.5, 28.5, 22.5, 17.9 ppm; HRMS-ESI (m/z): [M-H]- calcd for C16H28N2O6 343.1864, found 343.1869.

(S)-2-Amino-6-(((but-2-en-1-yloxy)carbonyl)amino)hexanoic acid TFA salt (105). To a solution of 100 (317 mg, 0.92 mmol) and TES (0.29 mL, 1.84 mmol) in dry DCM (13 mL), TFA (0.68 mL, 9.20 mmol) was added dropwise, and the reaction mixture was allowed to stir at rt overnight. The volatiles were removed under reduced pressure and the residue was dissolved in a minimal amount of MeOH and precipitated into ice-cold Et2O. The precipitate was pelleted by centrifugation, the supernatant was decanted, and the solid was washed with Et2O before drying under vacuum, affording the amino acid 105 (288 mg, 87%) as a white solid. 1H NMR (400 MHz, D2O) δ 5.80 (m, 1H), 5.58 (m, 1H), 4.43 (d, J = 5.6 Hz, 2H), 3.85 (m, 1H), 3.08 (t, J = 6.0 Hz, 2H), 1.88 (t, J = 6.0 Hz, 2H), 1.66 (d, J = 6.4 Hz, 3H), 1.53-1.33 (m, 4H) ppm; 13C NMR (100 MHz, DMSO-d6) δ 171.3, 156.1, 129.6, 126.6, 64.1, 52.7, 38.4, 30.1, 29.1, 26.5, 21.9, 21.6, 17.5 ppm; HRMS-ESI (m/z): [M+H]+ calcd for C11H20N2O4 245.14958, found 245.14970.


(S)-6-((Allylcarbamoyl)oxy)-2-((tert-butoxycarbonyl)amino)hexanoic acid (107). 6-Hydroxy-Boc-L-norleucine-OH (25 mg, 0.10 mmol) was dissolved in a solution of DIPEA (53 µL, 0.30 mmol) in dry DCM (1 mL). The solution was chilled to 0 °C before the addition of allyl isocyanate (18 µL, 0.20 mmol) and the reaction was allowed to proceed at 40 °C overnight. After cooling to rt, the mixture was diluted with DCM (3 mL) and 5% citric acid (4 mL) was added. The aqueous layer was extracted with DCM (3 × 4 mL) and the combined organic layers were washed with water (10 mL) and brine (5 mL). The resulting organic layer was dried over Na2SO4, filtered, and concentrated in vacuo to dryness to furnish 107 (29 mg, 89% yield) as an off-white foam. 1H NMR (400 MHz, CDCl3) δ 5.85 (m, 1H), 5.24-5.07 (m, 2H), 4.74 (m, br, 1H), 4.29 (s, br, 1H), 4.06 (t, J = 5.6 Hz, 2H), 3.78 (m, 2H), 1.93-1.25 (m, 15H) ppm; 13C NMR (100 MHz, CDCl3) δ 176.0, 155.8, 135.0, 116.4, 80.3, 64.8, 53.3, 43.3, 32.2, 28.7, 28.5, 22.0 ppm; HRMS-ESI (m/z): [M+Na]+ calcd for C15H26N2O6 353.1689, found 353.1654.

(S)-6-((Allylcarbamoyl)oxy)-2-aminohexanoic acid TFA salt (108). Deprotection of 107 (28 mg, 0.085 mmol) was performed by following the procedure described for compound 105 to afford compound 108 (28.8 mg, 96%) as a white solid. 1H NMR (400 MHz, D2O) δ 5.86 (m, 1H), 5.19-4.80 (m, 2H), 4.08 (t, J = 6.0 Hz, 1H), 3.94 (t, J = 5.6 Hz, 2H), 3.72 (m, 2H), 1.95 (m, 2H), 1.69 (m, 2H), 1.50 (m, 2H) ppm; 13C NMR (100 MHz, D2O) δ 173.2, 159.0, 135.4, 115.1, 63.1, 53.7, 42.1, 29.7, 28.0, 21.0 ppm; HRMS-ESI (m/z): [M+Na]+ calcd for C15H26N2O6 353.1689, found 353.1654.


(S)-2-((tert-Butoxycarbonyl)amino)-6-(pent-4-enamido)hexanoic acid (110). Compound 110 (212 mg, 97%) was obtained as an off-white foam from 2,5-dioxopyrrolidin-1-yl pent-4-enoate (106)183 (132 mg, 0.67 mmol) by following the procedure described for compound 96. 1H NMR (400 MHz, CDCl3) δ 9.56 (s, br, 1H), 6.20 (s, br, 1H), 5.78 (m, 1H), 5.34 (m, 1H), 5.06-4.97 (m, 2H), 4.25 (m, br, 1H), 4.46 (s, br, 1H), 3.23 (m, 2H), 2.34 (m, 2H), 2.27 (m, 2H), 1.82-1.66 (m, br, 2H), 1.59-1.35 (m, 13H) ppm; 13C NMR (100 MHz, CDCl3) δ 175.4, 173.5, 156.0, 137.0, 115.8, 80.1, 53.2, 39.3, 35.8, 32.3, 29.8, 28.9, 28.4, 22.5 ppm; HRMS-ESI (m/z): [M+K]+ calcd for C16H28N2O5 367.1635, found 367.1625.

(S)-2-Amino-6-(pent-4-enamido)hexanoic acid HCl salt (111). Deprotection of 110 (175 mg, 0.54 mmol) was performed by following the procedure described for compound 101 to afford compound 111 (142 mg, 99%) as a white solid. 1H NMR (400 MHz, DMSO-d6) δ 8.45 (s, br, 3H), 7.93 (m, br, 1H), 5.77 (m, 1H), 5.02-4.92 (m, 2H), 3.95 (m, br, 1H), 3.81 (s, br, 1H), 3.00 (m, 2H), 2.25-2.14 (m, 4H), 1.79 (m, br, 2H), 1.40-1.29 (m, br, 4H) ppm; 13C NMR (100 MHz, DMSO-d6) δ 171.3, 171.0, 137.8, 115.0, 51.9, 38.0, 34.5, 29.6, 29.3, 28.6, 21.7 ppm; HRMS-ESI (m/z): [M+H]+ calcd for C11H20N2O3 229.1547, found 229.1545.

(S)-6-(3-Allylureido)-2-((tert-butoxycarbonyl)amino)hexanoic acid (112). Allyl isocyanate (100 µL, 1.13 mmol) was dissolved in dry DMF (2 mL) and the solution was chilled to 0 °C before adding Boc-Lys-OH (334 mg, 1.36 mmol) and DMAP (166 mg, 1.36 mmol). The reaction was heated at 70 °C overnight. After cooling to rt, the mixture was diluted with water (6 mL) and extracted with EtOAc (3 × 6 mL). The combined organic layers were washed with water (3 × 15 mL) and brine (8 mL). The resulting organic layer was dried over Na2SO4, filtered, and concentrated in vacuo to dryness to furnish 112 (213 mg, 57% yield) as an off-white foam. 1H NMR (400 MHz, CDCl3) δ 8.78 (s, br, 1H), 6.19 (s, br, 0.5H), 5.87-5.77 (m, 1H), 5.46 (d, J = 7.2 Hz, 1H), 5.20-5.09 (m, 2H), 4.25 (m, br, 1H), 4.11 (m, br, 0.5H), 3.77 (m, br, 2H), 3.12 (m, br, 2H), 1.80-1.68 (m, 2H), 1.58-1.43 (m, 13H) ppm; 13C NMR (100 MHz, CDCl3) δ 176.0, 159.7, 156.0, 135.2, 116.1, 80.2, 53.5, 43.2, 40.3, 32.3, 29.4, 28.6, 22.5 ppm; HRMS-ESI (m/z): [M+Na]+ calcd for C15H27N3O5 352.1848, found 352.1845.

(S)-6-(3-Allylureido)-2-aminohexanoic acid TFA salt (113). Deprotection of 112 (53.7 mg, 0.16 mmol) was performed by following the procedure described for compound 105 to afford compound 113 (50 mg, 94%) as a white solid. 1H NMR (400 MHz, D2O) δ 5.62-5.55 (m, 1H), 4.93-4.84 (m, 2H), 3.81 (t, J = 6.0 Hz, 1H), 3.45 (m, 2H), 2.85 (m, 2H), 1.72-1.58 (m, 2H), 1.30-1.09 (m, 4H) ppm; 13C NMR (100 MHz, D2O) δ 172.0, 160.5, 135.1, 114.6, 52.7, 42.0, 39.4, 29.4, 28.7, 21.5 ppm; HRMS-ESI (m/z): [M+H]+ calcd for C10H19N3O3 230.15, found 230.14.

(S,R,S)-N,N'-(Disulfanediylbis(ethane-2,1-diyl))bis(5-((3aS,4S,6aR)-2-oxohexahydro-1H-thieno[3,4-d]imidazol-4-yl)pentanamide) (115). NHS-biotin (26 mg, 0.076 mmol) was dissolved in warm, dry DMF (0.5 mL). Next, cystamine hydrochloride (7.8 mg, 0.035 mmol) and DIPEA (36 μL, 0.21 mmol) were added. The reaction mixture was stirred at 90 °C overnight, cooled to rt, and slowly added to Et2O. The precipitate that formed was filtered over ice and washed with ice-cold water. The beige-colored solid was collected and dried under high vacuum to yield 115 (11.6 mg, 56%). 1H NMR (300 MHz, DMSO-d6) δ 8.00 (m, 1H), 6.42 (d, J = 21 Hz, 2H), 4.30-4.12 (m, 2H), 3.10 (m, 1H), 2.84-2.78 (m, 3H), 2.58 (d, J = 12.3 Hz, 1H), 2.08 (t, J = 7.2 Hz, 2H), 1.65-1.25 (m, 6H) ppm; HRMS-ESI (m/z): [M+H]+ calcd for C24H41N6O4S4 605.2066, found 605.2003.

5-(Dimethylamino)-N-(2-sulfanylethyl)naphthalene-1-sulfonamide (117). A solution of dansyl chloride (150 mg, 0.56 mmol) and TEA (194 µL, 1.39 mmol) in dry DCM (0.8 mL) was cooled to 0 °C and added dropwise to an ice-cold solution of cysteamine (86 mg, 1.11 mmol) in dry DCM (1 mL). The reaction was stirred at rt for 3 h and concentrated, and the product was purified on silica gel, eluting with 97:2:1 DCM/Hex/TEA, to furnish 117 (51.4 mg, 30%) as a yellow film. 1H NMR spectral data matched literature values.216

4-Formylphenyl acetate (121). 4-Hydroxybenzaldehyde (1.0 g, 8.2 mmol) was dissolved in a solution of TEA (3.45 mL, 24.6 mmol) in dry THF (8 mL). The solution was cooled to 0 °C and acetyl chloride (0.87 mL, 12.3 mmol) was added dropwise. The reaction mixture was allowed to warm to rt and was stirred overnight. It was filtered and the filtrate was diluted with Et2O (15 mL) and washed with saturated NaHCO3 (15 mL), water (15 mL), and brine (8 mL). The organic layer was collected, dried over MgSO4, and concentrated to dryness, delivering 121 (1.23 g, 97%) as a dark yellow liquid. 1H NMR (300 MHz, CDCl3) δ 9.97 (s, 1H), 7.91 (d, J = 8.7 Hz, 2H), 7.27 (d, J = 8.7 Hz, 2H), 2.32 (s, 3H) ppm; 13C NMR (100 MHz, CDCl3) δ 191.0, 168.8, 155.4, 134.0, 131.3, 122.4, 21.2 ppm.
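A quick consistency check on 1H NMR listings of this kind is to sum the reported integrals and compare the total with the hydrogen count of the proposed structure. The Python sketch below does this for the 1H NMR data of 121. It is a minimal illustration with a hypothetical helper name, and the expected count of eight hydrogens assumes the formula C9H8O3 for 4-formylphenyl acetate, which is inferred here from the compound name.

```python
import re


def total_protons(nmr_text):
    """Sum the integrals written as '<n>H)' at the end of each peak entry."""
    integrals = re.findall(r"(\d+(?:\.\d+)?)\s*H\)", nmr_text)
    return sum(float(value) for value in integrals)


# 1H NMR peak list reported for 121 (chemical shifts in ppm).
nmr_121 = ("9.97 (s, 1H), 7.91 (d, J = 8.7 Hz, 2H), "
           "7.27 (d, J = 8.7 Hz, 2H), 2.32 (s, 3H)")

expected_h = 8  # assumption: C9H8O3
found_h = total_protons(nmr_121)
print(f"integrals sum to {found_h:g} H, expected {expected_h} H")
```

The pattern only matches integrals that close a peak entry, so coupling constants given in Hz are ignored. Fractional integrals such as 0.5H, used in this section for rotameric NH signals, are summed as written. Exchangeable protons that are not observed will make the total fall short of the formula count.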


4-((2-Tosylhydrazono)methyl)phenyl acetate (126). p-Toluenesulfonyl hydrazide (1.6 g, 8.68 mmol) was dissolved in EtOH (10 mL) at 40 °C. A heated solution of 121 in EtOH (10 mL) was then added and the reaction mixture was stirred at this temperature. After 45 min the reaction was complete (by TLC) and the mixture was allowed to cool to rt. The precipitate that formed was filtered and washed with cold EtOH. The white solid was collected and dried under reduced pressure to yield 126 (1.88 g, 72%). 1H NMR (300 MHz, CDCl3) δ 8.08 (s, 1H), 7.86 (d, J = 8.1 Hz, 2H), 7.69 (s, 1H), 7.57 (d, J = 8.7 Hz, 2H), 7.32 (d, J = 7.8 Hz, 2H), 7.06 (d, J = 8.4 Hz, 2H), 2.41 (s, 3H), 2.30 (s, 3H) ppm; HRMS-ESI (m/z): [M-H]- calcd for C16H15N2O4S 331.07580, found 331.07604.

N'-Benzylidene-4-methylbenzenesulfonohydrazide (123). Similar to the described procedure for the synthesis of 126, compound 123 was synthesized using the following reagents: benzaldehyde (1.0 g, 9.4 mmol), p-toluenesulfonyl hydrazide (1.9 g, 10.3 mmol), and EtOH (20 mL). The product was obtained as a white solid (2.5 g, 89%). 1H NMR spectral data matched literature values.217

N'-(4-Hydroxybenzylidene)-4-methylbenzenesulfonohydrazide (124). Similar to the described procedure for the synthesis of 126, compound 124 was synthesized using the following reagents: 4-hydroxybenzaldehyde (0.1 g, 0.82 mmol), p-toluenesulfonyl hydrazide (168 mg, 0.9 mmol), and EtOH (4 mL). The product was obtained as a yellow solid (0.196 g, 83%). 1H NMR (300 MHz, acetone-d6) δ 7.90 (s, 1H), 7.84 (d, J = 8.7 Hz, 2H), 7.49 (d, J = 8.7 Hz, 2H), 7.39 (d, J = 7.8 Hz, 2H), 6.86 (d, J = 9.0 Hz, 2H), 2.91 (s, br, 1H), 2.41 (s, br, 1H), 2.38 (s, 3H) ppm.

N'-(4-Methoxybenzylidene)-4-methylbenzenesulfonohydrazide (125). Similar to the described procedure for the synthesis of 126, compound 125 was synthesized using the following reagents: 4-methoxybenzaldehyde (0.1 g, 0.73 mmol), p-toluenesulfonyl hydrazide (0.15 g, 0.81 mmol), and EtOH (4 mL). The product was obtained as a white solid (0.155 g, 70%). 1H NMR spectral data matched literature values.217

Methyl (Z)-4-((2-tosylhydrazono)methyl)benzoate (127). Similar to the described procedure for the synthesis of 126, compound 127 was synthesized using the following reagents: methyl 4-formylbenzoate (0.5 g, 3.0 mmol), p-toluenesulfonyl hydrazide (0.62 g, 3.35 mmol), and EtOH (12 mL). The product was obtained as a white solid (0.88 g, 88%). 1H NMR spectral data matched literature values.218

4-(2-(4-Methoxyphenyl)-2H-tetrazol-5-yl)phenyl acetate (130). A diazonium salt solution was prepared by slowly adding an ice-cold aqueous solution of 3.3 M NaNO2 (3 mL) to a solution of p-anisidine (1.1 g, 9.0 mmol) in EtOH (9 mL) and 6 M HCl (6 mL) at −10 to −15 °C. The reaction mixture was stirred at this temperature for 30 min and then added dropwise to a solution of 126 (1.0 g, 3.0 mmol) in pyridine (9 mL) at −10 to −15 °C. The reaction mixture was stirred at this temperature for 30 min. It was then allowed to warm to rt, water (10 mL) was added, and the mixture was extracted with chloroform (3 × 15 mL). The organic layers were combined, washed with water (40 mL) and brine (20 mL), dried over Na2SO4, filtered, and concentrated. The collected solid was recrystallized from boiling EtOH to obtain 130 (346 mg, 37%) as a brown-red solid. 1H NMR (300 MHz, CDCl3) δ 8.27 (d, J = 9.0 Hz, 2H), 8.10 (d, J = 9.3 Hz, 2H), 7.27 (d, J = 8.7 Hz, 2H), 7.07 (d, J = 9.0 Hz, 2H), 3.88 (s, 3H), 2.34 (s, 3H) ppm; HRMS-ESI (m/z): [M+H]+ calcd for C16H15N4O3 311.11387, found 311.11341.

2-(4-Methoxyphenyl)-5-phenyl-2H-tetrazole (128). Similar to the described procedure for the synthesis of 130, compound 128 was synthesized using the following reagents: 123 (100 mg, 0.36 mmol), p-anisidine (90 mg, 0.73 mmol), and NaNO2 (55 mg, 0.80 mmol). The product was obtained as a light yellow solid (25 mg, 27%) after purifying by flash column chromatography on silica gel eluting with 6:1 DCM/Hex and 1% TEA. 1H NMR spectral data matched literature values.188

5-(4-Methoxyphenyl)-2-phenyl-2H-tetrazole (129). Similar to the described procedure for the synthesis of 130, compound 129 was synthesized using the following reagents: 125 (70 mg, 0.23 mmol), aniline (43 mg, 0.46 mmol), and NaNO2 (35 mg, 0.51 mmol). The product was obtained as a solid (12 mg, 20%) after purifying by flash column chromatography on silica gel eluting with 1:1 DCM/Hex and 1% TEA. 1H NMR spectral data matched literature values.188


Methyl 4-(2-(4-methoxyphenyl)-2H-tetrazol-5-yl)benzoate (131). Similar to the described procedure for the synthesis of 130, compound 131 was synthesized using the following reagents: 127 (100 mg, 0.3 mmol), p-anisidine (74 mg, 0.6 mmol), and NaNO2 (46 mg, 0.66 mmol). The product was obtained as a light pink solid (66 mg, 71%) after recrystallization from EtOH. 1H NMR spectral data matched literature values.187

4-(2-(4-Methoxyphenyl)-2H-tetrazol-5-yl)phenol (132). Tetrazole 130 (0.3 g, 0.97 mmol) was dissolved in MeOH (8 mL) and a solution of NH4OAc (0.6 g, 7.7 mmol) in water (2 mL) was added. The reaction was heated to 60 °C and stirred overnight. Next, the reaction mixture was allowed to cool to rt, diluted with water (10 mL), and extracted with DCM (3 × 10 mL). The organic layers were combined, washed with brine (10 mL), dried over Na2SO4, filtered, and concentrated to obtain the free phenol 132 in quantitative yield as a solid. 1H NMR (300 MHz, CDCl3) δ 8.11 (m, 4H), 7.22 (d, J = 9.0 Hz, 2H), 7.05 (d, J = 8.7 Hz, 2H), 3.92 (s, 3H) ppm.

4-(2-(4-Methoxyphenyl)-2H-tetrazol-5-yl)benzoic acid (133). Compound 131 (20 mg, 0.06 mmol) was taken up in water and THF (1:1, 0.46 mL) and LiOH monohydrate (11 mg, 0.26 mmol) was added. The reaction mixture was heated to 80 °C for 4 h and then allowed to cool to rt. The THF was removed under reduced pressure and the residue was diluted with water (5 mL), acidified to pH 4 with 1 M HCl, and extracted with EtOAc (3 × 5 mL). The combined organic layers were washed with water (10 mL) and brine (5 mL), dried over Na2SO4, filtered, and concentrated. Compound 133 was obtained as a white solid in quantitative yield. 1H NMR spectral data matched literature values.219

tert-Butyl (3-(4-(2-(4-methoxyphenyl)-2H-tetrazol-5-yl)phenoxy)propyl)carbamate (134). The tetrazole 132 (100 mg, 0.36 mmol) was dissolved in dry DMF (3 mL). Then, K2CO3 (247 mg, 1.78 mmol) and N-Boc-3-bromopropylamine (255 mg, 1.07 mmol) were added and the reaction was stirred at 50 °C overnight. After cooling to rt, water (10 mL) was added and the reaction mixture was extracted with EtOAc (3 × 10 mL). The combined organic layers were washed with water (3 × 20 mL), 5% aqueous LiCl (20 mL), and brine (10 mL). The organic layer was dried over Na2SO4, filtered, and concentrated. The residue was purified by column chromatography on silica gel, eluting with 2:1 Hex/EtOAc and 1% TEA. The tetrazole 134 was obtained as a white solid (132 mg, 88%). 1H NMR (300 MHz, CDCl3) δ 8.16 (d, J = 9.0 Hz, 2H), 8.09 (d, J = 8.7 Hz, 2H), 7.05 (m, 4H), 4.82 (s, br, 1H), 4.10 (t, J = 6.0 Hz, 2H), 3.88 (s, 3H), 3.38 (m, 2H), 2.03 (t, J = 6.6 Hz, 2H), 1.44 (s, 9H) ppm; HRMS-ESI (m/z): [M+H]+ calcd for C22H28N5O4 426.21358, found 426.21404.

3-(4-(2-(4-Methoxyphenyl)-2H-tetrazol-5-yl)phenoxy)propan-1-amine (135). N-Boc-tetrazole 134 (130 mg, 0.30 mmol) was dissolved in DCM (3 mL) and cooled to 0 °C. A solution of 4 N HCl in 1,4-dioxane (1 mL) was added and the reaction was stirred at rt for 30 min, then concentrated to dryness to furnish 135 (87 mg, 77%) as a white solid. 1H NMR (300 MHz, CD3OD) δ 8.14 (m, 4H), 7.17 (m, 4H), 4.21 (m, 2H), 3.90 (s, 3H), 3.19 (t, J = 7.5 Hz, 2H), 2.20 (m, 2H) ppm.


5-(Dimethylamino)-N-(3-(4-(2-(4-methoxyphenyl)-2H-tetrazol-5-yl)phenoxy)propyl)naphthalene-1-sulfonamide (136). Tetrazole 135 (20 mg, 0.055 mmol) was stirred in an ice-cold solution of TEA (46 μL, 0.33 mmol) in DCM (1 mL). Dansyl chloride (17 mg, 0.061 mmol) was added and the reaction was stirred at rt overnight. The reaction mixture was concentrated under reduced pressure and the residue was purified by column chromatography on silica gel, eluting with 5% Et2O in DCM, to obtain 136 (24 mg, 77%) as a bright yellow solid. 1H NMR (300 MHz, CDCl3) δ 8.50 (d, J = 7.5 Hz, 1H), 8.29 (m, 2H), 8.10 (d, J = 8.1 Hz, 4H), 7.49 (m, 2H), 7.13-7.03 (m, 4H), 6.82 (d, J = 8.4 Hz, 1H), 5.41 (s, br, 2H), 3.88 (s, 3H), 3.69 (t, J = 6.0 Hz, 2H), 3.13 (m, 2H), 2.84 (s, 6H), 1.92 (m, 2H) ppm; HRMS-ESI (m/z): [M+H]+ calcd for C29H31N6O4S 559.21220, found 559.21331.

N-(3-(4-(2-(4-Methoxyphenyl)-2H-tetrazol-5-yl)phenoxy)propyl)-5-((3aS,4S,6aR)-2-oxohexahydro-1H-thieno[3,4-d]imidazol-4-yl)pentanamide (137). The tetrazole 135 (20 mg, 0.055 mmol) was dissolved in dry DMF (0.4 mL). Next, DIPEA (48 μL, 0.28 mmol) was added, followed by NHS-biotin (19 mg, 0.055 mmol). The reaction was stirred at rt overnight and then poured over ice. The precipitate was filtered, washed with water, collected, and dried under high vacuum. Compound 137 (24 mg, 80%) was obtained as a white solid. 1H NMR (300 MHz, DMSO-d6) δ 8.09 (m, 4H), 7.91 (m, 1H), 7.23 (d, J = 8.7 Hz, 2H), 7.15 (d, J = 7.1 Hz, 2H), 6.42 (s, 1H), 6.36 (s, 1H), 4.28 (m, 1H), 4.08 (m, 3H), 3.87 (s, 3H), 3.21 (m, 2H), 3.07 (m, 1H), 2.80 (m, 1H), 2.57 (m, 1H), 2.07 (m, 2H), 1.88 (m, 2H), 1.51-1.30 (m, 6H) ppm; LRMS-ESI (m/z): [M+H]+ calcd for C27H34N7O4S 552.24, found 552.24.


(S)-2-((tert-Butoxycarbonyl)amino)-6-((((2E,4E)-hexa-2,4-dien-1-yloxy)carbonyl)amino)hexanoic acid (139). Diphosgene (0.27 mL, 2.24 mmol) was added dropwise to trans,trans-2,4-hexadien-1-ol (0.22 g, 2.24 mmol) and K2CO3 (0.62 g, 4.5 mmol) in dry Et2O (6 mL). The resulting mixture was allowed to stir overnight at rt, filtered, and concentrated under reduced pressure. The corresponding chloroformate was obtained as a clear liquid and, without further purification, was dissolved in THF (1 mL). The solution was added dropwise to an ice-cold solution of Boc-L-Lys-OH (0.66 g, 2.69 mmol) and aqueous 1 M NaOH (1 mL) in THF (3 mL). The reaction was allowed to run overnight at rt. The volatiles were removed under reduced pressure and the residue was diluted with water (15 mL) and then washed with EtOAc (15 mL). The aqueous layer was acidified with 5% citric acid to pH 3-4 and extracted with EtOAc (3 × 15 mL). The combined organic layers were washed with water (40 mL) and brine (20 mL). The resulting organic layer was dried over Na2SO4, filtered, and concentrated in vacuo to dryness to furnish 139 (0.43 g, 52%) as an off-white foam. 1H NMR (300 MHz, CDCl3) δ 10.20 (s, br, 1H), 6.25 (m, 1H), 6.06 (m, 1H), 5.75-5.53 (m, 3H), 5.34 (m, br, 0.5H), 5.03 (s, br, 0.5H), 4.54 (m, 2H), 4.25 (m, br, 1H), 1.75-1.23 (m, 18H) ppm; HRMS-ESI (m/z): [M+Na]+ calcd for C18H30N2O6Na 393.2002, found 393.1991.

(S)-2-Amino-6-((((2E,4E)-hexa-2,4-dien-1-yloxy)carbonyl)amino)hexanoic acid TFA salt (140). To a solution of 139 (50 mg, 0.13 mmol) and TES (43 μL, 0.27 mmol) in dry DCM (1.9 mL), TFA (0.1 mL, 1.35 mmol) was added dropwise, and the reaction mixture was allowed to stir at rt overnight. The volatiles were removed under reduced pressure and the residue was dissolved in a minimal amount of MeOH and precipitated into ice-cold Et2O. The precipitate was pelleted by centrifugation, the supernatant was decanted, and the solid was washed with Et2O before drying under vacuum, affording the amino acid 140 in quantitative yield as a white solid. 1H NMR (400 MHz, D2O) δ 6.23 (m, 1H), 6.07 (m, 1H), 5.77-5.57 (m, 2H), 4.02-3.99 (m, 2H), 3.07 (t, J = 6.8 Hz, 2H), 2.99 (t, J = 7.6 Hz, 1H), 1.97-1.37 (m, 9H) ppm.

2,5-Dioxopyrrolidin-1-yl (furan-2-ylmethyl) carbonate (142). N,N’-Disuccinimidyl carbonate (593 mg, 2.31 mmol) was added to a solution of furfuryl alcohol (100 µL, 1.16 mmol) and TEA (484 µL, 3.47 mmol) in dry MeCN (4 mL) at rt. The resulting mixture was stirred at rt overnight and then concentrated under vacuum. The product was purified by column chromatography on silica gel, eluting with DCM, to deliver 142 (132 mg, 57%) as a white solid. 1H NMR (300 MHz, CDCl3) δ 7.47 (m, 1H), 6.49 (d, J = 3.0 Hz, 1H), 6.36 (m, 1H), 5.07 (s, 2H), 2.65 (s, 4H) ppm.

N2-(tert-Butoxycarbonyl)-N6-((furan-2-ylmethoxy)carbonyl)-L-lysine (143). Compound 142 (0.11 g, 0.52 mmol) and Boc-Lys-OH (0.155 g, 0.631 mmol) were dissolved in dry DMF (2 mL) and the reaction mixture was stirred overnight at rt. Water (10 mL) and EtOAc (10 mL) were added and the aqueous layer was extracted with EtOAc (3 × 10 mL). The combined organic extracts were washed with water (3 × 25 mL) and brine (10 mL), dried over Na2SO4, filtered, and concentrated under reduced pressure to dryness to afford 143 (75 mg, 44%) as an off-white solid. 1H NMR (300 MHz, CDCl3) δ 7.40 (m, 1H), 6.40 (d, J = 3.0 Hz, 1H), 6.34 (m, 1H), 5.24 (m, br, 1H), 5.03 (s, 2H), 4.92 (s, br, 1H), 4.92 (t, J = 5.5 Hz, 1H), 4.29 (m, br, 1H), 3.19 (m, 2H), 1.87 (m, 2H), 1.72 (m, 2H), 1.65-1.40 (m, 13H) ppm.

tert-Butyl(cyclohex-3-en-1-ylmethoxy)dimethylsilane (146). Compound 146 was synthesized according to a reported procedure191 from 3-cyclohexene-1-methanol (0.2 mL, 1.71 mmol), TBDMSCl (0.31 g, 2.06 mmol), and imidazole (0.23 g, 3.42 mmol) in dry DMF (5 mL) to deliver a clear oil (373 mg, 96%). 1H NMR spectral data matched literature values.191

tert-Butyl((3,4-dibromocyclohexyl)methoxy)dimethylsilane (147). Compound 147 was synthesized according to a reported procedure191 from cyclohexene 146 (109 mg, 0.48 mmol) and Br2 (28 µL, 0.53 mmol) in CCl4 (2 mL) to deliver a yellow oil (153 mg, 84%). 1H NMR spectral data matched literature values.191

tert-Butyl(cyclohexa-2,4-dien-1-ylmethoxy)dimethylsilane (148). Compound 148 was synthesized according to a reported procedure191 from dibromide 147 (100 mg, 0.26 mmol), KOtBu (64 mg, 0.57 mmol), and Aliquat 336 (2.4 µL, 0.005 mmol) in dry THF (3 mL) to deliver a yellow oil (39 mg, 67%). 1H NMR (300 MHz, CDCl3) δ 5.94-5.65 (m, 4H), 3.57 (m, 2H), 2.46 (m, 1H), 2.29-2.08 (m, 2H), 0.85 (s, 6H), 0.03 (s, 9H) ppm.

Cyclohexa-2,4-dien-1-ylmethanol (149). Cyclohexadiene 148 (30 mg, 0.13 mmol) was dissolved in MeOH (0.3 mL) and treated with Dowex 50WX4-100 resin (30 mg). The reaction mixture was shaken for 2.5 h. Then, the resin was filtered off and washed with MeOH (0.5 mL). The filtrate was concentrated and the crude product 149 (14 mg, 95%) was obtained as a yellow oil and used without further purification. 1H NMR spectral data matched literature values.191

Cyclohexa-2,4-dien-1-ylmethyl (2,5-dioxopyrrolidin-1-yl) carbonate (150). N,N’-Disuccinimidyl carbonate (0.28 g, 1.09 mmol) was added to a solution of alcohol 149 (60 mg, 0.54 mmol) and TEA (230 µL, 1.63 mmol) in dry MeCN (4 mL) at rt. The resulting mixture was stirred at rt overnight and then concentrated under vacuum. The product was purified by column chromatography on silica gel, eluting with DCM, to deliver 150 (92 mg, 70%) as a white solid. 1H NMR (300 MHz, CDCl3) δ 6.03-5.62 (m, 4H), 4.24 (d, J = 6.6 Hz, 2H), 2.84 (s, 4H), 2.73 (m, 1H), 2.40-2.08 (m, 2H) ppm.

N2-(tert-Butoxycarbonyl)-N6-((cyclohexa-2,4-dien-1-ylmethoxy)carbonyl)-L-lysine (151). Boc-Lys-OH (127 mg, 0.52 mmol) was added to a stirred solution of 150 (65 mg, 0.26 mmol) in dry DMF (1 mL). The reaction was allowed to continue overnight at rt. The mixture was diluted with water (5 mL) and extracted with EtOAc (3 × 5 mL). The combined organic layers were washed with water (3 × 15 mL) and brine (5 mL). The resulting organic layer was dried over Na2SO4, filtered, and concentrated under vacuum to dryness. Compound 151 (90 mg, 91%) was obtained as an off-white foam. 1H NMR (300 MHz, CDCl3) δ 5.95-5.62 (m, 4H), 5.30 (m, br, 1H), 5.00-4.90 (m, br, 1H), 4.28 (m, br, 1H), 4.00 (m, 2H), 3.17 (m, 2H), 2.62 (m, 1H), 2.24-2.06 (m, 2H), 1.83 (m, 2H), 1.15-1.43 (m, 15H) ppm.


(S)-2-((tert-Butoxycarbonyl)amino)-6-(furan-2-carboxamido)hexanoic acid (154). 2-

Furancarboxylic acid (100 mg, 0.89 mmol) was stirred in thionyl chloride (0.65 mL, 8.9 mmol) at 80 °C for 2 h. At this time the reaction was completed (by TLC) and the reaction mixture was concentrated to obtain the corresponding acyl chloride. 1H NMR (300 MHz,

CDCl3) δ 7.75 (m, 1H), 7.50 (d, J = 4.5 Hz, 1H), 6.63 (dd, J = 1.5, 1.8 Hz, 1H) ppm. A solution of the acyl chloride in THF (0.5 mL) was added dropwise to an ice-cold solution of

Boc-L-Lys-OH (0.25 g, 1.0 mmol) and 1 M aqueous NaOH (0.5 mL) in THF (1.5 mL). The reaction was allowed to proceed overnight at rt. The volatiles were removed under reduced pressure and the residue was diluted with water (8 mL) and then washed with EtOAc (8 mL).

The aqueous layer was acidified with 5% citric acid to pH 3-4 and extracted with EtOAc (3 × 8 mL). The combined organic layers were washed with water (20 mL) and brine (10 mL). The resulting organic layer was dried over Na2SO4, filtered, and concentrated in vacuo to furnish

154 (228 mg, 75%) as an off-white foam. 1H NMR (300 MHz, CDCl3) δ 7.43 (m, 1H), 7.12

(d, J = 3.6 Hz, 1H), 6.50 (dd, J = 1.8, 1.8 Hz, 1H), 5.22 (m, br, 1H), 4.30 (s, br, 1H), 3.46 (t,

J = 5.7 Hz, 2H), 1.90-1.25 (m, 15H) ppm; HRMS-ESI (m/z): [M-H]- calcd for C16H23N2O6

339.15506, found 339.15612.

(S)-2-Amino-6-(furan-2-carboxamido)hexanoic acid TFA salt (155). To a solution of 154

(170 mg, 0.50 mmol) and TES (160 μL, 1.0 mmol) in dry DCM (7 mL), TFA (0.37 mL, 5.0 mmol) was added dropwise, and the reaction mixture was allowed to stir at rt overnight. The volatiles were removed under reduced pressure and the residue was dissolved in a minimal amount of MeOH and precipitated into ice-cold Et2O. The precipitate was pelleted by

centrifugation, the supernatant was decanted, and the solid was washed with Et2O before drying under vacuum, affording the amino acid 155 in quantitative yield as a white solid.

1H NMR (300 MHz, DMSO-d6) δ 7.81 (m, 1H), 7.07 (d, J = 4.2 Hz, 1H), 6.60 (dd, J = 1.5,

2.1 Hz, 1H), 3.52 (m, 1H), 3.20 (m, 2H), 1.68 (m, 2H), 1.48-1.25 (m, 4H) ppm; HRMS-ESI

(m/z): [M-H]- calcd for C11H15N2O4 239.10373, found 239.10345.

(S)-2-((tert-Butoxycarbonyl)amino)-6-(((furan-2-ylmethyl)carbamoyl)oxy)hexanoic acid

(157). 6-Hydroxy-Boc-L-norleucine-OH (150 mg, 0.61 mmol) was dissolved in a solution of dry DCM (4 mL) and DIPEA (211 µL, 1.21 mmol). The solution was chilled to 0 °C before the addition of furfuryl isocyanate (78 µL, 0.73 mmol) and the reaction was allowed to proceed at rt overnight. DCM (2 mL) and 5% citric acid (6 mL) were added. The aqueous layer was extracted with DCM (3 × 6 mL) and the combined organic layers were washed with water (15 mL) and brine (10 mL). The resulting organic layer was dried over Na2SO4, filtered and concentrated under reduced pressure. The remaining residue was purified by column chromatography on silica gel, eluting with 50% hexanes in EtOAc to 100% EtOAc to

furnish 157 (140 mg, 69% yield) as an off-white foam. 1H NMR (300 MHz, CDCl3) δ 7.34

(m, 1H), 6.30 (m, 1H), 6.21 (m, 1H), 5.20 (m, br, 2H), 4.34 (m, 2H), 4.10 (m, 2H), 3.65 (t, J

= 5.2 Hz, 1H), 1.85-1.43 (m, 15H) ppm; HRMS-ESI (m/z): [M+Na]+ calcd for C17H26N2O7Na

393.1638, found 393.1622.

(S)-2-Amino-6-(((furan-2-ylmethyl)carbamoyl)oxy)hexanoic acid TFA salt (158). TFA

(0.18 mL, 2.4 mmol) was added dropwise to a solution of 157 (90 mg, 0.24 mmol) and TES

(78 μL, 0.49 mmol) in dry DCM (0.72 mL), and the reaction mixture was allowed to stir at rt overnight. The volatiles were removed under reduced pressure and the residue was dissolved in a minimal amount of MeOH and precipitated into ice-cold Et2O. The precipitate was pelleted by centrifugation, the supernatant was decanted, and the solid was washed with Et2O and dried under vacuum, affording the amino acid 158 (91 mg, 98%) as a white solid. 1H

NMR (300 MHz, DMSO-d6) δ 7.55 (m, 1H), 6.37 (m, 1H), 6.20 (m, 1H), 4.15 (d, J = 5.7 Hz,

2H), 3.93 (t, J = 6.9 Hz, 2H), 3.49 (t, J = 5.2 Hz, 1H), 1.71-1.38 (m, 6H) ppm; HRMS-ESI

(m/z): [M+H]+ calcd for C12H19N2O5 271.1288, found 271.1294.

Protocol for the Diels-Alder cycloaddition of 139 and 157. The 1,3-diene 139 or 157 (0.05 mmol) and N-phenylmaleimide (20 equivalents) were stirred in a solution of water and methanol (1:1.5, 10 mM diene) at rt for 18 h. The reaction was analyzed by TLC and consumption of the starting diene was observed. Samples were taken and analyzed by MS.

HRMS-ESI (m/z): [M+Na]+, Diels-Alder product of 139 calcd for C28H37N3O8Na 566.2478, found 566.2463; Diels-Alder product of 157 calcd for C27H33N3O9Na 566.2114, found

566.2095.
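
The calculated [M+Na]+ value for a Diels-Alder adduct follows from adding the elemental compositions of the diene and the dienophile. The Python sketch below reproduces that arithmetic for the adduct of 157 (C17H26N2O7, taken from its HRMS entry) with N-phenylmaleimide (C10H7NO2). The helper names are illustrative, the isotope masses are standard monoisotopic values, and the electron mass (about 0.0005 u) is neglected, which is an assumption of this sketch.

```python
import re

# Monoisotopic masses (u) of the most abundant isotope of each element.
MONO = {"C": 12.0, "H": 1.00782503, "N": 14.0030740,
        "O": 15.9949146, "Na": 22.9897693}


def parse_formula(formula):
    """Return {element: count} for a simple formula such as 'C17H26N2O7'."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + int(number or 1)
    return counts


def combine(*formulas):
    """Add the elemental compositions of several formulas."""
    total = {}
    for formula in formulas:
        for element, count in parse_formula(formula).items():
            total[element] = total.get(element, 0) + count
    return total


def mono_mass(counts):
    """Sum of monoisotopic atomic masses (electron mass neglected)."""
    return sum(MONO[element] * count for element, count in counts.items())


# A cycloaddition loses no atoms: adduct = diene + dienophile, plus Na.
adduct_157 = combine("C17H26N2O7", "C10H7NO2", "Na")
calcd_157 = mono_mass(adduct_157)
ppm_157 = (566.2095 - calcd_157) / calcd_157 * 1e6  # deviation of the found value

print(adduct_157)
print(f"calcd {calcd_157:.4f}, deviation {ppm_157:.1f} ppm")
```

Because a cycloaddition is an addition reaction, the adduct composition is the plain sum of both partners, so a matching [M+Na]+ peak confirms the adduct mass but not its regio- or stereochemistry.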

Bicyclo[2.2.1]hept-5-en-2-yl (2,5-dioxopyrrolidin-1-yl) carbonate (162). N,N’-

Disuccinimidyl carbonate (6.3 g, 24 mmol) was added to a solution of 5-norbornene-2-ol

(endo/exo mixture, 1.5 g, 14 mmol) and TEA (5.7 mL, 41 mmol) in dry MeCN (50 mL) at rt.

The resulting mixture was stirred overnight and then concentrated under vacuum. The product was purified by column chromatography on SiO2 (1-5% Et2O in DCM) to deliver

162 (2.8 g, 82%, 7:3 endo/exo) as a white solid; mp 94-97 °C. 1H NMR (300 MHz, CDCl3) δ

6.32 and 6.23 (mendo, ddexo, J = 2.7 Hz, 1H), 5.94 and 5.89 (mendo, texo, J = 3.6 Hz, 1H), 5.28 and 4.66 (mendo, dexo, J = 5.7 Hz, 1H), 3.19 and 3.00 (sendo, sexo, 1H), 2.84 (s, 1H), 2.80 (s,

4H), 2.21-2.13 and 1.81-1.57 (mendo, mexo, 1H), 1.52-1.49 (m, 1H), 1.32 (d, J = 9.0 Hz, 1H),

1.14-1.08 (dt, J1 = 12.9 Hz, J2 = 2.4 Hz, 1H) ppm; 13C NMR (75 MHz, CDCl3) δ 169.02,

168.95, 151.25, 142.10, 139.16, 131.69, 130.90, 83.20, 82.76, 47.58, 47.23, 46.23, 45.72,

42.16, 40.52, 34.43, 25.44 ppm; HRMS-ESI (m/z): [M+Na]+ calcd for C12H13NO5Na 274.0686, found 274.0683.
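
As a check on the reported yield of 162, the percent yield can be recomputed from the masses given above. The sketch below assumes a 1:1 stoichiometry with 5-norbornene-2-ol (C7H10O) as the limiting reagent and uses standard average atomic weights; the function names are illustrative only.

```python
# Standard average atomic weights (g/mol), rounded to three decimals.
ATOMIC_WEIGHT = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}


def molar_mass(composition):
    """Average molar mass (g/mol) from an {element: count} mapping."""
    return sum(ATOMIC_WEIGHT[el] * n for el, n in composition.items())


def percent_yield(product_g, product_mw, limiting_g, limiting_mw):
    """Percent yield for a 1:1 reaction, from isolated and starting masses."""
    return 100.0 * (product_g / product_mw) / (limiting_g / limiting_mw)


norbornenol = {"C": 7, "H": 10, "O": 1}             # 5-norbornene-2-ol
carbonate_162 = {"C": 12, "H": 13, "N": 1, "O": 5}  # formula from the HRMS entry

yield_162 = percent_yield(2.8, molar_mass(carbonate_162),
                          1.5, molar_mass(norbornenol))
print(f"{yield_162:.0f}%")
```

The result agrees with the reported 82% when the unrounded amount of alcohol (1.5 g, about 13.6 mmol) is used rather than the rounded 14 mmol.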

(S)-6-((((1S,4S)-Bicyclo[2.2.1]hept-5-en-2-yloxy)carbonyl)amino)-2-((tert-butoxycarbonyl)amino)hexanoic acid (163). Boc-Lys-OH (3.2 g, 13 mmol) was added to a stirred solution of 162 (2.5 g, 10 mmol) in dry DMF (35 mL). The reaction was allowed to proceed overnight at rt. The mixture was diluted with water (150 mL) and extracted with

EtOAc (3 × 150 mL). The combined organic layers were washed with water (3 × 100 mL) and brine (75 mL). The resulting organic layer was dried over Na2SO4, filtered and concentrated under vacuum to dryness. Compound 163 (3.6 g, 95%) was obtained as an off-white foam. 1H NMR (300 MHz, CDCl3) δ 9.11 (s, br, 1H), 8.03 (s, br, 1H), 6.30-6.21 (m,

1H), 5.95-5.93 (m, 1H), 5.30 and 4.59 (d, brendo, J = 7.2 Hz; d, brexo, J = 6.9 Hz, 1H), 5.24 (s, br, 1H), 4.86 (m, br, 1H), 4.77 (m, br, 1H), 4.28 (s, br, 1H), 4.09 (m, br, 1H), 3.12 (m, br,

2H), 2.80 (m, br, 1H), 2.09 (m, 1H), 1.81-1.28 (m, br, 15H), 0.90 (d, br, J = 12.9 Hz, 1H)

ppm; 13C NMR (75 MHz, CDCl3) δ 175.95, 156.76, 155.58, 140.74, 138.19, 132.49, 131.43,

79.76, 75.35, 75.14, 52.90, 47.39, 47.20, 45.91, 45.74, 41.95, 40.30, 40.14, 34.28, 31.73,

29.14, 28.09, 22.10, 21.75 ppm; HRMS-ESI (m/z): [M+Na]+ calcd for C19H30N2O6Na 405.1996, found 405.1983.

(S)-2-Amino-6-((((1S,4S)-bicyclo[2.2.1]hept-5-en-2-yloxy)carbonyl)amino)hexanoic acid

HCl salt (164). TFA (6.4 mL, 86 mmol) was added dropwise to a solution of 163 (3.3 g, 8.60 mmol) and TES (2.7 mL, 17 mmol) in dry DCM (120 mL), and the reaction mixture was allowed to stir at rt overnight. The solvents were evaporated under reduced pressure. The residue was dissolved in a 1 M HCl solution (5 mL 4 N HCl in 1,4-dioxane, 15 mL dry

MeOH), allowed to stir for 10 min, and concentrated. This process was repeated two more times to ensure complete exchange of the TFA salt for the HCl salt. The concentrated residue was redissolved in a minimal amount of MeOH and precipitated into ice-cold Et2O, filtered and dried under vacuum, affording the amino acid 164 as a white solid in quantitative yield (2.7

g); mp 176-178 °C. 1H NMR (300 MHz, CD3OD) δ 6.30-6.25 (m, 1H), 6.00-5.93 (m, 1H),

5.15 and 4.52 (mendo, mexo, 1H), 4.85 (m, 1H), 3.55 (t, J = 5.4 Hz, 1H), 3.07 (q, J = 6.7 Hz,

2H), 2.81 (d, J = 6.6 Hz, 1H), 2.13-2.05 (m, 1H), 1.93-1.74 (m, 2H), 1.68-1.63 (m, 1H), 1.53-

1.28 (m, 5H), 0.93-0.87 (dt, J1 = 12.3 Hz, J2 = 2.7 Hz, 1H) ppm; 13C NMR (75 MHz,

CD3OD) δ 174.82, 159.52, 142.37, 139.36, 133.84, 132.80, 76.73, 76.73, 56.16, 47.43, 47.13,

43.63, 41.93, 41.42, 35.67, 32.80, 32.07, 30.74, 28.90, 24.22, 23.63 ppm; HRMS-ESI (m/z):

[M+Na]+ calcd for C14H22N2O4Na 305.1472, found 305.1475.
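
Agreement between calculated and found HRMS values is commonly expressed as a relative error in parts per million. A minimal sketch for the values reported for 164 is shown below; the function name is illustrative and no acceptance threshold is implied by the source.

```python
def ppm_error(calcd, found):
    """Relative mass error in ppm; positive when the found m/z is higher."""
    return (found - calcd) / calcd * 1e6


error_164 = ppm_error(305.1472, 305.1475)
print(f"{error_164:.2f} ppm")
```

The same function can be applied to any calcd/found pair in this section to compare the accuracy of individual measurements.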

4-(1,2,4,5-Tetrazin-3-yl)benzoic acid (169). Method A (catalyzed): 4-Cyanobenzoic acid

(0.5 g, 3.4 mmol), formamidine acetate salt (1.76 g, 17.0 mmol) and zinc (II) triflate (0.6 g,

1.7 mmol) were mixed and cooled to 0 °C. Anhydrous hydrazine (4.2 mL, 136 mmol) was added slowly and the stirring reaction mixture was allowed to slowly warm to rt. After 20 h, the dark brown mixture was cooled to 0 °C and an ice-cold, aqueous solution of sodium nitrite (2.3 g, 34.0 mmol) was added. This was followed by the slow addition of ice-cold 2 M

aqueous HCl until pH 2-3 was reached. During this process toxic nitrogen oxide gases evolved and the mixture turned bright pink. After the addition of aqueous HCl, stirring was continued for 30 min at the same temperature. Ethyl acetate (20 mL) was added and the phases were separated. The aqueous layer was extracted with EtOAc (4 × 20 mL), or until the last organic extract was only faintly pink. The organic layers were combined, washed with 1 M HCl (80 mL), water (80 mL) and brine (40 mL), dried over

Na2SO4, and concentrated under reduced pressure. The obtained residue was purified by column chromatography on silica gel eluting with DCM/acetone (3:1) to give tetrazine 169

(391.4 mg, 56%) as a bright pink solid. Method B (uncatalyzed): Similar to the procedure described above, a mixture of 4-cyanobenzoic acid (162 mg, 1.10 mmol), formamidine acetate (575 mg, 5.52 mmol), and hydrazine hydrate (1.7 mL, 64% hydrazine solution) was stirred at 50 °C for 20 h. Oxidation (NaNO2: 0.76 g, 11.0 mmol) and purification were performed as described above to deliver 169 (42 mg, 19%) as a bright pink solid. 1H NMR spectral data matched literature values.209
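
The reagent amounts in Method A are easier to compare with other tetrazine preparations when expressed as molar equivalents relative to the nitrile. The sketch below does this conversion for the quantities stated for 169; the function and dictionary names are illustrative.

```python
def equivalents(reagents_mmol, limiting_mmol):
    """Molar equivalents of each reagent relative to the limiting reagent."""
    return {name: round(mmol / limiting_mmol, 2)
            for name, mmol in reagents_mmol.items()}


# Amounts (mmol) stated for Method A; 4-cyanobenzoic acid (3.4 mmol) is limiting.
method_a_mmol = {
    "formamidine acetate": 17.0,
    "zinc(II) triflate": 1.7,
    "anhydrous hydrazine": 136.0,
    "sodium nitrite": 34.0,
}

method_a_equiv = equivalents(method_a_mmol, limiting_mmol=3.4)
for name, eq in method_a_equiv.items():
    print(f"{name}: {eq} equiv")
```

Expressed this way, Method A uses 5 equiv of formamidine acetate, 0.5 equiv of catalyst, 40 equiv of hydrazine, and 10 equiv of sodium nitrite.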

4-(6-Methyl-1,2,4,5-tetrazin-3-yl)benzoic acid (170). Method A (catalyzed): In a similar experiment to that of 169, tetrazine 170 was synthesized by reacting 4-cyanobenzoic acid

(150 mg, 1.0 mmol), MeCN (0.27 mL, 5.1 mmol) and nickel (II) triflate (182 mg, 0.51

mmol) with anhydrous hydrazine (1.3 mL, 40.8 mmol). After oxidation (NaNO2: 0.7 g, 10.2 mmol), the tetrazine 170 was obtained as a bright pink solid (140 mg, 63%). Method B (uncatalyzed): Synthesized from 4-cyanobenzoic acid (150 mg, 1.0 mmol), MeCN (0.27 mL, 5.1 mmol) and hydrazine hydrate (1.6 mL, 64% hydrazine solution) at 50 °C, and NaNO2 (0.35 g, 5.1 mmol). Compound 170 (125 mg, 56%) was obtained as a bright pink solid. 1H NMR spectral data matched literature values.209

4-(6-Phenyl-1,2,4,5-tetrazin-3-yl)benzoic acid (171). Hydrazine hydrate (12 mL, 64% hydrazine solution) was slowly added to a stirring mixture of 4-cyanobenzoic acid (1.0 g, 6.8 mmol) and benzonitrile (2.8 mL, 27.2 mmol) at 0 °C. The mixture slowly turned into a homogeneous solution at rt and was then heated to 70 °C for 16 h. The resulting orange suspension was cooled to 0 °C and an aqueous solution of sodium nitrite (10 g, 145 mmol) was added. This was followed by the slow addition of ice-cold 2 M HCl until pH 2-3 was reached. During this process toxic nitrogen oxide gases evolved and the mixture turned bright pink. After the addition of aqueous HCl, stirring was continued for 30 min at the same temperature. The solid was filtered and washed with water (3 × 40 mL). Next, the solid was stirred in refluxing acetone (40 mL) and filtered off while hot. The collected purple solid was added to hot DMF (20 mL) and allowed to stir for 5 min before filtering and washing with water (40 mL) and acetone (40 mL). The solid was collected and dried under

vacuum to furnish 171 as a purple solid (382 mg, 20%). 1H NMR (400 MHz, DMSO-d6) δ

8.67 (d, J = 8.8 Hz, 2H), 8.58 (d, J = 6.4 Hz, 2H), 8.25 (d, J = 8.8 Hz, 2H), 7.73 (m, 3H)

ppm; HRMS-ESI (m/z): [M-H]- calcd for C15H9N4O2 277.07200, found 277.07364.

2,5-Dioxopyrrolidin-1-yl 4-(1,2,4,5-tetrazin-3-yl)benzoate (173). The tetrazine 169 (39 mg, 0.19 mmol) and DMAP (12 mg, 0.10 mmol) were dissolved in dry THF (2 mL). The reaction mixture was cooled to 0 °C and NHS (33 mg, 0.29 mmol) and EDCI (55 mg, 0.29 mmol) were added. The mixture was heated to 50 °C and stirred at this temperature overnight. Then, it was concentrated and dissolved in EtOAc (2 mL), washed with saturated

NaHCO3 (2 mL), brine (1 mL), dried over Na2SO4, filtered, and concentrated. Compound

173 (49 mg, 86%) was obtained as a bright pink solid without further purification. 1H NMR

(400 MHz, DMSO-d6) δ 10.70 (s, 1H), 8.77 (d, J = 8.4 Hz, 2H), 8.39 (d, J = 8.4 Hz, 2H),

2.93 (s, 4H) ppm.

2,5-Dioxopyrrolidin-1-yl 4-(6-methyl-1,2,4,5-tetrazin-3-yl)benzoate (174). Synthesized in a similar way to 173 using the following reagents: 170 (20 mg, 0.092 mmol), DMAP (6 mg,

0.046 mmol), NHS (16 mg, 0.14 mmol), and EDCI (26 mg, 0.14 mmol) in dry THF (1.2 mL). Compound 174 was obtained in quantitative yield as a bright pink solid. 1H NMR (300

MHz, DMSO-d6) δ 8.74 (d, J = 8.4 Hz, 2H), 8.38 (d, J = 8.4 Hz, 2H), 3.04 (s, 3H), 2.92 (s,

4H) ppm; HRMS-ESI (m/z): [M+H]+ calcd for C14H12N5O4 314.08838, found 314.08892.

Synthesis of PEG reagents 175 and 176. NHS-tetrazine 173 or 174 (0.020 mmol) was dissolved in dry DMF (0.5 mL). Dry DCM (0.25 mL) and mPEG5000-NH2 (0.010 mmol) were added. The reaction mixture was stirred at 50 °C overnight. Then, it was allowed to cool to rt and precipitated into Et2O. The solid was pelleted by centrifugation, and the solvent was decanted. The remaining residue was dissolved in a minimal amount of DCM and the

precipitation procedure was repeated two more times to obtain the corresponding PEG reagents 175 or 176 as light pink solids. The PEG loading was determined by measuring the

UV/Vis absorbance at 268 nm of the labeled PEG reagents 175 or 176 (~0.3 mM in MeOH) and comparing with the absorbance of 169 or 170, respectively, at the same concentration level in MeOH. PEG loading was determined to be 97% for 175 and 77% for 176.
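
The loading determination described above amounts to a ratio of two absorbances measured at equal molar concentration. The sketch below states that arithmetic explicitly. It assumes that both samples obey the Beer-Lambert law at the measured concentration and that the PEG backbone does not absorb at 268 nm; the absorbance readings are hypothetical placeholders, not data from this work.

```python
def peg_loading_percent(a_peg_conjugate, a_free_tetrazine):
    """Percent of PEG chains carrying a tetrazine, from absorbances at 268 nm.

    Both absorbances must be recorded at the same molar concentration,
    path length, and solvent (MeOH in the procedure above).
    """
    if a_free_tetrazine <= 0:
        raise ValueError("reference absorbance must be positive")
    return 100.0 * a_peg_conjugate / a_free_tetrazine


# Hypothetical readings chosen only to illustrate the calculation.
example_loading = peg_loading_percent(a_peg_conjugate=0.77, a_free_tetrazine=1.00)
print(f"{example_loading:.0f}% loading")
```

A ratio above 100% would indicate residual unconjugated tetrazine or a concentration error rather than a real loading.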

tert-Butyl (2-(4-(1,2,4,5-tetrazin-3-yl)benzamido)ethyl)carbamate (177). The tetrazine

169 (100 mg, 0.49 mmol) was dissolved in a solution of DIPEA (0.21 mL, 1.22 mmol) in dry

THF (7 mL). HATU (0.28 g, 0.73 mmol) was added and the reaction mixture was allowed to stir at rt for 2 h before the addition of N-Boc-ethylenediamine (0.12 mL, 0.73 mmol). After 1 h, the reaction mixture was concentrated and the residue was dissolved in EtOAc (15 mL), washed with 5% citric acid (15 mL), water (15 mL) and brine (8 mL), dried over Na2SO4, and concentrated under reduced pressure. The obtained residue was purified by column chromatography on silica gel eluting with 20% acetone in DCM, to afford the tetrazine 177

(138 mg, 82%) as a bright pink solid. 1H NMR (300 MHz, CDCl3) δ 10.25 (s, 1H), 8.68 (d, J

= 8.4 Hz, 2H), 8.06 (d, J = 8.4 Hz, 2H), 7.66 (m, br, 1H), 5.15 (t, br, J = 6.0 Hz, 1H), 3.59 (q, br, J = 4.8, 5.4 Hz, 2H), 3.45 (q, br, J = 4.8, 5.4 Hz, 2H), 1.43 (s, 9H) ppm; 13C NMR (75

MHz, CDCl3) δ 166.8, 166.1, 158.1, 138.3, 134.2, 128.5, 128.1, 80.4, 42.9, 39.9, 28.5 ppm;

HRMS-ESI (m/z): [M+K]+ calcd for C16H20N6O3K 383.4662, found 383.1213.

N-(2-Aminoethyl)-4-(1,2,4,5-tetrazin-3-yl)benzamide HCl salt (181). Tetrazine 177 (90 mg, 0.26 mmol) was suspended in dry MeOH (4 mL) and cooled to 0 °C before the slow

addition of 4 N HCl in 1,4-dioxane (1.3 mL). The reaction mixture was stirred at rt for 45 min. The volatiles were removed under reduced pressure and the remaining residue was taken up in Et2O (10 mL). The solvent was decanted and the procedure was repeated two more times. The pink solid was collected and dried under vacuum to yield tetrazine 181 (55.3 mg, 75%). The crude material was considered pure enough for subsequent reactions and the

identity was confirmed by HRMS-ESI (m/z): [M+H]+ calcd for C11H13N6O 245.1145, found

245.1142.

tert-Butyl (2-(4-(6-methyl-1,2,4,5-tetrazin-3-yl)benzamido)ethyl)carbamate (178).

Tetrazine 170 (27 mg, 0.12 mmol) was dissolved in a solution of DIPEA (43 µL, 0.25 mmol) in dry THF (0.6 mL). HATU (56 mg, 0.15 mmol) was added and the reaction mixture was allowed to stir at rt for 2 h before the addition of N-Boc-ethylenediamine (23 µL, 0.15 mmol). After 1 h, the reaction mixture was concentrated and the residue was dissolved in

EtOAc (10 mL), washed with 5% citric acid (10 mL), water (10 mL) and brine (5 mL), dried over Na2SO4, and concentrated under reduced pressure. The obtained residue was purified by column chromatography on silica gel using 20% acetone in DCM to afford tetrazine 178 (24

mg, 54%) as a bright pink solid. 1H NMR (400 MHz, CDCl3) δ 8.63 (d, J = 8.4 Hz, 2H), 8.03

(d, J = 8.4 Hz, 2H), 7.57 (m, br, 1H), 5.09 (m, br, 1H), 3.59 (q, br, J = 4.8, 5.4 Hz, 2H), 3.44

(q, br, J = 4.8, 5.4 Hz, 2H), 3.10 (s, 3H), 1.43 (s, 9H) ppm; 13C NMR (100 MHz, CDCl3) δ

167.7, 166.9, 163.7, 158.0, 137.8, 134.4, 128.0, 80.4, 42.8, 39.9, 28.4, 21.4 ppm; HRMS-ESI

(m/z): [M+H]+ calcd for C17H23N6O3 359.18262, found 359.18382.

N-(2-Aminoethyl)-4-(6-methyl-1,2,4,5-tetrazin-3-yl)benzamide HCl salt (182). Tetrazine

178 (24 mg, 0.067 mmol) was suspended in dry MeOH (0.75 mL) and cooled to 0 °C before the slow addition of 4 N HCl in 1,4-dioxane (0.25 mL). The reaction mixture was stirred at rt for 45 min. The volatiles were removed under reduced pressure and the remaining residue was suspended in Et2O (10 mL). The solvent was decanted and the procedure was repeated two more times. The pink solid was collected and dried under vacuum to yield tetrazine 182

(18.8 mg, 100%). The crude material was considered pure enough for subsequent reactions

and the identity was confirmed by HRMS-ESI (m/z): [M+H]+ calcd for C12H15N6O

259.13019, found 259.13112.

tert-Butyl (2-(4-(6-phenyl-1,2,4,5-tetrazin-3-yl)benzamido)ethyl)carbamate (179).

Tetrazine 171 (45 mg, 0.16 mmol) was dissolved in a solution of DIPEA (0.1 mL, 0.64 mmol) in dry DMSO (0.6 mL). HATU (92 mg, 0.24 mmol) was added and the reaction mixture was allowed to stir at rt for 1 h before the addition of N-Boc-ethylenediamine (33

µL, 0.21 mmol). After 4 h, the reaction mixture was diluted with EtOAc, washed with 5% citric acid (10 mL), water (2 × 10 mL) and brine (5 mL), dried over Na2SO4, and concentrated under reduced pressure. The obtained purple product 179 was used without

further purification (51 mg, 75%). 1H NMR (300 MHz, CDCl3) δ 8.72-8.66 (m, 4H), 8.09 (d,

J = 8.4 Hz, 2H), 7.65 (m, 3H), 7.57 (s, br, 1H), 5.02 (s, br, 1H), 3.62 (m, br, 2H), 3.46 (m, br,

2H), 1.45 (s, 9H) ppm; LRMS-ESI (m/z): [M+H]+ calcd for C22H25N6O3 421.20, found

421.15; [M-H]- calcd for C22H23N6O3 419.18, found 419.10.

N-(2-Aminoethyl)-4-(6-phenyl-1,2,4,5-tetrazin-3-yl)benzamide TFA salt (183). Tetrazine

179 (30 mg, 0.071 mmol) was taken up in dry DCM (0.8 mL) and TFA (0.2 mL) was added.

The reaction mixture was allowed to stir at rt overnight. The volatiles were removed under reduced pressure and the residue was dissolved in MeOH (2 mL) and concentrated again to remove residual TFA by co-evaporation. This was repeated two times and the remaining residue was taken up in Et2O (10 mL). The solvent was decanted and the procedure was repeated two more times. The purple solid was dried under vacuum to yield tetrazine 183 (23 mg, 77%). The crude material was considered pure enough for subsequent

reactions and the identity was confirmed by HRMS-ESI (m/z): [M+H]+ calcd for C17H17N6O

321.14, found 321.10.

tert-Butyl (2-(4-(6-(pyrimidin-2-yl)-1,2,4,5-tetrazin-3-yl)benzamido)ethyl)carbamate (180).

Tetrazine 172 (synthesized according to Chem. Commun. 48, 1736) (29 mg, 0.10 mmol) was dissolved in a solution of DIPEA (36 µL, 0.21 mmol) in dry DMF (0.6 mL). HATU (47 mg,

0.12 mmol) was added and the reaction mixture was allowed to stir at rt for 1 h before the addition of N-Boc-ethylenediamine (20 µL, 0.12 mmol). After 4 h, EtOAc (10 mL) was added and the organic layer was washed with water (3 × 10 mL) and brine (5 mL), dried over

Na2SO4, and concentrated under reduced pressure. The obtained residue was purified by column chromatography on silica gel using 2-10% MeOH in DCM to afford tetrazine 180

(38 mg, 88%) as a purple solid. 1H NMR (300 MHz, CDCl3) δ 9.12 (d, J = 4.5 Hz, 2H), 8.77

(d, J = 8.1 Hz, 2H), 8.08 (d, J = 8.7 Hz, 2H), 7.64 (br, 1H), 7.59 (t, J = 5.1 Hz, 1H), 5.12 (t, br, J = 4.2 Hz, 1H), 3.60 (q, J = 4.5, 5.7 Hz, 2H), 3.44 (m, br, 2H), 1.42 (s, 9H) ppm; 13C

NMR (100 MHz, CDCl3) δ 166.8, 164.2, 163.3, 159.5, 158.6, 158.0, 138.5, 133.9, 129.0,

128.2, 122.7, 80.3, 42.8, 39.9, 28.5 ppm; HRMS-ESI (m/z): [M+H]+ calcd for C20H23N8O3

423.18876, found 423.19028.

N-(2-Aminoethyl)-4-(6-(pyrimidin-2-yl)-1,2,4,5-tetrazin-3-yl)benzamide HCl salt (184).

Tetrazine 180 (28 mg, 0.066 mmol) was dissolved in dry MeOH (1.5 mL) and cooled to 0 °C before the slow addition of 4 N HCl in 1,4-dioxane (0.5 mL). The reaction mixture was stirred at rt for 45 min. The volatiles were removed under reduced pressure and the remaining residue was taken up in Et2O (10 mL). The solvent was decanted and the procedure was repeated two more times. The purple solid was collected and dried under vacuum to yield tetrazine 184 (23 mg, 97%). The crude material was considered pure enough for subsequent

reactions and the identity was confirmed by HRMS-ESI (m/z): [M+H]+ calcd for C15H15N8O

323.13633, found 323.13746.

N-(2-(5-(Dimethylamino)naphthalene-1-sulfonamido)ethyl)-4-(1,2,4,5-tetrazin-3-yl)benzamide (185). Tetrazine 181 (17 mg, 0.061 mmol) was dissolved in a solution of TEA

(85 µL, 0.61 mmol) in dry DCM (1 mL) and dry DMF (0.5 mL). The reaction mixture was cooled to 0 °C before adding dansyl chloride (33 mg, 0.12 mmol) and allowed to stir at rt overnight. The reaction mixture was concentrated and purified by column chromatography on silica gel using DCM, then 10-20% acetone in DCM to afford 185 (20.6 mg, 71%) as a

pink solid. 1H NMR (300 MHz, CDCl3) δ 10.26 (s, 1H), 8.60 (d, J = 8.7 Hz, 2H), 8.53 (s, br,

1H), 8.30-8.24 (m, 2H), 7.87 (d, J = 8.7 Hz, 2H), 7.56-7.49 (m, 2H), 7.17 (d, J = 7.5 Hz, 1H),

6.93 (m, br, 1H), 5.60 (m, br, 1H), 3.59 (q, J = 5.4, 5.1 Hz, 2H), 3.23 (q, J = 5.4, 5.1 Hz, 2H),

2.87 (s, 6H) ppm; 13C NMR (75 MHz, CDCl3) δ 167.2, 166.1, 158.2, 138.0, 134.5, 131.0,

130.1, 129.6, 128.9, 128.6, 128.1, 115.6, 45.7, 43.2, 40.3 ppm; HRMS-ESI (m/z): [M+H]+ calcd for C23H24N7O3S 478.1656, found 478.1624.

N-(2-(3-(3',6'-Dihydroxy-3-oxo-3H-spiro[isobenzofuran-1,9'-xanthen]-5-yl)thioureido)ethyl)-4-(1,2,4,5-tetrazin-3-yl)benzamide (186). Tetrazine 181 (12 mg, 0.042 mmol) was dissolved in a solution of TEA (40 µL, 0.28 mmol) in dry MeOH and THF (2:1, 1.2 mL).

The reaction mixture was cooled to 0 °C before adding fluorescein 5-isothiocyanate (11 mg,

0.028 mmol). The reaction mixture was allowed to warm to rt and stirred overnight before concentrating and purifying by column chromatography on silica gel using a gradient of

DCM:acetone:MeOH (80:15:5 to 45:50:5) to afford 186 (10 mg, 62%) as an orange solid. 1H

NMR (300 MHz, acetone-d6) δ 10.45 (s, 1H), 9.57 (s, br, 1H), 9.02 (s, br, 2H), 8.60 (d, 2H, J

= 8.7 Hz), 8.43 (s, br, 0.5H), 8.30 (s, br, 0.5H), 8.16 (d, 2H, J = 8.7 Hz), 8.05 (s, br, 0.5H),

7.88 (s, br, 0.5H), 7.19 (d, 1H, J = 8.1 Hz), 6.69-6.54 (m, 8H), 3.91 (m, br, 2H), 3.76 (q, br,

2H, J = 6.0, 6.4 Hz) ppm; 13C NMR (100 MHz, acetone-d6) δ 182.2, 168.5, 168.1, 159.6,

158.5, 152.7, 148.8, 141.3, 135.0, 130.6, 129.5, 128.5, 128.1, 124.5, 118.3, 112.6, 111.1,

102.6, 54.8, 39.7, 31.3 ppm; HRMS-ESI (m/z): [M+H]+ calcd for C32H24N7O6S 634.1503, found 634.1462.

N-(2-(3-(3',6'-Dihydroxy-3-oxo-3H-spiro[isobenzofuran-1,9'-xanthen]-5-yl)thioureido)ethyl)-4-(6-methyl-1,2,4,5-tetrazin-3-yl)benzamide (187). Tetrazine 182 (13.7 mg, 0.047

mmol) was dissolved in a solution of TEA (50 µL, 0.36 mmol) in dry MeOH and THF (2:1,

1.2 mL). The reaction mixture was cooled to 0 °C before adding fluorescein 5-isothiocyanate

(14 mg, 0.036 mmol). The reaction mixture was allowed to warm to rt and stirred overnight before concentrating and purifying by column chromatography on silica gel using a gradient of DCM:acetone:MeOH (80:15:5 to 45:50:5) to afford 187 (12 mg, 52%) as an orange solid.

1H NMR (400 MHz, acetone-d6) δ 9.64 (s, br, 1H), 9.09 (s, br, 2H), 8.59 (d, J = 8.4 Hz, 2H),

8.42 (s, br, 0.5H), 8.31 (s, br, 0.5H), 8.17 (d, J = 8.4 Hz, 2H), 8.08 (s, br, 0.5H), 7.88 (s, br,

0.5H), 7.22 (d, J = 8.4 Hz, 1H), 6.73-6.70 (m, 5H), 6.61-6.58 (m, 3H), 3.90 (m, br, 2H), 3.73

(q, br, J = 6.0, 6.4 Hz, 2H) ppm; 13C NMR (100 MHz, acetone-d6) δ 169.3, 168.6, 164.3, 160.3,

153.4, 149.4, 142.1, 135.8, 131.2, 130.2, 129.1, 128.4, 125.4, 119.0, 113.3, 111.7, 103.3,

55.5, 45.4, 40.4, 32.0, 20.6 ppm; HRMS-ESI (m/z): [M+H]+ calcd for C33H26N7O6S

648.16598, found 648.16858.

N-(2-(3-(3',6'-Dihydroxy-3-oxo-3H-spiro[isobenzofuran-1,9'-xanthen]-5-yl)thioureido)ethyl)-4-(6-phenyl-1,2,4,5-tetrazin-3-yl)benzamide (188). Tetrazine 183 (18 mg, 0.043 mmol) was dissolved in a solution of TEA (54 µL, 0.39 mmol) in dry MeOH and THF (2:1,

1.5 mL). The reaction mixture was cooled to 0 °C before adding fluorescein 5-isothiocyanate

(15 mg, 0.039 mmol). The reaction mixture was allowed to warm to rt and stirred overnight before concentrating and purifying by column chromatography on silica gel using a gradient of DCM:acetone:MeOH (80:10:10 to 0:90:10) to afford tetrazine 188 (15 mg, 55%) as an

orange solid. 1H NMR (400 MHz, acetone-d6) δ 9.74 (s, br, 1H), 8.99 (s, br, 2H), 8.64-8.62

(m, 4H), 8.41 (s, br, 0.5H), 8.35 (s, br, 1H), 8.21 (d, J = 8.4 Hz, 2H), 7.90 (s, br, 1H), 7.73

(m, 3H), 7.22 (d, J = 8.4 Hz, 1H), 6.73-6.70 (m, 5H), 6.61-6.58 (m, 3H), 3.92 (m, br, 2H),

3.75 (m, br, 2H) ppm; HRMS-ESI (m/z): [M+H]+ calcd for C38H28N7O6S 710.18, found

710.15; [M-H]- calcd for C38H26N7O6S 708.17, found 708.20.

N-(2-(3-(3',6'-Dihydroxy-3-oxo-3H-spiro[isobenzofuran-1,9'-xanthen]-5-yl)thioureido)ethyl)-4-(6-(pyrimidin-2-yl)-1,2,4,5-tetrazin-3-yl)benzamide (189). Tetrazine 184 (13 mg,

0.036 mmol) was dissolved in a solution of TEA (39 µL, 0.28 mmol) in dry MeOH and THF

(2:1, 1.2 mL). The reaction mixture was cooled to 0 °C before adding fluorescein 5-isothiocyanate (11 mg, 0.028 mmol). The reaction mixture was allowed to warm to rt and stirred overnight before concentrating and purifying by column chromatography on silica gel using a gradient of DCM:acetone:MeOH (80:10:10 to 0:90:10) to afford tetrazine 189 (10

mg, 50%) as an orange solid. 1H NMR (400 MHz, CD3OD) δ 9.16 (d, J = 4.8 Hz, 2H), 8.75

(d, J = 8.4 Hz, 2H), 8.12 (d, J = 8.4 Hz, 2H), 8.06 (d, br, J = 2.4 Hz, 1H), 7.81 (t, J = 5.2 Hz,

1H), 7.75 (d, br, J = 7.2 Hz, 1H), 7.14 (d, J = 8.0 Hz, 2H), 6.69-6.63 (m, 5H), 6.51-6.48 (m,

3H), 3.95 (m, 2H), 3.72 (m, 2H) ppm; HRMS-ESI (m/z): [M+H]+ calcd for C36H26N9O6S

712.17213, found 712.17463.

CHAPTER 3: SYNTHESIS OF CAGED TYROSINES FOR THE

PHOTOREGULATION OF PROTEINS

3.1 Introduction to caging groups

The absence of conditional control over biological systems can impair the ability to obtain real-time information on cellular processes. To elucidate cellular and physiological mechanisms fully, synthetic approaches have been developed that provide spatial and temporal control over specific processes. Light is a versatile control element with several advantages over traditional effectors of biological events.

Light is an ideal external trigger: it is non-invasive and generally bioorthogonal, except in photoreceptor and photosynthetic cells or when the wavelength is short enough to cause cellular damage. Most importantly, light can be easily controlled in amplitude, wavelength, timing, and localization, conveying spatiotemporal control of biological activity.

Light can be applied to subcellular components, most cell types, certain tissues, and small model organisms, such as C. elegans and D. rerio, that are commonly used in research laboratories.220

Light-controlled regulation of biological processes is typically achieved with “caging” groups, which are photolabile protecting groups installed on small molecules or macromolecules. Caging groups were first introduced in the mid-1960s221, 222 and have gained increasing importance in both organic synthesis223, 224 and biology225-229 since the first synthesis of a photoactivatable adenosine 5’-triphosphate (ATP) by Hoffmann et al.230

The regulation of biological processes with light has greatly improved over the years. As a result, additional technologies involving light have been developed and more applications

have been reported. To date there are three main approaches to light-controlled regulation of biological processes: 1) “caging and uncaging” for the irreversible photoactivation of processes, 2) photoswitches for a reversible control of biological processes, and 3) optogenetics, which uses genetically encoded elements (i.e., proteins) that can be controlled with light, particularly in neurons.228, 231

The caging and uncaging approach relies on chemical synthesis for the placement of photolabile protecting groups on small molecules. Molecular caging then involves the placement of the caging group in a crucial location to render the substrate biologically inactive (caged). Subsequently, light exposure (uncaging) restores biological activity.

Usually, this can be achieved with brief UV illumination (e.g., 365 nm), upon which the caged compound undergoes a photochemical reaction to release the caging group

(Figure 24).224, 232, 233

Figure 24. Representation of the activation of a caged protein with UV light (365 nm): the inactive (caged) protein is converted to the active (uncaged) protein.

An ideal caging group must meet several requirements. It should 1) be easily introduced, 2) be stable under physiological conditions, 3) have a large molar

extinction coefficient (ɛ), high quantum yield (Φ), and a short decaging half-life (t1/2), 4) render the substrate inactive and completely restore the activity upon photolysis, and 5) release benign byproducts.228, 233
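
The photochemical parameters in requirement 3 can be related by simple arithmetic. The sketch below assumes first-order decaging kinetics under constant irradiation, so that the caged population halves every t1/2, and uses the product of ɛ and Φ as a figure of merit for uncaging efficiency. Both the kinetic model and the numerical inputs are illustrative assumptions, not measurements from this work.

```python
import math


def fraction_uncaged(time, half_life):
    """Fraction of substrate released after `time`, assuming first-order decay."""
    return 1.0 - 2.0 ** (-time / half_life)


def time_to_fraction(fraction, half_life):
    """Irradiation time needed to release a given fraction (0 <= fraction < 1)."""
    return -half_life * math.log2(1.0 - fraction)


def uncaging_efficiency(epsilon, quantum_yield):
    """Product of extinction coefficient (M^-1 cm^-1) and quantum yield."""
    return epsilon * quantum_yield


# Hypothetical caging group: t1/2 = 60 s, epsilon = 5000 M^-1 cm^-1, phi = 0.1.
released_after_one_half_life = fraction_uncaged(60.0, 60.0)
seconds_for_75_percent = time_to_fraction(0.75, 60.0)
efficiency = uncaging_efficiency(5000.0, 0.1)
print(released_after_one_half_life, seconds_for_75_percent, efficiency)
```

Under this model, near-complete release requires several half-lives, which is why a short t1/2 limits the total light exposure of the sample.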

An extensive selection of caging groups with wide-ranging properties and applications has been reported in the literature.220, 228, 233 Among these, the ortho-nitrobenzyl (o-NB) caging group and its analogs are the most commonly used. Their accessibility and ease of synthesis in high yields are among the many advantages of this class. Most importantly, efficient photolysis and compatibility with a variety of functional groups make o-NB groups appealing for use. The o-NB caged substrates undergo decaging under UV irradiation, generally between 254 and 375 nm. This process is usually initiated by single-photon excitation.234 The caged substrate (190) is released according to a Norrish type II mechanism232 that results in a free substrate (191) and a nitroso-benzocarbonyl byproduct (192) (Scheme 22). The rate of the photolysis depends on the nature of the leaving group or substrate (phenols, aliphatic alcohols, thiols, amines, carboxylic acids, and carbamates), the structure of the caging group, pH, and the composition of the solvent or buffer.224

Scheme 22. Norrish type II mechanism for the photolysis of an o-NB substrate.

Despite its widespread acceptance, the o-NB caging group has a few limitations. UV irradiation can photodamage a biological system to some extent. In addition, an amine-reactive byproduct is formed upon photolysis of the caging group.

Fortunately, electronic changes through substituent effects in the o-NB structure allow for tunable photochemical properties, thus expanding the possibilities for improvement. For instance, bathochromic absorbing o-NB derivatives can be obtained by introducing donor groups on the aromatic ring.222, 235 Caging groups like 194 and 195 exhibit a red shift of the absorption maxima to λ >350 nm, compared to λ >254 nm for unsubstituted o-NB (193) (Figure 25). This modification allows for the use of less biologically damaging UV light of 365 nm.222 Studies have also shown that certain substituents at the benzylic position can improve the rate of decaging, in particular methyl236, 237 (193-195, R = Me) and carboxy groups238, 239 (196). The introduction of a methyl group at the benzylic position has the additional advantage of producing a ketone byproduct, which is less prone to imine formation than aldehydes, thus providing better biocompatibility of the caging group. On the other hand, the introduction of a carboxylic acid substituent has made it possible to improve the water solubility of caged compounds.239-241

Figure 25. Modifications of the o-NB structure. (R = H or Me, X = any substrate)
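The difference between irradiating at 254 nm and at 365 nm can be made concrete with the Planck relation E = hc/λ. The short Python sketch below is an illustration only and is not part of the original work; the function names are hypothetical, and the physical constants are the exact SI-defined values.

```python
# Photon energy from wavelength via E = h*c/lambda (illustrative sketch).
H = 6.62607015e-34      # Planck constant, J s
C = 299792458.0         # speed of light, m/s
EV = 1.602176634e-19    # joules per electronvolt
N_A = 6.02214076e23     # Avogadro constant, 1/mol


def photon_energy_ev(wavelength_nm):
    """Energy of one photon in eV for a wavelength given in nm."""
    return H * C / (wavelength_nm * 1e-9) / EV


def photon_energy_kj_per_mol(wavelength_nm):
    """Energy of one mole of photons in kJ/mol for a wavelength given in nm."""
    return H * C / (wavelength_nm * 1e-9) * N_A / 1000.0


for wavelength in (254, 365):
    print(f"{wavelength} nm: {photon_energy_ev(wavelength):.2f} eV, "
          f"{photon_energy_kj_per_mol(wavelength):.0f} kJ/mol")
```

A 365 nm photon carries about 3.40 eV, compared with about 4.88 eV at 254 nm, which is consistent with the longer wavelength being described as less damaging.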

An alternative to the o-NB type caging group is the one-carbon homologue 197 (2-(o-nitrophenyl)ethanol; NPE) and its analog 198 (2-(o-nitrophenyl)propanol), both designed by Hasan et al.242 (Figure 26). The effects of adding substituents have been explored and were found to be analogous to those found for the o-NB-type structures.243 Similar to 195, the caging group 199 was developed to shift the absorbance maximum of the caging group to λ >350 nm.244

Figure 26. Structures of NPE-type caging groups.

Despite their structural similarity to the o-NB type analogs, the decaging mechanism for NPE-type caging groups follows a different pathway. Photolysis of the NPE-type caging groups (200) proceeds via a β-elimination mechanism to release the substrate (201) and results in the formation of a more benign nitrovinyl byproduct (202) (Scheme 23).245 Moreover, the NPE-type caging groups exhibit faster decaging rates than the o-NB-type.246

Scheme 23. Photolysis mechanism for the NPE-caged substrate.

3.2 Tyrosine in protein modification

Tyrosine is a non-essential, proteinogenic amino acid. The tyrosine residue offers unique chemistries in proteins by virtue of its phenol functionality. For example, the hydroxyl group in tyrosyl residues allows for hydrogen bonding interactions, acid-base reactions, and radical-mediated reactions.247-249 Tyrosine is present in the active site of a number of enzymes and is involved in a variety of cellular processes, such as the regulation of cellular activity, homeostasis, cell division, intracellular communication and post-translational modifications (PTMs), among others.250-252

Due to tyrosine’s chemical versatility and prevalence in biological processes, a number of UAAs based on tyrosine have been developed and reviewed16, 17, 253, 254 (Figure 27). These include substituted analogs of tyrosine 203-204, biophysical probes such as NMR and IR probes 205-207 to study protein structure and dynamics, and the heavy-atom-containing amino acid 208 to facilitate X-ray structure determination. Chemical probes such as redox-active reagents 209-210 to probe and modulate electron transfer processes in proteins and probes for hydrogen bonding interactions 211-212 have also been developed. Further, a mimetic of PTMs of tyrosine 213 has been studied, as well as α-hydroxy 214 and D-amino acid 215 probes for backbone conformation studies. In addition, tyrosine analogs with unique chemical reactivity 216-217 have been applied for the selective modification of proteins via bioorthogonal reactions.

Figure 27. Tyrosine analogs that have been genetically encoded into proteins.

Photoreactive tyrosine analogs are another class of useful UAAs that have been explored. Site-specific incorporation of photocaged o-nitrobenzyl tyrosine (218, Figure 28) has been previously reported and studied as an important tool for the investigation of biological processes. The Dougherty group first reported the incorporation of 218 by using a nonsense codon suppression method to study the kinetics of gating processes of ion channels found in the nicotinic acetylcholine receptor (nAChR) and the Kir2.1 channel.255 Deiters et al. reported the incorporation of the caged amino acid into the active site of β-galactosidase for the photoregulation of biological processes in E. coli using an evolved MjTyrRS/TyrT pair.256 Similarly, our group has reported the site-specific incorporation of 218 into various proteins, including light-activated Taq DNA polymerase257 and T7 RNA polymerase, for the photoregulation of gene function in both pro- and eukaryotic cells.258 For the spatiotemporal control of gene function in mammalian cells, Cre recombinase259 and a zinc-finger nuclease260 have also been photocaged in the Deiters lab via genetically encoded 218. Caged fluorotyrosines 219-220 have likewise been incorporated into proteins in E. coli.261 A 15N-isotope-labeled o-nitrobenzyl tyrosine 221 was applied to NMR studies of protein function and dynamics in the thioesterase domain of the human fatty acid synthase (FAS-TE).262


Figure 28. Structure of o-nitrobenzyl tyrosine (218) and analogs 219-221.

3.3 Tyrosine phosphorylation

The addition of a phosphate group to a tyrosyl residue in a protein is a post-translational modification made possible by protein kinases. Tyrosine kinases catalyze the transfer of a phosphate group from ATP to the phenolic group in a protein tyrosine substrate. Tyrosine phosphorylation is fundamentally important in cellular processes such as proliferation, cell cycle progression, motility, membrane transport, neural transmission, homeostasis, transcriptional activation, differentiation in development, aging, immunity, and metabolism.263, 264 Furthermore, disruption of tyrosine phosphorylation has been implicated in numerous diseases such as cancers, diabetes, arteriosclerosis, severe bone disorders, and pathogen infections.250, 265-272 Due to its numerous key roles in cells, many efforts to study and understand tyrosine phosphorylation have been reported.273, 274 However, additional investigations are needed because tyrosine phosphorylation, like other PTMs, is a reversible and highly regulated process, which hampers a fundamental understanding of many cellular processes.

Several factors contribute to the great difficulty in studying tyrosine phosphorylation. Due to the reversible nature of PTMs, it is usually not possible to completely drive the protein to its modified state.275 This creates a problem since PTMs are small changes in proteins that generally do not result in significant biophysical changes, making it hard to efficiently isolate the modified protein from the unmodified form in a heterogeneous matrix. Furthermore, the presence of multiple and different types of PTMs in a given protein adds complexity. It has therefore been necessary to synthesize proteins in their phosphorylated forms, or as mimetics, to investigate and elucidate their individual roles.276

3.3.1 Genetic code expansion with caged phosphoryl tyrosine

In order to understand protein phosphorylation processes, considerable effort has been devoted to methods for the synthesis of phosphorylated proteins and their mimics. These methods mainly include enzymatic phosphorylation of target proteins, incorporation of synthetic phosphopeptides, and site-directed mutagenesis. Enzymatic phosphorylation methods involve the incubation or co-expression of the target protein with its corresponding enzyme.277 However, this method suffers from difficulties in separating the target protein; although antibodies have been used for the isolation of modified proteins, specific antibodies are not available for every protein. In addition, this method requires that the enzyme responsible for the PTM of the target protein has been identified.273 Another approach involves short peptide sequences that are usually incorporated in proteins via native chemical ligation and allow the introduction of one or more phosphorylated residues such as serine, threonine or tyrosine, as well as mimics.278-280 Although this method has proven to be useful, it is limited to proteins where phosphorylation occurs near their C- or N-terminus. In order to exploit other defined sites within a protein, scientists have looked into other approaches like site-directed mutagenesis, which permits modification at relevant sites and the preparation of homogeneous protein samples.281 Generally, a serine or threonine residue in the target protein is replaced with either an aspartic or glutamic acid residue (Asp/Glu). Despite the considerable differences between phosphoryl serine (pSer)/phosphoryl threonine (pThr) and Asp/Glu (Figure 29), in many cases Asp/Glu have been successful mimics, presumably due to their relatively small size and negative charge at physiological pH.282-285 However, this is not the case for all proteins, particularly where such mutations render the protein inactive.286, 287 Moreover, this method is less useful for tyrosine phosphorylation mimics due to the much weaker resemblance between phosphoryl tyrosine (pTyr) and Asp/Glu residues, although exceptions have been reported.288


Figure 29. Structures of phosphoryl serine (pSer), phosphoryl threonine (pThr), phosphoryl tyrosine (pTyr), aspartic acid (Asp) and glutamic acid (Glu).

Alternatively, UAA mutagenesis has been explored for the synthesis of proteins with defined PTMs or mimetics. In contrast to the previously discussed approaches, this method allows for 1) novel modifications to be introduced site-specifically at any site in any given protein, 2) synthesis of the protein in living cells, and 3) direct analysis of the phenotypic changes caused by the modification.

The genetic encoding of pTyr has not yet been possible, presumably because its two negative charges hinder cell membrane penetration. However, alternative approaches that use UAA mutagenesis to synthesize proteins bearing tyrosine phosphorylation mimics have been explored. The UAA p-carboxymethyl-phenylalanine (222) was previously used to mimic pTyr in a fragment of STAT1 in E. coli.289 Genetic encoding of 222 at Y701 in STAT1 was achieved with an evolved MjTyrRS/TyrT pair, and the recombinantly expressed protein underwent homodimerization and bound DNA, similar to the endogenous function of STAT1. This pTyr mimic was also placed at Y291 in PRMT1 (protein arginine methyltransferase-1) to show that tyrosine phosphorylation disrupts protein-protein interactions and substrate specificity of PRMT1.290 Another amino acid that has been genetically encoded into proteins and applied as a tyrosine phosphorylation mimic is p-azidophenylalanine (223). This amino acid has been genetically encoded in yeast, E. coli, and mammalian cells.9, 291, 292 Using an evolved MjTyrRS/TyrT pair, 223 was incorporated into protein in E. coli and its selective modification in the protein was achieved via a Staudinger reaction with light-sensitive phosphites (224).293 Upon light irradiation, a p-(phosphoamino)-phenylalanine (225) residue was revealed (Figure 30, B). Since 225 possesses two negative charges and maintains similar hydrogen bonding interactions to pTyr, it is the closest mimic to pTyr that can currently be introduced into proteins via genetic encoding of UAAs. Hence, there is not yet an approach to site-specifically install pTyr itself into proteins in cells. Here, we developed syntheses of caged phosphoryl tyrosines for their genetic incorporation into proteins in order to reveal a pTyr residue upon UV irradiation.

Figure 30. Phosphoryl tyrosine mimics. A) Structures of p-carboxymethyl-phenylalanine (222) and p-azidophenylalanine (223). B) Photolysis of light-sensitive 224 to a p-(phosphoamino)-phenylalanine (225) in a protein of interest (POI).

3.3.2 Synthesis of caged phosphoryl tyrosines

The caged phosphoryl tyrosines were synthesized from commercially available o-nitrobenzyl alcohol (230) or from 1-(2-nitrophenyl)ethanol (227), which was prepared via methylation of 2-nitrobenzaldehyde (226) with trimethylaluminum in 84% yield (Scheme 24). The benzyl alcohols 227 and 230 were activated with N,N-diisopropylchlorophosphoramidite to furnish the corresponding phosphoramidites 231 and 232, unsubstituted or methyl-substituted at the benzylic position, with yields of 72% and 88%, respectively. Boc-Tyr-OH (228) was protected as the tert-butyl ester 229 with tert-butyl 2,2,2-trichloroacetimidate in 84% yield and subsequently coupled to the caged phosphoramidites 231 and 232. The coupling was performed in the presence of 1H-tetrazole in THF, but purification of the coupling product by column chromatography led to decomposition and no product was recovered. We therefore performed the coupling reaction and, without purification, oxidized the intermediate to the corresponding phosphate esters using tert-butyl hydroperoxide in DCM. The reaction progress was followed by TLC and the products were successfully isolated by flash column chromatography to yield 233 and 234 in 52% and 55% yields, respectively. Cyanoethyl deprotection was then performed with DBU to deliver the corresponding organophosphoric acids 235 and 236 in high yields. Finally, deprotection of the amino acid was achieved with 50% TFA in DCM and TES to yield the caged phosphoryl tyrosines 237 and 238 in yields of 73% and 86%, respectively.


Scheme 24. Synthesis of caged phosphoryl tyrosines 237 (R = Me) and 238 (R = H).

After the synthetic routes had been developed, the syntheses of amino acids 237 and 238 were reproduced by Luis A. Vazquez (Research Experiences for Undergraduates program) in order to obtain enough material to test for their incorporation into proteins by evolved PylRS/PylT pairs. Thus far, neither amino acid has been accepted as a substrate by any of the PylRS mutants that are available in the Deiters Lab. Further studies on the evolution of new PylRS mutants are therefore necessary.

3.4 Genetic code expansion with caged tyrosines

In collaboration with the Chin lab, we demonstrated the incorporation of 218 into proteins by an evolved PylRS and its applicability for the control of protein phosphorylation in mammalian cells.11 Although 218 has proven to be useful, this amino acid has some drawbacks. As previously discussed in Section 3.1, the o-NB caging group produces a byproduct that is reactive to nucleophiles, which can result in cytotoxicity. In addition, 218 has poor solubility in aqueous media at physiological pH, hampering its broader application in living cells. The development of ONBYRS, a PylRS mutant used for the incorporation of 218,11 has introduced the possibility of genetically encoding caged tyrosine analogs with enhanced properties into proteins. We envisioned the synthesis and genetic encoding of caged tyrosines with modifications based on 218 to improve the decaging properties and the bioavailability of the UAA inside mammalian cells. It has been shown that masking the carboxyl group of a UAA as an ester can increase the rate of cellular uptake of the UAA.294 In addition, esters have been applied to derivatize functionalities in prodrugs to increase cell permeability by obtaining a more neutral form of the molecule.295 We thus aimed to synthesize esterified analogs of 218 and, moreover, to modify the caging group in order to address the damaging byproduct and to increase decaging efficiency, in addition to improving the bioavailability for applications in mammalian cells.


3.4.1 Synthesis of caged tyrosines

We synthesized the three esters 244, 245 and 246, bearing a methyl, ethyl, and acetoxymethyl ester (AME), respectively (Scheme 25). We reasoned that once these UAAs are inside the cell, esterases can cleave the ester to generate the amino acid 218. The syntheses started with Boc-Tyr-OMe (239), which was alkylated at the phenolic position with o-nitrobenzyl bromide (o-NBB) to install the caging group and deliver the product 240 in 94% yield. From compound 240 we obtained the methyl ester analog 244 after N-Boc deprotection in 79% yield. Also from compound 240, we obtained 241 after ester hydrolysis with LiOH in 95% yield, giving the precursor for the remaining two esters. Esterification of 241 with bromoethane or 2-bromomethyl acetate resulted in esters 242 and 243 in yields of 54% and 72%, respectively. Both were subjected to TFA conditions and TFA to HCl salt exchange to obtain the corresponding free amines 245 and 246 in 64% and 73% yield, respectively.


Scheme 25. Synthesis of caged tyrosine esters 244-246.

We then prepared solutions of the caged tyrosine esters in PBS (pH 7.4) to estimate their maximum solubility. While 218 can be obtained as a 0.2 mM solution, the esters 244-246 dissolved at 1 mM (245) or 1.5 mM (244 and 246). The caged tyrosine esters were then sent to our collaborators in the Chin lab. Incorporation into protein was only detected for the methyl ester 244. However, it did not prove to be superior to the known 218. We thus envisioned that the introduction of a less hydrophobic caging group might lead to bioavailability improvements, while at the same time allowing us to probe decaging efficiency. To expand the tools for studies of protein structure, function and dynamics using genetic code expansion and light, we synthesized the caged tyrosines 247, 248, and 249 (Figure 31).


Figure 31. Structures of caged tyrosine analogs 247-249.

We assembled 247 by following a similar procedure to the reported synthesis of 218.255 To install the caging group, the benzyl bromide 251 was first synthesized by activation of 6-nitropiperonyl alcohol (250) with PBr3 to obtain the corresponding 6-nitropiperonyl bromide (251) in 66% yield (Scheme 26). A di-tyrosine copper complex (253) was assembled from tyrosine (252) and CuSO4 in a basic solution at 60 °C for 20 min. After the copper complex was neutralized and collected by filtration, potassium carbonate and the benzyl bromide 251 were added to form the caged di-tyrosine copper complex 254. The copper complex was finally disrupted with aqueous 1 M HCl and the amino acid 247 was obtained in 53% yield overall.


Scheme 26. Synthesis of caged tyrosine 247.

Initial attempts to synthesize caged tyrosine 248 were conducted via nucleophilic substitution of the secondary bromide 259 by the tyrosyl phenol (Scheme 27). The benzyl bromide 259 was assembled from the alcohol 256, which was prepared via methylation of the aldehyde of 6-nitropiperonal (255) with trimethylaluminum in 90% yield. The alcohol 256 was then activated by reaction with PBr3 in DCM to give 259 in 46% yield. The attempted alkylation of Boc-Tyr-OMe left mostly unreacted material (observed by TLC) and gave the corresponding product 257 in low yield (19%). A new route was then developed for the synthesis of 248.

The amino acid was assembled through a Mitsunobu reaction of the alcohol 256 and Boc-Tyr-OMe using diisopropyl azodicarboxylate (DIAD) in THF to deliver 257 in 76% yield. It was found that pre-formation of the betaine intermediate followed by the addition of both the alcohol and the phenol resulted in the optimal reaction yield, compared to yields in the range of 40-50% when the azodicarboxylate was added last to the reaction mixture. Overnight hydrolysis of the methyl ester 257 to the free carboxylic acid 258 in 82% yield, followed by N-Boc deprotection using 50% TFA in DCM in 91% yield, furnished the desired caged tyrosine 248.

Scheme 27. Synthesis of caged tyrosine 248 via a Mitsunobu reaction.

Caged tyrosine 249 was assembled in four steps via a Mitsunobu reaction, similarly to the synthesis of 248 (Scheme 28). The alcohol 260 was prepared in one step from 2-ethylnitrobenzene according to a reported procedure296 and subjected to a Mitsunobu reaction with Boc-Tyr-OMe using DIAD in THF to give compound 261 in 73% yield. The methyl ester 261 was converted to the carboxylic acid 262 by LiOH hydrolysis in 70% yield. Then, the free amino acid 249 was delivered after N-Boc deprotection with TFA and TFA to HCl salt exchange in 90% yield.

Scheme 28. Synthesis of caged tyrosine 249 via a Mitsunobu reaction.
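For comparing the two Mitsunobu routes, the overall yield of a linear sequence is the product of the individual step yields. The following Python sketch is an illustration only; the helper name `overall_yield` is hypothetical. It uses the step yields quoted above for 248 and 249, counted from the Mitsunobu step onward, since the yield for the preparation of alcohol 260 is not stated here.

```python
from math import prod


def overall_yield(step_yields_percent):
    """Overall percent yield of a linear sequence of steps."""
    return 100.0 * prod(y / 100.0 for y in step_yields_percent)


# Step yields (%) as reported in the text:
# Mitsunobu coupling, methyl ester hydrolysis, N-Boc deprotection.
route_248 = [76, 82, 91]
route_249 = [73, 70, 90]

print(f"248 from alcohol 256: {overall_yield(route_248):.1f}%")
print(f"249 from alcohol 260: {overall_yield(route_249):.1f}%")
```

Under these assumptions the three-step sequences correspond to roughly 57% (248) and 46% (249) overall.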

The caged tyrosines 247-249 were then subjected to decaging experiments. A solution of the amino acid at 0.5 mM in DMSO/PBS buffer (1:1, pH 7.4) was exposed to UV irradiation (365 nm) using a standard transilluminator. The caged tyrosine 218 was not included in our decaging experiments as it showed poor solubility under these conditions, while improved solubility was seen for the caged tyrosines 247-249. Aliquots of irradiated solutions of 247, 248 and 249 (0.5 mM) were taken at increasing time intervals (15, 30, 60, 90, and 120 seconds) and analyzed by LC/MS. The experiments indicated that the caged tyrosines are consumed, but the decaged tyrosine product was not detected. Disappearance of the starting material was followed by chromatography, and the concentrations were quantified relative to the non-irradiated standard solution by comparing the peak areas of the standard and the irradiated samples. A plot of the concentration of the caged tyrosines versus UV irradiation time is presented in Figure 32A, demonstrating the decay of the starting materials over time. The decaging rate constant (k) and half-life (t1/2) were calculated from ln[concentration] vs. time plots (k = −slope, t1/2 = 0.693/k) using linear regression analysis and are shown in Figure 32B. The analysis shows 249 to be consumed the fastest, followed by 248 and then 247. Our results agree well with observed decaging rate patterns related to these caging groups in other substrates236, 237, 246 (refer to discussion in Section 3.1). As expected, the methyl group on 248 leads to an increased rate compared to 247, and additionally, the decaging of the NPE-type caging group is much faster than that of the o-NB type (249 versus 247 and 248). In order to confirm the recovery of tyrosine after UV irradiation, decaging experiments need to be performed in protein samples.

A. (plot: concentration of caged tyrosine versus irradiation time)

B.
UAA               249          248          247
k × 10^3 (s^-1)   8.4 ± 0.5    4.6 ± 0.2    3.8 ± 0.4
t1/2 (s)          82           149          184

Figure 32. Decaging of caged tyrosines 249, 248, and 247 by 365 nm of UV light. A) Concentration versus irradiation time plots and B) kinetic analysis showing rate constants (k) and half-life (t1/2).
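The first-order analysis behind Figure 32B (a linear fit of ln[concentration] versus time, with k taken as the negative slope and t1/2 = ln 2/k) can be sketched in Python as follows. This is a minimal illustration, not the analysis script used in this work: the function names are hypothetical, the concentrations are synthetic noise-free values generated from an assumed k rather than measured data, and the t = 0 point is assumed to stand for the non-irradiated standard.

```python
import math


def fit_first_order(times_s, concentrations):
    """Least-squares line through ln(concentration) versus time.

    Returns (k, half_life), where k is the negative slope in s^-1
    and half_life = ln(2) / k in seconds.
    """
    ys = [math.log(c) for c in concentrations]
    n = len(times_s)
    t_mean = sum(times_s) / n
    y_mean = sum(ys) / n
    sxx = sum((t - t_mean) ** 2 for t in times_s)
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(times_s, ys))
    k = -(sxy / sxx)
    return k, math.log(2) / k


def half_life(k):
    """Half-life (s) of a first-order decay with rate constant k (s^-1)."""
    return math.log(2) / k


# Synthetic example: 0.5 mM starting concentration, assumed k = 8.4e-3 s^-1,
# sampled at the irradiation times used in the text plus t = 0.
times = [0, 15, 30, 60, 90, 120]
k_assumed = 8.4e-3
conc_mM = [0.5 * math.exp(-k_assumed * t) for t in times]

k_fit, t_half = fit_first_order(times, conc_mM)
print(f"k = {k_fit * 1e3:.1f} x 10^-3 s^-1, t1/2 = {t_half:.1f} s")
```

With the rounded rate constants listed in Figure 32B, ln 2/k gives roughly 83, 151, and 182 s, within a few seconds of the reported half-lives; the small differences presumably reflect rounding of k.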

3.4.2 Genetic encoding of caged tyrosines by evolved PylRS/PylT pairs

In order to test the caged tyrosines for their genetic encoding into proteins, Jihe Liu in the Deiters Lab screened a panel of PylRS mutants. The experiments were performed in E. coli and sfGFP was used as a reporter protein. To this end, the reporter plasmid sfGFP-Y151TAG-PylT was co-transformed with a mutant PylRS, and several synthetases were tested for each amino acid. In addition, as a positive control, we synthesized the UAA 218 via amino acid copper-protection according to a reported procedure255 with minor modifications (see experimental section).

The screening of the PylRS mutant panel led to the identification of one synthetase that drives the incorporation of 247-249. This evolved PylRS, named EV20, bears the five mutations L270F, L274M, N311G, C313G and Y349F. The synthetase EV20 is capable of incorporating 247 and 248 into sfGFP, with 247 showing a higher efficiency than that of 218 (Figure 33, A and B). In addition, 249 was incorporated by additional variants of PylRS, termed EV16-1 (Y271A, N311A, C313A), EV16-4 (Y271A, N311A, C313A, Y349F) and EV16-5 (Y271M, L274A, N311A, C313A, Y349F). Although EV16-4 and EV16-5 show the highest protein yields, they also show higher background expression in the absence of an amino acid (Figure 33, C). In contrast, EV20 and EV16-1 show incorporation of 249 into sfGFP with almost no background expression. The incorporation of the caged tyrosines into sfGFP by EV20 was further confirmed by ESI-MS (Figure 33, D).

D.
UAA    Theoretical mass (Da)    Experimental mass (Da)
247    28441.9                  28441.4
248    28455.9                  28455.9
249    28425.9                  28426.9

Figure 33. Incorporation of 247, 248 and 249 into sfGFP at the position Y151 in E. coli. A) SDS-PAGE and protein yield analyses for the incorporation of 247 and B) 248, using EV20. 218 was used as positive control. C) SDS-PAGE and protein yields analyses for the incorporation of 249 using EV16-1, EV16-4, EV16-5 and EV20. D) ESI-MS results confirming the incorporation of 247, 248 and 249 by EV20 into sfGFP at the position Y151. Experiments performed by Jihe Liu.
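The agreement between the theoretical and experimental masses listed in Figure 33D can be expressed as an absolute deviation in Da and a relative deviation in ppm. The Python sketch below is only an illustration of that arithmetic; the function name `mass_error` is hypothetical, and the masses are the values from the figure.

```python
# (theoretical, experimental) intact protein masses in Da, from Figure 33D
masses = {
    "247": (28441.9, 28441.4),
    "248": (28455.9, 28455.9),
    "249": (28425.9, 28426.9),
}


def mass_error(theoretical, experimental):
    """Return (deviation in Da, deviation in ppm) of experimental vs. theoretical."""
    delta = experimental - theoretical
    return delta, delta / theoretical * 1e6


for uaa, (theo, expt) in masses.items():
    delta, ppm = mass_error(theo, expt)
    print(f"{uaa}: {delta:+.1f} Da ({ppm:+.0f} ppm)")
```

All three experimental masses fall within 1 Da of the theoretical values.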

Through screening of PylRS mutants in mammalian cells by Ji Luo (Deiters Lab), the PylRS variant termed V16-5 was found to be the optimal synthetase for the incorporation of 247 and 248. The synthetase V16-5 contains the five mutations L270F, L274M, N311G, C313G and Y349F. From the available panel of synthetases in the Deiters Lab, V16-5 was found to also incorporate 218 into proteins. To demonstrate the incorporation of the caged tyrosines in mammalian cells, HEK 293T cells were co-transfected with pV16-5-mCherry-TAG-EGFP-HA and p4CMVE-U6-PylT. The first plasmid encodes the PylRS variant V16-5, N-terminal mCherry, an amber stop codon (TAG), C-terminal EGFP, and a hemagglutinin tag, while the second construct encodes four copies of PylT. The cells were supplemented with 247, 248, or 218, and mCherry expression was detected in both the presence and absence of a UAA. However, expression of EGFP in cells was demonstrated to be dependent on the incorporation of 247, 248, and 218 at the TAG site (Figure 34, A). Expression of the mCherry-247/248/218-EGFP-HA fusion protein was confirmed by western blot analysis, where full-length fusion protein was detected in the presence of each amino acid and only low background expression was observed in the absence of a UAA (Figure 34, B). Through similar experiments, the PylRS variant AG19, which bears the mutations Y271A, L274M, N311A, C313A, and F349L, was identified as an optimal synthetase for the incorporation of 249. Full-length mCherry-TAG-EGFP expression was demonstrated to be dependent on the presence of 249 (Figure 34, C).


Figure 34. Incorporation of 247-249 in mammalian cells. A) Fluorescence micrographs and B) western blot analysis of HEK 293T cells expressing the V16-5/PylT pair and mCherry-TAG-EGFP-HA in the presence of 218, 248, or 247, and in the absence of a UAA. C) Fluorescence micrographs of HEK 293T cells expressing the AG19/PylT pair and mCherry-TAG-EGFP-HA in the presence or absence of 249. Experiments performed by Ji Luo.

Our results show the successful incorporation of 247-249 in both bacterial and mammalian cells, with no detected cytotoxicity. Moreover, we can directly compare the incorporation efficiencies of 247 and 248 with that of 218 when using the same PylRS/PylT pair. Our experiments in E. coli show that 247 is incorporated into proteins slightly more efficiently than 218, while 218 is still incorporated more efficiently than 248 (~2-fold). Thus, the larger and less hydrophobic caging group on 247 (compared to 218) is efficiently accommodated by the synthetase’s binding pocket. However, the presence of the benzylic methyl group on 248 significantly decreases the synthetase’s capability of incorporating this UAA. Yet our preliminary western blot results in mammalian cells suggest that 247 and 248 are incorporated with similar efficiencies, both superior to that of 218 (Figure 34B). Further studies are underway by Ji Luo in order to quantify and evaluate the incorporation efficiency of the caged tyrosines into proteins in mammalian cells.

3.5 Genetic code expansion with caged deuterated tyrosine

Infrared vibrational spectroscopy of proteins that are site-specifically labeled with carbon-deuterium (C-D) bonds is a promising tool for investigating protein structure, function and dynamics.297-299 The C-D stretch mode represents an excellent biophysical probe because it 1) does not perturb the native protein, 2) is versatile and sensitive to the local environment, and 3) absorbs at approximately 2100-2200 cm-1, which falls in an otherwise clear region of the protein IR absorption spectrum. However, the introduction of a deuterated amino acid, such as a deuterated tyrosine, would result in global incorporation into proteins due to its high structural similarity to canonical tyrosine and would not enable site-specificity.

We envisioned that the incorporation of a deuterated tyrosine in a protein would provide a new path to study protein structure and dynamics. Incorporation of a caged deuterated tyrosine would allow the site-specific introduction of deuterated tyrosine into protein by photochemical disguise, thus preventing global incorporation of the isotope-labeled amino acid into proteins. To this end, a caged deuterated tyrosine 266 was synthesized for its site-specific genetic encoding into proteins in response to a TAG amber stop codon. Photodeprotection of 266 will then reveal a single deuterated tyrosine residue at a specific location in a protein (Figure 35).


Figure 35. Site-specific incorporation of a deuterated tyrosine by photochemical disguise.

3.5.1 Synthesis of caged deuterated tyrosine

The synthesis of 266 commenced with the Boc-protection of the α-amino group on L-tyrosine-3,5-d2 (263) in 92% yield, delivering 264 (Scheme 28). The caging group was installed at the phenolic position in a basic solution of DMF using o-nitrobenzyl bromide (o-NBB) in 71% yield to deliver 265. This was followed by Boc-deprotection with HCl, and the free amino acid 266 was furnished in 87% yield. Because this material showed apparent cytotoxic effects that persisted despite efforts to re-purify 266, the UAA was instead obtained via an alternate route (Scheme 29).


Scheme 28. Synthesis of 266 via N-Boc protection and alkylation.

Analogous to the synthesis procedure of 247, the caged deuterated tyrosine 266 was prepared by the formation of a copper complex between two molecules of tyrosine deuterated at the 3- and 5-positions (263) (Scheme 29). After assembly of the di-tyrosine-d2 copper complex, alkylation proceeded after the addition of K2CO3 and o-nitrobenzyl bromide (o-NBB). Final deprotection by 1 M HCl delivered the desired amino acid 266 in 45% overall yield.

Scheme 29. Synthesis of a caged deuterated tyrosine 266.


Dr. Chungjung Chou finally incorporated the UAA 266 into proteins and no cytotoxicity was detected. The amino acid was then sent to the Romesberg Lab (TSRI, La Jolla) to undergo protein IR studies.

3.6 Summary

In this chapter we have presented the synthesis of several caged tyrosine analogs that can be applied to the photoregulation of proteins in live cells. Since tyrosine plays major roles in fundamental enzymatic processes, the presented UAAs can provide chemical biologists with new tools to dissect protein functions and mechanisms in real time with minimal perturbation of the biological system. We successfully synthesized caged phosphorylated tyrosines; their genetic encoding could allow, for the first time, the site-specific introduction of phosphoryl tyrosine into proteins and the direct probing of tyrosine phosphorylation and function. In addition, we assembled new caged tyrosines as alternatives to the well-known o-nitrobenzyl tyrosine. These new caged tyrosines show slightly enhanced decaging kinetics and produce safer byproducts than the nitroso-benzaldehyde.

Engineered PylRSs directed the site-specific incorporation of tyrosine analogs with new caging groups into proteins in bacterial and mammalian cells. Further experiments will be conducted in order to evaluate their incorporation and decaging efficiencies in proteins. Thus, it may be possible for the presented UAAs to be used in eukaryotic cells for the photocontrol of biological processes, such as signaling and enzymatic processes.

3.7 Experimental data for synthesized compounds

Unless otherwise stated, all reagents were obtained from commercial sources and used as received. Reactions were stirred magnetically and carried out under nitrogen using flame-dried glassware. DCM, THF and Et2O were dried using a MB SPS Compact solvent purification system. MeCN, DMF, DCE, DIPEA, TEA and pyridine were distilled from calcium hydride. MeOH and EtOH were distilled from magnesium and iodine. The distilled solvents were stored under nitrogen and over molecular sieves (3 Å for MeOH and EtOH, and 4 Å for all other solvents). Reactions were followed by thin layer chromatography (TLC) using glass-backed silica gel plates (Sorbent Technologies, 250 µm thickness) and visualized under a UV lamp and/or by staining with a KMnO4 solution. Flash chromatography was performed on silica gel (60 Å, 40-63 µm (230-400 mesh), Sorbtech) as the stationary phase. Melting points were determined using a capillary melting point apparatus. The 1H NMR, 13C NMR and 31P NMR spectra were recorded on a 300 MHz or 400 MHz Varian NMR spectrometer. HRMS was performed at the University of Pittsburgh.

(S)-2-Amino-3-(4-((2-nitrobenzyl)oxy)phenyl)propanoic acid (218). L-Tyrosine (2.0 g, 11.0 mmol) was dissolved in aqueous 2 M NaOH (10 mL) and a solution of CuSO4·5H2O (1.9 g, 7.28 mmol) in a minimal amount of water was added slowly at rt. The solution was heated to 60 °C and stirred for 20 min and then allowed to cool to rt before adjusting to pH 7 using 1 M HCl. The light-blue solid was filtered and washed with water (3 × 25 mL) before it was taken in 75% aqueous DMF (60 mL). K2CO3 (1.5 g, 11.04 mmol) and o-nitrobenzyl bromide (1.8 g, 8.49 mmol) were added and the reaction was allowed to proceed for 72 h at rt while kept in the dark. The solid was filtered, washed with 75% aqueous DMF (2 × 40 mL), water (2 × 40 mL), 75% aqueous acetone (40 mL), and ice-cold acetone (10 mL), and then taken in 1 M HCl (100 mL) to stir for 1 h at rt. The off-white solid was filtered and stirred once more with fresh 1 M HCl (100 mL) for another 30 min. The solid was finally filtered, washed with water (2 × 40 mL) and ice-cold acetone (10 mL). The compound was collected and dried to give o-nitrobenzyl tyrosine (218) as a slightly yellow powder (1.85 g, 68%). 1H NMR, 13C NMR, and LRMS-ESI spectral data matched literature values.255

1-(2-Nitrophenyl)ethanol (227). In a 100 mL flask, 2-nitrobenzaldehyde (1.0 g, 6.62 mmol) was dissolved in dry DCM (20 mL) under argon and the solution was cooled to 0 °C. A trimethylaluminum solution in hexanes (5.0 mL, 9.93 mmol) was added dropwise over a period of 30 min and the reaction was allowed to stir for another 3 h at the same temperature. The reaction was then slowly quenched with iced water (5 mL), followed by 1 M NaOH (20 mL). The mixture was stirred vigorously at rt for 1 h and the phases were separated. The aqueous layer was extracted with DCM (2 × 20 mL) and the organic layers were combined, washed with water (40 mL) and brine (20 mL), dried with Na2SO4, filtered, and concentrated under reduced pressure. The compound was used without further purification after complete dryness, furnishing 227 as a yellow solid (0.93 g, 84%). 1H NMR spectral data matched literature values.300

Boc-Tyrosine-OtBu (229). Boc-Tyr-OH (1.0 g, 3.55 mmol) was dissolved in a dry solution of DCM and THF (2:1, 12 mL) under argon. The solution was cooled to 0 °C and tert-butyl 2,2,2-trichloroacetimidate (1.91 mL, 10.66 mmol) was added. The reaction mixture was allowed to stir at rt for 18 h before the volatiles were removed under reduced pressure. The obtained residue was dissolved in EtOAc (20 mL) and washed with 2.5% NaHCO3 (2 × 15 mL) and brine (10 mL). The organic layer was dried over Na2SO4, filtered, and concentrated under reduced pressure. The remaining residue was purified by column chromatography on silica gel using Hex/EtOAc (4:1) to afford the protected ester 229 (1.01 g, 84%) as a white amorphous solid. 1H NMR and 13C NMR spectral data matched literature values.301

2-Cyanoethyl (1-(2-nitrophenyl)ethyl) diisopropylphosphoramidite (231). 1-(2-Nitrophenyl)ethanol (227) (0.5 g, 3.0 mmol) was dissolved in a solution of DIPEA (2.1 mL, 12.0 mmol) in dry DCM (10 mL) under argon and the solution was cooled to 0 °C. After 5 min, 2-cyanoethyl N,N-diisopropylchlorophosphoramidite (1.3 mL, 6.0 mmol) was added and the reaction mixture was stirred at rt for 18 h. The reaction mixture was concentrated and the remaining residue was purified by column chromatography on silica gel using Hex/EtOAc (2:1, 1% TEA) to afford 231 (963 mg, 88%) as an oil. 1H NMR spectral data matched literature values.302
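The reagent quantities quoted for 231 can be cross-checked with simple stoichiometry. The following Python sketch is illustrative only: it converts the reported DIPEA volume into millimoles and into equivalents relative to alcohol 227. The density (0.742 g/mL) and molar mass (129.25 g/mol) used for DIPEA are handbook values assumed here, not data from this work, and the function names are hypothetical.

```python
# Illustrative stoichiometry check for the preparation of 231.
# Density and molar mass of DIPEA are assumed handbook values.

def mmol_from_volume(volume_ml: float, density_g_per_ml: float, molar_mass: float) -> float:
    """Convert the volume of a neat liquid (mL) into millimoles."""
    return volume_ml * density_g_per_ml / molar_mass * 1000.0


def equivalents(reagent_mmol: float, limiting_mmol: float) -> float:
    """Molar equivalents of a reagent relative to the limiting reagent."""
    return reagent_mmol / limiting_mmol


alcohol_mmol = 3.0  # alcohol 227, as reported in the procedure
dipea_mmol = mmol_from_volume(2.1, 0.742, 129.25)  # 2.1 mL of DIPEA

print(f"DIPEA: {dipea_mmol:.1f} mmol, {equivalents(dipea_mmol, alcohol_mmol):.1f} equiv")
```

Under these assumptions, 2.1 mL of DIPEA corresponds to about 12 mmol, or roughly 4 equiv relative to 227, in line with the amounts reported.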

2-Cyanoethyl 2-nitrobenzyl diisopropylphosphoramidite (232). Compound 232 (361 mg, 72%) was synthesized in a similar way by reacting 2-nitrobenzyl alcohol (200 mg, 1.31 mmol) with 2-cyanoethyl N,N-diisopropylchlorophosphoramidite (0.58 mL, 2.61 mmol) in a solution of DIPEA (0.91 mL, 5.22 mmol) in dry DCM (4 mL). 1H NMR (400 MHz, CDCl3) δ 8.05 (d, J = 8.0 Hz, 1H), 7.82 (d, J = 8.0 Hz, 1H), 7.65 (t, J = 6.0 Hz, 1H), 7.43 (t, J = 8.4 Hz, 1H), 5.14-4.99 (m, 2H), 3.89-3.78 (m, 2H), 3.69-3.59 (m, 2H), 2.68 (t, J = 6.8 Hz, 2H), 1.23 (t, J = 6.8 Hz, 12H) ppm; 13C NMR (100 MHz, CDCl3) δ 147.2, 136.0, 135.9, 134.0, 128.9, 128.2, 124.9, 117.9, 62.8, 62.6, 59.0, 58.8, 43.6, 43.5, 24.9, 24.8, 20.7, 20.6 ppm; 31P NMR (162 MHz, CDCl3) δ 150.2 ppm. LRMS-ESI was not successful due to compound instability.

(2S)-tert-Butyl 2-((tert-butoxycarbonyl)amino)-3-(4-(((2-cyanoethoxy)(1-(2-nitrophenyl)ethoxy)phosphoryl)oxy)phenyl)propanoate (233). Boc-Tyr-OtBu (229) (765 mg, 2.37 mmol) was dissolved in dry THF (15 mL) in a flask equipped with 4 Å molecular sieves and a stir bar, under argon. The mixture was cooled to 0 °C and a solution of compound 231 (694 mg, 1.98 mmol) and 1H-tetrazole (40 mg, 1.98 mmol) in dry THF (10 mL) was added dropwise. The reaction mixture was allowed to proceed for 5 h at rt before it was concentrated under reduced pressure and evaporated to dryness under vacuum. The remaining residue was taken in dry DCM (15 mL) and cooled to 0 °C. Then, tert-butyl hydroperoxide, ~5.5 M in decane (1.4 mL), was diluted in dry DCM (10 mL) and added dropwise to the former mixture. The reaction mixture was stirred at rt for 1.5 h and then quenched with 2.5% NaHCO3 (20 mL). The organic layer was washed with 2.5% NaHCO3 (2 × 20 mL) and brine (10 mL), dried over Na2SO4, filtered, and concentrated under reduced pressure. The remaining residue was purified by column chromatography on silica gel using DCM/Et2O (8:1) to afford 233 (649 mg, 55%) as an oil. 1H NMR (300 MHz, CDCl3) δ 7.97 (dd, J = 6.9, 1.2 Hz, 1H), 7.75-7.57 (m, 2H), 7.46-7.40 (m, 1H), 7.09-6.96 (m, 4H), 6.19 (m, 1H), 5.00 (d, br, J = 6.9 Hz, 1H), 4.35 (m, br, 1H), 4.24 (m, 2H), 2.97 (m, 2H), 2.62 (m, 2H), 1.70-1.65 (m, 3H), 1.37 (s, 18H) ppm; 13C NMR (75 MHz, CDCl3) δ 170.6, 155.0, 148.9, 146.7, 136.8, 134.0, 130.8, 129.0, 127.6, 124.5, 119.7, 116.1, 82.1, 79.7, 73.8, 62.5, 54.7, 37.5, 28.2, 27.9, 24.3, 19.5 ppm; 31P NMR (162 MHz, CDCl3) δ -7.53, -7.55, -7.92, -7.95 ppm; HRMS-ESI (m/z): [M+H]+ calcd for C29H39N3O10P 620.23676, found 620.23321.

(2S)-tert-Butyl 2-((tert-butoxycarbonyl)amino)-3-(4-(((2-cyanoethoxy)((2-nitrobenzyl)oxy)phosphoryl)oxy)phenyl)propanoate (234). Compound 234 (448 mg, 52%) was synthesized in a similar way by using Boc-Tyr-OtBu (229) (621 mg, 1.84 mmol), 1H-tetrazole (99 mg, 1.42 mmol) and compound 232 (500 mg, 1.42 mmol) in dry THF (20 mL), followed by oxidation with tert-butyl hydroperoxide, ~5.5 M in decane (1 mL) in dry DCM (20 mL). 1H NMR (400 MHz, CDCl3) δ 8.13 (d, J = 8.0 Hz, 1H), 7.71-7.64 (m, 2H), 7.51 (t, J = 6.8 Hz, 1H), 7.12 (s, 4H), 5.62 (dd, J = 3.2, 4.4 Hz, 2H), 5.02 (d, J = 8.0 Hz, 1H), 4.39-4.27 (m, 3H), 3.03 (m, 2H), 2.74 (m, 2H), 1.38 (s, 9H), 1.36 (s, 9H) ppm; 13C NMR (100 MHz, CDCl3) δ 170.6, 155.0, 149.0, 148.9, 146.6, 134.3, 134.1, 131.7, 131.6, 131.0, 129.2, 128.5, 125.1, 119.8, 116.2, 82.2, 79.7, 67.0, 62.8, 54.7, 37.6, 28.3, 27.9, 19.7 ppm; 31P NMR (162 MHz, CDCl3) δ -6.92 ppm; HRMS-ESI (m/z): [M+H]+ calcd for C28H37N3O10P 606.22111, found 606.22061.
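The calculated m/z values quoted for 233 and 234 can be reproduced from the ion formulas. The Python sketch below is a minimal illustration, assuming standard monoisotopic masses for C, H, N, O and P and subtracting one electron mass for the singly charged cation. The helper names are hypothetical, and the parser only handles simple formulas without parentheses or isotope labels.

```python
import re

# Monoisotopic masses (u) of the most abundant isotopes; standard values.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.00307401,
    "O": 15.99491462,
    "P": 30.97376163,
}
ELECTRON_MASS = 0.00054858


def monoisotopic_mass(formula: str) -> float:
    """Sum of monoisotopic atomic masses for a formula such as 'C28H37N3O10P'."""
    total = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += MONOISOTOPIC[element] * (int(count) if count else 1)
    return total


def cation_mz(ion_formula: str) -> float:
    """m/z of a singly charged cation whose full ion formula is given."""
    return monoisotopic_mass(ion_formula) - ELECTRON_MASS


def ppm_error(found: float, calcd: float) -> float:
    """Mass accuracy in parts per million."""
    return (found - calcd) / calcd * 1e6


calcd_234 = cation_mz("C28H37N3O10P")  # [M+H]+ of 234
print(f"calcd {calcd_234:.5f}, error {ppm_error(606.22061, calcd_234):.2f} ppm")
```

For 234 this reproduces the calculated value of 606.22111 and gives a deviation of about -0.8 ppm for the found value. The same calculation for 233 (C29H39N3O10P) reproduces 620.23676 and gives a deviation of about -5.7 ppm.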

(2S)-2-Amino-3-(4-((hydroxy(1-(2-nitrophenyl)ethoxy)phosphoryl)oxy)phenyl)propanoic acid TFA salt (237). Compound 233 (649 mg, 1.05 mmol) was dissolved in dry DCM (15 mL) under an argon atmosphere. DBU (0.17 mL, 1.15 mmol) was added and the solution was stirred at rt overnight. The reaction mixture was washed with 5% citric acid (2 × 15 mL), water (10 mL), and brine (10 mL), dried over Na2SO4, filtered, and concentrated under reduced pressure to afford 235 (552 mg, 95%) as an off-white foam. HRMS-ESI (m/z): [M-H]- calcd for C26H34N2O10P 565.19456, found 565.19734. Compound 235 (50 mg, 0.09 mmol) was dissolved in dry DCM (1.2 mL), under argon. TES (0.03 mL, 0.18 mmol) and TFA (1.2 mL, 16.3 mmol) were added at rt and the reaction mixture was stirred for 1 h. The volatiles were removed under reduced pressure and the remaining residue was dissolved in MeOH. To remove the excess of TFA, the solution was concentrated and the process was repeated twice. The concentrated residue was dissolved in a minimal amount of MeOH and the solution was added dropwise to ice-cold Et2O to form a white precipitate. The precipitate was pelleted by centrifugation and the supernatant was decanted. The process was repeated twice to finally obtain 237 (40.6 mg, 86%) as a white powder. 1H NMR (400 MHz, DMSO-d6) δ 7.91 (d, J = 8.4 Hz, 1H), 7.77 (m, 1H), 7.70 (t, J = 7.2 Hz, 1H), 7.50 (t, J = 7.6 Hz, 1H), 7.01 (m, 2H), 6.90 (m, 2H), 5.73-5.66 (m, 1H), 3.94 (m, 1H), 2.96 (m, 2H), 1.44 (d, J = 6.4 Hz, 3H) ppm; 13C NMR (100 MHz, DMSO-d6) δ 170.4, 154.0, 146.8, 133.6, 129.9, 128.2, 127.2, 123.7, 119.7, 68.3, 53.9, 35.1, 24.6 ppm; 31P NMR (162 MHz, DMSO-d6) δ -6.27 ppm; HRMS-ESI (m/z): [M-H]- calcd for C17H18N2O8P 409.0801, found 409.0804.

(2S)-2-Amino-3-(4-((hydroxy((2-nitrobenzyl)oxy)phosphoryl)oxy)phenyl)propanoic acid TFA salt (238). Cyanoethyl deprotection of 234 (448 mg, 0.74 mmol) was achieved in a similar manner with DBU (0.13 mL, 0.81 mmol) in dry DCM (15 mL) to afford compound 236 (404 mg, 99%). HRMS-ESI (m/z): [M-H]- calcd for C25H32N2O10P 551.18001, found 551.18048. Similarly, deprotection of 236 (40 mg, 0.072 mmol) was performed with TES (0.023 mL, 0.14 mmol) and TFA (0.98 mL, 13.2 mmol) in dry DCM (0.98 mL) to afford 238 (25.6 mg, 73%). 1H NMR (400 MHz, DMSO-d6) δ 8.08 (d, J = 8.0 Hz, 1H), 7.84 (m, 1H), 7.76 (t, J = 7.2 Hz, 1H), 7.54 (t, J = 7.2 Hz, 1H), 7.09 (m, 4H), 5.21 (d, J = 7.2 Hz, 2H), 3.83 (m, 1H), 2.98 (m, 2H) ppm; 13C NMR (100 MHz, DMSO-d6) δ 170.5, 152.6, 146.4, 135.3, 134.0, 130.1, 128.9, 128.4, 128.2, 124.2, 119.8, 63.3, 53.7, 35.2 ppm; 31P NMR (162 MHz, DMSO-d6) δ -5.60 ppm; HRMS-ESI (m/z): [M-H]- calcd for C16H16N2O8P 395.0644, found 395.0650.

tert-Butyl (S)-1-(methoxycarbonyl)-2-(4-(2-nitrobenzyloxy)phenyl)ethylcarbamate (240). Boc-Tyr-OMe (0.5 g, 1.7 mmol) was dissolved in dry DMF (8 mL). K2CO3 (0.5 g, 3.7 mmol) and o-nitrobenzyl bromide (0.4 g, 1.9 mmol) were added and the reaction mixture was stirred at rt overnight. Water (25 mL) was added and the mixture was extracted with EtOAc (3 × 15 mL). The organic layers were combined, washed with water (40 mL) and brine (20 mL), dried over Na2SO4, filtered, and concentrated. The remaining residue was purified by flash column chromatography on silica gel, eluting with Hex/EtOAc (4:1 to 2:1), to obtain 240 (0.69 g, 94%) as an off-white solid. 1H NMR (300 MHz, CDCl3) δ 8.19 (d, J = 8.4 Hz, 1H), 7.90 (d, J = 7.8 Hz, 1H), 7.71 (t, J = 7.5 Hz, 1H), 7.51 (t, J = 7.8 Hz, 1H), 7.07 (d, J = 8.7 Hz, 2H), 6.93 (d, J = 8.7 Hz, 2H), 5.46 (s, 2H), 4.98 (d, J = 7.2 Hz, 1H), 4.58 (m, 1H), 3.71 (s, 3H), 3.10 (m, 2H), 1.41 (s, 9H) ppm.
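As a consistency check on the yield reported for 240, the sketch below recomputes it from the masses quoted in the procedure. It is illustrative only: the molecular formulas (C15H21NO5 for Boc-Tyr-OMe and C22H26N2O7 for 240) are derived here from the compound names rather than stated in this work, the average atomic weights are standard values, a 1:1 stoichiometry with Boc-Tyr-OMe as the limiting reagent is assumed, and the function names are hypothetical.

```python
import re

# Standard average atomic weights (g/mol).
ATOMIC_WEIGHT = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}


def molar_mass(formula: str) -> float:
    """Molar mass of a simple formula such as 'C22H26N2O7' (no parentheses)."""
    return sum(
        ATOMIC_WEIGHT[element] * (int(count) if count else 1)
        for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula)
    )


def percent_yield(product_g: float, product_formula: str,
                  limiting_g: float, limiting_formula: str) -> float:
    """Isolated yield (%) based on the limiting reagent, assuming 1:1 stoichiometry."""
    product_mol = product_g / molar_mass(product_formula)
    limiting_mol = limiting_g / molar_mass(limiting_formula)
    return 100.0 * product_mol / limiting_mol


# 0.69 g of 240 obtained from 0.5 g of Boc-Tyr-OMe
yield_240 = percent_yield(0.69, "C22H26N2O7", 0.5, "C15H21NO5")
print(f"Yield of 240: {yield_240:.1f}%")
```

With these assumptions the recomputed value is in line with the reported 94%, within the rounding of the quoted amounts.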

(S)-2-((tert-Butoxycarbonyl)amino)-3-(4-((2-nitrobenzyl)oxy)phenyl)propanoic acid (241). Compound 240 (0.16 g, 0.37 mmol) was dissolved in THF (1.5 mL) and 2 M LiOH (1.5 mL) was added at rt. The reaction mixture was stirred overnight and then the volatiles were removed under reduced pressure. The remaining residue was diluted in water (5 mL) and 5% citric acid was added until pH 4 was reached. The aqueous mixture was extracted with EtOAc (3 × 10 mL), and the combined organic layers were washed with brine (10 mL), dried over Na2SO4, filtered, and concentrated. Compound 241 (0.15 g, 95%) was obtained as an off-white solid and was used without further purification. 1H NMR (300 MHz, DMSO-d6) δ 8.11 (d, J = 8.1 Hz, 1H), 7.62 (t, J = 6.3 Hz, 1H), 7.49-7.37 (m, 2H), 6.98 (d, J = 8.4 Hz, 2H), 6.72 (d, J = 8.4 Hz, 2H), 5.51 (s, 2H), 4.98 (m, 1H), 4.62 (m, 1H), 3.03 (m, 2H), 1.40 (s, 9H) ppm.

(S)-Methyl 3-(4-(2-nitrobenzyloxy)phenyl)-2-aminopropanoate HCl salt (244). Compound 240 (150 mg, 0.34 mmol) was dissolved in THF (1.7 mL). Next, water (1.7 mL) and concentrated HCl (0.6 mL) were added. The reaction was stirred overnight at rt and then concentrated under reduced pressure to remove the volatiles. The remaining aqueous solution was diluted with water (2 mL) and washed with Et2O (2 × 2 mL), and the aqueous layer was finally concentrated to obtain compound 244 (0.1 g, 79%) as an off-white solid. 1H NMR (400 MHz, CD3OD) δ 8.07 (d, J = 8.0 Hz, 1H), 7.79 (d, J = 8.0 Hz, 1H), 7.69 (t, J = 7.2 Hz, 1H), 7.53 (t, J = 7.6 Hz, 1H), 7.17 (d, J = 8.4 Hz, 2H), 6.96 (d, J = 8.4 Hz, 2H), 5.40 (s, 1H), 4.26 (t, J = 7.2 Hz, 1H), 3.76 (s, 3H), 3.20-3.06 (m, 2H) ppm.

(S)-Ethyl 2-amino-3-(4-((2-nitrobenzyl)oxy)phenyl)propanoate HCl salt (245). Compound 241 (120 mg, 0.29 mmol) was dissolved in dry DMF (1 mL). K2CO3 (0.16 g, 1.1 mmol) and bromoethane (43 µL, 0.57 mmol) were added and the reaction mixture was stirred at 70 °C overnight. Water (5 mL) was added and the mixture was extracted with EtOAc (3 × 5 mL). The organic layers were combined, washed with water (15 mL) and brine (5 mL), dried over Na2SO4, filtered, and concentrated. The remaining residue was purified by flash column chromatography on silica gel, eluting with Hex/Et2O (3:2), to obtain the Boc-protected ethyl ester 242 (76 mg, 54%) as an off-white solid. Then, compound 242 (70 mg, 0.16 mmol) was dissolved in dry DCM (0.9 mL) and TES (50 µL, 0.3 mmol) was added, followed by TFA (0.23 mL, 3.1 mmol). The reaction was stirred overnight at rt and then concentrated under reduced pressure. The residue was taken in aqueous 1 M HCl (2 mL) and stirred at rt for 10 min. The aqueous solution was washed with Et2O (2 × 2 mL) and the aqueous layer was concentrated. To complete the TFA-to-HCl salt exchange, the residue was dissolved again in 1 M HCl (2 mL), stirred for 10 min, and concentrated, and this process was repeated one final time. Compound 245 (38 mg, 64%) was delivered as an off-white solid. 1H NMR (400 MHz, CD3OD) δ 8.07 (d, J = 8.4 Hz, 1H), 7.63-7.55 (m, 2H), 7.40 (m, 1H), 7.09 (d, J = 8.7 Hz, 2H), 6.78 (d, J = 8.7 Hz, 2H), 5.52 (s, 2H), 4.33 (s, br, 1H), 3.94 (m, 2H), 3.13 (m, 2H), 1.32 (t, 3H) ppm.

(S)-Acetoxymethyl 2-amino-3-(4-((2-nitrobenzyl)oxy)phenyl)propanoate HCl salt (246). Compound 241 (130 mg, 0.31 mmol) was dissolved in dry MeCN (2 mL). DIPEA (0.2 mL, 1.2 mmol) and bromomethyl acetate (61 µL, 0.62 mmol) were added and the reaction mixture was stirred at rt overnight. The reaction mixture was concentrated under reduced pressure and the remaining residue was purified by flash column chromatography on silica gel, eluting with DCM/Et2O (30:1), to obtain 243 (109 mg, 72%) as a foam. Next, compound 243 (100 mg, 0.2 mmol) was dissolved in dry DCM (1.2 mL) and TES (65 µL, 0.4 mmol) was added, followed by TFA (0.3 mL, 4.1 mmol). The reaction was stirred overnight at rt and then concentrated under reduced pressure. The residue was taken in aqueous 1 M HCl (2 mL) and stirred at rt for 10 min. The aqueous solution was washed with Et2O (2 × 2 mL) and the aqueous layer was concentrated. To complete the TFA-to-HCl salt exchange, the residue was dissolved again in 1 M HCl (2 mL), stirred for 10 min, and concentrated, and this process was repeated one final time. Compound 246 (65 mg, 73%) was delivered as an off-white solid. 1H NMR (400 MHz, CD3OD) δ 8.13 (d, J = 6.8 Hz, 1H), 7.89-7.55 (m, 2H), 7.43 (d, J = 7.6 Hz, 1H), 6.88 (d, J = 11.2 Hz, 2H), 6.67 (d, J = 11.2 Hz, 2H), 5.40 (s, 2H), 5.25 (s, 2H), 4.36 (t, J = 6.8 Hz, 1H), 2.88 (d, J = 9.6 Hz), 1.73 (s, 3H) ppm.

5-(Bromomethyl)-6-nitrobenzo[d][1,3]dioxole (251). 6-Nitropiperonyl alcohol (200 mg, 1.0 mmol) was taken in dry DCM (2 mL) and the mixture was cooled in an ice-bath. A solution of 1 M PBr3 in DCM (1.5 mL) was added dropwise. The reaction solution was allowed to warm to rt and stir overnight. Then, it was quenched with water (1 mL), diluted in DCM (15 mL), and washed with saturated NaHCO3 (2 × 15 mL). The organic layer was dried over Na2SO4, filtered, and concentrated. The obtained residue was purified by flash column chromatography on silica gel using 4:1 Hex/EtOAc to obtain 251 (172 mg, 66%) as a yellow solid. The reaction was also performed on a 1 g scale (7.5 mL of 1 M PBr3 and 10 mL of dry DCM) and the product was obtained in 57% yield (742 mg). 1H NMR spectral data matched literature values.303

5-(1-Bromoethyl)-6-nitrobenzo[d][1,3]dioxole (259). Similar to the synthesis of 251, compound 259 (0.28 g, 54%) was synthesized using the following reagents: alcohol 256 (0.4 g, 1.9 mmol), 1 M PBr3 in DCM (2.8 mL), and DCM (4 mL). 1H NMR spectral data matched literature values.304

(S)-2-Amino-3-(4-((6-nitrobenzo[d][1,3]dioxol-5-yl)methoxy)phenyl)propanoic acid (247). L-Tyrosine (490 mg, 2.70 mmol) was dissolved in aqueous 2 M NaOH (5 mL) and a solution of CuSO4·5H2O (365 mg, 1.45 mmol) in a minimal amount of water was added slowly at rt. The solution was heated to 60 °C and stirred for 20 min and then allowed to cool to rt before adjusting to pH 7 using 1 M HCl. The light-blue solid was filtered and washed with water (3 × 15 mL) before it was taken in 75% aqueous DMF (20 mL). K2CO3 (375 mg, 2.70 mmol) and benzyl bromide 251 (537 mg, 2.08 mmol) were added and the reaction was allowed to proceed for 48 h at rt while kept in the dark. The solid was filtered, washed with 75% aqueous DMF (2 × 15 mL), water (2 × 15 mL), 75% aqueous acetone (15 mL), and ice-cold acetone (10 mL), and then taken in 1 M HCl (20 mL) to stir for 1 h at rt. The white solid was filtered and stirred once more with fresh 1 M HCl (20 mL) for another 30 min. The solid was finally filtered, washed with water (2 × 15 mL) and ice-cold acetone (10 mL). The compound was collected and dried to give 247 as a slightly yellow powder (0.44 g, 53%). 1H NMR (400 MHz, DMSO-d6) δ 8.46 (s, br, 3H), 7.72 (s, 1H), 7.24 (s, 1H), 7.21 (d, J = 8.0 Hz, 2H), 6.96 (d, J = 8.0 Hz, 2H), 6.23 (s, 2H), 4.07 (m, br, 1H), 3.08 (m, 2H) ppm; 13C NMR (100 MHz, DMSO-d6) δ 170.3, 157.0, 152.1, 147.2, 141.3, 130.8, 130.0, 127.5, 114.8, 107.8, 105.5, 103.6, 66.5, 53.2, 34.8 ppm; HRMS-ESI (m/z): [M+H]+ calcd for C17H16N2O7 361.1030, found 361.1046.

1-(6-Nitrobenzo[d][1,3]dioxol-5-yl)ethan-1-ol (256). In a 500 mL round-bottom flask, 6-nitropiperonal (3.0 g, 15.3 mmol) was dissolved in dry DCM (30 mL) under argon and cooled to 0 °C. A solution of 2 M trimethylaluminum in hexanes (11.5 mL, 23.1 mmol) was carefully added over a period of 30 min while stirring. The reaction mixture was allowed to stir for 2 h at the same temperature. Next, water (11 mL) was added and the ice-bath was removed. A solution of aqueous 1 M NaOH (30 mL) was added and the biphasic mixture was stirred vigorously for 45 min before allowing the layers to separate. The aqueous layer was extracted with DCM (2 × 40 mL) and the organic layers were combined, washed with 1 M NaOH (80 mL), water (40 mL), and brine (20 mL). Then, the DCM solution was dried over MgSO4, filtered, and concentrated to provide the secondary alcohol 256 (2.9 g, 90%) as a yellow powder without further purification. 1H NMR spectral data matched literature values.305

Methyl (2S)-2-((tert-butoxycarbonyl)amino)-3-(4-(1-(6-nitrobenzo[d][1,3]dioxol-5-yl)ethoxy)phenyl)propanoate (257). Method A: Triphenylphosphine (0.37 g, 1.42 mmol) was dissolved in dry THF (8 mL) and the solution was cooled to 0 °C. Diisopropyl azodicarboxylate (0.28 mL, 1.42 mmol) was added slowly and the reaction mixture was stirred for 15 min before adding Boc-Tyr-OMe (0.42 g, 1.42 mmol) and the benzyl alcohol 256 (0.2 g, 0.95 mmol). The reaction mixture was allowed to slowly warm to rt before it was heated to 50 °C and stirred for 24 h at this temperature. The volatiles were removed under reduced pressure and the crude was dissolved in EtOAc (10 mL) and washed with saturated aqueous NaHCO3 (2 × 10 mL), water (5 mL), and brine (5 mL). The organic layer was dried over Na2SO4, filtered, and concentrated under reduced pressure. The obtained residue was purified by flash column chromatography on silica gel, eluting with Hex/EtOAc (4:1) to afford 257 (0.23 g, 76%) as a yellow oil. Method B: The benzyl bromide 259 (30 mg, 0.11 mmol) and Boc-Tyr-OMe (32 mg, 0.11 mmol) were dissolved in dry DMF (0.6 mL). Next, K2CO3 (15 mg, 0.11 mmol) was added and the reaction mixture was stirred at 60 °C for 24 h. At this point, mostly starting material and a new spot (product) were observed by TLC. The reaction mixture was added to saturated NaHCO3 (10 mL) and extracted with EtOAc (3 × 5 mL). The organic layers were combined, washed with saturated NaHCO3 (2 × 10 mL) and brine (5 mL), dried over Na2SO4, filtered, and concentrated under reduced pressure. The remaining residue was purified as described in Method A to deliver 257 (8 mg, 19%) as a yellow oil. 1H NMR (400 MHz, CDCl3) δ 7.53 (s, 1H), 7.13 (s, 1H), 6.93 (d, J = 8.4 Hz, 2H), 6.68 (d, J = 8.4 Hz, 2H), 6.07 (s, 1H), 6.04 (s, 1H), 6.00 (q, J = 6.0, 6.4 Hz, 1H), 4.95 (m, br, 1H), 4.49 (m, br, 1H), 3.67 (s, 3H), 2.95 (m, 2H), 1.64 (d, J = 6.0 Hz, 3H), 1.38 (s, 9H) ppm; 13C NMR (100 MHz, CDCl3) δ 172.5, 156.3, 155.2, 153.0, 147.3, 141.3, 137.4, 130.5, 128.7, 115.5, 106.3, 105.5, 103.1, 80.0, 71.6, 54.5, 52.3, 37.5, 28.4, 23.6, 22.1, 21.88 ppm; HRMS-ESI (m/z): [M-H]- calcd for C24H27N2O9 487.17111, found 487.17278.

(2S)-2-Amino-3-(4-(1-(6-nitrobenzo[d][1,3]dioxol-5-yl)ethoxy)phenyl)propanoic acid TFA (248). Compound 257 (135 mg, 0.35 mmol) was dissolved in THF (0.7 mL) and cooled to 0 °C before 2 M LiOH (0.7 mL, 1.39 mmol) was added and the reaction mixture was stirred vigorously at rt overnight. THF was removed under reduced pressure and the remaining aqueous residue was diluted with water (5 mL) and washed with Et2O (5 mL). The aqueous layer was acidified with 5% citric acid to pH 4 and extracted with EtOAc (3 × 5 mL). The organic layers were combined, washed with water (10 mL) and brine (5 mL), dried over Na2SO4, filtered, and concentrated under reduced pressure to dryness to afford 258 (0.13 g, 82%) as a yellow oil, which was used without further purification. Compound 258 (0.2 g, 0.53 mmol) was dissolved in dry DCM (2 mL) and TES (0.17 mL, 1.07 mmol) was added. TFA (2 mL, 26.7 mmol) was then slowly added and the reaction mixture was stirred at rt for 45 min. The volatiles were removed under reduced pressure and the residue was dissolved in MeOH (3 mL). Residual amounts of TFA were removed by co-evaporation under reduced pressure. The process was repeated two times before the residue was finally dissolved in a minimal amount of MeOH and precipitated into ice-cold Et2O. The solvent was decanted and the precipitate was washed with cold Et2O. The remaining solid was dried under vacuum to afford 248 (265 mg, 91%) as a light-yellow solid. 1H NMR (400 MHz, CD3OD) δ 7.54 (s, 1H), 7.14 (m, 3H), 6.77 (d, J = 8.4 Hz, 2H), 6.00 (q, J = 6.0, 6.4 Hz, 1H), 4.04 (m, 1H), 3.22-3.17 (m, 1H), 3.02-2.96 (m, 1H), 1.66 (d, J = 6.0 Hz, 3H) ppm; 13C NMR (100 MHz, CD3OD) δ 172.5, 158.2, 154.2, 148.9, 143.0, 137.5, 131.7, 128.5, 117.0, 106.8, 106.0, 104.8, 72.6, 36.6, 23.6 ppm; HRMS-ESI (m/z): [M+H]+ calcd for C18H18N2O7 375.1187, found 375.1156.

Methyl (2S)-2-((tert-butoxycarbonyl)amino)-3-(4-(2-(2-nitrophenyl)propoxy)phenyl)propanoate (261). Triphenylphosphine (3.8 g, 15 mmol) was dissolved in dry THF (55 mL) and the solution was cooled to 0 °C. Diisopropyl azodicarboxylate (2.9 mL, 15 mmol) was added slowly and the reaction mixture was stirred for 15 min before adding Boc-Tyr-OMe (2.11 g, 7.4 mmol) and the known296 alcohol 260 (2.0 g, 11 mmol), dissolved in dry THF (5 mL). The reaction mixture was allowed to slowly warm to rt before it was heated to 60 °C and stirred for 38 h at this temperature. The volatiles were removed under reduced pressure and the crude was dissolved in EtOAc (50 mL) and washed with saturated aqueous NaHCO3 (2 × 50 mL), water (25 mL), and brine (25 mL). The organic layer was dried over Na2SO4, filtered, and concentrated under reduced pressure. The obtained residue was purified by flash column chromatography on silica gel eluting with Hex/EtOAc (3:1) to afford 261 (2.33 g, 73%) as a yellow oil. 1H NMR (400 MHz, CDCl3) δ 7.73 (d, J = 8.4 Hz, 1H), 7.52 (m, 2H), 7.33 (m, 1H), 6.95 (d, J = 8.4 Hz, 2H), 6.74 (d, J = 8.4 Hz, 2H), 4.98 (d, J = 8.0 Hz, 1H), 4.48 (m, 1H), 4.01 (d, J = 6.4 Hz, 2H), 3.84 (m, 1H), 3.65 (s, 3H), 2.98 (m, 2H), 1.42 (d, J = 6.8 Hz, 3H), 1.37 (s, 9H) ppm; 13C NMR (100 MHz, CDCl3) δ 172.4, 157.7, 155.8, 155.1, 150.5, 138.1, 137.6, 132.6, 130.3, 128.5, 128.2, 127.3, 124.1, 114.6, 79.8, 72.1, 70.0, 54.5, 52.2, 37.3, 33.6, 32.2, 28.3, 22.0, 19.2, 17.6 ppm; HRMS-ESI (m/z): [M+H]+ calcd for C24H31N2O7 459.2126, found 459.2119.

(2S)-2-Amino-3-(4-(2-(2-nitrophenyl)propoxy)phenyl)propanoic acid HCl (249). Compound 261 (0.3 g, 0.65 mmol) was dissolved in THF (2.6 mL) and cooled to 0 °C before 2 M LiOH (2.6 mL, 5.24 mmol) was added and the reaction mixture was stirred vigorously at rt overnight. THF was removed under reduced pressure and the remaining aqueous residue was diluted with water (10 mL) and washed with Et2O. The aqueous layer was acidified with 5% citric acid to pH 4 and extracted with EtOAc (3 × 10 mL). The organic layers were combined, washed with water (20 mL) and brine (20 mL), dried over Na2SO4, filtered, and concentrated under reduced pressure. The obtained residue was purified by flash column chromatography on silica gel, eluting with 8% MeOH in DCM to afford 262 (204 mg, 70%) as a yellow oil. Compound 262 (162 mg, 0.36 mmol) was dissolved in dry DCM (1 mL) and TES (0.116 mL, 0.73 mmol) was added. TFA (1 mL) was then slowly added and the reaction mixture was stirred at rt for 45 min. The volatiles were removed under reduced pressure and the residue was taken in aqueous 1 M HCl (1 mL) and stirred for 10 min, at which time an off-white solid was formed. The solid was collected by filtration and washed with aqueous 1 M HCl, water, and ice-cold acetone. The remaining solid was dried under vacuum to afford 249 (125 mg, 90%) as an off-white solid. 1H NMR (400 MHz, CD3OD) δ 7.77 (dd, J = 8.0, 1.2 Hz, 1H), 7.68-7.60 (m, 2H), 7.44-7.40 (m, 1H), 7.18 (d, J = 8.8 Hz, 2H), 6.87 (d, J = 8.8 Hz, 2H), 4.19 (m, 1H), 4.11 (dd, J = 4.4, 2.4 Hz, 2H), 3.78 (m, 1H), 3.25-3.03 (m, 2H), 1.45 (d, J = 6.8 Hz, 3H) ppm; 13C NMR (100 MHz, CD3OD) δ 171.2, 159.8, 133.6, 131.5, 129.4, 128.6, 127.5, 124.8, 116.1, 73.5, 55.2, 36.5, 35.0, 17.8 ppm; HRMS-ESI (m/z): [M+H]+ calcd for C18H20N2O5 345.1445, found 345.1471.

Boc-Tyrosine-OH-phenyl-3,5-d2 (264). L-Tyrosine-phenyl-3,5-d2 (0.3 g, 1.6 mmol) and TEA (0.34 mL, 2.4 mmol) were dissolved in a 1:1 solution of 1,4-dioxane and water (6 mL) and cooled to 0 °C. A solution of Boc-anhydride dissolved in 1,4-dioxane and water (1:1, 2 mL) was added and the reaction was allowed to stir at rt for 18 h. Next, the volatiles were evaporated, the residue was diluted in water (15 mL), and extracted with EtOAc (3 × 10 mL). The combined organic layers were dried over Na2SO4, filtered, and concentrated to dryness. Compound 264 (466 mg, 92%) was obtained as a white foam without further purification. 1H NMR (300 MHz, CDCl3) δ 7.46 (s, br, 1H), 6.96 (s, 2H), 5.98 (m, br, 1H), 5.13 (d, 1H), 4.56 (m, br, 1H), 3.02-2.88 (m, 2H), 1.14 (s, 9H) ppm; LRMS-ESI (m/z): [M+H]+ calcd for C14H18D2NO5 284.15, found 284.14.

Boc-o-nitrobenzyl tyrosine-phenyl-3,5-d2 (265). Compound 264 (0.4 g, 1.4 mmol) was dissolved in dry DMF (7 mL). K2CO3 (0.58 g, 4.2 mmol) and o-nitrobenzyl bromide (0.33 g, 1.5 mmol) were added and the reaction mixture was stirred at rt overnight. Water (25 mL) was added and the mixture was extracted with EtOAc (3 × 15 mL). The organic layers were combined, washed with water (2 × 40 mL) and brine (20 mL), dried over Na2SO4, filtered, and concentrated. The remaining residue was purified by flash column chromatography on silica gel, eluting with DCM/Et2O (8:1), to obtain 265 (0.42 g, 71%) as an off-white solid. 1H NMR (300 MHz, CDCl3) δ 8.11 (d, J = 8.1 Hz, 1H), 7.62 (t, J = 6.6 Hz, 1H), 7.49 (t, J = 7.5 Hz, 1H), 7.40 (d, J = 7.8 Hz, 1H), 6.96 (s, 2H), 6.08 (s, br, 1H), 5.52 (s, 2H), 5.07 (d, J = 7.8 Hz, 1H), 4.63 (m, 1H), 3.03 (d, J = 6.0 Hz, 2H), 1.42 (s, 9H) ppm; 13C NMR (75 MHz, CDCl3) δ 172.0, 167.3, 155.6, 155.3, 147.5, 134.2, 131.7, 130.5, 129.3, 129.1, 127.5, 125.2, 115.9, 80.7, 63.9, 55.1, 37.7, 28.5 ppm; LRMS-ESI (m/z): [M+H]+ calcd for C21H23D2N2O7 419.18, found 419.21.

o-Nitrobenzyl tyrosine-phenyl-3,5-d2 HCl salt (266). Method A: Compound 265 (0.39 g, 0.93 mmol) was dissolved in DCM (3 mL) and HCl (6 mL of 4 N HCl in 1,4-dioxane and 3 mL of DCM) was added. The reaction was stirred at rt for 45 min to 1 h and a precipitate formed. Et2O (10 mL) was added and the solid was filtered and washed with Et2O (3 × 10 mL). The solid was completely dried under high vacuum and 266 (0.29 g, 87%) was delivered as a light-yellow powder. (Note: Method A resulted in 266 with cytotoxic effects.) Method B: Compound 266 was synthesized by following the synthesis procedure of 218/247 using the following reagents: L-Tyrosine-phenyl-3,5-d2 (500 mg, 2.73 mmol), CuSO4·5H2O (477 mg, 1.91 mmol), 1 M NaOH (6 mL), o-nitrobenzyl bromide (707 mg, 3.27 mmol), and K2CO3 (453 mg, 3.27 mmol). The amino acid was obtained as a light-yellow powder (0.44 g, 45%). 1H NMR (400 MHz, DMSO-d6) δ 9.48 (s, 1H), 8.73 (s, br, 3H), 8.15 (d, J = 11.2 Hz, 1H), 7.76 (m, 1H), 7.65 (m, 1H), 7.44 (d, J = 10 Hz, 1H), 7.02 (s, 2H), 5.49 (s, 2H), 4.33 (t, J = 10 Hz, 1H), 3.17-2.97 (m, 2H) ppm; HRMS-ESI (m/z): [M-H]- calcd for C16H13D2N2O5 317.11010, found 317.11114.

CHAPTER 4: NORBORNENE-CONTAINING CAGED THYMIDINE FOR OLIGONUCLEOTIDE MODIFICATION

4.1 Introduction

Regulation of gene expression is an essential process that occurs in many cellular pathways and its disruption leads to a variety of diseases.306-309 Therefore, it is imperative to dissect the mechanisms involved in gene expression to understand complex biological processes. Oligonucleotides are short, single-stranded DNA or RNA macromolecules of high importance due to their vast biological functions, as they are involved in fundamental cellular processes.310 In nature, oligonucleotides usually exist as small RNA molecules that function in the regulation of gene expression as mRNAs, tRNAs, microRNAs, siRNAs, and others.

Due to their biological importance, oligonucleotides have been targeted as probes to study biological processes.

To gain spatial and temporal control over oligonucleotide function, several approaches to the photoregulation of oligonucleotides have been exploited. One of the most widely employed methods involves the introduction of a caging group into the oligonucleotide backbone.311 The caging groups have been introduced in the form of photocaged nucleobases, photocleavable linkers, and reversible photoswitches.225 In general, the caging group inhibits hybridization between oligonucleotides by blocking hydrogen bonding, or sterically blocks the interactions with other macromolecules or small molecules.

Photocaged nucleobases exploit caging groups that are strategically positioned to block Watson-Crick base-pairing. This prevents duplex formation, and the oligonucleotide remains inactive until photolysis (Figure 36). Photocleavable linkers can activate oligonucleotides by photochemically releasing a complementary inhibitory strand or by opening circular strands. Conversely, splitting a single-stranded oligonucleotide via light-mediated cleavage of an embedded photolabile linker can result in deactivated oligomers. These strategies can therefore activate or deactivate gene expression, depending on the researcher’s aim and experimental design.

[Figure 36 panel labels. Oligonucleotide activation: photocaged nucleobases; photocleavable linkers. Oligonucleotide deactivation: photocleavable linkers.]

Figure 36. Photo-activation and photo-deactivation of oligonucleotides using photocaged nucleobases or photocleavable linkers. Figure adapted from Acc. Chem. Res. 2014, 47, 45-55.

Our lab has developed methods to manipulate oligonucleotides in order to elucidate the cellular processes that control and maintain biological systems. We have employed gene regulatory methods based on the light-controlled activation or deactivation of gene expression. Previously, our lab introduced caging groups into oligonucleotides for the photochemical control of several biomacromolecules, such as DNA decoys,312 antisense agents,313-316 antagomirs,317 and triplex-forming DNA.318

The inverse electron-demand Diels-Alder cycloaddition has proved useful for labeling oligonucleotides with a molecule of interest (e.g., a fluorophore or an affinity tag) and for linking oligonucleotide strands.319, 320 However, these methods lack spatial and temporal control over oligonucleotide function and/or gene expression. Recently, our lab reported a strategy to deliver antisense oligonucleotides into cells by bioconjugating peptides through a photocleavable group on a nucleobase (Figure 37).313 The photocaged nucleobase 267 was designed with an alkyne functionality to undergo a bioorthogonal Cu(I)-catalyzed click cycloaddition with an azide-containing, cell-penetrating peptide in mammalian cells. The photocaged nucleobase simultaneously prevented oligonucleotide hybridization, thereby enabling the dual functions of oligonucleotide cellular uptake and precise, light-mediated control over gene silencing.

Figure 37. Structure of alkyne-functionalized caged thymidine phosphoramidite 267.

Although Cu(I)-catalyzed click cycloadditions have proven useful in widespread applications, the metal catalyst limits their application in living systems due to cellular toxicity. The inverse electron-demand Diels-Alder cycloaddition is an alternative reaction that has been shown to be faster than other established bioorthogonal reactions, chemospecific, and non-cytotoxic, yielding nitrogen gas as the only byproduct (Figure 10).203-205 Here we show the synthesis of a photocaged thymidine phosphoramidite that bears a norbornene functionality on the caging group to undergo bioconjugation via an inverse electron-demand Diels-Alder reaction (Figure 38).

Figure 38. Inverse electron-demand Diels-Alder bioconjugation of a photocaged thymidine and decaging reaction. (CG = caging group)

4.2 Synthesis of a norbornene-functionalized, photocaged thymidine phosphoramidite

Based on a route similar to that used for the synthesis of 267, designed by Dr. Rajendra Uprety (Deiters lab),313 we synthesized the alcohol 272 by following a procedure developed by Dr. Lusic, also in the Deiters group (Scheme 31).321 The first step involves alkylation of the phenolic group of vanillin (268) with 1,2-dibromoethane in the presence of potassium carbonate, delivering 269 in 66% yield.321 This was followed by nitration of the aromatic ring with concentrated nitric acid (270, 98%) and addition of a methyl group to the aldehyde using trimethylaluminum to obtain 271 in 89% yield.321 We envisioned using the azide 272 as an amine precursor that would provide a general route for introducing further modifications to the caging group. To this end, we obtained the alkyl azide 272 in 95% yield via a nucleophilic substitution reaction of the bromide 271 with sodium azide in DMF.

Scheme 31. Synthesis of the caging precursor 272.

Based on a method developed by Dr. Lusic to introduce a caging group on aromatic N-heterocycles,322 compound 272 was reacted with dimethyl sulfide and benzoyl peroxide to deliver the corresponding methylthiomethyl ether 273 in 63% yield (Scheme 32). The thioether 273 was subsequently activated in situ with sulfuryl chloride under ice-cold conditions, generating the corresponding chloromethyl ether caging group, which was directly reacted with DMT-protected thymidine (DMT-T) in the presence of DBU in DMF to furnish the photocaged thymidine 274 in 71% yield. Conversion of the azide in 274 to the corresponding primary amine via a Staudinger reduction and its subsequent coupling to a norbornene functionality using NHS-norbornene 162 gave the DMT-protected thymidine 275 in 78% yield over two steps. A final reaction of 275 with 2-cyanoethyl-N,N-diisopropylchlorophosphoramidite in DIPEA and DCM furnished the norbornene-functionalized, caged thymidine phosphoramidite 276 in 91% yield.

Scheme 32. Synthesis of a norbornene-containing caged thymidine phosphoramidite 276.
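
As a rough check on the efficiency of the route, the isolated yields quoted in Section 4.2 can be multiplied to estimate an overall yield for each scheme. The following Python sketch is illustrative only: it assumes a strictly linear sequence, treats the 78% obtained for 275 as a single two-step stage, and uses names (SCHEME_31, SCHEME_32, cumulative_yield) that were chosen here for illustration.

```python
from math import prod

# Isolated yields quoted in Section 4.2, keyed by product number (fractions of 1).
SCHEME_31 = {"269": 0.66, "270": 0.98, "271": 0.89, "272": 0.95}
# The 0.78 entry for 275 covers two steps (Staudinger reduction and coupling).
SCHEME_32 = {"273": 0.63, "274": 0.71, "275": 0.78, "276": 0.91}


def cumulative_yield(step_yields):
    """Multiply the stage yields of a linear sequence into one overall yield."""
    return prod(step_yields)


scheme_31_overall = cumulative_yield(SCHEME_31.values())
scheme_32_overall = cumulative_yield(SCHEME_32.values())
route_overall = scheme_31_overall * scheme_32_overall

print(f"Scheme 31 (268 to 272): {scheme_31_overall:.1%}")
print(f"Scheme 32 (272 to 276): {scheme_32_overall:.1%}")
print(f"Vanillin to 276:        {route_overall:.1%}")
```

Under these assumptions, the sequence from vanillin to 272 corresponds to roughly 55% overall yield and the full route to 276 to roughly 17%.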

The caged thymidine 275 was tested for its stability under automated oligonucleotide synthesis conditions. The compound, at a final concentration of 10 mM, was subjected to the following conditions: 1) 5% TFA in DCM for 6 h at room temperature and 2) 40% MeNH2 and concentrated NH4OH (1:1) for 2 h at 60 °C. Neither condition led to removal of the caging group or the norbornene functionality, as observed by TLC and NMR. With these results in hand, the caged thymidine phosphoramidite 276 can now be tested for its incorporation into an oligomer using an automated DNA synthesizer.

4.3 Summary

The synthesis of a caged thymidine phosphoramidite bearing a norbornene functionality was presented. Because the caged nucleoside 275 proved stable under oligonucleotide synthesis conditions, the phosphoramidite 276 can now be tested for its incorporation into an oligomer using an automated DNA synthesizer. Future work could explore the nucleoside’s dual functionality: chemical modification after incorporation into oligonucleotides via a bioorthogonal Diels-Alder cycloaddition, and precise, light-mediated control over oligonucleotide function for the regulation of gene expression. As an alternative to the Cu(I)-catalyzed click cycloaddition, this approach could potentially be applied directly in live cells while avoiding the toxicity associated with the copper catalyst.

4.4 Experimental data for synthesized compounds

Unless otherwise stated, all reagents used were obtained from commercial sources and used as received. Reactions were stirred magnetically and carried out under nitrogen using flame-dried glassware. DCM, THF and Et2O were dried using an MB SPS Compact solvent purification system. MeCN, DMF, DCE, DIPEA, TEA and pyridine were distilled from calcium hydride. MeOH and EtOH were distilled from magnesium and iodine. The distilled solvents were stored under nitrogen and over molecular sieves (3 Å for MeOH and EtOH, and 4 Å for all other solvents). Reactions were followed by thin layer chromatography (TLC) using glass-back silica gel plates (Sorbent Technologies, 250 µm thickness) and visualized under a UV lamp and/or by staining with a KMnO4 solution. Flash chromatography was performed on silica gel (60 Å, 40-63 μm (230 × 400 mesh), Sorbtech) as a stationary phase. Melting points were determined using a capillary melting point apparatus. The 1H NMR, 13C NMR and 31P NMR spectra were recorded on a 300 MHz or 400 MHz Varian NMR spectrometer. HRMS was performed at the University of Pittsburgh.

4-(2-Bromoethoxy)-3-methoxybenzaldehyde (269). Vanillin (2.0 g, 13.1 mmol) was dissolved in dry DMF (20 mL). K2CO3 (5.4 g, 39.4 mmol) and 1,2-dibromoethane (3.4 mL, 39.4 mmol) were added and the reaction mixture was stirred overnight at rt. The mixture was then diluted with Et2O (100 mL) and washed with 1 M NaOH (2 × 100 mL), water (100 mL), and brine (50 mL). The organic layer was dried over MgSO4, filtered, and concentrated to dryness. Compound 269 (2.24 g, 66%) was obtained as a white solid. 1H NMR spectral data matched literature values.323
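
The reported yield of 269 can be cross-checked from the masses given in this procedure. The sketch below is a minimal example, assuming that vanillin is the limiting reagent (the dibromide and the base are used in excess) and using rounded standard atomic weights. The molecular formula of 269 (C10H11BrO3) is inferred here from the compound name, and the helper names are hypothetical.

```python
# Rounded standard atomic weights (g/mol); adequate for bench-scale yield checks.
ATOMIC_WEIGHT = {"C": 12.011, "H": 1.008, "O": 15.999, "Br": 79.904}

VANILLIN = {"C": 8, "H": 8, "O": 3}
COMPOUND_269 = {"C": 10, "H": 11, "Br": 1, "O": 3}  # inferred from the name


def molar_mass(composition):
    """Average molar mass (g/mol) from an element -> count mapping."""
    return sum(ATOMIC_WEIGHT[element] * count for element, count in composition.items())


def percent_yield(limiting_mass_g, limiting_mw, product_mass_g, product_mw):
    """Percent yield for a 1:1 conversion of the limiting reagent into product."""
    limiting_mol = limiting_mass_g / limiting_mw
    product_mol = product_mass_g / product_mw
    return 100.0 * product_mol / limiting_mol


yield_269 = percent_yield(2.0, molar_mass(VANILLIN), 2.24, molar_mass(COMPOUND_269))
print(f"Yield of 269: {yield_269:.0f}%")
```

With these inputs the calculation agrees with the 66% stated in the procedure to within rounding.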

4-(2-Bromoethoxy)-5-methoxy-2-nitrobenzaldehyde (270). Compound 269 (2.2 g, 8.5 mmol) was added to ice-cold, concentrated nitric acid (30 mL). The reaction mixture was stirred at rt for 16 h and then diluted with water (30 mL). The aqueous solution was extracted with DCM (3 × 60 mL), and the combined organic extracts were washed with saturated NaHCO3 (100 mL), dried over Na2SO4, filtered, and concentrated to dryness. Compound 270 (2.5 g, 98%) was furnished as a yellow solid. 1H NMR spectral data matched literature values.321

1-(4-(2-Bromoethoxy)-5-methoxy-2-nitrophenyl)ethan-1-ol (271). In a 100 mL flask, 270 (1.5 g, 4.9 mmol) was dissolved in dry DCM (35 mL) under argon and the solution was cooled to 0 °C. A 2 M AlMe3 solution in hexanes (3.7 mL, 7.4 mmol) was added dropwise over a period of 30 min and the reaction was allowed to stir for another 3 h at the same temperature. The reaction was then slowly quenched with iced water (4 mL), followed by 1 M NaOH (35 mL). The mixture was stirred vigorously at rt for 1 h and the phases were separated. The aqueous layer was extracted with DCM (2 × 35 mL) and the organic layers were combined, washed with water (90 mL) and brine (40 mL), dried over Na2SO4, filtered, and concentrated under reduced pressure. After drying completely, the compound was used without further purification, furnishing 271 as a yellow solid (1.4 g, 89%). 1H NMR spectral data matched literature values.321
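
The reagent stoichiometry in this procedure follows from the solution molarity and the volume dispensed. The short sketch below illustrates the arithmetic for the AlMe3 addition; the function names are hypothetical and the calculation assumes the nominal 2 M titer of the commercial solution.

```python
def mmol_from_solution(molarity_mol_per_L, volume_mL):
    """Amount of reagent in mmol: (mol/L) x mL = mmol."""
    return molarity_mol_per_L * volume_mL


def equivalents(reagent_mmol, substrate_mmol):
    """Molar equivalents of a reagent relative to the substrate."""
    return reagent_mmol / substrate_mmol


# 3.7 mL of 2 M AlMe3 added to 4.9 mmol of aldehyde 270
alme3_mmol = mmol_from_solution(2.0, 3.7)
alme3_equiv = equivalents(alme3_mmol, 4.9)
print(f"AlMe3: {alme3_mmol:.1f} mmol, {alme3_equiv:.2f} equiv")
```

This corresponds to approximately 1.5 equivalents of trimethylaluminum relative to the aldehyde.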

1-(4-(2-Azidoethoxy)-5-methoxy-2-nitrophenyl)ethan-1-ol (272). Sodium azide (2.0 g, 31.3 mmol) and catalytic NaI (spatula tip) were added to a stirring solution of alkyl bromide 271 (1.0 g, 3.13 mmol) in dry DMF (10 mL). The reaction mixture was heated overnight at 60 °C and then cooled to rt, diluted with water (30 mL) and extracted with EtOAc (3 × 30 mL). The combined organic extracts were washed with water (3 × 90 mL) and brine (50 mL), dried over Na2SO4, filtered, and concentrated to dryness. Compound 272 (0.84 g, 95%) was obtained as a yellow solid. 1H NMR spectral data matched literature values.321

((1-(4-(2-Azidoethoxy)-5-methoxy-2-nitrophenyl)ethoxy)methyl)(methyl)sulfane (273). The benzyl alcohol 272 (38 mg, 0.13 mmol) was dissolved in dry MeCN (2 mL) and cooled to 0 °C under argon. Dimethyl sulfide (0.1 mL, 1.35 mmol) was added, followed by the portion-wise addition of benzoyl peroxide (130 mg, 0.54 mmol) over two hours. After stirring for an additional 5 h at this temperature, the reaction mixture was quenched with 1 M NaOH (7 mL) and stirred at rt overnight. The mixture was extracted with EtOAc (3 × 5 mL) and the organic layers were combined and washed with 1 M NaOH (5 mL), water (5 mL) and brine (5 mL), dried over Na2SO4, filtered, and concentrated under reduced pressure. The obtained residue was purified by flash column chromatography on silica gel, eluting with Hex/EtOAc (3:1) to yield 273 (28.9 mg, 63%) as a yellow oil. 1H NMR (400 MHz, CDCl3) δ 7.54 (s, 1H), 7.17 (s, 1H), 5.49 (m, 1H), 4.57 (d, J = 11.4 Hz, 1H), 4.29 (d, J = 11.4 Hz, 1H), 4.18 (t, J = 5.2 Hz, 2H), 3.91 (t, J = 5.2 Hz, 2H), 2.08 (s, 3H), 1.47 (d, J = 6.4 Hz, 3H) ppm; 13C NMR (100 MHz, CDCl3) δ 154.4, 146.5, 140.1, 135.6, 109.5, 109.0, 73.2, 70.6, 68.3, 56.4, 49.9, 23.3, 14.1 ppm. LRMS-ESI was not successful due to compound instability.

3-((1-(4-(2-Azidoethoxy)-5-methoxy-2-nitrophenyl)ethoxy)methyl)-1-((2R,4S,5R)-5-((bis(4-methoxyphenyl)(phenyl)methoxy)methyl)-4-hydroxytetrahydrofuran-2-yl)-5-methylpyrimidine-2,4(1H,3H)-dione (274). Compound 273 (170 mg, 0.50 mmol) was dissolved in dry DCM (0.25 mL) under argon and cooled to 0 °C. Sulfuryl chloride (48.2 μL, 0.60 mmol) was dissolved in dry DCM (0.25 mL) and added cold to the former solution dropwise. The solution was stirred at this temperature for 30 min and the volatiles were removed under vacuum to afford a yellow oil. 5’-DMT-thymidine (0.27 g, 0.50 mmol) and DBU (0.15 mL, 1.0 mmol) were dissolved in dry DMF (0.5 mL) and the mixture was stirred for 1 h at rt under argon. The caging group was dissolved in dry DMF (0.3 mL) and the solution was added dropwise. The reaction mixture was allowed to stir at rt overnight. The mixture was diluted with saturated NaHCO3 (10 mL) and extracted with EtOAc (3 × 10 mL). The organic layers were combined, washed with saturated NaHCO3 (10 mL), water (10 mL), and brine (5 mL), dried over Na2SO4, filtered, and concentrated under reduced pressure. The remaining residue was purified by flash column chromatography on silica gel, eluting with 1% MeOH, 1% TEA in DCM to obtain 274 (294 mg, 71%) as a yellow solid. 1H NMR (400 MHz, CDCl3) δ 7.81 (s, 1H), 7.41 (t, J = 4.4 Hz, 1H), 7.27 (m, 2H), 7.19-7.08 (m, 8H), 6.73 (d, J = 8.4 Hz, 4H), 6.18 (t, J = 6.4 Hz, 0.5H), 6.10 (t, J = 6.4 Hz, 0.5H), 5.34-5.23 (m, 2H), 5.18 (m, 1H), 4.61 (d, J = 20 Hz, 0.5H), 4.44 (d, J = 20 Hz, 0.5H), 4.07-4.02 (m, 2H), 3.98-3.93 (m, 1H), 3.82 (s, 3H), 3.66 (s, 6H), 3.49 (t, J = 4.8 Hz, 0.5H), 3.40 (t, J = 4.8 Hz, 0.5H), 3.32-3.20 (m, 2H), 2.29-2.24 (m, 2H), 2.06-1.98 (m, 2H), 1.53 (m, 3H), 1.42 (d, J = 11.6 Hz, 3H) ppm; 13C NMR (100 MHz, CDCl3) δ 163.0, 158.5, 154.0, 153.7, 150.5, 150.4, 146.1, 144.2, 139.5, 139.3, 136.6, 136.3, 135.4, 135.2, 134.4, 129.9, 127.9, 127.8, 126.9, 113.0, 109.7, 109.6, 109.3, 109.0, 108.8, 86.5, 86.2, 85.4, 85.2, 73.6, 71.5, 70.0, 68.2, 68.0, 63.4, 56.2, 55.1, 53.4, 49.7, 49.5, 45.9, 40.8, 23.7, 12.0 ppm; HRMS-ESI (m/z): [M+Na]+ calcd for C43H46N6O12 861.30717, found 861.3070.

(1S,4S)-Bicyclo[2.2.1]hept-5-en-2-yl-(2-(4-(1-((3-((2R,4S,5R)-5-((bis(4-methoxyphenyl)(phenyl)methoxy)methyl)-4-hydroxytetrahydrofuran-2-yl)-5-methyl-2,6-dioxo-3,6-dihydropyrimidin-1(2H)-yl)methoxy)ethyl)-2-methoxy-5-nitrophenoxy)ethyl)carbamate (275). Compound 274 (20 mg, 0.024 mmol) was dissolved in MeOH (0.25 mL). Triphenylphosphine (7 mg, 0.026 mmol) and water (0.25 mL) were added and the reaction mixture was allowed to stir vigorously at rt overnight. The reaction mixture was taken up in water (3 mL) and EtOAc (3 mL) and the organic layer was collected and washed with brine (3 mL), dried over Na2SO4, filtered, and concentrated under reduced pressure to dryness. Without further purification, the obtained residue was dissolved in dry DMF (0.3 mL) and norbornene 162 (6 mg, 0.024 mmol) was added. The reaction mixture was allowed to stir at rt overnight before diluting with water (5 mL) and extracting with EtOAc (3 × 5 mL). The organic layers were combined and washed with water (3 × 5 mL) and brine (3 mL), dried over Na2SO4, filtered and concentrated under reduced pressure. The obtained residue was purified by flash column chromatography, eluting with 2% MeOH, 1% TEA in DCM to afford 275 (18 mg, 78%) as a yellow oil. 1H NMR (400 MHz, CDCl3) δ 7.84 (s, 1H), 7.63-7.17 (m, 11H), 6.78 (d, J = 6.8 Hz, 4H), 6.34 (m, 0.8H), 6.24 (m, 0.7H), 6.19 (t, J = 6.4 Hz, 0.5H), 5.98 (m, 0.8H), 5.89 (m, 2H), 5.40-5.28 (m, 2.8H), 5.18 (m, 1H), 4.69 (m, 0.2H), 4.39 (m, 0.5H), 4.31 (m, 0.5H), 4.00 (m, 3H), 3.88 (s, 3H), 3.72 (s, 6H), 3.44-3.28 (m, 3H), 3.19-2.88 (m, 2H), 2.30-2.24 (m, 2H), 2.16-1.99 (m, 3H), 1.73-1.05 (m, 10H) ppm; 13C NMR (100 MHz, CDCl3) δ 168.8, 163.1, 158.6, 156.6, 153.8, 151.3, 150.6, 146.4, 144.4, 139.6, 139.1, 136.4, 136.0, 135.5, 135.3, 134.5, 132.9, 132.1, 132.0, 130.1, 128.6, 128.4, 128.1, 127.9, 113.2, 109.9, 109.0, 86.7, 86.1, 85.2, 85.1, 82.8, 82.7, 75.5, 73.8, 73.3, 71.4, 71.2, 70.2, 69.9, 68.4, 63.4, 63.2, 59.3, 55.2, 47.0, 46.1, 46.0, 45.7, 42.0, 41.0, 40.5, 40.0, 34.5, 34.3, 25.4, 23.9, 12.2 ppm; HRMS-ESI (m/z): [M+Na]+ calcd for C51H56N4O14 971.3691, found 971.3612.
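
The calculated HRMS value for the sodium adduct of 275 can be reproduced from monoisotopic atomic masses. The Python sketch below is illustrative only: the mass table is truncated to the elements needed, the function names are hypothetical, and the electron-mass correction is left optional because the quoted value of 971.3691 is reproduced by the plain sum of the atomic masses of C51H56N4O14 and Na.

```python
import re

# Monoisotopic masses (u) of the most abundant isotopes; only the elements needed here.
MONOISOTOPIC = {
    "H": 1.00782503,
    "C": 12.0,
    "N": 14.00307400,
    "O": 15.99491462,
    "Na": 22.98976928,
}
ELECTRON_MASS = 0.00054858


def parse_formula(formula):
    """Turn a simple formula such as 'C51H56N4O14' into an element -> count dict."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + (int(number) if number else 1)
    return counts


def monoisotopic_mass(formula):
    """Sum of monoisotopic atomic masses for the neutral formula."""
    return sum(MONOISOTOPIC[el] * n for el, n in parse_formula(formula).items())


def sodium_adduct_mz(formula, subtract_electron=False):
    """m/z of [M+Na]+; optionally corrected for the mass of the lost electron."""
    mz = monoisotopic_mass(formula) + MONOISOTOPIC["Na"]
    return mz - ELECTRON_MASS if subtract_electron else mz


def ppm_error(found, calculated):
    """Mass error of a measurement in parts per million."""
    return (found - calculated) / calculated * 1e6


calcd_275 = sodium_adduct_mz("C51H56N4O14")
error_275 = ppm_error(971.3612, calcd_275)
print(f"[M+Na]+ calcd: {calcd_275:.4f}; error of found value: {error_275:.1f} ppm")
```

For the values reported above, the found mass lies about 8 ppm below the calculated mass.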

Bicyclo[2.2.1]hept-5-en-2-yl-(2-(4-(1-((3-((2R,4R,5R)-5-((bis(4-methoxyphenyl)(phenyl)methoxy)methyl)-4-(((2-cyanoethoxy)(diisopropylamino)phosphanyl)oxy)tetrahydrofuran-2-yl)-5-methyl-2,6-dioxo-3,6-dihydropyrimidin-1(2H)-yl)methoxy)ethyl)-2-methoxy-5-nitrophenoxy)ethyl)carbamate (276). Thymidine 275 (65 mg, 0.068 mmol) was dissolved in a solution of DIPEA (0.05 mL, 0.27 mmol) in dry DCM (2 mL) under argon. The solution was chilled to 0 °C before the addition of 2-cyanoethyl N,N-diisopropylchlorophosphoramidite (0.03 mL, 0.14 mmol). The reaction mixture was allowed to slowly warm to rt and stirred overnight. Then, the solvent was evaporated under reduced pressure and the remaining residue was purified by flash column chromatography on silica gel using DCM containing 1% TEA to deliver the caged phosphoramidite 276 (71.8 mg, 91%) as a yellow oil. 1H NMR (400 MHz, CDCl3) δ 7.97 (s, 1H), 7.71-7.25 (m, 11H), 6.81 (d, J = 6.8 Hz, 4H), 6.28 (m, 0.8H), 6.18 (m, 1.2H), 5.92 (m, 1H), 5.41-5.30 (m, 2.8H), 5.22 (m, 1H), 5.15 (m, 0.2H), 4.57 (m, 1H), 4.20-4.05 (m, 3H), 3.93 (m, 3H), 3.75 (s, 6H), 3.53-3.42 (m, 5H), 3.27 (m, 2H), 3.12-2.77, 2.64 (m, 2H), 2.50 (m, 2H), 2.17-2.06 (m, 3H), 1.59-1.03 (m, 22H) ppm; 13C NMR (100 MHz, CDCl3) δ 167.2, 163.1, 158.7, 156.6, 153.8, 150.8, 148.7, 146.6, 144.2, 141.0, 139.9, 138.4, 136.3, 136.0, 135.4, 135.3, 134.3, 135.4, 134.3, 134.3, 133.1, 132.1, 132.0, 130.2, 128.5, 128.0, 127.2, 117.0, 113.3, 109.1, 86.9, 86.6, 85.5, 85.4, 84.3, 79.3, 75.5, 73.4, 72.2, 69.7, 68.8, 63.3, 63.0, 59.7, 58.2, 56.3, 55.3, 55.2, 48.6, 47.7, 46.3, 46.1, 45.3, 43.3, 43.2, 42.3, 40.1, 39.2, 36.6, 34.6, 24.6, 23.9, 23.0, 20.6, 20.2, 20.1, 12.2, 11.6 ppm; 31P NMR (121 MHz, CDCl3) δ 149.9, 149.7, 149.4, 149.3.

Protocol for stability test of 275. Compound 275 (8 mg, 0.0084 mmol) was dissolved in a solution of 5% TFA* in DCM (0.84 mL) and the reaction mixture was stirred for 6 h at rt. In a separate reaction, 275 (10 mg, 0.0105 mmol) was dissolved in a 1:1 solution of 40% methylamine and concentrated ammonium hydroxide (1.05 mL) and the mixture was heated to 60 °C for 2 h. The volatiles were removed under reduced pressure at rt. After drying in vacuo, the residue was dissolved in CDCl3 and analyzed by 1H NMR. *Results in DMT deprotection.
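
The solution volumes used in the stability test follow from the 10 mM final concentration of 275 stated in Section 4.2. A minimal sketch of the conversion is given below; the function name is hypothetical.

```python
def solvent_volume_mL(amount_mmol, target_mM):
    """Volume (mL) needed to dissolve amount_mmol at target_mM (mmol/L)."""
    return amount_mmol / target_mM * 1000.0


# Amounts of 275 used in the two stability tests, each at 10 mM
tfa_volume = solvent_volume_mL(0.0084, 10.0)
amine_volume = solvent_volume_mL(0.0105, 10.0)
print(f"TFA/DCM test: {tfa_volume:.2f} mL; MeNH2/NH4OH test: {amine_volume:.2f} mL")
```

Both results agree with the volumes given in the protocol (0.84 mL and 1.05 mL).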

REFERENCES

1. Wakasugi, K.; Quinn, C. L.; Tao, N.; Schimmel, P., Genetic code in evolution: switching species-specific aminoacylation with a peptide transplant. EMBO J 1998, 17, 297-305.

2. Deiters, A., Principles and applications of the photochemical control of cellular processes. ChemBioChem 2010, 11, 47-53.

3. Baslé, E.; Joubert, N.; Pucheault, M., Protein chemical modification on endogenous amino acids. Chem Biol 2010, 17, 213-27.

4. Sletten, E. M.; Bertozzi, C. R., Bioorthogonal chemistry: fishing for selectivity in a sea of functionality. Angew Chem Int Ed Engl 2009, 48, 6974-98.

5. Johnson, J. A.; Lu, Y. Y.; Van Deventer, J. A.; Tirrell, D. A., Residue-specific incorporation of non-canonical amino acids into proteins: recent developments and applications. Curr Opin Chem Biol 2010, 14, 774-780.

6. Wang, L.; Brock, A.; Herberich, B.; Schultz, P. G., Expanding the genetic code of Escherichia coli. Science 2001, 292, 498-500.

7. Cornish, V. W.; Benson, D. R.; Altenbach, C. A.; Hideg, K.; Hubbell, W. L.; Schultz, P. G., Site-specific incorporation of biophysical probes into proteins. Proc Natl Acad Sci U S A 1994, 91, 2910-4.

8. Dougherty, D. A., Unnatural amino acids as probes of protein structure and function. Curr Opin Chem Biol 2000, 4, 645-52.

9. Chin, J. W.; Cropp, T. A.; Anderson, J. C.; Mukherji, M.; Zhang, Z.; Schultz, P. G., An expanded eukaryotic genetic code. Science 2003, 301, 964-7.

10. Davis, L.; Chin, J. W., Designer proteins: applications of genetic code expansion in cell biology. Nat Rev Mol Cell Biol 2012, 13, 168-82.

11. Arbely, E.; Torres-Kolbus, J.; Deiters, A.; Chin, J. W., Photocontrol of Tyrosine Phosphorylation in Mammalian Cells via Genetic Encoding of Photocaged Tyrosine. J Am Chem Soc 2012, 134, 11912-11915.

12. Neumann, H.; Wang, K. H.; Davis, L.; Garcia-Alai, M.; Chin, J. W., Encoding multiple unnatural amino acids via evolution of a quadruplet-decoding ribosome. Nature 2010, 464, 441-444.

13. Zhang, Z.; Alfonta, L.; Tian, F.; Bursulaya, B.; Uryu, S.; King, D. S.; Schultz, P. G., Selective incorporation of 5-hydroxytryptophan into proteins in mammalian cells. Proc Natl Acad Sci U S A 2004, 101, 8882-8887.

14. Gubbens, J.; Kim, S. J.; Yang, Z.; Johnson, A. E.; Skach, W. R., In vitro incorporation of nonnatural amino acids into protein using tRNACys-derived opal, ochre, and amber suppressor tRNAs. RNA 2010, 16, 1660-1672.

15. Chin, J. W., Expanding and Reprogramming the Genetic Code of Cells and Animals. Annu Rev Biochem 2014.

16. Wang, L.; Xie, J.; Schultz, P. G., Expanding the genetic code. Annu Rev Biophys Biomol Struct 2006, 35, 225-49.

17. Wang, L.; Schultz, P. G., Expanding the genetic code. Angew Chem Int Ed Engl 2004, 44, 34-66.

18. Chin, J. W.; Cropp, T. A.; Chu, S.; Meggers, E.; Schultz, P. G., Progress Toward an Expanded Eukaryotic Genetic Code. Chem Biol 2003, 10, 511-519.

19. Wu, N.; Deiters, A.; Cropp, T. A.; King, D.; Schultz, P. G., A Genetically Encoded Photocaged Amino Acid. J Am Chem Soc 2004, 126, 14306-14307.

20. Xie, J.; Schultz, P. G., An expanding genetic code. Methods 2005, 36, 227-238.

21. Liu, W.; Brock, A.; Chen, S.; Schultz, P. G., Genetic incorporation of unnatural amino acids into proteins in mammalian cells. Nat Methods 2007, 4, 239-44.

22. Edwards, H.; Schimmel, P., A bacterial amber suppressor in Saccharomyces cerevisiae is selectively recognized by a bacterial aminoacyl-tRNA synthetase. Mol Cell Biol 1990, 10, 1633-1641.

23. Edwards, H.; Trézéguet, V.; Schimmel, P., An Escherichia coli tyrosine transfer RNA is a leucine-specific transfer RNA in the yeast Saccharomyces cerevisiae. Proc Natl Acad Sci U S A 1991, 88, 1153-6.

24. Sakamoto, K.; Hayashi, A.; Sakamoto, A.; Kiga, D.; Nakayama, H.; Soma, A.; Kobayashi, T.; Kitabatake, M.; Takio, K.; Saito, K.; Shirouzu, M.; Hirao, I.; Yokoyama, S., Site-specific incorporation of an unnatural amino acid into proteins in mammalian cells. Nucleic Acids Res 2002, 30, 4692-4699.

25. Fekner, T.; Chan, M. K., The pyrrolysine translational machinery as a genetic-code expansion tool. Curr Opin Chem Biol 2011, 15, 387-91.

26. Krzycki, J. A., The direct genetic encoding of pyrrolysine. Curr Opin Microbiol 2005, 8, 706-12.

27. Atkins, J. F.; Gesteland, R., Biochemistry. The 22nd amino acid. Science 2002, 296, 1409-10.

28. Chattopadhaya, S.; Srinivasan, R.; Yeo, D. S.; Chen, G. Y.; Yao, S. Q., Site-specific covalent labeling of proteins inside live cells using small molecule probes. Bioorg Med Chem 2009, 17, 981-9.

29. Burke, S. A.; Lo, S. L.; Krzycki, J. A., Clustered genes encoding the methyltransferases of methanogenesis from monomethylamine. J Bacteriol 1998, 180, 3432-40.

30. Krzycki, J. A., Function of genetically encoded pyrrolysine in corrinoid-dependent methylamine methyltransferases. Curr Opin Chem Biol 2004, 8, 484-91.

31. Hao, B.; Zhao, G.; Kang, P. T.; Soares, J. A.; Ferguson, T. K.; Gallucci, J.; Krzycki, J. A.; Chan, M. K., Reactivity and chemical synthesis of L-pyrrolysine- the 22(nd) genetically encoded amino acid. Chem Biol 2004, 11, 1317-24.

32. Gaston, M. A.; Zhang, L.; Green-Church, K. B.; Krzycki, J. A., The complete biosynthesis of the genetically encoded amino acid pyrrolysine from lysine. Nature 2011, 471, 647-50.

33. Mahapatra, A.; Srinivasan, G.; Richter, K. B.; Meyer, A.; Lienard, T.; Zhang, J. K.; Zhao, G.; Kang, P. T.; Chan, M.; Gottschalk, G.; Metcalf, W. W.; Krzycki, J. A., Class I and class II lysyl-tRNA synthetase mutants and the genetic encoding of pyrrolysine in Methanosarcina spp. Mol Microbiol 2007, 64, 1306-1318.

34. Nozawa, K.; O'Donoghue, P.; Gundllapalli, S.; Araiso, Y.; Ishitani, R.; Umehara, T.; Söll, D.; Nureki, O., Pyrrolysyl-tRNA synthetase-tRNA(Pyl) structure reveals the molecular basis of orthogonality. Nature 2009, 457, 1163-7.

35. Namy, O.; Zhou, Y.; Gundllapalli, S.; Polycarpo, C. R.; Denise, A.; Rousset, J. P.; Söll, D.; Ambrogelly, A., Adding pyrrolysine to the Escherichia coli genetic code. FEBS Lett 2007, 581, 5282-8.

36. Hancock, S. M.; Uprety, R.; Deiters, A.; Chin, J. W., Expanding the genetic code of yeast for incorporation of diverse unnatural amino acids via a pyrrolysyl-tRNA synthetase/tRNA pair. J Am Chem Soc 2010, 132, 14819-24.

37. Blight, S. K.; Larue, R. C.; Mahapatra, A.; Longstaff, D. G.; Chang, E.; Zhao, G.; Kang, P. T.; Green-Church, K. B.; Chan, M. K.; Krzycki, J. A., Direct charging of tRNA(CUA) with pyrrolysine in vitro and in vivo. Nature 2004, 431, 333-5.

38. Greiss, S.; Chin, J. W., Expanding the Genetic Code of an Animal. J Am Chem Soc 2011, 133, 14196-14199.

39. Parrish, A. R.; She, X.; Xiang, Z.; Coin, I.; Shen, Z.; Briggs, S. P.; Dillin, A.; Wang, L., Expanding the Genetic Code of Caenorhabditis elegans Using Bacterial Aminoacyl-tRNA Synthetase/tRNA Pairs. ACS Chem Biol 2012, 7, 1292-1302.

40. Bianco, A.; Townsley, F. M.; Greiss, S.; Lang, K.; Chin, J. W., Expanding the genetic code of Drosophila melanogaster. Nat Chem Biol 2012, 8, 748-750.

41. Ye, S.; Riou, M.; Carvalho, S.; Paoletti, P., Expanding the Genetic Code in Xenopus laevis Oocytes. ChemBioChem 2013, 14, 230-235.

42. Polycarpo, C. R.; Herring, S.; Bérubé, A.; Wood, J. L.; Söll, D.; Ambrogelly, A., Pyrrolysine analogues as substrates for pyrrolysyl-tRNA synthetase. FEBS Lett 2006, 580, 6695-700.

43. Li, W. T.; Mahapatra, A.; Longstaff, D. G.; Bechtel, J.; Zhao, G.; Kang, P. T.; Chan, M. K.; Krzycki, J. A., Specificity of pyrrolysyl-tRNA synthetase for pyrrolysine and pyrrolysine analogs. J Mol Biol 2009, 385, 1156-64.

44. Kavran, J. M.; Gundllapalli, S.; O'Donoghue, P.; Englert, M.; Söll, D.; Steitz, T. A., Structure of pyrrolysyl-tRNA synthetase, an archaeal enzyme for genetic code innovation. Proc Natl Acad Sci U S A 2007, 104, 11268-73.

45. Fekner, T.; Li, X.; Lee, M. M.; Chan, M. K., A pyrrolysine analogue for protein click chemistry. Angew Chem Int Ed Engl 2009, 48, 1633-5.

46. Yanagisawa, T.; Ishii, R.; Fukunaga, R.; Kobayashi, T.; Sakamoto, K.; Yokoyama, S., Crystallographic studies on multiple conformational states of active-site loops in pyrrolysyl-tRNA synthetase. J Mol Biol 2008, 378, 634-52.

47. Delarue, M., Aminoacyl-tRNA synthetases. Curr Opin Struct Biol 1995, 5, 48-55.

48. Mukai, T.; Kobayashi, T.; Hino, N.; Yanagisawa, T.; Sakamoto, K.; Yokoyama, S., Adding l-lysine derivatives to the genetic code of mammalian cells with engineered pyrrolysyl-tRNA synthetases. Biochem Biophys Res Commun 2008, 371, 818-822.

49. Yanagisawa, T.; Ishii, R.; Fukunaga, R.; Kobayashi, T.; Sakamoto, K.; Yokoyama, S., Multistep engineering of pyrrolysyl-tRNA synthetase to genetically encode N(epsilon)-(o-azidobenzyloxycarbonyl) lysine for site-specific protein modification. Chem Biol 2008, 15, 1187-97.

50. Nguyen, D. P.; Lusic, H.; Neumann, H.; Kapadnis, P. B.; Deiters, A.; Chin, J. W., Genetic encoding and labeling of aliphatic azides and alkynes in recombinant proteins via a pyrrolysyl-tRNA Synthetase/tRNA(CUA) pair and click chemistry. J Am Chem Soc 2009, 131, 8720-1.

51. Chou, C.; Uprety, R.; Davis, L.; Chin, J. W.; Deiters, A., Genetically encoding an aliphatic diazirine for protein photocrosslinking. Chem Sci 2011, 2, 480-483.

52. Li, X.; Fekner, T.; Chan, M. K., N6-(2-(R)-propargylglycyl)lysine as a clickable pyrrolysine mimic. Chem Asian J 2010, 5, 1765-9.

53. Lee, M. M.; Fekner, T.; Tang, T.-H.; Wang, L.; Chan, A. H.-Y.; Hsu, P.-H.; Au, S. W.; Chan, M. K., A Click-and-Release Pyrrolysine Analogue. ChemBioChem 2013, 14, 805-808.

54. Li, X.; Fekner, T.; Ottesen, J. J.; Chan, M. K., A pyrrolysine analogue for site-specific protein ubiquitination. Angew Chem Int Ed Engl 2009, 48, 9184-7.

55. Nguyen, D. P.; Elliott, T.; Holt, M.; Muir, T. W.; Chin, J. W., Genetically Encoded 1,2-Aminothiols Facilitate Rapid and Site-Specific Protein Labeling via a Bio-orthogonal Cyanobenzothiazole Condensation. J Am Chem Soc 2011, 133, 11418-11421.

56. Gattner, M. J.; Vrabel, M.; Carell, T., Synthesis of ε-N-propionyl-, ε-N-butyryl-, and ε-N-crotonyl-lysine containing histone H3 using the pyrrolysine system. Chem Commun 2013, 379-381.

57. Wan, W.; Tharp, J. M.; Liu, W. R., Pyrrolysyl-tRNA synthetase: An ordinary enzyme but an outstanding genetic code expansion tool. Biochim Biophys Acta 2014, 1844, 1059-1070.

58. Li, J.; Lin, S.; Wang, J.; Jia, S.; Yang, M.; Hao, Z.; Zhang, X.; Chen, P. R., Ligand-Free Palladium-Mediated Site-Specific Protein Labeling Inside Gram-Negative Bacterial Pathogens. J Am Chem Soc 2013, 135, 7330-7338.

59. Gautier, A.; Nguyen, D. P.; Lusic, H.; An, W.; Deiters, A.; Chin, J. W., Genetically encoded photocontrol of protein localization in mammalian cells. J Am Chem Soc 2010, 132, 4086-8.

60. Gautier, A.; Deiters, A.; Chin, J. W., Light-activated kinases enable temporal dissection of signaling networks in living cells. J Am Chem Soc 2011, 133, 2124-7.

61. Hemphill, J.; Chou, C.; Chin, J. W.; Deiters, A., Genetically Encoded Light-Activated Transcription for Spatiotemporal Control of Gene Expression and Gene Silencing in Mammalian Cells. J Am Chem Soc 2013, 135, 13433-13439.

62. Engelke, H.; Chou, C.; Uprety, R.; Jess, P.; Deiters, A., Control of Protein Function through Optochemical Translocation. ACS Synthetic Biology 2014.

63. Plass, T.; Milles, S.; Koehler, C.; Schultz, C.; Lemke, E. A., Genetically encoded copper-free click chemistry. Angew Chem Int Ed Engl 2011, 50, 3878-81.

64. Borrmann, A.; Milles, S.; Plass, T.; Dommerholt, J.; Verkade, J. M. M.; Wießler, M.; Schultz, C.; van Hest, J. C. M.; van Delft, F. L.; Lemke, E. A., Genetic Encoding of a Bicyclo[6.1.0]nonyne-Charged Amino Acid Enables Fast Cellular Protein Imaging by Metal-Free Ligation. ChemBioChem 2012, 13, 2094-2099.

65. Plass, T.; Milles, S.; Koehler, C.; Szymański, J.; Mueller, R.; Wießler, M.; Schultz, C.; Lemke, E. A., Amino Acids for Diels–Alder Reactions in Living Cells. Angew Chem Int Ed Engl 2012, 51, 4166-4170.

66. Lang, K.; Davis, L.; Wallace, S.; Mahesh, M.; Cox, D. J.; Blackman, M. L.; Fox, J. M.; Chin, J. W., Genetic Encoding of Bicyclononynes and trans-Cyclooctenes for Site-Specific Protein Labeling in Vitro and in Live Mammalian Cells via Rapid Fluorogenic Diels–Alder Reactions. J Am Chem Soc 2012, 134, 10317-10320.

67. Virdee, S.; Kapadnis, P. B.; Elliott, T.; Lang, K.; Madrzak, J.; Nguyen, D. P.; Riechmann, L.; Chin, J. W., Traceless and site-specific ubiquitination of recombinant proteins. J Am Chem Soc 2011, 133, 10708-11.

68. Neumann, H.; Peak-Chew, S. Y.; Chin, J. W., Genetically encoding N(epsilon)-acetyllysine in recombinant proteins. Nat Chem Biol 2008, 4, 232-4.

69. Neumann, H.; Hancock, S. M.; Buning, R.; Routh, A.; Chapman, L.; Somers, J.; Owen-Hughes, T.; van Noort, J.; Rhodes, D.; Chin, J. W., A Method for Genetically Installing Site-Specific Acetylation in Recombinant Histones Defines the Effects of H3 K56 Acetylation. Mol Cell 2009, 36, 153-163.

70. Schmidt, M. J.; Borbas, J.; Drescher, M.; Summerer, D., A Genetically Encoded Spin Label for Electron Paramagnetic Resonance Distance Measurements. J Am Chem Soc 2014, 136, 1238-1241.

71. Schmidt, M. J.; Summerer, D., Red-Light-Controlled Protein–RNA Crosslinking with a Genetically Encoded Furan. Angew Chem Int Ed Engl 2013, 52, 4690-4693.

72. Li, Y.; Yang, M.; Huang, Y.; Song, X.; Liu, L.; Chen, P. R., Genetically encoded alkenyl-pyrrolysine analogues for thiol-ene reaction mediated site-specific protein labeling. Chem Sci 2012, 3, 2766-2770.

73. Wang, Y.-S.; Russell, W. K.; Wang, Z.; Wan, W.; Dodd, L. E.; Pai, P.-J.; Russell, D. H.; Liu, W. R., The de novo engineering of pyrrolysyl-tRNA synthetase for genetic incorporation of l-phenylalanine and its derivatives. Mol BioSyst 2011, 7, 714-717.

74. Ko, J.-h.; Wang, Y.-S.; Nakamura, A.; Guo, L.-T.; Söll, D.; Umehara, T., Pyrrolysyl-tRNA synthetase variants reveal ancestral aminoacylation function. FEBS Letters 2013, 587, 3243-3248.

75. Takimoto, J. K.; Dellas, N.; Noel, J. P.; Wang, L., Stereochemical Basis for Engineered Pyrrolysyl-tRNA Synthetase and the Efficient in Vivo Incorporation of Structurally Divergent Non-native Amino Acids. ACS Chem Biol 2011, 6, 733-743.

76. Liu, C. C.; Schultz, P. G., Adding new chemistries to the genetic code. Annu Rev Biochem 2010, 79, 413-44.

77. Wang, Y. S.; Fang, X.; Wallace, A. L.; Wu, B.; Liu, W. R., A rationally designed pyrrolysyl-tRNA synthetase mutant with a broad substrate spectrum. J Am Chem Soc 2012, 134, 2950-3.

78. Wang, Y.-S.; Fang, X.; Chen, H.-Y.; Wu, B.; Wang, Z. U.; Hilty, C.; Liu, W. R., Genetic Incorporation of Twelve meta-Substituted Phenylalanine Derivatives Using a Single Pyrrolysyl-tRNA Synthetase Mutant. ACS Chem Biol 2012, 8, 405-415.

79. Tuley, A.; Wang, Y.-S.; Fang, X.; Kurra, Y.; Rezenom, Y. H.; Liu, W. R., The genetic incorporation of thirteen novel non-canonical amino acids. Chem Commun (Camb) 2014, 50, 2673-2675.

80. Tharp, J. M.; Wang, Y.-S.; Lee, Y.-J.; Yang, Y.; Liu, W. R., Genetic Incorporation of Seven ortho-Substituted Phenylalanine Derivatives. ACS Chem Biol 2014.

81. Lacey, V. K.; Louie, G. V.; Noel, J. P.; Wang, L., Expanding the Library and Substrate Diversity of the Pyrrolysyl-tRNA Synthetase to Incorporate Unnatural Amino Acids Containing Conjugated Rings. ChemBioChem 2013, 14, 2100-2105.

82. Waldron, K. J.; Robinson, N. J., How do bacterial cells ensure that metalloproteins get the correct metal? Nat Rev Microbiol 2009, 7, 25-35.

83. Lu, Y.; Yeung, N.; Sieracki, N.; Marshall, N. M., Design of functional metalloproteins. Nature 2009, 460, 855-62.

84. Berg, J. M., Metal ions in proteins: structural and functional roles. Cold Spring Harb Symp Quant Biol 1987, 52, 579-85.

85. DeGrado, W. F.; Summa, C. M.; Pavone, V.; Nastri, F.; Lombardi, A., De novo design and structural characterization of proteins and metalloproteins. Annu Rev Biochem 1999, 68, 779-819.

86. Lu, Y.; Yeung, N.; Sieracki, N.; Marshall, N. M., Design of functional metalloproteins. Nature 2009, 460, 855-62.

87. Xie, J.; Liu, W.; Schultz, P. G., A genetically encoded bidentate, metal-binding amino acid. Angew Chem Int Ed Engl 2007, 46, 9239-42.

88. Lee, H. S.; Spraggon, G.; Schultz, P. G.; Wang, F., Genetic Incorporation of a Metal-Ion Chelating Amino Acid into Proteins as a Biophysical Probe. J Am Chem Soc 2009, 131, 2481-2483.

89. Jin, S.; Lee, H. L.; Lee, S.; Lee, H. S., Genetic Incorporation of Phenanthroline-Containing Amino Acid in Escherichia coli. Bull Korean Chem Soc 2014, 35, 1087-1090.

90. Lee, H. S.; Schultz, P. G., Biosynthesis of a Site-Specific DNA Cleaving Protein. J Am Chem Soc 2008, 130, 13194-13195.

91. Mills, J. H.; Khare, S. D.; Bolduc, J. M.; Forouhar, F.; Mulligan, V. K.; Lew, S.; Seetharaman, J.; Tong, L.; Stoddard, B. L.; Baker, D., Computational Design of an Unnatural Amino Acid Dependent Metalloprotein with Atomic Level Accuracy. J Am Chem Soc 2013, 135, 13393-13399.

92. Park, N.; Ryu, J.; Jang, S.; Lee, H. S., Metal ion affinity purification of proteins by genetically incorporating metal-chelating amino acids. Tetrahedron 2012, 68, 4649-4654.

93. Bjerrum, M. J.; Casimiro, D. R.; Chang, I. J.; Di Bilio, A. J.; Gray, H. B.; Hill, M. G.; Langen, R.; Mines, G. A.; Skov, L. K.; Winkler, J. R., Electron transfer in ruthenium-modified proteins. J Bioenerg Biomembr 1995, 27, 295-302.

94. Panetta, C. A.; Kumpaty, H. J.; Heimer, N. E.; Leavy, M. C.; Hussey, C. L., Disulfide-Functionalized 3-, 4-, 5-, and 6-Substituted 2,2'-Bipyridines and Their Ruthenium Complexes. J Org Chem 1999, 64, 1015-1021.

95. Ghadiri, M. R.; Soares, C.; Choi, C., A convergent approach to protein design. Metal ion-assisted spontaneous self-assembly of a polypeptide into a triple-helix bundle protein. J Am Chem Soc 1992, 114, 825-831.

96. Stephanopoulos, N.; Francis, M. B., Choosing an effective protein bioconjugation strategy. Nat Chem Biol 2011, 7, 876-84.

97. Tsien, R. Y., Constructing and Exploiting the Fluorescent Protein Paintbox (Nobel Lecture). Angew Chem Int Ed Engl 2009, 48, 5612-5626.

98. Giepmans, B. N.; Adams, S. R.; Ellisman, M. H.; Tsien, R. Y., The fluorescent toolbox for assessing protein location and function. Science 2006, 312, 217-24.

99. Tae, H. S.; Sundberg, T. B.; Neklesa, T. K.; Noblin, D. J.; Gustafson, J. L.; Roth, A. G.; Raina, K.; Crews, C. M., Identification of Hydrophobic Tags for the Degradation of Stabilized Proteins. ChemBioChem 2012, 13, 538-541.

100. Neklesa, T. K.; Tae, H. S.; Schneekloth, A. R.; Stulberg, M. J.; Corson, T. W.; Sundberg, T. B.; Raina, K.; Holley, S. A.; Crews, C. M., Small-molecule hydrophobic tagging-induced degradation of HaloTag fusion proteins. Nat Chem Biol 2011, 7, 538-43.

101. Los, G. V.; Encell, L. P.; McDougall, M. G.; Hartzell, D. D.; Karassina, N.; Zimprich, C.; Wood, M. G.; Learish, R.; Ohana, R. F.; Urh, M.; Simpson, D.; Mendez, J.; Zimmerman, K.; Otto, P.; Vidugiris, G.; Zhu, J.; Darzins, A.; Klaubert, D. H.; Bulleit, R. F.; Wood, K. V., HaloTag: a novel protein labeling technology for cell imaging and protein analysis. ACS Chem Biol 2008, 3, 373-82.

102. Los, G. V.; Wood, K., The HaloTag: a novel technology for cell imaging and protein analysis. Methods Mol Biol 2007, 356, 195-208.

103. Keppler, A.; Gendreizig, S.; Gronemeyer, T.; Pick, H.; Vogel, H.; Johnsson, K., A general method for the covalent labeling of fusion proteins with small molecules in vivo. Nat Biotechnol 2003, 21, 86-9.

104. Gautier, A.; Juillerat, A.; Heinis, C.; Corrêa, I. R.; Kindermann, M.; Beaufils, F.; Johnsson, K., An engineered protein tag for multiprotein labeling in living cells. Chem Biol 2008, 15, 128-36.

105. George, N.; Pick, H.; Vogel, H.; Johnsson, N.; Johnsson, K., Specific labeling of cell surface proteins with chemically diverse compounds. J Am Chem Soc 2004, 126, 8896-7.

106. Zhou, Z.; Koglin, A.; Wang, Y.; McMahon, A. P.; Walsh, C. T., An eight residue fragment of an acyl carrier protein suffices for post-translational introduction of fluorescent pantetheinyl arms in protein modification in vitro and in vivo. J Am Chem Soc 2008, 130, 9925-30.

107. Yin, J.; Straight, P. D.; McLoughlin, S. M.; Zhou, Z.; Lin, A. J.; Golan, D. E.; Kelleher, N. L.; Kolter, R.; Walsh, C. T., Genetically encoded short peptide tag for versatile protein labeling by Sfp phosphopantetheinyl transferase. Proc Natl Acad Sci U S A 2005, 102, 15815-20.

108. Uttamapinant, C.; White, K. A.; Baruah, H.; Thompson, S.; Fernández-Suárez, M.; Puthenveetil, S.; Ting, A. Y., A fluorophore ligase for site-specific protein labeling inside living cells. Proc Natl Acad Sci U S A 2010, 107, 10914-9.

109. Popp, M. W.; Antos, J. M.; Grotenbreg, G. M.; Spooner, E.; Ploegh, H. L., Sortagging: a versatile method for protein labeling. Nat Chem Biol 2007, 3, 707-8.

110. Griffin, B. A.; Adams, S. R.; Tsien, R. Y., Specific covalent labeling of recombinant protein molecules inside live cells. Science 1998, 281, 269-72.

111. Halo, T. L.; Appelbaum, J.; Hobert, E. M.; Balkin, D. M.; Schepartz, A., Selective recognition of protein tetraserine motifs with a cell-permeable, pro-fluorescent bis-boronic acid. J Am Chem Soc 2009, 131, 438-9.

112. Andresen, M.; Schmitz-Salue, R.; Jakobs, S., Short tetracysteine tags to beta-tubulin demonstrate the significance of small labels for live cell imaging. Mol Biol Cell 2004, 15, 5616-22.

113. Shaner, N. C.; Steinbach, P. A.; Tsien, R. Y., A guide to choosing fluorescent proteins. Nat Meth 2005, 2, 905-909.

114. Hinner, M. J.; Johnsson, K., How to obtain labeled proteins and what to do with them. Curr Opin Biotechnol 2010, 21, 766-76.

115. Wan, W.; Wang, Y.-S.; Liu, W. R., Genetically encoding bioorthogonal functional groups for site-selective protein labeling. Organic Chem Curr Res 2012, 1, 1-7.

116. Davis, L.; Chin, J. W., Designer proteins: applications of genetic code expansion in cell biology. Nat Rev Mol Cell Biol 2012, 13, 168-182.

117. Lang, K.; Chin, J. W., Bioorthogonal Reactions for Labeling Proteins. ACS Chem Biol 2014, 9, 16-20.

118. Lang, K.; Chin, J. W., Cellular Incorporation of Unnatural Amino Acids and Bioorthogonal Labeling of Proteins. Chem Rev 2014, 114, 4764-4806.

119. Patterson, D. M.; Nazarova, L. A.; Prescher, J. A., Finding the Right (Bioorthogonal) Chemistry. ACS Chem Biol 2014, 9, 592-605.

120. Debets, M. F.; van Berkel, S. S.; Dommerholt, J.; Dirks, A. J.; Rutjes, F. P. J. T.; van Delft, F. L., Bioconjugation with Strained Alkenes and Alkynes. Acc Chem Res 2011, 44, 805-815.

121. van Swieten, P. F.; Leeuwenburgh, M. A.; Kessler, B. M.; Overkleeft, H. S., Bioorthogonal organic chemistry in living cells: novel strategies for labeling biomolecules. Org Biomol Chem 2005, 3, 20-27.

122. Prescher, J. A.; Bertozzi, C. R., Chemistry in living systems. Nat Chem Biol 2005, 1, 13-21.

123. Rideout, D., Self-assembling cytotoxins. Science 1986, 233, 561-563.

124. Mahal, L. K.; Yarema, K. J.; Bertozzi, C. R., Engineering Chemical Reactivity on Cell Surfaces Through Oligosaccharide Biosynthesis. Science 1997, 276, 1125-1128.

125. Jencks, W. P., Studies on the Mechanism of Oxime and Semicarbazone Formation. J Am Chem Soc 1959, 81, 475-481.

126. Luchansky, S. J.; Goon, S.; Bertozzi, C. R., Expanding the diversity of unnatural cell-surface sialic acids. ChemBioChem 2004, 5, 371-4.

127. Hang, H. C.; Bertozzi, C. R., Ketone isosteres of 2-N-acetamidosugars as substrates for metabolic cell surface engineering. J Am Chem Soc 2001, 123, 1242-3.

128. Jacobs, C. L.; Yarema, K. J.; Mahal, L. K.; Nauman, D. A.; Charters, N. W.; Bertozzi, C. R., Metabolic labeling of glycoproteins with chemical tags through unnatural sialic acid biosynthesis. In Methods in Enzymology, Thorner, J.; Emr, S. D.; Abelson, J. N., Eds. Academic Press: 2000; Vol. 327, pp 260-275.

129. Sadamoto, R.; Niikura, K.; Ueda, T.; Monde, K.; Fukuhara, N.; Nishimura, S.-I., Control of Bacteria Adhesion by Cell-Wall Engineering. J Am Chem Soc 2004, 126, 3755-3761.

130. Tang, Y.; Wang, P.; Van Deventer, J. A.; Link, A. J.; Tirrell, D. A., Introduction of an Aliphatic Ketone into Recombinant Proteins in a Bacterial Strain that Overexpresses an Editing-Impaired Leucyl-tRNA Synthetase. ChemBioChem 2009, 10, 2188-2190.

131. Datta, D.; Wang, P.; Carrico, I. S.; Mayo, S. L.; Tirrell, D. A., A Designed Phenylalanyl-tRNA Synthetase Variant Allows Efficient in Vivo Incorporation of Aryl Ketone Functionality into Proteins. J Am Chem Soc 2002, 124, 5652-5653.

132. Ngo, J. T.; Tirrell, D. A., Noncanonical Amino Acids in the Interrogation of Cellular Protein Synthesis. Acc Chem Res 2011, 44, 677-685.

133. Zeng, H.; Xie, J.; Schultz, P. G., Genetic introduction of a diketone-containing amino acid into proteins. Bioorg Med Chem Lett 2006, 16, 5356-9.

134. Wang, L.; Zhang, Z.; Brock, A.; Schultz, P. G., Addition of the keto functional group to the genetic code of Escherichia coli. Proc Natl Acad Sci U S A 2003, 100, 56-61.

135. Brustad, E. M.; Lemke, E. A.; Schultz, P. G.; Deniz, A. A., A General and Efficient Method for the Site-Specific Dual-Labeling of Proteins for Single Molecule Fluorescence Resonance Energy Transfer. J Am Chem Soc 2008, 130, 17664-17665.

136. Fleissner, M. R.; Brustad, E. M.; Kálai, T.; Altenbach, C.; Cascio, D.; Peters, F. B.; Hideg, K.; Peuker, S.; Schultz, P. G.; Hubbell, W. L., Site-directed spin labeling of a genetically encoded unnatural amino acid. Proc Natl Acad Sci U S A 2009, 106, 21637-21642.

137. Zhang, Z.; Smith, B. A.; Wang, L.; Brock, A.; Cho, C.; Schultz, P. G., A new strategy for the site-specific modification of proteins in vivo. Biochemistry 2003, 42, 6735-46.

138. Ye, S.; Köhrer, C.; Huber, T.; Kazmi, M.; Sachdev, P.; Yan, E. C. Y.; Bhagat, A.; RajBhandary, U. L.; Sakmar, T. P., Site-specific Incorporation of Keto Amino Acids into Functional G Protein-coupled Receptors Using Unnatural Amino Acid Mutagenesis. J Biol Chem 2008, 283, 1525-1533.

139. Young, T. S.; Ahmad, I.; Brock, A.; Schultz, P. G., Expanding the Genetic Repertoire of the Methylotrophic Yeast Pichia pastoris. Biochemistry 2009, 48, 2643-2653.

140. Huang, Y.; Wan, W.; Russell, W. K.; Pai, P. J.; Wang, Z.; Russell, D. H.; Liu, W., Genetic incorporation of an aliphatic keto-containing amino acid into proteins for their site-specific modifications. Bioorg Med Chem Lett 2010, 20, 878-80.

141. Carey, F. A.; Sundberg, R. J., Advanced Organic Chemistry, Part A: Structure and Mechanisms. 3rd ed.; Plenum Press: New York, 1990.

142. Wu, P.; Shui, W.; Carlson, B. L.; Hu, N.; Rabuka, D.; Lee, J.; Bertozzi, C. R., Site-specific chemical modification of recombinant proteins produced in mammalian cells by using the genetically encoded aldehyde tag. Proc Natl Acad Sci U S A 2009, 106, 3000-3005.

143. Carrico, I. S.; Carlson, B. L.; Bertozzi, C. R., Introducing genetically encoded aldehydes into proteins. Nat Chem Biol 2007, 3, 321-2.

144. Geoghegan, K. F.; Stroh, J. G., Site-directed conjugation of nonpeptide groups to peptides and proteins via periodate oxidation of a 2-amino alcohol. Application to modification at N-terminal serine. Bioconjugate Chem 1992, 3, 138-146.

145. Tuley, A.; Lee, Y.-J.; Wu, B.; Wang, Z. U.; Liu, W. R., A genetically encoded aldehyde for rapid protein labelling. Chem Commun (Camb) 2014.

146. Spetzler, J. C.; Hoeg-Jensen, T., Masked side-chain aldehyde amino acids for solid-phase synthesis and ligation. Tetrahedron Lett 2002, 43, 2303-2306.

147. Ameer, F.; Giles, R. G. F.; Green, I. R.; Nagabhushana, K. S., The DDQ Mediated Cyclization Products of Some 2-Hydroxy-3-(1'-alkenyl)-1,4-naphthoquinones. Synth Commun 2002, 32, 369-380.

148. Gomtsyan, A. R.; Bayburt, E. K.; Koenig, J. R.; Marsh, K. C.; Schmidt, R. G., Jr.; Lee, C.-H.; Wang, W.; Daanen, J. F.; Brown, B. S. Preparation of indazole derivatives as TRPV1 receptor inhibitors. 2007.

149. Dirksen, A.; Hackeng, T. M.; Dawson, P. E., Nucleophilic Catalysis of Oxime Ligation. Angew Chem Int Ed Engl 2006, 45, 7581-7584.

150. Dirksen, A.; Dawson, P. E., Rapid Oxime and Hydrazone Ligations with Aromatic Aldehydes for Biomolecular Labeling. Bioconjug Chem 2008, 19, 2543-2548.

151. Simonin, J.; Vernekar, S. K. V.; Thompson, A. J.; Hothersall, J. D.; Connolly, C. N.; Lummis, S. C. R.; Lochner, M., High-affinity fluorescent ligands for the 5-HT3 receptor. Bioorg Med Chem Lett 2012, 22, 1151-1155.

152. Salisbury, C. M.; Maly, D. J.; Ellman, J. A., Peptide Microarrays for the Determination of Protease Substrate Specificity. J Am Chem Soc 2002, 124, 14868-14870.

153. Nauman, D. A.; Bertozzi, C. R., Kinetic parameters for small-molecule drug delivery by covalent cell surface targeting. Biochim Biophys Acta 2001, 1568, 147-154.

154. Dirksen, A.; Dirksen, S.; Hackeng, T. M.; Dawson, P. E., Nucleophilic Catalysis of Hydrazone Formation and Transimination: Implications for Dynamic Covalent Chemistry. J Am Chem Soc 2006, 128, 15602-15603.

155. Bar-Or, R.; Rael, L. T.; Bar-Or, D., Dehydroalanine derived from cysteine is a common post-translational modification in human serum albumin. Rapid Commun Mass Spectrom 2008, 22, 711-6.

156. Wickner, R. B., Dehydroalanine in Histidine Ammonia Lyase. J Biol Chem 1969, 244, 6550-6552.

157. Ai, H. W.; Shen, W.; Brustad, E.; Schultz, P. G., Genetically encoded alkenes in yeast. Angew Chem Int Ed Engl 2010, 49, 935-7.

158. Song, W.; Wang, Y.; Qu, J.; Lin, Q., Selective functionalization of a genetically encoded alkene-containing protein via "photoclick chemistry" in bacterial cells. J Am Chem Soc 2008, 130, 9654-5.

159. Song, W.; Wang, Y.; Yu, Z.; Vera, C. I.; Qu, J.; Lin, Q., A metabolic alkene reporter for spatiotemporally controlled imaging of newly synthesized proteins in Mammalian cells. ACS Chem Biol 2010, 5, 875-85.

160. Yu, Z.; Pan, Y.; Wang, Z.; Wang, J.; Lin, Q., Genetically encoded cyclopropene directs rapid, photoclick-chemistry-mediated protein labeling in mammalian cells. Angew Chem Int Ed Engl 2012, 51, 10600-4.

161. Lang, K.; Davis, L.; Torres-Kolbus, J.; Chou, C.; Deiters, A.; Chin, J. W., Genetically encoded norbornene directs site-specific cellular protein labelling via a rapid bioorthogonal reaction. Nat Chem 2012, 4, 298-304.

162. Lee, Y. J.; Wu, B.; Raymond, J. E.; Zeng, Y.; Fang, X.; Wooley, K. L.; Liu, W. R., A genetically encoded acrylamide functionality. ACS Chem Biol 2013, 8, 1664-70.

163. Hoyle, C. E.; Bowman, C. N., Thiol-ene click chemistry. Angew Chem Int Ed Engl 2010, 49, 1540-73.

164. Northrop, B. H.; Coffey, R. N., Thiol–Ene Click Chemistry: Computational and Kinetic Analysis of the Influence of Alkene Functionality. J Am Chem Soc 2012, 134, 13804-13817.

165. DeForest, C. A.; Anseth, K. S., Photoreversible Patterning of Biomolecules within Click-Based Hydrogels. Angew Chem Int Ed Engl 2012, 51, 1816-1819.

166. Aimetti, A. A.; Machen, A. J.; Anseth, K. S., Poly(ethylene glycol) hydrogels formed by thiol-ene photopolymerization for enzyme-responsive protein delivery. Biomaterials 2009, 30, 6048-6054.

167. Gupta, N.; Lin, B. F.; Campos, L. M.; Dimitriou, M. D.; Hikita, S. T.; Treat, N. D.; Tirrell, M. V.; Clegg, D. O.; Kramer, E. J.; Hawker, C. J., A versatile approach to high-throughput microarrays using thiol-ene chemistry. Nat Chem 2010, 2, 138-145.

168. Chan, J. W.; Yu, B.; Hoyle, C. E.; Lowe, A. B., Convergent synthesis of 3-arm star polymers from RAFT-prepared poly(N,N-diethylacrylamide) via a thiol-ene click reaction. Chem Commun (Camb) 2008, 4959-61.

169. Chan, J. W.; Hoyle, C. E.; Lowe, A. B., Sequential phosphine-catalyzed, nucleophilic thiol-ene/radical-mediated thiol-yne reactions and the facile orthogonal synthesis of polyfunctional materials. J Am Chem Soc 2009, 131, 5751-3.

170. Lowe, A. B., Thiol-ene "click" reactions and recent applications in polymer and materials synthesis. Polym Chem 2010, 1, 17-36.

171. Lv, Y.; Lin, Z.; Svec, F., "Thiol-ene" click chemistry: a facile and versatile route for the functionalization of porous polymer monoliths. Analyst 2012, 137, 4114-8.

172. Wojcik, F.; O'Brien, A. G.; Götze, S.; Seeberger, P. H.; Hartmann, L., Synthesis of Carbohydrate-Functionalised Sequence-Defined Oligo(amidoamine)s by Photochemical Thiol-Ene Coupling in a Continuous Flow Reactor. Chemistry 2013, 19, 3090-3098.

173. Dondoni, A.; Marra, A., Recent applications of thiol-ene coupling as a click process for glycoconjugation. Chem Soc Rev 2012, 41, 573-586.

174. Garber, K. C. A.; Carlson, E. E., Thiol-ene Enabled Detection of Thiophosphorylated Kinase Substrates. ACS Chem Biol 2013, 8, 1671-1676.

175. Valkevich, E. M.; Guenette, R. G.; Sanchez, N. A.; Chen, Y. C.; Ge, Y.; Strieter, E. R., Forging isopeptide bonds using thiol-ene chemistry: site-specific coupling of ubiquitin molecules for studying the activity of isopeptidases. J Am Chem Soc 2012, 134, 6916-9.

176. Weinrich, D.; Lin, P. C.; Jonkheijm, P.; Nguyen, U. T.; Schröder, H.; Niemeyer, C. M.; Alexandrov, K.; Goody, R.; Waldmann, H., Oriented immobilization of farnesylated proteins by the thiol-ene reaction. Angew Chem Int Ed Engl 2010, 49, 1252-7.

177. Jonkheijm, P.; Weinrich, D.; Köhn, M.; Engelkamp, H.; Christianen, P. C.; Kuhlmann, J.; Maan, J. C.; Nüsse, D.; Schroeder, H.; Wacker, R.; Breinbauer, R.; Niemeyer, C. M.; Waldmann, H., Photochemical surface patterning by the thiol-ene reaction. Angew Chem Int Ed Engl 2008, 47, 4421-4.

178. Trang, V. H.; Valkevich, E. M.; Minami, S.; Chen, Y.-C.; Ge, Y.; Strieter, E. R., Nonenzymatic Polymerization of Ubiquitin: Single-Step Synthesis and Isolation of Discrete Ubiquitin Oligomers. Angew Chem Int Ed Engl 2012, 51, 13085-13088.

179. Li, Y.; Pan, M.; Huang, Y.; Guo, Q., Thiol-yne radical reaction mediated site-specific protein labeling via genetic incorporation of an alkynyl-L-lysine analogue. Org Biomol Chem 2013, 11, 2624-9.

180. Klemm, J. D.; Schreiber, S. L.; Crabtree, G. R., Dimerization as a regulatory mechanism in signal transduction. Annu Rev Immunol 1998, 16, 569-592.

181. Funnell, A. W.; Crossley, M., Homo- and Heterodimerization in Transcriptional Regulation. In Protein Dimerization and Oligomerization in Biology, Matthews, J. M., Ed. Springer New York: 2012; Vol. 747, pp 105-121.

182. Walker, J. R.; Corpina, R. A.; Goldberg, J., Structure of the Ku heterodimer bound to DNA and its implications for double-strand break repair. Nature 2001, 412, 607-14.

183. Arnaud, O.; Koubeissi, A.; Ettouati, L.; Terreux, R.; Alamé, G.; Grenot, C.; Dumontet, C.; Di Pietro, A.; Paris, J.; Falson, P., Potent and fully noncompetitive peptidomimetic inhibitor of multidrug resistance P-glycoprotein. J Med Chem 2010, 53, 6720-9.

184. Rypniewski, W. R.; Holden, H. M.; Rayment, I., Structural consequences of reductive methylation of lysine residues in hen egg white lysozyme: An x-ray analysis at 1.8-Å resolution. Biochemistry 1993, 32, 9851-9858.

185. Ito, S.; Tanaka, Y.; Kakehi, A.; Kondo, K.-I., A facile Synthesis of 2,5-disubstituted tetrazoles by the reaction of Phenylsulfonylhydrazones with Arenediazonium Salts. Bull Chem Soc Jpn 1976, 49, 1920-1923.

186. Wang, Y.; Song, W.; Hu, W. J.; Lin, Q., Fast alkene functionalization in vivo by Photoclick chemistry: HOMO lifting of nitrile imine dipoles. Angew Chem Int Ed Engl 2009, 48, 5330-3.

187. Yu, Z.; Lin, Q., Design of spiro[2.3]hex-1-ene, a genetically encodable double-strained alkene for superfast photoclick chemistry. J Am Chem Soc 2014, 136, 4153-6.

188. Wang, Y.; Song, W.; Hu, W. J.; Lin, Q., Fast alkene functionalization in vivo by Photoclick chemistry: HOMO lifting of nitrile imine dipoles. Angew Chem Int Ed Engl 2009, 48, 5330-3.

189. Rideout, D. C.; Breslow, R., Hydrophobic Acceleration of Diels-Alder Reactions. J Am Chem Soc 1980, 102, 7816-7817.

190. Breslow, R.; Maitra, U., On the origin of product selectivity in aqueous Diels-Alder reactions. Tetrahedron Lett 1984, 25, 1239-1240.

191. Hill, K. W.; Taunton-Rigby, J.; Carter, J. D.; Kropp, E.; Vagle, K.; Pieken, W.; McGee, D. P.; Husar, G. M.; Leuck, M.; Anziano, D. J.; Sebesta, D. P., Diels-Alder bioconjugation of diene-modified oligonucleotides. J Org Chem 2001, 66, 5352-8.

192. de Araújo, A. D.; Palomo, J. M.; Cramer, J.; Seitz, O.; Alexandrov, K.; Waldmann, H., Diels-Alder ligation of peptides and proteins. Chemistry 2006, 12, 6095-109.

193. Palomo, J. M., Diels–Alder Cycloaddition in Protein Chemistry. Eur J Org Chem 2010, 33, 6303-6314.

194. Tona, R.; Häner, R., Synthesis and bioconjugation of diene-modified oligonucleotides. Bioconjug Chem 2005, 16, 837-42.

195. Pozsgay, V.; Vieira, N. E.; Yergey, A., A Method for Bioconjugation of Carbohydrates Using Diels−Alder Cycloaddition. Org Lett 2002, 4, 3191-3194.

196. Yousaf, M. N.; Mrksich, M., Diels−Alder Reaction for the Selective Immobilization of Protein to Electroactive Self-Assembled Monolayers. J Am Chem Soc 1999, 121, 4286-4287.

197. Houseman, B. T.; Huh, J. H.; Kron, S. J.; Mrksich, M., Peptide chips for the quantitative evaluation of protein kinase activity. Nat Biotechnol 2002, 20, 270-4.

198. de Araújo, A. D.; Palomo, J. M.; Cramer, J.; Köhn, M.; Schröder, H.; Wacker, R.; Niemeyer, C.; Alexandrov, K.; Waldmann, H., Diels–Alder Ligation and Surface Immobilization of Proteins. Angew Chem Int Ed Engl 2006, 45, 296-301.

199. Sun, X. L.; Yang, L.; Chaikof, E. L., Chemoselective immobilization of biomolecules through aqueous Diels-Alder and PEG chemistry. Tetrahedron Lett 2008, 49, 2510-2513.

200. Shi, M.; Wosnick, J. H.; Ho, K.; Keating, A.; Shoichet, M. S., Immuno-Polymeric Nanoparticles by Diels–Alder Chemistry. Angew Chem Int Ed Engl 2007, 46, 6126-6131.

201. Hooker, J. M.; Kovacs, E. W.; Francis, M. B., Interior Surface Modification of Bacteriophage MS2. J Am Chem Soc 2004, 126, 3718-3719.

202. Nguyen, U. T. T.; Cramer, J.; Gomis, J.; Reents, R.; Gutierrez-Rodriguez, M.; Goody, R. S.; Alexandrov, K.; Waldmann, H., Exploiting the Substrate Tolerance of Farnesyltransferase for Site-Selective Protein Derivatization. ChemBioChem 2007, 8, 408-423.

203. Blackman, M. L.; Royzen, M.; Fox, J. M., Tetrazine ligation: fast bioconjugation based on inverse-electron-demand Diels-Alder reactivity. J Am Chem Soc 2008, 130, 13518-9.

204. Devaraj, N. K.; Weissleder, R.; Hilderbrand, S. A., Tetrazine-based cycloadditions: application to pretargeted live cell imaging. Bioconjug Chem 2008, 19, 2297-9.

205. Devaraj, N. K.; Weissleder, R., Biomedical Applications of Tetrazine Cycloadditions. Acc Chem Res 2011.

206. Hilderbrand, S. A., Labels and probes for live cell imaging: overview and selection guide. Methods Mol Biol 2010, 591, 17-45.

207. Kaya, E.; Vrabel, M.; Deiml, C.; Prill, S.; Fluxa, V. S.; Carell, T., A genetically encoded norbornene amino acid for the mild and selective modification of proteins in a copper-free click reaction. Angew Chem Int Ed Engl 2012, 51, 4466-9.

208. Seitchik, J. L.; Peeler, J. C.; Taylor, M. T.; Blackman, M. L.; Rhoads, T. W.; Cooley, R. B.; Refakis, C.; Fox, J. M.; Mehl, R. A., Genetically Encoded Tetrazine Amino Acid Directs Rapid Site-Specific in Vivo Bioorthogonal Ligation with trans-Cyclooctenes. J Am Chem Soc 2012, 134, 2898-2901.

209. Karver, M. R.; Weissleder, R.; Hilderbrand, S. A., Synthesis and Evaluation of a Series of 1,2,4,5-Tetrazines for Bioorthogonal Conjugation. Bioconjug Chem 2011, 22, 2263-2270.

210. Yang, J.; Karver, M. R.; Li, W.; Sahu, S.; Devaraj, N. K., Metal-catalyzed one-pot synthesis of tetrazines directly from aliphatic nitriles and hydrazine. Angew Chem Int Ed Engl 2012, 51, 5222-5.

211. Chen, W.; Wang, D.; Dai, C.; Hamelberg, D.; Wang, B., Clicking 1,2,4,5-tetrazine and cyclooctynes with tunable reaction rates. Chem Commun (Camb) 2012, 48, 1736-1738.

212. Oltra, N. S.; Roelfes, G., Modular assembly of novel DNA-based catalysts. Chem Commun (Camb) 2008, 6039-6041.

213. Ousmer, M.; Boucard, V.; Lubin-Germain, N.; Uziel, J.; Augé, J., Gram-Scale Preparation of a p-(C-Glucopyranosyl)-L-phenylalanine Derivative by a Negishi Cross-Coupling Reaction. Eur J Org Chem 2006, 2006, 1216-1221.

214. Wang, L.; Qu, W.; Lieberman, B. P.; Plössl, K.; Kung, H. F., Synthesis, uptake mechanism characterization and biological evaluation of 18F labeled fluoroalkyl phenylalanine analogs as potential PET imaging agents. Nucl Med Biol 2011, 38, 53-62.

215. Kondo, S.; Hayashi, T.; Sakuno, Y.; Takezawa, Y.; Yokoyama, T.; Unno, M.; Yano, Y., Synthesis of cyclic bis- and trismelamine derivatives and their complexation properties with barbiturates. Org Biomol Chem 2007, 5, 907-16.

216. Robinson, C.; Hartman, R. F.; Rose, S. D., Emollient, humectant, and fluorescent alpha,beta-unsaturated thiol esters for long-acting skin applications. Bioorg Chem 2008, 36, 265-70.

217. Li, P.; Wu, C.; Zhao, J.; Rogness, D. C.; Shi, F., Synthesis of substituted 1H-indazoles from arynes and hydrazones. J Org Chem 2012, 77, 3149-58.

218. Aggarwal, V. K.; Alonso, E.; Bae, I.; Hynd, G.; Lydon, K. M.; Palmer, M. J.; Patel, M.; Porcelloni, M.; Richardson, J.; Stenson, R. A.; Studley, J. R.; Vasse, J. L.; Winn, C. L., A new protocol for the in situ generation of aromatic, heteroaromatic, and unsaturated diazo compounds and its application in catalytic and asymmetric epoxidation of carbonyl compounds. Extensive studies to map out scope and limitations, and rationalization of diastereo- and enantioselectivities. J Am Chem Soc 2003, 125, 10926-40.

219. Rodriguez-Emmenegger, C.; Preuss, C. M.; Yameen, B.; Pop-Georgievski, O.; Bachmann, M.; Mueller, J. O.; Bruns, M.; Goldmann, A. S.; Bastmeyer, M.; Barner-Kowollik, C., Controlled cell adhesion on poly(dopamine) interfaces photopatterned with non-fouling brushes. Adv Mater 2013, 25, 6123-7.

220. Mayer, G.; Heckel, A., Biologically Active Molecules with a “Light Switch”. Angew Chem Int Ed Engl 2006, 45, 4900-4921.

221. Barltrop, J. A.; Schofield, P., 883. Organic photochemistry. Part II. Some photosensitive protecting groups. J Chem Soc 1965, 4758-4765.

222. Patchornik, A.; Amit, B.; Woodward, R. B., Photosensitive protecting groups. J Am Chem Soc 1970, 92, 6333-6335.

223. Rajasekharan Pillai, V. N., Photoremovable Protecting Groups in Organic Synthesis. Synthesis 1980, 1980, 1-26.

224. Klán, P.; Šolomek, T.; Bochet, C. G.; Blanc, A.; Givens, R.; Rubina, M.; Popik, V.; Kostikov, A.; Wirz, J., Photoremovable Protecting Groups in Chemistry and Biology: Reaction Mechanisms and Efficacy. Chem Rev 2012, 113, 119-191.

225. Liu, Q.; Deiters, A., Optochemical Control of Deoxyoligonucleotide Function via a Nucleobase-Caging Approach. Acc Chem Res 2013, 47, 45-55.

226. Gardner, L.; Deiters, A., Light-controlled synthetic gene circuits. Curr Opin Chem Biol 2012, 16, 292-299.

227. Riggsbee, C. W.; Deiters, A., Recent advances in the photochemical control of protein function. Trends Biotechnol 2010, 28, 468-475.

228. Brieke, C.; Rohrbach, F.; Gottschalk, A.; Mayer, G.; Heckel, A., Light-Controlled Tools. Angew Chem Int Ed Engl 2012, 51, 8446-8476.

229. Pelliccioli, A. P.; Wirz, J., Photoremovable protecting groups: reaction mechanisms and applications. Photochem Photobiol Sci 2002, 1, 441-458.

230. Kaplan, J. H.; Forbush, B.; Hoffman, J. F., Rapid photolytic release of adenosine 5'-triphosphate from a protected analog: utilization by the sodium:potassium pump of human red blood cell ghosts. Biochemistry 1978, 17, 1929-1935.

231. Fehrentz, T.; Schönberger, M.; Trauner, D., Optochemical Genetics. Angew Chem Int Ed Engl 2011, 50, 12156-12182.

232. Bochet, C. G., Photolabile protecting groups and linkers. J Chem Soc, Perkin Trans 1 2002, 125-142.

233. Ellis-Davies, G. C., Caged compounds: photorelease technology for control of cellular chemistry and physiology. Nat Methods 2007, 4, 619-28.

234. Fisher, W. G.; Partridge, W. P.; Dees, C.; Wachter, E. A., Simultaneous Two-Photon Activation of Type-I Photodynamic Therapy Agents. Photochem Photobiol 1997, 66, 141-155.

235. Adams, S. R.; Kao, J. P. Y.; Tsien, R. Y., Biologically useful chelators that take up calcium(2+) upon illumination. J Am Chem Soc 1989, 111, 7957-7968.

236. Cameron, J. F.; Frechet, J. M. J., Photogeneration of organic bases from o-nitrobenzyl-derived carbamates. J Am Chem Soc 1991, 113, 4303-4313.

237. Reichmanis, E.; Smith, B. C.; Gooden, R., O-nitrobenzyl photochemistry: Solution vs. solid-state behavior. J Polym Sci Polym Chem Ed 1985, 23, 1-8.

238. Milburn, T.; Matsubara, N.; Billington, A. P.; Udgaonkar, J. B.; Walker, J. W.; Carpenter, B. K.; Webb, W. W.; Marque, J.; Denk, W.; McCray, J. A., Synthesis, photochemistry, and biological activity of a caged photolabile acetylcholine receptor ligand. Biochemistry 1989, 28, 49-55.

239. Gee, K. R.; Niu, L.; Schaper, K.; Hess, G. P., Caged Bioactive Carboxylates. Synthesis, Photolysis Studies, and Biological Characterization of a New Caged N-Methyl-D-aspartic Acid. J Org Chem 1995, 60, 4260-4263.

240. Schaper, K.; Mobarekeh, S. Abdollah M.; Grewer, C., Synthesis and Photophysical Characterization of a New, Highly Hydrophilic Caging Group. Eur J Org Chem 2002, 2002, 1037-1046.

241. Schaper, K.; Madani Mobarekeh, S. A.; Doro, P.; Maydt, D., The α,5-Dicarboxy-2-nitrobenzyl Caging Group, a Tool for Biophysical Applications with Improved Hydrophilicity: Synthesis, Photochemical Properties and Biological Characterization. Photochem Photobiol 2010, 86, 1247-1254.

242. Hasan, A.; Stengele, K.-P.; Giegrich, H.; Cornwell, P.; Isham, K. R.; Sachleben, R. A.; Pfleiderer, W.; Foote, R. S., Photolabile protecting groups for nucleosides: Synthesis and photodeprotection rates. Tetrahedron 1997, 53, 4247-4264.

243. Bühler, S.; Lagoja, I.; Giegrich, H.; Stengele, K.-P.; Pfleiderer, W., New Types of Very Efficient Photolabile Protecting Groups Based upon the [2-(2-Nitrophenyl)propoxy]carbonyl (NPPOC) Moiety. Helv Chim Acta 2004, 87, 620-659.

244. Berroy, P.; Viriot, M. L.; Carré, M. C., Photolabile group for 5′-OH protection of nucleosides: synthesis and photodeprotection rate. Sens Actuators B Chem 2001, 74, 186-189.

245. Walbert, S.; Pfleiderer, W.; Steiner, U. E., Photolabile Protecting Groups for Nucleosides: Mechanistic Studies of the 2-(2-Nitrophenyl)ethyl Group. Helv Chim Acta 2001, 84, 1601-1611.

246. Wöll, D.; Laimgruber, S.; Galetskaya, M.; Smirnova, J.; Pfleiderer, W.; Heinz, B.; Gilch, P.; Steiner, U. E., On the Mechanism of Intramolecular Sensitization of Photocleavage of the 2-(2-Nitrophenyl)propoxycarbonyl (NPPOC) Protecting Group. J Am Chem Soc 2007, 129, 12148-12158.

247. Whiteson, K. L.; Chen, Y.; Chopra, N.; Raymond, A. C.; Rice, P. A., Identification of a potential general acid/base in the reversible phosphoryl transfer reactions catalyzed by tyrosine recombinases: Flp H305. Chem Biol 2007, 14, 121-9.

248. Abu Tarboush, N.; Jensen, L. M.; Feng, M.; Tachikawa, H.; Wilmot, C. M.; Davidson, V. L., Functional importance of tyrosine 294 and the catalytic selectivity for the bis-Fe(IV) state of MauG revealed by replacement of this axial heme ligand with histidine. Biochemistry 2010, 49, 9783-91.

249. Rouzer, C. A.; Marnett, L. J., Cyclooxygenases: structural and functional insights. J Lipid Res 2009, 50 Suppl, S29-34.

250. Arora, A.; Scholar, E. M., Role of tyrosine kinase inhibitors in cancer therapy. J Pharmacol Exp Ther 2005, 315, 971-9.

251. Chiarugi, P.; Buricchi, F., Protein tyrosine phosphorylation and reversible oxidation: two cross-talking posttranslation modifications. Antioxid Redox Signal 2007, 9, 1-24.

252. Koide, S.; Sidhu, S. S., The Importance of Being Tyrosine: Lessons in Molecular Recognition from Minimalist Synthetic Binding Proteins. ACS Chem Biol 2009, 4, 325-334.

253. Young, T. S.; Schultz, P. G., Beyond the canonical 20 amino acids: expanding the genetic lexicon. J Biol Chem 2010, 285, 11039-44.

254. Wang, Q.; Parrish, A. R.; Wang, L., Expanding the genetic code for biological studies. Chem Biol 2009, 16, 323-36.

255. Miller, J. C.; Silverman, S. K.; England, P. M.; Dougherty, D. A.; Lester, H. A., Flash decaging of tyrosine sidechains in an ion channel. Neuron 1998, 20, 619-24.

256. Deiters, A.; Groff, D.; Ryu, Y.; Xie, J.; Schultz, P. G., A genetically encoded photocaged tyrosine. Angew Chem Int Ed Engl 2006, 45, 2728-31.

257. Chou, C.; Young, D. D.; Deiters, A., A light-activated DNA polymerase. Angew Chem Int Ed Engl 2009, 48, 5950-3.

258. Chou, C.; Young, D. D.; Deiters, A., Photocaged T7 RNA polymerase for the light activation of transcription and gene function in pro- and eukaryotic cells. ChemBioChem 2010, 11, 972-7.

259. Edwards, W. F.; Young, D. D.; Deiters, A., Light-activated Cre recombinase as a tool for the spatial and temporal control of gene function in mammalian cells. ACS Chem Biol 2009, 4, 441-5.

260. Chou, C.; Deiters, A., Light-activated gene editing with a photocaged zinc-finger nuclease. Angew Chem Int Ed Engl 2011, 50, 6839-42.

261. Wilkins, B. J.; Marionni, S.; Young, D. D.; Liu, J.; Wang, Y.; Di Salvo, M. L.; Deiters, A.; Cropp, T. A., Site-specific incorporation of fluorotyrosines into proteins in Escherichia coli by photochemical disguise. Biochemistry 2010, 49, 1557-9.

262. Cellitti, S. E.; Jones, D. H.; Lagpacan, L.; Hao, X.; Zhang, Q.; Hu, H.; Brittain, S. M.; Brinker, A.; Caldwell, J.; Bursulaya, B.; Spraggon, G.; Brock, A.; Ryu, Y.; Uno, T.; Schultz, P. G.; Geierstanger, B. H., In vivo incorporation of unnatural amino acids to probe structure, dynamics, and ligand binding in a large protein by nuclear magnetic resonance spectroscopy. J Am Chem Soc 2008, 130, 9268-81.

263. Hunter, T.; Eckhart, W., The discovery of tyrosine phosphorylation: it's all in the buffer! Cell 2004, 116, S35-9, 1 p following S48.

264. Hunter, T., Tyrosine phosphorylation: thirty years and counting. Curr Opin Cell Biol 2009, 21, 140-6.

265. Radu, M.; Semenova, G.; Kosoff, R.; Chernoff, J., PAK signalling during the development and progression of cancer. Nat Rev Cancer 2014, 14, 13-25.

266. Strawn, L. M.; Shawver, L. K., Tyrosine kinases in disease: overview of kinase inhibitors as therapeutic agents and current drugs in clinical trials. Expert Opin Investig Drugs 1998, 7, 553-73.

267. Lemmon, M. A.; Schlessinger, J., Cell signaling by receptor tyrosine kinases. Cell 2010, 141, 1117-34.

268. Blume-Jensen, P.; Hunter, T., Oncogenic kinase signalling. Nature 2001, 411, 355-65.

269. Schmelzle, K.; Kane, S.; Gridley, S.; Lienhard, G. E.; White, F. M., Temporal dynamics of tyrosine phosphorylation in insulin signaling. Diabetes 2006, 55, 2171-9.

270. Ito, A.; Shimokawa, H.; Kadokami, T.; Fukumoto, Y.; Owada, M. K.; Shiraishi, T.; Nakaike, R.; Takayanagi, T.; Egashira, K.; Takeshita, A., Tyrosine kinase inhibitor suppresses coronary arteriosclerotic changes and vasospastic responses induced by chronic treatment with interleukin-1 beta in pigs in vivo. J Clin Invest 1995, 96, 1288-94.

271. Missbach, M.; Jeschke, M.; Feyen, J.; Müller, K.; Glatt, M.; Green, J.; Susa, M., A novel inhibitor of the tyrosine kinase Src suppresses phosphorylation of its major cellular substrates and reduces bone resorption in vitro and in rodent models in vivo. Bone 1999, 24, 437-49.

272. Gallay, P.; Swingler, S.; Aiken, C.; Trono, D., HIV-1 infection of nondividing cells: C-terminal tyrosine phosphorylation of the viral matrix protein is a key regulator. Cell 1995, 80, 379-88.

273. Ptacek, J.; Snyder, M., Charging it up: global analysis of protein phosphorylation. Trends Genet 2006, 22, 545-554.

274. Singh, R. K.; Gunjan, A., Histone tyrosine phosphorylation comes of age. Epigenetics 2011, 6, 153-160.

275. Luo, J.; Li, M.; Tang, Y.; Laszkowska, M.; Roeder, R. G.; Gu, W., Acetylation of p53 augments its site-specific DNA binding both in vitro and in vivo. Proc Natl Acad Sci U S A 2004, 101, 2259-2264.

276. Liu, W. R.; Wang, Y.-S.; Wan, W., Synthesis of proteins with defined posttranslational modifications using the genetic noncanonical amino acid incorporation approach. Mol BioSyst 2011, 7, 38-47.

277. Dastugue, B.; Tichonicky, L.; Kruh, J., Effect of enzymatic phosphorylation of histone on its ability to bind to RNA. Biochimie 1972, 54, 1435-1441.

278. Muir, T. W.; Sondhi, D.; Cole, P. A., Expressed protein ligation: A general method for protein engineering. Proc Natl Acad Sci U S A 1998, 95, 6705-6710.

279. Lu, W.; Shen, K.; Cole, P. A., Chemical Dissection of the Effects of Tyrosine Phosphorylation of SHP-2. Biochemistry 2003, 42, 5461-5468.

280. Zhang, Z.; Shen, K.; Lu, W.; Cole, P. A., The Role of C-terminal Tyrosine Phosphorylation in the Regulation of SHP-1 Explored via Expressed Protein Ligation. J Biol Chem 2003, 278, 4668-4674.

281. Tarrant, M. K.; Cole, P. A., The Chemical Biology of Protein Phosphorylation. Annu Rev Biochem 2009, 78, 797-825.

282. Huang, W.; Erikson, R. L., Constitutive activation of Mek1 by mutation of serine phosphorylation sites. Proc Natl Acad Sci U S A 1994, 91, 8960-3.

283. Huang, L.; Wong, T. Y.; Lin, R. C.; Furthmayr, H., Replacement of threonine 558, a critical site of phosphorylation of moesin in vivo, with aspartate activates F-actin binding of moesin. Regulation by conformational change. J Biol Chem 1999, 274, 12803-10.

284. Léger, J.; Kempf, M.; Lee, G.; Brandt, R., Conversion of serine to aspartate imitates phosphorylation-induced changes in the structure and function of microtubule-associated protein tau. J Biol Chem 1997, 272, 8441-6.

285. Woods, A.; Vertommen, D.; Neumann, D.; Turk, R.; Bayliss, J.; Schlattner, U.; Wallimann, T.; Carling, D.; Rider, M. H., Identification of phosphorylation sites in AMP-activated protein kinase (AMPK) for upstream AMPK kinases and study of their roles by site-directed mutagenesis. J Biol Chem 2003, 278, 28434-42.

286. Szczepanowska, J.; Ramachandran, U.; Herring, C. J.; Gruschus, J. M.; Qin, J.; Korn, E. D.; Brzeska, H., Effect of mutating the regulatory phosphoserine and conserved threonine on the activity of the expressed catalytic domain of Acanthamoeba myosin I heavy chain kinase. Proc Natl Acad Sci U S A 1998, 95, 4146-51.

287. Littlepage, L. E.; Wu, H.; Andresson, T.; Deanehan, J. K.; Amundadottir, L. T.; Ruderman, J. V., Identification of phosphorylated residues that affect the activity of the mitotic kinase Aurora-A. Proc Natl Acad Sci U S A 2002, 99, 15440-5.

288. Xia, F.; Li, J.; Hickey, G. W.; Tsurumi, A.; Larson, K.; Guo, D.; Yan, S. J.; Silver-Morse, L.; Li, W. X., Raf activation is regulated by tyrosine 510 phosphorylation in Drosophila. PLoS Biol 2008, 6, e128.

289. Xie, J.; Supekova, L.; Schultz, P. G., A Genetically Encoded Metabolically Stable Analogue of Phosphotyrosine in Escherichia coli. ACS Chem Biol 2007, 2, 474-478.

290. Rust, H. L.; Subramanian, V.; West, G. M.; Young, D. D.; Schultz, P. G.; Thompson, P. R., Using Unnatural Amino Acid Mutagenesis To Probe the Regulation of PRMT1. ACS Chem Biol 2013.

291. Chin, J. W.; Santoro, S. W.; Martin, A. B.; King, D. S.; Wang, L.; Schultz, P. G., Addition of p-azido-L-phenylalanine to the genetic code of Escherichia coli. J Am Chem Soc 2002, 124, 9026-7.

292. Deiters, A.; Cropp, T. A.; Mukherji, M.; Chin, J. W.; Anderson, J. C.; Schultz, P. G., Adding amino acids with novel reactivity to the genetic code of Saccharomyces cerevisiae. J Am Chem Soc 2003, 125, 11782-3.

293. Serwa, R.; Wilkening, I.; Del Signore, G.; Mühlberg, M.; Claußnitzer, I.; Weise, C.; Gerrits, M.; Hackenberger, C. P. R., Chemoselective Staudinger-Phosphite Reaction of Azides for the Phosphorylation of Proteins. Angew Chem Int Ed Engl 2009, 48, 8234-8239.

294. Takimoto, J. K.; Xiang, Z.; Kang, J.-Y.; Wang, L., Esterification of an Unnatural Amino Acid Structurally Deviating from Canonical Amino Acids Promotes Its Uptake and Incorporation into Proteins in Mammalian Cells. ChemBioChem 2010, 11, 2268-2272.

295. Rautio, J.; Kumpulainen, H.; Heimbach, T.; Oliyai, R.; Oh, D.; Järvinen, T.; Savolainen, J., Prodrugs: design and clinical applications. Nat Rev Drug Discov 2008, 7, 255-70.

296. Bhushan, K. R.; DeLisi, C.; Laursen, R. A., Synthesis of photolabile 2-(2-nitrophenyl)propyloxycarbonyl protected amino acids. Tetrahedron Lett 2003, 44, 8585-8588.

297. Zimmermann, J.; Gundogdu, K.; Cremeens, M. E.; Bandaria, J. N.; Hwang, G. T.; Thielges, M. C.; Cheatum, C. M.; Romesberg, F. E., Efforts toward developing probes of protein dynamics: vibrational dephasing and relaxation of carbon-deuterium stretching modes in deuterated leucine. J Phys Chem B 2009, 113, 7991-4.

298. Miller, C. S.; Corcelli, S. A., Carbon-deuterium vibrational probes of amino acid protonation state. J Phys Chem B 2009, 113, 8218-21.

299. Berthomieu, C.; Hienerwadel, R., Fourier transform infrared (FTIR) spectroscopy. Photosynth Res 2009, 101, 157-70.

300. Su, M.; Wang, J.; Tang, X., Photocaging Strategy for Functionalisation of Oligonucleotides and Its Applications for Oligonucleotide Labelling and Cyclisation. Chemistry 2012, 18, 9628-9637.

301. Akizawa, H.; Imajima, M.; Hanaoka, H.; Uehara, T.; Satake, S.; Arano, Y., Renal brush border enzyme-cleavable linkages for low renal radioactivity levels of radiolabeled antibody fragments. Bioconjug Chem 2013, 24, 291-9.

302. Rothman, D. M.; Vázquez, M. E.; Vogel, E. M.; Imperiali, B., General Method for the Synthesis of Caged Phosphopeptides: Tools for the Exploration of Signal Transduction Pathways. Org Lett 2002, 4, 2865-2868.

303. Tietze, L. F.; Müller, M.; Duefert, S.-C.; Schmuck, K.; Schuberth, I., Photoactivatable Prodrugs of Highly Potent Duocarmycin Analogues for a Selective Cancer Therapy. Chemistry 2013, 19, 1726-1731.

304. Nguyen, D. P.; Mahesh, M.; Elsässer, S. J.; Hancock, S. M.; Uttamapinant, C.; Chin, J. W., Genetic encoding of photocaged cysteine allows photoactivation of TEV protease in live mammalian cells. J Am Chem Soc 2014, 136, 2240-3.

305. Anderson, E.; Brown, T.; Picken, D., Novel photocleavable universal support for oligonucleotide synthesis. Nucleos Nucleot Nucl 2003, 22, 1403-6.

306. Tupler, R.; Perini, G.; Pellegrino, M. A.; Green, M. R., Profound misregulation of muscle-specific gene expression in facioscapulohumeral muscular dystrophy. Proc Natl Acad Sci U S A 1999, 96, 12650-12654.

307. Waterfall, J. J.; Meltzer, P. S., Targeting epigenetic misregulation in synovial sarcoma. Cancer Cell 2012, 21, 323-4.

308. Buée, L.; Bussière, T.; Buée-Scherrer, V.; Delacourte, A.; Hof, P. R., Tau protein isoforms, phosphorylation and role in neurodegenerative disorders. Brain Res Brain Res Rev 2000, 33, 95-130.

309. Wang, Q. T., Epigenetic regulation of cardiac development and function by polycomb group and trithorax group proteins. Dev Dyn 2012, 241, 1021-33.

310. Deiters, A., Oligonucleotides as targets and cellular probes. Bioorg Med Chem 2013, 21, 6099-100.

311. Ordoukhanian, P.; Taylor, J.-S., Design and Synthesis of a Versatile Photocleavable DNA Building Block. Application to Phototriggered Hybridization. J Am Chem Soc 1995, 117, 9570-9571.

312. Govan, J. M.; Lively, M. O.; Deiters, A., Photochemical Control of DNA Decoy Function Enables Precise Regulation of Nuclear Factor κB Activity. J Am Chem Soc 2011, 133, 13176-13182.

313. Govan, J. M.; Uprety, R.; Thomas, M.; Lusic, H.; Lively, M. O.; Deiters, A., Cellular Delivery and Photochemical Activation of Antisense Agents through a Nucleobase Caging Strategy. ACS Chem Biol 2013, 8, 2272-2282.

314. Young, D. D.; Lively, M. O.; Deiters, A., Activation and deactivation of DNAzyme and antisense function with light for the photochemical regulation of gene expression in mammalian cells. J Am Chem Soc 2010, 132, 6183-93.

315. Young, D. D.; Lusic, H.; Lively, M. O.; Yoder, J. A.; Deiters, A., Gene Silencing in Mammalian Cells with Light-Activated Antisense Agents. ChemBioChem 2008, 9, 2937-2940.

316. Deiters, A.; Garner, R. A.; Lusic, H.; Govan, J. M.; Dush, M.; Nascone-Yoder, N. M.; Yoder, J. A., Photocaged morpholino oligomers for the light-regulation of gene function in zebrafish and Xenopus embryos. J Am Chem Soc 2010, 132, 15644-50.

317. Connelly, C. M.; Uprety, R.; Hemphill, J.; Deiters, A., Spatiotemporal control of microRNA function using light-activated antagomirs. Mol BioSyst 2012, 8, 2987-2993.

318. Govan, J. M.; Uprety, R.; Hemphill, J.; Lively, M. O.; Deiters, A., Regulation of Transcription through Light-Activation and Light-Deactivation of Triplex-Forming Oligonucleotides in Mammalian Cells. ACS Chem Biol 2012, 7, 1247-1256.

319. Schoch, J.; Wiessler, M.; Jäschke, A., Post-synthetic modification of DNA by inverse-electron-demand Diels-Alder reaction. J Am Chem Soc 2010, 132, 8846-7.

320. Šečkutė, J.; Yang, J.; Devaraj, N. K., Rapid oligonucleotide-templated fluorogenic tetrazine ligations. Nucleic Acids Res 2013, 41, e148.

321. Georgianna, W. E.; Lusic, H.; McIver, A. L.; Deiters, A., Photocleavable Polyethylene Glycol for the Light-Regulation of Protein Function. Bioconjug Chem 2010, 21, 1404-1407.

322. Lusic, H.; Deiters, A., A New Photocaging Group for Aromatic N-Heterocycles. Synthesis-Stuttgart 2006, 8, 2147-2150.

323. Ryu, E. K.; Choe, Y. S.; Lee, K. H.; Choi, Y.; Kim, B. T., Curcumin and dehydrozingerone derivatives: synthesis, radiolabeling, and evaluation for beta-amyloid plaque imaging. J Med Chem 2006, 49, 6111-9.
