Unravelling the targets of electrophilic natural products in cancer with chemical proteomics

By James Andrew Clulow

A thesis submitted to Imperial College London in candidature for the degree of Doctor of Philosophy of Imperial College

Department of Chemistry Imperial College London Exhibition Road London SW7 2AZ

May 2015

Declaration of originality

This thesis is my own work and reports the results of my original research. Where information derives from the work of others or via collaboration with others, this is acknowledged in the text and references.

James Clulow, May 2015

Copyright declaration

The copyright of this thesis rests with the author and is made available under a Creative Commons Attribution Non-Commercial No Derivatives licence. Researchers are free to copy, distribute or transmit the thesis on the condition that they attribute it, that they do not use it for commercial purposes and that they do not alter, transform or build upon it. For any reuse or redistribution, researchers must make clear to others the licence terms of this work.

Abstract

Electrophilic natural products found in dietary sources, such as curcumin, piperlongumine and sulforaphane, have attracted considerable interest on account of their broad range of biological activities, leading to their assessment as therapeutics for a number of diseases. Despite extensive research, the mode of action and biological targets of these compounds remain poorly understood. These compounds are clearly not ‘single target’ molecules; dissecting their complex polypharmacology to determine the key targets and pathways presents a major challenge, and has limited progress in the clinic.

In this study, a chemical proteomics approach using activity-based probes (ABPs) based on these small molecules has been applied to profile their molecular targets in breast cancer cellular systems, identifying for the first time the range and relative importance of the targets to which these molecules bind covalently, across the entire system and in an unbiased way. Hundreds of high-confidence targets have been identified, providing the most comprehensive target set for curcumin, piperlongumine and sulforaphane to date. Translation of these targets to the mode(s) of action displayed by these compounds reveals new mediators that help to explain their anticancer effects. The limited target information previously available has been a major hindrance in determining how best to apply such electrophilic natural products as therapeutics. These studies address this gap and help to clarify the underlying mechanisms of curcumin, piperlongumine and sulforaphane, as well as of electrophilic natural products more generally.

Acknowledgements

There have been too many people who deserve thanks for helping me throughout the PhD process, who I will do my best to acknowledge in the remainder of this page. Like every PhD project it has had its ups and its downs but throughout it all I have been extremely fortunate to have had an interesting and challenging project, excellent supervision and the privilege of working with and alongside some excellent scientists. I have had some amazing experiences as part of the PhD, meeting a range of people along the way, all of whom have contributed in different ways to this final thesis.

I would firstly like to thank my supervisors, Ed Tate (Imperial College London) and Lyn Jones (Pfizer) for their help and support. Ed, I could not have asked for better supervision or support. Thank you for encouraging me to make the most of the PhD experience. I truly appreciate all the time you have given me over the last 4 years or so. It has been a privilege to be a member of your research group. Lyn, thank you for maintaining an interest in the project from afar and for hosting me when I came to Boston in February last year, it was an incredible trip and one of the highlights of my PhD.

Secondly, there are a number of people to thank for providing experimental support. For providing and maintaining cell culture facilities, thank you to David Mann, Mira Novakova and Andrew Coulson. Thanks to Lisa Haigh as part of the mass spectrometry service for allowing access to the QExactive mass spectrometer and providing maintenance support. Special thanks to Remi, Goska and Manue for troubleshooting proteomic data analysis with MaxQuant and Perseus. For an introduction to Cytoscape and network analysis, thanks to Konstantinos Mitsopoulos from the ICR for initially showing me the ropes. Thanks to Saphia for synthesising some of the electrophile probes as part of a UROP summer placement. Thanks to Lisa for the original synthesis of sulforaphane ABP 1 and 2 and for laying the foundations of the profiling work into sulforaphane. For synthesis of communal capture reagents that I used throughout the PhD, thank you to Megan, Naoko, Jenny, Julia, Goska and Lisa. Finally, thanks to Lyn for providing many of the compounds that came from Pfizer.

Thirdly, thank you to all members of the Tate group past and present, especially office 538, for providing a vibrant and friendly working environment. Further thanks to Anna, Remi, Goska, Scott, Jenny and Julia from the group for taking the time to proofread this thesis. Special thanks to Paulina, Tom and Naoko, alongside whom I started in the Tate group back in 2010 and who have often provided advice, support or a general gossip all the way through the PhD.

Finally, I would like to thank all my friends and family, most importantly Mum, Dad, Chris, Emma and Charlotte. Mum and Dad, I would not be at this stage if it had not been for your continued support in everything that I do. Hopefully this final thesis will at least provide a reference for what I have been up to the last few years of my life! Very finally, Lucy (you’re mentioned by name I’ll hasten to add), there is insufficient space to thank you for all the support you have given me, sorry it’s taken me longer than anticipated to finish this thesis, you’ve kept me going through thick and thin and I look forward to spending the rest of my life with you.

Abbreviations

15-PGJ2 – 15-deoxy-Δ(12,14)-prostaglandin J2
2D-GE – Two-dimensional gel electrophoresis
4-HNE – 4-Hydroxynonenal
6-HITC – 6-methylsulfinylhexyl isothiocyanate
ABP – Activity-based probe
ABPP – Activity-based protein profiling
AC – Acetylenic chalcone
ACADL – Long-chain specific acyl-CoA dehydrogenase
ACAT1 – Acetyl-CoA acetyltransferase
ACR – Acrylamide
AE – Acetylenic enone
AITC – Allyl isothiocyanate
AKT – Protein kinase B
ALDH – Aldehyde dehydrogenase
ALDO – Aldo/keto reductase
AMBIC – Ammonium bicarbonate
ANT – Adenine nucleotide translocase
AP-1 – Activator protein 1
AQUA – Absolute quantification (peptides)
AR – Androgen receptor
ARE – Antioxidant response element
ASK1 – Apoptosis signal-regulating kinase 1 (MAP3K5)
ATP – Adenosine triphosphate
ATP2A2 – Sarcoplasmic/endoplasmic reticulum calcium ATPase 2
AURKA – Aurora kinase A
AXL – Tyrosine-protein kinase receptor UFO
AzRB – Azido-biotin capture reagent with arginine-containing cleavage site
AzRB2 – Azido-biotin capture reagent with arginine-containing cleavage site
AzT – Azido-TAMRA capture reagent
AzTB – Azido-TAMRA-biotin capture reagent
BA – Benzaldehyde
Bcl – B-cell lymphoma
BID – BH3 interacting-domain death agonist
BioGRID – Database of protein and genetic interactors
BITC – Benzyl isothiocyanate
BODIPY – Boron dipyrromethene
BQ – Benzoquinone
BSA – Bovine serum albumin
CA – Chloroacetamide
CACC – Calcium activated chloride channel
CBR1 – Carbonyl reductase 1
CDA – Cytidine deaminase
CDK – Cyclin-dependent kinase
CES1 – Carboxylesterase 1
CETSA – Cellular thermal shift assay
CHEK1 – Serine/threonine-protein kinase Chk1
CI – Combination index
CLIC1 – Chloride intracellular channel protein 1
CMK – Chloromethylketone
COX – Cyclooxygenase
CuAAC – Copper catalysed azide alkyne cycloaddition

CURC – Curcumin
Cy3 – Cyanine 3 dye
Cys – Cysteine
dH2O – Distilled water
DARTS – Drug affinity responsive target stability
DAVID – Database for Annotation, Visualisation and Integrated Discovery
DCM – Dichloromethane
DFNA5 – Deafness, Autosomal Dominant 5
DIPEA – N,N-Diisopropylethylamine
DMEM – Dulbecco’s modified Eagle’s medium
DMF – Dimethylformamide
DMSO – Dimethyl sulfoxide
DNMT1 – DNA (cytosine-5)-methyltransferase 1
DTT – Dithiothreitol

EC50 – Half maximal effective concentration
ECH – Erythroid cell-derived protein with CNC homology
EDTA – Ethylenediaminetetraacetic acid
EGF – Epidermal growth factor
EGFR – Epidermal growth factor receptor
ELISA – Enzyme-linked immunosorbent assay
ENO – Enolase
ES – Electrospray
ESR1 – Estrogen receptor
ESI – Electrospray ionization
EtOAc – Ethyl acetate
FA – Feruloyl acetone
FAS – Fatty acid synthase
FASP – Filter-aided sample preparation
FBS – Foetal bovine serum
FDA – Food and Drug Administration
FDR – False discovery rate
FGF – Fibroblast growth factor
FRET – Fluorescence resonance energy transfer
FXR1 – Fragile X mental retardation syndrome-related protein 1
GAPDH – Glyceraldehyde 3-phosphate dehydrogenase
GCLC – γ-Glutamylcysteine catalytic subunit
GLO1 – Glyoxalase 1
GO – Gene ontology
GR – Glucocorticoid receptor
GSH – Glutathione
GSR – Glutathione reductase
GST – Glutathione S-transferase
h – Hours
H/L – ‘Heavy’/‘Light’ SILAC ratio
H/M – ‘Heavy’/‘Medium’ SILAC ratio
HAT-1 – Histone acetyltransferase 1
HCCS – Cytochrome c-type heme lyase
HDAC – Histone deacetylase
HEPES – 4-(2-hydroxyethyl)-1-piperazineethanesulfonic acid
HER2 – Receptor tyrosine-protein kinase erbB-2
HIP/HOP – Haploinsufficient profiling/homozygous deletion profiling
HO-1 – Heme oxygenase 1

HPLC – High performance liquid chromatography
HPRD – Human protein reference database
HRMS – High resolution mass spectrometry
HRP – Horseradish peroxidase
HSD11B1 – Corticosteroid 11-beta-dehydrogenase isozyme 1
HSF – Heat shock factor
HSP – Heat shock protein
IA – Iodoacetamide
IAP – Inhibitor of apoptosis

IC50 – Half maximal inhibitory concentration
ICAT – Isotope coded affinity tags
ICPL – Isotope-coded protein label
IκB – Nuclear factor of kappa light polypeptide gene enhancer in B-cells inhibitor
IKK – IκB kinase
IL – Interleukin
IMPDH2 – Inosine-5'-monophosphate dehydrogenase 2
isoTOP-ABPP – Isotope tandem orthogonal proteolysis activity-based protein profiling
ITC – Isothiocyanate
iTRAQ – Isobaric tag for relative and absolute quantification
JNK – c-JUN N-terminal kinase
KEAP1 – Kelch-like ECH-associated protein 1
KEGG – Kyoto encyclopaedia of genes and genomes
KHSRP – KH-type splicing regulatory protein
LC – Liquid chromatography
LC-MS – Liquid chromatography coupled with mass spectrometry
LC-MS/MS – Liquid chromatography coupled with tandem mass spectrometry
LCMT1 – Leucine carboxyl methyltransferase 1
LDH – Lactate dehydrogenase
LFQ – Label-free quantification
[M+…] – Molecular ion plus
M/L – ‘Medium’/‘Light’ SILAC ratio
m/z – Mass to charge ratio
MA – Maleic anhydride
MALDI – Matrix-assisted laser desorption/ionization
MAPK – Mitogen-activated protein kinase
MARCKS – Myristoylated alanine-rich C kinase substrate
MES – 2-[N-Morpholino]ethanesulfonic acid
MeOH – Methanol
MIF – Macrophage migration inhibitory factor
Min – Minutes
MINT – Molecular interaction database
Mol. – Moles
MOPS – 3-[N-Morpholino]propanesulfonic acid
mRNA – Messenger ribonucleic acid
MS – Mass spectrometry
MS/MS or MS2 – Tandem mass spectrometry
MTS – 3-(4,5-Dimethylthiazol-2-yl)-5-(3-carboxymethoxyphenyl)-2-(4-sulfophenyl)-2H-tetrazolium, inner salt
MTT – 3-(4,5-Dimethylthiazol-2-yl)-2,5-diphenyltetrazolium bromide
Mw – Molecular weight
NAC – N-acetyl cysteine
NADPH – Nicotinamide adenine dinucleotide phosphate

NC – Curcumin
n.d. – Not determined
NEM – N-ethylmaleimide
NF-κB – Nuclear factor kappa-light-chain-enhancer of activated B protein
NFKB1 – Nuclear factor NF-kappa-B p105/p50 subunit
NFKB2 – Nuclear factor NF-kappa-B p100/p52 subunit
NIT2 – Omega-amidase NIT2
NMR – Nuclear magnetic resonance
NQO2 – NAD(P)H dehydrogenase, quinone 2
NR3C1 – Glucocorticoid receptor
Nrf2 – Nuclear factor (erythroid-derived 2)-like 2
NT5DC1 – 5'-nucleotidase domain-containing protein 1
PARP – Poly (ADP-ribose) polymerase
PBS – Phosphate-buffered saline
PC – Mono-O-propylcurcumin
PD – Pulldown (affinity enrichment of proteins)
PDI – Protein disulfide isomerase
PEG – Polyethylene glycol
PEITC – Phenethyl isothiocyanate
PGAM1 – Phosphoglycerate mutase 1
PI – Piperine
PI3K – Phosphatidylinositol-3-kinase
PIP – Piperlongumine
PLK1 – Polo-like kinase 1
PKA/B/C/R – Protein kinase A/B/C/R
PMS – Phenazine methosulfate
PPAR – Peroxisome proliferator-activated receptor
PPD – Pre-pulldown (affinity enrichment of proteins)
PR – Progesterone receptor
PRDX – Peroxiredoxin
PRMT1 – Protein arginine N-methyltransferase 1
PTK – Protein tyrosine kinase
PTM – Post-translational modification
PTP – Protein tyrosine phosphatase
PVDF – Polyvinylidene difluoride
R0K0 – ¹⁴N₄¹²C₆-arginine and ¹⁴N₂¹²C₆-lysine containing cell culture media
R6K4 – ¹³C₆-arginine and D₄-lysine containing cell culture media
R10K8 – ¹⁵N₄¹³C₆-arginine and ¹⁵N₂¹³C₆-lysine containing cell culture media
RB1 – Retinoblastoma protein 1
RES – Resveratrol
RLP – Ribosomal-like protein
RNAi – RNA interference
ROCK1 – Rho-associated protein kinase 1
ROS – Reactive oxygen species
Rpm – Revolutions per minute
RPS – Ribosomal protein S
RSK – Ribosomal s6 kinase
Rt – Retention time
RTN – Reticulon
RT-PCR – Real-time polymerase chain reaction
s – Seconds
SAR – Structure activity relationship

SDS – Sodium dodecyl sulfate
SDS-PAGE – Sodium dodecyl sulfate polyacrylamide gel electrophoresis
SF1 – Splicing factor 1
SILAC – Stable isotope labelling with amino acids in cell culture
siRNA – Small interfering ribonucleic acid
SLB – NuPAGE® LDS sample loading buffer
SN – Supernatant of pulldown (affinity enrichment of proteins)
SPR – Surface plasmon resonance
SRC – Proto-oncogene tyrosine-protein kinase Src
SRPK1 – SRSF protein kinase 1
STAT – Signal transducers and activators of transcription protein
STK3/4 – Serine/threonine-protein kinase 3/4
STRING – Search Tool for the Retrieval of Interacting Genes/Proteins
SULF – Sulforaphane
TAMRA – 5-Carboxytetramethylrhodamine
TBS – Tris-buffered saline
TBST – Tris-buffered saline-Tween
TBTA – Tris[(1-benzyl-1H-1,2,3-triazol-4-yl)methyl]amine
TCEP – Tris(2-carboxyethyl)phosphine
TEV – Tobacco Etch Virus
TH – Theophylline
THC – Tetrahydrocurcumin
THF – Tetrahydrofuran
THL – Tetrahydrolipstatin
THP – Tetrahydropiperlongumine
TLC – Thin layer chromatography
TLR – Toll-like receptor
TNF – Tumour necrosis factor
TOP2A – DNA topoisomerase 2A
Tris – Tris(hydroxymethyl)aminomethane
tRNA – Transfer ribonucleic acid
TXN – Thioredoxin
TXNRD1 – Thioredoxin reductase 1
Ub – Ubiquitin
UGDH – UDP-glucose 6-dehydrogenase
UVRAG – UV radiation resistance-associated gene protein
VDAC – Voltage-dependent anion channel
Vol. – Volumes
WB – Western blot
XPO1 – Exportin 1
YWHAZ – 14-3-3 protein zeta/delta

Publications and presentations arising from the thesis

1 Publication

• Kalesh, K.A., Clulow, J.A. and Tate, E.W. Target profiling of zerumbone using a novel cell-permeable clickable probe and quantitative chemical proteomics. Chem. Commun. (Camb.) 51, 5497-5500 (2015).

6 Oral presentations

• J. A. Clulow et al. “Unravelling the targets of electrophilic natural products using quantitative activity-based chemical proteomics”, Imperial College London Department of Chemistry Postgraduate Symposium, Imperial College London, July 2014.
• J. A. Clulow et al. “Unravelling the targets of electrophilic natural products using quantitative activity-based chemical proteomics”, Biochemical Society Redox Regulation in Health and Disease, Royal Society, Edinburgh, March 2014.
• J. A. Clulow et al. “Unravelling the targets of electrophilic natural products using quantitative activity-based chemical proteomics”, Pfizer invited talk, Cambridge, Massachusetts, USA, February 2014.
• J. A. Clulow et al. “Unravelling the targets of electrophilic natural products using quantitative activity-based chemical proteomics”, Keystone Symposium: Chemistry and Biology of Cell Death, Santa Fe, USA, February 2014.
• J. A. Clulow et al. “Unravelling the targets of electrophilic natural products using quantitative activity-based chemical proteomics”, Pfizer Chemistry Symposium, Pfizer Neusentis, Cambridge, November 2013.
• J. A. Clulow et al. “Getting a chemical handle on the biological targets of electrophilic natural products”, 2013 Joint Centre for Doctoral Training Conference, London, June 2013.

14 Poster presentations

• J. A. Clulow et al. “Unravelling the targets of electrophilic natural products using quantitative activity-based chemical proteomics”, Chemical Biology and Molecular Medicine Symposium, Li Ka Shing Centre, Cambridge, October 2014.
• J. A. Clulow et al. “Unravelling the targets of electrophilic natural products using quantitative activity-based chemical proteomics”, RSC Chemical Biology Postgraduate Symposium, Warwick, April 2014.
• J. A. Clulow et al. “Unravelling the targets of electrophilic natural products using quantitative activity-based chemical proteomics”, EU COST meeting 2014, Trinity College, Cambridge, March 2014.
• J. A. Clulow et al. “Unravelling the targets of electrophilic natural products using quantitative activity-based chemical proteomics”, SET for Britain 2014, House of Commons, London, March 2014.
• J. A. Clulow et al. “Unravelling the targets of electrophilic natural products using quantitative activity-based chemical proteomics”, 7th Biological and Medicinal Chemistry Postgraduate Symposium, Cambridge, December 2013. Prize awarded.
• J. A. Clulow et al. “Unravelling the targets of electrophilic natural products using quantitative activity-based chemical proteomics”, RSC Organic Division Poster Symposium, Burlington House, London, December 2013.
• J. A. Clulow et al. “Unravelling the targets of electrophilic natural products using quantitative activity-based chemical proteomics”, London Chemical Biology Symposium, Imperial College London, November 2013.

• J. A. Clulow et al. “Unravelling the targets of electrophilic natural products using quantitative activity-based chemical proteomics”, Imperial College London Graduate School Summer Research Symposium, July 2013.
• J. A. Clulow et al. “Unravelling the targets of electrophilic natural products using quantitative activity-based chemical proteomics”, Department of Chemistry Postgraduate Symposium, Imperial College London, July 2013.
• J. A. Clulow et al. “Getting a chemical handle on the biological targets of electrophilic natural products”, 47th ESBOC Conference, Gregynog, Wales, May 2013.
• J. A. Clulow et al. “Getting a chemical handle on the biological targets of electrophilic natural products”, 6th Biological and Medicinal Chemistry Postgraduate Symposium, Cambridge, December 2012. Prize awarded.
• J. A. Clulow et al. “Getting a handle on the molecular targets of interesting, small molecule natural products”, Chemistry of the Cell 5, Oxford, September 2012.
• J. A. Clulow et al. “Getting a handle on the molecular targets of interesting, small molecule natural products”, 2012 Joint Centre for Doctoral Training Conference, Warwick, May 2012.
• J. A. Clulow et al. “Profiling the downstream targets of potent activators of the Nrf2/HO-1 pathway using a chemical proteomics approach”, Biochemical Society Hot Topic Meeting: Nrf2 in Health and Disease, Charles Darwin House London, December

Contents

1. Introduction ...... 18

1.1 Biological electrophiles ...... 18

1.1.1 Introduction ...... 18

1.1.2 Electrophilic natural products ...... 19

1.1.3 Endogenous electrophiles to mammalian systems ...... 21

1.1.4 Key electrophile mediated signalling pathways ...... 22

1.1.4.1 Kelch-like ECH-associated protein 1 (KEAP1) ...... 22

1.1.4.2 Nuclear factor κB (NF-κB) ...... 22

1.1.4.3 Peroxisome proliferator-activated receptor (PPARγ) ...... 23

1.1.4.4 Heat shock proteins (HSPs) ...... 23

1.1.4.5 Redox regulatory systems ...... 23

1.1.4.6 Histone deacetylases (HDACs) ...... 24

1.1.4.7 Kinases and phosphatases (MAPKs and PTPs) ...... 24

1.1.4.8 Other targets of interest ...... 24

1.1.5 The therapeutic potential of electrophiles ...... 25

1.2 Methods for target elucidation ...... 30

1.2.1 Overview of small-molecule target identification ...... 30

1.2.2 Direct biochemical methods ...... 33

1.2.2.1 Affinity chromatography ...... 33

1.2.2.2 Activity-based protein profiling (ABPP) ...... 33

1.2.2.3 2D gel electrophoresis ...... 34

1.2.2.4 Developing technologies (DARTS, CETSA and TICC) ...... 35

1.2.2.5 Protein microarrays ...... 35

1.2.3 Genetic methods ...... 35

1.2.4 Computational methods ...... 36

1.2.5 Mass spectrometry (MS)-based proteomics ...... 37

1.2.5.1 Introduction ...... 37

1.2.5.2 Top-down and bottom-up proteomics ...... 37

1.2.5.3 Shotgun proteomics ...... 37

1.2.5.4 Quantitative proteomics ...... 40

1.2.6 Target validation and further elucidation ...... 42

1.2.7 Case studies of electrophile target profiling ...... 44

1.2.7.1 Quantitative protein target profiling of electrophilic natural products ...... 45

1.2.7.2 Global proteome profiling of reactive cysteines ...... 46

1.2.8 Conclusions ...... 48

1.3 Three electrophilic natural products of interest ...... 49

1.3.1 Curcumin ...... 51

1.3.1.1 Overview ...... 51

1.3.1.2 Molecular targets ...... 52

1.3.1.3 Summary ...... 53

1.3.2 Sulforaphane ...... 54

1.3.2.1 Overview ...... 54

1.3.2.2 Molecular targets ...... 54

1.3.2.3 Summary ...... 55

1.3.3 Piperlongumine ...... 55

1.3.3.1 Overview ...... 55

1.3.3.2 Molecular targets ...... 56

1.3.3.3 Summary ...... 57

1.3.4 Conclusions ...... 57

2. Previous work and project aims ...... 60

3. Design and chemical synthesis of ABPs of electrophilic natural products ...... 64

3.1 Design and synthesis of curcumin ABPs ...... 65

3.2 Design and synthesis of sulforaphane ABPs ...... 67

3.3 Design and synthesis of piperlongumine ABPs ...... 70

3.4 Confirmation of the anticancer activity of ABPs ...... 70

4. Initial application of ABPs to cells and cell lysates ...... 74

4.1 In-cell application of ABPs and proteomic identification of targets ...... 74

4.1.1 In-gel fluorescence analysis of ABPs treated to MCF7 and MDA-MB-231 cell lines ...... 74

4.1.2 Stability of ABP-protein adducts in the MDA-MB-231 cell line ...... 76

4.1.3 Proteomic identification of targets of the ABPs in the MDA-MB-231 cell line ...... 79

4.2 In-cell competition of ABPs against parent compounds and other electrophiles ...... 82

4.3 Proteomic target identification of ABPs competed against parent compound with duplex SILAC ...... 86

4.4 In-lysate competition of ABPs against parent compounds and other electrophiles ...... 90

4.5 Proteomic target comparison of ABPs competed against parent compound in-lysate and in-cell with duplex SILAC ...... 92

4.6 Summary and conclusions ...... 97

5. Quantitative, concentration gradient of competition-based chemical proteomics ...... 99

5.1 Optimisation of a ‘spike-in’ SILAC approach into the chemical proteomic workflow ...... 99

5.2 Comprehensive protein target profiling of sulforaphane ...... 103

5.3 Comprehensive protein target profiling of curcumin and piperlongumine ...... 115

5.3.1 Target identification overview in MDA-MB-231 cell line ...... 115

5.3.2 Target profiling of curcumin ...... 118

5.3.3 Target profiling of piperlongumine ...... 125

5.3.4 Target profiling of THC and THP ...... 129

5.4 Target validation with Western blotting ...... 131

5.5 Overlap of targets between curcumin, piperlongumine and sulforaphane ...... 135

5.6 Identification of the site of modification of ABPs ...... 140

5.7 Chemical proteomics with other small molecule electrophile ABPs ...... 144

5.7.1 Initial application and comparison of small molecule electrophile ABPs ...... 144

5.7.2 Quantitative proteomic target identification of small molecule electrophile ABPs with triplex SILAC ...... 148

5.8 Summary and conclusions ...... 150

6. Exploring combinations of electrophilic natural products with other cancer therapeutics ...... 154

6.1 Introduction ...... 154

6.2 Assessment of effects of cancer therapeutics on breast cancer cell viability ...... 155

6.3 Combination of cancer therapeutics with electrophilic natural products ...... 156

6.4 Conclusions ...... 162

7. Conclusions and future work ...... 163

7.1 Conclusions ...... 163

7.1.1 New probes for profiling electrophilic natural products ...... 163

7.1.2 Identification of the comprehensive target set of electrophilic natural products ...... 163

7.1.3 Synergistic combinations of electrophilic natural products with cancer therapeutics ...... 165

7.1.4 Wider implications of the work ...... 165

7.2 Future work ...... 166

8. Materials and methods ...... 170

8.1 Chemical synthesis ...... 170

8.1.1 General procedures ...... 170

8.1.2 Feruloyl acetone (5-hydroxy-1-(4-hydroxy-3-methoxyphenyl)-1,4-hexadien-3-one) ...... 170

8.1.3 4-methoxy-3-propargyloxy-benzaldehyde ...... 171

8.1.4 Curcumin ABP 1 ((1E,6E)-1-(4-hydroxy-3-methoxyphenyl)-7-(3-methoxy-4-(prop-2-ynyloxy)phenyl)hepta-1,6-diene-3,5-dione) ...... 171

8.1.5 3-propargyloxy-4-hydroxybenzaldehyde ...... 172

8.1.6 Curcumin ABP 2 (1,7-[3-methoxy-4-hydroxyphenyl][3-butynyloxy-4-hydroxyphenyl]hepta-1,6-diene-3,5-dione) ...... 172

8.1.7 3-propargyl-2,4-pentanedione ...... 173

8.1.8 Curcumin ABP 3 ((1E,4Z,6E)-3-hydroxy-1,7-bis(4-hydroxy-3-methoxyphenyl)-4-(prop-2-ynyl)hepta-1,4,6-trien-5-one) ...... 173

8.1.9 2-(4-chloro-butyl)-2-methyl-[1,3]dioxolane ...... 174

8.1.10 S-ethyl-(prop-2-ynyl)-[4-(2-methyl-[1,3]dioxolan-2-yl)-butyl]-thiocarbamate ...... 174

8.1.11 S-ethyl-(prop-2-ynyl)-(5-oxo-hexyl)-thiocarbamate ...... 175

8.1.12 Sulforaphane ABP 2 (S-ethyl-(prop-2-ynyl)-(5-oxo-hexyl)-thiocarbamate sulfoxide) ...... 175

8.1.13 Sulforaphane ABP 3 (4-butynylsulfinyl-1-(isothiocyanate)-butane) ...... 176

8.1.14 (E)-1-(3-(4-hydroxy-3,5-dimethoxyphenyl)acryloyl)-5,6-dihydropyridin-2(1H)-one ...... 177

8.1.15 Piperlongumine ABP ((E)-1-(3-(4-O-propynyl-3,5-dimethoxyphenyl)acryloyl)-5,6-dihydropyridin-2(1H)-one) ...... 178

8.1.16 Tetrahydrocurcumin (1,7-bis(4-hydroxy-3-methoxyphenyl)heptane-3,5-dione) ...... 178

8.1.17 Tetrahydropiperlongumine (1-[3-(3,4,5-trimethoxyphenyl)propionyl]piperidin-2-one) ...... 179

8.1.18 Acetylenic enone ABP (4-(3-methoxy-4-prop-2-ynyloxy-phenyl)-but-3-en-2-one) ...... 179

8.1.19 Acetylenic chalcone ABP (3-(4-hydroxy-phenyl)-1-(3-methoxy-4-prop-2-ynyloxy-phenyl)-propenone) ...... 179

8.1.20 Benzaldehyde ABP (4-O-propynyl-3-methoxy-cinnamaldehyde) ...... 180

8.1.21 N-ethylmaleimide ABP (1-prop-2-ynylpyrrole-2,5-dione) ...... 180

8.1.22 Chloroacetamide ABP (2-chloranyl-N-hex-5-ynyl-ethanamide) ...... 181

8.1.23 Acrylamide ABP (N-propargyl acrylamide) ...... 181

8.2 Biological and biochemical methods ...... 182

8.2.1 General methods ...... 182

8.2.2 Cancer cell culture ...... 182

8.2.3 Cell lysis ...... 183

8.2.3.1 Whole cell protein lysis buffer (for in-cell applications) ...... 183

8.2.3.2 Non-detergent or low-detergent cell protein lysis buffer (for in-lysate applications) ...... 183

8.2.4 In-cell compound treatment ...... 183

8.2.4.1 Single ABP compound dosing ...... 183

8.2.4.2 ABP competition assay dosing ...... 184

8.2.5 In-lysate compound treatment ...... 184

8.2.5.1 Single ABP compound dosing ...... 184

8.2.5.2 ABP competition assay dosing ...... 184

8.2.6 Gel electrophoresis ...... 184

8.2.7 Click chemistry (CuAAC) and in-gel fluorescence ...... 185

8.2.8 Affinity enrichment of ABP-labelled proteins ...... 185

8.2.9 Western blot analysis ...... 186

8.2.10 Cell viability assays (MTS assay) ...... 187

8.2.10.1 Single compound assays ...... 187

8.2.10.2 Compound combination assays ...... 188

8.3 Chemical proteomics ...... 188

8.3.1 General methods ...... 188

8.3.2 Compound treatment and lysate preparation ...... 189

8.3.2.1 Initial target identification of ABPs ...... 189

8.3.2.2 Duplex SILAC-based competition assays of ABPs against their parent compounds ...... 189

8.3.2.3 Duplex SILAC-based competition assays of ABPs against their parent compounds comparing in-cell, in-lysate and in-cell/lysate targets ...... 190

8.3.2.4 ‘Spike-in’ SILAC-based optimisation experiment with sulforaphane ABP 2 ...... 191

8.3.2.5 ‘Spike-in’ SILAC-based ABP competition assays for sulforaphane ...... 192

8.3.2.6 ‘Spike-in’ SILAC-based ABP competition assays for curcumin and piperlongumine ...... 193

8.3.2.7 Triplex SILAC-based target identification of small molecule electrophile ABPs ...... 195

8.3.3 CuAAC, affinity enrichment and on-bead reduction, alkylation and trypsin digest ...... 195

8.3.4 Trypsin digest of protein lysates ...... 196

8.3.5 LC-MS/MS runs ...... 196

8.3.6 LC-MS/MS data analysis ...... 197

8.3.6.1 Quantitative analysis ...... 197

8.3.6.1.1 Duplex SILAC data analysis ...... 197

8.3.6.1.2 Triplex SILAC data analysis ...... 197

8.3.6.1.3 ‘Spike-in’ SILAC analysis ...... 198

8.3.6.2 Site of modification analysis ...... 198

8.3.7 Bioinformatic and network analysis ...... 199

8.3.7.1 Bioinformatics ...... 199

8.3.7.2 Network analysis with Cytoscape ...... 199

9. References ...... 200

10. Appendices ...... 223

1. Introduction

1.1 Biological electrophiles

1.1.1 Introduction

The broad definition of an electrophile is a molecule having one or more electron-poor atoms, which can accept electrons from electron-rich donors (known as nucleophiles). In nature, the roles of nucleophiles and electrophiles are distinctly separated, owing to the lack of intrinsic, highly electrophilic groups in the building blocks that are essential for life. An array of nucleophilic functionality is contained within proteins and DNA, as well as in ubiquitous low molecular weight nucleophiles such as the tripeptide glutathione (GSH). Electrophiles, on the other hand, are less frequently found and are often produced as small-molecule by-products of biological processes under conditions of cellular stress.

Proteins are one of the main targets for reaction with electrophiles in a biological system (Figure 1). Protein reactivity is governed by the broad range of nucleophiles present on the side chains of amino acid residues such as cysteine, lysine, threonine, serine, tyrosine, histidine, asparagine, arginine, selenocysteine, aspartate and glutamate. Cysteine, one of the least abundant amino acids, is considered to be the most reactive on account of the high nucleophilicity of its thiol group, which arises from the large atomic radius of sulfur and the low dissociation energy of the S-H bond. Moreover, it is found to concentrate in functionally important regions of proteins.1, 2 This functional importance stems from a number of unique physicochemical properties of cysteine. Firstly, the thiol group allows cysteine to perform both nucleophilic and redox-active functions that are not feasible for any other amino acid. Secondly, the pKa of the thiol group is close to physiological pH, so the ionisation state of cysteine is highly dependent on its microenvironment. The pKa of a surface-exposed thiol is ~8.5; however, catalytic cysteines in active sites can have a pKa as low as 2.5, making them highly reactive towards electrophiles.3 The reactivity of the thiol can therefore be tuned according to function, allowing cysteine to carry out a diverse set of roles and distinguishing one cysteine residue from the next.4, 5 However, with a lack of consensus motifs and parameters to conclusively assign function to an individual cysteine within the proteome, determining functional cysteines on a proteome-wide scale has remained a challenge.
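The dependence of ionisation state on pKa can be illustrated with a short calculation. The sketch below is not part of the original work: it applies the Henderson-Hasselbalch relationship to the two pKa values quoted above, treating the thiol as a simple monoprotic acid and assuming a physiological pH of 7.4 (a value chosen here for illustration). The function name `thiolate_fraction` is hypothetical.

```python
def thiolate_fraction(pka: float, ph: float = 7.4) -> float:
    """Fraction of a thiol present as the deprotonated thiolate.

    Uses the Henderson-Hasselbalch relationship for a monoprotic acid:
    [S-]/([SH] + [S-]) = 1 / (1 + 10**(pKa - pH)).
    The default pH of 7.4 is an assumed physiological value.
    """
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


# pKa values taken from the text above
examples = [("surface-exposed thiol", 8.5), ("low-pKa catalytic cysteine", 2.5)]
for label, pka in examples:
    print(f"{label}: pKa {pka}, thiolate fraction {thiolate_fraction(pka):.4f}")
```

Under these assumptions, only around 7% of a surface-exposed thiol (pKa ~8.5) is present as the more nucleophilic thiolate, whereas a cysteine with a pKa of 2.5 is almost entirely deprotonated. This simple model ignores every microenvironment effect other than the pKa shift itself.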

Other amino acid residues are also capable of reacting with electrophiles. The hydroxyl groups of serine, threonine and tyrosine, the amino groups of lysine and histidine, and the acid groups of aspartate and glutamate all confer nucleophilic properties on these amino acids. Whilst generally considered to be less nucleophilic than cysteine, these amino acids will react with electrophiles. Their relative hardness means that they often prefer reaction with harder electrophiles, which are less frequently found in nature.6 The biological function and significance of modification on these amino acids has, however, been far less explored relative to cysteine.

[Figure 1 image: chemical structures of the electrophiles 4-hydroxynonenal (4-HNE), 15-deoxy-Δ12,14-prostaglandin J2 (15d-PGJ2), N-ethylmaleimide (NEM), iodoacetamide (IA), acrolein, curcumin, fosfomycin, piperlongumine, acivicin, zerumbone, tetrahydrolipstatin, wortmannin and andrographolide, drawn around a protein target bearing a nucleophilic amino acid (X). Nucleophilic amino acids (X): serine, cysteine, threonine, histidine, tyrosine, aspartate, lysine and glutamate.]

Figure 1. A selected panel of electrophilic compounds. Electrophiles can be non-native (iodoacetamide (IA) and N-ethylmaleimide (NEM)) or native to biological systems. Native electrophiles include those found in mammalian systems that act as electrophile signalling molecules (4-HNE, 15d-PGJ2, acrolein) as well as those found in other organisms that have been shown to possess therapeutic potential (curcumin, piperlongumine, zerumbone, andrographolide, wortmannin, tetrahydrolipstatin, acivicin and fosfomycin).

1.1.2 Electrophilic natural products

Nature has provided a plethora of compounds that have inspired the generation of therapeutics, and rendered an enormous contribution to the treatment of disease. Life forms that lack immune systems biosynthesise natural products of unparalleled structural diversity for the purpose of self-defence and adaptation to environmental stresses. These compounds have clearly not been originally produced for human use. Some of these natural products show exquisite modulation of biological function. This has led to their assessment as agents to combat human disease with a minority being exploited for medicinal purposes. In a comprehensive review, Newman and Cragg showed that drugs created from natural products or containing a natural pharmacophore accounted for over half of the drugs that received clinical approval between 1981 and 2006.7 The proportion is even higher when just considering antibacterial and anticancer compounds, highlighting the value of natural products to modern drug discovery.

Mechanistically, natural products elicit their molecular actions by interacting with a cellular target, perturbing the normal signalling cascade(s) in a cell or tissue, leading to a higher level biological response that is advantageous in tackling a particular disease. A significant proportion of biologically active natural products are endowed with electrophilic moieties within their chemical scaffold. These so-called electrophilic natural products are produced across all forms of life. Many of them have shown therapeutic promise, displaying a remarkable array of biological activities that has led to intense research interest.8 The covalent nature of electrophilic natural product engagement with their targets has greatly facilitated the identification of those targets. In addition to their medicinal application, electrophilic natural products are also useful as probe molecules to interrogate the complex proteome and analyse protein function on a global scale. This is typified by one of the landmark investigations in the field, in which the use of the epoxide-containing natural product trapoxin led to the discovery of the first histone deacetylase enzyme.9, 10 This electrophilic natural product also helped establish histone deacetylases as new drug targets against cancer.

The ability of many electrophilic natural products to form covalent adducts with nucleophiles, particularly proteins, in a biological system is believed to be the fundamental driver behind their biological effects. This can include changes in: tertiary and quaternary structures, catalytic activities that depend on reactive nucleophilic amino acids, subcellular relocalisation, charge and hydrophobicity, and in the case of bifunctional electrophilic species, protein cross-linking. These effects on protein targets can translate to alterations in , transcriptional regulation, cytoskeletal function, ion and macromolecule transportation, to name but a few. The modulation of such a variety of cellular signalling processes simultaneously by electrophilic natural products, many of which are highly relevant for several major diseases, makes them promising therapeutic candidates.

A wide range of electrophilic moieties have been observed within electrophilic natural products (Figure 1) including Michael acceptor systems (comprising all α,β-unsaturated acceptor entities), ring strained electrophiles (such as epoxides and lactones) along with a plethora of others (carbamates and esters). These offer subtle differences in reactivity towards nucleophiles and as such give rise to strikingly different targets and biological effects. Although cysteine is predominantly the most reactive amino acid residue within proteins, the large structural and reactive diversity of electrophilic natural products means that many compounds have been shown to covalently modify other amino acid residues (highlighted in Table 1). Understanding and being able to predict electrophile reactivity under biological conditions is challenging. Furthermore, the non-covalent contribution of an electrophilic natural product should not be overlooked. For many electrophilic natural products, selectivity for target binding will be governed by non-covalent interactions of the chemical scaffold with its target. This explains how two compounds with the same electrophilic motif may target very different subsets within the proteome.

Reviewing the hundreds of tested and promising electrophilic natural products that have had their targets profiled and/or have therapeutic potential is beyond the scope of this discussion. However, a comprehensive overview of the target profiling of 45 electrophilic species, including 35 electrophilic natural products, applied to mammalian systems is presented in Table 1. A number of excellent reviews from the last decade go further in highlighting the progress that has been made in this area.8, 11-14

1.1.3 Endogenous electrophiles to mammalian systems

Electrophilic small molecule species are not limited to other organisms; mammalian systems also produce a range of endogenous electrophiles. The post-translational modification(s) (PTM(s)) of proteins are a fundamental mechanism of cell signalling that modulates protein structure and function. Common, well-studied PTMs include phosphorylation, acetylation and ubiquitinylation; however, the number of known PTMs within the cell arsenal is ever growing. One such PTM is the electrophilic modification of cysteine residues. Unsaturated fatty acids are susceptible to diverse oxidation and radical addition reactions that result in the formation of electrophilic by-products during both basal metabolism and inflammatory responses.15-17 The PTM of proteins by electrophilic lipids occurs predominantly by S-alkylation of protein thiols, with other nucleophilic amino acid residues being less favourable targets. Some of the better characterised lipid electrophiles include 4-HNE, 15d-PGJ2 and nitro-oleic acid (OA-NO2). The generation of such species is often associated with the pathology of disease, including neurodegeneration, cancer, Alzheimer’s disease and chronic inflammation,18, 19 and these species have proved useful as biomarkers for inflammation and the extent of oxidative stress. It was these observations that led to the belief that such electrophilic species, and their covalent modification of biological nucleophiles, may drive cellular toxicity. However, this simplified assumption now appears to be outdated.20 The diversity of electrophiles and their relative concentrations seem to dictate their effects, such that toxicity is not simply a function of the rate of electrophile-target interactions. The functional significance of endogenous signalling electrophiles is therefore more widely appreciated, with factors such as the downstream metabolism of electrophilic species, the actions of their metabolic by-products and the reversibility of their conjugates playing important roles in the effects they exert.21

Many of these lipid electrophiles are soft electrophiles, explaining their preferential reaction with cysteine residues. They readily react with biological nucleophiles by Michael addition (the addition of a nucleophile to an alkene or alkyne that is conjugated to an electron withdrawing group).22 From a growing body of work, it is now appreciated that organisms have developed complex and sensitive signalling mechanisms that react and respond to different types of electrophiles, and that there is a high degree of selectivity for electrophilic PTM reactions.16 These mechanisms rely on a library of redox-sensing proteins, in particular transcriptional regulatory proteins, which define the pattern of gene expression in response to electrophilic stimuli. Current data support the view that, like most PTMs, electrophile signalling is a dynamic process. As a result of their innate reactivity, electrophiles are expected to react and signal proximally to their sites of generation, resulting in localised responses. The understanding of this PTM is only in its infancy but, given the complexity of the signalling network, has been greatly aided in recent years by advances in high performance liquid chromatography-mass spectrometry (HPLC-MS) and affinity-chemistry based strategies.

1.1.4 Key electrophile mediated signalling pathways

Having shown that electrophiles can covalently bind to nucleophilic amino acids within proteins to exert functional effects, discussion of the key signalling pathways influenced by electrophiles is warranted. It is not possible to summarise all of the protein targets and pathways known to associate with electrophilic species here. The continued development of MS technologies has allowed hundreds, if not thousands, of electrophile-modified proteins to be identified, and the functional implications for many of these targets are yet to be unravelled.23 An overview of the most well validated and functionally important pathways influenced by electrophiles is given, to provide key insights into their functional effects. It should also be noted that functionally significant targets of electrophiles are not limited to proteins. Reaction with small molecule thiols and reductants such as GSH reduces reductive capacity inside the cell,24, 25 whilst reaction with DNA can also drive functional effects.26

1.1.4.1 Kelch-like ECH-associated protein 1 (KEAP1)

Activation of the antioxidant response element (ARE), a DNA-binding region that lies upstream of a variety of phase II and antioxidant genes, leads to the expression of enzymes responsible for detoxification (including glutathione-S-transferases (GSTs), glutamate-cysteine ligase (GCLC) and heme oxygenase 1 (HO-1)). The transcription of ARE-driven genes is regulated, at least in part, by the nuclear factor (erythroid-derived 2)-like 2 (Nrf2) transcription factor. Under basal conditions, Nrf2 is sequestered in the cytoplasm by its association with its repressor Kelch-like ECH-associated protein 1 (KEAP1). KEAP1 targets Nrf2 for ubiquitination by a Cul3-dependent ubiquitin ligase complex, which subsequently leads to Nrf2 degradation by the 26S proteasome.27 Covalent modification or oxidation of specific cysteine residues on KEAP1 disrupts the KEAP1-mediated association with Nrf2, leading to Nrf2 accumulation/activation whereby it translocates to the nucleus, resulting in Nrf2-dependent ARE gene expression. As such, the cysteine residues on KEAP1 (27 in total, of which 9 are believed to be reactive) act as sensors for the redox status of cells.28 In particular, Cys151, Cys273 and Cys288 have been identified as critical cysteines responsible for electrophile-dependent regulation of KEAP1.29-31 Many electrophilic species have been shown to modify KEAP1 through these and other cysteine residues.32 This has led to the proposal of a so-called ‘cysteine code’ on KEAP1, whereby different electrophiles produce specific patterns of KEAP1 cysteine modification, governed by the physical nature of the electrophile, which may lead to different responses.33 The KEAP1-Nrf2 signalling axis therefore provides a broad cytoprotective response towards disruption of cellular homeostasis by extrinsic and intrinsic stresses, and is a key target of signalling electrophiles.

1.1.4.2 Nuclear factor κB (NF-κB)

The electrophile-dependent modulation of nuclear factor κB (NF-κB) signalling represents one of the best defined mechanisms accounting for the anti-inflammatory activities of electrophiles. NF-κB, a ubiquitous transcription factor, controls the gene expression of a variety of proinflammatory mediators, most notably inducible nitric oxide synthase (iNOS), cyclooxygenase-2 (COX-2) and tumour necrosis factor (TNF). NF-κB is a heterodimer composed of two subunits, p50 and p65. Under basal conditions, NF-κB is regulated by its endogenous inhibitor IκB, which complexes with NF-κB, sequesters its activity and keeps it localised in the cytoplasm. In response to appropriate stimuli, NF-κB activation occurs sequentially through activation of IκB kinase (IKK) and phosphorylation of IκB, which releases NF-κB to translocate to the nucleus and induce gene expression.34, 35 A complex mechanism of action for the electrophile-dependent regulation of NF-κB signalling has been reported, with electrophiles acting at multiple levels. Disruption of IKK activation and direct binding of electrophiles to the p65 subunit of NF-κB have both been reported.36, 37

1.1.4.3 Peroxisome proliferator-activated receptor γ (PPARγ)

Peroxisome proliferator-activated receptor γ (PPARγ) is a member of the nuclear hormone receptor superfamily. PPARγ forms a heterodimer with retinoid X receptor (RXR) and regulates the expression of genes involved in a variety of functions.38 The regulatory domain of PPARγ contains a large, generally hydrophobic domain that can accommodate a broad array of lipophilic ligands. Contained within this domain is a cysteine residue (Cys285) that is liable to covalent modification by electrophiles, resulting in the activation of PPARγ.39-41 The saturation kinetics that typically govern ligand binding do not apply to a covalent interaction; therefore very low concentrations of an electrophile can bind and activate PPARγ cumulatively, resulting in significantly higher activation of PPARγ than non-electrophilic counterparts.42 The downstream effects of PPARγ activation are yet to be elucidated in detail and, given the pluripotent signalling actions of electrophiles, it remains a challenge to separate the functional response to PPARγ activation from that of other signalling cascades.
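The contrast between saturable reversible binding and cumulative covalent modification can be sketched numerically. The example below is purely illustrative and is not taken from the cited studies: it assumes a simple one-site equilibrium for the reversible ligand and a standard two-step irreversible model (k_obs = k_inact[E]/(K_I + [E])) for the electrophile, with constant electrophile concentration and no receptor turnover. All parameter values and function names are hypothetical.

```python
import math


def reversible_occupancy(conc_um: float, kd_um: float) -> float:
    """Equilibrium fractional occupancy for a reversible ligand (one-site model)."""
    return conc_um / (conc_um + kd_um)


def covalent_occupancy(conc_um: float, k_inact_per_s: float,
                       ki_um: float, t_s: float) -> float:
    """Fraction of target covalently modified after t_s seconds.

    Two-step irreversible model: k_obs = k_inact * [E] / (K_I + [E]),
    modified fraction = 1 - exp(-k_obs * t). Assumes constant [E] and
    no synthesis or degradation of the target.
    """
    k_obs = k_inact_per_s * conc_um / (ki_um + conc_um)
    return 1.0 - math.exp(-k_obs * t_s)


# Hypothetical parameters chosen only to illustrate the principle.
CONC_UM = 0.1        # ligand or electrophile concentration, 100-fold below Kd / K_I
KD_UM = 10.0         # reversible dissociation constant
KI_UM = 10.0         # non-covalent affinity of the electrophile
K_INACT = 0.01       # maximal rate of covalent bond formation, per second

print(f"reversible ligand at equilibrium: {reversible_occupancy(CONC_UM, KD_UM):.3f}")
for hours in (1, 4, 12):
    fraction = covalent_occupancy(CONC_UM, K_INACT, KI_UM, hours * 3600)
    print(f"covalent electrophile after {hours:>2} h: {fraction:.3f}")
```

Under these assumptions the reversible ligand never occupies more than about 1% of the receptor, whereas the covalently modified fraction continues to rise with time at the same concentration, which is the behaviour described above for electrophilic PPARγ ligands.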

1.1.4.4 Heat shock proteins (HSPs)

The heat shock protein (HSP) family assists in recovery from cellular stresses, either by repairing damaged proteins or by assisting in their elimination. Key family members include HSP27, HSP70 and HSP90. The expression level of HSPs is controlled predominantly at the transcriptional level by the transcription factor HSF1. In the basal state, HSF1 is sequestered in the cytoplasm by association with HSP70 and HSP90.43 The current working hypothesis is that adduct formation of electrophiles with HSPs may release HSF1, resulting in the activation of the heat shock response.44

1.1.4.5 Redox regulatory systems

Thiol oxidoreductase family members utilise redox-active cysteine residues to catalyse thiol/disulfide exchange reactions.45 Peroxiredoxins (PRDXs) are thiol-dependent antioxidant proteins that play a crucial role in redox regulation, protection against oxidative stress, regulation of redox-sensitive transcription factors and refolding of disulfide-containing proteins inside the cell. PRDXs are maintained in the reduced state through the thioredoxin system, which consists of thioredoxin (TXN) and thioredoxin reductase (TXNRD1). The activities of TXN, TXNRD1 and PRDX depend on both catalytic and structural cysteine residues, and both types of residue can be modified by a wide range of electrophiles.46-48 Electrophilic modification disrupts the PRDX-TXN system, resulting in the activation of redox-sensitive pathways and often triggering apoptosis.49, 50

1.1.4.6 Histone deacetylases (HDACs)

Histone deacetylases (HDACs) regulate transcription by modulating chromatin structure, which in turn determines accessibility of DNA to transcription factors. HDACs remove acetyl groups from histone lysine residues, promoting DNA condensation and inhibiting transcription. HDACs are modulated by the redox status of the cell because of the presence of a number of reactive cysteine residues that can be oxidised or modified by electrophiles.51, 52 Alkylation of HDACs blocks their association with histones thereby maintaining transcriptional activation of HDAC-repressed genes such as HO-1 and HSP70. This mechanism therefore works collaboratively with other transcriptional regulatory mechanisms such as Nrf2, HSF1 and PPARγ to contribute to transcriptional reprogramming within the cell in response to the electrophilic stimuli.

1.1.4.7 Kinases and phosphatases (MAPKs and PTPs)

Protein kinases and phosphatases play an integral role in controlling signal transduction pathways through their ability to phosphorylate and dephosphorylate target proteins, resulting in their subsequent activation or inactivation. Both kinases and phosphatases are critical targets of electrophiles. The mitogen-activated protein kinases (MAPKs) form a signalling cascade involved in cell proliferation, differentiation and apoptosis. MAPK signalling encompasses the extracellular signal-regulated protein kinase (ERK) pathway, the c-Jun N-terminal kinase (JNK) pathway, and the p38 MAPK pathway. These kinase cascades are themselves classically regulated through phosphorylation by other kinases, but there is strong evidence emerging that MAPKs are also sensitive to electrophile-dependent regulation.53-55 More recent work has identified reactive cysteine residues across a broad range of the 518 known human kinases that also have the potential to be modified by electrophiles, including fibroblast growth factor receptor (FGFR), epidermal growth factor receptor (EGFR), ribosomal S6 kinase (RSK) and apoptosis signal-regulating kinase 1 (ASK1).56, 57 Protein tyrosine phosphatases (PTPs) counter-balance the effect of protein tyrosine kinases (PTKs) by hydrolysing the phosphate group from phospho-tyrosine residues. PTPs have a conserved catalytic cysteine that exists as a thiolate ion at physiological pH and is therefore highly susceptible to reaction with electrophilic species.58, 59

1.1.4.8 Other targets of interest

Mitochondria generate a significant amount of O2•- and H2O2 as by-products of the respiratory electron transfer chain. This environment promotes the formation of electrophilic lipids as a result of fatty acid oxidation and nitration. Mitochondria are therefore a rich source of electrophilic signalling mediators within the cell and have evolved to be highly susceptible to reaction with electrophiles, with many mitochondrial proteins containing functionally significant electrophile-reactive cysteines.60 Furthermore, the matrix pH inside mitochondria is higher during coupled respiration, resulting in an increased proportion of cysteine residues existing as thiolate anions, which promotes their reactivity. Electrophiles have been shown to impact respiratory chain function within mitochondria by adduct formation with a variety of proteins including cytochrome c oxidase,61 α-ketoglutarate dehydrogenase (KGDH) and pyruvate dehydrogenase (PDH),62, 63 uncoupling proteins (UCP1, UCP2 and UCP3),64 adenine nucleotide translocase (ANT) and the mitochondrial ATP-sensitive potassium channel (mitoKATP).65 Although electrophiles have a negative impact on energy metabolism within the cell under physiological conditions, in many circumstances this can provide protective effects by decreasing the generation of ROS and subsequently easing the oxidative stress burden.

Electrophiles also have a functional impact on glycolysis. Glycolysis is a process that generates energy in the form of ATP within the cell. Glycolytic rates are modulated in response to energy requirements and the redox state of the cell at all levels (transcriptional, translational and post-translational). Post-translational regulation offers rapid activity switches in response to the environment and metabolic demand. In addition to phosphorylation-dependent regulation of glycolytic enzymes, many such enzymes also contain electrophile-sensitive cysteine and histidine residues that have been reported to be modified by electrophiles. Well-documented examples include enolases (ENO),66 lactate dehydrogenase (LDH),66 fructose-bisphosphate aldolase (ALDO),67 pyruvate kinase M2 (PKM2),68 neuronal glucose transporter GLUT3,69 and glyceraldehyde-3-phosphate dehydrogenase (GAPDH).70 Other targets of electrophiles include enzymes that contain catalytic cysteine residues, of which there are many. Reported targets include cysteine proteases (e.g. caspases and cysteine cathepsins), ubiquitin and , and protein arginine deiminases.4

1.1.5 The therapeutic potential of electrophiles

Not every electrophile reported acts on the targets and signalling pathways highlighted above, but they represent many of the major pathways targeted by endogenous and exogenous electrophiles. As alluded to earlier, subtle differences in electrophilic species can have dramatic effects on their target profiles and subsequent biological effects, which are only beginning to be understood. However, an inherent property of electrophiles seems to be their ability to simultaneously modulate multiple metabolic and inflammatory signalling processes that are heavily implicated in disease, including Nrf2, PPARγ, HSF1 and NF-κB. This culminates in an ability to shift inflammatory and redox signalling inside the cell from propagation or injury mechanisms to resolution. It is this that has sparked research interest in an electrophile-based therapeutic strategy, which may have the potential to address complex diseases where a multi-targeted effect may well be required. It is the exogenous electrophilic natural products not found in mammalian systems that are of most interest from a therapeutic point of view. However, an understanding of endogenous electrophile signalling is essential before recommendations can be made towards the application of electrophilic natural products as therapeutics, as many of the same principles hold true for both classes of electrophiles.

Related to electrophile-based therapeutics is the resurgence of covalent drugs within recent years in the pharmaceutical industry.71 Covalent drugs are electrophilic or pro-electrophilic compounds that form a covalent adduct with their target. The design of covalent inhibitors that employ electrophilic motifs as potential drugs is conceptually very attractive. It allows for lower drug doses, longer duration of action at the target site, reduced sensitivity to pharmacokinetic parameters and the potential to avoid some resistance mechanisms. However, in practice the design and implementation of a covalent drug is very hard to achieve in drug development. The pharmaceutical industry has avoided covalent inhibitors as a drug class owing to the potential toxicity associated with protein adduct formation. An increase in understanding of organ toxicity from reactive metabolites in the 1970s and 1980s led to a backlash against covalent-modifying compounds, which carry a higher risk of toxicity if a compound lacks specificity. Further concerns include the potential immunogenicity of protein adducts, leading to an allergic response or drug hypersensitivity reaction.

Despite this, approximately one-third of all enzyme targets for which there is a drug on the market have an example of an approved covalent drug.72 Looking forward, the improved understanding of electrophile signalling in combination with the emergence of new techniques for assessing electrophilic reactivity has led to the improvement of covalent drug programs and the recent success stories of covalent drugs progressing through the clinic (e.g. EGFR,73 BTK,74 FAAH75 and MetAP276). Therefore there is great promise, not only for covalent drugs but for the application of electrophilic compounds more generally as future drug molecules.

Table 1. A summary of protein target profiling of electrophilic species carried out in mammalian model systems. The table covers endogenous electrophilic lipids (entries 1-4), electrophilic tool molecules used for global profiling of a variety of electrophilic motifs (entries 5-10) and electrophilic natural products with interesting biological activities (entries 11-45). The electrophilic motif contained within each compound, the number of protein targets or main target(s) identified and the profiling methodology are presented. The table is constructed by combining data curated from reviews from the Sieber,8, 11 Weerapana,6, Breinbauer,14 Waldmann,77 and Peuchault13 groups, in addition to other relevant references from the literature. Where multiple techniques have been used to profile the same compound, the target identification data is summarised and appropriate references cited. PD and WB = pulldown and Western blot, ABPP = activity-based protein profiling, 2D-GE = labelled probe coupled with 2D gel electrophoresis, HIP = haploinsufficiency profiling and HOP = homozygous deletion profiling.

Entry | Compound | Electrophilic Motif | Protein Target(s) | Target Identification Approach | Notes | Reference
1 | 4-hydroxynonenal (4-HNE) | α,β-unsaturated carbonyl | 500-800 targets (including HDACs, HSPs, KEAP1, NF-κB and PPARγ) | Chemical proteomics (ABPP) | In-cell profiling | 51, 78, 79
2 | 15-deoxy-Δ(12,14)-prostaglandin J2 (15d-PGJ2) | α,β-unsaturated carbonyl | Multiple targets (including TXN, KEAP1, IKKβ, NF-κB, h-Ras, UCHL1 and HDAC1) | Chemical proteomics (ABPP), PD and WB | In-cell profiling | 16, 48, 79
3 | Acrolein | α,β-unsaturated carbonyl | > 10 targets (including TXN, PTP1B, NF-κB) | PD and WB | Target profiles curated over multiple separate studies | 16
4 | 10-Nitro-octadecenoic acid (OA-NO2) | Nitroalkene | > 20 targets (including KEAP1, NF-κB, GAPDH and PPARγ) | PD and WB | Target profiles curated over multiple separate studies | 16
5 | Iodoacetamide (IA) | Aliphatic halogen | 250-1000 targets (including ACAT1, CLIC1, ECH13, ECH19, GSTO1 and PRMT1) | Chemical proteomics (ABPP) | Lysate profiling; cysteine selective but lysine labelling at higher concentrations | 80-83
6 | N-ethylmaleimide (NEM) | Maleimide | 350-650 targets (including GAPDH, PRDX6, tubulin, TOP2A and TXN) | Chemical proteomics (ABPP) | Lysate profiling; cysteine selectivity | 80, 81, 83
7 | Phenylsulfonyl esters | Sulfonyl ester | 37 targets (including ACADL, ECH1 and HSD11B1) | Chemical proteomics (ABPP) | Lysate profiling; site-specific identifications; label across cysteine, tyrosine, aspartate and glutamate residues | 84
8 | α-chloroacetamides | Aliphatic halogen | 74 targets (including CLIC4, GSTO1, NIT2, UGDH) | Chemical proteomics (ABPP) | Lysate profiling; site-specific identifications; labelling selectivity for cysteine residues | 84
9 | α,β-unsaturated ketones | α,β-unsaturated carbonyl | 197 targets (including ALDH1, LDHB, UB2DE2) | Chemical proteomics (ABPP) | Lysate profiling; site-specific identifications; labelling selectivity for cysteine residues | 84
10 | Spiroepoxide Biomimetic Library (fumagillin-like) | Epoxide | Single major target elucidated (PGAM1) | Chemical proteomics (ABPP) | In-cell labelling; labelling on lysine residue (Lys100) | 85, 86
11 | E-64 | Epoxide | Cathepsin family members | Chemical proteomics (ABPP) | Lysate profiling; labels catalytic cysteine residue | 87
12 | Epoxomicin | Epoxide | Proteasome (β-catalytic subunits) | PD and WB | Lysate labelling | 88, 89
13 | Fumagillin | Epoxide | Single major target (methionine aminopeptidase 2 (MetAP)) | PD and WB; Edman sequencing | Labelling on histidine residue | 90
14 | Pladienolide B | Epoxide | Single major target (SF3b subunit 3 (SAP310)) | Chemical proteomics (ABPP) | In-cell labelling; undetermined modification site | 91
15 | Sulforaphane | Isothiocyanate | 49 targets (including 14-3-3 proteins, , HSPs, tubulin and vimentin) | Radiolabelling and 2D-GE | In-cell labelling; further validation required on target set | 92
16 | Phenethyl isothiocyanate | Isothiocyanate | > 50 targets (including tubulin, actin, vimentin, HSP90) | Chemical proteomics (ABPP), radiolabelling and 2D-GE | In-cell labelling; further validation required on target set | 92, 93
17 | 6-methylsulfinylhexyl isothiocyanate (6-HITC) | Isothiocyanate | 10 targets (HSP90, HSP70, HSP60, KEAP1, GSTP1, α-tubulin, β-actin, GAPDH) | Chemical proteomics (ABPP) | In-cell labelling | 94
18 | PX-12 | Disulfide | Single major target (TXN) | Labelling on recombinant protein | Cys73 is site of modification | 95
19 | Acivicin | 3-chlorodihydroisoxazole ring | ALDH4A1 identified as major target. Carboxylesterase 1 (CES1) also identified as secondary target. | Chemical proteomics (ABPP) | In-cell labelling; binds to active site cysteine of ALDH enzymes. | 96, 97
20 | Gedunin | α,β-unsaturated carbonyl and epoxide | HSP90 | Computational approach (Connectivity Map) | Related in mechanism to 17-AAG and 17-DMAG | 98
21 | Withaferin A | 2 x α,β-unsaturated carbonyl, 1 x epoxide | Single major target (Vimentin) | Chemical proteomics (ABPP) | Lysate labelling; Cys328 identified as modification site | 99
22 | Wortmannin | α,β-unsaturated carbonyl | PI3K family members | Chemical proteomics (ABPP) | In-cell labelling; PLK1 identified as target at higher concentrations | 100, 101
23 | Microcystins | α,β-unsaturated carbonyl | Serine/Threonine protein phosphatases (PP1, PP2A, PP4, PP6) | Chemical proteomics (ABPP) | Lysate labelling; Cys273 of PP1 and Cys266 of PP2A modified | 102
24 | Parthenolide | α,β-unsaturated carbonyl | Single major target (IKKβ) | Chemical proteomics (affinity pulldown) | Cys179 identified as modification site | 103
25 | Phoslactomycin A | α,β-unsaturated carbonyl | Serine/Threonine phosphatases (PP2Ac, PP2Acβ, PP6, PP2A) | PD and WB | In-cell labelling; Cys269 of PP2A modified | 104
26 | Pironetin | α,β-unsaturated carbonyl | Single major target (α-tubulin) | PD and WB | In-cell labelling; Lys352 identified as modification site | 105
27 | Leptomycin B | 2 x α,β-unsaturated carbonyl | Single major target (exportin-1 (XPO1)) | PD and WB | In-cell labelling; extremely low concentration applied (10 nM); Cys529 identified as modification site | 106, 107
28 | Cyclostreptin | α,β-unsaturated carbonyl | Couple of major targets (β-tubulin and CES1) | Chemical proteomics (ABPP) | In-cell labelling; Thr220 and Asn228 identified as modification sites on β-tubulin | 108, 109
29 | Syringolin A | α,β-unsaturated carbonyl | 20S proteasome (β-subunits) | Chemical proteomics (ABPP) | β2 and β5 subunits targeted; modification on threonine likely | 110, 111
30 | Avrainvillamide | α,β-unsaturated carbonyl | Major target identified (nucleophosmin), with 4 additional targets (GR, HSP60, PRDX1 and XPO1) | Chemical proteomics (ABPP) | Lysate labelling; Cys275 of nucleophosmin is modification site | 112, 113
31 | Andrographolide | α,β-unsaturated carbonyl | Major target identified (NFKB1) with over 75 additional targets (including β-actin, MYH9, NPM1 and YWHAZ) | Chemical proteomics (ABPP) | In-cell labelling | 114, 115
32 | Tetrahydrolipstatin (Orlistat) | α,β-unsaturated carbonyl (β-Lactone) | Single major target (pancreatic lipases (such as FAS)) | Chemical proteomics (ABPP) | In-cell labelling; FDA-approved drug compound; number of ‘off-targets’ have been identified | 116, 117
33 | Hardwickiic acid (HAA) | α,β-unsaturated carbonyl | Major target identified (HSP27) with 85 additional targets (including CLIC1) | Chemical proteomics (affinity pulldown) | Lysate labelling | 118
34 | Oridonin | α,β-unsaturated carbonyl | 4 potential targets (CFL1, ENO1, HSP701A and PRDX1) | Chemical proteomics (ABPP) | Lysate labelling | 119
35 | Zerumbone | 2 x α,β-unsaturated carbonyl | > 100 targets (including DFNA5, CDA, UVRAG, LCMT1, NT5DC1) | Chemical proteomics (ABPP) | In-cell labelling; spike-in SILAC approach, 151 high confidence targets identified | 120
36 | Bisindolylmaleimide III | α,β-unsaturated carbonyl | > 100 targets (including PKCα, GSK3β, CDK2, SRPK1, PKR, NQO2, PKAC-α, VDAC) | Chemical proteomics (ABPP), FLAG-tagged probe | Lysate labelling; SILAC-based approach | 121, 122
37 | Piperlongumine | 2 x α,β-unsaturated carbonyl | 12 major targets identified involved in redox processes (including GSTO1, GSTP1, CBR1 and GLO1) | Chemical proteomics (affinity chromatography) | Lysate labelling; SILAC-based approach | 123
38 | Tunicamycin | 2 x α,β-unsaturated carbonyl | Single major target (DPAGT1) with additional potential targets (ALG7p, HAC1, GFA1) | HIP/HOP | - | 124
39 | Brefeldin A | α,β-unsaturated carbonyl | Single major target (GBF1) | HIP | - | 125
40 | Quercetin | α,β-unsaturated carbonyl | 4 targets identified (hnRNPA1, vinculin, nucleolin, elongation factor 1α) | Chemical proteomics (affinity chromatography) | Lysate labelling | 126
41 | Pateamine A | α,β-unsaturated carbonyl | Single major target (eIF4A) in addition to a second potential target (STRAP/UNRIP) | Chemical proteomics (affinity pulldown) | Lysate labelling | 127
42 | Bardoxolone/CDDO | α,β-unsaturated carbonyl | 577 protein targets identified (including IKKβ, JAK1, KEAP1, mTOR, PTEN and STAT3) | Chemical proteomics (ABPP), PD and WB | In-cell labelling | 128-131
43 | Curcumin | 2 x α,β-unsaturated carbonyl | > 50 targets (including ANXA2, ALDOA, GLO1, mTOR and PRDX2) | Chemical proteomics (ABPP and affinity pulldown) | In-cell labelling | 132, 133
44 | Adenanthin | α,β-unsaturated carbonyl | Two major targets identified (PRDX1 and PRDX2) | Chemical proteomics (ABPP) | In-cell labelling; labels cysteine residue, does not bind to other peroxiredoxins | 134
45 | (±)-C75 | α,β-unsaturated carbonyl | 284 targets (including ALDOA, CKAP4, CPT1A, FASN, PDIA3 and TXNRD1) | Chemical proteomics (ABPP) | In-cell labelling; activity believed to be driven through lipid- and/or fatty acid metabolism-proteins | 135

1.2 Methods for target elucidation

1.2.1 Overview of small-molecule target identification

One of the major challenges in understanding the mode of action of any small molecule in a biological system is determining its target. It is remarkable that for many of the natural products in medicinal use, the identity of the molecular target(s) and the underlying mechanism responsible for the mode of action remain unknown. Even for small molecules or drugs that have been designed specifically for a single target, recent profiling efforts suggest that their in vivo activity may not be mediated through the target for which they were designed. When one considers that the human proteome consists of approximately 20,000 proteins, and that nucleic acids, carbohydrates or lipids may also be targets of these small molecules, this observation is perhaps less surprising.136 The challenge in profiling the target(s) of any small molecule, including electrophilic natural products, is therefore apparent.

The lack of target information for many small molecules is a consequence of the fundamental way in which new biologically active small molecules are discovered. Recent biological and technological advances have resulted in the increasing use of high-throughput cell-based screening assays to discover new active entities against a particular disease model.137 These kinds of screens measure cellular function without imposing preconceived notions of relevant targets or signalling pathways. The trade-off, however, is that the target and underlying mechanism of the small molecule are not known, and subsequent effort is required to discover the molecular target(s), which can be a complex endeavour. There has been great progress in recent years in establishing methods for the identification of cellular targets of small molecules. However, a generally applicable ‘gold standard’ workflow that can be used in most cases has not yet been developed. Therefore, multiple target identification techniques are in use, each with advantages and disadvantages, and their applicability depends on many factors that need to be considered before embarking on target identification.

These techniques can be broadly divided into three distinct but complementary approaches: direct biochemical methods, genetic interaction methods and computational inference methods (Figure 2). In practice, proteomic techniques such as activity-based protein profiling (ABPP) and standard affinity pulldowns, collectively known as chemical proteomics and forming a subgroup of direct biochemical methods, are far more widely used than computational and genetic approaches. However, it is important to stress that the problem of target identification is often not solved by a single method but rather by the analytical integration of multiple, complementary approaches.138 As a further note, it is becoming more apparent in drug discovery that it is important not only to identify the primary target of the small molecule of interest, but also to appreciate its wider reactivity within the proteome in order to unravel so-called ‘off-target’ effects. Knowledge of the full target spectrum of a small molecule is vitally important for developing efficacious and safe drugs. Detailed knowledge of the target set of a small molecule is also important in chemical biology research, where selective chemical tools that can perturb a biological system are highly sought after to interrogate the functional role of proteins within the proteome.139 Therefore, methods for target elucidation ideally need to identify the primary target responsible for activity, the underlying mechanisms of action, and the broader proteome reactivity of the small molecule under study.

[Figure 2 schematic. Panels: (A) Affinity pull-down; (B) ABPP; (C) 2D-GE; (D) DARTS and CETSA; (E) Protein microarray; (F) HIP/HOP; (G) RNAi; (H) Computational approaches.]

Figure 2. An overview of the target identification methodologies for small molecule target profiling. The methods encompass direct biochemical methods (affinity pulldowns (A), ABPP (B), 2D-GE (C), DARTS and CETSA (D), protein microarrays (E)), genetic methods (HIP/HOP (F) and RNAi (G)) and computational approaches (H).


1.2.2 Direct biochemical methods

1.2.2.1 Affinity chromatography

Affinity chromatography-based proteomics (or affinity pulldown) is one of the most widely applied techniques for small molecule target identification (Figure 2A). Typically, the small molecule is immobilised on a solid phase and exposed to a protein extract to allow binding to the protein target(s). Proteins that bind non-specifically are then removed by stringent washing. Protein targets are subsequently released from the solid phase, followed by SDS-PAGE separation, trypsin digest and LC-MS/MS analysis. Some knowledge of the structure-activity relationship (SAR) is required to determine a point of attachment at which the small molecule can be immobilised on the solid phase whilst retaining its native biological activity, such that it can still bind to its cellular targets. Higher quality affinity pulldown reports from the literature have employed control beads with an inactive analogue,140, 141 pre-incubation of lysates with the native small molecule prior to exposure to the immobilised small molecule142, 143 and incorporation of quantitative proteomics workflows.123, 144 These controls minimise false positive protein identifications and so improve the quality of the resulting target list. The major disadvantage of the approach is that target capture must be carried out on protein extracts rather than in cellular or whole organism systems. This can call into question whether targets identified by such a method are genuine targets under native biological conditions. Nevertheless, affinity pulldowns have been the workhorse of protein target identification for small molecules (both covalent and non-covalent interactors) over the last few decades, with the identification of histone deacetylase as the target of trapoxin being one of the many success stories.9

1.2.2.2 Activity-based protein profiling (ABPP)

As an extension of and improvement on the affinity chromatography-based approach, more functional probes have been developed to profile the protein targets of small molecules. Traditionally this has involved incorporating an affinity handle (typically biotin) into the small molecule via chemical synthesis to allow enrichment of protein targets for identification by a variety of detection platforms.145 The application of this approach to electrophilic compounds that can covalently engage protein targets has developed into a subset of chemical proteomics known as ABPP (Figure 2B).146 This methodology was pioneered by the groups of Cravatt and Bogyo and has been widely used to profile the activities of enzyme classes, as well as for target identification of electrophilic natural products in complex proteomes.147, 148 ABPP has quickly evolved into the method of choice for the comprehensive identification of the target spectrum of electrophilic species. Several excellent reviews present the technique in more detail.146, 149-151 In general terms, the approach involves the design and application of small molecules in the form of activity-based probes (ABPs). Original ABP designs incorporated reporter groups, such as fluorophores or affinity tags, into the small molecule for visualisation and capture respectively.127, 152, 153

More recent developments have utilised a two-step ABPP labelling strategy that uses bioorthogonal ligation chemistry.154, 155 A small and biologically inert chemical handle such as an alkyne is inserted into the small molecule, which is then applied to the chosen biological system or proteome in order to engage its protein targets. After target engagement has taken place, the alkyne-containing small molecule bound to its protein target can be coupled by the copper(I)-catalysed alkyne-azide [3+2] cycloaddition reaction (CuAAC, or ‘click chemistry’) to an azide-containing reporter group bearing a fluorophore and/or affinity tag.156-158 The affinity label then allows affinity enrichment of target proteins, followed by protein identification by LC-MS/MS for global profiling or by WB for individual target identification and validation. Other bioorthogonal ligation strategies have also been incorporated into ABPP workflows, such as the Staudinger ligation (reaction between azides and phosphines),159 a strain-promoted [3+2] cycloaddition (reaction between azides and ring-strained alkynes),160 and a Diels-Alder [4+2] cycloaddition (reaction between a diene and dienophile).161 The latter two approaches allow the non-toxic in-cell ligations that are necessary for live cell imaging applications.162

This two-step ABPP labelling approach circumvents many of the problems associated with bulky reporter groups (biotin or fluorophores), which can interfere with target engagement or biological activity and can also affect the membrane permeability and solubility of the small molecule. In many instances, a two-step ABPP labelling strategy therefore allows the protein targets of a small molecule to be profiled inside cells under native biological conditions. These conditions include endogenous protein expression levels, cellular compartmentalisation, native PTMs and the natural binding partners of targets, all of which aid the identification of genuine target interactions. This is a major advantage of the ABPP approach in comparison to affinity chromatography-based approaches, which are restricted to extracted protein lysates. Non-electrophilic or non-covalent small molecules can also be profiled in a highly analogous manner by adding a photoaffinity group (e.g. benzophenone or diazirine) to the probe design. This generates a so-called affinity-based probe (AfBP).163, 164 Upon excitation, the photoaffinity group covalently cross-links the AfBP to its bound protein targets, such that the cross-linked targets can then be identified in the same manner as their ABP counterparts.

The application of ABPP for target profiling has been hugely diverse. Profiling of targets can be carried out in living systems ranging from bacterial cultures to eukaryotic cells, tissues and even whole animals. The importance of the ABPP methodology for target identification is highlighted by the number of targets of electrophilic natural products that have been identified to date (Table 1).

1.2.2.3 2D gel electrophoresis

Two-dimensional gel electrophoresis (2D-GE) has also been used traditionally for small molecule target identification (Figure 2C). It allows the separation of thousands of proteins simultaneously, by isoelectric focusing according to their charge (first dimension) and by electrophoresis according to their size (second dimension). The resulting protein ‘spots’ on the gel often represent single proteins, which can then be isolated and identified by MS. The technique can be used for protein target identification if the small molecule is a covalent binder (non-covalent binders can be converted by installing a photoaffinity group into the compound such that its interactions can be covalently captured) and a labelling tag (radiolabel or fluorophore) is incorporated into its structure. Target identification by such an approach is time-consuming, labour-intensive and limited by low resolution. It often favours the identification of abundant proteins, and hydrophobic proteins, particularly those from the membrane, are not amenable to identification in this way. However, it has had some reported successes.165, 166

1.2.2.4 Developing technologies (DARTS, CETSA and TICC)

A new set of so-called label-free approaches to identifying the protein targets of small molecules, which use unmodified compounds, has been developed in recent years (Figure 2D). These include drug affinity responsive target stability (DARTS), cellular thermal shift assay (CETSA) and target identification by chromatographic co-elution (TICC). The DARTS method is based on the finding that a protein target bound to a small molecule is less susceptible to proteolysis than its unbound counterpart.167 Small molecule binding causes a local or global stabilisation of protein conformation and/or altered accessibility to proteolysis. Protein targets of a small molecule can therefore be assessed in a global manner, and this approach has been used in a number of recent studies.168 The CETSA method is similar to DARTS but is based on the small molecule-induced thermal stabilisation of target proteins.169 CETSA has great potential for global profiling of small molecules, with an excellent recent study identifying the protein target set of the kinase inhibitor staurosporine.170 TICC is based on a characteristic shift in the chromatographic retention-time profile detected by LC-MS/MS upon small molecule binding to a target protein.171 All three of these techniques are in their infancy and many of the reported studies have been proof-of-principle experiments. The generality of the approaches remains to be determined, but such methodologies, coupled with quantitative proteomics and improvements in MS instrumentation, show great promise for target identification applications.
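The thermal shift readout used in CETSA can be illustrated with a minimal sketch. It assumes that the soluble fraction of one protein has been measured at a series of temperatures with and without compound, and it estimates the melting temperature (Tm) by linear interpolation at 50 % soluble fraction. The function name, the interpolation choice and all data values are illustrative assumptions and are not taken from the cited studies, which would more usually fit a sigmoidal curve.

```python
from typing import Sequence


def melting_temperature(temps: Sequence[float], soluble: Sequence[float]) -> float:
    """Return the temperature at which the soluble fraction falls to 0.5.

    Uses linear interpolation between the two measurements that bracket 0.5;
    a fitted sigmoid would be the more usual choice with real data.
    """
    points = list(zip(temps, soluble))
    for (t0, f0), (t1, f1) in zip(points, points[1:]):
        if f0 >= 0.5 > f1:
            return t0 + (f0 - 0.5) * (t1 - t0) / (f0 - f1)
    raise ValueError("soluble fraction never crosses 0.5")


# Hypothetical melt curves for one protein (fraction remaining soluble).
temps = [40, 44, 48, 52, 56, 60]
vehicle = [1.00, 0.95, 0.70, 0.30, 0.10, 0.02]
treated = [1.00, 0.98, 0.90, 0.60, 0.20, 0.05]

tm_vehicle = melting_temperature(temps, vehicle)
tm_treated = melting_temperature(temps, treated)
delta_tm = tm_treated - tm_vehicle  # positive shift suggests stabilisation
print(f"Tm vehicle {tm_vehicle:.1f} C, treated {tm_treated:.1f} C, shift {delta_tm:.1f} C")
```

With these made-up curves the compound-treated sample melts about 3 °C higher than the vehicle control, which is the kind of shift that would flag the protein as a candidate target.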

1.2.2.5 Protein microarrays

In a protein microarray, many different proteins are immobilised in an array on a chip, which is then exposed to the small molecule with a subsequent readout to detect binding (Figure 2E).172, 173 Protein microarrays are more complex to generate than DNA microarrays because proteins, unlike DNA, are prone to denaturation. Immobilisation onto the chip can also hinder protein structure and/or function. They do, however, offer an advantage over affinity approaches in that each protein is spotted onto the array in equal amounts, which overcomes the preferential detection of highly abundant proteins. Despite the potential usefulness of protein microarray chips, there are a limited number of reports of their application to the detection of protein targets of small molecules.

1.2.3 Genetic methods

Yeast is often used as a eukaryotic model organism for mammalian diseases and pathways on account of approximately 50 % of genes in human disease also being present in yeast.174 Its ease of manipulation and genetic tractability, in addition to a number of other factors, have given rise to successful approaches that have been utilised for target identification (Figure 2F).175 One approach is haploinsufficiency profiling (HIP), which is based on the principle that lowering the dosage of the drug-target-encoding gene from two copies to one copy in diploid yeast will give rise to an increased sensitivity to drug treatment. An analogous approach to HIP is homozygous deletion profiling (HOP), where both copies of the drug-target-encoding gene are knocked out. This latter technique does not reveal the protein targets of the drug, but aids in identifying genes related to its mode of action. An additional yeast-based method is the yeast three-hybrid system, which identifies interactions between small ligands and protein receptors.176 Yeast-based techniques are highly powerful for drug target profiling, although they are not routinely used.177, 178
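As a rough illustration of how a HIP readout might be scored, the sketch below ranks barcoded strains by the log2 ratio of their barcode intensity in the control pool to that in the drug-treated pool, so that strains depleted by drug treatment rise to the top. The function name, the pseudocount and the strain labels and intensities are hypothetical; published pipelines apply additional normalisation and statistics that are not shown here.

```python
import math
from typing import Dict, List, Tuple


def rank_sensitive_strains(
    control: Dict[str, float],
    treated: Dict[str, float],
    pseudocount: float = 1.0,
) -> List[Tuple[str, float]]:
    """Rank strains by log2(control / treated) barcode intensity.

    A large positive score means the strain's barcode is depleted after drug
    treatment, i.e. the heterozygous deletion sensitises the strain.
    The pseudocount (an assumption of this sketch) avoids division by zero.
    """
    scores = {}
    for strain, ctrl in control.items():
        drug = treated.get(strain, 0.0)
        scores[strain] = math.log2((ctrl + pseudocount) / (drug + pseudocount))
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical microarray barcode intensities for three heterozygous strains.
control = {"strain_A": 1000.0, "strain_B": 950.0, "strain_C": 1020.0}
treated = {"strain_A": 980.0, "strain_B": 110.0, "strain_C": 1005.0}

ranking = rank_sensitive_strains(control, treated)
for strain, score in ranking:
    print(f"{strain}: log2 depletion = {score:.2f}")
```

In this toy data set only strain_B is strongly depleted, so the gene deleted in that strain would be the candidate drug target to follow up.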

Genetic methods can also be carried out in mammalian systems, more specifically in physiologically relevant cell lines. The effects of gene knockouts or RNA interference (RNAi) can often be compared to the effect of the small molecule. RNAi experiments can be easily performed on a genome-wide scale to find phenotypes similar to those induced by the small molecule (Figure 2G).179 If a gene knockout or knockdown phenocopies the effect of a compound, the evidence that the protein could be the target relevant to the phenotype is strengthened. This is useful for compounds with a strong and easily detectable phenotype, but small molecules with several targets that give unclear phenotypes (many electrophilic natural products included) are difficult to resolve using such an approach. Transcriptome sequencing (RNA-seq) of compound-resistant clones of cells also offers a way of identifying potential intracellular targets of a small molecule, as has been done for the PLK1 inhibitor BI 2536.180

1.2.4 Computational methods

The importance of computational modelling to predict small molecule-protein associations and their mechanistic relationships within complex biological systems is becoming increasingly recognised (Figure 2H).138, 177 Various computational approaches have been developed to integrate biological data such as regulatory networks, molecular pathways and cell phenotypes, to facilitate interpretation and robust prediction of the biological activities of small molecules and their targets.181, 182 These approaches are highly reliant on the continued development of high quality, structured, publicly accessible databases (e.g. DrugBank,183 STITCH,184 and Therapeutic Target Database185) as the source of information, and such databases currently have limitations on account of their incompleteness and inherent biases. However, the growth and continued curation of such databases offers a large amount of data from which computational models can continue to grow. Other more targeted computational approaches include docking studies, whereby the likelihood of small molecule binding to a target is based on three-dimensional structural considerations and spatial molecular complementarity. Docking studies, however, rely on pre-selecting a small molecule-protein pair and as such are not an appropriate technique for system-wide target identification.

This developing area of computational methods for target identification has huge potential. However, at present, experimental analyses remain essential for validating any hypotheses generated by computational approaches. Nevertheless, the power of predicting the targets of small molecules in an automated fashion using these models offers promise in reducing the time and cost associated with target identification. The integration of computational approaches into drug discovery pipelines is therefore becoming an increasingly valuable asset.

1.2.5 Mass spectrometry (MS)-based proteomics

1.2.5.1 Introduction

Many of the biochemical methodologies for target identification of small molecules described above rely heavily on MS as the basis for target detection. It has been the continued development of MS-based platforms that has revolutionised the ability to profile the targets of small molecules in a systematic and non-biased manner across an entire biological system of interest. Innovations in ionisation methods and mass analysers have extended the applicability and overall sensitivity of MS such that its ability to handle complex sample mixtures is continually improving. More generally, MS has been a vital component for identification, quantification and biological function determination of the proteome and has a wide scope of applications within proteomics. This is best typified by the recent mapping of the entire human proteome by high resolution MS.186

1.2.5.2 Top-down and bottom-up proteomics

MS-based proteomics analysis can be divided into two strategic approaches: ‘top-down’ and ‘bottom-up’ proteomics.187 In ‘top-down’ proteomics, intact proteins are separated by liquid chromatography or gel electrophoresis followed by MS analysis, before being fragmented and subjected to further MS analysis.188 In ‘bottom-up’ proteomics, proteins are first cleaved into peptides by chemical or enzymatic digestion and then subjected to multiple rounds of MS analysis.189 The peptide fragments are then searched against a database of genome-sequenced proteins to identify the protein from its ‘peptide fingerprint’. ‘Bottom-up’ proteomics can be carried out in two ways. The first approach separates the protein mixture by gel electrophoresis, before excision of protein bands from the gel, enzymatic digest and MS analysis. The second approach, known as shotgun proteomics, does not involve separation by gel electrophoresis prior to enzymatic digest. Instead, an LC-MS/MS setup is used to separate the peptide mixture, followed by MS analysis.190 Shotgun proteomics strategies have been particularly well utilised for the analysis of complex proteomes because they are high-throughput, experimentally simple and fast, and can handle a broad range of peptides. Gel-based methods often have a bias against membrane proteins, large proteins and low-abundance proteins, which can be circumvented using a shotgun proteomic approach.
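The enzymatic digestion step of ‘bottom-up’ proteomics is mirrored computationally when a protein database is prepared for searching. The sketch below performs an in silico tryptic digest, using the commonly applied rule that trypsin cleaves on the C-terminal side of lysine (K) or arginine (R) unless the next residue is proline (P). The function name, the missed-cleavage handling and the example sequence are illustrative assumptions rather than the procedure of any particular search engine.

```python
import re
from typing import List


def tryptic_digest(sequence: str, missed_cleavages: int = 0, min_length: int = 1) -> List[str]:
    """Return tryptic peptides of a protein sequence (one-letter amino acid code).

    Cleaves after K or R, except when the following residue is P.
    Peptides containing up to `missed_cleavages` uncleaved sites are included.
    """
    # Zero-width split: position preceded by K/R and not followed by P.
    fragments = [frag for frag in re.split(r"(?<=[KR])(?!P)", sequence) if frag]
    peptides = []
    for start in range(len(fragments)):
        last = min(start + missed_cleavages, len(fragments) - 1)
        for end in range(start, last + 1):
            peptide = "".join(fragments[start:end + 1])
            if len(peptide) >= min_length:
                peptides.append(peptide)
    return peptides


# Hypothetical sequence: the K followed by P is not cleaved.
example = "AGKPLRSTKDE"
print(tryptic_digest(example))
print(tryptic_digest(example, missed_cleavages=1))
```

Allowing missed cleavages enlarges the list of candidate peptides, which is why search engines usually let the user cap that number.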

1.2.5.3 Shotgun proteomics

Shotgun proteomic approaches have been used in the majority of MS-based target identification studies reported to date, as the limitations of gel-based approaches restrict their usefulness. Identification of a protein from a complex mixture of peptides by shotgun proteomics proceeds in four stages (Figure 3). Firstly, the peptides are separated by HPLC based on their physical properties. Secondly, the peptides are ionised such that they become charged. Thirdly, the peptides enter the mass spectrometer, where three pieces of information are obtained for each identified peptide: its mass, its ion intensity and a list of its fragments. The mass and the fragments identify the peptide and subsequently the protein, whereas the intensity is used for quantification. In shotgun proteomics, no prior knowledge of the peptides present in the sample is required, with peptides identified in a data-dependent mode.191 Fourthly, the MS and MS/MS spectra are translated into peptide and protein identifications.

[Figure 3 schematic. Panels: (A) Sample fractionation and protein extraction; (B) Protease digest (trypsin; proteins to peptides); (C) LC separation (intensity against time); (D) Peptide ionisation (ESI); (E) Mass analyser (ion trap-orbitrap setup; MS and MS/MS spectra, intensity against m/z); (F) Data analysis (protein identification).]

Figure 3. Typical workflow for proteome analysis using shotgun proteomics. (A) Proteins are extracted from the biological system and fractionated or separated using a variety of approaches depending on the application. (B) The isolated proteins are then enzymatically digested into peptides with trypsin. (C) Peptides are separated on a nano-scale reverse phase chromatography column with an organic solvent, whereby they are eluted off the column to be ionised. (D) Peptides eluting from the end of the column are ionised at the tip directly in front of the mass spectrometer, and the ions pass into the mass spectrometer. (E) The mass analyser component of the mass spectrometer (represented here as an ion trap-orbitrap setup) generates and collects MS and MS/MS spectra. (F) The ion mass (MS) and fragmentation masses (MS/MS) are then searched against protein sequence databases in order to produce a list of identified peptides and proteins.

1. Peptide separation: Peptides are separated on a nano-scale reverse phase chromatography column with an organic solvent whereby they are eluted off the column to be ionised.

2. Peptide ionisation: The development of ‘soft’ ionisation techniques such as electrospray ionisation (ESI) and matrix-assisted laser desorption/ionisation (MALDI) has revolutionised the ability to analyse peptides by MS.192 Both techniques convert peptides into intact ions in the gas phase. ESI utilises a solvent system to dissolve the peptide mixture, which is then electro-sprayed into a vacuum chamber. Through solvent evaporation or extraction methods, the peptides become ionised, typically picking up more than one proton ([M+nH]n+).193 In MALDI, the peptide mixture is co-crystallised with a matrix that ionises the peptide upon UV laser excitation, typically through uptake of a single proton to produce [M+H]+ ions.194

3. Mass analyser: Regardless of the ionisation technique, peptide ions are analysed based on their m/z. Among the many different types of mass spectrometers,192 two particular setups have been widely used for proteomics: quadrupole time-of-flight (TOF) instruments and hybrid linear ion trap-orbitrap instruments (often referred to simply as orbitraps). In TOF instruments, peptide ions are separated in time according to when they arrive at the detector. In the orbitrap mass analyser, the frequency of peptide ions oscillating in the ion trap is measured and the mass spectrum is obtained by Fourier transformation. MS resolution is an important parameter for proteomic applications because, at any time, multiple peptides may co-elute from the column and each requires a mass spectrum to be generated. MS resolution, a unit-less parameter, is defined as ‘the ability to distinguish two peaks of slightly different m/z in a mass spectrum’. The higher the mass resolution, the greater the number of peptides that the mass analyser can identify simultaneously. TOF instruments have a resolution in excess of 10,000, whereas orbitraps are routinely used at 60,000. Mass accuracy is another important parameter of a mass spectrometer. This can be in the low parts-per-million range for TOF instruments and even lower for orbitrap instruments, which can greatly improve the percentage of peptides that can be identified.195

Peptides are also fragmented to generate the MS/MS spectrum, from which the amino acid sequence of the peptide is determined. Most commonly, peptides are collided with a low pressure of an inert gas (known as collision-induced dissociation (CID)). In ion trap mass spectrometers, peptide ions are resonantly excited by an electric field, leading to an increased internal energy that causes peptide fragmentation at the peptide bonds. In quadrupole TOF instruments, fragmentation occurs in one quadrupole and the fragments are subsequently detected in the next quadrupole, such that only informative fragment ions are selected, producing high quality MS/MS spectra. The recent introduction of higher energy collisional dissociation (HCD) has made a fragmentation mode similar to that used by quadrupole TOF instruments available for ion trap-orbitrap instruments, without diminishing sensitivity. This has improved MS/MS spectra for ion trap-orbitrap instruments, significantly enhancing their performance.196 Other fragmentation techniques that rely on different physical mechanisms have also been employed (electron-capture dissociation (ECD) and electron transfer dissociation (ETD)) and are particularly useful for analysing peptides with labile PTMs.

4. Data analysis: The large number of MS/MS spectra generated are analysed by automated search engines or algorithms, of which numerous exist.197 These search algorithms aim to explain a recorded MS/MS spectrum by a peptide sequence from a pre-defined database (usually translated from genomic data), returning a list of peptide sequences that fit the experimental data with an associated false discovery rate (FDR) for the match. Further algorithms then assign the peptide identifications into proteins, which can be challenging when dealing with redundant peptides or alternatively spliced proteins.198 One particular, freely available, analytical software suite is MaxQuant.199 This all-in-one package with its Andromeda database search algorithm can be used to translate spectral data into protein identifications inside a single interface that is particularly adept at handling quantitative proteomic analysis.200 Other platforms allow peptide matching from spectra and protein inference from peptides to be handled separately. This allows different search algorithms to be incorporated in a customised manner which is beneficial given the rapid improvements in algorithms for shotgun proteomic applications.
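The quantities described in stages 2 to 4 above can be made concrete with a short numerical sketch. It assumes monoisotopic residue masses and the usual conventions that an [M+nH]n+ ion appears at m/z = (M + n × 1.007276)/n, that mass error is reported in parts per million, and that the resolution needed to separate two peaks is roughly m/Δm. It also lists singly charged b- and y-type fragment ions, the series typically produced by cleavage at the peptide bonds. The function names and the test peptide are illustrative assumptions and do not represent the scoring scheme of any search engine.

```python
from typing import List, Tuple

PROTON = 1.007276   # mass of a proton, Da
WATER = 18.010565   # monoisotopic mass of H2O, Da

# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406, "N": 114.04293,
    "D": 115.02694, "Q": 128.05858, "K": 128.09496, "E": 129.04259, "M": 131.04049,
    "H": 137.05891, "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def peptide_mass(sequence: str) -> float:
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[res] for res in sequence) + WATER


def mz(mass: float, charge: int) -> float:
    """m/z of the [M+nH]n+ ion for a neutral mass M and charge n."""
    return (mass + charge * PROTON) / charge


def ppm_error(observed: float, theoretical: float) -> float:
    """Mass error of an observed m/z in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def required_resolution(mz_a: float, mz_b: float) -> float:
    """Approximate resolution (m / delta m) needed to separate two peaks."""
    return ((mz_a + mz_b) / 2) / abs(mz_a - mz_b)


def fragment_ions(sequence: str) -> Tuple[List[float], List[float]]:
    """Singly charged b-ions (N-terminal) and y-ions (C-terminal)."""
    masses = [RESIDUE_MASS[res] for res in sequence]
    b_ions = [sum(masses[:i]) + PROTON for i in range(1, len(masses))]
    y_ions = [sum(masses[-i:]) + WATER + PROTON for i in range(1, len(masses))]
    return b_ions, y_ions


peptide = "PEPTIDE"  # illustrative test sequence
mass = peptide_mass(peptide)
print(f"M = {mass:.4f} Da, [M+H]+ = {mz(mass, 1):.4f}, [M+2H]2+ = {mz(mass, 2):.4f}")
b_ions, y_ions = fragment_ions(peptide)
print("b ions:", [round(x, 4) for x in b_ions])
print("y ions:", [round(x, 4) for x in y_ions])
```

As a usage note, `required_resolution(500.000, 500.010)` evaluates to roughly 50,000, which illustrates why two co-eluting peptides 0.01 m/z units apart would be merged at a resolution of 10,000 but separated at 60,000.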


1.2.5.4 Quantitative proteomics

The development of a number of quantitative proteomic techniques over the last decade has allowed direct comparison between different protein populations to be carried out.201, 202 This enables the relative protein abundance to be determined between samples from different experimental conditions, an approach that has found wide application across the proteomics field. It has become increasingly important for MS-based small molecule target identification workflows. Given the increased sensitivity of MS-based instrumentation, increased numbers of potential protein targets are being identified for small molecules using techniques such as affinity pulldowns, ABPP, 2D-GE, DARTS, CETSA and TICC. In order to decipher which of the targets identified are genuine targets worthy of further validation, appropriate control experiments must be carried out. This could involve a comparison of the target profiles of an active and an inactive analogue of the small molecule, or an assessment of the targets of a small molecule over a range of concentrations. In order to directly compare the results of different MS experiments, such as those outlined, accurate and reliable protein quantification between samples is required.

Most techniques are based on the introduction of a stable isotopic label into the proteins/peptides. Stable isotopes retain the physico-chemical properties of the proteins/peptides and therefore cause no interference, but differ in mass, allowing labelled and unlabelled forms to be simultaneously detected and quantified by MS on the basis of ion peak intensities. The isotopic label can be introduced either chemically or metabolically at different stages of the proteomic sample preparation workflow.

Metabolic incorporation introduces the isotopic label early on in the workflow and one such technique is stable isotope labelling of amino acids in cell culture (SILAC).203 SILAC makes use of the requirement of human cells for essential amino acids such as arginine and lysine. Culturing cells in media supplemented with ‘heavy’ isotopically labelled essential amino acids leads to complete labelling of the cellular proteome with this ‘heavy’ amino acid. The corresponding ‘light’ isotopically labelled cells can also be cultured in parallel. The ‘heavy’ and ‘light’ proteomes generated allow two experimental conditions to be compared in a duplex SILAC approach. Further developments to this approach have allowed the comparison of three experimental conditions using a triplex SILAC setup.204 A more recently reported innovative approach, ‘spike-in’ SILAC, allows for an increased number of samples to be quantified against one another whilst also allowing quantification of proteomes not amenable to metabolic isotopic labelling themselves.205 One of the major advantages of SILAC is that the isotopic label is introduced early in the workflow thereby reducing experimental errors and biases associated with sample processing. Therefore, it is increasingly becoming a popular strategy for MS-based target identification applications where robust quantification is required.143, 206
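To make the duplex SILAC comparison concrete, the following Python sketch turns paired ‘heavy’ and ‘light’ peptide intensities into a single protein-level ratio. It is a simplified illustration rather than the algorithm of any specific software package: the function name `protein_log2_ratio` is hypothetical, and summarising a protein by the median of its peptide log2(heavy/light) ratios is an assumption made here for robustness to outlying peptides.

```python
import math
from statistics import median


def protein_log2_ratio(peptide_pairs):
    """Summarise one protein from its peptide-level SILAC pairs.

    peptide_pairs: iterable of (heavy_intensity, light_intensity) tuples.
    Pairs with a missing (zero or negative) channel are skipped.
    Returns the median log2(heavy/light), or None if no usable pair exists.
    """
    logs = [math.log2(h / l) for h, l in peptide_pairs if h > 0 and l > 0]
    return median(logs) if logs else None


# Example: three peptides of one protein, all more abundant in the heavy channel.
ratio = protein_log2_ratio([(200.0, 100.0), (400.0, 100.0), (800.0, 100.0)])
```

A log2 ratio of 0 corresponds to equal abundance in the two conditions, and positive values to enrichment in the ‘heavy’ sample.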

However, a SILAC approach is not applicable to all cell types, such as those harvested from human tissues or body fluids. In these cases, isotopic tags for quantification can instead be introduced at the protein or peptide level. This is the case for isotope-coded protein label (ICPL) and isobaric tag for relative and absolute quantification (iTRAQ) labelling. The ICPL method labels all free amino groups of intact proteins with ‘heavy’ or ‘light’ ICPL tags prior to digestion to peptides, with MS-based quantification carried out in an analogous manner to SILAC on the MS precursor ions.207 iTRAQ labelling, on the other hand, chemically tags primary amines (N-terminus or lysine) of peptides after digestion with isobaric tags that release characteristic low mass reporter ions upon fragmentation.208 In this approach there is no splitting of the MS precursor ion or increase in spectral complexity, with quantification carried out in MS/MS. iTRAQ labelling reagents currently allow for up to eight experimental conditions to be quantified against one another. An alternative peptide labelling strategy is dimethyl labelling, which is proving popular on account of its low cost and undemanding protocol in comparison to iTRAQ.209, 210 Absolute quantification of a protein population is also possible using internal standards, as in the AQUA (absolute quantification) strategy.211 However, absolute quantification is often not required for protein target identification purposes, where relative quantification between experimental samples is sufficient.

Label-free quantification (LFQ) between protein populations is also possible, which obviates the need for stable isotopes.144, 212 Integration of the complete MS signal of each peptide can be used to quantify the same peptide in different LC-MS/MS runs. This approach is less precise than isotope label-based approaches owing to increased run-to-run variability, and the resulting larger error can be highly detrimental when observing subtle differences in protein abundance between different populations. However, LFQ benefits from its low cost and the lack of additional processing steps required for quantification. It can also provide a higher dynamic range of quantification, which is important for protein abundance changes over several orders of magnitude between different populations, and is not limited in the number of populations that can be quantified against one another. The different MS-based quantification strategies are summarised in Figure 4.

There have been a number of recent reports that have attempted to address the performance of the many quantitative proteomics strategies available relative to one another.213-215 The choice of quantification strategy is often based on the ease of incorporation of the isotopic label, cost implications, the number of samples to be analysed and the relative quantification accuracy required. Each technique has its own advantages and disadvantages dependent on the application. However, for small molecule target identification and chemical proteomic workflows, the value of metabolic approaches such as SILAC is clearly reflected by its recent use in the field in a diverse range of studies.123, 216-218

[Figure 4 schematic. Five workflows are shown side by side across the stages cells, proteins, peptides, LC-MS or LC-MS/MS and data analysis: label-free quantification (LFQ method; quantification from intensities in separate MS runs), metabolic labelling (SILAC method; quantification in MS1), isotopic tags (ICPL method; quantification in MS1), isobaric tags (iTRAQ method; quantification in MS2) and spiked heavy peptides of known absolute concentration (AQUA method; absolute quantification in MS1).]

Figure 4. Overview of quantitative proteomics workflows. All approaches require MS analysis as the means of detection. The incorporation of the isotopic label occurs at different points within the workflow, depending on the technique, with the exception of label-free quantification whereby no isotopic labels are used and quantification occurs through spectral counting or peak intensity comparison from separate MS runs. Metabolic labelling incorporates the label in vivo or in situ after which the samples are combined and processed for quantitative analysis. For isotopic and isobaric approaches, protein extraction occurs first before the label is incorporated at the protein or peptide level respectively, followed by sample combination and quantitative analysis. Absolute quantification of a protein or peptide is achieved using a known amount of spiked heavy peptides (AQUA) into unlabelled samples.

1.2.6 Target validation and further elucidation

It is important to stress that the target identification technologies discussed above create hypotheses of protein targets and potential modes of action of small molecules. Data gathered across multiple techniques will provide stronger evidence that an identified target is important to an observed biological activity. However, many of these methodologies provide target information without revealing the functional implications or biological relevance of the interaction. Validation of a chosen target is therefore critical and can be carried out using a plethora of techniques. These analyses are hugely time-consuming and costly and as such are only carried out when there is strong evidence to support a hypothesis surrounding a target and a mode of action associated with a small molecule. Quantitative binding parameters describing the affinity of a compound for a given target may be determined using biophysical techniques such as surface plasmon resonance (SPR) or isothermal titration calorimetry (ITC). The binding mode of the compound with a given target can be established from 3D structural information obtained by NMR or X-ray crystallography. Other complementary techniques to assess the small molecule-protein affinity include enzyme-linked immunosorbent assays (ELISA) and microscale thermophoresis (MST). Site-directed mutagenesis may also be used to identify functionally important residues. This is important for electrophilic natural products to determine whether covalent binding is necessary for a functional effect. A physical interaction inside cells between a small molecule and its target protein can be demonstrated with fluorescent labelling co-localisation studies,219 as well as in-cell fluorescence resonance energy transfer (FRET) assays.220 This can be a vitally important observation if the small molecule target identification was originally performed in vitro on extracted protein lysates.

In addition to characterising the physical interactions, it is also important to delineate the functional implications of a small molecule binding a target protein. Enzyme activity assays allow the association of small molecule and target to be assessed for its functional impact. Furthermore, cell-based assays for cell proliferation and apoptosis, together with transcriptional reporter assays, allow the functional impact of a small molecule to be assessed at a higher level. The utility of genetic approaches such as RNAi and knockout models has already been discussed, but both of these techniques are highly useful for probing the functional implications of a small molecule-protein interaction. Such extensive biophysical, biochemical, cell biological and structural characterisation of a given target will certainly provide strong evidence for an interaction for compounds that have a likely single target. More challenging, however, are some electrophilic natural products that are known to mediate their effects through multiple targets. For such multi-target compounds, the utilisation of bioinformatic approaches enables insight into protein target sets.221, 222

Functional annotation tools for proteins such as Gene Ontology (GO) can provide functional insight into biological processes and molecular functions associated with protein targets that might help address the observed mode of action.223 Popular web-based software tools for assigning GO terms and performing enrichment analysis include the Database for Annotation, Visualisation and Integrated Discovery (DAVID) and Babelomics software resources.224, 225 Pathway databases are also useful resources containing grouped protein identifiers either involved in the chemical reactions of a pathway or in regulation of it. Comprehensive pathway databases include KEGG and Reactome.226, 227 Recently, several pathway databases have been developed which comprise pathways active in disease, particularly cancer. Databases such as Netpath allow cancer relevant proteins and genes to be identified from complex datasets.228 It is also becoming increasingly apparent that it is important not only to understand the direct target of the small molecule, but also to have an appreciation of the interactome of the target. Binding of a small molecule to its target protein, and activity against it, may disrupt the function of the target's interacting partners, as was shown for the drug targeting of the BCR-ABL complex.229 Information on protein interactions in complexes can be obtained from interaction databases such as STRING, MINT, BioGRID, IntAct or HPRD.230-234 Such bioinformatic tools greatly assist proteomic research in the interpretation of lists of protein targets, helping to identify potential underlying mechanisms.235
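The idea behind enrichment analysis of a target list can be illustrated with a one-sided hypergeometric test, sketched below in Python. This is a generic statistical illustration and not the exact method implemented by DAVID, Babelomics or any other named resource, which apply their own test variants and multiple-testing corrections; the function name `enrichment_p_value` and its parameters are hypothetical.

```python
from math import comb


def enrichment_p_value(population, annotated, sample, hits):
    """Probability of observing at least `hits` annotated proteins by chance.

    population: number of proteins in the background proteome
    annotated:  number of background proteins carrying the annotation term
    sample:     number of proteins in the target list
    hits:       number of target-list proteins carrying the term
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    # Sum the hypergeometric probabilities for k = hits .. upper.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


# Example: background of 10 proteins, 3 annotated; all 3 appear in a 3-protein list.
p = enrichment_p_value(population=10, annotated=3, sample=3, hits=3)
```

A small p-value indicates that the term is over-represented in the target list relative to the background, although in a real analysis the values would need correcting for the many terms tested.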

Finally, combining target information with other data derived from orthogonal technologies such as transcriptomics and metabolomics is also important in understanding the biological activity of a small molecule. Analysis of both genome-wide and proteome-wide expression changes upon compound exposure will provide insight into the broader context of downstream cellular effects of compound action.236 By collating such information, a system-wide view of the compound under investigation can be obtained. A generic workflow for identifying and validating the targets of a small molecule is presented in Figure 5. This is meant to provide a guide for how one might go about profiling the protein targets of a small molecule, encompassing all that has been discussed in the chapter thus far.

[Figure 5 schematic.
Advised workflow: 1. Identification of small molecule with interesting biological activity; 2. Chemical synthesis of probe molecules of compound of interest; 3. Target(s) identification by quantitative chemical proteomics (affinity pull-down or ABPP); 4. Prioritise targets to identify key therapeutically relevant targets; 5. Validate key targets with biophysics, biochemistry, molecular biology and imaging; 6. Bioinformatic and network analysis to translate targets to mode of action (DAVID, Reactome, KEGG, STRING…); 7. Identification of both targets and mode of action of small molecule.
Optional steps: confirmation of activity and utility of probe molecule; target validation by orthogonal techniques to strengthen target IDs (RNAi, HIP/HOP, DARTS, CETSA, computational methods); analysis of ‘off-targets’ relevant at higher concentrations and potential effects they may have (e.g. toxicity).
Possible outcomes: (a) New biology or future drug target discovered; (b) Potential therapeutic for further development; (c) Useful chemical tool molecule if selective for single target or target class; (d) Promiscuous agent with undefined mechanism and therefore limited utility.]

Figure 5. A generalised workflow for the protein target(s) identification and validation of a small molecule. The workflow is applicable to all small molecules (electrophiles and otherwise) and is intended as a summary or guide for how one might go about identifying the targets of a small molecule with unknown target(s) and/or mode of action. Boxes in black show the advised workflow (numbered 1-7), with optional steps shown in the red boxes. The possible outcomes for the profiled small molecule are shown in the green boxes.

1.2.7 Case studies of electrophile target profiling

The primary focus of the presented case studies is on the profiling of electrophiles inside cells using MS-based techniques for protein target identification. To this end, there is a focus on chemical proteomic approaches such as ABPP, which offer the most powerful insight into the full target spectrum of any small molecule. The examples chosen for discussion are based on their particular relevance for this project and/or innovative strategies or methodologies employed that warrant further discussion. The first set of examples covers the target elucidation of three electrophilic natural products from Table 1 which have been profiled by chemical proteomics, providing definitive insight into their biological activities (Chapter 1.2.7.1). The second set of examples focuses on state-of-the-art platforms for electrophile target profiling that have been utilised to appreciate the wider reactivity of electrophilic species with the proteome (Chapter 1.2.7.2).

1.2.7.1 Quantitative protein target profiling of electrophilic natural products

Tetrahydrolipstatin (THL), an FDA-approved anti-obesity drug marketed under the trade name Orlistat, has potential antitumour activities and is structurally closely related to the β-lactone-containing electrophilic natural product lipstatin. Its established targets are pancreatic lipases, and it also inhibits fatty acid synthase (FAS), to which its antitumour activity has been attributed. The cellular targets of THL were further explored by Yao and co-workers by converting THL into a small series of ABPs through the addition of a terminal alkyne to its structure.116 The activity of the ABPs was first shown to be identical to that of THL in a variety of cellular assays, indicating that the addition of the alkyne tag did not disrupt the biological targets or activity. Protein target labelling by the ABP was then carried out in-cell, with subsequent identification of eight protein targets by MS. FAS, the expected target in cancer cells, was identified in addition to seven further protein targets (GAPDH, β-tubulin, RPL7a, RPL14, RPS9, Annexin A2 and HSP90AB1). All of these proteins were validated as targets of the ABP by WB analysis. This profiling study highlights the importance of identifying off-targets and elucidating their role in connection to the intended drug target. Follow-up work has led to analogues with greater selectivity for FAS over these off-targets, leading to a better understanding of the SAR of the drug compound and further highlighting the power of chemical proteomics.237, 238

Wortmannin is a steroid-derived fungal metabolite, originally identified as a potent and selective irreversible inhibitor of phosphoinositide-3-kinase (PI3K) family members. Its observed anti-proliferative activities give it promise as a cancer therapeutic. Michael addition of a conserved lysine residue (Lys802) of PI3Ks at the activated furan ring leads to ring opening and formation of a stabilised β-amino α,β-unsaturated ester adduct.239 Application of an ABPP approach further elucidated the scope of protein targets within the PI3K and PI3K-related kinase (PIKK) families.101 A further study also identified the polo-like kinase (PLK1) family as additional targets of wortmannin at higher concentrations.100 This collection of studies is interesting for a number of reasons. Firstly, they highlight how electrophilic natural products can provide selective chemical tools for a specific family of kinases, which has greatly aided the study of their function. Secondly, ABPP reveals additional protein targets that researchers should be aware of, particularly when applying wortmannin as a tool molecule at high concentrations. Finally, the covalent modification of a lysine residue by wortmannin is unexpected. This highlights that nucleophilic amino acids other than cysteine can play important roles in mediating reactivity with electrophiles.

Andrographolide, a natural product with known anti-inflammatory and anti-cancer effects, has recently had its targets profiled in live cancer cells using a quantitative ABPP approach.114 The authors of this study synthesised an ABP of andrographolide by incorporating an alkyne tag into the molecule. The ABP was then used to identify 75 potential andrographolide targets with high confidence. The use of an iTRAQ quantitative proteomics setup allowed proteins that typically appear as false positive identifications in ABPP workflows to be filtered out. This increases the accuracy of specific target identifications, minimising experimental errors. Originally 291 proteins were identified in the study, a number that was reduced to 75 upon stringent filtering, highlighting the necessity of incorporating quantitative proteomics into ABPP workflows to reduce false positive identifications. Further validation of two targets, NF-κB and β-actin, by an in vitro binding assay and direct mapping gave higher confidence to these assignments as targets of andrographolide. Quantitative strategies for MS-based analyses, together with carefully selected control experiments, should now be routine practice for target identification by ABPP to ensure the highest quality identifications. Many groups in the field are beginning to adopt them.123, 143, 240, 241

1.2.7.2 Global proteome profiling of reactive cysteines

In efforts to assess the targets of electrophiles in biological systems more broadly, platforms are being developed to capture the electrophile-reactive proteome, particularly relating to cysteine. Significant progress has been made in the last decade on account of the continued improvement of MS as an analytical tool. Initial investigations from Liebler and co-workers used biotinylated model electrophile species, NEM and IA, applied to extracted protein lysates to label and subsequently affinity-enrich modified cysteine residues on proteins. This led to the identification of hundreds of modified cysteine residues by MS, showing that a significant number of proteins could be labelled with such model electrophiles.80, 81, 83, 242

Cravatt and co-workers built on this initial work to carry out quantitative reactivity profiling of cysteine residues within complex proteomes to predict functional cysteines.56 The methodology employs a combination of two-step ABPP utilising click chemistry, quantitative proteomics and MS-based identification. Using an alkyne-functionalised IA probe applied to proteomes at high and low concentrations, the reactivity of over a thousand cysteine residues was assessed simultaneously using a quantitative MS-based approach termed isoTOP-ABPP (isotopic tandem orthogonal proteolysis-activity-based protein profiling). In this way, so-called ‘hyper-reactive’ cysteines were identified that were shown to be remarkably enriched in functional residues, indicating that reactivity towards a model electrophile such as IA is a good predictor of cysteine functionality in proteins. The same group also used this platform to globally identify oxidation-sensitive cysteine residues in bacterial systems as well as to profile zinc-binding cysteine residues.243, 244

Recently, Cravatt and co-workers used a slightly modified, competition-based isoTOP-ABPP platform to profile the targets of lipid-derived electrophiles such as 4-HNE and 15d-PGJ2 (Figure 6).79 Here they were able to assess the reactivity of these electrophiles against the more than a thousand cysteine residues targeted by the alkyne-functionalised IA probe, assigning potencies to each cysteine to allow the most reactive cysteines towards each of the two electrophilic species to be identified. Cysteines within MAPKKK MLT (ZAK), elongation factor 2 (EEF2), ketosamine-3-kinase (FN3KRP) and reticulon-4 (RTN4) were singled out as being highly reactive towards 4-HNE, providing new insight into the biological targets and mode of action of this electrophilic species. Liebler and co-workers have also profiled the targets of endogenous electrophilic lipids such as 4-HNE using more classic ABPP approaches in a number of studies, identifying hundreds of covalently bound targets.51, 245 They routinely use bioinformatic and network analyses to build functional models based on this target information and have made significant efforts towards determining the role that protein alkylation by reactive electrophiles plays in chemical toxicity and oxidative stress. Other work from their group has also developed innovative chemical proteomic platforms to globally profile the S-sulfenylated proteome, in which the thiol group of cysteine is oxidised to the highly reactive sulfenic acid under conditions of oxidative stress.245, 246

[Figure 6 schematic. Proteome treated with the inhibitor of interest or DMSO, followed by IA-alkyne labelling; CuAAC to ‘light’ or ‘heavy’ azide-TEV tags; streptavidin enrichment; trypsin digest; TEV digest; LC-MS/MS identification and quantification from ‘heavy’ and ‘light’ MS1 intensities.]

Figure 6. The competition-based isoTOP-ABPP chemical proteomics platform. The compound of interest (often an electrophilic species) or DMSO is applied to the proteome (on lysate or in-cell), followed by proteome labelling with the IA-alkyne probe (on lysate) to label all reactive cysteines. CuAAC attaches the isotopically labelled, tobacco etch virus (TEV) protease-cleavable biotin tags (containing either ‘heavy’ or ‘light’ isotopic labels) to each of the two proteomes. The samples are then combined and enriched onto streptavidin resin, with subsequent sequential on-bead protease digestions to afford probe-labelled peptides for LC-MS/MS analysis. Protein targets of the compound under study will display relatively high ‘heavy’/‘light’ ratios (R >> 1) as they will be less enriched by the IA-alkyne probe relative to the DMSO control.
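The readout described in the caption of Figure 6 can be sketched as a short Python calculation. The sketch assumes, following the logic of the caption, that the DMSO control carries the ‘heavy’ tag and the compound-treated sample the ‘light’ tag, so that R = heavy/light. The function names, the example site identifiers and the ratio threshold of 4 are hypothetical and are not taken from the original study.

```python
def competition_ratio(heavy_dmso, light_compound):
    """R = heavy/light for one probe-labelled cysteine peptide.

    A light signal of zero (complete competition) is reported as infinity.
    """
    if light_compound <= 0:
        return float("inf")
    return heavy_dmso / light_compound


def flag_competed_cysteines(intensities, threshold=4.0):
    """Return the sorted site identifiers whose ratio R meets the threshold.

    intensities: dict mapping a cysteine site identifier to a
                 (heavy_dmso_intensity, light_compound_intensity) tuple.
    threshold:   hypothetical cut-off above which a site counts as competed.
    """
    return sorted(
        site
        for site, (heavy, light) in intensities.items()
        if competition_ratio(heavy, light) >= threshold
    )


# Example with made-up intensities for three hypothetical sites.
example = {"C1": (1000.0, 100.0), "C2": (500.0, 480.0), "C3": (300.0, 0.0)}
competed = flag_competed_cysteines(example)
```

Sites with R close to 1 are labelled equally in both samples and are therefore unlikely to be engaged by the compound at the concentration tested.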

In the assessment of the protein target profile of any electrophile species, endogenous or exogenous, assessing its target reactivity in a proteome-wide manner should be an essential undertaking. There are however a limited number of reports in the field of such comprehensive target spectrum determination of electrophiles, with many reports on the electrophilic natural products in Table 1 focusing on a single target. This has partly been due to the lack of accessible and easily implementable workflows for carrying out such an undertaking. In theory, any electrophile could be applied to the competition-based isoTOP-ABPP platform to unravel its cysteine reactivity profile (in the same manner as has been done for 4-HNE and 15d-PGJ2). Both the individual cysteine modification site and the protein itself could be identified, in addition to the relative binding potency for each target. This would eliminate the need to chemically synthesise a probe molecule of each and every electrophile under investigation (as has been done in the examples in Chapter 1.2.7.1), which can be costly and time-consuming.

Extending these technologies to broaden the coverage of cysteines that can be simultaneously profiled is an on-going effort and is improving as workflows are optimised and the coverage offered by MS instrumentation develops. The use of IA- or NEM-based probes to capture reactive cysteines in the proteome has its limitations in that some cysteines may not be accessible to the probe for steric reasons or may be better suited to other cysteine-reactive electrophilic probes. This may lead to functionally important cysteines, and consequently the electrophiles targeting them, being overlooked. The adaptation of these platforms to map the reactivity of other nucleophilic amino acids in proteomes is also necessary. Work within the Cravatt group, followed up by Weerapana and co-workers, has assessed the reactivity of model electrophiles targeted specifically towards other nucleophilic amino acids within the proteome.6, 84 This will allow the reactivity of electrophilic compounds to be assessed beyond cysteine residues, potentially identifying new mechanisms through which they may be mediating their effects.

Improving understanding of electrophile signalling and identifying electrophile-sensitive proteins is important not only for appreciating new biological mechanisms of cell signalling but also as the resurgence in covalent drugs continues. Building up information on electrophile reactivity to form protein adducts will greatly aid the development of electrophile-inspired therapeutics both in terms of electrophilic natural products and targeted covalent drugs.

1.2.8 Conclusions

Profiling the target(s) of an electrophilic natural product is no different to that of any small molecule, and a large number of approaches have been discussed for elucidating the target responsible for biological activity as well as identifying potential off-targets. As has been discussed, the covalent nature of the interaction of an electrophilic natural product in many ways makes capturing its target(s) easier. With this in mind, attention is turned in the next section to three specific electrophilic natural products of interest for this project.

1.3 Three electrophilic natural products of interest

Despite the advancements in profiling of a large number of electrophilic natural products and the identification of candidate target proteins (Table 1), for many such electrophilic natural products, key targets are yet to be conclusively elucidated. Contained within this category are three dietary-based electrophilic natural products, namely curcumin, sulforaphane and piperlongumine (Figure 7).

[Figure 7 graphics. Panels (A)-(C): bar charts of PubMed publications per year, each shown with the structure of the compound.
(A) Curcumin. Chemical Formula: C21H20O6. Molecular Weight: 368.38.
(B) Sulforaphane. Chemical Formula: C6H11NOS2. Molecular Weight: 177.29.
(C) Piperlongumine. Chemical Formula: C17H19NO5. Molecular Weight: 317.34.
(D)
                             Curcumin   Sulforaphane   Piperlongumine
No. of PubMed references*    7407       1169           98
No. of trials†               108        32             0
* http://www.ncbi.nlm.nih.gov/pubmed (January 2015), † https://clinicaltrials.gov/ (January 2015)
(E) Word clouds.]

Figure 7. (A-C) The yearly number of publications corresponding to curcumin (A), sulforaphane (B) and piperlongumine (C) as reported in PubMed (http://www.ncbi.nlm.nih.gov/pubmed) up to January 2015. (D) PubMed publications and registered clinical trials reported for curcumin, sulforaphane and piperlongumine up to January 2015. (E) Word clouds of curcumin, sulforaphane and piperlongumine to give a representative overview of the literature landscape for the three compounds. PubMed (accessed May 2013) searches for ‘curcumin’, ‘sulforaphane’ and ‘piperlongumine’ were carried out. Publication titles and abstracts were exported (most recent 1000 publications for curcumin, all publications for sulforaphane and piperlongumine) and incorporated into Wordle (http://www.wordle.net/). Irrelevant and commonly used words were filtered to generate the produced word clouds. The size of the word is proportional to its frequency of use.

All three of these compounds are plant-derived secondary metabolites of low molecular weight (Mw < 370 Da) which have been explored for their therapeutic potential on account of their broad range of biological activities. There is accumulating evidence from population as well as laboratory studies to support an inverse relationship between the consumption of fruit and vegetables and the risk of a number of diseases, especially cancer.247 These effects have been attributed to the phytochemicals contained within these foods, defined as the non-nutritive components in the plant-based diet.

This has sparked extraordinary research interest in these compounds in the last few decades (Figure 7). For curcumin, by the end of 2014, over 7000 publications had been reported in PubMed with 109 clinical trials registered at ClinicalTrials.gov. Sulforaphane also had over 1000 PubMed publications and scores of clinical trials (although the majority of clinical trials use sulforaphane in the form of broccoli extracts). Research into piperlongumine is still very much in its infancy, with only 98 PubMed publications reported. However, there has been a flurry of interest in piperlongumine in the last three or four years as its biological activity profile is beginning to be unravelled. Many clinical trials on the use of nutritional supplements and modified diets, particularly in cancer treatment, are currently ongoing on account of such compounds.248 The basic benefits of bioactive dietary-based agents are their low cost, well-known applications in traditional medicine, accessibility and minimal or non-existent toxicity. For this reason they have an interesting role to play in modern medicine.

In comparison to other electrophilic natural products that can often show exquisite selectivity for a single target (Table 1; wortmannin and pladienolide B), these three compounds are often described as polypharmacological agents. Polypharmacology is defined as the specific binding of a compound to two or more molecular targets.249 The classic approach to drug discovery over the last few decades has been the philosophy of rational drug design or the ‘one gene, one drug, one disease’ paradigm. The idea is that designing very selective ligands as drug molecules against a single drug target will result in effective and safe drugs with minimal toxicities. Although highly successful over the years in the discovery of therapeutics across many diseases, this central paradigm for drug discovery is beginning to be challenged.250, 251 It is becoming more apparent that for some diseases with complex pathologies, engaging multiple targets within a disease network may be necessary to overcome the robustness of a disease state. Such diseases are unlikely to be successfully treated by pharmacological interventions based on a single target. Modulation of multiple targets can be achieved using combination therapies by administering two or more drugs. Multi-target agents (polypharmacological agents), however, aim at the same goal with a single agent.

Compounds that are capable of interacting with multiple targets simultaneously are therefore highly valuable in this regard. Although promiscuity needs to be tamed and the right sort of targets need to be engaged within the disease network, compounds such as curcumin, sulforaphane and piperlongumine offer exciting potential as drug discovery enters into a new dogma of polypharmacology. Polypharmacology is mostly relevant for diseases involving diverse target networks and signalling pathways. Cancer is therefore a prime example. Cancer cells are typified by a phenotype of uncontrolled cell proliferation and survival. This change in phenotype is governed by multiple protein mediators that collectively contribute to the disease, with the phenotype constantly subjected to changes that alter the activity and involvement of different signalling pathways.

The hallmarks of cancer comprise six biological capabilities acquired during the multistep development of human tumours: sustaining proliferative signalling, evading growth suppressors, resisting cell death, enabling replicative immortality, inducing angiogenesis, and activating invasion and metastasis.252 Two further emerging hallmarks have recently been added as the understanding of cancer develops: reprogramming of energy metabolism and evading immune destruction.253 To fight cancer effectively, therapeutic strategies that take the dynamic nature of the cancer phenotype into account, and that are able to affect multiple molecular targets across all the hallmarks of cancer or during all stages of its development, are highly sought after. The hope is that dietary-based electrophiles have the potential to meet these criteria. The anticancer activities of dietary electrophiles such as these are well documented, and there is growing evidence that they both prevent and fight many forms of cancer.247, 254 However, as stated previously, the underlying target set and the modes of action still require further elucidation.

There is a very important distinction to make at this point regarding the definition of a target in this context, as it is a term that is often used loosely in the field. Targets of dietary-based electrophiles can be direct or indirect. A direct target is one that the compound physically interacts with (covalently or non-covalently). An indirect target is one that is up- or down-regulated, activated or inactivated as a downstream consequence of compound treatment. It is very important to dissect direct targets from indirect ones, something that has been poorly done within the field. Direct targets hold the key to determining the indirect target responses and thus explaining a given activity or phenotype.

A brief snapshot of the literature landscape for curcumin, sulforaphane and piperlongumine is provided. The sheer volume of prior literature makes a comprehensive review unfeasible. Focus has therefore been placed on the reported direct targets of the compounds, which hold the key to unravelling their modes of action, and on their biological activities in cancer.

1.3.1 Curcumin

1.3.1.1 Overview

Curcumin is a polyphenolic compound isolated as the active ingredient of the perennial herb Curcuma longa (more commonly known as turmeric). Extensive research into curcumin has revealed that this natural product possesses anti-inflammatory, anti-oxidant, antiproliferative, antibacterial, antiviral, antifungal, anti-angiogenic and both anti- and pro-apoptotic effects. It has consequently been reported to exert medicinal benefits against neurodegenerative diseases, arthritis, AIDS, diabetes, multiple sclerosis, chronic obstructive pulmonary disease (COPD), cancer and cardiovascular disease. On account of this, some have dubbed curcumin ‘cure-cumin’.255 It is often easier to discuss what effects curcumin does not have as opposed to stating the actions that it does. How a single agent can exhibit such a wide range of biological effects is an enigma under intense scrutiny, with considerable research effort going into uncovering its cellular targets and underlying molecular mechanisms. Chemically, curcumin is (E,E)-1,7-bis-(4-hydroxy-3-methoxyphenyl)-1,6-heptadiene-3,5-dione and contains two α,β-unsaturated ketone motifs (Michael acceptors), giving rise to its electrophilic nature and resulting in covalent binding to biological nucleophiles. In addition, a number of studies have shown functional effects arising from non-covalent associations of curcumin with protein targets through hydrophobic and hydrogen-bonding interactions. Finally, curcumin itself is an antioxidant capable of free radical and ROS scavenging, with reports suggesting it is superior to vitamin E in this regard.256 Curcumin is therefore a multi-functional molecule capable of exerting its biological effects in a number of ways.

1.3.1.2 Molecular targets

Despite the intense research interest in curcumin, it is still poorly understood at the molecular level. Numerous direct and indirect protein targets have been identified from a variety of independent studies. These include transcription factors, kinases, adhesion molecules, receptors, cytokines and enzymes that all appear to be controlled by curcumin.257-259 Indirect effects mediated by curcumin often come through its activity on a number of different transcription factors. Transcription factors activated by curcumin include PPAR-γ, p53, Nrf2, activating transcription factor 3 (ATF3) and C/EBP homologous protein (CHOP). Transcription factors inactivated by curcumin include NF-κB, hypoxia-inducible factor 1 (HIF-1), activator protein-1 (AP-1), neurogenic notch homolog protein 1 (NOTCH1), early growth response protein 1 (EGR1), signal transducer and activator of transcription 3 (STAT3), β-catenin (CTNNB1) and specificity protein 1 (SP-1).260 There is often significant overlap between the gene products controlled by transcription factors, in addition to various levels of cross-talk between different transcription factor-controlled signalling pathways. This has made it extremely challenging both to identify the underlying mechanisms through which curcumin exerts its effects and to elucidate the key direct mediators for curcumin within these signalling cascades. However, this also highlights the therapeutic potential of curcumin, in that it can control and regulate multiple transcription factors simultaneously. This ability is especially beneficial in cancer, where pressure may need to be exerted on multiple targets within the disease network in order to reverse its effect.

Nonetheless, the undertakings of numerous research groups over the past couple of decades have led to the identification of nearly 50 direct targets for curcumin (both covalent and non-covalent binders) to date using a variety of biological techniques (Figure 8). These targets are summarised in an excellent review by Aggarwal and co-workers.261 Important direct targets of curcumin include enzymes such as DNA methyltransferase 1 (DNMT1) and the well-studied anti-inflammatory target COX2,262, 263 a variety of redox proteins such as TXNRD1,264 as well as the protein kinases ERBB2, proto-oncogene tyrosine-protein kinase Src (SRC), mammalian target of rapamycin (MTOR) and protein kinase C (PKC).265-268 Targets are not limited to proteins, with direct binding to DNA and RNA also reported for curcumin.269, 270 The physiological significance of such association is unknown, but it does raise concerns over potential genotoxicity. Great effort has gone into validating individual targets of curcumin, with attempts made to connect these to its biological activity in a given system under study. However, as alluded to above, this is no trivial task for what appears to be a fairly promiscuous compound.

To this end, globally profiling curcumin targets using the target identification methodologies discussed in Chapter 1.2 would seem appropriate, but very few examples can be cited from the literature. A recent study reported the synthesis and application of a biotinylated curcumin analogue to HEI-193 schwannoma cells to identify HSP70, HSP90, 3-phosphoglycerate dehydrogenase (PHGDH) and β-actin as direct targets, in addition to around 40 other potential protein targets (Appendix Table 2).132 Another recent report identified 11 protein targets of curcumin via MS by immobilising curcumin on resin for affinity pulldown of mouse brain proteomes.133 These targets included PRDX, β-tubulin, β-actin, HSP70, phosphoglycerate mutase 1 (PGAM1) and fructose bisphosphate aldolase (ALDOA). While informative in their attempt to globally profile the targets of curcumin, the lack of validation and representative controls in these studies limits the confidence with which target information can be inferred. Nevertheless, these are some of the first attempts to address the global target profile of curcumin.

1.3.1.3 Summary

Despite its impressive range of activities, curcumin can best be described as pharmacodynamically fierce but pharmacokinetically feeble. Its pharmacodynamic fierceness is derived from the fact that curcumin interferes with a plethora of vital pathways in cancer, inhibiting all of the classic hallmarks of cancer as described by Hanahan and Weinberg.252 The unique chemical attributes of curcumin, with a logP value that ensures transmembrane passage without excessive retention in lipophilic compartments, allow curcumin to reach its molecular targets. Its capacity to hydrogen bond, chelate metal cations and undergo hydrophobic interactions, together with its conformational adaptability and its potential to form covalent associations through its Michael acceptors, aids its ability to influence multiple targets and pathways simultaneously. The true extent of this target set is only beginning to be understood. On the other hand, its pharmacokinetically feeble character is attributable to its chemical instability, poor systemic uptake and extensive biotransformation, which lead to its poor bioavailability. As a consequence, many of the biological activities reported in vitro occur at concentrations of curcumin that are not feasible in vivo.271 Significant research effort is therefore being directed at improving the in vivo bioavailability of curcumin to bridge this gap.272

The number of publications on curcumin and of new pathways reported to be modulated by it is expanding at an exponential rate as the pharmacology of curcumin becomes more and more complex. The continued excellent reviews provided by many groups in the field have attempted to summarise the ever-growing literature generated for curcumin.273-275 However, the literature landscape for curcumin is also littered with poorly conducted studies, some of which have applied low purity curcumin (commercially available curcumin contains curcumin at only 77 % purity, with a number of additional curcuminoids present including demethoxycurcumin (17 %) and bisdemethoxycurcumin (3 %)); this hinders efforts to draw valid conclusions based on curcumin alone. There is a clear need to expand fundamental understanding of its key protein targets and mediators in the first instance to help address its multiple modes of action. Work in this regard is limited. These basic requirements must be addressed before recommendations of how to apply this interesting dietary electrophile can be made in the clinic.

1.3.2 Sulforaphane

1.3.2.1 Overview

Isothiocyanates are a class of bioactive compounds that are abundant in cruciferous vegetables such as broccoli and Brussels sprouts. There has been intense research interest in compounds of this class over the last few decades on the back of some promising in vitro and in vivo studies.276 Sulforaphane, one such isothiocyanate [(-)-1-isothiocyanato-(4R)-(methylsulfinyl)-butane], has been particularly well studied. It has the ability to both prevent and suppress the development of many types of cancer.277, 278 The cancer-relevant signalling pathways through which sulforaphane has been shown to mediate its effects are extensive. Its well-documented activities include antiproliferative, pro-apoptotic, anti-angiogenic, anti-inflammatory and antimetastatic effects. These activities have been observed both in vitro and in vivo in many instances. Activation of MAPK signalling, induction of epigenetic modifications, inhibition of key cell survival pathways such as NF-κB and AP-1, generation of mitochondrial ROS and inhibition of the FOXO1/AKT pathway have all been reported.279 Such studies highlight that sulforaphane is a multi-target agent, and this is considered to be a key component of the anticancer activity of sulforaphane and of isothiocyanates more generally.280

1.3.2.2 Molecular targets

Sulforaphane, like curcumin, is an electrophile capable of covalent modification of biological nucleophiles. Unlike curcumin, however, it lacks direct free radical scavenging ability. No adduct formation of sulforaphane with DNA or RNA has been reported either. The fundamental mechanism that underlies its biological effects is therefore believed to be its ability to form covalent adducts with proteins.281 The ready and rapid binding of isothiocyanates to GSH is also an important mechanism underlying their anticancer activities.282, 283 Formation of reversible GSH-isothiocyanate adducts inside cells allows their rapid accumulation to high intracellular concentrations.284, 285 Recent reports suggest that these adducts not only release free isothiocyanate through equilibrium, but that the isothiocyanate can also be directly transferred to protein thiols through transthiocarbamoylation.94

The protein binding targets of isothiocyanates have also been well explored, with a number of well-validated protein targets identified that go some way to explaining their anticancer activities (Figure 8). Direct targets include AP-1, cytochrome P450, HDACs, HSP90, KEAP1, NF-κB, macrophage migration inhibitory factor (MIF), MAP kinase kinase kinase 1 (MEKK1), tubulin and toll-like receptor 4 (TLR4).286 One of the best studied effects of sulforaphane is its ability to induce, through the ARE, phase II metabolising enzymes that play an important role in detoxifying carcinogens and xenobiotics.287-289 Sulforaphane activates this response by covalently modifying cysteine residues within KEAP1 to release the Nrf2 transcription factor, which activates ARE-driven gene expression. Sulforaphane is reported to be one of the most potent inducers of this pathway, which has been credited with playing a major role in its anticancer effects.290 Numerous studies have explored the modifications of KEAP1 by sulforaphane, with 25 out of the 27 cysteine residues liable to modification by sulforaphane, and with Cys38, Cys151, Cys368 and Cys489 most widely reported.28, 29, 31, 291, 292 There has also been a lot of recent interest in the epigenetic control exerted by sulforaphane, believed to be mediated through its inhibition of HDAC family members.293, 294

The global profiling of the wider target set of sulforaphane has been reported in a few studies. Two groups simultaneously reported affinity pulldown approaches using sulforaphane immobilised on resin to identify MIF as the predominant target of sulforaphane.295, 296 However, somewhat surprisingly, no further targets were reported. Another study synthesised radiolabelled [14C]-sulforaphane and applied it to non-small cell lung cancer A549 cells, followed by protein separation by 2D-GE.92 Subsequent visualisation of radiolabelled proteins, followed by their extraction from the gel and MS analysis, revealed 49 potential protein targets for sulforaphane. The same strategy was also employed for a related isothiocyanate, phenethyl isothiocyanate (PEITC), revealing 53 targets with a high degree of similarity to the sulforaphane targets. The biological functions of the targets identified were diverse; however, many of the targets identified were highly abundant proteins associated with redox functions or the . There was also a lack of target validation, and as such many of these remain speculative sulforaphane targets.

The overlap in protein targets and activities that sulforaphane shares with related isothiocyanates is noteworthy. These include the aforementioned PEITC, as well as allyl isothiocyanate (AITC), benzyl isothiocyanate (BITC) and 6-methylsulfinylhexyl isothiocyanate (6-HITC), which are typically found in other members of the cruciferous vegetable family. This suggests that the isothiocyanate motif is the key determinant for protein target reactivity. As such, targets which have been identified for these isothiocyanate family members may also be sulforaphane targets (Figure 8), although structural variations among isothiocyanates may lead to variation in targets, which may account for the few instances where differing activities have been observed.281, 297-299

1.3.2.3 Summary

Sulforaphane is another dietary-based electrophile with great therapeutic potential. It has been well documented as an inducer of the phase II response through the Nrf2-KEAP1 signalling axis, providing cellular defence against chemical carcinogens and oxidative stress that substantially contributes to its chemopreventive effects. It is on this signalling mechanism that the majority of literature attention has focused. However, further protein targets for sulforaphane are continually being identified that may explain some of its physiological effects that are independent of Nrf2-KEAP1.300 This is again prompting the belief that sulforaphane, like curcumin, is a multi-target agent.

1.3.3 Piperlongumine

1.3.3.1 Overview

Piperlongumine (formerly known as piplartine) is an amide alkaloid found in the Piper genus of the Piperaceae family, native to Northeast Brazil. It has been far less studied for its therapeutic potential relative to curcumin and sulforaphane (Figure 7).301 Despite this, a broad range of biological activities has been demonstrated for piperlongumine, including but not limited to its anti-inflammatory activity and cytotoxicity towards tumour cells.302, 303 Its therapeutic potential is yet to be realised, with no clinical trials registered to date.304

It is the anticancer activity of piperlongumine that is most promising. To this end, induction of apoptosis and necrosis,302 induction of autophagy by targeting p38 signalling and the AKT/mTOR axis,305-307 G2/M cell cycle arrest leading to mitochondrial-mediated apoptosis,308, 309 repression of pro-survival proteins such as Bcl2, survivin and X-linked inhibitor of apoptosis protein (XIAP) and activation of pro-apoptotic proteins such as Bim, p53 upregulated modulator of apoptosis (PUMA) and Noxa,123 as well as suppression of tumour progression and migration,123 have all been reported as anticancer activities for piperlongumine. In a landmark publication by Lee and co-workers, piperlongumine was shown to selectively kill cancer cells of various origins through induction of oxidative stress resulting in genotoxicity, with minimal effects on noncancerous cells.123 This cancer cell cytotoxicity was shown to be independent of p53 status and was further demonstrated not to be limited to specific cancer subtypes. This translated into excellent oral bioavailability in mice, whereby the tumour burden of xenograft cancer models was significantly reduced with only weak systemic toxicity. This mechanism of cancer cell killing led to piperlongumine being classed as a redox-directed cancer therapeutic, one of the first of its kind.

1.3.3.2 Molecular targets

Piperlongumine shares some structural similarities with curcumin, containing two distinct α,β-unsaturated carbonyl motifs that allow covalent binding to target proteins. However, only a handful of direct protein targets have been identified for piperlongumine (Figure 8), with much of the attention instead being focused on its ability to increase ROS levels selectively inside cancer cells. In the study by Lee and co-workers, an affinity pulldown approach using resin-immobilised piperlongumine coupled with SILAC and quantitative proteomics led to the identification of 12 potential targets conserved across two cancer cell lines. Seven of these targets were highlighted for their role in the cellular response to oxidative stress, with piperlongumine binding leading to their functional disruption. These targets are typically up-regulated in cancer, which led the authors to postulate inhibition of these targets as the cause of the oxidative stress-driven cancer cell death. Further potential targets for piperlongumine that fell below their cut-off threshold were identified but not disclosed.

Preliminary work had investigated the electrophilic nature of piperlongumine and suggested that the presence of both α,β-unsaturated carbonyls is responsible for general cytotoxicity.310, 311 More recent work has shown that the olefin in the lactam ring is critical for ROS elevation, glutathione depletion and cellular toxicity. Loss of the other olefin within piperlongumine still resulted in ROS levels elevated to those of piperlongumine but showed markedly reduced cell death, suggesting that ROS-independent mechanisms may also contribute to the induction of apoptosis by piperlongumine.312 Further investigation has shown that ROS elevation alone is not sufficient to induce cancer cell death and that, for ROS-inducing compounds that contain sites of electrophilicity such as piperlongumine, the electrophilic nature and the ability to covalently adduct proteins may be more closely associated with selective cancer cell death than ROS induction.313 Further targets of piperlongumine that might explain such a mechanism have not been forthcoming. However, recent work has identified ATP-binding cassette sub-family B member 6 (ABCB6), IKK, the proteasome and STAT3 as further potential protein targets of piperlongumine in independent studies.314-317

1.3.3.3 Summary

There has been a lot of excitement around piperlongumine as a cancer therapeutic since the report by Lee and co-workers in Nature in 2011.123 The identification of piperlongumine as a redox-directed cancer therapeutic with a novel and selective mechanism of cancer cell killing certainly warrants further investigation. However, further elucidation of its target set is required to unravel this mode of action.

1.3.4 Conclusions

Drug discovery has traditionally shied away from polypharmacological compounds such as curcumin, piperlongumine and sulforaphane. These compounds are often branded as pan-assay interference compounds (PAINS) in drug discovery on account of their promiscuity, in that they appear active in many biological assays.318, 319 This means that they are often overlooked as hits or candidates worthy of further investigation arising from drug discovery screens. The literature surrounding curcumin, sulforaphane and piperlongumine certainly supports this notion. These compounds do not make good starting points for the development of drug candidates and are therefore unlikely ever to form the basis of drug discovery programmes in the traditional sense. However, this does not mean these compounds should be disregarded, as their therapeutic potential is beyond doubt. The number of targets that have been identified for these three compounds suggests that their innate polypharmacology is diverse and complex and may well be inherent to their biological activities. Their structural simplicity certainly aids their ability to bind to multiple targets. In order to determine whether such compounds have a therapeutic future, and subsequently to exploit it, getting a handle on the comprehensive molecular targets and signalling pathways of these molecules is imperative. Identified targets of dietary electrophiles such as NF-κB and KEAP1 are commonly over-expressed proteins that act as important mediators in cancer cell growth and survival. These provide a starting point for explaining the anticancer activities of such compounds through the engagement of multiple targets simultaneously in a polypharmacological fashion. But this is merely the tip of the iceberg.

A lot of the focus on dietary-based electrophiles has been on their activation of the Nrf2-KEAP1 signalling axis.280 The role of the Nrf2-KEAP1 signalling pathway in cancer has prompted much discussion, and it is now believed to have a dual role in cancer.320 On the one hand, Nrf2 induction by such compounds offers protection against carcinogens and harmful xenobiotics that promote DNA damage and cancer development. On the other hand, oncogenic functions have also been observed for Nrf2, with its activation in cancer correlating with poor prognosis, presumed to be a result of enhanced cancer cell proliferation and the promotion of chemoresistance and radioresistance.321 Enhancers of Nrf2 such as dietary-based electrophiles may therefore have a biphasic response of both preventing and promoting cancer that requires further elucidation. Nrf2 also plays a pivotal role in cancer chemoresistance. Thus, combination of cancer chemotherapeutics with electrophilic species which induce Nrf2 activity may be counterproductive, leading to reduced effectiveness of anticancer drugs. It should be noted, however, that no such adverse effects of Nrf2 activators have been observed in vitro or in vivo.

However, activation of Nrf2-KEAP1 does not account for all of the activities of such compounds, and targets beyond this signalling axis are starting to become more appreciated. Progress has been made in identifying further targets, but only a small handful of incomplete and inconclusive attempts have been made to globally profile their targets.92, 123, 132, 133, 322, 323 It is clear they are not ‘single target’ molecules; dissecting their complex polypharmacology to determine the key targets and pathways presents a major challenge. It still remains unclear how such compounds can display such a broad range of beneficial biological activities and yet display such minimal toxicities. The current lack of understanding within the field of electrophile signalling and electrophile reactivity in biological systems is a clear limitation to understanding the effects exerted by these kinds of molecules. Curcumin and sulforaphane have both shown encouraging results in vitro, but this is yet to fully translate in vivo. The poor understanding of the targets and underlying mechanisms of the compounds is clearly limiting their progress in the clinic.324, 325


Curcumin targets

1. Activator protein 1 (AP-1)
2. (AKR1B10)
3. Alpha-acid glycoprotein (ORM1)
4. Aryl hydrocarbon receptor (AHR)
5. β-amyloid (APP)
6. Ca2+-dependent ATPase (SERCA)
7. Calcium release-activated channel protein 1 (CRAC)
8. Calmodulin (CALM1)
9. cAMP-dependent protein kinase (PKA, PRKAR1A)
10. CD13 (aminopeptidase N)
11. COX-2 (PTGS2)
12. CSN kinase (COPS)
13. Cytochrome P450
14. DNA
15. DNA methyltransferase 1 (DNMT1)
16. DNA polymerase λ (POLLA)
17. DNA topoisomerase 2 (TOP2)
18. Epidermal growth factor receptor (EGFR)
19. Focal adhesion kinase (FAK)
20. Glycogen synthase kinase (GSK-3β)
21. Glutathione (GSH)
22. Glutathione S-transferase enzymes (GSTs)
23. Receptor tyrosine-protein kinase erbB-2 (HER2)
24. Histone acetyltransferase E1A binding protein p300 (EP300)
25. Histone acetylase (p300/CBP)
26. HIV-1 protease
27. HIV-1 integrase
28. HIV-2 protease
29. Human serum albumin (HSA)
30. IκB kinase (IKK)
31. Inositol 1,4,5 triphosphate
32. Interleukin (IL)-1 receptor associated kinase (IRAK)
33. Arachidonate 5-lipoxygenase (5-LOX)
34. Metal ions (Fe2+, Cu2+, Zn2+)
35. Multi-drug resistance protein 1 (ABCB1)
36. Multi-drug resistance protein 2 (ABCB2)
37. Myeloid differentiation protein (MYD88)
38. P300/CREB-binding protein (CREBBP)
39. Phosphorylase kinase
40. Protamine kinase (cPK)
41. Protein kinase C (PKC)
42. Ribonuclease A (RNAase)
43. Proto-oncogene tyrosine-protein kinase Src (Src)
44. Toll-like receptor 4 (TLR4)
45. Tubulin (α- and β-isoforms)
46. Thioredoxin reductase (TXNRD1)
47. Ubiquitin isopeptidase enzymes (DUBs)
48. (XDH)

Sulforaphane targets

1. Activator protein 1 (AP-1)
2. Cytochrome P450
3. Histone deacetylase enzymes (HDAC)
4. Heat shock protein 90 (HSP90)
5. Kelch-like ECH-associated protein 1 (KEAP1)
6. Macrophage migration inhibitory factor (MIF)
7. MEK kinase 1 (MEKK1)
8. Nuclear factor kappa-light-chain-enhancer of activated B cells (NF-κB)
9. Tubulin
10. TLR4

Additional isothiocyanate targets

1. Actin
2. Adenine nucleotide translocase (ANT)
3. Annexin A2 (ANXA2)
4. ATPase enzymes (ABCs)
5. Bovine serum albumin (BSA)
6. M-phase inducer phosphatase 3 (Cdc25c)
7. Dual specificity phosphatase 8 (DUSP8 (M3/M6))
8. Eukaryotic Translation Elongation Factor 1 Alpha (EEF1A1)
9. EGFR
10. Glyceraldehyde 3-phosphate dehydrogenase (GAPDH)
11. Glutathione reductase (GR)
12. Glutathione S-transferase P1 (GSTP1)
13. Heat shock protein 70 (HSP70)
14. NADPH oxidase 2 (NOX2)
15. Mutant p53
16. Protein kinase C (PKC)
17. Proteasome
18. Signal transducer and activator of transcription 3 (STAT3)
19. Toll-like receptor 3 (TLR3)
20. Topoisomerase 2α (TOP2A)
21. Transient receptor potential cation channel A1 (TRPA1)
22. TXNRD1
23. Vimentin (VIM)

Piperlongumine targets

1. ATP-binding cassette sub-family B member 6 (ABCB6)
2. Neuroblast differentiation-associated protein AHNAK (AHNAK)
3. Annexin A5 (ANXA5)
4. Carbonyl reductase 1 (CBR1)
5. Glyoxalase 1 (GLO1)
6. Glutathione S-transferase M3, O1, P1 and Z1 (GSTM3, GSTO1, GSTP1, GSTZ1)
7. IKK
8. Pleckstrin domain-containing family M member 1 (PLEKHM1)
9. Peroxiredoxin 1 (PRDX1)
10. Ribosomal protein S5 (RPS5)
11. Proteasome
12. STAT3
13. Vimentin (VIM)

Figure 8. The direct protein targets of curcumin, sulforaphane and piperlongumine. Targets highlighted in blue represent confirmed covalent targets. Target information was curated from multiple review articles including Aggarwal et al.,271 Goel et al.,255 Gupta et al.,279 Gupta et al.261 and Mi et al.,286 in addition to further more recent publications. Further targets for curcumin and sulforaphane from studies requiring further validation are presented in Appendix Table 2.


2. Previous work and project aims

Prior to the commencement of the PhD studies reported here, two separate MRes projects had investigated the application of a chemical proteomics approach in the form of ABPP to study electrophilic natural products, namely curcumin and sulforaphane (Figure 9).326, 327

[Figure 9 image. (A) Workflow for a tagged ABP applied to the cancer cell line of interest: 1. Cell feeding; 2. Cell lysis; 3. Click reaction introduces affinity (biotin) and fluorescent (TAMRA) labels; 4. Direct visualisation of ABP-bound proteins by SDS-PAGE separation and fluorescence detection; 5. Affinity purification on Neutravidin beads; 6. Western blotting; 7. Trypsin digest; 8. LC-MS/MS analysis for protein identification and quantification. (B) The click reaction or CuAAC (copper(I)-catalysed azide-alkyne cycloaddition): bioorthogonal ligation, highly specific, biologically inert, high yielding, fast reaction time.]

Figure 9. ABPP-based chemical proteomic workflow for the in-cell labelling of ABPs (A). The electrophile of interest is first converted to an ABP by introducing an alkyne tag within its scaffold. The ABP is then applied directly to the cell line of interest in the cell culture medium (1). Cells are then lysed and the protein extracted (2). Specific functionalisation of ABP-bound proteins is carried out via the click reaction (CuAAC) (B) (3). For in-gel fluorescence experiments, the proteome is then separated and ABP-bound proteins visualised by fluorescence detection (4). For specific target identification, affinity purification of ABP-bound proteins is carried out with Neutravidin resin (5). This is followed by detection via WB (6) or by MS-based shotgun proteomics (7 and 8). PPD = pre-pulldown, SN = supernatant after affinity enrichment and PD = pulldown (protein immobilised on the resin).


These earlier studies had synthesised curcumin ABP 1, sulforaphane ABP 1 and sulforaphane ABP 2 (Figure 10A). Initially, these ABPs were applied to live HeLa cells and shown to be effective cell-permeable ABPs using the workflow described in Figure 9. Initial proteomic data by MS had identified 138 protein targets for curcumin ABP 1 inside HeLa cells at an applied concentration of 20 μM. In-cell competition-based assays were also carried out between the ABPs and their respective parent compounds (Figure 10B) to investigate whether the ABPs were a good mimic of their parent compounds.

For both sulforaphane ABP 1 and sulforaphane ABP 2, protein target labelling could be competed by a 20-fold excess of sulforaphane, as visualised by in-gel fluorescence (Figure 10B lanes 10-20). This suggested that both sulforaphane ABPs could be used as chemical tools to profile the protein targets of sulforaphane, although no proteomic data were obtained for these ABPs. However, an earlier publication by Cole and co-workers, who originally designed the ABP and applied it to HEK293 cells, had identified over 100 candidate protein targets for sulforaphane ABP 1.322

In contrast to the sulforaphane ABPs, curcumin ABP 1 proved to be more problematic as a tool molecule for curcumin. Proteomic identification of the targets of curcumin ABP 1 had initially revealed 138 candidate protein targets in live HeLa cells. However, despite the trialling of multiple conditions and protocols for the competition assay between curcumin ABP 1 and curcumin itself, successful competition for labelling could not be conclusively observed by in-gel fluorescence, either in-cell or in-lysate, even at greater than 50-fold excesses of curcumin relative to the ABP (Figure 10B lanes 1-4). However, a curcumin analogue, mono-O-propylcurcumin (PC), was shown to compete effectively against curcumin ABP 1 in-cell (Figure 10B lanes 6-9). At this stage, the reasons for the inability of curcumin to compete against the ABP remained unclear, particularly given the structural similarity between curcumin ABP 1 and curcumin and the fact that the closely related analogue PC was capable of competing for labelling against the ABP. It was speculated that this may have been due to differences in cellular uptake, with curcumin ABP 1 and PC taken up more readily on account of their increased lipophilicity relative to curcumin. This may result in intracellular concentrations of curcumin that are too low to compete effectively with ABP labelling under the assay conditions employed. The insolubility of curcumin and its analogues in the cell culture medium also limited the excesses of the parent compounds that could be applied against the ABP.

However, it was clear from the comparison to sulforaphane under exactly the same experimental conditions, as well as from other studies reported in the literature, that such competition assays of ABPs against parent compounds should be feasible. This prompted the conclusion that curcumin ABP 1 may not be the best ABP for capturing protein targets representative of curcumin, and that other ABPs should therefore be explored for their utility in such studies.


(A) [Chemical structures of curcumin ABP 1, sulforaphane ABP 1 and sulforaphane ABP 2.]

(B) [In-gel fluorescence and Coomassie gel images; the per-lane annotations are not recoverable from the extracted text. Row labels: CURC ABP 1 (2 μM), NC (μM), IA (μM), PC (μM), SULF ABP 1 (5 μM), SULF ABP 2 (5 μM), SULF (μM). Marker positions: 250, 150, 100, 75, 50, 37, 25, 20, 15, 10.]

(C)

Probe                Protein targets identified   Cell line   Concentration and feeding time   Reference
Curcumin ABP 1       138                          HeLa        20 μM, 4 h                       326
Sulforaphane ABP 1   122                          HEK293      20 μM, 30 min                    322

Figure 10. Summary of the key findings from previous work carried out prior to commencement of the PhD studies. (A) The structure of the previously synthesised ABPs of curcumin and sulforaphane. The electrophilic motif is highlighted in blue and the alkyne handle is shown in red. (B) In-gel fluorescence image of the in-cell competition of NC and PC against curcumin ABP 1 (lanes 1-9), sulforaphane against sulforaphane ABP 1 (lanes 10-15) and sulforaphane against sulforaphane ABP 2 (lanes 16-20) in live HeLa cells. (C) Protein targets identified for curcumin ABP 1 and sulforaphane ABP 1 from preliminary MS-based proteomic experiments. CURC/NC = curcumin, SULF = sulforaphane, IA = iodoacetamide, PC = mono-O-propylcurcumin.

Prior to commencing the PhD project, a landmark paper published in Nature by Lee and co-workers reported another interesting electrophilic natural product, piperlongumine.123 It displayed an exciting ability to selectively kill cancer cells of various origins whilst being ineffective against normal cells both in vitro and in vivo. The study identified 12 targets for piperlongumine using a quantitative competition-based affinity pulldown strategy. To build on this initial target profiling, it was decided to study piperlongumine in parallel with curcumin and sulforaphane.

Buoyed by these initial findings, the aim of this PhD project was to build on the direct target profiling of curcumin, piperlongumine and sulforaphane from the literature and the previous studies reported above. Preliminary work had shown that a chemical proteomic approach was feasible, albeit with technical difficulties still to overcome as typified by curcumin, but target information was yet to be fully discovered. The hope was that an ABPP-based chemical proteomic workflow would allow the full spectrum of protein targets of these compounds to be determined in a systematic and unbiased manner in biological systems of interest.

The initial cell biology and chemical proteomic methodology had been established in HeLa cells, which have proved to be a well-used cellular system. Moving forward, however, HeLa cells were deemed to be of less interest as a model system for studying the protein targets relevant to the anticancer activities of the compounds. Therefore, breast cancer was chosen as the biological system for the studies that follow, on account of the multitude of publications for all three compounds in breast cancer research. The long-term hope is to provide significant insight into the protein target spectrum of these electrophilic natural products, such that it may be utilised by the research community to better understand the fundamental underlying mechanisms of these biologically and therapeutically interesting small molecules.

The original principal goals of the PhD project are outlined as follows:

1. Synthesise a larger panel of ABPs of curcumin, sulforaphane and piperlongumine to assess as chemical tools for profiling the protein targets of these compounds of interest (Chapter 3).

2. Use the respective ABPs to comprehensively profile the direct protein targets of curcumin, piperlongumine and sulforaphane inside breast cancer cells, not only to reveal the identity of the targets but also to reveal the most potent and potentially biologically significant targets (Chapters 4 and 5).

3. Utilise protein target information to dissect the polypharmacology of these electrophilic natural products to provide insight into their mode of action in cancer cell killing (Chapter 5).

4. Improve understanding of protein target reactivity of electrophilic compounds in biological systems (Chapters 4 and 5), given the growing interest in electrophilic natural products and covalent drugs as therapeutics.

5. Utilise protein target information to provide insight into how best to apply these electrophilic natural products as therapeutics against cancer, both as single agents and in combination with other cancer therapeutic agents (Chapter 6).

6. Establish and highlight the value of a robust and reliable chemical proteomic platform that could be utilised to study the protein target spectra of other electrophilic natural products. The continued technological improvements in MS, human proteome annotation, bioinformatics, network modelling and systems biology show the growing potential of the developed platform for use in the target identification of small molecules.


3. Design and chemical synthesis of ABPs of electrophilic natural products

The first challenge of the project was to expand the library of ABPs to test as chemical tools for profiling the targets of curcumin, piperlongumine and sulforaphane (Figure 11). Incorporation of a chemical tag into the three compounds of interest to derive ABPs was performed utilising available SAR information so as to avoid the modification of important structural features. Generally speaking for these compounds, the key when designing an ABP is to install the tag in a variety of positions within the compound scaffold, preferably away from the reactive electrophilic motif so as not to hinder covalent adduct formation with target proteins, and to maintain the physical properties of the parent compound such as its cell permeability. The chemical tag should also be stable and accessible for downstream applications such as affinity enrichment, ideally projecting out into solvent away from the covalent binding site. However, this can be difficult to predict a priori, particularly for compounds with multiple, unknown binding targets. An ABP should also be synthetically tractable: incorporation of the tag should be achieved in the smallest number of steps, in good yields and without any challenging or hazardous chemistry.

On account of these considerations, alkyne-tagged ABPs were preferred over azide and biotin counterparts. The addition of biotin can significantly alter the biological properties of a compound, particularly for these small electrophilic natural products, where addition of biotin may double the molecular weight of the ABP. Therefore, a two-step ABPP labelling strategy that utilises small, discrete, biologically inert click chemistry tags was preferred. Incorporation of azides to generate ABPs was more synthetically challenging than incorporation of alkynes in this case, and azide-functionalised ABPs have been shown by others to be less favourable in direct comparisons with alkyne-functionalised ABPs.328
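The mass argument above can be made concrete with a short calculation. The following is a minimal sketch only: the molecular formulae and average atomic masses are standard values supplied here for illustration rather than taken from this chapter, the function name molecular_weight is hypothetical, and the comparison ignores any linker atoms and the atoms lost on conjugation.

```python
import re

# Average atomic masses (g/mol); standard values, supplied here as an assumption.
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999, "S": 32.06}

def molecular_weight(formula):
    """Sum atomic masses for a simple formula such as 'C21H20O6' (no brackets)."""
    total = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += ATOMIC_MASS[element] * (int(count) if count else 1)
    return total

# Molecular formulae (assumed, standard literature values).
PARENTS = {
    "Curcumin": "C21H20O6",
    "Piperlongumine": "C17H19NO5",
    "Sulforaphane": "C6H11NOS2",
}
TAGS = {"Biotin": "C10H16N2O3S", "Propargyl group": "C3H3"}

for parent, parent_formula in PARENTS.items():
    parent_mw = molecular_weight(parent_formula)
    for tag, tag_formula in TAGS.items():
        added = 100.0 * molecular_weight(tag_formula) / parent_mw
        print(f"{parent} ({parent_mw:.1f} g/mol) + {tag}: about +{added:.0f} % mass")
```

Under these simplifying assumptions, the tag mass is expressed as a percentage of the parent compound mass, which allows the relative size of a biotin tag and a propargyl tag to be compared directly for each of the three natural products.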


[Figure 11 graphic. Parent compounds: curcumin, piperlongumine and sulforaphane, with labelled structural features (1,3-diketone, phenyl rings, C7 linker, lactam ring, para-methoxy, sulfoxide, linker, isothiocyanate). ABPs: curcumin ABP 1, curcumin ABP 2, curcumin ABP 3, piperlongumine ABP, sulforaphane ABP 1, sulforaphane ABP 2 and sulforaphane ABP 3.]

Figure 11. Chemical structures of curcumin, piperlongumine and sulforaphane and their respective ABPs. The electrophilic motifs within each of the compounds are highlighted in blue (with * marking the likely position of nucleophilic attack) and the alkyne tag required for the ABP design shown in red.

3.1 Design and synthesis of curcumin ABPs

Curcumin is a member of the linear diarylheptanoid class of natural products, containing two oxy-substituted aryl moieties linked together by a C7 chain that contains an unsaturated 1,3-diketone group. Curcumin can switch between the β-diketone and enol isomers through tautomerisation, with the enol form favoured in aqueous solution. There has been a huge amount of SAR investigation into curcumin, but its multi-target activities and lack of a defined mechanism of action have limited much of the SAR to cytotoxic effects against cancer cells.329 The two aromatic rings have been widely explored, with modifications on the hydroxyl groups (alkylation, acetylation and sulfonamidation) conveying increased stability and cytotoxicity relative to curcumin across multiple cancer cell lines.330, 331 Additional functionality has also been added to other positions on the phenyl rings, and such additions are generally well tolerated. Modifications to the C7 linker have been less explored. Hydrogenation of the olefins to eliminate the reactivity of the α,β-unsaturated carbonyls generally decreased cytotoxicity. However, such derivatives still retained a number of activities, suggesting that the electrophilicity of curcumin accounts for only part of its overall mechanism of action.332 The 1,3-diketone could also be cyclised into heterocyclic rings to reduce the linker flexibility, which yielded highly potent compounds.333, 334 Shortening the C7 chain to C5 with removal of one of the carbonyls has also led to improved potency of curcumin analogues.335


However, what is lacking for curcumin is SAR at the single-target level, which limits understanding of how structural features directly relate to target engagement, albeit that curcumin is a multi-target agent. A couple of recent reports have nevertheless addressed the contributions of different components of the curcumin scaffold towards binding of specific anti-inflammatory targets.336, 337

The SAR would therefore seem to suggest that curcumin is highly tolerant of scaffold modification, such that the alkyne tag could be added in almost any position to generate the ABP. The syntheses of three curcumin ABPs are reported to investigate the impact of the tag position within the curcumin scaffold on the utility of the probes. All three curcumin ABPs preserve the two α,β-unsaturated carbonyls of the parent compound, thereby retaining the electrophilicity. Curcumin ABP 1 and 2 carry the alkyne tag on the phenol ring. ABP 1 has the tag attached onto one of the two hydroxyl groups, whereas ABP 2 substitutes one of the two methoxy groups for the tag. ABP 3 carries the tag in the middle of the β-diketone motif, between the two α,β-unsaturated carbonyls.

All curcumin ABPs were prepared through the well-established Pabon reaction, which has been widely used for the synthesis of curcumin and its analogues.338 The synthesis of curcumin ABP 1 had been reported previously by alkylating curcumin at one of its two hydroxyl groups with propargyl bromide under basic conditions.339 However, in previous work this produced curcumin ABP 1 only in extremely poor yield (2.9 mg, 0.6 %) and required preparative LC-MS purification to isolate the compound, owing to the difficulty of separating the mono-O-propargyl analogue (curcumin ABP 1) from the di-O-propargyl analogue of curcumin.326 To improve the synthetic route to curcumin ABP 1, a new strategy based on a two-step procedure via feruloyl acetone was carried out.340-342 A first aldol reaction of 2,4-pentanedione with vanillin in the presence of boric anhydride produced feruloyl acetone. This was followed by a second aldol reaction of feruloyl acetone with the previously synthesised 4-methoxy-3-propargyloxybenzaldehyde under similar conditions to yield 54.1 mg of curcumin ABP 1 (Scheme 1). This represented a significant improvement in yield (7 % over two steps) whilst retaining high purity.

Curcumin ABP 2 has not been previously reported as a curcumin analogue. It was synthesised starting from 3,4-dihydroxybenzaldehyde, reacting the meta-hydroxyl with propargyl bromide to produce 3-propargyloxy-4-hydroxybenzaldehyde. This aldehyde, vanillin and 2,4-pentanedione were then coupled in one pot (again via the Pabon reaction) using a synthesis adapted from Saladini and co-workers to produce 43.8 mg of curcumin ABP 2 in low overall yield (5 % over two steps) but high purity (Scheme 2).343 The synthesis of curcumin ABP 3 was carried out as previously reported.340 Starting from 2,4-pentanedione, alkylation with propargyl bromide in the presence of base yielded 3-propargyl-2,4-pentanedione. Reaction of 3-propargyl-2,4-pentanedione with vanillin via the aldol reaction afforded 121 mg of curcumin ABP 3 in low yield (5 % over two steps) but high purity (Scheme 3).

Yields of all three curcumin ABPs were more than adequate for subsequent biological evaluation despite being relatively low (< 10 %). Although the yields could be improved through reaction optimisation, the key for each ABP was obtaining it in extremely high purity. The purity of all three ABPs was confirmed to be > 95 % by LC-MS analysis.

[Scheme 1 graphic: reaction scheme, steps i and ii, giving curcumin ABP 1.]

Scheme 1. Synthesis of curcumin ABP 1. Reagents and conditions: i) 2,4-pentanedione, B2O3, (nBuO)3B, nBuNH2, 90 °C, 16 %; ii) 4-methoxy-3-propargyloxy-benzaldehyde, B2O3, (nBuO)3B, piperidine, EtOAc, 80 °C, 43 %.

[Scheme 2 graphic: reaction scheme, steps i and ii, giving curcumin ABP 2.]

Scheme 2. Synthesis of curcumin ABP 2. Reagents and conditions: i) propargyl bromide, NaH, DMSO, 0 °C then room temperature, 11 %; ii) 2,4-pentanedione, vanillin, B2O3, (nBuO)3B, nBuNH2, DMF, 80 °C, 36 %.

[Scheme 3 graphic: reaction scheme, steps i and ii, giving curcumin ABP 3.]

Scheme 3. Synthesis of curcumin ABP 3. Reagents and conditions: i) propargyl bromide, K2CO3, acetone, reflux, 13 %; ii) vanillin, B2O3, (nBuO)3B, nBuNH2, EtOAc, 80 °C, 41 %.
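The overall yields quoted for the curcumin ABPs follow from the per-step yields given in Schemes 1 to 3, since the overall yield of a linear sequence is the product of the fractional step yields. The sketch below illustrates that arithmetic; the function name overall_yield is hypothetical, and because the per-step percentages in the scheme captions are already rounded, the products may differ slightly from the overall figures quoted in the text.

```python
from math import prod

def overall_yield(step_yields_percent):
    """Overall yield (%) of a linear sequence: product of the fractional step yields."""
    return 100.0 * prod(y / 100.0 for y in step_yields_percent)

# Per-step yields (%) transcribed from the captions of Schemes 1 to 3.
SCHEMES = {
    "Curcumin ABP 1 (Scheme 1)": [16, 43],
    "Curcumin ABP 2 (Scheme 2)": [11, 36],
    "Curcumin ABP 3 (Scheme 3)": [13, 41],
}

for name, steps in SCHEMES.items():
    print(f"{name}: {overall_yield(steps):.1f} % over {len(steps)} steps")
```

The same function applies to longer sequences, provided that a yield is available for every step or combined pair of steps.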

3.2 Design and synthesis of sulforaphane ABPs

Sulforaphane has also been subjected to extensive SAR investigations, although, as for curcumin, its multiple mechanisms of action make it difficult to draw up a clear and unambiguous SAR analysis to guide ABP design.344 Much of the SAR assessment for sulforaphane has been based on its ability to induce Nrf2 or suppress NF-κB, or on more global effects at the cellular level. Observations suggest that the isothiocyanate is critical for all biological activities of sulforaphane reported to date. However, it could be replaced with other electrophilic groups such as sulfoxythiocarbamates or isoselenocyanates whilst retaining many of the reported activities of the parent isothiocyanate.322, 345, 346 Variation in the distance between the isothiocyanate and the sulfoxide group is well tolerated, although a four-methylene spacer, as is the case for sulforaphane, appears to give the most active compounds towards cancer cells.347 The optimum oxidation state of sulfur was the sulfoxide, with the sulfide and sulfone showing markedly reduced activities. Variation of the substituent on the sulfoxide group (methyl for sulforaphane) is also well tolerated in terms of Nrf2 activation and cytotoxic activity, suggesting it may be the most logical place to insert the alkyne tag to derive an ABP.347, 348


Three sulforaphane ABPs were synthesised that encompassed a diverse range of ABP designs. The synthetic strategy for sulforaphane ABP 1 was originally reported by Cole and co-workers.322 This was replicated with minor modifications by Elisabeth Storck (Imperial College London) to produce sulforaphane ABP 1 and sulforaphane ABP 2 in a Masters project prior to commencement of these PhD studies (Scheme 4 and Scheme 5).327 Sulforaphane ABP 2 was re-synthesised from 6-chloro-2-hexanone to obtain 11.5 mg of the ABP in low yield (3 % over 5 steps) but in high purity. The limiting step in the synthesis was the final oxidation of the sulfur atom, where over-oxidation to the sulfone could not be avoided, accounting for the poor yield of the final step (15 %).

[Scheme 4 graphic: reaction scheme, steps i to viii, giving sulforaphane ABP 1.]

Scheme 4. Synthesis of sulforaphane ABP 1. Reagents and conditions: i) p-toluenesulfonic acid, ethylene glycol, toluene, 135 °C; ii) phthalimide, K2CO3, KI, DMF; iii) NH2NH2, EtOH; iv) 4-hex-5-ynyloxy-benzaldehyde, MeOH; v) NaBH4, MeOH; vi) S-ethyl chlorothiolformate, DIPEA, DCM, 0 °C; vii) 2N HCl, THF; viii) m-CPBA, DCM, -78 °C. Synthetic route taken from Storck et al.327

[Scheme 5 graphic: reaction scheme, steps i to v, giving sulforaphane ABP 2.]

Scheme 5. Synthesis of sulforaphane ABP 2. Reagents and conditions: i) p-toluenesulfonic acid, ethylene glycol, toluene, 135 °C, 75 %; ii) propargyl amine, K2CO3, KI, DMF; iii) DIPEA, S-ethyl chlorothiolformate, DCM, 0 °C, 38 %; iv) 2N HCl, THF, 68 %; v) m-CPBA, DCM, -78 °C, 15 %.

The structural design of both sulforaphane ABP 1 and sulforaphane ABP 2 is distinct from that of the sulforaphane parent compound. Firstly, the reactive isothiocyanate was replaced with the less reactive sulfoxythiocarbamate group. Isothiocyanates react with the thiol group of cysteine residues, the preferred amino acid modification site of sulforaphane, to form a dithiocarbamate covalent adduct that is kinetically labile (Figure 12).349 The reversibility of adduct formation makes the isolation of such adducts difficult with the ABPP workflow.283 Replacement with the sulfoxythiocarbamate group results in a more stable thiocarbamate adduct with cysteines that enables more effective sulforaphane target isolation whilst maintaining the chemoprevention properties of sulforaphane.322, 349 The other modification of the ABPs was to replace the sulfoxide with a ketone. Numerous reports have suggested that such a substitution has very little impact on the biological activities of sulforaphane.289, 350 However, a recent study suggested that the sulfoxide group may hydrogen bond with an amino acid located next to the reactive cysteine of KEAP1, although this is yet to be fully clarified.29 The modified design of sulforaphane ABP 1 and sulforaphane ABP 2 should therefore allow more effective capture of sulforaphane targets than isothiocyanate-based ABPs, for which there were concerns over adduct stability upon isolation. The rationale for the synthesis of sulforaphane ABP 2 was that the shorter alkyne-containing linker may provide less steric bulk around the sulfoxythiocarbamate binding site, allowing identification of sulforaphane targets where the phenyl ether linker of sulforaphane ABP 1 may prevent binding.

[Figure 12 graphic. (A) Sulfoxythiocarbamate (sulforaphane ABP 1 and 2 design): irreversible adduct formation with SH/GSH to give a thiocarbamate. (B) Isothiocyanate (sulforaphane and sulforaphane ABP 3 design): reversible adduct formation with SH/GSH to give a dithiocarbamate.]

Figure 12. Mechanism of sulfoxythiocarbamate (A) and isothiocyanate (B) reaction with the cysteine thiol on target proteins through formation of thiocarbamate and dithiocarbamate adducts respectively. The dithiocarbamate adduct is more kinetically labile than its thiocarbamate counterpart making it prone to reversibility or displacement by nucleophilic attack by other nucleophilic species such as GSH.

More recently, the preliminary synthesis of sulforaphane ABP 3 was undertaken to provide a more sulforaphane-like ABP that maintained the isothiocyanate electrophilic motif and the sulfoxide group with the addition of the alkyne tag. The reported synthesis of an alkyne-functionalised ABP of a related isothiocyanate, 6-HITC, had suggested that previous concerns relating to the instability of dithiocarbamate adducts for isolating isothiocyanate target proteins using ABPP may have been misplaced.94 The synthetic route to sulforaphane ABP 3 was adapted with minor modifications from that of Uchida and co-workers, inserting the alkyne tag onto the sulfoxide group.94, 351 Starting from 4-amino-1-butanol, the amino group was Boc-protected and the alcohol tosylated to produce 4-(tert-butoxycarbonylamino)butyl p-toluenesulfonate. The tosylate provided a good leaving group for displacement with thioacetate, which was followed by substitution with 3-butynyl p-toluenesulfonate, Boc-deprotection under acidic conditions and reaction with thiophosgene to construct the isothiocyanate. A final oxidation from the sulfide to the sulfoxide with hydrogen peroxide and purification by preparative LC-MS yielded 2.3 mg of the racemic sulforaphane ABP 3 in very poor yield (0.5 % over 7 steps) but in extremely high purity (Scheme 6). The quantity obtained was insufficient to generate a 13C NMR spectrum for sulforaphane ABP 3, and the route requires further optimisation to convert it into a viable strategy for future applications. However, the synthetic route did provide sufficient quantities of the ABP to test it alongside sulforaphane ABP 1 and sulforaphane ABP 2 for biological evaluation. The purity of all three sulforaphane ABPs was again confirmed to be > 95 % as determined by LC-MS analysis.

[Scheme 6 graphic: reaction scheme, steps i to vii, giving sulforaphane ABP 3.]

Scheme 6. Synthesis of sulforaphane ABP 3. Reagents and conditions: i) Boc2O, K2CO3, THF:H2O; ii) TsCl, Et3N, DCM, 30 % over 2 steps; iii) CH3COSK, DMF, 60 °C; iv) 3-butynyl p-toluenesulfonate, 1M NaOMe, 1,4-dioxane, 0 °C; v) 5-6N HCl; vi) CSCl2, 1M NaOH, CHCl3; vii) H2SO4, iPrOH, 30 % H2O2, MeOH.

3.3 Design and synthesis of piperlongumine ABPs

Very little is known about the SAR surrounding piperlongumine, with data limited to only a handful of recent publications.123, 312, 352, 353 From what has been shown, both the α,β-unsaturated carbonyls are necessary for activity, with the olefin in the lactam ring believed to be the more reactive of the two. The three methoxy groups on the phenyl ring are not essential for cancer cytotoxicity and alterations have led to more potent analogues. The para-methoxy position was used as the point of attachment for the immobilisation of piperlongumine to an affinity resin for protein target identification by Lee and co-workers.123 Therefore this was chosen as the position for insertion of the alkyne tag to generate the one and only piperlongumine ABP, the first reported alkyne-functionalised analogue of piperlongumine for ABPP. Starting from piperlongumine, which is available commercially, demethylation at the para-position with aluminium chloride yielded the O-demethylated piperlongumine intermediate. The free hydroxyl group was then alkylated with propargyl bromide to produce 24.8 mg of piperlongumine ABP in good yield (48 % over two steps) and in high purity (Scheme 7). The synthesis of further piperlongumine ABPs was not explored.

[Scheme 7 graphic: reaction scheme, steps i and ii, giving piperlongumine ABP.]

Scheme 7. Synthesis of piperlongumine ABP. Reagents and conditions: i) AlCl3, DCM, 69 %; ii) propargyl bromide, DBU, acetonitrile, 69 %.

3.4 Confirmation of the anticancer activity of ABPs

Following the synthesis of all 7 ABPs, the cancer cytotoxicity of each ABP was tested to confirm that they retain the anticancer activities of their parent compounds. To do this, appropriate cell-based assays are required. A large variety of assays are available to assess cancer cytotoxicity in a 96-well format.354 One such approach is the use of vital dyes, which are fluorescent or coloured molecules that discriminate between living and dead cells. These can include exclusion dyes, which are incapable of crossing an intact plasma membrane and therefore specifically label dead cells.


Alternative dyes include fluorogenic esterase substrates, which are cell-penetrable, non-fluorescent compounds that generate fluorescent products upon hydrolysis by cytosolic esterases. Such substrates are only hydrolysed inside living cells, therefore providing a quantitative readout of the live cell population. Other approaches include detection and quantification of intracellular proteins (such as LDH) that are spilled from the cell upon plasma membrane rupture, a common consequence of cell death. A number of kits are commercially available for the detection of such cell death biomarkers.

The most widely used surrogate biochemical marker for cell viability is ATP, based on the assumption that all living cells produce ATP and that it is indispensable for cellular life. Luciferase-based assays allow for the sensitive quantification of intracellular ATP levels, which generally correlate well with cellular metabolism and viability. Several other assays measure specific markers of cellular metabolism to determine cell viability, such as the widely used tetrazolium reduction assays based on 3-(4,5-dimethylthiazol-2-yl)-2,5-diphenyltetrazolium bromide (MTT) or 3-(4,5-dimethylthiazol-2-yl)-5-(3-carboxymethoxyphenyl)-2-(4-sulfophenyl)-2H-tetrazolium (MTS). These tetrazolium salts are readily taken up by cells and converted by mitochondrial reductases from their colourless precursor to coloured formazan products that can be easily quantified by measurement of absorbance at a defined wavelength (Figure 13A).355, 356 The MTS assay offers an advantage over its MTT counterpart in that the procedure requires fewer processing steps, as the resulting formazan product is water soluble.

The MTS assay was therefore chosen to assess the effect of each ABP on cell viability in the MDA-MB-231 breast cancer cell line across a range of time intervals (Table 2). The % cell viability was calculated for each concentration of compound and used to fit dose response curves to generate an EC50 value, defined as the concentration giving half the maximal reduction in % cell viability, which was used as a surrogate for cancer cell death (Figure 13B).
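The calculation described above can be sketched in a few lines. This is an illustration under stated assumptions and not the analysis actually performed, which used non-linear regression in GraphPad Prism: the normalisation of absorbance to the vehicle (100 %) and positive (0 %) controls, the two-fold dilution series, the synthetic data and all function names are hypothetical, and a coarse grid search with the top and bottom plateaus held fixed stands in for a full four-parameter regression.

```python
import math

def percent_viability(a_sample, a_negative, a_positive):
    """Scale absorbance so that the vehicle control is 100 % and the kill control is 0 %."""
    return 100.0 * (a_sample - a_positive) / (a_negative - a_positive)

def four_param_logistic(conc, bottom, top, ec50, hill):
    """Four-parameter logistic curve; viability falls from top to bottom as conc rises."""
    return bottom + (top - bottom) / (1.0 + (conc / ec50) ** hill)

def fit_ec50(concs, viabilities, bottom=0.0, top=100.0):
    """Least-squares grid search over EC50 and Hill slope (plateaus held fixed)."""
    lo, hi = math.log10(min(concs)), math.log10(max(concs))
    best = None
    for i in range(201):                      # EC50 candidates, log-spaced over the tested range
        ec50 = 10 ** (lo + (hi - lo) * i / 200)
        for j in range(1, 41):                # Hill slope candidates from 0.25 to 10
            hill = 0.25 * j
            sse = sum(
                (four_param_logistic(c, bottom, top, ec50, hill) - v) ** 2
                for c, v in zip(concs, viabilities)
            )
            if best is None or sse < best[0]:
                best = (sse, ec50, hill)
    return best[1], best[2]

# Hypothetical two-fold dilution series from 100 uM (10 concentrations).
concs = [100 / 2 ** k for k in range(10)]
# Synthetic viabilities generated from a known curve, purely to exercise the fit.
synthetic = [four_param_logistic(c, 0.0, 100.0, 10.0, 1.5) for c in concs]
ec50_est, hill_est = fit_ec50(concs, synthetic)
print(f"Estimated EC50 = {ec50_est:.1f} uM, Hill slope = {hill_est:.2f}")
```

By construction, the fitted curve passes through the midpoint between the top and bottom plateaus at the EC50, which corresponds to the definition of EC50 used here.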


[Figure 13 graphic. (A) Conversion of MTS to formazan by mitochondrial reductases; absorbance read at 490 nm, proportional to the number of viable cells. (B) 96-well plate layout (rows A-H, columns 1-12) with negative control, positive control and compounds 1-4, followed by the workflow: compound treatment to cells; MTS/PMS reagent treatment for 4 h; absorbance at 490 nm readout; calculate % cell viability (averaged over duplicates); dose response curves fitted and EC50 value calculated.]

Figure 13. The MTS cell viability assay. (A) In living cells, MTS is converted to formazan, a coloured compound, by mitochondrial reductases that can be detected by measuring absorbance at 490 nm. The absorbance is proportional to the number of living cells. (B) The 96-well plate experimental setup for screening effects on cancer cell viability. Four compounds were tested across 10 concentrations per plate in duplicate with relevant negative (DMSO vehicle) and positive (puromycin) controls as indicated.

Table 2. Cancer cell toxicity of the synthesised ABPs and their parent compounds in the MDA-MB-231 cell line. 100 μM was the maximum concentration tested for all compounds and dose response curves were fitted with non-linear regression using a four parameter fit on at least duplicate data in GraphPad Prism 5 from which the EC50 values were determined (p < 0.05). n.d. = not determined.

Compound name                    EC50 (μM) @ 24 h   EC50 (μM) @ 48 h   EC50 (μM) @ 72 h
Curcumin                         93.4 ± 17.6        21.3 ± 3.7         26.4 ± 4.3
Curcumin ABP 1                   n.d.               14.5 ± 6.0         10.0 ± 3.2
Curcumin ABP 2                   n.d.               n.d.               18.8 ± 8.4
Curcumin ABP 3                   n.d.               n.d.               3.9 ± 0.6
Tetrahydrocurcumin (THC)         n.d.               n.d.               > 100
Sulforaphane                     46.0 ± 23.6        22.5 ± 8.3         13.3 ± 3.3
Sulforaphane ABP 1               > 100              > 100              > 100
Sulforaphane ABP 2               > 100              > 100              > 100
Sulforaphane ABP 3               79.9 ± 50.9        n.d.               13.4 ± 12.1
Piperlongumine                   19.8 ± 4.7         9.0 ± 2.0          4.7 ± 1.2
Piperlongumine ABP               n.d.               14.9 ± 16.5        28.8 ± 20.7
Tetrahydropiperlongumine (THP)   n.d.               n.d.               > 100

At 72 h, the EC50 values for curcumin ABP 1, curcumin ABP 2, curcumin ABP 3 and curcumin were 10.0 μM, 18.8 μM, 3.9 μM and 26.4 μM respectively. All three curcumin ABPs therefore show comparable activity on cancer cell viability relative to curcumin, suggesting they retain the biological activity of the parent compound. The same observation was made for piperlongumine ABP and piperlongumine, where at 48 h and 72 h the EC50 values of piperlongumine were 9.0 μM and 4.7 μM and those of piperlongumine ABP were 14.9 μM and 28.8 μM respectively. Hydrogenation of the olefins responsible for the α,β-unsaturated carbonyls of both curcumin and piperlongumine to yield tetrahydrocurcumin (THC) and tetrahydropiperlongumine (THP) abolished the activity of both compounds (EC50 > 100 μM even after 72 h), indicating that the electrophilic moieties are an essential structural feature for cancer cell death.

The effect on cell viability of the three sulforaphane ABPs did not, however, correlate with that of sulforaphane. Sulforaphane had an EC50 at 24 h, 48 h and 72 h of 46.0 μM, 22.5 μM and 13.3 μM respectively, displaying clear anticancer effects. Sulforaphane ABP 3 showed comparable activities, with an EC50 at 24 h and 72 h of 79.9 μM and 13.4 μM respectively. Sulforaphane ABP 1 and 2, however, showed no effect on cell viability even at the maximum concentration tested (100 μM) at all three time intervals. This suggests that the modified structural design of these two sulforaphane ABPs does not retain the anticancer activities of sulforaphane. This was surprising, given that Cole and co-workers had reported equivalent or even superior NQO1 inducer activity relative to sulforaphane in initial studies for this compound class.322 However, this is the first reported assessment of their anticancer effects in cell viability assays. The lack of cancer cell death may be due to the weaker electrophilic nature of the sulfoxythiocarbamate motif relative to the isothiocyanate of native sulforaphane. The lack of toxicity towards cancer cells urges caution in the application of these ABPs as mimics for sulforaphane, given that they do not share the same anticancer phenotype at the concentrations tested. However, their non-toxic nature may also be a potential advantage, allowing high concentrations of the ABPs to be applied for long durations without the risk of inducing cell toxicity. Furthermore, their non-toxic nature may also allow for the application of such ABPs in vivo, and as such these ABPs are certainly worth pursuing further despite the observed lack of effect on cancer cell viability.
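The comparisons drawn above from Table 2 can also be expressed as fold differences in EC50 between each ABP and its parent compound. The sketch below does this for the 72 h values only; the dictionary and function names are hypothetical, values reported as > 100 μM are treated as undetermined rather than assigned a number, and the reported errors are ignored, so the ratios are indicative at best.

```python
# EC50 values (uM) at 72 h, transcribed from Table 2; None marks "> 100 uM".
EC50_72H = {
    "Curcumin": 26.4,
    "Curcumin ABP 1": 10.0,
    "Curcumin ABP 2": 18.8,
    "Curcumin ABP 3": 3.9,
    "Sulforaphane": 13.3,
    "Sulforaphane ABP 1": None,
    "Sulforaphane ABP 2": None,
    "Sulforaphane ABP 3": 13.4,
    "Piperlongumine": 4.7,
    "Piperlongumine ABP": 28.8,
}

def parent_of(probe):
    """Parent compound name, taken as the first word of the probe name."""
    return probe.split()[0]

def fold_vs_parent(probe):
    """EC50(probe) / EC50(parent); None if the probe EC50 exceeded the tested range."""
    probe_ec50 = EC50_72H[probe]
    if probe_ec50 is None:
        return None
    return probe_ec50 / EC50_72H[parent_of(probe)]

for name in EC50_72H:
    if "ABP" in name:
        fold = fold_vs_parent(name)
        text = "not determinable (> 100 uM)" if fold is None else f"{fold:.2f}-fold"
        print(f"{name} vs {parent_of(name)}: {text}")
```

A ratio below 1 indicates an ABP that is more potent than its parent compound in this assay, and a ratio above 1 indicates one that is less potent.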

Having successfully designed, synthesised and tested a total of 7 ABPs collectively for curcumin, piperlongumine and sulforaphane, the utility of the ABPs as tools for studying the molecular targets of their parent compounds is assessed in Chapter 4.

4. Initial application of ABPs to cells and cell lysates

In this chapter, the application of the seven synthesised ABPs of curcumin, piperlongumine and sulforaphane to biological systems will be discussed, both to live cancer cells and protein lysates. Initial proteomic identification of the targets of a representative ABP of each of the three compounds is carried out, giving the first comprehensive insight into the polypharmacological nature of these dietary electrophiles. This led on to establishing in-cell competition-based assays with the ABPs against their respective parent compounds as a way to drill down to the genuine protein target set of the compound under study. Finally, a comparative proteomic profiling experiment between in-cell and in-lysate labelling with the ABPs reveals interesting insights into how best to apply the ABPs as tool molecules to identify the genuine protein targets of these dietary electrophiles.

4.1 In-cell application of ABPs and proteomic identification of targets

4.1.1 In-gel fluorescence analysis of ABPs treated to MCF7 and MDA-MB-231 cell lines

Owing to the interest in applying the synthesised ABPs to cancer model systems, all seven ABPs (curcumin ABP 1, curcumin ABP 2, curcumin ABP 3, sulforaphane ABP 1, sulforaphane ABP 2, sulforaphane ABP 3 and piperlongumine ABP) were treated to two live breast cancer cell lines, MCF7 and MDA-MB-231.357, 358 The MDA-MB-231 cell line is a triple negative fibroblast-like cell line (ESR1- HER2- PR-) with mutant p53 status and is characterised as being highly aggressive and invasive. Triple negative breast cancers have proved difficult to treat with standard hormonal therapies in the clinic to date. The MCF7 cell line is a hormone-sensitive luminal epithelial-like cell line (ESR1+ HER2- PR+) with wild-type p53 that is relatively less aggressive in comparison to the MDA-MB-231 cell line and is more responsive to current therapeutic intervention. Both cell lines have been widely used in breast cancer research and therefore provide two contrasting representative breast cancer cell line subtypes to study the application of the ABPs towards.359 Moreover, there is an abundance of literature for the application and study of the three electrophilic natural products of interest to both the MCF7 and MDA-MB-231 cell lines.278, 360

All seven ABPs were treated to both MCF7 and MDA-MB-231 cell lines by supplementing the ABPs into the cell culture medium for 30 min. After treatment, cells were washed with PBS followed by chemical cell lysis to extract a whole cell protein lysate. Following standardisation of the protein concentration in each sample, the ABP-labelled proteomes were functionalised with an azido-TAMRA (AzT) ‘capture reagent’ by CuAAC to specifically covalently label ABP-bound protein targets with the TAMRA fluorophore for visualisation purposes. Excess AzT reagent was removed by precipitating the proteins with a CHCl3/MeOH precipitation protocol with subsequent MeOH washing of the resulting protein pellet. The protein pellet was then re-dissolved in protein re-suspension buffer and prepared for SDS-PAGE analysis. Each sample was then resolved by SDS-PAGE and analysed by in-gel fluorescence imaging whereby only ABP-labelled proteins are visible in the Cy3 channel (Ex. 550 nm Em. 570 nm) corresponding to the TAMRA fluorophore. It was noted that prior to SDS-PAGE, boiling the sample containing sulforaphane ABP 3 in sample loading buffer resulted in a loss of ABP-protein

conjugate formation (Appendix Figure 1). This is a result of the instability of the dithiocarbamate adducts the isothiocyanates form upon protein modification and as such this sample was not boiled prior to SDS-PAGE analysis.

Protein target labelling by all seven ABPs was achieved in the low μM range as visualised by in-gel fluorescence across both cell lines (Figure 14). The ABPs even when applied to cells at relatively low concentrations label many potential protein targets given the number of bands in the SDS-PAGE gel visible by in-gel fluorescence (additional in-gel fluorescence images are shown in Appendix Figure 2 and Appendix Figure 3). The labelling pattern observed also appears reasonably well-conserved for the same ABP across both the MCF7 and MDA-MB-231 cell lines. However, some differences in the band patterns are evident suggesting some degree of cell line-specific targets. Looking more closely, the three ABPs for curcumin, structurally differing only in the positioning of the alkyne chemical handle within the scaffold of the natural product, show very similar protein labelling band patterns, albeit with differing in-gel fluorescence intensities (Figure 14 lanes 1-3 and 9-11). This would suggest that the targets of these three ABPs are likely to be highly similar. It is evident in both cell lines that curcumin ABP 1 and curcumin ABP 3 show stronger in-gel fluorescence labelling at identical concentrations relative to curcumin ABP 2. This could be due to differences in the rate of compound uptake, stability and/or kinetic thiol reactivity. The latter would seem least likely as all three ABPs contain the same electrophilic motif, although the influence of neighbouring groups could affect electrophilic reactivity. It is often difficult to predict a priori how a designed ABP is likely to label in a cellular context. As such, synthesising multiple ABPs of a compound of interest and subsequently testing them is often still the best strategy to ensure an optimum ABP is selected for future experiments.

[Figure 14 image: SDS-PAGE gel with in-gel fluorescence and Coomassie panels, Mw markers 10-150 kDa. Lanes 1-8: MDA-MB-231; lanes 9-16: MCF7. Treatments in lane order for each cell line: CURC ABP 1, CURC ABP 2, CURC ABP 3, PIP ABP, SULF ABP 1, SULF ABP 2, SULF ABP 3, DMSO. ABP concentrations as printed across the ABP-treated lanes: 20, 20, 20, 5, 20, 20, 50, 20, 20, 20, 2, 5, 10, 10 μM.]

Figure 14. In-cell labelling of the small panel of synthesised ABPs of curcumin, sulforaphane and piperlongumine treated to the MDA-MB-231 (lanes 1-8) and MCF7 (lanes 9-16) cell lines. Cells were treated with ABP for 30 min within the cell culture medium, followed by whole cell protein lysis, CuAAC to append the AzT reagent and SDS-PAGE. The in-gel fluorescence represents the protein labelling pattern of each ABP and the Coomassie staining ensures equal protein loading across all lanes for each cell line. CURC = curcumin, PIP = piperlongumine and SULF = sulforaphane.

On the other hand, the protein labelling patterns of the three sulforaphane ABPs as visualised by in-gel fluorescence appear less well conserved (Figure 14 lanes 5-7 and 13-15). This is perhaps to be expected given the significant structural differences in the design of the three ABPs. However, even for sulforaphane ABP 1 and 2, which are two ABPs of the same fundamental design, noticeable differences in the labelling patterns are apparent. The piperlongumine ABP showed particularly strong in-gel fluorescence corresponding to protein labelling and was the most potent of the seven ABPs tested across both breast cancer cell lines (Figure 14 lanes 4 and 12, accounting for its lower concentration). All seven ABPs are shown here to label protein targets at physiologically relevant concentrations, supporting the notion that covalent adduct formation with proteins is a major mechanism for the observed biological effects of these dietary electrophiles. Taken together, all seven ABPs therefore provide a diverse set of chemical tools to further study the targets of their respective parent compounds.

4.1.2 Stability of ABP-protein adducts in the MDA-MB-231 cell line

Utilising the ABP as a representative tool for the parent compound, the stability of ABP-protein conjugates over time was investigated inside the cellular environment. For these studies, an ABP for each of the three compounds (10 μM curcumin ABP 1, 2 μM piperlongumine ABP and 5 μM sulforaphane ABP 2) was applied at a single concentration to a single cell line, MDA-MB-231, over a series of time intervals up to 48 h (Figure 15A-C). This not only identifies the optimum incubation time for the application of the ABPs to achieve maximum labelling, but also allows the differences in in-cell protein target engagement to be studied. After treating live cells for each indicated time interval, the cells were washed, lysed and the protein extracted. CuAAC was performed, followed by resolution with SDS-PAGE and in-gel fluorescence scanning. Curcumin ABP 1 showed strongest in-gel fluorescence labelling after just 0.5 h, corresponding to the rapid accumulation of ABP-protein adducts inside cells (Figure 15A). Maximum adduct formation persisted until 2 h before a significant loss for incubation times thereafter. Piperlongumine ABP, on the other hand, showed strongest in-gel fluorescence labelling after 2 h, suggesting ABP-protein adducts took longer to form but were significantly more stable than their curcumin counterparts, with the maximum in-gel fluorescence intensity maintained up to 24 h before dropping away at 48 h (Figure 15B). Sulforaphane ABP 2 took the longest of the three compounds tested to form its maximum detectable ABP-protein adducts, with in-gel fluorescence labelling intensity peaking after 6 h treatment (Figure 15C). This was maintained up to 24 h, before dropping away at 48 h.

A number of conclusions can be drawn from these studies. The differences in the electrophilic moieties between the three natural products as well as their cellular uptake rates are likely to contribute to the differing rate of protein-ABP adduct formation and their subsequent stability. However, protein-ABP adducts for all three ABPs are eventually turned over at 48 h even when compound treatment persists. Although the overall in-gel fluorescence intensity for each ABP varies over the time intervals tested, the band pattern within the SDS-PAGE gel remains consistent for each ABP. This would suggest that there is no obvious shift in protein labelling profile over this 48 h time

window and the targets engaged at 30 min appear to be the same as those at 24 h or 48 h (at least by in-gel fluorescence).

[Figure 15 image: SDS-PAGE gels with in-gel fluorescence and Coomassie panels, Mw markers 10-250 kDa. Time-course panels: (A) 10 μM curcumin ABP 1, (B) 2 μM piperlongumine ABP, (C) 5 μM sulforaphane ABP 2, each at 0.5, 2, 6, 24 and 48 h. Withdrawal panels: (D) 20 μM curcumin ABP 2 and (E) 10 μM piperlongumine ABP, each at 0, 0.3, 1, 2, 3 and 20 h; (F) 5 μM sulforaphane ABP 2 at 0, 0.3, 1, 2, 4 and 24 h.]

Figure 15. (A-C) Time-course experiment of ABP dosing to observe in-cell ABP labelling in the MDA-MB-231 cell line. Cells were seeded at equal density and treated with the ABP for the stated time. Cells were then lysed with whole cell protein lysis buffer, followed by CuAAC to append the AzT reagent, SDS-PAGE and protein visualisation by in-gel fluorescence. (D-F) Stability and turnover of in-cell ABP labelling in the MDA-MB-231 cell line. Cells were seeded at equal density and treated with the ABP for 30 min. The ABP-containing medium was then aspirated off and replaced with fresh medium for the stated time. Cells were then lysed with whole cell protein lysis buffer, followed by CuAAC to append the AzT reagent, SDS-PAGE and protein visualisation by in-gel fluorescence.

To further investigate the stability of ABP-protein adducts and how they are turned over within the cell, a follow on experiment was carried out. MDA-MB-231 cells were first exposed to a representative ABP for each of the three compounds (20 μM curcumin ABP 2, 10 μM piperlongumine ABP and 5 μM sulforaphane ABP 2) for 30 min, before withdrawing the ABP-containing media and replacing it with fresh non-compound containing media. Cells were then lysed and proteomes functionalised as described previously over a series of time intervals up to 24 h (Figure 15D-F). Consistent with the

previous observations, the protein labelling of the curcumin ABP was rapidly diminished even after 20 min (Figure 15D). In contrast, the piperlongumine ABP and sulforaphane ABP 2 showed more stable ABP-protein adducts relative to curcumin following ABP withdrawal. Sulforaphane ABP 2 maintained ABP-protein adducts up to 4 h after withdrawal of the ABP but these were significantly reduced after 24 h (Figure 15F). Piperlongumine ABP showed a reduction in ABP-protein adducts up to 1 h but no further adduct loss was detected up to 20 h (Figure 15E).

The formation of protein adducts by curcumin appears to be relatively quick, although these protein-electrophile adducts seem to be rapidly eliminated inside the cell. This fits with our understanding of curcumin instability in biological buffer.361 Its degradation both enzymatically and non-enzymatically into a variety of products has also recently been explored, which may explain why longer time intervals do not result in increased adduct formation.362-364 However, the turnover of curcumin-protein adducts has not been investigated and therefore the underlying mechanism for the rapid elimination of curcumin adducts remains unknown. Piperlongumine, on the other hand, takes longer to form protein-electrophile adducts but these adducts appear to be more stable. Caution must be advised when interpreting the observations for sulforaphane ABP 2 in comparison to sulforaphane. There are significant structural differences between the two compounds both in terms of the electrophilic motif (sulfoxythiocarbamate versus isothiocyanate) and the remaining chemical scaffold (ketone versus sulfoxide). However, as an ABP, sulforaphane ABP 2 appears to form stable protein-ABP adducts, more so than its sulforaphane ABP 3 counterpart (Appendix Figure 1).

Similar studies of the stability of electrophile-protein adducts have been carried out by Liebler and co-workers for two other small molecule electrophiles, IA and NEM.21 They noted that the differing kinetic thiol reactivity between the two electrophilic motifs, the SN2-reactive iodoacetyl group of IA in comparison to the Michael acceptor maleimide group of NEM, contributes to differences in electrophile-protein reactivity and adduct reversibility. Their results showed that NEM reacts rapidly to form protein-electrophile adducts in cells and in lysates but these adducts are reversed over time. In contrast, IA reacts more slowly to form protein-electrophile adducts but the formed adducts were observed to be irreversible. It has been suggested that irreversible electrophile binding to proteins results in general cellular toxicity that should be avoided. Therefore the observation that curcumin, piperlongumine and sulforaphane all form reversible protein adducts may partly explain their minimal general toxicities in contrast to an electrophilic agent like IA. The stability and turnover of protein adducts of other electrophilic species such as 4-HNE and 15-PGJ2 have not been explored in a cellular environment.

Traditional approaches for assessing electrophile reactivity have typically used in vitro kinetic thiol assays involving a model thiol (such as GSH,365 DTT,366 single nucleophile peptides,367 cysteine368) incubated with the electrophile of interest under aqueous conditions, followed by analysis and quantification of adduct formation by chromatographic, MS and/or NMR techniques.369, 370 Reactivity assays of this kind have proven their reliability and robustness in determining reactivity parameters and predicting toxicity.371 Utilising the ABPs in this study to investigate protein-electrophile adduct formation and stability has the advantage over traditional approaches that it allows electrophile-protein adducts to be studied in a native biological environment. The electrophile under study can therefore be assessed against its true protein target set containing a diverse range of nucleophilic reactivity, providing a much more informative readout. Such a methodology is clearly not amenable to high-throughput screening of electrophilic reactivity, but for investigating the electrophilic nature of curcumin, piperlongumine and sulforaphane, the results here are superior to the reactivity parameters previously calculated from kinetic thiol assays.369

4.1.3 Proteomic identification of targets of the ABPs in the MDA-MB-231 cell line

Having shown that all ABPs could successfully label protein targets inside cells as visualised by the in-gel fluorescence analysis, the identities of these targets were sought. A representative ABP for each of the three compounds (20 μM curcumin ABP 1, 5 μM sulforaphane ABP 2 and 2 μM piperlongumine ABP) along with vehicle (DMSO) were treated to MDA-MB-231 cells for 30 min. After compound treatment, cells were washed, lysed and the protein concentration of the resulting lysates normalised to a final protein amount of 1 mg. Lysates were then subjected to CuAAC with an azido-TAMRA-biotin (AzTB) reagent. This tri-functional reagent contains an azide group for click chemistry adduction to terminal alkynes, a TAMRA fluorophore for in-gel fluorescence visualisation and a biotin handle for subsequent affinity purification. Protein precipitation and subsequent washes removed the excess of reagent. The resulting protein pellet was then re-suspended, followed by incubation with a Neutravidin-sepharose resin to affinity enrich biotin-labelled proteins, corresponding only to proteins that have i) covalently bound the ABP and ii) subsequently been functionalised with the AzTB reagent by CuAAC.

The resin was then rigorously washed under denaturing conditions with SDS, urea and ammonium bicarbonate solutions to remove proteins indirectly associated with ABP-bound proteins and proteins that non-specifically bound to the resin itself, whilst maintaining biotin-functionalised protein targets on the resin. The interaction between biotin and avidin is the strongest known non-covalent interaction (Kd is approximately 10⁻¹⁵ M).372 Protein targets on resin were then reduced and alkylated, trypsin digested to peptides and analysed by LC-MS/MS on a Thermo Scientific Q-Exactive mass spectrometer instrument.373 SDS-PAGE and in-gel fluorescence analysis was used as a quality control to confirm ABP labelling and affinity capture (Figure 16B).

Proteins were identified from the MS data on the basis of their unique peptides using the MaxQuant proteomics software package, which is designed for handling large, high-resolution mass spectrometry data sets.374 The Q-Exactive mass spectrometer is a state-of-the-art hybrid instrument that couples an ion trap-orbitrap set up with a quadrupole mass filter. The combination of these two components is the first of its kind, combining the advantages of a quadrupole and orbitrap setup and providing superior peptide and protein coverage in comparison to an LTQ Orbitrap Velos instrument.373 To achieve data of high quality, it is necessary to perform MS analysis on instruments with high resolution, high dynamic range, high mass accuracy and high sequencing speed. The Q-Exactive mass spectrometer provides such a platform.

(A)
Compound                   Number of peptides   Number of unique peptides   Number of protein targets †
20 μM curcumin ABP 1       18,150               17,041                      2276
5 μM sulforaphane ABP 2    6,232                5,613                       706
2 μM piperlongumine ABP    12,123               11,227                      1344
DMSO                       864                  677                         66

[Figure 16B image: SDS-PAGE gel with in-gel fluorescence and Coomassie panels, Mw markers 10-250 kDa, lanes 1-12: PPD, SN and PD samples for each of the four treatments (curcumin ABP 1, sulforaphane ABP 2, piperlongumine ABP and DMSO).]

[Figure 16C image: three-set Venn diagram of curcumin ABP 1 (2211 targets), piperlongumine ABP (1280 targets) and sulforaphane ABP 2 (645 targets). ABP-specific regions: 1218, 287 and 73; shared by all three: 518; pairwise-only regions: 448, 27 and 27.]

[Figure 16D image: two-set Venn diagrams. Sulforaphane ABP 2 (645 targets) versus sulforaphane ABP 3 (736 targets): 393 / 252 shared / 484. Curcumin ABP 2 (596 targets) versus curcumin ABP 3 (987 targets): 78 / 518 shared / 469.]

Figure 16. Proteomic identification of the targets of representative ABPs of curcumin, sulforaphane and piperlongumine in the MDA-MB-231 cell line. (A) Summary of the number of peptides and proteins identified in each of the 4 experimental samples. † Protein identifications were made based on the presence of ≥ 2 ‘razor+unique’ peptides and a protein FDR < 0.01. (B) SDS-PAGE and in-gel fluorescence analysis showing the successful enrichment of ABP-bound proteins onto Neutravidin sepharose resin. Samples: PPD = post-click chemistry pre-pull down, SN = supernatant after affinity enrichment onto the Neutravidin sepharose resin, PD = protein immobilised on the Neutravidin sepharose resin. (C) Venn diagrams showing the overlap of protein targets identified for curcumin, piperlongumine and sulforaphane after the removal of targets from the vehicle control (DMSO). (D) Venn diagrams showing the overlap of protein targets from a comparative study of the sulforaphane and curcumin ABPs.

Setting a minimum requirement of two ‘razor+unique’ peptides for a protein identification led to the successful identification of 2276 proteins for curcumin ABP 1, 706 proteins for sulforaphane ABP 2, 1344 proteins for the piperlongumine ABP and 66 proteins for the DMSO control (Figure 16A). The sheer number of protein targets identified here is daunting. A human cell is known to contain anywhere between 30,000 and 50,000 proteins, and therefore this is a significant number of targets within the proteome as a whole that these small electrophilic ABPs can seemingly covalently interact with. The protein targets identified in the DMSO control (66 protein identifications) are likely to be non-specific binders to the Neutravidin-sepharose resin and as such they were discounted from the target identifications for the ABPs. Removal of targets present in the DMSO control led to 2211 targets for curcumin ABP 1, 1280 targets for piperlongumine ABP and 645 targets for sulforaphane ABP 2. A high degree of overlap of protein targets identified between the three ABPs was observed, with 518 conserved targets (Figure 16C). However, the proteomic identifications also indicate a large number of targets that appear to be ABP-specific, with 1218, 287 and 73 unique protein identifications for curcumin, piperlongumine and sulforaphane respectively. This is consistent with the observations by Liebler and co-workers for NEM and IA, whereby low overlap of protein adducts conserved across the different model electrophiles was also observed in their MS-based target profiling.80, 81, 83
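The control subtraction and overlap analysis described above can be sketched with plain set operations. This is a minimal illustration and not the analysis pipeline used in this work: the function names and the toy protein identifiers are hypothetical, and the assignment of the pairwise-only region sizes (448, 27 and 27, read from Figure 16C) to particular ABP pairs is inferred here from the reported totals.

```python
from itertools import combinations


def remove_control(abp_hits, control_hits):
    """Discard proteins that were also identified in the vehicle (DMSO) control."""
    return set(abp_hits) - set(control_hits)


def venn_regions(named_sets):
    """Map each combination of ABP names to the proteins found in exactly those ABPs."""
    names = sorted(named_sets)
    regions = {}
    for size in range(1, len(names) + 1):
        for combo in combinations(names, size):
            inside = set.intersection(*(named_sets[n] for n in combo))
            outside = set().union(*(named_sets[n] for n in names if n not in combo))
            regions[frozenset(combo)] = inside - outside
    return regions


def totals_from_region_sizes(region_sizes):
    """Sum the Venn region sizes that contribute to each ABP's total target count."""
    totals = {}
    for combo, count in region_sizes.items():
        for name in combo:
            totals[name] = totals.get(name, 0) + count
    return totals


# Toy data with hypothetical protein identifiers, for illustration only.
dmso = {"P1"}
raw_hits = {
    "curcumin": {"P1", "P2", "P3", "P4"},
    "piperlongumine": {"P1", "P3", "P4", "P5"},
    "sulforaphane": {"P4", "P6"},
}
filtered = {name: remove_control(hits, dmso) for name, hits in raw_hits.items()}
toy_regions = venn_regions(filtered)

# Region sizes from the text and Figure 16C; pair assignment inferred from the totals.
FIGURE_16C = {
    frozenset({"curcumin"}): 1218,
    frozenset({"piperlongumine"}): 287,
    frozenset({"sulforaphane"}): 73,
    frozenset({"curcumin", "piperlongumine"}): 448,
    frozenset({"curcumin", "sulforaphane"}): 27,
    frozenset({"piperlongumine", "sulforaphane"}): 27,
    frozenset({"curcumin", "piperlongumine", "sulforaphane"}): 518,
}
reported_totals = totals_from_region_sizes(FIGURE_16C)
```

Under this assignment the region sizes sum to the reported totals of 2211, 1280 and 645 targets, which is a useful internal consistency check on the Venn diagram.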

For proteomics data of this kind, there are a number of important parameters that provide strong evidence for a given protein target identification. Confidence in an identification increases with the number of ‘razor+unique’ peptides assigned to a protein, the % sequence coverage, and the label-free quantification (LFQ) intensity. Unique peptides are those only identified within that protein, whereas razor peptides are non-unique peptides assigned to the protein group with the largest number of assigned unique peptides (based on Occam’s razor principle).374 The % sequence coverage refers to the percentage of the full-length protein sequence covered by ‘razor+unique’ peptide identifications. LFQ intensity is an algorithm-based quantification parameter based on the mass spectral peak intensities that allows the relative amount of proteins to be compared across samples from separate LC-MS/MS runs.375
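The two peptide-level parameters defined above can be illustrated with a short sketch. It follows the definitions as stated in this section and is a simplification, not MaxQuant's implementation; the function names, the toy sequence and the peptide-to-protein mapping are all hypothetical.

```python
def sequence_coverage(protein_sequence, peptides):
    """Percentage of residues in the protein covered by at least one identified peptide."""
    covered = [False] * len(protein_sequence)
    for peptide in peptides:
        start = protein_sequence.find(peptide)
        while start != -1:
            for i in range(start, start + len(peptide)):
                covered[i] = True
            start = protein_sequence.find(peptide, start + 1)
    return 100.0 * sum(covered) / len(protein_sequence)


def razor_plus_unique(peptide_to_proteins):
    """Count 'razor+unique' peptides per protein.

    A peptide matching one protein is unique to it. A shared (non-unique) peptide
    is assigned as a razor peptide to the candidate protein with the most unique
    peptides; ties are broken alphabetically (an arbitrary choice for this sketch).
    """
    unique_counts = {}
    for proteins in peptide_to_proteins.values():
        for protein in proteins:
            unique_counts.setdefault(protein, 0)
        if len(proteins) == 1:
            unique_counts[next(iter(proteins))] += 1

    counts = dict(unique_counts)
    for proteins in peptide_to_proteins.values():
        if len(proteins) > 1:
            winner = max(sorted(proteins), key=lambda p: unique_counts[p])
            counts[winner] += 1
    return counts


# Hypothetical 16-residue sequence with two identified tryptic-like peptides.
coverage = sequence_coverage("MKTAYIAKQRQISFVK", ["TAYIAK", "QISFVK"])

# Hypothetical mapping: pepD is shared between ProtX and ProtY.
counts = razor_plus_unique({
    "pepA": ["ProtX"],
    "pepB": ["ProtX"],
    "pepC": ["ProtY"],
    "pepD": ["ProtX", "ProtY"],
})
```

In the toy mapping the shared peptide is assigned to ProtX because it has more unique peptides, so ProtX would pass a two-peptide identification threshold while ProtY would not.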

As noted earlier, a well-documented protein target of electrophilic compounds is the Nrf2-inhibitory protein, KEAP1, which regulates the anti-oxidant response.376 Modifications of KEAP1 by sulforaphane have been widely reported but only on recombinant protein or in cell lines with a KEAP1 over-expression construct. The identification of KEAP1 as an in-cell target of electrophilic natural products has remained elusive on account of its low endogenous expression level inside the cell. Using ABPs of the three compounds, KEAP1 was identified as a target, with 22, 22 and 29 ‘razor+unique’ peptides assigned to KEAP1 for curcumin, piperlongumine and sulforaphane respectively. This corresponds to 50 %, 50 % and 65 % sequence coverage of KEAP1 respectively, with a high LFQ intensity relative to other proteins identified (30.2, 34.0 and 31.0 for KEAP1 against an average LFQ intensity of 26.3). The MS-based parameters give extremely high confidence to this assignment, showing for the first time KEAP1 as an intracellular target for all three compounds.

Is it possible that a single dietary-based electrophile is capable of covalently binding such a large number of protein targets simultaneously in a cellular system? These initial proteomic insights support the notion that reactive electrophiles like curcumin, sulforaphane and piperlongumine are highly promiscuous agents. Other studies employing similar workflows with MS detection have also reported high numbers of protein targets.56, 78, 80, 81, 83, 114, 377 One of the great advantages of using an MS-based platform to profile the protein targets of small molecules is that it copes very well with extremely complex mixtures, allowing the simultaneous identification of hundreds or even thousands of proteins in a single LC-MS/MS run. This would not be achievable using alternative methods such as WB that require target identification on a target-by-target basis.

To compare the protein targets of different ABPs of the same compound, further proteomics experiments were carried out in the MDA-MB-231 cell line. Using an identical proteomics setup, in-cell treatment of 10 μM sulforaphane ABP 3 revealed 736 protein targets. Using a slightly different proteomics experimental setup, 596 and 987 protein targets were identified for 20 μM curcumin ABP 2 and 20 μM curcumin ABP 3 respectively. Previous observations from the in-gel fluorescence showed that all applied curcumin ABPs had very similar band labelling patterns corresponding to protein-ABP adducts (Figure 14). This is reflected in the proteomics results obtained for curcumin ABP 2 and 3, in that 518 conserved targets were identified for the two curcumin ABPs, with extremely high target overlap (> 85 % of the curcumin ABP 2 targets) (Figure 16D). On the other hand, noticeably different in-gel fluorescence labelling patterns were observed for the sulforaphane ABPs. This was reflected in their protein targets in that only 252 protein targets were conserved across both sulforaphane ABP 2 and 3, with an additional 393 and 484 targets respectively identified only for one of the two ABPs (Figure 16D).
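The pairwise comparison above reduces to simple arithmetic on the two target totals and the number of shared targets. The sketch below uses a hypothetical helper, `pairwise_overlap`, with the curcumin ABP 2 and ABP 3 values quoted in the text; it assumes the > 85 % overlap is expressed relative to the smaller (curcumin ABP 2) target set, which is what the reported numbers suggest.

```python
def pairwise_overlap(total_a, total_b, shared):
    """Split two target lists into A-only, B-only and shared counts, with % overlap."""
    if shared > min(total_a, total_b):
        raise ValueError("shared targets cannot exceed either total")
    return {
        "only_a": total_a - shared,
        "only_b": total_b - shared,
        "shared": shared,
        "pct_of_a": 100.0 * shared / total_a,
        "pct_of_b": 100.0 * shared / total_b,
    }


# Curcumin ABP 2 (596 targets) versus curcumin ABP 3 (987 targets), 518 shared.
curcumin_pair = pairwise_overlap(596, 987, 518)
print(curcumin_pair["only_a"], curcumin_pair["only_b"])
print(round(curcumin_pair["pct_of_a"], 1), round(curcumin_pair["pct_of_b"], 1))
```

Reporting the overlap relative to both totals makes clear that the degree of agreement depends on which ABP is taken as the reference.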

The proteomics data generated here is comprehensive in providing a number of potential direct protein mediators for curcumin, sulforaphane and piperlongumine, the majority of which are novel. However, although structurally analogous for the most part (with the exception of the sulforaphane ABPs), there are subtle structural differences between the ABPs and their respective parent compounds. The addition of the alkyne tag into the chemical scaffold to derive an ABP may result in off-target protein binding that is not present in the parent compound. From these data alone, there is no way of knowing which protein identifications result from the addition of the alkyne tag into the parent compound scaffold. Moreover, the target identification is limited to merely the presence or absence of a particular protein target, with limited quantitative information with regard to each identified target. Therefore, while providing good initial insight into the protein target potential of the three compounds under study, there is work to be done to drill down to a robust, reproducible, validated set of targets for each compound.

4.2 In-cell competition of ABPs against parent compounds and other electrophiles

Having identified a plethora of protein targets for the ABPs, it was next imperative to confirm that the ABPs are indeed acting on the same targets as the parent compound from which they are derived. The classic control experiment to address this point is to compare the labelling pattern of the ABP alone with that of the ABP in competition with an excess of its parent compound fed to live cells. If the ABP and parent compound are acting on the same protein targets, then the excess of parent compound should out-compete the ABP for target occupancy sites and as such diminish the labelling of the ABP in a concentration-dependent manner. This can be easily observed by a reduction in the in-gel fluorescence upon SDS-PAGE analysis (Figure 17).

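[Figure 17 schematic: a cancer cell line is treated with ABP only or with ABP + parent (competition) compound, followed by click chemistry, SDS-PAGE separation and in-gel fluorescence imaging. Example bands A1, A2, B and C are compared between the ‘ABP only’ and ‘ABP + competition compound’ lanes of the SDS-PAGE gel.]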

Figure 17. Summary of the competition-based assay between the ABP and parent compound. Targets of the ABP that are shared with the parent compound will show a reduction in the in-gel fluorescence to varying extents (A1 and A2) or a complete loss of in-gel fluorescence (B) relative to the ABP only sample. Off-target effects as a result of the introduction of the alkyne tag are not competed against by the parent compound and therefore show no change in ABP labelling (C).
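The readout summarised in Figure 17 can be expressed as a small calculation on band intensities. This is a sketch only: the band intensities are hypothetical, no densitometry was reported in this section, and the 10 % and 90 % cut-offs are arbitrary illustrative values rather than thresholds used in this work.

```python
def fold_excess(competitor_um, abp_um):
    """Molar excess of the competitor over the ABP (both in μM)."""
    if abp_um <= 0:
        raise ValueError("ABP concentration must be positive")
    return competitor_um / abp_um


def relative_labelling(abp_only_intensity, competed_intensity):
    """Band intensity in the competed lane as a percentage of the ABP-only lane."""
    return 100.0 * competed_intensity / abp_only_intensity


def classify_band(relative_percent, loss_cutoff=10.0, unchanged_cutoff=90.0):
    """Assign a band to the Figure 17 categories using illustrative cut-offs."""
    if relative_percent <= loss_cutoff:
        return "complete loss (B)"
    if relative_percent >= unchanged_cutoff:
        return "no competition (C)"
    return "partial reduction (A)"


# Illustrative concentrations: 150 μM competitor against 5 μM ABP.
excess = fold_excess(150, 5)

# Hypothetical band intensities (arbitrary units): (ABP only, ABP + competitor).
bands = {"band_1": (1000, 600), "band_2": (1000, 20), "band_3": (1000, 980)}
calls = {
    name: classify_band(relative_labelling(alone, competed))
    for name, (alone, competed) in bands.items()
}
```

Bands classified as unchanged would be candidates for off-target labelling introduced by the alkyne tag, whereas reduced or lost bands indicate targets shared with the parent compound.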

Previous work had established such competition assays for curcumin ABP 1 and sulforaphane ABP 2 in HeLa cells whereby a protocol of pre-incubating the parent compound to cells for 30 min prior to addition of the ABP and parent compound together for a further 30 min proved to be effective.326 These optimised competition assay conditions were thus employed for each of the seven ABPs against each of their representative parent compounds in the MDA-MB-231 cell line (Figure 18).

The labelling of all ABPs could be successfully competed in a concentration-dependent manner by their respective parent compounds. The three curcumin ABPs showed competition against both curcumin and PC, although it required a large excess of both parent compounds to achieve visible competition (30-fold excess) (Figure 18A-C). Even at such excesses, ABP labelling was still clearly visible (particularly for curcumin ABP 3). Although a clear reduction in the labelling of curcumin ABP 1 is visible upon competition with curcumin itself, this had not been observed earlier in the HeLa cell line under similar competition conditions (Figure 10B lanes 1-4). Competition against the reactive general alkylating agent, NEM, also caused a significant reduction in labelling for all three of the curcumin ABPs. No competition for labelling was observed when each of the curcumin ABPs was applied alongside THC. This observation supports the reasoning that the two α,β-unsaturated ketone motifs are fundamentally required for covalent protein modification by curcumin.

Likewise, the piperlongumine ABP was successfully competed against a concentration series of piperlongumine, eliminating almost all labelling of the ABP at a 50-fold excess (Figure 18D). NEM also completely eliminated all ABP labelling when in competition at 20-fold excess (Figure 18D lane 33). THP showed no apparent competition against the ABP. As with curcumin, this observation supports the notion that covalent labelling by piperlongumine is a result of its electrophilic motifs. Despite the structural diversity of the sulforaphane ABPs, all three ABPs could be successfully competed against a concentration gradient of the parent compound sulforaphane (Figure 18E-F). NEM completely eliminated all ABP labelling at 20-fold excess (Figure 18E lanes 40 and 47). Competition against IA at 20-fold excess also reduced ABP labelling, but not to the same extent as NEM (Figure 18E lanes 39 and 46), suggesting sulforaphane may share more target overlap with NEM than IA.

[Figure 18: in-gel fluorescence and Coomassie gel images, lanes 1-53, molecular weight markers 10-250 kDa. Panels: (A) 5 μM curcumin ABP 1, (B) 10 μM curcumin ABP 2 and (C) 5 μM curcumin ABP 3, each competed against NC and PC (25, 100 and 150 μM), THC (100 μM) and NEM (100 μM); (D) 2 μM piperlongumine ABP competed against PIP (10, 50 and 100 μM), THP (100 μM) and NEM (100 μM); (E) 5 μM sulforaphane ABP 1 and 5 μM sulforaphane ABP 2, each competed against SULF (5, 25, 100 and 200 μM), IA (100 μM) and NEM (100 μM); (F) 10 μM sulforaphane ABP 3 competed against SULF (20, 100, 200 and 400 μM) and NEM (100 μM).]

Figure 18. SDS-PAGE and in-gel fluorescence analysis of the in-cell competition experiments of ABPs against their respective parent compounds in the MDA-MB-231 cell line for curcumin ABP 1 (A), curcumin ABP 2 (B), curcumin ABP 3 (C), piperlongumine ABP (D), sulforaphane ABP 1 and 2 (E) and sulforaphane ABP 3 (F). Cells were treated with the competition compound only for 30 min, and subsequently with the competition compound and ABP together for a further 30 min. Cells were then lysed with whole cell protein lysis buffer, followed by CuAAC to append the AzT reagent and SDS-PAGE and protein visualisation by in-gel fluorescence. NC = curcumin, PC = mono-O-propylcurcumin, THC = tetrahydrocurcumin, NEM = N-ethylmaleimide, PIP = piperlongumine, THP = tetrahydropiperlongumine, SULF = D,L-sulforaphane, IA = iodoacetamide.

To assess the protein labelling profiles of the ABPs against a wider range of electrophiles than their parent compounds alone, in-cell competition assays were carried out against a small panel of other small molecule electrophiles including dimethyl fumarate, citral and benzoquinone (Appendix Figure 4). It was originally envisaged that such competition assays with the ABPs of curcumin, sulforaphane and piperlongumine would provide a platform for reading out differences in electrophilic reactivity against particular targets. Subsequently, interesting observations identified by the in-gel fluorescence platform could be followed up with MS-based proteomic identification and quantification. Aside from the parent compounds, which clearly showed competition against the ABPs as previously discussed, a clear loss in labelling as a result of competition was observed with only a small number of other electrophiles (e.g. NEM, IA and benzoquinone).

A large percentage of electrophilic compounds screened showed no apparent changes in ABP labelling upon competition that were visible by in-gel fluorescence (e.g. coniferyl aldehyde, piperine, dimethyl fumarate, resveratrol, citral, feruloyl acetone, maleic anhydride). These observations suggested that protein target overlap between different electrophiles may be fairly minimal. However, the in-gel fluorescence platform may itself be a limitation when observing subtle changes in ABP labelling. Hundreds of protein targets have been identified for the ABPs, and those of similar molecular weight will co-migrate when resolved on a gel. Each resolved band visible by in-gel fluorescence on the SDS-PAGE gel may therefore represent a number of protein targets. This makes it challenging to observe a change in ABP labelling of one protein, as it may be masked by a protein of similar molecular weight whose labelling does not change in response to competition. For more specific ABPs that may have a smaller relative number of protein targets, an in-gel fluorescence readout platform like this has been useful for observing ABP labelling differences upon competition, as has been reported by Cravatt and co-workers, amongst others.378-381 The results presented here suggest this was not feasible for the ABPs in hand.

Collectively, the majority of the synthesised ABPs are good surrogates for their respective parent compounds under the assay conditions described. It is apparent that not all the in-gel fluorescence can be out-competed by competition with the parent compound (even at 30-50-fold excesses). Of concern is the observation that for some of the ABPs, a number of protein bands visible by in-gel fluorescence remain at equal intensity across the entire concentration gradient of parent compound competition. It is these protein targets of the ABP that are unlikely to be genuine protein targets of their respective parent compounds and may well be an artefact of probe design. Proteomic identification of ABP targets reported in Chapter 4.1.3 identified hundreds, even thousands, of protein targets for the ABPs. However, at this stage, there is no way of determining which protein targets are competitive against parent compound and which ones are not. To address this, competition-based proteomic identification of the ABP alone and the ABP competed against an excess of parent compound was thus carried out.

For all reported applications of the ABPs described herein, curcumin ABP 1 was chosen as the representative ABP for curcumin, as curcumin ABP 2 and 3 offered no observable advantages over the original curcumin ABP 1. Piperlongumine ABP was shown to be more than sufficient as a probe for piperlongumine. Sulforaphane ABP 2 was selected as the representative ABP for sulforaphane. This was based on its more stable protein adduct formation in comparison to sulforaphane ABP 3 (Figure 15C+F, Appendix Figure 1). Although structurally different from sulforaphane, sulforaphane ABP 2 labelling was successfully out-competed by sulforaphane, confirming they have highly overlapping protein target sets (Figure 18E). The initial proteomic identification of targets of sulforaphane ABP 2 and 3 revealed target differences (Figure 16D) that suggest sulforaphane targets may be missed using the modified design of sulforaphane ABP 2. However, sulforaphane ABP 2 was readily available, and its robust labelling, which allayed fears of dithiocarbamate adduct instability, suggested it was the best sulforaphane probe to proceed with for chemical proteomics.

4.3 Proteomic target identification of ABPs competed against parent compound with duplex SILAC

Having established a competition-based assay of ABP and parent compound, it was next sought to identify the protein targets of each ABP that are competitive against an excess of the parent compound. In order to directly compare protein targets identified with the ABP from two different samples i) ABP only and ii) ABP with excess of parent compound, it is necessary to quantitatively compare protein target populations from each of these two separate samples.

There are a number of ways quantification can be introduced into a MS-based proteomics workflow (discussed in Chapter 1.2.5.4). The simplest is to use a ‘label-free’ quantification approach and compare the LFQ intensity of each sample from separate LC-MS/MS runs. However, this has the shortcoming that samples are processed and analysed separately, which compromises quantification accuracy, particularly given the number of processing steps in the chemical proteomic workflow. Given the requirement for accurate quantification of subtle or small changes between samples, a SILAC-based strategy was chosen for implementation in the chemical proteomic workflow. SILAC is a convenient and accurate way of introducing reliable and robust MS-based quantification.203, 382 It offers the specific advantage that the quantification component can be introduced early on in the workflow, thereby reducing the experimental error introduced over multiple processing steps within a chemical proteomic workflow of this kind.

The principle of SILAC is that it involves growing two populations of cells, one in a medium containing a ‘light’ amino acid and another in a medium containing a ‘heavy’ amino acid. The ‘heavy’ amino acid may contain ²H instead of ¹H, ¹³C instead of ¹²C and ¹⁵N instead of ¹⁴N and, unlike radiolabelling, all isotopes used in SILAC are stable. Incorporation of the ‘heavy’ amino acid into a peptide leads to a known and defined mass shift compared to the peptide containing the ‘light’ version of the amino acid, detectable in the MS1 spectrum on the mass spectrometer, but makes no other chemical changes to the peptide in question. SILAC distinguishes two proteomes by the ‘light’ and ‘heavy’ amino acid and requires complete labelling of the cellular system (the SILAC amino acid in all proteins should be replaced). This is achieved by passaging growing cells through a number of cell doublings to ensure almost complete incorporation, even for proteins with no significant turnover. In a simple experiment comparing two proteomes grown separately in ‘light’ (population A) and ‘heavy’ (population B) media, cells are mixed, their proteomes extracted and digested into peptides, followed by MS analysis. Every peptide appears as a pair in the mass spectra, with the ‘light’ peptide originating from population A and the ‘heavy’ peptide originating from population B. If the SILAC peptide pair appears with a ‘heavy’:‘light’ (H/L) intensity ratio of 1:1, then there is no difference in abundance of this protein between the two populations. However, if the intensity of the ‘heavy’ peptide is greater than that of its ‘light’ counterpart, the protein is more abundant in population B than it is in population A.
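The H/L readout described above can be illustrated with a minimal sketch. It assumes that a protein-level ratio is summarised as the median of its peptide-level H/L ratios, which is a common convention rather than a detail taken from this work; the function names and intensity values are hypothetical.

```python
from statistics import median


def hl_ratio(heavy_intensity: float, light_intensity: float) -> float:
    """Return the 'heavy'/'light' intensity ratio of one SILAC peptide pair."""
    if light_intensity <= 0:
        raise ValueError("light intensity must be positive")
    return heavy_intensity / light_intensity


def protein_hl_ratio(peptide_pairs) -> float:
    """Summarise a protein as the median H/L ratio of its peptide pairs.

    peptide_pairs is an iterable of (heavy, light) intensities.
    Using the median is an assumption made for this sketch.
    """
    return median(hl_ratio(h, l) for h, l in peptide_pairs)


# Hypothetical intensities for three peptide pairs of one protein.
pairs = [(2.0e6, 1.0e6), (3.0e6, 1.5e6), (1.0e6, 0.4e6)]
ratio = protein_hl_ratio(pairs)
# A ratio above 1 means the protein is more abundant in the 'heavy' population.
print(f"protein H/L = {ratio:.2f}")
```

A ratio of 1 would indicate equal abundance in populations A and B, whereas a value below 1 would indicate greater abundance in the ‘light’ population.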

To establish the MDA-MB-231 cell line in the ‘light’ and ‘heavy’ media, a population of MDA-MB-231 cells was first grown in an R0K0-containing Dulbecco modified Eagle’s medium (DMEM) (‘light’) or an R10K8-containing DMEM (‘heavy’) supplemented with dialysed foetal bovine serum (FBS) for 6-8 cell passages. The R0K0 medium contains ¹⁴N₄¹²C₆-arginine and ¹⁴N₂¹²C₆-lysine, whereas the R10K8 medium contains ¹⁵N₄¹³C₆-arginine and ¹⁵N₂¹³C₆-lysine. After sufficient cell passages, the incorporation of the ‘heavy’ arginine and lysine amino acids within the proteome of the R10K8 MDA-MB-231 cell line was confirmed to be > 98 % by MS-based proteomics of a whole protein lysate (Appendix Table 1). The MDA-MB-231 cell line showed no detrimental effects on cell growth rate or cellular morphology in either the ‘heavy’ or ‘light’ media, highlighting the compatibility of a SILAC-based approach with this breast cancer cell line.
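As a worked illustration of the defined mass shift produced by the R10K8 labels, the sketch below estimates the m/z separation expected between the ‘light’ and ‘heavy’ forms of a peptide. The monoisotopic shifts used (+10.00827 Da for ¹³C₆¹⁵N₄-arginine and +8.01420 Da for ¹³C₆¹⁵N₂-lysine) are standard values rather than figures quoted in this work, and the function name and example peptide are hypothetical.

```python
# Monoisotopic mass added per labelled residue (Da); standard values for
# 13C6 15N4-arginine (R10) and 13C6 15N2-lysine (K8).
HEAVY_SHIFT_DA = {"R": 10.00827, "K": 8.01420}


def silac_mz_shift(peptide: str, charge: int) -> float:
    """Return the m/z separation between the 'light' and 'heavy' peptide."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    mass_shift = sum(HEAVY_SHIFT_DA.get(residue, 0.0) for residue in peptide)
    return mass_shift / charge


# Hypothetical tryptic peptide ending in lysine, observed as a 2+ ion.
shift = silac_mz_shift("LVNELTEFAK", charge=2)
print(f"expected separation: {shift:.4f} m/z")
```

For a doubly charged peptide containing a single lysine, the pair is therefore expected to be separated by roughly 4 m/z units, corresponding to a mass difference of about 8 Da.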

In order to quantitatively compare the targets of the ABP alone (population A) and the ABP in competition with the parent compound (population B) in a single MS run, a duplex SILAC experiment was carried out (Figure 19A). Three combinations were tested; curcumin ABP 1 competed against a 10-fold excess of PC, sulforaphane ABP 2 competed against a 20-fold excess of sulforaphane and piperlongumine ABP competed against a 50-fold excess of piperlongumine. PC was chosen as the competitor over curcumin for curcumin ABP 1 as it showed more reproducible and effective competition against the ABP. Plates of MDA-MB-231 cells containing the ‘light’ media were treated with ABP only for 30 min. Plates of MDA-MB-231 cells containing the ‘heavy’ media were treated with parent compound only for 30 min, followed by ABP and parent compound together for a further 30 min (the exception to this was sulforaphane where the ‘heavy’ and ‘light’ samples were switched – see Figure 19B). Cells were then lysed separately, the protein concentration determined and the ‘heavy’ and ‘light’ lysates combined together in a 1:1 ratio with a total protein amount of 400 μg. Samples were then subjected to CuAAC with AzTB, followed by sample processing, affinity enrichment and washing steps as reported previously. Target proteins on the resin were reduced, alkylated, and digested with trypsin overnight. Trypsin is a serine protease that cleaves proteins on the carboxyl side of the amino acids arginine and lysine within proteins. This therefore ensures that almost all peptides produced contain either the ‘heavy’ or ‘light’ amino acid label and can be used for quantification. The resulting peptides were then subjected to LC-MS/MS analysis.
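The consequence of trypsin specificity for SILAC quantification can be sketched with a simple in-silico digest. The example assumes the commonly applied rule that trypsin cleaves after arginine or lysine except when the next residue is proline; the proline exception, the function names and the example sequence are assumptions made for illustration and are not taken from this work.

```python
import re


def trypsin_digest(sequence: str) -> list[str]:
    """Split a protein sequence after every K or R, except before P."""
    pieces = re.split(r"(?<=[KR])(?!P)", sequence)
    return [piece for piece in pieces if piece]


def carries_silac_label(peptide: str) -> bool:
    """Return True if the peptide contains at least one arginine or lysine."""
    return any(residue in "KR" for residue in peptide)


# Hypothetical sequence; the C-terminal 'GG' peptide has no K or R.
peptides = trypsin_digest("MKWVTFRPLLKAARGG")
for peptide in peptides:
    status = "quantifiable" if carries_silac_label(peptide) else "no label"
    print(peptide, status)
```

In this sketch only the C-terminal peptide of the protein lacks a labelled residue, which is consistent with the statement that almost all tryptic peptides can be used for quantification.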

As a result of the SILAC incorporation, every protein target of the ABP identified by MS will be present in a ‘light’ and a ‘heavy’ version. This allows a H/L intensity ratio to be calculated for each peptide, which corresponds to the relative abundance of the identified protein in the two populations. Presuming that the ABP only sample is the ‘light’ version and the ABP with excess of parent compound is the ‘heavy’ version, if the parent compound does not bind to an ABP protein target then the ABP labelling of that target will be the same in both samples, resulting in a H/L ratio = 1. However, if the parent compound and the ABP bind to the same protein target, then a reduction in the intensity of the ‘heavy’ peptide coming from the ABP in competition with the parent compound should occur. This will result in a H/L ratio < 1. A H/L threshold of ≤ 0.66, corresponding to a 1.5-fold difference between the two populations, was set as a biologically significant cut-off for effective competition between the parent compound and the ABP. This ensures that only identified proteins falling below this threshold are assigned as genuine targets of the dietary-based electrophile.383 The MaxQuant software was used for data analysis as it is highly capable of detecting thousands of SILAC pairs (H/L ratios) in a single LC-MS/MS run, resulting in high confidence peptide and protein quantification.199, 374, 384
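The cut-off logic described above can be written as a short sketch. The thresholds (H/L ≤ 0.66 when the ABP only sample is ‘light’, and H/L > 1.5 when the labels are swapped, as described for sulforaphane in the Figure 19 caption) are taken from the text; the function name, argument names and example ratios are hypothetical.

```python
LIGHT_ABP_ONLY_MAX_HL = 0.66  # ABP only = 'light', ABP + parent = 'heavy'
HEAVY_ABP_ONLY_MIN_HL = 1.5   # label-swapped design (ABP only = 'heavy')


def is_competed(hl_ratio: float, abp_only_label: str = "light") -> bool:
    """Return True if the H/L ratio indicates competition by the parent compound."""
    if abp_only_label == "light":
        return hl_ratio <= LIGHT_ABP_ONLY_MAX_HL
    if abp_only_label == "heavy":
        return hl_ratio > HEAVY_ABP_ONLY_MIN_HL
    raise ValueError("abp_only_label must be 'light' or 'heavy'")


# Hypothetical ratios for illustration only.
print(is_competed(0.32))                          # strongly competed target
print(is_competed(1.00))                          # labelling unchanged
print(is_competed(4.10, abp_only_label="heavy"))  # competed, swapped labels
```

Keeping the label orientation as an explicit argument avoids misassigning targets when the ‘heavy’ and ‘light’ samples are switched between experiments.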

Using such analysis, 120 protein targets for curcumin/PC, 112 protein targets for sulforaphane and 403 protein targets for piperlongumine were identified (Figure 19B). KEAP1 was once again identified as a conserved target across all three compounds (Figure 19C). The number of ‘razor+unique’ peptides identified for KEAP1 was 7, 7 and 13 for curcumin, piperlongumine and sulforaphane respectively, providing multiple peptides for robust quantification by MS. Based on the calculated H/L ratios, KEAP1 was 3.1-fold, 6.1-fold and 4.1-fold more enriched in the ABP only sample relative to the sample of ABP in competition with parent compound. This confirmed KEAP1 as a genuine target of the three compounds. The overlap of the protein targets between the three compounds again shows that each compound has its own unique target profile (Figure 19C).

These results clearly highlight the potential for using a competition-based quantitative chemical proteomics workflow to profile the protein targets of the three compounds under study inside live cells. A significant proportion of targets (> 25 %) identified for the ABP could not be sufficiently competed for labelling by their parent compound counterpart. These may well be off-targets as a result of the ABP design or natively biotinylated proteins and/or non-specific binders to the Neutravidin sepharose resin that commonly occur as background in these types of workflow. The competition-based nature of the workflow helps to filter out such false positive target identifications, therefore providing the highest confidence in-cell target identifications for the three compounds to date.

However, there are limitations to the workflow that need to be addressed. Firstly, there is a significant reduction in the number of protein target identifications in comparison to the initial LFQ protein identifications of the ABP targets by MS (Chapter 4.1.3). The more stringent filtering (based on the H/L cut-off threshold) and the reduced starting protein scale (200 μg) of the duplex SILAC experiment may account for the reduced protein identifications relative to the initial ABP only proteomics experiment. Secondly, the multiplicity of the experiment is a concern: with only a single biological data point, it is not possible to determine whether target identifications are reproducible and therefore more significant. Thirdly, only a single concentration of parent compound was used in competition with the ABP. Therefore, although these initial protein target identifications provide a number of novel targets worthy of further study, further optimisation of the workflow would allow for a more comprehensive global profile to be obtained.

[Figure 19: (A) workflow schematic. R0K0 (‘light’, ABP only) and R10K8 (‘heavy’, ABP + parent compound) cells are lysed and the lysates mixed in a 1:1 ratio, followed by click chemistry, affinity enrichment and trypsin digest to produce a peptide mixture, then LC-MS/MS analysis (MS for protein quantification, MS/MS for protein identification; a Lys-containing peptide pair shows an 8 Da shift). A ‘heavy’/‘light’ intensity (H/L) ratio is generated for each peptide that is assigned to the protein identification. (B) Filtering of identifications. 5 μM curcumin ABP 1 with 50 μM PC: 433 proteins, 431 with > 2 ‘razor+unique’ peptides, 120 with H/L ratio < 0.66. 5 μM sulforaphane ABP 2 with 100 μM sulforaphane: 152 proteins, 151 with > 2 ‘razor+unique’ peptides, 112 with H/L ratio > 1.5. 2 μM piperlongumine ABP with 100 μM piperlongumine: 556 proteins, 549 with > 2 ‘razor+unique’ peptides, 403 with H/L ratio < 0.66. (C) Venn diagram of curcumin (120), piperlongumine (403) and sulforaphane (112) targets: curcumin only 24, piperlongumine only 271, sulforaphane only 41, curcumin and piperlongumine 62, curcumin and sulforaphane 1, piperlongumine and sulforaphane 37, all three 33. Important protein targets highlighted: KEAP1, GSTO1, STAT3 and EGFR.]

Figure 19. In-cell, competition-based chemical proteomics identification of the targets of curcumin, piperlongumine and sulforaphane using a duplex SILAC quantitative proteomics setup. (A) The experimental workflow for quantitative comparison of ABP targets between ABP only and ABP competed against excess of parent compound. (B) Proteomic data analysis and filtering to determine the 120, 112 and 403 protein target identifications for curcumin, sulforaphane and piperlongumine respectively. The sulforaphane sample was reversed such that the ABP only was the ‘heavy’ sample and ABP competed against sulforaphane was the ‘light’ sample. Therefore a H/L ratio > 1.5 is indicative of a genuine sulforaphane target in this case. (C) Venn diagram showing the protein target overlap between identified curcumin, sulforaphane and piperlongumine targets. A small number of individual targets are highlighted. The labelling of the ABP for each sample prior to the proteomic identification was visualised by in-gel fluorescence as in Appendix Figure 5.
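As a consistency check on the overlap shown in Figure 19C, the sketch below sums Venn region counts to recover the per-compound totals. The assignment of the seven counts to regions is an interpretation of the figure layout (24 curcumin only, 271 piperlongumine only, 41 sulforaphane only, 62 curcumin and piperlongumine, 1 curcumin and sulforaphane, 37 piperlongumine and sulforaphane, 33 shared by all three) and should be treated as an assumption; the variable names are hypothetical.

```python
# Region counts keyed by the set of compounds sharing those targets.
# The region assignment is an assumed reading of the Venn diagram.
regions = {
    frozenset({"curcumin"}): 24,
    frozenset({"piperlongumine"}): 271,
    frozenset({"sulforaphane"}): 41,
    frozenset({"curcumin", "piperlongumine"}): 62,
    frozenset({"curcumin", "sulforaphane"}): 1,
    frozenset({"piperlongumine", "sulforaphane"}): 37,
    frozenset({"curcumin", "piperlongumine", "sulforaphane"}): 33,
}

compounds = ("curcumin", "piperlongumine", "sulforaphane")

# A compound's total is the sum of every region that includes it.
totals = {
    compound: sum(n for members, n in regions.items() if compound in members)
    for compound in compounds
}
print(totals)
```

Under this reading the region counts sum to the 120, 403 and 112 targets reported for curcumin, piperlongumine and sulforaphane respectively.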

4.4 In-lysate competition of ABPs against parent compounds and other electrophiles

All synthesised ABPs are cell-permeable, permitting target profiling inside live cells for curcumin, sulforaphane and piperlongumine. However, many previous approaches have reported the application of probes of similar compounds to extracted cell protein lysates.385-387 For probes or compounds that are cell impermeable, this is often the approach employed for protein target identification. This is particularly the case for resin-immobilised small molecule affinity pulldowns (Chapter 1.2.2.1) and even for biotinylated probes, which also often have limited cell permeability. Target identification on protein lysates has been reported for all three compounds under study here.123, 133, 295, 388 It was therefore of interest to apply the synthesised ABPs to cell lysate to allow a comparison to be made between so-called in-cell profiling and in-lysate profiling (Figure 20A).

MDA-MB-231 cell protein lysates were prepared using a referenced protocol and the seven ABPs applied at 37 ˚C to the protein lysates for 30 min.164 After treatment, the protein was precipitated to remove excess ABP and re-suspended in protein re-suspension buffer. The samples were then subjected to CuAAC with AzT followed by protein precipitation, and preparation of samples for SDS-PAGE analysis. The in-gel fluorescence labelling revealed, in a similar manner to in-cell labelling, that all ABPs labelled at low μM concentrations (Figure 20B). However, there were noticeable differences between the in-lysate and in-cell labelling patterns of the ABPs (compare Figure 20B with Figure 14). Different cell lysis conditions were employed for in-cell and in-lysate samples, resulting in different protein lysate composition, which may explain these discrepancies. In the preparation of protein lysates for in-lysate labelling, proteins must be extracted under non-denaturing conditions (low detergent buffers) in order to preserve their native conformation so as not to disrupt ABP-protein interactions. For in-cell labelling, by contrast, the ABP has already covalently bound its protein target, and therefore detergent-based lysis buffers (containing SDS) can be utilised that result in better extraction of difficult to solubilise proteins. However, what was also apparent for the in-lysate labelling was the generic labelling band pattern of all the ABPs, which indicated that target specificity may have been lost for in-lysate labelling (Figure 20B, particularly 25-75 kDa).

In light of this, competition-based assays for a representative ABP of each of curcumin, piperlongumine and sulforaphane against their respective parent compounds were carried out (Figure 20C, Appendix Figure 6). For the most part, this showed that the ABPs could be competed against their parent compounds so the protein targets of the ABPs are specific targets for the parent compounds in lysates. Heat-denatured protein lysates were also prepared and treated with the ABPs in comparison to non-heat-denatured lysates (Figure 20D, Appendix Figure 7). Protein labelling by the ABP should be eliminated if the ABP is binding to its target in an activity-dependent manner as heat denaturation results in a loss of higher order structure and functionality of the proteome. However, it was observed that heat denaturation actually caused an increase in protein labelling as visualised by in-gel fluorescence (Figure 20D lanes 2, 4 and 6).

Taken together, this suggested that the protein labelling profiles of the ABPs for curcumin, sulforaphane and piperlongumine in protein lysates are very different to those when the ABP is applied to intact, live cells. In-lysate labelling of such small, reactive electrophilic ABPs also appeared to result in a generic protein labelling pattern regardless of the ABP applied that was not observed for in-cell labelling. These observations therefore prompted the comparison between in-cell and in-lysate target identification of the ABPs by MS-based proteomics to ascertain whether the profiling environment may significantly alter the identified target sets for the three compounds.

[Figure 20: (A) schematic contrasting in-cell profiling (ABP +/- compound treatment of MDA-MB-231 cells, then cell lysis) with in-lysate profiling (cell lysis, then ABP +/- compound treatment of the lysate), both followed by functionalisation with AzT via CuAAC, SDS-PAGE analysis and in-gel fluorescence imaging. (B) Gel, lanes 1-7: the three curcumin ABPs (5 μM each), piperlongumine ABP (1 μM) and the three sulforaphane ABPs (4, 4 and 15 μM). (C) Gel, lanes 1-11: DMSO; 4 μM sulforaphane ABP 2 alone or with SULF (400 μM) or NEM (200 μM); 1 μM piperlongumine ABP alone or with PIP (100 μM) or NEM (100 μM); 5 μM curcumin ABP 1 alone or with NC, PC or NEM (200 μM each). (D) Gel, lanes 1-8: DMSO, 4 μM sulforaphane ABP 2, 1 μM piperlongumine ABP and 5 μM curcumin ABP 1, each with native (-) or heat-denatured (Δ) lysate. In-gel fluorescence and Coomassie images; molecular weight markers 15-250 kDa.]

Figure 20. In-lysate protein target profiling of curcumin, sulforaphane and piperlongumine. (A) The distinction between in-cell target profiling and in-lysate profiling. (B) In-gel fluorescence image of the labelling of MDA-MB-231 protein lysate by all 7 ABPs for 30 min. (C) In-gel fluorescence image of a representative ABP of curcumin, piperlongumine and sulforaphane competed against their respective parent compounds and NEM in MDA-MB-231 protein lysates. NEM completely eliminated all labelling by the ABPs. (D) In-gel fluorescence image of a representative ABP of curcumin, piperlongumine and sulforaphane +/- heat denatured (Δ) protein lysate. CURC/NC = curcumin, PIP = piperlongumine, SULF = sulforaphane, PC = mono-O-propylcurcumin and NEM = N-ethylmaleimide.

4.5 Proteomic target comparison of ABPs competed against parent compound in-lysate and in-cell with duplex SILAC

A proteomic experiment was therefore set up to identify the overlapping protein target profiles of the ABPs of each of the three compounds under study under three different experimental setups involving application of the ABPs and parent compounds using the MDA-MB-231 cell line. These were i) in-cell labelling, ii) in-lysate labelling and iii) in-cell/lysate labelling. The two former methods have been discussed previously. The in-cell/lysate labelling approach involves application of the parent compound in-cell followed by application of the corresponding ABP in-lysate. A very similar approach has been used by Cravatt and co-workers to profile lipid-derived electrophiles (4-HNE and 15-PGJ2) treated in-cell followed by subsequent reactive cysteine profiling with an alkyne-functionalised IA (IA-Alk).79

As previously, curcumin ABP 1 was used as the representative ABP for curcumin, sulforaphane ABP 2 for sulforaphane and the piperlongumine ABP for piperlongumine. Each ABP was applied either alone, or in competition with an excess of parent compound. All samples were carried out in duplicate to ensure reproducibility of the observations. All cells were lysed under identical lysis conditions using a lysate labelling-compatible buffer (25 mM HEPES pH 7.5, 150 mM NaCl, 2 mM MgCl2, 0.1 % NP-40) to ensure the whole proteomes extracted across the entire experiment were identical, therefore allowing a direct comparison of target profiles between the three different experimental setups. A duplex SILAC quantification strategy was again utilised for robust quantification between samples, with an overview of the experimental design presented in Figure 21.

[Figure 21 schematic. Treatment scheme for each duplex SILAC pair (R0K0 ‘light’ / R10K8 ‘heavy’):
In-cell labelling: ‘light’ cells treated with ABP; ‘heavy’ cells treated with ABP + parent compound; no lysate treatment.
In-lysate labelling: no cell treatment; ‘light’ lysate treated with ABP + parent compound; ‘heavy’ lysate treated with ABP.
In-cell/lysate labelling: ‘light’ cells treated with parent compound and ‘heavy’ cells treated with DMSO; both lysates treated with ABP.
Each SILAC pair combined at the lysate-level (1:1 ratio), followed by CuAAC, affinity purification, trypsin digest and LC-MS/MS analysis, then protein target filtering (‘razor+unique’ peptides > 2 and H/L ratio > 1.5-fold change).
Protein targets identified in-cell: curcumin 155, piperlongumine 358, sulforaphane 126. In-lysate: curcumin 117, piperlongumine 317, sulforaphane 319. In-cell/lysate: curcumin 19, piperlongumine 24, sulforaphane 24.]

Figure 21. Experimental setup for the quantitative competition-based chemical proteomics workflow to compare the in-cell, in-lysate and in-cell/lysate target profiles of curcumin, piperlongumine and sulforaphane in the MDA-MB-231 cell line. All experimental conditions under study were prepared in duplicate and lysed under identical conditions for all samples. The concentrations of compounds were as follows: piperlongumine ABP (2 μM), in-cell piperlongumine (100 μM), in-lysate piperlongumine (100 μM); sulforaphane ABP (5 μM), in-cell sulforaphane (100 μM), in-lysate sulforaphane (150 μM); curcumin ABP (5 μM), in-cell curcumin (100 μM), in-lysate curcumin (150 μM).

For in-cell labelling experiments, ABP only samples were prepared by treating live MDA-MB-231 cells grown in ‘light’ medium with DMSO vehicle for 30 min, followed by the ABP for a further 30 min. ABP and parent compound samples were prepared by treating live MDA-MB-231 cells grown in ‘heavy’ medium with parent compound for 30 min, followed by the ABP and parent compound together for a further 30 min. Cells were washed with PBS, followed by cell lysis and protein concentration determination. For in-lysate labelling experiments, ‘heavy’ and ‘light’ proteomes of MDA-MB-231 cells were generated. The ‘heavy’ lysate was treated with DMSO vehicle for 40 min, followed by ABP for a further 20 min. The ‘light’ lysate was treated with parent compound for 40 min, followed by ABP for a further 20 min. Following compound treatment, lysates were immediately precipitated to remove excess compound, followed by re-suspension and re-determination of the protein concentration. For in-cell/lysate labelling experiments, ABP only samples were prepared by treating live MDA-MB-231 cells grown in ‘heavy’ medium with DMSO vehicle for 30 min, followed by cell lysis and treatment of the ‘heavy’ lysate with ABP for 20 min. ABP and parent compound samples were prepared by treating live MDA-MB-231 cells grown in ‘light’ medium with parent compound for 30 min, followed by cell lysis and treatment of the ‘light’ lysate with ABP for 20 min. Following compound treatment, lysates were again immediately precipitated to remove excess compound, followed by re-suspension and re-determination of the protein concentration.

After compound feeding described above, all ABP only and ABP with parent compound pairs from each of the experimental setups (in-cell, in-lysate and in-cell/lysate) were combined together in a 1:1 ratio (150 μg of ‘light’ lysate and 150 μg of ‘heavy’ lysate). The resulting samples were then subjected to CuAAC with the AzTB reagent, followed by sample processing, affinity enrichment, resin washing steps, reduction and alkylation and trypsin digestion as reported previously. The resulting peptides were then subjected to LC-MS/MS analysis followed by subsequent data analysis. Proteins identified were filtered to only those identified across duplicates, requiring at least two ‘razor+unique’ peptides and a 1.5-fold change in the H/L ratio corresponding to effective competition of ABP labelling by the parent compound. In this way, 155, 358 and 126 targets for curcumin, piperlongumine and sulforaphane for in-cell labelling; 117, 317 and 319 targets for curcumin, piperlongumine and sulforaphane for in-lysate labelling; and 19, 24 and 24 targets for curcumin, piperlongumine and sulforaphane for in-cell/lysate labelling were identified. The overlap of protein targets from in-cell and in-lysate labelling for each of the three compounds (Figure 22D-F) and for each of the three compounds against one another from in-cell and in-lysate labelling (Figure 22G-H) is shown.
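The filtering criteria described above can be sketched as follows. The sketch assumes that each replicate is available as a mapping from protein name to a pair of (‘razor+unique’ peptide count, raw H/L ratio), and that the raw ratio is first oriented so that values above 1 always mean more labelling in the ABP only sample; the data layout, function names, protein names and values are all hypothetical.

```python
def competition_fold(hl_ratio: float, abp_only_label: str) -> float:
    """Orient a raw H/L ratio as ABP only / (ABP + parent compound)."""
    if abp_only_label == "heavy":
        return hl_ratio
    if abp_only_label == "light":
        return 1.0 / hl_ratio
    raise ValueError("abp_only_label must be 'light' or 'heavy'")


def filter_targets(replicates, abp_only_label, min_peptides=2, min_fold=1.5):
    """Keep proteins that pass both criteria in every replicate."""
    passing = [
        {
            protein
            for protein, (n_peptides, hl_ratio) in replicate.items()
            if n_peptides >= min_peptides
            and competition_fold(hl_ratio, abp_only_label) >= min_fold
        }
        for replicate in replicates
    ]
    return set.intersection(*passing) if passing else set()


# Hypothetical duplicate data: protein -> (razor+unique peptides, H/L ratio).
replicate_1 = {"PROT_A": (7, 0.32), "PROT_B": (1, 0.20), "PROT_C": (5, 0.90)}
replicate_2 = {"PROT_A": (6, 0.40), "PROT_B": (3, 0.25), "PROT_C": (4, 0.50)}

targets = filter_targets([replicate_1, replicate_2], abp_only_label="light")
print(sorted(targets))
```

Here only PROT_A is retained: PROT_B fails the peptide requirement in the first replicate and PROT_C is not sufficiently competed in the first replicate. Whether the boundary case of exactly 1.5-fold is included is a choice made for this sketch.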

[Figure 22: (A-C) gel images (in-gel fluorescence and Coomassie, lanes 1-12, molecular weight markers 10-250 kDa) for in-cell, in-lysate and in-cell/lysate samples, each with ABP alone and ABP + parent compound (+NC, +PIP or +SFN). (D) Curcumin targets in-cell (155) and in-lysate (117): 112 in-cell only, 43 shared, 74 in-lysate only. (E) Piperlongumine targets in-cell (357) and in-lysate (316): 222 in-cell only, 135 shared, 181 in-lysate only. (F) Sulforaphane targets in-cell (126) and in-lysate (319): 34 in-cell only, 92 shared, 227 in-lysate only. (G) Cell labelling overlap: curcumin only 33, piperlongumine only 203, sulforaphane only 29, curcumin and piperlongumine 61, curcumin and sulforaphane 4, piperlongumine and sulforaphane 36, all three 57. (H) Lysate labelling overlap: curcumin only 11, piperlongumine only 129, sulforaphane only 123, curcumin and piperlongumine 11, curcumin and sulforaphane 20, piperlongumine and sulforaphane 101, all three 75.]

Figure 22. (A-C) In-gel fluorescence images of the samples confirming the labelling patterns of the ABP in each of the three methodologies applied (in-cell, in-lysate and in-cell/lysate) for each of the three compounds under study, curcumin (A), piperlongumine (B) and sulforaphane (C). (D-F) Venn diagrams showing the overlap of identified protein targets between in-cell profiling and in-lysate profiling for curcumin (D), piperlongumine (E) and sulforaphane (F). (G-H) Venn diagrams showing the overlap of protein targets identified from in-cell (G) and in-lysate profiling (H). The concentrations of compounds were as follows: piperlongumine ABP (2 μM), in-cell piperlongumine (100 μM), in-lysate piperlongumine (100 μM); sulforaphane ABP (5 μM), in-cell sulforaphane (100 μM), in-lysate sulforaphane (150 μM); curcumin ABP (5 μM), in-cell curcumin (100 μM), in-lysate curcumin (150 μM). NC = curcumin, PIP = piperlongumine and SULF = sulforaphane.

For curcumin and piperlongumine, carrying out protein profiling in-lysate as opposed to in-cell offers no significant advantage in identifying more potential protein targets. The overlap in targets, however, confirms suspicions that in-cell and in-lysate profiling provide markedly different target profiles, with 112 and 222 targets for curcumin and piperlongumine respectively identified by in-cell but not in-lysate profiling. Included here was KEAP1, which was identified only as an in-cell target for curcumin and piperlongumine. Furthermore, in-lysate labelling reveals 74 and 181 target identifications for curcumin and piperlongumine respectively that are not identified by in-cell profiling. These discrepancies are concerning. It is important to note that the ABP design is not at fault for the differences in target profile between in-cell and in-lysate profiling, as competition-based assays are employed which identify only parent compound-competitive ABP targets. Alterations in pH, compartmentalisation, protein localisation, redox potential and cellular buffering in the protein lysate environment, relative to the native environment of intact cells, collectively give rise to the striking target profile differences between in-cell and in-lysate labelling of curcumin and piperlongumine. Sulforaphane shows a similar pattern to its curcumin and piperlongumine counterparts but to a lesser extent, with in-lysate labelling providing a roughly 2.5-fold increase (319 versus 126) in the number of targets identified in comparison to in-cell labelling. However, at this stage it remains unclear whether the additional 227 targets identified by in-lysate labelling are genuine in-cell targets or merely artefacts of labelling under lysate conditions, as has been observed for curcumin and piperlongumine.
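The in-cell versus in-lysate comparison amounts to simple set arithmetic on the two target lists. The sketch below uses placeholder protein identifiers, chosen only so that the set sizes reproduce the curcumin counts of Figure 22D; the function name `overlap_counts` is introduced here for illustration.

```python
def overlap_counts(in_cell: set, in_lysate: set) -> dict:
    """Return the three regions of a two-set Venn diagram."""
    return {
        "in_cell_only": len(in_cell - in_lysate),
        "shared": len(in_cell & in_lysate),
        "in_lysate_only": len(in_lysate - in_cell),
    }


# Placeholder identifiers P0..P228; only the set sizes mirror the curcumin data.
curcumin_in_cell = {f"P{i}" for i in range(155)}          # 155 in-cell targets
curcumin_in_lysate = {f"P{i}" for i in range(112, 229)}   # 117 in-lysate targets

counts = overlap_counts(curcumin_in_cell, curcumin_in_lysate)
print(counts)
```

With real data, the two sets would hold the protein identifiers that passed the competition filter under each labelling condition.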

It would be wrong to claim that all targets that appear only from in-lysate profiling, and that cannot be confirmed in-cell, are false positives that should be discarded. Many of these targets may indeed be genuine and/or biologically significant but may have evaded detection in the comparative in-cell labelling experiments (for example through rapid cellular turnover, cellular instability and/or cellular inaccessibility due to time constraints). Although identical cell lysis conditions were employed for both in-cell and in-lysate experiments, the low detergent component of the lysis buffer (0.1 % NP-40) may well interfere with ABP target binding; detergent-free buffers should therefore be explored to determine whether NP-40, even at such low amounts, interferes with protein target binding. However, given the results presented here, it is fair to state that in-lysate profiling may be misleading in representing the true identity of in-cell engaged targets. Such discrepancies between in-cell and in-lysate targets of compounds have been reported by others but have not proved to be a deterrent to the many groups who continue to carry out target identification on lysates.389 The ABPs applied here, and their respective parent compounds, are reactive small molecule electrophiles whose activity seems to be driven through their electrophilic moieties. Other reported ABPs contain less reactive or more finely tuned electrophilic motifs that rely on non-covalent recognition prior to covalent binding at the electrophilic site. Such ABPs may give more comparable target profiles on cells and protein lysates.164, 328, 381 However, for curcumin, piperlongumine and sulforaphane, these results highlight the necessity of performing in-cell profiling experiments as opposed to the in-lysate profiling that has been reported previously.123, 133, 295, 296

The lack of genuine target identifications for the in-cell/lysate labelling was not due to a lack of targets identified with the ABP under such conditions, but to a lack of competition against parent compound for all three compounds under investigation. This is evident from the in-gel fluorescence upon SDS-PAGE analysis (Figure 22A-C lanes 9-10 versus 11-12). As alluded to earlier, Cravatt and co-workers have used a highly analogous experimental setup, which was mirrored in this study, with great success (discussed further in Chapter 1.2.7.2).79 Further optimisation may be required. The advantage of the approach is that it could allow the profiling of curcumin, sulforaphane and piperlongumine without the need to design and synthesise an ABP tool compound for each and every compound. However, the approach relies on the pan-reactive alkyne-functionalised tool to capture all targets within the lysate that the electrophilic natural product engages in-cell. Even for a generally reactive and unselective alkylating agent like IA, this is not easy to achieve. The results above have already shown the drastic differences in protein labelling between lysate and cellular environments. One could circumvent these issues by applying both the compound of interest and the pan-reactive tool to intact, live cells. However, earlier observations indicate that the in-cell protein target set of IA does not encompass all of the targets of the three compounds, even when dosed at large excesses (Figure 18E and Appendix Figure 4). The results of the competition assays suggest that NEM is a better pan-reactive electrophile than IA: it appears to encompass the majority of the target set of curcumin, sulforaphane and piperlongumine, as it eliminated labelling by all ABPs when applied in excess as a competitor in both in-cell and in-lysate labelling experiments (Figure 18 and Figure 20C).

Further studies are required to elucidate whether an NEM tool molecule could successfully capture all protein targets in-lysate that curcumin, sulforaphane and piperlongumine engage in-cell, if an in-cell/lysate profiling experimental setup could be established. Furthermore, questions remain about the functional impact of applying a potentially toxic and reactive agent like IA or NEM to the cell. This may perturb the cellular system and alter the true protein target profile of the compound being profiled.

It is for these reasons that the design and application of ABPs of the parent compound are the best available tools for target profiling at present. Although further work is clearly warranted, our observations advise that profiling experiments be carried out in cells wherever possible, both to identify the genuine targets of the compound in question and to eliminate the false positive identifications that seem apparent in our dataset when profiling in lysates.

4.6 Summary and conclusions

This chapter reports the successful application of ABPs of curcumin, sulforaphane and piperlongumine directly to live cells as well as to protein lysates. All seven synthesised ABPs were shown to be cell-permeable tools that could label protein targets in the low μM range. The ABPs were then used as representatives of their parent compounds to study the stability of electrophile-protein adducts inside cells. It has been shown that curcumin-protein adducts form rapidly inside cells (within 30 min) but are rapidly eliminated. Piperlongumine- and sulforaphane-protein adducts form more slowly (within 6 h) and appear to be more stable than their curcumin counterparts. Initial MS analysis of these ABP-protein adducts identified 2211 targets for curcumin ABP 1, 645 targets for sulforaphane ABP 2 and 1280 targets for piperlongumine ABP.

In-cell competition assays between the ABPs and their respective parent compounds validated the ABPs as good surrogates for their parent compounds. In light of these findings, a quantitative proteomic strategy in the form of SILAC was incorporated into the chemical proteomic workflow to quantitatively compare the ABP targets when applied alone and in competition with their parent compound. This identified 120 protein targets for curcumin, 112 protein targets for sulforaphane and 403 protein targets for piperlongumine by MS that are high confidence, genuine in-cell targets of the three compounds. Despite previous global profiling efforts, this is the first time the targets of the three compounds have been carefully profiled with relevant controls inside cells. Identification of well-documented targets from the literature such as KEAP1 (as well as others) within the target sets provides confidence that the chemical proteomic approach identifies relevant targets. However, the disconnect between the number of targets identified in the quantitative competition-based workflow and the number from the initial ABP-only profiling suggests that many potential protein targets may well have been missed.

Therefore, the chemical proteomic workflow and the subsequent protein identifications still require further optimisation. The differences in target profiles identified by MS-based proteomics for the same compound under in-cell and in-lysate labelling strongly support the need to carry out target profiling of electrophilic natural products such as curcumin, piperlongumine and sulforaphane in live cells wherever possible. The inclusion of biological replicates, a concentration series of competition against parent compound and optimised sample processing steps should allow the identification of a greater number of protein targets with higher confidence. This will provide a greater wealth of information to extract from such chemical proteomics experiments, and hence functional insight into the mode of action of these dietary electrophiles. These issues are addressed in Chapter 5, in which a quantitative, concentration gradient of competition-based chemical proteomics platform is developed.


5. Quantitative, concentration gradient of competition-based chemical proteomics

The previous chapter demonstrated the potential of a quantitative competition-based chemical proteomic workflow to elucidate high confidence targets of curcumin, sulforaphane and piperlongumine. However, the workflow needs to be improved to enhance target coverage, confidence and understanding in relation to each of the assigned identifications. These points are addressed in this chapter, to provide the most comprehensive profiling of curcumin, sulforaphane and piperlongumine targets to date.

Initially, a ‘spike-in’ SILAC approach is incorporated into the chemical proteomic workflow to allow for superior MS-based quantification across multiple samples. The target set coverage of sulforaphane is increased by incorporating and optimising a proteome fractionation protocol. For all three compounds, applying the ABP in competition with a concentration gradient of parent compound allows the binding potencies for their protein targets to be calculated and compared against one another. This provides insight into the most potent targets for the compounds under study, revealing highly potent targets worthy of further study in relation to their functional impact. Bioinformatic analyses reveal potential candidate proteins that may account for the anticancer activities of the three compounds. This contributes to better insight into their mode of action.

5.1 Optimisation of a ‘spike-in’ SILAC approach into the chemical proteomic workflow

The SILAC-based approach utilised in Chapter 4 provided robust and reliable quantification for comparing protein populations captured by the ABPs using MS-based proteomics. A limitation of the duplex SILAC approach is that only two samples can be quantitatively compared. An adaptation of this method is the use of SILAC as an internal or ‘spike-in’ standard, wherein SILAC is only used to produce a reference proteome against which to quantify. This allows the multiplicity of the experiment to be increased such that there is no limit to the number of protein populations which can be quantitatively compared. Furthermore, this approach has the advantage of decoupling the experiment from the labelling procedure, thereby overcoming cell culture media restrictions for the sample of interest (still referred to here as the ‘light’ sample), which may be cultured under standard conditions (normal cell culture media with non-dialysed FBS). Only the ‘heavy’ labelled ‘spike-in’ standard requires culturing under SILAC conditions, and it can be produced very economically in large amounts, aliquoted and stored for long periods. There are limitations to the ‘spike-in’ SILAC approach, such as an increase in quantification errors relative to the previously utilised duplex SILAC approach. In duplex SILAC, the results are based on the ratios between two samples. In ‘spike-in’ SILAC, the results are a ratio of ratios, which leads to an increase in the quantification variation. There is also a reliance on a high quality ‘spike-in’ SILAC reference with broad proteome coverage. The measuring time is also longer, as each experimental sample under investigation needs to be analysed by MS separately, whereas in duplex SILAC two experimental samples are analysed in a single MS run.
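The increase in variation for a ratio of ratios can be illustrated with a first-order error propagation. This is a sketch under the assumption that the two H/L ratios are measured with independent errors; the symbols $R_a$, $R_b$, $S$ and $\sigma$ are introduced here for illustration only.

```latex
S = \frac{R_a}{R_b},
\qquad
\left(\frac{\sigma_S}{S}\right)^{2}
\approx
\left(\frac{\sigma_{R_a}}{R_a}\right)^{2}
+
\left(\frac{\sigma_{R_b}}{R_b}\right)^{2}
```

Under this assumption, the relative error of the ratio of ratios is never smaller than the relative error of either H/L ratio alone.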


With the aim of applying the ABP in competition with multiple concentrations of parent compound, a quantitative proteomics strategy that allows for the comparison of multiple samples was sought. Therefore, the implementation of a ‘spike-in’ SILAC approach was investigated. Sulforaphane ABP 2 was utilised as the model ABP for the investigation. The purpose of these optimisation experiments was not to determine the targets of sulforaphane, but merely to test the applicability and reproducibility of the ‘spike-in’ SILAC methodology. There is very little difference in the experimental setup between the ‘spike-in’ SILAC and the duplex SILAC approaches (the latter discussed in Chapter 4.3). ‘Light’ and ‘heavy’ MCF7 cells were treated with 5 μM and 20 μM sulforaphane ABP 2 respectively (Figure 23A) to mimic the experimental setup that follows in Chapter 5.2. In a ‘spike-in’ SILAC approach, the SILAC label is ideally introduced early in the sample processing steps. It can be introduced into the current workflow at either of two points. The first is to ‘spike-in’ at the cell-level (Figure 23B samples 10-13), after ABP treatment and once intact cells have been washed, lifted and counted. The second is to ‘spike-in’ at the protein lysate-level, after cell lysis and once the protein concentration of each sample has been determined and normalised, as has been done routinely up to this point (Figure 23B samples 1-9). In order to improve the protein target coverage for sulforaphane, which had been limited to fewer than 130 proteins in Chapter 4, a cell lysis fractionation protocol was implemented to generate separate cytosolic (Figure 23B samples 1-3 and 10-11) and nuclear (Figure 23B samples 4-6 and 12-13) fractions to compare to cells lysed with the standard whole cell lysis buffer (Figure 23B samples 7-9).

Addition of the ‘spike-in’ sample (‘heavy’) to its appropriate ‘light’ counterpart in a 1:3 ratio was then carried out either based on the number of cells (for incorporation of SILAC at cell-level) or the protein amount (for incorporation of SILAC at the lysate-level). The ‘spike-in’ ratio was chosen to maximise the cost effectiveness of the ‘heavy’-labelled ‘spike-in’ sample whilst increasing the detection of a ‘light’ peptide in MS corresponding to the sample of interest. Following the combination of the ‘light’ and ‘heavy’ samples, the SILAC pairs were then processed in an identical manner to that reported for a classic duplex SILAC experiment (Chapter 4.3). Very briefly, proteins were subjected to CuAAC, affinity enriched, reduced and alkylated, digested with trypsin and the resulting peptides analysed by LC-MS/MS. Data was analysed using MaxQuant with H/L ratios generated for each identified protein target.

The advantage of cell lysis into cytosolic and nuclear fractions for separate analysis by MS, relative to whole cell lysis, was evident for the sulforaphane ABP 2, increasing the number of protein identifications from 258 to 462 in this study (Figure 23B samples 1-6 versus samples 7-9). Almost all of the targets identified upon whole cell lysis were also identified in the cytosolic and nuclear fractionation samples (Figure 23C). Plotting H/L ratios for each replicate showed high correlation between replicates for samples combined at the lysate-level for both fractionation and whole cell lysis, with Pearson correlation coefficients > 0.8 in all but one case (Figure 23D). The strong correlation across multiple replicates is encouraging, suggesting that the ‘spike-in’ SILAC approach will provide the necessary robust and reproducible quantification for the chemical proteomic workflow. Good correlation was also observed when incorporating the ‘spike-in’ label at the cell-level (Figure 23E), which also resulted in a higher number of total target identifications relative to its lysate-level counterpart (811 versus 462 protein identifications) (Figure 23B samples 10-13 compared to samples 1-6). This was not expected and could not be entirely explained, but was proposed to be caused by differences in fractionation efficiency. However, introducing the ‘spike-in’ SILAC label at the lysate-level is practically simpler than at the cell-level. Thus, despite a possible advantage in target coverage from incorporating the SILAC label at the cell-level in this particular study, it was decided to proceed with combining samples at the lysate-level for future experiments, in line with previous studies. The results presented here support the view that a ‘spike-in’ SILAC approach could be easily implemented into the developed chemical proteomic workflow. Fractionation upon cell lysis improved target identification numbers for sulforaphane ABP 2 relative to whole cell lysis, thereby providing a further optimisation to implement in the competition-based profiling of the sulforaphane target set.
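The replicate comparison can be sketched as a Pearson correlation over the proteins quantified in both replicates. This is an illustrative example only: the function names and the placeholder data are hypothetical, and the log2 transformation of the H/L ratios is an assumption, since the text does not state whether the ratios were transformed before correlation.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def replicate_correlation(rep1: dict, rep2: dict) -> float:
    """Correlate log2 H/L ratios for proteins quantified in both replicates."""
    shared = sorted(set(rep1) & set(rep2))
    x = [math.log2(rep1[p]) for p in shared]  # assumption: log2-transformed ratios
    y = [math.log2(rep2[p]) for p in shared]
    return pearson_r(x, y)


# Placeholder H/L ratios keyed by placeholder protein names.
replicate_1 = {"PROT_A": 1.0, "PROT_B": 2.0, "PROT_C": 4.0, "PROT_D": 8.0}
replicate_2 = {"PROT_A": 2.0, "PROT_B": 4.0, "PROT_C": 8.0, "PROT_E": 1.0}

r = replicate_correlation(replicate_1, replicate_2)
print(round(r, 3))
```

Proteins found in only one replicate (PROT_D and PROT_E here) are excluded before the coefficient is calculated.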


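[Figure 23, panels A-E.

(A) Schematic of the experimental setup. ‘Light’ cells: 5 μM sulforaphane ABP 2; ‘heavy’ cells: 20 μM sulforaphane ABP 2. Method 1: ‘spike-in’ at cell-level, with cells fractionated into cytosolic (samples 10-11) and nuclear (samples 12-13) fractions. Method 2: ‘spike-in’ at lysate-level, with cells lysed with whole-cell lysis buffer (samples 7-9) or fractionated into cytosolic (samples 1-3) and nuclear (samples 4-6) fractions. Both methods are followed by click chemistry, affinity purification, trypsin digest and LC-MS/MS analysis.

(B) Number of protein identifications after each data processing step. Filter 1: ‘razor+unique’ peptides > 2 and H/L ratio assigned. Filter 2: conserved across replicates.

Sample No.   Sample identifier                  No. of protein IDs   Filter 1
1            Lysate combination (cytosolic)     892                  521
2            Lysate combination (cytosolic)     929                  530
3            Lysate combination (cytosolic)     881                  507
4            Lysate combination (nuclear)       305                  176
5            Lysate combination (nuclear)       243                  138
6            Lysate combination (nuclear)       451                  238
7            Lysate combination (whole cell)    604                  338
8            Lysate combination (whole cell)    591                  342
9            Lysate combination (whole cell)    611                  332
10           Cell combination (cytosolic)       1229                 731
11           Cell combination (cytosolic)       1141                 712
12           Cell combination (nuclear)         841                  563
13           Cell combination (nuclear)         763                  441

Samples   Sample identifier                  Filter 2   Unique IDs for fraction   Total protein IDs
1-3       Lysate combination (cytosolic)     396        338                       462 (samples 1-6)
4-6       Lysate combination (nuclear)       124        66                        462 (samples 1-6)
7-9       Lysate combination (whole cell)    258        -                         258
10-11     Cell combination (cytosolic)       610        422                       811 (samples 10-13)
12-13     Cell combination (nuclear)         389        201                       811 (samples 10-13)

(C) Venn diagram of cytosolic, nuclear and whole cell identifications: cytosolic only 171, nuclear only 47, whole cell only 15, cytosolic and nuclear 1, cytosolic and whole cell 167, nuclear and whole cell 19, all three 57.

(D) ‘Spike-in’ at lysate-level: scatter plots of H/L ratios for samples 1 v 2, 1 v 3 and 2 v 3 (cytosolic), 4 v 5, 4 v 6 and 5 v 6 (nuclear) and 7 v 8, 7 v 9 and 8 v 9 (whole cell).

(E) ‘Spike-in’ at cell-level: scatter plots of H/L ratios for samples 10 v 11 (cytosolic) and 12 v 13 (nuclear).

Pearson correlation coefficients across panels D and E range from 0.745 to 0.949.]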

Figure 23. Investigation into the implementation of a ‘spike-in’ SILAC quantitative proteomics methodology into the chemical proteomics workflow in the MCF7 cell line. (A) An overview of the experimental setup, whereby Method 1 and Method 2 describe introducing the SILAC label at the cell- and lysate-level respectively, followed by identical standard downstream processing steps in both cases. (B) Table summarising the number of proteins after each proteomic data processing step. The final number of protein identifications after all processing is highlighted in red. It clearly shows that fractionation upon cell lysis improves protein identification numbers and that applying the ‘spike-in’ SILAC label at the cell-level leads to an increased number of protein identifications. (C) Venn diagram showing the overlap of protein identifications from the ‘spike-in’ at lysate-level experiment conserved across triplicates for the cytosolic fraction (396 proteins), nuclear fraction (124 proteins) and whole cell lysate (258 proteins). (D+E) Scatter plot comparisons of the H/L ratios for protein identifications from lysate-level (D) and cell-level (E) ‘spike-in’ experiments, displaying the Pearson correlation coefficient.

5.2 Comprehensive protein target profiling of sulforaphane

With the ‘spike-in’ SILAC approach successfully implemented into the chemical proteomics workflow for a model system, this quantitative proteomics strategy was coupled to the competition-based chemical proteomics workflow to profile the targets of sulforaphane. Previous work in Chapter 4 had led to the identification of 126 and 112 protein targets for sulforaphane using a duplex SILAC approach to compare targets captured by ABP alone and ABP in competition with 100 μM sulforaphane. However, as 645 targets had initially been identified for the sulforaphane ABP 2 with the initial chemical proteomics experimental setup, the aim was to improve the number of target identifications for sulforaphane. Furthermore, whilst previous studies had only allowed the ABP labelling to be examined upon competition with a single, large excess of sulforaphane, the ‘spike-in’ SILAC methodology allows multiple concentrations of sulforaphane in competition with ABP to be examined in parallel. Such an analysis will allow the protein target binding potencies to be calculated.

The experimental design was as follows. Three concentrations of sulforaphane (5, 25, 100 μM) were chosen to compete against the labelling of sulforaphane ABP 2 (5 μM). Each experimental condition was carried out in duplicate in two cell lines, MCF7 and MDA-MB-231. Cells were also fractionated upon lysis into cytosolic and nuclear fractions and processed separately (based on observations in Chapter 5.1) to maximise protein target coverage. Each cell line was first cultured under normal cell culture conditions and treated with ABP alone or ABP in competition with the designated concentration of sulforaphane. After compound incubation, cells were lysed and protein concentrations normalised against one another. In parallel, ‘heavy’ R10K8-labelled cells were treated with sulforaphane ABP 2 (20 μM) followed by cell lysis to generate the ‘spike-in’ SILAC lysate. A fixed amount of this ‘spike-in’ lysate (added at a ratio of 1:3 relative to the normal lysate) was then added to each experimental sample to provide the standard for MS quantification. Lysates were then ligated to AzRB (an enzyme-cleavable capture reagent discussed in Chapter 5.6) via CuAAC and protein targets affinity purified, followed by reduction, alkylation, trypsin digest into peptides and analysis by LC-MS/MS (Figure 24).


[Figure 24, schematic. Cells in normal media (in duplicate): ABP only (samples 1-2), ABP + 5 μM sulforaphane (3-4), ABP + 25 μM sulforaphane (5-6) and ABP + 100 μM sulforaphane (7-8). Cells in ‘heavy’ media: ABP only (‘spike-in’). Steps: cell lysis; ‘spike-in’ SILAC label added at lysate-level; CuAAC, affinity purification and trypsin digest; LC-MS/MS analysis. MS: peptide quantification (intensity against m/z, sample peak alongside ‘spike-in’ peak). MS/MS: peptide ion fragmentation and subsequent amino acid sequence determination for peptide and protein identification.
(1-2) Ratio(i) = ‘Heavy’ / ‘Light’; ABP only = Ratio(i)
(3-4) Ratio(ii) = ‘Heavy’ / ‘Light’; 5 μM sulforaphane score = Ratio(ii) / Ratio(i)
(5-6) Ratio(iii) = ‘Heavy’ / ‘Light’; 25 μM sulforaphane score = Ratio(iii) / Ratio(i)
(7-8) Ratio(iv) = ‘Heavy’ / ‘Light’; 100 μM sulforaphane score = Ratio(iv) / Ratio(i)]

Figure 24. The ‘spike-in’ SILAC, concentration gradient of competition-based quantitative chemical proteomics platform for elucidation of the targets of sulforaphane. Four experimental conditions of the ABP (5 μM sulforaphane ABP 2) alone or in competition with parent compound are tested in duplicate (1-8). In parallel, ‘heavy’ R10K8 labelled cells are treated with ABP only (20 μM sulforaphane ABP 2). All cells are lysed and the protein concentration determined, whereby a fixed amount of ‘spike-in’ R10K8 ABP only lysate is added to the lysates at a ratio of 1:3. CuAAC is carried out to functionalise ABP-bound proteins, followed by affinity enrichment and preparation of peptides for LC-MS/MS analysis. Analysis by MaxQuant generates H/L ratios for each of the four samples (Ratios (i)-(iv)). A ratio of ratios is then generated by normalising each sample to the ABP only sample (Ratio (i)). It is this ratio of ratios that we refer to as the quantitative score for each respective concentration of sulforaphane competition.

Prior to commencing the chemical proteomics workflow, as a quality control, the protein target labelling by the ABP across all samples was checked by in-gel fluorescence analysis following SDS-PAGE, shown in Appendix Figure 8.

Data analysis on the raw files generated from the LC-MS/MS was performed with MaxQuant. Firstly, protein identifications and their calculated H/L ratios from the cytosolic and nuclear fractions were pooled together. Only proteins with H/L ratios in both of the ABP only duplicates were retained. Further filtering was then performed to retain only proteins with H/L ratios in both duplicates of ABP competed against each of the three separate concentrations of sulforaphane (5, 25, 100 μM). H/L ratios were averaged across the duplicates, and the H/L ratio for each sulforaphane competition concentration was then normalised against the H/L ratio of ABP only to generate a ratio of ratios (hereafter termed the quantification score, or simply the score). The quantification score represents a measure of how effectively sulforaphane competes for ABP labelling at each of the three concentrations tested. The scores for each protein target at each sulforaphane competition concentration in each of the two cell lines (MCF7 and MDA-MB-231) were then plotted against one another (Figure 25A-C). An overview of the data processing is presented in Figure 25D.
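The score calculation described above can be sketched as follows. This is a minimal illustration with hypothetical names (`quantification_scores`, the condition labels) and placeholder H/L values, not the processing script used in this work. If the amount of heavy standard is the same in every sample, the heavy term cancels in the ratio of ratios, so a score above 1 reflects a decrease in the ‘light’ (ABP-enriched) signal under competition.

```python
from statistics import mean

ABP_ONLY = "ABP only"
CONDITIONS = (ABP_ONLY, "5 uM", "25 uM", "100 uM")  # hypothetical condition labels


def quantification_scores(hl_ratios: dict) -> dict:
    """Compute score = mean H/L (competition) / mean H/L (ABP only) per protein.

    hl_ratios maps protein -> condition -> pair of H/L ratios (one per duplicate,
    None where the protein was not quantified).
    """
    scores = {}
    for protein, by_condition in hl_ratios.items():
        duplicates = [by_condition.get(c, (None, None)) for c in CONDITIONS]
        # Retain only proteins quantified in both duplicates of every condition.
        if any(ratio is None for pair in duplicates for ratio in pair):
            continue
        baseline = mean(by_condition[ABP_ONLY])
        scores[protein] = {
            c: mean(by_condition[c]) / baseline for c in CONDITIONS if c != ABP_ONLY
        }
    return scores


# Placeholder values, not data from this study.
example = {
    "PROT_A": {"ABP only": (1.0, 1.0), "5 uM": (4.0, 4.0),
               "25 uM": (8.0, 8.0), "100 uM": (8.0, 16.0)},
    "PROT_B": {"ABP only": (2.0, None), "5 uM": (2.0, 2.0),
               "25 uM": (2.0, 2.0), "100 uM": (4.0, 4.0)},
}
scores = quantification_scores(example)
print(scores)
```

PROT_B is dropped because one ABP only duplicate is missing, mirroring the requirement that ratios be present across duplicates.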

The advantage of employing a quantitative competition-based approach is that it allows the most potent sulforaphane binders to be identified amongst the complete target sets. Identified targets of sulforaphane that show strong competition against the ABP labelling at the lowest sulforaphane competition concentration (5 μM) are likely to be the most potent binders of sulforaphane (Figure 25A). Targets that require higher concentrations of competitor to compete out ABP binding are likely to be less potent binders (Figure 25B+C). This analysis identified two protein targets, conserved across both the MCF7 and MDA-MB-231 cell lines, that were strongly competed by sulforaphane for ABP labelling at the lowest sulforaphane concentration employed (5 μM). These targets were KEAP1 and MIF. Strikingly, these targets had quantification scores at the lowest sulforaphane concentration far higher than any other targets at the same concentration.

These targets were identified for sulforaphane in preliminary proteomic experiments in Chapter 4 as well as by others.28, 295, 296 This is, however, the first observation of the potentially superior binding potency of sulforaphane towards KEAP1 and MIF in the context of all the protein targets identified in this study. The fact that this was observed across both MCF7 and MDA-MB-231 cells suggests these targets may be potent sulforaphane binders regardless of the cell line. The effect of sulforaphane on KEAP1 to induce the Nrf2 antioxidant response has been widely discussed, and its low μM or even nM binding affinity towards KEAP1 supports experimental observations that it can induce Nrf2-driven responses at significantly lower concentrations than other detectable activities or responses.290 Reports on whether induction of Nrf2 is beneficial or detrimental in cancer are mixed, but what is shown here is that targeting of the KEAP1/Nrf2 axis by sulforaphane is clearly evident in cancer cells.320

MIF is a pro-inflammatory cytokine. A number of findings suggest a potential role of MIF in cancer, with studies correlating the levels of MIF with tumour aggressiveness and metastatic potential.390, 391 MIF is known to possess two distinct catalytic activities, thiol protein oxidoreductase and keto-enol tautomerase activity.392, 393 Sulforaphane inhibits the tautomerase activity of MIF through covalent adduction with its catalytic N-terminal proline residue.295, 388 The inhibition of MIF by sulforaphane was shown to be independent of induction of Nrf2 and phase II gene expression, with in vivo studies showing that the tautomerase activity of MIF is required for promotion of tumour growth and metastasis.394 The potent protein binding of MIF by sulforaphane detected in this study therefore reaffirms this distinct mechanism for the anti-cancer activities of sulforaphane.

Aside from KEAP1 and MIF, a number of other sulforaphane targets can be elucidated. To filter the identified targets further and leave only genuine, high confidence targets of sulforaphane, a stringent quantification score cut-off of > 1.5 (log2 > 0.585) was applied to the scores for 100 μM sulforaphane competition. This corresponds to a 1.5-fold decrease in ABP-target enrichment in competition with the highest sulforaphane concentration tested (100 μM) relative to when the ABP alone was applied. This was set as the cut-off for biologically significant, effective competition between the parent compound and the ABP, above which a protein was assigned as a genuine target. In this way, 426 and 290 targets for sulforaphane were identified in the MDA-MB-231 and MCF7 cell lines respectively (Appendix Table 1). A total of 181 of these protein targets were conserved across both cell lines. However, more than 100 targets in each cell line (245 in MDA-MB-231 and 109 in MCF7) appeared to be cell line-specific, highlighting that different cell lines might have different sulforaphane target profiles (Figure 26C).
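The cut-off and the comparison between cell lines can be sketched as below. The protein names and scores are placeholders rather than values from this study, and the function name `genuine_targets` is introduced here for illustration.

```python
import math

CUTOFF = 1.5  # quantification score threshold at 100 uM competition

# The equivalent threshold on the log2 scale used in the scatter plots.
log2_cutoff = round(math.log2(CUTOFF), 3)


def genuine_targets(scores_at_100um: dict, cutoff: float = CUTOFF) -> set:
    """Return proteins whose score at the highest competitor concentration exceeds the cut-off."""
    return {protein for protein, score in scores_at_100um.items() if score > cutoff}


# Placeholder scores at 100 uM competition for two cell lines.
mcf7_scores = {"PROT_A": 9.0, "PROT_B": 2.0, "PROT_C": 1.2}
mda_scores = {"PROT_A": 11.0, "PROT_C": 1.8, "PROT_D": 1.4}

mcf7_targets = genuine_targets(mcf7_scores)
mda_targets = genuine_targets(mda_scores)

conserved = mcf7_targets & mda_targets       # targets in both cell lines
mcf7_specific = mcf7_targets - mda_targets   # targets only in MCF7
mda_specific = mda_targets - mcf7_targets    # targets only in MDA-MB-231

print(log2_cutoff, sorted(conserved), sorted(mcf7_specific), sorted(mda_specific))
```

Applied to the reported set sizes, the same set arithmetic gives 426 - 181 = 245 targets specific to MDA-MB-231 and 290 - 181 = 109 specific to MCF7.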

[Figure 25 image. (A-C) Scatter plots of log2(quantification score, MCF7) (y-axis) against log2(quantification score, MDA-MB-231) (x-axis) at (A) low (5 μM), (B) medium (25 μM) and (C) high (100 μM) sulforaphane competition; KEAP1 and MIF are labelled as high potency targets in (A). (D) Workflow: data output from MaxQuant, targets and H/L ratios calculated; filtered to leave targets in ABP only (samples 1-2) and ABP in competition with each concentration of sulforaphane (samples 3-8) conserved across duplicates; ratio of ratios (quantitative score) calculated by normalising the H/L ratio of ABP in competition with each concentration of sulforaphane to the H/L ratio of ABP only; quantitative score plotted for each protein target in the MCF7 and MDA-MB-231 cell lines (A, B and C); cut-off threshold (quantitative score > 1.5) applied at the highest sulforaphane competition concentration (100 μM) for a protein to be deemed a genuine target; 290 sulforaphane targets in the MCF7 cell line and 426 sulforaphane targets in the MDA-MB-231 cell line.]

Figure 25. (A-C) Scatter plots to show the comparison of the quantification score for each sulforaphane target across the three concentrations of sulforaphane competition (A: 5 μM sulforaphane, B: 25 μM sulforaphane and C: 100 μM sulforaphane) across the MCF7 (y-axis) and the MDA-MB-231 (x-axis) cell lines. Protein targets only obtained in one of the two cell lines are given a value of 0 for the cell line in which the target is absent. A quantification score > 1.5 (log2 > 0.585) at the highest sulforaphane competition concentration is deemed effective competition to warrant its assignment as a genuine target. The plot in (A) highlights KEAP1 and MIF as proteins that have a significantly higher quantification score than all other identified sulforaphane targets conserved across both cell lines. (D) The proteomic data processing workflow employed to derive quantitative scores for each sulforaphane competition concentration and to subsequently identify the high confidence protein target identifications for sulforaphane in the MDA-MB-231 and MCF7 cell lines.
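The scoring step of the Figure 25D workflow can be illustrated with a minimal Python sketch. It assumes that the heavy channel corresponds to the ‘spike-in’ ABP only standard, so that competition by the parent compound raises the H/L ratio. The function names and all H/L values are hypothetical and serve only to show the arithmetic; they are not taken from the data set.

```python
from math import log2

SCORE_CUTOFF = 1.5  # threshold stated in the text (log2 > 0.585)


def quantification_score(hl_competition, hl_abp_only):
    """Ratio of ratios: H/L (ABP + parent compound) divided by H/L (ABP only)."""
    return hl_competition / hl_abp_only


def is_genuine_target(scores_by_conc, highest_conc=100):
    """Apply the cut-off at the highest competition concentration (in uM)."""
    return scores_by_conc[highest_conc] > SCORE_CUTOFF


# Hypothetical H/L ratios for a single protein (illustrative values only).
hl_abp_only = 0.5
hl_by_conc = {5: 1.0, 25: 2.0, 100: 4.0}  # keyed by sulforaphane concentration / uM

scores = {c: quantification_score(hl, hl_abp_only) for c, hl in hl_by_conc.items()}

for conc, score in scores.items():
    print(f"{conc} uM: score = {score:.2f}, log2 = {log2(score):.2f}")
print("genuine target:", is_genuine_target(scores))
```

Applying the threshold only at the highest concentration mirrors the workflow described above; the scores at the lower concentrations are then used to compare binding potency between targets.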

Around 30 targets in total have been reported in the literature for sulforaphane and other isothiocyanates across a number of independent studies (Figure 8). KEAP1 and MIF are two previously identified targets that have already been discussed. Cross-referencing the sulforaphane target sets obtained in the MCF7 and MDA-MB-231 cell lines reveals a further 8 targets contained within those previously identified, namely ANTs (SLC25A5 and SLC25A6),395 ATPases (ABCE1 and ABCF1),396, 397 EGFR,398 HSP90AA1,93, 94 proteasome (PSMD9 and PSME2),92, 399 STAT3,400 TOP2A,401 and TXNRD1.402 Identification of previously recognised, direct sulforaphane interactions gives confidence in the developed platform and confirms the engagement of many of these targets inside the cell for the first time. Other previously reported targets of isothiocyanates such as actin, annexin A2 (ANXA2), tubulin, vimentin (VIM) and GAPDH were also identified as targets of the sulforaphane ABP, but show insufficient competition against sulforaphane to be deemed genuine, high confidence targets. The observations here suggest that these proteins may be less significant to the mode of action of sulforaphane, given their lower binding potency (score < 1.5) relative to the multitude of other targets that are discussed below (score > 1.5).

To assess more globally the full spectrum of targets of sulforaphane in MCF7 cells (290 targets) and MDA-MB-231 cells (426 targets), bioinformatic analyses were carried out using a variety of platforms including DAVID, Babelomics and WebGestalt.224, 225, 403 The data obtained from the DAVID analysis are shown to provide insights into the pathways, biological processes and molecular functions through which sulforaphane could be mediating its anticancer effects (Figure 26A+B). The GO terms associated with molecular functions and biological processes of the targets appear hugely diverse. This was anticipated given the polypharmacological nature of sulforaphane and its reported ability to influence multiple signalling pathways.


[Figure 26 image. (A) Sulforaphane, 426 targets, MDA-MB-231 and (B) Sulforaphane, 290 targets, MCF7: bar charts of enriched KEGG_PATHWAY, GOTERM_MF_FAT and GOTERM_BP_FAT terms plotted against -log10(p-value). (C) Venn diagram of all sulforaphane targets (109 MCF7-only targets, 181 conserved targets, 245 MDA-MB-231-only targets) with gene names listed for cell death (44 targets), intracellular transport (42 targets), protein kinases and phosphatases (35 targets), RNA processing (50 targets), cell cycle (55 targets), CanSAR identified drug targets (11 targets), mitochondrial targets (54 targets) and ubiquitination machinery (18 targets).]

Figure 26. An overview of the bioinformatic output of the 426 and 290 sulforaphane protein targets identified in the MDA-MB-231 and MCF7 cell lines respectively. (A+B) GO annotations corresponding to biological processes (GOTERM_BP_FAT), molecular function (GOTERM_MF_FAT) and KEGG pathway (KEGG_PATHWAY) enriched within the sulforaphane target sets in the MDA-MB-231 and MCF7 cell lines respectively. (C) The protein target identifications (listed by their gene name) for cellular processes of interest revealed by bioinformatic analysis. Targets conserved across both cell lines are shown in green, targets found only in MCF7 cells are shown in blue and targets found only in MDA-MB-231 cells are shown in red.

For sulforaphane targets identified in the MDA-MB-231 cell line (Figure 26A), there is a strong enrichment of biological processes involved in cell cycle and cell division, with terms such as ‘M phase’ (33 proteins, p-value = 8.9 × 10⁻¹¹), ‘cell cycle’ (50 proteins, p-value = 2.3 × 10⁻⁹) and ‘nuclear division’ (25 proteins, p-value = 4.1 × 10⁻⁹). Sulforaphane is known to induce a G2/M phase arrest, reported in both MDA-MB-231 and MCF7 cell lines.404-406 The identification of 50 cell cycle associated targets for sulforaphane in the MDA-MB-231 cell line, 25 targets in the MCF7 cell line and 20 targets conserved across both cell lines (Figure 26C) highlights a large number of new mediators that may directly contribute to cell cycle arrest in breast cancer cells. In particular, the identification of the tumour suppressor protein retinoblastoma 1 (RB1) as well as cyclin-dependent kinase 3 (CDK3) warrants further investigation into the functional role played by sulforaphane adduction, given their well-defined roles in cell cycle regulation.407, 408 DAVID analysis did not reveal cell cycle and cell division processes within its top 10 enrichment terms for the sulforaphane targets in the MCF7 cell line (Figure 26B). Fractionation efficiency for nuclear proteins in the MCF7 cell line was poor relative to the MDA-MB-231 cell line, with fewer than 100 protein targets being identified. This may explain the lack of cell cycle and cell division GO terms, as many of these proteins are likely to be nuclear localised. Instead, terms such as ‘tRNA metabolic process’ (14 proteins, p-value = 9.2 × 10⁻⁸), ‘glutamine metabolic process’ (6 proteins, p-value = 1.3 × 10⁻⁵) and ‘organelle fission’ (14 proteins, p-value = 1.4 × 10⁻⁴) were among the most highly enriched biological processes.

Molecular functions of the sulforaphane targets are reasonably well conserved across both cell lines, with a number of shared highly enriched functions. The most highly enriched molecular function for the targets of both cell lines was ‘RNA binding’ (65 proteins, p-value = 8.6 × 10⁻¹⁸ for MDA-MB-231 and 31 proteins, p-value = 6.1 × 10⁻⁶ for MCF7). Identification of KEGG pathways enriched within the target set gives insight into protein interaction networks and chemical reactions that are responsible for various cellular processes which sulforaphane may directly influence.227, 409 The most highly enriched KEGG pathway for the sulforaphane targets conserved across both cell lines was ‘aminoacyl-tRNA biosynthesis’ (8 proteins, p-value = 1.9 × 10⁻⁴ for MDA-MB-231 and 8 proteins, p-value = 3.2 × 10⁻⁵ for MCF7). This is due to 8 aminoacyl-tRNA synthetase (aaRS) enzymes being identified as sulforaphane targets. Traditionally thought of as housekeeping proteins, confined only to protein synthesis functions, these proteins are now being appreciated as having diverse roles in the cell and are being explored as therapeutic targets against cancer.410, 411 The observation of a number of these family members as sulforaphane targets suggests a new class of targets, not previously explored, that might contribute to the mode of action of sulforaphane.
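To give a feel for what an enrichment p-value of this kind measures, the sketch below computes a plain hypergeometric upper-tail probability in Python. This is an illustrative stand-in and not the DAVID calculation: DAVID reports an EASE score, a more conservative modification of Fisher's exact test, so the values will not match those quoted above. The function name and all counts are hypothetical.

```python
from math import comb, log10


def hypergeom_enrichment_p(k, n, K, N):
    """P(X >= k): chance of seeing at least k annotated proteins among n targets
    drawn from a background of N proteins, K of which carry the annotation."""
    upper = min(n, K)
    total = comb(N, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


# Hypothetical counts: 8 of 400 targets carry a term annotated on 40 of 20000 proteins.
p = hypergeom_enrichment_p(k=8, n=400, K=40, N=20000)
print(f"p-value = {p:.2e}, -log10(p-value) = {-log10(p):.1f}")
```

The -log10 transform is the quantity plotted on the x-axis of Figure 26A+B, so longer bars correspond to smaller p-values.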

Next, it was of interest to understand the interaction relationships between the protein targets of sulforaphane. Protein interaction networks of the sulforaphane targets were constructed in Cytoscape, an open source software platform for visualising complex networks.412 Protein-protein interaction data curated from a number of public databases (including BioGRID, HPRD, Phosphosite, IntAct, MINT, Reactome and TfactS) were first incorporated into Cytoscape. The sulforaphane target set was then searched against these data, and the interactions contained within these targets were exported to generate the target networks (Figure 27).


[Figure 27 image. Cytoscape protein interaction networks. (A) Sulforaphane (MDA-MB-231); top interacting nodes: EGFR (29), RPS27A (20), RELA (17), RB1 (16), STAT3 (16), SRPK1 (14), HNRNPK (14), STAT1 (13). (B) Sulforaphane (MCF7); top interacting nodes: CDK2 (22), PRKCD (11), STAT1 (10), HNRNPK (9), PXN (8), STAT3 (8), PTPN11 (6), NCBP1 (6).]

Figure 27. Protein interaction networks of sulforaphane protein targets generated in Cytoscape for the MDA-MB-231 (A) and MCF7 cell line (B). Proteins (represented as nodes) are colour-coded (gradient of yellow-red) by the quantification score (for 100 μM sulforaphane competition) such that more potent sulforaphane protein binders are represented in red and lower potency targets in yellow. The interactions between proteins (represented as edges) are also colour-coded to represent the different types of interaction: black (direct: BioGRID, HPRD), pink (phosphorylation: Phosphosite), blue (reactome: Reactome), green (reaction: MINT, IntAct) and orange (transcriptional: TfactS and Lindquist Lab databases). Nodes with the highest degree of connectivity within the network are shown below each figure. The original Cytoscape networks can be found in Appendix Table 1.

Targets with a high degree of connectivity (nodes with multiple connecting edges) may be important interaction ‘hubs’ within each network. In network polypharmacology, such nodes may be most important to the biological activity of the compound.413 In the MDA-MB-231 cell line, EGFR, NF-κB p65 subunit (RELA), STAT proteins (STAT1 and STAT3) and RB1 all display a high degree of connectivity within the target network (Figure 27A). STAT proteins also display a high degree of connectivity in the sulforaphane target network in the MCF7 cell line (Figure 27B), in addition to the signalling kinases CDK2 and protein kinase Cδ (PRKCD). Interestingly, the interaction network of sulforaphane targets in the MCF7 cell line is less complex than that observed in the MDA-MB-231 cell line. The average number of interaction partners for each node was 1.14 in the MCF7 target network, whereas it was 2.01 in the MDA-MB-231 target network. This may be due to the smaller number of targets identified in the MCF7 cell line relative to the MDA-MB-231 cell line (290 targets versus 426 targets). Clustering analysis within the sulforaphane target networks in the MCF7 and MDA-MB-231 cell lines was also carried out using the Cytoscape plugin ClusterOne.414 ClusterOne seeks densely connected sub-networks within the overall network, which usually correspond to protein complexes or fractions of them. The top functional module identified corresponded to a highly significant cluster (p-value = 0.002) containing 11 nodes: KEAP1, BUB3, CDC27, HUWE1, NEDD4L, PPP2R5D, PSMD9, PSME2, RPS27A, SMURF2 and UBE2O. Once again, KEAP1 appears to be an integral target for sulforaphane, even when considered in the context of the entire target set identified for sulforaphane.
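The connectivity measures discussed above (node degree and the average number of interaction partners per node) can be sketched in a few lines of Python. The edge list below is hypothetical and is not taken from the Cytoscape networks. The sketch also assumes an undirected network in which each interaction is listed once and unconnected targets are counted in the average; how the reported averages treated such nodes is not stated in the text.

```python
from collections import Counter


def degree_table(edges):
    """Count interaction partners per node for an undirected edge list."""
    degree = Counter()
    for a, b in edges:
        if a == b:  # ignore self-interactions
            continue
        degree[a] += 1
        degree[b] += 1
    return degree


def average_degree(nodes, edges):
    """Mean number of interaction partners per node, including unconnected nodes."""
    degree = degree_table(edges)
    return sum(degree[n] for n in nodes) / len(nodes)


# Hypothetical mini-network built from a handful of target names.
nodes = ["EGFR", "STAT3", "STAT1", "RELA", "KEAP1"]
edges = [("EGFR", "STAT3"), ("EGFR", "STAT1"), ("EGFR", "RELA"), ("STAT3", "RELA")]

print("hub:", degree_table(edges).most_common(1)[0])
print("average degree:", average_degree(nodes, edges))
```

Including unconnected nodes in the denominator lowers the average, which is one reason a network with many isolated targets can appear less complex.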

The identification of STAT proteins, a family of latent transcription factors, as targets of sulforaphane is interesting in the context of the mode of action of sulforaphane. STAT3 in particular is an oncogenic transcription factor, which is constitutively expressed in a variety of cancers, resulting in the expression of various genes involved in cell proliferation and apoptosis such as Mcl-1, Bcl-XL, Bcl-2, c-Myc, cyclin D1 and survivin.415 STAT3 signalling is also known to control human telomerase reverse transcriptase (hTERT) expression and promote a cancer stem cell phenotype in breast cancer cells.416, 417 Inhibition of STAT3 signalling by sulforaphane has been reported previously, with a reduction in protein level and phosphorylation status of STAT3’s activator kinase JAK2 proposed as the mechanism of action.418 These results show sulforaphane may also be capable of interacting with STAT3 directly and this may have a functional effect on its signalling.

Another important and previously validated signalling pathway targeted by sulforaphane is NF-κB.419, 420 Activation of NF-κB has been linked to inflammation, cancer cell survival and progression.421 A number of mechanisms have been proposed for the ability of sulforaphane to inhibit NF-κB in cancer cells, including interference with an activating kinase of the pathway, IKK,420 reduced upstream signalling of NF-κB,298, 422 as well as potential direct interaction with the NF-κB p50 subunit (NFKB1) to prevent DNA binding of the transcription factor.423 Here, the identification of two other NF-κB subunits to which sulforaphane binds directly (NF-κB p52 (NFKB2) and NF-κB p65 (RELA)) in the MDA-MB-231 cell line, in addition to redox regulators of NF-κB (Ref-1 (APEX1), TXNRD1 and TXN), provides new mediators that could contribute to the inhibitory effect of sulforaphane on NF-κB signalling. The functional interplay between STAT3 and NF-κB signalling has also been widely reported.423-427 Targeting of both transcription factors by sulforaphane may be advantageous in overcoming proliferation and cell survival responses.

The induction of apoptosis by sulforaphane is also a well characterised activity in breast cancer as well as other cancers.428 Sulforaphane has been shown to mediate its effects through multiple, different apoptotic pathways including the intrinsic, extrinsic and caspase-independent pathways. However, recent insights into the pathway leading to apoptosis induced by sulforaphane reveal cell line-specific underlying apoptotic mechanisms despite the overall effect on cell viability being similar.406 Induction of apoptosis in MDA-MB-231 cells is mediated through an induction of Fas ligand and caspase-8, whereas an alternative activation of the mitochondrial pathway in other breast cancer cell lines including MCF7 cells has been observed.

In this study, 38 protein targets associated with cell death are identified for sulforaphane in the MDA-MB-231 cell line and 24 targets in the MCF7 cell line, with 18 protein targets conserved across both cell lines (Figure 26C), many of which are involved in apoptosis. Conserved targets of sulforaphane across both cell lines include BAG3, DNA fragmentation factor subunit alpha (DFFA), PML, RTN3 and VDAC2/3, which all have important functional roles in apoptosis.429-433 The identified Bcl-2 family members (Bax and Bid) are also important mediators of apoptosis and are both identified as targets of sulforaphane in the MCF7 cell line. In the MDA-MB-231 cell line, a number of additional mediators of cell death are also identified that may well contribute to the apoptotic effect of sulforaphane, including cell division cycle and apoptosis regulator protein 1 (CCAR1), death-inducer obliterator 1 (DIDO1), protein kinase R (EIF2AK2), rho-associated protein kinase 1 (ROCK1), TOP1 and TOP2A. Different targets of sulforaphane in the MDA-MB-231 and MCF7 cell lines may account for the different underlying mechanisms of inducing apoptosis.

The induction of autophagy has also been reported for sulforaphane.434-436 Autophagy is an evolutionarily conserved, lysosome-mediated process of degrading and recycling damaged proteins and organelles inside the cell.437 The induction of autophagy by sulforaphane precedes and delays apoptotic cell death and is believed to initially protect cancer cells from cell death. The mechanism by which sulforaphane induces autophagy is unknown, but recent evidence suggests that sulforaphane may induce autophagy in breast cancer by negatively influencing the activity of mTOR in the PI3K-Akt/mTOR-S6K1 pro-survival signalling pathway.437-441 We identify a number of additional targets that may contribute to the induction of autophagy by sulforaphane. It has been reported that the disruption of the interaction between EIF2AK2 and STAT3 may induce autophagy, mediated by EIF2AK2.442 Here, both EIF2AK2 and STAT3 are identified as targets of sulforaphane. Direct binding of sulforaphane to both proteins may physically disrupt their interaction. A recent paper has also proposed a role for MIF in the regulation of autophagy.443 It remains to be determined whether sulforaphane may induce autophagy through covalent binding to MIF. Further autophagy-related proteins identified as sulforaphane targets include La-related protein 1 (LARP1), platelet-activating factor acetylhydrolase IB subunit beta (PAFAH1B2) and ubiquitin carboxyl-terminal hydrolase 10 (USP10) (across both cell lines), sequestosome 1 (p62/SQSTM1) in the MDA-MB-231 cell line and cysteine protease ATG4B in the MCF7 cell line. The targets of sulforaphane identified here therefore provide new potential mediators of the autophagy response to sulforaphane.

Protein kinases and phosphatases have been identified as targets for isothiocyanates in previous studies;286 however, the targets of sulforaphane identified here expand the repertoire within these families to a total of 35 targets across the two cell lines (Figure 26C). Sulforaphane is capable of covalently binding to proteins across a great diversity of kinase signalling cascades. Sulforaphane targets two Hippo kinases, serine/threonine-protein kinases STK3 and STK4 (MST2 and MST1 respectively), that play central roles in the Hippo pathway.444 Another important kinase target is EGFR, a receptor tyrosine kinase (RTK) family member that is considered, alongside ESR1 and HER2, to be one of the three most critical proteins in breast cancer proliferation and is an attractive cancer therapeutic target.445 Previously reported effects of sulforaphane on EGFR signalling in cancer relate predominantly to the down-regulation of EGFR.406, 446 This is the first time that a direct interaction of sulforaphane with EGFR has been observed, although it has been suggested for other isothiocyanates.398, 447

Additional targets of sulforaphane identified (Figure 26C) include epigenetic modulators (DNMT1 and histone acetyltransferase 1 (HAT1)), proteins involved in the ubiquitin-proteasome machinery (18 targets) and mitochondrial proteins (54 targets), which are believed to play an important role in the generation of ROS by sulforaphane, a mechanism that may underlie sulforaphane-induced apoptosis and cell death.286 Utilising the CanSAR portal, it can also be shown that 10 targets of sulforaphane in the MDA-MB-231 cell line and 4 targets in the MCF7 cell line currently have drugs either approved or in clinical trials (Figure 26C).448 Further study of these druggable targets may well provide starting points for new inhibitor classes, should they prove to be functional interactions.

In conclusion, the targets of sulforaphane have been profiled across two breast cancer cell lines. A greater number of targets was identified than in previous studies (Chapter 4), highlighting the success of the ‘spike-in’ SILAC approach coupled to the previously developed competition-based chemical proteomics workflow. The most significant targets identified include KEAP1 and MIF, two well-documented sulforaphane targets, which have been identified as the highest binding potency targets of sulforaphane conserved across both cell lines in this study. However, many other targets beyond these two proteins bind at concentrations that may be therapeutically achievable in vivo, supporting the notion that the action of sulforaphane at multiple targets may contribute to its anticancer effects.449, 450 There is continued debate over what concentrations of sulforaphane are achievable in vivo (discussed in Chapter 1.3.2.2), with nM and μM concentrations reported from a range of studies.282, 284, 285, 449-451 Determining which protein targets engage sulforaphane at different concentrations is therefore crucial and may explain the differing effects exerted by sulforaphane in distinct concentration windows of the compound. This highlights the power of the quantitative target data generated for sulforaphane, which provide direct insight into targets such as KEAP1 and MIF that are covalently engaged by sulforaphane at < 5 μM concentrations, whereas other targets may require higher concentrations of sulforaphane (25 μM to 100 μM). The observed effects exerted by these targets may thus only be apparent at higher sulforaphane concentrations. This insight could only be obtained by applying the innovative experimental setup employed in these studies.

Clearly, this study only scratches the surface in terms of identifying the full target spectrum of sulforaphane. This is by no means a complete global spectrum of the targets of sulforaphane; targets that have very low endogenous expression levels, require longer incubation times with sulforaphane for target binding, are not amenable to MS-based identification or are excluded from capture under our fractionation conditions may well be missed. Indeed, previously reported targets of sulforaphane such as AP-1, cytochrome P450s, MEKK1 and β-tubulin were not identified as targets under the conditions of our study. This could be a result of variation of the sulforaphane target set between cell lines, as observed in this study, where over 100 targets were unique to each cell line. However, what is highlighted here is the wide range of targets with which sulforaphane physically interacts in a covalent manner, providing strong support for the wide scope of downstream effects of sulforaphane that have been reported.

5.3 Comprehensive protein target profiling of curcumin and piperlongumine

5.3.1 Target identification overview in MDA-MB-231 cell line

Using the same ‘spike-in’ SILAC, competition-based chemical proteomic approach as reported for sulforaphane above, the targets of both curcumin and piperlongumine were profiled using curcumin ABP 1 and piperlongumine ABP respectively. However, a few changes to the experimental setup were made in light of the initial profiling of sulforaphane (Chapter 5.2), which was carried out beforehand. Firstly, fractionating into cytosolic and nuclear fractions was shown to improve coverage of protein target identification for sulforaphane. However, target coverage had previously seemed adequate for curcumin and piperlongumine (Chapter 4.3 and 4.5) and, because fractionating upon lysis doubles the number of samples to analyse by MS, the standard whole cell lysis buffer was used to minimise MS instrumentation time. Secondly, sulforaphane had its targets profiled in two breast cancer cell lines. Due to machine access limitations, target profiles for curcumin and piperlongumine were only generated in a single cell line, MDA-MB-231. Thirdly, on reflection of the data obtained for sulforaphane, increasing the number of competition concentrations of parent compound over a larger dynamic range would provide better insight into the binding potency responses for each target identified. Therefore, five concentrations of the respective parent compound were employed in competition with the ABP for the target profiling of curcumin and piperlongumine, whereas only three such concentrations had been used previously for sulforaphane (Figure 28). All sample processing steps were subsequently identical to those reported previously (Chapter 5.2).


[Figure 28 image. Sample layout in normal media (in duplicate): samples 1-2, ABP only; 3-4, ABP + v μM parent compound; 5-6, ABP + w μM parent compound; 7-8, ABP + x μM parent compound; 9-10, ABP + y μM parent compound; 11-12, ABP + z μM parent compound. ‘Heavy’ media ‘spike-in’: ABP only. Piperlongumine ABP: 2 μM across all samples; piperlongumine (PIP): v = 2 μM, w = 10 μM, x = 25 μM, y = 50 μM and z = 100 μM; ‘spike-in’ piperlongumine ABP = 8 μM. Curcumin ABP 1: 5 μM across all samples; curcumin (NC) and mono-O-propyl curcumin (PC): v = 5 μM, w = 20 μM, x = 50 μM, y = 100 μM and z = 150 μM; ‘spike-in’ curcumin ABP = 20 μM.]

Figure 28. Experimental design for ‘spike-in’ SILAC, concentration gradient of competition-based chemical proteomics workflow for profiling the targets of curcumin and piperlongumine. In all cases, 5 concentrations of parent compound (v, w, x, y and z μM) are competed against ABP labelling and the ‘spike-in’ SILAC ABP only lysate is added to each experimental sample (1-12) at a 1:3 ratio. Prior to commencing the chemical proteomics workflow, as a quality control, the protein target labelling by the ABP across all samples was checked by in-gel fluorescence analysis following SDS-PAGE, shown in Appendix Figure 8.

Following proteomic analysis by LC-MS/MS, MaxQuant and subsequent data processing, 196 targets for curcumin and 442 targets for piperlongumine were identified by applying a quantification score cut-off of > 1.5 at the 100 μM parent compound competition concentration. The stringent target filtering and quantitative cut-off threshold applied were the same as for sulforaphane, providing only targets of high confidence (Figure 25D). The number of targets is a slight improvement on that reported in previous studies (Chapter 4.3 and 4.5), suggesting that the ‘spike-in’ SILAC approach was again equal, if not superior, in protein coverage (curcumin: 196 targets here, 120 targets in Chapter 4.3 and 155 targets in Chapter 4.5; piperlongumine: 442 targets here, 403 targets in Chapter 4.3 and 358 targets in Chapter 4.5).

In order to elaborate further on the target binding potencies within these target sets for curcumin and piperlongumine, the quantitative score for each concentration of parent compound competition was used to generate dose response curves for each protein target. A calculated quantitative score was required for at least four of the five competition concentrations, and dose response curves were fitted by non-linear regression in GraphPad Prism 5 (Figure 29). For the majority of targets, dose response curves fitted well, with a Hill slope between 0.5 and 2.0 (reflecting good sigmoidal fit) and excellent R squared correlation (Appendix Figure 9 and Appendix Figure 10). The dose response curve was then used to calculate an ‘EC50’ value for each protein target. EC50 in this context is defined as ‘the half maximal concentration of competition between the ABP and the parent compound’ and can be considered to be a value of covalent binding potency. There is great diversity in the calculated EC50 values, which reflects the differing binding affinities of each compound across its multitude of targets. Targets with the lowest calculated EC50 values are likely to be the most potent targets. These targets may be the most functionally important, as only targets that are engaged at low concentrations are likely to be engaged at the concentrations achievable in vivo, which have generally been reported in the nM range, particularly for curcumin, which has poor bioavailability.452 The EC50 value allows the comparison of binding not only between targets of the same compound, but also between compounds at the same target. For example, the EC50 value of KEAP1 is 18.1 μM for piperlongumine and 36.1 μM for curcumin, suggesting a slightly higher binding potency of piperlongumine towards KEAP1 relative to curcumin. Another target, heme oxygenase 2 (HMOX2), the constitutive isoform of heme oxygenase involved in redox processes,453 shows an EC50 value for curcumin and piperlongumine of 13.3 μM and 13.7 μM respectively, suggesting very similar binding potencies towards this target.

It is important to clarify the definition of the EC50 value in this context. In order for a target to be identified using the workflow employed, it must bind to the respective ABP at the fixed concentration that is applied to all samples (2 μM piperlongumine ABP and 5 μM curcumin ABP 1). Therefore, even if an EC50 value of 75 μM is calculated for curcumin or piperlongumine against a particular target, this refers specifically to the effective concentration of competition. The identified target is still binding to the ABP at its low μM concentration; otherwise it would not be identified as a target. Furthermore, the EC50 value provides no insight into the functional effect on the given target; it is restricted to an assessment of covalent binding. The EC50 value allows the competition data across all concentrations (in this case five) to be combined into a single parameter. It is therefore more robust than using the quantitative score from a single concentration of compound, as was done previously for sulforaphane (Chapter 5.2). As before, this information allows the dissection of targets such that their binding potencies can be assessed in a global manner across the target set.

[Figure 29 workflow, in order: (1) data output from MaxQuant, with targets and H/L ratios calculated; (2) filtered to leave targets conserved across duplicates, separately for each of the 6 samples; (3) quantitative score calculated by normalising ABP in competition with parent compound to ABP only, for all 5 samples; (4) filtered to leave targets with a quantitative score in at least 4 out of the 5 samples; (5) 1/x applied to the quantitative score to determine a fractional response for each concentration of competition; (6) Log10 (concentration / M) plotted against fractional response for each protein target and a binding curve fitted with non-linear regression to determine the EC50 value; (7) EC50 assigned for each protein target, allowing targets to be ranked in terms of their binding potency. Example plots of response to competition against Log (concentration / M) are shown for all targets and for selected targets of interest: HMOX2 (EC50 = 13.7 μM for piperlongumine, 13.3 μM for curcumin) and KEAP1 (EC50 = 18.1 μM for piperlongumine, 36.1 μM for curcumin).]

Figure 29. Proteomic data processing workflow for calculating EC50 values as a measure of protein target binding potency, for curcumin and piperlongumine using data generated from the experimental setup reported in Figure 28.


5.3.2 Target profiling of curcumin

The elucidation of 196 targets for curcumin using the quantitative chemical proteomics platform is an improvement on previous profiling efforts (Chapter 4). However, it had been observed previously that curcumin ABP 1 (and the curcumin ABPs more generally) showed relatively poor competition against parent curcumin in comparison to its PC analogue (Figure 10B and Figure 18A). For this reason, curcumin ABP 1 was competed against both curcumin and PC, in parallel, using the ‘spike-in’ SILAC competition-based chemical proteomics workflow to profile their targets and allow a direct comparison between their target sets. Curcumin ABP 1 captured a similar number of total proteins in each experiment prior to filtering for competition against the highest concentration of parent compound tested (623 proteins for curcumin and 721 proteins for PC). However, upon filtering and application of the quantitative score cut-off threshold to reveal genuine targets based upon successful competition, more targets were identified for PC than for curcumin (472 targets for PC versus 196 targets for curcumin). The extremely high degree of overlap between the two target sets indicates that curcumin and PC bind largely the same targets (Figure 30A-C). However, competition for ABP labelling by PC is greater than by curcumin, which is reflected in the lower number of targets identified for curcumin relative to PC (Figure 30D-G).


[Figure 30 panels. (A-C) Venn diagrams, curcumin versus PC. All targets: 26 curcumin only, 170 shared, 302 PC only. Top 50 targets: 30 curcumin only, 20 shared, 30 PC only. Top 20 targets: 6 curcumin only, 14 shared, 6 PC only. (D-G) Scatter plots of Log2(quantification score) for curcumin (y-axis) against Log2(quantification score) for PC (x-axis) at 5, 20, 50 and 100 μM competition, with regions marked as curcumin-favoured and PC-favoured targets. Labelled targets include HMOX2, RTN3, CYB5B, CDKAL1, ALDH9A1, REEP5, TAP1, NCEH1, ALG1, TMED1 and KEAP1.]

Figure 30. (A) Venn diagram showing the protein target overlap between the total curcumin (196) and PC (472) target sets (> 85 % of the curcumin targets identified were contained within the larger PC target set). (B-C) Venn diagrams of the protein target overlap between the top 50 targets (B) and top 20 targets (C) of curcumin and PC. Targets were ranked by their quantitative score at 100 μM parent compound and the top 50 or 20 used to determine overlap between the curcumin and PC datasets. (D-G) Scatter plots of the quantitative score of the protein targets of curcumin (y-axis) and PC (x-axis) at each of the four competition concentrations (D – 5 μM, E – 20 μM, F – 50 μM, G – 100 μM). The y = x line is plotted with targets favoured for PC shown in the red segment and targets favoured for curcumin shown in the grey segment. Prior to commencing the chemical proteomics workflow, as a quality control, the protein target labelling by the ABP was checked by SDS-PAGE followed by in-gel fluorescence analysis, shown in Appendix Figure 8.

This supports the initial observations from in-gel fluorescence analysis, whereby PC was more effective than curcumin at competing against ABP labelling (Figure 10B and Figure 18A). This may be caused by increased cellular uptake or in-cell stability of PC relative to curcumin (although this was not confirmed). However, it is clear that the target sets of PC and curcumin are highly analogous, as one might expect given their structural similarities. This might seem obvious, but was not clear upon initial in-gel fluorescence analysis, whereby curcumin seemed ineffective at competing against ABP labelling even at large excesses. Therefore the additional 302 targets identified for PC are likely to be curcumin targets also. These additional curcumin targets may require higher concentrations for target engagement than the originally elucidated 196 targets of curcumin, but their absence from the curcumin set may also be an artefact of the competition-based assays performed, and therefore the significance of these additional targets should not be discounted. As such, parallel analyses on both the 196 target set for curcumin as well as the 472 targets identified for PC were performed.
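The overlap figures quoted for the two target sets follow from simple set arithmetic, sketched below. The protein identifiers are placeholders constructed only to reproduce the counts in Figure 30A (26 curcumin-only, 170 shared and 302 PC-only targets); they are not the real target lists, and the function names and toy scores are hypothetical.

```python
def overlap_summary(set_a, set_b):
    """Counts for a two-set Venn diagram plus the fraction of A found in B."""
    shared = set_a & set_b
    return {
        "only_a": len(set_a - set_b),
        "shared": len(shared),
        "only_b": len(set_b - set_a),
        "fraction_of_a_in_b": len(shared) / len(set_a),
    }


def top_n(score_by_protein, n):
    """The n proteins with the highest quantitative score."""
    ranked = sorted(score_by_protein, key=score_by_protein.get, reverse=True)
    return set(ranked[:n])


# Placeholder identifiers arranged to match the counts in Figure 30A.
curcumin_targets = {f"P{i}" for i in range(196)}     # P0 .. P195
pc_targets = {f"P{i}" for i in range(26, 498)}       # P26 .. P497

summary = overlap_summary(curcumin_targets, pc_targets)
print(summary)

# Ranking by quantitative score before taking the overlap (toy scores).
toy_scores = {"T1": 9.5, "T2": 7.1, "T3": 3.2, "T4": 1.4}
print(top_n(toy_scores, 2))
```

With these counts, 170 of the 196 curcumin targets fall within the PC set, which is the basis of the ‘> 85 %’ statement in the caption of Figure 30.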

Over 50 direct targets have been identified for curcumin from a variety of studies (Figure 8); however, only a small number of these have been shown to result from covalent binding. Within the target set of curcumin (196 targets), 5 previously identified targets are found (sarcoplasmic endoplasmic reticulum Ca²⁺ ATPase 2 (ATP2A2),454, 455 DNMT1,263 EGFR,265 inosine monophosphate dehydrogenase 2 (IMPDH2),456 and TXNRD1264) with a further 3 targets within the PC target set (aryl hydrocarbon receptor (AHR),457 β-tubulin (TUBB),458 and α-tubulin (TUBA1C)458). Previously reported targets of curcumin such as DNMT1, EGFR and TXNRD1 have already been outlined for their anti-inflammatory and anticancer effects for sulforaphane in the previous chapter (Chapter 5.2). Molecular docking studies suggested that curcumin might bind to the catalytic thiolate of Cys1226 of DNMT1 to exert its inhibitory effect.263 IMPDH2 and ATP2A2 (also known as SERCA2) have also been proposed to underlie the anticancer effects of curcumin, and covalent binding to these targets is shown here for the first time.455, 456 Tubulin is also a previously reported target of curcumin.458 It is commonly targeted by anticancer agents, such as paclitaxel and vinblastine, whereby disruption of tubulin polymerisation and/or microtubule dynamics can trigger cell death in rapidly dividing cells. A number of other electrophilic natural products have been shown to covalently bind tubulin (both α- and β-isoforms), suggesting it may be a common target for compounds of this kind.105, 109, 459 However, whether it is a critical mechanism for the anticancer effects of curcumin is yet to be demonstrated.

As was the case for sulforaphane, there are also many targets reported in the literature as covalent interactions that are not identified within either the curcumin or PC target set. These include aminopeptidase N (CD13), IKK, interleukin-1 receptor-associated kinase (IRAK) and xanthine dehydrogenase (XDH). However, as discussed previously, this may be a result of the choice of cell line, incubation time or protein extraction technique employed.

Previous chemical proteomic profiling efforts for curcumin by other groups reported in the literature also identified a further 40 to 50 targets of curcumin requiring further validation (Appendix Table 2).132, 133 As commented on earlier (Chapter 1.3.1.2), these studies can be credited for attempting to apply a global protein target profiling strategy to curcumin, but fall short, particularly concerning the appropriate controls needed to provide the high confidence necessary to assign them as bona fide curcumin targets. Cross-referencing the curcumin and PC target sets against these targets indicated 4 targets within the curcumin set (Src substrate cortactin (CTTN), fragile X mental retardation syndrome-related protein 1 (FXR1), KH-type splicing regulatory protein (KHSRP) and KEAP1) with a further 5 targets within the PC target set (DEAD box protein 17 (DDX17), 1,4-alpha-glucan-branching enzyme (GBE1), glyoxalase 1 (GLO1), mitochondrial import receptor subunit TOM70 (TOMM70A), PRDX2). The evidence supporting these particular targets as genuine curcumin targets is therefore strengthened, given the lack of controls and validation employed in these previous studies.


Next, the binding potencies of the identified curcumin targets were assessed. Previously in the profiling of sulforaphane, MIF and KEAP1 had stood out as targets covalently engaged by sulforaphane at significantly lower concentrations than other targets within its target spectrum (Figure 25A). A subset of targets with significantly superior binding potencies was not as clearly observed for curcumin (Figure 30D-G). However, the most potent targets with the lowest EC50 values are likely to be of most interest. The top 10 targets of curcumin identified in this way were HMOX2, RTN3, CYB5B, CDKAL1, ALDH9A1, TEX264, TAP1, KEAP1, FN3KRP and ENDOD1. The identification of KEAP1 as one of the most potent curcumin targets again highlighted its importance as a target of electrophilic species with a well-defined mode of action. However, KEAP1 is not the only target, and many of these other potent binders are yet to be explored for the functional effect exerted upon curcumin binding and how this relates to its anticancer effects.

To appreciate the wider scope of the targets of curcumin, bioinformatic analysis was carried out on the protein target sets profiled against curcumin and PC (Figure 31). The 196 targets of curcumin were analysed first in DAVID (Figure 31A). DAVID analysis revealed enrichment of targets associated with the cell cycle, cell division and metabolic processes including terms such as ‘mRNA processing’ (14 proteins, p-value = 1.3 × 10⁻⁴), ‘cell cycle’ (22 proteins, p-value = 3.8 × 10⁻⁴) and ‘organelle fission’ (11 proteins, p-value = 4.4 × 10⁻⁴). Molecular function of the protein targets was diverse, with ‘RNA binding’ (25 proteins, p-value = 4.3 × 10⁻⁶) as the most significant function associated with the target set, as had also previously been observed for sulforaphane. As a whole-cell lysis buffer was utilised in this experiment, it was of interest to determine where curcumin targets were engaged inside the cell. Proteins were engaged in a number of compartments, including most significantly the nucleolus (30 proteins, p-value = 7.0 × 10⁻⁹) as well as non-membrane bound organelles. This highlighted that even with relatively short incubation times curcumin binds to targets across the entirety of the cell.

The targets identified for PC showed some interesting similarities as well as differences to the target set of curcumin (Figure 31B). Again, biological processes relating to the ‘cell cycle process’ (49 proteins, p-value = 1.6 × 10⁻¹¹) were highly enriched. However, other processes such as ‘intracellular transport’ (61 proteins, p-value = 1.3 × 10⁻¹⁵), ‘translation’ (39 proteins, p-value = 3.1 × 10⁻¹³) and ‘protein localisation in organelle’ (24 proteins, p-value = 3.0 × 10⁻¹¹) were also observed. The molecular functions of protein targets once again favoured ‘RNA binding’ (61 proteins, p-value = 7.7 × 10⁻¹⁴) with many other shared annotations relating to molecular function. On account of the larger number of targets within the PC set, more significant KEGG pathways were associated with the targets identified. One such KEGG pathway was the ‘ribosome’ (19 proteins, p-value = 3.3 × 10⁻⁹), which was the most significantly enriched. Other interesting pathways related to the ‘proteasome’ (10 proteins, p-value = 6.2 × 10⁻⁵) and ‘aminoacyl-tRNA biosynthesis’ (9 proteins, p-value = 1.4 × 10⁻⁴). The latter was identified as the most significant KEGG pathway targeted by sulforaphane (Figure 26A+B). Furthermore, 9 CanSAR targets (ALDH2, DNMT1, dihydropyrimidinase-related protein 2 (DPYSL2), EGFR, IMPDH2, glucocorticoid receptor (NR3C1), ribonucleotide reductase M1 (RRM1), equilibrative nucleoside transporter 1 (SLC29A1) and TXNRD1) were identified for curcumin. Two of these (NR3C1 and SLC29A1) were not identified as targets of sulforaphane.
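The enrichment p-values above express how unlikely it is to see a given number of annotated proteins in a target set by chance. As a rough illustration only, the sketch below computes a plain hypergeometric upper-tail probability. DAVID reports a modified Fisher exact p-value, so the numbers it gives will differ from this calculation, and the background size and annotation count used in the example are hypothetical values that do not come from this study.

```python
from math import comb


def hypergeom_upper_tail(k, n, K, N):
    """P(X >= k) for X annotated proteins among n targets drawn from a
    background of N proteins, K of which carry the annotation."""
    upper = min(n, K)
    favourable = sum(comb(K, i) * comb(N - K, n - i)
                     for i in range(k, upper + 1))
    return favourable / comb(N, n)


# 25 'RNA binding' proteins among 196 targets (from the text), against a
# hypothetical background of 20000 proteins with 900 carrying the term.
p_value = hypergeom_upper_tail(k=25, n=196, K=900, N=20000)
print(f"p = {p_value:.2e}")
```

The calculation depends strongly on the background chosen, which is one reason the values reported by DAVID should not be expected to match a hand calculation of this kind.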


[Figure 31 panels. (A) Curcumin – 196 targets – MDA-MB-231 and (B) PC – 472 targets – MDA-MB-231: bar charts of -log10(p-value) (axis 0 to 20) for enriched terms grouped by KEGG_PATHWAY, GOTERM_CC_FAT, GOTERM_MF_FAT and GOTERM_BP_FAT. (C) Curcumin targets listed by cellular process:

Cell cycle (22 targets): ANAPC7, ANLN, DCTN2, DLGAP5, EGFR, ERCC6L, ITGB1, MKI67, MYH9, NCAPD2, NCAPH, NUMA1, PBK, PDCD6IP, PHGDH, PML, PSMD14, PSMD2, PSME3, SF1, TPX2, TSG101
Cell death (14 targets): ATXN10, ATXN2, DNM1L, FXR1, HSPD1, MGEA5, NUP62, PNPLA6, PDCD6IP, PML, PSME3, RTN3, RTN4, STAT1
Oxidation reduction (16 targets): ALDH18A1, ALDH2, ALDH9A1, CYB5B, EROL1, HMOX2, HCCS, HADHA, HSD17B10, HSD17B12, HSDL1, IMPDH2, PGHDH, RRM1, TXNRD1, TMX1
mRNA processing (14 targets): ADAR, DHX15, GEMIN4, HNRNPF, HNRNPH1, HNRNPH2, HNRNPM, KHSRP, MBNL1, PCBP1, POLR2B, RBM4, RBM4B, SF1
Organelle fission (11 targets): ANAPC7, ANLN, DCTN2, DLGAP5, DNM1L, ERCC6L, NCAPD2, NCAPH, NUMA1, PBK, TPX2
CanSAR targets (9 targets): ALDH2, DNMT1, DPYSL2, EGFR, IMPDH2, NR3C1, RRM1, SLC29A1, TXNRD1]

Figure 31. An overview of the bioinformatic output of the curcumin and PC protein target sets identified in the MDA-MB-231 cell line. (A+B) GO annotations corresponding to biological processes (GOTERM_BP_FAT), molecular function (GOTERM_MF_FAT), cellular compartmentalisation (GOTERM_CC_FAT) and KEGG pathway (KEGG_PATHWAY) enriched within the curcumin (196 targets) and PC (472 targets) sets respectively. (C) The protein target identifications (listed by their gene name) for cellular processes of interest revealed by bioinformatic analysis. Targets with an EC50 value < 100 μM are shown in red to highlight the most potently bound targets.

Protein kinases are an integral part of signal transduction within the cell and are therefore potentially important targets of curcumin. Previously reported direct kinase targets of curcumin include protein kinase C (PKC),267 v-Src,266 GSK-3β,460 phosphorylase kinase (PHK),461 EGFR and HER2.265 All these kinases have been implicated as having roles in cancer and as such have the potential to contribute to the effects induced by curcumin. Within the curcumin target set, 8 kinases are identified including adenylate kinase (ADK), ALDH18A1, tyrosine-protein kinase receptor UFO (AXL), EGFR, FN3KRP, PDZ-binding kinase (PBK), RPS6KA1 and thymidine kinase 1 (TK1). With the exception of EGFR, none of the previously identified kinases are found within the curcumin target set in this study. Many of the previously reported direct kinase targets of curcumin were determined on recombinant proteins with the binding mode undetermined. For the 8 kinases identified in this study, curcumin can be confirmed as engaging its kinase target inside the cell in a covalent manner, thus providing greater insight into the targeting of this kinase subset by curcumin. However, the functional implications of covalent binding require further elucidation.

Other interesting targets include transcription regulator protein BACH1 (BACH1), BRCA1-associated ATM activator 1 (BRAT1), NFKB2, STAT1 and STAT3. STAT1 and STAT3 were shown previously as targets of sulforaphane and they too are conserved targets of curcumin. The impact of curcumin on the NF-κB signalling pathway has also been well-documented in a similar manner to sulforaphane.462 It has been shown to prevent NF-κB activation by inhibiting p65 (RELA) translocation to the nucleus and suppressing IκBα degradation. Identification in this study of the p52 subunit of NF-κB (NFKB2) as a curcumin target in addition to the redox regulators of the pathway (TXN and TXNRD1) also provides further mechanistic insight into the effect of curcumin on NF-κB signalling. BRAT1 protein is associated in a complex with serine-protein kinase ATM, which is believed to be a master controller of cell cycle checkpoint signalling. Curcumin targeting BRAT1 therefore may disrupt this complex, which plays an important role in the protective DNA damage response of cancer cells to ionising radiation.463

Network analysis as visualised in Cytoscape for the curcumin and PC target sets provided insight into the target interactome of curcumin (Figure 32). The curcumin target network was relatively simple, with the average number of interactions per node calculated as only 0.68, showing limited interactions between its 196 proteins (Figure 32A). The highest interacting nodes included NR3C1, EGFR, STAT1, STAT3 and TXN. The majority of the highly interacting nodes were targets with an EC50 > 100 μM, with the exception of KEAP1 and NR3C1. Although an EC50 of 92.4 μM was calculated for NR3C1, this target represented one of the more potent curcumin targets within the overall target set (ranked 39th within the 196 targets). NR3C1 can function both as a transcription factor that binds to glucocorticoid response elements (GREs) to activate transcription of a diverse array of genes, and as a regulator of other transcription factors. It plays an important functional role in inflammatory responses as well as cellular proliferation and differentiation.464 The native ligands of NR3C1 are glucocorticoids, a class of steroid hormones, which themselves contain an electrophilic α,β-unsaturated carbonyl motif. This may explain why a similarly hydrophobic electrophile like curcumin may also be able to effectively bind to and activate NR3C1 in an agonistic manner. The larger target set for PC revealed a substantially more complex interactome (Figure 32B) with the average number of interactions per node calculated as 2.15. Highly connected nodes with potent EC50 values included EGFR, NR3C1 and nucleoporin 153 protein (NUP153). This observation further supports NR3C1 as a potentially key target of curcumin in addition to the previously discussed EGFR.
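The two network statistics used above, the average number of interactions per node and the ranking of nodes by connectivity, can be sketched as follows. The statistics in this work were obtained in Cytoscape and the exact definition it applied is not restated here, so the sketch reports both plausible readings of ‘interactions per node’ (edges divided by nodes, and mean node degree). The node names and edges are hypothetical placeholders, not real protein interactions.

```python
from collections import Counter


def network_summary(nodes, edges, top=3):
    """Edges per node, mean degree and the most highly connected nodes.

    Duplicate edges (in either direction) and self-loops are ignored.
    """
    unique_edges = {frozenset(edge) for edge in edges if edge[0] != edge[1]}
    degree = Counter()
    for edge in unique_edges:
        for node in edge:
            degree[node] += 1
    edges_per_node = len(unique_edges) / len(nodes)
    mean_degree = 2 * len(unique_edges) / len(nodes)
    ranked = sorted(nodes, key=lambda node: (-degree[node], node))[:top]
    return edges_per_node, mean_degree, [(node, degree[node]) for node in ranked]


# Hypothetical toy network: six targets, two of them unconnected.
toy_nodes = ["T1", "T2", "T3", "T4", "T5", "T6"]
toy_edges = [("T1", "T2"), ("T1", "T3"), ("T2", "T3"),
             ("T2", "T4"), ("T2", "T1")]   # last edge duplicates the first

per_node, mean_deg, hubs = network_summary(toy_nodes, toy_edges)
print(per_node, mean_deg, hubs)
```

Unconnected targets are kept in the denominator here, which is why a sparse network such as that of curcumin can give an average below one interaction per node.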


[Figure 32 panels. (A) Interaction network of the curcumin targets. (B) Interaction network of the mono-O-propylcurcumin (PC) targets. (C) Tables of the top 10 most highly connected nodes:

Curcumin targets (196 total)
Protein    Interacting nodes    EC50 (μM) (rank within target set)
NR3C1      7                    92.4 (#39)
EGFR       6                    118.5 (#92)
STAT3      6                    151.3 (#183)
STAT1      5                    120.4 (#96)
TXN        4                    149.8 (#178)
UBE3C      4                    130.1 (#122)
GEMIN5     3                    144.2 (#158)
KEAP1      3                    36.1 (#8)
UBE2O      3                    122.5 (#100)
HSPD1      3                    126.6 (#108)

PC targets (472 total)
Protein    Interacting nodes    EC50 (μM) (rank within target set)
EGFR       28                   80.9 (#64)
RPS27A     28                   155.4 (#401)
KPNB1      19                   169.3 (#439)
SPCS2      19                   116.4 (#239)
STAT3      18                   116.7 (#240)
NR3C1      16                   85.3 (#80)
XPO1       16                   138.3 (#336)
NUP153     16                   73.8 (#50)
RANBP2     15                   87.4 (#93)
SMURF2     15                   101.0 (#155)]

Figure 32. Protein interaction networks of curcumin (A) and PC (B) protein targets generated in Cytoscape. Proteins (represented as nodes) are colour-coded (gradient of yellow-red) representing the calculated EC50 value for target binding potency such that the most potent binders (with the lowest EC50) are represented in red. The interactions between proteins (represented as edges) are also colour-coded to represent the different types of interaction: black (direct), pink (phosphorylation), blue (reactome), green (reaction) and orange (transcriptional). (C) The tables accompanying each interaction network display the top 10 most highly connected nodes within the network and list their associated EC50 values (with rank within the entire target set). Targets with an EC50 < 100 μM are highlighted in red. The original Cytoscape networks can be found in Appendix Table 1.

There are over 7000 publications in PubMed for curcumin and, as has been previously highlighted, the number of targets, both direct and indirect, that it has been shown to interact with makes it extremely challenging to understand its clearly complex pharmacology and provide insights into its mode of action. Even limiting discussion to curcumin in the context of breast cancer still reveals over 300 publications, with almost half of these studies carried out in the MDA-MB-231 cell line. Given the vast amount of literature surrounding curcumin, it is therefore unfeasible to discuss the potential significance of each and every curcumin target identified in this study. However, the target data generated and subsequent analyses have provided some key insights into the covalent targets of curcumin. Firstly, curcumin appears to engage targets across a range of potencies from low to mid μM, which may be therapeutically significant. Secondly, previously well-studied anticancer targets of curcumin such as EGFR, KEAP1 and NR3C1 have been shown to be potent targets of curcumin, supporting the currently held belief that they may well be important individual targets within the curcumin set. Thirdly, novel, unexplored targets of curcumin which bind with the highest potency in this study, such as HMOX2 (EC50 = 13.3 μM), RTN3 (EC50 = 19.6 μM) and ALDH9A1 (EC50 = 27.5 μM), warrant further investigation for the functional impact of curcumin binding on these targets.

5.3.3 Target profiling of piperlongumine

In an analogous manner to curcumin, the 446 high confidence targets of piperlongumine showed a range of binding potencies (highlighted by the calculated EC50 values for each target) in the low to mid μM range. In contrast to curcumin, only a small number of direct piperlongumine targets have been reported in the literature (Figure 8), and even fewer in breast cancer cell lines such as the MDA-MB-231 cell line (only 6 studies). Many of the reported targets from the literature were identified by Lee and co-workers as a result of their quantitative affinity pulldown approach for identifying piperlongumine targets in cancer protein lysates.123 Of the previous 16 identified targets of piperlongumine, 5 were identified within the piperlongumine target set in this study (neuroblast differentiation-associated protein AHNAK, GLO1, GSTO1, proteasome (PSMB8, PSME2, PSME3 and PSMD14) and STAT3). An additional 5 targets were identified as targets of the ABP but were not deemed to be high confidence targets of piperlongumine, as piperlongumine could not successfully compete for ABP labelling above the set threshold applied (ANXA5, GSTM3, PRDX1, RPS5 and VIM). Identification of 10 out of 16 previously identified piperlongumine targets highlights the complementarity of the target profiling methodologies. However, this study shows that previously identified in-lysate targets may not be significant in-cell and moreover significantly expands the number of targets for piperlongumine.

The landmark study from Lee and co-workers had suggested that piperlongumine may be inducing its cancer-specific cell death through disrupting redox homeostasis within cancer cells, leading to a ROS-dependent cell death mechanism. Of the 12 targets identified in their study, the majority had roles in redox regulation, which supported this hypothesis. However, further studies indicated that ROS induction alone by piperlongumine was not sufficient to induce cancer cell death. It was suggested that the ability to form covalent adducts with proteins to disrupt other signalling pathways may be more closely associated with selective cancer cell death than ROS induction.312, 313 Therefore the identification of the wider target spectrum, as has been carried out in this study, should further elucidate its therapeutic potential. In this regard, a large number of extremely interesting targets of piperlongumine are identified in this study which seemingly engage with piperlongumine at low μM concentrations, importantly below the concentrations at which its anticancer effects are reported in vitro. These include proteasomal ubiquitin receptor ADRM1, aurora A kinase (AURKA), checkpoint kinase 1 (CHEK1), fatty acid-binding protein 5 (FABP5), GSTO1, KEAP1, 5'-nucleotidase domain-containing protein 1 (NT5DC1), RELA, RB1 and STAT3. Some of these targets have already been discussed in the context of sulforaphane and curcumin. Indeed piperlongumine shares many of the same activities and targets with these other dietary electrophiles, as will be discussed further in Chapter 5.5.

The 446 targets were subjected to bioinformatic and network analysis to attempt to link the targets to the anticancer activity of piperlongumine (Figure 33A). DAVID analysis once again revealed a strong association with biological processes relating to the cell cycle and cell division, with terms including ‘cell cycle’ (57 proteins, p-value = 1.8 × 10⁻¹²), ‘M phase’ (34 proteins, p-value = 2.6 × 10⁻¹¹), ‘cytoskeleton organisation’ (33 proteins, p-value = 1.0 × 10⁻⁷) and ‘intracellular transport’ (42 proteins, p-value = 1.4 × 10⁻⁷). ‘RNA binding’ (58 proteins, p-value = 9.6 × 10⁻¹⁴) was also the most significant molecular function and targets appeared to be well distributed across the cell. The KEGG pathway analysis revealed some interesting insights, with the two most significantly enriched pathways relating to two cancers: ‘chronic myeloid leukaemia’ (9 proteins, p-value = 1.3 × 10⁻³) and ‘pancreatic cancer’ (8 proteins, p-value = 4.3 × 10⁻³). Furthermore, ‘apoptosis’ (7 proteins, p-value = 0.038) and ‘pathways in cancer’ (17 proteins, p-value = 0.025) also appeared as enriched KEGG pathways within the piperlongumine target set.

Network analysis visualised in Cytoscape for the piperlongumine targets provided insight into the interactome of piperlongumine (Figure 33B). The result was a complex network of interactions within the target set (the average number of interactions per node was 2.40, making this the most complex network of the three compounds profiled), which identified highly interacting nodes that were highly potent for piperlongumine binding such as AURKA, CHEK1, EGFR and RB1. AURKA regulates centrosome maturation, entry into mitosis, formation and function of the bipolar spindle, and cytokinesis.465 It is typically overexpressed in solid tumours and drives resistance to apoptosis as well as increasing drug resistance in breast cancer cells.466 It has been a popular anticancer therapeutic target in recent years, with a number of drugs in early clinical development.467 CHEK1 (also known as Chk1) is one of the major players in the signal transduction pathway in response to DNA damage that contributes to the maintenance of genetic stability, which is crucial for dividing cancer cells.468 The generation of CHEK1 inhibitors has also drawn much attention in the last few years as a way of sensitising tumours in combination treatments with other therapeutics.469 Determining the functional impact of piperlongumine on both AURKA and CHEK1 warrants further study. Aside from these two kinases, over 25 other kinase targets of piperlongumine are identified, highlighting the potential of piperlongumine to exert its effect on multiple intracellular signalling cascades simultaneously, which makes drawing definitive conclusions on its mode of action no trivial task.


(A) Piperlongumine – 446 targets – MDA-MB-231

Alanine, aspartate and glutamate metabolism Pyrimidine metabolism Apoptosis Pathw ays in cancer Cytosolic DNA-sensing pathw ay KEGG_PATHWAY Ubiquitin mediated proteolysis Endocytosis RIG-1-like receptor signaling pathw ay Pancreatic cancer Chronic myeloid leukemia Mictrotubule Organelle lumen Cytoskeletal part Intracellular organelle lumen Nucleolus GOTERM_CC_FAT Microtubule cytoskeleton Cytoskeleton Nuclear lumen Cytosol

Ribonucleotide binding Non-membrane bound organelle Nucleoside binding Double-stranded RNA binding Adenyl ribonucleotide binding Nucleotide binding Ras GTPase binding GOTERM_MF_FAT Cytoskeletal protein binding Enzyme binding Actin binding RNA binding Cell division Intracellular transport Cytoskeleton organisation Nuclear division Organelle fission Cell cycle process GOTERM_BP_FAT M Phase Cell cycle Mitotic cell cycle Cell cycle process 0 5 10 15 20

-log10(p-value)

(B)

Piperlongumine targets (446 total)

EC50 (μM) Interacting (rank Protein Nodes within target set)

YWHAQ 30 81.3 (#360)

EGFR 27 37.2 (#142)

CHEK1 23 12.4 (#17)

RB1 19 9.6 (#10)

XPO1 19 104.6 (#417)

PRPF40A 17 41.4 (#171)

AURKA 17 55.2 (#263)

SRPK1 16 77.8 (#347)

HNRNPK 14 55.4 (#264)

STAT1 14 26.6 (#68)

(C)

Cell cycle (57 targets) Cell death (36 targets) CanSAR targets (8 targets) AHR, AIMP2, ARHGEF2, ATXN10, ATXN2, BAG2, DNMT1, DPYSL2, EGFR, HDAC1, HPRT1, AHR, ANLN, ARAP1, ARHGEF2, BAG5, BAX, BID, CDK5, CIAPIN1, DYNLL1, EIF2AK2, MAP2K1, NR3C1, RRM1 BCCIP, CD2AP, CDK5, CDK6, FXR1, GSPT1, HPRT1, LUC7L3, MAP1S, MGEA5, CEP55, CHEK1, CKAP5, DLGAP5, MSH2, NOL3, NUP62, OPTN, PDCD6IP, PML, DYNC1H1, EGFR, EML4, ERCC6L, PNPLA6, PRKDC, PSME3, RIPK1, RRAGC, RTN3, Pathways in cancer (17 targets) GSPT1, HCFC1, ILF3, KHDRBS1, RTN4, SCRIB, SPG20, SQSTM1, STAT1 BAX, BID, CBL, CDK6, CHUK, CRKL, EGFR, KIF23, KIF2C, KIFC1, KPNA2, HDAC1, MAP2K1, MSH2, NFKB2, PML, MAP2K1, MCM3, MKI67, MSH2, RB1, RELA, STAT1, STAT3, TPM3 NASP, NCAPD2, NCAPG, NCAPH, Intracellular transport (42 targets) NEK9, NUDC, NUMA1, PBK, AIP, AP3B1, ARHGEF2, ARL1, ATP2A2, BAX, BID, PDCD6IP, PDS5A, PHGDH, PML, CDK5, COPB1, FLNA, GOPC, IPO5, KLC1, KLC2, Protein kinase cascade (16 targets) PPM1G, PSMB8, PSMD14, PSME2, KPNA2, MALT1, MAP1S, MAP2K1, MGEA5, MYBBP1A, CDK5, CHUK, CRKL, EGFR, GNAI2, PSME3, RB1, RCC2, SF1, SMC2, MYL6, NUP155, NUP98, OPTN, PHAX, PML, PUM1, MALT1, MAP2K1, MAP2K1, PKN1, PTPN11, SPAG5, TACC1, TACC3, TPX2, RAB14, RANBP2, RANBP3, SEC23IP, SEC24B, RIPK1, RPS6KA1, RPS6KA3, SRPK1, TSG101, TTK, USP9X, ZW10 SEC24C, SEC61B, SEC63, SQSTM1, SRPR, TAP2, STAT1, STAT3, STK38, THOP1, TRIP6 THOC2, TNPO2, XPO5, ZW10

Figure 33. Bioinformatic and network analysis of the piperlongumine target set identified in the MDA-MB-231 cell line. (A) GO annotations corresponding to biological processes (GOTERM_BP_FAT), molecular function (GOTERM_MF_FAT), cellular compartmentalisation (GOTERM_CC_FAT) and KEGG pathway (KEGG_PATHWAY) generated from the 446 targets of piperlongumine in DAVID. (B) Protein target networks generated in Cytoscape. Proteins (represented as nodes) are colour-coded (gradient of yellow-red) representing the calculated EC50 value for target binding potency such that the most potent binders (with the lowest EC50) are represented in red. The interactions between proteins (represented as edges) are also colour-coded to represent the different types of interaction: black (direct), pink (phosphorylation), blue (reactome), green (reaction) and orange (transcriptional). The table accompanying each interaction network displays the top 10 most highly connected nodes within the network and lists their associated EC50 values (with rank within the entire target set). (C) The protein target identifications (listed by their gene name) for cellular processes of interest revealed by bioinformatic analysis. Targets with an EC50 value < 20 μM are shown in red. The original Cytoscape networks can be found in Appendix Table 1.
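The ranking of the 'most highly connected nodes' reported alongside the network amounts to counting, for each protein, its number of distinct interaction partners (the node degree). A minimal Python sketch of such a ranking is given below. The edge list is an invented toy example and not the Cytoscape network from this study, and the function name `top_connected` is illustrative only.

```python
from collections import defaultdict

def top_connected(edges, n=10):
    """Rank nodes by their number of distinct interaction partners (degree)."""
    partners = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # ignore self-interactions
        partners[a].add(b)
        partners[b].add(a)
    # Sort by degree (descending), then by name so that ties are ordered stably
    ranked = sorted(partners.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    return [(node, len(p)) for node, p in ranked[:n]]

# Invented toy edges for illustration only (not interactions from the study)
toy_edges = [("EGFR", "STAT1"), ("EGFR", "STAT3"), ("EGFR", "CHEK1"),
             ("STAT1", "STAT3"), ("RB1", "CHEK1")]
print(top_connected(toy_edges, n=3))
```

Counting distinct partners rather than edges avoids inflating the degree of a protein pair linked by several interaction types (for example both 'direct' and 'phosphorylation' edges).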

5.3.4 Target profiling of THC and THP

To confirm that target engagement occurred through the two α,β-unsaturated carbonyls of curcumin and piperlongumine, the respective ABPs were competed against the reduced analogues of the parent compounds, THC and THP, and the resulting protein targets were identified by proteomics. The ‘spike-in’ SILAC, competition-based chemical proteomic workflow was again used to identify protein targets of the ABP alone and of the ABP in competition with 100 μM THC or THP in live, intact MDA-MB-231 cells (Figure 34A). Initial SDS-PAGE and in-gel fluorescence analysis revealed no reduction in ABP labelling upon competition with THC or THP, indicating that the two α,β-unsaturated carbonyl motifs of curcumin and piperlongumine are essential for covalent protein binding (Figure 34B). Unexpectedly, THC appeared to increase ABP labelling relative to the ABP alone (Figure 34B, lanes 4-6 versus lanes 1-3 respectively).

Proteomic identification of the targets of the ABP alone and of the ABP in competition with THC or THP confirmed this lack of competition: no target displayed a quantitative score greater than the cut-off threshold previously deemed necessary for genuine target identification (Figure 34C+D). THC and THP were therefore both incapable of competing for protein labelling by the ABP, in contrast to curcumin and piperlongumine. These results show that engagement of all high confidence targets identified for curcumin and piperlongumine (196 and 446 targets respectively) requires the α,β-unsaturated carbonyl electrophilic motif.
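The classification applied here (competed by the parent compound but not by its reduced analogue) can be sketched in a few lines of Python. The cut-off of 0.585 on the log2 scale is the value quoted in the caption of Figure 34 (log2 of 1.5 is approximately 0.585). The function name, the assumption that scores are supplied as raw (non-log) positive ratios, and the toy scores are illustrative assumptions rather than the analysis pipeline used in this work.

```python
import math

LOG2_CUTOFF = 0.585  # threshold quoted in the Figure 34 caption

def requires_electrophile(score_vs_parent, score_vs_reduced, cutoff=LOG2_CUTOFF):
    """True if the target is competed by the parent compound but not by
    its reduced analogue. Both scores are assumed to be raw ratios > 0."""
    return (math.log2(score_vs_parent) > cutoff
            and math.log2(score_vs_reduced) < cutoff)

# Hypothetical scores: {protein: (ABP v parent, ABP v reduced analogue)}
toy_scores = {"P1": (4.0, 1.0), "P2": (1.2, 1.1), "P3": (3.0, 2.5)}
hits = [p for p, (par, red) in toy_scores.items()
        if requires_electrophile(par, red)]
print(hits)  # ['P1']
```

In this toy example only P1 passes: P2 is not competed by the parent compound, and P3 is competed by both the parent compound and the reduced analogue.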

The importance of the two α,β-unsaturated carbonyls to the activity of piperlongumine has not been particularly well explored, but a couple of recent studies suggest they are essential for the anticancer effects of piperlongumine.312, 313 This is supported by the cell viability observations in Chapter 3.4 whereby THP showed no effects on MDA-MB-231 cell viability. The case for curcumin is less clear. It has already been discussed that curcumin is a multi-faceted molecule whose biological activity is not limited to its two α,β-unsaturated carbonyl motifs and its ability to form covalent adducts. Other structural motifs such as the phenolic groups and the diketone bridge have been shown to directly contribute to its pleiotropic nature.274 In light of this, THC retains many of the same activities as curcumin, but lacks the anti-inflammatory and many of the anticancer effects associated with curcumin.332 This is further supported by the lack of an effect of THC on MDA-MB-231 cell viability in earlier studies (Chapter 3.4). Taken together, this would suggest that the α,β-unsaturated carbonyl motifs of curcumin and piperlongumine are key for their anticancer effects. The 196 and 446 targets identified for curcumin and piperlongumine respectively are a direct consequence of their two α,β-unsaturated carbonyls, which implies that the targets responsible for their therapeutic effects lie within these sets. Further investigation is warranted to dissect the SAR surrounding the two α,β-unsaturated carbonyls of each compound for their individual contributions to the anticancer effects. For example, selective reduction of each of the two α,β-unsaturated carbonyls to the corresponding dihydro- analogues would provide compounds that could be used in competition against the respective ABPs to unravel the different contributions of the electrophilic motifs to target binding. This was not explored in these studies.

[Figure 34, panels A-D (image not reproduced).

(A) Experimental design. Samples 1-3 (normal media, in triplicate): CURC ABP 1 only. Samples 4-6 (normal media, in triplicate): CURC ABP 1 + x μM THC (x = 100 μM). ‘Spike-in’ (‘heavy’ media): CURC ABP 1 only. Samples 7-9 (normal media, in triplicate): PIP ABP only. Samples 10-12 (normal media, in triplicate): PIP ABP + y μM THP (y = 100 μM). ‘Spike-in’ (‘heavy’ media): PIP ABP only. Curcumin ABP 1: 5 μM across samples 1-6, 20 μM for ‘spike-in’ sample. Piperlongumine ABP: 2 μM across samples 7-12, 8 μM for ‘spike-in’ sample. The structures of curcumin and piperlongumine (electrophilic motifs 1 and 2 marked) are shown together with THC and THP, obtained by reduction of the α,β-unsaturated carbonyls.

(B) In-gel fluorescence and Coomassie staining of samples treated with 5 μM CURC ABP 1 or 2 μM PIP ABP, with or without 100 μM THC or 100 μM THP. Mw markers (kDa): 250, 150, 100, 75, 50, 37, 25, 20, 15.

(C) Curcumin versus THC. x-axis: log2(quantitative score of ABP v 100 µM THC); y-axis: log2(quantitative score of ABP v 100 µM curcumin).

(D) Piperlongumine versus THP. x-axis: log2(quantitative score of ABP v 100 µM THP); y-axis: log2(quantitative score of ABP v 100 µM piperlongumine).

Highlighted region in (C) and (D): targets competed by parent compound (curcumin or piperlongumine) but not by reduced parent compound (THC or THP).]

Figure 34. Overview of target identification of ABP in competition with reduced analogues of curcumin and piperlongumine (THC and THP respectively). (A) Experimental design for ‘spike-in’ SILAC, competition-based chemical proteomics workflow for respective ABP against THC and THP. Samples were prepared in triplicate and the ‘spike-in’ SILAC lysate was added at a 1:3 ratio. (B) In-gel fluorescence analysis of the labelling profile of samples (1-12). (C+D) Scatter plots of quantitative score of ABP competed against 100 μM reduced analogue of parent compound (THC or THP) (x-axis) and quantitative score of ABP competed against 100 μM parent compound (curcumin or piperlongumine) (data from Chapter 5.3) (y-axis). Targets displaying a log2(quantitative score) value > 0.585 for curcumin/piperlongumine and a log2(quantitative score) value < 0.585 for THC/THP are highlighted in the green quadrant and correspond to genuine target identifications that require the α,β-unsaturated carbonyl motifs for target binding. 475 protein targets for the ABP were conserved for curcumin (C) and 556 protein targets for the ABP were conserved for piperlongumine (D).

5.4 Target validation with Western blotting

High confidence protein targets identified by MS for curcumin (196 targets in MDA-MB-231), sulforaphane (426 targets in MDA-MB-231 and 290 targets in MCF7) and piperlongumine (446 targets in MDA-MB-231) (Chapter 5.2 and 5.3) were then validated by WB analysis. Western blotting provides an orthogonal technique to MS to confirm that the ABP binds to a specific protein target of interest, and over the last few decades it has become the ‘gold standard’ method for confirming MS-based observations. Competition-based assays for WB analysis were therefore carried out in a similar manner to those reported for the MS-based proteomics experiments. Briefly, ABP and parent compound competition experiments were carried out in live MDA-MB-231 cells as described previously. The experimental samples were as follows: curcumin ABP 1 (25 μM) was competed against three concentrations of curcumin (25 μM, 100 μM and 150 μM); sulforaphane ABP 2 (25 μM) was competed against three concentrations of sulforaphane (25 μM, 125 μM and 250 μM); and piperlongumine ABP (10 μM) was competed against three concentrations of piperlongumine (10 μM, 50 μM and 100 μM). Higher concentrations of ABP were employed relative to the MS-based experiments to increase the likelihood of detection by WB. Following compound treatment, cells were lysed and the lysates functionalised with AzTB by CuAAC. Protein targets were then affinity enriched on Neutravidin sepharose resin, followed by stringent washing steps. Three SDS-PAGE samples were then prepared for each experimental condition: pre-pull down lysate (PPD), supernatant lysate left after the affinity enrichment (SN) and proteins immobilised on the resin as a result of pulldown (PD). Proteins from each of these three samples were separated by SDS-PAGE, followed by transfer to PVDF membranes for subsequent WB analysis.

Validating every protein target of the three compounds under study by WB analysis would require a herculean effort in terms of both cost and labour. Instead, a carefully selected panel of protein targets identified by MS for each compound was chosen for validation (shown in Figure 35A-C, including HSP90, STAT3, STAT1, IMPDH2, HCCS, GSTO1, BID, FXR1 and KHSRP, and in Appendix Figure 12, including PSMC1, MARCKS, MIF and CDK2). The correlation between the MS-based and WB-based observations is summarised in Figure 35D.

[Figure 35, panels A-D (images not reproduced). In panels A-C each blot comprises lanes 1-24, with ABP present in every lane and samples loaded in repeating order PPD, SN, PD. Competitor concentrations are given as printed in the lane headers.

(A) Curcumin. Panel header: MCF-7 cell line, MDA-MB-231 cell line. ABP (25 μM). NC (μM): lanes 1-3, none; 4-6, 25; 7-9, 100; 10-12, 150; 13-15, none; 16-18, 10; 19-21, 50; 22-24, 100. Blots: α-HSP90, α-STAT1, α-STAT3, α-IMPDH2, α-TUBULIN(α), α-ACTIN(β), α-HCCS, α-GSTO1, α-FXR1.

(B) Piperlongumine. ABP (10 μM). PIP (μM): lanes 1-3, none; 4-6, 10; 7-9, 50; 10-12, 100; 13-15, none; 16-18, 10; 19-21, 50; 22-24, 100. Blots: α-HSP90, α-STAT3, α-STAT1, α-IMPDH2, α-TUBULIN(α), α-HCCS, α-GSTO1, α-BID, α-FXR1.

(C) Sulforaphane. ABP (25 μM). SULF (μM): lanes 1-3, none; 4-6, 25; 7-9, 125; 10-12, 250; 13-15, none; 16-18, 10; 19-21, 50; 22-24, 100. Blots: α-STAT1, α-STAT3, α-KHSRP, α-TUBULIN(α), α-ACTIN(β), α-HCCS, α-FXR1.

(D) Colour-coded summary table (cell colours not recoverable from the extracted text). Rows: HSP90, STAT1, STAT3, IMPDH2, TUBULIN (α), ACTIN (β), HCCS, GSTO1, FXR1, BID, KHSRP. Columns: NC target by MS (MDA-MB-231), NC target by WB (MDA-MB-231), SULF target by MS (MDA-MB-231), SULF target by WB (MDA-MB-231), PIP target by MS (MDA-MB-231), PIP target by WB (MDA-MB-231), NC target by WB (MCF7), SULF target by MS (MCF7), SULF target by WB (MCF7), PIP target by WB (MCF7).]

Figure 35. (A-C) WB analyses from multiple protein targets (β-actin, BID, FXR1, GSTO1, HCCS, HSP90, IMPDH2, KHSRP, STAT1, STAT3 and α-tubulin) for curcumin (A), piperlongumine (B) and sulforaphane (C) across both the MCF7 and MDA-MB-231 cell lines. PPD = pre-pull down lysate, SN = supernatant lysate left after the affinity enrichment, and PD = proteins immobilised on the resin as a result of pulldown. The in-gel fluorescence images are shown in Appendix Figure 11. (D) Summary table of the comparison of PD identified protein targets by WB analyses to MS-based protein target identification for curcumin, sulforaphane and piperlongumine (from Chapter 5.2 and 5.3). Green indicates a clear positive identification, yellow indicates a weak positive identification, red indicates a negative identification, white indicates the appropriate WB was not performed for the given protein. NC = curcumin, PIP = piperlongumine and SULF = sulforaphane.

A number of the targets showed both positive identification upon affinity enrichment in the ABP only sample and a concentration-dependent decrease in ABP labelling upon competition with parent compound, validating the MS-based observations (Figure 35 - STAT1, STAT3 and FXR1). Aside from confirming the observations from the MS-based proteomics experiments, WB analysis provides insight into the degree of target engagement of the compounds on a particular protein relative to its total amount. This can be assessed by comparing the PPD, SN and PD lanes for each WB in Figure 35. Generally speaking, across the 10 or so protein targets studied here, the fraction of target captured by the ABP for any of the three compounds is low relative to the pool of free protein. Increasing the concentration of ABP applied or the in-cell ABP exposure time may well increase the fraction of target engaged by the ABP (this is evident in Appendix Figure 12 for GSTO1 and HCCS for example). It should be noted that the readout for protein target capture (PD) depends on ABP binding to the protein target. In these experiments, each of the three ABPs was only exposed to live MDA-MB-231 and MCF7 cells for 30 min at a concentration of 25 μM, 10 μM and 25 μM for curcumin ABP 1, piperlongumine ABP and sulforaphane ABP 2 respectively (despite the fact that parent compound was applied in competition at concentrations of 100 μM or higher).

Targets identified by MS were not always confirmed by WB analysis. Firstly, many targets do not have antibodies commercially available or have antibodies of poor quality (e.g. KEAP1 and VDAC2 – data not shown) that prevented a reliable readout by WB. Secondly, there were targets that showed inconsistencies between the results obtained by MS and WB, whereby protein targets identified by MS did not show a positive identification in the PD sample upon WB analysis (Appendix Figure 12 - MIF, KHSRP and CDK2). It is possible that the ABP modification, in addition to the fluorophore/biotin tags, may interfere with antibody recognition of the protein target. Moreover, it could be a result of decreased sensitivity of detection by WB analysis in comparison to MS or subtle differences in sample handling or detection between the two methodologies. Thirdly, there were protein targets identified by WB that were not revealed as targets by MS (Figure 35 - α-tubulin and β-actin). The increased concentration of ABP applied for the WB experiments (5-fold higher) relative to the MS-based experiments may explain this observation.

WB analysis provides an orthogonal technique to supplement the MS-based protein target identifications, albeit with lower throughput. The value of developing and optimising a WB-based assay for a specific protein target of interest has been shown recently by Jones and co-workers, who used a similar approach to assess the target engagement and occupancy of a covalent inhibitor of the mRNA-decapping scavenger enzyme DcpS.470 Similar assays have been carried out to assess covalent kinase inhibitors.241, 471-474 Something similar could be developed for specific targets of the compounds under study here. As alluded to in Chapter 4.1.2, there is a need to address electrophile-protein adduct formation in a quantitative manner, both in terms of engagement and occupancy, inside relevant cellular environments, particularly against interesting protein targets that carry functional significance such as EGFR, KEAP1, STAT3 and NF-κB.475 This is not something that has been explored for dietary electrophiles such as curcumin, piperlongumine and sulforaphane. In these studies, STAT1, STAT3 and FXR1 have the potential to be explored further should they turn out to be functionally important targets.

Without knowledge of how much of a target is engaged by an electrophile, it is difficult to attribute pharmacological effects to perturbations of the protein(s) of interest. Calculation of binding potency parameters (EC50 value) for curcumin and piperlongumine was carried out in the preceding chapter; however, this does not provide information on the concentration required for total saturation of target engagement or any indication of the concentrations relevant for a functional response. The WB analysis provides insight into the former but not the latter. Under the conditions employed here, the available cellular pool of STAT1, STAT3 or FXR1 is not fully engaged by the ABP. One might predict that polypharmacological compounds, as a result of their intrinsic promiscuous nature, are unlikely to fully engage a target at therapeutically achievable concentrations. However, does a compound need to fully engage a target to have a functional effect on it? If not, how much does a target need to be engaged for a functional effect to be observed? These questions are difficult to answer. However, understanding the dynamics of ABP engagement with relevant target proteins over time and concentration gradients, in an attempt to understand the differing reactivity and ABP-protein adduct stability of individual targets, will certainly help address these issues.

WB analyses therefore have an important role to play both as an orthogonal technique for confirmation of MS-based observations and in assessing target engagement and occupancy. However, the continued development of MS-based techniques such as selected reaction monitoring (SRM) and multiple reaction monitoring (MRM), in addition to improvements in MS capabilities, is beginning to make WB analyses more redundant, as many of the previously unique features of WB can now be addressed by MS.476, 477 Nevertheless, developing WB assays for key targets of these three compounds can still make a valuable contribution to understanding the target sets of these polypharmacological entities.

5.5 Overlap of targets between curcumin, piperlongumine and sulforaphane

Having individually and comprehensively profiled the targets of curcumin, piperlongumine and sulforaphane to reveal 196, 446 and 426 targets respectively for the three compounds in the MDA-MB-231 cell line (Chapter 5.2 and 5.3), it was next of interest to cross-compare the target and functional overlap between the three compounds. Given that all three compounds display anticancer effects, albeit with differing potencies and through differing mechanisms in some cases, targets that are conserved across the three compounds may hold the key to understanding their anticancer effects. Furthermore, differences within the target sets of the compounds may provide hypotheses for explaining differences in observable activities, providing candidate protein targets for further study. This highlights the benefit of profiling three compounds in parallel as a strategy to further elucidate each of their modes of action.

The protein target overlap is shown in Figure 36A, identifying 81 targets which are conserved across all three compounds. These targets included ATP2A2, AXL, DNMT1, EGFR, FXR1, GSTO1, HAT1, HMOX2, KEAP1, NFKB2, RRM1, STAT1, STAT3, VDAC2 and VDAC3, which have all been discussed in the preceding chapters as a result of their functional importance in cancer. The fact that all three compounds target these proteins suggests these proteins can accommodate reaction with a diverse range of electrophiles and structural features. DAVID analysis of the 81 conserved targets, carried out in the same manner as for the individual target set of each compound, again revealed enrichment of terms relating to the cell cycle and cell division (Appendix Figure 13). This suggests that the conserved target set may well contain the targets necessary for anticancer effects, given that all three compounds display the ability to kill cancer cells.
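The three-way comparison described above is a set operation on the gene names of each target list. A minimal Python sketch is given below, assuming each target set is available as a collection of gene names. The function name `venn3_regions` is illustrative, and the toy sets are invented: EGFR and KEAP1 are named in the text as conserved targets, whereas X1 to X4 are hypothetical placeholders.

```python
def venn3_regions(a, b, c):
    """Return the sizes of the seven regions of a three-set Venn diagram."""
    a, b, c = set(a), set(b), set(c)
    return {
        "a_only": len(a - b - c),
        "b_only": len(b - a - c),
        "c_only": len(c - a - b),
        "a_b_only": len((a & b) - c),
        "a_c_only": len((a & c) - b),
        "b_c_only": len((b & c) - a),
        "all_three": len(a & b & c),
    }

# Toy target sets for illustration only (X1-X4 are hypothetical placeholders)
curcumin = {"EGFR", "KEAP1", "X1", "X2"}
piperlongumine = {"EGFR", "KEAP1", "X2", "X3"}
sulforaphane = {"EGFR", "KEAP1", "X3", "X4"}

regions = venn3_regions(curcumin, piperlongumine, sulforaphane)
print(regions["all_three"])  # 2
```

A useful sanity check on any such diagram is that the seven region sizes sum to the size of the union of the three sets.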

The overlap in the top GO terms relating to biological processes, molecular function and associated KEGG pathways between the three compounds was also studied (Figure 36C). With regard to biological processes, cell cycle terms and organelle fission were significantly enriched within the top 10 terms for all three compounds. In terms of molecular function, there was again a high degree of overlap, with RNA binding and nucleotide binding conserved across all three compounds. Only a single KEGG pathway was conserved across the three compounds (endocytosis). The number of proteins conserved within specific GO terms was also high, as can be seen for the cell cycle, cell death and RNA binding terms in Figure 36C.
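The idea behind term enrichment can be illustrated with a plain hypergeometric tail probability: given a background of N proteins of which K carry an annotation, how likely is it that a target list of n proteins contains k or more annotated members by chance? The sketch below is a simplified stand-in and not DAVID's own calculation (DAVID reports a more conservative modified Fisher's exact p-value, the EASE score). The background size and term size in the usage example are hypothetical; only the list size of 81 and the 10 conserved cell cycle targets are taken from this section.

```python
from math import comb, log10

def hypergeom_enrichment_p(N, K, n, k):
    """P(X >= k) for X ~ Hypergeometric(N, K, n).

    N: background size, K: background proteins with the annotation,
    n: size of the target list, k: annotated proteins in the target list.
    """
    upper = min(K, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / comb(N, n)

# Hypothetical background (20000 proteins) and term size (600 proteins)
p = hypergeom_enrichment_p(N=20000, K=600, n=81, k=10)
print(-log10(p))  # the quantity plotted on a -log10(p-value) axis
```

Exact integer arithmetic via `math.comb` keeps the sketch dependency-free; for large-scale use a statistics library would be the more practical choice.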

Interestingly, cross-referencing the conserved 81 targets against the targets of IA generated by Cravatt and co-workers by MS-based proteomics (discussed in Chapter 1.2.7.2) revealed only 30 overlapping targets within the 794 targets known to contain cysteines reactive towards IA (Figure 36B).56 The target sets of curcumin, PC, piperlongumine and sulforaphane were also individually cross-referenced against the IA target set and showed low target overlap (typically 30-35 %) (Figure 36B). Therefore, although conserved across the three dietary electrophiles, many of these targets are not reactive with the cysteine alkylating agent IA that is typically associated with toxicity.51, 80, 81, 83 This was also reflected in the DAVID analysis of the IA target set whereby, although some familiar GO terms were shared between the IA targets and those of the dietary electrophiles (e.g. ‘cell cycle’, ‘mitotic cell cycle’ and ‘translation’), many GO terms enriched for IA were not shared by any of the dietary electrophiles (e.g. ‘cell redox homeostasis’ and ‘glucose metabolic process’) (Appendix Figure 13).
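The overlap figures quoted above follow from simple set arithmetic. As a minimal sketch, the helper below computes the percentage of a target set that also appears in a reference set, and the two counts taken from the text (81 conserved targets, 30 of which are shared with the IA set) are used to derive the number of conserved targets not found in the IA set. The function name `percent_overlap` is illustrative, and the assumption that percentage overlap is expressed relative to the size of the compound's own target set is mine.

```python
def percent_overlap(targets, reference):
    """Percentage of `targets` that also appear in `reference`."""
    targets, reference = set(targets), set(reference)
    if not targets:
        raise ValueError("target set is empty")
    return 100.0 * len(targets & reference) / len(targets)

# Counts quoted in the text for the conserved target set
conserved_total, shared_with_ia = 81, 30
not_in_ia = conserved_total - shared_with_ia            # 51 targets
pct_shared = 100.0 * shared_with_ia / conserved_total   # roughly 37 %
print(not_in_ia, round(pct_shared))
```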

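[Figure 36, panels A-C (images not reproduced).

(A) Venn diagram, MDA-MB-231 cell line: Curcumin (196 targets), Piperlongumine (446 targets), Sulforaphane* (426 targets). Region counts as printed: curcumin only, 45; curcumin and piperlongumine, 48; curcumin and sulforaphane, 22; all three (core target proteome), 81; piperlongumine only, 149; piperlongumine and sulforaphane, 166; sulforaphane only, 157.

(B) Venn diagram of the 81 conserved targets against the IA targets: conserved only, 51; shared, 30; IA only, 764. Structure of the IA-Alk probe shown.

Compound          Total # of targets    % target overlap with IA
Curcumin          196                   31
PC                472                   31
Piperlongumine    446                   33
Sulforaphane      426                   35

(C) Venn diagrams of targets within enriched GO terms.

Region                               CELL CYCLE (GO_TERM_BP)    CELL DEATH (GO_TERM_BP)    RNA BINDING (GO_TERM_MF)
Curcumin (total)                     21                         14                         25
Piperlongumine (total)               57                         36                         58
Sulforaphane* (total)                50                         37                         65
Curcumin only                        4                          1                          3
Curcumin and piperlongumine only     6                          5                          6
Curcumin and sulforaphane only       1                          1                          2
All three                            10                         7                          14
Piperlongumine only                  18                         14                         13
Piperlongumine and sulforaphane only 23                         10                         25
Sulforaphane only                    16                         19                         24

Targets conserved across all three compounds:
Cell cycle: ANLN, DLGAP5, EGFR, ERCC6L, MKI67, NCAPD2, NCAPH, PML, SF1, TPX2
Cell death: ATXN2, FXR1, PDCD6IP, PML, PNPLA6, RTN3, STAT1
RNA binding: ACO1, ADAR, ATXN2, DDX24, DHX30, FAM120A, FXR1, HNRNPF, HNRNPH1, KHSRP, PCBP1, SAFB2, SF1, XPO5]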

Figure 36. Protein target overlap in the MDA-MB-231 cell line (from Chapter 5.2 and 5.3). (A) Venn diagram showing the overlap of protein targets between curcumin (196), piperlongumine (446) and sulforaphane (426). (B) Venn diagram of the overlap of the 81 conserved targets across curcumin, piperlongumine and sulforaphane with the IA target set (794 proteins) obtained from Weerapana et al.82 constituting reactive cysteines profiled across three cancer cell lines (MDA-MB-231, MCF7 and Jurkat). The structure of IA-Alk probe is shown. The individual % overlap of the IA target set with curcumin, PC, piperlongumine and sulforaphane is also shown. (C) Venn diagrams showing the overlap of protein targets contained within highly enriched GO terms for curcumin, sulforaphane and piperlongumine. * It should be noted that sulforaphane target profiling was carried out under different cell lysis conditions (cytosolic/nuclear fractionation) relative to curcumin and piperlongumine (whole cell lysis).

IA is a cytotoxic agent to both cancer and normal cells that readily forms covalent adducts with biological nucleophiles. Curcumin, sulforaphane and piperlongumine on the other hand display selective cytotoxicity against cancer cells, with minimal toxicity in normal cells. Therefore, the 51 conserved targets of the three compounds that are not reported targets of IA may provide a pool of mediators through which such dietary electrophiles exert their effects without causing toxicity driven by general protein alkylation. This helps to address a major challenge in dissecting the target sets of electrophiles, namely shedding light on toxicity mechanisms that to date have remained difficult to predict or rationalise. Targets and pathways that are identified for curcumin, piperlongumine and sulforaphane but not for IA may help to discriminate between toxic and non-toxic mechanisms. Further to what was outlined above, the 51 conserved targets not identified as being reactive towards IA may therefore narrow the search even further towards the core underlying targets and potential mechanisms of action of such dietary electrophiles.

To further the analysis between the targets of curcumin, piperlongumine and sulforaphane, the calculated binding potencies were also compared. Pair-wise scatter plots of the quantitative score for the ABP in competition with 100 μM parent compound are shown in Figure 37A-C, corresponding to curcumin versus piperlongumine (Figure 37A), curcumin versus sulforaphane (Figure 37B) and piperlongumine versus sulforaphane (Figure 37C). It was apparent that for conserved targets, piperlongumine and sulforaphane showed greater binding potencies relative to curcumin (Figure 37A+B). Curcumin did however show more potent binding against a handful of targets such as RTN4 and ATP2A2. The comparison of binding between sulforaphane and piperlongumine showed a wide range of targets favoured by each electrophile (Figure 37C). Piperlongumine showed superior binding towards targets including ADRM1, GSTO1 and HMOX2, whereas sulforaphane showed superior binding towards FXR1, KEAP1 and VDAC3.
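A pair-wise comparison of this kind requires the two target lists to be aligned on protein identity, with a score of 0 assigned where a target was identified for only one of the two compounds (as stated in the caption of Figure 37). A minimal Python sketch is given below. The function names, the `margin` used to call a target 'favoured', and the toy scores are all hypothetical and are not values from this study.

```python
def pairwise_scores(scores_x, scores_y, missing=0.0):
    """Align two {protein: log2 score} dicts; absent targets get `missing`."""
    proteins = sorted(set(scores_x) | set(scores_y))
    return {p: (scores_x.get(p, missing), scores_y.get(p, missing))
            for p in proteins}

def favoured_by(pair, margin=1.0):
    """Label which compound shows the higher score by at least `margin`."""
    x, y = pair
    if x - y >= margin:
        return "x"
    if y - x >= margin:
        return "y"
    return "neither"

# Hypothetical log2 quantitative scores for two compounds
compound_x = {"T1": 3.0, "T2": 1.0}
compound_y = {"T2": 2.5, "T3": 1.5}

pairs = pairwise_scores(compound_x, compound_y)
labels = {p: favoured_by(v) for p, v in pairs.items()}
print(labels)  # {'T1': 'x', 'T2': 'y', 'T3': 'y'}
```

Filling missing targets with 0 places them on the corresponding axis of the scatter plot, which keeps compound-specific targets visible rather than silently dropping them.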

[Figure 37, panels A-C (scatter plots not reproduced). All axes run from -1 to 4.

(A) x-axis: log2(quantitative score of ABP v 100 µM curcumin); y-axis: log2(quantitative score of ABP v 100 µM piperlongumine). Regions marked PIPERLONGUMINE-FAVOURED TARGETS and CURCUMIN-FAVOURED TARGETS. Labelled proteins: CSTB, GSTO1, SF1, PEF1, DCUN1D5, HSPBP1, AKAP8L, HMOX2, RTN3, PHGDH, BCAR3, HNRNPF, KEAP1, PGLS, PLIN3, ATXN2, CYB5B, RFTN1, NFKB2, RBM4, CISD2, ENDOD1, RTN4, AXL, ATP2A2, TAP1, POLR2B, TXNRD1, ALDH18A1, MYH9, REEP5.

(B) x-axis: log2(quantitative score of ABP v 100 µM sulforaphane); y-axis: log2(quantitative score of ABP v 100 µM curcumin). Regions marked CURCUMIN-FAVOURED TARGETS and SULFORAPHANE-FAVOURED TARGETS. Labelled proteins: RTN3, HMOX2, CDKAL1, TAP1, KEAP1, ALDH9A1, ATP2A2, TFRC, RTN4, CKAP4, HNRNPM, DFNA5, DLGAP5, ALDH2, POLR2B, MKI67, GSDMD, VDAC3, PHF3, CARM1.

(C) x-axis: log2(quantitative score of ABP v 100 µM sulforaphane); y-axis: log2(quantitative score of ABP v 100 µM piperlongumine). Regions marked PIPERLONGUMINE-FAVOURED TARGETS and SULFORAPHANE-FAVOURED TARGETS. Labelled proteins: PSMB8, ADRM1, CSTB, GSTO1, SF1, HMOX2, RTN3, BCAR3, FAM203B, KEAP1, DFNA5, CTTN, HNRNPH1, VDAC3, CARM1, FXR1, HEATR3, GSDMD, FXR2, PML, PCMT1, PTGES3, ALDH2.]

Figure 37. Scatter plots of the quantitative score comparison of piperlongumine versus curcumin (A), sulforaphane versus curcumin (B) and sulforaphane versus piperlongumine (C) for protein targets identified in the MDA-MB-231 cell line. Where a target is identified for only one of the two compounds, a quantitative score of 0 is assigned to the missing target.

Finally, in the context of electrophilic natural products, target overlap can be examined beyond curcumin, piperlongumine and sulforaphane. As shown in Table 1, an ever-growing number of electrophilic natural products are having their protein targets profiled. A large number of these compounds contain at least one α,β-unsaturated carbonyl, which is the same electrophilic motif found within curcumin and piperlongumine. Many of the targets unravelled for other electrophilic natural products appear within the curcumin, piperlongumine and/or sulforaphane target sets. For example, NFKB1 (NF-κB p50 subunit) was identified as the major target of the bioactive diterpenoid andrographolide.114, 115 In these studies, NFKB2 (NF-κB p52 subunit) is identified as a target of curcumin, piperlongumine and sulforaphane. Fatty acid synthase has been shown to be a target of both (±)-C75 and Orlistat and is contained within the target sets of sulforaphane and piperlongumine.117, 135 Cinnamaldehydes and illudin S have been shown to bind to TXNRD1,478, 479 withaferin A binds to NF-κB subunits,480 leptomycin B binds to XPO1,481 and epoxomicin, lactacystin and syringolin A all bind to various elements of the proteasome.88, 111, 482 All of these targets are conserved across curcumin, piperlongumine and sulforaphane and therefore appear to be common targets of electrophilic natural products.

In an analogous study within the group, the covalent target profile of zerumbone, a cyclic sesquiterpene with anticancer activity isolated from the tropical plant Zingiber zerumbet Smith, was determined, identifying 151 high confidence targets in HeLa cells.120 Of the identified targets, 24 overlap with the 81 targets conserved across curcumin, piperlongumine and sulforaphane. It is important to note that the profiling of these other electrophilic natural products has been carried out in a variety of cell lines, at a range of concentrations, using different target profiling techniques. Therefore, one is limited to simple observations of target overlap between compounds. However, as more electrophilic natural products are comprehensively profiled, particularly in a quantitative manner, SAR can be drawn up between compounds to further elucidate potentially key targets for electrophilic compounds. An interesting conclusion to this section is the observation that a number of electrophilic natural products have been shown to target cytoskeletal proteins (tubulin, actin and vimentin) and HSP family members (HSP70, HSP90).94, 99, 118, 119, 483 These highly abundant cellular proteins are well-validated and exploited cancer therapeutic targets.484, 485 The identification of such proteins as targets of the ABPs of curcumin, piperlongumine and sulforaphane in this study confirms that they do indeed covalently bind electrophiles; however, they do not appear to be particularly potent binders, on account of the limited competition observed with the parent compound. This may be an artefact of the competition-based assay, but the high abundance of these proteins and their ease of detection by WB- and MS-based analyses may also have favoured their detection in previous studies, leading in some cases to their identification as the single major target of the electrophilic species under study. Certainly for curcumin, piperlongumine and sulforaphane, such targets appear to be bound significantly less potently than other targets and as such may not significantly contribute to their anticancer modes of action.


5.6 Identification of the site of modification of ABPs

Lacking from the analysis up to this point has been the identification of the individual amino acid modification site on each protein target for the three compounds under study. Determining the modification site is important not only to provide further target validation through direct evidence of modification by MS/MS, but also in assessing the functional impact of the compounds on each target. For example, if the compound binds to the catalytic residue of an enzyme, the target engagement is likely to have a critical functional effect on the protein. However, if the compound engages the target at a functionally redundant site, the effect may be limited.

In the standard chemical proteomic workflow discussed up to this point, protein targets are affinity enriched onto Neutravidin sepharose resin and digested on-resin with trypsin to generate peptides for subsequent LC-MS/MS analysis. In such a workflow, following trypsin digest, the peptide that carries the covalent ABP modification remains immobilised on the resin on account of the strong affinity between biotin and Neutravidin. Therefore, the so-called ‘modified’ peptide is never released from the resin for LC-MS/MS analysis and identification. In order to obtain this ‘modified’ peptide for LC-MS/MS analysis, a means of releasing it from the resin is required. To this end, new ‘capture’ reagents were synthesised within the group to allow identification of the individual modification sites on protein targets. Two such reagents were used, AzRB (synthesised by Elisabeth Storck-Saha (Imperial College London)) and AzRB2 (synthesised by Malgorzata Broncel (Imperial College London)) (Appendix Figure 14). These two reagents work in an analogous manner, in that they both contain a trypsin cleavage site within their structures. Consequently, upon on-resin trypsin digest, the ‘modified’ peptide is released from the resin alongside the non-modified peptides, carrying a defined and distinct mass tag on its modified amino acid. The ‘modified’ peptide is then analysed by LC-MS/MS and can be detected in subsequent MaxQuant analysis by specifically searching for the mass tag modification (Figure 38). Both reagents have very recently been reported to successfully identify the amino acid modification sites of myristoylated proteins using an analogous workflow.486, 487 Other reagents using similar strategies have also been reported by other groups.488, 489

[Figure 38 scheme: sulforaphane ABP 2, covalently bound to its target protein and clicked to AzRB (biotin, trypsin cleavage site), immobilised on Neutravidin resin. Steps: 1. Affinity enrichment of ABP-bound proteins; 2. On-bead trypsin or Lys-C/trypsin digest; 3. Identification by LC-MS/MS. The ABP-bound peptide is released from the resin carrying a fixed mass modification.]

Figure 38. The cleavage mode of the AzRB reagent that allows the ABP-modified peptide to be released from the affinity resin such that it can then be identified by LC-MS/MS. Shown here is a representative example for the sulforaphane ABP 2 once it has been clicked to AzRB. The structures of the AzRB and analogous AzRB2 reagent are shown in Appendix Figure 14.

The SILAC-based quantitative protein target profiling of curcumin, piperlongumine and sulforaphane discussed in Chapters 5.2 and 5.3 was carried out using AzRB or AzRB2 as the capture reagent, ligated by CuAAC. As such, in theory the individual amino acid modification site for all targets identified within these experiments could be determined. To maximise the number of site identifications for the three compounds, an alternative experimental setup was also carried out. MDA-MB-231 cells were treated with representative ABPs (20 μM curcumin ABP 1, 2 μM piperlongumine ABP, 5 μM sulforaphane ABP 2) or DMSO vehicle and lysed; the lysates were clicked with AzRB and protein targets affinity enriched as described previously. In order to separate the ‘modified’ peptides from the rest of the peptide milieu, a two-stage or orthogonal on-resin digest was performed. Firstly, targets on-resin were digested with Lys-C to generate peptides of the captured proteins. Lys-C specifically cleaves at the C-terminal side of lysine residues and therefore cleaves immobilised proteins into peptides, but crucially does not cleave the trypsin cleavage site (which contains an arginine residue) in the AzRB reagent. Following stringent washing of the resin to remove the generated peptides, a second on-resin digest with trypsin was carried out to cleave the ‘modified’ peptides from the resin and keep them separate from the other peptide pool. The two peptide pools were then analysed separately by LC-MS/MS. The hope was that reducing the complexity of the ‘modified’ peptide pool would lead to more site identifications.
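The logic of the two-stage digest can be illustrated with a minimal in silico sketch. This is not the workflow used in the study: the protein sequence and the position of the modified cysteine are hypothetical, the function names are invented for illustration, and the proteases are modelled with simple textbook rules (Lys-C cleaving after lysine; trypsin cleaving after lysine or arginine, but not before proline). The AzRB linker itself is not modelled; the peptide carrying the modified residue is simply treated as resin-bound until the trypsin step.

```python
import re

# Simplified cleavage rules (assumptions for illustration only).
LYS_C = r"(?<=K)"             # cleave C-terminal to lysine
TRYPSIN = r"(?<=[KR])(?!P)"   # cleave C-terminal to K or R, not before proline


def cleave(sequence, rule):
    """Return (start, peptide) pairs obtained by cutting at every rule site."""
    peptides, start = [], 0
    for piece in re.split(rule, sequence):
        if piece:
            peptides.append((start, piece))
        start += len(piece)
    return peptides


def two_stage_digest(sequence, modified_index):
    """Model Lys-C digest, washing, then trypsin digest of the retained peptide.

    modified_index is the 0-based position of the probe-modified residue.
    Returns (pool released by Lys-C, pool released by trypsin, modified peptide).
    """
    released_first, retained = [], None
    for start, peptide in cleave(sequence, LYS_C):
        if start <= modified_index < start + len(peptide):
            retained = (start, peptide)   # stays on the resin via the probe
        else:
            released_first.append(peptide)
    if retained is None:
        raise ValueError("modified_index lies outside the sequence")
    offset, bound = retained
    stage2 = cleave(bound, TRYPSIN)
    released_second = [peptide for _, peptide in stage2]
    modified = next(
        peptide for start, peptide in stage2
        if offset + start <= modified_index < offset + start + len(peptide)
    )
    return released_first, released_second, modified


# Hypothetical protein with the probe on the cysteine at index 8.
first, second, modified = two_stage_digest("MACDEKGHCRLLKPSTRAVK", 8)
print("Released by Lys-C:", first)
print("Released by trypsin:", second)
print("Peptide carrying the modification:", modified)
```

In this toy case the second pool contains only fragments of the peptide that was still resin-bound after the Lys-C step, which is the reduction in complexity the orthogonal digest aims for.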

For sulforaphane targets, combining target identifications made across all experiments outlined above, the modification sites of 46 proteins have been mapped out of a total of 534 protein targets across both MCF7 and MDA-MB-231 cell lines (Table 3). MS2 spectra of a selected number of assignments are shown in Appendix Figure 15 and Appendix Figure 16. This constituted identifying a modification site for around 9 % of the total targets. Even pooling ‘modified’ peptides and analysing them separately by LC-MS/MS did not offer the improved site identifications sought. The lack of an assigned sulforaphane modification site for over 90 % of targets can be explained by numerous contributing factors. Firstly, only a single modified peptide may exist for each protein (relative to many other peptides for that protein). If the MS properties of the peptide are unfavourable, or if covalent adduct formation disrupts trypsin cleavage, then the individual modified peptide may go undetected. In the case of the former, the charged arginine residue that forms part of the mass tag is crucial for favourable MS ionisation, but the tag may require further optimisation to favour ‘modified’ peptide ionisation and subsequent detection. Secondly, further optimisation of the LC-MS/MS setup, such as longer LC gradients, may well be required to improve coverage. Solutions beyond this remain unclear; however, the lack of robust site identification using on-resin digest by others, as well as within the group, highlights the challenges faced in identifying ‘modified’ peptides by MS. For example, Liebler and co-workers applied a similar chemical proteomic approach to identify > 1500 protein targets of 4-HNE, filtering down to 417 high confidence targets after data processing. However, only 18 discernible HNE-adducted peptides were detected.51 Others have had greater success, such as Cravatt and co-workers, whose profiling of IA identifies and quantifies protein targets by their modified peptides.82 The implementation of reagents into the current workflow to identify ‘modified’ peptides is therefore not a trivial task, but it is hoped that further optimisation will lead to greater numbers of site identifications moving forward.

Despite this, the significance of the 46 site identifications made inside the cell, under native biological conditions and at endogenous expression levels, should not be underplayed. Many other studies have resorted to isolated recombinant protein for site of modification analysis of electrophilic compounds such as these, on a protein-by-protein basis.490, 491 Here the modification sites on high confidence sulforaphane targets have been determined for 46 proteins simultaneously. Further optimisation of the current workflow will undoubtedly increase the number of modification sites on sulforaphane targets that can be mapped. This is an on-going endeavour within the group. No target is of greater interest in this regard than KEAP1. Although many sulforaphane modification sites on KEAP1 have been identified, much of this work has been done on recombinant protein.292, 492 The prospect of identifying in-cell modifications by sulforaphane therefore offers huge incentive to develop this platform further.

The majority of the 46 modification sites for sulforaphane are novel identifications that contribute greatly to understanding the functional impact of sulforaphane on these targets. As shown in Table 3, sulforaphane binds to the catalytic cysteine residue of three cathepsin family members (CTSB, CTSC and CTSZ) and therefore likely disrupts cathepsin biological function inside cells. Cathepsins play numerous roles in cancer progression,493 with cathepsin B (CTSB) considered to be a particularly interesting cancer drug target.494 Pyroglutamyl-peptidase 1 (PGPEP1) also binds sulforaphane at its catalytic cysteine. Other, non-catalytic cysteines of functional importance that are targeted by sulforaphane include those within TXN, VDAC2 and VDAC3. VDAC proteins, being located in the mitochondria, are susceptible to electrophilic and oxidative modifications on cysteine residues that act as sensors for such insults.495 Cys72 of TXN plays an important role in regulating its catalytic activity and is the modification site for the covalent TXN inhibitor, PX-12.95 Given the diversity of functions carried out by thioredoxin in the cell, the potential disruption of the regulation of TXN activity by sulforaphane warrants further investigation.496

Table 3. Overview of the amino acid modification sites for sulforaphane identified in MCF7 (experiment 2) and MDA-MB-231 (experiment 1 and 3) cells. Experiment 1 is the label-free chemical proteomic experiment whereby a two-stage or orthogonal digest on resin was performed to generate ‘modified’ peptides which were identified by LC-MS/MS separately. Experiments 2 and 3 are from the SILAC-based quantitative chemical proteomic experiment in MCF7 and MDA-MB-231 cells respectively (Chapter 5.2). All reported modification sites were conserved across biological duplicates, contained a localisation score > 0.8 and a score difference > 15. For experiment 1, site modifications were only identified in the ABP samples and were not present in the DMSO vehicle control. Targets shown are only those contained within the 534 total protein targets of sulforaphane identified in the MCF7 or MDA-MB-231 cell lines (Chapter 5.2).

No. | Gene Name | Protein Name | Modified cysteine residue | Experiment No.
1 | AARS | Alanine--tRNA ligase, cytoplasmic | 773 | 2
2 | AP4E1 | AP-4 complex subunit epsilon-1 | 1119 | 2
3 | APOBEC3C | Probable DNA dC->dU-editing enzyme APOBEC-3C | 130 | 1,3
4 | CAST | Calpastatin | 491 | 1
5 | CPNE1 | Copine-1 | 58 | 3
6 | CPT1A | Carnitine O-palmitoyltransferase 1, liver isoform | 96 | 1,2
7 | CTSB | Cathepsin B | 108 | 1,2,3
8 | CTSC | Dipeptidyl peptidase 1 | 258 | 3
9 | CTSZ | Cathepsin Z | 92 | 1,3
10 | CTTN | Src substrate cortactin | 246 | 2
11 | DKFZp686J1372 | Epididymis luminal protein 189 | 233, 170 | 1,3
12 | DNPEP | Aspartyl aminopeptidase | 327 | 2
13 | FLNA | Filamin-A | 2543 | 1,2,3
14 | FLNB | Filamin-B | 2523 | 3
15 | GBE1 | 1,4-alpha-glucan-branching enzyme | 81 | 3
16 | GSDMD | Gasdermin-D | 316 | 1,3
17 | HAT1 | Histone acetyltransferase 1 | 101 | 2,3
18 | HDGF | Hepatoma-derived growth factor | 108 | 3
19 | HNRNPF | Heterogeneous nuclear ribonucleoprotein F | 267 | 1
20 | HNRNPH1 | Heterogeneous nuclear ribonucleoprotein H | 267 | 1,3
21 | HNRNPK | Heterogeneous nuclear ribonucleoprotein K | 132 | 1
22 | ISOC2 | Isochorismatase domain-containing protein 2, mitochondrial | 114 | 2,3
23 | LGALS1 | Galectin-1 | 3 | 3
24 | LMNA | Prelamin-A/C | 591, 588 | 1
25 | MYL6 | Myosin light polypeptide 6 | 95 | 1,2,3
26 | NASP | Nuclear autoantigenic sperm protein | 708 | 1
27 | NONO | Non-POU domain-containing octamer-binding protein | 145 | 1
28 | NUDCD1 | NudC domain-containing protein 1 | 376 | 3
29 | PCBP1 | Poly(rC)-binding protein 1 | 54 | 1
30 | PGLS | 6-phosphogluconolactonase | 32 | 3
31 | PGPEP1 | Pyroglutamyl-peptidase 1 | 149 | 2,3
32 | PLEC | Plectin | 4357 | 1
33 | PREP | Prolyl endopeptidase | 255 | 2
34 | PSME2 | Proteasome activator complex subunit 2 | 5 | 1,2
35 | PTGES3 | Prostaglandin E synthase 3 | 58 | 1,2,3
36 | REEP5 | Receptor expression-enhancing protein 5 | 18 | 1,2,3
37 | RTN3 | Reticulon-3 | 46 | 1,2,3
38 | SFN | 14-3-3 protein sigma | 38 | 2,3
39 | SYNCRIP | Heterogeneous nuclear ribonucleoprotein Q | 96 | 1
40 | TAGLN2 | Transgelin-2 | 124 | 1
41 | TPM3 | Tropomyosin alpha-3 chain | 170 | 2
42 | TXN | Thioredoxin | 73 | 1,2,3
43 | USP7 | Ubiquitin carboxyl-terminal hydrolase 7 | 315 | 2
44 | VDAC2 | Voltage-dependent anion-selective channel protein 2 | 8, 76 | 1,2,3
45 | VDAC3 | Voltage-dependent anion-selective channel protein 3 | 8, 65 | 1,2,3

In comparison, site identification for curcumin and piperlongumine was far less successful. Only a very small handful of ‘modified’ peptides were identified within the overall protein target sets of the two compounds (data not shown). It is possible that the more complex structures of curcumin and piperlongumine make identifying their amino acid modification sites more challenging. Both compounds contain two electrophilic centres capable of covalent reaction. It is plausible that one electrophilic centre could be covalently bound to its amino acid site within a protein, while the other is bound elsewhere within the protein or to other biological nucleophiles (such as GSH), or is reduced or modified by a range of intracellular processes. When setting up MaxQuant to search for ‘modified’ peptides, a variety of masses corresponding to GSH adduct formation and reduction of the other electrophilic centre were applied, but these did not lead to any further modification sites being identified. Furthermore, nucleophilic amino acids other than cysteine may also be sites of modification, but searching for modifications on serine and lysine residues also returned no significant results. The instability of curcumin, which is well-reported in the literature,362, 364 could also account for the problems in identifying ‘modified’ peptides. Work is on-going to elucidate the sites of modification for curcumin and piperlongumine, with the results obtained for sulforaphane highlighting that site identification using such a strategy should be feasible.

Mapping individual sites of modification within protein targets on a global scale in ABPP using cleavable reagents is an emerging technique. There is still much further work to be carried out to optimise the workflow, but promising progress has been made in the identification of amino acid binding sites for sulforaphane.

5.7 Chemical proteomics with other small molecule electrophile ABPs

5.7.1 Initial application and comparison of small molecule electrophile ABPs

As highlighted earlier, there has been a lack of studies that have looked to address the reactivity of different electrophiles towards molecular targets under endogenous cellular conditions. Much of this work has been carried out using in vitro thiol-based assays (discussed in Chapter 4.1.2). As shown in Table 1, the protein targets of many electrophiles have been profiled using chemical proteomic approaches, especially for the pan-reactive electrophiles, IA and NEM.80-83 However, very few studies have looked to directly compare the target profiles of different electrophiles in the same investigation. One study by Cravatt and co-workers profiled a small panel of alkyne-functionalised electrophiles on mouse proteome lysates, encompassing five different electrophilic motifs: a phenylsulfonate ester, a linear epoxide, a spiro-epoxide, an α-chloroacetamide and an α,β-unsaturated ketone.84 The epoxide probes were shown to be unreactive towards the proteome at the concentrations applied, whereas the phenylsulfonate ester, α-chloroacetamide and α,β-unsaturated ketone led to the identification of 37, 74 and 197 targets respectively (Table 1 entries 7-9). To build on this initial work and to more broadly explore the protein target profiles for curcumin, piperlongumine and sulforaphane relative to other electrophilic species, in-cell protein target identification was carried out for a small panel of electrophilic ABPs in the studies reported here.

A further 8 ABPs were obtained to study alongside the curcumin, piperlongumine and sulforaphane ABPs already in hand (Figure 39A). These included 5 ABPs containing a single α,β-unsaturated carbonyl (acetylenic chalcone (AC) ABP, acetylenic enone (AE) ABP, benzaldehyde (BA) ABP, NEM ABP and acrylamide (ACR) ABP), 2 irreversible SN2 reagents (chloroacetamide (CA) ABP and chloromethyl ketone (CMK) ABP) and a model isothiocyanate reagent (isothiocyanate (ITC) ABP). AE ABP,497 AC ABP,498 and BA ABP were synthesised in good yield and high purity. NEM ABP,499 ACR ABP,500 and CA ABP,501, 502 were all synthesised by Saphia Matthews (Imperial College London). CMK ABP was kindly provided by Tom Charlton (Imperial College London),503 and ITC ABP (also known as propargyl isothiocyanate) was purchased commercially.

The α,β-unsaturated carbonyl (Michael acceptor) is the most widely encountered electrophilic motif in electrophilic natural products. The nature of the groups attached to this motif, including substituents in both the α- and β-positions, can have profound effects on tuning the reactivity and functionality of the electrophilic centre.504 Given that curcumin and piperlongumine both contain two α,β-unsaturated carbonyls, understanding their target profiles in a quantitative and comparative manner relative to a series of other α,β-unsaturated carbonyl-containing electrophiles will provide further insight into their target preference. It will also help build SAR around the α,β-unsaturated carbonyl motif, which has been poorly studied in relation to reactivity specifically towards proteins. Moreover, comparative analysis of different electrophilic motifs (Michael acceptors versus SN2 inhibitors versus isothiocyanates versus sulfoxythiocarbamates) will also aid understanding of target reactivity and preference for different electrophiles. It was envisaged that applying all ABPs in-cell would allow for biologically relevant target engagement under endogenous conditions, which has not been previously addressed (similar previous studies have only been carried out on lysates).

The 8 ABPs were first tested for their protein labelling in live, intact MDA-MB-231 cells for 30 min. The labelling by each ABP was visualised by in-gel fluorescence following CuAAC to the AzT reagent and subsequent SDS-PAGE analysis (Figure 39B). The labelling patterns were highly diverse across the panel of ABPs, both in terms of band identity and intensity. It was evident that the NEM ABP (3 μM) labelled most potently out of the 8 ABPs tested, with the AC ABP (20 μM) also showing strong labelling. The two SN2 reagents (CA ABP (30 μM) and CMK ABP (10 μM)) showed similar labelling patterns, with stronger labelling observed for the CMK ABP. The ITC ABP (10 μM) appeared to show selective labelling of a protein of around 12 kDa (speculated to be MIF), which had also been previously observed as a prominent target band for the isothiocyanate-containing sulforaphane ABP 3 (Figure 14 lane 7). However, as for sulforaphane ABP 3 (Appendix Figure 1), the instability of ITC ABP-protein adducts to boiling in β-mercaptoethanol for SDS-PAGE analysis may have caused a significant loss of ABP-protein adducts, given the results that follow upon protein identification by MS (Chapter 5.7.2).

In contrast, two ABPs showed minimal or no protein labelling: BA ABP (25 μM) and ACR ABP (50 μM). Acrylamide is a highly reactive electrophile and would be expected to react widely with proteins; it is therefore unclear why no protein adducts were observed in this study. Unlike for the ITC ABP, this was confirmed not to be a result of instability of ABP-protein adducts to boiling in β-mercaptoethanol for SDS-PAGE (data not shown). Although not further explored, the possible reactivity of propargyl amides towards cysteine proteases could sequester the alkyne tag needed for click chemistry and may explain the failure of protein adduct detection in these studies.505-507 The lack of protein reactivity of the BA ABP is surprising. Given that a number of the tested ABPs are α,β-unsaturated carbonyls, it is relevant that the in vitro reactivity of the CH2=CH-X motif has been determined as X = COAr > CHO > COCH3 > CO2CH3 > CONHR > CONR2.508 This would suggest that the AC ABP and BA ABP should contain the most reactive electrophilic centres; however, whereas the AC ABP showed noticeable protein labelling, BA ABP showed no detectable protein labelling in this study. This would seem to suggest there is not a direct correlation between electrophile reactivity and in-cell protein labelling per se. However, others have shown protein target engagement of benzaldehyde derivatives (FtsZ, GLUT1, TLR4, TRPA1 and TXNRD1), which makes the lack of detectable adducts with the BA ABP unclear at this stage.478, 501, 509-511

It was speculated that nucleophilic components of the cell medium (e.g. cysteines in FBS and/or growth factors, as well as small molecule thiols) may react with and sequester the more reactive ABPs, preventing their cellular uptake and subsequent protein binding. However, dosing the ABPs in PBS to eliminate such a possibility resulted in very little difference in the labelling patterns or intensities of any of the ABPs tested (data not shown). Cellular uptake and reactivity towards GSH are clearly key determinants in protein labelling by these different electrophilic species. Cell viability assays on the panel of ABPs showed no correlation between cytotoxicity towards MDA-MB-231 cells and the overall intensity of protein labelling (Figure 39; compare EC50 values from the cell viability assay in (A) with total in-gel fluorescence in (B)), indicating that the engagement of specific protein targets (in addition to non-protein targets) may be required for anticancer effects, rather than general protein adduct formation.

[Figure 39 panels (image not reproduced).
(A) Structures and EC50 values. Curcumin ABP 1, EC50 = 22.6 ± 4.9 μM; piperlongumine ABP, EC50 = 7.0 ± 1.0 μM; sulforaphane ABP 2, EC50 > 50 μM. α,β-Unsaturated carbonyls (Michael acceptors): acetylenic chalcone (AC) ABP, EC50 = 31.1 ± 12.2 μM; benzaldehyde (BA) ABP, EC50 = 39.6 ± 8.7 μM; acetylenic enone (AE) ABP, EC50 > 50 μM; acrylamide (ACR) ABP, EC50 > 50 μM; NEM ABP, EC50 > 50 μM. Miscellaneous: chloroacetamide (CA) ABP, EC50 > 50 μM; chloromethylketone (CMK) ABP, EC50 = 12.5 ± 3.0 μM; isothiocyanate (ITC) ABP, EC50 = 53.7 ± 17.0 μM.
(B) In-gel fluorescence, lanes 1-9, molecular weight markers 250 to 10 kDa, with Coomassie staining.
(C) Triplex SILAC setup, listed as ‘light’ (R0K0) / ‘medium’ (R6K4) / ‘heavy’ (R10K8):
1. 20 μM CMK ABP / 3 μM NEM ABP * / 50 μM CA ABP
2. 50 μM AE ABP / 3 μM NEM ABP * / 20 μM AC ABP
3. 50 μM BA ABP / 3 μM NEM ABP * / 20 μM ITC ABP
4. 15 μM SULF ABP 2 / 3 μM NEM ABP * / 40 μM CURC ABP 1
5. 4 μM PIP ABP / 3 μM NEM ABP * / DMSO]

Figure 39. (A) Chemical structures of the 8 new electrophilic ABPs in addition to the representative curcumin, piperlongumine and sulforaphane ABPs. EC50 values corresponding to the cytotoxicity of the ABPs towards MDA-MB-231 cells (as determined by the MTS assay) at 48 h are shown. (B) SDS-PAGE and in-gel fluorescence analysis of protein labelling in MDA-MB-231 cells treated with the 8 small molecule electrophile ABPs for 30 min. Further in-gel fluorescence analysis of MDA-MB-231 cells treated with the ABPs is shown in Appendix Figure 17 and Appendix Figure 18. (C) The triplex SILAC experimental setup for protein target identification and quantification (Chapter 5.7.2). Each SILAC triple was linked with the same ‘medium’ (R6K4) sample (3 μM NEM ABP) to allow cross comparison between different SILAC triples. * indicates that NEM ABP samples were all from the same batch. AC = acetylenic chalcone, AE = acetylenic enone, BA = benzaldehyde, CA = chloroacetamide, CMK = chloromethylketone, ITC = isothiocyanate, NEM = N-ethylmaleimide, CURC = curcumin, PIP = piperlongumine, SULF = sulforaphane.

5.7.2 Quantitative proteomic target identification of small molecule electrophile ABPs with triplex SILAC

In order to identify the protein targets of each electrophilic ABP and efficiently compare the target preference and reactivity of each ABP under study, a triplex SILAC experimental design was incorporated into the chemical proteomics workflow (Figure 39C). The approach is highly analogous to the duplex SILAC approach reported earlier (Chapter 4), differing only in the use of three isotopically different forms of arginine and lysine, allowing the comparison of three cell populations in a single MS run.204, 512 For a triplex SILAC experiment, cells are grown in either R0K0-containing media (‘light’), R6K4-containing media (‘medium’) or R10K8-containing media (‘heavy’). The R6K4 media contains D4-lysine and 13C6-arginine, whereas the R0K0 and R10K8 media are the same as described previously (Chapter 4.3). The differently treated cell populations are then combined in a 1:1:1 ratio at the protein level after cell lysis. The SILAC triple is then processed with the chemical proteomic workflow to generate peptide mixtures for identification and quantification by LC-MS/MS analysis. This gives rise to three distinct peaks in the MS1 for the same peptide: x Da, x + 4 Da and x + 8 Da for lysine-containing peptides and x Da, x + 6 Da and x + 10 Da for arginine-containing peptides. The intensities of each of the ‘heavy’, ‘medium’ and ‘light’ counterparts are determined and used to calculate ‘heavy’:‘light’ (H/L), ‘heavy’:‘medium’ (H/M) and ‘light’:‘medium’ (L/M) ratios by the MaxQuant software.
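As a rough illustration of the triplet arithmetic, the sketch below computes the expected label mass shifts for a peptide and the three ratios from a set of MS1 intensities. It assumes the nominal per-residue shifts implied by the label names (K4 and K8 for lysine, R6 and R10 for arginine); the function names, peptide sequences and intensities are hypothetical, and this is not how MaxQuant performs its quantification.

```python
# Nominal mass shift in Da per labelled residue, taken from the label names
# (assumption: R0K0 = unlabelled, R6K4 = +6/+4, R10K8 = +10/+8).
LABEL_SHIFTS = {
    "light": {"K": 0, "R": 0},     # R0K0
    "medium": {"K": 4, "R": 6},    # R6K4
    "heavy": {"K": 8, "R": 10},    # R10K8
}


def triplet_shifts(peptide):
    """Nominal mass shift of each SILAC state for one peptide sequence."""
    return {
        state: sum(shifts.get(residue, 0) for residue in peptide)
        for state, shifts in LABEL_SHIFTS.items()
    }


def silac_ratios(light, medium, heavy):
    """H/L, H/M and L/M ratios from the three MS1 intensities of a triplet."""
    return {"H/L": heavy / light, "H/M": heavy / medium, "L/M": light / medium}


# Hypothetical tryptic peptides ending in lysine and arginine respectively.
print(triplet_shifts("LLK"))
print(triplet_shifts("GHCR"))

# Hypothetical intensities for one peptide triplet.
print(silac_ratios(light=2.0e6, medium=4.0e6, heavy=1.0e6))
```

For a peptide observed at charge state z, the spacing between the peaks on the m/z axis would be the mass shift divided by z.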

A total of 5 SILAC triples were analysed, linked through a common experimental state (3 μM NEM ABP), and H/L, H/M and L/M ratios were determined for each. The NEM ABP was chosen as the common sample across all SILAC triples on account of its strong protein labelling (as visualised by in-gel fluorescence in Figure 39B) and its general reactivity, which was expected to encompass many of the targets of the other electrophilic species. It should be noted that the ABPs were not all applied at the same concentration. It is important to remember that the reactivity of each ABP is being assessed in a cellular environment as opposed to on isolated protein. Therefore, factors such as cellular uptake, stability, GSH reactivity and protein reaction kinetics will all contribute to the protein binding of each ABP. It was evident from the previous in-gel fluorescence that the NEM ABP was a more potent protein labeller than the AE ABP, for example (Figure 39B lane 4 v lane 1). It was considered more informative in these studies to assess the ABPs at concentrations where protein labelling could be detected, such that target engagement could be quantitatively compared between the different electrophilic ABPs. The concentrations of each ABP applied were thus based on the in-gel fluorescence observations in Figure 39B to ensure protein labelling was observed for all ABPs across the panel, rather than applying all ABPs at a fixed concentration.

Firstly, the total number of protein identifications across all 5 samples was high, with 734, 982, 1133, 962 and 733 protein identifications with assigned ratios for samples 1-5 respectively (the full list of targets is presented in Appendix Table 1). This suggested that there was no loss in coverage on account of the increased sample complexity in the mass spectrometer and reaffirmed triplex SILAC as a valid quantification strategy comparable to duplex or ‘spike-in’ SILAC. Important protein targets identified for curcumin, piperlongumine and sulforaphane such as EGFR, HMOX2, KEAP1 and STAT3 were also identified as targets across the full panel of ABPs studied here, suggesting their general reactivity towards electrophiles. However, other targets showed preference for only some of the ABPs. For example, NFKB2 was identified as a target for AC ABP, AE ABP and NEM ABP but not for CMK ABP, CA ABP, ITC ABP or BA ABP (Table 4 column G). This suggested that the α,β-unsaturated carbonyl may be a necessary structural feature for adduct formation with NFKB2. Consistent with the in-gel fluorescence observations, the NEM ABP showed stronger target binding than most of the other ABPs, as indicated by H/M and L/M ratios of less than 1 for many of the proteins identified (Table 4 entries 1-10). Given that the NEM ABP was applied at the lowest concentration of the 10 electrophilic ABPs (3 μM), these data show for the first time the high in-cell protein reactivity of NEM in comparison to other electrophiles. However, there were some exceptions, such as GSTO1, which showed superior binding to most of the electrophilic ABP panel over the NEM ABP (Table 4 column C). In order to study the in-cell reactivity of the different ABPs, the analysis focused on 9 protein targets of interest (ATP2A2, EGFR, GSTO1, HAT1, HMOX2, KEAP1, NFKB2, STAT3 and VDAC2) that were identified as high confidence and potentially biologically relevant targets for curcumin, piperlongumine and sulforaphane in earlier studies. The relevant SILAC ratios for these targets are shown in Table 4 (the full list of targets is in Appendix Table 1).

Next, the observed H/L ratios were examined for each of the 5 SILAC triples. As shown in Table 4 for the representative panel of targets (entries 11-15), greater protein labelling was observed for the CMK ABP (20 μM) than for the CA ABP (50 μM), suggesting the chloromethylketone electrophilic motif may be more reactive than chloroacetamide under native biological conditions. The AC ABP (20 μM) also showed greater protein binding relative to the AE ABP (50 μM). These two ABPs differ in the β-substituent on the α,β-unsaturated carbonyl, suggesting that the phenol group confers greater reactivity than the methyl group towards protein targets that are generally well-conserved across the two ABPs. The ITC ABP and curcumin ABP also showed superior protein binding relative to the BA ABP and sulforaphane ABP respectively. All these observations were in accordance with what was expected from the previous in-gel fluorescence analysis. However, these general conclusions did not apply to all targets, with a number showing superior binding to the AE ABP over the AC ABP, for example. Such exceptions were observed across all 4 samples comparing ABP labelling.

ABPs outside of their direct SILAC triples could also be compared to one another, as they were linked through the common ‘medium’ sample, 3 μM NEM ABP. This enabled the ranking of the ABPs for each protein target conserved across the electrophiles (roughly 1000 proteins in total). For example, for KEAP1 the reactivity order of the electrophilic ABPs was determined to be 20 μM AC ABP > 20 μM ITC ABP > 3 μM NEM ABP > 20 μM CURC ABP > 50 μM AE ABP > 15 μM SULF ABP > 20 μM CMK ABP > 4 μM PIP ABP > 50 μM BA ABP > 50 μM CA ABP. These studies therefore unravel some SAR (in terms of covalent adduct formation) of different electrophiles towards protein targets across the full spectrum of identified targets.
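The cross-triple ranking described above can be reproduced for KEAP1 with a short sketch. The ratios are the KEAP1 values against the common NEM ABP sample copied from Table 4 (column F, entries 1-9); the variable names are illustrative, and assigning the NEM ABP a ratio of 1.0 against itself is the only value added here.

```python
# KEAP1 ratios versus the common 3 uM NEM ABP sample (Table 4, column F).
keap1_vs_nem = {
    "CMK": 0.71,
    "AE": 0.78,
    "BA": 0.30,
    "SULF": 0.77,
    "PIP": 0.62,
    "CA": 0.16,
    "AC": 1.88,
    "ITC": 1.33,
    "CURC": 0.89,
}
# The reference sample has a ratio of 1 against itself by definition.
keap1_vs_nem["NEM"] = 1.0

# Rank ABPs from the highest to the lowest enrichment relative to NEM ABP.
ranking = sorted(keap1_vs_nem, key=keap1_vs_nem.get, reverse=True)
print(" > ".join(ranking))
```

The same sort could be applied to any protein quantified against the NEM ABP sample in all five triples, bearing in mind that the ABPs were applied at different concentrations.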

There are many other potential hypotheses that can be drawn from this data. However, the proteomics data generated comprises only a single data point for each ABP and requires further replicates to ensure higher accuracy and reliability of the pairwise comparisons between the labelling intensities of this small electrophile ABP panel. Furthermore, the concentration of ABP applied requires further exploration, as the different ABPs were applied across a broad concentration range (3-50 μM). Nonetheless, initial insight has been gained into how protein target engagement differs between different electrophilic motifs, and also between compounds employing the same electrophilic motif in different chemical scaffolds, warranting further investigation.

Table 4. The calculated SILAC ratios for the quantitative comparison of protein target binding for 9 selected proteins of interest (ATP2A2, EGFR, GSTO1, HAT1, HMOX2, KEAP1, NFKB2, STAT3 and VDAC2). Where the ratio is > 1, the first ABP shows greater enrichment than the second ABP. Where the ratio is < 1, the reverse is true. This therefore provides an indication of target preference for the different electrophilic ABPs. The full list of protein targets is presented in Appendix Table 1. * L/M ratio; + H/L ratio; † H/M ratio; n.d. = ratio not determined. ‡ Although a SILAC ratio is reported, no peptide intensity was detected for this protein target in one of the H, M or L samples. MaxQuant therefore automatically quantifies against background ‘noise’ in the MS1 to generate the reported SILAC ratio.

| No. | SILAC pair | A: ATP2A2 | B: EGFR | C: GSTO1 | D: HAT1 | E: HMOX2 | F: KEAP1 | G: NFKB2 | H: STAT3 | I: VDAC2 |
|---|---|---|---|---|---|---|---|---|---|---|
| 1* | CMK v NEM | 0.91 | 0.16 | 27.41 | n.d. | 0.30 | 0.71 | 0.43‡ | 0.84 | 0.90 |
| 2* | AE v NEM | 1.27 | 0.26 | n.d. | 1.61 | 0.52 | 0.78 | 0.69 | 0.35 | 0.84 |
| 3* | BA v NEM | 0.87 | 0.21 | 1.47 | 0.61 | 0.33 | 0.30 | n.d. | 0.41 | 0.42 |
| 4* | SULF v NEM | 0.51 | 0.12 | 2.57 | 1.11 | 0.23 | 0.77 | 0.45 | 0.28 | 0.24 |
| 5* | PIP v NEM | 1.00 | 0.32 | 22.54 | 1.51 | 0.70 | 0.62 | 0.83 | 0.42 | 0.61 |
| 6† | CA v NEM | 0.28 | 0.10 | 21.48 | n.d. | 0.21 | 0.16 | 0.26‡ | 0.14 | 0.16 |
| 7† | AC v NEM | 3.54 | 0.70 | n.d. | 3.95 | 2.43 | 1.88 | 1.19 | 0.71 | 1.29 |
| 8† | ITC v NEM | 1.90 | 0.62 | 2.93 | 2.45 | 0.43 | 1.33 | n.d. | 1.45 | 2.06 |
| 9† | CURC v NEM | 3.85 | 0.38 | 1.59 | 2.61 | 2.37 | 0.89 | 0.56 | 0.37 | 0.93 |
| 10† | DMSO v NEM | 0.13 | 0.11 | 0.59 | 0.50‡ | 0.14‡ | 0.08‡ | 0.20 | 0.11‡ | 0.06 |
| 11+ | CA v CMK | 0.33 | 0.46 | 0.82 | n.d. | 0.76 | 0.22 | 0.53 | 0.17 | 0.17 |
| 12+ | AC v AE | 2.85 | 2.68 | n.d. | 2.42 | 5.34 | 2.33 | 1.68 | 2.16 | 1.55 |
| 13+ | ITC v BA | 2.17 | 2.88 | 2.05 | 4.33 | 1.39 | 4.71 | n.d. | 3.34 | 5.03 |
| 14+ | CURC v SULF | 7.93 | 2.52 | 0.65 | 1.97 | 10.63 | 1.11 | 1.42 | 1.52 | 3.87 |
| 15+ | DMSO v PIP | 0.14 | 0.29 | 0.03 | 0.30‡ | 0.22‡ | 0.14‡ | 0.29 | 0.27‡ | 0.10 |

5.8 Summary and conclusions

In this chapter, the target profiles of curcumin, piperlongumine and sulforaphane have been comprehensively unravelled using a quantitative, SILAC-based chemical proteomics platform that employs competition with a concentration gradient of the parent compound. The number of high confidence targets for the compounds far exceeds what has been reported previously (Figure 8) and sheds new light on the target sets of the compounds and how these relate to their diverse biological activities. Although chemical proteomics workflows have been applied for all three compounds in certain contexts, the target data generated here go well beyond what has been reported, in addition to providing highly validated targets worthy of further investigation. However, as has been noted throughout this chapter, the list of targets generated is merely the start of the endeavour and there is much work to be done.

Sulforaphane had its targets profiled across two cell lines, MCF7 and MDA-MB-231, identifying 290 and 426 targets respectively (Chapter 5.2). Curcumin and piperlongumine were profiled in a single cell line, MDA-MB-231, identifying 196 and 476 targets respectively (Chapter 5.3). The quantitative comparison between targets captured with each respective ABP across a range of parent compound competition concentrations also allowed targets within these global sets to be ranked in terms of their binding potencies. This is the first time such a challenge has been undertaken for these compounds. Sulforaphane displayed two key targets that it appeared to covalently bind at significantly lower concentrations than the remainder of its targets, namely KEAP1 and MIF. These targets may be the most therapeutically significant targets of sulforaphane, given what has been reported previously, although the effect of other targets within its wider target spectrum at this stage should not be overlooked. This was certainly the case for curcumin and piperlongumine that appeared to covalently bind multiple targets across a range of potencies.

Although many other electrophilic natural products have had their activities pinpointed to a single target to explain their anticancer effects, the case is not as simple for curcumin, piperlongumine and sulforaphane. Their multi-target promiscuity, on account of their small size, lipophilic properties and electrophilic nature, means that they are unlikely to drive their biological effects through single targets but rather through action at multiple targets simultaneously. However, unravelling the target profiles of each of the three compounds has also highlighted that each electrophile has a strikingly different target profile, with only moderate overlap between the three compounds. This supports the belief that such electrophiles do not bind non-specifically or show preference for highly abundant intracellular proteins but, although it might seem contradictory, show a remarkable degree of selectivity across multiple targets.

Moreover, there are likely to be more potentially significant targets of all three compounds that have been overlooked in the profiling efforts. Fractionation upon lysis into cytosolic and nuclear parts improved the number of targets that could be elucidated for sulforaphane. For curcumin and piperlongumine, a whole cell lysis buffer was successfully employed to identify hundreds of targets. However, this raises the question of which targets are being missed. The chosen lysis conditions disfavour proteins located in the membrane on account of their increased hydrophobicity, which makes them more challenging to extract and solubilise upon lysis. More rigorous protein extraction protocols such as filter-aided sample preparation (FASP) and enhanced FASP (eFASP) could be implemented to improve the breadth of proteins extracted from the lysate and the number of targets subsequently captured.513, 514 Moreover, hydrophobic peptides do not ionise well in MS, which can often limit their detection by MS-based approaches. Given that over 50 % of modern drugs target membrane proteins, many pharmacologically relevant targets of these compounds may reside in the membrane, and optimising a protocol for their extraction is important moving forward.

Furthermore, all targets of the three compounds are the result of a 30 min incubation with intact cells. This requires a target to be covalently bound by the compound inside the cell within this relatively short time-scale for target identification. This may favour targets which have faster reaction kinetics or that are localised in easily accessible compartments within the cell. The short labelling times do, however, limit the effect that large concentrations of compounds may have on inducing cellular stress or inducing significant proteome changes between experimental conditions. A drawback is that some targets may require longer exposure to the compound to be effectively engaged and subsequently captured and identified. In this regard, further work is necessary to profile the labelling of individual protein targets over time using a WB- or MS-based approach, to better understand the dynamics of target labelling for protein targets of interest.

Identification of the individual amino acid sites of modification of ABP targets has provided additional validation for identifications as well as aided attempts to bridge the gap between target identification and assessment of functional implications on a given target. This showed promise for sulforaphane whereby 46 protein targets had their individual amino acid modification site determined, but less success was reported in the identification of modification sites for curcumin and piperlongumine (Chapter 5.6).

It is also important to put chemical proteomics methods into context, as outlined in Chapter 1. Chemical proteomics is merely a hypothesis-generating target identification method that requires orthogonal techniques to validate and further explore the engaged target of a compound of interest. A small number of the targets revealed by MS were confirmed by WB analyses, giving confidence by inference that all of the MS-based targets identified are genuine observations (Chapter 5.4).

For targets that have been identified previously in the literature and for which good functional insight has been gained (KEAP1 as a typical example), there is already much evidence to support the compound-protein interaction, with these chemical proteomic experiments complementing what is already known. In many cases, the study here reports the engagement of the compound with the aforementioned target inside the cell for the first time. However, for targets with no literature precedent and with limited modification site information (of which there are many), it remains unclear what the functional implications of compound binding are. Is the interaction on a given protein causing an antagonistic, agonistic or non-functional effect? Functional validation has been limited in the studies reported here due to time restrictions, and there is a need to carry out follow-on studies on a protein-by-protein basis. The hope is that with such broad research interest in dietary electrophiles, especially curcumin, many different groups will be able to utilise the target information and add additional support to many of the novel targets identified over time.

Despite this, targets for each compound can be analysed collectively in a global, system-wide manner to assess their overall functional effects, as has been done in the implementation of bioinformatic and network analyses. Identification of conserved biological processes relating to the cell cycle, cell division and cell death gives a number of novel targets for these compounds whose functional role in their anticancer effects can be explored further. Further integration of other systematic proteomic and transcriptomic data (indirect targets), in addition to the direct targets identified here, provides a rich source of cellular information in relevant cancer cell lines.515-519 This is allowing complex target network models for the three compounds to be developed, combining large datasets to provide an overall picture of the complex polypharmacology of these compounds. Encompassing all this, key targets would appear to be KEAP1, EGFR, STAT3 and NFKB2 on account of previous literature findings, their high affinity binding potencies relative to other proteins within the target set, their well-documented functional importance in cancer, their connectivity within the target interactomes and their conserved nature as targets across all three compounds under study.


6. Exploring combinations of electrophilic natural products with other cancer therapeutics

6.1 Introduction

The target profiling of curcumin, piperlongumine and sulforaphane clearly highlights these compounds as polypharmacological agents that modulate multiple targets and signalling pathways simultaneously. The high complexity of their mode of action makes recommendations of how best to apply these compounds as cancer therapeutics extremely challenging. These compounds are often described as having shallow phenotypes, meaning that, whilst displaying an impressive range of activities, the overall effect on cancer cell death is often only reflected in moderate potency. As has been discussed previously, it remains unclear whether these concentrations are obtainable in vivo. However, one of the great advantages of multi-target compounds is that they can potentially induce cancer cell death through a number of mechanisms and provide a better therapeutic outcome than that of single-target inhibitors. This is beneficial in avoiding cancer resistance mechanisms. When used in combination with other therapeutics, they may also sensitise cancer cells to enhance, complement or synergise with the selective cancer cell killing action of such agents.

Synergy is defined as a situation in which the effect of the combination of two agents is greater than the sum of the effects of each agent alone. The identification of drug synergies is highly sought after, as it allows a lower dosage of one or both of the agents to achieve the same effect. This is highly desirable for reducing off-target effects or where achieving the required therapeutic dose is challenging. The number of such combinations is steadily increasing, providing a new paradigm for cancer treatment.520 Predicting whether a combination is likely to be beneficial or synergistic is almost always based on experimental observations after the fact rather than design. However, a growing number of in silico prediction approaches are being developed.521-523 The dietary nature of these electrophilic natural products, which are readily consumed as part of the diet, has made their potential application in combination with current cancer therapeutics an attractive proposition. This is aided by the fact that they are very easy to progress through to clinical trials, as the FDA generally recognises these compounds as safe on account of their dietary status.

As such, a large number of combinations of current cancer therapeutics with either curcumin or sulforaphane have been explored both in vitro and in vivo, accounting for over 50 % of the reported clinical trials for curcumin past and present.282, 478, 524 In vitro, combination studies have been carried out across a wide range of cancer cell lines, leading to a handful of reported beneficial combinations. Many combinations have been with non-targeted chemotherapeutics. These include drugs that interfere with DNA and RNA metabolism (e.g. cisplatin,525, 526 doxorubicin,525, 527, 528 gemcitabine,529 and fluorouracil (5-FU)530, 531), with the electrophilic natural product believed to reverse cancer resistance mechanisms, attributed to its anti-oxidant and anti-inflammatory effects, thereby significantly reducing the (toxic) dose of chemotherapy required.254 Other studies have combined the treatments of curcumin and sulforaphane with other multi-targeted agents (e.g. epigallocatechin gallate (EGCG),532, 533 and resveratrol534, 535). Two studies have even shown beneficial combination treatments with curcumin and sulforaphane,536, 537 and curcumin and piperlongumine,538 in cancer model systems. For all such therapies, very little is known about the underlying mechanism which may explain the origin of the observed beneficial effects.

Less explored are combinations of electrophilic natural products with single-target chemotherapeutics.539-542 Combinations of curcumin, piperlongumine and sulforaphane were therefore investigated with a small panel of targeted chemotherapeutics in MDA-MB-231 and MCF7 breast cancer cell lines with the aim of identifying novel synergistic combinations that could potentially provide recommendations for the clinic.

6.2 Assessment of effects of cancer therapeutics on breast cancer cell viability

Firstly, candidate cancer chemotherapeutics were sought for combination therapy studies with the electrophilic natural products. The MTS assay was again chosen to determine compound effects on cell viability (Chapter 3.4), based on its cost effectiveness and ease of implementation into a 96-well plate format.

A total of 13 carefully selected compounds were screened for their effects on cell viability across two breast cancer cell lines, MCF7 and MDA-MB-231, at two separate time intervals: 24 h and 48 h for the MCF7 cell line, and 24 h and 72 h for the MDA-MB-231 cell line. The compounds were chosen based on their current application as therapeutics against breast cancer and their commercial availability. Drugs approved for breast cancer treatment have typically targeted the oestrogen receptor (ESR1), DNA and RNA metabolism, tubulin and microtubules, aromatase (CYP19A1), CDK4/6, TOP2, HER2 (ERBB2) and EGFR.543 Selective inhibitors were obtained against 6 of these targets (CDK4/6, CYP19A1, DNA, HER2/EGFR and ESR1) and were tested for their effects on MDA-MB-231 and MCF7 cell viability. In addition, drug-like inhibitors against 5 additional well-defined cancer targets (androgen receptor (AR), CHEK1, focal adhesion kinase (FAK), IKBKB and SRC) were also tested.

Of the 13 compounds screened, 3 compounds (CP-380736, exemestane and (-)-thalidomide) showed no detectable effects on cancer cell viability in either of the two cell lines, even after 72 h treatment (Table 5). Of the remaining 10 compounds, 6 showed significant effects on cell viability across both cell lines and time intervals tested (PF-477736 (CHEK1), epirubicin (DNA metabolism), afatinib (EGFR), tamoxifen (ESR1), PF-573228 (FAK) and PP2 (SRC)), providing compounds which act through a diverse range of targets to bring about cancer cell death (Table 5 (highlighted in red)). These were selected to be taken forward for testing in combination with curcumin, piperlongumine and sulforaphane (Figure 40).


Table 5. EC50 values for effect of compound treatment on cell viability in MDA-MB-231 and MCF7 cell lines determined at two time intervals for both cell lines using the MTS cell viability assay (p < 0.05). EC50 values with no reported errors correspond to poor sigmoidal curve fit for the dose response curve such that errors could not be reported.

| Compound | Target name | MDA-MB-231: EC50 (μM) at 24 h | MDA-MB-231: EC50 (μM) at 72 h | MCF7: EC50 (μM) at 24 h | MCF7: EC50 (μM) at 48 h |
|---|---|---|---|---|---|
| PF-998425 | AR | > 50 | 16.9 ± 4.2 | > 50 | > 50 |
| Exemestane | CYP19A1 | > 50 | > 50 | > 50 | > 50 |
| Palbociclib | CDK4/6 | > 50 | > 50 | 30.5 ± 7.9 | 38.9 ± 7.3 |
| PF-477736 | CHEK1 | 6.7 ± 3.5 | 1.6 ± 0.4 | 1.1 ± 1.2 | 2.4 ± 2.9 |
| Epirubicin | DNA | > 10.2 | 0.4 ± 0.1 | 9.98 ± 4.50 | 1.6 ± 0.6 |
| CP-380736 | EGFR | > 50 | > 50 | > 50 | > 50 |
| Gefitinib | EGFR | 51.7 ± 31.4 | 25.6 | 26.7 | 22.4 ± 5.5 |
| Afatinib | EGFR | 13.3 | 4.1 ± 0.3 | 9.4 | 4.3 ± 1.8 |
| Nafoxidine | ESR1 | 13.3 | 5.9 ± 0.9 | 12.0 | 6.8 ± 1.3 |
| Tamoxifen | ESR1 | 22.3 | 13.9 | 18.8 | 12.7 |
| PF-573228 | FAK | 56.0 ± 31.4 | 17.8 ± 4.8 | 62.0 ± 16.9 | 8.5 ± 3.8 |
| (-)-Thalidomide | IKBKB | > 50 | > 50 | > 50 | > 50 |
| PP2 | SRC | 37.3 ± 33.4 | 10.1 ± 2.4 | 21.3 ± 6.5 | 15.7 ± 7.1 |
| Sulforaphane | n/a | 64.5 ± 32.3 | 20.1 ± 5.4 | 47.8 ± 35.6 | 14.6 ± 5.1 |
| Curcumin | n/a | 93.4 ± 17.6 | 46.2 | 47.7 | 32.1 ± 5.7 |
| Piperlongumine | n/a | 13.0 ± 4.0 | 4.9 ± 0.6 | 8.8 ± 5.8 | 4.5 ± 1.7 |

[Chemical structures: PF-477736 (CHEK1), PF-573228 (FAK), PP2 (SRC), tamoxifen (ESR1), afatinib (EGFR) and epirubicin (DNA metabolism).]

Figure 40. The chemical structures of the 6 selective inhibitors which showed cytotoxicity against both MCF7 and MDA-MB-231 cell lines taken forward for combination with curcumin, piperlongumine and sulforaphane.

6.3 Combination of cancer therapeutics with electrophilic natural products

The 6 chosen cancer therapeutics (PF-477736, epirubicin, afatinib, tamoxifen, PF-573228 and PP2) were subjected to combination studies with two fixed concentrations of the three electrophilic natural products in both the MCF7 and MDA-MB-231 cell lines at a 72 h time interval. The experimental setup is shown in Figure 41.

Briefly, cells were treated separately with 6 two-fold serial dilutions around the EC50 of the cancer therapeutic alone and of the electrophilic natural product alone (Figure 41A, blue and red wells respectively). Cells were also treated with 6 two-fold serial dilutions around the EC50 of the cancer therapeutic with simultaneous dosing of two separate fixed concentrations (at non-constant ratio) of the electrophilic natural product around its EC25 and EC40 values (Figure 41A, orange and green wells).544 Following cell viability determination, the fraction of cells affected (Fa) by the various treatments was determined and used to generate dose response curves for the cancer therapeutic, the electrophilic natural product and the two combinations by employing the CompuSyn software. Several attempts have been made to quantitatively measure the dose-effect relationship between two or more agents alone and in combination to determine whether a given combination results in a synergistic effect. CompuSyn uses an approach known as the median-drug effect analysis method developed by Chou and Talalay.545 Using this method, a combination index (CI) is calculated based on the shape of the growth inhibition curves of each agent alone and in combination to provide a quantitative measure of the interaction between the two agents. A combination is defined as synergistic (CI < 1), additive (CI = 1) or antagonistic (CI > 1).546 The calculation of a CI value for each combination at each Fa value produces the isobolograms in Figure 41B. This allows the combination relationship to be visualised across all concentrations tested.
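The calculation behind the CI can be sketched as follows. This is a minimal illustration of the published Chou-Talalay median-effect equations (fa/(1 - fa) = (D/Dm)^m and CI = D1/Dx1 + D2/Dx2), not a reproduction of CompuSyn itself; the function names and the example doses are hypothetical, and the dose-response data are synthetic rather than taken from this work.

```python
import math


def fraction_affected(percent_viability):
    """Fa = 1 - (% cell viability / 100)."""
    return 1.0 - percent_viability / 100.0


def fit_median_effect(doses, fa_values):
    """Fit log10(fa / (1 - fa)) = m * log10(D) - m * log10(Dm) by least squares.

    Returns (Dm, m): the median-effect dose (EC50) and the slope.
    Points with fa <= 0 or fa >= 1 cannot be log-transformed and are skipped.
    """
    points = [
        (math.log10(d), math.log10(fa / (1.0 - fa)))
        for d, fa in zip(doses, fa_values)
        if 0.0 < fa < 1.0
    ]
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    m = sxy / sxx
    intercept = mean_y - m * mean_x
    dm = 10 ** (-intercept / m)
    return dm, m


def dose_for_effect(fa, dm, m):
    """Dose of a single agent required to reach fraction affected fa."""
    return dm * (fa / (1.0 - fa)) ** (1.0 / m)


def combination_index(d1, d2, fa, params1, params2):
    """CI = d1/Dx1 + d2/Dx2 at the effect level fa reached by the combination."""
    dx1 = dose_for_effect(fa, *params1)
    dx2 = dose_for_effect(fa, *params2)
    return d1 / dx1 + d2 / dx2


# Synthetic single-agent curve: six two-fold dilutions, true Dm = 4, m = 1.5.
doses = [1.0, 2.0, 4.0, 8.0, 16.0, 32.0]
fa_single = [1.0 / (1.0 + (4.0 / d) ** 1.5) for d in doses]
dm_fit, m_fit = fit_median_effect(doses, fa_single)
print(dm_fit, m_fit)
```

A useful sanity check is the sham combination of an agent with itself: half of the EC50 dose from each "agent" gives Fa = 0.5 and therefore CI = 1, the additive case.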

The results of the combination analysis for the 6 cancer therapeutics in combination with curcumin, piperlongumine and sulforaphane are summarised in Table 6. The key findings identified potential synergistic relationships for the following combinations:

1. Afatinib (EGFR) with curcumin, piperlongumine or sulforaphane across both MDA-MB-231 and MCF7 cell lines.
2. Epirubicin (DNA metabolism) with curcumin, piperlongumine or sulforaphane in the MDA-MB-231 cell line.
3. PF-477736 (CHEK1) with curcumin, piperlongumine or sulforaphane in the MDA-MB-231 cell line.
4. PF-573228 (FAK) with curcumin or piperlongumine across both MDA-MB-231 and MCF7 cell lines.

The other 2 compounds tested in combination with the electrophilic natural products (tamoxifen and PP2) showed no observable synergy or the combination interaction could not be definitively determined. Tamoxifen in particular had a very steep dose response curve around its EC50, which prevented reliable combination relationships from being investigated in these studies. The number of cancer therapeutics that showed a synergistic relationship with the electrophilic natural products was surprising, especially for curcumin and piperlongumine. The identification of novel synergistic relationships of different electrophilic natural products with cell line dependencies in some cases certainly warrants further investigation beyond the scope of these studies. The synergy observed between epirubicin and all three electrophilic natural products was a novel observation, although a closely related anthracycline family member, doxorubicin, has been widely reported to be synergistic with these and other electrophilic natural products.


[Figure 41 panels. (A) 96-well plate layout: negative control; positive control; cancer therapeutic alone; electrophilic natural product alone; cancer therapeutic + EC25 electrophilic natural product; cancer therapeutic + EC40 electrophilic natural product. Conditions: MDA-MB-231 and MCF7 cell lines, 72 h incubation, simultaneous compound dosing, all data points in triplicate. (B) Workflow: calculate % viability (averaged over triplicates); calculate fraction affected, Fa = 1 – [(% cell viability)/100]; analyse data with CompuSyn to obtain the EC50 value (Dm), the fit to the curve (r) and the degree of sigmoidicity (m); calculate the combination index (CI) value for each combination set; isobologram analysis (CI > 1.0, antagonistic effect; CI = 1, additive effect; CI < 1.0, synergistic effect). Example cell viability plot: afatinib (EGFR) + sulforaphane in MDA-MB-231 (afatinib only, afatinib + 13.8 µM sulforaphane, afatinib + 17.8 µM sulforaphane), plotted as response (% cell viability) against log (concentration / M). Example isobologram: concentration of A against concentration of B, with the EC50 of A and the EC50 of B marked and regions of antagonism, additivity and synergy indicated.]

Figure 41. Combination studies of cancer therapeutics and electrophilic natural products on cancer cell viability. (A) The 96-well plate setup with one cancer therapeutic assessed for its combination with one electrophilic natural product per plate, with all experimental data points in triplicate. (B) Workflow for determination of CI values and the isobologram analysis for determining the relationship or interaction between two agents in combination.

Table 6. Summary of the combination analysis of the 6 cancer therapeutics with curcumin, piperlongumine or sulforaphane. The CI value for each combination presented is the average CI value over all relevant experimental data points for the combination. + slight synergy (CI = 0.85-0.9); ++ moderate synergy (CI = 0.7-0.85); +++ synergy (CI = 0.3-0.7); – no synergy (CI > 0.9); n.d., not determined. The isobolograms are presented for curcumin (Appendix Figure 19), piperlongumine (Appendix Figure 20) and sulforaphane (Appendix Figure 21).

| Cancer therapeutic | CI value (curcumin): MDA-MB-231 | CI value (curcumin): MCF7 | CI value (piperlongumine): MDA-MB-231 | CI value (piperlongumine): MCF7 | CI value (sulforaphane): MDA-MB-231 | CI value (sulforaphane): MCF7 |
|---|---|---|---|---|---|---|
| PF-477736 (CHEK1) | ++ | + | +++ | n.d. | +++ | n.d. |
| Epirubicin (DNA metabolism) | +++ | n.d. | +++ | n.d. | + | - |
| Afatinib (EGFR) | ++ | +++ | +++ | +++ | + | ++ |
| Tamoxifen (ESR1) | - | - | n.d. | - | n.d. | n.d. |
| PF-573228 (FAK) | ++ | + | +++ | + | - | - |
| PP2 (SRC) | ++ | n.d. | n.d. | n.d. | n.d. | n.d. |
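The grading scheme in the Table 6 caption can be written as a small lookup, sketched below. The function name is hypothetical. Two points are assumptions rather than statements from the caption: the boundary values 0.7 and 0.85 are assigned to the weaker grade (consistent with a CI of 0.85 being reported as + in Figure 42C), and CI values below 0.3, which the caption does not cover, are also returned as +++.

```python
def synergy_grade(ci):
    """Map an averaged CI value to the symbols used in the Table 6 caption."""
    if ci > 0.9:
        return "-"    # no synergy
    if ci >= 0.85:
        return "+"    # slight synergy (CI = 0.85-0.9)
    if ci >= 0.7:
        return "++"   # moderate synergy (CI = 0.7-0.85)
    # CI = 0.3-0.7; assumption: values below 0.3 are also graded '+++'.
    return "+++"


# Averaged CI values for afatinib with curcumin in MDA-MB-231 (Figure 42C).
for ci in (0.65, 0.77):
    print(ci, synergy_grade(ci))
```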


The combination of most interest was the clear synergy observed between all three electrophilic natural products and afatinib across both the MCF7 and MDA-MB-231 cell lines (Figure 42). The isobologram analyses (Figure 42A) indicate clear synergy for all relevant combination concentrations tested, and this effect is conserved across the two cell lines. The effect of the electrophilic natural product on the dose-dependent effect of afatinib on cell viability is shown in Figure 42B and the calculated CI values for the combinations are summarised in Figure 42C. Variation in the amount of synergy observed was evident for the three electrophilic natural products, with curcumin and piperlongumine showing a stronger synergistic effect relative to sulforaphane in these studies.

Afatinib (previously known as BIBW-2992) is a selective covalent inhibitor of the HER family of tyrosine kinases, including EGFR (HER1), HER2 and HER4.547 It has recently been approved for the treatment of non-small-cell lung cancer (NSCLC).548 Afatinib binds at a highly conserved cysteine residue (Cys797 in EGFR) in the hinge region of the ATP-binding cleft and the subsequent covalent adduct formation disrupts access to ATP.549 Afatinib shows the highest degree of potency towards EGFR and HER2, and it is believed to mediate its anticancer therapeutic effects through these targets.550 Unsurprisingly, afatinib has been applied in cancers that overexpress HER2, including metastatic breast cancer.551, 552 As both cell lines under study here are HER2-negative (the HER2 receptor is not overexpressed and they are insensitive to HER2 inhibition), the inhibitory effect exerted by afatinib is likely to come directly through EGFR. This would therefore suggest the synergy with the electrophilic natural products may also be EGFR-dependent.

The observed synergy between afatinib and all three electrophilic natural products suggests the effect is being mediated through one or more of the 81 conserved protein targets identified in Chapter 5. It also appears to be cell line independent suggesting it may have broad applicability for a range of cancer types. A verifiable explanation for the underlying mechanism of the synergy cannot be determined from these studies and is a major challenge for any drug combination.546, 553 However, a number of observations can be made. Firstly, EGFR is itself a target of all three electrophilic natural products suggesting that a combined inhibitory action on EGFR activation may drive the synergistic relationship. Secondly, a number of conserved targets of all three electrophilic natural products within the downstream EGFR signalling cascade (including CBL, CRKL, PLCG1, PTPN11, PTPN12, PXN, ROCK1, STAT1 and STAT3) may contribute to the synergistic enhancement of cell death through EGFR (http://www.netpath.org/netslim/EGFR1_pathway.htmL [Accessed 30/03/2015]). Finally, the covalent mode of inhibition by afatinib and the electrophilic natural products should also not be overlooked. A number of non-protein based mechanisms such as GSH adduction may also account for or contribute to the synergistic relationship between afatinib and the electrophilic natural products.

Synergy has also been reported by others for two non-covalent EGFR inhibitors, gefitinib and erlotinib, in combination with curcumin in other cancer cell lines.554-556 A hypothesis was proposed that the synergy is driven through the epigenetic effect of curcumin on EGFR expression that ultimately leads to ubiquitin-proteasome-mediated EGFR degradation, inhibiting downstream signalling.557 Here it is shown that such synergy with EGFR inhibitors is not limited to curcumin, as it is also observed for piperlongumine and sulforaphane. This may indicate that a mechanism common to all three compounds results in the synergistic effects with EGFR inhibitors.


[Figure 42 panels. (A) Normalised isobolograms for afatinib (EGFR) + sulforaphane, curcumin and piperlongumine, shown for combination 1 and combination 2 in the MDA-MB-231 and MCF7 cell lines. (B) Cell viability plots of response (% cell viability) against log (concentration / M) for afatinib only and in combination. MDA-MB-231: afatinib + 13.8 µM or 17.8 µM sulforaphane; afatinib + 12 µM or 20 µM curcumin; afatinib + 3.3 µM or 4.2 µM piperlongumine. MCF7: afatinib + 11.1 µM or 13.2 µM sulforaphane; afatinib + 12.5 µM or 25 µM curcumin; afatinib + 2.1 µM or 4.2 µM piperlongumine.]

(C)

| Electrophilic natural product (cell line) | Electrophilic natural product alone EC50 (μM) | Afatinib alone EC50 (μM) | Afatinib + EC25 electrophilic natural product CI value | Afatinib + EC40 electrophilic natural product CI value |
|---|---|---|---|---|
| Curcumin (MDA-MB-231) | 28.8 | 2.7 | 0.65 (+++) | 0.77 (++) |
| Curcumin (MCF7) | 38.0 | 9.0 | 0.60 (+++) | 0.69 (+++) |
| Piperlongumine (MDA-MB-231) | 6.2 | 2.9 | 0.62 (+++) | 0.62 (+++) |
| Piperlongumine (MCF7) | 7.9 | 8.2 | 0.57 (+++) | 0.61 (+++) |
| Sulforaphane (MDA-MB-231) | 22.1 | 3.3 | 0.85 (+) | 0.89 (+) |
| Sulforaphane (MCF7) | 29.4 | 12.1 | 0.66 (+++) | 0.71 (++) |

Figure 42. Combination analysis for afatinib and the three electrophilic natural products. (A) The normalised isobologram plots of afatinib with sulforaphane, curcumin and piperlongumine generated in CompuSyn analysis. The line designates the CI where CI = 1 (additive effect). Below the line (CI < 1) indicates synergy and above the line (CI > 1) indicates antagonism for the combination of the two agents. (B) Cell viability plots for afatinib alone and in combination with two concentrations of electrophilic natural product (at EC25 and EC40). (C) CompuSyn output for the synergy analysis showing the averaged CI values for each electrophilic natural product with afatinib across both MDA-MB-231 and MCF7 cell lines. + slight synergy, ++ moderate synergy and +++ strong synergy.

6.4 Conclusions

The results presented here show promise for the combination of electrophilic natural products with a range of cancer chemotherapeutics currently used in the clinic. The observed synergy for afatinib in particular has not been previously investigated, especially in HER2-negative cell lines such as MCF7 and MDA-MB-231, and provides an interesting combination that warrants further study. It remains to be determined whether this, as well as the other combinations reported here, translates from in vitro to in vivo systems.

However, there is also a lot of further validation required for the in vitro synergistic observations made here before recommendations can be made in the clinic. Secondary screens with other cell viability assays are required to further consolidate the conclusions drawn from this study. Although practically convenient, tetrazolium salt-based cell viability assays are known to have limitations and are susceptible to metabolic and compound interference.558-560

There has also been a lack of a unifying approach within the literature to conclusively ascertain combination relationships of two or more agents, with a variety of experimental designs applied. In these studies, the experiments were carefully designed based on the recommendations of Chou to ensure that their output was both accurate and biologically relevant.544, 546, 561 Even so, further exploration of all 6 cancer therapeutics tested over broader concentration ranges and incubation time intervals will allow a more comprehensive overview of the ‘synergy window’ to be determined.

What has been implemented here is a screening platform for testing the combinations of electrophilic natural products with cancer chemotherapeutics to identify synergies for further study. Further optimisation will allow more systematic screening of combinations with a wider range of compounds and over a large number of cell lines, concentration gradients and time intervals. The combination treatment of PARP inhibitors, which have shown promising activity against triple negative breast cancers in vivo, with electrophilic natural products would be an interesting further study.543 Having identified the comprehensive target sets for the three electrophilic natural products in Chapter 5, it is hoped that these will provide a means to explain the observed synergies moving forward. Furthermore, the target information has the potential to guide the selection and further testing of combinations using hypothesis-driven approaches such that it may be possible to predict synergies. This remains a work in progress requiring continued collaborations with computational and systems biologists but could help to unlock the therapeutic potential for the application of electrophilic natural products in combination with other therapeutics for cancer treatment.

7. Conclusions and future work

7.1 Conclusions

The main aim of the reported studies was to comprehensively address the direct protein target sets of three dietary-based electrophilic natural products, namely curcumin, piperlongumine and sulforaphane, and to translate their target profiles into an improved understanding of their fascinating biological activities.

7.1.1 New probes for profiling electrophilic natural products

Firstly, new chemical tools based on the three electrophilic natural products were synthesised using a combination of reported and novel procedures to produce a total of seven functionally useful ABPs (Chapter 3). With the exception of sulforaphane ABP 3, all these ABPs can be prepared reproducibly in sufficient yields for use in biological experiments. Optimisation of a synthetic procedure for sulforaphane ABP 3 such that it can be obtained on a larger scale is still required. The three curcumin ABPs all retained the cancer cell killing activity of the parent compound, with similar EC50 values. The same was true of the piperlongumine ABP. This was not the case, however, for the sulforaphane ABPs: the altered design of sulforaphane ABP 1 and 2 produced compounds that were not cytotoxic towards the tested MDA-MB-231 cell line, whereas sulforaphane ABP 3 showed comparable activity to the parent compound sulforaphane. Initial in-cell and in-lysate competition-based assays between the synthesised ABPs and their respective parent compounds showed that all ABPs were good surrogates for their parent compound, engaging the same protein targets under native biological conditions as determined by in-gel fluorescence analysis (Chapter 4).

7.1.2 Identification of the comprehensive target set of electrophilic natural products

In Chapters 4 and 5, chemical proteomic profiling of the targets of curcumin, piperlongumine and sulforaphane was carried out using quantitative chemical proteomics in breast cancer cell lines. A number of chemical proteomic experiments were carried out initially to explore the applicability of using ABPs to identify protein targets for each compound using MS-based analysis. The differences observed in the target profiles of curcumin, piperlongumine and sulforaphane by in-cell labelling in comparison to in-lysate labelling strongly supported the need to carry out their target profiling in live, intact cells to identify biologically relevant targets. This led to the development of an optimised, concentration-gradient, competition-based assay that identified 196, 476 and 426 high confidence targets for curcumin, piperlongumine and sulforaphane respectively in the MDA-MB-231 cell line. For sulforaphane, an additional 290 targets were also identified in the MCF7 cell line. This chemical proteomics platform combined in-cell target profiling with quantitative target binding analysis to not only identify biologically relevant targets under native conditions, but also to identify the most potently engaged targets within the target sets. Previously reported targets from the literature for all three compounds were identified. However, the number of targets identified was above and beyond what has been reported previously and sheds new light on the complex target sets of these compounds, with a significant number of novel targets unravelled. Functional insight into these targets was gained through bioinformatic analyses that revealed a plethora of new targets involved in cell cycle and cell death processes that may well contribute to their anticancer effects.

Curcumin has been the most well-studied of the three electrophilic natural products under study, with over 7000 publications reported. Despite around 50 targets being identified and linked to its polypharmacological effects, there is still a clear lack of understanding of its underlying molecular target set, which is yet to be systematically explored. The targets identified for curcumin in this study provide further evidence for the in-cell engagement of some of these previously reported targets (including EGFR, IMPDH2 and TXNRD1) as well as identifying a broad range of novel targets requiring further study. Under the assay conditions employed, the binding potencies of all targets within the identified target sets were obtained, highlighting HMOX2, RTN3, ALDH9A1 and KEAP1 as some of the most potent targets of curcumin.

Piperlongumine, relative to curcumin, has been far less studied, with only 16 targets identified to date, the majority identified in the study by Lee and co-workers.123 The identification of 476 targets therefore expands this number substantially, providing a range of new mediators through which piperlongumine may exert its selective cancer cell killing (including AURKA, CHEK1, GSTO1, KEAP1, RB1 and STAT3).

Sulforaphane has had around 10 direct targets identified, although around 30 targets are known more generally for isothiocyanate family members. Identification of KEAP1 and MIF as significantly more potently engaged by sulforaphane than any other target across both the MCF7 and MDA-MB-231 cell lines highlighted these two proteins as the key functional targets of sulforaphane in breast cancer cell lines for the very first time. Identification of individual amino acid modification sites for sulforaphane adducts also helped improve understanding into the functional impact of binding for 46 target proteins. Adduct formation at catalytic cysteines of cathepsin enzymes and a regulatory cysteine in thioredoxin strongly implied significant functional effects on these targets.

By drawing comparisons between the targets identified for the three compounds under study, the diversity of target reactivity by electrophilic natural products is emphasised. However, a core target set of 81 proteins conserved across all three compounds contained notable protein targets such as KEAP1, NFKB2 and STAT3 that have prior literature for their reactivity with electrophiles. These targets may represent a core hub of mediators for electrophilic compounds responsible for their ability to drive cancer cytotoxicity. For example, the three electrophilic natural products have all been shown to induce apoptosis-mediated cell death as well as cause cell cycle arrest in the MDA-MB-231 cell line. In these studies, scores of candidate targets associated with these biological processes (including but not limited to EGFR, RTN3, SF1 and STAT1) may individually, but most likely collectively, contribute to these effects. The validation of many of these targets by orthogonal approaches, the determination of the functional effect of target binding, and how this relates to the overall anticancer effects of the compound are yet to be achieved. To do this for every target would require a herculean effort beyond the capabilities of a single research group. The hope is that the wider research community will be able to take this on, further validate the targets identified in this study, and utilise the target information to provide insight into observed activities of the electrophilic natural products beyond those discussed here.
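The comparison of target sets described above amounts to set intersection. The following is a minimal sketch using toy sets built only from gene names mentioned in this section; the sets are illustrative subsets, not the 196, 476 and 426 member target lists, and the absence of a gene from a toy set implies nothing about the real data. The function names are hypothetical.

```python
from itertools import combinations

# Toy subsets of the target lists, built from genes named in the text above.
# KEAP1, NFKB2 and STAT3 are stated to belong to the core set shared by all three compounds.
targets = {
    "curcumin": {"KEAP1", "NFKB2", "STAT3", "EGFR", "IMPDH2", "TXNRD1",
                 "HMOX2", "RTN3", "ALDH9A1"},
    "piperlongumine": {"KEAP1", "NFKB2", "STAT3", "AURKA", "CHEK1", "GSTO1", "RB1"},
    "sulforaphane": {"KEAP1", "NFKB2", "STAT3", "MIF"},
}


def core_targets(target_sets):
    """Proteins present in every compound's target set."""
    sets = list(target_sets.values())
    if not sets:
        return set()
    return set.intersection(*sets)


def pairwise_overlap(target_sets):
    """Number of shared targets for each pair of compounds."""
    return {
        (a, b): len(target_sets[a] & target_sets[b])
        for a, b in combinations(sorted(target_sets), 2)
    }


print(sorted(core_targets(targets)))
print(pairwise_overlap(targets))
```

Applied to the full high confidence target lists, the same intersection would give the core set of proteins conserved across the three compounds.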

7.1.3 Synergistic combinations of electrophilic natural products with cancer therapeutics

Analysis of 6 cancer therapeutics was carried out to determine combination relationships when applied with curcumin, piperlongumine or sulforaphane. The identification of the in vitro synergistic relationship between afatinib and all three electrophilic natural products in both the MCF7 and MDA-MB-231 cell lines was a novel observation. The underlying mechanism remains unknown and requires further study, but given that afatinib is in late stage clinical trials for breast cancer, applying it in conjunction with these dietary-based electrophiles is an exciting possibility that warrants further investigation. There is a significant amount of research effort required to systematically screen such electrophilic natural products in combination with cancer therapeutics more broadly across multiple cancer cell lines to identify potential synergies that could then be further explored in vivo. Knowledge of the target data for curcumin, piperlongumine and sulforaphane will certainly help in explaining and possibly predicting combinations in the future, but this is a major undertaking.

7.1.4 Wider implications of the work

The efficacy of any therapeutic against a specific disease is determined by randomised, placebo-controlled, double-blind clinical trials, and to date no such trial has shown curcumin, piperlongumine or sulforaphane to be effective. One of the major stumbling blocks for the application of these electrophilic natural products in vivo is the lack of understanding of their molecular targets, which has prevented rational recommendations on how best to apply these agents in the clinic. The direct targets of these compounds have been surprisingly overlooked and understudied, predominantly because traditional biological techniques have struggled to get to grips with their polypharmacological nature.

These studies have looked to address this void. The utilisation of the target information has helped to provide insight into the underlying mechanisms for this polypharmacology. These are multi-target compounds which mediate their activities through multiple targets simultaneously in a system-wide manner. Many of the targets identified in the MDA-MB-231 cell line are also likely to be targets in other cancer cell lines and types. However, many more targets for these compounds may exist beyond those captured here. The direct target information for curcumin, piperlongumine and sulforaphane identified could be easily added to curated bioactive compound databases such as ChEMBL to provide improved understanding of their broader spectrum target profiles.562 These databases, despite their infancy, are continually expanding and the addition of in-cell physical interactions with the reported targets generated here can complement much of the activity information that predominantly reports activity against targets on isolated proteins and may be less informative.

Importantly, the biological activities of such dietary electrophiles may also lie beyond their ability to form covalent adducts with proteins. Contributions may also derive from non-covalent protein interactions and non-protein mediated mechanisms, particularly noted for curcumin.261 Non-protein molecular targets include adduct formation with GSH, which has been widely reported for electrophilic natural products, disrupting redox buffering and subsequently causing redox-sensitive signalling cascades to become activated.282, 283, 312, 563, 564 The ability to scavenge free radicals as well as dynamically regulate ROS levels is also well documented for other electrophilic compounds.17, 565 Studying the contributions of these mechanisms has proved far more challenging than profiling the protein targets as identified here by chemical proteomics, with the two likely to be highly intertwined. Therefore, when considering the biological activities of electrophilic natural products, it is not only the disruption to protein function, structure and localisation that one has to consider but also the impact on non-protein entities that also play an integral role in their mode(s) of action.

Unravelling the molecular targets of electrophilic natural products and relating this to their mode(s) of action is a complex endeavour requiring the effort of multiple research groups spanning cancer biology, inflammation, electrophile signalling, systems biology and computational chemistry. The data provide many possible explanations and hypotheses that require further validation and investigation. Complementing them with other orthogonal, system-wide profiling techniques such as RNAi, protein microarrays and HIP/HOP will provide further support not only for the targets identified through chemical proteomics, but also for moving towards identifying the key underlying mechanisms of these electrophilic natural products. This is the first reported study to get to grips with the multi-target nature of electrophilic natural products and provides a significant advancement in the understanding of the fascinating biological activities of these agents.

7.2 Future work

There is much work to be done to further develop understanding into curcumin, piperlongumine and sulforaphane as well as the target profiles of other electrophilic species. Highlighted below are some of the key areas that future work should look to address.

1. Although certainly comprehensive in terms of the number of targets identified for the three compounds in these studies, many genuine targets may have been overlooked. For example, membrane-bound proteins are often disfavoured in chemical proteomic workflows both in extraction upon lysis and MS detection. Optimisation of protein extraction protocols (such as the use of FASP) as well as downstream LC-MS/MS methodologies may aid the detection of membrane-bound targets. Other improvements to the LC-MS/MS setup such as peptide fractionation or extended chromatography gradients would also allow lower abundance protein targets to be detected.

2. Target profiling across a broader range of cell lines is also required. It was noted for sulforaphane that there were significant target profile differences between the MDA-MB-231 and MCF7 cell lines. Target identification in further breast cancer cell lines in addition to other cancer cell types will further understanding of the variability of engaged targets, although this is a costly endeavour. Furthermore, target profiling in resistant or non-resistant cell types,566 or in cell lines where the compounds have displayed highly potent activities would also be of interest. Curcumin has shown potent in vitro activities against breast cancer stem cells and so target profiling may help to unravel its mode of action in this context.567, 568 Sulforaphane has also been noted for its efficacy against breast cancer stem cells.569, 570

3. The isoTOP-ABPP platform developed by Cravatt and co-workers is an elegant alternative for profiling the targets of electrophilic natural products without the need to design specific probes against the compound of interest as was implemented here.56, 79, 243 The application of a pan-reactive electrophile probe such as NEM ABP has shown initial promise in these studies and preliminary results have suggested its potential to be applied using a competition-based isoTOP-ABPP workflow. Direct comparison of the chemical proteomic workflow developed here to the isoTOP-ABPP workflow would be of interest to determine the advantages of each approach.

4. The chemical proteomics workflow offers the possibility to screen a range of analogues of curcumin, piperlongumine and sulforaphane in competition with their respective ABPs to explore the SAR surrounding their target profiles using an MS-based readout. Competition assays of the relevant ABP against THC and THP showed that the Michael acceptors of curcumin and piperlongumine are necessary for covalent target engagement (Chapter 5.3.4). Further screening of analogues with selective modifications to the parent compound scaffolds will allow the contribution of different functional groups towards each target to be determined simultaneously in a quantitative manner. Individual targets could also be studied using WB analysis, to further understand target engagement and occupancy. Taken together, this improvement in SAR may provide starting points for designing more selective inhibitors or interactors towards designated targets based on the curcumin, piperlongumine and sulforaphane scaffolds.

5. Determination of the functional importance of covalent binding of the electrophilic natural product to relevant targets in cancer such as STAT proteins (STAT1 and STAT3), NF-κB components (NFKB2) and kinases (AXL and EGFR) is required. A variety of biochemical and biophysical assays are available both in vitro and in vivo to determine the functional impact of each target on the anticancer activities of the electrophilic natural products. Further work could also be done to ascertain which targets are most important for known phenotypes such as the induction of apoptosis or autophagy.

6. Bridging the gap between identification of a target using such a chemical proteomic workflow and determining its functional importance is the identification of the individual amino acid modification site for each target under study. The modification sites of 46 proteins were identified for sulforaphane but no sites were identified for curcumin or piperlongumine (Chapter 5.6). Further development of the workflow may well lead to an improvement in the number of site identifications. Identification of modification sites on KEAP1 has remained elusive under endogenous cellular conditions. Implementation of SRM/MRM approaches into the LC-MS/MS setup could allow the selective detection of modification sites on KEAP1 by the ABP, which may well be at very low abundance and hence absent from detection under the current workflow.477

7. The ABPs of curcumin, piperlongumine and sulforaphane require further biological characterisation if they are to be applied as universal tool molecules for target capture. Further cellular assays (including cell cycle analysis, specific cell death and inflammatory assays) to confirm that each ABP retains the range of activities of its respective parent compound are required. This is particularly important for in vivo application of the ABPs where competition-based assays may not be feasible and target identification may have to be made with the ABP alone.

8. Non-covalent targets of the electrophilic natural products should not be discounted and may contribute to their observed biological activities. Identification of these targets could be explored using affinity pulldown approaches, immobilising the compound on resin. Alternatively, the synthesis and application of each compound as an AfBP would allow non-covalent target associations to be captured by photo-crosslinking.

9. The multi-target action of these electrophilic natural products puts great emphasis on using systems biology-based approaches for explaining their activities. On-going collaborations with other research groups are attempting to identify key targets and potential sub-networks within the target sets. Further exploration of these target networks within Cytoscape to understand the network topology is also warranted. The identification of susceptible nodes within cancer networks that arise on account of interacting with or disrupting the activities of the hundreds of nodes that are targets of the electrophilic natural products is the long term goal of such approaches. The development of a model able to predict that applying pressure on electrophilic natural product nodes within a cancer network would make it susceptible to treatment at a single node by a second agent, thereby giving rise to a synergistic combination, could potentially unlock the therapeutic potential of multi-target agents such as those under study. This is many years or even decades from reality; however, addressing the targets of these agents in a systematic way, as has been carried out here, is the first step towards developing and testing such computational models.

10. It is also important to appreciate the downstream effects of electrophilic natural product treatment on the up- and down-regulation of the proteome to understand secondary or indirect mediators that may participate in the biological response. A handful of studies have been performed, none of them in the MDA-MB-231 and MCF7 cell lines.515, 519, 571-573 Therefore, complementing the direct target identifications with downstream proteomic expression changes in response to electrophilic natural product treatment will provide a more comprehensive picture of the mechanism of action in these cell lines. The utilisation of a SILAC-based proteome-wide analysis of changes in protein expression was recently reported by Mann and co-workers to detect downstream effects of HSP90 inhibition by 17-DMAG.574

11. It is also of interest to determine cancer tissues and cell lines that are highly sensitive to the three electrophilic natural products. Curcumin, piperlongumine and sulforaphane are all currently being systematically screened against a panel of roughly 1000 genetically characterised cancer cell lines as part of the Genomics of Drug Sensitivity in Cancer (GDSC) project.575 These cell lines encompass the diverse range of tissue type and genetic diversity of human cancers, with cancer cell viability sensitivities to compound treatment mapped to the status of particular genes.576 Utilisation of this information, when available, will allow sensitive and non-sensitive cell lines for each compound to be identified, which could then have their targets profiled and compared to identify important target discrepancies. It may also identify genetic sensitivities to compound treatment that, in combination with the target data generated by chemical proteomics, further aid dissecting the mode of action of these compounds. This will also be of great assistance for recommendations of how best to apply these compounds in the clinic. Preliminary data obtained for piperlongumine showed no significant genetic sensitivities towards cancer cell viability, but did reveal a diverse range of activities for different cell lines that could be further investigated. The data for curcumin and sulforaphane are currently pending.

8. Materials and methods

8.1 Chemical synthesis

8.1.1 General procedures

The reagents used during all synthetic processes were obtained from commercial sources (Sigma-Aldrich, VWR, Fisher Scientific) and used without further purification. In addition, piperlongumine was obtained from Indofine Chemical Company, USA. PC and NC were synthesised as reported previously.326 Sulforaphane ABP 1 and 2 were both originally synthesised by Elisabeth Storck and also reported previously.327 Sulforaphane ABP 2 was however re-synthesised as reported below.

Reactions were followed by TLC using aluminium-backed silica plates (Merck, TLC Silica Gel 60, F254) and visualised under UV irradiation at 254 nm or using a variety of stains. Flash column chromatography was carried out either on hand-made columns with Merck Silica 60 Å or using a Biotage Isolera™ One flash purification system with a wet-loading Biotage SNAP cartridge, collecting fractions using a UV detector at 254 nm when appropriate.

The purity of the compound was determined using NMR spectroscopy, accurate MS and LC-MS analysis. All NMR chemical shifts are quoted using tetramethylsilane (TMS) as the reference (δC/H = 0 ppm). 1H and 13C NMR spectroscopy were both carried out on a Bruker AV-400 spectrometer. For 1H NMR spectroscopy, the residual solvent peak used as an internal reference was CDCl3 (δH = 7.26 ppm), CD3OD (δH = 3.31 ppm) or DMSO-d6 (δH = 2.50 ppm) and chemical shifts are reported as: (multiplicity, coupling constant J (Hz), number of protons). For 13C NMR spectroscopy, the residual solvent peak used as an internal reference was again CDCl3 (δC = 77.2 ppm), CD3OD (δC = 49.0 ppm) or DMSO-d6 (δC = 39.5 ppm). MS was performed using chemical ionisation (CI), electron ionisation (EI) or ESI on an AUTOSPEC P673 spectrometer by the Chemistry Department Mass Spectrometry Service at Imperial College London. LC-MS analysis and purification were carried out on a Waters HPLC system equipped with a 2767 autosampler, a 515 pump, a 3100 mass spectrometer with ESI, and a 2998 Photodiode Array Detector (detection at 200-600 nm). The system was fitted with Waters XBridge C18 columns (4.6 mm × 100 mm for analytical and 19 mm × 100 mm for preparative LC-MS). The flow rate was 1.2 mL min⁻¹ for analytical and 20 mL min⁻¹ for preparative LC-MS, and the runtime was 18 min. MeOH and water, both containing 0.1 % of formic acid, were used as mobile phases.

8.1.2 Feruloyl acetone (5-hydroxy-1-(4-hydroxy-3-methoxyphenyl)-1,4-hexadien-3-one)

2,4-Pentanedione (4.0 mL, 40.0 mmol) was added to a 25 mL round bottom flask with boric anhydride (2.0 g, 28.7 mmol) and tributylborate (4.0 mL, 20.0 mmol) with stirring. After 30 min, vanillin (3.0 g, 20.0 mmol) was added and the reaction mixture stirred at 90 °C under reflux for a further 35 min. The reaction was cooled to 70 °C and n-butylamine (1.4 mL, 26.0 mmol) added dropwise. The reaction mixture was returned to reflux at 90 °C for 2 h before being quenched with 0.4 M HCl (45 mL) at 50 °C and allowed to cool to room temperature with further stirring for 40 min. The product was extracted with EtOAc (3 × 50 mL). The combined organic fractions were then washed with water (2 × 150 mL) and brine (1 × 150 mL), dried over sodium sulfate, filtered and concentrated under vacuum to yield a dark red oil. The crude mixture was purified by flash column chromatography with hexane/EtOAc (4:1) to yield a bright yellow solid, feruloyl acetone (726 mg, 16 %). 1H NMR: δH/ppm (400 MHz, CDCl3) 9.65 (s, 1H), 7.49 (d, J = 15.9 Hz, 1H), 7.29 (d, J = 1.8 Hz, 1H), 7.11 (dd, J = 1.8, 8.2 Hz, 1H), 6.80 (d, J = 8.2 Hz, 1H), 6.65 (d, J = 15.9 Hz, 1H), 5.84 (s, 1H), 3.82 (s, 3H), 2.12 (s, 3H). 13C NMR: δC/ppm (101 MHz, CDCl3) 196.7, 178.3, 149.2, 148.0, 140.3, 126.3, 122.9, 119.7, 115.7, 100.5, 55.7, 26.4. LC-MS: ES(+) 235.1 [M+H]+, retention time 10.26 min (25-98 % MeOH gradient). ES+ HRMS: found 235.0983 (C13H15O4, [M+H]+, requires 235.0970).
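As a consistency check on the quantities reported for feruloyl acetone, the percentage yield can be recomputed from the masses given. The sketch below assumes that vanillin is the limiting reagent (it is used at half the molar amount of 2,4-pentanedione), that the condensation has 1:1 stoichiometry, and that rounded standard atomic weights are adequate; the function names are hypothetical.

```python
# Standard atomic weights (g/mol), rounded; sufficient for a yield estimate.
ATOMIC_WEIGHT = {"C": 12.011, "H": 1.008, "O": 15.999}


def molar_mass(composition):
    """Molar mass (g/mol) from an element -> atom count mapping."""
    return sum(ATOMIC_WEIGHT[element] * count for element, count in composition.items())


def percent_yield(product_mg, limiting_mg, limiting_mw, product_mw):
    """Percentage yield assuming 1:1 stoichiometry with the limiting reagent."""
    limiting_mmol = limiting_mg / limiting_mw
    theoretical_mg = limiting_mmol * product_mw
    return 100.0 * product_mg / theoretical_mg


VANILLIN = {"C": 8, "H": 8, "O": 3}             # C8H8O3
FERULOYL_ACETONE = {"C": 13, "H": 14, "O": 4}   # C13H14O4, neutral form of the C13H15O4 [M+H]+ ion

yield_pct = percent_yield(
    product_mg=726.0,     # isolated feruloyl acetone
    limiting_mg=3000.0,   # vanillin charged (3.0 g)
    limiting_mw=molar_mass(VANILLIN),
    product_mw=molar_mass(FERULOYL_ACETONE),
)
print(f"{yield_pct:.1f} %")
```

Under these assumptions the calculation gives about 15.7 %, in line with the 16 % quoted above.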

8.1.3 4-methoxy-3-propargyloxy-benzaldehyde

Vanillin (1.0 g, 6.6 mmol) was dissolved in a suspension of K2CO3 (1.4 g, 9.7 mmol) in anhydrous DMF (5 mL) and left to stir at room temperature for 5 min in a 10 mL round bottom flask. Propargyl bromide (1.1 mL, 9.9 mmol) was then added and the reaction left to stir at 30 °C for 7 h. The reaction was quenched by the addition of water (25 mL) and extracted with EtOAc (3 × 25 mL). The combined organic extracts were washed with water (2 × 25 mL), dried over magnesium sulfate, filtered and concentrated under vacuum to yield 4-methoxy-3-propargyloxy-benzaldehyde as a yellow solid (1.3 g, 95 %). 1H NMR: δH/ppm (400 MHz, CDCl3) 9.83 (s, 1H), 7.44-7.39 (m, 2H), 7.11 (d, J = 8.2 Hz, 1H), 4.83 (d, J = 2.4 Hz, 2H), 3.90 (s, 3H), 2.56 (t, J = 2.4 Hz, 1H). 13C NMR: δC/ppm (101 MHz, CDCl3) 191.0, 152.2, 150.1, 131.0, 126.3, 112.6, 109.5, 77.5, 76.8, 56.6, 56.1. LC-MS: ES(+) 191.1 [M+H]+, retention time 10.98 min (5-98 % MeOH gradient). CI HRMS: found 208.0979 (C11H14NO3, [M+NH4]+, requires 208.0974).
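The agreement between the found and required HRMS values can be expressed as a mass error in ppm. The sketch below is illustrative only: it sums monoisotopic atomic masses for the ion formula quoted above and, as an assumption of this sketch, does not subtract the electron mass (about 0.0005 Da), which reproduces the quoted "requires" value to four decimal places. The function names are hypothetical and the simple formula parser does not handle brackets or charges.

```python
import re

# Monoisotopic masses (Da) of the most abundant isotopes.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.00307401, "O": 15.99491462}


def parse_formula(formula):
    """Parse a simple formula such as 'C11H14NO3' into element counts."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + (int(number) if number else 1)
    return counts


def monoisotopic_mass(formula):
    """Sum of monoisotopic atomic masses (electron mass not subtracted)."""
    return sum(MONOISOTOPIC[el] * n for el, n in parse_formula(formula).items())


def ppm_error(found, required):
    """Signed mass error in parts per million."""
    return (found - required) / required * 1e6


required = monoisotopic_mass("C11H14NO3")   # [M+NH4]+ ion formula from the text
print(f"calculated {required:.4f}, error {ppm_error(208.0979, 208.0974):.1f} ppm")
```

For the values reported here the error is roughly 2.4 ppm.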

8.1.4 Curcumin ABP 1 ((1E,6E)-1-(4-hydroxy-3-methoxyphenyl)-7-(3-methoxy-4-(prop-2-ynyloxy)phenyl)hepta-1,6-diene-3,5-dione)

A suspension of boric anhydride (43 mg, 0.61 mmol) and feruloyl acetone (72 mg, 0.31 mmol) was stirred at 80 °C for 20 min in EtOAc (2 mL) under nitrogen in a 10 mL round bottom flask. Tributylborate (165 µL, 0.61 mmol) and 4-methoxy-3-propargyloxy-benzaldehyde (47 mg, 0.25 mmol) were then added to the suspension and stirred for a further 20 min. Piperidine (13 µL, 0.17 mmol) in 1 mL EtOAc was then added dropwise to the reaction mixture over 15 min. The reaction was then left to stir at 80 °C for 45 min. The reaction was cooled to 50 °C and acidified with 1 mL of 0.5 M HCl and cooled to room temperature with stirring for 1 h. The mixture was diluted with water (2 mL) and the product extracted with EtOAc (3 × 10 mL). The combined organic fractions were then washed with water (2 × 30 mL), dried over sodium sulfate, filtered and concentrated under vacuum to yield a dark red oil. The crude mixture was purified on the Isolera with a hexane/EtOAc gradient to yield a crystalline orange solid, curcumin ABP 1 (54 mg, 43 %). 1H NMR: δH/ppm (400 MHz, CDCl3) 7.59 (d, J = 15.6 Hz, 2H), 7.11 (m, 3H), 7.04 (d, J = 8.4 Hz, 2H), 6.93 (d, J = 8.2 Hz, 1H), 6.49 (dd, J = 9.6, 15.8 Hz, 2H), 5.81 (s, 1H), 4.81 (d, J = 2.4 Hz, 2H), 3.94 and 3.93 (2s, 6H), 2.54 (t, J = 2.4 Hz, 1H). 13C NMR: δC/ppm (101 MHz, CDCl3) 183.8, 149.9, 148.8, 148.0, 146.9, 140.9, 140.2, 129.4, 127.8, 123.1, 122.7, 122.2, 121.9, 115.0, 113.9, 110.5, 109.8, 101.5, 78.2, 56.7, 56.1. LC-MS: ES(+) 407.4 [M+H]+, retention time 10.81 min (50-98 % MeOH gradient). ES+ HRMS: found 407.1500 (C24H23O6, [M+H]+, requires 407.1495).

8.1.5 3-propargyloxy-4-hydroxybenzaldehyde

To an oven dried two-neck flask with a nitrogen inlet, sodium hydride (347 mg, 14.5 mmol) was added into anhydrous DMSO (3.6 mL). While stirring and cooling the solution to 0 °C, 3,4-dihydroxybenzaldehyde (1.0 g, 7.2 mmol) dissolved in anhydrous DMSO (4.4 mL) was added dropwise into the flask. The reaction was left to stir for 1 h. Propargyl bromide (0.78 mL, 7.2 mmol) was then added dropwise and the reaction was left to stir at room temperature overnight. The mixture was then poured onto ice, neutralised with 1 M HCl and the product extracted with EtOAc (5 × 50 mL). The combined organic extracts were then reduced in volume to 100 mL under vacuum before being washed with brine (5 × 100 mL), dried over magnesium sulfate, filtered and concentrated under vacuum to yield a brown viscous oil. This was flash chromatographed with an eluent of neat DCM to yield 3-propargyloxy-4-hydroxybenzaldehyde (137 mg, 11 %). 1H NMR: δH/ppm (400 MHz, CDCl3) 9.83 (s, 1H), 7.53 (d, J = 1.7 Hz, 1H), 7.47 (dd, J = 1.7, 8.1 Hz, 1H), 7.07 (d, J = 8.1 Hz, 1H), 6.3 (s, 1H), 4.84 (d, J = 2.4 Hz, 2H), 2.59 (t, J = 2.4 Hz, 1H). 13C NMR: δC/ppm (101 MHz, CDCl3) 190.9, 152.1, 145.3, 129.9, 128.1, 115.2, 111.2, 77.3, 77.1, 57.1. CI HRMS: found 194.0828 (C10H12NO3, [M+NH4]+, requires 194.0817).

8.1.6 Curcumin ABP 2 (1,7-[3-methoxy-4-hydroxyphenyl][3-butynyloxy-4-hydroxyphenyl]hepta-1,6-diene-3,5-dione)

A suspension of boric anhydride (22 mg, 0.31 mmol) and 2,4-pentanedione (32 µL, 0.31 mmol) was stirred at 80 °C for 30 min in DMF (0.5 mL) under nitrogen in a 5 mL round bottom flask. Tributylborate (335 µL, 1.2 mmol) was then added to the suspension. After 30 min, vanillin (50 mg, 0.33 mmol) and 3-propargyloxy-4-hydroxybenzaldehyde (50 mg, 0.23 mmol) were both added. n-Butylamine (12 µL, 0.12 mmol) in 100 µL DMF was then added dropwise to the reaction mixture over 40 min. The reaction was then left to stir at 80 °C for 4 h. The reaction was acidified with 2.6 mL of 0.5 M HCl upon cooling to room temperature to yield a black sticky residue. The mixture was diluted with water (25 mL) and the product extracted with EtOAc (3 × 30 mL). The combined organic fractions were then washed with water (5 × 100 mL), dried over magnesium sulfate, filtered and concentrated under vacuum to yield a brown solid. The crude solid was flash chromatographed with an eluent of DCM/MeOH (99:1) to yield an orange crystalline solid. Trace amounts of DMF were evident in the 1H NMR spectrum, so the compound was washed further with diethyl ether to yield the final curcumin ABP 2 product (44 mg, 36 %). 1H NMR: δH/ppm (400 MHz, CDCl3) 9.86 (s, 1H), 9.66 (s, 1H), 7.54 (dd, J = 6.3, 15.8 Hz, 2H), 7.36 (dd, J = 1.6, 27.1 Hz, 2H), 7.18 (ddd, J = 1.7, 8.3, 21.1 Hz, 2H), 6.85 (dd, J = 8.2, 15.8 Hz, 2H), 6.75 (td, J = 4.1, 15.6, 15.6 Hz, 2H), 6.08 (t, J = 2.9 Hz, 1H), 4.82 (d, J = 2.3 Hz, 2H), 3.84 (s, 3H), 3.59 (t, J = 2.3 Hz, 1H). 13C NMR: δC/ppm (101 MHz, CDCl3) 149.9, 149.4, 148.0, 145.7, 140.8, 126.4, 124.0, 123.2, 121.3, 121.1, 116.3, 115.7, 114.0, 111.3, 101.0, 79.3, 78.4, 56.2, 55.7. LC-MS: ES(+) 393.4 [M+H]+, retention time 9.18 min (50-98 % MeOH gradient). ES- HRMS: found 391.1194 (C23H19O6, [M-H]-, requires 391.1182).

8.1.7 3-propargyl-2,4-pentanedione


To an oven dried 250 mL round bottom flask was added K2CO3 (6.7 g, 48.3 mmol), 2,4-pentanedione (4.0 mL, 40.0 mmol) and propargyl bromide (910 μL, 8.1 mmol) dissolved in acetone (90 mL) with gentle agitation. The reaction was refluxed for 48 h. The K2CO3 was filtered off and the crude mixture concentrated under vacuum. The crude mixture was purified by flash column chromatography with an eluent of hexane/EtOAc (14:1) to yield 3-propargyl-2,4-pentanedione as a colourless liquid (147 mg, 13 %). 1H NMR: δH/ppm (400 MHz, CDCl3) 3.85 (t, J = 7.6 Hz, 0.5H), 3.11 (d, J = 2.7 Hz, 1H), 2.70 (dd, J = 2.7, 7.6 Hz, 1H), 2.25 (s, 3H), 2.22 (s, 3H), 2.02 (t, J = 2.7 Hz, 1H). 13C NMR: δC/ppm (101 MHz, CDCl3) 202.4, 191.1, 106.6, 81.8, 80.4, 71.0, 68.9, 66.8, 29.5, 23.3, 17.5. CI HRMS: found 156.1028 (C8H14NO2, [M+NH4]+, requires 156.1025).
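The reagent charges in this procedure can be expressed as equivalents relative to the limiting reagent. The sketch below assumes, from the quoted amounts and the basis of the reported yield, that propargyl bromide is limiting; the helper name is illustrative only.

```python
def equivalents(mmol_by_reagent, limiting):
    """Express each reagent charge as molar equivalents of the limiting reagent."""
    reference = mmol_by_reagent[limiting]
    return {name: mmol / reference for name, mmol in mmol_by_reagent.items()}


# Charges quoted in the procedure above (mmol).
charges = {
    "K2CO3": 48.3,
    "2,4-pentanedione": 40.0,
    "propargyl bromide": 8.1,
}

eq = equivalents(charges, limiting="propargyl bromide")
for name, value in eq.items():
    print(f"{name}: {value:.2f} equiv.")
```

On these figures the diketone and base are each used in roughly five- to six-fold excess over the alkylating agent.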

8.1.8 Curcumin ABP 3 ((1E,4Z,6E)-3-hydroxy-1,7-bis(4-hydroxy-3-methoxyphenyl)-4-(prop-2-ynyl)hepta-1,4,6-trien-5-one)


A suspension of boric anhydride (36 mg, 0.51 mmol) and 3-propargyl-2,4-pentanedione (100 µL, 0.74 mmol) was stirred at 40 °C for 40 min in EtOAc (0.5 mL) under nitrogen in a 10 mL round bottom flask. A solution of tributylborate (792 µL, 2.9 mmol) and vanillin (203 mg, 1.3 mmol) in EtOAc (1.3 mL) was then slowly added and the reaction stirred at 40 °C for a further 20 min. N-butylamine (108 µL, 1.1 mmol) in 1 mL EtOAc was then added dropwise to the reaction mixture over 10 min. The reaction was then left to stir at 40 °C overnight. The reaction was acidified with 1.5 mL of 1 N HCl and the reaction stirred at 50 °C for 40 min. The mixture was diluted with water (2 mL) and the product extracted with EtOAc (3 x 3 mL). The combined organic fractions were then washed with water (2 x 12 mL) and brine (2 x 12 mL), dried over magnesium sulfate, filtered and concentrated under vacuum to yield a dark red oil. The crude mixture was purified on the Isolera with a hexane/EtOAc gradient to yield an orange crystalline solid product, curcumin ABP 3 (121 mg, 41 %) (present as a 1:1 mixture of its keto/enol tautomers). 1H NMR: δH/ppm (400 MHz, CDCl3) 7.72 (d, J = 15.8 Hz, 2H), 7.66 (d, J = 15.8 Hz, 2H), 7.19 (dd, J = 1.9, 8.3 Hz, 2H), 7.12 (dd, J = 1.9, 8.3 Hz, 2H), 7.07 (d, J = 1.8 Hz, 2H), 7.03 (d, J = 1.8 Hz, 2H), 7.01 (s, 1H), 6.96 (d, J = 5.0 Hz, 2H), 6.93 (d, J = 6.4 Hz, 2H), 6.90 (s, 1H), 6.70 (d, J = 15.8 Hz, 2H), 4.34 (t, J = 7.2 Hz, 1H), 3.95 (s, 6H), 3.91 (s, 6H), 3.44 (d, J = 2.6 Hz, 2H), 2.91 (dd, J = 2.6, 7.5 Hz, 2H), 2.16 (t, J = 2.6 Hz, 1H), 2.02 (t, J = 2.7 Hz, 1H). 13C NMR: δC/ppm (101 MHz, CDCl3) 193.5, 192.8, 149.0, 148.1, 147.0, 146.9, 145.5, 142.6, 128.1, 126.7, 124.4, 123.0, 121.1, 118.0, 115.0, 110.3, 109.9, 81.0, 70.7, 69.8, 63.5, 56.2, 19.3, 18.0, 16.4, 14.3. LC-MS: ES(+) 407.4 [M+H]+, retention time 10.31 min (50-98 % MeOH gradient). ES+ HRMS: found 407.1506 (C24H23O6, [M+H]+, requires 407.1495).

8.1.9 2-(4-chloro-butyl)-2-methyl-[1,3]dioxolane


To 6-chloro-2-hexanone (1.0 mL, 7.6 mmol) in toluene (50 mL) was added p-toluenesulfonic acid (77 mg, 0.40 mmol) and ethylene glycol (0.85 mL, 15.2 mmol). The suspension was fitted with a Dean-Stark apparatus and stirred at 160 °C for 4 h. The clear solution was diluted with EtOAc (200 mL) and washed with saturated sodium bicarbonate (2 x 150 mL) and brine (2 x 150 mL). The organic layer was dried over sodium sulfate, filtered and concentrated under vacuum to yield a yellow/brown liquid. The crude product was purified on the Isolera eluting in a gradient of hexane/EtOAc. The product was isolated as a clear liquid (1.0 g, 75 %). 1H NMR: δH/ppm (400 MHz, CDCl3) 3.93 (m, 4H), 3.53 (t, J = 6.7 Hz, 2H), 1.79 (m, 2H), 1.66 (m, 2H), 1.55 (m, 2H), 1.31 (s, 3H). 13C NMR: δC/ppm (101 MHz, CDCl3) 110.0, 64.8, 45.1, 38.5, 32.8, 23.9, 21.6. ES+ HRMS: found 179.0843 (C8H16O2Cl, [M+H]+, requires 179.0839).

8.1.10 S-ethyl-(prop-2-ynyl)-[4-(2-methyl-[1,3]dioxolan-2-yl)-butyl]-thiocarbamate


To a suspension of propargyl amine (170 μL, 2.6 mmol), K2CO3 (304 mg, 2.2 mmol) and potassium iodide (308 mg, 1.9 mmol) in DMF (6 mL) in a 10 mL round bottom flask was added a solution of 2-(4-chloro-butyl)-2-methyl-[1,3]dioxolane (303 mg, 1.7 mmol) in DMF (1 mL). The orange suspension was stirred at 70 °C overnight and at room temperature for a further 48 h. The suspension was diluted with EtOAc (40 mL) and washed with brine (4 x 40 mL). The organic layer was dried over sodium sulfate, filtered and concentrated under vacuum to yield a brown oil. The intermediate was dissolved in DCM (10 mL) and cooled to 0 °C in an ice bath. DIPEA (890 μL, 5.1 mmol) was added dropwise, followed by S-ethyl chlorothiolformate (420 μL, 4.0 mmol). The solution was stirred at 0 °C for 2 h. The solution was diluted with DCM (20 mL) and washed with 1 M HCl (1 x 40 mL), saturated sodium bicarbonate (1 x 40 mL) and brine (1 x 40 mL). The organic layer was dried over magnesium sulfate, filtered and concentrated under vacuum to yield an orange liquid. The crude mixture was purified on the Isolera eluting in a gradient of hexane/EtOAc. The product was isolated as a light yellow oil (184 mg, 38 %). 1H NMR: δH/ppm (400 MHz, CDCl3) 4.19 (s, 2H), 3.93 (m, 4H), 3.43 (t, J = 7.3 Hz, 2H), 2.92 (q, J = 7.3 Hz, 2H), 2.23 (s, 1H), 1.65 (m, 4H), 1.42 (m, 2H), 1.28 (m, 6H). 13C NMR: δC/ppm (101 MHz, CDCl3) 110.0, 64.8, 47.4, 38.9, 28.0, 25.0, 23.9, 21.4, 15.4. ES+ HRMS: found 286.1478 (C14H24NO3S, [M+H]+, requires 286.1477).

8.1.11 S-ethyl-(prop-2-ynyl)-(5-oxo-hexyl)-thiocarbamate


To a solution of S-ethyl-(prop-2-ynyl)-[4-(2-methyl-[1,3]dioxolan-2-yl)-butyl]-thiocarbamate (180 mg, 0.63 mmol) in THF (10.4 mL) was added 2 N HCl (2.6 mL) dropwise. The clear solution was stirred at room temperature for 3 h. The solution was diluted with EtOAc (25 mL) and washed with distilled water (1 x 25 mL), saturated sodium bicarbonate (1 x 25 mL) and brine (1 x 25 mL). The organic layer was dried over sodium sulfate, filtered and concentrated under vacuum. The crude mixture was purified on the Isolera eluting in a gradient of hexane/EtOAc. The product was isolated as a clear oil (87 mg, 68 %). 1H NMR: δH/ppm (400 MHz, CDCl3) 4.18 (s, 2H), 3.44 (s, 2H), 2.92 (q, J = 7.4 Hz, 2H), 2.48 (t, J = 6.7 Hz, 2H), 2.25 (s, 1H), 2.14 (s, 3H), 1.60 (m, 4H), 1.28 (t, J = 7.4 Hz, 3H). 13C NMR: δC/ppm (101 MHz, CDCl3) 208.4, 208.4, 78.5, 168.4, 72.2, 47.0, 43.0, 30.0, 27.2, 25.0, 20.7, 15.2. ES+ HRMS: found 242.1203 (C12H20NO2S, [M+H]+, requires 242.1215).

8.1.12 Sulforaphane ABP 2 (S-ethyl-(prop-2-ynyl)-(5-oxo-hexyl)-thiocarbamate sulfoxide)


A solution of S-ethyl-(prop-2-ynyl)-(5-oxo-hexyl)-thiocarbamate (74 mg, 0.31 mmol) in DCM (2.5 mL) was cooled to -78 °C in a sealed 10 mL round bottomed flask. 3-chloroperbenzoic acid (68 mg, 0.30 mmol) in DCM (1.5 mL) was added dropwise. The reaction flask was vented through a needle and the reaction solution stirred at -78 °C for 50 min. The solvent was removed under vacuum and the solid residue taken up in EtOAc (20 mL). The organic layer was washed with saturated sodium bicarbonate (1 x 20 mL) and brine (1 x 20 mL), dried over sodium sulfate, filtered and concentrated under vacuum. The crude mixture was purified on the Isolera eluting in a gradient of hexane/EtOAc. The product was isolated as a clear oil (12 mg, 15 %). 1H NMR: δH/ppm (400 MHz, CDCl3) 4.53 (qd, J = 2.5, 18.3 Hz, 1H), 4.24 (m, 1H), 3.82 (m, 0.5H), 3.67 (dd, J = 7.2, 14.6 Hz, 0.5H), 3.54 (td, J = 3.1, 7.0 Hz, 1H), 3.07 (m, 2H), 2.50 (q, J = 6.5 Hz, 2H), 2.34 (dt, J = 2.5, 24.1 Hz, 1H), 2.14 (d, J = 2.3 Hz, 3H), 1.63 (m, 5H), 1.39 (dt, J = 2.5, 8.9 Hz, 3H). 13C NMR: δC/ppm (101 MHz, CDCl3) 208.3, 168.2, 73.9, 73.5, 48.2, 45.8, 45.7, 45.4, 42.7, 42.6, 36.7, 35.2, 30.0, 27.9, 26.1, 20.5, 20.3, 7.0, 6.9. LC-MS: ES(+) [M+H]+, retention time 5.81 min (50-98 % MeOH gradient). ES+ HRMS: found 280.0977 (C12H19NO3SNa, [M+Na]+, requires 280.0983).

8.1.13 Sulforaphane ABP 3 (4-butynylsulfinyl-1-(isothiocyanate)-butane)


Part (i)

To a solution of 4-amino-1-butanol (500 mg, 4.4 mmol) and K2CO3 (1.5 g, 11.0 mmol) in THF (4.3 mL) and water (4.3 mL) was added di-tert-butyldicarbonate (1.2 g, 5.3 mmol) at room temperature. The reaction was stirred overnight before being poured into a biphasic mixture of EtOAc (21 mL) and 1 M HCl (21 mL) at 0 °C. The aqueous layer was extracted with EtOAc (2 x 21 mL) and the combined organic fractions washed with 1 M HCl (50 mL) and brine (50 mL) before being dried over magnesium sulfate, filtered and concentrated under vacuum to yield a pale yellow oil (950 mg, 5.0 mmol). The oil was then dissolved in DCM (8.5 mL) and triethylamine (1.9 mL, 13.3 mmol) and cooled to 0 °C. To this was added p-toluenesulfonylchloride (1.0 g, 5.3 mmol) at 0 °C and the reaction stirred for 1.5 h. The reaction mixture was then poured into a biphasic mixture of EtOAc (32 mL) and water (32 mL) at 0 °C and the aqueous phase extracted with EtOAc (2 x 21 mL). The organic phases were subsequently combined, washed with saturated ammonium chloride (21 mL) and brine (21 mL), dried over magnesium sulfate, filtered and concentrated under vacuum to yield a brown oil. The crude product was purified by flash column chromatography with a hexane/EtOAc eluent gradient (20:1 to 1:1) to yield 4-(tert-butoxycarbonylamino)butyl p-toluenesulfonate (582 mg, 30 % over 2 steps) as a light brown oil. 1H NMR: δH/ppm (400 MHz, CDCl3) 7.81 (d, J = 8.1 Hz, 2H), 7.37 (d, J = 8.1 Hz, 2H), 4.06 (t, J = 6.3 Hz, 2H), 3.10 (t, J = 6.9 Hz, 2H), 2.48 (s, 3H), 1.69 (m, 2H), 1.54 (m, 2H), 1.45 (s, 9H).

Part (ii)

4-(tert-butoxycarbonylamino)butyl p-toluenesulfonate (500 mg, 1.5 mmol) was then dissolved in DMF (1.5 mL) and potassium thioacetate (250 mg, 2.2 mmol) added. The reaction mixture was stirred at 60 °C under nitrogen for 90 min before being poured into a biphasic mixture of EtOAc (14 mL) and water (14 mL) at 0 °C. The aqueous layer was extracted with EtOAc (2 x 15 mL), all organic phases were then combined, washed with brine (2 x 10 mL), dried over magnesium sulfate, filtered and concentrated under vacuum. The crude product was then purified by flash column chromatography with a hexane/EtOAc (8:2) eluent to yield S-6-(tert-butoxycarbonylamino)butyl thioacetate (314 mg, 87 %) as a straw coloured oil. 1H NMR: δH/ppm (400 MHz, CDCl3) 4.57 (s, 1H), 3.16 (m, 2H), 2.90 (t, J = 7.0 Hz, 2H), 2.35 (s, 1H), 1.59 (m, 4H), 1.46 (s, 9H). 13C NMR: δC/ppm (101 MHz, CDCl3) 195.9, 40.0, 30.7, 29.2, 28.7, 28.4, 26.9. LC-MS: ES(+) 248.3 [M+H]+, retention time 9.67 min (50-98 % MeOH gradient).

Part (iii)

S-6-(tert-butoxycarbonylamino)butyl thioacetate (55 mg, 0.22 mmol) was dissolved in 1,4-dioxane (1.5 mL) and 0.5 M NaOMe (1.2 mL) was added. Following stirring at room temperature, 3-butynyl p-toluenesulfonate (60 µL, 0.30 mmol) was added and the reaction was left to stir overnight. The reaction was quenched by the addition of water (10 mL) and extracted initially with DCM (4 x 15 mL) and then EtOAc (3 x 20 mL). The separate organic extracts were then washed with brine (50 mL), dried over magnesium sulfate, filtered and concentrated under vacuum to yield off-white oils. The EtOAc extract (24 mg) yielded product of considerably higher purity, as revealed by LC-MS analysis, and as such was taken forward for further reaction. 5-6 N HCl (in isopropanol) was added and left to stir at room temperature for 1 h. The reaction mixture was then basified with 4 M NaOH (3.5 mL) and extracted with CHCl3 (4 x 25 mL). The organic extracts were then combined and concentrated under vacuum. The crude mixture was then purified by preparative LC-MS with a 5-98 % MeOH gradient to yield an amber coloured residue (3.6 mg, 24 % over 2 steps). The residue (3.6 mg, 0.02 mmol) was dissolved in CDCl3 (0.8 mL) in a small glass vial after which 1 M NaOH (35 µL) and thiophosgene (2.5 µL, 0.03 mmol) were added. The reaction was left to stir for 4 h before the reaction was diluted down with CHCl3 and concentrated under vacuum to yield a yellow solid substance. The substance (4.0 mg, 0.02 mmol) was re-dissolved in MeOH (600 µL) in a microcentrifuge tube and H2SO4 (1 µL), iPrOH (1 µL) and 30 % H2O2 (2.5 µL, 0.02 mmol) were all added. The reaction was gently vortexed for 2 h before being quenched with water (5 mL), extracted with CHCl3 (2 x 5 mL), organic phases combined, dried over magnesium sulfate, filtered and concentrated under vacuum. The crude product was purified by preparative LC-MS with a 50-98 % MeOH gradient to yield the sulforaphane ABP 3 (4-butynylsulfinyl-1-(isothiocyanate)butane) as a straw coloured oil (2.3 mg, 47 % over 2 steps). 1H NMR: δH/ppm (400 MHz, CDCl3) 3.61 (t, J = 6.0 Hz, 2H), 2.89 (tt, J = 1.2, 7.2, 8.2 Hz, 2H), 2.77 (m, 4H), 2.09 (t, J = 2.8 Hz, 1H), 2.01-1.85 (m, 4H). LC-MS: ES(+) 216.0 [M+H]+, retention time 2.87 min.

8.1.14 (E)-1-(3-(4-hydroxy-3,5-dimethoxyphenyl)acryloyl)-5,6-dihydropyridin-2(1H)-one


Piperlongumine (300 mg, 0.95 mmol) was dissolved in anhydrous DCM (3.2 mL) in an oven-dried round bottom flask. Aluminium chloride (895 mg, 6.7 mmol) was added to the mixture and the reaction was stirred at room temperature for 3 h. The reaction mixture was then diluted with DCM (30 mL) and washed with saturated ammonium chloride (2 x 30 mL) and water (2 x 40 mL). The organic phase was dried over sodium sulfate, filtered and concentrated. (E)-1-(3-(4-hydroxy-3,5-dimethoxyphenyl)acryloyl)-5,6-dihydropyridin-2(1H)-one (200 mg, 69 %) was isolated by flash column chromatography with an eluent of DCM/MeOH (99:1). 1H NMR: δH/ppm (400 MHz, CDCl3) 7.69 (d, J = 15.5 Hz, 1H), 7.40 (d, J = 15.5 Hz, 1H), 6.95 (m, 1H), 6.82 (s, 2H), 6.05 (dt, J = 1.8, 9.7 Hz, 1H), 5.75 (s, 1H), 4.05 (t, J = 6.5 Hz, 2H), 3.92 (s, 6H), 2.48 (m, 2H). 13C NMR: δC/ppm (101 MHz, CDCl3) 169.0, 165.9, 147.2, 145.4, 144.3, 137.1, 126.6, 125.9, 119.6, 105.4, 56.4, 41.7, 24.8. LC-MS: ES(+) 304.1 [M+H]+, retention time 7.78 min (50-98 % MeOH gradient).

8.1.15 Piperlongumine ABP ((E)-1-(3-(4-O-propynyl-3,5-dimethoxyphenyl)acryloyl)-5,6-dihydropyridin-2(1H)-one)


(E)-1-(3-(4-hydroxy-3,5-dimethoxyphenyl)acryloyl)-5,6-dihydropyridin-2(1H)-one (32 mg, 0.10 mmol) was dissolved in acetonitrile (800 µL) and DBU (16 µL, 0.10 mmol) in a 10 mL pear-shaped round bottom flask and stirred at room temperature under nitrogen. Propargyl bromide (58 µL, 0.52 mmol) was added dropwise and the reaction left to stir overnight. The acetonitrile was removed under vacuum to yield a crude orange residue and the product was isolated by flash column chromatography with an eluent of toluene/EtOAc (5:1) to yield a pale white solid of piperlongumine ABP (25 mg, 69 %). 1H NMR: δH/ppm (400 MHz, CDCl3) 7.67 (d, J = 15.6 Hz, 1H), 7.42 (d, J = 15.6 Hz, 1H), 6.95 (m, 1H), 6.80 (s, 2H), 6.05 (dt, J = 1.8, 9.7 Hz, 1H), 4.76 (d, J = 2.4 Hz, 2H), 4.04 (t, J = 6.5 Hz, 2H), 3.89 (s, 6H), 2.49 (m, 2H), 2.44 (t, J = 2.4 Hz, 1H). 13C NMR: δC/ppm (101 MHz, CDCl3) 168.9, 165.9, 153.7, 145.6, 143.7, 137.3, 131.4, 125.8, 121.4, 105.4, 75.1, 60.0, 56.2, 41.7, 24.8. LC-MS: ES(+) 342.1 [M+H]+, retention time 8.86 min (50-98 % MeOH gradient). ES+ HRMS: found 342.1357 (C19H20NO5, [M+H]+, requires 342.1341).
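The percentage yields quoted throughout follow from the masses of starting material and product and their molar masses. The sketch below reproduces the figure for the piperlongumine ABP. The neutral formulae (C16H17NO5 for the phenol, C19H19NO5 for the probe) are inferred here from the compound names and the quoted ion formula, and the helper names and atomic-weight table are illustrative assumptions.

```python
# Standard average atomic weights (g/mol), rounded.
ATOMIC_WEIGHT = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}


def molar_mass(counts):
    """Average molar mass (g/mol) from an elemental composition."""
    return sum(ATOMIC_WEIGHT[element] * n for element, n in counts.items())


def percent_yield(start_mg, start_counts, product_mg, product_counts):
    """Yield (%) assuming 1:1 stoichiometry and the starting material as limiting."""
    start_mmol = start_mg / molar_mass(start_counts)
    product_mmol = product_mg / molar_mass(product_counts)
    return 100 * product_mmol / start_mmol


phenol = {"C": 16, "H": 17, "N": 1, "O": 5}  # assumed formula of the phenol precursor
probe = {"C": 19, "H": 19, "N": 1, "O": 5}   # neutral form of the quoted [M+H]+ ion

yield_percent = percent_yield(32, phenol, 25, probe)
print(f"{yield_percent:.0f} %")
```

With 32 mg of phenol and 25 mg of isolated probe, this gives a value that rounds to the 69 % reported above.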

8.1.16 Tetrahydrocurcumin (1,7-bis(4-hydroxy-3-methoxyphenyl)heptane-3,5-dione)


Curcumin (300 mg, 0.82 mmol) was dissolved in MeOH (20 mL) and 10 % Pd/C (34 mg) added. The reaction was stirred at room temperature in an atmosphere of hydrogen for 5 h. The reaction mixture was then filtered over celite and concentrated under vacuum. The crude product was purified on the Isolera using a hexane/EtOAc gradient to yield tetrahydrocurcumin as a pale white solid (167 mg, 50 %). 1H NMR: δH/ppm (400 MHz, CDCl3) 6.82 (d, J = 8.0 Hz, 2H), 6.66 (m, 4H), 5.55 (br s, 2H), 5.43 (s, 1H), 3.85 (s, 6H), 2.85 (t, J = 7.2 Hz, 4H), 2.55 (t, J = 8.2 Hz, 4H). 13C NMR: δC/ppm (101 MHz, CDCl3) 193.4, 146.5, 144.1, 132.7, 120.9, 114.4, 111.03, 99.9, 56.0, 40.5, 31.4. LC-MS: ES(+) 373.3 [M+H]+, retention time 8.95 min (50-98 % MeOH gradient). ES+ HRMS: found 373.1664 (C21H25O6, [M+H]+, requires 373.1651).


8.1.17 Tetrahydropiperlongumine (1-[3-(3,4,5-trimethoxyphenyl)propionyl]piperidin-2-one)


Piperlongumine (40 mg, 0.13 mmol) was dissolved in EtOH (5 mL) and 10 % Pd/C (5 mg) added. The reaction was stirred at room temperature in an atmosphere of hydrogen for 2 h. The reaction mixture was then filtered over celite and concentrated under vacuum. The crude product was purified using flash column chromatography with hexane/EtOAc (4:1) to yield tetrahydropiperlongumine as an off-white solid (31 mg, 77 %). 1H NMR: δH/ppm (400 MHz, CDCl3) 6.45 (s, 2H), 3.86-3.81 (2 x s, 9H), 3.70 (t, J = 6.2 Hz, 1H), 3.21 (t, J = 7.8 Hz, 2H), 2.90 (t, J = 7.8 Hz, 2H), 2.53 (t, J = 6.2 Hz, 2H), 1.81 (m, 4H). 13C NMR: δC/ppm (101 MHz, CDCl3) 176.3, 173.6, 153.2, 137.2, 136.3, 105.6, 61.0, 56.2, 44.2, 41.5, 35.0, 31.6, 22.6, 20.4. LC-MS: ES(+) 322.1 [M+H]+, retention time 6.46 min (50-98 % MeOH gradient). ES+ HRMS: found 344.1470 (C17H23NO5Na, [M+Na]+, requires 344.1474).

8.1.18 Acetylenic enone ABP (4-(3-methoxy-4-prop-2-ynyloxy-phenyl)-but-3-en-2-one)


4-methoxy-3-propargyloxy-benzaldehyde (50 mg, 0.26 mmol) was dissolved in acetone (242 µL) in a 5 mL pear-shaped round bottom flask and 2.5 M NaOH (150 μL) was added. The reaction mixture was stirred at room temperature for 3 h before being quenched with water (15 mL), extracted with EtOAc (2 x 15 mL), organic phases combined and subsequently washed with water (1 x 45 mL) and brine (1 x 45 mL), dried over magnesium sulfate, filtered and concentrated under vacuum. A fraction of the crude product (21 mg) was purified by flash column chromatography with a hexane/EtOAc gradient eluent (9:1 to 1:1) followed by toluene/EtOAc (18:1) to yield the acetylenic enone ABP as a yellow solid (12 mg, 57 %). 1H NMR: δH/ppm (400 MHz, CDCl3) 7.46 (d, J = 16.2 Hz, 1H), 7.13 (dd, J = 1.9, 8.3 Hz, 1H), 7.09 (d, J = 1.9 Hz, 1H), 7.04 (d, J = 8.3 Hz, 1H), 6.62 (d, J = 16.2 Hz, 1H), 4.81 (d, J = 2.4 Hz, 2H), 3.92 (s, 3H), 2.53 (t, J = 2.4 Hz, 1H), 2.38 (s, 3H). 13C NMR: δC/ppm (101 MHz, CDCl3) 198.5, 149.9, 149.1, 143.4, 128.6, 125.9, 122.6, 113.8, 110.3, 78.1, 76.5, 56.7, 56.1, 27.6. LC-MS: ES(+) 230.87 [M+H]+, retention time 6.59 min (50-98 % MeOH gradient). ES+ HRMS: found 231.1031 (C14H15O3, [M+H]+, requires 231.1021).

8.1.19 Acetylenic chalcone ABP (3-(4-hydroxy-phenyl)-1-(3-methoxy-4-prop-2-ynyloxy-phenyl)-propenone)


4-hydroxyacetophenone (35 mg, 0.26 mmol) was dissolved in 2.5 M NaOH (174 µL) and MeOH (261 µL) in a 5 mL pear-shaped round bottom flask and stirred at 50 °C for 10 min. 4-methoxy-3-propargyloxy-benzaldehyde (50 mg, 0.26 mmol) was then added and the reaction left to stir at the same temperature for 5 h. The reaction was quenched by the addition of water and extracted with EtOAc (30 mL); the organic phase was washed with brine (60 mL), dried over magnesium sulfate, filtered and concentrated under vacuum. The crude residue was purified by flash column chromatography with a hexane/EtOAc gradient (15:1 to 1:1) to yield acetylenic chalcone ABP as a yellow solid (29 mg, 36 %). 1H NMR: δH/ppm (400 MHz, CDCl3) 8.00 (d, J = 8.8 Hz, 2H), 7.76 (d, J = 15.6 Hz, 1H), 7.42 (d, J = 15.6 Hz, 1H), 7.24 (dd, J = 1.9, 8.4 Hz, 1H), 7.17 (d, J = 1.9 Hz, 1H), 7.06 (d, J = 8.3 Hz, 1H), 6.95 (d, J = 8.3 Hz, 2H), 6.09 (br s, 1H), 4.82 (d, J = 2.4 Hz, 2H), 3.95 (s, 3H), 2.54 (t, J = 2.4 Hz, 1H). 13C NMR: δC/ppm (101 MHz, CDCl3) 189.1, 160.2, 149.9, 149.1, 144.4, 131.4, 131.3, 129.3, 122.6, 120.4, 115.6, 113.9, 110.9, 78.1, 77.5, 76.8, 76.4, 56.8, 56.2. LC-MS: ES(+) 309.0 [M+H]+, retention time 9.26 min (50-98 % MeOH gradient). ES+ HRMS: found 309.1132 (C19H17O4, [M+H]+, requires 309.1127).

8.1.20 Benzaldehyde ABP (4-O-propynyl-3-methoxy-cinnamaldehyde)


4-hydroxy-3-methoxy-cinnamaldehyde (100 mg, 0.56 mmol) was dissolved in acetonitrile (4 mL) in a 5 mL round bottom flask under stirring. DBU (85 μL, 0.56 mmol) was added followed by propargyl bromide (238 μL, 2.2 mmol) and the reaction left to stir under a nitrogen balloon at room temperature overnight. The solvent was removed under vacuum and the crude mixture purified by flash column chromatography with an eluent of hexane/EtOAc (8:2). The product was obtained as a white solid (62 mg, 51 %). 1H NMR: δH/ppm (400 MHz, CDCl3) 9.69 (d, J = 7.7 Hz, 1H), 7.44 (d, J = 15.8 Hz, 1H), 7.19 (dd, J = 2.0, 8.3 Hz, 1H), 7.10 (dd, J = 5.1, 11.6 Hz, 2H), 6.64 (dd, J = 7.7, 15.8 Hz, 1H), 4.84 (d, J = 2.4 Hz, 2H), 3.94 (s, 3H), 2.57 (t, J = 2.4 Hz, 1H). 13C NMR: δC/ppm (101 MHz, CDCl3) 193.7, 152.7, 150.0, 149.7, 128.3, 127.3, 123.0, 113.7, 110.6, 110.1, 77.9, 76.6, 56.7, 56.1. LC-MS: ES(+) 217.2 [M+H]+, retention time 4.94 min (50-98 % MeOH gradient). ES+ HRMS: found 217.0869 (C13H13O3, [M+H]+, requires 217.0865).

8.1.21 N-ethylmaleimide ABP (1-prop-2-ynylpyrrole-2,5-dione)


Maleic anhydride (2.5 g, 25.0 mmol) was dissolved in 12.5 mL acetone under reflux and propargylamine (1.7 mL, 25.0 mmol) added dropwise. The solution was refluxed for 1 h and the solvent removed under vacuum. The crude mixture was purified by flash column chromatography with an eluent of hexane/EtOAc (4:1) to yield the intermediate maleic acid N-propargylmonoamide (2.3 g, 60 %). To effect ring closure, the intermediate (2.3 g, 14.9 mmol) was then dissolved in xylene (80 mL) and refluxed under a Dean-Stark apparatus for 8 h and the solvent removed under vacuum. The crude product was purified by flash column chromatography with an eluent of hexane/EtOAc (7:3) to yield N-ethylmaleimide ABP as an off-white solid (246 mg, 13 %). 1H NMR: δH/ppm (400 MHz, CDCl3) 6.73 (s, 2H), 4.24 (t, J = 2.2 Hz, 2H), 2.19 (t, J = 4.0 Hz, 1H). 13C NMR: δC/ppm (101 MHz, CDCl3) 169.3, 134.5, 71.6, 65.8, 26.8, 15.3. ES+ HRMS: found 154.0381 (C7H6NO2, [M+H]+, requires 154.0498).

8.1.22 Chloroacetamide ABP (2-chloranyl-N-hex-5-ynyl-ethanamide)


5-hexynenitrile (1.1 mL, 10.3 mmol) was dissolved in Et2O (100 mL) in a 250 mL round bottom flask under nitrogen and cooled to 0 ˚C. LiAlH4 (11.3 mL, 1.0 M in Et2O) was added dropwise and the reaction warmed to room temperature. Excess LiAlH4 was carefully quenched with water and the resulting precipitate removed by filtration. The organic layer was washed with brine and concentrated under vacuum, after which the 1-amino-hex-5-yne product was precipitated out as a hydrochloride salt by the slow addition of HCl (1.0 M in Et2O) and used immediately without further purification (319 mg, 23 %). 1-amino-hex-5-yne (100 mg, 1.0 mmol), chloroacetyl chloride (81 μL, 1.0 mmol) and triethylamine (144 μL, 1.0 mmol) dissolved in DCM (4 mL) were stirred at 0 ˚C under nitrogen for 6 h. The solvent was removed under vacuum and the crude product purified by preparative LC-MS with a 20-98 % MeOH gradient to yield chloroacetamide ABP as a pale yellow solid (18 mg, 10 %). 1H NMR: δH/ppm (400 MHz, MeOD) 4.02 (s, 1H), 3.35 (s, 1H), 3.24 (t, J = 6.9 Hz, 2H), 2.21 (m, 2H), 1.64 (qd, J = 5.5, 6.6, 7.4, 8.4 Hz, 2H), 1.54 (tdt, J = 4.8, 6.8, 7.4, 9.9 Hz, 2H). 13C NMR: δC/ppm (101 MHz, MeOD) 169.5, 84.8, 70.0, 50.1, 49.9, 43.4, 40.4, 29.5, 27.1, 18.9. LC-MS: ES(+) 173.9 [M+H]+, retention time 9.67 min (5-98 % MeOH gradient). EI HRMS: found 173.0609 (C8H12NOCl, [M]+•, requires 173.0607).

8.1.23 Acrylamide ABP (N-propargyl acrylamide)


Propargylamine (600 μL, 9.4 mmol) and DIPEA (2.0 mL, 11.2 mmol) were dissolved in dry DCM (3 mL) under nitrogen. Acryloyl chloride (910 μL, 11.2 mmol) in dry DCM (2 mL) was added slowly dropwise and the reaction left to stir at room temperature overnight. The solvent was removed under vacuum and the crude product re-dissolved in EtOAc (50 mL) and subsequently washed with sodium bicarbonate (3 x 50 mL), brine (2 x 50 mL) and water (2 x 50 mL). The organic layer was dried over magnesium sulfate, concentrated under vacuum and purified by flash column chromatography with an eluent of DCM/MeOH (9:1) to yield the acrylamide ABP as a pale yellow solid (193 mg, 19 %). 1H NMR: δH/ppm (400 MHz, CDCl3) 6.32 (dd, J = 1.6, 18.0 Hz, 1H), 6.11 (dd, J = 10.4, 18.0 Hz, 1H), 5.88 (br s, 1H), 5.69 (dd, J = 1.6, 10.2 Hz, 1H), 4.13 (dd, J = 2.4, 5.6 Hz, 2H), 2.25 (t, J = 2.6 Hz, 1H). 13C NMR: δC/ppm (101 MHz, CDCl3) 165.5, 130.3, 127.4, 79.4, 71.8, 29.3. LC-MS: ES(+) 109.8 [M+H]+, retention time 2.64 min (5-98 % MeOH gradient). EI HRMS: found 109.0533 (C6H7NO, [M]+•, requires 109.0528).

8.2 Biological and biochemical methods

8.2.1 General methods

Ultrapure water was obtained from a MilliQ® Millipore purification system. In-gel fluorescence was recorded using an ETTAN DIGE Imager (GE Healthcare) and chemiluminescence was recorded using a LAS-4000 Imaging System (GE Healthcare). Absorbance in 96-well plates was measured using a SpectraMax M2/M2e Microplate Reader (Molecular Devices). All biological and chemical reagents reported from here on were purchased from Sigma Aldrich unless otherwise specified.

Afatinib (Selleck Chemicals), citral, coniferyl aldehyde, dimethyl fumarate, NEM, gefitinib (Selleck Chemicals), IA, maleic anhydride, N-acetylcysteine, piperine, piperlongumine (Indofine Chemical Company, USA), propargyl isothiocyanate (Fluorochem, UK), phenethyl isothiocyanate, benzoquinone, GSH, resveratrol (Molekula, UK), tamoxifen (VWR, UK), (-)-thalidomide, theophylline, D,L-sulforaphane (Toronto Research Chemicals, Canada) and 30 % H2O2 (VWR, UK) were all purchased commercially. CP-380736, exemestane, nafoxidine, palbociclib, PF-998425, PF-477736, PF-573228, PP2 and epirubicin were all obtained directly from collaborators at Pfizer. CMK ABP was kindly provided by Tom Charlton within the Tate research group. All chemical compounds described were prepared as DMSO stocks for biological experiments unless otherwise stated, stored at -20 ˚C and thawed on the day of use, except for IA, which was prepared fresh on the day of use. Of the azide-functionalised capture reagents used for CuAAC, AzTB was originally synthesised by Dr Megan Wright (Imperial College London) as previously reported,577 and re-synthesised by a number of people within the Tate group. AzT was provided by Dr Julia Morales-Sanfrutos (Imperial College London), and AzRB and AzRB2 were kindly synthesised and provided by Elisabeth Storck-Saha (Imperial College London) and Dr Malgorzata Broncel (Imperial College London) respectively.

8.2.2 Cancer cell culture

MDA-MB-231 and MCF7 cells were obtained from CRUK cell services core facility and were cultured in DMEM supplemented with 10 % v/v FBS, incubated at 37 °C in a 10 % CO2 humidified incubator. Cells were grown on 96-well, 10 cm or 6 cm cell culture plates (Falcon or Corning). Cells were detached during passaging using trypsin (0.2 %). For quantitative proteomics involving ‘spike-in’, duplex or triplex SILAC, R0K0, R6K4 and R10K8 DMEM media were all purchased from Dundee Cell Products and supplemented with dialysed FBS at 10 % v/v. Cell dissociation buffer (enzyme-free, PBS-based), obtained from Gibco Life Technologies, was used instead of trypsin for cell detachment during passaging. All described experiments were carried out with cells at low passage number (< 25), and cells were generally plated out either 24 or 48 h prior to treatment. All cells were grown to 70-90 % confluence prior to commencing an experiment. Cells, when not in culture, were frozen for long-term storage at -150 ˚C in cell freezing medium (10 % DMSO in FBS).


8.2.3 Cell lysis

The protein concentration of cell lysates was determined using the BioRad Dc Protein Assay following the manufacturer’s instructions, detecting absorbance at 750 nm, using BSA as a protein standard. Lysates were stored at either -20 or -80 ˚C until further use.
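Protein quantification against a BSA standard amounts to fitting a standard curve and reading unknowns off it. The sketch below assumes a linear response over the working range, which is a simplification of the manufacturer's procedure rather than a description of it. The standard concentrations and absorbance readings are hypothetical values chosen purely for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def concentration(absorbance, slope, intercept):
    """Interpolate a protein concentration from a blank-uncorrected absorbance."""
    return (absorbance - intercept) / slope


# Hypothetical BSA standards (mg/mL) and their absorbance at 750 nm.
bsa_mg_per_ml = [0.0, 0.25, 0.5, 1.0, 1.5]
a750 = [0.05, 0.10, 0.15, 0.25, 0.35]

slope, intercept = fit_line(bsa_mg_per_ml, a750)
lysate_mg_per_ml = concentration(0.21, slope, intercept)  # hypothetical lysate reading
print(f"{lysate_mg_per_ml:.2f} mg/mL")
```

Readings that fall outside the range of the standards would need the sample to be diluted and re-measured rather than extrapolated.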

8.2.3.1 Whole cell protein lysis buffer (for in-cell applications)

Following cell culture media aspiration, cells were washed three times with 1 x PBS. Cells were then lysed on the plate with PBS-based (200-350 μL) whole cell lysis buffer (1 % NP-40, 1 % sodium deoxycholate, 0.1 % SDS, 150 mM NaCl, 1 x PBS, pH 7.6, Roche EDTA-free protease inhibitors) at room temperature for 5 min (for experiments carried out in the early stages of the PhD, 50 mM Tris-HCl was used as opposed to 1 x PBS). The lysate was then scraped from the plate, transferred to microcentrifuge tubes and kept on ice for 20 min. The lysates were then centrifuged at 17,000 x g for 25 min at 4 ˚C to pellet the insoluble cellular debris. Protein concentration of the resulting supernatant was determined.

8.2.3.2 Non-detergent or low-detergent cell protein lysis buffer (for in-lysate applications)

Cells were cultured to 90-100 % confluence on 10 cm plates, the cell culture media aspirated and the cells washed twice with 1 x PBS. Cells were then lysed in one of two ways (Method A and Method B). Following generation of the protein lysate, the protein concentration was determined and the lysate divided up into multiple 1, 2 and 4 mg/mL aliquots (to avoid freeze/thaw cycles), flash frozen in liquid nitrogen and stored at -80 ˚C until further use.

Method A 389, 578: Cells were detached with trypsin, pelleted by centrifugation (1000 x g, 5 min) and the cell pellet washed three times with 1 x PBS. The cell pellet was suspended in 1 x PBS and lysed by probe sonication with 4 x 15 s pulses at full power with 30 s intervals. Unbroken cells were removed by centrifugation (2400 x g, 5 min) at room temperature and the resulting supernatant extracted and centrifuged again at 17,000 x g for 25 min at 4 ˚C.

Method B 164: Cells were lysed on the plate with 400 μL HEPES lysis buffer (150 mM NaCl, 2 mM MgCl2, 0.1 % NP-40, 25 mM HEPES pH 7.5) and the resulting cell suspension scraped from the plate and transferred into microcentrifuge tubes. Samples were gently vortexed at 6 ˚C for 15 min followed by centrifugation (17,000 x g, 15 min) at 4 ˚C.

8.2.4 In-cell compound treatment

8.2.4.1 Single ABP compound dosing

The relevant ABP stock was diluted with cell culture media in a 10 mL conical centrifuge tube to give the required final ABP concentration (0.2 % DMSO final in most cases). The contents were mixed, gently pipetted over a plate of cells and left for the required incubation time prior to cell lysis (Chapter 8.2.3).
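The final DMSO percentage fixes both the volume of stock to add and the stock concentration needed for a given dose. The sketch below illustrates that arithmetic; the 10 mL media volume and 10 µM final concentration are hypothetical examples, and the function names are illustrative only.

```python
def stock_volume_ul(media_ml, final_dmso_percent):
    """Volume of DMSO stock (µL) giving the stated final % v/v DMSO in the media."""
    return media_ml * 1000 * final_dmso_percent / 100


def stock_concentration_mM(final_uM, final_dmso_percent):
    """Stock concentration (mM) required to reach `final_uM` at that DMSO percentage."""
    dilution_factor = 100 / final_dmso_percent  # e.g. 500-fold at 0.2 % DMSO
    return final_uM * dilution_factor / 1000


# Hypothetical example: 10 mL of media dosed at 10 µM with 0.2 % DMSO final.
volume = stock_volume_ul(10, 0.2)
stock = stock_concentration_mM(10, 0.2)
print(f"add {volume:.0f} µL of a {stock:.0f} mM stock")
```

At 0.2 % DMSO the stock is therefore 500 times more concentrated than the final dose.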


8.2.4.2 ABP competition assay dosing

The competition compound for pre-incubation was diluted with cell culture media in a 10 mL conical centrifuge tube to give the relevant final compound concentration (0.2-1.0 % DMSO final - dependent on compound solubility). The contents were mixed and gently pipetted onto a plate of cells. Cells were returned to the incubator for 30 min. In the meantime, the competition compound and the relevant ABP were diluted in fresh cell culture media in a 10 mL conical centrifuge tube to give the relevant final compound concentrations and the contents mixed (final % DMSO identical to above). Following aspiration of media from the plate of cells, this second media containing competition compound and ABP was pipetted over cells and returned to the incubator for a further 30 min. Cell lysis was then carried out as described (Chapter 8.2.3). Typically, for cell feeding experiments involving sulforaphane and piperlongumine, 0.2 % final DMSO was used. For curcumin and analogues thereof, 0.5-1.0 % final DMSO was used as a result of their reduced solubility in aqueous media.

8.2.5 In-lysate compound treatment

To generate heat denatured protein lysates, previously prepared lysates (Chapter 8.2.3.2) were thawed and then boiled at 90 ˚C for 5 min before being allowed to cool to room temperature.

8.2.5.1 Single ABP compound dosing

Protein lysates were prepared as 70 μL solutions at a protein concentration of 1 mg/mL. ABP compound stock (1.4 μL) was added to each sample (2 % DMSO final) and the samples gently vortexed at 37 ˚C for 20 min. Protein was then immediately precipitated with CHCl3/MeOH (see Chapter 8.2.7) to remove excess ABP, the protein pellet washed 1 x MeOH (10 vol.) and re-suspended in 5 μL 2 % SDS (in 1 x PBS) and diluted down with relevant lysis buffer to a final volume of 70 μL at a protein concentration of 1 mg/mL.

8.2.5.2 ABP competition assay dosing

Protein lysates were prepared as 70 μL solutions at a protein concentration of 1 mg/mL. Competition compound stock (1.4 μL) was added and samples vortexed at 37 ˚C for 45 min. ABP compound stock (1.4 μL) was then added (4 % DMSO final) and vortexed at 37 ˚C for a further 20 min. Protein was then precipitated with CHCl3/MeOH, the protein washed 1 x MeOH (10 vol.) and re-suspended in 5 μL 2 % SDS (in 1 x PBS) and diluted down with relevant lysis buffer to a final volume of 70 μL at a protein concentration of 1 mg/mL.
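The DMSO percentages quoted for the in-lysate treatments follow from the stock volumes added to each 70 μL sample. A minimal Python sketch of that arithmetic is given below. The function name is hypothetical, and taking the percentage relative to the starting lysate volume (rather than the slightly larger final volume) is an assumption about how the nominal values were rounded.

```python
def nominal_dmso_percent(stock_volumes_ul, lysate_volume_ul):
    """Nominal % (v/v) DMSO from DMSO stock additions.

    Assumption: the percentage is expressed relative to the starting
    lysate volume, not the volume after the additions.
    """
    return 100.0 * sum(stock_volumes_ul) / lysate_volume_ul


# Single ABP dosing: one 1.4 uL addition to 70 uL of lysate.
single_dose = nominal_dmso_percent([1.4], 70.0)

# Competition dosing: competitor (1.4 uL) followed by ABP (1.4 uL).
competition_dose = nominal_dmso_percent([1.4, 1.4], 70.0)

print(round(single_dose, 2), round(competition_dose, 2))
```

The same helper applies to the larger 200 μL proteomics samples, where each 4 μL addition contributes a nominal 2 % DMSO.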

8.2.6 Gel electrophoresis

Proteins were resolved by SDS-PAGE using a BioRad Mini-PROTEAN® Tetra Cell system and Bis-Tris gels in either MES or MOPS running buffer (supplied as 20 x stocks from Invitrogen and diluted accordingly). Gels were run at 80 V for the first 20 min (through the stacking gel) followed by 120-160 V thereafter (through the resolving gel), typically running the gel for a further 40-60 min. The composition of the Bis-Tris gels is shown in Table 7.


Samples were loaded onto the gel following boiling at 90 ˚C with 4 x NuPAGE LDS sample loading buffer (4 x SLB) containing 5 % β-mercaptoethanol. BioRad Precision Plus Protein All Blue Standard ladder was used for molecular weight comparison. For total protein visualisation, gels were stained with ‘blue silver’ Coomassie (9.2 % phosphoric acid, 10 % ammonium sulfate, 0.12 % Coomassie brilliant blue G-250 dye, 20 % MeOH in water).

Table 7. The composition of the Bis-Tris gels used for SDS-PAGE analysis

| Component | Volume (mL) for 10 mL resolving gel (12 % acrylamide) | Volume (mL) for 2.5 mL stacking gel (4 % acrylamide) |
|---|---|---|
| 30 % acrylamide/bis-acrylamide | 4.0 | 0.33 |
| 1.25 M Bis-Tris (pH 6.7) | 2.9 | 0.71 |
| Water | 3.0 | 1.45 |
| 10 % w/v ammonium persulfate (APS) | 0.1 | 0.025 |
| N,N,N’,N’-Tetramethylethylenediamine (TEMED) | 0.004 | 0.002 |

8.2.7 Click chemistry (CuAAC) and in-gel fluorescence

Protein lysates were thawed from storage at -20 or -80 ˚C on ice and were generally made up as 100 μL samples at a concentration of 1 mg/mL diluting with appropriate lysis buffer. A click reaction master mix was prepared freshly as follows: capture reagent (either AzT, AzTB, AzRB or AzRB2) (1 μL, 10 mM in DMSO stock concentration, 0.1 mM final concentration), CuSO4 (2 μL, 50 mM stock concentration, 1 mM final concentration), TCEP (2 μL, 50 mM stock concentration, 1 mM final concentration) and TBTA (1 μL, 10 mM in DMSO stock concentration, 0.1 mM final concentration). 6 μL of this click reaction master mix was then added to each 100 μL sample and left to vortex for 1 h at room temperature. This reaction is the CuAAC or click reaction. The reaction was then quenched by the addition of EDTA (final concentration 10 mM) and the protein precipitated by the addition of MeOH (4 vol.), CHCl3 (1 vol.) and water (3 vol.), followed by centrifugation at 17,000 x g for 2 min. The upper liquid phase was discarded before addition of MeOH (4 vol.) and centrifugation at 17,000 x g for 2 min to pellet the protein. The resulting protein pellet was washed 2 x MeOH (8 vol.) and air dried for 10 min.
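The master mix volumes above scale linearly with sample volume (6 μL per 100 μL of lysate). The following Python sketch reproduces that bookkeeping; the dictionary and function names are hypothetical, and the final concentrations are nominal values calculated against the sample volume before addition of the mix, which is an assumption consistent with the figures quoted in the text.

```python
# Per 100 uL sample: (stock volume in uL, stock concentration in mM),
# taken from the recipe in the text.
MASTER_MIX = {
    "capture reagent": (1.0, 10.0),
    "CuSO4": (2.0, 50.0),
    "TCEP": (2.0, 50.0),
    "TBTA": (1.0, 10.0),
}


def master_mix_volume_ul(sample_volume_ul):
    """Total master mix volume to add to one sample."""
    per_100_ul = sum(volume for volume, _ in MASTER_MIX.values())
    return per_100_ul * sample_volume_ul / 100.0


def nominal_final_mM(component, sample_volume_ul=100.0):
    """Nominal final concentration, relative to the starting sample volume."""
    volume_per_100, stock_mM = MASTER_MIX[component]
    added_ul = volume_per_100 * sample_volume_ul / 100.0
    return stock_mM * added_ul / sample_volume_ul


for name in MASTER_MIX:
    print(name, nominal_final_mM(name), "mM")
print(master_mix_volume_ul(100.0), "uL of master mix per 100 uL sample")
```

For a 300 μL sample the helper returns 18 μL of master mix, and for a 500 μL sample 30 μL.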

For in-gel fluorescence analysis, the protein pellet was re-suspended in 10 μL 2 % SDS (in 1 x PBS), 10 μL 100 mM EDTA, 40 μL 1 x PBS and 20 μL 4 x SLB (with β-mercaptoethanol) to give a final concentration of protein of 1.25 mg/mL. Samples were then boiled at 90 °C for 5 min. Following SDS-PAGE, gels were fixed in gel soaking solution (50 % water, 40 % MeOH, 10 % acetic acid) for 30 min and washed 2 x MilliQ water before in-gel fluorescence visualisation in the Cy3 channel (excitation wavelength 552 nm, emission wavelength 570 nm). Further data analysis of the derived image was performed with ImageQuant™ TL software.

8.2.8 Affinity enrichment of ABP-labelled proteins

Protein lysates were generally made up as 300 μL samples at a concentration of 2 mg/mL diluting with appropriate lysis buffer and clicked with 18 μL click reaction master mix (containing AzTB as the capture reagent) for 1 h before being quenched with EDTA and the protein precipitated with CHCl3/MeOH as described (Chapter 8.2.7). After protein precipitation, the protein pellet was re-suspended in 60 μL 2 % SDS (in 1 x PBS), 60 μL 100 mM EDTA, 6 μL 100 mM DTT, 6 μL 1 x Roche EDTA-free protease inhibitors, 318 μL 1 x PBS (final volume 450 μL). 75 μL of lysate was then taken and added to 25 μL 4 x SLB (with β-mercaptoethanol). Samples were boiled at 90 ˚C for 5 min. These samples were designated as the pre-pull down samples (PPD). To the remaining 375 μL of lysate, 75 μL 1 x PBS was added. 75 μL Neutravidin sepharose resin (Thermo Scientific), pre-washed three times with 0.2 % SDS (in 1 x PBS), was added to each sample (525 μL volume, protein concentration 1 mg/mL, 0.2 % SDS final). The samples were then left to gently vortex at room temperature for 2 h for affinity enrichment of biotin-functionalised proteins. The supernatant was then removed and a 75 μL aliquot taken and added to 25 μL 4 x SLB (with β-mercaptoethanol). Samples were boiled at 90 ˚C for 5 min. These samples were designated as the supernatant samples (SN). The Neutravidin sepharose resin from each sample was then washed 4 x 400 μL with 0.2 % SDS (in 1 x PBS). 75 μL 2 % SDS (in 1 x PBS) was then added to each sample, which was boiled at 90 ˚C for 15 min to dissociate enriched proteins from the resin. The resulting supernatant then had 25 μL 4 x SLB (with β-mercaptoethanol) added and the samples were boiled at 90 ˚C for a further 5 min. These samples were designated as the pulldown samples (PD). The samples were then loaded onto the SDS-PAGE gel: pre-pull down sample (PPD) 12 μL (10 μg of proteins), supernatant sample (SN) 15 μL (10 μg of proteins) and pulldown samples (PD) 15 μL (75 μg of proteins) and gel electrophoresis performed as described in Chapter 8.2.6.

8.2.9 Western blot analysis

Following gel electrophoresis, proteins were immediately transferred to PVDF membranes from the SDS-PAGE gel (non-fixed gel) using a dry blotting set-up with an iBlot® Gel Transfer Device (Invitrogen, Life Technologies) following the manufacturer’s guidelines. After transfer, membranes were blocked (in 5 % w/v dried skimmed milk in TBST (1 x TBST, 0.1 % Tween-20)) for 2 h at room temperature and incubated with the appropriate primary antibody (in blocking solution) with gentle agitation overnight at 4 ˚C (Table 8). The membranes were then washed 3 x TBST before being incubated with the appropriate secondary antibody (in blocking solution) with gentle agitation for 2 h at room temperature (Goat anti-Rabbit IgG-HRP secondary antibody (Invitrogen, dilution 1:5,000), Goat anti-Mouse IgG-HRP (BD Pharmingen, dilution 1:10,000), Donkey anti-Goat IgG-HRP (Abcam, dilution 1:5,000)). The membranes were then washed again 3 x TBST before being developed with Luminata Crescendo Western HRP substrate (Millipore) according to the supplier’s protocol and the resulting chemiluminescence visualised.

Table 8. List of antibodies used in the reported studies.

| Antigen | Mw (kDa) | Secondary Antibody | Supplier | Catalogue No. | Poly- or Mono-clonal | Dilution | Notes |
|---|---|---|---|---|---|---|---|
| STAT3 | 86 | Rabbit | SCB | SC-482 | Poly | 1:100 | - |
| STAT1 p84/p81 | 90 | Rabbit | SCB | SC-346 | Poly | 1:100 | - |
| FXR1 | 78 | Rabbit | SCB | SC-48783 | Poly | 1:200 | High background |
| α-tubulin | 50 | Mouse | SCB | SC-53646 | Mono | 1:500 | - |
| β-actin | 40 | Rabbit | SCB | SC-130656 | Poly | 1:500 | - |
| BID | 15/22 | Rabbit | CST | 2002S | Poly | 1:250 | - |
| PSMC1 | 49 | Rabbit | Atlas | HPA000872 | Poly | 1:750 | - |
| MARCKS | 70-90 | Rabbit | CST | D88D11 | Mono | 1:500 | - |
| IMPDH2 | 56 | Rabbit | Abcam | Ab131158 | Mono | 1:1000 | - |
| HSP90 | 90 | Mouse | SCB | SC-69703 | Mono | 1:200 | - |
| HCCS | 32 | Rabbit | Atlas | HPA002946 | Poly | 1:500 | - |
| GSTO1/2 | 31 | Mouse | SCB | SC-166040 | Mono | 1:100 | - |
| KHSRP | 73 | Rabbit | Abcam | Ab83291 | Poly | 1:1000 | No detection on pulldown sample |
| MIF | 12.5 | Rabbit | SCB | SC-20121 | Poly | 1:100 | No detection on pulldown sample |
| CDK2 | 35 | Rabbit | SCB | SC-163 | Poly | 1:1000 | No detection on pulldown sample |
| KEAP1 | 69 | Goat | SCB | SC-15246 | Poly | 1:100 | No detection at all for antibody |
| VDAC2 | 30 | Goat | Abcam | Ab37985 | Poly | 1:200 | No detection at all for antibody |

8.2.10 Cell viability assays (MTS assay)

For all cell viability assays, the MTS assay (Promega) was used as an indirect measure of cellular metabolic rate to determine cell viability. The assay was used following the manufacturer’s instructions with minor modifications, initially optimised by Dr William Heal (Imperial College London) and Dr Emmanuelle Thinon (Imperial College London). All experiments were carried out in at least duplicate (n = 2). All assays reported a Z’ > 0.9 and S/B value > 4.

8.2.10.1 Single compound assays

For single compound assays, MCF7 and MDA-MB-231 cells were seeded in 96-well plates 24 h before treatment. Cell suspensions were prepared at a concentration of 100 cells/μL (72 h experiment), 140 cells/μL (48 h experiment) or 180 cells/μL (24 h experiment) in standard growth media and 50 μL of cell suspension was added to each well corresponding to 5000, 7000 and 9000 cells/well respectively. 24 h after cell seeding, 50 μL of growth media containing either DMSO (0.3 % DMSO final; negative control), puromycin (2 μg/mL final, 0.3% DMSO final; positive control) or compound (0.3 % DMSO final) were added to appropriate wells to give a final volume of 100 μL per well. For compound dilutions, double or triple dilution series were prepared from the maximum concentration.
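The seeding numbers and dilution series described above reduce to simple arithmetic. The short Python sketch below illustrates it; the function names are hypothetical and the example top concentration is illustrative only, not a value from the reported experiments.

```python
def cells_per_well(cells_per_ul, seeding_volume_ul=50.0):
    """Cells delivered to one well from a suspension of known density."""
    return cells_per_ul * seeding_volume_ul


def dilution_series(max_concentration, fold, n_points):
    """Serial dilution from the maximum concentration (fold = 2 or 3)."""
    return [max_concentration / fold ** i for i in range(n_points)]


# Seeding densities quoted in the text for 72 h, 48 h and 24 h experiments.
for density in (100, 140, 180):
    print(density, "cells/uL ->", cells_per_well(density), "cells/well")

# Illustrative double dilution series from a hypothetical 100 uM top dose.
print(dilution_series(100.0, 2, 6))
```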

After compound treatment at the fixed time interval (24 h, 48 h or 72 h), the MTS reagent (Promega) in combination with PMS was added according to the manufacturer’s instructions (20 μL of MTS/PMS prepared stock added per well). Absorbance was read per 96-well plate at 490 nm by a spectrophotometer after 3-4 h incubation. The absorbance values were used to calculate the % cell viability for each sample normalised to both the positive and negative controls. The % cell viability was plotted against concentration and dose response curves fitted in GraphPad Prism 5 using a 4-parameter fit with variable slope (non-linear regression) constraining the maximum and minimum values at 100 and 0 respectively from which EC50 values were generated, reporting error at the 95 % confidence intervals (p < 0.05).
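A minimal Python sketch of the normalisation and of the constrained dose response model is given below. It assumes that the DMSO (negative control) wells define 100 % viability and the puromycin (positive control) wells define 0 %, and it writes the 4-parameter curve in its concentration form with the top and bottom fixed at 100 and 0. The function names are hypothetical, and the sketch only evaluates the model; it does not reproduce the regression carried out in GraphPad Prism.

```python
def percent_viability(a_sample, a_negative, a_positive):
    """Normalise an absorbance to the two plate controls.

    Assumption: negative control (DMSO) = 100 % viability,
    positive control (puromycin) = 0 % viability.
    """
    return 100.0 * (a_sample - a_positive) / (a_negative - a_positive)


def constrained_four_parameter(concentration, ec50, hill_slope):
    """Variable-slope curve with top = 100 and bottom = 0.

    With both plateaus constrained, only EC50 and the Hill slope are free.
    """
    return 100.0 / (1.0 + (concentration / ec50) ** hill_slope)


# Hypothetical absorbances at 490 nm, for illustration only.
print(percent_viability(0.70, a_negative=1.20, a_positive=0.20))
print(constrained_four_parameter(10.0, ec50=10.0, hill_slope=1.5))
```

By construction the model returns 50 % viability when the concentration equals the EC50, whatever the slope.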


8.2.10.2 Compound combination assays

For compound combination experiments, MCF7 and MDA-MB-231 cells were seeded in a 96-well plate 24 h before treatment. Cell suspensions at a concentration of 100 cells/μL (MDA-MB-231) and 120 cells/μL (MCF7) in standard growth medium were prepared and 50 μL of cell suspension added to each well of a 96-well plate corresponding to 5000 and 6000 cells/well respectively. 24 h after cell seeding, cells were treated with compound as follows: For each 96-well plate, 6 concentrations of a double dilution series around the EC50 value of compound A only, compound B only, compound A with a fixed concentration of compound B (around the EC25 of B) and compound A with a higher fixed concentration of compound B (around the EC40 of B) were applied to the respective wells (in 50 μL cell growth media). In the reported studies, compound A refers to the cancer therapeutic (afatinib, epirubicin, PF-477736, PF-573228, PP2 and tamoxifen) and compound B is the electrophilic natural product (curcumin, piperlongumine or sulforaphane). The fixed concentrations of curcumin applied were 20.0 μM and 13.2 μM for MDA-MB-231 cells and 25.0 μM and 12.5 μM for MCF7 cells. The fixed concentrations of piperlongumine applied were 3.3 μM and 4.2 μM for MDA-MB-231 cells and 2.1 μM and 4.2 μM for MCF7 cells. The fixed concentrations of sulforaphane applied were 13.8 μM and 17.8 μM for MDA-MB-231 cells and 11.1 μM and 13.2 μM for MCF7 cells. Positive (puromycin) and negative controls (DMSO) were also dosed to each 96-well plate. 0.3 % DMSO final for all wells.

Following compound treatment for 72 h, MTS/PMS reagents were added and absorbance read at 490 nm after a further 3-4 h incubation. Absorbance values were used to calculate the % cell viability for each sample normalised to both the positive and negative controls. The % cell viability values across triplicates were averaged and used to calculate the fraction of cells affected (Fa) whereby Fa = 1 – [(% cell viability)/100]. The Fa values for each sample were used to generate dose response curves for compound A, compound B and the combination therapy using the CompuSyn software (www.combosyn.com). Calculation of EC50 values (Dm), measures of sigmoidity (m) and correlation coefficients (r) was carried out with CompuSyn. Combination index (CI) values were also calculated with CompuSyn, which uses the methodology of Chou and Talalay for formal synergy analyses.545 Synergy is defined as a CI value < 1. CI values of 0.9-0.85 are deemed to show slight synergy (+), 0.85-0.7 moderate synergy (++), 0.7-0.3 synergy (+++) and < 0.3 strong synergy (++++). CI values around 1 are additive and > 1 are antagonistic. For all combinations where synergy is claimed, CI < 1 is observed across multiple combination concentrations, with r > 0.9 and m > 1, reflecting accurate and reliable determination of the interaction between compound A, compound B and their combination.
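The quantities above can be illustrated with a short Python sketch of the median-effect relationship and the two-drug combination index of Chou and Talalay. This is not the CompuSyn implementation: the function names are hypothetical, the Dm and m values in the example are invented for illustration, and the grading function returns no grade for CI ≥ 0.9 because the text gives no numerical boundary for "around 1".

```python
def fraction_affected(percent_viability):
    """Fa = 1 - (% cell viability / 100), as defined in the text."""
    return 1.0 - percent_viability / 100.0


def dose_for_effect(fa, dm, m):
    """Median-effect equation: dose of a single agent giving effect fa."""
    return dm * (fa / (1.0 - fa)) ** (1.0 / m)


def combination_index(dose_a, dose_b, fa, dm_a, m_a, dm_b, m_b):
    """Two-drug CI at the effect level fa produced by the combination."""
    return (dose_a / dose_for_effect(fa, dm_a, m_a)
            + dose_b / dose_for_effect(fa, dm_b, m_b))


def synergy_grade(ci):
    """Grades as listed in the text; None when CI >= 0.9."""
    if ci < 0.3:
        return "++++"
    if ci < 0.7:
        return "+++"
    if ci < 0.85:
        return "++"
    if ci < 0.9:
        return "+"
    return None


# Illustrative values only: both agents with Dm = 10 uM and m = 1.5,
# and a combination of 2 uM + 2 uM leaving 50 % of cells viable.
fa = fraction_affected(50.0)
ci = combination_index(2.0, 2.0, fa, dm_a=10.0, m_a=1.5, dm_b=10.0, m_b=1.5)
print(fa, ci, synergy_grade(ci))
```

In this sketch each dose in the combination is compared with the dose of the single agent that would be needed to reach the same Fa, so a CI below 1 means the combination reached that effect with less compound than expected for additivity.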

8.3 Chemical proteomics

8.3.1 General methods

The experimental details for seven distinct chemical proteomic workflows reported in Chapter 4 and Chapter 5 are discussed in Chapter 8.3.2. The processing steps downstream from the CuAAC for all chemical proteomic workflows are similar and are therefore outlined collectively in Chapter 8.3.3 onwards. For proteomics sample preparation, all buffers used were filtered using a 0.2 μm filter (Gilson) (except SDS-containing buffers, which were incompatible with such filters and as such were simply centrifuged prior to use to pellet insoluble components) and new, sterile pipette tips were also used to reduce the likelihood of potential contamination. Low binding microcentrifuge tubes (Protein LoBind tubes, Eppendorf) were used for all sample handling steps after cell lysis. The amount of peptide injected onto the LC-MS/MS for affinity purified peptide digests was the equivalent of 20-40 μg of the starting protein amount. For non-affinity purified peptide digests, 0.5-1.0 μg of the starting protein amount was injected.

SILAC-incorporated cell lines (MDA-MB-231 (R6K10), MDA-MB-231 (R10K8) and MCF7 (R10K8)) were established by growing the cell lines in the corresponding SILAC media. Cells were grown for > 7 passages prior to their use. The incorporation of the SILAC label for all three cell lines was determined by MS-based proteomics analysis (Chapter 8.3.4) of appropriate labelled protein lysate to be > 97 %, which was sufficient for use in all SILAC-based applications reported (Appendix Table 1).382
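The text does not state how the incorporation percentage was derived from the MS data. One common way to express label incorporation, shown in the Python sketch below purely as an assumption, is the heavy fraction of the total (heavy plus light) signal for each peptide, or equivalently a function of the heavy/light ratio. The function names and example intensities are hypothetical.

```python
def incorporation_percent(heavy_intensity, light_intensity):
    """Label incorporation as the heavy share of the total signal.

    Assumption: this is one conventional definition, not necessarily
    the calculation used for Appendix Table 1.
    """
    return 100.0 * heavy_intensity / (heavy_intensity + light_intensity)


def incorporation_from_ratio(heavy_to_light_ratio):
    """Same quantity expressed from a heavy/light ratio."""
    return 100.0 * heavy_to_light_ratio / (1.0 + heavy_to_light_ratio)


def is_sufficient(percent, threshold=97.0):
    """Compare against the > 97 % level quoted in the text."""
    return percent > threshold


# Hypothetical peptide intensities, for illustration only.
example = incorporation_percent(heavy_intensity=98.5, light_intensity=1.5)
print(example, is_sufficient(example))
```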

8.3.2 Compound treatment and lysate preparation

8.3.2.1 Initial target identification of ABPs

For large-scale, MS-based proteomic identification of the targets of ABPs applied to live, intact MDA-MB-231 cells (Chapter 4.1.3), compound treatment and cell lysis were as described previously (Chapter 8.2.3.1 and Chapter 8.2.4.1). Samples prepared were 20 μM curcumin ABP 1, 5 μM sulforaphane ABP 2, 2 μM piperlongumine ABP and DMSO for experimental batch 1 and 10 μM sulforaphane ABP 3 and DMSO for experimental batch 2. Following cell lysis, samples were prepared at a volume of 500 μL at 2 mg/mL (1 mg total protein) prior to the CuAAC reaction, corresponding to the larger scale required for MS-based proteomics workflows.

8.3.2.2 Duplex SILAC-based competition assays of ABPs against their parent compounds

The following method was used for MS-based proteomic identification of ABP targets competed against a single, excess concentration of their respective parent compound in live, intact MDA-MB-231 cells (Chapter 4.3). The method was also employed for the protein target identification of curcumin ABP 2 and curcumin ABP 3 (Chapter 4.1.3). All sample combinations are shown in Table 9.

Table 9. Experimental setup for duplex SILAC-based experiments to identify ABP targets competed by parent compounds in MDA-MB-231 cells.

| No. | ‘Light’ (R0K0) sample | ‘Heavy’ (R10K8) sample |
|---|---|---|
| 1 | 5 μM curcumin ABP 1 | 5 μM curcumin ABP 1 and 50 μM mono-O-propyl curcumin |
| 2 | 5 μM sulforaphane ABP 2 and 100 μM sulforaphane | 5 μM sulforaphane ABP 2 |
| 3 | 2 μM piperlongumine ABP | 2 μM piperlongumine ABP and 100 μM piperlongumine |
| 4 | 20 μM curcumin ABP 2 | 20 μM curcumin |
| 5 | 20 μM curcumin | 20 μM curcumin ABP 3 |

MDA-MB-231 cells (10 cm plates) previously established with R0K0 (‘light’) and R10K8 (‘heavy’) labelled proteomes were grown in R0K0 or R10K8 cell culture media respectively to 80 % confluence.


Initially, compound stock containing only the parent compound or DMSO was applied to the relevant ‘light’ or ‘heavy’ cells for 30 min, followed by aspiration of the media and exposure to the ABP and parent compound together for a further 30 min. Following media aspiration, cells were washed three times with 1 x PBS and lysed on the plate with 300 μL whole cell lysis buffer (1 % NP-40, 1 % sodium deoxycholate, 0.1 % SDS, 150 mM NaCl, 1 x PBS, pH 7.6, Roche EDTA-free protease inhibitors) followed by centrifugation and protein concentration determination for each sample. ‘Heavy’ and ‘light’ lysate pairs for each experiment (Table 9 No. 1-5) were then combined together in a 1:1 ratio to give a 200 μL lysate at a protein concentration of 2 mg/mL (200 μg ‘light’ lysate and 200 μg ‘heavy’ lysate) ready for the CuAAC reaction (Chapter 8.3.3).
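Combining a ‘light’ and a ‘heavy’ lysate at equal protein amounts requires converting each measured concentration into a volume and topping up with lysis buffer. The Python sketch below shows one way to do this, assuming the lysates are mixed directly from their measured stock concentrations; the function name and the example concentrations are hypothetical.

```python
def pair_volumes(conc_light, conc_heavy, protein_each_ug, final_volume_ul):
    """Volumes (uL) of light lysate, heavy lysate and buffer for a 1:1 mix.

    Concentrations are in mg/mL, so ug divided by mg/mL gives uL.
    """
    v_light = protein_each_ug / conc_light
    v_heavy = protein_each_ug / conc_heavy
    v_buffer = final_volume_ul - v_light - v_heavy
    if v_buffer < 0:
        raise ValueError("lysates too dilute for the requested final volume")
    return v_light, v_heavy, v_buffer


# Hypothetical measured concentrations of 4.0 and 5.0 mg/mL, combined to
# 200 ug + 200 ug in 200 uL (2 mg/mL total protein), as in the text.
print(pair_volumes(4.0, 5.0, protein_each_ug=200.0, final_volume_ul=200.0))
```

The check on the buffer volume flags lysates whose concentration is too low to reach the target amount within the final volume.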

8.3.2.3 Duplex SILAC-based competition assays of ABPs against their parent compounds comparing in-cell, in-lysate and in-cell/lysate targets

The following method was used for MS-based proteomic identification of ABP targets competed against a single, excess concentration of the respective parent compound determined in-cell, in-lysate or in-cell/lysate (Chapter 4.5). Curcumin, piperlongumine and sulforaphane were subjected to these studies. The experimental design consisted of 3 competition assays in-cell, 3 competition assays in-lysate and 3 competition assays in-cell/lysate (Table 10). All experimental samples were generated in duplicate (18 samples total).

Table 10. Experimental setup for the duplex SILAC-based experiments to identify ABP targets competed by parent compound in-cell, in-lysate and in-cell/lysate.

No. Sample Name ‘Light’ (R0K0) sample ‘Heavy’ (R10K8) sample 5 μM curcumin ABP 1 and 100 μM 1 Curcumin (In-cell) 5 μM curcumin ABP 1 curcumin 2 μM piperlongumine ABP and 100 μM 2 Piperlongumine (In-cell) 2 μM piperlongumine ABP piperlongumine 5 μM sulforaphane ABP 2 and 100 μM 3 Sulforaphane (In-cell) 5 μM sulforaphane ABP 2 sulforaphane 5 μM curcumin ABP 1 and 150 μM 4 Curcumin (In-lysate) 5 μM curcumin ABP 1 curcumin 2 μM piperlongumine ABP and 100 μM 5 Piperlongumine (In-lysate) 2 μM piperlongumine ABP piperlongumine 5 μM sulforaphane ABP 2 and 150 μM 5 μM sulforaphane ABP 2 and 150 μM 6 Sulforaphane (In-lysate) sulforaphane sulforaphane 5 μM curcumin ABP 1 and 100 μM 7 Curcumin (In-cell/lysate) 5 μM curcumin ABP 1 curcumin 2 μM piperlongumine ABP and 100 μM 8 Piperlongumine (In-cell/lysate) 2 μM piperlongumine ABP piperlongumine 5 μM sulforaphane ABP 2 and 100 μM 9 Sulforaphane (In-cell/lysate) 5 μM sulforaphane ABP 2 sulforaphane

MDA-MB-231 cells (10 cm plates) previously established with R0K0 and R10K8 labelled proteomes were grown in R0K0 or R10K8 cell culture media respectively to 70-90 % confluence. For in-cell experiments and in-cell/lysate experiments, compound stocks were diluted with relevant cell culture media up to the final concentration and applied to cells. For the in-cell experiments, cells were treated with the parent compound or DMSO vehicle for 30 min, followed by the ABP and parent compound together or the ABP alone for a further 30 min (0.2-0.5 % DMSO final). For in-cell/lysate experiments, cells were treated with parent compound or DMSO for 30 min (0.2-0.5 % DMSO final). Plates for all experiments (in-cell, in-lysate and in-cell/lysate) were washed three times with 1 x PBS and lysed on the plate with 500 μL HEPES lysis buffer (150 mM NaCl, 2 mM MgCl2, 0.1 % NP-40, 25 mM HEPES pH 7.5) as previously described (Chapter 8.2.3.2 Method B) and the protein concentration determined.

For in-lysate experiments, untreated protein lysates were prepared as individual 200 μL solutions at 2 mg/mL protein concentration in Protein LoBind tubes. These were then treated with parent compound stock or DMSO (4 μL) and vortexed at 37 ˚C for 40 min. ABP compound stock (4 μL) was then added (4 % DMSO final) and vortexed at 37 ˚C for a further 20 min. For in-cell/lysate experiments, lysates from cells treated with parent compound or DMSO were prepared as individual 200 μL solutions at 2 mg/mL protein concentration in Protein LoBind tubes, treated with ABP compound stock (4 μL) and vortexed at 37 ˚C for 20 min (2 % DMSO final). For both in-lysate and in-cell/lysate samples following compound treatment, protein was precipitated with CHCl3/MeOH, washed 1 x MeOH (10 vol.), re-suspended in 40 μL 2 % SDS (in 1 x PBS) and diluted down to a final volume of 150 μL with HEPES lysis buffer. The protein concentration was then determined and 150 μL solutions at 1 mg/mL (0.2 % SDS) prepared for each sample. SDS was added to the in-cell samples to a final concentration of 0.2 % to ensure all samples from in-cell, in-lysate and in-cell/lysate experiments had identical final buffer compositions.

‘Heavy’ and ‘light’ lysate pairs for each experiment (Table 10 No. 1-9) were then combined together in a 1:1 ratio to give a 300 μL lysate at a protein concentration of 1 mg/mL (150 μg ‘light’ lysate and 150 μg ‘heavy’ lysate) ready for the CuAAC reaction (Chapter 8.3.3).

8.3.2.4 ‘Spike-in’ SILAC-based optimisation experiment with sulforaphane ABP 2

The following method was used for the MS-based proteomic identification of sulforaphane ABP 2 targets using a ‘spike-in’ SILAC approach in MCF7 cells (Chapter 5.1). It was used to compare incorporation of the SILAC label at either the cell or protein level, and whole cell lysis versus a cytosolic and nuclear fractionation protocol. All experimental samples were generated in either duplicate or triplicate.

MCF7 cells (10 cm plates) were grown in standard cell culture conditions and treated with 5 μM ABP (0.2 % DMSO final). MCF7 cells previously established with R10K8 labelled proteomes were grown in ‘heavy’ media (R10K8) and treated with 20 μM ABP. Following compound treatment, cells were washed twice with 1 x PBS and treated in one of three ways:

i) Cells were lysed in whole cell lysis buffer (Table 11 No. 1). Cells were lysed on the plate with 280 μL whole cell lysis buffer (1 % NP-40, 1 % sodium deoxycholate, 0.1 % SDS, 150 mM NaCl, 1 x PBS, pH 7.6, Roche EDTA-free protease inhibitors), followed by centrifugation and protein concentration determination for both the normal and ‘spike-in’ SILAC samples. Samples were combined as 300 μL of normal whole cell lysate at 2 mg/mL and 100 μL of whole cell ‘spike-in’ SILAC lysate at 2 mg/mL to give a total lysate volume of 400 μL at 2 mg/mL (800 μg total protein).


ii) Cells were fractionated upon lysis to cytosolic and nuclear fractions (Table 11 No. 2-3). Cells were lifted from the plate into 1 x PBS using a cell scraper and transferred to Protein LoBind tubes. Cells were pelleted by centrifugation at 2000 x g for 5 min. The supernatant was discarded and the pellet re-suspended in 400 μL Buffer A (5 mM KCl, 0.5 mM MgCl2, 0.5 % NP-40, 25 mM HEPES pH 7.9, 1 x Roche EDTA-free protease inhibitors) and lightly vortexed at 4 ˚C for 15 min. Samples were then centrifuged at 600 x g for 2 min and the supernatant (cytosolic fraction) transferred to a new Protein LoBind tube. The pellet was then washed with 100 μL Buffer A, with the supernatant discarded, before being incubated with 200 μL Buffer B (350 mM NaCl, 10 % sucrose, 25 mM HEPES pH 7.9, 1 x Roche EDTA-free protease inhibitors) for 1 h at 4 ˚C. The sample was then centrifuged at 17,000 x g for 10 min at 4 ˚C and the supernatant (nuclear fraction) transferred to a new Protein LoBind tube. Protein concentration was determined for both normal and ‘spike-in’ SILAC samples and the samples combined. The cytosolic fraction pair consisted of 300 μL of normal cytosolic lysate at 2 mg/mL and 100 μL of cytosolic ‘spike-in’ SILAC lysate at 2 mg/mL, giving a total lysate volume of 400 μL at 2 mg/mL (800 μg total protein). The nuclear fraction pair consisted of 150 μL of normal nuclear lysate at 1 mg/mL and 50 μL of nuclear ‘spike-in’ SILAC lysate at 1 mg/mL, giving a total lysate volume of 120 μL at 1 mg/mL (120 μg total protein).

iii) Cells were lifted and combined together prior to cell lysis (Table 11 No. 4-5). Cells were lifted from the plate with trypsin, quenched with cell culture medium and centrifuged (1000 x g, 5 min). The cell pellet was washed with 1 x PBS and re-suspended in 1 mL 1 x PBS. Cells were then counted with a haemocytometer and normal and ‘spike-in’ SILAC cells combined together at a 3:1 ratio followed by fractionation upon cell lysis into cytosolic and nuclear fractions as described above. Protein concentration was determined and samples prepared as 400 μL at 2 mg/mL (800 μg total protein).

All produced lysates were then subjected to the CuAAC reaction (Chapter 8.3.3).

Table 11. Experimental setup for the investigation of the ‘spike-in’ SILAC approach for sulforaphane using a chemical proteomic approach.

| No. | Incorporation of ‘spike-in’ SILAC label | Cell lysis conditions | Normal sample | ‘Spike-in’ SILAC ‘heavy’ sample |
|---|---|---|---|---|
| 1 | Lysate combination | Whole cell lysate | 5 μM sulforaphane ABP 2 | 20 μM sulforaphane ABP 2 |
| 2 | Lysate combination | Cytosolic fraction | 5 μM sulforaphane ABP 2 | 20 μM sulforaphane ABP 2 |
| 3 | Lysate combination | Nuclear fraction | 5 μM sulforaphane ABP 2 | 20 μM sulforaphane ABP 2 |
| 4 | Cell combination | Cytosolic fraction | 5 μM sulforaphane ABP 2 | 20 μM sulforaphane ABP 2 |
| 5 | Cell combination | Nuclear fraction | 5 μM sulforaphane ABP 2 | 20 μM sulforaphane ABP 2 |

8.3.2.5 ‘Spike-in’ SILAC-based ABP competition assays for sulforaphane

The following method was used for the MS-based proteomic identification of sulforaphane ABP 2 targets competed against a concentration gradient of the sulforaphane parent compound in live, intact MCF7 and MDA-MB-231 cells (Chapter 5.2). Both the sample under investigation and the ‘spike-in’ SILAC reference were fractionated upon cell lysis into cytosolic and nuclear fractions. All experimental samples were generated in duplicate.


MDA-MB-231 and MCF7 cells (10 cm plates) were grown under standard cell culture conditions. Cells were first treated with parent compound (5 μM, 25 μM or 100 μM sulforaphane) or DMSO for 30 min, followed by 5 μM ABP and parent compound (5 μM, 25 μM or 100 μM sulforaphane) together or 5 μM ABP alone for a further 30 min (0.2 % DMSO final) (Table 12). Following compound treatment, cells were fractionated upon lysis into cytosolic and nuclear fractions as previously described (Chapter 8.3.2.4) and the protein concentration determined.

To generate the ‘spike-in’ SILAC ‘heavy’ lysates, 8 x 10 cm plates of MDA-MB-231 cells and 9 x 10 cm plates of MCF7 cells previously established with R10K8 labelled proteomes were grown in ‘heavy’ media (R10K8) to 90 % confluence. They were treated with 20 μM ABP for 30 min in ‘heavy’ media (0.2 % DMSO final). The media was aspirated and cells washed three times with 1 x PBS prior to cell lysis to generate cytosolic and nuclear fractions. Lysates from the same two fractions were pooled into a single master ‘spike-in’ SILAC ‘heavy’ lysate for each of the cytosolic and nuclear fractions for which the protein concentration was determined. A total protein amount for the master ‘spike-in’ SILAC lysate of 1.8 mg (nuclear) and 8.2 mg (cytosolic) for the MCF-7 cell line and 1.6 mg (nuclear) and 6.4 mg (cytosolic) for the MDA-MB-231 cell line were generated.

Cytosolic fraction lysates were made up as 300 μL lysates at 2 mg/mL and had 100 μL at 2 mg/mL of cytosolic ‘spike-in’ SILAC ‘heavy’ lysate added to give a total lysate volume of 400 μL at 2 mg/mL (800 μg total protein). Nuclear fraction lysates were made up as 90 μL lysates at 1 mg/mL and had 30 μL at 1 mg/mL of nuclear ‘spike-in’ SILAC ‘heavy’ lysate added to give a total lysate volume of 120 μL at 1 mg/mL (120 μg total protein). Lysates were then subjected to the CuAAC reaction (Chapter 8.3.3).

Table 12. Experimental setup for the ‘spike-in’ SILAC competition-based assays for sulforaphane in the MCF7 and MDA-MB-231 cell lines.

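| No. | Normal sample | ‘Spike-in’ SILAC ‘heavy’ sample |
|---|---|---|
| 1 | 5 μM sulforaphane ABP 2 | 20 μM sulforaphane ABP 2 |
| 2 | 5 μM sulforaphane ABP 2 and 5 μM sulforaphane | 20 μM sulforaphane ABP 2 |
| 3 | 5 μM sulforaphane ABP 2 and 25 μM sulforaphane | 20 μM sulforaphane ABP 2 |
| 4 | 5 μM sulforaphane ABP 2 and 100 μM sulforaphane | 20 μM sulforaphane ABP 2 |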

8.3.2.6 ‘Spike-in’ SILAC-based ABP competition assays for curcumin and piperlongumine

The following method was used for the MS-based identification of the targets of curcumin ABP 1 and piperlongumine ABP competed against a concentration gradient of their respective parent compounds (curcumin or PC and piperlongumine) or a single, excess concentration of reduced analogues of their parent compounds (THC and THP) in live, intact MDA-MB-231 cells (Chapter 5.3 and 5.5). A whole cell protein lysis protocol was employed for both the sample under investigation and the ‘spike-in’ SILAC reference. All experimental samples were generated in either duplicate or triplicate.

MDA-MB-231 cells (10 cm plates) were grown under standard cell culture conditions. Cells were first treated with the parent compound or DMSO for 30 min, followed by the ABP and parent compound together or the ABP alone for a further 30 min (0.2 % DMSO final for piperlongumine samples, 1 % DMSO final for curcumin and PC samples) (Table 13). Following compound treatment, cells were lysed with whole cell lysis buffer (1 % NP-40, 1 % sodium deoxycholate, 0.1 % SDS, 150 mM NaCl, 1 x PBS, pH 7.6, Roche EDTA-free protease inhibitors), transferred to Protein LoBind tubes, followed by centrifugation and subsequent protein concentration determination.

To generate the ‘spike-in’ SILAC ‘heavy’ lysates, two sets of 6 x 10 cm plates of MDA-MB-231 cells previously established with R10K8 labelled proteomes were grown in ‘heavy’ media (R10K8) to 90 % confluence. One set of MDA-MB-231 cells was treated with 20 μM curcumin ABP 1 and the other set with 8 μM piperlongumine ABP for 30 min in ‘heavy’ media (0.2 % DMSO final). The media was then aspirated and cells washed three times with 1 x PBS, followed by cell lysis with whole cell lysis buffer. Lysates from each set were then pooled to form a separate master ‘spike-in’ SILAC ‘heavy’ lysate for curcumin ABP 1 and piperlongumine ABP. The protein concentration of each was determined. A total protein amount of master ‘spike-in’ SILAC ‘heavy’ lysate of 5.3 mg for curcumin ABP 1 and 5.0 mg for piperlongumine ABP was generated.

Lysates for all samples were made up as 375 μL lysates at 2 mg/mL and had 125 μL at 2 mg/mL of the appropriate ‘spike-in’ SILAC ‘heavy’ lysate added to give a total lysate volume of 500 μL at 2 mg/mL (1 mg total protein). Lysates were then subjected to the CuAAC reaction (Chapter 8.3.3).

Table 13. Experimental setup for the ‘spike-in’ SILAC competition-based assays for curcumin and piperlongumine in the MDA-MB-231 cell line.

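| No. | Normal sample | ‘Spike-in’ SILAC ‘heavy’ sample |
|---|---|---|
| 1 | 5 μM curcumin ABP 1 | 20 μM curcumin ABP 1 |
| 2 | 5 μM curcumin ABP 1 and 5 μM curcumin | 20 μM curcumin ABP 1 |
| 3 | 5 μM curcumin ABP 1 and 20 μM curcumin | 20 μM curcumin ABP 1 |
| 4 | 5 μM curcumin ABP 1 and 50 μM curcumin | 20 μM curcumin ABP 1 |
| 5 | 5 μM curcumin ABP 1 and 100 μM curcumin | 20 μM curcumin ABP 1 |
| 6 | 5 μM curcumin ABP 1 and 150 μM curcumin | 20 μM curcumin ABP 1 |
| 7 | 5 μM curcumin ABP 1 and 100 μM THC | 20 μM curcumin ABP 1 |
| 8 | 5 μM curcumin ABP 1 | 20 μM curcumin ABP 1 |
| 9 | 5 μM curcumin ABP 1 and 5 μM PC | 20 μM curcumin ABP 1 |
| 10 | 5 μM curcumin ABP 1 and 20 μM PC | 20 μM curcumin ABP 1 |
| 11 | 5 μM curcumin ABP 1 and 50 μM PC | 20 μM curcumin ABP 1 |
| 12 | 5 μM curcumin ABP 1 and 100 μM PC | 20 μM curcumin ABP 1 |
| 13 | 5 μM curcumin ABP 1 and 150 μM PC | 20 μM curcumin ABP 1 |
| 14 | 2 μM piperlongumine ABP | 8 μM piperlongumine ABP |
| 15 | 2 μM piperlongumine ABP and 2 μM piperlongumine | 8 μM piperlongumine ABP |
| 16 | 2 μM piperlongumine ABP and 10 μM piperlongumine | 8 μM piperlongumine ABP |
| 17 | 2 μM piperlongumine ABP and 25 μM piperlongumine | 8 μM piperlongumine ABP |
| 18 | 2 μM piperlongumine ABP and 50 μM piperlongumine | 8 μM piperlongumine ABP |
| 19 | 2 μM piperlongumine ABP and 100 μM piperlongumine | 8 μM piperlongumine ABP |
| 20 | 2 μM piperlongumine ABP and 100 μM THP | 8 μM piperlongumine ABP |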

8.3.2.7 Triplex SILAC-based target identification of small molecule electrophile ABPs

The following method was used for the MS-based proteomic identification of the targets of a panel of alkyne-functionalised small molecule electrophiles in live, intact MDA-MB-231 cells (Chapter 5.7).

MDA-MB-231 cells (10 cm plates) previously established with R0K0 (‘light’), R6K4 (‘medium’) and R10K8 (‘heavy’) labelled proteomes were grown in R0K0, R6K4 or R10K8 cell culture media respectively to 70-80 % confluence. Cells were treated with compounds at the concentrations indicated for 30 min (0.25 % DMSO final) (Table 14). Cells were washed three times with 1 x PBS and lysed on the plate with 300 μL whole cell lysis buffer (1 % NP-40, 1 % sodium deoxycholate, 0.1 % SDS, 150 mM NaCl, 1 x PBS, pH 7.6, Roche EDTA-free protease inhibitors), followed by centrifugation and determination of the protein concentration of each sample. ‘Heavy’, ‘medium’ and ‘light’ lysates for each triple (Table 14 No. 1-5) were then combined in a 1:1:1 ratio to give a 450 μL combined lysate at a protein concentration of 2 mg/mL (300 μg ‘light’ lysate, 300 μg ‘medium’ lysate and 300 μg ‘heavy’ lysate). Lysates were then subjected to the CuAAC reaction (Chapter 8.3.3).

Table 14. Experimental setup for the triplex SILAC-based target identification of small molecule electrophile ABPs in MDA-MB-231 cells.

No.; ‘Light’ (R0K0) sample; ‘Medium’ (R6K4) sample; ‘Heavy’ (R10K8) sample
1; 20 μM chloromethylketone (CMK) ABP; 50 μM chloroacetamide (CA) ABP
2; 50 μM acetylenic enone (AE) ABP; 20 μM acetylenic chalcone (AC) ABP
3; 50 μM benzaldehyde (BA) ABP; 3 μM NEM ABP; 20 μM isothiocyanate (ITC) ABP
4; 15 μM sulforaphane ABP 2; 40 μM curcumin ABP 1
5; 4 μM piperlongumine ABP; DMSO
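The 1:1:1 combination of ‘light’, ‘medium’ and ‘heavy’ lysates can be worked through as follows. This is a minimal sketch: the function name `equal_mass_mix` and the stock concentrations are hypothetical, and topping up with lysis buffer to reach 2 mg/mL is an assumption that is not stated in the text.

```python
def equal_mass_mix(stock_conc_mg_per_ml, mass_ug_each=300.0, final_conc_mg_per_ml=2.0):
    """Volumes (uL) needed to combine equal protein masses at a target concentration.

    stock_conc_mg_per_ml maps a lysate label to its measured concentration.
    Since 1 mg/mL equals 1 ug/uL, mass (ug) / concentration (mg/mL) gives uL.
    """
    volumes = {label: mass_ug_each / conc for label, conc in stock_conc_mg_per_ml.items()}
    final_volume_ul = mass_ug_each * len(stock_conc_mg_per_ml) / final_conc_mg_per_ml
    buffer_ul = final_volume_ul - sum(volumes.values())
    if buffer_ul < 0:
        raise ValueError("a stock lysate is too dilute to reach the target concentration")
    volumes["buffer"] = buffer_ul  # assumed top-up with lysis buffer
    return volumes, final_volume_ul


# Hypothetical measured stock concentrations (mg/mL) for one triple.
volumes, final_volume_ul = equal_mass_mix({"light": 3.0, "medium": 2.5, "heavy": 4.0})
print(volumes, final_volume_ul)
```

With 300 μg of each lysate the final volume is 450 μL at 2 mg/mL, in agreement with the quantities quoted for the combined lysate.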

8.3.3 CuAAC, affinity enrichment and on-bead reduction, alkylation and trypsin digest

The following protocol was used for a standard experiment in which the total amount of protein lysate prior to the click reaction was 1 mg at a protein concentration of 2 mg/mL (500 μL volume). For smaller amounts of starting protein or smaller lysate volumes, the experimental quantities were scaled accordingly.

To the protein lysate was added 30 μL of a click reaction master mix (containing either AzTB, AzRB or AzRB2 as the capture reagent) for the CuAAC reaction, and samples were vortexed gently at room temperature for 1 h. The reaction was quenched by the addition of EDTA (final concentration 10 mM) and the protein precipitated with CHCl3/MeOH as previously described (Chapter 8.2.7). The resulting protein pellet was washed three times with MeOH (10 vol.) and air-dried for 10 min. The protein pellet was re-suspended in 100 μL 2 % SDS (in 1 x PBS), 100 μL 100 mM EDTA, 10 μL 100 mM DTT, 10 μL 1 x Roche EDTA-free protease inhibitors and 630 μL 1 x PBS (final volume 850 μL). Samples were centrifuged at 17,000 x g for 3 min after re-suspension and transferred to new Protein LoBind tubes. Neutravidin sepharose resin (Thermo Scientific) was washed three times with 0.2 % SDS (in 1 x PBS) before 150 μL of the resin slurry was added to each sample (1 mL volume, protein concentration 1 mg/mL, 0.2 % SDS final). The samples were then vortexed gently at room temperature for 2 h for affinity enrichment. The supernatant was then discarded and the Neutravidin sepharose resin washed 3 x with 1 % SDS (in 1 x PBS), 2 x with 4 M urea (in 50 mM AMBIC) and 4 x with 50 mM AMBIC (5 vol. per wash; each wash consisted of 2 min of vortexing followed by centrifugation to pellet the resin, after which the washings were discarded).
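The re-suspension volumes and final concentrations quoted for the enrichment step can be verified with a simple C1V1 = C2V2 calculation. The sketch below is illustrative only; the variable and function names are hypothetical, and the resin slurry is assumed to contribute volume but no additional SDS, EDTA or DTT.

```python
def final_concentration(stock_conc, stock_ul, total_ul):
    """C1 x V1 = C2 x V2 rearranged for the final concentration C2."""
    return stock_conc * stock_ul / total_ul


# Re-suspension components (uL) as listed in the protocol.
resuspension_ul = {
    "2 % SDS in PBS": 100.0,
    "100 mM EDTA": 100.0,
    "100 mM DTT": 10.0,
    "protease inhibitors": 10.0,
    "PBS": 630.0,
}
resin_slurry_ul = 150.0  # assumed to add volume only
protein_mg = 1.0

buffer_ul = sum(resuspension_ul.values())
total_ul = buffer_ul + resin_slurry_ul
sds_percent = final_concentration(2.0, resuspension_ul["2 % SDS in PBS"], total_ul)
edta_mM = final_concentration(100.0, resuspension_ul["100 mM EDTA"], total_ul)
dtt_mM = final_concentration(100.0, resuspension_ul["100 mM DTT"], total_ul)
protein_mg_per_ml = protein_mg / (total_ul / 1000.0)
print(buffer_ul, total_ul, sds_percent, edta_mM, dtt_mM, protein_mg_per_ml)
```

When a smaller amount of starting protein is used, every volume in `resuspension_ul` and `resin_slurry_ul` can be multiplied by the same factor, which leaves the final concentrations unchanged.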

After washing, proteins on the Neutravidin resin were reduced with 5 μL 100 mM DTT in 50 mM AMBIC at 55 ˚C for 30 min with gentle agitation. The resin was then washed 2 x with 50 mM AMBIC. Cysteines were alkylated with 5 μL 100 mM iodoacetamide in 50 mM AMBIC in the dark at room temperature. The resin was again washed 2 x with 50 mM AMBIC. The samples were then digested with trypsin (Sequencing Grade Modified Trypsin (Promega), 0.6 μg) at 37 ˚C overnight with gentle agitation. Following digestion, samples were centrifuged and the peptide-containing supernatant transferred to a new Protein LoBind tube. The resin was washed with 0.1 % formic acid in water, centrifuged and the washings added to the tube. The peptide solutions were then stage-tipped according to a published protocol.579 Briefly, stage tips were prepared by fitting C18 Empore disks (SDC-XC from 3M) into 200 μL pipette tips. The stage tip was initially washed by centrifuging (2000 x g, 2 min) with 150 μL MeOH followed by 150 μL water. Peptide solutions were then added to the top of the stage tip and centrifuged (2000 x g, 2 min) to load the peptides onto the C18 sorbent, followed by desalting by washing with 150 μL water. Peptides were eluted with 79 % acetonitrile in water and dried with speed-vac-assisted solvent removal. Peptides were then re-dissolved in 0.5 % TFA, 2 % acetonitrile in water and transferred to LC-MS sample vials ready for LC-MS/MS analysis.

8.3.4 Trypsin digest of protein lysates

Protein lysate (50 μL at a protein concentration of 2 mg/mL) obtained from R6K4 and R10K8 proteome-labelled MDA-MB-231 cells and R10K8 proteome-labelled MCF-7 cells was precipitated with ice-cold MeOH (8 vol.) overnight at -80 ˚C. The protein was pelleted by centrifugation at 17,000 x g for 10 min at 10 ˚C and subsequently washed twice with 10 % MilliQ water (in MeOH). The protein was air-dried for 2 min before being re-suspended in 80 μL 5 mM DTT in 50 mM AMBIC with vortexing and sonication, followed by gentle agitation at 55 ˚C for 30 min. Cysteines were then alkylated with 5 μL 100 mM iodoacetamide (IA) in 50 mM AMBIC in the dark at room temperature. Trypsin (2 μg) was used to digest the lysates overnight at 37 ˚C. The trypsin digest was then quenched with 0.5 μL TFA and the peptide mixtures stage-tipped as previously described, followed by preparation for LC-MS/MS analysis.

8.3.5 LC-MS/MS runs

LC-MS/MS analysis was performed on an Easy nLC-1000 system coupled to a QExactive mass spectrometer via an easy-spray source (all Thermo Fisher Scientific). Trypsin-digested peptide samples were separated with a reverse phase Acclaim PepMap RSLC column 50 cm x 75 μm inner diameter (Thermo Fisher Scientific) using a 2 h acetonitrile gradient in 0.1 % formic acid at a flow rate of 250 nL/min. The QExactive mass spectrometer was operated in data-dependent mode with survey scans acquired at a resolution of 75,000 at m/z 200 (transient time 256 ms). Up to the top 10 most abundant isotope patterns with charge +2 from the survey scan were selected with an isolation window of 3.0 m/z and fragmented by HCD with normalized collision energies of 25 W. The maximum ion injection times for the survey scan and the MS/MS scans (acquired with a resolution of 17,500 at m/z 200) were 250 and 80 ms, respectively. The ion target value for MS was set to 10^6 and for MS/MS to 10^5.

8.3.6 LC-MS/MS data analysis

The .raw data file obtained from each LC-MS/MS acquisition was processed directly with MaxQuant version 1.3.0.5,199 with peptides identified from the MS/MS spectra by searching against the human UniProt+isoforms database using the Andromeda search engine (the database was updated routinely every 3-6 months, so several database versions were used). Cysteine carbamidomethylation (+ 57.021 Da) was set as a fixed modification and methionine oxidation (+ 15.995 Da) and N-terminal acetylation (+ 42.011 Da) were set as variable modifications for the search. The minimum peptide length was set to 7 residues, the maximum number of missed trypsin cleavages to 2, the maximum number of modifications per peptide to 5 and the maximum peptide charge to +7. Peptide and protein FDR were set to 0.01. All other parameters were used as pre-set by the software. Output data from MaxQuant were analysed using a combination of Perseus version 1.4.0.20, Microsoft Office Excel 2010 and GraphPad Prism 5.0. For all data sets, protein identifications flagged by MaxQuant as ‘contaminants’, ‘only identified by site’ or ‘reverse’ were filtered out in the first instance. A protein was then accepted as an identified target only if it was supported by at least two ‘razor+unique peptides’. For all proteomics experiments, proteins identified in the DMSO control (with a detectable LFQ intensity) were filtered out from the protein target sets of the ABPs.
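The filtering steps described above were carried out in Perseus and Excel; the logic can nevertheless be sketched in a few lines. The example below is illustrative only. The column names are assumptions modelled on a MaxQuant protein groups table (flag columns marked with ‘+’), `LFQ intensity DMSO` is a hypothetical name for the control column, and the gene names are placeholders rather than real data.

```python
FLAG_COLUMNS = ("Contaminant", "Reverse", "Only identified by site")  # assumed column names


def is_target(row, control_column="LFQ intensity DMSO", min_peptides=2):
    """Apply the three filters in order: flags, peptide count, DMSO control."""
    if any(row.get(column, "") == "+" for column in FLAG_COLUMNS):
        return False  # contaminant, reverse hit or only identified by site
    if int(row["Razor + unique peptides"]) < min_peptides:
        return False  # fewer than two razor+unique peptides
    if float(row.get(control_column) or 0.0) > 0.0:
        return False  # detectable LFQ intensity in the DMSO control
    return True


def make_row(gene, peptides, dmso="0", **flags):
    """Build a placeholder table row; flags use column names as keywords."""
    row = {"Gene names": gene, "Razor + unique peptides": str(peptides),
           "LFQ intensity DMSO": dmso}
    row.update({column: "" for column in FLAG_COLUMNS})
    row.update(flags)
    return row


rows = [
    make_row("GENE_A", 5),                  # passes every filter
    make_row("GENE_B", 1),                  # too few peptides
    make_row("GENE_C", 4, Reverse="+"),     # reverse database hit
    make_row("GENE_D", 6, dmso="125000"),   # present in the DMSO control
]
targets = [row["Gene names"] for row in rows if is_target(row)]
print(targets)
```

For real data the rows could be read from the tab-delimited MaxQuant output with `csv.DictReader(handle, delimiter="\t")`, after checking the exact column headers produced by the MaxQuant version in use.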

8.3.6.1 Quantitative analysis

Protein quantification in MaxQuant was based on ‘razor+unique’ peptides carrying no modifications except carbamidomethylation.

8.3.6.1.1 Duplex SILAC data analysis

For duplex SILAC experiments, the multiplicity for the MaxQuant search was set to 2, corresponding to the number of labelled states to quantify against one another (in this case Lys0/Arg0 and Lys8/Arg10). This enabled the calculation of an H/L ratio for each ‘razor+unique’ peptide, from which the MaxQuant software derives an overall protein H/L ratio. For competition-based assays, a change in H/L ratio of more than 1.5-fold was taken to indicate a significant difference between the two proteome sample populations, and a protein target meeting this threshold was deemed genuine.
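The relationship between peptide ratios, the protein ratio and the 1.5-fold threshold can be illustrated with a minimal sketch. It assumes that the protein H/L ratio is the median of the peptide H/L ratios, which is how MaxQuant is generally described as summarising SILAC ratios, and it treats the fold change symmetrically (a ratio of 0.5 counts as a 2-fold change); both points are assumptions rather than details given in the text. The function names and peptide ratios are hypothetical.

```python
from statistics import median


def protein_ratio(peptide_ratios):
    """Summarise peptide H/L ratios into one protein ratio (assumed: median)."""
    return median(peptide_ratios)


def exceeds_fold_change(ratio, threshold=1.5):
    """True if the ratio differs from 1 by more than the threshold in either direction."""
    fold = ratio if ratio >= 1.0 else 1.0 / ratio
    return fold > threshold


# Hypothetical H/L ratios for the razor+unique peptides of one protein.
peptide_hl = [2.1, 1.8, 2.4, 0.9, 2.0]
ratio = protein_ratio(peptide_hl)
print(ratio, exceeds_fold_change(ratio))
```

Using the median rather than the mean makes the protein ratio robust to a single outlying peptide, such as the 0.9 value in this example.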

8.3.6.1.2 Triplex SILAC data analysis

For triplex SILAC experiments, the multiplicity for the MaxQuant search was set to 3, corresponding to the number of labelled states to quantify against one another (in this case Lys0/Arg0, Lys4/Arg6 and Lys8/Arg10). Three quantitative ratios (H/L, H/M and M/L) were calculated and used to compare ABP enrichment of each target in the three experimental samples.

8.3.6.1.3 ‘Spike-in’ SILAC analysis

For ‘spike-in’ SILAC experiments, the multiplicity for the MaxQuant search was set to 2 (in principle, ‘spike-in’ SILAC is analogous to duplex SILAC in the way the search is conducted). The H/L ratios calculated by MaxQuant for each sample set of the same ABP in the same experimental batch were normalised against their median based on their histogram distribution. The H/L ratios were then averaged across duplicates or triplicates. The averaged H/L ratio for each concentration of competing parent compound was normalised to the H/L ratio of the ABP-only sample to generate a quantification score for that concentration. A quantification score > 1.5 (or log2 > 0.585) at 100 μM parent compound was taken to indicate effective competition between the ABP and the parent compound, and the protein was assigned as a high confidence target.
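The normalisation, averaging and scoring steps of the ‘spike-in’ analysis can be sketched as follows. This is an illustration under stated assumptions, not the procedure as run in Perseus and Excel: median normalisation is implemented as division by the sample median, only proteins quantified in every replicate are averaged, and the protein names and ratios are placeholders.

```python
from statistics import mean, median


def median_normalise(ratios):
    """Divide every H/L ratio in one sample by that sample's median ratio."""
    centre = median(ratios.values())
    return {protein: ratio / centre for protein, ratio in ratios.items()}


def average_replicates(replicates):
    """Average normalised ratios over proteins quantified in every replicate."""
    shared = set.intersection(*(set(rep) for rep in replicates))
    return {protein: mean(rep[protein] for rep in replicates) for protein in shared}


def quantification_scores(competed, abp_only):
    """Averaged H/L under competition divided by the ABP-only H/L."""
    return {p: competed[p] / abp_only[p] for p in competed if p in abp_only}


# Hypothetical H/L ratios (heavy = spike-in standard) for three placeholder proteins.
abp_only_reps = [{"P1": 2.0, "P2": 2.0, "P3": 2.0}, {"P1": 1.0, "P2": 1.0, "P3": 1.0}]
competed_reps = [{"P1": 8.0, "P2": 2.0, "P3": 2.0}, {"P1": 3.0, "P2": 1.0, "P3": 1.0}]

abp_only = average_replicates([median_normalise(rep) for rep in abp_only_reps])
competed = average_replicates([median_normalise(rep) for rep in competed_reps])
scores = quantification_scores(competed, abp_only)
high_confidence = sorted(p for p, score in scores.items() if score > 1.5)
print(scores["P1"], high_confidence)
```

Because the ‘heavy’ standard is constant, competition by the parent compound lowers the ‘light’ signal and raises H/L, so competed targets give scores above 1.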

To generate dose response curves of target binding, the quantification score from at least 4 concentrations of competition was used. The quantification score for each competition concentration was transformed into a fractional response (0-1) using 1/x. Fractional response was plotted against log10 competition concentration and dose response curves were generated in GraphPad Prism 5 using a 4-parameter fit with variable slope (non-linear regression), constraining the maximum and minimum values to 1 and 0 respectively, from which EC50 values were obtained. EC50 in this context was defined as ‘the half maximal concentration of competition between the ABP and the parent compound’.
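With the maximum and minimum constrained to 1 and 0, the fitted curve has two free parameters, the EC50 and the slope. The sketch below estimates both by a log-logit linearisation and ordinary least squares; this is a simplified stand-in for, not a reproduction of, the non-linear regression performed in GraphPad Prism, and it weights the data points differently. The function names are hypothetical and the scores are synthetic values generated from an EC50 of 20 μM and a slope of 1.

```python
import math


def fractional_response(score):
    """Transform a quantification score into a 0-1 response using 1/x."""
    return 1.0 / score


def fit_ec50_loglogit(concs_uM, scores):
    """Fit y = 1 / (1 + (C / EC50)**h) via log10((1 - y) / y) = h * (log10 C - log10 EC50)."""
    xs, ys = [], []
    for conc, score in zip(concs_uM, scores):
        y = fractional_response(score)
        if not 0.0 < y < 1.0:
            continue  # the logit is undefined at or outside the 0-1 bounds
        xs.append(math.log10(conc))
        ys.append(math.log10((1.0 - y) / y))
    if len(xs) < 4:
        raise ValueError("at least 4 usable competition concentrations are required")
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    ec50 = 10 ** (-intercept / slope)
    return ec50, slope


# Synthetic scores for the curcumin competition concentrations (uM) listed in Table 13.
concs = [5.0, 20.0, 50.0, 100.0, 150.0]
scores = [1.25, 2.0, 3.5, 6.0, 8.5]
ec50, hill_slope = fit_ec50_loglogit(concs, scores)
print(round(ec50, 3), round(hill_slope, 3))
```

At the EC50 the fractional response is 0.5, which corresponds to a quantification score of 2.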

8.3.6.2 Site of modification analysis

For identification of individual amino acid modification sites on ABP target proteins, the enzyme-cleavable capture reagents AzRB or AzRB2 were utilised. Two experimental workflows were used to obtain ‘modified’ peptide information: combining ‘modified’ and unmodified peptides in a single pool (Method A), or separating ‘modified’ peptides from their unmodified counterparts and analysing them separately (Method B). Both methods use LC-MS/MS analysis for detection of the modification sites, with the following mass additions on cysteine residues, corresponding to the cleaved modification tag, searched as variable modifications: + 507.255 Da (sulforaphane ABP 2), + 669.287 Da (piperlongumine ABP) and + 734.302 Da for the AzRB reagent (curcumin ABP 1). The FDR for site and peptide was widened to 0.05 to improve the chance of identifying modification sites. All other parameters were unchanged from those pre-set in the software and reported in Chapter 8.3.6.

Method A: The normal workflow described in Chapter 8.3.3 was used without modification, so that modified peptides were released together with unmodified peptides upon trypsin digest (used in the experiments described in Chapter 8.3.2.5 and Chapter 8.3.2.6).

Method B: Neutravidin resin-immobilised proteins were reduced and alkylated with DTT and IA respectively. Samples were first digested on-resin with Lys-C (Wako, MS grade bacterial lysyl endopeptidase, 0.4 μg) in 50 mM AMBIC (pH 8.0) at 37 ˚C for 6 h. The resin was then washed four times with 50 mM AMBIC. Trypsin (0.6 μg) was then added to both the washings and the resin, and samples were incubated at 37 ˚C with gentle agitation overnight. The resin was washed with AMBIC and 0.1 % formic acid to extract the peptide-containing supernatant, followed by stage-tipping and preparation of samples for LC-MS/MS analysis (modified peptide pool). The washings sample was quenched with 0.2 % formic acid, stage-tipped and prepared for LC-MS/MS analysis (unmodified peptide pool).

8.3.7 Bioinformatic and network analysis

8.3.7.1 Bioinformatics

Online bioinformatic tools were used for GO enrichment analysis of the protein targets generated for curcumin, piperlongumine and sulforaphane. Gene names were used as the search input. Tools used included DAVID (http://david.abcc.ncifcrf.gov/), CanSAR (https://cansar.icr.ac.uk/), WebGestalt (http://bioinfo.vanderbilt.edu/webgestalt/) and Babelomics (http://babelomics.bioinfo.cipf.es/).

8.3.7.2 Network analysis with Cytoscape

Protein target interaction networks were constructed in Cytoscape (version 3.0.1).412 A system-wide protein-protein interaction file compiled from multiple repositories was kindly provided by Dr Konstantinos Mitsopoulos (Institute of Cancer Research). Consisting of direct interactions (from BioGRID and HPRD), phosphorylation (from Phosphosite), reaction (from IntAct and MINT), Reactome (from Reactome) and transcriptional (Tfacts and Lindquist) interactions, it was used as a model interactome (consisting of 14,017 proteins or nodes and 141,562 edges or interactions between nodes) to interrogate the target subsets obtained for curcumin, piperlongumine and sulforaphane. These sub-networks were exported as new networks. Additional information such as the quantification score (for sulforaphane networks) or calculated target EC50 values (for curcumin and piperlongumine) was mapped onto the nodes (corresponding to colour) within the respective networks. The size of each node reflects its connectivity within the network, such that larger nodes represent a higher degree of connectivity. The edges were also colour labelled according to their type: black (direct), pink (phosphorylation), green (reaction), blue (Reactome) and orange (transcriptional). Cluster analysis was performed using the Cytoscape plugin ClusterOne (version 1.0).414
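The extraction of a target sub-network and the node connectivity used for node sizing can be sketched outside Cytoscape as follows. This is a minimal illustration with several assumptions: the sub-network keeps only edges whose two endpoints are both targets, connectivity is counted as the number of distinct neighbours, and the protein labels and edges are placeholders. Only the edge colour scheme is taken from the text.

```python
from collections import defaultdict

# Edge colours by interaction type, as described in the text.
EDGE_COLOURS = {
    "direct": "black",
    "phosphorylation": "pink",
    "reaction": "green",
    "Reactome": "blue",
    "transcriptional": "orange",
}


def target_subnetwork(edges, targets):
    """Keep edges of the model interactome that connect two different targets."""
    targets = set(targets)
    return [(a, b, kind) for a, b, kind in edges
            if a in targets and b in targets and a != b]


def node_degree(edges):
    """Number of distinct neighbours per node (used here as the connectivity)."""
    neighbours = defaultdict(set)
    for a, b, _kind in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    return {node: len(linked) for node, linked in neighbours.items()}


# Placeholder interactome edges: (protein, protein, interaction type).
interactome = [
    ("A", "B", "direct"),
    ("A", "C", "phosphorylation"),
    ("B", "C", "reaction"),
    ("C", "D", "direct"),
    ("A", "E", "direct"),  # E is not a target, so this edge is dropped
]
subnetwork = target_subnetwork(interactome, targets={"A", "B", "C", "D"})
degree = node_degree(subnetwork)
edge_colours = [EDGE_COLOURS[kind] for _a, _b, kind in subnetwork]
print(degree, edge_colours)
```

Targets with no interaction partner among the other targets do not appear in `degree` and would be shown as unconnected nodes.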

9. References

1. Marino, S.M. and Gladyshev, V.N. Cysteine function governs its conservation and degeneration and restricts its utilization on protein surfaces. J. Mol. Biol. 404, 902-916 (2010).
2. Pe'er, I., et al. Proteomic signatures: amino acid and oligopeptide compositions differentiate among phyla. Proteins 54, 20-40 (2004).
3. Harris, T.K. and Turner, G.J. Structural basis of perturbed pKa values of catalytic groups in enzyme active sites. IUBMB Life 53, 85-98 (2002).
4. Pace, N.J. and Weerapana, E. Diverse functional roles of reactive cysteines. ACS Chem. Biol. 8, 283-296 (2013).
5. Couvertier, S.M., Zhou, Y. and Weerapana, E. Chemical-proteomic strategies to investigate cysteine posttranslational modifications. Biochim. Biophys. Acta 1844, 2315-2330 (2014).
6. Shannon, D.A. and Weerapana, E. Covalent protein modification: the current landscape of residue-specific electrophiles. Curr. Opin. Chem. Biol. 24C, 18-26 (2015).
7. Newman, D.J. and Cragg, G.M. Natural products as sources of new drugs over the last 25 years. J. Nat. Prod. 70, 461-477 (2007).
8. Gersch, M., Kreuzer, J. and Sieber, S.A. Electrophilic natural products and their biological targets. Nat. Prod. Rep. 29, 659-682 (2012).
9. Taunton, J., Collins, J.L. and Schreiber, S.L. Synthesis of natural and modified trapoxins, useful reagents for exploring histone deacetylase function. J. Am. Chem. Soc. 118, 10412-10422 (1996).
10. Taunton, J., Hassig, C.A. and Schreiber, S.L. A mammalian histone deacetylase related to the yeast transcriptional regulator Rpd3p. Science 272, 408-411 (1996).
11. Bottcher, T., Pitscheider, M. and Sieber, S.A. Natural products and their biological targets: proteomic and metabolomic labeling strategies. Angew. Chem. Int. Ed. Engl. 49, 2680-2698 (2010).
12. Cheng, K.W., Wong, C.C., Wang, M., He, Q.Y. and Chen, F. Identification and characterization of molecular targets of natural products by mass spectrometry. Mass Spectrom. Rev. 29, 126-155 (2010).
13. Pucheault, M. Natural products: chemical instruments to apprehend biological symphony. Org. Biomol. Chem. 6, 424-432 (2008).
14. Krysiak, J. and Breinbauer, R. Activity-based protein profiling for natural product target discovery. Top. Curr. Chem. 324, 43-84 (2012).
15. Kansanen, E., Jyrkkanen, H.K. and Levonen, A.L. Activation of stress signaling pathways by electrophilic oxidized and nitrated lipids. Free Radic. Biol. Med. 52, 973-982 (2012).
16. Schopfer, F.J., Cipollina, C. and Freeman, B.A. Formation and signaling actions of electrophilic lipids. Chem. Rev. 111, 5997-6021 (2011).
17. Groeger, A.L. and Freeman, B.A. Signaling actions of electrophiles: anti-inflammatory therapeutic candidates. Mol. Interv. 10, 39-50 (2010).
18. Enoch, S.J., Ellison, C.M., Schultz, T.W. and Cronin, M.T. A review of the electrophilic reaction chemistry involved in covalent protein binding relevant to toxicity. Crit. Rev. Toxicol. 41, 783-802 (2011).
19. Lopachin, R.M. and Decaprio, A.P. Protein adduct formation as a molecular mechanism in neurotoxicity. Toxicol. Sci. 86, 214-225 (2005).
20. Liebler, D.C. Protein damage by reactive electrophiles: targets and consequences. Chem. Res. Toxicol. 21, 117-128 (2008).
21. Lin, D., Saleh, S. and Liebler, D.C. Reversibility of covalent electrophile-protein adducts and chemical toxicity. Chem. Res. Toxicol. 21, 2361-2369 (2008).
22. LoPachin, R.M., Barber, D.S. and Gavin, T. Molecular mechanisms of the conjugated alpha,beta-unsaturated carbonyl derivatives: relevance to neurotoxicity and neurodegenerative diseases. Toxicol. Sci. 104, 235-249 (2008).
23. Wall, S.B., et al. Detection of electrophile-sensitive proteins. Biochim. Biophys. Acta 1840, 913-922 (2014).
24. Lushchak, V.I. Glutathione homeostasis and functions: potential targets for medical interventions. J. Amino Acids 2012, 736837 (2012).
25. Zhu, P., Oe, T. and Blair, I.A. Determination of cellular redox status by stable isotope dilution liquid chromatography/mass spectrometry analysis of glutathione and glutathione disulfide. Rapid Commun. Mass Spectrom. 22, 432-440 (2008).

26. Marnett, L.J. Oxy radicals, lipid peroxidation and DNA damage. Toxicology 181-182, 219-222 (2002).
27. Itoh, K., et al. Keap1 represses nuclear activation of antioxidant responsive elements by Nrf2 through binding to the amino-terminal Neh2 domain. Genes Dev. 13, 76-86 (1999).
28. Dinkova-Kostova, A.T., et al. Direct evidence that sulfhydryl groups of Keap1 are the sensors regulating induction of phase 2 enzymes that protect against carcinogens and oxidants. Proc. Natl. Acad. Sci. U.S.A. 99, 11908-11913 (2002).
29. Eggler, A.L., Liu, G., Pezzuto, J.M., van Breemen, R.B. and Mesecar, A.D. Modifying specific cysteines of the electrophile-sensing human Keap1 protein is insufficient to disrupt binding to the Nrf2 domain Neh2. Proc. Natl. Acad. Sci. U.S.A. 102, 10070-10075 (2005).
30. McMahon, M., Lamont, D.J., Beattie, K.A. and Hayes, J.D. Keap1 perceives stress via three sensors for the endogenous signaling molecules nitric oxide, zinc, and alkenals. Proc. Natl. Acad. Sci. U.S.A. 107, 18838-18843 (2010).
31. Zhang, D.D. and Hannink, M. Distinct cysteine residues in Keap1 are required for Keap1-dependent ubiquitination of Nrf2 and for stabilization of Nrf2 by chemopreventive agents and oxidative stress. Mol. Cell. Biol. 23, 8137-8151 (2003).
32. Luo, Y., et al. Sites of alkylation of human Keap1 by natural chemoprevention agents. J. Am. Soc. Mass Spectrom. 18, 2226-2232 (2007).
33. Levonen, A.L., et al. Cellular mechanisms of redox cell signalling: role of cysteine modification in controlling antioxidant defences in response to electrophilic lipid oxidation products. Biochem. J. 378, 373-382 (2004).
34. Karin, M. and Ben-Neriah, Y. Phosphorylation meets ubiquitination: the control of NF-[kappa]B activity. Annu. Rev. Immunol. 18, 621-663 (2000).
35. Pahl, H.L. Activators and target genes of Rel/NF-kappaB transcription factors. Oncogene 18, 6853-6866 (1999).
36. Cui, T., et al. Nitrated fatty acids: Endogenous anti-inflammatory signaling mediators. J. Biol. Chem. 281, 35686-35698 (2006).
37. Rossi, A., et al. Anti-inflammatory cyclopentenone prostaglandins are direct inhibitors of IkappaB kinase. Nature 403, 103-108 (2000).
38. Ricote, M., Huang, J.T., Welch, J.S. and Glass, C.K. The peroxisome proliferator-activated receptor (PPARgamma) as a regulator of monocyte/macrophage function. J. Leukoc. Biol. 66, 733-739 (1999).
39. Shiraki, T., et al. Alpha,beta-unsaturated ketone is a core moiety of natural ligands for covalent binding to peroxisome proliferator-activated receptor gamma. J. Biol. Chem. 280, 14145-14153 (2005).
40. Waku, T., et al. Structural insight into PPARgamma activation through covalent modification with endogenous fatty acids. J. Mol. Biol. 385, 188-199 (2009).
41. Wang, L., et al. Natural product agonists of peroxisome proliferator-activated receptor gamma (PPARgamma): a review. Biochem. Pharmacol. 92, 73-89 (2014).
42. Itoh, T., et al. Structural basis for the activation of PPARgamma by oxidized fatty acids. Nat. Struct. Mol. Biol. 15, 924-931 (2008).
43. Zou, J., Guo, Y., Guettouche, T., Smith, D.F. and Voellmy, R. Repression of heat shock transcription factor HSF1 activation by HSP90 (HSP90 complex) that forms a stress-sensitive complex with HSF1. Cell 94, 471-480 (1998).
44. Jacobs, A.T. and Marnett, L.J. Systems analysis of protein modification and cellular responses induced by electrophile stress. Acc. Chem. Res. 43, 673-683 (2010).
45. Fomenko, D.E., Marino, S.M. and Gladyshev, V.N. Functional diversity of cysteine residues in proteins and unique features of catalytic redox-active cysteines in thiol oxidoreductases. Mol. Cells 26, 228-235 (2008).
46. Brown, K.K., Eriksson, S.E., Arner, E.S. and Hampton, M.B. Mitochondrial peroxiredoxin 3 is rapidly oxidized in cells treated with isothiocyanates. Free Radic. Biol. Med. 45, 494-502 (2008).
47. Moos, P.J., Edes, K., Cassidy, P., Massuda, E. and Fitzpatrick, F.A. Electrophilic prostaglandins and lipid aldehydes repress redox-sensitive transcription factors p53 and hypoxia-inducible factor by impairing the selenoprotein thioredoxin reductase. J. Biol. Chem. 278, 745-750 (2003).
48. Shibata, T., et al. Thioredoxin as a molecular target of cyclopentenone prostaglandins. J. Biol. Chem. 278, 26046-26054 (2003).

49. Liu, Y. and Min, W. Thioredoxin promotes ASK1 ubiquitination and degradation to inhibit ASK1-mediated apoptosis in a redox activity-independent manner. Circ. Res. 90, 1259-1266 (2002).
50. Saitoh, M., et al. Mammalian thioredoxin is a direct inhibitor of apoptosis signal-regulating kinase (ASK) 1. EMBO J. 17, 2596-2606 (1998).
51. Codreanu, S.G., Zhang, B., Sobecki, S.M., Billheimer, D.D. and Liebler, D.C. Global analysis of protein damage by the lipid electrophile 4-hydroxy-2-nonenal. Mol. Cell. Proteomics 8, 670-680 (2009).
52. Doyle, K. and Fitzpatrick, F.A. Redox signaling, alkylation (carbonylation) of conserved cysteines inactivates class I histone deacetylases 1, 2, and 3 and antagonizes their transcriptional repressor function. J. Biol. Chem. 285, 17417-17424 (2010).
53. Oliva, J.L., et al. The cyclopentenone 15-deoxy-delta 12,14-prostaglandin J2 binds to and activates H-Ras. Proc. Natl. Acad. Sci. U.S.A. 100, 4772-4777 (2003).
54. Sampey, B.P., Carbone, D.L., Doorn, J.A., Drechsel, D.A. and Petersen, D.R. 4-Hydroxy-2-nonenal adduction of extracellular signal-regulated kinase (Erk) and the inhibition of hepatocyte Erk-Est-like protein-1-activating protein-1 signal transduction. Mol. Pharmacol. 71, 871-883 (2007).
55. Takeda, K., Ichiki, T., Tokunou, T., Iino, N. and Takeshita, A. 15-Deoxy-delta 12,14-prostaglandin J2 and thiazolidinediones activate the MEK/ERK pathway through phosphatidylinositol 3-kinase in vascular smooth muscle cells. J. Biol. Chem. 276, 48950-48955 (2001).
56. Weerapana, E., et al. Quantitative reactivity profiling predicts functional cysteines in proteomes. Nature 468, 790-795 (2010).
57. Barf, T. and Kaptein, A. Irreversible protein kinase inhibitors: balancing the benefits and risks. J. Med. Chem. 55, 6243-6262 (2012).
58. Salmeen, A., et al. Redox regulation of protein tyrosine phosphatase 1B involves a sulphenyl-amide intermediate. Nature 423, 769-773 (2003).
59. Samet, J.M. and Tal, T.L. Toxicological disruption of signaling homeostasis: tyrosine phosphatases as targets. Annu. Rev. Pharmacol. Toxicol. 50, 215-235 (2010).
60. Hurd, T.R., et al. Glutathionylation of mitochondrial proteins. Antioxid. Redox Signal. 7, 999-1010 (2005).
61. Chen, J., Schenker, S., Frosto, T.A. and Henderson, G.I. Inhibition of cytochrome c oxidase activity by 4-hydroxynonenal (HNE). Role of HNE adduct formation with the enzyme subunits. Biochim. Biophys. Acta 1380, 336-344 (1998).
62. Humphries, K.M. and Szweda, L.I. Selective inactivation of alpha-ketoglutarate dehydrogenase and pyruvate dehydrogenase: reaction of lipoic acid with 4-hydroxy-2-nonenal. Biochemistry 37, 15835-15841 (1998).
63. Humphries, K.M., Yoo, Y. and Szweda, L.I. Inhibition of NADH-linked mitochondrial respiration by 4-hydroxy-2-nonenal. Biochemistry 37, 552-557 (1998).
64. Echtay, K.S., et al. A signalling role for 4-hydroxy-2-nonenal in regulation of mitochondrial uncoupling. EMBO J. 22, 4103-4110 (2003).
65. Queliconi, B.B., Wojtovich, A.P., Nadtochiy, S.M., Kowaltowski, A.J. and Brookes, P.S. Redox regulation of the mitochondrial K(ATP) channel in cardioprotection. Biochim. Biophys. Acta 1813, 1309-1315 (2011).
66. Aldini, G., et al. Identification of actin as a 15-deoxy-Delta12,14-prostaglandin J2 target in neuroblastoma cells: mass spectrometric, computational, and functional approaches to investigate the effect on cytoskeletal derangement. Biochemistry 46, 2707-2718 (2007).
67. Chavez, J., et al. Site-specific protein adducts of 4-hydroxy-2(E)-nonenal in human THP-1 monocytic cells: protein carbonylation is diminished by ascorbic acid. Chem. Res. Toxicol. 23, 37-47 (2010).
68. Anastasiou, D., et al. Inhibition of pyruvate kinase M2 by reactive oxygen species contributes to cellular antioxidant responses. Science 334, 1278-1283 (2011).
69. Reagan, L.P., et al. Oxidative stress and HNE conjugation of GLUT3 are increased in the hippocampus of diabetic rats subjected to stress. Brain Res. 862, 292-300 (2000).
70. Ishii, T., Tatsuda, E., Kumazawa, S., Nakayama, T. and Uchida, K. Molecular basis of enzyme inactivation by an endogenous electrophile 4-hydroxy-2-nonenal: identification of modification sites in glyceraldehyde-3-phosphate dehydrogenase. Biochemistry 42, 3474-3480 (2003).
71. Singh, J., Petter, R.C., Baillie, T.A. and Whitty, A. The resurgence of covalent drugs. Nat. Rev. Drug Discov. 10, 307-317 (2011).

72. Mah, R., Thomas, J.R. and Shafer, C.M. Drug discovery considerations in the development of covalent inhibitors. Bioorg. Med. Chem. Lett. 24, 33-39 (2014). 73. Ou, S.H.I. Second-generation irreversible epidermal growth factor receptor (EGFR) tyrosine kinase inhibitors (TKIs): A better mousetrap? A review of the clinical evidence. Critical Reviews in Oncology Hematology 83, 407-421 (2012). 74. Lou, Y., Owens, T.D., Kuglstatter, A., Kondru, R.K. and Goldstein, D.M. Bruton's tyrosine kinase inhibitors: approaches to potent and selective inhibition, preclinical and clinical evaluation for inflammatory diseases and B cell malignancies. J. Med. Chem. 55, 4539-4550 (2012). 75. Ahn, K., Johnson, D.S. and Cravatt, B.F. Fatty acid amide hydrolase as a potential therapeutic target for the treatment of pain and CNS disorders. Expert Opin. Drug Discov. 4, 763-784 (2009). 76. Hughes, T.E., et al. Ascending dose-controlled trial of beloranib, a novel obesity treatment for safety, tolerability, and weight loss in obese women. Obesity (Silver Spring) 21, 1782-1788 (2013). 77. Ziegler, S., Pries, V., Hedberg, C. and Waldmann, H. Target identification for small bioactive molecules: finding the needle in the haystack. Angew. Chem. Int. Ed. Engl. 52, 2744-2792 (2013). 78. Vila, A., et al. Identification of protein targets of 4-hydroxynonenal using click chemistry for ex vivo biotinylation of azido and alkynyl derivatives. Chem. Res. Toxicol. 21, 432-444 (2008). 79. Wang, C., Weerapana, E., Blewett, M.M. and Cravatt, B.F. A chemoproteomic platform to quantitatively map targets of lipid-derived electrophiles. Nat. Methods 11, 79-85 (2014). 80. Dennehy, M.K., Richards, K.A., Wernke, G.R., Shyr, Y. and Liebler, D.C. Cytosolic and nuclear protein targets of thiol-reactive electrophiles. Chem. Res. Toxicol. 19, 20-29 (2006). 81. Shin, N.Y., Liu, Q., Stamer, S.L. and Liebler, D.C. Protein targets of reactive electrophiles in human liver microsomes. Chem. Res. Toxicol. 20, 859-867 (2007). 
82. Weerapana, E., et al. Quantitative reactivity profiling predicts functional cysteines in proteomes. Nature (2010). 83. Wong, H.L. and Liebler, D.C. Mitochondrial protein targets of thiol-reactive electrophiles. Chem. Res. Toxicol. 21, 796-804 (2008). 84. Weerapana, E., Simon, G.M. and Cravatt, B.F. Disparate proteome reactivity profiles of carbon electrophiles. Nat. Chem. Biol. 4, 405-407 (2008). 85. Evans, M.J., et al. Mechanistic and structural requirements for active site labeling of phosphoglycerate mutase by spiroepoxides. Mol. Biosyst. 3, 495-506 (2007). 86. Evans, M.J., Saghatelian, A., Sorensen, E.J. and Cravatt, B.F. Target discovery in small- molecule cell-based screens by in situ proteome reactivity profiling. Nat. Biotechnol. 23, 1303- 1307 (2005). 87. Greenbaum, D., Medzihradszky, K.F., Burlingame, A. and Bogyo, M. Epoxide electrophiles as activity-dependent cysteine protease profiling and discovery tools. Chem. Biol. 7, 569-581 (2000). 88. Kim, K.B., Myung, J., Sin, N. and Crews, C.M. Proteasome inhibition by the natural products epoxomicin and dihydroeponemycin: insights into specificity and potency. Bioorg. Med. Chem. Lett. 9, 3335-3340 (1999). 89. Meng, L., et al. Epoxomicin, a potent and selective proteasome inhibitor, exhibits in vivo antiinflammatory activity. Proc. Natl. Acad. Sci. U.S.A. 96, 10403-10408 (1999). 90. Lowther, W.T., McMillen, D.A., Orville, A.M. and Matthews, B.W. The anti-angiogenic agent fumagillin covalently modifies a conserved active-site histidine in the Escherichia coli methionine aminopeptidase. Proc. Natl. Acad. Sci. U.S.A. 95, 12153-12157 (1998). 91. Kotake, Y., et al. Splicing factor SF3b as a target of the antitumor natural product pladienolide. Nat. Chem. Biol. 3, 570-575 (2007). 92. Mi, L., et al. Identification of potential protein targets of isothiocyanates by proteomics. Chem. Res. Toxicol. 24, 1735-1743 (2011). 93. Fu, Y., et al. 
A click chemistry approach to identify protein targets of cancer chemopreventive phenethyl isothiocyanate. R. Soc. Chem. Adv. 4, 3920-3923 (2014). 94. Shibata, T., et al. Transthiocarbamoylation of proteins by thiolated isothiocyanates. J. Biol. Chem. (2011). 95. Kirkpatrick, D.L., et al. Mechanisms of inhibition of the thioredoxin growth factor system by antitumor 2-imidazolyl disulfides. Biochem. Pharmacol. 55, 987-994 (1998). 96. Kreuzer, J., Bach, N.C., Forler, D. and Sieber, S.A. Target discovery of acivicin in cancer cells elucidates its mechanism of growth inhibition. Chem. Sci. 6, 237-245 (2014). 97. Orth, R., Bottcher, T. and Sieber, S.A. The biological targets of acivicin inspired 3-chloro- and 3-bromodihydroisoxazole scaffolds. Chem. Commun. (Camb.) 46, 8475-8477 (2010). 98. Lamb, J., et al. The Connectivity Map: using gene-expression signatures to connect small molecules, genes, and disease. Science 313, 1929-1935 (2006). 99. Bargagna-Mohan, P., et al. The tumor inhibitor and antiangiogenic agent withaferin A targets the intermediate filament protein vimentin. Chem. Biol. 14, 623-634 (2007). 100. Liu, Y., et al. Wortmannin, a widely used phosphoinositide 3-kinase inhibitor, also potently inhibits mammalian polo-like kinase. Chem. Biol. 12, 99-107 (2005). 101. Yee, M.C., Fas, S.C., Stohlmeyer, M.M., Wandless, T.J. and Cimprich, K.A. A cell-permeable, activity-based probe for protein and lipid kinases. J. Biol. Chem. 280, 29053-29059 (2005). 102. Shreder, K.R., et al. Design and synthesis of AX7574: a microcystin-derived, fluorescent probe for serine/threonine phosphatases. Bioconjug. Chem. 15, 790-798 (2004). 103. Kwok, B.H., Koh, B., Ndubuisi, M.I., Elofsson, M. and Crews, C.M. The anti-inflammatory natural product parthenolide from the medicinal herb Feverfew directly binds to and inhibits IkappaB kinase. Chem. Biol. 8, 759-766 (2001). 104. Teruya, T., Simizu, S., Kanoh, N. and Osada, H. Phoslactomycin targets cysteine-269 of the protein phosphatase 2A catalytic subunit in cells. FEBS Lett. 579, 2463-2468 (2005). 105. Usui, T., et al. The anticancer natural product pironetin selectively targets Lys352 of alpha-tubulin. Chem. Biol. 11, 799-806 (2004). 106. Kudo, N., et al. Leptomycin B inactivates CRM1/exportin 1 by covalent modification at a cysteine residue in the central conserved region. Proc. Natl. Acad. Sci. U.S.A. 96, 9112-9117 (1999). 107. Kudo, N., et al.
Leptomycin B inhibition of signal-mediated nuclear export by direct binding to CRM1. Exp. Cell Res. 242, 540-547 (1998). 108. Adam, G.C., Vanderwal, C.D., Sorensen, E.J. and Cravatt, B.F. (-)-FR182877 is a potent and selective inhibitor of carboxylesterase-1. Angew. Chem. Int. Ed. Engl. 42, 5480-5484 (2003). 109. Buey, R.M., et al. Cyclostreptin binds covalently to microtubule pores and lumenal taxoid binding sites. Nat. Chem. Biol. 3, 117-125 (2007). 110. Clerc, J., et al. Syringolin A selectively labels the 20 S proteasome in murine EL4 and wild-type and bortezomib-adapted leukaemic cell lines. Chembiochem 10, 2638-2643 (2009). 111. Groll, M., et al. A plant pathogen virulence factor inhibits the eukaryotic proteasome by a novel mechanism. Nature 452, 755-758 (2008). 112. Wulff, J.E., Herzon, S.B., Siegrist, R. and Myers, A.G. Evidence for the rapid conversion of stephacidin B into the electrophilic monomer avrainvillamide in cell culture. J. Am. Chem. Soc. 129, 4898-4899 (2007). 113. Wulff, J.E., Siegrist, R. and Myers, A.G. The natural product avrainvillamide binds to the oncoprotein nucleophosmin. J. Am. Chem. Soc. 129, 14444-14451 (2007). 114. Wang, J., et al. A quantitative chemical proteomics approach to profile the specific cellular targets of andrographolide, a promising anticancer agent that suppresses tumor metastasis. Mol. Cell. Proteomics 13, 876-886 (2014). 115. Xia, Y.F., et al. Andrographolide attenuates inflammation by inhibition of NF-kappa B activation through covalent modification of reduced cysteine 62 of p50. J. Immunol. 173, 4207-4217 (2004). 116. Yang, P.Y., et al. Activity-based proteome profiling of potential cellular targets of Orlistat--an FDA-approved drug with anti-tumor activities. J. Am. Chem. Soc. 132, 656-666 (2010). 117. Kridel, S.J., Axelrod, F., Rozenkrantz, N. and Smith, J.W. Orlistat is a novel inhibitor of fatty acid synthase with antitumor activity. Cancer Res. 64, 2070-2075 (2004). 118.
Faiella, L., Piaz, F.D., Bisio, A., Tosco, A. and De Tommasi, N. A chemical proteomics approach reveals Hsp27 as a target for proapoptotic clerodane diterpenes. Mol. Biosyst. 8, 2637-2644 (2012). 119. Dal Piaz, F., et al. Chemical proteomics reveals HSP70 1A as a target for the anticancer diterpene oridonin in Jurkat cells. J. Proteomics 82, 14-26 (2013). 120. Kalesh, K.A., Clulow, J.A. and Tate, E.W. Target profiling of zerumbone using a novel cell-permeable clickable probe and quantitative chemical proteomics. Chem. Commun. (Camb.) (2015). 121. Dolai, S., Xu, Q., Liu, F. and Molloy, M.P. Quantitative chemical proteomics in small-scale culture of phorbol ester stimulated basal breast cancer cells. Proteomics 11, 2683-2692 (2011).

122. Saxena, C., Zhen, E., Higgs, R.E. and Hale, J.E. An immuno-chemo-proteomics method for drug target deconvolution. J. Proteome Res. 7, 3490-3497 (2008). 123. Raj, L., et al. Selective killing of cancer cells by a small molecule targeting the stress response to ROS. Nature 475, 231-234 (2011). 124. Lum, P.Y., et al. Discovering modes of action for therapeutic compounds using a genome-wide screen of yeast heterozygotes. Cell 116, 121-137 (2004). 125. Oh, J., et al. A universal TagModule collection for parallel genetic analysis of microorganisms. Nucleic Acids Res. 38, e146 (2010). 126. Ko, C.C., et al. Chemical proteomics identifies heterogeneous nuclear ribonucleoprotein (hnRNP) A1 as the molecular target of quercetin in its anti-cancer effects in PC-3 cells. J. Biol. Chem. 289, 22078-22089 (2014). 127. Low, W.K., et al. Inhibition of eukaryotic translation initiation by the marine natural product pateamine A. Mol. Cell 20, 709-722 (2005). 128. Ahmad, R., Raina, D., Meyer, C. and Kufe, D. Triterpenoid CDDO-methyl ester inhibits the Janus-activated kinase-1 (JAK1)-->signal transducer and activator of transcription-3 (STAT3) pathway by direct inhibition of JAK1 and STAT3. Cancer Res. 68, 2920-2926 (2008). 129. Ahmad, R., Raina, D., Meyer, C., Kharbanda, S. and Kufe, D. Triterpenoid CDDO-Me blocks the NF-kappaB pathway by direct inhibition of IKKbeta on Cys-179. J. Biol. Chem. 281, 35764-35769 (2006). 130. Cleasby, A., et al. Structure of the BTB domain of Keap1 and its interaction with the triterpenoid antagonist CDDO. PLoS One 9, e98896 (2014). 131. Yore, M.M., Kettenbach, A.N., Sporn, M.B., Gerber, S.A. and Liby, K.T. Proteomic analysis shows synthetic oleanane triterpenoid binds to mTOR. PLoS One 6, e22862 (2011). 132. Angelo, L.S., et al. Binding partners for curcumin in human schwannoma cells: biologic implications. Bioorg. Med. Chem. 21, 932-939 (2013). 133. Firouzi, Z., et al. Proteomics screening of molecular targets of curcumin in mouse brain. Life Sci.
98, 12-17 (2014). 134. Liu, C.X., et al. Adenanthin targets peroxiredoxin I and II to induce differentiation of leukemic cells. Nat. Chem. Biol. 8, 486-493 (2012). 135. Cheng, X., Li, L., Uttamchandani, M. and Yao, S.Q. In situ proteome profiling of C75, a covalent bioactive compound with potential anticancer activities. Org. Lett. 16, 1414-1417 (2014). 136. Clamp, M., et al. Distinguishing protein-coding and noncoding genes in the human genome. Proc. Natl. Acad. Sci. U.S.A. 104, 19428-19433 (2007). 137. An, W.F. and Tolliday, N. Cell-based assays for high-throughput screening. Mol. Biotechnol. 45, 180-186 (2010). 138. Schenone, M., Dancik, V., Wagner, B.K. and Clemons, P.A. Target identification and mechanism of action in chemical biology and drug discovery. Nat. Chem. Biol. 9, 232-240 (2013). 139. Carlson, E.E. Natural products as chemical probes. ACS Chem. Biol. 5, 639-653 (2010). 140. Oda, Y., et al. Quantitative chemical proteomics for identifying candidate drug targets. Anal. Chem. 75, 2159-2165 (2003). 141. Wang, G., Shang, L., Burgett, A.W., Harran, P.G. and Wang, X. Diazonamide toxins reveal an unexpected function for ornithine delta-amino transferase in mitotic cell division. Proc. Natl. Acad. Sci. U.S.A. 104, 2068-2073 (2007). 142. Ito, T., et al. Identification of a primary target of thalidomide teratogenicity. Science 327, 1345-1350 (2010). 143. Ong, S.E., et al. Identifying the proteins to which small-molecule probes and drugs bind in cells. Proc. Natl. Acad. Sci. U.S.A. 106, 4617-4622 (2009). 144. Bantscheff, M., et al. Quantitative chemical proteomics reveals mechanisms of action of clinical ABL kinase inhibitors. Nat. Biotechnol. 25, 1035-1044 (2007). 145. Trippier, P.C. Synthetic strategies for the biotinylation of bioactive small molecules. ChemMedChem 8, 190-203 (2013). 146. Heal, W.P., Wickramasinghe, S.R. and Tate, E.W. Activity based chemical proteomics: profiling proteases as drug targets. Curr. Drug Discov. Technol. 5, 200-212 (2008). 147.
Cravatt, B.F. and Sorensen, E.J. Chemical strategies for the global analysis of protein function. Curr. Opin. Chem. Biol. 4, 663-668 (2000). 148. Jeffery, D.A. and Bogyo, M. Chemical proteomics and its application to drug discovery. Curr. Opin. Biotechnol. 14, 87-95 (2003).

149. Cravatt, B.F., Wright, A.T. and Kozarich, J.W. Activity-based protein profiling: from enzyme chemistry to proteomic chemistry. Annu. Rev. Biochem. 77, 383-414 (2008). 150. Nodwell, M.B. and Sieber, S.A. ABPP methodology: introduction and overview. Top. Curr. Chem. 324, 1-41 (2012). 151. Su, Y., et al. Target identification of biologically active small molecules via in situ methods. Curr. Opin. Chem. Biol. 17, 768-775 (2013). 152. Adam, G.C., Sorensen, E.J. and Cravatt, B.F. Chemical strategies for functional proteomics. Mol. Cell. Proteomics 1, 781-790 (2002). 153. Kato, D., et al. Activity-based probes that target diverse cysteine protease families. Nat. Chem. Biol. 1, 33-38 (2005). 154. Speers, A.E., Adam, G.C. and Cravatt, B.F. Activity-based protein profiling in vivo using a copper(I)-catalyzed azide-alkyne [3 + 2] cycloaddition. J. Am. Chem. Soc. 125, 4686-4687 (2003). 155. Ovaa, H., et al. Chemistry in living cells: detection of active proteasomes by a two-step labeling strategy. Angew. Chem. Int. Ed. Engl. 42, 3626-3629 (2003). 156. Rostovtsev, V.V., Green, L.G., Fokin, V.V. and Sharpless, K.B. A stepwise Huisgen cycloaddition process: copper(I)-catalyzed regioselective "ligation" of azides and terminal alkynes. Angew. Chem. Int. Ed. Engl. 41, 2596-2599 (2002). 157. Tornoe, C.W., Christensen, C. and Meldal, M. Peptidotriazoles on solid phase: [1,2,3]-triazoles by regiospecific copper(I)-catalyzed 1,3-dipolar cycloadditions of terminal alkynes to azides. J. Org. Chem. 67, 3057-3064 (2002). 158. Kolb, H.C., Finn, M.G. and Sharpless, K.B. Click Chemistry: Diverse Chemical Function from a Few Good Reactions. Angew. Chem. Int. Ed. Engl. 40, 2004-2021 (2001). 159. Kohn, M. and Breinbauer, R. The Staudinger ligation-a gift to chemical biology. Angew. Chem. Int. Ed. Engl. 43, 3106-3116 (2004). 160. Agard, N.J., Prescher, J.A. and Bertozzi, C.R. A strain-promoted [3 + 2] azide-alkyne cycloaddition for covalent modification of biomolecules in living systems. J.
Am. Chem. Soc. 126, 15046-15047 (2004). 161. Willems, L.I., Verdoes, M., Florea, B.I., van der Marel, G.A. and Overkleeft, H.S. Two-step labeling of endogenous enzymatic activities by Diels-Alder ligation. Chembiochem 11, 1769-1781 (2010). 162. Baskin, J.M., et al. Copper-free click chemistry for dynamic in vivo imaging. Proc. Natl. Acad. Sci. U.S.A. 104, 16793-16797 (2007). 163. Chan, E.W., Chattopadhaya, S., Panicker, R.C., Huang, X. and Yao, S.Q. Developing photoactive affinity probes for proteomic profiling: hydroxamate-based probes for metalloproteases. J. Am. Chem. Soc. 126, 14435-14446 (2004). 164. Shi, H., Zhang, C.J., Chen, G.Y. and Yao, S.Q. Cell-based proteome profiling of potential dasatinib targets by use of affinity-based probes. J. Am. Chem. Soc. 134, 3001-3014 (2012). 165. Jessen, K.A., et al. The discovery and mechanism of action of novel tumor-selective and apoptosis-inducing 3,5-diaryl-1,2,4-oxadiazole series using a chemical genetics approach. Mol. Cancer Ther. 4, 761-771 (2005). 166. Park, J., Oh, S. and Park, S.B. Discovery and target identification of an antiproliferative agent in live cells using fluorescence difference in two-dimensional gel electrophoresis. Angew. Chem. Int. Ed. Engl. 51, 5447-5451 (2012). 167. Lomenick, B., et al. Target identification using drug affinity responsive target stability (DARTS). Proc. Natl. Acad. Sci. U.S.A. 106, 21984-21989 (2009). 168. Lomenick, B., Olsen, R.W. and Huang, J. Identification of direct protein targets of small molecules. ACS Chem. Biol. 6, 34-46 (2011). 169. Martinez Molina, D., et al. Monitoring drug target engagement in cells and tissues using the cellular thermal shift assay. Science 341, 84-87 (2013). 170. Savitski, M.M., et al. Tracking cancer drugs in living cells by thermal profiling of the proteome. Science 346, 1255784 (2014). 171. Chan, J.N., et al.
Target identification by chromatographic co-elution: monitoring of drug-protein interactions without immobilization or chemical derivatization. Mol. Cell. Proteomics 11, M111.016642 (2012). 172. Jonkheijm, P., Weinrich, D., Schroder, H., Niemeyer, C.M. and Waldmann, H. Chemical strategies for generating protein biochips. Angew. Chem. Int. Ed. Engl. 47, 9618-9647 (2008). 173. Zhu, H., et al. Global analysis of protein activities using proteome chips. Science 293, 2101-2105 (2001).

174. Heitman, J., Movva, N.R. and Hall, M.N. Targets for cell cycle arrest by the immunosuppressant rapamycin in yeast. Science 253, 905-909 (1991). 175. Giaever, G., et al. Genomic profiling of drug sensitivities via induced haploinsufficiency. Nat. Genet. 21, 278-283 (1999). 176. Licitra, E.J. and Liu, J.O. A three-hybrid system for detecting small ligand-protein receptor interactions. Proc. Natl. Acad. Sci. U.S.A. 93, 12817-12821 (1996). 177. Chan, J.N., Nislow, C. and Emili, A. Recent advances and method development for drug target identification. Trends Pharmacol. Sci. 31, 82-88 (2010). 178. Smith, A.M., Ammar, R., Nislow, C. and Giaever, G. A survey of yeast genomic assays for drug and target discovery. Pharmacol. Ther. 127, 156-164 (2010). 179. Wang, J., et al. Cellular phenotype recognition for high-content RNA interference genome-wide screening. J. Biomol. Screen. 13, 29-39 (2008). 180. Wacker, S.A., Houghtaling, B.R., Elemento, O. and Kapoor, T.M. Using transcriptome sequencing to identify mechanisms of drug action and resistance. Nat. Chem. Biol. 8, 235-237 (2012). 181. Ecker, G.F., Stockner, T. and Chiba, P. Computational models for prediction of interactions with ABC-transporters. Drug Discov. Today 13, 311-317 (2008). 182. Nigsch, F., Macaluso, N.J., Mitchell, J.B. and Zmuidinavicius, D. Computational toxicology: an overview of the sources of data and of modelling methods. Expert Opin. Drug Metab. Toxicol. 5, 1-14 (2009). 183. Wishart, D.S., et al. DrugBank: a knowledgebase for drugs, drug actions and drug targets. Nucleic Acids Res. 36, D901-906 (2008). 184. Kuhn, M., von Mering, C., Campillos, M., Jensen, L.J. and Bork, P. STITCH: interaction networks of chemicals and proteins. Nucleic Acids Res. 36, D684-688 (2008). 185. Chen, X., Ji, Z.L. and Chen, Y.Z. TTD: Therapeutic Target Database. Nucleic Acids Res. 30, 412-415 (2002). 186. Kim, M.S., et al. A draft map of the human proteome. Nature 509, 575-581 (2014). 187. Washburn, M.P., Wolters, D.
and Yates, J.R., 3rd Large-scale analysis of the yeast proteome by multidimensional protein identification technology. Nat. Biotechnol. 19, 242-247 (2001). 188. Huber, C. and Huber, L. Special focus on top-down proteomics. Proteomics 10, 3564-3565 (2010). 189. Zhang, Y., Fonslow, B.R., Shan, B., Baek, M.C. and Yates, J.R., 3rd Protein analysis by shotgun/bottom-up proteomics. Chem. Rev. 113, 2343-2394 (2013). 190. Yates, J.R., 3rd The revolution and evolution of shotgun proteomics for large-scale proteome analysis. J. Am. Chem. Soc. 135, 1629-1640 (2013). 191. Kalli, A., Smith, G.T., Sweredoski, M.J. and Hess, S. Evaluation and optimization of mass spectrometric settings during data-dependent acquisition mode: focus on LTQ-Orbitrap mass analyzers. J. Proteome Res. 12, 3071-3086 (2013). 192. Domon, B. and Aebersold, R. Mass spectrometry and protein analysis. Science 312, 212-217 (2006). 193. Ho, C.S., et al. Electrospray ionisation mass spectrometry: principles and clinical applications. Clin. Biochem. Rev. 24, 3-12 (2003). 194. Karas, M. and Kruger, R. Ion formation in MALDI: the cluster ionization mechanism. Chem. Rev. 103, 427-440 (2003). 195. Choudhary, C. and Mann, M. Decoding signalling networks by mass spectrometry-based proteomics. Nat. Rev. Mol. Cell. Biol. 11, 427-439 (2010). 196. Olsen, J.V., et al. Higher-energy C-trap dissociation for peptide modification analysis. Nat. Methods 4, 709-712 (2007). 197. Hoopmann, M.R. and Moritz, R.L. Current algorithmic solutions for peptide-based proteomics data generation and identification. Curr. Opin. Biotechnol. 24, 31-38 (2013). 198. Nesvizhskii, A.I. and Aebersold, R. Interpretation of shotgun proteomic data: the protein inference problem. Mol. Cell. Proteomics 4, 1419-1440 (2005). 199. Cox, J. and Mann, M. MaxQuant enables high peptide identification rates, individualized p.p.b.-range mass accuracies and proteome-wide protein quantification. Nat. Biotechnol. 26, 1367-1372 (2008). 200. Cox, J., et al. 
Andromeda: a peptide search engine integrated into the MaxQuant environment. J. Proteome Res. 10, 1794-1805 (2011). 201. Hagenstein, M.C. and Sewald, N. Chemical tools for activity-based proteomics. J. Biotechnol. 124, 56-73 (2006).

202. Wasinger, V.C., Zeng, M. and Yau, Y. Current status and advances in quantitative proteomic mass spectrometry. Int. J. Proteomics 2013, 180605 (2013). 203. Ong, S.E., et al. Stable isotope labeling by amino acids in cell culture, SILAC, as a simple and accurate approach to expression proteomics. Mol. Cell. Proteomics 1, 376-386 (2002). 204. Blagoev, B., Ong, S.E., Kratchmarova, I. and Mann, M. Temporal analysis of phosphotyrosine-dependent signaling networks by quantitative proteomics. Nat. Biotechnol. 22, 1139-1145 (2004). 205. Geiger, T., et al. Use of stable isotope labeling by amino acids in cell culture as a spike-in standard in quantitative proteomics. Nat. Protoc. 6, 147-157 (2011). 206. Ong, S.E., Li, X., Schenone, M., Schreiber, S.L. and Carr, S.A. Identifying cellular targets of small-molecule probes and drugs with biochemical enrichment and SILAC. Methods Mol. Biol. 803, 129-140 (2012). 207. Schmidt, A., Kellermann, J. and Lottspeich, F. A novel strategy for quantitative proteomics using isotope-coded protein labels. Proteomics 5, 4-15 (2005). 208. Ross, P.L., et al. Multiplexed protein quantitation in Saccharomyces cerevisiae using amine-reactive isobaric tagging reagents. Mol. Cell. Proteomics 3, 1154-1169 (2004). 209. Boersema, P.J., Raijmakers, R., Lemeer, S., Mohammed, S. and Heck, A.J. Multiplex peptide stable isotope dimethyl labeling for quantitative proteomics. Nat. Protoc. 4, 484-494 (2009). 210. Hsu, J.L., Huang, S.Y., Chow, N.H. and Chen, S.H. Stable-isotope dimethyl labeling for quantitative proteomics. Anal. Chem. 75, 6843-6852 (2003). 211. Gerber, S.A., Rush, J., Stemman, O., Kirschner, M.W. and Gygi, S.P. Absolute quantification of proteins and phosphoproteins from cell lysates by tandem MS. Proc. Natl. Acad. Sci. U.S.A. 100, 6940-6945 (2003). 212. Bantscheff, M., Lemeer, S., Savitski, M.M. and Kuster, B. Quantitative mass spectrometry in proteomics: critical review update from 2007 to the present. Anal. Bioanal. Chem. 404, 939-965 (2012). 213.
Lau, H.T., Suh, H.W., Golkowski, M. and Ong, S.E. Comparing SILAC- and stable isotope dimethyl-labeling approaches for quantitative proteomics. J. Proteome Res. 13, 4164-4174 (2014). 214. Li, Z., et al. Systematic comparison of label-free, metabolic labeling, and isobaric chemical labeling for quantitative proteomics on LTQ Orbitrap Velos. J. Proteome Res. 11, 1582-1590 (2012). 215. Wu, W.W., Wang, G., Baek, S.J. and Shen, R.F. Comparative study of three proteomic quantitative methods, DIGE, cICAT, and iTRAQ, using 2D gel- or LC-MALDI TOF/TOF. J. Proteome Res. 5, 651-658 (2006). 216. Thinon, E., et al. Global profiling of co- and post-translationally N-myristoylated proteomes in human cells. Nat. Commun. 5, 4919 (2014). 217. Breitkopf, S.B., Oppermann, F.S., Keri, G., Grammel, M. and Daub, H. Proteomics analysis of cellular imatinib targets and their candidate downstream effectors. J. Proteome Res. 9, 6033-6043 (2010). 218. Prokhorova, T.A., et al. Stable isotope labeling by amino acids in cell culture (SILAC) and quantitative comparison of the membrane proteomes of self-renewing and differentiating human embryonic stem cells. Mol. Cell. Proteomics 8, 959-970 (2009). 219. Ciepla, P., et al. New chemical probes targeting cholesterylation of Sonic Hedgehog in human cells and zebrafish. Chem. Sci. 5, 4249-4259 (2014). 220. Bozza, W.P., et al. The use of a stably expressed FRET biosensor for determining the potency of cancer drugs. PLoS One 9, e107010 (2014). 221. Sardiu, M.E. and Washburn, M.P. Building protein-protein interaction networks with proteomics and informatics tools. J. Biol. Chem. 286, 23645-23651 (2011). 222. Zhang, G.L., DeLuca, D.S. and Brusic, V. Database resources for proteomics-based analysis of cancer. Methods Mol. Biol. 723, 349-364 (2011). 223. Ashburner, M., et al. Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat. Genet. 25, 25-29 (2000). 224. Huang, D.W., Sherman, B.T. and Lempicki, R.A.
Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nat. Protoc. 4, 44-57 (2009). 225. Medina, I., et al. Babelomics: an integrative platform for the analysis of transcriptomics, proteomics and genomic data with advanced functional profiling. Nucleic Acids Res. 38, W210-213 (2010). 226. Croft, D., et al. Reactome: a database of reactions, pathways and biological processes. Nucleic Acids Res. 39, D691-697 (2011).

227. Kanehisa, M., Goto, S., Sato, Y., Furumichi, M. and Tanabe, M. KEGG for integration and interpretation of large-scale molecular data sets. Nucleic Acids Res. 40, D109-114 (2012). 228. Kandasamy, K., et al. NetPath: a public resource of curated signal transduction pathways. Genome Biol. 11, R3 (2010). 229. Brehme, M., et al. Charting the molecular network of the drug target Bcr-Abl. Proc. Natl. Acad. Sci. U.S.A. 106, 7414-7419 (2009). 230. Chatr-aryamontri, A., et al. MINT: the Molecular INTeraction database. Nucleic Acids Res. 35, D572-574 (2007). 231. Kerrien, S., et al. The IntAct molecular interaction database in 2012. Nucleic Acids Res. 40, D841-846 (2012). 232. Keshava Prasad, T.S., et al. Human Protein Reference Database--2009 update. Nucleic Acids Res. 37, D767-772 (2009). 233. Stark, C., et al. BioGRID: a general repository for interaction datasets. Nucleic Acids Res. 34, D535-539 (2006). 234. Snel, B., Lehmann, G., Bork, P. and Huynen, M.A. STRING: a web-server to retrieve and display the repeatedly occurring neighbourhood of a gene. Nucleic Acids Res. 28, 3442-3444 (2000). 235. Schmidt, A., Forne, I. and Imhof, A. Bioinformatic analysis of proteomics data. BMC Syst. Biol. 8 Suppl 2, S3 (2014). 236. Mizuarai, S., Irie, H., Schmatz, D.M. and Kotani, H. Integrated genomic and pharmacological approaches to identify synthetic lethal genes as cancer therapeutic targets. Curr. Mol. Med. 8, 774-783 (2008). 237. Ngai, M.H., et al. Click-based synthesis and proteomic profiling of lipstatin analogues. Chem. Commun. (Camb.) 46, 8335-8337 (2010). 238. Yang, P.Y., et al. Chemical modification and organelle-specific localization of orlistat-like natural-product-based probes. Chem. Asian. J. 6, 2762-2775 (2011). 239. Wymann, M.P., et al. Wortmannin inactivates phosphoinositide 3-kinase by covalent modification of Lys-802, a residue involved in the phosphate transfer reaction. Mol. Cell. Biol. 16, 1722-1733 (1996). 240. Wang, J., et al. 
Mapping sites of aspirin-induced acetylations in live cells by quantitative acid-cleavable activity-based protein profiling (QA-ABPP). Sci. Rep. 5, 7896 (2015). 241. Lanning, B.R., et al. A road map to evaluate the proteome-wide selectivity of covalent kinase inhibitors. Nat. Chem. Biol. 10, 760-767 (2014). 242. Lin, D., Li, J., Slebos, R.J. and Liebler, D.C. Cysteinyl peptide capture for shotgun proteomics: global assessment of chemoselective fractionation. J. Proteome Res. 9, 5461-5472 (2010). 243. Deng, X., et al. Proteome-wide quantification and characterization of oxidation-sensitive cysteines in pathogenic bacteria. Cell Host Microbe 13, 358-370 (2013). 244. Pace, N.J. and Weerapana, E. A competitive chemical-proteomic platform to identify zinc-binding cysteines. ACS Chem. Biol. 9, 258-265 (2014). 245. Codreanu, S.G., et al. Alkylation damage by lipid electrophiles targets functional protein systems. Mol. Cell. Proteomics 13, 849-859 (2014). 246. Yang, J., Gupta, V., Carroll, K.S. and Liebler, D.C. Site-specific mapping and quantification of protein S-sulphenylation in cells. Nat. Commun. 5, 4776 (2014). 247. Surh, Y.J. Cancer chemoprevention with dietary phytochemicals. Nat. Rev. Cancer 3, 768-780 (2003). 248. Deweerdt, S. FOOD The omnivore's labyrinth. Nature 471, S22-S24 (2011). 249. Hopkins, A.L. Network pharmacology: the next paradigm in drug discovery. Nat. Chem. Biol. 4, 682-690 (2008). 250. Anighoro, A., Bajorath, J. and Rastelli, G. Polypharmacology: challenges and opportunities in drug discovery. J. Med. Chem. 57, 7874-7887 (2014). 251. Lounkine, E., et al. Large-scale prediction and testing of drug activity on side-effect targets. Nature 486, 361-367 (2012). 252. Hanahan, D. and Weinberg, R.A. The hallmarks of cancer. Cell 100, 57-70 (2000). 253. Hanahan, D. and Weinberg, R.A. Hallmarks of cancer: the next generation. Cell 144, 646-674 (2011). 254. Turrini, E., Ferruzzi, L. and Fimognari, C.
Natural compounds to overcome cancer chemoresistance: toxicological and clinical issues. Expert Opin. Drug Metab. Toxicol. 10, 1677-1690 (2014). 255. Goel, A., Kunnumakkara, A.B. and Aggarwal, B.B. Curcumin as "Curecumin": from kitchen to clinic. Biochem. Pharmacol. 75, 787-809 (2008).

256. Patro, B.S., et al. Protective activities of some phenolic 1,3-diketones against lipid peroxidation: possible involvement of the 1,3-diketone moiety. Chembiochem 3, 364-370 (2002). 257. Epstein, J., Sanderson, I.R. and Macdonald, T.T. Curcumin as a therapeutic agent: the evidence from in vitro, animal and human studies. Br. J. Nutr. 103, 1545-1557 (2010). 258. Hasima, N. and Aggarwal, B.B. Cancer-linked targets modulated by curcumin. Int. J. Biochem. Mol. Biol. 3, 328-351 (2012). 259. Zhou, H., Beevers, C.S. and Huang, S. The targets of curcumin. Curr. Drug Targets 12, 332-347 (2011). 260. Shishodia, S., Singh, T. and Chaturvedi, M.M. Modulation of transcription factors by curcumin. Adv. Exp. Med. Biol. 595, 127-148 (2007). 261. Gupta, S.C., et al. Multitargeting by curcumin as revealed by molecular interaction studies. Nat. Prod. Rep. 28, 1937-1955 (2011). 262. Padhye, S., et al. Fluorocurcumins as cyclooxygenase-2 inhibitor: molecular docking, pharmacokinetics and tissue distribution in mice. Pharm. Res. 26, 2438-2445 (2009). 263. Liu, Z., et al. Curcumin is a potent DNA hypomethylation agent. Bioorg. Med. Chem. Lett. 19, 706-709 (2009). 264. Fang, J., Lu, J. and Holmgren, A. Thioredoxin reductase is irreversibly modified by curcumin: a novel molecular mechanism for its anticancer activity. J. Biol. Chem. 280, 25284-25290 (2005). 265. Jung, Y., Xu, W., Kim, H., Ha, N. and Neckers, L. Curcumin-induced degradation of ErbB2: A role for the E3 ubiquitin ligase CHIP and the Michael reaction acceptor activity of curcumin. Biochim. Biophys. Acta 1773, 383-390 (2007). 266. Leu, T.H., Su, S.L., Chuang, Y.C. and Maa, M.C. Direct inhibitory effect of curcumin on Src and focal adhesion kinase activity. Biochem. Pharmacol. 66, 2323-2331 (2003). 267. Majhi, A., Rahman, G.M., Panchal, S. and Das, J. Binding of curcumin and its long chain derivatives to the activator binding domain of novel protein kinase C. Bioorg. Med. Chem. 18, 1591-1598 (2010). 268.
Beevers, C.S., et al. Curcumin disrupts the Mammalian target of rapamycin-raptor complex. Cancer Res. 69, 1000-1008 (2009). 269. Bera, R., Sahoo, B.K., Ghosh, K.S. and Dasgupta, S. Studies on the interaction of isoxazolcurcumin with calf thymus DNA. Int. J. Biol. Macromol. 42, 14-21 (2008). 270. Nafisi, S., Adelzadeh, M., Norouzi, Z. and Sarbolouki, M.N. Curcumin binding to DNA and RNA. DNA Cell. Biol. 28, 201-208 (2009). 271. Aggarwal, B.B. and Sung, B. Pharmacological basis for the role of curcumin in chronic diseases: an age-old spice with modern targets. Trends Pharmacol. Sci. 30, 85-94 (2009). 272. Anand, P., Kunnumakkara, A.B., Newman, R.A. and Aggarwal, B.B. Bioavailability of curcumin: problems and promises. Mol. Pharm. 4, 807-818 (2007). 273. Gupta, S.C., et al. Multitargeting by turmeric, the golden spice: From kitchen to clinic. Mol. Nutr. Food Res. 57, 1510-1528 (2013). 274. Heger, M., van Golen, R.F., Broekgaarden, M. and Michel, M.C. The molecular basis for the pharmacokinetics and pharmacodynamics of curcumin and its metabolites in relation to cancer. Pharmacol. Rev. 66, 222-307 (2014). 275. Prasad, S., Gupta, S.C., Tyagi, A.K. and Aggarwal, B.B. Curcumin, a component of golden spice: from bedside to bench and back. Biotechnol. Adv. 32, 1053-1064 (2014). 276. Juge, N., Mithen, R.F. and Traka, M. Molecular basis for chemoprevention by sulforaphane: a comprehensive review. Cell. Mol. Life Sci. 64, 1105-1127 (2007). 277. Zhang, Y. and Tang, L. Discovery and development of sulforaphane as a cancer chemopreventive phytochemical. Acta Pharmacol. Sin. 28, 1343-1354 (2007). 278. Wu, X., Zhou, Q.H. and Xu, K. Are isothiocyanates potential anti-cancer drugs? Acta Pharmacol. Sin. 30, 501-512 (2009). 279. Gupta, P., Kim, B., Kim, S.H. and Srivastava, S.K. Molecular targets of isothiocyanates in cancer: recent advances. Mol. Nutr. Food Res. 58, 1685-1707 (2014). 280. Nakamura, Y. and Miyoshi, N.
Electrophiles in foods: the current status of isothiocyanates and their chemical biology. Biosci. Biotechnol. Biochem. 74, 242-255 (2010). 281. Mi, L., et al. The role of protein binding in induction of apoptosis by phenethyl isothiocyanate and sulforaphane in human non-small lung cancer cells. Cancer Res. 67, 6409-6416 (2007). 282. Zhang, Y. Role of glutathione in the accumulation of anticarcinogenic isothiocyanates and their glutathione conjugates by murine hepatoma cells. Carcinogenesis 21, 1175-1182 (2000).

283. Zhang, Y., Kolm, R.H., Mannervik, B. and Talalay, P. Reversible conjugation of isothiocyanates with glutathione catalyzed by human glutathione transferases. Biochem. Biophys. Res. Commun. 206, 748-755 (1995). 284. Cornblatt, B.S., et al. Preclinical and clinical evaluation of sulforaphane for chemoprevention in the breast. Carcinogenesis 28, 1485-1490 (2007). 285. Hu, K. and Morris, M.E. Effects of benzyl-, phenethyl-, and alpha-naphthyl isothiocyanates on P-glycoprotein- and MRP1-mediated transport. J. Pharm. Sci. 93, 1901-1911 (2004). 286. Mi, L.X., Di Pasqua, A.J. and Chung, F.L. Proteins as binding targets of isothiocyanates in cancer prevention. Carcinogenesis 32, 1405-1413 (2011). 287. Thimmulappa, R.K., et al. Identification of Nrf2-regulated genes induced by the chemopreventive agent sulforaphane by oligonucleotide microarray. Cancer Res. 62, 5196-5203 (2002). 288. Zhang, Y., Kensler, T.W., Cho, C.G., Posner, G.H. and Talalay, P. Anticarcinogenic activities of sulforaphane and structurally related synthetic norbornyl isothiocyanates. Proc. Natl. Acad. Sci. U.S.A. 91, 3147-3150 (1994). 289. Zhang, Y., Talalay, P., Cho, C.G. and Posner, G.H. A major inducer of anticarcinogenic protective enzymes from broccoli: isolation and elucidation of structure. Proc. Natl. Acad. Sci. U.S.A. 89, 2399-2403 (1992). 290. Kensler, T.W., et al. Keap1-nrf2 signaling: a target for cancer prevention by sulforaphane. Top. Curr. Chem. 329, 163-177 (2013). 291. Hong, F., Freeman, M.L. and Liebler, D.C. Identification of sensor cysteines in human Keap1 modified by the cancer chemopreventive agent sulforaphane. Chem. Res. Toxicol. 18, 1917-1926 (2005). 292. Hu, C., Eggler, A.L., Mesecar, A.D. and van Breemen, R.B. Modification of keap1 cysteine residues by sulforaphane. Chem. Res. Toxicol. 24, 515-521 (2011). 293. Gerhauser, C. Epigenetic impact of dietary isothiocyanates in cancer chemoprevention. Curr. Opin. Clin. Nutr. Metab. Care 16, 405-410 (2013). 294.
Gibbs, A., Schwartzman, J., Deng, V. and Alumkal, J. Sulforaphane destabilizes the androgen receptor in prostate cancer cells by inactivating histone deacetylase 6. Proc. Natl. Acad. Sci. U.S.A. 106, 16663-16668 (2009). 295. Brown, K.K., et al. Direct modification of the proinflammatory cytokine macrophage migration inhibitory factor by dietary isothiocyanates. J. Biol. Chem. 284, 32425-32433 (2009). 296. Cross, J.V., et al. Nutrient isothiocyanates covalently modify and inhibit the inflammatory cytokine macrophage migration inhibitory factor (MIF). Biochem. J. 423, 315-321 (2009). 297. Morris, M.E. and Dave, R.A. Pharmacokinetics and pharmacodynamics of phenethyl isothiocyanate: implications in breast cancer prevention. AAPS J. 16, 705-713 (2014). 298. Zhu, J., et al. Differential effects of phenethyl isothiocyanate and D,L-sulforaphane on TLR3 signaling. J. Immunol. 190, 4400-4407 (2013). 299. Singh, S.V., et al. Sulforaphane-induced cell death in human prostate cancer cells is initiated by reactive oxygen species. J. Biol. Chem. 280, 19911-19924 (2005). 300. Myzak, M.C. and Dashwood, R.H. Chemoprotection by sulforaphane: keep one eye beyond Keap1. Cancer Lett. 233, 208-218 (2006). 301. Rodrigues, R.V., et al. Antinociceptive effect of crude extract, fractions and three alkaloids obtained from fruits of Piper tuberculatum. Biol. Pharm. Bull. 32, 1809-1812 (2009). 302. Bezerra, D.P., et al. Piplartine induces inhibition of leukemia cell proliferation triggering both apoptosis and necrosis pathways. Toxicol. In Vitro 21, 1-8 (2007). 303. Fontenele, J.B., et al. Antiplatelet effects of piplartine, an alkamide isolated from Piper tuberculatum: possible involvement of cyclooxygenase blockade and antioxidant activity. J. Pharm. Pharmacol. 61, 511-515 (2009). 304. Bezerra, D.P., et al. Overview of the therapeutic potential of piplartine (piperlongumine). Eur. J. Pharm. Sci. 48, 453-463 (2013). 305. Wang, Y., et al. 
Piperlongumine induces autophagy by targeting p38 signaling. Cell Death Dis. 4, e824 (2013). 306. Makhov, P., et al. Piperlongumine promotes autophagy via inhibition of Akt/mTOR signalling and mediates cancer cell death. Br. J. Cancer 110, 899-907 (2014). 307. Shrivastava, S., et al. Piperlongumine, an alkaloid causes inhibition of PI3 K/Akt/mTOR signaling axis to induce caspase-dependent apoptosis in human triple-negative breast cancer cells. Apoptosis 19, 1148-1164 (2014). 308. Bezerra, D.P., et al. Evaluation of the genotoxicity of piplartine, an alkamide of Piper tuberculatum, in yeast and mammalian V79 cells. Mutat. Res. 652, 164-174 (2008).

309. Kong, E.H., et al. Piplartine induces caspase-mediated apoptosis in PC-3 human prostate cancer cells. Oncol. Rep. 20, 785-792 (2008). 310. Bezerra, D.P., et al. Antiproliferative effects of two amides, piperine and piplartine, from Piper species. Z. Naturforsch. C 60, 539-543 (2005). 311. Duh, C.Y., Wu, Y.C. and Wang, S.K. Cytotoxic pyridone alkaloids from the leaves of Piper aborescens. J. Nat. Prod. 53, 1575-1577 (1990). 312. Adams, D.J., et al. Synthesis, cellular evaluation, and mechanism of action of piperlongumine analogs. Proc. Natl. Acad. Sci. U.S.A. 109, 15115-15120 (2012). 313. Adams, D.J., et al. Discovery of small-molecule enhancers of reactive oxygen species that are nontoxic or cause genotype-selective cell death. ACS Chem. Biol. 8, 923-929 (2013). 314. Bharadwaj, U., et al. Drug-repositioning screening identified piperlongumine as a direct STAT3 inhibitor with potent activity against breast cancer. Oncogene 0 (2014). 315. Han, J.G., Gupta, S.C., Prasad, S. and Aggarwal, B.B. Piperlongumine chemosensitizes tumor cells through interaction with cysteine 179 of IkappaBalpha kinase, leading to suppression of NF-kappaB-regulated gene products. Mol. Cancer Ther. 13, 2422-2435 (2014). 316. Polireddy, K., et al. A novel flow cytometric HTS assay reveals functional modulators of ATP binding cassette transporter ABCB6. PLoS One 7, e40005 (2012). 317. Jarvius, M., et al. Piperlongumine induces inhibition of the ubiquitin-proteasome system in cancer cells. Biochem. Biophys. Res. Commun. 431, 117-123 (2013). 318. Baell, J.B. and Holloway, G.A. New substructure filters for removal of pan assay interference compounds (PAINS) from screening libraries and for their exclusion in bioassays. J. Med. Chem. 53, 2719-2740 (2010). 319. Baell, J. and Walters, M.A. Chemistry: Chemical con artists foil drug discovery. Nature 513, 481-483 (2014). 320. Jaramillo, M.C. and Zhang, D.D. The emerging role of the Nrf2-Keap1 signaling pathway in cancer. Genes Dev. 
27, 2179-2191 (2013). 321. Wang, X.J., et al. Nrf2 enhances resistance of cancer cells to chemotherapeutic drugs, the dark side of Nrf2. Carcinogenesis 29, 1235-1243 (2008). 322. Ahn, Y.H., et al. Electrophilic tuning of the chemoprotective natural product sulforaphane. Proc. Natl. Acad. Sci. U.S.A. 107, 9590-9595 (2010). 323. Mi, L., Xiao, Z., Veenstra, T.D. and Chung, F.L. Proteomic identification of binding targets of isothiocyanates: A perspective on techniques. J. Proteomics 74, 1036-1044 (2011). 324. Bar-Sela, G., Epelbaum, R. and Schaffer, M. Curcumin as an anti-cancer agent: review of the gap between basic and clinical applications. Curr. Med. Chem. 17, 190-197 (2010). 325. Houghton, C.A., Fassett, R.G. and Coombes, J.S. Sulforaphane: translational research from laboratory bench to clinic. Nutr. Rev. 71, 709-726 (2013). 326. Clulow, J.A. MRes Thesis, Imperial College London (2011). 327. Storck, E. MRes Thesis, Imperial College London (2011). 328. Speers, A.E. and Cravatt, B.F. Profiling enzyme activities in vivo using click chemistry methods. Chem. Biol. 11, 535-546 (2004). 329. Reddy, A.R., et al. A comprehensive review on SAR of curcumin. Mini Rev. Med. Chem. 13, 1769-1777 (2013). 330. Fuchs, J.R., et al. Structure-activity relationship studies of curcumin analogues. Bioorg. Med. Chem. Lett. 19, 2065-2069 (2009). 331. Qin, Z., et al. Synthesis and cytotoxic activity of novel curcumin analogues. Chinese Chemical Letters 19, 281-285 (2008). 332. Aggarwal, B.B., Deb, L. and Prasad, S. Curcumin differs from tetrahydrocurcumin for molecular targets, signaling pathways and cellular responses. Molecules 20, 185-205 (2015). 333. Labbozzetta, M., et al. Lack of nucleophilic addition in the isoxazole and pyrazole diketone modified analogs of curcumin; implications for their antitumor and chemosensitizing activities. Chem. Biol. Interact. 181, 29-36 (2009). 334. Simoni, D., et al. 
Antitumor effects of curcumin and structurally beta-diketone modified analogs on multidrug resistant cancer cells. Bioorg. Med. Chem. Lett. 18, 845-849 (2008). 335. Yamakoshi, H., et al. Structure-activity relationship of C5-curcuminoids and synthesis of their molecular probes thereof. Bioorg. Med. Chem. 18, 1083-1092 (2010). 336. Koeberle, A., et al. SAR studies on curcumin's pro-inflammatory targets: discovery of prenylated pyrazolocurcuminoids as potent and selective novel inhibitors of 5-lipoxygenase. J. Med. Chem. 57, 5638-5648 (2014).

337. Leong, S.W., et al. Synthesis and sar study of diarylpentanoid analogues as new anti-inflammatory agents. Molecules 19, 16058-16081 (2014). 338. Pabon, H.J.J. A synthesis of curcumin and related compounds. Recueil des Travaux Chimiques des Pays-Bas 83, 379-386 (1964). 339. Shi, W., et al. Synthesis of monofunctional curcumin derivatives, clicked curcumin dimer, and a PAMAM dendrimer curcumin conjugate for therapeutic applications. Org. Lett. 9, 5461-5464 (2007). 340. Lenhart, J.A., et al. "Clicked" bivalent ligands containing curcumin and cholesterol as multifunctional abeta oligomerization inhibitors: design, synthesis, and biological characterization. J. Med. Chem. 53, 6198-6209 (2010). 341. Ryu, E.K., Choe, Y.S., Lee, K.H., Choi, Y. and Kim, B.T. Curcumin and dehydrozingerone derivatives: synthesis, radiolabeling, and evaluation for beta-amyloid plaque imaging. J. Med. Chem. 49, 6111-6119 (2006). 342. Feng, J.Y. and Liu, Z.Q. Feruloylacetone as the model compound of half-curcumin: synthesis and antioxidant properties. Eur. J. Med. Chem. 46, 1198-1206 (2011). 343. Ferrari, E., et al. Synthesis, cytotoxic and combined cDDP activity of new stable curcumin derivatives. Bioorg. Med. Chem. 17, 3043-3052 (2009). 344. Milelli, A., et al. Isothiocyanate synthetic analogs: biological activities, structure-activity relationships and synthetic strategies. Mini Rev. Med. Chem. 14, 963-977 (2014). 345. El-Bayoumy, K. and Sinha, R. Mechanisms of mammary cancer chemoprevention by organoselenium compounds. Mutat. Res. 551, 181-197 (2004). 346. Sharma, A.K., et al. Synthesis and anticancer activity comparison of phenylalkyl isoselenocyanates with corresponding naturally occurring and synthetic isothiocyanates. J. Med. Chem. 51, 7820-7826 (2008). 347. Hu, K., et al. Synthesis and biological evaluation of sulforaphane derivatives as potential antitumor agents. Eur. J. Med. Chem. 64, 529-539 (2013). 348. Khiar, N., et al. 
Enantiopure sulforaphane analogues with various substituents at the sulfinyl sulfur: asymmetric synthesis and biological activities. J. Org. Chem. 74, 6002-6009 (2009). 349. Hwang, Y., et al. A selective chemical probe for coenzyme A-requiring enzymes. Angew. Chem. Int. Ed. Engl. 46, 7621-7624 (2007). 350. Posner, G.H., Cho, C.G., Green, J.V., Zhang, Y. and Talalay, P. Design and synthesis of bifunctional isothiocyanate analogs of sulforaphane: correlation between structure and potency as inducers of anticarcinogenic detoxication enzymes. J. Med. Chem. 37, 170-176 (1994). 351. D'Souza, C.A., Amin, S. and Desai, D. A facile and efficient synthesis of C-14-labelled sulforaphane. Journal of Labelled Compounds & Radiopharmaceuticals 46, 851-859 (2003). 352. Boskovic, Z.V., Hussain, M.M., Adams, D.J., Dai, M. and Schreiber, S.L. Synthesis of piperlogs and analysis of their effects on cells. Tetrahedron 69 (2013). 353. Sun, L.D., et al. Development and mechanism investigation of a new piperlongumine derivative as a potent anti-inflammatory agent. Biochem. Pharmacol. (2015). 354. Kepp, O., Galluzzi, L., Lipinski, M., Yuan, J. and Kroemer, G. Cell death assays for drug discovery. Nat. Rev. Drug Discov. 10, 221-237 (2011). 355. Berridge, M.V., Herst, P.M. and Tan, A.S. Tetrazolium dyes as tools in cell biology: new insights into their cellular reduction. Biotechnol. Annu. Rev. 11, 127-152 (2005). 356. Berridge, M.V. and Tan, A.S. Characterization of the cellular reduction of 3-(4,5-dimethylthiazol-2-yl)-2,5-diphenyltetrazolium bromide (MTT): subcellular localization, substrate dependence, and involvement of mitochondrial electron transport in MTT reduction. Arch. Biochem. Biophys. 303, 474-482 (1993). 357. Holliday, D.L. and Speirs, V. Choosing the right cell line for breast cancer research. Breast Cancer Res. 13, 215 (2011). 358. Neve, R.M., et al. A collection of breast cancer cell lines for the study of functionally distinct cancer subtypes. 
Cancer Cell 10, 515-527 (2006). 359. Burdall, S.E., Hanby, A.M., Lansdown, M.R. and Speirs, V. Breast cancer cell lines: friend or foe? Breast Cancer Res. 5, 89-95 (2003). 360. Liu, D. and Chen, Z. The effect of curcumin on breast cancer cells. J. Breast Cancer 16, 133-137 (2013). 361. Wang, Y.J., et al. Stability of curcumin in buffer solutions and characterization of its degradation products. J. Pharm. Biomed. Anal. 15, 1867-1876 (1997).

362. Gordon, O.N., Luis, P.B., Sintim, H.O. and Schneider, C. Unraveling Curcumin Degradation: Autoxidation Proceeds through Spiroepoxide and Vinylether Intermediates en route to the Main Bicyclopentadione. J. Biol. Chem. (2015). 363. Gordon, O.N. and Schneider, C. Vanillin and ferulic acid: not the major degradation products of curcumin. Trends Mol. Med. 18, 361-363; author reply 363-364 (2012). 364. Griesser, M., et al. Autoxidative and cyclooxygenase-2 catalyzed transformation of the dietary chemopreventive agent curcumin. J. Biol. Chem. 286, 1114-1124 (2011). 365. Epps, D.E. and Taylor, B.M. A competitive fluorescence assay to measure the reactivity of compounds. Anal. Biochem. 295, 101-106 (2001). 366. Couch, R.D., et al. Studies on the reactivity of CDDO, a promising new chemopreventive and chemotherapeutic agent: implications for a molecular mechanism of action. Bioorg. Med. Chem. Lett. 15, 2215-2219 (2005). 367. Aleksic, M., et al. Reactivity profiling: covalent modification of single nucleophile peptides for skin sensitization risk assessment. Toxicol. Sci. 108, 401-411 (2009). 368. MacFaul, P.A., Morley, A.D. and Crawford, J.J. A simple in vitro assay for assessing the reactivity of nitrile containing compounds. Bioorg. Med. Chem. Lett. 19, 1136-1138 (2009). 369. Amslinger, S., et al. Reactivity assessment of chalcones by a kinetic thiol assay. Org. Biomol. Chem. 11, 549-554 (2013). 370. Avonto, C., et al. An NMR spectroscopic method to identify and classify thiol-trapping agents: revival of Michael acceptors for drug discovery? Angew. Chem. Int. Ed. Engl. 50, 467-471 (2011). 371. Schwobel, J.A., et al. Measurement and estimation of electrophilic reactivity for predictive toxicology. Chem. Rev. 111, 2562-2596 (2011). 372. Green, N.M. Avidin. 3. The Nature of the Biotin-Binding Site. Biochem. J. 89, 599-609 (1963). 373. Michalski, A., et al. 
Mass spectrometry-based proteomics using Q Exactive, a high-performance benchtop quadrupole Orbitrap mass spectrometer. Mol. Cell. Proteomics 10, M111.011015 (2011). 374. Cox, J., et al. A practical guide to the MaxQuant computational platform for SILAC-based quantitative proteomics. Nat. Protoc. 4, 698-705 (2009). 375. Luber, C.A., et al. Quantitative proteomics reveals subset-specific viral recognition in dendritic cells. Immunity 32, 279-289 (2010). 376. Uruno, A. and Motohashi, H. The Keap1-Nrf2 system as an in vivo sensor for electrophiles. Nitric Oxide 25, 153-160 (2011). 377. Colzani, M., et al. Quantitative chemical proteomics identifies novel targets of the anti-cancer multi-kinase inhibitor E-3810. Mol. Cell. Proteomics 13, 1495-1509 (2014). 378. Adibekian, A., et al. Click-generated triazole ureas as ultrapotent in vivo-active serine hydrolase inhibitors. Nat. Chem. Biol. 7, 469-478 (2011). 379. Dominguez, E., et al. Integrated phenotypic and activity-based profiling links Ces3 to obesity and diabetes. Nat. Chem. Biol. 10, 113-121 (2014). 380. Bachovchin, D.A., et al. Superfamily-wide portrait of serine hydrolase inhibition achieved by library-versus-library screening. Proc. Natl. Acad. Sci. U.S.A. 107, 20941-20946 (2010). 381. Tsuboi, K., et al. Potent and selective inhibitors of glutathione S-transferase omega 1 that impair cancer drug resistance. J. Am. Chem. Soc. 133, 16605-16616 (2011). 382. Ong, S.E. and Mann, M. A practical recipe for stable isotope labeling by amino acids in cell culture (SILAC). Nat. Protoc. 1, 2650-2660 (2006). 383. Mann, M. Functional and quantitative proteomics using SILAC. Nat. Rev. Mol. Cell. Biol. 7, 952-958 (2006). 384. Cox, J. and Mann, M. Computational principles of determining and improving mass precision and accuracy for proteome measurements in an Orbitrap. J. Am. Soc. Mass. Spectrom. 20, 1477-1485 (2009). 385. Kaschani, F., et al. 
Selective inhibition of plant serine hydrolases by agrochemicals revealed by competitive ABPP. Bioorg. Med. Chem. 20, 597-600 (2012). 386. Zuhl, A.M., et al. Competitive activity-based protein profiling identifies aza-beta-lactams as a versatile chemotype for serine hydrolase inhibition. J. Am. Chem. Soc. 134, 5068-5071 (2012). 387. Wiedner, S.D., et al. Disparate proteome responses of pathogenic and nonpathogenic aspergilli to human serum measured by activity-based protein profiling (ABPP). Mol. Cell. Proteomics 12, 1791-1805 (2013). 388. Cross, J.V., et al. Nutrient isothiocyanates covalently modify and inhibit the inflammatory cytokine macrophage migration inhibitory factor (MIF). Biochem. J. 423, 315-321 (2009).

389. Salisbury, C.M. and Cravatt, B.F. Activity-based probes for proteomic profiling of histone deacetylase complexes. Proc. Natl. Acad. Sci. U.S.A. 104, 1171-1176 (2007). 390. Bando, H., et al. Expression of macrophage migration inhibitory factor in human breast cancer: association with nodal spread. Jpn. J. Cancer Res. 93, 389-396 (2002). 391. Conroy, H., Mawhinney, L. and Donnelly, S.C. Inflammation and cancer: macrophage migration inhibitory factor (MIF)--the potential missing link. QJM 103, 831-836 (2010). 392. Rosengren, E., et al. The immunoregulatory mediator macrophage migration inhibitory factor (MIF) catalyzes a tautomerization reaction. Mol. Med. 2, 143-149 (1996). 393. Kleemann, R., et al. Disulfide analysis reveals a role for macrophage migration inhibitory factor (MIF) as thiol-protein oxidoreductase. J. Mol. Biol. 280, 85-102 (1998). 394. Simpson, K.D., Templeton, D.J. and Cross, J.V. Macrophage migration inhibitory factor promotes tumor growth and metastasis by inducing myeloid-derived suppressor cells in the tumor microenvironment. J. Immunol. 189, 5533-5540 (2012). 395. Kawakami, M., Harada, N., Hiratsuka, M., Kawai, K. and Nakamura, Y. Dietary isothiocyanates modify mitochondrial functions through their electrophilic reaction. Biosci. Biotechnol. Biochem. 69, 2439-2444 (2005). 396. Ji, Y. and Morris, M.E. Effect of organic isothiocyanates on breast cancer resistance protein (ABCG2)-mediated transport. Pharm. Res. 21, 2261-2269 (2004). 397. Tseng, E., Kamath, A. and Morris, M.E. Effect of organic isothiocyanates on the P-glycoprotein- and MRP1-mediated transport of daunomycin and vinblastine. Pharm. Res. 19, 1509-1515 (2002). 398. Nomura, T., et al. Alkyl isothiocyanates suppress epidermal growth factor receptor kinase activity but augment tyrosine kinase activity. Cancer Epidemiol. 33, 288-292 (2009). 399. Mi, L., Gan, N. and Chung, F.L. Isothiocyanates inhibit proteasome activity and proliferation of multiple myeloma cells. 
Carcinogenesis 32, 216-223 (2011). 400. Sahu, R.P. and Srivastava, S.K. The role of STAT-3 in the induction of apoptosis in pancreatic cancer cells by benzyl isothiocyanate. J. Natl. Cancer Inst. 101, 176-193 (2009). 401. Lin, R.K., et al. Dietary isothiocyanate-induced apoptosis via thiol modification of DNA topoisomerase IIalpha. J. Biol. Chem. 286, 33591-33600 (2011). 402. Hu, Y., et al. Glutathione- and thioredoxin-related enzymes are modulated by sulfur-containing chemopreventive agents. Biol. Chem. 388, 1069-1081 (2007). 403. Zhang, B., Kirov, S. and Snoddy, J. WebGestalt: an integrated system for exploring gene sets in various biological contexts. Nucleic Acids Res. 33, W741-748 (2005). 404. Jackson, S.J. and Singletary, K.W. Sulforaphane inhibits human MCF-7 mammary cancer cell mitotic progression and tubulin polymerization. J. Nutr. 134, 2229-2236 (2004). 405. Jackson, S.J. and Singletary, K.W. Sulforaphane: a naturally occurring mammary carcinoma mitotic inhibitor, which disrupts tubulin polymerization. Carcinogenesis 25, 219-227 (2004). 406. Pledgie-Tracy, A., Sobolewski, M.D. and Davidson, N.E. Sulforaphane induces cell type-specific apoptosis in human breast cancer cell lines. Mol. Cancer Ther. 6, 1013-1021 (2007). 407. Goodrich, D.W. The retinoblastoma tumor-suppressor gene, the exception that proves the rule. Oncogene 25, 5233-5243 (2006). 408. Malumbres, M. and Barbacid, M. Cell cycle, CDKs and cancer: a changing paradigm. Nat. Rev. Cancer 9, 153-166 (2009). 409. Kanehisa, M. and Goto, S. KEGG: kyoto encyclopedia of genes and genomes. Nucleic Acids Res. 28, 27-30 (2000). 410. Guo, M. and Schimmel, P. Essential nontranslational functions of tRNA synthetases. Nat. Chem. Biol. 9, 145-153 (2013). 411. Kim, S., You, S. and Hwang, D. Aminoacyl-tRNA synthetases and tumorigenesis: more than housekeeping. Nat. Rev. Cancer 11, 708-718 (2011). 412. Shannon, P., et al. 
Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res. 13, 2498-2504 (2003). 413. Hopkins, A.L. Network pharmacology. Nat. Biotechnol. 25, 1110-1111 (2007). 414. Nepusz, T., Yu, H. and Paccanaro, A. Detecting overlapping protein complexes in protein-protein interaction networks. Nat. Methods 9, 471-472 (2012). 415. Aggarwal, B.B., et al. Targeting signal-transducer-and-activator-of-transcription-3 for prevention and therapy of cancer: modern target but ancient solution. Ann. N. Y. Acad. Sci. 1091, 151-169 (2006). 416. Chung, S.S., Aroh, C. and Vadgama, J.V. Constitutive activation of STAT3 signaling regulates hTERT and promotes stem cell-like traits in human breast cancer cells. PLoS One 8, e83971 (2013).

417. Glukhov, A.I., Svinareva, L.V., Severin, S.E. and Shvets, V.I. Telomerase Inhibitors as Novel Antitumor Drugs. Applied Biochemistry and Microbiology 47, 655-660 (2011). 418. Hahm, E.R. and Singh, S.V. Sulforaphane inhibits constitutive and interleukin-6-induced activation of signal transducer and activator of transcription 3 in prostate cancer cells. Cancer Prev. Res. (Phila.) 3, 484-494 (2010). 419. Jeong, W.S., Kim, I.W., Hu, R. and Kong, A.N. Modulatory properties of various natural chemopreventive agents on the activation of NF-kappaB signaling pathway. Pharm. Res. 21, 661-670 (2004). 420. Xu, C., Shen, G., Chen, C., Gelinas, C. and Kong, A.N. Suppression of NF-kappaB and NF-kappaB-regulated gene expression by sulforaphane and PEITC through IkappaBalpha, IKK pathway in human prostate cancer PC-3 cells. Oncogene 24, 4486-4495 (2005). 421. Ben-Neriah, Y. and Karin, M. Inflammation meets cancer, with NF-kappaB as the matchmaker. Nat. Immunol. 12, 715-723 (2011). 422. Youn, H.S., et al. Sulforaphane suppresses oligomerization of TLR4 in a thiol-dependent manner. J. Immunol. 184, 411-419 (2010). 423. Heiss, E., Herhaus, C., Klimo, K., Bartsch, H. and Gerhauser, C. Nuclear factor kappa B is a molecular target for sulforaphane-mediated anti-inflammatory mechanisms. J. Biol. Chem. 276, 32008-32015 (2001). 424. Fan, Y., Mao, R. and Yang, J. NF-kappaB and STAT3 signaling pathways collaboratively link inflammation to cancer. Protein Cell 4, 176-185 (2013). 425. Lee, H., et al. Persistently activated Stat3 maintains constitutive NF-kappaB activity in tumors. Cancer Cell 15, 283-293 (2009). 426. Yang, J., et al. Unphosphorylated STAT3 accumulates in response to IL-6 and activates transcription by binding to NFkappaB. Genes Dev. 21, 1396-1408 (2007). 427. Yu, Z., Zhang, W. and Kone, B.C. Signal transducers and activators of transcription 3 (STAT3) inhibits transcription of the inducible nitric oxide synthase gene by interacting with nuclear factor kappaB. Biochem. J. 
367, 97-105 (2002). 428. Juge, N., Mithen, R.F. and Traka, M. Molecular basis for chemoprevention by sulforaphane: a comprehensive review. Cell. Mol. Life Sci. 64, 1105-1127 (2007). 429. Bernardi, R., Papa, A. and Pandolfi, P.P. Regulation of apoptosis by PML and the PML-NBs. Oncogene 27, 6299-6312 (2008). 430. Enari, M., et al. A caspase-activated DNase that degrades DNA during apoptosis, and its inhibitor ICAD. Nature 391, 43-50 (1998). 431. McCommis, K.S. and Baines, C.P. The role of VDAC in cell death: friend or foe? Biochim. Biophys. Acta 1818, 1444-1450 (2012). 432. Rosati, A., Graziano, V., De Laurenzi, V., Pascale, M. and Turco, M.C. BAG3: a multifaceted protein that regulates major cell pathways. Cell Death Dis. 2, e141 (2011). 433. Zhu, L., Xiang, R., Dong, W., Liu, Y. and Qi, Y. Anti-apoptotic activity of Bcl-2 is enhanced by its interaction with RTN3. Cell Biol. Int. 31, 825-830 (2007). 434. Herman-Antosiewicz, A., Johnson, D.E. and Singh, S.V. Sulforaphane causes autophagy to inhibit release of cytochrome C and apoptosis in human prostate cancer cells. Cancer Res. 66, 5828-5835 (2006). 435. Kanematsu, S., et al. Autophagy inhibition enhances sulforaphane-induced apoptosis in human breast cancer cells. Anticancer Res. 30, 3381-3390 (2010). 436. Nishikawa, T., et al. Inhibition of autophagy potentiates sulforaphane-induced apoptosis in human colon cancer cells. Ann. Surg. Oncol. 17, 592-602 (2010). 437. Mizushima, N., Levine, B., Cuervo, A.M. and Klionsky, D.J. Autophagy fights disease through cellular self-digestion. Nature 451, 1069-1075 (2008). 438. Pawlik, A., Wiczk, A., Kaczynska, A., Antosiewicz, J. and Herman-Antosiewicz, A. Sulforaphane inhibits growth of phenotypically different breast cancer cells. Eur. J. Nutr. 52, 1949-1958 (2013). 439. Pyo, J.O., Nah, J. and Jung, Y.K. Molecules and their functions in autophagy. Exp. Mol. Med. 44, 73-80 (2012). 440. Vyas, A.R., et al. 
Chemoprevention of prostate cancer by d,l-sulforaphane is augmented by pharmacological inhibition of autophagy. Cancer Res. 73, 5985-5995 (2013). 441. Wang, M., et al. Effects of co-treatment with sulforaphane and autophagy modulators on uridine 5'-diphospho-glucuronosyltransferase 1A isoforms and cytochrome P450 3A4 expression in Caco-2 human colon cancer cells. Oncol. Lett. 8, 2407-2416 (2014). 442. Shen, S., et al. Cytoplasmic STAT3 represses autophagy by inhibiting PKR activity. Mol. Cell 48, 667-680 (2012).

443. Wu, M.Y., Fu, J., Xu, J., O'Malley, B.W. and Wu, R.C. Steroid receptor coactivator 3 regulates autophagy in breast cancer cells through macrophage migration inhibitory factor. Cell Res. 22, 1003-1021 (2012). 444. Qin, F., Tian, J., Zhou, D. and Chen, L. Mst1 and Mst2 kinases: regulations and diseases. Cell Biosci. 3, 31 (2013). 445. Lo, H.W. and Hung, M.C. Nuclear EGFR signalling network in cancers: linking EGFR pathway to cell cycle progression, nitric oxide pathway and patient survival. Br. J. Cancer 94, 184-188 (2006). 446. Abbaoui, B., et al. Inhibition of bladder cancer by broccoli isothiocyanates sulforaphane and erucin: characterization, metabolism, and interconversion. Mol. Nutr. Food Res. 56, 1675-1687 (2012). 447. Kim, J.H., et al. Inhibition of EGFR signaling in human prostate cancer PC-3 cells by combination treatment with beta-phenylethyl isothiocyanate and curcumin. Carcinogenesis 27, 475-482 (2006). 448. Bulusu, K.C., Tym, J.E., Coker, E.A., Schierz, A.C. and Al-Lazikani, B. canSAR: updated cancer research and drug discovery knowledgebase. Nucleic Acids Res. 42, D1040-1047 (2014). 449. Shapiro, T.A., et al. Safety, tolerance, and metabolism of broccoli sprout and isothiocyanates: a clinical phase I study. Nutr. Cancer 55, 53-62 (2006). 450. Ye, L., et al. Quantitative determination of dithiocarbamates in human plasma, serum, erythrocytes and urine: pharmacokinetics of broccoli sprout isothiocyanates in humans. Clin. Chim. Acta 316, 43-53 (2002). 451. Song, L., Morrison, J.J., Botting, N.P. and Thornalley, P.J. Analysis of glucosinolates, isothiocyanates, and amine degradation products in vegetable extracts and blood plasma by LC-MS/MS. Anal. Biochem. 347, 234-243 (2005). 452. Heath, D.D., Pruitt, M.A., Brenner, D.E. and Rock, C.L. Curcumin in plasma and urine: quantitation by high-performance liquid chromatography. J. Chromatogr. B Analyt. Technol. Biomed. Life Sci. 783, 287-295 (2003). 453. Munoz-Sanchez, J. and Chanez-Cardenas, M.E. 
A review on hemeoxygenase-2: focus on cellular protection and oxygen response. Oxid. Med. Cell. Longev. 2014, 604981 (2014). 454. Bilmen, J.G., Khan, S.Z., Javed, M.H. and Michelangeli, F. Inhibition of the SERCA Ca2+ pumps by curcumin. Curcumin putatively stabilizes the interaction between the nucleotide-binding and phosphorylation domains in the absence of ATP. Eur. J. Biochem. 268, 6318-6327 (2001). 455. Wang, L., et al. Targeting sarcoplasmic/endoplasmic reticulum Ca(2)+-ATPase 2 by curcumin induces ER stress-associated apoptosis for treating human liposarcoma. Mol. Cancer Ther. 10, 461-471 (2011). 456. Dairaku, I., Han, Y., Yanaka, N. and Kato, N. Inhibitory effect of curcumin on IMP dehydrogenase, the target for anticancer and antiviral chemotherapy agents. Biosci. Biotechnol. Biochem. 74, 185-187 (2010). 457. Ciolino, H.P., Daschner, P.J., Wang, T.T. and Yeh, G.C. Effect of curcumin on the aryl hydrocarbon receptor and cytochrome P450 1A1 in MCF-7 human breast carcinoma cells. Biochem. Pharmacol. 56, 197-206 (1998). 458. Gupta, K.K., Bharne, S.S., Rathinasamy, K., Naik, N.R. and Panda, D. Dietary antioxidant curcumin inhibits microtubule assembly through tubulin binding. FEBS J. 273, 5320-5332 (2006). 459. Mi, L., et al. Covalent binding to tubulin by isothiocyanates. A mechanism of cell growth arrest and apoptosis. J. Biol. Chem. 283, 22136-22146 (2008). 460. Bustanji, Y., et al. Inhibition of glycogen synthase kinase by curcumin: Investigation by simulated molecular docking and subsequent in vitro/in vivo evaluation. J. Enzyme Inhib. Med. Chem. 24, 771-778 (2009). 461. Reddy, S. and Aggarwal, B.B. Curcumin is a non-competitive and selective inhibitor of phosphorylase kinase. FEBS Lett. 341, 19-22 (1994). 462. Singh, S. and Aggarwal, B.B. Activation of transcription factor NF-kappa B is suppressed by curcumin (diferuloylmethane) [corrected]. J. Biol. Chem. 270, 24995-25000 (1995). 463. Aglipay, J.A., Martin, S.A., Tawara, H., Lee, S.W. and Ouchi, T. 
ATM activation by ionizing radiation requires BRCA1-associated BAAT1. J. Biol. Chem. 281, 9710-9718 (2006). 464. Kadmiel, M. and Cidlowski, J.A. Glucocorticoid receptor signaling in health and disease. Trends Pharmacol. Sci. 34, 518-530 (2013).

465. Nikonova, A.S., Astsaturov, I., Serebriiskii, I.G., Dunbrack, R.L., Jr. and Golemis, E.A. Aurora A kinase (AURKA) in normal and pathological cell division. Cell. Mol. Life Sci. 70, 661-687 (2013). 466. Wang, L.H., et al. The mitotic kinase Aurora-A induces mammary cell migration and breast cancer metastasis by activating the Cofilin-F-actin pathway. Cancer Res. 70, 9118-9128 (2010). 467. Dar, A.A., Goff, L.W., Majid, S., Berlin, J. and El-Rifai, W. Aurora kinase inhibitors--rising stars in cancer therapeutics? Mol. Cancer Ther. 9, 268-278 (2010). 468. Carrassa, L. and Damia, G. Unleashing Chk1 in cancer therapy. Cell Cycle 10, 2121-2128 (2011). 469. Thompson, R. and Eastman, A. The cancer therapeutic potential of Chk1 inhibitors: how mechanistic studies impact on clinical trial design. Br. J. Clin. Pharmacol. 76, 358-369 (2013). 470. Hett, E.C., et al. Rational Targeting of Active-Site Tyrosine Residues Using Sulfonyl Fluoride Probes. ACS Chem. Biol. (2015). 471. Serafimova, I.M., et al. Reversible targeting of noncatalytic cysteines with chemically tuned electrophiles. Nat. Chem. Biol. 8, 471-476 (2012). 472. Liu, Q., et al. Developing irreversible inhibitors of the protein kinase cysteinome. Chem. Biol. 20, 146-159 (2013). 473. London, N., et al. Covalent docking of large libraries for the discovery of chemical probes. Nat. Chem. Biol. 10, 1066-1072 (2014). 474. Miller, R.M. and Taunton, J. Targeting protein kinases with selective and semipromiscuous covalent inhibitors. Methods Enzymol. 548, 93-116 (2014). 475. Simon, G.M., Niphakis, M.J. and Cravatt, B.F. Determining target engagement in living systems. Nat. Chem. Biol. 9, 200-205 (2013). 476. Aebersold, R., Burlingame, A.L. and Bradshaw, R.A. Western blots versus selected reaction monitoring assays: time to turn the tables? Mol. Cell. Proteomics 12, 2381-2382 (2013). 477. Picotti, P. and Aebersold, R. Selected reaction monitoring-based proteomics: workflows, potential, pitfalls and future directions. Nat. 
Methods 9, 555-566 (2012). 478. Chew, E.H., et al. Cinnamaldehydes inhibit thioredoxin reductase and induce Nrf2: potential candidates for cancer therapy and chemoprevention. Free Radic. Biol. Med. 48, 98-111 (2010). 479. Liu, X., Pietsch, K.E. and Sturla, S.J. Susceptibility of the antioxidant selenoenzymes thioredoxin reductase and glutathione peroxidase to alkylation-mediated inhibition by anticancer acylfulvenes. Chem. Res. Toxicol. 24, 726-736 (2011). 480. Yokota, Y., Bargagna-Mohan, P., Ravindranath, P.P., Kim, K.B. and Mohan, R. Development of withaferin A analogs as probes of angiogenesis. Bioorg. Med. Chem. Lett. 16, 2603-2607 (2006). 481. Fornerod, M., Ohno, M., Yoshida, M. and Mattaj, I.W. CRM1 is an export receptor for leucine-rich nuclear export signals. Cell 90, 1051-1060 (1997). 482. Fenteany, G., et al. Inhibition of proteasome activities and subunit-specific amino-terminal threonine modification by lactacystin. Science 268, 726-731 (1995). 483. Usui, T., et al. The anticancer natural product pironetin selectively targets Lys352 of alpha-tubulin. Chem. Biol. 11, 799-806 (2004). 484. Katsetos, C.D. and Draber, P. Tubulins as therapeutic targets in cancer: from bench to bedside. Curr. Pharm. Des. 18, 2778-2792 (2012). 485. Neckers, L. and Workman, P. Hsp90 molecular chaperone inhibitors: are we there yet? Clin. Cancer Res. 18, 64-76 (2012). 486. Broncel, M., et al. Multifunctional Reagents for Quantitative Proteome-Wide Analysis of Protein Modification in Human Cells and Dynamic Profiling of Protein Lipidation During Vertebrate Development. Angew. Chem. Int. Ed. Engl. (2015). 487. Wright, M.H., et al. Global Analysis of Protein N-Myristoylation and Exploration of N-Myristoyltransferase as a Drug Target in the Neglected Human Pathogen Leishmania donovani. Chem. Biol. 22, 342-354 (2015). 488. Fonovic, M., Verhelst, S.H., Sorum, M.T. and Bogyo, M. Proteomics evaluation of chemically cleavable activity-based probes. Mol. Cell. 
Proteomics 6, 1761-1770 (2007). 489. Yang, Y., Hahne, H., Kuster, B. and Verhelst, S.H. A simple and effective cleavable linker for chemical proteomics applications. Mol. Cell. Proteomics 12, 237-244 (2013). 490. Battenberg, O.A., Yang, Y., Verhelst, S.H. and Sieber, S.A. Target profiling of 4- hydroxyderricin in S. aureus reveals seryl-tRNA synthetase binding and inhibition by covalent modification. Mol. Biosyst. 9, 343-351 (2013).

218

491. Wirth, T., Schmuck, K., Tietze, L.F. and Sieber, S.A. Duocarmycin analogues target aldehyde dehydrogenase 1 in lung cancer cells. Angew. Chem. Int. Ed. Engl. 51, 2874-2877 (2012). 492. Hong, F., Freeman, M.L. and Liebler, D.C. Identification of sensor cysteines in human Keap1 modified by the cancer chemopreventive agent sulforaphane. Chem. Res. Toxicol. 18, 1917- 1926 (2005). 493. Mohamed, M.M. and Sloane, B.F. Cysteine cathepsins: multifunctional enzymes in cancer. Nat. Rev. Cancer 6, 764-775 (2006). 494. Gondi, C.S. and Rao, J.S. Cathepsin B as a cancer target. Expert Opin. Ther. Targets 17, 281-291 (2013). 495. Messina, A., Reina, S., Guarino, F. and De Pinto, V. VDAC isoforms in mammals. Biochim. Biophys. Acta 1818, 1466-1476 (2012). 496. Hashemy, S.I. and Holmgren, A. Regulation of the catalytic activity and structure of human thioredoxin 1 via oxidation and S-nitrosylation of cysteine residues. J. Biol. Chem. 283, 21890-21898 (2008). 497. Guantai, E.M., et al. Design, synthesis and in vitro antimalarial evaluation of triazole-linked chalcone and dienone hybrid compounds. Bioorg. Med. Chem. 18, 8243-8256 (2010). 498. Hans, R.H., et al. Synthesis, antimalarial and antitubercular activity of acetylenic chalcones. Bioorg. Med. Chem. Lett. 20, 942-944 (2010). 499. Link, M., Li, X.H., Kleim, J. and Wolfbeis, O.S. Click Chemistry Based Method for the Preparation of Maleinimide-Type Thiol-Reactive Labels. European Journal of Organic Chemistry, 6922-6927 (2010). 500. Welser, K., Perera, M.D., Aylott, J.W. and Chan, W.C. A facile method to clickable sensing polymeric nanoparticles. Chem. Commun. (Camb.), 6601-6603 (2009). 501. Macpherson, L.J., et al. Noxious compounds activate TRPA1 ion channels through covalent modification of cysteines. Nature 445, 541-545 (2007). 502. Wang, L.Y., et al. Synthesis and properties of 1-(2-(alkylamino)-2-oxoethyl) pyridinium chloride surfactants. Journal of Chemical Research, 205-207 (2013). 503. Charlton, T. 
MRes Thesis, Imperial College London (2011). 504. Amslinger, S. The tunable functionality of alpha,beta-unsaturated carbonyl compounds enables their differential application in biological systems. ChemMedChem 5, 351-356 (2010). 505. Ekkebus, R., et al. On terminal alkynes that can react with active-site cysteine nucleophiles in proteases. J. Am. Chem. Soc. 135, 2867-2870 (2013). 506. Sommer, S., Weikart, N.D., Linne, U. and Mootz, H.D. Covalent inhibition of SUMO and ubiquitin-specific cysteine proteases by an in situ thiol-alkyne addition. Bioorg. Med. Chem. 21, 2511-2517 (2013). 507. Arkona, C. and Rademann, J. Propargyl amides as irreversible inhibitors of cysteine proteases--a lesson on the biological reactivity of alkynes. Angew. Chem. Int. Ed. Engl. 52, 8210-8212 (2013). 508. Shenhav, H. and Patai, S. Nucleophilic Attacks on Carbon-Carbon Double Bonds .12. Addition of Amines to Electrophilic Olefins and Reacitvity Order of Activating Groups. Journal of the Chemical Society B-Physical Organic, 469 (1970). 509. Domadia, P., Swarup, S., Bhunia, A., Sivaraman, J. and Dasgupta, D. Inhibition of bacterial cell division protein FtsZ by cinnamaldehyde. Biochem. Pharmacol. 74, 831-840 (2007). 510. Plaisier, C., et al. Effects of cinnamaldehyde on the glucose transport activity of GLUT1. Biochimie 93, 339-344 (2011). 511. Youn, H.S., et al. Cinnamaldehyde suppresses toll-like receptor 4 activation mediated through the inhibition of receptor oligomerization. Biochem. Pharmacol. 75, 494-502 (2008). 512. Andersen, J.S., et al. Nucleolar proteome dynamics. Nature 433, 77-83 (2005). 513. Erde, J., Loo, R.R. and Loo, J.A. Enhanced FASP (eFASP) to increase proteome coverage and sample recovery for quantitative proteomic experiments. J. Proteome Res. 13, 1885-1895 (2014). 514. Wisniewski, J.R., Zougman, A., Nagaraj, N. and Mann, M. Universal sample preparation method for proteome analysis. Nat. Methods 6, 359-362 (2009). 515. Cai, X.Z., et al. 
Inhibitory effects of curcumin on gastric cancer cells: a proteomic study of molecular targets. Phytomedicine 20, 495-505 (2013). 516. Thangapazham, R.L., et al. Androgen responsive and refractory prostate cancer cells exhibit distinct curcumin regulated transcriptome. Cancer Biol. Ther. 7, 1427-1435 (2008). 517. Zhu, D.J., et al. Proteomic analysis identifies proteins associated with curcumin-enhancing efficacy of irinotecan-induced apoptosis of colorectal cancer LOVO cell. Int. J. Clin. Exp. Pathol. 7, 1-15 (2014).

219

518. Agyeman, A.S., et al. Transcriptomic and proteomic profiling of KEAP1 disrupted and sulforaphane-treated human breast epithelial cells reveals common expression profiles. Breast Cancer Res. Treat. 132, 175-187 (2012). 519. Mastrangelo, L., Cassidy, A., Mulholland, F., Wang, W. and Bao, Y. Serotonin receptors, novel targets of sulforaphane identified by proteomic analysis in Caco-2 cells. Cancer Res. 68, 5487-5491 (2008). 520. Jia, J., et al. Mechanisms of drug combinations: interaction and network perspectives. Nat. Rev. Drug Discov. 8, 111-128 (2009). 521. Facchetti, G., Zampieri, M. and Altafini, C. Predicting and characterizing selective multiple drug treatments for metabolic diseases and cancer. BMC Syst. Biol. 6, 115 (2012). 522. He, L., Wennerberg, K., Aittokallio, T. and Tang, J. TIMMA-R: an R package for predicting synergistic multi-targeted drug combinations in cancer cell lines or patient-derived samples. Bioinformatics (2015). 523. Havaleshko, D.M., et al. Prediction of drug combination chemosensitivity in human bladder cancer. Mol. Cancer Ther. 6, 578-586 (2007). 524. Mohan, A., Narayanan, S., Sethuraman, S. and Krishnan, U.M. Combinations of plant & anti-cancer molecules: a novel treatment strategy for cancer chemotherapy. Anticancer Agents Med. Chem. 13, 281-295 (2013). 525. Kallifatidis, G., et al. Sulforaphane increases drug-mediated cytotoxicity toward cancer stem- like cells of pancreas and prostate. Mol. Ther. 19, 188-195 (2011). 526. Montopoli, M., Ragazzi, E., Froldi, G. and Caparrotta, L. Cell-cycle inhibition and apoptosis induced by curcumin and cisplatin or oxaliplatin in human ovarian carcinoma cells. Cell Prolif. 42, 195-206 (2009). 527. Fimognari, C., Lenzi, M., Sciuscio, D., Cantelli-Forti, G. and Hrelia, P. Combination of doxorubicin and sulforaphane for reversing doxorubicin-resistant phenotype in mouse fibroblasts with p53Ser220 mutation. Ann. N. Y. Acad. Sci. 1095, 62-69 (2007). 528. Sen, G.S., et al. 
Curcumin enhances the efficacy of chemotherapy by tailoring p65NFkappaB- p300 cross-talk in favor of p53-p300 in breast cancer. J. Biol. Chem. 286, 42232-42247 (2011). 529. Hussain, A., et al. Sulforaphane inhibits growth of human breast cancer cells and augments the therapeutic index of the chemotherapeutic drug, gemcitabine. Asian Pac. J. Cancer Prev. 14, 5855-5860 (2013). 530. Wang, X.F., Wu, D.M., Li, B.X., Lu, Y.J. and Yang, B.F. Synergistic inhibitory effect of sulforaphane and 5-fluorouracil in high and low metastasis cell lines of salivary gland adenoid cystic carcinoma. Phytother. Res. 23, 303-307 (2009). 531. Vinod, B.S., et al. Mechanistic evaluation of the signaling events regulating curcumin- mediated chemosensitization of breast cancer cells to 5-fluorouracil. Cell Death Dis. 4, e505 (2013). 532. Chen, H., Landen, C.N., Li, Y., Alvarez, R.D. and Tollefsbol, T.O. Epigallocatechin gallate and sulforaphane combination treatment induce apoptosis in paclitaxel-resistant ovarian cancer cells through hTERT and Bcl-2 down-regulation. Exp. Cell Res. 319, 697-706 (2013). 533. Khafif, A., Schantz, S.P., Chou, T.C., Edelstein, D. and Sacks, P.G. Quantitation of chemopreventive synergism between (-)-epigallocatechin-3-gallate and curcumin in normal, premalignant and malignant human oral epithelial cells. Carcinogenesis 19, 419-424 (1998). 534. Du, Q., et al. Synergistic anticancer effects of curcumin and resveratrol in Hepa1-6 hepatocellular carcinoma cells. Oncol. Rep. 29, 1851-1858 (2013). 535. Jiang, H., et al. Combination treatment with resveratrol and sulforaphane induces apoptosis in human U251 glioma cells. Neurochem. Res. 35, 152-161 (2010). 536. Cheung, K.L., Khor, T.O. and Kong, A.N. Synergistic effect of combination of phenethyl isothiocyanate and sulforaphane or curcumin and sulforaphane in the inhibition of inflammation. Pharm. Res. 26, 224-231 (2009). 537. Thakkar, A., Sutaria, D., Grandhi, B.K., Wang, J. and Prabhu, S. 
The molecular mechanism of action of aspirin, curcumin and sulforaphane combinations in the chemoprevention of pancreatic cancer. Oncol. Rep. 29, 1671-1677 (2013). 538. Jyothi, D., et al. Diferuloylmethane augments the cytotoxic effects of piplartine isolated from Piper chaba. Toxicol. In Vitro 23, 1085-1091 (2009). 539. Li, Y., Zhang, T., Schwartz, S.J. and Sun, D. Sulforaphane potentiates the efficacy of 17- allylamino 17-demethoxygeldanamycin against pancreatic cancer through enhanced abrogation of Hsp90 chaperone function. Nutr. Cancer 63, 1151-1159 (2011).

220

540. Rao, D.K., Liu, H., Ambudkar, S.V. and Mayer, M. A combination of curcumin with either gramicidin or ouabain selectively kills cells that express the multidrug resistance-linked ABCG2 transporter. J. Biol. Chem. 289, 31397-31410 (2014). 541. Jakubikova, J., et al. Anti-tumor activity and signaling events triggered by the isothiocyanates, sulforaphane and phenethyl isothiocyanate, in multiple myeloma. Haematologica 96, 1170- 1179 (2011). 542. Nautiyal, J., Kanwar, S.S., Yu, Y. and Majumdar, A.P. Combination of dasatinib and curcumin eliminates chemo-resistant colon cancer cells. J. Mol. Signal 6, 7 (2011). 543. Tinoco, G., Warsch, S., Gluck, S., Avancha, K. and Montero, A.J. Treating breast cancer in the 21st century: emerging biological therapies. J. Cancer 4, 117-132 (2013). 544. Bijnsdorp, I.V., Giovannetti, E. and Peters, G.J. Analysis of drug interactions. Methods Mol. Biol. 731, 421-434 (2011). 545. Chou, T.C. and Talalay, P. Quantitative analysis of dose-effect relationships: the combined effects of multiple drugs or enzyme inhibitors. Adv. Enzyme Regul. 22, 27-55 (1984). 546. Chou, T.C. Theoretical basis, experimental design, and computerized simulation of synergism and antagonism in drug combination studies. Pharmacol. Rev. 58, 621-681 (2006). 547. Li, D., et al. BIBW2992, an irreversible EGFR/HER2 inhibitor highly effective in preclinical lung cancer models. Oncogene 27, 4702-4711 (2008). 548. Nelson, V., Ziehr, J., Agulnik, M. and Johnson, M. Afatinib: emerging next-generation tyrosine kinase inhibitor for NSCLC. Onco. Targets Ther. 6, 135-143 (2013). 549. Schwartz, P.A., et al. Covalent EGFR inhibitor analysis reveals importance of reversible interactions to potency and mechanisms of drug resistance. Proc. Natl. Acad. Sci. U.S.A. 111, 173-178 (2014). 550. Solca, F., et al. Target binding properties and cellular activity of afatinib (BIBW 2992), an irreversible ErbB family blocker. J. Pharmacol. Exp. Ther. 343, 342-350 (2012). 551. 
Harbeck, N., Solca, F. and Gauler, T.C. Preclinical and clinical development of afatinib: a focus on breast cancer and squamous cell carcinoma of the head and neck. Future Oncol. 10, 21-40 (2014). 552. Hurvitz, S.A., Shatsky, R. and Harbeck, N. Afatinib in the treatment of breast cancer. Expert Opin. Investig. Drugs 23, 1039-1047 (2014). 553. Chou, T.C. Preclinical versus clinical drug combination studies. Leuk. Lymphoma 49, 2059- 2080 (2008). 554. Chadalapaka, G., Jutooru, I., Burghardt, R. and Safe, S. Drugs that target specificity proteins downregulate epidermal growth factor receptor in bladder cancer cells. Mol. Cancer Res. 8, 739-750 (2010). 555. Lee, J.Y., et al. Curcumin induces EGFR degradation in lung adenocarcinoma and modulates p38 activation in intestine: the versatile adjuvant for gefitinib therapy. PLoS One 6, e23756 (2011). 556. Yamauchi, Y., Izumi, Y., Yamamoto, J. and Nomori, H. Coadministration of erlotinib and curcumin augmentatively reduces cell viability in lung cancer cells. Phytother. Res. 28, 728- 735 (2014). 557. Ye, M.X., Li, Y., Yin, H. and Zhang, J. Curcumin: updated molecular mechanisms and intervention targets in human lung cancer. Int. J. Mol. Sci. 13, 3959-3978 (2012). 558. Wang, P., Henning, S.M. and Heber, D. Limitations of MTT and MTS-based assays for measurement of antiproliferative activity of green tea polyphenols. PLoS One 5, e10202 (2010). 559. Bruggisser, R., von Daeniken, K., Jundt, G., Schaffner, W. and Tullberg-Reinert, H. Interference of plant extracts, phytoestrogens and antioxidants with the MTT tetrazolium assay. Planta Med. 68, 445-448 (2002). 560. Wisman, K.N., Perkins, A.A., Jeffers, M.D. and Hagerman, A.E. Accurate assessment of the bioactivities of redox-active polyphenolics in cell culture. J. Agric. Food Chem. 56, 7831-7837 (2008). 561. Chou, T.C. Drug combination studies and their synergy quantification using the Chou-Talalay method. Cancer Res. 70, 440-446 (2010). 562. Gaulton, A., et al. 
ChEMBL: a large-scale bioactivity database for drug discovery. Nucleic Acids Res. 40, D1100-1107 (2012). 563. Awasthi, S., et al. Curcumin-glutathione interactions and the role of human glutathione S- transferase P1-1. Chem. Biol. Interact. 128, 19-38 (2000). 564. Mathews, S. and Rao, M.N.A. Interaction of Curcumin with Glutathione. International Journal of Pharmaceutics 76, 257-259 (1991).

221

565. Go, Y.M. and Jones, D.P. The redox proteome. J. Biol. Chem. 288, 26512-26520 (2013). 566. Khar, A., et al. Induction of stress response renders human tumor cell lines resistant to curcumin-mediated apoptosis: role of reactive oxygen intermediates. Cell Stress Chaperones 6, 368-376 (2001). 567. Mukherjee, S., et al. Curcumin inhibits breast cancer stem cell migration by amplifying the E- cadherin/beta-catenin negative feedback loop. Stem Cell Res. Ther. 5, 116 (2014). 568. Zang, S., Liu, T., Shi, J. and Qiao, L. Curcumin: a promising agent targeting cancer stem cells. Anticancer Agents Med. Chem. 14, 787-792 (2014). 569. Li, Y. and Zhang, T. Targeting cancer stem cells with sulforaphane, a dietary component from broccoli and broccoli sprouts. Future Oncol. 9, 1097-1103 (2013). 570. Li, Y., et al. Sulforaphane, a dietary component of broccoli/broccoli sprouts, inhibits breast cancer stem cells. Clin. Cancer. Res. 16, 2580-2590 (2010). 571. D'Aguanno, S., et al. Shotgun proteomics and network analysis of neuroblastoma cell lines treated with curcumin. Mol. Biosyst. 8, 1068-1077 (2012). 572. Fang, H.Y., Chen, S.B., Guo, D.J., Pan, S.Y. and Yu, Z.L. Proteomic identification of differentially expressed proteins in curcumin-treated MCF-7 cells. Phytomedicine 18, 697-703 (2011). 573. Teiten, M.H., et al. Identification of differentially expressed proteins in curcumin-treated prostate cancer cell lines. OMICS 16, 289-300 (2012). 574. Sharma, K., et al. Quantitative proteomics reveals that Hsp90 inhibition preferentially targets kinases and the DNA damage response. Mol. Cell. Proteomics 11, M111 014654 (2012). 575. Yang, W., et al. Genomics of Drug Sensitivity in Cancer (GDSC): a resource for therapeutic biomarker discovery in cancer cells. Nucleic Acids Res. 41, D955-961 (2013). 576. Garnett, M.J., et al. Systematic identification of genomic markers of drug sensitivity in cancer cells. Nature 483, 570-575 (2012). 577. Wright, M.H., et al. 
Validation of N-myristoyltransferase as an antimalarial drug target using an integrated chemical biology approach. Nat. Chem. 6, 112-121 (2014). 578. Salisbury, C.M. and Cravatt, B.F. Optimization of activity-based probes for proteomic profiling of histone deacetylase complexes. J. Am. Chem. Soc. 130, 2184-2194 (2008). 579. Rappsilber, J., Ishihama, Y. and Mann, M. Stop and go extraction tips for matrix-assisted laser desorption/ionization, nanoelectrospray, and LC/MS sample pretreatment in proteomics. Anal. Chem. 75, 663-670 (2003).

222

10. Appendices

Appendix Table 1. List of available electronic files for further reference. For the proteomics experimental data (entries 1-8), a single .xlsx file is provided per experiment. Each file contains multiple sheets comprising (i) a summary of the experimental setup, (ii) the raw .txt output of the MaxQuant search and (iii) any relevant analysis sheets for that experiment. For the compound combination studies between the electrophilic natural products and cancer therapeutics (entry 9), an .xlsx file containing all relevant information for the analysis is provided. For the Cytoscape interaction networks (entries 10-14), the protein-protein interaction network for each electrophilic natural product is provided. The Cytoscape software (Version 3.1.0) can be freely downloaded from http://www.cytoscape.org/download_old_versions.htmL.

# | File | Description
1 | File1_ABPtargets_NonQuantitative | Proteomics data of ABP only samples fed to MDA-MB-231 cells (Chapter 4.1.3)
2 | File2_DuplexSILAC_ABPvParent_Cell | Proteomics data of in-cell ABP competed against parent compound in duplex SILAC experiments in MDA-MB-231 cells (Chapter 4.3)
3 | File3_DuplexSILAC_ABPvParent_Cell_Lysate_Cell+Lysate | Proteomics data of in-cell, in-lysate and in-cell/lysate ABP competed against parent compound in duplex SILAC experiments in MDA-MB-231 cells (Chapter 4.5)
4 | File4_SILAC_Incorporation_check | Proteomics data for incorporation check of SILAC labels into whole lysates of MDA-MB-231 (R6K4), MDA-MB-231 (R10K8) and MCF7 (R10K8) (Chapter 4 and Chapter 5)
5 | File5_SpikeSILAC_Sulforaphane | Proteomics data of sulforaphane ABP 2 competed against a concentration gradient of sulforaphane using a ‘spike-in’ SILAC setup in both MCF7 and MDA-MB-231 cells (Chapter 5.2)
6 | File6_SpikeSILAC_Curcumin | Proteomics data of curcumin ABP 1 competed against a concentration gradient of curcumin and PC using a ‘spike-in’ SILAC setup in the MDA-MB-231 cell line (Chapter 5.3)
7 | File7_SpikeSILAC_Piperlongumine | Proteomics data of piperlongumine ABP competed against a concentration gradient of piperlongumine using a ‘spike-in’ SILAC setup in the MDA-MB-231 cell line (Chapter 5.3)
8 | File8_TriplexSILAC_ElectrophilicABPs | Proteomics data of a triplex SILAC setup to compare the protein target labelling of a panel of small electrophilic ABPs in the MDA-MB-231 cell line (Chapter 5.7)
9 | File9_MTSAssay_CombinationStudies | MTS assay data and subsequent compound combination analysis for the electrophilic natural products in combination with cancer therapeutics in both the MCF7 and MDA-MB-231 cell lines (Chapter 6)
10 | Cytoscape_1_Curcumin_MDA-MB-231 | Protein interaction network of the 196 curcumin targets generated in Cytoscape (Chapter 5.3)
11 | Cytoscape_2_PC_MDA-MB-231 | Protein interaction network of the 472 PC targets generated in Cytoscape (Chapter 5.3)
12 | Cytoscape_3_Piperlongumine_MDA-MB-231 | Protein interaction network of the 426 piperlongumine targets generated in Cytoscape (Chapter 5.3)
13 | Cytoscape_4_Sulforaphane_MCF7 | Protein interaction network of the 290 sulforaphane targets in the MCF7 cell line generated in Cytoscape (Chapter 5.2)
14 | Cytoscape_5_Sulforaphane_MDA-MB-231 | Protein interaction network of the 290 sulforaphane targets in the MDA-MB-231 cell line generated in Cytoscape (Chapter 5.2)
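
As a convenience for readers navigating the multi-sheet .xlsx files described in Appendix Table 1, the following is a minimal sketch of how the sheet names in such a file could be listed using only the Python standard library. The function name `list_sheet_names` is hypothetical, and the sketch assumes the files follow the common (transitional) Office Open XML layout, in which sheet names are stored in `xl/workbook.xml`; it is not part of the original analysis workflow.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace of the main SpreadsheetML schema used in xl/workbook.xml
# (assumption: the workbook was saved in the common transitional format).
MAIN_NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"


def list_sheet_names(xlsx):
    """Return the worksheet names, in workbook order.

    `xlsx` may be a file path or a binary file-like object.
    """
    # An .xlsx file is a zip archive; the workbook part lists every sheet.
    with zipfile.ZipFile(xlsx) as archive:
        root = ET.fromstring(archive.read("xl/workbook.xml"))
    return [sheet.attrib["name"] for sheet in root.iter(f"{{{MAIN_NS}}}sheet")]
```

For example, `list_sheet_names("File1_ABPtargets_NonQuantitative.xlsx")` would be expected to return the names of the summary, MaxQuant output and analysis sheets, assuming the file is named as in the table with an .xlsx extension.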

Appendix Table 2. Targets of sulforaphane and curcumin identified by chemical proteomics in attempts by other research groups to globally profile their targets, carried out prior to or during the PhD studies. These targets require further validation if they are to be deemed genuine, as insufficient control experiments and orthogonal validation were carried out, which leaves them speculative. They are nonetheless a useful resource to cross-reference against the targets identified in the work carried out during the PhD studies.

Sulforaphane targets (ref. 92)
14-3-3 protein beta/alpha
14-3-3 protein epsilon
14-3-3 protein sigma
14-3-3 protein theta
40S ribosomal protein SA
Actin
Alpha-internexin
ATP synthase beta chain, mitochondrial
Calpain small subunit 1 (CSS1)
Desmin
EIF5A
Galectin-1
GSTP
Heat shock 70 kDa protein 1
Heat shock 70 kDa protein 1 L
Heat shock 70 kDa protein 4
Histone H2A type 1-A
Histone H4
HNRNPF
HNRNPH
HNRNPK
Histone-binding protein RBBP4
HSPD1
Nucleoside diphosphate kinase A
Peripherin
Plasma protease C1 inhibitor
Proliferating cell nuclear antigen
Protein disulfide-isomerase
Retinal dehydrogenase
Stress-70 protein, mitochondrial
T-complex protein 1 subunit beta
Tenascin-X
Thioredoxin-dependent peroxide reductase,
Transitional endoplasmic reticulum ATPase
Tropomyosin 1 alpha chain
Tubulin
Tubulin-specific chaperone B
Vimentin
Zinc finger protein 429

Curcumin targets (refs 132, 133)
ALDOA
ALDOC
Alkyldihydroxyacetone phosphate synthase precursor
Amino peptidase B
Annexin A5
ATPase
Bcl-2
Creatine kinase
DDX17
DNA helicase
Dual specificity protein phosphatase CDC14C
Epidermal filaggrin
Far upstream element-binding protein
Fragile X mental retardation syndrome-related protein 1
Gamma-enolase (ENOG)
GAPDH
Glucan (1,4-alpha-), branching enzyme
Glyoxalase 1
HSP70
Keap1
3-ketoacyl-CoA thiolase beta-subunit
KIAA0719 protein
Lactate dehydrogenase
Lacritin precursor
Lipocalin
Lysyl-tRNA synthetase
M4 protein
MTHSP75
mTOR
Nucleolin
5' nucleotidase, ecto
Phosphatidylethanolamine-binding protein (PEBP)
Phosphofructokinase
Phosphoglycerate mutase 1
Peroxiredoxin 2 (PRDX2)
Protein kinase c inhibitor protein
Proteasome
S100 calcium-binding protein A9
SAM domain- and HD domain-containing protein 1
Sorting nexin 2
Src substrate cortactin (Amplaxin)
TNF-α
TRIO and F-actin binding protein, isoform CRA f
Zn-alpha2-glycoprotein

[Gel images for Appendix Figure 1: four panels, each showing in-gel fluorescence and Coomassie staining with molecular weight markers. Lane annotations: (A) sulforaphane ABP 3 (25 μM), without and with boiling at 90 ˚C for 5 min; (B) sulforaphane ABP 3 (10 μM), time points 0, 0.3, 1, 2, 4 and 24; (C) sulforaphane ABP 3 (10 μM), without and with reduction/alkylation; (D) sulforaphane ABP 3 (10 μM), temperatures of 20, 55 and 90 ˚C.]

Appendix Figure 1. In-gel fluorescence labelling profiles of sulforaphane ABP 3 (Chapter 4.1.1). (A) Instability of ABP-protein adducts when boiled in SLB containing β-mercaptoethanol. (B) Stability of ABP-protein adducts when cells were treated with ABP for 30 min, after which the ABP was withdrawn and replaced with normal media for the stated time intervals (h) (Chapter 4.1.2). (C) Confirmation that proteomic sample preparation (including reduction and alkylation) does not affect ABP-protein adduct stability. (D) Stability of ABP-protein adducts to reduction and alkylation at increased temperatures.

[Gel images for Appendix Figure 2: three panels, each showing in-gel fluorescence and Coomassie staining with molecular weight markers. Lane annotations: (A) sulforaphane ABP 1 (5 μM) or sulforaphane ABP 2 (5 μM) with sulforaphane at 5, 25, 100 and 200 μM, NEM (100 μM) or IA (100 μM); (B) MCF-7 and MDA-MB-231 cells, sulforaphane ABP 1 or sulforaphane ABP 2 at 2, 5 and 10 μM; (C) MCF-7 and MDA-MB-231 cells, sulforaphane ABP 3 at 2, 5, 10 and 20 μM.]

Appendix Figure 2. In-gel fluorescence in-cell labelling profiles of sulforaphane ABP 1, 2 and 3 (Chapter 4.1.1). (A) Competition of sulforaphane ABP 1 and sulforaphane ABP 2 against a concentration gradient of sulforaphane in the MCF7 cell line. (B) Concentration-dependent labelling of sulforaphane ABP 1 and sulforaphane ABP 2 in the MDA-MB-231 and MCF7 cell lines. (C) Concentration-dependent labelling of sulforaphane ABP 3 in the MDA-MB-231 and MCF7 cell lines.

[Gel images for Appendix Figure 3: two panels, each showing in-gel fluorescence and Coomassie staining with molecular weight markers. Lane annotations: (A) MCF-7 and MDA-MB-231 cells, PIP ABP at 20, 5 and 1 μM; (B) CURC ABP 3 at 2, 5, 20 and 50 μM.]

Appendix Figure 3. In-gel fluorescence in-cell labelling profiles of piperlongumine ABP and curcumin ABP 3 (Chapter 4.1.1). (A) Concentration-dependent labelling of piperlongumine ABP in the MDA-MB-231 and MCF7 cell lines. (B) Concentration-dependent labelling of curcumin ABP 3 in the MDA-MB-231 cell line.

[Gel images for Appendix Figure 4: three panels of numbered lanes, each showing in-gel fluorescence and Coomassie staining with molecular weight markers. Panel titles: (A) 5 μM SULF ABP 2 (MCF7 cells) and 5 μM SULF ABP 2 (MDA-MB-231 cells); (B) 2 μM CURC ABP 1 (MCF7 cells) and 5 μM CURC ABP 1 (MDA-MB-231 cells); (C) 2 μM PIP ABP (MCF7 cells) and 2 μM PIP ABP (MDA-MB-231 cells). Lanes are labelled with the competitor abbreviations defined in the caption.]

Appendix Figure 4. In-gel fluorescence labelling profiles of the in-cell competition-based assays of ABPs against a panel of other electrophilic species in the MCF7 and MDA-MB-231 cell lines (Chapter 4.2). (A) Sulforaphane ABP 2 (B) Curcumin ABP 1 (C) Piperlongumine ABP. IA = iodoacetamide, PC = mono-O-propylcurcumin, TH = theophylline, CT = citral, THC = tetrahydrocurcumin, RS = resveratrol, SFN = sulforaphane, NC = curcumin, NAC = N-acetylcysteine, PI = piperine, PH = , NEM = N-ethylmaleimide, DMF = dimethylfumarate, BQ = benzoquinone, MA = maleic anhydride, CA = coniferal aldehyde, PIP = piperlongumine, THP = tetrahydropiperlongumine, FA = feruloyl acetone, D = DMSO.

[Gel image for Appendix Figure 5: in-gel fluorescence and Coomassie staining with molecular weight markers. Lane treatments: curcumin ABP 2 (20 μM), curcumin ABP 3 (20 μM), curcumin (20 μM), curcumin ABP 1 (5 μM), mono-O-propylcurcumin (50 μM), piperlongumine ABP (2 μM), piperlongumine (100 μM), sulforaphane ABP 1 (5 μM), sulforaphane (100 μM) and sulforaphane ABP 2 (5 μM).]

Appendix Figure 5. In-gel fluorescence in-cell labelling profiles for the reported ABPs in the MDA-MB-231 cell line. Samples were then used for proteomic target identification in the duplex SILAC experiment described in Chapter 4.3.

[Gel images for Appendix Figure 6: three panels, each showing in-gel fluorescence and Coomassie staining with molecular weight markers. Lane annotations: (A) sulforaphane ABP 1 (4 μM) or sulforaphane ABP 2 (4 μM) with sulforaphane at 40, 80, 200 and 400 μM; (B) curcumin ABP 1 (5 μM) or curcumin ABP 2 (5 μM) with curcumin at 100, 200 and 300 μM or mono-O-propylcurcumin at 200 μM; (C) piperlongumine ABP (1 μM) with piperlongumine at 10, 20, 50, 100 and 150 μM, and curcumin ABP 3 (5 μM) with curcumin at 100, 200 and 300 μM or mono-O-propylcurcumin at 200 μM.]

Appendix Figure 6. In-gel fluorescence labelling profiles of the in-lysate competition-based assays of ABPs against a concentration gradient of their respective parent compounds (Chapter 4.4). (A) The labelling of sulforaphane ABP 1 and 2 competed against an excess of sulforaphane. (B) The labelling of curcumin ABP 1 and 2 competed against an excess of curcumin or PC. (C) The labelling of curcumin ABP 3 competed against an excess of curcumin or PC, and of piperlongumine ABP competed against an excess of piperlongumine. The labelling of the sulforaphane ABPs and piperlongumine ABP could be readily competed by an excess of parent compound; however, competition was less evident for all three curcumin ABPs.

[Gel images for Appendix Figure 7: two panels showing in-gel fluorescence with molecular weight markers (Coomassie staining shown for panel A). Lane annotations: (A) lysate without or with heat denaturation (Δ), sulforaphane ABP 1 (4 μM), sulforaphane ABP 2 (4 μM) or piperlongumine ABP (1 μM), without or with sulforaphane (400 μM) or piperlongumine (100 μM); (B) piperlongumine ABP (2 μM) without or with piperlongumine (100 μM), pre-incubation times of 30 and 10 min, competition times of 30, 15, 5 and 2 min, and DMSO at 2, 5 and 8%.]

Appendix Figure 7. In-gel fluorescence labelling profiles of the in-lysate competition-based assays of ABPs against their respective parent compounds for sulforaphane and piperlongumine (Chapter 4.4). (A) Heat denaturation of the protein lysate prior to compound treatment led to an increase in ABP labelling. (B) Using piperlongumine ABP and piperlongumine as a model case for the competition assays, the incubation times of both the parent compound and the ABP were varied to identify the optimum incubation times for further studies.

[Gel images for Appendix Figure 8: three panels, each showing in-gel fluorescence and Coomassie staining with molecular weight markers. Lane annotations: (A) cytosolic fractions from the MCF7 and MDA-MB-231 cell lines, sulforaphane ABP 2 (5 μM) with sulforaphane at 5, 25 and 100 μM, in paired lanes; (B) MDA-MB-231 cell line, piperlongumine ABP (2 μM) with piperlongumine at 2, 10, 25, 50 and 100 μM, in paired lanes, plus piperlongumine ABP (8 μM) alone; (C) MDA-MB-231 cell line, curcumin ABP 1 (5 μM) with curcumin or mono-O-propylcurcumin at 5, 20, 50, 100 and 150 μM, in paired lanes, plus curcumin ABP 1 (20 μM) alone.]

Appendix Figure 8. In-gel fluorescence labelling profiles of the in-cell competition-based assays of ABPs against a concentration gradient of their respective parent compounds (Chapter 5.2 and Chapter 5.3). The labelling of each experimental sample was confirmed by in-gel fluorescence before starting the MS-based proteomic identification workflow. (A) The cytosolic fractions from sulforaphane ABP 2 competed against sulforaphane (5, 25 and 100 μM) in the MCF7 and MDA-MB-231 cell lines. (B) Piperlongumine ABP competed against piperlongumine (2, 10, 25, 50 and 100 μM) as well as the ‘spike-in’ heavy piperlongumine ABP only sample in the MDA-MB-231 cell line. (C) Curcumin ABP 1 competed against curcumin and PC (5, 20, 50, 100 and 150 μM) as well as the ‘spike-in’ heavy curcumin ABP 1 only sample in the MDA-MB-231 cell line.
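
To illustrate how a concentration gradient of this kind can be turned into a per-protein competition profile, the following is a minimal sketch only. It assumes that each protein has a quantified ratio against the ‘spike-in’ standard at every competitor concentration, that these ratios are scaled to the ABP-only value, and that the half-response point is estimated by linear interpolation on a log10 concentration axis. The function names, the normalisation and the example ratios are all hypothetical and are not taken from the analysis performed in this work.

```python
import math


def response_to_competition(ratios, baseline):
    """Scale each ratio by the ABP-only (no competitor) ratio; 1.0 means no competition."""
    if baseline <= 0:
        raise ValueError("baseline ratio must be positive")
    return [r / baseline for r in ratios]


def half_response_concentration(concs_uM, responses):
    """Estimate where the response first falls through 0.5.

    Interpolates linearly between neighbouring points on a log10
    concentration axis. Returns None if the response never crosses 0.5.
    """
    points = list(zip(concs_uM, responses))
    for (c_lo, r_lo), (c_hi, r_hi) in zip(points, points[1:]):
        if r_lo >= 0.5 >= r_hi and r_lo != r_hi:
            frac = (r_lo - 0.5) / (r_lo - r_hi)
            log_c = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_c
    return None


# Piperlongumine concentrations from panel (B); the ratios are invented for illustration.
concs_uM = [2, 10, 25, 50, 100]
ratios = [0.9, 0.8, 0.5, 0.3, 0.1]
responses = response_to_competition(ratios, baseline=1.0)
print(half_response_concentration(concs_uM, responses))
```

Interpolating on the log axis keeps the estimate consistent with plots drawn against log (concentration / M); a fitted sigmoidal model would be the more usual choice when enough concentration points are available.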

[Appendix Figure 9 image. (A) Dose response curves for 20 piperlongumine targets, one plot per protein: KEAP1, GSTO1, VDAC2, EGFR, RB1, CHEK1, HMOX2, NFKB2, PTPN1, BID, STAT3, EIF2AK2, TXN, RELA, AHR, CDK5, MARCKS, HAT1, STK10 and AURKA. x axis: Log (concentration / M), -7 to -4; y axis: Response to competition, 0.0 to 1.2.]

(B)

Protein ID    EC50 value (μM)    Hill Slope    R squared value
KEAP1         18.1               -1.047        0.984
GSTO1         12.0               -1.623        0.981
VDAC2         136                -0.962        0.846
EGFR          37.2               -1.245        0.946
RB1           9.6                -0.806        0.928
CHEK1         12.4               -0.644        0.906
HMOX2         13.7               -1.207        0.969
NFKB2         29.5               -1.065        0.969
PTPN1         23.4               -1.001        0.952
BID           48.1               -1.283        0.897
STAT3         41.8               -1.037        0.934
EIF2AK2       50.4               -0.810        0.943
TXN           19.0               -1.725        0.992
RELA          16.6               -1.083        0.957
AHR           32.2               -1.074        0.946
CDK5          78.4               -1.369        0.934
MARCKS        97.9               -0.900        0.932
HAT1          70.2               -0.954        0.996
STK10         94.6               -1.657        0.868
AURKA         55.2               -1.186        0.956

Appendix Figure 9. Examples of the EC50 profiles fitted with GraphPad Prism 5 (A) for piperlongumine targets obtained by fitting dose response curves to piperlongumine ABP labelling upon competition with piperlongumine (Chapter 5.3). Shown are 20 protein targets as examples, indicated by their gene names. The Hill slope and R squared values (B) indicate that for the most part, the in-cell ABP competition assays fit very well to sigmoidal dose response curves, providing support for the calculation of EC50 as a parameter to allow the comparison of protein target engagement for piperlongumine across its entire target set.
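To illustrate how an EC50 and Hill slope of the kind tabulated in (B) relate to a sigmoidal competition curve, a minimal Python sketch is given here. It assumes a normalised variable-slope model, response = 1 / (1 + 10^((log EC50 - log C) x Hill slope)), which is an assumption because the exact Prism model is not stated in the caption. It also swaps in a simple logit linearisation in place of the non-linear regression that Prism performs. The function names and the synthetic data points are hypothetical and are not the experimental data.

```python
import math


def hill_response(log_conc, log_ec50, hill_slope):
    """Normalised response (1 = no competition, 0 = full competition).

    With a negative Hill slope the response falls as the competitor
    concentration rises, matching the sign of the slopes in table (B).
    """
    return 1.0 / (1.0 + 10.0 ** ((log_ec50 - log_conc) * hill_slope))


def fit_hill_linearised(log_concs, responses):
    """Estimate (EC50 in M, Hill slope) from normalised responses.

    Uses log10(y / (1 - y)) = hill_slope * (log_conc - log_ec50), so an
    ordinary least-squares line gives the slope directly and the EC50
    from the intercept. Points with y <= 0 or y >= 1 are skipped.
    This is a simplification, not the regression used by Prism.
    """
    xs, ys = [], []
    for x, y in zip(log_concs, responses):
        if 0.0 < y < 1.0:
            xs.append(x)
            ys.append(math.log10(y / (1.0 - y)))
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two responses strictly between 0 and 1")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    log_ec50 = -intercept / slope
    return 10.0 ** log_ec50, slope


# Synthetic, noise-free example generated from the KEAP1 row of table (B)
# (EC50 = 18.1 uM, Hill slope = -1.047) at the competitor concentrations
# 2, 10, 25, 50 and 100 uM.
log_concs = [math.log10(c * 1e-6) for c in (2, 10, 25, 50, 100)]
responses = [hill_response(x, math.log10(18.1e-6), -1.047) for x in log_concs]
ec50_molar, slope = fit_hill_linearised(log_concs, responses)
print(f"EC50 = {ec50_molar * 1e6:.1f} uM, Hill slope = {slope:.3f}")
```

On real data the linearised fit weights points near 0 and 1 heavily, so a non-linear least-squares fit would normally be preferred; the sketch is only meant to make the roles of the two parameters concrete.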

[Appendix Figure 10 image. (A) Dose response curves for 20 curcumin targets, one plot per protein: KEAP1, PDIA6, STAT3, HMOX2, ATP2A2, EGFR, ALDH2, STAT1, TXNRD1, TUBB, FXR1, VDAC2, DNMT1, HAT1, PDIA3, HK2, HDAC1, MAP2K1, IMPDH2 and PTPN11. x axis: Log (concentration / M), -7 to -4; y axis: Response to competition, 0.0 to 1.2.]

(B)

Protein ID    EC50 value (μM)    Hill Slope    R squared value
KEAP1         36.1               -1.028        0.999
PDIA6         134                -2.193        0.979
STAT3         151                -1.037        0.969
HMOX2         13.3               -0.992        0.993
ATP2A2        46.8               -1.118        0.998
EGFR          119                -0.9957       0.980
ALDH2         112                -1.736        0.883
STAT1         120                -1.431        0.919
TXNRD1        135                -1.421        0.807
TUBB          187                -1.800        0.950
FXR1          130                -1.666        0.998
VDAC2         137                -1.392        0.980
DNMT1         160                -1.225        0.820
HAT1          154                -1.399        0.987
PDIA3         156                -3.39         0.567
HK2           192                -1.754        0.951
HDAC1         281                -0.737        0.717
IMPDH2        139                -1.336        0.982
PTPN11        184                -1.506        0.987
MAP2K1        220                -1.948        0.983

Appendix Figure 10. Examples of the EC50 profiles fitted with GraphPad Prism 5 (A) for curcumin targets obtained by fitting dose response curves to curcumin ABP 1 labelling upon competition with curcumin (Chapter 5.3). Shown are 20 protein targets as examples, indicated by their gene names. The Hill slope and R squared values (B) indicate that for the most part, the in-cell ABP competition assays fit very well to sigmoidal dose response curves, providing support for the calculation of EC50 as a parameter to allow the comparison of protein target engagement for curcumin across its entire target set.

[Appendix Figure 11 image: in-gel fluorescence panels, 24 lanes each, lanes 1 to 12 for the MCF-7 cell line and lanes 13 to 24 for the MDA-MB-231 cell line; each condition is shown as three lanes (PPD, SN, PD). (A) Piperlongumine: ABP (10 μM) in all lanes; PIP at 0, 10, 50 and 100 μM in both cell lines. (B) Curcumin: ABP (25 μM) in all lanes; NC at 0, 25, 100 and 150 μM (lanes 1 to 12) and 0, 10, 50 and 100 μM (lanes 13 to 24). (C) Sulforaphane: ABP (25 μM) in all lanes; SULF at 0, 25, 125 and 250 μM (lanes 1 to 12) and 0, 10, 50 and 100 μM (lanes 13 to 24).]

Appendix Figure 11. In-gel fluorescence labelling profiles of the in-cell competition-based assays of ABPs against their respective parent compounds for the corresponding WB analysis (Figure 35) for piperlongumine (A), curcumin (B) and sulforaphane (C) (Chapter 5.4). PPD = pre-pulldown sample (after the CuAAC reaction), SN = supernatant or protein lysate after affinity enrichment with the Neutravidin sepharose resin and PD = pulldown proteins immobilised on the Neutravidin sepharose resin after enrichment. NC = curcumin, PIP = piperlongumine and SULF = sulforaphane.

[Appendix Figure 12 image, 24 lanes, lanes 1 to 12 for MCF-7 cells and lanes 13 to 24 for MDA-MB-231 cells. Treatments per cell line, each shown as three lanes (PPD, SN, PD): curcumin ABP 1 (50 μM), piperlongumine ABP (50 μM), sulforaphane ABP 2 (50 μM) and DMSO. (A) Western blots probed with α-HSP90, α-STAT3, α-ACTIN, α-CDK2, α-BID, α-MIF, α-MARCKS, α-IMPDH2, α-PSMC1, α-GSTO1/2 and α-HCCS. (B) In-gel fluorescence of the same samples.]

Appendix Figure 12. Additional WB analyses of in-cell ABP only treated samples to supplement the WB analyses (Figure 35) (Chapter 5.4). (A) WB analyses with 50 μM curcumin ABP 1, 50 μM piperlongumine ABP, 50 μM sulforaphane ABP 2 and DMSO in both the MCF7 and MDA-MB-231 cell lines. PPD = pre-pull down lysate, SN = supernatant lysate left after the affinity enrichment, and PD = proteins immobilised on the resin as a result of pulldown. (B) The corresponding in-gel fluorescence image for the WB image.

[Appendix Figure 13 image: bar charts of enriched DAVID terms.

(A) 794 IA targets (MCF7, MDA-MB-231 and Jurkat); axis scale 0 to 20.
KEGG_PATHWAY: Fructose and mannose metabolism; Proteasome; Spliceosome; Pyrimidine metabolism; Aminoacyl-tRNA biosynthesis; Glutathione metabolism; Cell cycle; Pyruvate metabolism; Glycolysis / Gluconeogenesis; Valine, leucine and isoleucine degradation.
GOTERM_CC_FAT: Soluble fraction; Ribonucleoprotein complex; Cytoskeleton; Nuclear lumen; Melanosome; Mitochondrion; Organelle lumen; Non-membrane bounded organelle; Cytosol.
GOTERM_MF_FAT: Identical protein binding; Enzyme binding; Adenyl ribonucleotide binding; Adenyl nucleotide binding; Purine nucleoside binding; Oxidoreductase activity, acting on sulfur group of donors; Nucleoside binding; Ribonucleotide binding; Purine nucleotide binding; Nucleotide binding.
GOTERM_BP_FAT: Negative regulation of cellular protein metabolic process; Negative regulation of metabolic process; Macromolecular complex assembly; Protein complex assembly; Macromolecular complex subunit organisation; Glucose metabolic process; Cell cycle; Mitotic cell cycle; Translation; Cell redox homeostasis.

(B) 81 conserved protein targets across curcumin, piperlongumine and sulforaphane; axis scale 0 to 10.
KEGG_PATHWAY: Calcium signalling pathway; Pancreatic cancer; Ubiquitin mediated proteolysis.
GOTERM_CC_FAT: Condensed chromosome; Internal side of plasma membrane; Vesicle coat; Endoplasmic reticulum; Organelle membrane; Endomembrane system; Organelle lumen; Non-membrane bounded organelle; Nuclear lumen; Nucleolus.
GOTERM_MF_FAT: ATP-dependent helicase activity; Actin binding; Hydrolase activity, acting on carbon-nitrogen (but not peptide) bond; Transcription activity; Helicase activity; Nucleoside binding; ATP binding; Nucleotide binding; Transcription factor binding; RNA binding.
GOTERM_BP_FAT: Nucleobase, nucleoside, nucleotide and nucleic acid transport; Mitotic sister chromatid segregation; RNA splicing; M phase; Organelle fission; mRNA processing; Cell cycle; Nuclear division; Mitosis; Cell cycle process.]

Appendix Figure 13. DAVID analysis of protein targets representing enriched KEGG pathways, cellular compartmentalisation (CC), molecular function (MF) and biological processes (BP) for the 794 protein IA targets identified by Cravatt and co-workers82 (A) and the 81 protein targets conserved across all three electrophilic natural products under study here in the MDA-MB-231 cell line (B). The analyses are in conjunction with Chapter 5.5.

[Appendix Figure 14 image: chemical structures of the AzRB capture reagent and the AzRB2 capture reagent.]

Appendix Figure 14. The chemical structures of the trypsin cleavable reagents utilised to release the modified peptides from the Neutravidin sepharose resin such that the individual amino acid modification site of the ABP can be mapped for a protein target. The two reagents (AzRB and AzRB2) are highly analogous and both contain a trypsin cleavage site (highlighted in green).

[Appendix Figure 15 image: MS2 spectra, panels (A), (B) and (C).]

Appendix Figure 15. The MS2 spectra for the assignment of the sulforaphane ABP 2 modification on the designated peptide for APOBEC3C (A), CTSZ (B) and GSDMD (C) (Chapter 5.6).

[Appendix Figure 16 image: MS2 spectra, panels (A), (B) and (C).]

Appendix Figure 16. The MS2 spectra for the assignment of the sulforaphane ABP 2 modification on the designated peptide for TXN (A), RTN3 (B) and PTGE3 (C) (Chapter 5.6).

[Appendix Figure 17 image: in-gel fluorescence and Coomassie gel panels. (A) NEM ABP at 2, 5, 10 and 25 μM. (B) CA ABP at 2, 5, 10 and 25 μM and ACR ABP at 2, 5, 10 and 25 μM. (C) AE ABP at 1, 2, 5, 10 and 20 μM and AC ABP at 1, 2, 5, 10 and 20 μM. (D) BA ABP at 5, 20 and 50 μM.]

Appendix Figure 17. In-gel fluorescence of the in-cell labelling profiles of a concentration gradient of electrophilic small molecule ABPs (Chapter 5.7), namely N-ethylmaleimide (NEM) ABP (A), chloroacetamide (CA) ABP and acrylamide (ACR) ABP (B), acetylenic enone (AE) ABP and acetylenic chalcone (AC) ABP (C) and benzaldehyde (BA) ABP (D) in the MDA-MB-231 cell line.

[Appendix Figure 18 image: in-gel fluorescence and Coomassie gel panels. (A) ITC ABP (10 μM) competed against SULF (100 and 200 μM), IA (100 μM), PC (100 μM), NC (100 μM) and PIP (100 μM). (B) CMK ABP (3 μM) competed against IA (50 and 100 μM), SULF (100 μM), NC (100 μM), PC (100 μM) and PIP (100 μM). (C) NEM ABP (3 μM) competed against NEM (150 μM), NC (100 μM), PC (100 μM), THC (100 μM), PIP (100 μM), THP (100 μM) and SULF (150 μM).]

Appendix Figure 18. In-gel fluorescence of the in-cell labelling profiles of a selected number of electrophilic small molecule ABPs (Chapter 5.7), namely the isothiocyanate (ITC) ABP (A), chloromethylketone (CMK) ABP (B) and the NEM ABP (C), competed in-cell against IA, PC, NC, PIP, THC, THP and SULF in the MDA-MB-231 cell line. IA = iodoacetamide, PC = mono-O-propylcurcumin, THC = tetrahydrocurcumin, SULF = sulforaphane, NC = curcumin, NEM = N-ethylmaleimide, PIP = piperlongumine and THP = tetrahydropiperlongumine.

[Appendix Figure 19 image: grid of normalised isobologram plots. Columns: CURCUMIN (MDA-MB-231), Combination 1 (EC25) and Combination 2 (EC40); CURCUMIN (MCF7), Combination 1 (EC25) and Combination 2 (EC40). Rows: PF-477736 (CHEK1), PF-573228 (FAK), PP2 (SRC), AFATINIB (EGFR), TAMOXIFEN (ESR1) and EPIRUBICIN (DNA).]

Appendix Figure 19. The normalised isobologram plots of the tested cancer therapeutics with curcumin in the MCF7 and MDA-MB-231 cell lines for each of the two combination concentrations of the electrophilic natural product (combination 1 is EC25 and combination 2 is EC40) (Chapter 6). All graphs were generated by the CompuSyn software. The line designates the CI where CI = 1 (additive effect). Below the line (CI < 1) indicates synergy and above the line (CI > 1) indicates antagonism for the combination of the two agents.
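The combination index (CI) behind these plots can be sketched as follows. This is a minimal Python illustration of the Chou-Talalay median-effect calculation on which CompuSyn is based, not a reproduction of the software. The function names and every numerical value are hypothetical and are not data from this thesis.

```python
def dose_for_effect(fa, dm, m):
    """Median-effect equation: single-agent dose giving fraction affected fa.

    dm is the median-effect dose (the dose giving fa = 0.5) and m is the
    slope of the median-effect plot for that agent.
    """
    if not 0.0 < fa < 1.0:
        raise ValueError("fa must lie strictly between 0 and 1")
    return dm * (fa / (1.0 - fa)) ** (1.0 / m)


def combination_index(d1, d2, fa, dm1, m1, dm2, m2):
    """Return (x, y, CI) for a two-agent combination.

    d1 and d2 are the doses used together to reach effect fa.
    x = d1 / Dx1 and y = d2 / Dx2 are the normalised isobologram
    coordinates, and CI = x + y.
    """
    x = d1 / dose_for_effect(fa, dm1, m1)
    y = d2 / dose_for_effect(fa, dm2, m2)
    return x, y, x + y


def classify(ci, tol=1e-9):
    """Interpret a CI value relative to the additivity line CI = 1."""
    if ci < 1.0 - tol:
        return "synergy"
    if ci > 1.0 + tol:
        return "antagonism"
    return "additive"


# Hypothetical example: agent 1 has dm = 10 uM, agent 2 has dm = 2 uM,
# both with m = 1. A mixture of 2.5 uM + 0.5 uM reaching fa = 0.5 lies
# below the additivity line.
x, y, ci = combination_index(2.5, 0.5, 0.5, 10.0, 1.0, 2.0, 1.0)
print(x, y, ci, classify(ci))
```

Each point on a normalised isobologram is the pair (x, y) returned above, so a point falling below the line x + y = 1 corresponds to CI < 1.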

[Appendix Figure 20 image: grid of normalised isobologram plots. Columns: PIPERLONGUMINE (MDA-MB-231), Combination 1 (EC25) and Combination 2 (EC40); PIPERLONGUMINE (MCF7), Combination 1 (EC25) and Combination 2 (EC40). Rows: PF-477736 (CHEK1), PF-573228 (FAK), PP2 (SRC), AFATINIB (EGFR), TAMOXIFEN (ESR1) and EPIRUBICIN (DNA).]

Appendix Figure 20. The normalised isobologram plots of the tested cancer therapeutics with piperlongumine in the MCF7 and MDA-MB-231 cell lines for each of the two combination concentrations of the electrophilic natural product (combination 1 is EC25 and combination 2 is EC40) (Chapter 6). All graphs were generated by the CompuSyn software. The line designates the CI where CI = 1 (additive effect). Below the line (CI < 1) indicates synergy and above the line (CI > 1) indicates antagonism for the combination of the two agents.

[Appendix Figure 21 image: grid of normalised isobologram plots. Columns: SULFORAPHANE (MDA-MB-231), Combination 1 (EC25) and Combination 2 (EC40); SULFORAPHANE (MCF7), Combination 1 (EC25) and Combination 2 (EC40). Rows: PF-477736 (CHEK1), PF-573228 (FAK), PP2 (SRC), AFATINIB (EGFR), TAMOXIFEN (ESR1) and EPIRUBICIN (DNA).]

Appendix Figure 21. The normalised isobologram plots of the tested cancer therapeutics with sulforaphane in the MCF7 and MDA-MB-231 cell lines for each of the two combination concentrations of the electrophilic natural product (combination 1 is EC25 and combination 2 is EC40) (Chapter 6). All graphs were generated by the CompuSyn software. The line designates the CI where CI = 1 (additive effect). Below the line (CI < 1) indicates synergy and above the line (CI > 1) indicates antagonism for the combination of the two agents.
