CHARACTERIZATION OF THE E3 LIGASE SIAH2 AS AN ANTI-CANCER TARGET

ANUPRIYA GOPALSAMY M.Sc (Bioinformatics), University of Madras

A THESIS SUBMITTED FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

YONG LOO LIN SCHOOL OF MEDICINE DEPARTMENT OF OBSTETRICS AND GYNAECOLOGY NATIONAL UNIVERSITY OF SINGAPORE 2013

For my parents...

DECLARATION

I hereby declare that this thesis is my original work and it has been written by me in its entirety. I have duly acknowledged all the sources of information which have been used in the thesis.

This thesis has also not been submitted for any degree in any university previously.

Anupriya Gopalsamy 21 May 2014

i ACKNOWLEDGEMENTS

First and foremost, I thank GOD for blessing me with privilege and much-needed strength to pursue this program. It is only with His grace that I have come so far in this journey of learning. I express my supreme gratitude to Prof. Kunchithapadam Swaminathan, Department of Biological sciences for kindly agreeing to be my supervisor when my former supervisor left. I feel profound privilege to thank Prof. Swaminathan for believing in me and took me under his wings to embark on the same project with different prospective. I am thankful to him for patiently tolerating my mistakes and politely pointing them out to me and, in the process, moulding me into a better researcher. I will be deeply indebted to him for all his support over the years. I express my heartfelt gratitude and thanks to Prof. Thilo Hagen, Department of Biochemistry for agreeing to be a collaborator and for hosting me in his lab, providing with all facilities and guidance. Prof. Hagen taught me several experimental techniques and also taught me how to think as a researcher. He also expertly showed me how to design experiments and communicate results. I want to thank all the wonderful people in his lab for their valuable discussions on my project. I extend my special thanks to Christine, Yee Liu and Shin Yee for voluntarily providing timely help in my experiments. I also express my appreciation and thanks to Dr. Lijun, Department of obstetrics and Gynaecology, who allowed me to share their lab facilities to carry out certain experiments. I thank Zhang Zhi Wei and Seok Eng for providing me the needful in their lab. I am grateful to my former supervisor Dr. Annamalai Loganath for his support and guidance for the first year of my candidature. I would like to extend my sincere thanks to my other supervisors Prof. Arijit Biswas and Prof. Yap Seng Chong of the department of Obstetrics and Gynaecology, who have given me financial, academic and technical support to pursue my research smoothly.

ii A word of praise will not be sufficient to acknowledge Lilly, Liha and Apple for their timely help to solve all the administrative works. I am thankful to the all my present and former SBL4 labmates Roopa, Deepthi, Umar, Shani, Pavithra, Madhuri, Divya, Raghu, Jobi, Pankaj, Fengxia, Jikun, and Kanmani for their help as and when required. My Special thanks to Thomas, undergraduate student for his contribution to this work. I also would like to thank Prof. Sivaraman and his lab members (SBL5) for their help and valuable discussions on my project. A few people were not only my lab mates but also happened to be my closest friends during the course of my research. I would like to say “Thank you” to Roopa, (my close friend, senior and labmate) for always being there for me and shared most of the best moments in Singapore. She always provided a source of help in research, advice on life and living, and support during hard times. I would also take this opportunity to thank Lakshmi Akka (former colleague) for providing a great company and unforgettable friendship throughout the stay at NUS. I would like to thank Lalitha (my close friend and rooomate) for all the good times and also for listening to and letting me vent out all my thoughts, worries and frustrations. I want to thank my housemates Ambica, Deepthi, Divya, Lavanya, Lalitha and Roopa for all the fun, memorable times in Singapore and for their kind help and cooperation during this dissertation. I thank NUS for having given me the opportunity to pursue my Ph.D. with a research scholarship. Finally and most importantly, my indebtedness to my parents for their constant support for the choices I made and their sacrifice are beyond expression. I dedicate this thesis for them as a humble tribute to their love and blessings that has brought me to where I am now.

iii TABLE OF CONTENTS

DECLARATION ...... i

ACKNOWLEDGEMENTS ...... ii

TABLE OF CONTENTS ...... iv

SUMMARY ...... viii

LIST OF TABLES ...... x

LIST OF FIGURES ...... xi

LIST OF PUBLICATIONS ...... xiii

LIST OF SYMBOLS AND ABBREVIATIONS ...... xiv

CHAPTER 1. INTRODUCTION ...... 1

1.1 Biological background ...... 1

1.1.1 The importance of regulated protein degradation ...... 1 1.1.2 Ubiquitination and protein degradation ...... 1 1.1.3 E3 ubiquitin ligases ...... 4 1.1.4 The Siah family of proteins ...... 5 1.1.5 Siah Structure ...... 6 1.1.6 Diverse role and substrates of Siah ...... 8 1.1.7 Siah1 and cancer ...... 10 1.1.8 Role of Siah2 in cancer ...... 10 1.1.8.1 Hypoxia signaling connected to Siah2 in cancer ...... 12 1.1.8.2 RAS signaling connected to Siah2 in cancer ...... 14 1.1.9 E3 ubiquitin ligases as a cancer target ...... 15 1.1.10 Protein-Protein Interactions as drug targets ...... 16

1.2 Drug Development ...... 17

1.2.1 Introduction to drug discovery ...... 17 1.2.2 Stages in drug development and the role of in silico approaches .... 17 1.2.3 Computer-aided drug design (CADD) ...... 20 1.2.3.1 Homology modeling ...... 21 1.2.3.2 Molecular dynamics simulation ...... 23

iv 1.2.3.3 Structure based drug design (SBDD) ...... 24 1.2.3.4 Active site identification ...... 26 1.2.3.5 Virtual screening ...... 26 1.2.3.6 Docking ...... 28 1.2.3.7 Scoring methods ...... 30 1.2.3.8 Adsorption, distribution, metabolism and excretion (ADME) ...... 32 1.2.4 Successes and limitations of SBDD ...... 33

1.3 Protein crystallization and data collection ...... 35

Objectives ...... 36

CHAPTER 2. MATERIALS AND METHODS ...... 37

2.1 Homology modeling ...... 37 2.2 Validation of the modeled structure ...... 37 2.3 Protein simulation ...... 38 2.4 Docking ...... 38 2.5 Virtual screening ...... 39 2.6 Cell lines and cell culture ...... 39 2.7 Molecular cloning ...... 40 2.7.1 Human plasmid constructs ...... 40 2.7.2 Bacterial expression plasmid constructs ...... 40 2.8 Transient transfection ...... 41 2.8.1 Plasmid transfection ...... 41 2.8.2 Small-interfering (siRNA) transfection ...... 41 2.9 Protein isolation and concentration determination...... 42 2.10 SDS-PAGE ...... 42 2.11 Immunoblotting and detection ...... 42 2.12 Co-immunoprecipitaction ...... 43 2.13 GST pull down ...... 43 2.14 RNA isolation ...... 44 2.15 Real-time polymerase chain reaction using SYBR green ...... 44 2.16 Domain swapping using fusion PCR and mutants ...... 45 2.17 Drug treatments and preparation ...... 47

v 2.18 MTS assay ...... 47 2.19 Protein expression and purification of MBP-SBD-Siah2 and MBP- PHD3 ...... 48 2.20 Statistical analysis ...... 49 2.21 Isothermal titration calorimetry (ITC) ...... 49 2.22 Crystallization ...... 50

CHAPTER 3. IDENTIFICATION OF DRUG LIKE MOLECULES AGAINST SBD OF SIAH2 BY AN IN SILICO APPROACH ...... 51

3.1 Introduction ...... 51

3.2 Results ...... 52

3.2.1 Generation of SBD of Siah2 by homology modeling ...... 52 3.2.2 Evaluation of the modeled structure ...... 53 3.2.3 Molecular dynamics simulation of stability ...... 55 3.2.4 Docking and HTVS screening of compounds ...... 56 3.2.5 Selection of inhibitors ...... 57 3.2.6 Bioavailability of the lead molecules ...... 60

3.3 Discussions ...... 62

CHAPTER 4. EXPERIMENTAL VALIDATION OF DRUG LIKE MOLECULES BY IN VITRO AND IN VIVO BINDING ASSAYS ...... 64

4.1 Introduction ...... 64

4.2 Results ...... 65

4.2.1 SBD of Siah2 alone can independently interact with PHD3 irrespective of the N terminal Ring Domain ...... 65 4.2.2 GST pull down for SBD of Siah2 with PHD3 ...... 67 4.2.3 Expression and purification of MBP-SBD of Siah2 ...... 67 4.2.4 Expression and purification of MBP-PHD3 ...... 70 4.2.5 ITC analyses ...... 73 4.2.6 Compounds partially decrease SBD interaction with PHD3 ...... 77 4.2.6.1 Coimmunoprecipitation of SBD with PHD3 in the presence of compounds ...... 77

vi 4.2.6.2 GST pull down of SBD with PHD3 in the presence of compounds ...... 79 4.2.7 Basal level expression of Siah2 in breast cancer ...... 81 4.2.8 siRNA mediated silencing ...... 83 4.2.9 Endogenous Siah2 protein expression and antibody validation...... 84 4.2.10 Effect of Compounds on Cell Viability ...... 87 4.2.11 Analysis of the specificity of the compound 1 for Siah2 in MCF7 cells ...... 89

4.3 Discussions ...... 90

4.3.1 Advantages ...... 90 4.3.2 Limitations ...... 90

CHAPTER 5. IDENTIFICATION OF CRITICAL RESIDUES INVOLVED IN SUBSTRATE SPECIFICITY OF SIAH2 ...... 92

5.1 Introduction ...... 92

5.2 Results ...... 94

5.2.1 Siah1-PHD3 interaction ...... 94 5.2.2 SBD of Siah1 and Siah2 region swapped to obtain chimeric forms 96 5.2.3 Binding of wild type (WT) and chimera SBD's with substrates ...... 96 5.2.4 Identification of critical residues unique for Siah2 ...... 97 5.2.5 Mutagenesis of Siah1-2 and Siah2-1 ...... 98 5.2.6 Binding of wild type and mutated chimeric SBD's with substrates ...... 100 5.2.7 Crystallization ...... 102

5.3 Discussion ...... 104

CHAPTER 6. CONCLUSION AND FUTURE DIRECTIONS ...... 105

6.1 Conclusion ...... 105 6.1.1 Siah2 inhibition as anti-cancer therapy ...... 105 6.2 Future directions ...... 107

APPENDICES ...... 109

REFERENCES ...... 114

vii

SUMMARY

The Siah ubiquitin ligases play an important role in a number of signaling pathways in cancer progression and metastasis. One such critical pathway regulated by Siah2 in cancer is the hypoxia inducible factor 1α (Hif- 1α) dependent pathway, which gets activated in response to hypoxia. The domain architecture of Siah2 consist of a ‘really interesting new gene’ (RING) domain at its N-terminus, followed by two novel zinc-finger motifs and a highly conserved substrate binding domain (SBD) at its C-terminus. The substrate binding domain (SBD) of Siah2 is of importance in the induction of the Hif-1α dependent pathway by regulating prolyl hydroxylase 3 (PHD3), a well-studied substrate of Siah2 under hypoxia. Siah2 is also found to be involved in a Hif-1α independent pathway by regulating sprouty2 (SPRY2) via the Ras/ERK pathway. Since these pathways are critical in human cancers, various studies have suggested that Siah2 may be a suitable target for developing inhibitors. This study has three aims: the first is to identify specific drug like lead molecules towards the SBD of Siah2 using in silico high throughput virtual screening (HTVS); the second is to validate the identified compounds using in vitro and in vivo binding assays and the third is to identify the critical residues involved in Siah2 substrate specificity. Firstly, we prepared a structure of the SBD of Siah2 using a homology modeling approach, followed by molecular dynamics simulation in order to analyze the stability of the model. Eventually, we applied in silico structure- based HTVS and identified 5 ligands from the Maybridge database which interact with the SBD of Siah2. Menadione, a compound that has earlier been identified and characterized as a Siah2 inhibitor was used as a reference ligand for screening and selection of lead molecules. Secondly, since Siah2 is a non-catalytic protein, further experimental validation was performed targeting protein-protein interaction with partners. We experimentally validated the computationally identified potential and selective Siah2 inhibitors in vitro and in vivo through binding assays. The

viii results obtained show that the identified drug-like small molecules could partially inhibit protein-protein interaction. In addition, the compounds also possess anti proliferative and cytotoxic activity. Furthermore the backbone structural scaffold of these inhibitors could serve as building blocks in designing suitable drug-like molecules for cancer. The unique role of Siah1 and Siah2 is still unclear. Siah1 and Siah2 have their respective substrates as well as several common substrates. Our third study explores the substrate specificity of Siah1 and Siah2. We intend to identify the critical region or residues of Siah2 that confer substrate specificity. We have identified a few key residues that might play a crucial role in the binding of selective Siah2 substrates. Further mutational studies are being carried out to identify the specific residues involved in binding. This study might help in further understanding of the detailed roles of Siah1 and Siah2 and in designing more specific Siah based therapeutic strategies.

ix LIST OF TABLES

Table 1. Oncogenic role of Siah2 in various cancers and their respective effects ...... 11 Table 2. List of primers and corresponding template used for fusion PCR .. 46 Table 3. Ramachandran Plot statistics on 3D model of SBD of Siah2 calculated using PROCHECK ...... 55 Table 4. Glide extra-precision (XP) results for the five lead molecules and menadione using Schrodinger 9.0...... 60 Table 5. QikProp properties of the identified drug-like molecules and menadione using Schrodinger 9.0...... 61 Table 6. Thermodynamic parameters for interaction between SBD of Siah2 and PHD3 or compounds...... 76 Table A1. List of constructs and corresponding primers ...... 109

x LIST OF FIGURES

Figure 1. The ubiquitin proteolytic pathway...... 3 Figure 2. Domain architecture of Siah...... 6 Figure 3. Outline of Siah regulation and function...... 9 Figure 4. Hypoxic signaling connected to Siah2...... 14 Figure 5. Sequential procedure and steps involved in traditional drug development ...... 18 Figure 6. Integration of structural and informatics methods ...... 25 Figure 7. A docked pose of a ligand and protein receptor in its active site .... 29 Figure 8. Prediction of best-fit ligand to the binding site of the target by scoring function...... 30 Figure 9. Schematic flow of virtual screening process...... 33 Figure 10. Docking pose of zanamivir (Relenza) bound to influenza neuraminidase ...... 34 Figure 11. Modeled structure for SBD of Siah2 obtained from Modeller9v7...... 53 Figure 12. Model validation ...... 54 Figure 13. Total energy calculated by MD Simulation...... 56 Figure 14. Diagrammatic representation of the 13 predicted active site residues of SBD...... 57 Figure 15. Chemical structure of the five lead molecules...... 58 Figure 16. Binding poses of the five lead molecules and menadione to SBD of Siah2...... 59 Figure 17. Coimmunoprecipitation of Siah2 with PHD3...... 66 Figure 18. GST pull down of SBD with PHD3...... 67 Figure 19. SDS-PAGE affinity purification profile of MBP-SBD of Siah2. .. 68 Figure 20. Gel filtration profile of purified MBP-SBD of Siah2...... 69 Figure 21. SDS-PAGE affinity purification profile of MBP-PHD3 ...... 70 Figure 22. The ion-exchange chromatography profile of PHD3...... 71 Figure 23. Gel filtration profile of purified MBP-PHD3...... 72 Figure 24. ITC data for SBD with compounds and PHD3...... 75

xi Figure 25. Coimmunoprecipitation of SBD with PHD3 in the presence of the compounds...... 78 Figure 26. GST pull down of SBD with PHD3 in the presence of compounds...... 80 Figure 27. Agarose gel showing the size of single PCR products generated by Siah2 gene-specific primers for q-PCR ...... 82 Figure 28. mRNA expression analysis of Siah2 in breast cancer cell lines. .. 82 Figure 29. Validation of knockdown of Siah2 in HEK293T and MCF7 cell lines ...... 83 Figure 30. Rescue experiments using the Siah2 full length constructs...... 84 Figure 31. Analysis of endogeneous Siah2 protein expression and test for its stability...... 85 Figure 32. Antibody validation using siRNA oligos...... 86 Figure 33. Cell proliferative effects of compounds in MCF7 cell line...... 88 Figure 34. Specificity of Compound 1 to Siah2 in MCF 7 cell line...... 89 Figure 35. Association of Siah1 with PHD3...... 95 Figure 36. Diagrammatic representation of SBD of Siah1 and Siah2...... 96 Figure 37. Association of wild type and chimeric SBD's of Siah with its substrates...... 97 Figure 38. Sequence alignment of SBD of Siah1 and Siah2...... 98 Figure 39. Representation of mutagenesis and distribution of unique amino acids in chimeric forms of Siah...... 99 Figure 40. Siah2-1Mut restores its binding with its substrates...... 101 Figure 41. Analyses of Binding between mutated 2-1 and 1-2 chimeric SBD ...... 102 Figure 42. DLS Profile of MBP-SBD of Siah2 at 10mg/ml...... 104 Figure 43. Example illustration of steps involved in combinatorial chemistry based design of ligands...... 108 Figure 44. Purification of PHD3-His by refolding...... 111 Figure 45. Identification of MBP-SBD of Siah2 and MBP-PHD3 by peptide mass fingerprint...... 112

xii LIST OF PUBLICATIONS

Anupriya G, Kothapalli R, Basappa S, Chong YS, Annamalai L: Homology modeling and in silico screening of inhibitors for the substrate binding domain of human Siah2: implications for hypoxia-induced cancers. Journal of molecular modeling 2011, 17(12):3325-3332.

Kothapalli R, Khan AM, Basappa, Anupriya G, Chong YS, Annamalai L: Cheminformatics-based drug design approach for identification of inhibitors targeting the characteriztic residues of MMP-13 hemopexin domain. PloS one 2010, 5(8):e12494.

xiii LIST OF SYMBOLS AND ABBREVIATIONS

G Gibbs free energy change H Enthalpy change S Entropy change ADMET Adsorption, Distribution, Metabolism, Excretion and Toxicity APC Adenomatous polyposis coli AR Androgen receptor BLAST Basic local alignment search tool C/EBP The transcription factor CCAAT/enhancer-binding protein delta CaCl Calcium chloride CADD Computer aided drug design cDNA Complementary DNA CRPC Castration-resistant human prostate cancer DCC Deleted in colorectal carcinoma DCIS Ductal carcinoma in situ DLS Dynamic light scattering DMEM Dulbecco’s Modified Eagle Medium DMSO Dimethyl sulfoxide DNA Deoxyribonucleic acid DSBC Disulfide bond isomerase DTT Dithiothreitol DUBs Deubiquitinating enzymes E. coli Escherichia coli ER Estrogen receptor ERK Extracellular signal-regulated kinases FBS Fetal Bovine serum FoxA2 Forkhead box A2 FPLC Fast protein liquid chromatography GI Geninfo identifier GnHCL Guanidine hydrochloride

xiv GST Glutathione S-transferase HCC Human hepatocellular carcinoma HECT Homologous to E6-AP carboxy-terminus HEK 293T Human embryonic kidney 293T Hif-1α Hypoxia inducible factor 1α HIPK2 Homeodomain-interacting protein kinase 2 HTVS High throughput virtual screening IP Immunoprecipitation IPTG Isopropyl β-D-1-thiogalactopyranoside ITC Isothermal titration calorimetry

Ka Association constant

Kd Dissociation constant kDa kilo Daltons

Ki Inhibition constant LB Luria-bertani broth MAPK Mitogen-activated protein kinase MBP Maltose Binding Protein MD Molecular Dynamics mRNA Messenger ribonucleicacid MS Mass spectrometry MTD Maximum tolerance dose MTS 3-(4,5-dimethylthiazol-2-yl)-5-(3- carboxymethoxyphenyl)-2-(4- sulfophenyl)-2H- tetrazolium, inner salt NaCl Sodium chloride NCBI National centre for biotechnology information NcoR Nuclear receptor co-repressor 1 NED Neuroendocrine differentiation Ni-NTA Nickel-nitrilotriacetic acid NMR Nuclear magnetic resonance Nrf2 Nuclear factor (erythroid-derived 2)-like 2 OD Optical density OGDC-E2 2-oxoglutarate (also known as α-ketoglutarate

xv dehydrogenase complex) OPLS-AA Optimized potentials for liquid simulations all-atom PBS Phosphate buffered saline PCa Prostate adenocarcinomas PCR Polymerase chain reaction PDB Protein data bank Pdfs Probability density functions PHD Prolyl hydroxylases PHYL Phyllopod pI Isoelectric point Position-specific iterative-Basic local alignment PSI-BLAST search tool RING Really interesting new gene RMSD Root mean square deviation RNA Ribonucleic acid RPMI -1640 Roswell park memorial institute-1640 medium RT-PCR Real-time polymerase chain reaction SBD Substrate binding domain SBDD Structure based drug design Sodium dodecyl sulfate polyacrylamide gel SDS-PAGE electrophoresis SEC Size exclusion chromatography SIAH Seven in absentia homologue SINA Seven in absentia SIP Siah interacting protein SPRY2 Sprouty2 TBS Tris buffered saline TGF Tumor growth factor TIEG-1 TGF-beta induced early gene-1 TNFα Tumor necrosis factor alpha TRAF2 Tumor necrosis factor (TNF) receptor associated factor Tris Trisaminomethane

xvi Tris HCl Tris (hydroxymethyl) aminomethane TTk Tramtrack UBC Ubiquitin-conjugating enzyme UPP Ubiquitin proteasome pathway VDW Van der Waals WT Wild type ZnF Zinc fingers

xvii CHAPTER 1. INTRODUCTION

1.1 Biological background 1.1.1 The importance of regulated protein degradation In eukaryotes, intracellular protein degradation is a highly selective process, whereby proteins are broken down into their constituent amino acids at widely differing rates. This selective process impacts a range of cellular functions and serves several important homeostatic purposes. For example, gene expression patterns and developmental programs can be altered via coordinated destruction of transcriptional regulatory proteins (Conaway, Brower et al. 2002). Additionally, aberrant proteins that are mutated, damaged, or mis-folded and fail to degrade may pose a threat to cellular integrity. This is essential in protecting cells against environmental stress. The function and dysfunction of protein degradation have been implicated in the pathogenesis of many diseases including numerous cancer, neurodegenerative, immunological and several other disorders (Schwartz and Ciechanover 1999). Understanding the mechanisms involved in this process are therefore important for understanding and treating human disease. The intracellular degradation of protein is mainly achieved in two ways. First is by a lysosomal pathway, which is normally a non-selective process. The second and the major pathway is the ubiquitin proteasome pathway (UPP), a selective process which involves three phases. Proteins marked for degradation are covalently linked to ubiquitin known as Ubiquitination and the polyubiquinated protein is targeted to an ATP- dependent protease complex, the proteasome, for degradation and the ubiquitin is released (termed deubiquitination) and reused.

1.1.2 Ubiquitination and protein degradation Ubiquitin is a highly conserved, small protein of 76 amino acids and has a molecular mass of about 8.5 kDa. It acts as a tag for proteins to signal degradation, subcellular localization, membrane trafficking, DNA repair and chromatin dynamics (Murata, Yashiroda et al. 2009). The key feature of ubiquitin includes 7 lysine residues through which it forms a polymer by

1 forming covalent linkages through any one of the lysine residues. Polyubiquitin linkages through lys48 and lys63 are well-characterized. The substrates with lys48 linkages undergo degradation, whereas Lys63 linked chains tend to be directly involved in signal transduction (Thrower, Hoffman et al. 2000). The functional consequences of other ubiquitin linkages remain unclear, and current research aims to determine how those lysine linkages affect the protein substrates. The ubiquitination reaction requires at least three enzymes: an E1 ubiquitin activating enzyme, an E2 ubiquitin conjugating enzyme, an E3 that functions in a hierarchical system that allows ubiquitin to be activated and become covalently linked to protein substrates (Hershko 1996; Hershko and Ciechanover 1998). This conjugation process is reversible, and ubiquitin is removed from substrates by deubiquitinating enzymes (DUBs). An E4 ubiquitin chain assembly factor may be required if the E3 ubiquitin ligase is unable to polyubiquitinate a protein substrate (Koegl, Hoppe et al. 1999). Protein modification by ubiquitin also has non-conventional (non- degradative) functions that are dictated by the number of ubiquitin units attached to proteins (mono versus polyubiquitination). Monoubiquitination is the attachment of a single ubiquitin molecule to a substrate and is implicated in many cellular functions including protein trafficking between various cellular compartments, DNA repair and transcriptional regulation (Hicke and Dunn 2003). Polyubiquitination involves the addition of several ubiquitin molecules to the first ubiquitin moiety. These polyubiquitin chains target proteins for destruction, by a process known as proteolysis by the multiprotein complex 26S proteasome (Fig. 1). The 26S proteasome is a 2.5 MDa complex that is made up of two components: a 20S core particle and the 19S regulatory particle (Hanna and Finley 2007). The 20S core particle is a barrel-shaped complex that consists of seven proteins which are responsible for the peptidolytic activity of the proteasome. The 19S component is the regulatory subunit that can be further subdivided into two sub complexes the base and the lid. The base aids in substrate unfolding prior to degradation and is located proximal to the core particle; unfolding of the protein is required since the barrel of the 20S

2 proteasome core particle is too small to degrade folded proteins. The lid of the regulatory subunit is involved in substrate recognition (Braun, Glickman et al. 1999).

Figure 1. The ubiquitin proteolytic pathway. 1: Activation of ubiquitin by the ubiquitin-activating enzyme E1, a ubiquitin-carrier protein, E2 (ubiquitin- conjugating enzyme, UBC), and ATP. The product of this reaction is a high- energy E2∼ubiquitin thiol ester intermediate. 2: Binding of the protein substrate, via a defined recognition motif, to a specific ubiquitin-protein ligase, E3. 3: Multiple (n) cycles of conjugation of ubiquitin to the target substrate and synthesis of a polyubiquitin chain. E2 transfers the first activated ubiquitin moiety directly to the E3-bound substrate, and in following cycles, to previously conjugated ubiquitin moiety. Direct transfer of activated ubiquitin from E2 to the E3-bound substrate occurs in substrates targeted by RING finger E3s. 3′: As in3, but the activated ubiquitin moiety is transferred from E2 to a high-energy thiol intermediate on E3, before its conjugation to the E3- bound substrate or to the previously conjugated ubiquitin moiety. This reaction is catalyzed by HECT domain E3s. 4: Degradation of the ubiquitin-tagged substrate by the 26S proteasome complex with release of short peptides. 5: Ubiquitin is recycled via the activity of deubiquitinating enzymes (DUBs). Adapted from (Glickman and Ciechanover 2002).

3 1.1.3 E3 ubiquitin ligases The roles of E3 ubiquitin ligases can be broadly characterized as substrate recognition and polyubiquitin chain formation with the aid from E2 enzyme. E3 binds to a substrate via the degron – the ubiquitination signal, conferring substrate specificity, and E3 also catalyzes the ligation of one or more to the substrate (Pickart 2001) . In the human there are about 600 E3 ubiquitin ligases. However the detailed molecular mechanisms of protein selection are poorly understood and the exact protein substrates for each E3 are mostly unknown. Generally, E3 ubiquitin ligases can be classified into three main types based on their structure and mechanism of action. These include the HECT (homologous to E6-AP carboxy-terminus) domain family, the U-box family and the RING (really interesting new gene) finger family (Jackson, Eldridge et al. 2000). The HECT domain proteins are large monomeric E3’s which contain a cysteine residue that forms a covalent bond with the activated ubiquitin before transferring it to the substrate (Scheffner, Nuber et al. 1995). The RING and U-box E3 ligases however do not possess an active-site cysteine residue and these E3’s simply serve as a scaffold, bringing the E2- ubiquitin complex and substrate in close proximity to mediate the transfer of ubiquitin from the E2 enzyme directly to the substrate. The majority of E3 ubiquitin ligases are members of the RING finger family and over 600 human encoding RING finger E3s have been identified (Deshaies and Joazeiro 2009), The RING finger constitutes a small zinc-binding domain that binds to specific E2 enzymes (Lorick, Jensen et al. 1999; Joazeiro and Weissman 2000). The U-box E3s have a similar tertiary structure to that of the RING finger except that the U-box scaffold is stabilised by salt-bridges and hydrogen bonds rather than zinc ions (Aravind and Koonin 2000; Hatakeyama and Nakayama 2003).

4 1.1.4 The Siah family of proteins SIAH (Seven in Absentia Homolog) is a mammalian homolog of Seven in Absentia (SINA), a Drosophlila protein which was the first of the family to be defined that has a function in eye development (Carthew and Rubin 1990). The Siah family of proteins are evolutionarily conserved E3 ubiquitin ligases containing an N-terminal RING-finger domain required for interaction with E2 ubiquitin conjugating enzymes, and a C-terminal substrate binding domain (SBD) which interacts with target proteins (Hu, Chung et al. 1997). Specific target protein degradation via the ubiquitin–proteasome pathway is accomplished through the N-terminal RING domain of Siah proteins (Hu and Fearon 1999). Drosophila SINA was first isolated in a screen for mutations which affect the morphology of the Drosophila eye and SINA targets the transcriptional repressor, tramtrack 88 (TTK88) for degradation by cooperating with phyllopod (PHYL) and is essential for the correct development of R7 photoreceptor cells in the compound eye(Carthew, Neufeld et al. 1994). Another Drosophilia protein Sina-homologue, SinaH (46% identical to SINA) was recently identified (Cooper, Murawsky et al. 2008). Characterization of murine Sina homologues (Della, Senior et al. 1993) revealed that the mouse genome contains a family of five Siah genes, located at unlinked chromosomal positions (Holloway, Della et al. 1997). Two of these genes, Siah1-ps1 and Siah1-ps2, are pseudogenes as they contain a number of frame shifts and in-frame stop codons within the expected coding region. However, the remaining three genes, Siah1a, Siah1b and Siah2 have open reading frames and are expressed. In contrast, only two SINA homologues have been identified in the , Siah1 and Siah2 (Nemani, Linares-Cruz et al. 1996), both of which encode functional proteins.

5 1.1.5 Siah Structure

a

b

Figure 2. Domain architecture of Siah. (a) Siah 1 domain organization containing structurally characterized SBD. (b) Siah2 domain organization containing structurally uncharacterized SBD.

Structurally, the Siah family has a divergent N-terminal domain, and a highly conserved C-terminus, SBD that encompasses two Zinc fingers (ZnF) (Fig. 2). So far three SBD crystal structure of Siah1 have been solved and listed as follows. 1. The crystal structure of the SBD domain of murine Siah-1a was determined, and it shows that the cysteine-rich region forms a pair of zinc fingers and that the C-terminal region self-dimerizes. It was found that the SBD adopts eight- stranded β-sandwich fold that is homologous to tumor necrosis factor (TNF) receptor associated factor (TRAF) proteins (Polekhina, House et al. 2002). Later studies showed that Siah2, via degradation of TRAF2, was a regulator of TRAF2 signaling (Habelhah, Frew et al. 2002). It has been reported that many Siah binding proteins contain a common binding motif that may act as a degradation signal. This core sequence, PxAxVxP, often referred to as the ‘degron motif’, was found to be conserved and functional in a number of Siah-interacting proteins (examples such as SIP, TEIG-1, DCC). Other Siah interacting proteins, including KID, APC, synphylin-1, carry similar peptides that do not perfectly match the reported consensus degron motif (Dinkel, Michael et al. 2011). High affinity peptides derived from phyllopod and plectin have been shown to bind very strongly to

Siah1 with a Kd of ~100 nM and to compete effectively with a range of Siah- binding proteins, including SIP (House, Frew et al. 2003) 2. The crystal structure of SBD of human Siah1 bound to a degron motif containing peptide from SIP, was determined (Santelli, Leone et al. 2005). The

6 peptide binds to SBD with a kd of 24 ± 4µM. This study reported two acid patches on SBD of Siah1 which includes the residues Glu161, Asp162, Glu226 and Glu237 in the shallow groove, forms salt bridges with SIP peptide. 3. The crystal structure of SBD of human Siah1 bound to a degron motif- containing peptide from PHYL was determined and reveals a binding site for Siah-interacting proteins (House, Hancock et al. 2006). This site is distinct from those previously hypothesized as protein interaction sites. Various interacting residues that interact with the peptide were identified. Some residues that are reported to bind to the peptide includes, Thr156, Leu158, Asp162, Val164, Phe165, Leu166, Thr168, Ala175, Trp178, Asp177, Met180 and Asn276. Further mutations of the Siah1 residues that interact with the degron motif abrogate binding to the peptide. Mutations were performed based on the crystal structure of the SBD-peptide complex as a guide. Mutation of Met180Lys in the hydrophobic pocket abolished the binding of the target to a greater extent by Siah underlying the importance of the hydrophobic interaction observed in the crystal structure, and suggesting that it is a key recognition point. However other mutations in strands β0 and β1 do not have any pronounced effect as Met180. These Siah structures reveal a fold that has not been seen in other E3 structures (Schulman, Carrano et al. 2000; Zheng, Wang et al. 2000). To date no crystal structure of Siah2 has been determined although both Siah1 and Siah2 are expected to have a similar fold because of high sequence similarity. The domain structure of Siah2 is also similar to Siah1

7

1.1.6 Diverse role and substrates of Siah The Drosophila version of this protein, Sina is required for targeting the TTK88 for degradation , an essential process for neuronal differentiation of R7 photoreceptor cells in the developing fly eye (Tang, Neufeld et al. 1997). In mammals the substrates targeted for degradation are quite diverse. The role of the RING finger E3 ubiquitin ligase Siah is implicated in the control of diverse cellular biological events. The Siah proteins regulate ubiquitination-dependent degradation of multiple substrates, including transcriptional repressors NcoR/TIEG-1, activators β-catenin, netrin receptor, a microtuble-associated motor protein (Kid), TRAF2, OGDC-E2, PHD3, SPRY2, HIPK2, Nrf2, DCC and other proteins thus influencing an array of regulatory functions such as angiogenesis, DNA damage response, mitochondrial dynamics and Ras- and estrogen-receptor (ER) signaling (Fig. 3). More than 30 substrates of the Siah ubiquitin ligases have been identified to date and have been reviewed (Nakayama, Qi et al. 2009; Qi, Kim et al. 2013). It can directly mediate or bind to adaptor proteins to mediate degradation. For example, substrates such as transforming growth factor beta TGFβ, interacts directly with Siah (Johnsen, Subramaniam et al. 2002) whereas the interaction with substrates such as β-catenin and TTK88 are mediated through the adaptor proteins SIP/APC and PHYL (Matsuzawa and Reed 2001). As described earlier, SIP and PHYL peptides engage the human Siah SBD in a near identical manner with an apparent 100-fold higher affinity for the PHYL degron peptide than the SIP degron peptide (Santelli, Leone et al. 2005; House, Hancock et al. 2006). In addition to substrate ubiquitination, Siah1 and Siah2 can also self or auto-regulate their own stability in the presence of an E2 enzyme, a property that may be used as an auto-regulatory mechanism to control its own intracellular levels (Hu and Fearon 1999; Depaux, Regnier-Ricard et al. 2006). Siah2 mutant mice under normal conditions are fertile and phenotypically normal whereas Siah2 mutants under stress conditions show marked phenotypes, suggesting central role for it in maintaining normal cellular homeostasis and response to stress (Frew, Hammond et al. 2003). It is

8 interesting to note that most signaling pathways regulated by Siah2 are deregulated in human cancer.

Figure 3. Outline of Siah regulation and function. Factors involved in the regulation of Siah ligases are associated with stress response, including hypoxia, ER stress and genotoxic stress, which induce respective transcription factors, microRNA to induce Siah2 transcription, as well as post-translational modifications that determine Siah subcellular localization and activity. Consequently, Siah activities as an E3 ubiquitin ligase affects growing number of substrates associated with fundamental cellular processes including hypoxia, DNA damage response, neurodegeneration and cancer (Qi, Kim et al. 2013).

9

1.1.7 Siah1 and cancer Studies revealed that Siah1 plays an important role in regulating cell proliferation and cell apoptosis. Role of Siah1 in cancer is poorly understood. In this regard, most of the studies point Siah1 towards tumor suppressive role in contrast to oncogenic role of Siah2. Some of the early studies have reported Siah1 as a tumor suppressor that causes growth arrest or has pro-apoptotic properties (Matsuzawa, Takayama et al. 1998). It is also involved in the degradation of the oncogene β-catenin, via interactions with adenomatous polyposis coli (APC) (Liu, Stevens et al. 2001) and Siah-interacting protein (SIP) (Matsuzawa and Reed 2001) . So far there is only limited evidence to support this role The expression levels of Siah1 are reported to be reduced or down regulated in various cancers. Also inhibition or low levels of Siah1 has been shown to reduce apoptosis, thereby promoting cancer progression (Kim, Cho et al. 2004; Brauckhoff, Ehemann et al. 2007; Yoshibayashi, Okabe et al. 2007; Wen, Yang et al. 2010).

1.1.8 Role of Siah2 in cancer In contrast to the role of Siah1 in cancer, Siah2 in cancer is well characterized. Growing evidences highlight the functional role and supporting inhibition studies of Siah2 in the progression of multiple types of cancer, including breast (Sarkar, Sharan et al. 2012; Wong, Sceneay et al. 2012), lung (Ahmed, Schmidt et al. 2008), pancreatic (Schmidt, Park et al. 2007), prostate (Qi, Nakayama et al. 2010; Qi, Tripathi et al. 2013), liver (Malz, Aulmann et al. 2012) and melanoma (Qi, Nakayama et al. 2008). Additionally, Siah2 protein levels are significantly increased in the ductal carcinoma in situ (DCIS) tumor tissue in high-grade breast cancer (Behling, Tang et al. 2011; Wong, Sceneay et al. 2012), lung cancer (Ahmed, Schmidt et al. 2008), and prostate cancer (Qi, Nakayama et al. 2010), compared with its expression in the corresponding low-grade cancers or in normal tissues (Table 1).

10

Table 1. Oncogenic role of Siah2 in various cancers and their respective effects. (Wong and Moller 2013)

Cancer Studies Role type Mouse model of Inhibition of Siah2 with PHYL reduced tumor breast cancer growth (Moller, House et al. 2009) Mouse model of Siah2-knockout mice show delayed breast spontaneous tumor onset, reduced stromal infiltration, and breast cancer altered tumor angiogenesis (Wong, Sceneay et al. 2012) Human breast High Siah expression predicts DCIS cancer DCIS progression to invasive breast cancer tissue (Behling, Tang et al. 2011)

Human breast Siah2 upregulation in basal-like breast

cancer tissue cancers (Chan, Moller et al. 2011) Human cell line Src/Siah2 mediates degradation of C/EBP, a tumor suppressor protein (Sarkar, Sharan et al. 2012). Inhibition of Siah2 in fibroblasts

Breast cancer Breast and in the non-transformed mammary epithelial MCF10A cells confers resistance to the transforming activity of Ras and Src oncogenes, respectively Human cell line Estrogen increases Siah2 levels in estrogen receptor ER positive breast cancer cells, resulting in degradation of the nuclear receptor corepressor 1 (N-CoR) and a reduction in N-CoR–mediated repressive effects on gene expression, thus promoting cancer growth (Frasor, Danes et al. 2005). Mouse model of Inhibition of Siah2 activity reduces

melanoma melanoma progression via HIF dependent and

independent pathways (Qi, Nakayama et al. 2008) Mouse model of Siah2 inhibition with vitamin K3 blocks

Melanoma melanoma melanoma progression by disrupting hypoxia and MAPK signaling (Shah, Stebbins et al. 2009) Mouse model of Reduced tumor progression and metastasis in

spontaneous Siah2-null prostate cancer mice (Siah2- prostate cancer dependent regulation of Hif and FoxA2

and Human interaction). Increased Siah2 expression in Cancer Prostate Prostate prostate cancer human NE prostate tumor samples (Qi, tissues Nakayama et al. 2010)

11 Mouse model Siah2 expression is upregulated in Castration- of prostate Resistant Human Prostate Cancer [CRPC] cancer, Human tissues. Saih2 mediates the degradation of prostate cancer NCOR1-bound, transcriptionally-inactive AR tissue and cell (Androgen Recepor) there by promoting line expression of AR target genes, a key player in CRPC. Knockdown of Siah2 or AR significantly reduced proliferation of Rv1 and LNCaP cells. Inhibition of Siah2 promotes prostate cancer regression upon castration. (Qi, Tripathi et al. 2013) Xenograft Disrupting Siah2 function in lung cancer

mouse model of induced apoptosis and prevented tumor Lung

Cancer lung cancer growth (Ahmed, Schmidt et al. 2008)

Xenograft Siah mediates Ras signaling in pancreatic mouse model of tumor progression (Schmidt, Park et al. 2007) pancreatic

Cancer cancer Pancreatic Pancreatic

Human HCC Nuclear accumulation of Siah2 enhances tissue and cell. motility and proliferation of liver cancer cells (Malz, Aulmann et al. 2012). Inhibition of Siah2 expression also directly sensitizes human hepatocellular carcinoma (HCC) cells

Liver Cancer Liver to treatment with cytostatic drugs

The above studies support an oncogenic role of the Siah2 protein in promoting cancer and suggest that two main pathways are associated with it, for its activity: the hypoxic response pathway (prostate, melanoma, and breast cancer models) and the Ras signaling pathway (lung, pancreatic, and melanoma cancer models). The role of siah2 in these pathways are reviewed by Moller (House, Moller et al. 2009).

1.1.8.1 Hypoxia signaling connected to Siah2 in cancer Growth of solid tumors is associated with a poor oxygen supply (hypoxia) within the tumor mass, triggering a hypoxic response necessary to recruit new vasculature resulting in tumor cell survival (Seta, Spicer et al. 2002). HIF, the master regulator of hypoxia responses, has been implicated in cancer. HIF consists of a heterodimer between α-subunit (HIF-1α or HIF-2α) and β-subunit (HIF-1β). In a normoxic condition, the HIF-α level is regulated by the PHD (prolyl-hydroxylase) mediated hydroxylation of two proline residues that leads to VHL-dependent degradation (Maxwell, Wiesener et al.

12 1999). However under a hypoxic condition HIF-1α triggers a response that includes the expression of genes like VEGF necessary to recruit new blood vessels thus playing an important role in promoting tumor cell survival. A cancer response to hypoxia not only sustains tumor growth and survival, but through angiogenesis it fosters invasion and metastasis (Brizel, Scully et al. 1996). This pathway is well reviewed in (Semenza 2007). Studies provide a link between Siah2 and HIF in tumor development and progression (Qi, Nakayama et al. 2008; Qi, Nakayama et al. 2010). Siah2 is subjected to phosphorylation, which increases its ubiquitin targeted degradation activity under hypoxia conditions through mitogen-activated protein kinase (MAPK) p38 signaling cascade (Khurana, Nakayama et al. 2006) . Essentially, Siah2 is induced under hypoxic conditions, that control the stability of prolyl hydroxylases, PHD3 and PHD1 (Nakayama, Frew et al. 2004), resulting in proteasomal degradation of hydroxylases and impairment of HIF-1α (Qi, Nakayama et al. 2008), which is essential for HIF-1a’s association with and ubiquitination by pVHL (Ivan, Kondo et al. 2001). Another study reports that Siah2 controls the levels and activity of HIF-1α, which cooperates transcriptionally with FoxA2 to promote NE tumor development or formation of NED of human prostate cancers. However, in this study the signaling pathway of how Siah2 regulates HIF-1α is not described (Qi, Nakayama et al. 2010). Thus the studies in various mouse models show that Siah2 indirectly controls HIF-1α abundance, and is known to have a potential role in tumor development. Furthermore results showed that Siah2 seems to promote vasculogenesis, cell growth and metastasis (Fig. 4) (Qi, Nakayama et al. 2008). Siah2 knockout mice have a delayed and abrogated response to hypoxic conditions. At a cellular and tissue level, exposure of Siah2 mutants to hypoxia leads to significantly lower protein levels of HIF-1α, resulting in reduced hypoxia-induced gene expression (Nakayama, Frew et al. 2004). Consistent with this, a number of studies have shown that inhibition of Siah2 activity reduces HIF-1α levels under hypoxic conditions that subsequently blocks the formation of tumors and reduced metastasis (Qi, Nakayama et al.

13 2008; Moller, House et al. 2009; Shah, Stebbins et al. 2009; Qi, Nakayama et al. 2010). In addition, Siah2 has also been shown to regulate the stability of HIPK2 (Calzado, de la Vega et al. 2009), a transcriptional repressor of the HIF-1α gene (Nardinocchi, Puca et al. 2009). HIPK2 directly phosphorylates Siah2 resulting in a disruption of the HIPK2/Siah2 interaction, thus stabilizing HIPK2 and promoting apoptosis.

Figure 4. Hypoxic signaling connected to Siah2. Under the hypoxia condition, hypoxia-activated p38 and Akt pathways positively regulate Siah2 activity through phosphorylation and induction, respectively. Siah2 is phosphorylated by p38 on its serine and threonine residues, which will enhance its activity. Akt pathway increases the abundance of Siah2 by induction of its mRNA through signaling pathways yet to be clarified. Induced and activated Siah2 modulates the hypoxic signaling through PHD3 degradation and HIF-1α stabilization on one hand. Although, it is not yet identified, other Siah2 substrates would also play roles on hypoxic signaling by both HIF-1α-dependent and -independent mechanism. Adapted from (Nakayama, Qi et al. 2009).

1.1.8.2 RAS signaling connected to Siah2 in cancer Siah2 enhances the MAPK-ERK signaling pathway, primarily through direct ubiquitination-dependent modulation of Sprouty2 (SPRY2), a Ras

14 inhibitory protein that negatively regulates the MAPK-ERK signaling (Nadeau, Toher et al. 2007; Shaw, Meissner et al. 2007). Inhibition of Siah2 reduced Erk signaling, cell proliferation and increased apoptosis in various cancers (Schmidt, Park et al. 2007; Ahmed, Schmidt et al. 2008; Qi, Nakayama et al. 2008). In addition, attenuation of Sprouty2 expression effectively restores phospho-ERK levels in Siah2-inhibited melanoma cells and partially rescues tumor formation in vivo (Nadeau, Toher et al. 2007). Inhibition of Siah2 activity by a dominant negative Siah2 RING mutant in SW1 melanoma cells resulted in reduced tumor growth and metastases by robustly upregulating the levels of SPRY2 a specific inhibitor of Ras/ERK signaling. In contrast, the RING mutant inhibition of Siah1 has only little effect on levels of PHD3 and HIF-1α in SW1 cells (Shaw, Meissner et al. 2007). However inhibition of Siah2 substrate binding through PHYL (a peptide inhibitor for Siah2) reduced only metastatic spread and not tumor growth in melanoma cells growth via the hypoxic response pathway (Qi, Nakayama et al. 2008; Moller, House et al. 2009). These studies suggest that Siah2 may act through both pathways simultaneously or through different pathways at different stages of tumor progression. Consistent with the important role of Siah2 in HIF and ERK signaling, vitamin K3, a Siah2 inhibitor, reduces HIF and phospho-ERK levels in a human melanoma cell line and inhibits tumor formation in the mouse xenograft model (Shah, Stebbins et al. 2009). Taken together, these observations provide a foundation for the development of Siah2-targeted therapeutic agents for cancer therapy.

1.1.9 E3 ubiquitin ligases as a cancer target E3 ubiquitin ligases regulate a variety of biological processes, including cell growth and apoptosis, through the timely ubiquitination and degradation of many cell cycle- and apoptosis-regulatory proteins. Abnormal regulation of E3 ligases has been convincingly shown to contribute to cancer development (Nakayama and Nakayama 2006). Thus, targeting E3 ubiquitin ligases for cancer therapy has gained increasing attention, which is further stimulated by the recent approval of a general proteasomal inhibitor, bortezomib (Velcade, Millennium, MA), for the treatment of relapsed and

15 refractory multiple myeloma (Richardson, Mitsiades et al. 2006), as well as for the discovery of a new class of proteasome inhibitors (Chauhan, Catley et al. 2005). Targeting a particular E3 ubiquitin ligase rather than the total proteasome system is expected to increase specificity and limit toxicity. This became even more blossomed after the approval of SMAC mimetics, a small inhibitor for E3 IAP family (Feltham, Bettjeman et al. 2011). 1.1.10 Protein-Protein Interactions as drug targets Protein-Protein interactions (PPIs) are essential for many biological functions. Numerous investigations in genomics, proteomics and bioinformatics have provided convincing experimental evidence that PPIs participate in networks of interactions that may also involve other biological macromolecules (Schwikowski, Uetz et al. 2000; Schächter 2002). PPIs are increasingly attracting attention as a new frontier in drug development, particularly where they are involved in the regulation of cellular function and disease states. The advantage of targeting PPIs is mainly the biological specificity of inhibition (Valkov, Sharpe et al. 2012). It might, therefore, be possible to limit the adverse effects of a drug by selectively targeting one of the many functions the protein might have.

16 1.2 Drug Development 1.2.1 Introduction to drug discovery The search for new, effective and safe drugs has become increasingly sophisticated. Two pronounced characteriztics marked the modern age of the pharmaceutical industry: “competitiveness” and “high cost”. Driven by the high exclusive marketing profit, competition between pharmaceutical companies is much more intensive than before (Drews and Ryser 1997). “Innovate or die” is the first rule of international industrial competition. The cost of discovering a new drug is becoming very high. The initial statistics of drug discovery usually shows that it would take 12-15 years and more than $1 billion to discover a new drug and this cost has been growing at a rate of 20% per year (Heilman 1995; Yevich 1996). To alleviate this problem, efforts have been directed to reduce the cost and the time span needed for the discovery of a new drug. In consideration of the current patent protection period of 20 years for new drugs, any advance in marketing a drug more quickly is desirable. In addition to its great contribution to the improvement of our life qualities, earlier marketing is also enormously profitable.

1.2.2 Stages in drug development and the role of in silico approaches The stages of drug development are simplified as follows, 1. Pre-Clinical trials (also known as Phase 0). This phase mainly consists . Discovery phase: identification of target . Lead identification and optimization . Toxicology 2. Clinical development: Phase1, 2 & 3 3. Marketing The hierarchical process of traditional drug discovery after the target is identified is outlined in Fig. 8. As the preclinical stage normally takes about 6 years, and if it were to be shortened to 2 or 3 years, not only a big amount of research and development funding could be saved, but also the more important and exclusive clinical and marketing periods would be benefited. More and

17 more computer approaches are now being developed to reduce the cost and cycle time for discovering a new drug. In silico approaches in drug discovery and development are discussed in detail Section 1.14.

Figure 5. Sequential procedure and steps involved in traditional drug development

Target Identification The first step is to identify suitable biological targets, such as hormone, receptor, enzyme, peptide and nucleic acid based on a thorough understanding of regulatory networks and metabolic pathways. The second step is to design a highly specific drug based on the known three-dimensional (3D) structure of that target. X-ray crystallography is the most powerful technique for determining the three-dimensional structure of a protein but this can be a time-consuming process, and it will succeed only if it is possible to find suitable conditions for growing crystals. This can therefore easily become a bottleneck in drug designing projects. Also, some proteins have highly flexible domains making it difficult to crystallize them. Since the number of protein folds used by Nature is limited, homologous proteins can be expected to have a similar fold (Govindarajan, Recabarren et al. 1999). Hence homology based modeling,

18 which is discussed in section 1.14.1 is a practical and reasonable alternative to crystallography.

Lead identification and optimization The step of lead discovery is considered a bottle-neck of the drug discovery process (Kenny, Bushfield et al. 1998; Langer and Hoffmann 2001). In the past, leads were mainly discovered by random screening of a large chemical library. The sources of chemicals can be diverse such as active ingredients of natural products, derivatives of existing drugs, or even randomly synthesized chemicals. Most large pharmaceutical companies have their own libraries, which contain the chemicals accumulated from years of efforts. It was reported that only one potential lead can be identified from random screening of 5-10 thousand chemicals (Yevich 1996). Therefore, the efficiency of mere screening is very low. The increasingly better understanding of a drug-target interaction mechanism and rapid advances in biochemistry and organic chemistry lead to the advent of computer aided drug design (CADD), which aims to help the rapid and efficient discovery of drug leads (Marshall 1987; Vedani 1991; Ooms 2000; Veselovsky and Ivanov 2003; Bienstock 2012; Chen 2013). The role of CADD in lead identification and optimization has been discussed in Section 1.14

Toxicology In pre-clinical treatment before administering to human, in vitro and in vivo tests of a lead molecule are conducted. Acute and short term toxicity of lead molecules are evaluated on animals. The physiological and biological effects of lead molecules are observed and a molecule is absorbed, distributed, metabolized and excreted in animals need to be addressed. The lethal dose limits of a lead molecule is determined in this phase. Only one out of approximately 5000-10000 molecules facing pre-clinical tests is usually approved for marketing (Klees and Joines 1997). Clinical trials involving new drugs are commonly classified into four phases. Each phase of the drug approval process is treated as a separate clinical trial.

19 Phase 1: Screening for safety. (human pharmacology) Phase I studies are used to evaluate pharmacokinetic parameters and tolerance, generally in healthy volunteers, normally about 20-80 individuals. It is the first stage to test a drug on human subjects. The test usually starts with very small doses and subsequently increased. Escalating doses of a drug are administered to determine the maximum tolerance dose (MTD), which can induce the first symptom of toxicity.

Phase 2: Establishing a testing protocol. (therapeutic exploratory) After the Phase I trials, when the initial safety levels of a drug have been confirmed, Phase II clinical studies are conducted on about 100 to 300 patients to assess the efficacy of a lead molecule and to determine how well the drug works. Additional safety, clinical and pharmacological studies are also included in this study. During this phase, effective dose, method of delivery, safety and dose intervals of a lead molecule are established.

Phase 3: Final testing. (therapeutic confirmatory) In Phase 3 trials, treatment is given to a large group of people (1,000- 3,000) to confirm the effectiveness of a drug, monitor side effects, compare it to commonly used treatments, and collect information that will allow a drug to be used safely. The drug-development process will normally proceed through all four phases over many years. If the drug successfully passes through Phases 1, 2, and 3, it will usually be approved by the national regulatory authority of a country for use in the general population.

Phase 4: Post approval studies. I Phase 4 trials, post-marketing studies delineate additional information, including the risk of treatment, benefits, and optimal use.

1.2.3 Computer-aided drug design (CADD) Computer-aided drug design is typically carried out by computationally docking sets of small molecular compounds and peptides into a rigid model of the active site of a protein. In the past, docking methods were

20 primarily used to identify a small set (10–100), of molecules, which would be active against a single target. More recently, docking methods have been used to prioritize combinatorial libraries and create screening libraries focused on specific gene families/macromolecular targets. Following are the approaches involved in CADD.

1.2.3.1 Homology modeling Homology modeling aims to build three-dimensional protein structure models using experimentally determined structures of related family members as templates based on the observation that the structures are more conserved than the corresponding sequences (Lesk and Chothia 1980; Chothia and Lesk 1986). The active site can have very similar geometries, even for distantly related proteins (Zvelebil, Barton et al. 1987; Raha, Wollacott et al. 2000). The methodology involves four steps: template selection, sequence alignment, model building and model validation (Brenner, Chothia et al. 1998; Al- Lazikani, Jung et al. 2001).

Template identification and selection Template selection starts with identifying one (or more) experimentally determined structures (the template), which is likely to have structure homology with the sequence of interest (the target). Heuristic search methods such as BLAST (Altschul, Gish et al. 1990) and FASTA (Pearson and Lipman 1988), are often used for this as they are fast and accurate. In difficult cases more sensitive fold recognition methods, which utilize techniques such as hidden Markov methods, neural networks, iterated searches (e.g. PSI-BLAST (Pearl, Martin et al. 2001)), and evolutionary information that offer high relaiablility for the model and can be used to scan a structural database for suitable templates (Edwards and Cottage 2003).

Target and template sequence alignment Once the best template for modeling is identified, an optimal alignment of the target and template must be made. The alignment created by a search method or a fold recognition method may be sub-optimal with respect to modeling. Different score matrices are needed in order to get an optimal

21 alignment for homology modeling as compared to fold recognition, possibly because fold recognition needs to focus on conserved regions whereas homology modeling needs to take all regions into account. The Smith- Waterman algorithm uses dynamic programming to find an optimal alignment between two sequences, given a scoring matrix and a gap model (Smith and Waterman 1981). If multiple templates were found with significant sequence similarity, then the use of alignments based on multiple sequences is implied, as it highlights evolutionary relationships, and increase the alignment accuracy (Jaroszewski, Rychlewski et al. 2000). Several tools like ClustalX (Thompson, Gibson et al. 1997), and T-Coffee (Notredame, Higgins et al. 2000) serve this purpose. Other interesting methods include machine learning fast Fourier transform, improved score matrices and new scoring functions have also been developed to give a quantitative measure of alignment accuracy (Cristobal, Zemla et al. 2001; Karwath and King 2002). If sequence similarity is low, then structurally aligned templates are a good starting point to improve the alignment quality (Yang 2002). DALI (Holm and Sander 1993) is an example of the multiple structural and templates alignment templates.

Model building Model building consists of three main steps which involves main-chain building, loop modeling, basically de novo model building and side-chain modeling, an optimization step. Building the core is based on the approaches like rigid body superimposition that constructs the model from a few core sections defined by the average of Cα atoms in conserved regions. Distance geometry uses spatial restraints obtained from alignment. Segment matching uses a database of short segments of a protein structure, with a combination of energy and geometry rules (Deane, Kaas et al. 2001). Several programs are available for homology modeling. SwissModel (Guex and Peitsch 1997) a popular implementation of the rigid body approach is a commonly used online tool. The backbone is rebuilt based on the positions of Cα atoms, using a library of backbone elements derived from high quality Xray structures. Homology modeling in MODELLER (Sali and Blundell 1993) is very reliable and is performed based on satisfying spatial restraints. Distance and

22 dihedral angle restraints on the target structure are also generated, based on the alignment to the template structure. Corresponding distances and angles between aligned residues in the template and target structures are assumed to be similar. Restraints on bond lengths, bond angles, dihedral angles and non- bonded atom-atom contacts are also derived from statistical analysis of the relationships between Cα atoms, solvent accessibilities and side-chain torsion angles in known protein structures. The restraints are expressed as probability density functions (pdfs). These pdfs are combined to give a molecular function, which is optimized using a combination of energy minimization with molecular dynamics and simulated annealing.

1.2.3.2 Molecular dynamics simulation Molecular dynamics (MD) simulation is a computer simulation of the physical movements of atoms and molecules which are allowed to interact for a period of time. Computer simulations are also an alternative to complex experiments to predict the thermodynamic and kinetic roles of amino acids in proteins (Dokholyan, Buldyrev et al. 1998; Munoz and Eaton 1999). Molecular dynamics investigates the motion of discrete particles described by a potential energy function under external forces and quantifies them (Kandt, Ash et al. 2007). With the knowledge of these forces it is possible to calculate the dynamic behavior of the system using classical equations of motion. This potential energy function that consists of a set of equations that empirically describe bonded and non-bonded interactions between atoms is referred to as “force field” (Ponder and Case 2003). The bonded interactions between atoms are via covalent bonds, which typically includes bonds, bond angles, and dihedrals. Non-bonded interactions are electrostatic interactions between the partial charges on each atom and a Lennard-Jones potential to model dispersive van der Waals interactions. Each molecule in the simulation is described by its ‘topology’, the combination of the set of all atoms with their non-bonded and bonded interaction parameters and the connectivity of those atoms in the molecule's in silico MD simulations and provides a means to facilitate the interpretation of experimental data. On the other hand, the complexity and vast dimensionality of the protein conformational space makes the folding time too long to be reachable

23 by computational studies at the atomic detail (Taketomi, Ueda et al. 1975; Shakhnovich 1997). Simplified models became popular due to their ability to reach reasonable time scales and to reproduce the basic thermodynamic and kinetic properties of proteins (Privalov 1989). They are (i) unique native state (NS), i.e. there should exist a single conformation with the lowest potential energy (ii) cooperative folding transition (iii) thermodynamical stability of the NS (iv) kinetic accessibility, i.e. the NS should be reachable in a biologically reasonable time. Simulation of complex systems of hundreds and millions of atoms is achieved by GROMACS (see www.GROMACS.org), one of the fastest and most popular molecular dynamics software packages. GROMACS was designed primarily for simulations of nucleic acids, proteins, and lipids based on non-bonding interactions. The software is not capable of simulations where bonds are broken and reformed, such as chemical/enzymatic reactions. But for situations where non-bonding interactions are expected to predominate, GROMACS can provide information about the behavior of solute and solvent molecules, and potentially support and explain experimental data.

1.2.3.3 Structure based drug design (SBDD) A drug is most commonly a small organic molecule that activates or inhibits the function of a biomolecule such as a protein which in turn results in a therapeutic benefit to a patient. In most cases, drug designing involves the development of small molecules that are complementary in shape and charge to a biomolecular target with which they interact. (Cohen 1996). Structure- based drug design relies on the knowledge of the three dimensional structure of a biological target obtained through methods such as X-ray crystallography or nuclear magnetic resonance (NMR) spectroscopy (Leach and Jhoti. 2007). If an experimental structure of a target is not available, it may be possible to create a homology model of the target, based on the experimental structure of a related protein as discussed earlier in Section 1.14. In such cases, SBDD frequently relies on computational techniques and involves iterative steps of identifying new therapeutics based on a particular biological target (Madsen, Krogsgaard-Larsen. et al. 2002). Using the structure of a biological target, candidate drugs that are predicted to bind with high affinity and selectivity

24 may be designed using interactive graphics and the intuition of medicinal chemistry. Alternatively various automated computational procedures may be used to suggest new drug candidates. The complete iterative process of this is illustrated in Fig. 6

Experimental structure determination 3D protein structures (templates for (X-ray crystallography, NMR) model building)

Virtual libraries (diverse Sequences of novel compounds, chemical Comparative modelling therapeutic targets inventory)

3D structures of therapeutic targets

Structure-based library design High-throughput docking (in silico of (design of combinatorial libraries compounds on 3D structures of taking 3D constraints of binding sites targets) into account)

Hits

Structure-based analog design (design of modified hits using 3D Leads protein-ligand complexes and computer simulations)

Figure 6. Integration of structural and informatics methods. The diagram illustrates how different computational approaches complement each other in the context of structure-based drug design (Tang and Marshall 2011).

SBDD can potentially reduce the numbers of compounds that need to be evaluated, and lead to new directions for synthetic optimization (Scapin 2006). Current methods for structure-based drug designing can be divided roughly into two categories. The first category is about “finding” ligands for a given target which is usually referred as database searching. In this case, a large number of potential ligand molecules are screened to find those fitting the binding pocket of the target. This method is usually referred as ligand- based drug design. The key advantage of database searching is that it saves synthetic effort to obtain new lead compounds. Another category of structure- based drug design methods is about “building” ligands, which is usually

25 referred as receptor-based drug design. In this case, ligand molecules are built up within the constraints of the binding pocket by assembling small pieces in a stepwise manner. These pieces can be either individual atoms or molecular fragments. The key advantage of such a method is that novel structures, not contained in any database, can be suggested (R., Y. et al. 2000; Jorgensen 2004; Schneider and Fechner 2005).

1.2.3.4 Active site identification Active site identification is the first step in this program. It analyses a protein to assign a suitable drug binding pocket, derives key interaction sites within the binding pocket, and then prepares necessary data for ligand- fragment link. The basic inputs for this step are the 3D-structure of the protein and a pre-docked ligand, as well as their atomic properties. Both ligand and protein atoms need to be classified and their atomic properties should be defined, basically, into four types: hydrophobic atoms (all carbons); H-bond donor (oxygen and nitrogen atoms bonded to hydrogen atoms); H-bond acceptor (oxygen and sp2 or sp hybridized nitrogen atoms with lone electron pairs); polar atom (oxygen and nitrogen atoms that are neither H-bond donor nor H-bond acceptor, sulphur, phosphorus, halogen, metal, and carbon atoms bonded to hetero-atoms). The space inside the ligand binding region would be studied with virtual probe atoms of the above four types so that the chemical environment of all spots in the ligand binding region can be known. This will suggest what kind of chemical fragments can be put into their corresponding spots in the ligand binding region of the target. A typical in silico drug design cycle consists of docking, scoring and ranking initial hits on the basis of their steric and electrostatic interactions with the target site, which is commonly referred to as virtual screening.

1.2.3.5 Virtual screening High-throughput screening is the most common experimental method used to identify lead compounds. Because of the cost, time, and resources required for performing high-throughput screening for compound libraries, the use of alternative strategies is necessary for facilitating lead discovery. Virtual

26 screening has been successful in prioritizing large chemical libraries to identify experimentally active compounds, serving as a practical and effective alternative to high-throughput screening. Virtual screening is a computational method for identifying lead compounds from large and chemically diverse compound libraries from public or specifically compiled databases of hundreds of thousands of organic or synthetic compounds (Oprea and Matter 2004; Shoichet 2004) It is always advantageous to begin the screening with established drug compounds. This will enable a developer to rapidly prototype a candidate ligand whose chemistry is well known and within the intellectual property of the developer. The initial successful hits are called lead compounds. The interaction of a lead compound with its target can be verified by X-ray crystallography, if possible. From this point onward, a cycle of iterative chemical refinement and testing continues until a drug is developed that undergoes clinical trials. As a consequence of lack of knowledge about function-determining features of desired ligand molecules in the early stages of the discovery process, virtual screening had become a complementary strategy to high throughput screening (HTS) as follows. 1. By suggesting new chemical entities that are designed by taking into consideration of several aspects of lead and drug-likeness and synthetic accessibility; by making full use of pre-existing knowledge, such as reference ligands or receptor models. 2. By facilitating the designing of, activity enriched screening sets with increased hit rates. This computational method is valuable for discovering lead compounds in a faster, more cost-efficient, and less resources-intensive manner compared to experimental methods such as high-throughput screening. The virtual screening could be separated into two steps: docking and scoring (Tang and Marshall 2011) . Methodologies used in virtual screening, such as molecular docking and scoring, have advanced to a point where they can rapidly and accurately identify lead compounds in addition to predicting native binding conformation.

27 1.2.3.6 Docking The docking process involves the prediction of ligand conformation and orientation (called posing) within the binding site of a target (Fig. 7). In general, there are two aims of docking studies: accurate structural modeling and correct prediction of activity. However, the identification of the molecular features that are responsible for specific biological recognition, or the prediction of compound modifications that improve potency, are complex issues that are often difficult to understand and even more so to simulate on a computer. In view of these challenges, docking is generally devised as a multi- step process in which each step introduces one or more additional degree of complexity (Brooijmans and Kuntz 2003). The process begins with the application of docking algorithms that place small molecules in the active site. The outcome of this process determines whether a given conformation and orientation of a ligant fits the active site. This in itself is challenging, as even relatively a simple organic molecule can contain many conformational degrees of freedom. Sampling these degrees of freedom must be performed with sufficient accuracy to identify the conformation that best matches a receptor structure, and must be fast enough to permit the evaluation of thousands of compounds in a given docking run.

28

a b

Figure 7. A docked pose of a ligand and protein receptor in its active site. (a) The compound is displayed in ball and sticks and the macromolecule is displayed as surface representation by use of PyMOL. (b) A diagram illustrating the docking of a small molecule ligand (brown) to a protein receptor (green) to produce a complex.

Algorithms are complemented by scoring functions (both posing and ranking involve scoring). The pose score is often a rough measure of the fit of a ligand into the active site. The rank score is generally more complex and attempt to estimate binding energies. The scores are designed to predict the biological activity through the evaluation of interactions between compounds and potential targets. Early scoring functions evaluated compound fits on the basis of calculations of approximate shape and electrostatic complementarities. Relatively simple scoring functions continue to be heavily used, at least during the early stages of docking simulations. Pre-selected conformers are often further evaluated using more complex scoring schemes with a more detailed treatment of electrostatic and Van der Waals interactions, and inclusion of at least some solvation or entropic effects (Gohlke and Klebe 2002). It should also be noted that ligand binding events are driven by a combination of enthalpic and entropic effects, and that either entropy or enthalpy can dominate specific interactions. This often presents a conceptual problem for contemporary scoring functions (discussed below), because most of them are much more focused on capturing energetic than entropic effects. In addition to the problems associated with scoring of compound conformations, other complications exist that make it challenging to accurately

29 predict binding conformations and compound activity. These include, among others, limited resolution of crystallographic targets, inherent flexibility, induced fit or other conformational changes that occur on binding, and the participation of water molecules in protein–ligand interactions.

1.2.3.7 Scoring methods Structure-based drug designing attempts to use the structure of proteins as a basis for designing new ligands by applying accepted principles of molecular recognition. The basic assumption underlying structure-based drug designing is that a good ligand molecule should bind tightly to its target. Thus, one of the most important principles for designing or obtaining potential new ligands is to predict the binding affinity of a ligand to its target and use it as a criterion for selection. It gives the best possible conformation of the ligand to the target with least binding energy (Fig. 8).

Figure 8. Prediction of best-fit ligand to the binding site of the target by scoring function.

30 A method was developed by Bohm (Bohm 1994) to develop a general-purpose empirical scoring function in order to describe binding energy. The following equation (Eq.1) was derived:

Gbind Kd

Kd

Gbind = Gdesolvation + Gmotion + Gconfiguration + Ginteraction Eq. 1 where: desolvation is the enthalpic penalty for removing the ligand from solvent; motion is the entropic penalty for reducing the degrees of freedom when a ligand binds to its receptor; configuration is the conformational strain energy required to put the ligand in its "active" conformation; interaction is the enthalpic gain for resolvating the ligand with its receptor. The basic idea is that the overall binding free energy can be distributed into the independent components that are known to be important for the binding process. Each component reflects a certain kind of free energy alteration during the binding process between a ligand and its target. The master equation is the linear combination of these components. According to the Gibbs free energy equation, a relationship between the dissociation equilibrium constant, Kd, and the components of free energy can be built. Various computational methods are used to estimate each of the components of the equation. For example, the change in polar surface area upon ligand binding can be used to estimate the desolvation energy. The number of rotatable bonds, frozen upon ligand binding, is proportional to the motion term. The configuration or strain energy can be estimated using molecular mechanics calculations. Finally the interaction energy can be estimated using methods such as the change in non-polar surface, statistically derived potentials of mean force and the number of hydrogen bonds formed. In practice, the components of the equation are fit to experimental data using multiple linear regression. This can be achieved with a diverse training set, including many types of ligands and targets to produce a less accurate but a more general global model or a more restricted set of ligands and targets to produce a more accurate but less general local model (Gohlke, Hendlich et al. 2000; Clark, Strizhev et al. 2002; Wang, Lai et al. 2002).

31 Drug discovery pipelines in pharmaceutical companies are fuelled by HTS as one of the major sources of new hit-to-lead candidates. As experimental methods, such as X-ray crystallography and NMR, develop, the amount of information concerning 3D structures of biomolecular targets has increased dramatically. In parallel, the information about structural dynamics and electronic properties of ligands has also increased. This has encouraged rapid development of the structure-based drug designing.

1.2.3.8 Adsorption, distribution, metabolism and excretion (ADME) While virtual screening was expected to generate new drugs easily, just by virtue of the sheer size of available libraries, in fact, things were not that straight forward (Lahana 1999). In the early days, many large sets of biologically inactive molecules were tested, most often as ill-defined mixtures. Library composition has subsequently been significantly influenced by Chris Lipinski, who formulated simple rules for predicting which compounds would support useful bioavailability (Lipinski, Lombardo et al. 2001). The bioavailability of drug molecules could be analyzed by its physicochemical properties using Lipinski’s rule of five. According to Lipinksi's 'rule of five' (Lipinski, Lombardo et al. 2012), drug candidates should have a molecular mass below 500 Daltons, a lipophilicity below Log P = 5, and contain no more than 5 hydrogen bond donors and no more than 10 oxygen and nitrogen atoms. Furthermore, poor passive absorption is to be expected if two or more of these conditions are violated. As an alternative to the Lipinski rules, polar surface area (Ertl, Rohde et al. 2000) or VolSurf parameters (Cruciani, Pastor et al. 2000) can be used to predict oral absorption (Jorgensen 2004; Shoichet 2004; Scapin 2006; Leach and Jhoti. 2007) and blood–brain barrier penetration (Kelder, Grootenhuis et al. 1999; Cruciani, Pastor et al. 2000). The flexibility of molecules has been recognized as another factor influencing bioavailability (Veber, Johnson et al. 2002). Violation of the Lipinski rules was indeed the main reason for failure of early combinatorial libraries, and the application of the rules significantly aided improvements. In this context, two investigations are most often cited (Bohm 1994), which correlated about 40% of failures in clinical development

32 with inappropriate pharmacokinetics, that is, a lack of sufficient oral efficacy. Correspondingly, absorption, distribution, metabolism and excretion (ADME) parameters are now considered to be key factors in drug development. The sequential analysis of virtual screening is presented below in Fig. 9.

• Available from public database 100000

• Screened and narrowed down to ‘n’ number of molecules based on pharmacokinetics properties 1000 (Lipinskis rule of five) using computational techniques

• Compounds that have better binding energy than 25 known

• Analysed based on their specificity towards binding 5 site residues

Figure 9. Schematic flow of virtual screening process.

1.2.4 Successes and limitations of SBDD Successes SBDD has proven to be successful in the design of the following drugs which are already in clinical trials, of medically relevant proteins in complex with compounds developed during the drug discovery process. 1. Dorzolamide (Ki = 0.37nM )(Glaucoma treatment) (Baldwin, Ponticello et al. 1989) 2. Saquinavir, Indinavir, Ritonavir, Nelfinavir (HIV therapy) (Wlodawer and Vondrasek 1998; Edwards 2001) Agouron Pharmaceuticals and Vertex Pharmaceuticals, were both successful in designing the HIV protease inhibitors nelfinavir 3 (Ki = 2.0nM) and amprenavir 4 (Ki = 0.6nM) 3. Zamnavir: Neuroaminidase inhibitors (anti-influenza) (Fig. 10) (von Itzstein, Wu et al. 1993)

33 4. COX-2 inhibitors (anti-inflammatory) 5. Phospholipase A(2) inhibitors (anti-inflammatory)

Figure 10. Docking pose of zanamivir (Relenza) bound to influenza neuraminidase. An example for structure based drug design showing ligand in virtual space docked with the crystal structure of the target protein. Residues involved in ligand binding are highlighted in three letter codes (Blundell, Jhoti et al. 2002).

The design of estrogen receptor subtype-selective (ERα and ERβ) ligands is an exciting success story of homology modeling and structure-based design (Hillisch, Pineda et al. 2004). These compounds go through numerous iterations as they progress towards clinical trials. In an industrial environment, where multiple projects are being worked simultaneously, with each of them likely having multiple compound series, the number of required total complex structures can be very high (hundreds per year). Also, the success ratio of crystals with the ligand to harvested crystals will be very small. The efficiency of the process can be improved only by using bioinformatics tools for sample tracking and automation for sample handling (Gyorgy 2012).

34 Limitations Although docking is a relatively well-developed technique, it still suffers from some limitations. One limitation is that docking studies are typically carried out using a rigid target model. Although a few docking methods allow a certain degree of side-chain flexibility, these programs are currently unable to model loop movements or induced-fit effects, which are found in many systems (Ajay and Murcko 1995; Charifson, Corkery et al. 1999; Muegge and Rarey 2001). Combinatorial chemistry and structure based validation assist the refinement process significantly. This is a very powerful technique that chemists employ to aid in the refinement of a lead compound. It is a synthetic tool that enables chemists to rapidly generate thousands of lead compound derivatives for testing. A scaffold is employed that contains a portion of the ligand that remains constant. Subsite groups are potential sites for derivatization. These subsites are then reacted with combinatorial libraries to generate a multitude of derivative structures, each with different substituent groups. If a scaffold contains three derivatization sites and the library contains ten groups per site, theoretically 1000 different combinations are possible. By carefully selecting libraries, based upon the study of the active site, we can target the derivatization process towards optimizing ligand receptor interaction.

1.3 Protein crystallization and data collection

Protein crystals can be interpreted as an infinite array in which a minimum volume (known as a unit-cell) is repeated in three-dimensional space and within a unit-cell molecules are arranged following well-defined symmetry rules. Growth of a single and well defined diffracting crystal forms the basic and essential prerequisite for X-ray crystallographic protein structure determination (Blow 2002). Producing high quality crystals has always been the bottleneck in structure determination and it is still not understood why some proteins crystallize with ease while others stubbornly refuse to produce suitable crystals (Chayen 2004). Crystallization requires that a protein is first be purified to homogeneity and then concentrated to a supersaturated state.

35 Crystal growth has three steps: nucleation, growth and cessation of growth. However, identifying a successful crystallization condition for a new protein is like looking for a needle in a haystack (McPherson 2009). Also, truncation of a protein at the flexible N or C terminus may lead to proper crystallization and diffraction. In addition, tagging a partially soluble or partially homogeneous protein with maltose binding protein (MBP) (Gruswitz, Frishman et al. 2005) or other tags (Cherezov, Rosenbaum et al. 2007) at the N or C terminus has given a positive effect in crystal formation (nucleation site for crystal formation) as well as in enhancing solubility. Different techniques are employed for setting up crystallization trials, which include sitting drop vapor diffusion, hanging drop vapor diffusion, sandwich drop, batch, micro batch under oil, micro dialysis and free interface diffusion. While vapor diffusion has been very popular and successful for the past 40 years, microbatch, which is the simplest method, is a relatively new technique. Sitting and hanging drop methods are easy to manipulate, require a small amount of sample, and allow large amount of flexibility during screening and optimization. Once a lead is obtained as to which conditions may be suitable for crystal growth, the conditions can generally be fine-tuned by making variations to the parameters including precipitant, pH, salt, temperature, etc. Cryo-preservation of good crystals is also a defining factor in protein crystallography. This is helpful in protecting a protein crystal from the naturally damaging effects of X-rays during data collection. Usually, glycerol, sucrose, propanediol, glycol, etc. are used for cryo-preservation. No additional cryo protection agent is needed if a crystal is already grown in some natural cryo buffers or glutaraldehyde.

Objectives

1. To identify specific drug targets towards the substrate binding domain (SBD) of Siah2 using in silica high throughput virtual screening (HTVS) 2. To validate the potential drug targets using in vitro and in vivo binding assays, and 3. To identify the critical amino acid residues involved in the binding/interaction between Siah2 and the target molecules.

36 CHAPTER 2. MATERIALS AND METHODS

2.1 Homology modeling Homology modeling of the SBD of Siah2 was built using Modeller 9v7 software (Sali and Blundell 1993). The amino acid sequence of human Siah2 was retrieved from GenBank (accession number: NM_005067) in NCBI (McGinnis and Madden 2004). It consists of 324 amino acids of which, residues from 130-322 belongs to SBD. The SBD was then subjected to PSI- BLAST search (Altschul, Madden et al. 1997) in order to identify the homologous proteins from the Protein Data bank (PDB) (Berman, Henrick et al. 2003). An appropriate template for SBD was identified based on the e- value and sequence identity. The template and the target sequences was then aligned using ClustalW (Thompson, Gibson et al. 2002). Subsequently, homology modeling for SBD of Siah2 against the chosen template was carried out using Modeller 9v7. The outcome of the modeled structures were ranked on the basis of an internal scoring function and those with the least internal scores were identified and utilized for model validation.

2.2 Validation of the modeled structure In order to assess the reliability of the modeled structure of SBD of Siah2, we calculated the root mean square deviation (RMSD) by superimposing with template structure using 3-dimensional structural superposition (3D-SS) tool (Sumathi, Ananthalakshmi et al. 2006). The backbone conformation of the modeled structure was calculated by analyzing the phi (Φ) and psi (ψ) torsion angles using PROCHECK (Laskowski, MacArthur et al. 1993) determined by the Ramachandran plot statistics. Finally, the quality of consistency between the template and the modeled Siah2 were evaluated using ProSA (Wiederstein and Sippl 2007), in which the energy criteria for the modeled structure was compared with the potential mean force obtained from a large set of known protein structures.

37 2.3 Protein simulation The stability of the modeled Siah2 was checked by molecular dynamics (MD) simulation using Gromacs software (Hess, Kutzner et al. 2008). The modeled SBD was energy minimized using optimized potentials for liquid simulations all-atom (OPLS) force field (McDonald and Jorgensen 1998). This preliminary energy minimization was performed to discard the high energy intramolecular interactions. The overall geometry and atomic charges were optimized to avoid steric clashes. The modeled SBD was solvated in a rectangular box of about 16,614 water molecules using TIP3P model system in order to mimic the physiological behavior of the biomolecules (Jorgensen, Chandrasekhar et al. 1983). The energy of the modeled Siah2 was then minimized without restraints for 2000 steps using the above mentioned algorithms. The system was then gradually heated from 10 to 300 Kelvin over 5 nano seconds (ns) using the NVT ensemble. Finally MD simulation was carried out to examine the quality of the model structures by checking their stability and by performing a 5 ns simulation and the lowest energy structure during simulation was calculated. Similar protocol was followed for the template also in order to compare the stability and reliability between the modeled and experimentally solved proteins.

2.4 Docking SBD of Siah2 was modeled computationally and its active sites were predicted using SiteMap (version 2.3, Schrödinger, LLC, New York, NY, 2009) followed by docking studies. The preparation of the modeled Siah2 was performed by protein preparation wizard and menadione was prepared using the ‘LigPrep’ program and the docking procedure was performed using Glide extra precision (XP version 5.5), which produces the least number of inaccurate poses and calculates the accurate binding energy of 3D structures of a known protein with ligands (Kontoyianni, McClellan et al. 2003; Friesner, Banks et al. 2004). Docking studies were carried out using menadione as an inhibitor against the modeled SBD of Siah2. The Siah2-menadione ligand

38 complex was used as a reference point in order to identify better drug-like molecules that could interact against SBD of Siah2.

2.5 Virtual screening A total of 16,908 molecules derived from public databases namely Maybridge (14,400, www.maybridge.com), PubChem (Wang, Xiao et al. 2009) (2438, obtained from Shanghai Institute of Organic Chemistry) and Binding (70, www.bindingdb.org), were selected for virtual screening against SBD. The in silico structure-based high-throughput virtual screening (HTVS) method of Glide, version 5.5 (Schrödinger, LLC, New York, 2009) (Friesner, Banks et al. 2004) was used for screening. Before performing HTVS, the ligand molecules collected from the databases were prepared using the ‘LigPrep’ module and were subsequently subjected to Glide ‘Ligand docking’ protocol with HTVS mode. A grid file was generated using the Receptor Grid Generation protocol with centroid at the predicted binding site of the modeled structure. A scaling factor of 1.0 was set to van der Waals (VDW) radii for the atoms of residues that presumably interact with ligands and the partial atomic charge was set to less than 0.25. Ligands were then allowed to dock with the high throughput screening (HTVS) mode and all the obtained molecules were subjected to the Glide extra precision (XP) mode of docking, which performs extensive sampling and provides reasonable binding poses(Friesner, Banks et al. 2004). The lead molecules from screening were then tested in silico for their pharmacokinetic properties and percent human oral absorption using QikProp programme (Jorgensen 2006).

2.6 Cell lines and cell culture Human embryonic kidney 293T (HEK 293T) were grown at 37°C and

5% CO2 in Dulbecco's modified Eagle's medium (Invitrogen) supplemented with 10% fetal bovine serum (FBS) (HyClone), L-glutamine (Invitrogen) and pencillin/streptomycin (Invitrogen). T47D, MCF 7 breast cancer cell lines, were kindly provided by Dr. Yong Eu Leong (NUS) and maintained in Minimum Essential Medium Eagle

39 (MEM) (Sigma Aldrich) with 1% L-glutamine, 1% penicillin/streptomycin and 10% FBS. MDA-MB-231, BT 549, MCF 10A breast cancer cell lines, were kindly provided by Dr. Alan PremKumar (NUS) and maintained in Roswell Park Memorial Institute 1640 medium (Sigma Aldrich) supplemented with 1% L-glutamine, 1% penicillin/streptomycin and 10% FBS. These cell lines were obtained ethically from ATCC (Singapore). MCF 10a was cultured in Mammary Epithelial Growth Medium (MEGM, Lonza) supplemented with growth factors, 10% FBS, 1% L-glutamine and 1% penicillin/streptomycin. All cell lines were maintained in a humidified incubator at 37oC with 5% CO2 and sub-cultured according to suppliers recommendations.

2.7 Molecular cloning 2.7.1 Human plasmid constructs The pcDNA3.1 FLAG-SBD of human Siah2(130-392) and Full- length(1-394) were constructed by PCR amplification of Siah2 cDNA fragments separately from the pCMV-SPORT6 plasmid (Thermo Scientific OpenBiosystems) with a HindIII-containing forward primer and an XbaI- containing reverse primer. The HindIII-Siah2-XbaI fragments was then subcloned into the FLAG-pcDNA3.1 plasmid between the HindIII and XbaI restriction sites. Similarly FLAG-SBD of Siah1 (90-292) and full-length (1- 292) were constructed as described for Siah2 using the same restriction sites. HA-PHD3, HA-Nrf2 and HA-βcatenin plasmids were provided by our collaborator Dr. Thilo Hagen.

2.7.2 Bacterial expression plasmid constructs As described above, the gene fragment encoding SBD residues (130- 322) of Siah2, was PCR amplified using the Pfu DNA polymerase (Fermentas) with an EcoRI containing forward primer and NotI containing reverse primer (1st Base). The amplified gene was ligated into the modified pET-Duet vector, next to a pre-inserted bacterial maltose binding protein (MBP), between the EcoRI and NotI restriction sites. The second molecular cloning site of the vector contained a preinserted bacterial disulfide bond isomerase (DSBC) gene

40 (Pioszak and Xu 2008). This modified vector was a generous gift from one of our collaborator Dr. Eric Xu of Van Andel Institute, USA. Full length PHD3 was PCR amplified using the pcDNA 3.1 PHD3 plasmid as a template and then cloned into the modifed pET-Duet vector as described for Siah2. SBD of Siah2 was also cloned in pGEX6P-1 vector with an N-terminal GST tag using a BamHI containing forward primer and a XhoI containing reverse primer. The list of constructs and primers are given in Table 7 (Appendix 1).

2.8 Transient transfection 2.8.1 Plasmid transfection DNA plasmids were transiently co-transfected in subconfluent HEK 293T cells platted in a 60mm plate with GeneJuice Transfection Reagent (Novagen). Empty pcDNA3.1 vector was also co-transfected as a control. To this end, GeneJuice transfection reagent was first added to serum free DMEM media and incubated for 5 min. 1µg of each DNA plasmid was then added to the mixture and further incubated for 15 min. The mixture was then added dropwise to the cells and returned to the 5% CO2. Cells were lysed 48 hours after transfection.

2.8.2 Small-interfering (siRNA) transfection HEK 293T cells were plated at 2 X 105 cells per 6-well plate for siRNA transfections. RNAi Max Lipofectamine (Invitrogen) was used as a transfection agent according to the manufacturer’s instructions. The following predesigned siRNA duplexes (IDT) at final concentration of 20nM were used: HSC.RNAI.N005067.12.3, 3'UTR/2 (Siah2 siRNA oligo #1) and HSC.RNAI.N005067.12.7, CDS/2 (Siah2 siRNA oligo #2). Lipofectamine RNAiMax and respective siRNAs were separately diluted in serum free DMEM media for 5 min before mixing together for another 20min of incubation at room temperature. The mixture was then added dropwise to the cells and returned to the 5% CO2. Cells were lysed three days after siRNA transfection. For MCF 7 cells reverse transfection was carried out. siRNA- Lipofectamine RNAiMAX complexes as described above, were prepared in

41 opti-MEM reduced serum free media (Invitrogen). This mixture was first added to the well and then 1.5 X 105 and 3 X 103 cells per 6-well and 96-well were seeded. Plates were incubated for 72 hours at 37°C in a CO2 incubator.

2.9 Protein isolation and concentration determination At the time of harvest, cells were washed with ice-cold 1X PBS and then lysed in 1X MPHER mammalian protein extraction reagent (Thermo Scientific). Cell lysates were stored at -80 °C freezer. Protein concentration determination of cell lysates was carried out using BSA standard curve assay in a 96-well format. Frozen cells were thawed and spun down at 4 °C for 10 min at 13000 rpm in a microfuge. 1 µl of supernatant was added to 100 µl of solution containing 1X Bradford reagent and the absorbance at 595nm was read using a Tecan GENios microplate reader (Tecan Trading, Austria).

2.10 SDS-PAGE A 12% resolving gel and 4% stacking gel were casted using 30% acrylamide-Bis solution 37.5:1 (2.6% C). Protein lysate was mixed with 1X loading dye or Laemmli sample buffer and heated at 95 °C for 5 min. Protein samples were spun at 12,000 rpm for 1 min and loaded into wells of SDS- PAGE gel. Gel containing protein lysate samples were electrophorezed using 1X running buffer (3.03g Tris base, 14.4g glycine, 10ml of 10% SDS/L) at 80V for fifteen minutes, followed by 95V for two hours until the dye front reached the bottom of the gel. After electrophoresis, the gel was removed and proteins were transferred onto a nitrocellulose membrane (AmershamHybond ECL, GE Healthcare) in 1X cold transfer buffer (3.1 g Tris base, 14.4 g glycine, 20% methanol/L) for overnight at 25 V.

2.11 Immunoblotting and detection The nitrocellulose membrane with proteins was blocked with 5% w/v non-fat milk in 1X Tris-buffered saline [0.1% (v/v], Tween-20 (TBS-T) for 1 hour at room temperature. The membrane was incubated with diluted primary antibody in TBST for 1 hour at room temperature or overnight at 4 °C. Unbound primary antibody was removed by washing thrice with 1X TBS-T

42 for 5 min. Following this, the membrane was incubated with IgG-HRP- conjugated secondary antibodies at 1:10,000 dilution for 1 hour at room temperature. The following antibodies were used to probe the membranes: mouse monoclonal anti-FLAG (Sigma) at 1:5000 dilution, rat or mouse monoclonal anti-HA (clone 3F10, Roche) at 1:5000 dilution, mouse monoclonal anti-β-actin (Sigma) at 1:10,000 dilution, anti-p27 at 1:2000 dilution. The following primary antibodies required overnight incubation at 4°C: goat polyclonal anti-Siah2 (Santa Cruz Biotechnology) at 1:500 dilution, mouse monoclonal anti-Siah2 (Sigma) at 1:500 dilution. The bound antibodies were detected using the detection reagent (Amersham ECL Western Blotting System, GE Healthcare. The blot was exposed to X-ray film (Thermo Scientific) for different exposure times.

2.12 Co-immunoprecipitaction For co-immunoprecipitation, cells were washed with cold PBS and lysed 2 days post-transfection with lysis buffer containing 25mM Tris-

HCL(pH 7.5), 3mM EDTA, 2.5mM EGTA, 20mM NaF, 1mM Na3VO4, 20mM sodium β-glycerophosphate, 10mM sodium pyrophosphate, 0.5% Triton X- 100, 0.1% β-mercaptoethanol and Roche protease inhibitor cocktail. lysates from transfected cells grown in 60mm plates were precleared by centrifugation and were added to beads. Anti-FLAG or anti-HA M2 agarose beads coupled to anti-FLAG or anti-HA monoclonal antibody was used to immunoprecipitate the FLAG-Siah2 or HA-PHD3. Samples were tumbled at 4 °C for 1 hour and washed four times with NP40 lysis buffer containing 20mM Tris, pH7.5, 50mM NaCl, 0.5mM EDTA, 5% glycerol, 0.5% NP40; and once with buffer containing 50 mM Tris, pH 7.5.

2.13 GST pull down For GST pull down assay, GST-fused Siah2 SBD was allowed to bind to glutathione sepharose beads (GE Healthcare) for 30 min at 4 °C in binding buffer containing 50 mM Tris-HCl buffer, 150 mM NaCl, 1 mM DTT, 5% glycerol, 0.1% Triton X-100, pH 8.0. The PHD3 lysate from HEK293T cells were allowed to incubate with GST-Siah2 fusion proteins, as indicated,

43 immobilized on glutathione sepharose beads for 1 hour at 4 °C. Controls included GST alone, tested for binding to PHD3. After binding, the resin was washed three times in binding buffer, and then heated in Laemmli sample buffer for 5min at 95 °C. Samples were separately resolved by 12% PAGE and Western blotted using anti-HA antibody.

2.14 RNA isolation RNA extraction was carried out using the TRIzol reagent ( Invitrogen). Cells were scrapped in 1ml of TRIzol and transferred to 1.7 ml eppendoff tubes. The homogenates were then stored for 5 min at room temperature to permit complete dissociation of nucleoprotein complexes. They were then supplemented with 0.6ml chloroform, vortexed and microcentrifuged at 13,000 rpm for 15 min. Following centrifugation, the mixture separated into middle red phenol-chloroform phase, interphase and organic bottom phase respectively. The top colourless aqueous layer was then transferred into a fresh tube and equal volume of isopropanal was slowly added to precipitate RNA and incubated on ice for 15 min. The samples were then spun down for 10 min and the supernatant was discarded. The pellet was washed with 1ml of cold 75% ethanol. The samples were then centrifuged at 13000 rpm for another 10 min and the supernatant was removed. The RNA pellet was air-dried and dissolved in ultra-pure RNase free water and its concentration was measured using NanoDrop ND-1000 spectrophotometer (NanoDrop technologies, Wilmington, DE).

2.15 Real-time polymerase chain reaction using SYBR green Real time quantitative RT-PCR was carried out using the one-step RT- PCR kit with SYBR GREEN (Bio-Rad). PCR was carried out in triplicates for each sample being validated. The β-actin gene was used as a control. Primers (1st Base) used for Siah2 are as follows and can be found in (Frasor, Danes et al. 2005) Forward primer: 5-CTATGGAGAAGGTGGCCTCG-3 Reverse primer: 5-CGTATGGTGCAGGGTCAGG-3

44

2.16 Domain swapping using fusion PCR and mutants A three step fusion PCR procedure was employed to create the fusion proteins (Hobert 2002), SBD-Siah1-2 and SBD-Siah2-1 from the wild type pcDNA Siah1 and Siah2 constructs described previously. In this procedure for generating SBD-Siah1-2, the first round of PCR involved amplifying the first 100 bp of SBD (1-193 residues) using pcDNA forward primer and Siah1-2 reverse primer. In the second round of PCR involved amplifying the C terminal region of SBD (the remaining 93 bp) using Siah1-2 forward primer and pcDNA reverse primer. The final round of PCR was performed using both the PCR products that were amplified in the previous rounds as templates and with forward and reverse primer of pcDNA. The final PCR product was gel purified and digested with XbaI and HindIII restriction enzymes (New England Biolabs). Similarly SBD-Siah2-1 was also prepared as mentioned in Table 2. Predigested products were individually ligated with pcDNA, which is also digested with XbaI and HindIII restriction enzymes and transformed into E. coli DH5α cells. Positive colonies were screened and verified by DNA sequencing. SBD-Siah1-2 mutant (Mut) and SBD-Siah2-1 mutant (Mut) were custom synthesized (Shanghai ShineGene Molecular Biotech,Inc).

45

Table 2. List of primers and corresponding template used for fusion PCR

Gene Forward Primers Reverse Primers Template

Name Sequence Name Sequence

Siah1-2 PCR1 pcDNA F 5AGCAGAGCTCT Siah1-2 R 3TCCAGCACCAGC pcDNA-Siah1 CTGG3 ATGAAGTGAAAGC CAA5 PCR2 Siah1-2 F 5TTGGCTTTCACT pcDNA R 3AGGCACAGTCGA pcDNA-Siah2 46 TCATGCTGGTGCT GGC5 GGA 3 PCR3 pcDNA F 5AGCAGAGCTCT pcDNA R 3AGGCACAGTCGA Gel purified CTGG 3 GGC5 product of PCR1 and PCR2 Siah2-1 PCR1 pcDNA F 5AGCAGAGCTCT Siah2-1 R 3TCTAAGACTAAC pcDNA-Siah2 CTGG 3 ATGAAGTGATGGC CAA5 PCR2 Siah2-1 F 5TTGGCCATCAC pcDNA R 3AGGCACAGTCGA pcDNA-Siah1 TTCATGTTAGTCT GGC5 TA GA 3 PCR3 pcDNA F 5AGCAGAGCTCT pcDNA R 3AGGCACAGTCGA Gel purified CTGG 3 GGC5 product of PCR1 and PCR2

2.17 Drug treatments and preparation The Five compounds, Compound 1 (ACR#HTS09269SC), Compound 2 (ACR#SCR01006SC), Compound 3 (ACR#SCR00073SC), Compound 4 (ACR#DSHS00977SC), and Compound 5 (ACR#DP01471SC) were purchased from Fisher Scientific Pte Ltd, Singapore. Compounds were dissolved in DMSO to prepare 200mM Stock. For all compounds, final working concentrations were attained by freshly diluting stock solutions with complete medium before use.

2.18 MTS assay The [3-(4,5-dimethylthiazol-2-yl)-5-(3-carboxymethoxyphenyl)-2-(4- sulfophenyl)-2H-tetrazolium, inner salt] (MTS) assay is a colorimetric method for determining the number of viable cells in proliferation or cytotoxicity assays. Growth inhibition was measured using the CellTiter 96 Aqueous One Solution Assay (Promega, Madison, WI) as recommended by the manufacturer. The solution contains MTS compound and an electron coupling reagent PES [phenazine ethosulfate]. The MTS compound is bio-reduced by mitochondrial enzymes of cells into a colored formazan product that is soluble in tissue culture medium. The quantity of formazan product as measured by the amount of 490 nm absorbance is directly proportional to the number of living cells in culture. Phenol red free culture medium was used. The experiment was performed in triplicates. After 4 hours of incubation under standard conditions of 5% CO2 and 37°C, the colour change due to the reduction of formazan product (indicative of reduction of MTS) was visible. MCF7 cancer cells (4x103 per well) were seeded on 96-well plates in triplicate at each dose level. The cells were grown for 24 h prior to treatment. Compounds were diluted in DMSO at 200 mM before being further diluted in culture medium prior to each experiment. The compounds were added on day 1 at 0.06 µM, 0.032 µM, 1.6 µM, 8 µM, 40 µM, 200 µM and 1000 µM and kept in the media for 72 hours. Absorbance was read in a Tecan GENios micro plate reader (Tecan trading, Austria). The signal generated (color intensity) is directly proportional to the number of viable (metabolically active) cells in the wells. Relative cell numbers can therefore be determined based on the optical

47 absorbance (optical density, OD) of the sample. The blank values (media only) were subtracted from each well of the treated cells and controls.

2.19 Protein expression and purification of MBP-SBD-Siah2 and MBP-PHD3 The pET-Duet construct carrying MBP-SBD-Siah2-6xHis gene at the first multiple cloning site (MCS) and DSBC at the second MCS was transformed in the E. coli BL21 (DE3) (Stratagene) for protein expression. A single colony was picked and inoculated into 100 ml of LB medium supplemented with 100 μg/ml ampicillin (Gold Biotech) and allowed to grow overnight at 37 °C with continuous shaking. Subsequently, it was transferred into 1 L LB medium supplemented with 100 μg/ml ampicillin and allowed to grow at 37 °C until the OD600 was 0.6-0.8 AU. Subsequently, to induce protein expression, 0.1 mM isopropyl-1-thio-β-D (-) thiogalactopyranoside (IPTG) (Gold Biotech) was added to the culture and grown at 16°C for 16 hrs with continuous shaking. Cells from 1 L of culture were harvested by centrifugation at 7,000 g for 30 minutes and the pellet was stored at -80 °C. Cell pellet obtained was resuspended in 50 ml of lysis buffer containg 50 mM Tris-HCl buffer, 150 mM NaCl, 2 mM DTT, 5% Glycerol, 0.1% Triton X-100 with the final pH adjusted to 8.0. and 1 ml of Sigma protease inhibitor cocktail. Cell suspension was sonicated for 9 min with 1 sec ON and 1 sec OFF pulse. Cell lysate obtained was centrifuged at 39,000 g for 30 min to sediment cell debris and insoluble proteins; clear supernatant was mixed with 5 ml of Ni-NTA (Qiagen) resin pre-equilibrated with lysis buffer and allowed to bind for 1 hr. Weakly bound proteins were removed by extensive washing using lysis buffer with 10 mM imidazole and later eluted with 250 mM imidazole in lysis buffer. To remove the excess imidazole and further purification, amylose affinity purification with 3 mL of 50% slurry of amylose resin (New England Biolabs) was allowed to bind for 1 hour with Ni-NTA purified MBP-SBD-6xHis with gentle shaking using MBP affinity. The wash buffer composition for the amylose column is 50 mM Tris-HCl (pH 8.0), 150 mM NaCl, and 5% glycerol while 10 mM maltose was added for elution. After affinity chromatography, the eluted protein was loaded to a Superdex 200 column (GE Healthcare)

48 equilibrated with 20 mM Tris, 100 mM NaCl, pH 8.0. The purified protein was then concentrated using ultra 30 kDa Mol. Wt. cut off Centricon (Merck Millipore). Expression of MBP-PHD3 followed almost the same above conditions, but during purification in addition to affinity chromatography, ion exchange chromatography was carried out prior to gel filtration. In short, the construct was transformed into E. coli BL21 (DE3) competent cells and the protein was over expressed and first affinity purified using Amylose resin. The protein was further purified using an anion exchange Hi-Trap Q-HP (GE Healthcare) column, with buffer A: 50 mM Tris and buffer B: 50 mM Tris, 300mM NaCl pH adjusted to 8.0. MBP-PHD3 was again purified using a Superdex 200 column (GE Healthcare) equilibrated with 20 mM Tris, 100 mM NaCl pH 8.0. The purified MBP-PHD3 protein was concentrated using ultra filtration membrane (Merck Millipore) with 30 kDa molecular weight cut off. All the protein purification steps were carried out at 4°C.

2.20 Statistical analysis Siah2 gene expression levels by real time PCR data were analyzed by one-way ANOVA, Dunnett's multiple comparison test using the Graphpad Prism software. For comparisons, probability values <5% (P < 0.05) were considered statistically significant.

2.21 Isothermal titration calorimetry (ITC) The binding affinities and thermodynamic parameters between SBD of Siah2 and PHD3 or compounds were studied using isothermal titration calorimetry (ITC). The interaction of Siah2 SBD with PHD3 was verified using a VP-ITC microcalorimeter (MicroCal) at 25°C in 50 mM Tris buffer, pH 8. The cell contained 10 μM SBD. The syringe contained 150 μM of PHD3 and titrated against SBD of Siah2, which was prepared in the same buffer. Samples were first degassed for 15 min and titration was performed using a stirring speed of 390 rpm. The initial injection for ligand was 2 μl for 10 s and subsequent injections were 10 μl for 10 s with 240 s spacing. Data points were fitted using the integrated program ORIGIN (MicroCal) for one-site binding.

49 2.22 Crystallization Crystallization trials were carried for the purified MBP-SBD protein at room temperature by hanging-drop vapor-diffusion method using crystallization screens from Hampton Research and Jena Bioscience screens.

50 CHAPTER 3. IDENTIFICATION OF DRUG LIKE MOLECULES AGAINST SBD OF SIAH2 BY AN IN SILICO APPROACH

3.1 Introduction Given its involvement in various cancer and its related signaling pathways (discussed in Section 1.8), Siah2 has been suggested as a target for developing inhibitors for blocking its activity or abundance (Nakayama, Qi et al. 2009). Siah2 regulates HIF-1α by hypoxia response and SPRY2 by RAS signaling leading to tumor growth and metastasis. The SBD is of importance in the induction of HIF-1α dependent pathway. However, to date no structural or drug targeting information against SBD of human Siah2 is available. Homology modeling provides a way of constructing the structure of a protein with a "target" from its amino acid sequence and an experimental 3D structure of a related homologous protein (the "template"). Herein, we applied homology modeling approach to construct the structure of SBD of Siah2 The inhibition of the Siah2 expression could effectively inhibit angiogenesis in tumors. In this regard, menadione (vitamin K3) has been identified as an inhibitor for Siah2 which effectively attenuates hypoxia and the MAPK signaling pathway and inhibits melanoma growth in the mouse xenograft model (Shah, Stebbins et al. 2009). Targeting a non-catalytic substrate has been deemed among the more difficult tasks, although more advanced structure-based design holds a better promise. In this study, we present structural information of SBD for Siah2 using a homology modeling approach followed by a molecular dynamics simulation in order to analyze the stability of this domain. In addition, we predicted the binding site of the SBD in order to eventually identify drug-like molecules which possessed better binding energies and pharmacokinetic properties for this SBD using in silico high throughput virtual screening with menadione as a reference ligand, thereby investigating the structural features of SBD of Siah2, which may be a novel drug target for inhibitor development studies for cancers during hypoxic conditions. The potential specific drug-like molecules obtained

51 from such a screening procedure could serve as inhibitors against SBD of Siah2.

3.2 Results 3.2.1 Generation of SBD of Siah2 by homology modeling The main criteria in homology modeling are the template selection and sequence alignment between the target and the template. The PDB ID’s for the three template hits that were found for the SBD of Siah2 after performing PSI- Blast were 1K2F, 2AN6 and 2A25. The 3D-structure of 2AN6 has low resolution of 3 Å and there are 13 missing residues in its X-ray structure. Although 2A25 seems to be a good template with a resolution of 2.2Å, its 15 missing residue regions appears as a string in the modeled protein. Therefore, 1K2F was identified as a promising template having a sequence similarity of 86% to the SBD with a resolution of 2.4Å and contained no missing residues. Hence, the structure 1K2F was selected as the appropriate template for SBD modeling. During modeling the original amino acids positions of SBD i.e. 130 – 322 were assigned as 1- 193 residues respectively by Modeller 9v7. Hence the later positions are assigned for further data analysis. Twenty five models were generated and ranked during modeling of SBD against template 1K2F. The best ranked model (Fig. 11) was then subjected to evaluation to check the quality of the generated model.

52

Figure 11. Modeled structure for SBD of Siah2 obtained from Modeller9v7. The structure is represented in secondary structure mode using pymol; the α-helixes, β-sheets and loops are colored in red, yellow and green respectively.

3.2.2 Evaluation of the modeled structure The modeled SBD of Siah2 was taken for further optimization and validation. The calculated root mean square deviations between the target and template structure was found to be 0.324 Å (Fig. 12a). Furthermore, the quality of the structure derived from homology modeling was validated by calculating the ProSA. z-score which gives the overall model quality by Cα positions. The z-score of the target and template were -4.7 and -5.2. Hence the quality of the model obtained was validated using ProSA score. Both the target and the template models show the similar profiles given in the Fig. 12b and 12c. The geometric evaluations of the modeled 3D-structure of SBD were performed using PROCHECK by calculating the Ramachandran plot (Fig. 12d) (Table 3). This plot represents the distribution of the Φ and ψ angles in each residual of amino acid. The percentage of phi and psi angles in the allowed regions is 90% for residues in the core region whereas 0.9% of residues were located in the disallowed regions.

53

Figure 12. Model validation. (a) Superimposition of modeled Siah2 (target) and 1K2F (template) using the 3d-SS tool. In this wireframe diagram, yellow color represents the target and green represents the template. (b) z-score of - 4.7 from this graph represents the overall quality of the modeled 3D-structure to Siah2. It can be used to check whether the z-score of the input structure is within the range of scores typically found for native proteins of similar size. Its value is displayed in this plot that contains the z-scores of all experimentally determined protein chains in current PDB solved by either X- ray diffraction or NMR. (c) z-score of -5.1 from this graph represents the overall quality of the template 3D-structure (PDB ID 1K2F). When compared to Fig. 2b, it was found that the overall quality of the 3D-structures of the template and target is highly similar in terms of their corresponding z-scores. (d) Ramachandran Plot for the modeled SBD of Siah2. The most favored regions are colored red, additional allowed, generously allowed and disallowed regions are indicated as yellow, light yellow and white fields respectively.

54 Table 3. Ramachandran Plot statistics on 3D model of SBD of Siah2 calculated using PROCHECK

Ramachandran plot statistics for the modeled protein core% 90.3 allowed% 0.8 general% 0.6 Disallowed% 0.9 Bad contacts 12 G factor -0.02 Main chain bond lengths -0.17 Main chain bond angles -0.23

3.2.3 Molecular dynamics simulation of stability The stability of the modeled SBD of Siah2 was assessed by MD simulation for 5ns and was used for subsequent docking studies. In addition, parameters such as pressure, temperature and total energy were calculated to check the stability of the modeled protein with improved steric parameters. The analysis of the simulation parameters revealed that the total energy for the modeled SBD is -6.74E+5 kJ mol-1 and for the template 1K2F is -7.22E+5 kJ mol-1 (Fig. 13a, 13b). This shows that the target and template does not show much variation in terms of their total energy and hence the modeled structure was considered to be stable as of the experimentally solved structure of the template (1K2F). A pressure bar of 1.0 and temperature of 300K applied to both the proteins did not exhibit any significant difference until 5ns. Based on these simulation parameters it is evident that the modeled SBD adopted good conformational stability that could used for further SBDD.

55

Figure 13. Total energy calculated by MD Simulation. The total energies of the modeled SBD of Siah2 (a) and the template 1K2F (b) were compared using gromacs. This comparison shows that the overall energy of the target and template are very similar, so the modeled SBD was considered to be stable as the experimentally solved structure of the template (1K2F).

3.2.4 Docking and HTVS screening of compounds The validated and refined model was then used for docking using a known functional inhibitor menadione and the complexes were analyzed for the putative functional amino acid residues that are involved in hydrogen bonding. Prediction of the binding site of SBD was performed using the Sitemap in Schrodinger, which yields 13 amino acid residues in the binding pocket of SBD for Siah2 (Fig. 14). The predicted binding pocket was validated by both blind docking (to examine whether the ligand binds the protein away from the predicted binding site) and normal docking (allowing the ligand to bind to the predicted binding site) using menadione as an inhibitor against Siah2 using Glide (XP). It was observed that 8 of 13 predicted binding site residue atoms are within 5 Å of the ligand in both docking procedures.

56

Figure 14. Diagrammatic representation of the 13 predicted active site residues of SBD. The amino acids in SBD were numbered from 1-193 during modeling, whereas the corresponding amino acid residue position for the full length protein is also shown which contains the amino acid residues 1-324. Yellow color represents the amino acid which is consistent in hydrogen bonding formation for all the 5 identified lead molecules, whereas orange color represents the other amino acids involved in hydrogen bonding.

In addition we observed that the amino acid Ser39 (corresponding to Ser167 for a full length protein) is involved in hydrogen bond formation with menadione during both the docking procedures. This particular binding site was utilized for HTVS of compounds from a Maybridge database as one of the selection criteria. We next aimed to identify and characterize ligands interacting with predicted binding sites of SBD of Siah2. The in silico structure-based high-throughput virtual screening (HTVS) method of Glide, version 5.5 (Schrödinger, LLC, New York, 2009) (Friesner, Banks et al. 2004), was used to identify potential ligand molecules that interacted with at least one of the residues identified. The binding of ligands to these residues is postulated to render competitive inhibition with substrates of Siah2 that interact through SBD.

3.2.5 Selection of inhibitors Out of 16,908 molecules from the three public ligand databases, Hits were obtained only for Maybridge database. As a result 1000 potential ligand hits were identified based on their pharmacokinetic properties (ADMET). Subsequently 156 drug-like molecules were identified based on their glide scores better than menadione and compared with its binding conformation; eventually 20 lead molecules were selected. Five of 20 drug-like molecules were identified based on their hydrogen bond interaction with the putative

57 functional amino acid residue Ser39 with a binding energy ranged from ~ -6 kcal mol-1 to ~ -8 kcal mol-1 against Siah2 and their chemical structures are presented along with their IUPAC names (Fig. 15), whereas the binding energy for menadione is ~ -5 kcal mol-1.

Figure 15. Chemical structure of the five lead molecules. Compound 1(7599): 2',3',4',9'-tetrahydrospiro[piperidine-4,1'-pyrido[3,4-b]indole]; Compound 2 (724): (R)-4-methyl-N-(2-oxoazocan-3-yl)-3,4-dihydro-2H- benzo[b][1,4]oxazine-6-sulfonamide; Compound 3 (4035): 4-methyl-N-((5- methyl-3-phenylisoxazol-4-yl)methyl)-3,4-dihydro-2H-benzo[b][1,4]oxazine- 6-sulfonamide;Compound 4(10746): 2-(2-hydroxyphenyl) quinazoline-4(1H)- one; Compound 5(7757): 4-oxo-1,4-dihydroquinazoline-2-carbohydrazide. Corresponding Maybridge ID's are given in parentheses.

The binding conformations of these lead compounds towards the modeled Siah2 were also analyzed to determine the hydrogen bond interactions (Fig. 16). All protein-ligand complexes for these five lead molecules possess multiple hydrogen bonds when compared with menadione. The short hydrogen bond distance, ranging from ~1.5 to ~2.4 Å, and the favourable binding G-scores (-8 to -6 kcal/mol) suggested strong enzyme- ligand interactions. It is noteworthy that all five lead molecules interacted with residue Ser39 in common and their respective glide scores are comparable to menadione. The docking results of the final lead molecules against Siah2 are given in Table 4.

58

Figure 16. Binding poses of the five lead molecules and menadione to SBD of Siah2. Panels (a-e) represents the binding pose of lead molecules and (f) represents the binding pose of Menadione. The proposed binding mode of the lead molecules are shown in ball and stick display and non-carbon atoms are colored by atom types. Critical residues of binding are shown as tubes colored by atom types. Hydrogen bonds are shown as dotted yellow lines with the distance between donor and acceptor atoms indicated. Atom type color code: red for oxygen, blue for nitrogen, grey for carbon and yellow for sulfur atoms respectively. Ser39 amino acid residue that interacted with the lead molecules is indicated by circles.

59 Table 4. Glide extra-precision (XP) results for the five lead molecules and menadione using Schrodinger 9.0.

Lead Glide score #HB Amino acids interacted molecules a kcal/mol) b c 7599 -8.3618 SER39, ILE155 and HIS156 3 GLY44, VAL159, CYS40, SER39 and 724 -7.4965 6 HIS156 4035 -6.7967 SER39, CYS40, HIS156 and VAL159, 4 10746 -6.5975 SER39 and ILE155 3 7757 -6.5049 SER39, ILE155 and PRO7 3 Menadione -5.6476 SER39 1

a Ligand IDs are of the Maybridge database. b Glide score. c Number of hydrogen bonds formed.

3.2.6 Bioavailability of the lead molecules The drug-like properties of the lead compounds was assessed by evaluating their physicochemical properties using QikProp(Lipinski, Lombardo et al. 2001). Their molecular weights were < 500 daltons with < 5 hydrogen bond donors, < 10 hydrogen bond acceptors and a log p of < 5 (Table 5). These properties are well within the acceptable range of Lipinski’s rule of five for the five lead molecules. Further, the pharmacokinetic parameters of the lead molecules were analyzed which includes absorption, distribution, metabolism, excretion and toxicity (ADMET) using QikProp. For the five lead compounds, the partition coefficient (QPlogPo/w) and water solubility (QPlogS) are critical for estimation of absorption and distribution of drugs within the body, ranged between ~ -0.9 to ~ 2.23 and ~ -0.05 to ~ -3.65, while the bioavailability and toxicity were ~ -0.1 to ~ 0.3. Overall, the percentage human oral absorption for the compounds ranged from ~ 57 to ~ 72%. These pharmacokinetic parameters are well within the acceptable range defined for human use, thereby indicating their potential as drug- like molecules.

60

Table 5. QikProp properties of the identified drug-like molecules and menadione using Schrodinger 9.0.

Percent human Lead QP log QP log S c MW d QPlogHERG e # Metab f HA g HD h oral molecules a Po/w b absorption i 7599 1.771 -1.251 241.33 -5.708 1 5 3 72.335

724 0.845 -2.075 339.41 -6.651 3 7 2 74.975

4035 3.008 -4.930 339.46 -5.765 4 7 1 77.055

10746 1.842 -3.058 238.24 -5.360 2 4 2 70.148

7757 -0.523 -1.966 204.188 -5.416 0 6 3 55.280

61 Menadione -0.967 -1.051 198.671 -6.351 4 6 4 57.45

a Ligand IDs are of the Maybridge database. b Predicted octanol/water partition co-efficient log p (acceptable range: -2.0 to 6.5). c Predicted aqueous solubility; S in mol/L (acceptable range: -6.5 to 0.5 ). d Molecular weight ( < 500 daltons). e Predicted IC50 value for blockage of HERG K+ channels (acceptable range: below -5.0). f Number of metabolic reactions (1 – 8). g Hydrogen bond acceptors (< 10). h Hydrogen bond donors (< 5). I Percentage of human oral absorption < 25% is poor and > 80% is high).

3.3 Discussions

This study provides the 3D structure of SBD of Siah2 by homology modeling approach. Energy minimization and molecular dynamics simulation were done to check the stability of the modeled structure. The analysis of the homology modeling and MD simulation indicate that the theoretical prediction is consistent with the known set of experimental results. This structure was used for further insilico HTVS and we have identified five drug-like molecules with better binding affinity, when compared to menadione. Compounds are screened against the SBD with the aim of inhibiting the activity of Siah2 on its substrates targeting via SBD. We mainly hypothesize a competitive binding of these compounds to SBD of Siah2 with its substrates. Also Further analysis of the binding conformations of the lead molecules revealed that Ser39 (ser 167 of full length) residue of SBD may be the key amino acid residue interacting with these ligands necessary for the induction of Siah2 activity. Yet another possibility of inhibition is also suspected. Siah2 is subjected to phosphorylation, which increases its ubiquitin targeted degradation activity under hypoxia conditions through MAPK p38 signaling cascade. Phosphopeptide mapping or mutational analyses revealed S29 and T24 as major sites of Siah2 phosphorylation which are present in the RING domain of Siah (Emerling, Platanias et al. 2005; Khurana, Nakayama et al. 2006). The study also highlight the possibility that Siah2 may be subjected to phosphorylation on other MAPK acceptor sites which are below the detection limits used in the experimental condition. We also hypothesize that Ser 39 (Ser 167 of full length), if found to be one of the phosphorylation site of Siah2, could play a role in activity of the compounds. So far the ubiquitin ligases have not been shown to exhibit the catalytic activity, but merely to function as scaffolds for ubiquitin transfer, suggesting that inhibitors will have to target protein-protein interaction. The identified drug like molecules against the SBD of Siah2 may block the interaction between Siah2 and its substrates by inhibiting their interaction. We observed that these small molecules possessed greater binding affinity, when compared to menadione by docking studies. As this study mainly aims to block protein- protein interaction with no catalytic activity, we hypothesize that these lead

62 drug-like small molecules could serve as inhibitors against Siah2. More studies are warranted to reinforce these findings. Since the functionally essential role of ser39 (Ser167) is not evident from experimental method, our further experimental validation of drug compounds are targeted to inhibit the interaction of SBD of Siah2 and its substrates.

63 CHAPTER 4. EXPERIMENTAL VALIDATION OF DRUG LIKE MOLECULES BY IN VITRO AND IN VIVO BINDING ASSAYS

4.1 Introduction Siah proteins, evolutionarily conserved E3 RING zinc finger ubiquitin ligases, have recently been implicated in various cancers and show promise as novel anticancer drug targets. There are various substrates of Siah that have recently been reviewed and discussed (House, Moller et al. 2009; Nakayama, Qi et al. 2009). Two substrates of Siah2 in hypoxia signaling and cancer have been reported, HIPk2 and PHD3. Siah2 has been shown to regulate the stability of HIPK2 (Calzado, de la Vega et al. 2009), a transcriptional repressor of the HIF-1α gene (Nardinocchi, Puca et al. 2009). HIPK2 directly phosphorylates Siah2 resulting in a disruption of the HIPK2/Siah2 interaction, thus stabilizing HIPK2 and promoting apoptosis. It was also reported that chronic hypoxia mainly in rat brain region decreases 2-oxoglutarate (also known as α-ketoglutarate dehydrogenase complex (OGDC or KGDHC) activity, and Siah2 may function in this process as it targets OGDC-E2 for degradation although factors signaling this are not well understood and this regulation in not reported in cancer (Habelhah, Laine et al. 2004). Among the various substrates of Siah2, PHD3 is well characterized substrate under hypoxia condition in cancer. Three PHD proteins have been identified: PHD1, 2, and 3(Epstein, Gleadle et al. 2001). The association of Siah2 with each of the 3 PHDs were examined and the results revealed that PHD3 exhibited the highest degree of binding (Nakayama, Frew et al. 2004). Such selective targeting was due to the lack of N-terminal extension in PHD3 which are present in PHD1 and PHD2 (Nakayama, Gazdoiu et al. 2007). The E3 ligase Siah2 preferentially targets PHD3 for degradation under hypoxic conditions and thus stabilizing HIF-1α which in turn promotes its transcriptional activity (Nakayama, Frew et al. 2004). Although, all three were found to hydroxylate HIF-1α in the in vitro experiments (Bruick and McKnight 2001), in vivo PHD2 plays a major role in normoxic HIF-1α regulation (Berra, Benizri et al. 2003). PHD3 is known to be induced under hypoxia and believed to regulate

64 HIF-1α stability specifically under conditions of low oxygen concentrations. Hypoxic condition and the role of PHD1 in regulating Hif -1α is not clear. PHD3 can homo- and hetero-multimerize with other PHDs. Consequently, PHD3 is found in high-molecularmass fractions that were enriched in hypoxia and subsequent degradation by Siah2 (Nakayama, Gazdoiu et al. 2007). Inhibiting Siah2 activity with a peptide inhibitor PHYL designed to outcompete association of Siah2-interacting proteins reduced metastasis through HIF-1α, which is implicated in tumorigenesis, and metastasis. This selective targeting of PHD3 is achieved through the SBD. PHYL peptide binds to SBD of Siah2 with high affinity and disrupts the PHD3 binding, thereby elevating PHD3 levels and reducing HIF-1 α (Qi, Nakayama et al. 2008).

4.2 Results 4.2.1 SBD of Siah2 alone can independently interact with PHD3 irrespective of the N terminal Ring Domain In light of the role of Siah2 in regulating PHD3 through SBD, we chose this substrate for our further studies. We used coimmunoprecipitation to analyze the interaction of Siah2 and PHD3. Two Siah2 plasmid constructs, full length and SBD were generated. For this, cells were cotransfected with N- terminally FLAG-tagged Siah2 full length or SBD construct and HA tagged PHD3 or empty vector. Anti-FLAG M2 agarose beads were then used to immunoprecipiate the Siah-PHD3 protein complex. The complex was analyzed using Western blots with HA antibody to directly compare the binding of PHD3 to full length and the SBD of Siah2. When comparing the ratio of PHD3 in the FLAG-immunopreciptates to that in the lysate, a marked enrichment of PHD3 protein was seen in the immunoprecipitates suggesting strong binding of Siah2 towards PHD3 (Fig. 17). Furthermore, it was found that the amounts of PHD3 that bound to full length and SBD of Siah2 were similar. Taken together, the comparable binding of PHD3 to full length and the SBD of Siah2 suggests that under in vivo conditions the substrate binding domain alone is sufficient to bind to PHD3. Our results are consistent with the data published by Nakayama et al., (Nakayama, Frew et al. 2004), who showed that PHD3 is regulated by E3 ubiquitin ligase Siah2 for proteasome-

65 dependent degradation. Hence for further experiments, we focused on the interaction between SBD of Siah2 and PHD3.

Figure 17. Coimmunoprecipitation of Siah2 with PHD3. HEK293T cells were transfected in 60-mm cell culture plates for 2 days with expression constructs for the proteins indicated at the top of panel. The cells were lysed, and the lysates were subjected to FLAG immunoprecipitation (IP), as described under Materials and Methods. Immunoprecipitates and aliquots of the cell lysates were analyzed by Western blotting with the indicated antibodies. Both the Full length and SBD binds to their substrate PHD to the same extent.

66 4.2.2 GST pull down for SBD of Siah2 with PHD3 To confirm the interaction between PHD3 and Siah2 we also carried out GST pull down assays. In these assays we used recombinant GST tagged SBD of Siah2 and lysates from cells transfected with PHD3. To prepare a recombinant GST-SBD, bacterial expression plasmid of GST-Siah2 was made in pGEX-6P-1 vector. This expression vector pGEX-6P-1: SBD (130-322) was transformed into E. coli BL21 and induced with 0.2 mM IPTG at 18 ˚C overnight. The bacteria pellets were harvested, sonicated and lysed in 50mM Tris-HCl pH 8.0, 100 mM NaCl, 2 mM dithiothreitol containing a protease inhibitors cocktail (Sigma). Similarly, the pGEX-6P-1 expression vector alone was expressed as a GST control. After sonication the lysates were pre-cleared at high speed centrifugation and the supernatant was used for pull down assay. As shown in Fig 18, the GST pull down experiment was successful and in vitro binding of SBD of Siah2 to PHD3 was detected. Therefore, the GST pull-down results support the coimmunoprecipitation studies.

Figure 18. GST pull down of SBD with PHD3. GST and GST-SBD proteins were immobilized on glutathione-agarose beads and incubated with lysates from cells trasfected with PHD3. The samples were then analyzed by western blotting using anti HA antibody.

4.2.3 Expression and purification of MBP-SBD of Siah2 We next intended to analyze the thermodynamics parameters between compounds/PHD3 and SBD of Siah2 by ITC experiments. For this corresponding proteins were purified. Siah2 with an N-terminal GST was expressed for pulldown assays. However attempts to purify and cleave the tag leads to protein precipitation and aggregation. Since the protein yield with MBP fusion tag was much higher than when expressed with GST tag, for further experiments MBP-SBD-siah2 fusion protein was used.

67 The soluble SBD, (residues 130-322) of human Siah2 was expressed in fusion with the MBP using an oxidizing atmosphere of the in E. coli BL21 (DE3) competent cells. It was cloned downstream of the MBP tag in multiple cloning site 1 (MCS1) of the modifed pETDuet-1 vector which has a bacterial DsbC in MCS2 to assist proper folding of the protein of interest during expression. In summary, the final expressed protein was MBP-SBD-Siah2- 6xHis. The protein was purified with affinity chromatography using affinity tags at both ends. This was essential to ensure the removal of prematurely truncated proteins, which arise due to unwarranted translational stops that are common when human proteins are expressed in E. coli. In addition, having affinity tags at both ends also prevents co-purification of truncation products, which could be formed during or after cell lysis. The recombinant protein was first purified using Ni-NTA affinity purification and then followed by amylose affinity purification (Fig 19). Secondly, the protein was purified using size exclusion chromatography, where the protein eluted as dimer and monomer at a molecular weight of 130 and 63 kDa in an S200 column (Fig 20).

Figure 19. SDS-PAGE affinity purification profile of MBP-SBD of Siah2. Key for each lanes were labelled separately.

68

a Aggregates Dimer

Monomer

Imidazole

b

Figure 20. Gel filtration profile of purified MBP-SBD of Siah2. (a) After affinity purification the elute was passed through Superdex-200 column and eluted as three peaks. First peak corresponds to void volume. the second and third peak is the dimeric and monomeric SBD of Siah2. (b) Samples from three peaks were separated by 12% SDS-PAGE and visualized by Coomassie staining. Fusion protein migrated in the gel with an apparent molecular weight of 63 kDa. Lanes: 1. Low molecular weight ladder; 2-4. First peak of gel filtration elution; 5-9. Second peak of gel filtration elution; 9-12. Third peak of the gel filtration elution. SDS-PAGE shows the protein is highly pure.

69 4.2.4 Expression and purification of MBP-PHD3 Full length and truncated PHD3 with a Histag were cloned in pET-M vector and over expressed in BL21 (DE3) cells. However the proteins were completely insoluble. The proteins in the inclusion bodies were denatured and the attempts to refolding yielded, a protein precipitation (Fig. 43 Appendix 2.1). Hence we moved on to the MBP tag which aids in the solubility and purification of PHD3. Similar to the expression of MBP-Siah2, MBP-PHD3 expression was carried out (Fig 21) and purified in three steps: affinity purification using amylose resin, anion exchange chromatography (the pI of MPB-PHD3 is 6) (Fig 22) and size exclusion chromatography. The MBP- PHD3 protein eluted as a monomer in the size exclusion chromatography, corresponding to its molecular weight of 65 kDa. The eluted fractions of gel filtration show the protein is about 95% pure, (Fig 23).

Figure 21. SDS-PAGE affinity purification profile of MBP-PHD3

70 a

b

Figure 22. The ion-exchange chromatography profile of PHD3. (a) The blue line indicates the UV reading when samples are passed out of the column and the brown line indicates the concentration of high-salt buffer added into the column. (b) SDS-PAGE gel showing the FPLC fractions of the eluted protein run on an anion exchange column. The SDS-PAGE gel verified that PHD3 is found in Peak 3. Lanes 2 to 3 correspond to Peak 1, lanes 4 to 5 correspond to Peak 2 and lanes 6 to 10 correspond to Peak 3.

71 a

b

Figure 23. Gel filtration profile of purified MBP-PHD3. (a) After ion exchange the fractions were pooled and passed through Superdex-200 column for further purification. MBP-PHD3 eluted as two peaks. First peak corresponds to void volume. The second elutes as a monomeric PHD3. (b) Samples from two peaks were separated by 12% SDS-PAGE and visualized by Coomassie staining. Fusion protein migrated in the gel with an apparent molecular weight of 65 kDa. Lanes: 1. Low molecular weight ladder; 2-8. First peak of gel filtration elution; 9-10. Second peak of gel filtration elution.

The purification of both MBP-SBD and MBP-PHD3 resulted in highly pure proteins, which were used for ITC experiments and crystallization attempts. The identity of the protein was verified by Mass spectrometry (MS) (Fig 44. Appendix 2.2).

72 4.2.5 ITC analyses Inhibitors have high affinity towards Siah2 compared to PHD3 Interactions between SBD of Siah2 and the compounds and PHD3 were studied using ITC experiments. ITC is very useful in drug development, mainly to determine the relative affinities of a ligand that can bind to a protein with one or more different and specific binding sites. The thermodynamic parameters of these interactions are shown in Table 6. All the ITC experiments were performed with the titration of ligands at 150 μM into SBD at 10 μM in the cell under same buffer conditions. All reactions were exothermic in nature as shown in Fig. 24 and the upper panels show the raw ITC data of each injection. The peaks were normalized to the ligand-protein molar ratio and were integrated as shown in the bottom panels. Solid dots indicate experimental data, and their best fit was obtained from a non-linear least squares method, using a one-site binding model depicted by a continuous line. PHD3, compound 1 and compound 5 bind to SBD in 1:2 ratio (N = ~2) at two binding sites with nearly the same thermodynamic properties (Fig. 24a, b, f). However compound 2 and compound 3 bind in 1:1 ratio at one binding site (Fig 24c, d). For compound 3, the inflection point in the calorimetric isotherm occurs near the molar ratio of 0.6, suggesting only half of the binding site is occupied by the compound (Fig 24d). Compound 4 shows a non- stoichiometric binding profile with N = 0.05 and from the best fit values, the standard error was large (data not shown) suggesting the initial heat changes are unlikely to be significant and suggests no binding (Fig. 24e). Compared to the compounds, the heat released when the substrate (PHD3) was titrated against the SBD of Siah2 was observed to be large as it a protein-protein interaction. The binding affinities of compound 1 and 5 were observed to be four fold stronger than that of the binding affinity of Siah2-SBD-PHD3. As control, MBP-SBD of Siah2 was titrated against no ligand or substrate (Fig. 24g). In addition, all observations were further analyzed by enthalpy and entropy changes (Table 6). The negative Gibbs free energy change for SBD- PHD3 and SBD compound interactions also indicates that all the reactions are thermodynamically favored.

73

a b

c d

74 e f

g

Figure 24. ITC data for SBD with compounds and PHD3. (a) SBD-PHD3 titration. (b) SBD-Cpd 1 titration. (c) SBD-cpd 2 titration. (d) SBD-cpd3 titration. (e) SBD-cpd4 titration. (f) SBD-cpd5 titration. (g) SBD-Control. The upper panels show the injection profile after baseline correction and the bottom panels show the integration (heat release) for each injection.

75

Table 6. Thermodynamic parameters for interaction between SBD of Siah2 and PHD3 or compounds

(“-” represents “no binding”)

K ΔH TΔS ΔG 5 -1 d Molecules N Ka (10 M ) μM (kcal/mol) (kcal/mol) (kcal/mol)

PHD3 2.07 0.029 2.30 0.339 4.34 -7.74 0.177 3.15 -10.89

Cpd 1 1.805±0.050 7.02 ±0.801 1.42 -2.23±0.120 5.72 -7.95

76

Cpd 2 1.06 0.081 3.28 3.04 -2.11 6.62 -8.73

Cpd 3 0.649 0.073 1.28 7.8 -3.95 0.837 1.18 -5.13

Cpd 4 ------

Cpd 5 1.785 9.34 0.485 1.07 -6.14 6.02 -12.16

4.2.6 Compounds partially decrease SBD interaction with PHD3 We used both coimmunmoprecipitation and GST pull down assays to study the effect of the compounds on the interaction between the SBD of Siah2 and PHD3. We hypothesized that the identified small molecules would bind to the binding sites of SBD and would reduce or prevent Siah2 binding to its substrate. From the docking screen of more than 16,000 compounds in the May bridge library the top scoring 5 chemicals that were predicted to bind to the ligand binding pocket of SBD with high affinity were tested for their ability to inhibit the binding interaction between the Siah2 SBD and PHD3.

4.2.6.1 Coimmunoprecipitation of SBD with PHD3 in the presence of compounds Coimmunoprecipitation of Siah2 SBD and PHD3 was carried out in the presence of the identified lead compounds to test their efficacy. As mentioned earlier, the FLAG tagged SBD and HA tagged PHD3 proteins were co-transfected in the HEK293T cell line. After 40 hours of transfection, the cells were treated with the inhibitors at a concentration of 100 µM for 4 hours. The cells were then subjected to lysis and immunoprecipitated with FLAG beads and analyzed by western blot using an anti-HA antibody. Compounds 1, 2, 3 and 5 were found to partially decrease the SBD coimmunoprecipitation with PHD3 (Fig. 25a).

77 a

b

Figure 25. Coimmunoprecipitation of SBD with PHD3 in the presence of the compounds. HEK293T cells were transfected in 60-mm cell culture plates for 2 days with expression constructs for the proteins indicated at the top of each panel. four hours before the lysis, cells were treated with 100 µM of each compounds seperately and then coimmunoprecipitates of SBD with PHD3 was analyzed. The cells were lysed, and the lysates were subjected to FLAG immunoprecipitation (IP), as described under materials and methods. (a) represents the immunoprecipitates of SBD with PHD3 in the presence of inhibitor. Compound 1, 2, 3 and 5 found to have a subtle level of inhibition between the SBD and PHD3 complex. (b) represents the expression levels of proteins in the respective lysates.

78 4.2.6.2 GST pull down of SBD with PHD3 in the presence of compounds We next performed the GST pull down assay in the presence of compounds to test the effect of the compounds in the in vitro binding of SBD of Siah2 with PHD3. SBD was, expressed as the GST fusion protein in Escherichia coli. Equal volume of bacterial protein supernatants (derived by centrifugation of 1 litre cultures depending on the relative expression level of recombinant protein) were incubated with PHD3 in the presence of a given compounds at 100 μM concentration. Among the compounds only compound 5 showed marked activity in suppressing Siah2 binding to PHD3 (Fig. 26a). As pre incubation of compounds with Siah2 was not performed previously, we intended to check whether pre incubation of the compounds with Siah2 prior to the addition of PHD3, would result in a greater inhibitory effect. Hence, we performed the pull-down assay with 3 different drug concentrations: 200 μM, 1 mM and 2 mM, where the compounds were allowed to pre incubate with SBD prior to binding to PHD3. Based on the initial pulldown data, Compounds 1, 3, 5 were chosen for pull down in a dose dependent manner. Compound 5 was effective in inhibiting the interaction at 200 μM and 1mM but not at 2mM. The lack of effect at 2mM may be due to low solubility. Compound 1 was found to partially inhibit the interaction at the same level in all the three doses. Compound 3 partially inhibited the interaction at the higher concentrations of 1mM and 2mM (Fig 26b). From this data, among the five compounds, compound 5 represents a candidate inhibitor of the Siah2 E3 ubiquitin ligase.

79 a

b

c

Figure 26. GST pull down of SBD with PHD3 in the presence of compounds. (a) Hek293T cells transiently over expressing PHD3, were harvested and the cell lysates were subjected to the GST-SBD or GST alone pull-down assay (materials and methods) in the absence or presence of Compounds at the indicated concentration. Only Cpd 5 has visible reduction in binding of SBD of Siah2 with PHD3. (b) GST pull down of SBD with PHD3 in the presence of compounds in a dose dependent manner. Hek293T cells transiently over expressing PHD3, were harvested and the cell lysates were subjected to the GST-SBD or GST alone pull-down assay (as described in materials and methods) in the absence or presence of Compound 1, 3, and 5 at the three different concentrations. Cpd 5 shows consistent and marked reduction in inhibition. (c) commasive blue staining to show the expression of GST alone or GST- SBD of Siah2.

80 4.2.7 Basal level expression of Siah2 in breast cancer In order to further analyze the antiproliferative and cytotoxicity of the compounds in cell lines, we chose selected breast cancer cell lines. Studies of Siah2 have shown that the E3 ligase functions as an oncogene in animal models and patient cohort studies of human breast cancer (Moller, House et al. 2009; Behling, Tang et al. 2011; Chan, Moller et al. 2011; Wong, Sceneay et al. 2012) . The expression of Siah2 in breast cancer cell lines has not been well characterized as most studies are based on Siah2 staining in cancer tissues and the normal counterparts. Two cell line based studies in breast cancer drew our interest in analyzing the expression levels of Siah2 prior to test the activity of the compounds. The first study reported that, estrogen increases Siah2 levels in estrogen receptor ER positive breast cancer cells. This resulted in degradation of the nuclear receptor corepressor 1 (N- CoR) and a reduction in N-CoR–mediated repressive effects on gene expression, thus promoting cancer growth (Frasor, Danes et al. 2005). In contrast to this, another study reported the higher level of Siah2 expression in the ER negative cell line (MDA-MB2 31) (Sarkar, Sharan et al. 2012). In this cell line Src and Siah2 mediated the degradation of C/EBP, a tumor suppressor protein. Hence we chose two ER positive breast cancer cells (MCF7 and T47D) and two ER negative breast cancer cells (MDA-MB231 and BT549) and one normal breast epithelial cell MCF 10A to compare the expression levels of Siah2 in these cell lines and subsequently to check the activity of the compounds in the same. The Siah2 m-RNA expression levels in these cell lines were analyzed. To quantify the mRNA levels, we used real time PCR and the primers were validated (Fig. 27). Our result shows that the expression level was elevated in ER positive MCF7 cells consistent with the finding that estrogen increases Siah2 activity in ER positive cells. The ER positive T47D cell line also shows significant difference compared to the normal cell line (Fig. 28). In conclusion these results are consistent with an important role of Siah2 in breast cancer. Given that Siah2 showed the highest expression in MCF7 cells we used this cell line for further studies.

81

100 bp 80 bp amplicon 50 bp

Figure 27. Agarose gel showing the size of single PCR products generated by Siah2 gene-specific primers for q-PCR. cDNA from HEK293T cells were used as template. 100 bp DNA ladders were loaded to identify the base- lengths of PCR products. Sizes of PCR products correspond to the sizes obtained from primer validation data.

Figure 28. mRNA expression analysis of Siah2 in breast cancer cell lines. Total RNA extracted from all five breast cell lines was subjected to semiquantitative RT-PCR. Experiments were repeated in triplicates. * denotes p<0.05. one way annova, Dunnett's Multiple Comparison Test is used to test the significance. .

82 4.2.8 siRNA mediated gene silencing We next wanted to determine the protein concentration of Siah2 in different cell lines. Gene knock down studies were carried out for evaluating the specificity of the compounds for Siah2 and also to validate the antibody. The Siah2 gene was knocked down in HEK293T and MCF 7 cells using siRNA and quantitative reverse transcription–PCR was performed to determine the extent of knockdown of Siah2. Two SiRNA’s, one targeting the 3′ UTR region and the other targeting the coding region were used. From the two dsiRNA oligos tested, oligo #1 and oligo #2 were found to efficiently down-regulate Siah2 mRNA levels (Fig. 29). The selectivity and the specificity of siRNA knockdown can be verified by rescue experiments. Our results clearly show that over expression of Siah2 was not rescued by siRNA 2 and thus both endogenous and over expressed Siah2, is successfully abrogated by siRNA2. However, siRNA1 completely rescued Siah2 as it targets the 3 UTR region (Fig. 30).

Figure 29. Validation of knockdown of Siah2 in HEK293T and MCF7 cell lines. Cells were transfected with indicated oligos for 72 hours and were harvested for RNA extraction. Real time-PCR was performed with Siah2 primer, normalized with β-Actin and represented as fold difference.

83

Figure 30. Rescue experiments using the Siah2 full length constructs. HEK293T cells were cotransfected with indicated siRNA or control siRNA oligos, together with the expression vectors: Flag-tagged Siah2, or the empty vehicle vector; At 48 h post-transfection, cell lysates were analyzed by Western blotting with anti FLAG ab and actin as a loading control.

4.2.9 Endogenous Siah2 protein expression and antibody validation Goat polyclonal anti-Siah2 (N-14) (sc-5507; Santa Cruz Biotechnology) antibody was used to identify the endogenous Siah2 protein expression levels in all four breast cancer and one normal breast epithelial cell lines. However, the observed non specific binding made it difficult to identify endogenous Siah2 expression. Also differences in protein (from the suspected respective molecular weight band) (Fig. 31a) and mRNA expression patterns of Siah2 were observed. In this regard, Siah2 protein was tested for its stability using cyclohexamide, a protein synthesis inhibitor. Over time there was no degradation of the protein as detected by the antibody (Fig. 31b). Thus we validated the antibody to check its specificity for Siah2 and to confirm the identity of the protein prior to any further studies.

84 a b

Figure 31. Analysis of endogeneous Siah2 protein expression and test for its stability. (a) The cell lines were cultured, harvested, and analyzed by Western blot analysis with Siah2 antibody from Santa Cruz and actin (as a loading control). Rasmos lysate positive control for Ab1 was also used. (b) MCF7 and T47D cells were treated with 40 μM cycloheximide (CHX), a protein synthesis inhibitor, and were lysed at time points 0, 2, 5 and 8 hr. No change in stability of Siah2 protein detected by Ab1. As a control, an unstable protein p27 was analyzed for its stability by CHX.

In order to determine the specificity of the antibody we analyzed both the endogenous and ectopic expression of Siah2 in the presence or absence of siRNA oligos. There was no difference in expression levels observed in both endogenous and ectopic Siah2 (Fig. 32). The results apparently show the inability of the antibody to detect the Siah2 protein. We also tried validating another antibody, Mouse monoclonal anti-Siah2 (S7945; Sigma). Unfortunately even this antibody was also not able to detect Siah2 protein expression. Antibody IP was carried out for both the antibodies simultaneously along with a control antibody to check whether they could detect the Siah2 protein in immunoprecipitates (data not shown). For these reasons we could not detect the protein expression siah2 in breast cancer cells.

85

Figure 32. Antibody validation using siRNA oligos. HEK293T Cells were transfected with negative control or Siah2 siRNA for 3 days and with FLAG- Siah2 for the last 2 days. Cells were lysed and cells lysates were analyzed by Western blotting with the indicated antibody. Antibody could not detect the ectopic Siah2 expression.

86 4.2.10 Effect of Compounds on Cell Viability MTS proliferation assay was used to analyze the anti proliferative and cytotoxic activity of all the five compounds in MCF7 cells. Viable cell masses at the time of drug addition (time 0), and following 72 hours of drug exposure were determined. For this, serial dilutions of these compounds were prepared at a final concentration of 0.064, 0.32, 1.6, 8, 40, 200 and 1000 µM respectively. The compounds and untreated control (UnC) were tested in triplicate. To investigate the effect of compounds on the growth of MCF 7 cell, the cells were treated with the compounds of indicated concentration and incubated for 72h. The absorbance measures were normalized to that of control untreated cells at 72h. Anti-proliferative and cytotoxicity, elicited by all 5 compounds in the MCF 7 cell line were obtained as shown in Fig 33. From this data we observe that the all compounds exhibit varied activity on MCF7 cells. As shown in Fig. 33, marked reduction in the cell growth was observed in the cell treated with compound 1 after 72 h treatment. This inhibitory effect of compound 1 is observed between 1 and 40 μM. On the other hand compound 4 has cytotoxic effect at low concentration and as the concentration increases cytotoxicity decreases. This could be because of poor compound solubility at high concentrations. No cytotoxicity or anti- proliferative activity was observed in cells treated with the compound 2. However, compounds 5 and 3 showed minimal inhibition of growth when compared to the UnC. These results indicate that the compounds have limited activity in cell line. These observations frame critical considerations to focus on the other structural analogues or by side chain modifications, which could increase the pharmacokinetics property of the drug compounds.

87

Figure 33. Cell proliferative effects of compounds in MCF7 cell line. Antiproliferative or cytotoxicity effects of all five compounds in MCF 7 cell was analyzed by MTS assay by measuring the number of viable cells. The values are reported as the fold change relative to control at 72 hrs.

88 4.2.11 Analysis of the specificity of the compound 1 for Siah2 in MCF7 cells Since compound 1 possesses effective anti proliferative activity, the compound was chosen to study the specificity of compounds for Siah2. To determine whether the anti-proliferative and cytotoxic activity of compound 1 is dependent on the presence of Siah2, we repeated the assay in MCF 7 cells with siRNA mediated Siah2 knock down. Since the maximum inhibition was observed at a concentration of 40 µM, we used this concentration for the knock down studies. However, there was a similar growth inhibition in Siah2 knock down compared to control cell. This result suggests that compound 1 has low specificity in cells and exerts Siah2 independent effects specificity of compounds within the cells (Fig 34).

Figure 34. Specificity of Compound 1 to Siah2 in MCF 7 cell line. Cell proliferation of MCF7 cells at 40 µM in the presence of non specific and specific control Siah2 siRNA was measured by MTS assay for 72 hrs.

89 4.3 Discussions 4.3.1 Advantages In our second study, we addressed one of the important criteria which determine the medical value of a candidate drug, 'specificity'. The physiological effect of a drug should be defined to an extent. It has to specifically bind to a target protein in order to minimize undesired side- effects. However, efficacy is not always a result of lack of specificity alone but also could be from the response body to drug and related regulation of malfunction. Molecular level specificity includes two independent mechanisms. First the drug has to bind to its substrate with suitable affinity and second it has to either stimulate or inhibit the regulation of its target protein. Both mechanisms are mediated by a variety of interactions between the drug and target. The results from our biochemical, biophysical and cell-biological assays indicate that the novel compounds identified through computational screening have partial selective inhibitory property for the SBD of Siah2 in the in vitro and in vivo binding assays. We characterized the thermodynamic parameters of the interaction of the SBD of Siah2 with the substrate PHD3 by identified test compounds. We also analyzed the anti-proliferative, cytotoxic profile and specificity of the compounds. The relationship between in vitro affinity and in vivo inhibition constants is reported in Appendix 3. Although the molecular details of the specificity of the compounds towards SBD remains unclear, our study provides the strategy of identifying small molecules through SBDD and validation for its specificity through in vitro and cell based binding assays. . 4.3.2 Limitations There could be various reasons for the partial activity of the compounds. The first reason is drug solubility. Although the computational predictions of pharmacokinetic and bioavailability properties of all these compounds, falls under acceptable ranges, there could be some issues with poor drug solubility. The solubility profile of a substance is required to identify the potential issues in drug precipitation in vivo. With the aid of advancement in combinatorial and medicinal chemistry, a slight chemical

90 modification of a molecule can lead to enhancement in solubility (Amidon, Lennernas et al. 1995). In addition to solubility, factors such as trans- membrane delivery or cell uptake efficacy, bio-stability, and nonspecific absorption or binding by serum or cellular proteins may contribute to better efficacy in cells. The reason for not including menadione as one of the control is because of their potential toxicity in human use which is banned by FDA. Since the compounds are screened against SBD, there was a limitation to measure the activity of compounds in degrading PHD3 as it requires full length protein. The Siah protein levels may be differentially regulated because of stability as these proteins are reported in auto ubiquitination (Hu and Fearon 1999; Depaux, Regnier-Ricard et al. 2006), and hence this may limit the availability of the proteins to bind to the target or compounds. We could not confirm this hypothesis since we were not able to detect the endogenous protein expression levels because of poor antibodies. Another important reason is cross reactivity. Due to high sequence similarity between the Siah1 and Siah2 SBD, it is suspected the identified drug compounds could also cross react with Siah1. Our initial screening of the compounds also ignored Siah1 specificity. In support to this, most of the tumor suppressor role of Siah1 was identified in breast cancer. Hence we hypothesize that the activity of compounds in the Siah2 knockdown cells may likely be affected due to the cross reactivity with Siah1.

91 CHAPTER 5. IDENTIFICATION OF CRITICAL RESIDUES INVOLVED IN SUBSTRATE SPECIFICITY OF SIAH2

5.1 Introduction Siah1 and Siah2 share a highly conserved protein sequence which reflects similar functional roles in a few instances (Confalonieri, Quarto et al. 2009; Kramer, Stauber et al. 2013). However, there are notable differences in the expression of Siah1 and Siah2. Siah1 is induced by p53 upon genomic stress (DNA damage), while Siah2 is induced by hypoxia (Matsuzawa, Takayama et al. 1998; Qi, Nakayama et al. 2008). Siah1 and Siah2 have a number of substrates in common but there are some proteins targeted by one and not the other (Qi, Kim et al. 2013), which might affect different cellular signaling networks. For example, DCC is regulated by both Siah1 and Siah2 (Hu, Zhang et al. 1997). In some cases, substrates are targeted for degradation by solely one Siah member, for example TRAF2 is targeted for proteasomal degradation by Siah2 but not Siah1 (Habelhah, Frew et al. 2002). In most publications however, only one Siah family member is analyzed and often it is not clear which Siah family member is under scrutiny. Thus, further investigation is needed to determine whether or not the identified substrates are regulated by each of Siah isoform. Animal model system uniformly supports an oncogenic role of siah proteins in various cancers, whereas cell based and patient cohort observations describe Siah2 as tumor promoter and Siah1 as tumor suppressor in most cases (chapter 1). A recent review highlights the different roles of Siah proteins, highlighting the necessity for more in-depth understanding of both Siah1 and Siah2 protein function (Wong and Moller 2013). In spite of this, most of the studies reported, do not discriminate Siah1 and Siah2 expression in same cancer type and therefore, are not conclusive whether both or only one is a valuable prognostic marker. A report investigating the physiological functions of Siah1 and Siah2 genes by generating knock-out mice demonstrated that Siah1a knockout mice are sub viable and growth retarded whereas Siah2 knock-out mice are completely viable. However Siah2 Siah1a double mutant mice die at birth

92 (Frew, Hammond et al. 2003). This supports the notion that Siah1 and Siah2 proteins have both distinct and overlapping functions, but their exact roles are still not clear. Because of these differences, they are also thought to have unique roles. Hence, we decided to investigate the critical residues involved in substrate specificity of Siah2 with respect to Siah1, which will allow us to gain a better understanding of their unique roles. In order to identify critical residues, we verified the binding studies with other Siah substrates; β catenin (DNA damage) and Nrf2 (Hypoxia) in addition to PHD3 (Hypoxia). β-catenin is a multifunctional protein that plays an essential role in the transduction of Wnt signals. Studies reported that Siah-1 mediates a β-catenin degradation pathway linking p53 activation to cell cycle control with the interacting protein. However, this degradation is mediated through a complex including Ebi, Sip and APC, which facilitates the degradation of β-catenin in a p53-dependent manner (Liu, Stevens et al. 2001; Matsuzawa and Reed 2001). On the other hand it is reported that Siah2 mediates PHD3 degradation under hypoxic conditions. It is also highlighted that PHD3 might require an unknown adaptor protein for Siah2 mediated degradation (Nakayama, Frew et al. 2004). However, our binding studies show the direct binding of Siah2 with PHD3 (Fig 14). So we also analyzed the direct binding of Siah isoforms with β-catenin along with PHD3 to gain more insights about substrate specificity. Nuclear factor (erythroid-derived 2)-like 2 (Nrf2) is a key transcriptional activator of antioxidant-response elements (AREs) and a central regulator of the induction of antioxidant responsive enzymes. Under normal conditions, Nrf2 is constantly degraded through the ubiquitin- proteasome pathway in a Keap1-mediated manner (McMahon, Itoh et al. 2003). It has been very recently reported that Nrf2 is regulated by Siah2 under hypoxic condition and the direct binding of Siah2 with Nrf2 was also reported (Baba, Morimoto et al. 2013). In this respect our final findings were also verified with Nrf2 to confirm the role of identified critical residues.

93 5.2 Results In general, the roles of Siah1 and Siah 2 are expected to overlap and the unique features of each protein are not clear. We tried to investigate the Siah1-PHD3 interaction in order to know is there any significant difference in the level of binding of PHD3 with Siah1 and Siah2. In this respect, a FLAG tagged Siah1 expression plasmid in pcDNA3.1 was generated. Cloning was confirmed and coimmunoprecipitation studies were carried out in HEK293T cells.

5.2.1 Siah1-PHD3 interaction It has been reported that Siah2 is more efficient than Siah1 in destabilizing over expressed PHD3 (Nakayama, Frew et al. 2004). Another study reported that Siah1 (human) and Siah1a (mouse) reduced PHD3 protein levels similar to Siah2 but Siah1a M180K mutant did not alter the PHD3 protein levels suggesting that the Siah substrate-binding groove is important for PHD ubiquitination and degradation (Moller, House et al. 2009). However no direct binding of Siah1 with PHD3 has been reported so far. In this regard, we first investigated the interaction of Siah1 with PHD3 (Fig 35). To this end, HEK293T cells were cotransfected with the corresponding expression constructs, followed by immunoprecipitation using an anti-flag antibody. Interestingly we observed a weak interaction of SBD of Siah1 with PHD3 when compared to Siah2 with PHD3 (Fig 35a). A subsequent reciprocal immunoprecipitation assay was also performed to check the binding of both full length and SBD of Siah1 with PHD3 as it shows weak binding with the FLAG immunoprecipitaion of the SDB of Siah1. For this, HEK293T cells were cotransfected with the corresponding expression constructs, followed by immunoprecipitation using an anti-HA antibody. HA immunoprecipitation of PHD3 exhibits no binding of PHD3 with Siah1 (Fig 35b). Hence the data obtained by both immunoprecipitation and reciprocal immunoprecipitation assays, suggest the difference in binding levels between Siah1 and Siah2.

94

a b

95

Figure 35. Association of Siah1 with PHD3. HEK293T cells were transfected in 60-mm cell culture plates for 2 days with expression constructs for the proteins indicated at the top of panel. The cells were lysed. (a) The lysates were subjected to FLAG-IP and aliquots of the cell lysates were analyzed by western blotting with the anti-HA antibodies. Weak binding of SBD of Siah1 with PHD3 was observed compared to Siah2. (b) The lysates were subjected to HA-IP. Immunoprecipitates and aliquots of the cell lysates were analyzed by Western blotting with the anti-FLAG antibodies. Both the Full length and SBD of Siah1 did not show binding with PHD3 by HA-IP (reciprocal immunoprecipitation). The membrane was also immunoblotted using anti-FLAG or anti-HA antibody for the detection of protein in the lysates.

5.2.2 SBD of Siah1 and Siah2 region swapped to obtain chimeric forms Amino acid residues 90-282 (193) and 130-322(193) contribute to the SBD of Siah1 and Siah2, respectively. In order to avoid confusion residue numbers for both the SBD are labeled as 1-193 (Fig. 36). To identify which region of SBD favours the binding, region swapping using fusion PCR was performed as described in Materials and Methods, in which residues from 101- 193 region of Siah1 was swapped with the corresponding region of Siah2 and vice versa to obtain the SBD of Siah1-2 and Siah2-1 constructs (Fig 36).

Figure 36. Diagrammatic representation of SBD of Siah1 and Siah2. SBD of Siah1 and Siah2 (Top panel). SBD of Siah1-2 and Siah2-1(bottom panel). Corresponding full length residue numbers are given in given in parentheses.

5.2.3 Binding of wild type (WT) and chimera SBD's with substrates Next, we investigated the interactions of the SBD of Siah1-2 and Siah2-1 with PHD3. Along with this, direct interaction of β-catenin with the WT and chimeric Siah isoforms were also analyzed using coimmunoprecipitation. To this end, HEK293T cells were cotransfected with the corresponding expression constructs, followed by immunoprecipitation using an anti-FLAG antibody. In these experiments, only WT SBD of Siah2 interacted with PHD3, whereas Siah2-1 interaction was markedly reduced compared to WT SBD, suggesting the importance of the region 101-193 in binding. Interestingly Siah1-2 did not exhibit any binding to PHD3 even though it contains the swapped region of Siah2 (Fig 37). These results suggest

96 that both regions of SBD 1-100 and 101-193 are equally important for binding with substrate. We observed no direct binding of β-catenin with both SBDs of (Fig 37). This study supports the finding that Siah1 mediates β-catenin degradation through SIP or APC by UV-induced DNA damage whereas Siah2 causes PHD3 degradation under hypoxia conditions and suggests that the substrate specificity of Siah seems to be dependent on the types of stress that cells encounter.

Figure 37. Association of wild type and chimeric SBD's of Siah with its substrates. HEK293T cells were co-transfected in 60mm plate with pcDNA3.1 containing HA-β-Catenin, HA-PHD3, FLAG-SBD-Siah2,FLAG SBD-Siah1, FLAG-SBD-Siah1-2, FLAG SBD of Siah2-1 as indicated in the top of panel. Forty eight hours after transfection, the cells were lysed and cell lysates were subjected to FLAG immunoprecipitation and then immunoblotted using anti-HA antibody. We also immunoblotted using anti-FLAG antibody for the detection of protein in the lysates (bottom panel).

5.2.4 Identification of critical residues unique for Siah2 In order to identify the critical residues in SDB of Siah2 which confer a substrate specificity, sequence alignment was performed to identify the unique amino acids in which Siah1 and Siah2 differ. Pairwise sequence alignment was performed by the EMBOSS Needle tool, which creates an optimal global alignment of two sequences using the Needleman-Wunsch algorithm (Rice,

97 Longden et al. 2000). From the alignment of SBD of the Siah isoforms, twenty six amino acids are different while the remaining residues are identical or conserved (Fig. 38). Out of 26, ten amino acids are found be less similar or different, based on amino acid properties. Hence we narrowed to 10 amino acids for further mutational analysis.

Figure 38. Sequence alignment of SBD of Siah1 and Siah2. Unique 26 aminoacids are highted in grey. Dissimilar amino acids are highlighted by ‘*’. Highly Similar amino acids are highlighted by ‘:’ and identical amino acids are highlighted by ‘|’.

5.2.5 Mutagenesis of Siah1-2 and Siah2-1 Immunoprecipitation studies between SBD of swapped clones, Siah1-2 and Siah2-1, revealed that both the regions 1-100 and 101-193 are important for binding. Out of the 10 residues that differ between Siah1 and Siah2, 6 amino acid of Siah2 in region 1-100 remain under the Siah2-1construct and remaining 4 amino acids of Siah2 in region 101-193 fall under the swapped Siah1-2 construct. From the corresponding 6 amino acid positions of Siah1 in Siah1-2 clone, three were mutated back to Siah2 (Siah1-2Mut). Similarly from the remaining four corresponding Siah1 amino acid position in Siah2-1 clone, two were mutated back to Siah2 (Siah2-1Mut) (Fig. 39).

98

a

b

Figure 39. Representation of mutagenesis and distribution of unique amino acids in chimeric forms of Siah. (a) 10 unique amino acids in which Siah1 and Siah2 differ are highlighted in diagrammatic representation of SBD of Siah1-2 and Siah2-1 (top panel). Three mutated residues of Siah1 in Siah1- 2 clone in region 1-100 to Siah2 and two mutated residues of Siah1 in Siah2-1 clone in region 101-193 to Siah2 are highlighted (bottom panel). Corresponding full length residue number Siah2 are indicated in parentheses. (b) sequence alignment of SBD of Siah1-2 and Siah2-1 highlighting the respective mutations.

99 5.2.6 Binding of wild type and mutated chimeric SBD's with substrates We proceeded with coimmunoprecipitation studies of the SBDs of Siah1-2Mut and Siah2-1Mut with PHD3 and Nrf2. PHD3 is a well-studied hypoxia substrate and Nrf2 is a recently identified novel hypoxia substrate whose direct binding has been reported. The results show that Siah2-1Mut regains its binding with PHD3 and Nrf2, equivalent to binding levels of WT, proving the crucial roles of the amino acids Leu121 (original number 250) and Ala160 (289) of Siah2 in binding (Fig 40). On the other hand, Siah1-2Mut (E17S, P57S and F98H) did not exhibit the strong binding compared to the Siah2 WT and Siah2-1Mut, which verifies the importance of the other three residues, His21(150), Pro26(155) and Ala62(191) in binding. Fig 40b also shows the specificity of Nrf2 towards the SBD of Siah2 but not that of Siah1. The AxVxP motif, is the consensus sequence found in some of the Siah substrate proteins (House, Frew et al. 2003; House, Hancock et al. 2006). Both Nrf2 and PHD3 have the AXVXP motif at positions 170–174 in Nrf2 (AQVAP) and at positions 176-180 in PHD3 (ADVEP). Siah2 binds to Nrf2 through binding motif (Baba, Morimoto et al. 2013). The mutational study analysis of Siah isoforms and their chimeric forms with both PHD3 and Nrf2 are consistent in terms of direct binding, suggesting the importance of the identified residues in Siah2. Since Siah2-1M restore binding equivalent to Siah2 WT, we next investigated which of the two mutants (either one or both) is critical for this preferential binding. For this point mutants were made and the binding with PHD3 was analysed. We observed that Q250L restores binding compared to T289A suggesting the importance of Leucine 250 in the C terminal region (Fig 41a). Similarly, in the N terminal region of Siah1-2, we made remaining 3 mutants and also all 6 mutants to investigate the binding with PHD3. But both 3M and 6M only partially restores binding but not equivalent to Siah2 WT (Fig. 41b). We next focused on the 10 similar amino acid on the N terminal region of which Serine 133 and tyrosine 164 could be invovled in hyrogen bonding with its OH group. So we made additional 3 mutant constructs which include 8M(6+SY), 7M(6+S) and 7M(6+Y). Binding of these mutants with PHD3 was analysed. It was observed that 8M mutant increased binding compared to 6M alone (Fig. 41c). This results suggest that all the amino acid in the Nterminal region might be involved in binding.

100

a b

101

Figure 40. Siah2-1Mut restores its binding with its substrates. HEK293T cells were transfected with (a) WT FLAG SBD of Siah1, WT FLAG SBD of Siah2 , FLAG Siah1-2Mut, FLAG Siah2-1Mut, HA PHD3 or empty vector; (b) WT FLAG SBD of Siah1, WT FLAG SBD of Siah2 , Siah1-2Mut, Siah2-1Mut, Nrf2 or empty vector as indicated on the top of panel, followed by immunoprecipitation (IP) of cell lysates with FLAG antibody. Immunoprecipitates were then analyzed for coimunoprecipitates of PHD3 and Nrf2 by Western blotting (WB). Both the results showed that Siah2-1Mut restores binding with its substrates.

IP-FLAG Lysates

HA- PHD3 a FLAG-SBD Siah1 WT FLAG-SBD Siah2 WT FLAG-SBD Siah2-1 FLAG-SBD Siah2-1 M1 FLAG-SBD Siah2-1 M2 Empty pcDNA3.1

HA- PHD3 HA

FLAG FLAG-SBD Siah

102 IP-FLAG Lysates c IP-FLAG Lysates b HA- PHD3 HA- PHD3 FLAG-SBD Siah1 WT FLAG-SBD Siah2 WT FLAG-SBD Siah2 WT FLAG-SBD Siah1-2 FLAG-SBD Siah 6M FLAG-SBD Siah 2-1 FLAG-SBD Siah2-1 8M FLAG-SBD Siah2-1 3M FLAG-SBD Siah2-1 7MS FLAG-SBD Siah2-1 6M FLAG-SBD Siah2-1 7MY Empty pcDNA3.1 Empty pcDNA3.1 Heavy Chain HA HA- PHD3 HA HA- PHD3 Light Chain

FLAG FLAG-SBD Siah FLAG FLAG-SBD Siah Figure 41. Analyses of Binding between mutated 2-1 and 1-2 chimeric SBD (a) Binding of wild type and mutated 2-1SBD Chimeric with PHD3; (b&c) Binding of wild type and mutated 1-2SBD Chimeric with PHD3.

5.2.7 Crystallization Simultaneously, attempts were carried out to solve the structure of SBD of Siah2 to analyze whether any significant change in the fold could contribute to the differential binding. Even though the crystal structures of Siah1 SBD with the peptide EKPAAVVAPITTG from SIP and that for murine Siah, in complex with a peptide from phyllopod (discussed in chapter 1.5), are known, the Siah2 structure is still important to gain more insights into the mechanism of specificity. Hence, we attempted to crystallize the MBP-SBD of Siah2. Purified MBP-Siah2-SBD was concentrated up to 10mg/ml and its homogeneity was analyzed using dynamic light scattering (DLS). The results confirm that MBP-SBD-Siah2-H6 is homogenous but the predicted molecular weight was very high suggesting homogeneous aggregation after concentrating (Fig 41). Crystallization screening for MBP-SBD-Siah2-6xHis was performed by hanging drop vapor diffusion method at room temperature using various commercially available Hampton and Emerald biosystems screens. 1 μl of purified protein was mixed with 1 μl of the corresponding reservoir solution and allowed to equilibrate against 500 μl of the reservoir solution at room temperature. Tiny crystals not suitable for diffraction was obtained in wizard screen I. Further optimization of the initial conditions is underway. Crystallization trials were also aided by the use of the Mosquito robotic crystallization system (TTP Labtech, UK). The robotic screening was carried out using hanging drop method in 96 well plates. 0.2 µL of protein sample was mixed with 0.2 µL of mother liquor. Crystallization screens with Wizard screens 1 to 4 were setup using this method. Currently the drops are under observation.

103

Figure 42. DLS Profile of MBP-SBD of Siah2 at 10mg/ml.

5.3 Discussion There could be various reasons for the varied substrate specificity of both Siah1 and Siah2, in spite of their high sequence similarity in terms of direct binding. One apparent reason is that one or more unique amino acids that is specific for Siah2, favors this selective binding. The other reason could be the differences in the conformation of the two proteins. However, this can be compared only when the crystal structure of Siah2 is available. We first intended to identify the critical residues involved in selective binding of Siah2 and then we attempted to solve the structure of SBD of Siah2. From our first study, we narrowed down 5 critical amino acids, specific for Siah2, that confer to substrate specificity. Our results suggest that two regions of the SBD of Siah2 are involved in binding and we believe that further validation of these residues by using specific Siah1 and Siah2 substrates will provide us with a better insight into the unique functions of these proteins. Our study highlights the differences in binding between Siah1 and Siah2 with their substrates based on over-expression studies.

104 CHAPTER 6. CONCLUSION AND FUTURE DIRECTIONS

6.1 Conclusion 6.1.1 Siah2 inhibition as anti-cancer therapy The SBD of Siah1 and Siah2 are highly conserved with 86% sequence similarity, making it challenging to design Siah2 specific inhibitor. The N- terminal regions of Siah1 and Siah2 differ from each other (Della, Senior et al. 1993), which makes it feasible to design Siah2-specific RING domain inhibitors. However, targeting the RING domain does not impair the ability of Siah2 to bind to its substrate proteins via the SBD. Another important issue in targeting the RING domain is its similarity among the Ring-finger ligase family (Deshaies and Joazeiro 2009) and thus all the inhibitors for E3 ligases obtained so far inhibit multiple ligases (Garber 2005). The Siah proteins are unique in their SBD, which adopts a pocket and forms the binding site for adaptor and substrate containing an AxVxP motif, a consensus motif found in most of the Siah binding proteins (House, Frew et al. 2003). Thus, screening inhibitors against the Siah2 SBD may lead to identification of specific inhibitors that will not affect other ring-finger E3 ligase. Blocking the SBD using a peptide inhibitor is capable of reducing tumor growth and metastasis in various models (Qi, Nakayama et al. 2008; Shah, Stebbins et al. 2009; Qi, Nakayama et al. 2010). So far, menadione is the only available Siah2 specific inhibitor, identified from a Meso-Scale-based assay of 2000 compounds (Shah, Stebbins et al. 2009). Because of high homology, SBD inhibitors, if available, will inhibit both Siah1 and Siah2. In general, the roles of Siah1 and Siah 2 are expected to overlap and unique feature for each of the proteins is not clear. However based on the current understanding of the role of Siah isoforms in cancer, Siah1 is more often described as tumor suppressor and Siah2 as an oncogene. Addressing the above facts, our results has contributed the following findings, 1. In the first part of the thesis, SBDD approach was implied and we have identified five drug like molecules for SBD of Siah2. These drug-like molecules were tested for their efficacy to be potential inhibitors against Siah2

105 by biophysical, biochemical, in vitro and in vivo binding assays, which were carried out in the second part of this thesis. The identified compounds were experimentally validated for their activity in disrupting the interaction of Siah2 with its high affinity substrate PHD3 by in vitro and in vivo immuno precipitation assays. The thermodynamic properties of the binding between SBD and PHD3 or SBD and the compounds were also characterized. From these studies, compound 5 is a promising inhibitor among all. The anti- proliferative and cytotoxic properties of the compounds were also analyzed by the MTS assay. 2. In our initial high throughput virtual screening for compounds against the SBD of Siah2, menadione was used as a reference ligand for achieving specificity in screening. However the compound specificity for Siah1 and Siah2 was not addressed due to the high sequence similarity and the possibility of overlapping functions. Current understanding highlights the controversial roles of Siah1 and Siah2 in cancer. The partial activity of the previously screened compounds might be due to the lack of specificity. In this regard, the third part of the thesis, intends to identify critical amino acids involved in the substrate specificity of the SBD of Siah2 with respect to Siah1. With the help of chimeric and mutational analysis, a few residues were narrowed down for substrate specificity that might help in developing Siah2 specific inhibitors in future and could also address the cross-talk between the roles of Siah1 and Siah2 in cancer. Overall, our work presents an attempt to translate the mechanistic information of Siah2 mediated hypoxic signaling in cancer to structure based design of chemical inhibitors which might be useful for anticancer therapy. This approach may initiate broad implications in drug discovery efforts targeting diverse signaling networks. To more clearly define the mechanism of action, it will be a priority in future studies to obtain the structure of Siah2 alone or in complex with its inhibitors and their derivative.

106 6.2 Future directions Based on the experiments and results reported in this thesis, the following continuation is proposed. Site directed mutagenesis of the identified residues should be performed to further narrow down to specific residues that would confer specificity for Siah2 substrates. Using high-throughput virtual screening (HTVS) more specific lead compounds can be screened against the SBD of Siah2 using the identified residues that show better substrate specificity. HTVS from other databases like ZINC, NCI and FDA, which were not available during the initial screening, has to be undertaken. In addition, the identified compounds should be cross validated against Siah1 using in vitro and in vivo studies as the specificity selection of the compounds was not implied in our initial screening. To understand the structural and functional role of Siah2 with respect to the identified residues, site directed mutation studies could be performed on the full length Siah2, to check the affect on activity of the protein to its substrate. In order to further substantiate the findings, another choice would have been to use a peptide such as the PHYL peptide that is necessary and sufficient for SBD binding to confirm the importance of identified residues. Also various Siah substrates (both unique and common for Siah1 and Siah2) can be tested to against this identified unique residues which could help us in addressing the cross talk between the roles of Siah1 and Siah2. More crystallization trials should be undertaken to determine the structure of SBD of Siah2. As crystal formation of Siah2 wildtype was limiting step which could answer most of the hypotheses, it is recommended to attempt crystalization trails with mutated Siah2 residues. This could render a better comparative picture of the Siah2 protein scaffold with respect to Siah1. Co-crystallization of the SBD-inhibitor complex along with apo structure, will provide more insights in understanding the interactions between them. Their structural backbone of the inhibitors could serve as building blocks in designing better drug-like molecules targeting E3 ligases in cancer therapy. Further side chain modification, with the help of medicinal and combinatorial chemistry can be undertaken to increase the solubility of the inhibitors (Fig. 42). Further screens with different bio-relevant buffer systems,

107 termed as pH profiling will allow the detection of a potential counter ion exchange and formation of more stable/less-soluble salts of a molecule that will lead to precipitation in vivo.

Figure 43. Example illustration of steps involved in combinatorial chemistry based design of ligands. An example of a drug molecule from a database is presented in the above figure. Using combinatorial chemistry, where the backbone scaffold of the drug molecules has been slightly modified with an additional side chain of benzene rings that is in blue colour.

108 APPENDICES

APPENDIX 1

Table A1. List of constructs and corresponding primers

Recombinant Forward Reverse Restiction Constructs Vector Primer(FP) Primer(RP) Sites

FP RP

MBP-SBD- Modified TTCAATCATGC TTCTGCGGCCGC EcoRI NotI Siah2 pET Duet CCAGCATCAG GTGGTGATGGT

GST-SBD- pGEX6P-1 ATGCGAACCTGGATCCGGCTAG ATGCGATGATGTCAACTCGAGTC BamHI XhoI T C ATGTAGAAAT Siah2 GCGTGGCCTCGGCAGAATTCAT ACAAACATGTAGAAGCGGCCGC MBP-SBD- Modified AGTAACATTG GCCCCTGGGACGTCC TCAGTCTTCAATAGTAACATT EcoRI NotI PHD3 pET Duet ATC G

TGTAAGCTTGT GCATTCTAGATT FLAG-SBD- pcDNA3.1 GGCT AAT ATGGACAACAT HindIII XbaI Siah1 TCAGTAC GTAGAAATAG

GTGCTCTAGATT ATCGTAAGCTT FLAG-SBD- AACATGTAGAA pcDNA3.1 AGCCGCCCGTC HindIII XbaI Siah2 ATAGTAACATT CTC G

TTGCAAGCTTA ACGCTCTAGATT FLAG-Siah1 pcDNA3.1 TGAGCCGCCCG C ACA TGG AAA HindIII XbaI TCCTCC TAG

GTGCAAGCTTA ATCGTCTAGATT FLAG-Siah2 pcDNA3.1 TGGTGGCCTCG ATGGACAACAT HindIII XbaI GCAGTCC GTAGAAATAG

TGTGAATTCAA ACACTCGAGTT PHD3-His pET-M TGCC AGTCTTCAGTGA EcoRI XhoI CCTGGGAC G GG

109 APPENDIX 2

A2.1 Purification of PHD3 by refolding PHD3 was initially cloned in the pETM vector with N terminal His tag. It was expressed as an insoluble protein in E. coli (BL21). The proteins in the inclusion bodies was denatured using 6 M GnHCL. The denatured protein was first purified with the Ni-NTA resin and then refolded by dialysis. Beyond 2 M GnHCL, the protein tends to precipitate completely. Hence dialysis was stopped at 1.5M GnHCL and then purified by gel filtration using an S75 column with 0.5M GnHCL. Fig. 43a shows the gel filtration results of refolded PHD3. Most of the refolded protein got aggregated. Fig. 43b shows the presence of the PHD3 protein in the gel filtration fractions (mostly in the aggregates) and the protein size by SDS-PAGE. As the refolding was not successful, purification of MBP- PHD3 was carried out.

a Aggregates

Phd3

110 b

Figure 44. Purification of PHD3-His by refolding. (a) The gel filtration profile of Phd3 with His tag when loaded onto a size exclusion column. (b) SDS-PAGE gel showing the FPLC fractions of the refolded protein with 0.5 M GnHCL run on an S200 gel filtration column. Lanes 1-7 are aggregates.

A2.2 MS-MS Analysis of SBD of Siah2 and PHD3 For MS/MS analysis, sample digestion, desalting and concentration steps were carried out by using the Montage® In-Gel digestion Kits (Millipore Corp.) Protein spots were analyzed using an Applied Biosystems 4700 Proteomics Analyzer MALDI-TOF/TOF (Applied Biosystems, Framigham, MA, USA). Data processing and interpretation was carried out using the GPS Explorer Software (Applied Biosystems) and database searching was achieved by using the MASCOT program (Matrix Science Ltd., London, UK). The NCBI database was used for combined MS and MS/MS search. Samples for the mass spectrometry experiments were prepared and analyzed from PPC, DBS-NUS facility.

111

Figure 45. Identification of MBP-SBD of Siah2 and MBP-PHD3 by peptide mass fingerprint. The peptide mass fingerprint of peak from gel- filtration (Fig. 20 and Fig 23) result revealed the identity of the Siah2 and PHD3.

112 APPENDIX 3

A3.1 IC50 and affinity

In biochemical assays, the dissociation constant Kd is termed as the inhibition constant Ki. It is important to be aware that data from inhibition is often not reported as Ki (or, equivalently, Kd), but instead as the IC50, the concentration of a ligand that reduces particular activity by 50%. The difference between IC50 and Ki results from the fact, that in a biochemical binding assay of a competitive inhibitor, the inhibitor is not the only molecule trying to bind the target but also competing with the substrate, so the concentration of the ligand needed to reduce the activity by 50% depends on the concentration of substrate and how tightly it binds the enzyme. IC50 is not a direct indicator of affinity although the two can be related at least for competitive agonists and antagonists by the Cheng-Prusoff equation (Cheng and Prusoff 1973). In general, then, the IC50 is expected to be greater than Ki

(the Cheng-Prusoff equation). IC50 is not a direct indicator of affinity, nor is

Kd/Ki a direct indicator of inhibition. However, the two can be related using the Cheng-Prusoff equation:

where Ki is the binding affinity of the inhibitor, IC50 is the functional strength of the inhibitor, [S] is substrate concentration and Km is the concentration of substrate at which enzyme activity is at half maximal (but is frequently confused with substrate affinity for the enzyme, which it is not). Whereas the IC50 value for a compound may vary between experiments depending on experimental conditions, the Ki is an absolute value. An ITC assay tends to require a larger quantity of sample than an enzyme inhibition assays. However, ITC has the advantage of not requiring the development of a specialized substrate whose reaction can be detected. The natural heat release of the binding reaction is what is required.

113 REFERENCES

Ahmed, A. U., R. L. Schmidt, et al. (2008). "Effect of disrupting seven-in- absentia homolog 2 function on lung cancer cell growth." Journal of the National Cancer Institute 100(22): 1606-1629.

Ajay and M. A. Murcko (1995). "Computational methods to predict binding free energy in ligand-receptor complexes." J Med Chem 38(26): 4953- 4967.

Al-Lazikani, B., J. Jung, et al. (2001). "Protein structure prediction." Curr Opin Chem Biol 5(1): 51-56.

Altschul, S. F., W. Gish, et al. (1990). "Basic local alignment search tool." J Mol Biol 215(3): 403-410.

Altschul, S. F., T. L. Madden, et al. (1997). "Gapped BLAST and PSI- BLAST: a new generation of protein database search programs." Nucleic Acids Res 25(17): 3389-3402.

Amidon, G. L., H. Lennernas, et al. (1995). "A theoretical basis for a biopharmaceutic drug classification: the correlation of in vitro drug product dissolution and in vivo bioavailability." Pharm Res 12(3): 413- 420.

Aravind, L. and E. V. Koonin (2000). "The U box is a modified RING finger - a common domain in ubiquitination." Curr Biol 10(4): R132-134.

Baba, K., H. Morimoto, et al. (2013). "Seven in Absentia Homolog 2 (Siah2) Protein Is a Regulator of NF-E2-related Factor 2 (Nrf2)." The Journal of biological chemistry 288(25): 18393-18405.

Baldwin, J. J., G. S. Ponticello, et al. (1989). "Thienothiopyran-2- sulfonamides: novel topically active carbonic anhydrase inhibitors for the treatment of glaucoma." J Med Chem 32(12): 2510-2513.

Behling, K. C., A. Tang, et al. (2011). "Increased SIAH expression predicts ductal carcinoma in situ (DCIS) progression to invasive carcinoma." Breast Cancer Res Treat 129(3): 717-724.

Berman, H., K. Henrick, et al. (2003). "Announcing the worldwide Protein Data Bank." Nat Struct Biol 10(12): 980.

Berra, E., E. Benizri, et al. (2003). "HIF prolyl-hydroxylase 2 is the key oxygen sensor setting low steady-state levels of HIF-1alpha in normoxia." EMBO J 22(16): 4082-4090.

Bienstock, R. J. (2012). "Computational drug design targeting protein-protein interactions." Curr Pharm Des 18(9): 1240-1254.

114 Blow, D. M. (2002). "Rearrangement of Cruickshank's formulae for the diffraction-component precision index." Acta Crystallogr D Biol Crystallogr 58(Pt 5): 792-797.

Blundell, T. L., H. Jhoti, et al. (2002). "High-throughput crystallography for lead discovery in drug design." Nat Rev Drug Discov 1(1): 45-54.

Bohm, H. J. (1994). "The development of a simple empirical scoring function to estimate the binding constant for a protein-ligand complex of known three-dimensional structure." J Comput Aided Mol Des 8(3): 243-256.

Brauckhoff, A., V. Ehemann, et al. (2007). "[Reduced expression of the E3- ubiquitin ligase seven in absentia homologue (SIAH)-1 in human hepatocellular carcinoma]." Verh Dtsch Ges Pathol 91: 269-277.

Braun, B. C., M. Glickman, et al. (1999). "The base of the proteasome regulatory particle exhibits chaperone-like activity." Nat Cell Biol 1(4): 221-226.

Brenner, S. E., C. Chothia, et al. (1998). "Assessing sequence comparison methods with reliable structurally identified distant evolutionary relationships." Proceedings of the National Academy of Sciences of the United States of America 95(11): 6073-6078.

Brizel, D. M., S. P. Scully, et al. (1996). "Tumor oxygenation predicts for the likelihood of distant metastases in human soft tissue sarcoma." Cancer Res 56(5): 941-943.

Brooijmans, N. and I. D. Kuntz (2003). "Molecular recognition and docking algorithms." Annu Rev Biophys Biomol Struct 32: 335-373.

Bruick, R. K. and S. L. McKnight (2001). "A conserved family of prolyl-4- hydroxylases that modify HIF." Science 294(5545): 1337-1340.

Calzado, M. A., L. de la Vega, et al. (2009). "An inducible autoregulatory loop between HIPK2 and Siah2 at the apex of the hypoxic response." Nat Cell Biol 11(1): 85-91.

Carthew, R. W., T. P. Neufeld, et al. (1994). "Identification of genes that interact with the sina gene in Drosophila eye development." Proceedings of the National Academy of Sciences of the United States of America 91(24): 11689-11693.

Carthew, R. W. and G. M. Rubin (1990). "seven in absentia, a gene required for specification of R7 cell fate in the Drosophila eye." Cell 63(3): 561- 577.

Chan, P., A. Moller, et al. (2011). "The expression of the ubiquitin ligase SIAH2 (seven in absentia homolog 2) is mediated through gene copy number in breast cancer and is associated with a basal-like phenotype and p53 expression." Breast Cancer Res 13(1): R19.

115 Charifson, P. S., J. J. Corkery, et al. (1999). "Consensus scoring: A method for obtaining improved hit rates from docking databases of three- dimensional structures into proteins." J Med Chem 42(25): 5100-5109.

Chauhan, D., L. Catley, et al. (2005). "A novel orally active proteasome inhibitor induces apoptosis in multiple myeloma cells with mechanisms distinct from Bortezomib." Cancer Cell 8(5): 407-419.

Chayen, N. E. (2004). "Turning protein crystallisation from an art into a science." Curr Opin Struct Biol 14(5): 577-583.

Chen, C. Y. (2013). "A novel integrated framework and improved methodology of computer-aided drug design." Curr Top Med Chem 13(9): 965-988.

Cheng, Y. and W. H. Prusoff (1973). "Relationship between the inhibition constant (K1) and the concentration of inhibitor which causes 50 per cent inhibition (I50) of an enzymatic reaction." Biochem Pharmacol 22(23): 3099-3108.

Cherezov, V., D. M. Rosenbaum, et al. (2007). "High-resolution crystal structure of an engineered human beta2-adrenergic G protein-coupled receptor." Science 318(5854): 1258-1265.

Chothia, C. and A. M. Lesk (1986). "The relation between the divergence of sequence and structure in proteins." EMBO J 5(4): 823-826.

Clark, R. D., A. Strizhev, et al. (2002). "Consensus scoring for ligand/protein interactions." J Mol Graph Model 20(4): 281-295.

Cohen, N. C. (1996). "Guidebook on Molecular Modeling in Drug Design. Boston: Academic Press." (ISBN 0-12-178245-X).

Conaway, R. C., C. S. Brower, et al. (2002). "Emerging Roles of Ubiquitin in Transcription Regulation." Science 296(5571): 1254-1258.

Confalonieri, S., M. Quarto, et al. (2009). "Alterations of ubiquitin ligases in human cancer and their association with the natural history of the tumor." Oncogene 28(33): 2959-2968.

Cooper, S. E., C. M. Murawsky, et al. (2008). "Two modes of degradation of the tramtrack transcription factors by Siah homologues." The Journal of biological chemistry 283(2): 1076-1083.

Cristobal, S., A. Zemla, et al. (2001). "A study of quality measures for protein threading models." BMC Bioinformatics 2: 5.

Cruciani, G., M. Pastor, et al. (2000). "VolSurf: a new tool for the pharmacokinetic optimization of lead compounds." Eur J Pharm Sci 11 Suppl 2: S29-39.

116 Deane, C. M., Q. Kaas, et al. (2001). "SCORE: predicting the core of protein models." Bioinformatics 17(6): 541-550.

Della, N. G., P. V. Senior, et al. (1993). "Isolation and characterisation of murine homologues of the Drosophila seven in absentia gene (sina)." Development 117(4): 1333-1343.

Depaux, A., F. Regnier-Ricard, et al. (2006). "Dimerization of hSiah proteins regulates their stability." Biochem Biophys Res Commun 348(3): 857- 863.

Deshaies, R. J. and C. A. Joazeiro (2009). "RING domain E3 ubiquitin ligases." Annual review of biochemistry 78: 399-434.

Dinkel, H., S. Michael, et al. (2011). "ELM—the database of eukaryotic linear motifs." Nucleic Acids Research.

Dokholyan, N. V., S. V. Buldyrev, et al. (1998). "Discrete molecular dynamics studies of the folding of a protein-like model." Fold Des 3(6): 577-587.

Drews, J. and S. Ryser (1997). "The role of innovation in drug development." Nat Biotechnol 15(13): 1318-1319.

Edwards, P. (2001). "HIV-1 protease inhibitors." Drug Discov Today 6(8): 437-439.

Edwards, Y. J. and A. Cottage (2003). "Bioinformatics methods to predict protein structure and function. A practical approach." Mol Biotechnol 23(2): 139-166.

Emerling, B. M., L. C. Platanias, et al. (2005). "Mitochondrial reactive oxygen species activation of p38 mitogen-activated protein kinase is required for hypoxia signaling." Mol Cell Biol 25(12): 4853-4862.

Epstein, A. C., J. M. Gleadle, et al. (2001). "C. elegans EGL-9 and mammalian homologs define a family of dioxygenases that regulate HIF by prolyl hydroxylation." Cell 107(1): 43-54.

Ertl, P., B. Rohde, et al. (2000). "Fast calculation of molecular polar surface area as a sum of fragment-based contributions and its application to the prediction of drug transport properties." J Med Chem 43(20): 3714- 3717.

Feltham, R., B. Bettjeman, et al. (2011). "Smac mimetics activate the E3 ligase activity of cIAP1 protein by promoting RING domain dimerization." The Journal of biological chemistry 286(19): 17015- 17028.

Frasor, J., J. M. Danes, et al. (2005). "Estrogen down-regulation of the corepressor N-CoR: mechanism and implications for estrogen derepression of N-CoR-regulated genes." Proceedings of the National

117 Academy of Sciences of the United States of America 102(37): 13153- 13157.

Frew, I. J., V. E. Hammond, et al. (2003). "Generation and Analysis of Siah2 Mutant Mice." Molecular and Cellular Biology 23(24): 9150-9161.

Friesner, R. A., J. L. Banks, et al. (2004). "Glide: a new approach for rapid, accurate docking and scoring. 1. Method and assessment of docking accuracy." J Med Chem 47(7): 1739-1749.

Garber, K. (2005). "Missing the target: ubiquitin ligase drugs stall." J Natl Cancer Inst 97(3): 166-167.

Glickman, M. H. and A. Ciechanover (2002). "The ubiquitin-proteasome proteolytic pathway: destruction for the sake of construction." Physiol Rev 82(2): 373-428.

Gohlke, H., M. Hendlich, et al. (2000). "Knowledge-based scoring function to predict protein-ligand interactions." J Mol Biol 295(2): 337-356.

Gohlke, H. and G. Klebe (2002). "Approaches to the description and prediction of the binding affinity of small-molecule ligands to macromolecular receptors." Angew Chem Int Ed Engl 41(15): 2644- 2676.

Govindarajan, S., R. Recabarren, et al. (1999). "Estimating the total number of protein folds." Proteins 35(4): 408-414.

Gruswitz, F., M. Frishman, et al. (2005). "Coupling of MBP fusion protein cleavage with sparse matrix crystallization screens to overcome problematic protein solubility." Biotechniques 39(4): 476, 478, 480.

Guex, N. and M. C. Peitsch (1997). "SWISS-MODEL and the Swiss- PdbViewer: an environment for comparative protein modeling." Electrophoresis 18(15): 2714-2723.

Gyorgy, S. (2012). "X-ray sources and high-throughput data collection methods." Methods in Molecular Biology 841(DOI: 10.1007/978-1- 61779-520-6_5): 93-141.

Habelhah, H., I. J. Frew, et al. (2002). "Stress-induced decrease in TRAF2 stability is mediated by Siah2." EMBO J 21(21): 5756-5765.

Habelhah, H., A. Laine, et al. (2004). "Regulation of 2-oxoglutarate (alpha- ketoglutarate) dehydrogenase stability by the RING finger ubiquitin ligase Siah." The Journal of biological chemistry 279(51): 53782- 53788.

Hanna, J. and D. Finley (2007). "A proteasome for all occasions." FEBS Lett 581(15): 2854-2861.

118 Hatakeyama, S. and K. I. Nakayama (2003). "U-box proteins as a new family of ubiquitin ligases." Biochem Biophys Res Commun 302(4): 635-645.

Heilman, R. D. (1995). "Drug development history, "overview," and what are GCPs?" Qual Assur 4(1): 75-79.

Hershko, A. (1996). "Lessons from the discovery ofthe ubiquitin system." Trends in Biochemical Sciences 21(11): 445-449.

Hershko, A. and A. Ciechanover (1998). "The ubiquitin system." Annual review of biochemistry 67: 425-479.

Hess, B., C. Kutzner, et al. (2008). "GROMACS 4: Algorithms for Highly Efficient, Load-Balanced, and Scalable Molecular Simulation." Journal of Chemical Theory and Computation 4(3): 435-447.

Hicke, L. and R. Dunn (2003). "Regulation of membrane protein transport by ubiquitin and ubiquitin-binding proteins." Annu Rev Cell Dev Biol 19: 141-172.

Hillisch, A., L. F. Pineda, et al. (2004). "Utility of homology models in the drug discovery process." Drug Discov Today 9(15): 659-669.

Hobert, O. (2002). "PCR fusion-based approach to create reporter gene constructs for expression analysis in transgenic C. elegans." Biotechniques 32(4): 728-730.

Holloway, A. J., N. G. Della, et al. (1997). "Chromosomal mapping of five highly conserved murine homologues of the Drosophila RING finger gene seven-in-absentia." Genomics 41(2): 160-168.

Holm, L. and C. Sander (1993). "Protein structure comparison by alignment of distance matrices." J Mol Biol 233(1): 123-138.

House, C. M., I. J. Frew, et al. (2003). "A binding motif for Siah ubiquitin ligase." Proceedings of the National Academy of Sciences of the United States of America 100(6): 3101-3106.

House, C. M., N. C. Hancock, et al. (2006). "Elucidation of the substrate binding site of Siah ubiquitin ligase." Structure 14(4): 695-701.

House, C. M., A. Moller, et al. (2009). "Siah proteins: novel drug targets in the Ras and hypoxia pathways." Cancer Res 69(23): 8835-8838.

Hu, G., Y. L. Chung, et al. (1997). "Characterization of human homologs of the Drosophila seven in absentia (sina) gene." Genomics 46(1): 103- 111.

Hu, G. and E. R. Fearon (1999). "Siah-1 N-terminal RING domain is required for proteolysis function, and C-terminal sequences regulate oligomerization and binding to target proteins." Mol Cell Biol 19(1): 724-732.

119 Hu, G., S. Zhang, et al. (1997). "Mammalian homologs of seven in absentia regulate DCC via the ubiquitin-proteasome pathway." Genes Dev 11(20): 2701-2714.

Ivan, M., K. Kondo, et al. (2001). "HIFalpha targeted for VHL-mediated destruction by proline hydroxylation: implications for O2 sensing." Science 292(5516): 464-468.

Jackson, P. K., A. G. Eldridge, et al. (2000). "The lore of the RINGs: substrate recognition and catalysis by ubiquitin ligases." Trends Cell Biol 10(10): 429-439.

Jaroszewski, L., L. Rychlewski, et al. (2000). "Improving the quality of twilight-zone alignments." Protein Sci 9(8): 1487-1496.

Joazeiro, C. A. and A. M. Weissman (2000). "RING finger proteins: mediators of ubiquitin ligase activity." Cell 102(5): 549-552.

Johnsen, S. A., M. Subramaniam, et al. (2002). "Modulation of transforming growth factor beta (TGFbeta)/Smad transcriptional responses through targeted degradation of TGFbeta-inducible early gene-1 by human seven in absentia homologue." J Biol Chem 277(34): 30754-30759.

Jorgensen, W., J. Chandrasekhar, et al. (1983). "Comparison of simple potential functions for simulating liquid water." The Journal of Chemical Physics 79(2): 926-935.

Jorgensen, W. L. (2004). "The many roles of computation in drug discovery." Science 303(5665): 1813-1818.

Jorgensen, W. L. (2006). "QikProp, version 3.0; Schrodinger, LLC:New York." (2006).

Kandt, C., W. L. Ash, et al. (2007). "Setting up and running molecular dynamics simulations of membrane proteins." Methods 41(4): 475- 488.

Karwath, A. and R. D. King (2002). "Homology induction: the use of machine learning to improve sequence similarity searches." BMC Bioinformatics 3: 11.

Kelder, J., P. D. Grootenhuis, et al. (1999). "Polar molecular surface as a dominating determinant for oral absorption and brain penetration of drugs." Pharm Res 16(10): 1514-1519.

Kenny, B. A., M. Bushfield, et al. (1998). "The application of high-throughput screening to novel lead discovery." Prog Drug Res 51: 245-269.

Khurana, A., K. Nakayama, et al. (2006). "Regulation of the ring finger E3 ligase Siah2 by p38 MAPK." J Biol Chem 281(46): 35316-35326.

120 Kim, C. J., Y. G. Cho, et al. (2004). "Inactivating mutations of the Siah-1 gene in gastric cancer." Oncogene 23(53): 8591-8596.

Klees, J. E. and R. Joines (1997). "Occupational health issues in the pharmaceutical research and development process." Occup Med 12(1): 5-27.

Koegl, M., T. Hoppe, et al. (1999). "A novel ubiquitination factor, E4, is involved in multiubiquitin chain assembly." Cell 96(5): 635-644.

Kontoyianni, M., L. M. McClellan, et al. (2003). "Evaluation of Docking Performance: Comparative Data on Docking Algorithms." Journal of Medicinal Chemistry 47(3): 558-565.

Kramer, O. H., R. H. Stauber, et al. (2013). "SIAH proteins: critical roles in leukemogenesis." Leukemia 27(4): 792-802.

Lahana, R. (1999). "How many leads from HTS?" Drug Discov Today 4(10): 447-448.

Langer, T. and R. D. Hoffmann (2001). "Virtual screening: an effective tool for lead structure discovery?" Curr Pharm Des 7(7): 509-527.

Laskowski, R. A., M. W. MacArthur, et al. (1993). "PROCHECK: a program to check the stereochemical quality of protein structures." Journal of Applied Crystallography 26(2): 283-291.

Leach, A. R. and H. Jhoti. (2007). "Structure-based Drug Discovery. Berlin." Springer(ISBN 1-4020-4406-2).

Lesk, A. M. and C. Chothia (1980). "How different amino acid sequences determine similar protein structures: the structure and evolutionary dynamics of the globins." J Mol Biol 136(3): 225-270.

Lipinski, C. A., F. Lombardo, et al. (2001). "Experimental and computational approaches to estimate solubility and permeability in drug discovery and development settings." Adv Drug Deliv Rev 46(1-3): 3-26.

Lipinski, C. A., F. Lombardo, et al. (2012). "Experimental and computational approaches to estimate solubility and permeability in drug discovery and development settings." Adv Drug Deliv Rev.

Liu, J., J. Stevens, et al. (2001). "Siah-1 mediates a novel beta-catenin degradation pathway linking p53 to the adenomatous polyposis coli protein." Mol Cell 7(5): 927-936.

Lorick, K. L., J. P. Jensen, et al. (1999). "RING fingers mediate ubiquitin- conjugating enzyme (E2)-dependent ubiquitination." Proceedings of the National Academy of Sciences of the United States of America 96(20): 11364-11369.

121 Madsen, U., Krogsgaard-Larsen., et al. (2002). "Textbook of Drug Design and Discovery, Washington, DC: Taylor & Francis." (ISBN 0-415-28288- 8).

Malz, M., A. Aulmann, et al. (2012). "Nuclear accumulation of seven in absentia homologue-2 supports motility and proliferation of liver cancer cells." Int J Cancer 131(9): 2016-2026.

Marshall, G. R. (1987). "Computer-aided drug design." Annu Rev Pharmacol Toxicol 27: 193-213.

Matsuzawa, S., S. Takayama, et al. (1998). "p53-inducible human homologue of Drosophila seven in absentia (Siah) inhibits cell growth: suppression by BAG-1." EMBO J 17(10): 2736-2747.

Matsuzawa, S. I. and J. C. Reed (2001). "Siah-1, SIP, and Ebi collaborate in a novel pathway for beta-catenin degradation linked to p53 responses." Mol Cell 7(5): 915-926.

Maxwell, P. H., M. S. Wiesener, et al. (1999). "The tumour suppressor protein VHL targets hypoxia-inducible factors for oxygen-dependent proteolysis." Nature 399(6733): 271-275.

McDonald, N. A. and W. L. Jorgensen (1998). "Development of an All-Atom Force Field for Heterocycles. Properties of Liquid Pyrrole, Furan, Diazoles, and Oxazoles." The Journal of Physical Chemistry B 102(41): 8049-8059.

McGinnis, S. and T. L. Madden (2004). "BLAST: at the core of a powerful and diverse set of sequence analysis tools." Nucleic Acids Res 32(Web Server issue): W20-25.

McMahon, M., K. Itoh, et al. (2003). "Keap1-dependent proteasomal degradation of transcription factor Nrf2 contributes to the negative regulation of antioxidant response element-driven gene expression." The Journal of biological chemistry 278(24): 21592-21600.

McPherson, A. (2009). "Introduction to Macromolecular Crystallography." 2nd Edition, Wiley, Newyork.

Moller, A., C. M. House, et al. (2009). "Inhibition of Siah ubiquitin ligase function." Oncogene 28(2): 289-296.

Muegge, I. and M. Rarey (2001). "Reviews in Computational Chemistry (eds Lipkowitz, K. B. & Boyd, D. B)." Wiley–VCH, New York 17: 1-60.

Munoz, V. and W. A. Eaton (1999). "A simple model for calculating the kinetics of protein folding from three-dimensional structures." Proceedings of the National Academy of Sciences of the United States of America 96(20): 11311-11316.

122 Murata, S., H. Yashiroda, et al. (2009). "Molecular mechanisms of proteasome assembly." Nature reviews. Molecular cell biology 10(2): 104-115.

Nadeau, R. J., J. L. Toher, et al. (2007). "Regulation of Sprouty2 stability by mammalian Seven-in-Absentia homolog 2." J Cell Biochem 100(1): 151-160.

Nakayama, K., I. J. Frew, et al. (2004). "Siah2 regulates stability of prolyl- hydroxylases, controls HIF1alpha abundance, and modulates physiological responses to hypoxia." Cell 117(7): 941-952.

Nakayama, K., S. Gazdoiu, et al. (2007). "Hypoxia-induced assembly of prolyl hydroxylase PHD3 into complexes: implications for its activity and susceptibility for degradation by the E3 ligase Siah2." The Biochemical journal 401(1): 217-226.

Nakayama, K., J. Qi, et al. (2009). "The ubiquitin ligase Siah2 and the hypoxia response." Mol Cancer Res 7(4): 443-451.

Nakayama, K. I. and K. Nakayama (2006). "Ubiquitin ligases: cell-cycle control and cancer." Nat Rev Cancer 6(5): 369-381.

Nardinocchi, L., R. Puca, et al. (2009). "Transcriptional regulation of hypoxia- inducible factor 1alpha by HIPK2 suggests a novel mechanism to restrain tumor growth." Biochim Biophys Acta 1793(2): 368-377.

Nemani, M., G. Linares-Cruz, et al. (1996). "Activation of the human homologue of the Drosophila sina gene in apoptosis and tumor suppression." Proceedings of the National Academy of Sciences of the United States of America 93(17): 9039-9042.

Notredame, C., D. G. Higgins, et al. (2000). "T-Coffee: A novel method for fast and accurate multiple sequence alignment." J Mol Biol 302(1): 205-217.

Ooms, F. (2000). "Molecular modeling and computer aided drug design. Examples of their applications in medicinal chemistry." Curr Med Chem 7(2): 141-158.

Oprea, T. I. and H. Matter (2004). "Integrating virtual screening in lead discovery." Curr Opin Chem Biol 8(4): 349-358.

Pearl, F. M., N. Martin, et al. (2001). "A rapid classification protocol for the CATH Domain Database to support structural genomics." Nucleic Acids Res 29(1): 223-227.

Pearson, W. R. and D. J. Lipman (1988). "Improved tools for biological sequence comparison." Proceedings of the National Academy of Sciences of the United States of America 85(8): 2444-2448.

Pickart, C. M. (2001). "Mechanisms underlying ubiquitination." Annu Rev Biochem 70: 503-533.

123 Pioszak, A. A. and H. E. Xu (2008). "Molecular recognition of parathyroid hormone by its G protein-coupled receptor." Proc Natl Acad Sci U S A 105(13): 5034-5039.

Polekhina, G., C. M. House, et al. (2002). "Siah ubiquitin ligase is structurally related to TRAF and modulates TNF-alpha signaling." Nature structural biology 9(1): 68-75.

Ponder, J. W. and D. A. Case (2003). "Force fields for protein simulations." Adv Protein Chem 66: 27-85.

Privalov, P. L. (1989). "Thermodynamic problems of protein structure." Annu Rev Biophys Biophys Chem 18: 47-69.

Qi, J., H. Kim, et al. (2013). "Regulators and Effectors of Siah Ubiquitin Ligases." Cell biochemistry and biophysics.

Qi, J., K. Nakayama, et al. (2010). "Siah2-dependent concerted activity of HIF and FoxA2 regulates formation of neuroendocrine phenotype and neuroendocrine prostate tumors." Cancer Cell 18(1): 23-38.

Qi, J., K. Nakayama, et al. (2008). "The ubiquitin ligase Siah2 regulates tumorigenesis and metastasis by HIF-dependent and -independent pathways." Proc Natl Acad Sci U S A 105(43): 16713-16718.

Qi, J., M. Tripathi, et al. (2013). "The E3 ubiquitin ligase Siah2 contributes to castration-resistant prostate cancer by regulation of androgen receptor transcriptional activity." Cancer Cell 23(3): 332-346.

R., W., G. Y., et al. (2000). "LigBuilder: A Multi-Purpose Program for Structure-Based Drug Design." Journal of Molecular Modeling 6(7-8): 498–516.

Raha, K., A. M. Wollacott, et al. (2000). "Prediction of amino acid sequence from structure." Protein Sci 9(6): 1106-1119.

Rice, P., I. Longden, et al. (2000). "EMBOSS: the European Molecular Biology Open Software Suite." Trends Genet 16(6): 276-277.

Richardson, P. G., C. Mitsiades, et al. (2006). "Bortezomib: proteasome inhibition as an effective anticancer therapy." Annu Rev Med 57: 33- 47.

Sali, A. and T. L. Blundell (1993). "Comparative protein modelling by satisfaction of spatial restraints." J Mol Biol 234(3): 779-815.

Santelli, E., M. Leone, et al. (2005). "Structural analysis of Siah1-Siah- interacting protein interactions and insights into the assembly of an E3 ligase multiprotein complex." The Journal of biological chemistry 280(40): 34278-34287.

124 Sarkar, T. R., S. Sharan, et al. (2012). "Identification of a Src tyrosine kinase/SIAH2 E3 ubiquitin ligase pathway that regulates C/EBPdelta expression and contributes to transformation of breast tumor cells." Mol Cell Biol 32(2): 320-332.

Scapin, G. (2006). "Structural biology and drug discovery." Curr Pharm Des 12(17): 2087-2097.

Schächter, V. (2002). "Protein-interaction networks: from experiments to analysis." Drug Discovery Today 7(11): S48-S54.

Scheffner, M., U. Nuber, et al. (1995). "Protein ubiquitination involving an E1-E2-E3 enzyme ubiquitin thioester cascade." Nature 373(6509): 81- 83.

Schmidt, R. L., C. H. Park, et al. (2007). "Inhibition of RAS-mediated transformation and tumorigenesis by targeting the downstream E3 ubiquitin ligase seven in absentia homologue." Cancer research 67(24): 11798-11810.

Schneider, G. and U. Fechner (2005). "Computer-based de novo design of drug-like molecules." Nat Rev Drug Discov 4(8): 649-663.

Schulman, B. A., A. C. Carrano, et al. (2000). "Insights into SCF ubiquitin ligases from the structure of the Skp1-Skp2 complex." Nature 408(6810): 381-386.

Schwartz, A. L. and A. Ciechanover (1999). "The ubiquitin-proteasome pathway and pathogenesis of human diseases." Annu Rev Med 50: 57- 74.

Schwikowski, B., P. Uetz, et al. (2000). "A network of protein-protein interactions in yeast." Nat Biotechnol 18(12): 1257-1261.

Semenza, G. L. (2007). "Hypoxia-inducible factor 1 (HIF-1) pathway." Sci STKE 2007(407): cm8.

Seta, K. A., Z. Spicer, et al. (2002). "Responding to hypoxia: lessons from a model cell line." Sci STKE 2002(146): re11.

Shah, M., J. L. Stebbins, et al. (2009). "Inhibition of Siah2 ubiquitin ligase by vitamin K3 (menadione) attenuates hypoxia and MAPK signaling and blocks melanoma tumorigenesis." Pigment Cell Melanoma Res 22(6): 799-808.

Shah, M., J. L. Stebbins, et al. (2009). "Inhibition of Siah2 ubiquitin ligase by vitamin K3 (menadione) attenuates hypoxia and MAPK signaling and blocks melanoma tumorigenesis." Pigment cell & melanoma research 22(6): 799-808.

Shakhnovich, E. I. (1997). "Theoretical studies of protein-folding thermodynamics and kinetics." Curr Opin Struct Biol 7(1): 29-40.

125 Shaw, A. T., A. Meissner, et al. (2007). "Sprouty-2 regulates oncogenic K-ras in lung development and tumorigenesis." Genes Dev 21(6): 694-707.

Shoichet, B. K. (2004). "Virtual screening of chemical libraries." Nature 432(7019): 862-865.

Smith, T. F. and M. S. Waterman (1981). "Identification of common molecular subsequences." J Mol Biol 147(1): 195-197.

Sumathi, K., P. Ananthalakshmi, et al. (2006). "3dSS: 3D structural superposition." Nucleic Acids Res 34(Web Server issue): W128-132.

Taketomi, H., Y. Ueda, et al. (1975). "Studies on protein folding, unfolding and fluctuations by computer simulation. I. The effect of specific amino acid sequence represented by specific inter-unit interactions." Int J Pept Protein Res 7(6): 445-459.

Tang, A. H., T. P. Neufeld, et al. (1997). "PHYL acts to down-regulate TTK88, a transcriptional repressor of neuronal cell fates, by a SINA- dependent mechanism." Cell 90(3): 459-467.

Tang, Y. T. and G. R. Marshall (2011). "Methods in Molecular Biology." DRUG DISCOVERY AND DISCOVERY 716: 1-22.

Thompson, J. D., T. J. Gibson, et al. (2002). "Multiple sequence alignment using ClustalW and ClustalX." Curr Protoc Bioinformatics Chapter 2: Unit 2 3.

Thompson, J. D., T. J. Gibson, et al. (1997). "The CLUSTAL_X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools." Nucleic Acids Res 25(24): 4876-4882.

Thrower, J. S., L. Hoffman, et al. (2000). "Recognition of the polyubiquitin proteolytic signal." EMBO J 19(1): 94-102.

Valkov, E., T. Sharpe, et al. (2012). "Targeting protein-protein interactions and fragment-based drug discovery." Topics in current chemistry 317: 145-179.

Veber, D. F., S. R. Johnson, et al. (2002). "Molecular properties that influence the oral bioavailability of drug candidates." J Med Chem 45(12): 2615- 2623.

Vedani, A. (1991). "[Computer-Aided Drug Design: An Alternative to Animal Testing in the Pharmacological Screening]." ALTEX 8(1): 39-60.

Veselovsky, A. V. and A. S. Ivanov (2003). "Strategy of computer-aided drug design." Curr Drug Targets Infect Disord 3(1): 33-40. von Itzstein, M., W. Y. Wu, et al. (1993). "Rational design of potent sialidase- based inhibitors of influenza virus replication." Nature 363(6428): 418- 423.

126 Wang, R., L. Lai, et al. (2002). "Further development and validation of empirical scoring functions for structure-based binding affinity prediction." J Comput Aided Mol Des 16(1): 11-26.

Wang, Y., J. Xiao, et al. (2009). "PubChem: a public information system for analyzing bioactivities of small molecules." Nucleic Acids Res 37(Web Server issue): W623-633.

Wen, Y. Y., Z. Q. Yang, et al. (2010). "The expression of SIAH1 is downregulated and associated with Bim and apoptosis in human breast cancer tissues and cells." Mol Carcinog 49(5): 440-449.

Wiederstein, M. and M. J. Sippl (2007). "ProSA-web: interactive web service for the recognition of errors in three-dimensional structures of proteins." Nucleic Acids Res 35(Web Server issue): W407-410.

Wlodawer, A. and J. Vondrasek (1998). "Inhibitors of HIV-1 protease: a major success of structure-assisted drug design." Annu Rev Biophys Biomol Struct 27: 249-284.

Wong, C. S. and A. Moller (2013). "Siah: a promising anticancer target." Cancer research 73(8): 2400-2406.

Wong, C. S., J. Sceneay, et al. (2012). "Vascular normalization by loss of Siah2 results in increased chemotherapeutic efficacy." Cancer Res 72(7): 1694-1704.

Wong, C. S., J. Sceneay, et al. (2012). "Vascular normalization by loss of Siah2 results in increased chemotherapeutic efficacy." Cancer research 72(7): 1694-1704.

Yang, A. S. (2002). "Structure-dependent sequence alignment for remotely related proteins." Bioinformatics 18(12): 1658-1665.

Yevich, J. P. (1996). Drug development from discovery to marketing. Australia.

Yoshibayashi, H., H. Okabe, et al. (2007). "SIAH1 causes growth arrest and apoptosis in hepatoma cells through beta-catenin degradation- dependent and -independent mechanisms." Oncol Rep 17(3): 549-556.

Zheng, N., P. Wang, et al. (2000). "Structure of a c-Cbl-UbcH7 complex: RING domain function in ubiquitin-protein ligases." Cell 102(4): 533- 539.

Zvelebil, M. J., G. J. Barton, et al. (1987). "Prediction of protein secondary structure and active sites using the alignment of homologous sequences." J Mol Biol 195(4): 957-961.

127