
Combinatorial Approaches to Study Protein Stability: Design and Application of Cell-Based Screens to Engineer Tumor Suppressor Proteins

DISSERTATION

Presented in Partial Fulfillment of the Requirements for the Degree Doctor of Philosophy in the Graduate School of the Ohio State University

Brinda Ramasubramanian, M.Sc.

Graduate Program in Chemistry

The Ohio State University

2012

Dissertation Committee:

Thomas J Magliery, Advisor

Ross E Dalbey

Karin Musier Forsyth

Brinda Ramasubramanian

2012

Abstract

Tumor suppressor protein p53 is a transcription activation factor that is found mutated in more than 50 percent of human cancers. Despite its pathological significance, there is no robust, in vivo, bacterial screen to select for functional mutants of p53. We have developed a transcription interference screen for the p53 core domain based on an artificial p53-responsive lac operon controlling the expression of GFPuv in the host plasmid pGFPuv. The operator region of the lac promoter was replaced with a library of p53 binding elements based on the consensus sequence (RRRCWWGYYY)2. Wild type or wt-like p53 binds to this site and blocks the polymerase, leading to a non-fluorescent phenotype. p53 Quad was expressed under the control of another promoter. Its host plasmid, pACBAD-p53, can be co-maintained in the cell with pGFPuv. Known hotspot mutants of p53, V143A, R175H, R249S and R273H, were constructed by overlap PCR and expressed from the pACBAD-p53 host plasmid. Our results show a marked decrease in the fluorescence of pGFPuv when co-transformed with p53-Quad (wt-like) and higher fluorescence when co-transformed with the hotspot mutants of p53. We have successfully designed a screen that discriminates between p53 variants that show DNA binding activity and variants that bind DNA weakly or not at all. Our screen provides a simple and quick method to screen for stable and functional variants of p53, with function correlated to stability. The validity of the screen was further proved by biophysical characterization of stable and functional variants resulting from the screen. Further, the screen can potentially be used in combination with small molecule libraries for identifying lead compounds that may stabilize p53.

BRCA1 is another tumor suppressor protein for which mutations are implicated in familial cancers, and its function is governed by its ability to form complexes with its various interacting partners. Its ability to interact with BARD1 through its RING finger domain to form a heterodimeric complex is particularly important for its function. In vivo analysis of a comprehensive list of cancer-associated mutations of BRCA1 using a split GFP complementation assay indicates that the interface between BRCA1 and BARD1 is surprisingly robust to mutations. We further studied this interface in vitro, in the absence of GFP fusion, choosing a few mutants of BRCA1: V11A, M18K, I21V, L52F and R71G. V11, M18 and I21 are at the interface, while L52 and R71 are away from the interface. Among these, only M18K and V11A significantly affected complex formation. Since these proteins are soluble only as a complex, we utilized this property to analyze the interaction in vitro. Co-expression of the various mutants of BRCA1 with wild type BARD1 using compatible plasmids with orthogonal tags showed that, at constant expression levels of BRCA1, the amount of the complex that can be purified from the soluble fraction is a function of the interaction between BRCA1 and BARD1. Using in vitro analysis to complement the in vivo results, we were able to decisively prove that this interface is indeed quite robust to mutations.


Dedication

To my family, Shiva and Sahana

My parents, my brother and my sister

Acknowledgements

I have come across many individuals who have encouraged and supported me and given me the resilience to work towards this degree. I would not have found myself here, on the verge of receiving my degree, without their contributions. I would like to thank one and all for their efforts.

First and foremost I would like to thank my advisor Dr. Thomas Magliery for giving me an opportunity to work in his lab. He is an incredible teacher and I have been fortunate to learn from him. He achieves the perfect balance of helping when required but allowing independent thought at the same time. He has provided encouragement and support throughout my time here and has helped with my project whenever I needed it. It would not have been possible for me to work on the challenging aspects of this project without his input and expertise.

I am highly appreciative of various Chemistry personnel, especially Judy Brown and Jennifer Hambach, for the various timely reminders, absolute patience and incredible helpfulness. I am grateful to Dr. Sean Taylor for allowing me to work in his lab as a volunteer researcher and for encouraging me to pursue a graduate career. I also thank my various committee members, Dr. Ross Dalbey, Dr. Karin Musier-Forsyth, Dr. Christopher Jaroniec and Dr. Kimerly Powell, for taking time from their busy schedules and offering valuable suggestions.

As important as the work are the people one works with. I would like to thank my colleagues at Hindustan Lever Research Center, Bangalore for supporting me in my career ambitions. I would like to thank Matt Heberling, Ely Porter, Luke Smith, Grace Cooper and David Bowles for the contributions that they have made to my project. Matt joined the Magliery lab a few months after I did and we learnt together the various facets of working with this project. Even before Ely joined the lab as a researcher, he impressed me as a quiet and smart student. When he joined the lab I found that he is very outgoing, funny and a very quick learner. He was able to make good progress in the project that he worked on in addition to offering timely help for my late night experiments. It was equally fun to work with Luke. He was very sincere and started working on a somewhat different and challenging aspect of the project. Grace Cooper, who spent one summer doing research with me, was quiet and very productive and was a pleasure to work with. I am glad that David Bowles is continuing with this project and I wish him all success.

I have been fortunate to work with a dynamic and intelligent group without whom my years here would have been less productive and less fun. I thank all Magliery lab members for their support and help. I thank Rachel Baldauff, Shila Sen, Chau Ngyuen, Danielle Williams, Sarah Johnston, Nishanthi Paneerselvam, David Mata, Ted Schoenfeldt, Tran Ngyuen, Kimberly Stephany, George Matic and Nick Callahan for contributing to the friendly atmosphere of the lab, for the various insightful chats on science and life and for being entertaining, most of the time without even realizing it. I thank Christina Harsch and Mohosin Sarkar for making me feel welcome during my initial days in the lab. Of the various people who helped me with my work, Dr. Jason Lavinder, who was a graduate student at the time, deserves special mention. Jason seemingly believed in the “perfect” experimental setup and I was fortunate to have an opportunity to learn from him. He is also one of the happiest people I have met and his enthusiasm made working in the lab and putting in long hours seem pleasant. I also have to thank Jason and Sanjay for my present, although meager, knowledge of American pop culture.

I am grateful for the many friends that I have made during my years here. I thank Dr. Vivekanand Shete for helping me with various instruments and for words of encouragement at the times that I needed them. Dr. Lihua Nie sets an example for hard work and I am fortunate to have worked beside her. I thank her for helping me out numerous times over weekends. I thank Srividya Murali for being her dignified self. I am grateful to Brandon Sullivan for being an awesome co-worker and a great friend. During my first quarter in the Magliery lab, Brandon asked me “How long is your rotation going to last?” My ‘rotation’ in the lab lasted for over six years and we built a friendship which I know will last for the years to come. Apart from his proactive helpfulness in the lab, his sunny personality often acts as a stress relief. It is hard for me to imagine the Magliery lab without him and I am glad that we will receive our doctoral degrees around the same time. I am fortunate to have worked with Venuka Durani, who has donned the roles of a great friend, valuable coworker, and my personal counselor in a seamless fashion. We have shared the jinxed ‘β-sheet bench space’ for over five years and our projects have had ups and downs over the course of this time. I could always count on her to be there for the countless low moments in my project and offer suggestions and reassurance. I thank her for the smiles, inside jokes, and invaluable memories which go beyond the confines of the lab.

“Home is the place you complain the most and are treated the best”. I am fortunate to have a loving family and I thank my parents for believing in me and supporting me in my various decisions. My advisor says a PhD is a lifestyle where our lives are dictated by growing cells and ongoing experiments, and I have come to accept that. It can be challenging for someone who is not a graduate student to accept that lifestyle. I am lucky to have Shiva in my life, who has gracefully accepted this lifestyle and proved to be a tower of support. He has encouraged me to keep going and not falter in the face of difficulties. I would not have been able to complete this degree without his loving support. I appreciate Sahana, my daughter, for being pure joy and for her ability to make me forget the world when I am with her. She is the best entertainment I could ever ask for and I am thankful that she is part of our life for now and ever.


Vita

March 1994 ...... Nirmala High School, Aluva, India

1999 ...... B.S. Chemistry, Mahatma Gandhi University

2001 ...... M.Sc. Applied Chemistry, NIT Tiruchirapalli

2003 ...... Research Assistant, Hindustan Lever Research Centre, Bangalore

2005 to present ...... Graduate Teaching Associate, Department of Chemistry, The Ohio State University

Fields of Study

Major Field: Chemistry

Specialization: Biological Chemistry

Table of Contents

Abstract

Dedication

Acknowledgements

Vita

Table of Contents

List of Figures

Chapter 1: Introduction

1.1 Importance of protein stability

1.2 Protein Engineering

1.3 Screens and selections for folded proteins

1.4.1 Survey of Characterization of the DNA Binding Domain of p53

1.4.2 Biophysical Characterization

1.4.3 Characterization of the Response Elements of p53 Core Domain

1.5 Rescue strategies for p53C and mutants

1.6 Identification of functional mutants of p53 using genetic screens

1.7 Thesis Synopsis

Chapter 2: A cell based screen for the functional core domain variants of tumor suppressor protein p53

Contributions

2.1 Summary

2.2 Introduction

2.3 A functional screen for p53 core domain

2.3.1 Optimization of Growth Conditions

2.4 Proof of principle using known hotspot mutant

2.5 P53 responsive lac operon with robust transcription using a combinatorial library

2.6 Discussion

2.7 Materials and Methods

2.7.1 Construction of Reporter Plasmid – pGFPuv BDx

2.7.2 Construction of Expression Plasmid – pACBADp53

2.7.3 Choosing the Reporter Plasmid from a Library of positives

Chapter 3: Utilizing the cell based screen to identify functional p53 mutations

Contributions

3.1 Summary

3.2 Significance of studying libraries for core directed design

3.3 Core randomized libraries of p53 DNA binding domain

3.3.1 Four-position Library

3.3.2 AA and TI Sub-libraries

3.4 In vitro characterization of library variants

3.4.1 Stability Measurements using Urea Denaturation

3.4.2 Thermal Melts Monitored Using Circular Dichroism

3.4.3 DNA Binding Using Fluorescence Anisotropy

3.5 Discussion

3.6 Materials and Methods

3.6.1 Construction of Libraries

3.6.2 Protein Expression and Purification

3.6.3 Chemical Denaturation Using Urea Monitored by Fluorescence

3.6.4 Thermal Denaturation Monitored by Circular Dichroism

3.6.5 DNA Binding Studies Using Fluorescence Anisotropy

Chapter 4: Engineering the S7/S8 loop of the p53 core domain to improve stability

Contributions

4.1 Summary

4.2 Introduction

4.3 Study of p53 in C. elegans as a potential method to stabilize human p53

4.4 Experimental studies to parallel the in silico results

4.6 Future work

4.7 Materials and Methods

4.7.1 Cloning and Expression of Chimera Variants

4.7.2 Screening for Positives

4.7.3 Protein Expression and Purification

4.7.4 Urea Mediated Chemical Denaturation Monitored by Fluorescence

Chapter 5: In vivo and in vitro studies of cancer associated mutations in BRCA1

Contributions

5.1 Summary

5.2 Introduction

5.3.1 Study of Cancer-associated Mutants of BRCA1 for their interaction with BARD1

5.3.2 In vitro Analysis of Binding Interaction

5.4 Future Directions

5.5 Materials and Methods

5.5.1 Construction of BRCA1 Mutants

5.5.2 Screening for Positives

5.5.3 Affinity purification of fusion protein and interaction partners

5.5.4 Western Blots Using Anti-HA Antibody

5.5.5 Purification of BRCA1/BARD1 Complex

Chapter 6: Extensions to the p53 bacterial screen

6.1 Introduction

6.2 Optimization of SICLOPPS Expression from pGFPuv-BDI

6.3 Engineering the pEC vector to express SICLOPPS

6.5 Materials and Methods

6.5.1 Construction of SICLOPPS Library

6.5.2 Re-engineering the pEC Vector

6.5.3 Screening in BL21 (DE3) Cells

Chapter 7: Materials and Methods

7.1 Materials

7.2 Methods

7.2.1 Molecular Cloning

7.2.2 Ligations and Transformations

List of Figures

Figure 1: Domain organization and structure of p53

Figure 2: p53 pathway

Figure 3: Structure of the core domain of p53

Figure 4: Model for DNA binding of p53 as a tetramer

Figure 5: Binding of full length p53 to its DNA

Figure 6: Conformational change between dimer to tetramer

Figure 7: Properties of the species formed during denaturation

Figure 8: Hotspot mutations on the core domain

Figure 9: Regulatory network of p53

Figure 10: Rescue of p53 by binding to MDM2

Figure 11: URA3 dependent screen for p53 in Yeast

Figure 12: Red white screening in yeast

Figure 13: A schematic representation of the screen

Figure 14: Schematic representation of the screen

Figure 15: Initial optimizations of the screen

Figure 16: Mutant Properties

Figure 17: Optimization of conditions for screening

Figure 18: Selection of variants from the binding domain library

Figure 19: Screen with improved dynamic range

Figure 20: Sequence of Quad mutant

Figure 21: Results from AATI library

Figure 22: Results from AA library

Figure 23: Positives from TI library

Figure 24: Negatives from TI library

Figure 25: Actives from T253 I255 library

Figure 26: Wavelength scans for the different variants

Figure 27: Characterization of Actives

Figure 28: CD wavelength scans for TI library variants

Figure 29: Data from TI variants characterizations

Figure 30: Binding studies of the TI library variants

Figure 31: Proposed TIVA library for the p53 core

Figure 32: Schematic representation of the library cloning scheme

Figure 33: human p53C aligned with Cep1

Figure 34: A comparison of the core domain structures of human and worm p53

Figure 35: Screening results and designed variants

Figure 36: Urea mediated denaturation of 4ΔL+EGG

Figure 37: Analysis of the S7S8 loop

Figure 38: Proposed mutations to study the S7S8 loop

Figure 39: Structural network organization of BRCA1 and BARD1

Figure 40: Initial set of mutations studied by Sarkar and Magliery

Figure 41: Cancer associated mutations on BRCA1

Figure 42: Cloning and screening of the cancer associated mutants of BRCA1

Figure 43: Vectorology for in vitro expression of the BRCA1/BARD1 complex

Figure 44: Screening and in vivo results for a selected set of variants

Figure 45: Vector map with HA tag for western blots

Figure 46: Principle of production of SICLOPPS

Figure 47: Vector maps screening for active p53 in presence of SICLOPPS

Figure 48: Optimized screening conditions in BL21

Figure 49: Comparison of DH10B and BL21 Cell lines

Figure 50: pEC vector and copy number

Figure 51: Proposed SICLOPPS in pEC vector

Figure 52: A comparison of the Quad and the Hexa structures

Chapter 1: Introduction

1.1 Importance of protein stability

Understanding protein folding and the basis of protein stability is of paramount importance, as it enables us to design or redesign proteins for pharmacological and industrial applications. Protein-based therapeutics, including engineered antibodies, are becoming an effective way of treating many diseases including diabetes mellitus, rheumatoid arthritis and thrombocytopenia.2 Successful design of such therapeutics relies on a detailed understanding of the structure of proteins and the forces behind it. The biochemical applications of proteins as analytical tools and enzymes are hugely improved with the stability of proteins, and hence a large effort is oriented towards improving the stability of proteins with pharmaceutical applications in order to tailor these proteins for the harsh conditions that industrial processing often requires. In addition, a higher stability also renders the protein more tolerant to point mutations, as such mutations are less likely to push it below the minimum threshold of stability. Therefore, understanding the sequence-structure relationship of proteins and using the collective knowledge to engineer proteins of improved stability and function has been an area of widespread interest.

Proteins fold into highly ordered structures, forming α-helices or β-sheets, to exclude surrounding solvent. This hydrophobic collapse gives rise to modest net stabilization of the protein. The large contributing factors (enthalpic gain due to van der Waals interactions, hydrogen bonds, salt bridges and other electrostatic interactions of the protein; entropic gain of the solvent, typically water; and entropic loss due to the resultant ordered structure) nearly cancel, and the small difference between these large opposing forces leads to the marginal stabilization of most proteins. Most proteins are stabilized by only 5-15 kcal mol⁻¹ at physiological temperatures.3 This marginal stabilization has profound implications for their response to mutations. A large number of pathological mutations in proteins render them non-functional simply due to reduced stability. The stability of a protein determines its ability to resist proteolysis and carry out its biological function. Protein stability is also known to play a significant role in evolution. Although evolution optimizes for function, a protein’s fitness and evolution do depend at least to some extent on its stability.4 It is therefore of immense value to be able to predict the effect of various mutations on the structure and stability of proteins, and this has been an area of research for the past four decades.
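To make the consequence of marginal stability concrete, the following is a minimal sketch that converts a free energy of unfolding into the fraction of folded molecules. It assumes a simple two-state equilibrium between native and unfolded states, which is an illustrative simplification and not a model taken from this work; the function name, the temperature and the example free energies are all hypothetical.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal mol^-1 K^-1


def fraction_folded(dg_unfold_kcal, temp_k=310.15):
    """Fraction of molecules in the native state for a two-state model.

    dg_unfold_kcal is the free energy of unfolding (positive = stable fold).
    """
    # Equilibrium constant K = [unfolded] / [native]
    k_unfold = math.exp(-dg_unfold_kcal / (R_KCAL * temp_k))
    return 1.0 / (1.0 + k_unfold)


# Illustrative values only: a protein stabilized by 5 kcal/mol, and the same
# protein after hypothetical mutations that cost 4 or 6 kcal/mol.
for dg in (5.0, 1.0, -1.0):
    print(f"dG_unfold = {dg:+.1f} kcal/mol -> folded fraction {fraction_folded(dg):.4f}")
```

Under these assumptions, a loss of only a few kcal/mol is enough to move a protein from almost fully folded to largely unfolded, which is the sense in which a marginally stable protein is sensitive to point mutations.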

Based on RNase refolding experiments, Anfinsen proposed that the native structure of a protein, which is in equilibrium with its unfolded state, is the thermodynamic minimum or the lowest energy state in a given environment, and that this structure is predetermined by the amino acid sequence of the protein.5 Since then scientists have devoted themselves to deciphering how the secondary and tertiary folds of a protein are determined by its primary structure. Accurate modeling of the equilibrium between the native and the unfolded states has been a challenge due to the marginal stabilization of proteins. The astronomically large conformational space available even for small proteins further complicates their modeling. This is defined as the protein folding problem.6 If a protein sampled all the conformations available, folding would take an astronomically long time. However, nature achieves the folding of proteins within microsecond timescales,7 and this observation led to Levinthal’s paradox. To find a way around this, folding funnel models have been proposed.8 The protein folding funnel is a simplified representation of the energy landscape over which a protein converges from its disordered, denatured state to its ordered native state. It depicts that when the polypeptide chain has higher conformational freedom, as in the denatured state, it is in a high energy state, and that as the energy decreases, the number of conformations available to the chain also decreases. It also reflects the high entropy of the denatured state and the compact, low entropy state of the folded protein.
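The scale of the search problem behind Levinthal’s paradox can be illustrated with a back-of-the-envelope calculation. The sketch below is only an order-of-magnitude estimate: the chain length, the three conformations per residue and the sampling rate of 10^13 conformations per second are assumed textbook-style values, not figures taken from this work, and the function name is hypothetical.

```python
SECONDS_PER_YEAR = 3600 * 24 * 365


def exhaustive_search_years(n_residues, states_per_residue=3, samples_per_second=1e13):
    """Years needed to visit every conformation once (Levinthal-style estimate)."""
    n_conformations = states_per_residue ** n_residues  # exact Python integer
    return n_conformations / samples_per_second / SECONDS_PER_YEAR


# Assumed example: a 100-residue chain with three states per residue.
years = exhaustive_search_years(100)
print(f"{years:.1e} years to sample every conformation of a 100-residue chain")
```

Even with these generous assumptions the estimate exceeds the age of the universe by many orders of magnitude, which is why a random, exhaustive search cannot be the mechanism of folding.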

Apart from the challenge of accurate energy prediction for a given structure, design strategies to arrive at the optimal sequence for a given structure are also complicated by the observation that proteins with low sequence similarity can adopt similar folds and, conversely, that high sequence similarity can lead to proteins of different folds. This makes accurate prediction of stability a difficult problem for modeling and computational analyses.9 Therefore, predicting the structure that a sequence adopts and deriving the sequence from a particular structure are both challenging. One approach to this problem is to test for sequences that can be tolerated in a particular backbone,10 thus studying the inverse of the protein folding problem. In silico modeling achieves this by fitting a large number of sequences to one structure using energy minimization parameters for the design of a protein.11 The prediction of the tertiary structure of a protein from its primary structure using computational modeling requires a potential energy function, the global minimum of which coincides with the native state of the protein.12 Physics-based methods, which consider a protein at the atomic level, include empirical components in the potential function to enable reasonably effective conformational searching.13 Knowledge-based methods use experimentally derived data to build the empirical component and are often found to be more successful.14; 15; 16 The accuracy of computational structure predictions is vastly improved if they are trained on a larger database of experimental sequences and structures. Mutations introduced into the protein of interest, followed by structural or functional analysis, allow us to decipher the effect of such mutations on the structure and function of the protein and also generate larger datasets for “training” and testing. Combinatorial methods to generate large libraries of variants of the protein, and analysis using screens or selections, improve the throughput of such experiments. In this inverse approach, the tolerance of a particular structure to changes in sequence is analyzed to arrive at the rules of protein folding.17 On the road to understanding proteins, a variety of techniques including high resolution X-ray crystal structures, energy refinement, recombinant DNA technology, molecular dynamics simulations, site directed mutagenesis, protein NMR, etc., have been developed and have led to our collective knowledge of protein folding.18

1.2 Protein Engineering

Understanding the molecular basis of various functions of proteins will enable us to tailor improved functionality to the protein or, in other words, evolve the protein to possess a specific function. Rational design and combinatorial approaches can be used to decipher the fold, function and stability of proteins. Protein engineering approaches have been successful in improving the thermal stability of biocatalysts to allow their use at high temperatures for industrial applications.19 Also, recombinant DNA technologies combined with rational and combinatorial approaches have led to the generation of high affinity antibodies with industrial and pharmaceutical relevance. Protein engineering approaches have also been a powerful tool in the discovery of small molecule drugs that mediate or disrupt specific protein-protein interactions.20 Computational design of proteins using potential energy minimizations is gaining success, especially for small proteins. Using an algorithm based on a backbone template, Dahiyat and Mayo redesigned a Zn-binding sequence to contain no metal binding sites.21 The redesigned protein has only 40% sequence similarity with the parent template and is well folded and soluble. The design of a 93 residue αβ motif using an algorithm that allowed simultaneous optimization of sequence and structure led to the generation of a novel motif. This study demonstrated that such novel architectures can be explored using computational approaches.22 De novo design strategies have evolved to design novel folds and functions and to improve the stability of proteins.23 The advances in the various fields of protein design have allowed us to improve the stability of proteins and impart function, but we are still unable to accurately predict the structure of a sequence or the energetic and functional consequences of mutations to a particular protein.

One of the approaches that can be taken to answer this inverse protein folding problem is to apply combinatorial methods to study large libraries of proteins and to use screens and selections to interrogate the effect of various mutations on the function and stability of a protein. The advances in DNA synthesis allow the synthesis of degenerate codons in oligonucleotides, which can be utilized to generate libraries of genes using PCR-based techniques. Functional screens or selections that link the phenotype to the genotype can be utilized to infer the stability of the protein that resulted from the library, and sequencing allows the identification of the variants. This approach has been successfully used in our lab to design core and loop libraries of the four helix bundle protein ROP to elucidate the determinants of stability of this protein.24; 25 Using a cell-based screen for the function of ROP and high-throughput methods to determine the stability of the variants, the role of various amino acids in positions of interest has been studied and an overall picture of the stability determinants of this protein is emerging. The stability and fold of tumor suppressor proteins like p53 and BRCA1 is another area of immense interest. Many cancer-associated mutations of these proteins are known to be destabilizing, and they render the protein non-functional. Therefore, improving the stability of these proteins is an area of interest for applications to targeted cancer therapy. It is beneficial to accumulate data that provide information about the role of various amino acids in the folding and stability of these proteins. One of the key requirements to assimilate such large datasets is a method to sort a large number of variants in a high-throughput format and report the functionality of the protein. Inferential methods such as genetic screens and selections, which assume that the function of a protein is a consequence of it being well-folded, serve as a powerful tool for this purpose. For example, the cell-based screen for the function of ROP has been an important tool to segregate functional variants from non-functional ones; because the residues important for function are left intact in the various libraries, a functional protein implies foldedness. The functional variants obtained following the screening were analyzed using high-throughput analyses to arrive at the rules of folding of this four helix bundle protein.
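The idea of encoding a library with degenerate codons can be sketched in a few lines. The example below expands a degenerate codon into the concrete codons and amino acids it encodes, using the standard IUPAC nucleotide codes and the standard genetic code. The choice of the NNK codon and of four randomized positions is an assumption made purely for illustration and is not a description of the libraries built in this work; the function names are hypothetical.

```python
from itertools import product

# Standard IUPAC nucleotide codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "W": "AT", "S": "CG", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Standard genetic code, with bases ordered T, C, A, G at each codon position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def expand(degenerate_codon):
    """Return every concrete codon encoded by a three-letter degenerate codon."""
    choices = (IUPAC[base] for base in degenerate_codon.upper())
    return ["".join(codon) for codon in product(*choices)]


def encoded_residues(degenerate_codon):
    """Set of amino acids ('*' marks a stop codon) reachable from the codon."""
    return {CODON_TABLE[codon] for codon in expand(degenerate_codon)}


# Assumed example: NNK at four randomized positions.
nnk = expand("NNK")
print(len(nnk), "codons encoding", len(encoded_residues("NNK") - {"*"}), "amino acids")
print("DNA-level diversity for four NNK positions:", len(nnk) ** 4)
```

A calculation of this kind shows how quickly library size grows with the number of randomized positions, which is why a high-throughput screen or selection is needed to sort the resulting variants.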

Our present understanding of protein structure and stability has been derived from site directed mutagenesis experiments on proteins like T4 lysozyme, λ repressor, the B1 domain of protein G, staphylococcal nuclease, barnase and ROP.26 The majority of these are alpha helical proteins, and most of our knowledge of β-sheet proteins has originated from studies on the GB1 protein. Alpha helices are inherently more robust than β-sheets because of the effective exclusion of the solvent water by the protein backbone in these structures. In comparison, the entropic gain for the surrounding water for β-sheet structures is limited because the side chains are not as shielded in β-sheet structures as in alpha helices.27 This presumably makes alpha helical systems more amenable to mutations and engineering studies, and this has led to a more well-defined understanding of the sequence-structure relationship of helical structures. The groundbreaking studies of Chou and Fasman led to the prediction of β-sheet propensities of various amino acids based on statistical analyses.28 Following this, various studies were conducted, mainly of GB1, to study the role of amino acids at various positions.27; 29; 30; 31 The initial studies were limited by the lack of technology available at the time to synthesize and analyze large numbers of variants. But these initial studies led to the conclusion that the significance and propensity of various amino acids in β-sheets is highly context dependent, which is to say that their role is determined to a large extent by the nature of surrounding residues. Following this, significant progress has been made in studying this protein, mainly using phage display techniques. Various studies indicate that the structural context of each of the residues is a significant factor in determining the role of the mutations at that position. In other words, general rules of the ‘jigsaw puzzle’ or the ‘oil drop’ models of the protein core that can be used to explain the fold and stability of helical proteins cannot be directly used to explain the properties of β-sheet structures. The integrity of β-sheet structures is governed by both short range interactions between residues and long range forces. Setting up a platform to study β-sheet proteins in a high-throughput format to learn the sequence-structure-stability relationship is therefore valuable for furthering our knowledge of the determinants of packing and stability.

We wanted to implement the analysis strategy on a physiologically relevant β-sheet

protein with the goal to gain insight into the sequence structure stability relationship of a

β-sheet protein and at the same time address pharmaceutically relevant questions. We chose to work on tumor suppressor protein p53, the mutants of which are implicated in

more than 50% of human cancers.32 It is a sequence specific transcription activation factor

and therefore allows us to develop genetic screening systems in which the phenotype can

be conveniently linked to the genotype, enabling the analysis of interesting variants.

1.3 Screens and selections for folded proteins

Genotypic screens serve as a key tool in the development of high-throughput platforms to

study various mutants of a protein. The basis of any screen or selection is to link the observed phenotype to its genotype in order to decipher the sequences of proteins with ease from their DNA sequences. The various screens and the principle behind them are reviewed by Magliery and Regan.26 Historically, selections were applied only to those

proteins that were known to be important for cell survival. Examples include selections for

tryptophan synthase, lambda repressor and lac repressor. A folded and functional

tryptophan synthase is required for the growth of cells in media that lacks tryptophan.

Therefore this was used to select for variants of tryptophan synthase that were functional.

Lambda repressor blocks superinfection by lytic phage and again is essential for the survival of the bacteria. On the other hand, the lac repressor dictates the survival of cells in lactose minimal media. The ‘blue white’ screen is one of the earliest and best

known applications of genetic screening. This is based on the lac repressor, which represses

expression of β-galactosidase and thereby blocks the hydrolysis of the chromogenic galactoside bromo-chloro-indolyl-galactopyranoside

(abbreviated as BCIG, popularly known as X-gal), resulting in white colonies, whereas a

non-functional lac repressor would be indicated by blue colonies. This represents one of

the systems that can be used both as a screening system and as a selection. Screening systems

which rely on functions not linked to the survival of the cells allow the analysis of both

functional and non-functional variants of the protein of interest, enabling us to interrogate

the rules of folding and function of that protein in greater depth. This is a distinct

advantage of screening systems over selections, which rely on functions of the protein that

determine the ability of the cell to survive and do not provide direct data about the

non-functional variants. The key challenge for the design of screening systems is to link the

phenotype to the genotype so that analysis of interesting variants is facilitated. The

advances in sequencing of DNA have made it easy and economical to sequence a large

number of variants allowing the study of large libraries of proteins that pass the screen giving the right phenotype.

Following the studies on proteins that have a genetic selection, various methods that rely on the binding function of a protein have been developed to screen for function. The basic premise of such studies is that a protein needs to be folded into the correct conformation and has to be stabilized beyond a minimal threshold for it to be functional. In other words, these inferential methods assume that a functional protein is required to be folded and native-like. Screens using mRNA display, ribosome display, phage display, yeast two-hybrid

systems and bacteria have been developed based on this principle. mRNA display

relies on the fusion of a puromycin-tagged DNA linker to the corresponding mRNA.

During translation, when the ribosome pauses at the RNA-DNA junction, puromycin gets covalently linked to the nascent peptide. Reverse transcription followed by DNA sequencing then yields information on the sequence-structure relationship.33 Although large libraries can be screened using this technique, it is

complicated by the requirement for an in vitro compartmentalization method that involves

making oil droplets. The size of the droplets needs to be controlled with great precision to

ensure that only one variant is enclosed in one droplet. On the other hand, phage display

utilizes the display of the protein of interest on the phage coat protein and overcomes the

compartmentalization issues of mRNA display. Screening is typically based on the

binding of the displayed protein to the target of interest, and the stringency of selection can be

increased by multiple rounds of panning.

Phage display is also one of the few methods that have been adapted to screen purely for

stability or fitness of the protein of interest. The proteins can be screened for their protease

resistance, thus moving away from being an inferential screening method to become a

direct measurement of protein fitness.34 Another method for direct measurement of

fitness of a protein was developed in yeast. Hagihara and Kim have utilized the quality

control mechanisms inherent to yeast to select for soluble peptides formed from random

sequences generated by combinatorial library methods. It has been shown that the

secretion efficiency correlates with the stability of protein and this is another property

that has been used as a screening method.35 One of the disadvantages of this method is that some misfolded proteins were selected and required a secondary characterization

method to validate the technique. Yeast surface display which relies on the fusion of the

protein of interest with a cell wall mating protein on the surface of yeast has been

employed to display folded and stable mammalian peptides. The displayed protein can

then be selected based on various binding assays similar to the techniques used in phage

display. The use of a eukaryotic system facilitates post translational modification and

allows the mammalian proteins to achieve a near native conformation.36; 37; 38; 39 Yeast

n-hybrid systems are well established for deciphering protein-protein interactions.40 A related

method is the bacterial two hybrid system which offers the advantages of being faster and

not requiring nuclear localization.41; 42

Bacterial systems are especially amenable to combinatorial approaches since large

libraries (~10^9) can be generated in bacteria. These systems have been adapted to study

protein-DNA and protein-protein interactions. In a combinatorial context, libraries of

mutants can be studied and residue specific information on the determinants of the

interaction surface can be delineated. In addition, screens that simply monitor the

solubility of the protein of interest have also been developed.43 A reporter protein such as

GFP is fused to the C-terminus of the protein of interest. The folding of the protein of interest leads to the folding of this C-terminal fusion protein and its fluorescence in turn reports the folding of the protein of interest. A selection method based on the solubility of the protein is established when the C-terminal reporter fusion is required for the survival

of the cells. This idea has been implemented in selections based on fusion to antibiotic resistance genes. Maxwell et al. selected for soluble variants of HIV integrase based on the observation that resistance to chloramphenicol is a function of the solubility of the

HIV integrase when expressed as an N-terminal fusion to the chloramphenicol acetyl transferase gene.44 Mansell et al. have recently designed a rapid folding assay for proteins expressed in the bacterial periplasm. They express the protein as a sandwich between an

N-terminal export system and a C-terminal selectable marker, TEM1 β-lactamase. They demonstrate that the folding efficiency of various target proteins correlates directly with in vivo β-lactamase activity and thus dictates survival.45

1.4 Tumor Suppressor protein p53

Ever since its discovery as a 54 kDa (p53) cellular SV40 tumor antigen, extensive research has been conducted to provide us with the information we now have about p53.46 The protein p53 as we know it today is a multidomain protein consisting of an N-terminal transactivation domain, a sequence specific DNA binding domain and a C-terminal domain, which in turn is composed of a tetramerization domain and a terminal regulatory domain. A proline rich region is sandwiched between the N-terminal and DNA binding core domains (Figure 1). The function of p53 is to regulate cell differentiation under conditions of cellular stress, mainly by transcriptional methods in addition to non-transcriptional modes.47 It is a tumor suppressor protein and is found to be mutated in more than 50% of human cancers. A tremendous amount of work, which has resulted in more than 50,000 publications, has changed the view of p53 from a tumor antigen to being called the “guardian of the genome.” Despite the large amount of research being done on p53, we are still far away from completely understanding this protein structurally or functionally. The folded and intrinsically disordered domains function in a concerted fashion, allowing the widespread yet specific DNA binding properties of p53.

Figure 1 Domain organization and structure of p53. a) Domain organization of p53 including the N-terminal transactivation domain, DNA binding core domain and C-terminal tetramerization domain. The vertical lines represent the occurrence of hotspot mutations. b) The structure of the DNA binding domain of p53 (PDB ID 1TSR).48

p53 was initially thought to be an oncogene linked to the viral transformation process since various studies showed that the functions of p53 were closely involved with the viral replication and tumorigenesis by the small DNA tumor viruses. The decade of the

1980s saw some research towards elucidating the cellular function of p53, which led to it being thought of as an oncogene.49; 50; 51; 52 Continued research showed that the variants of p53 in tumor cells are mutated and that these mutations lead to its accumulation in tumor cells; these observations helped the classification of p53 as a tumor suppressor protein.53; 54 The research that followed in the decade of the 1990s further established that it plays a central role in preventing cancer in humans and animals. Malkin et al. showed that inheriting a mutant allele of this gene leads to cancer with 100% penetrance; multiple groups showed that knockout mice developed cancer at a very young age when they were complemented with loss-of-function mutants of p53,55; 56 and up to 50% of human cancers contain mutations in both the alleles of p53, accompanied by aberrantly high levels of protein p53 in tumor cells.59 Germline mutations in the p53 gene are an indicator for Li-Fraumeni syndrome, which is defined as a predisposition towards different kinds of cancer, the onset of which may occur at a very young age.57

Figure 2 p53 pathway. p53 is activated as a result of cellular stress leading to changes in the chromatin structure. Activated p53 accumulates to initiate a cascade of downstream effects ultimately leading to cell cycle arrest, apoptosis, or senescence. Image from Sengupta and Harris (2005).58

The decade of the 1990s also saw substantial research being done to elucidate the structural properties of this protein. Studies to decipher the cellular function of p53 show that it is at the hub of a variety of signaling pathways that control the cell cycle and maintain the integrity of the genome (Figure 2). In response to cellular stress that includes DNA damage and dysregulated growth, the p53 pathway is activated, which eventually leads to cell cycle arrest, apoptosis or senescence. This multifaceted role of p53 is mirrored in its complex and intricate structural biology. Understanding the individual components of the p53 structure, such as the DNA binding domain and the tetramerization domain, has laid the framework for understanding the effects of common cancer mutations. Prives et al. have shown that the manipulation of mutant p53 to bind

DNA can have implications for possible therapeutic applications.60 The DNA binding of wild type p53 is a complex function of its affinity to the DNA and other interacting proteins in addition to the DNA binding properties of competing proteins, and is regulated by phosphorylation, acetylation and other lysine modifications.61 The presence of the tetramerization domain which shows non-specific DNA binding and an unstructured C-terminal regulatory domain adds to the complexity of the picture. This structural intricacy has also proven to be a challenging target for elucidation of the structural basis of the function of p53. As a result many aspects of p53 function still remain elusive.

Figure 3 Structure of the core domain of p53. The structure of the p53 core domain bound to DNA solved by Cho et al. (PDB ID 1TSR) is shown.62 Two of the monomers, shown in cyan and blue, make extensive contacts to the DNA, while the third, shown in warm pink, does not interact with the DNA, but the protein-protein contacts stabilize the complex. The figure is rendered using PyMOL.

1.4.1 Survey of Characterization of the DNA Binding Domain of p53

1.4.1.1 Structural Characterizations

The first step towards elucidating the structural biology of p53 came when the crystal

structure of the DNA bound core domain was reported by the Pavletich group.62 They reported that the core domain consists of residues 102 to 292 and forms a β-sandwich that serves as a scaffold for two large loops and a loop-sheet-helix motif (Figure 3). The two large loops are held together by a tetrahedrally coordinated Zn2+ ion. The sheets in the β-

sandwich are packed face-to-face forming a Greek key topology. Most of the mutations

that inactivate p53 are found to be in the four conserved regions within the core domain.

This first crystal structure showed three monomers of the core domain bound to the

consensus DNA binding sequence, one of which bound at its central region and made

extensive contacts to the bases and the phosphate backbone. The various cancer

associated mutations of p53 were mapped on to the DNA binding domain of this protein.

The solution NMR structure of the core domain, which was solved much later, provides a

picture with only subtle changes from the crystal structure.63 The overall structure, although similar, was found to be far more mobile than expected from the crystal structure. This mobility was mainly contributed by changes in the loop 1 conformations and the presence of unsatisfied hydrogen bond donors in the hydrophobic core of the protein.

Figure 4 Model for DNA binding of p53 as a tetramer. The proposed model shows the monomers making extensive protein-protein interactions (PPIs) as well as protein-DNA interactions to bind to DNA as a tetramer. a and b represent the closed and open configurations of the tetramerization domain, respectively. c and d show the cartoon model based on the individual crystal structures of the core and the tetramerization domains. d is a 90° rotation view of c.64

Following this landmark crystal structure, which paved the way for a variety of biophysical

studies on this protein, a number of crystal structures have been published. These

combine to give us an overall picture of how this protein binds to DNA and the

implications of this binding for its function. p53 exists as a dimer in the unbound form

and its consensus DNA binding sequence provides a scaffold for the protein to

tetramerize as dimer of dimers. A variety of interactions occur in this complex. The

protein-DNA interface is the most conserved and various crystal structures solved

confirm this observation. The protein DNA binding is mediated by the sequences that

comprise the loop-sheet-helix motif and the large loop. The crystal structure of the core

domain from mouse in the absence of DNA reveals that the loop L1 and the C-terminal

end of helix H2 undergo significant structural changes upon binding to DNA.65 The helix

H2 and the loop L1 from the loop-sheet-helix motif pack against the major groove of the

DNA while an arginine residue from the large loop L3 packs against the minor groove of

the DNA. Cho et al.62 and later Kitayner et al.64 proposed a model for the binding of p53

as a tetramer to DNA (Figure 4). Their proposed model was essentially confirmed by the

crystal structures that were solved subsequently. The present picture proposes that the

dimer-dimer interface contacts and protein-protein interactions within the dimer, in

addition to the dimer-DNA interaction surface, contribute to the highly cooperative binding of p53 to DNA. Mutations that lead to the loss of some of the core domain-core domain interactions can still lead to p53-dependent transactivation if the tetramerization domain remains intact,64 as the tetramerization domain contributes to the protein-DNA complex formation.

The dimer contacts involve the H1-S5 loop of one subunit and the S4-H1 (L2) and S6-S7

loops of the other dimer subunit. Zhao et al. have solved the structure of mouse p53 core domain, which is highly homologous to the human p53 with an overall sequence identity of 89%, in the absence of DNA at 2.7 Å resolution.65 Comparison of this free form of the

core domain with the DNA bound form62 indicates that the core domain binds to DNA

via extensive reconfiguration of Loop 1. Also, the physiologically relevant low affinity

dimer of the core domain is in a configuration that is incompatible with simultaneous

binding of both subunits to duplex DNA. The NMR structure of the 58 kDa dimer,

complexed with DNA, shows that the core domain undergoes substantial conformational

changes on binding to DNA.66 Furthermore, the various chemical shift analyses indicate

that the helix 1 and the neighboring G244 regions may form a possible dimerization

interface. In order to obtain the full picture of how the core domain binds as a tetramer

to the DNA, various groups have employed chemical crosslinking between Cys277 and

Cyt18 modified to a cystamine to trap the protein DNA complex using a disulfide

linkage64; 67; 68 to study the molecular basis of the binding of p53 dimer and later the

tetramer to DNA. Recently, Chen et al. have crystallized the self assembled tetramer bound to the full consensus site and their findings explain the significance of the zero base pair separation observed between the half sites and provide further insight into the high cooperativity and kinetic stability of the p53-DNA complex.69 They propose that

p53 core domain forms a planar tetramer complex to bind to the minor groove face of the

DNA. The geometry and symmetry of this tetramer are exquisitely molded to match that of

the major groove of the DNA. The DNA largely remains unchanged upon binding to the

Figure 5 Binding of full length p53 to its DNA. The crystal structure of full-length p53 (PDB ID 3KMD).69 a and b show the tetramer bound to consensus DNA. b shows the envelope form of the protein. c and d show a 90° rotation of a and b respectively. The figures are rendered using PyMOL.

protein but undergoes significant sliding at the central base pairs between the two half

sites. The tetramer forms a trough-like structure to bind to the DNA employing the L2

loop region and the S7S8 turn region for the individual dimers to come together (Figure

5). The loop 1 region is predominantly used in the binding to DNA. The significance of

the S7S8 turn is reflected in the unstable variants that resulted from the mutations of this

region as discussed in Chapter 4. Malecka et al. suggest that the variability observed

between the different structures for the dimer-dimer contact interface may be a function

of the specific sequence of the DNA used.67 As suggested by previous studies, although

a dimer can bind to the DNA, the complex is stabilized by tetramerization of the core

domain and this is reflected in the improved binding and half life of the complex.

Therefore the DNA-induced interactions that lead to the tetramerization of the core

domain play a significant role in the cooperativity of binding of the core domain. The

tetramerization is further assisted by the dedicated tetramerization domain of p53 which is connected to the core domain by a 30 residue linker region which is highly sensitive to

proteolytic digestion. Overall, when the core domain binds to the DNA as a tetramer, the

N-terminal regions face against each other with the DNA lacing through them and the C-

termini point toward one face of the complex, parallel to the axis of the DNA. This

facilitates the positive regulation of the C-terminal tetramerization domain (CTD) and

provides additional stabilization to the sequence specific DNA-core complex by

non-specific electrostatic interactions between the positively charged CTD and the DNA

backbone.
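The full consensus site discussed above, two copies of the half-site RRRCWWGYYY with zero base pairs between them, can be expressed as a simple pattern search. The following is a minimal sketch in Python and not part of the methods used in this work; the function names and the example sequences are hypothetical, and only the IUPAC codes R (A or G), W (A or T) and Y (C or T) are assumed.

```python
import re

# IUPAC nucleotide codes needed for the p53 half-site RRRCWWGYYY.
IUPAC = {"R": "[AG]", "W": "[AT]", "Y": "[CT]", "C": "C", "G": "G"}


def consensus_regex(half_site="RRRCWWGYYY", spacer=0):
    """Build a regex for two half-sites separated by `spacer` arbitrary bases."""
    half = "".join(IUPAC[base] for base in half_site)
    return re.compile(f"{half}[ACGT]{{{spacer}}}{half}")


def find_full_sites(dna, spacer=0):
    """Return start positions (0-based) of non-overlapping full consensus sites."""
    pattern = consensus_regex(spacer=spacer)
    return [match.start() for match in pattern.finditer(dna.upper())]


# Hypothetical example sequences, for illustration only.
site = "GGGCATGTCC" * 2                 # two matching half-sites, no spacer
print(find_full_sites("TT" + site + "AA"))
print(find_full_sites("GGGCATGTCAGGGCATGTCC"))  # first half-site ends in A, not Y
```

The `spacer` argument is included only so that the zero base pair case can be compared with other spacings; the default corresponds to the contiguous half-sites described in the text.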

Tidow et al. have employed a multitechnique approach to solve and model the structure of full length p53. An array of techniques including small angle X-ray scattering (SAXS), electron microscopy (EM) and NMR spectroscopy was used in combination with the various crystal structures solved to determine the structure of the tetrameric human full-length p53.70 The structure solved in this manner agrees well with the model proposed by Kitayner et al. In this study the authors used the optimized spatial arrangement obtained from the SAXS and EM data to fit to the available high resolution crystal structure. They found that there is some difference between the structure of the unligated form and the DNA bound form. The authors propose that the loosely tethered dimers in the unligated form can readily undergo the conformational changes that are required for DNA binding. The unligated p53 predominantly exists in an open conformation of two separate pairs of core domains. One of the dimers binds to the DNA first and the flexible linkers between the core and the tetramerization domains allow the second pair of the core domains to bind to the remaining sites on the DNA, thus burying it within the protein. The model for the existence of loosely tethered dimers in solution gains credibility from various previous studies. The Hill coefficient for the binding of p53 core domain to DNA was found to be 1.8. Since at physiological conditions p53 exists as a dimer, and binds to DNA as a dimer of dimers, this indicates the formation of a highly cooperative complex containing two molecules.71 This was further confirmed when

Veprintsev et al. solved the NMR structure for the full length tetramer.72 NMR serves as a complementary technique to explore the solution effects of the protein that is not constrained by the crystal packing effects. The size of the full length tetramer, confirmed

to be ~170 kDa using analytical ultracentrifugation experiments, is well beyond the

typically prescribed size limit for conventional NMR spectroscopy. In this seminal work

the authors collected high resolution NMR data using 15N-1H HSQC and 15N-1H

TROSY methods for the tetrameric complex and compared them with the spectral data available for the smaller domains. Their results indicated the presence of a self-complementary core domain interaction surface. They confirmed this using mutational analysis of the various charged residues on the putative surface. They propose that the p53 monomers exist as a head-to-head dimer and need to undergo a 70° rotation upon binding to the DNA (Figure 6) in order to facilitate the various interactions with the DNA, which agrees with their previous observation that the dimer configuration in solution is incompatible with binding to the DNA as a tetramer. The biological implication of this dimer of dimers was further analyzed by Natan et al., whose studies show that the association of the dimers to form tetramers is ultraslow in the absence of DNA.73 In addition, hetero-oligomerization in the presence of mutants was found to arise only from homodimers.

This indicates that p53 is targeted for degradation in a timescale that is much faster than

the rate of tetramerization. The tight control on the rates of oligomerization may serve as

an additional control mechanism to regulate the levels of p53 under normal cell

conditions. Under cellular stress, the upregulation of p53 allows for p53 to ‘find’ its DNA

in a more facile manner leading to the formation of highly stable protein DNA

complexes.
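To illustrate what the Hill coefficient of 1.8 quoted above implies for cooperativity, the fraction of DNA bound can be compared against a non-cooperative binding curve. This is a minimal sketch using the standard Hill equation; the dissociation constant and the protein concentrations are arbitrary illustrative numbers and are not values from the cited studies.

```python
def hill_fraction_bound(protein_conc, k_half, hill_n):
    """Fraction of DNA sites occupied according to the Hill equation.

    protein_conc and k_half must share the same concentration units;
    k_half is the concentration giving half-maximal binding.
    """
    if protein_conc < 0 or k_half <= 0:
        raise ValueError("concentrations must be non-negative and k_half positive")
    term = protein_conc ** hill_n
    return term / (k_half ** hill_n + term)


# Illustrative comparison: k_half = 1.0 (arbitrary units, an assumption).
for conc in (0.5, 1.0, 2.0):
    cooperative = hill_fraction_bound(conc, k_half=1.0, hill_n=1.8)
    independent = hill_fraction_bound(conc, k_half=1.0, hill_n=1.0)
    print(f"[p53] = {conc:.1f}: n=1.8 -> {cooperative:.3f}, n=1.0 -> {independent:.3f}")
```

Both curves pass through half saturation at the same concentration, but the curve with n = 1.8 rises more steeply around that point, which is the behavior expected when binding of one dimer favors binding of the second.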

Apart from providing significant insight into the intricate molecular mechanisms

Figure 6 Conformational change from dimer to tetramer. The dimer (top) in solution undergoes a conformational change, with one of the monomers rotating substantially to enable DNA-protein contact.72

employed by p53 in its function as the guardian of the genome, X-ray crystallography has

also facilitated the design of small molecule drugs and our understanding of the role of

various cancer associated mutations which are largely clustered in the core domain.

Kussie et al. solved the crystal structure of MDM2 bound to a p53 peptide which is part

of the N-terminal transactivation domain.74 MDM2 is an oncoprotein identified as an

amplified gene product in a transformed mouse cell line. MDM2 leads to negative

regulation of p53 and therefore the MDM2-p53 interaction has served as a target for the

anti-cancer therapies.75 MDM2 negatively regulates p53 via monoubiquitination of p53

and the observation that disrupting this interaction is beneficial has led to generation of

small molecule drugs against cancer. It was found that an 11 amino acid region which is

part of the 12 kD conserved amino terminal domain of p53 containing the sequences responsible for transactivation by p53 is sufficient to bind to MDM2. This p53 peptide forms an amphipathic helix that binds to a cleft in MDM2 which is formed by two helices and a β-sheet and is lined with hydrophobic residues including aromatic amino acids.

The amphipathic helix is loosely folded in the absence of MDM2 and uses the hydrophobic residues to interact with the MDM2 binding pocket. Many transcription factors are composed of transactivation regions that are amphipathic helices. This structure shows that the buried volume is suitable for small molecule inhibitors to prevent the interaction of the oncogenic MDM2 with the tumor suppressor protein p53. The

Nutlins, for example, are an important class of p53 activators that function by disrupting the MDM2-p53 interaction, thus upregulating p53 and leading to the normal consequences of p53 accumulation in the cell, namely apoptosis and cell cycle arrest. Some

of the other small molecules that rescue p53 function by binding to MDM2 are discussed

later in this chapter.

The structures of various cancer associated mutants of p53 have added to our knowledge

of the biological role of these mutations and have helped the design of specific second

site mutations that nullify the effect of the cancer associated mutation. Joerger et al. have

crystallized five different mutants of p53 providing a mechanism of binding and second

site suppression by additional mutations.76 They reported that R273H annihilates DNA binding exclusively by losing the contact by R273 to DNA backbone. This mutation leaves the global structure of the core domain undisturbed and is reflected in the minimal destabilization caused by this mutation. Therefore this mutation is dubbed as the ‘contact’ mutation. On the other hand, R249S which is another cancer associated mutation leads to substantial changes in the DNA binding surface specifically to loop L3. This structural disruption leads to a non-native conformation of the loop L3 by displacing the M243 residue that packs against the DNA. In addition, this mutation causes partial unfolding of the core domain leading to a more flexible structure. The structure with the known rescue mutant H168R mimics the R249 DNA contact and renders the protein functional with respect to DNA binding and transactivation. The above studies were further confirmed by Suad et al.77

1.4.2 Biophysical Characterization

Although p53 was discovered as a tumor suppressor protein about 30 years ago,

significant biophysical characterization was carried out much later, starting only in the late

1990s. The characterization provides an overall picture of the structural basis of

destabilization and the implications of stability for the function of the protein. p53 is a modular protein with five domains, most of the stability of which is determined by the

DNA binding core domain. The C-terminal domain imparts stability to the complex of p53 with the DNA and has been implicated in the regulation of function. The N-terminal domain of this protein is categorized among natively unfolded proteins and plays a role in

the transactivation function of the protein. This is followed by the proline rich domain.

p53 is found to be mutated in a variety of human cancers, with most of the mutations being located in the core domain. Understanding the structural basis of instability of the protein and identifying the binding sites for its various partners has led to the targeted design of peptidic and non-peptidic drugs to rescue the function of this protein.

1.4.2.1 Understanding the structure and function relationship of the core domain

The first attempt at biophysically characterizing this protein was made by the Fersht lab.78 This lab has done substantial work on understanding the biophysical nature of this

protein, leading to advances in drug discovery and therapeutics. Bullock et al. have set up

a robust system to quantitatively measure the structure-activity relationship of the p53

core domain. Using differential scanning calorimetry (DSC) and urea denaturation, they

have characterized this protein as being marginally stabilized at room temperature. DSC

Figure 7 Properties of the species formed during denaturation. a) Spectral properties of the various species formed during the urea mediated denaturation of p53 core domain. b) Equilibrium showing the various species formed during the denaturation of p53 core domain.78

experiments showed that p53 core domain does not unfold reversibly with temperature,

and qualitatively, the stability of the protein increases on binding to the consensus DNA.

Following this the authors have set up a system to measure the reversible unfolding of

p53 core domain by urea denaturation monitored by following the fluorescence of the

tryptophan residue in the core domain and estimating the free energy of unfolding. They

have defined the spectral properties of the native state, the denatured state and an

aggregated species that is formed under certain denaturing conditions (Figure 7). The aggregation was confirmed by gel filtration chromatography. Since dithiothreitol (DTT) was essential for reversible chemical denaturation, they have also characterized the fate of the Zn2+ in these experiments. Direct measurement using spectrophotometric assays

showed that Zn2+ is bound in 1:1 ratio to the core domain in the native state and remains

bound even in the presence of 5 M urea. Therefore the equilibrium that we measure using

urea mediated denaturation is between holo-native and Zn2+-bound denatured state

(Figure 7). All literature that has followed uses this method to purify and characterize the stability of this protein.
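The way a free energy of unfolding is typically estimated from a urea denaturation curve can be sketched as follows. This is a minimal illustration that assumes a two-state equilibrium and the commonly used linear extrapolation model, in which the unfolding free energy decreases linearly with urea concentration; the specific free energy and m-value used in the example are hypothetical placeholders and are not the values measured for p53 core domain.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal mol^-1 K^-1


def delta_g_unfolding(urea_molar, dg_water, m_value):
    """Linear extrapolation model: dG(urea) = dG(water) - m * [urea]."""
    return dg_water - m_value * urea_molar


def fraction_unfolded(urea_molar, dg_water, m_value, temp_kelvin=298.15):
    """Fraction of denatured protein for a two-state native/denatured equilibrium."""
    dg = delta_g_unfolding(urea_molar, dg_water, m_value)
    k_unfold = math.exp(-dg / (R_KCAL * temp_kelvin))
    return k_unfold / (1.0 + k_unfold)


def midpoint_urea(dg_water, m_value):
    """Urea concentration at which half of the protein is unfolded."""
    return dg_water / m_value


# Hypothetical parameters for illustration only (kcal/mol and kcal/mol/M).
DG_WATER = 8.0
M_VALUE = 2.5

for urea in (0.0, 2.0, midpoint_urea(DG_WATER, M_VALUE), 5.0):
    print(f"{urea:.2f} M urea: fraction unfolded = "
          f"{fraction_unfolded(urea, DG_WATER, M_VALUE):.4f}")
```

In practice the free energy in the absence of denaturant and the m-value are obtained by fitting the measured fluorescence signal to this kind of model, with additional terms for the native and denatured baselines that are omitted here for brevity.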

The free energy of unfolding of wt-p53 core domain is reported to be 8.6 kcal/mol, confirming that p53 is only marginally stabilized at physiological temperatures. In addition, most of the tumor-derived mutations destabilize the protein. Mutations also lead to a loss of the delicate balance between p53 and its negative regulators, resulting in accumulation of non-functional p53. Gene therapy approaches that aim at delivering functional p53 to tumor cells to re-establish the p53 dependent pathways are gaining success. This triggered the design of a stable functional variant of p53. Nikolova et al. have employed a molecular evolution strategy to design a superstable variant of p53 core domain with wt-like DNA binding properties.79; 80 They mutated 20 different positions of

different solvent accessibilities to the consensus residues obtained from the comparison

of 22 different homologous p53 proteins. Four of the stabilizing mutations, M133L,

V203A, N239Y and N268D, were also confirmed to provide an additive effect, generating the

highly stabilized Quadruple (Quad) mutant. The main contribution to stability increase

came from the N239Y and N268D substitutions which are also known to act as second-

site suppressors for various cancer-associated mutations.81 The crystal structure of this

mutant (PDB 1UOL) revealed that it folds into a native structure, and binding studies

using surface plasmon resonance (SPR) showed that the Quad mutant binds to DNA with

wt-like affinity. The N268D mutation leads to favorable hydrogen bonding between the

S7 and S8 strands and the N239Y leads to the rigidification of loop L3. These factors

explain the improved thermal stability of the Quad mutant and provide insight into the

mechanism of action of the two stabilizing rescue mutations. Also this study clearly

reveals that stabilizing the protein leads to rescue of function and this approach has been

pursued for targeted drug design ventures. The Quad mutant itself serves as a potential

candidate for preliminary trials for gene therapy. In addition, this stabilized mutant of p53

has served as a model system for various structural and biophysical studies and has

contributed immensely to our understanding of the structure-function relationship of this

protein.

The inherent plasticity of the hydrophobic core of this protein was analyzed using NMR

studies of the core domain. The authors found several buried polar groups which

explained the structural reasons for the instability. NMR spectroscopy, with its ability to

detect protons, located buried hydroxyl and sulfhydryl groups that form suboptimal

hydrogen-bond networks, one of which the authors pursued further. Tyr-236 and Thr-253

which are located in the hydrophobic core and away from the DNA binding motifs were

mutated to Phe-236 and Ile-253.63 These residues were chosen based on the structural

alignment with p63 and p73, the stable paralogs of p53. These mutations stabilized p53 by 1.6 kcal mol-1. NMR analyses of the mutant showed differences in the conformation of

a mobile loop that might reflect the existence of physiologically relevant alternative conformations. These mutations when combined with the previously characterized Quad mutant led to further stabilization.82 The core domain of this ‘Hexa’ mutant has an overall fold of the wt core domain, as the crystal structure of this ‘Hexa’ mutant (PDB 2WGX) was virtually identical but for the immediate hydrophobic environment of the mutated residues. The mutations lead to a change from the suboptimal hydrogen bonding to favorable van der Waals contact between F236 and I253. In addition to the specific interaction between these residues, the change from polar residues to hydrophobic residues allowed the neighbouring hydrophobic environment of these residues to repack and form more favorable interactions. This mutant also showed native-like DNA binding to the fluorescein-labelled 30-mer double stranded DNA (dsDNA) containing the 3′ p21 response element, as indicated by the anisotropy experiments with the full length version of the mutant. One of the consequences of these mutations as indicated by urea mediated equilibrium denaturation is that they stabilize an intermediate in the unfolding pathway, moving away from the cooperative two state denaturation exhibited by the wt and Quad core domains. This behavior, different from that of the wt protein, may have implications in the in vivo function of the full length protein and makes it a less attractive template for in vitro characterizations.

Figure 8: Hotspot mutations on the core domain. The six known hot spot mutations (R175H, G245S, R248Q, R249S, R273H and R282) are highlighted as red sticks. Crystal structure of p53 bound to DNA (PDB ID 1tsr) was rendered using Pymol.

P53 is arguably one of the most studied proteins and the common goal of these investigations is to battle cancer. According to the latest statistics, over 26000 somatic mutations and over 500 germline mutations have been reported to occur in the p53 gene.

More than 2300 of these mutations have been reported to have functional consequences83

(http://www-p53.iarc.fr version R15). The initial data for the various non-functional mutants of p53 were obtained using yeast based transcriptional activation studies. Some of the initial biophysical characterizations to elucidate the mechanism of action of these mutations came from NMR studies conducted by Wong et al.84 They monitored the effect of five different ‘hotspot’ mutations of p53 (V143A, G245S,

R248Q, R249S, and R273H) by observing the changes in the chemical-shift pattern. The location of these mutations on the crystal structure of p53 is highlighted in Figure 8. The

extent of structural perturbation provided insight into the nature of these mutations.

R273H was categorized as a ‘contact’ mutant as this mutation mainly leads to

perturbation of the loop-sheet-helix motif and the L3 loop of the core domain. In

addition, this mutation leads to loss of a salt-bridge interaction. These changes minimally

affect the overall stability of the core domain but lead to considerably decreased

transcription of the p21 gene. Mutations R249S and G245S are characterized to be

‘structural’ mutations as they lead to global perturbation of the core domain structure and

cause the protein to be destabilized by >2 kcal mol-1. The R248Q mutation leads to structural

destabilization and loss of DNA binding indicating that this is a structural and contact

mutant. The mutation V143A which is buried deep inside the hydrophobic core of the β-

sandwich was found to cause perturbations to all the residues in the core. The

understanding of these mutations allowed Nikolova et al. to analyze the mechanism of

rescue of some of the cancer associated mutants by second site suppressor mutations.85

These studies allowed them to categorize targeted drug discovery into two groups. The first group will consist of drugs that can globally stabilize the protein, and these will be effective against the vast majority of cancer-associated p53 mutations which are known to be destabilizing. These include structural mutants such as the previously studied

V143A and G245S. On the other hand contact mutants like R273H require the drug to restore the DNA binding activity in order to be effective in rescuing the mutant p53 to possess wt-like activity. Furthermore, Bullock et al. have conducted a comprehensive analysis of all the hotspot mutations and found that most of the mutations lead to destabilization of the core domain.86 They categorized the cancer associated mutants into

different classes: DNA contact, DNA region, Zn region and β sandwich, providing an

overall perspective on the various p53 mutations based on their location on the core

domain affording a direction for future drug design ventures. In subsequent studies, the

crystal structures of R273H and R249S were solved.87 These structures reinforce the

hypothesis on the reason behind the loss of function of these mutants. They also solved

the structure of Y220C, one of the mutations in p53 that occurs with high frequency.

Understanding the structural effects of this mutation has led to the discovery of a class of

compounds that can potentially serve as lead molecules for the rescue of a broad range of

p53 mutations.88; 89

A significant consequence of the development of the various characterization methods is

the ability to interrogate the sequence-structure-function relationship of these variants.

Khoo et al. analyzed various stabilizing and destabilizing mutants for their transcriptional and apoptotic activities in tumor derived cell lines.90 They found that the stability of the various mutants correlates with the serum half life of these proteins and by and large an increased stability leads to improved activity. The outlier for this hypothesis was the

V143A mutation in the Quad and the Hexa mutant context, which showed WT-like transcriptional activity despite being a highly destabilized variant. Mutational analysis of these variants indicated that the improved activity of these mutants is due to the N239Y which is universally present in all these variants. It was also shown that this mutation leads to a higher apoptotic activity in a transcription dependent manner. This may indicate an effect of this mutation on the binding efficiency of the p53-DNA complex.

Comparative binding studies of the various DNA targets indicated that the transcriptional

consequence of p53 binding is determined by the tightness (KD) of binding.91 Typically

p53 exhibited tighter binding to its apoptotic targets as compared to the cell cycle arrest complexes. Therefore this mutation may improve the binding to levels that are required for the apoptotic activity. This implies that specificity of the response elicited by targeted p53 therapy may be more challenging to engineer.

1.4.2.2 The role of unstructured C-terminal regulatory Domain of p53

The mode of DNA binding by p53 and the role of C-terminal domain in the regulation of p53 is also an active area of research. The binding of p53 to DNA is found to be affected

by post-translational modifications such as phosphorylation and acetylation, in addition

to the specific sequence of the DNA. The role of the unstructured C-terminal domain is

probably the least understood among the various domains of p53. Early studies

demonstrated that truncation of the CTD or blocking the CTD by an antibody led to

improved DNA binding by the core domain.92 The role of specific post-translational

modifications on the C-terminal domain was studied by Friedler et al.93 Their results

show that acetylation of specific lysine residues reduces the binding efficiency of the C-terminal domain (CTD) peptide to a long non-specific DNA sequence derived from

sheared herring sperm. Concentration-dependent measurements show that there is an

exponential effect in the binding when p53 is present as monomer versus dimers versus

tetramers. This further added to the proof that the CTD may negatively regulate the DNA

binding of p53 core domain. They also reported that the phosphorylation of S392 in the

CTD did not affect the binding to any degree, which contradicts previous results. The

authors suggest that S392 may not affect the binding and that previous work may have involved phosphorylation of other serine residues in addition to the S392. The observed dependence on phosphorylation of serine might be due to S376 and S378 which are in the

DNA binding stretch of the CTD. These experiments led to the hypothesis that the CTD negatively affects the sequence specific DNA binding by the core domain and an allosteric regulation of DNA binding by the CTD was suggested.

This allosteric regulation mechanism was challenged by experimental results that indicated that the DNA binding core domain is unaffected by changes in the CTD.

Specific constructs known to inhibit tetramerization and the constructs with the CTD were compared using NMR spectroscopy, and the structure of the core domain was found to be unvarying irrespective of the presence of the CTD.94 A comparison of the sequences of

the CTD of p53 and its paralogs p63 and p73 revealed significant differences among the

proteins supporting the tight binding of DNA by p73 and weak binding by p53.95 The influence of the CTD on DNA binding was also found to depend on the length of the target DNA studied. It was shown that the CTD facilitates binding to long stretches of

DNA while it negatively affected binding to short DNA. The presence of the CTD was found to improve both the kinetic and thermodynamic stability of the p53-DNA complex.

These results imply a positive regulatory role for the C-terminal domain in DNA binding by the core and tetramerization domains.

The tetramerization domain does seem to facilitate the cooperative binding of the protein to DNA whereas the mechanism of the C-terminal regulatory domain is still unclear.

Recent work reported by Tafvizi et al. provides a feasible mechanism by which p53 binds to sequence specific DNA.96 They used single molecule experiments to probe the mode of binding of each of the various domains in p53. It is known that non-specific DNA binding is independent of ionic strength while ionic strength does play a role in sequence specific DNA binding. Their experiments showed that the C-terminal domain binds DNA via non-specific electrostatic interactions and the sequence specific binding is mediated by the core domain. The fluorescence profiles of the labeled proteins also indicate that the C-terminal domain employs a sliding mechanism while the core domain engages in a

‘hopping’ mechanism to search through the DNA. DNA binding proteins are known to undergo conformational changes when they change from the ‘search’ mode to the ‘binding’ mode. In the case of p53, the function is split between two domains: the C-terminal domain slides along the DNA, aiding the ‘search’ process, and the core domain binds the specific sequence of the DNA. The hopping mechanism followed by the core domain allows for the rapid traversing of long stretches of DNA by the protein. It is proposed that the protein spends a large amount of time in the ‘search’ configuration as compared to the

‘binding’ conformation. The authors propose that different roles of the domains in DNA binding explain the contradictory role of C-terminal domain on DNA binding. Truncation of the C terminus or binding by specific antibodies eliminates sequestration and leads to better binding to the cognate sites on short DNA fragments, while making binding to long

DNA molecules kinetically inefficient. Therefore the C-terminal domain might

kinetically favor binding to long DNA, which is the most likely scenario in vivo, and

thermodynamically hinder the binding of the core domain to short DNA by sequestering

the protein on to non-specific DNA.

In vitro experiments comparing the effects of various mutations on the core domain in isolation versus in the presence of tetramerization domains showed similar trends in the

DNA binding activity. Weinberg et al. tested the effects on sequence-specific and non-specific DNA using fluorescence anisotropy and analytical ultracentrifugation.71 The Hill plot indicated that the binding of the core domain is completely cooperative both in isolation and in the presence of the tetramerization domain. In addition, the amount of destabilization imparted by each of the mutants tested followed a similar trend in both the contexts. This establishes that the core domain, which dictates the stability of the protein, also determines the sequence specific DNA binding and can be used as a model to study the effect of mutations on the protein. This paints a picture in which the DNA binding and the tetramerization domains act synchronously to bind DNA while the regulatory domain allows for the fast traversing through DNA to find the right binding targets. In the presence of non-native, non-specific or short fragments of DNA, this regulatory domain sequesters the protein by non-specific electrostatic interactions with

DNA. This sequestering kinetically favors the degradation of the protein, which is the most likely scenario at normal cell conditions.
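The Hill analysis mentioned above can be illustrated with a short sketch. It generates binding data from the Hill equation and recovers the Hill coefficient as the slope of the Hill plot. The function names and all numerical values are hypothetical and are not taken from the cited anisotropy data.

```python
import math


def hill_fraction_bound(conc, k_half, n_hill):
    """Hill equation: theta = c^n / (K^n + c^n)."""
    return conc ** n_hill / (k_half ** n_hill + conc ** n_hill)


def hill_plot_slope(concs, fractions):
    """Least-squares slope of log10(theta / (1 - theta)) against log10(conc)."""
    xs = [math.log10(c) for c in concs]
    ys = [math.log10(f / (1.0 - f)) for f in fractions]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    num = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    den = sum((x - x_mean) ** 2 for x in xs)
    return num / den


# Hypothetical titration: protein concentrations relative to the half-saturation point
concs = [0.25, 0.5, 1.0, 2.0, 4.0]
fractions = [hill_fraction_bound(c, 1.0, 4) for c in concs]
slope = hill_plot_slope(concs, fractions)
```

A slope close to the number of binding subunits is the usual signature of highly cooperative binding, whereas a slope near 1 indicates independent sites.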

In the course of vertebrate evolution, p53 has probably evolved to be kinetically unstable

at the organismal temperature with a short half-life in the cell to allow a mechanism for

spontaneous degradation in addition to regulatory pathways such as MDM2. This rapid

turnover of p53 ensures that the protein will remain in the cell no longer than required,

unless it is in complex with DNA or other proteins. Some experimental data to support

this comes from the studies done by Khoo et al.97 They analyzed the changes imparted to

the sequence, structure and function of p53 in various lower organisms and its paralogs

p63 and p73. P53 is functionally evolved to be a tumor suppressor protein whereas the

main cellular function of p63 and p73 is not growth suppression. In the process of this

functional evolution, p53 has acquired destabilizing mutations that render the protein to

be highly unstable at the body temperature of the organism. This allows an additional

regulatory mechanism to be available for p53 to be targeted for degradation under normal

cell conditions. Studies on the tetramerization domains of these homologs indicate that the tetramerization domain of p53 has evolved to be less complex than in p53 from lower

invertebrates and the p63 and p73 in humans.98 This acquired low promiscuity may have

defined the functional drift of p53 from its paralogs. Its significance in human cancer has

made it a target for various drug design studies.

1.4.3 Characterization of the Response Elements of p53 Core Domain

Considerable efforts have been made towards understanding the sequence and

conformation of the DNA response elements (REs) of p53. The first report on the

sequence of the DNA that binds to p53 was made by Kern et al.99; 100 Their experiments,

which utilized radiolabeling of the DNA were able to identify that a human DNA

fragment as small as 33 bp can bind to p53. Using methylation interference assays, they

were also able to assign the residues that are significant for this binding. These

experiments first claimed that the function of p53 may be mediated by its ability to bind to specific DNA sequences in the human genome, and that this ability is altered by mutations that occur in p53 found in human tumors.

P53 was then defined as a sequence specific DNA binding protein by El-Deiry et al., and

they defined the currently known ‘consensus’ binding sequence for p53.101 They

identified and analyzed 18 different p53 binding sequences from the human genomic

DNA. Fragmented DNA was analyzed for binding to p53 using p53-specific antibody,

and the DNA bound to p53 was identified using sequencing. The significance of each of

the DNA bases was tested using methylation interference (MI) and immunoprecipitation

(IP) assays. Their results showed a striking pattern for the binding sequences which

consisted of two copies of the 10 bp motif 5’-PuPuPuC(A/T)|(T/A)GPyPyPy-3’

separated by 0-13 bp, and this was referred to as the ‘definition of the consensus.’ Their

experiments also showed that the ‘inversion’ of the half site as well as the C and G bases

at the positions 4 and 7 are significant for p53 binding.
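The consensus described above lends itself to a simple pattern search. The sketch below encodes two copies of the 10 bp motif separated by 0-13 bp as a regular expression. The function name and the example sequences in the usage note are hypothetical and serve only to illustrate the pattern.

```python
import re

# IUPAC codes used in the consensus: R = purine (A/G), W = A/T, Y = pyrimidine (C/T)
HALF_SITE = "[AG]{3}C[AT]{2}G[CT]{3}"
FULL_SITE = re.compile(f"({HALF_SITE})([ACGT]{{0,13}})({HALF_SITE})")


def find_p53_sites(sequence):
    """Return (start, end, spacer_length) for each non-overlapping consensus match."""
    hits = []
    for match in FULL_SITE.finditer(sequence.upper()):
        hits.append((match.start(), match.end(), len(match.group(2))))
    return hits
```

For instance, the made-up sequence "AGGCATGTCT" + "ACG" + "GGACTAGCCC" is reported as one site with a 3 bp spacer, while replacing the C at position 4 of the first half site with A ("AGGAATGTCT") gives no match, in line with the reported importance of the C and G bases at positions 4 and 7.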

Figure 9: Regulatory network of p53. The various stresses that lead to changes in chromatin activate the p53 pathway, which ultimately leads to cell cycle arrest, apoptosis or senescence depending on the severity of cellular stress. MDM2 plays a major role in the inhibition of p53 and this interaction is the target of many small molecule drugs. Figure from Kim and Dass 2011.102

In a parallel study, Funk et al. used an iterative selection procedure (CASTing: cyclic amplification and selection of targets) to identify new specific binding sites for p53, using nuclear extracts from normal human fibroblasts as the source of p53 protein.103 A

completely degenerate DNA sequence flanked by primer-specific amplification

sequences bound to p53 was isolated using magnetic beads coated with p53-specific

antibody. The DNA was released from this complex by denaturation and was amplified.

Multiple cycles improved the stringency of selection and yielded specific DNA binding

sequences for p53. The preferred consensus was the palindrome

GGACATGCCC|GGGCATGTCC. In vitro binding was assessed by Electrophoretic

Mobility Shift Assay (EMSA), and placing the identified sequences upstream of a

promoter in a yeast assay led to transcription by WT p53. Yeast one-hybrid methods

have been modified to identify REs from the human genome. Initially Tokino et al.

cloned putative p53 REs upstream of a basal promoter system controlling the expression

of HIS3.104 Only those cells carrying sequences that can be transactivated by p53, which is expressed from a GAL4 promoter on a different vector, will survive on media lacking histidine. They were able to identify 200 to 300 REs from the human DNA and they also reported that the spacing between the consensus half sites is significant. Later, Hearnes et al. combined Chromatin Immunoprecipitation (ChIP) with this yeast screen to identify novel binding sites from the whole genome.105 They were able to identify sequences that matched the consensus completely, and they observed that the spacing between the half sites was 2 bp or less.
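The palindromic nature of the preferred CASTing sequence GGACATGCCC|GGGCATGTCC can be checked directly, since a palindromic DNA site is one that equals its own reverse complement. The helper names below are illustrative; the sequence is the one quoted in the text.

```python
# Translation table mapping each base to its Watson-Crick partner
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_palindromic(seq):
    """A DNA palindrome reads the same on both strands in the 5' to 3' direction."""
    return reverse_complement(seq) == seq.upper()


LEFT_HALF = "GGACATGCCC"
RIGHT_HALF = "GGGCATGTCC"
CASTING_SITE = LEFT_HALF + RIGHT_HALF
```

Here reverse_complement(LEFT_HALF) equals RIGHT_HALF, so the full 20 bp site with no spacer is palindromic, and each strand presents the same pair of half sites to the protein.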

1.5 Rescue strategies for p53C and mutants

The rescue strategies developed for p53 can be categorized into two groups.102; 106; 107 One

of the methods is direct rescue by gene therapy. In this method phage delivery systems

are targeted to deliver the cargo to the tumor cells. The ill-effects of the immunological

response to the virus and the accompanying side effects have been a major detriment for

this approach. The second method is the indirect rescue by small molecule activators, and

the effectiveness of a number of these molecules is currently being tested by pre-clinical

trials. P53 is maintained at low levels using tightly-regulated pathways and the knowledge about this regulatory network has paved the way for this class of p53 rescue

(Figure 9). This class can be further subdivided into those molecules that reactivate

mutant p53 and those which confer activity to wt p53 by interacting with its negative

regulator MDM2. Despite a general understanding of the effect of these molecules, the

specific mechanism of action of a number of these compounds is still poorly understood.

Studies are underway to gain a better understanding of these in order for easier and more

effective design of drugs in the future. For the few compounds for which the mechanism of

action is clear, it appears that stabilizing p53 is a major determinant of the effectiveness

of the drug.

Figure 10: Rescue of p53 by binding to MDM2. a) shows the various strategies to rescue the function of p53. Figure from Mandinova et al.109 b) shows the structures of p53 targeted drugs Nutlin 3a and MI219 bound to the p53 peptide binding pocket of MDM2.110

The class of small molecule activators of p53 that function by disrupting its interaction with the protein MDM2108 is seemingly more established, with a few compounds in pre-clinical studies. The high resolution crystal structure of MDM2 bound to p53 shows that this binding pocket is ideal in size for small molecule binding and has facilitated the

design and discovery of non-peptidic inhibitors of this interaction. Among the various

targets identified, Nutlin3 and MI219 have all the desirable properties: namely, (a) high

binding affinity and specificity to MDM2, (b) potent cellular activity in cancer cells with

wild-type p53, and (c) a highly desirable pharmacokinetic (PK) profile and are among the

small molecules which are in clinical trials now.110; 111 The structure of Nutlin3 bound to

MDM2 has been solved and shows that it binds to the pocket where the p53 peptide

binds112 (Figure 10). RITA is another small molecule drug that has been reported to

activate WT p53 by disrupting its interactions with MDM2,113 but its in vivo

pharmacokinetic properties are yet to be established.

53BP2 is a positive regulator of p53, and based on this, Friedler et al. have designed a

peptide CDB3 that can reactivate the DNA binding property of mutant p53.114; 115 Using

NMR studies they were able to localize the binding of the peptide to the edge of the DNA binding site and show that incubation with CDB3 improved the stability of the p53 core domain. Furthermore, when mutants of p53 were incubated with the peptide, it led to improved binding of the mutant. Using NMR chemical shifts, it was shown that this peptide can shift the conformation of structural mutant R249S to be WT-like.116 When mutants are in a dynamic equilibrium between the WT-like and non-native conformation, the chaperone effect of the CDB3 peptide assists in shifting the equilibrium towards the native like form. Thus CDB3 potentially rescues the function, simply by stabilizing the

mutant. The cellular uptake and the in vivo effect of the FL CDB3 were validated in three

different human cancer derived cell lines.117

Representative structural and contact mutants R175H and R273H were rescued by this

peptide. ELISA experiments using PAB1620, an antibody that recognizes properly folded

p53, confirmed the chaperone-like activity of the peptide. In the context of the contact

mutant, the authors propose the total upregulation of p53 in cells might be the reason for

the rescue of function by the peptide. In a separate study, the authors have shown that the

kinetic stability of the p53 mutants correlates with their thermodynamic stability. The

kinetic instability might explain the loss of activity by many of the structural mutants.

The authors tested the effect of stabilizing small molecules on the half life of various

mutants including WT p53. The presence of peptide drugs like the CDB3, which is

proposed to act as a chaperone in the folding of the protein, improved the kinetic stability

of the protein.118 This presents a unique strategy for stabilizing the mutant which might

lead to the rescue of structural mutants. A general mechanism for the binding of this

peptide was found by NMR combined with alanine mutation studies of the CDB3 peptide. Their studies revealed that the DNA binding region of p53 is highly positively charged and binding by the peptide is governed by non-specific electrostatic interactions.

An analysis of the binding of various p53 partners reveals that regions overlapping with the DNA binding area of the p53 form a promiscuous site for binding.119 This site

consists of the residues Leu114, His115, Gly117, Thr118, Val122, and Thr123 from loop

L1, Arg280, Arg282, Arg283, Thr284, and Glu286 from helix H2, and Tyr126, Thr140,

and Glu198 from the rest of the protein. Residues Val143, His179, Asn239, Gly244,

Val272, Cys277, and Gly279 also form part of the binding site. DNA-binding interface in

p53CD serves as a multipurpose promiscuous protein-binding site. This site mediates

binding of most of the p53CD-binding proteins, including Rad51, HIF-1α, Bcl-XL, CTF2,

53BP1, 53BP2 and heparin. The rigid hydrophobic core and the multitude of flexible loops in the core domain of p53 allow the promiscuous binding to its various partners.

Further experimental evaluation and possibly engineering is required to prove the specificity of CDB3 in vivo.

PRIMA1 and its methylated derivative PRIMA1MET are two non-peptidic small-molecule

drugs which rescue the activity of p53 mutants and have been shown to be active against

various cell lines and carcinomas.120; 121 A library of small molecules that suppress the

growth of human tumor cells in a mutant p53–dependent manner was screened using an

assay based on Saos-2-His-273 cells carrying tetracycline-regulated mutant p53. This

study identified PRIMA1 as a broad spectrum p53 rescue drug which reactivated 13 of 14

different mutants. In vivo studies show that PRIMA1 has tumor suppressor activity in

animal models and also induces other p53 dependent genes including MDM2. This

indicated that PRIMA1 reactivates p53 by inducing WT-like conformation of mutant p53.

In vitro studies in tumor cell lines and subsequent in vivo studies in mouse models

identified PRIMA1MET, a methylated derivative of PRIMA1, as a more potent p53 activator.122; 123 An insight into the mechanism of action of these compounds came from

in vivo and in vitro decomposition studies of PRIMA1 and PRIMA1MET.124 Their

decomposition leads to the formation of products that contain thiol reactive groups (for

example, methylene quinuclidinone, MQ). These decomposition products alkylate the

thiol groups in p53 leading to an unscrambling of the misfolded mutant. This mechanism

of action based on thiol reactivity is similar to other p53 reactivating drugs like MIRA,

STIMA, and CP-31398. The effect of PRIMA1MET on global gene expression was studied

by microarray analyses of tumor cell lines expressing p53 mutant and showed that various transcription dependent and independent p53 targets were regulated in the

presence of the small molecule in a p53 dependent manner.125 This further reinforced that

PRIMA1MET rescues the wt-like functions of p53 and potentially has minimal possibilities

of the development of drug resistance.

Using the same screening system that was used for the discovery of PRIMA1, the Bykov

group identified a novel compound STIMA1 that improved the DNA binding and

transcriptional activation of mutant p53.126 The structural scaffold of 2-vinylquinazolin-

4(3H) is similar to the previously characterized mutant p53-reactivating compound CP-

31398. A set of 26 different derivatives of this scaffold were tested to identify STIMA1.

In vitro studies indicate that STIMA1 is specific toward cells expressing mutants of p53 and that WT p53 is less sensitive to the effects of STIMA1. Just as in the case of

PRIMA1, the authors propose a thiol reactive mechanism for the activity of STIMA1.

The compromised solubility of STIMA1 has hindered the in vivo studies of this compound and derivatizations to improve the solubility are being explored.

PhiKan083, which belongs to a class of carbazole derivatives and can rescue the function

of Y220C, was discovered by in silico screening followed by experimental testing.87; 89

This mutation leads to a cavity in the p53 core domain distant from the surface regions that are known to be involved in DNA recognition or protein–protein interactions, making it a particularly attractive target site for stabilizing small-molecule drugs.

PhiKan083 was demonstrated to improve both the thermodynamic and kinetic stability of the mutant. Since Y220 is not in the region that binds DNA, PhiKan083 can serve as a lead compound for the design of generic p53 activators. The binding of this compound to the cavity generated by the Y220C mutation was unambiguously proved when the complex was crystallized with PhiKan083 bound to this cavity

(PDB 2VUK). The effectiveness of PhiKan083 demonstrates the utility of stabilization of p53 for rescue of function.

The significance of high throughput screening systems is accentuated by the fact that such discoveries were made via such screening systems albeit at a small scale. Despite the fact that screening systems provide a powerful tool in the identification of small molecules, only a few yeast based screening systems are prevalent for the identification of functional p53. The number of variants that can be studied in a high-throughput manner is limited by the transformation efficiency in yeast. This explains the small libraries of variants that have been used in the studies thus far. A review of the presently available screens for p53 is detailed below.
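The limit that transformation efficiency places on library size can be made concrete with a rough estimate. The sketch below assumes that all variants are equally represented and that sampling follows Poisson statistics; the function names and the numbers in the usage note are illustrative assumptions and not measured yeast or bacterial efficiencies.

```python
import math


def library_diversity(n_positions, alphabet_size=20):
    """Number of distinct protein variants when n positions are fully randomized."""
    return alphabet_size ** n_positions


def expected_coverage(n_transformants, diversity):
    """Expected fraction of variants sampled at least once (Poisson approximation)."""
    return 1.0 - math.exp(-n_transformants / diversity)


def transformants_for_coverage(diversity, coverage):
    """Transformants needed to reach a target coverage under the same approximation."""
    return math.ceil(-diversity * math.log(1.0 - coverage))
```

For example, fully randomizing four positions gives library_diversity(4) = 160000 variants, and a number of transformants equal to the diversity is expected to sample only about 63 percent of them. Reaching 95 percent coverage requires roughly three times as many transformants as variants, which is why the achievable transformation efficiency of the host bounds the library size that can be screened.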

1.6 Identification of functional mutants of p53 using genetic screens

One of the earliest screening systems developed for p53 is a negative selection scheme

based on a yeast reverse one-hybrid system. Brachmann et al. used a URA3 based

reporter system and screened for both survival in uracil-free media and sensitivity to

FOA (5-fluoro-orotic acid). P53 expression was driven from a different plasmid under the

control of an ADH1 promoter127; 128 (Figure 11). In this assay, which depends on the

activation of URA3 placed downstream of a p53 dependent promoter, cells expressing WT p53 survive on media lacking uracil and do not survive on plates containing FOA (FOAs).

The binding sequence of p53 was based on the consensus reported by El-Deiry et al.

Mutations of p53 were identified based on their ability to rescue FOAS phenotype. They

isolated 49 different mutations, most of which were previously reported in cancer, and

they found that most of the mutations clustered around three of the six hotspot sites. The degree of dominant negative effect exerted by these mutants was further explored by screening in presence of one or two copies of WT p53. The same screen was later used in

order to identify second site repressor mutations.81 Using PCR and gap repair, a library of mutants was analyzed using the same screen, and identified second site repressor mutations for V143A, R249S and G245S. They also showed that the mutations that

Figure 11: URA3 dependent screen for p53 in Yeast Yeast based screen for p53 PCR mutagenesis and gap repair of different regions of the p53 ORF (PAC products A and B) was used to assess the transactivation capacity of various mutants of p53 monitored by the survival rate on uracil drop-out media. Survival on the histidine drop-out media denoted successful gap repair.127

resulted from the yeast screen lead to transcription of the reporter gene and natural p53

RE in mammalian cells. Ishioka et al. report a functional screen for p53 based in the p53

dependent expression of the HIS3 gene.129 The expression plasmid that contained the DNA encoding the p53 fragments from the analyte was selected based on survival on plates lacking leucine. This step tests the gap repair efficiency in yeast and selects for clones expressing full-length p53. These clones were then transformed with the reporter plasmid, which utilized a p53-dependent HIS3 system, and replica plated on histidine and leucine drop-out plates. This assay is cumbersome because it requires replica plating. A modification of this system, developed by Flaman et al., depends on the p53-dependent expression of the ADE2 gene from a plasmid that is integrated into the yeast strain130 (Figure 12). In the presence of limiting amounts of adenine, the absence of ADE2 expression leads to the accumulation of a colored intermediate in the biosynthesis of adenine, turning the cells red. Therefore, when ADE2 is expressed in a p53-dependent manner, WT p53 leads to white colonies while mutants result in red colonies. This screen also identified marginally active variants that form pink colonies. A detailed analysis showed that these were temperature-sensitive mutants, which resulted in pink colonies at 25 °C and white colonies at 37 °C. This screen can be used to identify functional p53 from clinical samples including cell lines, peripheral blood lymphocytes and tumors. The advantage of this screen over the previous URA3 based screen is that it avoids the replica plating step.

The previous screen, though, has the distinct advantage that URA3 is regulated by a tightly controlled promoter, limiting basal expression in the absence of p53.
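The reporter logic of the two yeast systems described above can be summarized in a few lines of Python. This is only an illustrative sketch: the function names, dictionary keys and the three activity labels are hypothetical and are not taken from the cited studies.

```python
# Illustrative summary of the yeast reporter phenotypes described in the text.
# All names here are hypothetical labels chosen for this sketch.


def ura3_selection(p53_transactivates: bool) -> dict:
    """Phenotype of a strain carrying URA3 behind a p53-dependent promoter."""
    ura3_expressed = p53_transactivates
    return {
        # URA3 is required for growth on media lacking uracil.
        "grows_without_uracil": ura3_expressed,
        # URA3 converts FOA into a toxic product, so expression means FOA sensitivity.
        "grows_on_foa": not ura3_expressed,
    }


# Colony colors reported for the ADE2 red/white screen.
ADE2_COLONY_COLOR = {
    "active": "white",   # WT-like p53 drives ADE2 expression
    "partial": "pink",   # marginally active variants
    "inactive": "red",   # no ADE2 expression, colored intermediate accumulates
}


def ade2_colony_color(p53_activity: str) -> str:
    """Colony color for a given (hypothetical) activity label."""
    return ADE2_COLONY_COLOR[p53_activity]
```

For example, `ura3_selection(False)` describes a loss-of-function mutant that fails to grow without uracil but survives on FOA, which is the phenotype selected in the Brachmann scheme.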

Figure 12: Red/white screening in yeast. p53 response elements engineered upstream of ADE2 allow red/white screening based on the colored intermediate of adenine biosynthesis that accumulates when ADE2 is not expressed.130

A number of modifications of this basic yeast one-hybrid screening system have been developed to analyze active p53 or p53 REs. The p53-dependent transactivation has been monitored using auxotrophic markers such as tryptophan, uracil and histidine, and reporters such as luciferase. A significant drawback of yeast-based screens is that efficient nuclear import of the protein is required. In addition, the transformation efficiency of yeast heavily limits the number of variants that can be studied at the same time. This is an important deterrent for combinatorial experiments. mRNA display was used by the Ghadessy group to identify p53 variants that bound to the RE in the p21 gene.131; 132 In vivo results showed that, in addition to the transactivation from the p21 RE, the endogenous levels of p21 were also increased by some of the variants tested. In vitro methods offer the advantage of avoiding the transformation step, which usually limits the library size, but they are also complicated by the in vitro translational machinery required. A solubility screen was developed by Mayer et al. in E. coli.132 The authors monitored the levels of a p53 core domain with a C-terminal EGFP fusion by measuring cellular fluorescence. They observed that the levels of p53 correlated with the thermodynamic stability of the variants tested. One of the problems with the solubility screen is that it does not require the folded conformation of the protein. If the chromophore is formed from misfolded proteins, false positives may occur.

The tremendous amount of work that has been done on p53 has led to a clearer picture of the mechanism of action of p53 as a tumor suppressor protein. Both the kinetic and thermodynamic stability of the full-length protein are under tight regulation at physiological conditions. It appears that the CTD may facilitate the search process for sequence-specific DNA binding while the tetramerization domain promotes the cooperative binding of p53 to DNA. The N-terminal domain is implicated in the transactivation function of p53 and also interacts with MDM2, which is the primary negative regulator of p53. Understanding the structural basis of instability has led to the identification of various drug targets. High-throughput screening systems and combinatorial methods to generate large libraries of p53 can aid in interrogating the sequence-structure-function relationship of this protein, which is still unclear.

1.7 Thesis Synopsis

What determines the fold that a particular protein adopts? In other words, how much perturbation can a structure tolerate with respect to its sequence before it is unable to form its ‘native’ structure? Also, is there any difference between the stability determinants in β-sheets versus α-helices? Does the large contribution of long-range forces give β-sheet systems different tolerances to sequence changes? We have attempted to answer some of these questions using combinatorial methods to study the tumor suppressor protein p53. P53 is a physiologically important protein with a common β-immunoglobulin fold. We have also analyzed the specificity of the interface of a heterodimeric coiled coil formed by the interaction between BRCA1 and BARD1.

We developed a transcription interference screen for p53 core domain based on an

artificial p53 responsive lac operon, controlling the expression of GFPuv in the host

plasmid pGFPuv. We report a novel p53-responsive lac operator, which was derived from a library of binding sequences by simultaneously optimizing the transcription of GFP and binding to wt-like Quad. We have successfully designed a screen which discriminates between p53 which shows DNA binding activity and variants of p53 that cannot bind or weakly bind to DNA. As in the case of most proteins, the function of p53 is closely related to its stability. Therefore our screen provides a simple and quick method to screen for stable and functional variants of p53. The design, optimization and results of the screen are described in Chapter 2.

The various applications of the screen are the subject of discussion in Chapters 3 and 4.

Chapter 3 describes the application of the screen to find stable functional variants of p53

core domain from a library of core randomized variants. Initially, four core residues were

randomized to all twenty amino acids. The number of positives obtained as a result of

screening this library was very small, and therefore we generated two smaller sub-

libraries randomizing only two residues at a time to investigate the role of these four

residues. The results obtained from the sub-libraries showed that two of the four randomized positions were more stringent in their size requirements in the β-sheet core than the other two, thus explaining the low occurrence of positives from the 4-position library. The validity of the screen was further supported by biophysical characterization of stable and functional variants resulting from the screening. The characterization shows that the variants that pass the screen are as stable as the Quad mutant. In silico studies indicate

that decreasing the global dynamics of the protein by reducing the loop length of the

S7S8 turn in the p53 core domain can lead to significant stabilization of the protein. But

this in silico designed variant failed to give a positive phenotype in our screen. Following

this, we rationally designed various deletion mutants and utilized our screen to explore

the stabilization of core domain through mutagenesis of this loop. We were able to

identify one such variant which is intermediate in stability with respect to the in silico designed mutant and Quad. Using the screen to identify rationally designed loop variants

of p53 is detailed in Chapter 4.
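The difference in scale between the 4-position library and the 2-position sub-libraries described above can be made concrete with a short calculation. The protein-level counts follow directly from randomizing each position to all twenty amino acids; the codon-level count assumes an NNK randomization scheme (32 codons per position), which is an assumption of this sketch and is not stated in the text.

```python
# Sketch: theoretical diversity of a fully randomized library.
# The NNK codon scheme (32 codons per position) is an assumption made for
# illustration; the text does not state which degenerate codon was used.


def protein_diversity(n_positions: int, n_amino_acids: int = 20) -> int:
    """Number of distinct protein sequences when each position is fully randomized."""
    return n_amino_acids ** n_positions


def codon_diversity(n_positions: int, codons_per_position: int = 32) -> int:
    """Number of distinct DNA sequences, assuming NNK codons (4 * 4 * 2 = 32)."""
    return codons_per_position ** n_positions


four_position_library = protein_diversity(4)  # 20**4 = 160000 protein variants
two_position_library = protein_diversity(2)   # 20**2 = 400 protein variants
```

A 2-position sub-library is therefore 400-fold smaller at the protein level than the 4-position library, which makes it far easier to sample exhaustively.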

BRCA1 is another important tumor suppressor protein, and some oncogenic mutations in BRCA1 lie in the putative binding regions for key interaction partners. For example, BRCA1 and BARD1 interact with each other through their RING finger domains to form a heterodimeric four-helix bundle. The occurrence of oncogenic mutations is much more widespread in BRCA1 than in BARD1. Sarkar et al. have designed a screen based on the split complementation of GFP to screen for BRCA1 variants that can bind to BARD1 and thus provide information on the interaction surface between these proteins. The results obtained from the initial study of a few mutants indicated that the interface is fairly insensitive to mutations. Analysis of all the 36 known cancer-associated mutations of BRCA1 using this method also indicated the presence of a robust interface which is not highly sensitive to mutations. As a further validation of the results obtained from the in vivo screen, we have characterized these interactions in vitro.

A set of mutations both in and away from the interface was selected, and variants with these mutations were co-expressed in the absence of the GFP tags. Both BRCA1 and BARD1 partition into inclusion bodies when expressed separately in E. coli and are soluble only as a complex. We expressed the two proteins with orthogonal tags, so that purification of the complex from the soluble fraction reports on the interaction between WT BARD1 and BRCA1 variants. The observations from the in vitro studies confirm that the interface between these proteins is quite robust to mutations. The various details of this study are described in Chapter 5.

Steps to improve the dynamic range of the screen have been initiated. Two different approaches are being taken: optimizing the screen for a more stabilized core mutant, ‘hexa’ p53, and screening in the presence of the tetramerization domain to enable us to detect weaker DNA binding of the core domain. The screen can potentially be used for screening therapeutic targets. As a first step towards this, cyclic peptide libraries will be screened to identify molecules that can stabilize and rescue the function of known hotspot mutants of p53 core domain. These extensions to the screen are described in Chapter 6.

Given the pathological significance of p53 in human cancers and the high throughput nature of the functional screen for p53, our screen is a powerful tool in identification of destabilized mutants. In combination with other small molecule libraries, the screen can also be used as a tool to identify drugs to rescue various destabilized variants. Since the screen is bacterial and phenotypic, it provides a quick approach to identify lead compounds. A repertoire of such potential drugs which can be identified using a combinatorial library approach will be powerful against various oncogenic mutants of p53.

Chapter 2: A cell based screen for the functional core domain variants of tumor suppressor protein p53

Contributions

The material presented in Chapters 2 and 3 will be published as a full paper co-authored by Brinda Ramasubramanian and Thomas J Magliery. The work summarized in

this chapter was produced and written by the primary author. The experimental design

and data analyses were accomplished by the primary and the corresponding authors.

2.1 Summary

We have developed a high throughput phenotypic bacterial screen for the core domain of the tumor suppressor protein p53. The screen relies on the transcriptional interference of

an artificial p53 responsive lac operon, controlling the expression of GFPuv in the host plasmid pGFPuv in a p53-dependent manner. We modified the operator region of the lac operon to contain a p53 binding site. Wild type-like p53 binds to this site, blocking the polymerase and leading to a non-fluorescent phenotype. In the presence of mutant p53 whose ability to bind DNA is compromised, transcription of GFPuv is undeterred and the

resulting cells are fluorescent. In addition to utilizing some of the known binding

sequences of p53, we have also used combinatorial methods based on the consensus p53

binding sequence to generate a novel p53 binding sequence (Binding Domain-1, BD-1) which, in the context of the lac promoter, optimizes transcription from the operon in the absence of functional p53. P53 Quad (an engineered stable variant of WT p53 core domain) and various mutants were expressed under the control of an arabinose promoter, a tunable system. The host plasmid pACBAD-p53 can be co-maintained in the cell with pGFPuv-p53 binding domain (pGFPuv-BDx). Known hotspot mutants of p53, V143A, R175H,

R249S and R273H, which are known to span the spectrum of structural and contact

mutants of p53, were chosen as the negative controls for the screen. Our results show a

marked decrease in the fluorescence of pGFPuv-BD-1 when co-transformed with p53-

Quad (wt like) and higher fluorescence when co-transformed with the hotspot mutants of

p53. We have successfully designed a screen which discriminates between p53 which

shows DNA binding activity and variants of p53 that cannot bind or weakly bind to

DNA. As in the case of most proteins, the function of p53 is closely related to its

stability. Therefore our screen provides a simple and quick method to screen for stable

and functional variants of p53.

2.2 Introduction

Proteins play a central role in almost every biological process including signal

transduction, cell cycle regulation, DNA transcription, translation, cell cycle arrest,

apoptosis, etc. Amazingly, their large diversity is derived from varying just twenty amino

acids. The functional diversity of proteins is a result of the different chemical properties

of these twenty amino acids and their architecture in the native state. The wide

involvement of proteins in various cellular processes also means that mutations that

destabilize proteins lead to impaired biological functions and thus lead to various human

diseases. Cystic fibrosis, sickle-cell anemia, Alzheimer’s, heart diseases etc. are examples

of human diseases caused by functionally impaired proteins that are destabilized due to

mutations. One of the devastating consequences of impaired function of mutant proteins

is cancer. Census reports from the American Cancer Society show that about a million

new cases of cancer were diagnosed in patients over the last 10 years and more than half

a million fatalities were reported to be due to cancer in the past years in the United States

alone.133 Understanding the correlation between protein sequence and stability, structure and function is therefore of pivotal importance to human health.

Carcinogenesis is caused by the impaired balance between cell proliferation and

apoptosis. Human p53 was identified as a tumor suppressor protein as early as 1992.134

P53 is a sequence specific transcription activation factor which is at the hub of a network

of signaling pathways and its function includes the direct or indirect activation of

multiple genes involved in cell cycle arrest, apoptosis, cell adhesion etc.135 It can cause

temporary cell cycle arrest and apoptosis in response to carcinogenic cell stress. It is found mutated in more than 50% of human cancers, with complex functional consequences including specificity of mutation to cancer prognosis and drug response.48

P53 is a multidomain protein containing an N-terminal transactivation domain, a DNA

binding core domain, a tetramerization domain and a C-terminal regulatory domain. A

proline rich domain is sandwiched between the transactivation and the DNA binding

domains.136 Proteolytic digestion experiments have shown that the sequence specific

DNA binding core domain coincides with the major hot spots for oncogenic mutations

and helps us understand why cancer derived mutants are defective in DNA binding.137

Wild type p53 is tightly regulated at low levels under normal cellular conditions and is a

marginally stabilized protein under physiological conditions. Mutant p53 falls outside of

this feedback loop, and accumulates in cancerous cells. Consequently, the activity of p53

relies on its intact conformation which is disrupted even by single amino acid mutations.

Many of these mutations are simply destabilizing and the reduced cellular levels lead to

the loss of activity.48 These results reinforce the fact that identifying stable variants or stability determinants of p53 is valuable to information-based drug discovery.

The sequence-structure-function relationship of proteins has been among the most elusive

concepts to understand. One of the most successful approaches to study this relationship

is the use of combinatorial methods to study the tolerance of sequence changes to

structure and stability in a given backbone structure, and this inverse protein folding approach has been applied to a variety of proteins including ROP,138 T4 lysozyme,139 GB1,31; 34 ubiquitin140 and others. High-throughput analyses of large numbers of variants

using combinatorial experiments rely on linking the phenotype to its genotype. Therefore

an essential tool for such experiments is the development of a screen which reports on the

function of the protein of interest (POI).

P53, which has been named the ‘guardian of the genome’, is a physiologically important protein that has a β-immunoglobulin structure. Studies show that most of the

cancer-associated missense mutations render the p53 non-functional due to the reduced

stability of the protein. Therefore stabilizing the protein can lead to rescue of its function. Small-molecule rescue of a particular destabilized variant, Y220C, has recently been achieved using in silico screening and docking to the available crystal structure of the mutant.87 PhiKan083 binds to the hydrophobic pocket generated by the mutation and leads to improved thermal stability of the protein.89 This was found to be sufficient to reinstate the DNA binding of the mutant. Attempts to rescue the function of this protein target its DNA binding properties as well as its stability. Molecules like PRIMA-1 and PRIMA-1MET rescue transcriptional activation by preventing the sequestering of the various folding intermediates in local minima in the folding funnel.141

The effect of mutations on the residues in its β-sheet core domain varies, depending on the hydrogen bonding pattern of that particular residue. Also, long-range interactions are more prevalent in β-sheet proteins than in α-helical proteins. Such differences have made it quite challenging to predict the effect of sequence changes on stability.

This in turn has led to fewer studies on β-sheet proteins than on α-helical proteins. Most of the drugs that have been developed are based on rational design, deriving information from the known scaffolds that can bind to p53. Understanding the tolerance of this protein to sequence changes and the

consequences of the various changes to stability and function of the protein will allow us

to gain insight into the structure-function relationship of this protein and to use the

information for targeted drug design. Combinatorial methods using phage or bacteria allow the screening of ~10^9 variants at a time, and this can significantly speed up the process of drug discovery. Therefore it is advantageous to develop a screen as a tool to

identify stable functional variants of p53 and identify drug targets to rescue known cancer

mutations of p53. The in vivo approach will aid in understanding the determinants of

protein stability under conditions somewhat native to the cell. Many factors like the effect

of molecular crowding, chaperones and proteolytic degradation are masked when the

proteins are studied in vitro.132 It has been shown that the stability of p53 parallels the stability of its DNA binding core domain (p53C) and therefore a screen that can

identify stable variants of p53C can also be used as a tool to screen for therapeutic agents

for known p53 mutations.142

A few cell based assays mainly based on yeast one-hybrid systems have been developed

to study the effect of various mutations in this protein.143 These rely on the expression of

a gene of interest from a promoter modified to contain p53 response elements (REs). The transcription of these genes is effected only in the presence of a functional p53.

Selections based on this principle have auxotrophic marker genes downstream of this p53 responsive promoter whereas screens contain quantifiable genes such as ADE, luciferase,

Gal4, GFP and others downstream of the p53 dependent promoter. Selections monitor cell survival in selective media lacking the respective component such as uracil, histidine

and others. There has been an array of modifications to the original version of the system,

FASAY,144 to study the effect of mutations on p53-DNA interactions, in a region-specific

manner, expression-dependent manner and promoter-dependent manner. The system has

also been adapted to study the interaction of p53 with other proteins. The use of a eukaryotic system accommodates the various post-translational mechanisms by which p53 is regulated. One of the significant requirements of yeast hybrid screening systems is the need for nuclear import of the protein for the assay to work. Also, the outcome of these assays is a complex function of other protein-protein interactions in addition to the stability and DNA binding efficiency of p53. Therefore it is challenging to decipher the specific effect of any particular mutation on the structure and function of the protein. The library sizes that can be studied using yeast-based methods are also limited by the low transformation efficiency of yeast strains in general. Recently a bacterial screening

system, based on the solubility of the mutants reported by a C-terminal eGFP fusion has

also been reported. However, mutations may alter the conformation of the protein, and

since solubility does not demand the native conformation of the protein, additional

experiments are required to confirm the function of the protein. Here we report a robust,

in vivo bacterial screen that directly measures the DNA binding function of native and

mutant p53.

2.3 A functional screen for p53 core domain

The screen is based on the transcription interference of GFPuv (Green Fluorescent

Protein which contains the mutations F99S, M153T and V163A, also known as the ‘cycle

3’ variant of GFP145) gene in a p53 dependent manner. The overall scheme of the screen

is shown in Figure 13. The native function of p53 is to act as a transcription activation

factor that triggers downstream targets leading to cell cycle arrest, apoptosis, senescence

or cellular death. The response evoked by p53 is found to depend on its binding efficacy

to its target.

Figure 13: A schematic representation of the screen. The operator region of the lac promoter upstream of GFPuv is modified to bind p53. If the p53 variant expressed from a different plasmid is well folded and functional, it binds to its DNA Binding Domain (DBD) and inhibits the transcription of GFP, resulting in low or no cellular fluorescence. If the p53 variant is misfolded and non-functional, it fails to bind to the DBD and can result in high cellular fluorescence.

P53 is reported to bind DNA that contains an inverted repeat of PuPuPuC(A/T)(A/T)GPyPyPy,91; 101 where Pu is a purine and Py is a

pyrimidine. The repeat of the consensus sequence can be separated by 2-13 base pairs,

although a majority of the natural binding sequences are tandem repeats with no base

pairs separating them.91; 146 We decided to exploit the DNA binding function of p53 to

generate a simple phenotypic cellular screen. We modified the operator region of a lac

promoter which encodes GFPuv to contain the consensus DNA binding site for p53. The

lac operon system is one of the most studied genetic systems. In its native state, the lac system encodes LacI, a repressor of the lac operon. In the absence of lactose, LacI binds to the operator region and inhibits the transcription of the downstream genes lacZ, lacY and lacA. When lactose is present, its metabolite allolactose binds to LacI and releases it from the operator, allowing the RNA polymerase to transcribe the downstream genes. Bacterial one- and two-hybrid systems have utilized the properties of the lac operator to study protein-DNA and protein-protein interactions.147 We have modified the lac operator to bind p53 so that bound p53 blocks the RNA polymerase and effectively causes transcriptional interference. The

details of the various modifications to the operon and the plasmids are described later in

this chapter. If the p53 variants are well folded and can bind to the consensus DNA binding site, the transcription of the downstream GFPuv will be reduced, leading to cells with low fluorescence. On the other hand, when the p53 variant expressed is not wt-like, it will not bind to the DNA, leading to strongly fluorescent cells. The screen that we have developed can thus be called a negative phenotypic screen.
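The consensus described above (two copies of PuPuPuC(A/T)(A/T)GPyPyPy, either contiguous or separated by a short spacer) can be expressed as a regular expression. The following is a minimal sketch rather than the procedure used in this work: the function name is hypothetical, the test sequence is constructed for illustration and is not one of the BD1 to BD3 sequences, and the spacer range of 0 to 13 bp is chosen to cover both the contiguous case and the spaced case mentioned in the text.

```python
import re

# One half-site of the p53 consensus: Pu Pu Pu C (A/T) (A/T) G Py Py Py,
# where Pu = A or G and Py = C or T.
HALF_SITE = "[AG]{3}C[AT]{2}G[CT]{3}"

# Two half-sites separated by a spacer of 0 to 13 bp (assumed range for this
# sketch). The spacer is matched lazily so the shortest spacer is reported.
FULL_SITE = re.compile(f"({HALF_SITE})([ACGT]{{0,13}}?)({HALF_SITE})")


def find_p53_sites(dna: str) -> list:
    """Return (start, half_site_1, spacer, half_site_2) for each consensus match.

    Matches are non-overlapping and are searched on the given strand only.
    """
    return [
        (m.start(), m.group(1), m.group(2), m.group(3))
        for m in FULL_SITE.finditer(dna.upper())
    ]


# Constructed example: two identical consensus half-sites with no spacer,
# flanked by two arbitrary bases on each side.
example_hits = find_p53_sites("ttAGACATGCCTAGACATGCCTgg")
```

A candidate operator sequence that returns an empty list does not match the consensus on the searched strand; checking the reverse complement as well would be a natural extension.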

We chose a plasmid pGFPuv, which encodes GFPuv under the control of a lac promoter as the reporter plasmid. pGFPuv has a ColE1 origin and an ampicillin resistance marker.

Figure 14: Schematic representation of the screen

a) Plasmid maps of pACBAD-p53 and pGFPuv-p53bs (binding site) constructed for the screening of functional p53 variants. The pACBAD-p53 plasmid expresses p53 variants, which can interact with the pGFPuv-p53bs plasmid, whose modified lac promoter allows p53 variants to bind.

The presence of GFPuv allows us to decipher the effects of binding from cellular fluorescence and thus provides a quick phenotypic screen. In order to express GFPuv in a

p53-dependent manner, we modified the promoter encoding GFPuv to contain an

artificial p53-responsive operon. The original lac promoter was modified to contain known p53 binding elements at the +1 site of the operator, resulting in the plasmid that we call pGFPuv-BDx, where x indicates the identity of the binding sequence used. Three different p53 binding sequences derived from the literature were used separately to modify the operator so that we could choose the optimal operator sequence. Ideally, this binding sequence allows uninterrupted and robust transcription of GFPuv in the absence of any p53 and expresses GFPuv in a p53-dependent manner in the presence of p53. Two of the three variants of the binding sequence (BD1, BD2) were derived from a sequence previously used by Sakaguchi et al. in a decoy experiment for p53 binding.148 We used this sequence as is and as a tandem repeat to yield two variants with which we modified the operator sequence. The third sequence (BD3) was reported by Kern et al., who used methylation interference and immunoprecipitation assays to decipher it. This sequence is also the first reported natural binding sequence for p53.100 The various sequences are depicted in Figure 14. Oligonucleotides encoding these sequences, flanked by sequences complementary to the parent vector, pGFPuv, were used in a PCR reaction to replace the original lac operator sequences with the p53 REs. Following this, overlap extension PCR was used to generate sequences with appropriate restriction endonuclease sites, and the final 750 bp fragments were ligated between the AlwNI and HindIII sites in the pGFPuv vector to yield pGFPuv-BD1, pGFPuv-BD2 and pGFPuv-BD3.

For the expression system, we constructed a plasmid which has an orthogonal origin of

replication and resistance marker compared to the reporter system. We also engineered an

arabinose promoter in this vector so that the expression levels of the p53 variants can be

tightly regulated. This vector was generated from the commercially available vector pACYC177 and from pUCBADGFPuv. The p15A origin and the kanamycin resistance gene were amplified using oligonucleotides which encoded BglII at the 5’ end and NotI at the 3’ end. These elements were chosen so that the vector can be co-maintained with pGFPuv, which has a ColE1 origin and encodes ampicillin resistance.149 Using oligonucleotides which encoded the same restriction endonuclease sites, we also amplified the arabinose promoter region and the GFPuv gene from the pUCBADGFPuv vector. These PCR products were ligated to generate the pACBADGFPuv vector. The GFPuv in this case serves as a “stuffer” gene in the multiple cloning site of the pACBAD vector, which aids in the quick identification of successful ligation reactions when it is replaced with p53 variants. P53 “Quad”, which is a stable, engineered variant of human p53 core domain, was used as the “wild type” for the screen.79 Since WT p53 is only marginally stable at room temperature (rt), in vitro characterization of the WT has proved to be challenging. Using Quad as the “wild type”

will allow us to perform in vitro characterizations of the different variants that will result

from our screening studies. P53-Quad is expressed under the control of an arabinose

promoter which is tightly regulated. The details of the vectors are shown in Figure 14.
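The requirement that the reporter and expression plasmids carry different origins of replication and different resistance markers can be written as a simple check. This is a simplified sketch with hypothetical class and function names: it compares origin names directly, whereas a real compatibility check would map origins to incompatibility groups (for example, ColE1 and pMB1 origins belong to the same group and cannot be stably co-maintained).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plasmid:
    """Minimal plasmid description used only for this illustration."""
    name: str
    origin: str   # origin of replication
    marker: str   # antibiotic resistance marker


def can_co_maintain(a: Plasmid, b: Plasmid) -> bool:
    """True if the two plasmids differ in both origin and selection marker.

    Simplification: origins are compared by name, not by incompatibility group.
    """
    return a.origin != b.origin and a.marker != b.marker


# Values taken from the text: reporter on ColE1/ampicillin,
# expression vector on p15A/kanamycin.
reporter = Plasmid("pGFPuv-BDx", origin="ColE1", marker="ampicillin")
expression = Plasmid("pACBAD-p53", origin="p15A", marker="kanamycin")
compatible = can_co_maintain(reporter, expression)
```

With both antibiotics in the medium, only cells that retain both plasmids survive, which is what allows the p53 variant and the reporter to be assayed in the same cell.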

We generated a screening strain by transforming pGFPuv-BDx into DH10B E. coli, and competent cells of this strain were transformed with pACBAD-p53. The level of GFPuv fluorescence will depend on the efficacy with which the expressed p53 variants are able to bind to the consensus DNA. In other words, well-folded p53 variants will lead to cells that are lower in fluorescence, and mutants which are not well folded will lead to fluorescent cells. All three of these modified lac operons showed some response to the presence of p53, and the resulting phenotype reflected lowered amounts of GFPuv.

The Kern sequence showed the maximum p53-dependent cellular fluorescence.

2.3.1 Optimization of Growth Conditions

The growth temperature and the arabinose concentration required for an optimal difference between p53 Quad and a linker used as a negative control were explored. The optimum temperature is one that does not destabilize the expressed p53 variants and allows for the efficient formation of the GFP chromophore. The arabinose concentration, in turn, sets the amount of p53 variant expressed, which must be sufficient for maximum transcriptional interference of the GFP gene. Cells were grown at 37 °C and at lower temperatures such as 30 °C and rt. Incubation times were varied, and a range of conditions such as 12 h at 30 °C followed by incubation at 4 °C for 12 h were also tested. Optimal results were observed for plates incubated at 30 °C for 48 hours. Under these conditions the difference in the fluorescence levels between the cells that contain the Quad mutant and the cells not expressing any p53 was maximal.

The fluorescence of cells with pGFPuv in the presence of Quad was comparable to that of cells that did not express any p53.

The concentration of arabinose was varied from 0.0005% to 0.2%. Specifically, the concentrations of arabinose tested were 0.0005%, 0.002%, 0.005%, 0.02%, 0.05%, and 0.2%. When cells were grown at 30 °C for 48 h, 0.005% arabinose showed the maximum difference between the positive and the negative phenotypes (Figure 15).

Based on these observations, routine screenings were done on freshly made plates with 0.005% arabinose, and the plates were incubated at 30 °C for 48 hours.

To quantify the difference in fluorescence between the negatives and the positive, whole-cell fluorescence was measured from normalized amounts of cells. For this, cells were grown at 30 °C overnight. The amount of cells was normalized based on OD600, and equal amounts of cells were pelleted and resuspended in sodium phosphate buffer, pH 7.2. Fluorescence was measured with excitation at 395 nm and emission at 509 nm, which monitors GFPuv. The cells needed to be induced with higher amounts of arabinose (0.1% versus 0.005%) to collect data above the noise level. This might be due to the different expression levels from the two plasmids: GFPuv, which is expressed from a ColE1-origin vector, may be expressed at higher levels than p53, which is expressed from the low-copy p15A-origin vector. In spite of the additional arabinose, the data collected from these measurements were not reproducible between trials. Extensive optimization may yield robust fluorescence data in liquid cultures, but since quantitating the fluorescence levels was not a requirement for us, we did not pursue this further. A sample of the data collected from liquid cultures is shown in Figure 15.
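The OD600-based normalization described above can be sketched as a small calculation. This is only an illustration: the function name, the target amount of cells (expressed in OD600 × mL units) and the example OD600 readings are hypothetical and are not values from this work.

```python
def volume_for_equal_cells(od600_by_sample, target_od_ml=1.0):
    """Return the culture volume (mL) to pellet from each sample so that
    every pellet contains the same amount of cells, expressed as
    OD600 x mL. `target_od_ml` is an assumed, illustrative value."""
    volumes = {}
    for name, od in od600_by_sample.items():
        if od <= 0:
            raise ValueError(f"OD600 for {name!r} must be positive")
        # volume x OD600 is held constant across samples
        volumes[name] = target_od_ml / od
    return volumes


if __name__ == "__main__":
    # Hypothetical overnight OD600 readings, for illustration only.
    readings = {"Quad": 2.0, "linker": 4.0}
    for sample, vol in volume_for_equal_cells(readings).items():
        print(f"{sample}: pellet {vol:.2f} mL")
```

Each pellet would then be resuspended in the same volume of buffer before the fluorescence reading, so that differences in signal are not simply differences in cell density.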

Figure 15: Initial optimizations of the screen. a) Co-transformation of the Quad mutant with pGFPuv-BD3, in comparison to pACYC177, a non-fluorescent vector, and pGFPuv-BD1 co-transformed with an empty vector (pACT7lacCAM). b) Fluorescence levels of normalized amounts of cells grown in liquid media. Co-transformation of Quad with pGFPuv-BD1 is compared with pGFPuv-BD1 in isolation and with pGFPuv-BD1 transformed together with a pAC-linker vector.

2.4 Proof of principle using known hotspot mutant

Now that we had a system that gave good screening results with the wt-like Quad mutant of p53, we wished to establish the dynamic range of the screen. Four known physiologically relevant mutants of p53 (V143A, R175H, R249S and R273H) were constructed by overlap extension PCR,150 using p53 Quad as the template. V143A is a temperature-sensitive mutant that is inactive at temperatures above 32 °C.84 R175H and R249S are structural mutants that destabilize the core domain to different extents, and R273H is a contact mutation, which does not destabilize the core but, since R273 is close to the Zn2+ binding site, destroys the core domain-DNA interaction.87 Their differences in stability86 and function allow us to establish the ability of our screen to report on functional, folded core variants. The mutants used as negative controls are summarized in Figure 16. The dynamic range of the screen was demonstrated by the different phenotypes obtained for the different mutants, as shown in Figure 17. A low fluorescence level was obtained for the wt-like Quad mutant, showing that it is the transcriptional activity that is compromised in the modified lac operon. Intermediate fluorescence levels were obtained for co-transformation with the temperature-sensitive V143A and with R249S. The most common mutant, R175H, and the contact mutant R273H gave the highest fluorescence. The optimizations of temperature and arabinose concentration detailed above were repeated for the mutants and Quad, and the comparison showed that cells grown under the previously optimized conditions (48 h at 30 °C in media containing 0.005% arabinose) showed the maximum difference in fluorescence between Quad and the various mutants. However, the fluorescence intensities in the presence of the mutants were not as high as desirable, although in the presence of Quad the cells were as dim as in the absence of any GFP.

Figure 16: Mutant properties. a) The four mutants selected as negatives (R175H, R273H, R249S and V143A) are shown as sticks on the structure. Picture made from the structure of Quad (PDB 1UOL) using PyMOL. b) Categorization of the mutants and the effect of each mutation on stability.

Figure 17: Optimization of conditions for screening. The plates were screened at different concentrations of arabinose (indicated by the percentages in yellow on each plate). The four mutants and Quad, which was used as the WT, were subjected to the optimizations. Plates screened at 0.005% arabinose showed the maximum discrimination between the mutants and Quad.

2.5 P53 responsive lac operon with robust transcription using a combinatorial library

We wanted to further improve the difference in fluorescence levels that reports the binding of p53. Using the literature-reported sequences to modify the lac operon, we succeeded in making the transcription of GFPuv p53-dependent, but with an overall reduction in the transcriptional efficiency of GFPuv even in the absence of p53.

Figure 18: Selection of variants from the binding domain library. (a, b) Co-transformations of the binding domain library with the Quad mutant, plated on media with arabinose (Quad over-expressed) and without arabinose (Quad not expressed), respectively. Shown in the orange square is an example of a colony that exhibited low cellular fluorescence in the presence of p53-Quad and higher cellular fluorescence in its absence. (c) The sequences of twelve such colonies chosen from the DNA binding domain library. (d) The twelve colonies showing efficient expression of GFP after separation of the plasmid expressing p53 by digestion. Pictures of the plates were taken under a UV illuminator using a long-wave UV lamp (365 nm).

We

aimed to improve the transcription efficiency from this artificial p53-responsive lac operon while preserving the ability of the p53 core domain to bind to it. To do this, we replaced the operator with a library of p53-binding DNA sequences based on the reported101 consensus binding sequence, PuPuPuCWWGPyPyPy, in tandem repeat. Degenerate oligos encoding (RRRCWWGYYY)2, flanked by sequences complementary to the pGFPuv plasmid, were PCR amplified using pGFPuv as the template. This library has 65,536 possible variants (two possibilities at each of 16 positions gives 2^16 variants), and a library of ~10^5 members was generated. We inverted the scheme of transformation in this case in order to achieve maximum transformation efficiency. We generated a strain containing pACBADp53-Quad, and competent cells of this strain were transformed with the pGFPuv-BD library. Again, we wanted to choose the optimal variants, i.e., those that gave maximum cellular fluorescence in the absence of Quad and minimum fluorescence in the presence of Quad. Since p53 is expressed under the control of the highly tunable arabinose promoter, we were able to simulate these conditions simply by growing the cells in the presence and absence of arabinose.

We used replica plating to spatially identify the optimal library members. Cells containing both pGFPuv-BDlib and pACBAD-p53 Quad were initially plated on solid media containing ampicillin and kanamycin. This represents a system in which the Quad variant is not expressed. Overnight cultures were carefully transferred, using nitrocellulose membranes, to solid media containing ampicillin, kanamycin and 2% arabinose, where the Quad variant is overexpressed, and both plates were further incubated at 30 °C. The plates were analyzed for colonies with diminished fluorescence in the presence of arabinose, and these were considered optimal. Twelve such colonies were isolated and analyzed. The colonies that maintained excellent expression of GFPuv upon clearing the p53-expressing plasmid were considered good candidates for the reporter plasmid of the screen. The pACBADp53-Quad plasmid was cleared by digesting plasmid DNA from the co-transformed cells with restriction endonucleases that cut only the p53 expression plasmid, followed by re-transformation. The efficiency of clearing the plasmid was confirmed by the absence of growth on media containing kanamycin. The fluorescence levels of these plasmids are shown in Figure 18. Four of these were then tested for their efficacy in differentiating between the Quad variant and the hotspot mutants under the previously optimized conditions. The one that exhibited maximum discrimination between Quad and the mutants was used in further screening experiments. The sequence of this DNA binding domain was found to be “GAACTTGCCCGGGCTTGCCC”. We compared this sequence with the p21 sequence, GAACATGTCCCAACATGTTG, which is a known tight-binding sequence for p53. A significant difference lies at the CATG stretch sandwiched between the purines and the pyrimidines. Most of the known binding sequences of p53 exhibit this CATG pattern, whereas the sequence resulting from our screen has CTTG at this position. This may have been important for the transcriptional activity of the lac promoter. A comparison of the colonies that resulted from the transformation of Quad and the various negatives with the library-derived binding domain and the Kern sequence is shown in Figure 19.
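The library arithmetic and the sequence comparison above can be checked with a short script. This is an illustrative sketch, not part of the experimental workflow; the function names are our own, and only the consensus pattern and the two 20-mer sequences are taken from the text.

```python
# IUPAC degenerate codes used in the consensus (R = purine,
# Y = pyrimidine, W = A or T), plus the four unambiguous bases.
IUPAC = {"R": "AG", "Y": "CT", "W": "AT",
         "A": "A", "C": "C", "G": "G", "T": "T"}

CONSENSUS = "RRRCWWGYYY" * 2          # (RRRCWWGYYY)2
SELECTED = "GAACTTGCCCGGGCTTGCCC"     # library member chosen in the screen
P21 = "GAACATGTCCCAACATGTTG"          # p21 site quoted in the text


def library_size(pattern):
    """Number of distinct sequences encoded by a degenerate pattern."""
    size = 1
    for symbol in pattern:
        size *= len(IUPAC[symbol])
    return size


def matches(seq, pattern=CONSENSUS):
    """True if every base of seq is allowed by the pattern at that position."""
    seq = seq.upper()
    return len(seq) == len(pattern) and all(
        base in IUPAC[symbol] for base, symbol in zip(seq, pattern))


def mismatch_positions(a, b):
    """1-based positions at which two equal-length sequences differ."""
    return [i + 1 for i, (x, y) in enumerate(zip(a, b)) if x != y]


if __name__ == "__main__":
    print("possible variants:", library_size(CONSENSUS))
    print("selected site fits consensus:", matches(SELECTED))
    print("positions differing from p21:", mismatch_positions(SELECTED, P21))
```

Positions 5 and 15 in the mismatch list correspond to the T that replaces A in each CWWG core (CTTG in the selected site versus CATG in p21).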

Figure 19: Screen with improved dynamic range.

A comparison of the cellular fluorescence levels when the hotspot mutants were transformed with the library member (Binding Domain 1, BD1) on the left and the p53 DNA binding sequence reported by Kern et al. on the right. The different variants are indicated in yellow on the plate.

2.6 Discussion

We have successfully generated a high-throughput system that screens for the function of the p53 core domain. V143A, which is a thermosensitive variant, is only slightly fluorescent when screened at 30 °C, whereas R175H, one of the most prominent mutations in cancer, gives the maximum fluorescence. The other structural mutants fall between these two mutations on the scale of fluorescence. Our screen is based on the sequence-specific DNA binding property of p53. We have engineered the lac operon so that the downstream GFP gene is transcribed in a p53-dependent manner. Therefore the level of GFP expression reports directly on the function of p53, which in turn indicates the structural integrity of p53. When we used the literature-known p53 binding sequences to modify the lac promoter, the Quad mutant bound strongly to the operator modified with the Kern sequence, resulting in very low cellular fluorescence. But when screened with the negatives, or even in the absence of any p53, the cellular fluorescence was lower than that of unmodified pGFPuv. This showed that although the Kern sequence exhibited optimal binding to the p53 variants, it compromised the transcriptional activity of the lac promoter. We were able to consistently observe differences in cellular fluorescence in the presence and absence of p53 variants when screening on plates. The same differences were not reflected when the cells were grown in liquid media. In these experiments, although Quad gave the lowest fluorescence, the trend in fluorescence levels across the different mutants was not captured. Expression of the p53 variants needed to be induced with higher amounts of arabinose (0.1%) in the liquid culture experiments to obtain reproducible data for Quad. But at this higher expression level of p53, the small differences in fluorescence between the different hotspot mutations fell below the noise level of these experiments.

We replaced the operator sequence with a library of DNA binding sequences based on the consensus binding sequence for p53 and simultaneously optimized for transcriptional activity and p53 binding. Using replica plating, we compared the same cells grown in the presence and absence of Quad. Three different phenotypes were observed for the resulting cells. In the first, the cells were fluorescent irrespective of the presence of Quad. These cells showed robust transcription from the modified lac promoter, but it was no longer sensitive to p53 binding. In the second, the cells remained non-fluorescent whether Quad was expressed or not. This clearly indicated compromised transcriptional ability. The third category of cells showed different levels of fluorescence in response to Quad. This indicated that the promoter in these cells was able to drive transcription of GFP in the absence of Quad and was blocked in the presence of Quad, and thus was optimized for both transcriptional activity and p53 binding. The conditions under which the cells were screened, namely temperature and arabinose concentration, were also optimized. The growth temperature has implications for the stability of the various p53 mutants studied, in addition to the chromophore maturation of GFP. Varying the concentration of arabinose allowed us to tightly regulate the expression of the p53 variants.
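The three replica-plating phenotypes described above amount to a simple decision rule on a pair of fluorescence observations per colony. The sketch below is only an illustration of that logic: in this work colonies were judged by eye under UV light, so the numeric scale, the threshold and the category labels are hypothetical.

```python
def classify_colony(fluor_without_quad, fluor_with_quad, bright_threshold=0.5):
    """Assign a replica-plated colony to one of the phenotypes described
    in the text. Inputs are relative fluorescence values on an arbitrary
    0-1 scale; `bright_threshold` is an assumed cutoff, not a measured one."""
    bright_off = fluor_without_quad >= bright_threshold  # no arabinose
    bright_on = fluor_with_quad >= bright_threshold      # 2% arabinose
    if bright_off and bright_on:
        return "p53-insensitive"            # transcribes, but Quad cannot block
    if not bright_off and not bright_on:
        return "transcription-compromised"  # dim whether or not Quad is made
    if bright_off and not bright_on:
        return "responsive"                 # desired reporter behaviour
    return "unexpected"                     # dim without Quad, bright with it


if __name__ == "__main__":
    # Hypothetical colonies: (fluorescence without Quad, with Quad)
    for pair in [(0.9, 0.8), (0.1, 0.1), (0.9, 0.1)]:
        print(pair, "->", classify_colony(*pair))
```

Only colonies in the "responsive" category are candidates for the reporter plasmid; the fourth branch is included so that the function is total, and does not correspond to a phenotype reported in the text.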

Screening methods are a valuable tool for high-throughput analysis of proteins and help us decipher the changes that result from mutations. A variety of screening systems have been developed, allowing proteins to be chosen on the basis of properties such as cellular expression, resistance to proteolysis, ligand binding and catalytic activity. Many methods, including phage display, mRNA display, in vitro proteolytic treatment, NMR and mass spectrometry screening, have been used to analyze proteins. In vivo screening offers the advantage of providing native-like conditions for the selections. The effective concentration of the POI is maintained, and other cellular factors such as molecular crowding and the presence of chaperones are an added advantage of in vivo systems. Yeast, bacteria and phage are typically used as host systems for such screens. Each of these offers unique advantages and has different disadvantages. The eukaryotic machinery in yeast allows the POI to undergo post-translational modification, which matters for proteins whose function depends on such modification. The occurrence of false positives is a major drawback of this system. A further disadvantage of yeast n-hybrid systems is that the POI needs to be imported into the nucleus. Nuclear import, especially of foreign proteins, is often inefficient, and in some cases non-specific interactions lead to false positives. In addition, various other protein-protein interactions in yeast might lead to the positive phenotype. The library size that can be screened using the yeast system is limited by the low transformation efficiency of yeast. Phage display, usually as a pVIII fusion, is another widely used system to screen for various properties, including ligand binding and resistance to proteolysis. The latter offers a unique opportunity to screen for proteins based only on fitness.

This screen that we have developed will allow us to study the effect of mutations on the structure of the protein and provides us with a high throughput method to study the determinants of stability of this protein. We have utilized the DNA binding property of p53 to develop a functional screen for p53, a β-sheet scaffold. A parallel screening system was developed to study ROP, a model protein which has coiled coil architecture.24

ROP regulates the copy number of ColE1-origin plasmids in bacteria. When GFP is expressed from a ColE1-origin plasmid, the level of GFP is a direct readout of the function of ROP. Structural analysis of the functional variants selected from this screen showed that they can be WT-like or destabilized to the point of being molten globular.25 This wide range of stabilities tolerated by the screen allowed the compilation of a large data set on the core packing propensity of this protein.

Detailed understanding of the sequence-structure relationship of p53 can aid in engineering this protein for pharmaceutical applications in addition to helping us predict the effect of various mutations to the core. As an initial step towards this, we have constructed and analyzed core randomized libraries of this protein. The results obtained from these are discussed in Chapter 3. The application of the screen to stabilization of loop regions by rational design is discussed in Chapter 4. The screen can also be used in

conjunction with small molecules that can rescue the function of the protein. Some

studies towards this goal are discussed in Chapter 6.

2.7 Materials and Methods

2.7.1 Construction of Reporter Plasmid – pGFPuv BDx

The lac operon region of pGFPuv was modified to contain three known binding domains of p53. These literature reported binding domains were constructed by reassembly of synthetic genes followed by amplification using primers containing the respective restriction sites and ligated between AlwNI and HindIII sites in pGFPuv. A part of the vector starting from the PvuII site, 5’ CAGCTGGC ACGACAGGTT TCCCGACTGG

AAAGCGGGCA GTGAGCGCAA CGCAATTAAT GTGAGTTAGC TCACTCATTA

GGCACCCCAG GCTTTACACTTTATGCTTCC GGCTCGTATG TTGTGTG 3’ was amplified using oligos 5’ AATAATCAGCTGGCACGACAGGTTC 3’ at the 5’ end and

5’ CCACACAACATACGAGCCG 3’ at the 3’ end . The region of the lac promoter containing the operator sequence was reassembled from oligos: 5’

CGGCTCGTATGTTGTGTGGTACAGAACATGTCTAAGCATGCTGGGG

TCACACAGGAAACAGCTATGACC 3’ and 5’ ATTATTAAGCTTGGCGTA

ATCATGGTCATAGCTGTTTCCTG 3’ to generate BD1 where the operator, shown

with the bases underlined replaced the original sequence of the operator :

GGCTCGTATG TTGTGTGGAA TTGTGAGCGG ATAACAATTT CACACAGGAA

ACAGCTATGA CCATGATTAC GCCAAGCTT 3’. This reassembly was mixed with

the previous PCR product generated and amplified using 5’ and 3’ primers, 5’

AATAATCAGCTGGCACGACAGGTT 3’ and 5’ ATTATTAAGCTT

GGCGTAATCATG 3’ respectively. The overlapping region for the original PCR product

and the reassembly reaction is shown with dashed underline. Following the same scheme,

BD2 was generated using 5’ CGGCTCGTATGTTGTGTGG TACAGAACATGTCTA

AACATGCTGGGG T ACA GAACATGTCTA AGCATGCTGGGG TCACACAGGA

AACAGCTATGACC 3’ and BD3 was generated using 5’ CGGCTCGTATGTT

GTGTGGCCTTGCCTGGACTTGCCTGG CCTTGCCTTTTCT TCACACAGGAA

ACAGCTATGACC 3’.
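The overlaps that make this assembly work can be verified by taking reverse complements of the oligos listed above. The following is a minimal sketch with our own helper names; the sequences are copied from the text with the line-break spaces removed.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def overlap_length(top, bottom):
    """Longest 3' stretch of `top` that can anneal to the 3' end of
    `bottom` (both written 5'->3'), as needed for reassembly."""
    rc_bottom = reverse_complement(bottom)
    for k in range(min(len(top), len(rc_bottom)), 0, -1):
        if top[-k:] == rc_bottom[:k]:
            return k
    return 0


# 3' primer used to amplify the upstream vector fragment
UPSTREAM_3_PRIMER = "CCACACAACATACGAGCCG"
# The two oligos reassembled to give the BD1 operator region
BD1_OLIGO = ("CGGCTCGTATGTTGTGTGGTACAGAACATGTCTAAGCATGCTGGGG"
             "TCACACAGGAAACAGCTATGACC")
BD1_3_OLIGO = "ATTATTAAGCTTGGCGTAATCATGGTCATAGCTGTTTCCTG"

if __name__ == "__main__":
    shared = reverse_complement(UPSTREAM_3_PRIMER)
    print("upstream product ends with:", shared)
    print("BD1 oligo starts with it:", BD1_OLIGO.startswith(shared))
    print("reassembly overlap (nt):", overlap_length(BD1_OLIGO, BD1_3_OLIGO))
```

The 19-nucleotide stretch shared between the upstream PCR product and the start of the BD1 oligo is what allows the two fragments to be joined in the final amplification.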

Overlap PCR was used to insert an NheI site into the original pGFPuv plasmid to enable direct cloning of the binding domain library. A library of p53 binding sites, based on the consensus binding sequence RRRCWWGYYYRRRCWWGYYY was constructed by thermally balanced inside out (TBIO) PCR of degenerate oligos 151 using the following

oligos: 5’AATAATAATGCTAGCATTA ATGTGAGTTA GCTCACTCAT TAGGCA

CCCCAGGCTTTACA 3’ , 5’ TAGGCACCCCAGGCTTTACACTTTATGCTTCCGG

CTCGTA TGTTGTGTGG 3’, 5’ CATAGCTGTTTCCTGTGTGARRRCWWGYYY

YYYCWWGRRR CCACACAACATACGAGCCGG 3’ , 5’ ATTATTATTAAG

CTTGGCGTAATCATGGTCATAGCTGTTTCCTGTGTGA 3’ . pGFPuv binding

domain library (pGFPuv-BD) was obtained by ligation of the PCR amplified product

between NheI and HindIII sites. About 200 fluorescent colonies were selected from this library to obtain a library of positives. The cells that were fluorescent when grown on plates containing ampicillin were selected using a hand-held UV lamp emitting 365 nm UV radiation.

Figure 20: Sequence of Quad mutant

The DNA sequence of the Quad mutant used is shown in black. The corresponding amino acid sequence in our construct is shown in blue, aligned to the codons. The various hotspot mutations studied are highlighted in red.

2.7.2 Construction of Expression Plasmid – pACBADp53

A kanamycin-resistant plasmid with a p15A origin, encoding the p53 core domain gene under the control of an arabinose promoter, was constructed. The region encoding the p15A origin and kanamycin resistance was PCR amplified from the commercially available vector pACYC177. The region encoding the arabinose promoter and GFP was amplified from pUCBADGFP. The two PCR products were ligated using NotI and BglII sites to obtain pACBADGFP, which serves as a null vector into which p53 variants are cloned. The p53 core domain variants p53-Quad, V143A, R175H, R249S and R273H were cloned into this vector between NdeI and EcoRI sites. The gene for the p53 core domain variant “Quad” was constructed by Stemmer reassembly of synthetic oligonucleotides encoding residues 94-298,152 followed by an amplification reaction using primers containing the required restriction sites, and was cloned into pACADGFPuv between NdeI and EcoRI sites. Four of the known hotspot mutants of p53, V143A, R175H, R249S and R273H, were constructed by overlap PCR using p53 Quad as the template. The DNA sequence of the Quad mutant is shown in Figure 20.

2.7.3 Choosing the Reporter Plasmid from a Library of Positives

Replica plating was used to isolate the pGFP-BD library member that exhibited the maximum difference in fluorescence between the presence and absence of Quad. The pGFP-BD library was transformed into electrocompetent DH10B cells containing p53-Quad. The transformation was initially plated on LB containing kanamycin (kan) and ampicillin (amp). Following 18 h of growth, the colonies were transferred to plates containing LB kan amp and 2% arabinose using nitrocellulose membranes. Both plates were incubated at 37 °C for 18 h. Careful comparison of the plates allowed selection of colonies that gave low or no fluorescence in the presence of Quad (2% arabinose) and high fluorescence in the absence of Quad (no arabinose). Twelve such colonies were selected for further analyses. The plasmid containing Quad was removed by digesting with NcoI and SphI, and the digest was re-transformed. The restriction digest linearizes the digested DNA and therefore reduces its transformation efficiency. The colonies resulting from the re-transformation were grown to saturation and plated on LB amp to analyze the fluorescence in the absence of Quad. The efficiency of plasmid separation was confirmed when no growth was observed for cultures plated on LB kan. Four of the twelve variants (labeled 1, 4, 7 and 12) were co-transformed with the p53 hotspot mutants and, based on the phenotypes of the positive and the negatives, BD1 was chosen to be carried forward in further studies.

Chapter 3: Utilizing the cell-based screen to identify functional p53 mutations

Contributions

The material presented in chapters 2 and 3 will be published as a full paper co-authored

by Brinda Ramasubramanian and Thomas J Magliery. The work summarized in this chapter was produced by the primary author. The experimental design and data analyses were accomplished by the primary and the corresponding authors.

3.1 Summary

We developed a functional screen for the core domain of p53 based on its DNA binding function (Chapter 2). The screen, as discussed previously, utilizes the DNA binding function of the transcription factor p53 to displace the RNA polymerase from a promoter that expresses GFPuv from an engineered lac operon. A non-fluorescent or negative phenotype indicates a functional p53, and a fluorescent phenotype results when the p53 present is inactive. We have used the screen to identify mutations that retain the function of p53 from a library of p53 core variants. We generated three different core-randomized libraries to test the screen. The initial library, in which four hydrophobic residues in the core domain away from the DNA binding region were randomized, had a low frequency of positives. These results indicated that the mutability of these residues is low and/or the stability tolerance of the screen is very narrow. To deconvolute these effects, two smaller libraries, each a subset of the initial library, were generated. Smaller libraries yield a larger fraction of positives that can be analyzed, which in turn allows us to understand the low occurrence of positives in the four-residue library. To this end, I255 and T253 were randomized separately from A161 and A159. The results obtained from screening these smaller libraries indicate that mutations at positions A161 and I255 are destabilizing, and that these positions are considerably less tolerant to mutation than positions A159 and T253. While position A161 is highly selective for small residues, position I255 appears to be dominated by residues with high β-sheet propensity. In addition, positions A159 and T253 are biased towards smaller residues. A set of 8 positives obtained from the IT library was analyzed in vitro for stability and DNA binding ability. The results of the in vitro characterization show that the variants that had a positive phenotype in the screen were well folded and were very close to the parent Quad in terms of stability and function. We hypothesize that improving the tolerance of the screen will prove useful in analyzing a larger number of positives to gain insight into the folding and stability of the p53 core domain. Adjacent and cross-strand alanines have been shown to contribute to the stability of GB1. Therefore this alanine pairing might contribute significantly to packing and stability in the context of the p53 core domain. Similar studies can be done by choosing other core domain residues in this protein and randomizing them to identify the stability determinants of p53. Some of the prospective residues that can be chosen are discussed at the end of this chapter.
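To give a sense of scale for the four-position and two-position libraries mentioned above, the theoretical protein-level diversity can be computed as follows. This is a sketch under a stated assumption: every randomized position is taken to allow all 20 amino acids, which is our simplification and not a description of the codon scheme actually used.

```python
def theoretical_diversity(n_positions, residues_per_position=20):
    """Number of distinct protein sequences when each randomized position
    can take any of `residues_per_position` residues (assumed 20 here)."""
    if n_positions < 0:
        raise ValueError("n_positions must be non-negative")
    return residues_per_position ** n_positions


if __name__ == "__main__":
    print("four-position library:", theoretical_diversity(4))
    print("two-position library:", theoretical_diversity(2))
```

Under this assumption the two-position libraries are several hundred times smaller than the four-position library, which is why a much larger fraction of each can be sampled and analyzed.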

3.2 Significance of studying libraries for core directed design

The hydrophobic core of a protein is a major determinant of its stability.153; 154 It is hypothesized that the degree of solvent exposure of surface-exposed and loop residues remains similar in the native and unfolded states and, therefore, that changes to surface residues are not as destabilizing to the protein as mutations in the core.155; 156 Hence, a major avenue for the design of proteins has been core directed.157; 158 Computational and combinatorial approaches have been utilized to design or re-design the cores of proteins. Well-behaved proteins such as T4 lysozyme, λ repressor, ubiquitin and GB1 have been used as model systems to implement various core-directed design strategies.26

The significance of core packing and its contribution to the stability of the protein has

been extensively studied by the Matthews group using T4 lysozyme as a model system.159

The array of studies on this protein establishes the dominance of the hydrophobic effect. They found that this protein is surprisingly tolerant to various substitutions in the core, which still result in folded, active proteins. In initial studies, replacing an Ile in the core with 13 other residues was found to destabilize the protein, and the extent of destabilization correlated with the increase in solvent-exposed surface area.160 Following this, ‘cavity filling’ and ‘cavity creating’ mutations were engineered into the core using site-directed mutagenesis, and the results established that the effect of substitutions in the core depends on the context of the substitutions, even when the packing of the residue under study is optimum.161; 162 These studies established that although the hydrophobic effect plays a dominant role in the stability of the protein, side chain packing interactions also contribute significantly.

The significance of core packing was examined by the Fersht group using barnase as a

model system.163 All the positions in barnase, a protein that has a 13-residue helix

packing against a 5-strand anti-parallel β-sheet were randomized in 3 stages. Overall, the

protein was tolerant to these changes and a significant percentage of the mutants showed

some activity. The authors suggest that the specificity of the fold encoded in the

hydrophobic core is minimal and therefore the first step in evolution, the achievement of

activity, has a low barrier. Of note here is that the percentage of functional protein

obtained when the six residues on the sheet side were varied was much lower than that

observed when the six positions on the helix side were randomized. Since the sheet-side residues were randomized in stage 2, where the helix-side residues already incorporated changes, this might reflect the large changes to the core; it might also imply lower mutability of the sheet-side residues.

The ground-breaking studies of Woolfson and co-workers utilized phage display to generate large libraries of ubiquitin.164; 165 Their selection was based on the resistance of well-folded proteins to degradation by proteases. The variants of ubiquitin were expressed as fusions to a 6ΧHis tag and were bound to Ni-NTA agarose columns. Phage displaying protease-resistant variants went on to infect bacteria and could thus be amplified to determine their sequences. Multiple rounds of this selection were designed to yield stabilized variants of ubiquitin. This was the first report of a combinatorial selection based purely on the stability of the protein and not linked to its function. Their studies showed that ubiquitin is surprisingly restrictive to sequence changes in its hydrophobic core.

Phage display is a well established technique, and when combined with shotgun scanning methods, provides a powerful tool to analyze large libraries of variants and

screen them, based purely on stability in addition to the ability to screen based on ligand

binding function.166

The Sauer group examined the effect of core packing in λ repressor using cassette mutagenesis. In initial studies that randomized eight positions in the dimerization interface, two of the positions were found to contain most of the information needed for effective dimerization.167 Using the same methodology, the central core of the protein was

repacked and their results show that most of the information for the stability of the

protein is contained in the hydrophobic core of the protein.168 This implies that not all

hydrophobic cores pack to yield active protein, and that, additional steric considerations

that specify the packing do need to be considered. The role of steric compatibility is

illustrated by the observation that the number of functional substitutions at a particular

position is governed by the number of other residues that are allowed to co-vary. Their

studies indicate that the core residues are highly interdependent.

Computational algorithms essentially specify the structural co-ordinates of the target

protein. An energy minimization function, a rotamer library of permissible conformations

of each residue and a search method to find sequences of lowest energy represent the

algorithm.17 The specifications of the inputs are different for different algorithms. In their

pioneering work, Desjarlais and Handel re-packed the hydrophobic core of phage 434-cro

protein using a “custom made” rotamer library generated for the specific backbone of

interest and a genetic algorithm for energy minimization.169 They characterized three of

the designed variants which had 4, 7 and 8 sequence changes from the WT core. Their results indicated that this protein can accommodate such large changes to its core. Also, one of the designed variants resulted in slight stabilization over the WT. Dahiyat and

Mayo implemented a side-chain selection program which explicitly considers the specific packing interaction to choose for optimal side chains and their conformation as their

design criteria.170 By varying the strengths of packing constraints, they were able to

assess both the extent to which such packing constraints are important in computational

protein design and the tolerance of the hydrophobic core to sequence changes. Four

different designed variants were selected for in vitro characterization. Their results

indicate that the packing constraints correlate to the foldedness of the designed cores and

when the packing constraints are relaxed it can yield to highly mobile molten globular

structures and completely disordered chains. The Mayo group has also developed

computational algorithms that can sample a vast combinatorial library of sequences for a

target ββα motif.171 This was one of the pioneering studies on a motif that contained three secondary structure elements: helix, sheet and turn.

These studies emphasize the role of the hydrophobic core in the stability of a protein and

also imply the role of other interactions including steric packing of the side chains to

specify the unique fold and function of each protein. The hydrophilic residues determine

the solubility of a protein and in addition, allow the exclusion of the hydrophobic core

residues from the solvent and may contribute to the stability in this manner. The significant coupling found among the core residues poses a further challenge to the design of proteins. This emphasizes the need for large bodies of empirical data that are accessible

using combinatorial experiments in the context of different proteins to provide maximum

information for design efforts.

Bacterial systems are among the most studied and best understood experimental systems and can be easily manipulated. They have a short doubling time and are therefore ideal for the development of screens. Such screens aid in interrogating large libraries of variants at once to find the ones with the desired properties. Linkage of the genotype to the phenotype allows us to study the effect of mutations in a large number of variants, thus contributing to the repertoire of data that will, in an ideal scenario, allow us to predict a structure from a sequence or predict the effect of point mutations on a particular protein. To that end, we have used a bacterial cell-based screen to analyze core-randomized libraries of the tumor suppressor protein p53. The data collected on this protein will provide a unique opportunity both to understand the sequence-structure-stability relationship of this protein and to utilize stabilizing mutations in the design of novel drugs against cancer.

3.3 Core randomized libraries of p53 DNA binding domain

3.3.1 Four-position Library

Our lab has extensively studied the core packing requirements of the model protein ROP, which forms an anti-parallel four-helix bundle, by generating combinatorial libraries of core-randomized variants of this protein and screening for functional variants using a cell-based screen. Initially, a four-position library was studied in which the residues I15, T19, L41 and A45 in the two central layers of the core were randomized to all hydrophobic amino acids and the alcohols. In vitro characterization of active ROP variants from this library showed that folded variants with a wide range of stabilities were obtained as positives from the screen, which allowed a large number of variants from this library to be studied. We applied a similar approach to the p53 core domain to interrogate the effects that core mutations have on the packing and stability of a β-sheet protein, using the screen to identify stable and functional variants from a library of p53 variants. We chose to randomize four residues in the core of p53, A159, A161, T253 and I255, which lie away from the DNA binding region. The striking similarity between these four residues (AATI) in the p53 core domain and the ITLA residues in the ROP core, whose randomization previously allowed a large number of variants to be studied, lent added credibility to the choice of residues.

Initially, a BsrGI site was engineered into the p53 gene using a silent mutation in the codon at T125. An NNK library that randomizes the positions A159, A161, T253 and I255 to all 20 amino acids was generated using PCR and ligated between the BsrGI and BsaI sites in the p53 gene. In order to screen for active variants, the library was transformed into competent cells of a DH10B strain that already contained the pGFPuv screening vector. This library yielded ~10^6 colonies, indicating a 10-fold coverage of the theoretical library space (20^4 for four-position randomization to all 20 amino acids). Cells were grown at 30 °C for 48 h with 0.005% arabinose. The colonies that appeared as positives (low cellular fluorescence) were grown to saturation, re-plated and grown again under screening conditions. The results from this NNK-4 library of the p53 core domain were confounded in several ways.
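The size of an NNK library can be checked with a short calculation. The sketch below is illustrative only: it enumerates the 32 NNK codons from the standard genetic code and compares the DNA-level and protein-level library sizes with a transformant count of 10^6, the approximate figure quoted above. The function names are hypothetical and are not part of any protocol used in this work.

```python
from itertools import product

# Standard genetic code with bases ordered T, C, A, G at each codon position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def nnk_codons():
    """Return all codons matching NNK (N = any base, K = G or T)."""
    return ["".join(c) for c in product(BASES, BASES, "GT")]


def library_stats(n_positions, n_transformants):
    """Summarize an NNK library randomized at n_positions (hypothetical helper)."""
    codons = nnk_codons()
    n_stop = sum(CODON_TABLE[c] == "*" for c in codons)
    dna_size = len(codons) ** n_positions
    protein_size = 20 ** n_positions
    return {
        "dna_size": dna_size,
        "protein_size": protein_size,
        "dna_coverage": n_transformants / dna_size,
        "protein_coverage": n_transformants / protein_size,
        # Fraction of clones with no stop codon at any randomized position.
        "stop_free_fraction": (1 - n_stop / len(codons)) ** n_positions,
    }


# 10**6 is the approximate colony count quoted in the text.
stats = library_stats(n_positions=4, n_transformants=10**6)
for key, value in stats.items():
    print(f"{key}: {value:.4g}")
```

Under these assumptions, 10^6 transformants cover the 20^4 protein-level space several-fold, whereas the DNA-level NNK space (32^4) is itself close to 10^6. The single stop codon in the NNK set (TAG) also means that a fraction of clones is expected to be truncated even in an error-free library.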

First, the frequency of positives, i.e., cells exhibiting low cellular fluorescence, for this AATI library was found to be about 0.025%. Colonies that exhibited low cellular fluorescence were isolated from solid media and re-streaked to confirm the fluorescence levels. Re-streaking could yield fluorescent cells, indicating that the initially observed non-fluorescent phenotype may have been erroneous, owing to the large number of colonies screened and the differing apparent fluorescence emitted by colonies of different sizes. Each plate contained about 4000 colonies, of which only one or two showed the required phenotype. This is a problem because the overall goal of this library was to collect data on a statistically significant number of variants and thereby contribute to the existing statistical model for the occurrence of stabilizing residues in the core of a predominantly β-sheet protein. The low occurrence of positives precludes this.


About 36 positives obtained from approximately 40 plates were verified by re-streaking and analyzed by DNA sequencing. Only 50% of these positives encoded complete, full-length proteins; the rest of the sequences contained insertions or deletions, leading to nonsense protein sequences. Even among the 19 or so positives that gave complete reads, a quarter had stop codons that lead to truncated proteins. The actual occurrence of positives in the screen is therefore much lower than the observed 0.025%. One of the disadvantages of a screen that is based on a negative phenotype is that any spontaneous mutation in the reporter plasmid will lead to a ‘positive’ phenotype if GFP expression is impaired for any reason other than blockage of transcription by well-folded p53 variants. The extremely low rate of occurrence of positives complicates the picture and makes it difficult to draw conclusions on the structure-function relationship of the protein. When 10 of the naïve sequences were analyzed, only seven of them encoded the full-length protein; deletions and stop codons were also found, in addition to charged residues packed into the core. These observations raise the possibility that the low occurrence of positives from the library may have been due to errors in the library itself. The other possibility is that it is due to the failure of GFP expression. If this were the case, both authentic clones and insertion and deletion variants would appear as positives. Since this is not reflected in the various sequencing results, it appears that the low rate of positives is due at least in part to the errors in the library. It is also possible that the low rate of positives from this library is due to the high stringency of packing at these positions. This needed to be rigorously explored.


Finally, the sequences of the core variants that yielded the positive phenotype were challenging to interpret. These sequences had a high occurrence of proline, which would not be expected to be a stabilizing mutation in the core of any secondary structure. Some of the positives also contained a tryptophan, which is not an isosteric replacement for the original alanine, threonine and isoleucine residues. In addition, the positives that contained tryptophan also had phenylalanine, another large residue. Together, these variants appear to be highly overpacked. A summary of the apparent positive sequences obtained from this library is given in Figure 21. These results required us to perform further experiments to characterize the validity of the screen and the positions that were randomized in the core domain.

Figure 21: Results from AATI library a) Crystal structure of Quad mutant (PDB 1UOL) with A159, A161, T253 and I255, the four residues that were randomized, shown as spheres. The spheres are colored green for carbon, red for oxygen and blue for nitrogen. The picture was rendered using PyMOL (DeLano Scientific). b) Sequences of apparent positives obtained from the AATI library.

Considering the various sequencing results, we concluded that, in addition to the presence of errors in the library, the positions that were randomized to generate this library have a low inherent mutability. Since p53 is only marginally stable under physiological conditions, it is conceivable that additional mutations in the core further destabilize the protein, leading to the low observed occurrence of stable, functional variants in the screen. Alignment of the sequences of the p53 core domain in the Pfam database provides further evidence that the residues at these positions are highly conserved, and most of the mutations found there lead to loss of function of the protein, making them disease-causing mutations. To further investigate the mutability of these residues, we generated two smaller sub-libraries. The results obtained from these and further biophysical characterizations provide insight into the properties of the screen and the significance of these residues.

3.3.2 AA and TI Sub-libraries

The low rate of occurrence of positives in the AATI library makes it difficult to analyze

the positives to understand the packing specificity at these positions. Since the rate of

occurrence of positives is low and is similar to the rate of occurrence of false positives, it

is difficult to deconvolute the various effects: packing stringency at these positions, errors

in the library, and errors in GFP transcription, all of which result in low cellular

fluorescence indicating a positive phenotype. To make the analysis of these possibilities

easier, we generated two smaller sub libraries, only randomizing two positions each time.


Each clone in the smaller libraries represents 1 in 400 possibilities, a much higher frequency than 1 in 4000, the observed rate of occurrence of apparent positives in the AATI library. The higher frequency of positives will allow us to analyze a larger number of variants and therefore lead to a better understanding of the importance of these positions.

The residues A159 and A161 were randomized to all 20 amino acids using oligos that encoded degenerate NNK codons at these positions. The library, generated using PCR with Quad as the template, was ligated into the pACBADp53-Quad vector between the BsrGI and BsaI sites. The observed library size was well beyond the theoretical library size of 400 clones (20^2 for randomization of 2 positions to all 20 amino acids). The library was screened for activity by transforming it by electroporation into DH10B cells that already contained the screening plasmid pGFPuv-BD1. The resulting colonies were analyzed for actives based on the cellular fluorescence.

The results from the AA library are summarized in Figure 22. These positions show a striking preference for alanine and cysteine. Position A161 shows high selectivity for small residues. In fact, the only residues that are tolerated at this site are Ala, Cys, Ser, Asn and Val. Ser and Asn are small polar residues and Val is highly favorable in the core of β-sheet proteins, but there is an overwhelming preference for Ala and Cys at this position. Position A159 has a wider distribution of residues in comparison to position A161. A considerable number of larger residues such as Leu, Ile, Val, Met and Thr are found at this position, although Ala is still highly preferred over the other residues. Analysis of the pairwise distribution shows that when any of these large residues occurs at position A159, A161 is restricted to the small residues. For example, when methionine occurs at position 159, position 161 is always an alanine. This implies significant coupling between these residues. Also, A161 packs against I195 in an opposing helix in the crystal structure of Quad. This large residue may also restrict the residues that can be accommodated at A161. The absence of charged residues, large hydrophobic residues, glycine and proline further establishes the packing stringency of these positions. Alanine, which is the WT residue, is one of the most frequent residues, and the AA pairing is also observed at a high frequency. The high frequency of occurrence of cysteine was intriguing, and we decided to characterize the consensus ‘CL’ variant in vitro.

Figure 22: Results from AA library a) Occurrence of positives in the AA library. Codon-adjusted frequency of occurrence for each of the positions in the AA library.
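Because NNK codons encode some amino acids more often than others, raw residue counts are usually corrected for codon redundancy before positions are compared. The sketch below shows one common way of doing this, dividing each observed count by the number of NNK codons for that amino acid and renormalizing. The exact adjustment used for the figures in this chapter is not specified here, so this is an assumption, and the example tallies are hypothetical rather than data from this work.

```python
# Number of NNK codons (32 in total, including one stop) encoding each
# amino acid in the standard genetic code.
NNK_CODON_COUNTS = {
    "A": 2, "G": 2, "P": 2, "T": 2, "V": 2,
    "L": 3, "R": 3, "S": 3,
    "C": 1, "D": 1, "E": 1, "F": 1, "H": 1, "I": 1, "K": 1,
    "M": 1, "N": 1, "Q": 1, "W": 1, "Y": 1,
}


def codon_adjusted_frequencies(observed_counts):
    """Divide each count by its NNK codon multiplicity, then renormalize to 1."""
    weighted = {aa: n / NNK_CODON_COUNTS[aa] for aa, n in observed_counts.items()}
    total = sum(weighted.values())
    return {aa: w / total for aa, w in weighted.items()}


# Hypothetical tallies at one randomized position (illustration only).
example_counts = {"A": 12, "C": 6, "S": 3, "V": 2}
adjusted = codon_adjusted_frequencies(example_counts)
for aa, freq in sorted(adjusted.items(), key=lambda item: -item[1]):
    print(f"{aa}: {freq:.3f}")
```

In this hypothetical example Ala is observed twice as often as Cys, but because Ala has two NNK codons and Cys only one, the two residues end up with the same adjusted frequency.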

The residues T253 and I255 were also randomized to all 20 amino acids using oligos that encoded degenerate NNK codons at these positions. The library was generated using PCR with Quad as the template, with a cloning scheme similar to that of the AA library. The observed library size was well beyond the theoretical library size of 400 clones (20^2 for randomization of 2 positions to all 20 amino acids). The library was transformed and screened for activity as previously described for the AA library.

The observed occurrence of positives in the TI library was close to 13%, which is about

50 of the 400 possibilities, if every clone is valid. We sequenced about 32 active and 64 inactive variants of p53 from this library. The results from the positives in the library are summarized in Figure 23. The occurrence of the various amino acids at these positions indicates that all the amino acids except large hydrophobic residues like Phe, Trp and Tyr, charged residues, Pro and Gly are present in the sequences of the positives. The high flexibility of Gly and the restricted geometry of Pro make these residues non-ideal in the core of the protein. The absence of the large hydrophobic residues among the positives indicates that overpacking these positions leads to destabilization of the protein. The high occurrence of Val and Ile at position I255 indicates that this position is highly selective for residues with high β-sheet propensity. The conformational flexibility of Met may be the determining factor that allows it to occur at position I255 at a relatively high frequency. The distribution of residues at position T253 is broader, allowing for small and polar residues. Thr may make hydrogen bonding interactions in the core with the backbone or with other residues. The higher occurrence of polar residues like Gln and Asn at this position indicates that such hydrogen bonding interactions may be important here, but the higher preference for cysteine indicates that size may be the major determining factor at the T253 site. An analysis of the pairwise distribution of the various amino acids confirms that size and β-sheet propensity are the two major determinants of packing at these positions. The absence of large residues like tryptophan, phenylalanine and tyrosine (a large alcohol) suggests that these may lead to steric clashes in the packing and are therefore poor substitutes for the smaller amino acids Ile and Thr. The overall preference for Val and Ile at both these positions, compared with the lower preference for Leu, supports the preference for residues with high β-sheet propensity. We also see that charged residues such as Asp, Glu, Lys and Arg are not represented in the positives selected from this library, presumably because these are not tolerated within the hydrophobic core of the protein unless the charges are satisfied by a salt bridge.

Figure 23: Positives from TI library a) Positives at each of the positions from the TI library. The frequency of occurrence is adjusted for the codon bias of the NNK library.

If size were the only constraint, we would have expected to find pairs like TI, LT and TL among the positives. Their absence may be due to undersampling, since only 32 positives were sequenced. Considering that 13% of the sequences are positives, 52 clones will be unique positive sequences, and in order to cover all the positives we would need to sequence a 3- to 10-fold excess of that, approximately 150-520 clones. On the other hand, the preference for polar residues at T253 and the bias towards β-branched residues at I255 indicate that the packing is determined by more constraints than just size. In order to analyze this rigorously, a larger number of colonies will need to be sequenced.
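The sampling argument above can be made concrete with a simple estimate. The sketch below assumes, as an idealization, that every unique positive is equally likely to be picked and that clones are drawn independently; real libraries are skewed by codon redundancy, so the numbers are only indicative. The function name is hypothetical.

```python
def expected_unique_seen(n_unique, n_sequenced):
    """Expected number of distinct variants observed after sequencing
    n_sequenced clones drawn uniformly, with replacement, from n_unique variants."""
    return n_unique * (1 - (1 - 1 / n_unique) ** n_sequenced)


n_possible = 400        # 20^2 protein-level variants in a two-position library
positive_rate = 0.13    # observed frequency of positives in the TI library
n_unique_positives = round(n_possible * positive_rate)

# 32 clones were sequenced; 3-fold and 10-fold excesses are the targets in the text.
for n_clones in (32, 3 * n_unique_positives, 10 * n_unique_positives):
    seen = expected_unique_seen(n_unique_positives, n_clones)
    print(f"{n_clones:4d} clones -> about {seen:.1f} of {n_unique_positives} unique positives")
```

Under these assumptions, 32 sequenced positives would be expected to reveal only roughly 24 of the 52 unique positives, whereas a 3-fold excess would reveal about 49, which is consistent with the conclusion that the absence of particular pairs cannot yet be interpreted.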

In the negatives, when we just look at the identity of residues at positions that were

randomized, we can see that all residues seem to be occurring. We need to analyze the

pairwise distribution of the negatives in order for us to draw conclusions on why the pair

was not folded well and therefore was not functional. The frequency of pairwise

distribution of amino acids in the TI library is summarized in Figure 24. These results

make it clear that certain residues occur in the list of negatives because of the residue with which they are paired. For example, isoleucine, which is the native residue, leads to highly fluorescent cells when paired with aspartic acid. Similarly, threonine occurs in the negatives when it is paired with glycine, proline or arginine. As in the case of the positives, not all possible negatives have been explored. Out of the 66 negatives sent for sequencing, 22 did not produce any p53 core domain sequence and the remaining 44 were analyzed.
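A pairwise analysis of this kind amounts to tallying which residue at one position is observed together with which residue at the other. The sketch below is a minimal illustration; the helper names are hypothetical, and the example pairs are invented to mirror the cases mentioned in the text (Ile with Asp; Thr with Gly, Pro or Arg) rather than being the actual sequencing data.

```python
from collections import Counter, defaultdict


def pairwise_counts(pairs):
    """Count how often each (residue at position 253, residue at position 255) pair occurs."""
    return Counter(pairs)


def partners(pairs):
    """Map each residue at the first position to the set of residues seen with it."""
    seen = defaultdict(set)
    for first, second in pairs:
        seen[first].add(second)
    return dict(seen)


# Hypothetical (253, 255) pairs from non-functional clones, for illustration only.
example_negatives = [("D", "I"), ("T", "G"), ("T", "P"), ("T", "R"), ("T", "G")]

counts = pairwise_counts(example_negatives)
for (res_253, res_255), n in counts.most_common():
    print(f"{res_253}253 {res_255}255: {n}")
print(partners(example_negatives))
```

Tabulating pairs rather than single positions is what allows a native residue such as Ile or Thr to be recognized as a 'negative' only in the context of an unfavorable partner.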

These results demonstrate the clear advantage of screening systems in comparison to selections.26 Screens allow the negatives to be analyzed, and we are able to conclude that the screen allows only those variants that are functional to exhibit the right phenotype. The positions T253 and I255 allow for various mutations to small hydrophobic residues, residues with β-branching and small polar residues. The residues I255 and A161 are more restrictive to mutations. At position A161 the sequences are mainly governed by the size of the residues, and position I255 is selective for residues with β-branching. The results obtained from the AATI library indicate that randomizing those four residues leads to destabilization and loss of function in a large population of the resulting mutants.

Figure 24: Negatives from TI library a) Pairwise distribution of negatives from the TI library b) p53 Quad mutant (PDB 1UOL) with I255 and T253 highlighted as sticks enveloped by transparent spheres. The image was rendered using PyMOL.

The sequences of the positives from the TI and AA libraries show that two of the positions are more restrictive than the others, and the results from the AA library also show that these positions contribute to the observed low number of positives in the AATI library.

Previous studies show that adjacent alanine residues in the core of a β-sheet structure

have energetic contributions to the structure although alanine is not considered to have high β-sheet propensity.172 Disrupting alanine-alanine pairing in the same chain has been

shown to have energetic consequences when occurring in the core of a β-sheet protein.

3.4 In vitro characterization of library variants

The ultimate proof of the validity of the screen can come only from the in vitro

characterizations of the different positives resulting from the screen. We have

characterized a few of the interesting variants using two standard methods for biophysical

characterization: urea-mediated chemical denaturation and thermal denaturation. Nine variants that were positives in the screen, eight from the TI library and one from the AA library, were selected for characterization based on their sequences (Figure 25). Variants with polar amino acids packed against hydrophobic amino acids (such as N253 I255 and S253 V255) as well as all-hydrophobic variants (such as V253 V255 and L253 V255) were chosen. These variants were amplified from the pACBADp53 vector using PCR and ligated into the pET11a vector between the NdeI and BamHI sites. The variants can be overexpressed under the control of a T7 promoter from the pET11a vector when transformed into cells that have the T7 polymerase gene lysogenized into them. We used C41 (DE3), a mutant strain of BL21 (DE3), for the expression of all the variants. This strain has been reported to allow the expression of proteins that are toxic to the cells and has also been used previously for the expression of p53 variants.50; 71

Figure 25: Actives from T253 I255 library a) The sequences at positions T253 and I255 for the actives b) Screening results for the various actives chosen from the TI library (plate sectors labeled Quad, R175H and 1-8).

The L159 C161 variant from the AA library and the L253 V255 variant from the TI library failed to express, and initial attempts to purify these proteins did not yield the concentrations required for characterization. This may indicate that these are destabilized variants. The remaining seven variants expressed well and could be purified.

P53 was previously purified via cation exchange of the soluble fraction of the cell lysate.78 We have optimized the purification using a Ni-NTA agarose column for the 6×His-tagged version of this protein, and the protein was eluted from the column by cleavage of the 6×His tag with TEV (Tobacco Etch Virus) protease while the proteins are

still bound to the Ni-NTA agarose column. On-column cleavage leads to efficient

separation between the protein molecules and minimizes exposure to imidazole during

the cleavage and elution steps. This also minimizes the chance of aggregation and

promotes efficient cleavage of the tag. The characterizations were optimized for Quad

mutant and four of the known hotspot mutants. The urea melt data for these proteins are shown, and the data agree well with the literature-reported D50 values (the concentration of urea at which 50% of the protein is unfolded) for these proteins.84

Since p53 is a DNA binding protein and the screen is based on the sequence-specific DNA binding ability of this protein, we intended to characterize the binding of a few positives obtained from the libraries to the DNA sequence obtained from the binding domain library, and to compare it with their binding to the literature-reported consensus sequence, using fluorescence anisotropy. P53-DNA binding was studied using anisotropy experiments with 5’ fluorescein (FL)-labeled GADD45 DNA. FL-GADD45 DNA, which has been reported to bind Quad p53 with a dissociation constant KD of 100 nM,71; 79 was used at 20 nM in all the experiments. The details of these characterizations are discussed in the following sections.

3.4.1 Stability Measurements using Urea Denaturation

Urea denaturation was followed by the change in the intrinsic fluorescence of tryptophan at increasing concentrations of urea, as detailed by Bullock et al.78 In the case of the p53 core domain, the intrinsic fluorescence of the single tryptophan increases when the protein is denatured. Therefore, the denaturation of the protein can be monitored by the increase in fluorescence at 356 nm. The fluorescence maximum for tyrosine at 310 nm decreases as the protein unfolds. In addition, it has been documented that an aggregated species of the protein gives a maximum at 340 nm. The natively folded wt protein is known to unfold via a two-state mechanism.86 Because we used a purification scheme different from that reported in the literature, we wanted to ensure that we were purifying the protein in the non-aggregated form.

We characterized 8 different variants from the TI library and one from the AA library, choosing the variants as previously described. The LC variant from the AA library failed to express well and therefore may have been a destabilized variant. The various proteins, buffered in sodium phosphate pH 7.2, were incubated with various concentrations of urea (0 to 7 M) for at least 8 h. A concentration of 2 µM protein was maintained in all the samples. The samples were excited at 280 nm and the emission spectra between 300 nm and 400 nm were collected. Monitoring the change in fluorescence at 356 nm with the concentration of urea allowed us to follow the chemical denaturation of these variants.

The Quad mutant is known to unfold via a two-state mechanism, exhibiting an isofluorescence point at ~320 nm. The full scans for the denaturation curves show that most of the variants are well folded and do not have a third species present. However, some of the variants, like VV and QL, do have some amount of aggregated species present and do not exhibit the isofluorescence point that indicates a two-state transition. The presence of the third species may be a function of the reaction conditions. We optimized the purifications for Quad and sought to decipher the properties of the mutants under those conditions. The results obtained from the different variants suggest that some of them do not unfold by a two-state mechanism. An uncharacterized third species, evident from the scans of the fluorescence melts, seems to exist for some of the variants. The various scans are shown in Figure 26.

Figure 26: Wavelength scans for the different variants. 2 µM of each of the proteins buffered in sodium phosphate pH 7.2, 5 mM DTT was incubated in various concentrations of urea for at least 8 h. Data were collected by following the emission spectra between 300 and 400 nm after excitation at 280 nm. The various curves in each graph show the spectra for proteins incubated with 0 to 7 M urea. An isofluorescence point was observed, within the error of measurement, for all variants except V253 M255, V253 V255 and M253 Q255.

Urea denaturation melts showed all the variants that were characterized to be native-like and similar to the Quad variant in stability. The chemical denaturation melts were fit to the Clarke and Fersht equation and the resultant D50 values are presented in Figure 27. These were very close to each other and to that of the Quad variant. This indicated that only highly stabilized variants can appear as positives from the screen. This result suggested that the screen is highly stringent, which is one of the likely reasons for the observed low frequency of positives from the AATI library.
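For reference, the two-state model with linear baselines that is commonly attributed to Clarke and Fersht can be written as a short function. The sketch below is an illustration under stated assumptions: the parameter names, the default baselines, the temperature and the example values of D50 and m are all hypothetical, and in practice the parameters would be obtained by non-linear least-squares fitting of the measured fluorescence at 356 nm.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal mol^-1 K^-1


def two_state_signal(urea, d50, m, native=(0.0, 0.0), denatured=(1.0, 0.0), temp_k=298.15):
    """Observed signal for a two-state unfolding transition.

    urea, d50 : denaturant concentration and transition midpoint (M)
    m         : dependence of unfolding free energy on urea (kcal mol^-1 M^-1)
    native, denatured : (intercept, slope) of the linear baselines versus urea
    """
    # Equilibrium ratio [unfolded]/[folded]; equals 1 at urea == d50.
    k_unfold = math.exp(m * (urea - d50) / (R_KCAL * temp_k))
    f_native = native[0] + native[1] * urea
    f_denatured = denatured[0] + denatured[1] * urea
    return (f_native + f_denatured * k_unfold) / (1 + k_unfold)


# Hypothetical parameters, chosen only to illustrate the shape of the curve.
for urea in (0.0, 2.0, 3.0, 4.0, 7.0):
    print(f"{urea:.1f} M urea -> normalized signal {two_state_signal(urea, d50=3.0, m=2.0):.3f}")
```

With flat baselines the signal is exactly halfway between the native and denatured values at the midpoint, which is how D50 is read from a fitted curve.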

3.4.2 Thermal Melts Monitored Using Circular Dichroism

To confirm the results from chemical denaturation experiments, we measured the thermal

denaturation trend of the various variants. It is known that p53 core domain unfolds


irreversibly with temperature. Therefore it is not possible to arrive at the thermodynamic

melting temperatures with these experiments, but we can observe the relative trends for

stability among the variants. Ang et al. have shown that the relative trends of stability

obtained from the thermal denaturation of p53 corresponded well with the trend obtained

from the chemical denaturation experiments.142 The wavelength scans of the variants

showed a positive peak at 234 nm and a minimum at 218 nm. Thermal melts were

monitored by the change in signal at 218 nm. Our thermal melt data showed that all the

variants had similar apparent melting temperatures and all the variants aggregate upon

denaturation. The most and the least stable variants differed by only 5 °C. The results from the thermal denaturation curves are given in Figure 27 and a comparison of the D50 and T50 values is summarized in Figure 29. These additional experiments further confirm that all the variants selected are close in stability.

3.4.3 DNA Binding Using Fluorescence Anisotropy

A final test of the validity of the screen is to measure and understand the DNA binding

function of the positives from the screen in vitro. Balagurumoorthy et al. have shown

that p53 core domain forms a 4:1 complex with its consensus DNA in a cooperative

manner.173 It has been shown that the core domains do not associate with each other to

form tetramers except in the presence of DNA. Although the affinity of the p53 core domain for DNA is about three orders of magnitude weaker than that in the presence of the tetramerization domain, it has been shown that the core domain interaction stabilizes the p53-DNA complex and that the core domain binds DNA with nanomolar KD values.79 We used fluorescence anisotropy to measure the affinity of the various mutants from our library for the consensus DNA binding element of p53 (GADD45: GTACAGAACATGTCTAAGCATGCTGGGGAC) and for the DNA binding element used in our screen, which was arrived at via library methods.

Figure 27: Characterization of Actives a) Urea-mediated denaturation melts for the actives from the library. The change in fluorescence at 356 nm is monitored as a function of the concentration of urea. b) Thermal denaturation melts for the actives from the library. The change in CD signal at 218 nm is plotted as a function of temperature.

Figure 28: CD wavelength scans for TI library variants. CD scans of the various mutants in 50 mM sodium phosphate, 300 mM NaCl, pH 7.2. The concentrations of the proteins were kept constant at 15 µM, measured using UV absorbance.

Figure 29: Data from TI variants characterizations. The T1/2 and D50 values were obtained by fitting the data from the CD thermal melts and urea denaturation melts, respectively, to the Clarke and Fersht equation.

Fluorescence anisotropy measures the tumbling rate of fluorescent molecules in solution. The linearly polarized excitation light is depolarized to different extents by the sample, depending on its tumbling rate, which differs between the free form and the form bound to bulky partners such as proteins. The retention of polarization by a bound sample in comparison to its free form is what we measure in anisotropy experiments, and the dissociation constant can be determined from the anisotropy as a function of protein concentration.

When the rotational motion of a labeled molecule is fast relative to the fluorescence lifetime of the fluorophore attached to it, the polarization of the emitted light is scrambled. On the other hand, for a molecule of restricted rotational freedom, as when it is bound to another large species, the rotational motion is slow relative to the fluorescence lifetime and the emitted light therefore remains polarized. The extent of polarization in the emitted light is indicative of the reduced rotational motion of the probe. This is measured by adding another polarizer before the detector, which can toggle between vertical and horizontal orientations. The measured intensities of the parallel and perpendicular polarized light allow the calculation of the anisotropy of the samples using the equation

$$r = \frac{I_{vv} - I_{vh}}{I_{vv} + 2 I_{vh}}$$

where ‘r’ is the anisotropy, measured as the ratio of the difference between the parallel (Ivv) and perpendicular (Ivh) intensities to the total intensity. Our fluorimeter (Olis DM45P) has a photoelastic modulator (PEM), which removes the need for a moving polarizer. The PEM eliminates the need for corrections in the anisotropy data due to the difference in the intensity of light transmitted by the polarizer in different orientations (G-factor) and also allows continuous and faster data collection.

Figure 30: Binding studies of the TI library variants. Binding was measured in 50 mM HEPES buffer with 20 nM FL-GADD45 DNA. Normalized values are shown.

Direct binding of p53 core domain to GADD45 DNA (GTACAGAACATGTCTAA

GCAT GCTGGGGAC), a natural substrate of p53, labeled either with Fluorescein (FL) or Alexafluor-488(Alexa) at the 5’ end was studied. Anisotropy was monitored at 520nm following an excitation at 460nm. The binding of p53 core domain to the DNA was

assumed to be completely co-operative based on the studies reported by Weinberg et al.

and the data were fit accordingly.71 The dissociation constants of all the variants were

within an order of magnitude of each other. The data obtained from the binding studies

are summarized in Figure 30.
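A fully cooperative binding curve of the kind assumed above can be sketched with a Hill-type model. This is a generic stand-in and not necessarily the exact expression used by Weinberg et al.; the Hill coefficient, the anisotropy limits and the KD value are hypothetical, and the free protein concentration is assumed to equal the total concentration.

```python
def fraction_bound(protein_nm: float, kd_nm: float, hill_n: float) -> float:
    """Hill-type fraction of DNA bound at a given protein concentration."""
    if protein_nm < 0 or kd_nm <= 0:
        raise ValueError("concentrations must be non-negative and Kd positive")
    x = (protein_nm / kd_nm) ** hill_n
    return x / (1.0 + x)


def predicted_anisotropy(protein_nm: float, r_free: float, r_bound: float,
                         kd_nm: float, hill_n: float) -> float:
    """Anisotropy as a weighted average of free and bound DNA signals."""
    f = fraction_bound(protein_nm, kd_nm, hill_n)
    return r_free + (r_bound - r_free) * f


# Hypothetical parameters, for illustration only.
for conc in (0, 25, 50, 100, 500):
    r = predicted_anisotropy(conc, r_free=0.05, r_bound=0.20, kd_nm=50.0, hill_n=4.0)
    print(f"{conc:>4} nM -> r = {r:.3f}")
```

In a fit, `r_free`, `r_bound` and `kd_nm` would be adjusted to minimize the difference between this prediction and the measured anisotropy at each protein concentration.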

3.5 Discussion

High throughput studies on β-sheet proteins have been more difficult to achieve than on

alpha helices due to the inherent aggregation propensity of β-sheets. Chou and Fasman

predicted β-sheet propensity of various amino acids using statistical methods.28 In these

predictions, probability values for β-sheet formation were averages of all possible

environments including middle and edge positions and partially and fully hydrogen

bonded positions. Following this, experimental studies on proteins like GB1 and

ubiquitin showed that the contribution of various residues to β-sheet stability depends on

the conformational requirements of various positions in the β-sheet.174; 175 Smith et al. chose a guest site which had a uniform hydrophobic environment in the hydrophobic core of GB1 and mutated it to all 20 amino acids.27 Although the propensities were correlated

to those predicted by Chou and Fasman, the correspondence was not one to one, showing that the choice of the position, fully hydrogen bonded in this case, will influence the propensity scale.

We randomized four residues on two adjacent antiparallel β-strands in the hydrophobic core of the DNA binding domain of p53. The frequency of positives in this library was very low as read out from the screen. We separately randomized T253 I255 (TI library) and A159 A161 (AA library). The results from the TI library showed that position I255 is highly biased for residues with β branching while T253 is selective towards small and polar residues. The positives from the AA library show that position A161 is highly restricted to small residues while position A159 is fairly permissive to hydrophobic residues with high β-sheet propensity, although Ala was still highly preferred. If we predict the number of positives in the AATI library from the positives obtained from the TI and the AA libraries, assuming each position to be independent of the others, our calculations indicate the rate of positives to be 0.7%.

But the positives obtained from the AATI library screening were much less frequent than

1%. This suggests that there is significant coupling between the residues in these positions. To further understand this, we expressed a few of these variants and characterized them in vitro using urea denaturation melts followed by intrinsic fluorescence, thermal denaturation melts followed by CD, and fluorescence anisotropy experiments. The various characterizations showed that all the variants were functionally and structurally similar to the native-like Quad mutant, indicating that the screen is too stringent to identify functional molten globules as positives. Studies in silico have suggested that mutagenesis on more stable parents will yield more stable functional proteins. Later, this claim was supported by obtaining more mutants from error-prone PCR of cytochrome P450 variants when a stabilized P450 parent was used. This study demonstrated the impact of parental stability on the diversity generated in directed evolution experiments.176 These studies suggest that the observed low occurrence of positives may be a function of the low tolerance of β-sheet structures to sequence changes in addition to the high stringency of the screen.
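The independence argument used above for the AATI library amounts to multiplying the positive rates of the two sub-libraries. The sketch below illustrates the arithmetic only: the individual TI and AA rates and the observed AATI rate are hypothetical values (the first two chosen so that their product reproduces the 0.7% quoted in the text), not measured numbers.

```python
def predicted_joint_rate(rate_a: float, rate_b: float) -> float:
    """Expected positive rate if two sub-libraries behave independently."""
    return rate_a * rate_b


def coupling_ratio(observed_rate: float, predicted_rate: float) -> float:
    """Observed/predicted ratio; values well below 1 suggest coupling."""
    return observed_rate / predicted_rate


# Hypothetical sub-library positive rates (assumptions, not measured values).
rate_ti = 0.10
rate_aa = 0.07
predicted = predicted_joint_rate(rate_ti, rate_aa)
print(f"predicted AATI positive rate: {predicted:.1%}")

# Hypothetical observed rate, used only to show how the ratio is read.
print(f"coupling ratio: {coupling_ratio(0.0005, predicted):.2f}")
```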

This is in contrast to the results obtained previously in core randomization studies of ROP,

a four helix bundle protein. Randomization of two layers of the core of ROP using a

DVY codon which includes all hydrophobic amino acids and alcohols yielded functional

variants spanning a wide range of stabilities. While the thermal melt data from these

libraries showed a difference of 20 °C between the least and the most stabilized variants,

our studies on p53 span only a 5 °C range for thermal stabilities. In addition, the ROP

libraries yielded molten globular variants that were functional. The stringency of the p53

screen is reflected in that the variants exhibit near identical urea and thermal stabilities.

The lower tolerance of β-sheet cores to mutations has been previously reported. The

Handel group designed an algorithm to predict novel stable cores for proteins. Using their

algorithm they were able to successfully design stabilized cores for the protein CRO

lambda repressor, a helical protein.169 When the same algorithm was used to re-design the

core of ubiquitin, a highly stabilized β-sheet protein, none of the designed variants were

stabilized.177 In a separate study, Silverman et al. analyzed the TIM barrel protein, triosephosphate isomerase, for the importance of size, charge and shape complementarities.178 Their results show that the residues in the β-sheet core and the α-β interface are highly immutable whereas the mutations in the helical region were fairly tolerated.

Figure 31: Proposed TIVA library for the p53 core. Residues T253 I255 V157 A159, which represent an even distribution of sizes and can be randomized, are shown as sticks.

We aim to relax the stringency of the screen to allow us to characterize the core domain

more extensively. We plan to improve the stability of the starting system by two

approaches, by using the hexa mutant as the ‘native’ state and constructing the library in

the hexa mutant context and also by screening in presence of the tetramerization

domain.82 Introducing the tetramerization domain may allow weakly binding variants to

pass the screen since this oligomerization domain is known to assist in the DNA binding

of the core domain.179; 180; 181

The low occurrence of positives in the initial library limits the amount of data that can be

collected for this protein. Apart from the high stringency of the screen, it is possible that the low occurrence of positives in the four residue randomization is due to the inability of these residues to pack into the native structure. To explore the significance of the two alanines randomized in this library and the extent to which coupling is due to these residues, other sites can be randomized. A possible site for core randomization is T253

I255 V157 A159 (Figure 31). These four residues form a set that is more distributed in size and therefore the complications of randomizing two adjacent small residues may be minimal.

3.6 Materials and Methods

3.6.1 Construction of Libraries

An endonuclease site for BsrGI was engineered into the Quad gene in the pACBADp53- vector by a silent mutation using PCR. Two fragments, the first containing the BsrGI site at the 3’ end and the second containing the site at the 5’ end, were generated. The pACBADp53-Quad was used as the template. The first fragment was generated by PCR amplification using 5’ CGCTAACCAAACCGGTAACCTCGCTTATTAAAAGCATTC 3’ in the forward direction and 5’ GTTTATTCAGTGCCGGGCTATATGTACAGGTCACGCTCTTTGC 3’ in the reverse direction. Under similar conditions the second fragment was generated using 5’ GCAAAGAGCGTGACCTGTACATATAGCCCGGCACTGAATAAAC 3’ in the forward direction and 5’ ACCGAGCTCGAATTCGAGACCTTACAGGTTCTCCTCTTCGGTAC 3’ in the reverse direction. The fragments were sequentially ligated into the pACBADp53-Quad to generate a linear product which was digested with BsrGI and circularized to give the final product containing the BsrGI site. The purpose of generating this restriction endonuclease site in the Quad gene was to facilitate the cloning of libraries without having to rely on overlap extension PCR methods. A stuffer fragment from the pGFPuv vector was ligated between the BsrGI and PciI sites in the pACBADp53-Quad to yield the pACBAD-p53 Quad-Stuffer vector. The stuffer insert was obtained by digesting the pGFPuv vector with BsrGI and PciI followed by gel extraction and purification. The stuffer was chosen such that it contained restriction endonuclease sites orthogonal to the pACBADp53-Quad vector in order to facilitate elimination of contamination from the parent vector when cloning libraries.
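As a quick consistency check on the primer design above, the two internal primers should be reverse complements of each other and should both carry the BsrGI recognition sequence (TGTACA). The sketch below uses the two primer sequences quoted in the text with the line-break spaces removed; the helper and variable names are illustrative.

```python
# Translation table for complementing DNA bases (upper and lower case).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

BSRGI_SITE = "TGTACA"  # BsrGI recognition sequence (palindromic)


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


# Internal primers quoted in the text (fragment 1 reverse, fragment 2 forward).
frag1_rev = "GTTTATTCAGTGCCGGGCTATATGTACAGGTCACGCTCTTTGC"
frag2_fwd = "GCAAAGAGCGTGACCTGTACATATAGCCCGGCACTGAATAAAC"

print("primers are reverse complements:", reverse_complement(frag1_rev) == frag2_fwd)
print("BsrGI site in fragment 2 forward primer:", BSRGI_SITE in frag2_fwd)
print("BsrGI site in fragment 1 reverse primer:", BSRGI_SITE in frag1_rev)
```

Because the two primers are exact reverse complements, the two PCR fragments share a 43-base overlap containing the engineered site.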

Figure 32: Schematic representation of the library cloning scheme. The AATI, AA and TI libraries were cloned between BsrGI and BsaI sites. The red crosses indicate the sites where the residues were randomized using NNK codons. The blue crosses indicate sites where silent mutations were introduced using oligos to minimize contamination from the parent vector.

Three libraries, A159 A161 T253 I255 (AATI), which randomizes four core residues, and A159 A161 (AA) and T253 I255 (TI), which randomize two core residues each, were constructed using oligos encoding NNK for each of the residues. Each library was constructed by PCR and was ligated into the pACBAD-p53 plasmid between BsrGI and BsaI sites. The schematic for making the libraries is shown in Figure 32. The four oligos, two internal ones encoding the NNK sites, 5’ CCGCCAGGCACCCGTGTGCGTNNKATGNNKATTTATAAACAGAGCCAG 3’ (A159 A161 sites) and 5’ ACCGCTAGAGTCCTCCAGGGTMNNAATMNNCAGGATCGGACGACGATTCAT 3’ (T253 I255 sites), and the external ones, 5’ AATAATAATAGTGTCACGTGTACATACTCTCCGGCCCTGAACAAGCTGTTCTGCCAGCTGGCGAAAACGTGCCCGGTTCAGCTGTGGGTTGACTCTACCCCGCCGCCAGGCACCCGTGTGCGT 3’ (5’ BsrGI end) and 5’ GAGCTCGAATTCGAGACCTTACAGGTTTTCTTCCTCGGTGCGGCGGTCGCGACCCGGGCACGCGCAAACGCGAACTTCAAAAGAATCGCGACCCAGCAGGTTACCGCTAGAGTCCTCCAGGGT 3’ (3’ BsaI end), were mixed in equal concentrations (100 µM each) and diluted 50 fold into the PCR reaction mixture (8 µM final total concentration). This reaction was amplified and, following this, the terminal primers encoding BsrGI at the 5’ end and BsaI at the 3’ end were used to amplify the final construct. This was ligated into the pACBADp53-Quad stuffer vector previously digested with BsrGI and BsaI.

About 0.5 µg DNA was maintained in the ligation reaction for the AATI library to ensure

complete coverage of the theoretical library size. Typically, the reaction was carried out

overnight at 16 °C following which it was incubated at 65 °C for 20 min to heat

inactivate the ligase. This was diluted into appropriate reaction buffer for restriction

digest reactions for minimizing the background. These reactions were cleaned up using a

PCR clean up kit and all of the ligation was transformed by electroporation into freshly

prepared DH10B electrocompetent cells. The cells were allowed to recover for 1 h in the

absence of any antibiotics at 37 °C. These were plated on to solid media (LB) containing

the appropriate antibiotic, here kanamycin, and incubated overnight at 37 °C. The number

of colonies that result from the recovery is indicative of the size of the library. A few of

these were typically analyzed by sequencing to assess the diversity of the library. The recovery was also grown in liquid media containing kanamycin for 12-18 h at 37 °C.

Minipreps and glycerol stocks of the library were made from the saturated culture resulting from this overnight growth.
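The link between the number of transformants and coverage of the theoretical library size can be estimated with a simple sampling model. The sketch below assumes that all 32 NNK codons are equally represented at each site and that transformants sample the library at random (a Poisson approximation); the function names and the 95% target are illustrative choices, not values from this work.

```python
import math


def nnk_codon_combinations(n_sites: int) -> int:
    """Number of DNA sequences when each site is an NNK codon (32 codons)."""
    return 32 ** n_sites


def protein_variants(n_sites: int) -> int:
    """Number of protein sequences (20 amino acids per site, stop codon ignored)."""
    return 20 ** n_sites


def expected_coverage(n_transformants: int, library_size: int) -> float:
    """Expected fraction of the library sampled at least once."""
    return 1.0 - math.exp(-n_transformants / library_size)


def transformants_needed(library_size: int, coverage: float) -> int:
    """Transformants required to reach a target fractional coverage."""
    return math.ceil(-library_size * math.log(1.0 - coverage))


for name, sites in (("AA or TI", 2), ("AATI", 4)):
    size = nnk_codon_combinations(sites)
    print(f"{name}: {size} codon combinations, {protein_variants(sites)} protein "
          f"variants, ~{transformants_needed(size, 0.95)} transformants for 95% coverage")
```

Under these assumptions, roughly three times the theoretical library size in independent transformants is needed to sample about 95% of the codon combinations.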

In order to screen for positives from the library, electrocompetent DH10B cells containing the screening vector pGFPuv-BD were first made. About 2 µL of the miniprep

DNA from the library were transformed into the screening strain. These were recovered for 1 h and plated on freshly made LB-agar plates containing kanamycin, ampicillin and

0.005% of arabinose. These plates were incubated at 30 °C for 48 h before analyzing for colonies with low cellular fluorescence. These were grown separately to saturation and

re-plated on to the screening plates to ascertain the phenotype. The saturated cultures were typically diluted 10^5-fold, from which 50 µL were plated to obtain well separated colonies. Once the phenotype was confirmed, the sequences of the positives were determined by sequencing at Genewiz Inc.
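The dilution used for re-plating can be related to the expected number of colonies per plate. The sketch below is simple arithmetic; the saturated culture density of 10^9 colony forming units per mL is an assumed, typical order of magnitude and was not measured in this work.

```python
def expected_colonies(culture_cfu_per_ml: float, dilution_factor: float,
                      plated_volume_ul: float) -> float:
    """Expected colonies on a plate after diluting and plating a culture."""
    diluted_cfu_per_ml = culture_cfu_per_ml / dilution_factor
    return diluted_cfu_per_ml * (plated_volume_ul / 1000.0)


# Assumed culture density (hypothetical); dilution and volume are from the text.
colonies = expected_colonies(culture_cfu_per_ml=1e9, dilution_factor=1e5,
                             plated_volume_ul=50.0)
print(f"expected colonies per plate: {colonies:.0f}")
```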

3.6.2 Protein Expression and Purification

The various positives selected for characterization were cloned into the multiple cloning site of pET11a vector between NdeI and BamHI sites with an N-terminal 6×His tag. The expression vectors were transformed into C41 (DE3). The cells were grown at 37 °C until OD600 was between 0.6 and 1. The cells were cooled to 16 °C, supplemented with 100 µM ZnCl2 and protein expression was induced using 1 mM IPTG. Cells were harvested by centrifugation after 16 h. For purification of the protein, the cells were resuspended in 25 mL of Lysis buffer (50 mM Tris (pH 8.0), 10 mM Imidazole, 300 mM NaCl, 10% glycerol and 15 mM BME). These were incubated for 1 h at 4 °C after the addition of 5 mM MgCl2, 0.5 mM CaCl2, 5 µL of 2 U/mL DNase (Pierce), 200 ng/mL RNase (Fisher), 1.2 mg/mL lysozyme (Fisher), and 0.1% Triton X-100. Cells were then sonicated on ice for a total process time of 3 minutes, with 45 s in between 10 s pulses. The soluble

fraction was isolated by centrifugation at 30,000 g for 30 min. The supernatant was

allowed to bind to 1.0 mL 50% Ni-NTA agarose (Qiagen) for 1 h. This was then loaded

onto a small pre-fritted column and washed twice with 6 mL each of wash buffer (50 mM

Tris (pH 8.0), 15 mM Imidazole, 300 mM NaCl, 10% glycerol and 15 mM BME). The

Ni-NTA matrix containing the bound protein was then resuspended to 2-5 mL with lysis buffer and incubated with 6×His-TEV protease at room temperature for 12-16 hours. Following this, the flow-through, which contains the cleaved protein, was collected. The eluted protein was further purified by size exclusion using a Superdex 75 (10/300) column (GE Amersham). This column can separate 3 – 70 kDa proteins at high resolution. This allows

us to separate the aggregates that may have been formed from the folded form for the

variants under study. Quantitative cleavage and homogeneity were confirmed by running

the samples on SDS-PAGE gels.

3.6.3 Chemical Denaturation Using Urea Monitored by Fluorescence

Protein denaturation was monitored using an Olis DM45P fluorimeter. The samples were

excited at 280 nm and emission spectra were followed between 300 and 400 nm with a

band pass of 5 nm. Data were collected at every nanometer with a 3 s integration time for each point. The proteins were incubated in 50 mM sodium phosphate, pH 7.2, 5 mM DTT at 10 °C for 12-16 h. The concentrations of the proteins were determined spectrophotometrically using an extinction coefficient ε280 of 17,210 M⁻¹ cm⁻¹. The final concentration was maintained at 2 µM. Data were exported and analyzed using Microsoft Excel 2003/2007. The melt data were fit to the Clarke and Fersht equation to derive the D50 values (the concentration of urea at which 50% of the protein is folded) for the various mutants.
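The two-state model behind the D50 values can be sketched as follows. This is a minimal Python version of the commonly used two-state form with linear native and denatured baselines, not the Excel analysis used here; the m-value, the baseline parameters and the use of 10 °C as the measurement temperature are assumptions for illustration.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal mol^-1 K^-1


def fraction_unfolded(urea_m: float, m_value: float, d50: float,
                      temp_k: float = 283.15) -> float:
    """Two-state fraction unfolded; equals 0.5 when urea_m == d50."""
    k_unfold = math.exp(m_value * (urea_m - d50) / (R_KCAL * temp_k))
    return k_unfold / (1.0 + k_unfold)


def observed_signal(urea_m: float, alpha_n: float, beta_n: float,
                    alpha_d: float, beta_d: float, m_value: float,
                    d50: float, temp_k: float = 283.15) -> float:
    """Signal as a population-weighted sum of linear native/denatured baselines."""
    f_u = fraction_unfolded(urea_m, m_value, d50, temp_k)
    native = alpha_n + beta_n * urea_m
    denatured = alpha_d + beta_d * urea_m
    return native * (1.0 - f_u) + denatured * f_u


# Hypothetical m-value (kcal mol^-1 M^-1); D50 of 2.7 M urea is taken from the text.
for urea in (0.0, 2.0, 2.7, 3.5, 6.0):
    print(f"{urea:.1f} M urea -> fraction unfolded "
          f"{fraction_unfolded(urea, m_value=2.0, d50=2.7):.3f}")
```

Under the linear extrapolation assumption, the free energy of unfolding in the absence of denaturant is the product of the fitted m-value and D50.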

3.6.4 Thermal Denaturation Monitored by Circular Dichroism

Thermal denaturation of purified proteins at 15 µM concentration in 50 mM sodium

phosphate buffer, pH 7.2, 300 mM NaCl was monitored by the change in secondary

structure reported by the CD spectra followed using a Jasco-815 CD spectrophotometer.

Wavelength scans recorded the ellipticity every nanometer with a 2 s integration time at 100 nm min⁻¹ scanning speed. Data points which showed HT voltages greater than 600 V were discarded. For thermal denaturation, ellipticity was monitored at 218 nm with temperature increasing from 10 to 95 °C. Temperature was ramped up at the rate of 1 °C min⁻¹ with 6-s temperature equilibration at each step. Data were exported and analyzed in

Microsoft Excel 2003/2007.

3.6.5 DNA Binding Studies Using Fluorescence Anisotropy

Fluorescence anisotropy was followed using an Olis DM45P fluorimeter. The anisotropy for increasing concentrations of the protein at 20 nM labeled DNA was monitored at 10 °C. The reactions were set up on ice and incubated at 10 °C during the experiment. The proteins in 50 mM sodium phosphate pH 7.2, 5 mM DTT were diluted into the 50 mM HEPES buffer immediately before data collection. The concentrations of the proteins, which were determined both spectrophotometrically and by gel electrophoresis, were varied between 0 and 500 nM. The samples were excited at 460 nm and the data were collected using a 495 nm filter. The binding was monitored in 50 mM HEPES buffer containing 5 mM DTT. Data were exported to Microsoft Excel and fitted to a cooperative binding model as described previously to derive the KD values. Binding was measured to

double-stranded DNA. Fluorescently labeled forward strand was annealed to unlabeled

reverse strand by heating to 95 °C for 5 minutes. This was cooled slowly to room

temperature following which the double stranded DNA was stored at -20 °C.

Chapter 4: Engineering the S7/S8 loop of the p53 core domain to improve stability

Contributions

The work presented in this chapter was carried out by Brinda Ramasubramanian, Ely B

Porter, Matthew Heberling, and Thomas J Magliery. The majority of the work reported

here was accomplished and written up by the primary author. Cloning and expression of the first loop mutant were accomplished by M. Heberling. The cloning and expression of the re-engineered chimera mutants were carried out by E. Porter. The experimental

design and data analysis were done by the primary and corresponding authors.

4.1 Summary

We generated a loop deletion mutant of the core domain of tumor suppressor protein p53 which was posited to be stabilized in comparison to the WT core domain based on MD simulation studies. Screening and characterization show that this variant is quite destabilized. Analysis of the structure of p53 core domain revealed that E221 in this loop region may contribute to some long range stabilizing interactions in p53. Also the side chains in the p53 core are oriented in a slightly different manner when compared to the

Cep1 homolog indicating that the loop length may also be important for p53. Following

this we generated four different rationally designed mutants and two different combinatorial libraries to investigate the role of packing, loop length and flexibility in the stability of this protein. We also designed variants which replaced different residues in the loop with glycine. These substitutions with glycine impart flexibility to the loop in addition to removing side chain interactions and will aid in addressing the role of flexibility of this loop region without increasing the length of the loop. Two libraries that randomize all the residues with NNK codons to include all 20 amino acids in the context of four and five residue deletions were also screened using a cell based screen but did not yield any positives. Our results indicate that a four residue deletion loop which encodes ‘YEGSGC’ is a native like folded protein but is destabilized compared to the stabilized Quad mutant. A number of point mutants are currently being investigated to delineate the role of each of the residues and the effect of loop length in the above variant. The results from this study will assist in generating a stabilized variant of the core domain in addition to contributing to our understanding of the role of loops in the stability of proteins in the context of a β-sheet core.

4.2 Introduction

Improving the stability of p53 is of paramount importance to human health. Tumor suppressor protein p53, often dubbed ‘the guardian of the genome’,134 is a transcription activation factor which causes cell cycle arrest, apoptosis or senescence depending on the downstream targets activated. P53 is allowed to accumulate in cells only under conditions of stress and is only marginally stable under physiological conditions. More than 50% of human cancers are found to have mutated p53. The various p53 mutations are categorized

as structural and contact mutants. A majority of the disease mutations are structural as

they lead to global loss of stability of the protein. On the other hand contact mutations

lead to loss of function by directly affecting the interaction of p53 to its downstream

DNA targets. Stabilizing p53 is therefore a potential area of research in the fight against

cancer. One of the strategies to improve the stability of this protein is to introduce second

site mutations that lead to the stabilization of some of the disease mutants.85 Another strategy has been the development of small molecules that bind to the folded form of the

protein and not to the denatured form.124; 141 Physiologically, this will shift the

equilibrium towards a higher population of the folded form of the protein. An example of

a stabilized variant of p53C (p53 core domain) is the Quad mutant which has four

different mutations to the core domain M133L, V203A, N239Y, and N268D.79 These

were designed based on the multiple sequence alignment of 22 different homologues of

p53 and were shown to be stabilizing over the wild type p53C (ΔΔG of 2.65 kcal mol⁻¹).

Combinatorial studies that explore the rules of core packing in this protein will give us an insight into the various mutations that can stabilize this protein.

One of the approaches that we have taken to stabilize p53 is to re-engineer S7S8 turn in the core domain based on MD simulation studies reported by Pan et al.182 Loops are important structural elements which not only connect the alpha helical and β-sheet

structures but also form ligand binding sites and active sites in enzymes. Loops are one of

nature’s ways of including diversity while keeping conserved folds. At the same time, not all loops have a function associated with them, and these flexible linkers between the

structural alpha helices and β-strands have a profound influence on the thermodynamic

stability of proteins. Detailed analyses show that the role of loops in proteins cannot

easily be generalized. The effect of loop length on the stability of proteins can be

explained based on the entropic penalty that is required for the folding of a more dynamic system and the increased conformational freedom that a longer loop will have.

There have been several studies on the role of loops in protein structure and the effect of the loop length and flexibility on the stability of protein. Loop length variations in the model protein ROP showed that both unfolding and refolding rates were faster when native loop was replaced by a more flexible loop of the same length.183 The authors

hypothesize that this is because the more flexible loop leads to a reduced barrier to the

transition state due to easier accessibility to various conformations in the variant with a more flexible loop compared to a rigid loop structure. Increasing the loop length leads to a faster unfolding rate, indicating that longer loops have a destabilizing effect on the folded state. Similar studies done on several other proteins, for example, α-spectrin SH3 domain, chymotrypsin inhibitor 2, GB1 and others, show similar results with changes in loop length.184; 185; 186 An increased loop length leads to a system with more conformational freedom, thus requiring more energy to remain folded. Studies also indicate that loops play a role in the folding of proteins by dictating the conformations that the unfolded peptide samples to converge to the folded state. Not only do the loop structures influence the stability of the protein, they also influence the inter-helical geometry.187 Wright et al. have also reported that inserting a five residue loop between two β strands in an immunoglobulin fold leads to global destabilization of the structure. In the same vein, we can infer that decreasing the loop length could lead to increased stability of the protein.188

Homologous proteins which exhibit decreased dynamics are also characterized by a higher thermodynamic stability. Recent studies by H/D exchange on PCNA homologues showed a distribution of amino acids that were more dynamic in the human variant when compared to the yeast variants.189 In vitro characterization showed that the yeast variant is more stabilized than the human homologue. MD simulation studies identified a transient hydrogen bond between an apparently non functional aspartic acid residue and a

His residue in the P-loop of EF-Tu.190 In order to further probe this interaction the authors generated a D109A mutant and compared it with the wild type. Kinetic assays of these proteins revealed that the D109 residue forms transient hydrogen bond interactions that stabilize the transition state for the nucleotide exchange reaction. This is an example where theoretical and experimental approaches were combined to identify these 3-D communication networks and their results indicated that evolutionarily conserved residues in loops can indicate a second layer of interaction and their roles cannot be generalized.

Figure 33: Human p53C aligned with Cep1. a) Sequence of human p53 core domain, residues 94-293, is shown aligned with the p53 homolog Cep1. The secondary-structure elements are shown above it. b) The structure of human p53 core domain overlaid with Cep1. The picture is rendered using PyMOL.

Pan et al. performed MD simulations to compare the stabilities of human and worm p53.

Their studies show that the p53 homologue Cep-1 in C. elegans is substantially less

dynamic than its human counterpart. Detailed analyses showed that the structural differences between these proteins were localized more on the peripheral motifs than the

core secondary structures. Since reducing the loop length is known to be a method for

improving the stability of proteins191, they analyzed the effect of reduced loop lengths

in the S7S8 turn. The root mean square deviation (RMSD) values obtained from

their studies showed that reducing the length of this loop not only stabilized the S7S8

turn but also appeared to reduce the global dynamics of the protein. This in silico

generated mutant was as stable as the Quad mutant and can be a potential candidate to

serve as a host for core domain studies in this protein. We therefore cloned, screened,

expressed, purified and characterized this variant to compare the results from the MD

simulations. Further details for the basis of this study are given in the following sections.
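For reference, the RMSD used above as a measure of dynamics can be sketched as follows. This minimal Python example assumes two equal-length lists of Cartesian coordinates that have already been superimposed; the function name and the toy coordinates are illustrative and are not taken from the simulations of Pan et al.

```python
import math
from typing import Sequence, Tuple

Coord = Tuple[float, float, float]


def rmsd(coords_a: Sequence[Coord], coords_b: Sequence[Coord]) -> float:
    """Root mean square deviation between two pre-aligned coordinate sets."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate sets must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Toy example: three atoms, each displaced by 1 unit along x (hypothetical data).
reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
frame = [(x + 1.0, y, z) for x, y, z in reference]
print(f"RMSD = {rmsd(reference, frame):.2f}")
```

Computed per residue over a trajectory, larger values of this quantity indicate regions that deviate more from the reference structure.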

4.3 Study of p53 in C. elegans as a potential method to stabilize human p53

The p53 core domain has an immunoglobulin-like fold, the β sandwich of which forms the DNA binding scaffold for this domain. β strands S2, S2’, parts of S8 and loop L1 form a loop-sheet-helix motif that binds to the major groove of DNA (Figure 33). L1 is a highly destabilized loop region but since it is implicated in binding to DNA, mutations to this region might lead to loss of function of the protein. The crystal structure of the p53 homolog in C. elegans, Cep1, shows that the loop connecting the S7/S8 turn is smaller

than the corresponding loop region in human p53C. Therefore the S7S8 turn which is

away from both the Zn2+ binding site and the DNA binding loop-sheet-helix motif is a

potential target for the introduction of stabilizing mutations. One of the striking

differences between the two structures is that the loops consistently have higher helical

content and this enables these peripheral motifs to make more contacts with the β sheet

structure.

Figure 34: A comparison of the core domain structures of human and worm p53. a) The crystal structure of human p53 core domain (pdb id 1TSR) and b) the crystal structure of WT C. elegans (pdb id 1T4W) p53 core domain are shown. The S7S8 loop is shorter in the worm structure when compared to that of the human structure, as indicated in red. The figures were adapted from Pan et al.182

This structural stability is revealed by MD simulations in which RMSDs of these two

proteins of similar size were directly compared. Residue specific information of the

RMSDs shows that loops L1 and L3 are the most dynamic in human p53C and L2 is constant. Further, the S7S8 turns were longer in human p53C when compared to the worm homolog and were among the more dynamic regions within the β-sheet core. A deletion in the S7S8 loop showed reduced overall dynamics. Deconvoluting these data showed that this deletion of five residues in the S7S8 turn dampened the dynamics of the whole protein in addition to stabilizing the S7S8 turn. Therefore we wanted to construct this variant of the human p53C and test its stability in comparison to the engineered stabilized Quad mutant of p53.

Figure 35: Screening results and designed variants. a) A comparison of screening results obtained from the various loop modification mutants. b) Sequence alignments for the loop modification mutants. The native sequences are indicated as human and worm respectively. The libraries are named as 6RL and 5RL. The residues randomized are indicated by ‘X’.

4.4 Experimental studies to parallel the in silico results

The first chimera mutant (LDM, loop deletion mutant) was based on the structure that

Pan et al. proposed as being stabilized in comparison to the WT. We used mutagenic primers to generate this mutant that deleted residues 224 to 228 in addition to a point

mutation P223A both in the contexts of p53 WT and Quad. We used the cell based screen

(Chapter 2) to test whether this mutant was native-like. The initial results from the screen

showed that this is not an active variant. When this variant was over-expressed for in vitro biophysical characterization, it was found that it partitions into the insoluble fraction, even at 4 °C and therefore we were not able to derive any thermodynamic data for this variant. Presumably, it is quite destabilized even in the context of Quad.

All the variants generated were tested for activity using the in vivo screen that we have

developed based on the transcription interference of GFP in a p53 dependent manner. The

variants were initially constructed by PCR based methods using WT p53C as the

template. None of the variants appeared to be positive by the screen. We later constructed all the variants with Quad as the template in order to make sure that the results from the screen are not merely due to the marginal stability of WT. All the variants discussed below were constructed with Quad as the backbone. The variants CHM1, CHM2 and

5∆GG appeared not to bind DNA based on the screen. Preliminary results on the biophysical characterizations of these proteins also showed that these variants aggregate upon purification indicating that these are highly destabilized variants.

Figure 36: Urea mediated denaturation of 4ΔL+EGG

Open squares indicate the melt data for the 4ΔL+EGG variant and open circles indicate Quad melt data. The lines are fits to the data using the Clarke and Fersht equation. D50 for Quad was 3.8 M urea and that for 4ΔL+EGG was 2.7 M urea.

The 4∆EGG mutant gave a positive phenotype with the screen indicating that this variant is able to bind to the consensus DNA. Loops have been reported to form long range interactions with the core structure and it is often found that in spite of low sequence homology between loops, these regions are not completely mutable. We also expressed and purified this protein in order to characterize it using chemical melts induced by urea.

The in vitro data corresponds to the results obtained from the screen and shows that typical curve for the folded protein in the absence of urea. Although this variant is not as

146

Figure 37:Analysis of the S7S8 loop a) The crystal structure of human p53 core domain (pdb id 1UOL) is overlayed with the crystal structure of WT C. elegans (pdb id 1T4W) p53 core domain. The S7S8 loop is shorter in the worm structure (yellow) when compared to that of the human structure (red). The residues in the Cep1 shown in blue points towards each other and the sidechains of the human variant are pointed away from each other (green).b) shows the difference in the length of the strands in worm and human p53. The S8 strand in human shown in cyan is shorter than its worm counterpart. c) Structure of p53 core domain with the S7S8 loop indicated as sticks. Y220 is shown in red and E221 is shown in yellow. The rest of the loop is shown in blue. It can be seen that Y220 packs into the hydrophobic core of the core domain d) Shown is E221 (red) which potentially forms a salt bridge with R202 (Green). E221 and R202 are shown as sticks enveloped by transparent spheres. The pictures were rendered using Pymol (De Lano Scientific).


stable as the Quad mutant, including the E221 residue imparts stability to the protein and

characterizations show that the 4∆EGG mutant is well folded and native-like when compared to the original LDM variant. Therefore, we have successfully generated a four-residue loop deletion variant which is natively folded, although it is fairly destabilized compared to Quad. The D50 value for this variant is found to be 2.7 M urea whereas

Quad melts at 3.7 M urea. The urea denaturation melt for this variant is shown in Figure

36. These preliminary results indicate the possibility of both loop length and residue

identity being important for p53C to be folded. Further experiments need to be done to

address the individual contribution of these factors. The architecture of the sheets in

human p53 can tolerate only up to a four residue deletion in the S7/S8 turn region

although MD simulations predict a global stabilization by the five residue deletion

variant. Also, the polar residue present in the loop region may play a significant role in

packing against the β-sheet core, leading to stabilization of the structure.

4.5 Discussion

In contrast to the suggestion by Pan et al., we found the S7S8 loop of human p53 to be very sensitive to mutations. We propose that this loop region is governed by both packing

and flexibility. We found that proteins which had either a reduced loop length or changed residue identity are inactive by the screen. Biophysical characterization also shows that these variants are ill-behaved and are susceptible to aggregation. Some studies show that shorter loop lengths can lead to less stable proteins. Detailed examination of the crystal structures alludes to the reason behind this sensitivity to mutation. One of the reasons may


be that the strand S8 is shorter in the human protein when compared to the worm protein

(Figure 37). The longer loop length may be needed for the packing of the sheets in spite of

the fact that longer loops make the protein more dynamic. This longer length also enables

the loop to pack against the hydrophobic core better in human than the worm.

Perturbations to loop residues connecting shorter turns are found to have a greater impact

on the stability of the protein compared to those between longer turns. Therefore

perturbing this loop may lead to loss of stabilizing interaction within the hydrophobic

core. Packing and length of the loop both seem to play a significant role in the stability of

p53 core domain.

Literature shows that loops play a significant role in the folding and stability of proteins.

For instance, it is well known that loops 2, 2’ and 4 form a loop bundle and stabilize the

active site of caspase-3.192 It was found that a salt bridge in loop 4 of this protein is

important for interface contacts in addition to the formation of the loop bundle. Takano et

al. studied the role of amino acid residues in conformational stability and folding at the

turns in lysozyme.193 They engineered a series of mutants with insertions and deletions in

the loop region. They concluded that the entropic gain from reducing the length of the

loop region is not always sufficient to impart stability to the protein. The conformation of

the loop depends on the identity of the residues and therefore any long range interactions will also depend on the identity of the residues in the loop. Loop swap studies into CDRs

of immunoglobulin VL also proved to have destabilizing effects.194 The authors found that

these loops are pre-organized to form certain conformations and therefore the loop


lengths and residues are optimized for that conformation. Oftentimes the dynamic nature

of these loops imparts conformational plasticity, enabling these proteins in their ligand

binding function, although they may destabilize the protein overall. This reinforces the theory that proteins are evolved for function and the criteria for stability are met on the basis of necessity.195 De Basio et al. recently studied the human and yeast homologs of

proliferating cell nuclear antigen (PCNA).189 H/D exchange studies showed that the

human homolog is more dynamic than the yeast homolog. Increasing the flexibility has a

functional role and the accompanying decreased stability may only be a consequence of the

gain of functionality.

4.6 Future work

We have generated two different loop randomized libraries using NNK codons. The

number of residues in the loop region in one of the libraries was five and the other had six

loop residues. Preliminary screening results showed that the frequency of occurrence of

positives from these libraries was small. This may be a combined artifact of the highly

unstable ‘host’ protein and a stringent screening system (as discussed in Chapter 2). To

address the issue of the mutability of this loop region, work is presently underway to

construct five different variants (Figure 38). The 4∆AGG specifically explores the role of


Figure 38: Proposed mutations to study the S7S8 loop Proposed mutants for further investigation of the stabilization of the p53 core domain by mutagenesis of S7S8 turn.

the glutamic acid residue. The E221A mutation will address whether the glutamic acid residue simply assists in packing or if it has some long range interactions. The variants

4∆EAA, 4∆AGG and 5∆G explore the role of glycine in this loop. By mutating both the

glycine residues to alanine in 4∆EAA, we will be able to decipher if the E221 alone is

sufficient to result in a folded structure or if it requires a flexible loop to assist in the

packing. Similarly, 5∆G will assess whether a single glycine is sufficient or if this loop

needs both glycine residues, and if a 5-residue deletion will allow the protein to fold into

its native form. The role of aspartic acid is the subject of interest in the 4∆EAD mutant.

Although the 4∆EGG mutant does give a positive phenotype in the screen, we see that it

is of intermediate fluorescence and is not as well folded as the Quad mutant. The

presence of Asp might lead to a variant of improved stability. The CHM3 mutant is

another chimera, replacing the human loop with the loop residues from the worm


sequence. The initial chimera generated did not have the tyrosine residue and since

analysis of the loop structure indicates that this tyrosine in the human loop packs into the

hydrophobic core, a chimera which includes the Tyr residue may potentially be

stabilizing.

4.7 Materials and Methods

4.7.1 Cloning and Expression of Chimera Variants

Overlap PCR was used to introduce the required mutation in the p53 gene. The resulting

products were amplified using oligos encoding BstEII and BsaI sites at the 5’ and 3’ ends, respectively. This was then ligated into the pACBADGFPuv vector between the BstEII and

BsaI sites. The transformants were analyzed by restriction digests and the cloning was

confirmed by sequencing. The libraries Q5L and Q6L, which randomize 5

and 6 residues respectively, were also constructed by overlap extension PCR using

oligos encoding NNK codons at the positions to be randomized. The cloning scheme for

the libraries was the same as that followed for the construction of the individual mutants.
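
The theoretical size of the two NNK libraries can be estimated with a short calculation. The sketch below is illustrative only and is not part of the original protocol: it enumerates the 32 NNK codons (N = A/C/G/T, K = G/T) against the standard genetic code and reports, for 5 and 6 randomized positions, the number of DNA variants, the number of distinct protein sequences, and the fraction of clones expected to carry no stop codon. The function names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def nnk_codons():
    """Return all codons matching NNK (N = any base, K = G or T)."""
    return [a + b + k for a in "ACGT" for b in "ACGT" for k in "GT"]


def library_stats(n_positions):
    """Theoretical diversity of a library with n_positions NNK codons."""
    codons = nnk_codons()
    n_stop = sum(1 for c in codons if CODON_TABLE[c] == "*")
    n_amino_acids = len({CODON_TABLE[c] for c in codons} - {"*"})
    return {
        "dna_variants": len(codons) ** n_positions,
        "protein_variants": n_amino_acids ** n_positions,
        "fraction_without_stop": ((len(codons) - n_stop) / len(codons)) ** n_positions,
    }


if __name__ == "__main__":
    for name, n in (("Q5L", 5), ("Q6L", 6)):
        print(name, library_stats(n))
```

These numbers describe only the theoretical sequence space; the number of clones actually sampled depends on transformation efficiency, which is not addressed here.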

4.7.2 Screening for Positives

The protocols as described in chapter 3 were followed. The cells containing the variant

EQ typically required four days to grow as compared to the 2 days for the normal

screening. So the screening results reported here for the loop variants are based on the 4-day growth.


4.7.3 Protein Expression and Purification

All the variants were amplified from the respective screening vectors using 5’ AAT AAT

AATCATATGGCG CAT CAC CAT CAC CAT CAC GGC GGC GAA AAT CTA TAT

TTC CAA GCT AGC TCT AGC GTG CCG AGC 3’ which includes an N-terminal

6×His tag, followed by a TEV protease cleavage site separated by a GSSG linker at the

5’ end and 5’ ATT ATTGGATCCTTA CAG GTT CTC CTC TTC GGT AC 3’ which includes a stop codon after the gene at the 3’ end. The constructs were transformed into C41 (DE3) cells and expressed and purified as described in Chapter 3.
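
The features encoded by the two primers can be checked by translating them in silico. The following sketch is an illustration, not part of the original methods: it translates the forward primer from the ATG within the CATATG (NdeI) site using the standard genetic code, and reverse-complements the reverse primer to show the stop codon next to the GGATCC (BamHI) site. The primer sequences are copied from the text with spaces removed; the helper names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

FORWARD_PRIMER = (
    "AATAATAATCATATGGCGCATCACCATCACCATCACGGCGGC"
    "GAAAATCTATATTTCCAAGCTAGCTCTAGCGTGCCGAGC"
)
REVERSE_PRIMER = "ATTATTGGATCCTTACAGGTTCTCCTCTTCGGTAC"


def translate(dna):
    """Translate complete codons of a DNA string read in frame from index 0."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def reverse_complement(dna):
    """Return the reverse complement of a DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


# The start codon is the ATG inside the NdeI site (CATATG).
start = FORWARD_PRIMER.find("CATATG") + 3
leader = translate(FORWARD_PRIMER[start:])
sense_3prime = reverse_complement(REVERSE_PRIMER)

if __name__ == "__main__":
    print(leader)        # MAHHHHHHGGENLYFQASSSVPS
    print(sense_3prime)  # sense-strand view of the 3' end of the amplicon
```

Reading the translated leader alongside the description above allows the tag, linker and protease site encoded by the oligonucleotide to be compared residue by residue with what is stated in the text.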

4.7.4 Urea Mediated Chemical Denaturation Monitored by Fluorescence

Protein denaturation was monitored using a Perkin Elmer fluorimeter. The samples were

excited at 280 nm and emission spectra were followed between 300 and 400 nm with a

band pass of 5 nm and a scan speed of 1 nm s-1. The experimental setup is described in

Chapter 2. The collected data were exported in ASCII format and were analyzed using Microsoft

Excel 2003/2007. The melt data were fit to the Clarke and Fersht equation to derive the

D50 values for the various mutants.
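
For reference, the two-state model commonly attributed to Clarke and Fersht can be written as a short function. The sketch below assumes the widely used form with linear native and denatured baselines; the exact parametrization used for the fits in this work is given in Chapter 2 and may differ. The function name, the parameter names and the m-value in the usage lines are hypothetical and chosen only for illustration; only the D50 values (3.8 M and 2.7 M urea) come from the text.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal mol^-1 K^-1


def two_state_signal(urea, alpha_n, beta_n, alpha_d, beta_d, m, d50, temp_k=298.15):
    """Observed signal at a given urea concentration (M) for a two-state unfolding
    transition with linear baselines:

        F = [(alpha_n + beta_n*[D]) + (alpha_d + beta_d*[D]) * K] / (1 + K)
        K = exp(m * ([D] - D50) / (R * T))

    m is in kcal mol^-1 M^-1 and d50 is the denaturation midpoint in M.
    """
    k_unfold = math.exp(m * (urea - d50) / (R_KCAL * temp_k))
    native = alpha_n + beta_n * urea
    denatured = alpha_d + beta_d * urea
    return (native + denatured * k_unfold) / (1.0 + k_unfold)


if __name__ == "__main__":
    # Flat baselines normalized to 1 (native) and 0 (denatured); m = 2.0 is a
    # hypothetical value, not a fitted parameter from this work.
    for label, d50 in (("Quad", 3.8), ("4ΔL+EGG", 2.7)):
        folded = two_state_signal(3.0, 1.0, 0.0, 0.0, 0.0, 2.0, d50)
        print(f"{label}: fraction folded at 3.0 M urea = {folded:.2f}")
```

With normalized flat baselines the function returns the fraction folded, so the signal at the midpoint concentration is the average of the two baselines. In a real fit all six parameters would be optimized against the measured melt.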


Chapter 5: In vivo and in vitro studies of cancer associated mutations in BRCA1

Contributions

The work presented in this chapter includes the work of Mohosin Sarkar, Brinda

Ramasubramanian, George Matic and Thomas J Magliery. The majority of the in vivo

results reported here were accomplished and written up by M. Sarkar in collaboration

with G. Matic. The in vitro experiments were designed and accomplished by B.

Ramasubramanian. The experimental design and data analysis were done by M. Sarkar,

B. Ramasubramanian and T. Magliery.

5.1 Summary

We have generated and screened a comprehensive list of BRCA1 mutants for interaction

with wild type BARD1 using an improved version of the split GFP complementation assay.

Of the thirty six different mutations generated, only a few of the mutants reduced the

association of BRCA1 with BARD1. Some of the mutations in the four-helix

interface, like V11A, M18K and, to a lesser extent, D96N and G98R, and a few of the

mutations in the Zn2+ binding regions, like C39R, C39Y and C61G, were found to reduce the

cellular fluorescence, indicative of disruption of the interaction between these proteins. Some

of the mutations in the RING finger domain, like T37R, K38N and R71G, also led to lower cellular fluorescence in the screen. Surprisingly, L52F, which is away from the interacting surface, also diminished the interaction with BARD1 and led to low cellular fluorescence. Western blots of whole cell lysates showed that the differences in the fluorescence levels were not due to reduced expression levels of mutant BRCA1. In order to decisively prove this, we selected a few mutants spanning different positions and expressed them with orthogonal tags. This enabled us to purify these variants without any

GFP fusion. We purified BRCA1 using a 6×His affinity tag from the soluble fraction.

Since the solubility of the BRCA1 is a function of its ability to bind to BARD1, the amount of BRCA1 purified correlates to its binding efficacy. The levels of expression of

BRCA1 mutants were found to be similar but the solubility and therefore the amount of

BRCA1 that purified from the soluble fraction followed a trend similar to the cellular fluorescence obtained from the in vivo assays. Future work includes the use of the screen in combinatorial experiments to explore the packing rules of this protein-protein interaction surface in addition to screening for mutations or small molecules that can lead to interface repair.

5.2 Introduction

Interaction between proteins is intrinsic to virtually every cellular process. Altering kinetic properties of proteins, facilitating substrate channeling, forming new binding sites and changing substrate specificity are some of the many measurable consequences of protein-protein interactions.196 Better understanding of the chemical aspects of


association between interacting proteins will assist in deciphering the role of crucial

mutations which are often clustered in the binding site. Mutations that lead to alteration

of affinity between proteins in a complex in turn cause the disruption of signaling

pathways, metabolic and gene regulation, and other cellular processes. An excellent example is tumor suppressor protein p53 which is found to be mutated in more than 50% of human cancers.197 These mutations, often called the hotspot mutations, are found in the

DNA binding core domain of p53. Studying the effect of these mutations contributes significantly towards understanding the role of these residues, which then leads to rational drug design. The interaction between p53 and its negative regulator MDM2 has been studied extensively. MDM2 binds to the transactivation domain of p53 and the E3

ubiquitin ligase activity of MDM2 leads to the ubiquitination of p53, targeting it to the

protease machinery.198 Interrupting this interaction was identified as one of the potential

therapeutic strategies to allow the accumulation of p53, thus favoring tumor suppression.

A variety of drugs, many of which are presently in pre-clinical studies, have resulted from the understanding of this interaction.108; 199 Thus, identifying the different interactions, understanding the extent to which they take place, and determining the consequences of the interactions can lead to effective drug design and provide useful information for forecasting side effects. One such interacting pair, the RING domains of BRCA1 and

BARD1 is implicated in a variety of cellular functions including double stranded DNA break repair, cell cycle regulation, chromatin remodeling, protein ubiquitylation and transcriptional regulation,200; 201; 202 and the significance of this interaction is reflected in

the fact that germline mutations in the BRCA1 gene predispose women to breast cancer.203

A comprehensive knowledge of various protein-protein interactions and their molecular mechanisms is essential to understanding various biological processes. The significance of protein-protein interactions is reflected in the number of techniques that have been developed in order to study these interactions.20; 33; 204 In vitro methods such as tandem affinity (TAP tag) purification, chromatin immunoprecipitation (ChIP) and microarray based methods have proven useful for the detection of protein-protein interactions. But these in vitro methods have various limitations, the foremost of which is that the in vitro processing steps involved in these studies make the results obtained from these techniques not directly comparable to the in vivo conditions. Loss of transient interactions and non-specific binding are some of the limitations of the ChIP and TAP tag methods. FRET is one of the techniques that can be used both for in vivo and in vitro studies. FRET requires both the distance and the orientation of the interaction between the proteins to be optimal. Therefore, at low, physiologically relevant concentrations,

FRET may not be detectable depending on the spectral overlap and quantum efficiency of the donor and acceptor molecules. Other in vivo methods have been developed and the most applicable has been Yeast Two Hybrid assays (Y2H).205 The principle of the original form of Y2H is based on the splitting of the transcription factor GAL4 into its

DNA binding domain (DBD) and transcription activation domain (AD). These are fused to putative interacting partners and do not interact with each other on their own to initiate transcription. When the proteins fused to the DBD and AD interact with each other, these


domains are brought together to reconstitute the transcription activation factor which can

initiate transcription of downstream reporters like GFP, lacZ, etc.206 One of the main problems with Y2H is false positives, which arise for two reasons. The first is that it does not demand direct interaction between the proteins, as even indirect interactions can reconstitute the transcription factor to initiate transcription. The other is that even when the particular transcription factor is not reconstituted, other mechanisms may initiate the transcription.

An added complication is that nuclear import is required for Y2H.207

Fluorescence complementation assays have been developed based on the same principles

as Y2H but taking care to avoid the potential negatives.207; 208 Proteins like dihydrofolate

reductase and ubiquitin have been fragmented and they reassemble only in the presence

of fused proteins. One of the drawbacks with these is that an external chromophore is

required for the detection of reassembly. Split complementation assays have been

developed with green fluorescent protein (GFP) which circumvents the need to add an

external chromophore for detection, in addition to being amenable to a variety of cell

types. Split complementation of proteins provides a powerful tool to analyze interaction

between proteins with minimal false positives. This method relies on the non-

spontaneous irreversible association of the split fragments when they are in close

proximity, brought about by the interaction between the proteins under study. Transient

interactions get trapped due to the split reassembly and therefore enable the detection of

weak interactions. In addition, the non spontaneous nature of the association ensures that

false positives are minimal. Split complementation was used as a selection technique


until Ghosh et al. introduced a method to fragment sg100 GFP in order to yield insoluble

fragments which associate only when fused to interacting proteins.209 Later, an improved version of the screen was reported by Magliery et al., who engineered efficient plasmids for incorporating the screen and demonstrated the application of the screen to libraries of larger proteins, as opposed to the peptides in the initial report.207; 210 One of the disadvantages of

the split GFP screen in these reports was that the time required for the chromophore

maturation to be detectable was between 24 h and 4 days. As an improvement to this,

Sarkar et al. have reported the use of folding reporter GFP which is a hybrid of EGFP

(F64L, S65T) and GFPuv (F99S, M153T, V163A).211 This modification allowed the

fluorescence to be monitored within 24 h. We have used this system to study the cancer

associated mutants of BRCA1.

BRCA1 forms a stable heterodimeric complex with BARD1 and this dimerization is

essential for the multifaceted role of BRCA1. Although mutations in BRCA1

predispose women to breast and ovarian cancers, sporadic cases which form the majority

of these cancers do not have mutations on the BRCA1 gene. This observation combined

with the fact that it interacts with a plethora of other proteins, leads to the conclusion that

it is the disturbances caused to these interactions that result in a variety of human diseases including defective DNA damage repair, abnormal centrosome duplication, cell-cycle

arrest, growth retardation, increased apoptosis, genetic instability and tumorigenesis.

Disruption of the interaction between BRCA1 and its partners due to mutations in

BRCA1 interacting proteins (BIPs), notably p53, also leads to a variety of cancers.203


The mechanism of interaction between p53 and BRCA1 is an active area of interest and

recent studies suggest that p53 aids in the nuclear import of BRCA1.212 The

BRCA1/BARD1 complex is also thought to support cell proliferation by retardation of

the S2 phase. The various binding partners for this complex and the mechanism by which

this complex acts as an E3 ubiquitin ligase are still unclear, but the interaction between

BRCA1 and BARD1 has been shown to be essential.213

The function of BRCA1 is governed by its nuclear localization202 and it undergoes active

nuclear export and import. This dynamic equilibrium acts as a regulation mechanism for

the activity of BRCA1. It is proposed that BARD1 plays a role in the nuclear retention

and nuclear import of BRCA1. BRCA1 is a tumor suppressor protein and the ubiquitin

ligase activity of BRCA1 is known to be governed by its ability to interact with BARD1.

BRCA1 is predominantly localized to the nucleus in the cell and the tumor suppressor

activity of BRCA1 by transactivation of genes involved in apoptosis is regulated by

nuclear BRCA1. It has been proposed that Importin α/β mediates the nuclear transport of

the BRCA1/BARD1 complex and this interaction is necessary for the efficient nuclear

transport of BRCA1. Apart from the indirect role, the heterodimer is implicated in the

direct tumor suppression activity on basal-like mammary carcinomas.214 Although the

molecular mechanism by which BRCA1 acts as a tumor suppressor is still unclear, it is

well established that the heterodimer formation of BRCA1 with BARD1 mediated through the interaction between the RING finger domains is essential for its E3 ubiquitin ligase activity.215 It is proposed that BRCA1 might aid in maintaining the genome


integrity through tumor suppression in normal cells partly via its ubiquitylation activity.

The BRCA1/BARD1 complex is also implicated in the down regulation of aromatase, which is upregulated in cancerous cells, through transcriptional repression.216 Evidence suggesting that both the ubiquitin ligase and transcriptional activities of BRCA1 are governed by its interaction with BARD1,217 in addition to the role of BARD1 in the nuclear

localization of BRCA1, further establishes the significance of this interaction.

Understanding the molecular mechanism of this interaction can lead to an understanding

of how this protein accomplishes its complex function. In addition, this interaction might

serve as a direct target for protein engineering efforts for cancer therapy.

BRCA1 forms a stable heterodimeric complex with BARD1, and this dimerization is

essential for the multifaceted role of BRCA1. The formation of the complex itself seems

to be important for the ubiquitin ligase activity which in turn is directly correlated to the

tumor suppressor activity of the BRCA1/BARD1 complex. They are structurally similar

as shown in Figure 39, and their RING domains interact with each other forming an

antiparallel four-helix bundle. Y2H screening has previously been used to study the

comprehensive set of cancer associated mutations and their effect on the interaction of

BRCA1 with BARD1 and E2 enzyme UbCh5α.218; 219 Surprisingly, they found that none of the

mutations in the interface between these proteins led to a negative phenotype for interaction. Some of their results also contradicted previous studies by Klevit.220 These results justify the need for a third complementary method of study.


Figure 39: Structural network organization of BRCA1 and BARD1.221


5.3 Studying the BRCA1 mutations using Split GFP Complementation

Initially, Sarkar et al. analyzed eight different cancer associated mutants using an

improved split GFP complementation screen.211 Following this all the 33 known cancer

associated mutants have been constructed and tested using the screen. In order to prove the validity of the split complementation system in the context of the BRCA1/BARD1 complex formation, we expressed a few mutants spanning all regions of mutations and phenotypes in the absence of the fusion GFP fragments and analyzed the interaction in vitro. We were able to see that our in vivo and in vitro results were complementary to each other, and therefore this is a good method to study this interaction.

5.3.1 Study of Cancer-associated Mutants of BRCA1 for their interaction with BARD1

The split frGFP we re-engineered was able to detect the BRCA1/BARD1 N-terminal

RING domain interactions efficiently and develop fluorescence in vivo. In a previous

study211 using this improved split GFP assay, we examined a small number of cancer-associated

mutants of BRCA1, which have been examined by Y2H methods, in vitro

methods, or both. Five mutations which were mostly buried in the helical interface

(V11A, I15T, M18T, M18K and I21V) and one mutation in the RING motif away from the interface (L52F), were selected for this initial study (Figure 44). The V11A and

M18K mutations showed a reduced fluorescence level compared to the wild type

BRCA1/BARD1. The I15T, I21V, and M18T mutations did not markedly reduce the


Figure 40: Initial set of mutations studied by Sarkar and Magliery

(a) Interaction of BARD1 with cancer-associated BRCA1 mutants studied by split frGFP reassembly1. Fluorescence was observed after 14 h of incubation at 30 °C and 12-16 h of incubation at room temperature. (b) SDS-PAGE of reassembled and purified complexes on Ni-NTA affinity column. Lane 1, MW markers; 2, positive control (pET11a-Z-NfrGFP/pMRBAD-Z-CfrGFP); 3, wild-type BRCA1; 4, V11A; 5, I15T; 6, M18T; 7, M18K; 8, I21V; 9, L52F; 10, negative control (pET11a-BARD1-NfrGFP/pMRBAD-Z-CfrGFP). (c) Western blot of HA tag-CfrGFP using anti-HA-Goat-HRP to confirm the expression level of C-terminal fragments.

fluorescence, although the L52F mutation which is away from the interface did reduce

fluorescence level. These results showed that the interface is fairly insensitive to mutation

but that underpacking and charge burial can be sufficiently deleterious to prevent

binding. Any mutation in BRCA1 may affect the expression level of the CfrGFP-BRCA1

fusion protein and may thereby affect the fluorescence level. To confirm the expression level of the BRCA1 mutant fusion proteins, a western blot was performed against an HA tag (fused to the C-terminus of the fusion protein) using an HRP-conjugated Goat-anti-HA-tag

polyclonal antibody (from GenScript). Figure 40 shows the expression levels

of all six BRCA1 mutant fusion proteins and wild type BRCA1. This result suggested

that the lowered cellular fluorescence levels we observed for these various mutations

were not due to a lack of expression.

Following this study, we aimed to examine the whole spectrum of mutations in the N-terminal

RING domain of BRCA1 observed in cancer patients for their role in

BRCA1/BARD1 interactions using the same approach. Cancer associated mutations that

were tested are shown in Figure 41. Mutations were introduced into the BRCA1 N-

terminal RING domain gene sequence of pMRBAD-BRCA1-CfrGFP vector by site

directed mutagenesis using either QuikChange (Stratagene) or overlap PCR methods. All

thirty six mutants were co-transformed with the pET11a-BARD1-NfrGFP plasmid

expressing wild-type BARD1. In addition to these, the pET11a-BARD1-NfrGFP plasmid

was co-transformed with wild type BRCA1 (positive control) and a plasmid containing a

linker, pMRBAD-linker-CfrGFP (negative control). Both fusion proteins were co-

expressed and screened for cellular fluorescence on LB plates supplemented with 10 μM

IPTG, 0.2% L-(+)-arabinose and the required antibiotics. Cells were grown for 9-12 h at

30 ºC and 12-16 h at room temperature and observed for fluorescence under long-wave

UV illumination. Any perturbation of the BRCA1/BARD1 interaction reduces the reassembly efficiency and thereby results in reduced or no cellular fluorescence, as shown in Figure 42. Bright fluorescence was observed for the positive control and no fluorescence was observed for a non-cognate negative control. The cellular fluorescence


Figure 41: Cancer associated mutations on BRCA1. The table above lists the known cancer associated mutations in the N-terminal RING domain of BRCA1 (Breast Cancer Information Core, NIH). Mutations highlighted yellow were found in the BRCA1/BARD1 binding interface. Mutations highlighted cyan and red (Zn2+ ion binding sites) were found on the RING motif. b) Solution NMR structures of the RING domains of BRCA1 (1-109 aa, cyan) and BARD1 (26-140 aa, gray) with associated Zn2+ ions (spheres) (PDB: 1JM7).194 Sticks represent the known cancer predisposing mutations on the BRCA1 RING domain 1

levels of mutants were compared with the fluorescence level obtained from wild type

BRCA1/BARD1 and with the negative control, as shown in Figure 42. In addition to V11A,

M18K and L52F, we observed that the Zn2+ binding site mutations C39R, C39Y and C61G

perturb the interaction of BRCA1 and BARD1.

Mutations in the four-helix bundle interface, G98R and D96N, and in RING motif, T37R,

K38N and R71G also showed reduced or no cellular fluorescence.

Figure 42: Cloning and screening of the cancer associated mutants of BRCA1

a) Plasmid maps, pMRBAD-BRCA1-CfrGFP and pET11-NfrGFP-BARD1, constructed for co-expression and in vivo interaction study of N-terminal RING domain of BARD1 with that of wild type. b) Cellular fluorescence obtained from screening of BRCA1 mutants for their interaction with BARD1 using the split-frGFP assay. BRCA1 mutants were cotransformed with BARD1 and cells were grown on plates at 30 °C for 12 h and at room temperature for 12-16 h before taking the picture. Pictures were taken using a long wave UV lamp with a filter that passes blue light only to excite the fluorophore. (+) indicates wild type BRCA1 with BARD1 and (-) indicates pET11-BARD1-NfrGFP cotransformed with CGFP fused with a linker (pMRBAD-Link-CfrGFP).1


5.3.2 In vitro Analysis of Binding Interaction

Only a few mutations on the interface caused reduced fluorescence when screened using the split GFP complementation assay. We used in vitro expression of these proteins in the absence of the GFP fusions in order to further probe this interface. In vitro experiments pose a challenge as these proteins are known to form insoluble aggregates and need to be purified from inclusion bodies. We consistently observed that BRCA1 did not partition into the soluble fraction when expressed alone. On the other hand, BARD1 was soluble when expressed as a C-terminal fusion to maltose binding protein (MBP). Therefore, we analyzed the solubility of some of the mutations which resulted in positive and negative phenotypes in the in vivo screen when co-expressed with MBP-BARD1. We expressed these variants from compatible plasmids separately and by co-expression. The details of the vectors designed for this are given in Figure 43. We cloned the BRCA1 variants between the NcoI and BamHI sites into the multiple cloning site of the pET11a vector, which has a ColE1 origin and encodes an ampicillin resistance gene. BARD1 was cloned between the NcoI and BstEII sites of the pMR101 vector, which encodes kanamycin resistance and has a p15A origin. When the in vivo screen was designed, BRCA1 was expressed both as a C-terminal fusion to the N-terminal fragment of GFP and as an N-terminal fusion to C-terminal GFP. BARD1 was also expressed as N- and C-terminal fusions to the respective counterparts. Co-transformations of BRCA1 with a C-terminal

GFP fusion and BARD1 with an N-terminal GFP fusion yielded the positive phenotype.

In addition, when cotransformed with corresponding zipper fusions, these


Figure 43: Vectorology for in vitro expression of the BRCA1/BARD1 complex

a) Plasmid maps of pET11a-BRCA1-His6 tag and pMR-MBP-BARD1, constructed for co-expression and in vitro interaction study of the N-terminal RING domain of BARD1 with that of wild type and mutant BRCA1 variants

failed to complement in the presence of zippers, thereby leading to a low occurrence of

false positives. Therefore, we decided to express BRCA1 with a C-terminal 6×His tag and

BARD1 with an N-terminal MBP tag for the in vitro experiments. This orthogonal

tagging would allow

us to purify only the specific proteins in a co-expression. Co-expression from compatible

plasmids will also ensure that both the proteins are exposed to similar growth conditions.

The proteins were co-expressed using the aforementioned compatible plasmids and the

levels of expression from equal amounts of cells were analyzed. The amount of cells was

normalized using OD600. SDS-PAGE of equal amounts of these cells showed that the


Figure 44: Screening and in vivo results for a selected set of variants. a) Equal amounts of whole cell lysate showing equivalent expression for all the variants. b) Equal amounts loaded from purification of the soluble complex from normalized amounts of cells using the 6×His tag. c) Cellular fluorescence obtained from screening of BRCA1 mutants for their interaction with BARD1 using the split-frGFP assay. BRCA1 mutants were cotransformed with BARD1 and cells were grown on plates at 30 °C for 12 h and at room temperature for 12-16 h before taking the picture. Pictures were taken under a UV illuminator using a long wave UV lamp with a filter that passes blue light only to excite the fluorophore. (+) indicates wild type BRCA1 with BARD1 and (-) indicates pET11-BARD1-NfrGFP cotransformed with CGFP fused with a linker (pMRBAD-Link-CfrGFP).

total proteins from these were roughly equal. But purification of the variants from normalized amount of cells yielded results which corresponded well with the trend of fluorescence levels (Figure 44). Since BRCA1 partitions into the inclusion bodies and is known to form a soluble complex with BARD1, we use this solubility to test the interaction between BRCA1 and BARD1. Only when the variant of BRCA1 binds to

BARD1, it will yield cellular fluorescence in the split complementation assay and can be purified from the soluble fractions in the co-expressions.

The results from the in vitro expressions correlated well with the observations from the

screen. In vivo screening showed that M18K was the least fluorescent followed by V11A

and R71G. The fluorescence levels when transformed with I21V and L52F were

intermediate between WT and the least fluorescent variants. The highest fluorescence levels were observed for the cotransformation of the zipper. The proteins were isolated using IMAC purification and the trends show that the results obtained from the split GFP screen were indeed reporting on the strength of interaction between WT BARD1 and

BRCA1 variants.

5.4 Future Directions

It is evident that in cells BRCA1 is stabilized by BARD1 and that the interaction is essential for the regular cellular functions of BRCA1, such as DNA break repair and chromatin remodeling. A large fraction of its cellular functions come from the E3 ubiquitin ligase activity of the BRCA1/BARD1 complex, through ubiquitylation of itself or various other, mostly unknown, substrates. Though the clinical relevance of most of the cancer predisposing BRCA1 mutations is not yet clearly known, it is believed that the loss or perturbation of the interaction, and consequently the loss of E3 ligase activity, perturbs the regular events in the cell and leads to uncontrolled cell proliferation.

What dictates the association and stability of this robust antiparallel heterodimer? The split complementation assay can be used to screen for stable interacting BRCA1/BARD1 complexes from a library of BRCA1 variants. In addition to gaining insight into the core packing rules in the context of an antiparallel heterodimer, we can screen for stabilizing second site mutations that reinstate the dimerization. Restoring the biological function of the BRCA1/BARD1 complex in cells by returning the interaction to its native state using small molecule drugs has the potential to treat cancers caused by BRCA1 mutations.

Future work includes screening of small molecule or cyclic peptide libraries (SICLOPPS, split intein-mediated circular ligation of peptides and proteins) using the split-frGFP screening method to identify potential compounds that can rescue the interaction.

5.5 Materials and Methods

5.5.1 Construction of BRCA1 Mutants

An HA (YPYDVPDYAK) tag was introduced at the C-terminus of CfrGFP in the pMRBAD-BRCA1-CfrGFP vector using a 5' forward primer (AATAATAAT CCATGGATTTAT CTGCTCTCCG CGTTGAAGAA G) and a 3' reverse primer (AATAATAATGTACA TTACTTAGCGTAATCTGG AACATCGTATGGGTA GAGGAGCCACTCGA ACC TTT GTA GAG CTC ATC CAT GCCATG) encoding an SGSS linker and an HA tag. A total of 36 BRCA1 point mutations (listed in Figure 45) were introduced into the BRCA1 gene by site-directed mutagenesis, using either overlap PCR or QuikChange (Stratagene) mutagenesis.

Figure 45: Vector map with HA tag for western blots

pMRBAD vector used for constructing cancer predisposing BRCA1 mutants. The C-terminus of CfrGFP in the pMRBAD-BRCA1-CfrGFP vector (Figure 42) was tagged with HA (YPYDVPDYAK) between the NcoI and AatII sites, giving pMRBAD-BRCA1-CfrGFP-HA.

Mutations S4P, R7C, V11A, I15T, M18T, M18K and I21V were introduced by overlap PCR using a mutant 5' primer and a 3' primer within the NcoI and BclI restriction sites, which are 117 bases apart. Since BclI is sensitive to Dam methylation, the vector pMRBAD-BRCA1-CfrGFP was transformed into a dcm-/dam- strain for plasmid preparation. Two synthetic oligonucleotides carrying the required point mutations, with an 18 base overlap, were used to generate the gene sequence between the NcoI and BclI sites. For the L52F mutation, a modified overlap method was used to insert the point mutation, using MluI and BsrGI sites for cloning. Mutations C24R, L28P, I31M, T37R, C39R, C39Y, I42V, C44F, K45N and C47G were introduced by overlap PCR using a 5' primer and mutant 3' primers within the NcoI and SphI sites. Mutations K38N, C61G and L63F were introduced by QuikChange PCR (Stratagene). Mutations C64G, C64R, C64Y, D67Y, I68K, I68R, R71G, S72R, T77M, L87V, I89M, I89T, I90T, D96N and G98R were introduced by synthesizing the whole gene using six synthetic oligonucleotides (including 5'WT-NcoIfwL, 3'WT-SphIreL and 5'WT-SphIfwL). All inserted mutations were confirmed by DNA sequencing at GeneWiz, Inc. (South Plainfield, NJ). The oligonucleotides used for creating the BRCA1 mutants are listed in M. Sarkar’s PhD dissertation.
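As a quick sanity check on the tagging primer described in this section, the HA-encoding portion of the 3' reverse primer can be reverse-complemented and translated. The sketch below is illustrative only and is not part of the original workflow: it joins the two adjacent primer segments TTACTTAGCGTAATCTGG and AACATCGTATGGGTA as printed above, assumes the standard genetic code, and uses hypothetical helper names (reverse_complement, translate).

```python
# Illustrative check (an assumption-laden sketch, not the original protocol):
# does the HA-encoding part of the 3' reverse primer read YPYDVPDYAK + stop?

BASES = "TCAG"
# Standard genetic code with codons enumerated in T, C, A, G order.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def translate(seq):
    """Translate a DNA sequence in frame 1; '*' marks a stop codon."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


# The two adjacent segments of the reverse primer that follow the GTACA group.
ha_segment_in_primer = "TTACTTAGCGTAATCTGG" + "AACATCGTATGGGTA"

coding_strand = reverse_complement(ha_segment_in_primer)
peptide = translate(coding_strand)

print(coding_strand)  # TACCCATACGATGTTCCAGATTACGCTAAGTAA
print(peptide)        # YPYDVPDYAK* (the asterisk is the stop codon)
```

The same two helpers could be applied to any mutagenic primer to confirm that the intended codon change is present before ordering the oligonucleotide.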

5.5.2 Screening for Positives

All the screening experiments were carried out following the protocols described by Regan and coworkers. Compatible pairs of plasmids (e.g., pET11a-BARD1-NfrGFP and pMRBAD-BRCA1-CfrGFP) were cotransformed into electrocompetent BL21(DE3) E. coli cells by electroporation. Cells were grown overnight to saturation at 37 °C in LB supplemented with 100 μg mL⁻¹ ampicillin and 35 μg mL⁻¹ kanamycin. Five to 10 μL of a 1:1000 dilution of the saturated culture was plated on LB agar supplemented with 20 μM IPTG, 0.2% arabinose and antibiotics. Plates were incubated at 30 °C for 10-12 h and then at room temperature for 12-16 h before taking pictures. In each case green fluorescence was observed on a transilluminator (UVP Inc.) using long wavelength (365 nm) UV irradiation. Pictures were taken with a digital camera under UV transillumination using a filter that allows only blue light to pass through.

5.5.3 Affinity purification of fusion protein and interaction partners

BL21 (DE3) cells containing compatible plasmids were grown overnight to saturation. Two mL of LB broth containing kanamycin and ampicillin was inoculated with 40 μL of saturated culture for each sample and grown at 37 °C to OD600 ~0.60. Cells were diluted 1:1000 and 100 μL was plated on screening plates supplemented with 10 μM IPTG, 0.2% arabinose, 100 μg mL⁻¹ ampicillin and 35 μg mL⁻¹ kanamycin. After growing for 24 h at 30 °C and 48 h at room temperature, cells were resuspended in 1X phosphate buffered saline (PBS). The OD600 of 100-fold diluted cells was measured to normalize the cell densities. Cells were then harvested by centrifugation and each pellet was resuspended in 2.5 mL of lysis buffer (50 mM Tris-HCl, 200 mM NaCl, 100 μM ZnCl2, 0.1% Tween 20, 5 mM β-mercaptoethanol, 10 mM imidazole, pH 8.0) containing 0.5 mg mL⁻¹ HEW lysozyme, DNase, RNase, PMSF, and 0.5 mM MgCl2. After sonication and centrifugation, the cleared lysate was collected, mixed with 100 μL of Ni-NTA agarose (Qiagen) equilibrated with lysis buffer, and left at 4 °C for 2 h with gentle shaking. The resin was washed twice with 5 volumes of wash buffer (lysis buffer containing 20 mM imidazole) and purified proteins were eluted with 300 μL of elution buffer (lysis buffer containing 250 mM imidazole).

5.5.4 Western Blots Using Anti-HA Antibody

Split fragments with fused proteins or peptides were expressed following the procedure described above. An appropriate amount of the cell pellet was resuspended in 200 μL of lysis buffer (50 mM Tris-HCl, 200 mM NaCl, 100 μM ZnCl2, 0.1% Tween 20, 5 mM β-mercaptoethanol, 10 mM imidazole, pH 8.0) and mixed with 100 μL of glass beads (Biospec, Sigma). The suspension was vortexed vigorously to achieve complete lysis. Following centrifugation for 30 min at maximum speed, the cleared lysate was collected and mixed with an equal volume of SDS buffer. Samples were run on a 12.5% SDS-PAGE gel. Protein bands were transferred onto a PVDF membrane (Pierce) using a TE22 Mighty Small Transphor Unit (80-6205-59) from Amersham Biosciences following the manufacturer’s protocol. Following the “OneStep TMB Blot” (Pierce) protocol, the membrane was treated with 40 mL of a 1:4,000 dilution of anti-HA-Goat-HRP in TBST buffer (25 mM Tris-HCl, 150 mM NaCl, 0.05% Tween-20, pH 7.6). The membrane was then treated with OneStep TMB substrate (Pierce) following the supplier’s protocol. An image of the blot was taken under White Light UV illumination using a Gel Logic 100 imaging system from Kodak.

5.5.5 Purification of BRCA1/BARD1 Complex

pET-11a BRCA1 variants and pMR-BARD1 wt were cotransformed into BL21(DE3) cells. Single colonies from these transformations were grown to saturation. 25 mL of the saturated culture was diluted into 1 L of 2YT media and the cells were grown to an OD600 of 0.6-0.1 at 37 °C, following which the cells were cooled to 25 °C, the media was supplemented with 10 µM ZnCl2, and protein synthesis was induced using 0.1 mM IPTG. Equal amounts of cells (normalized by OD600) were harvested after 16 h by spinning down at 7,000 rpm for 5 minutes and the pellets were frozen at -80 °C. The pellets were then resuspended in 25 mL of 50 mM Tris, 300 mM NaCl, 10 mM imidazole, 10% glycerol, pH 8.0. To this, 5 mM MgCl2, 0.5 mM CaCl2, 2 U/mL DNase (Pierce), 200 ng/mL RNase (Fisher), and 0.1% Triton X-100 were added, and the cells were lysed using an EmulsiFlex (Avestin). The soluble fraction was isolated by centrifugation at 30,000 g for 30 min. The supernatant was allowed to bind to 1.0 mL of 50% Ni-NTA agarose slurry (Qiagen) for 1 h. This was then loaded onto a large pre-fritted column and washed twice with 6 mL of wash buffer (50 mM Tris (pH 8.0), 20 mM imidazole, 300 mM NaCl, 10% glycerol and 15 mM BME). The proteins were then eluted from the column using 3 mL of elution buffer (50 mM Tris (pH 8.0), 250 mM imidazole, 300 mM NaCl, 10% glycerol and 15 mM BME). Equal amounts of the protein co-expressions were examined on 18% SDS-PAGE gels to assess the amount of soluble BRCA1 and the corresponding amount of BARD1 bound to the BRCA1 variant.
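Normalizing by OD600, as done before harvesting, amounts to taking a volume of each culture that is inversely proportional to its optical density, so that every sample contains the same number of OD units (OD600 × volume). The following is a minimal sketch of that arithmetic; the sample names, OD readings and reference volume are made-up placeholders rather than values measured in this work, and the function name is hypothetical.

```python
def normalized_volumes(od600_by_sample, reference_volume_ml=1.0):
    """Return the culture volume (mL) per sample holding equal OD600 units.

    OD units = OD600 x volume. The least dense culture is taken at
    reference_volume_ml and denser cultures are scaled down accordingly.
    """
    if any(od <= 0 for od in od600_by_sample.values()):
        raise ValueError("OD600 readings must be positive")
    target_units = min(od600_by_sample.values()) * reference_volume_ml
    return {name: target_units / od for name, od in od600_by_sample.items()}


# Hypothetical OD600 readings (already corrected for any dilution factor).
readings = {"sample_A": 2.0, "sample_B": 1.6, "sample_C": 2.5}
volumes = normalized_volumes(readings, reference_volume_ml=1.0)

for name, volume in volumes.items():
    print(f"{name}: harvest {volume:.2f} mL")
```

If the OD600 was read on a diluted aliquot (for example the 100-fold dilution mentioned in Section 5.5.3), the reading should be multiplied by the dilution factor before it is passed to the function.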

Chapter 6: Extensions to the p53 bacterial screen

6.1 Introduction

The ability to interfere with different protein-protein interactions will lead to an understanding of normal cellular processes and possibly to the control of medical disorders. The use of proteins as therapeutics has gained momentum since the early 1980s, when the first protein drug, recombinant insulin, was created by Genentech. Now there are approximately 200 marketed products, mainly protein therapeutics.2; 222 Initially, protein drugs were recombinant proteins supplementing the function of a natural protein, but with the advances in protein engineering, designed proteins are gaining more significance as drugs. Diabetes, multiple sclerosis, growth hormone deficiency and cancer are some of the human diseases that are being targeted using protein engineering methods.

One of the biggest challenges with protein drugs is that they are not bioavailable, especially when administered orally, because they are targeted for degradation by the cellular and digestive machinery. Therefore protein drugs generally require administration by injection, for example intravenously. Small molecule drugs are an important class of therapeutics which combat this by virtue of being non-native. Lipinski’s rules provide a guideline for the properties that a drug molecule should possess in order to have effective pharmacokinetic properties in vivo.223 The advantages of using peptides as drugs have been well documented. Lead compounds based on peptides have been designed to rescue the function of p53 for the treatment of cancer224 and to target Eph receptors, the signaling pathways of which are implicated in a variety of human disorders.225 Naturally occurring antimicrobial peptides have served as a tool in the design of molecular templates for new anti-infective drugs.226

For the past thirty years p53 has been recognized as an important target for the development of anticancer treatments. In addition to being implicated in cancer, p53 has also been found to play a role in a plethora of human diseases including Alzheimer’s disease (AD), ischaemia and Parkinson’s disease.227; 228 Recent studies indicate an accumulation of p53 with a mutant-like conformation in aging and in AD patients. The amount of such p53 is also dependent on the concentration of β-amyloid, the aggregate mainly implicated in AD.229; 230; 231 This conformational change in p53 also resulted in the cells being less sensitive to apoptosis. A significant portion of the effort towards targeted drug discovery is aimed at re-activating tumor suppression by mutant p53. The dominant negative effect of mutant p53 leads to progressively smaller amounts of wild type p53 in cancer cells.232 Therefore re-activating the p53 pathway in the presence of mutant p53 is significant in the fight against cancer.

A portion of pharmaceutical cancer therapy depends on the ability of drugs to induce apoptosis, the programmed cell death. Since one of the main functions of p53 is to induce apoptosis under conditions of cellular stress, it has served as a natural target for drug discovery, and a variety of protein based drugs have been developed that can interact with p53 and lead to apoptosis of the tumor cells. The different strategies to reactivate p53 include stabilizing mutant p53 and interrupting the p53-MDM2 interaction. Gene therapy advancements have led to two different marketed products, namely Gendicine and Oncorine.233; 234 Treatment using Gendicine relies on the delivery of wild type p53 to tumor cells. On the other hand, Oncorine is an E1B-defective adenovirus which replicates selectively in tumor cells and destroys them. A major setback of such gene therapy approaches is their adverse immunological effects, and they are therefore not approved for use in the United States. PRIMA-1 is a small molecule drug that has been identified to be effective in cancer therapy, but the exact mechanism of action of this drug is only starting to become established. It is proposed that the degradation products of PRIMA-1 and PRIMA-1MET lead to the formation of thiol reactive groups which can facilitate the formation of a native-like conformation of p53 by preventing disulfide bonds in the protein that can sequester folding intermediates.124; 125 Another small molecule drug, RETRA, reactivates p73, a homologue of p53 which can also act as a tumor suppressor and is not mutated in cancer cells.235 Targeting the MDM2-p53 interaction is another approach that has been taken in addition to rescue of the transcriptional activation function of mutant p53.236 Nutlins and RITA are two other small molecule drugs that function by inhibition of the p53-MDM2 interaction.112; 113 The crystal structure of MDM2 bound to Nutlin provided convincing evidence that it disrupts the p53-MDM2 interaction by binding to the same pocket that the p53 peptide binds to. RITA has also been shown to activate apoptosis by interacting with MDM2 in various cell lines. In addition it has been shown that RITA is capable of reactivating mutant p53, along with wild type p53, and triggering p53-dependent apoptosis.113 Small molecule drugs such as Tenovin-6 and Leptomycin B function by targeting other p53 modifying proteins such as sirtuin and CRM1.102 The small molecule PhiKan083 has been identified and shown to stabilize the Y220C mutant of p53 by elevating its melting temperature.89 In silico screening and docking based on the crystal structure of the mutant allowed precise modeling of the small molecule. Convincing evidence for the mode of thermal stabilization was provided by the crystal structure of Y220C p53 bound to PhiKan083 at the pocket generated by the mutation (Figure 10). Small molecules have thus been at the forefront of targeted drug design and have resulted in a number of successful drugs on the market.

One of the limitations in the synthesis of small molecules for drug discovery is the small size of the libraries that can be generated by typical synthetic methods. Biological synthesis of small molecules offers a powerful and facile method for screening large libraries when combined with in vivo screens. This has the potential to yield large libraries which are generated in vivo and therefore do not need additional strategies to render them cell permeable, one of the significant challenges of in vitro library generation for in vivo screening. Scott et al. have utilized the split inteins from the naturally occurring Synechocystis sp. PCC6803 DnaE intein to generate large libraries of peptides which cyclize via head-to-tail intramolecular backbone cyclization.237; 238 This Split Intein-mediated Circular Ligation Of Peptides and ProteinS (SICLOPPS) method has a split intein arranged so that the linker region between the intein fragments is post-translationally spliced out as a cyclic peptide. The overall scheme for the generation of cyclic peptides is shown in Figure 46. The linker can be as small as four amino acids or as large as a protein, and there is no limitation on the sequence of the linker apart from the position of the nucleophile (usually cysteine) to direct splicing. The SICLOPPS plasmid has been engineered by Abel-Santos et al. to facilitate the ligation of different peptide libraries so that screening for effective cyclic peptides is made easy.239

Horswill et al. have successfully combined SICLOPPS libraries with a bacterial reverse two-hybrid system for the identification of inhibitors of several protein–protein interactions.240 In this system, association of DNA binding domain fusions, induced by IPTG, represses a promoter controlling the expression of HIS3, Kan and lacZ; this blocks survival in media lacking histidine, and residual activity can be measured using the expression of lacZ. Small molecule inhibitors of the DNA binding domain association were identified using this reverse hybrid system. Using this method, the authors were able to identify inhibitors of different PPIs which have implications in HIV and cancer. Antimicrobial peptides based on SICLOPPS molecules were discovered by selecting for growth under varying concentrations of the inducer.241 Cyclic peptide libraries have also been used in a yeast selection system to identify SICLOPPS molecules that counter the toxicity of human α-synuclein, a significant molecular marker for Parkinson’s disease.242 Recently, Young et al. have reported the use of the SICLOPPS methodology to incorporate unnatural amino acids into the cyclic peptides to generate novel protease inhibitors. We propose to adapt this robust, well-engineered split intein system, in combination with the in vivo screen that we have developed, to identify cyclic peptides that can mediate and stabilize the interaction between p53 and DNA. These peptides can be potential lead compounds for anti-cancer therapy targeting the function of p53.

Figure 46: Principle of production of SICLOPPS

Splicing by the split intein flanking the peptide of interest results in the formation of cyclic peptides in vivo.242

6.2 Optimization of SICLOPPS Expression from pGFPuv-BD1

We amplified the cyclic peptide library from the pET28 Npu INIC vector using PCR with primers encoding AatII and SpeI sites. The amplified products were ligated into the pGFPuv-BD1 vector. This target vector was chosen because it allows us to screen the cyclic peptide library against a variety of p53 variants expressed in trans from a separate plasmid. The pGFPuv-BD1 SICLOPPS plasmid has a ColE1 origin and encodes ampicillin resistance. The p53 variants, on the other hand, are expressed under the control of an arabinose promoter from the pACBAD vector, which has a p15A origin and a kanamycin resistance marker. The scheme of the screen including SICLOPPS expression is depicted in Figure 47. The screen was previously optimized in DH10B cells, since the two promoters that express GFP and p53 are the lac and arabinose promoters.

Figure 47: Vector maps for screening for active p53 in the presence of SICLOPPS

The vector map on the left indicates the original reporter vector, which is modified to express SICLOPPS. On the right is the plasmid which expresses the p53 variant of interest.

Since SICLOPPS expression is regulated by a T7 promoter, we needed to move the screening system to BL21 (DE3), a cell line which contains the cellular machinery to synthesize T7 RNA polymerase, enabling the synthesis of the cyclic peptides in abundance. As a result, we needed to re-optimize the expression levels of the reporter and the p53 molecules in the new system. We plated the controls side by side while varying the concentrations of IPTG and arabinose, the incubation time, and the temperature. At 30 °C, 48 hours, 0.0075% arabinose (v/v), and 0.1 mM IPTG, the GFP level was highest in the p53 WT streaks, the Quad streaks showed the least fluorescence, and the colonies were large and distinct. Most importantly, the difference between the positive and negative controls was the greatest under these conditions, making the screen more sensitive (Figure 48).

We tested the effect of expression of SICLOPPS peptides by comparing the phenotype of the cells in the absence of p53. The plates are shown in Figure 49. Surprisingly, we found that transformation of pGFPuv-BD1-SICLOPPS into BL21 (DE3) yielded a mixed population of cells with respect to fluorescence levels, whereas transformation into DH10B gave a uniformly fluorescent population of cells. This suggested that expression of SICLOPPS somehow interfered with the expression of GFP from the same plasmid. This is consistent with the results obtained by Kritzer et al., who found that a large percentage of the cyclic peptides affect the expression of a protein not in a protein-dependent manner but in a promoter-dependent manner. The initial cloning scheme put the C-terminus of the intein Ic towards the C-terminus of GFPuv. In order to reduce the complexity of the screening system and to enable us to delineate whether the observed results arise from cyclic peptide-mediated p53-DNA interaction or from promoter-dependent effects of the cyclic peptides, we decided to move the split intein gene encoding the cyclic peptides into a plasmid that has been reported to have a novel origin and is compatible with both ColE1 and p15A origins.

Figure 48: Optimized screening conditions in BL21

The streaks on the left are WT p53, which appears as a negative in the screen, and those on the right are Quad, which is a positive in the screen. The conditions were optimized to be 0.0075% arabinose and 0.1 mM IPTG.

6.3 Engineering the pEC vector to express SICLOPPS

The pEC vector was reported as a stable, high copy plasmid with a novel origin that is compatible with the well known ColE1 and p15A origins of replication in bacteria.243 Zhang et al. isolated the minimal replication region of the cryptic plasmid pEIB1 from Vibrio anguillarum using a selection for kanamycin resistance. They reported that this novel origin is inherited for over 120 generations and can be co-maintained with ColE1 and p15A origins. Their analysis places the copy number of this plasmid slightly lower than that of pUC18 (a ColE1 plasmid), with a copy number ratio of 1:1.3 for pEC:pUC18. The published plasmid map and the gel depicting the copy number analysis are shown in Figure 50. This novel plasmid represents a simple and useful expression system with a high copy number that enables us to select for the protein to be expressed in a multi-plasmid system.

The original construct made by Zhang et al. contains a kanamycin resistance marker and an arabinose promoter under the control of which any protein of interest can be cloned. In the screen designed to identify functional variants of p53, the p53 variants are expressed under the control of an arabinose promoter from a plasmid containing a kanamycin resistance marker. The reporter plasmid for the screen is based on a ColE1 origin of replication and encodes ampicillin resistance.

Figure 49: Comparison of DH10B and BL21 cell lines

In the absence of a p53 variant, the transformation of pGFPuv-BD1-SICLOPPS into DH10B yields uniformly fluorescent cells (a), whereas BL21 (DE3) yields non-uniform cells. Non-fluorescent cells are highlighted using black boxes.

Figure 50: pEC vector and copy number

The published map of the pEC vector is shown on the left with the multiple cloning sites listed.243 On the right is a comparison of the copy number of the pEC vector with other known systems. Lane 1, 0.1 µL pUC18 and pEC; lane 2, 0.5 µL pUC18 and pEC; lane 3, 1 µL pUC18 and pEC; lane 4, 5 µL pUC18 and pEC; lane 5, 10 ng pUC18; lane 6, 20 ng pUC18.

In order to make the pEC vector orthogonal to both the screening and reporter plasmids, we replaced the kanamycin resistance marker with chloramphenicol resistance. The chloramphenicol resistance marker was amplified by PCR from the pSup MjTyrRS-6TRN plasmid using forward and reverse primers which have NcoI and BglII sites, respectively. The amplified fragment was ligated into the pEC vector between these sites. This scheme effectively replaces the kanamycin resistance gene with the gene coding for chloramphenicol resistance. Following this, we tested the copy number of the modified plasmid, and the results indicate that the change of resistance marker did not affect the copy number of the pEC system. We extensively compared the copy number to the known replicons, namely pBR322 (ColE1), pUC19 (a mutation in ColE1 leading to a slightly higher copy number than the original ColE1), pAC (p15A) and pEC (pEIB1). Equal volumes of DNA prepared from normalized amounts of cells and loaded on the gel indicate that the copy number of pEC (pEIB1) lies between those of pBR322 (ColE1) and pAC (p15A). pUC19 has the highest copy number, followed by pBR322 (ColE1), then pEC, and finally pAC (p15A). The modified plasmid and the copy number obtained are shown in Figure 51.

Figure 51 also shows the position where the SICLOPPS library will be cloned. This will effectively provide us with a plasmid which is orthogonal to the screening and reporter vectors of the p53 screen with respect to both origin of replication and antibiotic resistance. This will allow facile transformation of the SICLOPPS library, the p53 variant, and the reporter GFP plasmid into the same cell.

Following this, we plan to express the SICLOPPS library from this orthogonal vector. This will allow us to express the SICLOPPS molecules independently of the expression of p53 and the reporter GFPuv. In addition, making the resistance marker orthogonal to both kanamycin and ampicillin gives a vector system that is orthogonal with respect to both origin of replication and resistance marker. The orthogonal vector system provides us with an opportunity to choose the p53 variant against which we screen for cyclic peptides that can rescue the interaction with its DNA binding domain.
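The orthogonality requirement described above (no two co-maintained plasmids sharing an origin of replication or a resistance marker) can be written down as a simple pairwise check. The sketch below is only an illustration of that bookkeeping: the origins and markers are taken from the text, the name "pEC-CAM-SICLOPPS" is a hypothetical label for the proposed construct, and treating differently named origins as compatible is a simplifying assumption that does not replace an experimental compatibility test.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Plasmid:
    name: str
    origin: str   # origin of replication
    marker: str   # antibiotic resistance marker


def conflicts(plasmids):
    """Return (plasmid_a, plasmid_b, reason) for every shared origin or marker."""
    problems = []
    for a, b in combinations(plasmids, 2):
        if a.origin == b.origin:
            problems.append((a.name, b.name, "origin"))
        if a.marker == b.marker:
            problems.append((a.name, b.name, "marker"))
    return problems


reporter = Plasmid("pGFPuv-BD1", "ColE1", "ampicillin")
p53_vector = Plasmid("pACBAD-p53", "p15A", "kanamycin")
# Hypothetical label for the modified pEC vector carrying the library.
library_vector = Plasmid("pEC-CAM-SICLOPPS", "pEIB1", "chloramphenicol")
# The unmodified pEC vector, which still carries kanamycin resistance.
original_pec = Plasmid("pEC (unmodified)", "pEIB1", "kanamycin")

print(conflicts([reporter, p53_vector, library_vector]))
print(conflicts([reporter, p53_vector, original_pec]))
```

With the chloramphenicol marker the three-plasmid set reports no conflicts, whereas the unmodified pEC vector clashes with pACBAD-p53 on the kanamycin marker, which is the reason for the marker swap described in this section.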

Figure 51: Proposed SICLOPPS in pEC vector

The map on the left shows the modified pEC vector, with chloramphenicol resistance and the proposed cloning scheme for the SICLOPPS library. The gel on the right shows equal amounts of DNA loaded for the pEC, pUC, pAC and pBR origins. Lanes are labeled M, marker; 1, pBR322; 2, pUC19; 3, pACYC177; 4, pEC CAM.

6.4 Approaches to relax the stability stringency of the p53 screen

One of the characteristics of the screen that we have developed is that the occurrence of positives is quite low, since only variants with Quad-mutant-like stability pass the screen. While this may be advantageous for identifying those mutants which exhibit tight binding to the consensus DNA, it is a potential hurdle to interrogating the effects of mutations and collecting statistically significant data to analyze the packing effects in this protein. The low occurrence of positives may be due either to the inherent intolerance to mutations of the marginally stabilized scaffold or to only tight binding variants being able to interfere with the transcription of the reporter protein. Therefore we propose to relax the stringency of the screen in two different ways.

One of the approaches that we are adopting is to screen in the context of a ‘Hexa’ mutant (HM), a rationally designed variant of p53 which has higher stability than the Quad mutant and DNA binding properties similar to those of the WT.82 Bloom et al. have reported that a protein’s capacity to evolve is enhanced by the mutational robustness conferred by extra stability.244; 245 They show that protein engineering by directed evolution is more efficient if direct selection for protein stability is used to increase a protein’s evolvability. The extra stability is neutral with respect to selection for protein function but can be crucial in allowing the protein to tolerate destabilizing mutations that are functionally beneficial. Directed evolution on a stabilized variant of cytochrome P450 has led to variants of higher catalytic activity than when starting from the wt. These studies propose that the functional expression level of a protein remains largely unaffected as long as its stability remains above a certain threshold. Increasing the stability widens the margin above this threshold, which in turn results in a higher tolerance to mutations.4 From these we concluded that increasing the stability of the scaffold in which the core randomization libraries are generated can be beneficial in allowing a larger number of mutations to be tolerated in the core domain.

Khoo et al. have characterized this mutant, which has two mutations, Y236F and T253I, in addition to the Quad mutations M133L, V203A, N239Y and N268D (Figure 52). These residues are found in the more stable p53 paralogs p63 and p73 and were found to stabilize the core domain. A comparison of the crystal structures of the HM and WT p53 showed that there are minimal differences between them. They also measured DNA binding and found that the HM binds p21 DNA, a known target of WT p53, with WT-like affinity. Therefore the HM provides a stabilized template with WT-like DNA binding for generating core mutants. Generating core mutants in the context of a stabilized variant might aid us in acquiring data from functional but weakly stabilized variants. The authors also report that these mutations stabilized some intermediate or aggregate in the unfolding pathway, making it impossible to compare the unfolding energies of WT, QM and HM. The WT and QM are known to unfold via a two state mechanism during urea mediated denaturation. In comparison, the urea mediated denaturation pathway of the HM is governed by the temperature and the concentration of the protein at which the unfolding is measured. We generated the HM using Quad as a template by overlap extension PCR with mutagenic primers. The penta mutant, which was generated as an intermediate in the construction of the HM, and the HM were screened and compared to the QM.

Figure 52: A comparison of the Quad and the Hexa structures

a) Structure of Quad (PDB ID 1UOL) with the 4 residues that are mutated in comparison to WT highlighted in red. b) The residues that are mutated in Hexa in comparison to Quad are highlighted in blue and the Quad residues are shown in red.

All three variants exhibited similar fluorescence levels, indicating that the HM can be used as the WT for the screen. The various hotspot mutants were also constructed in the context of the HM. V143A and R175H were constructed by ligating digested fragments from the respective constructs in the QM context between the NdeI and BsrGI sites in the HM. The insert for the R273H mutation was generated by overlap extension PCR and ligated between the BsrGI and BsaI sites in the HM. Initial results obtained from screening these variants in comparison to the mutants in the QM context show that the HM and the hotspots exhibit cellular fluorescence similar to that in the QM context. The limits of the screen need to be further explored in the presence of the HM. For example, the screen in the framework of the HM may exhibit low cellular fluorescence at different concentrations of the protein, i.e., at different concentrations of arabinose, allowing us to expand the operating window and enabling us to screen for functional variants of marginal stability. We will construct core randomized libraries of the HM of p53 and screen for stable functional variants. This will allow us to decipher the rules of packing and stability of this protein.

Another approach that we have adopted is to screen in the presence of the tetramerization domain. It has been reported that although the core domain determines the specificity of DNA binding by p53, the tetramerization domain contributes to the stability of the protein-DNA complex.71 The p53 tetramer is a dimer of dimers. It has been shown that the first event of p53 binding to DNA occurs as a dimer and this


becomes the site of tetramerization. The binding of p53 core (p53C) to its consensus DNA is highly cooperative, and analytical ultracentrifugation results show that p53C binds to DNA as a 4:1 complex. Structural and dynamic properties of the p53-DNA binding site determine the overall affinity and stability of the complex, and the presence of the tetramerization domain makes the binding cooperative and stable. Studies have shown that the p53-DNA complex involves not only p53-DNA interactions but also protein-protein interactions between the dimers mediated by the tetramerization domain. Comparison of the KDs of p53C and p53CT for specific DNA reveals that p53 binds to DNA with three orders of magnitude higher affinity in the presence of the tetramerization domain. Also, studies of a mutant p53, L334A, in which the tetramerization domain is impaired, show that the tetramerization domain assists and stabilizes binding but that the DNA binding domain itself dictates the specificity. Therefore we decided to include the tetramerization domain in the constructs that we use for screening. Residues 312 to 360, which follow the core domain, were amplified from the full-length (FL) p53 plasmid pCMV-Neo-Bam-p53 (Addgene) and ligated between the BsrGI and BsaI sites of p53-QM and p53-HM to generate QM-tet and HM-tet. Initial screening results show that both QM-tet and HM-tet bind DNA in the screen and lead to cells of low fluorescence. The dynamic range of the screen in the presence of the tetramerization domain needs to be determined by further studies. We will construct the hotspot mutants in the context of QM-tet and HM-tet and analyze the cellular fluorescence at different concentrations of arabinose to determine the working range of the screen. In addition to


allowing us to find weak binders, the presence of the tetramerization domain might aid in

the determination of the KD values using fluorescence anisotropy experiments.
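To put the three orders of magnitude difference in KD between p53C and p53CT in energetic terms, the fold change can be converted to a binding free-energy difference with ΔΔG = RT ln(ratio). The short Python sketch below does this conversion. The temperature of 298.15 K and the function name are assumptions made for illustration and are not conditions reported in this work.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal mol^-1 K^-1


def binding_ddg(kd_ratio, temp_kelvin=298.15):
    """Free-energy difference (kcal mol^-1) for a given fold change in KD."""
    return R_KCAL * temp_kelvin * math.log(kd_ratio)


# A 1000-fold difference in KD, as quoted for p53C versus p53CT.
print(f"ddG = {binding_ddg(1000.0):.1f} kcal/mol")
```

Under these assumptions, a 1000-fold change in KD corresponds to roughly 4.1 kcal/mol of additional binding free energy.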

6.5 Materials and Methods

6.5.1 Construction of SICLOPPS Library

The library was constructed by PCR amplification from the pET28 Npu INIC vector using primers encoding SpeI and AatII sites, 5’ AATAATAATACTAGTTAATCGCCGCGACAATTTGC 3’ and 5’ ATTATTATTGACGTCTCAGTGGTGGTGGTGGTGGTG 3’. The amplified products were digested with the respective enzymes and ligated, using T4 DNA ligase, into the pGFPuv-BD1 vector previously digested with AatII and SpeI. The library was transformed by electroporation into freshly prepared DH10B competent cells and recovered for 1 h in 25 mL of 2YT media. The transformation efficiency and library size were estimated from the colonies obtained by plating the recovery. The recovery medium

was then diluted into 1 L of 2YT supplemented with 0.1 mg mL⁻¹ of ampicillin and grown

to saturation. The saturated cultures were resuspended in ~15% glycerol, flash frozen and stored at -80 °C for future use. To clone the library into pEC, it was amplified from pET28 Npu INIC using primers encoding HindIII and BssHII sites, 5’ AATAATAATAAGCTTTAATCGCCGCGACAATTTGC 3’ and 5’ AATAATAATGCGCGCTTATCAGTGGTGGTGGTGGTGGTG 3’, and ligated into pEC-CAM between the HindIII and BssHII sites using T4 DNA ligase. The library size was estimated and the cells containing the libraries were stored as described above.
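A quick way to check primers such as those listed in this section is to confirm that each one carries the recognition sequence of its intended restriction enzyme. The Python sketch below does this for the four library primers. The labels in `PRIMERS` (forward or reverse, target vector) are illustrative assumptions, and the second pEC site is taken to be BssHII (GCGCGC), which is an assumption based on the primer sequence.

```python
# Recognition sequences (5'->3') of the enzymes used for library cloning.
SITES = {
    "SpeI": "ACTAGT",
    "AatII": "GACGTC",
    "HindIII": "AAGCTT",
    "BssHII": "GCGCGC",  # assumed identity of the second pEC site
}

# Primer sequences from the text; the dictionary keys are hypothetical labels.
PRIMERS = {
    "pGFPuv-BD1 forward": ("AATAATAATACTAGTTAATCGCCGCGACAATTTGC", "SpeI"),
    "pGFPuv-BD1 reverse": ("ATTATTATTGACGTCTCAGTGGTGGTGGTGGTGGTG", "AatII"),
    "pEC forward": ("AATAATAATAAGCTTTAATCGCCGCGACAATTTGC", "HindIII"),
    "pEC reverse": ("AATAATAATGCGCGCTTATCAGTGGTGGTGGTGGTGGTG", "BssHII"),
}


def site_position(primer, enzyme):
    """Return the 0-based index of the enzyme's site in the primer, or -1."""
    cleaned = primer.upper().replace(" ", "")
    return cleaned.find(SITES[enzyme])


for label, (sequence, enzyme) in PRIMERS.items():
    index = site_position(sequence, enzyme)
    status = f"found at index {index}" if index >= 0 else "NOT FOUND"
    print(f"{label}: {enzyme} site {status}")
```

In each primer the site follows a nine-base 5’ overhang, which gives the enzyme flanking sequence to bind when digesting the end of a PCR product.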


6.5.2 Re-engineering the pEC Vector

The pEC vector was generously provided by Dr. Huizhan Zhang, State Key Laboratory of Bioreactor Engineering, China. The original pEC vector, which has the pEIB1 origin, carries a kanamycin resistance gene and an arabinose promoter. The chloramphenicol (CAM) resistance gene, including the lac promoter, was amplified by PCR from the pSup MjTyrRS-6TRN vector using primers encoding NcoI and BglII sites at the 5’ and 3’ ends, respectively: 5’ ATTATTATTCCATGGCATCTCGAGCAGCTCAGGGTC 3’ and 5’ AATAATAATAGATCTGCCAGTATACACTCCGCTAG 3’. This was ligated into the pEC vector between the NcoI and BglII sites, effectively replacing the kanamycin resistance gene. The chloramphenicol gene insert was subjected to partial digestion with NcoI because the gene contains two NcoI sites, including the terminal site. The digested fragments were run on a 1% agarose gel and the fragment of the correct size was purified away from the truncated CAM gene by excision from the gel. Since SICLOPPS is controlled by a T7 promoter, the library was cloned between the HindIII and BssHII sites, replacing the arabinose promoter.
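The reason for a partial NcoI digest can be seen by listing the fragments that each combination of cuts would produce. The Python sketch below enumerates them for a linear PCR product. The product length and the two site coordinates are hypothetical values chosen for illustration, since the actual positions are not given in the text, and the BglII cut at the other end is not modeled.

```python
from itertools import combinations


def digest_products(length, cut_positions):
    """Fragment lengths produced when a linear DNA is cut at every position."""
    edges = [0] + sorted(cut_positions) + [length]
    return [end - start for start, end in zip(edges, edges[1:])]


def partial_digest_products(length, cut_positions):
    """Map each non-empty subset of cut sites to the fragments it produces."""
    results = {}
    for count in range(1, len(cut_positions) + 1):
        for subset in combinations(sorted(cut_positions), count):
            results[subset] = digest_products(length, subset)
    return results


# Hypothetical coordinates: a 1000 bp product with the terminal NcoI site
# at position 12 and an internal NcoI site at position 600.
PRODUCT_LENGTH = 1000
NCOI_SITES = [12, 600]

for sites, fragments in partial_digest_products(PRODUCT_LENGTH, NCOI_SITES).items():
    print(f"cut at {sites}: fragments {fragments} bp")
```

With these assumed coordinates, only the product cut at the terminal site alone keeps the gene intact (988 bp), while any cut at the internal site yields truncated fragments. This is why the band of the correct size has to be separated from the shorter products on the gel.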

6.5.3 Screening in BL21 (DE3) Cells

The library was transformed into electrocompetent pACBADp53-WT cells and grown on LB media containing Kan, Amp, 0.0075% arabinose (v/v) and 0.1 mM IPTG for 48 hours at 30 °C. The plates were examined under UV light (365 nm) for fluorescence.


Chapter 7: Additional Materials and Methods

7.1 Materials

Molecular biology reagents such as DNA polymerases (Herculase and Phusion polymerases), restriction enzymes and T4 DNA ligase were purchased from New England Biolabs Inc (NEB). Pfu Ultra II was purchased from Stratagene. Kanamycin, ampicillin, dithiothreitol (DTT), IPTG (isopropyl-β-D-thiogalactopyranoside), and L-(+)-arabinose were purchased from Research Products International. Deoxyribonucleotide mixtures (dNTPs) or individual deoxyribonucleotides (dNTP) were purchased from American Bioanalytical at 100 mM concentrations and mixed in equimolar amounts to produce 10 mM stocks of dNTPs. Stock solutions were usually made at 1000X (ampicillin at 100 mg mL⁻¹ in dd water, kanamycin at 35 mg mL⁻¹ in dd water, chloramphenicol at 30 mg mL⁻¹ in ethanol, DTT at 1 M in dd water, IPTG at 1 M in dd water, arabinose at 20% in dd water) and sterilized by filtration using 0.2 μm syringe filters from Millipore. Oligonucleotides for PCR were purchased from Sigma Genosys (Sigma-Aldrich Life Science) for cloning point mutants and from Integrated DNA Technologies (IDT) for degenerate library oligos, and were resuspended in dd water to make a 100 μM stock. Reagents to make media such as Bacto Tryptone, Bacto Yeast extract and Bacto Agar were purchased from BD (Becton Dickinson & Company). Disodium


monohydrogen phosphate, monosodium dihydrogen phosphate, Tris base, Tris-HCl and sodium chloride were purchased from Fisher Scientific. ZnCl2 was purchased from Sigma-Aldrich and made into 10 mM stock solutions in dd water. DNase from Roche and RNase A from either United States Biochemical Inc. or Roche were used for protein purification steps. A 10-225 kD ladder from USB was used to verify the size of the purified proteins and to quantitate them. Phenol-chloroform-isoamyl alcohol (25:24:1) and phenol-chloroform (24:1) reagents were purchased from USB Corporation. Triton X-100 was purchased from Sigma-Aldrich. Water for all molecular biology experiments was purified using a Barnstead NANOpure Diamond system to 18 MΩ•cm and autoclaved. Solutions and buffers were typically filter sterilized with a 0.22 μm Millex-GV PVDF syringe-driven filter unit (from Millipore) or autoclaved.
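The stock concentrations listed above translate into working concentrations through the usual dilution relation C1·V1 = C2·V2. The Python sketch below computes the stock volume needed for a given final volume. The function name and the 1 L culture volume are assumptions for illustration, and the target concentrations in the examples are taken from values quoted elsewhere in this document (0.1 mM IPTG, 0.0075% arabinose).

```python
def stock_volume_ml(stock_conc, target_conc, final_volume_ml):
    """Volume of stock (mL) needed to reach target_conc in final_volume_ml.

    stock_conc and target_conc must be expressed in the same unit.
    """
    if target_conc > stock_conc:
        raise ValueError("target concentration exceeds stock concentration")
    return target_conc / stock_conc * final_volume_ml


FINAL_VOLUME_ML = 1000.0  # assumed 1 L culture

# Ampicillin: 100 mg/mL stock used at 1000X dilution, i.e. 0.1 mg/mL final.
amp_ml = stock_volume_ml(100.0, 0.1, FINAL_VOLUME_ML)

# IPTG: 1 M stock (1000 mM) diluted to 0.1 mM.
iptg_ml = stock_volume_ml(1000.0, 0.1, FINAL_VOLUME_ML)

# Arabinose: 20% stock diluted to 0.0075%.
ara_ml = stock_volume_ml(20.0, 0.0075, FINAL_VOLUME_ML)

print(f"ampicillin: {amp_ml:.3f} mL, IPTG: {iptg_ml:.3f} mL, arabinose: {ara_ml:.3f} mL")
```

Note that the inducer concentrations used in the screens are lower than a 1000X dilution of the stocks would give, so a general dilution function is more useful here than a fixed 1:1000 rule.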

7.2 Methods

The protocols followed for cloning, such as PCR amplification, digests, DNA purifications and protein purifications, are based on Current Protocols in Molecular Biology (2002) by John Wiley and Sons, Inc. The following sections give some of the details for the experiments specific to this project.

7.2.1 Molecular Cloning

Genes of interest were either amplified from plasmids or constructed by reassembly and further amplified, and then digested with the respective enzymes. These were cleaned up and ligated into the target vector, which had previously been digested with the respective enzymes. The ligations were transformed into the cell line of choice, individual colonies were analyzed by restriction digests, and the constructs were confirmed by sequencing. All sequencing reactions were done by Genewiz Inc.

7.2.1.1 PCR amplification

The required product was amplified by PCR from the template, either a plasmid encoding the gene of interest or a product of the reassembly of synthetic oligonucleotides, using primers encoding restriction endonuclease sites at the terminal ends. PCR was typically set up as a 50 µL reaction using Pfu polymerase (purified in the Magliery Lab) or Phusion polymerase (Stratagene) in the relevant buffers and 10 mM each of the dNTPs (deoxynucleotide triphosphates) over 20-25 cycles. The annealing and extension times of the reactions were set according to the manufacturer’s instructions for each of the polymerases. Overlap extension reactions were set up in a similar fashion but without primers for the first ten cycles. This facilitated the formation of the correct initial template before the amplification step in the presence of the primers over 20 cycles. For the construction of the library insert using thermally balanced inside-out (TBIO) PCR, the conditions proposed by Gao et al. were followed.151
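The primerless cycles of an overlap extension reaction can be pictured as two fragments annealing through their shared end and being extended into one template. The Python sketch below models only that joining step on plain strings. It treats both fragments as top-strand sequences and ignores strand complementarity and polymerase behavior; the fragment sequences, the function name and the minimum overlap length are hypothetical.

```python
def overlap_extend(upstream, downstream, min_overlap=15):
    """Join two fragments that share an identical overlap at their junction.

    The 3' end of `upstream` must match the 5' end of `downstream` over at
    least `min_overlap` bases; the longest such match is used.
    """
    upstream, downstream = upstream.upper(), downstream.upper()
    longest = min(len(upstream), len(downstream))
    for size in range(longest, min_overlap - 1, -1):
        if upstream.endswith(downstream[:size]):
            return upstream + downstream[size:]
    raise ValueError("fragments do not share a sufficient overlap")


# Hypothetical fragments sharing a 15-base overlap that would carry a mutation.
OVERLAP = "GACCTTGCAGGATCC"
FRAGMENT_A = "ATGACCGGTAAGCT" + OVERLAP
FRAGMENT_B = OVERLAP + "TTAACGGTCATGA"

assembled = overlap_extend(FRAGMENT_A, FRAGMENT_B)
print(assembled)
print(len(assembled), "bases")
```

Any original template carried over into the second stage is amplified by the same outer primers, which is the source of the wild-type background that the restriction-site cloning strategy described in this section was designed to avoid.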

The three literature-reported binding sequences for p53 were cloned between AlwNI and HindIII. An NheI site was engineered into the vector and was later used in combination with HindIII to construct the binding domain library. Some of the core domain variants of p53 were cloned between NdeI and EcoRI. To make the cloning efficient, a BsaI site was engineered so that it could be used as the 3’ cloning site. In addition, to avoid overlap PCR for construction of the library, a BsrGI site was engineered by making silent mutations in the p53 gene. Overlap PCR is bound to carry a small amount of background product from the original template, which complicates screening, since even a small amount of WT contamination in the library will manifest as a higher rate of occurrence of positives. Therefore the libraries ITAA, IT and AA were cloned between the BsrGI and BsaI sites, eliminating overlap PCR. The detailed cloning scheme for the various constructs is described in the respective chapters.

7.2.1.2 Clean Up and Digestion

PCR products were subjected to a clean-up process before setting up digests using restriction endonucleases. Products smaller than 500 bp were typically cleaned up by phenol-chloroform extraction followed by precipitation of the DNA using 100% ethanol. The precipitated DNA was then resuspended in elution buffer (EB) to set up restriction digests. PCR-amplified products larger than 500 bp were cleaned up using the PB1 buffer from the Qiagen kit according to the instructions on the kit. The cleaned-up products resuspended in EB were then digested using the required restriction enzymes. The restriction enzymes were purchased from New England Biolabs (NEB) and the reactions were set up in the recommended buffers. Digestion reactions were typically carried out with ~250 ng of insert DNA (PCR product) for 3 to 4 hours. About 500 ng of DNA was digested for the vector. The digested material was cleaned up either by gel extraction


method or by membrane capture of the DNA. The manufacturer’s instructions for the QIAquick gel extraction protocol were followed for fragments larger than 500 bp, and fragments smaller than 500 bp were purified using membrane capture of the DNA. Typically, DNA was run on a 1% agarose gel until the fragments were well separated from each other. To capture the DNA in a membrane, a GF/F glass microfiber filter (Whatman) and a dialysis membrane were cut to match the size of the well in which the DNA was run. Using a surgical knife, the gel was slit horizontally as close to the DNA as possible on the side of the positive electrode. The filter paper-membrane combination was inserted into the slit, with the filter paper facing the DNA. The DNA runs into the filter paper but cannot pass through the dialysis membrane and is therefore sandwiched between the filter paper and the membrane. This DNA was recovered by placing the filter paper-membrane combination into a fritted column and collecting the DNA using elution buffer. The DNA thus collected was precipitated using 100% ethanol and resuspended in water.

7.2.2 Ligations and Transformations

The concentrations of the vector and insert DNA were typically ascertained by gel electrophoresis and comparison to the Lambda BstEII digest ladder (NEB). For single-mutant cloning reactions, a total DNA concentration of 15-20 ng µL⁻¹, including the vector and the insert, was used. A ratio of 1:1 was maintained for the concentration of the vector versus the insert. For ligating a library of variants, a total of 0.5 µg of DNA was used. The ratio of the vector to the insert and the concentration were maintained as in the case of single variants.


For transforming the libraries, fresh electrocompetent cells of the chosen strain were made and the transformation efficiency of the cells was tested with a known amount of DNA using the metric,

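\[
\mathrm{TE}\ (\mathrm{cfu}\ \mu\mathrm{g}^{-1}) = \frac{(\text{number of colonies})\,(\text{volume of the recovery in } \mu\mathrm{L})}{(\text{DNA transformed in } \mu\mathrm{g})\,(\text{volume plated in } \mu\mathrm{L})}
\]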

where TE is the transformation efficiency in colony-forming units (cfu) per µg of DNA transformed. Typically, 1 ng of DNA was transformed into 40 µL of competent cells and quenched with 1 mL of 2YT media. Following 1 h of recovery at 37 °C, 1-10 µL was plated. The colonies resulting after incubation at 37 °C for 12-18 h were counted.
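As a worked example of the transformation efficiency calculation, the Python sketch below assumes the metric is the colony count scaled by the fraction of the recovery that was plated and divided by the mass of DNA transformed. The colony count is hypothetical, and the recovery volume is taken as the 1 mL of quenching medium, ignoring the 40 µL of cells.

```python
def transformation_efficiency(colonies, recovery_volume_ul, dna_ug, plated_volume_ul):
    """Transformation efficiency in cfu per microgram of DNA.

    TE = colonies * recovery volume / (DNA mass * volume plated)
    """
    if dna_ug <= 0 or plated_volume_ul <= 0:
        raise ValueError("DNA mass and plated volume must be positive")
    return colonies * recovery_volume_ul / (dna_ug * plated_volume_ul)


# Typical set-up from the text: 1 ng of DNA, 1 mL recovery, 1 uL plated.
# The count of 200 colonies is a hypothetical example value.
te = transformation_efficiency(
    colonies=200,
    recovery_volume_ul=1000.0,
    dna_ug=0.001,
    plated_volume_ul=1.0,
)
print(f"TE = {te:.1e} cfu per ug")
```

With these example numbers the cells would have a TE of about 2 × 10⁸ cfu per µg of DNA.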

The library ligation was cleaned up using the PCR clean-up kit and transformed into electrocompetent cells with a TE of 10⁸ or higher. This was typically quenched in 20 mL of media and various amounts were plated to estimate the size of the library. The recovery was also inoculated into 1 L of media and grown for 12-18 h in the presence of the appropriate antibiotics. This culture was used to prepare a miniprep of the library and to make glycerol stocks.
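The library size estimate from plated aliquots can be sketched in the same way: each plating is scaled up to the full recovery volume and the estimates are averaged. In the Python sketch below the function name, the plated volumes and the colony counts are hypothetical; only the 20 mL recovery volume comes from the text.

```python
def estimate_library_size(platings, recovery_volume_ml):
    """Estimate the number of transformants in the whole recovery.

    `platings` is a list of (plated_volume_ul, colonies) pairs. Each pair is
    scaled to the full recovery volume and the estimates are averaged.
    """
    if not platings:
        raise ValueError("at least one plating is required")
    recovery_ul = recovery_volume_ml * 1000.0
    estimates = [colonies * recovery_ul / volume_ul for volume_ul, colonies in platings]
    return sum(estimates) / len(estimates)


# Hypothetical counts from two plated volumes of a 20 mL recovery.
PLATINGS = [(1.0, 150), (10.0, 1400)]
size = estimate_library_size(PLATINGS, recovery_volume_ml=20.0)
print(f"estimated library size: {size:.1e} transformants")
```

Plating more than one volume gives a check on the count, since very dense plates tend to be undercounted.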


7.3 Protein Expression and Purification

The general protocol for purification of the p53 variants is described in Chapter 3. Care was taken to maintain a temperature of 4 °C for all the steps of the purification. The cells were mixed after each pulse during sonication. This helped in efficient heat dissipation and also improved the lysis. It was found that using the large pre-fritted column following the Ni-binding step affected the yield of the protein. Therefore the small pre-fritted column was used for this purpose. The TEV reaction was carried out at room temperature (RT) for 12-16 h. The 6×His-tagged TEV protease was expressed from BL21 (DE3) cells containing the pRK793 plasmid in 2YT media containing ampicillin at 37 °C until the OD600 was between 0.6 and 1. Following this, the protein was expressed for 6 h at 30 °C using 0.1 mM IPTG and the cells were harvested by centrifugation at 7000 rpm. The TEV protease was purified using a Ni-NTA agarose column as in the case of the p53 variants.

7.4 Biophysical Characterization

The various procedures used for characterization (urea-mediated denaturation monitored using fluorescence, thermal denaturation monitored using CD, and sequence-specific DNA binding using fluorescence anisotropy) are described in Chapter 2.


References

1. Sarkar, M. (2009). Engineering Proteins with GFP: Study of Protein-Protein Interactions in vivo, Protein Expression and Solubility, PhD Dissertation, The Ohio State University. 2. Carter, P. J. Introduction to current and future protein therapeutics: a protein engineering perspective. Exp Cell Res 317, 1261-9. 3. Yon, J. M. (1997). Protein folding: concepts and perspectives. Cell Mol Life Sci 53, 557-67. 4. Tokuriki, N. & Tawfik, D. S. (2009). Stability effects of mutations and protein evolvability. Curr Opin Struct Biol 19, 596-604. 5. Anfinsen, C. B. (1973). Principles that govern the folding of protein chains. Science 181, 223-30. 6. Dill, K. A., Ozkan, S. B., Shell, M. S. & Weikl, T. R. (2008). The protein folding problem. Annu Rev Biophys 37, 289-316. 7. Kubelka, J., Hofrichter, J. & Eaton, W. A. (2004). The protein folding 'speed limit'. Curr Opin Struct Biol 14, 76-88. 8. Dill, K. A. & Chan, H. S. (1997). From Levinthal to pathways to funnels. Nat Struct Biol 4, 10-9. 9. Magliery, T. J., Lavinder, J. J. & Sullivan, B. J. Protein stability by number: high- throughput and statistical approaches to one of protein science's most difficult problems. Curr Opin Chem Biol 15, 443-51. 10. Pabo, C. (1983). Molecular technology. Designing proteins and peptides. Nature 301, 200. 11. Pokala, N. & Handel, T. M. (2001). Review: protein design--where we were, where we are, where we're going. J Struct Biol 134, 269-81. 12. Fetrow, J. S., Giammona, A., Kolinski, A. & Skolnick, J. (2002). The protein folding problem: a biophysical enigma. Curr Pharm Biotechnol 3, 329-47. 13. Skolnick, J. (2006). In quest of an empirical potential for protein structure prediction. Curr Opin Struct Biol 16, 166-71. 14. Pandit, S. B., Zhang, Y. & Skolnick, J. (2006). TASSER-Lite: an automated tool for protein comparative modeling. Biophys J 91, 4180-90. 15. Bradley, P., Misura, K. M. & Baker, D. (2005). Toward high-resolution de novo structure prediction for small proteins. 
Science 309, 1868-71.


16. Bradley, P., Malmstrom, L., Qian, B., Schonbrun, J., Chivian, D., Kim, D. E., Meiler, J., Misura, K. M. & Baker, D. (2005). Free modeling with Rosetta in CASP6. Proteins 61 Suppl 7, 128-34. 17. Ventura, S. & Serrano, L. (2004). Designing proteins from the inside out. Proteins 56, 1-10. 18. Fersht, A. R. (2008). From the first protein structures to our current knowledge of protein folding: delights and scepticisms. Nat Rev Mol Cell Biol 9, 650-4. 19. Bommarius, A. S., Broering, J. M., Chaparro-Riggers, J. F. & Polizzi, K. M. (2006). High-throughput screening for enhanced protein stability. Curr Opin Biotechnol 17, 606-10. 20. Khan, S. H., Ahmad, F., Ahmad, N., Flynn, D. C. & Kumar, R. Protein-protein interactions: principles, techniques, and their potential role in new drug development. J Biomol Struct Dyn 28, 929-38. 21. Dahiyat, B. I. & Mayo, S. L. (1997). De novo protein design: fully automated sequence selection. Science 278, 82-7. 22. Kuhlman, B., Dantas, G., Ireton, G. C., Varani, G., Stoddard, B. L. & Baker, D. (2003). Design of a novel globular protein fold with atomic-level accuracy. Science 302, 1364-8. 23. Smith, B. A. & Hecht, M. H. Novel proteins: from fold to function. Curr Opin Chem Biol 15, 421-6. 24. Magliery, T. J. & Regan, L. (2004). A cell-based screen for function of the four- helix bundle protein Rop: a new tool for combinatorial experiments in biophysics. Protein Eng Des Sel 17, 77-83. 25. Lavinder, J. J., Hari, S. B., Sullivan, B. J. & Magliery, T. J. (2009). High- throughput thermal scanning: a general, rapid dye-binding thermal shift screen for protein engineering. J Am Chem Soc 131, 3794-5. 26. Magliery, T. J. & Regan, L. (2004). Combinatorial approaches to protein stability and structure. Eur J Biochem 271, 1595-608. 27. Smith, C. K., Withka, J. M. & Regan, L. (1994). A thermodynamic scale for the beta-sheet forming tendencies of the amino acids. Biochemistry 33, 5510-7. 28. Chou, P. Y. & Fasman, G. D. (1978). 
Empirical predictions of protein conformation. Annu Rev Biochem 47, 251-76. 29. Smith, C. K. & Regan, L. (1995). Guidelines for protein design: the energetics of beta sheet side chain interactions. Science 270, 980-2. 30. Merkel, J. S., Sturtevant, J. M. & Regan, L. (1999). Sidechain interactions in parallel beta sheets: the energetics of cross-strand pairings. Structure 7, 1333-43. 31. Gronenborn, A. M., Frank, M. K. & Clore, G. M. (1996). Core mutants of the immunoglobulin binding domain of streptococcal protein G: stability and structural integrity. FEBS Lett 398, 312-6. 32. Ozaki, T. & Nakagawara, A. p53: the attractive tumor suppressor in the cancer research field. J Biomed Biotechnol 2011, 603925. 33. Wang, H. & Liu, R. Advantages of mRNA display selections over other selection techniques for investigation of protein-protein interactions. Expert Rev Proteomics 8, 335-46.

34. Kotz, J. D., Bond, C. J. & Cochran, A. G. (2004). Phage-display as a tool for quantifying protein stability determinants. Eur J Biochem 271, 1623-9. 35. Shusta, E. V., Kieke, M. C., Parke, E., Kranz, D. M. & Wittrup, K. D. (1999). Yeast polypeptide fusion surface display levels predict thermal stability and soluble secretion efficiency. J Mol Biol 292, 949-56. 36. Boder, E. T. & Wittrup, K. D. (1997). Yeast surface display for screening combinatorial polypeptide libraries. Nat Biotechnol 15, 553-7. 37. Boder, E. T. & Wittrup, K. D. (2000). Yeast surface display for directed evolution of protein expression, affinity, and stability. Methods Enzymol 328, 430-44. 38. Park, S., Xu, Y., Stowell, X. F., Gai, F., Saven, J. G. & Boder, E. T. (2006). Limitations of yeast surface display in engineering proteins of high thermostability. Protein Eng Des Sel 19, 211-7. 39. Pepper, L. R., Cho, Y. K., Boder, E. T. & Shusta, E. V. (2008). A decade of yeast surface display technology: where are we now? Comb Chem High Throughput Screen 11, 127-34. 40. Brachmann, R. K. & Boeke, J. D. (1997). Tag games in yeast: the two-hybrid system and beyond. Curr Opin Biotechnol 8, 561-8. 41. Hu, J. C. (2001). Model systems: Studying molecular recognition using bacterial n-hybrid systems. Trends Microbiol 9, 219-22. 42. Hu, J. C., Kornacker, M. G. & Hochschild, A. (2000). Escherichia coli one- and two-hybrid systems for the analysis and identification of protein-protein interactions. Methods 20, 80-94. 43. Waldo, G. S., Standish, B. M., Berendzen, J. & Terwilliger, T. C. (1999). Rapid protein-folding assay using green fluorescent protein. Nat Biotechnol 17, 691-5. 44. Maxwell, K. L., Mittermaier, A. K., Forman-Kay, J. D. & Davidson, A. R. (1999). A simple in vivo assay for increased protein solubility. Protein Sci 8, 1908-11. 45. Mansell, T. J., Linderman, S. W., Fisher, A. C. & DeLisa, M. P. A rapid protein folding assay for the bacterial periplasm. Protein Sci 19, 1079-90. 46. Linzer, D. I. 
& Levine, A. J. (1979). Characterization of a 54K dalton cellular SV40 tumor antigen present in SV40-transformed cells and uninfected embryonal carcinoma cells. Cell 17, 43-52. 47. Lane, D. & Levine, A. p53 Research: the past thirty years and the next thirty years. Cold Spring Harb Perspect Biol 2, a000893. 48. Joerger, A. C. & Fersht, A. R. (2007). Structure-function-rescue: the diverse nature of common p53 cancer mutants. Oncogene 26, 2226-42. 49. Eliyahu, D., Raz, A., Gruss, P., Givol, D. & Oren, M. (1984). Participation of p53 cellular tumour antigen in transformation of normal embryonic cells. Nature 312, 646-9. 50. Eliyahu, D., Michalovitz, D. & Oren, M. (1985). Overproduction of p53 antigen makes established cells highly tumorigenic. Nature 316, 158-60. 51. Parada, L. F., Land, H., Weinberg, R. A., Wolf, D. & Rotter, V. (1984). Cooperation between gene encoding p53 tumour antigen and ras in cellular transformation. Nature 312, 649-51.


52. Parada, L. F., Land, H., Chen, A., Morganstern, J. & Weinberg, R. A. (1985). Cooperation between cellular oncogenes in the transformation of primary rat embryo fibroblasts. Prog Med Virol 32, 115-28. 53. Finlay, C. A., Hinds, P. W. & Levine, A. J. (1989). The p53 proto-oncogene can act as a suppressor of transformation. Cell 57, 1083-93. 54. Baker, S. J., Fearon, E. R., Nigro, J. M., Hamilton, S. R., Preisinger, A. C., Jessup, J. M., Vantuinen, P., Ledbetter, D. H., Barker, D. F., Nakamura, Y., White, R. & Vogelstein, B. (1989). CHROMOSOME-17 DELETIONS AND P53 GENE-MUTATIONS IN COLORECTAL CARCINOMAS. Science 244, 217- 221. 55. Donehower, L. A., Harvey, M., Slagle, B. L., McArthur, M. J., Montgomery, C. A., Jr., Butel, J. S. & Bradley, A. (1992). Mice deficient for p53 are developmentally normal but susceptible to spontaneous tumours. Nature 356, 215-21. 56. Donehower, L. A. (2009). Using mice to examine p53 functions in cancer, aging, and longevity. Cold Spring Harb Perspect Biol 1, a001081. 57. Malkin, D. (1993). p53 and the Li-Fraumeni syndrome. Cancer Genet Cytogenet 66, 83-92. 58. Sengupta, S. & Harris, C. C. (2005). p53: traffic cop at the crossroads of DNA repair and recombination. Nat Rev Mol Cell Biol 6, 44-55. 59. Malkin, D., Li, F. P., Strong, L. C., Fraumeni, J. F., Jr., Nelson, C. E., Kim, D. H., Kassel, J., Gryka, M. A., Bischoff, F. Z., Tainsky, M. A. & et al. (1990). Germ line p53 mutations in a familial syndrome of breast cancer, sarcomas, and other neoplasms. Science 250, 1233-8. 60. Beckerman, R. & Prives, C. Transcriptional regulation by p53. Cold Spring Harb Perspect Biol 2, a000935. 61. Meek, D. W. & Anderson, C. W. (2009). Posttranslational modification of p53: cooperative integrators of function. Cold Spring Harb Perspect Biol 1, a000950. 62. Cho, Y., Gorina, S., Jeffrey, P. D. & Pavletich, N. P. (1994). Crystal structure of a p53 tumor suppressor-DNA complex: understanding tumorigenic mutations. Science 265, 346-55. 63. Canadillas, J. 
M., Tidow, H., Freund, S. M., Rutherford, T. J., Ang, H. C. & Fersht, A. R. (2006). Solution structure of p53 core domain: structural basis for its instability. Proc Natl Acad Sci U S A 103, 2109-14. 64. Kitayner, M., Rozenberg, H., Kessler, N., Rabinovich, D., Shaulov, L., Haran, T. E. & Shakked, Z. (2006). Structural basis of DNA recognition by p53 tetramers. Mol Cell 22, 741-53. 65. Zhao, K., Chai, X., Johnston, K., Clements, A. & Marmorstein, R. (2001). Crystal structure of the mouse p53 core DNA-binding domain at 2.7 A resolution. J Biol Chem 276, 12120-7. 66. Rippin, T. M., Freund, S. M., Veprintsev, D. B. & Fersht, A. R. (2002). Recognition of DNA by p53 core domain and location of intermolecular contacts of cooperative binding. J Mol Biol 319, 351-8.


67. Malecka, K. A., Ho, W. C. & Marmorstein, R. (2009). Crystal structure of a p53 core tetramer bound to DNA. Oncogene 28, 325-33. 68. Ho, W. C., Fitzgerald, M. X. & Marmorstein, R. (2006). Structure of the p53 core domain dimer bound to DNA. J Biol Chem 281, 20494-502. 69. Chen, Y., Dey, R. & Chen, L. Crystal structure of the p53 core domain bound to a full consensus site as a self-assembled tetramer. Structure 18, 246-56. 70. Tidow, H., Melero, R., Mylonas, E., Freund, S. M., Grossmann, J. G., Carazo, J. M., Svergun, D. I., Valle, M. & Fersht, A. R. (2007). Quaternary structures of tumor suppressor p53 and a specific p53 DNA complex. Proc Natl Acad Sci U S A 104, 12324-9. 71. Weinberg, R. L., Veprintsev, D. B. & Fersht, A. R. (2004). Cooperative binding of tetrameric p53 to DNA. J Mol Biol 341, 1145-59. 72. Veprintsev, D. B., Freund, S. M., Andreeva, A., Rutledge, S. E., Tidow, H., Canadillas, J. M., Blair, C. M. & Fersht, A. R. (2006). Core domain interactions in full-length p53 in solution. Proc Natl Acad Sci U S A 103, 2115-9. 73. Natan, E., Hirschberg, D., Morgner, N., Robinson, C. V. & Fersht, A. R. (2009). Ultraslow oligomerization equilibria of p53 and its implications. Proc Natl Acad Sci U S A 106, 14327-32. 74. Kussie, P. H., Gorina, S., Marechal, V., Elenbaas, B., Moreau, J., Levine, A. J. & Pavletich, N. P. (1996). Structure of the MDM2 oncoprotein bound to the p53 tumor suppressor transactivation domain. Science 274, 948-53. 75. Moll, U. M. & Petrenko, O. (2003). The MDM2-p53 interaction. Mol Cancer Res 1, 1001-8. 76. Joerger, A. C., Ang, H. C., Veprintsev, D. B., Blair, C. M. & Fersht, A. R. (2005). Structures of p53 cancer mutants and mechanism of rescue by second-site suppressor mutations. J Biol Chem 280, 16030-7. 77. Suad, O., Rozenberg, H., Brosh, R., Diskin-Posner, Y., Kessler, N., Shimon, L. J., Frolow, F., Liran, A., Rotter, V. & Shakked, Z. (2009). 
Structural basis of restoring sequence-specific DNA binding and transactivation to mutant p53 by suppressor mutations. J Mol Biol 385, 249-65. 78. Bullock, A. N., Henckel, J., DeDecker, B. S., Johnson, C. M., Nikolova, P. V., Proctor, M. R., Lane, D. P. & Fersht, A. R. (1997). Thermodynamic stability of wild-type and mutant p53 core domain. Proc Natl Acad Sci U S A 94, 14338-42. 79. Nikolova, P. V., Henckel, J., Lane, D. P. & Fersht, A. R. (1998). Semirational design of active tumor suppressor p53 DNA binding domain with enhanced stability. Proc Natl Acad Sci U S A 95, 14675-80. 80. Joerger, A. C., Allen, M. D. & Fersht, A. R. (2004). Crystal structure of a superstable mutant of human p53 core domain. Insights into the mechanism of rescuing oncogenic mutations. J Biol Chem 279, 1291-6. 81. Brachmann, R. K., Yu, K., Eby, Y., Pavletich, N. P. & Boeke, J. D. (1998). Genetic selection of intragenic suppressor mutations that reverse the effect of common p53 cancer mutations. Embo J 17, 1847-59.


82. Khoo, K. H., Joerger, A. C., Freund, S. M. & Fersht, A. R. (2009). Stabilising the DNA-binding domain of p53 by rational design of its hydrophobic core. Protein Eng Des Sel 22, 421-30. 83. Petitjean, A., Mathe, E., Kato, S., Ishioka, C., Tavtigian, S. V., Hainaut, P. & Olivier, M. (2007). Impact of mutant p53 functional properties on TP53 mutation patterns and tumor phenotype: lessons from recent developments in the IARC TP53 database. Hum Mutat 28, 622-9. 84. Wong, K. B., DeDecker, B. S., Freund, S. M., Proctor, M. R., Bycroft, M. & Fersht, A. R. (1999). Hot-spot mutants of p53 core domain evince characteristic local structural changes. Proc Natl Acad Sci U S A 96, 8438-42. 85. Nikolova, P. V., Wong, K. B., DeDecker, B., Henckel, J. & Fersht, A. R. (2000). Mechanism of rescue of common p53 cancer mutations by second-site suppressor mutations. Embo J 19, 370-8. 86. Bullock, A. N., Henckel, J. & Fersht, A. R. (2000). Quantitative analysis of residual folding and DNA binding in mutant p53 core domain: definition of mutant states for rescue in cancer therapy. Oncogene 19, 1245-56. 87. Joerger, A. C., Ang, H. C. & Fersht, A. R. (2006). Structural basis for understanding oncogenic p53 mutations and designing rescue drugs. Proc Natl Acad Sci U S A 103, 15056-61. 88. Basse, N., Kaar, J. L., Settanni, G., Joerger, A. C., Rutherford, T. J. & Fersht, A. R. Toward the rational design of p53-stabilizing drugs: probing the surface of the oncogenic Y220C mutant. Chem Biol 17, 46-56. 89. Boeckler, F. M., Joerger, A. C., Jaggi, G., Rutherford, T. J., Veprintsev, D. B. & Fersht, A. R. (2008). Targeted rescue of a destabilized mutant of p53 by an in silico screened drug. Proc Natl Acad Sci U S A 105, 10360-5. 90. Khoo, K. H., Mayer, S. & Fersht, A. R. (2009). Effects of stability on the biological function of p53. J Biol Chem 284, 30974-80. 91. Weinberg, R. L., Veprintsev, D. B., Bycroft, M. & Fersht, A. R. (2005). 
Comparative binding of p53 to its promoter and DNA recognition elements. J Mol Biol 348, 589-96. 92. Hupp, T. R., Meek, D. W., Midgley, C. A. & Lane, D. P. (1992). Regulation of the specific DNA binding function of p53. Cell 71, 875-86. 93. Friedler, A., Veprintsev, D. B., Freund, S. M., von Glos, K. I. & Fersht, A. R. (2005). Modulation of binding of DNA to the C-terminal domain of p53 by acetylation. Structure 13, 629-36. 94. Ayed, A., Mulder, F. A., Yi, G. S., Lu, Y., Kay, L. E. & Arrowsmith, C. H. (2001). Latent and active p53 are identical in conformation. Nat Struct Biol 8, 756-60. 95. Sauer, M., Bretz, A. C., Beinoraviciute-Kellner, R., Beitzinger, M., Burek, C., Rosenwald, A., Harms, G. S. & Stiewe, T. (2008). C-terminal diversity within the p53 family accounts for differences in DNA binding and transcriptional activity. Nucleic Acids Res 36, 1900-12.

96. Tafvizi, A., Huang, F., Fersht, A. R., Mirny, L. A. & van Oijen, A. M. A single-molecule characterization of p53 search on DNA. Proc Natl Acad Sci U S A 108, 563-8.
97. Khoo, K. H., Andreeva, A. & Fersht, A. R. (2009). Adaptive evolution of p53 thermodynamic stability. J Mol Biol 393, 161-75.
98. Joerger, A. C., Rajagopalan, S., Natan, E., Veprintsev, D. B., Robinson, C. V. & Fersht, A. R. (2009). Structural evolution of p53, p63, and p73: implication for heterotetramer formation. Proc Natl Acad Sci U S A 106, 17705-10.
99. Kern, S. E., Kinzler, K. W., Baker, S. J., Nigro, J. M., Rotter, V., Levine, A. J., Friedman, P., Prives, C. & Vogelstein, B. (1991). Mutant p53 proteins bind DNA abnormally in vitro. Oncogene 6, 131-6.
100. Kern, S. E., Kinzler, K. W., Bruskin, A., Jarosz, D., Friedman, P., Prives, C. & Vogelstein, B. (1991). Identification of p53 as a sequence-specific DNA-binding protein. Science 252, 1708-11.
101. el-Deiry, W. S., Kern, S. E., Pietenpol, J. A., Kinzler, K. W. & Vogelstein, B. (1992). Definition of a consensus binding site for p53. Nat Genet 1, 45-9.
102. Kim, S. H. & Dass, C. R. p53-targeted cancer pharmacotherapy: move towards small molecule compounds. J Pharm Pharmacol 63, 603-10.
103. Funk, W. D., Pak, D. T., Karas, R. H., Wright, W. E. & Shay, J. W. (1992). A transcriptionally active DNA-binding site for human p53 protein complexes. Mol Cell Biol 12, 2866-71.
104. Tokino, T., Thiagalingam, S., el-Deiry, W. S., Waldman, T., Kinzler, K. W. & Vogelstein, B. (1994). p53 tagged sites from human genomic DNA. Hum Mol Genet 3, 1537-42.
105. Hearnes, J. M., Mays, D. J., Schavolt, K. L., Tang, L., Jiang, X. & Pietenpol, J. A. (2005). Chromatin immunoprecipitation-based screen to identify functional genomic binding sites for sequence-specific transactivators. Mol Cell Biol 25, 10148-58.
106. Di Cintio, A., Di Gennaro, E. & Budillon, A. Restoring p53 function in cancer: novel therapeutic approaches for applying the brakes to tumorigenesis. Recent Pat Anticancer Drug Discov 5, 1-13.
107. Bykov, V. J. & Wiman, K. G. (2003). Novel cancer therapy by reactivation of the p53 apoptosis pathway. Ann Med 35, 458-65.
108. Millard, M., Pathania, D., Grande, F., Xu, S. & Neamati, N. Small-molecule inhibitors of p53-MDM2 interaction: the 2006-2010 update. Curr Pharm Des 17, 536-59.
109. Mandinova, A. & Lee, S. W. The p53 pathway as a target in cancer therapeutics: obstacles and promise. Sci Transl Med 3, 64rv1.
110. Shangary, S. & Wang, S. (2009). Small-molecule inhibitors of the MDM2-p53 protein-protein interaction to reactivate p53 function: a novel approach for cancer therapy. Annu Rev Pharmacol Toxicol 49, 223-41.
111. Yu, S., Qin, D., Shangary, S., Chen, J., Wang, G., Ding, K., McEachern, D., Qiu, S., Nikolovska-Coleska, Z., Miller, R., Kang, S., Yang, D. & Wang, S. (2009). Potent and orally active small-molecule inhibitors of the MDM2-p53 interaction. J Med Chem 52, 7970-3.
112. Vassilev, L. T., Vu, B. T., Graves, B., Carvajal, D., Podlaski, F., Filipovic, Z., Kong, N., Kammlott, U., Lukacs, C., Klein, C., Fotouhi, N. & Liu, E. A. (2004). In vivo activation of the p53 pathway by small-molecule antagonists of MDM2. Science 303, 844-8.
113. Zhao, C. Y., Grinkevich, V. V., Nikulenkov, F., Bao, W. & Selivanova, G. Rescue of the apoptotic-inducing function of mutant p53 by small molecule RITA. Cell Cycle 9, 1847-55.
114. Friedler, A., Hansson, L. O., Veprintsev, D. B., Freund, S. M., Rippin, T. M., Nikolova, P. V., Proctor, M. R., Rudiger, S. & Fersht, A. R. (2002). A peptide that binds and stabilizes p53 core domain: chaperone strategy for rescue of oncogenic mutants. Proc Natl Acad Sci U S A 99, 937-42.
115. Selivanova, G. & Fersht, A. (2004). Rescue of the p53 tumor suppressor by a rationally designed molecule. Discov Med 4, 28-30.
116. Friedler, A., DeDecker, B. S., Freund, S. M., Blair, C., Rudiger, S. & Fersht, A. R. (2004). Structural distortion of p53 by the mutation R249S and its rescue by a designed peptide: implications for "mutant conformation". J Mol Biol 336, 187-96.
117. Issaeva, N., Friedler, A., Bozko, P., Wiman, K. G., Fersht, A. R. & Selivanova, G. (2003). Rescue of mutants of the tumor suppressor p53 in cancer cells by a designed peptide. Proc Natl Acad Sci U S A 100, 13303-7.
118. Friedler, A., Veprintsev, D. B., Hansson, L. O. & Fersht, A. R. (2003). Kinetic instability of p53 core domain mutants: implications for rescue by small molecules. J Biol Chem 278, 24108-12.
119. Friedler, A., Veprintsev, D. B., Rutherford, T., von Glos, K. I. & Fersht, A. R. (2005). Binding of Rad51 and other peptide sequences to a promiscuous, highly electrostatic binding site in p53. J Biol Chem 280, 8051-9.
120. Bykov, V. J., Selivanova, G. & Wiman, K. G. (2003). Small molecules that reactivate mutant p53. Eur J Cancer 39, 1828-34.
121. Bykov, V. J., Issaeva, N., Shilov, A., Hultcrantz, M., Pugacheva, E., Chumakov, P., Bergman, J., Wiman, K. G. & Selivanova, G. (2002). Restoration of the tumor suppressor function to mutant p53 by a low-molecular-weight compound. Nat Med 8, 282-8.
122. Bykov, V. J., Zache, N., Stridh, H., Westman, J., Bergman, J., Selivanova, G. & Wiman, K. G. (2005). PRIMA-1(MET) synergizes with cisplatin to induce tumor cell apoptosis. Oncogene 24, 3484-91.
123. Zache, N., Lambert, J. M., Wiman, K. G. & Bykov, V. J. (2008). PRIMA-1MET inhibits growth of mouse tumors carrying mutant p53. Cell Oncol 30, 411-8.
124. Lambert, J. M., Gorzov, P., Veprintsev, D. B., Soderqvist, M., Segerback, D., Bergman, J., Fersht, A. R., Hainaut, P., Wiman, K. G. & Bykov, V. J. (2009). PRIMA-1 reactivates mutant p53 by covalent binding to the core domain. Cancer Cell 15, 376-88.

125. Lambert, J. M., Moshfegh, A., Hainaut, P., Wiman, K. G. & Bykov, V. J. Mutant p53 reactivation by PRIMA-1MET induces multiple signaling pathways converging on apoptosis. Oncogene 29, 1329-38.
126. Zache, N., Lambert, J. M., Rokaeus, N., Shen, J., Hainaut, P., Bergman, J., Wiman, K. G. & Bykov, V. J. (2008). Mutant p53 targeting by the low molecular weight compound STIMA-1. Mol Oncol 2, 70-80.
127. Brachmann, R. K., Vidal, M. & Boeke, J. D. (1996). Dominant-negative p53 mutations selected in yeast hit cancer hot spots. Proc Natl Acad Sci U S A 93, 4091-5.
128. Vidal, M., Brachmann, R. K., Fattaey, A., Harlow, E. & Boeke, J. D. (1996). Reverse two-hybrid and one-hybrid systems to detect dissociation of protein-protein and DNA-protein interactions. Proc Natl Acad Sci U S A 93, 10315-20.
129. Ishioka, C., Frebourg, T., Yan, Y. X., Vidal, M., Friend, S. H., Schmidt, S. & Iggo, R. (1993). Screening patients for heterozygous p53 mutations using a functional assay in yeast. Nat Genet 5, 124-9.
130. Flaman, J. M., Frebourg, T., Moreau, V., Charbonnier, F., Martin, C., Chappuis, P., Sappino, A. P., Limacher, I. M., Bron, L., Benhattar, J. & et al. (1995). A simple p53 functional assay for screening cell lines, blood, and tumors. Proc Natl Acad Sci U S A 92, 3963-7.
131. Fen, C. X., Coomber, D. W., Lane, D. P. & Ghadessy, F. J. (2007). Directed evolution of p53 variants with altered DNA-binding specificities by in vitro compartmentalization. J Mol Biol 371, 1238-48.
132. Mayer, S., Rudiger, S., Ang, H. C., Joerger, A. C. & Fersht, A. R. (2007). Correlation of levels of folded recombinant p53 in Escherichia coli with thermodynamic stability in vitro. J Mol Biol 372, 268-76.
133. (2010). American Cancer Society.
134. Lane, D. P. (1992). Cancer. p53, guardian of the genome. Nature 358, 15-6.
135. Sigal, A. & Rotter, V. (2000). Oncogenic mutations of the p53 tumor suppressor: the demons of the guardian of the genome. Cancer Res 60, 6788-93.
136. Prives, C. (1994). How loops, beta sheets, and alpha helices help us to understand p53. Cell 78, 543-6.
137. Pavletich, N. P., Chambers, K. A. & Pabo, C. O. (1993). The DNA-binding domain of p53 contains the four conserved regions and the major mutation hot spots. Genes Dev 7, 2556-64.
138. Kamtekar, S. & Hecht, M. H. (1995). Protein Motifs. 7. The four-helix bundle: what determines a fold? Faseb J 9, 1013-22.
139. Matthews, B. W. (1995). Studies on protein stability with T4 lysozyme. Advances in Protein Chemistry 46, 249-78.
140. Jackson, S. E. (2006). Ubiquitin: a small protein folding paradigm. Org Biomol Chem 4, 1845-53.
141. Bykov, V. J. N., Selivanova, G. & Wiman, K. G. (2003). Small molecules that reactivate mutant p53. European Journal of Cancer 39, 1828-1834.

142. Ang, H. C., Joerger, A. C., Mayer, S. & Fersht, A. R. (2006). Effects of common cancer mutations on stability and DNA binding of full-length p53 compared with isolated core domains. J Biol Chem 281, 21934-41.
143. Smardova, J., Smarda, J. & Koptikova, J. (2005). Functional analysis of p53 tumor suppressor in yeast. Differentiation 73, 261-77.
144. Smardova, J. (1999). FASAY: a simple functional assay in yeast for identification of p53 mutation in tumors. Neoplasma 46, 80-8.
145. Crameri, A., Whitehorn, E. A., Tate, E. & Stemmer, W. P. (1996). Improved green fluorescent protein by molecular evolution using DNA shuffling. Nat Biotechnol 14, 315-9.
146. Veprintsev, D. B. & Fersht, A. R. (2008). Algorithm for prediction of tumour suppressor p53 affinity for binding sites in DNA. Nucleic Acids Res 36, 1589-98.
147. Joung, J. K., Ramm, E. I. & Pabo, C. O. (2000). A bacterial two-hybrid selection system for studying protein-DNA and protein-protein interactions. Proc Natl Acad Sci U S A 97, 7382-7.
148. Sakaguchi, M., Nukui, T., Sonegawa, H., Murata, H., Futami, J., Yamada, H. & Huh, N. H. (2005). Targeted disruption of transcriptional regulatory function of p53 by a novel efficient method for introducing a decoy oligonucleotide into nuclei. Nucleic Acids Res 33, e88.
149. Munson, M., Predki, P. F. & Regan, L. (1994). ColE1-compatible vectors for high-level expression of cloned DNAs from the T7 promoter. Gene 144, 59-62.
150. Ho, S. N., Hunt, H. D., Horton, R. M., Pullen, J. K. & Pease, L. R. (1989). Site-directed mutagenesis by overlap extension using the polymerase chain reaction. Gene 77, 51-9.
151. Gao, X., Yo, P., Keith, A., Ragan, T. J. & Harris, T. K. (2003). Thermodynamically balanced inside-out (TBIO) PCR-based gene synthesis: a novel method of primer design for high-fidelity assembly of longer gene sequences. Nucleic Acids Res 31, e143.
152. Stemmer, W. P. (1994). DNA shuffling by random fragmentation and reassembly: in vitro recombination for molecular evolution. Proc Natl Acad Sci U S A 91, 10747-51.
153. Cordes, M. H., Davidson, A. R. & Sauer, R. T. (1996). Sequence space, folding and protein design. Curr Opin Struct Biol 6, 3-10.
154. Sauer, R. T. (1996). Protein folding from a combinatorial perspective. Fold Des 1, R27-30.
155. Fersht, A. R., Matouschek, A. & Serrano, L. (1992). The folding of an enzyme. I. Theory of protein engineering analysis of stability and pathway of protein folding. J Mol Biol 224, 771-82.
156. Serrano, L., Kellis, J. T., Jr., Cann, P., Matouschek, A. & Fersht, A. R. (1992). The folding of an enzyme. II. Substructure of barnase and the contribution of different interactions to protein stability. J Mol Biol 224, 783-804.
157. Desjarlais, J. R. & Handel, T. M. (1995). New strategies in protein design. Curr Opin Biotechnol 6, 460-6.

158. Woolfson, D. N. (2001). Core-directed protein design. Curr Opin Struct Biol 11, 464-71.
159. Baase, W. A., Liu, L., Tronrud, D. E. & Matthews, B. W. Lessons from the lysozyme of phage T4. Protein Sci 19, 631-41.
160. Matsumura, M., Becktel, W. J. & Matthews, B. W. (1988). Hydrophobic stabilization in T4 lysozyme determined directly by multiple substitutions of Ile 3. Nature 334, 406-10.
161. Xu, J., Baase, W. A., Baldwin, E. & Matthews, B. W. (1998). The response of T4 lysozyme to large-to-small substitutions within the core and its relation to the hydrophobic effect. Protein Sci 7, 158-77.
162. Karpusas, M., Baase, W. A., Matsumura, M. & Matthews, B. W. (1989). Hydrophobic packing in T4 lysozyme probed by cavity-filling mutants. Proc Natl Acad Sci U S A 86, 8237-41.
163. Axe, D. D., Foster, N. W. & Fersht, A. R. (1996). Active barnase variants with completely random hydrophobic cores. Proc Natl Acad Sci U S A 93, 5590-4.
164. Finucane, M. D. & Woolfson, D. N. (1999). Core-directed protein design. II. Rescue of a multiply mutated and destabilized variant of ubiquitin. Biochemistry 38, 11613-23.
165. Finucane, M. D., Tuna, M., Lees, J. H. & Woolfson, D. N. (1999). Core-directed protein design. I. An experimental method for selecting stable proteins from combinatorial libraries. Biochemistry 38, 11604-12.
166. Sidhu, S. S. & Koide, S. (2007). Phage display for engineering and analyzing protein interaction interfaces. Curr Opin Struct Biol 17, 481-7.
167. Reidhaar-Olson, J. F. & Sauer, R. T. (1988). Combinatorial cassette mutagenesis as a probe of the informational content of protein sequences. Science 241, 53-7.
168. Lim, W. A. & Sauer, R. T. (1989). Alternative packing arrangements in the hydrophobic core of lambda repressor. Nature 339, 31-6.
169. Desjarlais, J. R. & Handel, T. M. (1995). De novo design of the hydrophobic cores of proteins. Protein Sci 4, 2006-18.
170. Dahiyat, B. I. & Mayo, S. L. (1997). Probing the role of packing specificity in protein design. Proc Natl Acad Sci U S A 94, 10172-7.
171. Dahiyat, B. I. & Mayo, S. L. (1996). Protein design automation. Protein Sci 5, 895-903.
172. Distefano, M. D., Zhong, A. & Cochran, A. G. (2002). Quantifying beta-sheet stability by phage display. J Mol Biol 322, 179-88.
173. Balagurumoorthy, P., Sakamoto, H., Lewis, M. S., Zambrano, N., Clore, G. M., Gronenborn, A. M., Appella, E. & Harrington, R. E. (1995). Four p53 DNA-binding domain peptides bind natural p53-response elements and bend the DNA. Proc Natl Acad Sci U S A 92, 8591-5.
174. Regan, L. (1994). Protein structure. Born to be beta. Curr Biol 4, 656-8.
175. Thoms, S., Max, K. E., Wunderlich, M., Jacso, T., Lilie, H., Reif, B., Heinemann, U. & Schmid, F. X. (2009). Dimer formation of a stabilized Gbeta1 variant: a structural and energetic analysis. J Mol Biol 391, 918-32.

176. Kumar, S., Sun, L., Liu, H., Muralidhara, B. K. & Halpert, J. R. (2006). Engineering mammalian cytochrome P450 2B1 by directed evolution for enhanced catalytic tolerance to temperature and dimethyl sulfoxide. Protein Eng Des Sel 19, 547-54.
177. Lazar, G. A., Desjarlais, J. R. & Handel, T. M. (1997). De novo design of the hydrophobic core of ubiquitin. Protein Sci 6, 1167-78.
178. Silverman, J. A., Balakrishnan, R. & Harbury, P. B. (2001). Reverse engineering the (beta/alpha)8 barrel fold. Proc Natl Acad Sci U S A 98, 3092-7.
179. Nicholls, C. D., McLure, K. G., Shields, M. A. & Lee, P. W. (2002). Biogenesis of p53 involves cotranslational dimerization of monomers and posttranslational dimerization of dimers. Implications on the dominant negative effect. J Biol Chem 277, 12937-45.
180. McLure, K. G. & Lee, P. W. (1999). p53 DNA binding can be modulated by factors that alter the conformational equilibrium. Embo J 18, 763-70.
181. McLure, K. G. & Lee, P. W. (1998). How p53 binds DNA as a tetramer. Embo J 17, 3342-50.
182. Pan, Y., Ma, B., Levine, A. J. & Nussinov, R. (2006). Comparison of the human and worm p53 structures suggests a way for enhancing stability. Biochemistry 45, 3925-33.
183. Nagi, A. D., Anderson, K. S. & Regan, L. (1999). Using loop length variants to dissect the folding pathway of a four-helix-bundle protein. J Mol Biol 286, 257-65.
184. Ladurner, A. G. & Fersht, A. R. (1997). Glutamine, alanine or glycine repeats inserted into the loop of a protein have minimal effects on stability and folding rates. J Mol Biol 273, 330-7.
185. Viguera, A. R. & Serrano, L. (1997). Loop length, intramolecular diffusion and protein folding. Nat Struct Biol 4, 939-46.
186. Li, H., Wang, H. C., Cao, Y., Sharma, D. & Wang, M. (2008). Configurational entropy modulates the mechanical stability of protein GB1. J Mol Biol 379, 871-80.
187. Engel, D. E. & DeGrado, W. F. (2005). Alpha-alpha linking motifs and interhelical orientations. Proteins 61, 325-37.
188. Wright, C. F., Christodoulou, J., Dobson, C. M. & Clarke, J. (2004). The importance of loop length in the folding of an immunoglobulin domain. Protein Eng Des Sel 17, 443-53.
189. De Biasio, A., Sanchez, R., Prieto, J., Villate, M., Campos-Olivas, R. & Blanco, F. J. Reduced stability and increased dynamics in the human proliferating cell nuclear antigen (PCNA) relative to the yeast homolog. PLoS One 6, e16600.
190. Wieden, H. J., Mercier, E., Gray, J., Steed, B. & Yawney, D. A combined molecular dynamics and rapid kinetics approach to identify conserved three-dimensional communication networks in elongation factor Tu. Biophys J 99, 3735-43.

191. Thompson, M. J. & Eisenberg, D. (1999). Transproteomic evidence of a loop-deletion mechanism for enhancing protein thermostability. J Mol Biol 290, 595-604.
192. Walters, J., Swartz, P., Mattos, C. & Clark, A. C. Thermodynamic, enzymatic and structural effects of removing a salt bridge at the base of loop 4 in (pro)caspase-3. Arch Biochem Biophys 508, 31-8.
193. Takano, K., Yamagata, Y. & Yutani, K. (2000). Role of amino acid residues at turns in the conformational stability and folding of human lysozyme. Biochemistry 39, 8655-65.
194. Helms, L. R. & Wetzel, R. (1995). Destabilizing loop swaps in the CDRs of an immunoglobulin VL domain. Protein Sci 4, 2073-81.
195. Miyazaki, K., Wintrode, P. L., Grayling, R. A., Rubingh, D. N. & Arnold, F. H. (2000). Directed evolution study of temperature adaptation in a psychrophilic enzyme. J Mol Biol 297, 1015-26.
196. Phizicky, E. M. & Fields, S. (1995). Protein-protein interactions: methods for detection and analysis. Microbiol Rev 59, 94-123.
197. Keskin, O., Gursoy, A., Ma, B. & Nussinov, R. (2008). Principles of protein-protein interactions: what are the preferred ways for proteins to interact? Chem Rev 108, 1225-44.
198. Rayburn, E. R., Ezell, S. J. & Zhang, R. (2009). Recent advances in validating MDM2 as a cancer target. Anticancer Agents Med Chem 9, 882-903.
199. Shen, H. & Maki, C. G. Pharmacologic activation of p53 by small-molecule MDM2 antagonists. Curr Pharm Des 17, 560-8.
200. Yang, E. S. & Xia, F. BRCA1 16 years later: DNA damage-induced BRCA1 shuttling. Febs J 277, 3079-85.
201. Thompson, M. E. BRCA1 16 years later: nuclear import and export processes. Febs J 277, 3072-8.
202. Thompson, M. E. BRCA1 16 years later: an overview. Febs J 277, 3071.
203. Deng, C. X. & Brodie, S. G. (2000). Roles of BRCA1 and its interacting proteins. Bioessays 22, 728-37.
204. Li, Y., Xie, W. & Fang, G. (2008). Fluorescence detection techniques for protein kinase assay. Anal Bioanal Chem 390, 2049-57.
205. Fields, S. & Song, O. (1989). A novel genetic system to detect protein-protein interactions. Nature 340, 245-6.
206. Chien, C. T., Bartel, P. L., Sternglanz, R. & Fields, S. (1991). The two-hybrid system: a method to identify and clone genes for proteins that interact with a protein of interest. Proc Natl Acad Sci U S A 88, 9578-82.
207. Magliery, T. J. & Regan, L. (2006). Reassembled GFP: detecting protein-protein interactions and protein expression patterns. Methods Biochem Anal 47, 391-405.
208. Magliery, T. J., Wilson, C. G., Pan, W., Mishler, D., Ghosh, I., Hamilton, A. D. & Regan, L. (2005). Detecting protein-protein interactions with a green fluorescent protein fragment reassembly trap: scope and mechanism. J Am Chem Soc 127, 146-57.

209. Ghosh, I., Hamilton, A. D. & Regan, L. (2000). Antiparallel leucine zipper-directed protein reassembly: Application to the green fluorescent protein. Journal of the American Chemical Society 122, 5658-5659.
210. Magliery, T. J., Wilson, C. G. M., Pan, W., Mishler, D., Ghosh, I., Hamilton, A. D. & Regan, L. (2005). Detecting protein-protein interactions with a GFP-fragment reassembly trap: scope and mechanism. J Am Chem Soc 127, 146-157.
211. Sarkar, M. & Magliery, T. J. (2008). Re-engineering a split-GFP reassembly screen to examine RING-domain interactions between BARD1 and BRCA1 mutants observed in cancer patients. Mol Biosyst 4, 599-605.
212. Jiang, J., Yang, E. S., Jiang, G., Nowsheen, S., Wang, H., Wang, T., Wang, Y., Billheimer, D., Chakravarthy, A. B., Brown, M., Haffty, B. & Xia, F. p53-dependent BRCA1 nuclear export controls cellular susceptibility to DNA damage. Cancer Res 71, 5546-57.
213. Nishikawa, H., Wu, W., Koike, A., Kojima, R., Gomi, H., Fukuda, M. & Ohta, T. (2009). BRCA1-associated protein 1 interferes with BRCA1/BARD1 RING heterodimer activity. Cancer Res 69, 111-9.
214. Shakya, R., Szabolcs, M., McCarthy, E., Ospina, E., Basso, K., Nandula, S., Murty, V., Baer, R. & Ludwig, T. (2008). The basal-like mammary carcinomas induced by Brca1 or Bard1 inactivation implicate the BRCA1/BARD1 heterodimer in tumor suppression. Proc Natl Acad Sci U S A 105, 7040-5.
215. Baer, R. & Ludwig, T. (2002). The BRCA1/BARD1 heterodimer, a tumor suppressor complex with ubiquitin E3 ligase activity. Curr Opin Genet Dev 12, 86-91.
216. Lu, Y., Kang, T. & Hu, Y. BRCA1/BARD1 complex interacts with steroidogenic factor 1--A potential mechanism for regulation of aromatase expression by BRCA1. J Steroid Biochem Mol Biol 123, 71-8.
217. Lane, T. F. (2004). BRCA1 and transcription. Cancer Biol Ther 3, 528-33.
218. Morris, J. R., Keep, N. H. & Solomon, E. (2002). Identification of residues required for the interaction of BARD1 with BRCA1. J Biol Chem 277, 9382-6.
219. Morris, J. R., Pangon, L., Boutell, C., Katagiri, T., Keep, N. H. & Solomon, E. (2006). Genetic analysis of BRCA1 ubiquitin ligase activity and its relationship to breast cancer susceptibility. Hum Mol Genet 15, 599-606.
220. Brzovic, P. S., Meza, J. E., King, M. C. & Klevit, R. E. (2001). BRCA1 RING domain cancer-predisposing mutations. Structural consequences and effects on protein-protein interactions. J Biol Chem 276, 41399-406.
221. Narod, S. A. & Foulkes, W. D. (2004). BRCA1 and BRCA2: 1994 and beyond. Nat Rev Cancer 4, 665-76.
222. Brekke, O. H. & Sandlie, I. (2003). Therapeutic antibodies for human diseases at the dawn of the twenty-first century. Nat Rev Drug Discov 2, 52-62.
223. Walters, W. P., Ajay & Murcko, M. A. (1999). Recognizing molecules with drug-like properties. Curr Opin Chem Biol 3, 384-7.
224. Seemann, S., Maurici, D., Olivier, M., de Fromentel, C. C. & Hainaut, P. (2004). The tumor suppressor gene TP53: implications for cancer management and therapy. Crit Rev Clin Lab Sci 41, 551-83.

225. Noberini, R., Lamberto, I. & Pasquale, E. B. Targeting Eph receptors with peptides and small molecules: Progress and challenges. Semin Cell Dev Biol.
226. Pasupuleti, M., Schmidtchen, A. & Malmsten, M. Antimicrobial peptides: key components of the innate immune system. Crit Rev Biotechnol.
227. Behrens, M. I., Lendon, C. & Roe, C. M. (2009). A common biological mechanism in cancer and Alzheimer's disease? Curr Alzheimer Res 6, 196-204.
228. Wang, J., Cao, Z., Zhao, L. & Li, S. Novel Strategies for Drug Discovery Based on Intrinsically Disordered Proteins (IDPs). Int J Mol Sci 12, 3205-19.
229. Lanni, C., Racchi, M., Uberti, D., Mazzini, G., Stanga, S., Sinforiani, E., Memo, M. & Govoni, S. (2008). Pharmacogenetics and pharmagenomics, trends in normal and pathological aging studies: focus on p53. Curr Pharm Des 14, 2665-71.
230. Lanni, C., Racchi, M., Mazzini, G., Ranzenigo, A., Polotti, R., Sinforiani, E., Olivari, L., Barcikowska, M., Styczynska, M., Kuznicki, J., Szybinska, A., Govoni, S., Memo, M. & Uberti, D. (2008). Conformationally altered p53: a novel Alzheimer's disease marker? Mol Psychiatry 13, 641-7.
231. Lanni, C., Uberti, D., Racchi, M., Govoni, S. & Memo, M. (2007). Unfolded p53: a potential biomarker for Alzheimer's disease. J Alzheimers Dis 12, 93-9.
232. Essmann, F. & Schulze-Osthoff, K. Translational approaches targeting the p53 pathway for anticancer therapy. Br J Pharmacol.
233. Yang, Z. X., Wang, D., Wang, G., Zhang, Q. H., Liu, J. M., Peng, P. & Liu, X. H. Clinical study of recombinant adenovirus-p53 combined with fractionated stereotactic radiotherapy for hepatocellular carcinoma. J Cancer Res Clin Oncol 136, 625-30.
234. Lu, W., Zheng, S., Li, X. F., Huang, J. J., Zheng, X. & Li, Z. (2004). Intra-tumor injection of H101, a recombinant adenovirus, in combination with chemotherapy in patients with advanced cancers: a pilot phase II clinical trial. World J Gastroenterol 10, 3634-8.
235. Kravchenko, J. E., Ilyinskaya, G. V., Komarov, P. G., Agapova, L. S., Kochetkov, D. V., Strom, E., Frolova, E. I., Kovriga, I., Gudkov, A. V., Feinstein, E. & Chumakov, P. M. (2008). Small-molecule RETRA suppresses mutant p53-bearing cancer cells through a p73-dependent salvage pathway. Proc Natl Acad Sci U S A 105, 6302-7.
236. Brown, C. J., Cheok, C. F., Verma, C. S. & Lane, D. P. Reactivation of p53: from peptides to small molecules. Trends Pharmacol Sci 32, 53-62.
237. Scott, C. P., Abel-Santos, E., Wall, M., Wahnon, D. C. & Benkovic, S. J. (1999). Production of cyclic peptides and proteins in vivo. Proc Natl Acad Sci U S A 96, 13638-43.
238. Scott, C. P., Abel-Santos, E., Jones, A. D. & Benkovic, S. J. (2001). Structural requirements for the biosynthesis of backbone cyclic peptide libraries. Chem Biol 8, 801-15.
239. Abel-Santos, E., Scott, C. P. & Benkovic, S. J. (2003). Use of inteins for the in vivo production of stable cyclic peptide libraries in E. coli. Methods Mol Biol 205, 281-94.

240. Horswill, A. R., Savinov, S. N. & Benkovic, S. J. (2004). A systematic method for identifying small-molecule modulators of protein-protein interactions. Proc Natl Acad Sci U S A 101, 15591-6.
241. Nilsson, L. O., Louassini, M. & Abel-Santos, E. (2005). Using siclopps for the discovery of novel antimicrobial peptides and their targets. Protein Pept Lett 12, 795-9.
242. Kritzer, J. A., Hamamichi, S., McCaffery, J. M., Santagata, S., Naumann, T. A., Caldwell, K. A., Caldwell, G. A. & Lindquist, S. (2009). Rapid selection of cyclic peptides that reduce alpha-synuclein toxicity in yeast and animal models. Nat Chem Biol 5, 655-63.
243. Zhang, H., Wu, H. & Zhang, H. (2007). A novel high-copy plasmid, pEC, compatible with commonly used Escherichia coli cloning and expression vectors. Biotechnol Lett 29, 431-7.
244. Bloom, J. D., Labthavikul, S. T., Otey, C. R. & Arnold, F. H. (2006). Protein stability promotes evolvability. Proc Natl Acad Sci U S A 103, 5869-74.
245. Wilke, C. O., Bloom, J. D., Drummond, D. A. & Raval, A. (2005). Predicting the tolerance of proteins to random amino acid substitution. Biophys J 89, 3714-20.
