
The effect of the polyglutamine expansion of the androgen receptor on the ubiquitin proteasome system for protein degradation

Thomas Carr Scanlon Department of Human Genetics McGill University, Montreal, Quebec August, 2007

A thesis submitted to McGill University in partial fulfillment of the requirements of the degree of Doctor of Philosophy

© Thomas Scanlon, 2007

ABSTRACT

The Androgen Receptor (AR; Xq11.2-q12) is a well-characterized X-linked gene. The AR protein functions primarily as a steroid-activated transcription factor. A structural feature of the AR transactivational domain (TAD) called the polyglutamine tract is the primary focus of this thesis. The polyglutamine tract is an uninterrupted stretch of glutamine residues, polymorphic in length, that contributes to the transactivational potential of the receptor, and has also been demonstrated to be the causative agent in Spinal and Bulbar Muscular Atrophy (SBMA).

SBMA is a late-onset neurological disease caused by an expansion of the polyglutamine tract in the AR. Immunohistopathological investigations of affected tissue in SBMA patients reveal strong anti-AR staining in discrete, electrodense regions, termed “inclusion bodies” or “protein aggregates”. This finding has been corroborated by various transfection studies comparing wild type and polyglutamine tract-expanded AR, leading to a large variety of hypotheses regarding the formation and potential toxicity of these structures. The consistent observation of accumulated, insoluble polyglutamine-expanded protein suggests a malfunction of the normal mechanisms of protein clearance.

The primary goal of this thesis is to explore the effect of the polyglutamine expansion mutation in the AR on the ubiquitin proteasome system of protein degradation (UPS). To this end, several strategies have been implemented. A classic proteasome reporter cell line was used to create a cell culture model of SBMA to show evidence for involvement of the UPS in SBMA. To investigate the putative nuclear dependence of polyglutamine-mediated toxicity, transfection studies using a mutant AR with a defective nuclear localization signal were also performed in this cell line. In an attempt to unequivocally demonstrate polyglutamine-expanded protein degradation by the proteasome, efforts were made to create an entirely in vitro method to test AR degradation by the UPS. Human proteasomes were successfully purified by a novel affinity chromatography method towards this end.

Finally, the AR TAD with various polyglutamine tract lengths was bacterially expressed and highly purified for structural analysis of the polyglutamine tract in a native protein. Dynamic light scattering experiments demonstrated that expanded polyQ tracts exhibit biochemical aggregative properties. Circular dichroism spectra were collected for each polyQ tract variant. Results indicated that a polyQ tract of any length is a strong structure-breaking sequence. Furthermore, the expansion mutant exhibited pronounced β-sheet structure absent from all other variants. In sum, structural investigations suggest a molten globule structure for the wild type AR-TAD.

RÉSUMÉ

The androgen receptor (AR), a gene located on the X chromosome at position Xq11.2-q12, is primarily a transcription factor activated by steroid hormones. This thesis focuses on a structural domain of the androgen receptor that is necessary for the transcriptional activity of the receptor. This domain, called the polyglutamine tract, consists of a repetitive, uninterrupted sequence of glutamine residues whose number is polymorphic. The number of repeats directly affects the transactivational potential of the receptor and is the causal element of spinal and bulbar muscular atrophy (SBMA).

SBMA is a neurological disease with late onset in adulthood. It is caused by an expansion of the number of glutamine repeats forming the polyglutamine tract of the androgen receptor. Immunohistopathological investigation of tissue from SBMA patients reveals a dense concentration of the androgen receptor in the form of protein aggregates called inclusion bodies. This finding regarding the toxic potential of these structures has been corroborated by several overexpression studies of the normal (wild type; WT) AR gene and of the version carrying an expanded number of glutamines. The consistent observation of insoluble aggregates of the AR protein carrying the glutamine expansion suggests a problem in its handling by the normal mechanisms of protein degradation.

The primary goal of this thesis is to explore the effects of an expansion of the number of glutamine repeats in the gene on degradation of the androgen receptor by the ubiquitin/proteasome system (UPS). To this end, several strategies were employed using a cell line that serves as a model of SBMA. To show the effect of nuclear localization of the glutamine-expanded androgen receptor protein on cellular toxicity, transfection studies were performed with the mutant protein also carrying a defective nuclear localization signal. To demonstrate the effect of the glutamine repeat expansion, human proteasomes were purified by affinity chromatography in order to test the efficiency of degradation of androgen receptor constructs bearing different glutamine repeat lengths.

ACKNOWLEDGEMENTS

In the spirit of full disclosure, it is acknowledged that the author was the beneficiary of excellent guidance and spirited discussion of this thesis and all aspects of science (ranging from the appropriately relevant to the wildly speculative) with my supervisor, Dr. Mark Trifiro. In addition, considerable time and effort in the design of experiments, interpretation of results and suggestion of controls was provided by several members of the Trifiro laboratory: Dr. Lenore K. Beitel, Carlos Alvarado, Rose Lumbroso, and Dr. Bruce Gottlieb. It should be expressly stated that tandem mass spectrometry of purified proteasome fractions was performed on a fee-for-service basis by Dr. Marcos DiFalco at the McGill University and Genome Quebec Innovation Centre. Database mining of the resultant LC-MS/MS data was performed by the author with the assistance of Dr. Beniot Houle and Yannic Richard of the same institution. Finally, the LC-MS/MS data from proteasome purifications were searched against a randomized version of the NCBI database by Dr. Rob Kearney, also of the McGill University and Genome Quebec Innovation Centre.

Materials were gathered from various sources for completion of the research described in this thesis. Dr. Ron Kopito provided the HEK-293 GFPu cell line. In addition, several expression vectors for transfection experiments had already been constructed by various members of the Trifiro laboratory, including the EBFP-AR -Q20 and -Q50 plasmids, the ARΔNLS -Q20 and -Q50 plasmids, and the bacterial N-terminal TAD expression vectors. In addition, the GST-UBL was provided by Dr. Juli Feigon, University of California at Los Angeles, and the pMT123 expression vector was a gift from Dr. Simon Wing, McGill University. All other DNA vectors were constructed by the author.

Finally, it would have been impossible to complete this work without the unconditional love and support from my parents, Timothy and Barbara Scanlon. Additionally, I would like to express gratitude for which there are no words to my wife, Katherine Scanlon, for her unwavering love and support (not to mention unimaginable patience).


Table of Contents

ABSTRACT
RÉSUMÉ
ACKNOWLEDGEMENTS
TABLE OF CONTENTS
LIST OF FIGURES
LIST OF TABLES
LIST OF ABBREVIATIONS

CHAPTER 1 – Introduction and Literature Review
  1.1 General Introduction
  1.2 Molecular Biology of the Nuclear Receptor Superfamily
    1.2.1 - Nuclear Receptor Superfamily
    1.2.2 - Modular Domain Structure of Nuclear Receptors
    1.2.3 - Nuclear Receptor Coregulators
  1.3 Molecular Biology of the Androgen Receptor
    1.3.1 - Role of Androgens
    1.3.2 - AR Genetic Features
    1.3.3 - AR Mode of Action
  1.4 The Ubiquitin Proteasome System for Protein Degradation
    1.4.1 - General Introduction
  1.5 Polyglutamine Expansion Disease
    1.5.1 - General Introduction to Polyglutamine Expansion Disease
    1.5.2 - Molecular Pathology
  1.6 Spinal and Bulbar Muscular Atrophy
    1.6.1 - Pathogenesis
    1.6.2 - Aggregate Formation in SBMA
    1.6.3 - Animal Models of SBMA

CHAPTER 2 – Role of Androgen Receptor Nuclear Translocation in the Polyglutamine Expansion-Mediated Molecular Pathology of the Ubiquitin Proteasome System
  2.1 Introduction
    2.1.1 - Role of Proteasome in PolyQ Expansion Diseases
  2.2 Results and Discussion
    2.2.1 - PolyQ-expanded AR inhibits the proteasome
    2.2.2 - Exclusion of the AR from the nucleus rescues proteasome impairment
  2.3 Conclusions
  2.4 Experimental Procedures
  2.5 Figures
  2.6 References

CHAPTER 3 – A Novel Method for Purification of Functional Human Proteasomes Under Native Conditions
  3.1 Introduction
  3.2 Results
    3.2.1 - Functional analysis of affinity purified proteasomes
    3.2.2 - Structural analysis of affinity purified proteasomes
    3.3.3 - Characterization of the holoproteasome complex by mass spectrometry
  3.3 Discussion
    3.3.1 Group 1: Deubiquitinases
    3.3.2 Group 2: Ubiquitin Conjugating/
    3.3.3 Group 3: Ubiquitin Domain
    3.3.4 Group 4: DNA Repair
    3.3.5 Other Proteins
  3.4 Conclusions
  3.5 Experimental Procedures
  3.6 Tables
  3.7 Figures
  3.8 References

CHAPTER 4 – Production and Purification of Polyubiquitinated Androgen Receptor
  4.1 Introduction
    4.1.1 - Endogenous ubiquitination of the AR
    4.1.2 - Towards in vitro ubiquitination of the AR
  4.2 Results and Discussion
    4.2.1 - In vivo ubiquitination and purification of transfected AR from HEK-293
    4.2.2 - Ubiquitination and purification of endogenous AR in LNCaP
    4.2.3 - In vivo ubiquitination in HEK-293Mdm2
    4.2.4 - PROTACS
    4.2.5 - In vitro ubiquitin conjugation of GST-UbcH5a
  4.3 Conclusions
  4.4 Experimental Procedures
  4.5 Figures
  4.6 References

CHAPTER 5 – Structural Investigation of the Polyglutamine Expansion of the AR
  5.1 Introduction
    5.1.1 - Structure of the AR transactivational domain
    5.1.2 - Structural investigations of the polyQ tract
  5.2 Results and Discussion
    5.2.1 - Expression and purification of AR TAD
    5.2.2 Structural Investigation of the AR-TAD polyQ Variants – Dynamic Light Scattering
    5.2.2 Structural Investigation of the AR-TAD polyQ Variants – Circular Dichroism
  5.3 Conclusions
  5.4 Experimental Procedures
  5.5 Figures
  5.6 References

FINAL CONCLUSION AND SUMMARY
LIST OF PUBLICATIONS


LIST OF FIGURES

Figure 1.1: Modular Domain Organization of Nuclear Receptor
Figure 1.2: Androgen Receptor Gene, Transcript, and Protein
Figure 1.3: Androgen Receptor Mode of Action
Figure 1.4: Ubiquitin-Proteasome System for Protein Degradation
Figure 2.1: AR Ligand-Dependent Inhibition of the Proteasome
Figure 2.2: AR Cytoplasmic Restriction Rescues Proteasome Impairment
Figure 3.1: Fluorigenic Substrate Assay of Purified Proteasomes
Figure 3.2: Immunodetection of Purified Proteasomes
Figure 4.1: Ubiquitin Conjugation and Ni2+-IMAC Purification of HIS-AR
Figure 4.2: Ubiquitin Conjugation with pHA-Ub;HIS-AR
Figure 4.3: Ubiquitin Conjugation and HA Immunoprecipitation of HA-Ub
Figure 4.4: Ubiquitin Conjugation of Endogenous AR in LNCaP
Figure 4.5: Ubiquitin Conjugation in HEK-293Mdm2
Figure 4.6: Effects of AR Antagonists on AR Ubiquitin Conjugation
Figure 4.7: PROTAC-5 Treatment Reduces AR Levels
Figure 4.8: GST-UbcH5a Undergoes Rapid Autoubiquitination In Vitro
Figure 4.9: GST-UbcH5a-TAD Fusion Protein Does Not Autoubiquitinate
Figure 5.1: Bacterial Production and Purification of GST-AR TAD
Figure 5.2: Production and Purification of AR TAD 0Q
Figure 5.3: Production and Purification of AR TAD 20Q
Figure 5.4: Production and Purification of AR TAD 50Q
Figure 5.5: SDS- and Native-PAGE Analysis of AR TAD
Figure 5.6: DLS of TAD-AR polyQ variants
Figure 5.7: Secondary structure analysis of the AR-TAD
Figure 5.8: Far-UV Circular Dichroism spectra of AR-TAD polyQ variants
Figure 5.8: Tertiary structure analysis of the AR-TAD by Near-UV CD


LIST OF TABLES

Table 1.1: Polyglutamine Expansion Diseases
Table 1.2: Animal Models of SBMA
Table 3.1: Proteasome Subunits ID’ed by GST-UBL Chromatography
Table 3.2: Nonproteasome Proteins ID’ed by GST-UBL Chromatography
Table 3.3: Nucleotide Regulation of Proteasome Subunit Composition

LIST OF ABBREVIATIONS

AA Amino acid
AF-1 Activating Function 1
AF-2 Activating Function 2
AIS Androgen Insensitivity Syndrome
ALS Amyotrophic lateral sclerosis
AMC Aminomethyl coumarin
Amp Ampicillin
AR Androgen Receptor
ARE Androgen response element
BIC Bicalutamide
Bmax Maximum binding capacity
CAG cytosine-adenine-guanine
CD Circular dichroism
CBP CREB binding protein
CHIP C-terminal Hsp-interacting protein
Chl Chloramphenicol
CPA Cyproterone acetate
DALPC Direct analysis of large protein complexes
DBD DNA binding domain
DHT Dihydrotestosterone
DRPLA Dentatorubral-pallidoluysian atrophy
DSB Double strand break
DUB Deubiquitinase
E1 Ubiquitin activating enzyme
E2 Ubiquitin conjugating enzyme
E3 Ubiquitin ligase
EM Electron microscopy
ERS Energy regenerating system
FSA Fluorigenic substrate assay
FTIR Fourier transform infrared spectroscopy
GFP Green Fluorescent Protein
GR Glucocorticoid Receptor
GSH Glutathione
GST Glutathione S-Transferase
HA Hemagglutinin
HAT Histone acetyltransferase
HD Huntington Disease
HDAC Histone deacetylase
HEK Human Embryonic Kidney
HF Hydroxyflutamide
Htt Huntingtin
IPTG Isopropyl β-D-1-thiogalactopyranoside
IR Infrared
Kd Dissociation equilibrium constant
LBD Ligand binding domain
LBP Ligand binding pocket
LC-MS/MS Liquid chromatography – tandem mass spectrometry
LHRH Luteinizing hormone-releasing hormone
Luc Luciferase
MB Mibolerone
MG132 Z-Leu-Leu-Leu-aldehyde
MMTV Mouse mammary tumour virus
NCBI National Center for Biotechnology Information
NCoR Nuclear Receptor Corepressor
NER Nucleotide excision repair
Ni2+NTA Nickel nitrilotriacetic acid
NLS Nuclear Localization Sequence
NMR Nuclear Magnetic Resonance
NR Nuclear Receptor
OD Optical density
PIP Proteasome interacting protein
polyQ polyglutamine
p/p PreScission Protease
PROTAC Proteolysis targeting chimeric molecule
RAP30 RNA polymerase II-associated protein 30 kDa
RAR-γ Retinoic Acid Receptor-γ
RIPA Radioimmunoprecipitation assay
RXR-α Retinoid X Receptor-α
SAGA Spt/Ada/Gcn5 acetyltransferase complex
SBMA Spinal and Bulbar Muscular Atrophy
SCA Spinocerebellar Ataxia
SCX Strong cation exchange
SMRT Silencing mediator of Retinoic Acid and Thyroid Hormone Receptors
SP1 Specificity Protein 1
SUMO Small ubiquitin-like protein modifier
T Testosterone
TAD Transactivation domain
TAF4 TBP-associated factor
tb Thrombin
TBP TATA-binding protein
TFTC TBP-associated factor-containing complex
TMAO Trimethylamine-N-oxide
Tomas Toolbox to optimize mass spectrometry data
Ub Ubiquitin
UBA Ubiquitin associating domain
UBL Ubiquitin like domain
Ub-H Ubiquitin aldehyde
UBX Ubiquitin regulatory X
UIM-1 Ubiquitin interacting motif 1
UIM-2 Ubiquitin interacting motif 2
UPS Ubiquitin proteasome system for protein degradation
YFP Yellow Fluorescent Protein


CHAPTER 1 – Introduction and Literature Review

1.1 General Introduction

The Androgen Receptor (AR; NR3C4) is a critical signaling molecule involved in myriad cellular functions pertaining to growth and development of male reproductive and nonreproductive systems. The AR functions primarily as a ligand-dependent transcription factor activated by the natural ligands testosterone and dihydrotestosterone. An exceptional number of mutations in the AR have been reported, mostly resulting in the inactivation of normal androgen-binding activity. These loss-of-function mutations manifest clinically as Androgen Insensitivity Syndrome (AIS), which has dramatic effects on the development of the male phenotype but is not lethal.

Another class of AR mutation, the so-called trinucleotide repeat expansion, occurs when the normally polymorphic cytosine-adenine-guanine (CAG) repeat in the coding region of exon 1 is expanded past a pathologic threshold. In unaffected individuals, the CAG repeat ranges from 9-33 repeats with a mode of 21, but males who inherit an AR with more than 37 CAGs develop a late-onset neurological disorder termed Spinal and Bulbar Muscular Atrophy (SBMA). It is widely accepted that expansion of the CAG repeat in AR results in a “gain-of-function” mutation, considering that knockout of the mouse AR gene yielded no neurological phenotype. The CAG codon codes for a glutamine residue, and therefore SBMA is a member of a class of disorders referred to as Polyglutamine (polyQ) Expansion Disorders. The underlying toxicity leading to neurodegeneration engendered by the polyQ expansion remains largely unknown. There is ample evidence to suggest that misregulation of the ubiquitin proteasome system for protein degradation (UPS) plays a role in the polyglutamine expansion-mediated dysfunction. The major goal of this thesis is to explore this hypothesis.
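The repeat-length ranges quoted above can be illustrated with a short sketch. It is not part of the thesis methods; the function names are hypothetical, and the label given to lengths outside the two quoted ranges is an assumption, since the text does not classify them.

```python
import re

# Ranges quoted in the text: 9-33 CAG repeats in unaffected individuals,
# more than 37 repeats in SBMA. Any other length is not classified by the
# text, so it is labelled "unclassified" here (an assumption).
NORMAL_MIN, NORMAL_MAX = 9, 33
SBMA_THRESHOLD = 37


def longest_cag_run(dna: str) -> int:
    """Return the longest uninterrupted run of CAG, counted in repeats.

    The search is frame-agnostic: it looks for the literal string CAG
    repeated back to back, wherever it occurs in the sequence.
    """
    runs = re.findall(r"(?:CAG)+", dna.upper())
    return max((len(run) // 3 for run in runs), default=0)


def classify_repeat(n_repeats: int) -> str:
    """Classify a CAG repeat count using only the ranges quoted in the text."""
    if n_repeats > SBMA_THRESHOLD:
        return "expanded (SBMA range)"
    if NORMAL_MIN <= n_repeats <= NORMAL_MAX:
        return "normal range"
    return "unclassified"


def polyq_tract(n_repeats: int) -> str:
    """Each CAG codon encodes glutamine (Q), so n repeats give n residues."""
    return "Q" * n_repeats
```

For example, a toy sequence built as `"ATG" + "CAG" * 21 + "GAA"` contains a run of 21 repeats, the modal length reported for unaffected individuals.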

This report details an investigation of the role of the UPS in PolyQ Expansion Disease, using SBMA as a model. Chapter 2 describes the use of a classic proteasome reporter cell line to assess global function of the UPS in the context of a polyQ-expanded AR restricted to either the nucleus or the cytoplasm. Chapters 3 and 4 detail attempts to generate a totally in vitro system for generation of appropriately targeted (i.e. polyubiquitinated) AR, and proteasome enzyme. A novel method for purification of intact, functional human proteasomes under native conditions was developed and found to be useful for the detection of proteasome interacting proteins, discussed in Chapter 3. Various attempts to purify large quantities of polyubiquitinated AR are described in Chapter 4. Finally, Chapter 5 describes the large scale expression and purification of N-terminal fragments of the AR encompassing both wild type and expanded polyglutamine tracts for the purposes of structural investigation of the effects of the expansion mutation.

1.2 Molecular Biology of the Nuclear Receptor Superfamily

1.2.1 Nuclear Receptor Superfamily

The Nuclear Receptor (NR) superfamily is one of the largest families of ligand-activated transcription factors in the animal kingdom. Systematic classification of the NR superfamily was addressed in 1999 by the Nuclear Receptors Nomenclature Committee [1], and the current nomenclature is based upon the phylogenetic relationship of the most conserved regions of the NR: the DBD and LBD. This method yields 6 major subfamilies that are each divided into numerous groups. The number of NR genes varies widely among animal species: Drosophila melanogaster has 21 NR genes and humans have 48, yet there exist over 270 NR genes in Caenorhabditis elegans.

NRs play a vital role in the regulation of varied cellular activities including homeostasis, differentiation, reproduction and development [2]. NRs serve as a one-stop signal transduction cascade, providing a direct link between signaling molecules and transcription. In addition, NR have a role in integration of information from diverse signal transduction cascades, as their activity is modulated by post-translational modification.

The diverse signaling molecules serving as NR ligand activators are mainly small and hydrophobic, including the steroid hormones (androgens, estrogens, glucocorticoids), retinoic acid, fatty acids, leukotrienes and . In addition, a large number of genes with significant homology to classical NR DBD and LBD have been identified, but have no known natural ligand. These are the so-called “orphan receptors”. The nature of NR natural ligands (small, hydrophobic, cell permeable, high binding affinity), and the confirmed involvement in high-profile disease processes like and diabetes, make the NR superfamily a highly intriguing class of proteins for pharmacologic intervention; this facet has undoubtedly accelerated understanding of the NR superfamily.

1.2.2 Modular Domain Structure of the Nuclear Receptor Superfamily

The structural organization of NRs is highly conserved throughout the superfamily (Figure 1.1). Generally, there are four conserved domains of NR including (from N- to C-terminal): the transactivational domain (TAD, or A/B domain), the DNA binding domain (DBD, or C domain), a flexible linker domain (the hinge region, or D domain), and the ligand binding domain (LBD, or E domain). The conservation in overall domain structure suggests a common mode of ligand-dependent activation across subfamily members.

The TAD is the least conserved domain of NR superfamily members, with a reported average amino acid identity to the consensus NR TAD sequence of only 29% [2]. At the extremes of TAD variability are the Vitamin D Receptor TAD, which has only 35 amino acids (AA), and the AR TAD, which has 532 AA and a reported AA identity to the consensus sequence of only 8%. The variability in TAD structure likely contributes to the specificity of individual NR response. Currently, no 3D structure is available for any NR TAD, and it is unlikely that the solution of one NR TAD would be relevant for homology modeling of other NRs. Structural data garnered from circular dichroism and nuclear magnetic resonance (NMR) of the glucocorticoid receptor (GR) suggest the AF-1 is natively unfolded [3], and may be forced into an α-helical conformation via an ‘induced-fit’ mechanism [4], creating a stabilized structure competent for transactivation. These indirect data suggest that conditional folding and α-helical formation is an important requirement for interaction with coregulators involved in transcriptional regulation [5-7].

In spite of reported NR TAD variability, the presence of one constitutively active transactivation region (AF-1), and several other autonomous transactivational subdomains are conserved features of NR TADs. These domains are constitutively active in the absence of the DBD and LBD, but are repressed in the apo (unliganded) state of the full length receptor. The main reported function of AF-1 is the recruitment of the basal transcription machinery, specifically the TFIID complex via the Alteration/deficiency of activation 2 protein (Ada2p) and TATA box-binding protein (TBP) [6, 8]. Importantly, the transactivational potential of the AF-1 is dependent upon promoter context, cell type and post-translational modifications, such as phosphorylation and small ubiquitin-like modification (SUMO-ylation) [9, 10].

[Figure 1.1 schematic: TAD (highly variable; contains AF-1), DBD (strongly conserved), H (hinge; contains NLS), LBD (secondary and 3-D structure conserved; contains AF-2)]

Figure 1.1: Modular Domain Organization of Nuclear Receptor Superfamily. Displayed are the Transactivation domain (TAD), which exhibits low homology between family members and harbors the activation function-1 (AF-1), primarily involved in interaction with coregulatory proteins; the DNA-binding domain (DBD), responsible for recognition of DNA response elements and dimerization; the hinge region (H), containing the nuclear localization sequence (NLS); the ligand-binding domain (LBD), a structurally conserved domain involved in ligand-binding and transactivation functions via the activation function-2 (AF-2) subdomain, and dimerization.

The DBD is extremely well conserved among the NR superfamily, both in primary sequence and 3-D conformation. The central feature of NR DBD structure is the 3 protruding helices organized into 2 zinc-finger motifs that make site-specific contacts with DNA response element hexamers. Each zinc-finger is characterized by 4 invariant cysteine residues that chelate one zinc ion [11-13]. The 2 zinc-finger motifs of NRs are interdependent subdomains with slight structural and functional differences, but fold as a unified globular domain whose structure is modestly changed upon DNA binding [14-16]. This differs considerably from classical zinc-finger motifs that function as conformationally stable units in the presence or absence of DNA, and contribute to DNA binding independently [17]. The first NR zinc-finger subdomain contains 3 to 4 residues termed the P box, which aids in the discrimination of receptor-specific DNA response elements [13, 18]. In contrast, a helix in the second NR zinc-finger has a nonspecific affinity for DNA, while a loop in the same subdomain forms a homodimerization interface called the D box.


The hinge region is a small but multifunctional domain that links the DBD and LBD, harbors the primary nuclear localization sequence (NLS), and has roles in DNA binding and modulation of transactivation [19]. The NLS is comprised of a cluster of basic residues that enables a ligand-bound NR to be translocated to the nucleus. In addition, the hinge region has been demonstrated to harbor several lysine residues targeted for post-translational modification including acetylation and ubiquitination [20].

The NR LBD provides the receptor interface for binding of class-specific ligands. In addition, the LBD mediates transcriptional repression, harbors the ligand-dependent transcriptional activation function (AF-2), and is involved in homo- and heterodimerization. Crystal structures have been solved for at least 20 of the 48 human NR LBDs identified by bioinformatics. It is intriguing that both secondary and tertiary structure of the LBD across superfamily members is highly conserved: all known NR LBDs are composed of 11-12 α-helices with a short β-sheet insertion between H5 and H6, arranged in a similar fold termed the “α-helix sandwich” [21]. Helices H1, H2 and H3 form one face of the LBD; H4, H5, H8, H9, and the β-sheets form a central layer; and H6, H7 and H10 form the opposite face. In the apo state, H11 is very close to the ligand binding pocket (LBP), and its hydrophobic residues are proposed to stabilize the very hydrophobic LBP. Upon ligand-binding, H11 is expelled from its position proximal to the LBP and H12 undergoes a dramatic conformational change, folding over the LBP, stabilizing the ligand, and creating a surface for cofactor recruitment that comprises the AF-2.

Comparing crystal data from the apo Retinoid X Receptor-α (RXR-α) with the holo (ligand-bound) Retinoic Acid Receptor-γ (RAR-γ), a similar overall structure between apo and holo forms is observed, with the major difference being a reorganization of the H12 creating a novel surface for coactivator recruitment, and a reorganization of helices 1 and 3 resulting in a more compact holo form, suggesting that the receptor is stabilized by ligand binding [21-23]. It has been demonstrated that antagonist binding results in impaired conformational shift of H12; this is cited as the major structural difference correlated with transcriptional repression [24].

More evidence corroborating the importance of ligand-induced H12 conformational change comes from mutational analysis of the ERα, which demonstrated that defined regions of the C-terminus of the LBD are required for ligand-dependent transcriptional activation [25]. Moreover, crystal data has confirmed that the conformational shift of the amphipathic H12 associated with ligand binding creates a surface that is optimal for recruitment of interacting proteins that contain conserved leucine-rich motifs with the consensus sequence LxxLL (L: leucine; x: any amino acid) [26]. The coactivator LxxLL motif forms an α-helix and binds to a hydrophobic groove in the LBD AF-2 surface formed by H3, loop L3-L4 and H4. Most NR coregulators that interact ligand-dependently with the AF-2 possess the LxxLL motif, including the p160 family, and CBP/p300 [27].
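As an illustration of the consensus described above, the following minimal sketch scans a one-letter amino acid sequence for LxxLL motifs. The function name and the example peptide are hypothetical, and the sketch matches sequence only; it says nothing about whether a given match actually forms an α-helix or binds the AF-2 groove.

```python
import re

# Consensus from the text: L-x-x-L-L, where x is any amino acid.
# The pattern is wrapped in a lookahead so overlapping matches are reported.
LXXLL = re.compile(r"(?=(L[A-Z]{2}LL))")


def find_lxxll(protein: str) -> list[tuple[int, str]]:
    """Return (1-based start position, matched motif) for every LxxLL match."""
    return [(m.start() + 1, m.group(1)) for m in LXXLL.finditer(protein.upper())]
```

Called on the made-up peptide `"MKHKILHRLLQEGS"`, the function reports a single motif, `LHRLL`, starting at residue 6.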

1.2.3 Nuclear Receptor Coregulators

Full transactivational potential of NR is achieved only with the combined action of coregulatory proteins. In fact, transcriptional activity of NRs is critically dependent on the chromatin environment and recruitment of RNA polymerase II to target promoter regions. At present, there are at least 285 reported NR coregulatory proteins [28]. The 2 major sites of coregulator recruitment, AF-1 and AF-2, have drastically different modes of action: AF-1-interacting coregulators have almost no conserved domains and do not bind to a defined surface on the TAD, while the AF-2 coregulators have a highly conserved motif, the LxxLL, that fits nicely into the well-defined hydrophobic AF-2 groove.

Coactivators are commonly recruited to the promoter in preformed complexes, but it is important to note that controlling the rate and identity of proteins in the assembly of various promoter-specific complexes is one way that NR can affect transcriptional activity and integrate signals from various signal transduction cascades. An important role of coactivators is to stabilize the interaction between NR and the basal transcription machinery [29]. In addition, coactivator complexes commonly contain enzymes with histone acetyltransferase (HAT) activity. Activity of these enzymes results in ATP-dependent covalent modification of histone proteins, leading to chromatin remodeling that results in higher transcriptional potential [30]. Examples of well known NR coactivators with HAT activity include SRC1, CBP/p300, and PCAF.

Corepressors of NR function also exist. There are two very well defined NR corepressors, the Nuclear Receptor Corepressor (NCoR) and the Silencing Mediator of Retinoic Acid and Thyroid Hormone Receptors (SMRT). These corepressors function to decrease the transcriptional activity of NR in the absence of ligand or in the presence of antagonists. Several mechanisms for repression of basal transcription of NR by corepressors have been discovered. Passive repression occurs when corepressive proteins block promoter binding, or prevent dimerization or binding of coactivators [31]. In contrast, active silencing occurs when the corepressive protein blocks transcriptional initiation, either directly, or by recruitment of factors that facilitate blockage [32]. Not surprisingly, the NR corepressors can also function by active recruitment of histone deacetylases (HDAC), which function in opposition to HATs, catalyzing the production of inactive chromatin structure incompatible with high rates of transcriptional activation [33].

In conclusion, the sheer number and diversity of NR coregulatory proteins adds an almost unfathomable level of complexity to NR signaling, allowing for exquisite control of NR-regulated cellular processes and the ability to integrate signals from diverse signal transduction cascades.

1.3 Molecular Biology of the Androgen Receptor

1.3.1 Role of Androgens

Androgens are important steroid hormones essential for the development of the

male phenotype. Androgens have well-defined roles during development of male sexual

differentiation including development and maintenance of external genetalia and

secondary sex characteristics, and initiation and maintenance of spermatogenesis. In

addition, androgens have confirmed roles in the growth of body hair and male pattern baldness [34]. There are two naturally occurring androgens synthesized by humans, testosterone (T) and 5 α-dihydrotestosterone (DHT). T is synthesized from cholesterol in the Lydig cells of the testes. A lipid soluble small molecule, T passively diffuses across the membranes of target cells where it either binds directly to the AR, or becomes a substrate for the 5 α-reductase enzyme (EC 1.3.99.5) that catalyzes conversion to DHT, a more avid AR-binding molecule with higher AR activation potential than T. Although

the two androgens have specific roles in male sexual development, there is only one AR protein, suggesting that subtle differences in T-AR and DHT-AR complexes result in distinct modes of activation of androgen-regulated genes.

1.3.2 Androgen Receptor Genetic Features

The AR gene is located on the X chromosome at Xq11-12. It spans 90 kb of DNA, and is transcribed to form a 10.6 kb mRNA containing eight exons that code for a protein of approximately 919 amino acids [35, 36]. The genetic organization of the AR reflects the conserved modular domain structure of the NR superfamily: exon 1 encodes the entire TAD (amino acids 1-532), exons 2 and 3 code for the DBD, and exons 4 through 8 code for the LBD (Figure 1.2). The organization of the AR exon structure is conserved throughout mammalian evolution from rodents to humans. Furthermore, the conservation of genetic locus of the AR across many species of animals suggests a significant developmental association with syntenic genes [37].

The AR is widely expressed in almost all human tissues, with the notable exception of the spleen; however, the expression levels and timing are largely cell-type specific. The 5’ promoter region of the AR lacks the traditional TATA or CAAT transcription initiation sites; however, it does have a GC-rich region required for recruitment of the Specificity Protein 1 (SP1) general transcription factor [38]. There are at least two confirmed promoter binding sites that have variable timing, activity and cell-type specificity [39]. Androgens and other steroid hormones contribute to the transcriptional control of AR expression. The rat AR promoter contains several

[Figure 1.2 schematic: X chromosome; AR gene with exons 1-8; AR protein showing the (Gln)14-35, (Pro)8 and (Gly)16 repeats, the DBD and the LBD, with residue positions 1, 538, 625 and 919 marked.]

Figure 1.2: Genomic and Protein Domain Structure of the Androgen Receptor. The AR gene is located on the long arm of the X chromosome and spans approximately 90 kb. The AR is organized into eight exons, which are spliced to form the three functional domains of the AR protein as demonstrated. The polymorphic glutamine tract is located in the transactivation domain (TAD), with a normal range of between 14 and 35 glutamine repeats. Similarly, the repeat sequences CCG (encoding proline) and GGC (encoding glycine) are displayed.

palindromic hormone response elements trophic for the AR, GR and PR [40] that function mainly in a negative feedback loop to downregulate AR signalling.

As the AR is located on the X chromosome, males express only one allele and are therefore sensitive to mutations without the balance of a codominant allele. In addition, the AR gene has been demonstrated to be dispensable for viability. It is for these reasons that the AR has an extremely large record of spontaneous mutations [41] that have been used to probe the structure-function relationship of the AR. An important observation resulting from this phenomenon is that mutations in the AR engender a continuum of functional consequences, clinically manifested in the phenotypic variability of the

Androgen Insensitivity Syndrome (AIS), which is subclassified according to clinical severity into at least three categories: Mild Androgen Insensitivity, Partial Androgen

Insensitivity and Complete Androgen Insensitivity. The severity of the AIS has been correlated with the degree of impairment of several molecular mechanisms of AR function including binding affinity, off-rates of ligand-binding, AR N/C terminal interactions and coactivator recruitment.

An interesting feature of the AR gene is the presence of several regions of repetitive DNA sequences. Exon one contains three such regions: a noninterrupted CAG repeat coding for poly-glutamine beginning at codon 58 and polymorphic in length, a

CCT/CCG repeat coding for poly-proline from codons 371-379, and a GGT/GGC repeat coding for poly-glycine from codons 449-472. The functional consequences of the poly-proline and poly-glycine tracts are not well understood. In contrast, the poly-glutamine

(polyQ) tract has proven functional consequences both in ligand-dependent AR

transcriptional activation [42], and as a toxic element causing an inherited disease called

Spinal and Bulbar Muscular Atrophy (SBMA) [43].

An important feature of the CAG tract is its polymorphic length, likely

attributable to slippage of the DNA polymerase during replication. In the human

population, the average length of the CAG repeat is 21 ± 2; however, the normal range is 14-35 and varies with ethnicity and race [44]. The polyQ tract length shows an inverse correlation with the transcriptional activation potential of the AR. Men with exceptionally short tracts have a high risk for prostate cancer, indicative of a hypersensitive AR [45], while men with much longer tracts develop mild symptoms of

AIS caused by marginally reduced AR activity [43]. In addition to the functional consequences of CAG tract length on transcriptional activation, elongation of the repeat past a threshold of 37 repeats results in a late-onset, neurological disorder called SBMA.

The disease is one of a class of disorders caused by triplet repeat expansions in coding

regions called PolyQ Expansion Disorders (see section 1.5 “Polyglutamine Expansion

Disease”). SBMA is characterized by progressive muscle wasting in the facial and limb

muscles as a result of degeneration of motor neurons in the anterior horn of the spinal

cord as well as the brainstem [46].

1.3.3 Androgen Receptor Mode of Action

The AR functions via a tightly-regulated sequence of events leading to

transcriptional activation of target genes (Figure 1.3). Following translation, the apo-AR is mainly located in the cytoplasm, constitutively bound to molecular chaperones in a manner that blocks recruitment of coactivators. Either T or DHT binds the AR at a single

[Figure 1.3 schematic: T and DHT; conversion step (a); chaperones; AR; coregulators; ARE; cytoplasm and nucleus compartments.]

Figure 1.3: Androgen Receptor mode of action. Testosterone (T), the primary circulating androgen, crosses the cell membrane by passive diffusion. In the cell, T can bind directly to the AR, or is metabolized by (a) steroid 5-α-reductase (EC 1.3.99.5) to dihydrotestosterone (DHT). Androgen-activation of the AR abrogates its association with molecular chaperones (HSPA1A, HSP90AA1) and causes a conformational change promoting nuclear localization. In the nucleus, the AR recruits coregulators, and binds directly to DNA sequences called Androgen Response Elements (ARE), activating or repressing transcription from these promoter elements.

site with high affinity (T-AR Kd: 10 nM, DHT-AR Kd: 5 nM). As a consequence of

hormone-binding, a conformational change in the receptor occurs, resulting in the

shedding of molecular chaperones, and the revelation of an NLS sequence in the hinge

region that promotes AR nuclear translocation [47]. Although several crystal structures of the ligand-bound AR LBD have revealed this domain to be monomeric when ligand-bound, the binding of full-length AR to DNA occurs as a consequence of tight AR homodimerization.
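The dissociation constants quoted above can be turned into a rough estimate of receptor occupancy. The following is a minimal sketch, assuming simple one-site equilibrium binding in which free androgen is not depleted by the receptor; the function name and the test concentrations are hypothetical and chosen only for illustration, and only the two Kd values come from the text.

```python
def fraction_bound(ligand_nM: float, kd_nM: float) -> float:
    """Equilibrium fractional occupancy for one ligand binding one site.

    Assumes free ligand is approximately equal to total ligand
    (receptor concentration well below Kd); this is an illustrative
    simplification, not a model taken from the thesis.
    """
    if ligand_nM < 0 or kd_nM <= 0:
        raise ValueError("ligand must be >= 0 and Kd must be > 0")
    return ligand_nM / (ligand_nM + kd_nM)


# Dissociation constants quoted in the text (nM).
KD_NM = {"T": 10.0, "DHT": 5.0}

if __name__ == "__main__":
    # Hypothetical free-androgen concentrations, for illustration only.
    for conc in (1.0, 5.0, 10.0, 50.0):
        row = ", ".join(
            f"{name}: {fraction_bound(conc, kd):.2f}" for name, kd in KD_NM.items()
        )
        print(f"{conc:5.1f} nM -> {row}")
```

Under these assumptions, at 5 nM of either hormone the AR would be half occupied by DHT but only about one third occupied by T, which is consistent with DHT being described as the more avid ligand.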

Several post-translational modifications of the AR have been identified with variable effects on ligand-dependent or ligand-independent transcriptional activation.

Regulation of AR activity by phosphorylation has been proposed by several groups.

Indeed, the TAD alone has at least eight serines reported to be modified by phosphorylation [48]. However, conflicting data concerning the identity of AR kinases and cell-type specificity confuse the significance of this aspect of AR signaling. It is likely that phosphorylation of the AR has nuanced functional significance, as S94A,

S515A and S650A mutations demonstrated no reduction in AR signaling [49], and there has yet to be a reported case of AIS resulting from a mutation at any of these sites.

The AR specifically recruits several enzymes with HAT activity; it is therefore not surprising that acetylation of the AR has been proposed as a mechanism of AR regulation. The AR hinge region harbors several confirmed acetylation sites and contains the KxKK motif [50]. The functional consequences of AR acetylation have been established, as K630A and K633A mutations displayed markedly reduced transactivation from androgen response elements and 10-fold increased binding of NCoR corepressor,

while maintaining wild-type ligand-binding activity [51], suggesting acetylation is a key component in the recruitment of coregulatory proteins.

Once the AR homodimer is bound to DNA at specific sequences termed androgen response elements (ARE), a complex of coactivators and basal transcription machinery is recruited for the initiation of target gene transcription. The conformational change associated with androgen-binding results in the formation of a hydrophobic groove composed of residues from H3, H4, H5 and H12 that constitutes the AF-2. As a rule, NR coactivators bind to this groove via their LxxLL motif. However, the AR breaks the NR mold as a consequence of sequences in its own TAD that outcompete coactivators for the

AF-2 binding site. Two AR TAD sequences 23FQNLF27 and 433WHTLF437 bind specifically, and with high affinity to the C-terminus, forming an intramolecular interaction (commonly referred to as the “N/C terminal interaction”) required for full transcriptional activity [52]. Interestingly, classical coactivators of the NR superfamily

(p160, CBP/p300) also are coactivators of the AR, but interact independently of their

LxxLL motifs, binding to the AR TAD with conserved glutamine-rich regions [53]. The importance of N/C terminal interactions and coactivator recruitment for proper AR transcriptional regulation is highlighted by the identification of mutations at these loci that result in AIS [41].

1.4 The Ubiquitin-Proteasome System for Protein Degradation

1.4.1 General Introduction

The Ubiquitin Proteasome System (UPS) for protein degradation involves two

discrete processes. First, covalent attachment of multiple ubiquitin moieties to the

protein substrate provides the “death signal”. Appropriately targeted protein substrates

subsequently undergo proteolysis by the 26S proteasome (Figure 1.4). Ubiquitin-mediated proteasomal targeting of protein substrates occurs via a tightly-regulated, ATP-dependent enzyme cascade. Ubiquitin activating enzyme (E1) forms a thioester bond with the main-chain carboxyl group of G76 of ubiquitin. The E1 then transfers the ubiquitin moiety to one of a number of ubiquitin conjugating enzymes (E2). Each E2-ubiquitin is able to contact multiple ubiquitin ligases (E3) that catalyze the covalent linkage of ubiquitin chains to specific protein substrates via isopeptide bond formation between the ε-amine of substrate lysines and G76 of ubiquitin. Substrate-attached ubiquitin can also be modified by ubiquitination on several internal lysine residues (K6,

K11, K29, K33, K48, K63), resulting in the formation of polyubiquitin structures. A tetra-ubiquitin chain covalently attached to a single substrate lysine residue (termed

“polyubiquitination” to differentiate it from “multiubiquitination”, which might occur as single ubiquitin moieties attached at several substrate lysines) is the minimal signal required for proteolytic processing by the proteasome [54]. While the classical cellular role of ubiquitination is the regulated degradation of protein substrates and the formation of antigenic peptides, the importance of ubiquitination in a number of cellular processes including DNA repair, endocytosis and signal transduction has recently been established (reviewed in [55]). It seems that the nature and number of ubiquitin chain linkages determine the role of ubiquitination in a given protein-specific context.
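The distinction drawn above between polyubiquitination and multiubiquitination can be made concrete with a small classifier. This is a minimal sketch under stated assumptions: the function name, the category labels and the dictionary representation (substrate lysine label mapped to the number of ubiquitin moieties in the chain attached there) are hypothetical, and linkage type (for example K48 versus K63) is deliberately ignored. Only the tetra-ubiquitin minimum is taken from the text.

```python
from typing import Dict

# Minimal chain length for proteasomal processing, as stated in the text [54].
MIN_CHAIN_LENGTH = 4


def classify_ubiquitination(chain_lengths: Dict[str, int]) -> str:
    """Classify a substrate by the ubiquitin chains on its lysines.

    chain_lengths maps a substrate lysine label (hypothetical, e.g. "K1")
    to the number of ubiquitin moieties in the chain attached at that lysine.
    Linkage type is not modelled.
    """
    if any(n < 0 for n in chain_lengths.values()):
        raise ValueError("chain lengths must be non-negative")
    modified = [n for n in chain_lengths.values() if n > 0]
    if not modified:
        return "unmodified"
    if max(modified) >= MIN_CHAIN_LENGTH:
        # At least one lysine carries a tetra-ubiquitin (or longer) chain.
        return "polyubiquitinated"
    if all(n == 1 for n in modified):
        # Single moieties only: one site is mono-, several sites are multi-.
        return "monoubiquitinated" if len(modified) == 1 else "multiubiquitinated"
    return "subthreshold chain"
```

For example, `classify_ubiquitination({"K1": 4})` falls in the proteasome-targeting category, whereas three lysines each carrying a single ubiquitin are classified as multiubiquitinated.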

[Figure 1.4 schematic: ubiquitin (Ub) activation by E1 (ATP to AMP + PPi); transfer to E2 and E3; substrate targeting; protein degradation by the 26S proteasome; peptide cleavage products; deubiquitination and Ub recycling.]

Figure 1.4: The Ubiquitin Proteasome System for Protein Degradation. Ubiquitin monomers are successively added to protein substrates by the E1, E2, and E3 ubiquitin conjugation cascade. Attachment of at least 4 ubiquitin moieties is the minimal signal targeting protein substrates for degradation by the 26S proteasome. Proteolysis in the central core of the proteasome releases small peptides that exit the proteasome by passive diffusion. Ubiquitin chains are recycled by the action of deubiquitinating enzymes.

The 26S proteasome harbours the proteolytic enzyme activity of the UPS, and as such is the primary non-lysosomal process for cellular protein degradation. It is composed of two multisubunit complexes with distinct activities required for coordinated proteolysis of appropriately targeted protein substrates. The 19S regulatory particle drives ATP-dependent substrate recognition, processing and deubiquitination, and 20S core binding, while the proteolytic activity of the proteasome is partitioned from the cytoplasmic environment, as it is sequestered within the central chamber of the cylindrical 20S catalytic core.

The proteasome is a dynamic complex involved in numerous transient interactions with proteins that modulate its activity. The 19S regulatory particle forms a differentially regulated structure with varying subunit composition depending on organism and tissue type [56, 57]. Upon exposure to high salt conditions the 19S can be separated into 2 subcomplexes, the lid and the base. The base is composed of at least 10 subunits, 6 of which contain ATPase activity and form a hexameric ring that makes direct contact with the 20S catalytic core in an ATP-dependent interaction. The ATPase subunits have confirmed functions in polyubiquitin chain binding, 20S “gate” activation, and anti-chaperone activities. The 19S lid subcomplex is composed of 8 nonATPase subunits; little has been confirmed regarding its function. Remarkable homology between the 19S lid and the COP9 signalosome has been observed. In addition, two other regulatory particles, PA28 (aka 11S proteasome regulator) [58] and PA200 [59], can activate proteolytic activity of the 20S for ubiquitin- and ATP-independent proteolysis. Recent reports have even identified “hybrid” proteasomes composed of a single 20S core capped by both a 19S and a PA28 or PA200 [60-62].

The proteasome 20S catalytic core has remarkable conservation of subunit structure and overall complex architecture throughout all domains of life. Indeed, the crystal structure of the 20S catalytic core of the archaeon Thermoplasma acidophilum was solved in 1995 and has provided the basis for all structural studies in higher organisms [63]. The archaeal 20S core has 2 types of subunits, α and β, each of which is arranged in 2 homoheptameric rings, stacked with a stoichiometry of 7α7β7β7α.

In eukaryotes 14 unique proteins exist, each assigned α or β on the basis of sequence and structural homology to the archaeal subunits. In mammals, there exists even greater subunit complexity, as interferon γ is known to induce the transcription of 3 additional β subunit genes that replace their constitutive subunits when expressed [64]. The proteolytic activity of the eukaryotic 20S core occurs in the central of 3 chambers in the hollow cylinder formed by the junction of the β subunit rings. The opening into the cylinder is gated in eukaryotes by N-terminal extensions of the α subunits, and can be accessed by substrates only after ATP-dependent allosteric activation by the ATPases of the 19S base. Following activation, access to the 20S catalytic core is still severely restricted, as substrates must pass through a 1.3 nm diameter pore. This implies all protein substrates must be devoid of tertiary structure prior to proteolytic processing in the central chamber of the 20S. Proteolytic activity is confined to 3 β subunits on active sites exposed to the inner face of the 20S cylinder, where unfolded protein substrates are processively cleaved at 3-20 residue increments [65]; newly cleaved peptide fragments exit the chamber by passive diffusion.
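The subunit arithmetic of the 20S core can be checked with a short sketch. The ring order and the seven subunits per ring come from the text; the names used here are hypothetical, and the active-site count rests on the assumption that each β ring carries one copy of each of the three catalytic β subunits.

```python
from collections import Counter
from typing import Dict, Sequence

SUBUNITS_PER_RING = 7                            # heptameric rings, per the text
RING_STACK = ("alpha", "beta", "beta", "alpha")  # stacking order 7α7β7β7α


def core_composition(
    stack: Sequence[str] = RING_STACK, per_ring: int = SUBUNITS_PER_RING
) -> Dict[str, int]:
    """Count subunits of each type in a stack of rings."""
    counts: Counter = Counter()
    for ring in stack:
        counts[ring] += per_ring
    return dict(counts)


def catalytic_sites(
    stack: Sequence[str] = RING_STACK, active_per_beta_ring: int = 3
) -> int:
    """Catalytic sites in the core.

    Assumption (labelled, not from the thesis): each beta ring holds one
    copy of each of the three catalytic beta subunits.
    """
    return sum(active_per_beta_ring for ring in stack if ring == "beta")


if __name__ == "__main__":
    composition = core_composition()
    print(composition, "total:", sum(composition.values()))
    print("catalytic sites:", catalytic_sites())
```

On this count the core contains 14 α and 14 β subunits, 28 in total, which matches the 14 unique eukaryotic proteins each being present twice.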

1.5 Polyglutamine Expansion Disease

1.5.1 General introduction to Polyglutamine Expansion Disease

Since the beginning of the molecular biology revolution, a common finding in genetics is the instability of triplet repeat sequences of DNA. In 1991, a link was established between a CAG triplet repeat expansion and a human disease, as it was

reported that the expansion mutation of the polymorphic CAG tract in the coding

sequence of the AR was responsible for SBMA [43]. Since then, 8 more genes of

seemingly unrelated native function have been reported to cause late-onset neurological disorders as a result of expansion of CAG triplet repeats in coding regions. The CAG triplet codes for glutamine, and thus these disorders have been classified as PolyQ

Expansion Diseases, in an attempt to distinguish them from an unrelated class of disorders that result from triplet repeat expansions in noncoding regions of DNA.

The class of PolyQ Expansion Diseases includes SBMA, Huntington’s Disease

(HD), Dentatorubral-pallidoluysian atrophy (DRPLA), and Spinocerebellar Ataxias (SCA)

Types 1, 2, 3, 6, 7, and 17 (Table 1.1); while these genes share no homology outside of the CAG tract and do not seem to be related with respect to biochemical function, the disorders do share several clinical, pathological, and molecular features. PolyQ

Expansion Diseases are all late-onset neurological disorders, and share a dominant mode of inheritance, with the exception of an X-linked recessive mode implicated for SBMA.

Genetically dominant inheritance suggests that a toxic “gain-of-function” is bestowed upon the disease protein by the polyQ expansion itself, and might explain why proteins with unrelated biochemical or metabolic function cause a common pathology [66]. Another conserved feature, somatic and germ-line instability of triplet repeats, has been

Disease                             Protein               Normal Q length  Disease Q length  PolyQ Protein Subcellular Localization
SBMA                                Androgen Receptor     6-36             38-62             Cytoplasmic, Nuclear (+ Androgen)
Huntington's Disease                Huntingtin            6-34             36-121            Cytoplasmic (Nuclear cleavage products)
Dentatorubropallidoluysian atrophy  Atrophin-1            6-35             49-88             Cytoplasmic
Spinocerebellar Ataxia 1            Ataxin-1              8-35             39-83             Nuclear
Spinocerebellar Ataxia 2            Ataxin-2              14-32            33-77             Cytoplasmic
Spinocerebellar Ataxia 3 - MJD      Ataxin-3              12-40            54-89             Cytoplasmic (Nuclear Inclusions Observed)
Spinocerebellar Ataxia 6            CACNA1A               4-18             19-33             Cell Membrane
Spinocerebellar Ataxia 7            Ataxin-7              4-35             37-306            Nuclear
Spinocerebellar Ataxia 17           TATA-Binding Protein  25-42            47-63             Nuclear

Table 1.1: Characteristics of PolyQ Expansion Diseases
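The tract-length ranges in Table 1.1 lend themselves to a simple lookup. The sketch below is illustrative only: the dictionary keys, function name and return labels are hypothetical, the numbers are copied from the table, and lengths that fall between or outside the tabulated ranges are reported as such rather than interpreted. It is not intended as a diagnostic tool.

```python
from typing import Dict, Tuple

Range = Tuple[int, int]

# (normal range, disease range) in glutamine repeats, copied from Table 1.1.
POLYQ_RANGES: Dict[str, Tuple[Range, Range]] = {
    "SBMA": ((6, 36), (38, 62)),
    "HD": ((6, 34), (36, 121)),
    "DRPLA": ((6, 35), (49, 88)),
    "SCA1": ((8, 35), (39, 83)),
    "SCA2": ((14, 32), (33, 77)),
    "SCA3": ((12, 40), (54, 89)),
    "SCA6": ((4, 18), (19, 33)),
    "SCA7": ((4, 35), (37, 306)),
    "SCA17": ((25, 42), (47, 63)),
}


def classify_tract(disease: str, n_gln: int) -> str:
    """Place a polyQ tract length in the tabulated normal or disease range."""
    normal, expanded = POLYQ_RANGES[disease]
    if normal[0] <= n_gln <= normal[1]:
        return "normal"
    if expanded[0] <= n_gln <= expanded[1]:
        return "disease"
    # The table leaves gaps between ranges (e.g. 37 for SBMA).
    return "outside tabulated ranges"
```

For instance, `classify_tract("SBMA", 21)` returns "normal" and `classify_tract("SBMA", 45)` returns "disease", while a length of 37 falls in the gap the table leaves between the two ranges.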

demonstrated in these patients [67], resulting in expansion of repeat number in subsequent generations. This clinical attribute, called genetic anticipation, has been

correlated with age of onset and disease severity [68]. In addition, all reported PolyQ

Expansion Diseases have an inverse correlation between polyQ tract length and age of

onset but a direct correlation between tract length and disease severity. Another

similarity in these diseases is the “pathological threshold” of glutamine tract length

required for disease initiation. Even though normal alleles of these disease-causing genes

have widely variable average polyQ tract lengths, the number of glutamines required for

acquiring disease is a relatively constant 36-39Q across almost all PolyQ Expansion

Diseases. The reason behind the pathological threshold for toxicity remains elusive.

Finally, the most intriguing common feature among PolyQ Expansion Diseases is the

histopathological observation of insoluble protein deposits termed “protein aggregates”

or “inclusion bodies” that contain the polyQ-expanded protein. PolyQ expansion-mediated protein aggregation is examined in detail in the next section.

An unexplained phenomenon in PolyQ Expansion Disease is late-onset, neuronal-

specific toxicity in non-overlapping brain regions observed in all these disorders despite

confirmed expression of the polyQ expansion in just about every tissue type. This

suggests a highly nuanced mechanism of toxicity that so far has eluded scientific

discovery.

1.5.2 Molecular Pathology of PolyQ Expansion Disease

Observation of a dominant, gain-of-function mechanism for polyQ expansion-mediated pathology would suggest a single, conserved molecular etiology of toxicity in

every PolyQ Expansion Disease. This is likely an oversimplification of the actual scenario, but conserved mechanistic features of toxicity surely exist. Several hypotheses for polyQ expansion-mediated toxicity are well-developed, and include toxicity of protein aggregates, aberrant transcriptional properties of polyQ-expanded proteins including misregulation of histone acetyltransferases, and inhibition of the UPS. Each of these topics has been extensively investigated in cell culture experiments, animal models and in human tissue isolated from patients, but has yielded confusing, often conflicting results for each hypothesis. Although great strides have been made in understanding the polyQ expansion toxicity, the pathogenic mechanism of the polyQ expansion is still not well understood.

Role of Aggregation of the PolyQ Expansion

A unifying pathological phenomenon in PolyQ Expansion Disease is the presence of discrete electrodense deposits of cellular material detectable by microscopy of patient tissue, cell culture experiments and animal models. On immunohistochemical analysis, these deposits universally stain positive for the polyQ-expanded protein. Two terms are commonly used interchangeably to describe these deposits: “aggregates” and “inclusion bodies”. For the purposes of this review, I will attempt to distinguish between the two terms. Protein “aggregation” will be defined as a biochemical phenomenon that occurs when two or more polyQ protein molecules self-associate via a novel structure (likely a

β-sheet) distinct from the native structure of the polyQ tract (likely a random coil) [69].

The term “inclusion” is historically used to describe abnormal intracellular structures observed histologically [70], and for the purposes of this review will be referred to as a

“cellular response” resulting in relocalization of polyQ proteins to a small area, and may or

may not include self-associating polyQ tracts. The activity of heat shock proteins and the

aggresome machinery are likely involved in the process of inclusion body formation, and

may or may not be dependent on prior polyQ tract self-association. Therefore, inclusion body formation is considered a process distinct from aggregate formation. Note there is no

requirement for polyQ expansion in either definition, as normal length polyQ tracts have been demonstrated to aggregate in vitro, and are often sequestered into inclusion bodies.

When referring to a specific scientific paper, the terminology used in the original publication will be maintained.

Aggregate Formation

PolyQ-mediated aggregate assembly has been studied in vitro by several groups.

The first attempts to describe the propensity of polyQ expansion to self-associate used X-ray diffraction analysis of synthetic peptides consisting of short glutamine tracts, and concluded that consecutive glutamines favor a β-pleated sheet conformation that would explain self-association of polyQ proteins as the formation of “polar zippers” [71]. Short glutamine tracts in the context of a native protein were nonetheless demonstrated to be in a random coil conformation by NMR, but on the basis of entropic factors it was hypothesized that elongated polyQ tracts would form a stable intramolecular structure, likely a β-hairpin. This inherently different structure of the polyQ-expanded protein

would facilitate the formation of polar zippers resulting in formation of a nucleus of

aggregated protein that might rapidly recruit other polyQ protein monomers [72]. This

intriguing hypothesis nicely explained the pathogenic threshold observed across PolyQ

Expansion Disorders, but has been refuted by more recent studies. Current models of soluble polyQ proteins demonstrate a random coil structure of both normal and expanded polyQ tracts as assessed by CD spectra and NMR; formation of a β-sheet conformation occurs concomitantly with aggregate formation in polyQ tracts of any length [73, 74].

This invalidates the thought that expanded polyQ tracts have an implicitly novel structure different from the normal length polyQ tract, and rather suggests that a rare folding event, a transition from the native random coil to a β-sheet, occurs with greater frequency in polyQ-expanded proteins, and is responsible for increased propensity for self-association of polyQ-expanded proteins.

In order to reconcile the structural equivalence of native forms of normal and expanded polyQ tract length with the observation that aggregate formation is only observed in patients expressing polyQ tracts longer than the pathological threshold, it has been hypothesized that the timing of “nucleation” of aggregate formation is a critical parameter that depends on tract length. Indeed, short polyQ tracts have been demonstrated to aggregate in vitro, but with delayed kinetics compared to proteins with longer tracts [73, 75]. Therefore, subpathological length polyQ tracts might be expected to form aggregates in the cell, but at such slow rates as to be inconsequential in relation to the normal half-life of the soluble polyQ protein. Taken together, these findings suggest aggregate formation in cells is dependent on integration of several factors, including but not limited to tract length, timing and level of expression, and ability to sequester or degrade soluble polyQ protein. Indeed, these are recurring themes in all proposed mechanisms of polyQ-mediated toxicity.

Toxicity of Aggregates / Inclusion Bodies

The incidence of inclusion bodies in cells expressing polyQ-expanded protein is such a universal feature that most hypotheses of mechanistic toxicity are explained in relation to inclusion body or aggregate formation. However, there is still considerable debate regarding whether or not aggregates and inclusion bodies are toxic per se. An alternative hypothesis is that inclusion body formation is a protective cellular response to toxic soluble polyQ-expanded protein [76]. Traditionally, aggregate size and presence have been correlated to affected/unaffected regions of postmortem tissue, and to disease severity to prove/disprove the toxic aggregate hypothesis. Unfortunately, due to largely conflicting evidence, conclusions are difficult to reach. For example, aggregate presence and density have been positively correlated with disease severity in several PolyQ

Expansion Diseases including SBMA [77], SCA1 [78], and SCA3 [79]. In contrast, aggregates form in both affected and unaffected neuronal regions of the SCA7 brain [80].

In HD, polyQ expansion-containing aggregates occur in less than 1% of striatal neurons, the most affected neuronal region, but are quite common in the cortex, where aggregate size demonstrates a positive correlation with advanced grade tissue [81]. Difficulties in interpretation of these experiments confuse the situation even more. Is diffuse staining of polyQ-expanded proteins indicative of soluble protein or microaggregates? Is it possible that neurons with the highest load of aggregates are eliminated prior to postmortem analysis? Is it fair to compare analyses performed with a benchtop fluorescent microscope to those done with electron microscopy? Does signaling from “healthy” neurons with aggregates impact adjacent affected neurons that do not have aggregates?

To address these questions, and to more strictly define how aggregates might cause

toxicity, a large number of cell culture experiments have been performed, again contributing ambivalent data to the “toxic aggregate” hypothesis.

Evidence against the “toxic aggregate” hypothesis has been demonstrated in several cell culture models. Treatment of an immortalized striatal cell line expressing polyQ-expanded Huntingtin (Htt) with various caspase inhibitors significantly prevented nuclear and cytoplasmic inclusion formation, but did not alter the polyQ-mediated toxicity [82]. Moreover, a conditional-expression system for expression of polyQ-expanded protein demonstrated that aggregate formation is a dynamically-reversible process that was not linked to neuronal death [83]. Furthermore, several studies suggest that sequestration of aggregates into inclusion bodies is a protective mechanism for affected cells [70, 84]. Another impressive study using automated live-cell imaging of striatal neurons expressing pathogenic polyQ tracts demonstrated that inclusion body formation reduced the risk of neuronal death [85]. When pathogenic length polyQ tracts of N-terminal Htt fragments were imaged over time, a positive correlation between diffuse Htt and cell death was established, and inclusion body formation was shown to reduce diffuse

Htt levels and cell death [85]. This study strongly supports the idea that inclusion bodies form as a protective cellular response to a toxic protein. Taken together, these studies seem to dissociate aggregation from cellular toxicity of polyQ-expanded proteins.

In contrast, several cell culture studies suggested a role for polyQ aggregates as toxic particles. As an example, transfection studies with mutant Htt demonstrated rapid inclusion body formation resulting in nuclear envelope disruption and cell death [86].

However, many early studies attempting to link the formation of aggregates to toxicity were inherently flawed, as the expression of soluble monomeric polyQ-expanded protein

was required for aggregate formation; as a result, the presence of soluble polyQ-expanded protein might have also contributed to observed pathology. One solution to

this problem was achieved by Yang et al. [87]. In this study, synthetic polyQ peptides of

normal and expanded forms, with and without NLS sequences were allowed to preform

into aggregates and then introduced into cells by liposomes. Preformed aggregates

targeted to the nucleus, including peptides with subpathogenic polyQ tract lengths,

reduced cell viability compared to soluble monomer polyQ peptides of matched length.

These studies provide a good link between aggregation and toxicity; a detailed discussion

of how aggregates might mediate toxicity follows.

Possible Mechanisms of Aggregate Toxicity

Correlation of protein aggregation/inclusion bodies and cellular toxicity is not

sufficient to explain the mechanism of polyQ-mediated neurodegeneration. There are

several proposed mechanisms by which aggregate formation might induce the observed

neurotoxicity of PolyQ Expansion Disease. Depletion of polyQ protein as a consequence

of aggregation may result in haploinsufficiency, but this is unlikely to be a primary

mechanism for toxicity considering the case of SBMA: males with a complete loss of

function of the AR develop AIS, but do not show any discernable neurological

phenotype. The same is true for individuals hemizygous for SCA1 or HD genes.

Sequestration of proteins essential for viability is another proposed mechanism for aggregate-mediated toxicity. Indeed, immunohistochemical analysis from patients [88] and mass spectrometry analysis of aggregates purified from cell culture [89] have revealed that a large number of cellular proteins are recruited to protein aggregates formed in

PolyQ Expansion Disease. It is possible that mislocalization into protein aggregates prevents normal function of these proteins, leading to toxicity. Transcription factors are a likely candidate for aberrant sequestration into protein aggregates resulting in toxicity.

A universal theme in PolyQ Expansion Disease is the correlation of nuclear localization of polyQ-expansions and toxicity. A proposed mechanism to explain the role of nuclear dependence is that polyQ-expanded proteins might disrupt transcriptional events. Global regulation of transcription is a delicate synchronization of many separate events including regulation of a complex network of protein-protein interactions and protein trafficking. An important feature of many transcription factors is the prevalence of glutamine-rich sequences that have confirmed roles in promoting protein-protein interactions [90]. Examples of transcription factors with glutamine-rich sequences that have been confirmed to localize to intracellular polyQ aggregates include: the cAMP response element-binding protein (CREB)-binding protein (CBP) [91], TATA-binding protein

(TBP, i.e. SCA17) [92], TBP-associated factor (TAF4) [93] and SP1 [93].

There are at least two reports detailing observations of aberrant transcriptional activity resulting from direct interactions of a transcription factor and a polyQ-expansion protein. The polyQ-expanded protein in SCA7, ataxin-7, has recently been shown to be a component of a histone acetyltransferase complex called the TATA-binding protein-free TBP-associated factor-containing complex (TFTC). In a mouse model of SCA7, polyQ-expanded ataxin-7-containing TFTC complex was highly recruited to specific promoters in a subset of genes expressed in rod photoreceptors, resulting in hyperacetylation of histone H3 in the promoter region and a detectable change in transcription of these genes [94]. Furthermore, polyQ-expanded Htt has shown the capacity to repress in vitro transcription by TFIID via direct interactions with SP1, TAF4 and RNA polymerase II-associated protein 30 kDa (RAP30) [95]. A lingering question is how sequestration or inactivation of a general transcription factor by proteins expressed in myriad tissue types results in the selective degeneration of neurons characteristic of each PolyQ Expansion Disease. This remains a major topic of debate in this field.

As another potential mechanism of transcriptional misregulation induced by the polyQ-expansion, several studies have implicated altered chromatin acetylation as a possible mediator of neurotoxicity. Specifically, expression of polyQ protein in the nucleus of Saccharomyces cerevisiae resulted in repression of a specific set of genes known to be regulated by the Spt/Ada/Gcn5 acetyltransferase complex (SAGA). Importantly, this transcriptional repression was reversed by treatment with a histone deacetylase inhibitor [96]. The SAGA complex is a histone acetyltransferase with significant homology to human CBP, a confirmed polyQ-interacting protein. As a result, several groups have investigated CBP’s role in polyQ-expansion toxicity. Reports indicate that the interaction of CBP with polyQ is dependent upon the acetyltransferase domain, and that enzymatic activity is reduced when mutant Htt is expressed [97]. In addition, amelioration of polyQ-expansion toxicity has been observed with CBP overexpression [98]. The implication that polyQ-expansion toxicity might be directly caused or exacerbated by aberrant acetylation is intriguing because this phenomenon might be regulated by an existing class of drugs, the histone deacetylase inhibitors.

The Ubiquitin Proteasome System

The observation of aberrant protein deposits in neurons of individuals afflicted with PolyQ Expansion Disease has led to speculation that malfunction of the normal mechanism of protein turnover plays a role in pathology. The ubiquitin proteasome system (UPS) for protein degradation is the primary non-lysosomal mechanism for protein destruction in eukaryotic cells. For over a decade, immunohistochemical analysis of polyQ-expansion-induced protein aggregates in cells has consistently revealed the presence of several components of the UPS including ubiquitin, proteasome subunits and UPS adaptor proteins [99-101]. This finding has been confirmed in patient tissue, cell culture and animal models in almost every PolyQ Expansion Disease type. As a result, UPS involvement in the clearance of soluble or aggregated polyQ-expanded protein is one of the most consistently researched themes relating to polyQ toxicity.

Several mechanisms explaining UPS-mediated toxicity have been proposed, and will be discussed in relation to the observed data. The UPS degrades proteins involved in most metabolic or regulatory pathways in the cell including , tumor suppression, development, mitosis, etc. Disruption of the UPS by polyQ-expanded proteins could therefore interrupt proper regulation of any of these pathways in affected neurons, leading to cell death. General UPS inhibition could be envisioned by several mechanisms: direct inhibition of proteasomes by polyQ-expanded protein as a result of failure to degrade and release the protein, resulting in “clogging” of the proteasome catalytic chamber, irreversible (or slowly reversible) interactions with the 19S that prevent normal recognition and processing of other substrates without affecting catalytic activity, and indirect inhibition by sequestration of free ubiquitin or proteasome subunits into aggregates or inclusion bodies resulting in a depletion of active UPS.

Measurement of UPS activity is technically challenging, considering the requirement for appropriate post-translational targeting of substrates, proteasome recognition, ATP-dependent unfolding and deubiquitination, and catalytic proteolysis. Conventional attempts to quantify enzymatic activity of the proteasome via the use of small peptide fluorigenic substrates focus solely on the proteolytic activity of the catalytic core, largely ignoring critical events in substrate processing by the 19S. Nevertheless, fluorigenic substrate assays do provide quantifiable measurements of proteolytic activity, and have been used to demonstrate proteasome impairment in response to expression of polyQ-expanded protein. Examining lysates from a transfected cell model of HD, Jana et al. reported a redistribution of proteasome activity to aggregates accompanied by reduced degradation of the proteasome-regulated protein in cells containing polyQ-expanded Htt, not observed in matched controls [102]. However, these findings did not translate to animal models, as it was demonstrated that proteasome activity was actually increased in a conditional model of HD expressing Htt with 94Q, relative to control mice [103]. In contrast, similar experiments using peptide fluorigenic substrates have been performed with postmortem tissue from HD patients, demonstrating a decrease in proteasome activity in early and late stage disease, compared to matched controls [104].

A major problem with these studies is the use of peptide substrates that do not require ubiquitin conjugation or substrate processing via the 19S proteasome. In addition, detergents commonly used to lyse cells in preparation for analysis by fluorigenic substrates can also activate 20S proteasomes, and could confound results.

In vivo functional studies have corroborated a connection between the UPS and aggregates/inclusion bodies. Reports have indicated that the proteasome is both responsible for, and capable of, degradation of polyQ-expanded protein-containing aggregates observed in primary striatal neuronal cultures from a conditional mouse model of HD [83]. When expression of a polyQ-expanded Htt (94Q) was induced, intranuclear aggregates formed rapidly, within two days. Abolition of transgene expression resulted in the slow disappearance of aggregates over a five-day period, and this clearance was sensitive to proteasome inhibition, suggesting aggregates are normally degraded by the proteasome. This study strongly suggests that aggregates form as the result of a dynamic balance between production of polyQ-expanded protein and proteasome activity [83]. Further evidence of UPS interaction with aggregates comes from exquisite live-cell imaging studies of fluorescently tagged functional proteasomes, which demonstrated irreversible sequestration of proteasomes into polyQ-expanded protein-induced aggregates, and inefficient proteasome-dependent degradation of polyQ-expanded proteins [105]. Taken together, these studies suggest that an impaired ability of the proteasome to “keep up” with soluble polyQ-expanded protein may result in accumulation and aggregation/inclusion body formation, resulting in irreversible sequestration of the proteasome. It may be relevant to late-onset PolyQ Expansion Disease that proteasome activity has been shown to decline with age in the nervous system of healthy individuals [106, 107].

Several in vivo reporter assays that more accurately reflect proteasome activity have been important in the study of PolyQ Expansion Disease. These studies will be reviewed in the introduction to Chapter 2.

Another method to analyze the polyQ-expansion effect on the UPS is in vitro analysis of direct proteasomal degradation of polyQ protein. This has proven extremely difficult (see Chapters 3 and 4), with a very limited number of successful reports. Two components are required: appropriately targeted substrates and purified, functional proteasomes. As proteasome substrates are polyubiquitinated by specific E3 ligases, often context-dependently, in vitro production of polyubiquitinated protein has proven difficult. In addition, the presence and high activity of intracellular deubiquitinases (DUBs) has often been cited as a factor preventing purification of polyubiquitinated proteins from cell culture. As a result, the few reports of in vitro degradation of polyQ proteins by the proteasome have been forced to take drastic measures including the use of archaeal proteasomes, the use of SDS to activate eukaryotic proteasomes, and imaginative targeting schemes.

PolyQ peptides of various subpathogenic lengths and a Q35-myoglobin fusion were introduced to various preparations of eukaryotic and archaeal proteasomes by Venkatraman et al. [108]. Substrate proteins were not polyubiquitinated, but the authors overcame this limitation by addition of SDS or by using the yeast α3ΔN mutant proteasome to circumvent the gate function of the 20S. The authors reported that both mammalian and yeast 20S proteasomes were unable to digest polyQ sequences, but rapidly digested all flanking sequences. Additionally, treatment of eukaryotic proteasomes with nonubiquitinated polyQ proteins did not inhibit proteasome degradation of fluorigenic peptides, an important finding suggesting that polyQ sequences do not act as suicide inhibitors of proteasome catalytic sites. Intriguingly, the authors reported that archaeal proteasomes completely and efficiently degraded all polyQ peptides [108]. While these are impressive results, it remains possible that appropriately targeted proteins with polyQ-expansions might be unfolded by the anti-chaperone activity of the eukaryotic 19S, sensitizing the proteins to degradation by the 20S. Another in vitro study examined the effects of polyQ expansion and aggregation of the Htt protein on the polyubiquitin-dependent proteasome degradation of cyclin N100. Briefly, cyclin N100 was polyubiquitinated in vitro and introduced to purified 26S proteasome in the presence of either HttQ18 or HttQ51 produced in bacteria. The HttQ51 rapidly formed aggregates in vitro, but had no effect on the efficient degradation of Ubn-cyclin N100 [109]. While this study seems to demonstrate that polyQ expansion does not sequester the proteasome into inactive aggregates, the reader is left to wonder if sequestration would occur if the polyQ-expanded protein were specifically delivered to the proteasome by the polyubiquitin targeting signal. In addition, the study does not establish if the proteasome can efficiently degrade polyQ-expanded proteins.

Molecular Chaperones

Proteins are synthesized as linear polymers of amino acids, but require exquisite 3D conformations for optimal function. The cellular environment is often not optimal for proper folding of proteins or specific domains, and therefore members of a class of ubiquitous proteins, called molecular chaperones, are often required to facilitate this process. Chaperones function at various steps in synthesis and translocation of the nascent polypeptide chain to ensure folding and prevent aggregation, but how chaperones discriminate between correct and incorrect forms of a protein is currently not understood. Proteins recognized as incorrectly folded are processed by a specific molecular pathway called “quality control”, composed of molecular chaperones and the UPS [110]. Proteins recognized as misfolded by molecular chaperones can be delivered to the proteasome for destruction in a process that has been implicated in neurodegenerative disease [111, 112].

The role of specific chaperone proteins Hsp70, Hsp90, Hsp104, Hsp40 and Hsp27 in the recognition and degradation of abnormally folded proteins has been reviewed [113]. Interestingly, several molecular chaperone proteins are recognized to coprecipitate with polyQ-expanded proteins and components of the UPS in inclusion bodies [114, 115]. Several models of PolyQ Expansion Disease have demonstrated a protective effect of overexpression of heat shock proteins. In a cell culture model of SBMA, HSP70 overexpression resulted in enhanced solubility and increased turnover of polyQ-expanded AR [116]. In addition, several animal models of PolyQ Expansion Disease have shown similar protective effects of HSP overexpression [117, 118]. A specific interaction between the E3 ligase CHIP and HSP70 and HSP90 has been reported to be involved in targeting polyQ-expanded AR to the proteasome, engendering a therapeutic effect on the SBMA phenotype in transgenic mice [119]. In summary, it is likely that molecular chaperones play a role in ameliorating the toxicity of polyQ-expanded proteins by reducing aggregation and/or targeting these misfolded proteins for destruction by the proteasome.

Aggresomes

Several misfolded proteins, including the polyQ expansions, have been found as cellular inclusion bodies that stain positive for ubiquitin and components of the UPS, and display retarded electrophoretic mobility on SDS-PAGE. The misfolded proteins are often recruited to pericentriolar structures, termed aggresomes [120], where it is hypothesized that the UPS is recruited to degrade the misfolded protein. Interestingly, treatment of cells with proteasome inhibitors promotes the formation of the aggresome [120]. Aggresomes accumulate misfolded proteins at the microtubule organizing center by microtubule-mediated active transport, and protein cargo is commonly observed to be ubiquitinated. It is likely that many literature descriptions of polyQ-expansion, ubiquitin-positive “aggregates” are, in fact, aggresomes. This is important because several models of PolyQ Expansion Disease in which conditional expression systems have been used have demonstrated that “aggregates” are dynamic structures that are completely degraded by the proteasome [83, 121, 122], suggesting that proteasome function is not inhibited by polyQ-expanded proteins. However, alternative routes for destruction of aggresomes have been proposed, including -mediated [70, 123, 124], although this hypothesis remains controversial [125]. It is possible that soluble or microaggregated polyQ-expanded proteins are recognized as misfolded, and are polyubiquitinated for removal from the cell via the misfolded protein response.

However, if polyQ-expanded proteins inhibit or markedly slow proteasome function, they may be recruited to aggresomes followed by lysosome-mediated autophagy. Indeed, several groups have shown evidence that aggresomes serve a protective role by sequestering toxic soluble or microaggregated polyQ-expanded proteins [70, 126].

Elucidation of the relative roles of the proteasome and the aggresome in PolyQ Expansion Disease is confounded by several factors. First, while proteasome impairment is a known instigator of aggresome formation, recognition of polyQ-expanded proteins as misfolded proteins might result in their recruitment to aggresomes in the absence of proteasome inhibition. In addition, autophagy has been demonstrated to remove polyQ-expansion proteins directly [127] in the absence of aggresomes; therefore, attempts to verify autophagy-dependent aggresome clearance will be complicated. Finally, disruption of microtubules inhibits both aggresome formation and autophagosome-lysosome fusion, posing a significant obstacle to the dissection of these two events.

Altered Proteolysis

Another theme in the search for the underlying toxicity of polyQ expansions is the observation of polyQ-containing protein fragments in vitro and in vivo. Evidence has accumulated that proteolysis of polyQ-expanded proteins is a biologically relevant event, and has led some to speculate that a released protein fragment containing the polyQ-expansion may be more toxic than the full-length protein. This field of research was launched when immunohistochemical analysis of HD brain with a panel of anti-Htt antibodies revealed intense staining of intranuclear inclusions with N-terminal antibodies that was not observed when the sections were stained with C-terminal antibodies [101, 128]. The absence of immunoreactivity from antibodies that recognize protein moieties distal to the polyQ-expansion has been confirmed in tissues from other PolyQ Expansion Diseases, including SBMA [77] and SCA3 [129]. The results of these studies might be explained by a conformational change in the distal portions of polyQ-expansion proteins; however, western blots of brain homogenates have unequivocally demonstrated that polyQ-containing proteins are cleaved, as shorter protein fragments were detected [130, 131].

Several polyQ proteins have been demonstrated to be substrates for proteolytic cleavage by caspases, known mediators of apoptosis. Significantly, several reports indicate that mutation of caspase recognition sites or treatment with caspase inhibitors ameliorates polyQ-mediated toxicity [132, 133]. Calpains are another class of proteases implicated in cleavage of polyQ proteins. The relative roles of these proteolytic enzymes, and how proteolytic processing of polyQ-expanded protein might mediate toxicity, are not fully understood, but several hypotheses have been put forth. Proteolysis might contribute to pathogenesis by differential cleavage of expanded polyQ tracts, promoting release of polyQ peptides removed from their native protein context and leading to increased aggregation [134]. Another possibility is that proteolytic fragments of polyQ-expanded protein cannot be processed as efficiently by the UPS. Finally, recent cell culture experiments have found that polyQ fragments are more easily translocated to the nucleus, where polyQ toxicity is exacerbated [135].

1.6 Spinal and Bulbar Muscular Atrophy

1.6.1 Pathogenesis

SBMA was distinctly classified as an X-linked progressive neurodegenerative disorder in 1968 [136]. However, more than 20 years passed before the CAG tract expansion in the AR gene was determined to be the mutation responsible for the disease [43]. This was the first discovery of a PolyQ Expansion Disease, and facilitated discovery of other members of this class of disorder.

SBMA affects adult males, with onset of clinical symptoms occurring between 30 and 50 years of age. The disease prevalence is 1-2 in 100,000 individuals, with a slightly elevated prevalence in the Asian population. The classical clinical feature of SBMA that assists discrimination from related disorders such as amyotrophic lateral sclerosis (ALS) is the incidence of minute fasciculations of the bulbar, facial and limb muscles. Patients initially present with weakness and cramping accompanied by muscular atrophy. SBMA is a progressive disease: advanced patients develop dysphagia and dysarthria, leading to aspiration or choking, and patients in their fifties and sixties often require the assistance of a walker or wheelchair for mobility. Neurogenic abnormalities are observed by electromyogram; impairments of sensory nerve action potential and sensory evoked potential are also observed [137].

Neurodegeneration is progressive and is characterized by selective loss of motor neurons in the anterior horn of the spinal cord followed by loss of sensory neurons in the dorsal root ganglia, and then degeneration of the motor neurons in the brainstem [46].

One attractive hypothesis to explain the selective neuropathology of SBMA is that polyQ-expanded AR in the brain might be highly expressed only in affected brain regions. AR is expressed in many regions of the human brain including the hypothalamus, where it is expected to play a role in sexual dimorphism accounting for the sex differences associated with appropriate function and regulation of the hypothalamus-pituitary-gonadal axis [138]. In the spinal cord, the AR is predominantly expressed in motor neurons of the anterior horns and the sensory neurons of the dorsal root ganglia [139]. However, evidence has accumulated against expression levels of AR having a definitive role in the observed selective neuropathology. For example, reports have demonstrated high levels of AR in the spinal nucleus of the bulbocavernosus in rodents (called Onuf’s nucleus in humans), a region in the spinal cord that displays androgen-dependence for development and maintenance of motor neurons and target muscle fibers, yet displays no pathology in SBMA [140].

A common feature of the PolyQ Expansion Diseases is the dominant mode of genetic inheritance. However, analysis of the clinical attributes of SBMA reveals a nuanced disease phenotype that does not conform to the standard model of inheritance. Several lines of evidence indicate SBMA demonstrates a classical gain-of-function mutation, consistent with the idea of dominant inheritance. Human males with an inactivating mutation in the AR gene resulting in complete AIS (XY, phenotypic female) do not develop any neurological phenotype. Similarly, the AR knockout mouse displays no motor neuron phenotype [141]. Finally, transgenic mice with polyQ expansions recapitulate many facets of SBMA despite the presence of endogenous AR and intact AR signaling pathways [142-144]. However, other lines of evidence suggest an X-linked recessive mode of inheritance. Male SBMA patients are also afflicted with a mild form of AIS, postulated to result from reduced transactivational potential of the AR [42]. This includes gynecomastia, testicular atrophy with pronounced involution of Leydig cells, loss of secondary male sexual characteristics and decreased fertility. Female heterozygotes are usually asymptomatic but have been reported to suffer muscle cramps and tremor and present subclinical attributes including high amplitude motor unit potentials [145]. One interpretation of this finding is that random X-inactivation of the polyCAG-expanded allele prevents 50% of motor neuron loss as compared to males, and this protection is sufficient to prevent the clinical syndrome. However, an intriguing study of two females homozygous for the expanded polyCAG mutation reported no clinical evidence of neurodegeneration [146]. Although it is possible that the AR is not expressed in motor neurons of the female brain, this is not consistent with studies in rodents. An alternative explanation for the absence of disease pathology in females expressing the polyQ-expanded AR is the low level of circulating androgen in women, suggesting androgen-activation of the AR is a prerequisite for neuropathology. Possible attributes of androgen-activation that have been proposed to contribute to neurotoxicity include nuclear localization, loss of molecular chaperone interaction, AR conformational change, interaction with a diverse set of coregulatory proteins, and AR transcriptional activation.

1.6.2 Aggregate Formation in SBMA

A pathological hallmark of SBMA is the presence of nuclear inclusions in affected tissue that stain positive for AR. There are at least two proposed explanations for accumulation of nuclear inclusions in the SBMA brain, and their relation to disease-specific toxicity. The first explanation focuses on the idea that proteolytic cleavage of polyQ-expanded AR produces a fragment that translocates to the nucleus, where its toxicity and propensity to aggregate are exacerbated. This idea is supported by immunohistochemical analyses of the nuclear inclusions in SBMA brain that routinely demonstrate reactivity for regions of the AR proximal to the polyQ tract, but not for distal regions [77]. This finding has been corroborated by cell culture experiments from several different groups [147, 148]. PolyQ-expansion dependent proteolytic cleavage is proposed to occur via caspase-3, at a consensus site located at D146. Mutation of this site inhibited aggregate formation [149]. Finally, evidence implicating a toxic fragment resulting from proteolytic processing of polyQ-expansion protein has come from transgenic mouse models. Early attempts to create mouse models of SBMA utilized full-length polyQ-expanded AR, and were unsuccessful in recapitulating disease features [150, 151]. However, when transgenes were made to express a truncated version of the AR gene product coding for an N-terminal fragment of the protein under the control of various promoters, significant neuropathology was observed [152]. Indeed, these mice display a substantial neurodegenerative phenotype and nuclear inclusions characteristic of SBMA. However, these mice do not recapitulate several aspects of the disease critical to the interpretation of the observed pathology. Specifically, the lower motor neuron pathology specific to SBMA was not observed, and the gender-specificity of SBMA expression did not occur. In sum, proteolytic processing of the AR is likely to play some role in SBMA pathogenesis, but several features of the disease (namely the requirement of androgen-activation that requires a functional LBD) are not fully explained by the “toxic fragment” hypothesis.

A better-characterized explanation for the high incidence of nuclear inclusions in SBMA patient brain is that androgen-stimulated activation of the AR promotes several molecular events including disruption of molecular chaperone interaction and nuclear localization, resulting in inclusion body formation. The AR presents a unique opportunity for research of PolyQ Expansion Disease because its function, subcellular localization, and set of interacting proteins are dramatically modulated by steroid hormones. Moreover, the steroid hormones have the fortuitous properties of being membrane soluble and relatively nontoxic at therapeutic doses, features that have facilitated pharmacologic derivatization of these ligands; several synthetic androgen analogues with differential potential to promote or inhibit multiple facets of receptor activity have been discovered. Both normal and polyQ-expanded AR ordinarily reside in the cytoplasm, tightly bound by HSP70 and HSP90 family members in a binding-competent, inactive conformation. Upon ligand-binding the AR sheds the interactions with chaperones, translocates to the nucleus and binds to DNA as a dimer, where it recruits multiple cofactors for the initiation of transcription. As has been established, the nucleus is an important site for toxicity, and ligand-promoted nuclear translocation of the AR has been demonstrated to be an important feature of polyQ-expansion-mediated toxicity. An unresolved issue is that many cell culture models of SBMA display primarily cytoplasmic inclusions that form only in the presence of ligand. This finding suggests ligand-binding is an important event in inclusion body formation, possibly as a consequence of the conformational change or loss of molecular chaperone interaction. However, whether ligand-dependent cytoplasmic inclusion body formation is biologically relevant to the high incidence of intranuclear inclusion bodies in SBMA patient brain is a topic of much debate.

One of the earliest reports of ligand-induced effects on the pathology of a polyQ-expanded AR comes from cell culture experiments in HeLa cells. Live cell imaging demonstrated rapid cytoplasmic aggregate formation in cells expressing polyQ-expanded AR occurring within ten minutes of treatment with methyltrienolone, a synthetic androgen, an effect that was reversed by overexpression of HDJ-2 [153]. Ligand-dependent intranuclear aggregate formation has also been observed in a PC12 cell culture model of SBMA. Interestingly, proteolytic fragments of polyQ-expanded AR were observed in nuclear inclusions at high frequency only in the presence of ligand in this model, supporting a synthesis of the two proposed explanations for nuclear inclusion incidence [148]. A possible explanation for this observation is that heat shock proteins interacting with apoAR mask the proteolytic site. Upon ligand-binding, a combination of conformational change, shedding of heat shock proteins, proteolytic processing and normal AR-mediated nuclear translocation promotes intranuclear aggregate formation.

A commonly observed morphological change correlated with polyQ-expansion-induced neurotoxicity in cultured neuroblastoma cells is the appearance of short, dystrophic neurites [154, 155]. Several studies have also confirmed testosterone-dependent formation of “neuropil” aggregates in neurites of SBMA motor neuronal models, reminiscent of neuropil aggregates in HD patients [81] and a HD transgenic mouse model [156]. The observation of inclusion bodies in neurites suggests essential synaptic functions may be misregulated leading to neuronal dysfunction and subsequent degeneration observed in PolyQ Expansion Disease.

1.6.3 Animal Models of SBMA

The most intriguing evidence suggesting a role for ligand-dependence in the neuropathology of SBMA has come from animal models (Table 1.2). Expression of full length AR with the polyQ expansion in photoreceptor neurons of Drosophila melanogaster using the GAL4-UAS system revealed no aberrant phenotype in the absence of ligand. Upon dietary administration of DHT, flies expressing polyQ-expanded AR demonstrated significant degeneration of photoreceptor neurons, with only a marginal reduction in AR transactivation of a GFP-based reporter gene compared to wild type AR. Administration of the AR antagonists hydroxyflutamide (HF) and bicalutamide (BIC) also promoted neurodegeneration, but independent of transactivation. Because HF and BIC promote nuclear translocation but not transcription, it was concluded that nuclear localization prompted by ligand-binding is critical for SBMA pathogenesis. A strict requirement of ligand-binding for neurotoxicity was also ruled out, as this group observed toxicity when a polyQ-expanded AR N-terminal fragment (devoid of LBD) was fused to an NLS [159].

Table 1.2: Animal Models of SBMA, adapted from Katsuno et al., 2003 [157]; Beitel et al., 2005 [158]

Reference | Transgene | CAG Length | Gender Effect | Nuclear Inclusions | Cell Loss | Muscle Pathology | Motor Impairment
SBMA Patients
Kennedy, 1968; Sobue, 1989 | AR | 38-62 | Yes | spinal cord, brainstem | Yes | grouped atrophy, fiber-type grouping, hypertrophic fiber | weakness, amyotrophy, fasciculation
Drosophila melanogaster
Chan, 2001 | Truncated AR | 112 | No | Yes | Yes | N/A | N/A
Takeyama, 2002 | Full Length AR | 52 | Yes | Yes | Yes | N/A | N/A
Mouse
Bingham, 1995 | Full Length AR | 45 | No | No | No | No | No
Merry, 1996 | Full Length AR | 66 | No | No | No | No | No
LaSpada, 1998 | Full Length AR | 45 | No | No | No | No | No
Adachi, 2001 | Pure CAG, AR promoter | 239 | No | spinal cord, cerebrum, cerebellum | No | No | weakness, amyotrophy
Abel, 2001 | Truncated AR | 112 | No | spinal cord, cerebrum, brainstem | No | No | weakness, foot clasping
Abel, 2001 | Truncated AR | 112 | No | all neurons | No | No | hypoactivity, foot clasping, tremor, seizure
Katsuno, 2002 | Full Length AR | 97 | Yes | spinal cord, cerebrum, brainstem | | grouped atrophy | weakness, amyotrophy
McManamny, 2002 | Full Length AR | 120 | Mild | No | Yes | grouped atrophy, fiber-type grouping, hypertrophic fiber | weakness, amyotrophy, foot clasping
Sopher, 2004 | Full Length AR | 100 | Mild | several neuronal types excluding spinal cord motor neurons | Yes | scattered type II muscle fiber distribution, fiber type grouping | weakness, hindlimb atrophy and paralysis
Chevalier-Larson, 2004 | Full Length AR | 112 | Mild | spinal cord, brainstem, cortex | No | No | weakness, foot clasping, gait abnormalities

The first transgenic mouse model of SBMA to display progressive motor impairment and gender differences was reported by Katsuno et al. in 2002. AR with either 24 or 97 CAG was expressed from a cytomegalovirus enhancer and chicken β-actin promoter; male mice recapitulated several disease features including intranuclear inclusion formation, muscular atrophy, mild myopathic change, reduction in cross-sectional area of spinal motor neurons, and appearance of truncated polyQ-expanded AR N-terminal fragments in affected tissue. Castration of male transgenic mice largely prevented symptoms, pathology and, notably, nuclear localization of the polyQ-expanded AR, while administration of testosterone to female transgenic mice with polyQ-expanded AR strongly induced these features [160]. An elegant study investigating androgen blockade by pharmacological intervention in this transgenic mouse model followed, concluding that nuclear localization of polyQ-expanded AR is the primary pathological event in SBMA [144].

At least two more mouse models of SBMA displaying gender effects have been described. Sopher et al. [161] created a transgenic model based on transfer of a full length AR with 100 CAG repeats, together with endogenous regulatory elements. A neuromuscular phenotype including muscle weakness, impaired mobility, and hindlimb atrophy and paralysis was observed in male mice, but was markedly reduced in females. Interestingly, nuclear inclusion bodies were only observed in neurons in regions that were largely unaffected, suggesting inclusion body formation is not a strict requirement for toxicity [161]. Chevalier-Larsen et al. [143] described a transgenic mouse model of SBMA created by transfer of a full length AR with 112 CAG repeats under the control of the prion protein promoter. Males displayed a slowly progressive motor dysfunction including forelimb clasping, gait abnormalities and hindlimb muscle weakness, but these effects were largely absent in female mice carrying the polyQ-expanded AR. Neuronal dysfunction, rather than neurodegeneration, was hypothesized to be the cause of these symptoms because no difference was observed in the number or size of motor neurons in affected mice. Nuclear inclusion bodies were observed in males, but their presence was significantly reduced by castration, which also improved motor function. Neurons demonstrated the ability to clear inclusion bodies in the absence of androgen activation, and this finding is expected to be relevant to overcoming the dysfunction caused by the polyQ-expanded AR [143].

Current research into the pathology of PolyQ Expansion Disease indicates that several molecular mechanisms are likely to contribute to the observed neurodegeneration. The debate regarding the toxicity of protein aggregates/inclusion bodies remains contentious. It is likely that formation of aggregates/inclusion bodies occurs by at least two different mechanisms, self-association via polyQ tracts and aggresome formation, and probably by others. Both the formation and the toxic or protective effects of aggregates/inclusion bodies are likely to be critically dependent on context, including such variables as cell type, subcellular localization, expression levels and half-life of the polyQ-expanded protein, expression levels of molecular chaperones, native and spurious interactions with other cellular proteins, and post-translational modifications of the polyQ-expanded protein.

In this thesis, SBMA was chosen as a model for PolyQ Expansion Disease, largely because the polyQ protein responsible for this disease, the AR, has the unique and experimentally invaluable property that androgen activates both AR transcriptional activity and polyQ-expansion-mediated pathogenesis. It is hypothesized that proteasome function is reversibly impaired by the androgen-activated, polyQ-expanded AR; formation of inclusion bodies is hypothesized to play a protective role by sequestering activated AR with pathogenic tract lengths, thereby reducing the soluble load of noxious AR species. A proteasome reporter cell line was used to create a cell culture model of SBMA to test this hypothesis. To demonstrate direct proteasome impairment by a polyQ-expanded protein, attempts to reconstitute ubiquitin-dependent proteasomal degradation of the polyQ-expanded AR entirely in vitro are described. Finally, to investigate the structural aspects of the polyQ tract of the AR that might engender the ability to self-associate into protein aggregates, a system for the high level expression and purification of the AR TAD with various polyQ tract lengths is described. Structural analysis of these proteins is ongoing.

1.7 References

1. A unified nomenclature system for the nuclear receptor superfamily. Cell, 1999. 97(2): p. 161-3.
2. Robinson-Rechavi, M. and V. Laudet, Bioinformatics of nuclear receptors. Methods Enzymol, 2003. 364: p. 95-118.
3. Dahlman-Wright, K., et al., Structural characterization of a minimal functional transactivation domain from the human glucocorticoid receptor. Proc Natl Acad Sci U S A, 1995. 92(5): p. 1699-703.
4. Tjian, R. and T. Maniatis, Transcriptional activation: a complex puzzle with few easy pieces. Cell, 1994. 77(1): p. 5-8.
5. Guarente, L., Transcriptional coactivators in yeast and beyond. Trends Biochem Sci, 1995. 20(12): p. 517-21.
6. Henriksson, A., et al., Role of the Ada adaptor complex in gene activation by the glucocorticoid receptor. Mol Cell Biol, 1997. 17(6): p. 3065-73.
7. Onate, S.A., et al., Sequence and characterization of a coactivator for the steroid hormone receptor superfamily. Science, 1995. 270(5240): p. 1354-7.
8. Ford, J., et al., Involvement of the transcription factor IID protein complex in gene activation by the N-terminal transactivation domain of the glucocorticoid receptor in vitro. Mol Endocrinol, 1997. 11(10): p. 1467-75.
9. Cenni, B. and D. Picard, Ligand-independent Activation of Steroid Receptors: New Roles for Old Players. Trends Endocrinol Metab, 1999. 10(2): p. 41-46.
10. Tzukerman, M.T., et al., Human estrogen receptor transactivational capacity is determined by both cellular and promoter context and mediated by two functionally distinct intramolecular regions. Mol Endocrinol, 1994. 8(1): p. 21-30.
11. Freedman, L.P., et al., The function and structure of the metal coordination sites within the glucocorticoid receptor DNA binding domain. Nature, 1988. 334(6182): p. 543-6.
12. Freedman, L.P., et al., More fingers in hand. Cell, 1988. 54(4): p. 444.
13. Luisi, B.F., et al., Crystallographic analysis of the interaction of the glucocorticoid receptor with DNA. Nature, 1991. 352(6335): p. 497-505.
14. Baumann, H., et al., Refined solution structure of the glucocorticoid receptor DNA-binding domain. Biochemistry, 1993. 32(49): p. 13463-71.
15. Freedman, L.P., Anatomy of the steroid receptor zinc finger region. Endocr Rev, 1992. 13(2): p. 129-45.
16. Hard, T., et al., Solution structure of the glucocorticoid receptor DNA-binding domain. Science, 1990. 249(4965): p. 157-60.
17. Klevit, R.E., J.R. Herriott, and S.J. Horvath, Solution structure of a zinc finger domain of yeast ADR1. Proteins, 1990. 7(3): p. 215-26.
18. Lee, M.S., et al., Structure of the retinoid X receptor alpha DNA binding domain: a helix required for homodimeric DNA binding. Science, 1993. 260(5111): p. 1117-21.
19. Tanner, T., F. Claessens, and A. Haelens, The hinge region of the androgen receptor plays a role in proteasome-mediated transcriptional activation. Ann N Y Acad Sci, 2004. 1030: p. 587-92.


20. McEwan, I.J., Sex, drugs and : signalling by members of the nuclear receptor superfamily. Essays Biochem, 2004. 40: p. 1-10.
21. Bourguet, W., et al., Crystal structure of the ligand-binding domain of the human nuclear receptor RXR-alpha. Nature, 1995. 375(6530): p. 377-82.
22. Renaud, J.P., et al., Crystal structure of the RAR-gamma ligand-binding domain bound to all-trans retinoic acid. Nature, 1995. 378(6558): p. 681-9.
23. Wurtz, J.M., et al., A canonical structure for the ligand-binding domain of nuclear receptors. Nat Struct Biol, 1996. 3(2): p. 206.
24. Renaud, J.P. and D. Moras, Structural studies on nuclear receptors. Cell Mol Life Sci, 2000. 57(12): p. 1748-69.
25. Lees, J.A., et al., A 22-amino-acid peptide restores DNA-binding activity to dimerization-defective mutants of the estrogen receptor. Mol Cell Biol, 1990. 10(10): p. 5529-31.
26. Heery, D.M., et al., A signature motif in transcriptional co-activators mediates binding to nuclear receptors. Nature, 1997. 387(6634): p. 733-6.
27. Robyr, D., A.P. Wolffe, and W. Wahli, Nuclear hormone receptor coregulators in action: diversity for shared tasks. Mol Endocrinol, 2000. 14(3): p. 329-47.
28. Lonard, D.M., R.B. Lanz, and B.W. O'Malley, Nuclear Receptor Coregulators and Human Disease. Endocr Rev, 2007.
29. Darimont, B.D., et al., Structure and specificity of nuclear receptor-coactivator interactions. Genes Dev, 1998. 12(21): p. 3343-56.
30. Roth, S.Y., J.M. Denu, and C.D. Allis, Histone acetyltransferases. Annu Rev Biochem, 2001. 70: p. 81-120.
31. Hudson, L.G., et al., Ligand-activated thyroid hormone and retinoic acid receptors inhibit growth factor receptor promoter expression. Cell, 1990. 62(6): p. 1165-75.
32. Baniahmad, A., et al., Interaction of human thyroid hormone receptor beta with transcription factor TFIIB may mediate target gene derepression and activation by thyroid hormone. Proc Natl Acad Sci U S A, 1993. 90(19): p. 8832-6.
33. Baniahmad, A., Nuclear hormone receptor co-repressors. J Steroid Biochem Mol Biol, 2005. 93(2-5): p. 89-97.
34. Schweikert, H.U. and J.D. Wilson, Regulation of human hair growth by steroid hormones. II. Androstenedione metabolism in isolated hairs. J Clin Endocrinol Metab, 1974. 39(6): p. 1012-9.
35. Lubahn, D.B., et al., Cloning of human androgen receptor complementary DNA and localization to the X chromosome. Science, 1988. 240(4850): p. 327-30.
36. Chang, C.S., J. Kokontis, and S.T. Liao, Molecular cloning of human and rat complementary DNA encoding androgen receptors. Science, 1988. 240(4850): p. 324-6.
37. Spencer, J.A., et al., The androgen receptor gene is located on a highly conserved region of the X of marsupial and monotreme as well as eutherian mammals. J Hered, 1991. 82(2): p. 134-9.
38. Song, C.S., et al., A distal activation domain is critical in the regulation of the rat androgen receptor gene promoter. Biochem J, 1993. 294(Pt 3): p. 779-84.

39. Grossmann, M.E., et al., The mouse androgen receptor gene contains a second functional promoter which is regulated by dihydrotestosterone. Biochemistry, 1994. 33(48): p. 14594-600.
40. Baarends, W.M., et al., The rat androgen receptor gene promoter. Mol Cell Endocrinol, 1990. 74(1): p. 75-84.
41. Gottlieb, B., et al., The androgen receptor gene mutations database (ARDB): 2004 update. Hum Mutat, 2004. 23(6): p. 527-33.
42. Mhatre, A.N., et al., Reduced transcriptional regulatory competence of the androgen receptor in X-linked spinal and bulbar muscular atrophy. Nat Genet, 1993. 5(2): p. 184-8.
43. La Spada, A.R., et al., Androgen receptor gene mutations in X-linked spinal and bulbar muscular atrophy. Nature, 1991. 352(6330): p. 77-9.
44. Sartor, O., Q. Zheng, and J.A. Eastham, Androgen receptor gene CAG repeat length varies in a race-specific fashion in men without prostate cancer. Urology, 1999. 53(2): p. 378-80.
45. Giovannucci, E., et al., The CAG repeat within the androgen receptor gene and its relationship to prostate cancer. Proc Natl Acad Sci U S A, 1997. 94(7): p. 3320-3.
46. Sobue, G., et al., X-linked recessive bulbospinal neuronopathy. A clinicopathological study. Brain, 1989. 112(Pt 1): p. 209-32.
47. McEwan, I.J. and J. Gustafsson, Interaction of the human androgen receptor transactivation function with the general transcription factor TFIIF. Proc Natl Acad Sci U S A, 1997. 94(16): p. 8485-90.
48. Faus, H. and B. Haendler, Post-translational modifications of steroid receptors. Biomed Pharmacother, 2006. 60(9): p. 520-8.
49. Wong, H.Y., et al., Phosphorylation of androgen receptor isoforms. Biochem J, 2004. 383(Pt 2): p. 267-76.
50. Fu, M., et al., p300 and p300/cAMP-response element-binding protein-associated factor acetylate the androgen receptor at sites governing hormone-dependent transactivation. J Biol Chem, 2000. 275(27): p. 20853-60.
51. Fu, M., et al., Androgen receptor acetylation governs trans activation and MEKK1-induced apoptosis without affecting in vitro sumoylation and trans-repression function. Mol Cell Biol, 2002. 22(10): p. 3373-88.
52. He, B., J.A. Kemppainen, and E.M. Wilson, FXXLF and WXXLF sequences mediate the NH2-terminal interaction with the ligand binding domain of the androgen receptor. J Biol Chem, 2000. 275(30): p. 22986-94.
53. Bevan, C.L., et al., The AF1 and AF2 domains of the androgen receptor interact with distinct regions of SRC1. Mol Cell Biol, 1999. 19(12): p. 8383-92.
54. Thrower, J.S., et al., Recognition of the polyubiquitin proteolytic signal. Embo J, 2000. 19(1): p. 94-102.
55. Mukhopadhyay, D. and H. Riezman, Proteasome-independent functions of ubiquitin in endocytosis and signaling. Science, 2007. 315(5809): p. 201-5.
56. Groll, M., et al., Structure of 20S proteasome from yeast at 2.4 A resolution. Nature, 1997. 386(6624): p. 463-71.
57. Unno, M., et al., The structure of the mammalian 20S proteasome at 2.75 A resolution. Structure (Camb), 2002. 10(5): p. 609-18.


58. Hoffman, L., G. Pratt, and M. Rechsteiner, Multiple forms of the 20 S multicatalytic and the 26 S ubiquitin/ATP-dependent proteases from rabbit reticulocyte lysate. J Biol Chem, 1992. 267(31): p. 22362-8.
59. Ustrell, V., et al., PA200, a nuclear proteasome activator involved in DNA repair. Embo J, 2002. 21(13): p. 3516-25.
60. Wang, X., et al., Mass spectrometric characterization of the affinity-purified human 26S proteasome complex. Biochemistry, 2007. 46(11): p. 3553-65.
61. Tanahashi, N., et al., Hybrid proteasomes. Induction by interferon-gamma and contribution to ATP-dependent proteolysis. J Biol Chem, 2000. 275(19): p. 14336-45.
62. Schmidt, M., et al., The HEAT repeat protein Blm10 regulates the yeast proteasome by capping the core particle. Nat Struct Mol Biol, 2005. 12(4): p. 294-303.
63. Stock, D., et al., Catalytic mechanism of the 20S proteasome of Thermoplasma acidophilum revealed by X-ray crystallography. Cold Spring Harb Symp Quant Biol, 1995. 60: p. 525-32.
64. Gaczynska, M., K.L. Rock, and A.L. Goldberg, Gamma-interferon and expression of MHC genes regulate peptide hydrolysis by proteasomes. Nature, 1993. 365(6443): p. 264-7.
65. Tenzer, S. and H. Schild, Assays of proteasome-dependent cleavage products. Methods Mol Biol, 2005. 301: p. 97-115.
66. Housman, D., Gain of glutamines, gain of function? Nat Genet, 1995. 10(1): p. 3-4.
67. Chong, S.S., et al., Gametic and somatic tissue-specific heterogeneity of the expanded SCA1 CAG repeat in spinocerebellar ataxia type 1. Nat Genet, 1995. 10(3): p. 344-50.
68. Trottier, Y., V. Biancalana, and J.L. Mandel, Instability of CAG repeats in Huntington's disease: relation to parental transmission and age of onset. J Med Genet, 1994. 31(5): p. 377-82.
69. Michalik, A. and C. Van Broeckhoven, Pathogenesis of polyglutamine disorders: aggregation revisited. Hum Mol Genet, 2003. 12 Spec No 2: p. R173-86.
70. Taylor, J.P., et al., Aggresomes protect cells by enhancing the degradation of toxic polyglutamine-containing protein. Hum Mol Genet, 2003. 12(7): p. 749-57.
71. Perutz, M.F., et al., Glutamine repeats as polar zippers: their possible role in inherited neurodegenerative diseases. Proc Natl Acad Sci U S A, 1994. 91(12): p. 5355-8.
72. Perutz, M.F., Glutamine repeats and inherited neurodegenerative diseases: molecular aspects. Curr Opin Struct Biol, 1996. 6(6): p. 848-58.
73. Chen, S., et al., Amyloid-like features of polyglutamine aggregates and their assembly kinetics. Biochemistry, 2002. 41(23): p. 7391-9.
74. Masino, L., et al., Solution structure of polyglutamine tracts in GST-polyglutamine fusion proteins. FEBS Lett, 2002. 513(2-3): p. 267-72.
75. Chen, S., et al., Polyglutamine aggregation behavior in vitro supports a recruitment mechanism of cytotoxicity. J Mol Biol, 2001. 311(1): p. 173-82.

76. Cummings, C.J., et al., Mutation of the E6-AP ubiquitin ligase reduces nuclear inclusion frequency while accelerating polyglutamine-induced pathology in SCA1 mice. Neuron, 1999. 24(4): p. 879-92.
77. Li, M., et al., Nuclear inclusions of the androgen receptor protein in spinal and bulbar muscular atrophy. Ann Neurol, 1998. 44(2): p. 249-54.
78. Skinner, P.J., et al., Ataxin-1 with an expanded glutamine tract alters nuclear matrix-associated structures. Nature, 1997. 389(6654): p. 971-4.
79. Paulson, H.L., et al., Intranuclear inclusions of expanded polyglutamine protein in spinocerebellar ataxia type 3. Neuron, 1997. 19(2): p. 333-44.
80. Holmberg, M., et al., Spinocerebellar ataxia type 7 (SCA7): a neurodegenerative disorder with neuronal intranuclear inclusions. Hum Mol Genet, 1998. 7(5): p. 913-8.
81. Gutekunst, C.A., et al., Nuclear and neuropil aggregates in Huntington's disease: relationship to neuropathology. J Neurosci, 1999. 19(7): p. 2522-34.
82. Kim, M., et al., Mutant huntingtin expression in clonal striatal cells: dissociation of inclusion formation and neuronal survival by caspase inhibition. J Neurosci, 1999. 19(3): p. 964-73.
83. Martin-Aparicio, E., et al., Proteasomal-dependent aggregate reversal and absence of cell death in a conditional mouse model of Huntington's disease. J Neurosci, 2001. 21(22): p. 8772-81.
84. Kopito, R.R., Aggresomes, inclusion bodies and protein aggregation. Trends Cell Biol, 2000. 10(12): p. 524-30.
85. Arrasate, M., et al., Inclusion body formation reduces levels of mutant huntingtin and the risk of neuronal death. Nature, 2004. 431(7010): p. 805-10.
86. Waelter, S., et al., Accumulation of mutant huntingtin fragments in aggresome-like inclusion bodies as a result of insufficient protein degradation. Mol Biol Cell, 2001. 12(5): p. 1393-407.
87. Yang, W., et al., Aggregated polyglutamine peptides delivered to nuclei are toxic to mammalian cells. Hum Mol Genet, 2002. 11(23): p. 2905-17.
88. de Pril, R., et al., Accumulation of aberrant ubiquitin induces aggregate formation and cell death in polyglutamine diseases. Hum Mol Genet, 2004. 13(16): p. 1803-13.
89. Mitsui, K., H. Doi, and N. Nukina, Proteomics of polyglutamine aggregates. Methods Enzymol, 2006. 412: p. 63-76.
90. Tanese, N. and R. Tjian, Coactivators and TAFs: a new class of eukaryotic transcription factors that connect activators to the basal machinery. Cold Spring Harb Symp Quant Biol, 1993. 58: p. 179-85.
91. McCampbell, A., et al., CREB-binding protein sequestration by expanded polyglutamine. Hum Mol Genet, 2000. 9(14): p. 2197-202.
92. Perez, M.K., et al., Recruitment and the role of nuclear localization in polyglutamine-mediated aggregation. J Cell Biol, 1998. 143(6): p. 1457-70.
93. Shimohata, T., et al., Expanded polyglutamine stretches interact with TAFII130, interfering with CREB-dependent transcription. Nat Genet, 2000. 26(1): p. 29-36.
94. Helmlinger, D., et al., Glutamine-expanded ataxin-7 alters TFTC/STAGA recruitment and chromatin structure leading to photoreceptor dysfunction. PLoS Biol, 2006. 4(3): p. e67.


95. Zhai, W., et al., In vitro analysis of huntingtin-mediated transcriptional repression reveals multiple transcription factor targets. Cell, 2005. 123(7): p. 1241-53.
96. Hughes, R.E., et al., Altered transcription in yeast expressing expanded polyglutamine. Proc Natl Acad Sci U S A, 2001. 98(23): p. 13201-6.
97. Steffan, J.S., et al., Histone deacetylase inhibitors arrest polyglutamine-dependent neurodegeneration in Drosophila. Nature, 2001. 413(6857): p. 739-43.
98. Nucifora, F.C., Jr., et al., Interference by huntingtin and atrophin-1 with cbp-mediated transcription leading to cellular toxicity. Science, 2001. 291(5512): p. 2423-8.
99. Cummings, C.J., et al., Chaperone suppression of aggregation and altered subcellular proteasome localization imply protein misfolding in SCA1. Nat Genet, 1998. 19(2): p. 148-54.
100. Davies, S.W., et al., Formation of neuronal intranuclear inclusions underlies the neurological dysfunction in mice transgenic for the HD mutation. Cell, 1997. 90(3): p. 537-48.
101. DiFiglia, M., et al., Aggregation of huntingtin in neuronal intranuclear inclusions and dystrophic neurites in brain. Science, 1997. 277(5334): p. 1990-3.
102. Jana, N.R., et al., Altered proteasomal function due to the expression of polyglutamine-expanded truncated N-terminal huntingtin induces apoptosis by caspase activation through mitochondrial cytochrome c release. Hum Mol Genet, 2001. 10(10): p. 1049-59.
103. Diaz-Hernandez, M., et al., Neuronal induction of the immunoproteasome in Huntington's disease. J Neurosci, 2003. 23(37): p. 11653-61.
104. Seo, H., K.C. Sonntag, and O. Isacson, Generalized brain and skin proteasome inhibition in Huntington's disease. Ann Neurol, 2004. 56(3): p. 319-28.
105. Holmberg, C.I., et al., Inefficient degradation of truncated polyglutamine proteins by the proteasome. Embo J, 2004. 23(21): p. 4307-18.
106. Keller, J.N., F.F. Huang, and W.R. Markesbery, Decreased levels of proteasome activity and proteasome expression in aging spinal cord. Neuroscience, 2000. 98(1): p. 149-56.
107. Keller, J.N., K.B. Hanni, and W.R. Markesbery, Possible involvement of proteasome inhibition in aging: implications for oxidative stress. Mech Ageing Dev, 2000. 113(1): p. 61-70.
108. Venkatraman, P., et al., Eukaryotic proteasomes cannot digest polyglutamine sequences and release them during degradation of polyglutamine-containing proteins. Mol Cell, 2004. 14(1): p. 95-104.
109. Bennett, E.J., et al., Global impairment of the ubiquitin-proteasome system by nuclear or cytoplasmic protein aggregates precedes inclusion body formation. Mol Cell, 2005. 17(3): p. 351-65.
110. Berke, S.J. and H.L. Paulson, Protein aggregation and the ubiquitin proteasome pathway: gaining the UPPer hand on neurodegeneration. Curr Opin Genet Dev, 2003. 13(3): p. 253-61.
111. Meacham, G.C., et al., The Hsc70 co-chaperone CHIP targets immature CFTR for proteasomal degradation. Nat Cell Biol, 2001. 3(1): p. 100-5.

112. Imai, Y., et al., CHIP is associated with Parkin, a gene responsible for familial Parkinson's disease, and enhances its ubiquitin ligase activity. Mol Cell, 2002. 10(1): p. 55-67.
113. Sherman, M.Y. and A.L. Goldberg, Cellular defenses against unfolded proteins: a cell biologist thinks about neurodegenerative diseases. Neuron, 2001. 29(1): p. 15-32.
114. Schmidt, T., et al., Protein surveillance machinery in brains with spinocerebellar ataxia type 3: redistribution and differential recruitment of 26S proteasome subunits and chaperones to neuronal intranuclear inclusions. Ann Neurol, 2002. 51(3): p. 302-10.
115. Zander, C., et al., Similarities between spinocerebellar ataxia type 7 (SCA7) cell models and human brain: proteins recruited in inclusions and activation of caspase-3. Hum Mol Genet, 2001. 10(22): p. 2569-79.
116. Bailey, C.K., et al., Molecular chaperones enhance the degradation of expanded polyglutamine repeat androgen receptor in a cellular model of spinal and bulbar muscular atrophy. Hum Mol Genet, 2002. 11(5): p. 515-23.
117. Cummings, C.J., et al., Over-expression of inducible HSP70 chaperone suppresses neuropathology and improves motor function in SCA1 mice. Hum Mol Genet, 2001. 10(14): p. 1511-8.
118. Adachi, H., et al., Heat shock protein 70 chaperone overexpression ameliorates phenotypes of the spinal and bulbar muscular atrophy transgenic mouse model by reducing nuclear-localized mutant androgen receptor protein. J Neurosci, 2003. 23(6): p. 2203-11.
119. Adachi, H., et al., CHIP overexpression reduces mutant androgen receptor protein and ameliorates phenotypes of the spinal and bulbar muscular atrophy transgenic mouse model. J Neurosci, 2007. 27(19): p. 5115-26.
120. Johnston, J.A., C.L. Ward, and R.R. Kopito, Aggresomes: a cellular response to misfolded proteins. J Cell Biol, 1998. 143(7): p. 1883-98.
121. Xia, H., et al., RNAi suppresses polyglutamine-induced neurodegeneration in a model of spinocerebellar ataxia. Nat Med, 2004. 10(8): p. 816-20.
122. Yamamoto, A., J.J. Lucas, and R. Hen, Reversal of neuropathology and motor dysfunction in a conditional model of Huntington's disease. Cell, 2000. 101(1): p. 57-66.
123. Fortun, J., et al., Emerging role for autophagy in the removal of aggresomes in Schwann cells. J Neurosci, 2003. 23(33): p. 10672-80.
124. Marx, F.P., et al., The proteasomal subunit S6 ATPase is a novel synphilin-1 interacting protein--implications for Parkinson's disease. Faseb J, 2007. 21(8): p. 1759-67.
125. Webb, J.L., B. Ravikumar, and D.C. Rubinsztein, Microtubule disruption inhibits autophagosome-lysosome fusion: implications for studying the roles of aggresomes in polyglutamine diseases. Int J Biochem Cell Biol, 2004. 36(12): p. 2541-50.
126. Muchowski, P.J., et al., Requirement of an intact microtubule cytoskeleton for aggregation and inclusion body formation by a mutant huntingtin fragment. Proc Natl Acad Sci U S A, 2002. 99(2): p. 727-32.


127. Ravikumar, B., R. Duden, and D.C. Rubinsztein, Aggregate-prone proteins with polyglutamine and polyalanine expansions are degraded by autophagy. Hum Mol Genet, 2002. 11(9): p. 1107-17.
128. Sieradzan, K.A., et al., Huntington's disease intranuclear inclusions contain truncated, ubiquitinated huntingtin protein. Exp Neurol, 1999. 156(1): p. 92-9.
129. Schmidt, T., et al., An isoform of ataxin-3 accumulates in the nucleus of neuronal cells in affected brain regions of SCA3 patients. Brain Pathol, 1998. 8(4): p. 669-79.
130. Kim, T.Y., et al., Selective anabolic effects of muteins of mid-region PTH fragments on skeletal tissues of prepubertal rats. Bone, 2002. 30(1): p. 78-84.
131. Nucifora, F.C., Jr., et al., Nuclear localization of a non-caspase truncation product of atrophin-1, with an expanded polyglutamine repeat, increases cellular toxicity. J Biol Chem, 2003. 278(15): p. 13047-55.
132. Ellerby, L.M., et al., Cleavage of atrophin-1 at caspase site aspartic acid 109 modulates cytotoxicity. J Biol Chem, 1999. 274(13): p. 8730-6.
133. Wellington, C.L., et al., Caspase cleavage of mutant huntingtin precedes neurodegeneration in Huntington's disease. J Neurosci, 2002. 22(18): p. 7862-72.
134. Verhoef, L.G., et al., Aggregate formation inhibits proteasomal degradation of polyglutamine proteins. Hum Mol Genet, 2002. 11(22): p. 2689-700.
135. Cornett, J., et al., Polyglutamine expansion of huntingtin impairs its nuclear export. Nat Genet, 2005. 37(2): p. 198-204.
136. Kennedy, W.R., M. Alter, and J.H. Sung, Progressive proximal spinal and bulbar muscular atrophy of late onset. A sex-linked recessive trait. Neurology, 1968. 18(7): p. 671-80.
137. Katsuno, M., et al., Pathogenesis, animal models and therapeutics in spinal and bulbar muscular atrophy (SBMA). Exp Neurol, 2006. 200(1): p. 8-18.
138. Swaab, D.F., et al., Structural and functional sex differences in the human hypothalamus. Horm Behav, 2001. 40(2): p. 93-8.
139. Li, M., et al., Primary sensory neurons in X-linked recessive bulbospinal neuropathy: histopathology and androgen receptor gene expression. Muscle Nerve, 1995. 18(3): p. 301-8.
140. Watson, N.V., L.M. Freeman, and S.M. Breedlove, Neuronal size in the spinal nucleus of the bulbocavernosus: direct modulation by androgen in rats with mosaic androgen insensitivity. J Neurosci, 2001. 21(3): p. 1062-6.
141. Sato, T., et al., Late onset of obesity in male androgen receptor-deficient (AR KO) mice. Biochem Biophys Res Commun, 2003. 300(1): p. 167-71.
142. Adachi, H., et al., Transgenic mice with an expanded CAG repeat controlled by the human AR promoter show polyglutamine nuclear inclusions and neuronal dysfunction without neuronal cell death. Hum Mol Genet, 2001. 10(10): p. 1039-48.
143. Chevalier-Larsen, E.S., et al., Castration restores function and neurofilament alterations of aged symptomatic males in a transgenic mouse model of spinal and bulbar muscular atrophy. J Neurosci, 2004. 24(20): p. 4778-86.
144. Katsuno, M., et al., Leuprorelin rescues polyglutamine-dependent phenotypes in a transgenic mouse model of spinal and bulbar muscular atrophy. Nat Med, 2003. 9(6): p. 768-73.

145. Sobue, G., et al., Subclinical phenotypic expressions in heterozygous females of X-linked recessive bulbospinal neuronopathy. J Neurol Sci, 1993. 117(1-2): p. 74-8.
146. Schmidt, B.J., et al., Expression of X-linked bulbospinal muscular atrophy (Kennedy disease) in two homozygous women. Neurology, 2002. 59(5): p. 770-2.
147. Kobayashi, Y., et al., Chaperones Hsp70 and Hsp40 suppress aggregate formation and apoptosis in cultured neuronal cells expressing truncated androgen receptor protein with expanded polyglutamine tract. J Biol Chem, 2000. 275(12): p. 8772-8.
148. Walcott, J.L. and D.E. Merry, Ligand promotes intranuclear inclusions in a novel cell model of spinal and bulbar muscular atrophy. J Biol Chem, 2002. 277(52): p. 50855-9.
149. Ellerby, L.M., et al., Kennedy's disease: caspase cleavage of the androgen receptor is a crucial event in cytotoxicity. J Neurochem, 1999. 72(1): p. 185-95.
150. Bingham, P.M., et al., Stability of an expanded trinucleotide repeat in the androgen receptor gene in transgenic mice. Nat Genet, 1995. 9(2): p. 191-6.
151. Merry, D., Am J Hum Genet, 1996. 59 (Suppl): p. A271.
152. Abel, A., et al., Expression of expanded repeat androgen receptor produces neurologic disease in transgenic mice. Hum Mol Genet, 2001. 10(2): p. 107-16.
153. Stenoien, D.L., et al., Polyglutamine-expanded androgen receptors form aggregates that sequester heat shock proteins, proteasome components and SRC-1, and are suppressed by the HDJ-2 chaperone. Hum Mol Genet, 1999. 8(5): p. 731-41.
154. Avila, D.M., et al., Androgen receptors containing expanded polyglutamine tracts exhibit progressive toxicity when stably expressed in the neuroblastoma cell line, SH-SY 5Y. Exp Biol Med (Maywood), 2003. 228(8): p. 982-90.
155. Piccioni, F., et al., Polyglutamine tract expansion of the androgen receptor in a motoneuronal model of spinal and bulbar muscular atrophy. Brain Res Bull, 2001. 56(3-4): p. 215-20.
156. Li, H., et al., Ultrastructural localization and progressive formation of neuropil aggregates in Huntington's disease transgenic mice. Hum Mol Genet, 1999. 8(7): p. 1227-36.
157. Katsuno, M., et al., Transgenic mouse models of spinal and bulbar muscular atrophy (SBMA). Cytogenet Genome Res, 2003. 100(1-4): p. 243-51.
158. Beitel, L.K., et al., Progress in Spinobulbar muscular atrophy research: insights into neuronal dysfunction caused by the polyglutamine-expanded androgen receptor. Neurotox Res, 2005. 7(3): p. 219-30.
159. Takeyama, K., et al., Androgen-dependent neurodegeneration by polyglutamine-expanded human androgen receptor in Drosophila. Neuron, 2002. 35(5): p. 855-64.
160. Katsuno, M., et al., Testosterone reduction prevents phenotypic expression in a transgenic mouse model of spinal and bulbar muscular atrophy. Neuron, 2002. 35(5): p. 843-54.
161. Sopher, B.L., et al., Androgen receptor YAC transgenic mice recapitulate SBMA motor neuronopathy and implicate VEGF164 in the motor neuron degeneration. Neuron, 2004. 41(5): p. 687-99.


PREFACE TO CHAPTER 2

A common histopathological observation in tissues from PolyQ Expansion Disease patients is insoluble protein material containing the polyQ-expanded protein, termed either “protein aggregates” or “inclusion bodies”. Such aberrant accumulation of protein suggests a malfunction of the normal process by which cellular proteins are destroyed. An intriguing observation regarding the composition of inclusion bodies is the consistent finding, by numerous independent research groups, of ubiquitin and protein subunits of the proteasome. This finding strongly suggests a role for the ubiquitin proteasome system of protein degradation in PolyQ Expansion Disease. To study the polyQ expansion in the AR, a proteasome reporter cell line, HEK-293 GFPu, was obtained from Dr. Ron Kopito’s laboratory. This cell line allows an in vivo approach to assess global proteasome function in the context of a polyQ-expanded protein.

ORIGINAL CONTRIBUTIONS TO KNOWLEDGE

Two important findings are reported and discussed here. First, expression of full-length polyQ-expanded AR in the HEK-293 GFPu cell line promoted androgen-dependent proteasome impairment not observed in normal polyQ tract length controls. Interestingly, proteasome impairment was restricted to the nucleus, and occurred in the absence of inclusion body formation. Second, deletion of the nuclear localization signal of the AR rescued proteasome impairment by polyQ-expanded AR. Upon treatment with androgen, nuclear localization signal mutants remained in the cytoplasm and rapidly formed inclusion bodies, but did not promote global accumulation of GFPu fluorescence, a result indicative of normal proteasome function.

ACKNOWLEDGEMENTS

• The author performed all cell culture, transfection studies, immunocytochemistry and

fluorescent microscopy.

• Dr. Lenore K. Beitel previously constructed the ARQn-EBFP and ARΔNLSQn-EBFP plasmids, and assisted in editing the manuscript.

• The GFPu cell line was a gift from Dr. Ron Kopito.

• Dr. Mark Trifiro assisted in editing the manuscript.


CHAPTER 2 – Role of Androgen Receptor Nuclear Translocation in the Polyglutamine Expansion-Mediated Molecular Pathology of the Ubiquitin Proteasome System

2.1 Introduction

2.1.1 Role of Proteasome in PolyQ Expansion Disease

A pathological hallmark of PolyQ Expansion Disease is the incidence of inclusion bodies containing the polyQ-expanded protein, molecular chaperones, and components of the UPS. It is debated whether this finding represents a failed attempt of the proteasome to degrade polyQ expansions, or an appropriate, efficient cellular response invoking recruitment of active proteasomes to degrade aggregated or included protein. The truth is likely to lie somewhere between these two extremes. It should be noted that two clinical attributes of PolyQ Expansion Disease make it highly unlikely that polyQ expansions drastically inhibit proteasome function by acting as suicide inhibitors of proteasome catalytic subunits: toxicity restricted to defined neuronal subpopulations, and late onset of symptoms, usually occurring in the 5th decade of life. Efficient proteasome function is essential to cell viability, as inactivating mutations or even epitope-tagging of various proteasome subunits have proven incompatible with cell viability [1, 2]. Global toxicity or even embryonic lethality would therefore be expected if polyQ-expanded proteins acted as suicide inhibitors of the proteasome.

Endogenous targeting of protein substrates to the proteasome involves covalent attachment of the polyubiquitin signal via a well-characterized enzymatic cascade involving ubiquitin activating (E1), ubiquitin conjugating (E2), and ubiquitin ligase (E3) enzymes. This process is highly regulated and specific for the substrate protein, and often occurs only following specific post-translational modifications. Polyubiquitinated proteins are either delivered to the proteasome for processing and proteolytic degradation or are acted upon by deubiquitinases (DUBs) that remove the polyubiquitin tag, stabilizing the protein. The multiple steps required for targeting, processing and proteolytic degradation of protein substrates have complicated studies investigating the putative role of the UPS in the degradation of normal and pathogenic polyQ tract proteins. Recently, several tools have been developed that allow more detailed scrutiny of the potential involvement of the proteasome in the clearance of normal and expanded polyQ tracts by induced targeting of proteins for rapid degradation by the proteasome. Two such systems are the GFPu proteasome reporter and the N-end rule, both of which rely on a small peptide sequence fused to the target protein that serves as a recognition site for a specific ubiquitin conjugation cascade.

Several in vivo reporter schemes that quantitatively measure proteasome activity have been important in the study of PolyQ Expansion Disease. Specifically, the fusion of Green Fluorescent Protein (GFP) with a “degron” peptide sequence (the so-called GFPu), which targets the protein for rapid ubiquitination and subsequent proteasome-dependent degradation, was used to evaluate the effects of the polyQ expansion on proteasome function [3]. As a consequence of rapid proteasome-dependent degradation, GFPu has a very short half-life (30 min), and accumulates discernibly in response to UPS inhibition. Transient transfection of polyQ-expanded protein into cells stably expressing the GFPu protein resulted in marked accumulation of green fluorescence correlated with inclusion body formation, indicating that polyQ expansion proteins induce proteasome impairment [3]. To study the effect of physically separating the proteasome reporter protein from the polyQ-expanded protein, the same group modified the GFPu by fusion of nuclear localization and export sequences (NLSGFPu and NESGFPu, respectively). PolyQ-expanded proteins were transfected into cells expressing either of these reporter constructs, resulting in the production of aggregates and inclusion bodies (sic) restricted to the cellular compartments where the polyQ-expanded protein was expressed. Observations of proteasome impairment were recorded in either cellular compartment irrespective of the localization of the polyQ expansion. This surprising finding suggests that proteasome impairment induced by polyQ-expanded proteins does not require a direct interaction. In addition, accumulation of green fluorescence was reported prior to the recruitment of aggregates into inclusion bodies, suggesting inclusion bodies may act as a protective mechanism to rescue the cell from polyQ expansion aggregation events [4].

A derivative of GFPu, the unstable Yellow Fluorescent Protein (YFPu), was used to evaluate the effect of polyQ expansion in the AR in the motor neuron cell line NSC-34. The authors demonstrated that expression of the expanded polyQ AR promoted cytoplasmic YFPu accumulation in the absence of testosterone, indicative of proteasome impairment in the absence of AR androgen-activation. Furthermore, testosterone-dependent formation of cytoplasmic aggregates was observed, and appeared to reduce YFPu accumulation, indicating aggregates may serve a protective role for proteasome function [5]. These results seem to contradict animal studies that have clearly demonstrated that androgen-activation is requisite for AR polyQ expansion toxicity by promoting receptor nuclear translocation; it is unclear what relevance the testosterone-rescue of androgen-independent cytoplasmic YFPu accumulation reported in this study has to the pathology of SBMA. Interestingly, the authors define “inclusions” as “insoluble material”, yet do not regard their aggregates, which were demonstrated to be resistant to 1% SDS solubilization on a filter retardation assay, as inclusions.

The GFPu reporter systems are not without drawbacks. First, the GFPu degron is specific for a particular E2 and E3, while the polyQ-expansion might mediate toxicity by sequestration or inactivation of any of the myriad targeting enzymes of the ubiquitin conjugation cascade or adaptor proteins of the UPS. In addition, the human proteasome likely has multiple polyubiquitin receptors; the GFPu might only function through a single receptor, while the polyQ-expansion could inhibit another receptor site. Finally, the GFPu proteasome reporter system does not directly monitor the proteasome-dependent degradation of the polyQ-expanded protein, but is rather a surrogate marker for global proteasome function.

An alternative method to target proteins for rapid degradation by the proteasome is exploitation of the N-end rule [6]. It is well established that destabilizing residues at the N-terminus of a protein can promote the interaction with a targeting complex for polyubiquitin attachment and proteasome-dependent degradation. The N-end rule has been applied to fusion proteins containing pure polyQ tracts, N-terminal fragments of Htt, and ATXN1 [7-10]. These experimental schemes differ from the GFPu studies discussed above because the polyQ tract proteins are directly targeted to the proteasome. With respect to the ability of the proteasome to degrade these targeted polyQ tract proteins, variable results have been observed. Holmberg et al. reported incomplete degradation of normal and pathogenic polyQ tract lengths in vivo and in reticulocyte lysate, and also reported a tight association of N-end rule-targeted polyQ-expanded protein and the proteasome in aggregates, demonstrated by fluorescence resonance energy transfer, suggesting the proteasome is unable to degrade a polyQ tract of any length [8]. However, two groups have observed efficient degradation of soluble N-end rule-targeted polyQ tract proteins with no difference between normal and expanded tract lengths, but a significant stabilization of polyQ proteins after the formation of aggregates [7, 9]. Finally, Kaytor et al. observed a significant decrease in levels of polyQ-expanded N-end rule-targeted protein compared to nonpathogenic controls, attributed to the increased formation of aggregates [10]. An intriguing finding common to three of these reports is that targeting polyQ-expanded proteins for rapid proteasome-dependent degradation reduced the incidence of aggregate formation [7, 9, 10]. Taken together, these results implicate the proteasome in regulating the formation of aggregates by disposal of soluble polyQ-containing proteins. Aggregate formation clearly impairs the proteasome’s ability to rapidly degrade polyQ-expanded proteins, but does not promote irreversible inhibition.

Based on a large body of evidence implicating proteasome dysfunction as a possible toxic effect of the polyQ expansion protein, investigation of a polyQ-expanded AR in the well-established GFPu proteasome reporter cell line was pursued. Two aspects critical to the understanding of SBMA-specific disease processes were examined: androgen-activation of the AR, and receptor nuclear localization. Androgen-dependent activation of polyQ-expanded AR and nuclear localization were both required for increased levels of GFPu, which were observed only in the cellular compartment cis to the expression of the polyQ expansion, and occurred in the absence of inclusion body formation. Cytoplasmic restriction of the polyQ-expanded protein did not promote GFPu accumulation, but resulted in the androgen-dependent formation of inclusion bodies, suggesting a protective function for these structures.

2.2 Results and Discussion

2.2.1 PolyQ-expanded AR inhibits the proteasome

The recent finding in animal and cell culture models of SBMA that AR polyQ-expansion toxicity requires androgen-activation of the receptor presents the researcher with a unique opportunity to study the effects of polyQ expansion disease. Simply put, the AR is the only known polyQ-containing protein with an “on/off” switch for pathogenesis. The salient molecular event associated with androgen-activation of AR polyQ toxicity seems to be nuclear localization. In this context, it was decided to investigate the role of the proteasome in an SBMA cell culture model. If proteasome impairment is related to toxicity, it should also be correlated with androgen-activation and nuclear localization of the polyQ-expanded receptor. In addition, the androgen-binding and nuclear localization properties of the AR can be dissociated using a synthetic AR mutant with an interrupted nuclear localization signal (ARΔNLS); this mutant is expected to have interesting properties in this cell culture model.

To investigate possible proteasome dysfunction in a cell culture model of SBMA, the GFPu reporter cell line was transiently transfected with AR constructs carrying normal or pathogenic polyQ tract lengths. Previously, cDNAs encoding normal (20 CAG) or pathogenic (50 CAG) polyQ tract AR were inserted into the pEBFP-N1 plasmid (Clontech), encoding fusion proteins of ARQn-EBFP. As the EBFP gives a weakly fluorescent signal, immunofluorescent labeling of the AR was performed with a mouse anti-AR primary antibody (clone 441, Neomarkers), recognized by a tetramethylrhodamine isothiocyanate (TRITC)-labeled goat anti-mouse secondary antibody. GFPu cells were plated on glass coverslips the day before transfection, and 24 hours after transfection the cells were treated with either 7 nM mibolerone (MB), a nonmetabolizable androgen, or ethanol solvent. The cells were allowed to grow for an additional 24 hours, at which point mock-transfected cells were treated with the proteasome inhibitor MG132 (5 μM for 3 hours) as a positive control of proteasome dysfunction. Cells were fixed in paraformaldehyde, immunostained, and analyzed by epifluorescence microscopy.

GFPu cells treated with MG132 exhibited increased green fluorescence

throughout the cell, characteristic of global proteasome impairment (Figure 2.1, compare

“MG132” to “GFPu”). In addition, AR fusion proteins displayed expected cellular

compartmentalization patterns, as both ARQn-EBFP proteins were cytoplasmic in the

absence of androgen (Figure 2.1); upon addition of MB, near complete nuclear

translocation of the ARQn-EBFP proteins was observed (Figure 2.1).

ARQ20-EBFP cells did not display an increase in green fluorescence compared to negative controls either in the absence (not shown) or the presence of MB (Figure 2.1). In contrast, while ARQ50-EBFP-transfected cells displayed low-level GFPu fluorescence in the absence of MB (Figure 2.1), androgen-activation of the receptor resulted in a marked increase in green fluorescence that colocalized to the nucleus with the receptor (Figure 2.1). This result indicates that polyQ-expanded AR induces proteasome dysfunction in the cis cellular compartment. It is intriguing to note that proteasome dysfunction occurred in the absence of observable inclusion body formation. This suggests that soluble polyQ-expanded AR or microaggregated species are responsible for proteasome impairment.

It should be noted that this observation of proteasome impairment restricted to the cell compartment harboring the polyQ-expansion is at odds with a previous report from Ron Kopito’s lab, where nuclear- or cytoplasmic-restricted GFPu variants accumulated in the presence of polyQ-expanded protein restricted to the trans cell compartment [4]. It is possible that proteasome impairment restricted to the cis cell compartment reported here is an early event that precedes impairment in the trans compartment. Another explanation is the difference in half-life between the two reporter types: Kopito’s group measured the half-life of the NESGFPu and NLSGFPu to be twice that of the original GFPu used here. The GFPu should therefore be a more sensitive indicator of proteasome impairment, and might allow detection of early or less robust proteasome impairment in the experimental design reported here. In addition, it is possible that the shorter polyQ tract length used here (50Q, as compared to 82Q and 103Q) is not strong enough to induce trans proteasome impairment; it is also possible that the protein context of the polyQ-expanded tract plays a role in the observed differences.
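The half-life argument above can be illustrated with a simple kinetic sketch. It assumes constant reporter synthesis and first-order degradation, which is an illustrative model and not one fitted to the data in this chapter. The 30 min half-life is the GFPu value cited in the Introduction; the 60 min value is assumed from the statement that the NLS/NES variants have twice that half-life. The 50% impairment and the time points are hypothetical.

```python
import math


def fold_accumulation(half_life_min, impaired_fraction, t_min):
    """Fold increase of a reporter over its baseline level, t_min minutes
    after its degradation rate constant drops by impaired_fraction.

    Assumes constant synthesis and first-order degradation (an
    illustrative model, not one fitted to experimental data).
    """
    k = math.log(2) / half_life_min          # baseline rate constant (1/min)
    k_imp = k * (1.0 - impaired_fraction)    # rate constant after impairment
    if k_imp == 0.0:
        # Complete block: reporter accumulates linearly at the synthesis rate.
        return 1.0 + k * t_min
    plateau = k / k_imp                      # fold increase at the new steady state
    return plateau - (plateau - 1.0) * math.exp(-k_imp * t_min)


# Half-lives: 30 min for GFPu (from the text); 60 min for the NLS/NES
# variants (assumed). Impairment level and time points are hypothetical.
for t in (30, 60, 120, 240):
    short = fold_accumulation(30, 0.5, t)
    long_ = fold_accumulation(60, 0.5, t)
    print(f"{t:>4} min  GFPu: {short:.2f}x   NLS/NES-GFPu: {long_:.2f}x")
```

Under these assumptions both reporters approach the same plateau (2-fold for 50% impairment), but the shorter-lived reporter approaches it sooner: at 60 min it has risen 1.50-fold, against roughly 1.29-fold for the longer-lived reporter. This is consistent with the suggestion that GFPu may detect early or less robust impairment.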

2.2.2 Exclusion of the AR from the nucleus rescues proteasome impairment

Although the polyQ expansion in the AR is expressed in both males and females, SBMA affects only men. Since the finding that castration of SBMA transgenic mice, resulting in diminished serum testosterone levels, largely ameliorated pathology [11], the effect of androgen-activation of the polyQ-expanded AR has been widely studied. Treatment of SBMA mice with Leuprorelin, a Luteinizing Hormone-Releasing Hormone (LHRH) agonist that reduces serum testosterone, rescued transgenic mice similarly to castration. However, treatment with hydroxyflutamide, an AR antagonist that provides pharmacological knockout of AR transcriptional activity, did not rescue SBMA pathology. While hydroxyflutamide inhibits AR transcriptional activity, it still promotes translocation to the nucleus upon AR binding, strongly suggesting nuclear localization of the polyQ-expanded protein is the molecular event associated with androgen-activation of polyQ-expanded AR toxicity [12].

Using the GFPu proteasome reporter cell line, it was established that polyQ-expanded AR promotes nuclear proteasome dysfunction in the presence of androgen (Figure 2.1). In order to dissociate the ligand-binding and nuclear localization events associated with androgen-activation of the AR, an AR mutant with defective nuclear localization that retains full ligand-binding activity was transfected into the GFPu reporter cell line. This AR mutation, an 8 amino acid deletion at the junction of the hinge and LBD regions (R629-N636; ΔNLS), was prepared previously in our lab by standard PCR mutagenesis, and abolishes the AR NLS. Studies performed previously in our lab demonstrated normal androgen-binding but severely deficient transactivation of an androgen responsive reporter gene (pMMTV-LTR) (unpublished results). The cDNA from this mutant AR was fused in-frame to the EBFP cDNA for fluorescent microscopic functional analysis. Clearly, the ΔNLS deletion prevents ligand-induced nuclear translocation of the AR (Figure 2.2).

To investigate the effects of the polyQ expansion in a cytoplasmically restricted, ligand-bound AR, normal (Q20) and pathogenic (Q50) polyQ tracts were cloned into the ΔNLS AR mutant. These plasmid constructs were transiently transfected into the HEK-GFPu proteasome reporter cell line and the cells were examined microscopically.

Both ARΔNLSQ20-EBFP and ARΔNLSQ50-EBFP constructs are well expressed, and display the expected cytoplasmic compartmentalization both in the absence and presence of mibolerone (MB) (Figure 2.2). In the absence of MB, immunostaining of both ARΔNLS proteins reveals diffuse cytoplasmic staining (Figure 2.2), with no evidence of GFPu accumulation (data not shown). An intriguing observation in these transfection experiments is the ligand-dependent formation of large cytoplasmic inclusion bodies in cells transfected with either polyQ tract length. Inclusion body formation occurred equally for both polyQ tract lengths tested, and was not associated with an increase in diffuse green fluorescence, indicative of normal proteasome activity in the presence of cytoplasmic polyQ tract expansion proteins.

An interesting feature of these inclusions is the recruitment of the GFPu protein. As demonstrated in Figure 2.2, all inclusion bodies formed by ARΔNLS proteins in the presence of MB are strongly fluorescent in the green channel. Whether this represents recruitment of functional proteasomes (and thus proteasome-targeted GFPu protein) to sites of sequestered polyQ tracts for their efficient removal, or inhibited proteasomes unable to degrade either the ARΔNLS or the GFPu proteins, is a question that remains to be answered. However, a clear observation is the complete lack of increase in diffuse GFPu staining, indicating normal proteasome function in the majority of the cell volume in the presence of large inclusion bodies. This is a stark contrast to the previous findings (Figure 2.1), where diffuse, soluble, polyQ-expanded AR in the nucleus induced proteasome dysfunction. Taken together, these findings suggest that cytoplasmic inclusion body formation is a cellular response that facilitates sequestration of toxic proteins. It is likely that local (inclusion body-sequestered) proteasomes have reduced activity, as indicated by inefficient removal of both polyQ tract proteins and the proteasome-targeted GFPu protein. However, this reduced proteasome activity in inclusion bodies does not affect the bulk degradative activity of the cell as a whole. It is possible that nonproteasomal mechanisms such as autophagy are responsible for clearance of cytoplasmic inclusion bodies, although this was not tested.

2.3 Conclusions

Based on numerous literature reports, it is established that androgen-activation is

required for AR polyQ-expansion toxicity, suggesting some aspect of ligand-binding

exposes a toxic property of this molecule. In addition, the reports that nuclear

localization is required for disease progression might imply that cytoplasmic factors

engender protection against androgen-activated polyQ-expansion toxicity in the

cytoplasm. In the experiments described here, it was observed that ligand-binding of the

polyQ-expanded AR promoted both diffuse nuclear localization and proteasome

dysfunction (Figure 2.1). When restricted to the cytoplasm, the ligand-free AR was

diffuse; however the ligand-activated AR rapidly induced inclusion body formation and

did not induce proteasome dysfunction (Figure 2.2). As ligand-activation was previously demonstrated to promote nuclear proteasome dysfunction in polyQ-expanded AR, these results support the hypothesis that inclusion body formation is a protective cytoplasmic response to a toxic protein that rescues the compartment-specific proteasome impairment observed when the ligand-activated pathogenic polyQ tract proteins are soluble and localized in the nucleus.

Much controversy exists regarding the role of inclusion body formation in PolyQ Expansion Disease. Several groups have proposed that cytoplasmic and nuclear inclusion bodies serve markedly different roles [13]. In this report, androgen-activated, polyQ-expanded AR was almost totally relocalized to the nucleus, where it was observed to be soluble and diffuse. This soluble, ligand-bound, polyQ-expanded, nuclear AR induced nuclear proteasome dysfunction in the absence of inclusion body formation. Cytoplasmic restriction of the AR resulted in the androgen-dependent formation of large cytoplasmic inclusion bodies, and rescued proteasome impairment.

Two conclusions regarding inclusion body formation are proposed. First, formation of inclusion bodies is clearly androgen-dependent; it is likely that the apoAR is kept soluble by its interactions with molecular chaperones that prevent polyQ tract conformational conversion to a toxic β-sheet structure (see Introduction and Chapter 5). Second, cytoplasmic inclusion body formation is demonstrated to be a protective cellular response to a toxic protein. The inability of the nucleus to rapidly form protective inclusion bodies is proposed to explain the increased proteasome dysfunction associated with androgen-activated, polyQ-expanded AR proteins in the nucleus, compared to the cytoplasmically restricted variants observed in these experiments. It is interesting to note that nuclear inclusion bodies in SBMA cell culture models are rare, and occur only after extended durations of polyQ-expanded AR expression [14], while the cytoplasmic inclusions observed here (and in numerous other experimental conditions) were induced very rapidly, in under 24 hours. It is possible that delayed formation of inclusion bodies in the nucleus results in prolonged exposure to the pathogenic properties of the soluble expanded polyQ protein, while efficient sequestration of polyQ-expanded protein in the cytoplasm protects the cell from these same pathogenic properties. The idea that the nucleus is deficient in rapid formation of protective inclusion bodies would explain the selective nuclear toxicity of polyQ-expanded proteins.

2.4 Experimental Procedures

Plasmids: The ARQ20-EBFP and -Q50-EBFP; ARΔNLSQ20-EBFP and -Q50-EBFP plasmids were prepared previously in the Trifiro laboratory.

Antibodies: A mouse anti-AR antibody, clone 441, was purchased from Neomarkers (Fremont, CA). The TRITC-labeled goat anti-mouse secondary antibody was purchased from Zymed (San Francisco, CA).

Fluorescent Microscopy: The HEK-GFPu cell line was a gift from Dr. Ron Kopito, and was maintained in DMEM supplemented with 10% FBS and 100 units/mL penicillin-streptomycin (Invitrogen). Cells were plated on glass coverslips and allowed to adhere overnight. On the day of transfection, cells were 60-70% confluent, and were transfected with Lipofectamine 2000 (Invitrogen) according to the manufacturer’s instructions. Cells were allowed to recover for 24 hours, at which time the media was changed and supplemented with the appropriate hormonal treatment; samples were routinely treated with 7 nM MB. The cells were incubated for a further 24 hours and prepared for immunocytochemistry. Treatment with the proteasome inhibitor MG132 was routinely performed at 5 μM for 3-5 hours on the day of analysis. In preparation for immunolabeling, cells were washed twice in D-PBS, and fixed in 3.75% paraformaldehyde for 15 minutes on ice. Cells were permeabilized by treatment with 0.1% Triton X-100 for 10 minutes at 4°C, washed 3 times with D-PBS, and labeled with a 1:1000 dilution of the AR antibody in 1% BSA for 1 hour at 4°C. The cells were washed another 3 times and treated with a 1:1000 dilution of the TRITC-labeled goat anti-mouse secondary antibody in 1% BSA for 1 hour at 4°C. Cells were again washed three times with D-PBS and mounted on glass slides using GelTol Mounting Medium (Thermo Electron Corporation, Pittsburgh, PA). Cells were analyzed with an Olympus epifluorescence microscope and images were recorded with a SPOT CCD digital camera and software (Diagnostic Instruments, Sterling Heights, MI).
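The working concentrations and dilutions above follow from the standard relation C1 × V1 = C2 × V2. The sketch below shows the arithmetic only. The working concentrations (7 nM MB, 5 μM MG132, 1:1000 antibody) are from the procedure; the stock concentrations and well and staining volumes are hypothetical placeholders, and a 1:1000 dilution is assumed to mean one part antibody in 1000 parts final volume.

```python
def stock_volume_ul(stock_conc_molar, final_conc_molar, final_volume_ul):
    """Volume of stock (in microlitres) needed so that final_volume_ul
    reaches final_conc_molar, from C1 * V1 = C2 * V2."""
    if stock_conc_molar <= 0 or final_conc_molar > stock_conc_molar:
        raise ValueError("stock must be positive and at least as concentrated as the target")
    return final_conc_molar / stock_conc_molar * final_volume_ul


def antibody_volume_ul(dilution_factor, final_volume_ul):
    """Volume of antibody (in microlitres) for a 1:dilution_factor dilution,
    taken as one part antibody per dilution_factor parts final volume."""
    return final_volume_ul / dilution_factor


# Stock concentrations and volumes below are hypothetical placeholders.
mb_ul = stock_volume_ul(7e-6, 7e-9, 2000)       # 7 nM MB from an assumed 7 uM stock, 2 mL medium
mg132_ul = stock_volume_ul(10e-3, 5e-6, 2000)   # 5 uM MG132 from an assumed 10 mM stock, 2 mL medium
ab_ul = antibody_volume_ul(1000, 500)           # 1:1000 antibody in an assumed 500 uL of 1% BSA
print(mb_ul, mg132_ul, ab_ul)
```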


2.5 Figures

[Figure 2.1 image panels: GFPu; MG132; ARQ20 + MB, GFPu, Merge; ARQ50 - MB, GFPu, Merge; ARQ50 + MB, GFPu, Merge]

Figure 2.1: Ligand-dependent proteasome inhibition by polyQ-expanded AR. The HEK-293 GFPu proteasome reporter cell line was transiently transfected with ARQn-EBFP, and subsequently labelled with monoclonal anti-AR antibody (clone 441, Neomarkers) recognized by a TRITC-labelled goat anti-mouse secondary antibody. Mock transfected cells (GFPu) display a low level of green fluorescence that is markedly increased upon treatment with proteasome inhibitor (MG132, 5 μM for 3 hours). Transfection of the ARQ20-EBFP did not increase green fluorescence levels above background, indicating normal proteasome function in the presence of mibolerone (MB, 7 nM). Cells transfected with the ARQ50-EBFP displayed no evidence of proteasome impairment in the absence of MB (ARQ50 - MB). However, a clear increase in GFPu levels was observed in the nucleus following treatment with MB (ARQ50 + MB) in the absence of inclusion body formation.

[Figure 2.2 image panels: ΔNLSQ20 - MB; ΔNLSQ50 - MB; ΔNLSQ20 + MB, GFPu, Merge; ΔNLSQ50 + MB, GFPu, Merge]

Figure 2.2: Cytoplasmic restriction rescues ligand-dependent proteasome dysfunction. Plasmids expressing ARΔNLSQn-EBFP were transfected into the HEK-293 GFPu cell line. Both ΔNLSQn-EBFP proteins are soluble and cytoplasmically restricted in the absence of hormone (ΔNLSQ20,50 - MB). Treatment with MB induces the formation of large inclusion bodies in the cytoplasm of cells expressing either ΔNLSQn protein (ΔNLSQ20,50 + MB). No difference in the number or size of inclusion bodies was observed in cells transfected with either length polyQ tract. Interestingly, the GFPu protein is also recruited into the inclusion bodies, but overall levels of green fluorescence are not increased throughout the cytoplasm.


2.6 References

1. Rubin, D.M., et al., Active site mutants in the six regulatory particle ATPases reveal multiple roles for ATP in the proteasome. Embo J, 1998. 17(17): p. 4909-19.
2. Verma, R., et al., Proteasomal proteomics: identification of nucleotide-sensitive proteasome-interacting proteins by mass spectrometric analysis of affinity-purified proteasomes. Mol Biol Cell, 2000. 11(10): p. 3425-39.
3. Bence, N.F., R.M. Sampat, and R.R. Kopito, Impairment of the ubiquitin-proteasome system by protein aggregation. Science, 2001. 292(5521): p. 1552-5.
4. Bennett, E.J., et al., Global impairment of the ubiquitin-proteasome system by nuclear or cytoplasmic protein aggregates precedes inclusion body formation. Mol Cell, 2005. 17(3): p. 351-65.
5. Rusmini, P., et al., Aggregation and proteasome: the case of elongated polyglutamine aggregation in spinal and bulbar muscular atrophy. Neurobiol Aging, 2007. 28(7): p. 1099-111.
6. Varshavsky, A., The N-end rule: functions, mysteries, uses. Proc Natl Acad Sci U S A, 1996. 93(22): p. 12142-9.
7. Verhoef, L.G., et al., Aggregate formation inhibits proteasomal degradation of polyglutamine proteins. Hum Mol Genet, 2002. 11(22): p. 2689-700.
8. Holmberg, C.I., et al., Inefficient degradation of truncated polyglutamine proteins by the proteasome. Embo J, 2004. 23(21): p. 4307-18.
9. Michalik, A. and C. Van Broeckhoven, Proteasome degrades soluble expanded polyglutamine completely and efficiently. Neurobiol Dis, 2004. 16(1): p. 202-11.
10. Kaytor, M.D., K.D. Wilkinson, and S.T. Warren, Modulating huntingtin half-life alters polyglutamine-dependent aggregate formation and cell toxicity. J Neurochem, 2004. 89(4): p. 962-73.
11. Katsuno, M., et al., Testosterone reduction prevents phenotypic expression in a transgenic mouse model of spinal and bulbar muscular atrophy. Neuron, 2002. 35(5): p. 843-54.
12. Katsuno, M., et al., Leuprorelin rescues polyglutamine-dependent phenotypes in a transgenic mouse model of spinal and bulbar muscular atrophy. Nat Med, 2003. 9(6): p. 768-73.
13. Poletti, A., The polyglutamine tract of androgen receptor: from functions to dysfunctions in motor neurons. Front Neuroendocrinol, 2004. 25(1): p. 1-26.
14. Walcott, J.L. and D.E. Merry, Ligand promotes intranuclear inclusions in a novel cell model of spinal and bulbar muscular atrophy. J Biol Chem, 2002. 277(52): p. 50855-9.

PREFACE TO CHAPTER 3

Numerous research groups have established a role for the UPS in PolyQ Expansion Disease. However, a spirited debate still rages regarding a critical issue of this hypothesis: does the polyQ expansion pose a direct threat to proteasome function? The presence of numerous systems for protein clearance in the cell (proteasome, caspases, autophagy, etc.) complicates in vivo analysis of polyQ protein elimination, and has not thus far permitted any group to definitively conclude for or against efficient proteasome-mediated digestion of any polyQ-expanded protein. A fully in vitro system for ubiquitin-dependent proteasome degradation is needed to address this question. In theory, a two-part system would be required: an efficient method for polyubiquitination and purification of large quantities of polyQ-expanded substrates, and a method to purify human proteasomes under native conditions that retains many of the numerous accessory factors that are likely necessary for proper processing of polyubiquitinated substrates.

ORIGINAL CONTRIBUTIONS TO KNOWLEDGE

In this chapter, a novel method for purification of human proteasomes under native conditions using a biologically-based affinity chromatography method is described. Development of such a scheme for purification of human proteasomes was deemed necessary because conventional methods for proteasome purification routinely expose the proteasome complex to conditions of high salt during ion exchange chromatography, which abrogate protein-protein interactions with proteasome modulatory proteins that could serve roles in processing misfolded or toxic proteins with a polyQ expansion.

Purification of functional proteasomes from a limited quantity of a human-derived cell line was achieved using a novel method. Prior to the development of this method, there was not a single report of purification of human proteasomes using mild conditions.

It was therefore decided to investigate the composition of affinity-purified proteasome complexes to identify proteasome modulatory proteins by mass spectrometry. Proteomic analysis of the dynamic proteasome network using this purification scheme resulted in the identification of several previously unknown proteasome-interacting proteins.

ACKNOWLEDGEMENTS

• The author synthesized the GST-hUIM-2, performed all cell culture, affinity

chromatography of proteasome complexes, fluorogenic substrate assays of proteasome

activity, western blots, and participated in bioinformatic analysis of the raw mass

spectrometry data.

• Dr. Marcos DiFalco performed tandem mass spectrometry of purified proteasome

fractions.

• Dr. Beniot Houle and Yannic Richard assisted in database mining of the LC-MS/MS

data.

• Dr. Rob Kearney performed the randomized database search of LC-MS/MS data.

• Dr. Juli Feigon provided the pGEX-UBL plasmid.

• Dr. Mark Trifiro assisted in editing the manuscript.

• Dr. Lenore K. Beitel assisted in editing the manuscript.

CHAPTER 3 – A Novel Method for Purification of Functional Human Proteasomes Under Native Conditions

3.1 Introduction

The proteasome is a dynamic complex involved in numerous transient interactions

with proteins that modulate its activity. Defining a protein as an integral subunit of the

proteasome thus becomes a complicated matter. Stoichiometric equivalency and multiple

identifications by multiple groups and different proteasome purification strategies are

generally all required for a protein to be accepted as a bona fide proteasome subunit.

Diverse methods for proteasome purification have proven vital to understanding function

and regulation of the proteasome complex.

Conventional purification of the 26S proteasome involves multiple

chromatographic steps and usually takes several days. This procedure involves exposure

to high salt concentrations during ion exchange chromatography, and yet eukaryotic

proteasomes maintain a stable complex of roughly 34 protein subunits. In an attempt to

more completely define the proteasome holocomplex and transiently associated proteins,

milder affinity chromatographic methods have been developed by epitope-tagging various yeast proteasomal subunit genes. These gentler purification methods elucidated the presence of 3 novel yeast proteasome subunits (Ecm29, Ubp6 and Hul5), and multiple interacting proteins [1, 2]. Recently, the first report of affinity-tagged human proteasomes was published [3]. Identification of proteasome-interacting proteins (PIP) has also been performed using two hybrid, immunoprecipitation and/or GST-pulldown assays utilizing epitope-tagging of isolated subunits or putative binding partners [4-9].

These studies clearly demonstrate that the proteasome is a dynamic complex with many transiently interacting proteins.

A number of proteins with seemingly unrelated cellular functions possess two conserved domains enabling them to act as adaptor proteins to the UPS. The ubiquitin-like domain (UBL) is a proteasome-interacting domain of 80 amino acid residues that shares 20-30% identity with ubiquitin and takes on a similar overall fold [10-12]. The ubiquitin-associated domain (UBA) is a ubiquitin-binding domain of 50 amino acid residues with preference for poly- rather than mono-ubiquitinated protein substrates [13-15]. UPS adaptor proteins containing both the UBL and UBA domains first bind to ubiquitinated substrates via their UBA domain and then deliver them to the proteasome via a UBL-proteasome interaction. Using nitrocellulose filter binding assays, the yeast proteasome receptor for UBL-containing proteins has been identified as Rpn1 [16]; however, human proteasomes possess a different UBL receptor on proteasomal subunit S5a, in a portion of the protein that does not exist in simpler organisms [11, 17].

Mutation of type 2 UBL amino acid residues required for human UBL-S5a interaction abrogated UBL-proteasome interaction, suggesting the human type 2 UBL domain has a lone proteasomal recognition site on subunit S5a [11].

The search for the proteasomal polyubiquitin binding site has yielded controversial results. Yeast subunit Rpn10/S5a was initially identified as the lone ubiquitin recognition site; however, deletion of rpn10 yielded a viable yeast strain with near-normal protein turnover rates, suggesting redundancy in other yeast proteasome subunits [18]. Investigation of human S5a-polyubiquitin binding demonstrated two regions with affinity for polyubiquitin chains on the S5a subunit, initially named PUbS1

(UIM-1) and PUbS2 (UIM-2). In vitro binding experiments with 125I-polyubiquitin demonstrated that UIM-2 has a 10-fold higher affinity for polyubiquitin than UIM-1, suggesting the two domains may have alternative functions in their endogenous context

[19]. Intriguingly, it was demonstrated using in vitro binding experiments that the type 2

UBL of hHR23a, but not a ubiquitin-hHR23a fusion, specifically bound S5a in the polyubiquitin recognition site UIM-2, suggesting the N-terminus of hHR23a is more polyubiquitin-like, than ubiquitin-like [17]. Chemical shift mapping of the hHR23a UBL

domain in complex with UIM-2 corroborated this finding [11, 20]. The binding surface

of ubiquitin is largely similar to the type 2 UBL domain, however the weaker binding

affinity of ubiquitin for UIM-2 can be explained by presence of polar and charged

residues in what otherwise would be the hydrophobic interaction surface, and the

substitution of acidic residue D52 for K47 in the basic complementary electrostatic

potential region of hHR23a UBL domain [11]. Using surface plasmon resonance, the

dissociation constant for the hHR23a UBL-S5a UIM-2 complex was determined to be 13

μM, indicating 30-fold stronger binding than that reported for the ubiquitin-UIM-2 interaction [20].
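
To put the two affinities in perspective, the following sketch applies the simple 1:1 binding isotherm, fraction bound = [L] / ([L] + Kd). It is an illustration only: the ubiquitin-UIM-2 Kd of 390 μM is not a reported value but is back-calculated from the 13 μM figure and the 30-fold difference stated above, the function name and the 50 μM ligand concentration are hypothetical, and ligand depletion is ignored.

```python
def fraction_bound(ligand_um: float, kd_um: float) -> float:
    """Fraction of receptor occupied for simple 1:1 binding with ligand in excess."""
    if ligand_um < 0 or kd_um <= 0:
        raise ValueError("ligand concentration must be >= 0 and Kd must be > 0")
    return ligand_um / (ligand_um + kd_um)


KD_UBL_UIM2_UM = 13.0    # hHR23a UBL / S5a UIM-2, as stated in the text
FOLD_DIFFERENCE = 30.0   # relative binding strength, as stated in the text
# Implied, not reported: Kd for ubiquitin / UIM-2 under the 30-fold assumption.
KD_UB_UIM2_UM = KD_UBL_UIM2_UM * FOLD_DIFFERENCE

LIGAND_UM = 50.0         # hypothetical ligand concentration for illustration

occupancy_ubl = fraction_bound(LIGAND_UM, KD_UBL_UIM2_UM)
occupancy_ub = fraction_bound(LIGAND_UM, KD_UB_UIM2_UM)

print(f"UBL occupancy of UIM-2 at {LIGAND_UM:.0f} uM: {occupancy_ubl:.2f}")
print(f"Ubiquitin occupancy of UIM-2 at {LIGAND_UM:.0f} uM: {occupancy_ub:.2f}")
```

Under these assumptions the UBL domain occupies most of the available UIM-2 sites at a concentration where monoubiquitin occupies only a small fraction, which is the property exploited by the affinity matrix.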

These studies demonstrate a high-affinity, highly-specific interaction of hHR23a

UBL with the UIM-2 domain of proteasome subunit S5a/PSMD4. It was therefore decided to use the hHR23a UBL domain as an affinity chromatography matrix for the purification of the holoproteasome complex from human cell lines using mild, nondenaturing, low-salt conditions. This strategy will be compared to a recently

published affinity tag-based human proteasome purification using similar conditions [3].

It is probable that the GST-UBL affinity purification scheme will allow the copurification of unknown putative PIP, and potentially elucidate new species-specific proteasome

subunits. To eliminate confusion stemming from the numerous systems of proteasome

subunit nomenclature, all proteasome proteins will henceforth be referred to using their

yeast subunit names/Gene ID designations, while all non-proteasomal proteins will simply be referred to using accepted Gene ID designations.

3.2 Results

To create an affinity matrix for purification of proteasomes from human cell lines, the UBL domain of RAD23A (residues 1-87) was expressed as a fusion protein with GST

(GST-UBL). GST-UBL was immobilized on glutathione-sepharose 4B for purification of the human proteasome from 5-7 mg of cleared Human Embryonic Kidney-293 (HEK-

293) cell lysate under mild conditions. To elute proteasomes from the GST-UBL affinity matrix, a 20-fold molar excess of a solution-phase peptide derived from the proteasome UBL receptor, Rpn10/PSMD4 (residues 274-298, hUIM-2-HIS), was added to the matrix to compete for UBL binding. Proteasome complexes were thus competed off the GST-UBL affinity matrix. To remove the unbound hUIM-2-HIS, the proteasome fraction was adsorbed onto nickel-nitrilotriacetic acid (Ni2+-NTA) agarose beads. To our

knowledge, this is the first successful utilization of an immobilized UBL domain as a

matrix for affinity purification of functional proteasomes from a human-derived cell line.

Both functional and structural approaches were employed to confirm the activity and presence of intact 26S proteasomes.

3.2.1 Functional analysis of affinity purified proteasomes

Functional validation of proteasome purification was performed using a

Fluorogenic Substrate Assay (FSA) with the proteasome substrate Suc-LLVY-AMC.

The chymotrypsin-like activity of the proteasome hydrolyzes the peptide bond formed between the tyrosine residue and the 7-amido-4-methylcoumarin (AMC) moiety, activating the AMC fluorophore. The reaction was monitored by measuring fluorescence intensity at 460 nm. HEK-293 lysates demonstrated robust proteolytic activity against

Suc-LLVY-AMC (10 μM) that was almost completely inhibited by treatment with the proteasome inhibitor MG132 (5 μM). GST-UBL affinity purification of proteasomes from HEK-293 cell lysates was performed, and the purified fractions were evaluated by

FSA. The results were normalized to a standard curve for determination of absolute activity of the purified fraction (Figure 3.1). A typical GST-UBL affinity purification yielded a fraction with total fluorogenic peptide cleavage activity of 1400-2500 pmol/hr.

Parallel purifications with untagged GST yielded totally inactive fractions (Figure 3.1,

A). Elution of proteasome complexes was performed using competition with a human-derived hUIM-2-HIS peptide, yielding a solution-phase fraction with MG132-sensitive proteolytic activity against Suc-LLVY-AMC. Interestingly, a UIM-2 peptide of mouse origin (Rpn10/mPSMD4 residues 244-351, HIS-mUIM-2) was completely ineffective for eluting proteasomes bound to the GST-UBL affinity matrix

(Figure 3.1, B), suggesting a species-specific UBL-UIM-2 domain interaction. Stability of the holoproteasome complex isolated by GST-UBL affinity chromatography in the presence and absence of ATP or analogue ATP-γ-S was analyzed by FSA. Interestingly,

ATP-γ-S treated proteasomes consistently outperformed ATP-treated controls, while proteasomes isolated in the absence of ATP were much less proteolytically active (Figure 3.1, C). While MG132 is not absolutely specific for the proteasome, LC-MS/MS analysis

of the entire proteasome purification fraction identified no other proteases for which

MG132 has inhibitory potential (Table 3.2, 3.3). Therefore, while MG132-sensitivity

might not be a quantitative indicator for proteasome activity in cell lysates, its application

should suffice for quantitative comparisons between purified fractions.
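
The normalization of fluorescence readings to a standard curve, as used above to express activity in pmol/hr, can be sketched as follows. This is a minimal illustration and not the analysis actually performed: the standard-curve values, the fluorescence readings, and the function names are all hypothetical, and a simple linear least-squares fit of fluorescence against free AMC is assumed.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def activity_pmol_per_hr(rfu_start, rfu_end, hours, slope_rfu_per_pmol):
    """Convert a change in fluorescence into pmol of AMC released per hour."""
    return (rfu_end - rfu_start) / slope_rfu_per_pmol / hours


# Hypothetical standard curve: pmol of free AMC vs. fluorescence (arbitrary units).
STD_AMC_PMOL = [0.0, 250.0, 500.0, 1000.0, 2000.0]
STD_RFU = [20.0, 520.0, 1020.0, 2020.0, 4020.0]

slope, intercept = fit_line(STD_AMC_PMOL, STD_RFU)

# Hypothetical readings for one purified fraction at the start and end of a 1 h assay.
activity = activity_pmol_per_hr(rfu_start=100.0, rfu_end=3700.0,
                                hours=1.0, slope_rfu_per_pmol=slope)
print(f"slope = {slope:.2f} RFU/pmol, activity = {activity:.0f} pmol/hr")
```

Because only the change in fluorescence is converted, the intercept of the standard curve (background fluorescence) cancels out of the activity estimate.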

3.2.2 Structural analysis of affinity purified proteasomes

Immunoblotting of HEK-293 lysates and purified fractions demonstrated GST-

UBL affinity chromatography immobilized functional proteasomes to the affinity column

(Figure 3.2, lane 2). Moreover, hUIM-2 peptide effectively competed for UBL binding,

eluting proteasomes into the solution phase (Figure 3.2, lane 3). As demonstrated with

the fluorogenic substrate assay, proteasome elution from the GST-UBL affinity matrix

required human UIM-2, as the mouse version of the peptide HIS-mUIM-2 was

ineffective (Figure 3.2, lane 5).

3.2.3 Characterization of the holoproteasome complex by mass spectrometry

GST-UBL affinity purified proteasomes were analyzed by DALPC (Direct

Analysis of Large Protein Complexes) [21]. The solution phase proteasome purification

fraction was digested overnight by trypsin, and the resultant peptides were fractionated

by ion exchange chromatography. Peptides were eluted iteratively from the ion exchange

column by increasing salt concentrations in successive washes. The elution washes were

direct injected into a QTRAP LC-MS/MS by an electrospray interface. Peak lists were

created by Mascot Distiller v1.0, and were submitted to Mascot v1.9.03 for peptide

assignments searched against the National Center for Biotechnology Information (NCBI) nonredundant human database.

Six separate proteasome purifications were analyzed by DALPC. In 3 trials, 2 mM ATP was included in all purification steps; another two trials were performed in the absence of ATP and in the presence of the ATP-γ-S analogue, respectively; and a final trial was performed with untagged GST as a negative control. Purification in the presence of ATP yielded a total of 86 proteins. Percentage purity of the sample was calculated on the basis of the number of distinct proteasome subunit queries, prorated on the basis of shared queries, divided by the total number of distinct prorated queries, and was found to be 69.6%, 79.0%, and 63.5% (Trials 1, 2, 3, respectively). Five criteria were used to eliminate uninformative protein assignments: introduction to the sample by the purification scheme, redundancy, presence in the negative control, known contaminant proteins, and cellular localization in the endoplasmic reticulum or mitochondria. Of the identified proteins, GST, F2 (thrombin) and RAD23A were immediately disregarded as these proteins were introduced into the sample as part of the purification scheme. An additional 16 protein assignments were disregarded as they were redundancies of other protein assignments (n=9), or they were identified by a parallel DALPC of non-UBL-tagged GST affinity chromatography purification, performed as a negative control (n=7, data not shown). Nine more protein assignments were disregarded, including actin, tubulin, HSPA5 (aka BiP), GRPEL1, albumin, keratins 1, 2, and 10, and one protein whose record has been removed from the database since the initial search. Table 3.1 displays the list of proteasome subunits identified by mass spectrometry in at least 2 ATP trials.
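
The percentage purity calculation described above can be sketched as follows. This is one possible reading of "prorated on the basis of shared queries", stated here as an assumption: each distinct query contributes a total weight of 1, split equally among all proteins to which it was assigned, and purity is the share of that weight falling on proteasome subunits. The function name and the toy query table are hypothetical and are not taken from the mass spectrometry data.

```python
def percent_purity(query_to_proteins, proteasome_subunits):
    """Prorated proteasome queries divided by total prorated queries, in percent."""
    proteasome_weight = 0.0
    total_weight = 0.0
    for proteins in query_to_proteins.values():
        if not proteins:
            continue  # unassigned queries carry no weight
        share = 1.0 / len(proteins)  # a shared query is split equally
        for protein in proteins:
            total_weight += share
            if protein in proteasome_subunits:
                proteasome_weight += share
    if total_weight == 0:
        raise ValueError("no assigned queries")
    return 100.0 * proteasome_weight / total_weight


# Toy example (hypothetical assignments, for illustration only).
toy_queries = {
    "q1": ["PSMA1"],            # unique to a proteasome subunit
    "q2": ["PSMA1", "PSMA2"],   # shared between two proteasome subunits
    "q3": ["PSMD4", "USP5"],    # shared between a subunit and a non-subunit
    "q4": ["ACTB"],             # unique to a non-proteasome protein
}
toy_subunits = {"PSMA1", "PSMA2", "PSMD4"}

purity = percent_purity(toy_queries, toy_subunits)
print(f"purity = {purity:.1f}%")
```

In this toy case the proteasome subunits receive 2.5 of the 4 query weights, so a query shared with a non-proteasome protein lowers the purity by half a query rather than a whole one.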

In total, 32 proteasome subunits were identified in 2 of the 3 purification trials, with the

notable exception of subunits β2/PSMB7, Rpt4/PSMC6, (S5b)/PSMD5, (p27)/PSMD9,

and Gankyrin/PSMD10, none of which were identified in any purification with ATP. An

additional 26 proteins were identified, and are considered as putative PIP (Table 3.2). It

is important to note that although 19 of the putative PIP were identified in only a single

trial, 8 of these have a confirmed role in the UPS, and are likely to be true PIP.

Conversely, highly expressed proteins identified in the negative control (CBR1,

HSPA1A, KRT9) were identified in all 2 mM ATP trials. Therefore, because

identifications of proteins with a confirmed physical interaction with the proteasome

were made in only a single trial, and because contaminant proteins also identified in the

negative control were identified in every trial, identification in multiple trials was not

required for scoring a protein as a putative PIP. Of the nonproteasome protein

assignments, 17 of a total of 26 (65.4%) have a confirmed role in the UPS. This cohort

of putative PIPs includes 6 deubiquitinases (DUBs), 3 proteins with a ubiquitin conserved

domain, 3 nonclassical proteasome subunits (Rpn13/ADRM1, PA200/PSME4,

Sem1/SHFM1), 2 E3 ligases, and ubiquitin.
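
The bookkeeping behind the counts reported for the 2 mM ATP trials can be checked with a few lines of arithmetic. The numbers below are those stated in the text; only the variable names are introduced here for illustration.

```python
# Counts stated in the text for the 2 mM ATP purifications.
total_assignments = 86
introduced_by_scheme = 3          # GST, F2 (thrombin), RAD23A
redundant = 9
in_negative_control = 7
other_disregarded = 9             # actin, tubulin, HSPA5, GRPEL1, albumin, keratins, removed record

remaining = (total_assignments - introduced_by_scheme
             - redundant - in_negative_control - other_disregarded)

proteasome_subunits = 32
putative_pip = remaining - proteasome_subunits

ups_confirmed = 17
percent_ups = 100.0 * ups_confirmed / putative_pip

print(f"remaining assignments: {remaining}")
print(f"putative PIP: {putative_pip}")
print(f"putative PIP with a confirmed UPS role: {percent_ups:.1f}%")
```

The 58 assignments that survive filtering split into the 32 proteasome subunits and 26 putative PIP, and 17 of 26 corresponds to the 65.4% quoted above.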

It is generally regarded that formation and maintenance of the 26S proteasome

complex is ATP dependent, and the complex breaks down into its 19S and 20S

components in the absence of ATP [22]. However, recent investigations have

surprisingly found the complex to be stable in the absence of ATP when purification is performed in the absence of high salt wash steps [1]. Moreover, recent reports in yeast demonstrated that hydrolysis of ATP can regulate the composition of the holoproteasome and associated PIP [2]. UBL-affinity purification was therefore performed in the absence

of ATP, and in the presence of the non-hydrolysable ATP analogue ATP-γ-S (2 mM) to

examine the stability of the proteasome complex, and to identify any ATP-regulated putative PIP from a human cell line. Tryptic digest/DALPC was performed on the resulting complexes, and peptide identification/database searching was performed as described for ATP-purified samples.

Purification in the absence of ATP resulted in identification of 69 unique protein assignments, and 27 proteasome subunits. Percentage purity of the sample, as described above, was determined to be 54.6%. The “missing subunits” in the 0 mM ATP trial were solely located in the 20S complex (α2/PSMA2, α5/PSMA5, β6/PSMB1, β3/PSMB3,

β5/PSMB5), and the number of prorated queries of 20S core subunits was consistently lower than the 2 mM ATP-purified sample. In contrast, the number of proteasome subunits and prorated queries from the 19S regulatory particle were not different between the ATP conditions. The decrease in number of proteasome subunits identified, and the reduction in percent proteasome purity reflected a lack of proteasome stability in the absence of ATP, exclusively located in the 20S core. Several 20S proteasome subunits were still identified in the absence of ATP with reduced sequence coverage (α6/PSMA1,

α7/PSMA3, α3/PSMA4, α1/PSMA6, α4/PSMA7, β4/PSMB2, β7/PSMB4, β1/PSMB6).

This result may reflect simple dissociation of the 20S from the 19S and identification of residual intact 20S complexes; alternatively it may reflect a partially dissociated 20S core. It is intriguing to note the only other reported affinity purification of human proteasomes found that a 19S tag-based purification in the absence of ATP yielded

“partially copurified” 20S complex [3]. An additional 22 protein assignments were made in the 0 mM ATP trial (Table 3.3). Of these, a remarkable 68.0% have a confirmed

role in the UPS. This list includes 5 DUBs, 3 proteins with a ubiquitin conserved

domain, 3 E3 ligases, an E2 ubiquitin conjugating enzyme, and ubiquitin.

Proteasomes were also purified in the presence of the non-hydrolysable ATP analogue ATP-γ-S, in the hopes of elucidating novel ATP-regulated human putative PIP as performed in yeast by Verma et al. [2]. Purification with ATP-γ-S resulted in a total of

64 protein assignments, with a calculated proteasome purity of 53.3%. Eighteen protein

assignments were culled according to the previously described criteria, leaving a total of

46 proteins. Thirty-one proteasome subunits were thus identified. Only Rpt4/PSMC6,

(S5b)/PSMD5, (p27)/PSMD9, and Gankyrin/PSMD10 were not identified, and these

subunits were not identified in any trial. Of the remaining 15 protein assignments, 10

have a confirmed role in the UPS (Table 3.3): 4 DUBs, 3 E3 ligases, and 2 ubiquitin

conserved domain proteins. The lists of nonproteasomal UPS proteins identified in the 0

mM ATP and 2 mM ATP-γ-S purification schemes are largely overlapping, indicating

this strategy is unlikely to be useful in discrimination of ATP-regulated putative PIP. It is

also acknowledged that high throughput LC-MS/MS data often suffer from a lack of

reproducibility, so caution should be exercised in drawing biological conclusions from

samples from the 0 mM ATP, and 2 mM ATP-γ-S trials. The large percentage of

proteins identified by LC-MS/MS in these samples with a confirmed physical interaction

with the proteasome suggests these protein identifications are not spurious, and are likely

to be putative PIP.

The rate of false positive peptide assignment was calculated to be 0.03% on the

basis of searching all queries against a randomized version of the NCBI database

(273,951 sequences). Thirteen queries were matched with a peptide with Mascot PEPTIDESCORE ≥ IDScore, as compared to 2918 queries matched when searching against the NCBI database.
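
The relationship between these numbers can be illustrated with a short calculation. The sketch is hedged in two ways. First, the total number of queries submitted is not given here, so the 0.03% figure is not reproduced; the code only back-calculates the total that the figure would imply if the rate were defined as randomized-database matches divided by all queries, which is an assumption. Second, the ratio of randomized-database matches to real-database matches is shown as an alternative and commonly used estimate, not as the calculation performed above.

```python
decoy_matches = 13        # queries matched against the randomized database (from the text)
target_matches = 2918     # queries matched against the NCBI database (from the text)
reported_rate_pct = 0.03  # false positive rate stated in the text

# Alternative estimate: decoy matches relative to accepted target matches.
decoy_to_target_pct = 100.0 * decoy_matches / target_matches

# Assumption: reported rate = decoy matches / total queries submitted.
implied_total_queries = round(decoy_matches / (reported_rate_pct / 100.0))

print(f"decoy/target ratio: {decoy_to_target_pct:.2f}%")
print(f"total queries implied by a {reported_rate_pct}% rate: ~{implied_total_queries}")
```

Either way, the number of matches obtained against the randomized database is small relative to the number of accepted assignments.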

3.3 Discussion

A new method was developed for the isolation of functional proteasomes of human origin together with a subset of dynamically interacting proteins. This single-step affinity-based procedure is rapid and requires small amounts of starting material. The type 2 UBL domain from RAD23A, expressed as a fusion protein with GST, is used as an affinity matrix for the proteasome, based on the well-documented high-affinity complex formed between UBL and the UIM-2 domain of Rpn10/PSMD4. This interaction allows isolation and analysis of functional human proteasomes under native conditions.

Classical chromatographic purification of proteasomes exposes the complex to conditions of high salt and other chaotropic agents that abrogate binding of the many transiently interacting proteins that modulate the activity of the proteasome holocomplex.

In addition, conventional purification techniques are time-consuming and require expensive chromatographic apparatus and large amounts of starting material.

Immunoaffinity methods based on monoclonal antibodies are another option for proteasome purification. These methods have been successful [23], but the presence of antibodies and harsh elution conditions limit functional studies of the purified proteasomes. Finally, epitope-tagging of proteasome subunit genes for affinity purification of proteasomes has been performed in both yeast [1, 2, 24] and a human cell line [3] with great success, and has thus far provided the most complete analysis of the dynamic interactions of the UPS. The GST-UBL affinity method described here is comparable to these techniques, but is slightly less specific and produces lower yields.

However, the GST-UBL scheme does enjoy the advantage of not requiring an intrinsic change in any proteasome subunit introduced by epitope-tagging, and therefore avoids potential alteration in UPS activity from addition of a proteasomal epitope-tag, and could theoretically be used to isolate proteasomes from any human cell line or tissue. It is demonstrated that isolation of proteasome fractions using GST-UBL affinity chromatography allows identification of proteasome-interacting proteins. However, it is acknowledged that the RAD23A UBL domain may make several non-proteasome interactions that likely complicate this analysis, and therefore this technique is more accurately termed a proteasome-UBL domain interaction network purification scheme.

This technique is complementary to established reports of proteasome purification schemes, and provides valuable insights into proteasome dynamics.

In total, 34 proteasome subunits were identified with the notable exception of

Rpt4/PSMC6, (S5b)/PSMD5, (p27)/PSMD9 and Gankyrin/PSMD10. Fourteen subunits of the 20S catalytic core were identified, although β2/PSMB7 was only identified in the

ATP-γ-S trial. Nineteen subunits from the 19S regulatory particle were identified, including three recent assignments: Rpn15/Sem1/SHFM1, Ubp6/USP14, and

Rpn13/ADRM1. Sem1/SHFM1 has been shown to be a salt-labile proteasome interactor, and likely a bona fide proteasome subunit in both yeast and human [25, 26].

Ubp6/USP14, another human orthologue of a salt-labile yeast proteasome subunit

(Ubp6), has also been shown to associate with the human proteasome by successful coimmunoprecipitation of the 20S proteasome with 125I vinyl sulfone-labeled Ubp6/USP14 [6]. Rpn13/ADRM1 has recently been described as a stoichiometric proteasome subunit with a specialized nonessential role in proteasomal recruitment of

UCHL5 (also identified) [27, 28]. Finally, the alternative 20S proteasome activator

PA200 was also identified. This result is intriguing considering the UBL purification scheme necessitates an intact 19S subunit, suggesting the presence of a hybrid proteasome [3, 29].

Two recent papers have used LC-MS/MS to examine the subunit composition of mammalian proteasomes purified by affinity, and classical techniques [3, 30]. In comparison, the list of proteasome subunits identified here is largely overlapping.

Notably, the GST-UBL scheme failed to detect proteasome subunits Rpt4/PSMC6,

(S5b)/PSMD5, (p27)/PSMD9 and Gankyrin/PSMD10. While both alternative experimental designs identified Rpt4/PSMC6 with excellent sequence coverage, the

Wang et al. affinity-based purification of human proteasomes identified neither

(S5b)/PSMD5 nor (p27)/PSMD9, although Gankyrin/PSMD10 was identified with good sequence coverage. However, in the classically-based purification of murine cardiac proteasomes described by Gomes et al., neither (p27)/PSMD9 nor Gankyrin/PSMD10 was identified, while (S5b)/PSMD5 was identified reproducibly, but with substantially reduced sequence coverage. If anything, these ambivalent data suggest a weak proteasome association of (S5b)/PSMD5 and Gankyrin/PSMD10; proteasome incorporation of these subunits is likely dynamic, and identification is dependent on several variables, including but not limited to purification conditions and source material, while

(p27)/PSMD9 is unlikely to be a bona fide proteasome subunit. In addition, the 3 interferon-γ inducible β subunits were absent from both affinity-based procedures

(performed in similar human cell lines), while all 3 subunits were present in murine

cardiac tissue. Moreover, Rpn13/ADRM1, Rpn15/Sem1/SHFM1, PA200/PSME4,

UCHL5, UBC, and UBE3A were identified only by the affinity-based purification schemes, providing evidence that these proteins interact with the proteasome in a tenuous complex that is easily broken by conventional chromatographic purification methods.

Finally, the GST-UBL affinity technique described here identified many more nonproteasomal proteins than were reported in either of the other techniques, suggesting its utility in identifying proteasome-interacting proteins.

Classification of co-purified proteins on the basis of functional groups or conserved domains was performed. Of the 42 non-proteasomal proteins identified with this strategy, 22 (52.4%) have a confirmed role in the ubiquitin-proteasome pathway.

These proteins include deubiquitinases (DUBs), E2 and E3 enzymes, and proteins with ubiquitin conserved domains (UBX/UBL/UBA/UIM). For proteins without a confirmed role in the ubiquitin-dependent proteasome pathway, the cohort was enriched in proteins associated with DNA repair. This is intriguing considering recent literature linking the proteasome and DNA repair processes [26].

At least 9 proteins harboring a ubiquitin conserved domain were identified as putative PIP. While it is possible that ubiquitin domain-containing proteins were identified by GST-UBL affinity purification due to a direct interaction with the GST-UBL affinity matrix rather than interaction with the proteasome, there is strong evidence against this conclusion. Two-hybrid studies of yeast Dsk2 protein homodimerization demonstrated a complete lack of UBL-UBA or UBL-UBL physical interaction in a physiological context in the presence of ubiquitin [31]. However, purified

Dsk2 UBL and UBA domains were crystallized in a weak 1:1 complex, suggesting that

UBA and UBL can interact. Surface plasmon resonance binding studies measured the Kd for the UBA-Ubiquitin interaction to be tenfold greater than for the UBA-UBL interaction [32]. It is therefore likely that UBA domains do not interact with UBL domains in the presence of ubiquitin. Presence of cellular ubiquitin in the GST-UBL affinity purification scheme likely prevented the GST-UBL from making direct contact with UBA or UBL domain proteins, meaning purification and subsequent identification of proteins with these domains were likely the result of their interaction with the proteasome.

UIM domain-containing proteins identified by GST-UBL affinity chromatography are less likely to be bona fide proteasome interactors than other ubiquitin domain proteins. The GST-UBL was used as a proteasome affinity matrix specifically because of the high affinity complex formed between the RAD23 UBL domain and the

UIM-2 domain of Rpn10/PSMD4. It is likely that other UIM domain proteins make direct contact with the GST-UBL, and therefore cannot be considered as PIP. It should be noted that the UBL domain of RAD23A selectively binds to 1 of the 2 UIM domains of Rpn10/PSMD4 (UIM-2, and not UIM-1) suggesting alternate functions and perhaps that UIM domains from different proteins have specific binding partners [17]. It is therefore possible that the identified UIM domain-containing proteins are bona fide PIP.

A closer investigation of each case is necessary to rule out the possibility of a direct

UIM-UBL interaction. Five UIM domain proteins were identified by GST-UBL affinity chromatography: ATXN3, EPS15, HUWE1, UIMC1 and USP25. ATXN3, a PolyQ

Expansion Disorder protein, coimmunoprecipitated with the RAD23A UBL in HEK-293

cells [33], and thus its identification in this scheme was likely the result of direct

interaction with the GST-UBL affinity matrix rather than copurification with the

proteasome. EPS15, a component of the endocytic pathway, harbors 2 UIM domains, the

first of which was shown to directly interact with the type 2 UBL of ubiquilin [34]

suggesting a direct interaction with the GST-UBL. There is currently no information

regarding HUWE1, USP25, or UIMC1 binding to UBL domain proteins; biochemical

analysis must be performed to substantiate proteasome interaction of these proteins.

Proteasome-mediated protein catabolism is an ATP-dependent process. Not only

is ATP required for enzymatic processing of ubiquitinated substrates, but it is also

required for stability of the intact complex. It was thus believed that varying ATP

conditions during the GST-UBL purification scheme would result in dramatically different composition of the purification fraction, both in proteasome subunit composition and in the identity of putative PIP.

GST-UBL affinity purification of proteasomes in the absence of ATP resulted in a reduction of protein assignments from the 20S catalytic core, suggesting decreased stability of the proteasome in the absence of ATP. However, the absence of ATP did not result in a decrease in the number of putative PIP identifications as compared to purifications performed with either 2 mM ATP or 2 mM ATP-γ-S. In addition, the list of putative PIP identifications from different ATP conditions was largely overlapping. This implies that the current list of putative PIP is largely comprised of proteins that are not regulated by ATP.
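
The degree of overlap between putative PIP lists from different ATP conditions can be quantified, for example with a Jaccard index (shared identifications divided by all identifications). The sketch below is illustrative only: the protein identifiers are placeholders rather than entries from Tables 3.2 and 3.3, and the Jaccard index is a summary statistic introduced here, not one reported in this analysis.

```python
def jaccard(a: set, b: set) -> float:
    """Size of the intersection divided by size of the union of two sets."""
    if not a and not b:
        return 1.0  # two empty lists are treated as identical
    return len(a & b) / len(a | b)


# Placeholder identifiers standing in for putative PIP from two conditions.
pip_with_atp = {"PIP_A", "PIP_B", "PIP_C", "PIP_D"}
pip_without_atp = {"PIP_A", "PIP_B", "PIP_C", "PIP_E"}

shared = pip_with_atp & pip_without_atp
only_with_atp = pip_with_atp - pip_without_atp
only_without_atp = pip_without_atp - pip_with_atp
overlap = jaccard(pip_with_atp, pip_without_atp)

print(f"shared: {sorted(shared)}")
print(f"only with ATP: {sorted(only_with_atp)}")
print(f"only without ATP: {sorted(only_without_atp)}")
print(f"Jaccard index: {overlap:.2f}")
```

Proteins appearing in only one condition would be the candidates for ATP-regulated association, subject to the reproducibility caveats already noted for single-trial LC-MS/MS data.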

GST-UBL affinity chromatography allows the proteasome to be purified in mild, low-salt conditions that permit copurification of putative PIPs. MS/MS analysis of the

entire purification fraction was performed in order to identify novel proteins that may modulate the activity of the proteasome complex. The list of identified proteins is highly enriched in 4 functional groups: deubiquitinases, ubiquitin conjugating/ubiquitin ligase enzymes, ubiquitin domain-containing proteins, and proteins involved in DNA repair.

3.3.1 Group 1: Deubiquitinases

Covalent attachment of polyubiquitin chains onto protein substrates constitutes the degradation signal for 26S proteasome-mediated protein destruction. Ubiquitination, catalyzed by E3 ligase enzymes, is highly coordinated and plays a key regulatory role in myriad cellular processes. DUBs are a class of enzyme that reverses the process of ubiquitination, providing yet another level of regulation for the UPS. DUBs contribute to

UPS activity in ambivalent roles: by removing polyubiquitin tags from targeted protein substrates they can rescue proteins from 26S-mediated destruction; however, DUBs can also increase global UPS proteolysis by efficient recycling of free ubiquitin [35]. DUBs have been reported to be proteasome-associated or free; indeed, the proteasome harbors an intrinsic DUB: 19S subunit Rpn11/PSMD14 has ubiquitin hydrolase activity that is thought to catalyze a rate-limiting step in 26S degradation [36, 37]. GST-UBL affinity chromatography identified 8 known DUBs: Rpn11/PSMD14, ATXN3, EEF1A1,

UCHL5, USP5, USP13, Ubp6/USP14, and USP25. Two of these proteins,

Rpn11/PSMD14 and Ubp6/USP14 are stoichiometric constituents of the proteasome, while UCHL5 has been shown to be a substoichiometric proteasome interactor responsible for polyubiquitin chain editing [38].

Although DUB activity is essential for efficient proteasome function, disruption of yeast orthologues of Ubp6/USP14 and UCHL5 does not compromise overall UPS activity, suggesting functional redundancy via other DUBs [39]. It is therefore likely that other DUBs routinely associate with the proteasome, and GST-UBL affinity chromatography has likely identified these proteasome-associated DUBs. USP5 (aka

Isopeptidase-T) plays a key role in maintaining pools of free ubiquitin via its disassembly of preformed polyubiquitin chains; however, a physical association with the proteasome has not previously been demonstrated. It is possible that USP5 interacts with the proteasome only in the context of a larger complex, as it harbors a UBA domain. USP13, with 67.3% nucleotide conservation and 54.8% amino acid identity, clearly shares a common ancestral gene with USP5; however, reported differences in expression pattern and overall sequence identity have suggested alternative roles for the two enzymes [40].

A physical interaction with the proteasome has not previously been reported for USP13.

EEF1A1, the eukaryotic translation elongation factor 1 alpha 1, is principally responsible for the enzymatic delivery of aminoacyl tRNAs to the ribosome. In addition to its anabolic role in protein synthesis, it has also been demonstrated to harbor ubiquitin isopeptidase activity under catabolic conditions [41]. As previously mentioned, ATXN3 and USP25 both contain UIM domains and as such may directly interact with the GST-UBL affinity matrix rather than the proteasome. It is not surprising that DUBs constitute the largest group of putative PIPs identified in this analysis, considering the requirement of deubiquitination for efficient UPS function.

3.3.2 Group 2: Ubiquitin Conjugating/Ubiquitin Ligase Enzymes


E3 ubiquitin ligase activity is generally the terminal step in a highly regulated enzyme cascade targeting a protein substrate for proteasome-mediated proteolysis. With the large number of specific E3 ligases it is surprising that so few have been shown to have a physical interaction with the proteasome, especially considering the finding that the polyubiquitin receptor of the proteasome is dispensable in yeast. GST-UBL affinity chromatography identified 4 E3 ligase enzymes and an E2 ubiquitin conjugation enzyme.

HUWE1 (aka E3Histone, LASU1) is an E3 ligase with 4 conserved UPS domains: C-terminal HECT, UBA, UIM and WWE, and has a confirmed role in ubiquitination of histones [42]. UBE3A (aka E6-AP) is another HECT domain E3 ligase with a wide range of substrates including estrogen receptor and p53. UBE3A has recently been identified as a proteasome-interacting protein by Wang et al. [3], and is the human ortholog of yeast protein Hul5, a salt-labile proteasome-interacting protein [1]. RAD23A is a substrate of UBE3A ubiquitination, suggesting that identification of UBE3A in this purification strategy could be a byproduct of direct GST-UBL binding. However, UBE3A has been shown to bind RAD23A independently of the UBL domain [43], making it likely that UBE3A also binds the proteasome. Intriguingly, UBE3A has been demonstrated to form a complex with the proteasome and the UBL domain of hPLIC in HeLa cells [44]. It is likely that E3 enzymes and UBL domain proteins act in a coordinated manner to deliver targeted proteins to the proteasome.

UBAC1 (aka KPC2) has recently been identified as an integral member of KPC (kip1 ubiquitination-promoting complex), an E3 ligase complex responsible for the cytoplasmic ubiquitination of p27kip1. The identification of UBAC1 via GST-UBL affinity chromatography agrees with a previous claim that UBAC1 interacts with the proteasome via a UBL domain. UBAC1 probably acts as a shuttling factor, delivering polyubiquitinated p27kip1 to the proteasome following ubiquitination by KPC [45].

ZUBR1 (aka p600) is a 600 kDa component of the pRB-associated protein complex, and harbors a RING finger domain similar to the E3 ligase N-recognin [46]. It has also been implicated as a UPS protein via its identification in a ubiquitin affinity extract [47]; however, a bona fide role in protein ubiquitination has not yet been confirmed. UBE2N (aka UBC13) was identified as a putative PIP via GST-UBL affinity chromatography. This E2 enzyme is involved in the synthesis of Lys-63-linked ubiquitin chains, and has a proven function in post-replication DNA repair (PRR) [48], a process that requires 20S proteasome activity and is distinct from the nucleotide excision repair (NER) activity of RAD23A [48, 49].

3.3.3 Group 3: Ubiquitin Domain Proteins

Two other ubiquitin domain-containing proteins were identified by GST-UBL affinity chromatography. Fas-Associated Factor (FAF1) may act as a scaffolding factor in the UPS: it is known to bind polyubiquitinated proteins via its UBA domain, and to interact directly with Valosin-Containing Protein (VCP), a known proteasome targeting protein, via a UBX domain [50]. Most likely, FAF1 has a physical interaction with the proteasome only in the context of a larger complex. Identification of FAF1 via GST-UBL affinity chromatography reinforces its role as a ubiquitin-proteasome pathway protein. Another potential UPS scaffolding protein, SQSTM1, was also identified by GST-UBL purification. SQSTM1 harbors a UBA domain capable of binding Lys63-polyubiquitin chains, and an N-terminal PB1 domain that is structurally similar to the UBL domain. Interaction between SQSTM1 and proteasome subunit Rpn10/PSMD4 has been confirmed via coimmunoprecipitation and GST pull-down experiments [51]. This protein likely delivers Lys63-polyubiquitinated proteins to the proteasome for degradation.

3.3.4 Group 4: DNA Repair

The proteasome has been strongly implicated in a number of conserved pathways of DNA repair. The 19S regulatory particle is thought to have chaperone-like functions vital to NER mediated by the action of RAD23A. In addition, 20S core proteolytic activity has been shown to play a role in decreasing UV sensitivity via the PRR pathway, independent of RAD23A. Intriguingly, GST-UBL affinity chromatography identified 7 proteins with some connection to DNA repair processes: GCC2, MORC3, ERC1, SHFM1, UBE2N, XRCC5, and Ubp6/USP14. These data strengthen the link between the UPS and DNA repair, and these proteins may provide fodder for future investigations regarding the connection between the 26S proteasome and DNA repair pathways. Confirmed proteasome-associated protein Sem1/SHFM1 is recruited with the proteasome to double strand breaks (DSB) in yeast [26]. XRCC5 (aka Ku80) heterodimerizes with XRCC6; the heterodimer binds to DSB and is known to recruit a catalytic repair enzyme. The DNA binding activity of the Ku heterocomplex is known to be downregulated by treatment with proteasome inhibitor MG132, but a physical interaction has not previously been demonstrated [52]. UBE2N was discussed earlier as a direct mediator of proteasomal recruitment for PRR. Yeast two-hybrid analysis uncovered an association between Ubp6/USP14 and the Fanconi anemia complex, which is involved in DNA repair and colocalizes with DNA damage proteins like BRCA1 [53]. GCC2 and ERC1 are both newly characterized proteins with the conserved domain SbcC, an ATPase domain involved in DNA repair. MORC3 harbors the HATPase_c conserved domain, an ATP-binding domain found in many DNA repair enzymes [54].

3.3.5 Other Proteins

Literature and database searches did not reveal a common functional theme for the remaining protein assignments. Of note, identified protein TXNL1 is part of the thioredoxin-like family, several of whose members have been implicated in UPS pathways via ubiquitin affinity chromatography [47]. In addition, BASP1, a component of the co-suppressor of the Wilms tumor suppressor protein WT1, contains several PEST sequences and is thus predicted to have a high turnover rate associated with UPS degradation.

3.4 Conclusions

The dynamic regulation of the holoproteasome complex requires high-throughput techniques to identify the numerous proteins involved in coordinated protein degradation via the UPS. GST-UBL affinity chromatography using mild conditions is demonstrated to be an effective method to purify functional proteasomes and associated complexes that would be lost by conventional proteasome purification. Using this strategy, several proteins not previously reported to interact with the proteasome were identified by tandem mass spectrometry, including 5 DUBs, 4 E3 ligases, an E2 conjugating enzyme, and 5 proteins implicated in DNA repair. Copurification with the proteasome via GST-UBL affinity purification is only the first step in validating their role in the ubiquitin proteasome system for protein degradation.

3.5 Experimental Procedures

DNA Constructs: The RAD23A UBL expression vector (pGEX-2T-UBL) was kindly provided by Juli Feigon; the construction of the vector has been described previously [20]. An expression vector encoding the entire mouse Rpn10/mPsmd4 cDNA was provided as a gift from Dr. Edward Fon. A mUIM-2 motif (corresponding to amino acids 244-351) was subcloned into HIS-tagged expression vector pET15b with PvuII/BamHI to create pET15b-mUIM-2. A Klenow DNA polymerase fill-in reaction was employed for the construction of the human Rpn10/PSMD4 UIM-2 expression vector (pGEX-KG-hUIM-2-HIS). The template DNA was a 123 bp synthetic oligonucleotide comprising sequence corresponding to amino acid residues 274-298 of the human Rpn10/PSMD4 (sequence following thrombin cleavage: GLPDLSSMTEEEQIAYAMQMSLQGAHHHHHH), plus a 5’ EcoRI restriction site in frame with the GST ORF sequence of pGEX-KG, and a 3’ in-frame 6X-HIS tag, followed by two stop codons and an NcoI restriction site. A 20 bp oligonucleotide complementary to the template DNA served as the primer. A primer extension reaction was performed at room temperature for 120 minutes. The DNA was immediately purified using the QIAquick PCR Purification Kit (Qiagen, Mississauga, ON). To eliminate unreacted oligonucleotides, the purified sample was digested overnight with 10 units Mung Bean Nuclease. The reaction product was analyzed by 1.2% agarose gel electrophoresis followed by DNA purification with the QIAquick Gel Extraction Kit (Qiagen, Mississauga, ON). The resultant DNA was digested with EcoRI and NcoI overnight, and ligated with similarly digested pGEX-KG [55]. All restriction enzymes, Klenow DNA polymerase, Mung Bean Nuclease, T4 DNA Ligase and appropriate buffers were purchased from New England Biolabs (Beverly, MA). Transformation was performed into XL2-Blue E. coli (Stratagene, West Cedar Creek, TX). The pGEX-4T-hUIM-2-HIS plasmid was verified by sequencing.
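As a quick check on the hUIM-2-HIS peptide given above, its length and approximate average molecular mass can be computed from the sequence. The sketch below is illustrative only and is not part of the original procedure: the function name is hypothetical, and the residue masses are standard average values that ignore any modifications.

```python
# Standard average residue masses (Da) for the 20 common amino acids.
AVERAGE_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.0153  # one water is added for the free N- and C-termini


def average_peptide_mass(sequence: str) -> float:
    """Return the average mass (Da) of an unmodified linear peptide."""
    return sum(AVERAGE_RESIDUE_MASS[aa] for aa in sequence) + WATER_MASS


# Sequence following thrombin cleavage, as quoted in the text.
HUIM2_HIS = "GLPDLSSMTEEEQIAYAMQMSLQGAHHHHHH"

if __name__ == "__main__":
    print(len(HUIM2_HIS), "residues")
    print(round(average_peptide_mass(HUIM2_HIS), 1), "Da (average mass)")
```

The 25 residues preceding the six histidines correspond to the stated span of residues 274-298; the mass comes out at roughly 3.5 kDa under these assumptions.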

Expression of Fusion Proteins: All GST fusion proteins were expressed and purified by glutathione affinity chromatography on glutathione-Sepharose 4B affinity beads. Twenty-five mL cultures of LB-Amp were inoculated with a glycerol stock scraping of BL21 E. coli (Stratagene) transformed with the appropriate expression plasmid, and grown overnight at 37°C with shaking at 200 rpm. Cultures were diluted 1:10 and grown to an OD600nm of 0.8 (roughly 2 hours 15 minutes); expression of fusion protein was induced by the addition of 100 μM IPTG for 3 hours at 37°C, 200 rpm. Cells were lysed by sonication and the fusion proteins were purified on glutathione-Sepharose 4B according to the manufacturer’s instructions (GE Amersham, Uppsala, Sweden). HIS-mUIM-2 was purified by nickel-nitrilotriacetic acid (Ni2+-NTA) IMAC (Qiagen) according to the manufacturer’s instructions. Typical yields of recombinant proteins were 2.75-3.0 mg from 250 mL-500 mL liquid culture.


Proteasome Purification: Human Embryonic Kidney (HEK-293) cells were grown in DMEM supplemented with 10% FBS at 37°C, 5.0% CO2. For each proteasome purification a 150 cm2 flask of HEK-293 cells was grown to 95% confluence, washed twice with D-PBS (Dulbecco’s Phosphate Buffered Saline), and harvested with Trypsin-EDTA. Cell culture media, serum and Trypsin-EDTA were purchased from Invitrogen (Burlington, ON). Following trypsinization, the cells were washed again with D-PBS, and lysed in 1.0 mL of HEPES-buffered saline pH 7.5, 50 mM NaCl, supplemented with 2 mM ATP (or the nonhydrolyzable analogue ATP-γ-S, where indicated) and 0.8% Triton X-100 (Buffer A). The lysate was centrifuged at 14,000 rpm for 8 minutes at 4°C. Following centrifugation, the lysate was filtered through cheesecloth to remove lipids remaining in solution; total protein of the cleared lysate was then measured (typical range 5-7 mg). GST-UBL (typical amount: 400 μg) bound to glutathione-Sepharose 4B was added to the cleared lysate, and allowed to incubate for 60 minutes at 4°C with constant mixing. Following incubation, the GST-UBL glutathione-Sepharose 4B column was washed 3 times with 10 column volumes of HEPES-buffered saline pH 7.5, 50 mM NaCl, 2 mM ATP (Buffer B). Proteasomes bound to the GST-UBL affinity matrix were then eluted by competition with a 20-fold molar excess of hUIM-2-HIS (1.0 mL) incubated at 37°C for 90 minutes with continuous mixing. The hUIM-2-HIS was then removed by Ni2+-NTA IMAC: Ni2+-NTA agarose (200 μL slurry, Qiagen) was added to the purified fraction and incubated for 60 minutes at 4°C. The supernatant was removed and concentrated by centrifugal ultrafiltration with Nanosep 10K filters (Pall Life Sciences, Mississauga, ON) to a final volume of approximately 100 μL.
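The 20-fold molar excess used for competitive elution translates into a peptide mass once molecular weights are fixed. The sketch below shows the arithmetic only; the function name is hypothetical, and the molecular weights used in the example (35 kDa for GST-UBL, 3.5 kDa for hUIM-2-HIS) are illustrative assumptions, not values reported in this chapter.

```python
def competitor_mass_ug(bait_mass_ug: float, bait_mw_da: float,
                       competitor_mw_da: float, fold_excess: float) -> float:
    """Mass (ug) of competitor giving `fold_excess` moles per mole of bait."""
    bait_umol = bait_mass_ug / bait_mw_da        # ug / (g/mol) = umol
    competitor_umol = fold_excess * bait_umol    # desired molar excess
    return competitor_umol * competitor_mw_da    # umol * (g/mol) = ug


if __name__ == "__main__":
    # 400 ug of bait is the typical amount stated in the text;
    # both molecular weights below are assumed for illustration.
    print(competitor_mass_ug(400.0, 35_000.0, 3_500.0, 20.0), "ug")
```

Because the peptide is much smaller than the GST fusion, a large molar excess corresponds to a modest mass of peptide.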

Fluorogenic Substrate Assay/Western Blot: Samples of purification fractions were monitored for proteasomal activity by fluorogenic substrate assay (FSA). Briefly, samples were incubated in Buffer B supplemented with 10 μM Suc-LLVY-AMC and, where indicated, 5 μM MG132 (Boston Biochem, Cambridge, MA) for one hour at 37°C. The reaction was stopped by the addition of 1.0 mL 1% SDS, and free AMC emission fluorescence was subsequently measured on an LS55 Luminescence Spectrometer (Perkin Elmer Instruments, Montreal, QC; settings: excitation 380 nm, emission 460 nm). To quantitate the substrate hydrolyzed, a standard curve was prepared with each experiment using free AMC (Sigma, Oakville, ON). Standard western blots were performed with anti-Rpt5/PSMC3 polyclonal rabbit antibody purchased from Affinity Research Products (Exeter, UK).
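Converting fluorescence readings to amounts of free AMC via a standard curve can be sketched as an ordinary least-squares line fit. This is a minimal illustration assuming a linear response over the range of the standards; the function names and the numbers in the example are hypothetical and are not data from this work.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def fluorescence_to_amc(reading, slope, intercept):
    """Invert the standard curve to recover the amount of free AMC."""
    return (reading - intercept) / slope


if __name__ == "__main__":
    # Hypothetical standards: pmol of free AMC vs. fluorescence units.
    amc_pmol = [0.0, 50.0, 100.0, 200.0]
    fluorescence = [5.0, 105.0, 205.0, 405.0]
    slope, intercept = fit_line(amc_pmol, fluorescence)
    untreated = fluorescence_to_amc(305.0, slope, intercept)
    with_mg132 = fluorescence_to_amc(25.0, slope, intercept)
    # The MG132-sensitive portion is attributed to the proteasome.
    print(untreated - with_mg132, "pmol AMC released (MG132-sensitive)")
```

Subtracting the MG132-treated reading from the untreated one isolates the inhibitor-sensitive activity, which is the quantity of interest when comparing purification fractions.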

Mass Spectrometry: Purified proteasome fractions were sent to the McGill University and Genome Quebec Innovation Centre for tryptic digest/mass spectrometry analysis. For tryptic digest, samples were adjusted to contain 50 mM ammonium bicarbonate pH 8.0, 5 mM DTT, 0.1% octyl-glucopyranoside and 10% acetonitrile and incubated at 50°C for 30 minutes before addition of iodoacetamide to a final concentration of 15 mM. Following a 15 min incubation at room temperature in the dark, 150 ng of Trypsin Gold (Promega, Madison, WI) was added and allowed to digest the entire sample overnight at 37°C. Digestion was terminated by the addition of formic acid to a final concentration of 1%. Digest solutions were desalted using michrocrom C8 clean-up traps. Bound material was eluted with 90% acetonitrile, dried and reconstituted in 3% acetonitrile:0.1% formic acid before LC-MS/MS. Digest samples were fractionated by HPLC before direct injection into the mass spectrometer. Forty μL of digest solution was loaded onto a Zorbax Bio-SCX 50 × 0.8 mm column (Agilent, Mississauga, ON), and washed for 10 minutes at 15 μL/min with a solution containing 3% acetonitrile:0.1% formic acid. Elution from the SCX (Strong Cation eXchange) column was done by stepwise injections of 40 μL of increasing salt concentrations (12 mM, 37 mM, 75 mM, 150 mM, 500 mM NaCl) in a 3% acetonitrile:0.1% formic acid solution. Eluate was sent to a Zorbax C18 5 × 0.3 mm guard column (Agilent). Elution from the guard column and analytical peptide separation was done on a Biobasic C18 10 × 0.075 mm picofrit analytical column (New Objective, Woburn, MA) using a gradient elution of 5% acetonitrile:0.1% formic acid to 95% acetonitrile:0.1% formic acid over 93 minutes.
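The peptides expected from the overnight trypsin digestion can be predicted in silico, which is also how a search engine enumerates candidates when a limited number of missed cleavages is allowed. The sketch below assumes the common convention that trypsin cleaves C-terminal to K or R except when the next residue is P; the function names and the test sequence are hypothetical.

```python
def cleavage_bounds(protein: str) -> list:
    """Fragment boundaries: cut after K/R unless followed by P (assumed rule)."""
    sites = [i + 1 for i, aa in enumerate(protein[:-1])
             if aa in "KR" and protein[i + 1] != "P"]
    return [0] + sites + [len(protein)]


def tryptic_peptides(protein: str, max_missed: int = 1) -> list:
    """List peptides containing at most `max_missed` missed cleavage sites."""
    bounds = cleavage_bounds(protein)
    last = len(bounds) - 1
    peptides = []
    for i in range(last):
        # j indexes the closing boundary; each extra step skips one site.
        for j in range(i + 1, min(i + 1 + max_missed, last) + 1):
            peptides.append(protein[bounds[i]:bounds[j]])
    return peptides


if __name__ == "__main__":
    # Made-up sequence; the R before P is not cleaved under the assumed rule.
    for peptide in tryptic_peptides("AAKRPGGRCCK", max_missed=1):
        print(peptide)
```

Allowing one missed cleavage roughly doubles the number of candidate peptides per protein, which is the trade-off behind restricting the search in this way.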

Eluted peptides were introduced directly through the nanospray needle into a QTRAP 4000 LC-MS/MS hybrid triple quadrupole/linear ion trap mass spectrometer (Sciex-Applied Biosystems, Streetsville, ON). Information-dependent MS/MS analysis of the 3 most intense ions was performed, with dynamic exclusion for 180 seconds after two occurrences. The MS mass range scanned was from 350 to 1600 m/z using enhanced MS scan with dynamic fill time. Three averaged MS/MS scans were acquired using enhanced product ion scan from 70 to 1700 m/z using a 20 millisecond linear ion trap fill time and Q0 trapping.
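The information-dependent acquisition rule described above (fragment the 3 most intense ions, then exclude an ion for 180 seconds once it has been selected twice) can be illustrated with a simplified model. This is only a sketch of the selection logic: the function name and the survey-scan data are hypothetical, ions are matched by exact m/z key rather than within a mass tolerance, and the instrument software may differ in detail.

```python
def select_precursors(survey_scans, top_n=3, max_occurrences=2, exclusion_s=180.0):
    """Pick precursors per survey scan under a dynamic exclusion rule.

    survey_scans: list of (time_s, {mz: intensity}) tuples in time order.
    Returns a list of (time_s, [selected m/z values]).
    """
    occurrences = {}     # m/z -> selections since the last exclusion began
    excluded_until = {}  # m/z -> time at which the exclusion window ends
    selections = []
    for time_s, ions in survey_scans:
        eligible = [(intensity, mz) for mz, intensity in ions.items()
                    if excluded_until.get(mz, float("-inf")) <= time_s]
        chosen = [mz for _, mz in sorted(eligible, reverse=True)[:top_n]]
        for mz in chosen:
            occurrences[mz] = occurrences.get(mz, 0) + 1
            if occurrences[mz] >= max_occurrences:
                excluded_until[mz] = time_s + exclusion_s
                occurrences[mz] = 0
        selections.append((time_s, chosen))
    return selections


if __name__ == "__main__":
    ions = {500.0: 100.0, 600.0: 90.0, 700.0: 80.0, 800.0: 70.0}
    scans = [(0.0, ions), (10.0, ions), (20.0, ions), (200.0, ions)]
    for time_s, chosen in select_precursors(scans):
        print(time_s, chosen)
```

In this toy run the three dominant ions are fragmented twice and then excluded, which lets the weaker ion at m/z 800 be sampled; this is the purpose of dynamic exclusion in a complex digest.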

Peak Identification/Database searching: MS/MS data were transferred from the LC-Q-TOF computer to a dedicated server and were automatically manipulated by TOMAS (Toolbox to Optimize MAss Spectrometry data) [56] for generation of peak lists using Mascot Distiller v1.0 from Matrix Science [57], (http://www.matrixscience.com/distiller.html) [58] with its parameters set at SNR=20 and CT=0.7. This reduced noise and produced a list of distinct peptide peaks in which all members of the isotopic clusters were collapsed into an equivalent monoisotopic peak. The peak list data were submitted to Mascot Cluster version 1.9.03 (http://www.matrixscience.com/cluster.html) [59] and searched against the NCBI nonredundant human database (ftp://ftp.ncbi.nih.gov/blast/db/FASTA/nr.gz) [60], frozen on May 11, 2006 (total database sequences: 273,951). Searching was restricted to one missed trypsin cleavage, allowing for variable oxidation of methionines, but fixed modification of alkylation of cysteines. A Peptide Mass Tolerance of 1.5 and a Fragment Mass Tolerance of 0.8 were required. In addition, a random database with the same number of entries and amino acids as the original was generated by randomly permuting each sequence in the frozen database. The peak list dataset was similarly submitted to Mascot Cluster version 1.9.03 and searched against this randomized database to calculate the rate of false positive peptide assignment. All lists of peptide assignments generated by Mascot were processed to filter those identifications likely to have arisen by chance by retaining only the peptides with a Mascot PEPTIDESCORE ≥ IDScore, resulting in retention of peptide assignments with a probability of occurring by chance of < 0.05%.
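The randomized-database control described above can be sketched in a few lines: each sequence is permuted so that the decoy database keeps the same number of entries and the same amino acid composition, and the false positive rate is estimated from the hits that survive the score filter in each search. The function names, scores and the single global threshold are illustrative assumptions; the actual filter in this work used the Mascot identity score for each peptide.

```python
import random


def permute_database(sequences, seed=0):
    """Return decoys made by shuffling the residues of each sequence."""
    rng = random.Random(seed)  # fixed seed so the decoy set is reproducible
    decoys = []
    for seq in sequences:
        residues = list(seq)
        rng.shuffle(residues)
        decoys.append("".join(residues))
    return decoys


def false_positive_rate(target_scores, decoy_scores, threshold):
    """Estimate the false positive rate as decoy hits / target hits."""
    target_hits = sum(score >= threshold for score in target_scores)
    decoy_hits = sum(score >= threshold for score in decoy_scores)
    return decoy_hits / target_hits if target_hits else 0.0


if __name__ == "__main__":
    database = ["MKTAYIAKQR", "GLPDLSSMTEEEQ"]  # made-up entries
    print(permute_database(database))
    # Hypothetical peptide scores from the real and randomized searches.
    print(false_positive_rate([60, 55, 40, 30, 20], [35, 22, 10, 5, 3], 30))
```

Because a permuted sequence cannot contain genuine peptides except by chance, any decoy match above threshold is a direct estimate of how often the real search produces a spurious assignment.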

The high confidence peptide assignments were then matched to protein sequences to produce an initial list of protein identifications. To reduce redundancy in the dataset, and to afford comparisons across datasets, the protein lists were then processed by a grouping algorithm [61] to generate a list of distinct protein groups. A representative protein for each group was automatically selected to generate a list of proteins corresponding to the minimum number of protein sequences needed to explain the peptides observed. This peptide identification process provides information about the relative abundance of proteins in the sample using an established peptide counting algorithm that relates protein abundance to the number of times a given peptide was sampled by the mass spectrometer [62, 63]. This method was used to evaluate the purity of the proteasome in the purification fraction. The proteins identified by only a single peptide were manually assessed for quality. Upon inspection, the large majority of single-peptide identifications had spectra that matched the peptide sequence well: only a single protein identification was removed from analysis as a result of manual inspection.
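Two ideas in this paragraph lend themselves to a small illustration: choosing a minimal set of proteins that explains all observed peptides, and counting peptide observations per protein as a rough abundance measure. The sketch below uses a greedy set-cover heuristic, which is a stand-in and not the grouping algorithm of [61]; the function names and the protein and peptide labels are hypothetical.

```python
from collections import Counter


def minimal_protein_set(protein_to_peptides):
    """Greedily pick proteins until every observed peptide is explained.

    protein_to_peptides: dict mapping protein ID -> set of peptide strings.
    Greedy set cover is an approximation; ties go to the first ID in
    sorted order so the result is deterministic.
    """
    unexplained = set().union(*protein_to_peptides.values())
    chosen = []
    while unexplained:
        best = max(sorted(protein_to_peptides),
                   key=lambda p: len(protein_to_peptides[p] & unexplained))
        chosen.append(best)
        unexplained -= protein_to_peptides[best]
    return chosen


def peptide_counts(identifications):
    """Count MS/MS identifications per protein (a peptide-counting proxy)."""
    return Counter(protein for protein, _peptide in identifications)


if __name__ == "__main__":
    mapping = {"P1": {"a", "b", "c"}, "P2": {"b", "c"}, "P3": {"d"}}
    print(minimal_protein_set(mapping))  # P2 adds nothing beyond P1
    print(peptide_counts([("P1", "a"), ("P1", "a"), ("P3", "d")]))
```

In the toy mapping, P2 is redundant because all of its peptides are already explained by P1, which is the kind of redundancy that protein grouping is meant to remove.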

3.6 Tables

Table 3.1: MS/MS analysis of GST-UBL affinity chromatography purification fractions purified with 2 mM ATP, identified in 2/3 trials.

Columns: proteasome subunit, gene ID, NCBI protein accession number, then distinct peptides (Pep.) and % sequence coverage (% Cov.) for each of three trials at 2 mM ATP.

                                  Trial 1             Trial 2             Trial 3
Subunit      Gene ID  Accession # Pep.      % Cov.    Pep.      % Cov.    Pep.      % Cov.
α1           PSMA6    48146005    3         16.7      5         22.0      0         0.0
α2           PSMA2    62897513    3         18.0      1         8.1       0         0.0
α3           PSMA4    33604016    2         24.5      2         11.4      0         0.0
α4           PSMA7    4092058     3         17.7      8         42.3      1         9.3
α5           PSMA5    54696300    1         7.9       5         31.5      0         0.0
α6           PSMA1    4506179     8         52.9      6         32.3      1         7.6
α7           PSMA3    13528948    1         8.1       1         4.0       0         0.0
β1           PSMB6    296734      0         0.0       2         27.8      1         6.1
β3           PSMB3    112180762   1         9.3       2         16.6      0         0.0
β4           PSMB2    85397257    2         15.9      3         23.4      0         0.0
β5           PSMB5    1172607     4         31.7      8         47.1      0         0.0
β6           PSMB1    62897947    1         5.8       5         24.1      0         0.0
β7           PSMB4    48145757    2         14.0      1         8.3       0         0.0
Rpt1         PSMC2    76879893    5         20.6      8         24.3      5         18.9
Rpt2         PSMC1    62896895    10        35.2      8         22.3      7         33.9
Rpt3         PSMC4    5080757     9         40.0      8         33.5      6         28.2
Rpt5         PSMC3    48145579    8         33.2      8         33.7      9         38.6
Rpt6         PSMC5    49065819    6         24.4      7         24.9      10        36.2
Rpn1         PSMD2    6174930     18        36.9      12        19.9      15        32.1
Rpn2         PSMD1    89365965    10        22.4      10        19.2      11        25.0
Rpn3         PSMD3    48145661    7         22.9      6         14.2      9         28.5
Rpn5         PSMD12   62896917    6         24.6      4         12.5      6         23.3
Rpn6         PSMD11   12653337    5         26.3      5         14.9      6         29.4
Rpn7         PSMD6    51702772    8         29.3      7         29.3      6         27.3
Rpn8         PSMD7    998688      4         24.9      8         44.6      6         37.7
Rpn9         PSMD13   3618343     6         29.8      5         22.3      7         28.2
Rpn10        PSMD4    55663926    5         21.6      6         19.2      6         36.8
Rpn11        PSMD14   51701716    2         14.5      1         7.4       5         24.5
Rpn12        PSMD8    4506233     4         25.7      2         7.0       1         6.2
Rpn13        ADRM1    59016665    1         6.4       0         0.0       1         6.4
Rpn15/Sem1   SHFM1    48145891    1         75.7      0         0         1         75.71
Ubp6         USP14    4827050     4         15.6      3         12.6      1         4.9
Blm10/PA200  PSME4    85566853    2         2.6       0         0         0         0
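The % sequence coverage reported in these tables is the fraction of a protein's residues that fall within at least one identified peptide. A minimal sketch of that calculation follows; the function name and the toy sequence are hypothetical, and the search software may treat edge cases differently.

```python
def sequence_coverage(protein: str, peptides) -> float:
    """Percent of residues in `protein` covered by at least one peptide."""
    covered = [False] * len(protein)
    for peptide in peptides:
        start = protein.find(peptide)
        while start != -1:  # mark every occurrence of the peptide
            for i in range(start, start + len(peptide)):
                covered[i] = True
            start = protein.find(peptide, start + 1)
    return 100.0 * sum(covered) / len(protein)


if __name__ == "__main__":
    # Overlapping peptides are only counted once per residue.
    print(sequence_coverage("ABCDEFGHIJ", ["ABC", "CDE"]))
```

This explains why a short protein can show high coverage from a single peptide while a very large protein shows low coverage despite many distinct peptides.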


Table 3.2: Nonproteasomal proteins identified by LC-MS/MS analysis of GST-UBL affinity chromatography.

Columns: gene ID, NCBI protein accession number, then distinct peptides (Pep.) and % sequence coverage (% Cov.) for each of three trials at 2 mM ATP, followed by the protein description.

                      Trial 1             Trial 2             Trial 3
Gene ID   Accession # Pep.      % Cov.    Pep.      % Cov.    Pep.      % Cov.    Description

Deubiquitinases
UCHL5     55859545    0         0.0       0         0.0       1         13.2      Ubiquitin carboxyl-terminal hydrolase L5
USP13     2501459     3         8.0       0         0.0       3         8.0       Ubiquitin carboxyl-terminal hydrolase 13
USP25     49899220    3         6.5       0         0.0       0         0.0       Ubiquitin specific peptidase 25
USP5      1585128     10        28.1      8         16.9      11        29.9      Isopeptidase T
EEF1A1    181965      0         0.0       0         0.0       1         10.6      Elongation factor 1 alpha

Ubiquitin Conjugating/Ubiquitin Ligase Enzymes
HUWE1     56417899    5         2.9       0         0.0       3         1.5       ARF-binding protein 1
ZUBR1     82659109    9         4.0       0         0.0       7         3.5       Retinoblastoma-associated factor 600

Ubiquitin Conserved Domains
EPS15     466260      1         3.2       0         0.0       0         0.0       Epidermal growth factor receptor substrate
FAF1      6599275     7         27.6      0         0.0       5         18.1      FAS (TNFRSF6) associated factor 1
SQSTM1    12804857    1         8.7       0         0.0       0         0.0       Sequestosome 1
UBC       18159026    0         0.0       0         0.0       1         24.5      Ubiquitin

Confirmed Ubiquitin Proteasome System Proteins
BASP1     12653493    0         0.0       0         0.0       3         34.4      Brain abundant, membrane attached signal protein 1
TXNL1     4759274     0         0.0       5         27.3      0         0.0       Thioredoxin-like 1

DNA Repair Proteins
XRCC5     62896765    0         0.0       0         0.0       5         13.5      ATP-dependent DNA helicase II
ERC1      45751568    1         4.0       0         0.0       0         0.0       ELKS/RAB6-interacting/CAST family member
GCC2      11385642    1         4.1       0         0.0       0         0.0       CTCL tumor antigen se1-1
MORC3     27768973    1         11.6      0         0.0       0         0.0       MORC family CW-type zinc finger 3

Other Proteins
C10orf97  45709240    4         16.9      1         2.7       2         13.0      Chromosome 10 open reading frame 97
HAL       64653005    0         0.0       1         4.1       1         4.1       Histidine ammonia-lyase
HIP1      3510693     2         2.7       0         0.0       2         6.9       Huntingtin interacting protein 1
HSPCB     56204415    0         0.0       1         20.8      0         0.0       Heat shock 90kDa protein 1, beta
IMPDH2    307066      0         0.0       0         0.0       2         8.8       Inosine-5'-monophosphate dehydrogenase
N/A       51095054    1         18.9      0         0.0       1         18.9      Similar to heat shock 70kDa protein 8 isoform 2
PDZD8     73621383    1         2.0       0         0.0       0         0.0       PDZ domain-containing protein 8
RDH5      20271410    0         0.0       1         3.8       0         0.0       Retinol dehydrogenase 5 (11-cis/9-cis)
TPM3      57997573    1         6.0       0         0.0       2         11.2      Tropomyosin 3

Table 3.3: Nucleotide regulation of holoproteasome complex. Nonproteasomal proteins identified by LC-MS/MS of GST-UBL affinity chromatography.

Columns: gene ID, NCBI protein accession number, then distinct peptides (Pep.) and % sequence coverage (% Cov.) for purifications with 0 mM ATP and with 2 mM ATP-γ-S, followed by the protein description.

                      0 mM ATP            2 mM ATP-γ-S
Gene ID   Accession # Pep.      % Cov.    Pep.      % Cov.    Description

Deubiquitinases
USP5      1585128     18        42.4      16        42.7      Isopeptidase T
USP13     2501459     4         11.8      3         8.7       Ubiquitin carboxyl-terminal hydrolase 13
USP25     49899220    4         7.8       2         3.0       Ubiquitin specific peptidase 25
ATXN3     13518013    1         6.5       1         4.2       Ataxin 3
EEF1A1    181965      1         10.6      0         0.0       Elongation factor 1 alpha

Ubiquitin Conjugating/Ubiquitin Ligase Enzymes
ZUBR1     82659109    18        8.1       12        5.1       Retinoblastoma-associated factor 600
HUWE1     56417899    12        6.8       2         1.4       ARF-binding protein 1
UBE2N     80479362    1         12.5      0         0.0       Ubiquitin conjugating enzyme E2N
UBAC1     3211975     1         10.5      0         0.0       UBA domain containing 1
UBE3A     55416028    0         0.0       1         1.6       Ubiquitin ligase E3A isoform 1

Ubiquitin Conserved Domains
FAF1      6599275     6         19.9      6         22.9      FAS (TNFRSF6) associated factor 1
EPS15     466260      6         11.4      0         0.0       Epidermal growth factor receptor substrate
SQSTM1    12804857    1         8.7       0         0.0       Sequestosome 1
UBC       18159026    1         34.0      0         0.0       Ubiquitin
UIMC1     12584837    0         0.0       1         6.6       Ubiquitin interaction motif containing 1

Confirmed Ubiquitin Proteasome System Proteins
TXNL1     4759274     5         36.0      2         12.8      Thioredoxin-like 1

DNA Repair Proteins
XRCC5     62896765    2         7.0       0         0.0       ATP-dependent DNA helicase II variant
ERC1      45751568    1         4.0       0         0.0       ELKS/RAB6-interacting/CAST family member
GCC2      11385642    1         4.1       0         0.0       CTCL tumor antigen se1-1
MORC3     27768973    1         11.6      0         0.0       MORC family CW-type zinc finger 3

Other Proteins
C10orf97  45709240    7         25.2      3         17.3      Chromosome 10 open reading frame 97
HIP1      3510693     5         9.1       2         6.2       Huntingtin interacting protein 1
HEXA      46255684    1         8.1       0         0.0       Hexosaminidase A
IFI30     21411337    1         15.2      0         0.0       Interferon, gamma-inducible protein 30
IMPDH2    307066      0         0.0       2         7.4       Inosine-5'-monophosphate dehydrogenase
N/A       51095054    0         0.0       1         18.9      Similar to heat shock 70kDa protein 8 isoform 2
ZFAND6    4335943     0         0.0       1         8.1       Zinc-finger, AN1-type domain 6


3.7 Figures

Figure 3.1: Fluorogenic substrate assay of proteasome activity. Peptide hydrolysis of proteasome substrate Suc-LLVY-AMC was measured by luminescence spectrometry (A, B, and C). Purification fractions were resuspended in 1.0 mL of HEPES-buffered saline at pH 7.5, 100 mM NaCl, 2 mM ATP, and supplemented with 10 μM Suc-LLVY-AMC and 5 μM MG132 (where indicated). The fractions were incubated for 60 minutes at 37°C, and fluorescence intensity was measured at 460 nm. The readings were compared to a standard curve of free AMC fluorescence to generate absolute activity units. (A) HEK-293 lysate harbored robust proteolytic activity against the peptide substrate that was dramatically inhibited by treatment with proteasome inhibitor MG132. GST-UBL affinity chromatography yielded a fraction with significant peptide cleavage activity, also sharply inhibited by MG132. Parallel purification with untagged GST yielded a fraction that was unreactive towards the peptide substrate. Reported percentages correspond to the portion of the total fraction used in the assay. (B) Fluorogenic substrate assay was performed as in (A) to monitor the progress of proteasome purification. As expected, HEK-293 cell lysates had robust proteasome activity that was captured on the GST-UBL affinity column, but not by untagged GST. Elution of the proteasome complex was performed in parallel by competition with human or mouse UIM-2 peptide (hUIM-2 and mUIM-2, respectively). The human peptide was successful in releasing proteasome activity from the GST-UBL, while the mouse peptide was not. (C) Effect of ATP and ATP analogue on proteasomes purified by GST-UBL affinity chromatography. The samples assayed are the same as those used to produce the LC-MS/MS results reported for Trial 1 in Table 3.1, and for 0 mM ATP and 2 mM ATP-γ-S in Table 3.3.


Figure 3.2: Immunodetection of proteasome subunit Rpt5/PSMC3. HEK-293 lysates (lane 1) were used as source material for proteasome purification. Lysate (1% of total) was loaded onto the GST-UBL affinity column, and washed extensively (10% of total, lane 2). A 20-fold molar excess of the hUIM-2 peptide was added to the column to compete for UBL binding, and both the soluble fraction (10%, lane 3) and Sepharose-immobilized fraction (10%, lane 4) were analyzed. In parallel, a proteasome-loaded GST-UBL column was treated with a 20-fold molar excess of mUIM-2 peptide. The soluble fraction (10%, lane 5) did not contain any proteasome immunoreactivity, as the proteasomes remained in the Sepharose-immobilized fraction (10%, lane 6).

3.8 References

1. Leggett, D.S., et al., Multiple associated proteins regulate proteasome structure and function. Mol Cell, 2002. 10(3): p. 495-507.
2. Verma, R., et al., Proteasomal proteomics: identification of nucleotide-sensitive proteasome-interacting proteins by mass spectrometric analysis of affinity-purified proteasomes. Mol Biol Cell, 2000. 11(10): p. 3425-39.
3. Wang, X., et al., Mass spectrometric characterization of the affinity-purified human 26S proteasome complex. Biochemistry, 2007. 46(11): p. 3553-65.
4. Xie, Y. and A. Varshavsky, Physical association of ubiquitin ligases and the 26S proteasome. Proc Natl Acad Sci U S A, 2000. 97(6): p. 2497-502.
5. Hu, M., et al., Structure and mechanisms of the proteasome-associated deubiquitinating enzyme USP14. Embo J, 2005.
6. Borodovsky, A., et al., A novel active site-directed probe specific for deubiquitylating enzymes reveals proteasome association of USP14. Embo J, 2001. 20(18): p. 5187-96.
7. Grand, R.J., et al., Adenovirus early region 1A protein binds to mammalian SUG1-a regulatory component of the proteasome. , 1999. 18(2): p. 449-58.
8. Cagney, G., P. Uetz, and S. Fields, Two-hybrid analysis of the Saccharomyces cerevisiae 26S proteasome. Physiol Genomics, 2001. 7(1): p. 27-34.
9. Schauber, C., et al., Rad23 links DNA repair to the ubiquitin/proteasome pathway. Nature, 1998. 391(6668): p. 715-8.
10. Vijay-Kumar, S., C.E. Bugg, and W.J. Cook, Structure of ubiquitin refined at 1.8 A resolution. J Mol Biol, 1987. 194(3): p. 531-44.
11. Walters, K.J., et al., Structural studies of the interaction between ubiquitin family proteins and proteasome subunit S5a. Biochemistry, 2002. 41(6): p. 1767-77.
12. Rao-Naik, C., et al., The rub family of ubiquitin-like proteins. Crystal structure of Arabidopsis rub1 and expression of multiple rubs in Arabidopsis. J Biol Chem, 1998. 273(52): p. 34976-82.
13. Wilkinson, C.R., et al., Proteins containing the UBA domain are able to bind to multi-ubiquitin chains. Nat Cell Biol, 2001. 3(10): p. 939-43.
14. Rao, H. and A. Sastry, Recognition of specific ubiquitin conjugates is important for the proteolytic functions of the ubiquitin-associated domain proteins Dsk2 and Rad23. J Biol Chem, 2002. 277(14): p. 11691-5.
15. Raasi, S. and C.M. Pickart, Rad23 ubiquitin-associated domains (UBA) inhibit 26 S proteasome-catalyzed proteolysis by sequestering lysine 48-linked polyubiquitin chains. J Biol Chem, 2003. 278(11): p. 8951-9.
16. Elsasser, S., et al., Proteasome subunit Rpn1 binds ubiquitin-like protein domains. Nat Cell Biol, 2002. 4(9): p. 725-30.
17. Hiyama, H., et al., Interaction of hHR23 with S5a. The ubiquitin-like domain of hHR23 mediates interaction with S5a subunit of 26 S proteasome. J Biol Chem, 1999. 274(39): p. 28019-25.
18. van Nocker, S., et al., The multiubiquitin-chain-binding protein Mcb1 is a component of the 26S proteasome in Saccharomyces cerevisiae and plays a nonessential, substrate-specific role in protein turnover. Mol Cell Biol, 1996. 16(11): p. 6020-8.
19. Young, P., et al., Characterization of two polyubiquitin binding sites in the 26 S protease subunit 5a. J Biol Chem, 1998. 273(10): p. 5461-7.
20. Mueller, T.D. and J. Feigon, Structural determinants for the binding of ubiquitin-like domains to the proteasome. Embo J, 2003. 22(18): p. 4634-45.
21. Link, A.J., et al., Direct analysis of protein complexes using mass spectrometry. Nat Biotechnol, 1999. 17(7): p. 676-82.
22. Hendil, K.B., R. Hartmann-Petersen, and K. Tanaka, 26 S proteasomes function as stable entities. J Mol Biol, 2002. 315(4): p. 627-36.
23. Froment, C., et al., A quantitative proteomic approach using two-dimensional gel electrophoresis and isotope-coded affinity tag labeling for studying human 20S proteasome heterogeneity. Proteomics, 2005. 5(9): p. 2351-63.
24. Guerrero, C., et al., An integrated mass spectrometry-based proteomic approach: quantitative analysis of tandem affinity-purified in vivo cross-linked protein complexes (QTAX) to decipher the 26 S proteasome-interacting network. Mol Cell Proteomics, 2006. 5(2): p. 366-78.
25. Sone, T., et al., Sem1p is a novel subunit of the 26 S proteasome from Saccharomyces cerevisiae. J Biol Chem, 2004. 279(27): p. 28807-16.
26. Krogan, N.J., et al., Proteasome involvement in the repair of DNA double-strand breaks. Mol Cell, 2004. 16(6): p. 1027-34.
27. Jorgensen, J.P., et al., Adrm1, a putative cell adhesion regulating protein, is a novel proteasome-associated factor. J Mol Biol, 2006. 360(5): p. 1043-52.
28. Yao, T., et al., Proteasome recruitment and activation of the Uch37 deubiquitinating enzyme by Adrm1. Nat Cell Biol, 2006. 8(9): p. 994-1002.
29. Schmidt, M., et al., The HEAT repeat protein Blm10 regulates the yeast proteasome by capping the core particle. Nat Struct Mol Biol, 2005. 12(4): p. 294-303.
30. Gomes, A.V., et al., Mapping the murine cardiac 26S proteasome complexes. Circ Res, 2006. 99(4): p. 362-71.
31. Sasaki, T., et al., Budding yeast Dsk2 protein forms a homodimer via its C-terminal UBA domain. Biochem Biophys Res Commun, 2005. 336(2): p. 530-5.
32. Lowe, E.D., et al., Structures of the Dsk2 UBL and UBA domains and their complex. Acta Crystallogr D Biol Crystallogr, 2006. 62(Pt 2): p. 177-88.
33. Wang, G., et al., Ataxin-3, the MJD1 gene product, interacts with the two human homologs of yeast DNA repair protein RAD23, HHR23A and HHR23B. Hum Mol Genet, 2000. 9(12): p. 1795-803.
34. Regan-Klapisz, E., et al., Ubiquilin recruits Eps15 into ubiquitin-rich cytoplasmic aggregates via a UIM-UBL interaction. J Cell Sci, 2005. 118(Pt 19): p. 4437-50.
35. Swaminathan, S., A.Y. Amerik, and M. Hochstrasser, The Doa4 deubiquitinating enzyme is required for ubiquitin homeostasis in yeast. Mol Biol Cell, 1999. 10(8): p. 2583-94.
36. Verma, R., et al., Role of Rpn11 metalloprotease in deubiquitination and degradation by the 26S proteasome. Science, 2002. 298(5593): p. 611-5.
37. Yao, T. and R.E. Cohen, A cryptic protease couples deubiquitination and degradation by the proteasome. Nature, 2002. 419(6905): p. 403-7.
38. Holzl, H., et al., The regulatory complex of Drosophila melanogaster 26S proteasomes. Subunit composition and localization of a deubiquitylating enzyme. J Cell Biol, 2000. 150(1): p. 119-30.
39. Stone, M., et al., Uch2/Uch37 is the major deubiquitinating enzyme associated with the 26S proteasome in fission yeast. J Mol Biol, 2004. 344(3): p. 697-706.
40. Timms, K.M., et al., The genomic organization of Isopeptidase T-3 (ISOT-3), a new member of the ubiquitin specific protease family (UBP). Gene, 1998. 217(1-2): p. 101-6.
41. Gonen, H., et al., Protein synthesis elongation factor EF-1 alpha is an isopeptidase essential for ubiquitin-dependent degradation of certain proteolytic substrates. Adv Exp Med Biol, 1996. 389: p. 209-19.
42. Liu, Z., R. Oughtred, and S.S. Wing, Characterization of E3Histone, a novel testis ubiquitin protein ligase which ubiquitinates histones. Mol Cell Biol, 2005. 25(7): p. 2819-31.
43. Kumar, S., A.L. Talis, and P.M. Howley, Identification of HHR23A as a substrate for E6-associated protein-mediated ubiquitination. J Biol Chem, 1999. 274(26): p. 18785-92.
44. Kleijnen, M.F., et al., The hPLIC proteins may provide a link between the ubiquitination machinery and the proteasome. Mol Cell, 2000. 6(2): p. 409-19.
45. Kamura, T., et al., Cytoplasmic ubiquitin ligase KPC regulates proteolysis of p27(Kip1) at G1 phase. Nat Cell Biol, 2004. 6(12): p. 1229-35.
46. Huh, K.W., et al., Association of the human papillomavirus type 16 E7 oncoprotein with the 600-kDa -associated factor, p600. Proc Natl Acad Sci U S A, 2005. 102(32): p. 11492-7.
47. Gururaja, T., et al., Multiple functional categories of proteins identified in an in vitro cellular ubiquitin affinity extract using shotgun peptide sequencing. J Proteome Res, 2003. 2(4): p. 394-404.
48. Ashley, C., et al., Roles of mouse UBC13 in DNA postreplication repair and Lys63-linked ubiquitination. Gene, 2002. 285(1-2): p. 183-91.
49.
Podlaska, A., et al., The link between 20S proteasome activity and post- replication DNA repair in Saccharomyces cerevisiae. Mol Microbiol, 2003. 49(5): p. 1321-32. 50. Song, E.J., et al., Human Fas-associated factor 1, interacting with ubiquitinated proteins and valosin-containing protein, is involved in the ubiquitin-proteasome pathway. Mol Cell Biol, 2005. 25(6): p. 2511-24. 51. Seibenhener, M.L., et al., Sequestosome 1/p62 is a polyubiquitin chain binding protein involved in ubiquitin proteasome degradation. Mol Cell Biol, 2004. 24(18): p. 8055-68. 52. Um, J.H., et al., Increased and correlated nuclear factor-kappa B and Ku autoantigen activities are associated with development of multidrug resistance. Oncogene, 2001. 20(42): p. 6048-56. 53. Reuter, T.Y., et al., Yeast two-hybrid screens imply involvement of Fanconi anemia proteins in transcription regulation, cell signaling, oxidative metabolism, and cellular transport. Exp Cell Res, 2003. 289(2): p. 211-21. 54. Marchler-Bauer, A., et al., CDD: a Conserved Domain Database for protein classification. Nucleic Acids Res, 2005. 33(Database issue): p. D192-6.

122

55. Guan, K.L. and J.E. Dixon, Eukaryotic proteins expressed in Escherichia coli: an improved thrombin cleavage and purification procedure of fusion proteins with glutathione S-transferase. Anal Biochem, 1991. 192(2): p. 262-7. 56. Morales, F.R., et al., Tomas: Toolbox for mass spectrometry data analysis, in HUPO 2nd & IUBMB XIX World Congress. 2004, Mol Cell Prot: Montreal. 57. Perkins, D.N., et al., Probability-based protein identification by searching sequence databases using mass spectrometry data. Electrophoresis, 1999. 20(18): p. 3551-67. 58. Matrix Science - Mascot Distiller v1.0. [cited; Available from: http://www.matrixscience.com/distiller.html. 59. Matrix Science - Mascot Cluster version 1.9.03. [cited; Available from: http://www.matrixscience.com/cluster.html. 60. National Center for Biotechnology Information nonredundant human database. [cited; Available from: ftp://ftp.ncbi.nih.gov/blast/db/FASTA/nr.gz. 61. Kearney, R.E., et al., Elimination of Redundant Protein Identifications in High Throughput Proteomics, in Proceedings of the 2005 IEEE Engineering in Medicine and Biology 27th Annual Conference. 2005, IEEE: Shanghai, China. p. 4803-06. 62. Blondeau, F., et al., Tandem MS analysis of brain clathrin-coated vesicles reveals their critical involvement in synaptic vesicle recycling. Proc Natl Acad Sci U S A, 2004. 101(11): p. 3833-8. 63. Liu, H., R.G. Sadygov, and J.R. Yates, 3rd, A model for random sampling and estimation of relative protein abundance in shotgun proteomics. Anal Chem, 2004. 76(14): p. 4193-201.

PREFACE TO CHAPTER 4

As discussed in the Preface to Chapter 3, a completely in vitro system for ubiquitin-dependent proteasome degradation of polyQ-expanded proteins requires purified human proteasomes and large quantities of purified polyubiquitinated polyQ-expanded protein.

ORIGINAL CONTRIBUTIONS TO KNOWLEDGE

In this chapter, several eclectic attempts to purify large quantities of various forms of polyubiquitinated AR are described. Very little success was achieved with these experimental designs, and it is concluded that polyubiquitinated forms of cellular proteins are extremely transient species.

ACKNOWLEDGEMENTS

• The author performed all cell culture, transfection experiments, cloning of the pHA-Ub;HIS-AR, affinity chromatography, immunoprecipitations, western blots, in vitro ubiquitin conjugation reactions, cloning of the various GST-UbcH5a-AR constructs, and bacterial expression and purification of recombinant enzymes and substrates.

• Dr. Simon Wing kindly provided the pMT123 vector and E1 ubiquitin activating enzyme.

• Dr. Hua Lu provided the HEK-293Mdm2 cell line.

• Dr. Craig Crews provided PROTACS-5.

• Dr. Ron Hay provided the pGEX-2T-UbcH5a.

• Dr. Mark Trifiro assisted in editing the manuscript.

• Dr. Lenore K. Beitel assisted in editing the manuscript.

CHAPTER 4 – Production and Purification of Polyubiquitinated Androgen Receptor

4.1 Introduction

4.1.1 Endogenous ubiquitination of the AR

The small protein ubiquitin has diverse roles in cell signaling, including but not limited to proteasome-dependent protein degradation, endosomal targeting, protein-protein interaction, and transcriptional regulation. Covalent attachment of ubiquitin to a protein substrate is catalyzed by the coordinated action of three types of enzymes: ubiquitin activating enzyme (E1), ubiquitin conjugating enzymes (E2), and ubiquitin ligases (E3).

Each level of the cascade provides another layer of specificity; there is only one E1 enzyme in humans, but at least 18 E2s and more than 200 E3s. The most well-characterized function of ubiquitination is substrate-targeting to the proteasome for degradation via attachment of polyubiquitin [1], a process involving iterative attachment of at least four internally-linked ubiquitin moieties ligated to a single substrate lysine residue. Specificity of substrate-ubiquitination is generally regulated at the level of the E3 ubiquitin ligases that bind specific protein substrates via poorly-conserved interaction domains. The polyubiquitination signal is recognized by at least two proteasome subunits: S5a/PSMD4 [2] and S6’/PSMC4 [3]. Polyubiquitinated proteins become substrates for the 26S proteasome, subjecting the protein substrates to ATP-dependent unfolding, deubiquitination, and transfer into the catalytic core where ATP-independent proteolysis degrades the substrate into small peptides.

An unexpected role of the UPS has become increasingly apparent, as a large body of evidence has indicated significant involvement of various forms of ubiquitination and proteasomal degradation in regulation of transcription. Transcription factors, including the steroid hormone receptors, are targets of ubiquitination at several levels; in addition, many steroid hormone receptors show a seemingly paradoxical positive correlation between transactivational potential and rate of proteasomal degradation [4, 5].

Relevantly, the proteasome seems to have a defined role in regulating AR transactivational potential [6]. The normal cycle of AR production, transcriptional activation and degradation is marked by various types of ubiquitin conjugation with widely ranging functional consequences. Ligand-independent polyubiquitination of the AR results in rapid receptor degradation and downregulation of transcriptional activity, while ligand-dependent mono- and polyubiquitination have been demonstrated to have positive effects on AR transcriptional activation.

In the absence of androgen, the AR resides in the cytoplasm, and is subject to post-translational modifications that regulate stability. Indeed, many studies report androgen treatment stabilizes the receptor, indicating levels of apoAR are tightly regulated. A role for the proteasome in rapid turnover of the unliganded AR has been proposed, as activation of the phosphatidylinositol 3-kinase (PI3K)-Akt pathway results in AR phosphorylation on S210 and S790, ubiquitination by E3 ligase, and proteasome-dependent degradation [7]. Another E3 ligase with reported ability to associate with, and reduce levels of, the apoAR is the C-terminal Hsp-interacting protein (CHIP). Although CHIP harbors E3 ligase activity, it has another function as a chaperone protein, acting in conjunction with Hsp70 and Hsp40 [8]. In the absence of ligand, overexpression of CHIP leads to AR destabilization and ubiquitination, but this effect was also observed upon overexpression of CHIPΔU, a truncated CHIP mutant with no E3 ligase activity [9]. A specific, evolutionarily-conserved region of the AR TAD (residues 234–247) has been demonstrated to be a specific binding-surface for CHIP, but its precise role in AR ubiquitination remains unclear [10]. As noted in earlier portions of this manuscript, overexpression of CHIP in both neuronal cell cultures and transgenic mice reduced levels and associated toxicity of polyQ-expanded AR [11], but it is unclear if this is due to ubiquitin-dependent proteasome degradation of polyQ-expanded AR resulting from CHIP E3 ligase activity, or a result of CHIP’s involvement in the recruitment of molecular chaperones to misfolded proteins.

Ubiquitination of the AR has also been demonstrated to be an androgen-regulated event, with significant effects on AR transcriptional activity. At least two studies have reported AR is monoubiquitinated upon androgen-activation [12, 13], and this modification is hypothesized to stabilize the AR by interaction with a defective ubiquitin conjugating enzyme, TSG101, preventing AR polyubiquitination [12]. Following translocation to the nucleus and DNA-binding, the AR recruits coactivators and the basal transcription machinery. Components of the RNA Pol II transcription complex are likely involved in regulated ubiquitination of transcription factors for clearance of promoter regions [14]. Clearance of androgen-regulated promoters by ubiquitin-dependent proteasome degradation of the AR may explain the positive correlation between AR transcriptional activation and proteasome degradation [15]. Interestingly, p300, a known AR coactivator and histone acetyltransferase, also harbors E3 ligase activity [16].

4.1.2 Towards in vitro ubiquitination of the AR

A large body of evidence suggests that polyubiquitination and subsequent degradation by the proteasome plays a major role in the regulation of AR levels, both in the normal functioning of the AR and in disease processes. An unresolved issue in PolyQ Expansion Disease is the ability of the proteasome to degrade polyQ-expanded proteins. Many lines of evidence indicate polyQ expansions inhibit, or at least impair, the normal function of the UPS. However, this evidence is confounded by the presence of other cellular mechanisms of protein clearance like autophagy, which may be relevant to the disposal of polyQ expansions.

One method to resolve this issue would be to create a completely in vitro experimental system whereby proteasomal degradation of polyQ-expanded proteins could be compared to that of their wild type, nonpathogenic counterparts. This would require two separate, but equally important pieces. First, a system for the production and purification of sufficient quantities of polyubiquitinated polyQ proteins must be generated. Second, functional human proteasomes, complete with the accessory factors necessary to correctly recognize and process the polyubiquitinated protein, should also be purified. Such a system to monitor in vitro degradation of purified, polyubiquitinated, polyQ-expanded protein has not been achieved [17].

This chapter will focus on the first aspect of this system, production of purified polyubiquitinated AR in sufficient quantities for in vitro enzymatic degradation with purified proteasome. An exhaustive variety of experimental conditions were attempted, with little success. One important conclusion is drawn from this line of work: polyubiquitinated forms of the AR are extremely rare species in the cell as a result of the combined actions of the proteasome, which rapidly degrades substrate proteins by proteolysis, and cellular deubiquitinases. Deubiquitinases are a diverse class of enzyme; some members have confirmed functions as intrinsic subunits of the proteasome, while others function proteasome-independently to recycle ubiquitin or rescue substrate proteins from proteasome-dependent proteolysis.

4.2 Results and Discussion

4.2.1 In vivo ubiquitination and purification of transfected AR from HEK-293

As AR is thought to be degraded by the proteasome in its normal, inactivated state, experiments were pursued to purify the intermediate polyubiquitinated forms of AR. Cells were co-transfected with the pMT123 plasmid, which encodes HA-tagged ubiquitin [18], and pcDNA3-HIS-AR. Co-transfected HEK-293 cells were allowed to recover from transfection and treated with the cell-permeable proteasome inhibitor MG132 for various time periods (2-16 hr), at various concentrations (1-10 μM). An optimal MG132 treatment of 5 μM for 16 hours was utilized in most experiments. Cells were lysed using a mild concentration (0.05%) of the nonionic detergent Triton X-100, and HIS-AR was purified using nickel immobilized metal-ion affinity chromatography (Ni2+-IMAC). Ni2+-NTA agarose (Qiagen) was used for purification of HIS-AR according to manufacturer’s recommendations. Western blots against the AR (N- and C-terminal epitopes) and against the HA tag were performed on samples taken from the lysate, and from purified fractions. While both HIS-AR and HA-ubiquitin (HA-Ub) were consistently well-expressed, and HIS-AR was efficiently purified from the soluble fraction, there was no evidence of ubiquitin conjugation of the HIS-AR in the presence of MG132 (Figure 4.1). Efficient ubiquitin conjugation, with incorporation of HA-Ub into high molecular weight species, was confirmed through Western blot of total cell lysate, and MG132 treatment increased anti-HA immunoreactivity of these high molecular weight species (Figure 4.1, lower panel). Addition of synthetic (mibolerone, 7 nM) or natural (DHT, 10 nM) androgen did not affect the outcome of these experiments (data not shown). All subsequent analyses were performed with the nonmetabolizable androgen mibolerone (MB, 7 nM), unless otherwise indicated.

In an attempt to ensure co-transfection of both HA-Ub and HIS-AR constructs into the same cell, a single plasmid was constructed with two open reading frames, expressing HA-Ub and HIS-AR (pHA-Ub;HIS-AR). These experiments were likewise ineffective at identifying polyubiquitinated forms of the AR (Figure 4.2). Immunoprecipitation of the HA tag was performed in an attempt to co-immunoprecipitate polyubiquitinated AR (Figure 4.3), with similar negative results.

4.2.2 Ubiquitination and purification of endogenous AR in LNCaP

Because HEK-293 cells do not express the AR endogenously, one possible explanation for the lack of observed AR ubiquitination is that HEK-293 cells might not express the specific E3 ligase required for ubiquitination of transfected AR. Therefore, LNCaP cells that endogenously produce (and therefore must degrade) AR were transfected with pMT123. Although AR immunodetection was robust in the lysate, immunoprecipitation with anti-HA antibody did not co-immunoprecipitate polyubiquitinated forms of AR (Figure 4.4).

4.2.3 In vivo ubiquitination in HEK-293Mdm2

After an exhaustive literature search, the best candidate for the endogenous E3 ligase of the AR is Mdm2. Therefore, it was considered that transfection of the pHA-Ub;HIS-AR plasmid into cells stably overexpressing Mdm2 [19] might result in production of polyubiquitinated AR. Transfections were performed as above, and cells were treated with 5 μM MG132 for 16 hours. Immunoprecipitation of HA was performed and robust incorporation of the HA-Ub into ubiquitin conjugates was observed. In these experiments, HIS-AR of near endogenous size was co-immunoprecipitated by anti-HA, suggesting mono- or di-ubiquitination (Figure 4.5). However, this finding was not consistently reproduced. High molecular weight species of polyubiquitinated AR were never observed.
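To judge whether a co-immunoprecipitated AR band is consistent with mono-, di-, or polyubiquitination, the expected mass of each conjugate can be estimated. The following is a minimal sketch in Python. The masses used (about 110 kDa for AR, 8.6 kDa for ubiquitin, and 1.1 kDa for a single HA epitope) are approximate values assumed for illustration, not measurements from this work, and the function name is hypothetical.

```python
# Approximate masses in kDa (assumptions for illustration only).
AR_KDA = 110.0        # apparent mass of full-length AR on SDS-PAGE
UBIQUITIN_KDA = 8.6   # one ubiquitin moiety (76 residues)
HA_TAG_KDA = 1.1      # one HA epitope (YPYDVPDYA)


def conjugate_mass_kda(n_ubiquitin, substrate_kda=AR_KDA, ha_tagged=True):
    """Estimate the mass of a substrate carrying n ubiquitin moieties."""
    if n_ubiquitin < 0:
        raise ValueError("n_ubiquitin must be non-negative")
    # Each HA-tagged ubiquitin adds the ubiquitin mass plus the tag mass.
    per_ubiquitin = UBIQUITIN_KDA + (HA_TAG_KDA if ha_tagged else 0.0)
    return substrate_kda + n_ubiquitin * per_ubiquitin


if __name__ == "__main__":
    for n in (0, 1, 2, 4, 8):
        print(f"{n} x HA-Ub: ~{conjugate_mass_kda(n):.1f} kDa")
```

Under these assumptions, one or two HA-Ub moieties shift the AR by roughly 10 to 20 kDa, which may be difficult to resolve from unmodified AR near the 105 kDa marker, whereas a chain of four or more would be expected to migrate well above it.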

At least one study has demonstrated a connection between promoter-bound AR, corepressor recruitment, and Mdm2-mediated ubiquitination [13]. Based on this evidence, experiments were performed to assess ubiquitin conjugation of the AR in the presence of a partial agonist (cyproterone acetate, CPA) and an exogenous androgen response element. Cotransfection of pHA-Ub;HIS-AR and an AR reporter plasmid expressing luciferase from the mouse mammary tumor virus promoter (pMMTV-luc) was performed in HEK-293Mdm2 cells. The cells were treated with either CPA or MB, and MG132 for 16 hours. Immunoprecipitation of HA-Ub did not co-immunoprecipitate the AR in any experimental condition (Figure 4.6).

4.2.4 PROTACS

A novel method for chemical genetic control of target protein levels in vivo was developed by Dr. Craig Crews of Yale University. In essence, it was hypothesized that ligation of a well-characterized E3 ligase recognition domain peptide to small molecule drugs with specific substrates could rapidly drive peptide-specific E3 ligase recruitment, substrate-polyubiquitination and proteasome-dependent degradation [20]. The AR is an attractive target for this type of study, as androgen-AR binding is a well-characterized molecular event, and androgens are themselves small, cell-permeable molecules with well-defined chemistries. Therefore, the ALAPYIP peptide, which has been recognized as the minimum recognition domain for the von Hippel-Lindau (VHL) protein, part of the VBC-Cul2 E3 ligase complex [21], was covalently attached to DHT [22]. Inclusion of the polyarginine sequence [23] rendered this molecule (PROTAC-5) cell permeable; PROTAC-5 treatment (25 μM, 1 hour) of cells stably expressing a GFP-AR fusion protein promoted rapid proteasome-dependent loss of GFP signal [22].

Based on this evidence, it was hypothesized that treatment of cells expressing HIS-AR with both PROTAC-5 and proteasome inhibitors might promote accumulation of appreciable levels of polyubiquitinated HIS-AR that could be purified with Ni2+-IMAC. As a pilot experiment, HEK-293 cells were transiently transfected in 6-well plates, and treated with increasing concentrations of PROTAC-5 (100-300 μM, 1 hour), ± MG132 (10 μM). Whole cell lysates were examined by anti-AR western blot. Treatments with 100-175 μM PROTAC-5 did not reduce levels of AR (Figure 4.7 A). However, treatments with 250-300 μM decreased AR levels in a dose-dependent manner that was sensitive to MG132, suggesting proteasome-dependent degradation (Figure 4.7 A). Nevertheless, there was no accumulation of high molecular weight species indicative of polyubiquitination. It was concluded that polyubiquitinated-AR is likely an extraordinarily transient species; specific targeting to the proteasome likely marks polyubiquitinated-AR for DUB activity resident in the proteasome.

As a result of speculation that intracellular DUBs disassemble polyubiquitinated-AR, an in vitro ubiquitin conjugation reaction was pursued in the presence of the specific DUB inhibitor ubiquitin-aldehyde (Ub-H). S100 fractions were prepared from 1.0 x 10^7 HEK-293 cells. For the ubiquitin conjugation reaction, 50 μg total protein of the S100 fraction was supplemented with 1 mM ubiquitin, 5 μM Ub-H, 350 μM PROTAC-5, 10 μM MG132, an energy regeneration system (ERS), and 2 μg HIS-AR. Western blots of the ubiquitin conjugation reaction failed to detect high molecular weight AR species (Figure 4.7 B), but probing with anti-ubiquitin demonstrated a functional ubiquitin conjugation reaction. Repetition of these experiments failed to induce ubiquitin conjugation of the AR.

4.2.5 In vitro ubiquitin conjugation of GST-UbcH5a

Based on evidence that cellular accumulation of polyubiquitinated-AR is an exceedingly rare event even in the presence of proteasome inhibitors, another in vitro ubiquitin conjugation scheme was pursued. UbcH5a (aka UBE2D1) is a well-characterized E2 enzyme with the phenomenal ability to autoubiquitinate as a GST-fusion protein in defined in vitro ubiquitin conjugations [24]. The pGEX-2T-UbcH5a plasmid was a gift from Dr. Ron Hay (University of Dundee). GST-UbcH5a was produced in E. coli BL21, and purified with glutathione-Sepharose 4B. In vitro ubiquitin conjugations were routinely performed with 10-50 μg GST-UbcH5a, requiring supplementation with only ATP (2 mM), E1 (3 pmol), and ubiquitin (1 mM) (Figure 4.8).
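Because the E2 is specified by mass and the E1 by molar amount, it can help to place both on the same scale. The sketch below converts micrograms of protein to picomoles; the 43 kDa mass for GST-UbcH5a is an assumed approximate value (roughly 26 kDa for GST plus roughly 17 kDa for UbcH5a), and the function name is hypothetical.

```python
GST_UBCH5A_KDA = 43.0  # assumption: approximate mass of the GST-UbcH5a fusion
E1_PMOL = 3.0          # E1 amount per reaction, as stated in the text


def micrograms_to_pmol(mass_ug, mol_weight_kda):
    """Convert a protein mass in micrograms to picomoles."""
    if mol_weight_kda <= 0:
        raise ValueError("molecular weight must be positive")
    # ug divided by kDa gives nmol; multiply by 1000 to obtain pmol.
    return mass_ug / mol_weight_kda * 1000.0


if __name__ == "__main__":
    for mass_ug in (10, 50):
        e2_pmol = micrograms_to_pmol(mass_ug, GST_UBCH5A_KDA)
        print(f"{mass_ug} ug GST-UbcH5a ~ {e2_pmol:.0f} pmol "
              f"(E2:E1 molar ratio ~ {e2_pmol / E1_PMOL:.0f}:1)")
```

Under this assumed mass, the E2 is in large molar excess over the E1, which is consistent with the E1 acting catalytically in the reaction.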

The major goal of this section was to produce polyubiquitinated AR for in vitro ubiquitin-dependent proteasome degradation to directly assess the ability of the proteasome to digest polyQ expansions. Therefore, various constructs encoding for fusion proteins of GST-UbcH5a and AR TAD (amino acids 1-341) with normal (20Q), and pathogenic (50Q) polyQ tract lengths were constructed.

To summarize: GST-UbcH5a-AR, GST-UbcH5a-p/p-AR-HIS (p/p: PreScission Protease recognition site), GST-UbcH5a-tb-AR-HIS (tb: thrombin), and HIS-AR-tb-UbcH5a were constructed and assayed for in vitro ubiquitin conjugation. Reproducible appearance of high molecular weight species of polyubiquitinated-AR fusion protein was not observed with any experimental construct (a representative in vitro ubiquitin conjugation reaction is shown in Figure 4.9 A, B). It was concluded that addition of the AR TAD either N- or C-terminal to the UbcH5a inhibits enzyme activity. Future experiments will examine if this inhibition is a direct result of the polyQ tract, as plasmid constructs are currently being synthesized with 0Q length tracts.

4.3 Conclusions

Unfortunately, few conclusions can be drawn from these experiments. Accumulation of polyubiquitinated-AR is a very rare event in cells. This finding is likely attributable to the specific targeting of polyubiquitinated-AR to the proteasome where deubiquitination of substrates occurs as a requisite function of proteasome activity. Conditions that have been reported to increase endogenous proteasome-dependent degradation of the AR in vivo, including treatment with androgen, overexpression of Mdm2 E3 ligase, treatment with partial agonist, and transcriptional activation, did not permit accumulation of polyubiquitinated AR. In addition, treatment of the AR with an E3 ligase recognition peptide covalently linked to DHT, a molecule expected to facilitate rapid AR polyubiquitination, also failed to permit purification of large quantities of polyubiquitinated AR.

One method to circumvent intracellular DUBs would be to perform in vitro ubiquitin conjugation reactions, allowing chemical inhibition of DUBs by ubiquitin-aldehyde. Unfortunately, the experimental conditions described did not yield ubiquitinated AR, even though in vitro ubiquitin conjugations were functional as evidenced by the accumulation of high molecular weight ubiquitin-conjugates. As a final trial, a known E2 enzyme with robust autoubiquitination activity was fused to an N-terminal fragment of the AR. Fusion to the AR fragment inhibited UbcH5a autoubiquitination activity. The physiological significance of these findings is questionable. However, as the goal of this project was to develop a method for the purification of large quantities of a physiologically rare species, these methods were deemed appropriate, although quite risky. In the future, other methods will have to be developed in order to achieve the goal of a totally in vitro ubiquitin-dependent proteasome degradation of the AR.

4.4 Experimental Procedures

Plasmids and Cell Culture: pMT123 was a gift from Dr. Simon Wing (McGill University); pcDNA3-HIS-AR was constructed previously in our lab. The pHA-Ub;HIS-AR was constructed by blunt-end ligation of a pMT123 NaeI digest, and a Klenow fragment-treated, partial XmaI, SspI digest of the pcDNA3-HIS-AR plasmid. The resulting plasmid expresses both HA-Ub and HIS-AR from different promoters. All restriction and modifying enzymes were from New England Biolabs. HEK-293 and HEK-293Mdm2 were routinely maintained in DMEM supplemented with 10% FBS, and 1% penicillin/streptomycin. LNCaP were maintained in RPMI with 10% FBS, and 1% penicillin/streptomycin. All cell culture reagents were purchased from Invitrogen. Androgens (DHT, MB, CPA) and MG132 were purchased from Sigma, and PROTAC-5 was a gift from Dr. Craig Crews, Yale University.

Transfections: HEK-293, LNCaP, and HEK-293Mdm2 (a gift from Dr. Hua Lu of Oregon Health and Science University) were seeded on the day before transfection in 6-well plates at 300,000 cells per well. Transfections were routinely performed using Superfect (Qiagen) according to manufacturer’s instructions. Twenty hours post transfection, cells were exposed to hormonal treatment, 7 nM of the nonmetabolizable synthetic androgen mibolerone (MB). Thirty hours post transfection the proteasome inhibitor MG132 was added to culture media; treatment with MG132 was generally 5 μM for 16 hours.
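The final concentrations given above can be translated into pipetting volumes with the usual dilution relation C1 x V1 = C2 x V2. The sketch below is illustrative only: the stock concentrations (10 mM MG132, 10 μM MB) and the 2 mL of medium per well are assumptions, since they are not stated in the text, and the function name is hypothetical.

```python
def stock_volume_ul(final_conc_um, stock_conc_um, final_volume_ul):
    """Volume of stock (uL) needed to reach a final concentration.

    Uses C1 * V1 = C2 * V2 with all concentrations in micromolar.
    """
    if stock_conc_um <= 0 or final_volume_ul <= 0 or final_conc_um < 0:
        raise ValueError("concentrations and volume must be positive")
    if final_conc_um > stock_conc_um:
        raise ValueError("stock is more dilute than the requested final value")
    return final_conc_um * final_volume_ul / stock_conc_um


if __name__ == "__main__":
    well_volume_ul = 2000.0  # assumption: 2 mL of medium per well
    # Assumed stocks: 10 mM MG132 (10,000 uM) and 10 uM mibolerone.
    mg132_ul = stock_volume_ul(5.0, 10_000.0, well_volume_ul)
    mb_ul = stock_volume_ul(0.007, 10.0, well_volume_ul)
    print(f"MG132: {mg132_ul:.2f} uL per well")
    print(f"MB:    {mb_ul:.2f} uL per well")
```

With these assumed stocks, each well would receive about 1 μL of MG132 stock and 1.4 μL of MB stock; different stock concentrations simply rescale the result.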

Antibodies used in western blots include: anti-AR (clone 441, Neomarkers; clone C-19, Santa Cruz Biotechnology) and anti-Ubiquitin (rabbit polyclonal, Neomarkers). Anti-mouse and anti-rabbit secondary antibodies were purchased from GE Amersham.

Ni2+-IMAC: Purification of HIS-AR from mammalian cells was performed by Ni2+-IMAC. Transfected cells were lysed in native lysis buffer (50 mM NaH2PO4, 300 mM NaCl, 10 mM imidazole pH 8.0), supplemented with 1.5 μM ubiquitin-aldehyde (Ub-H), Complete Mini, EDTA-free protease inhibitor cocktail (Roche), and 0.05% Triton X-100. Cell lysates were centrifuged at 13,000 rpm for 10 minutes at 4°C, and adsorbed onto 15 μL Ni2+-NTA slurry (Qiagen) per well, for 1 hour at 4°C. Ni2+-NTA was washed 3 times in lysis buffer with 20 mM imidazole, and resuspended in 2x SDS sample buffer for western blot, or eluted using lysis buffer with 300 mM imidazole, and dialyzed into appropriate buffer for in vitro ubiquitin conjugation reactions.
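The molar composition of the native lysis buffer can be converted into weighed masses as follows. This is a minimal sketch: the use of the monohydrate form of NaH2PO4 is an assumption (the salt form is not specified in the text), the formula weights are standard values, and the names in the code are hypothetical.

```python
# Formula weights in g/mol.
FORMULA_WEIGHTS = {
    "NaH2PO4 monohydrate": 137.99,  # assumption: monohydrate salt is used
    "NaCl": 58.44,
    "imidazole": 68.08,
}

# Native lysis buffer composition from the text, in mM.
NATIVE_LYSIS_BUFFER_MM = {
    "NaH2PO4 monohydrate": 50.0,
    "NaCl": 300.0,
    "imidazole": 10.0,
}


def grams_needed(conc_mm, formula_weight, volume_l):
    """Mass (g) of a solute for a given millimolar concentration and volume."""
    return conc_mm / 1000.0 * formula_weight * volume_l


def recipe(volume_l):
    """Return the mass in grams of each buffer component for volume_l litres."""
    return {
        name: grams_needed(conc_mm, FORMULA_WEIGHTS[name], volume_l)
        for name, conc_mm in NATIVE_LYSIS_BUFFER_MM.items()
    }


if __name__ == "__main__":
    for name, grams in recipe(1.0).items():
        print(f"{name}: {grams:.2f} g per litre")
```

The pH adjustment to 8.0 and the supplements (Ub-H, protease inhibitors, Triton X-100) are added separately and are not covered by this calculation.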

Immunoprecipitation: Each well of transfected cells was harvested by cell scraping and lysed by direct addition of 1 mL radioimmunoprecipitation (RIPA) buffer (50 mM Tris-HCl, pH 7.5, 150 mM NaCl, 1% NP-40, 0.1% SDS, and 1 mM EDTA) with Complete Mini, EDTA-free protease inhibitor cocktail. Insoluble material was removed by centrifugation at 13,000 rpm for 10 minutes at 4°C and lysate was precleared by the addition of 10 μL ProteinA-Sepharose (Sigma) for 1 hour at 4°C. Following removal of ProteinA-Sepharose, 3.7 μg monoclonal anti-HA antibody (clone HA-7, Sigma) was added for 2 hours at 4°C; protein-antibody conjugates were adsorbed onto 10 μL of ProteinA-Sepharose for 2 hours at 4°C. Beads were resuspended in 2x SDS sample buffer for western blots.

Preparation of S100 Fractions: HEK-293 were trypsinized, washed two times with ice cold D-PBS (Invitrogen), and resuspended in hypotonic lysis buffer (20 mM Tris-HCl pH 7.5, 8 mM KCl, 5 mM MgCl2, 1 mM DTT). The cell suspension was allowed to swell on ice for 20 minutes, and exposed to three freeze/thaw cycles by dry ice-ethanol. Lysates were centrifuged at 13,000 rpm for 10 minutes at 4°C, and then ultracentrifuged at 100,000 x g for 4 hours at 4°C.
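The clearing spins are reported in rpm while the ultracentrifugation is reported in units of g, so the two are only comparable once a rotor radius is known. The sketch below uses the standard relation RCF = 1.118 x 10^-5 x r x rpm^2 (r in cm); the 8 cm radius is an assumed value for a typical benchtop microcentrifuge rotor, not a parameter from this work, and the function names are hypothetical.

```python
import math

RCF_CONSTANT = 1.118e-5  # standard factor for radius in cm and speed in rpm


def rcf_from_rpm(rpm, radius_cm):
    """Relative centrifugal force (x g) for a rotor speed and radius."""
    return RCF_CONSTANT * radius_cm * rpm ** 2


def rpm_from_rcf(rcf, radius_cm):
    """Rotor speed (rpm) required to reach a given RCF at a given radius."""
    return math.sqrt(rcf / (RCF_CONSTANT * radius_cm))


if __name__ == "__main__":
    radius_cm = 8.0  # assumption: rotor radius is not given in the text
    print(f"13,000 rpm at {radius_cm} cm ~ {rcf_from_rpm(13_000, radius_cm):,.0f} x g")
    print(f"100,000 x g at {radius_cm} cm ~ {rpm_from_rcf(100_000, radius_cm):,.0f} rpm")
```

With the assumed radius, 13,000 rpm corresponds to roughly 15,000 x g; reporting the force directly avoids this dependence on the rotor.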

In vitro Ubiquitin Conjugation reactions: Ubiquitin, Ub-H, and the energy regenerating system (ERS) were purchased from Boston Biochem. ATP was purchased from Sigma. E1 was a gift from Dr. Simon Wing (McGill University).

Production and Purification of GST-UbcH5a, AR fusions: pGEX-2T-UbcH5a was a gift from Dr. Ron Hay (University of Dundee). The plasmid was transformed into XL2-Blue (Stratagene) for plasmid maintenance, and BL21 (Stratagene) for protein production. 25 mL cultures of LB-Amp were inoculated from glycerol stocks of pGEX-2T-UbcH5a and grown overnight at 37°C with shaking at 200 rpm. Overnight cultures were pelleted by centrifugation, resuspended in 250 mL cultures of LB-Amp, and grown to an optical density (OD600nm) of 0.6-0.8 at 37°C, 200 rpm. Induction of protein expression was performed by treatment with 100 μM isopropyl β-D-1-thiogalactopyranoside (IPTG) for 2-3 hours at 37°C. Cells were resuspended in glutathione binding buffer (50 mM Tris-HCl pH 7.5, 150 mM NaCl, 1 mM DTT, 0.05% Triton X-100, and Complete Mini, EDTA-free protease inhibitor cocktail), and lysed by sonication. Lysate was cleared by centrifugation at 13,000 rpm for 10 minutes at 4°C, and adsorbed onto 150 μL glutathione-Sepharose 4B (GE Amersham) at 4°C for 1 hour. The beads were washed at least 3 times in glutathione binding buffer, and eluted by addition of glutathione binding buffer supplemented with 10 mM glutathione. Proteins were buffer exchanged by size exclusion on Bio-Gel P-6 (BioRad), into appropriate buffers for in vitro ubiquitin conjugation. GST-UbcH5a-AR fusion proteins were constructed by standard cloning procedures. AR cDNA encoding amino acids 1-350 was amplified by PCR with flanking sequence coding for protease recognition sites and restriction sites and inserted into the UbcH5a open reading frame. Fusion proteins were produced and purified as for GST-UbcH5a.

4.5 Figures

[Figure 4.1 image: western blots probed with α-AR (upper panel) and α-HA (lower panel). Lanes 1-4 (lysate): HIS-AR + + + +; HA-Ub - - + +; MG132 - + - +. Lane 5: Ni2+-IMAC purified fraction. Molecular weight markers: 250, 160, 105, 75 kDa.]

Figure 4.1: HEK-293 cells were transfected with HIS-AR and/or HA-Ub, as indicated. 10% of the total lysate was loaded in each of the first 4 lanes. Ni2+ IMAC was performed on the sample cotransfected with HIS-AR and HA-Ub and treated with MG132. 20% of the Ni2+ purified fraction was loaded in the last lane. After probing the blot for AR immunoreactivity, it was stripped and probed for HA. Accumulation of high molecular weight species of AR, indicative of ubiquitin conjugation, was not observed.

[Figure 4.2 image: α-AR and α-HA western blots of lysate (lanes 1-4) and Ni2+-purified fractions (lanes 5-8). In each set of four lanes: HA-Ub + + - -; HA-Ub-HIS-AR - - + +; MG132 (5 μM) - + - +. Markers: 105 kDa (α-AR); 250, 160, 105 and 75 kDa (α-HA).]

Figure 4.2: HEK-293 cells were transfected with either HA-Ub or a plasmid containing open reading frames for both HA-Ub and HIS-AR. After 24 hours, transfected cells were treated with MG132 for 16 h, as indicated. 5% of whole cell lysate was loaded to verify transfection of both plasmid constructs, and Ni2+ IMAC was performed to purify HIS-AR. Ubiquitin-conjugated AR was not observed in the AR WB; high molecular weight smearing positive for HA in the Ni2+ purification (lower panel) is likely due to nonspecific binding of ubiquitinated protein to the Ni-NTA resin.

[Figure 4.3 image: α-AR and α-HA western blots of lysate (lanes 1-6) and α-HA IP (lanes 7-12). In each set of six lanes: HA-Ub - + - - + -; HA-Ub-HIS-AR - - + - - +; MG132 (5 μM) - - - + + +. Markers: 105 kDa (α-AR); 250, 160, 105 and 75 kDa (α-HA).]

Figure 4.3: HEK-293 cells were transfected with either HA-Ub or a plasmid containing open reading frames for both HA-Ub and HIS-AR. After 24 hours, transfected cells were treated with MG132 for 16 h, as indicated. 5% of whole cell lysate was loaded to verify transfection of both plasmid constructs, and immunoprecipitation for HA was performed to pull down HA-ubiquitin conjugates. HA-Ub is highly incorporated into high molecular weight complexes, which are purified by the anti-HA IP, indicating efficient ubiquitin conjugation with this substrate. However, HIS-AR is not coimmunoprecipitated, indicating that HIS-AR is not ubiquitin-conjugated under these conditions.

[Figure 4.4 image: α-AR and α-HA western blots of lysate (lanes 1-2) and α-HA IP (lanes 3-4). In each pair of lanes: HA-Ub - +; MG132 (5 μM) + +. Markers: 105 kDa (α-AR); 250, 160, 105 and 75 kDa (α-HA).]

Figure 4.4: LNCaP cells were transfected with HA-Ub. All cells were treated with MG132 for 16 h, as indicated. 5% of whole cell lysate was loaded to verify HA-Ub transfection. Immunoprecipitation for HA was performed to pull down HA-ubiquitin conjugates. As in HEK-293 cells, HA-Ub is incorporated into high molecular weight complexes, indicating functional incorporation of HA-Ub into ubiquitin conjugates. The endogenous AR from LNCaP is not coimmunoprecipitated by HA IP, indicating that AR is inefficiently ubiquitin-conjugated, or rapidly deubiquitinated.

[Figure 4.5 image: α-AR and α-HA western blots of lysate (lanes 1-4) and α-HA IP (lanes 5-8). Cell type labels: 293, Mdm2+, 293, Mdm2+; HA-Ub-HIS-AR + in all lanes; MG132 (5 μM) - + - + - + - +. Markers: 105 kDa (α-AR); 250, 160, 105 and 75 kDa (α-HA).]

Figure 4.5: As a comparative study, HEK-293 and HEK-293Mdm2 cells were transfected with the HA-Ub;HIS-AR plasmid; 24 hours post transfection, cells were treated with MG132 for 16 hours. Immunoprecipitation with anti-HA was performed to purify ubiquitin conjugates. Intriguingly, HIS-AR of endogenous size was coimmunoprecipitated from both cell lines, indicative of mono- and di-ubiquitination. Both the HEK-293Mdm2 cell line and MG132 treatment modestly increased ubiquitination. Apparent mono- and di-ubiquitination patterns, as observed here, occurred infrequently; the molecular event controlling their formation was never identified.

[Figure 4.6 image: α-AR and α-HA western blots of lysate (lanes 1-4) and α-HA IP (lanes 5-8). HA-Ub-HIS-AR + and MG132 (5 μM) + in all lanes; CPA - + - + - + - +; MB + - + - + - + -; MMTV-luc - - + + - - + +. Markers: 105 kDa (α-AR); 250, 160, 105 and 75 kDa (α-HA).]

Figure 4.6: To investigate the role of antagonist and transcriptional activity on AR ubiquitination, the HA-Ub;HIS-AR plasmid was transfected into HEK-293Mdm2 cells treated with the AR antagonist cyproterone acetate (CPA) or agonist mibolerone (MB) in the presence and absence of the AR reporter plasmid MMTV-luciferase. Ubiquitin conjugates of the AR could not be coimmunoprecipitated with the anti-HA IP.

[Figure 4.7 image: A. α-AR WB (105 kDa marker) and α-SAM68 blot; lanes: pcDNA3-AR - + + + + + +; PROTAC-5 (μM) - 0 100 175 250 300 250; MG132 (10 μM) - - - - - - +. B. α-AR WB and α-Ub WB (105 kDa marker), lanes 1 and 2 in each panel.]

Figure 4.7: A. HEK-293 cells were transfected with AR and treated for 1 hour with various concentrations of PROTAC-5. Treatments below 250 μM did not appreciably reduce levels of AR. At higher concentrations, AR level was reduced in a proteasome-dependent manner. Sam68 was used as a loading control. Accumulation of polyubiquitinated forms of the AR was not observed. B. In vitro treatment of HIS-AR with PROTAC-5. S100 fractions of HEK-293 cells were prepared and treated with MG132 for proteasome inhibition and Ub-H for DUB inhibition. HIS-AR, purified from transfected cells by Ni2+-IMAC, was added to the fraction along with PROTAC-5. The ubiquitin conjugation reaction was analyzed by WB; in the anti-AR blot (left panel), lane 1 represents 5% of the Ub-conjugation reaction at time 0 and lane 2 is 40% of the sample at one hour. The blot was stripped and reprobed by anti-Ub WB. Accumulation of high molecular weight α-AR-reactive species was not observed.

[Figure 4.8 image: time course at 0, 1.5 and 3 hours; markers at 75, 50 and 35 kDa.]

Figure 4.8: GST-UbcH5 has robust autoubiquitination activity in vitro. GST-UbcH5 (30 μg) was reacted with ubiquitin (1 mM), ATP (2 mM), and E1 (3 pmol) in 50 mM Tris-HCl, pH 7.5, at 37 °C for the indicated times.

[Figure 4.9 image: A. Time course at 0, 1.5 and 3 hours; markers at 75, 50 and 35 kDa. B. α-AR WB, time course at 0, 1.5 and 3 hours; markers at 160, 105 and 75 kDa.]

Figure 4.9: Representative in vitro ubiquitin conjugation reaction of bacterially produced GST-UbcH5a-p/p-AR-HIS fusion protein. A. In vitro ubiquitin conjugation of GST-UbcH5a, performed as a positive control. B. 5 μg of GST-UbcH5a-p/p-AR-HIS was reacted as in A, at 37 °C for the times indicated. Reaction components: ubiquitin (1 mM), ATP (2 mM), and E1 (3 pmol). The autoubiquitination activity of GST-UbcH5a is lost when it is produced as a fusion protein with the AR TAD.


4.6 References

1. Ciechanover, A., The ubiquitin-proteasome proteolytic pathway. Cell, 1994. 79(1): p. 13-21.
2. Deveraux, Q., et al., A 26 S protease subunit that binds ubiquitin conjugates. J Biol Chem, 1994. 269(10): p. 7059-61.
3. Lam, Y.A., et al., A proteasomal ATPase subunit recognizes the polyubiquitin degradation signal. Nature, 2002. 416(6882): p. 763-7.
4. Dennis, A.P. and B.W. O'Malley, Rush hour at the promoter: how the ubiquitin-proteasome pathway polices the traffic flow of nuclear receptor-dependent transcription. J Steroid Biochem Mol Biol, 2005. 93(2-5): p. 139-51.
5. Hager, G.L., et al., Dynamics of nuclear receptor movement and transcription. Biochim Biophys Acta, 2004. 1677(1-3): p. 46-51.
6. Lin, H.K., et al., Proteasome activity is required for androgen receptor transcriptional activity via regulation of androgen receptor nuclear translocation and interaction with coregulators in prostate cancer cells. J Biol Chem, 2002. 277(39): p. 36570-6.
7. Lin, H.K., et al., Phosphorylation-dependent ubiquitylation and degradation of androgen receptor by Akt require Mdm2 E3 ligase. Embo J, 2002. 21(15): p. 4037-48.
8. Ballinger, C.A., et al., Identification of CHIP, a novel tetratricopeptide repeat-containing protein that interacts with heat shock proteins and negatively regulates chaperone functions. Mol Cell Biol, 1999. 19(6): p. 4535-45.
9. Cardozo, C.P., et al., C-terminal Hsp-interacting protein slows androgen receptor synthesis and reduces its rate of degradation. Arch Biochem Biophys, 2003. 410(1): p. 134-40.
10. He, B., et al., An androgen receptor NH2-terminal conserved motif interacts with the COOH terminus of the Hsp70-interacting protein (CHIP). J Biol Chem, 2004. 279(29): p. 30643-53.
11. Adachi, H., et al., CHIP overexpression reduces mutant androgen receptor protein and ameliorates phenotypes of the spinal and bulbar muscular atrophy transgenic mouse model. J Neurosci, 2007. 27(19): p. 5115-26.
12. Burgdorf, S., P. Leister, and K.H. Scheidtmann, TSG101 interacts with apoptosis-antagonizing transcription factor and enhances androgen receptor-mediated transcription by promoting its monoubiquitination. J Biol Chem, 2004. 279(17): p. 17524-34.
13. Gaughan, L., et al., Regulation of androgen receptor and histone deacetylase 1 by Mdm2-mediated ubiquitylation. Nucleic Acids Res, 2005. 33(1): p. 13-26.
14. Conaway, R.C., C.S. Brower, and J.W. Conaway, Emerging roles of ubiquitin in transcription regulation. Science, 2002. 296(5571): p. 1254-8.
15. Kang, Z., et al., Involvement of proteasome in the dynamic assembly of the androgen receptor transcription complex. J Biol Chem, 2002. 277(50): p. 48366-71.
16. Grossman, S.R., et al., Polyubiquitination of p53 by a ubiquitin ligase activity of p300. Science, 2003. 300(5617): p. 342-4.
17. Valera, A.G., et al., Testing the possible inhibition of proteasome by direct interaction with ubiquitylated and aggregated huntingtin. Brain Res Bull, 2007. 72(2-3): p. 121-3.
18. Treier, M., L.M. Staszewski, and D. Bohmann, Ubiquitin-dependent c-Jun degradation in vivo is mediated by the delta domain. Cell, 1994. 78(5): p. 787-98.
19. Jin, Y., et al., MDM2 mediates p300/CREB-binding protein-associated factor ubiquitination and degradation. J Biol Chem, 2004. 279(19): p. 20035-43.
20. Sakamoto, K.M., et al., Development of Protacs to target cancer-promoting proteins for ubiquitination and degradation. Mol Cell Proteomics, 2003. 2(12): p. 1350-8.
21. Hon, W.C., et al., Structural basis for the recognition of hydroxyproline in HIF-1 alpha by pVHL. Nature, 2002. 417(6892): p. 975-8.
22. Schneekloth, J.S., Jr., et al., Chemical genetic control of protein levels: selective in vivo targeted degradation. J Am Chem Soc, 2004. 126(12): p. 3748-54.
23. Wender, P.A., et al., The design, synthesis, and evaluation of molecules that enable or enhance cellular uptake: peptoid molecular transporters. Proc Natl Acad Sci U S A, 2000. 97(24): p. 13003-8.
24. Cooper, H.J., et al., Identification of sites of ubiquitination in proteins: a Fourier transform ion cyclotron resonance mass spectrometry approach. Anal Chem, 2004. 76(23): p. 6982-8.


PREFACE TO CHAPTER 5

The expansion of the polyQ tract confers a gain-of-function on the contextual protein that has yet to be elucidated. Several groups have hypothesized that a novel structure is formed by polyQ tracts, and that this gain-of-structure is directly responsible for the toxic properties of the expansion. One hypothesis is that a conformational change from the native random coil structure of the polyQ tract to a β-sheet structure promotes self-association (“aggregation”), creating macrostructures that either directly or indirectly impair the proteasome. The revised version of this chapter details direct evidence of β-sheet structure unique to the polyQ tract expansion mutation of the Androgen Receptor, and also demonstrates an increased propensity of the mutant protein to aggregate.

ORIGINAL CONTRIBUTIONS TO KNOWLEDGE

In this chapter, a bacterial expression system for production of the AR TAD with

various polyQ tract lengths is described. A scheme for purification of recombinant AR

proteins was developed, and used for direct observation of differential physical properties

of the TAD with various polyQ tract lengths. An interesting finding is the report of

considerable charge heterogeneity in AR TAD with polyQ tracts of 20 and 50Q that was

not observed in the AR TAD with 0Q.

A new addition to the revised thesis is an entirely new set of data describing structural characterization of each AR TAD polyQ tract variant by dynamic light scattering (DLS) and circular dichroism (CD). A novel feature of the expression system described herein is the expression, purification and structural characterization of a polyQ tract deletion (“0Q”) variant. To my knowledge, there has never been a report of a protein harboring a complete polyQ tract deletion used for structural characterization. Considering that the polyQ tract expansion is a gain-of-function mutation, and that several groups have demonstrated that *any* length polyQ tract predisposes the host protein to aggregation, it is an interesting finding that the physical properties of the AR TAD 0Q protein differ significantly from those of either the 20Q or 50Q variants. Specifically, the aromatic CD spectrum for the AR TAD 0Q indicated considerable tertiary structure that was absent in both 20Q and 50Q. It is concluded that even wild-type length polyQ tracts are inherently structure-breaking elements that result in the AR TAD having a molten globule structure in the absence of AR-coactivator interaction. This is the first structural examination of a polyQ tract deletion mutant; significant differences between it and both wild-type and mutant AR TAD were observed, suggesting a novel function for the polyQ tract in the regulation of AR activity.

In addition, a set of DLS experiments with each polyQ tract variant revealed significant differences in dispersity. The AR TAD 0Q and 20Q were observed at very low polydispersity (6.7% and 10%, respectively), while the 50Q variant was highly polydisperse (46%), indicating that the expansion mutant is aggregated. While this is not the first in vitro demonstration of polyQ expansion protein aggregation, it is important to note that these data allow a distinction to be made between the structure-breaking function of the polyQ tract and the aggregative properties of an expanded tract, as the 20Q tract is both unstructured and monodisperse. This novel finding will be of considerable interest to researchers studying the structure of polyQ tract proteins.

ACKNOWLEDGEMENTS

• The author performed optimization of bacterial expression, affinity chromatography, and ion exchange chromatography.

• Dr. Lenore K. Beitel ran native PAGE gels and assisted in the editing of the manuscript.

• The author performed circular dichroism experiments in the laboratory of Dr. Joanne Turnbull at Concordia University.

• The author performed dynamic light scattering experiments in the laboratory of Dr. Albert Berghuis at McGill University.

• The pGEX-AR TAD 0Q, 20Q and 50Q expression plasmids were prepared previously in the Trifiro lab.

• Dr. Mark Trifiro assisted in editing of the manuscript.

CHAPTER 5 – Heterologous Expression, Purification and Structural Analysis of the AR N-Terminus

5.1 Introduction

5.1.1 Structure of the AR transactivational domain

Ligand-dependent activation of the AR is a complex, multi-step process requiring interaction of various coregulatory proteins and the putative involvement of several different types of post-translational modifications. While the structure-function relationship of AR ligand-binding has been probed by interpretations of X-ray diffraction patterns of LBD crystals, the structural properties of the AR TAD remain largely unknown. In fact, there is not a single report of X-ray crystallization for any NR TAD.

What is known regarding the functional properties of the AR TAD has mostly been garnered from two-hybrid and reporter gene assays; structural descriptions of AR TAD using circular dichroism (CD), fluorescence and Fourier transform infrared spectroscopy (FTIR) have only recently been reported [1, 2].

One explanation for the dearth of structural information on NR TADs is that the domain does not actually possess much structure. A surprising development in structural biology is the finding that many proteins have large regions that are natively unfolded, and only form well-ordered, functional structures upon stimulation [3]. Primary sequence analysis of unfolded proteins has revealed that a high proportion of proline, serine and glycine residues is associated with this propensity; this compositional bias is observed in the TAD of many NR. Indeed, the AR TAD displays the sequence signature for an unfolded protein domain [3]. Experimental secondary structure determination of the AR TAD in aqueous solution largely confirms the hypothesis of a predominantly unstructured domain. CD spectra of the AR TAD in aqueous solution display troughs characteristic of unfolded proteins, similar to reports of ERα/β-TAD and GR-AF-1; α-helical content of AR TAD was measured at 13-16%, while unstructured regions made up 24-36% of the domain [1, 2].

Investigations of unfolded or partially folded proteins have discovered that protein-protein interactions are often involved in conformational orientation for functional activation [4]. This phenomenon is likely to contribute to AR NTD function, as several reports indicate a higher percentage of α-helical structure when the domain is placed in a hydrophobic environment, incubated with the natural osmolyte trimethylamine-N-oxide (TMAO), or incubated with the coactivator protein RAP74 [2]. Consistent with a model of induced folding, these treatments produced a conformational change in the AR TAD, observed as a change in the fluorescence spectra corresponding to less solvent-exposed tryptophan residues, and as protease resistance [1]. Furthermore, helix-disrupting mutations introduced into the proposed AF-1 α-helical region altered the protease-resistance profile of the AR TAD [1]. Using FTIR spectroscopy, direct evidence of an induced α-helical region in the AR TAD was observed upon binding to RAP74; this induced helical content of the AR TAD was hypothesized to stabilize the interaction with the p160 coactivator family member SRC-1 [2]. These investigations suggest that the AR TAD is a structurally flexible domain, and that binding to biologically relevant interacting proteins promotes the formation of a more ordered secondary structure.

5.1.2 Structural investigations of the polyQ tract

A pathological hallmark of PolyQ Expansion Disease is the presence of

intracellular inclusion bodies in both affected and unaffected tissue. Whether or not the

formation of these structures is causally related to observed neurotoxicity, one established

property of uninterrupted tracts of glutamine residues is the propensity to self-associate in

vitro and in vivo. A mechanistic link between polyQ tract aggregation and amyloid fibril

formation is likely, and is supported by kinetic analysis and molecular dynamics

simulations of polyQ and the Alzheimer’s plaque protein Aβ aggregation, which suggest a transition to a stable β-sheet structure is the nucleation event responsible for formation of protein aggregates [5, 6].

Several hypotheses have been proposed to explain the propensity for expanded polyQ tracts to self-associate into protein aggregates. Early studies suggested that expanded tracts adopted an intrinsically different structure than nonpathogenic length polyQ tracts. Specifically, an exceptionally strong β-sheet structure was proposed to be a property of expanded tracts, stabilized not only by the classical β-sheet H-bonding pattern between peptide backbone atoms, but also by a significant contribution from H-bonding between glutamine side chain atoms. This idea dates to Max Perutz's original proposal of the “polar zipper” structure, whereby polyQ tracts form an antiparallel β-sheet stabilized by glutamine side chain amides positioned such that an amide group in one strand H-bonds with a carbonyl group in an adjacent strand [7].

An intriguing hypothesis providing a stereochemical explanation for the pathological threshold of polyQ tract length was proposed by Perutz upon reinterpretation of X-ray diffraction data of a small polyQ peptide (D2Q15K2) [8]. The formation of a cylindrical β-sheet structure with one helical turn per 20 glutamine residues was suggested to be stabilized by H-bonding between amides of successive turns, implying that a significant stabilization event would occur when each amide in the first turn has an H-bond partner with an amide in the second turn, at a minimum tract length of 40 glutamines [8]. However, more recent analysis of the same X-ray diffraction data contradicts this interpretation, and suggests a cross-β structure with exceptionally tight packing and twice the number of H-bonds suggested by Perutz [9].

Structural data have since accrued strongly suggesting that any length polyQ tract may adopt a β-sheet structure, and that transition to β-sheet conformation increases the propensity for polyQ tracts to aggregate. Experimental observations of polyQ peptides confirm a β-sheet structure using CD, electron microscopy (EM), infrared (IR) spectroscopy and X-ray diffraction [10, 11]. Analysis of a polyQ tract in the context of its native protein was performed using full-length ATXN-3 protein. CD, IR spectroscopy and EM were used to analyze normal (27Q) and expanded (78Q) polyQ tract lengths; while normal ATXN-3 protein exhibited stable α-helical secondary structural elements, the pathogenic polyQ tract displayed a rapid conformational change with large increases in β-strand structure in the polyQ region and concomitant loss of α-helical regions [12]. Using molecular modeling, this study reported that a parallel β-strand conformation is favored and maximizes H-bonding potential [12]. More evidence of β-sheet formation comes from structural studies of a myoglobin protein expressing polyQ insertions of various lengths [13]. Intramolecular antiparallel β-sheets were observed by IR spectroscopy and CD in pathogenic polyQ tracts that were not observed in normal-sized tracts. In addition, highly expanded polyQ tracts (50Q) were relocalized to the protein surface, where intermolecular β-sheet aggregation was observed [13].

There is little consensus regarding the native structure of nonpathogenic polyQ

tracts, but a random coil is likely [14, 15]. Experimental observations indicate polyQ

tracts are induced to form a stable β-sheet structure, with evidence accumulating that an

intramolecular conformational change resulting in β-sheet formation strongly promotes

aggregation with other polyQ tracts [15]. Kinetic analysis of this process suggests a

nucleated growth polymerization pathway [16] in which a highly unfavorable protein

folding reaction accounts for the conformational change and acts as the nucleation event;

aggregation then proceeds in an elongation phase that follows pseudo-first-order kinetics

[6].
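The elongation phase described above can be illustrated with a short numerical sketch. It assumes that, once nuclei have formed, the concentration of growing aggregate ends is effectively constant, so that soluble monomer is consumed with a single observed rate constant (pseudo-first-order behaviour). The function names, the rate constant and the starting concentration are hypothetical and are not taken from the cited studies.

```python
import math


def monomer_remaining(initial_conc: float, k_obs: float, time: float) -> float:
    """Soluble monomer concentration after `time` under pseudo-first-order
    elongation: M(t) = M0 * exp(-k_obs * t).

    k_obs lumps the elongation rate constant together with the (assumed
    constant) concentration of growing aggregate ends.
    """
    if initial_conc < 0 or k_obs < 0 or time < 0:
        raise ValueError("inputs must be non-negative")
    return initial_conc * math.exp(-k_obs * time)


def half_life(k_obs: float) -> float:
    """Time for the soluble monomer pool to fall to half: ln(2) / k_obs."""
    if k_obs <= 0:
        raise ValueError("k_obs must be positive")
    return math.log(2.0) / k_obs


# Hypothetical values: 10 uM monomer, k_obs = 0.25 per hour
k = 0.25
remaining_at_half_life = monomer_remaining(10.0, k, half_life(k))
```

This sketch covers only the elongation phase; the unfavorable nucleation step that precedes it is not modelled.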

To further the understanding of the molecular mechanisms of aggregation

associated with AR polyQ tract expansion, structural studies of the N-terminus of AR

were undertaken. A bacterial expression system was established in which the AR TAD

(11-341) was expressed as a fusion protein with GST with deleted (0Q), nonpathogenic

(20Q) and expanded (50Q) polyQ tract lengths. Using a combination of affinity, ion

exchange, and size exclusion chromatography, a method for efficient purification of AR

TAD was produced. Structural aspects of these polyQ tract variants were investigated by

a combination of DLS and both Far-UV and Near-UV CD.

5.2 Results and Discussion

5.2.1 Expression and purification of AR TAD


An N-terminal fragment of the AR TAD was expressed as a fusion protein with GST from a pGEX-KG-based vector previously constructed in our lab. An AccI/AccI restriction enzyme fragment of the AR gene, encoding amino acids 11-341, was ligated in-frame with GST cDNA, downstream from an intervening thrombin cleavage site. As indicated, polyQ tract lengths of 0Q, 20Q and 50Q were expressed in the context of these AR TAD fragments. Analysis of the AR TAD coding sequence revealed that greater than 6% of all codons are rare in E. coli; it was therefore decided to use the Rosetta 2 competent strain (Novagen), a BL21 derivative expressing 7 rare tRNA genes on a chloramphenicol-resistant plasmid. A comparison of expression levels between BL21 and Rosetta 2 did not reveal significant differences.

To optimize expression and affinity purification conditions, 5 mL cultures were produced and purified by glutathione-Sepharose 4B. Cultures of pGEX-KG-AR TAD were grown in LB with ampicillin (Amp, 100 μg/mL) and chloramphenicol (Chl, 34 μg/mL); protein expression was induced by treatment with IPTG (100 nM-1 mM). Experimental parameters that were varied include timing of growth phase, length of induction of protein expression, temperature of induction of protein expression, concentration of IPTG, concentration factor prior to cell lysis, and duration of glutathione-Sepharose 4B binding. All polyQ tract lengths of the GST-AR TAD were found to be well-expressed and soluble (Figure 5.1); however, some degree of proteolysis during bacterial expression was unavoidable in large scale expression cultures (Figures 5.2 A, 5.3 A, 5.4 A). Intriguingly, only a fraction of the GST-AR TAD bound to glutathione-Sepharose 4B, irrespective of column capacity. This was not due to an inherent lack of glutathione-binding capacity of the unbound GST-fusions, as individual samples of unbound GST-AR TAD subjected to iterative rounds of short (20 minute) glutathione-Sepharose 4B binding protocols demonstrated binding similar to the first pass.

Routinely, each plasmid was expressed from 500 mL cultures grown in triplicate, purified on glutathione-Sepharose 4B with iterative passes over two 1.5 mg capacity columns, washed extensively, pooled, and cleaved with 8-10 units of thrombin overnight at 4 °C. In preparation for anion exchange chromatography, samples were desalted and buffer exchanged into 20 mM piperazine-HCl pH 5.5 on a HiPrep 26/10 Desalting column (GE Amersham) controlled by the ÄKTA FPLC liquid handling system (GE Amersham). Low pH was used for two reasons. First, the calculated isoelectric point of the AR TAD is 4.77, indicating that the AR TAD is likely stable and negatively charged at pH 5.5. Second, many bacterial proteases have well-defined pH-activity profiles and are inactive at pH 5.5. Indeed, the AR TAD was observed to be more stable when anion exchange was performed at pH 5.5 than at pH 7.5.

The AR TAD was bound to a 1 mL MonoQ 5/50 anion exchange column (GE Amersham), and eluted with a linear NaCl gradient from 0-300 mM over 20 column volumes. Contaminant proteins, a majority of which were identified as proteolytic breakdown products of AR TAD by western blot, primarily eluted at low salt concentrations (40-85 mM). While the AR TAD-0Q was eluted in a single, well-resolved peak at roughly 120 mM NaCl (Figure 5.2 B, C), both AR TAD-20Q and -50Q were eluted in ill-defined “peaks” over an extremely large range of NaCl concentration (100-250 mM) (Figure 5.3 B, C; 5.4 B, C), indicating a large degree of charge heterogeneity conferred by the presence of a polyQ tract.
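For orientation, the position of an eluting species along the gradient can be estimated by linear interpolation. The sketch below uses the gradient described in the text (0-300 mM NaCl over 20 column volumes on a 1 mL column) and assumes an ideal linear gradient with no system delay or dead volume; the function names are illustrative and are not part of any chromatography software.

```python
def nacl_at_volume(elution_volume_ml: float, column_volume_ml: float = 1.0,
                   start_mm: float = 0.0, end_mm: float = 300.0,
                   gradient_cv: float = 20.0) -> float:
    """NaCl concentration (mM) at a given volume into an ideal linear gradient."""
    gradient_volume_ml = column_volume_ml * gradient_cv
    if not 0.0 <= elution_volume_ml <= gradient_volume_ml:
        raise ValueError("volume lies outside the gradient")
    fraction = elution_volume_ml / gradient_volume_ml
    return start_mm + (end_mm - start_mm) * fraction


def volume_at_nacl(conc_mm: float, column_volume_ml: float = 1.0,
                   start_mm: float = 0.0, end_mm: float = 300.0,
                   gradient_cv: float = 20.0) -> float:
    """Volume (mL) into the gradient at which a given NaCl concentration is reached."""
    if not min(start_mm, end_mm) <= conc_mm <= max(start_mm, end_mm):
        raise ValueError("concentration lies outside the gradient")
    fraction = (conc_mm - start_mm) / (end_mm - start_mm)
    return fraction * column_volume_ml * gradient_cv


# AR TAD-0Q peak at roughly 120 mM NaCl
peak_0q_ml = volume_at_nacl(120.0)
# Window over which AR TAD-20Q and -50Q eluted (100-250 mM NaCl)
polyq_window_ml = (volume_at_nacl(100.0), volume_at_nacl(250.0))
```

Under these assumptions the 0Q peak falls about 8 column volumes into the gradient, whereas the 20Q and 50Q material is spread over roughly 10 column volumes.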


Following anion exchange chromatography, proteins were buffer exchanged by size exclusion chromatography on a Bio-Gel P-6 column into appropriate buffers for downstream applications. Typical yield of pure AR TAD protein from 1.5 L cultures ranged from 0.7-1.0 mg.

As discussed, polyQ tracts have the intrinsic ability to self-associate by the formation of intermolecular β-sheet structure, forming protein aggregates. An intriguing hypothesis to explain the difference between the well-defined ionic properties of the AR TAD-0Q and the largely heterogeneous ionic properties of the AR TAD-20Q and -50Q is that the polyQ tract-containing proteins form heterogeneous protein aggregates. To test this hypothesis, nondenaturing (native) PAGE and denaturing SDS-PAGE analyses of purified proteins were performed (Figure 5.5 A, B). Nondenaturing PAGE analysis revealed a single protein band for all samples, irrespective of polyQ tract length, similar to SDS-PAGE analysis. Little evidence of protein aggregate formation was observed via nondenaturing PAGE, but this is a relatively nonanalytical method, and absence of a species with retarded mobility does not necessarily indicate a complete lack of aggregation of the expanded polyQ AR TAD.

5.2.2 Structural Investigation of the AR TAD polyQ Variants – Dynamic Light Scattering

Purified AR TAD polyQ variants were interrogated by several methods to elucidate structural characteristics. Undeniably, the most salient characteristic of the expanded polyQ tract is its propensity to self-associate into insoluble aggregates. This property is particularly relevant to the molecular pathology of SBMA, as the appearance of insoluble protein deposits of endogenously produced AR is a hallmark of the disease. In this study, dynamic light scattering (DLS) was used to measure the aggregation of

polyQ variants expressed and purified from E. coli. DLS is commonly used in the field

of crystallography to evaluate hydrodynamic size and polydispersity of protein samples

by measuring laser light that is scattered by electrons in the dissolved protein. Solutions

of AR TAD polyQ variants were normalized to 1.0 mg/mL in 10 mM KHPO4 buffer, pH 7.1, and filtered with a 0.1 µm syringe-top filter to remove any dust particles. The

exquisite sensitivity of DLS required filtration of even the negative control (buffer alone)

sample in order to avoid noise from dust particles. Figure 5.6 displays a representative

DLS histogram for the variants. The AR TAD 0Q (6.7%) and 20Q (10%) are both

considered monodisperse (10% cutoff, by convention). In sharp contrast, the AR TAD

50Q variant is considered highly polydisperse with a value of 46%, suggesting a large

degree of aggregation (Figure 5.6). Observation of polyQ expansion-dependent

aggregation recapitulates the most important physical characteristic of the polyQ

expansion disease phenotype, and helps to validate this bacterial expression and

purification system as useful for the study of the polyQ expansion disorder.
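The classification used above can be written out explicitly. The sketch assumes one common definition of percent polydispersity (the width of the size distribution divided by the mean hydrodynamic radius, times 100) and treats the 10% cutoff as inclusive, which matches the text's description of the 20Q variant as monodisperse. The function names are illustrative; the three percentages are the values reported in this section.

```python
def percent_polydispersity(mean_radius_nm: float, std_dev_nm: float) -> float:
    """Percent polydispersity as 100 * (distribution width / mean radius).

    This definition is an assumption; instrument software may compute the
    width of the distribution differently.
    """
    if mean_radius_nm <= 0 or std_dev_nm < 0:
        raise ValueError("mean radius must be positive and width non-negative")
    return 100.0 * std_dev_nm / mean_radius_nm


def is_monodisperse(polydispersity_pct: float, cutoff_pct: float = 10.0) -> bool:
    """Apply the conventional cutoff; values at the cutoff count as monodisperse."""
    return polydispersity_pct <= cutoff_pct


# Percent polydispersity values reported for the AR TAD polyQ variants
reported = {"0Q": 6.7, "20Q": 10.0, "50Q": 46.0}
classification = {variant: is_monodisperse(value)
                  for variant, value in reported.items()}
```

Only the 50Q variant exceeds the cutoff, in line with the interpretation that aggregation depends on the expansion.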

5.2.3 Structural Investigation of the AR TAD polyQ Variants – Circular Dichroism

Of the very few structural investigations of the AR TAD extant in the literature, a

common theme is the lack of defined secondary structural elements, and overall tertiary

structure. Figure 5.7 displays the predicted secondary structural elements by the Self

Optimized Prediction Method from Alignments (SOPMA) [17]. As expected, a large percentage of the protein is predicted to be in a random coil conformation, with some α-helix structure. In an attempt to elucidate the effect of the polyQ tract upon structural features of the AR TAD, a polyQ tract deletion (0Q), a “normal” tract length (20Q) and an expansion mutant length tract (50Q) were analyzed using CD spectroscopy. AR TAD polyQ proteins were expressed as discussed above, buffer exchanged into 10 mM KHPO4, pH 7.1, and analyzed with a Jasco-815 spectropolarimeter. Figure 5.8 displays an overlay of the Far-UV CD spectra for each AR TAD polyQ variant. The 0Q and 20Q variants are remarkably similar in secondary structure content, with characteristic absorption around 222 nm and 209 nm typical for a protein with α-helical structure, offset by a large degree of random coil conformation, which absorbs in the same range and partially obscures signal from the α-helix. The CD spectrum of the expanded (50Q) polyQ variant is largely similar to those of the 0Q and 20Q variants from 250 nm to 210 nm, suggesting little disruption of secondary structure by elongation of the polyQ tract. However, the 50Q variant displays a significantly different absorption in the 210 nm to 200 nm range; absorption in this range is indicative of β-sheet structure that is not present in the deletion or normal polyQ tract variant. This demonstration of polyQ expansion-dependent β-sheet structure corroborates the experimental and molecular modeling data that predict a β-sheet structure for expanded polyQ tracts, and is the first demonstration of polyQ tract-expansion-dependent β-sheet structure for the AR TAD. Especially intriguing is the correlation between β-sheet structure measured by Far-UV CD and aggregation, as evidenced by the high percentage of polydispersity measured by DLS, both of which were observed exclusively in the expanded polyQ tract length.

Near-UV CD spectra were measured for each polyQ tract variant in order to evaluate the effect of polyQ tract length on tertiary structure (Figure 5.8). In this range the aromatic amino acids are the chromophores; the CD spectra they produce reflect the local environment surrounding these groups and can be sensitive to overall tertiary structure. There are no tryptophan residues in this AR TAD preparation, so absorption is expected in the 290 nm to 250 nm range. The polyQ deletion mutant displayed a broad absorption peak through the entire absorption range, indicating CD asymmetry of the aromatic amino acids in the AR TAD 0Q. However, this asymmetry was completely lost in both polyQ tract-containing variants, indicating a significant difference in the local environment of the aromatic amino acids and, likely, a complete lack of tertiary structure in the presence of any length polyQ tract. This evidence strongly supports the idea that any length polyQ tract is a folding-inhibiting sequence, and is consistent with a large body of literature demonstrating that proteins containing polyQ tracts, especially the domains in which the polyQ tracts occur, are highly solvent exposed and are very susceptible to protease digestion in vivo ([18] and references therein, [19]).

In summary, the Far-UV CD spectra for the various polyQ tract lengths demonstrate the presence of α-helical secondary structural elements in all tested polyQ tract variants. An intriguing finding is that the deletion mutant shows evidence of tertiary structure by Near-UV CD that is completely abrogated by the presence of even “normal” polyQ tract lengths. Therefore, polyQ tracts of any length appear to disrupt tertiary structure while allowing maintenance of secondary structural elements. This scenario is highly reminiscent of the molten globule paradigm of partially unfolded proteins, and strongly corroborates the “induced-fit” hypothesis of nuclear receptor TADs. It is possible that an AR TAD with a polyQ tract deletion adopts the “induced-fit” tertiary structure without the need for a binding partner. Insertion of a polyQ tract of any length disrupts this tertiary structure in a length-dependent manner, and provides an extra layer of negative regulation of receptor transcriptional function in the apo-form, i.e. in the absence of co-regulatory proteins that are recruited by the conformational change induced by androgen binding. One of the most well-established principles of AR transactivation supports this hypothesis: the inverse correlation between polyQ tract length and transcriptional activity [20, 21]; the longer the tract, the greater the structure-breaking force and thus the lower the transactivational potential. The polyQ tract might thus be thought of as a protective sequence preventing hyperactivity of AR transcriptional function, with its polymorphic length serving to fine-tune the response to hormone.

5.3 Conclusions

The AR TAD is expressed as a soluble, stable protein. One consistent issue encountered when expressing and purifying this protein is susceptibility to protease digestion. While the protein is observed to have some secondary and tertiary structure, large portions of the protein are likely unfolded with protease recognition sites exposed.

Purification under conditions in which proteases are inactive increased stability.

An intriguing finding regarding the ion exchange protocol is the apparent charge homogeneity of the AR TAD containing a deletion of the polyQ tract compared to extremely variable ionic properties of both normal and expanded polyQ AR TAD. The formation of high molecular weight aggregates is not a suitable explanation for this phenomenon because DLS data clearly demonstrate the AR TAD-20Q is present as a monodisperse species, yet this variant still exhibits large charge heterogeneity.

Purified AR TAD polyQ tract variants were examined by DLS and CD to elucidate structural aspects of these proteins. DLS demonstrated that the 0Q and 20Q versions were monodisperse in aqueous solution. The DLS data suggest the 0Q version may be suitable for crystallization studies, which would be a first for a nuclear receptor TAD. The polyQ-expanded AR TAD, however, was observed to be highly polydisperse at the same concentration and in the same buffer as the 0Q and 20Q versions, suggesting that, in this experimental design, the polyQ tract expansion is responsible for considerable aggregation.

Far-UV CD spectra of the AR TAD revealed little well-defined secondary structure, namely a small degree of α-helix and a large signal resulting from the overwhelming random coil structure. These experimental data match well with several secondary structure prediction algorithms. An intriguing finding is the appearance of β-sheet structure unique to the polyQ-expanded variant. The value of this finding cannot be overstated because it provides experimental support for the hypothesis that intramolecular conversion to a β-sheet structure is the primary driving feature leading to self-association of polyQ tracts. Importantly, under exactly the same experimental conditions for production, purification and analysis, the AR TAD non-expanded polyQ tract variants displayed no β-sheet structure and were not aggregated, while the expanded polyQ variant exhibited both β-sheet structure and aggregation, implying a relationship between the two states. However, it would be inappropriate to conclude a causal relationship based on the data collected here.

Near-UV CD spectra were analyzed for each AR TAD polyQ variant. The 0Q version displayed CD asymmetry, suggesting a tertiary structure that was completely absent in AR TADs harboring a polyQ tract of any length. It is intriguing to note that disrupted tertiary structure is not a consequence of an expanded polyQ tract, as the AR TAD 20Q is similar to the 50Q, and significantly different from the 0Q version. It is concluded that a polyQ tract of any length is a structure-disrupting sequence. Based on the Far- and Near-UV CD spectra, a molten globule structure is proposed for AR TAD containing a polyQ tract of any length. Thus, the data collected here suggest the wild-type AR TAD is present in a permanent folding-intermediate state. Based on a survey of the literature, it would follow that the AR TAD is only fully activated upon association with co-activators that promote an “induced-fit” structure. These experimental data provide a structure-based explanation for the inverse correlation between AR transactivational potential and polyQ tract length, and suggest the polyQ tract may provide a layer of regulation preventing hyperactive AR signaling.

5.4 Experimental Procedures

Expression and purification of AR TAD: For each 500 mL culture of pGEX-KG-AR TAD, two 25 mL cultures of LB with Amp (100 μg/mL) and Chl (34 μg/mL) were inoculated with pGEX-KG-AR TAD Rosetta 2 cells for overnight growth at 37 °C, with shaking at 200 rpm. The next day, the two cultures were pelleted by centrifugation at 3000 rpm for 10 minutes at 4 °C, and resuspended in 500 mL fresh LB-Amp-Chl. Cultures were grown at 37 °C, 200 rpm until an OD600nm of 0.7 was reached (roughly 2 hours 15 minutes). Protein expression was induced by treatment with 0.4 mM IPTG for 2 hours at 37 °C, 200 rpm. Bacteria were pelleted by centrifugation at 6000 rpm for 10 minutes at 4 °C, resuspended (1:100 concentration factor) in glutathione binding buffer (50 mM Tris-HCl, pH 7.5, 150 mM NaCl, 1 mM DTT, 1 mM ATP, 1 mM MgCl2, 0.05% Triton X-100, and Complete Mini, EDTA-free protease inhibitor cocktail (Roche)), and sonicated on ice for 2.5 minutes with 10 second off/on cycles. Lysate was adsorbed onto 200 μL of prewashed glutathione-Sepharose 4B slurry (capacity: 1.5 mg fusion protein) for 20 minutes at 4 °C. Glutathione-Sepharose 4B beads were pelleted by centrifugation, and the lysate was adsorbed onto a fresh 200 μL slurry of glutathione-Sepharose for another 20 minutes at 4 °C. Beads were washed 4 times in 5 mL glutathione binding buffer, and routinely pooled with two parallel purifications (from 500 mL culture volume each) for overnight thrombin cleavage with 8-10 units of thrombin.

Anion exchange chromatography: Samples were desalted and buffer exchanged into 20 mM piperazine-HCl, pH 5.5, using a HiPrep 26/10 Desalting column (GE Amersham) controlled by the ÄKTA FPLC liquid handler according to the manufacturer’s instructions. The sample was eluted from the size exclusion column over 10 mL and applied directly to a 1 mL MonoQ 5/50 column (GE Amersham). Sample binding and washing of contaminants was performed over 17 mL, and the sample was subsequently eluted with a linear NaCl gradient of 0-300 mM over 20 mL. Each separation was monitored with an on-line UV source, and protein-containing eluate was collected in 0.2 mL fractions. Analysis of protein-containing fractions was performed by denaturing SDS-PAGE.
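As an illustration of the elution conditions described above, the NaCl concentration delivered at any point in the linear gradient can be estimated by simple interpolation. The following Python sketch assumes an ideal linear 0-300 mM gradient over 20 mL and ignores system dead volume; the function name and its parameters are hypothetical and were not part of the original protocol.

```python
def nacl_at_gradient_volume(volume_ml, start_mm=0.0, end_mm=300.0, gradient_ml=20.0):
    """Return the nominal NaCl concentration (mM) at a given volume into a linear gradient.

    volume_ml is measured from the start of the gradient, not from sample injection.
    Defaults follow the 0-300 mM over 20 mL gradient stated in the text.
    """
    if not 0.0 <= volume_ml <= gradient_ml:
        raise ValueError("volume_ml must lie within the gradient (0 to gradient_ml)")
    return start_mm + (end_mm - start_mm) * volume_ml / gradient_ml


# Nominal salt concentration halfway through the gradient
print(nacl_at_gradient_volume(10.0))
```

The volumes plotted on the chromatograms likely include the binding and wash steps, so the start of the gradient would have to be subtracted from a plotted elution volume before using this function.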

Dynamic Light Scattering: DLS measurements were collected using a DynaPro 99E (Proterion, Protein Solutions, Santa Barbara, CA) operating at a wavelength of 824.9 nm. Samples of AR TAD protein were buffer exchanged on a HiPrep 26/10 desalting column into 10 mM KHPO4, pH 7.1, and diluted to a final protein concentration of 1.0 mg/mL. It was necessary to filter each sample (including the negative control buffer sample) with a 0.1 µm syringe-top filter (Whatman) in order to avoid interference from dust particles. Data were analyzed with Dynamics V6 software; refractive index and viscosity values for the phosphate buffer were used as provided by the software.
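For orientation, percent polydispersity in DLS is commonly reported as the width (standard deviation) of the size distribution divided by the mean hydrodynamic radius. The Python sketch below applies that definition and sorts the values reported for the three variants (6.7%, 10% and 46%; Figure 5.6) using a 15% cutoff. The cutoff is an assumed rule of thumb, not a value taken from this work or from the analysis software, and the function names are hypothetical.

```python
def percent_polydispersity(mean_radius_nm, sd_nm):
    """Percent polydispersity: distribution width relative to the mean radius."""
    return 100.0 * sd_nm / mean_radius_nm


def classify(pd_percent, cutoff=15.0):
    """Label a sample using an assumed rule-of-thumb cutoff (not from the thesis)."""
    return "monodisperse" if pd_percent <= cutoff else "polydisperse"


# Percent polydispersity values reported in Figure 5.6
reported = {"0Q": 6.7, "20Q": 10.0, "50Q": 46.0}
labels = {variant: classify(value) for variant, value in reported.items()}

for variant, label in labels.items():
    print(variant, reported[variant], label)
```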

Circular Dichroism: Far-UV and Near-UV CD spectra of AR TAD polyQ variants were recorded on a Jasco-815 spectropolarimeter in a 2-mm path length rectangular cell. The cell chamber was connected to a Peltier-type temperature control system connected to a thermostated circulating water bath maintained at 20 °C. AR TAD polyQ variants were buffer exchanged on a HiPrep 26/10 desalting column into 10 mM KHPO4, pH 7.1, and then diluted to an appropriate concentration in the same buffer. The protein concentration of each polyQ variant was adjusted to an equivalent concentration: Far-UV CD spectra were collected at a protein concentration of 0.45 mg/mL, while the Near-UV CD spectra were measured at 1.5 mg/mL. For Far-UV CD, spectra were collected by averaging 5 wavelength scans from 250 nm to 205 nm (1 nm bandwidth) in 0.2 nm steps at a rate of 50 nm/min with a 0.25 second response. Near-UV CD spectra were recorded from 240 nm to 350 nm at 20 °C by averaging 5 wavelength scans (1 nm bandwidth) in 0.2 nm steps at a rate of 20 nm/min with a 2 second response.
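The spectra in this chapter are reported as raw ellipticity in millidegrees. For comparison between proteins of different length, such data are often converted to mean residue ellipticity. The following Python sketch shows the standard conversion using the path length (2 mm) and Far-UV protein concentration (0.45 mg/mL) given above and the 327-residue length of the AR TAD 20Q sequence in Figure 5.7. The molecular weight and the example ellipticity are placeholder values assumed for illustration only, the function names are hypothetical, and this conversion was not part of the original analysis.

```python
def mean_residue_weight(molecular_weight_da, n_residues):
    """Mean residue weight (Da): molecular weight divided by the number of peptide bonds."""
    return molecular_weight_da / (n_residues - 1)


def mean_residue_ellipticity(theta_mdeg, mrw_da, path_cm, conc_mg_per_ml):
    """Convert observed ellipticity (mdeg) to mean residue ellipticity (deg cm^2 dmol^-1)."""
    return theta_mdeg * mrw_da / (10.0 * path_cm * conc_mg_per_ml)


# Values stated in the text
PATH_CM = 0.2                 # 2-mm path length cell
FAR_UV_CONC_MG_PER_ML = 0.45  # Far-UV protein concentration
N_RESIDUES = 327              # length of the AR TAD 20Q sequence in Figure 5.7

# Assumed placeholders, for illustration only (not measured values)
ASSUMED_MW_DA = 35000.0
EXAMPLE_THETA_MDEG = -50.0

mrw = mean_residue_weight(ASSUMED_MW_DA, N_RESIDUES)
mre = mean_residue_ellipticity(EXAMPLE_THETA_MDEG, mrw, PATH_CM, FAR_UV_CONC_MG_PER_ML)
print(round(mrw, 1), round(mre))
```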

5.5 Figures

[Figure 5.1 image: SDS-PAGE gel, lanes 1-6; molecular weight markers at 105, 75, 50 and 35 kDa.]

Figure 5.1: Small scale production and purification of GST-AR TAD 0Q, 20Q, 50Q. Lane 1: 1% homogenate of GST-TAD0Q. Lane 2: 10% Glutathione-Sepharose-bound GST-TAD0Q. Lane 3: 1% homogenate of GST-TAD20Q. Lane 4: 10% GSH-Sepharose-bound GST-TAD20Q. Lane 5: 1% homogenate of GST-TAD50Q. Lane 6: 10% GSH-Sepharose-bound GST-TAD50Q.

[Figure 5.2 image. Panel A: SDS-PAGE gel, lanes 1-4, with molecular weight markers in kDa. Panel B: chromatogram titled “MonoQ AEX - AR 11-341 0Q”, plotting UV absorbance (mAU) and conductance (mS/cm) against volume (mL). Panel C: SDS-PAGE gel of AEX fractions, wells labeled by elution volume (22 to 31 mL).]

Figure 5.2: Expression and purification of AR TAD-0Q. A. Typical purification from 500 mL LB-Amp-Chl culture. 1. Homogenate: 0.01% of prep, 2. GSH-Sepharose-bound AR TAD: 0.12%, 3. Thrombin cleavage supernatant: 0.12%, 4. Thrombin cleavage, GSH-Sepharose bound: 0.12%. B. Anion exchange chromatography of AR TAD-0Q. C. SDS-PAGE analysis of anion exchange (AEX) in B. Wells are labeled with the elution volume corresponding to B.

[Figure 5.3 image. Panel A: SDS-PAGE gel, lanes 1-4, with molecular weight markers in kDa. Panel B: chromatogram titled “MonoQ AEX - AR 11-341 20Q”, plotting UV absorbance (mAU) and conductance (mS/cm) against volume (mL). Panel C: SDS-PAGE gel of AEX fractions, wells labeled by elution volume (25 to 37 mL).]

Figure 5.3: Expression and purification of AR TAD-20Q. A. Typical purification from 500 mL LB-Amp-Chl culture. 1. Homogenate: 0.01% of prep, 2. GSH-Sepharose-bound AR TAD: 0.12%, 3. Thrombin cleavage supernatant: 0.12%, 4. Thrombin cleavage, GSH-Sepharose bound: 0.12%. B. Anion exchange chromatography of AR TAD-20Q. C. SDS-PAGE analysis of AEX in B. Wells are labeled with the elution volume corresponding to B.

[Figure 5.4 image. Panel A: SDS-PAGE gel, lanes 1-4, with molecular weight markers in kDa. Panel B: chromatogram titled “MonoQ AEX - AR 11-341 50Q”, plotting UV absorbance (mAU) and conductance (mS/cm) against volume (mL). Panel C: SDS-PAGE gel of AEX fractions, wells labeled by elution volume (27 to 36 mL).]

Figure 5.4: Expression and purification of AR TAD-50Q. A. Typical purification from 500 mL LB-Amp-Chl culture. 1. Homogenate: 0.01% of prep, 2. GSH-Sepharose-bound AR TAD: 0.12%, 3. Thrombin cleavage supernatant: 0.12%, 4. Thrombin cleavage, GSH-Sepharose bound: 0.12%. B. Anion exchange chromatography of AR TAD-50Q. C. SDS-PAGE analysis of AEX in B. Wells are labeled with the elution volume corresponding to B.

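[Figure 5.5 image. Panel A: SDS-PAGE gel, lanes 1-3, with molecular weight markers at 50 and 35 kDa. Panel B: nondenaturing PAGE gel, lanes 1-3, with molecular weight markers at 50 and 35 kDa.]

Figure 5.5: A. SDS-PAGE of purified AR. AR TAD 0Q, 20Q, 50Q were loaded in lanes 1, 2 and 3, respectively, at 2.5 μg per well. B. Nondenaturing PAGE of purified AR. AR TAD 0Q, 20Q, 50Q were loaded in lanes 1, 2, and 3, respectively, at 2.5 μg per well.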

Figure 5.6: DLS of TAD-AR polyQ variants. All proteins were measured at a standardized concentration of 1.0 mg/mL. TAD-AR peaks were measured at a polydispersity of 6.7% (0Q), 10% (20Q) and 46% (50Q), strongly suggesting significant aggregation in only the polyQ-expanded protein.

        10        20        30        40        50        60        70
         |         |         |         |         |         |         |
YPRPPSKTYRGAFQNLFQSVREVIQNPGPRHPEAASAAPPGASLLLLQQQQQQQQQQQQQQQQQQQQETS
cccccccchhhhhhhhhhhhhhhhcccccccccccccccttceeeeeecccccccccccccccccccccc
PRQQQQQQGEDGSPQAHRRGPTGYLVLDEEQQPSQPQSALECHPERGCVPEPGAAVAASKGLPQQLPAPP
ccccccccccccccccccccccceeeeccccccccccchhcccttccccccttchhhhcttccccccccc
DEDDSAAPSTLSLLGPTFPGLSSCSADLKDILSEASTMQLLQQQQQEAVSEGSSSGRAREASGAPTSSKD
cccccccchhheeecccccccccchhhhhhhhhhhhhhhhhhhhhhhhhhttcccccchhhtcccccccc
NYLGGTSTISDNAKELCKAVSVSMGLGVEALEHLSPGEQLRGDCMYAPLLGVPPAVRPTPCAPLAECKGS
ceetccccchhhhhhhhhhhhhhhtcchhhhhhccttccccccheeccccccccccccccccchhhttcc
LLDDSAGKSTEDTAEYSPFKGGYTKGLEGESLGCSGSAAAGSSGTLE
eecccccccccchhhccccccccccccttcccccccchhhtccccee

SOPMA:
Alpha helix (Hh): 85 is 25.99%
310 helix (Gg): 0 is 0.00%
Pi helix (Ii): 0 is 0.00%
Beta bridge (Bb): 0 is 0.00%
Extended strand (Ee): 21 is 6.42%
Beta turn (Tt): 20 is 6.12%
Bend region (Ss): 0 is 0.00%
Random coil (Cc): 201 is 61.47%
Ambigous states (?): 0 is 0.00%
Other states: 0 is 0.00%

Figure 5.7: AR TAD 20Q secondary structural elements. Secondary structure prediction for AR TAD 20Q by SOPMA [17]. Aromatic amino acids that contribute to the Near-UV signal are bolded.
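The percentages in the SOPMA summary are simply the number of residues assigned to each state divided by the sequence length (327 residues here). A minimal Python sketch of that tally is given below; the function name and the short example string are hypothetical and serve only to illustrate the calculation, using the one-letter state codes shown in the prediction (h, e, t, c).

```python
from collections import Counter

# One-letter state codes as they appear in the prediction lines above
STATE_NAMES = {
    "h": "alpha helix",
    "e": "extended strand",
    "t": "beta turn",
    "c": "random coil",
}


def composition(prediction):
    """Return {state name: (count, percent)} for a per-residue prediction string."""
    counts = Counter(prediction)
    total = len(prediction)
    return {
        name: (counts.get(code, 0), 100.0 * counts.get(code, 0) / total)
        for code, name in STATE_NAMES.items()
    }


# Hypothetical 10-residue example, not taken from the AR TAD prediction
example = composition("hhhhcccctt")
for name, (count, percent) in example.items():
    print(f"{name}: {count} is {percent:.2f}%")
```

Applied to the counts reported above, 85 helical residues out of 327 gives 25.99%, in agreement with the SOPMA summary.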


[Figure 5.8 image: “Far-UV Circular Dichroism Spectrum of AR-TAD PolyQ Variants”; ellipticity (mdeg) plotted against wavelength (nm) for TAD 0Q, TAD 20Q and TAD 50Q.]

Figure 5.8: Far-UV Circular Dichroism spectra of AR TAD polyQ variants. The CD spectra are largely in agreement with predicted secondary structure elements, suggesting a majority of the protein has a random coil arrangement, with small amounts of α-helix. A significant difference in the spectral patterns is observed in the 205 nm to 200 nm range, suggesting β-structure that is unique to the 50Q variant.

[Figure 5.9 image: “Near-UV Circular Dichroism Spectrum for AR TAD PolyQ Variants”; ellipticity (mdeg) plotted against wavelength (nm) for TAD 0Q, TAD 20Q and TAD 50Q.]

Figure 5.9: Near-UV circular dichroism spectra of AR TAD polyQ variants. Circular dichroism is observed for the aromatic amino acids in the AR TAD 0Q variant, suggesting a structured local environment. CD spectra for the 20Q and 50Q variants are completely flat, suggesting the presence of the polyQ tract disrupts the structural elements observed in the AR TAD 0Q variant.

5.6 References

1. Reid, J., et al., Conformational analysis of the androgen receptor amino-terminal domain involved in transactivation. Influence of structure-stabilizing solutes and protein-protein interactions. J Biol Chem, 2002. 277(22): p. 20079-86.
2. Kumar, R., et al., Induced alpha-helix structure in AF1 of the androgen receptor upon binding transcription factor TFIIF. Biochemistry, 2004. 43(11): p. 3008-13.
3. Kumar, R. and E.B. Thompson, Transactivation functions of the N-terminal domains of nuclear hormone receptors: protein folding and coactivator interactions. Mol Endocrinol, 2003. 17(1): p. 1-10.
4. Uversky, V.N., J.R. Gillespie, and A.L. Fink, Why are "natively unfolded" proteins unstructured under physiologic conditions? Proteins, 2000. 41(3): p. 415-27.
5. Marchut, A.J. and C.K. Hall, Side-chain interactions determine amyloid formation by model polyglutamine peptides in molecular dynamics simulations. Biophys J, 2006. 90(12): p. 4574-84.
6. Wetzel, R., Kinetics and thermodynamics of amyloid fibril assembly. Acc Chem Res, 2006. 39(9): p. 671-9.
7. Perutz, M.F., et al., Polar zippers. Curr Biol, 1993. 3(5): p. 249-53.
8. Perutz, M.F., et al., Amyloid fibers are water-filled nanotubes. Proc Natl Acad Sci U S A, 2002. 99(8): p. 5591-5.
9. Sikorski, P. and E. Atkins, New model for crystalline polyglutamine assemblies and their connection with amyloid fibrils. Biomacromolecules, 2005. 6(1): p. 425-32.
10. Perutz, M., Polar zippers: their role in human disease. Protein Sci, 1994. 3(10): p. 1629-37.
11. Perutz, M.F., et al., Glutamine repeats as polar zippers: their possible role in inherited neurodegenerative diseases. Proc Natl Acad Sci U S A, 1994. 91(12): p. 5355-8.
12. Bevivino, A.E. and P.J. Loll, An expanded glutamine repeat destabilizes native ataxin-3 structure and mediates formation of parallel beta-fibrils. Proc Natl Acad Sci U S A, 2001. 98(21): p. 11955-60.
13. Tanaka, M., et al., Intra- and intermolecular beta-pleated sheet formation in glutamine-repeat inserted myoglobin as a model for polyglutamine diseases. J Biol Chem, 2001. 276(48): p. 45470-5.
14. Marchut, A.J. and C.K. Hall, Effects of chain length on the aggregation of model polyglutamine peptides: molecular dynamics simulations. Proteins, 2007. 66(1): p. 96-109.
15. Chen, S., F.A. Ferrone, and R. Wetzel, Huntington's disease age-of-onset linked to polyglutamine aggregation nucleation. Proc Natl Acad Sci U S A, 2002. 99(18): p. 11884-9.
16. Ferrone, F., Analysis of protein aggregation kinetics. Methods Enzymol, 1999. 309: p. 256-74.
17. Geourjon, C. and G. Deleage, SOPMA: significant improvements in protein secondary structure prediction by consensus prediction from multiple alignments. Comput Appl Biosci, 1995. 11(6): p. 681-4.
18. Tarlac, V. and E. Storey, Role of proteolysis in polyglutamine disorders. J Neurosci Res, 2003. 74(3): p. 406-16.
19. Masino, L., et al., Domain architecture of the polyglutamine protein ataxin-3: a globular domain followed by a flexible tail. FEBS Lett, 2003. 549(1-3): p. 21-5.
20. Giovannucci, E., et al., The CAG repeat within the androgen receptor gene and its relationship to prostate cancer. Proc Natl Acad Sci U S A, 1997. 94(7): p. 3320-3.
21. La Spada, A.R., et al., Androgen receptor gene mutations in X-linked spinal and bulbar muscular atrophy. Nature, 1991. 352(6330): p. 77-9.

FINAL CONCLUSION AND SUMMARY

The pathogenic properties of the polyQ tract expansion in the AR remain largely unknown. One attractive hypothesis is that the polyQ expansion presents a uniquely difficult substrate for ubiquitin-dependent proteasome degradation. Inhibition or impairment of polyQ-expanded AR degradation by the proteasome would neatly explain the presence of inclusion bodies that contain insoluble protein material comprising the AR, ubiquitin and protein subunits of the proteasome. However, this cannot be the whole story: while toxicity in PolyQ Expansion Disease is restricted to specific neuronal subpopulations, general inhibition of the proteasome is likely incompatible with cellular life, and inclusion bodies have been shown to form both in neurons fated to die and in healthy cells without any other indicators of pathology.

It was demonstrated in this thesis that androgen-activated AR with a polyQ tract expansion promotes nuclear proteasome impairment in the absence of inclusion body formation. Restricting androgen-activated AR to the cytoplasm promoted the rapid formation of large inclusion bodies, and protected against global proteasome dysfunction. Cytoplasmic inclusion bodies also contained a synthetically-targeted proteasome substrate, suggesting that proteasomes were recruited to inclusion bodies. Taken together, these results indicate a protective role for cytoplasmic inclusion bodies that are rapidly produced following androgen activation of the AR. In contrast, soluble, polyQ-expanded AR in the nucleus caused nuclear proteasome impairment, suggesting that soluble or microaggregated polyQ-expanded AR might be responsible for pathogenesis. These results indicate that the nucleus cannot rapidly form the inclusion bodies required to delay or defer proteasome impairment by expanded polyQ tracts; this observation may explain the increasing evidence that nuclear localization is essential for toxicity of the polyQ-expanded protein. It is hypothesized that the formation of nuclear inclusions, which are rare in cell culture and uniformly require prolonged expression of polyQ-expanded AR in SBMA cell culture models, occurs more slowly or via a different mechanism than cytoplasmic inclusion body formation, and that increased exposure to soluble or microaggregated polyQ-expanded protein is primarily responsible for polyQ expansion pathogenesis.

A purely in vitro system for ubiquitin-dependent proteasome degradation of a polyQ-expanded protein would go a long way toward resolving the issue of proteasome impairment in PolyQ Expansion Disease. To this end, a novel method for purification of human proteasomes was developed. Because this achievement in itself represented a unique opportunity to investigate the composition of the proteasome holocomplex, mass spectrometry analysis of the purified fractions was performed. These data confirmed the presence of most proteasome subunits and indicated that several proteins with a confirmed role in the UPS, and several others with no previously reported UPS activity, could physically interact with the proteasome. Unfortunately, these purified proteasomes could not be reacted with appropriately-targeted polyQ-expanded proteins in vitro because purification of polyubiquitinated AR proved extremely difficult. It is likely that the combined activity of the proteasome and intracellular DUBs makes polyubiquitinated proteins extremely transient species in the cell.

Finally, structural investigation of the AR TAD was performed. It has been reported that the polyQ tract is likely a random coil in its native form, but that a transition to a β-sheet structure is the molecular event responsible for polyQ expansion toxicity. A bacterial expression system for the production and purification of AR TAD with various polyQ tract lengths was developed. An interesting finding is the pronounced charge heterogeneity of AR TAD with both native and pathogenic polyQ tract lengths. This property was completely absent in AR TAD proteins lacking a polyQ tract.

Two methods were used to probe the structure of the various polyQ tract length variants: dynamic light scattering (DLS) and circular dichroism (CD). DLS analysis revealed that both the polyQ tract deletion mutant and the wild-type tract variant were monodisperse in aqueous solution. The expansion mutant displayed pronounced polydispersity, indicative of aggregation.

CD analysis of the AR-TAD revealed intriguing structural features. First, Far-UV analysis of secondary structural elements provided evidence of a largely random coil structure with some α-helical elements for all variants. While the CD spectra of the 0Q and 20Q variants were remarkably similar, the expansion mutant showed significant differences in the 205-210 nm range, indicating that β-sheet structure is a unique feature of polyQ-expanded AR-TAD. It is intriguing that the only variant exhibiting β-sheet structure was also found to be polydisperse by DLS. While a causal relationship cannot be concluded from these data, this finding provides experimental evidence for the long-standing hypothesis that a conformational preference of long polyQ tracts for β-sheet structure promotes aggregation.

Finally, Near-UV CD spectra were recorded for all polyQ tract variants. The deletion mutant displayed asymmetric CD indicative of ordered structure in the local environment of the aromatic amino acids of the TAD. However, a dramatic difference was observed for AR-TAD with a polyQ tract of any length, where a complete absence of tertiary structure is indicated by the flat CD spectra. This finding suggests the polyQ tract is a structure-disrupting sequence. In sum, the CD data suggest that the AR-TAD has secondary structural elements that are not disrupted by an expansion in the polyQ tract. However, there appears to be tertiary structure in the deletion mutant that is disrupted by both wild-type and expanded polyQ tracts. This pattern is highly reminiscent of the molten globule paradigm. It is likely that the wild-type AR-TAD is natively in a molten globule state that is inactive. Upon androgen binding, the characteristic N-C terminal interaction likely induces the AR-TAD to form tertiary structure, activating the AR for transcriptional activity. Future experiments will investigate the effect of the polyQ tract on the conformational shift in the AR-TAD induced by co-expression of the AR-LBD.

List of Publications

Mandrusiak LM, Beitel LK, Wang X, Scanlon TC, Chevalier-Larsen E, Merry DE, Trifiro M. Transglutaminase potentiates ligand-dependent proteasome dysfunction induced by polyglutamine-expanded androgen receptor. Hum Mol Genet 12: 1497-1506, 2003.

Beitel LK, Scanlon T, Gottlieb B, Trifiro M. Progress in spinobulbar muscular atrophy research: Insights into neuronal dysfunction caused by the polyglutamine-expanded androgen receptor. Neurotoxicity Res 7: 219-230, 2005.

Abstracts

Mandrusiak LM, Beitel LK, Elhaji YA, Scanlon TC, Alvarado C, Gottlieb B, Trifiro MA. The human androgen receptor is a substrate for transglutaminase: possible pathogenic implications for Spinobulbar Muscular Atrophy. 84th Annual Meeting, The Endocrine Society, San Francisco, June 18-22, 2002 (Abst #P3-157).

Mandrusiak LM, Beitel LK, Elhaji YA, Scanlon TC, Alvarado C, Gottlieb B, Trifiro MA. Transglutaminase cross-linking of androgen receptor could inhibit the proteasome: possible pathogenic implications for spinobulbar muscular atrophy. American Society of Human Genetics Annual Meeting, Baltimore, MD, Oct 15-19, 2002 (Abst. #1974W).

Gottlieb B, Wu JH, Scanlon TC, Ghali S, Beitel LK, Trifiro MA. A unique structure-function analysis of a mutation in the androgen receptor gene that causes complete androgen insensitivity. American Society of Human Genetics Annual Meeting, Baltimore, MD, 2002 (Abst #2186W).

Mandrusiak LM, Beitel LK, Scanlon TC, Alvarado C, Trifiro MA. Transglutaminase cross-linking of the androgen receptor could inhibit the proteasome: pathogenic implications for spinobulbar muscular atrophy. The Endocrine Society, Philadelphia, PA, June 19-22, 2003 (Abst. #P1-92).

Scanlon TC, Beitel LK, Trifiro MA. Nuclear localization of polyglutamine-expanded androgen receptor is not required for transglutaminase-potentiated proteasome dysfunction. American Society of Human Genetics Annual Meeting, Los Angeles, CA, November 4-8, 2003 (Abst #2460/S).

Scanlon TC, Trifiro M. Somatic instability of the human androgen receptor CAG tract in microdissected prostate cancer tissue. Montreal Centre for Experimental Therapeutics in Cancer, 2nd Annual Meeting, Montreal, QC, May 13-14, 2004.

Scanlon TC, Trifiro MA. Proteasomics: Mass spectrometry based identification of proteasome-interacting proteins by direct analysis of large protein complexes (DALPC). The 20th Symposium of the Protein Society, San Diego, CA, August 5-9, 2006, Oral Presentation, Young Protein Scientist Talk, Abst. #248.

Program Nr: 1974 from 2002 ASHG Annual Meeting

Transglutaminase cross-linking of androgen receptor could inhibit the proteasome: possible pathogenic implications for spinobulbar muscular atrophy. L.M. Mandrusiak1, 2, L.K. Beitel1, Y.A. El-haji1, 2, T.C. Scanlon1,2, C. Alvarado1, B. Gottlieb1, 4, M.A. Trifiro1, 2, 3. 1) Lady Davis Institute for Medical Research, Sir Mortimer B. Davis-Jewish General Hospital, Montreal, Canada; 2) Department of Human Genetics, McGill University, Montreal, Canada; 3) Department of Medicine, McGill University, Montreal, Canada; 4) Department of Biology, John Abbott College, Montreal, Canada.

Several different proteins bearing long polyglutamine (polyGln) tracts are known to cause neurodegenerative diseases by mechanisms that may involve a toxic gain of function. All these diseases are characterized by intracellular insoluble protein aggregates containing the Gln-expanded protein. The aggregates are ubiquitinated, but resistant to degradation by the proteasome. Expansions of the polyGln tract of the human androgen receptor (AR) cause spinobulbar muscular atrophy (SBMA). One established property of Gln residues is their ability to act as an amine acceptor in a transglutaminase-catalyzed reaction, resulting in a proteolytically resistant glutamyl-lysine cross-link. Transglutaminase (TG) is a calcium-dependent enzyme that naturally occurs in vivo. We have found that bacterially expressed AR is a substrate of guinea pig liver tissue TG in vitro. Both GST-AR fusion proteins and thrombin-cleaved (to remove the GST) proteins were shown to react with TG. Western blots of the proteins following incubation with TG demonstrate that several different epitopes of the AR (1C2, polyGln tract; 441, amino acids 301-321; PG21, amino acids 1-20) appear to be lost. We propose that this is due to TG cross-linking of the AR, which interferes with antibody recognition. Interestingly, both intermolecular and intramolecular bonds appear to be formed. When ARs with polyGln tracts of varying length were analyzed, it was found that expanded (50 Gln) AR has a greater propensity to form intermolecular bonds than the normal length protein. TG cross-linked, intermolecularly bonded AR is hypothesized to decrease proteasome function by clogging the proteasome pore. The inhibited ubiquitin proteasome pathway could contribute to the selective cell death seen in the SBMA phenotype.

Program Nr: 2186 from 2002 ASHG Annual Meeting

A unique structure-function analysis of a mutation in the androgen receptor gene that causes complete androgen insensitivity. B. Gottlieb1,5, J.H. Wu3,5, T.C. Scanlon1,2, S. Ghali1,2, L.K. Beitel1, M.A. Trifiro1,4. 1) Dept Cell Genetics, Lady Davis Inst Medical Res, Montreal, PQ, Canada; 2) Department of Human Genetics, McGill University, Montreal; 3) Department of Oncology, McGill University, Montreal; 4) Department of Medicine, McGill University, Montreal; 5) Center for Translational Research in Cancer, McGill University, Montreal.

An androgen receptor (AR) gene mutation (R774C) in the ligand binding domain (LBD) of the AR results in complete androgen insensitivity due to the receptor's inability to bind ligand. This is typical of many AR LBD mutations that are either not in the putative ligand binding pocket (LBP), or do not result in gross structural LBD alterations. To understand the possible mechanism of action of such mutations, the structure/function relationship of R744C has been analyzed by performing a molecular dynamic simulation of 1.4 ns using the Generalized Born model 2 as implemented in the Amber 7 package. The average structure of the mutant showed that the mutation had local structure distortion in the LBD. Part of AR helix 5 rewinds into a loop, which in turn causes movement of loops 759-772 and 682-695. The movement of loop 682-695 in particular, results in a change in shape of the LBP, and movement of the bound ligand within the pocket. This result is significant , as this is the first case of modeling showing an effect on ligand binding, of a mutation not within the LBP, that has a subtle effect on LBD structure. To confirm the validity of this observation, photoaffinity cross-linking experiments are underway. The normal and mutant AR LBDs were cloned into a GST bacterial expression system. Following protein expression, the cell cultures were incubated with the synthetic ligand [3H] R1881, the cells exposed to UV radiation and the GST-fusion proteins purified. The cross-linked fusion proteins were then cleaved with trypsin and Asp-N, run on a SDS-PAGE gel, which was exposed to film. Any resulting differences in R1881 cross-linking between normal and mutant receptors would validate the modeling technique and allow for analysis of many more non-LBP LBD mutations, not just in AR, but also in other members of the super-family of steroid receptors.


Program Nr: 2460 from 2003 ASHG Annual Meeting

Nuclear localization of polyglutamine-expanded Androgen Receptor is not required for Transglutaminase-potentiated proteasome dysfunction. T.C. Scanlon1,2, L.K. Beitel1,3, B. Gottlieb1, M.A. Trifiro1,2,3. 1) Lady Davis Institute, Montreal, Quebec, H3T 1E2, Canada; 2) Department of Human Genetics, McGill University, Montreal, Quebec H3A 1B1, Canada; 3) Department of Medicine, McGill University, Montreal, Quebec H3A 1B1, Canada.

The polyglutamine (polyQ) expansion in the Androgen Receptor (AR) has been shown to be the molecular insult responsible for Spinal and Bulbar Muscular Atrophy (SBMA). PolyQ-expanded tracts could act as good glutamyl donors in reactions catalyzed by Transglutaminase (TG). TG-mediated isopeptide bonds are proteolytically resistant, and may thus cause malfunction of the ubiquitin-proteasome protein degradation pathway. To test this hypothesis, we have utilized HEK 293 cells stably transfected with a GFPu plasmid: Green Fluorescent Protein fused to an amino acid sequence known to target a protein to the proteasome. Proteasomal malfunction results in accumulation of GFP, and can be monitored by fluorescence detection. We have previously shown that proteasomal dysfunction caused by transient co-transfection of polyQ-expanded AR and TG is androgen-dependent. It has been hypothesized that androgen-induced nuclear localization of the AR is responsible for the observed androgen-dependence of SBMA animal models. Transfection studies are underway to test this hypothesis, employing ARs in which the nuclear localization sequence (NLS) has been deleted by removing amino acids 628-640. These constructs do not enter the nucleus upon androgen introduction. Preliminary results indicate that transfection of polyQ-expanded NLS-deletion mutants, but not of normal tract-length receptors, causes proteasomal dysfunction. In addition, formation of protein aggregates is observed in cells that do not exhibit proteasomal dysfunction. Therefore, nuclear localization is not critical, and androgens may induce a conformational change rendering the polyQ-expanded AR a better substrate for TG isopeptide bond formation, resulting in proteasomal dysfunction. Furthermore, formation of cytoplasmic protein aggregates does not contribute to proteasomal dysfunction.


Proteasomics: Mass spectrometry based identification of proteasome-interacting proteins by direct analysis of large protein complexes (DALPC). Scanlon TC 1,3, Trifiro MA 1,2,3, 1) Lady Davis Institute, Montreal, Quebec, H3T 1E2, Canada; 2) Department of Human Genetics, McGill University, Montreal, Quebec H3A 1B1, Canada; 3) Department of Medicine, McGill University, Montreal, Quebec H3A 1B1, Canada.

The 20th Symposium of the Protein Society, San Diego, CA, August 5-9, 2006, Oral Presentation, Young Protein Scientist Talk, Abst. #248.

The 26S proteasome is the principal non-membrane bound subcellular organelle responsible for cellular protein degradation. Conventional purification of the 26S proteasome involves multiple chromatographic steps and usually takes several days. This procedure involves exposure to high salt concentrations during ion exchange chromatography, and yet eukaryotic proteasomes maintain a stable complex of 34 protein subunits. However, it is well established that the proteasome is a dynamic complex, making numerous transient interactions with proteins that modulate its activity. In an attempt to more completely define the proteasome holocomplex and transiently associated proteins, milder affinity chromatographic methods have been developed by epitope tagging various yeast proteasomal subunit genes. Mass spectrometry is then used to identify Proteasome Interacting Proteins (PIPs) by Direct Analysis of Large Protein Complexes (DALPC). We report here a novel method for affinity purification of human proteasomes by exploiting the high affinity interaction between the UBiquitin Like domain (UBL) of RAD23A and the Ubiquitin Interacting Motif (UIM) of proteasomal subunit PSMD4. By a variety of criteria, we demonstrate purification of functional proteasomes. An advantage of this affinity based purification scheme is the exclusion of the high salt wash steps generally required in classical proteasome purification, allowing the co-purification of transiently interacting PIPs. A proteomics based approach was used to identify PIPs. Briefly, purified proteasome fractions were subjected to tryptic digestion, and mass spectrometric analysis was performed on an LC-QToF (Micromass), a tandem MS-MS instrument that provides peptide masses and sequence tag information. Automated peak identification was performed by submitting peak lists to Mascot software (www.matrixscience.com).
In total, 31 of 34 proteasomal subunits were identified (91%), and 44 unique protein identifications were made of non-proteasomal proteins. It is concluded that UBL affinity purification, combined with DALPC, is a useful tool for identification of PIPs.
