Quick viewing(Text Mode)

The Aryl Hydrocarbon Receptor: Structural Analysis and Activation Mechanisms

The Aryl Hydrocarbon Receptor: Structural Analysis and Activation Mechanisms

The Aryl Hydrocarbon Receptor: Structural Analysis and Activation Mechanisms

This thesis is submitted in fulfilment of the requirements for the degree of Doctor of Philosophy in the School of Molecular and Biomedical Sciences (Biochemistry), The University of Adelaide, Australia

Fiona Whelan, B.Sc. (Hons) 2009 2 Table of Contents THESIS SUMMARY...... 6 DECLARATION...... 7 PUBLICATIONS ARISING FROM THIS THESIS...... 8 ACKNOWLEDGEMENTS...... 10 ABBREVIATIONS ...... 12

CHAPTER 1: INTRODUCTION ...... 17 1.1 BHLH.PAS ...... 17 1.1.1 General background...... 17 1.1.2 bHLH.PAS Class I Proteins...... 18 1.2 THE ARYL HYDROCARBON RECEPTOR...... 19 1.2.1 Domain Structure and Ligand Activation ...... 19 1.2.2 AhR Expression and Developmental Activity ...... 21 1.2.3 Mouse AhR Knockout Phenotype ...... 23 1.2.4 Xenobiotic Toxicity Through Activation of AhR Target ...... 25 1.2.5 Endogenous Activation of AhR...... 29 1.2.6 Epidermal Effects of TCDD Toxicity ...... 30 1.3 SUMMARY AND AIMS ...... 33

CHAPTER 2: MATERIALS AND METHODS...... 35 2.1 MATERIALS ...... 35 2.1.1 Chemicals, reagents and kits...... 35 2.1.2 Plasmids...... 37 2.1.3 Oligonucleotides and primers...... 39 2.1.4 Antibodies ...... 41 2.1.5 Cell Lines ...... 42 2.1.6 Bacterial Strains ...... 42 2.2 METHODS ...... 42 2.2.1 Solutions...... 42 2.2.2 Bacterial Co-Expression and Purification of bHLH.PAS A Dimers .....44 2.2.3 Purification of Oligonucleotides for Crystallization...... 48 2.2.4 Crystallization of bHLH.PAS A/XRE Complexes...... 49 2.2.5 AFM Analysis...... 51 2.2.6 SAXS Analysis...... 52 2.2.7 Cell Culture ...... 52 2.2.8 Treatment and Denaturing Ni-IMAC of Full-Length HisMyc mAhR..54

CHAPTER 3: PURIFICATION AND CRYSTALLIZATION OF TRANSCRIPTION FACTOR AHR/ARNT COMPLEXED WITH A XENOBIOTIC RESPONSE ELEMENT...... 56 3.1 INTRODUCTION: BHLH TRANSCRIPTION FACTOR FAMILY AND SUBGROUPS BHLH.LZ, BHLH.O AND BHLH.PAS...... 56 3.1.1 bHLH Transcription Factors...... 56 3.1.2 Per-ARNT-Sim (PAS) Domains:...... 59 3 3.1.3 Dimer Interactions Mediated by PAS Domains ...... 59 3.1.4 Dimeric bHLH.PAS Transcription Factors...... 61 3.2 RESULTS ...... 64 3.2.1 Recombinant expression, purification and solubilisation of hAhR/ARNT...... 64 3.2.2 Crystallization and Partial Proteolysis of bHLH.PAS A Heterodimer hAhR/ARNT...... 71 3.3 SUMMARY AND DISCUSSION...... 79

CHAPTER 4: ATOMIC FORCE MICROSCOPY AND SMALL ANGLE X- RAY SCATTERING ANALYSIS OF AHR/ARNT BHLH.PAS A HETERODIMER ...... 81 4.1 INTRODUCTION ...... 81 4.1.1 The Mechanics of DNA Packaging and Transcriptional Regulation.81 4.1.2 Transcription Factors as Effectors of Localized Chromatin Remodelling...... 83 4.1.3 The AhR/ARNT Heterodimer Affects Chromatin Structures in the Enhancer of Cyp1a1...... 85 4.1.4 Atomic Force Microscopy of Proteins and DNA ...... 88 4.1.5 Small Angle X-Ray Scattering Analysis of Biological Molecules...... 89 4.1.6 Summary and Aims ...... 92 4.2 RESULTS ...... 93 4.2.1 Atomic Force Microscopy of AhR284/H6ARNT362/Cyp1a1 Enhancer Fragment Complexes...... 93 4.2.2 SAXS Analysis of AhR284/H6ARNT362  XRE Complexes...... 95 4.2.3 Hypothetical bHLH.PAS A Domain Topology of AhR/ARNT Heterodimer ...... 97 4.3 DISCUSSION...... 97

CHAPTER 5: RATIONAL MUTAGENESIS OF MAHR LIGAND BINDING DOMAIN TO DISCOVER MUTANTS WITH ALTERED INDUCIBILITY...... 104 5.1 INTRODUCTION ...... 104 5.1.1 PAS domains as environmental sensors through and ligand binding ...... 104 5.1.2 AhR Ligands; Synthetic and Natural Compounds...... 105 5.1.3 PAS domain flexibility correlates with ligand binding mechanisms ...... 110 5.1.4 LBD Homology Models and Mutagenesis Studies...... 112 5.1.5 Summary and Aims ...... 115

5.2 RESULTS...... 116 5.2.1 Homology based targeted mutagenesis of the Ligand Binding Domain of Mouse AhR...... 116 5.2.2 Histidine 285 in the LBD of mAhR is critical for suspension culture activity, but is not required for atypical AhR ligand YH439 inducibility ...... 118 5.2.3 YH439, unlike prototypical PAH ligands (TCDD, 3MC and B[a]P),has a novel binding mode which tolerates mutation of Histidine 285 to Tyrosine...... 119 4 5.2.4 Mutation of mAhR Histidine 285 to Tyrosine abolishes endogenous activation paradigms...... 120 5.3 CONCLUSIONS ...... 120

CHAPTER 6: IDENTIFICATION OF POST-TRANSLATIONAL MODIFICATION OF THE MOUSE ARYL HYDROCARBON RECEPTOR ...... 123 6.1 INTRODUCTION...... 123 6.1.1 PTMs induce conformational change and form docking sites for :Protein Interactions ...... 123 6.1.2 Post-Translational Modification of Transcription Factor p53...... 126 6.1.3 AhR Regulation by Post-Translational Modification...... 129 6.2 RESULTS ...... 132 6.2.1 Disorder as a Predictor of Modification and Protein:Protein Interaction Sites ...... 132 6.2.2 Purification and analysis of Post-translational modification of exogenously expressed mouse AhR from Control, YH439 and Suspension Treated Cells ...... 133 6.3 DISCUSSION...... 138

CHAPTER 7: FINAL DISCUSSION...... 145

CHAPTER 8: REFERENCES ...... 147

5 Thesis Summary The Aryl-hydrocarbon Receptor (AhR) is a basic Helix-Loop-Helix Per-ARNT-Sim (bHLH.PAS) transcription factor (TF) which binds partner protein Aryl hydrocarbon Receptor Nuclear Translocator (ARNT), in order to activate target genes in response to environmental or endogenous stimuli. The PAS region of these TFs consists of two adjacent repeats of the PAS domain, where the PAS repeat defines dimerization specificity and also serves as a primary sensor in exogenous ligand activation of AhR. Active AhR/ARNT heterodimer binds specific DNA sequences, termed Xenobiotic Response Elements (XRE). The molecular detail of interactions that dictate dimerization and DNA binding specificity are unknown for this TF family. In addition, active AhR has recently been shown to function as a recognition component of an E3 . Reports that AhR null mice have poor fertility and defects in liver vasculature are indicative of the potential for a number of endogenous roles. Research into activation of AhR has highlighted that post-translational modifications may affect function by regulating subcellular localization.

The complex regulatory outcomes of AhR expression and activation require a number of approaches to elucidate mechanistic information. In this thesis, a structural investigation of heterodimerization and DNA binding has been used to propose a molecular mechanism for target recognition and activation following XRE binding. Crystallographic approaches have yielded crystals of bHLH.PAS-A regions of AhR/ARNT heterodimer bound to DNA. Atomic force microscopy and small angle x-ray scattering analyses have illustrated an XRE binding mechanism whereby the DNA is bent, and the PAS- A region of the dimer flattens around the DNA. A targeted mutagenesis screen of the AhR ligand binding domain (LBD) was performed to investigate polycyclic aromatic hydrocarbon (PAH) and atypical ligand binding specificity. In parallel, mutant AhR proteins were assessed for inducibility by non- exogenous ligand modes of activation, including cell suspension and application of shear stressed serum. This process identified an LBD mutant selectively activated by novel ligand YH439, and completely inactive following PAH, cell suspension and shear stressed serum treatments, inferring the potential for differential ligand binding pocket access by YH439. Finally, given the complex output following expression and activation of AhR, regulation by post-translational modification was investigated as a potential means of subtle regulation of signalling fate. A thorough analysis of untreated AhR has revealed a concert of modifications occurring on functionally relevant regions of the protein that are implicated in regulating subcellular localization, protein:protein interactions, and potentially, protein stability. Preliminary analyses of YH439 and cell suspension treated AhR has additionally indicated the possibility of activation state specific modification patterns. In summary, this thesis describes: Novel approaches to structural characterisation of a bHLH.PAS protein dimer bound to DNA; atypical ligand binding to a novel site of AhR; and an analysis of a proposed AhR PTM code.

6

Declaration

This work contains no material which has been accepted for the award of any degree or diploma in any university or other tertiary institution and, to the best of my knowledge and belief, contains no material previously published or written by another person, except where due reference has been made in the text.

I give consent to this copy of my thesis, when deposited in the University library, being available for loan and photocopying.

Fiona Whelan

7 Publications Arising From This Thesis

Whelan F., Hao N., Chapman-Smith A. and Whitelaw M.L. (2009) Atypical Aryl hydrocarbon receptor ligand YH439 has a novel binding mode with interesting implications for endogenous activation. Manuscript in preparation.

Chapman-Smith A., Whelan F., Mechler A., Ng S.M., Shearwin K., Martin L.L., and Whitelaw M.L. (2009) PAS domains mediate protein-DNA interactions and alter DNA conformation. Journal of Biological Chemistry. Manuscript in preparation.

Dave K.A., Furness S.G.B., Goswami H., Whelan F., Bindloss C., Chapman-Smith A., Whitelaw M.L. and Gorman J.J. (2009) Comprehensive phosphoproteomic analysis reveals that the latent Dioxin Receptor is highly post-translationally modified. Molecular and Cellular Proteomics. Manuscript submitted.

Dave K.A., Whelan F., Bindloss C., Furness S.G., Chapman-Smith A., Whitelaw M.L. and Gorman J.J. (2008) Sulfonation and phosphorylation of regions of the dioxin receptor susceptible to methionine modifications. Molecular and Cellular Proteomics. Electronic publication December, 2008.

Presentations

53rd Annual Conference of the Australian Society for Biochemistry and Molecular Biology (Combio) Presentation: “PAS repeats of the Dioxin Receptor: Coconspirators in Detection and Response” F. Whelan, C. Bindloss, K. Dave, A. Mechler, S. Ng, L. Martin, J. Gorman, M.L. Whitelaw and A. Chapman-Smith. September 2008

Poster Prize “Ligand Activation, Post-Translational Modification and DNA Binding: Understanding Differential Gene Activation by the Dioxin Receptor.” F. Whelan, S. Berry, M.C.J. Wilce, M.L. Whitelaw and A. Chapman-Smith 33rd Lorne Protein Structure and Function Conference February 2008

8 Poster Prize “A Novel Developmental Switch for the Dioxin Receptor?” F. Whelan, S. Furness, K. Dave, J. Gorman, A. Chapman-Smith and M. Whitelaw 28th Lorne Genome Conference February 2007

Poster Prize “The role of the PAS domain in DNA Binding by Dioxin Receptor and ARNT: Biochemical and Structural Analysis” F. Whelan, J.A. Wilce, M. Wilce, M. Whitelaw and A. Chapman-Smith School of Molecular and Biomedical Sciences School Research Symposium, 2005

9 Acknowledgements

I’ve seen a lot of PhD students come and go over the years, and I’ve enough experience of acknowledgements to know it’s the only bit anyone really wants to read. They’re hoping primarily that they get a mention, and, if they do get a mention, that they don’t have to go through too much waffle, too many quotes, and too many nauseating clichés. So my acknowledgements are a bit of a stream of consciousness, which will attempt to avoid the aforementioned acknowledgement horrors. I should start at the beginning, with Velta, whose understanding and confidence instilled bravery in a terrified first year. Then onto Lynn and Tony – they have to be thanked together for patience and kindness in teaching us little second year students. Separately, Prof. Lynn, you’ve been such a fantastic matriarch, role model and friend over the years… and years of study. John Wallace hosted me as a summer scholarship student and honours student and, in collaboration with Steve and Briony, taught me a lot about the masochistic side of protein chemistry. There I met Bec, who introduced me to the darker side of biochemistry to be found on the 3rd floor, and teased me mercilessly for about the last 10 years. Luckily, the Wallace group also introduced me to a great protein chemist, whose rigorous and encouraging teaching have taught me a lot about mentorship and science. So, thanks to Anne for encouragement all the way from a summer scholarship when I was a little undergrad, through tons of advice during my ‘fd’ and finally with bashing the fi’sis into shape at the end. Anne, it’s had it’s highs and of course its aggregation, but most of all it’s been FUN! Murray has been Murray for as long as I’ve known him, and deserves a special mention for urging me on to utilise more techniques than anyone thought humanly possible to squeeze into a PhD. You’re still the boss! To the lab, it’s morphed over the years, and I’ve managed to pick up lifelong (confidently) friends along the way. From the days when I shared a bay with the famous Dan Peet (as an aside, I used to think Danpeet was one name because, for some reason, most people referred to him with both names, just for clarity one assumes), who was always ready with a joke, or at least ready to listen to my awful jokes and just about always groaned… to the big split where all my Hiffy buddies left me (well, moved across the corridor)… I think I pined for Cameron Bracken and our ‘Power’base of two for at least the next two years… Choppy saw me through the worst of the protein expressions Cam! Sarah Linke also took off (was it something I said?) and took all her awful music with her. I’ve been studying alongside Slinky since first year uni, and as a peer, there couldn’t have been anyone more capable of making me laugh, comforting me and pushing me all at once – I’ll always cherish you mate. Anthony Fedele – what can you say really, except… ehhhhh, we had a very good race. Always good to catch up with you Amf, and James in cool stance corner – see you back there in a few years I’m sure. Then the Whitelaws started disappearing too – my gorgeouses Susi and Jo – I miss you both so much! Hopefully we can catch up one of these days over a few bottles of red, a couple of games of cards, and that all night dancing we were famous for. I guess the waffle is

10 hard to avoid, so I’ll press on. Alix joined a short time after I started and is the hardest working biochemist I know. Always an inspiration and ready with answers just when you need em, and a sympathetic ear… oh, and my first and possibly only opera experience, thanks Alix (and Pels)! Much as I’d like to include everyone, the list is getting ridiculous, so to all the relative newbies – thanks for trying your best to keep me young, you failed miserably, but bless your hearts, you all tried (Anne, Colleen, Dave, Andrew, Sam, Sarah W, Rach, Adrienne and Dale). I haven’t known you a long time, but I’m glad we got to know each other Edwina – it’s been a wonderful experience to watch you become a gorgeous young mum. Rock on physics man. Scott, it was a great to see you learn about crystallography with the same energy and enthusiasm that I have for the practice (alchemy) – the memory of those synchrotron trips still brings a tear to my eye! Thanks to Matthew Wilce, for teaching me some of what you need to know to crystallize stuff! Thanks also to Nathan Cowieson for all the help with SAXS, and for being the most fun collaborator! Thanks most recently to Keith Shearwin for keeping me gainfully employed and engaged in the work that I so enjoy – and to the lab for being so welcoming and helpful – Ian, Julian and my dear friend Alex. Thanks to all the service staff over the years who’ve made EVERYTHING simpler – from the store and administration (Shelley, Soki, Serge, Jan and Genny), to the central services guys (Judy, Ros, Shirley, Adrianna and John Mac) – your patience and kindness over the years has been truly remarkable and more help than you’ll ever know.

To all my patient friends who put up with every late arrival, and invitation refusal – thanks especially to lovely Suzy who helped me through some of the worst mid PhD blues; and for introducing me to a heap of new mates – foxy Jade, Mark, Andrew and gorgeous Kate. Also to the old friends; Alana Banana, John, Cec, Sam, Hannah, Christine, Karen and Aurelia. All the martinis, food, wine and great company kept me cheery through most of the study, so thankyou all! Thanks also to Les, for welcoming me into your life and honouring me with inclusion as a member of the Porter family – your generosity and warmth continue to enrich my life. I can’t really tell you how much my Mum and Dad have encouraged and supported my study over all these years. When I was a little secretary on K.I., then repeating a bit of high school, all the way through undergrad – they never doubted me for a second (well, not that they told me anyway) and provided all the emotional and financial support that kept me going, and of course, countless hot dinners waiting for me in the middle of the night. It’s your PhD too I reckon! Thanks also to my sister Siobhan for all the late night pickups in the city after missing the last bus at the end of a long purification, and all the little kindnesses that cheered up a pitiful biochemist whose protein just kept aggregating! Finally a huge thankyou to Nate – for keeping my spirits up through the worst of the process, supporting my academic obsessions, embracing all my eccentricities, making me laugh, and sharing any little successes along the way. Suffice to say, (cliché warning) I think you saved me from myself.

11 Abbreviations

˚C - Degrees Celsius 3D – Three Dimensional 3MC – 3- 3M4NF – 3’-methoxy-4’-nitroflavone 53BP1 – p53 binding protein 1 ACH – Active Chromatin Hub Adr - Adricamycin AEC – Anion Exchange Chromatography AFM – Atomic Force Microscopy AhR – Aryl hydrocarbon Receptor AhRR – AhR Repressor AP1 – Activating Protein 1 APBS – Adaptive Poisson-Boltzmann Server APS - Ammonium Persulphate AR – Androgen Receptor ARD1 – N-Acetyltransferase ARNT – Aryl hydrocarbon Receptor Nuclear Translocator ATP - Adenosine Triphosphate B[a]P – Benzo[a]Pyrene Bcl-2 – B-cell CLL/Lymphoma 2 bHLH – basic Helix-Loop-Helix Domain bHLH.O – bHLH.Orange Domain BMAL – Brain Muscle ARNT like BNF – Beta-Naphthoflavone bp – BPDE – B[a]P diol epoxide Brg-1 – Brahma/SWI2-related gene product 1 BSA - Bovine Serum Albumin C – Carbenicillin CCD – Charged Coupled Device cdk – cyclin dependent kinase cDNA - Coding Deoxyribonucleic Acid ChIP – Chromatin Immunoprecipitation CME – Central Midline Element CNS – Central Nervous System CRM-1 – Region Maintenance 1 CTCF – CCCTC-binding factor CV-1 – African green monkey kidney cells Cyp – Cytochrome P450 DHT – Dihydrotestosterone DIC – Differential Interference Contrast DMax – Maximum Dimension DMBA – 9,10-dimethylbenz[a]anthracene DMEM - Dulbecco’s Modified Eagle’s Medium DMSO - Dimethylsulphoxide DNA – Deoxyribonucleic Acid DNMT – DNA Methyltransferases dNTP - Deoxynucleotide Triphosphate dpi – Dots per square inch ds – Double Stranded DTT – Dithiothreitol DV – Ductus Venosus E# - Embryonic Day # 12 EAE – Experimental Autoimmune Encephalitis E-cad – E-Cadherin ED – Essential Dynamics EDTA - Ethylene Diamine Tetra-acetic Acid EGF-R – Epidermal Growth Factor Receptor EMT – Epithelial Mesenchymal Transition EMSA – Electrophoretic Mobility Shift Assay EPO - Erythropoietin ER – Estrogen Receptor EROD – Ethoxyresorufin-O-deethylase EtBr - Ethidium Bromide FAD – Flavine Adenine Dinucleotide FAM – carboxyfluorescein FCS – Foetal Calf Serum FICZ – 6-formylindolo(3,2b)carbazole FITC – Fluorescein isothiocyanate FMN – Flavine mononucleotide FOXO1 – Forkhead Box O-1 FPLC – Fast Protein Liquid Chromatography FRET – Fluorescence Resonance Energy Transfer G1 – Gap Phase 1 GFP – Green Fluorescent Protein gsc - goosecoid HaCaT – Human Keratinocyte Cell line HAH – Halogenated Aromatic Hydrocarbon HAS – HIF ancillary sequence HAT – Histone Acetyl HDAC – Histone Deacetylase HEK293T – Human Embryonic Kidney Transformed 293T Cells HEPES - 4-(2-hydroxyethyl)-1-piperazineethanesulfonic acid HERG – Human Ether-a-go-go HES-1 – Hairy and Enhancer of Split Homolog 1 HIF – Hypoxia Inducible Factor HNF4 – Hepatic Nuclear Factor Receptor 4 H-NOXA – Heme-Nitric Oxide/Oxygen binding domain hOGG1 – human 8-Oxoguanine DNA Glycosylase 1 hPASK – Human PAS Kinase HPLC – High Performance Liquid Chromatography HRE – Hypoxia Response Element HRP – Horseradish Peroxidase HSE – Heat Shock Element HSF2 – Heat shock transcription factor 2 90 Hz – Hertz ICZ – Indolo[3,2-b]carbazole IDA – Iminodiacetic Acid IDPs – Intrinsically disordered proteins IGEPAL - IGEPAL CA-360 IGFBP-1 – Insulin-like Growth Factor Binding Protein 1 IgG – Immunoglobulin IL – Interleukin ITE – 2-(1H-indole-3’-carbonyl)thiazole-4-carboxylic acid methyl ester IMAC – Immobilized Metal Affinity Chromatography iNOS – Inducible Nitric-oxide Synthetase IPTG – Isopropyl-beta-D-thiogalactopyranoside K - Kanamycin 13 KCs – Keratinocytes KD – Dissociation Constant KDa – Kilodaltons KinA – Histidine Protein Kinase K/O - Knockout L - Litre LB – Luria Broth LBD – Ligand Binding Domain LCR – Locus Control Region LDL – Low Density Lipoprotein LIMK1 – LIM (Lin-11, Isl-1 and Mec-3) Kinase 1 LM – Left Median LOV – Light Oxygen or Voltage Domain LPS – Lipopolysaccharide LSD1 – demethylase 1 LXA4 – A4 LZ – Leucine Zipper Domain M – Molar mA – Milliamperes MALDI – Matrix Assisted Laser Desorption/Ionization µg – Microgram mg – Milligram min – Minute µL – Microlitre mL – Millilitre ML – Mother Liquor mM – Millimolar MMP – Matrix Metalloprotease MOF – Males absent on first MPa – Megapascals MQ – MilliQ Water MS – Mass Spectrometry MT – Melting Temperature MW – Molecular Weight MWCO – Molecular Weight Cut-Off MybBP2a – Myb Binding Protein 2a NES – Nuclear Export Sequence NFB – nuclear factor kappa-light-chain-enhancer of activated B cells ng - nanogram Ni - Nickel ng - Nanograms NLS – Nuclear Localization Sequence nm – nanometres nM - nanomolar NMR – Nuclear Magnetic Resonance NMT2 – N-myristoyltransferase-2 NP40 – Tergitol NPAS – Neuronal PAS domain Protein NPC – Nuclear Pore Complex NQO-1 – NAD(P)H:quinone 1 NXF – Neuronal X Factor OD600 – Optical Density 600nm ODDD – Oxygen Dependent Degradation Domain O/N – Overnight PAGE - Polyacrylamide Gel Electrophoresis PAH – Polycyclic Aromatic Hydrocarbon 14 PAI-2 – Plasminogen Activator Inhibitor-2 PAS – Per-ARNT-Sim Homology Domain PBS – Phosphate Buffered Saline PBS/T – Phosphate Buffered Saline with 1% Tween-20 PCAF – p300/CBP Associated Factor PCB – Polychlorinated p/CIP2 – protein kinase PAM C-terminal Interacting Protein 2 PCR – Polymerase Chain Reaction PCDD – Polychlorinated Dibenzo[p]dioxins PDBID – Identification PEG – Polyethylene Glycol pg – picogram PG - Phyre – Protein Homology/Analogy Recognition Engine PIC – Polymerase Preinitiation Complex PKA – Protein Kinase A pM – picomolar PMA – phorbol-12-myristate-13-acetate PMSF - Phenylmethyl Sulphonyl Fluoride Pol - DNA polymerase kappa PONDR – Predictor of Natural Disordered Regions P(r) – Distance Distribution Function PRH – Proline Rich Homeodomain PRMT – Protein arginine Methyltransferase PSGL-1 – P-selectin Glycoprotein Ligand 1 PTM – Post Translational Modification puro - Puromycin PYP – Photoactive Yellow Protein q-PCR – quantitative PCR Q-rich – Glutamine Rich RA – Retinoic Acid RB – Retinoblastoma Protein RD – Regulatory Domain Rg – Radius of Gyration RIP140 – Receptor Interacting Protein 40 RL-TK – Renilla Luciferase Thymidine Kinase RM – Right Median rpm - revolutions per minute ROR2 – Human Orphan Receptor Tyrosine Kinase 2 RT – Room Temperature (~21°C) SANS – Small Angle Neutron Scattering SAXS – Small Angle X-ray Scattering SBE – Smad Binding Element SC – Stratum Corneum SDS - sodium dodecyl sulphate S.E. – Standard Error sec - seconds SEC – Size Exclusion Chromatography SET – Su(var)3-9, Enhancer-of-Zeste, Trithorax sGC – Soluble Guanylyl Cyclase SILAC – Stable Isotope Labelling of Amino Acids in Cell Culture Sim – Single Minded Sima – Similar siRNA – Small interfering RNA Smyd2 – SET and MYND domain-2 S-phase – Synthesis Phase 15 SRC1 – Steroid Receptor Coactivator 1 ss – Single Stranded STAS – Sulphate Transporter and AntiSigma Factor antagonist domain STAT1 – Signal Transducer and Activator of Transcription 1 Su(HW) – suppressor of Hairy Wing TAD – Transactivation Domain TAF10 – TBP Associated Factor 10 TBP – TATA-binding protein TCDD – 2,3,7,8-Tetrachlorodibenzo-p-dioxin T-cell - Thymocyte TE - Tris.HCl/EDTA TEA – Triethylamine TEAAc – Triethylammonium Acetate TEMED - N,N,N1,N1-tetramethyl-ethylenediamine TEV – Tobacco Etch Virus TF – Transcription Factor TGF - Transforming Growth Factor Beta TGN – Trans-Golgi Network TH17 – Thymocyte Helper Cell 17 Tip60 – HIV Tat interacting protein of 60kDa TOF – Time of Flight TPM – Tethered Particle Motion TPST – Tyrosylprotein TR – Thyroid Hormone Receptor TREG – Regulatory Thymocyte Cell Trh – Trachealess Tris.HCl - Tris(hydroxymethyl)aminomethane Trityl - Triphenylmethane Trx – Thioredoxin TSA – Trichostatin A Tween-20 - Polyoxyethylene-sorbitan Monolaurate USF – Upstream Stimulatory Factor UV – Ultraviolet Light VEGF – Vascular Endothelial Growth Factor VHL – Von Hippel Lindau Protein VP16 – Viral Protein 16 VVD – Vivid blue-light photosensor WCE - whole cell extract wt – Wild Type X1X1 – Duplexed XRE XAP2 – hepatitis B virus X associated protein 2 X-gal – 5-bromo-4-chloro-3-indolyl-b-D-galactopyranoside XRE – Xenobiotic Response Element

16

Introduction Chapter 1: Introduction 1.1 bHLH.PAS Proteins

1.1.1 General background

Basic Helix-Loop-Helix Per-ARNT-Sim (bHLH.PAS) proteins form a large family of transcription factors, the majority of which show embryonic lethality when deleted in Drosophila and mice (reviewed in 1). These factors are involved in regulating diverse processes such as cell-fate determination (e.g. Drosophila Single-minded (Sim)), xenobiotic metabolism (Aryl hydrocarbon Receptor (AhR)) and the cellular response to hypoxia (Hypoxia Inducible Factor (HIF1 and HIF2)) (reviewed in 2). The bHLH.PAS family of transcription factors form functional heterodimers by combining a specific receptor or developmental regulator with a general partner factor to bind asymmetrical E-box DNA response elements. The general domain topology includes an N-terminal basic helix-loop-helix (bHLH), neighbouring a PAS repeat (PAS A and B) with a variable C-terminus (Fig. 1.1).

The bHLH domain puts this family into the company of a larger group of proteins that dimerize to bind DNA, where the basic region makes direct contact with the DNA and the HLH forms the minimal dimer interface (continued in Chapter 3). Dimerization is further specified and strengthened by formation of an additional dimer interface, consisting of the PAS repeat, which acts to restrict pairing within the family and prevents dimer formation with other members of the bHLH superfamily (3). The PAS domain typically features low sequence similarity, but high predicted secondary structural homology and conserved amino acids at sites critical for the PAS fold. This PAS fold has been defined by structural analysis of proteins Photoactive Yellow Protein (PYP) (4), Human Ether-a-go-go (HERG) (5), FixL (6), and Light Oxygen or Voltage domain (LOV) (7) (Fig. 1.2 A). An alignment of these sequences reveals the limited sequence identity and specific conserved (green) or similar residues (yellow) (Fig. 1.2 B) (8). Recent work by Chapman-Smith et al. has revealed that PAS A is required for strong DNA binding and physically contacts the bHLH region and DNA, a unique capacity for the additional dimer interface within this superfamily (9) (see Chapter 4). PAS B serves as a primary sensor in ligand activation of AhR. The PAS domain has also been implicated in binding of coactivator proteins to form the active, promoter bound transcription complex (10).

The bHLH.PAS family can be grouped by separating the receptor/regulators from their general partner factors. Receptor/regulators fall into Class I, characterised by their inability to form homodimers or heterodimers with other Class I proteins. Class I members identified include the developmental regulators Single Minded proteins (SIM1 and 2); Aryl hydrocarbon Receptor (AhR), Hypoxia Inducible Factors (HIF; 1, 2 and 3), Clock, and four distinct Neuronal PAS domain proteins (NPAS1-4). 17 bHLH PAS A PAS B

AhR 4$' $'

HIF1˞ 2''' 17$' &7$'

HIF2˞ 2''' 17$' &7$'

SIM1 "

SIM2 5' Class I

NPAS4 "

Clock 4ULFK

ARNT $'

Per Class II BMAL1

Figure 1.1 Primary structures of bHLH.PAS proteins with representative secondary structures of homologous proteins to illustrate functional units investigated. Primary structure conservation among proteins of this family illustrates the domain topology required to form a heterodimeric DNA binding transcriptional regulator and additionally serves to illustrate the utility of this fold for a range of transcriptional responses. Shown above the domain schematic are secondary structures including a bHLH monomer (Max, cyan (282), the PAS A domain of Period (Per, lilac) (174) and PAS B domain (HIF2˞, blue) (176). The primary structures are schematically represented as bHLH, trapezoid; PAS A, large oval; PAS B, circle; and C- terminal regulatory units as rectangles. In the dimer form, the basic region binds DNA, the HLH and PAS A make up the partner selective dimerization interface, PAS B in the case of AhR can directly bind to activating ligands to induce dimer formation, while the C-terminal regions of this family regulate transcription via oxygen dependent degradation domains (ODDD) transactivation domains (AD, TAD), and Q-rich regions (QAD, Q-rich) and repressive domains (RD). Class I factors can only heterodimerise with Class II factors, but cannot homodimerise, whereas Class II factors can homodimerise. Indicated schematically are roles in the cell, where bHLH.PAS factors can respond to cellular stress (red cross/circle) or regulate dark/light circadian entrainment (clock face). AhR = Aryl hydrocarbon receptor, HIF = Hypoxia Inducible Factor, SIM = Single Minded, NPAS = Neuronal PAS, ARNT = AhR Nuclear Translocator, BMAL = Brain and Muscle ARNT Like, bHLH = basic Helix-Loop-Helix, PAS = Per-ARNT-SIM homology domain. (A)

PYP FixL

HERG LOV2

(B) ˟A ˟B ˞A ˞B ˞’A PYP 27 AFGAIQLDG---DGNILQYNAAEGDITGRDPKQVIGKNFF--KD HERG 26 SRKFIIANARVENCAVIYCNDGFCELCGYSRAEVMQRPCT---- FixL 153 PDAMIVIDG---HGIIQLFSTAAERLFGWSELEAIGQN--VN-- LOV2 929 -KSFVITDPRLPDNPIIFASDRFLELTEYTREEVLGNN--CR--

˞’A ˞C ˟C PYP 66 VAPC--TDSPEFY------GKFKE-GVASGNLNT---- HERG 66 CDFL------HGPCTQRRAAAQIAQALLG------AEER---- FixL 190 ILMPEPDRSR-HD------SYISRY-RTTSD-PHIIGI LOV2 968 FLQG--RG------TDRKAVQLIRDAVKE-Q-----R----DV

˟C ˟D ˟E PYP 91 MFEYTFDY-Q-MTPTKVKVHMKKALS----GDSYWVFVKRV--- HERG 93 KVEIAFYR-KDGSCFLCLVDVVPVKNEDGAVIMFILNFEVVMEK FixL 219 GRIVTGKRRD-GTTFPMHLSIGEMQS--GGEPYFTGFVRDLTEH LOV2 993 TVQVLNYTKG-GRAFWNLFHLQVMRDENGDVQYFIGVQQEM---

Figure 1.2 The PAS domain, structural conservation, with divergent sequence. Reproduced from [8], secondary structures of PAS domains of Photoactive Yellow Protein (PYP) (4), FixL (6), human Ether-a-go-go related gene (HERG) (5) and light oxygen or voltage domain (LOV2) (7) show the conserved structural fold of the PAS domain (A) highlighted in red are homologous structural elements, although PAS domains have divergent sequences (B). In the case of FixL, PYP and LOV2, small molecule cofactors regulate protein function, shown in yellow (A). HERG is a potassium channel which maintains a PAS fold, but does not coordinate a small molecule within the PAS core. Conserved amino acids maintained in similar positions of the PAS fold are highlighted in green. A structure based sequence alignment of these PAS domains additionally shows similar amino acids in yellow, with standard PAS fold nomenclature featuring arrows for -strands and blue lines for -helices. Class I proteins require heterodimerization with a general partner factor from Class II of this family to form a DNA binding complex. Class II proteins include ARNT (Aryl hydrocarbon Receptor Nuclear Translocator), ARNT2, BMAL1 (Brain Muscle ARNT Like-1) and BMAL2 which are able to homodimerize, or heterodimerize with Class I (2). ARNT is the best characterised of this class and is ubiquitously expressed (11). Other Class II family members, ARNT2 (12), BMAL1 (13) and BMAL2 (14) are more restricted in their expression pattern. Class I members listed above are all involved in development, as indicated by effects of gene knockout. Additionally, AhR, HIF1, HIF2, and NPAS4 are known to regulate adaptive physiological processes in response to environmental and endogenous stimuli. bHLH.PAS factors AhR and HIF1 are responsive to cellular stress such as xenobiotic exposure and hypoxia, respectively; while proteins such as Sim, and NXF are known to function during development.

1.1.2 bHLH.PAS Class I Proteins

HIF proteins are regulated by post-translational modification (PTM) under normoxic conditions such that they are effectively silenced. HIF1 is the best characterised HIF protein, and the N-terminal region is known to resemble other Class I proteins, featuring a bHLH domain and PAS repeat. The C-terminus of HIF1diverges from the other proteins and includes sequences targeted for PTM. A Murine knockout of HIF1 results in embryonic lethality as a result of cardiac, erythroid or vasculature deformities (15). Importantly, in addition to its role in development, the hypoxic response is known to influence the pathophysiology of many diseases including heart attack, stroke, numerous cancers (16) and a pregnancy disorder, preeclampsia (17). In normoxia, HIF1 is expressed and rapidly targeted for degradation by hydroxylation of two proline residues in the Oxygen Dependent Degradation Domain (ODDD) (Fig. 1.1). The hydroxylated proline residues are recognised by a , Von Hippel Lindau protein (VHL), targeting HIF to the proteosome. A second hydroxylation of an asparagine residue by Factor Inhibiting HIF (FIH-1) in the C-terminal activation domain prevents interaction with CBP/p300 such that HIF cannot activate hypoxia responsive genes prior to degradation (18). Thus, proline and asparagine hydroxylases are oxygen dependent that act as the oxygen sensors to control HIF activity. Under hypoxic conditions, HIF1 is stable and binds ARNT in order to activate target genes. The Hypoxia Response Element (HRE) is a modified E-box sequence 5’ (A/G)CGTG 3’ (19) and is located in genes coding for factors involved in angiogenesis and metabolic activity, such as erythropoietin (EPO), vascular endothelial growth factor (VEGF) and various glycolytic enzymes (18 and references therein). EPO and VEGF regulate the number of circulating red blood cells and their delivery to tissues via the vasculature. While HIF1acts indirectly to respond to hypoxia, other bHLH.PAS

18 proteins, such as the single minded proteins, Sim1 and Sim2 do not have an identified activation pathway.

The mammalian Sims, in contrast to other bHLH-PAS factors, lack transactivation domains (TAD) and indeed, Sim2 can actively repress transactivation mediated by the ARNT TAD (20). Passive repression by Sim1 is achieved by competing ARNT away from the other regulatory bHLH-PAS factors. The manner in which the Sims are controlled is not yet clear although their activities may simply reflect a spatio-temporal expression pattern. Mouse homologs Sim1 and Sim2 (Fig. 1.1) have been shown to require the PAS domain for formation of the heterodimer with ARNT, and do not form homodimers, and are therefore classified as Class I bHLH.PAS proteins (21). A Murine Sim1 knockout was found to be postnatal lethal, due to an essential role in development of neuroendocrine secreting cells of the hypothalamus (22). Disruption of the Sim2 gene results in lethality due to breathing difficulties in newborn mice (23). Adult mice show expression of Sim2 in muscle, kidney and lung (21, 23), and recent work in our laboratory indicates that Sim2 can also function as an activator to upregulate transcription of a muscle specific target gene by binding a non-canonical E-box sequence, 5’ AACGTG 3’ (24). The theme of Class I bHLH.PAS proteins dimerising with Class II to bind E-box like sequences is reiterated with Clock/BMAL and NPAS4/ARNT2, which regulate circadian rhythm and development of inhibitory synapses, respectively. Post-translational modifications are also thought to play key roles in regulating these complexes (25, 26).

1.2 The Aryl hydrocarbon Receptor

1.2.1 Domain Structure and Ligand Activation

One of the best characterised Class I bHLH.PAS transcription factors is mammalian AhR, which is a ligand activated transcription factor. AhR is best characterised for its capacity to bind polycyclic aromatic hydrocarbon (PAH) type ligands (reviewed in 27). The domain structure of AhR encompasses an N- terminal basic DNA binding region, a helix-loop-helix domain followed by a linker to the first PAS domain, PAS A, another linker to PAS B and a C-terminal Q-rich transactivation domain (Fig. 1.1). AhR resides in the cytoplasm bound by a complex consisting of heat shock protein 90 (Hsp90) (28, 29), hepatitis B virus X associated protein (XAP2) (30, 31) and heat shock protein p23 (32), which act to maintain latency and keep AhR in a conformation capable of binding polycyclic aromatic hydrocarbons such as 2,3,7,8-Tetrachlorodibenzo-p-Dioxin (TCDD), 3MC (3-methylcholanthrene) and B[a]P (Benzo[a]pyrene) (33). The PAS B domain is known to bind these aromatic hydrocarbons (34), and once bound, the chaperone complex is remodelled, exposing an N-terminal nuclear localisation sequence (NLS) (32). A thorough review of the ligand binding domain and activating ligands is included

19 in Chapter 5 of this thesis. Nuclear import machinery transports the chaperone/AhR/ligand complex to the nucleus allowing exchange of chaperones for the AhR partner protein ARNT (Fig. 1.3). DR PAS B deletion mutants show constitutive activity that removes the requirement for a ligand binding event for nuclear transport (35). ARNT is a constitutively nuclear Class II bHLH.PAS protein with a domain structure that closely resembles AhR (Fig. 1.1), having slightly different length linkers between domains and a longer N-terminal extension ahead of the basic region. ARNT heterodimerizes with AhR, requiring at least the HLH and PAS A domains to form the active DNA binding transcription factor complex (36).

The ligand activated AhR/ARNT heterodimer binds E-box like Xenobiotic Response Element (XRE) sequences 5’ TNGCGTG 3’ (37, 38) in enhancers of target genes and associates with coactivators to upregulate target gene transcription (39, 40) (Fig. 1.3). Genes typically responsive to PAH activated AhR include xenobiotic metabolism enzymes such as Cytochrome P450s and Glutathione-S- Transferase that convert aryl hydrocarbons to water soluble forms, enabling excretion, and providing a negative feedback loop for control of AhR activation. These enzymes may also mediate formation of highly carcinogenic by-products during these metabolic pathways (as reviewed in 41, 42). In addition, AhR knockout animals show a range of phenotypic changes including decreased liver size due to failure of foetal vascular structures to remodel (43), and decreased fertility (44), which presumably are due to abrogation of endogenous function, rather than in response to environmental xenobiotic exposure.

Following activation, AhR is targeted for ubiquitination and proteosomal degradation, either within the nucleus or after export to the cytoplasm. AhR may be shuttled from the nucleus by a nuclear export sequence (NES) located in the helix-loop-helix region. Exportin, also known as CRM-1 (chromosome region maintenance 1) interacts with a leucine rich region in the HLH to regulate subcellular localization of AhR (45) (Fig. 1.3). In the absence of an activating ligand, AhR shuttles in and out of the nucleus. AhR nuclear egress in this context is regulated by a second NES located in the PAS A domain (45). Recent evidence suggests there may be quite complex physiological regulation of AhR, with a number of activation modes to control a diverse range of cellular events, implicating context specific, subtle regulation of receptor function. The following sections include a review of the expression pattern of AhR, with a view to inferring spatiotemporal function; AhR knockout mouse phenotypes to illustrate the gross effects of abrogating AhR endogenous activity; TCDD toxicity effects and associated microarray studies to indicate possible regulatory systems affected by AhR activity; endogenous mechanisms for activation of AhR that have been identified; and a broad discussion of AhR affects on epidermal homeostasis and stress responses. This review highlights the diverse regulatory functions of AhR and describes mechanisms including DNA binding, protein:protein interactions and a role as an adaptor component of a Ubiquitin ligase complex.

20 CYTOPLASM 0 SP9 P2 H XA 3 B TAD AhR p2 A H L H b Dioxin e Ub m Ub so a e UbUb t o Ub r B P A A B N L S

NPC NPC

NUCLEUS

B

ARNT

B B A A

Transcriptional Activation

B ? B

B B

Figure 1.3 Schematic representation of AhR ligand activation. Latent AhR is maintained in the cytoplasm in complex with chaperones XAP2, p23 and HSP-90. Ligand binding, in this case, dioxin, results in rearrangement of the chaperone/AhR complex to expose a nuclear localization signal (NLS). Chaperone/AhR complex is translocated through the nuclear pore complex (NPC) to the nucleus where chaperones are exchanged for partner factor ARNT to form the mature DNA binding heterodimeric transcriptional activator which targets response elements in the promoters on target genes to upregulate expression, e.g. Cytochrome P4501A1. Once activated, AhR becomes labile and may be degraded in the nucleus, or exported for polyubiquitination and proteasomal degradation. 1.2.2 AhR Expression and Developmental Activity

AhR expression during development has been characterised using a range of techniques of varying sensitivity. Potentially, roles for the AhR during development can be estimated by the pattern of expression at specific time points and spatially restricted positions within the developing embryo. Additional insight may also be gained by understanding where expression patterns of ARNT and the AhR overlap spatially and temporally. However, the absence of ARNT at sites of AhR expression does not preclude the possibility that AhR might act independently of ARNT in roles which do not involve DNA binding. Looking to DNA binding and gene regulation by the dimer, Cyp1a1 promoter activation has been traditionally used as a reporter of AhR constitutive activity or inducibility. Cyp1a1 is the best characterised AhR target gene and encodes Cytochrome P4501A1, a key for oxidation and metabolism of aromatic xenobiotics. As will be discussed, the Cyp1a1 activation paradigm may be deficient due to tissue context specific activities and possible alternative roles for the AhR which do not result in Cyp1a1 gene activation.

Feto-Maternal Interactions The earliest developmental stages, prior to implantation of the embryo, are affected by TCDD exposure, resulting in decreased embryo implantation. Studies of AhR localization in the early phase of the placenta and during the peri-implantation period in the mouse uterus have implicated its involvement in implantation processes. Support for this endogenous AhR effect is established by the fact that coadministration of AhR antagonist -naphthoflavone and TCDD is able to reverse dioxin toxicity affects on implantation (46), potentially blocking TCDD toxicity to maintain endogenous function. However, some of these effects may be attributed to the anti-estrogenic effects of ligand activated AhR (47) (48). At embryonic day 4 (E4) AhR mRNA is expressed in the mouse uterus in the blood vessels of the stroma. By E5, AhR protein is localized to the cytoplasm of uterine luminal epithelial cells surrounding the site of blastocyst implantation (46). Maternally deposited AhR and ARNT mRNA has been detected at the one cell stage and later, in the blastocyst, which establishes the potential for functional AhR during important stages of feto-maternal interactions. However, TCDD induced CYP1A1 expression has not been identified prior to formation of the blastocyst (49). By E7, cells in the wall of blood vessels in the outer primary decidual zone show strong immunoreactivity for AhR (46). A promoter fragment from the AhR target gene Cyp1a1 shows endogenous activity at E7, with marked alterations in promoter activation during organogenesis between E8 and E9. In the same study, a low-affinity receptor variant was not found to alter Cyp1a1 promoter activity at this stage of development (50), potentially indicating Cyp1a1 expression independent of AhR activation. E9 and 10 immunocytochemistry localises AhR to the wall of dilated blood vessels in the labyrinthine and peripheral regions of the placenta (46).

21 Embryonic Development Within the embryo, expression of AhR and ARNT is first detected in the neuroepithelium and heart (11); and the neural crest cell derived branchial arches 1 and 2 at E9.5-10 (51). Neural crest cells of the trigeminal placode show expression by E11 (11, 51). In agreement with these observations, Cyp1a1 promoter activity is detected in neuroepithelial tissue and the heart, although this was also detected in advance of AhR expression at E8.5 and in the mesonephros from E10, where AhR expression has not been identified (50). Advancing organogenesis displays maintenance of AhR expression in a range of tissues derived from these primitive cell populations. At these stages the ubiquitous expression patterns in the adult arise from differentiation of primordial tissue expression beginning around E10. The first organs to display AhR expression include the heart, liver and brain, which initiates at E9.5 and is maintained until E12 in the brain, E15 in the heart and continues in the liver (11, 51). Advancing development around E12-13 shows an expansion of tissues where AhR is detected. Anteriorly, AhR is detected in the developing eye, pituitary, mandibular meckels cartilage, tongue, palatal shelf, toothbud, facial bone primordia, the nasal septum and thymus (11, 51). Interestingly, TCDD treatment between E8-10 has been shown to induce cleft palate formation by interfering with palatal shelf fusion (52). Primitive gut structures including the gut, lungs, kidney, bladder muscle, urogenital sinus and genital tubercle also show AhR expression (11). Sites of bone formation including the ribs and vertebrae express AhR at E14-16 (51). E15-16 finds AhR expression extending from the lens of the eye and into the retina, while the trigeminal ganglion, olfactory bulbs, adrenal glands, nasal epithelium and muscle initiate transient expression at this stage (51).

-Galactosidase Reporter Transgenic Mouse Generation of mice carrying a transgene containing an 8.5kb fragment of the rat CYP1A1 promoter upstream of a -galactosidase (-gal) reporter gene has allowed analysis of Cyp1a1 promoter activity during development (50). A summary of promoter activity over embryonic development days E8.5-17, with the exception of the mesonephros and trigeminal ganglion, generally includes tissues where both ARNT and AhR are expressed at the earliest timepoints, including the heart, liver and brain (50). Additionally, the authors observed that inducible activity from this reporter by the injection of 3MC, or TCDD, was limited to those tissues where constitutive activation was evident, although induction was not seen in all of these of tissues (50). In another study, a construct containing a 2 XRE-TATA promoter sequence from the Cyp1a1 gene was positioned upstream of a -gal reporter gene and transgenic lines were established. To investigate AhR activity, TCDD was administered to an E13 mouse, and 5-bromo- 4-chloro-3-indolyl-b-D-galactopyranoside (X-gal) staining performed on E14 (53). Staining was observed in one line in the cranio-facial region, in limbs, shoulders, hard and soft palates and in the paws, while in the untreated and treated animals staining was evident in the genital tubercle, although this staining was enhanced in treated animals (53). A second study performed examined the effect of polychlorinated 22 (PCBs) and TCDD, administered on E14 and stained on E15. PCB126 and TCDD were found to induce staining in the ear, paws and genital tubercle, but did not overlap completely with the previous data (54).

Summary From these studies, we can establish that: Cyp1a1 promoter induction does not report all tissues where AhR and ARNT are coexpressed; Cyp1a1 promoter is active in tissues where AhR has not been detected; and that TCDD is unable to induce higher expression in a range of tissues which show constitutive Cyp1a1 induction (50, 51, 53). These observations infer firstly, that during development, Cyp1a1 induction may not be contingent on AhR activation; secondly, that AhR may have roles that do not include Cyp1a1 induction; thirdly, that TCDD may not be sufficient for tissue context specific activation of the AhR; and lastly, that constitutively active AhR may be precluded from influence by xenobiotic ligands at defined timepoints during development. Knockout studies have been utilised to infer developmental roles for AhR, where AhR null phenotypes hint at the concert of complex molecular mechanisms influenced by the presence of AhR.

1.2.3 Mouse AhR Knockout Phenotype

In spite of the ubiquitous AhR expression pattern throughout embryonic development, evidence for developmental endogenous activation of AhR by ligand binding has not been conclusively identified. However, a number of endogenous ligands have been isolated (reviewed in Chapter 5). Analysis of AhR knockout animals has inferred a number of roles for AhR in a range of systems, exclusive of xenobiotic metabolism. Three AhR knockout lines have been generated, two by excision of the first exon (43, 55) which encodes most of the basic region; the other by removal of the second exon (56), encoding the bHLH. There are a number of variable phenotypic changes between the different lines, including altered lymphoid cell numbers and age dependent formation of liver lesions (reviewed in 57). The most penetrant effects include resistance to TCDD induced toxicity, decreased growth rate, markedly reduced liver size, hepatic portal fibrosis and reduced fertility. Interestingly, B[a]P induced cancer formation was found to be abrogated in AhR knockout mice, substantiating the role for AhR in the carcinogenic activity of B[a]P (58). AhR knockout effects predominantly appear to relate to roles in the cardiovascular system, regulation of cell cycle, and influences on reproductive efficiency (reviewed in 59).

Vascular Defects The most obvious phenotypic change is a pronounced reduction in liver size, from 25-50% smaller than +/+ and +/- AhR mice (43, 56, 60) (Fig. 1.4 A). Subsequently, colloidal carbon perfusion of AhR -/- mouse liver illustrated the persistence of embryonic hepatic vascular structures that prevent peripheral 23 (A)

(B)

AhR+/+ AhR-/- (C)

(D)

Figure 1.4 AhR knockout mice show vascularisation defects, figure reproduced from Harstad et al. 2006 (60) and Lahvis et al. 2005 (61). AhR knockout mice show a range of phenotypic defects, the most penetrant being decreased liver size. At embryonic day 18.5 (E18.5), AhR +/- mice show normal liver size, while AhR -/- mice display necrotic lesions, as indicated by arrows, in the left (L) and left median (LM) lobes (A) (60). Hepatic perfusion of FITC-dextran preceding necrosis at E15.5 shows that peripheral perfusion was compromised (boxed areas, L and LM) (B). Additional changes in vascular morphology have been visualised with the same perfusion technique, with altered vessel structures evident in the eye of 8 to 12 week old mice (C) (61). Kidneys of adult mice were perfused with resin and corrosion casts generated, illustrating changes in kidney vasculature (D) (61). blood flow, resulting in decreased liver size (44). Specifically, a portal shunt termed the ductus venosus (DV) directs blood flow away from the liver sinusoids, closing within 48h of birth. In the absence of AhR, this DV remodelling does not occur (61). This phenomenon has been further investigated by perfusion of AhR +/- and -/- ED15.5 livers with fluorescein isothiocyanate (FITC) labelled dextran, which showed a distinct lack of fluorescent signal through the left and left median lobes prior to formation of necrotic hepatic tissue (Fig. 1.4 B) (60). Recent experiments using AhR conditional knockouts indicate that DV remodelling is dependent on hematopoietic/endothelial cell types, but independent of AhR hepatocytic cells (62). Other vascular defects have been identified by comparison of AhR knockout and wt mice, including altered limbal vessel structures in the eye (Fig. 1.4 C); and altered kidney vasculature (Fig. 1.4 D) (44). Further evidence of vascular function is suggested by the observation that AhR can be activated by exposure of endothelial cells to direct shear stress, or to serum that has been treated with shear stress, thought to mimic the effect of blood flow through the vascular system (59).

Fibrosis Liver morphogenic changes of AhR knockout animals are not limited to reduced size and persistent DV. Observation of 1 week old -/- AhR mouse liver shows that hepatocytes have undergone microvesicular fatty metamorphosis. Additionally, extramedullary hematopoieses was shown to be more prolonged than in wt littermates, although both defects were predominantly resolved by 3 weeks of age (56). A persistent defect was identified in two knockout lines, where hypercellularity, thickening and fibrosis were seen in the liver portal regions at 2 weeks of age (43, 56). Further research focused on potential mechanisms for AhR deficiency dependent induction of fibrosis. Secretion of active Transforming growth factor- (TGF-) has been found to increase in AhR deficient cell culture and AhR -/- mouse liver (63). TGF- has been identified as a key regulator of development of fibrosis (64), indicating a role for AhR in negative regulation of this peptide, potentially by negatively regulating expression of target gene latent TGF binding protein-1 (65). Functional links between TGF- and AhR have also been detected in smooth muscle cells of AhR -/- mouse aorta (66), and in a TCDD treated mouse palatal fusion rescue experiment (67), indicating a potential endogenous function for AhR in regulation of cell growth.

Fertility AhR knockout females have been shown to have decreased fertility, reflected in reduced conception, litter size and pup survival (68, 69). Decreased formation of mature follicles during ovulation in the knockout female has been identified in a number of studies (69, 70). An absence of AhR dependent expression of aromatase enzyme Cyp19, responsible for production of estradiol in the developing follicle, was subsequently identified as causal in reducing follicle maturation in the AhR knockout mouse (69). Aside from this role in follicle maturation, AhR activation following fertilization has been found to 24 result in upregulation of Cyp1a1 expression up to 100-fold, although a functional outcome of this event has not been identified (71). As described in Section 1.2.2, following fertilization, AhR is upregulated in the vasculature of the uterus, and the tissues joining the embryo and the placenta (46). In the AhR knockout animal, the placental labyrinth has been observed to enlarge, resulting in increased exposure of the pup to a range of teratogenic compounds consumed by the dam (72). After birth, decreased pup survival (68) is theorized to result from impaired mammary gland development in the AhR -/- dam (73). In summary, AhR deficiency decreases fertility by reducing mature follicle numbers, altering feto- maternal interactions and impairing milk production.

1.2.4 Xenobiotic Toxicity Through Activation of AhR Target Genes

Toxic effects of TCDD exposure have been inferred to result from persistent upregulation of cytochrome P450 enzymes 1a1 and 1a2, but the persistence of toxic endpoints in Cyp1a1 and Cyp1a2 knockout mice provides evidence that this is not the case (74). In order to assess other potential causes of toxicity, a large body of work has been produced analysing the effects of TCDD by microarray on a range of tissues and cell lines. Analysis of AhR knockout mice concludes that most of these changes in are AhR dependent (75) and a broader examination of all the available microarray data leads to the understanding that these effects are highly cell context dependent, as there is limited overlap in specific gene targets that are not involved in xenobiotic metabolism (74-77). However, by grouping affected genes by their biological action, patterns emerge which may be informative for downstream effectors of TCDD toxicity (75). Additionally, attempts have been made to understand, from this vast database, how gene interaction networks weave effects in different cell contexts after TCDD exposure to establish this complex response (78).

Microarray Experiments Several microarray publications specifically address the complexity of the expression changes with a view to separating those that are a primary result of dioxin treatment, from changes compounded by secondary effectors. Array data from Fletcher and coworkers examined an early timepoint, 6 hours into treatment of rat liver (76). Puga took a different approach with the HepG2 cell line by treating with cycloheximide to eliminate the possibility of nascent protein synthesis producing a downstream effector (77). Most recently, Hayes et al. sought to infer primary target activation by establishing which genes were regulated in a similar temporal fashion to prototypical gene targets (74). An examination of these microarrays reveals biological groups which appear consistently: Detoxification and stress response; apoptosis, differentiation and proliferation; steroid, and lipid metabolism; and cytoskeletal, cell adhesion and migration factors (74, 76, 77). Additional fluctuations in ribosomal protein (74) and signal transduction factor expression (77) were also evident. 25

Immune System Further investigations have yielded a number of direct target genes that do not have a detoxification function. Within the differentiation functional group, Adseverin is a calcium dependent actin binding protein that localises proximal to the membrane and acts to sever and cap filamentous actin. This function is known to be important for exocytosis and, additionally, activates signalling pathways to induce differentiation of megakaryoblastic leukaemia cells (79), a mechanism that may help to explain TCDD induced thymic atrophy and skewed maturation of thymocytes (80, 81). Adseverin is induced by TCDD primarily in the thymus in a pattern that mimics Cyp1a1 induction, and this response is not detected in AhR knockout animals. TCDD induced Adseverin expression has also been detected in the adult spleen, fetal liver and fetal thymus (82, 83). Studies from the Esser lab focused on potential mechanisms to explain the altered differentiation of immature thymocytes after exposure to TCDD. They reasoned that, as thymocyte differentiation was largely controlled by cytokines, and, as many cytokine gene promoters were found to contain potential XREs (84), a thorough analysis of cytokine levels in immature T-cells may reveal AhR regulation of T-cell maturation (85). From this study, several cytokines including Interleukin (IL)-1, IL-2 and TNF- were shown to be upregulated by treatment with TCDD. Further studies focused on three identified XREs in the IL-2 distal promoter region, all of which were bound by AhR/ARNT in a gelshift, while the most proximal 2 response elements were shown to be active in a reporter context. A complete understanding of how IL-2 affects T-cell maturation is currently lacking, but IL-2 is known to be tightly regulated and activated in response to antigenic stimulation (reviewed in 86). Aberrant AhR activation of this gene may impact on IL-2’s various roles in proliferation, differentiation and cell death to result in thymic atrophy.

Subsequent to these microarray studies, further investigation of TCDD induced effects on T-Cell maturation was undertaken. Differentiation of T helper cell-17 (TH17) and regulatory T-cells (TREG) has been identified as a pathway that can potentiate autoimmune disease development. AhR has recently been shown to regulate TH17 and TREG differentiation in a ligand specific manner. Application of TCDD was found to increase TREG cell numbers, with a reciprocal decrease in TH17 cell differentiation, resulting in decreased severity of experimental autoimmune encephalitis (EAE), a mouse model of multiple sclerosis. Application of endogenous ligand, metabolite 2-(1’H-indole-3’-carbonyl)-thiazole-4- carboxilic acid methyl ester (ITE), showed similar effects on T-cell differentiation. However, treatment with a ligand formed after exposure of keratinocytes to ultraviolet (UV) light, FICZ (6- formylindolo(3,2b)carbazole), had the opposite effect, increasing TH17 and decreasing TREG differentiation, resulting in increased severity of EAE (87). The specific mechanism of regulation of this cell differentiation decision was not identified for these cell species.

26 Recently, naive T-cell differentiation to form TH17 cells has been found to be abrogated in AhR knockout animals. TH17 differentiation resulting from application of IL-6 and TGF- proceeded only in AhR +/+ cells and was due to direct functional interference with a repressor of Th17 cell differentiation, Stat1 (Signal transducer and activator of transcription-1). In contrast to the study discussed above, ligand induction of AhR showed limited impact on Stat1 interaction, or cell differentiation effects (88). The observation of a direct interaction between unliganded AhR and Stat1 additionally suggests possible actions of AhR in other immune system functions, such as innate immunity in Stat dependent mechanisms.

Skin Toxicity The observation that TCDD toxicity results in chloracne has resulted in focused research in keratinocytes. Given that growth and differentiation of keratinocytes is partly controlled by cell density, Ikuta investigated the effect of cell density on AhR localization in HaCaT cells and found that sparsely populated cells have a predominantly nuclear AhR localization (89). Subsequent research focused on potential AhR target genes that may be involved in disrupting cell-cell contacts in a more typical physiological state such as development or wound healing. E-cadherin (E-cad) is understood to be downregulated during tissue remodelling, where it is often regulated by the snail/slug superfamily of zinc finger transcription factors. Further investigation identified 5 XREs in the promoter of the Slug gene, where a reporter construct containing this region was readily activated by removal of calcium from culture medium and also by addition of 3MC. AhR/ARNT heterodimer was able to bind XRE5 of this promoter region in a gelshift, and an anti-AhR ChIP from 3MC treated HaCaT cells resulted in amplification of the Slug promoter fragment containing XRE5 (90). From this analysis, it seems possible that the AhR has a role in inducing an Epithelial Mesenchymal Transition (EMT) during tissue remodelling where Slug is upregulated, resulting in disruption of E-cad cell-cell adhesion. A more thorough review of potential roles for AhR in skin homeostasis is continued in Section 1.2.7.

Cell Proliferation A role for AhR in cell proliferation has been shown in 5L rat hepatoma cells where TCDD treatment resulted in delayed cell cycle at G1. In an elegant series of experiments, Kolluri and coworkers showed that p27kip1 expression was induced in an AhR dependent manner and that p27kip1 upregulation was responsible for the delayed exit from G1 phase. This effect was also seen in fetal thymus gland cultures, linking another potential mechanism for thymic involution after TCDD treatment (91). As discussed below, AhR has a role in anti-estrogenic regulation of a range of estrogen-inducible genes. One of these, Hairy and Enhancer of Split homolog-1 (HES-1), has been characterised as a regulator of differentiation and cell cycle progression, and has also been established to be upregulated in TCDD treated human mammary carcinoma T47D cells. Promoter fragments were analysed in this study and 27 AhR was found to bind to an XRE containing fragment in a gelshift experiment. Mutagenesis of this XRE prevented TCDD inducible reporter activation in these cells (92). Investigations into crosstalk between the androgen receptor (AR) and AhR were initiated when it was observed that TCDD inhibits the growth of dihydrotestosterone (DHT) treated LNCaP cells (93). Androgen dependent stimulation of growth has been shown to rely on the phosphorylation state of the retinoblastoma (RB) protein, so this phenomenon was utilised to assess the effect of TCDD on cell cycle progression. RB phosphorylation, and levels of cyclin D1 and p21CIP1 were examined, with the result that TCDD treatment did not change the phosphorylation status of RB, nor did it change levels of cyclin D1, however, p21CIP1 expression was induced. Co-treatment of LNCaPs with TCDD and DHT resulted in inhibition of DHT induced RB phosphorylation. p21CIP1 acts to bind cyclin D1/cyclin dependent kinase 4 (cdk-4) and cyclin E/cdk-2 complexes resulting in inhibition of their kinase activities. Consequently, RB phosphorylation is inhibited and the cell-cycle is halted. Typically, p53 activates p21CIP1, however, in addition to proximal p53 regulatory sequences, the p21CIP1 promoter also has seven XREs. These were probed for AhR activation using a reporter system which showed that overexpressed AhR/ARNT in TCDD treated LnCaPs and MCF-7 cells was able to activate expression from an intronic promoter region including the 7 XREs, although no direct evidence of DNA binding was established in this study (94).

Diabetes Epidemiological studies have shown that TCDD exposure correlates with increased incidence of diabetes, an observation that prompted research into potential gene regulatory effects of the AhR that may impact on insulin resistance. Microarray data generated from TCDD treated HepG2 cells showed levels of insulin-like growth factor binding protein-1 (IGFBP-1) several fold in excess of untreated cells (unpublished data 95). Coupled with this, IGFBP-1 overexpressing transgenic mice have been shown to have defective glucose tolerance and abnormal insulin action, hence Marchand and coworkers hypothesised that AhR activation of IGFBP-1 would provide a mechanism to link the epidemiological correlates. Analysis of TCDD treated HepG2 cells showed robust activation of IGFBP-1 mRNA expression and protein production. IGFBP-1 promoter fragments were assessed for reporter gene activity, with an XRE being identified at -87 to -80 of the proximal promoter region which was active for AhR binding in a gelshift, abrogated by mutagenesis of the consensus XRE (95).

Fertility Investigations into gene targets responsible for apoptotic effects of PAH exposure highlighted roles for the AhR in oocyte maintenance. In addition to links with oocyte maturation described for the knockout animal, decreased fertility as an outcome of PAH exposure has been explained in part by the observation that primordial oocyte numbers are markedly decreased in treated animals (96). Matikainen hypothesised that an AhR activatable regulator of apoptosis would explain this phenomenon, and 28 hence, performed an analysis of promoter sequences from a number of these genes to isolate potential XREs. The Bax promoter was found to contain 2 potential XREs, and was confirmed to be activatable by the addition of 9,10-dimethylbenz[a]anthracene (DMBA). Additionally, mutation of the response elements prevented this induction effect. Intriguingly, mutation of a downstream guanine 5’ GGGCGTGGTG 3’ to adenine 5’ GGGCGTGGTA 3’ converted this response element from a DMBA specific to a TCDD inducible construct, highlighting target gene specificity engendered by ligand/LBD interactions. Bax is a member of the proapoptotic Bcl-2 (B-cell CLL/Lymphoma-2) family and Bax -/- DMBA treated animals were found to maintain non-apoptotic primordial oocyte numbers that were identical to their control littermates, inferring that the oocyte apoptotic phenotype was directly imposed by AhR induction of the Bax promoter (96, 97). These effects, coupled with decreased primordial oocyte apoptosis in AhR -/- animals, suggest a physiological role for the AhR in ovarian germ cell maintenance (98).

Carcinogenesis The AhR ligand and carcinogenic compound B[a]P forms a B[a]P diol epoxide (BPDE) after modification by Cytochrome P4501A1 which can go on to form a DNA adduct, believed to be largely responsible for carcinogenic effects of the parent molecule. Adducts such as these can be removed by DNA repair enzymes, but if they are not repaired, can block DNA replication. DNA polymerase kappa (pol) has recently been identified as an error prone polymerase that can bypass certain DNA adducts, including BPDE adducts to allow S-phase to progress (99). pol is expressed in many tissues, but specific expression of an alternate transcript was detected in mouse testis from days 14-35, correlating with the first wave of spermatogenesis. A search for potential transcription factor binding sites was conducted in the promoter of pol, and identified two consensus XREs. Gelshift analysis of these response elements revealed AhR/ARNT binding to XRE2 and, more weakly to XRE1. AhR dependent expression of pol was induced by 3MC treatment in mouse testis from two transcription start sites, and this was not evident in the AhR K/O mouse. Testis specific expression of a lower MW protein was also shown by Western blot, and this species was preferentially induced after treatment with 3MC (100). These observations led Ogi to conclude that induction of pol by AhR ligands may increase the mutagenic potential associated with formation of BPDE adducts, and secondly, that aberrant activation of error- prone pol expression by xenobiotics such as dioxin may increase the incorporation of mutations, leading to carcinogenic outcomes (100).

1.2.5 Endogenous Activation of AhR

Endogenous activation mechanisms are proposed to include endogenous ligands such as heme metabolites (101), tryptophan metabolites (102, 103), lipoxin A4 (104) and prostaglandin G2 (105) (Fig. 29 5.2). An endogenous activation mechanism for AhR was recently suggested by exposure of rat 5L hepatoma cells to shear stress. Although the precise mechanism is yet to be elucidated, shear stressed serum alone was capable of activating AhR and fractionation of this material identified the inducing agent as oxidized Low Density Lipoproteins (LDLs) (59). In addition to being a DNA binding transcription factor, AhR has been shown to function as a ligand activated, substrate-specific adaptor component of a novel complex Cul4B(AhR) (48), acting to target estrogen receptor alpha (ER) and androgen receptor (AR) for ubiquitination and degradation. Additionally, crosstalk with ER target gene activation shows positive regulation, whereby AhR/ARNT acts as a ligand induced coactivator with ER at estrogen response element containing promoters (106), potentially revealing a mechanism for decreased fertility in AhR -/- animals. Recent research has shown that both the ubiqutin ligase and coactivator functions are induced by constitutively active AhR, lacking the ligand binding domain (107). Post-translational modifications of AhR have also been shown to affect subcellular localization (108), DNA binding (109) and target gene activation (110) and are hypothesised to provide a mechanism for developmental activation of AhR, however, no definitive evidence to link physiological roles or interaction with ligands with a particular modification has been found.

Preliminary interpretation of endogenous activation mechanisms established in vitro implicates an oxidized LDL related mechanism in AhR’s role in vascular remodelling and potential roles for prostaglandin type ligands in induction of AhR as a ubiquitin ligase to fine tune hormone responses. These mechanisms of regulation are discussed in more detail in Chapter 5. AhR can additionally be activated by growing cells in suspension culture or disruption of cell-cell contacts in keratinocytes (108, 111), however the mechanism at play in this activation state is currently unknown. Roles for AhR in maintenance of skin homeostasis have been investigated in some detail due to the potent chloracne effects resulting from dioxin toxicity.

1.2.6 Epidermal Effects of TCDD Toxicity

Several lines of evidence implicate the AhR in maintaining epidermal homeostasis and/or effecting adaptive responses in the skin. Firstly, the most obvious phenotypic changes occurring subsequent to TCDD poisoning are evident in the skin, where the resulting chloracne is proposed to arise from changes to proliferation and differentiation of keratinocytes (112, reviewed in 113). Additionally, animals expressing a constitutively active form of AhR in the skin develop inflamed skin lesions on their backs 3- 5 weeks after birth, indicating that these effects are AhR and/or AhR target gene dependent (114). Secondly, the corollary to this is seen in the AhR null mouse which shows a variety of phenotypes, including a skin phenotype of severe, localized, interfollicular and follicular epidermal hyperplasia, associated with marked mastocytosis in the inflamed skin, although this phenotype was not consistently 30 penetrant with around 53% of animals affected (115). Thirdly, the AhR can be activated at lower cell density and by suspension culture of a human keratinocyte cell line (HaCaT) (89, 116). Hence, a role for the AhR in maintenance of epidermal homeostasis and/or in adaptive responses in the skin has been proposed (115, 116).

Investigations into the effect of TCDD on 3D models of human epidermal keratinocytes (KCs) have highlighted changes in cell differentiation that hint at affected regulatory pathways. A marked increase in the thickness of the stratum corneum (SC) was observed as the most obvious change, associated with parakeratosis and a concomitant reduction in the other cell layers, where the granular layer was completely lost in this process of aberrant differentiation (117, 118). Expression of KC differentiation markers showed involucrin was increased, while keratins 1 and 10 decreased (117). In a second study, a general decrease in non-keratinized tissue was observed to result from TCDD induced accelerated differentiation (118). Caspase-14 is typically found in the granular layer and SC of differentiated keratinocytes and also in epidermal appendages such as hair follicles and sebaceous glands (119). Caspase-14 was absent from TCDD treated skin equivalent keratinocyte cultures, while processing of pro-caspase-14 was also absent. Notably, pro-fillagrin processing to fillagrin, one of the final stages of keratinocyte differentiation, was blocked by TCDD treatment. Subsequent searches for caspase-14 target proteins have identified pro-fillagrin, which is processed by caspase-14 during the normal differentiation program to define the cornified layer at the skin surface (120). It is interesting to note that chloracne formation is characterized by hyperkeratosis of sebaceous gland duct epithelium and follicular keratosis (113), which overlaps well with the normal expression pattern of caspase-14 (120). In summary, the hyperkeratotic effects of TCDD indicate that differentiation has been accelerated, although the specification of these cells is incomplete as indicated by parakeratosis and failure to induce terminal differentiation markers (117).

Given the proposed role for TCDD activated AhR interfering with proliferation and differentiation of keratinocytes, research into a role for AhR in this tissue has focused on regulation of tumor suppressor proteins and cyclin-dependent kinases – the ‘usual suspects’ for maintenance of growth and cell identity. In spite of the limitations of utilising this potent AhR agonist, much work in keratinocytes has focused on the affect of TCDD on keratinocyte differentiation, although whether the use of TCDD in this system forms a useful paradigm for AhR activity in vivo remains to be elucidated.

AhR expression in keratinocytes has been assessed in HaCaTs, where mRNA was shown to increase 8 fold during the differentiation program initiated as basal cells make cell-cell contacts (121). AhR protein levels were found to increase in human neonatal epidermal keratinocytes incubated in high calcium to induce differentiation (122). Addition of 1 M retinoic acid (RA) maintained AhR mRNA at levels similar 31 to the basal amount in proliferating HaCaTs (121), a treatment additionally known to prevent differentiation of keratinocytes (123), subsequently shown to prevent upregulation of a proliferation factor AP-2 (124). RA signalling is known to contribute to control of development and maintenance of skin (as reviewed by 125) through controlling cell growth, differentiation and tissue remodelling by modulation of matrix metalloprotease (MMP) expression (126). TCDD treatment of human keratinocyte line SCC-12F cells has been shown to oppose the action of trans-RA by downregulating binding by RA receptors (127). However, recent research indicates that crosstalk between active RA and AhR signalling pathways results in enhancement of MMP-1 expression (128). Overlap between these pathways implicates the AhR as a factor that may be important for growth, differentiation and tissue remodelling in the skin.

Some molecular events following AhR activation by TCDD have been elucidated, including effects on targets with implications for tissue remodelling, differentiation and senescence. Initial investigations into regulators of proliferation, differentiation and senescence in keratinocytes highlighted the importance of cyclin-dependent kinases as regulators of these processes (reviewed in 129). Neonatal human epidermal keratinocytes were passaged for 8 days in the presence of high calcium to induce differentiation. TCDD was shown to enhance differentiation, as assessed by the rate of upregulation of keratinocyte differentiation markers involucrin and profilaggrin (129). In the final stages of differentiation of keratinocytes, caspases become activated resulting in nuclei loss to form the cornified layer (130). Through the use of propidium iodide staining, it was established that TCDD increased the numbers of cells in the ‘subG1/G0’ phase of the cell cycle, with the conclusion that increased DNA fragmentation due to nuclei loss resulted from TCDD induced acceleration of late differentiation (129). Analysis of cell numbers showed that TCDD increased proliferation of keratinocytes and subsequently identified TCDD dependent inhibition of senescence. RNAse protection and Western blot analyses of tumor suppressors p16INK4a, p53 and p14ARF showed marked differences between control and TCDD treated cells, where p14ARF was reduced in a manner inconsistent with the timing of downregulation of p53 and p16INK4a (129). p16INK4a and p53 are known to be necessary for the onset of senescence (131, 132) and, significantly, were found to be inhibited in treated cells (129) in a manner that requires AhR and promoter hypermethylation (133). Additionally, prevention of senescence was shown to be inhibited by the addition of AhR antagonist 3’-methoxy-4’-nitroflavone (3M4NF) (133). Additional roles for the AhR in adaptive responses in skin have been proposed due to effects on target genes discussed earlier including Slug and Epiregulin.

The corollary to this adaptive response is, of course, the potential for AhR to act as a tumor promoter. Subcutaneous grafts of immortalized AhR null fibroblasts have been shown to have decreased tumorigenicity (134). Evidence in support of this role has been presented to suggest AhR impacts 32 favourably on keratinocyte proliferation and plasticity. Activation of AhR to control adaptive responses by non-xenobiotic cell suspension activation, or by UVB exposure (described in Chapter 5) would seem likely candidate mechanisms for initiating keratinocyte proliferation and differentiation effects of this transcription factor (102, 135).

1.3 Summary and Aims

What is immediately evident from this review of AhR function is that the traditional model of xenobiotic transformation of the receptor to upregulate metabolic enzymes is fundamentally limited. Mechanistic insight for endogenous regulation of AhR cannot be gleaned by simple comparison with functional outcomes of xenobiotic activation of the receptor. The ubiquitous expression pattern of AhR likely infers functions for which gross knockout phenotypes have not been identified. Of clear importance are the roles for AhR in regulation of vascular remodelling of the liver, eye and kidney; prevention of developmental fibrosis by limiting cell growth; and maintenance of fertility, from oocyte maturation to pup rearing.

The complex regulatory outcomes of AhR expression and activation necessitated a number of approaches to elucidate mechanistic detail on AhR DNA binding, ligand activation and regulation by post-translational modification. The first project described in this thesis aimed to generate novel information on the molecular mechanism of AhR/ARNT XRE binding. This project involved bacterial co- expression, purification and crystallization of an AhR/ARNT heterodimer in complex with an XRE. AhR/ARNT DNA bending has been identified in our laboratory by DNase I sensitivity assays (9), hence an in vitro analysis of target enhancer bending was undertaken using atomic force microscopy and small angle x-ray scattering analysis, generating novel structural detail about the mechanism of DNA binding by bHLH.PAS TFs. These experiments were conducted in collaboration with Assoc. Prof. Lisandra Martin and Dr Nathan Cowieson, Monash University, Victoria, Australia.

AhR is responsive to xenobiotic exposure, where PAS B directly binds to polycyclic aromatic hydrocarbon (PAH) compounds such as the prototypical ligand 2,3,7,8-Tetrachlorodibenzo-p-dioxin (TCDD). Additionally, AhR knockout animals show a range of phenotypic effects including decreased liver size and decreased fertility. Hence, it has been theorised that endogenous activation of AhR may involve generation of endogenous ligands, and/or may be mediated by specific post-translational modifications. The second project addressed in this thesis focused on an AhR ligand binding domain (LBD) targeted mutagenesis screen. AhR is known to activate in the absence of exogenous ligand in response to cell suspension. Targeted mutagenesis was performed to isolate mutant forms of AhR that were unable to bind PAH ligands, yet maintained the response to atypical ligand YH439 or cell 33 suspension activation. A point mutant was identified which maintained full activation of a reporter gene by YH439, but abrogated all other activation mechanisms tested. This indicates a novel binding mode for YH439, potentially accessing and/or binding a different region of the LBD, relative to other polycyclic aromatic hydrocarbons.

The third and final aim of this research was to investigate the proposal that AhR endogenous activity may be regulated by post-translational modification. This was investigated by performing large scale denaturing Nickel Immobilized Metal Affinity Chromatography to purify mammalian exogenously expressed full length mouse AhR. Purified material was analysed by tryptic digestion and peptides were assessed for modification using MALDI-MS-MS by collaborators Keyur Dave and Prof. Jeff Gorman, Protein Discovery Centre, University of Queensland, Australia. Comprehensive analysis of untreated mAhR and preliminary analysis of YH439 and cell suspension treated mAhR identified the potential for a post-translational modification code specific to activation state. Positions of post-translational modifications in terms of predicted structure were analysed and interpretations regarding possible biological roles were suggested.

This program of investigation has yielded novel information about how bHLH.PAS heterodimeric transcription factors bind to DNA. LBD mutagenesis has shown new links between PAH and endogenous activation mechanisms and inferred the existence of a novel ligand binding mechanism. Finally, this research has highlighted the potential for PTM regulation of AhR function, which may be essential for dictating functional roles in a range of cell and activation contexts. This comprehensive analysis has additionally illustrated the complexity of molecular regulation of AhR, in part explaining how this renaissance protein is able to maintain definitive functions in a range of cell and temporal contexts.

34

Materials & Methods Chapter 2: Materials and Methods

2.1 Materials

2.1.1 Chemicals, reagents and kits

2.1.1.1 Commercially Available Kits Dual Luciferase™ Reporter Assay System Promega QIAprep spin Mini-prep Kit QIAGEN Qiaquick Gel Extraction Kit QIAGEN Qiaquick PCR Cleanup Kit QIAGEN

2.1.1.2 Chemicals All chemicals and reagents were of analytical grade, or of the highest purity available. The majority of common laboratory chemicals were obtained from SIGMA-ALDRICH. Specialized reagents and their suppliers are listed below.

Acrylamide (acryl/bis 29:1) AccuGel National Diagnostics ATP SIGMA-ALDRICH Agarose, DNA grade SIGMA-ALDRICH Apoprotinin SIGMA-ALDRICH APS SIGMA-ALDRICH Bacto-agar Difco Labs Bacto-tryptone Difco Labs Benchmark pre-stained protein markers Invitrogen™ Bestatin SIGMA-ALDRICH BIGDYE™ Version3 Applied Biosystems BSA SIGMA-ALDRICH Bradford protein assay reagent BIO-RAD Bromophenol blue SIGMA-ALDRICH Carbenicillin SIGMA-ALDRICH Coomassie Brilliant Blue R™ stain SIGMA-ALDRICH DMEM Invitrogen/Gibco BRL™ DMSO SIGMA-ALDRICH 1kb DNA Plus Ladder Life Technologies dNTPs Finnzymes

35 DTT BioVectra/SCIMR EDTA SIGMA-ALDRICH EtBr SIGMA-ALDRICH FCS JRH Biosciences Fugene 6 Roche Guanidine Hydrochloride SIGMA-ALDRICH IGEPAL SIGMA-ALDRICH Imidazole SIGMA-ALDRICH IPTG SIGMA-ALDRICH Kanamycin SIGMA-ALDRICH Leupeptin SIGMA-ALDRICH Mark12 unstained protein markers Invitrogen Ni++-IDA Chelating Sepharose FF GE Healthcare NP-40 SIGMA-ALDRICH Pepstatin SIGMA-ALDRICH Pfu Turbo polymerase buffer, 10X Stratagene® PMSF SIGMA-ALDRICH Puromycin SIGMA-ALDRICH Restriction enzyme reaction buffers New England Biolabs Shrimp Alkaline Phosphatase Buffer, 10x Amersham Pharmacia SDS SIGMA-ALDRICH Skim milk powder Diploma (supermarket) Sodium azide SIGMA-ALDRICH Supersignal WestPico chemiluminescence substrate Pierce T4 DNA ligase buffer, 10x New England Biolabs T4 DNA polymerase buffer, 10x New England Biolabs TEMED SIGMA-ALDRICH Triethylamine (TEA) 99% SIGMA-ALDRICH Trypsin-EDTA Gibco BRL™ Trypsin Sequencing Grade Roche Triton X-100 SIGMA-ALDRICH Tween-20 SIGMA-ALDRICH

36 2.1.1.3 Crystallization Screens and Accessories Screens Anion Qiagen Cation Qiagen Classics Qiagen Cryos Qiagen Hampton I Hampton Research HT Screen Hampton Research Index I and II Hampton Research JCSG Jena Bioscience Natrix Hampton Research PACT Suite Qiagen PEGs Qiagen PEGIon Hampton Research SaltRXI/II Hampton Research Sigma DNA Sigma-Aldrich

Accessories 10 oz 22 mm Siliconized Cover Slide (HR3-233) Hampton Research CrystalClear Sealing Tape (4’’) (HR4-710) Hampton Research 5 L Dialysis Button (50) (HR4-508) Hampton Research IZIT Crystal Indicator (HR4-710) Hampton Research Microbridges (400) (HR3-312) Hampton Research Microdialysis Disk (3500, 50) (HR3-338) Hampton Research

2.1.2 Plasmids

All cloning was performed by standard techniques (described in 136). Inserts made using PCR were generated using the proofreading polymerase Pfu. All restriction enzymes were purchased from New England Biolabs, 240 County Road, Ipswich, MA 01938-2723, USA.

2.1.2.1 Reporter Plasmids X1X1-luc – A plasmid engineered with a tandem repeat of XRE1 from the Cyp1a1 enhancer and a minimal thymidine kinase promoter upstream of a firefly luciferase cDNA (previously described in 137). RL-TK – An internal control reporter containing the thymidine kinase promoter driving constitutive expression of the renilla luciferase cDNA (Promega). 37 2.1.2.2 Expression Plasmids pET16B No Tags hAhR1-287, 1-293 and 1-284 – Plasmids encoding three truncations of hAhR, encompassing the bHLH.PAS A region were constructed in three steps. Three PCR fragments of hAhR were generated from pSport/hAhR620, (generated in the Poellinger laboratory, Karolinska Institute, Stockholm) using the forward primer, 5’ CAT GCC ATG gAC AGC AGC AGC 3’, inserting an NcoI site to facilitate in frame cloning without an N-terminal tag. However, this resulted in mutation of the second codon of the hAhR cDNA (lower case), causing mutation N2D. Initially, hAhR1-287 (N2D) was cloned using this strategy, requiring site-directed mutagenesis to revert the sequence to wt, and remove the NcoI site. Reverse primers were engineered to insert a tandem stop codon, and an XhoI site for directional ligation into pET16b including hAhR287 5’ CCG CTC GAG TTA AAA GAT AAA ATT TTT GGT 3’, hAhR293 5’ CCG CTC GAG TCA TCA TAG TTT GTG TTT GGT TC 3’, and hAhR 284 5’ CCG CTC GAG TCA TCA ATT TTT GGT CCG GAT TTC 3’. PCR fragment NcoI-XhoI encoding hAhR1-287 (N2D) was ligated into NcoI/XhoI digested pET16b. Site-directed mutagenesis using primers hAhRN2Dupper 5’ GA AGG AGA TAT ACC ATG aAC AGC AGC AGC GCC AAC 3’ and hAhRN2Dlower 5’ GTT GGC GCT GCT GCT GTt CAT GGT ATA TCT CCT TC 3’ corrected the codon 2 mutation. hAhR1-293 and hAhR1-284 PCR fragments were digested with NheI and XhoI, utilising a native NheI site at 151bp (aa 51) in the hAhR cDNA. pET16b hAhR 1-287 was digested with NheI and XhoI and ligated with fragments hAhR51-293 and hAhR51-284, generating wt clones pET16b hAhR1- 293 and pET16b hAhR1-284.

pAC28 H6TEV-ARNT1-362: A plasmid encoding the bHLH.PAS A region of hARNT was constructed by Dr Anne Chapman-Smith in two steps. pET32 ARNT1-362 (36) and pAC28 (138) were digested with NcoI and HindIII and the ARNT1-362 fragment was ligated into the pAC28 vector. The resulting untagged pAC28 ARNT1-362 was digested with NcoI. The His TEV tag was generated by annealing oligonucleotides TEVH6Upper and TEVH6lower, leaving NcoI compatible sticky ends. pAC28 ARNT1-

362 was digested using NcoI, and the annealed TEVH6 fragment was ligated, generating pAC28

H6TEV-ARNT1-362. pEF/His-myc-mAhR-full_length – A plasmid encoding full-length mAhR with an N-terminal 6His-Myc tag and 3’ UTR was constructed by Sebastian Furness in two steps. An N-terminal fragment, mAhR 1-83 was cloned into pEF/IRES/puro by generating a PCR fragment coding for amino acids 1-83 from mDR/CMV4 (139) using the forward primer, 5’ CGA CGC GTA TGA GCA GCG GCG CC 3’ and the reverse primer, 5’ GCT CTA GAG ATC AAA GAA GCT CTT GG 3’. This allowed in frame cloning to an N-terminal His-myc tag generated by annealing the oligos: Forward, 5’ T CGA GCG ACC ATG GGG GGT CAC CAT CAC CAT CAC CAT CAC CAT GAT CAT GAA CAA AAG CTT ATT TCT GAA GAA GAT CTT GAT CAT A 3’, and His-myc Reverse, 5’ CGC GTA TGA TCA AGA TCT TCT TCA GAA ATA 38 AGC TTT TGT TCA TGA TCA TGG TGA TGG TGA TGG TGA TGG TGA CCC CCC ATG GTC GC 3’, which was inserted into XhoI/MluI digested pEF/mAhR1-83/IRES/puro. The second step was to insert a BlpI/XbaI fragment of mDR/CMV4 [27] into the BlpI/XbaI sites of pEF/His-myc-mAhR1-83/IRES/puro, to create pEF/His-myc-mAhR-full_length. Mutagenesis including amino acids K245E, H241Y/H285Y, K297N/G298W/Q299N/L300F and R369Q/Y372C/Q377H was performed using standard overlap extension PCR methods with external primers mAhR17N and mAhR580rev and subcloned into pEF/His- myc-mAhRfull/IRES/puro. pEF/blank/IRES/puro (140) was used as a control blank expression vector. A375I was mutated by Sebastian Furness.

2.1.3 Oligonucleotides and primers

2.1.3.1 Oligonucleotides for electrophoretic mobility shift assays: Adenovirus Major Late promoter fwd: [5’ TTTATAGGTGTAGGCCACGTGACCGGGTGTTCCTG 3’] Adenovirus Major Late promoter rev: [3’ CAGGAACACCCGGTCACGTGGCCTACACCTATAAA 5’] XRE 33mer upper: [5’ CGTGCCCCGGAGTTGCGTGAGAAGAGCCTGGAG 3’] XRE 33mer lower: [5’ CTCCAGGCTCTTCTCACGCAACTCCGGGGCACG 3’] XRE 16mer upper: [5’ GAGTTGCGTGAGAAGA 3’] XRE 16mer lower: [5’ TCTTCTCACGCAACTC 3’]

2.1.3.2 Oligonucleotides for Crystallization Screens and SAXS analysis: XRE 33mer upper: as above XRE 33mer lower: as above XRE 16mer upper: [5’ GAGTTGCGTGAGAAGA 3’] XRE 16mer lower: [5’ TCTTCTCACGCAACTC 3’] XRE 30mer upper: [5’ CGGAGTTGCGTGAGAAGAGCCTGGAGGCCC 3’] XRE 30mer lower: [5’ GGGCCTCCAGGCTCTTCTCACGCAACTCCG 3’] XRE 33mer-B upper: [5’ CGGAGTTGCGTGAGAAGAGCCTGGAGGCCCGCG 3’] XRE 33mer-B lower: [5’ CGCGGGCCTCCAGGCTCTTCTCACGCAACTCCG 3’]

2.1.3.3 Primers for cloning hAhR into pET16b and Site-Directed Mutagenesis: hAhRupper: [5’ CATGCCATGGACAGCAGCAGC 3’] hAhR287low: [5’ CCGCTCGAGTTAAAAGATAAAATTTTTGGT 3’] hAhR293low: [5’ CCGCTCGAGTCATCATAGTTTGTGTTTGGTTC 3’] hAhR284low: [5’ CCGCTCGAGTCATCAATTTTTGGTCCGGATTTC 3’] hAhRN2Dupper: [5’ GAAGGAGATATACCATGaACAGCAGCAGCGCCAAC 3’] hAhRN2Dlower: [5’ GTTGGCGCTGCTGCTGTtCATGGTATATCTCCTTC 3’]

39

2.1.3.4 Primers for cloning an H6TEV tag onto hARNT1-362 into pAC28: Ligation of these primers generates a fragment with NcoI overhangs to insert into NcoI digested vector.

TEVH6upper: [5’ CATGGGTGGCTCCCATCACCATCACCATCACGACTACGATATCCCGACCACCG AAAACCTGTACTTCCAGGGCGC 3’]

TEVH6lower: [5’ CATGGCGCCCTGGAAGTACAGGTTTTCGGTGGTCGGGATATCGTAGTCGTGAT GGTGATGGTGATGGGAGCCACC 3’]

2.1.3.5 Primers for generating AFM dsDNA fragment: Cyp1a1fwd: [5’ GGAGAGCTGGCCCTTTAAGA 3’] Cyp1a1rev: [5’ TAAGCCTGCTCCATCCTCTG 3’]

2.1.3.6 Primers for site-directed mutagenesis of mAhR: Mutations include silent restriction site insertions and codon changes (lower case). mAhRA83Dupper: [5’ GCCAAGAGCTTCTTGaTGTTGCATTAAAGTCC 3’] mAhRA83Dlower: [5’ GGACTTTAATGCAACAtCAAAGAAGCTCTTGGC 3’] mAhRH241Yfwd: [5’ GGTTAAAGTATCTTtATGGcCAGAACAAGAAAG 3’] mAhRH241Yrev: [5’ CTTTCTTGTTCTGgCCATaAAGATACTTTAACC 3’] mAhRK245Efwd: [5’ CTcCATGGACAGAACgAGAAAGGGAAGGACGG 3’] mAhRK245Erev: [5’ CCGTCCTTCCCTTTCTcGTTCTGTCCATGgAG 3’] mAhRH285Yfwd: [5’ CTTCAGGACCAAAtACAAGCTgGACTTCAC 3’] mAhRH285Yrev: [5’ GTGAAGTCcAGCTTGTaTTTGGTCCTGAAG 3’] mAhR Triple mutant R369Q, Y372C, Q377H Y372C and Q377H mutated sequentially, hence Y372C mutation incorporated into Q377H primer design (underlined). mAhRR369Qfwd: [5’ GATTTACAGAAATGGccaACCAGATTACATCATC 3’] mAhRR369Qrev: [5’ GATGATGTAATCTGGTtggCCATTTCTGTAAATC 3’] mAhRY372Cfwd: [5’ GGCCAACCAGATtgCATCATCGCgACTCAG 3’] mAhRY372Crev: [5’ CTGAGTcGCGATGATGcaATCTGGTTGGCC 3’] mAhRQ377Hfwd: [5’ GATTGCATCATCGCGACgCAcAGACCACTGACGGAT 3’] mAhRQ377Hrev: [5’ ATCCGTCAGTGGTCTgTGCGTCGCGATGATGCAATC 3’] mAhR NWNF mutant: K297N, G298W, Q299N, L300F mAhR297NWNFfwd: [5’ GGTTGTGATGCCAActGGaActTcATTCTGGGgTATACAG 3’] mAhR297NWNFrev: [5’ CTGTATAcCCCAGAATgAagTtCCagTTGGCATCACAACC 3’] 40

2.1.3.7 Sequencing primers: S-tag: [5’ GAACGCCAGCACATGGAC 3’] T7promoter: [5’ TAATACGACTCACTATAGGG 3’] T7terminator: [5’ GCTAGTTATTGCTCAGCGGT 3’] mAhR17N: [5’ CCGGTGCAGAAAACAGTAAAG 3’] mAhR58N: [5’CAAGATGTTATTAATAAG 3’] mAhR68N: [5’ GTTCTTAGGCTCAGCGTC 3’] mAhR130N: [5’ GATGCCTTCCTCTTCTATG 3’] mAhR224N: [5’ AATTCATCTGGTTTTCTG 3’] mAhR424N: [5’ CGCACCAAAAGCAACACTAGC 3’] mAhR443C: [5’ ATCCTTACTTGGGGTTGAC 3’] mAhR580rev: [5’ CCTGCACGTAGGTCAGGATT 3’] mAhR628N: [5’ CCCAGCAGCAGCTGTGTC 3’] mAhR637C: [5’ CACCATCTGACACAGCTG 3’]

2.1.4 Antibodies

9E10 antibody: Mouse monoclonal IgG 1 raised against the myc epitope. Hybridoma cells from the Department of Molecular and Biomedical Sciences, The University of Adelaide, SA, 5005 were grown to confluency and left to lyse. The media and cell suspension was subjected to centrifugation at 13 000 rpm, 4 C; and stored at 4°C with the addition of 0.02% sodium azide and used at 1:10 in PBS/T with 1% skim milk powder incubated overnight at 4°C with shaking.

Anti-His antibody: Mouse monoclonal directed against a His tag peptide (27E8). Catalogue Number: 2366. Purchased from Cell Signalling Technology, 3 Trask Lane, Danvers, MA 01923. Used at 1:1000 in PBS/T, incubated for 1 hour at RT, with shaking.

Anti--tubulin antibody: Rat monoclonal HRP conjugated IgG directed against alpha tubulin. Catalogue Number MCA78G. Purchased from AbD Serotec, the research antibody division of MorphoSys, Lena- Christ-Straße 48, 82152 Planegg, Germany. Used at 1:4000 in PBS/T with 1% skim milk powder, incubated for 2 hours at RT, with shaking.

Mouse Secondary: Goat polyclonal HRP conjugated IgG directed against mouse IgG. Catalogue Number 31430. Purchased from Pierce, PO Box 117, Rockford, Illinois 61105, USA. Used at 1:10 000 in PBS/T with 1% skim milk powder, incubated for 1 hour at RT, with shaking. 41 Rat Secondary: Polyclonal rabbit HRP conjugated IgG directed against rat IgG. Catalogue Number P0450. Purchased from DAKO Cytomation, Produktionsvej 42, DK-2600 Glostrup, Denmark. Used at 1:5000 in PBS/T with 2% skim milk powder, incubated for 1 hour at RT, with shaking.

2.1.5 Cell Lines

293T – Primary human embryonic kidney cells transformed with sheared adenovirus type 5 DNA and transfected with SV-40 large T-antigen. G418 resistant. Received from Susan Cory at the Walter and Eliza Hall Institute, originally from John Heath, Oxford University, UK.

2.1.6 Bacterial Strains

BL21 (DE3) – The E. coli BL21(DE3) strain was used for the bacterial protein expression [F-, ompT, hsdSB,(rB-, mB-)gal, dcm, (DE3)]. DH5 - The E. coli DH5 strain was used for DNA production and subcloning [80lacZM15, recA1,

- + endA1, gyrA96, thi-1, hsdR17(r , m ), supE44, relA1, deoR, (lacZYA-arg F)U169].

2.2 Methods

2.2.1 Solutions

2.2.1.1 General 100 x Protease Inhibitor Cocktail: 100 g/mL aprotinin/400 g/mL bestatin/500 g/mL Leupeptin /100 g/mL pepstatin/150 mM EDTA, aliquoted, stored -80°C. Coomassie Stain: 0.3% Coomassie (Brilliant Blue R, Sigma) in 10% Acetic acid and 50% ethanol Destain: 10% Acetic Acid, 10% Ethanol EMSA Load Buffer: 10 mM HEPES pH 7.9, 10% Glycerol, 150 mM NaCl, 0.1 mM EDTA, 0.5 mM DTT, 3 mM Magnesium Chloride, 4 mM Spermidine, stored at -20°C. EMSA Running Buffer: 50 mM Tris.HCl pH 7.9, 40 mM Glycine, 0.1 mM EDTA, where glycine was dissolved prior to addition of Tris.HCl to aid in solubilization of Tris.HCl. PBS: 10 mM phosphate buffer pH 7, 120 mM sodium chloride, 2.7 mM potassium chloride

42 RIPA Buffer: 150 mM NaCl, 1% NP40, 0.5% sodium deoxycholate, 0.1% SDS, 100 μM PMSF, 1 mM DTT and 1x protease inhibitor cocktail (Sigma) in 50 mM Tris.HCl pH 8 SDS-PAGE Load Buffer (4x): 40% Glycerol, 50 mM Tris.HCl pH 6.8, 5% SDS SDS-PAGE Separation Buffer: 375 mM Tris.HCl pH 8.8; 0.1% SDS SDS-PAGE Stacking Buffer: 125 mM Tris.HCl pH 6.8; 0.1%SDS SDS-PAGE Glycine Running Buffer: 25 mM Tris.HCl pH 8.3, 250 mM Glycine, 0.1%SDS Tris-Tricine Running Buffer: 100 mM Tricine, 100 mM Tris, 0.1%SDS Transfer Buffer: 50 mM Tris.HCl, 40 mM Glycine, 1% SDS, 20% Ethanol TE: 10 mM Tris.HCl pH 7.5, 1 mM EDTA

2.2.1.2 Affinity Purification Ni-IMAC Buffers Native Ni-IMAC Lysis Buffer: 10 mM Imidazole, 20 mM Sodium Phosphate Buffer pH 7.2, 0.5 M NaCl, 0.1 mM DTT, 0.1% Triton X 100, 0.1 mM PMSF Ni Buffer A: Sodium Phosphate Buffer pH 7.2, 0.5 M NaCl Ni Buffer B: 500 mM Imidazole, 20 mM Sodium Phosphate Buffer pH 7.2, 0.5 M NaCl.

Denaturing Ni-IMAC Guanidine Extract Buffer: 6 M Guanidine, 100 mM Sodium Phosphate buffer pH 7, 150 mM Sodium Chloride, 20 mM Imidazole, 0.1% NP-40, 20 mM DTT, 50 mM Sodium Fluoride. Guanidine Wash Buffer: 6 M Guanidine, 100 mM Sodium Phosphate buffer pH 7, 150 mM Sodium Chloride, 50 mM Imidazole, 20 mM DTT, 50 mM Sodium Fluoride SB3-10 Wash Buffer: 1% SB3-10, 100 mM Sodium Phosphate buffer pH 7, 5 mM Imidazole, 50 mM Sodium Fluoride. SDS Wash Buffer: 1% SDS, 100 mM Sodium Phosphate buffer pH 7, 5 mM Imidazole, 50 mM Sodium Fluoride. SDS Elution Buffer: 1% SDS, 100 mM Sodium Phosphate buffer pH 7, 250 mM Imidazole, 50 mM Sodium Fluoride.

43 2.2.1.3 TEV and Dialysis Buffers TEV Buffer 1: 50 mM Tris.HCl pH 7.5, 0.1 mM EDTA, 1 mM DTT, 5% Glycerol TEV Buffer 2: 50 mM Tris.HCl pH 7.5, 50 mM NaCl, 0.1 mM EDTA, 1 mM DTT, 5% Glycerol TEV Buffer 3: 50 mM Tris.HCl pH 8, 150 mM NaCl, 5% Glycerol Storage Buffer: 50 mM Tris.HCl pH 8, 150 mM NaCl, 0.1 mM EDTA, 1 mM DTT, 5% glycerol AEC Load Buffer: 50 mM Tris.HCl pH 8, 50 mM NaCl, 0.1 mM EDTA, 1 mM DTT, 5% glycerol

2.2.1.4 Anion Exchange - Protein Q-Buffer A: 50 mM Tris.HCl pH 8, 5% Glycerol, 50 mM NaCl, 0.1 mM EDTA, 1 mM DTT Q-Buffer B: 50 mM Tris.HCl pH 8, 5% Glycerol, 1 M NaCl, 0.1 mM EDTA, 1 mM DTT

2.2.1.5 Reverse-Phase HPLC TEAAc: 0.1 M Triethylammonium Acetate TFA: 0.5% Trifluoroacetic Acid AcN: Acetonitrile

2.2.2 Bacterial Co-Expression and Purification of bHLH.PAS A Dimers

2.2.2.1 Bacterial Transformation DNA was added to 45 l of competent DH5 cells. Cells were incubated on ice for 20 minutes, heat shocked at 42°C for 2 minutes, then back on ice for 2 minutes. Cells were resuspended in 1 ml of LB and incubated at 37°C for 1 hr, on a rotating platform. Cells were plated onto appropriate selective media and incubated overnight at 30°C.

Cotransformation Procedure: 0.5ul each of pAC28 H6-TEV-ARNT 1-362 and pAC28 H6-TEV-ARNT 39- 362 DNA prepared by miniprep (QIAGEN) were transformed by heat-shock into competent BL21 (DE3) star cells (grown to reject pLys-S vector), plated onto 50 g/ml Kanamycin/2% Glucose/LB Agar 10cm plates and grown O/N at 30°C. Competent BL21 (DE3) star cells containing the pAC28 H6-TEV-ARNT 1-362; or pAC28 H6-TEV-ARNT 39-362 were prepared by standard methods. Competent BL21 (DE3) ARNT cells were subsequently transformed with 0.5ul pET16B hAhR 1-287 DNA by heat shock, plated onto 50 g/ml Kanamycin/100 g/ml Carbenicillin/2% Glucose/LB Agar (K/C/G/LB Agar) 10cm plates and grown overnight at 30°C. Glycerol stocks of cotransformed cells were prepared from cultures grown in 50 g/ml Kanamycin/100 g/ml Carbenicillin/2% Glucose/LB (K/C/G/LB) at 30°C overnight, by addition of 40% glycerol. Glycerol stocks continued expressing protein for approximately 3 months of storage.

44 2.2.2.2 Expression of hAhR/hARNT bHLH.PAS A Heterodimers Glycerol stocks were freshly streaked onto K/C/G LB Agar plates and grown overnight at 30°C. Overnight cultures of 100 ml K/C/G/LB were inoculated with a single DR/ARNT BL21 (DE3) colony and grown overnight at 30°C. The following day, the cells were pelleted at 3K, 21°C and supernatant removed. Pelleted cells were resuspended in fresh media 50 g/ml Kanamycin/100 g/ml

Carbenicillin/LB (K/C/LB) and OD600 determined. The desired volume of culture medium (K/C/LB) was prepared and inoculated to OD600=0.1, grown for ~ 2 hours 30 minutes at 30°C and OD600 monitored.

Once OD600 of 0.5-0.6 was reached, cultures were induced with 0.1 mM IPTG (dissolved fresh immediately prior to induction, and kept on ice). Induction was allowed to proceed for 2 hours and cell pellets were harvested at 5 000 g, 4°C, 10 minutes in centrifuge pots. Pellets were stored at –80 deg C until ready for lysis, typically the following day.

2.2.2.3 Ni-IMAC Purification of bHLH.PAS A Dimers IDA chelating sepharose fast flow (Cat. No. 17-0575-01, GE Healthcare), 50 mL, was packed into an XK-26 (Pharmacia) column and washed with 50 mL MilliQ water. All washes were performed at 5 mL/minute. The resin was charged by washing with 50 mL of 100 mM Nickel Sulphate. The column was washed with 150 mL MilliQ water to remove any unchelated Nickel, and equilibrated in 250 mL of chilled 98% Ni Buffer A/2% Ni Buffer B at 5 mL/min. Cell pellets were resuspended in 15 ml lysis buffer per 500 mL of culture and lysed with two passes through a French press. Lysate was ultracentrifuged at 184000 g, 1 hour, 4 C and filtered using a 0.8  Minisart syringe tip filter, then 0.45  Minisart syringe tip filter for loading onto Ni-IMAC column. For larger preparations (>60 mL lysate), a bottle-top 0.2  filter (Nalgene) was used for vacuum filtration. Lysate was stored on ice and loaded onto the prepared 50 mL Ni-IDA column at 1 mL/min. These steps were performed with a P1 Peristaltic Pump (Pharmacia). At this point, the column was connected to an AKTA FPLC (GE Healthcare) and washed with 100 mL 90% Ni Buffer A/10% Ni Buffer B (50 mM Imidazole); washed with 100 mL 80% Ni Buffer A/20% Ni Buffer B (100 mM Imidazole); and eluted with 100 mL 50% Ni Buffer A/50% Ni Buffer B (250 mM Imidazole). Elution was monitored by absorbance at 280 nm and peak fractions were reserved on ice for Bradford (Biorad) protein quantitation analysis as per the manufacturers protocol, in a 96 well plate format and absorbance at 600 nm measured using an E max precision microplate reader (Molecular Devices). Peak protein fractions were pooled for dialysis overnight at 4°C with stirring in 6-8K MWCO regenerated cellulose dialysis tubing (Cellu-Sep T2) into the appropriate buffer for the subsequent purification step, including: TEV Buffers 1-3; AEC Load Buffer; or Storage Buffer. 1L of dialysis buffer was prepared per 25 mL of eluant.

45 2.2.2.4 AEC Purification of Ni-IMAC Purified bHLH.PAS A Dimers A 1 mL or 5 mL Q-sepharose Hi-load Phenylsepharose (HP) column (GE Healthcare) was washed with 0.45 filtered buffers at 5 mL/minute. Firstly, the column was washed with 10 mL MilliQ water and equilibrated in 10 mL Q Buffer A (50 mM NaCl) on a P1 Peristaltic Pump (Pharmacia). The column was pre-eluted with 10 mL Q Buffer B (1 M NaCl), then equilibrated in 10 mL Q Buffer A (50 mM NaCl). Ni- IMAC purified, dialysed protein was filtered through a 0.45  Minisart syringe filter (Sartorius), stored on ice and loaded onto the prepared AEC column at 1 mL/minute, after which the column was transferred to an AKTA FPLC (GE Healthcare). Gradient elution was performed from 100-55% Q Buffer A/0-45% Q Buffer B (50_500 mM NaCl) (Scheme 2); or 100-80% Q Buffer A/0-20% Q Buffer B (50_250 mM NaCl) (Scheme 5). Stepwise elution was performed from 100% Q Buffer A and 80% Q Buffer A/20% Q Buffer B (250 mM NaCl) (Scheme 4); or 100% Q Buffer A and 90% Q Buffer A/10% Q Buffer B (150 mM NaCl) Scheme 6. Protein quantitation of peak fractions detected by monitoring absorbance at 280 nm was performed as described in Section 2.2.2.3 and purity analysed by SDS-PAGE separation and Coomassie staining.

2.2.2.5 SEC of Partially Purified bHLH.PAS A Dimers A 300 mL Superdex 200 (GE Healthcare) SEC Column (XK26/70, Pharmacia) was equilibrated in 400 mL of Storage Buffer using an AKTA FPLC (GE Healthcare) at 1 mL/min. Protein purified by Ni-IMAC was either dialysed into Storage Buffer and concentrated to 1 mL (Scheme 3), or further purified using a 1 mL AEC column (Scheme 4) and loaded by injection onto the SEC column. Isocratic elution was performed over 300 mL, with 5 mL fractions collected between 100 and 300 mL. Peak protein fractions were stored on ice prior to analysis of protein concentration by Bradford assay. Protein purity was assessed by separation on SDS-PAGE and coomassie stain.

2.2.2.6 Discontinuous Denaturing SDS-PAGE and Coomassie Staining 10% acrylamide SDS-polyacrylamide separating gel was used to analyse homodimer and heterodimer bHLH.PAS A bacterial preparations. The separating gel consisted of: 10% Bis-acrylamide (29:1), 325 mM Tris.HCl pH 8.8, 0.1% SDS, 0.03% APS, 0.125% TEMED. The separating gel was left to polymerise, beneath an overlay of water, before draining and applying the stacking gel. The stacking gel was composed of 4% Bis-acrylamide (29:1), 125 mM Tris.HCl pH 6.8, 0.1% SDS, 0.06% APS, 0.1% TEMED. Minigels were run at 60 mA per 1.5 mm gel in SDS-PAGE Glycine Running Buffer. TEV proteolysis of ARNT homodimer was analysed by 12% acrylamide Tris-Tricine SDS-PAGE. Gels were poured as described above. The separating gel consisted of: 12% Bis-acrylamide (29:1), 1 M Tris.Cl pH 8.3, 0.075% APS, 0.125% TEMED. The stacking gel comprised: 5% Bis-acrylamide (29:1); 975 mM Tris.Cl pH 8.3; 0.125% APS; 0.188% TEMED. Tris-Tricine gels were run at 80 V in Tris-Tricine Running Buffer. Full length HisMyc-mAhR mammalian prepared protein samples were analysed using a 46 separating gel with 7.5% Bis-acrylamide (29:1); other components were as above. Coomassie staining was performed for 2 hrs RT with shaking in Coomassie Stain. Gels were destained in Destain Buffer overnight with shaking at RT. Gels were equilibrated in 5% glycerol and 35% ethanol for 1 hour at RT prior to drying between sheets of cellophane.

2.2.2.7 Denaturing SDS-PAGE and Immunoblotting For immunoblot analysis, SDS-PAGE separation was performed with the use of prestained Precision protein standards (Biorad). Nitrocellulose membrane (Schleicher and Schuell) and 8 sheets of Whatmann filter paper cut to the size of the gel were soaked in Transfer Buffer for 20 mins prior to transfer. Semi-dry transfer was typically used to electroelute the protein from the gel. Four pieces of Whatmann paper were stacked onto the cathode, one at a time, and rolled flat using the shaft of a glass pipette. Next the nitrocellulose was placed on top of the paper stack and rolled again. The gel wells were carefully removed and the gel equilibrated in Transfer Buffer for 5 minutes. The gel was applied on top of the nitrocellulose, noting the orientation of the gel, and wet with a little buffer prior to gently removing air bubbles. Four more sheets of Whatmann paper were applied above the gel and the stack finally rolled to remove any remaining bubbles. Excess buffer was wicked away from the stack using paper towel, without disturbing the stack, and the anode carefully pressed down to seal the semidry transfer apparatus (Transblot, Biorad). 10% and 7.5% gels were transferred for 25 mins at 20V. Transferred membrane was blocked with 10% Skim milk powder (Diploma) in PBS/T for 1 hr at RT with rocking, followed by three 5 min washes in PBS/T with rocking at RT. Primary antibody in PBS/T and milk powder was applied to the membrane and incubated as described in the antibodies section (2.1.4) prior to washing again in PBS/T. The membrane was then incubated in the appropriate secondary antibody (Section 2.1.4) prior to a final wash in PBS/T and the membrane was developed using Supersignal WestPico chemiluminescence substrate (Pierce). Film exposure was performed over a timecourse from 30 secs to 5 mins and film developed using an AGFA Curix 60 developer.

2.2.2.8 EMSA Analysis of Recombinant bHLH.PAS A Dimers Large format continuous native PAGE was used to separate free and complexed FAM labelled XRE DNA. Native PAGE gel was composed of: 7% Bis-acrylamide (29:1), 5% Glycerol, 50 mM Tris.HCl pH 8, 40 mM Glycine, 0.1 mM EDTA, 0.06% APS, 0.1% TEMED. Gels were rinsed in RO water and combs removed with care not to disrupt the well shape. Gels were pre-run at 30 mA, 40 min, 4 C in EMSA running buffer. Wells were flushed several times with running buffer prior to loading samples. Sample Preparation: A 4x stock of EMSA Load Buffer was diluted to 1x with the addition of bHLH.PAS A dimer in the range 0-50 ng, 0.65 mg/mL BSA, 6% salmon sperm DNA, 30 mM DTT, and 6.5 nM FAM labelled XRE 33mer or 16mer DNA, where volumes were made up to a total volume of 50 l per sample with Storage Buffer. Samples were incubated for 15 minutes at RT prior to loading. One lane was reserved 47 for DNA Load Dye alone, which served to indicate the progress of the FAM XRE through the gel. Samples were electrophoresed for 2 hours for 33mer XRE, or 1 hour for 16mer XRE, 30 mA at 4 C. Gels were removed from the running tank and suspended with care in a small volume of EMSA running buffer, prior to immediate imaging using a Typhoon Trio variable mode imager (GE Healthcare) at 100dpi resolution.

2.2.2.9 Proteolysis of bHLH.PAS A hARNT homodimer and hAhR/hARNT heterodimer with Trypsin and Chymotrypsin Total proteolytic digestion of the bHLH.PAS A hARNT homodimer was performed by addition of 0.0625 ng trypsin (sequencing grade, Roche) per g of homodimer in 100 mM Tris.HCl pH 8, 5% Glycerol, 150 mM Sodium Chloride and 10 mM Calcium Chloride and incubated at 37 C. Trypsin digests were quenched with the addition of 100 mM EDTA and SDS-PAGE Load buffer. A 120 g sample of hARNT1-362 was digested as described, and 20 g of material aliquoted and quenched with EDTA for analysis at 0, 30 mins, 2 hrs 30 mins, 5 hrs 30mins, and 19 hrs, where 10 g of the digested sample was analysed by coomassie stained SDS-PAGE. The remaining 10 g of digested homodimer was submitted for mass spectrometric analysis to the Adelaide Proteomics Centre. Partial proteolysis of bHLH.PAS A hARNT/hAhR heterodimer in the presence and absence of XRE dsDNA was performed by addition of 0.25 ng Trypsin per g of heterodimer in the buffer indicated above. A 60 g digest was set up and 10 g aliquots taken at 0, 5 mins, 10 mins, 30 mins and 1 hr timepoints, where the digestion was quenched with the addition of 50 mM EDTA. 5 g samples were analysed by coomassie stained SDS- PAGE and the remaining 5 g submitted for mass spectrometric analysis by the Adelaide Proteomics Centre.

2.2.3 Purification of Oligonucleotides for Crystallization

2.2.3.1 HPLC Purification of ssDNA The single stranded (ss) oligonucleotides 33mer, 16mer, 30mer and 33merB were supplied in crude trityl-on form and were purified by reverse phase HPLC. All buffers were filtered at 0.2 m prior to use using aqueous, or organic stable filters as appropriate (Millipore). In a fumehood, 0.1M Triethylammonium Acetate Buffer (TEAAc) was prepared by diluting 13.9 mL of Triethylamine (TEA) and 4 mL of Acetic Acid in MQ to 800 mL in a 1 L beaker with a clean stir bar. Glacial Acetic Acid was added dropwise to a pH of 6.7, and made up to 1 L with MQ, prior to filtering through a 0.2  filter. 0.5%

Trifluoroacetic Acid (TFA). A reverse phase 25 mL C18 column (Valco) (pressure limit 2.4 Mpa) was prepared by washing with 40 mL MQ at 1 mL/min using an AKTA FPLC (GE Healthcare). Washing and elution was monitored by absorbance at 254 nm. A blank elution was performed prior to purification of 48 ssDNA as per the protocol described below. Aliquots of the oligonucleotide samples were diluted to 2 mL in 0.1 M TEAAc, or, for lyophilised samples, rehydrated in 2 mL of 0.1M TEAAc, 0.2  filtered and stored at 4 C. Initially, with 0.1 M TEAAc in Pump A and 100% Acetonitrile (AcN) in Pump B, the column was equilibrated in 40mL of 15% AcN/85% 0.1 M TEAAc, before injecting 500 l of ssDNA onto the column. Free organics and detritylated synthesis failures were eluted isocratically with 40mL of 15% AcN/85% 0.1 M TEAAc. Next, the column was washed with 40 mL 100% 0.1 M TEAAc to remove the AcN. Pump B was washed into 0.5% TFA, and 40 mL of 0.5% TFA was washed over the column, causing detritylation. AcN was washed back into Pump B, and ssDNA eluted with a shallow gradient from 0-30% AcN/100-85% 0.1M TEAAc over 50 mL, collecting 1.5 mL fractions throughout the elution. Finally, a gradient from 30-100% AcN/70-0% TEAAc was run over column prior to storage in 70% Methanol. Peak fractions were reserved and lyophilised. Following reverse phase purifications, the AKTA FPLC 20% ethanol piston wash solution was replaced.

2.2.3.2 Annealing ssDNA and AEC Purification of dsDNA The purified ss oligonucleotides were rehydrated in 10 mM Tris.HCl pH 8, 1 mM EDTA and 50 mM NaCl (~200 L) and concentration determined by UV spectrophotometry at 260 nm, then complementary ssDNA annealed at an equimolar concentration of 250 M in 1 mL volumes by heating to 95 oC for 5 min, followed by slow cooling to the melting temperature (MT), as calculated by [MT = 69.3+0.41(gc%)- (650/number of base pairs)-2]. The reactions were held at their MT for 2 hrs to allow annealing to occur, and snap frozen on EtOH and dry ice. The annealed oligonucleotides were purified by AEC using a MonoQ 5_50GL column (GE Healthcare). The single stranded and duplex molecules were eluted separately with the application of a 0–1 M NaCl gradient in TE buffer (10 mM Tris.HCl pH 7.5, 1 mM EDTA), and monitored by absorbance at 254 nm. The fractions containing the duplexed DNA molecules were lyophilised, resuspended in TE buffer and the concentration of DNA was estimated spectrophotometrically at 260 nm.

2.2.4 Crystallization of bHLH.PAS A/XRE Complexes

2.2.4.1 Robotic Crystallization Screens Protein/DNA complexes were concentrated and buffer exchanged using a Nanosep centrifugal filter device (10 kDa MWCO) at 1200 rpm, 4 C, into Storage Buffer (no glycerol); or 10 mM Tris.HCl pH 8; or 10 mM Sodium Phosphate pH 8 to protein concentrations listed in Tables 1-4, as determined by Bradford analysis (Biorad). Protein/DNA concentrate was centrifuged for 5 min, 4 C, 14 000 g to remove particulate matter, and transferred to a microcentrifuge tube and stored on ice until ready for use. Crystal screens were prepared by equilibrating the deep well tray to RT and centrifuging at 5000 g, 5 min, 4 C, prior to aliquoting 100 L of mother liquor to a 96 well (2 reaction well) intelli-plate using a 49 Phoenix crystallization robot (Art Robbins). The robot was employed to dispense 50 and 100 nL into wells B and A, respectively. Finally, protein/DNA complex was dispensed into wells B (50 nL) and A (100 nL), prior to sealing the plate with ClearSeal film (Hampton Research) and incubation at RT. Robot screens were monitored by differential interference contrast (DIC) microscopy immediately after being dispensed; the following day; a week after; and once a month from initial set up. Conditions where crystals were observed were optimized by fine screening, typically by hanging drop method.

2.2.4.2 Hanging Drop Fine Screens, IZIT Staining and Diffraction Analysis 24 well Linbro plates (Jena Bioscience) were prepared by coating the top of each well with vaseline dispensed from a 10 mL syringe fitted with a 20-200 L pipette tip, with a small length of the tip cut off to increase the bore. When coating the top of the well, vaseline was applied such that a small gap remained, forming a ‘C’ shape to assist in sealing the well at a later stage. The syringe was filled with vaseline by heating the vaseline in very hot water, and aspirating into the syringe, prior to cooling to RT. The lid of the Linbro plate was modified with the addition of small balls of plasticine at each corner, to lift the level of the lid away from the vaseline. Next, the precipitant was pipetted into each well of the Linbro plate, with care not to scrape the bottom of the well. The remaining ingredients of the desired mother liquor were dispensed serially into each well, with freshly filtered MQ water added last, prior to applying the Linbro plate lid and shaking for 15 minutes on an orbital shaker at full speed to mix. Protein/DNA complex was concentrated and buffer exchanged as described in Section 2.2.4.1. 0.5-1 L of mother liquor was dispensed onto a 22 mm siliconized cover slide (HR3-233, Hampton Research), and an equal volume of protein/DNA concentrate was pipetted into the mother liquor drop, with care not to introduce bubbles, and without deliberate mixing. The cover slide was sealed over the corresponding mother liquor well by rocking the slide from the side covered in vaseline towards the small gap, reducing the likelihood of air pressure lifting the slide off the well. The cover slide was finally rotated ~90 and carefully checked to ensure a good seal was achieved. This procedure was repeated for all wells. Drops were checked for crystallization using DIC microscopy immediately following set up, the following day, the following week and approximately once a month for six months. Drops containing crystals were analysed by IZIT staining, where a small droplet of dye (IZIT Crystal Indicator, Hampton Research, Cat. No. HR4-710) was applied by dipping a needle into the dye, and spotting dye droplets around the crystal drop. Where dye was taken up into crystals, those that could be harvested using a 50 m or 100 m loop were frozen in liquid nitrogen and analysed for diffraction using an in-house system including Rigaku RU200 X-ray generator, Oxford 700 series Cryostream, R-Axis 4 Image Plate detector equipped with Osmic mirrors, and/or using the PX1 beamline at the Australian Synchrotron equipped with a MarResearch 135 CCD and an Oxford 700 series cryostream.

50 2.2.4.3 Sitting Drop Fine Screens Sitting drop fine screens were set up in 24 well Linbro plates. Initially, mother liquor was prepared as per the hanging drop fine screen protocol. Following mixing, micro-bridges (Hampton Research, Cat. No. HR3-312) were placed in each well. 1 L of protein/DNA and 1 L of mother liquor were pipetted onto the well of each microbridge. Once all sitting drops were pipetted, the wells were sealed with 4 inch CrystalClear Sealing Tape (Hampton Research, Cat. No. HR4-508). Drops were checked for crystallization by DIC microscopy immediately following set up, the following day, the following week and approximately once a month for six months.

2.2.4.4 Dialysis Button Fine Screens A 24 well Linbro plate was prepared as per the hanging drop fine screen protocol, however 1 mL of mother liquor was mixed in each well. Microdialysis disks (Hampton Research, Cat. No. HR3-338) were equilibrated in MQ water. 5 l dialysis buttons (Hampton Research, Cat. No. HR3-314) were prepared by pipetting 2.5 l of protein/DNA and 2.5 l of mother liquor. A microdialysis disk was drained of excess MQ and placed at the centre of the dialysis button, with care not to introduce air bubbles. A golf tee (Hampton Research) was prepared with an o-ring, which was slid down the shaft of the tee. The concave end of the golf tee was gently applied to the convex surface of the microdialysis disk on the dialysis button and the o-ring eased over the button and settled into the machined groove at the edge of the dialysis button, sealing the microdialysis disk. The button was finally placed in the mother liquor filled Linbro tray well, where the button was pushed gently to the bottom of the well and checked to ensure mother liquor covered the surface of the button. The well was sealed with Vaseline and a cover slide, as per the hanging drop protocol. The drop was examined for crystal formation by removing the button from the well and visualising by DIC microscopy the following day, the following week, and once a month for approximately 6 months.

2.2.5 AFM Analysis

AFM analysis of protein/DNA samples was performed by Su May Ng in the laboratory of Dr Lisandra

Martin, Monash University. Heterodimers of purified hAhR284/H6ARNT362 were mixed with a 212bp fragment of the Cyp1a1 enhancer, containing two XRE binding sites [3’ ggagagctggccctttaagagcctcac ccacggttcccctcccccagctagcgtgacagcactgggacccgcgcccggttgtgagttgggtagctgggtggctgcgcgggcctccaggctc ttctcacgcaactccggggcaccttgtccccagccaggtggggcggagacaggcagcccgacctctgcccccagaggatggagcaggctta 5’]. The complex was finally buffer exchanged into PBS and deposited onto Ni treated mica, rinsed with ultra-pure water and dried under a nitrogen gas stream. AFM imaging was performed using a Nanoscope IV atomic force microscope with a multimode head (Veeco Instruments California, USA), in tapping mode in air using MikroMasch NSC15 probes. Images generated were 512 x 512 pixels and 51 scan size of 3 x 3, 1 x 1, and 0.5 x 0.5 m2 and acquired at scan frequencies around 0.8-1.5 Hz and 0.4-0.6 Hz for respective cantilevers. AFM images were finally processed to correct for line fluctuations using WsXM freeware (www.nanotec.es version 4 build no. 10.3) 3 x 3 Gaussian noise filtering and 3D rendering.

2.2.6 SAXS Analysis

Heterodimers of purified hAhR284/H6ARNT362 were concentrated alone, or with the addition of a 1:1 molecular ratio of XRE containing 33merB dsDNA to 9-10 mg/ml in a final volume of ~200 l using an Amicon Ultra-4 centrifugal filter device (Millipore) at 4 C, 5000 g. The flow-through volume was reserved, its scattering profile recorded and used for background subtraction. SAXS data was collected by Dr Nathan Cowieson using a SAXSESS instrument with a charged coupled device (CCD) detector (Anton Paar). A 10 m slit width was used with 150 l of sample in a flow cell (Anton Paar). An exposure time of 180 x 10 seconds was performed. Data were collected in the Q range 0.0058 to 0.15 Å-1. P(r) functions were calculated using the GNOM program (141). ab initio shape determination was performed using 10 iterations of the Dammin program (Version 5.3) (142). Alignment and averaging of the Dammin predicted shapes was performed with Damaver (Version 3.2) and Damstart (Version 1) (143). Final ab initio shape determination was performed with the Dammin program, with the shape restricted to the averaged molecular envelope.

2.2.7 Cell Culture

2.2.7.1 Generation of Polyclonal Cell Lines and Cell Culture Human embryonic kidney transformed 293T (HEK293T) cells were routinely grown in Dulbecco’s modified Eagle’s medium (Gibco) supplemented with 10% fetal calf serum and Penicillin/Streptomycin

(1% v/v) and cultured at 37°C with 5% CO2. Polyclonal stable cell lines were generated by transfecting HEK293T cells with pEF/His-myc-mAhRfull/IRES/puro wt and mutants via the Fugene6 method (Roche) according to manufacturer’s instructions. Following a 48 h transfection period, cells were seeded into 10cm dishes and allowed to recover for 24 h prior to selection by addition of puromycin (2 g/ml; Sigma) selection, which was applied for 2 weeks. Pools of surviving cells were then cultured in the presence of increasing levels of puromycin up to 10 g/ml.

2.2.7.2 Whole Cell Extracts, Immunoblotting and Antibodies Whole cell extracts were prepared by lysis into three volumes of RIPA buffer, quantitated by Bradford analysis and 50 g of material subjected to 7.5% acrylamide SDS-PAGE, then transferred to nitrocellulose. Proteins were detected with anti-Myc MAb 9E10 or anti--tubulin MAb MCA78G 52 (Serotec), incubated with the appropriate Horseradish Peroxidase (HRP) conjugated secondary antibody and visualised with Supersignal WestPico Chemiluminescent Substrate (Pierce).

2.2.7.3 Transient Transfections and Dual Luciferase Assays The AhR responsive XRE-thymidine kinase (TK)-luciferase reporter (pT81-X1X1-luc) and control TK- luciferase (pT81-blank-luc) have been described previously (137). Cotransfection with pRL-TK (Promega Dual Luciferase Reporter method) was utilised as a measure of relative luciferase activity in transiently transfected reporter experiments. Triplicate wells were seeded with the relevant stable cell lines at 1.5 x 105 cells per well in 24-well trays and grown for 24 hours. Cells were then transiently cotransfected with 200 ng pT81-X1X1-luc and 50 ng of pRL-TK using the Fugene6 transfection method (Roche) and incubated for 24 hrs.

2.2.7.4 Semisolid Media (1.68% Methylcellulose) 250mL of semisolid media was prepared as described: 4.2 g of methylcellulose and a clean magnetic stir bar were autoclaved in a clean 250 mL Schott™ bottle. 125 mL of standard media was warmed to 58°C and added to the autoclaved methylcellulose, shaken to partially mix and incubated for 20 mins at RT with constant, rapid stirring. Standard media (4°C) was added under sterile conditions to a final volume of 250 mL. The bottle was finally wrapped in alfoil and stirred overnight at RT. Media was stored at 4°C until ready for use.

2.2.7.5 Shear-Stressed Fetal Calf Serum Shear-stressed fetal calf serum was prepared by pumping serum through a P1 pump (Pharmacia) at 10ml/min for 16 h at room temperature, filter sterilized with a 0.2 Minisart syringe filter tip (Sartorius) and stored at -80°C until required for use. This method is an approximation of that used in a previous publication (59), which does not require the use of a parallel-plate perfusion chamber, and generates serum capable of activating AhR.

2.2.7.6 Activation of mAhR for Analysis of Luciferase Reporter Gene Activity 293T HisMyc-mAhR overexpressing cells were seeded at 4 x 104 cells per well in 24 well tissue culture trays (BD Falcon). The following day triplicate wells of 293T HisMyc-mAhR overexpressing cells were left untreated (adherent), treated with vehicle alone (0.1% dimethyl sulfoxide [DMSO]), YH439 [0.01 M, 0.1 M, 0.5 M, 1 M and 10 M], 3-methyl cholanthrene (3MC) [1 M], TCDD [1 nM] or B[a]P [100 nM] and incubated for a further 16 hours. YH439 was generously provided by the Yuhan Research Institute, South Korea. Suspension culture of stable cell lines was performed by aspirating the media and adding 0.05% Trypsin/530 M EDTA in PBS to dislodge cells from the wells. Cells were resuspended in 1 ml culture media with the addition of 1.68% methylcellulose to create semisolid media 53 (144) and incubated for 16 hours. Suspension cultured cells were harvested as previously described (144). Cells were treated with serum-free media with the addition of 60% shear-stressed serum and control untreated serum. Lysates were analysed for luciferase activity using the Dual Luciferase Reporter kit (Promega) as per the manufacturer’s instructions and assayed on a Turner Designs 20/20 luminometer.

2.2.8 Treatment and Denaturing Ni-IMAC of Full-Length HisMyc mAhR

Treatment and Harvesting 293T HisMyc-mAhR overexpressing cells were seeded into 10 x 175 cm2 vented tissue culture flasks (T175 - BD Falcon) and grown at 37 C to confluency. Cells were left untreated, or treated with ligand for 2 hours. Suspension treatment of 5 x 175 cm2 confluent flasks was performed by trypsin digestion with 0.5 ml 0.05% Trypsin/530M EDTA in PBS, with the addition of 4.5 mL PBS per flask. Cells were resuspended by pipetting up and down seven times against the side of the flask. Resuspended cells were added and swirled into 50 mL 37 C prewarmed semisolid media in a 250 mL Schott bottle. 10 mL of PBS was used to rinse the flasks and added to the semisolid media, and the suspension incubated for 2 hours at 37 C. Semisolid media suspension was diluted to 400mL and centrifuged in two 250 mL centrifuge pots at 2000 rpm, 21 C, 15 minutes. The supernatant was carefully aspirated, noting that the pellet was soft and easily disturbed by agitation. The pellet was resuspended in 30 mL of cold PBS in a 50 mL tube (Falcon) and centrifuged at 2000 g, 4 C, 10 minutes. Finally the PBS was carefully decanted prior to lysis.

Lysis Following 2 hours incubation, media from untreated and ligand treated cells was aspirated, cells were gently rinsed with PBS which was similarly aspirated. To each flask was added 15 mL of Guanidine Extract Buffer, prepared with fresh DTT, and flasks were incubated with rocking O/N at 4 C. The suspension treated cell pellet was lysed in 75mL of Guanidine Extract Buffer (for 5 175 cm2 flasks) O/N at 4 C with rocking. The following day, extracts were transferred into a Schott bottle, a clean stir bar was added, and stirred at RT for 2 hours. The extract pH was adjusted to pH 8 with the addition of 2 M Tris.HCl pH 11, covered with alfoil, and iodoacetamide [500 mM] in guanidine extract buffer was added at a 1 in 10 dilution to a final concentration of 50 mM. The extract was stirred for a further 2 hours at room temperature and submitted to ultracentrifugation for 4 hours at 184 000 g, 4 C. The extract was filtered on a 0.22 m bottle-top 500 mL filter (Nalgene) prior to chromatographic purification.

54 Denaturing Ni-IMAC The procedure was adapted from that of S. Furness, PhD thesis, 2003. All buffers for the purification were 0.45  filtered prior to use, with buffers containing Guanidine prefiltered with coarse Whatmann filter paper to remove anticaking agent. A 5mL HiTrap Chelating column (GE Healthcare) was prepared by washing with MQ, 25 mL, using a P1 peristaltic pump (Pharmacia), where the flow rate was maintained at 2.5 mL/min, unless otherwise indicated. The column was charged with 10 mL of 0.1M Nickel Sulphate, and rinsed with 25 mL MQ water. The column was equilibrated with 25 mL Guanidine Wash Buffer and the filtered extract loaded at 1 mL/min. The column was further washed with 50mL Guanidine Wash Buffer starting at 1 mL/min and gradually increasing to 2.5 mL/min as back pressure dropped. Next the column was washed with SB3-10 Wash Buffer (50 mL) to remove the Guanidine and finally SDS Wash Buffer (50 mL). Partially purified material was eluted with SDS Elution Buffer (50 mL) and initially, 3.5 mL of eluate collected (Fraction 1), followed by collection of 7.5 mL (Fraction 2), and two more 3.5 mL fractions collected (Fractions 3 and 4). The column was stripped with 20 mL 0.1 M EDTA and washed with 25 mL MQ water. The column was maintained in MQ or, for longer than 3 days of storage, washed with 25 mL of 20% Ethanol and stored at 4 C. Aliquots of fractions 1-3 (90 L) were retained for SDS-PAGE and immunoblot analysis, and fractions determined to contain partially purified mAhR (typically fraction 2) were transferred to a 50 mL tube (Falcon) and stored at -20 C for mass spectrometric (MS) analysis. Dedicated columns were used for each treatment type including untreated, YH439 treated and suspension treated mAhR.

55

Purification & Crystallization of Transcription Factor hAhR/ARNT Complexed with a Xenobiotic Response Element Chapter 3: Purification and Crystallization of Transcription Factor hAhR/ARNT Complexed with a Xenobiotic Response Element

3.1 Introduction: bHLH Transcription Factor Family and Subgroups bHLH.LZ, bHLH.O and bHLH.PAS

3.1.1 bHLH Transcription Factors bHLH transcription factors are involved in regulation of numerous gene networks involved in early development and metabolism (145), and represent an ancient dimerization architecture with members in all eukaryotic clades (146). This class of transcription factors form homotypic dimers (dimerize with members of the same ) consisting of a four helical bundle, the helix-loop-helix (HLH) and an N-terminal DNA binding basic region. The basic (b) region is predominantly responsible for binding a cognate DNA sequence termed the E box, having generic sequence 5’ CANNTG 3’. The DNA binding conformation of the basic region is conferred by the HLH, which acts to position this region in the major groove of the E-box (147). The bHLH protein family is believed to have diversified at around the time of development of multicellularity (147 and references therein), with the addition of accessory dimerization interfaces. There are currently four groups of bHLH DNA binding proteins (Fig. 3.1) which have been identified, including; 1) bHLH (e.g. MyoD (Fig. 3.1 G), E47) where dimerization and DNA binding is specified by the HLH alone (148 and references therein, 149); 2) bHLH.LZ (e.g. oncogenic proteins Myc and Max (Fig. 3.2 E); Upstream Stimulatory Factor (USF) (150) (Fig. 3.1 H), feature an additional parallel two-stranded coiled coil, termed the leucine zipper (LZ) dimerization domain (Fig. 3.2 B) (151); 3) bHLH.O (Orange) (e.g. Hairy, Enhancer of split, Hey) where the Orange domain forms a secondary dimerization interface (152); and 4) bHLH.PAS (e.g. ARNT, AhR, NPAS4, HIF and Sim) where the PAS domain dictates dimerization specificity. The addition of a second dimerization interface effectively ensures that protein:protein interactions can be contained within families and restrict crosstalk with those bHLH groups whose spatiotemporal activation may be coincident (153). Within each group there are two classes of transcription factors, Class I proteins, which cannot homo- or heterodimerize with other Class I factors, and Class II proteins, which can homodimerize or heterodimerize with Class I factors. Recent reviews highlight the importance of molecular evolution to development of dimerization specificity. Class II factors are believed to represent the most ancient form of each group from which Class I has evolved through gene duplication, domain shuffling and mutation events. This theory provides a mechanism for maintenance of domain architecture, insulation from other related regulatory networks and increasing network complexity (145).

56 (A) (B) (E)

E47 NeuroD1

5’-. . .C A G A T G . . . – 3’ 3’ - . . . G T C T A C . . . – 5’ E47/NeuroD1 E47 in E47-NeuroD1 NeuroD1 in E47-NeuroD1

(C) (D) (F)

E47 in E47-E47 E47 in E47-E47 E47/E47E47/E47

(G) (H) (I)

(J) hsMycbHLH 354-374 hsMaxbHLH 24-53 hsUsf1 200-231 hsARNTbHLH 88-119 hsAhRbHLH 1-57

hsMycbHLH 375-407 hsMaxbHLH 54-75 hsUsf1 232-255 hsARNTbHLH 120-142 hsAhRbHLH 58-88

Figure 3.1 bHLH transcription factors structurally characterised show limited with AhR. (A) E47 DNA contacts in E47/NeuroD1 heterodimer (154). (B) NeuroD1 DNA contacts in E47/NeuroD1 heterodimer (154). (C, D) E47 asymmetrical DNA contacts in E47/E47 homodimer (Adapted from (154)). (E) Crystal structure of E47/NeuroD1 in complex with 16mer Rat Insulin 1 E-box DNA (blunt ends) (154). (F) Crystal structure of E47/E47 homodimer bound to an 11mer E-box (sticky ends) (148). (G) Crystal structure of MyoD homodimer in complex with 14mer E-box DNA (blunt end) (149). Also structurally defined is the bHLH region of bHLH.LZ protein USF bound to a 21bp DNA fragment (blunt end) (H) which homodimerises to bind E-box response elements, with additional backbone contacts made by the loop region (150). Sequence alignment of the bHLH region of human Myc, Max, Usf, ARNT and AhR illustrates that AhR bears limited sequence homology, especially in the basic region, with the other representatives of this domain family (J). However, ARNT shows highest sequence homology with USF, which was selected by Protein Homology/analogY Recognition Engine (Phyre) (281) for homology modelling (I) the ARNT sequence underlined in red (J). A D

B C

GCN4 Homodimer Leucine Zipper b.LZ GCN4 Homodimer

EF

Myc/Max Heterodimer Myc/Max Tetramer

Figure 3.2 b.LZ and bHLH.LZ transcription factors; Leucine zipper parallel coiled coil forms the dimerization interface. Basic-Leucine Zipper (b.LZ) yeast transcriptional activator GCN4 has been structurally characterised using LZ truncation peptides, which show formation of the parallel coiled-coil is specified by the position of buried polar residues such as Asn in the a position of the heptad repeat (figure adapted from (156)) (A-C). Also structurally defined is the b.LZ region of yeast transcriptional activator GCN4 homodimer, bound to 20bp fragment containing an AP-1 site (sticky ends) (155). bHLH-Leucine Zipper (LZ) proteins Myc and Max (E) have been crystallized bound to an 18mer (blunt ends) E-box response element and show DNA binding by the basic region, followed by a HLH dimerization interface contiguous with an LZ which defines partner choice in this family of transcription factors and also forms a multimerisation interface for these factors (F) (233).

The common domain, the bHLH, has been utilised in many regulatory systems through a range of organisms. Of the DNA binding proteins described, the bHLH domain is the most conserved, and has been maintained with extraordinary homology throughout metazoan phyla (146). From to humans, over 240 bHLH transcription factors have been identified. They are known to be involved in regulation of an array of cellular determination and metabolic events, including phosphate uptake, phospholipid biosynthesis, neurogenesis, myogenesis, hematopoeisis and pancreatic development (147 and references therein). bHLH factors have been classified according to tissue localisation, dimerization specificity and DNA binding characteristics. One of the best structurally characterised bHLH proteins is E47 (Fig. 3.1 A-F), belonging to a larger family of bHLH proteins including E12, HEB, E2-2 and daughterless transcriptional regulators (147).

The structure of the bHLH transcription factor E47 homodimer bound to E-box CACCTG DNA sequence has been determined by x-ray crystallography (Fig. 3.1 F) and is characteristic of the bHLH family (148). This protein dimer forms the typical four-helical bundle stabilised by extensive hydrophobic interactions at the dimer interface. A network of H-bonds stabilises the interaction of the loop with helices 1 and 2 of the HLH. The globular HLH positions the basic region in the major groove of the E-box DNA. In the basic region, a glutamate residue conserved throughout bHLH proteins interacts specifically with Cytosine (CACGTG) in the hexanucleotide recognition sequence (Fig. 3.1 C), and illustrated in NeuroD1 (Fig. 3.1 B) (CACGTG) (148, 154). A neighbouring arginine residue is also conserved and interacts with the phosphodiester backbone and glutamate to stabilise the DNA binding position of this residue (148). The two central nucleotides of the E-box, which are variable, are specified by the particular bHLH dimer (148). E protein dimer formation is restricted to those bHLH proteins that do not have an additional dimerization domain. E47 has one extra turn in helix 1, not seen in bHLH-LZ proteins, creating a larger dimer contact area and encompassing an additional polar interaction between subunits. This is theorised to strengthen contact between bHLH proteins, thereby accounting for dimerization selectivity (148). bHLH.LZ transcription factors undergo dimer formation which is specified by the LZ dimerization domain. Characteristically, as shown in the GCN4 homodimer crystal structure (Fig. 3.2 D) (155), this domain contains a heptad repeat (abcdefg)n (Fig 3.2 A) (156) where position a is a valine, or asparagine residue, position d is a leucine, and positions e and g are generally occupied by charged residues. The heptad repeats orient the helices such that the dimer will form on a specific face of the helix and dictate a 'knobs in holes' zipper form of dimer to position the bHLH for DNA binding (Fig. 3.2 B) (156). The asparagine residues have been shown to promote dimer formation of LZ domains by destabilising larger oligomer complex formation (157). An additional effect of the asparagine residues is to partially 57 destabilize the homodimer (158), thereby favouring formation of the heterodimer. In the case of Myc/Max, H-bonding between the a asparagine residue and the backbone of position d' on the opposite helix stabilises this interaction (Fig 3.2 C) (156, 159). Salt bridges are also known to form when charged residues e and g' have opposing charges on the two LZ helices of the heterodimer, thereby additively affecting selectivity and stability of heterodimer interactions (Fig. 3.2 C) (160).

An example of this transcription factor system is seen in the formation of Myc-Max heterodimers (Fig 3.2 (E-F)). The myc family of proteins are involved in positive and negative regulation of cellular processes including cell cycle, proliferation and differentiation (reviewed in 151). Regulation of these processes involves competitive binding of Max by non-homodimer forming c-Myc, N-Myc, L-Myc, Mad, Mad3, Mxi1 and Mnt (161) and recognition of the 5'-CACGTG E-box. Heterodimer formation is understood to be regulated at the level of concentration of competing subunits and potentially post-translational modification (151). Hence, Max must specifically recognise and bind a wide array of bHLH.LZ proteins for survival of the cell. This role is accomplished by the LZ domains of Max and the partner proteins listed above. Heterodimer formation is known to be favoured over homodimerization of Max (160) and subsequent NMR and CD analyses implicate LZ interhelical salt-bridge and H-bond formation in the specific heterodimeric interaction (159, 160). Due to their well documented link with cancer development (161), the molecular interactions that specify dimer formation in the Myc family of proteins have been defined in detail and represent a wide body of knowledge of the bHLH.LZ family that can be extrapolated to non-myc bHLH.LZ proteins (151). In addition, detailed structural analysis of the dimerization of a b.LZ protein, yeast GCN4 (Fig. 3.2 A-D), a transcription factor whose DNA binding is reliant upon LZ domain dimerization, presents another source of information regarding the specificity of molecular interactions at the dimer interface (156, 162). The third group of bHLH proteins, bHLH.O transcription factors, are less well characterised, but are believed to function primarily as negative regulators of bHLH signalling, potentially acting on factors outside of the bHLH.O family in either direct DNA bound transcriptional repression (163), or simply by competing for hub factors such as ARNT (164) to effect indirect transcriptional repression. Another repressive interaction is known to occur between bHLH.O factors Hey 1, Hey 2 and HeyL and a zinc finger transcriptional activator, GATA4, indicating general repressive functionality for this family (152). bHLH.O factors have established roles in neurogenesis, gliogenesis, angiogenesis, bone formation and EMTs (as reviewed in 152) acting in their capacity as downstream effectors of Notch signalling. Tying these transcriptional repressors into the context of the broader HLH family is their function as dimeric active repressors (165, 166), where orange domain dependent dimerization (167, 168) appears to be important for formation of the DNA binding repressor, although the specific details of molecular interactions mediated by the Orange domain are yet to be elucidated. Similarly, contrasting with the well defined role of the leucine zipper in bHLH.LZ dimerization and DNA binding, the fourth group of bHLH proteins is known to require a PAS 58 domain for dimerization and DNA binding, involving as yet structurally undefined molecular recognition events.

3.1.2 Per-ARNT-Sim (PAS) Domains:

The PAS domain was first identified in Drosophila melanogaster period circadian rhythm protein (Per), vertebrate ARNT and Drosophila Sim, forming the acronym PAS. Subsequent investigations have identified PAS domains in every kingdom of life, and proteins containing one or more PAS domains number in excess of 300. PAS domains are known to be involved in signalling circuits monitoring variations in light, redox potential, small ligands, energy levels and oxygen, and are able to recognise and respond to a wide variety of signals (comprehensively reviewed in 169). For example, signal transduction occurs upon binding of an aryl hydrocarbon ligand to the PAS domain of AhR (170). Kinase signalling in a bacterial photophobic swimming response is activated by a prosthetic light-sensitive chromophore bound in the PAS domain of PYP, a photoreceptor in Ectothiorhodospira halophila (171). Activity of nitrogen fixation (nif) promoters in Azotobacter vinelandii is controlled by oxygen-sensing PAS domain NifL, which senses oxygen using prosthetic flavine adenine dinucleotide (FAD) and acts as an inhibitor of nif gene activation when oxidized (172).

The PAS domain cannot always be readily identified by sequence similarity (169). However, the structural homology between PAS domains is well conserved and is defined by the prototypical PAS domains of PYP, B. japonicum oxygen sensor FixL, Human-Ether-a-Go-go potassium channel protein (HERG) and Light-Oxygen or Voltage domain 2 (LOV2) (Fig. 1.2 A). The three-dimensional fold of the PAS domain of PYP shows characteristic structural elements including: (i) An N-terminal cap made up of  helices 1 and 2; (ii) a PAS core containing A and B two strands of the central  barrel, and alpha helices A, B and C; (iii) a helical connector F which links the two sides of the  barrel; and (iv) the other half of the  barrel made up of G, a linker loop and H-hairpin-I (Fig. 3.3 B (adapted from 169, 173)). Although these structural elements are broadly conserved through PAS domains, in PAS repeat containing transcription factors, PAS A alignments reveal distinct loop inserts peculiar to specific proteins, which are distinct from the PAS B fold (174) (Fig. 3.3 compare C with D). Hence, bHLH.PAS A dimer interactions are proposed to involve a structure similar to Drosophila Period PAS A, shown in Figure 3.3 C, (discussed further in Section 4.2.3).

3.1.3 Dimer Interactions Mediated by PAS Domains bHLH.PAS proteins are known to require PAS A for dimer formation, where PAS A forms a cooperative dimerization interface, which strengthens DNA binding and selects dimer partners for this family of 59 PAS B(1) PAS A(1) (A) (B) C-D  E A

A  C F' B B D C C-C D-E

PAS A(2) PAS B(2)

DmPer Homodimer

(C) (D)

DmPer PAS A ARNT PASB

Figure 3.3 Structures of PAS repeat homodimer Drosophila Per, Per PAS A and ARNT PAS B. The only PAS A containing protein to be structurally characterised is Per. PAS A-PAS B of Per were crystallized as an asymmetrical homodimer where PAS A (Molecule 1, royal blue) interacts with PAS B (Molecule 2, Green) (A) (174). If this arrangement was considered as a structural precedent for this family, bHLH.PAS factors would have an antiparallel interface, contrary to typical DNA binding domain arrangements, and indeed, in disagreement with formation of stable heterodimers truncated to include only the bHLH and PAS A (36). Shown in (B) is the typical PAS fold which identifies this domain family, featuring a variable N-terminal PAS cap (purple), PAS core alpha helices and beta sheet forming half of a beta barrel (orange), bridged to the other half of the barrel structure (blue) by a helical connector (green) (174). Also illustrated is Per PAS A in isolation (C), which, according to structure based alignments (174) has a higher sequence homology for AhR and ARNT PAS A than sequences of PAS B truncations such as ARNT (176) (D). However, Per PAS A does not bear high enough sequence identity with AhR and ARNT to predict the fold of this domain for these undetermined structures, likely due to insertions in proposed external loops. transcription factors (3, 175). However, this family is not unique in utilising a PAS domain as a dimer interface. The specific face of PAS A which may be employed for bHLH.PAS transcription factor dimer assembly may be analogous to a range of other PAS dimer proteins. PAS or LOV domains are broadly able to engage two 'faces' for dimer formation, (1 -scaffold dimerization, or (2) variable N-terminal PAS cap dimerization (Fig. 3.4). The most obvious analogy for PAS A dimer formation can be drawn from HIF2/ARNT PAS B dimer which forms an antiparallel dimer interface through the PAS -scaffold (Fig. 3.4 A) (176).

Bacillus subtilis KinA histidine kinase is a sensor kinase, which detects starvation and is critical in the sporulation switch. KinA has three PAS domains, and in periods of starvation, KinA autophosphorylates, initiating a phosphorylation cascade resulting in sporulation. Deletion of PAS 1 alone results in a 95% decrease in phosphorylation (25, 177), confirming that this detection response is sensitive to this structural change. Recently, the crystal structure of the KinA PAS 1 dimer (Fig. 3.4 C) indicated the possibility of two functional dimer interfaces, both of which utilise the -scaffold, but one dimer is approximately parallel (Dimer 1), the other close to antiparallel with a ~120° crossing angle (Dimer 2). Mutagenesis to disrupt dimer formation identified a single residue mutation that maintained a monomeric state, and was found to activate KinA in vivo, although a mechanism for monomer to dimer 1 or dimer 2 switching remains to be identified (25). Another sensor protein in Bacillus subtilis is the YtvA blue-light photoreceptor, which detects light through a flavin binding LOV domain, connected to a nucleotide binding Sulphate Transporter and AntiSigma factor antagonist (STAS) domain. The isolated LOV domain forms a parallel dimer (Fig. 3.4 E) which, again, utilises the -scaffold, although the mechanism of physiological dimerization, or functional outcome of dimer formation are not understood (178). However, light absorption by the flavin mononucleotide (FMN) has been shown to push the dimer interface slightly apart relative to the dark structure in the isolated domain (179).

A second type of PAS dimer utilises the variable N-terminal PAS cap, the least conserved structural element of the PAS domain (173) (Fig. 3.3 B). The N-terminal PAS cap serves to protect the  sheet from solvent interactions, although this role may be fulfilled by other compensatory structures in cases where the typical cap fold is not present (169). A second role defined for the PAS cap as a dimerization interface is seen with Vivid (VVD), another blue-light photosensor from fungus Neurospora crassa which regulates circadian rhythm. VVD forms a parallel dimer interface through the N-terminal PAS cap (Fig. 3.4 B). In the light state, the FMN moiety forms a covalent bond with a conserved Cys residue within the cap region, resulting in conformational change in the N-terminus, causing PAS domain dimerization, although again, in the case of the full length protein, dimer formation has not been determined (180). FixL is a sensor histidine kinase and utilises a heme group in order to function as an oxygen sensor to

60 Structurally Determined PAS Dimer Interfaces

-barrel dimer interface N-terminal cap interface

(A) (B)

HIF2/ARNT PAS B Vivid Photosensor

(D) (C)

FixL

KinA - Two Dimer Forms

(F) (E)

H-NOXA YtvA Figure 3.4 Structures of proteins with a functional PAS dimer;  Scaffold and PAS core interfaces. The PAS domain has been utilised in a range of proteins as a dimeric interface, typically aligning the -scaffold (A, C, E) or the N-terminal PAS cap (B, D, F). HIF2 (yellow)/ARNT PAS B (purple) heterodimerises through the -scaffold in an antiparallel arrangement (176). Bacillus subtilis histidine protein kinase KinA (C) has been structurally characterised in two dimer forms with different crossing angles, parallel and at ~120 degrees (figure taken from (25)). (E) The Blue-light photosensor YtvA is illustrated in aqua and red, with flavin mononucleotide (FMN) ring shown in blue (178) dimerises through the -scaffold. (B) Another photosensor, Neurospora crassa Vivid has been crystallized in the light state and displays a parallel N-terminal cap dimerization interface (180). (D) FixL, a sensor histidine kinase, in contrast to KinA, uses the N-terminal cap interface for homodimer formation (181). Finally, H-Noxa PAS dimer (F) is another N-terminal PAS cap dimer form of sensor histidine kinase, which may be functionally regulated through this interface (182). regulate a battery of nitrogen fixation genes in Rhizobia. The FixL sensor domain is a PAS domain which homodimerizes through the N-terminal cap region (Fig. 3.4 D), although no function for this dimerization event has been identified (181). Another sensor histidine kinase, soluble guanylyl cyclase (sGC) H-NOXA (Heme-Nitric Oxide/Oxygen binding domain) detects the presence of nitric oxide through a bound heme moiety. This sensor also forms a parallel dimer through the N-terminal PAS cap (Fig. 3.4 F). Mutagenesis across the homodimeric interface, and between a proposed heterodimer interface of sGC/sGC revealed a potential role for dimer formation in response to NO stimulation (182). Other PAS/LOV protein structures indicate the utility of the N-terminal PAS cap for dimer formation, including nitrogen fixation regulator NifL (172), heme based sensor phosphodiesterase EcDOS (183) and CitA (184) which all form symmetrical dimers. The region defined for other PAS factors as the N-terminal PAS cap of PAS B has required redefining in light of the crystal structure of Drosophila Period protein (174). Allowing for insert loops in PAS A has shifted the frame of the sequence traditionally attributed to the PAS fold (compare Fig. 3.3 C and D). Accordingly, recombinant expression of soluble, DNA binding heterodimeric bHLH.PAS AhR/ARNT requires that truncations include the previously defined ‘N-terminal PAS cap of PAS B’, which is now believed to form part of the PAS A fold, according to structure based alignments with Per PAS A (9, 174) (described further in Chapter 4).

3.1.4 Dimeric bHLH.PAS Transcription Factors

The bHLH of this transcription factor family can function in dimer formation and in specifying weak DNA binding when isolated from the PAS domains, however, the addition of the PAS A domain forms a dimerization interface which defines partner selectivity and strengthens DNA binding by heterodimeric factors (36). Deletion of the PAS domain from AhR allows homodimerization and heterodimerization with an unrelated bHLH-LZ transcription factor USF (3). Indeed, the AhR and ARNT PAS domains, isolated from their bHLH regions, are sufficient for interaction and activation of reporter genes in a mammalian hybrid interaction system (175). The PAS domain not only defines partner choice, but also stabilises the active dimer for DNA binding and bending. Additionally, these effects cannot be reproduced by the replacement of PAS A with a LZ dimer interface (9). Given that bHLH.PAS proteins specify a variety of responses to environmental cues, it can be implied that these transcription factors must recognise a broad array of target genes and that these recognition events cannot overlap. Mammalian Sim and HIF transcription factors heterodimerise with ARNT and recognise target genes with the same core recognition sequence 5’ ACGTG 3’ (185). It is clear that these factors have vastly different gene activation profiles, and hence, a level of regulation must exist to prevent overlap in the activation of target genes.

61 The selectivity of target gene activation has been investigated in an elegant series of experiments by Zelzer and coworkers. In Drosophila, ARNT homologue Tango acts to homodimerize with Drosophila Class I proteins. Three binding partners for Tango have been identified, including Similar (Sima), dSim and Trachealess (Trh). Sima functions as an oxygen sensor in Drosophila, regulating hypoxic responses in a similar manner to mammalian HIF1 (186). dSim is involved in determination of midline cell fates in response to developmental cues and after this point, is positively autoregulated. Trh specifies cell fate in tracheal cells and, again, autoregulates its expression after differentiation. Trh and Sim share high sequence identity in their basic region. In order to dissect the source of specificity for gene activation by Trh and Sim, ectopic expression patterns for wildtype, bHLH domain swapped and PAS domain swapped constructs were compared. Ectopic expression of wt Trh showed spatially restricted expression of tracheal markers within the ventral ectoderm, whereas ectopic Sim showed activation of target genes throughout the region. When the bHLH of Sim was inserted in the Trh protein (Trh-bSim), the hybrid transcription factor was found to confer identical cell fate determination as wild- type Trh, indicating that the bHLH does not define the expression boundaries of tracheal markers that were seen in this experiment. However, when the PAS domain of Sim was swapped into the Trh protein (Trh-Sim PAS), ectopic activation of Sim target genes was seen and Trh target gene activation was lost (187). Thus it was confirmed that the PAS domain was necessary and sufficient for defining gene activation boundaries. Further investigation revealed a potential coactivator, which interacts directly with the PAS domain of Trh, but does not interact with Sim. Hence, it would appear that the PAS domain of Trh not only acts to specify dimer partner choice, but also has the potential to specify interactions with other proteins necessary for activation of target genes (10), or potentially, specify target gene choice directly by making primary DNA contacts.

The potential for direct DNA contact by the PAS A region of AhR/ARNT heterodimer is supported by several studies. Firstly, mutation within PAS A of mouse AhR (C216W) or ARNT (G326D) results in dramatic changes in DNA binding affinity, which did not appear to affect heterodimer formation (188, 189), although this conclusion has recently been questioned for recombinantly expressed mutant proteins (9). Secondly, in vitro studies on recombinant bHLH.PAS A truncated heterodimers in our laboratory show a requirement for PAS A in high affinity DNA binding by AhR/ARNT and HIF/ARNT, relative to bHLH alone. This proposal was additionally strengthened by domain swap experiments where AhR PAS A was replaced with the same region from ARNT (36). Despite forming a heterodimer, this material was unable to achieve high affinity binding of an XRE, an effect mimicked by previous in vitro translated PAS deleted AhR dimerized with wt ARNT (29). Thirdly, AhR and HIF are known to bind atypical E-boxes, XRE 5’ TNGCGTG 3’ (37) and HRE 5’ TACGTG 3’ (1, 2), respectively, and both require PAS A to achieve high affinity DNA binding. In contrast, ARNT homodimer, which binds the canonical E-box sequence 5’ CACGTG 3’, does not require the PAS region for DNA binding, although 62 homodimerization is negatively affected by this truncation (36). PAS A domain swaps with HIF1 or ARNT PAS A replacing AhR PAS A also showed dramatic reduction in XRE binding affinity relative to wt AhR, suggesting PAS A strengthens DNA binding in a context specific manner, although dimerization and stability were also affected (9). Finally, recent work indicates PAS A interacts directly with the bHLH region, placing PAS A spatially close to the DNA, and evidence of PAS dependent DNA bending and DNAse1 hypersensitivity downstream of the XRE which could not be mimicked by replacement of AhR and ARNT PAS A with a bHLH contiguous LZ from Myc and Max, respectively (9). bHLH.PAS transcription factors are of vital importance to the survival of metazoans. Regulating responses to environmental changes, or activating genes regulating cell-fate determination, these factors display mechanisms of protein recognition that are peculiar to this family of signal transducers. Specifically, the PAS domain appears to mediate interactions with other bHLH.PAS proteins for formation of active DNA binding dimers and additionally, likely makes direct contact with DNA to achieve promoter bending and activation. In contrast to bHLH and bHLH.LZ transcription factors, the molecular recognition occurring between bHLH.PAS proteins has not been quantitatively or structurally established. Rather, this broad field has relied on descriptive evidence to establish the function of the PAS domain in the context of a DNA binding protein. Hence, the aim of this study is to investigate these proteins from a structural perspective in order to elucidate (1) the contribution of structural features within the PAS domain that specify dimerization, contact DNA, position the bHLH region to maintain specific DNA contacts and bending; and (2) to establish the structural features of ARNT which allow it to function as a hub protein to interact with a range of bHLH.PAS Class I proteins. Establishing structural detail to describe functional outcomes of heterodimer formation will inform a broad field of bHLH.PAS investigation including proteins with implications for a number of disease states. In addition to the roles of HIF proteins in cancer and cardiovascular disease, possible links exist between Sim2 and Down's syndrome, NPAS3 in the development of schizophrenia (190), and NPAS4 in autism spectrum diseases (191). Structural analysis of bHLH.PAS heterodimers may facilitate structure based drug design for treatment of disease states resulting from aberrant function of these proteins, e.g. in diseases such as cancer (HIF/ARNT, Sim2/ARNT), heart attack (HIF/ARNT), heart disease (AhR/ARNT), stroke (HIF/ARNT), preeclampsia (HIF/ARNT), Down's syndrome (Sim2/ARNT), schizophrenia (NPAS3/ARNT), xenobiotic poisoning (AhR/ARNT) and hormone dependent cancers (AhR/ARNT).

In order to address how dimerization and DNA binding by bHLH.PAS factors is defined at the molecular level, x-ray crystallographic structure determination was pursued. Full-length AhR and ARNT are known to be insoluble when expressed recombinantly, hence, truncations of these factors were expressed and purification was optimised. Alterations to the protein truncation points were also pursued as a means to generate more stable protein dimers. Purified heterodimer was combined with short DNA sequences 63 derived from the Cyp1a1 promoter, and protein/DNA complexes subjected to broad crystallization screens. Identified crystallization conditions were refined by finescreen, varying screen components to optimize crystal size. Crystals were finally analysed by dyeing with protein stain and tested for diffraction using an x-ray generator and synchrotron radiation.

3.2 Results

3.2.1 Recombinant expression, purification and solubilisation of hAhR/ARNT

3.2.1.1 Expression, purification and TEV proteolysis of H6ARNT 1-362 homodimer reveals the presence of oligomeric species. Given that it is not possible to express full length ARNT in large quantities using a recombinant bacterial expression system (Chapman-Smith, A; pers. comm.), and that bHLH.PAS A is sufficient to achieve dimerization and DNA binding (36), expression of ARNT amino acids 1-362 represented a truncation including bHLH and PAS A to the N-terminus of PAS B (Fig. 3.5). The thioredoxin tag (Trx), used in previous studies in our lab, was removed from this expression construct in order to reduce the potential effects of folding partner tags on bHLH.PAS dimer conformation. Human ARNT isoform 3 (1-362) was cloned with a His (H6) tag at the N-terminus, followed by a Tobacco Etch Virus (TEV) protease sensitive linker sequence (ENLYFQ^G) (192) (Fig. 3.5) to allow for specific removal of the tag for crystallization studies. Note that ARNT isoform 3 has an alternate splice site, which removes the region coding for N- terminal residues 77-91 from isoform 1 (15 residues), and is the original clone used in almost all functional assays. Numbering utilised in this thesis is consistent with the isoform 1 coding sequence, hence ARNT362 refers to a protein coding region including residues 1-347 of isoform 3.

Protein was expressed in bacteria and purified from crude extract using Nickel Immobilized Metal Affinity Chromatography (IMAC). Initial purification and DNA binding assessed by Electrophoretic Mobility Shift Assay (EMSA) using a carboxyfluorescein (FAM) labelled E-box fragment of the Adenovirus Major Late promoter (193, 194) revealed similar function relative to Trx tagged ARNT homodimer (data not shown). Trial purifications showed that impurities present after washing the column with 50 mM imidazole (Fig. 3.6 A) could be removed without losses of ARNT dimer by increasing the imidazole concentration to 100 mM (Fig. 3.6 B), resulting in relatively pure material after IMAC. Persistent higher molecular weight contaminants prompted further investigation to determine if ARNT was forming oligomeric species. Although limited solubility during size exclusion chromatography (SEC) hampered separation of the higher MW material, concentration using a 10 kDa nominal molecular weight cut-off (MWCO) ultrafiltration apparatus of SEC purified homodimer resulted in

64 bHLH.PAS A Truncations of hAhR and hARNT

Helix-Loop-Helix PAS domain Basic PAS-A PAS-B Transactivation Domain

13 88 287 389 AhR 1 115 271 600 848

TrxH6 287 287 293 284

87 143 362 471 ARNT

1 167 789 -15 a.a

His6 --TEV 362 His --TEV 6 50aa

Figure 3.5 Primary structures and recombinant expression constructs of truncated Aryl hydrocarbon Receptor and ARNT. His tagged human Aryl Hydrocarbon Receptor Nuclear Translocator isoform 3 (hARNT) (1-362, 39-362) and variant truncations of human Aryl hydrocarbon Receptor (hAhR) (1-284, 1-287, 1-290), including the region coding for bHLH and PAS A domains, were cloned into pAC28 and pET16b respectively for co-expression in bacteria. A triangle indicates the site of the variant exon 11 not represented in this ARNT construct (-15 amino acids (a.a.)). This combination of vectors was selected due to the presence of compatible origins of replication for maintenance in host bacteria. Expression and Purification of H6hARNT362 Homodimer

. id Im ) (A) M m l 0 oo (5 P 3 4 5 6 h nt nt nt G G s a a a ant ant T T te u u u u u P IP a T l l l l l MW I Wa s F E E E E E ) ) i y i i i i i i (kDa) (- (+ N L N N N N N N

55.4 ~41 kDa 36.5 H6-TEV-ARNT362

. id Im ) M m l o 00 o 1 P 4 5 6 (B) ( t3 t7 h nt G G s a an ant ant ant an T T te u u u u u u P P a l l l l l l I I s Wa E E E E E E ) ) y i i i i i i i (- (+ L N N N N N N N MW (kDa)

55.4

~41 kDa

36.5 H6-TEV-ARNT362

Figure 3.6 Recombinant expression, Ni-IMAC of H6TEV-ARNT362 Homodimer. Expression of His tagged human Aryl Hydrocarbon Receptor Nuclear Translocator isoform 3 (hARNT) (1-362) was induced for 2 hours at 30 C with 0.1 mM IPTG. Cells were lysed using a cell disruptor and lysate was purified from cleared lysate by Ni-IMAC. Initially, the homodimer appeared contaminated after a 50 mM imidazole (imid.) wash (A), and, increasing the stringency at the wash step to 100 mM imidazole resulted in markedly cleaner preparations (B). Ni Eluant pool represents the Ni-IMAC fractions with the highest absorbance at 280 nm, pooled for dialysis. Purified protein was separated by discontinuous SDS-PAGE and visualised with Coomassie blue stain. increased concentration of higher MW contaminants, and decreased resolution of the monomer band by SDS-PAGE (data not shown). Concentrated purified homodimer (Fig. 3.7 A) was subjected to proteolysis using TEV protease, and as expected, produced slightly faster migrating species by SDS- PAGE (Fig. 3.7 A). Western blot analysis using an anti-His primary antibody revealed that the higher MW contaminants contained His tags, which were specifically removed by TEV proteolysis, indicating that H6-TEV-ARNT 1-362 formed higher MW oligomeric species that were not disrupted by 25 mM DTT, heat denaturation and SDS-PAGE separation (Fig. 3.7 A and B). However, the ARNT oligomeric species were disrupted by dilution into buffer containing 0.25% Triton-X-100 at room temperature for 16 hours, indicating that this effect was concentration dependent and likely involved a hydrophobic interface (data not shown). Subsequent partial proteolysis experiments (Section 3.2.2.5) resulted in total digestion of higher MW material, indicating that these oligomeric species were indeed soluble. However, given the oligomeric heterogeneity induced by concentrating ARNT homodimer, crystallization of this material was not pursued. Attempts to purify the more biologically relevant AhR/ARNT homodimer were prioritised at this stage.

3.2.1.2 Co-Expression and IMAC Purification of hAhR287/H6ARNT362 and

TrxH6hAhR287/H6ARNT362 gave soluble, functional heterodimers. Human Aryl hydrocarbon Receptor was cloned without fusion partner or affinity purification tag, truncated to include only bHLH and PAS A domains up to the N-terminus of PAS B, representing sequence 1-287 (Fig. 3.5). A second hAhR clone was engineered to include a TrxH6 fusion partner and affinity tag at the N-terminus of the 1-287 construct. Given the discovery of stable PAS B heterodimeric foms of HIF2/ARNT (195-197), it is possible that truncating hAhR/ARNT in this manner may affect dimer formation. However, inclusion of the ligand binding domain (LBD) region, i.e. PAS B, is known to result in low expression, and insolubility in bacterial expression systems (34). Given that AhR bHLH.PAS A region cannot be recombinantly expressed as a soluble protein in bacteria without partner factor ARNT (Chapman-Smith, A., pers. comm.), initial trials focused on co-expression. BL21 (DE3) E. coli were cotransformed with pAC28-H6-TEV-ARNT 1-362 and pET16b hAhR 1-287, or pET32

TrxH6TEV hAhR 1-287 plasmid DNA. This pair of vectors (pAC28 and pET) was selected due to their complementary origins of replication (36, 138). Initial, small scale expression in bacterial culture followed by IMAC resulted in purification of hAhR and ARNT, for both tagged and untagged AhR, at an approximately equal ratio, indicating purification of heterodimeric species (Fig. 3.8 A, B). To confirm that the dimers were functional, XRE binding activity was assayed with FAM labelled double stranded (ds) DNA from the Cyp1a1 promoter (33mer (3)) by EMSA. Both species were active for DNA binding (Fig. 3.8 C), so purification and solubilisation of the untagged variant was prioritised in order to limit the potential for structural artefacts. Purification schemes attempted and yields achieved by each method

65 TEV Proteolysis of H6hARNT362 Homodimer

(B) (A) r r r r r r r r r r r h h h h h h h h h h h r r r 0 1 2 4 1 2 4 1 2 4 h h h 0 1 2 4 V V V V V V V V V V V V V V E E E E E E E E E E T T T T T T T MW E T T T E E E ) ) ) T T T T -) + + + -) -) -) ) ) ) ) ) ) ) ( ( ( ( ( ( ( (kDa) (- (+ (+ (+ (- (- (-

55.4 36.5 ~41 kDa

Anti-His Western Blot

SDS-PAGE

Figure 3.7 Recombinantly expressed H6TEV-ARNT362 homodimer forms high molecular weight oligomers. H6TEV- ARNT362 homodimer was partially purified by IMAC prior to concentration (1.2 mg/ml) and finally subjected to TEV

proteolysis. Aliquots of H6TEV-ARNT362 were incubated at 30 C for 1, 2, or 4 hours sTEV protease. Proteins were visualised by 12% Tris-Tricine SDS-PAGE and Coomassie staining (A), or transferred to nitrocellulose, which was probed with an anti-His primary antibody and detected using anti-mouse HRP conjugated secondary antibody (B). The His tag detected in the higher MW material suggested that H6TEV-ARNT362 was forming oligomeric species which could not be readily disrupted by reducing SDS-PAGE. Additionally, after TEV proteolysis, His signal was no longer detected in either the

41 kDa band, nor the higher MW bands, hence confirming that H6TEV-ARNT362 was represented in the higher MW species (B). Expression and Purification of hAhR/ARNT ± Thioredoxin Tag

3 4 5 (A) t t t G n n n G e T PT t lua lua lua P I a E E E TrxH AhR287/H ARNT362 I ) s i i i 6 6 MW (-) (+ Ly N N N (kDa)

55.4 TrxH6AhR287 H6TEV-ARNT362 36.5

3 4 5 t t t G n n n G e T PT t lua ua lua P I a l I ) s E E E ) i i i (B) MW (- (+ Ly N N N AhR287/H6ARNT362 (kDa) 55.4

H6TEV-ARNT362 36.5 hAhR287

(C) in x n do xi re do io re Th io (-) o h N +T

Bound 33mer TrxH6-AhR287/H6TEV-ARNT362 AhR287/H6TEV-ARNT362

Unbound 33mer

Figure 3.8 Recombinant co-expression, Nickel Affinity Chromatography and XRE EMSA of TrxH6AhR287/H6TEV- ARNT362 and AhR287/H6TEV-ARNT362. bHLH.PAS A heterodimeric AhR/ARNT was prepared s a Thioredoxin His (TrxH6) tag on AhR287 and purified by Ni-IMAC prior to separation by discontinuous SDS-PAGE and visualisation with Coomassie stain. Both proteins were expressed and purified dimerised with H6TEV-ARNT362, and the untagged AhR (B) appeared to be more efficiently purified than the tagged form (A) by Ni-IMAC. Both forms of heterodimer were active for binding FAM labelled XRE probe by EMSA (C). Untagged hAhR was pursued in preference to the tagged form, given that it was more readily purified, active for binding DNA and in order to reduce the potential for structural artefacts possibly generated by the presence of a synthetic tag. are summarised in Figure 3.9, and application of purifications performed are described in detail in the following sections.

3.2.1.3 TEV Proteolysis of hAhR287/H6ARNT362 incurred a decrease in protein solubility and resulted in reduced yield.

Following IMAC partial purification of expressed protein, further purification of hAhR287/H6TEV- ARNT362 was initially optimised to include a TEV digest to remove the His tag. Heterodimer purified by IMAC required immediate desalting to remove imidazole (250 mM) as high salt can cause protein aggregation, although stabilising soluble heterodimer during buffer exchange proved a difficult task. It is recommended by manufacturers instructions that TEV digestion be performed in low salt buffer to ensure efficient proteolysis (Invitrogen AcTEV Protease). Additionally, further purification of the heterodimer by anion exchange chromatography (AEC) also required low salt concentration to achieve protein adsorption. Hence, buffer conditions for TEV proteolysis and AEC were optimised in parallel.

Purification Scheme 1 The first attempt at TEV proteolysis was performed on IMAC purified ARNT homodimer, where protein was dialysed into TEV buffer 1 (50 mM Tris.HCl pH 7.5, 0.1 mM EDTA, 1 mM DTT, 5% Glycerol) after purification (Scheme 1, Fig. 3.9). 4 mg of homodimer was digested for 4 hours at 37 C with 30 units of TEV protease, where TEV protease units are defined by the capacity for one unit of enzyme to cleave 85% of 3 g of control substrate in 1 hour at 30 C. This procedure resulted in homodimer becoming insoluble and aggregating. TEV proteolysis of the IMAC purified heterodimer hAhR287/H6ARNT362 was attempted and also resulted in aggregation, hence the digestion buffer was modified to improve solubility with the addition of NaCl.

Purification Scheme 2 In scheme 2 (Fig. 3.9), rapid buffer exchange employed SEC using an XK50/30 G25 sephadex column (Amersham Biosciences). The minimal TEV buffer was modified to improve solubility with the addition of 50 mM NaCl: TEV Buffer 2 (50 mM Tris.HCl pH 7.5, 50 mM NaCl, 0.1 mM EDTA, 1 mM DTT, 5% Glycerol). TEV digestion was performed on half the purified material using 20U TEV protease, over 16 hours at 4 C, with a control sample stored without protease. Both samples were found to aggregate, where heterodimer yields were 2.4 mg/L culture from IMAC, 2 mg/L after buffer exchange and 0.75 mg/L soluble material after storage overnight at 4 C. Aggregate was identified as hAhR287/H6ARNT362 by SDS-PAGE analysis (data not shown), hence, buffer conditions for IMAC purification and storage were investigated further to improve solubility and yield.

66 hAhR287/H6ARNT362 hAhR284/H ARNT362 Ni-IMAC 6 Ni-IMAC

(1) (2) (3) (4) (5) (6) (6)

Dialysis Minimal TEV G25 Sephadex Buffer Concentrated to Dialysis Dialysis Dialysis Dialysis Buffer Exchange 1 ml AE Load Buffer AE Load Buffer AE Load Buffer AE Load Buffer TEV Buffer + 50 mM NaCl

TEV

INSOLUBLE AE SEC Superdex 200, AE AE AE 5 mL column AE Gradient Elution 300 mL XK26/70 250 mM NaCl Step 50-250 mM 150 mM NaCl 150 mM NaCl 50-500 mM NaCl 250-500 mM 250 mM NaCl Stepwise Elution 1 mg/L 3.3 mg/L NaCl Gradient Stepwise Elution 3.5 mg/L TEV 1 mg/L 2.4 mg/L SEC Superdex 200 Ni-IMAC flowthrough 300 mL XK26/70 0.25 mg/L 1 mg/L

Figure 3.9 Summary of Purification Protocols for Optimization of Purity, Yield and Solubility of hAhR287/H6TEV-ARNT362 and hAhR284/H6TEV-ARNT362. Inherent solubility problems were associated with purification of hAhR287/H6TEV-ARNT362. Hence, a number of purification strategies were attempted, where schemes are labelled (1-6), with yield shown in bold at the bottom of each scheme. IMAC purification was performed at the beginning of each strategy. TEV proteolysis typically resulted in loss of solubility and decreased yield (Scheme (1) and (2)). SEC involved concentrating the heterodimer prior to loading, resulting in formation of higher order oligomers, and marked losses due to limited solubility at higher concentrations (Scheme (3)). Stepwise elution at the AEC stage of purifications resulted in maintenance of higher MW contaminants, requiring further SEC purification (Scheme (4)). Ni-IMAC and gradient elution during AEC showed higher purity in the early section of the

50_250 mM NaCl gradient (5). Optimal purification was achieved by combining Ni-IMAC and AEC using a step elution at 150 mM NaCl (Scheme (6)). hAhR284/H6TEV-ARNT362 purification was achieved using the same method, although enhanced solubility of this protein improved overall yield to 3.5 mg/L, where protein yield was estimated by Bradford analysis. TEV Proteolysis of hAhR287/H6-TEV-ARNT362

42137 Temperature (თC) Units TEV Protease 

MW (kDa) Contaminants

55.4 9 Hours H6TEV-ARNT362 ARNT362 hAhR287 36.5 Contaminant

MW (kDa) Contaminants 55.4 H6TEV-ARNT362 24 Hours ARNT362 36.5 hAhR287 Contaminant

Figure 3.10 Recombinant co-expression, IMAC purification and TEV proteolysis of hAhR287/H6TEV-ARNT362. TEV proteolysis was optimised for IMAC purified hAhR287/H6TEV-ARNT362, by incubation of ~40 g of protein at 4, 21 and 37 rC for 9 and 24 hours with 2, 5 and 10 units of TEV protease (Invitrogen). Digested heterodimer was separated by denaturing SDS-PAGE and visualised by coomassie staining. This illustrated that total digestion was achieved by incubating heterodimer in the presence of 1 unit of TEV protease: 20 g heterodimer at room temperature over 9 hours (, top panel). Subsequent TEV digests were performed using these conditions in anion exchange load buffer. To test the effect of IMAC buffer components on protein solubility and stability, hAhR287/H6ARNT362 was expressed and half the cells lysed into sodium phosphate pH 7.2 Ni-IMAC Load Buffer, and half into Tris.HCl pH 7.2 buffered Ni-IMAC Load Buffer. IMAC purifications were performed in parallel with sodium phosphate buffering and Tris.HCl buffer respectively. The eluted protein samples were halved again and dialysed overnight into different storage buffers including a glycerol containing buffer (20 mM Tris.HCl pH 7.5, 0.1 mM EDTA, 150 mM NaCl, 10% Glycerol, 1 mM DTT) and a Mg2+ containing buffer

(20 mM Tris.HCl pH 7.5, 0.1 mM EDTA, 150 mM NaCl, 5% Glycerol, 1 mM DTT, 3 mM MgCl2). The latter was chosen as Magnesium salts are known to enhance function of recombinantly expressed transcription factors (198). Protein precipitation was observed in all conditions. Hence, sodium phosphate buffered Ni-IMAC was excluded as causal, and high glycerol or magnesium alone were not able to prevent precipitation.

Given that purified material was precipitating upon storage, a large-scale purification (2L) was performed with multiple steps in rapid succession within one day (Scheme 2, Fig. 3.9). The purification strategy utilised IMAC, followed by desalting into TEV Buffer 2 on a XK50/30 G25 sephadex column. This buffer was suitable for achieving heterodimer adsorption onto anion exchange resin, hence AEC was performed, where heterodimer was eluted with a gradient from 50-500 mM NaCl. Purified heterodimer (2.7 mg) was subjected to TEV proteolysis with the addition of 30 units of TEV and incubated for 16 hours at 4 C. This protocol resulted in 3 mg/L of IMAC purified material, 1.6 mg/L after buffer exchange, 1.1 mg/L after AEC and 1.1 mg/L after TEV proteolysis (Fig. 3.9, Scheme 2). A final IMAC step was included to remove the His tag, undigested heterodimer and TEV protease, with a final yield of 0.25 mg/L protein remaining. Major decreases in soluble protein yield occurred during the desalting step, after removal of the H6 tag by proteolysis and after the final IMAC purification step. It was concluded that the rapid purification process was not able to prevent precipitation, and hence, desalting by dialysis was used in all subsequent experiments.

To assess whether buffer pH and temperature may have affected protein solubility after proteolysis, TEV digests were performed at higher pH and lower temperature. A TEV digestion series was set up with IMAC purified hAhR287/H6ARNT362, and desalted by dialysis into TEV buffer at pH 8 with higher salt: TEV Buffer 3 (50 mM Tris.HCl pH 8, 150 mM NaCl, 5% Glycerol). Heterodimer was noted to have increased solubility at pH 8, hence this was maintained in future experiments. TEV proteolysis of IMAC purified heterodimer was optimised by varying TEV concentration, temperature and digestion time (Fig. 3.10). Digestion was judged complete by coomassie stained SDS-PAGE and Western blot analysis with the addition of 1 unit of TEV protease per 20 g of heterodimer at room temperature after 9 hours (Fig. 3.10, top panel). This procedure was scaled up for crystallization screens, as described in section 3.2.2.4. 67

Buffer conditions identified during these experiments which could improve solubility of IMAC purified hAhR287/H6ARNT362 included maintenance of protein at pH 8, and performing buffer exchange by dialysis into 50 mM NaCl containing buffer. This process established that dialysis into AEC Load Buffer: 50 mM Tris.HCl pH 8, 50 mM NaCl, 0.1 mM EDTA, 1 mM DTT and 5% Glycerol maintained soluble heterodimer in a buffer which would allow binding to anion exchange resin (continued section 3.2.1.6).

3.2.1.4 Size Exclusion Chromatography of hAhR287/H6ARNT362 incurred decreased yield of purified heterodimer Purification Scheme 3 Having determined the optimal storage buffer conditions to maintain heterodimer solubility - Storage Buffer (50 mM Tris.HCl pH 8, 150 mM NaCl, 0.1 mM EDTA, 1 mM DTT, 5% glycerol), the next purification strategy tested entailed IMAC purification, followed immediately by SEC (XK26/70 300 mL Superdex 200) of the highest concentration peak fractions into Storage Buffer. This resulted in ~90% pure heterodimer, although only utilising peak IMAC fractions decreased yield to ~0.9 mg/L culture. This protocol was repeated with the inclusion of a dialysis and concentration step prior to size exclusion chromatography (Scheme (4), Fig. 3.9). Cells from a 3 L preparation of bacterial culture were lysed and expressed protein partially purified by IMAC, and eluted material was dialysed into Storage Buffer (above). No protein precipitation was evident following dialysis, and these conditions were maintained subsequently for storage of the heterodimer, resulting in yields of ~3.5 mg/L after IMAC. The material (10.5 mg) was concentrated from 35 ml to 1 ml using an Amicon stirred ultrafiltration cell concentrator membrane (10 kDa MWCO), and purified further by SEC. Concentration was noted to proceed slowly and resulted in precipitation and loss of ~70% of partially purified material. Concentration additionally resulted in formation of oligomeric species, which eluted in 5 peaks, with peak broadening suggestive of potential hydrophobic interaction with the resin (199) (Fig. 3.11 A). SDS-PAGE analysis of SEC separated material showed heterodimer in peaks stretching from 125-175 mL elution volumes (Fractions 6-16, Fig. 3.11 B). This indicated that higher order oligomers >150 kDa were formed after concentration, although the major peak evident at ~66 kDa may represent unseparated heterodimer (~72 kDa). All peaks shown to contain heterodimer were active for binding XRE, as illustrated by EMSA (Fig. 3.11 C). Hence, concentrating the heterodimer prior to SEC was eliminated from the emerging purification protocol as this resulted in decreased yield and generation of oligomeric heterogeneity.

Purification Scheme 4 Next, heterodimer was partially purified by IMAC, and buffer exchanged by dialysis into AEC Load buffer (50 mM Tris.HCl pH 8, 5% Glycerol, 0.1 mM EDTA, 1 mM DTT, 50 mM NaCl) for 16 hours at 4 C. 68 Purification Scheme 3

Blue Dextran (A) IgG ~150 kDa void BSA ~66 kDa mAu 280 A

1 2 3 4 5 6 7 8 9 10 12 14 16 18 20 22 24 26 28 30 Fraction Number

100 150 200 (B) Elution Volume (ml) Size Exclusion Fractions 5 6 7 8 9 10 11 12 13 14 15 16 21 MW (kDa) Contaminants 55.4 H6TEV-ARNT362 hAhR287 36.5

(C) (-) 5 6 7 8 9 10 11 12 13 14 15 16 21

Oligomer Bound 33mer

Heterodimer Bound 33mer

Unbound 33mer

Figure 3.11 Purification Scheme 3 – Ni-IMAC purified hAhR287/H6TEV-ARNT362 Concentrated for SEC Forms Soluble Oligomers. hAhR287/H6TEV-ARNT362 was purified by Ni-IMAC and concentrated in storage buffer for loading onto a 300mL Superdex 200 SEC column (Scheme 3, Fig. 3.9). Approximately 5mg of heterodimer was concentrated to 1mL and separated by SEC. Peak fractions were separated by discontinuous SDS-PAGE and visualised using Coomassie blue stain, illustrating a range of oligomeric species in solution (B), resulting in five peaks shown in the size exclusion absorbance trace (A), all found to contain heterodimer. All fractions found to contain heterodimer by SDS-PAGE were active for DNA binding, as shown by EMSA with FAM labelled 33mer XRE probe, containing the DNA sequence preferentially bound by AhR/ARNT (Swanson et al., 1995) (C). Peak broadening due to oligomerisation and insolubility resulting from concentrating hAhR287/H6TEV-ARNT362 prompted further optimisation of the purification strategy. Purification by AEC (1 ml HiTrap Q-HP anion exchange column, GE Biosciences) with an 80 mM NaCl wash, followed by 250 mM step elution was performed (Fig. 3.12 B). This material was found to contain some contaminants (Fig. 3.12 B), but use of a 1 ml Q-sepharose column allowed concentrated heterodimer to be applied to an XK26-70 SEC column (300 ml superdex 200). This resulted in separation of higher MW contaminant from the major heterodimer peak (Fig. 3.12 (A, B)). Additionally, all peaks found by SDS-PAGE to contain heterodimer were active for XRE binding, as assessed by EMSA (Fig. 3.12 C). An analysis of the yield revealed 10 mg of partially purified heterodimer was eluted from the anion exchange step, with 50% loss occurring during size exclusion, providing further impetus to remove this purification method from the emerging protocol (summary Fig. 3.9, Scheme 4). Of note was the appearance of higher MW oligomeric species that were active for binding XRE, as determined by EMSA. This indicated the presence of soluble, active oligomeric forms of heterodimer (Fig. 3.12 C).

3.2.1.5 Mass Spectrometric Identification of hAhR287/H6ARNT362 After purification of the heterodimer, it was evident by SDS-PAGE that hAhR287 separated into two bands (Fig. 3.12 B), and occasionally formed a triplet. This suggested that contaminant bacterial proteases may have been degrading purified hAhR. However, this effect was not persistent, as multiple bands were not always evident using different gel systems for the same protein sample (data not shown). Immediate analysis by SDS-PAGE of IMAC purified material indicated that the doublet was present prior to storage, indicating that a proteolytic event may be occurring during expression and IMAC purification. Mass spectrometry was performed to confirm the identity of purified material, and to identify the doublet species. The purified protein was desalted by C4 ZipTip (Millipore) with a view to performing electrospray-MS. Desalting resulted in precipitation, which prevented mass identification. Matrix Assisted Laser Desorption Ionising (MALDI)-MS was performed by the Adelaide Proteomics

Facility which identified hAhR 1-287 32,2723.5 Da, consistent with predicted MW of 32,268.9. H6TEV- ARNT362 had a calculated MW of 40,6624 Da, which correlated with predicted MW 40,790.8 minus the initiating methionine (131.8 Da), indicating that bacterial methionine aminopeptidase (MAP) had cleaved at this position. An analysis of the H6TEV tag sequence showed consensus with MAP target proteolysis sites, having a Glycine at the second position (MetGGSHHHHHH) P1’ (200). Since the doublet band for hAhR287 did not represent species having different mass and was therefore likely to be a gel artefact, rather than resulting from proteolysis, the purification protocol was optimised for greater yield.

69 Purification Scheme 4

(A) Void

150 kDa

66 kDa mAu 280 A

Elution Volume (ml)

(B) 0 1 4 5 6 te . 2 2 7 8 9 1 1 1 G a C C C C C C C C T s E E E E E E E E MW -IP Ly A A S S S S S S (kDa) 55.4 H6TEV-ARNT 1-362 (40.8 kDa) 36.5 hAhR 1-287 (32.3 kDa) 31 Yield 1 mg/L

(C) 4 19 20 21 22 23 2 7 8 9 12 13 14 15 16 17 18 k C C C C C C nk C C C C C C C C C C an E E E E E E la E E E E E E E E E E bl A A A A A A b S S S S S S S S S S

Bound oligomer Bound oligomer

Bound dimer

Unbound 33mer

Figure 3.12 Purification Scheme 4: Recombinant co-expression, Ni-IMAC, AEC and SEC purification of AhR/ARNT

and EMSA for dsXRE binding reveal progressively higher MW oligomer formation. H6TEV-hARNT1-362 and hAhR 1- 287 were co-expressed, cells were lysed using a cell disruptor and cleared lysate was purified by Ni-IMAC (data not shown). Ni-IMAC purified hAhR/ARNT heterodimer was dialysed into buffer containing 50 mM NaCl and purified by AEC with elution performed stepwise at 250 mM NaCl. Purified protein was further fractionated by SEC using a Superdex-200 column (300 ml). The SEC trace shows a well separated peak fraction which represents a stable heterodimer of truncated hAhR and hARNT (A). Protein purity was visualised by coomassie blue stained SDS-PAGE (B). Note that a higher MW contaminant was removed in fractions 6-9. Heterodimer function was confimed by EMSA (C) with FAM labelled XRE probe 33mer, where oligomeric DNA bound bands were visualised in the highest protein concentration fractions. 3.2.1.6 Purification and Optimisation of hAhR287/H6ARNT362 Yield by Stepwise Anion Exchange Chromatography Purification Scheme 5

For optimisation of hAhR287/H6ARNT362 yield, bacterial cultures of 6L resulted in 34 mg of partially purified protein after IMAC. Following this, protein was exchanged into AEC Load Buffer by dialysis, and AEC purification was performed (Scheme (5) Fig. 3.9). Heterodimer was eluted with a shallow gradient from 50 to 250 mM NaCl, followed by a 250-500 mM gradient. Protein purity was analysed by coomassie stained SDS-PAGE (Fig. 3.13). It was evident from this purification that heterodimer was separated from higher MW bacterial protein contaminants at approximately 150 mM NaCl in the 50-250 mM NaCl gradient, with a yield of 1 mg/L culture, prompting stepwise elution at this concentration in subsequent purifications.

Purification Scheme 6

Stepwise elution at 150 mM NaCl allowed the majority of hAhR287/H6ARNT362 to elute, while maintaining retention of contaminant proteins on the column (Fig. 3.14). Specifically, stepwise elution with 150 mM NaCl resulted in ~80% pure heterodimer which co-eluted with very high MW contaminants. These contaminants were hypothesized to represent higher order oligomeric forms of the heterodimer, similar to those identified for ARNT homodimer. Western blot analysis with anti-ARNT and anti-AhR primary antibodies identified both proteins in the major contaminant, ~75 kDa band (data not shown), which indicated that the dimeric form was stable during separation by reducing SDS-PAGE. This purification yielded 12 mg of heterodimer, or 2 mg/L culture. During subsequent purifications using the same 1 ml AEC column, even with column cleaning with 0.5 M NaOH after each purification; the flow rate slowed considerably, with a yellowing of the resin evident. Increasing the size of the AEC column to 5 mL reduced losses at this stage, increasing yield to 2.4 mg/L, and columns could be reused numbers of times without significant reduction in performance. Additionally, using the 5 mL AEC column, a reduction in formation of oligomeric species was evident following coomassie stained SDS-PAGE separation (Fig. 3.15 A).

Large scale TEV proteolysis of a second preparation of Ni-IMAC purified heterodimer (Section 3.2.1.3) was performed using the protocol established in earlier trials and untagged heterodimer was finally purified by AEC (Scheme 6, Fig. 3.9), where protein was eluted stepwise with 150 mM NaCl (Fig. 3.15 B). This material was not as pure as the tagged form, and 25% of the protein was lost after removal of the tag, indicating that the tag may help solubilise the heterodimer. EMSA analysis of purified material recovered after TEV digestion indicated that the untagged dimer was an active DNA binding species (data not shown).

70 Purification Scheme 5

[NaCl] 250- 500 mM

AEC [NaCl] 50-250 mM

MW (kDa)

55.4 ARNT362 36.5 hAhR287

Figure 3.13 Purification Scheme 5: Ni-IMAC and Gradient AEC Elution of hAhR287/H6TEV-ARNT362. hAhR287/H6ARNT362 Heterodimer purified by Ni-IMAC was further purified by anion exchange by two step gradient elution; NaCl 50-250 mM and 250-500 mM. Through the start of the gradient 50-150 mM, heterodimer and higher MW oligomer appeared to elute alone, while at higher concentration showed tailing heterodimer elution with higher MW contaminants. Purification Scheme 6 – 1 mL AEC Column

1 mL AEC Column l aC t G h n Anion Exchange 150 mM NaCl Step Elution N TG T e s a M P IP at a lu T m I ) s i W i E F 0 (-) (+ Ly N N Q 5 MW 2 (kDa) Oligomers Contaminant 55.4 H6ARNT362 36.5 hAhR287

A.E. 250 mM NaCl l aC N M M 1 MW (kDa) 55.4 Contaminant H6ARNT362 36.5 hAhR287 Contaminant

Figure 3.14 Purification Scheme 6: Recombinant co-expression, Ni-IMAC, 150mM step elution AEC purification of

hAhR287/H6TEV-ARNT362. Heterodimer purified by Ni-IMAC was further purified by AEC on a 1 mL column by stepwise elution at 150 mM, 250 mM and 1 M NaCl. The majority of the heterodimer was shown to elute at 150 mM NaCl. Bacterial protein contaminants were evident in the 250 mM and 1 M NaCl elution steps. Oligomeric heterodimer bands were subsequently identified by Western blot, and are indicated in the 150 mM NaCl step elution fractions. Protein fractions shown were subjected to separation by SDS-PAGE and visualised with Coomassie blue stain. Purification Scheme 6 – 5 mL AEC Column

(A) Ni-IMAC and AEC – hAhR287/H6ARNT362

. E te l G TG a e n T P s k io P I y ic n -I + L N A 5 mL AEC Column

51.4 H6TEV-ARNT 1-362 (40.8 kDa) 36.5 hAhR 1-287 (32.3 kDa) 31 Yield 2.4 mg/L

(B) Ni-IMAC, TEV Proteolysis and AEC – hAhR287/ARNT362

V V E E T T ) -) + ( ( C C e G TG t A A 3 4 5 T P a IM IM C C C P I ys i- i- E E E -I + L N N A A A

51.4 H6TEV-ARNT 1-362 36.5 No Tag ARNT 1-362 31 hAhR 1-287 Yield 1.5 mg/L

Figure 3.15 Purification Scheme 6: Recombinant co-expression, Ni-IMAC, and 150 mM NaCl AEC stepwise elution of hAhR287/H6ARNT362 and TEV digested hAhR287/ARNT362 for crystallization screens. (A) Heterodimer purified by Ni-IMAC was further purified by AEC on a 5mL column by stepwise elution at 150 mM NaCl. (B) A second Ni-IMAC purified sample of heterodimer was dialysed and treated with TEV protease, and finally purified by AEC in a 5 mL column and stepwise elution at 150 mM NaCl. This material was subsequently found to bind dsXRE by EMSA analysis (data not shown). Purified protein was separated by discontinuous SDS-PAGE and visualised with Coomassie blue stain.

3.2.2 Crystallization and Partial Proteolysis of bHLH.PAS A Heterodimer hAhR/ARNT

3.2.2.1 DNA Purification and hAhR287/H6ARNT362/XRE Complex Formation for Crystallization Screens Initially, heterodimer alone was buffer exchanged into a range of more minimal buffers to establish soluble conditions to initiate crystallization screens in the absence of DNA. Material purified by IMAC and AEC was dialysed using Slide-a-Lyzer 10 kDa MWCO dialysis units into water alone, 50 mM Tris.HCl pH 8 buffer, which resulted in decreased solubility, concentrating to only ~0.2 mg/ml. Removing components of Storage Buffer to create more minimal storage buffer; either 50 mM Tris.HCl pH 8, 0.1 mM EDTA, 1 mM DTT; or 50 mM Tris.HCl pH 8, 150 mM NaCl similarly resulted in decreased solubility. Hence, initial crystallization screens proceeded with protein concentrated in 50 mM Tris.HCl pH 8, 0.1 mM EDTA, 1 mM DTT, 100 mM NaCl. Glycerol was removed from the regular Storage Buffer in order to employ a nano-dispensing crystallization robot. However, heterodimer, in the absence of DNA could only be concentrated to ~1.5 mg/ml. The aim of the project was to establish not only molecular determinants of dimerization, but also DNA binding specificity. Therefore, DNA purifications were performed to investigate optimisation of complex formation and assess if DNA bound heterodimer might prove more stable for concentration.

Oligonucleotides were designed to include the same region of Cyp1a1 promoter sequence as was utilised for EMSA (Fig. 3.16 A). Initially, complementary `33mer’ ssDNA oligonucleotides purchased in the crude ‘trityl on’ form were purified by RP-HPLC and peak fractions lyophilized. The complementary purified ssDNAs were annealed in a 1:1 ratio. Annealed dsDNA was snap frozen and stored at -80 C. Final purification of annealed dsDNA was carried out by AEC, where a single DNA peak typically eluted at ~750 mM NaCl). Given that protein to be complexed with DNA was quite dilute after anion exchange chromatography (0.15-0.35 mg/ml), addition of dsDNA in a 1:1 molar ratio resulted in final buffer salt concentration around 225 mM NaCl, and while not ideal for crystallization, this allowed DNA binding and did not result in aggregation. Prior to crystallization screening, buffer exchange was performed to remove glycerol, and to reduce salt concentration to 100 mM. Concentration of hAhR287/H6ARNT362 bound to 33mer yielded a complex with a maximum protein concentration of 1.8 mg/ml (Fig. 3.16 C), not considerably higher than achieved in the absence of DNA.

Typically, crystallization of protein:DNA complexes requires trials of different lengths of DNA (201), hence, precedence for optimal length of DNA for bHLH complexes that have been crystallised in the past were reviewed (Fig. 3.16 B) (38, 150, 202, 203). The limits of protein:DNA contacts are 71 (A) 33mer CGTGCCCCGGAGTTGCGTGAGAAGAGCCTGGAG GCACGGGGCCACAACGCACTCTTCTCGGACCTC Green = XRE 16mer GAGTTGCGTGAGAAGA CTCAACGCACTCTTCT Blue = PAS specific protection

30mer CGGAGTTGCGTGAGAAGAGCCTGGAGGCCC Underlined = Enhanced DNAseI sensitivity GCCTCAACGCACTCTTCTCGGACCTCCGGG with AhR/ARNT bound

33mer-B CGGAGTTGCGTGAGAAGAGCCTGGAGGCCCGCG GCCTCAACGCACTCTTCTCGGACCTCCGGGCGC

(B) E-box CGAGTAGCACGTGCTACTC GCTCATCGTGCACGATGAG Myc/Max bHLH.LZ

GTGTAGGTCACGTGACCTACAC CACATCCAGTGCACTGGATGTG Max/Max bHLH.LZ

TCAACAGCTGTTGA AGTTGTCGACAACT MyoD bHLH

CTCACACGTGGGACTAG GAGTGTGCACCCTGATC PHO4 bHLH

CACCCGGTCACGTGGCCTACA TGGGCCAGTGCACCGGATGTG USF bHLH

TTCCTATGACTCATCCAGTT AGGATACTGAGTAGGTCAAA GCN4 b.Zip

Red = Contact residues

(C) dsDNA hAhR Construct [Protein] mg/ml

33mer 1-287 1.8

16mer 1-287 1.8 1-284 10

30mer 1-287 18 1-284 >100

33mer-B 1-287 20 1-284 50

Figure 3.16 Design of double stranded nucleotide sequences for complex formation with purified hDR/hARNT for co- crystallization and SAXS studies. Double stranded (ds) nucleotide sequences represent fragments of the mouse Cyp1a1 enhancer, the prototypical and most responsive XRE (A) (38). Ds DNA fragments with XRE highlighted in green, PAS A specific protection in blue and bending region underlined (9). (B) DNA fragments from cocrystal structures including bHLH.LZ factors Myc/Max (233) and Max homodimer (202); bHLH homodimers MyoD (149), PHO4 (203) and USF (150); and b.Zip factor GCN4 (155) were aligned by E-box element, where applicable (dashed lines). Backbone and nucleotide contacts are highlighted in red and indicate contacts up to 4 residues 5’ of the E-box, and 6 nucleotides 3’ of this site. This alignment was utilised to infer potential contact regions around the core XRE for bHLH.PAS A heterodimer hAhR/ARNT, resulting in design of a 16mer fragment of DNA. (C) Table of maximum complex concentrations achieved with each duplex and hAhR construct as estimated

by Bradford analysis. Note H6TEV-ARNT362 is also in each complex. summarised in Figure 3.16, where direct base contacts and phosphate backbone contacts are coloured red. From this analysis it was evident that bHLH DNA cocrystallization has been successful where the DNA contains an extra 4 bases 5’ of the minimal response element (CACGTG), plus a further 6 bases at the 3’ end, implicating a dsDNA 16mer (Fig. 3.16 A) as most likely to maintain maximal protein:DNA contacts, and minimize possible heterogeneity of DNA extensions not in contact with protein. Initially, FAM labelled 16mer was assessed for heterodimer binding by EMSA with increasing amounts of hAhR287/H6TEV-ARNT362 (Fig. 3.20 A), which showed strong complex formation. Hence, crystallization trials incorporating a 16mer dsDNA (Fig. 3.16 A) were also performed, where the complex reached a maximum concentration of 1.8 mg/ml, the same achieved with 33mer dsDNA (Fig. 3.16 C).

Intriguingly, plate crystals were observed to form on storage of concentrated hAhR287/H6TEV- ARNT362/16mer (~2 weeks) at 4 C which took up coomassie blue protein dye, but did not diffract.

In parallel with crystallization screens, investigations in our laboratory into bHLH.PAS A dimers revealed the PAS A region is required for DNA bending, and implicated a contact region 3’ of the terminus of the original 33mer dsDNA (9). These data were incorporated into dsDNA design, where the DNA was extended 3’ of the original 33mer, and bases at the 5’ end, which are not protected by AhR/ARNT from DNAseI cleavage were removed, resulting in 30mer (Fig. 3.16 A), which was able to stabilise the complex to a protein concentration of 18 mg/ml (Fig. 3.16 C). This indicated that the extra region of DNA was having a fundamental impact on protein solubility, and potentially, protein structure. It has been hypothesised that the additional region may act to cover a hydrophobic patch, as there is a very small increase in DNA binding affinity and complex stability (Chapman-Smith, A., pers. comm.). Addition of three residues 3’ of the 30mer dsDNA produced 33mer-B ds DNA which encompassed the most conserved nucleotides in an alignment of promoter fragments known to be bound by AhR/ARNT (Chapman-Smith, A., pers. comm.), and 33mer-B stabilised the complex to 20 mg/ml (Fig. 3.16 (A,C)), as estimated by Bradford analysis.

3.2.2.2 Robotic Crystallization Screens of hAhR287/H6ARNT362

The first crystallization screens were performed on hAhR287/H6ARNT362 in the absence of DNA where solubility limited the concentration to 1.5 mg/ml (summarised in Table 1). The protein was buffer exchanged during concentration to remove glycerol. Crystal screens with 100nl protein:100nl mother liquor (ML) including Hampton I and PEG/Ion (Hampton Research1 - see appendix for summary of screen components) were set up in 96 well duplicate trays using a HoneyBee crystallization robot and incubated at 4 C and 21 C. One tray showed growth of microcrystals within five minutes in condition

Hampton 1 (H1) #14 (100 mM HEPES pH 7.9, 200 mM CaCl2, 28% PEG 400) (Fig. 3.17 A). Other conditions with high nucleation and minimal crystal growth included H1#23, a similar condition to #14, except the cation source, MgCl2 in this case. PEGIon screen showed further microcrystal formation in 72 Protein Dimer dsXRE Conc. Robot Screen (96) Hits Condition I hAhR287/H6ARNT -- 1.5 mg/ml Hampton /PEG Ion* #14 - microcrystals 200 mM CaCl2, 0.1 M HEPES pH 7.5, 28% PEG 400 (IZIT positive)

#23 - microcrystals 0.2 M MgCl2, 0.1M HEPES pH 7.5, 30% PEG 400 (IZIT positive) 16mer 1 mg/ml Salt RX I/II Natrix/Sigma DNA

I/II 33mer 1 mg/ml Salt RX #42 – microcrystals 0.01 M MgCl2.6H2O, 0.05 M Tris hydrochloride pH 7.5, 5% v/v 2-propanol Natrix/Sigma DNA

30mer 20 mg/ml Natrix/Sigma DNA #46 – small plates 0.005 M Magnesium Sulfate Aq. 0.05 M Tris pH 8.5, 35% w/v 1,6 Hexanediol

HT Screen D9 – small crystals 18% PEG 8000, 0.1 M Na cacodylate pH 6.5, 0.2 M Zn Acetate

hAhR287/ARNT 16mer 1 mg/ml Salt RX I/II Natrix/Sigma DNA 33mer 1 mg/ml Salt RX I/II Natrix/Sigma DNA

hAhR287/TruncH6ARNT39-362 30mer 1.1 mg/ml HT Screen Natrix

Table 1 Summary of hAhR287/H6ARNT362/DNA, hAhR287/NoTagsARNT362/DNA and hAhR287/H6TEV- ARNT39-362/30mer complexes subjected to robotic crystal screens and crystal hits. Complexed material was concentrated, centrifuged at 16k rcf, 4°C for 5 minutes and stored on ice prior to screening. Protein concentration was estimated using Bradford analysis. Material concentrated for each trial was also used for finescreens and stored up to a maximum of 2 weeks at 4°C. (A) 1.5 mg/ml hAhR287/H6ARNT (-) (B) 1.9 mg/ml hAhR287/H6ARNT362 (C) 1.9mg/ml hAhR287/H6ARNT+33mer, 100 (D) 1 mg/ml hAhR287/H6ARNT 50m DNA, Hampton I #14 robot screen +33mer, 100 mM HEPES pH 6.6, mM HEPES pH 7.5, 18% PEG 400, 200 mM +30mer, 100 mM HEPES pH 7.5,

28% PEG 400, 200 mM CaCl2. CaCl2. No Diffraction. 21% PEG 400, 200 mM CaCl2

100m 30m

100m

(E) hAhR287/H100m6ARNT+33mer-B 9 (F) 8.5 mg/ml hAhR284/ARNT362 + (G) 18 mg/ml hAhR284/H6ARNT+33mer- (H) 18 mg/ml hAhR284/H6ARNT+33mer- (I) 7.5 mg/ml hAhR287/H6ARNT +30mer, mg/ml, 31% PEG 400, 200 mM CaCl2, 30mer, 10% PEG 1000, 10% PEG 8000. B, 22% PEG 4000, 100 mM Sodium B, 28% PEG 4000, 100 mM Sodium 100 mM MOPSO pH 6.2, 20% PEG400, HEPES pH 8.1, 20 mM DTT. No Acetate pH 4.8, 200 mM Ammonium Acetate pH 4.8, 200 mM Ammonium 200 mM CaCl2. No Diffraction. Diffraction. Acetate. Diffracted as protein. Acetate.

I Figure 3.17 Summary of hAhR/H6ARNT/DNA crystalline forms for a range of conditions. Hampton condition #14 crystal form from robot screen is shown in (A), and changes in crystal form are shown for finescreens around this hit in figures (B-E), and (I). hAhR284/H6ARNT362 +33mer-B resulted in formation of oils in ~65% of robot screen conditions, as illustrated in (F). Hexagonal crystal forms (G-I) were noted in a range of conditions, with protein diffraction detected for the crystal shown in (G) (data not shown). conditions 5, 7, 16, 31 and 43. Notably, only 34 of 96 conditions tested showed any precipitation or crystallization, indicating protein needed to be at a higher concentration for subsequent screens.

3.2.2.3 Finescreens of hAhR287/H6ARNT362  XRE dsDNA in Hampton 1 Condition 14 Optimization of H1#14 included changing the buffer conditions, where HEPES was exchanged for MOPS, Tris.HCl and Bicine over a pH range from 7-8.4 (Summary Table 2). These hanging drop finescreens similarly resulted in microcrystal formation, suggesting high nucleation and limited growth would be difficult to control. Hence, an exhaustive approach to fine screening techniques and seeding was undertaken to increase the size of crystals grown in this condition. In order to decrease nucleation, several strategies were applied for hAhR287/H6ARNT362, including fine screens with more dilute protein, changing the ratio of ML to protein, addition of glycerol, changing the size of the hanging drop, broad pH screens, and decreased PEG concentration. This series showed that at 12% PEG 400, immediate microcrystal formation was prevented, although upon incubation, smaller microcrystals were evident after 3 days. IZIT1 staining of crystals indicated that they were proteinaceous.

Manual manipulation of drops during staining showed that a thick skin was forming on the drops, and that microcrystals were nucleating predominantly in the skin. Initial seeding trials included streak seeding and microseeding by pipetting microcrystals from established drops into fresh drops (204). Streak seeding attempts were hampered by skin formation, and ultimately no larger crystals were observed. Having pursued this condition, it was evident that the protein concentration may need to be increased in order to get growth of larger crystals from these seeds. The limitation of using hAhR287/H6ARNT362 in the absence of DNA meant that protein concentration was limited to <1.5 mg/ml. Hence, the effect of addition of dsDNA was investigated in an effort to stabilise the heterodimer and increase solubility.

hAhR287/H6ARNT362/33mer finescreens around Hampton I condition #14 (Table 2) were performed with further attempts to control nucleation by incubation at 4 C, room temperature and 37 C. Theorising that microcrystallization may be occurring at the moment ML was added to protein, finescreens were set up using 4 successive drops over a single hanging drop well where protein was spotted in 500nL drops, with the addition of ML using a single tip. This is proposed to result in transfer of seed crystals to each drop in succession, with crystal growth on nucleated seeds favoured over nucleation in the later drops. Slight increases in size were observed, which may suggest that this approach had some merit, although increasing the time to set up the drops would also have resulted in evaporation, increasing the concentration of protein and ML. An attempt was made to induce microcrystal formation, then shift the hanging drop to a well of lower concentration PEG, termed ‘backing off’ (205). The crystal forms

73 Protein/dsXRE Screen [Conc] Buffer Salt Precipitant Additives/Techniques Crystal/Diffraction Hit I hAhR287/H6ARNT (-) Hampton 1.5 - 0.75 HEPES pH 7.1- 8.2 200 mM CaCl2 16-31% PEG 400 1-6% Glycerol Skin Formation and #14 mg/ml Tris pH 7.7-8 Streak Seeding Microcrystals Bicine pH 7.6-8.4 Macroseeding MOPS pH 7.4-7.8 Microseeding I hAhR287/H6ARNT + Hampton 0.2 -1.9 mg/ml HEPES pH 6.6-8.1 200 mM CaCl2 12-31% PEG 400 1-40% Glycerol Skin Formation 33mer #14 Streak Seeding Microcrystals, Macroseeding Leaf shaped 20m, Florettes Microseeding 35m. No diffraction. Backing off.

I hAhR287/H6ARNT + Hampton 0.6 mg/ml HEPES pH 7.1-7.7 200-400 mM 1-30% PEG 400 100 mM DTT Skin Formation 16mer #14 CaCl2 5 l drops

I hAhR287/H6ARNT + Hampton 1-7.5 mg/ml HEPES pH 7.1-8.1 200 mM CaCl2 5-35% PEG 400 10 mM DTT Skin Formation and Hexagonal 30mer #14 CAPS pH 10 2 M TCEP Crystals 70m. No Diffraction. MOPSO pH 6.2 Burst Skin Microseeding

hAhR287/H6ARNT + HTScreen 17 mg/ml Na Cacodylate pH 5.9- 200 mM ZnAc 16-22% PEG 8000 2 M TCEP Amorphous precipitate and Leaf 30mer D9 9.5 shaped.

hAhR287/H6ARNT + Natrix 17 mg/ml Tris pH 6.8-8.5 5 mM MgSO4 20-45% 1,6 2 M TCEP Amorphous precipitate 30mer #46 Hexanediol I hAhR287/TruncH6ARNT Hampton 13 mg/ml HEPES pH 7.5 200 mM CaCl2 1-24% PEG 400 2.5% Glycerol Skin Formation, Small Leaf + 30mer #14 shaped I hAhR287/H6ARNT + Hampton 0.4-9 mg/ml HEPES pH 6.6-8.1 100-400 mM 5-31% PEG 400 Burst Skin Skin Formation, 33mer-B #14 CaCl2 10-30% glycerol Small Leaf shaped, small 10 mM DTT plates, larger spines 100 m Destreak™ No Diffraction.

Table 2 Summary of hAhR287/H6ARNT/DNA complexes subjected to fine screens around original robot screen hit conditions. Complexed material was concentrated, centrifuged at 16k rcf, 4°C for 5 minutes and stored on ice prior to screening. Protein concentration was estimated using Bradford analysis. Additives and techniques tried to optimize crystal growth are also listed, with resulting crystals described and, where tested, diffraction noted. achieved by refined matrix finescreens were typically leaf shaped and are shown in Figure 3.17 (B, C), with improvement in size ranging up to 35 m.

The consistent observation of skin formation suggested that oxidation of Cys residues may be resulting in growth of skin (206), and additionally, skin formation has been proposed to prevent further vapour diffusion, thereby limiting concentration of protein and crystal growth. Measures proposed to prevent oxidation of protein were pursued, including increasing DTT to 10 mM, replacing DTT with the more stable TCEP and carboxymethylation of cysteine residues using iodoacetamide. Other crystallization techniques theorised to prevent skin formation were also attempted: Mineral oil covered hanging drop finescreens (207); microbatch under oil with larger drop volume (10 l) (208); dialysis using Hampton dialysis buttons; and batch free-interface diffusion capillary crystallization (209). Typically, drops covered with oil did not form skin, but crystallization was also prevented. This was also true of the dialysis technique. Only batch capillaries showed crystal growth, where protein was drawn into the tube by capillary action, followed by a small volume of mother liquor. Free interface diffusion has been theorised to result in a gradient of PEG, where the optimum percentage of precipitant will result in crystals somewhere along the gradient. For hAhR287/H6ARNT362/33mer, precipitation was observed at the highest PEG, microcrystals at intermediate levels of PEG, and larger crystals formed at the lowest PEG concentrations. Attempts to harvest crystals from poor optical quality glass capillaries were not successful, although growing crystals by free-interface diffusion in quartz capillaries for synchrotron analysis may be pursued in future experiments.

3.2.2.4 Robotic Crystallization Screens of hAhR287/H6ARNT362 and hAhR287/ARNT362 with 33mer, 16mer, 30mer and 33mer-B dsDNA The first crystallization screens of heterodimer and TEV digested heterodimer complexed with 33mer and 16mer were set up at their solubility limit, 1.8 mg/ml in storage buffer without glycerol (Table 1). HoneyBee crystallization robot was used for 50 nl ML:50 nl protein screens including Natrix (48 conditions), Sigma DNA screen (48 conditions) and SaltRX (96 conditions). The results of these screens are shown in Table 1. Once again, for hAhR287/H6ARNT362/33mer and hAhR287/H6ARNT362/16mer, microcrystals were the predominant form, with ~30-50% of conditions showing oils, precipitate, quazicrystals, crystalline precipitate, or microcrystals. TEV proteolytic removal of the His tag resulted in decreased formation of oils, precipitate, quazicrystals, crystalline precipitate and microcrystals, to around 20-25% of the conditions screened. Microcrystals were evident for tagged heterodimer/33mer complex with Natrix #42 (0.01M MgCl2.6H2O, 0.05M Tris.HCl pH 7.5, 5% v/v isopropanol). Other conditions showed smaller microcrystals, but crystal forms were all ~<2 m.

74 Purified hAhR287/H6ARNT362 was concentrated 1:1 with 30mer and achieved a solubility limit of 18 mg/ml, a ten-fold improvement on previous complexes. Using the HoneyBee crystal robot, hAhR287/H6ARNT362/30mer at 10 mg/ml was utilised to set up crystal screens including Natrix, Sigma DNA and an HT Screen (Table 1), resulting in identification of two new conditions for protein crystal growth. Fine screens were performed around HT Screen D9 (18% PEG 8000, 0.1 M Sodium Cacodylate pH 6.5, 0.2 M Zinc Acetate), which formed small crystals in the robot screen, and Natrix condition #46 (0.005 M Magnesium Sulphate Aq., 0.05 M Tris.HCl pH 8.5, 35% v/v 1,6 Hexanediol), which formed small plates in the robot screen (Table 1). Hampton 1 condition #14 was also included in the finescreen series. In an effort to increase the size of the crystals, higher concentrations of complex were utilised for finescreens (Table 2). HT Screen D9 finescreens at 17 mg/ml protein resulted in predominantly amorphous precipitate and small leaf shaped crystals, where the zinc acetate was isolated as adversely affecting solubility and causing amorphous precipitation. Natrix #46 was also set up at 17 mg/ml complex, but only resulted in amorphous precipitate. However, H1#14 set up at 1 mg/ml complex gave increased size of crystals, including florette shaped spines (Fig. 3.17 D) and, at 7.5 mg/ml, hexagonal crystals ~75 m in diameter (Fig. 3.16 I), which cracked with motion, and did not diffract. The hexagonal crystals appeared stuck on the coverslip and could not be replicated in subsequent finescreen attempts. hAhR287/H6ARNT362/33mer-B complex solubility limit was determined to be ~20 mg/ml, and finescreening with this complex at 9 mg/ml resulted in formation of spines ~100 m in length (Fig. 3.17 E), which, again, did not diffract. Given a comprehensive approach to optimizing crystallization hits had been pursued, and, in spite of DNA length optimization resulting in improved complex solubility, no diffraction quality crystals were generated. Hence, alterations to protein constructs were rationally designed in order to improve crystal growth and further stabilise the heterodimer:DNA complex.

3.2.2.5 Partial Proteolysis of H6ARNT362 Homodimer and hAhR287/H6ARNT362 revealed a labile N-terminal region of ARNT In order to identify regions of conformational flexibility, and hence, potential regions of heterogeneity for crystal packing, partial proteolysis of homodimer and hAhR287/H6ARNT362 was performed. Initial digestion of homodimer in the absence of DNA showed a similar pattern of cleavage in trypsin and chymotrypsin digests (data not shown), suggesting the presence proteolytic target sites for both enzymes in flexible regions of H6ARNT362 homodimer (Fig. 3.18 A, B). Mass spectrometric analysis was performed by the Adelaide Proteomics Facility, where 10g of material was provided for analysis, indicated by * (Fig. 3.18 A). Several attempts were made to identify each band representing stable digestion products at each timepoint of the digest, but only some of fragments were identified. ARNT homodimer at the zero timepoint, taken immediately after addition of enzyme and analysed by MALDI- MS showed early cleavage of the N-terminal fragment including the tag and residues 1-39 of ARNT (Fig. 3.18 B), and 1-38 in a Chymotrypsin digest (data not shown). A fragment representing the rest of 75 Proteolysis of ARNT homodimer and hAhR287/H6ARNT362

(A) MW (-) 0 0.5 2.5 5.5 19 Hours (kDa) * * ** Homodimer Oligomers 55.4

36.5 H6TEV-ARNT362

Tryptic Fragments

(B) ARNT

His6 --TEV 362 39 Zero

c

i t 346 2.5 Hours

p

s

y t

r

n 5.5 Hours

T e 125 219

d

m

e

g

i

f

i

a

t r 128 219

n

F

e

d I 118 219 19 Hours 7240

(C) 5 Minutes 10 Minutes 30 Minutes 60 Minutes

r) r) r) r) r) r) r) r) r) r) r) r) ro e e e e e e e e e e e e e MW Z m m m m m m m m m m m m 6 0 3 6 0 3 6 0 3 6 0 3 (kDa) (-) (-) (1 (3 (3 (-) (1 (3 (3 (-) (1 (3 (3 (-) (1 (3 (3 ** Oligomers H ARNT362 55.4 6 hAhR287 36.5 Digest Fragments

Figure 3.18 Tryptic proteolysis of H6TEV-ARNT362 Homodimer and hAhR287/H6TEV-ARNT362 Heterodimer Results in Cleavage of an N-terminal Fragment of ARNT. Heterogeneity in protein structure is regularly assessed by sensitivity to

proteolysis. Hence, H6TEV-ARNT362 homodimer was subjected to Trypsin proteolysis, where fragments resulting were separated by SDS-PAGE, visualised by Coomassie blue staining (A), and identified by the Adelaide Proteomics Centre

using MALDI-MS (B). hAhR287/H6-TEV-ARNT362 partial proteolysis revealed the same fragment of ARNT (H6TEV-1- 39ARNT), but no other peptides were detected (C). Protein was separated by discontinuous SDS-PAGE (10%) and visualised with Coomassie blue stain and samples indicated (*) analysed by MALDI-MS. the ARNT362 protein including residues 40-347 was also detected (see section 3.2.1). Further analysis at 5.5 hours showed the same fragments, with an additional fragment derived from within the 40-347 fragment, encompassing residues 126-220. This fragment correlates with proteolysis occurring at the start of helix 2 of the bHLH region and within a flexible linker predicted to form between C and C of the PAS A fold at the C-terminal end. This data corroborated the validity of structure based sequence alignment of ARNT PAS A with DmPer PAS A to infer tertiary structural elements. By 19 hours, cleavages occurred at two more sites within and immediately C-terminal to Helix 2 of the bHLH (Summarised Fig. 3.18 B). Based on the first fragment digestion site targeted by Chymotrypsin, a truncated fragment of ARNT (H6 ARNT39-362) was designed and co-expressed with hAhR287 for crystallization trials. Truncation at this position of hARNT positions a basic residue at the N-terminus, maintaining the first basic cluster to potentiate maintenance of DNA binding activity. hAhR287/H6ARNT362 heterodimer was more minimally digested in order to identify the first proteolysis events, rather than stable core fragments. These digests were performed in the presence and absence of 16mer, 30mer and 33mer XRE fragments (Fig. 3.18 C). However, attempts to identify cleavage products by MALDI-MS were broadly unsuccessful, even in the case of more grossly digested material

(data not shown), the conclusion being that hAhR287/H6ARNT362 tryptic digested material was unusually difficult to detect by mass spectrometry, possibly due to generation of hydrophobic peptides from the PAS core. However, the same initial fragment encompassing the tag and ARNT residues 1-39 was detected in these experiments. Hence, the decision was made to continue with the truncated ARNT construct, while maintaining hAhR287 for coexpression.

3.2.2.6 Expression, Purification and Crystallization Trials of hAhR287/H6 ARNT39-362 hAhR287/H6 ARNT39-362 was coexpressed in BL21 DE3 E. coli, lysed with a French press, and purified by Ni-IMAC and AEC, yielding 0.4 mg/L culture (Fig. 3.19 A). EMSA with labelled XRE showed that the truncated heterodimer was active for binding DNA (Fig. 3.19 B), so the preparation was scaled up for crystallization screens. Noting that EMSA revealed higher MW oligomeric complex formation, it was concluded that the extreme N-terminus of ARNT was not responsible for oligomer formation (Fig. 3.19 B). Heterodimer was finally concentrated with a 1:1 molar ratio of 30mer XRE, achieving solubility of 1.1 mg/ml of complex. Robot screens including HT Screen and Natrix were set up with 50nl protein:50nl ML, however, no crystallization conditions were identified (Table 1). A fine screen around Hampton I condition #14 was set up, where skin formation and small leaf shaped crystals similar to hAhR287/H6ARNT362 were observed, which were not analysed for diffraction (Table 2).

3.2.2.7 Co-Expression and Purification of hAhR284/H6ARNT362 and hAhR293/H6ARNT362 Having modified the ARNT expression construct by rationally designing a truncation based on partial proteolysis, the AhR construct was evaluated for potential truncations that may improve yield and/or 76 Expression, Purification and EMSA Analysis of

hAhR287/H6 ARNT39-362

(A) . x G C G T A E T te n a IM IP IP s i- io ) ) y n MW (kDa) (- (+ L N A

55.4 36.5 hAhR287/H6TEV- ARNT39-362 comigrated on SDS- PAGE

(B) hAhR287/H6 ARNT39-362 hAhR287/H6 ARNT362 (-)

Oligomeric Complex

hAhR287/H6ARNT362/33mer

hAhR287/H6 ARNT39-362/33mer

Unbound 33mer

Figure 3.19 Recombinant co-expression, nickel affinity and anion exchange purification of active hAhR287/H6TEV- ARNT39-362 heterodimer. A truncation at the N-terminus of ARNT was designed based on proteolytic cleavage between residues 38 and 39 (Fig. 3.14 (B). The truncated ARNT construct was coexpressed with hAhR287 and the resulting heterodimer was purified by Ni-IMAC and anion exchange chromatography, separated by denaturing SDS-PAGE and visualised using Coomassie blue stain (A). An EMSA with FAM labelled 33mer probe showed a retarded complex forming

for both the truncated ARNT heterodimer (hAhR287/H6TEV- ARNT39-362) and for the control heterodimer hAhR287/H6TEV-ARNT362, indicating that the truncation of ARNT had minimal effect on DNA binding and that the heterodimer was purified in an active form. Additionally, the presence of a higher MW oligomeric species for both protein samples indicated that the N-terminal region encompassing residues 1-38 of ARNT was not responsible for formation of oligomeric species (B). solubility. A newly available structure based alignment of AhR with DmPer (174) revealed that whilst the original hAhR287 construct included PAS A and the small linker between PAS A and PAS B, the truncation was probably within the first  strand of PAS B and resulted in an ‘FIF’ motif at the C- terminus. It was hypothesized that this region may result in surface exposure of a hydrophobic sequence, potentially reducing solubility of the protein. Hence, two alternate truncations of hAhR were generated, including hAhR284, which removed the ‘FIF’, and hAhR293, including the first  strand of

PAS B, and terminating in a potentially charged motif ‘KHKL’. Co-expression of hAhR293/H6ARNT362, followed by Ni-IMAC and AEC yielded ~0.35 mg/L culture (Fig. 3.20 A), and reduced DNA binding by

EMSA (Fig. 3.20 B). In contrast, co-expression of hAhR284/H6ARNT362, lysis and purification by Ni- IMAC and AEC yielded 3.5 mg/L, the highest yield of heterodimer achieved (Fig. 3.20 B). Note that hAhR284/H6ARNT362 still formed higher MW oligomers in the peak anion exchange fractions, indicating that oligomerisation was not dependent on the ‘FIF’ motif. EMSA analysis illustrated that this dimer was also active for binding labelled XRE DNA (Fig. 3.21 B).

3.2.2.8 Robotic Crystallization Screens of hAhR284/H6ARNT362 alone and in complex with 16mer, 30mer and 33mer-B XRE DNA.

For robot screen crystallization of hAhR284/H6ARNT362, complex stability with the addition of a 1:1 molar ratio of XRE was investigated. hAhR284/H6ARNT362/16mer complex could be concentrated up to

~10 mg/ml, similar solubility to hAhR287/H6ARNT362/33mer-B. hAhR284/H6ARNT362/30mer showed unprecedented solubility, concentrating to a maximum of ~100 mg/ml, hence, this complex was investigated for crystallization. hAhR284/H6ARNT362/33mer-B was found to have a solubility limit of ~50 mg/ml and was investigated in parallel with the 30mer complex for crystallization.

hAhR284/H6ARNT362 was purified by IMAC and AEC, complexed with 30mer XRE in a 1:1 molar ratio and the complex was isolated again by anion exchange and concentrated. Interestingly, the complex was found to form spherulites in Storage Buffer after 1 day at 4 C. Prior to spherulite formation, the complex was concentrated to 8.5 mg/ml and, using the crystallization robot Phoenix, crystal screens from Nextal1, including PEGs, Anion, Cation, Classic and Cryos (480 conditions) were set up, noting that this material was concentrated in Storage Buffer (50 mM Tris.HCl pH 8, 150 mM NaCl, 0.1 mM

EDTA and 1 mM DTT) (Summarised Table 3). Additionally, hAhR284/H6ARNT362 was complexed with 33mer-B in a 1:1 molar ratio, concentrated in storage buffer to 8.5 mg/ml and the screens were repeated (Table 3). From these sets of screen conditions, no crystallization conditions were isolated, indicating that indeed, the complexes were soluble and stable in the storage buffer identified earlier for hAhR287/H6ARNT362. ~65% of conditions screened for hAhR284/H6ARNT362/30mer were found to result in phase separation (Fig. 3.16 F), and observation of phase separated drops over time did not reveal subsequent crystal growth. The next series of screens were performed with 77 Expression and Purification of hAhR293/H6ARNT362 and hAhR284/H6ARNT362

(A) hAhR293/H6ARNT362

C e A G TG t T P sa IM C P I y i- E MW -I + L N A (kDa)

51.4 H6TEV-ARNT 1-362 36.5 hAhR 1-293 (33 kDa) 31 Yield 0.35 mg/L

(B) hAhR284/H6ARNT362

C MW te A G TG a T P s IM C P I y i- E (kDa) -I + L N A

51.4 H6TEV-ARNT 1-362 36.5 hAhR 1-284 (31.8 kDa) 31 Yield 3.5 mg/L

Figure 3.20 Recombinant co-expression, Ni-IMAC and AEC purification of AhR and ARNT. H6TEV-hARNT362 and variant truncations of hAhR, (A) 1-293 and (B) 1-284, were co-expressed for 2 hours at 30 C at 0.1 mM IPTG. Cells were lysed using a cell disruptor and cleared lysate was purified by Ni-IMAC. Affinity purified hAhR/ARNT heterodimer was dialysed into buffer containing 50 mM NaCl and finally purified by AEC by stepwise elution with 150 mM NaCl (Purification Scheme 6). Purified protein was separated by discontinuous SDS-PAGE and visualised with Coomassie blue stain.

Enhanced yield for hAhR284/H6TEV-ARNT362 indicated more stable protein which was functional for DNA binding (Fig. 3.21 (B)), and was maintained for future structural analysis. (A) hAhR287/H6TEV-ARNT362 [Protein] (ng) 0 30 75 150 300 600 900 0 30 75 150 300 600 900

Bound Oligomer Complex Bound Dimer

Unbound 33mer

Unbound 16mer

16mer XRE 33mer XRE (B)

hAhR293/H6TEV-ARNT362 hAhR284/H6TEV-ARNT362 [Protein] (ng) 0 1.25 2.5 5 10 25 50 1.25 2.5 5 10 25 50 100

Bound Dimer

Unbound 33mer

Figure 3.21 EMSA for recombinant heterodimer binding to fluorescently labelled XRE dsDNA. Recombinantly expressed and purified (Purification Scheme 6) heterodimeric hAhR/ARNT was quantitated by Bradford analysis and incubated in the presence of FAM labelled XRE probe, then subjected to native PAGE. FAM label allowed visualisation of unbound, complexed and higher order oligomeric complexes as indicated. Fresh preparations of hAhR/ARNT were

analysed by EMSA to assay for activity prior to crystallization trials. (A) A preparation of hAhR287/H6TEV-ARNT362 was shown to bind 16mer and 33mer XRE (note that TEV digested material behaves similarly, data not shown) and (B)

hAhR284/H6TEV-ARNT362 and hAhR293/H6TEV-ARNT362 were found to bind to 33mer XRE. Protein dsXRE Conc. Notes Screen (Tray –Well)

hAhR284/H6ARNT 30mer 8.5 mg/ml Complex MonoQ purified. PEGs (AC0014 -A) Anions (AC0013 -A) Cation (AC0017 -A) Classics (AC0016 -A) Cryos (AC0015 –A)

33mer-B 8.5 mg/ml PEGs (AC0014 -B) Anion (AC0013 -B) Cation (AC0017 -B) Classics (AC0016 –B) Cryos (AC0015 –B)

hAhR284/H6ARNT 30mer 50 mg/ml Protein concentrated in 10mM Sodium PEGs Phosphate pH 8 Anions Cation Classics Cryos

26 mg/ml Protein concentrated in 10mM Tris pH8 PEGs (AC0024) Anions (AC0027) Cation (AC0026) Classics (AC0025) Cryos (AC0028)

Table 3 Summary of hAhR284/H6ARNT362/DNA complexes subjected to robotic crystal screens. Complexed material was concentrated in storage buffer minus glycerol, centrifuged at 16k rcf, 4°C for 5 minutes and stored on ice prior to screening. Protein concentration was estimated using Bradford analysis. No hit conditions were identified for hAhR284/H6ARNT362/30mer or 33mer-B, theorised to maintain solubility due to storage buffer. However, no hit conditions were identified for hAhR284/H6ARNT362/30mer in minimal buffer. hAhR284/H6ARNT362 +33merb in minimal buffer was pursued and resulted in identification of a range of hit conditions (Table 3 continued). hAhR284/H6ARNT362/30mer, this time buffer exchanged during concentration into minimal buffers: (1) 10 mM Sodium Phosphate pH 8 and (2) 10 mM Tris.HCl pH 8. The sodium phosphate buffered material could be concentrated to 50 mg/ml, while the Tris.HCl buffered complex reached 26 mg/ml. Once again, the Nextal screens PEGs, Anion, Cation, Classics and Cryos were set up using a Phoenix robot, and no crystallization conditions were observed (Table 3). Finally, hAhR284/H6ARNT362/33mer-B was concentrated in 10 mM Sodium Phosphate pH 8 to 18 mg/ml and subjected to a number of screens including Nextal1 PEGs, Anion, Cation, Classics and Cryos using a Phoenix robot (Table 3 continued). A HoneyBee crystallization robot was also used to automatically dispense Natrix, Index I and II, PACT

Suite and JCGS (see Appendix for components). Broadly, hAhR284/H6ARNT362/33mer-B showed reduced solubility relative to the 30mer bound complex, resulting in decreased solubility in the crystal screens and a concomitant increase in propensity to crystallize. A range of crystallization hits, summarized in Table 3 (continued) were evident from the hAhR284/H6ARNT362/33mer-B screens, which were subsequently investigated by focused fine screens to optimize crystal growth (Table 4).

3.2.2.9 Fine screen crystallization trials of hAhR284/H6ARNT362 alone and in complex with 30mer and 33mer-B XRE DNA. hAhR284/H6ARNT362 heterodimer in the absence of DNA was tested in a fine screen around Hampton I condition #14. At 8 mg/ml, this material was found to crystallize in thin plates up to ~120 m which took two days to grow, although these did not take up dye. hAhR284/H6ARNT362/30mer was concentrated to ~8 mg/ml and tested in the same condition, with the addition of 10 mM DTT, resulting in small leaf-shaped crystals and oils. hAhR284/H6ARNT362/33mer-B was concentrated to 18 mg/ml and fine screens around Classics condition H2 (0.2 M Ammonium acetate, 0.1 M Sodium Acetate pH 4.6, 30% PEG 4000) were set up, resulting in formation of skin, pseudocrystals and hexagonal plates 20- 100 m (Fig. 3.17 G, H) which grew in around 2 weeks. Interestingly, having set up hanging and sitting drops, crystals only formed in hanging drops, potentially indicating that crystals were indeed seeding in the skin. A hexagonal plate crystal (Fig. 3.17 G) was frozen in liquid nitrogen, and analysed using synchrotron radiation. This crystal showed low resolution protein diffraction, although not of sufficient quality to gain any structural detail. An additive screen (Nextal Additive Screen I) for this condition showed crystal growth with addition of spermidine, but finescreens did not show any increases in crystal thickness, hence this condition was abandoned as the crystals could not be seeded to improve size and did not grow any larger than when first identified. The only other crystals of sufficient size for analysis at the Australian Synchrotron PX beamline (50-100 m) were grown in Natrix condition #39 (0.2 M Ammonium acetate, 0.15 M Mg Acetate, 0.05 M HEPES pH 7, 5% PEG 4000), forming crystals ~200 m, however, upon mounting, these crystals were found to flex like pseudocrystals and did not diffract.

78 Protein Dimer dsXRE Conc. Robot Screen (96) Hits Condition hAhR284/H6ARNT 33mer-B 18mg/ml PEGs (AC0022) A5 – small plates 0.1M Sodium Acetate pH 4.6, 25% PEG 1000 A6 – small plates 0.1M Sodium Acetate pH 4.6, 25% PEG 2000 MME A12 – small plates 0.1M MES pH6.5, 25% PEG 2000 MME B6 – needles 0.1M HEPES pH 7.5, 25% PEG 2000 MME B12 – small plates 0.1M Tris pH 8.5, 25% PEG 2000 MME D3 – small plates 0.1M HEPES pH 7.5, 25% PEG 6000 D7 – small plates 0.1M Tris pH 8.5, 25% PEG 3000 D10 – small plates 0.1M Tris pH 8.5, 25% PEG 8000 E1 – small plates 0.2M Sodium Fluoride, 20% PEG 3350 F5 – small plates 0.2M Sodium Nitrate, 20% PEG 3350 E10 – small plates 0.2M Sodium Iodide, 20% PEG 3350 Anion (AC0021) Cation (AC0023) F6 – small crystals 0.1M Imidazole pH 7.5, 0.8M Zinc Sulfate Classics (AC0019) H2 – small hexagons 0.2M Ammonium acetate, 0.1M Sodium Acetate pH 4.6, 30% PEG4000 Cryos (AC0020) F1 – small crystals 0.17M Magnesium Formate, 15% Glycerol Natrix* #48 – small, fragile 0.2M Ammonium Chloride, 0.01M Calcium Chloride dihydrate, 0.05M Tris pH 8.5, 20% PEG 4000 #33 – 100x10m crystals 0.2M Ammonium Chloride, 0.01M Magnesium Chloride, 0.05M HEPES pH 7, 30% Hexanediol #39 – Small, branching. 0.2M Ammonium Acetate, 0.15M Mg Acetate, 0.05M HEPES pH 7, 5% PEG 4000 Index I + II* D11 - microcrystals 0.1M BIS-TRIS pH 6.5, 28% PEG MME 2000 D12 – microcrystals 0.2M Calcium Chloride dihydrate, 0.1M BIS-TRIS pH 5.5, 45% (+/-)-2-Methyl-2,4-pentanediol PACT Suite* C4 – microcrystals 0.1M PCB buffer pH 7, 25% PEG 1500 JCSG*

Table 3 Continued: Summary of hAhR284/H6ARNT/33mer-B complexes subjected to robotic crystal screens and crystal hits. Complexed material was concentrated, buffer exchanged into 10mM Sodium Phosphate buffer, centrifuged at 16k rcf, 4°C for 5 minutes and stored on ice prior to screening. Material concentrated for each trial was also used for finescreens and stored up to a maximum of 2 weeks at 4°C. Protein concentration was estimated using Bradford analysis. Some screens were conducted by the Wilce laboratory at Monash University, Victoria, as indicated by (*). A full summary of fine screens, additives and techniques for hAhR284/H6ARNT362/33mer-B crystallization is listed in Table 4.

3.3 Summary and Discussion

In order to investigate molecular details of dimerization and DNA binding by bHLH.PAS transcription factor heterodimer AhR/ARNT, co-expression and purification of soluble, active heterodimer was optimised. Initial attempts at purifying bHLH.PAS A AhR287/ARNT362 illustrated that the heterodimer was difficult to solubilise, but after attempting a range of purification protocols and buffers, pure, soluble protein was prepared by Ni-IMAC and anion exchange chromatography. Persistent difficulty with concentrating the purified heterodimer prompted trials of complex formation with 33mer dsDNA fragment containing an XRE, with the result that complex was formed, but did not aid in solubilizing the protein as predicted. Crystallization screens with AhR/ARNT alone and AhR/ARNT/33mer identified two conditions which seeded microcrystals, however, optimization of these conditions by microseeding, streak seeking, and ‘backing off’ produced, at best, 35 m crystals which were difficult to handle and did not show evidence of diffraction.

Subsequently, DNA length was altered to prevent heterogeneity in an effort to improve crystal size. 16mer XRE did not alter the solubility of the complex, and no further crystallization hits were identified for this material. PAS domain dependent DNA bending (9) was identified at a GC rich region 3’ of original 33mer designed dsDNA, hence 30mer was designed to include the bending site, with the result that solubility increased 10 fold. A second dsDNA fragment designed on the basis of the bend site, 33mer-B, included a region 3’ of the 30mer which shows high sequence conservation in alignments of promoter elements known to be targets of AhR/ARNT in vivo (Chapman-Smith, A., pers. comm.). 33mer-B showed similar solubility when complexed with AhR287/ARNT362, and hence was included in fine screens around conditions identified for 30mer. Robotic crystallization screens of AhR287/ARNT362/30mer resulted in identification of a number of hit conditions, although the largest crystals achieved were around 70 m and did not diffract.

Partial proteolysis of ARNT362 homodimer identified a flexible, surface exposed N-terminal region, which, when removed showed decreased solubility of the complex. AhR287/ ARNT39-362/30mer did not show any improvement in growth in crystallization conditions already identified, and did not provide any lead crystallization hits after robotic screening. Finally, the AhR construct was modified to remove a hydrophobic group ‘FIF’ from the C-terminus, resulting in material that was more soluble in minimal buffer, had very high solubility in complex with 30mer (>100 mg/ml) and 33mer-B (50 mg/ml), and expanded the hit rate in crystallization screens, resulting in identification of 21 hits for 79 Protein/dsXRE Screen Hit [Conc] Buffer Salt Precipitant Additives/Techniques Crystal Form I hAhR284/H6ARNT (-)DNA Hampton #14 8mg/ml HEPES pH 6.5-8 200mM CaCl2 20-30% 120uM plate, PEG 400 rectangular crystals. I hAhR284/H6ARNT + Hampton #14 7.5-9.3 MOPSO pH 6.2-6.5, 200mM CaCl2 5-30% PEG 100mM DTT Small Leaf shaped, 30mer mg/ml HEPES pH 6.6-7.5 400 oils.

hAhR284/H6ARNT + Classics H2 18mg/ml Na Acetate pH 4.4-5 170-230mM 5-30% PEG Sitting Drops. (Crystals Pseudocrystals, 33mer-B Ammonium 4000 only in hanging drops) Hexagonal Plates Acetate 20-100 m. Diffract as protein.

hAhR284/H6ARNT + Natrix #39 18mg/ml HEPES pH 6.6-7.5 0.2M Ammonium 5-10% PEG Dialysis (Pseudocrystals 200m gels. 33mer-B Acetate, 0.15M Mg 4000 only form in hanging Pseudocrystals. Acetate drops) No Diffraction.

hAhR284/H6ARNT + Natrix #33 18mg/ml 0.05M HEPES pH 6-7.5 0.2M Ammonium 20-45% Large, gel-like salt 33mer-B Chloride, 0.01M Hexanediol crystals.

MgCl2 I hAhR284/H6ARNT + PEGs E10 18mg/ml 0.1-0.25M Na 5-30% PEG Phase separation, 33mer-B 3350 small crystals

hAhR284/H6ARNT + PEGs F5 18mg/ml 0.1-0.25M NaNO3 14-24% Precipitation, small 33mer-B PEG 3350 hexagonal crystals.

hAhR284/H6ARNT + PEGs B12 18mg/ml 0.1M Tris pH 7.5-9 21-31% IZIT stain positive - Phase separation, 33mer-B PEG 2000 protein crystals. small crystals. MME

hAhR284/H6ARNT + PEGs D3 18mg/ml 0.1M HEPES pH 6.5-8 21-31% Precipitation 33mer-B PEG 6000

hAhR284/H6ARNT + PEGs D7 18mg/ml 0.1M Tris pH 7.5-9 21-31% Pseudocrystals 33mer-B PEG 3000

Table 4 Summary of hAhR284/H6ARNT/DNA complexes subjected to fine screens around original robot screen hit conditions. Complexed material was concentrated, centrifuged at 16k rcf, 4°C for 5 minutes and stored on ice prior to screening. Protein concentration was estimated using Bradford analysis. Additives and techniques tried to optimise crystal growth are also listed, with resulting crystals described and, where tested, diffraction noted. AhR284/ARNT362/33mer-B. Increases in identified crystallization conditions for 33mer-B were probably attributable to slightly reduced solubility relative to 30mer complex, and that the dimer both with and without DNA was found to be much more soluble in minimal buffer (10 mM Sodium Phosphate pH 8).

Having established a purification, solubilization, complex formation and buffer exchange protocol for this, the first crystallized bHLH.PAS protein heterodimer, there are a number of opportunities for optimization and analysis of a range of crystal forms. Advances in our understanding of asymmetrical DNA binding and bending have assisted in designing new dsDNA fragments for complex formation, and future crystallization screens may benefit from including ‘sticky-end’ fragments with complementary A-T overhangs to capitalize on the capacity for DNA to introduce a preliminary architecture for cocrystal seeding and growth (210). Observation of higher order oligomer formation may be impacting adversely on crystal growth by introducing stable heterogeneous oligomeric forms into solution. Analysis of the kinetics and sizes of the oligomeric forms using analytical ultracentrifugation would be advantageous for further crystallization trials. Identification of the mechanism behind oligomerisation, or targeted abrogation of oligomer formation may prove useful for decreasing heterogeneity, rather than the use of random additive screens.

Skin formation observed in finescreens of hit conditions for the heterodimer:DNA complexes indicate that oxidation or other unknown factors contributing to skin formation may be decreasing the effective protein concentration in hanging drop screens, or, slowing further crystal growth after skin formation by preventing vapour diffusion. Hence, a range of crystallization methods to prevent skin formation may prove useful in future crystallization finescreens, including free-interface diffusion in capillaries, counter- diffusion techniques such as the use of a Granada Crystallization Box, microbatch under oil and microdialysis. Finally, access to the soon to be completed microfocus beamline at the Australian Synchrotron will doubtless prove invaluable in efforts to characterise the structure of AhR/ARNT, given the propensity for this material to form a range of small (<50 m) crystals. However, in the absence of a crystal structure, soluble heterodimer was subjected to other biophysical techniques. In order to provide novel spatial information about PAS A domain topology relative to DNA, atomic force microscopy was performed. Additionally, gross domain structures and tertiary relationships between domains and DNA were analysed by small angle x-ray scattering.

80

The Aryl Hydrocarbon Receptor: Structural Analysis and Activation Mechanisms

This thesis is submitted in fulfilment of the requirements for the degree of Doctor of Philosophy in the School of Molecular and Biomedical Sciences (Biochemistry), The University of Adelaide, Australia

Fiona Whelan, B.Sc. (Hons) 2009

Atomic Force Microscopy and Small Angle X-Ray Scattering Analysis of hAhR/ARNT bHLH.PAS A Heterodimer Chapter 4: Atomic Force Microscopy and Small Angle X- Ray Scattering Analysis of AhR/ARNT bHLH.PAS A Heterodimer

4.1 Introduction

4.1.1 The Mechanics of DNA Packaging and Transcriptional Regulation

Our understanding of the behaviour of naked DNA in in vitro systems proposes a packing conundrum for all organisms. From the viral capsid through to nuclear containment of the eukaryotic genome, the volume and free energy of DNA vastly exceeds the capacity of the genomic vessel. Measurements of the mechanical behaviour of naked DNA indicates a persistence length, the length of a polymer which maintains an unbent conformation in solution, typically around 50 nm or 150 bp (211). Genome packing and transcriptional regulation often require tight bending, on a scale smaller than the persistence length, indicating that mechanical manipulation is necessary to achieve biological outcomes (as reviewed in 211).

Packaging of eukaryotic genomes involves tight bending of the DNA, where histone octamers including core histones H2A, H2B, H3 and H4 and linker histone H1, form the core of eukaryotic genome packing. Approximately one persistence length, or 147 bp of DNA wraps 1 ¾ turns around the octamer, forming a nucleosome (212). Hence, according to the persistence length definition, DNA is considered to be tightly bent to achieve this wrapped state and also incurs an electrostatic cost by aligning two negatively charged wrapped loops in close proximity. Nucleosomes are further separated by ~10-50bp of naked DNA, resulting in exposure of only 10-25% of accessible DNA (211). This wrapping mechanism has implications for biological regulation of replication, transcription, recombination and repair in that DNA wrapped in nucleosomes is inaccessible, being sterically blocked on the side bound to histones, and on another face, by proximity to the partner wrapped loop.

The energy required to tightly bend DNA and appose negatively charged loops is provided by the positively charged surface of histones, where charge interactions maintain a marginal net stability favouring packed DNA (211). However, a theory has been proposed which suggests that DNA ‘bendability’ is exploited to generate preferential nucleosome positioning. A thorough review of DNA bending implicates broadly two DNA bending phenomena which may affect histone binding, including sequence specific static bending, and dynamic bendability (213). Static bending has been identified for poly-A tracts of 6 bases in kinetoplasts, organelles of protists, which contain interlocking circular DNA, where repetition of the poly-A tracts every 10bp is hypothesised to result in a static bend toward the 81 minor groove (negative bend), cumulatively culminating in the circular structure (214). The corollary to this is the GGGCCC motif, found to bend toward the major groove (positive bend) (215). Inherent flexibility has been observed for TA copolymers such as the TATA box, implicating these sequences as targets for nucleosome formation (216). A proposed nucleosome code appears to have merit in that predicted directional static bends, combined with dynamically bendable sequences specifically designed to bind histones, do indeed bind with higher affinity than random sequences, and additionally show rotational positioning consistent with the predicted direction of bending (217).

Affinity selection from a random 220mer library revealed a range of 41 sequence families which were able to bind histone octamers with a relatively high affinity. The most striking observation was the occurrence of nucleotide periodicity in these sequences, where TA and AA dinucleotide pairs appeared in phase approximately every 10bp, while CG and GC pairs were similarly in phase with each other, and precisely out of phase with AA/TA sequences (218). This periodicity is predicted to result in oppositely directed local bends such that the GC positive bend, and TA negative bend act additively to induce an in-phase unidirectional bend to achieve curvature, hence favouring nucleosome assembly (218). However, an analysis of mouse genomic DNA revealed that the highest affinity sequence was six-fold lower than the highest affinity synthetic sequence, and the second highest affinity genomic fragment was 34-fold lower, implicating a preference, in nature, for limited histone wrapping efficiency (218).

A synthetic, high affinity binding sequence engineered to target multiple sites along the fragment for restriction enzyme digestion has been analysed to investigate site exposure within a wrapped nucleosome. This research showed strong nucleosome position dependence for target sequence access, where sites closest to the exposed ends of the wrapped DNA are most accessible, decreasing progressively towards the centre (219). This indicates that sequence dependent positioning of nucleosomes provides a mechanism for regulatory protein binding to selectively exposed regions of DNA in a chromatin context. DNA accessibility at the ends of a wrapped nucleosome, as a means to regulate protein binding, requires that nucleosome formation must be sequence specific in order to maintain exposure of selected regulatory sequences, as discussed above. However, nucleosome unwrapping would also provide a mechanism for transient site access. Recent evidence suggests that frequent, short-lived unwrapping events may increase the probability of regulatory protein:DNA interactions. An elegant series of fluorescence resonance energy transfer (FRET) experiments indicates that the fully wrapped nucleosome state is maintained for ~250ms, followed by spontaneous unwrapping (220). These experiments indicate that the unwrapped state survives for only around 10- 50ms, providing a discrete window for regulatory protein binding events and inferring that DNA regulatory proteins are not required to initiate unwrapping, rather to prevent rewrapping. Specifically, the impact of rewrapping on RNA polymerase procession would suggest that, left uncontrolled, unwrapping 82 for 10-50ms would barely allow Pol II to progress one base pair into a nucleosome (220). Hence, the importance of transcription factor recruitment of nucleosome remodelling proteins, such as Histone acetyl (HATs), has been implicated in allowing transcriptional elongation to continue through packed DNA (220).

Post-translational modification (PTM) of histone tails has been hypothesised to alter transcriptional regulation by one of two mechanisms. Firstly, modifications have been predicted to control interactions with proteins containing modification specific interaction motifs, such as the bromodomain, which interacts with acetyl-lysine residues (221). Secondly, modification has been hypothesized to neutralize positively charged lysine residues which would otherwise interact with a negatively charged surface patch of a neighbouring nucleosome, thereby decreasing local DNA wrapping stability (213), and/or by modifying chromatin structure to allow access by transcriptional regulators (222). The complex nature of the proposed histone code is yet to be fully elucidated, and a thorough review of this field is beyond the scope of this thesis, suffice to say that a concert of modification effects are able to alter chromatin structure to change DNA accessibility, and ultimately affect DNA regulatory protein access. However, what is also clear is that this communication is not unidirectional, and localised unwrapping events resulting in transcription factor binding are also critical in determining DNA accessibility by attracting coactivator or corepressor histone modifiers to generate histone code based alterations to chromatin structure.

4.1.2 Transcription Factors as Effectors of Localized Chromatin Remodelling

Transcription factor binding to attract co-activators, and maintain disrupted chromatin for activation of transcription has been theorised to utilise transient unwrapping events to access binding sites, and as discussed, have been proposed to maintain the unwrapped state in order to regulate transcriptional outcomes. However, RNA Polymerase II itself has also recently been implicated in nucleosome rearrangement to maintain coordinate gene expression in Drosophila. Promoter-proximal pausing of Pol II has been identified by ChIP-on-chip in hundreds of genes which are typically responsive to environmental stimuli or developmental cues and appears contingent, in around 60% of identified pausing targets, on the presence of Pol II stalling factor Negative Elongation Factor (NELF) (223). Contrary to first assumptions, NELF stalling of Pol II was found to increase transcription in around two- thirds of identified target genes. After treatment with NELF siRNA, nucleosome occupancy was found to increase, and methylation of H3K4 was reduced, confirming a mechanism for enhanced transcription due to pausing dependent chromatin remodelling in proximal promoter regions of target genes (224).

83 In a recent study, transcriptional repressor protein Proline Rich Homeodomain (PRH/Hex) was investigated as a potential modifier of chromatin structure using the Goosecoid promoter as a model system. PRH is involved during development in regulating cellular differentiation and proliferation, functioning to repress transcription by binding to corepressors from the Transducin-like Enhancer-of- Split (TLE) chromatin binding protein family (225). Purified full-length PRH has been shown, in vitro, to form octamers, even at low concentration, by biophysical studies, including analytical ultracentrifugation, electron microscopy and spectrophotometry (226). However, in order to exclude the possibility that oligomer formation is simply an artefact of having a high concentration of purified protein, in vivo crosslinking has also been utilised, illustrating formation of octameric species in cells. Finally, a ChIP of the goosecoid (gsc) promoter was used to confirm DNA binding by PRH (226, 227), although this series of experiments did not confirm that the octameric form of PRH was involved in DNA binding.

Further in vitro analysis of a monomeric form of PRH, where the proline rich oligomerisation domain had been removed, showed that the gsc promoter fragment could be bound by 4 monomers of PRH, resulting in DNase I protection at these sites, and enhancement of cleavage at sites neighbouring the bound regions, indicating some binding distortion of the DNA (227). Full-length purified PRH showed a different protection pattern, with more widespread protection of the promoter region, indicating that octameric PRH resulted in a distinct DNA structure relative to monomeric PRH. Finally, structural changes to the promoter were shown in vivo using potassium permanganate modification, followed by piperidine cleavage, and indicated that DNA distortion was achieved at low concentrations of octameric PRH (227). The authors propose a mechanism for transcriptional regulation, where DNA structural changes were inferred to be occurring due to the high local concentration of DNA binding homeodomains present in the octameric form. This was theorised to induce local DNA bending and wrapping around the PRH complex, to form the physiologically repressive DNA structure (227). This hypothesis is distinct from other theories of transcriptional regulation by DNA looping which have been proposed for factors such as CTCF (CCCTC-binding factor), a chromatin insulator, at the mouse - globin gene locus (228), and short-range looping of the iNOS (Inducible Nitric-oxide Synthase) promoter by a range of factors including AP-1, NF-B and p300 activity (229).

DNA looping by a bHLH.PAS dimer occurs in regulation of the erythropoietin (EPO) enhancer, where HIF-1/ARNT binds at a 3’ downstream enhancer region and acts to induce the proximal promoter through enhancer bending, an event found to be synergistically regulated by TGF signalling. The HRE lies in close proximity to hepatic nuclear factor receptor 4 binding sites (HNF4) and the TGF responsive smad binding element (SBE), where HNF4, Smad3/4 and HIF1/ARNT bind after hypoxic induction, and cotreatment with TGF. Sp1 binding occurs at the proximal promoter, however, full

84 hypoxic activation requires enhancer looping enabled by interaction of Sp1 with HIF1/ARNT and HNF4 mediated by bridging factor Smad3/4 (230). HIF1/ARNT, HNF4 and Smad3/4 finally recruit CBP/p300 to activate transcription by histone acetyl transferase activity, and by attracting transcriptional machinery (231, 232).

Direct oligomerization of bHLH.LZ proteins has been identified in crystal packing of Myc and Max, resulting in an investigation into proposed DNA looping by this dimer of dimers. The bHLH.LZ region of Myc and Mad have been synthetically dimerized with Max using a chemoselective C-terminal linker in order to stabilize heterodimers for crystallization and prevent preferential formation of Max homodimers. Purified heterodimers were complexed with E-box containing DNA fragments and crystallized. In contrast to the Mad/Max/DNA complex, Myc/Max/DNA was found to form a tightly bound head-to-toe four helix bundle tetramer (Fig. 3.2 F (233)). Further estimates of nuclear concentration of Myc and Max, coupled with sedimentation equilibrium ultracentrifugation revealing a dissociation constant of 90 nM, suggest the existence of tetrameric forms at physiological concentrations, with the implication that tetramers are able to bind promoter sequences to give rise to DNA looping. However, recent atomic force microscopy (AFM) data appears to conflict with this hypothesis, in that Myc/Max heterodimer was found to interact with two sites from the hTERT promoter, but only synthetically linked tetrameric complex was able to result in looping of the hTERT promoter (234), although DNA bending was confirmed in these experiments. The possibility that other DNA binding architectural proteins (235), heterochromatin/euchromatin context, or molecular crowding in the nuclear milieu which may reduce overall entropic cost for DNA looping (236), has not yet been investigated, and hence, the ultimate physiological role for tetrameric Myc/Max cannot be concluded. Additional evidence for formation of a tetrameric DNA bound bHLH.LZ transcription factor complex was recently identified from the solution structure of homodimeric bHLH.LZ protein USF (Fig. 3.1 G) (150). Again, a physiological role for DNA looping has been proposed, although evidence for this function is yet to be shown conclusively (237).

4.1.3 The AhR/ARNT Heterodimer Affects Chromatin Structures in the Enhancer of Cyp1a1

Dioxin induced activation of the aryl hydrocarbon receptor results in robust activation of Cyp1a1 transcription. Prior to activation, the Cyp1A1 regulatory region is maintained in a nucleosomal arrangement, with no protein binding events thus far identified at either the promoter or enhancer region. After treatment with dioxin, the AhR/ARNT heterodimer binds to the enhancer region, the nucleosomes are disrupted and general transcription factors bind at the promoter (238-240). Given that the enhancer is located hundreds of base pairs (-1600 to -400) from the promoter region (-200 to +20), an analysis of the mechanism of activation was undertaken using nuclease accessibility and ligation 85 mediated PCR (241) to illustrate changes to chromatin structure (242). Within the enhancer region, two sites corresponding to positions of XRE’s are protected from digestion (–1066 to –1060 and –983 to – 989), indicating that AhR/ARNT heterodimer is bound. Additional sites of protection include an SP1 site at a G-rich region, and at the TATA box within the promoter, likely indicative of TATA-binding protein (TBP) site occupancy. It is of note that these changes occur on a timescale of around 30 minutes, indicating that the process is dynamic and specific protection sites infer that the chromatin rearrangements result from targeted protein/DNA interactions. An AhR deficient cell line does not display the TCDD induced chromatin changes, strengthening the argument that AhR/ARNT binding induces increased nuclease susceptibility (243). An additional finding of this study shows that DNA in the regions surrounding AhR/ARNT binding sites exhibit increased nuclease susceptibility, localized from –1150 to -400, indicating that AhR/ARNT binding able to alter chromatin structures at a distance from a bound XRE, indicative of chromatin disruption (242).

Concomitant with changes at the enhancer, the promoter region (-200 to +20) is also more susceptible to digestion and this change in chromatin structure allows binding by general transcription factors. Notably, this effect is abrogated in AhR null cells (243). Interestingly, the region spanning -400 to -200 does not undergo any alteration in accessibility, inferring that chromatin structures located between the enhancer and promoter do not change upon induction with dioxin. Given that the proximal promoter does not contain any XREs, it was concluded that AhR/ARNT must act at a distance to induce changes in promoter chromatin structures (242), potentially by inducing DNA bending (9, 244) and/or looping, and by recruiting histone remodellers.

DNA bending by recombinantly expressed truncations of bHLH.PAS heterodimers AhR/ARNT has been identified and characterised in our laboratory using a number of in vitro techniques. Experiments with chimaeric forms of AhR and ARNT bHLH.PAS A indicate that the entire region is required for strong dimerization. An AhR chimaera, including the PAS A region of ARNT; and the corollary, an ARNT chimaera with the PAS A domain from AhR, were coexpressed and purified for analysis of dimerization and DNA binding. The chimaeric heterodimer was found to have reduced stability through SEC, however, DNA binding affinity closely resembled the native bHLH.PAS A AhR/ARNT heterodimer, as assessed by quantitative EMSA. The relevance of the PAS A domain for dimerization and DNA binding was further investigated by replacement of AhR and ARNT PAS A with the LZ of Myc and Max, respectively. Once again, dimerization strength was markedly weakened, and in this case, DNA binding affinity was also decreased, inferring that the PAS A region is not only important for dimer formation, but also required for high affinity XRE binding (9). The possibility that the bHLH.PAS A folds into a single DNA binding/dimerization region was investigated by bacterial-two-hybrid, which illustrated that AhR PAS A can associate with the isolated AhR bHLH domain, indicating that the regions do interact, and 86 therefore are probably in close proximity, or direct contact, in the native heterodimer (9). EMSA analysis of bHLH.LZ chimaeric heterodimer and native AhR/ARNT bHLH.PAS A XRE binding with the addition of AhR basic domain specific antibodies resulted in supershift of only the LZ chimaeric heterodimer DNA complex. This indicated that the basic region of AhR was accessible for antibody binding only in the absence of PAS A, confirming the proximity of this domain in the DNA bound state (9). Finally, DNaseI footprinting analysis of AhR/ARNT bHLH.PAS A XRE binding, relative to the LZ chimaeric heterodimer, showed PAS A specific enhancement of digestion at a region downstream of the core XRE 5’ TNGCGTG 3’ motif. Additionally, PAS A specific protection was also observed close to the enhanced digestion site (Fig. 3.15 A). In summary, bHLH.PAS A of AhR and ARNT appears to be required as a functional unit to convey strength for dimerization; PAS A is required for strong DNA binding; PAS A of AhR interacts with the bHLH region directly, and is in close proximity in the DNA bound complex; and PAS A likely contacts the DNA downstream of the XRE to effect DNA bending, summarised in Figure 4.1 A (9). As discussed above, DNA bending has been shown to be associated with chromatin remodelling, both physically, by stabilising chromatin unwrapping; and by recruiting histone remodellers to potentiate strong target gene induction.

AhR binding within the (Cyp1a1) enhancer is known to disrupt chromatin in this region with no requirement for a chromatin remodelling complex. Changes to the chromatin structure within the promoter region are induced subsequent to AhR binding to the enhancer, and allow formation of the RNA Polymerase preinitiation complex (PIC) (243). More specifically, this sequence of activation is known to require the C-terminal TAD of AhR (242, 243). Within the TAD of AhR are three subdomains including a glutamine-rich (Q-rich) domain, a proline/serine/threonine-rich domain and an acidic domain (245, 246). The Q-rich region was identified as the site responsible for an interaction with coactivator/corepressor protein RIP140 (247) and steroid receptor coactivator SRC1 (40). Coactivators including CBP/p300 (39), Nuclear receptor coactivator 2 (NcoA-2) and protein kinase PAM C-terminal

Interacting protein 2 (p/CIP2) (248), have also been connected with AhR/ARNT transcriptional regulation, where these factors, and SRC1 have been shown to increase transactivation of reporter genes. These coactivators acetylate histone tails to decrease histone association and reduce the DNA wrapping stability of the nucleosome, as discussed earlier (249). Another mechanism for transcriptional activation has also been identified for the AhR/ARNT heterodimer, whereby the heterodimer binds a SWI/SNF nucleosome remodelling complex, which uses the energy from ATP to remodel chromatin. This interaction results in disruption of the DNA/nucleosome interaction to generate histone octamer sliding, acting synergistically with the histone acetylation factors. Specifically, the Q-rich region of AhR has been shown to directly, or indirectly interact with Brahma/SWI2-related gene product-1 (Brg-1), the ATPase subunit of mammalian SWI/SNF (250).

87 Further studies on the histone modification pattern occurring following AhR activation have been undertaken. In a recent investigation, B[a]P induction of Cyp1a1 indicates that acetylation and phosphorylation of histones 3 and 4 accompany activation of transcription (251). Earlier studies indicated that histone deacetylase 1 (HDAC1) is bound at the Cyp1a1 promoter in the inactive state, and upon induction with B[a]P, is removed from the promoter concomitant with the recruitment of HAT p300 (252). ChIP, ReChIP and RT-PCR have recently been used to investigate the role of HDAC1 in silencing Cyp1a1 and to quantify the effect of HDAC1 removal (251). In control cells, HDAC1 and DNMT1 were bound to both the enhancer and promoter regions, while B[a]P treatment resulted in their release from these regions. ReChIP experiments suggest that HDAC1 and DNMT1 were bound at these sites in a complex, particularly at the proximal promoter region. Bisulphite sequencing indicated that, in spite of the presence of DNMT1, cytosine methylation was not above background levels. An analysis of histone modifications from –3.6 to +0.6kbp of the Cyp1a1 transcription start site was undertaken using antibodies directed against modifications on particular histones. B[a]P treatment resulted in increased levels of ac-H3K9 and ac-H3K14 in the proximal promoter, consistent with the observed absence of HDAC1. Dimethyl-H3K4 was identified in the enhancer and promoter region in control cells, becoming tri-methylated after B[a]P treatment. Phospho-H3S10 was also increased, only in the enhancer region of treated samples. qPCR identified specific locations of histone modification, where acetyl-H3K14 is only in the proximal promoter, while the enhancer showed increased formation of acetyl-H4K16. Knockdown of HDAC1 and DNMT1 did not affect Cyp1a1 mRNA levels, although acetylation levels comparable to those of treated samples were evident after HDAC1 inhibition. Intriguingly, AhR was found to be recruited, not only to the enhancer after B[a]P treatment, but also to the promoter region where there are no XREs. However, promoter association was found to occur at a lower level than the enhancer and the authors propose that this binding event may be consistent with a looping mechanism of gene activation (251).

4.1.4 Atomic Force Microscopy of Proteins and DNA

In order to visualise protein binding and bending of DNA, Atomic Force Microscopy (AFM) can be employed. This technique utilises a scanning probe to generate images of a molecule at sub-nanometre resolution. A scanning tip probes the surface of a sample and measures the attractive or repulsive forces which exist between the probe and the sample (253). These measurements resolve a topographical image of the sample surface, providing high resolution of diverse biological samples. Measurement of DNA strands probing for curvature and flexibility confirms that sequences predicted to form static bends do indeed curve, and curvature is also noted for regions rich in AT repeats, indicating flexibility (254). A study of a DNA repair enzyme, human 8-oxoguanine DNA glycosylase (hOGG1), utilised AFM to investigate the mechanism by which hOGG1 scans DNA for the mutagenic base lesion 88 8-oxoguanine. 8-oxoguanine is formed by attack of guanine by reactive oxygen species, and once detected by hOGG1, nuclease excision of the mutated base is performed. A catalytically inactive hOGG1 variant has been analysed to visualise the base targeting, with AFM used to identify a 70 helical kink hypothesised to allow a close interrogation of the G base in the of hOGG1, allowing the enzyme to differentiate between subtly different nucleotide structures (255). LrpC is a helix- turn-helix transcriptional regulator from E. coli, which exists as a tetramer in solution and can regulate its own expression. Specificity of DNA binding by LrpC was investigated using a range of biophysical techniques, including AFM, resulting in identification of statically curved phased polyA tracts which are bound and wrapped by LrpC in a nucleosome-like manner (256).

Snapshots of dynamic DNA binding events have also been rendered using AFM. Images of convergent transcription by E. coli RNA polymerase were identified by judicious application of NTPs, stalling the enzymes at initiation, elongation and collision (257). Finally, the distinct advantage of AFM is the capacity for maintenance of soluble buffer conditions during imaging, as opposed to electron microscopy, where samples have to be imaged in a vacuum and often require metal or carbon coatings. Human heat shock transcription factor 2 (HSF2) is a DNA binding transcriptional activator which functions as a trimer, or sometimes multimers of trimers to induce in DNA binding. HSF2 proteins bind specific sequences of DNA termed Heat Shock Elements (HSEs) at sites both proximal and distal to promoters of target genes. Electron microscopy generated 2D images of synthetic fragments of DNA (~1.7kb) including two HSEs, separated by 880bp, which showed protein bound at both sites, forming a looped structure. AFM allowed resolution of a bilobal structure, indicating the presence of two HSF2 trimers at the loop junction. The higher resolution was due to rendering of a 3D image from the topological data generated by AFM (258). Eukaryotic Class II promoter element looping has also been visualised using AFM, where c-jun binding at an AP1 site resulted in DNA bending at an angle around 36 . Inclusion of nuclear extracts from which AP1 binding proteins were depleted showed formation of a preinitiation complex, which interacted with c-jun to loop the DNA. Looping was further shown to be dependent on the presence of c-jun, the AP1 site and nuclear extract (259).

4.1.5 Small Angle X-Ray Scattering Analysis of Biological Molecules

Small angle x-ray scattering (SAXS) is a useful technique for analysis of structural features of colloidal particles of a given sample. Typically, in the case of biological molecules, particles are in a size range from tens to thousands of angstroms. A colloidal suspension of this size results in scattering from incident x-rays, which is inversely proportional to the size of the solute, hence, scattering angles will be small. Given that x-rays are scattered by electrons, heterogeneous electron density of colloidal particle size generates characteristic small angle x-ray scatter. The resulting scattered x-rays generate a curve 89 of intensity (I), plotted against the momentum transfer vector (Q, nm-1). In the case of randomly oriented biological particles in solution, the observed scatter is averaged over every orientation. The electron density of the solute is measured minus the electron density of the solvent to obtain scattering data of protein dissolved in a solvent. In order to deduce the size, shape and mass of a molecule from a scattering curve, model particles are generated in silico with a theoretically equivalent scattering curve.

However, raw scattering data alone implicates a number of molecular parameters, which inform shape predictions. The radius of gyration (Rg) is calculated from the root-mean square of the distances of all electrons from their centre of gravity within the molecule. Hence, Rg infers detail about the compactness, or extension of the molecular structure. A measure of Rg can be obtained from the slope

2 of SAXS data plotted on a scale of Ln I vs q , which is called a Guinier plot. For low values of Q (q x Rg

2 <1) the Guinier plot should be linear, with the slope being equal to -1 x Rg /3. Linearity over this region is indicative of monodispersity, while ‘upturn’ at low s is diagnostic for aggregation. Another parameter that can be measured for a particle of uniform density is the molecular weight. This can be calculated by determining a constant for the x-ray beam and detector in use by calibrating with a known protein sample. The scatter at zero degrees (I(0)) is divided by the concentration of the solute and by the mass of the scattering particles. This constant can then be used to predict the MW of an unknown protein sample and hence, confirm its monodispersity relative to theoretical MW. Finally, a distance distribution function P(r) can be calculated by performing a fourier transform of the scattering curve, where the r- value at which P(r) drops to zero shows the largest molecular dimension. Additionally, peaks in P(r) correlate with an r-value representing the cross-sectional radius of the molecule. From this function, the overall shape of biological macromolecules can be approximated. For complex structures, the particle shape may be theoretically generated from a large number of densely packed spheres, essentially representing a more intuitive model of what has already been illustrated by the scattering curve (reviewed in 260). Having predicted a molecular envelope by this method, where known, domain structures can finally be docked to generate a rigid body model. Additionally, the use of complementary biochemical techniques such as NMR, chemical cross-linking, fluorescence resonance energy transfer (FRET) and affinity purification, among other techniques, allow predictions of interacting surfaces and can generate distance constraints for refinement of SAXS generated models (261).

SAXS has been successfully utilised to generate structures for a range of biological molecules, including proteins, nucleic acids, viral particles and large protein complexes. More specifically, SAXS has proven useful in determining solution structures of transcription factors including p53 (262), TFIIB and VP16 (263), USF (237), and Thyroid hormone Receptor (TR) (264). Analysis of p53 using SAXS is a particularly pertinent example, given the biologically relevant data generated using this technique. This sequence specific tetrameric transcription factor is a key player in determining cellular events 90 controlling proliferation or differentiation. The protein p53 consists of two folded domains, the DNA binding domain and the tetramerization domain, linked by a region of intrinsic disorder. Additionally disordered regions include the N-terminal transactivation domain, a proline rich region, the nuclear localization signal (NLS) region and the C-terminal negative regulatory domain (reviewed in 265). While the holo DNA binding domain has been structurally characterised by crystallography (266), the apo form by NMR (267), and the tetramerization domain by NMR (268) and crystallography (269); crystallization of full-length p53 has proven problematic, likely due to the disordered regions. Hence, SAXS analysis of the DNA binding domain/tetramerization domain  DNA; DNA binding domain/Tetramerization domain/C-terminal domain and full-length p53  DNA has been undertaken to establish the solution structures and relative regional contributions to molecular shape (262).

In this study, observed increases in Rg corresponded to the increasing size of the protein construct, while peaks and peak shoulders in the P(r) function indicated the presence of two folded subdomains, in agreement with previously identified structured domains (262, 266). The construct containing the C- terminal domain showed minimal size difference relative to the DNA binding/tetramerization region, whereas the inclusion of the unstructured N-terminal region in full-length p53 resulted in a large increase in R(g) and the maximum dimension DMax, indicating that these regions extend from the surface of the tetramer (262). SAXS analysis has provided a number of novel structural details, which clarify a range of regulatory and functional mechanisms of p53. Rigid body modelling included the crystallized structure of the tetramerization domain (269), while the DNA binding domains and linkers were configured around these domains to match experimental data. Unknown linkers and N-terminal mobile regions were modelled as dummy residue chains and the final models refined using NMR derived interaction interface data (270). For the DNA bound form, electron microscopy has been employed for a comparative analysis of structure (262). Overall the shape of the tetramer has an extended cross shape, where the intrinsic flexibility of the linkers between the folded domains allow response element containing DNA access to the DNA binding domains towards the centre of the complex, prior to structural rearrangements resulting in DNA being buried at the centre of the tetramer (262). Additionally, the N-terminal domain surface exposure correlates with the site of extensive post-translational modification (271), and interaction with a number of regulatory factors such as murine double minute 2 () (272) and p300 (263) and also allows access by modulators such as Bcl-xL (273) and Bcl2 (274) to the DNA binding domain.

The use of SAXS has helped to unify a range of functionally relevant phenomena, attributable to a structural architecture that underpins the molecular mechanism of p53 regulation. Application of SAXS was particularly relevant in this case due to presence of unstructured regions, and illustrates the utility of combinatorial techniques for elucidating mechanistic detail of transcription factors with folded domains 91 and intervening inherently flexible linkers, which are essential to function, but refractory to crystallization.

4.1.6 Summary and Aims

It is clear that regulation of transcription is maintained in a reflexive hierarchy, where DNA is wrapped around histones, which interact to form tightly packed chromatin. Hence, access to binding sites for transcription factors appears to be reliant on histone position, and dynamic unwrapping events. The unwrapped state is hypothesised to be maintained by transcription factor binding, to allow sufficient time for RNA polymerase to access template genes and elongate transcription products. Additionally, transcription factors are able to modify chromatin structures to isolate looped loci, which are thereby freed to maintain autoregulatory events in isolation from the surrounding chromatin. Activation by transcription factors typically involves dynamic modification by histone remodelling enzymes such as histone acetyl transferases (HATs) to maintain open chromatin and thereby increase the chances of unwrapping events. Repressive marks can also be generated by recruitment of histone deacetylases (HDACs) and DNA methyltransferases (DNMTs). The regulation of transcription has been described as a reflexive hierarchy because of the fact that active marks beget transcription factor binding and activation of transcription as much as transcription factor binding results in deposition of active marks. However, what is clear is that sequence specific transcription factor binding events are crucial to controlled access to DNA, and at this level, structural change imposed upon DNA is paramount for target gene regulation. Hence, an understanding of transcription factor oligomerisation, DNA bending and loop formation is important for elucidating molecular mechanisms of transcriptional regulation in the complex milieu of the nucleus. Given its utility for imaging protein/DNA interactions under physiological conditions, AFM has been used in this study to image AhR284/H6ARNT362 bound to DNA with a view to establishing domain topology relative to DNA, to differentiate between two models for DNA binding (extended or compact Fig. 4.1 B), and to assess the affect of DNA bending on the Cyp1a1 enhancer. The biochemical experiments, as described above, indicate that the PAS A domain interacts with the basic region, and may directly contact the DNA to induce DNA bending. SAXS analysis has been pursued to corroborate these observations, with a view to establishing greater structural detail in a bipartite mechanism of DNA binding specificity.

92 (A) 1. AhR ARNT PAS

PAS A Folding and dimerization

Intervening Helix1 Helix2 region + Basic Connection between bHLH DR bHLH and PAS A forms strong dimer

2. 3.

DNA

PAS A domains are close to PAS domains both the basic region and the bend DNA DNA in the complex

(B) PAS

bHLH bHLH PAS DNA DNA Extended Compact

Figure 4.1 Biochemical analysis of TrxH6mAhR287/TrxH6hARNT362 infers DNA bending at a site downstream of the XRE (Figure adapted from (9)) and current models for bHLH.PAS DNA binding. Experiments described in Section 4.1.3 implicate the PAS A region in formation of a stable AhR/ARNT dimer interface (A) (Diagram 1), where the bHLH and PAS A domains of AhR directly interact to strengthen the dimer form. EMSA analysis with supershifting antibodies directed against the basic region of AhR suggests that the PAS A domains are in close proximity to the basic region in the DNA bound complex (A) (Diagram 2). PAS A dependent enhancement of DNaseI susceptibility downstream of the XRE, combined with PAS A specific protection (see Fig. 3.15 (A)), indicates PAS A domains directly contact the DNA in this region to bend the DNA (A) (Diagram 3). AFM and SAXS analysis undertaken in this study aimed to differentiate between two models for bHLH.PAS DNA binding, either in an extended, or a compact conformation (B). 4.2 Results

4.2.1 Atomic Force Microscopy of AhR284/H6ARNT362/Cyp1a1 Enhancer Fragment Complexes

Typically, AFM experiments are conducted by depositing nanograms of DNA or protein onto mica, an atomically flat surface. For DNA binding, surfaces are coated with positively charged metal ions, which aids binding of negatively charged DNA to the negatively charged mica. Ni2+ has been optimised as an ideal divalent cation for DNA binding, correlated with the atomic radius of the metal, where larger metals such as Mn2+ or Cd2+ result in weakly bound, or unbound DNA, respectively (275). Immediately after deposition on the surface, protein/DNA complexes are able to move freely for 1-5 minutes, after which time the sample is dried in air for imaging.

Heterodimers of hAhR284/H6ARNT362 prepared as described in Chapter 3 (Fig. 3.20 A) were mixed with a 212bp fragment of the Cyp1a1 enhancer (bp -1114 to -903), containing two XRE binding sites. Final preparation of the complex, deposition onto mica, AFM, image processing and length measurements were performed by Sumay Ng in the laboratory of Dr Lisandra Martin (Monash University, Victoria). The complex was buffer exchanged into PBS and deposited onto Ni2+ treated mica, rinsed with ultra-pure water and dried under a nitrogen gas stream. AFM was performed using tapping mode in air, and data processed as described in materials and methods. Complexes were chosen for measurement if particular criteria were met, including; a) the calculated length of DNA was 72  10 nm, b) two AhR/ARNT heterodimers were found to bind the DNA, and c) a “sharp” bend was observed at the heterodimer , indicating specific binding. WsXM (see materials and methods) was utilised for measuring the DNA length and contours by tracing the DNA along the backbone (Fig. 4.2 B, i and j) green line). For DNA/protein complexes, the backbone trace (green line) was aligned with the centre of the protein along the DNA (Fig. 4.2 B, i and j), where the protein binding site was adjudged by the highest vertical measurement of protein along the DNA trace (see cartoon representation of contour interpretation). Instances of over or under estimation of dimension measurements have been accounted for in the analysis, which was performed by Drs Adam Mechler and Lisandra Martin, of Monash University. The differences between estimated and measured DNA lengths may result from scanner drifts, calibration inaccuracies and as an artefact of the drying procedure. Drying has been known to result in alterations to the tertiary structure of proteins, resulting from the loss of the hydration shell, likely causing flattening and stretching of the molecule. These potential artefacts were negated by performing a statistical analysis of a series of experiments and by using experimentally determined free DNA lengths as an internally consistent reference measurement.

93 (A) Cyp1a1 Promoter Fragment (212bp): 1 ggagagctggccctttaagagcctcacccacggttcccctcccccagctagcgtgacagcactgggacccgcgc cctctcgaccgggaaattctcggagtgggtgccaaggggagggggtcgatcgcactgtcgtgaccctgggcgcg

75 ccggttgtgagttgggtagctgggtggctgcgcgggcctccaggctcttctcacgcaactccggggcaccttgt ggccaacactcaacccatcgacccaccgacgcgcccggaggtccgagaagagtgcgttgaggccccgtggaaca

149 ccccagccaggtggggcggagacaggcagcccgacctctgcccccagaggatggagcaggctta ggggtcggtccaccccgcctctgtccgtcgggctggagacgggggtctcctacctcgtccgaat

(B) (a) (b) (i)

(c)

(d) (j)

(h) (g) (f) (e)

(C) Left arm Right arm

Length Stdev Length Stdev

Calculated DNA length 23.8 nm 34.7 nm

Bases 1-70 111-212

U shaped complexes 23.5 nm 6.1 38.1 nm 5.5 nm (heterodimers interacting) 36.0% 9.4% 59.3% 7.1%

 shaped complexes 23.5 nm 6.1 nm 24.0 nm 5.3 nm (heterodimers not interacting) 36.0% 9.4% 35.5% 6.3%

Figure 4.2 Atomic Force Microscopy (AFM) of hAhR284/H6ARNT362 bound to a fragment of the Cyp1a1 promoter. hDR284/H6ARNT362 was prepared, complexed with a 212bp fragment of the Cyp1a1 promoter containing 2 XREs (A), buffer exchanged into PBS and deposited onto freshly cleaved mica prior to imaging in air using AFM. AFM was performed by Sumay Ng and analysed by Drs Adam Mechler and Lisa Martin, Monash University, Victoria. (a) Representations of free CYP1A1 DNA and DR/ARNT and complex on a 500 nm2 mica surface, (b) unbound DR/ARNT, (c) free cyp1A1 DNA, (d) U conformation, (e) U conformation, (f) Z conformation, (g) Z conformation, (h) W conformation. The individual structures are 100 nm2 backbone trace, where green line traces along the backbone of the DNA while the blue dotted circles highlight the two AhR/ARNT dimers bound on the XREs (magnified representation of g) (j), and the purple dashed line shows that one arm of the DNA is shorter than the other in a magnified representation of e (i). Cartoon interpretations of AFM images are shown proximal to the contour images of the different complexes where a line=DNA, circle=protein; and arrow heads describe the binding sites on naked DNA. Measurements of free DNA (212bp, see Fig. 4.2 A) and AhR284/H6ARNT362/DNA complexes (Fig. 4.2 B, a-j) were performed as described in materials and methods. Numerous optimisation experiments were performed at a range of different concentrations, with the final concentration selected for analysis being 40pM AhR284/H6ARNT362/DNA. This concentration allowed visualisation of a reasonable density of complexes, while minimising larger aggregate formation. Images of DNA alone and

AhR284/H6ARNT362/DNA were collected, where the typical image is represented in Figure 4.2 (B (a)) (500 nm2 image size). This condition fortuitously allowed visualisation of free DNA, unbound heterodimer and AhR284/H6ARNT362/DNA complexes within one image. Free DNA was measured to be 66.6 nm  6.2 nm; n=39, although the estimated length of this DNA fragment was 72 nm. The discrepancy can be assumed to result from artefacts of AFM as discussed above. The height of protein in the AhR284/H6ARNT362/DNA complex was 1.1 nm  0.1 nm; n=5 (Fig. 4.2 B, e-h), where the height of DNA alone was 0.8 nm  0.1 nm; n=5 (Fig. 4.2 B, c). Protein alone was established to be 1.0 nm, with a diameter of 9.7 nm (Fig. 4.2 B, b).

After complex stabilisation, the structure appeared to adopt either a “U” or “V” conformation. The highest region was found in the middle of the DNA, represented at the bend of the “U” (Fig. 4.2 B, d). Given that two bound heterodimers could be resolved, these structures were interpreted to show a tight interaction between heterodimers, forming a dimer of dimers on the DNA. Other conformations, such as “W”, “” and “Z”, were also identified in the mixture of deposited complexes. These examples illustrated two non- interacting heterodimers bound to the DNA, with different ‘arm’ configurations. “W” shaped examples (Fig. 4.2 B, e and h) likely describe the “” form, with a slight inward bending between protein binding sites, while the “Z” form (Fig. 4.2 B, g) showed an alternate conformation of “W”. Hence, conformations “W”, “” and “Z” resolve two bound heterodimers, which were not interacting. This series of conformations represent ~50% of observed stable complexes. “L” shaped complex forms were also observed where only one heterodimer had bound to the DNA (Fig. 4.2 B, f). Other structures resolved did not show a distinct bend in spite of the presence of protein sitting on the fragment (data not shown), suggestive of non-specific colocalization.

Specific XRE binding was interpreted to be occurring where the heterodimer AhR284/H6ARNT362 induced a sharp kink in the DNA and caused it to bend (Fig. 4.2 B, d-j). This bending was consistent with biochemical data (Fig. 4.1 A) (9) with bends localised to areas that correlated with the positions of expected bending sites (Fig. 4.2 C). In the context of the enhancer region, the distal portion of the DNA fragment is predicted to represent 33% of the total length, which correlated with the measured distance (Fig. 4.2 B, j (left arm) and C) of 35.3%. Estimations of protein dimensions based on PAS domains and bHLH structures deposited in the Protein Data Bank suggest that a heterodimer with PAS domains

94 extending straight out from the C-termini of the bHLH region, at 90 from the DNA (see ‘extended’ model, Fig. 4.1 B) would have the majority of the protein density up to 16 nm from the DNA molecule. No such structures were seen in this analysis. Hence, these observations were considered more consistent with the hypothesis that the dimer forms a compact structure where the entire bHLH.PAS A dimer localises close to the DNA, evidently flattening and wrapping around the DNA (see ‘compact’ model, Fig. 4.1 B).

4.2.2 SAXS Analysis of AhR284/H6ARNT362  XRE Complexes

A preparation of heterodimeric hAhR284/H6ARNT362 was concentrated to 9 mg/ml in the absence of, and with the addition of, a 1:1 ratio of 33mer-B DNA. The flow-through from the concentration step was reserved for baseline calibration of the SAXS camera. The SAXS data collection was performed by Dr Nathan Cowieson, of Monash University at the laboratory of Prof. Jennifer Martin, The University of Queensland. Preliminary analysis of the scattering data (Fig. 4.3 B (LogI/q (nm-1), where (I) is the intensity, and (q) is the momentum transfer vector) utilised a Guinier transformation by Primus (version 3) (276). The lowest angles of scatter over the first 20 data points were removed as they were obscured by the beam-stop. The Guinier plot (ln(I)/q 2) showed that the heterodimer both in the presence and absence of DNA was monodisperse, based on the linearity of the curve at low scattering angles (inset,

Fig. 4.3 B). The Guinier plot was used to calculate the Rg value of hAhR284/H6ARNT362, resulting in an estimate of 58.840.171 Å (Fig. 4.3 A). hAhR284/H6ARNT362/33mer-B showed a decrease in Rg at 55.930.165 Å, indicating that upon binding DNA, the heterodimer became more tightly structured or stabilised around the DNA. The observation that Rg did not increase in the presence of DNA was indicative of DNA binding within the protein structure, rather than extending from the protein complex and adding to the length of the molecule. Additionally, P(r), with a maxima shift to the right, illustrated that the complex became more compact in structure when bound to DNA, suggestive of domain rearrangement upon binding. This effect is similar to that observed with circular dichroism (CD) analysis where the -helical content of bHLH proteins increases with the addition of DNA, such as bHLH factor Pho4 (277), bHLH.LZ Max homodimer (278) and a bHLH alone truncation of ARNT (279). This effect has also been identified by CD for the bHLH.PAS A region of AhR/ARNT (Chapman-Smith, A., pers. comm.).

An indirect Fourier transform of the raw scattering data was performed using Gnom (Version 4.5a) (141) to generate the distance distribution function P(r)/R (), where R is the interatomic distance (Fig. 4.3 C).

This plot highlights several structural features of the complexes. Firstly, hAhR284/H6ARNT362  DNA appears to have a maximum dimension of ~200 . Secondly, hAhR284/H6ARNT362 showed a distance distribution function that was skewed to the left, indicating that the majority of the interatomic distances 95 (A)

F F Sample Concentration Rg ( ) Dmax ( ) (mg/ml) ± AhR284/H6ARNT362 No DNA 9 58.84 0.171 200 ± AhR284/H6ARNT362 + 33mer-B 9 55.93 0.165 200

(B) AhR/ARNT No DNA AhR/ARNT +33mer-B

1.3 1.7

1.25 1.65

1.2 1.6 1.15 Ln(I) 1.55 Ln(I) 1.1 1.00E+01 1.5 1.05

1 1.45

0.95 1.4

0.9 1.35 5.00E-05 1.00E-04 1.50E-04 2.00E-04 2.50E-04 3.00E-04 3.50E-04 1.00E+00 2.50E-05 7.50E-05 1.25E-04 1.75E-04 2.25E-04 2.75E-04 3.25E-04 Log (I) Q2, Å Q2, Å

1.00E01

1.00E02 1.00E02 4.90E01 9.90E01 1.49E+00 Q, nm-1 Q, nm-1

(C)

P(r)

R, Å

Figure 4.3 Small Angle X-ray Scattering (SAXS) profiles of hAhR284/H6ARNT362 dimer alone and complexed with 33mer-B dsDNA fragments. hAhR284/H6ARNT362 was purified by IMAC and anion exchange, concentrated alone or complexed with 33mer-B to 9 mg/ml and analysed by SAXS. Table (A) shows the concentration (mg/ml), Å Radius of gyration (Rg) ( ) for each complex, illustrating that the hAhR/hARNT dimer has a more compact fold when bound to 33mer-B, relative to no DNA. The x-ray scattering pattern is shown in (B), where Log (I) represents the intensity of signal for a given scattering distance (Q, nm-1) from the primary beam. A Guinier transformation was performed using Primus and showed a linear pattern of scatter at low angles, inferring that the complexes were 2 monodisperse, and generated Rg values for each complex (Ln(I)/Q ) (inset, (B)). An indirect transform of the data was performed using Gnom to generate the distance distribution function P(r)/R(Å) for each complex (C). SAXS models were generated from this data using Dammin (Figure 4.4). were short. This type of distribution typically indicates that the scattering particle is cylindrical (280).

Additionally, the cross sectional Rg was calculated using the Primus (276) rod particle analysis (ln(I x Q)/Q2), which estimated 19 ± 0.06 for AhR/ARNT alone (data not shown). By comparison, the distance distribution function representing hAhR284/H6ARNT362/33mer-B (yellow line, Fig. 4.3 C) appeared to have a higher intensity, due to the increased scatter from the DNA. This complex also showed a change in the peak maxima, which, relative to the no DNA sample, shifted to the right. This shift indicated that the complex in the presence of DNA was more globular, having a cross sectional Rg of 23.8 ± 0.04 (data not shown). From this analysis we anticipated that the heterodimer alone in solution formed an extended cylindrical shape, and upon transition to a DNA bound state, the protein appeared to conglomerate around the DNA. The cylindrical shape was not observed for protein alone with AFM, although the effects of drying may account for this difference. The observed globular, compact DNA bound state was in agreement with the AFM data, which implicated a mechanism whereby hAhR284/H6ARNT362 flattened and wrapped around the DNA in the ‘compact’ model of bHLH.PAS DNA binding (Fig. 4.1 B). Additionally, a flattened structure where the PAS A region was in close proximity, or making direct DNA contacts would be in accord with observed PAS A dependent DNA bending and increased complex stability (Fig. 4.1 A) (9).

In order to convert the distance distribution function into a more intuitive, accessible structural format, SAXS models were generated using ab initio shape determination software Dammin (Version 5.3) (142). Ten iterations of Dammin were utilised per model, with alignment and averaging of structures performed with Damaver (Version 3.2) and Damstart (Version 1) (143), prior to generation of the final model by Dammin, utilising the molecular envelope predicted from the averaged models. A preliminary comparison of the models for hAhR284/H6ARNT362 (grey, Fig. 4.4 A) relative to hAhR284/H6ARNT362/33mer-B (yellow, Fig. 4.4 B) illustrates that the structures are of similar length, and in one orientation (left), there are no notable changes in shape with the addition of DNA (see overlay, left, Fig. 4.4 C). However, rotation of the overlay by 90 shows dramatic alterations to the heterodimer structure with the addition of DNA (right, Fig. 4.4 B and C) at the ‘top’ and ‘bottom’ of this representation. Upon closer inspection, the hAhR284/H6ARNT362/33b complex (yellow, right, Fig. 4.4 B) shows a uniform width of electron density projecting from the ‘bottom’ of the model, with relatively non-uniform density appearing at the ‘top’. From this analysis we may predict that the ‘bottom’ of the model in this case represents the DNA, while the top may indicate domain rearrangements. The corollary to this, where DNA may wrap around the cylindrical protein cannot be excluded, however this model would still position PAS A in proximity, or in contact with the DNA, in agreement with the ‘compact’ model of bHLH.PAS A DNA binding. Additionally, if the DNA wrapped around the circumference of the rod-like protein heterodimer, we may expect to see density (yellow) appearing in both orientations of the model overlay (Fig. 4.4 C). This is not in evidence, rather ‘yellow’ density 96 (A)

No DNA

(B) 90

+33merb

(C)

Overlay

Figure 4.4 ab initio SAXS models of hAhR284/ARNT362 dimer alone (grey) and complexed with 33mer-B

(yellow). Distance distribution functions (P(r)/R(Å)) for hAhR284/H6ARNT362 (grey) and hAhR284/H6ARNT362/33mer- B (yellow) (Fig. 4.3) were further manipulated using Dammin (142) to generate ab initio models of solute shape. Te iterations of Dammin prediction were performed, and these models were aligned and averaged using Damaver and Damstart (143). The averaged structure was used as the template for a final Dammin model of each complex, shown above. hAhR284/H6ARNT362 (A) and hAhR284/H6ARNT362/33mer-B (B) were overlayed (C), illustrating changes in shape upon addition of DNA. protrudes only from the top and bottom of the ‘right’ overlay (Fig. 4.4 C), and hence does not indicate that the DNA circumnavigates the rod-like protein. Interestingly, the right hand side of these models do not change with the addition of DNA, and may indicate higher order oligomerization, or dimer of dimer formation, as observed in EMSA experiments (Fig. 3.12 B) and by AFM (Fig. 4.2 B, d). However, the absence of an upturn at low Q in the Guinier analysis suggests that this solution is in fact monomeric.

4.3 Discussion

4.3.1 Hypothetical bHLH.PAS A Domain Topology of AhR/ARNT Heterodimer

Typically, SAXS modelling is concluded by performing rigid body modelling, but in this case, modelling did not prove facile. No structures of the representative domains are known, hence, structure based alignments have been utilised to generate hypothetical models from domains with proposed similarity. bHLH alignments (Fig. 3.1 J) illustrate that while hARNT shows conservation of the bHLH region relative to Myc, Max and USF, a large segment at the N-terminus of this protein is not represented by this fold, and the structure of this region (1-87) is unknown. ARNT bHLH shows highest sequence homology with USF, which was selected by Protein Homology/analogY Recognition Engine (Phyre) (281) for homology modelling (Fig. 3.1 I) of the ARNT sequence underlined in red (Fig. 3.1 J). The hAhR shows a distinctly different bHLH region, featuring a bipartite basic sequence, split by insertions, relative to the other bHLH proteins (Fig. 3.1 J), and as such, Phyre did not return a result for this sequence. Hence, a model of the ARNT predicted bHLH fold, combined with a monomer of the solution structure of Max (282) truncated to reflect the AhR bHLH, have been manually docked to represent this region of the heterodimer. The bHLH domain structures were assumed to bind DNA in a similar manner to other bHLH proteins such as Max and MyoD (Fig. 3.1 G), and were manually docked to represent such a binding event.

The PAS A region of the bHLH.PAS family of transcription factors has not been structurally determined, but the closely related Drosophila Period protein establishes a domain architecture which is predicted to be similar to other PAS A folds (Fig. 3.3 A, C) (174), and distinct from known PAS B structures, e.g. ARNT PAS B (176) (compare Fig. 3.3 C, D), as discussed in Section 3.1.3. However, the interaction of HIF2 and ARNT PAS B domains utilises the half -barrel to form the dimer interface (176). An analysis by Card et al. of the electrostatic potential of each half of the dimer indicates that the antiparallel interaction is likely directed by complementary electrostatic interactions (Blue boxes, Basic; Red boxes, Acidic; dotted lines represent interacting residues) (Fig. 4.5 A, C), as well as a hydrophobic interactions theorised to strengthen the dimer (green residues) (Fig. 4.5 A, C) (176). A comparison with 97 (A) A’ B’ A’ A*’ B’ C’

hARNT 358 hHIF 2˞ 242

C’ D’ E’

PAS B hARNT 416 hHIF 2˞ 297

A B A A* B C (B) hARNT 167 hAhR 116 DmPer 170

C hARNT 228 hAhR 177 DmPer 230

D D hARNT 285 PAS A hAhR 229 DmPer 264

E hARNT 336 hAhR 263 DmPer 303

(C)

180r HIF2˞/ARNT PAS B Per PAS A Rotation NMR model Crystal Structure

Figure 4.5 ClustalW alignment and structural comparison of PAS heterodimer ARNT/HIF2˞ PAS B and Per PAS A. JalView was used to present the alignment where Zappo colours represent the physicochemical properties of side chains ( aliphatic, aromatic, positively charged, negatively charged, hydrophilic, conformationally special, cysteine). (A) illustrates the antiparallel PAS B dimer interface with buried surfaces of hARNT and hHIF2˞ boxed in black, and surface charged residues that direct dimer formation boxed in blue (positive) and red (negative), with specific interactions indicated by dotted lines (176). Above the sequences are predicted secondary structures (174), useful for comparison with PAS A sequences (B) where rectangles represent - strands, cylinders ˞-helices and lines represent inserts of unknown structure. Note that there are no similarly charged residues in conserved structural elements of PAS A relative to PAS B, inferring a different mode of dimerization for PAS A. A cartoon representation of the crystal structure of Per PAS A (light blue) (C), compared with the NMR structure of HIF2˞ (yellow)/ARNT (lilac) PAS B dimer illlustrates distinct differences between PAS A and PAS B structures. Per PAS A (light blue) (Fig. 4.5 C) shows secondary structural similarities in the half -barrel structure relative to ARNT PAS B (lilac), although loop extensions in Per are clearly quite distinct.

An alignment of PAS A sequences of Per, hsARNT and hsAhR, illustrates that the charged residues involved in the PAS B interface of HIF2/ARNT are not conserved in the PAS A sequences (see dotted boxes), neither in charge, nor position on secondary elements (compare Fig. 4.5 A, B). However, the conserved domain fold for PAS A and PAS B suggested PAS A dimerization could occur through the equivalent interface. Hence, the secondary structural elements that form the HIF2/ARNT PAS B interface were identified in the crystal structure of Per PAS A and side chains contributing to the interface highlighted (shown in Fig. 4.6 for HIF2 PAS B (i), ARNT PAS B (ii) and Per PAS A (iii)). Next, structure-based sequence alignment of hARNT and hAhR PAS A to Per PAS A (174) identified the equivalent putative interface residues in hARNT and hAhR (Fig. 4.6 B). Note that the charged residues with the potential for electrostatic interactions were also highlighted (blue, basic; red, acidic). The selected residues in Per PAS A were mutated in silico to match either ARNT or AhR PAS A sequences (highlighted green in the structure based alignment (Fig. 4.6 B), generating a hypothetical model of the AhR/ARNT PAS A interface (Fig. 4.7 A). When rotated apart, the potential for electrostatic patches was evident, as calculated using the Adaptive Poisson-Boltzmann Solver (APBS) (283) PyMOL tool (Fig. 4.7 B), and complementary patches were highlighted in dashed white and blue circles. Importantly, the residues proposed to form the charged patches are well conserved across species (see boxed sequences, blue basic, red acidic) (Fig. 4.8 A, B). Specific residues mutated in silico for this analysis are illustrated as side chains in this proposed minimal interaction interface, as shown in Figure 4.7 C. Finally, the PAS A domains of AhR and ARNT are additionally distinguished from PAS B domains, and Per PAS A by the presence of loop extensions, for which secondary structures are not predicted using online prediction software Prof (284) (Fig. 4.9). Interestingly, based on the hypothetical electrostatic directed antiparallel dimer arrangement, loop extensions (Fig. 4.9) with well conserved basic residues (blue arrows, Fig. 4.8 A, B) feature on a single face of the dimer, with the potential to interact with the negatively charged backbone of the DNA downstream of the bHLH XRE contact site to generate DNA bending (9).

The modelled bHLH dimer and theoretical PAS A dimer were manually docked into the molecular envelope of hAhR284/H6ARNT362/33mer-B SAXS generated model (Fig. 4.4 B). While this is not proposed to define the structure of the heterodimer, or molecular details of DNA binding, the hypothetical models generated provide a reasonable theoretical mechanism to describe the SAXS models, AFM results and biochemical data in the literature. The density to the right of the structures has been excluded from the proposed domain arrangements as we hypothesise this region is an artefact of higher order oligomerisation. However, given that there was no upturn in Q at low values, and hence the 98 (A) (i) (ii) (iii)

˞A’ ˞A

ßD’ ßE (˞C’)

ßB’ ßA ßB’ ßD’ ßD ßB ßA’

˞A*’ ßE’ ßE’ ˞A’ ? ˞D

(B) ˟A ˟B ˞A ˞A* ˞B ˞C

hARNT LILEAADGFLFIVSCETGRVVYVSDSVTPVLNQPQSEWFGSTLYDQVHPDDVDKLREQLS 226 hAhR FLLQALNGFVLVVT-TDALVFYASSTIQDYLGFQQSDVIHQSVYELIHTEDRAEFQRQLH 117 DmPer GERVKEDSFCCVISMHDGIVLYTTPSITDVLGYPRDMWLGRSFIDFVHLKDRATFASQIT 291 ˟C

hARNT TSENALTGRILD-LKTGT-VKKE--GQQSSMRMCMGSRRSFICRMRCGSSSVDPVSVNRL 282 hAhR WALNPSQCTESGQGIEEATGLPQTVVCYNPDQIPPENSPLMERCFICRLRCLL------170 DmPer --TGIP----IAESRGSVPKDA------KSTFCVMLRRYRGLKSGG---- 325 ˟D ˞D

hARNT SFVRNRCRNGLGSVKDGEPHFVVVHCTGYIKAWPPAGVSLPDDDPEAGQ------G 332 hAhR ------DNSSGFLAMNFQGKLKYLHG------QNKKGKDGSIL 201 DmPer ------FGVIGRPVSYEPFRLGLTFREAPEEARPDNYMVSNG------361 ˟E

hARNT SKF-CLVAIGRLQVT 346 hAhR PPQLALFAI---ATP 213 DmPer TNMLLVICATPIKSS 376

Figure 4.6 ARNT/HIF2˞ PAS B interface surfaces infer secondary structural elements of DmPer PAS A, and by analogy, hAhR/ARNT PAS A which may form a conserved antiparallel interface architecture. ARNT (lilac)/HIF2˞ (yellow) dimer formation (Fig. 4.5, (A and C)) is proposed by Card et al. (176) to be directed by electrostatic forces, illustrated as positive (red) and negative (blue) side chains (A, (i) and (ii)). Residues of HIF2˞ which when mutated disrupt dimer formation are highlighted in green (A, (i)) (176). Secondary structures of DmPer PAS A that are homologous to the HIF2/ARNT PAS B interface are shown in aqua (A, (iii)) and side chains directed into the hypothetical interface are illustrated on the crystal structure of DmPer PAS A (174). Structure based sequence alignment of hARNT, hAhR and DmPer PAS A (B) illustrates potential interface residues (green) for AhR/ARNT PAS A antiparallel heterodimer formation and boxed residues (blue (positive) and red (negative)) represent potential electrostatic side chains (see Fig. 4.7 (B)). Note, underlined sequences of DmPer are unstructured and hence are not represented in the crystal structure. Predicted secondary structures are represented above sequences; rectangles = - strands, cylinders = -helices; and waves = loops. (A) (B) ARNT

AhR/ARNT AhRDR Hypothetical PAS A -sheet Dimer

(C) K186 P195 L205 T72 D74 D173 K184 F78 Y79 V70 D191 V196 V305 S192 L207 S190 G341 Q182 H307 A209 V68 S81 L176 S82 Y188 A339 T83 I178 T309 N180 Y87 V187 L337 Y311 A178 N65 D86 K313 E182

AhR ARNT

Figure 4.7 Structure based sequence alignment targeted in silico mutagenesis of DmPer PAS A proposed antiparallel dimer interface supports potential electrostatic directed hAhR/hARNT dimer formation. In silico mutagenesis was performed using the mutagenesis function of PyMOL where DmPer PAS A (174) proposed dimer interface residues were mutated to match the sequence of hAhR (aqua) and hARNT (grey) (detailed in Fig. 4.6 (B)). The interface surfaces were then manually docked using PyMOL to form an antiparallel dimer (A) with potential electrostatic patches evident when the dimer interface is rotated apart (dashed circles), as calculated using APBS (B) (283) (highlighted in red (negative) and blue (positive)), orienting the hypothetical dimer form, illustrated in surface (A) and cartoon representations (C). The exposed interface (C) illustrates specific residues of hAhR and hARNT proposed to form the interaction surface. (A) A B A A* B C

mouseAhR LLQALNGFVLVVTADALVFYASSTIQDYLGFQQSDVIHQSVYELIHTEDRAEFQRQLHWA 175 ratAhR LLQALNGFVLVVTADALVFYASSTIQDYLGFQQSDVIHQSVYELIHTEDRAEFQRQLHWA 175 whaleAhR LLQALNGFVLVVTTDALVFYASSTIQDYLGFQQSDVIHQSVYELIHTEDRAEFQRQLHWA 176 sealAhR LLQALNGFVLVVTTDALVFYASSTIQDYLGFQQSDVIHQSVYELIHTEDRGEFQRQLHWT 176 humanAhR LLQALNGFVLVVTTDALVFYASSTIQDYLGFQQSDVIHQSVYELIHTEDRAEFQRQLHWA 177 guineapigAhR LLQALNGFVLAVTTDALVFYVSSTIQDYLGFQQSDVIHQSVYELIHTEDRAEFQRQLHWA 176 rabbitAhR LLQALNGFVLVVTVDALVFYASSTIQDYLGFQQSDVIHQSVYELIHTEDRAEFQRQLHWA 176 ternAhR LLQALNGFVLVVTADALVFYVSSTIQDYLGFQQSDIIHQSVFELIHTEDRPEFQRQLHWA 177 chickenAhR LLQALNGFVLVVTADALVFYVSSTIQDYLGFQQSDIIHQSVFELIHTEDRPEFQRQLHWA 176 **********.**.******.**************:*****:******** ********: PAS A D E

mouseAhR GFLAMNFQGRLKYLHGQNKKGKDGALLPPQLALFAIATP 265 ratAhR GFLAMNFQGRLKYLHGQNKKGKDGALLPPQLALFAIATP 269 whaleAhR GFLAMNFQGRL KYLHGQNKKGKDGSILPPQLALFAIATP 270 sealAhR GFLAMNFQGRLKYLHGQNKKGKDGSVLPPQLALFAIATP 270 humanAhR GFLAMNFQGKLKYLHGQKKKGKDGSILPPQLALFAIATP 271 guineapigAhR GFLAMNFQGRLKYLHGQNKKGKDGSVLPPQLALFAIATP 270 rabbitAhR GFLAMNFQGRLKFLHGQNKKGKDGSLLPPQLALFAIATP 269 ternAhR GFLAMNFQGRLKFLHGQNKKGKDGATLSPQLALF AVATP 271 chickenAhR GFLAMNFQGRLKFLHGQNKKGKDGAALSPQLALFAVATP 270 *********:**:****:******: *.*******:***  PAS A

(B) A B A A* B C

MouseARNT DGFLFIVSCETGRVVYVSDSVTPVLNQPQSEWFGSTLYDQVHPDDVDKLREQLSTSENALTG------234 RatARNT DGFLFIVSCETGRVVYVSDSVTPVLNQPQSEWFGSTLYDQVHPDDVDKLREQLSTSENALTG------234 HumanARNT DGFLFIVSCETGRVVYVSDSVTPVLNQPQSEWFGSTLYDQVHPDDVDKLREQLSTSENALTG------234 GuineapigARNT DGFLFIVSCETGRVVYVSDSVTPVLSQPQSEWFGSTLYDQVHPDDVDKLREQLSTSESALTG------234 SealARNT DGFLFIVSCETGRVVYVSDSVTPVLNQPQSEWFGSTLYDQVHPDDVDKLREQLSTSENALTG------234 RabbitARNT DGFLFIVSCETGRVVYVSDSVTPVLNQPQSEWFGSTLYDQVHPDDVDKLREQLSTSENALTG------234 ChickenARNT DGFLFIVSCETGRVVYVSDSVTPVLNQPQSEWFGSTLYDQVHPDDVGKLREQLSTSENALTEGTKPWC 240 SalmonARNT DGFLFVVSCESGRVVYVSDSLTPVLNQSQSDWLGSSLYDQLHPDDGDKLREQLSTAESNNTG------210 *****:****:*********:****.*.**:*:**:****:**** .********:*. * PAS A D E

MouseARNT MNRLSFLRNRCRNGLGSVKEGEPHFVVVHCTGYIKAWPPAGVSLPDDDPEAGQGSKFCLVAIGRLQVTS 347 RatARNT MNRLSFLRNRCRNGLGSVKEGEPHFVVVHCTGYIKAWPPAGVSLPDDDPEAGQGSKFCLVAIGRLQVTS 338 HumanARNT VNRLSFVRNRCRNGLGSVKDGEPHFVVVHCTGYIKAWPPAGVSLPDDDPEAGQGSKFCLVAIGRLQVTS 338 GuineapigARNT MNRLSFVRNRCRNGLGSVKDGEPHFVVVHCTGYIKAWPPAGVSLPDDDPEAGQGSKFCLVAIGRLQVTS 338 SealARNT MNRLSFVRNRCRNGLGSVKDGEPHFVVVHCTGYIKAWPPAGVSLPDDDPEAGQGSKFCLVAIGRLQVTS 338 RabbitARNT MNRLSFVRNRCRNGLGSVKDGEPHFVVVHCTG YIKAWPPAGVSLPDDDPEAGQGSKFCLVAIGRLQVTS 338 ChickenARNT VNRLSFMRNRCRNGLGATKDGEPHYVVVHCTGYIKAWPPAGVSLPDDDPDAGQGSKFCLVAIGRLQVTS 368 SalmonARNT MNRLNFLRSRNRNGLGPPKDGEPQYVVVHCTGYIKSWPPTGVNLTDEEADNILGSRYCLVAIGRLQVTS 323 :***.*:*.* *****. *:***::**********:***:**.*.*::.: **::************

Figure 4.8 Sequence alignments show conservation of proposed interface residues, electrostatic side chains and K/R rich loops of AhR and ARNT. The dimer interface proposed to be oriented by electrostatic interactions is supported by sequence conservation of interface residues of PAS A in AhR and ARNT, where predicted interface secondary structures are boxed in black; residues predicted to be directed towards the interface highlighted in green; basic residues, blue; and acidic residues, red. Although the PAS A regions of AhR and ARNT are broadly well conserved, where highlighted sequences diverge in AhR (mAhR R236,), a basic residue, Lys, replaces Arg, maintaining proposed electrostatic patches. Additionally, surface loop basic residues proposed to make DNA contacts (Fig. 4.9) are also completely conserved (see ). Note that sequence alignments shown are not contiguous. + 32 aa + 15 aa + 5 aa + + + + + + +

a a -7aa + 8 aa 5

AhR ARNT + 27aa

Figure 4.9 Variable loop length outside the PAS core for hAhR and hARNT relative to the structure determined for DmPer PAS A. Loop extensions of hARNT (grey) and hAhR (aqua) PAS A domains are illustrated on the crystal structure of DmPer PAS A (Yildiz et al., 2005) as dashed lines including insert 1 (red), insert 2 (green) and insert 3 (blue). The effect of these loops on dimer formation and DNA binding can only be hypothesised with current data, but the presence of conserved basic residues in insert 3 of AhR and insert 2 of ARNT, represented by ‘+’, (Fig. 4.8 (A, B)) may suggest a role for these regions in DNA binding (Fig. 4.11). protein should represent a monomeric species, this region of density may be filled by the residues for which no structure has been established such as the PAS A surface loops and large N-terminal extensions. As discussed above, positioning the DNA at the bottom of the model, based on the uniformity of the density at this site, frames the position of the heterodimer within the molecular envelope. A crystallized, bent fragment of DNA was truncated to a 33mer (PDBID: 2v6e) (285) and initially positioned at the bottom of the hAhR284/H6ARNT362/33mer-B SAXS model, although this fragment does not appear to have the correct bend to fill this region (Fig. 4.10). Examination of the hAhR284/H6ARNT362/33mer-B model shows a cavity in the centre, immediately ‘above’ the predicted position of the DNA, and potentially describes a region of reduced density between the bHLH and PAS A domain pairs (Fig. 4.10). The hypothetical PAS A dimer was found to be too large to fit the density at the left of the model. Positioning the bHLH dimer to the left and PAS A dimer at the right, with the positively charged loops projecting towards the DNA seems a better fit for the SAXS generated molecular envelope. However, there remained regions of the density which were not filled by these domains and are predicted to be occupied by the linker regions, the N-terminal extension of ARNT and the surface loops of the PAS A dimer (Fig. 4.11).

4.3.2 Future Direction

Nuclear containment of the eukaryotic genome requires extensive wrapping by histones into nucleosomes, and weaving of nucleosomes into dense chromatin. Regions of transcriptionally accessible chromatin, or euchromatin, have been theorised to localise to ‘transcription factories’ by unknown mechanisms to maintain activity in a transcription factor rich region of the nucleus. Transcription, as discussed, typically requires chromatin remodelling to either initiate, or extend transient unwrapping events to allow transcription to proceed. Remodelling also occurs by recruiting minimal transcriptional machinery to proximal promoter regions through enhancer looping mechanisms, understood to be dependent, in some cases, on transcription factor oligomerization, and certainly on the capacity for these factors to bind and bend DNA in a sequence specific manner. Typically, bending events to control transcriptional output are on a scale smaller than the persistence length (147bp), and these events are theorised to result from a balance between the propensity for a given sequence to (i) be wrapped by nucleosomes, or (ii) be bound by transcription factors; where the balance may be shifted by sequence specific bendability, and/or the presence of intrinsic sequence defined static bends to modulate the binding outcomes.

Chromatin remodelling by AhR/ARNT XRE binding by ligand activated AhR/ARNT has been shown to result in chromatin remodelling to increase the nuclease accessibility of an enhancer region 1.6 - 0.7kb distant from the proximal promoter 99 Figure 4.10 Hypothetical PAS A dimer, bHLH dimer and DNA fitting into SAXS models of hAhR284/ARNT362/33mer. The Per PAS A crystal structure (174) and predicted dimer interface contacts (Fig. 4.6) were utilised to generate a hypothetical hAhR (aqua)/hARNT (grey) PAS A dimer. ARNT bHLH homology model (Fig. 3.1) was fitted into the models. However, AhR bHLH did not have sufficient homology to known structures, hence, the NMR solution structure of bHLH.LZ protein Max (PDBID: 1r05 (282), truncated to similar helical lengths to those predicted for AhR was fitted into the model (Fig. 1.4). The crystal structure of a DNA fragment was truncated to 33bp (PDBID: 2v6e (285)) for inclusion in the +33mer-B model. Loop extensions of AhR and ARNT (Figs. 4.7 and 4.8) and N-terminal extensions were further fitted into the +33mer-B structure (continued Fig. 4.11). From this manual docking, it is evident that the space described by the SAXS models includes a region at the right which is possibly due to higher order oligomerisation effects. 28aa

24aa 32aa

5aa J J J J J 85 J aa J J 24aa

Figure 4.11 Hypothetical hAhR284/ARNT362/33mer complex SAXS generated model annotated with unknown structures of surface loops and N-terminal extensions. ARNT bHLH homology model (Fig. 3.1) does not include 85 residues at the N-terminus, of unknown structure, nor the His tag. Hence, a black dashed circle represents this region. The structure of bHLH.LZ protein Max (PDBID: 1r05 (282)) (aqua), truncated to represent the HLH region of AhR, does not describe a further 24 residues within the N-terminus, represented by a blue dashed circle. The Per PAS A crystal structure (174), as illustrated in Figure 4.9, does not include extended surface loops of hAhR (aqua) and hARNT (grey) PAS A with basic amino acids highlighted with ‘J’. Additionally, linking sequences including 34 residues for hAhR and 29 residues for hARNT have not been annotated. region. Intriguingly, a region spanning between the enhancer and promoter shows no change in nuclease digestion, inferring that chromatin remodelling occurring in the enhancer region is able to act in trans to activate the promoter, which does not contain any XREs (242). Evidence for enhancer looping by AhR/ARNT, while proposed to be required for target gene induction, is currently lacking due to the experimental paradigm of reporter gene assays as a read-out of functional dimer binding events. DNA bending by AhR/ARNT has already been established to be PAS A dependent (9), as defined by DNAseI accessibility. It is interesting to note that the region of DNA required for stabilisation of soluble complex (Chapter 3), and undergoing PAS A dependent DNA bending, has been characterised as a GC rich region (9). Given that the GGGCCC sequence discussed in Section 4.1.1 has the potential to form positive static bends, it is possible that the interplay between DNA flexibility and the novel role for PAS A in bending this region may be crucial for chromatin remodelling, and therefore transcriptional regulation, by AhR/ARNT. At this stage, the biological relevance of bending and confirmation of a looping mechanism of gene activation await investigation at a chromatin level. The utility of reporter systems for analysis of what are effectively chromatin remodelling phenomena is fundamentally limited. In spite of this limitation, these systems are ubiquitous in transcription factor biology investigations. Hence, experiments using modified chromatin by homologous recombination of chromosomal fragments may provide a better paradigm for analysis of effects of enhancer composition, response element distances and proposed bipartite response elements for bHLH.PAS factor target gene regulation.

Testing the validity of the predicted AhR/ARNT PAS A interface A structure based sequence alignment of Drosophila Per, AhR and ARNT illustrated the potential for conserved secondary structural elements, forming the PAS fold of these structurally uncharacterised transcription factors. The PAS B dimer interface of HIF2/ARNT utilises the typical PAS fold, forming an antiparallel electrostatic interaction surface on the -sheet. In silico mutagenesis of the -sheet of Per PAS A to reflect the sequences of AhR and ARNT generated two hypothetical interfaces. Electrostatic potential was calculated using APBS, generating surface electrostatic patches that were theorised to orient an antiparallel AhR/ARNT PAS A dimer. In order to assess the validity of this model, mutagenesis studies would be required. Charged residues predicted to localise to the interface that show strong sequence conservation between species would be targeted for mutagenesis; to a neutral sidechain such as alanine; and charge swap mutations on either AhR or ARNT. The affect of mutation on dimer formation can be assessed using SEC (9) of bacterially co-expressed AhR/ARNT bHLH.PAS A. Analysis of DNA binding affinity by EMSA analysis may also be utilised to examine effects of dimer interface disruption on the predicted bipartite binding mechanism. If charge swap mutations on either the AhR or ARNT PAS A -sheet were shown to weaken or prevent dimer formation, it would suggest the importance of the proposed electrostatic interaction to direct dimerization. Finally, the mutant AhR

100 and ARNT proteins would be combined to assess whether a charge swap interface could rescue dimer formation, confirming the role for the charged patches in dimerization of bHLH.PAS transcription factors.

In vitro analysis of DNA bending A more direct analysis of DNA bound protein structure using AFM and SAXS experiments add gross domain topology to observations of DNA bending to implicate PAS A in either close proximity to the DNA, or making direct contacts to mediate enhancer bending. This observation is in agreement with evidence in the literature that suggests DNA bending is PAS A dependent (9). Evidence also suggests PAS A dependent masking of an antigenic region of AhR bHLH in the DNA bound state, potentially due to observed flattening and wrapping around and along the DNA, indicative of cooperative DNA and protein rearrangements (36). AFM experiments additionally indicated the possibility of oligomer formation to loop the Cyp1a1 enhancer, however, we do not yet have evidence that these dimer of dimer forms are not simply an artefact of protein purity and concentration. AFM with the entire Cyp1a1 enhancer sequence, and analysis with increasing concentrations of protein may aid in establishing a looping model for gene activation, and may resolve ordered oligomerisation, which would argue against non-specific aggregation. Ordered assembly of dimers of dimers may also be assayed using FRET. By labelling DNA downstream of the XRE with a fluorescent donor tag, labelling a second pool of DNA downstream of the XRE with an acceptor tag, and labelling a third sample of DNA upstream of the XRE with an acceptor tag, spatial proximity of 3’ and 5’ ends of the DNA may be identified (286). After complex formation, AhR/ARNT/XRE samples would be mixed, and the intensity of acceptor fluorescence measured. Provided ordered, directional dimer of dimer formation was occurring, we would anticipate that this should result in apposition of 5’/5’ ends, or 5’/3’ ends of bound DNA. Failing generation of any signal, it may be concluded that cross-shaped structures were forming, preventing FRET from occurring. In the case that both mixtures were equally able to give a FRET signal, we may assume that the resultant oligomeric forms are either flexible, or random in their configuration. Another technique that may be useful for determining loop formation is Tethered Particle Motion (TPM), where the DNA is tethered between a microscope slide and a microsphere such that Brownian motion of the sphere can be measured. DNA that is not looped has the greatest microsphere ‘excursion lengths’, where looping reduces the range of motion of the spheres (287). Typically DNA lengths corresponding to ~1000bp are utilised in these systems, making the Cyp1a1 enhancer an ideal test case.

SAXS analysis of AhR/ARNT/XRE The application of SAXS in this investigation indicated that a globular protein structure was adopted upon binding DNA, confirming the AFM observation that the protein flattens around and along the DNA. SAXS also indicated that the bHLH.PAS A dimer/DNA complex did not form a ‘T’ shaped structure that is typical of other members of the bHLH family (Fig. 3.2 E). Further experiments may help to identify the 101 precise positioning of the PAS A dimer relative to DNA within current models. Although rigid body modelling has not been possible due to the paucity of molecular structures for this family of proteins, Small angle neutron scattering (SANS) may prove useful for this system. Protein and nucleic acids undergoing deuterium exchange have different neutron scattering properties that provide a means of contrasting between these components in a complex. At ~43% D2O, protein neutron scatter ‘matches out’ with buffer scatter, and hence, this density ‘disappears’, while DNA matches out at ~72% (288,

289). Hence, by performing SANS experiments at a range of D2O concentrations, DNA alone can be visualised in the structural context of a protein complex. Expression of chimaeric proteins with a leucine zipper dimerization region in place of the PAS A dimer interface (9), in combination with SANS to identify the DNA position should help to define the structure of the bHLH region, as the leucine zipper could be modelled into the defined molecular envelope using known structures. Using this approach, the location of the PAS A region may be identified and this may help with design of crosslinking experiments to identify the sites of proposed PAS A/DNA contacts.

In vitro protein/DNA crosslinking Reversible crosslinking reagents are required to achieve facile mass spectrometric identification of crosslinked peptides. Typical reversible crosslinking reagent formaldehyde, after reversal of crosslinks, leaves behind a heterogeneous modification pattern (290), which can make peptide identification difficult. Targeting DNA binding sites by labelling specific nucleotides with a photoactivatable crosslinking reagent, such as iodo-Uracil, or iodo-Cytosine is a preferable technique, allowing interrogation of peptide interactions that occur in the minor or major groove of the response element. However, successful crosslinking is dependent on the proximity of the crosslinking reagent to the contacting amino acid, and also the side-chain chemistry of the proximal side-chain. Another option is to utilise backbone crosslinking reagents such as azidophenacylbromide, although this is not a reversible crosslink. Site-specific incorporation of a phosphorothioate in a single strand of the DNA allows targeted labelling with the crosslinking reagent. Photoactivation of the protein/DNA complex, followed tryptic digestion, ethanol precipitation, rehydration, nuclease treatment and electrospray MS should allow identification of any DNA bound peptides with a modification of defined mass (291). Preliminary predictions based on a hypothetical antiparallel dimer interface indicate that the basic–rich surface loops of AhR and ARNT PAS A may be required for backbone contacts in the asymmetrically bent region downstream of the XRE, hence backbone labelling may prove a useful technique. Additionally, mutagenesis of the surface loops, followed by DNaseI sensitivity assays may reveal detail of any alterations to the degree of DNA bending induced by mutant AhR/ARNT complexes (9).

102 Mapping bipartite DNA binding Mapping of defined amino acid:DNA contacts would finally allow an investigation of the affect of altering DNA bending by mutagenesis of AhR and examining the effect on target gene function in a cellular context. Current predictions indicate the potential for bipartite DNA recognition, where prototypical, bHLH specified XRE binding localizes the transcription factor to target genes, while a downstream GC rich region undergoes bending by the PAS A region via backbone interactions, culminating in chromatin remodelling. This domain arrangement is predicted to result in sequence specific contacts within the N- terminal bHLH, and potentially backbone contacts by the more C-terminal PAS A region is typical of another family of bipartite DNA binding proteins, the C. elegans Tc1 transposase family (292, 293). Hence, this may represent a more general mechanism to increase DNA binding affinity, whilst maintaining some elasticity in binding specificity. Insights into the mechanism of AhR/ARNT target gene induction may also be extrapolated to other bHLH.PAS family members. Evidence for a similar mechanism of HIF1 bipartite DNA binding is evident for three target genes, including VEGF, EPO (294) and insulin-like growth factor binding protein-1 (IGFBP-1) (295), which feature a sequence analogous to the AhR/ARNT bending site, termed the HIF ancillary sequence (HAS), 8bp downstream of the hypoxia response element, where PAS dependent DNA bending is seen (Chapman-Smith, pers. comm.). A comprehensive understanding of the protein structural features, combined with information about spatial and sequence specific DNA requirements that are cooperatively able to induce primary events to achieve chromatin remodelling, will unveil the mechanics of rapid target gene activation at the core of bHLH.PAS transcription factor function.

103

Rational Mutagenesis of mAhR Ligand Binding Domain to Discover Mutants with Altered Inducibility Chapter 5: Rational Mutagenesis of mAhR Ligand Binding Domain to Discover Mutants with Altered Inducibility

5.1 Introduction

5.1.1 PAS domains as environmental sensors through cofactor and ligand binding

Ligand activation of AhR is known to result from direct binding of an activating molecule to the PAS B region of AhR (see Section 1.2.1). PAS B is not unique in its capacity to coordinate a small molecule to effect regulatory functions. As discussed in Section 3.1.2, PAS domains of sensor proteins are known to utilise prosthetic groups in order to detect environmental stimuli. These factors include photoreceptors which have a chromophore bound within the PAS core to detect light, such as 4-hydroxycinnamate, or flavin mononucleotide (171, 179); redox-sensitive regulators that bind cofactor FAD (172, 296); and oxygen sensing PAS proteins that bind heme (183). Another small-molecule detector, CitA, detects environmental citrate levels by direct binding of citrate within the PAS domain to regulate metabolism and transport of citrate in Klebsiella pneumoniae (184). Structural characterisation of a number of cofactor bound PAS domains indicates that small molecule binding events are typically in the same general region of the -sheet, with some interactions contributed by the PAS core region (PAS domain regional nomenclature is illustrated in Fig. 3.3 B).

Light-sensitive Photoactive Yellow Protein (PYP) from Ectothiorhodospira halophila forms a covalent linkage from Cys 69 to chromophore 4-hydroxycinnamate (Fig. 5.1 A). The position of the cofactor is maintained in the hydrophobic -helical core by electrostatic interaction with Arg 52, and hydrogen bonds with Tyr 42 and Glu 46 (4). Oxygen sensor histidine kinase, Bradyrhizobium japonicum FixL binds a cofactor Heme moiety within its PAS domain in order to detect oxygen in a position on the opposite side of the pocket, relative to the hydroxycinnamate of PYP (Fig. 5.1 B). The heme iron is coordinated to histidine 200 in the -helical connector; the propionate groups of heme are also hydrogen bonded to this region. The heme moiety has also been shown to bind within the hydrophobic core of the PAS domain, where hydrophobic residues in the -barrel and -helical core align in the same plane as the protoporphyrin ring. The major difference noted between PYP and FixL PAS domain structures is the position of the F-helical connector, where the helix is shifted relative to PYP, to open the pocket of FixL to accommodate the heme moiety (6). CitA is shown to bind citrate by coordinate hydrogen bonding by the -scaffold and N-terminal -strand and -helix, similar to the PAS core (see Section 3.1.2) (Fig. 5.1 C) (184). The structure of the PAS A domain of Human PAS Kinase (hPASK) has been determined by NMR. This study utilised a small molecule library to screen potential ligands, 104 Small Molecule Binding by PAS Domains

(A) (B) PAS Arg 52 core PAS core A2 A1 C N A2 A1 B5 Tyr 42 C Glu 46 B2 N Hydroxycinnamate B4 B3 A3 A3 B4 Cys 69 B1 B1 B3 B2

A4 His 200 B5 Heme A4 ‘-helicalF ‘-helical connector’ connector’

(C) (D)

‘-helical PAS ‘A4’ A2 connector’ core B2 A1

N B1 B5 B3

B5 A3 B2 C B3 B4 PAS B4 A1 A4 core B1 ‘-helical connector’ N Citrate C

Figure 5.1 Structures of cofactor bound sensor PAS domains. Sensor PAS protein regulators coordinate cofactor molecules which are typically sensitive to particular environmental stimuli, such as light and oxygen. PAS domains shown are labelled with reference to the Per structures, representing the typical PAS fold and orienting the different PAS structures in this context, where A=-helix; B=-strand (Fig. 3.3 (A) and (B)). Light sensor Photoactive Yellow Protein (PYP) binds cofactor hydroxycinnamate (black) through a covalent bond which positions the chromophore in the hydrophobic PAS core, determined by x-ray crystallography (PDBID: 2PHY) (A) (4). The crystal structure of Bradyrhizobium japonicum sensor Histidine kinase FixL PAS domain shows non-covalent bonding to an oxygen sensing heme moiety (stick figure) (PDBID: 1DRM) (B) (6). Additionally, PAS domains can function to directly detect small molecule ligands, such as shown in the crystal structure of sensor kinase CitA (Klebsiella pneumoniae) (184) (C), which binds citrate (yellow/red stick figure) by coordinate hydrogen bonding. Homo sapiens PAS Kinase PAS A NMR solution structure identified residues undergoing chemical shift (pink, small shift; red, large shift) with addition of hydrophobic ligands from a chemical screen (D) (297). with the resulting molecules identified typically featuring two hydrophobic ring structures spaced up to two C-C bond lengths apart and the strongest affinity identified KD ~13 M. Addition of saturating concentrations of the highest affinity ligand resulted in chemical shifts for residues within the hydrophobic core of the protein, and showed binding in a region similar to that identified for CitA (Fig. 5.1 D). Perturbations to the overall PAS domain structure were shown to occur in the ligand bound state, with the observation that the apo binding cavity required structural remodelling to accommodate identified ligands. The effect of ligand binding, and biologically relevant ligands are yet to be identified, although the authors established that small molecule binding affects protein structure within a region relevant to kinase regulation (297). What is immediately evident is that small molecule binding generally occurs in the hydrophobic core of PAS domains, more specifically, where the -barrel and PAS core typically cup small molecules. Some PAS domains additionally feature a contribution by the -helical connector (Fig. 5.1 B), although this region seems more relevant for ligand selectivity due to size constraints imposed by the position and size of this helix. For AhR PAS B, no ligand binding events have been characterised structurally due to poor folding of this region in the absence of chaperone proteins. However, a great deal of information is available on the spectrum of ligands that AhR is able to bind.

5.1.2 AhR Ligands; Synthetic and Natural Compounds

Much of the research focused on ligand binding points to the capacity for the AhR Ligand Binding Domain (LBD) to accommodate a range of ligand structures. The activation paradigms by which the potency of ligands are adjudged include their capacity to compete with direct binding by radiolabelled TCDD; induction of a synthetic promoter containing two Cyp1a1 minimal XRE fragments; enhancement of the Cyp1a1 encoded enzyme, Cytochrome P4501A1 as measured by ethoxyresorufin-O-deethlyase (EROD) activity. Given the effects of targeted disruption of AhR on mouse development (298); xenobiotic independent activation of AhR by suspension culture (116) and oxidized LDLs (59); and AhR function as a ubiquitin ligase (48), it appears that TCDD binding and induction of Cyp1a1 may be a narrow frame for observing ligand potency. However, given the vast amount of data generated from investigations that subscribe to these paradigms, relative potencies do provide a qualitative understanding of the binding capacity of the AhR LBD.

Xenobiotic, synthetic ligands are amongst the best characterised activating species, including halogenated aromatic aryl hydrocarbons (HAHs) and non-halogenated polyaromatic hydrocarbons (PAHs). Of these, the metabolically stable HAHs, such as TCDD (Fig. 5.2) and 3,4,3’,4’,5’- Pentachlorobiphenyl are the most potent AhR ligands, with pM to nM binding affinities, while PAHs such as 3MC, B[a]P, -Naphthoflavone (BNF) have similar nM affinity, although display weaker activation, 105 AhR Ligands

˞-Naphthoflavone 2,3,7,8-Tetrachlorodibenzo-p-dioxin

Benzo[a]pyrene 3’-Methoxy-4’-nitroflavone

3-Methyl-cholanthrene Lipoxin A4

-Naphthoflavone

YH439

6-formylindolo(3,2b)carbazole Prostaglandin G(2) Figure 5.2 Structures of a sample of AhR ligands. AhR agonists are typically characterised as polyaromatic hydrocarbons (PAH), including xenobiotic chemicals such as 2,3,7,8-Tetrachlorodibenzo-p-dioxin (TCDD), 3-Methyl-cholanthrene (3MC), ˟-Naphthoflavone (BNF) and Benzo[a]pyrene (B[a]P). Endogenous PAH ligand 6-formylindolo(3,2b)carbazole (FICZ) was recently identified in acetonitrile extracts of UV-B irradiated keratinocytes (Fritsche, Schafer et al., 2007). ˞-Naphthoflavone (ANF) has been shown to function as an antagonist at low concentration, and shows agonist activity at high concentration, while 3’-Methoxy-4’-nitroflavone (3M4NF) acts as an early antagonist, preventing nuclear translocation of AhR. Atypical AhR ligands shown include YH439 and proposed endogenous ligands Lipoxin A4 (LXA4) and Prostaglandin G(2) (PGG2). potentially due to their rapid degradation (Fig. 5.2) (112). Quantitative structure-activity relationship analyses on HAHs and PAHs reveal ligand properties that are particularly relevant to affinity, including hydrophobicity, steric effects, hydrogen bonding, dispersion and electrostatic potential (as reviewed in 33, 299). A spatial alignment of known ligands, correlated with effect on AhR activity, has allowed estimates of LBD capacity. This indicates a pocket with dimensions of 14 x 12 and predicts a limitation of 5 in the third dimension, inferring a requirement for planar ligands (300). Although HAHs represent the highest affinity ligands, the variation in structures of chemicals found to activate AhR is vast (Fig. 5.2), and typically requires only a single aryl ring structure to generate a low affinity activation response (301). Included amongst the exogenous AhR ligands are atypical molecules such as YH439, which do not feature the distinctive 6C aryl ring structure (Fig. 5.2) (302). Interestingly, a specific group of arylhydrocarbons are able to repress AhR activity. Molecules with pure antagonistic activity include 3,5,4’-trihydroxystilbene () (303), and 3’-methoxy-4’-nitroflavone (3M4NF) (304) (Fig. 5.2). Minimal changes to the structure of BNF to form ANF results in a molecule with partial antagonist function and partial agonist activity (305) (Fig. 5.2).

Clearly, synthetic ligands, while representing the highest affinity molecules, bear limited resemblance to agents present during the evolution of developmentally active mammalian AhR. Hence, current research focuses efforts on identification and characterisation of ligands present in the natural environment, such as plant pyrroles and molecules derived endogenously. Dietary compounds that have been isolated as AhR ligands have been extracted from plants in lipophilic solvents. AhR ligand Indole-3-carbinol and derivatives are secondary metabolites of glucobrassicin from plants such as broccoli and Brussels sprouts. The most potent of these derivatives is Indolo[3,2-b]carbazole (ICZ) with

KD between 22-90nM (306), although this ligand displays lower EROD activity than TCDD due to rapid metabolism (306). Another class of dietary ligand are the plant polyphenols termed flavonoids. The only flavonoid confirmed by TCDD competition for LBD binding is quercetin, and although the interaction is weak, 22-fold induction of Cyp1a1 has been detected after application of 10 M quercetin to MCF-7 cells (307). Estimates predict consumption of ~1g of flavonoids per day in the average human diet, hence these lower affinity activators are believed to be physiologically relevant for AhR activation.

Knowledge generated from studies of AhR knockout animals as well as the ubiquitous expression pattern of AhR has implicated diverse roles for the AhR in physiologic and adaptive responses. Endogenous mechanisms responsible for activation in a temporal and tissue specific manner have eluded researchers for many years, but evidence has recently accumulated for endogenous ligands and/or post-translational modifications that regulate AhR activity and subcellular localization. Typically, potential ligands identified in mammalian cells have been analysed in vitro for their capacity to activate

106 AhR. In a more targeted manner, other ligands have been identified from extracts of cells that show potent AhR activation.

Endogenous ligands which have been identified, but not conclusively demonstrated to initiate a particular signalling outcome in a cellular context include , indoles and metabolites (as reviewed by 33). The observation that TCDD can disrupt heme biosynthesis highlighted the potential for a link between AhR and heme metabolism (308). Association of Cyp1a1 with heme metabolism additionally prompted an examination of heme metabolites as potential Cyp1a1 inducers. In this study, tetrapyrroles hemin, biliverdin and were found to induce Cyp1a1 mRNA in Hepa1c1c7 cells (101). Bilirubin was the most potent, exhibiting detectable elevation of Cyp1a1 at 1 M after 3 hours treatment. Bilirubin was the only new ligand identified that could induce binding to XRE containing [5’ TTGCGTG 3’] DNA sequences in electrophoretic mobility shift assays (EMSA) (101). Bilirubin concentration in human serum ranges from 5-20 M, but is typically bound by serum albumin, so it is unlikely that this activation of AhR is physiologically relevant (as reviewed by 101). However, this work represented the first evidence of an endogenous heme metabolite that could act as an AhR ligand to induce Cyp1a1 transcription.

Preliminary studies in rat showed that the highest amount of AhR endogenous ligand was found in lung extracts. Subsequent isolation of a ligand from porcine lung by organic extraction and purification allowed identification of the chemical structure of a novel endogenous ligand, 2-(1H-indole-3’-carbonyl)- thiazole-4-carboxylic acid methyl ester (ITE). Reporter gene analysis demonstrated a similar potency to -naphthoflavone, while competition for TCDD binding showed ITE was competent for binding mouse and human AhR (309). The authors hypothesize that ITE is formed after a condensation reaction of tryptophan and cysteine, although this was not demonstrated in vitro or in vivo. Biological output resulting from the presence of ITE, and the typical endogenous concentrations of this ligand were not established.

Hepatoma cell lines that lack functional Cyp1a1 enzyme (C37) show enhanced AhR target gene activity, suggesting a potential use for these cells in identification of labile endogenous AhR ligands (310). Cyp1a1 and AhR deficient African green monkey kidney (CV-1) cells were transfected with an AhR expression construct and showed dose-dependent XRE reporter activity comparable to that seen with TCDD treatment, indicating that in the absence of Cyp1a1 enzyme, accumulation of an endogenous ligand could potently activate AhR. Complementation with a functional Cyp1a1 construct abrogated the endogenous reporter activity (311). A recent investigation into the CV-1 endogenous ligand showed expression of CYP1A and CYP1B families were able to reduce reporter activity, implying metabolic negative regulation of endogenous ligand levels to autoregulate AhR induction of Cytochrome P450 107 expression. Attempts to isolate the ligand responsible by organic extraction and HPLC separation failed to identify a definitive molecule, rather a number of species were found to activate a reporter, potentially representative of a number of different ligands, or breakdown products of the parent ligand (312).

Initial observations in human skin showed induction of CYP1A1 and 1B1 mRNA and protein after UVB exposure in a dose-dependent fashion (313). Subsequent research has shown that AhR is localised to the nucleus after exposure of HaCaT cells to physiologically relevant levels of UVB and that this exposure activates CYP1A1 induction in an AhR dependent manner. Further experiments showed internalisation of epidermal growth factor receptor (EGF-R) and downstream signalling that was similarly AhR dependent (102). EGF-R activation has previously been characterised to result from TCDD treatment of keratinocytes (314, 315). Indoles, Trp and Trp photoproducts have been identified in vitro as AhR ligands (316, 317), hence Trp starvation was utilised to demonstrate that this activity was Trp dependent (102). 6-formylindolo(3,2b)carbazole (FICZ) (Fig. 5.2), a Trp photoproduct, has been identified as an AhR ligand (103, 318) and intracellular formation of this molecule was subsequently detected in acetonitrile extracts from UVB irradiated HaCaT cells (102). Given the proposed role of AhR as a tumor promoter, it is tempting to implicate the AhR in formation of UV induced skin cancer, however, definitive signalling outcomes for this activation mechanism remain to be elucidated. Bacteria lining the gut have also been implicated in production of Trp metabolite ligands, representing a symbiotic production of AhR ligand that is not strictly endogenous, but does result in a physiologically relevant activation of this pathway, although a purpose for this site-specific activity has not been demonstrated (319).

Reasoning that metabolic products with the potential to activate AhR would be excreted in urine, the Matsuda group selectively adsorbed polycyclic compounds from 5 litre batches of human urine using blue rayon, followed by organic extraction and reverse-phase high performance liquid chromatography (HPLC). Fractions were analysed using a yeast reporter system, which established the presence of two analytes capable of activating AhR. One of the chemical structures was assigned as , the other elution profile and UV spectrum matched indigo, an isomer of indirubin. When quantitatively analysed in the yeast reporter system, indirubin, indigo and TCDD were found to have EC50 values of 0.2, 5 and 9 nM respectively. Normal human urine assessed for levels of these chemicals had an average of ~0.2 nM. Foetal bovine serum levels were also examined, where indirubin was 0.07 nM and indigo could not be detected (320). Subsequent studies in mammalian cells analysed AhR activation by these ligands.

HepG2 derived XRE reporter gene 101L cells were used to test indigo and indirubin with resultant EC50 values of ~5 M and ~0.1 M respectively; a marked difference from the values estimated in the yeast system. Competitive binding experiments showed that both ligands were able to compete for the ligand binding pocket (321). The source of the indigoid ligands is still uncertain, but given that they have been 108 isolated in human urine and that they can be formed by human cytochrome P450’s (322), this suggests that indigoids may be relevant to the physiological regulation of AhR activity. Additionally, indirubin and its derivatives have been identified as direct inhibitors of cyclin dependent kinases (323), linking regulation of AhR with regulation of cell cycle.

Given that most AhR ligands are polyaromatic, fat soluble molecules, studies have focused on lipids and steroids as potential ligands. TCDD is able to induce arachidonic acid metabolizing cytochrome P450s and prostaglandin synthase H2 (reviewed in 33), implicating AhR in these processes. A number of cellular lipids were screened for AhR driven reporter gene activation, which showed a dose- dependent increase in CAT activity with addition of lipoxin A4 (LXA4) (Fig. 5.2). LXA4 was compared with FICZ for AhR activation and at its maximum available concentration (0.3 M) showed a 10-fold increase over control levels, while FICZ at 1 M gave a 30-fold increase in activity. Over a 50 hour timecourse in Hepa-1 cells, FICZ maintained Cyp1a1 enzyme induction, while LXA4 activation dramatically downregulated at around 20h. The C37 cell line, lacking Cyp1a1, did not show this downregulation of LXA4, and hence it was assumed that LXA4 is a substrate for Cyp1a1. Gelshift analysis and TCDD competition confirmed LXA4 acts to activate AhR by directly binding to the LBD

(104). LXA4 does not fit the typical geometry of AhR ligands, and in spite of the fact that the established binding affinity would potentiate a physiological role, no such function has been established to date. (PGs) and related chemicals have also been screened for AhR activation by gelshift analysis, showing that 100 M PGs A3, B3, D3, E3, F3, G2, H1, H2 and K2 activated AhR:XRE complex formation (Fig. 5.2). However, luciferase activity confirmed that PGG2 was clearly the most potent of all tested chemicals, with activity exceeding that of TCDD maximal activation. This result may implicate signalling pathways, in addition to direct AhR activation, for playing a role in enhancing target gene activation. Finally, a competitive binding assay showed that PGG2 could compete with TCDD, albeit weakly, for binding AhR (105). Identification of cellular uptake of (reviewed in 324) provides a proof-of-principle for uptake of PG to result in activation of a nuclear receptor, and this, coupled with the poor stability of PG, may indicate a higher affinity and activation potential for these molecules in vivo (105).

Connections between AhR and cardiovascular disease development (reviewed in 325, 326) led to an examination of the capacity of oxysterols to act as ligands for AhR. A range of oxysterols were assessed based on their capacity to compete with TCDD for the LBD. 7-ketocholesterol was able to compete weakly for binding and prevented formation of a DNA binding complex, and hence was characterised as an endogenous antagonist (327). The potential existence of an oxysterol or oxidized phospholipid agonist has gained support from studies that connected fluid flow of serum with an induction of Cyp1a1 and Cyp1b1 (328, 329). Roles for AhR in vascular remodelling would suggest a 109 cardiovascular specific activation mechanism, hence recent studies in the Bradfield laboratory have focused on identifying how AhR may be activated in these cells and, most importantly, on downstream outcomes of this signalling (59).

Shear stress applied directly to rat hepatoma 5L cells was found to activate Cyp1a1 mRNA expression in an AhR dependent manner. Isolation of the specific activator responsible narrowed the focus to shear stressed serum. When applied to cells containing a stably integrated XRE driven luciferase reporter (H1L6.1c3), shear stressed serum alone could induce reporter gene expression. Serum was fractionated by size exclusion chromatography and samples were tested for their capacity to activate, with the result that sheared low density lipoprotein (LDL) was identified as the AhR inducing factor. Purified LDL exposed to shear stress was additionally able to activate the reporter, inferring that shear stress alone can modify LDL structure or function. Occurrence of atherosclerosis has been linked to oxidized LDLs (330), hence purified LDL was treated with NaOCl to induce oxidation. This material was shown to induce reporter gene activity, leading the authors to speculate that the effects of oxidized LDL may be due either to direct intracellular delivery of an LDL-derived agonist, or, indirectly by the release of arachidonic acid, which has previously been linked to flow-induced CYP1 expression (331). Interestingly, sera from AhR knockout animals was shown to have a 4-fold higher capacity to activate reporter expression than age-matched heterozygotes. This was hypothesised to result from decreased paraoxonase-1 and NAD(P)H:quinone oxidoreductase 1 (NQO1) expression in AhR knockouts, both having an established role in reducing LDL oxidation and both having been previously characterised as AhR target genes (reviewed in 59). Here, an endogenous downstream effect of context specific AhR activity has been hypothesized, but the precise molecular mechanism of activation has yet to be established.

5.1.3 PAS domain flexibility correlates with ligand binding mechanisms

Given the range of ligand structures that have the capacity to bind the AhR PAS domain, speculation regarding small molecule binding specificity has recently focused on intrinsic PAS domain flexibility as a means to both receive small molecule signals, and to transmit signals to effector domains or induce protein-protein interactions to generate a signalling output. The limited sequence similarity, combined with high structural conservation between PAS domains infer a structural function for this fold that likely requires similar mechanical properties to achieve the ‘signal reception/structural output’ responses typically observed for PAS signalling proteins.

Pandini and coworkers compiled structures of HERG, phy3, PYP, FixL, hPASK and HIF-2 for Essential Dynamics (ED) analysis (332) to identify functional domain motions that are common between 110 modelled domains, termed ‘motion patterns’ (Fig. 5.3 A). NMR ensembles were particularly useful in this regard to compare dynamic information with that inferred by Molecular Dynamics (MD) simulations of crystal structures (333). PYP, phy3, FixL and HERG were found to have similarly constrained -sheets, with most of the dynamic regions clustering around inter-strand loops. PYP shows motion patterns within the PAS core region, which are not present in phy3 and FixL (333). ED predicted mobile regions of PYP were confirmed by previous NMR studies by Dux et al., which showed variation in ensemble structures, and using 15N relaxation studies on the shift from dark to light state (334). Phy3 and FixL were also predicted to have enhanced dynamics in the helical connector (A4) and connecting loops (333). HIF-2 PAS B showed highest regions of mobility at the N and C termini of the -helical connector, extending into B3. Finally, hPASK had less evidence of mobile secondary structures, rather dynamic regions were identified in the loops between A4 and B3; and interstrand loop B4-B5. Where known, for PYP, Phy3 and FixL, defined detection and response mechanisms correlate with predicted motion patterns, indicating the utility of ED estimates for determining functionally relevant protein dynamism (333).

For those PAS domains which feature a cofactor, mobility has been found to cluster around the ligand binding site. For those proteins which do not have a ligand or cofactor, it has been proposed by Pandini that the PAS structures may have evolved to bind small molecules, and that this capacity has been lost during later stages of evolution, leaving regional flexibility to indicate where the ancient ligand may have accessed the binding pocket. Indeed, this would appear to be the case with hPASK which has been shown to bind hydrophobic molecules (Fig. 5.1 D) in the region of proposed mobility (Fig. 5.3 A). A structure based alignment of the PAS domains from this study serves to highlight regions of this fold that have a propensity to be mobile (Fig. 5.3 B) (333). This phenomenon is especially notable for the loops between A4-B3; B3-B4 and B4-B5, while the N-terminal -helices tend to display less mobility, except in the case of PYP.

Currently, there are no structures of the AhR LBD available to generate ED based predictions of how the binding pocket is accessed by small molecule ligands. Based on the promiscuous binding activity of the pocket, and given that this region is difficult to solubilise in the absence of chaperone proteins, we may assume that the AhR LBD is unusually flexible, both in terms of access, and in binding characteristics. In order to understand more about how the LBD binds ligand and activates AhR, a number of homology models have been generated to inform mutagenesis studies on this enigmatic xenobiotic sensor.

111 Predicted structural mobility of PAS domains clusters around ligand binding sites (A)

(B)

28 PYP 27 HERG 929 phy3 243 HIF-2 154 FixL 132 hPASK PYP HERG phy3 HIF-2 FixL hPASK

89 PYP 91 HERG 991 phy3 301 HIF-2 217 FixL 196 hPASK PYP HERG phy3 HIF-2 FixL hPASK

Figure 5.3 Predicted PAS domain regional mobility by Essential Dynamics analysis. Figure reproduced from Pandini and Bonati, 2005 (333). Mobility predictions by Essential Dynamics (ED) simulations were generated from Molecular Dynamics (MD) simulations of PYP, Phy3, FixL and HERG X-ray structures; NMR structure ensembles for HIF-2a and hPASK. (A) Locations of motion patterns are highlighted in colour in the cartoon representations of the PAS domains, which illustrates that motion patterns cluster around ligand binding sites for PYP, Phy3 and FixL. Secondary structures are labelled on HERG, where helices are labelled ‘’, and strands ‘’. Shown in (B) is a summary of the regions of highest predicted mobility (green) on a structure based alignment, where secondary structures are represented as boxes; -helices (light grey) and -strands (dark grey). 5.1.4 LBD Homology Models and Mutagenesis Studies

Hsp90 association and ligand binding to mouse AhR is known to require residues 230-421, although this region has not been refined further, and this sequence may not describe the minimal LBD and Hsp90 contact site. This region encompasses the last -strand of PAS A, the entire PAS B, and around 40 amino acids more C-terminal of this domain (34). Polymorphisms that modify ligand activation of the receptor have been identified within this region, specifically, A375V (see below) (335, 336); while deletion of this region (280-420) results in a constitutively active AhR, a phenomenon which concurs with a crucial role for Hsp90 binding in regulating the receptor (35). Given the lack of structural information for the molecular interactions required for ligand binding, homology modelling based on other PAS domain structures, combined with targeted mutagenesis, has been a technique recently employed by several investigators to inform mechanistic predictions. Initial modelling compared the structures of FixL, HERG and PYP, where FixL was determined to be most similar to the sequence of mAhR, and hence was used to generate a proposed mAhR PAS B structure (337) (Fig. 5.4 A). Based on the position of heme in the FixL domain, binding of polychlorinated dibenzo[p]dioxins (PCDDs) was proposed to occur in a similar position on the AhR PAS B model.

G224 and G251 from FixL were established to align with mAhR L347 and A375, where position 375 had already been identified as critical for ligand binding by PAS B. Additionally, it was noted that the side chain size at positions equivalent to mAhR 375 in both HERG and FixL correlated inversely to the size of the bound ligand, i.e. small side chain for large heme binding, the largest side chain (Leu) for HERG, where no ligand binding has been idendified (337). The mechanism of FixL signal transduction was extrapolated to AhR signalling, resulting in the proposal by Procopio et al that R333 typically forms a salt bridge with E339, and that once PCDD binds, R333 preferentially interacts with a chlorine atom, resulting in a conformational change. These residues were additionally predicted to lie at the entrance to the binding pocket. Q377 was suggested to interact with a chlorine atom to position the TCDD in the binding cavity. C327 and R282 were proposed to interact with the electrophilic centre of TCDD, while T343 and F345 were thought to interact with the hydrophobic rings of this ligand (337). Subsequent mutational analysis of these residues showed a progressive decrease in ligand binding affinity with increasing size of the side chain of the proposed ligand binding pocket entrance residue 375 (Fig. 5.5). From this study, it was also proposed that endogenous activation of AhR results from ligand binding events which are prevented by mutants A375I and A375F (Fig. 5.5), as shown by reporter analysis (338). Mutations R333L and E339L were subsequently found to decrease TCDD binding to 25% of wt, and both showed decreased DNA binding and reporter activity (339). R282L showed decreased TCDD binding affinity to ~35% of wt (339), while R282A had minimal effect on TCDD binding and DNA binding (340). The latter was subsequently rationalised to point away from the ligand binding pocket and therefore result in modest changes in ligand binding potential. Q377L has been found to decrease 112 Homology Models of mAhR PAS B

(A) (B) B3’

A3’ A4’  A1’

A2’ B2’ B1’ B4’   B5’



(C) B4’

B3’ Lys350 A1’ Arg333 A3’ Leu300 A4’ A2’ B4’ His285 B2’ Glu339 Met334 B1’

B5’

Figure 5.4 Homology models of mAhR PAS B ligand binding pocket adapted from Procopio et al., 2002 (337), Pandini et al., 2007 (333) and a Phyre derived model. The model shown in (A) (337) was predicted from the structure of FixL. Residues highlighted in (A) were proposed to function in ligand pocket access (Ala375, Leu347). Note that the model in (A) is rotated 180 clockwise relative to models (B) and (C). A model based on structural homology with HIF2˞ and ARNT PASB generated by Pandini et al. is shown in (B) (333). Residues were targeted for mutagenesis if they were located in the interior of the proposed ligand binding pocket and side chains of these amino acids are illustrated. Additional control mutations were generated for side chains predicted to stick out from the binding pocket. Highlighted in green is the predicted ligand binding cavity for TCDD. A third model (C) was generated using Protein Homology/analogY Recognition Engine (Phyre) (281) which predicted a fold based on homology to HIF2˞ PAS B, annotated with some residues shown in A and B to define the relative similarity with the Procopio model. Models in B and C are annotated ‘A’ for ˞–helix and ‘B’ for -strand representing the conserved structural elements of the PAS domain. Mutagenesis of mAhR PAS B decreases ligand inducibility

A375 A375I A375F

I319 I319A

H320 H320A

H285 H285A

M334 M334E Figure 5.5 Mutagenesis of the ligand binding pocket of AhR has identified mutants with decreased ligand inducibility (341, 340, 338). Targeted mutagenesis of conserved hydrophobic residues H320 and I319 (341) and residues predicted to point into the ligand binding pocket H285, M334 (340) and A375 (338) has generated mutant forms of AhR with reduced TCDD inducibility (red side chains). In silico mutagenesis was performed on the Phyre model of mAhR PAS B (Fig. 5.4) using Pymol to illustrate the structural context of the mutations within the proposed ligand binding pocket. TCDD binding and mAhR activity (339), confirmed by analysis of Q377A, which showed similar effects (340). C327A had minimal effect on function (340), while T343 and F345 have not been mutated for analysis. Mutation of adjacent R362 to Leu however had little effect, indicating that it may be oriented away from the ligand binding cavity. Hence, this model appears to have generated useful mutagenesis targets and broadly, mutation of these sites appeared to decrease TCDD binding and mAhR activity, for those residues predicted to be directed into the ligand binding cavity.

Another mutagenesis study performed an alignment of mouse, rat and human AhR, and compared conserved aromatic amino acids of the PAS B region with Drosophila and C. elegans Ahr homologues which lack the capacity to bind ligand. Where the conserved aromatic residues differed from the Drosophila and C. elegans sequences, residues were mutated to Ala, including F168A, Y239A, F318A, F398A and F400A. Transactivation with 3MC, BNF and TCDD was assessed, with the supposition that conserved aromatic residues may stack such that the aromatic rings become aligned with ligand to confer strong binding and therefore, activation. Of these mutations, F318A had a dramatic effect on all three ligands, while mutation to Tyr and Trp affected TCDD binding, but allowed 3MC and BNF activation (Fig. 5.6). F318L intriguingly maintained activation by 3MC, but abrogated activation by BNF and TCDD, suggesting that F318 did indeed contact ligand, and has a role in ligand binding specificity (Fig. 5.6). Mutation of neighbouring residue Q317A showed an effect on BNF, although more obviously, mutation of I319A and H320A adversely affected all ligand activation events, confirming the importance of this region to ligand binding (Fig. 5.5). Given the demonstrated importance of F318 for ligand binding, and that no other models predicted this residue as a potential ligand interacting side chain, another model was generated based on the fold of HIF-2 PAS B. This model was finally used to perform in silico docking of BNF, 3MC and TCDD, which indicated extensive contacts of all ligands with F318, I319 and H320. This model also predicts hydrophobic interactions with A328, M342, L347 and L348; and hydrogen bond formation between BNF, K284 and K286, although these residues await mutation (341).

A final model has been generate by Pandini and coworkers (340), based on the structures of HIF-2 and ARNT PAS B, where energy minimization revealed the combination of templates had MODELLER z-scores slightly better than those based on HIF-2 PAS B alone. The modelled domain (Fig. 5.4 B) was then compared with the template structures to generate mutagenesis targets. Initially, the ligand binding cavity border residues P291, I319, C327 and L347 were identified as markedly smaller than their counterparts on HIF-2 PAS B. Importantly, the authors noted in this analysis that the structured domain prediction likely reflects the ‘ligand-bound’ state of the domain as it is not bound by Hsp90 in this modelling process. Additionally, the authors suggest that Hsp90 binding may allow a more open presentation of the ligand binding cavity. In order to test the model, residues were selected based on their orientation relative to the binding cavity, i.e. pointing ‘into’ or ‘away’ from the binding pocket, on the 113 Mutagenesis of mAhR PAS B F318 refines ligand binding specificity

F318 F318Y F318W F318L F318A

TCDD

3MC

-NF

Figure 5.6 Targeted mutagenesis of hydrophobic residues of mAhR PAS B Ligand Binding Pocket show decreased potential for promiscuous ligand binding activity (341). Alignments of mouse, rat and human AhR sequences revealed conserved hydrophobic residues which were hypothesised to increase PAH ligand binding affinity by - stacking. Mutagenesis of F318 showed altered ligand inducibility by XRE luciferase reporter analysis using PAH ligands TCDD, 3MC and -NF. Conservative mutations F318Y and F318W abrogated TCDD responsiveness, while F318L prevented TCDD and - NF induction, whereas F318A showed no reporter induction following any of these ligand treatments. A Phyre homology model of the mAhR PAS B domain (Fig. 5.4) was mutated in silico using Pymol to generate the structures above to illustrate potential changes to the ligand binding pocket. assumption that mutation of residues pointing away would have limited impact on ligand and subsequent DNA binding, while residues pointing into the pocket should have a more obvious effect on activity. Those residues determined to point into the pocket included H285, C327, M334, A375 and Q377. Sidechains predicted to point away encompassed residues R282, T311, E339 and K350. Finally, to test the impact of the helical connector position, I332 predicted to fold in the middle of the helix, was mutated to proline, assumed to result in major structural change to the domain fold (340).

EMSA analysis of mutant AhR XRE binding after TCDD induction, and [3H] TCDD binding relative to wt was performed. Residues predicted to point away from the pocket displayed minimal changes in TCDD binding (~75-94% relative to wt), and some difference in XRE binding (53-125%), where the most obvious effect (53%) resulted from mutation of E339A. This effect was rationalised to result from changes to interactions with partner factors Hsp90 or ARNT at the back of the -sheet. As assumed, disruption of the helical connector by I332P caused total loss of binding to TCDD and XRE. C327A showed a slight decrease in activity, while mutation of other residues predicted to be directed into the pocket correlated with decreased TCDD and XRE binding. H285A and M334E had the biggest impact on activity (Fig. 5.5), while Q377A showed 52% XRE binding and 31% TCDD binding. H285 was proposed to be unprotonated within the hydrophobic core, and hence assumed to play a role in stabilising aromatic residues by inducing - stacking, which was disrupted by mutation to alanine. The smaller contribution of the other cavity mutations was predicted to result from a network of weak interactions with ligand, where disruption of a single interacting sidechain would reduce affinity, but not prevent binding. The mechanism proposed for M334 and C327 ligand interaction was a sulphur-arene binding event. A375 mutations to L and V had similar deleterious effects on ligand binding, as determined during investigations discussed above (340). The coordinates of the Pandini model were not deposited into the PDB, hence, for the purpose of illustrating the position of mutated residues for the present work, a Phyre model of mAhR LBD was generated (Fig. 5.4 C). Phyre predicted the closest homology structure, as discussed above, to be HIF-2 PAS B. A summary of amino acids which, when mutated, show abrogated ligand or XRE binding are illustrated on an alignment of the LBD region of AhR homologues from mouse, rat, whale, seal, human, guinea pig, rabbit, tern and chicken (Fig. 5.7 A). Residue mutations, including those derived in this study (solid lines), that showed decreased activity are highlighted in red/orange and sites of mutation with little affect on activity are boxed in green. These mutations are further illustrated in the context of the Phyre model of mAhR (Fig. 5.7 B), clearly demonstrating the importance of the side chains directed into the binding cavity localised to the -sheet for ligand interactions, some effect of the helical connector (A4’) possibly having an impact on the overall fold of the domain, and the PAS core (A1’, A2’ and A3’). Side chains directed out of the ligand binding pocket on the opposite face of the -sheet have been proposed to influence hsp90 or ARNT

114 Mutagenesis of mAhR PAS B has regional effects on ligand activation (A) B4 B5 B1’

MouseAhR NFQGRLKYLHGQNKKGKDGALLPPQLALFAIATPLQPPSILEIRTKNFIFRTKHKLDFTP 291 RatAhR NFQGRLKYLHGQNKKGKDGALLPPQLALFAIATPLQPPSILEIRTKNFIFRTKHKLDFTP 295 WhaleAhR NFQGRLKYLHGQNKKGKDGSILPPQLALFAIATPLQPPSILEIRTKNFIFRTKHKLDFTP 296 SealAhR NFQGRLKYLHGQNKKGKDGSVLPPQLALFAIATPLQPPSILEIRTKNFIFRTKHKLDFTP 296 HumanAhR NFQGKLKYLHGQKKKGKDGSILPPQLALFAIATPLQPPSILEIRTKNFIFRTKHKLDFTP 297 GuineapigAhR NFQGRLKYLHGQNKKGKDGSVLPPQLALFAIATPLQSPSILELRTKNFLFRTKHKLDFTP 296 RabbitAhR NFQGRLKFLHGQNKKGKDGSLLPPQLALFAIATPLQPPSILEIRTKNFIFRTKHKLDFTP 295 TernAhR NFQGRLKFLHGQNKKGKDGATLSPQLALFAVATPLQPPSILEIRTKNFIFRTKHKLDFTP 297 ChickenDR NFQGRLKFLHGQNKKGKDGAALSPQLALFAVATPLQPPSILEIRTKNFIFRTKHKLDFTP 296 ****:**:****:******: *.*******:*****.*****:*****:*********** PAS A PAS B B2’ A1’ A2’ A3’ A4’ B3’

MouseAhR IGCDAKGQLILGYTEVELCTRGSGYQFIHAADILHCAESHIRMIKTGESGMTVFRLLAKH 351 RatAhR IGCDAKGQLILGYTEVELCNKGSGYQFIHAADMLHCAESHIRMIKTGESGMTVFRLLAKH 355 WhaleAhR TGCDAKGRIVLGYTEAELCMRGSGYQFIHAADMLYCAEYHIRMIKTGESGLIVFRLLTKD 356 SealAhR TACDAKGKLVLGYTEAELCMRGSGYQFIHAADMLYCAEYHIRMIKTGESGMIVFRLLTKE 356 HumanAhR IGCDAKGRIVLGYTEAELCTRGSGYQFIHAADMLYCAESHIRMIKTGESGMIVFRLLTKN 357 GuineapigAhR IGCDAKGQIVLGYTEAELCAKGSGYQFIHAGDMLHCAEYHMRMIKTGESGMTVFRLLTKH 356 RabbitAhR TGCDAKGQIVLGYTEAELCMRGSGYQFIHAADMLYCAESHIRMIKTGESGLAVFRLLTKD 355 TernAhR TGCDAKGKIVLGYTEAELCMRGTGYQFVHAADMLYCAENHVRMMKTGESGMTVFRLLTKE 357 ChickenAhR TGCDAKGKIVLGYTEAELCMRGTGYQFIHAADMLYCAENHVRMMKTGESGMTVFRLLTKE 356 .*****:::*****.*** :*:****:**.*:*:*** *:**:******: *****:*. B4’ B5’

MouseAhR SRWRWVQSNARLIYRNGRPDYIIATQRPLTDEEGREHLQKRSTSLPFMFATGEAVLYEIS 411 RatAhR SRWRWVQSNARLIYRNGRPDYIIATQRPLTDEEGREHLQKRSMTLPFMFATGEAVLYEIS 415 WhaleAhR NRWTWVQSNARLVYKNGRPDYIIATQRPLTDEEGTEHLRKRNLKLPFMFTTGEAVLYEVT 416 SealAhR NRWTWVQSNARLVYKNGRPDYIIATQRPLTDEEGTEHLRKRNMKMPFMFTTGEAVLYEIT 416 HumanAhR NRWTWVQSNARLLYKNGRPDYIIVTQRPLTDEEGTEHLRKRNTKLPFMFTTGEAVLYEAT 417 GuineapigAhR NRWIWMQSNARVVYKNGRPDYIIATQRALTDEEGREHLRKRGVKLPFMFTTGEAVLYETS 416 RabbitAhR NRWAWVQSNARFIYKNGRPDFIIATQRPLTDEEGREHLLKRNTKLPFMFTTGEAVLYEMT 415 TernAhR NRWAWVQANARLVYKNGRPDYIIATQRPLTDEEGAEHLRKRNMKLPFMFATGEAVLYEVS 417 ChickenAhR NRWAWVQANARLVYKNGRPDYIISTQRPLTDEEGAEHLRKRNMKLPFMFATGEAVLYELT 416 .** *:*:***.:*:*****:** ***.****** *** **. .:****:******** : PAS B (B) A3’

A4’ A1’ A2’ B3’ B4’ B5’ B2’ B1’

Figure 5.7 Summary of mutagenesis of the ligand binding pocket of AhR reveals the importance of the -sheet and helical connector in ligand interactions. AhR sequences from a number of different species were aligned and residues mutated in other studies (dotted boxes, see appendix for references) and mutations generated in this investigation (solid boxes) have been coloured for effects on ligand induced DNA binding and/or reporter gene induction where red/orange represents a mutation that is not tolerated, and green represents sites permissive of mutation (A). Shown above the alignment are representations of secondary structural elements, where rectangles represent - strands (labelled B’), and cylinders -helices (labelled A’) These sites were further illustrated in the context of the model generated using Phyre (281) (B), showing that residues at the entrance to the pocket, and along the -sheet previously proposed to coordinate ligand binding are not permissive of mutation. Mutation of residues outside of these regions do not appear to affect ligand inducibility of AhR. binding (340). Mutations predicted to be located on surface loops and directed out of the LBD, show minimal effect on inducibility (Fig. 5.7 B).

5.1.5 Summary and Aims

It is possible that functions specific to cell context, that are mediated by activated AhR are induced through binding different ligands specific to roles in primary and coactivator gene regulatory events and ubiquitin ligase adaptor functions, or indeed, a combination of all three. In an attempt to assess if xenobiotic metabolism roles are separable from endogenous mechanisms, intra-organism specific differences in TCDD binding affinities between two zebrafish AhR orthologs have been used to design mutant forms of mouse AhR (mAhR). This approach aimed to generate variants with the capacity to functionally discriminate between a range of activation mechanisms including PAH-type and atypical ligand activation (YH439), suspension culture and application of shear-stressed serum. Studies in zebrafish (Danio rario) have identified three AhR genes with differential xenobiotic activation potential; drAhR1a (342), drAhR1b (343) and drAhR2 (344). AhR2 was found to function in a xenobiotic responsive capacity, showing high-affinity TCDD binding and activation of a reporter gene (344), and has been identified as the primary receptor involved in TCDD toxicity (345). The previously identified AhR1a was found to be unable to bind, or be activated by TCDD or NF (342). Recently, researchers identified AhR1b, belonging to the AhR1 clade and sharing 34% sequence identity with AhR1a, and 66% identity through the LBD. However, in contrast to AhR1a, AhR1b was found to bind TCDD and activate reporter gene expression with similar efficiency to AhR2 (343). No function has been identified for AhR1a to date, hence, an analysis of amino acid differences between these factors was undertaken. It was hypothesized that differences in LBD structure may determine a regulatory role for drAhR1a in responding to novel ligands, or endogenous signals, thereby insulating this factor from HAH/PAH ligand binding.

In order to examine amino acid variants which may be critically affecting receptor function, an alignment of murine and danio rario AhR LBD sequences was utilised to inform site directed mutagenesis of mAhR. Mutant mAhR stable cell lines were generated and activation following suspension culture and application of atypical ligand YH439 (302) was assessed by reporter gene analysis. Mutants identified with altered function relative to wt were tested for activation by polyaromatic hydrocarbon ligands TCDD, 3MC, B[a]P and by application of shear-stressed serum.

115 5.2 Results

5.2.1 Homology based targeted mutagenesis of the Ligand Binding Domain of Mouse AhR

To establish residues with the potential to impact on ligand binding, previous studies of AhR have focused efforts on modelling the LBD (337, 340), direct sequence comparison between species with differing TCDD affinity (341), or a combination of both (339). The current approach differs from these in that alignment of zebrafish LBD sequences AhR1a (drAhR1a) and AhR1b (drAhR1b) has provided sequences that are 66% identical, but show dramatic functional differences. Recalling that drAhR1a cannot bind to TCDD, while drAhR1b maintains TCDD inducibility, mouse AhR LBD (mAhR) was also aligned with these sequences (Fig. 5.8 A). Analysis of this alignment highlighted amino acids that had distinctly different physicochemical properties in drAhR1a relative to the other sequences. Residues identified were introduced into mAhR by site-directed mutagenesis, generating 4 clones, where some mutations were initially combined, including K245E, conservative mutations H241Y/H285Y (241.285), K297N/G298W/Q299N/L300F (NWNF) and R369Q/Y372Q/Q377H (Triple) (Fig. 5.8 A).

The mutant cDNAs encoding mAhR were stably integrated into HEK293Ts in the pEF/IRES puro vector (140), resulting in constitutive expression of the cDNA. Expression was examined by immunoblot with antibodies directed against the Myc tag of the mAhR constructs, with relative levels confirmed by comparison with loading control antibodies to -tubulin (Fig. 5.8 B). Since ligand activation and DNA binding by AhR can both be qualitatively assessed using a reporter gene assay, and given the similar expression levels of mutant and wt mAhR (Fig. 5.8 B), experiments examined activity of mutant mAhR relative to wt receptor using an XRE-driven luciferase reporter gene on the backbone vector T81 with a tandem XRE upstream of the firefly luciferase gene (XRE-Luc) (137). The ability of the mutant mAhR to activate this reporter gene was assessed relative to background levels resulting from activation of endogenous AhR with YH439 and suspension culture (Fig. 5.9 A). As the control reporter construct, lacking an XRE (Control-Luc), showed no activation above baseline, while endogenous AhR activated Control-Luc, albeit weakly, the latter is henceforth referred to as background activity (Fig. 5.9 A, data sets 1 and 2). Exogenous overexpressed wt mAhR showed some induction of luciferase activity in the absence of stimulus, as expected from reporter studies in the literature (338) (vehicle and adherent controls), and strong inducible activity after treatment with YH439 (black bars) and suspension culture (chequered bars) (Fig. 5.9 A, data set 3).

Substitution of lysine at position 245 with glutamic acid (K245E) was selected for mutation as alignment with other AhR sequences which have been shown to bind TCDD illustrate strict conservation of a basic residue, an Arg or Lys, at this position and hence, a reversed charge was theorised to have the potential 116 Targeted mutagenesis of mAhR and stable cell line generation (A)



A375I

(B)

Figure 5.8 AhR ligand binding domain alignment, site-directed mutagenesis and expression of wt and mutant mAhR. (A) Ligand binding domain (LBD) sequences of mouse (mAhR, NP_038492) and zebrafish arylhydrocarbon receptor 1b (drAhR1b, NP_001019987) and 1a (drAhR1a, NP_571103) were aligned using ClustalW2. JalView was used to present alignment using Zappo colour scheme. Secondary structures predicted from the structure of Per are represented above the alignment, where an arrow represents a -strand and cylinder an -helix (174). drAhR1a and drAhR1b were compared and side chains with markedly different physicochemical properties were point mutated in the mAhR LBD, including H241Y H285Y (241.285), K245E, K297N G298W Q299N L300F (NWNF), and R369Q Y372C Q377H (Triple). mAhR proposed ligand binding pocket blocking mutation A375I was also highlighted using a grey box. (B) Expression of Myc tagged mAhR wt and mutant constructs from stable HEK293T cell lines. Whole cell extracts (50 μg) were separated by 10% SDS-PAGE prior to immunoblotting with the 9E10 anti-Myc mAb and loading control MCA78G anti- Tubulin. mAhR double mutant H285Y/H241Y cannot be activated by cell suspension

(A) ;5(;5( 7./XF

7./XF 9HKLFOH $GKHUHQW <+ 6XVSHQVLRQ

&RQWURO;5(/XF;5(/XF;5(/XF;5(/XF;5(/XF (1) (2) (3) (4) (5) (6)

(B) (C) 9HKLFOH $GKHUHQW <+ 6XVSHQVLRQ $FWLYLW\ $FWLYLW\ 5HODWLYH/XFLIHUDVH 5HODWLYH/XFLIHUDVH

;5(/XF;5(/XF;5(/XF ;5(/XF;5(/XF;5(/XF

Figure 5.9 Effects of targeted mutagenesis of mAhR LBD on inducibility of an AhR-responsive XRE-luc construct. Mutant 241.285 shows abrogated suspension activity. Stable cell lines expressing wt and mutant forms of mAhR were cotransfected with XRE-Luc or Control-luc control reporter and RL-TK for 24 h, then treated with DMSO, YH439 or suspension culture for 16 h. Relative luciferase activity was determined as described in Materials and Methods. Data show the mean ± S.D. from a single experiment, performed in triplicate. The experiments were repeated three times with similar results. to affect function. However, K245 lies in a linker region that is predicted to be unstructured, and some sequence variability is seen in the surrounding sequence of other fish AhRs (343). K245E showed a slight decrease in activity relative to wt in the reporter gene assay for both YH439 ligand and suspension treatment as well as reduced basal activity in the untreated controls (Fig. 5.9 A, data set 4). As all conditions had similarly reduced activity relative to wt protein, it was concluded that mutation K245E was tolerated for YH439 and suspension activation of mAhR and no further analysis was undertaken.

NWNF, representing mutations K297N, G298W, Q299N and L300F, was expressed at levels similar to wt mAhR (Fig. 5.8 B). However, this mutant did not show any reporter gene activity above background, and since all treatments were similarly affected, this mutant was not analysed further (Fig. 5.9 B, C). Although similar levels of expression were observed, relative to wt mAhR (Fig. 5.8 B lanes 2 and 6), mutation of residues 297-300 may have adversely affected folding of this species, where 297 typically has a conserved basic residue ahead of the start of helix A’ (Fig. 5.8 A). Insertion of Trp and Phe at the first and third positions of this helix represents a dramatic change in size, where there are typically small polar or non-polar side chains. Hence, these mutations may have altered the structure of this helix and abolished the ligand activation of the mutant mAhR. These are the first mutations in this predicted helix to be reported and demonstrate the importance of this region for mAhR function.

Residues 369, 372 and 377 were selected for combined mutagenesis (Triple) as they lie close together in the same predicted -strand (174). Several previous studies have focused on mutagenesis of this - strand and have demonstrated that position 375 has a profound impact on ligand binding pocket access. Substitution with increasing size of side chain, A375V and A375L progressively reduce the capacity for AhR to be activated by ligand (340), and in the case of A375I, the pocket is thought to be completely closed (see below) (338). Q377 has been the target of other mutagenesis studies, based on predicted models of LBD structure where Q377 was hypothesised to form hydrogen bonds with TCDD (337) and, in the more recent model from Pandini et al., predicted to point inwards and affect either the steric capacity of the pocket, or particular stereoelectronic requirements for ligand association, as discussed above (340). Q377L AhR shows decreased DNA binding, decreased TCDD binding (70-75% relative to wt), and a concomitant decrease in TCDD induced reporter gene activity (339). In the case of Q377A AhR, DNA binding is reduced to ~50% and TCDD binding to ~30% relative to wt, although no target gene activity was assessed (340). Hence, the composition of this strand seems intrinsically linked to LBD function, and strong divergence of this strand in drAhR1a suggests that this region could be a good place to search for ligand selectivity. Analysis of the R369Q, Y372C and Q377H (Triple) mAhR showed it was expressed at slightly lower levels than wt mAhR (Fig. 5.8 B), but showed overall much poorer XRE-Luc activity both with YH439 treatment, and after suspension culture (Fig. 5.9 A, data set 117 6). Given the effects of mutagenesis of Q377 to alanine (340), and leucine (339), changing this residue to histidine would be hypothesised to alter the steric and potentially, the stereoelectronic properties of the ligand binding pocket. Accordingly, Triple mutant mAhR had reduced activation by YH439 and suspension culture, with a decrease in background activity also evident. As there was no selective difference in activity, this mutant mAhR was not analysed further.

Two conservative mutations H241Y and H285Y were combined due to their similarity, both in amino acid composition and position at the end of predicted -strands (Fig. 5.8 A) to produce 285.241 mAhR. This mutant form of mAhR was expressed at similar levels to wt mAhR (Fig. 5.8 B lanes 2 and 4) and was able to activate the reporter gene after treatment with YH439 (Fig. 5.9 A, data set 5). However, intriguingly, 285.241 completely abrogated suspension activation of mAhR, and showed no background activity above endogenous levels (compare Fig. 5.9 A, data sets 2 and 5). This implies that endogenous signals or ligands generated in suspension culture have a common requirement for activation similar to that required for background activity in untreated cells, although clearly this effect is enhanced in suspension culture. Given that K245E, NWNF and Triple mAhR mutants showed a similar response to both YH439 and suspension culture, these mutations do not distinguish between endogenous and novel ligand activation modes. However, 285.241 was shown to be selectively activated by YH439, while being unaffected by suspension culture.

5.2.2 Histidine 285 in the LBD of mAhR is critical for suspension culture activity, but is not required for atypical AhR ligand YH439 inducibility

The mutations H285Y and H241Y (Fig. 5.8 A) were constructed separately in mAhR to determine which amino acid was responsible for the selective activation by YH439. H241Y and H285Y mAhR stably integrated HEK293Ts were expressed at similar levels to wt mAhR (Fig. 5.8 B) and showed similar induction of reporter gene activity after treatment with 10 M YH439 (Fig. 5.10 A black bars, data sets 5 and 6). H241Y mAhR also had almost identical activation potential as wt mAhR, in suspension culture (Fig. 5.10 A, data sets 3 and 5). In contrast, H285Y mutation alone was responsible for abrogating suspension culture activation (Fig. 5.10 A, chequered bars, data set 6), as well as reducing background activation of this receptor to endogenous levels (Fig. 5.10 B, compare untreated samples, Data sets 2, 4 and 6). This result implies that the activation mechanism at play in suspension culture has a common requirement to that responsible for the background activation of recombinantly overexpressed AhR that is seen in a number of reporter studies (338, 341). It is clear that mutagenesis of H285 to tyrosine has not simply resulted in a non-functional mAhR as strong activity was maintained after YH439 treatment. Additionally, given that H285Y can be activated to the same extent as wt in the presence of YH439, but 118 mAhR mutant H285Y cannot be activated by cell suspension but maintains full activation by YH439

;5(;5( (A) 7./XF 7./XF

([SUHVVLRQ&RQVWUXFW ZW EODQN ZW  +< +< 5HSRUWHU&RQVWUXFW EODQN&RQWURO;5(/XF;5(/XF;5(/XF;5(/XF;5(/X;; ;; ;; ;; ;; F (1) (2) (3) (4) (5) (6)

(B)

9HKLFOH $FWLYLW\ 5HODWLYH/XFLIHUDVH

([SUHVVLRQ&RQVWUXFW ZW EODQN ZW  +< +< 5HSRUWHU&RQVWUXFW &RQWURO;5(/XF;5(/XF;5(/XF;5(/XF;5(/XFEODQN ;; ;; ;; ;; ;; (1) (2) (3) (4) (5) (6)

Figure 5.10 mAhR LBD mutant H285Y cannot be activated by suspension treatment, shows no background activity and maintains full YH439 activation. 241.285 showed no activity in suspension treatment, hence the mutations were isolated to identify the mutation responsible for selective inducibility of an AhR-responsive XRE-Luc construct. Stable cell lines expressing mutant forms of mAhR were cotransfected with XRE-Luc or Control-Luc vector as indicated and RL-TK for 24 h and treated with DMSO, YH439 or suspension culture for 16 h. Relative luciferase activity was determined as described in Materials and Methods. Data in (A) represent a typical experiment performed in triplicate three times. (B) shows an enlarged representation of the DMSO control background of these cell lines. Data show the mean ± S.D. from a single experiment, performed in triplicate. The experiment was repeated three times with similar results. suspension activity is completely abrogated, it is apparent that endogenous activation does not tolerate a tyrosine at this position of the ligand binding pocket. If an endogenous ligand is present in suspension culture, as has been proposed in the past (311, 338), it is hypothesised that it accesses and binds to the ligand binding pocket in a manner distinct to YH439.

5.2.3 YH439, unlike prototypical PAH ligands (TCDD, 3MC and B[a]P),has a novel binding mode which tolerates mutation of Histidine 285 to Tyrosine.

Given the capacity of H285Y mAhR to categorically discriminate between YH439 and endogenous activation by suspension culture, an analysis of typical PAH-type ligands was undertaken and assessed for AhR activation by reporter assay. Endogenous AhR in the 293T cell line showed some activity after treatment with TCDD [1 nM] (diagonal striped bars), 3MC [1 M] (brick patterned bars) and B[a]P [100 nM] (spotted bars) (Fig. 5.11, data set 2). Exogenous overexpression of wt mAhR gave induction of luciferase reporter gene activity with the addition of these three PAH-type ligands, as well as with YH439 treatment (black bars) (Fig. 5.11, data set 3). In contrast, mutant H285Y mAhR had no activity after treatment with HAH/PAH-type ligands, and as seen before, no background activity, but showed selective inducibility upon treatment with YH439 (Fig. 5.11, data set 4). This demonstrates that typical HAH/PAH ligand activation of AhR does not tolerate the relatively conservative mutation of histidine 285 to tyrosine. Histidine 285 has been predicted to point into the ligand binding pocket in the most recent LBD model (340), and mutation of H285 to alanine prevents TCDD binding (and thus, DNA binding) (340), in agreement with the finding of a requirement for histidine 285 for HAH/PAH binding. However, this analysis extends on the H285A results by conservative incorporation of an aromatic residue. Pandini and coworkers rationalised that H285 may be uncharged due to localization in the hydrophobic core, and may bind ligand by a - stacking mechanism. Tyrosine at this position would be expected to potentiate similar aryl stacking, but evidently is not able to simply replace His in this role, or, alternatively this hypothesised mechanism does not occur.

As discussed above, A375 appears to be critical for ligand binding pocket access (340) and substitution with isoleucine abrogates TCDD inducibility and prevents background constitutive activation of mAhR (338). Therefore, A375I mAhR was included in these assays to allow comparison with the effects of H285Y. There was some YH439 induced activation of A375I mAhR above background, supporting the suggestion that YH439 binds the LBD in a manner distinct from TCDD and predicted endogenous ligands (Fig. 5.11, data set 5). The observation that YH439 induced activity of A375I mAhR was reduced relative to wt and H285Y mAhR suggests that this atypical ligand must require some access to the modelled pocket. Given that assays were performed at 10 M YH439, where the highest induction 119 mAhR mutant H285Y is not responsive to PAH-type ligands and A375I can respond to YH439

;5(;5( 7./XF

7./XF

&RQWURO;5(/XF;5(/XF;5(/XF;EODQN 5(/XF (1) (2) (3) (4) (5)

Figure 5.11 mAhR mutant H285Y cannot be activated by typical PAH ligands TCDD, 3MC and B[a]P. The proposed ligand binding pocket blocking mutation A375I (338) was also tested for inducibiilty and found to have some activation in the presence of YH439. Stable cell lines expressing mutant forms of mAhR were cotransfected with XRE-Luc or Control-Luc vector as indicated and RL-TK for 24 h, then treated with DMSO, TCDD [1 nM], 3MC [1M], B[a]P [100 nM] or YH439 [10 M] for 16 h. Relative luciferase activity was determined as described in Materials and Methods. Data show the mean ± S.D. from a single experiment, performed in triplicate. The experiment was repeated three times with similar results. of target gene cyp1a1 has been identified in the past (302), the possibility that H285Y was a weakly activatable mutant mAhR, and that activation observed was simply due to a vast excess of ligand, could not be discounted. To address this, reporter gene assays were performed over a concentration range of YH439. This indicated that H285Y mAhR, although having no background activity, hence low detected response at 0.01 M, responded in a similar manner as wt mAhR to YH439 at 1 and 10 M (Fig. 5.12). At lower concentrations wt mAhR is proposed to be activated by a combination of background from ‘endogenous’ ligand and YH439, while only the YH439 response is detected for H285Y mAhR (Fig. 5.12). Hence, it was concluded that H285Y may bind YH439 with slightly lower affinity than wt, but can be activated with as little as 0.1 M YH439, and highly activated in the range 0.5-10 M in this assay. The other ligands employed in this analysis were used at saturating concentrations (346, 347), where levels of 1 M 3MC or 0.1 M B[a]P failed to activate H285Y mAhR.

5.2.4 Mutation of mAhR Histidine 285 to Tyrosine abolishes endogenous activation paradigms

Having established that H285Y mAhR could not be activated by maintenance of cells in suspension culture (Fig. 5.10 A, data set 6), the emergence of the enticing discovery that AhR can be activated by oxidized LDLs (59) prompted investigation of the response of H285Y mAhR to this activation mechanism. Here, shear stressed FCS applied at 60% of the total culture medium showed strong activation of XRE-driven reporter gene in the wt mAhR line, exceeding that of suspension culture (Fig. 5.13, striped and chequered bars, respectively; data set 3). However, in contrast, H285Y and A375I forms of mAhR showed no activation above background with application of shear-stressed serum, or suspension culture treatment (Fig. 5.13, data sets 4 and 5). Hence, it is clear that H285Y mAhR responds similarly to suspension culture, shear stressed serum and HAH/PAH-type ligands. This activation profile reflects that seen for the proposed ligand binding pocket blocker A375I (Fig. 5.13 A, data set 5).

5.3 Conclusions

Existing knowledge of drAhR homologs 1a and 1b has been used to perform a targeted mutagenesis screen of the mAhR LBD in order to discover mutants which had the capacity to discern discrete activation mechanisms. This study was performed in to determine if abrogating exogenous ligand binding would affect all HAH/PAH-type ligands, atypical ligand YH439 and endogenous activation states mimicked by suspension culture and application of shear stressed FCS. Past mutagenesis experiments have provided a proof-of-principle for this approach and yielded mutant mAhR, which could be selectively activated by a particular exogenous ligand, such as in the case of F318L, where 3MC was able to activate a reporter driven by this mutant, but -naphthoflavone and TCDD could not induce this 120 mAhR mutant H285Y responds to a YH439 gradient in a similar manner to wt

;5(;5( 7./XF

Figure 5.12 mAhR mutant H285Y shows activation by a gradient of YH439 similar to the profile of wt mAhR activity. The proposed ligand binding pocket blocking mutation A375I was also tested for inducibiilty and found to have some activation in the presence of YH439. Stable cell lines expressing mutant forms of mAhR were cotransfected with XRE-Luc vector and RL-TK for 24 h, then treated with DMSO, or YH439 in a gradient from 0.01- 10Mfor 16 h. Relative luciferase activity was determined as described in Materials and Methods. Data show the mean ± S.D. from a single experiment, performed in triplicate. The experiment was repeated three times with similar results. H285Y and A375I are not responsive to endogenous stimuli and predicted to localize to a similar region of mAhR PAS B

;5(;5( (A) 7./XF

7./XF

&RQWURO;5(/XF;5(/XF;5(/XF;5(/XF (1)(2)(3)(4)(5) (B)

H285 H285Y

A375

H285

A375 A375I

Figure 5.13 mAhR mutants H285Y and A375I are not activated by shear stressed FCS or suspension treatment, and are proposed to localise to a similar region of the ligand binding pocket. To examine the effect of H285Y and A375I on non-exogenous ligand based activation states, stable cell lines expressing mutant forms of mAhR were cotransfected with XRE-Luc or Control-Luc vector as indicated and RL-TK for 24h and left untreated (adherent), grown in suspension culture, cultured in 60% untreated or shear-stressed FCS for 16 h. Relative luciferase activity was determined as described in Materials and Methods. Data show the mean ± S.D. from a single experiment, performed in triplicate. The experiment was repeated three times with similar results. Alignment and annotation of effect of mutation of AhR LBD on ligand inducibility (Fig. 5.7 (A) and (B)) reveals the importance of the -sheet for inducing AhR ligand activation. To contextutalise the positional effect of H285Y and A375I on AhR activity, a Phyre generated homology model of AhR PAS B was modified by in silico mutagenesis and reveals a similar localization for the mutated residues in the tertiary folded structure of the ligand binding domain (B). activity (341). However, past LBD mutagenesis screens have simply utilised background activity to indicate whether these mutants were competent for endogenous activation. The present analysis is distinct from these investigations in that it specifically set out to explore the effects of mutations on endogenous activation of mAhR. Additionally, an atypical AhR ligand was included, to be more comprehensive in understanding the functional affects of any given mutation.

The analysis showed that substitution of histidine 285 for tyrosine in the LBD of mAhR generated a receptor which was selectively activatable by YH439, and that this mutation could also prevent the commonly seen background activation of the overexpressed receptor. Assessment of PAH-type ligand inducibility of wt, H285Y and A375I mAhR identified that H285Y was selectively responsive to YH439 and could not be activated by TCDD, 3MC or B[a]P. Additionally, A375I, theorised to block access to the ligand binding pocket (338), showed a similarly selective capacity to be activated only by YH439, although to a lesser extent than H285Y. This result implies that YH439 has a novel binding mode which does not require histidine 285, or full access to the ligand binding pocket as per the HAH/PAH-ligand paradigm (337, 338, 340). Finally, H285Y and A375I show no activity in suspension culture, or upon treatment with shear-stressed serum, which parallels the lack of background activity. An illustration of the position of H285 and A375 in the context of the Phyre generated mAhR model shows that mutation of these sites positions the side chains in close proximity, indicating that they may have similar effects on ligand exclusion (Fig. 5.13 B). Thus, it seems likely that these activation mechanisms are mediated by an endogenous ligand structurally similar to the PAH class of xenobiotic ligands, which does not tolerate mutation of histidine 285 to tyrosine, or alanine 375 to isoleucine, in order to induce AhR activity. Alternatively, endogenous signals, such as post-translational modifications, responsible for activation in suspension and in the presence of oxidized LDLs, may share similar structural requirements to HAH/PAH binding mechanics.

These data support the hypothesis, as has been proposed in the past (338), that endogenous ligand(s), which can activate wt mAhR, are present in cells in culture. Such ligands cannot interact with the blocked ligand binding pocket of A375I, and are not permissive of a tyrosine at position 285. Furthermore, the mechanism by which shear-stressed serum can activate mAhR likely involves a ligand similar to that induced by suspension culture. More specifically, it seems plausible that both of these endogenous activation states are mediated by a ligand(s) with similar properties to the polyaromatic hydrocarbon type ligands TCDD, 3MC and B[a]P. Given that H285Y mAhR responds similarly to all of these treatments, all proposed ligands must have a similar binding pocket requirement. The corollary to this, of course, is that YH439 must have a distinctly different mode of action in that it is permissive of H285Y mutation and can maintain wt mAhR levels of activation of a reporter gene. Hence, YH439 may represent a physiologically relevant new class of AhR ligand, and H285Y mAhR would seem a good 121 diagnostic tool for the classification of this type of binding event. An analysis of target gene induction may prove important to understanding the capacity of H285Y to activate in a chromatin context. Additionally, analysis of the point at which H285Y signalling is abrogated in the presence of HAH/PAH type ligands, and in endogenous signalling roles may aid in understanding regional structural contributions to cytoplasmic/nuclear localization, Hsp90 and ARNT binding, and interactions with transcriptional machinery.

What remains to be established is whether mutations in the LBD are having differential impacts on other AhR target genes that may be cell context specific, or indeed, if mutations are altering the efficiency of AhR in its functional role as an adaptor protein in the Cul4B(AhR) ubiquitin ligase context. It is predicted that an analysis of mutations in the context of different ligand species, and assessment of these less well characterised aspects of AhR function, will become important as a read-out of the spectrum of roles for AhR in the cell, rather than simply looking to TCDD binding and activation of the Cyp1a1 promoter as an analytical catchall for AhR functionality.

122

Identification of Post-Translational Modification of the Mouse Aryl Hydrocarbon Receptor Chapter 6: Identification of Post-Translational Modification of the Mouse Aryl Hydrocarbon Receptor

6.1 Introduction

6.1.1 PTMs induce conformational change and form docking sites for Protein:Protein Interactions

Post-translational modification (PTM) has been identified as a mechanism for increasing the complexity of the eukaryotic proteome to allow individual proteins to regulate vast networks of signalling cascades and protein:protein interactions. PTMs identified include covalent incorporation of small functional groups, such as phosphorylation, acetylation, methylation and hydroxylation; the addition of small signalling peptides, such as ubiquitination, sumoylation and neddylation; sugar groups, such as glycosylation; and lipoylation to induce membrane anchorage by myristoylation and farnesylation. For the purpose of this investigation, the review of PTM functions has been restricted to those identified in this study as occurring on mouse AhR, including phosphorylation, pyrophosphorylation, sulfonation, acetylation, and methylation.

Phosphorylation The most studied PTM is phosphorylation, a common modification that is believed to occur in approximately 30% of the human proteome (348). Protein kinase A (PKA) was the first enzyme identified to specifically target a serine residue in a sequence dependent manner, and covalently attach a phosphate moiety (349). This discovery was closely followed by the observation that v-Src was able to phosphorylate target tyrosine residues (350). Amongst the most abundant tyrosine kinases are the growth factor receptors, or receptor kinases. These enzymes were identified for their capacity to autophosphorylate (348), creating primary docking sites for phospho binding Src homology 2 domains (SH2), culminating in intracellular signalling cascades resulting from external receptor stimulation (351). Mitogen activated protein kinases (MAPK) were found to have dual specificity for serine/threonine phosphorylation, and required S/T and Y phosphorylation by MAP kinase kinases (MAPKK) to become active. MAPKK activity occurs in response to growth factor stimulation, perpetuating an S/T phosphorelay through MAPK (352).

A recent investigation of cell cycle regulation highlights the role of both protein disorder and PTMs in controlling the association and kinase activity of the p27/cyclin dependent kinase 2 (cdk2)/cyclin-A complex. Within the complex, p27 functions to inhibit cdk2 kinase activity. Tyrosine phosphorylation of p27 within a flexible region causes a conformational change that ejects a p27 C-terminal helix from the 123 cdk2 ATP binding pocket, while maintaining the complex association, resulting in partial activation of this kinase. The proximal target p27 is phosphorylated by the active cdk2/cyclin-A at a threonine residue in a second disordered region, resulting in exposure of a ubiquitination target site, and proteosomal degradation. This event frees the cdk2/cyclin-A complex, inducing maximal kinase activity to accelerate progression from G2 through S-phase of the cell cycle (353).

Acetylation Lysine acetylation mediated by histone acetyl transferase enzymes (HATs) has been identified in many non-histone nuclear regulators, in excess of 80 transcription factors, and a number of cytoplasmic proteins (354). 30 HATs have been identified to date, with targets including p53, as described below; signal transducer and activator of transcription (STAT3), estrogen receptor (ER), c-Myc, and HIF-1. Given that the lysine side chain is the target of multiple modifications, including ubiquitination, sumoylation, neddylation and methylation, it is clear that each event may compete for the same lysine, dependent on surrounding sequence/HAT compatibility. A recent review suggests three typical roles for lysine acetylation; 1) acetylation at a single, or a few sites functions as a binary switch; 2) acetylation of a cluster of lysine residues may regulate the effects of charged surface patches to modify protein structure; and 3) acetylation may interact with other similarly targeted PTMs to regulate events in a site specific manner, such as abrogation of ubiquitination and degradation (355).

As described further below, lysine acetylation of p53 has been shown to interfere with polyubiquitination, thereby increasing protein stability. Acetylation of multiple lysine residues within the regulatory domain is also theorised to result in domain rearrangements to activate DNA binding (Fig. 6.1 A). Another facet of acetylation is the potential to act as a docking site for regulatory factors that contain an acetyl-lysine binding bromodomain, such as the interaction between acetylated signal transducer and activator of transcription (STAT3) and the bromodomain of p300 (356). In another example, p300 acetylates ER at K299, K302 and K303, while phosphorylation by PKA at S305 prevents acetylation at K303, indicating crosstalk between PTM proximal target sites, in addition to direct competition (357). Treatment with HDAC inhibitors shows increased basal ER transcriptional activity (358). Increased ER protein levels following inhibition of HDAC with trichostatin A (TSA) also suggests a similar mechanism of antagonism with polyubiquitination and degradation to that observed for p53. bHLH.LZ protein c-Myc is acetylated by PCAF at 149, 323 and 417, where K323 is located in the NLS, and K417 within the LZ region. However, acetylation appears to have no impact on nuclear localization or dimer formation with partner protein Max (359). p300 acetylation of Myc within the region between the TAD and HLH.LZ domains has been shown to directly decrease c-Myc protein levels in a proteosome dependent manner (360). Acetylation of bHLH.PAS factor HIF-1 at lysine 352 by N-acetyltransferase ARD1 similarly modifies protein stability, stimulating ubiquitination resulting in increased degradation (361). 124 PTM regulation of p53 activity (A) 300 393 p53 TA Pro-rich DBD NLS TET REG

Sumo Ac

di Me Phos SPQPKKKPLDGEYFTLQIRGRERFE.…….DAQAGKEPGGSRAHSSHLKSKKGQSTSRHKKLMFKTEGPDSD 315 321 331 339 352 366 370 393

NEDD8 Ub Me

Smyd2 Set9 (B) me-K370 Smyd2 p53 me-K372 P21 p53 mdm2

DNA Damage Tip60

me-K372 Apoptosis p53 meK372 p53

ac-K120

p300

Figure 6.1 Overlapping modification of the C-terminal regulatory region of human p53 generates nuanced transcriptional outcomes. Shown is a schematic of the linear domain organisation of p53, including the Transcriptional Activation (TA), Proline-rich (Pro-Rich), DNA Binding Domain (DBD), Nuclear Localization Sequence (NLS), Tetramerization domain (TET), and C-terminal Regulatory Region (REG). A closer inspection of the sequence of the REG region shows a wide variety of modifications which enhance (green arrow, above) or repress (red arrow, below) p53 transcriptional activity, typically by regulating degradation, coactivator interactions and DNA binding (see section 6.1.2 for a summary of these modifications). Schematic representations of modifications include sumoylation of lysine (blue oval, Sumo); acetyl-lysine (acetyl, yellow rectangle, Ac); methyl- lysine (di and mono methyl, pink rectangles, Me); phospho-Ser and phospho-Thr (grey oval, Phos); neddylation of lysine (light blue rectangle, NEDD8); and ubiquitinylation of lysine (green circles, Ub). (B) In the latent state, p53 is methylated by Smyd2 at K370 inhibiting DNA binding. After DNA damage, methylation of K372 by Set9 inhibits Smyd2 binding and activates DNA binding. Me-K372 forms a docking site for Tip60 which acetylates K120. Ac- K120 forms an interaction site for p300, upregulating target gene expression. Ac-K120 additionally defines cellular outcomes of p53 activation, resulting in apoptosis.

Methylation Another modification group competing for target lysine residues are mono, di and tri-methylation which are imparted by enzymes such as the SET domain (Su(var)3-9, Enhancer-of-zeste, Trithorax), Smyd-2 (SET and MYND domain-2), Tip60 (HIV Tat-interacting protein of 60kDa) and MOF (males absent on the first) methyltransferases (362). Additionally, there are nine mammalian protein arginine methyltransferases (PRMTs) that target arginine residues for monomethylation, symmetrical dimethylation, and asymmetrical dimethylation (363). Methylated side chains can function as direct protein docking sites for methyl recognition domains include Chromo, Tudor, MBT (malignant brain tumor) and WD repeat domains (reviewed in 362). The best characterised roles for methylation are in modification of histones, to regulate chromatin remodelling; and in regulation of p53 function in the cell’s response to DNA damage. Histone regulation by methylation describes the complexity of responses possible following simple chemical modifications. Individual modifications have been identified which are associated with altered transcriptional output, including H3K4 methylation which correlates with active chromatin, while H3K4 dimethylation incurs basal transcription, and trimethylation has been isolated on fully active promoters (364). However, histone methylation is typically associated with gene silencing, such as trimethylation of H3K23, known to occur during initiation of X-chromosome inactivation (365). A recent study revealed that H3K20 in Drosophila and human cells is targeted for monomethylation by Pr- SET7/9, and further modified to a dimethylated form by Suv 4-20, a methyltransferase with di and, to a lesser extent, trimethylation activity. Further analysis showed that Suv 4-20 can dimethylate this site without a requirement for a monomethyl priming modification (366).

A non-histone methylation target, TBP associated factor 10 (TAF10) is a component of the general transcription factor complex, TFIID. Following SET7/9 mediated monomethylation, TAF10 shows subtly increased transcriptional activation potential, due to enhanced interaction with RNA Pol II that is dependent on the methylation status of TAF10 K189 (367). The function of p53 methylation involves direct competition with a number of other PTMs for common target lysine residues; prevention of regulatory protein interactions; specification of target promoter binding and generation of specific docking sites for coactivator proteins, described further in the following section. Transcription factor forkhead box O-1 (FOXO1) is methylated by PRMT1 at a residue within an Akt phosphorylation recognition sequence, preventing Akt binding and increasing FOXO1 target gene activation. This represents the first evidence for arginine methylation as a negative regulator of Akt mediated phosphorylation (368).

125 6.1.2 Post-Translational Modification of Transcription Factor p53 p53 has been characterised as a tumor suppressor that is induced during periods of cell stress, resulting in cell cycle arrest in G1 and G2, or apoptosis by activation of targets including p21 (369). Like AhR, p53 is a sequence-specific transcription factor, which has recently been intensively analysed for post-translational modifications. Post-translational modification of p53 is a complex crosstalk between a number of different modifications, and a complete review of this field is beyond the scope of this thesis. However, by focusing on the C-terminal regulatory region of p53, information about the temporal and context specific effects of a concert of modifications is useful for understanding the breadth of possible regulatory outcomes for proteins that, like p53, undergo a large number of modifications, now theorised to describe a PTM code (reviewed in 370) (summary, Fig. 6.1 A).

DNA damage and other cell stress events result in upregulation of p53 protein levels. This induction is dependent on post-translational mechanisms, specifically resulting in disruption of an interaction with the p53 ubiquitin ligase MDM2, and prevention of targeted degradation. Evidence indicates that PTMs such as phosphorylation and acetylation may be involved in regulating the interaction and ubiquitination of p53 by MDM2.

Phosphorylation of p53 Human and mouse p53 phosphorylation events occur at a number of N and C-terminal sites by interaction with a range of kinases, and to affect p53 stability and interaction with transcriptional activators such as CBP/p300 and 14-3-3 proteins (reviewed in 371). DNA damage induces Checkpoint kinases Chk1 and Chk2, which target serine and threonine residues in the MDM2 binding region, disrupt this interaction surface and prevent degradation of p53 (372-374). Phosphorylation within the C-terminal regulatory region occurs at S392, with little apparent effect on activity (375); however, phosphorylation of S366 and T387 results in enhanced transcriptional activity (376).

Acetylation of p53 Association with HAT enzymes CBP/p300 is necessary for p53 activity (377), resulting in acetylation of p53 at a number of lysine residues (378). Additionally, Tip60 and MOF are acetyltransferases that target K120 within the DNA binding domain. Ac-K120 activates a pro-apoptotic target gene ensemble, but shows limited impact on cell-cycle, inferring that this PTM may be critical for mediating the apoptotic versus stalled cell-cycle p53 response (379, 380). Lysine 320 has been identified as a target of HAT p300/CBP Associated Factor (PCAF), and shows acetylation dependent increased in vitro DNA binding affinity. Acetylation of lysines 370, 372, 373, 381, 382 has been identified, where 373 and 382 are preferentially acetylated by p300, and in vitro acetylated p53 was found to bind DNA by EMSA at a higher affinity than unacetylated protein (381). This observation concurred with the hypothesis that the 126 C-terminal regulatory region of p53 docks into the core of the p53 tetramer in the unmodified state, maintaining an inactive state; and that this interaction is disrupted following acetylation (381, 382). K373 replacement with arginine was found to decrease transcriptional activity by 10%, implicating some subtle effects on DNA binding, and/or transcriptional regulation (378). Further analysis revealed that acetylation of K373 and K382 forms a scaffold for binding the double bromodomain of the TAF1 subunit of TFIID on the p21 promoter, following UV exposure, resulting in increased target gene induction (383). Both modification sites were shown by Western blot with acetyl-lysine specific antibodies to increase in occurrence upon UV irradiation. Two C-terminal regulatory domains (RD) have been defined, which showed maintenance of acetylation, including RD1 (300-325) and RD2 (355-393) (378, 381).

Acetylation of p53 regulates ubiquitination and degradation A number of acetylation target sites have also been identified as either direct targets, or in proximity to ubiquitination sites. MDM2 binds p53 through the N-terminus, however, C-terminal sequences including residues 362-392 are known to be required for the ubiquitin mediated degradation of p53 in the absence of cell stress (384). Mutation of six lysine residues within this region (370, 372, 373, 381, 382 and 386) to arginine illustrated that this group of mutations specifically prevented MDM-2 dependent ubiquitination of p53 (385). Further refinement of target sites narrowed this region to include a requirement for K370, K373, K381 and K382, which while binding to MDM2 with wt affinity, showed impaired ubiquitination that could not be mimicked by any single point mutation. This effect was hypothesised to be due to the compensatory ubiquitination of a neighbouring lysine in the absence of a single target residue (386). In a similar conjugation event, MDM2 targets p53 for modification by the ubiquitin-like protein, NEDD8. Neddylation has been identified within the C-terminal regulatory region and requires lysines 370, 372 and 373, and appears to result in decreased p53 transcriptional activity (387). F-box domain protein FBXO11 has also been shown to act as a recognition component of a second p53 neddylation conjugating complex and is implicated in NEDD8 conjugation to lysines 320, 321, 370, 372, 373, 381, 382 and 386, and similarly results in decreased p53 activity following modification (388). Proximal to this region, K386 is the target of a sumoylation event in response to DNA damage, shown to increase the transcriptional activity of p53 (389).

Methylation of p53 Given the complex interplay of modifications including phosphorylation, acetylation, ubiquitylation, neddylation and sumoylation, recent research has investigated the possibility that p53 is methylated by the SET domain methyltransferase Set7/9, previously characterised as a histone modification enzyme (390). p53 has been identified as a Set7/9 target in an in vitro methylation screen, where the modified residue was isolated within the C-terminal regulatory region encompassing amino acids 290-393. This fragment was further broken into three parts and lysine residues within the methylated peptide were 127 serially mutated to non-methylatable arginine, resulting in identification of K372 as the target of Set7/9. The modification was confirmed to be a monomethyl form, and promoted stabilisation of p53 protein resulting in increased transactivation capacity, inducing apoptosis of Set7/9 overexpressing cells. Given the small size of the methylation mark, the authors speculate that a protein interaction induced by this modification may occlude MDM2, rather than directly interfering with ubiquitination of neighbouring lysine residues. Additionally, the authors propose that monomethylation of K372 may precede other PTMs within the C-terminal regulatory region (391). Subsequently, this proposal was validated by the discovery that methylation of p53 causes binding of a chromodomain containing acetyltransferase, Tip60 (Fig. 6.1 B) (392). In vivo activation of p53 by application of DNA damaging agent Adriamycin (Adr) resulted in a dramatic increase of me-K372 bound at the p21 promoter as assessed by ChIP (Fig. 6.1 B). K370 has been found to be monomethylated by Smyd-2 (393), resulting in repression of p53 (Fig. 6.1 B), while dimethylation by an unknown methyltransferase promotes an interaction with coactivator protein p53 binding protein-1 (53BP1) to increase activity (394). Further, a lysine demethylase (LSD1) is preferentially able to remove the dimethylation mark to repress p53 activity (394). Finally, methylation of K382, mediated by methyltransferase SET8 has been confirmed as a monomethylation event, resulting in decreased p53 activity at the most potently activated target genes (395).

Common target residues and proximal modifications regulate p53 function Notably, the sites of modification described above fall within a hot-spot for modification, and, more specifically, overlap between different modification types are hypothesised to generate mosaic effects to fine-tune p53 signalling. The proposal that this C-terminal region acts in trans to regulate the DNA binding properties of p53, in the context of the layered modifications occurring within this region, infers the importance of specific complements of modifications to regulate p53 DNA binding and interactions with coactivators at the target promoter site. Investigations into modification patterns are ongoing and have generated useful information about the occurrence of particular PTMs; their impact on modification at proximal sites within this region; and the functional outcomes of the different modification batteries. As described above, methylation of K372 by Set7/9 results in binding by acetyltransferase Tip60. Tip60 acetylation of K120, in addition to specifying apoptotic outcomes, forms a docking site for p300 to increase transcriptional activity (Fig. 6.1 B) (355). Smyd-2 monomethylation of K370, in contrast to me- K372, acts to repress p53 activity by reducing the amount of promoter bound p53. Crosstalk between the proximal modifications regulates activation of p53, where me-K372 prevents interaction of Smyd-2, thereby abrogating methylation of K370, resulting in target gene induction (Fig. 6.1 B). Additionally, apoptosis after Adr treatment showed dependence on the Smyd-2 mechanism of regulation, where Smyd-2 shRNA resulted in increased apoptosis (393). Arginine methylation by Protein arginine Methyl- transferase-5 (PRMT5) at target sites R333, R335 and R337 has recently been demonstrated to 128 similarly increase cell cycle stalling and decrease apoptosis effects of p53 activity, by directly regulating target promoter binding specificity (396).

Summary of p53 PTMs In summary, p53 is regulated at multiple levels by PTMs targeted to common lysine residues, resulting in biologically relevant outcomes for specific patterns of modification. Following DNA damage, phosphorylation/acetylation modifications rapidly activate transcriptional activity of p53 to induce cell- cycle arrest, apoptosis or senescence (397). Acetylation of lysine residues markedly alters the side- chain chemistry to prevent hydrogen bonding, and masks the positive charge to potentially alter protein structure, or protein:protein interactions. A recent review proposed that acetylation typically follows primary phosphorylation responses, and may convert these signals to ‘acetylation-based actions’ (355). Certainly, p53 fits this theory, in that acetyl-lysine modifications have been shown to prevent ubiquitination, thereby increasing protein stability, while simultaneously relieving p53 domain arrangements thought to interfere with DNA binding. Acetylation additionally forms structural platforms for interaction with coactivators to increase transcriptional activity, and may regulate target gene specificity. Methylation status at K370 and K372 has been shown to be crucial for transcriptional regulation, and these mutually exclusive modification events create a regulatory framework for switching between ‘off’ and ‘on’ states to regulate apoptosis. The possibility that other transcription factor responses may be similarly regulated suggests that comprehensive mass spectrometric analysis could yield interesting detail for those proteins for which PTMs have already been identified, such as AhR. Prediction and identification of regional modification hot-spots, as evidenced in the case of p53, may aid in attempts to generate targeted mutations to investigate biological functions for post-translational modification of other transcription factors.

6.1.3 AhR Regulation by Post-Translational Modification

Recently, endogenous activation of AhR has been examined with the result that contextual activation, such as oxidised LDL and cell suspension induction, has been identified without a supporting molecular mechanism. Post-translational modification has been investigated as a possible means of activating AhR in signalling events for which ligands have not been identified. It is well known that phosphorylation is a typical means of tightly regulating transcription factor responses, including muscle cell differentiation bHLH proteins MyoD and Myf5 (398, 399); circadian rhythm PAS protein Per (400, 401) and bHLH.PAS protein Clock (402); and oxygen sensitive bHLH.PAS protein HIF2 (403).

129

Phosphorylation of AhR Regulation of AhR signalling by PKC or PKA has also been inferred due to the presence of target sites in mAhR and hAhR. Activation of PKC by phorbol-12-myristate-13-acetate (PMA) has been shown to result in superinduction of AhR activity after ligand activation (404, 405). Treatment of hepatoma cells with universal second messenger cAMP has additionally been found to result in nuclear accumulation of AhR, potentially due to activation of PKA (406). Ligand activated AhR/ARNT purified from nuclear extracts is able to bind XRE in EMSA analysis, while phosphatase treatment abrogated this function. Treatment with serine/threonine and tyrosine specific phosphatases shows that generally, phosphotyrosine(s) of AhR/ARNT are crucial for DNA binding by the purified heterodimer in vitro. Subsequently, pTyr was detected by Western blot within the AhR protein, although the specific residue(s) has not been identified; however, this evidence indicates that tyrosine phosphorylation of AhR, rather than ARNT, was the modification(s) affecting DNA binding (407). Investigations into residue specific modifications under a range of activation states have been further interrogated for the regions spanning the nuclear localization and nuclear export sequences.

Phosphorylation within the NES Target genes are activated in keratinocytes in suspension culture, with Cyp1a1 mRNA levels and enzyme activity increasing rapidly in complete suspension (within 1 hour), or with an overlay of methylcellulose containing media (116). The authors speculated that AhR may mediate this induction after changes in adhesion and/or cell shape result in production of an endogenous AhR ligand, perhaps a portent of the subsequent investigations into lipidic ligands. Recent research revealed that cell density regulates the intracellular localization of AhR, where low density cell culture shows AhR predominantly in the nuclear compartment. Disruption of cell-cell contact by treatment with divalent cation chelator EGTA or calcium-deficient medium also activated nuclear localization of AhR. AhR was shown to be active at low cell density as assessed using an XRE reporter, and an XRE driven GFP reporter was found to activate along the wound edge in an in vitro HaCaT wound model (89). In the same study, it was predicted that the NES of AhR, having a similar consensus to other functional transcription factor NESs, may undergo phosphorylation of serine residues like those of p53 and ER, to regulate export from the nucleus. A mutant AhR peptide (55-75) S68D, with a GFP tag, injected into the nucleus of Madin-Darby bovine kidney cells, was found to remain in the nucleus. An antibody to AhR-pS68 identified modified AhR in the nuclear compartment of a 3MC treated mouse liver section (89). p38 MAPK has been described as an inhibitor of p53 and ER nuclear export (408, 409), hence inhibitors to p38 MAPK were investigated for their capacity to decrease levels of nuclear AhR with the result that reporter gene activity was halved and nuclear localization decreased (89). The inference from this investigation was that cell suspension induced phosphorylation of AhR NES residue S68, thereby 130 preventing the nuclear export of AhR, resulting in increased target gene induction. An alternative hypothesis has also been described, where a phosphatase inhibitor, which is localized to the nucleus in sparse cell populations (410), may be preventing removal of the phosphorylation mark on the AhR NES (89).

Phosphorylation of the AhR NLS The corollary to nuclear export of AhR is regulation of nuclear import by modification of the NLS. Two potential PKC sites have been identified within the human AhR NLS (amino acids 12-42), including S12 and S36 (S11 and S35 in mouse AhR). In vitro phosphorylation assays indicated that S36 was most readily phosphorylated in the context of a . Alanine replacement of either site showed no effect on ligand dependent localization of AhR, while creating a phosphorylation mimic by replacing the target serine residues with aspartic acid resulted in cytoplasmic localization, even in the presence of 3MC. In contrast, immunohistochemistry employing an anti-pS36 primary antibody showed that phosphorylated AhR was only detected in the nucleus of the liver of a ligand treated mouse (108). Phosphatase treatment of AhR has additionally shown that XRE binding affinity is abrogated in vitro (411), indicating that phosphorylation of the NLS region may not only impact on nuclear import, but also DNA binding (108). Hence, in summary, it appears that phosphorylation of the NES may enhance nuclear localization, while phosphorylation within the NLS may increase DNA binding and prevent cytoplasmic/nuclear recycling of active AhR. Finally, work conducted in our laboratory indicates that mAhR undergoes dimethylation at K87, although no functional outcome of this modification has been identified (412).

Summary and Aims In spite of a body of evidence implicating the relevance of phosphorylation for AhR function, a thorough analysis of the modification of AhR under a range of activation states is currently lacking. Revelations about the concert of modifications involved in regulation of transcription factor p53 indicate that a range of PTMs, including acetylation, methylation and phosphorylation, may convey exquisitely complex functional output. Hence, we undertook a thorough analysis of tryptic peptides of untreated mAhR, searching for sites of post-translational modification in an unbiased manner. Additionally, an analysis of YH439 and cell suspension activated mAhR was initiated, although confirmation of identified modifications awaits further analysis.

131 6.2 Results

6.2.1 Disorder as a Predictor of Modification and Protein:Protein Interaction Sites

In the past, protein function was typically inferred to result from mechanisms associated with structured regions, based on primary amino acid sequence. This paradigm assumed that the biologically relevant state for the protein of interest was the folded, lowest energy conformation, referred to as the native state. This species was considered to represent the functional fold, given that unfolded, or denatured proteins are typically associated with loss of function (413). The discovery of intrinsically disordered proteins (IDPs) such as histones, calcineurin and , implicates disorder as being important for protein function, and being specified by particular amino acid codes (414). Intrinsic disorder refers to the folding properties of a given sequence, where side chain packing is non-cooperative and random, creating dynamic folding events (415). Proteins with no intrinsic disorder represent less than one third of the deposited PDB entries (416), where regions of disorder typically form in long loops or protein termini, but can also exist for short sequences of only a few residues (415). The function of intrinsically disordered regions have been reviewed comprehensively, resulting in the identification of four classes, including: i) molecular recognition; ii) complex assembly; iii) entropic chain activities, and iv) protein post-translational modification (417). Entropic chain activities represent the function of protein sequences that form mobile regions such as spacers, linkers and bristles (415). The other three classes broadly define protein:protein interface interactions, where molecular recognition is primarily involved in signal transduction. Complex assembly comprises intrinsic disorder of proteins that regulate virus, ribosome, and cytoplasm assembly. Finally, protein modification allows for increased complexity in protein function and fold, where the modification itself may form a protein interaction surface, or may induce allosteric changes to alter molecular structure or function. Analysis of characterised IDPs reveals characteristic sequence biases that can be utilised for de novo predictions of disorder, where residues D, M, K, R, S, Q, P and E are disorder-promoting. A recent review highlighted 14 different programs for in silico disorder prediction, finally illustrating the utility of three analysis programs on a selection of targets that have not been structurally characterised. This procedure included analysis of the mean hydrophobicity/mean charge ratio (418), where high charge and low hydrophobicity have been correlated with native disorder and indicate the propensity for the whole protein, or selected fragments, to be natively ordered, or disordered. Application of PONDR (Predictor of Natural Disordered Regions) allows prediction of disorder from sequence alone, both interior disorder, and a specific analysis of terminal sequences (419). Finally, the disorder prediction suite DisEMBL was utilised to predict regions of loops/coils, ‘hot-loops’ (structures with high temperature factors that are not helical or strand forms), and additionally, this program searches for similar sequence regions from the PDB that are missing structure, thereby inferring disorder, termed Remark456 (420). PTM prediction has additionally recently 132 evolved to encompass disorder prediction as a means of predicting modification probabilities, including phosphorylation using DISPHOS (421) and methylation, although this prediction program is not publicly available (422).

Mouse AhR disorder prediction For this investigation, PONDR and DisEMBL hot-loops and loop/coil prediction programs (http://dis.embl.de/) have been applied to the mouse AhR, to predict regional mobility (Fig. 6.2). Access to PONDR® was provided by Molecular Kinetics (6201 La Pas Trail - Ste 160, Indianapolis, IN 46268; 317-280-8737; E-mail: [email protected]). VL-XT is copyright©1999 by the WSU Research Foundation, all rights reserved. PONDR® is copyright©2004 by Molecular Kinetics, all rights reserved. While predicted loops and coils are spread throughout AhR, hot-loops and PONDR restrict the regions of predicted disorder to the N and C termini. The PAS domains broadly appeared to be ordered, while sequences between the domains, and C-terminal of PAS B continuing to the beginning of the transactivation region were predicted to be disordered. The bipartite NLS was identified as disordered by Hot-loops and PONDR, concurring with a requirement for disorder in this region to potentiate interactions with import factor importin- (karyopherin-2) (423, 424). Interestingly, given the correlation that disorder often accompanies protein:protein interactions, natively unfolded regions were found to extend from the C-terminus of PAS B, matching the region predicted to interact with Hsp90. Additionally, PTMs identified in this study (see Section 6.2.2) appear to cluster predominantly within the sites of disorder, and more specifically, around the NLS, NES and Hsp90 contact regions (yellow lines, Fig. 6.2). This analysis indicates that disorder, as previously described, is likely important for AhR interaction with regulatory proteins such as Hsp90 and importin-. Correlation of disorder with PTM sites indicates that regional disorder may allow PTM regulation of AhR function by forming protein docking sites, as in the case of acetylation and methylation; or that flexibility in these regions allows modification induced conformational change to regulate signalling outcomes.

6.2.2 Purification and analysis of Post-translational modification of exogenously expressed mouse AhR from Control, YH439 and Suspension Treated Cells

A thorough analysis of AhR PTMs necessitated large-scale purification of material for mass spectrometric identification of tryptic fragments. Stably integrated full length D83A HisMyc mAhR 293T cell lines were established and maintained in puromycin at 10 g/ml (S.Furness, PhD thesis, 2003). D83A mAhR represents a mutation with no effect on function (data not shown) which was introduced during cloning. Cells were left untreated or treated with YH439 [10 M] or maintained in suspension culture for 16 hours prior to preparation of whole cell extracts as previously described (S. Furness, 133 Disorder Prediction of Mouse AhR NLS NES NES HSP90 and Ligand Binding Transactivation Region 13 39 62 72 212 224 287 421 13 88 280 381 bHLH PAS A PAS B 1 115 265 600 805 Loops/Coils 50-62 175-208 240--300 365-386 456-489 538-572 616--670 770 -- 779

1 -- 39 85-99 220-229 309-320 410-449 511-529 585-598 679 --- 765 Hot-Loops 82-97 362--387 554- 566

1 -- 39 285-294 427-447 791-805

PONDR 453-465 10--45 352-368 410-435 514-520 608-634 721 805

1-3 182--202 378--393 502 588-594 718-719 750-798 447-449 PTMs

Figure 6.2 Predicted regions of disorder and novel post-translationally modified regions of mouse AhR. DisEMBL (420) server was utilised to predict regions of mAhR that form loops and coils, loops that show high mobility by temperature factors (hot-loops), and sequences that are have missing coordinates in the protein data bank, termed Remark465. These regions have been illustrated on a linear representation of the mDR sequence, where the predicted sequence numbers are annotated in an alternating pattern, below and above the schematic. Analysis of post-translational modifications under control conditions, or following treatment with YH439 and cellular suspension shows, excluding the Helix 2 region of the HLH (72-86), overlap with predicted loop/coil and disordered regions, as indicated by the yellow lines (see Fig. 6.4 for a summary of modifications and peptide fragments, after specific treatment conditions). Additionally, the NLS and Hsp90 contact region C-terminal of PAS B are predicted to be disordered, implicating the potential for key protein:protein interactions at these sites. 2003). Cells were lysed overnight into 6M guanidine lysis buffer, containing 20 mM DTT overnight. Fully reduced lysate was treated for 2 hours at room temperature with 500 mM iodoacetamide to protect cysteine residues, centrifuged and filtered prior to denaturing IMAC purification (Fig. 6.3 A). Material bound at 20 mM imidazole was washed from guanidine into SB3-10 buffer, and finally eluted in SDS buffer for preparative SDS-PAGE. The identity of coomassie stained band was confirmed on analytical SDS-PAGE by Western analysis probed with anti-Myc primary antibody (Fig. 6.3 B). Preparative SDS- PAGE, electroelution, trypsin digestion, HPLC separation and analysis by MALDI-TOF/TOF was performed by K. Dave at the University of Queensland in the laboratory of Professor Jeffrey Gorman (412).

The modification pattern of untreated mAhR has been conclusively demonstrated, and includes phosphorylation, acetylation, dimethylation, sulfonation and carboxamidomethylation events that cluster in the N-terminus within the bHLH, and the interface between PAS A and PAS B. Extensive PTMs are localized to the C-terminal half of mAhR in the linker between PAS B and the transactivation domain. Modification of active mAhR shows a similar restriction to particular regions of the protein, previously predicted to be disordered (Fig. 6.2). The presence of different modifications following activation relative to untreated mAhR infers the potential for different functional outcomes from (1) YH439 activation and (2) cell suspension activity. However, the identification of modifications of active mAhR has not been replicated as many times as for untreated protein, and no enrichment strategies have been employed for these samples, hence the current detail of active modifications cannot be assumed to represent the complete modification pattern. Modifications detected on mAhR have been summarised in Figures 6.4 (N-terminal half) and 6.5 (C-terminal half) on a linear domain schematic for untreated, YH439 treated and suspension treated mAhR. Modification sites have additionally been illustrated on an alignment of a number of AhR homologous sequences (Fig. 6.6 and 6.7) to infer potential conservation of PTM target sites and hence, PTMs.

Post-Translational Modifications of the N-terminal Basic Region of mAhR The N-terminal region of mAhR was phosphorylated in untreated mAhR, including S11 and T21. Within the basic region of untreated mAhR, phospho-serines 32 and 35 were identified. Threonine 44, within the first half of the predicted helix 1 fold was also found to be phosphorylated (Fig. 6.4). Identification of phosphorylated S11 (hAhR S12) and S35 (hAhR S36) within the bipartite NLS is in agreement with the previous prediction that these residues may be targeted by PKC. Threonine 21 (hAhR T22) was also predicted as a potential PKC target, although not investigated further by mutagenesis (108). Lysine modification within this region occurs proximal to phosphorylated residues, including dimet-K20, acetylated or trimethylated K23 (the modification is isobaric and has not been conclusively identified), mono, di and trimethylated K31; and mono and dimethylated K36. It is interesting to note that K20, K31 134 (A) Denaturing IMAC Purification of mAhR

HisMyc mDR 293T Stable Cell LInes

Treated 2 hours

Guanidine Lysis

Nickel IMAC

SB3-10 Wash

SDS-Wash

SDS Elution, Fractions 2 and 3 SDS-PAGE

(B) IMAC Purified HisMyc-mAhR

2 3 2 3 n n o n o n ti io ti io c t c t a c a c r ra r ra F F F F

Contaminant Anti-Myc Western Blot Purified HisMyc mAhR Coomassie Stained SDS-PAGE

Figure 6.3 IMAC Purification of full length wt HisMyc mDR for mass spectrometric analysis of posttranslational modifications in untreated, YH439 and suspension treated stably transfected 293Ts. Stable integrated full length D83A HisMyc mDR 293T cell lines were established and maintained in puromycin at 10g/ml. Cells were left untreated or treated for 2 hours with YH439 [10M] or maintained in suspension culture prior to preparation of whole cell extracts. Cells were lysed into 6M guanidine containing buffer, centrifuged and filtered prior to denaturing IMAC purification. (A) Bound material was washed from guanidine into SB3-10 and finally eluted in SDS containing buffer for (B) SDS-PAGE and the identity of coomassie stained band confirmed by Western analysis probed with anti-Myc primary antibody. Large format SDS-PAGE, electroelution, trypsin digestion and analysis by MALDI-TOF/TOF was performed by K. Dave at the University of Queensland (412). Modification of the bHLH region of mAhR

NLS NES NES HSP90 and Ligand Binding 13 39 62 72 212 224 287 421 Transactivation Region bHLH PAS A PAS B

1 0 1 3 1 2 5 6 4 2 5 2 4 5 7 1 2 2 2 3 3 3 3 4 6 6 7 7 7 8 s k t kks s k t s sy k Untreated

me me me me me3 me2 P 2 P 3 P P P P P P me2 /ac me2 me2 me3

0 7 2 4 5 9 8 8 7 7 7 7 ssyk s k YH439

P P P ac P me2

0 7 8 8 s k Suspension

P me2

Figure 6.4 Mass Spectrometric analysis of post-translational modification of mouse AhR N-terminal bHLH including the NLS and NES, revealed a large number of modifications. Exogenous HisMyc-mAhR1- 805 was partially purified by denaturing IMAC. SDS-PAGE in gel tryptic digestion and mass spectrometric analysis was performed by Keyur Dave and Prof. Jeff Gorman. A thorough analysis of this region of untreated mAhR showed modifications including monomethylation (me), dimethylation (me2), trimethylation (me3) and phosphorylation (P). Isobaric modifications trimethylation and acetylation are shown as me3/ac and were not conclusively identified. Preliminary analysis of YH439 and cell suspension treated mAhR revealed a phosphorylation that was distinct from untreated material, and an acetylation (ac) that occurred in the presence of YH439, but not after suspension activation. and K36 neighbour phosphorylated residues T21, S32 and S35. This modification pattern, surrounding the bipartite NLS and the DNA binding basic region (Fig. 6.4), was not detected for YH439 and suspension treated mAhR, inferring a potential role in regulation of nuclear/cytoplasmic localization of AhR and/or DNA binding. Of the modified residues in this region, S11, K20, K23, K31, S32, S35 and K36 were found to be strictly conserved in an alignment of a number of AhR sequences (Fig. 6.4), indicating that modification at these sites may be similarly conserved and important for AhR function. However, the multiple functions of this region are likely to similarly constrict sequence variation for this site.

Coregulatory modification mechanisms may be informed by the identification of multiply modified peptides at this site. Diphosphorylations have been shown for residues 25-36 (IPAEGIKpSNPpSK (phospho-S32 and phospho-S35)). Combined phosphorylation and methylations were found on peptides encompassing residues 21-27 (pTVme3/acKPIPA (phospho-T21 and trimethyl/acetyl-K23)); 26-

36 (PAEGIme2KSNPpSme2K (dimethyl-K31, phospho-S35 and dimethyl-K36)). Finally, double methylation events were observed for peptides 26-37 (PAEGIme2KSNPSme2KR (dimethyl-K31 and dimethyl-K36)); and 17-31 (PVQme2KTVKPIPAEGIme2K (dimethyl-K20 and dimethyl-K31)). In summary, it appears possible that K20, K31 and K36 dimethylation may be coregulated; phosphorylation and dimethylation of neighbouring residues S35 and K36, respectively, are not antagonistic events; and phosphorylation of S35 and S32 are separable, and therefore may be regulated individually, in spite of their proximity.

Post-Translational Modification of the mAhR NES and Helix-Loop-Helix More C-terminal of the basic/NLS region is the HLH, where Helix 2 has been isolated as a regional ‘hot- spot’ for modification. The leucine-rich NES has been identified in the loop region encompassing residues 63-71 (leucines 63, 66, 69 and 71) (32). Residues proximal to the NES have been found to be heavily modified in this analysis (Fig. 6.4). Untreated mAhR undergoes serine and tyrosine phosphorylation at S72, S74 and Y75. These modifications were identified on a single peptide (LpSVpSpYLR, 71-77), indicating simultaneous regulation at this site. Lysine residues were modified by methylation, including trimethyl-K62, dimethyl-K65, and dimethyl-K87, the latter in agreement with previous studies (412). After treatment with YH439, some modifications were not identified, including trimethyl-K62 and dimethyl-K65, although these cannot be assumed to have been specifically removed as this analysis is still in progress. Notably, phospho-S68 was not identified in this analysis, although it has been detected by specific antibodies in the nuclear compartment of 3MC treated mouse liver and suspension treated kidney cells in a previous study (89). Phospho-S72, S74 and Y75 were maintained after YH439 treatment, as was dimethyl-K87. Ligand activation induced phosphorylation of S80 and acetylation of K79, not present in the untreated protein (Fig. 6.4). Following activation by cell 135 suspension, most modifications seen for untreated AhR in this region were not identified, with dimethyl- K87 and phospho-S80 the only PTMs identified for this active species of mAhR. Identified PTMs in this region inferred the potential for phospho-S80 as a modification occurring only following activation, and acetylation of K79 possibly being a ligand specific PTM. An analysis of sequence conservation within the loop/Helix 2 region showed, similar to the basic region, strong conservation of target residues, suggesting the possibility of conservation of modification mechanisms, although this assertion is tempered by the observed strong regional sequence identity (Fig. 6.6).

Post-Translational Modification of mAhR PAS A, PAS B and Transactivation Region PTMs at the sequence between PAS A and PAS B were identified for untreated mAhR, including phosphorylation of T198 and T305; and trimethylation or acetylation of K277 and K284 (Fig. 6.5). Following treatment with YH439, modifications within this region were not identified, although we cannot conclude that these modifications have been removed, as enrichment strategies may enhance resolution of this region. Cell suspension induced four novel modifications, including phospho-Y239, dimethyl-K245, phospho-T283 and phospho-T290; and maintenance of trimethyl/acetyl-K284 (Fig. 6.5). Sequence alignment of the PAS A/PAS B region showed some conservation of T189 and Y239 within PAS A (Fig. 6.6); and strong conservation of PAS B K277, T283, K284, T290 (Fig. 6.6) and T305 (Fig. 6.7). Again, strong sequence identity through this region may be interpreted to infer conserved PTM mechanisms, but may simply indicate structurally constrained sequence variation.

More C-terminal to the PAS B domain is a region that links the DNA binding and ligand binding domains to the transactivation domain, and features more sequence variability than the strongly conserved bHLH and PAS domains (Fig. 6.7). Analysis of untreated mAhR has identified 18 Phosphorylation events, and unusual modifications including sulfonation and carboxamidomethylation have also been found on neighbouring serine and methionine residues, respectively. Phosphorylated residues include S427, T430, S438, S439, S441, S444, S448, S452, S460, Y462, S467, S468, S480, S490, S496, S533, and S558. Sulfonation has been identified for residues S411 and S547, found to occur in tandem with carboxamidomethylation of M418 and M548, with phosphorylation of S415 occurring on some of these peptides (425). Methylation events were also identified in this region, including dimethyl and monomethyl-K432, monomethyl-K442, dimethyl-K483 and dimethyl-K527 (Fig. 6.5). Following YH439 treatment, the majority of these modifications have not been detected, although phosphorylation of T430 has been maintained after ligand treatment, and a new dimethylation at K426 has been identified. Cell suspension shows a similar decrease in identified PTMs, although reproducible acetylation events have been detected at S415 and S547. Additionally, phospho-S427 was maintained after cell suspension and suspension specific phosphorylation of T429 was identified. Dimethyl-K426 was identified following both suspension and activation with YH439, indicating that dimethyl-K426 may represent an active mark for 136 Post-translational Modification of the C-terminus of Mouse AhR

NLS NES NES HSP90 and Ligand Binding 13 39 63 71 212 224 287 421 Transactivation Region bHLH PAS A PAS B

8 7 4 5 1 5 8 7 0 2 8 9 1 2 4 8 2 0 2 7 8 0 3 0 6 7 3 7 8 8 3 6 9 7 8 0 1 1 1 2 3 3 3 3 4 4 4 4 5 6 6 6 6 8 8 9 9 2 3 4 4 5 0 9 1 2 2 3 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 5 5 5 5 5 6 7 t k k t s s m sskk s t s sss syssskk ss sssm s s Untreated

P P S P C P P P P P P P P P P P P P me P P me P S C P P me3 me3 me2 me 2 2 P /ac /ac me

6 0 6 2 3 1 4 4 7 k s s YH439

me2 P Pp

9 5 3 4 0 5 6 7 9 7 6 3 4 8 8 9 1 2 2 2 4 1 2 2 2 2 2 4 4 4 4 5 7 y k t k t s k s t s s Suspension ac P me2 P me3 P ac me2 P P Pp /ac Figure 6.5 Analysis of tryptic fragments of untreated AhR revealed large numbers of post-translational modifications of the PAS A/PAS B interface, linker to the transactivation region and C-terminus. Purification, tryptic digestion and mass spectrometric identification of HisMyc-mAhR PTMs was performed by Keyur Dave and Prof. Jeff Gorman (Protein Discovery Centre, QIMR). This analysis was performed repeatedly for untreated AhR using a range of mass spectrometric techniques and peptide enrichment strategies. PTMs have been illustrated on regions of AhR, as indicated by the dotted lines. A hot-spot for modification is found in the sequence between PAS B and the transactivation region, where phosphorylation is shown as a P; novel sulfonation, S; monomethylation, me; dimethylation, me2; and novel carbamidomethyl-methionine, C. Initial studies of YH439 and suspension treated AhR show some novel modifications, such as acetylation (ac); and pyrophosphorylation (Pp), that have not been identified in the untreated sample, however, these samples require similar replication and enrichment to confirm PTMs of active AhR. Alignment of PTM target residues in the bHLH region of mAhR Bipartite NLS

mouseAhR -MSSGANITYASRKRRKPVQKTVKPIPAEGIKSNPSKRHRDRLNTELDRLASLLPFPQDV 59 ratAhR -MSSGANITYASRKRRKPVQKTVKPVPAEGIKSNPSKRHRDRLNTELDRLASLLPFPQDV 59 whaleAhR MNSSSASITYASRKRRKPVQKTVKPVPAEGIKSNPSKRHRDRLNTELDRLASLLPFPQDV 60 sealAhR MNSSSAHITYASRKRRKPVQKAVKPIPAEGIKSNPSKRHRDRLNTELDRLASLLPFPQDV 60 humanAhR MNSSSANITYASRKRRKPVQKTVKPIPAEGIKSNPSKRHRDRLNTELDRLASLLPFPQDV 60 guineapigAhR MNNSGANITYASRKRRKPVQKTVKPIPAEGIKSNPSKRHRDRLNTELDRLASLLPFPQDV 60 rabbitAhR MNGGGANITYASRKRRKPVQKTVKPIPAEGIKSNPSKRHRDRLNTELDRLASLLPFPQDV 60 ternAhR ---MNPNVTYASRKRRKPVQKIVKPSPAEGVKSNPSKRHRDRLNAELDRLASLLPFPQDV 57 chickenAhR ---MNPNVTYASRKRRKPVQKIVKPSPAEGVKSNPSKRHRDRLNAELDRLASLLPFPQDV 57 .. :************* *** ****:*************:*************** bHLH

mouseAhR INKLDKLSVLRLSVSYLRAKSFFDVALKSTP---ADRNGGQDQCRAQ-IRDWQDLQEGEF 115 ratAhR INKLDKLSVLRLSVTYLRAKSFFDVALKSTP---ADRSRGQDQCRAQ-VRDWQDLQEGEF 115 whaleAhR VNKLDKLSVLRLSVSYLRAKSFFDVALKSTP---ADRNGVQDNCRTK-FREGLNLQEGEF 116 sealAhR INKLDKLSVLRLSVSYLRAKSFFDVSLQSSP---ADRNEVQENCRTK-FREDLHLQEGEF 116 humanAhR INKLDKLSVLRLSVSYLRAKSFFDVALKSSP---TERNGGQDNCRAANFREGLNLQEGEF 117 guineapigAhR INKLDKLSVLRLSVSYLRAKSFFDVALKSSP---TDRNGGQDDCGAP-FRSRLNLQEGEF 116 rabbitAhR INKLDKLSVLRLSVSYLRAKSFFDVALKSSS---ADRNGGQDPCRAK-FGEGLNLQEGEF 116 ternAhR IAKLDKLSVLRLSVSYLRAKSFFDVALKSSNSTRPERNGAQENCRTAKRGEGMQILEGEL 117 chickenAhR IAKLDKLSVLRLSVSYLRAKSFFDVALKSSNSTRLERNGIQE-SRTAKLGEGMQILEGEL 116 : ************:**********:*:*: :*. *: . : . .: ***: bHLH NES

MouseAhR LNP----DSAQGVDEAHGPPQAAVYYTPDQLPPENASFMERCFRCRLRCLLDNSSGFLAM 231 RatAhR LNPSQCTDSAQGVDETHGLPQPAVYYTPDQLPPENTAFMERCFRCRLRCLLDNSSGFLAM 235 WhaleAhR LNPSQCPDSGQKMDEANGLSQPAVYYNPDQVPPENSSSMERCFVCRLRCLLDNSSGFLAM 236 SealAhR LNPSQCTDSGQRIDAANGLPQAVGCHTPDQLPPENSSCMERSFVCRLRCLLDNSSGFLAM 236 HumanAhR LNPSQCTESGQGIEEATGLPQTVVCYNPDQIPPENSPLMERCFICRLRCLLDNSSGFLAM 237 GuineapigAhR FNPSQCSDSGQAIPEPHGLPQPGVSYTTERLPPENSPLMERCFVCRLRCLLDNSSGFLAM 236 RabbitAhR LNPSQCTDPGQGADETHGLPQP-VYYNPDQLPPENSSFMERCFICRLRCLLDNSSGFLAM 235 TernAhR LNPAQSADSAPSVQGDNGFSQPATYYNPDQLPPENSSFMERNFICRLRCLLDNSSGFLAM 237 ChickenAhR LNPTQSADSGPSVQGDNGFSQPANYYNPDQLPPENSSFMERNFICRLRCLLDNSSGFLAM 236 :** :.. * .*. :..:::****:. *** * **************** NES

MouseAhR NFQGRLKYLHGQNKKGKDGALLPPQLALFAIATPLQPPSILEIRTKNFIFRTKHKLDFTP 291 RatAhR NFQGRLKYLHGQNKKGKDGALLPPQLALFAIATPLQPPSILEIRTKNFIFRTKHKLDFTP 295 WhaleAhR NFQGRLKYLHGQNKKGKDGSILPPQLALFAIATPLQPPSILEIRTKNFIFRTKHKLDFTP 296 SealAhR NFQGRLKYLHGQNKKGKDGSVLPPQLALFAIATPLQPPSILEIRTKNFIFRTKHKLDFTP 296 HumanAhR NFQGKLKYLHGQKKKGKDGSILPPQLALFAIATPLQPPSILEIRTKNFIFRTKHKLDFTP 297 GuineapigAhR NFQGRLKYLHGQNKKGKDGSVLPPQLALFAIATPLQSPSILELRTKNFLFRTKHKLDFTP 296 RabbitAhR NFQGRLKFLHGQNKKGKDGSLLPPQLALFAIATPLQPPSILEIRTKNFIFRTKHKLDFTP 295 TernAhR NFQGRLKFLHGQNKKGKDGATLSPQLALFAVATPLQPPSILEIRTKNFIFRTKHKLDFTP 297 ChickenAhR NFQGRLKFLHGQNKKGKDGAALSPQLALFAVATPLQPPSILEIRTKNFIFRTKHKLDFTP 296 ****:**:****:******: *.*******:*****.*****:*****:*********** PAS A PAS B Phosphorylation Methylation Acetylation Figure 6.6 Post-Translational modification of the N-terminal region of mAhR is targeted to well conserved residues in a number of mammal and bird homologs. An alignment of the N-terminal region of mouse, rat, whale, seal, human, guineapig, rabbit, tern and chicken AhR illustrated strong conservation of phosphorylated (blue boxes), serine, threonine and tyrosine residues; dimethyated (yellow boxes) lysine residues; and acetylated/dimethylated lysine (pink box, yellow lines). These modifications were localized to the basic region and helix 2 of the bHLH region; and at the interface between PAS A and PAS B. Modifications within the bHLH region that have been identified in the literature are indicated with a circle below the sequence alignment (89, 108). Note that dashed lines indicate discontinuity in the amino acid sequences between alignment blocks. Above the alignment, cylinders indicate predicted -helices and rectangles -strands, from the structure based sequence alignment (174). The alignment was generated using ClustalW. Alignment of PTM target residues in the C-terminus of mAhR

MouseAhR 300 LILGYTEVELCTR… 400 FATGEAVLYEIS 411 Phosphorylation RatAhR 304 LILGYTEVELCNK… 404 FATGEAVLYEIS 415 Methylation WhaleAhR 305 IVLGYTEAELCMR… 405 FTTGEAVLYEVT 416 SealAhR 305 IVLGYTEAELCMR… 405 FTTGEAVLYEIT 416 Acetylation HumanAhR 306 IVLGYTEAELCTR… 406 FTTGEAVLYEAT 417 GuineapigAhR 305 IVLGYTEAELCAK… 405 FTTGEAVLYETS 416 Carbamidomethylation RabbitAhR 304 IVLGYTEAELCMR… 404 FTTGEAVLYEMT 415 TernAhR 306 IVLGYTEAELCMR… 406 FATGEAVLYEVS 417 Pyrophosphorylation ChickenAhR 305 IVLGYTEAELCMR… 405 FATGEAVLYELT 416 ::*****.*** : *:******** : Sulfonation PAS B MouseAhR SPFSPIMDPLPIRTKSNTSRKDWAPQSTPSKDSFHPSSLMSALIQQDESIYLCPPSSPA- 470 RatAhR SPFSPIMDPLPIRTKSNTSRKDWAPQSTPSKDSFHPNSLMSALIQQDESIYLCPPSSPA- 474 WhaleAhR NPFPPIMDPLPIRTKNGAGGKDSATKSTLSKDFLNPSSLLNAMMQQDESIYLYPASSST- 475 SealAhR SPFPPVMDPLPIRTKNGTSGIDSATKSALNKDSLNPSSLLAAIMQQDESIYLYPASGST- 475 HumanAhR NPFPAIMDPLPLRTKNGTSGKDSATTSTLSKDSLNPSSLLAAMMQQDESIYLYPASSTSS 477 GuineapigAhR KSFPPGMDSLPTQTKSATNGQGSATLSTE-KGSPHPTSLLASMMQQDESIYHCPASNTV- 474 RabbitAhR SPFPPIMDPLPIRPKSGTCGKDSATKPTPSKDSVHPSSLLSALMQQDESIYLYPPSSNA- 474 TernAhR FPMSSLMDPSQLKNKSTTGK---GTKATLHNDSVDPNSLLGAMLRQDESVYLCPPASHK- 473 ChickenAhR FPMSSLSDTSQPRSKTTTGK---GGKTTLHGDSVDPNSLLGAMLRQDESVYLCPPASHK- 472 .:.. *. : *. : . .: . .*.**: ::::****:* *.:.

MouseAhR --LLDSHFLMGSVSKCG-----SWQDSFAAAGSEAALKHEQIGHAQDVNLALSGGPSELF 523 RatAhR --PLDSHFLMDSMSECG-----SWQGSFAVASNEALLKHEEIRHTQDVNLTLSGGPSELF 527 WhaleAhR --PFERNFFSDSQNECS-----NWQNNVAPMGSDDILKHEQIGQSQEMNPTLSGDHAGLF 528 SealAhR --PFERNLFNEPVNECS-----NWQDNIAPMGSDGILKHEEIDHSQEMNPTLCGGPTGLF 528 HumanAhR TAPFENNFFNESMNECR-----NWQDNTAPMGNDTILKHEQIDQPQDVN-SFAGGHPGLF 531 GuineapigAhR --PFERNFFTESLNECS-----NWQDGIAPMRNDNIFKHKQIGQTQEMNPMLSGGHTVPF 526 RabbitAhR --PFERNFFTESLNECS-----NWPENVASVAGGSVLKHEQIGQSQEVSPAFSGDQTVLF 527 TernAhR -LSFERNFFADSRDELGGGVSSGWTDNFLPAGNHSVLKRELMERSQDSTVPLPEDSAALF 532 ChickenAhR -LSFERNFFADSRDELSGVVSSDWTDSLLPAGSHNILKQELMECSQDSSVPLPEDSAALF 531 :: ::: : .* . . :*:: : .*: . : . . *

MouseAhR PDNKNNDLYSIMRNLGIDFEDIRSMQ-NEEFFRTDSTAAGEVDFKDIDITDEILTYVQDS 582 RatAhR PDNKNNDLYSIMRNLGIDFEDIRSMQ-NEEFFRTDS--SGEVDFKDIDITDEILTYVQDS 584 WhaleAhR PDNRNSDLYSIMKHLGIDFEDIKHMQQNEEFFRTDF--SGEDDFRDIDLTDEILTYVEDS 586 SealAhR PDNRNSDLYGIMKQLGIDFEDIKHMQQNEEFFRTDF--SVEDDFRDIDLTDEILTYVQDS 586 HumanAhR QDSKNSDLYSIMKNLGIDFEDIRHMQ-NEKFFRNDF--SGEVDFRDIDLTDEILTYVQDS 588 GuineapigAhR SDNKNSELYSIMKNLGIDFEDIKCMQ-NEEFFRTDA--SSEVDFRDIDVRDEILTYVQDS 583 RabbitAhR PDNKNCDLYNIMKNLGVDFEDIKNMQ-NEEFFGADF--SGEVDFRDIDITDEILTYVQDS 584 TernAhR QDNKTSDLYSIMKNLGIDFEDLKCIQQDEEFFKTEL--SGVDDIGDIDITDEILTYVQDS 590 ChickenAhR QDNKTNDLYSIMKNLGIDFEDLKCIQQDEEFFKTEL--SGMDDIGDIDITDEILTYVQDS 589 *.:. :**.**::**:****:: :* :*:** : : *: ***: *******:** Transactivation Domain MouseAhR 595 QQP-VTQHLSCMLQERLQ… 700 QNFAPCNQP--LLPEHSKSVQLDFPG… 790 SSEIPGSQAGLSKV 803 RatAhR 598 QQP-VSQHLSCMLQERLQ… 705 QNFAPCNQS--LLPEHSKGTQLDFPG… 796 SPEIPGSQAFLSKF 809 WhaleAhR 599 PQQSMALNPSCMVQEHLQ… 697 QNFIPCSQS--VLPPHSKGTQLDFPI… 789 NPTVPGPQAFLNKF 802 SealAhR 599 QPQSLALSSSCAVQEPLQ… 694 QNFIPCNQP--VLPQHSNGTQLGFPI… 784 DP-AAAPQAFLNKF 796 HumanAhR 601 QQQSLALNSSCMVQEHLH… 698 QNFISCNQP--VLPQHSKCTELDYPM… 789 NPVLPGQQAFLNKF 802 GuineapigDR 596 QQQAVALNSSCMVQEQLE… 695 QNFAPCEQP--LLPQRSKGTQLDFAM… 786 NPAVPGPQVFLNKF 799 RabbitAhR 597 PQPATALNSSCMVQERLQ… 695 QNFTPCNQT--VAPQHSRCTQLDFAM… 787 EPPVPGPEAFLNKF 800 TernAhR 603 QQQPLVQNEGCLVQQELD… 702 QEFIPYKQHTGVVPQLSNFAQMDFPV… 796 DPMTSSQQTLLNKF 809 ChickenAhR 602 QQQPLVQNAGCLVQQELD… 701 QEFVPYKQPTAMMPQLSTFAQMDFPV… 795 DPMMANQQTLLNKF 808 .* :*: *. *:* . .* : * * .:: :. .. . :.:*.*. Figure 6.7 Post-Translational modification of the C-terminal region of mAhR is targeted to residues that show some conservation in a number of mammal and bird species. An alignment of the C-terminal region of a number of AhR homologs illustrated strong conservation of phosphorylation (light blue), carbamidomethylation (purple) and dimethylation (yellow) target residues; and minimal conservation of sulfonation (green), acetylation (pink) and pyrophosphorylation (dark blue) target sites. These modifications were localized to the region between PAS B and the transactivation domain, and within the transactivation region. Note that dashed lines indicate discontinuity in the amino acid sequences between alignment blocks. Above the alignment, cylinders represent predicted -helices and rectangles -strands. The alignment was generated using ClustalW. mAhR (Fig. 6.5). Analysis of sequence conservation in this region shows strong conservation of M418, K426, T429, T439, K442, S444, S460, Y462, S467, S480, S533 and M548. Less well conserved were S411, S415, S427, T430, S438, S441, S448, S452, S490, S496, S547 and S558 (Fig. 6.7).

Finally, the transactivation domain showed very few modifications relative to the sequence immediately C-terminal of PAS B, including, for untreated mAhR, phosphorylation of S603 and S796. Interestingly, neither modification was detected for YH439 or cell suspension treated mAhR, whereas, the modification of S716 by pyrophosphorylation appeared to be correlated with the active form of mAhR (Fig. 6.5). Sequence conservation suggested that the well conserved S603 may represent a conserved PTM target, however S716 and S796 were peculiar to mouse, and mouse and rat, respectively (Fig. 6.7).

Mutagenesis and Preliminary Functional Analysis of PTM mAhR Mutants In order to investigate the biological outcomes of mAhR modifications, a number of residues were targeted for mutagenesis, where alanine was incorporated to eliminate modification at specific sites. aspartic acid was also incorporated to replace residues targeted for phosphorylation or pyrophosphorylation in an effort to mimic potential negative charge. Mutants created initially included K87A, S716A and S716D mAhR. The mutant cDNAs encoding HisMyc-mAhR were integrated into HEK293Ts in the pEF/IRES puro vector (140), resulting in constitutive expression. Expression levels were assessed by immunoblot with antibodies directed against the Myc tag of the mAhR constructs, with relative levels compared with loading control antibodies to -tubulin (Fig. 6.8 A). Reporter gene assay experiments examined activity of mutant mAhRs relative to wt receptor using XRE-Luc (137), described in Chapter 5. The ability of the mutant mAhR proteins to activate this reporter gene was assessed relative to wt mAhR following YH439 treatment (Fig. 6.8 B). This experiment showed similar reporter gene activation for wt, K87A, S716A and S716D. Similarly, reporter analysis of mutant and wt mAhR was performed following suspension culture (Fig. 6.8 C), where wt, K87A, S716A and S716D mAhR showed similar activation of XRE-Luc. Two more mutants T290A and K284R were generated and transiently transfected into HEK293T cells prior to reporter analysis. K284 was mutated to arginine to mimic the positively charged residue, while preventing trimethylation (393). Preliminary reporter experiments indicated that incorporation of these mutations had no effect on gene activation by mAhR following YH439 and suspension culture (data not shown). These experiments illustrated that the biological effect of mAhR modification may not be sufficiently potent to be detected by reporter analysis.

Other residues targeted for mutagenesis included S80 and S547, mutated to alanine and aspartic acid (S80A, S80D, S547A and S547D). At this stage, analysis of mutant mAhR function by qPCR was prioritised in an effort to generate quantitative Cyp1a1 target gene activation in a chromatin context. For 137 mAhR PTM Mutants K87A, S716A and S716D Do Not Show Altered Reporter Gene Activation

(A) $ ' $   UR    X W S Z . 6 6

DQWL0\F

DQWL 7XEXOLQ

(B) ;5(;5( 7./XF 0.7 9HKLFOH 7./XF 0.6 <+

0.5

0.4

0.3

0.2

0.1

Relative Luciferase Activity 0 ([SUHVVLRQ&RQVWUXFW ZW EODQN ZW .$ 6$ 6' 5HSRUWHU&RQVWUXFW &RQWURO;5(/XF;5(/XF;5(/XF;5(/XF;5(/XF

0.7 ;5(;5( (C) 7./XF 0.6 $GKHUHQW 7./XF 0.5 6XVSHQVLRQ

0.4

0.3

0.2

0.1 Relative Luciferase Activity 0 ([SUHVVLRQ&RQVWUXFW ZW EODQN ZW .$ 6$ 6' 5HSRUWHU&RQVWUXFW &RQWURO;5(/XF;5(/XF;5(/XF;5(/XF;5(/XF

Figure 6.8 mAhR PTM mutants K87A, S716A and S716D have similar activity to wt mAhR following YH439 and suspension culture. HEK293T stable cell lines expressing K87A, S716A and S716D mAhR were established. (A) Expression of Myc tagged mAhR and mutant mAhR were compared by Western analysis. 50g of whole cell extract was separated by 10% SDS-PAGE prior to immunoblotting with 9E10 anti-Myc mAb and loading control MCA78G anti- tubulin. Analysis of XRE-Luc reporter following YH439 (10M) treatment (B) and cell suspension (C) was performed. Stable cell lines were cotransfected with XRE-Luc or Control-Luc vector as indicated and RL-TK for 24 h and treated with DMSO, YH439, left untreated (adherent), or suspension cultured for 16 h. Relative luciferase activity was determined as described in Materials and Methods. Data in (B) and (C) represent a typical experiment performed in triplicate three times. Data show the mean ± S.D. from a single experiment, performed in triplicate. These experiments were repeated three times with similar results. this purpose, AhR knockout mouse hepatocytes (338) were targeted for generation of stable mutant mAhR expressing cell lines. Endogenous wt mAhR background effects in HEK293Ts were expected to be significant for qPCR analysis, hence the requirement for AhR-/- cells. Numerous attempts to transfect these cells with mAhR-EF-IRES-puro expression vectors and perform puromycin selection failed to generate any stable cell lines. A lentiviral entry strategy was also attempted for wt mAhR and PTM mutants, which were cloned into pSLIK-puro vector (426). However, this strategy was eventually abandoned due to problems associated with virus production using this vector system.

6.3 Discussion

6.3.1 Hypothetical Structural Localization of PTM target sites

As described in section 6.2.1, post-translational modifications are predicted to be targeted to regions of intrinsic disorder, specifically at surface loops, and temperature sensitive loop/coil regions (‘hot-loops’). An alignment of modifications with predicted regions of intrinsic disorder showed that most modified residues localized to predicted loop/coil regions, hot-loops, PONDR predicted sites, or combinations of these. However, some residues, including K65, S72, S74, S75, K79, S80, S452, S533 and S603 did not occur in regions of predicted disorder. Alignment of the bHLH region of mAhR with Max (Fig. 3.1) allowed identification of structurally equivalent residues, where mAhR K20, T21, K23 and K36 cluster within the flexible basic region, while intervening PTM sites, K31, S32 and S35 do not have structurally equivalent residues in Max. Further, T44, K62, K65, S72, S74 and S75 have structurally equivalent residues in Max that fall within the loop/Helix 2 region, in agreement with the observation that Helix 2 residues of mAhR were excluded from the disorder prediction. The PTM target site equivalent Max residues were identified within the SAXS generated model (Chapter 4) and highlighted using spheres (blue, phosphorylation; yellow, methylation) (Fig. 6.9 A).

Within PAS A of Drosophila Per, only one structurally equivalent PTM target site was found to be represented, namely phospho-Y239, found to extend on a surface loop predicted to be directed towards the DNA in the SAXS model (Fig. 6.9 A). Structurally divergent regions between mAhR and Period PAS A include surface loops at insert 1 and insert 3 (Fig. 4.6) which are predicted to undergo modification to form phospho-T198 and dimethyl-K245, respectively, hence these sites were not represented in the SAXS model, and are in agreement with predicted disorder in these surface loops.

The Phyre prediction of the mAhR PAS B fold (Fig. 5.4 C) has finally been used to illustrate the location of modification sites phospho-T283, trimethyl/acetyl-K284, phospho-T290, and phospho-T305 (Fig. 6.9 B). Predicted to localise within loops/coils, and in the case of T290 predicted as a loop with a high

138 Hypothetical Structural Location of mAhR PTMs

(A) Phosphorylation mAhR Methylation bHLH.PAS A SAXS model

(B) R ota ted Phospho- A375 T283 Phospho- T305 H285

Acetyl- Phospho- K284 T290 mAhR PAS B Phyre model

Figure 6.9 PTM target residues within the N-terminus of mAhR, including bHLH, PAS A and PAS B were localized to predicted structurally flexible regions. Within the N-terminus, a hot-spot for modifications occurs between the bipartite basic region. The SAXS based model (Fig. 4.10) incorporated the structure of the bHLH of Max, and a predicted PAS A dimer of the Period PAS A structure (174). (A) PTM target residues within the bHLH region have been illustrated on structurally equivalent residues of Max. The PAS A structure of Period has only one equivalent site to those targeted for PTMs in mAhR, including phospho-Y239. A Phyre generated model of the mAhR PAS B (Fig. 5.4) has been utilised to illustrate the location of PTM sites phospho-T238, ac- K284, phospho-T290 and phospho-T305, shown as spheres and coloured yellow for acetylation and blue for phosphorylation (B). Residues H285 and A375 are shown as sticks (green) to illustrate their potential proximity to the heavily modified PAS core region. However, this figure is not representative of PTMs within loop insertions in the bHLH and PAS A region, nor the structurally uncharacterised region from residue 411 to the C-terminus of AhR. temperature factor, modifications appear to cluster around the first two -strands and in the -helical PAS core. Structural representations of identified PTMs may prove useful for helping to determine the utility of the modifications. However, given that there are no structures representative of the C-terminal half of mAhR, these regions could not be illustrated in this manner.

6.3.2 Final Discussion

Post-translational modification of a diverse range of proteins is essential to regulation of function at a number of levels, including conformational change, subcellular localization, protein stability, and protein:protein interactions. Transcription factor p53 represents a complex signalling system regulated by a range of PTMs where ubiquitination targets this factor for degradation, while acetylation is theorised to stabilise p53 by antagonising ubiqutination at common target lysine residues. Additionally, a patch of acetylations has been shown to result in conformational change to enable DNA binding, and forming protein:protein interaction motifs for acetyl-lysine binding bromodomain proteins such as coactivator complex TFIID. Site-specific effects of lysine methylation of p53 either maintain the inactive p53 (monomethyl-K370), or upregulate activity (monomethyl-K372) by forming an interaction motif for chromodomain coactivator acetyltransferase Tip60. Arginine methylation by PRMT5 (R333, R335, R337) has been shown to target p53 binding to specific target genes to incur cell cycle stalling, and disfavour apoptotic outcomes. It is clear from the studies of p53 that modifications occur at multiple sites, potentially in an antagonistic fashion, to regulate transcription factor function. Indications that phosphorylation of AhR may be important for subcellular localization, DNA binding and protein stability prompted a thorough analysis of post-translational modifications of the nascent receptor. An additional aim of this project was to initiate an investigation into modification following activation by ligand, and by cell suspension, with a view to identifying potential activation specific AhR modifications that may be involved in a binary switch from an ‘off’ to an ‘on’ state. As has been seen with p53, the mosaic of effects of multiple modifications may in fact form a complex concert of regulatory outcomes, potentiating the switch from inactive to active states, through to directly specifying target promoter binding events.

While AhR is known to be degraded in a ubiquitin dependent manner following ligand activation, and modified by phosphorylation within the NLS and NES, little else is known of modification dependent regulation of this transcription factor. Comprehensive analysis of full length mAhR has resulted in identification of serine, threonine and tyrosine phosphorylation events. Phosphorylation by PKC within the NLS of hAhR at S11 and S35 has been identified in previous studies, although a role for these modifications was not clear. Using anti-phospho-S36 (mAhR-S35) immunohistochemistry, AhR was detected in the nucleus, but mutation to phosphorylation mimic aspartic acid caused predominantly cytoplasmic localization. In this investigation, identification of phospho-S11 and phospho-S35 in 139 untreated AhR may indicate a role for this modification pair in nuclear exclusion. The complex methylation of proximal lysine residues may modulate phosphorylation of this region, in a similar manner to FOXO methylation inhibition of Akt mediated phosphorylation (Fig. 6.10 A). However, the identification of a peptide containing both phospho-S35 and dimethyl-K36 does not support this effect. Group targeted mutagenesis of lysine to arginine within this region would prevent methylation by lysine specific methyltransferase enzymes, while maintaining PKC target sequences, and may resolve regulatory effects of these PTMs on subcellular localization and DNA binding.

Methylation of the nuclear export sequence lysine residues K62 and K65 may additionally regulate subcellular localization as these residues abut leucine residues critical for CRM-1 dependent nuclear export. A recent investigation into subcellular localization of corepressor/coactivator protein Receptor interacting protein 140 (RIP140) indicated increased cytoplasmic localization catalysed by phosphorylation and association with nuclear export factor exportin (CRM-1), but this event was also shown to be dependent on arginine methylation (427). This study highlighted the requirement for phosphorylation prior to arginine methylation. This suggests that phosphorylation of neighbouring S72, S74 and Y75 may potentiate methylation of this region of AhR. In this instance, di and trimethylation of lysine residues may reproduce the effect of monomethyl-arginine in RIP140 (Fig. 6.10 B). Hence, if this model held for methyl-lysine, it is possible that trimethyl-K62 and dimethyl-K65 of untreated mAhR may enhance CRM-1 dependent nuclear export of the inactive transcription factor to maintain cytoplasmic localization. Mutation of these lysine residues to arginine or alanine may abrogate this enhancement of nuclear export, and result in increased nuclear localization, while mutation to a hydrophobic residue such as phenylalanine may mimic methylation of lysine, as previously shown for the methyl-arginine mimic (428), would result in maintenance of a more cytoplasmic localisation of AhR.

Preliminary analysis of the modification of mAhR following cell suspension isolated phosphorylation and dimethylation of Y239 and K245, respectively. The hypothetical domain arrangement generated by SAXS and AFM analysis (Chapter 4) indicated that these target residues may be directed towards the DNA, potentially making DNA contacts. Hence, modification at this site could play a role in specifying target XRE binding, resulting in activation state specific target gene profiles. Additionally, the finding that tyrosine phosphorylation is important for DNA binding by AhR (407) may implicate phospho-Y239, and/or phospho-Y75 in DNA binding activity. However, this does not exclude the possible involvement of the remaining phosphotyrosine 462 in DNA binding. Mutation of K245 to glutamate in Chapter 5 showed minimal effect on suspension activation of mAhR by reporter analysis, however, as discussed, this technique may have limited utility in determining subtle functional effects of mutagenesis.

140 Functional Models for AhR NLS and NES Post- translational Modification

(A) Akt PKC Set? PRMT1

P P Me Me Me P Me S S S K S K R RRR K S K S FOXO1 FOXO1 AhR AhR

Nuclear Nuclear Nuclear Nuclear Exclusion Accumulation Exclusion Accumulation

(B) 14-3-3 P P P P CRM1 Nuclear Export RIP140 PRMT1 RIP140 R me

? P P CRM1 P P Nuclear K me Export AhR Set? AhR 2

Figure 6.10 Hypothetical models for regulation of AhR subcellular localization by modification of the NLS and NES. (A) Akt phosphorylation of S253 of transcription factor FOXO1 has been shown to be antagonised by methylation of R248 and R250 by Protein arginine methyltransferase 1 (PRMT1), resulting in nuclear accumulation (figure adapted from (368)). Modification of AhR residues within the NLS are proposed to similarly regulate AhR nuclear accumulation whereby phosphorylation of bipartite NLS proximal residues S11, S32 and S35 by Protein Kinase C (PKC) may prevent nuclear translocation, while methylation of K31 and K36 by a member of the Set lysine methyltransferase enzyme family may antagonise phosphorylation at S32 and S35 to increase nuclear accumulation. (B) Phosphorylation of S102 and S1003 of RIP140 recruits chaperone 14-3-3, forming a scaffold for PRMT1. PRMT1 methylates RIP140, causing association with nuclear export factor CRM1 and resulting in nuclear export of RIP140 (427). The methylation of NES residues K62 and K65 of mAhR may regulate subcellular localization in a similar manner to RIP140 regulation. Phosphorylation of mAhR residues S72, S74 and Y75 may potentiate the methylation events to enhance CRM1 dependent nuclear export of mAhR to decrease transcriptional activity. It is interesting to note that trimethylation or acetylation of K284 in the PAS B domain of untreated and cell suspension treated mAhR neighbours the ligand activation critical H285 (Chapter 5). The proximity of this modification to the theorised entrance to the ligand binding pocket may indicate a role in altering the structure of this region to regulate ligand specificity. The sequence surrounding this target lysine (282 RTK 284) suggests the possibility that, like p53 and TAF10, SET7/9 may monomethylate this residue (consensus target sequence K/R-S/T-K) (362), with additional, unknown methyltransferase(s) performing processive methylation to generate the trimethyl species, as observed for TAF10. This mechanism may also describe methylation of neighbouring target K277, and dimethyl-K426 (active AhR) having the same site sequence 275 RTK 277 and 424 RTK 426. Hence, siRNA depletion of Set7/9, followed by assessment of AhR methylation status and function by reporter gene, target gene and microarray analyses may indicate a role for these methylation events in regulating AhR activity. It is additionally possible that K277 and K284 sites may be modified by an acetyltransferase to form acetyl- lysine, given that trimethylation and acetylation are isobaric modifications. Interestingly, AhR coactivators CBP/p300 and Brahma related gene 1 (Brg-1) contain bromodomains, which may utilise acetyl-lysine as an interaction site to assemble the active transcription factor complex and mediate chromatin remodelling (429).

Sulfonation between PAS B and the Transactivation Region The complex modification of the sequence between PAS B and the transactivation region (residues 411-558) of untreated AhR could indicate an important regulatory role for this site in AhR function, although at this stage, specific functional outcomes can only be hypothesised. The occurrence of serine O-sulfonation in this region may regulate protein:protein interactions, as described for sulfo-tyrosine of PGSL-1. Tyrosine sulfonation has been identified as an event catalysed by the enzyme tyrosylprotein sulfotransferase (TPST), localised to the trans-Golgi network (TGN), resulting in modification of proteins trafficked through the TGN prior to secretion or cell-surface presentation (430). Investigations into a physiological function for the modification have provided evidence of a role in protein:protein interactions between secreted proteins, secreted proteins and cell-surface receptors, and between cell- surface receptors (reviewed in 431). During wound healing, cell signalling results in localization of P- selectin to the endothelial cell surface. This surface display is crucial for leukocyte rolling and extravasation to initiate wound induced inflammation. P-selectin is able to interact with the leukocyte membrane by binding fucose modified P-selectin glycoprotein ligand 1 (PSGL-1), however, prevention of sulfonation in a neutrophil cell line attenuated P-selectin binding. This observation indicated the potential for a critical sulfation event for this interaction, resulting in the identification of three target tyrosine residues in the apical segment of PSGL-1 which mediate the interaction with P-selectin (432). Sulfonation of serine and threonine residues has also been observed in a number of proteins including a neuronal intermediate filament protein, and myosin light-chain from the snail; a cathepsin C like protein 141 from the malaria parasite; and human orphan receptor tyrosine kinase, ROR2 (433), although a specific functional outcome from these modifications is as yet unknown.

Less is known of modifications carboxamidomethyl-methionine and acetyl-serine, hence the affect of these events on AhR function awaits mutagenesis studies. Interestingly, acetyl-S547, modified following cell suspension, is located in a region (515-583) shown to be critical for interaction with coactivator protein Myb binding protein 2a (Mybbp2a) (434). PTMs within this region may regulate interactions with this coactivator. A candidate role for the extensive serine phosphorylation observed for this region may be extrapolated from observations of another Hsp90 client protein, LIM (Lin-11, Isl-1 and Mec-3) kinase 1 (LIMK1). LIMK1 is a serine/threonine kinase that regulates actin cytoskeletal remodelling. This kinase is bound by Hsp90, which has been shown to increase its kinase activity. Five phosphorylation sites have been identified in LIMK1, and recent evidence shows that Hsp90 favours LIMK1 homodimer formation and autophosphorylation, ultimately resulting in increased half-life to 12h, as opposed to a kinase dead mutant, which is degraded in 3 hours (435). Extrapolating this paradigm to AhR, it seems plausible that, for untreated AhR, where the most extensive phosphorylation neighbours the Hsp90 interaction region, Hsp90 may potentiate kinase interactions to hyperphosphorylation AhR and increase its stability in the nascent state. Group mutagenesis of well conserved phosphorylation target sites in the region 411-558, followed by monitoring of AhR levels by Western analysis may elucidate a role for this region in AhR protein stability. Finally, phosphorylations identified in the transactivation region may be involved in enhancing transcription of AhR target genes, as shown for key cell growth control transcription factor c-Fos (436).

Pyrophosphorylation in the C-terminal Activation Region Pyrophosphorylation of S716 of mAhR may confer enhanced duration and strength of transcriptional activity. Non-enzymatic pyrophosphate phosphorylation of serine residues is a modification which has been identified recently, with the discovery that pyrophosphate diphosphoinositol pentakisphosphate

(IP7) was able to phosphorylate mammalian and yeast proteins in a physiological setting (437). The fact that bacterial preparations of yeast pyrophosphate phosphorylation targets cannot be modified prompted further analysis into the mechanism and nature of this PTM. Yeast extracts with the addition of ATP rescued IP7 phosphorylation of bacterially purified proteins, while boiled extracts could not induce this activity, indicating that an enzymatic event facilitated the chemical modification by IP7. Enzymatic phosphorylation of serine residues was finally established to be the critical event that primes the target residue for further phosphorylation by IP7, generating pyrophosphoserine. Intriguingly, an analysis of phosphoserine versus pyrophosphoserine sensitivity to a range of phosphatase enzymes, including -phosphatase, calcineurin, protein phosphatase-1 and alkaline phosphase, revealed that pyrophosphoserine cannot be hydrolysed. Pyrophosphatase enymes were also analysed and found to 142 be inactive for hydrolysis of pyrophosphate from serine (438). A specific role for pyrophosphorylation that is distinct from characterised roles for phosphorylation is currently unknown, although this may represent a more stable modification functioning in signalling memory over longer timescales than typically required of other PTMs. Hence, in the case of active AhR, this modification may provide a stable signal to mediate activation or ubiquitin ligase functions.

Clearly, AhR undergoes complex PTMs in untreated cells, and a full analysis of ligand activated and cell suspension activated AhR may indicate binary switch mechanisms for particular target residues. At this stage, comprehensive analysis of AhR indicates the potential for regional PTM regulation of protein stability, protein:protein interactions, nucleo/cytoplasmic shuttling, ligand binding, DNA binding specificity and transcriptional activity. The possibility for ‘noise’ in the eukaryotic proteome has been hypothesised in the past, generating silent modifications with limited impact on function (reviewed in 348), hence the large number of modifications of AhR may in part represent lowly abundant modification events. Additionally, this analysis required production of large amounts of full length mAhR, and hence, overexpression was performed to maximise yield. Hence, the potential for some PTMs to occur as an artefact of overexpression cannot be overlooked. Future experiments using stable isotope labelling of amino acids in cell culture (SILAC) will allow accurate quantitation of the penetrance of particular modifications under different treatment conditions. For SILAC, two populations of cells are maintained, one in ‘light’ and one in ‘heavy’ lysine/arginine isotopically labelled media. Following maintenance of an untreated and treated population of cells, the cells are mixed for lysis and protein purification. Mass spectrometric analysis allows simultaneous detection of the light and heavy peptides, and quantitation of relative peptide concentration based on peptide signal intensity (439).

This experimental system may allow differentiation between high and low incident modifications, generating a subset of target residues for site-directed mutagenesis and subsequent functional analysis. However, the presence of lower percentage modifications does not preclude a potential role in regulating AhR. Analysis of sequence conservation of targeted sites, in combination with an understanding of the percentage of modified AhR may be a good method to narrow the candidate sites for targeted mutagenesis. Additionally, the occurrence of highly penetrant modifications may not simply result in predicted on/off, or ligand specificity effects; rather, modification of AhR may represent specific temporal functions in the activation, translocation, dimerization, DNA binding, transactivation and degradation steps of the AhR activity cycle which are averaged by purification of the total pool of AhR. Hence, AhR mutagenesis may result in subtle changes to the intensity of activation; altered transactivation potential after ligand or endogenous activation signals; or changed target gene specificity that will only be revealed by systematic analysis of each event. Future mutagenesis of AhR PTM sites may benefit from modification specific group analysis of regional modification targets, such as 143 those in the NLS, NES, PAS B and the PAS B/Transactivation Domain intervening sequence to elucidate the impact of the vast modification pattern identified in this study for untreated mAhR.

144

Final Discussion Chapter 7: Final Discussion

To date, the bHLH.PAS family of transcription factors remains structurally uncharacterised. Attempts to obtain a crystal structure have yielded a number of crystal forms, but failed to achieve diffraction quality. Further crystallization screens with a number of different length dsDNA fragments have subsequently been undertaken in our laboratory, resulting in identification of other crystallization conditions, with increased crystal size and order, indicating that it may yet be possible to define the molecular structure of a bHLH.PAS dimer bound to DNA. Subsequent characterisation of post-translational modifications within the bHLH.PAS A region suggests that modification of residues in helix 2 may impact on DNA binding, and hence, in vitro modification may aid in stabilizing the complex for crystallization.

In the absence of a crystal structure, AFM and SAXS were pursued, resulting in visualization of the domain topology of DNA bound bHLH.PAS A heterodimer AhR/ARNT. In accordance with DNA bending models proposed by Chapman-Smith (9), PAS A appears to lie in close proximity to the DNA downstream of the XRE. According to the hypothetical SAXS model, there is the potential for PAS A/DNA backbone contacts by the proposed DNA proximal loop structures which feature conserved basic residues. This discovery infers a potential mechanism for bHLH.PAS protein DNA interactions involving bipartite association. Hence, this study adds to the fundamental tenets of bHLH.PAS target gene recognition, and clarifies a potential mechanism for differential target gene selection by factors such as HIF and SIM that bind identical core recognition sequences. Additionally, observed DNA bending, formation of oligomeric protein species and the occurrence of multiple XREs within enhancers/promoters of AhR target genes implicates AhR/ARNT in formation of looped DNA structures. Looping of enhancer elements would incur chromatin remodelling and recruitment of transcriptional machinery to proximal promoter regions of target genes. This mechanism has been implied for AhR activation of target gene Cyp1a1 in the past, however this is the first structural evidence in support of a DNA bending model.

Activation of AhR by classical and atypical exogenous ligands has been investigated by targeted mutagenesis of the LBD. YH439 was found to induce AhR for reporter activation irrespective of mutation of histidine 285 to tyrosine. In a parallel series of experiments, it was evident that H285Y could not be activated by PAH ligands, cell suspension, or shear stressed serum. This observation indicated that these non-exogenous ligand activation mechanisms share a common requirement for access to the ligand binding pocket, similar to that for PAH ligand binding. Atypical ligand YH439 also partially activated the LBD blocking mutant AhR A375I, indicating a novel binding mode. This example shows the flexibility, and inherent complexity of ligand binding by AhR PAS B and highlights the potential for subtly different ligand structures to induce precise downstream signalling events. Given the number of 145 functions attributed to AhR, including xenobiotic metabolism, vascular remodelling, regulation of cell growth, follicle maturation, regulation of T-cell differentiation and cross-talk with ER both as a coactivator and ubiquitin ligase, we anticipate a range of ligand types, PTMs, or both to determine outcomes of AhR activity.

PTM analysis of untreated mAhR has shown a large number of modifications that may act to maintain AhR stability, subcellular localization and prime the LBD for ligand activation. Modification in the presence of YH439 may indicate a PTM pattern induced due to ligand binding in a novel region of the LBD. If, as indicated by the LBD mutagenesis investigation, endogenous activation by cell suspension requires a ligand binding event, PTMs identified for this treated material may indicate regulatory events downstream of a particular endogenous ligand structure. Modification following ligand binding may amplify subtle structural changes in order to alter signalling potential. More specifically, the appearance of suspension specific modifications at T290 and T283 within the PAS B PAS core may indicate exposure of target residues to a kinase, following an endogenous ligand binding event. Alternatively, cell suspension may induce a signalling cascade resulting in modification of AhR to induce DNA binding activity. PTMs may modulate protein:protein interactions important for formation of a DNA binding species, coactivator recruitment, or divert AhR to ubiquitin ligase adaptor functions. Additionally, PTMs targeted to the bHLH and PAS A regions following AhR activation may directly influence target sequence specificity.

From this body of work, it is evident that future research into mechanisms of AhR regulation may benefit from an understanding of ligand specific PTM outcomes for AhR. This may allow mutagenesis experiments to be undertaken to eliminate particular ligand activation events, while maintaining others. The knowledge that bHLH.PAS protein DNA binding likely involves complex interplay between XRE position and chromatin remodelling reinforces the importance of examining AhR and AhR mutant functions in a chromatin context. Additionally, the use of Cyp1a1 as a readout of AhR function is a limited frame of reference for analysis of AhR mutants. The breadth of AhR target genes and cross-talk effects of AhR activation indicates that a targeted analysis of function, by designing bespoke microarray platforms (Affymetrix1), including ER and AR target gene sequences, may give a more thorough indication of effects of both mutagenesis and activation state. By careful application of this technology, functional outcomes of cell context dependent ligand or endogenous activation of AhR may be fully elucidated.

146

References Chapter 8: References

1. Crews, S. T., and Fan, C. M. (1999) Remembrance of things PAS: regulation of development by bHLH-PAS proteins, Curr Opin Genet Dev 9, 580-587.

2. Kewley, R. J., Whitelaw, M. L., and Chapman-Smith, A. (2004) The mammalian basic helix-loop-helix/PAS family of transcriptional regulators, Int J Biochem Cell Biol 36, 189-204.

3. Pongratz, I., Antonsson, C., Whitelaw, M. L., and Poellinger, L. (1998) Role of the PAS domain in regulation of dimerization and DNA binding specificity of the dioxin receptor, Mol Cell Biol 18, 4079-4088.

4. Borgstahl, G. E., Williams, D. R., and Getzoff, E. D. (1995) 1.4 A structure of photoactive yellow protein, a cytosolic photoreceptor: unusual fold, active site, and chromophore, Biochemistry 34, 6278-6287.

5. Morais Cabral, J. H., Lee, A., Cohen, S. L., Chait, B. T., Li, M., and Mackinnon, R. (1998) Crystal structure and functional analysis of the HERG potassium channel N terminus: a eukaryotic PAS domain, Cell 95, 649-655.

6. Gong, W., Hao, B., Mansy, S. S., Gonzalez, G., Gilles-Gonzalez, M. A., and Chan, M. K. (1998) Structure of a biological oxygen sensor: a new mechanism for heme-driven signal transduction, Proc Natl Acad Sci U S A 95, 15177-15182.

7. Crosson, S., and Moffat, K. (2001) Structure of a flavin-binding plant photoreceptor domain: insights into light- mediated signal transduction, Proc Natl Acad Sci U S A 98, 2995-3000.

8. Vreede, J., Van Der Horst, M. A., Hellingwerf, K. J., Crielaard, W., and Van Aalten, D. M. (2003) PAS domains - common structure, common flexibility?, J Biol Chem 14, 14.

9. Chapman-Smith, A., and Whitelaw, M. L. (2006) Novel DNA binding by a basic helix-loop-helix protein. The role of the dioxin receptor PAS domain, J Biol Chem 281, 12535-12545.

10. Zelzer, E., and Shilo, B. Z. (2000) Interaction between the bHLH-PAS protein Trachealess and the POU-domain protein Drifter, specifies tracheal cell fates, Mech Dev 91, 163-173.

11. Jain, S., Maltepe, E., Lu, M. M., Simon, C., and Bradfield, C. A. (1998) Expression of ARNT, ARNT2, HIF1 alpha, HIF2 alpha and Ah receptor mRNAs in the developing mouse, Mech Dev 73, 117-123.

12. Hirose, K., Morita, M., Ema, M., Mimura, J., Hamada, H., Fujii, H., Saijo, Y., Gotoh, O., Sogawa, K., and Fujii- Kuriyama, Y. (1996) cDNA cloning and tissue-specific expression of a novel basic helix-loop-helix/PAS factor (Arnt2) with close sequence similarity to the aryl hydrocarbon receptor nuclear translocator (Arnt), Mol Cell Biol 16, 1706- 1713.

13. Ikeda, M., and Nomura, M. (1997) cDNA cloning and tissue-specific expression of a novel basic helix-loop-helix/PAS protein (BMAL1) and identification of alternatively spliced variants with alternative translation initiation site usage, Biochem Biophys Res Commun 233, 258-264.

14. Okano, T., Sasaki, M., and Fukada, Y. (2001) Cloning of mouse BMAL2 and its daily expression profile in the suprachiasmatic nucleus: a remarkable acceleration of Bmal2 sequence divergence after Bmal gene duplication, Neurosci Lett 300, 111-114.

15. Kotch, L. E., Iyer, N. V., Laughner, E., and Semenza, G. L. (1999) Defective vascularization of HIF-1alpha-null embryos is not associated with VEGF deficiency but with mesenchymal cell death, Dev Biol 209, 254-267.

16. Weidemann, A., and Johnson, R. S. (2008) Biology of HIF-1alpha, Cell death and differentiation 15, 621-627.

17. Caniggia, I., Winter, J., Lye, S. J., and Post, M. (2000) Oxygen and placental development during the first trimester: implications for the pathophysiology of pre-eclampsia, Placenta 21 Suppl A, S25-30.

18. Lando, D., Gorman, J. J., Whitelaw, M. L., and Peet, D. J. (2003) Oxygen-dependent regulation of hypoxia-inducible factors by prolyl and asparaginyl hydroxylation, Eur J Biochem 270, 781-790. 147 19. Semenza, G. L., and Wang, G. L. (1992) A nuclear factor induced by hypoxia via de novo protein synthesis binds to the human erythropoietin gene enhancer at a site required for transcriptional activation, Mol Cell Biol 12, 5447-5454.

20. Moffett, P., Reece, M., and Pelletier, J. (1997) The murine Sim-2 gene product inhibits transcription by active repression and functional interference, Mol Cell Biol 17, 4933-4947.

21. Ema, M., Morita, M., Ikawa, S., Tanaka, M., Matsuda, Y., Gotoh, O., Saijoh, Y., Fujii, H., Hamada, H., Kikuchi, Y., and Fujii-Kuriyama, Y. (1996) Two new members of the murine Sim gene family are transcriptional repressors and show different expression patterns during mouse embryogenesis, Mol Cell Biol 16, 5865-5875.

22. Michaud, J. L., Rosenquist, T., May, N. R., and Fan, C. M. (1998) Development of neuroendocrine lineages requires the bHLH-PAS transcription factor SIM1, Genes Dev 12, 3264-3275.

23. Goshu, E., Jin, H., Fasnacht, R., Sepenski, M., Michaud, J. L., and Fan, C. M. (2002) Sim2 mutants have developmental defects not overlapping with those of Sim1 mutants, Mol Cell Biol 22, 4147-4157.

24. Woods, S., Farrall, A., Procko, C., and Whitelaw, M. L. (2008) The bHLH/Per-Arnt-Sim transcription factor SIM2 regulates muscle transcript myomesin2 via a novel, non-canonical E-box sequence, Nucleic Acids Res 36, 3716- 3727.

25. Lee, J., Lee, Y., Lee, M. J., Park, E., Kang, S. H., Chung, C. H., Lee, K. H., and Kim, K. (2008) Dual modification of BMAL1 by SUMO2/3 and ubiquitin promotes circadian activation of the CLOCK/BMAL1 complex, Mol Cell Biol 28, 6056-6065.

26. Lin, J. D., Liu, C., and Li, S. (2008) Integration of energy metabolism and the mammalian clock, Cell cycle (Georgetown, Tex 7, 453-457.

27. Safe, S. H. (1995) Modulation of gene expression and endocrine response pathways by 2,3,7,8-tetrachlorodibenzo- p-dioxin and related compounds, Pharmacol Ther 67, 247-281.

28. Perdew, G. H., and Bradfield, C. A. (1996) Mapping the 90 kDa heat shock protein binding region of the Ah receptor, Biochem Mol Biol Int 39, 589-593.

29. Antonsson, C., Whitelaw, M. L., McGuire, J., Gustafsson, J. A., and Poellinger, L. (1995) Distinct roles of the molecular chaperone hsp90 in modulating dioxin receptor function via the basic helix-loop-helix and PAS domains, Mol Cell Biol 15, 756-765.

30. Lees, M. J., Peet, D. J., and Whitelaw, M. L. (2003) Defining the role for XAP2 in stabilization of the dioxin receptor, J Biol Chem 278, 35878-35888.

31. Meyer, B. K., and Perdew, G. H. (1999) Characterization of the AhR-hsp90-XAP2 core complex and the role of the immunophilin-related protein XAP2 in AhR stabilization, Biochemistry 38, 8907-8917.

32. Ikuta, T., Eguchi, H., Tachibana, T., Yoneda, Y., and Kawajiri, K. (1998) Nuclear localization and export signals of the human aryl hydrocarbon receptor, J Biol Chem 273, 2895-2904.

33. Denison, M. S., and Nagy, S. R. (2003) Activation of the aryl hydrocarbon receptor by structurally diverse exogenous and endogenous chemicals, Annu Rev Pharmacol Toxicol 43, 309-334.

34. Coumailleau, P., Poellinger, L., Gustafsson, J. A., and Whitelaw, M. L. (1995) Definition of a minimal domain of the dioxin receptor that is associated with Hsp90 and maintains wild type ligand binding affinity and specificity, J Biol Chem 270, 25291-25300.

35. McGuire, J., Okamoto, K., Whitelaw, M. L., Tanaka, H., and Poellinger, L. (2001) Definition of a dioxin receptor mutant that is a constitutive activator of transcription: delineation of overlapping repression and ligand binding functions within the PAS domain, J Biol Chem 276, 41841-41849.

36. Chapman-Smith, A., Lutwyche, J. K., and Whitelaw, M. L. (2004) Contribution of the Per/Arnt/Sim (PAS) domains to DNA binding by the basic helix-loop-helix PAS transcriptional regulators, J Biol Chem 279, 5353-5362.

37. Swanson, H. I., Chan, W. K., and Bradfield, C. A. (1995) DNA binding specificities and pairing rules of the Ah receptor, ARNT, and SIM proteins, J Biol Chem 270, 26292-26302.

148 38. Watson, A. J., and Hankinson, O. (1992) Dioxin- and Ah receptor-dependent protein binding to xenobiotic responsive elements and G-rich DNA studied by in vivo footprinting, J Biol Chem 267, 6874-6878.

39. Kobayashi, A., Numayama-Tsuruta, K., Sogawa, K., and Fujii-Kuriyama, Y. (1997) CBP/p300 functions as a possible transcriptional coactivator of Ah receptor nuclear translocator (Arnt), J Biochem (Tokyo) 122, 703-710.

40. Kumar, M. B., and Perdew, G. H. (1999) Nuclear receptor coactivator SRC-1 interacts with the Q-rich subdomain of the AhR and modulates its transactivation potential, Gene Expr 8, 273-286.

41. Whitlock, J. P., Jr. (1999) Induction of cytochrome P4501A1, Annu Rev Pharmacol Toxicol 39, 103-125.

42. Ma, Q. (2001) Induction of CYP1A1. The AhR/DRE paradigm: transcription, receptor regulation, and expanding biological roles, Curr Drug Metab 2, 149-164.

43. Fernandez-Salguero, P., Pineau, T., Hilbert, D. M., McPhail, T., Lee, S. S., Kimura, S., Nebert, D. W., Rudikoff, S., Ward, J. M., and Gonzalez, F. J. (1995) Immune system impairment and hepatic fibrosis in mice lacking the dioxin- binding Ah receptor, Science 268, 722-726.

44. Lahvis, G. P., Lindell, S. L., Thomas, R. S., McCuskey, R. S., Murphy, C., Glover, E., Bentz, M., Southard, J., and Bradfield, C. A. (2000) Portosystemic shunting and persistent fetal vascular structures in aryl hydrocarbon receptor- deficient mice, Proc Natl Acad Sci U S A 97, 10442-10447.

45. Berg, P., and Pongratz, I. (2001) Differential usage of nuclear export sequences regulates intracellular localization of the dioxin (aryl hydrocarbon) receptor, J Biol Chem 276, 43231-43238.

46. Kitajima, M., Khan, K. N., Fujishita, A., Masuzaki, H., Koji, T., and Ishimaru, T. (2004) Expression of the arylhydrocarbon receptor in the peri-implantation period of the mouse uterus and the impact of dioxin on mouse implantation, Arch Histol Cytol 67, 465-474.

47. Buchanan, D. L., Sato, T., Peterson, R. E., and Cooke, P. S. (2000) Antiestrogenic effects of 2,3,7,8- tetrachlorodibenzo-p-dioxin in mouse uterus: critical role of the aryl hydrocarbon receptor in stromal tissue, Toxicol Sci 57, 302-311.

48. Ohtake, F., Baba, A., Takada, I., Okada, M., Iwasaki, K., Miki, H., Takahashi, S., Kouzmenko, A., Nohara, K., Chiba, T., Fujii-Kuriyama, Y., and Kato, S. (2007) Dioxin receptor is a ligand-dependent E3 ubiquitin ligase, Nature 446, 562-566.

49. Wu, Q., Ohsako, S., Baba, T., Miyamoto, K., and Tohyama, C. (2002) Effects of 2,3,7,8-tetrachlorodibenzo-p-dioxin (TCDD) on preimplantation mouse embryos, Toxicology 174, 119-129.

50. Campbell, S. J., Henderson, C. J., Anthony, D. C., Davidson, D., Clark, A. J., and Wolf, C. R. (2005) The murine Cyp1a1 gene is expressed in a restricted spatial and temporal pattern during embryonic development, J Biol Chem 280, 5828-5835.

51. Abbott, B. D., Birnbaum, L. S., and Perdew, G. H. (1995) Developmental expression of two members of a new class of transcription factors: I. Expression of aryl hydrocarbon receptor in the C57BL/6N mouse embryo, Dev Dyn 204, 133-143.

52. Pratt, R. M., Dencker, L., and Diewert, V. M. (1984) 2,3,7,8-Tetrachlorodibenzo-p-dioxin-induced cleft palate in the mouse: evidence for alterations in palatal shelf fusion, Teratog Carcinog Mutagen 4, 427-436.

53. Willey, J. J., Stripp, B. R., Baggs, R. B., and Gasiewicz, T. A. (1998) Aryl hydrocarbon receptor activation in genital tubercle, palate, and other embryonic tissues in 2,3,7, 8-tetrachlorodibenzo-p-dioxin-responsive lacZ mice, Toxicol Appl Pharmacol 151, 33-44.

54. Bemis, J. C., Nazarenko, D. A., and Gasiewicz, T. A. (2005) Coplanar polychlorinated biphenyls activate the aryl hydrocarbon receptor in developing tissues of two TCDD-responsive lacZ mouse lines, Toxicol Sci 87, 529-536.

55. Mimura, J., Yamashita, K., Nakamura, K., Morita, M., Takagi, T. N., Nakao, K., Ema, M., Sogawa, K., Yasuda, M., Katsuki, M., and Fujii-Kuriyama, Y. (1997) Loss of teratogenic response to 2,3,7,8-tetrachlorodibenzo-p-dioxin (TCDD) in mice lacking the Ah (dioxin) receptor, Genes Cells 2, 645-654.

149 56. Schmidt, J. V., Su, G. H., Reddy, J. K., Simon, M. C., and Bradfield, C. A. (1996) Characterization of a murine Ahr null allele: involvement of the Ah receptor in hepatic growth and development, Proc Natl Acad Sci U S A 93, 6731- 6736.

57. Lahvis, G. P., and Bradfield, C. A. (1998) Ahr null alleles: distinctive or different?, Biochem Pharmacol 56, 781-787.

58. Shimizu, Y., Nakatsuru, Y., Ichinose, M., Takahashi, Y., Kume, H., Mimura, J., Fujii-Kuriyama, Y., and Ishikawa, T. (2000) Benzo[a]pyrene carcinogenicity is lost in mice lacking the aryl hydrocarbon receptor, Proc Natl Acad Sci U S A 97, 779-782.

59. McMillan, B. J., and Bradfield, C. A. (2007) The aryl hydrocarbon receptor is activated by modified low-density lipoprotein, Proc Natl Acad Sci U S A 104, 1412-1417.

60. Harstad, E. B., Guite, C. A., Thomae, T. L., and Bradfield, C. A. (2006) Liver deformation in Ahr-null mice: evidence for aberrant hepatic perfusion in early development, Mol Pharmacol 69, 1534-1541.

61. Lahvis, G. P., Pyzalski, R. W., Glover, E., Pitot, H. C., McElwee, M. K., and Bradfield, C. A. (2005) The aryl hydrocarbon receptor is required for developmental closure of the ductus venosus in the neonatal mouse, Mol Pharmacol 67, 714-720.

62. Walisser, J. A., Glover, E., Pande, K., Liss, A. L., and Bradfield, C. A. (2005) Aryl hydrocarbon receptor-dependent liver development and hepatotoxicity are mediated by different cell types, Proc Natl Acad Sci U S A 102, 17858- 17863.

63. Zaher, H., Fernandez-Salguero, P. M., Letterio, J., Sheikh, M. S., Fornace, A. J., Jr., Roberts, A. B., and Gonzalez, F. J. (1998) The involvement of aryl hydrocarbon receptor in the activation of transforming growth factor-beta and apoptosis, Mol Pharmacol 54, 313-321.

64. Branton, M. H., and Kopp, J. B. (1999) TGF-beta and fibrosis, Microbes and infection / Institut Pasteur 1, 1349-1365.

65. Santiago-Josefat, B., Mulero-Navarro, S., Dallas, S. L., and Fernandez-Salguero, P. M. (2004) Overexpression of latent transforming growth factor-beta binding protein 1 (LTBP-1) in dioxin receptor-null mouse embryo fibroblasts, J Cell Sci 117, 849-859.

66. Guo, J., Sartor, M., Karyala, S., Medvedovic, M., Kann, S., Puga, A., Ryan, P., and Tomlinson, C. R. (2004) Expression of genes in the TGF-beta signaling pathway is significantly deregulated in smooth muscle cells from aorta of aryl hydrocarbon receptor knockout mice, Toxicol Appl Pharmacol 194, 79-89.

67. Thomae, T. L., Stevens, E. A., and Bradfield, C. A. (2005) Transforming growth factor-beta3 restores fusion in palatal shelves exposed to 2,3,7,8-tetrachlorodibenzo-p-dioxin, J Biol Chem 280, 12742-12746.

68. Abbott, B. D., Schmid, J. E., Pitt, J. A., Buckalew, A. R., Wood, C. R., Held, G. A., and Diliberto, J. J. (1999) Adverse reproductive outcomes in the transgenic Ah receptor-deficient mouse, Toxicol Appl Pharmacol 155, 62-70.

69. Baba, T., Mimura, J., Nakamura, N., Harada, N., Yamamoto, M., Morohashi, K., and Fujii-Kuriyama, Y. (2005) Intrinsic function of the aryl hydrocarbon (dioxin) receptor as a key factor in female reproduction, Mol Cell Biol 25, 10040-10051.

70. Benedict, J. C., Lin, T. M., Loeffler, I. K., Peterson, R. E., and Flaws, J. A. (2000) Physiological role of the aryl hydrocarbon receptor in mouse ovary development, Toxicol Sci 56, 382-388.

71. Dey, A., and Nebert, D. W. (1998) Markedly increased constitutive CYP1A1 mRNA levels in the fertilized ovum of the mouse, Biochem Biophys Res Commun 251, 657-661.

72. Thomae, T. L., Glover, E., and Bradfield, C. A. (2004) A maternal Ahr null genotype sensitizes embryos to chemical teratogenesis, J Biol Chem 279, 30189-30194.

73. Hushka, L. J., Williams, J. S., and Greenlee, W. F. (1998) Characterization of 2,3,7,8-tetrachlorodibenzofuran- dependent suppression and AH receptor pathway gene expression in the developing mouse mammary gland, Toxicol Appl Pharmacol 152, 200-210.

150 74. Hayes, K. R., Zastrow, G. M., Nukaya, M., Pande, K., Glover, E., Maufort, J. P., Liss, A. L., Liu, Y., Moran, S. M., Vollrath, A. L., and Bradfield, C. A. (2007) Hepatic Transcriptional Networks Induced by Exposure to 2,3,7,8- Tetrachlorodibenzo-p-dioxin, Chem Res Toxicol.

75. Tijet, N., Boutros, P. C., Moffat, I. D., Okey, A. B., Tuomisto, J., and Pohjanvirta, R. (2006) Aryl hydrocarbon receptor regulates distinct dioxin-dependent and dioxin-independent gene batteries, Mol Pharmacol 69, 140-153.

76. Fletcher, N., Wahlstrom, D., Lundberg, R., Nilsson, C. B., Nilsson, K. C., Stockling, K., Hellmold, H., and Hakansson, H. (2005) 2,3,7,8-Tetrachlorodibenzo-p-dioxin (TCDD) alters the mRNA expression of critical genes associated with cholesterol metabolism, bile acid biosynthesis, and bile transport in rat liver: a microarray study, Toxicol Appl Pharmacol 207, 1-24.

77. Puga, A., Maier, A., and Medvedovic, M. (2000) The transcriptional signature of dioxin in human hepatoma HepG2 cells, Biochem Pharmacol 60, 1129-1142.

78. Johnson, C. D., Balagurunathan, Y., Tadesse, M. G., Falahatpisheh, M. H., Brun, M., Walker, M. K., Dougherty, E. R., and Ramos, K. S. (2004) Unraveling gene-gene interactions regulated by ligands of the aryl hydrocarbon receptor, Environ Health Perspect 112, 403-412.

79. Zunino, R., Li, Q., Rose, S. D., Romero-Benitez, M. M., Lejen, T., Brandan, N. C., and Trifaro, J. M. (2001) Expression of scinderin in megakaryoblastic leukemia cells induces differentiation, maturation, and apoptosis with release of plateletlike particles and inhibits proliferation and tumorigenesis, Blood 98, 2210-2219.

80. Temchura, V. V., Frericks, M., Nacken, W., and Esser, C. (2005) Role of the aryl hydrocarbon receptor in thymocyte emigration in vivo, Eur J Immunol 35, 2738-2747.

81. Blaylock, B. L., Holladay, S. D., Comment, C. E., Heindel, J. J., and Luster, M. I. (1992) Exposure to tetrachlorodibenzo-p-dioxin (TCDD) alters fetal thymocyte maturation, Toxicol Appl Pharmacol 112, 207-213.

82. Svensson, C., and Lundberg, K. (2001) Immune-specific up-regulation of adseverin gene expression by 2,3,7,8- tetrachlorodibenzo-p-dioxin, Mol Pharmacol 60, 135-142.

83. Svensson, C., Silverstone, A. E., Lai, Z. W., and Lundberg, K. (2002) Dioxin-induced adseverin expression in the mouse thymus is strictly regulated and dependent on the aryl hydrocarbon receptor, Biochem Biophys Res Commun 291, 1194-1200.

84. Lai, Z. W., Pineau, T., and Esser, C. (1996) Identification of dioxin-responsive elements (DREs) in the 5' regions of putative dioxin-inducible genes, Chem Biol Interact 100, 97-112.

85. Lai, Z. W., Hundeiker, C., Gleichmann, E., and Esser, C. (1997) Cytokine gene expression during ontogeny in murine thymus on activation of the aryl hydrocarbon receptor by 2,3,7,8-tetrachlorodibenzo-p-dioxin, Mol Pharmacol 52, 30-37.

86. Anderson, M. K. (2006) At the crossroads: diverse roles of early thymocyte transcriptional regulators, Immunol Rev 209, 191-211.

87. Quintana, F. J., Basso, A. S., Iglesias, A. H., Korn, T., Farez, M. F., Bettelli, E., Caccamo, M., Oukka, M., and Weiner, H. L. (2008) Control of T(reg) and T(H)17 cell differentiation by the aryl hydrocarbon receptor, Nature 453, 65-71.

88. Kimura, A., Naka, T., Nohara, K., Fujii-Kuriyama, Y., and Kishimoto, T. (2008) Aryl hydrocarbon receptor regulates Stat1 activation and participates in the development of Th17 cells, Proc Natl Acad Sci U S A 105, 9721-9726.

89. Ikuta, T., Kobayashi, Y., and Kawajiri, K. (2004) Cell density regulates intracellular localization of aryl hydrocarbon receptor, J Biol Chem 279, 19209-19216.

90. Ikuta, T., and Kawajiri, K. (2006) Zinc finger transcription factor Slug is a novel target gene of aryl hydrocarbon receptor, Exp Cell Res 312, 3585-3594.

91. Kolluri, S. K., Weiss, C., Koff, A., and Gottlicher, M. (1999) p27(Kip1) induction and inhibition of proliferation by the intracellular Ah receptor in developing thymus and hepatoma cells, Genes Dev 13, 1742-1753.

92. Thomsen, J. S., Kietz, S., Strom, A., and Gustafsson, J. A. (2004) HES-1, a novel target gene for the aryl hydrocarbon receptor, Mol Pharmacol 65, 165-171. 151 93. Jana, N. R., Sarkar, S., Ishizuka, M., Yonemoto, J., Tohyama, C., and Sone, H. (1999) Cross-talk between 2,3,7,8- tetrachlorodibenzo-p-dioxin and testosterone signal transduction pathways in LNCaP prostate cancer cells, Biochem Biophys Res Commun 256, 462-468.

94. Barnes-Ellerbe, S., Knudsen, K. E., and Puga, A. (2004) 2,3,7,8-Tetrachlorodibenzo-p-dioxin blocks androgen- dependent cell proliferation of LNCaP cells through modulation of pRB phosphorylation, Mol Pharmacol 66, 502-511.

95. Marchand, A., Tomkiewicz, C., Marchandeau, J. P., Boitier, E., Barouki, R., and Garlatti, M. (2005) 2,3,7,8- Tetrachlorodibenzo-p-dioxin induces insulin-like growth factor binding protein-1 gene expression and counteracts the negative effect of insulin, Mol Pharmacol 67, 444-452.

96. Matikainen, T. M., Moriyama, T., Morita, Y., Perez, G. I., Korsmeyer, S. J., Sherr, D. H., and Tilly, J. L. (2002) Ligand activation of the aromatic hydrocarbon receptor transcription factor drives Bax-dependent apoptosis in developing fetal ovarian germ cells, Endocrinology 143, 615-620.

97. Matikainen, T., Perez, G. I., Jurisicova, A., Pru, J. K., Schlezinger, J. J., Ryu, H. Y., Laine, J., Sakai, T., Korsmeyer, S. J., Casper, R. F., Sherr, D. H., and Tilly, J. L. (2001) Aromatic hydrocarbon receptor-driven Bax gene expression is required for premature ovarian failure caused by biohazardous environmental chemicals, Nat Genet 28, 355-360.

98. Robles, R., Morita, Y., Mann, K. K., Perez, G. I., Yang, S., Matikainen, T., Sherr, D. H., and Tilly, J. L. (2000) The aryl hydrocarbon receptor, a basic helix-loop-helix transcription factor of the PAS gene family, is required for normal ovarian germ cell dynamics in the mouse, Endocrinology 141, 450-453.

99. Bi, X., Slater, D. M., Ohmori, H., and Vaziri, C. (2005) DNA polymerase kappa is specifically required for recovery from the benzo[a]pyrene-dihydrodiol epoxide (BPDE)-induced S-phase checkpoint, J Biol Chem 280, 22343-22355.

100. Ogi, T., Mimura, J., Hikida, M., Fujimoto, H., Fujii-Kuriyama, Y., and Ohmori, H. (2001) Expression of human and mouse genes encoding polkappa: testis-specific developmental regulation and AhR-dependent inducible transcription, Genes Cells 6, 943-953.

101. Sinal, C. J., and Bend, J. R. (1997) Aryl hydrocarbon receptor-dependent induction of cyp1a1 by bilirubin in mouse hepatoma hepa 1c1c7 cells, Mol Pharmacol 52, 590-599.

102. Fritsche, E., Schafer, C., Calles, C., Bernsmann, T., Bernshausen, T., Wurm, M., Hubenthal, U., Cline, J. E., Hajimiragha, H., Schroeder, P., Klotz, L. O., Rannug, A., Furst, P., Hanenberg, H., Abel, J., and Krutmann, J. (2007) Lightening up the UV response by identification of the arylhydrocarbon receptor as a cytoplasmatic target for ultraviolet B radiation, Proc Natl Acad Sci U S A 104, 8851-8856.

103. Rannug, U., Rannug, A., Sjoberg, U., Li, H., Westerholm, R., and Bergman, J. (1995) Structure elucidation of two tryptophan-derived, high affinity Ah receptor ligands, Chem Biol 2, 841-845.

104. Schaldach, C. M., Riby, J., and Bjeldanes, L. F. (1999) Lipoxin A4: a new class of ligand for the Ah receptor, Biochemistry 38, 7594-7600.

105. Seidel, S. D., Winters, G. M., Rogers, W. J., Ziccardi, M. H., Li, V., Keser, B., and Denison, M. S. (2001) Activation of the Ah receptor signaling pathway by prostaglandins, J Biochem Mol Toxicol 15, 187-196.

106. Ohtake, F., Takeyama, K., Matsumoto, T., Kitagawa, H., Yamamoto, Y., Nohara, K., Tohyama, C., Krust, A., Mimura, J., Chambon, P., Yanagisawa, J., Fujii-Kuriyama, Y., and Kato, S. (2003) Modulation of oestrogen receptor signalling by association with the activated dioxin receptor, Nature 423, 545-550.

107. Ohtake, F., Baba, A., Fujii-Kuriyama, Y., and Kato, S. (2008) Intrinsic AhR function underlies cross-talk of dioxins with sex hormone signalings, Biochem Biophys Res Commun.

108. Ikuta, T., Kobayashi, Y., and Kawajiri, K. (2004) Phosphorylation of nuclear localization signal inhibits the ligand- dependent nuclear import of aryl hydrocarbon receptor, Biochem Biophys Res Commun 317, 545-550.

109. Carrier, F., Owens, R. A., Nebert, D. W., and Puga, A. (1992) Dioxin-dependent activation of murine Cyp1a-1 gene transcription requires protein kinase C-dependent phosphorylation, Mol Cell Biol 12, 1856-1863.

110. Guo, M., Joiakim, A., and Reiners, J. J., Jr. (2000) Suppression of 2,3,7,8-tetrachlorodibenzo-p-dioxin (TCDD)- mediated aryl hydrocarbon receptor transformation and CYP1A1 induction by the phosphatidylinositol 3-kinase inhibitor 2-(4-morpholinyl)-8-phenyl-4H-1- benzopyran-4-one (LY294002), Biochem Pharmacol 60, 635-642. 152 111. Sadek, C. M., and Allen-Hoffmann, B. L. (1994) Suspension-mediated induction of Hepa 1c1c7 Cyp1a-1 expression is dependent on the Ah receptor signal transduction pathway, J Biol Chem 269, 31505-31509.

112. Poland, A., and Knutson, J. C. (1982) 2,3,7,8-tetrachlorodibenzo-p-dioxin and related halogenated aromatic hydrocarbons: examination of the mechanism of toxicity, Annu Rev Pharmacol Toxicol 22, 517-554.

113. Panteleyev, A. A., and Bickers, D. R. (2006) Dioxin-induced chloracne--reconstructing the cellular and molecular mechanisms of a classic environmental disease, Exp Dermatol 15, 705-730.

114. Tauchi, M., Hida, A., Negishi, T., Katsuoka, F., Noda, S., Mimura, J., Hosoya, T., Yanaka, A., Aburatani, H., Fujii- Kuriyama, Y., Motohashi, H., and Yamamoto, M. (2005) Constitutive expression of aryl hydrocarbon receptor in keratinocytes causes inflammatory skin lesions, Mol Cell Biol 25, 9360-9368.

115. Fernandez-Salguero, P. M., Ward, J. M., Sundberg, J. P., and Gonzalez, F. J. (1997) Lesions of aryl-hydrocarbon receptor-deficient mice, Vet Pathol 34, 605-614.

116. Sadek, C. M., and Allen-Hoffmann, B. L. (1994) Cytochrome P450IA1 is rapidly induced in normal human keratinocytes in the absence of xenobiotics, J Biol Chem 269, 16067-16074.

117. Geusau, A., Khorchide, M., Mildner, M., Pammer, J., Eckhart, L., and Tschachler, E. (2005) 2,3,7,8- tetrachlorodibenzo-p-dioxin impairs differentiation of normal human epidermal keratinocytes in a skin equivalent model, J Invest Dermatol 124, 275-277.

118. Loertscher, J. A., Sattler, C. A., and Allen-Hoffmann, B. L. (2001) 2,3,7,8-Tetrachlorodibenzo-p-dioxin alters the differentiation pattern of human keratinocytes in organotypic culture, Toxicol Appl Pharmacol 175, 121-129.

119. Eckhart, L., Declercq, W., Ban, J., Rendl, M., Lengauer, B., Mayer, C., Lippens, S., Vandenabeele, P., and Tschachler, E. (2000) Terminal differentiation of human keratinocytes and stratum corneum formation is associated with caspase-14 activation, J Invest Dermatol 115, 1148-1151.

120. Denecker, G., Hoste, E., Gilbert, B., Hochepied, T., Ovaere, P., Lippens, S., Van den Broecke, C., Van Damme, P., D'Herde, K., Hachem, J. P., Borgonie, G., Presland, R. B., Schoonjans, L., Libert, C., Vandekerckhove, J., Gevaert, K., Vandenabeele, P., and Declercq, W. (2007) Caspase-14 protects against epidermal UVB photodamage and water loss, Nat Cell Biol 9, 666-674.

121. Wanner, R., Brommer, S., Czarnetzki, B. M., and Rosenbach, T. (1995) The differentiation-related upregulation of aryl hydrocarbon receptor transcript levels is suppressed by retinoic acid, Biochem Biophys Res Commun 209, 706- 711.

122. Swanson, H. I. (2004) Cytochrome P450 expression in human keratinocytes: an aryl hydrocarbon receptor perspective, Chem Biol Interact 149, 69-79.

123. Breitkreutz, D., Stark, H. J., Plein, P., Baur, M., and Fusenig, N. E. (1993) Differential modulation of epidermal keratinization in immortalized (HaCaT) and tumorigenic human skin keratinocytes (HaCaT-ras) by retinoic acid and extracellular Ca2+, Differentiation 54, 201-217.

124. Wanner, R., Zhang, J., Henz, B. M., and Rosenbach, T. (1996) AP-2 gene expression and modulation by retinoic acid during keratinocyte differentiation, Biochem Biophys Res Commun 223, 666-669.

125. Fisher, G. J., and Voorhees, J. J. (1996) Molecular mechanisms of retinoid actions in skin, Faseb J 10, 1002-1013.

126. Schoenermark, M. P., Mitchell, T. I., Rutter, J. L., Reczek, P. R., and Brinckerhoff, C. E. (1999) Retinoid-mediated suppression of tumor invasion and matrix metalloproteinase synthesis, Ann N Y Acad Sci 878, 466-486.

127. Lorick, K. L., Toscano, D. L., and Toscano, W. A., Jr. (1998) 2,3,7,8-Tetrachlorodibenzo-p-dioxin alters retinoic acid receptor function in human keratinocytes, Biochem Biophys Res Commun 243, 749-752.

128. Murphy, K. A., Villano, C. M., Dorn, R., and White, L. A. (2004) Interaction between the aryl hydrocarbon receptor and retinoic acid pathways increases matrix metalloproteinase-1 expression in keratinocytes, J Biol Chem 279, 25284-25293.

129. Ray, S. S., and Swanson, H. I. (2003) Alteration of keratinocyte differentiation and senescence by the tumor promoter dioxin, Toxicol Appl Pharmacol 192, 131-145. 153 130. Weil, M., Raff, M. C., and Braga, V. M. (1999) Caspase activation in the terminal differentiation of human epidermal keratinocytes, Curr Biol 9, 361-364.

131. Rheinwald, J. G., Hahn, W. C., Ramsey, M. R., Wu, J. Y., Guo, Z., Tsao, H., De Luca, M., Catricala, C., and O'Toole, K. M. (2002) A two-stage, p16(INK4A)- and p53-dependent keratinocyte senescence mechanism that limits replicative potential independent of telomere status, Mol Cell Biol 22, 5157-5172.

132. Meerson, A., Milyavsky, M., and Rotter, V. (2004) p53 mediates density-dependent growth arrest, FEBS Lett 559, 152-158.

133. Ray, S. S., and Swanson, H. I. (2004) Dioxin-induced immortalization of normal human keratinocytes and silencing of p53 and p16INK4a, J Biol Chem 279, 27187-27193.

134. Mulero-Navarro, S., Pozo-Guisado, E., Perez-Mancera, P. A., Alvarez-Barrientos, A., Catalina-Fernandez, I., Hernandez-Nieto, E., Saenz-Santamaria, J., Martinez, N., Rojas, J. M., Sanchez-Garcia, I., and Fernandez- Salguero, P. M. (2005) Immortalized mouse mammary fibroblasts lacking dioxin receptor have impaired tumorigenicity in a subcutaneous mouse xenograft model, J Biol Chem 280, 28731-28741.

135. Cho, Y. C., Zheng, W., and Jefcoate, C. R. (2004) Disruption of cell-cell contact maximally but transiently activates AhR-mediated transcription in 10T1/2 fibroblasts, Toxicol Appl Pharmacol 199, 220-238.

136. Ausubel, F. M., Brent, R., Kingston, R. E., Moore, D. D., Seidman, J. G., Smith, J. A., and Struhl, K., (Eds.) (1995) Current Protocols in Molecular Biology, Vol. 1-3, John Wiley & Sons, Inc.

137. Berghard, A., Gradin, K., Pongratz, I., Whitelaw, M., and Poellinger, L. (1993) Cross-coupling of signal transduction pathways: the dioxin receptor mediates induction of cytochrome P-450IA1 expression via a protein kinase C- dependent mechanism, Mol Cell Biol 13, 677-689.

138. Kholod, N., and Mustelin, T. (2001) Novel vectors for co-expression of two proteins in E. coli, BioTechniques 31, 322- 323, 326-328.

139. Mason, G. G., Witte, A. M., Whitelaw, M. L., Antonsson, C., McGuire, J., Wilhelmsson, A., Poellinger, L., and Gustafsson, J. A. (1994) Purification of the DNA binding form of dioxin receptor. Role of the Arnt cofactor in regulation of dioxin receptor function, J Biol Chem 269, 4438-4449.

140. Hobbs, S., Jitrapakdee, S., and Wallace, J. C. (1998) Development of a bicistronic vector driven by the human polypeptide chain elongation factor 1alpha promoter for creation of stable mammalian cell lines that express very high levels of recombinant proteins, Biochem Biophys Res Commun 252, 368-372.

141. Svergun, D. I. (1992) Determination of the regularization parameter in indirect-transform methods using perceptual criteria, J. Appl. Cryst., 495-503.

142. Svergun, D. I. (1999) Restoring low resolution structure of biological macromolecules from solution scattering using simulated annealing, Biophysical journal 76, 2879-2886.

143. Volkov, V. V. a. S., D.I. (2003) Uniqueness of ab initio shape determination in small-angle scattering, J. Appl. Cryst., 860-864.

144. Monk, S. A., Denison, M. S., and Rice, R. H. (2001) Transient expression of CYP1A1 in rat epithelial cells cultured in suspension, Arch Biochem Biophys 393, 154-162.

145. Amoutzias, G. D., Robertson, D. L., and Bornberg-Bauer, E. (2004) The evolution of protein interaction networks in regulatory proteins, Comparative and functional genomics 5, 79-84.

146. Simionato, E., Ledent, V., Richards, G., Thomas-Chollier, M., Kerner, P., Coornaert, D., Degnan, B. M., and Vervoort, M. (2007) Origin and diversification of the basic helix-loop-helix gene family in metazoans: insights from comparative genomics, BMC evolutionary biology 7, 33.

147. Massari, M. E., and Murre, C. (2000) Helix-loop-helix proteins: regulators of transcription in eucaryotic organisms, Mol Cell Biol 20, 429-440.

148. Ellenberger, T., Fass, D., Arnaud, M., and Harrison, S. C. (1994) Crystal structure of transcription factor E47: E-box recognition by a basic region helix-loop-helix dimer, Genes Dev 8, 970-980. 154 149. Ma, P. C., Rould, M. A., Weintraub, H., and Pabo, C. O. (1994) Crystal structure of MyoD bHLH domain-DNA complex: perspectives on DNA recognition and implications for transcriptional activation, Cell 77, 451-459.

150. Ferre-D'Amare, A. R., Pognonec, P., Roeder, R. G., and Burley, S. K. (1994) Structure and function of the b/HLH/Z domain of USF, Embo J 13, 180-189.

151. Luscher, B., and Larsson, L. G. (1999) The basic region/helix-loop-helix/leucine zipper domain of Myc proto- oncoproteins: function and regulation, Oncogene 18, 2955-2966.

152. Fischer, A., Klattig, J., Kneitz, B., Diez, H., Maier, M., Holtmann, B., Englert, C., and Gessler, M. (2005) Hey basic helix-loop-helix transcription factors are repressors of GATA4 and GATA6 and restrict expression of the GATA target gene ANF in fetal hearts, Mol Cell Biol 25, 8960-8970.

153. Amoutzias, G. D., Robertson, D. L., Van de Peer, Y., and Oliver, S. G. (2008) Choose your partners: dimerization in eukaryotic transcription factors, Trends Biochem Sci 33, 220-229.

154. Longo, A., Guanga, G. P., and Rose, R. B. (2008) Crystal structure of E47-NeuroD1/beta2 bHLH domain-DNA complex: heterodimer selectivity and DNA recognition, Biochemistry 47, 218-229.

155. Ellenberger, T. E., Brandl, C. J., Struhl, K., and Harrison, S. C. (1992) The GCN4 basic region leucine zipper binds DNA as a dimer of uninterrupted alpha helices: crystal structure of the protein-DNA complex, Cell 71, 1223-1237.

156. Zeng, X., Herndon, A. M., and Hu, J. C. (1997) Buried asparagines determine the dimerization specificities of leucine zipper mutants, Proc Natl Acad Sci U S A 94, 3673-3678.

157. Lovejoy, B., Choe, S., Cascio, D., McRorie, D. K., DeGrado, W. F., and Eisenberg, D. (1993) Crystal structure of a synthetic triple-stranded alpha-helical bundle, Science 259, 1288-1293.

158. Potekhin, S. A., Medvedkin, V. N., Kashparov, I. A., and Venyaminov, S. (1994) Synthesis and properties of the peptide corresponding to the mutant form of the leucine zipper of the transcriptional activator GCN4 from yeast, Protein engineering 7, 1097-1101.

159. Lavigne, P., Crump, M. P., Gagne, S. M., Hodges, R. S., Kay, C. M., and Sykes, B. D. (1998) Insights into the mechanism of heterodimerization from the 1H-NMR solution structure of the c-Myc-Max heterodimeric leucine zipper, J Mol Biol 281, 165-181.

160. Lavigne, P., Kondejewski, L. H., Houston, M. E., Jr., Sonnichsen, F. D., Lix, B., Skyes, B. D., Hodges, R. S., and Kay, C. M. (1995) Preferential heterodimeric parallel coiled-coil formation by synthetic Max and c-Myc leucine zippers: a description of putative electrostatic interactions responsible for the specificity of heterodimerization, J Mol Biol 254, 505-520.

161. Luscher, B. (2001) Function and regulation of the transcription factors of the Myc/Max/Mad network, Gene 277, 1- 14.

162. Pu, W. T., and Struhl, K. (1991) The leucine zipper symmetrically positions the adjacent basic regions for specific DNA binding, Proc Natl Acad Sci U S A 88, 6901-6905.

163. Giagtzoglou, N., Alifragis, P., Koumbanakis, K. A., and Delidakis, C. (2003) Two modes of recruitment of E(spl) repressors onto target genes, Development 130, 259-270.

164. Chin, M. T., Maemura, K., Fukumoto, S., Jain, M. K., Layne, M. D., Watanabe, M., Hsieh, C. M., and Lee, M. E. (2000) Cardiovascular basic helix loop helix factor 1, a novel transcriptional repressor expressed preferentially in the developing and adult cardiovascular system, J Biol Chem 275, 6381-6387.

165. Iso, T., Sartorelli, V., Poizat, C., Iezzi, S., Wu, H. Y., Chung, G., Kedes, L., and Hamamori, Y. (2001) HERP, a novel heterodimer partner of HES/E(spl) in Notch signaling, Mol Cell Biol 21, 6080-6089.

166. Leimeister, C., Dale, K., Fischer, A., Klamt, B., Hrabe de Angelis, M., Radtke, F., McGrew, M. J., Pourquie, O., and Gessler, M. (2000) Oscillating expression of c-Hey2 in the presomitic mesoderm suggests that the segmentation clock may use combinatorial signaling through multiple interacting bHLH factors, Dev Biol 227, 91-103.

167. Nakatani, T., Mizuhara, E., Minaki, Y., Sakamoto, Y., and Ono, Y. (2004) Helt, a novel basic-helix-loop-helix transcriptional repressor expressed in the developing central nervous system, J Biol Chem 279, 16356-16367. 155 168. Taelman, V., Van Wayenbergh, R., Solter, M., Pichon, B., Pieler, T., Christophe, D., and Bellefroid, E. J. (2004) Sequences downstream of the bHLH domain of the Xenopus hairy-related transcription factor-1 act as an extended dimerization domain that contributes to the selection of the partners, Dev Biol 276, 47-63.

169. Taylor, B. L., and Zhulin, I. B. (1999) PAS domains: internal sensors of oxygen, redox potential, and light, Microbiol Mol Biol Rev 63, 479-506.

170. Wilhelmsson, A., Cuthill, S., Denis, M., Wikstrom, A. C., Gustafsson, J. A., and Poellinger, L. (1990) The specific DNA binding activity of the dioxin receptor is modulated by the 90 kd heat shock protein, Embo J 9, 69-76.

171. Baca, M., Borgstahl, G. E., Boissinot, M., Burke, P. M., Williams, D. R., Slater, K. A., and Getzoff, E. D. (1994) Complete chemical structure of photoactive yellow protein: novel thioester-linked 4-hydroxycinnamyl chromophore and photocycle chemistry, Biochemistry 33, 14369-14377.

172. Hill, S., Austin, S., Eydmann, T., Jones, T., and Dixon, R. (1996) Azotobacter vinelandii NIFL is a flavoprotein that modulates transcriptional activation of nitrogen-fixation genes via a redox-sensitive switch, Proc Natl Acad Sci U S A 93, 2143-2148.

173. Pellequer, J. L., Wager-Smith, K. A., Kay, S. A., and Getzoff, E. D. (1998) Photoactive yellow protein: a structural prototype for the three-dimensional fold of the PAS domain superfamily, Proc Natl Acad Sci U S A 95, 5884-5890.

174. Yildiz, O., Doi, M., Yujnovsky, I., Cardone, L., Berndt, A., Hennig, S., Schulze, S., Urbanke, C., Sassone-Corsi, P., and Wolf, E. (2005) Crystal structure and interactions of the PAS repeat region of the Drosophila clock protein PERIOD, Mol Cell 17, 69-82.

175. Lindebro, M. C., Poellinger, L., and Whitelaw, M. L. (1995) Protein-protein interaction via PAS domains: role of the PAS domain in positive and negative regulation of the bHLH/PAS dioxin receptor-Arnt transcription factor complex, Embo J 14, 3528-3539.

176. Card, P. B., Erbel, P. J., and Gardner, K. H. (2005) Structural basis of ARNT PAS-B dimerization: use of a common beta-sheet interface for hetero- and homodimerization, J Mol Biol 353, 664-677.

177. Wang, L., Fabret, C., Kanamaru, K., Stephenson, K., Dartois, V., Perego, M., and Hoch, J. A. (2001) Dissection of the functional and structural domains of phosphorelay histidine kinase A of Bacillus subtilis, J Bacteriol 183, 2795- 2802.

178. Buttani, V., Losi, A., Eggert, T., Krauss, U., Jaeger, K. E., Cao, Z., and Gartner, W. (2007) Conformational analysis of the blue-light sensing protein YtvA reveals a competitive interface for LOV-LOV dimerization and interdomain interactions, Photochem Photobiol Sci 6, 41-49.

179. Moglich, A., and Moffat, K. (2007) Structural basis for light-dependent signaling in the dimeric LOV domain of the photosensor YtvA, J Mol Biol 373, 112-126.

180. Zoltowski, B. D., and Crane, B. R. (2008) Light activation of the LOV protein vivid generates a rapidly exchanging dimer, Biochemistry 47, 7012-7019.

181. Miyatake, H., Mukai, M., Park, S. Y., Adachi, S., Tamura, K., Nakamura, H., Nakamura, K., Tsuchiya, T., Iizuka, T., and Shiro, Y. (2000) Sensory mechanism of oxygen sensor FixL from Rhizobium meliloti: crystallographic, mutagenesis and resonance Raman spectroscopic studies, J Mol Biol 301, 415-431.

182. Ma, X., Sayed, N., Baskaran, P., Beuve, A., and van den Akker, F. (2008) PAS-mediated dimerization of soluble guanylyl cyclase revealed by signal transduction histidine kinase domain crystal structure, J Biol Chem 283 , 1167- 1178.

183. Park, H., Suquet, C., Satterlee, J. D., and Kang, C. (2004) Insights into signal transduction involving PAS domain oxygen-sensing heme proteins from the X-ray crystal structure of Escherichia coli Dos heme domain (Ec DosH), Biochemistry 43, 2738-2746.

184. Reinelt, S., Hofmann, E., Gerharz, T., Bott, M., and Madden, D. R. (2003) The structure of the periplasmic ligand- binding domain of the sensor kinase CitA reveals the first extracellular PAS domain, J Biol Chem 278, 39189-39196.

156 185. Woods, S. L., and Whitelaw, M. L. (2002) Differential activities of murine single minded 1 (SIM1) and SIM2 on a hypoxic response element. Cross-talk between basic helix-loop-helix/per-Arnt-Sim homology transcription factors, J Biol Chem 277, 10236-10243.

186. Lavista-Llanos, S., Centanin, L., Irisarri, M., Russo, D. M., Gleadle, J. M., Bocca, S. N., Muzzopappa, M., Ratcliffe, P. J., and Wappner, P. (2002) Control of the hypoxic response in Drosophila melanogaster by the basic helix-loop- helix PAS protein similar, Mol Cell Biol 22, 6842-6853.

187. Zelzer, E., Wappner, P., and Shilo, B. Z. (1997) The PAS domain confers target gene specificity of Drosophila bHLH/PAS proteins, Genes Dev 11, 2079-2089.

188. Sun, W., Zhang, J., and Hankinson, O. (1997) A mutation in the aryl hydrocarbon receptor (AHR) in a cultured mammalian cell line identifies a novel region of AHR that affects DNA binding, J Biol Chem 272, 31845-31854.

189. Numayama-Tsuruta, K., Kobayashi, A., Sogawa, K., and Fujii-Kuriyama, Y. (1997) A point mutation responsible for defective function of the aryl-hydrocarbon-receptor nuclear translocator in mutant Hepa-1c1c7 cells, Eur J Biochem 246, 486-495.

190. Kamnasaran, D., Muir, W. J., Ferguson-Smith, M. A., and Cox, D. W. (2003) Disruption of the neuronal PAS3 gene in a family affected with schizophrenia, Journal of medical genetics 40, 325-332.

191. Ibi, D., Takuma, K., Koike, H., Mizoguchi, H., Tsuritani, K., Kuwahara, Y., Kamei, H., Nagai, T., Yoneda, Y., Nabeshima, T., and Yamada, K. (2008) Social isolation rearing-induced impairment of the hippocampal neurogenesis is associated with deficits in spatial memory and emotion-related behaviors in juvenile mice, Journal of neurochemistry 105, 921-932.

192. Dougherty, W. G., Carrington, J. C., Cary, S. M., and Parks, T. D. (1988) Biochemical and mutational analysis of a plant virus polyprotein cleavage site, Embo J 7, 1281-1287.

193. Sawadogo, M., and Roeder, R. G. (1985) Interaction of a gene-specific transcription factor with the adenovirus major late promoter upstream of the TATA box region, Cell 43, 165-175.

194. Sogawa, K., Nakano, R., Kobayashi, A., Kikuchi, Y., Ohe, N., Matsushita, N., and Fujii-Kuriyama, Y. (1995) Possible function of Ah receptor nuclear translocator (Arnt) homodimer in transcriptional regulation, Proc Natl Acad Sci U S A 92, 1936-1940.

195. Erbel, P. J., Card, P. B., Karakuzu, O., Bruick, R. K., and Gardner, K. H. (2003) Structural basis for PAS domain heterodimerization in the basic helix--loop--helix-PAS transcription factor hypoxia-inducible factor, Proc Natl Acad Sci U S A 100, 15504-15509.

196. Scheuermann, T. H., Tomchick, D. R., Machius, M., Guo, Y., Bruick, R. K., and Gardner, K. H. (2009) Artificial ligand binding within the HIF2alpha PAS-B domain of the HIF2 transcription factor, Proc Natl Acad Sci U S A 106, 450-455.

197. Evans, M. R., Card, P. B., and Gardner, K. H. (2009) ARNT PAS-B has a fragile native state structure with an alternative beta-sheet register nearby in sequence space, Proc Natl Acad Sci U S A 106, 2617-2622.

198. Moll, J. R., Acharya, A., Gal, J., Mir, A. A., and Vinson, C. (2002) Magnesium is required for specific DNA binding of the CREB B-ZIP domain, Nucleic Acids Res 30, 1240-1246.

199. Friedrich, K. L., Giese, K. C., Buan, N. R., and Vierling, E. (2004) Interactions between small heat shock protein subunits and substrate in small heat shock protein-substrate complexes, J Biol Chem 279, 1080-1089.

200. Frottin, F., Martinez, A., Peynot, P., Mitra, S., Holz, R. C., Giglione, C., and Meinnel, T. (2006) The proteomics of N- terminal methionine cleavage, Mol Cell Proteomics 5, 2336-2349.

201. Jordan, S. R., Whitcombe, T. V., Berg, J. M., and Pabo, C. O. (1985) Systematic variation in DNA length yields highly ordered repressor-operator cocrystals, Science 230, 1383-1385.

202. Ferre-D'Amare, A. R., Prendergast, G. C., Ziff, E. B., and Burley, S. K. (1993) Recognition by Max of its cognate DNA through a dimeric b/HLH/Z domain, Nature 363, 38-45.

203. Shimizu, T., Toumoto, A., Ihara, K., Shimizu, M., Kyogoku, Y., Ogawa, N., Oshima, Y., and Hakoshima, T. (1997) Crystal structure of PHO4 bHLH domain-DNA complex: flanking base recognition, Embo J 16, 4689-4697. 157 204. Bergfors, T. (2003) Seeds to crystals, Journal of structural biology 142, 66-76.

205. Saridakis, E., and Chayen, N. E. (2000) Improving protein crystal quality by decoupling nucleation and growth in vapor diffusion, Protein Sci 9, 755-757.

206. Birtley, J. R., and Curry, S. (2005) Crystallization of foot-and-mouth disease virus 3C protease: surface mutagenesis and a novel crystal-optimization strategy, Acta crystallographica 61, 646-650.

207. Chayen, N. E. (1997) The role of oil in macromolecular crystallization, Structure 5, 1269-1274.

208. Pearl, L., O'Hara, B., Drew, R., and Wilson, S. (1994) Crystal structure of AmiC: the controller of transcription antitermination in the amidase operon of Pseudomonas aeruginosa, Embo J 13, 5810-5817.

209. Salemme, F. R. (1972) A free interface diffusion technique for the crystallization of proteins for x-ray crystallography, Arch Biochem Biophys 151, 533-539.

210. Rice, P. A., Yang, S., Mizuuchi, K., and Nash, H. A. (1996) Crystal structure of an IHF-DNA complex: a protein- induced DNA U-turn, Cell 87, 1295-1306.

211. Garcia, H. G., Grayson, P., Han, L., Inamdar, M., Kondev, J., Nelson, P. C., Phillips, R., Widom, J., and Wiggins, P. A. (2007) Biological consequences of tightly bent DNA: the other life of a macromolecular celebrity, Biopolymers 85, 115-130.

212. Richmond, T. J., and Davey, C. A. (2003) The structure of DNA in the nucleosome core, Nature 423, 145-150.

213. Anderson, J. D., and Widom, J. (2001) Poly(dA-dT) promoter elements increase the equilibrium accessibility of nucleosomal DNA target sites, Mol Cell Biol 21, 3830-3839.

214. MacDonald, D., Herbert, K., Zhang, X., Pologruto, T., and Lu, P. (2001) Solution structure of an A-tract DNA bend, J Mol Biol 306, 1081-1098.

215. Brukner, I., Dlakic, M., Savic, A., Susic, S., Pongor, S., and Suck, D. (1993) Evidence for opposite groove-directed curvature of GGGCCC and AAAAA sequence elements, Nucleic Acids Res 21, 1025-1029.

216. Roychoudhury, M., Sitlani, A., Lapham, J., and Crothers, D. M. (2000) Global structure and mechanical properties of a 10-bp nucleosome positioning motif, Proc Natl Acad Sci U S A 97, 13608-13613.

217. Shrader, T. E., and Crothers, D. M. (1989) Artificial nucleosome positioning sequences, Proc Natl Acad Sci U S A 86, 7418-7422.

218. Widom, J. (2001) Role of DNA sequence in nucleosome stability and dynamics, Quarterly reviews of biophysics 34, 269-324.

219. Anderson, J. D., and Widom, J. (2000) Sequence and position-dependence of the equilibrium accessibility of nucleosomal DNA target sites, J Mol Biol 296, 979-987.

220. Li, G., Levitus, M., Bustamante, C., and Widom, J. (2005) Rapid spontaneous accessibility of nucleosomal DNA, Nature structural & molecular biology 12, 46-53.

221. Dhalluin, C., Carlson, J. E., Zeng, L., He, C., Aggarwal, A. K., and Zhou, M. M. (1999) Structure and ligand of a histone acetyltransferase bromodomain, Nature 399, 491-496.

222. Luger, K., Mader, A. W., Richmond, R. K., Sargent, D. F., and Richmond, T. J. (1997) Crystal structure of the nucleosome core particle at 2.8 A resolution, Nature 389, 251-260.

223. Muse, G. W., Gilchrist, D. A., Nechaev, S., Shah, R., Parker, J. S., Grissom, S. F., Zeitlinger, J., and Adelman, K. (2007) RNA polymerase is poised for activation across the genome, Nat Genet 39, 1507-1511.

224. Gilchrist, D. A., Nechaev, S., Lee, C., Ghosh, S. K., Collins, J. B., Li, L., Gilmour, D. S., and Adelman, K. (2008) NELF-mediated stalling of Pol II can enhance gene expression by blocking promoter-proximal nucleosome assembly, Genes Dev 22, 1921-1933.

158 225. Desjobert, C., Noy, P., Swingler, T., Williams, H., Gaston, K., and Jayaraman, P. S. (2009) The PRH/Hex repressor protein causes nuclear retention of Groucho/TLE co-repressors, Biochem J 417, 121-132.

226. Soufi, A., Smith, C., Clarke, A. R., Gaston, K., and Jayaraman, P. S. (2006) Oligomerisation of the developmental regulator proline rich homeodomain (PRH/Hex) is mediated by a novel proline-rich dimerisation domain, J Mol Biol 358, 943-962.

227. Williams, H., Jayaraman, P. S., and Gaston, K. (2008) DNA wrapping and distortion by an oligomeric homeodomain protein, J Mol Biol 383, 10-23.

228. Splinter, E., Heath, H., Kooren, J., Palstra, R. J., Klous, P., Grosveld, F., Galjart, N., and de Laat, W. (2006) CTCF mediates long-range chromatin looping and local histone modification in the beta-globin locus, Genes Dev 20, 2349- 2354.

229. Guo, H., Mi, Z., and Kuo, P. C. (2008) Characterization of short range DNA looping in endotoxin-mediated transcription of the murine inducible nitric-oxide synthase (iNOS) gene, J Biol Chem 283, 25209-25217.

230. Sanchez-Elsner, T., Ramirez, J. R., Sanz-Rodriguez, F., Varela, E., Bernabeu, C., and Botella, L. M. (2004) A cross- talk between hypoxia and TGF-beta orchestrates erythropoietin gene regulation through SP1 and Smads, J Mol Biol 336, 9-24.

231. Vo, N., Fjeld, C., and Goodman, R. H. (2001) Acetylation of nuclear hormone receptor-interacting protein RIP140 regulates binding of the transcriptional corepressor CtBP, Mol Cell Biol 21, 6181-6188.

232. Freedman, S. J., Sun, Z. Y., Poy, F., Kung, A. L., Livingston, D. M., Wagner, G., and Eck, M. J. (2002) Structural basis for recruitment of CBP/p300 by hypoxia-inducible factor-1 alpha, Proc Natl Acad Sci U S A 99, 5367-5372.

233. Nair, S. K., and Burley, S. K. (2003) X-ray structures of Myc-Max and Mad-Max recognizing DNA. Molecular bases of regulation by proto-oncogenic transcription factors, Cell 112, 193-205.

234. Lebel, R., McDuff, F. O., Lavigne, P., and Grandbois, M. (2007) Direct visualization of the binding of c-Myc/Max heterodimeric b-HLH-LZ to E-box sequences on the hTERT promoter, Biochemistry 46, 10279-10286.

235. Becker, N. A., Kahn, J. D., and Maher, L. J., 3rd. (2007) Effects of nucleoid proteins on DNA repression loop formation in Escherichia coli, Nucleic Acids Res 35, 3988-4000.

236. Marenduzzo, D., Micheletti, C., and Cook, P. R. (2006) Entropy-driven genome organization, Biophysical journal 90, 3712-3721.

237. Lamber, E. P., Wilmanns, M., and Svergun, D. I. (2008) Low resolution structural models of the basic helix-loop-helix leucine zipper domain of upstream stimulatory factor 1 and its complexes with DNA from small angle X-ray scattering data, Biophysical journal 94, 193-197.

238. Durrin, L. K., and Whitlock, J. P., Jr. (1989) 2,3,7,8-Tetrachlorodibenzo-p-dioxin-inducible aryl hydrocarbon receptor- mediated change in CYP1A1 chromatin structure occurs independently of transcription, Mol Cell Biol 9, 5733-5737.

239. Morgan, J. E., and Whitlock, J. P., Jr. (1992) Transcription-dependent and transcription-independent nucleosome disruption induced by dioxin, Proc Natl Acad Sci U S A 89, 11622-11626.

240. Wu, L., and Whitlock, J. P., Jr. (1992) Mechanism of dioxin action: Ah receptor-mediated increase in promoter accessibility in vivo, Proc Natl Acad Sci U S A 89, 4811-4815.

241. Pfeifer, G. P. (1992) Analysis of chromatin structure by ligation-mediated PCR, PCR methods and applications 2, 107-111.

242. Okino, S. T., and Whitlock, J. P., Jr. (1995) Dioxin induces localized, graded changes in chromatin structure: implications for Cyp1A1 gene transcription, Mol Cell Biol 15, 3714-3721.

243. Ko, H. P., Okino, S. T., Ma, Q., and Whitlock, J. P., Jr. (1996) Dioxin-induced CYP1A1 transcription in vivo: the aromatic hydrocarbon receptor mediates transactivation, enhancer-promoter communication, and changes in chromatin structure, Mol Cell Biol 16, 430-436.

159 244. Elferink, C. J., and Whitlock, J. P., Jr. (1990) 2,3,7,8-Tetrachlorodibenzo-p-dioxin-inducible, Ah receptor-mediated bending of enhancer DNA, J Biol Chem 265, 5718-5721.

245. Rowlands, J. C., McEwan, I. J., and Gustafsson, J. A. (1996) Trans-activation by the human aryl hydrocarbon receptor and aryl hydrocarbon receptor nuclear translocator proteins: direct interactions with basal transcription factors, Mol Pharmacol 50, 538-548.

246. Kumar, M. B., Ramadoss, P., Reen, R. K., Vanden Heuvel, J. P., and Perdew, G. H. (2001) The Q-rich subdomain of the human Ah receptor transactivation domain is required for dioxin-mediated transcriptional activity, J Biol Chem 276, 42302-42310.

247. Kumar, M. B., Tarpey, R. W., and Perdew, G. H. (1999) Differential recruitment of coactivator RIP140 by Ah and estrogen receptors. Absence of a role for LXXLL motifs, J Biol Chem 274, 22155-22164.

248. Beischlag, T. V., Wang, S., Rose, D. W., Torchia, J., Reisz-Porszasz, S., Muhammad, K., Nelson, W. E., Probst, M. R., Rosenfeld, M. G., and Hankinson, O. (2002) Recruitment of the NCoA/SRC-1/p160 family of transcriptional coactivators by the aryl hydrocarbon receptor/aryl hydrocarbon receptor nuclear translocator complex, Mol Cell Biol 22, 4319-4333.

249. Strahl, B. D., and Allis, C. D. (2000) The language of covalent histone modifications, Nature 403, 41-45.

250. Wang, S., and Hankinson, O. (2002) Functional involvement of the Brahma/SWI2-related gene 1 protein in cytochrome P4501A1 transcription mediated by the aryl hydrocarbon receptor complex, J Biol Chem 277, 11821- 11827.

251. Schnekenburger, M., Peng, L., and Puga, A. (2007) HDAC1 bound to the Cyp1a1 promoter blocks histone acetylation associated with Ah receptor-mediated trans-activation, Biochim Biophys Acta 1769, 569-578.

252. Wei, Y. D., Tepperman, K., Huang, M. Y., Sartor, M. A., and Puga, A. (2004) Chromium inhibits transcription from polycyclic aromatic hydrocarbon-inducible promoters by blocking the release of histone deacetylase and preventing the binding of p300 to chromatin, J Biol Chem 279, 4110-4119.

253. Binnig, G., Quate, C. F., and Gerber, C. (1986) Atomic force microscope, Physical review letters 56, 930-933.

254. Scipioni, A., Anselmi, C., Zuccheri, G., Samori, B., and De Santis, P. (2002) Sequence-dependent DNA curvature and flexibility from scanning force microscopy images, Biophysical journal 83, 2408-2418.

255. Chen, L., Haushalter, K. A., Lieber, C. M., and Verdine, G. L. (2002) Direct visualization of a DNA glycosylase searching for damage, Chem Biol 9, 345-350.

256. Beloin, C., Jeusset, J., Revet, B., Mirambeau, G., Le Hegarat, F., and Le Cam, E. (2003) Contribution of DNA conformation and topology in right-handed DNA wrapping by the Bacillus subtilis LrpC protein, J Biol Chem 278, 5333-5342.

257. Crampton, N., Bonass, W. A., Kirkham, J., Rivetti, C., and Thomson, N. H. (2006) Collision events between RNA polymerases in convergent transcription studied by atomic force microscopy, Nucleic Acids Res 34, 5416-5425.

258. Wyman, C., Grotkopp, E., Bustamante, C., and Nelson, H. C. (1995) Determination of heat-shock transcription factor 2 stoichiometry at looped DNA complexes using scanning force microscopy, Embo J 14, 117-123.

259. Becker, J. C., Nikroo, A., Brabletz, T., and Reisfeld, R. A. (1995) DNA loops induced by cooperative binding of transcriptional activator proteins and preinitiation complexes, Proc Natl Acad Sci U S A 92 , 9727-9731.

260. Glatter, O., and Kratky, O. (1982) Small Angle X-ray Scattering.

261. Cowieson, N. P., Kobe, B., and Martin, J. L. (2008) United we stand: combining structural methods, Current opinion in structural biology 18, 617-622.

262. Tidow, H., Melero, R., Mylonas, E., Freund, S. M., Grossmann, J. G., Carazo, J. M., Svergun, D. I., Valle, M., and Fersht, A. R. (2007) Quaternary structures of tumor suppressor p53 and a specific p53 DNA complex, Proc Natl Acad Sci U S A 104, 12324-12329.

160 263. Grossmann, J. G., Sharff, A. J., O'Hare, P., and Luisi, B. (2001) Molecular shapes of transcription factors TFIIB and VP16 in solution: implications for recognition, Biochemistry 40, 6267-6274.

264. Nascimento, A. S., Dias, S. M., Nunes, F. M., Aparicio, R., Ambrosio, A. L., Bleicher, L., Figueira, A. C., Santos, M. A., de Oliveira Neto, M., Fischer, H., Togashi, M., Craievich, A. F., Garratt, R. C., Baxter, J. D., Webb, P., and Polikarpov, I. (2006) Structural rearrangements in the thyroid hormone receptor hinge domain and their putative role in the receptor function, J Mol Biol 360, 586-598.

265. Joerger, A. C., and Fersht, A. R. (2008) Structural biology of the tumor suppressor p53, Annu Rev Biochem 77, 557- 582.

266. Kitayner, M., Rozenberg, H., Kessler, N., Rabinovich, D., Shaulov, L., Haran, T. E., and Shakked, Z. (2006) Structural basis of DNA recognition by p53 tetramers, Mol Cell 22, 741-753.

267. Canadillas, J. M., Tidow, H., Freund, S. M., Rutherford, T. J., Ang, H. C., and Fersht, A. R. (2006) Solution structure of p53 core domain: structural basis for its instability, Proc Natl Acad Sci U S A 103, 2109-2114.

268. Clore, G. M., Ernst, J., Clubb, R., Omichinski, J. G., Kennedy, W. M., Sakaguchi, K., Appella, E., and Gronenborn, A. M. (1995) Refined solution structure of the oligomerization domain of the tumour suppressor p53, Nature structural biology 2, 321-333.

269. Jeffrey, P. D., Gorina, S., and Pavletich, N. P. (1995) Crystal structure of the tetramerization domain of the p53 tumor suppressor at 1.7 angstroms, Science 267, 1498-1502.

270. Veprintsev, D. B., Freund, S. M., Andreeva, A., Rutledge, S. E., Tidow, H., Canadillas, J. M., Blair, C. M., and Fersht, A. R. (2006) Core domain interactions in full-length p53 in solution, Proc Natl Acad Sci U S A 103, 2115-2119.

271. Banin, S., Moyal, L., Shieh, S., Taya, Y., Anderson, C. W., Chessa, L., Smorodinsky, N. I., Prives, C., Reiss, Y., Shiloh, Y., and Ziv, Y. (1998) Enhanced phosphorylation of p53 by ATM in response to DNA damage, Science 281, 1674-1677.

272. Kussie, P. H., Gorina, S., Marechal, V., Elenbaas, B., Moreau, J., Levine, A. J., and Pavletich, N. P. (1996) Structure of the MDM2 oncoprotein bound to the p53 tumor suppressor transactivation domain, Science 274, 948-953.

273. Chipuk, J. E., Bouchier-Hayes, L., Kuwana, T., Newmeyer, D. D., and Green, D. R. (2005) PUMA couples the nuclear and cytoplasmic proapoptotic function of p53, Science 309, 1732-1735.

274. Tomita, Y., Marchenko, N., Erster, S., Nemajerova, A., Dehner, A., Klein, C., Pan, H., Kessler, H., Pancoska, P., and Moll, U. M. (2006) WT p53, but not tumor-derived mutants, bind to Bcl2 via the DNA binding domain and induce mitochondrial permeabilization, J Biol Chem 281, 8600-8606.

275. Hansma, H. G., and Laney, D. E. (1996) DNA binding to mica correlates with cationic radius: assay by atomic force microscopy, Biophysical journal 70, 1933-1939.

276. Konarev, P. V., Volkov, V.V., Sokolova, A.V., Koch, M.H.J. and Svergun, D.I. (2003) PRIMUS: a Windows PC-based system for small-angle scattering data analysis, J. Appl. Cryst., 1277-1282.

277. McDonald, R. J., Kahn, J. D., and Maher, L. J., 3rd. (2006) DNA bending by bHLH charge variants, Nucleic Acids Res 34, 4846-4856.

278. Pursglove, S. E., Fladvad, M., Bellanda, M., Moshref, A., Henriksson, M., Carey, J., and Sunnerhagen, M. (2004) Biophysical properties of regions flanking the bHLH-Zip motif in the p22 Max protein, Biochem Biophys Res Commun 323, 750-759.

279. Huffman, J. L., Mokashi, A., Bachinger, H. P., and Brennan, R. G. (2001) The basic helix-loop-helix domain of the aryl hydrocarbon receptor nuclear transporter (ARNT) can oligomerize and bind E-box DNA specifically, J Biol Chem 276, 40537-40544.

280. Svergun, D. I. a. K., M.H.J. (2003) Small-angle scattering studies of biological macromolecules in solution, Rep. Prog. Phys 66, 1735-1782.

281. Bennett-Lovsey, R. M., Herbert, A. D., Sternberg, M. J., and Kelley, L. A. (2008) Exploring the extremes of sequence/structure space with ensemble fold recognition in the program Phyre, Proteins 70, 611-625. 161 282. Sauve, S., Tremblay, L., and Lavigne, P. (2004) The NMR solution structure of a mutant of the Max b/HLH/LZ free of DNA: insights into the specific and reversible DNA binding mechanism of dimeric transcription factors, J Mol Biol 342, 813-832.

283. Baker, N. A., Sept, D., Joseph, S., Holst, M. J., and McCammon, J. A. (2001) Electrostatics of nanosystems: application to microtubules and the ribosome, Proc Natl Acad Sci U S A 98, 10037-10041.

284. Ouali, M., and King, R. D. (2000) Cascaded multiple classifiers for secondary structure prediction, Protein Sci 9, 1162-1176.

285. Aihara, H., Huang, W. M., and Ellenberger, T. (2007) An interlocked dimer of the protelomerase TelK distorts DNA structure for the formation of hairpin telomeres, Mol Cell 27, 901-913.

286. Didenko, V. V. (2001) DNA probes using fluorescence resonance energy transfer (FRET): designs and applications, BioTechniques 31, 1106-1116, 1118, 1120-1101.

287. Schafer, D. A., Gelles, J., Sheetz, M. P., and Landick, R. (1991) Transcription by single molecules of RNA polymerase observed by light microscopy, Nature 352, 444-448.

288. Perkins, S. J. (1988) Structural studies of proteins by high-flux X-ray and neutron solution scattering, Biochem J 254, 313-327.

289. Neylon, C. (2008) Small angle neutron and X-ray scattering in structural biology: recent examples from the literature, Eur Biophys J 37, 531-541.

290. Toews, J., Rogalski, J. C., Clark, T. J., and Kast, J. (2008) Mass spectrometric identification of formaldehyde-induced peptide modifications under in vivo protein cross-linking conditions, Analytica chimica acta 618, 168-183.

291. Yang, S. W., and Nash, H. A. (1994) Specific photocrosslinking of DNA-protein complexes: identification of contacts between integration host factor and its target DNA, Proc Natl Acad Sci U S A 91, 12183-12187.

292. Vos, J. C., and Plasterk, R. H. (1994) Tc1 transposase of Caenorhabditis elegans is an endonuclease with a bipartite DNA binding domain, Embo J 13, 6125-6132.

293. Watkins, S., van Pouderoyen, G., and Sixma, T. K. (2004) Structural analysis of the bipartite DNA-binding domain of Tc3 transposase bound to transposon DNA, Nucleic Acids Res 32, 4306-4312.

294. Kimura, H., Weisz, A., Ogura, T., Hitomi, Y., Kurashima, Y., Hashimoto, K., D'Acquisto, F., Makuuchi, M., and Esumi, H. (2001) Identification of hypoxia-inducible factor 1 ancillary sequence and its function in vascular endothelial growth factor gene induction by hypoxia and nitric oxide, J Biol Chem 276, 2292-2298.

295. Kajimura, S., Aida, K., and Duan, C. (2006) Understanding hypoxia-induced gene expression in early development: in vitro and in vivo analysis of hypoxia-inducible factor 1-regulated zebra fish insulin-like growth factor binding protein 1 gene expression, Mol Cell Biol 26, 1142-1155.

296. Key, J., Hefti, M., Purcell, E. B., and Moffat, K. (2007) Structure of the redox sensor domain of Azotobacter vinelandii NifL at atomic resolution: signaling, dimerization, and mechanism, Biochemistry 46, 3614-3623.

297. Amezcua, C. A., Harper, S. M., Rutter, J., and Gardner, K. H. (2002) Structure and interactions of PAS kinase N- terminal PAS domain: model for intramolecular kinase regulation, Structure (Camb) 10, 1349-1361.

298. McDonnell, W. M., Chensue, S. W., Askari, F. K., and Moseley, R. H. (1996) Hepatic fibrosis in Ahr-/- mice, Science 271, 223-224.

299. Tuppurainen, K., and Ruuskanen, J. (2000) Electronic eigenvalue (EEVA): a new QSAR/QSPR descriptor for electronic substituent effects based on molecular orbital energies. A QSAR approach to the Ah receptor binding affinity of polychlorinated biphenyls (PCBs), dibenzo-p-dioxins (PCDDs) and (PCDFs), Chemosphere 41, 843-848.

300. Waller, C. L., and McKinney, J. D. (1995) Three-dimensional quantitative structure-activity relationships of dioxins and dioxin-like compounds: model validation and Ah receptor characterization, Chem Res Toxicol 8, 847-858.

162 301. Nagy, S. R., Liu, G., Lam, K. S., and Denison, M. S. (2002) Identification of novel Ah receptor agonists using a high- throughput green fluorescent protein-based recombinant cell bioassay, Biochemistry 41, 861-868.

302. Lee, I. J., Jeong, K. S., Roberts, B. J., Kallarakal, A. T., Fernandez-Salguero, P., Gonzalez, F. J., and Song, B. J. (1996) Transcriptional induction of the cytochrome P4501A1 gene by a thiazolium compound, YH439, Mol Pharmacol 49, 980-988.

303. Casper, R. F., Quesne, M., Rogers, I. M., Shirota, T., Jolivet, A., Milgrom, E., and Savouret, J. F. (1999) Resveratrol has antagonist activity on the aryl hydrocarbon receptor: implications for prevention of dioxin toxicity, Mol Pharmacol 56, 784-790.

304. Lu, Y. F., Santostefano, M., Cunningham, B. D., Threadgill, M. D., and Safe, S. (1995) Identification of 3'-methoxy-4'- nitroflavone as a pure aryl hydrocarbon (Ah) receptor antagonist and evidence for more than one form of the nuclear Ah receptor in MCF-7 human breast cancer cells, Arch Biochem Biophys 316, 470-477.

305. Gasiewicz, T. A., and Rucci, G. (1991) Alpha-naphthoflavone acts as an antagonist of 2,3,7, 8-tetrachlorodibenzo-p- dioxin by forming an inactive complex with the Ah receptor, Mol Pharmacol 40, 607-612.

306. Bjeldanes, L. F., Kim, J. Y., Grose, K. R., Bartholomew, J. C., and Bradfield, C. A. (1991) Aromatic hydrocarbon responsiveness-receptor agonists generated from indole-3-carbinol in vitro and in vivo: comparisons with 2,3,7,8- tetrachlorodibenzo-p-dioxin, Proc Natl Acad Sci U S A 88, 9543-9547.

307. Ciolino, H. P., Daschner, P. J., and Yeh, G. C. (1999) Dietary flavonols quercetin and kaempferol are ligands of the aryl hydrocarbon receptor that affect CYP1A1 transcription differentially, Biochem J 340, 715-722.

308. Hahn, M. E., and Chandran, K. (1996) Uroporphyrin accumulation associated with cytochrome P4501A induction in fish hepatoma cells exposed to aryl hydrocarbon receptor agonists, including 2,3,7,8-tetrachlorodibenzo-p-dioxin and planar chlorobiphenyls, Arch Biochem Biophys 329, 163-174.

309. Song, J., Clagett-Dame, M., Peterson, R. E., Hahn, M. E., Westler, W. M., Sicinski, R. R., and DeLuca, H. F. (2002) A ligand for the aryl hydrocarbon receptor isolated from lung, Proc Natl Acad Sci U S A 99, 14694-14699.

310. RayChaudhuri, B., Nebert, D. W., and Puga, A. (1990) The murine Cyp1a-1 gene negatively regulates its own transcription and that of other members of the aromatic hydrocarbon-responsive [Ah] gene battery, Mol Endocrinol 4, 1773-1781.

311. Chang, C. Y., and Puga, A. (1998) Constitutive activation of the aromatic hydrocarbon receptor, Mol Cell Biol 18, 525-535.

312. Chiaro, C. R., Patel, R. D., Marcus, C. B., and Perdew, G. H. (2007) Evidence for an aryl hydrocarbon receptor- mediated cytochrome p450 autoregulatory pathway, Mol Pharmacol 72, 1369-1379.

313. Katiyar, S. K., Matsui, M. S., and Mukhtar, H. (2000) Ultraviolet-B exposure of human skin induces cytochromes P450 1A1 and 1B1, J Invest Dermatol 114, 328-333.

314. Osborne, R., and Greenlee, W. F. (1985) 2,3,7,8-Tetrachlorodibenzo-p-dioxin (TCDD) enhances terminal differentiation of cultured human epidermal cells, Toxicol Appl Pharmacol 77, 434-443.

315. Kohle, C., Gschaidmeier, H., Lauth, D., Topell, S., Zitzer, H., and Bock, K. W. (1999) 2,3,7,8-Tetrachlorodibenzo-p- dioxin (TCDD)-mediated membrane translocation of c-Src protein kinase in liver WB-F344 cells, Arch Toxicol 73, 152-158.

316. Sindhu, R. K., Reisz-Porszasz, S., Hankinson, O., and Kikkawa, Y. (1996) Induction of cytochrome P4501A1 by photooxidized tryptophan in Hepa lclc7 cells, Biochem Pharmacol 52, 1883-1893.

317. Rannug, A., Rannug, U., Rosenkranz, H. S., Winqvist, L., Westerholm, R., Agurell, E., and Grafstrom, A. K. (1987) Certain photooxidized derivatives of tryptophan bind with very high affinity to the Ah receptor and are likely to be endogenous signal substances, J Biol Chem 262, 15422-15427.

318. Wei, Y. D., Bergander, L., Rannug, U., and Rannug, A. (2000) Regulation of CYP1A1 transcription via the metabolism of the tryptophan-derived 6-formylindolo[3,2-b]carbazole, Arch Biochem Biophys 383, 99-107.

163 319. Perdew, G. H., and Babbs, C. F. (1991) Production of Ah receptor ligands in rat fecal suspensions containing tryptophan or indole-3-carbinol, Nutr Cancer 16, 209-218.

320. Adachi, J., Mori, Y., Matsui, S., Takigami, H., Fujino, J., Kitagawa, H., Miller, C. A., 3rd, Kato, T., Saeki, K., and Matsuda, T. (2001) Indirubin and indigo are potent aryl hydrocarbon receptor ligands present in human urine, J Biol Chem 276, 31475-31478.

321. Peter Guengerich, F., Martin, M. V., McCormick, W. A., Nguyen, L. P., Glover, E., and Bradfield, C. A. (2004) Aryl hydrocarbon receptor response to indigoids in vitro and in vivo, Arch Biochem Biophys 423, 309-316.

322. Gillam, E. M., Notley, L. M., Cai, H., De Voss, J. J., and Guengerich, F. P. (2000) Oxidation of indole by cytochrome P450 enzymes, Biochemistry 39, 13817-13824.

323. Hoessel, R., Leclerc, S., Endicott, J. A., Nobel, M. E., Lawrie, A., Tunnah, P., Leost, M., Damiens, E., Marie, D., Marko, D., Niederberger, E., Tang, W., Eisenbrand, G., and Meijer, L. (1999) Indirubin, the active constituent of a Chinese antileukaemia medicine, inhibits cyclin-dependent kinases, Nat Cell Biol 1, 60-67.

324. Devchand, P. R., Ziouzenkova, O., and Plutzky, J. (2004) Oxidative stress and peroxisome proliferator-activated receptors: reversing the curse?, Circ Res 95, 1137-1139.

325. Heid, S. E., Walker, M. K., and Swanson, H. I. (2001) Correlation of cardiotoxicity mediated by halogenated aromatic hydrocarbons to aryl hydrocarbon receptor activation, Toxicol Sci 61, 187-196.

326. Bertazzi, P. A., Bernucci, I., Brambilla, G., Consonni, D., and Pesatori, A. C. (1998) The Seveso studies on early and long-term effects of dioxin exposure: a review, Environ Health Perspect 106 Suppl 2, 625-633.

327. Savouret, J. F., Antenos, M., Quesne, M., Xu, J., Milgrom, E., and Casper, R. F. (2001) 7-ketocholesterol is an endogenous modulator for the arylhydrocarbon receptor, J Biol Chem 276, 3054-3059.

328. Mufti, N. A., Bleckwenn, N. A., Babish, J. G., and Shuler, M. L. (1995) Possible involvement of the Ah receptor in the induction of cytochrome P-450IA1 under conditions of hydrodynamic shear in microcarrier-attached hepatoma cell lines, Biochem Biophys Res Commun 208, 144-152.

329. Dekker, R. J., van Soest, S., Fontijn, R. D., Salamanca, S., de Groot, P. G., VanBavel, E., Pannekoek, H., and Horrevoets, A. J. (2002) Prolonged fluid shear stress induces a distinct set of endothelial cell genes, most specifically lung Kruppel-like factor (KLF2), Blood 100, 1689-1698.

330. Aviram, M. (1993) Modified forms of low density lipoprotein and atherosclerosis, Atherosclerosis 98, 1-9.

331. Mufti, N. A., and Shuler, M. L. (1996) Possible role of arachidonic acid in stress-induced cytochrome P450IA1 activity, Biotechnol Prog 12, 847-854.

332. Amadei, A., Linssen, A. B., and Berendsen, H. J. (1993) Essential dynamics of proteins, Proteins 17, 412-425.

333. Pandini, A., and Bonati, L. (2005) Conservation and specialization in PAS domain dynamics, Protein Eng Des Sel 18, 127-137.

334. Dux, P., Rubinstenn, G., Vuister, G. W., Boelens, R., Mulder, F. A., Hard, K., Hoff, W. D., Kroon, A. R., Crielaard, W., Hellingwerf, K. J., and Kaptein, R. (1998) Solution structure and backbone dynamics of the photoactive yellow protein, Biochemistry 37, 12689-12699.

335. Ema, M., Ohe, N., Suzuki, M., Mimura, J., Sogawa, K., Ikawa, S., and Fujii-Kuriyama, Y. (1994) Dioxin binding activities of polymorphic forms of mouse and human arylhydrocarbon receptors, J Biol Chem 269, 27337-27343.

336. Poland, A., Palen, D., and Glover, E. (1994) Analysis of the four alleles of the murine aryl hydrocarbon receptor, Mol Pharmacol 46, 915-921.

337. Procopio, M., Lahm, A., Tramontano, A., Bonati, L., and Pitea, D. (2002) A model for recognition of polychlorinated dibenzo-p-dioxins by the aryl hydrocarbon receptor, Eur J Biochem 269, 13-18.

338. Murray, I. A., Reen, R. K., Leathery, N., Ramadoss, P., Bonati, L., Gonzalez, F. J., Peters, J. M., and Perdew, G. H. (2005) Evidence that ligand binding is a key determinant of Ah receptor-mediated transcriptional activity, Arch Biochem Biophys 442, 59-71. 164 339. Henry, E. C., and Gasiewicz, T. A. (2008) Molecular determinants of species-specific agonist and antagonist activity of a substituted flavone towards the aryl hydrocarbon receptor, Arch Biochem Biophys 472, 77-88.

340. Pandini, A., Denison, M. S., Song, Y., Soshilov, A. A., and Bonati, L. (2007) Structural and functional characterization of the aryl hydrocarbon receptor ligand binding domain by homology modeling and mutational analysis, Biochemistry 46, 696-708.

341. Goryo, K., Suzuki, A., Del Carpio, C. A., Siizaki, K., Kuriyama, E., Mikami, Y., Kinoshita, K., Yasumoto, K., Rannug, A., Miyamoto, A., Fujii-Kuriyama, Y., and Sogawa, K. (2007) Identification of amino acid residues in the Ah receptor involved in ligand binding, Biochem Biophys Res Commun 354, 396-402.

342. Andreasen, E. A., Hahn, M. E., Heideman, W., Peterson, R. E., and Tanguay, R. L. (2002) The zebrafish (Danio rerio) aryl hydrocarbon receptor type 1 is a novel vertebrate receptor, Mol Pharmacol 62, 234-249.

343. Karchner, S. I., Franks, D. G., and Hahn, M. E. (2005) AHR1B, a new functional aryl hydrocarbon receptor in zebrafish: tandem arrangement of ahr1b and ahr2 genes, Biochem J 392, 153-161.

344. Tanguay, R. L., Abnet, C. C., Heideman, W., and Peterson, R. E. (1999) Cloning and characterization of the zebrafish (Danio rerio) aryl hydrocarbon receptor, Biochim Biophys Acta 1444, 35-48.

345. Prasch, A. L., Teraoka, H., Carney, S. A., Dong, W., Hiraga, T., Stegeman, J. J., Heideman, W., and Peterson, R. E. (2003) Aryl hydrocarbon receptor 2 mediates 2,3,7,8-tetrachlorodibenzo-p-dioxin developmental toxicity in zebrafish, Toxicol Sci 76, 138-150.

346. Okey, A. B., Riddick, D. S., and Harper, P. A. (1994) The Ah receptor: mediator of the toxicity of 2,3,7,8- tetrachlorodibenzo-p-dioxin (TCDD) and related compounds, Toxicol Lett 70, 1-22.

347. Harper, P. A., Prokipcak, R. D., Bush, L. E., Golas, C. L., and Okey, A. B. (1991) Detection and characterization of the Ah receptor for 2,3,7,8-tetrachlorodibenzo-p-dioxin in the human colon adenocarcinoma cell line LS180, Arch Biochem Biophys 290, 27-36.

348. Cohen, P. (2002) The origins of protein phosphorylation, Nat Cell Biol 4, E127-130.

349. Kemp, B. E., Bylund, D. B., Huang, T. S., and Krebs, E. G. (1975) Substrate specificity of the cyclic AMP-dependent protein kinase, Proc Natl Acad Sci U S A 72, 3448-3452.

350. Hunter, T., and Sefton, B. M. (1980) Transforming gene product of Rous sarcoma virus phosphorylates tyrosine, Proc Natl Acad Sci U S A 77, 1311-1315.

351. Anderson, D., Koch, C. A., Grey, L., Ellis, C., Moran, M. F., and Pawson, T. (1990) Binding of SH2 domains of phospholipase C gamma 1, GAP, and Src to activated growth factor receptors, Science 250, 979-982.

352. Gomez, N., and Cohen, P. (1991) Dissection of the protein kinase cascade by which nerve growth factor activates MAP kinases, Nature 353, 170-173.

353. Galea, C. A., Wang, Y., Sivakolundu, S. G., and Kriwacki, R. W. (2008) Regulation of cell division by intrinsically unstructured proteins: intrinsic flexibility, modularity, and signaling conduits, Biochemistry 47, 7598-7609.

354. Glozak, M. A., Sengupta, N., Zhang, X., and Seto, E. (2005) Acetylation and deacetylation of non-histone proteins, Gene 363, 15-23.

355. Yang, X. J., and Seto, E. (2008) Lysine acetylation: codified crosstalk with other posttranslational modifications, Mol Cell 31, 449-461.

356. Hou, T., Ray, S., Lee, C., and Brasier, A. R. (2008) The STAT3 NH2-terminal domain stabilizes enhanceosome assembly by interacting with the p300 bromodomain, J Biol Chem 283, 30725-30734.

357. Cui, Y., Zhang, M., Pestell, R., Curran, E. M., Welshons, W. V., and Fuqua, S. A. (2004) Phosphorylation of estrogen receptor alpha blocks its acetylation and regulates estrogen sensitivity, Cancer Res 64, 9199-9208.

358. Wang, C., Fu, M., Angeletti, R. H., Siconolfi-Baez, L., Reutens, A. T., Albanese, C., Lisanti, M. P., Katzenellenbogen, B. S., Kato, S., Hopp, T., Fuqua, S. A., Lopez, G. N., Kushner, P. J., and Pestell, R. G. (2001) Direct acetylation of

165 the estrogen receptor alpha hinge region by p300 regulates transactivation and hormone sensitivity, J Biol Chem 276, 18375-18383.

359. Patel, J. H., Du, Y., Ard, P. G., Phillips, C., Carella, B., Chen, C. J., Rakowski, C., Chatterjee, C., Lieberman, P. M., Lane, W. S., Blobel, G. A., and McMahon, S. B. (2004) The c-MYC oncoprotein is a substrate of the acetyltransferases hGCN5/PCAF and TIP60, Mol Cell Biol 24, 10826-10834.

360. Faiola, F., Liu, X., Lo, S., Pan, S., Zhang, K., Lymar, E., Farina, A., and Martinez, E. (2005) Dual regulation of c-Myc by p300 via acetylation-dependent control of Myc protein turnover and coactivation of Myc-induced transcription, Mol Cell Biol 25, 10220-10234.

361. Jeong, J. W., Bae, M. K., Ahn, M. Y., Kim, S. H., Sohn, T. K., Bae, M. H., Yoo, M. A., Song, E. J., Lee, K. J., and Kim, K. W. (2002) Regulation and destabilization of HIF-1alpha by ARD1-mediated acetylation, Cell 111, 709-720.

362. Qian, C., and Zhou, M. M. (2006) SET domain protein lysine methyltransferases: Structure, specificity and catalysis, Cell Mol Life Sci 63, 2755-2763.

363. Bachand, F. (2007) Protein arginine methyltransferases: from unicellular eukaryotes to humans, Eukaryotic cell 6, 889-898.

364. Bernstein, B. E., Humphrey, E. L., Erlich, R. L., Schneider, R., Bouman, P., Liu, J. S., Kouzarides, T., and Schreiber, S. L. (2002) Methylation of histone H3 Lys 4 in coding regions of active genes, Proc Natl Acad Sci U S A 99, 8695- 8700.

365. Plath, K., Fang, J., Mlynarczyk-Evans, S. K., Cao, R., Worringer, K. A., Wang, H., de la Cruz, C. C., Otte, A. P., Panning, B., and Zhang, Y. (2003) Role of histone H3 lysine 27 methylation in X inactivation, Science 300, 131-135.

366. Yang, H., Pesavento, J. J., Starnes, T. W., Cryderman, D. E., Wallrath, L. L., Kelleher, N. L., and Mizzen, C. A. (2008) Preferential dimethylation of histone H4 lysine 20 by Suv4-20, J Biol Chem 283, 12085-12092.

367. Kouskouti, A., Scheer, E., Staub, A., Tora, L., and Talianidis, I. (2004) Gene-specific modulation of TAF10 function by SET9-mediated methylation, Mol Cell 14, 175-182.

368. Yamagata, K., Daitoku, H., Takahashi, Y., Namiki, K., Hisatake, K., Kako, K., Mukai, H., Kasuya, Y., and Fukamizu, A. (2008) Arginine methylation of FOXO transcription factors inhibits their phosphorylation by Akt, Mol Cell 32, 221- 231.

369. Ko, L. J., and Prives, C. (1996) p53: puzzle and paradigm, Genes Dev 10, 1054-1072.

370. Sims, R. J., 3rd, and Reinberg, D. (2008) Is there a code embedded in proteins that is based on post-translational modifications?, Nature reviews 9, 815-820.

371. Xu, Y. (2003) Regulation of p53 responses by post-translational modifications, Cell death and differentiation 10, 400- 403.

372. Chehab, N. H., Malikzay, A., Appel, M., and Halazonetis, T. D. (2000) Chk2/hCds1 functions as a DNA damage checkpoint in G(1) by stabilizing p53, Genes Dev 14, 278-288.

373. Hirao, A., Kong, Y. Y., Matsuoka, S., Wakeham, A., Ruland, J., Yoshida, H., Liu, D., Elledge, S. J., and Mak, T. W. (2000) DNA damage-induced activation of p53 by the checkpoint kinase Chk2, Science 728 , 1824-1827.

374. Shieh, S. Y., Ahn, J., Tamai, K., Taya, Y., and Prives, C. (2000) The human homologs of checkpoint kinases Chk1 and Cds1 (Chk2) phosphorylate p53 at multiple DNA damage-inducible sites, Genes Dev 14, 289-300.

375. Fiscella, M., Zambrano, N., Ullrich, S. J., Unger, T., Lin, D., Cho, B., Mercer, W. E., Anderson, C. W., and Appella, E. (1994) The carboxy-terminal serine 392 phosphorylation site of human p53 is not required for wild-type activities, Oncogene 9, 3249-3257.

376. Rajagopalan, S., Jaulent, A. M., Wells, M., Veprintsev, D. B., and Fersht, A. R. (2008) 14-3-3 activation of DNA binding of p53 by enhancing its association into tetramers, Nucleic Acids Res 36, 5983-5991.

166 377. Barlev, N. A., Liu, L., Chehab, N. H., Mansfield, K., Harris, K. G., Halazonetis, T. D., and Berger, S. L. (2001) Acetylation of p53 activates transcription through recruitment of coactivators/histone acetyltransferases, Mol Cell 8, 1243-1254.

378. Liu, L., Scolnick, D. M., Trievel, R. C., Zhang, H. B., Marmorstein, R., Halazonetis, T. D., and Berger, S. L. (1999) p53 sites acetylated in vitro by PCAF and p300 are acetylated in vivo in response to DNA damage, Mol Cell Biol 19, 1202-1209.

379. Sykes, S. M., Mellert, H. S., Holbert, M. A., Li, K., Marmorstein, R., Lane, W. S., and McMahon, S. B. (2006) Acetylation of the p53 DNA-binding domain regulates apoptosis induction, Mol Cell 24, 841-851.

380. Tang, Y., Luo, J., Zhang, W., and Gu, W. (2006) Tip60-dependent acetylation of p53 modulates the decision between cell-cycle arrest and apoptosis, Mol Cell 24, 827-839.

381. Gu, W., and Roeder, R. G. (1997) Activation of p53 sequence-specific DNA binding by acetylation of the p53 C- terminal domain, Cell 90, 595-606.

382. Hupp, T. R., and Lane, D. P. (1994) Allosteric activation of latent p53 tetramers, Curr Biol 4, 865-875.

383. Li, A. G., Piluso, L. G., Cai, X., Gadd, B. J., Ladurner, A. G., and Liu, X. (2007) An acetylation switch in p53 mediates holo-TFIID recruitment, Mol Cell 28, 408-421.

384. Kubbutat, M. H., Jones, S. N., and Vousden, K. H. (1997) Regulation of p53 stability by Mdm2, Nature 387, 299-303.

385. Rodriguez, M. S., Desterro, J. M., Lain, S., Lane, D. P., and Hay, R. T. (2000) Multiple C-terminal lysine residues target p53 for ubiquitin-proteasome-mediated degradation, Mol Cell Biol 20, 8458-8467.

386. Nakamura, S., Roth, J. A., and Mukhopadhyay, T. (2000) Multiple lysine mutations in the C-terminal domain of p53 interfere with MDM2-dependent protein degradation and ubiquitination, Mol Cell Biol 20, 9391-9398.

387. Xirodimas, D. P., Saville, M. K., Bourdon, J. C., Hay, R. T., and Lane, D. P. (2004) Mdm2-mediated NEDD8 conjugation of p53 inhibits its transcriptional activity, Cell 118, 83-97.

388. Abida, W. M., Nikolaev, A., Zhao, W., Zhang, W., and Gu, W. (2007) FBXO11 promotes the Neddylation of p53 and inhibits its transcriptional activity, J Biol Chem 282, 1797-1804.

389. Gostissa, M., Hengstermann, A., Fogal, V., Sandy, P., Schwarz, S. E., Scheffner, M., and Del Sal, G. (1999) Activation of p53 by conjugation to the ubiquitin-like protein SUMO-1, Embo J 18, 6462-6471.

390. Nishioka, K., Chuikov, S., Sarma, K., Erdjument-Bromage, H., Allis, C. D., Tempst, P., and Reinberg, D. (2002) Set9, a novel histone H3 methyltransferase that facilitates transcription by precluding histone tail modifications required for heterochromatin formation, Genes Dev 16, 479-489.

391. Chuikov, S., Kurash, J. K., Wilson, J. R., Xiao, B., Justin, N., Ivanov, G. S., McKinney, K., Tempst, P., Prives, C., Gamblin, S. J., Barlev, N. A., and Reinberg, D. (2004) Regulation of p53 activity through lysine methylation, Nature 432, 353-360.

392. Kurash, J. K., Lei, H., Shen, Q., Marston, W. L., Granda, B. W., Fan, H., Wall, D., Li, E., and Gaudet, F. (2008) Methylation of p53 by Set7/9 mediates p53 acetylation and activity in vivo, Mol Cell 29, 392-400.

393. Huang, J., Perez-Burgos, L., Placek, B. J., Sengupta, R., Richter, M., Dorsey, J. A., Kubicek, S., Opravil, S., Jenuwein, T., and Berger, S. L. (2006) Repression of p53 activity by Smyd2-mediated methylation, Nature 444, 629- 632.

394. Huang, J., Sengupta, R., Espejo, A. B., Lee, M. G., Dorsey, J. A., Richter, M., Opravil, S., Shiekhattar, R., Bedford, M. T., Jenuwein, T., and Berger, S. L. (2007) p53 is regulated by the lysine demethylase LSD1, Nature 449, 105-108.

395. Shi, X., Kachirskaia, I., Yamaguchi, H., West, L. E., Wen, H., Wang, E. W., Dutta, S., Appella, E., and Gozani, O. (2007) Modulation of p53 function by SET8-mediated methylation at lysine 382, Mol Cell 27, 636-646.

396. Jansson, M., Durant, S. T., Cho, E. C., Sheahan, S., Edelmann, M., Kessler, B., and La Thangue, N. B. (2008) Arginine methylation regulates the p53 response, Nat Cell Biol 10, 1431-1439.

167 397. Sakaguchi, K., Herrera, J. E., Saito, S., Miki, T., Bustin, M., Vassilev, A., Anderson, C. W., and Appella, E. (1998) DNA damage activates p53 through a phosphorylation-acetylation cascade, Genes Dev 12, 2831-2841.

398. Winter, B., Braun, T., and Arnold, H. H. (1993) cAMP-dependent protein kinase represses myogenic differentiation and the activity of the muscle-specific helix-loop-helix transcription factors Myf-5 and MyoD, J Biol Chem 268, 9869- 9878.

399. Zhou, J., and Olson, E. N. (1994) Dimerization through the helix-loop-helix motif enhances phosphorylation of the transcription activation domains of myogenin, Mol Cell Biol 14, 6232-6243.

400. Edery, I., Zwiebel, L. J., Dembinska, M. E., and Rosbash, M. (1994) Temporal phosphorylation of the Drosophila period protein, Proc Natl Acad Sci U S A 91, 2260-2264.

401. Kloss, B., Price, J. L., Saez, L., Blau, J., Rothenfluh, A., Wesley, C. S., and Young, M. W. (1998) The Drosophila clock gene double-time encodes a protein closely related to human casein kinase Iepsilon, Cell 94, 97-107.

402. Lee, C., Bae, K., and Edery, I. (1998) The Drosophila CLOCK protein undergoes daily rhythms in abundance, phosphorylation, and interactions with the PER-TIM complex, Neuron 21, 857-867.

403. Conrad, P. W., Conforti, L., Kobayashi, S., Beitner-Johnson, D., Rust, R. T., Yuan, Y., Kim, H. W., Kim, R. H., Seta, K., and Millhorn, D. E. (2001) The molecular basis of O2-sensing and hypoxia tolerance in pheochromocytoma cells, Comp Biochem Physiol B Biochem Mol Biol 128, 187-204.

404. Long, W. P., Pray-Grant, M., Tsai, J. C., and Perdew, G. H. (1998) Protein kinase C activity is required for aryl hydrocarbon receptor pathway-mediated signal transduction, Mol Pharmacol 53, 691-700.

405. Dolwick, K. M., Schmidt, J. V., Carver, L. A., Swanson, H. I., and Bradfield, C. A. (1993) Cloning and expression of a human Ah receptor cDNA, Mol Pharmacol 44, 911-917.

406. Oesch-Bartlomowicz, B., Huelster, A., Wiss, O., Antoniou-Lipfert, P., Dietrich, C., Arand, M., Weiss, C., Bockamp, E., and Oesch, F. (2005) Aryl hydrocarbon receptor activation by cAMP vs. dioxin: divergent signaling pathways, Proc Natl Acad Sci U S A 102, 9218-9223.

407. Park, S., Henry, E. C., and Gasiewicz, T. A. (2000) Regulation of DNA binding activity of the ligand-activated aryl hydrocarbon receptor by tyrosine phosphorylation, Arch Biochem Biophys 381, 302-312.

408. Zhang, Y., and Xiong, Y. (2001) A p53 amino-terminal nuclear export signal inhibited by DNA damage-induced phosphorylation, Science 292, 1910-1915.

409. Lee, H., and Bai, W. (2002) Regulation of estrogen receptor nuclear export by ligand-induced and p38-mediated receptor phosphorylation, Mol Cell Biol 22, 5835-5845.

410. Leach, C., Eto, M., and Brautigan, D. L. (2002) Domains of type 1 protein phosphatase inhibitor-2 required for nuclear and cytoplasmic localization in response to cell-cell contact, J Cell Sci 115, 3739-3745.

411. Pongratz, I., Stromstedt, P. E., Mason, G. G., and Poellinger, L. (1991) Inhibition of the specific DNA binding activity of the dioxin receptor by phosphatase treatment, J Biol Chem 266, 16813-16817.

412. Dave, D. A., Hamilton, B. R., Wallis, T. P., Furness, S. G. B., Whitelaw, M. L., and Gorman, J. J. (2007) Identification of N,N-epsilon-dimethyl-lysine in the murine dioxin receptor using MALDI-TOF/TOF- and ESI-LTQ-Orbitrap-FT-MS, International Journal of Mass Spectrometry 268, 168-180.

413. Mirsky, A. E., and Pauling, L. (1936) On the Structure of Native, Denatured, and Coagulated Proteins, Proc Natl Acad Sci U S A 22, 439-447.

414. Dunker, A. K., Lawson, J. D., Brown, C. J., Williams, R. M., Romero, P., Oh, J. S., Oldfield, C. J., Campen, A. M., Ratliff, C. M., Hipps, K. W., Ausio, J., Nissen, M. S., Reeves, R., Kang, C., Kissinger, C. R., Bailey, R. W., Griswold, M. D., Chiu, W., Garner, E. C., and Obradovic, Z. (2001) Intrinsically disordered protein, Journal of molecular graphics & modelling 19, 26-59.

415. Radivojac, P., Iakoucheva, L. M., Oldfield, C. J., Obradovic, Z., Uversky, V. N., and Dunker, A. K. (2007) Intrinsic disorder and functional proteomics, Biophysical journal 92, 1439-1456.

168 416. Obradovic, Z., Peng, K., Vucetic, S., Radivojac, P., Brown, C. J., and Dunker, A. K. (2003) Predicting intrinsic disorder from amino acid sequence, Proteins 53 Suppl 6, 566-572.

417. Dunker, A. K., Brown, C. J., Lawson, J. D., Iakoucheva, L. M., and Obradovic, Z. (2002) Intrinsic disorder and protein function, Biochemistry 41, 6573-6582.

418. Uversky, V. N., Gillespie, J. R., and Fink, A. L. (2000) Why are "natively unfolded" proteins unstructured under physiologic conditions?, Proteins 41, 415-427.

419. Li, X., Romero, P., Rani, M., Dunker, A. K., and Obradovic, Z. (1999) Predicting Protein Disorder for N-, C-, and Internal Regions, Genome informatics 10, 30-40.

420. Linding, R., Jensen, L. J., Diella, F., Bork, P., Gibson, T. J., and Russell, R. B. (2003) Protein disorder prediction: implications for structural proteomics, Structure 11, 1453-1459.

421. Iakoucheva, L. M., Radivojac, P., Brown, C. J., O'Connor, T. R., Sikes, J. G., Obradovic, Z., and Dunker, A. K. (2004) The importance of intrinsic disorder for protein phosphorylation, Nucleic Acids Res 32, 1037-1049.

422. Daily, K. M., Radivojac, P., and Dunker, A. K. (2005) Intrinsic disorder and protein modifications: building an SVM predictor for methylation. , in IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology.

423. Petrulis, J. R., Kusnadi, A., Ramadoss, P., Hollingshead, B., and Perdew, G. H. (2003) The hsp90 Co-chaperone XAP2 alters importin beta recognition of the bipartite nuclear localization signal of the Ah receptor and represses transcriptional activity, J Biol Chem 278, 2677-2685.

424. Lee, B. J., Cansizoglu, A. E., Suel, K. E., Louis, T. H., Zhang, Z., and Chook, Y. M. (2006) Rules for nuclear localization sequence recognition by karyopherin beta 2, Cell 126, 543-558.

425. Dave, K. A., Whelan, F., Bindloss, C., Furness, S. G., Chapman-Smith, A., Whitelaw, M. L., and Gorman, J. J. (2009) Sulfonation and phosphorylation of regions of the dioxin receptor susceptible to methionine modifications, Mol Cell Proteomics 8, 706-719.

426. Shin, K. J., Wall, E. A., Zavzavadjian, J. R., Santat, L. A., Liu, J., Hwang, J. I., Rebres, R., Roach, T., Seaman, W., Simon, M. I., and Fraser, I. D. (2006) A single lentiviral vector platform for microRNA-based conditional RNA interference and coordinated transgene expression, Proc Natl Acad Sci U S A 103, 13759-13764.

427. Gupta, P., Ho, P. C., Huq, M. D., Khan, A. A., Tsai, N. P., and Wei, L. N. (2008) PKCepsilon stimulated arginine methylation of RIP140 for its nuclear-cytoplasmic export in adipocyte differentiation, PLoS ONE 3, e2658.

428. Mostaqul Huq, M. D., Gupta, P., Tsai, N. P., White, R., Parker, M. G., and Wei, L. N. (2006) Suppression of receptor interacting protein 140 repressive activity by protein arginine methylation, Embo J 25, 5094-5104.

429. Hankinson, O. (2005) Role of coactivators in transcriptional activation by the aryl hydrocarbon receptor, Arch Biochem Biophys 433, 379-386.

430. Niehrs, C., Beisswanger, R., and Huttner, W. B. (1994) Protein tyrosine sulfation, 1993--an update, Chem Biol Interact 92, 257-271.

431. Beisswanger, R., Corbeil, D., Vannier, C., Thiele, C., Dohrmann, U., Kellner, R., Ashman, K., Niehrs, C., and Huttner, W. B. (1998) Existence of distinct tyrosylprotein sulfotransferase genes: molecular characterization of tyrosylprotein sulfotransferase-2, Proc Natl Acad Sci U S A 95, 11134-11139.

432. Pouyani, T., and Seed, B. (1995) PSGL-1 recognition of P-selectin is controlled by a tyrosine sulfation consensus at the PSGL-1 amino terminus, Cell 83, 333-343.

433. Medzihradszky, K. F., Darula, Z., Perlson, E., Fainzilber, M., Chalkley, R. J., Ball, H., Greenbaum, D., Bogyo, M., Tyson, D. R., Bradshaw, R. A., and Burlingame, A. L. (2004) O-sulfonation of serine and threonine: mass spectrometric detection and characterization of a new posttranslational modification in diverse proteins throughout the eukaryotes, Mol Cell Proteomics 3, 429-440.

434. Jones, L. C., Okino, S. T., Gonda, T. J., and Whitlock, J. P., Jr. (2002) Myb-binding protein 1a augments AhR- dependent gene expression, J Biol Chem 277, 22515-22519. 169 435. Li, R., Soosairajah, J., Harari, D., Citri, A., Price, J., Ng, H. L., Morton, C. J., Parker, M. W., Yarden, Y., and Bernard, O. (2006) Hsp90 increases LIM kinase activity by promoting its homo-dimerization, Faseb J 20, 1218-1220.

436. Monje, P., Marinissen, M. J., and Gutkind, J. S. (2003) Phosphorylation of the carboxyl-terminal transactivation domain of c-Fos by extracellular signal-regulated kinase mediates the transcriptional activation of AP-1 and cellular transformation induced by platelet-derived growth factor, Mol Cell Biol 23, 7030-7043.

437. Saiardi, A., Bhandari, R., Resnick, A. C., Snowman, A. M., and Snyder, S. H. (2004) Phosphorylation of proteins by inositol pyrophosphates, Science 306, 2101-2105.

438. Bhandari, R., Saiardi, A., Ahmadibeni, Y., Snowman, A. M., Resnick, A. C., Kristiansen, T. Z., Molina, H., Pandey, A., Werner, J. K., Jr., Juluri, K. R., Xu, Y., Prestwich, G. D., Parang, K., and Snyder, S. H. (2007) Protein pyrophosphorylation by inositol pyrophosphates is a posttranslational event, Proc Natl Acad Sci U S A 104, 15305- 15310.

439. Ong, S. E., and Mann, M. (2006) A practical recipe for stable isotope labeling by amino acids in cell culture (SILAC), Nature protocols 1, 2650-2660.

170